Commun. Math. Phys. 226, 1 – 40 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Minimal Representations, Spherical Vectors and Exceptional Theta Series
David Kazhdan¹, Boris Pioline², Andrew Waldron³
¹ Dept of Mathematics, Harvard University, Cambridge, MA 02138, USA.
² LPTHE, Universités Paris VI & VII, Boîte 126, Tour 16, 4 place Jussieu, 75252 Paris, France.
³ Physics Department, Brandeis University, Waltham, MA 02454, USA.
Received: 31 July 2001 / Accepted: 2 October 2001
On leave of absence from Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA.
On leave of absence from Dept. of Mathematics, UC Davis, CA 95616, USA.

Abstract: Theta series for exceptional groups have been suggested as a possible description of the eleven-dimensional quantum supermembrane. We present explicit formulae for these automorphic forms whenever the underlying Lie group G is split (or complex) and simply laced. Specifically, we review and construct explicitly the minimal representation of G, generalizing the Schrödinger representation of symplectic groups. We compute the spherical vector in this representation, i.e. the wave function invariant under the maximal compact subgroup, which plays the rôle of the summand in the automorphic theta series. We also determine the spherical vector over the complex field. We outline how the spherical vector over the p-adic number fields provides the summation measure in the theta series, postponing its determination to a sequel of this work. The simplicity of our result is suggestive of a new Born–Infeld-like description of the membrane where U-duality is realized non-linearly. Our results may also be used in constructing quantum mechanical systems with spectrum generating symmetries.

1. Introduction

Despite considerable insights afforded by dualities, the fundamental degrees of freedom of M-theory remain elusive. Recently the rôle of the eleven-dimensional supermembrane has been tested [1] in an attempt to rederive toroidally compactified, M-theoretic, supersymmetric four-graviton scattering amplitudes at order $R^4$. These amplitudes are known independently, on the basis of supersymmetry and duality, to be given by an Eisenstein series of the U-duality group [2–5] (see [6] for a review), but still lack a finite microscopic derivation (see however [7] for a discussion of perturbative computations in eleven-dimensional supergravity). In analogy with the string one-loop computation, a one-loop membrane amplitude was constructed as the integral of a modular invariant
partition function on the fundamental domain of a membrane modular group Gl(3, Z). The action of a membrane instanton configuration with given winding numbers is given by the Polyakov action and, as a working hypothesis, the summation measure was taken to be unity. A comparison to the exact result showed that the mass spectrum and the instanton saddle points were correctly reproduced by this ansatz, but the spectrum multiplicities and instanton summation measure were incorrect.¹ The proposed partition function was therefore not U-duality invariant. However, a general method to construct invariant partition functions was outlined: exceptional theta series should provide the correct partition function for the BPS membrane on tori.

While theta series for symplectic groups are very common both in mathematics, e.g., in the study of Riemann surfaces, and in physics, where they arise as partition functions of free theories, their generalization to other groups is not as well understood. One difficulty is that group invariance requires a generalization of the standard Poisson resummation formula (i.e., Gaussian integration) to cubic characters (i.e., "Airy" integration). This scenario is clearly well adapted to the membrane situation, where the Wess–Zumino interaction is cubic in the brane winding numbers. Since theta series reside at the heart of many problems in the theory of automorphic forms, it would be very desirable, from both physical and mathematical viewpoints, to have explicit expressions for them.

As outlined in [1], the construction of theta series for a simple non-compact group G requires three main ingredients:
(i) An irreducible representation of the group in an appropriate space of functions. In the symplectic case, this is simply the Weyl representation of the Heisenberg algebra $[p_i, x^j] = -i\delta_i^j$, which gives rise to the Schrödinger representation of Sp(n, R).
(ii) A special function f, known as the spherical vector, which is invariant under the maximal compact subgroup K of G. This generalizes the Gaussian character $e^{-(x^i)^2/2}$ appearing in the symplectic theta series.
(iii) A distribution δ invariant under an arithmetic subgroup G(Z) ⊂ G, generalizing the sum with unit weight over integers $x^i \in \mathbb{Z}$ of the symplectic case.

As for step (i), one observes that for any simple Lie algebra G there exists a unique non-zero minimal conjugacy class O ⊂ G. This nilpotent orbit carries the standard Kirillov–Kostant symplectic form, whose quantization furnishes a representation of G on the Hilbert space of wave functions on a Lagrangian submanifold of O. Its quantization relies heavily on the existence, discovered by Joseph [9], of a unique completely prime two-sided ideal J of the enveloping algebra U(G) whose characteristic variety coincides with O ∪ {0}. The obtained representation is minimal, in the sense that its Gelfand–Kirillov dimension is smallest among all representations, being equal to half the dimension of O. The minimal representation exists not only for the split real group G(R), but also for the group G(F) for an arbitrary local field F, as long as G is any simply-laced split Lie algebra. In the case when G is of the type $D_n$, the minimal representation can be realized using Howe's theory of dual pairs [12]. The general construction was described in [10] and [11], the latter of which we will closely follow in this work.

Step (ii) is the main subject of the present paper; we will obtain the spherical vector for all groups G(R) of A, D, E type in the split real form, using techniques from Eisenstein series ($A_n$), dual pairs ($D_n$) and PDEs ($E_{6,7,8}$). A simple generalization will also provide the spherical vector for the complex group G(C). As we will see, step (iii) amounts to solving step (ii) over all p-adic number fields $\mathbb{Q}_p$ instead of the reals. Our methods will allow us to obtain the p-adic spherical vector for A and D groups. The exceptional case requires more powerful techniques, and will be treated in a sequel to this paper [13].

¹ See [8] for a very recent discussion of the membrane summation measure.
While this paper is mostly concerned with the mathematical construction of exceptional theta series, a few words about the physical implications of our results are in order. First and foremost, we find that a membrane partition function invariant under both the modular group Gl(3, Z) and the U-duality group Ed (Z) cannot be constructed by summing over the 3d membrane winding numbers alone (which confirms the findings of [1]). Indeed, the dimension of the minimal representation of the smallest simple group G containing Sl(3, R) × Ed is always bigger than 3d. Second, we find that the minimal representation of G has a structure quite reminiscent of the membrane, but (in the simplest d = 3 case) necessitates two new quantum numbers, which would be very interesting to understand from the point of view of the quantum membrane. In fact, the form of the spherical vector in this representation, displayed in Eqs. (4.41) and (4.52) below, is very suggestive of a Born-Infeld-like formulation of the membrane, which would then exhibit a hidden dynamical Ed+2 (Z) symmetry. A more complete physical analysis of these results in the context of the eleven-dimensional supermembrane will appear elsewhere. In addition, our minimal representation provides the quantized phase space for quantum mechanical systems with dynamical non-compact symmetries, which may find a use in M-theory or other contexts. By choosing one of the compact generators as the Hamiltonian, one may construct integrable quantum mechanical systems with a spectrum-generating exceptional symmetry, and the spherical vector we constructed would then give the ground state wave function. The organization of this paper is as follows: In Sect. 2, we use the Sl(2) case as a simple example to introduce the main technology. In Sect. 3, we review the construction of the minimal representation for simply-laced groups. 
Section 4 contains the new results of this paper: real and complex spherical vectors for all A, D, E groups (the main formulae may be found in Eqs. (4.18), (4.28), (4.43), (4.53), (4.69), (4.84) and (4.88)). We close in Sect. 5 with a preliminary discussion of the physics interpretation of our formulae. Miscellaneous group theoretical data is gathered in the Appendix.

2. Sl(2) Revisited

As an introduction to our techniques, let us consider two familiar examples of automorphic forms for Sl(2, Z).

2.1. Symplectic theta series. Our first example is the standard Jacobi theta series
$$\theta(\tau) = \tau_2^{1/4} \sum_{m\in\mathbb{Z}} e^{i\pi\tau m^2} = \sum_{m\in\mathbb{Z}} f_\tau(m), \qquad f_\tau(x) = \tau_2^{1/4}\, e^{i\pi\tau x^2},$$ (2.1)
where we inserted a power of $\tau_2$ to cancel the modular weight. As is well known, this series is a holomorphic modular form of Sl(2, Z) up to a system of phases. The invariance under the generator $T: \tau \to \tau + 2$ is manifest, while the transformation under $S: \tau \to -1/\tau$, yielding
$$\theta(-1/\tau) = \sqrt{i}\,\theta(\tau),$$ (2.2)
follows from the Poisson resummation formula,
$$\sum_{m\in\mathbb{Z}} f(m) = \sum_{p\in\mathbb{Z}} \tilde f(p), \qquad \tilde f(p) \equiv \int dx\, f(x)\, e^{2\pi i p x},$$ (2.3)
applied to the Gaussian kernel $f_\tau(x)$. A better understanding of the mechanism behind the invariance of the theta series (2.1) can be gained (see e.g., [14]) by rewriting it as
$$\theta(\tau) = \langle \delta, \rho(g_\tau)\cdot f \rangle.$$ (2.4)
In this symbolic form, ρ is a representation of the double cover $\tilde G$ of Sl(2, R) in the space S of Schwartz functions of one variable; $g_\tau = \frac{1}{\sqrt{\tau_2}}\begin{pmatrix} 1 & \tau_1 \\ 0 & \tau_2 \end{pmatrix}$ is an element of G = Sl(2, R) parameterizing the coset U(1)\Sl(2, R) in the Iwasawa gauge; $f(x) = e^{-x^2/2}$ is the spherical vector of the representation ρ, i.e. an element of S which is an eigenvector of the preimage $\tilde U \subset \tilde G$ of the maximal compact subgroup K = U(1) of G corresponding to the basic character of $\tilde U$; finally, $\delta_{\mathbb{Z}}(x) = \sum_{m\in\mathbb{Z}} \delta(x - m)$ is a distribution in the dual space of S, invariant under the action of Sl(2, Z). [The inner product $\langle\cdot,\cdot\rangle$ is just integration $\int dx$.] The invariance of θ(τ) then follows trivially from the covariance of the various pieces in (2.4). More explicitly, ρ is the so-called metaplectic representation
$$\rho\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} : \phi(x) \to e^{i\pi t x^2}\phi(x),$$ (2.5)
$$\rho\begin{pmatrix} e^{-t} & 0 \\ 0 & e^{t} \end{pmatrix} : \phi(x) \to e^{t/2}\phi(e^{t}x),$$ (2.6)
$$\rho\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} : \phi(x) \to e^{i\pi/4}\,\tilde\phi(-x),$$ (2.7)
acting on a function φ ∈ S. It is easily checked that the defining relation $(ST)^3 = 1$ holds modulo a phase, and that the generators S and T leave the distribution δ invariant. Linearizing (2.5) and (2.6) yields generators for the positive root and Cartan elements
$$E_+ = i\pi x^2, \qquad H = \tfrac{1}{2}(x\partial_x + \partial_x x),$$ (2.8)
while the negative root follows by a Weyl reflection
$$E_- = -\rho(S)\cdot E_+\cdot\rho(S^{-1}) = \frac{i}{4\pi}\,\partial_x^2,$$ (2.9)
and we have the Sl(2, R) algebra,
$$[H, E_\pm] = \pm 2E_\pm, \qquad H = [E_+, E_-].$$ (2.10)
In this representation, there does not exist a spherical vector strictly speaking, since the compact generator $E_+ - E_-$ (recognized as the Hamiltonian of the harmonic oscillator) does not admit a state with zero eigenvalue. The lowest state has eigenvalue i/2, and plays the role of the spherical vector in (2.4),
$$(E_+ - E_-)\,f = \frac{i}{2}\,f, \qquad f(x) = e^{-\pi x^2}.$$ (2.11)
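As a quick numerical sanity check of (2.11), the following sketch applies $E_+ - E_-$, with $E_\pm$ taken from (2.8) and (2.9), to $f(x) = e^{-\pi x^2}$ and divides by $f$. The second derivative is approximated by a central finite difference; the step size `h` and all function names are illustrative assumptions, not part of the construction.

```python
import math

def f(x):
    """Candidate spherical vector f(x) = exp(-pi x^2) from (2.11)."""
    return math.exp(-math.pi * x * x)

def second_derivative(func, x, h=1e-4):
    """Central finite-difference approximation of func''(x) (h is an assumed step)."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

def compact_generator_ratio(x):
    """Return ((E+ - E-) f)(x) / f(x), with E+ = i pi x^2 and E- = (i / 4 pi) d^2/dx^2."""
    e_plus_f = 1j * math.pi * x * x * f(x)
    e_minus_f = 1j / (4.0 * math.pi) * second_derivative(f, x)
    return (e_plus_f - e_minus_f) / f(x)

# The ratio should be close to i/2 at every sample point, up to discretization error.
for x in (0.0, 0.3, 0.7, 1.2):
    print(x, compact_generator_ratio(x))
```

The ratio being independent of x is what makes f an eigenvector; the finite-difference step only limits the accuracy of the check.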
Its invariance (up to a phase) under the compact K guarantees that the theta series (2.4) depends only on τ ∈ K\G (up to a phase). In particular, the S generator corresponds to the rotation by an angle π inside K, and therefore leaves f invariant. This is the statement that the Gaussian kernel f is invariant under Fourier transformation, and lies at the heart of the automorphic invariance of the theta series (2.1). The construction holds, in fact, for any symplectic group Sp(n, Z) (with Sp(1) = Sl(2)), and leads to the well known Jacobi–Siegel theta functions,
$$\theta_{Sp(n,\mathbb{Z})} = \sum_{(m^i)\in\mathbb{Z}^n} e^{i\pi\, m^i \tau_{ij} m^j}.$$ (2.12)
This corresponds to the minimal representation
$$E^{ij} = \frac{i}{2}\,x^i x^j, \qquad E_{ij} = \frac{i}{2}\,\partial_i\partial_j, \qquad H^i_j = (x^i\partial_j + \partial_j x^i)/2$$ (2.13)
of Sp(n, R), with algebra
$$[E^{ij}, E_{kl}] = \frac{1}{4}\left(\delta^i_l H^j_k + \delta^j_l H^i_k + \delta^i_k H^j_l + \delta^j_k H^i_l\right),$$ (2.14)
acting on the Schwartz space of functions of n variables $x^i$ (see e.g., [9]).

2.2. Eisenstein series and spherical vector. Our second example is the non-holomorphic Eisenstein series (see e.g., [15, 4])
$$E_s(\tau, \bar\tau) = \sum_{(m,n)\neq(0,0)} \left[\frac{\tau_2}{|m + n\tau|^2}\right]^s,$$ (2.15)
which is a function on the upper half plane U(1)\Sl(2, R) parameterized by τ and is invariant under the right action of Sl(2, Z) given by τ → (aτ + b)/(cτ + d). This action can be compensated by a linear one on the vector (m, n) and the Eisenstein series can therefore be rewritten in the symbolic form (2.4), where now $\delta = \sum_{(m,n)\in\mathbb{Z}^2\setminus(0,0)} \delta(x - m)\,\delta(y - n)$ and ρ is the linear representation
$$\rho\begin{pmatrix} a & b \\ c & d \end{pmatrix} : \phi(x, y) \to \phi(ax + by, cx + dy)$$ (2.16)
corresponding to the infinitesimal generators
$$E_+ = x\partial_y, \qquad E_- = y\partial_x, \qquad H = x\partial_x - y\partial_y,$$ (2.17)
generating the Sl(2) algebra (2.10). The spherical vector $f(x, y) = (x^2 + y^2)^{-s}$ of the representation ρ is clearly invariant under the maximal compact subgroup U(1) ⊂ Sl(2) generated by $E_+ - E_-$. In this case, it is not unique (any function of $x^2 + y^2$ is U(1) invariant) because the linear action (2.16) on functions of two variables is reducible. An irreducible representation in a single variable, known as the first principal series, is obtained by restricting to homogeneous, even functions of degree 2s,
$$\phi(x, y) = \lambda^{2s}\,\phi(\lambda x, \lambda y),$$ (2.18)
and setting y = 1 (say),
$$\phi(x) \equiv \phi(x, y)\big|_{y=1}.$$ (2.19)
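The representation ρ induces an irreducible one,
$$\rho_s\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} : \phi(x) \to \phi(x + t),$$ (2.20)
$$\rho_s\begin{pmatrix} e^{-t} & 0 \\ 0 & e^{t} \end{pmatrix} : \phi(x) \to e^{-2st}\,\phi(e^{-2t}x),$$ (2.21)
$$\rho_s\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} : \phi(x) \to x^{-2s}\,\phi(-1/x),$$ (2.22)
with spherical vector
$$f_s = (x^2 + 1)^{-s}.$$ (2.23)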
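The invariance of (2.23) under the Weyl generator (2.22) can be illustrated numerically. The sketch below is only a spot check at a few sample points; it assumes that $x^{-2s}$ is read as $|x|^{-2s}$, as appropriate for the even functions singled out by (2.18), and the function names are hypothetical.

```python
def spherical_vector(x, s):
    """f_s(x) = (x^2 + 1)^(-s) from (2.23)."""
    return (x * x + 1.0) ** (-s)

def apply_S(phi, s):
    """Return x -> |x|^(-2s) * phi(-1/x), i.e. (2.22) with x^(-2s) read as |x|^(-2s)."""
    def transformed(x):
        return abs(x) ** (-2.0 * s) * phi(-1.0 / x)
    return transformed

def max_deviation(phi, s, points):
    """Largest |(S phi)(x) - phi(x)| over the (nonzero) sample points."""
    s_phi = apply_S(phi, s)
    return max(abs(s_phi(x) - phi(x)) for x in points)

s = 0.75                                  # arbitrary sample value of the parameter
points = [-3.0, -0.4, 0.2, 1.0, 2.5]      # arbitrary nonzero sample points

# f_s is fixed by S (deviation at rounding level) ...
print(max_deviation(lambda x: spherical_vector(x, s), s, points))
# ... while a generic even function such as (x^2 + 2)^(-s) is not.
print(max_deviation(lambda x: (x * x + 2.0) ** (-s), s, points))
```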
An equivalent representation can be obtained by Fourier transforming the variable x. In terms of the Eisenstein series (2.15), this amounts to performing a Poisson resummation on m,
$$E^{Sl(2,\mathbb{Z})}_{2;s} = 2\,\zeta(2s)\,\tau_2^{s} + \frac{2\sqrt{\pi}\,\tau_2^{1-s}\,\Gamma(s - 1/2)\,\zeta(2s - 1)}{\Gamma(s)} + \frac{2\pi^s\sqrt{\tau_2}}{\Gamma(s)} \sum_{m\neq 0}\sum_{n\neq 0} \left|\frac{m}{n}\right|^{s-1/2} K_{s-1/2}(2\pi|mn|\tau_2)\, e^{-2\pi i m n \tau_1}.$$ (2.24)
Using instead the summation variable N = mn, this can be rewritten as
$$E^{Sl(2,\mathbb{Z})}_{2;s} = 2\,\zeta(2s)\,\tau_2^{s} + \frac{2\sqrt{\pi}\,\tau_2^{1-s}\,\Gamma(s - 1/2)\,\zeta(2s - 1)}{\Gamma(s)} + \frac{2\pi^s\sqrt{\tau_2}}{\Gamma(s)} \sum_{N\in\mathbb{Z}^*} \mu_s(N)\, N^{s-1/2}\, K_{s-1/2}(2\pi\tau_2 N)\, e^{2\pi i \tau_1 N},$$ (2.25)
where the summation measure of the bulk term can be expressed in terms of the number-theoretic quantity
$$\mu_s(N) = \sum_{n|N} n^{-2s+1}.$$ (2.26)
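The step from (2.24) to (2.25) rests on the identity $\sum_{mn=N} (m/n)^{s-1/2} = N^{s-1/2}\mu_s(N)$ for positive m, n. A minimal sketch of this bookkeeping, with hypothetical function names and restricted to positive N for simplicity:

```python
def mu(s, N):
    """Divisor sum mu_s(N) = sum over n | N of n^(1 - 2s), as in (2.26)."""
    N = abs(N)
    return sum(n ** (1 - 2 * s) for n in range(1, N + 1) if N % n == 0)

def bulk_weight(s, N):
    """Sum over factorizations N = m * n (m, n > 0) of (m / n)^(s - 1/2), as in (2.24)."""
    N = abs(N)
    return sum(((N // n) / n) ** (s - 0.5) for n in range(1, N + 1) if N % n == 0)

s = 1.5   # arbitrary sample value
for N in (1, 6, 12, 30):
    # The two columns should agree up to rounding.
    print(N, bulk_weight(s, N), N ** (s - 0.5) * mu(s, N))
```

For s = 1/2 the weight $n^{-2s+1}$ is identically 1, so `mu(0.5, N)` simply counts the divisors of N.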
Indeed, disregarding for now the first two degenerate terms, we see that the Eisenstein series can again be written as in (2.4), where the summation measure is
$$\delta_s(y) = \sum_{N\in\mathbb{Z}^*} \mu_s(N)\,\delta(y - N),$$ (2.27)
and the one-dimensional representation $\rho_s$ acting as
$$\rho_s\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} : \phi(y) \to e^{-ity}\,\phi(y),$$ (2.28)
$$\rho_s\begin{pmatrix} e^{-t} & 0 \\ 0 & e^{t} \end{pmatrix} : \phi(y) \to e^{-2(s-1)t}\,\phi(e^{2t}y),$$ (2.29)
is generated by
$$E_+ = iy, \qquad E_- = i(y\partial_y + 2 - 2s)\,\partial_y, \qquad H = 2y\partial_y + 2 - 2s.$$ (2.30)
Note that this minimal representation has a parameter s, and is distinct from the one in (2.8, 2.9). It is, of course, intertwined with the representation (2.21, 2.22) by Fourier transform. The function
$$f_s = y^{s-1/2}\, K_{s-1/2}(y)$$ (2.31)
can be easily checked to be annihilated by the compact generator $K = E_+ - E_- = -i\left(y\partial_y^2 + (2 - 2s)\partial_y - y\right)$, and therefore is a spherical vector of the representation (2.30). At each value of s, it is unique if one requires that it vanishes as y → ∞.

2.3. Summation measure, p-adic fields and degenerate contributions. While the spherical vector can be easily obtained by solving a linear differential equation, the distribution δ invariant under the discrete subgroup Sl(2, Z) appears to be more mysterious. In fact, it has a simple interpretation in terms of p-adic number fields, as we now explain. The simplest instance arises for the θ series (2.1) itself, which can be rewritten (at the origin τ = i) as a sum over principal adeles
$$\theta(\tau = i) = \sum_{x\in\mathbb{Q}} \exp(-\pi x^2) \prod_{p\ \mathrm{prime}} \gamma_p(x),$$ (2.32)
where $\gamma_p(x)$ is 1 on the p-adic integers and 0 elsewhere. The real spherical vector is the Gaussian and the function $\gamma_p(x)$ is its p-adic analog: just like the real Gaussian, it is invariant under p-adic Fourier transform (the review [16] provides an introduction to p-adic numbers and integration theory for physicists). Hence $\gamma_p(x)$ is the p-adic spherical vector of the representation (2.5), and we have thus obtained an "adelic" formula for the unit weight summation measure. To take a less trivial case, consider the summation measure (2.26) appearing in the distribution δ in (2.27). It can also be rewritten as an infinite product over primes,
$$\sum_{N} \mu_s(N) = \sum_{x\in\mathbb{Q}} \prod_{p\ \mathrm{prime}} f_p(x), \qquad f_p(x) = \gamma_p(x)\,\frac{1 - p^{-2s+1}\,|x|_p^{2s-1}}{1 - p^{-2s+1}},$$ (2.33)
where $|x|_p$ is the p-adic norm of x (if x = N is an integer, $|N|_p = p^{-k}$, where k is the largest integer such that $p^k$ divides N). Just as above, $f_p(x)$ can in fact be interpreted as the p-adic spherical vector of the representation (2.29). To convince oneself of this fact, one may take the p-adic Fourier transform of $f_p$, and find
$$\tilde f_p(u) = (1 - p^{-2s})^{-1}\,\max(|u|_p, 1)^{-2s}.$$ (2.34)
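The pointwise content of (2.33), namely $\mu_s(N) = \prod_p f_p(N)$ for a nonzero integer N, can be checked directly. The sketch below assumes N is a nonzero integer (so that $\gamma_p(N) = 1$ for every p) and $s \neq 1/2$; only primes dividing N contribute, since $f_p(N) = 1$ when $|N|_p = 1$. All function names are hypothetical.

```python
def padic_norm(N, p):
    """|N|_p = p^(-k), with k the largest power of p dividing the nonzero integer N."""
    N = abs(N)
    k = 0
    while N % p == 0:
        N //= p
        k += 1
    return float(p) ** (-k)

def prime_divisors(N):
    """Primes dividing the nonzero integer N, by trial division."""
    N = abs(N)
    primes = []
    p = 2
    while p * p <= N:
        if N % p == 0:
            primes.append(p)
            while N % p == 0:
                N //= p
        p += 1
    if N > 1:
        primes.append(N)
    return primes

def f_p(N, p, s):
    """Local factor of (2.33) at a nonzero integer N, where gamma_p(N) = 1 (needs s != 1/2)."""
    t = 2 * s - 1
    return (1 - p ** (-t) * padic_norm(N, p) ** t) / (1 - p ** (-t))

def mu_from_product(s, N):
    """Product of f_p(N) over the primes dividing N; all other primes contribute 1."""
    result = 1.0
    for p in prime_divisors(N):
        result *= f_p(N, p, s)
    return result

def mu_from_divisors(s, N):
    """Divisor sum (2.26)."""
    N = abs(N)
    return sum(n ** (1 - 2 * s) for n in range(1, N + 1) if N % n == 0)

for N in (1, 12, 360, 1001):
    print(N, mu_from_divisors(1.5, N), mu_from_product(1.5, N))
```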
This is indeed invariant under u → −1/u, and therefore is a spherical vector for the representation (2.20).² It is in fact identical to the real spherical vector (2.23), upon replacing the orthogonal real norm $\|(x, 1)\|^2 \equiv x^2 + 1$ by the p-adic norm $\|(x, 1)\|_p \equiv \max(|x|_p, 1)$. This suggests that the p-adic spherical vector is simply related to the real spherical vector by changing from orthogonal to p-adic norms and Bessel functions to "p-adic Bessel" functions. We shall not pursue this line further here, referring to [13] for a rigorous derivation.

Finally, we should say a word about the first two power terms in (2.25). As seen from the above Poisson resummation, these two terms can be viewed as the regulated value of the spherical vector f(x) at x = 0. Unfortunately, we do not know of a direct way to extract them from f(x) alone; an unsatisfactory method is to deduce them by imposing invariance of (2.25) under the generator S.

² One may also check that the product of $\tilde f_p(u)$ over all p reproduces the correct summation measure in the Eisenstein series (2.15) upon using the summation variable u = m/n.

2.4. Generalization to Sl(n, Z). The construction of the minimal representation of Sl(2, R) above can be easily generalized to any Sl(n) by starting with the Sl(n, Z) Eisenstein series in the fundamental representation,
$$E^{Sl(n,\mathbb{Z})}_{n;s} = \sum_{m^I\in\mathbb{Z}^n\setminus\{0\}} \left[m^I g_{IJ}\, m^J\right]^{-s},$$ (2.35)
and Poisson resumming one integer, $m^1 \equiv m$ say. In the language of [5], this amounts to the small radius expansion in one direction and we find
$$E^{Sl(n,\mathbb{Z})}_{n;s} = 2\,\zeta(2s)\,R^{-2s} + \frac{\sqrt{\pi}\,\Gamma(s - 1/2)}{\Gamma(s)\,R} \sum_{m^i\in\mathbb{Z}^{n-1}\setminus\{0\}} \left[m^i \tilde g_{ij}\, m^j\right]^{-s+1/2} + \frac{2\pi^s}{\Gamma(s)\,R^{s+1/2}} \sum_{m\neq 0}\ \sum_{m^i\in\mathbb{Z}^{n-1}\setminus\{0\}} \left|\frac{m^2}{m^i \tilde g_{ij}\, m^j}\right|^{\frac{s-1/2}{2}} K_{s-1/2}\!\left(2\pi\frac{|m|}{R}\sqrt{m^i \tilde g_{ij}\, m^j}\right) e^{-2\pi i\, m\, m^i A_i}.$$ (2.36)
We have decomposed the n-dimensional metric $g_{IJ}$ parameterizing SO(n, R)\Sl(n, R) into an (n − 1)-dimensional metric $\tilde g_{ij} = g_{ij} - R^2 A_i A_j$, the radius of the nth direction $R = g_{11}^{1/2}$ and the off-diagonal metric $A_i = g_{1i}/g_{11}$. We now have an (n − 1)-dimensional representation of Sl(n) on n − 1 variables $x^i$ with Sl(n − 1) realized linearly. The infinitesimal generators corresponding to positive and negative roots are given by
$$E_+^{i} = i x^i, \qquad E_{+j}^{\ \ i} = x^i\partial_j, \qquad E_{-i} = i(x^j\partial_j + 2 - 2s)\,\partial_i, \qquad E_{-i}^{\ \ j} = x^j\partial_i \quad (i > j),$$ (2.37)
with Cartan elements following by commutation. This is the minimal representation of Sl(n, R), generalizing the Sl(2, R) case in (2.30). Note that this minimal representation again has a continuous parameter s. For other groups than $A_n$, the minimal representation will in fact be unique. For $A_n$, the above representation is unitary when Re(s) = n/4. The spherical vector is easily read off from (2.36), evaluated at the origin $g_{ij} = \tilde g_{ij} = \delta_{ij}$, R = 1 (rescaling $x^i \to x^i/(2\pi)$),
$$f_{A_n,s} = \mathcal{K}_{s-1/2}\!\left(\sqrt{\textstyle\sum_i (x^i)^2}\right) = \mathcal{K}_{s-1/2}\!\left(\|(x^1, \dots, x^{n-1})\|\right),$$ (2.38)
where $\mathcal{K}_t(x) \equiv x^{-t} K_t(x)$ ($K_t$ is the modified Bessel function of the second kind) and the Euclidean norm is $\|(x_1, x_2, \dots)\| \equiv \sqrt{x_1^2 + x_2^2 + \cdots}$. This spherical vector is indeed annihilated by the compact generators following from (2.37). The p-adic spherical vector in the representation corresponding to (2.37) may be obtained from the summation measure in (2.36) by the method outlined in Sect. 2.3. The result is
$$f_p(x^1, \dots, x^{n-1}) = \gamma_p(x^1)\cdots\gamma_p(x^{n-1})\,\frac{1 - p^{-s}\,\|(x^1, \dots, x^{n-1})\|_p^{s}}{1 - p^{-s}}.$$ (2.39)
Again, this may be obtained from the real spherical vector (2.38) by replacing the Euclidean norm by the p-adic one along with $\mathcal{K}_s \to \mathcal{K}_{p,s}(x) = (1 - p^{-s}x)/(1 - p^{-s})$.

3. Minimal Representation for Simply Laced Lie Groups

The minimal representation we have described for Sl(n, R) has been generalized in [11] to the case of simply-laced groups G(F) for an arbitrary local field F. In this section, we shall review the construction of [11], and make it fully explicit.
3.1. Nilpotent orbit and canonical polarization. The minimal representation can be understood as the quantization of the smallest co-adjoint orbit in G. In order to construct this minimal orbit, one observes that all simple Lie algebras have an essentially unique 5-grading (see e.g., [18])
$$G = G_{-2} \oplus G_{-1} \oplus G_0 \oplus G_1 \oplus G_2$$ (3.1)
by the charge under the Cartan generator $H_\omega$ associated to the highest root $E_\omega$ (for a given choice of Cartan subalgebra and system of simple roots $\alpha_i$). The spaces $G_{\pm 2}$ have dimension 1 and are generated by the highest and lowest root $E_{\pm\omega}$ respectively. $G_1$ contains only positive roots, and $G_0$ contains all Cartan generators as well as the remaining positive roots and the corresponding negative ones; $G_{-k}$ is obtained from $G_k$ by mapping all positive roots to minus themselves. The grading (3.1) can also be obtained by branching the adjoint representation of G into the maximal subgroup Sl(2) × H, where Sl(2) is generated by $(E_\omega, H_\omega, E_{-\omega})$ and H is the maximal subgroup of G commuting with Sl(2) (explicit decompositions are shown in Table 1 for all simply-laced groups):
$$G \supset Sl(2)\times H, \qquad \mathrm{adj}_G = (3, 1) \oplus (2, R) \oplus (1, \mathrm{adj}_H) = 1 \oplus R \oplus [1 \oplus \mathrm{adj}_H] \oplus R \oplus 1.$$ (3.2)
In particular, $G_1$ and $G_{-1}$ transform as a (possibly reducible) representation R of H, with a symplectic reality condition so that (2, R) is real. The set $\mathbb{C}H_\omega \oplus G_1 \oplus \mathbb{C}E_\omega$ is the coadjoint orbit of the highest root $E_\omega$, namely the minimal orbit O we are seeking. Since the highest root generator $E_\omega$ is nilpotent, this is in fact a nilpotent orbit. As any coadjoint orbit, it carries a standard Kirillov–Kostant symplectic form, and its restriction to $G_1$ is the symplectic form providing the reality condition just mentioned. The nilpotent orbit can also be understood as the coset P\G, where P is the parabolic subgroup generated by $G_{-2} \oplus G_{-1} \oplus (G_0 \setminus \{H_\omega\})$. The group G acts on O by right multiplication on the coset P\G, and therefore on the functions on O.
The minimal representation can be obtained by quantizing the orbit O, i.e. by replacing functions on the symplectic manifold O by operators on the Hilbert space of sections of a line bundle on a Lagrangian submanifold of O. In more mundane terms, we need to choose a polarization, i.e. a set of positions and momenta among the coordinates of O. For this, note that, as a consequence of the grading, the subspace $G_1 \oplus G_2$ forms a Heisenberg algebra
$$[E_{\alpha_1}, E_{\alpha_2}] = (\alpha_1, \alpha_2)\,E_\omega, \qquad \alpha_1, \alpha_2 \in G_1,$$ (3.3)
where (· , ·) is the symplectic form. A standard polarization can be constructed by picking in $G_1$ the simple root $\beta_0$ to which the affine root attaches on the extended Dynkin diagram.³ The positive roots in $G_1$ then split into roots that have an inner product $\langle\alpha, \beta_0\rangle$ with $\beta_0$ equal to 1 (we denote them $\beta_i$), 0 (denoted $\gamma_i = \omega - \beta_i$), 2 ($\beta_0$ itself), or −1 (denoted $\gamma_0 = \omega - \beta_0$). We choose as position operators $E_{\gamma_0}$, $E_{\gamma_i}$ and $E_\omega$:
$$E_\omega = iy, \qquad E_{\gamma_i} = i x_i, \qquad i = 0, \dots, d - 1,$$ (3.4)
acting on a space of functions of the variables $y, x_i$. The conjugate momenta are then represented as derivative operators,
$$E_{\beta_i} = y\,\partial_i, \qquad i = 0, \dots, d - 1.$$ (3.5)
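That the operators (3.4) and (3.5) realize the Heisenberg algebra (3.3), with the symplectic form normalized so that $(\beta_i, \gamma_j) = \delta_{ij}$, can be checked mechanically on monomials. The following sketch uses a hypothetical minimal representation of functions as dictionaries from exponent tuples $(e_y, e_0, \dots, e_{d-1})$ to coefficients; the value d = 3 and the test function are arbitrary choices.

```python
from collections import defaultdict

D = 3  # number of variables x_0, ..., x_{D-1}; an arbitrary choice for this check

def times_var(f, k, coeff=1):
    """Multiply f by coeff * v_k, where v_0 = y and v_{i+1} = x_i."""
    out = defaultdict(complex)
    for exps, c in f.items():
        new = list(exps)
        new[k] += 1
        out[tuple(new)] += coeff * c
    return dict(out)

def d(f, k):
    """Partial derivative of f with respect to variable number k."""
    out = defaultdict(complex)
    for exps, c in f.items():
        if exps[k] != 0:
            new = list(exps)
            new[k] -= 1
            out[tuple(new)] += exps[k] * c
    return dict(out)

def subtract(f, g):
    """Return f - g, dropping vanishing coefficients."""
    out = defaultdict(complex)
    for exps, c in f.items():
        out[exps] += c
    for exps, c in g.items():
        out[exps] -= c
    return {e: c for e, c in out.items() if abs(c) > 1e-12}

def E_omega(f):            # E_omega = i y, from (3.4)
    return times_var(f, 0, 1j)

def E_gamma(i):            # E_gamma_i = i x_i, from (3.4)
    return lambda f: times_var(f, i + 1, 1j)

def E_beta(i):             # E_beta_i = y d/dx_i, from (3.5)
    return lambda f: times_var(d(f, i + 1), 0)

def commutator(A, B, f):
    return subtract(A(B(f)), B(A(f)))

def heisenberg_defect(i, j, f):
    """[E_beta_i, E_gamma_j] f - delta_ij E_omega f; an empty dict means the relation holds."""
    target = E_omega(f) if i == j else {}
    return subtract(commutator(E_beta(i), E_gamma(j), f), target)

# Test function y^2 x0^3 x1 + 2 x0 x1^2 x2^5 / y (Laurent monomials are allowed).
test_function = {(2, 3, 1, 0): 1.0, (-1, 1, 2, 5): 2.0}
print(all(heisenberg_defect(i, j, test_function) == {} for i in range(D) for j in range(D)))
```

Positions commute among themselves and so do momenta, so the only non-trivial brackets are the ones tested here.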
The expression for the remaining momentum-like generator $H_\omega$ will be determined below, but could be obtained at this stage by computing the Kirillov–Kostant symplectic form on P\G. To summarize our notations, the 5-grading (3.1) corresponds to the decomposition $G_2 = \{E_\omega\}$, $G_1 = \{(E_{\beta_i}, E_{\gamma_i})\}$, $G_0 = \{E_{-\alpha_j}, H_{\alpha_k}, E_{\alpha_j}\}$, $G_{-1} = \{(E_{-\beta_i}, E_{-\gamma_i})\}$, $G_{-2} = \{E_{-\omega}\}$, where i = 0, ..., d − 1 = dim(R)/2 − 1, j = 1, ..., (dim(H) − rank(G) + 1)/2 and $H_{\alpha_k}$ are the Cartan generators of the simple roots with k = 1, ..., rank(G).

3.2. Induced representation and Weyl generators. Having represented the Heisenberg subalgebra on a space of functions of d + 1 variables $(y, x_{i=0,\dots,d-1})$, it remains to extend this representation to all generators in G. This can be done by unitary induction from the parabolic subgroup P. Rather than taking this approach, we prefer to generate the missing generators using the unbroken symmetry under H and Weyl generators. As a first step, it is useful to note that the choice of polarization is invariant under a subalgebra $H_0 \subset H$ acting linearly on $(x_{i=1,\dots,d-1})$ while leaving $(y, x_0)$ invariant. For the D and E groups, $H_0$ is the subalgebra generated by the simple roots which are not attached to $\beta_0$ in the Dynkin diagram of G, whilst for the A series, by the simple roots attached to neither $\beta_0$ nor the root at the other end of the Dynkin diagram. The subalgebras $H_0$ are listed in Table 2.

³ For Sl(n), the affine root attaches to two roots $\alpha_1$ and $\alpha_{n-1}$. We choose $\beta_0 = \alpha_1$.

Sl(n)   ⊃ Sl(2) × Sl(n − 2) × R⁺
adj     = (3, 1, 0) ⊕ [(2, n − 2, 1) ⊕ (2, n − 2, −1)] ⊕ (1, adj, 0)
        = 1 ⊕ 2(n − 2) ⊕ [1 ⊕ adj] ⊕ 2(n − 2) ⊕ 1

SO(2n)  ⊃ Sl(2) × Sl(2) × SO(2n − 4)
adj     = (3, 1, 1) ⊕ (2, 2, 2n − 4) ⊕ (1, 3, 1) ⊕ (1, 1, adj)
        = 1 ⊕ (2, 2n − 4) ⊕ [1 ⊕ adj] ⊕ (2, 2n − 4) ⊕ 1

E6      ⊃ Sl(2) × Sl(6)
78      = (3, 1) ⊕ (2, 20) ⊕ (1, 35)
        = 1 ⊕ 20 ⊕ [1 ⊕ 35] ⊕ 20 ⊕ 1

E7      ⊃ Sl(2) × SO(6, 6)
133     = (3, 1) ⊕ (2, 32) ⊕ (1, 66)
        = 1 ⊕ 32 ⊕ [1 ⊕ 66] ⊕ 32 ⊕ 1

E8      ⊃ Sl(2) × E7
248     = (3, 1) ⊕ (2, 56) ⊕ (1, 133)
        = 1 ⊕ 56 ⊕ [1 ⊕ 133] ⊕ 56 ⊕ 1
(3.6)

Table 1. Five-graded decomposition for simply laced simple groups

G          dim       H0                    G1*             I3
Sl(n)      n − 1     Sl(n − 3)             [n − 3]         0
SO(n, n)   2n − 3    SO(n − 3, n − 3)      1 ⊕ [2n − 6]    $x_1\,(\sum x_{2i}\, x_{2i+1})$
E6         11        Sl(3) × Sl(3)         (3, 3)          det
E7         17        Sl(6)                 15              Pf
E8         29        E6                    27              $27^{\otimes_s 3}|_1$

Table 2. Dimension of minimal representation, linearly realized subgroup $H_0 \subset H \subset G$, representation of $G_1^*$ under $H_0$, and associated cubic invariant $I_3$
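The entries of Tables 1 and 2 are tied together by simple counting: the decomposition (3.2) requires dim G = 3 + 2 dim R + dim H, and the minimal representation acts on functions of d + 1 = dim(R)/2 + 1 variables. The sketch below checks this bookkeeping; the closed-form dimensions of the classical groups are standard facts entered by hand, and the function names are hypothetical.

```python
def five_grading_ok(dim_g, dim_r, dim_h):
    """Check dim G = 3 + 2 dim R + dim H, i.e. adj_G = (3,1) + (2,R) + (1,adj_H)."""
    return dim_g == 3 + 2 * dim_r + dim_h

def minimal_dimension(dim_r):
    """Number of variables (y, x_0, ..., x_{d-1}) with d = dim(R)/2."""
    return dim_r // 2 + 1

def rows(n):
    """(name, dim G, dim R, dim H, dimension quoted in Table 2) for a given rank parameter n."""
    return [
        # H = Sl(n-2) x R+ for Sl(n); H = Sl(2) x SO(2n-4) for SO(n,n)
        ("Sl(n)", n * n - 1, 2 * (n - 2), (n - 2) ** 2 - 1 + 1, n - 1),
        ("SO(n,n)", n * (2 * n - 1), 2 * (2 * n - 4),
         3 + (2 * n - 4) * (2 * n - 5) // 2, 2 * n - 3),
        ("E6", 78, 20, 35, 11),
        ("E7", 133, 32, 66, 17),
        ("E8", 248, 56, 133, 29),
    ]

for name, dim_g, dim_r, dim_h, quoted in rows(5):
    print(name, five_grading_ok(dim_g, dim_r, dim_h), minimal_dimension(dim_r) == quoted)
```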
In order to extend the action of $H_0$ and the Heisenberg subalgebra to the rest of G, we introduce the action of two Weyl generators S and A. The first, S, exchanges the momenta $\beta_i$ with the positions $\gamma_i$ for all i = 0, ..., d − 1 and is therefore achieved by Fourier transformation in the Heisenberg coordinates $x_i$, i = 0, ..., d − 1,
$$(Sf)(y, x_0, \dots, x_{d-1}) = \int \frac{\prod_{i=0}^{d-1} dp_i}{(2\pi y)^{d/2}}\, f(y, p_0, \dots, p_{d-1})\, e^{\frac{i}{y}\sum_{i=0}^{d-1} p_i x_i}.$$ (3.7)
It also sends all $\alpha_i$ to $-\alpha_i$, while leaving ω invariant,
$$S E_{\alpha_i} S^{-1} = E_{-\alpha_i}, \qquad S E_\omega S^{-1} = E_\omega.$$ (3.8)
The second generator A is the Weyl reflection with respect to the root $\beta_0$. It maps $\beta_0$ to minus itself, $\gamma_0$ to ω, and all $\beta_i$ to the roots $\alpha_j$ that were not in $H_0$. All roots in $H_0$ are invariant under A, and so are all $\gamma_{i=1,\dots,d-1}$. In order to write the action of A, we need to introduce an $H_0$-invariant cubic form on $G_1^*$,
$$I_3 = \sum_{i<j<k} c(i, j, k)\, x_i x_j x_k,$$ (3.9)
where the sum extends over all i, j, k = 1, ..., d − 1 such that $\beta_i + \beta_j + \beta_k = \beta_0 + \omega$. The sign c(i, j, k) is given by [11]
$$c(i, j, k) = (-)^{B(\beta_i,\beta_j) + B(\beta_i,\beta_k) + B(\beta_j,\beta_k) + B(\beta_0,\omega) + 1},$$ (3.10)
where B(α, β) is the adjacency matrix (namely a bilinear form such that $\langle\alpha, \beta\rangle = B(\alpha, \beta) + B(\beta, \alpha)$). The cubic invariant $I_3$ in (3.9) is unique, except for the case of G = Sl(n) where there is none, and is listed in the last column of Table 2. The action of A is given in terms of $I_3$ by
$$(Af)(y, x_0, x_1, \dots, x_{d-1}) = e^{-\frac{i I_3}{x_0 y}}\, f(-x_0, y, x_1, \dots, x_{d-1}).$$ (3.11)
One may check that the generators A and S satisfy the relation
$$(AS)^3 = (SA)^3$$ (3.12)
in the Weyl group. In fact, as in the symplectic case where the relation $(ST)^3 = 1$ was equivalent to the invariance of the Gaussian character $e^{ix^2}$ under Fourier transform, the relation (3.12) amounts to the invariance of the cubic character $x_0^{\alpha}\,(I_3)^{\beta}\, e^{iI_3/x_0}$ under Fourier transform over all $x_{i=0,\dots,d-1}$. This invariance can be easily checked in the stationary phase approximation,⁴ and holds exactly for particular values of the exponents α, β [20]. In fact, the minimal representation yields all cubic forms $I_3$ such that $e^{iI_3/x_0}$ is invariant [20].

3.3. Example: Minimal representation of $D_4$. Using the Weyl generators (3.7) and (3.11), we can now compute the action of $E_\alpha$ in the minimal representation for all positive and negative roots and in turn obtain the Cartan generators through $[E_\alpha, E_{-\alpha}] = \alpha\cdot H$. As an illustration, we display the SO(4, 4) case in detail [19]. The data for other groups are tabulated in the Appendix. The Dynkin diagram of $D_4$ is

            3 (α2)
              o
              |
    o --------o-------- o
  1 (α1)    2 (β0)    4 (α3)

where we have indicated the standard labeling as well as one more convenient for our purposes; the construction will be symmetric under permutations of $(\alpha_1, \alpha_2, \alpha_3)$ and hence under SO(4, 4) triality. The positive roots graded by their height along $\beta_0$ are
$$\alpha_1 = (1, 0, 0, 0) = A(\beta_1), \qquad \alpha_2 = (0, 0, 1, 0) = A(\beta_2), \qquad \alpha_3 = (0, 0, 0, 1) = A(\beta_3),$$ (3.13)
$$\begin{aligned} \beta_0 &= (0, 1, 0, 0), & \gamma_0 &= (1, 1, 1, 1),\\ \beta_1 &= (1, 1, 0, 0), & \gamma_1 &= (0, 1, 1, 1),\\ \beta_2 &= (0, 1, 1, 0), & \gamma_2 &= (1, 1, 0, 1),\\ \beta_3 &= (0, 1, 0, 1), & \gamma_3 &= (1, 1, 1, 0), \end{aligned}$$ (3.14)
$$\omega = (1, 2, 1, 1) = A(\gamma_0).$$ (3.15)

⁴ For $D_4$, the invariance of the function $(1/|x_0|)\, e^{i x_1 x_2 x_3/x_0}$ under Fourier transform of all 4 variables can be checked explicitly by performing the (delta function) integrals over $x_1, x_2, x_0, x_3$ in that order.
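The root data (3.13)–(3.15) can be cross-checked against the stated action of the Weyl reflection A. The sketch below assumes the standard $D_4$ Cartan matrix written in the coordinate ordering $(\alpha_1, \beta_0, \alpha_2, \alpha_3)$ suggested by (3.13) and (3.14), with $\beta_0$ the central node; this matrix and the helper names are assumptions introduced for the check.

```python
# Assumed Cartan matrix of D4 in the ordering (alpha1, beta0, alpha2, alpha3);
# beta0 is the central node, linked to each of the three outer nodes.
CARTAN = [
    [2, -1, 0, 0],
    [-1, 2, -1, -1],
    [0, -1, 2, 0],
    [0, -1, 0, 2],
]

ALPHA = [(1, 0, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]                  # alpha_1..alpha_3, (3.13)
BETA = [(0, 1, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0), (0, 1, 0, 1)]     # beta_0..beta_3, (3.14)
GAMMA = [(1, 1, 1, 1), (0, 1, 1, 1), (1, 1, 0, 1), (1, 1, 1, 0)]    # gamma_0..gamma_3, (3.14)
OMEGA = (1, 2, 1, 1)                                                # highest root, (3.15)

def inner(a, b):
    """Inner product of two roots given by their coefficients on the simple roots."""
    return sum(a[i] * CARTAN[i][j] * b[j] for i in range(4) for j in range(4))

def reflect_beta0(a):
    """Weyl reflection A with respect to beta_0 (which has norm 2)."""
    n = inner(a, BETA[0])
    return tuple(a[i] - n * BETA[0][i] for i in range(4))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

checks = {
    "A(beta_i) = alpha_i": all(reflect_beta0(BETA[i]) == ALPHA[i - 1] for i in (1, 2, 3)),
    "A(gamma_0) = omega": reflect_beta0(GAMMA[0]) == OMEGA,
    "A fixes gamma_1..3": all(reflect_beta0(GAMMA[i]) == GAMMA[i] for i in (1, 2, 3)),
    "beta_i + gamma_i = omega": all(add(BETA[i], GAMMA[i]) == OMEGA for i in range(4)),
    "all norms equal 2": all(inner(r, r) == 2 for r in ALPHA + BETA + GAMMA + [OMEGA]),
    "12 positive roots": len(set(ALPHA + BETA + GAMMA + [OMEGA])) == 12,
}
for name, ok in checks.items():
    print(name, ok)
```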
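We start with the generators
$$\begin{aligned} E_{\beta_0} &= y\partial_0, & E_{\gamma_0} &= i x_0,\\ E_{\beta_1} &= y\partial_1, & E_{\gamma_1} &= i x_1,\\ E_{\beta_2} &= y\partial_2, & E_{\gamma_2} &= i x_2,\\ E_{\beta_3} &= y\partial_3, & E_{\gamma_3} &= i x_3, \end{aligned}$$ (3.16)
$$E_\omega = iy.$$ (3.17)
The cubic form (3.9) reduces to $I_3 = x_1 x_2 x_3$. The Weyl generator A (acting by conjugation) yields the generators for the remaining simple roots $E_{\alpha_i}$, upon which we act with S to obtain $E_{-\alpha_i}$,
$$\begin{aligned} E_{\alpha_1} &= -x_0\partial_1 - \frac{i x_2 x_3}{y}, & E_{-\alpha_1} &= x_1\partial_0 + i y\,\partial_2\partial_3,\\ E_{\alpha_2} &= -x_0\partial_2 - \frac{i x_3 x_1}{y}, & E_{-\alpha_2} &= x_2\partial_0 + i y\,\partial_3\partial_1,\\ E_{\alpha_3} &= -x_0\partial_3 - \frac{i x_1 x_2}{y}, & E_{-\alpha_3} &= x_3\partial_0 + i y\,\partial_1\partial_2. \end{aligned}$$ (3.18)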
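The generators (3.18) can be tested against the $D_4$ algebra by letting them act on Laurent monomials. The sketch below checks that $[E_{\alpha_1}, E_{-\alpha_1}]$ acts as $-1 - x_0\partial_0 + x_1\partial_1 - x_2\partial_2 - x_3\partial_3$, which is the Cartan generator $H_{\alpha_1}$ of (3.20), and that $[E_{\alpha_1}, E_{\alpha_2}] = 0$, as expected since $\alpha_1 + \alpha_2$ is not a root. The dictionary representation of functions and all helper names are hypothetical choices for the check.

```python
from collections import defaultdict

# A function is a dict {(e_y, e_0, e_1, e_2, e_3): coefficient} of Laurent monomials
# y^e_y x0^e_0 x1^e_1 x2^e_2 x3^e_3. Variable numbers: y = 0, x0 = 1, x1 = 2, x2 = 3, x3 = 4.

def times(f, powers, coeff=1):
    """Multiply f by coeff times the monomial with the given exponent shifts."""
    out = defaultdict(complex)
    for exps, c in f.items():
        out[tuple(e + p for e, p in zip(exps, powers))] += coeff * c
    return dict(out)

def d(f, k):
    """Partial derivative with respect to variable number k."""
    out = defaultdict(complex)
    for exps, c in f.items():
        if exps[k] != 0:
            new = list(exps)
            new[k] -= 1
            out[tuple(new)] += exps[k] * c
    return dict(out)

def add(*terms):
    """Sum of functions, dropping vanishing coefficients."""
    out = defaultdict(complex)
    for f in terms:
        for exps, c in f.items():
            out[exps] += c
    return {e: c for e, c in out.items() if abs(c) > 1e-12}

def scale(f, coeff):
    return {e: coeff * c for e, c in f.items()}

def commutator(A, B, f):
    return add(A(B(f)), scale(B(A(f)), -1))

def E_alpha1(f):         # -x0 d1 - i x2 x3 / y
    return add(times(d(f, 2), (0, 1, 0, 0, 0), -1), times(f, (-1, 0, 0, 1, 1), -1j))

def E_minus_alpha1(f):   # x1 d0 + i y d2 d3
    return add(times(d(f, 1), (0, 0, 1, 0, 0)), times(d(d(f, 4), 3), (1, 0, 0, 0, 0), 1j))

def E_alpha2(f):         # -x0 d2 - i x3 x1 / y
    return add(times(d(f, 3), (0, 1, 0, 0, 0), -1), times(f, (-1, 0, 1, 0, 1), -1j))

def H_alpha1(f):         # -1 - x0 d0 + x1 d1 - x2 d2 - x3 d3 acts diagonally on monomials
    return add(*[scale({e: c}, -1 - e[1] + e[2] - e[3] - e[4]) for e, c in f.items()])

test_function = {(2, 1, 3, 1, 2): 1.0, (-1, 2, 0, 4, 1): 2.0 - 1.0j}
cartan_defect = add(commutator(E_alpha1, E_minus_alpha1, test_function),
                    scale(H_alpha1(test_function), -1))
print(cartan_defect == {}, commutator(E_alpha1, E_alpha2, test_function) == {})
```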
A further application of A on (Eβ0 , E−αi ) yields (E−β0 , −E−βi ) upon which S produces the (E−γ0 , −E−γi ). Penultimately we may act with A on E−γ0 to produce the lowest root E−ω , E−β0 = − x0 ∂ +
ix1 x2 x3 , y2
x1 (1 + x2 ∂2 + x3 ∂3 ) − ix0 ∂2 ∂3 , y = 3i∂0 + iy∂∂0 − y∂1 ∂2 ∂3 + i(x0 ∂0 + x1 ∂1 + x2 ∂2 + x3 ∂3 ) ∂0 , x 2 x3 = iy∂1 ∂ + i(2 + x0 ∂0 + x1 ∂1 )∂1 − ∂0 , y i x 1 x2 x3 = 3i∂ + iy∂ 2 + + ix0 ∂0 ∂ + ∂0 y y2 i + (x1 x2 ∂1 ∂2 + x3 x1 ∂3 ∂1 + x2 x3 ∂2 ∂3 ) y 1 + i(x1 ∂1 + x2 ∂2 + x3 ∂3 )(∂ + ) + x0 ∂1 ∂2 ∂3 , y
E−β1 = x1 ∂ + E−γ0 E−γ1 E−ω
(3.19)
as well as cyclic permutations of (1, 2, 3), denoting ∂ ≡ ∂y . Finally, commutators produce the Cartan generators, Hβ0 Hα1 Hα2 Hα3
= −y∂ + x0 ∂0 , = −1 − x0 ∂0 + x1 ∂1 − x2 ∂2 − x3 ∂3 , = −1 − x0 ∂0 − x1 ∂1 + x2 ∂2 − x3 ∂3 , = −1 − x0 ∂0 − x1 ∂1 − x2 ∂2 + x3 ∂3 ,
(3.20)
where Hα ≡ α · H = [Eα , E−α ] is the Cartan generator along the simple root α. Note that the Cartan generator corresponding to the highest root ω has a simple form, Hω = [Eω , E−ω ] = −3 − 2y∂ − x0 ∂0 − x1 ∂1 − x2 ∂2 − x3 ∂3 ,
(3.21)
and therefore acts by a uniform rescaling on all $x_i$, and a double rescaling on $y$. This agrees with the fact that $H_\omega$ is the grading operator in Eq. (3.1). The form of this expression therefore holds for all groups (save for the non-universal constant term $-3$ above). This is also true of the expression for $H_{\beta_0}$. The generators for the positive and negative simple roots and Cartan elements for all simply-laced groups, computed following the same procedure, are given in the Appendix. All other roots can be obtained by commuting,
\[ [E_\alpha, E_\beta] = \begin{cases} \pm E_{\alpha+\beta} & \text{if } \alpha+\beta \text{ is a root}, \\ 0 & \text{otherwise}. \end{cases} \tag{3.22} \]
Finally, to specify our conventions for positive and negative roots, we record that the quadratic Casimir operator for $D_4$ is
\[ C = \sum_{i,j} H_i\, C^{ij} H_j - \sum_{k} \mathcal{E}_{\alpha_k} + \sum_{l} \big(\mathcal{E}_{\beta_l} + \mathcal{E}_{\gamma_l}\big) - \mathcal{E}_{\beta_0} - \mathcal{E}_{\gamma_0} - \mathcal{E}_{\omega}. \tag{3.23} \]
The same formula may be applied to other groups as well. Here $C^{ij}$ is the inverse Cartan matrix and $\mathcal{E}_\alpha \equiv E_\alpha E_{-\alpha} + E_{-\alpha} E_\alpha$. Evaluated on the minimal representation above, we have $C = -8$, in agreement with irreducibility. Note however that in contrast to ordinary representations, the center of the minimal representation (Joseph's ideal) is much larger: e.g., any quadratic polynomial in the Cartan generators $H_i$ can be supplemented with a linear combination of $\mathcal{E}_\alpha$ operators to make a scalar element.
To summarize, we have obtained a unitary irreducible representation of any simply-laced split group $G$ by quantizing the action of $G$ on its minimal nilpotent orbit (the classical limit can be obtained from our formulae for the generators by replacing the derivative operators $i(\partial, \partial_i)$ by momenta $(p, p_i)$ conjugate to the coordinates $(y, x_i)$ and dropping the "normal ordering" terms; this yields the Hamiltonians for the generators of $G$ on the nilpotent orbit with symplectic form $dp \wedge dy + dp_i \wedge dx_i$). Choosing one of the generators as the Hamiltonian gives a dynamical system with a spectrum generating symmetry $G$. This generalizes the $A_1$ case corresponding to conformally invariant quantum mechanics [26].
4. Spherical Vectors for Dn and E6,7,8 Lie Groups
With the explicit minimal representation for all simply-laced groups at hand, we focus our attention on the spherical vector: a function $f(y, x_i)$ annihilated by all compact generators in $G$. This is our main result and is a central building block for the construction of theta series for all groups.
4.1. From symplectic to orthogonal. One way of obtaining the spherical vector is to solve the differential equations $(E_\alpha \pm E_{-\alpha}) f = 0$ for all roots $\alpha$ (the sign is chosen so that the generator is compact). It is sufficient to solve these equations for $\alpha$ a simple root only, since all other equations can be obtained by commutation.
This still sounds like a formidable task, even though we shall in fact be able to carry it out later on for exceptional groups. For now however, we would like to take an alternate approach, well suited to orthogonal groups. The spherical vector we shall obtain will turn out to generalize quite simply to exceptional groups as well. The main observation is that there is a maximal embedding of SO(n, n, R)×Sl(2, R) in Sp(2n, R). The minimal representation of Sp(2n, R) has dimension 2n, and is also a
representation of SO(n, n, R), albeit reducible. By considering functions invariant under Sl(2, R) however, we can reduce it to a 2n − 3 dimensional representation, which is the dimension of the minimal representation. In this way we thus obtain a representation equivalent to the one described in Sect. 3. In order to obtain the spherical vector in that representation, we just need to integrate over the second factor in the decomposition
\[ \frac{SO(n,n,\mathbb{R})}{SO(n)\times SO(n)} \times \frac{Sl(2,\mathbb{R})}{U(1)} \subset \frac{Sp(2n,\mathbb{R})}{U(2n,\mathbb{R})} \tag{4.1} \]
to get a function on the first space. This procedure is familiar to string theorists since it gives precisely the one-loop result for half-BPS amplitudes. Indeed, the partition function of the worldsheet winding modes on a torus $T^n$ is a theta series for the symplectic group Sp(2n, R), restricted to the subspace (4.1) of the moduli space. It can be written in a form which makes the modular symmetry Sl(2, Z) manifest,
\[ \theta_{Sp} = V \sum_{m^i, n^i} \exp\Big( -\pi\, \frac{(m^i + n^i \tau)\, g_{ij}\, (m^j + n^j \bar\tau)}{\tau_2} + 2\pi i\, m^i B_{ij} n^j \Big), \tag{4.2} \]
where we recognize a sum weighted by the Polyakov action for classical toroidal strings winding around $T^n$ with volume $V = \sqrt{\det g_{ij}}$ via $X^i(\sigma_1, \sigma_2) = m^i \sigma_1 + n^i \sigma_2$; or with manifest SO(n, n, Z) target space symmetry,
\[ \theta_{Sp} = \tau_2^{n/2} \sum_{m_i, n^i} \exp\Big( -\pi \tau_2 \big[ (m_i + B_{ik} n^k)\, g^{ij}\, (m_j + B_{jl} n^l) + n^i g_{ij} n^j \big] + 2\pi i \tau_1\, m_i n^i \Big). \tag{4.3} \]
In this form, we recognize the contribution of states with momentum $m_i$ and winding $n^i$ in the Schwinger representation, with a BPS constraint $m_i n^i = 0$. The two representations are related by Poisson resummation over all Kaluza–Klein modes $m^i \leftrightarrow m_i$. The one-loop amplitude is obtained by integrating this theta series over the fundamental domain $\mathcal{F}$ of the upper half-plane U(1)\Sl(2) parameterized by the worldsheet modulus $\tau$:
\[ \theta_{SO(n,n)}(g_{ij}, B_{ij}) = 2\pi \int_{\mathcal{F}} \frac{d^2\tau}{\tau_2^2}\, \theta_{Sp}(\tau, \bar\tau; g_{ij}, B_{ij}). \tag{4.4} \]
The result is an automorphic form under the T-duality group SO(n, n, Z). Its expansion at large volume using the methods described, e.g., in [5], reads
\[ \theta_{SO(n,n)} = \frac{2\pi^2}{3}\, V + 2V \sum_{m^i \neq 0} \frac{1}{m^i g_{ij} m^j} + 4\pi V \sum_{(m^i, n^i)/Sl(2)} \frac{e^{-2\pi \sqrt{(m^{ij})^2} + 2\pi i\, m^{ij} B_{ij}}}{\sqrt{(m^{ij})^2}}, \tag{4.5} \]
and exhibits a sum of power-suppressed contributions, together with worldsheet instantons. The double sum runs over integer vectors $(m^i, n^i)$ modulo the linear action of
Sl(2, Z). The worldsheet instantons however depend only on the Sl(2) invariant combination $m^{ij} = m^i n^j - m^j n^i$ (with $(m^{ij})^2 \equiv \frac{1}{2!}\, m^{ij} g_{ik} g_{jl} m^{kl}$), so they can be rewritten as
\[ \sum_{m^{ij},\ \mathrm{rank}\ 2} \mu(m^{ij})\, \frac{e^{-2\pi \sqrt{(m^{ij})^2} + 2\pi i\, m^{ij} B_{ij}}}{\sqrt{(m^{ij})^2}}, \tag{4.6} \]
where the measure factor $\mu(m^{ij}) = \sum_{n | m^{ij}} n$ accounts for the Jacobian factor between variables $(m^i, n^i)$ and $m^{ij}$. We thus have a representation of SO(n, n, R) on a space of rank 2 antisymmetric matrices $m^{ij}$. The dimension of this space is precisely 2n − 3 and ought therefore to be equivalent to the minimal representation described in Sect. 3. We can also read off the real spherical vector immediately by going to the origin ($g_{ij} = \delta_{ij}$, $B_{ij} = 0$) of the moduli space,
\[ f_{D_n} = \frac{e^{-2\pi \sqrt{(m^{ij})^2}}}{\sqrt{(m^{ij})^2}}. \tag{4.7} \]
The p-adic spherical vector can also be extracted from the summation measure $\mu(m^{ij})$ in the same way as in (2.33), and reads
\[ f_p = \gamma_p(m^{ij})\, \frac{1 - p\,|m^{ij}|_p}{1 - p}. \tag{4.8} \]
D4 spherical vector in the standard minimal representation. Having found the spherical vector in this "string inspired" representation, we now would like to map it to the standard minimal representation, with the aim of generalizing it to exceptional groups. For this we need to find the linear operator that intertwines between the two representations and let it act on the spherical vector (4.7). For simplicity, we will describe the SO(4, 4) case only, since the method generalizes easily to higher n. In this case, the constraint that $m^{ij}$ of the "string-inspired" representation has rank 2 is
\[ \epsilon_{ijkl}\, m^{ij} m^{kl} = 0, \tag{4.9} \]
which describes a quadratic cone in $\mathbb{R}^6$. Firstly, consider the operator acting by multiplication by $m^{ij}$ (corresponding to shifting $B_{ij}$ by a constant) in the representation (4.5). These shifts make a 6-dimensional Abelian subalgebra of the Borel subgroup of SO(4, 4) (i.e., the group generated by the positive roots). We can identify six commuting generators by choosing those for roots with height one in the direction of, for example, $\alpha_3$, namely $(E_{\alpha_3}, E_{\beta_3}, E_{\gamma_1}, E_{\gamma_2}, E_{\gamma_0}, E_{\omega})$. Since these operators commute, we can diagonalize them simultaneously and we call their eigenvalues $i(m_{43}, m_{24}, m_{14}, m_{23}, m_{13}, m_{12})$. Using the expressions in (3.16)–(3.19) for the generators, we find a common eigenstate
\[ \psi_{m^{ij}} = \delta(y - m_{12})\, \delta(x_0 - m_{13})\, \delta(x_1 - m_{14})\, \delta(x_2 - m_{23})\, e^{i m_{24} x_3 / m_{12}}, \tag{4.10} \]
but only if the eigenvalues are related by
\[ m_{43} = -\frac{m_{14} m_{23}}{m_{12}} - \frac{m_{13} m_{24}}{m_{12}}. \tag{4.11} \]
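This is the same as (4.9), providing the rationale for our identification. Therefore the two representations are intertwined by Fourier transformation in a single variable $x_3$,
\[ \tilde f(m^{ij}) = \int dy\, dx_0\, d^3x\ \psi_{m^{ij}}(y, x_0, x)\, f(y, x_0, x) = \int dx_3\, \exp(i m_{24} x_3 / m_{12})\, f(m_{12}, m_{13}, m_{14}, m_{23}, x_3), \tag{4.12} \]
where $x$ stands for $(x_1, x_2, x_3)$. Conversely, we have
\[ \begin{aligned}
f(y, x_0, x) &= \int \frac{dm_{24}}{y}\ e^{-2\pi i\, m_{24} x_3 / y}\ \tilde f\Big(y, x_0, x_1, x_2, m_{24}, -\frac{x_1 x_2 + x_0 m_{24}}{y}\Big) \\
&= \int dm_{24}\, dm_{43}\ e^{-2\pi i\, m_{24} x_3 / y}\ \delta(x_1 x_2 + x_0 m_{24} + y\, m_{43})\ \tilde f(y, x_0, x_1, x_2, m_{24}, m_{43}),
\end{aligned} \tag{4.13} \]
where $\tilde f(m^{ij}) \equiv \tilde f(m_{12}, m_{13}, m_{14}, m_{23}, m_{24}, m_{43})$. To see how the kernel (4.7) translates into the standard minimal representation we must compute the Fourier transform (4.13). For that purpose, it is convenient to take the integral representation
\[ \tilde f = \frac{e^{-2\pi \sqrt{(m^{ij})^2}}}{\sqrt{(m^{ij})^2}} = \int_0^{+\infty} \frac{dt}{t^{1/2}}\, \exp\Big( -\frac{\pi}{t} - \pi t\, (m^{ij})^2 \Big), \tag{4.14} \]
along with the standard one for the Dirac delta function of the constraint. Hence, the action of the intertwining operator on the string-inspired spherical vector may be written as
\[ f = \int_0^{+\infty} \frac{dt}{t^{1/2}} \int_{-\infty}^{\infty} d\theta\, dm_{24}\, dm_{43}\ \exp\Big( -\frac{\pi}{t} - \pi t\, (m^{ij})^2 - 2\pi i\, \theta\, (x_1 x_2 + x_0 m_{24} + y\, m_{43}) - 2\pi i\, \frac{m_{24} x_3}{y} \Big) \Big|_{m_{12} = y,\ m_{13} = x_0,\ m_{14} = x_1,\ m_{23} = x_2}. \tag{4.15} \]
The integrals over $m_{24}$, $m_{43}$ are Gaussian and yield
\[ f = \int_{-\infty}^{+\infty} d\theta \int_0^{+\infty} \frac{dt}{t^{3/2}}\ \exp\Big( -\pi t\, (y^2 + x_0^2 + x_1^2 + x_2^2) - \frac{\pi}{t} \Big[ 1 + (\theta y)^2 + \Big(\theta x_0 - \frac{x_3}{y}\Big)^2 \Big] - 2\pi i\, \theta\, x_1 x_2 \Big). \tag{4.16} \]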
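As a numerical sanity check of the integral representation (4.14), the following Python sketch compares a trapezoidal estimate of $\int_0^\infty dt\, t^{-1/2} \exp(-\pi/t - \pi t M^2)$ with the closed form $e^{-2\pi M}/M$, where $M$ stands for $\sqrt{(m^{ij})^2}$. The function names, the substitution $t = e^u$ and the integration window are illustrative assumptions and are not taken from the derivation above.

```python
import math

def schwinger_integral(M, n_steps=20000, u_min=-12.0, u_max=12.0):
    """Trapezoidal estimate of the t-integral in (4.14), using t = exp(u).

    With t = e^u one has dt / t^{1/2} = e^{u/2} du, and the integrand decays
    doubly exponentially at both ends of the (assumed) window [u_min, u_max].
    """
    h = (u_max - u_min) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        u = u_min + k * h
        t = math.exp(u)
        value = math.exp(0.5 * u - math.pi / t - math.pi * t * M * M)
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * value
    return total * h

def closed_form(M):
    """Left-hand side of (4.14): exp(-2 pi M) / M."""
    return math.exp(-2.0 * math.pi * M) / M

if __name__ == "__main__":
    for M in (0.3, 1.0, 2.0):
        print(M, schwinger_integral(M), closed_form(M))
```

The logarithmic variable is used only because it makes the plain trapezoidal rule converge very quickly for this integrand; any other quadrature would serve equally well.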
The integral over $\theta$ is again Gaussian, and the $t$ integral is of Bessel type so all integrals can be computed explicitly. The saddle point yields a classical action
\[ S = 2\pi\, \frac{\sqrt{(y^2 + x_0^2 + x_1^2)(y^2 + x_0^2 + x_2^2)(y^2 + x_0^2 + x_3^2)}}{y^2 + x_0^2} - 2\pi i\, \frac{x_0 x_1 x_2 x_3}{y\,(y^2 + x_0^2)}. \tag{4.17} \]
Taking into account the measure factor, we find that in the standard representation, the kernel becomes (rescaling all variables $(y, x_0, x_1, x_2, x_3)$ by $1/(2\pi)$)
\[ f_{D_4} = \frac{4\pi}{\sqrt{y^2 + x_0^2}}\ K_0\!\left( \frac{\sqrt{(y^2 + x_0^2 + x_1^2)(y^2 + x_0^2 + x_2^2)(y^2 + x_0^2 + x_3^2)}}{y^2 + x_0^2} \right) e^{-i \frac{x_0 x_1 x_2 x_3}{y (y^2 + x_0^2)}}. \tag{4.18} \]
This expression is the prototype of the spherical vectors that we will obtain later on, and therefore deserves several comments:
(i) It is invariant under permutations of $(x_1, x_2, x_3)$, i.e., under SO(4, 4) triality, which was manifest in the standard representation but not at all in the string-inspired one. In fact, on the basis of Heterotic/type II duality, it was found that the one-loop string amplitude (4.4) for n = 4 would have to be invariant under triality [21] (see also [22] for a related observation). Therefore our triality invariant result gives strong support to non-perturbative Heterotic/type II duality.
(ii) The spherical vector (4.18) could also have been derived by solving the differential equations for K-invariance. As we will show for exceptional cases later, the system of PDEs reduces to a single differential equation for a single function of the variable $S_1$,
\[ S_1 = \frac{\sqrt{(y^2 + x_0^2 + x_1^2)(y^2 + x_0^2 + x_2^2)(y^2 + x_0^2 + x_3^2)}}{y^2 + x_0^2}. \tag{4.19} \]
The equation is a linear second order differential equation of Bessel type, for which (4.18) is the only solution with exponential decrease at infinity (i.e., $S_1 \to \infty$). The same phenomenon will also hold for exceptional groups, except that the variable $S_1$ will be a more complicated function of the coordinates $(y, x_i)$ (but reducing to the same form (4.19) for particular configurations of the variables $x_i$), and the order of the Bessel function will be different.
(iii) The phase $\exp(-iS_2)$, where $S_2$ is the imaginary part of the classical action
\[ S_2 = \frac{x_0 I_3}{y\,(y^2 + x_0^2)}, \tag{4.20} \]
is precisely such that the spherical vector is invariant under the Weyl generator $A$ in (3.11). Indeed, this follows from the trivial identity
\[ \frac{I_3}{x_0\, y} - \frac{y\, I_3}{x_0\,(x_0^2 + y^2)} = \frac{x_0\, I_3}{y\,(x_0^2 + y^2)}. \tag{4.21} \]
Defining
\[ f(y, x_i) \equiv g(y, x_i)\ e^{-i \frac{x_0 I_3}{y (y^2 + x_0^2)}}, \tag{4.22} \]
we see that the invariance under $A$ requires $g$ to be symmetric under $(y, x_0) \to (-x_0, y)$. In fact, the invariance under the compact generator $E_{\beta_0} + E_{-\beta_0}$ requires $g$ to depend on $(y, x_0)$ through $y^2 + x_0^2$ only, since it acts on the function $g(y, x_i)$ as the rotation operator $y\partial_0 - x_0\partial$. This will hold for all simply-laced groups.
(iv) In the limit $(y, x_0) \to 0$, due to the asymptotic behavior $K_s(y \to \infty) \sim \sqrt{\pi/(2y)}\ e^{-y}$, the spherical vector takes a much simpler form
\[ f_{D_4} \sim \frac{1}{\sqrt{|x_1 x_2 x_3|}}\ e^{-\frac{|x_1 x_2 x_3|}{y z}} \quad \text{or} \quad \frac{1}{\sqrt{|x_1 x_2 x_3|}}\ e^{-\frac{|x_1 x_2 x_3|}{y \bar z}} \tag{4.23} \]
depending on the sign of $x_1 x_2 x_3$, where $z = y + i x_0$. We recognize the same kernel as in the definition of the Weyl generator $A$ in (3.11). The spherical vector (4.18) can therefore be thought of as a Fourier-invariant non-linear (physicists would say "Born–Infeld") completion of the Fourier invariant kernel in (3.11).
(v) The result for the spherical vector (4.18) may be rewritten more compactly in terms of the Euclidean norm $\|(x_1, x_2, \ldots)\| \equiv \sqrt{x_1^2 + x_2^2 + \cdots}$ as
\[ f = \frac{1}{R}\ K_0\Big( \Big\| \Big(X, \nabla_X \frac{I_3}{R}\Big) \Big\| \Big)\ \exp\Big(-i\, \frac{x_0 I_3}{y R^2}\Big), \tag{4.24} \]
where $R = \|(y, x_0)\|$, $X \equiv (y, x_0, x_1, x_2, x_3)$ and $\nabla_X (I_3/R)$ denotes the gradient of $I_3/R$ with respect to the $X$ coordinates.
(vi) The p-adic spherical vector in the string inspired representation can be read off from the large volume expansion (4.5),
\[ f_p(m^{ij},\ m \wedge m = 0) = \prod_{i<j} \gamma_p(m^{ij})\ \frac{1 - p\,\|m^{ij}\|_p}{1 - p}. \tag{4.25} \]
The corresponding spherical vector in the triality invariant representation can be obtained via intertwining by p-adic Fourier transform. We leave the details of this computation to [13], and simply mention that it takes the same form as (4.24), upon replacing the Euclidean norm with the p-adic norm $\|(x_1, x_2, \ldots)\|_p = \max(|x_1|_p, |x_2|_p, \ldots)$, and $K_0$ by a simple algebraic function.
Dn spherical vector in the standard minimal representation. Before moving on to exceptional groups, let us note that the same manipulation can be performed for higher SO(n, n) groups. In the string representation there are n(n − 1)/2 variables $m^{ij}$ subject to constraints
\[ \epsilon_{i_1 \ldots i_{n-4}\, ijkl}\, m^{ij} m^{kl} = 0 \quad \Leftrightarrow \quad m^{[ij} m^{kl]} = 0. \tag{4.26} \]
Of these, only (n − 2)(n − 3)/2, namely $m^{1[2} m^{kl]} = 0$ (say), are independent, so the dimension of the minimal representation is
\[ \frac{n(n-1)}{2} - \frac{(n-2)(n-3)}{2} = 2n - 3, \tag{4.27} \]
as given in Table 2. The intertwining operator is a Fourier transform on n − 3 variables, and its action on the spherical vector (4.7) can be computed using the same manipulations as before. We quote
\[ f_{D_n} = \left( \frac{y^2 + x_0^2 + x_1^2}{(y^2 + x_0^2)^2 + (y^2 + x_0^2) P + Q^2} \right)^{\frac{n-4}{4}} \frac{K_{\frac{n-4}{2}}(S_1)\ e^{-i S_2}}{\sqrt{y^2 + x_0^2}}, \tag{4.28} \]
where
\[ S_1 = \frac{\sqrt{(y^2 + x_0^2)^3 + (y^2 + x_0^2)^2 I_2 + (y^2 + x_0^2)(I_2^2 - I_4)/2 + (I_3)^2}}{y^2 + x_0^2}, \tag{4.29} \]
\[ S_2 = \frac{x_0 I_3}{y\,(y^2 + x_0^2)}, \tag{4.30} \]
and
\[ P = \sum_{j=2}^{2n-5} x_j^2, \quad Q = \sum_{i=1}^{n-3} (-)^{i+1} x_{2i}\, x_{2i+1}, \quad I_2 = x_1^2 + P, \quad I_3 = x_1 Q, \quad I_4 = x_1^4 + P^2 - 2Q^2. \tag{4.31} \]
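The formulae (4.27)–(4.31) can be cross-checked mechanically. The sketch below, in exact rational arithmetic, tests two statements: that the radicand of $S_1$ in (4.29), written with $r = y^2 + x_0^2$, factorizes as $(r + x_1^2)(r^2 + rP + Q^2)$, which is what makes the bracket in (4.28) natural, and that the dimension count (4.27) gives 2n − 3. The helper names and the random test points are assumptions made for illustration.

```python
from fractions import Fraction
import random

def dn_invariants(x1, xs):
    """Return (P, Q, I2, I3, I4) of (4.31); xs holds x_2, ..., x_{2n-5}."""
    P = sum(x * x for x in xs)
    pairs = len(xs) // 2  # equals n - 3
    # Q = sum_i (-1)^(i+1) x_{2i} x_{2i+1}; xs[0] is x_2, so x_{2i} = xs[2i-2]
    Q = sum((-1) ** (i + 1) * xs[2 * i - 2] * xs[2 * i - 1]
            for i in range(1, pairs + 1))
    I2 = x1 * x1 + P
    I3 = x1 * Q
    I4 = x1 ** 4 + P * P - 2 * Q * Q
    return P, Q, I2, I3, I4

def radicand_expanded(r, I2, I3, I4):
    """Numerator under the square root in (4.29), with r = y^2 + x0^2."""
    return r ** 3 + r ** 2 * I2 + r * (I2 * I2 - I4) / 2 + I3 * I3

def radicand_factored(r, x1, P, Q):
    """Candidate factorized form (r + x1^2)(r^2 + r P + Q^2)."""
    return (r + x1 * x1) * (r * r + r * P + Q * Q)

def check_dn(n, rng):
    """Compare both forms at a random rational point for SO(n, n)."""
    x1 = Fraction(rng.randint(-9, 9))
    xs = [Fraction(rng.randint(-9, 9)) for _ in range(2 * n - 6)]
    r = Fraction(rng.randint(1, 9))
    P, Q, I2, I3, I4 = dn_invariants(x1, xs)
    return radicand_expanded(r, I2, I3, I4) == radicand_factored(r, x1, P, Q)

def minimal_dimension(n):
    """Left-hand side of (4.27)."""
    return n * (n - 1) // 2 - (n - 2) * (n - 3) // 2

if __name__ == "__main__":
    rng = random.Random(0)
    for n in range(4, 9):
        print(n, check_dn(n, rng), minimal_dimension(n))
```

For n = 4 the second factor becomes $(r + x_2^2)(r + x_3^2)$, so the check also covers the reduction to the $D_4$ form (4.19).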
In contrast to the n = 4 case, for n > 4 the spherical vector must be invariant under the maximal compact subgroup $K_0 = SO(n-3) \times SO(n-3)$ of the linearly realized $H_0 = SO(n-3, n-3, \mathbb{R})$. This is indeed the case for our result, since $P$ and $Q$ are the $K_0$-invariant square norms of the SO(n − 3, n − 3)-vector $(x_2, \ldots, x_{2n-5})$ ($Q$ is even $H_0$ invariant). Using this symmetry we can choose all $x_{i>3}$ to vanish. In this case the classical action $S = S_1 + iS_2$ reduces to the $D_4$ case (4.19), (4.20). As for $D_4$, we can express the $D_n$ spherical vector more compactly as
\[ f_{D_n} = \frac{1}{R} \left( \frac{\|(y, x_0, x_1)\|}{R} \right)^{n-4} \mathcal{K}_{\frac{n-4}{2}}\Big( \Big\| \Big(X, \nabla_X \frac{I_3}{R}\Big) \Big\| \Big)\ \exp\Big(-i\, \frac{x_0 I_3}{y R^2}\Big). \tag{4.32} \]
Here $\mathcal{K}_t(x) \equiv x^{-t} K_t(x)$, $X \equiv (y, x_0, x_1, \ldots, x_{2n-5})$ and $R \equiv \|(y, x_0)\|$. The form (4.29) of the argument of the Bessel function in terms of the three $K_0$-invariants $I_2$, $I_3$, $I_4$ will apply to the exceptional groups, as will the overall form (4.32). As in the $D_4$ case, the p-adic spherical vector in the string inspired representation can be read off from the large volume expansion of the symplectic theta series. The spherical vector in the "standard" minimal representation could therefore be obtained by p-adic Fourier transform.
4.2. E6. In the case of exceptional groups, we unfortunately do not have a string-inspired representation which we could use to obtain the spherical vector. In fact, it is the other way around, since we are aiming at a "membrane-inspired" representation for exceptional theta series! Our only remaining line of attack is therefore to find an explicit solution of the differential equations $(E_\alpha \pm E_{-\alpha}) f = 0$ determining the spherical vector. For this, let us recall that (i) once the phase factor in (4.22) is factored out, the dependence of $f$ on $(y, x_0)$ is through $(y^2 + x_0^2)$ only, and (ii) the spherical vector has to be invariant under the maximal compact subgroup $K_0$ of $H_0$, which is linearly realized on $(x_1, \ldots, x_d)$. Our first task, therefore, is to determine the invariants of $(x_1, \ldots, x_d)$ under $K_0$. In the $E_6$ case (see Table 2), the variables $(x_1, \ldots, x_9)$ transform in a (3, 3) representation of $H_0 = Sl(3) \times Sl(3)$. Using the $K_0$ transformations implied by the explicit expressions for the roots given in the Appendix, we can assign the 9 variables to a 3 × 3 matrix
\[ Z = \begin{pmatrix} x_1 & x_3 & x_6 \\ x_2 & x_5 & x_9 \\ x_4 & x_7 & x_8 \end{pmatrix}, \tag{4.33} \]
on which $Sl(3) \times Sl(3)$ act linearly by left and right multiplication respectively. An independent set of invariants under the maximal compact subgroup $K_0 = SO(3) \times SO(3)$ is given by the quadratic, cubic and quartic combinations
\[ I_2 = \mathrm{Tr}(Z^t Z), \quad I_3 = -\det(Z), \quad I_4 = \mathrm{Tr}(Z^t Z Z^t Z). \tag{4.34} \]
In fact, $I_3$ is our familiar cubic form, invariant under the whole of $H_0$ rather than only its maximal compact subgroup. Note also that higher traces are algebraically related to the ones above. Now, given that the spherical vector has to be invariant under $K_0$, we can work in a frame where $Z$ is diagonal, keeping $(x_1, x_5, x_8)$ as the only non-vanishing entries. The invariants then reduce to
\[ I_2 = x_1^2 + x_5^2 + x_8^2, \quad I_3 = -x_1 x_5 x_8, \quad I_4 = x_1^4 + x_5^4 + x_8^4. \tag{4.35} \]
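The invariants (4.34) are, up to normalization, the coefficients of the characteristic polynomial of $Z Z^t$, which is what allows the universal radicand $r^3 + r^2 I_2 + r (I_2^2 - I_4)/2 + I_3^2$ (with $r = y^2 + x_0^2$) to be written as a single determinant $\det(Z Z^t + r\,\mathbb{1})$. The following sketch checks this identity in exact arithmetic for random 3 × 3 matrices; the helper names and the sampling range are assumptions for illustration only.

```python
from fractions import Fraction
import random

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det3(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def e6_invariants(Z):
    """I2, I3, I4 as defined in (4.34)."""
    ZtZ = matmul(transpose(Z), Z)
    return trace(ZtZ), -det3(Z), trace(matmul(ZtZ, ZtZ))

def radicand_from_det(Z, r):
    """det(Z Z^t + r * identity)."""
    ZZt = matmul(Z, transpose(Z))
    M = [[ZZt[i][j] + (r if i == j else 0) for j in range(3)] for i in range(3)]
    return det3(M)

def radicand_from_invariants(Z, r):
    """r^3 + r^2 I2 + r (I2^2 - I4)/2 + I3^2."""
    I2, I3, I4 = e6_invariants(Z)
    return r ** 3 + r ** 2 * I2 + r * (I2 * I2 - I4) / 2 + I3 * I3

if __name__ == "__main__":
    rng = random.Random(1)
    Z = [[Fraction(rng.randint(-5, 5)) for _ in range(3)] for _ in range(3)]
    r = Fraction(7, 3)
    print(radicand_from_det(Z, r) == radicand_from_invariants(Z, r))
```

For a diagonal matrix such as diag(2, 3, 5) the invariants evaluate to (38, −30, 722), in agreement with (4.35).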
Let us now consider the equation $(E_{\beta_1} - E_{-\beta_1}) f = 0$. The negative root generator $E_{-\beta_1}$ can be obtained by commuting the negative roots given in the Appendix, and reads
\[ E_{-\beta_1} = x_1 \partial + i x_0\, (\partial_5 \partial_8 - \partial_7 \partial_9) + \frac{2}{y}\, x_1 + \frac{1}{y} \Big[ x_1 (x_5 \partial_5 + x_7 \partial_7 + x_8 \partial_8 + x_9 \partial_9) - x_2 x_3 \partial_5 - x_2 x_6 \partial_9 - x_3 x_4 \partial_7 - x_4 x_6 \partial_8 \Big]. \tag{4.36} \]
Using the ansatz (4.22) and setting all $x_{i=0,\ldots,9}$ but $(x_1, x_5, x_8)$ to zero at the end, we get a first order differential equation
\[ x_1 (x_5 \partial_5 + x_8 \partial_8 - 2)\, g + y^2 (\partial_1 + 2 x_1 \partial_y)\, g = 0, \tag{4.37} \]
which is solved by $g(y^2, x_1, x_5, x_8) = \frac{1}{y^2}\, h\big(y^2 + x_1^2, \frac{x_5}{y}, \frac{x_8}{y}\big)$. Demanding invariance under the compact generators of $\beta_5$ and $\beta_8$ requires the same equation to hold for permutations of $x_1, x_5, x_8$, so the only possibility is
\[ g(y^2, x_1, x_5, x_8) = \frac{1}{y^2}\ h\!\left( \frac{\sqrt{(y^2 + x_1^2)(y^2 + x_5^2)(y^2 + x_8^2)}}{y^2} \right). \tag{4.38} \]
The argument of $h$ is easily recognizable as the universal form $S_1$ in (4.19) and we can restore the dependence on all variables using $K_0$ invariance; first in our particular frame we write
\[ S_1 = \frac{\sqrt{(y^2 + x_1^2)(y^2 + x_5^2)(y^2 + x_8^2)}}{y^2} = \frac{\sqrt{y^6 + y^4 (x_1^2 + x_5^2 + x_8^2) + y^2 (x_1^2 x_5^2 + x_5^2 x_8^2 + x_8^2 x_1^2) + (x_1 x_5 x_8)^2}}{y^2}. \tag{4.39} \]
Then using the relations (4.35) for the invariants $I_2$, $I_3$ and $I_4$ (in particular $x_1^2 x_5^2 + x_5^2 x_8^2 + x_8^2 x_1^2 = (I_2^2 - I_4)/2$), and recalling that the dependence on $(y, x_0)$ is through the norm $y^2 + x_0^2$, we find that $h$ has to depend on the coordinates $(y, x_0, \ldots, x_9)$ through the combination
\[ S_1 = \frac{\sqrt{(y^2 + x_0^2)^3 + (y^2 + x_0^2)^2 I_2 + (y^2 + x_0^2)(I_2^2 - I_4)/2 + I_3^2}}{y^2 + x_0^2}. \tag{4.40} \]
In fact, this expression can be rewritten in a much more concise way as
\[ S_1 = \frac{\sqrt{\det(Z Z^t + |z|^2 I_3)}}{|z|^2}, \tag{4.41} \]
where $z = y + i x_0$ and $I_3$ in (4.41) denotes the 3 × 3 identity matrix (not the cubic invariant). This expression is manifestly invariant under $SO(3) \times SO(3) \times SO(2) \subset K$. Finally, the $(E_{\alpha_3} - E_{-\alpha_3}) f = 0$ equation reduces to
\[ h'' + \frac{2}{S_1}\, h' - h = 0. \tag{4.42} \]
This is a Bessel-type equation, solved by $h = K_{1/2}(S_1)/\sqrt{S_1} \propto e^{-S_1}/S_1$. Altogether, we have thus found an explicit expression for the $E_6$ spherical vector in the minimal representation,
\[ f_{E_6} = \frac{K_{1/2}(S_1)}{(y^2 + x_0^2)\sqrt{S_1}}\ e^{-i \frac{x_0 I_3}{y (x_0^2 + y^2)}} \propto \frac{e^{-S_1 - i \frac{x_0 I_3}{y (x_0^2 + y^2)}}}{(y^2 + x_0^2)\, S_1}. \tag{4.43} \]
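The claim that $h = e^{-S_1}/S_1$ solves the Bessel-type equation (4.42) can be confirmed numerically. The sketch below evaluates the residual $h'' + (2/S)\,h' - h$ with central finite differences and also checks the proportionality to $K_{1/2}(S)/\sqrt{S}$ through the closed form $K_{1/2}(S) = \sqrt{\pi/(2S)}\, e^{-S}$. The step size, sample points and function names are illustrative assumptions.

```python
import math

def h_e6(S):
    """Candidate solution of (4.42): h(S) = exp(-S) / S."""
    return math.exp(-S) / S

def k_half(S):
    """Closed form of the modified Bessel function K_{1/2}(S)."""
    return math.sqrt(math.pi / (2.0 * S)) * math.exp(-S)

def ode_residual(h, S, coeff, d=1e-3):
    """Central-difference estimate of h'' + (coeff / S) h' - h at the point S."""
    h_plus, h_0, h_minus = h(S + d), h(S), h(S - d)
    first = (h_plus - h_minus) / (2.0 * d)
    second = (h_plus - 2.0 * h_0 + h_minus) / (d * d)
    return second + coeff / S * first - h_0

if __name__ == "__main__":
    for S in (1.0, 2.5, 6.0):
        print(S, ode_residual(h_e6, S, 2.0) / h_e6(S))
```

The printed ratios are limited by the finite-difference truncation error rather than by the solution itself, so they should be small compared to one but are not expected to vanish exactly.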
As in the $D_4$ case, this expression simplifies greatly in the limit $|z| \to 0$,
\[ f_{E_6} \sim \frac{e^{-\frac{|\det Z|}{y z}}}{|\det Z|} \tag{4.44} \]
or its complex conjugate, depending on the sign of det(Z). While this spherical vector has been constructed in the standard representation presented in Sect. 3, other choices of polarization can be relevant in certain applications, and yield different expressions for the spherical vector. In the E6 case, there is another interesting polarization, where the linearly realized group is Sl(5) rather than Sl(3) × Sl(3). By looking at the root lattice displayed in the Appendix, it is easy to see that this representation can be reached by performing a Fourier transform on the coordinates (x6 , x9 , x8 ). This breaks one of the Sl(3) factors, but the other unbroken factor gets enlarged to an Sl(5), under which the 10 new coordinates transform as an antisymmetric matrix 0 −p8 p9 x1 x3 0 −p6 x2 x5 0 x 4 x7 , X= (4.45) a/s 0 x0 0 where (p6 , p8 , p9 ) are the momenta conjugate to (x6 , x8 , x9 ). The spherical vector in this representation is obtained by Fourier transform on (x6 , x8 , x9 ), dx6 dx8 dx9 f˜E6 = fE6 e2i(p6 x6 +p8 x8 +p9 x9 )/y . (4.46) y 3/2 Remarkably, the integral can still be computed by the same method as in Sect. 4.1, and yields the simple result 1 1 f˜E6 = √ (4.47) K1 J4 , y yJ4 where J4 is the polynomial of order 4, y2 1 (4.48) Tr(X 2 ) + (TrX2 )2 − 2TrX 4 , 2 8 manifestly invariant under the maximal compact subgroup SO(5) ⊂ Sl(5). Note that x0 is now unified with the other xi coordinates, and that the phase has disappeared. J4 = y 4 −
4.3. E7. The same strategy presented for $E_6$ above yields the $E_7$ and $E_8$ spherical vectors. In the case of $E_7$, the minimal representation has dimension 17, and is realized on a space of functions of $(y, x_0, \ldots, x_{15})$. The linearly realized subgroup is $H_0 = Sl(6, \mathbb{R})$, with maximal compact subgroup $K_0 = SO(6, \mathbb{R})$. The coordinates $(x_1, \ldots, x_{15})$ transform in the 15-dimensional antisymmetric representation of $H_0$, which restricts to the adjoint representation of $K_0$. Using the explicit expression for the roots given in the Appendix, we can fit them into an antisymmetric matrix
\[ Z = \begin{pmatrix}
0 & -x_1 & x_2 & -x_4 & -x_6 & x_9 \\
 & 0 & x_3 & -x_5 & -x_8 & x_{12} \\
 & & 0 & x_7 & x_{11} & -x_{15} \\
 & & & 0 & -x_{14} & x_{13} \\
 & \text{a/s} & & & 0 & x_{10} \\
 & & & & & 0
\end{pmatrix}. \tag{4.49} \]
The independent invariants of $Z$ under the adjoint action of $SO(6, \mathbb{R})$ are the three Casimir operators of $SO(6) \sim Sl(4)$, i.e.
\[ I_2 = -\frac{1}{2} \mathrm{Tr}(Z^2), \quad I_3 = -\mathrm{Pf}\, Z, \quad I_4 = \frac{1}{2} \mathrm{Tr}(Z^4). \tag{4.50} \]
As for $E_6$, $I_3$ is in fact invariant under the full $H_0$, and is the cubic form that enters the expression of the Weyl generator $A$ in (3.11). Using the action of $K_0$, we can skew-diagonalize $Z$, and set all coordinates but $x_1, x_7, x_{10}$ to zero. The invariants then reduce to the simple symmetric combinations
\[ I_2 = x_1^2 + x_7^2 + x_{10}^2, \quad I_3 = -x_1 x_7 x_{10}, \quad I_4 = x_1^4 + x_7^4 + x_{10}^4. \tag{4.51} \]
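The reduction to the skew-diagonal frame can be verified directly. The sketch below builds the 6 × 6 antisymmetric matrix of (4.49) with only $x_1, x_7, x_{10}$ switched on, evaluates $I_2$ and $I_4$ from (4.50), and checks that $\det(Z + r\,\mathbb{1})$ factorizes as $\prod_k (r^2 + x_k^2)$, which is the property that makes a determinant formula for $S_1$ possible. The sign of the Pfaffian depends on conventions, so only its square, $\det Z$, is tested. All helper names are assumptions made for illustration.

```python
from fractions import Fraction

def skew_diagonal(x1, x7, x10):
    """Matrix (4.49) with all entries except x1, x7, x10 set to zero."""
    Z = [[Fraction(0)] * 6 for _ in range(6)]
    for (i, j), value in {(0, 1): -x1, (2, 3): x7, (4, 5): x10}.items():
        Z[i][j] = Fraction(value)
        Z[j][i] = -Fraction(value)
    return Z

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det(A):
    """Determinant by Gaussian elimination over exact fractions."""
    A = [row[:] for row in A]
    n = len(A)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result

def e7_quadratic_quartic(Z):
    """I2 = -Tr(Z^2)/2 and I4 = Tr(Z^4)/2 as in (4.50)."""
    Z2 = matmul(Z, Z)
    return -trace(Z2) / 2, trace(matmul(Z2, Z2)) / 2

def shifted_det(Z, r):
    """det(Z + r * identity)."""
    n = len(Z)
    return det([[Z[i][j] + (r if i == j else 0) for j in range(n)]
                for i in range(n)])

if __name__ == "__main__":
    Z = skew_diagonal(2, 3, 5)
    print(e7_quadratic_quartic(Z), det(Z), shifted_det(Z, Fraction(7, 2)))
```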
Looking at the action of $E_{\beta_{1,5,8}}$, we again find that the spherical vector must take the form $f = h(S_1)\, e^{-i \frac{x_0 I_3}{y (y^2 + x_0^2)}} / (y^2 + x_0^2)^{3/2}$, with $S_1$ the usual form in (4.40). As in the $E_6$ case, it can be written more compactly as
\[ S_1 = \frac{\sqrt{\det(Z + |z| I_6)}}{|z|^2}, \tag{4.52} \]
where again $z = y + i x_0$ and $I_6$ is the 6 × 6 identity matrix. The equation $(E_{\alpha_2} + E_{-\alpha_2}) f = 0$ now requires $h'' + \frac{3}{S_1} h' - h = 0$, hence $h = K_1(S_1)/S_1$. The $E_7$ spherical vector is therefore given by
\[ f_{E_7} = \frac{K_1(S_1)}{(y^2 + x_0^2)^{3/2}\, S_1}\ e^{-i \frac{x_0 I_3}{y (x_0^2 + y^2)}} \tag{4.53} \]
with $S_1$ as in (4.40). In the limit $|z| \to 0$, this reduces to
\[ f_{E_7} \sim \frac{e^{-\frac{|\mathrm{Pf} Z|}{y z}}}{|\mathrm{Pf} Z|^{3/2}} \tag{4.54} \]
or its complex conjugate, depending on the sign of PfZ. As in the E6 case, we can find the spherical vector for other polarizations as well. A particularly interesting one is obtained by Fourier transform on the last column of the matrix Z in (4.49), which, as examination of the root lattice shows, yields a representation
with an SO(5, 5) group acting linearly. The 16 coordinates now transform as a spinor of SO(5, 5), or as 1 + 10 + 5 in terms of its Sl(5) subgroup,
x0 ,
0 −x1 x2 −x4 −x6 0 x3 −x5 −x8 0 x7 x11 , X= a/s 0 −x14 0
p9 p12 Y = −p15 . p 13 p10
(4.55)
Again, the Fourier transform of the spherical vector (4.53) can be computed using the same method as in Sect. 4.1, and yields a simple form y 3/2 f˜E7 = 5/4 K3/2 J4
1 J4 , y
(4.56)
where J4 is a SO(5) × SO(5) invariant polynomial of degree 4, 1 J4 = y 4 + y 2 x02 + Y t Y − TrX 2 2 1 1 + x02 Y t Y − (TrX 4 − (TrX 2 )2 ) − 2 x0 X ∧ X ∧ Y 4 2
(4.57)
and the last term denotes the contraction with the five-dimensional Levi–Civita tensor.
4.4. E8. Finally, in the $E_8$ case, the minimal representation has dimension 29, and is realized on a space of functions of $(y, x_0, \ldots, x_{27})$. $E_6$ is linearly realized, and acts on $(x_1, \ldots, x_{27})$ in the 27 representation. Its maximal compact subgroup is $USp(8)$, under which the $(x_1, \ldots, x_{27})$ transform as an antitraceless antisymmetric representation. It is somewhat awkward to fit the 27 coordinates into such a matrix; nevertheless we can easily find their transformation under the $Sl(3) \times Sl(3) \times Sl(3)$ subgroup of $H_0 = E_6$. We have the branching rule
\[ E_6 \supset Sl(3) \times Sl(3) \times Sl(3), \qquad 27 = (3, 3, 1) \oplus (3, 1, 3) \oplus (1, 3, 3), \tag{4.58} \]
so that the $x_i$ can be assigned to three 3 × 3 matrices
\[ U_{31} = -\begin{pmatrix} x_{10} & x_{11} & x_{13} \\ x_{12} & x_{14} & x_{16} \\ x_{15} & x_{17} & x_{20} \end{pmatrix}, \quad
V_{12} = \begin{pmatrix} x_7 & x_9 & x_{18} \\ -x_6 & -x_8 & x_{21} \\ x_4 & x_5 & x_{24} \end{pmatrix}, \quad
W_{23} = \begin{pmatrix} x_{27} & -x_{25} & x_{22} \\ x_{26} & -x_{23} & x_{19} \\ x_3 & x_2 & x_1 \end{pmatrix}, \tag{4.59} \]
acted upon from the left and from the right by the Sl(3) factors denoted by subscripts. The maximal compact subgroup $K_0 = USp(8)$ of $H_0 = E_6$ branches into $SO(3) \times SO(3) \times SO(3)$, where the three SU(2) are generated by $(K_{\alpha_1}, K_{\alpha_3}, K_{\alpha_8})$, $(-K_{\alpha_2}, K_{\alpha_{50}}, K_{\alpha_{53}})$ and $(K_{\alpha_5}, K_{\alpha_6}, K_{\alpha_{12}})$ respectively, where $K_\alpha \equiv E_\alpha + E_{-\alpha}$. The invariants under $K_0$
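can be constructed out of the $SO(3) \times SO(3) \times SO(3)$ invariants by requiring invariance under the extra $K_{\alpha_4}$ compact generator, and read
\[ I_2 = \mathrm{Tr}(U^t U) + \mathrm{Tr}(V^t V) + \mathrm{Tr}(W^t W), \tag{4.60} \]
\[ I_3 = \mathrm{Tr}(U V W) - \big(\det(U) + \det(V) + \det(W)\big), \tag{4.61} \]
\[ \begin{aligned}
I_4 &= \mathrm{Tr}(U U^t U U^t) + \mathrm{Tr}(V V^t V V^t) + \mathrm{Tr}(W W^t W W^t) \\
&\quad - 2\big(\mathrm{Tr}(U V V^t U^t) + \mathrm{Tr}(V W W^t V^t) + \mathrm{Tr}(W U U^t W^t)\big) \\
&\quad + 2\big(\mathrm{Tr}(U^t U)\mathrm{Tr}(V^t V) + \mathrm{Tr}(V^t V)\mathrm{Tr}(W^t W) + \mathrm{Tr}(W^t W)\mathrm{Tr}(U^t U)\big) \\
&\quad + 4\big(\det(W)\mathrm{Tr}(U V W^{-t}) + \det(U)\mathrm{Tr}(V W U^{-t}) + \det(V)\mathrm{Tr}(W U V^{-t})\big).
\end{aligned} \tag{4.62} \]
Equivalently, we can make the $Sl(6) \times Sl(2)$ subgroup of $H_0$ manifest, by arranging the (15, 1) + (6, 2) $x_i$'s into an antisymmetric 6 × 6 matrix and a doublet of 6-vectors,
\[ Z = -\begin{pmatrix}
0 & x_5 & x_8 & x_{10} & x_{12} & x_{15} \\
 & 0 & x_9 & x_{11} & x_{14} & x_{17} \\
 & & 0 & x_{13} & x_{16} & x_{20} \\
 & & & 0 & x_{19} & x_{23} \\
 & \text{a/s} & & & 0 & x_{26} \\
 & & & & & 0
\end{pmatrix}, \quad
Y_1 = \begin{pmatrix} x_7 \\ -x_6 \\ x_4 \\ x_3 \\ x_2 \\ x_1 \end{pmatrix}, \quad
Y_2 = \begin{pmatrix} x_{18} \\ x_{21} \\ x_{24} \\ -x_{27} \\ x_{25} \\ -x_{22} \end{pmatrix}. \tag{4.63} \]
In this notation, the $K_0$-invariants can be rewritten more concisely as
\[ I_2 = -\mathrm{Tr}(Z^2)/2 + \mathrm{Tr}(Y_i Y_i^t), \tag{4.64} \]
\[ I_3 = \mathrm{Pf}(Z) + \mathrm{Tr}(Y_1^t Z Y_2), \tag{4.65} \]
\[ I_4 = \frac{1}{2} \mathrm{Tr}(Z^4) + \mathrm{Tr}\big((Y_i Y_i^t)^2\big) + 2\, \mathrm{Tr}\, Y_i^t Z^2 Y_i - (\mathrm{Tr} Z^2)(\mathrm{Tr}\, Y_i Y_i^t) + \frac{1}{2}\, \epsilon^{ijklmn} Z_{ij} Z_{kl} Y^1_m Y^2_n. \tag{4.66} \]
Using a $USp(8)$ rotation, we can set all $x_i$'s to zero except e.g. $x_1, x_{20}, x_{24}$. The invariants above then reduce to
\[ I_2 = x_1^2 + x_{20}^2 + x_{24}^2, \quad I_3 = -x_1 x_{20} x_{24}, \quad I_4 = x_1^4 + x_{20}^4 + x_{24}^4. \tag{4.67} \]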
The $\beta_{1,20,24}$ equations require the ansatz
\[ f = (y^2 + x_0^2)^{-5/2}\ h(S_1)\ e^{-i \frac{x_0 I_3}{y (x_0^2 + y^2)}}, \tag{4.68} \]
while the $\alpha_7$ equation gives $h'' + \frac{5}{S_1} h' - h = 0$ and hence $h = K_2(S_1)/S_1^2$. The $E_8$ spherical vector is therefore
\[ f_{E_8} = \frac{K_2(S_1)}{(y^2 + x_0^2)^{5/2}\, S_1^2}\ e^{-i \frac{x_0 I_3}{y (x_0^2 + y^2)}} \tag{4.69} \]
with $S_1$ as in (4.40). Again, the real part of the action $S_1$ can be more compactly written as
\[ S_1 = \frac{1}{|z|^2} \Big\{ \det(Z + |z| I_6) + |z|^4\, \mathrm{Tr}(Y_i Y_i^t) + |z|^2 \Big[ 2 \det_{\alpha\beta}(Y^i_\alpha Y^j_\beta) + \frac{1}{2}\, \mathrm{Tr}(Y_i Y_i^t)\, \mathrm{Tr} Z^2 - Z \wedge Z \wedge Y_1 \wedge Y_2 \Big] + \mathrm{Tr}(Y_1^t Z Y_2) \big[ 2\, \mathrm{Pf}(Z) + \mathrm{Tr}(Y_1^t Z Y_2) \big] \Big\}^{1/2}. \tag{4.70} \]
As for $E_6$ and $E_7$, another interesting representation can be obtained by Fourier transforming on the 13 coordinates $(x_0, Y_1, Y_2)$ (or, equivalently, under the 15 coordinates in X)5. In this polarization, the linearly realized symmetry group is enlarged to Sl(8), and the 28 coordinates $x_0, \ldots, x_{27}$ transform as an antisymmetric matrix
\[ X = -\begin{pmatrix}
0 & x_5 & x_8 & x_{10} & x_{12} & x_{15} & p_7 & p_{18} \\
 & 0 & x_9 & x_{11} & x_{14} & x_{17} & -p_6 & p_{21} \\
 & & 0 & x_{13} & x_{16} & x_{20} & p_4 & p_{24} \\
 & & & 0 & x_{19} & x_{23} & -p_3 & -p_{27} \\
 & & & & 0 & x_{26} & p_2 & p_{25} \\
 & & \text{a/s} & & & 0 & p_1 & -p_{22} \\
 & & & & & & 0 & p_0 \\
 & & & & & & & 0
\end{pmatrix}. \tag{4.71} \]
It would be interesting to find the spherical vector in this representation.
Summary. The general form of the spherical invariant for $E_{6,7,8}$ in the standard minimal representation is
\[ f_{E_n} = \frac{\mathcal{K}_{s/2}(S_1)}{(y^2 + x_0^2)^{(s+1)/2}}\ e^{-i \frac{x_0 I_3}{y (x_0^2 + y^2)}} = \frac{1}{R^{s+1}}\ \mathcal{K}_{s/2}\Big( \Big\| \Big(X, \nabla_X \frac{I_3}{R}\Big) \Big\| \Big)\ \exp\Big(-i\, \frac{x_0 I_3}{y R^2}\Big), \tag{4.72} \]
where $\mathcal{K}_t$ is expressed in terms of the standard Bessel function as $\mathcal{K}_t(x) \equiv x^{-t} K_t(x)$ and the "classical action" $S_1$ is given in terms of the quadratic, cubic and quartic invariants $I_{2,3,4}$ by
\[ S_1 = \frac{\sqrt{(y^2 + x_0^2)^3 + (y^2 + x_0^2)^2 I_2 + (y^2 + x_0^2)(I_2^2 - I_4)/2 + I_3^2}}{y^2 + x_0^2} = \Big\| \Big(X, \nabla_X \frac{I_3}{R}\Big) \Big\|, \tag{4.73} \]
where $X = (y, x_0, x_i)$, $R = \|(y, x_0)\|$ and $s = 1, 2, 4$ for $n = 6, 7, 8$ respectively. The parameter $s$ can be identified with the dimension of the field $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ entering in the alternate construction of the minimal representations of $E_{6,7,8}$ through Jordan algebras in [10]. We have also found alternate representations for $E_6$ and $E_7$, where Sl(5) and SO(5, 5) act linearly, respectively; the spherical vectors in these representations can be found in (4.47), (4.56).
5 This representation has also been constructed independently in [23].
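The three Bessel-type equations quoted for $E_6$, $E_7$ and $E_8$ fit a single pattern: $h = \mathcal{K}_{s/2}(S) = S^{-s/2} K_{s/2}(S)$ should satisfy $h'' + \frac{s+1}{S}\, h' - h = 0$ for $s = 1, 2, 4$. The sketch below tests this numerically, computing $K_\nu$ from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x \cosh t} \cosh(\nu t)\, dt$ with a simple trapezoidal rule and the derivatives with central differences. The quadrature parameters, step size and function names are assumptions chosen for illustration.

```python
import math

def bessel_k(nu, x, step=0.01, t_max=12.0):
    """K_nu(x) from the integral of exp(-x cosh t) cosh(nu t) over t >= 0.

    Plain trapezoidal rule; the integrand is negligible at t_max for the
    values of x used here, so the far endpoint needs no special weight.
    """
    n = int(round(t_max / step))
    total = 0.5 * math.exp(-x)  # t = 0 endpoint carries weight 1/2
    for k in range(1, n + 1):
        t = k * step
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * step

def script_k(t, x):
    """The combination x^(-t) K_t(x) used in (4.72)."""
    return x ** (-t) * bessel_k(t, x)

def ode_residual(s, S, d=2e-4):
    """Central-difference estimate of h'' + ((s + 1)/S) h' - h for h = script_k(s/2, .)."""
    h_plus = script_k(s / 2.0, S + d)
    h_0 = script_k(s / 2.0, S)
    h_minus = script_k(s / 2.0, S - d)
    first = (h_plus - h_minus) / (2.0 * d)
    second = (h_plus - 2.0 * h_0 + h_minus) / (d * d)
    return second + (s + 1) / S * first - h_0

if __name__ == "__main__":
    for s in (1, 2, 4):
        for S in (2.0, 3.0):
            print(s, S, ode_residual(s, S) / script_k(s / 2.0, S))
```

The same routine with $s = 0$ covers the $D_4$ case, where the Bessel function is $K_0$.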
4.5. Complex spherical vectors. For completeness, we discuss the case of a complex group, for which our methods also allow us to derive the spherical vector. The complex group $G(\mathbb{C})$ can be obtained by complexifying its split real form $G(\mathbb{R})$, i.e., adjoining to the real generators $T_i$ of $G(\mathbb{R})$ a set of "imaginary" generators $T'_i$ such that
\[ [T_i, T_j] = c_{ij}{}^{k}\, T_k, \quad [T_i, T'_j] = c_{ij}{}^{k}\, T'_k, \quad [T'_i, T'_j] = -c_{ij}{}^{k}\, T_k. \tag{4.74} \]
Equivalently, one can introduce the holomorphic and anti-holomorphic generators $\mathcal{T}_i = T_i + i T'_i$, $\bar{\mathcal{T}}_i = T_i - i T'_i$, satisfying
\[ [\mathcal{T}_i, \mathcal{T}_j] = c_{ij}{}^{k}\, \mathcal{T}_k, \quad [\mathcal{T}_i, \bar{\mathcal{T}}_j] = 0, \quad [\bar{\mathcal{T}}_i, \bar{\mathcal{T}}_j] = c_{ij}{}^{k}\, \bar{\mathcal{T}}_k. \tag{4.75} \]
We stress that the holomorphic generators $\mathcal{T}_i$ are identical in form to the original $T_i$ except that the variables are now complex, while the generators $\bar{\mathcal{T}}_i$ are obtained by replacing all variables by their complex conjugates. Dividing the generators into Cartan ones $H_\alpha$ and those associated with simple roots $E_{\pm\alpha}$, the maximal compact subgroup $K \subset G(\mathbb{C})$ is generated by
\[ E_\alpha \pm \bar E_{-\alpha}, \quad \bar E_\alpha \pm E_{-\alpha} \quad \text{and} \quad H_\alpha - \bar H_\alpha. \tag{4.76} \]
[The choice of sign again depends on the conventions for positive and negative roots.] $K$ is simply the real compact group of the same type as $G$. The simplest example is $Sl(2, \mathbb{C})$ with maximal compact subgroup SU(2). The spherical vectors in the complexified metaplectic and Eisenstein representations (see (2.5), (2.6), (2.7) and (2.20), (2.21), (2.22) respectively) are
\[ f_{\mathrm{meta}} = e^{-x \bar x}, \quad f_{\mathrm{Eis}} = |x|^{1-\nu}\, K_{\nu-1}(2|x|). \tag{4.77} \]
For the groups $D_4$, $E_6$, $E_7$ and $E_8$, the form of the complex spherical vector in the standard minimal representation is uniform. Again, the requirement that $f$ be annihilated by the linearly realized compact subgroup allows us to reduce the problem to one in five complex variables $(y, x_0, x_1, x_2, x_3)$ (for $E_{6,7,8}$ we have renamed the remaining $x_i$'s for simplicity and will relabel corresponding roots accordingly). The compact Cartan generators $H_\alpha - \bar H_\alpha$ imply that all dependence is through the complex moduli or the ratio $x_1 x_2 x_3/(y x_0)$, namely $f = f(|y|, |x_0|, |x_1|, |x_2|, |x_3|, x_1 x_2 x_3/(y x_0))$. The compact generators of the root attached to the affine one,
\[ E_{\beta_0} + \bar E_{-\beta_0} = y\partial_0 - \bar x_0 \bar\partial + \frac{i \bar I_3}{\bar y^2}, \qquad \bar E_{\beta_0} + E_{-\beta_0} = \bar y \bar\partial_0 - x_0 \partial + \frac{i I_3}{y^2} \tag{4.78} \]
[where $\bar I_3 \equiv \bar x_1 \bar x_2 \bar x_3 + \cdots$] imply the phase factor
\[ f = \exp(-i S_2)\ g(\|(y, x_0)\|, |x_1|, |x_2|, |x_3|), \tag{4.79} \]
with
\[ S_2 \equiv \frac{1}{|y|^2 + |x_0|^2} \Big( \frac{\bar x_0 I_3}{y} + \frac{x_0 \bar I_3}{\bar y} \Big). \tag{4.80} \]
We can drop the $x_0$ dependence of the function $g$ and reinstate it at the end of the calculation, so we look at the root $\beta_1$ at $x_0 = 0$ (and drop non-diagonal terms in the $x_i$'s),
\[ E_{\beta_1} - \bar E_{-\beta_1} = y\partial_1 - \bar x_1 \bar\partial - \frac{\bar x_1}{\bar y} \big( s + 1 + \bar x_2 \bar\partial_2 + \bar x_3 \bar\partial_3 \big), \tag{4.81} \]
where the constant s = 0 for D4 and s = 1, 2, 4 for E6,7,8 . This implies (at x0 = 0) that (|y|2 + |x1 |2 )(|y|2 + |x2 |2 )(|y|2 + |x3 |2 ) −2(s+1) . (4.82) h 2 g = |y| |y|2 Finally, we examine the root α1 at x0 = 0 with diagonal xi ’s, ix2 x3 Eα1 + E−α1 = − + x¯1 ∂¯0 + i y¯ ∂¯2 ∂¯3 + [s off-diagonal double derivatives] . y (4.83) Note that we may not neglect x0 or off-diagonal xi derivatives even though these variables are set to zero at the end of the calculation. In turn the function h satisfies the ordinary, Bessel-type, differential equation xh&& + (2s + 1)h& + xh = 0 (to verify that the s offdiagonal double derivative terms in (4.83) produce the coefficient 2s requires knowledge of the invariants of the linearly realized compact subgroup described below). It is now a matter of reinstating the dependence on the remaining variables; orchestrating the above results we find the complex spherical vector Ks (S1 ) x¯0 I3 −i x0 I 3 f = exp + , (4.84) (|y|2 + |x0 |2 )s+1 |y|2 + |x0 |2 y y¯ where s = 0, 1, 2, 4 for D4 , E6,7,8 , respectively, and Kt (x) ≡ x −t Kt (x). The action is
(|y|2 + |x0 |2 )3 + (|y|2 + |x0 |2 )2 I2 + (|y|2 + |x0 |2 )(I22 − I4 )/2 + |I3 |2 S1 = 2 , |y|2 + |x0 |2 (4.85) and I2 , I3 , I 3 and I4 are the invariants of the linearly realized subgroup. For E6 and E7 they are subsumed by the elegant formulae det(ZZ † + (|y|2 + |x0 |2 )I3 ) (det(ZZ † + (|y|2 + |x0 |2 )I6 ))1/4 E6 E7 S1 = , S = . 1 |y|2 + |x0 |2 |y|2 + |x0 |2 (4.86) The matrices Z are the same as (4.33) and (4.49) for complex variables; the cubic invariant I3 is holomorphic and takes the same expression as in the real case. The quadratic and quartic invariants I2 and I4 have the same form as in the real case with hermitian conjugation replacing transposition. Finally, we note that a rewriting in terms of the norm (x1 , x2 , . . . ) ≡ |x1 |2 + |x2 |2 + · · · and R ≡ (y, x0 ) also holds 1 I3 −i x¯0 I3 x0 I 3 f = 2(s+1) Ks 2 (X, R∇ exp + . (4.87) X R2 R2 y y¯ R Similar generalizations of the spherical vectors for complexifications of the groups An and Dn may also be obtained using exactly the methods presented above; the former is trivial, while for the Dn case we find n−4 I3 1 (y, x0 , x1 ) Kn−4 × 2 (X, R∇ fDn = 2 X R R R2 −i x¯0 I3 x0 I 3 exp + . (4.88) R2 y y¯
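As a cross-check of the Bessel-type equation and of the action, the following sketch may be helpful. It assumes that $K_s$ is the modified Bessel function of the second kind, and that for diagonal $x_i$ the invariants reduce to $I_2 = \sum_i |x_i|^2$, $I_4 = \sum_i |x_i|^4$ and $I_3 = x_1x_2x_3$; both are assumptions made for illustration rather than statements taken from the text. Under them, $h = \mathcal{K}_s$ solves the equation with a relative minus sign in front of $xh$, and the argument of $h$ in (4.82) agrees with $S_1$ in (4.85) at $x_0 = 0$.

```latex
% Assumption: K_s solves the modified Bessel equation
%   x^2 K_s'' + x K_s' - (x^2 + s^2) K_s = 0 .
\begin{align*}
K_s(x) &= x^{s} h(x), \qquad h(x) \equiv \mathcal{K}_s(x) = x^{-s} K_s(x), \\
K_s'   &= x^{s} h' + s\,x^{s-1} h, \\
K_s''  &= x^{s} h'' + 2s\,x^{s-1} h' + s(s-1)\,x^{s-2} h, \\
0 &= x^2 K_s'' + x K_s' - (x^2 + s^2) K_s
   = x^{s+1}\bigl[\,x h'' + (2s+1)\,h' - x h\,\bigr].
\end{align*}
% Assumption: diagonal x_i, so that
%   I_2 = \sum_i |x_i|^2, \quad I_4 = \sum_i |x_i|^4, \quad I_3 = x_1 x_2 x_3 .
\begin{align*}
\tfrac12\,(I_2^2 - I_4) &= |x_1 x_2|^2 + |x_1 x_3|^2 + |x_2 x_3|^2, \\
|y|^6 + |y|^4 I_2 + \tfrac12\,|y|^2 (I_2^2 - I_4) + |I_3|^2
  &= \prod_{i=1}^{3} \bigl(|y|^2 + |x_i|^2\bigr).
\end{align*}
```

The first computation shows why the index $s$ of the Bessel function is tied to the coefficient $2s+1$ of $h'$; the second shows that the square root in the action factorizes at $x_0 = 0$.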
5. Discussion The main object of this paper was the derivation of the spherical vector for the minimal representation of real simply laced groups in the split real form. Our results are displayed in (4.28) and (4.72) above. This spherical vector is an essential component in the construction of automorphic theta series for exceptional groups. It has been proposed that such objects would provide the partition function for the winding modes of the quantum eleven-dimensional supermembrane in M-theory. Although the physical interpretation of the results obtained herein will be discussed elsewhere, we close with some comments, both mathematical and physical: (i)
We have obtained the spherical vector over the real field. As we have explained in Sect. 2, the spherical vectors over the p-adic fields Qp are also important, since their product gives the summation measure when constructing a theta series. Unfortunately in the p-adic case, we cannot rely on partial differential equations anymore. One may however require both the spherical vector and its A and S transforms to have support on the p-adic integers, which together with invariance under the linearly realized maximal compact group K0 (Zp ) should fix it uniquely. Nonetheless the presentation of the real spherical vectors in terms of norms does suggest a natural generalization to the p-adic case. (ii) We have left the space of functions of (y, xi ) unspecified. In order for the generators S and A to be well-defined, it should consist of functions in the Schwartz space, as well as their images. The main issue is the regularity at the origin. A proper understanding of this issue would provide the degenerate contributions to the theta series which we have overlooked in this paper. (iii) We have only discussed the minimal representation for simply-laced groups. For non-simply laced cases, some differences arise: for G2 and F4 , the minimal representation does not contain any singlet under reduction to the maximal compact subgroup K ⊂ G, therefore there is no spherical vector, but a multiplet of such vectors transforming into each other [24]. It would be interesting to find the wave function for the lowest such multiplet. For Bn≥3 , there simply does not exist a representation of the same dimension as that of the smallest nilpotent subalgebra [25]. For the symplectic case Cn , the spherical vector is not annihilated by the compact generators, but has a non-vanishing eigenvalue. (iv) From a physical viewpoint, the simplicity of our results is very encouraging. 
Specifically, in the E6 case, we have found a representation on 2 + 9 variables, such that 9 of them transform as a bifundamental under Sl(3) × Sl(3) in E6 : these variables should be identified with the winding numbers of a membrane on T 3 , with the two Sl(3) factors being the worldvolume and target space modular groups, respectively. The target space Sl(2) U-duality group, corresponding to the modular transformations of the modulus τ = C123 + iV3 , would then be realized through Fourier transform (i.e., Poisson resummation). It would be very interesting to understand the physical meaning of the two extra quantum numbers (y, x0 ) that we found necessary for realizing the symmetry. It would also be important to understand if the Born–Infeld-like action (4.41) for the zero-modes of the membrane can be generalized to include fluctuations, and yield a new description of the membrane where U-duality is non-linearly realized as a dynamical symmetry. For E7 , we have constructed a representation in terms of 2+15 variables, with an action (4.52) suggestive of a Born–Infeld-like action on a six dimensional worldvolume, whose interpretation is unclear. For E8 , we have a representation on 2 + 27 variables,
where 27 of them transform linearly under E6. This is the appropriate number of charges for M-theory on T^6, so the spherical vector (4.69) should correspond to the membrane (or, perhaps more appropriately, the five-brane) partition function on T^6 in the Schwinger representation, where the sum over BPS states is apparent. The modular group Sl(3) should then be realized by Poisson resummation. It would be very interesting to find a representation where Sl(3) acts linearly, and identify it with the membrane action. The intertwining operator between the two representations would then realize membrane/five-brane duality. Finally, the representations we have displayed can be used to construct quantum mechanical systems with spectrum-generating exceptional symmetries, much in the spirit of conformal quantum mechanics [26]. It would be interesting to see if they can play a rôle in string theory.

Acknowledgements. The authors are grateful to K. Koepsell, H. Nicolai, N. Obers, J. Plefka, E. Rabinovici and N. Wyllard for useful discussions, to D. Vogan for correspondence on the non-simply laced case, and especially to A. Polishchuk for discussions about the p-adic case. B.P. gratefully acknowledges the hospitality of ITP, Santa Barbara during part of this project. B.P. and A.W. are grateful to the Max-Planck-Institut für Gravitationsphysik Albert-Einstein-Institut for hospitality during the course of this project. Many thanks are also extended to the authors of the Form [27], Lie and Mathematica symbolic computing packages, which have been instrumental in arriving at the results presented here. The work of B.P. is supported by the David and Lucile Packard Foundation and that of A.W. by NSF grant PHY99-73935.
A. Group Theory Data

In this appendix, we supply the list of positive roots for all simply-laced groups, graded by their charge under the affine Cartan generator $H_\omega$. The grade-one subspace is presented in the standard polarization, as explained in the text below Eq. (3.3), and the action of the Weyl reflection $A$ with respect to $\beta_0$ is indicated. The explicit expressions for the cubic invariant $I_3$, the Cartan generators and the (positive and negative) simple roots in the minimal representation are listed. The expressions for the grade 1 and 2 positive roots are given in (3.4), and repeated here for convenience,
$$E_{\beta_i} = y\partial_i, \qquad E_{\gamma_i} = ix_i, \qquad E_\omega = iy, \qquad i = 0, \ldots, d-1. \tag{A.1}$$
The Cartan generators for the simple root $\beta_0$ and the affine root $\omega$ also take universal forms,
$$H_{\beta_0} = -y\partial + x_0\partial_0, \tag{A.2}$$
$$H_\omega = -\nu - 2y\partial - \sum_{i=0}^{d-1} x_i\partial_i, \tag{A.3}$$
up to the "normal ordering constant" $\nu = (n-1), 6, 9, 15$ for $D_n$, $E_6$, $E_7$, $E_8$, respectively. More compact expressions can be obtained by making manifest the covariance under the linearly realized group $H_0$.
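The labels of the form "= A(...)" in the root tables of this appendix can be reproduced mechanically. The sketch below is only an illustration: it uses the D5 data of Sect. A.2, assumes the coefficient order (α1, β0, α2, α3, α4) and the diagram links 1-2, 2-3, 3-4, 3-5 as read from that section, and the helper names `cartan_matrix`, `reflect` and `A` are hypothetical. It applies the Weyl reflection in β0 to a root written in the simple-root basis.

```python
# Assumed D5 conventions (read off Sect. A.2): coefficient order
# (alpha1, beta0, alpha2, alpha3, alpha4); diagram links 1-2, 2-3, 3-4, 3-5.
RANK = 5
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]  # zero-based pairs of linked nodes
BETA0 = 1                                  # slot of beta0 in a coefficient vector


def cartan_matrix(rank, edges):
    """Simply-laced Cartan matrix: 2 on the diagonal, -1 for linked nodes."""
    a = [[2 if i == j else 0 for j in range(rank)] for i in range(rank)]
    for i, j in edges:
        a[i][j] = a[j][i] = -1
    return a


def reflect(root, k, cartan):
    """Weyl reflection of `root` (coefficients on simple roots) in simple root k."""
    pairing = sum(c * cartan[i][k] for i, c in enumerate(root))
    image = list(root)
    image[k] -= pairing
    return tuple(image)


CARTAN = cartan_matrix(RANK, EDGES)


def A(root):
    """Weyl reflection with respect to beta0."""
    return reflect(root, BETA0, CARTAN)


# A few D5 roots copied from Sect. A.2.
ROOTS = {
    "alpha1": (1, 0, 0, 0, 0),
    "alpha3": (0, 0, 0, 1, 0),
    "alpha7": (0, 0, 1, 1, 1),
    "beta1": (1, 1, 0, 0, 0),
    "beta3": (0, 1, 1, 1, 1),
    "gamma0": (1, 1, 2, 1, 1),
    "omega": (1, 2, 2, 1, 1),
}

if __name__ == "__main__":
    for name in ("beta1", "alpha3", "beta3", "gamma0"):
        print(name, "->", A(ROOTS[name]))
```

Since a reflection is an involution, each label can be read in either direction: α1 = A(β1) is the same statement as β1 = A(α1).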
A.1. An.

Dynkin diagram:

  o --- o --- o --- ... --- o
  1     2     3             n
  β0    α1    α2            αn−1

Positive roots:

  α1 = (0, 1, 0, ..., 0, 0) = A(β1),
  α2 = (0, 0, 1, ..., 0, 0),
  ...
  αn−2 = (0, 0, 0, ..., 1, 0),
  αn−1 = (0, 1, 1, 0, ..., 0) = A(β2),
  αn = (0, 0, 1, 1, 0, 0),
  ...
  α2n−5 = (0, 0, ..., 1, 1, 0),
  ...
  α(n−1)(n−2)/2 = (0, 1, ..., 1, 1, 0) = A(βn−2),

  β0 = (1, 0, 0, 0, 0),          γ0 = (0, 1, ..., 1, 1),
  β1 = (1, 1, 0, 0, 0),          γ1 = (0, 0, 1, ..., 1),
  ...
  βn−2 = (1, 1, ..., 1, 0),      γn−2 = (0, 0, ..., 0, 1),

  ω = (1, 1, ..., 1, 1) = A(γ0).

Cartan generators (ν = (n + 1)/2 in the standard minimal rep):

  Hβ0 = −y∂ + x0∂0,
  Hα1 = −x0∂0 + x1∂1,
  Hα2 = −x1∂1 + x2∂2,
  ...
  Hαn−2 = −xn−3∂n−3 + xn−2∂n−2,
  Hγn−2 = −ν − y∂ − x0∂0 − ··· − xn−3∂n−3 − 2xn−2∂n−2.

Simple roots:

  Eα1 = x0∂1,             E−α1 = x1∂0,
  Eα2 = x1∂2,             E−α2 = x2∂1,
  ...
  Eαn−2 = xn−3∂n−2,       E−αn−2 = xn−2∂n−3.
A.2. D5.

Dynkin diagram:

              4 (α3)
              o
              |
  o --- o --- o --- o
  1     2     3     5
  α1    β0    α2    α4

Positive roots:

  α1 = (1, 0, 0, 0, 0) = A(β1),
  α2 = (0, 0, 1, 0, 0) = A(β2),
  α3 = (0, 0, 0, 1, 0) = A(α3),
  α4 = (0, 0, 0, 0, 1) = A(α4),
  α5 = (0, 0, 1, 1, 0) = A(β4),
  α6 = (0, 0, 1, 0, 1) = A(β5),
  α7 = (0, 0, 1, 1, 1) = A(β3),

  β0 = (0, 1, 0, 0, 0),      γ0 = (1, 1, 2, 1, 1),
  β1 = (1, 1, 0, 0, 0),      γ1 = (0, 1, 2, 1, 1),
  β2 = (0, 1, 1, 0, 0),      γ2 = (1, 1, 1, 1, 1),
  β3 = (0, 1, 1, 1, 1),      γ3 = (1, 1, 1, 0, 0),
  β4 = (0, 1, 1, 1, 0),      γ4 = (1, 1, 1, 0, 1),
  β5 = (0, 1, 1, 0, 1),      γ5 = (1, 1, 1, 1, 0),

  ω = (1, 2, 2, 1, 1) = A(γ0).

Cubic form: I3 = x1(x2x3 − x4x5).

Cartan generators:

  Hβ0 = −y∂ + x0∂0,
  Hα1 = −2 − x0∂0 + x1∂1 − x2∂2 − x3∂3 − x4∂4 − x5∂5,
  Hα2 = −1 − x0∂0 − x1∂1 + x2∂2 − x3∂3,
  Hα3 = −x2∂2 + x3∂3 + x4∂4 − x5∂5,
  Hα4 = −x2∂2 + x3∂3 − x4∂4 + x5∂5.

Simple roots:

  Eα1 = −x0∂1 − i(x2x3 − x4x5)/y,
  Eα2 = −x0∂2 − ix1x3/y,
  Eα3 = x2∂4 + x5∂3,
  Eα4 = −x2∂5 − x4∂3,
  E−α1 = x1∂0 + iy(∂2∂3 − ∂4∂5),
  E−α2 = x2∂0 + iy∂1∂3,
  E−α3 = −x3∂5 − x4∂2,
  E−α4 = x3∂4 + x5∂2.
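The linearly realized generators above can be checked against one another with very little machinery. The following sketch is illustrative only: the representation of an operator as a dictionary and the names `op` and `commutator` are hypothetical. It encodes operators of the form Σ c·x_a∂_b and verifies, for the D5 data above, that [Eα3, E−α3] = Hα3 and [Eα4, E−α4] = Hα4. The generators containing 1/y or double derivatives (Eα1, Eα2 and their negatives) are outside the scope of this simple encoding.

```python
from collections import defaultdict


def op(*terms):
    """Build the operator sum of c * x_a d_b from (c, a, b) triples."""
    result = defaultdict(int)
    for c, a, b in terms:
        result[(a, b)] += c
    return {k: v for k, v in result.items() if v != 0}


def commutator(p, q):
    """[P, Q], using [x_a d_b, x_c d_d] = delta_bc x_a d_d - delta_ad x_c d_b."""
    result = defaultdict(int)
    for (a, b), cp in p.items():
        for (c, d), cq in q.items():
            if b == c:
                result[(a, d)] += cp * cq
            if a == d:
                result[(c, b)] -= cp * cq
    return {k: v for k, v in result.items() if v != 0}


# D5 generators copied from Sect. A.2 (linear ones only).
E_A3 = op((1, 2, 4), (1, 5, 3))        # x2 d4 + x5 d3
EM_A3 = op((-1, 3, 5), (-1, 4, 2))     # -x3 d5 - x4 d2
H_A3 = op((-1, 2, 2), (1, 3, 3), (1, 4, 4), (-1, 5, 5))

E_A4 = op((-1, 2, 5), (-1, 4, 3))      # -x2 d5 - x4 d3
EM_A4 = op((1, 3, 4), (1, 5, 2))       # x3 d4 + x5 d2
H_A4 = op((-1, 2, 2), (1, 3, 3), (-1, 4, 4), (1, 5, 5))

if __name__ == "__main__":
    print(commutator(E_A3, EM_A3) == H_A3)
    print(commutator(E_A4, EM_A4) == H_A4)
```

The same helper applies unchanged to the linear generators of E6, E7 and E8 listed in the following sections, which makes it a convenient way to catch transcription errors in the tables.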
A.3. E6.

Dynkin diagram:

              2 (β0)
              o
              |
  o --- o --- o --- o --- o
  1     3     4     5     6
  α1    α2    α3    α4    α5

Positive roots:

  α1 = (1, 0, 0, 0, 0, 0) = A(α1),
  α2 = (0, 0, 1, 0, 0, 0) = A(α2),
  α3 = (0, 0, 0, 1, 0, 0) = A(β1),
  α4 = (0, 0, 0, 0, 1, 0) = A(α4),
  α5 = (0, 0, 0, 0, 0, 1) = A(α5),
  α6 = (1, 0, 1, 0, 0, 0) = A(α6),
  α7 = (0, 0, 1, 1, 0, 0) = A(β2),
  α8 = (0, 0, 0, 1, 1, 0) = A(β3),
  α9 = (0, 0, 0, 0, 1, 1) = A(α9),
  α10 = (1, 0, 1, 1, 0, 0) = A(β4),
  α11 = (0, 0, 1, 1, 1, 0) = A(β5),
  α12 = (0, 0, 0, 1, 1, 1) = A(β6),
  α13 = (1, 0, 1, 1, 1, 0) = A(β7),
  α14 = (0, 0, 1, 1, 1, 1) = A(β9),
  α15 = (1, 0, 1, 1, 1, 1) = A(β8),

  β0 = (0, 1, 0, 0, 0, 0),      γ0 = (1, 1, 2, 3, 2, 1),
  β1 = (0, 1, 0, 1, 0, 0),      γ1 = (1, 1, 2, 2, 2, 1),
  β2 = (0, 1, 1, 1, 0, 0),      γ2 = (1, 1, 1, 2, 2, 1),
  β3 = (0, 1, 0, 1, 1, 0),      γ3 = (1, 1, 2, 2, 1, 1),
  β4 = (1, 1, 1, 1, 0, 0),      γ4 = (0, 1, 1, 2, 2, 1),
  β5 = (0, 1, 1, 1, 1, 0),      γ5 = (1, 1, 1, 2, 1, 1),
  β6 = (0, 1, 0, 1, 1, 1),      γ6 = (1, 1, 2, 2, 1, 0),
  β7 = (1, 1, 1, 1, 1, 0),      γ7 = (0, 1, 1, 2, 1, 1),
  β8 = (1, 1, 1, 1, 1, 1),      γ8 = (0, 1, 1, 2, 1, 0),
  β9 = (0, 1, 1, 1, 1, 1),      γ9 = (1, 1, 1, 2, 1, 0),

  ω = (1, 2, 2, 3, 2, 1) = A(γ0).

Cubic form: I3 = −x1x5x8 + x1x7x9 + x2x3x8 − x2x6x7 − x3x4x9 + x4x5x6.

Cartan generators:

  Hβ0 = −y∂ + x0∂0,
  Hα1 = −x2∂2 + x4∂4 − x5∂5 + x7∂7 + x8∂8 − x9∂9,
  Hα2 = −x1∂1 + x2∂2 − x3∂3 + x5∂5 − x6∂6 + x9∂9,
  Hα3 = −2 − x0∂0 + x1∂1 − x5∂5 − x7∂7 − x8∂8 − x9∂9,
  Hα4 = −x1∂1 − x2∂2 + x3∂3 − x4∂4 + x5∂5 + x7∂7,
  Hα5 = −x3∂3 − x5∂5 + x6∂6 − x7∂7 + x8∂8 + x9∂9.

Simple roots:

  Eα1 = −x2∂4 − x5∂7 − x9∂8,
  Eα2 = −x1∂2 − x3∂5 − x6∂9,
  Eα3 = −x0∂1 + i(x5x8 − x7x9)/y,
  Eα4 = −x1∂3 − x2∂5 − x4∂7,
  Eα5 = −x3∂6 − x5∂9 − x7∂8,
  E−α1 = x4∂2 + x7∂5 + x8∂9,
  E−α2 = x2∂1 + x5∂3 + x9∂6,
  E−α3 = x1∂0 − iy(∂5∂8 − ∂7∂9),
  E−α4 = x3∂1 + x5∂2 + x7∂4,
  E−α5 = x6∂3 + x8∂7 + x9∂5.
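The statement that I3 is an invariant of the linearly realized subgroup can be tested directly on the E6 data above. The sketch below is an illustration under stated assumptions: polynomials are stored as dictionaries from sorted index tuples to integer coefficients, and the names `apply_term`, `act` and `LINEAR_GENERATORS` are hypothetical. It applies each linearly realized simple-root generator (α1, α2, α4, α5 and their negatives) to the cubic form and checks that the result vanishes.

```python
from collections import defaultdict

# E6 cubic form from Sect. A.3: monomial x_i x_j x_k stored as a sorted index tuple.
I3 = {
    (1, 5, 8): -1, (1, 7, 9): 1, (2, 3, 8): 1,
    (2, 6, 7): -1, (3, 4, 9): -1, (4, 5, 6): 1,
}


def apply_term(a, b, poly):
    """Return (x_a * d/dx_b) acting on poly."""
    out = defaultdict(int)
    for mono, coeff in poly.items():
        power = mono.count(b)
        if power == 0:
            continue
        rest = list(mono)
        rest.remove(b)            # differentiate once with respect to x_b
        out[tuple(sorted(rest + [a]))] += coeff * power
    return out


def act(operator, poly):
    """Apply a sum of c * x_a d_b terms, given as (c, a, b) triples, to poly."""
    total = defaultdict(int)
    for c, a, b in operator:
        for mono, coeff in apply_term(a, b, poly).items():
            total[mono] += c * coeff
    return {m: v for m, v in total.items() if v != 0}


# Linearly realized E6 generators copied from Sect. A.3.
LINEAR_GENERATORS = {
    "E_a1": [(-1, 2, 4), (-1, 5, 7), (-1, 9, 8)],
    "E_a2": [(-1, 1, 2), (-1, 3, 5), (-1, 6, 9)],
    "E_a4": [(-1, 1, 3), (-1, 2, 5), (-1, 4, 7)],
    "E_a5": [(-1, 3, 6), (-1, 5, 9), (-1, 7, 8)],
    "E_-a1": [(1, 4, 2), (1, 7, 5), (1, 8, 9)],
    "E_-a2": [(1, 2, 1), (1, 5, 3), (1, 9, 6)],
    "E_-a4": [(1, 3, 1), (1, 5, 2), (1, 7, 4)],
    "E_-a5": [(1, 6, 3), (1, 8, 7), (1, 9, 5)],
}

if __name__ == "__main__":
    for name, generator in LINEAR_GENERATORS.items():
        print(name, "annihilates I3:", act(generator, I3) == {})
```

A single term such as x1∂2 does not annihilate I3 on its own; the cancellation only occurs for the full three-term combinations, which is a useful sanity check on the relative signs in the table.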
A.4. E7.

Dynkin diagram:

              2 (α1)
              o
              |
  o --- o --- o --- o --- o --- o
  1     3     4     5     6     7
  β0    α2    α3    α4    α5    α6

Positive roots: α1 = (0, 1, 0, 0, 0, 0, 0) = A(α1 ), α2 = (0, 0, 1, 0, 0, 0, 0) = A(β1 ), α3 = (0, 0, 0, 1, 0, 0, 0) = A(α3 ), α4 = (0, 0, 0, 0, 1, 0, 0) = A(α4 ), α5 = (0, 0, 0, 0, 0, 1, 0) = A(α5 ), α6 = (0, 0, 0, 0, 0, 0, 1) = A(α6 ), α7 = (0, 1, 0, 1, 0, 0, 0) = A(α7 ), α8 = (0, 0, 1, 1, 0, 0, 0) = A(β2 ), α9 = (0, 0, 0, 1, 1, 0, 0) = A(α9 ), α10 = (0, 0, 0, 0, 1, 1, 0) = A(α10 ), α11 = (0, 0, 0, 0, 0, 1, 1) = A(α11 ), α12 = (0, 1, 1, 1, 0, 0, 0) = A(β3 ), α13 = (0, 1, 0, 1, 1, 0, 0) = A(α13 ), α14 = (0, 0, 1, 1, 1, 0, 0) = A(β4 ), α15 = (0, 0, 0, 1, 1, 1, 0) = A(α15 ), α16 = (0, 0, 0, 0, 1, 1, 1) = A(α16 ), α17 = (0, 1, 1, 1, 1, 0, 0) = A(β5 ), α18 = (0, 1, 0, 1, 1, 1, 0) = A(α18 ), α19 = (0, 0, 1, 1, 1, 1, 0) = A(β6 ), α20 = (0, 0, 0, 1, 1, 1, 1) = A(α20 ), α21 = (0, 1, 1, 2, 1, 0, 0) = A(β7 ), α22 = (0, 1, 1, 1, 1, 1, 0) = A(β8 ), α23 = (0, 1, 0, 1, 1, 1, 1) = A(α23 ), α24 = (0, 0, 1, 1, 1, 1, 1) = A(β9 ),
α25 = (0, 1, 1, 2, 1, 1, 0) = A(β11), α26 = (0, 1, 1, 1, 1, 1, 1) = A(β12), α27 = (0, 1, 1, 2, 2, 1, 0) = A(β14), α28 = (0, 1, 1, 2, 1, 1, 1) = A(β15), α29 = (0, 1, 1, 2, 2, 1, 1) = A(β13), α30 = (0, 1, 1, 2, 2, 2, 1) = A(β10),
  β0 = (1, 0, 0, 0, 0, 0, 0),       γ0 = (1, 2, 3, 4, 3, 2, 1),
  β1 = (1, 0, 1, 0, 0, 0, 0),       γ1 = (1, 2, 2, 4, 3, 2, 1),
  β2 = (1, 0, 1, 1, 0, 0, 0),       γ2 = (1, 2, 2, 3, 3, 2, 1),
  β3 = (1, 1, 1, 1, 0, 0, 0),       γ3 = (1, 1, 2, 3, 3, 2, 1),
  β4 = (1, 0, 1, 1, 1, 0, 0),       γ4 = (1, 2, 2, 3, 2, 2, 1),
  β5 = (1, 1, 1, 1, 1, 0, 0),       γ5 = (1, 1, 2, 3, 2, 2, 1),
  β6 = (1, 0, 1, 1, 1, 1, 0),       γ6 = (1, 2, 2, 3, 2, 1, 1),
  β7 = (1, 1, 1, 2, 1, 0, 0),       γ7 = (1, 1, 2, 2, 2, 2, 1),
  β8 = (1, 1, 1, 1, 1, 1, 0),       γ8 = (1, 1, 2, 3, 2, 1, 1),
  β9 = (1, 0, 1, 1, 1, 1, 1),       γ9 = (1, 2, 2, 3, 2, 1, 0),
  β10 = (1, 1, 1, 2, 2, 2, 1),      γ10 = (1, 1, 2, 2, 1, 0, 0),
  β11 = (1, 1, 1, 2, 1, 1, 0),      γ11 = (1, 1, 2, 2, 2, 1, 1),
  β12 = (1, 1, 1, 1, 1, 1, 1),      γ12 = (1, 1, 2, 3, 2, 1, 0),
  β13 = (1, 1, 1, 2, 2, 1, 1),      γ13 = (1, 1, 2, 2, 1, 1, 0),
  β14 = (1, 1, 1, 2, 2, 1, 0),      γ14 = (1, 1, 2, 2, 1, 1, 1),
  β15 = (1, 1, 1, 2, 1, 1, 1),      γ15 = (1, 1, 2, 2, 2, 1, 0),

  ω = (2, 2, 3, 4, 3, 2, 1) = A(γ0).

Cubic form: I3 = −x1x7x10 + x1x11x13 − x1x14x15 + x2x5x10 − x2x8x13 + x2x12x14 − x3x4x10 + x3x6x13 − x3x9x14 + x4x8x15 − x4x11x12 − x5x6x15 + x5x9x11 + x6x7x12 − x7x8x9.

Cartan generators:

  Hβ0 = −y∂ + x0∂0,
  Hα1 = −x2∂2 + x3∂3 − x4∂4 + x5∂5 − x6∂6 + x8∂8 − x9∂9 + x12∂12,
  Hα2 = −3 − x0∂0 + x1∂1 − x7∂7 − x10∂10 − x11∂11 − x13∂13 − x14∂14 − x15∂15,
  Hα3 = −x1∂1 + x2∂2 − x5∂5 + x7∂7 − x8∂8 + x11∂11 − x12∂12 + x15∂15,
  Hα4 = −x2∂2 − x3∂3 + x4∂4 + x5∂5 − x11∂11 + x13∂13 + x14∂14 − x15∂15,
  Hα5 = −x4∂4 − x5∂5 + x6∂6 − x7∂7 + x8∂8 + x10∂10 + x11∂11 − x13∂13,
  Hα6 = −x6∂6 − x8∂8 + x9∂9 − x11∂11 + x12∂12 + x13∂13 − x14∂14 + x15∂15.

Simple roots:

  Eα1 = x2∂3 + x4∂5 + x6∂8 + x9∂12,
  Eα2 = −x0∂1 + i(x7x10 − x11x13 + x14x15)/y,
  Eα3 = x1∂2 + x5∂7 + x8∂11 + x12∂15,
  Eα4 = x2∂4 + x3∂5 + x11∂14 + x15∂13,
  Eα5 = x4∂6 + x5∂8 + x7∂11 + x13∂10,
  Eα6 = x6∂9 + x8∂12 + x11∂15 + x14∂13,
  E−α1 = −x3∂2 − x5∂4 − x8∂6 − x12∂9,
  E−α2 = x1∂0 − iy(∂7∂10 − ∂11∂13 + ∂14∂15),
  E−α3 = −x2∂1 − x7∂5 − x11∂8 − x15∂12,
  E−α4 = −x4∂2 − x5∂3 − x13∂15 − x14∂11,
  E−α5 = −x6∂4 − x8∂5 − x10∂13 − x11∂7,
  E−α6 = −x9∂6 − x12∂8 − x13∂14 − x15∂11.
A.5. E8.

Dynkin diagram:

              2 (α2)
              o
              |
  o --- o --- o --- o --- o --- o --- o
  1     3     4     5     6     7     8
  α1    α3    α4    α5    α6    α7    β0

Positive roots: α1 = (1, 0, 0, 0, 0, 0, 0, 0) = A(α1 ), α2 = (0, 1, 0, 0, 0, 0, 0, 0) = A(α2 ), α3 = (0, 0, 1, 0, 0, 0, 0, 0) = A(α3 ), α4 = (0, 0, 0, 1, 0, 0, 0, 0) = A(α4 ), α5 = (0, 0, 0, 0, 1, 0, 0, 0) = A(α5 ), α6 = (0, 0, 0, 0, 0, 1, 0, 0) = A(α6 ), α7 = (0, 0, 0, 0, 0, 0, 1, 0) = A(β1 ), α8 = (1, 0, 1, 0, 0, 0, 0, 0) = A(α8 ), α9 = (0, 1, 0, 1, 0, 0, 0, 0) = A(α9 ), α10 = (0, 0, 1, 1, 0, 0, 0, 0) = A(α10 ), α11 = (0, 0, 0, 1, 1, 0, 0, 0) = A(α11 ), α12 = (0, 0, 0, 0, 1, 1, 0, 0) = A(α12 ), α13 = (0, 0, 0, 0, 0, 1, 1, 0) = A(β2 ), α14 = (1, 0, 1, 1, 0, 0, 0, 0) = A(α14 ), α15 = (0, 1, 1, 1, 0, 0, 0, 0) = A(α15 ), α16 = (0, 1, 0, 1, 1, 0, 0, 0) = A(α16 ), α17 = (0, 0, 1, 1, 1, 0, 0, 0) = A(α17 ), α18 = (0, 0, 0, 1, 1, 1, 0, 0) = A(α18 ), α19 = (0, 0, 0, 0, 1, 1, 1, 0) = A(β3 ), α20 = (1, 1, 1, 1, 0, 0, 0, 0) = A(α20 ), α21 = (1, 0, 1, 1, 1, 0, 0, 0) = A(α21 ), α22 = (0, 1, 1, 1, 1, 0, 0, 0) = A(α22 ), α23 = (0, 1, 0, 1, 1, 1, 0, 0) = A(α23 ), α24 = (0, 0, 1, 1, 1, 1, 0, 0) = A(α24 ) α25 = (0, 0, 0, 1, 1, 1, 1, 0) = A(β4 ), α26 = (1, 1, 1, 1, 1, 0, 0, 0) = A(α26 ), α27 = (1, 0, 1, 1, 1, 1, 0, 0) = A(α27 ), α28 = (0, 1, 1, 2, 1, 0, 0, 0) = A(α28 ), α29 = (0, 1, 1, 1, 1, 1, 0, 0) = A(α29 ),
α30 = (0, 1, 0, 1, 1, 1, 1, 0) = A(β5), α31 = (0, 0, 1, 1, 1, 1, 1, 0) = A(β6), α32 = (1, 1, 1, 2, 1, 0, 0, 0) = A(α32), α33 = (1, 1, 1, 1, 1, 1, 0, 0) = A(α33), α34 = (1, 0, 1, 1, 1, 1, 1, 0) = A(β7), α35 = (0, 1, 1, 2, 1, 1, 0, 0) = A(α35), α36 = (0, 1, 1, 1, 1, 1, 1, 0) = A(β8), α37 = (1, 1, 2, 2, 1, 0, 0, 0) = A(α37), α38 = (1, 1, 1, 2, 1, 1, 0, 0) = A(α38), α39 = (1, 1, 1, 1, 1, 1, 1, 0) = A(β9), α40 = (0, 1, 1, 2, 2, 1, 0, 0) = A(α40), α41 = (0, 1, 1, 2, 1, 1, 1, 0) = A(β10), α42 = (1, 1, 2, 2, 1, 1, 0, 0) = A(α42), α43 = (1, 1, 1, 2, 2, 1, 0, 0) = A(α43), α44 = (1, 1, 1, 2, 1, 1, 1, 0) = A(β11), α45 = (0, 1, 1, 2, 2, 1, 1, 0) = A(β12), α46 = (1, 1, 2, 2, 2, 1, 0, 0) = A(α46), α47 = (1, 1, 2, 2, 1, 1, 1, 0) = A(β13), α48 = (1, 1, 1, 2, 2, 1, 1, 0) = A(β14), α49 = (0, 1, 1, 2, 2, 2, 1, 0) = A(β15), α50 = (1, 1, 2, 3, 2, 1, 0, 0) = A(α50), α51 = (1, 1, 2, 2, 2, 1, 1, 0) = A(β16), α52 = (1, 1, 1, 2, 2, 2, 1, 0) = A(β17), α53 = (1, 2, 2, 3, 2, 1, 0, 0) = A(α53), α54 = (1, 1, 2, 3, 2, 1, 1, 0) = A(β19), α55 = (1, 1, 2, 2, 2, 2, 1, 0) = A(β20), α56 = (1, 2, 2, 3, 2, 1, 1, 0) = A(β22), α57 = (1, 1, 2, 3, 2, 2, 1, 0) = A(β23), α58 = (1, 2, 2, 3, 2, 2, 1, 0) = A(β25), α59 = (1, 1, 2, 3, 3, 2, 1, 0) = A(β26), α60 = (1, 2, 2, 3, 3, 2, 1, 0) = A(β27), α61 = (1, 2, 2, 4, 3, 2, 1, 0) = A(β24), α62 = (1, 2, 3, 4, 3, 2, 1, 0) = A(β21), α63 = (2, 2, 3, 4, 3, 2, 1, 0) = A(β18),
β0 = (0, 0, 0, 0, 0, 0, 0, 1), β1 = (0, 0, 0, 0, 0, 0, 1, 1), β2 = (0, 0, 0, 0, 0, 1, 1, 1), β3 = (0, 0, 0, 0, 1, 1, 1, 1), β4 = (0, 0, 0, 1, 1, 1, 1, 1), β5 = (0, 1, 0, 1, 1, 1, 1, 1), β6 = (0, 0, 1, 1, 1, 1, 1, 1), β7 = (1, 0, 1, 1, 1, 1, 1, 1), β8 = (0, 1, 1, 1, 1, 1, 1, 1), β9 = (1, 1, 1, 1, 1, 1, 1, 1), β10 = (0, 1, 1, 2, 1, 1, 1, 1), β11 = (1, 1, 1, 2, 1, 1, 1, 1), β12 = (0, 1, 1, 2, 2, 1, 1, 1), β13 = (1, 1, 2, 2, 1, 1, 1, 1), β14 = (1, 1, 1, 2, 2, 1, 1, 1),
γ0 = (2, 3, 4, 6, 5, 4, 3, 1), γ1 = (2, 3, 4, 6, 5, 4, 2, 1), γ2 = (2, 3, 4, 6, 5, 3, 2, 1), γ3 = (2, 3, 4, 6, 4, 3, 2, 1), γ4 = (2, 3, 4, 5, 4, 3, 2, 1), γ5 = (2, 2, 4, 5, 4, 3, 2, 1), γ6 = (2, 3, 3, 5, 4, 3, 2, 1), γ7 = (1, 3, 3, 5, 4, 3, 2, 1), γ8 = (2, 2, 3, 5, 4, 3, 2, 1), γ9 = (1, 2, 3, 5, 4, 3, 2, 1), γ10 = (2, 2, 3, 4, 4, 3, 2, 1), γ11 = (1, 2, 3, 4, 4, 3, 2, 1), γ12 = (2, 2, 3, 4, 3, 3, 2, 1), γ13 = (1, 2, 2, 4, 4, 3, 2, 1), γ14 = (1, 2, 3, 4, 3, 3, 2, 1),
β15 = (0, 1, 1, 2, 2, 2, 1, 1), β16 = (1, 1, 2, 2, 2, 1, 1, 1), β17 = (1, 1, 1, 2, 2, 2, 1, 1), β18 = (2, 2, 3, 4, 3, 2, 1, 1), β19 = (1, 1, 2, 3, 2, 1, 1, 1), β20 = (1, 1, 2, 2, 2, 2, 1, 1), β21 = (1, 2, 3, 4, 3, 2, 1, 1), β22 = (1, 2, 2, 3, 2, 1, 1, 1), β23 = (1, 1, 2, 3, 2, 2, 1, 1), β24 = (1, 2, 2, 4, 3, 2, 1, 1), β25 = (1, 2, 2, 3, 2, 2, 1, 1), β26 = (1, 1, 2, 3, 3, 2, 1, 1), β27 = (1, 2, 2, 3, 3, 2, 1, 1),
γ15 = (2, 2, 3, 4, 3, 2, 2, 1), γ16 = (1, 2, 2, 4, 3, 3, 2, 1), γ17 = (1, 2, 3, 4, 3, 2, 2, 1), γ18 = (0, 1, 1, 2, 2, 2, 2, 1), γ19 = (1, 2, 2, 3, 3, 3, 2, 1), γ20 = (1, 2, 2, 4, 3, 2, 2, 1), γ21 = (1, 1, 1, 2, 2, 2, 2, 1), γ22 = (1, 1, 2, 3, 3, 3, 2, 1), γ23 = (1, 2, 2, 3, 3, 2, 2, 1), γ24 = (1, 1, 2, 2, 2, 2, 2, 1), γ25 = (1, 1, 2, 3, 3, 2, 2, 1), γ26 = (1, 2, 2, 3, 2, 2, 2, 1), γ27 = (1, 1, 2, 3, 2, 2, 2, 1),
ω = (2, 3, 4, 6, 5, 4, 3, 2) = A(γ0 ). Cubic form: I3 = x1 x15 x18 + x1 x17 x21 + x1 x20 x24 − x1 x23 x27 + x1 x25 x26 + x2 x12 x18 + x2 x14 x21 + x2 x16 x24 − x2 x19 x27 + x2 x22 x26 + x3 x10 x18 + x3 x11 x21 + x3 x13 x24 − x3 x19 x25 + x3 x22 x23 + x4 x8 x18 + x4 x9 x21 + x4 x13 x27 − x4 x16 x25 + x4 x20 x22 − x5 x6 x18 − x5 x7 x21 + x5 x13 x26 − x5 x16 x23 + x5 x19 x20 + x6 x9 x24 − x6 x11 x27 + x6 x14 x25 − x6 x17 x22 − x7 x8 x24 + x7 x10 x27 − x7 x12 x25 + x7 x15 x22 − x8 x11 x26 + x8 x14 x23 − x8 x17 x19 + x9 x10 x26 − x9 x12 x23 + x9 x15 x19 − x10 x14 x20 + x10 x16 x17 + x11 x12 x20 − x11 x15 x16 − x12 x13 x17 . Cartan generators: Hβ0 = − y∂ + x0 ∂0 , Hα1 = −x6 ∂6 + x7 ∂7 − x8 ∂8 + x9 ∂9 − x10 ∂10 + x11 ∂11 − x12 ∂12 + x14 ∂14 − x15 ∂15 + x17 ∂17 + x18 ∂18 − x21 ∂21 , Hα2 = − x4 ∂4 + x5 ∂5 − x6 ∂6 − x7 ∂7 + x8 ∂8 + x9 ∂9 − x19 ∂19 + x22 ∂22 − x23 ∂23 + x25 ∂25 − x26 ∂26 + x27 ∂27 , Hα3 = − x4 ∂4 − x5 ∂5 + x6 ∂6 + x8 ∂8 − x11 ∂11 + x13 ∂13 − x14 ∂14 + x16 ∂16 − x17 ∂17 + x20 ∂20 + x21 ∂21 − x24 ∂24 , Hα4 = − x3 ∂3 + x4 ∂4 − x8 ∂8 − x9 ∂9 + x10 ∂10 + x11 ∂11 − x16 ∂16 + x19 ∂19 − x20 ∂20 + x23 ∂23 + x24 ∂24 − x27 ∂27 , Hα5 = − x2 ∂2 + x3 ∂3 − x10 ∂10 − x11 ∂11 + x12 ∂12 − x13 ∂13 + x14 ∂14 + x16 ∂16 − x23 ∂23 − x25 ∂25 + x26 ∂26 + x27 ∂27 , Hα6 = − x1 ∂1 + x2 ∂2 − x12 ∂12 − x14 ∂14 + x15 ∂15 − x16 ∂16 + x17 ∂17 − x19 ∂19 + x20 ∂20 − x22 ∂22 + x23 ∂23 + x25 ∂25 , Hα7 = − 5 − x0 ∂0 + x1 ∂1 − x15 ∂15 − x17 ∂17 − x18 ∂18 − x20 ∂20 − x21 ∂21 − x23 ∂23 − x24 ∂24 − x25 ∂25 − x26 ∂26 − x27 ∂27 .
Simple roots:

  Eα1 = −x6∂7 − x8∂9 − x10∂11 − x12∂14 − x15∂17 + x21∂18,
  Eα2 = −x4∂5 − x6∂8 − x7∂9 + x19∂22 + x23∂25 + x26∂27,
  Eα3 = −x4∂6 − x5∂8 − x11∂13 − x14∂16 − x17∂20 + x24∂21,
  Eα4 = −x3∂4 + x8∂10 + x9∂11 + x16∂19 + x20∂23 + x27∂24,
  Eα5 = −x2∂3 + x10∂12 + x11∂14 + x13∂16 + x23∂26 + x25∂27,
  Eα6 = −x1∂2 + x12∂15 + x14∂17 + x16∂20 + x19∂23 + x22∂25,
  Eα7 = −x0∂1 + i(−x15x18 − x17x21 − x20x24 + x23x27 − x25x26)/y,
  E−α1 = x7∂6 + x9∂8 + x11∂10 + x14∂12 + x17∂15 − x18∂21,
  E−α2 = x5∂4 + x8∂6 + x9∂7 − x22∂19 − x25∂23 − x27∂26,
  E−α3 = x6∂4 + x8∂5 + x13∂11 + x16∂14 + x20∂17 − x21∂24,
  E−α4 = x4∂3 − x10∂8 − x11∂9 − x19∂16 − x23∂20 − x24∂27,
  E−α5 = x3∂2 − x12∂10 − x14∂11 − x16∂13 − x26∂23 − x27∂25,
  E−α6 = x2∂1 − x15∂12 − x17∂14 − x20∂16 − x23∂19 − x25∂22,
  E−α7 = x1∂0 + iy(∂15∂18 + ∂17∂21 + ∂20∂24 − ∂23∂27 + ∂25∂26).

References 1. Pioline, B., Nicolai, H., Plefka, J., and Waldron, A.: R 4 couplings, the fundamental membrane and exceptional theta correspondences. JHEP 0103, 036 (2001), hep-th/0102123 2. Green, M.B., and Gutperle, M.: Effects of D instantons. Nucl. Phys. B498, 195–227 (1997), hepth/9701093 3. Kiritsis, E., and Pioline, B.: On R 4 threshold corrections in IIB string theory and (p, q) string instantons. Nucl. Phys. B508, 509–534 (1997), hep-th/9707018; Pioline, B., and Kiritsis, E.: U-duality and D-brane combinatorics. Phys. Lett. B418, 61 (1998), hep-th/9710078 4. Pioline, B.:A note on non-perturbative R 4 couplings. Phys. Lett. B 431, 73 (1998) hep-th/9804023; Green, M.B., and Sethi, S.: Supersymmetry constraints on type IIB supergravity: Phys. Rev. D 59, 046006 (1999) hep-th/9808061 5. Obers, N.A., and Pioline, B.: Eisenstein series and string thresholds. Commun. Math. Phys. 209, 275 (2000), hep-th/9903113; Obers, N.A., and Pioline, B.: Eisenstein series in string theory. Class. Quant. Grav. 17, 1215 (2000), hep-th/9910115 6. Obers, N.A., and Pioline, B.: U-duality and M-theory. Phys. Rept. 318, 113 (1999), hep-th/9809039; Obers, N.A., and Pioline, B.: U-duality and M-theory, an algebraic approach. hep-th/9812139 7. Green, M.B., Gutperle, M., and Vanhove, P.: One loop in eleven dimensions. Phys. Lett. B 409, 177 (1997) hep-th/9706175; Green, M.B., Kwon, H.h., and Vanhove, P.: Phys. Rev. D 61, 104010 (2000) hep-th/9910055; Chalmers, G.: M-theory and automorphic scattering. hep-th/0104132 8. Sugino, F., and Vanhove, P.: U-duality from matrix membrane partition function. hep-th/0107145 9. Joseph,A.: Minimal realizations and spectrum generating algebras. Commun. Math. Phys. 36, 325 (1974); The minimal orbit in a simple Lie algebra and its associated maximal ideal. Ann. Scient. Ecole Normale Sup. 4ème série 9, 1–30 (1976) 10. Brylinski, R., and Kostant, B.: Minimal representations of E6 , E7 , and E8 and the generalized Capelli identity. Proc. Nat. Acad. Sci. U.S.A. 
91, 2469 (1994); Minimal representations, geometric quantization, and unitarity. Proc. Nat. Acad. Sci. U.S.A. 91, 6026 (1994); Lagrangian models of minimal representations of E6 , E7 and E8 . In: Functional Analysis on the Eve of the 21st century, Progress in Math. Basel–Boston: Birkhäuser, 1995 11. Kazhdan, D., and Savin, G.: The smallest representation of simply laced groups. Israel Math. Conf. Proceedings, Piatetski-Shapiro Festschrift 2, 209–223 (1990) 12. Howe, R.: θ -series and invariant theory. Proceedings of Symposia in Pure Mathematics XXXIII. Providence, RI: AMS, 1979 13. Kazhdan, D., and Polishchuk, A.: Spherical vector in the minimal representation of a simply-laced p-adic group. To appear
14. Lion, G., and Vergne, M.: The Weil representation, Maslov index and theta series. Progress in Mathematics 6, Basel–Boston: Birkhäuser, 1980; Mumford, D.: Tata lectures on Theta III. Progress in Mathematics, Basel–Boston: Birkhäuser, 1991 15. Terras, A.: Harmonic Analysis on Symmetric Spaces and Applications II. New-York: Springer-Verlag, 1985 16. Brekke, L., and Freund, P.G.O.: p-adic numbers in physics. Phys. Rep. 233, 1 (1993) 17. Gelfand, I., Graev, M., and Pjatetskii-Shapiro, I.: Representation theory and automorphic functions. London: Saunders, 1969 18. Gunaydin, M., Koepsell, K., and Nicolai, H.: Conformal and quasiconformal realizations of exceptional Lie groups. Commun. Math. Phys. 221, 57 (2001) hep-th/0008063 19. Kazhdan, D.: The minimal representation of D4 . In: Operator algebras, Unitary representations, enveloping algebras and invariant theories, A. Connes, M. Duflo, A. Joseph, R. Rentschler, eds., Progress in Mathematics 92, Boston: Birkhäuser, 1990, p. 125 20. Etingof, P., Kazhdan, D., and Polishchuk, A.: When is the Fourier transform of an elementary function elementary? math.AG/0003009 21. Kiritsis, E., Obers, N.A., and Pioline, B.: Heterotic/type II triality and instantons on K3. JHEP 0001, 029 (2000), hep-th/0001083 22. Nahm, W., and Wendland, K.: A hiker’s guide to K3: Aspects of N = (4, 4) superconformal field theory with central charge c = 6. Commun. Math. Phys. 216, 85 (2001) hep-th/9912067. 23. Gunaydin, M., Koepsell, K., and Nicolai, H.: The Minimal Unitary Representation of E8(8) . hepth/0109005 24. Vogan, D.: The unitary dual of G2 . Invent. Math. 116, 677–791 (1994) 25. Vogan, D.: Singular unitary representations. Lecture Notes in Math. 880, Berlin–Heidelberg–New York: Springer Verlag 26. de Alfaro, V., Fubini, S., and Furlan, G.: Conformal Invariance In Quantum Mechanics. Nuovo Cim. A 34, 569 (1976) 27. Vermaseren, J.A.M.: New Features of Form. math-ph/0010025 Communicated by H. Nicolai
Commun. Math. Phys. 226, 41 – 60 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Asymptotic Spectral Measures, Quantum Mechanics, and E-Theory Diane Martinez, Jody Trout Department of Mathematics, Dartmouth College, 6188 Bradley Hall, Hanover, NH 03755, USA. E-mail:
[email protected];
[email protected] Received: 27 December 2000 / Accepted: 8 October 2001
Abstract: We study the relationship between POV-measures in quantum theory and asymptotic morphisms in the operator algebra E-theory of Connes–Higson. This is done by introducing the theory of “asymptotic” PV-measures and their integral correspondence with positive asymptotic morphisms on locally compact spaces. Examples and applications involving various aspects of quantum physics, including quantum noise models, semiclassical limits, strong deformation quantizations, and pure half-spin particles, are also discussed. 1. Introduction In the Hilbert space formulation of quantum mechanics by von Neumann [VN], an observable is modeled as a self-adjoint operator on the Hilbert space of states of the quantum system. The Spectral Theorem relates this theoretical view of a quantum observable to the more operational one of a projection-valued measure (PVM or spectral measure) which determines the probability distribution of the experimentally measurable values of the observable. To solve foundational problems with the concept of measurement and to better analyze unsharp results in experiments, this view was generalized to include positive operator-valued measures (POVMs). Since the work of Jauch and Piron [JP], POV-measures have played an ever increasing role in both the foundations and operational aspects of quantum physics [BGL,S]. See the Appendix for a quick review of POVMs, their use in quantum mechanics, and relation to the Spectral Theorem. In this paper, we study the relationship between POVMs and asymptotic morphisms in the operator algebra E-theory of Connes and Higson [CH], which has already found many applications in mathematics [Bl, GHT, H,Tr], most notably to classification problems in operator K-theory, index theory, representation theory, geometry, and topology. 
(Research of the second author was supported by NSF grant DMS-0071120.)

The basic ingredients of E-theory are asymptotic morphisms, which are given by continuous families of functions $\{Q_\hbar\}_{\hbar>0} : A \to B$ from a $C^*$-algebra $A$ to a $C^*$-algebra $B$ that satisfy the axioms of a $*$-homomorphism in the limit as the parameter $\hbar$ tends to 0. Asymptotic multiplicativity is a modern version of the Bohr-von Neumann correspondence principle [L] from quantization theory: For all $f, g \in A$,
$$\|Q_\hbar(fg) - Q_\hbar(f)Q_\hbar(g)\| \to 0 \quad \text{as } \hbar \to 0.$$
It is then no surprise that quantization schemes may naturally define asymptotic morphisms, say, from the $C^*$-algebra $A$ of classical observables to the $C^*$-algebra $B$ of quantum observables. Hence, such quantizations can give cycles in the abelian group $E(A, B)$, which was defined by Connes and Higson as a certain matrix-stable homotopy group of asymptotic morphisms from $A$ to $B$. For example, Guentner [G1] showed that Wick quantization on the Fock space $\mathcal{F}$ of $\mathbb{C}$ defines a positive asymptotic morphism $\{Q^W_\hbar\} : C_0(\mathbb{C}) \to \mathcal{K}(\mathcal{F})$, whose E-theory class is equal to the class of the $\bar\partial$-operator, $[[\bar\partial]] = [[Q^W_\hbar]] \in E(C_0(\mathbb{C}), \mathbb{C})$. (We will discuss Guentner's work in our context in Example 5.5.) See the papers [N1, N2, Ro] and the books [C, GVF] for more on the connections between operator algebra K-theory, E-theory, and quantization. We show that there is a fundamental quantum-E-theory relationship by introducing the concept of an asymptotic spectral measure (ASM or asymptotic PVM) $\{A_\hbar\}_{\hbar>0} : \Sigma \to \mathcal{B}(\mathcal{H})$ associated to a measurable space $(X, \Sigma)$. (See Definition 3.1.) Roughly, this is a continuous family of POV-measures which are "asymptotically" projective (or quasiprojective) as $\hbar$ tends to 0:
$$\|A_\hbar(\Delta)^2 - A_\hbar(\Delta)\| \to 0 \quad \text{as } \hbar \to 0$$
for certain measurable sets $\Delta \in \Sigma$. Let $X$ be a locally compact space with Borel $\sigma$-algebra $\Sigma_X$ and let $C_0(X)$ denote the $C^*$-algebra of continuous functions vanishing at infinity on $X$.
One of our main results is an “asymptotic” Riesz representation theorem (Theorem 4.2) which gives a bijective correspondence between certain positive asymptotic morphisms $\{Q_\hbar\} : C_0(X) \to B$ and Borel asymptotic spectral measures $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$, where $\mathcal{C}_X$ denotes the open subsets of $X$ with compact closure and $B$ is a hereditary $C^*$-subalgebra of $B(H)$. This correspondence is given by operator integration
$$Q_\hbar(f) = \int_X f(x)\, dA_\hbar(x).$$
The associated asymptotic morphism $\{Q_\hbar\} : C_0(X) \to B$ then allows one to define an E-theory invariant for the asymptotic spectral measure $\{A_\hbar\}$,
$$[[A_\hbar]] =_{\mathrm{def}} [[Q_\hbar]] \in E(C_0(X), B) \cong E_0(X; B),$$
in the E-homology group of $X$ with coefficients in $B$. It has been well-established that operator K-theory and the dual K-homology groups provide suitable receptacles for invariants of quantum systems, such as chiral anomalies in quantum field theory [N] and, more recently, as D-brane charges in string theory and
Asymptotic Spectral Measures, Quantization, and E-Theory
M-theory [P, W]. Since E-theory subsumes both K-theory and K-homology [Bl], it is reasonable to assume that E-theory elements of quantizations and asymptotic spectral measures may provide interesting topological invariants of the associated quantum systems. Although in this paper we will be more concerned with asymptotic morphisms and their relation to POV-measures than with computing E-theory elements (but see Example 5.5), a long-range goal of this research project is to develop an E-theoretic calculus for computing these invariants directly from the asymptotic measure-theoretic data, e.g., by developing the appropriate notions of homotopy and suspension for ASMs, thus bypassing the technical functional-analytic aspects of asymptotic morphisms.
Another benefit of using this asymptotic measure-theoretic approach is operational in nature. Experimental data from position and momentum measurements on an elementary quantum system (via visibility data from interference experiments) are collected and then used to construct the associated POVM. This method [S] is based on using frame manuals for the instrument state space and Sakai operators associated to localization operators on rectangles in the classical phase space $X$. The POVM $\{A_\hbar\}$ depends on Planck’s constant, of course, and generally satisfies the (unsharp) separation property
$$A_\hbar(\Delta_1 \cap \Delta_2) \ne A_\hbar(\Delta_1)A_\hbar(\Delta_2).$$
However, if on letting $\hbar \to 0$ one obtains an ASM, which is equivalent to
$$\lim_{\hbar \to 0} \|A_\hbar(\Delta_1 \cap \Delta_2) - A_\hbar(\Delta_1)A_\hbar(\Delta_2)\| = 0,$$
then one can directly associate an E-homological invariant $[[A_\hbar]] \in E_0(X; B)$ to the quantum system under experimental study using our theory.
The outline of this paper is as follows. In Sect. 2 we discuss positive asymptotic morphisms associated to hereditary and nuclear $C^*$-algebras. The basic definitions and properties of asymptotic spectral measures are developed in Sect. 3. Asymptotic Riesz representation theorems and some of their consequences are proven in Sect. 4. Examples and applications of ASMs associated to various aspects of quantum physics are discussed in Sect. 5, e.g., constructing ASMs from PVMs by quantum noise models, quasiprojectors and semiclassical limits, unsharp spin measurements of spin-$\frac12$ particles (including an example from quantum cryptography), strong deformation quantizations, and Wick quantization on bosonic Fock space.
The authors would like to thank Navin Khaneja, Iain Raeburn, and Dana Williams for helpful conversations. Also, we would like to thank the referee for helpful comments. See Beggs [B] for a related method of obtaining asymptotic morphisms by an integration technique involving spectral measures.

2. Positive Asymptotic Morphisms and Hereditary $C^*$-Subalgebras

Let $A$ and $B$ be $C^*$-algebras. Recall that a linear map $Q : A \to B$ is called positive [M] if $Q(f) \ge 0$ for all $f \ge 0$ in $A$. It is called completely positive if every inflation to $n \times n$ matrices $M_n(Q) : M_n(A) \to M_n(B)$ is also positive. Every $*$-homomorphism from $A$ to $B$ is clearly completely positive. The following definition interpolates between (completely) positive linear maps and $*$-homomorphisms.
Definition 2.1. A (completely) positive asymptotic morphism from $A$ to $B$ is a family of maps $\{Q_\hbar\}_{\hbar \in (0,1]} : A \to B$ parameterized by $\hbar \in (0, 1]$ such that the following conditions hold:
(a) Each $Q_\hbar$ is a (completely) positive linear map;
(b) The map $(0, 1] \to B : \hbar \mapsto Q_\hbar(f)$ is continuous for each $f \in A$;
(c) For all $f, g \in A$ we have $\lim_{\hbar \to 0} \|Q_\hbar(fg) - Q_\hbar(f)Q_\hbar(g)\| = 0$.
For the basic theory of asymptotic morphisms see the books [GHT, C, Bl] and papers [CH, G2]. For the importance of positive asymptotic morphisms to $C^*$-algebra K-theory see [HLT]. Note that any $*$-homomorphism $Q : A \to B$ determines the constant completely positive asymptotic morphism $\{Q_\hbar\} : A \to B$ defined by $Q_\hbar = Q$ for all $\hbar > 0$. Also, it follows that for any $f \in A$, a mild boundedness condition [Bl] always holds,
$$\limsup_{\hbar \to 0} \|Q_\hbar(f)\| \le \|f\|.$$
Remark. In the E-theory literature, asymptotic morphisms are usually parameterized by $t \in [1, \infty)$. We chose to use the equivalent parameterization $\hbar = 1/t \in (0, 1]$ to make the connections to quantum physics more transparent. Note that other authors have used different parameter spaces, including discrete ones [L2, Th]. The results in this paper translate verbatim to these parameter spaces, and Condition (b) is obviously irrelevant in the discrete case.
Definition 2.2. Two asymptotic morphisms $\{Q_\hbar\}, \{Q'_\hbar\} : A \to B$ are called equivalent if for all $f \in A$ we have that $\lim_{\hbar \to 0} \|Q_\hbar(f) - Q'_\hbar(f)\| = 0$. We will let $[[A, B]]_{a(cp)}$ denote the collection of all asymptotic equivalence classes of (completely positive) asymptotic morphisms from $A$ to $B$.
A $C^*$-algebra $A$ is called nuclear [M] if the identity map $\mathrm{id} : A \to A$ can be approximated pointwise in norm by completely positive finite rank contractions. This is equivalent to the condition that there is a unique $C^*$-tensor product $A \otimes B$ for any $C^*$-algebra $B$. If $H$ is a separable Hilbert space, the $C^*$-algebra $K(H)$ of compact operators on $H$ is nuclear. If $X$ is a locally compact space, then the $C^*$-algebra $C_0(X)$ of continuous complex-valued functions on $X$ vanishing at infinity is also nuclear. If $A \cong C(X)$ is unital and commutative, then every positive linear map $Q : A \to B$ is completely positive by Stinespring’s Theorem. The following result is a consequence of the completely positive lifting theorem of Choi and Effros [CE] for nuclear $C^*$-algebras. (See also 25.1.5 of Blackadar [Bl] for a discussion.)
Lemma 2.3. Let $A$ be a nuclear $C^*$-algebra. Every asymptotic morphism from $A$ to any $C^*$-algebra $B$ is equivalent to a completely positive asymptotic morphism. That is, there is a bijection of sets $[[A, B]]_a \cong [[A, B]]_{acp}$.
Definition 2.4. Let $A_1 \subset A$ and $B_1 \subset B$ be subalgebras of the $C^*$-algebras $A$ and $B$. If $Q : A \to B$ is a linear map such that $Q(A_1) \subset B_1$, we will denote this by $Q : (A, A_1) \to (B, B_1)$. The notation $\{Q_\hbar\} : (A, A_1) \to (B, B_1)$ then has the obvious meaning.
Lemma 2.5. Let $A_1 \subset A$ and $B_1 \subset B$ be non-closed $*$-subalgebras. Every positive linear map $Q : (A, A_1) \to (B, B_1)$ also satisfies $Q : (A, \overline{A_1}) \to (B, \overline{B_1})$, where $\overline{A_1}$ denotes the closure of $A_1 \subset A$ (similarly for $B_1$).
Proof. Follows from the fact that a positive linear map is automatically norm bounded.
Let $A$ be a $*$-subalgebra of a $C^*$-algebra $B$. Recall that $A$ is said to be hereditary [M] if $0 \le b \le a$ and $a \in A$ implies that $b \in A$. Every (closed two-sided $*$-invariant) ideal in a $C^*$-algebra is a hereditary $*$-subalgebra. In particular, if $H$ is a Hilbert space, the ideal of compact operators $K(H)$ is a hereditary $C^*$-subalgebra of the $C^*$-algebra of bounded operators $B(H)$. An important (non-closed) hereditary $*$-subalgebra for quantum theory is the (non-closed) ideal $B_1(H) \subset K(H)$ of trace-class operators:
$$B_1(H) = \{\rho \in K(H) : \mathrm{trace}\,|\rho| < \infty\}.$$
We then have that $K(H)^* = B_1(H)$ by the dual pairing $\rho(T) = \mathrm{trace}(\rho T)$, where $\rho \in B_1(H)$ and $T \in K(H)$. If $X$ is a locally compact space, then the ideal $C_0(X)$ is a hereditary $C^*$-subalgebra of the $C^*$-algebra $C_b(X)$ of continuous bounded complex-valued functions on $X$. Also, the (non-closed) ideal $C_c(X)$ of compactly supported functions is a (non-closed) hereditary $*$-subalgebra of $C_b(X)$. However, in general, $C_\delta(X)$, for $\delta = c, 0, b$, is not a hereditary subalgebra of the $C^*$-algebra $B_b(X)$ of bounded Borel functions on $X$.

3. Asymptotic Spectral Measures

In this section we give the basic definitions and properties of asymptotic spectral measures. See the Appendix for a review of POV and spectral measures. Let $(X, \Sigma)$ be a measurable space and $H$ a separable Hilbert space. Let $\mathcal{E} \subset \Sigma$ denote a fixed collection of measurable subsets.
Definition 3.1.
An asymptotic spectral measure (ASM) on $(X, \Sigma, \mathcal{E})$ is a family of maps $\{A_\hbar\}_{\hbar \in (0,1]} : \Sigma \to B(H)$ parameterized by $\hbar \in (0, 1]$ such that the following conditions hold:
(a) Each $A_\hbar$ is a POVM on $(X, \Sigma)$ with $\limsup_{\hbar \to 0} \|A_\hbar(X)\| \le 1$;
(b) The map $(0, 1] \to B(H) : \hbar \mapsto A_\hbar(\Delta)$ is continuous for each $\Delta \in \mathcal{E}$;
(c) For each $\Delta_1, \Delta_2 \in \mathcal{E}$ we have that $\lim_{\hbar \to 0} \|A_\hbar(\Delta_1 \cap \Delta_2) - A_\hbar(\Delta_1)A_\hbar(\Delta_2)\| = 0$.
The triple $(X, \Sigma, \mathcal{E})$ will be called an asymptotic measure space. The family $\mathcal{E}$ will be called the asymptotic carrier of $\{A_\hbar\}$. Condition (c) will be called asymptotic projectivity (or quasiprojectivity) and generalizes the projectivity condition (A.1) of a spectral measure. It is motivated by the quantum theory notion of quasiprojectors, as discussed in Example 5.2. If $\mathcal{E} = \Sigma$ then we will call $\{A_\hbar\}$ a full ASM on $(X, \Sigma)$. If each $A_\hbar$ is normalized, i.e., $A_\hbar(X) = I_H$, then we will say that $\{A_\hbar\}$ is normalized. The mild boundedness condition in (a) is then redundant. (Also see the remark after Definition 2.1.)
A spectral (PV) measure $E : \Sigma \to B(H)$ determines a “constant” full asymptotic spectral measure $\{A_\hbar\}$ by the assignment $A_\hbar = E$ for all $\hbar$. Also, any continuous family $\{E_\hbar\}$ of spectral measures (in the sense of (b)) determines an ASM on $(X, \Sigma, \mathcal{E})$. See [CHM] for an application of smooth families of spectral measures to the Quantum Hall Effect.
Definition 3.2. Two asymptotic spectral measures $\{A_\hbar\}, \{B_\hbar\} : \Sigma \to B(H)$ on $(X, \Sigma, \mathcal{E})$ are said to be (asymptotically) equivalent if for each measurable set $\Delta \in \mathcal{E}$,
$$\lim_{\hbar \to 0} \|A_\hbar(\Delta) - B_\hbar(\Delta)\| = 0.$$
This will be denoted $\{A_\hbar\} \sim_{\mathcal{E}} \{B_\hbar\}$. If this holds for $\mathcal{E} = \Sigma$ we will call them fully equivalent. From now on, we let $X$ denote a locally compact Hausdorff topological space with Borel σ-algebra $\Sigma_X$. We will assume that $\mathcal{E} = \mathcal{C}_X$ denotes the collection of all open subsets of $X$ with compact closure, i.e., the pre-compact open subsets.
Definition 3.3. Let $B \subset B(H)$ be a hereditary $*$-subalgebra. A Borel POV-measure $A : \Sigma_X \to B(H)$ will be called locally $B$-valued if $A(U) \in B$ for all pre-compact open subsets $U \in \mathcal{C}_X$ and this will be denoted by $A : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$. A family of Borel POV-measures $\{A_\hbar\}$ on $X$ will be called locally $B$-valued if each POVM $A_\hbar$ is locally $B$-valued and will be denoted $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$. We will use the term locally compact-valued for locally $K(H)$-valued. If $B = B_1(H) \subset K(H)$ is the trace-class operators, then we will say that $\{A_\hbar\}$ has locally compact trace.
We will let $((X, B))$ denote the set of all equivalence classes of locally $B$-valued Borel asymptotic spectral measures on $(X, \Sigma_X, \mathcal{C}_X)$. The equivalence class of $\{A_\hbar\}$ will be denoted $((A_\hbar)) \in ((X, B))$. Given a Borel POV-measure $A$ on $X$, the cospectrum of $A$ is defined as the set
$$\mathrm{cospec}(A) = \bigcup \{U \subset X : U \text{ is open and } A(U) = 0\}.$$
The spectrum of $A$ is the complement $\mathrm{spec}(A) = X \setminus \mathrm{cospec}(A)$. The following definition is adapted from Berberian [Be].
Definition 3.4. A POVM $A$ on $X$ will be said to have compact support if the spectrum of $A$ is a compact subset of $X$. An ASM $\{A_\hbar\}$ on $X$ will be said to have compact support if there is a compact subset $K$ of $X$ such that $\mathrm{spec}(A_\hbar) \subset K$ for all $\hbar > 0$.
The relationship among these compactness notions is contained in the following.
Proposition 3.5. Let $X$ be second countable. Let $A$ be a Borel POVM on $X$ with compact support. Let $B$ be the hereditary subalgebra of $B(H)$ generated by $A(\mathrm{spec}(A))$. Then $A$ is a locally $B$-valued POVM, i.e., $A : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$.
Proof.
Since $X$ is second countable, the σ-algebra $B_X$ of Baire subsets equals the Borel σ-algebra, $B_X = \Sigma_X$. Thus, by Theorem 23 [Be], $A(\mathrm{cospec}(A)) = 0$. Let $U \in \mathcal{C}_X$ be a pre-compact open subset of $X$. We then have that
$$0 \le A(U \cap \mathrm{cospec}(A)) \le A(\mathrm{cospec}(A)) = 0,$$
and since $X$ is the disjoint union $X = \mathrm{spec}(A) \sqcup \mathrm{cospec}(A)$,
$$0 \le A(U) = A(U \cap \mathrm{spec}(A)) \le A(\mathrm{spec}(A)).$$
Since $B$ is hereditary, $A(U) \in B$ for all $U \in \mathcal{C}_X$ and so $A$ is locally $B$-valued.
4. Asymptotic Riesz Representation Theorems

Throughout this section, we let $X$ denote a locally compact Hausdorff space with Borel σ-algebra $\Sigma_X$. Let $\mathcal{C}_X \subset \Sigma_X$ denote the collection of all pre-compact open subsets of $X$. And we let $B \subset B(H)$ denote a hereditary $*$-subalgebra of the bounded operators on a fixed Hilbert space $H$.
Lemma 4.1. There is a bijective correspondence between locally $B$-valued Borel POVMs $A : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$ and positive linear maps $Q : C_0(X) \to B$. This correspondence is given by
$$Q(f) = \int_X f(x)\, dA(x). \qquad (4.1.1)$$
Proof. In view of Theorem A.3 we only need to check that the locally $B$-valued condition corresponds to $Q(C_0(X)) \subset B$. Suppose $A(\mathcal{C}_X) \subset B$. Let $f \in C_c(X)$ be compactly supported. Since $Q$ is positive linear, it suffices to assume $f \ge 0$. Let $K = \mathrm{supp}(f)$, which is a compact subset of $X$. By local compactness, there is an open subset $U \in \mathcal{C}_X$ such that $K \subset U$. By the Extreme Value Theorem there is a $C > 0$ such that $0 \le f \le C\chi_U$. Since $Q$ is positive,
$$0 \le Q(f) = \int_X f\, dA \le C \int_X \chi_U\, dA = C A(U) \in B$$
by hypothesis. Since $B$ is hereditary, $Q(f) \in B$. Conversely, suppose $Q : C_0(X) \to B$ is positive linear and given by formula (4.1.1). Then $Q$ defines a positive map $Q : (B_b(X), C_0(X)) \to (B(H), B)$. Let $U \in \mathcal{C}_X$ be a pre-compact open subset. Since $X$ is completely regular, we have by Urysohn’s Lemma a continuous function $f \in C_c(X)$ with $0 \le f \le 1$ such that $\overline{U} \subseteq f^{-1}(1)$. Thus, $0 \le \chi_U \le f$ and so
$$0 \le A(U) = Q(\chi_U) \le Q(f) \in B.$$
Thus, $A(U) \in B$ and so $A$ is locally $B$-valued as desired.
Define $B_0(X)$ to be the $C^*$-subalgebra of $B_b(X)$ generated by $\{\chi_U : U \in \mathcal{C}_X\}$. If $X$ is also σ-compact, a paracompactness argument then shows that $C_0(X) \subset B_0(X)$ as a closed (but not necessarily hereditary) $*$-subalgebra. (Recall that if $f \in C_c(X)$ is compactly supported, then $\mathrm{Interior}(\mathrm{supp}(f)) \in \mathcal{C}_X$.) The following is our main result.
Theorem 4.2. If $X$ is σ-compact, there is a bijective correspondence between positive asymptotic morphisms $\{Q_\hbar\} : (B_0(X), C_0(X)) \to (B(H), B)$ and locally $B$-valued Borel asymptotic spectral measures $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$. This correspondence is given by
$$Q_\hbar(f) = \int_X f(x)\, dA_\hbar(x). \qquad (4.1)$$
Proof. Let $\{Q_\hbar\} : (B_0(X), C_0(X)) \to (B(H), B)$ be a positive asymptotic morphism. By the lemma, there is a locally $B$-valued family of POVMs $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$ such that (4.1) holds for all $\hbar > 0$ and $f \in B_0(X)$. For each $U \in \mathcal{C}_X$ we have that $\hbar \mapsto A_\hbar(U) = Q_\hbar(\chi_U)$ is continuous by Condition (2.1.b). Also, we have that for any $U \ne \emptyset$ in $\mathcal{C}_X$,
$$\limsup_{\hbar \to 0} \|A_\hbar(U)\| = \limsup_{\hbar \to 0} \|Q_\hbar(\chi_U)\| \le \|\chi_U\| = 1.$$
Since $X$ is σ-compact, there is an increasing sequence $\{U_n\} \subset \mathcal{C}_X$ of pre-compact open subsets such that $X = \bigcup_{n=1}^{\infty} U_n$. By Theorem 18 [Be], for all $\hbar > 0$, $A_\hbar$ is a “regular” Borel POVM, so $A_\hbar(X) = \mathrm{LUB}\{A_\hbar(U_n) : n \in \mathbb{N}\}$ (in the sense of positive operators). It follows that $\limsup_{\hbar \to 0} \|A_\hbar(X)\| \le 1$ as desired. Now let $U_1, U_2 \in \mathcal{C}_X$. Since characteristic functions satisfy $\chi_{U_1 \cap U_2} = \chi_{U_1}\chi_{U_2}$ we then have by asymptotic multiplicativity (2.1.c) that
$$\lim_{\hbar \to 0} \|A_\hbar(U_1 \cap U_2) - A_\hbar(U_1)A_\hbar(U_2)\| = \lim_{\hbar \to 0} \|Q_\hbar(\chi_{U_1}\chi_{U_2}) - Q_\hbar(\chi_{U_1})Q_\hbar(\chi_{U_2})\| = 0.$$
Thus, the family $\{A_\hbar\}$ is a locally $B$-valued ASM on $X$. Conversely, let $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$ be a locally $B$-valued ASM on $X$. Define the family of maps $\{Q_\hbar\} : B_0(X) \to B(H)$ by Eq. (4.1). Hence, each $Q_\hbar$ is positive linear and $Q_\hbar : (B_0(X), C_0(X)) \to (B(H), B)$ by Lemma 4.1. Let $S_0(X)$ denote the dense $*$-subalgebra of $B_0(X)$ consisting of simple functions $f = \sum_{i=1}^{n} a_i \chi_{U_i}$, where $U_i \in \mathcal{C}_X$. Asymptotic projectivity (3.1.c) and the calculation above then show that for any simple functions $f, g \in S_0(X)$ we have
$$\lim_{\hbar \to 0} \|Q_\hbar(fg) - Q_\hbar(f)Q_\hbar(g)\| = 0.$$
Also, for any such simple function $f \in S_0(X)$,
$$\hbar \mapsto Q_\hbar(f) = \sum_{i=1}^{n} a_i A_\hbar(U_i)$$
is continuous from $(0, 1] \to B$ by (3.1.b). To conclude that $\{Q_\hbar\}$ is asymptotically multiplicative on the closure $B_0(X)$ we need to show that it is bounded. By (A.3.1) we have that for any $f \in B_0(X)$,
$$\|Q_\hbar(f)\| = \Big\| \int_X f(x)\, dA_\hbar(x) \Big\| \le 2\,\|f\|\,\|A_\hbar(X)\|.$$
By Condition (3.1.a) we then have
$$\limsup_{\hbar \to 0} \|Q_\hbar(f)\| \le 2\,\|f\| \limsup_{\hbar \to 0} \|A_\hbar(X)\| \le 2\,\|f\|.$$
The result now follows since every bounded asymptotic morphism on a dense $*$-subalgebra extends to the closure.
Corollary 4.3. Under the above hypotheses, equivalent Borel asymptotic spectral measures correspond to equivalent positive asymptotic morphisms. Thus, there is a well-defined map $((X, B)) \to [[C_0(X), B]]_{acp}$ which maps $((A_\hbar)) \mapsto [[Q_\hbar]]_{acp}$.
Proof. Follows from the fact that $A_\hbar(U) = Q_\hbar(\chi_U)$ and that any two asymptotic morphisms equivalent on a dense subalgebra are equivalent. Also, since $C_0(X)$ is nuclear, the second statement follows from Lemmas 2.3 and 2.5.
Let $C_\delta(X)$ denote a unital $C^*$-subalgebra of $C_b(X)$ such that $C_0(X) \lhd C_\delta(X)$. By the Gelfand-Naimark Theorem [GN], $C_\delta(X) \cong C(\delta X)$ for some “continuous” compactification $\delta X \supseteq X$.
Corollary 4.4. Let $I \lhd B(H)$ be an ideal. Every locally $I$-valued full Borel asymptotic spectral measure $\{A_\hbar\}$ on $X$ determines a canonical relative asymptotic morphism (in the sense of Guentner [G2]) $\{Q_\hbar\} : (C_\delta(X), C_0(X)) \to (B(H), I)$ for any continuous compactification $\delta X$ of $X$.
Definition 4.5. A family $\{A_\hbar\}_{\hbar > 0} : \Sigma_X \to B(H)$ of Borel POV-measures on $X$ will be called a $C_\delta$-asymptotic spectral measure if the family of maps $\{Q_\hbar\}$ defined by equation (4.1) determines an asymptotic morphism $\{Q_\hbar\} : C_\delta(X) \to B(H)$.
The following proposition is then easy to prove using Theorem A.3 and the results above.
Proposition 4.6. There is a one-one correspondence between locally $B$-valued $C_\delta$-asymptotic spectral measures $\{A_\hbar\} : (\Sigma_X, \mathcal{C}_X) \to (B(H), B)$ and positive asymptotic morphisms $\{Q_\hbar\} : (C_\delta(X), C_0(X)) \to (B(H), B)$.

5. Examples and Applications

5.1. Constructing ASMs via quantum noise models. We give a general method for constructing asymptotic spectral measures from spectral measures (on a possibly different measure space) by adapting a convolution technique used to model noise and uncertainty in quantum measuring devices. See Sect. II.2.3 of Busch et al. [BGL] for the relevant background material. Let $(X_1, \Sigma_1)$ and $(X_2, \Sigma_2)$ be measure spaces. Let $\mathcal{E}_2 \subset \Sigma_2$.
Consider a family of maps $\{p_\hbar\} : \Sigma_2 \times X_1 \to [0, 1]$ such that the following conditions hold:
(a) For every $\omega \in X_1$, $\Delta \mapsto p_\hbar(\Delta, \omega)$ is a probability measure on $X_2$;
(b) For each $\Delta \in \mathcal{E}_2$, the map $\hbar \mapsto p_\hbar(\Delta, \cdot)$ is continuous $(0, 1] \to B_b(X_1)$;
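(c) For every $\Delta_1, \Delta_2 \in \mathcal{E}_2$,
$$\lim_{\hbar \to 0} \|p_\hbar(\Delta_1, \cdot)\,p_\hbar(\Delta_2, \cdot) - p_\hbar(\Delta_1 \cap \Delta_2, \cdot)\|_\infty = 0,$$
where $\|\cdot\|_\infty$ denotes the sup-norm on $B_b(X_1)$.
Let $E : \Sigma_1 \to B(H)$ be a spectral measure on $X_1$. Define a family of maps $\{A_\hbar\} : \Sigma_2 \to B(H)$ by the formula
$$A_\hbar(\Delta) = \int_{X_1} p_\hbar(\Delta, \omega)\, dE(\omega)$$
for any $\Delta \in \Sigma_2$.
Theorem 5.1.2. The family $\{A_\hbar\} : \Sigma_2 \to B(H)$ defines an ASM on $(X_2, \Sigma_2, \mathcal{E}_2)$. If $E$ is normalized then $\{A_\hbar\}$ is also normalized.
Proof. The fact that each $A_\hbar$ is a POVM on $X_2$ is easy. Continuity in $\hbar$ follows from Condition (b) and the following estimate for $\Delta \in \mathcal{E}_2$,
$$\|A_\hbar(\Delta) - A_{\hbar_0}(\Delta)\| = \Big\| \int_{X_1} \big(p_\hbar(\Delta, \omega) - p_{\hbar_0}(\Delta, \omega)\big)\, dE(\omega) \Big\| \le \|p_\hbar(\Delta, \cdot) - p_{\hbar_0}(\Delta, \cdot)\|_\infty.$$
Now we need to prove asymptotic projectivity. Let $\Delta_1, \Delta_2 \in \mathcal{E}_2$. Consider the calculation
$$\|A_\hbar(\Delta_1)A_\hbar(\Delta_2) - A_\hbar(\Delta_1 \cap \Delta_2)\| = \Big\| \int_{X_1} p_\hbar(\Delta_1, \omega)\, dE(\omega) \int_{X_1} p_\hbar(\Delta_2, \omega)\, dE(\omega) - \int_{X_1} p_\hbar(\Delta_1 \cap \Delta_2, \omega)\, dE(\omega) \Big\|$$
$$= \Big\| \int_{X_1} p_\hbar(\Delta_1, \omega)\,p_\hbar(\Delta_2, \omega)\, dE(\omega) - \int_{X_1} p_\hbar(\Delta_1 \cap \Delta_2, \omega)\, dE(\omega) \Big\|$$
$$\le \|p_\hbar(\Delta_1, \cdot)\,p_\hbar(\Delta_2, \cdot) - p_\hbar(\Delta_1 \cap \Delta_2, \cdot)\|_\infty \to 0$$
as $\hbar \to 0$ by (c). We finish by showing that the mild normalization condition holds:
$$A_\hbar(X_2) = \int_{X_1} p_\hbar(X_2, \omega)\, dE(\omega) = \int_{X_1} 1\, dE(\omega) = E(X_1) \le I$$
by Condition (a) above and the fact that $E(X_1)$ is a projection.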
Note that the inequalities in the previous proof require that $E$ be a PVM. (See Theorems 15 and 16 [Be] and Theorem A.4.) See Example 5.3 below for a concrete example of this smearing technique. The physical interpretation (for finite systems) is that $p_\hbar$ models the noise or uncertainty in interpreting the readings of a measurement. For example, if $E$ has an eigenstate $\varphi = E(\{\omega\})\varphi$, then the expectation value of $A_\hbar(\Delta)$ when the system is in state $\varphi$ is given by $\langle \varphi | A_\hbar(\Delta) | \varphi \rangle = p_\hbar(\Delta, \omega)$. Thus, $p_\hbar$ determines a (conditional) confidence measure of the system.
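The smearing construction can be checked numerically on a finite toy model. The sketch below is only an illustration under stated assumptions, none of which come from the text: $X_1 = X_2 = \{0, 1, 2\}$, $E$ is the diagonal PVM on $\mathbb{C}^3$ (so every $A_\hbar(\Delta)$ is a diagonal matrix, stored as a list), and the hypothetical noise kernel `kernel` leaks a fraction $\hbar$ of each reading uniformly over the outcomes.

```python
# Finite toy model of A_h(D) = sum_w p_h(D, w) E({w}).
# Assumptions (not from the paper): three outcomes, diagonal PVM E on C^3,
# and a kernel that spreads a fraction h of each reading uniformly.

POINTS = (0, 1, 2)

def kernel(h, delta, w):
    """p_h(delta, w): chance of reading a value in `delta` when the true value is w."""
    uniform = len(delta) / len(POINTS)
    sharp = 1.0 if w in delta else 0.0
    return (1.0 - h) * sharp + h * uniform

def smeared(h, delta):
    """Diagonal of A_h(delta); E({w}) projects onto the w-th basis vector."""
    return [kernel(h, delta, w) for w in POINTS]

def defect(h, d1, d2):
    """Operator norm of A_h(d1) A_h(d2) - A_h(d1 & d2), computed entrywise
    because all operators here are diagonal."""
    a1, a2, a12 = smeared(h, d1), smeared(h, d2), smeared(h, d1 & d2)
    return max(abs(x * y - z) for x, y, z in zip(a1, a2, a12))

if __name__ == "__main__":
    d1, d2 = frozenset({0, 1}), frozenset({1, 2})
    for h in (0.5, 0.1, 0.01):
        print(h, defect(h, d1, d2))
```

In this model the projectivity defect vanishes at $\hbar = 0$ and shrinks with $\hbar$, which is the behavior Condition (c) asks for.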
5.2. Quasiprojectors and semiclassical limits. In this example, we show that the theory of ASMs can be used to study semiclassical limits. The relevant background for the material in this section can be found in Chapters 10 and 11 of the book of Omnès [O]. We first need the following well-known result, which is an easy consequence of the functional calculus and spectral mapping theorem. (See also Lemma 5.1.6 [WO].) It gives a rigorous statement of the procedure used to “straighten out” quasiprojectors into projections.
Lemma 5.2.1. Let $\{a_\hbar : \hbar > 0\}$ be a continuous family of elements in a $C^*$-algebra $B$ such that $0 \le a_\hbar \le 1$ for each $\hbar > 0$ and
$$\lim_{\hbar \to 0} \|a_\hbar - a_\hbar^2\| = 0.$$
There is a continuous family of projections $\hbar \mapsto e_\hbar = e_\hbar^* = e_\hbar^2$ such that
$$\lim_{\hbar \to 0} \|a_\hbar - e_\hbar\| = 0.$$
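As a sanity check on Lemma 5.2.1, here is a minimal worked example in $M_2(\mathbb{C})$. The specific family $a_\hbar$ is chosen purely for illustration and does not come from the text; in this case a constant projection already does the job.

```latex
\[
a_\hbar = \begin{pmatrix} 1-\tfrac{\hbar}{2} & 0 \\ 0 & \tfrac{\hbar}{2} \end{pmatrix},
\qquad 0 \le a_\hbar \le 1, \qquad \hbar \in (0,1],
\]
\[
\|a_\hbar - a_\hbar^2\| = \tfrac{\hbar}{2}\bigl(1-\tfrac{\hbar}{2}\bigr) \to 0,
\qquad
e_\hbar = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
\qquad
\|a_\hbar - e_\hbar\| = \tfrac{\hbar}{2} \to 0 .
\]
```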
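Let $(X, \Sigma, \mathcal{E})$ be an asymptotic measure space.
Proposition 5.2.2. Let $\{A_\hbar\}$ be a normalized ASM on $(X, \Sigma, \mathcal{E})$. For each subset $\Delta \in \mathcal{E}$ there is a continuous family of projections $\{E_\hbar(\Delta)\}$ such that
$$\lim_{\hbar \to 0} \|A_\hbar(\Delta) - E_\hbar(\Delta)\| = 0.$$
Moreover, if $\Delta_1$ and $\Delta_2$ are disjoint measurable sets in $\mathcal{E}$ then
$$\lim_{\hbar \to 0} \|E_\hbar(\Delta_1)E_\hbar(\Delta_2)\| = 0.$$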
Proof. For each $\Delta \in \mathcal{E}$ we have by monotonicity and normalization that $0 \le A_\hbar(\Delta) \le I$ for all $\hbar > 0$. Setting $\Delta = \Delta_1 = \Delta_2$ in the asymptotic projectivity condition (3.1.c) we have that
$$\lim_{\hbar \to 0} \|A_\hbar(\Delta) - A_\hbar(\Delta)^2\| = 0.$$
Now invoke the previous lemma to get the continuous family $\{E_\hbar(\Delta)\}$ of projections. If $\Delta_1 \cap \Delta_2 = \emptyset$ then by Condition (3.1.c) again, we have that
$$\lim_{\hbar \to 0} \|A_\hbar(\Delta_1)A_\hbar(\Delta_2)\| = 0.$$
A simple triangle inequality argument plus normalization then shows that
$$\lim_{\hbar \to 0} \|E_\hbar(\Delta_1)E_\hbar(\Delta_2)\| = 0$$
as was desired.
The relation to semiclassical limits occurs when we take $X$ to be the locally compact phase space of a classical system and $B = B_1(H)$ to be the algebra of trace-class operators.
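Proposition 5.2.3. Let $\{A_\hbar\}$ be a Borel ASM on $X$ with locally compact trace. Then for any subset $\Delta \in \mathcal{C}_X$ we have
$$\lim_{\hbar \to 0} \mathrm{trace}\big(A_\hbar(\Delta) - A_\hbar(\Delta)^2\big) = 0,$$
and there is a unique integer $N_\Delta \in \mathbb{N}$ such that
$$N_\Delta = \lim_{\hbar \to 0} \mathrm{trace}(A_\hbar(\Delta)).$$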
Moreover, this integer is constant on the asymptotic equivalence class of $\{A_\hbar\}$.
Proof. The first limit follows from the continuity of the trace. Let $\{E_\hbar(\Delta)\}$ be the projections from the previous result. Since $A_\hbar(\Delta) \in B_1(H) \subset \overline{B_1(H)} = K(H)$ it follows that $E_\hbar(\Delta) \in K(H)$, i.e., $\hbar \mapsto E_\hbar(\Delta)$ is a continuous family of compact (hence, finite rank) projections. Therefore, since the rank of a projection is a continuous invariant [D],
$$\lim_{\hbar \to 0} \mathrm{trace}(A_\hbar(\Delta)) = \lim_{\hbar \to 0} \mathrm{trace}(E_\hbar(\Delta)) = \mathrm{rank}(E_{\hbar_0}(\Delta)) =_{\mathrm{def}} N_\Delta$$
for any $\hbar_0 > 0$. The last statement follows again by continuity of the trace.
Suppose $X$ denotes the position-momentum phase space $(x, p)$ of a particle. Let $\{A_\hbar\}$ be a locally compact trace Borel ASM on $X$. A bounded rectangle $R$ in phase space with center $(x_0, p_0)$ and sides $2\Delta x$ and $2\Delta p$ can then be used to represent a classical property asserting the simultaneous existence of the position and momentum $(x_0, p_0)$ of the particle with given error bounds $(\Delta x, \Delta p)$ on measurement. The nonnegative integer $N_R$ which satisfies
$$N_R = \lim_{\hbar \to 0} \mathrm{trace}(A_\hbar(R))$$
can then be interpreted as the number of semiclassical states of the particle bound in the rectangular box $R$, which is familiar from elementary statistical mechanics. We then have that
$$\mathrm{trace}\big(A_\hbar(R) - A_\hbar(R)^2\big) = N_R\, O(\hbar), \qquad \mathrm{trace}\big(E_\hbar(R) - A_\hbar(R)\big) = N_R\, O(\hbar).$$
Thus, $\hbar$ represents a classicity parameter. When $\hbar \approx 0$ is small, the quantum representation of the classical property is essentially correct, and when $\hbar \approx 1$ the classical property has essentially no meaning from the standpoint of quantum mechanics. Since these relations are preserved on equivalence classes, “a classical property corresponding to sufficiently large a priori bounds $\Delta x$ and $\Delta p$ is represented by a set of equivalent quantum projectors” [O], i.e., equivalent locally compact trace ASMs. In addition, if $R_1$ and $R_2$ are disjoint rectangles, representing distinct classical properties, then we have that
$$\|A_\hbar(R_1)A_\hbar(R_2)\| = O(\hbar),$$
and so “two clearly distinct classical properties are (asymptotically) mutually exclusive when considered as quantum properties” [O].
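5.3. Unsharp spin measurements of spin-$\frac12$ systems. In this example, we give a geometric classification of certain asymptotic spectral measures associated to pure spin-$\frac12$ particles. Recall that pure spin systems are represented by the Hilbert space $H = \mathbb{C}^2$ [BGL, S]. We then have $B(H) \cong M_2(\mathbb{C})$. The Pauli spin operators $\sigma_1, \sigma_2, \sigma_3$ are the $2 \times 2$ matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$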
which satisfy the relations:
• $\sigma_i^* = \sigma_i$, $\sigma_i^2 = I$,
• $\sigma_i\sigma_j = -\sigma_j\sigma_i$ for $i \ne j$,
• $\sigma_i\sigma_j = i\,\epsilon_{ijk}\,\sigma_k$ for $i \ne j$,
where $I$ denotes the identity operator. A density operator (or state) on $H$ is a positive matrix $\rho \ge 0$ with trace one. A fundamental result in the theory is the following.
Lemma 5.3.1. Any density operator $\rho$ on $H$ can be written uniquely in the form
$$\rho = \rho(x) = \tfrac12 (I + x \cdot \sigma), \qquad x = (x_1, x_2, x_3) \in \mathbb{R}^3, \qquad \|x\| \le 1,$$
where $x \cdot \sigma = x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3$ and $\|x\|^2 = x_1^2 + x_2^2 + x_3^2$. Moreover, $\rho$ is a one-dimensional projection iff $x$ is a unit vector, $\|x\| = 1$.
Definition 5.3.3. A spin POVM on $X_2 = \{-\frac12, +\frac12\}$ is a normalized POVM $A = \{A^+, A^-\}$ such that $\mathrm{trace}(A^\pm) = 1$, where $A^\pm = A(\{\pm\frac12\})$. Thus $A^\pm \ge 0$ is a density operator and $A^+ + A^- = I$. An asymptotic spectral measure $\{A_\hbar\}$ on $X_2$ will be called spin if each $A_\hbar$ is a spin POVM.
Let $B^3 = \{x \in \mathbb{R}^3 : \|x\| \le 1\}$ denote the closed unit ball in $\mathbb{R}^3$. Let $S^2 = \partial B^3$ denote the unit sphere. For each $x \in B^3$ we obtain a spin POVM $A_x$ on $X_2$ by defining
$$A_x^\pm = \rho(\pm x) = \tfrac12 (I \pm x \cdot \sigma), \qquad (5.3.4)$$
which determines an “unsharp” spin observable. Let $\lambda = \lambda(x) = \|x\|$ and define the quantities
$$r_x = \frac{1 + \lambda(x)}{2} > \frac12, \qquad u_x = \frac{1 - \lambda(x)}{2} < \frac12.$$
The quantity $r_x$ is called the degree of reality and $u_x$ is the degree of unsharpness of the unsharp observable $A_x$ [BGL, RK].
Lemma 5.3.5. There is a bijective correspondence between spin POVMs $A = \{A^+, A^-\}$ and points $x \in B^3$ in the closed unit ball of $\mathbb{R}^3$ given by (5.3.4). Moreover, $A$ is a spectral measure if and only if $x \in S^2$ is a unit vector.
Proof. Follows from Lemma 5.3.1, Definition 5.3.3, and $A^- = I - A^+$.
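The correspondence (5.3.4) is easy to check by hand or by machine. The following sketch, with hypothetical helper names `spin_povm`, `trace` and `eigenvalues` that are not from the text, builds $A_x^\pm$ from a vector $x$ in the unit ball using plain nested tuples, and recovers $u_x$ and $r_x$ as the two eigenvalues of $A_x^+$.

```python
import math

# 2x2 identity and Pauli matrices as nested tuples (row-major).
I2 = ((1, 0), (0, 1))
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def spin_povm(x):
    """Return (A_plus, A_minus) with A_pm = (I +/- x.sigma)/2 for x in the closed unit ball."""
    def effect(sign):
        return tuple(
            tuple(0.5 * (I2[r][c] + sign * sum(xi * s[r][c] for xi, s in zip(x, SIGMA)))
                  for c in range(2))
            for r in range(2)
        )
    return effect(+1), effect(-1)

def trace(m):
    return m[0][0] + m[1][1]

def eigenvalues(m):
    """Eigenvalues of a 2x2 Hermitian matrix, from its trace and determinant."""
    t = trace(m).real
    d = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    root = math.sqrt(max(t * t / 4 - d, 0.0))
    return t / 2 - root, t / 2 + root

if __name__ == "__main__":
    a_plus, a_minus = spin_povm((0.3, 0.0, 0.4))   # ||x|| = 0.5
    print(trace(a_plus), eigenvalues(a_plus))      # eigenvalues are (u_x, r_x)
```

For $\|x\| = 0.5$ the eigenvalues come out as $u_x = 0.25$ and $r_x = 0.75$, so $A_x^+$ is a projection only in the limiting case $\|x\| = 1$.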
Theorem 5.3.6. There is a bijective correspondence between spin asymptotic spectral measures $\{A_\hbar\} = \{A_\hbar^+, A_\hbar^-\}$ and continuous maps $A : (0, 1] \to B^3$ such that
$$\lim_{\hbar \to 0} \|A(\hbar)\| = 1. \qquad (5.3.6.1)$$
This correspondence is given by the formula
$$A_\hbar^\pm = \tfrac12 (I \pm A(\hbar) \cdot \sigma).$$
Proof. By the lemma we only need to prove continuity in $\hbar$ and that asymptotic projectivity corresponds to condition (5.3.6.1) above. By the properties of the Pauli spin operators above, we can show by direct computations (see also formulas (66a) and (66b) in [S]) that
$$4\big[(A_\hbar^+)^2 - A_\hbar^+\big] + \big(1 - \|A(\hbar)\|^2\big) I = 0$$
and
$$\|A(\hbar) - A(\hbar_0)\|^2 = -\det\big((A(\hbar) - A(\hbar_0)) \cdot \sigma\big) = -4 \det\big(A_\hbar^+ - A_{\hbar_0}^+\big).$$
The result now easily follows.
Thus, we can geometrically realize the space of spin asymptotic spectral measures as the space of continuous paths in the closed unit ball of $\mathbb{R}^3$ which asymptotically approach the unit sphere, i.e., they are “asymptotically sharp”. Note that this provides nontrivial examples of asymptotic spectral measures which do not converge to a fixed spectral measure.
Let $n$ be a unit vector and define $A(\hbar) = (1 - \hbar)n$. The associated spin asymptotic spectral measure given by
$$A_\hbar^\pm = \tfrac12 \big(I \pm (1 - \hbar)\, n \cdot \sigma\big)$$
is used by Roy and Kar [RK] to analyze eavesdropping strategies in quantum cryptography using EPR pairs of correlated spin-$\frac12$ particles. Violations of Bell’s inequality occur when the parameter $\hbar > 1 - \sqrt{2}\,(\sqrt{2} - 1)^{1/2}$. This spin ASM is also obtained by the asymptotic smearing construction in 5.1. Let $E^\pm = A_0^\pm$ be the spectral measure associated to the unit vector $n$. Define the family $\{p_\hbar\} : P(X_2) \times X_2 \to [0, 1]$ by the formula $p_\hbar(\Delta, j) = \sum_{i \in \Delta} \lambda^\hbar_{ij}$, where $(\lambda^\hbar_{ij})$ is the stochastic matrix
$$(\lambda^\hbar_{ij}) = \begin{pmatrix} 1 - \frac{\hbar}{2} & \frac{\hbar}{2} \\ \frac{\hbar}{2} & 1 - \frac{\hbar}{2} \end{pmatrix}.$$
One can then verify that
$$A_\hbar^\pm = \sum_{l = \pm\frac12} p_\hbar(\{\pm\tfrac12\}, l)\, E(\{l\}) = \big(1 - \tfrac{\hbar}{2}\big) E^\pm + \tfrac{\hbar}{2}\, E^\mp.$$
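The two descriptions of this spin ASM, as the path $A(\hbar) = (1 - \hbar)n$ in the unit ball and as a smearing of $E^\pm$ by the stochastic matrix $(\lambda^\hbar_{ij})$, can be compared numerically. The sketch below is an illustration only; the helper names (`effect`, `path_effect`, `smeared_effect`, `projectivity_defect`) and the sample unit vector are assumptions, not taken from the text. It also evaluates $(A_\hbar^+)^2 - A_\hbar^+$, which a direct computation with the Pauli relations gives as $-\tfrac14(1 - \|A(\hbar)\|^2) I$.

```python
def effect(x, sign):
    """(I + sign * x.sigma)/2 as a 2x2 nested tuple, written out explicitly."""
    x1, x2, x3 = (sign * c for c in x)
    return ((0.5 * (1 + x3), 0.5 * (x1 - 1j * x2)),
            (0.5 * (x1 + 1j * x2), 0.5 * (1 - x3)))

def scale_add(a, m1, b, m2):
    """Entrywise a*m1 + b*m2 for 2x2 matrices."""
    return tuple(tuple(a * m1[r][c] + b * m2[r][c] for c in range(2)) for r in range(2))

def matmul(m1, m2):
    return tuple(tuple(sum(m1[r][k] * m2[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def path_effect(h, n, sign=+1):
    """A_h^{sign} built from the path A(h) = (1 - h) n."""
    return effect(tuple((1 - h) * c for c in n), sign)

def smeared_effect(h, n, sign=+1):
    """A_h^{sign} built by smearing: (1 - h/2) E^{sign} + (h/2) E^{-sign}."""
    return scale_add(1 - h / 2, effect(n, sign), h / 2, effect(n, -sign))

def projectivity_defect(h, n):
    """The matrix (A_h^+)^2 - A_h^+."""
    a = path_effect(h, n)
    return scale_add(1, matmul(a, a), -1, a)

if __name__ == "__main__":
    n = (2 / 3, 1 / 3, 2 / 3)          # a sample unit vector
    for h in (0.5, 0.2, 0.05):
        print(h, projectivity_defect(h, n)[0][0])
```

The defect is a scalar multiple of the identity that tends to 0 with $\hbar$, in line with condition (5.3.6.1).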
Corollary 5.3.7. Two spin asymptotic spectral measures $\{A_\hbar\}$ and $\{B_\hbar\}$ are equivalent if and only if their associated maps $A, B : (0, 1] \to B^3$ are asymptotic, i.e.,
$$\lim_{\hbar \to 0} \|A(\hbar) - B(\hbar)\| = 0.$$
Proof. $\|A(\hbar) - B(\hbar)\|^2 = -4 \det(A_\hbar^+ - B_\hbar^+)$.
5.4. Strong deformation quantization. Let $X$ be a locally compact space. Let $B$ be a $C^*$-algebra. A strong deformation from $X$ to $B$ is a continuous field [D] of $C^*$-algebras $\{B_\hbar : \hbar \in [0, 1]\}$ such that $B_0 = C_0(X)$ and $\{B_\hbar \mid \hbar > 0\} \cong B \times (0, 1]$.
Here we give a measure-theoretic criterion, based on E-theory arguments, for when a locally $B$-valued Borel ASM $\{A_\hbar\}$ on $X$ determines a strong deformation from $X$ to $B$, where $B$ is a hereditary $C^*$-subalgebra of $B(H)$. First, we make the following general definition.
Definition 5.4.1. Let $\{A_\hbar\}$ be an ASM on $(X, \Sigma, \mathcal{E})$. We will call $\{A_\hbar\}$ injective if
$$\liminf_{\hbar \to 0} \|A_\hbar(\Delta)\| > 0$$
for all nonempty subsets $\Delta \ne \emptyset$ in $\mathcal{E}$.
Thus, if $\{A_\hbar\}$ is a locally $B$-valued Borel ASM on $X$, then by local compactness and monotonicity, injectivity is equivalent to
$$\liminf_{\hbar \to 0} \|A_\hbar(U)\| > 0$$
for all nonempty open subsets $U \ne \emptyset$ of $X$. Let $\{Q_\hbar\} : C_0(X) \to B$ be the associated asymptotic morphism given by Theorem 4.2. Recall that $\{Q_\hbar\}$ is called injective [L1] if
$$\liminf_{\hbar \to 0} \|Q_\hbar(f)\| > 0$$
for all $f \ne 0$ in $C_0(X)$. By the results in [CH, L1, DL], (weakly) injective asymptotic morphisms determine strong deformations from $X$ to $B$.
Theorem 5.4.2. Let $\{A_\hbar\}$ be an injective locally $B$-valued Borel ASM on $X$. Then the associated asymptotic morphism $\{Q_\hbar\} : C_0(X) \to B$ is injective and so satisfies the continuity condition
$$\|f\| = \lim_{\hbar \to 0} \|Q_\hbar(f)\|$$
for all $f \in C_0(X)$. Hence, there is an associated strong deformation from $X$ to $B$.
Proof. Let $f \ne 0$ be in $C_0(X)$. Thus, there is an $x_0 \in X$ such that $|f(x_0)| > C > 0$. Since $\{Q_\hbar\}$ is positive linear, without loss of generality, we may assume $f \ge 0$ and so $f(x_0) > C > 0$. Let $U \in \mathcal{C}_X$ be the pre-compact open subset of $X$ defined by $U = \{x \in X : f(x) > C\}$. Then $C\chi_U \le f$ and so for all $\hbar > 0$ we have that
$$C A_\hbar(U) = \int_X C\chi_U\, dA_\hbar \le \int_X f\, dA_\hbar = Q_\hbar(f),$$
which implies that
$$0 < |C| \liminf_{\hbar \to 0} \|A_\hbar(U)\| \le \liminf_{\hbar \to 0} \|Q_\hbar(f)\|.$$
It follows that $\{Q_\hbar\} : C_0(X) \to B$ is injective and so by Lemma 3 [L1],
$$\|f\| = \lim_{\hbar \to 0} \|Q_\hbar(f)\|$$
for all $f \in C_0(X)$. Thus, $\{Q_\hbar\}$ is the asymptotic morphism associated to a strong deformation from $X$ to $B$.
The continuous sections of the field $\{A_\hbar\}$ are then determined by the equivalence class $[[Q_\hbar]]_{a(cp)}$ of the associated asymptotic morphism $\{Q_\hbar\} : C_0(X) \to B$.
5.5. Wick quantization on bosonic Fock space. The background material for this section can be found in Guentner [G1]. Let $H = L^2(\mathbb{C}, d\mu(z))$ denote the Hilbert space of measurable complex-valued functions on the complex plane $X = \mathbb{C}$ which are square-integrable with respect to the normalized Gaussian measure
$$d\mu(z) = \pi^{-1} e^{-|z|^2}\, d\lambda(z) = \pi^{-1} k(z, z)^{-1}\, d\lambda(z),$$
where $k(z, w) = e^{z\bar{w}}$ denotes the Bergman kernel and $d\lambda(z)$ denotes Lebesgue measure. The (bosonic) Fock space is the closed subspace $F \subset H$ consisting of analytic functions. For any bounded Borel function $f \in B_b(\mathbb{C})$, the Wick operator $T_f : F \to F$ of $f$ is the integral operator defined by
$$T_f(\varphi)(z) = \int_{\mathbb{C}} k(z, w)\, f(w)\, \varphi(w)\, d\mu(w)$$
for all $\varphi \in F$.
Lemma 5.5.1. For each $f \in L^2(\mathbb{C}, d\lambda) \cap B_b(\mathbb{C})$ the operator $T_f \in K(F)$.
Proof. Follows from the calculations in the proof of Proposition 3.2 [G1].
We define the Wick quantization map QW : Cb (C) → B(F) : f → QW (f ) = Tf . Let P : H → F denote the orthogonal projection. We can then define a POV-measure AW : C → B(F) by AW () = P ◦ χ , where χ denotes (the operator on H of multiplication by) the characteristic function χ . Note that it is the compression of the PVM → E() = χ . Lemma 5.5.2. The POVM AW : (C , CC ) → (B(F), K(F)) is normalized and locally compact-valued. The associated positive linear map is the Wick-Toeplitz quantization QW : (Cb (C), C0 (C)) → (B(F), K(F)). Proof. Follows from the fact that AW () = P ◦ χ = Tχ = QW (χ ). When U ∈ CC is pre-compact then χU ∈ L2 (C, dλ) ∩ Bb (C) and so AW (U ) = TχU ∈ K(F). Normalization follows from AW (C) = P = IF . For each h¯ > 0 and f ∈ Bb (C) define αh¯ (f )(z) = f (h¯ z) for all z ∈ C. We can then define a family of positive linear maps W {QW h¯ } : (Cb (C), C0 (C)) → (B(F), K(F)) : f → Qh¯ (f ) = Q(αh¯ (f )).
Guentner [G1] realized that to obtain an asymptotic morphism from the Wick quantization we need to pass to a unital subalgebra of Cb (C) that still contains C0 (C) as an ideal. Let δC denote the compactification of the complex plane C by the circle at infinity. The continuous functions on δC are “flat at infinity” when restricted to C. Let Cδ (C) = C(δC). We then have that C0 (C)Cδ (C) ⊂ Cb (C). The following result is a consequence of Proposition 4.6 above and Propositions 3.2 and 3.3 [G1].
Asymptotic Spectral Measures, Quantization, and E-Theory
Proposition 5.5.4. The family {QW h¯ } defines a relative positive asymptotic morphism {QW h¯ } : (Cδ (C), C0 (C)) → (B(F), K(F)) whose associated Cδ -asymptotic spectral measure {Ah¯ W } is given by Ah¯ W () = AW (h¯ −1 ) = P ◦ αh¯ (χ ). The restricted asymptotic morphism {QW h¯ } : C0 (C) → K(F) determines an Etheory class, W ∼ [[AW h¯ ]] =def [[Qh¯ ]] ∈ E(C0 (C), K) = E(C0 (C), C),
where we have used the matrix-stability of E-theory. √ ∂ ∂ ¯ Let ∂¯ = 21 ( ∂x + −1 ∂y ) be the ∂-operator on C ∼ = R2 , considered as an unbounded elliptic differential operator on the Hilbert space L2 (C, dλ(z)). The formal adjoint √H = 1 ∂ ∂ of ∂¯ is the operator −∂, where ∂ = 2 ( ∂x − −1 ∂y ). It follows that the 2 × 2 matrix operator 0 −∂ D= ¯ ∂ 0 determines a symmetric unbounded operator on H ⊕ H with bounded propagation, and so is (essentially) self-adjoint. By the results of Guentner [G1, G2] the operator D determines an E-theory class ¯ ∈ E(C0 (C), C), which is the homotopy class of the asymptotic morphism denoted [[∂]] determined by the formula C0 (R) ⊗ C0 (C) → C0 (R) ⊗ K(H ⊕ H ) : f ⊗ φ → Mφ ◦ f (hD ¯ + x1), where x ∈ R and 1 is the grading operator of the Z2 -graded Hilbert space H ⊕ H . A direct consequence of Proposition 5.5.4 above, Theorem 4.5 [G1], and the excision property of relative E-theory [G2] is that the E-theory classes of the Wick ASM above ¯ and the ∂-operator are in fact equal. ∼ ¯ Theorem 5.5.5. [[AW h¯ ]] = [[∂]] ∈ E(C0 (C), C) = Z. Appendix. POV-Measures and Quantum Mechanics Let X be a set equipped with a σ -algebra of subsets of X. Let H be a separable Hilbert space with inner product "·, ·#. A positive operator-valued measure (POVM) on the measurable space (X, ) is a mapping A : → B(H) which satisfies the following properties: • A(∅) = 0, • A() ≥ 0 for all ∈ , ∞ ∞ • A(∞ ) = n 1 A(n ) for disjoint measurable sets {n }1 ⊂ , 1
where the sum converges in the weak operator topology [Be, BGL, S]. Note that 0 ≤ A() ≤ A(X) ≤ A(X) < ∞ for all ∈ . We will say that A is normalized if A(X) = IH . If each A() is a projection in B(H), i.e., A()2 = A()∗ = A(), then we call A a projection-valued measure (PVM or spectral measure) on X. This is equivalent to the condition that: For all 1 , 2 ∈ , A(1 ∩ 2 ) = A(1 )A(2 ).
(A.1)
See Berberian [Be] for the basic integration theory of POVMs and Brandt [Br] for a short history of POVMs in quantum theory and an application to photonic qubits in quantum information processing. The monographs [S, BGL] give a thorough exposition of POVMs in foundational and operational aspects of quantum physics. Let X be a locally compact Hausdorff topological space. Let C0 (X) denote the C ∗ algebra of all continuous complex-valued functions on X which vanish at infinity. Definition A.2. A general quantization of X on a Hilbert space H is a positive linear map Q : C0 (X) → B(H). If X is compact, we require that Q(1X ) = IH . If X is a non-compact space, we require that Q should have a unital extension Q+ : C0 (X)+ = C(X+ ) → B(H) which is a positive linear map. Another reason for the importance of these operator-valued measures in quantization is the following generalized Riesz representation theorem for the dual of C0 (X) (compare Proposition 1.4.8 [L] and Theorem 19 [Be]): Theorem A.3. Let X be the Borel σ -algebra on the space X. There is a one-one correspondence between positive linear maps Q : C0 (X) → B(H) and POV-measures A : X → B(H), given by Q(f ) = f (x) dA(x). X
The map Q is a general quantization if and only if A is a normalized POVM. Moreover, Q is a ∗-homomorphism if and only if A is a spectral measure (PVM). The above integral is to be interpreted in the weak sense: For all v, w ∈ H, "Q(f )v, w# = f (x)"dA(x)v, w#. X
The map Q then extends to Q : Bb (X) → B(H) and satisfies (Theorem 10 [Be]): For all f ∈ C0 (X) ⊂ Bb (X), Q(f ) = f (x) dA(x) (A.3.1) ≤ 2f A(X). X
Thus, spectral measures (PVM’s) correspond to representations of abelian C ∗ -algebras on Hilbert space. The fundamental result in the von Neumann formulation of quantum theory is the following Spectral Theorem of Hilbert.
Spectral Theorem A.4. Let $X = \mathbb{R}$. There is a one-one correspondence between Borel spectral measures $A$ on $\mathbb{R}$ and self-adjoint operators $T$ on the associated Hilbert space. This correspondence is given by the formulas:
\[ T = \int_{-\infty}^{\infty} \lambda \, dA(\lambda), \qquad A(\Delta) = \chi_\Delta(T), \]
where $\chi_\Delta$ denotes the characteristic function of the Borel set $\Delta \subset \mathbb{R}$. Let $A$ be a normalized POV-measure on the phase space $X$ of a quantum system. The physical interpretation of the map $\Delta \mapsto A(\Delta)$ is that the probability that the physical system, in a state represented by a density operator $\rho$, is localized in the subset $\Delta$ of the phase space $X$ is given by the number
\[ P_\rho(\Delta) = \mathrm{trace}(\rho \circ A(\Delta)) = \mathrm{trace}\Big( \int_\Delta \rho \, dA \Big). \]
The mean or vacuum expectation value of a quantum observable $T$ is then computed by the formula
\[ \langle T \rangle = \mathrm{trace}(\rho T) = \mathrm{trace}\Big( \int_{-\infty}^{\infty} \lambda \rho(\lambda) \, dA(\lambda) \Big), \]
where ρ is the (normalized) probability density operator of the physical system. Note that according to the Naimark Extension Theorem [RS], every POVM A is the compression of a PVM E defined on a minimal extension H ⊃ H. That is, A() = P E()P , where P : H → H is the orthogonal projection. One could then try to compute the integrals X f (x) dA(x) by computing X f (x) dE(x) on H and then projecting back down to H. There are two problems with this [S]. The first is that H could have no physical meaning, thus making the analysis unsatisfying to the physicist. Also, the integration process may not commute with the projection process (e.g., when the associated operator is unbounded). References [B]
Beggs, E.J.: Strongly Asymptotic Morphisms on Separable Metrisable Algebras. J. Funct. Anal. 177, 16–53 (2000)
[Be] Berberian, S.K.: Notes on Spectral Theory. Princeton, NJ: Van Nostrand, 1966
[BGL] Busch, P., Grabowski, M. and Lahti, P.: Operational Quantum Physics. Lecture Notes in Physics. Berlin: Springer-Verlag, 1995
[Bl] Blackadar, B.: K-theory for operator algebras. MSRI Publication Series 5 (2nd ed.). New York: Springer-Verlag, 1998
[Br] Brandt, H.E.: Positive operator valued measure in quantum information processing. Am. J. Phys. 67, 434–439 (1999)
[C] Connes, A.: Noncommutative Geometry. San Diego, CA: Academic Press, Inc., 1994
[CE] Choi, M.D. and Effros, E.G.: The completely positive lifting problem for C*-algebras. Ann. of Math. 104, 585–609 (1976)
[CHM] Carey, A.L., Hannabuss, K.C. and Mathai, V.: Quantum Hall Effect and Noncommutative Geometry. Preprint math.OA/0008115
[CH] Connes, A. and Higson, N.: Déformations, morphismes asymptotiques et K-théorie bivariante. C. R. Acad. Sci. Paris Sér. I Math. 311, 101–106 (1990)
[D] Dixmier, J.: C*-algebras. Amsterdam: North-Holland, 1977
[DL] Dadarlat, M. and Loring, T.: Deformations of topological spaces predicted by E-theory. Algebraic methods in operator theory. Boston: Birkhäuser Boston, 1994, pp. 316–327
[G1] Guentner, E.: Wick Quantization and Asymptotic Morphisms. Houston Journal Math. 26, 361–375 (2000)
[G2] Guentner, E.: Relative E-theory. K-Theory 17, 55–93 (1999)
[GHT] Guentner, E., Higson, N. and Trout, J.: Equivariant E-theory for C*-algebras. Mem. Amer. Math. Soc. 703 (2000)
[GN] Gelfand, I.M. and Naimark, M.: On the embedding of normed rings into the ring of operators in Hilbert space. Mat. Sb. 12, 197–213 (1943)
[GVF] Gracia-Bondia, J.M., Várilly, J.C. and Figueroa, H.: Elements of Noncommutative Geometry. Birkhäuser Advanced Texts. Boston: Birkhäuser, 2001
[H] Higson, N.: On the K-theory proof of the index theorem. Contemporary Mathematics 148, 67–86 (1993)
[HLT] Houghton-Larsen, T.G. and Thomsen, K.: Universal (Co) Homology Theories. K-theory 16, 1–27 (1999)
[JP] Jauch, J.M. and Piron, C.: Generalized Localizability. Helv. Phys. Acta 40, 559–570 (1967)
[L] Landsman, N.P.: Mathematical Topics between Classical and Quantum Mechanics. Springer Monographs in Mathematics. New York: Springer-Verlag, 1998
[L1] Loring, T.A.: A test for injectivity for asymptotic morphisms. Algebraic methods in operator theory. Boston: Birkhäuser Boston, 1994, pp. 272–275
[L2] Loring, T.A.: Almost multiplicative maps between C*-algebras. Operator algebras and quantum field theory (Rome, 1996). Cambridge, MA: Internat. Press, pp. 111–122
[M] Murphy, G.J.: C*-algebras and operator theory. Boston: Academic Press, Inc., 1990
[N] Nash, C.: Differential Topology and Quantum Field Theory. San Diego, CA: Academic Press, Inc., 1991
[N1] Nagy, G.: E-theory with *-homomorphisms. J. Funct. Anal. 140, 275–299 (1996)
[N2] Nagy, G.: Deformation quantization and K-theory. Contemp. Math. 214, 111–134 (1997)
[O] Omnes, R.: Understanding Quantum Mechanics. Princeton, NJ: Princeton University Press, 1999
[P] Periwal, V.: D-brane charges and K-homology. J. High Energy Phys. 7, Paper 41 (2000), 6 pp.
[RK] Roy, S. and Kar, G.: Quantum Cryptography, Eavesdropping and Unsharp Spin Measurement. Chaos, Solitons & Fractals. 10. Elsevier Science Ltd., 1999, pp. 1715–1718
[Ro] Rosenberg, J.: Behavior of K-theory under quantization. Operator Algebras and Quantum Field Theory. Cambridge, MA: International Press, 1996, pp. 404–415
[RS] Riesz, F. and Sz.-Nagy, B.: Functional Analysis. Mineola: Dover Publications, Inc., 1990
[S] Schroeck, Jr., F.E.: Quantum Mechanics on Phase Space. Fundamental Theories of Physics. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1996
[Th] Thomsen, K.: Discrete asymptotic homomorphisms in E-theory and KK-theory. Preprint
[Tr] Trout, J.: Asymptotic Morphisms and Elliptic operators over C*-algebras. K-theory 18, 277–315 (1999)
[VN] von Neumann, J.: Mathematische Grundlagen der Quantenmechanik. Berlin: Springer-Verlag, 1932; English translation: Mathematical Foundations of Quantum Mechanics. Princeton, NJ: Princeton University Press, 1955
[W] Witten, E.: D-branes and K-theory. J. High Energy Phys. 12 (1998) (Paper 19, 41 pp.)
[WO] Wegge-Olsen, N.E.: K-theory and C*-algebras. New York: Oxford University Press, 1993
Communicated by A. Connes
Commun. Math. Phys. 226, 61 – 100 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Construction of Quasi-Periodic Breathers via KAM Technique
Xiaoping Yuan1,2
1 Department of Mathematics and Laboratory of Mathematics for Nonlinear Science, Fudan University, Shanghai 200433, P.R. China. E-mail: [email protected], [email protected]
2 CMAF Universidade de Lisboa, Av. Prof. Gama Pinto 2, 1649-003 Lisbon, Portugal
Received: 29 March 2001 / Accepted: 10 October 2001
Abstract: By developing a KAM theorem which involves an infinitely multiple normal frequency, it is shown that there are plenty of breathers, quasi-periodic in time and superexponentially localized in space, for the networks of weakly coupled oscillators. This answers an open problem posed by Aubry [A2] in the case where the linearized system has no continuous spectrum.
1. Introduction and Results
In this paper we are concerned with the existence of quasi-periodic breathers for the networks of weakly coupled oscillators:
\[ \frac{d^2 x_n}{dt^2} + V'(x_n) = \epsilon \big[ W'(x_{n+1} - x_n) - W'(x_n - x_{n-1}) \big], \qquad n \in \mathbb{Z}, \tag{1.1} \]
where $V$ is the local potential with $V'(0) = 0$, $V''(0) = \beta^2$ ($\beta > 0$), and $W$ is the coupling potential. This equation has been investigated in depth by several authors; see [A3,B,M-A], for example. In the classical case breathers are time-periodic and spatially localized solutions of the equations of motion. Aubry [A,A-A] introduced the well-known concept of anti-integrability or anti-continuation, by which the existence of breathers for some equations of motion can be proven. The first rigorous proof of existence in a wide class of models (including (1.1)) was given later by MacKay and Aubry [MA]. This proof was obtained by the anti-integrability method. Roughly, the method is as follows. Let us take Eq. (1.1) as an example. First, at the anti-integrable limit (i.e. $\epsilon = 0$), Eq. (1.1) is reduced to a discrete array of uncoupled anharmonic oscillators. Then breathers, corresponding to one oscillator moving freely, trivially exist and can be continued up to small $\epsilon \neq 0$ by an infinite-dimensional version of the implicit function theorem. The breathers with many commensurate frequencies can also be constructed by a similar method. See [Ah]. More recently, [Yu] investigated the existence of breathers for some limited-dimensional strongly coupled oscillators.
Supported by the National Natural Science Foundation of China (60074005) and by the Post-doctor Fellowship in CMAF University of Lisbon.
Naturally, the extension to quasi-periodic breathers should be considered. A solution of the equation of motion is called a quasi-periodic breather if it is quasi-periodic (with many incommensurate frequencies) in time and localized in space. The anti-integrability method can also be used to construct quasi-periodic breathers with two (but not more) incommensurate frequencies for some exceptional models, for example, discrete nonlinear Schrödinger models
\[ \sqrt{-1}\,\dot{\psi}_n + \epsilon(\psi_{n+1} + \psi_{n-1}) + |\psi_n|^{2\sigma} \psi_n = 0, \qquad \sigma > 0. \tag{1.2} \]
This is due to the fact that Eq. (1.2) possesses an invariance property under translation. See [J-A] for details. For (1.1), quasi-periodic breathers are trivially obtained at the anti-integrable limit (i.e. $\epsilon = 0$) as solutions where different sites oscillate with incommensurate frequencies. However, one expects that it should not be possible to continue these solutions to local breather solutions for non-zero coupling, since the harmonics of the breather frequencies should fill the real axis densely, making it impossible to avoid resonances with the linear band. To construct quasi-periodic breathers for more general nonlinear models, other means besides anti-integrability are needed. Aubry [A2] remarked that the existence of quasi-periodic breathers is an open problem which should relate the concepts of KAM theory and of anti-integrability. In the present paper, the existence of quasi-periodic breathers will be shown via the KAM technique. In the 1980s, the celebrated KAM theory was successfully extended to infinite-dimensional Hamiltonian systems of short range so as to deal with a certain class of Hamiltonian networks of weakly coupled oscillators.
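The continuation picture described above can be explored numerically on a finite chain. The following is only a minimal sketch under stated assumptions: the potentials $V(x) = \beta^2 x^2/2 + x^4/4$ and $W(x) = x^4/4$, the free-end truncation to 21 sites, the function name `simulate_chain`, and all parameter values are illustrative choices, not taken from the paper. It integrates Eq. (1.1) with velocity Verlet, starting from a single excited site, and reports the energy drift and the amplitudes near the excited site.

```python
def simulate_chain(n_sites=21, eps=0.01, beta=1.0, dt=0.01, steps=2000):
    """Integrate a finite, free-end truncation of Eq. (1.1) with velocity Verlet.

    Illustrative potentials (assumptions, not from the paper):
    V(x) = beta^2 x^2 / 2 + x^4 / 4 and W(x) = x^4 / 4.
    Returns the final positions and the initial and final energies.
    """
    def dV(x):  # V'(x)
        return beta ** 2 * x + x ** 3

    def dW(x):  # W'(x)
        return x ** 3

    def force(x):
        # -V'(x_n) + eps * [W'(x_{n+1} - x_n) - W'(x_n - x_{n-1})]
        f = []
        for n in range(n_sites):
            right = dW(x[n + 1] - x[n]) if n + 1 < n_sites else 0.0
            left = dW(x[n] - x[n - 1]) if n > 0 else 0.0
            f.append(-dV(x[n]) + eps * (right - left))
        return f

    def energy(x, y):
        local = sum(0.5 * yn ** 2 + 0.5 * beta ** 2 * xn ** 2 + 0.25 * xn ** 4
                    for xn, yn in zip(x, y))
        coupling = sum(0.25 * (x[n + 1] - x[n]) ** 4 for n in range(n_sites - 1))
        return local + eps * coupling

    x = [0.0] * n_sites
    y = [0.0] * n_sites
    x[n_sites // 2] = 1.0  # only the central oscillator is excited initially
    e0 = energy(x, y)
    f = force(x)
    for _ in range(steps):
        y = [yn + 0.5 * dt * fn for yn, fn in zip(y, f)]
        x = [xn + dt * yn for xn, yn in zip(x, y)]
        f = force(x)
        y = [yn + 0.5 * dt * fn for yn, fn in zip(y, f)]
    return x, e0, energy(x, y)


if __name__ == "__main__":
    x, e0, e1 = simulate_chain()
    print("relative energy drift:", abs(e1 - e0) / e0)
    print("amplitudes near the centre:", [abs(v) for v in x[8:13]])
```

Because the assumed coupling force is cubic, each site is driven only by the cube of its neighbour's displacement, so over this short run the excitation stays confined to a few sites. A short simulation of this kind says nothing about long-time persistence, which is what the KAM argument addresses.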
Vittot & Bellissard [V-B] and Fröhlich, Spencer & Wayne [F-S-W] showed that there is a set of frequency vectors in $\mathbb{R}^\infty_+$ of positive probability (where "Prob" is some probability measure) such that, for $\omega = (\omega_i)_{i \in \mathbb{Z}}$ in this set and $0 < \epsilon \ll 1$, there is an infinite-dimensional invariant torus for the following Hamiltonian
\[ H = H(I, \phi) = \sum_{i \in \mathbb{Z}} \omega_i I_i + \epsilon P(I, \phi), \tag{1.3} \]
where $P$ is of short range. Pöschel [P1] also obtained the above results for Hamiltonian systems with more general spatial structure. Thus, any solution starting from the torus is almost-periodic (with an infinite number of incommensurate frequencies) in time. By the technique of action-angle variables the Hamiltonian of Eq. (1.1) can be reduced to the form of (1.3). Thus, it seems possible that there is an almost-periodic solution for (1.1). However, here we focus our attention on quasi-periodic breathers. To this end we need to show that there is a finite-dimensional invariant torus for (1.1). In the 1990s, the KAM theory was significantly generalized to infinite-dimensional Hamiltonian systems that are not of short range, so as to show that there is a quasi-periodic (not almost-periodic) solution for some classes of partial differential equations. Kuksin [K], Pöschel [P2,3] and Wayne [W1] (in alphabetic order) showed that there are quasi-periodic solutions for the Hamiltonian
\[ H = \sum_{j=1}^{N} \omega_j I_j + \frac{1}{2} \sum_{j=1}^{\infty} \lambda_j u_j^2 + \epsilon P(I, \phi, u), \tag{1.4} \]
provided that $\lambda_j^\sharp = 1$ and some other conditions are satisfied, where $(I, \phi) \in \mathbb{C}^N \times (\mathbb{C}/2\pi\mathbb{Z})^N$, $u = (u_j)$ is in some Hilbert space, and we denote by $\lambda_j^\sharp$ the multiplicity of
$\lambda_j$, that is, $\lambda_j^\sharp$ = cardinality of the set of all $\lambda_l$ with $\lambda_l = \lambda_j$. Their result applies to some nonlinear partial differential equations, such as wave and Schrödinger equations, subject to Dirichlet and Neumann boundary conditions. By developing Craig & Wayne's [C-W] method, Bourgain [Bo1] constructed quasi-periodic solutions for two-dimensional (in space) nonlinear Schrödinger equations where $\lambda_j^\sharp < \infty$ for all $j$ and $\lim_{j \to \infty} \lambda_j^\sharp = \infty$. Bourgain [Bo2] established the existence of quasi-periodic (also periodic) solutions for (1.4) with $\lambda_j^\sharp < \bar{d}$ for all $j \in \mathbb{Z}$, where $\bar{d}$ is a given positive integer, which applies to one-dimensional nonlinear wave equations subject to periodic boundary conditions. (See also [C-Y].) The condition $\lambda_j^\sharp < \infty$ excludes Eq. (1.1), since the Hamiltonian of Eq. (1.1) can be reduced to the form of (1.4) with $\lambda_j = \sqrt{V''(0)}$ for all $j \in \mathbb{Z}$, i.e. $\lambda_j^\sharp = \infty$, again by the technique of action-angle variables. We will overcome the difficulty arising from $\lambda_j^\sharp = \infty$ by taking advantage of the fact that Eq. (1.1) is of short range. We will prove a KAM theorem which involves infinitely multiple normal frequencies. By this KAM theorem we construct quasi-periodic breathers for (1.1). Our idea is as follows. The strength of the resonance decays so fast that, with the help of short range, we can regard an infinite-dimensional Hamiltonian as a finite-dimensional one after taking an exponential norm with weight $e^{|n|/a}$. For a finite-dimensional Hamiltonian, the resonance in the normal direction does not prevent the existence of quasi-periodic solutions if some other conditions are met. See [B-M-S] and [Bo3], for example. More exactly, let $u = (u_n)_{n \in \mathbb{Z}}$ and $u^* = (u_n)_{|n| \leq m+1}$. We can regard $H = H(I, \theta, u)$ as $H = H(I, \theta, u^*)$ in the $(m+1)$th KAM iteration. The Hamiltonian $H = H(I, \theta, u^*)$ is finite-dimensional. Our difficulty is that the dimension $2m + 3$ of $u^*$ tends to $\infty$ as $m \to \infty$. In general, the small parameter $\epsilon \to 0$ when $m \to \infty$.
Because of this, some care is needed in the estimate of measure. We must find a sufficiently small $\epsilon_0$ such that, for all $0 < \epsilon < \epsilon_0$ and all $m \in \mathbb{N}$, our estimates of measure hold true. Before stating our theorem, we need the following assumptions on the potentials $V$ and $W$:
(V0) The local potential $V$ and the coupling potential $W$ are analytic in the strip domain $\{x \in \mathbb{C} : |x| < \delta_0\}$ for some constant $\delta_0 > 0$;
(V1) $V(0) = V'(0) = 0$, $V''(0) = \beta^2$ with $\beta > 0$, and $W(x) = O(|x|^3)$;
(V2) For a given compact interval $\mathcal{I} \subset \mathbb{R}_+$ and any $h \in \mathcal{I}$, the equation $\frac{1}{2} y^2 + V(x) = h$ defines a simple closed curve $\Gamma(h)$ which encloses the origin $(0, 0)$ in the $(x, y)$-plane.
(V3) Let $\rho = \rho(h)$ be the area enclosed by the closed curve $\Gamma(h)$, i.e.
\[ \rho(h) = \oint_{y^2/2 + V(x) = h} y \, dx. \]
Then $\rho'(h) \neq 0$, $\rho''(h) \neq 0$ for any $h \in \mathcal{I}$.
For given integer $N \geq 1$ and any choice $J = \{j_1, \dots, j_N\} \subset \mathbb{Z}$, Eq. (1.1) can be regarded as a perturbation of the following system:
\[ \frac{d^2 x_n}{dt^2} + V'(x_n) = 0, \qquad n \in J, \tag{1.5a} \]
\[ \frac{d^2 x_n}{dt^2} + \beta^2 x_n = 0, \qquad n \in \mathbb{Z} \setminus J. \tag{1.5b} \]
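To state our theorem we introduce a Hilbert space $\ell$ as follows:
\[ \ell = \Big\{ u = (x_n, y_n)_{n \in \mathbb{Z} \setminus J} : (x_n, y_n) \in \mathbb{C}^2, \ \|u\|^2 := \sum_{n \in \mathbb{Z} \setminus J} (|x_n|^2 + |y_n|^2) e^{|n|/a} < \infty \Big\}, \]
where $a > 0$ is a fixed constant (say $a = 1$). Define the inner product $\langle \cdot, \cdot \rangle_\ell$ in $\ell$ as follows:
\[ \langle u, v \rangle_\ell := \sum_{n \in \mathbb{Z} \setminus J} (u_n \cdot v_n) e^{|n|/a}, \]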
where the dot · is the inner product in C2 . Let = I N .For any η = (h1 , . . . , hN ) ∈ ,then, by assumption (V2) and the fact that 21 yn2 + V (xn ) = h is a first integral of (1.5a), )(h1 ) × · · · × )(hN ) is an invariant torus with the rotational frequencies ω(η) = (H0 (ρ(h1 )), . . . , H0 (ρ(hN ))) for (1.5a) where H0 is the inverse of ρ = ρ(h). Observe that (0, 0) is an equilibrium of (1.5b).Thus, T (η) = )(h1 ) × · · · × )(hN ) × {0} is an invariant torus with the rotational frequencies ω(η) for (1.5), where 0 is the origin in +. Therefore, any solution of (1.5) starting from T (η) is trivially breather for (1.5). Our end is to prove that the torus T (η) persists under the small perturbation. Here is our main theorem which states that there does persist a large Cantor sub-family of rotational N-tori which are only slightly deformed, thus the solutions starting from the persisted tori are quasi-periodic breathers of (1.1). Theorem 1.1. Suppose that the potentials V and W satisfy Assumptions (V0)÷(V3). Then, for given integer N , compact set = I N ⊂ RN + and small constant γ > 0, there is a positive constant ∗ = ∗ (, N, γ ) sufficiently small such that, when 0 < < ∗ , there is a Cantor set S ⊂ with measS = (meas)(1−O(γ )) (where meas ≡ Lebesgue measure) ,a family of N -tori T (η) ⊂ T (η) T [S] = η∈S
η∈
over S, and an analytic embedding / : T [S] 0→ RN × TN × +
which is a higher order perturbation of the inclusion map /0 : η∈ T (η) 0→ RN × TN × + restricted to T [S], such that the restriction / to each T (η) in the family is an embedding of a rotational N -torus for (1.1). Moreover, any solution of (1.1) starting from /(T (η)) is a quasi-periodic breather of frequencies ω∗ with |ω∗ − ω| = O( 1/3 ). Remark 1.1. The breathers in Theorem 1 are very well localized in space. More exactly, for the amplitude of the oscillation xn+N of the n + N th particle, the estimate n /3
|xn+N | ≤ Ce−|n|/a (1+2)
holds true, namely the decay is super-exponential, where C > 0, 0 < 2 < 1/9 are fixed constants. This very fast decay is surprising at first sight, but it is due to the fact the
interaction starts with a cubic term. At the end of Sect. 5 we will give the details of the proof of the super-exponential decay. Bambusi [B] showed the Nekhoroshev stability of the breathers. In [B] the existence of periodic breathers was also shown by Poincaré's continuation theorem. It is worth pointing out that the super-exponential decay of the periodic (but not quasi-periodic) breather in the case with cubic coupling potential can also be obtained by the standard Poincaré theorem in a space of weighted sequences, in light of the idea in [B].
Remark 1.2. In general, one may not expect to specify the perturbed frequency vector $\omega_*$ independently of the perturbation. In other words, usually $\omega_* \neq \omega$. See [Bo3] for an example.
Remark 1.3. For the cubic coupling potential, the linearized system of (1.1) does not have any continuous spectrum. The proof of Theorem 1.1 depends heavily on this fact. It is worth pointing out that the existence of quasi-periodic breathers in the case with continuous spectrum is still an open problem.
Corollary 1.1. If $V(0) = V'(0) = 0$, $V''(0) > 0$, $W = O(|x|^3)$, and there is $k \geq 3$ such that $V^{(3)}(0) = \cdots = V^{(k-1)}(0) = 0$, $V^{(k)}(0) \neq 0$, then Eq. (1.1) has "rich" quasi-periodic breathers of small amplitude when $\epsilon$ is sufficiently small.
Proof. In fact, rewrite $\rho(h)$ as
\[ \rho(h) = 2\sqrt{2} \int_0^{x_+} \sqrt{h - V(s)} \, ds, \]
where $x_+ > 0$ is given by $V(x_+) = h$. Choosing $\sigma = V(x)/h$ as the new variable of integration in the above expression, so that $ds = h \, d\sigma / V'(\xi)$ and $\sqrt{h - V} = \sqrt{h}\sqrt{1 - \sigma}$, we get that, for small $h$,
\[ \rho(h) = 2\sqrt{2} \, h^{3/2} \int_0^1 \frac{\sqrt{1 - \sigma}}{V'(\xi)} \, d\sigma, \]
where $\xi = \xi(\sigma h)$ is given by $V(\xi) = \sigma h$. By elementary calculation we get
\[ \rho(h) = c_1 h + c_2 h^{k/2} + o(|h|^{k/2}), \quad \rho'(h) = c_1 + c_3 h^{k/2 - 1} + o(|h|^{k/2 - 1}), \quad \rho''(h) = c_4 h^{k/2 - 2} + o(|h|^{k/2 - 2}), \]
where the $c_i$'s are some non-zero constants. Thus, assumption (V3) is satisfied for small $h$. Assumptions (V0,1,2) are obviously satisfied. □
Corollary 1.2. If $V(x) = \frac{1}{2} \beta^2 x^2 + \cdots + \zeta x^{2k}$, $k \geq 2$, $W = O(|x|^3)$, and $\zeta > 0$, then Eq. (1.1) has "rich" quasi-periodic breathers of large amplitude when $\epsilon$ is sufficiently small.
Proof. It is not difficult to verify that, for large $h$, $\rho(h) \geq c_5 h^{1/2 + 1/k}$, $\rho'(h) \geq c_6 h^{-1/2 + 1/k}$, $|\rho''(h)| \geq c_7 h^{-3/2 + 1/k}$. See [Y] for their proofs, for example. Thus, assumption (V3) is satisfied for large $h$. □
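The role of assumption (V3) can be illustrated by evaluating the integral expression for $\rho(h)$ from the proof of Corollary 1.1 numerically. The sketch below is an illustration under stated assumptions: the helper names `turning_point`, `rho` and `second_difference`, the two test potentials, and the quadrature parameters are hypothetical choices made here. For a purely harmonic potential the expression is linear in $h$, so $\rho''(h) = 0$ and (V3) fails, while an added quartic term makes the second difference clearly non-zero.

```python
import math


def turning_point(V, h, hi=10.0):
    """Solve V(x_plus) = h for x_plus > 0 by bisection (V assumed increasing on [0, hi])."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if V(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rho(V, h, n=2000):
    """Evaluate 2*sqrt(2) * int_0^{x_plus} sqrt(h - V(s)) ds.

    The substitution s = x_plus * sin(t) removes the square-root endpoint
    singularity; the t-integral is approximated with the midpoint rule.
    """
    xp = turning_point(V, h)
    dt = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        s = xp * math.sin(t)
        total += math.sqrt(max(h - V(s), 0.0)) * xp * math.cos(t)
    return 2.0 * math.sqrt(2.0) * total * dt


def second_difference(V, h, d=0.1):
    """Central finite-difference approximation of rho''(h)."""
    return (rho(V, h + d) - 2.0 * rho(V, h) + rho(V, h - d)) / d ** 2


def harmonic(x):
    return 0.5 * x * x                   # beta = 1, no anharmonic term


def quartic(x):
    return 0.5 * x * x + 0.25 * x ** 4   # illustrative anharmonic potential


if __name__ == "__main__":
    print("harmonic rho(1):", rho(harmonic, 1.0))
    print("harmonic rho''(1):", second_difference(harmonic, 1.0))
    print("quartic  rho''(1):", second_difference(quartic, 1.0))
```

For the harmonic case with $\beta = 1$ the integral as written evaluates to $\pi h$, which gives a direct check of the quadrature.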
This paper is organized as follows. In Sect. 2, Eq. (1.1) is, by the technique of action-angle variables, reduced to a normal form to which a KAM theorem is applicable. In Sect. 3, a KAM theorem involving an infinitely multiple normal frequency is stated, and the "analytic part" of the proof of the theorem is finished. Section 4 is devoted to the "geometric part" of the proof of the theorem. The KAM theorem is proven in Sect. 5. Finally, an Appendix is given. In this paper, most notations (especially in Sect. 3) are taken from the book of Kuksin [K1].
2. Reduced to Normal Form
Let $\dot{x}_n = y_n$. Then (1.1) is a Hamiltonian system with Hamiltonian
\[ H = \sum_{n \in \mathbb{Z}} \Big[ \frac{1}{2} y_n^2 + V(x_n) + \epsilon W(x_{n+1} - x_n) \Big]. \tag{2.1} \]
When $n \in \mathbb{Z} \setminus J$, write $V(x_n) = \frac{\beta^2}{2} x_n^2 + O(|x_n|^3)$. Then (2.1) can be rewritten as
\[ H = \sum_{n \in J} \Big[ \frac{1}{2} y_n^2 + V(x_n) \Big] + \sum_{n \in \mathbb{Z} \setminus J} \Big[ \frac{1}{2} y_n^2 + \frac{1}{2} \beta^2 x_n^2 + O(|x_n|^3) \Big] + \epsilon \sum_{n \in \mathbb{Z}} W(x_{n+1} - x_n). \tag{2.2} \]
We now carry out the standard reduction to action-angle variables (see [Ar], for example). By assumption (V2), the equation $\frac{1}{2} y^2 + V(x) = h$ with $h \in \mathcal{I}$ describes a simple closed curve which encloses the origin $(0, 0)$ in the $(x, y)$-plane. To construct the map $(x, y) \mapsto (\theta, \rho)$, where $\rho$ and $\theta$ are the action and angle variables, respectively, we let $H_0(\rho)$ be the value of the function $\frac{1}{2} y^2 + V(x)$ on the closed curve which encloses area $\rho$ in the $(x, y)$-plane, i.e. we define $H_0(\rho)$ implicitly by
\[ \oint_{y^2/2 + V(x) = H_0(\rho)} y \, dx = \rho. \tag{2.3} \]
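We now define a generating function $S(x, \rho)$ as follows:
\[ S(x, \rho) = \int_{\Gamma^*} y \, dx, \tag{2.4} \]
where $\Gamma^*$ is the part of the closed curve $y^2/2 + V(x) = H_0(\rho)$ connecting the $y$-axis with the point $(x, y)$, oriented clockwise. We define the map $\psi_0 : (\theta, \rho) \mapsto (x, y)$ via
\[ S_x(x, \rho) = y, \qquad S_\rho(x, \rho) = \theta. \]
Then
\[ dx \wedge dy = dx \wedge (S_{xx} \, dx + S_{x\rho} \, d\rho) = S_{x\rho} \, dx \wedge d\rho, \]
\[ d\theta \wedge d\rho = (S_{\rho x} \, dx + S_{\rho\rho} \, d\rho) \wedge d\rho = S_{\rho x} \, dx \wedge d\rho. \]
Thus,
\[ dx \wedge dy = d\theta \wedge d\rho. \tag{2.5} \]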
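Let
\[ \Psi : \quad \begin{cases} (x_n, y_n) = \psi_0(\theta_n, \rho_n), & n \in J, \\ (x_n, y_n) = (\tilde{x}_n / \sqrt{\beta}, \ \sqrt{\beta} \, \tilde{y}_n), & n \in \mathbb{Z} \setminus J. \end{cases} \]
Then
\[ \sum_{n \in J} dx_n \wedge dy_n + \sum_{n \in \mathbb{Z} \setminus J} dx_n \wedge dy_n = \sum_{n \in J} d\theta_n \wedge d\rho_n + \sum_{n \in \mathbb{Z} \setminus J} d\tilde{x}_n \wedge d\tilde{y}_n. \tag{2.6} \]
This implies that $\Psi$ is symplectic. Thus, Hamiltonian (2.1) is transformed into
\[ H = H(\theta, \rho, \tilde{x}, \tilde{y}) = \sum_{n \in J} H_0(\rho_n) + \sum_{n \in \mathbb{Z} \setminus J} \Big[ \frac{1}{2} \beta (\tilde{x}_n^2 + \tilde{y}_n^2) + O(|\tilde{x}_n|^3) \Big] + \epsilon \sum_{n \in \mathbb{Z}} W(x_{n+1} - x_n). \tag{2.7} \]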
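The rescaling used above for the sites $n \in \mathbb{Z} \setminus J$ can be checked on a single site. The sketch below is an illustration only, with hypothetical function names: it verifies that $(x, y) = (\tilde{x}/\sqrt{\beta}, \sqrt{\beta}\,\tilde{y})$ has unit Jacobian determinant (so it preserves $dx \wedge dy$) and that it turns the quadratic part $\frac{1}{2} y^2 + \frac{1}{2} \beta^2 x^2$ into $\frac{1}{2} \beta (\tilde{x}^2 + \tilde{y}^2)$.

```python
import math


def rescale(x, y, beta):
    """Inverse of (x, y) = (x_t / sqrt(beta), sqrt(beta) * y_t): returns (x_t, y_t)."""
    return math.sqrt(beta) * x, y / math.sqrt(beta)


def quadratic_energy(x, y, beta):
    """Quadratic part of the local Hamiltonian in the original variables."""
    return 0.5 * y ** 2 + 0.5 * beta ** 2 * x ** 2


def normal_form_energy(x_t, y_t, beta):
    """Quadratic part in the rescaled variables."""
    return 0.5 * beta * (x_t ** 2 + y_t ** 2)


def jacobian_determinant(beta):
    """det d(x, y)/d(x_t, y_t) for the diagonal map with entries 1/sqrt(beta), sqrt(beta)."""
    return (1.0 / math.sqrt(beta)) * math.sqrt(beta)


if __name__ == "__main__":
    beta, x, y = 2.5, 0.3, -0.7
    x_t, y_t = rescale(x, y, beta)
    print(quadratic_energy(x, y, beta), normal_form_energy(x_t, y_t, beta))
    print(jacobian_determinant(beta))
```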
We can assume that J = {1, . . . , N} without loss of generality, otherwise we can rearrange the subscript n for that end. Set J1 = J ∪ {0}. Let un = (x˜n , y˜n ),|un |2 = x˜n2 + y˜n2 , and P˜ 3 (u) = O(|xn |3 ) = O(|x˜n |3 ), n∈Z\J
P 2 (u) =
n∈Z\J
W (xn+1 − xn ) =
n∈Z\J1
P˜ (ρ, θ, u) =
n∈Z\J1
1 W √ (x˜n+1 − x˜n ) , β
W (xn+1 − xn ),
n∈J1
where xn = xn (ρn , θn ) with 1 ≤ n ≤ N are defined by (2.4). In what follows, we let W (( √1β (·)) ≡ W (·) by the abuse of the notations. Then (2.6) can be written as H = H (ρ, θ, u) =
n∈J
H0 (ρn ) +
1 β|un |2 + P˜ + P 2 + P˜ 3 . 2
(2.8)
n∈Z\J
By assumption (V3), there exists the inverse H0−1 of H0 . Let < = (H0−1 (I))N . For any ξ = (ξ1 , . . . ξN ) ∈ <, let ρ = I + ξ , where I = (I1 , . . . , IN ). Expand H0 (In + ξn ) in ξn by Taylor’s formula : H0 (In + ξn ) = H0 (ξn ) + H0 (ξn )In + O(|In |2 ).
Let ω(ξ ) = (H0 (ξ1 ), . . . , H0 (ξN )). Write ω · I = N i=1 H0 (ξi )Ii . Then (2.7) can be written as 1 H = H (I, θ, u; ξ ) = ω(ξ ) · I + β|un |2 + P˜ (θ, I + ξ, u) 2 (2.9) n∈Z\J + P 2 (u) + P˜ 3 (u) + O(|I |2 ),
where the ξ -dependence constant H0 (ξ ) is omitted since it does not affect the dynamics. Now we need to introduce the domain of the definition for Hamiltonian H . By abuse of notation, denote by + the Hilbert space (|u1n |2 + |u2n |2 )e|n|/a < ∞}, + = {u = (u1n , u2n )n∈Z : (u1n , u2n ) ∈ C2 , u := n∈Z
where the inner product ·, ·+ in + is defined as follows: (un · vn )e|n|/a . u, v+ := n∈Z
Set D0 = {(I, θ, u) ∈ CN × CN × + : |I | < s0 , |I mθ | < δ0 , u < s0 }, 2/3
1/3
where s0 and δ0 are positive constant sufficiently small. Letting P (I, θ, u; ξ ) = P˜ (θ, I + ξ, u),P 3 (I, u; ξ ) = P˜ 3 (u) + O(|I |2 ) and rearranging the subscript of un : 0 → 0; −n → −n; N + n → n for all n ≥ 1, then Hamiltonian (2.7) is of the form H (I, θ, u; ξ ) = ω(ξ ) · I +
1 n∈Z
2
β|un |2 + P (θ, I, u; ξ ) + P 2 + P 3 ,
(2.10)
where H satisfies the following conditions: (A1) Hamiltonian H is analytic in D0 × < and real for real arguments. (A2) On the domain D0 the Hamiltonian H is of short range: P 2 (u; ξ ) =
n∈Z
P (I, u; ξ ) = 3
W (u1n+1 − u1n ),
O(|un |3 ) + O(|I |2 ),
n∈Z
∂P (I, θ, u; ξ ) ≡0 ∂uj
if
|j | > 1,
where u = (un )n∈Z , ∂P /∂uj = (∂P /∂u1j , ∂P /∂u2j ) and uj = (u1j , u2j ). (A3) (Non-degenerate) There is a constant δa > 0 such that on some complex neighbourhood of < ∂ω | | ≥ δa ∂ξ . (A4) −1/3 |P | ≤ K1 , ∇u P ≤ K1 s0 , where K1 is a positive constant,∇ is the gradient with respect to the usual inner product ·, · in the usual square-summable space +2 . Remark 2.1. Condition (A2) is just a special short range. See [F-S-W] for the general notion of short range.
3. A KAM Theorem Involving an Infinitely Multiple Frequency 3.1. Statement of KAM Theorem. Recall < = (H0−1 (I))N is a compact set in RN + . Let ˆ = CN /(2πZ)N . Define the phase space T ˆ × C × + ) (I, θ, u). P=T We now consider a small perturbation H = H0 + P (I, θ, u, ξ ),
ξ ∈ <,
(3.1)
of an infinite dimensional Hamiltonian in the parameter dependent normal form H0 = ω(ξ ) · I +
1 β|un |2 , 2
ξ ∈ <,
(3.2)
n∈Z
on the phase space P with the symplectic structure
dθj ∧ Ij +
1≤j ≤N
n∈Z
du1n ∧ du2n ,
un = (u1n , u2n ) ∈ C2 .
(3.3)
Usually, the frequencies ω are referred to as tangent ( inner) frequencies while (β, β, · · · ) are referred to as normal frequencies. The Hamiltonian equations of motion of H0 are θ˙ = ω,
I˙ = 0,
u˙ = J∞ u,
0 1 . Hence, for each ξ ∈ <, there is an where J∞ = diag(· · · , J, · · · ) and J = −1 0 invariant N -torus:T0N = TN × {0} × {0} for H0 . Our aim in the present paper is to prove the persistence of the torus T0N under the small perturbation P for “most” (in the sense of Lebesgue measure) ξ ∈ <. Note that the normal frequency β is infinitely multiple, i.e. β = ∞. Here is our main theorem. Theorem 3.1. Suppose that the Hamiltonian H = ω(ξ ) · I +
1 β|un |2 + P (I, θ, u; ξ ) + P 2 (u) + P 3 (I, θ, u; ξ ), 2
(3.4)
n∈Z
where (I, θ, u; ξ ) ∈ D0 × <, satisfies conditions (A1–4) in Sect. 2. Then, for a given γ > 0 there is a small constant ∗ = ∗ (<, N, γ , δa ) > 0 such that, if 0 < < ∗ , then there is a Cantor set <∞ ⊂ < with meas<∞ ≥ (meas<) · (1 − O(γ )), an analytic family of torus embedding ;∞ : TN × < → P, and a map ω∞ : < → RN , such that for each ξ ∈ <∞ , the map ;∞ restricted to TN × {ξ } is an analytic embedding of rotational torus with frequencies ω∞ (ξ ) for the Hamiltonian H defined by (3.4).
3.2. Iterative constants and iterative domains. In what follows, we denote by C, C1 , C2 , · · · positive constants which arrive in estimates, and by K, K1 , K2 , · · · positive constants which arrive in lemmas and theorems. Both of them are independent of and the number m of the iteration, and may be different in different parts of the text. Let 2 4 C(m) be the function of m of the form C1 mC2 m or C1 mC2 m or C1 mC2 m . Besides, let K(ω) = maxξ ∈< |ω(ξ )| and K1 (ω) = maxξ ∈< |∂ξ ω(ξ )| As usual, the KAM theorem is proved by the Newton-type iteration procedure which involves an infinite sequence of coordinate changes. In order to make our iteration procedure run, we need the following iterative constants and iterative domains (most of them are taken from [K1]):
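1. $e_0 = 0$ and $e_m = (1^{-2} + \cdots + m^{-2}) \big/ \big(2\sum_{j=1}^{\infty} j^{-2}\big)$ for m ≥ 1 (thus $e_m < 1/2$ for all m ∈ N).
2. $\varepsilon_0 = \varepsilon$, $\varepsilon_m = \varepsilon^{(1+\rho)^m}$, 0 < ρ < 1/9 ($\varepsilon_m$ bounds the size of the perturbation after m iterations).
3. $\delta_m = \delta_0(1 - e_m)$, $\delta_m^j = (1 - \tfrac{j}{6})\delta_m + \tfrac{j}{6}\delta_{m+1}$ ($\delta_m$ measures the size of the analytic domain in the angular variables after m iterations).
4. $U_m = U(\delta_m) = \{\theta \in (\mathbb{C}/2\pi\mathbb{Z})^N : |\mathrm{Im}\,\theta| < \delta_m\}$, $U_m^j = U(\delta_m^j)$.
5. $D_m = U_m \times O(\varepsilon_m^{2/3}, \mathbb{C}^N) \times O(\varepsilon_m^{1/3}, +)$, $D_m^j = U_m^j \times O((2^{-j}\varepsilon_m)^{2/3}, \mathbb{C}^N) \times O((2^{-j}\varepsilon_m)^{1/3}, +)$, where $O(\varepsilon_m^{2/3}, \mathbb{C}^N) = \{I \in \mathbb{C}^N : |I| < \varepsilon_m^{2/3}\}$ and $O(\varepsilon_m^{1/3}, +) = \{u \in + : \|u\| < \varepsilon_m^{1/3}\}$.
6. $\tilde\gamma_{m+1} = \big(\hat m(\hat m + 1) D^{N-1}\gamma^{-1}\big)^{-\hat m} = O(1/C(m))$, $\gamma_{m+1} = (4\beta + 2K_1(\omega))^{-\hat m}\,\tilde\gamma_{m+1} = O(1/C(m))$, where $\hat m = 4(2m+3)^2$ and D is the diameter of the set <.
7. $\tilde\tau_{m+1} = 2N\hat m$, $\tau_{m+1} = \hat m - 1 + \tilde\tau_{m+1} = (2N+1)\hat m - 1$.
8. $M_{m+1} = (1/b_m)|\ln \varepsilon_m| = O(C(m)|\ln\varepsilon|)$, where $b_m = \delta_m - \delta_{m+1}$ ($M_{m+1}$ determines the number of Fourier coefficients we must consider at the (m + 1)-th step of the iteration).
9. $K_{m+1} = 2c_0^{4(2m+5)^2}\,\varepsilon_{(m+1)_*}^{-16(2m+5)^2} = O\big(C(m+2)\,\varepsilon^{-16(2m+5)^2(1+\rho)^{3(m+1)^{1/3}}}\big)$ for m + 1 ≥ m0; here $(m+1)_* = 3(m+1)^{1/3}$ and $c_0 = 4NK(\omega)$, and m0 = m0(ρ) is a fixed constant such that when m ≥ m0 the following inequalities are fulfilled: (1 + ρ)(m + 1)−6 − m−6 > (2N + 1)−1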
2 (m + 1)−6 , 2
2 1/3 (1 + 2)m−3(m+1) (m + 1)−6 (2m + 5)−2 (2m + 3)−2 − 1 > 1. 128 −(1+τ
1/m6
)m
1 10. qm = m , as m ≥ m0 ; qm = ( 8(β+K) γm0 +1 )m Mm+1 m+1 = O(1/(C1 C | ln | 2 )) as m < m0 (it measures the size of the analytic domain in the frequency space).
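The explicitly recoverable constants in the list above can be tabulated numerically. The following is a minimal sketch, assuming the readings $e_m = (1^{-2}+\cdots+m^{-2})/(2\sum_j j^{-2})$, $\varepsilon_m = \varepsilon^{(1+\rho)^m}$, $\delta_m = \delta_0(1-e_m)$, $\hat m = 4(2m+3)^2$, $\tilde\tau_{m+1} = 2N\hat m$, $\tau_{m+1} = (2N+1)\hat m - 1$ and $M_{m+1} = |\ln\varepsilon_m|/(\delta_m - \delta_{m+1})$; the function names and the sample values of ε, ρ, δ0, N are illustrative only.

```python
import math

ZETA2 = math.pi ** 2 / 6.0  # closed form of sum_{j>=1} j^(-2)

def e(m):
    """e_0 = 0 and e_m = (1^-2 + ... + m^-2) / (2 * sum_{j>=1} j^-2)."""
    if m == 0:
        return 0.0
    return sum(j ** -2 for j in range(1, m + 1)) / (2.0 * ZETA2)

def eps(m, eps0, rho):
    """eps_m = eps0^((1+rho)^m): size of the perturbation after m steps."""
    return eps0 ** ((1.0 + rho) ** m)

def delta(m, delta0):
    """delta_m = delta0 * (1 - e_m): width of the angular strip."""
    return delta0 * (1.0 - e(m))

def m_hat(m):
    """m^ = 4(2m+3)^2, attached to the step m -> m+1."""
    return 4 * (2 * m + 3) ** 2

def tau_tilde(m, N):
    """tau~_{m+1} = 2 N m^."""
    return 2 * N * m_hat(m)

def tau(m, N):
    """tau_{m+1} = (2N+1) m^ - 1."""
    return (2 * N + 1) * m_hat(m) - 1

def fourier_cutoff(m, eps0, rho, delta0):
    """M_{m+1} = |ln eps_m| / (delta_m - delta_{m+1})."""
    b_m = delta(m, delta0) - delta(m + 1, delta0)
    return abs(math.log(eps(m, eps0, rho))) / b_m

if __name__ == "__main__":
    for m in range(4):
        print(m, e(m), eps(m, 1e-3, 0.1), delta(m, 1.0), tau(m, 2),
              fourier_cutoff(m, 1e-3, 0.1, 1.0))
```

The sketch makes the two competing scales visible: $\varepsilon_m$ decays super-exponentially in m, while $\delta_m$ stays above $\delta_0/2$ and the cutoff $M_{m+1}$ grows only like a power of m times $|\ln\varepsilon|$.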
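Remark 3.1. The domains $D_m$ form a family of neighborhoods in P of the torus $T_0^N = (\mathbb{R}/2\pi\mathbb{Z})^N \times \{0\} \times \{0\} \subset P$. The real part of the domain $U_m$ is the N-torus $\mathbb{R}^N/(2\pi\mathbb{Z})^N$, and the real part of $D_m$ is $D_m^r$; here $D_m^r = \mathbb{R}^N/(2\pi\mathbb{Z})^N \times O(\varepsilon_m^{2/3}, \mathbb{R}^N) \times O(\varepsilon_m^{1/3}, *+)$.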
Construction of Quasi-Periodic Breathers via KAM Technique
71
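These domains form decreasing families with m increasing:
\[ U_0 \supset U_1 \supset \cdots \supset U_m \supset \cdots \supset \mathbb{R}^N/(2\pi\mathbb{Z})^N, \]
\[ D_0 \supset D_1 \supset \cdots \supset D_m \supset \cdots \supset U(\tfrac{1}{2}\delta_0) \times \{0\} \times \{0\} \supset T_0^N, \]
\[ D_0^r \supset D_1^r \supset \cdots \supset D_m^r \supset \cdots \supset T_0^N. \]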
3.3. Iterative lemma. In the following, we denote by $\hat f(k;\xi)$ the k-th Fourier coefficient of the function f(θ; ξ) with respect to the variable θ, and by $\|\cdot\|_\infty$ the sup-norm for any matrix or vector of finite order. If B1 and B2 are k × k and l × l matrices with k < l, respectively, we define
\[ B_1 + B_2 = \begin{pmatrix} B_1 & 0 \\ 0 & 0 \end{pmatrix}_{l\times l} + B_2. \]
The tensor product of the m × n matrix $B_1 = (b^1_{ij})$ and the k × l matrix $B_2$ is the (mk) × (nl) matrix defined by
\[ B_1 \otimes B_2 = (b^1_{ij}B_2) = \begin{pmatrix} b^1_{11}B_2 & \cdots & b^1_{1n}B_2 \\ \cdots & \cdots & \cdots \\ b^1_{m1}B_2 & \cdots & b^1_{mn}B_2 \end{pmatrix}. \]
Set
\[ H_l = \mathrm{diag}(\underbrace{\beta, \cdots, \beta}_{2l}). \]
For any set S and functions f : S → + and g : S → CN, we write
\[ \|f\|^S = \sup_{x\in S}\|f(x)\| \quad\text{and}\quad |g|^S = \sup_{x\in S}|g(x)|. \]
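The tensor product defined above is the usual Kronecker product. The following minimal sketch (pure Python, nested lists; the helper name `kron` is illustrative) spells out the index bookkeeping and shows, as an example, how a block-diagonal matrix diag(J, J) arises as $E_2 \otimes J$ with $E_2$ the 2 × 2 identity.

```python
def kron(B1, B2):
    """Kronecker product of an m x n matrix B1 and a k x l matrix B2.

    The (i, j) entry of the (mk) x (nl) result is
    B1[i // k][j // l] * B2[i % k][j % l],
    i.e. the block in position (i // k, j // l) is B1's entry times B2.
    """
    m, n = len(B1), len(B1[0])
    k, l = len(B2), len(B2[0])
    return [[B1[i // k][j // l] * B2[i % k][j % l] for j in range(n * l)]
            for i in range(m * k)]

if __name__ == "__main__":
    J = [[0, 1], [-1, 0]]
    E2 = [[1, 0], [0, 1]]
    for row in kron(E2, J):   # block-diagonal matrix diag(J, J)
        print(row)
```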
Let $u(l) = (u_n)_{|n|\le l}$; then $\tilde u = (\cdots, 0, (u_n)_{|n|\le l}, 0, \cdots)$ ∈ +. Note that each element $u_n$ of u ∈ + is in $\mathbb{C}^2$. Define $\|u(l)\| := \|\tilde u\|$. If B is a matrix of finite order, let $\tilde B = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}_{\infty\times\infty}$. Define $|B| = |\tilde B|$, where |·| is the operator norm induced by ‖·‖. Let α > 0 be a constant, small enough but fixed. By abuse of notation, we write kα as α when k is in some fixed compact subset of R+, in order to simplify notation.
Lemma 3.1. Consider a family of Hamiltonians $H_l$ (0 ≤ l ≤ m):
\[ H_l = \omega_l(\xi)\cdot I + \frac{1}{2}\langle A_l(\xi)u(l), u(l)\rangle + \frac{1}{2}\sum_{|j|>l}\beta|u_j|^2 + \varepsilon_l P_l(I,\theta,u;\xi) + P^2 + P^3, \tag{3.5} \]
where (I, θ, u; ξ) ∈ Dl × Ol, Al is a matrix of order 4l + 2, u(l) = (uj)|j|≤l, and H is of short range:
(l.0)
\[ P^3 = \sum_{n\in\mathbb{Z}} O(|u_n|^3) + O(|I|^2), \qquad P^2(u) = \sum_{n\in\mathbb{Z}} W(u^1_{n+1} - u^1_n), \qquad \frac{\partial P_l}{\partial u_j} \equiv 0 \ \text{ for } |j| > l+1. \]
Write $P_l := P_{2l} + P_{3l}$, where $P_{3l} = P_l - P_{2l}$ and
\[ P_{2l} = h_l^{1\theta}(\theta;\xi) + I\cdot h_l^{1I}(\theta;\xi) + \langle u(l+1), h_l^{1u}(\theta;\xi)\rangle + \frac{1}{2}\langle u(l+1), h_l^{1uu}(\theta;\xi)u(l+1)\rangle, \tag{3.6} \]
here $h^{1\theta}$ being a scalar, $h^{1I}$ an N-vector, $h^{1u}$ a (4(l + 1) + 2)-vector, and $h^{1uu}$ a symmetric matrix of order 4(l + 1) + 2. Assume that, for 0 ≤ l ≤ m, the following conditions hold true:
(l.1) Hl is analytic and real for real argument in the domain Dl × Ol;
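(l.2) $\omega_l(\xi) = \omega(\xi) + \sum_{j=0}^{l-1}\varepsilon_j\hat h_j^{1I}(0;\xi)$ for l ≥ 1, and $\omega_0(\xi) = \omega(\xi)$, with $|\partial_\xi\omega(\xi)| \ge \delta_a$ for ξ ∈ Ol; the $\hat h_j^{1I}(0;\xi)$ are analytic, real for real arguments in Oj, and $|\hat h_j^{1I}(0;\xi)|^{O_j} \le \varepsilon_j^{-2/3}$;
(l.3) $A_l(\xi) = H_{2l+1} + \sum_{j=0}^{l-1}\varepsilon_j\hat h_j^{uu}(0;\xi)$ for l ≥ 1, and $A_0 = H_1$; the $\hat h_j^{uu}$ are analytic, real for real arguments in Oj, and $\|\hat h_j^{uu}\|_\infty^{O_j} \le \varepsilon_j^{-2/3-\alpha}$;
(l.4) $|P_l| \le C(l)$ and $\|\nabla_u P_l\| \le C(l)\varepsilon_l^{-1/3}$ in the domain $D_l^2 \times O_l$;
(l.5) meas(< \ <l) ≤ Cγ el, and whenever one of the following conditions: (i) $K_{l-1} \le |k| < K_l$ with l > m0, (ii) l = m0 and $0 \ne |k| < K_{m_0}$, (iii) l < m0 and $0 \ne |k| < M_l$, is fulfilled, the following inequalities hold true:
\[ |k\cdot\omega_l|^{-1} \le |k|^{\tau_l}/\gamma_l, \qquad |k\cdot\omega_l \pm\beta\mp\beta|^{-1} \le |k|^{\tau_l}/\gamma_l, \]
\[ \big\|(\sqrt{-1}\,k\cdot\omega_l E_{4l+2} - A_lJ_{2l+1})^{-1}\big\|_\infty \le |k|^{\tau_l}/\gamma_l, \]
\[ \big\|(\sqrt{-1}\,k\cdot\omega_l E_{(4l+2)^2} - E_{4l+2}\otimes(J_{2l+1}A_l) - (J_{2l+1}A_l)\otimes E_{4l+2})^{-1}\big\|_\infty \le |k|^{\tau_l}/\gamma_l, \]
\[ \big\|(\sqrt{-1}\,k\cdot\omega_l E_{16l+8} - E_4\otimes(J_{2l+1}A_l) - (\beta J_2)\otimes E_{4l+2})^{-1}\big\|_\infty \le |k|^{\tau_l}/\gamma_l. \]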
Then there is a positive constant ε∗ small enough such that, if 0 < ε < ε∗, there is a set <m+1 ⊂ <m with meas(< \ <m+1) ≤ Cγ em+1, and a change of variables /m+1 : Dm+1 × Om+1 → Dm × Om, analytic and real for real argument in Dm+1 × Om+1, where Om+1 is the complex qm+1-neighborhood of <m+1. Furthermore, the new Hamiltonian Hm+1 = Hm ◦ /m+1 is of the form
\[ H_{m+1} = \omega_{m+1}(\xi)\cdot I + \frac{1}{2}\langle A_{m+1}(\xi)u(m+1), u(m+1)\rangle + \frac{1}{2}\sum_{|j|>m+1}\beta|u_j|^2 + \varepsilon_{m+1}P_{m+1}(I,\theta,u;\xi) + P^2 + P^3 \tag{3.7} \]
and satisfies all the above conditions (l.0)–(l.5) with l replaced by m + 1. This lemma is the essential part of Theorem 2. Subsections 3.4, 3.5, and Sect. 4 will be devoted to the proof.
3.4. Derivation of homological equations. Step 1. Splitting the perturbation. Let us consider the Hamiltonian Hm. Following Kuksin [K1], we split the perturbation $\varepsilon_mP_m$ (i.e. l = m in (3.6)) into an “essential” part $\varepsilon_mP'_{2m}$, which is linear in I and quadratic in u, and an unessential part $\varepsilon_mP'_{3m}$. Write $P_m := P'_{2m} + P'_{3m}$ and
\[ P'_{2m} = h_m^{1\theta}(\theta;\xi) + I\cdot h_m^{1I}(\theta;\xi) + \langle u^*, h_m^{1u}(\theta;\xi)\rangle + \langle u^*, h_m^{1uu}(\theta;\xi)u^*\rangle, \tag{3.8} \]
where $u^* := u(m+1) = (u_n)_{|n|\le m+1}$, $h_m^{1\theta}$ being a scalar, $h_m^{1I}$ an N-vector, $h_m^{1u}$ a (4(m + 1) + 2)-vector, and $h_m^{1uu}$ a symmetric matrix of order 4(m + 1) + 2.
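Lemma 3.2. If 0 < ε < ε∗ ≪ 1, then the following estimates hold true:
a)
\[ |h_m^{1\theta}|^{U_m\times O_m} \le C(m), \qquad |h_m^{1I}|^{U_m\times O_m} \le C(m)\varepsilon_m^{-2/3}, \qquad \|h_m^{1u}\|_\infty^{U_m\times O_m} \le C(m)\varepsilon_m^{-1/3-\alpha}, \tag{3.9} \]
b) $h_m^{1uu}$ is a symmetric matrix of order 4(m + 1) + 2, real for real argument, and
\[ \|h_m^{1uu}\|_\infty^{U_m\times O_m} \le C(m)\varepsilon_m^{-2/3-\alpha}, \tag{3.10} \]
c) $\varepsilon_m|P'_{3m}|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}$, $\varepsilon_m\|\nabla_uP'_{3m}\|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}^{2/3}$,
d) the functions $P'_{2m}$ and $P'_{3m}$ are analytic, real for real arguments, and do not involve the variable uj with |j| > m + 1.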
Proof. First, we give the proof for statement b). Since $|P_m| \le C(m)$ by Assumption (l.4), we get that $|P'_{2m}|^{D_m^2\times O_m} \le C(m)$. Therefore, $|\langle h_m^{1uu}u^*, u^*\rangle| \le C(m)$. Write $h_m^{1uu} = (h_m^{1uu}(i,j))_{|i|,|j|\le m+1}$. For any k, l with |k|, |l| ≤ m + 1, let $u_0^* = (u_j)_{|j|\le m+1}$ with
\[ u_j = \begin{cases} \frac{1}{4}\varepsilon_m^{1/3}\exp(-|k|/a), & j = k, \\ \frac{1}{4}\varepsilon_m^{1/3}\exp(-|l|/a), & j = l, \\ 0, & \text{otherwise}. \end{cases} \]
Then $\|(\ldots, 0, u_0^*, 0, \ldots)\| \le (2^{-2}\varepsilon_m)^{1/3}$. Thus, $|\langle h_m^{1uu}u_0^*, u_0^*\rangle| \le C(m)$. At the same time,
\[ |\langle h_m^{1uu}u_0^*, u_0^*\rangle| = \Big|\frac{1}{16}\varepsilon_m^{2/3}\exp(-(|k|+|l|)/a)\,h_m^{1uu}(k,l)\Big|. \]
Then,
\[ |h_m^{1uu}(k,l)| \le C(m)\varepsilon_m^{-2/3}\exp((|k|+|l|)/a) \le C(m)\varepsilon_m^{-2/3}\exp(2(m+1)/a) \le C(m)\varepsilon_m^{-2/3-\alpha}, \]
that is, $\|h_m^{1uu}\|_\infty^{U_m^2\times O_m} \le C(m)\varepsilon_m^{-2/3-\alpha}$. This proves statement b). The proof of the last inequality in statement a) is similar. It follows from (l.0) with l = m that d) holds true. The remaining argument can be found in [K1, pp. 59–60]. We omit it. □
Step 2. Truncation. Let
\[ \omega_{m+1}(\xi) = \omega(\xi) + \sum_{j=0}^{m}\varepsilon_j\hat h_j^{1I}(0;\xi), \tag{3.11} \]
\[ A_{m+1}(\xi) = H_{2(m+1)+1} + \sum_{j=0}^{m}\varepsilon_j\hat h_j^{1uu}(0;\xi). \tag{3.12} \]
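Then, by Lemma 3.2, the frequencies satisfy assumptions (l.2) and (l.3) with l = m + 1. Let
\[ \begin{aligned}
h^I_{mE}(\theta;\xi) &= \sum_{0\ne k,\,|k|\le M_{m+1}} \hat h_m^{1I}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, &
h^I_{mR}(\theta;\xi) &= \sum_{|k|>M_{m+1}} \hat h_m^{1I}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \\
h^\theta_{mE}(\theta;\xi) &= \sum_{0\ne k,\,|k|\le M_{m+1}} \hat h_m^{1\theta}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, &
h^\theta_{mR}(\theta;\xi) &= \sum_{|k|>M_{m+1}} \hat h_m^{1\theta}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \\
h^{1u}_{mE}(\theta;\xi) &= \sum_{|k|\le M_{m+1}} \hat h_m^{1u}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, &
h^{1u}_{mR}(\theta;\xi) &= \sum_{|k|>M_{m+1}} \hat h_m^{1u}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \\
h^{uu}_{mE}(\theta;\xi) &= \sum_{0\ne k,\,|k|\le M_{m+1}} \hat h_m^{1uu}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, &
h^{uu}_{mR}(\theta;\xi) &= \sum_{|k|>M_{m+1}} \hat h_m^{1uu}(k;\xi)e^{\sqrt{-1}k\cdot\theta}.
\end{aligned} \]
Write
\[ P_{2m} = h^\theta_{mE}(\theta;\xi) + I\cdot h^I_{mE}(\theta;\xi) + \langle u^*, h^{1u}_{mE}(\theta;\xi)\rangle + \langle u^*, h^{uu}_{mE}(\theta;\xi)u^*\rangle, \tag{3.13} \]
\[ \hat P_{2m} = h^\theta_{mR}(\theta;\xi) + I\cdot h^I_{mR}(\theta;\xi) + \langle u^*, h^{1u}_{mR}(\theta;\xi)\rangle + \langle u^*, h^{uu}_{mR}(\theta;\xi)u^*\rangle, \tag{3.14} \]
\[ P_{3m} = \hat P_{2m} + P'_{3m}, \tag{3.15} \]
where $P'_{3m}$ denotes the unessential part of the perturbation introduced in Step 1.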
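The choice of the cutoff $M_{m+1} = |\ln\varepsilon_m|/b_m$ can be illustrated numerically. The sketch below assumes a single angle (N = 1) and Fourier coefficients that, after shrinking the strip by $b = \delta_m - \delta_{m+1}$, are bounded by $e^{-|k|b}$; it checks that the discarded tail $\sum_{|k|>M} e^{-|k|b}$ is of order $\varepsilon_m$. The function names and sample values are illustrative, not taken from the text.

```python
import math

def truncation_order(eps_m, b):
    """Cutoff M = |ln eps_m| / b, so that exp(-M * b) = eps_m."""
    return abs(math.log(eps_m)) / b

def tail_sum(M, b, n_terms=5000):
    """Brute-force value of sum_{|k| > M} exp(-|k| b) over integers k."""
    k0 = math.floor(M) + 1          # first integer strictly larger than M
    return 2.0 * sum(math.exp(-k * b) for k in range(k0, k0 + n_terms))

def tail_bound(eps_m, b):
    """Geometric-series bound 2 * eps_m / (1 - exp(-b)) for the tail."""
    return 2.0 * eps_m / (1.0 - math.exp(-b))

if __name__ == "__main__":
    eps_m, b = 1e-4, 0.05
    M = truncation_order(eps_m, b)
    print(M, tail_sum(M, b), tail_bound(eps_m, b))
```

In this toy setting the remainder part (the terms with subscript mR) gains one full factor of $\varepsilon_m$ up to the harmless constant $2/(1-e^{-b})$, which is the mechanism behind the estimates of the remainder terms.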
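Then
\[ P_m = P_{2m} + P_{3m}. \tag{3.16} \]
Thus, we can rewrite $H_m$ as
\[ H_m = H_{0m+1} + \varepsilon_mP_{2m} + \varepsilon_mP_{3m} + P^2 + P^3, \tag{3.17} \]
where
\[ H_{0m+1} = \omega_{m+1}\cdot I + \frac{1}{2}\langle u^*, A_{m+1}u^*\rangle + \frac{1}{2}\sum_{|j|>m+1}\beta|u_j|^2, \tag{3.18} \]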
where we have omitted the ξ -depending constant hˆ θ (0; ξ ) since it does not affect the dynamics. Claim 1. 2 , m |hImR |Um+1 ×Om ≤ C(m) m ,
m |hθmR |Um+1 ×Om ≤ C(m) m 2/3
U
m+1
m h1u mR ∞
×Om
5/3−α
≤ C(m) m
U
m+1 , m huu mR ∞
In fact, for (θ, ξ ) ∈ Um+1 × Om ,
m huu mR ∞ = m (by Cauchy’s Theorem) ≤ m
|k|>Mm+1
|k|>Mm+1
(by Lemma 3.2. a) ) ≤ m
×Om
4/3−α
≤ C(m) m
hˆ 1uu m (k; ξ )e
√
−1k·θ
(3.19)
.
∞
m ×Om e −|k|δm |e h1uu U ∞
√
−1k·θ
|
−2/3−α
C(m)e−|k|(δm −δm+1 ) m
|k|>Mm+1 −2/3−α
2 ≤ C(m) m
m
;
the remaining arguments are similar to the above. Claim 2. 2 2/3
m |Pˆ2m |Dm ×Om ≤ C(m + 1) m+1 , m ∇u Pˆ2m Dm ×Om ≤ C(m + 1) m+1 . Proof. It is sufficient to check that each term in Pˆ2m satisfies the above estimate. We ∗ show only that u∗ , huu mR u satisfies the above estimate: ∗ huu
m |u∗ , huu mR u | = m | mR (i, j )ui uj | |i|,|j |≤m+1
≤ m huu mR ∞
|i|,|j |≤m+1
≤ C(m + 1) m huu mR ∞ ≤ C(m + 1) m huu mR ∞ ≤
|ui uj |
|ui |2
|i|≤m+1
|ui |2 exp(|i|/a)
|i|≤m+1
2 C(m + 1) m huu mR ∞ u 1−2−α 1+2 4/3−α 2/3 C(m + 1) m
m = C(m + 1) m
m
(Claim 1) ≤ ≤ C(m + 1) m+1 .
And for (I, θ, u; ξ ) ∈ Dm × Om we have ∗ uu ∗
m ∇u u∗ , huu mR u ∞ = 2 m hmR u ∞ uu = 2 m hmR (i, j )uj |i|≤m+1 |j |≤m+1
≤ 2 m max | |i|≤m+1
≤
2 m huu mR ∞
|j |≤m+1
|uj |
|j |≤m+1
1/2 ≤ 2 m huu ( mR ∞ (m + 1)
≤ ≤
∞
huu mR (i, j )uj |
|uj |2 exp(|j |/a))1/2
|j |≤m+1
1/2 2 m huu u mR ∞ (m + 1) 5/3−α . 2(m + 1)1/2 m
∗ This implies that m ∇u u∗ , huu mR u ≤ C(m + 1) m+1 . 2/3
!
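In view of Claim 2 and Lemma 3.2 c), we get that
\[ |\varepsilon_mP_{3m}|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}, \qquad \varepsilon_m\|\nabla_uP_{3m}\|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}^{2/3}. \tag{3.20} \]
By (3.20) and Lemma 3.2, we have
Lemma 3.2′. If 0 < ε < ε∗ ≪ 1, then the following estimates hold true:
a) $|h^\theta_{mE}|^{U_m\times O_m} \le C(m)$, $|h^I_{mE}|^{U_m\times O_m} \le C(m)\varepsilon_m^{-2/3}$, $\|h^{1u}_{mE}\|_\infty^{U_m\times O_m} \le C(m)\varepsilon_m^{-1/3-\alpha}$,
b) $h^{uu}_{mE}$ is a symmetric matrix of order 4(m + 1) + 2, real for real argument, and $\|h^{uu}_{mE}\|_\infty^{U_m\times O_m} \le C(m)\varepsilon_m^{-2/3-\alpha}$;
c) $\varepsilon_m|P_{3m}|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}$, $\varepsilon_m\|\nabla_uP_{3m}\|^{D_{m+1}\times O_m} \le C(m+1)\varepsilon_{m+1}^{2/3}$;
d) the functions $P_{2m}$ and $P_{3m}$ are analytic, real for real argument, and do not involve the variable uj with |j| > m + 1.
Step 3. Derivation of the homological equations. Consider an auxiliary Hamiltonian
\[ F = f^\theta_{mE}(\theta;\xi) + I\cdot f^I_{mE}(\theta;\xi) + \langle u^*, f^{1u}_{mE}(\theta;\xi)\rangle + \langle u^*, f^{uu}_{mE}(\theta;\xi)u^*\rangle, \tag{3.21} \]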
where each term has the same meaning as that in (3.13). In the following we omit t the subscript mE in fmE ’s, in order to simplify notation. Let Sm+1 be the flow of the Hamiltonian vector field with its Hamiltonian m F . Let t /tm+1 (I, θ, u) = (Sm+1 (I, θ, u∗ ), u∗∗ ),
here
u = (u∗ , u∗∗ ).
Write /m+1 = /1m+1 . Then it is symplectic transformation. Thus the Hamiltonian Hm is transformed by it into a new one: Hm+1 = Hm ◦ /m+1 = (H0m+1 + m P2m + m P3m + P 2 + P 3 ) ◦ /m+1
(3.22)
= H0m+1 + m {H0m+1 , F } + m P2m + m+1 Pm+1 + P 2 + P 3 , where
m+1 Pm+1 1 1 2 2 = m (1 − t){{H0m+1 , F }, F } ◦ /tm+1 dt + m {P2m , F } ◦ /tm+1 dt 0
0
+ m P3m ◦ /1m+1 + m
1 0
{P 2 , F }◦ /tm+1 dt + m
1 0
{P 3 , F }◦ /tm+1 dt,(3.23)
here {· , ·} is Poisson bracket with the symplectic structure dθ ∧ dI + Clearly, we hope to find F such that
{H0m+1 , F } + P2m ≡ 0;
1 j ∈Z duj
∧ du2j . (3.24)
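thus this equation leads to the following homological equations:
\[ \frac{\partial f^\theta}{\partial\omega} = h^\theta_{mE}, \qquad \frac{\partial f^I}{\partial\omega} = h^I_{mE}, \tag{3.25} \]
\[ \frac{\partial f^{1u}}{\partial\omega} - A_{m+1}J_{2m+3}f^{1u} = h^{1u}_{mE}, \tag{3.26} \]
\[ \frac{\partial f^{uu}}{\partial\omega} + f^{uu}J_{2m+3}A_{m+1} - A_{m+1}J_{2m+3}f^{uu} = h^{uu}_{mE}, \tag{3.27} \]
where $\dfrac{\partial}{\partial\omega} = \omega_{m+1}\cdot\dfrac{\partial}{\partial\theta}$.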
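The scalar equations in (3.25) are solved mode by mode: for a trigonometric polynomial h without zero mode, $\omega\cdot\partial_\theta f = h$ gives $\hat f(k) = \hat h(k)/(\sqrt{-1}\,k\cdot\omega)$. The following is a minimal sketch with two angles; the frequency vector, the coefficient table and the helper names are illustrative assumptions. The result is checked against a central finite difference along the direction ω.

```python
import cmath
import math

def dot(k, v):
    return sum(kj * vj for kj, vj in zip(k, v))

def solve_homological(h_hat, omega):
    """Return f_hat with f_hat[k] = h_hat[k] / (i k.omega) for every mode k != 0."""
    f_hat = {}
    for k, coeff in h_hat.items():
        divisor = dot(k, omega)
        if divisor == 0:
            raise ZeroDivisionError("resonant mode k = %r" % (k,))
        f_hat[k] = coeff / (1j * divisor)
    return f_hat

def evaluate(hat, theta):
    """Evaluate sum_k hat[k] * exp(i k.theta)."""
    return sum(c * cmath.exp(1j * dot(k, theta)) for k, c in hat.items())

def derivative_along(hat, omega, theta, step=1e-5):
    """Central finite difference of the function along the direction omega."""
    plus = [t + step * w for t, w in zip(theta, omega)]
    minus = [t - step * w for t, w in zip(theta, omega)]
    return (evaluate(hat, plus) - evaluate(hat, minus)) / (2.0 * step)

if __name__ == "__main__":
    omega = (1.0, math.sqrt(2.0))
    h_hat = {(1, 0): 0.5, (-1, 0): 0.5, (1, -1): 0.25j, (-1, 1): -0.25j}
    f_hat = solve_homological(h_hat, omega)
    theta = (0.3, 1.1)
    print(derivative_along(f_hat, omega, theta), evaluate(h_hat, theta))
```

The divisor $k\cdot\omega$ is exactly the small divisor controlled by condition (l.5); the matrix equations (3.26) and (3.27) are handled in the same way, with the scalar divisor replaced by a matrix to be inverted.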
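3.5. Solutions to the homological equations and investigation of the transformation /m. Recall that each term of the Hamiltonian F in (3.21) has the same meaning as that in (3.13), that is,
\[ f^\theta = \sum_{0<|k|\le M_{m+1}} \hat f^\theta(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \qquad f^I = \sum_{0<|k|\le M_{m+1}} \hat f^I(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \]
\[ f^{1u} = \sum_{|k|\le M_{m+1}} \hat f^{1u}(k;\xi)e^{\sqrt{-1}k\cdot\theta}, \qquad f^{uu} = \sum_{0<|k|\le M_{m+1}} \hat f^{uu}(k;\xi)e^{\sqrt{-1}k\cdot\theta}. \]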
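In view of the homological equations (3.25)–(3.27), we get that
\[ \hat f^\theta(k;\xi) = \frac{1}{\sqrt{-1}\,k\cdot\omega_{m+1}}\hat h^\theta(k;\xi), \tag{3.28} \]
\[ \hat f^I(k;\xi) = \frac{1}{\sqrt{-1}\,k\cdot\omega_{m+1}}\hat h^I(k;\xi), \tag{3.29} \]
\[ \hat f^{1u}(k;\xi) = \big(\sqrt{-1}\,k\cdot\omega_{m+1}E_{4m+6} - A_{m+1}J_{2m+3}\big)^{-1}\hat h^{1u}(k;\xi), \tag{3.30} \]
\[ \big(\sqrt{-1}\,k\cdot\omega_{m+1}E_{4m+6} - A_{m+1}J_{2m+3}\big)\hat f^{uu}(k;\xi) + \hat f^{uu}(k;\xi)J_{2m+3}A_{m+1} = \hat h^{uu}(k;\xi), \tag{3.31} \]
where $0 < |k| \le M_{m+1}$ in (3.28), (3.29) and (3.31), and $0 \le |k| \le M_{m+1}$ in (3.30).
Lemma 3.3. There is a Hamiltonian F, analytic and real for real arguments in the domain $D_m^1\times O_{m+1}$, which solves the homological equations (3.25)–(3.27), where $O_{m+1}$ is constructed by Lemmas 4.3 and 4.4. Moreover, the following estimates hold true:
\[ |f^\theta|^{U_m^1\times O_{m+1}} < \varepsilon_m^{-\alpha}, \tag{3.32} \]
\[ |f^I|^{U_m^1\times O_{m+1}} < \varepsilon_m^{-\frac{2}{3}-\alpha}, \tag{3.33} \]
\[ \|f^{1u}\|_\infty^{U_m^1\times O_{m+1}} < \varepsilon_m^{-\frac{1}{3}-\alpha}, \tag{3.34} \]
\[ \|f^{uu}\|_\infty^{U_m^1\times O_{m+1}} < \varepsilon_m^{-\frac{2}{3}-\alpha}. \tag{3.35} \]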
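Proof. We give the proofs of (3.32) and (3.35) only. The proofs of (3.33) and (3.34) are similar to that of (3.32). When m + 1 ≤ m0, the proof is easily obtained from (3.28), (3.31), Lemmas A.3 and 4.4 (ii) and (iii), in view of $M_{m_0} < K_{m_0}$. We omit the details. In the following we assume that m + 1 > m0. By the definitions of $M_{m+1}$ and $K_{m+1}$ it is easy to check that $M_{m+1} \le K_{m+1}$ if ε is small enough. Thus, when $|k| \le M_{m+1}$, one of the following two cases occurs:
Case 1. There is an integer l with m0 ≤ l ≤ m such that $K_l < |k| \le K_{l+1}$.
Case 2. $|k| \le K_{m_0}$, and m + 1 > m0.
First, we give the proof for Case 1. Let κ = m − l. Write
\[ \tilde A = \begin{pmatrix} A_{l+1} & \\ & B_{2\kappa} \end{pmatrix}, \qquad B_{2\kappa} = \begin{pmatrix} \beta & & \\ & \ddots & \\ & & \beta \end{pmatrix}_{2\kappa\times2\kappa}, \]
\[ \tilde r = \varepsilon_{l+1}\hat h^{1I}_{l+1}(0;\xi) + \cdots + \varepsilon_m\hat h^{1I}_m(0;\xi) \ \text{ if } l < m, \qquad \tilde r = 0 \ \text{ if } l = m; \tag{3.36} \]
\[ \tilde R = \varepsilon_{l+1}\hat h^{uu}_{l+1}(0;\xi) + \cdots + \varepsilon_m\hat h^{uu}_m(0;\xi) \ \text{ if } l < m, \qquad \tilde R = 0 \ \text{ if } l = m. \tag{3.37} \]
Then
\[ \omega_{m+1} = \omega_{l+1} + \tilde r, \qquad A_{m+1} = \tilde A + \tilde R. \tag{3.38} \]
By Lemma 3.2 and Assumption (l.3), we have
\[ |k\cdot\tilde r| \le 2K_{l+1}\varepsilon_{l+1}^{1/3}, \qquad \|\tilde R\| \le 2\varepsilon_{l+1}^{1/3}. \tag{3.39} \]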
For Kl < |k| ≤ Kl+1 , |k · ωm+1 | = |k · ωl+1 + k · r˜ | ≥ |k · ωl+1 | − |k · r˜ |,
1/3 ((3.39), (4.49) below) ≥ γl+1 /|k|τl+1 1 − 2|k|τl+1 Kl+1 l+1 /γl+1 , ≥
1 1 γl+1 /|k|τl+1 ≥ γm+1 /|k|τm+1 . 2 2
where we have used the fact 2|k|τl+1 Kl+1 l+1 /γl+1 < 1/2 which is from the last inequality in Item 9 in Sect. 3.2. Hence, 1/3
|f θ |Um ×Om+1 | ≤ 1
|√
0<|k|≤Mm+1
≤
1 −1k · ωm+1
hˆ θ (k; ξ )|Om+1 |e
√
1 −1k·θ Um
|
1 (|k|τm+1 /γm+1 )|hˆ θ (k; ξ )|Om e|k|δm ,
0<|k|≤Mm+1
(Cauchy’s estimate) ≤
(|k|τm+1 /γm+1 )|hθmE |Um ×Om e−|k|(δm −δm ) , 1
0<|k|≤Mm+1
(Lemma 3.2 ) ≤
(|k|τm+1 /γm+1 )C(m)e−|k|(δm −δm+1 )/6
0<|k|≤Mm+1
≤ (C(m)/γm+1 )
|k|τm+1 e−|k|δ0 /6(m+1) , 2
k
(Lemma A.1) ≤ (C(m)/γm+1 )(τm+1 /e)τm+1 ((m + 1)2 /3δ0 )τm+1 +N −α ≤ C1 mC2 m δ0−C3 m γ −C4 m ≤ m . 2
This gives the proof of (3.32). We are now in a position to prove the estimate (3.35). Rewrite (3.31) as
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4m+6} - \tilde AJ_{2m+3}\big)\hat f^{uu}(k;\xi) + \hat f^{uu}(k;\xi)J_{2m+3}\tilde A + \hat f^{uu}(k;\xi)J_{2m+3}\tilde R - \tilde RJ_{2m+3}\hat f^{uu}(k;\xi) + (k\cdot\tilde r)\hat f^{uu} = \hat h^{uu}(k;\xi). \tag{3.40} \]
In order to solve the last equation, we first need to solve the following equation, which is an approximation to (3.40):
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4m+6} - \tilde AJ_{2m+3}\big)\hat f^{uu}(k;\xi) + \hat f^{uu}(k;\xi)J_{2m+3}\tilde A = C, \tag{3.41} \]
where C, depending analytically on ξ ∈ Om+1 and real for real argument, is a (4m + 6) × (4m + 6) matrix. For convenience, we simply write (3.41) as
\[ \Gamma\hat f^{uu} = C. \tag{3.41$^*$} \]
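In order to solve this equation, we will break it up into several “small” equations. To this end, write
\[ J_{2m+3} = \begin{pmatrix} J_{2l+3} & 0 \\ 0 & J_{2\kappa} \end{pmatrix}, \qquad \hat f^{uu} = \begin{pmatrix} f_1 & f_2 \\ f_3 & f_4 \end{pmatrix}, \qquad C = \begin{pmatrix} C_1 & C_2 \\ C_3 & C_4 \end{pmatrix}, \]
where f1, f2, f3 and f4 are (4l + 6) × (4l + 6), (4l + 6) × (4κ), (4κ) × (4l + 6) and 4κ × 4κ matrices, respectively, and the Cj have the same meanings. Then (3.41) can be broken up into the following equations:
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4l+6} - A_{l+1}J_{2l+3}\big)f_1 + f_1J_{2l+3}A_{l+1} = C_1, \tag{3.42} \]
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4l+6} - A_{l+1}J_{2l+3}\big)f_2 + f_2J_{2\kappa}B_{4\kappa} = C_2, \tag{3.43} \]
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4l+6} - A_{l+1}J_{2l+3}\big)f_3^T + f_3^TJ_{2\kappa}B_{4\kappa} = C_3^T, \tag{3.44} \]
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4\kappa} - B_{4\kappa}J_{2\kappa}\big)f_4 + f_4J_{2\kappa}B_{4\kappa} = C_4. \tag{3.45} \]
By Lemma A.3 below, we get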
f1 ∞l+1 √ ≤ −1k · ωl+1 E(4l+6)2 − E4l+6 ⊗ (Al+1 J2l+3 ) −1 − (Al+1 J2l+3 ) ⊗ E4l+6 C1 ∞ . ∞
Moreover, by Lemma 4.4 below,
O O f1 ∞m+1 ≤ f1 ∞l+1 ≤ |k|τl+1 /γl+1 C∞ .
(3.46)
Similarly, O
f2 ∞m+1 √ −1 ≤ −1k · ωl+1 E4κ(4l+6) − E4κ ⊗ (Al+1 J2l+3 ) − (B4κ J2κ ) ⊗ E4l+6 ∞ C1 ∞ −1 √ = Eκ ⊗ −1k · ωl+1 E16l+24 − E4 ⊗ (Al+1 J2l+3 ) − (βJ2 ) ⊗ E2l+3 ∞ C∞ √ −1 = −1k · ωl+1 E16l+24 − E4 ⊗ (Al+1 J2l+3 ) − (βJ2 ) ⊗ E2l+3 ∞ C∞ (4.53) ≤ |k|τl+1 /γl+1 C∞ . (3.47) Similarly we get
O O f3 ∞m+1 = f3T ∞l+1 ≤ |k|τl+1 /γl+1 C∞ .
(3.48)
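Finally, let us solve (3.45). Noting that $B_{2\kappa} = \mathrm{diag}(\beta, \cdots, \beta)$, we can rewrite (3.45) as
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4\kappa} - \beta J_{2\kappa}\big)f_4 + \beta f_4J_{2\kappa} = C_4. \tag{3.45a} \]
Let
\[ \mathcal{P} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & \sqrt{-1} \\ 1 & -\sqrt{-1} \end{pmatrix}, \qquad \mathcal{E} = \begin{pmatrix} -\sqrt{-1} & 0 \\ 0 & \sqrt{-1} \end{pmatrix}. \]
Denote by $\mathcal{P}_\kappa$, $\mathcal{E}_\kappa$ two matrices of order 4κ:
\[ \mathcal{P}_\kappa = \mathrm{diag}(\mathcal{P}, \cdots, \mathcal{P}), \qquad \mathcal{E}_\kappa = \mathrm{diag}(\mathcal{E}, \cdots, \mathcal{E}). \]
Then
\[ J_{2\kappa} = \mathcal{P}_\kappa^{-1}\mathcal{E}_\kappa\mathcal{P}_\kappa. \]
Left-multiplying by $\mathcal{P}_\kappa$ and right-multiplying by $\mathcal{P}_\kappa^{-1}$ in (3.45a), and letting $\tilde f_4 = \mathcal{P}_\kappa f_4\mathcal{P}_\kappa^{-1}$, $\tilde C_4 = \mathcal{P}_\kappa C_4\mathcal{P}_\kappa^{-1}$, we get
\[ \big(\sqrt{-1}\,k\cdot\omega_{l+1}E_{4\kappa} - \beta\mathcal{E}_\kappa\big)\tilde f_4 + \beta\tilde f_4\mathcal{E}_\kappa = \tilde C_4. \tag{3.45b} \]
Write $\tilde f_4 = (\tilde f_4(r,s))_{1\le r,s\le4\kappa}$, $\tilde C_4 = (\tilde C_4(r,s))_{1\le r,s\le4\kappa}$. It follows from (3.45b) that
\[ \tilde f_4(r,s) = \frac{1}{\sqrt{-1}\,(k\cdot\omega_{l+1} + \beta\pm\beta)}\tilde C_4(r,s). \]
Hence,
\[ |\tilde f_4(r,s)| \le |\tilde C_4(r,s)|/|k\cdot\omega_{l+1} + \beta\pm\beta| \le (|k|^{\tau_{l+1}}/\gamma_{l+1})|\tilde C_4(r,s)|, \]
where Lemma 4.4 is used in the last inequality. Thus,
\[ \|\tilde f_4\|_\infty^{O_{l+1}} \le (|k|^{\tau_{l+1}}/\gamma_{l+1})\|\tilde C_4\|_\infty. \]
Note that $\mathcal{P}_\kappa = \mathrm{diag}(\mathcal{P}, \cdots, \mathcal{P})$. We get
\[ \|f_4\|_\infty^{O_{m+1}} \le 4\|\tilde f_4\|_\infty^{O_{l+1}} \le 4(|k|^{\tau_{l+1}}/\gamma_{l+1})\|\tilde C_4\|_\infty \le 16(|k|^{\tau_{l+1}}/\gamma_{l+1})\|C_4\|_\infty. \tag{3.49} \]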
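The diagonalization used for (3.45a) can be checked on a single 2 × 2 block. The sketch below (pure Python, complex arithmetic) verifies that the unitary matrix P diagonalizes J and then solves $(\sqrt{-1}\,a - \beta J)f + \beta fJ = C$ entrywise in the rotated variables, where a stands for $k\cdot\omega_{l+1}$. The numerical values of a, β and C and the helper names are illustrative assumptions; the divisors are computed directly from the diagonal entries of E.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

INV_SQRT2 = 1.0 / math.sqrt(2.0)
J = [[0, 1], [-1, 0]]
P = [[INV_SQRT2, INV_SQRT2 * 1j], [INV_SQRT2, -INV_SQRT2 * 1j]]
P_INV = [[INV_SQRT2, INV_SQRT2], [-INV_SQRT2 * 1j, INV_SQRT2 * 1j]]  # P is unitary
E = [[-1j, 0], [0, 1j]]

def solve_block(a, beta, C):
    """Solve (i a I - beta J) f + beta f J = C for a 2 x 2 block f."""
    C_rot = matmul(matmul(P, C), P_INV)
    # In rotated variables the equation is diagonal: entry (r, s) is divided
    # by i a - beta E[r][r] + beta E[s][s].
    f_rot = [[C_rot[r][s] / (1j * a - beta * E[r][r] + beta * E[s][s])
              for s in range(2)] for r in range(2)]
    return matmul(matmul(P_INV, f_rot), P)

def residual(a, beta, f, C):
    """Sup-norm of (i a I - beta J) f + beta f J - C."""
    left = [[1j * a * (1 if r == c else 0) - beta * J[r][c] for c in range(2)]
            for r in range(2)]
    first, second = matmul(left, f), matmul(f, J)
    return max(abs(first[r][c] + beta * second[r][c] - C[r][c])
               for r in range(2) for c in range(2))

if __name__ == "__main__":
    C = [[1.0, 2j], [0.5, -1.0]]
    f = solve_block(0.7, 1.3, C)
    print(residual(0.7, 1.3, f, C))
```

The entrywise divisors are $\sqrt{-1}\,a$ and $\sqrt{-1}(a\pm2\beta)$, which is why only combinations of $k\cdot\omega$ with multiples of β need to be bounded away from zero for this block.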
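The combination of (3.46)–(3.49) leads to the following estimate:
\[ \|\hat f^{uu}\|_\infty^{O_{m+1}} = \max\{\|f_1\|_\infty^{O_{m+1}}, \cdots, \|f_4\|_\infty^{O_{m+1}}\} \le 16(|k|^{\tau_{l+1}}/\gamma_{l+1})\|C\|_\infty. \tag{3.50} \]
Therefore, we can write formally
\[ \hat f^{uu} = \Gamma^{-1}C, \qquad \|\Gamma^{-1}\|_\infty \le 16|k|^{\tau_{l+1}}/\gamma_{l+1}. \tag{3.51} \]
In view of (3.39), we get
\[ R := 3\max\{\|\Gamma^{-1}\|_\infty\|\tilde R\|_\infty,\ \|\Gamma^{-1}\|_\infty|k\cdot\tilde r|\} < \frac{1}{2}. \tag{3.52} \]
By the definition of Γ, (3.40) can be written as
\[ \Gamma\hat f^{uu} + \hat f^{uu}(k;\xi)J_{2m+3}\tilde R - \tilde RJ_{2m+3}\hat f^{uu}(k;\xi) + (k\cdot\tilde r)\hat f^{uu} = \hat h^{uu}(k;\xi). \tag{3.53} \]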
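The solution to this equation will be the limit of the following sequence:
\[ \hat f_1^{uu} = \Gamma^{-1}\hat h^{uu}(k;\xi), \tag{3.54} \]
\[ \hat f_{n+1}^{uu} = \Gamma^{-1}\big(\hat h^{uu}(k;\xi) - \hat f_n^{uu}J_{2m+3}\tilde R + \tilde RJ_{2m+3}\hat f_n^{uu} - (k\cdot\tilde r)\hat f_n^{uu}\big), \qquad n \in \mathbb{N}. \tag{3.55} \]
Thus, it is easy to check that
\[ \|\hat f_1^{uu}\|_\infty^{O_{m+1}} \le \|\Gamma^{-1}\|_\infty\|\hat h^{uu}\|_\infty^{O_{m+1}}, \]
\[ \|\hat f_{n+1}^{uu}\|_\infty^{O_{m+1}} \le \|\Gamma^{-1}\|_\infty\|\hat h^{uu}\|_\infty^{O_{m+1}}\sum_{j=0}^{n}R^j, \]
\[ \|\hat f_{k+l}^{uu} - \hat f_l^{uu}\|_\infty^{O_{m+1}} \le \|\Gamma^{-1}\|_\infty\|\hat h^{uu}\|_\infty^{O_{m+1}}\sum_{j=l}^{k+l}R^j. \]
Therefore, the sequence $\{\hat f_n^{uu}(k;\xi)\}_{n\in\mathbb{N}}$ is a Cauchy sequence in the topology of $\|\cdot\|_\infty^{O_{m+1}}$. Let $\hat f^{uu} := \lim_{n\to\infty}\hat f_n^{uu}$. It is easy to see that $\hat f^{uu}$ solves (3.40) and
\[ \|\hat f^{uu}(k;\xi)\|_\infty^{O_{m+1}} \le 32(|k|^{\tau_{l+1}}/\gamma_{l+1})\|\hat h^{uu}(k;\xi)\|_\infty^{O_{m+1}} \le 32(|k|^{\tau_{m+1}}/\gamma_{m+1})\|\hat h^{uu}(k;\xi)\|_\infty^{O_{m+1}}. \]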
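The successive approximation above is a standard perturbation argument: if Γ is invertible and a perturbation Δ satisfies $R = \|\Gamma^{-1}\|\,\|\Delta\| < 1/2$, the iteration $f_{n+1} = \Gamma^{-1}(h - \Delta f_n)$ converges and the limit obeys $\|f\| \le 2\|\Gamma^{-1}\|\,\|h\|$. The following is a minimal finite-dimensional sketch, assuming a diagonal Γ and a small matrix Δ acting on vectors; all names and numbers are illustrative, not the operators of the text.

```python
def apply(matrix, vec):
    return [sum(matrix[i][j] * vec[j] for j in range(len(vec)))
            for i in range(len(matrix))]

def sup_norm(vec):
    return max(abs(x) for x in vec)

def row_sum_norm(matrix):
    """Operator norm induced by the sup-norm: the maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in matrix)

def iterate(gamma_diag, delta, h, steps=60):
    """f_1 = Gamma^{-1} h, f_{n+1} = Gamma^{-1} (h - Delta f_n)."""
    f = [hi / gi for hi, gi in zip(h, gamma_diag)]
    for _ in range(steps):
        correction = apply(delta, f)
        f = [(hi - ci) / gi for hi, ci, gi in zip(h, correction, gamma_diag)]
    return f

def residual(gamma_diag, delta, f, h):
    """Sup-norm of (Gamma + Delta) f - h."""
    correction = apply(delta, f)
    return sup_norm([gi * fi + ci - hi
                     for gi, fi, ci, hi in zip(gamma_diag, f, correction, h)])

if __name__ == "__main__":
    gamma_diag = [2.0, -4.0 + 1j]
    delta = [[0.3, 0.2], [-0.1, 0.4]]
    h = [1.0, 2.0j]
    f = iterate(gamma_diag, delta, h)
    print(f, residual(gamma_diag, delta, f, h))
```

The factor 2 in the bound is the sum of the geometric series $\sum_j R^j$ with R < 1/2, which is how the constant 16 in (3.51) becomes 32 in the final estimate.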
It follows that D1 ×Om+1
f uu ∞m
≤
O
fˆuu (k; ξ )∞m+1 e|k|δm 1
0<|k|≤Mm+1
≤ 32
1 O (|k|τm+1 /γm+1 )hˆ uu (k; ξ )∞m+1 e|k|δm ,
0<|k|≤Mm+1
(Cauchy’s ) ≤ 32
D ×Om+1 −(δm −δm+1 )|k|/6
(|k|τm+1 /γm+1 )huu ∞m
0<|k|≤Mm+1 −2/3−α
(Lemma 3.2 ,b)) ≤ 32( m
/γm+1 )
e
,
|k|τm+1 e−(δm −δm+1 )|k|/6 ,
0<|k|≤Mm+1
(Lemma A.1) ≤
−2/3−α
m ,
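where we have written 2α as α by abuse of notation. Second, we give the proof of Case 2: m + 1 > m0, and $0 < |k| < K_{m_0}$. Let κ = m − m0 + 1, and
\[ \tilde A = \begin{pmatrix} A_{m_0} & \\ & B_{2\kappa} \end{pmatrix}, \qquad B_{2\kappa} = \begin{pmatrix} \beta & & \\ & \ddots & \\ & & \beta \end{pmatrix}, \]
\[ \tilde r = \varepsilon_{m_0}\hat h^{1I}_{m_0} + \cdots + \varepsilon_m\hat h^{1I}_m, \qquad \tilde R = \varepsilon_{m_0}\hat h^{uu}_{m_0} + \cdots + \varepsilon_m\hat h^{uu}_m, \]
\[ \omega_{m+1} = \omega_{m_0} + \tilde r, \qquad A_{m+1} = \tilde A + \tilde R. \]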
The remaining proof is similar to that of Case 1 with the help of Lemma 4.4 (ii). This completes the proof of Lemma 3.3. □
We are now prepared to investigate the flow $S^t_{m+1}$ of the Hamiltonian vector field with Hamiltonian $\varepsilon_mF$. Recall that
\[ F = f^\theta(\theta;\xi) + I\cdot f^I(\theta;\xi) + \langle u^*, f^{1u}(\theta;\xi)\rangle + \langle u^*, f^{uu}(\theta;\xi)u^*\rangle. \tag{3.56} \]
By Lemma 3.3, we have that
D ×Om+1
m |u∗ , f 1u |Dm ×Om+1 ≤ m f 1u ∞m
|uj |
1≤j ≤m+1
D ×Om+1
≤ m f 1u ∞m
(m + 1)1/2
1/2 |uj |2 exp(|j |/a)
|j |≤m+1 1+ 13 − 13 −α
≤ (m + 1)1/2 m 1−α ≤ m
α (m + 1) ≤ 1 is used and kα is written as α. Similarly, we get that where m 1−α .
m |u∗ , f uu u∗ |Dm ×Om+1 ≤ m
Thus by Lemma 3.3, we get that 1−α .
m |F |Dm ×Om+1 ≤ m 1
(3.57)
Thus, by Cauchy’s estimate,
m |∂θ F |Dm ×Om+1 ≤ 2
6 m 1 1−α |F |Dm ×Om+1 ≤ m . δm − δm+1
(3.58)
It follows from (3.56) and Lemma 3.3 that the following estimates hold true: 1
m |∂I F |Dm ×Om+1 ≤ m |f I |Dm ×Om+1 ≤ m3 2
1
D2 ×O
m ∇u∗ F ∞m m+1
≤ ≤
D1 ×O (f 1u ∞m m+1 (2/3)−α
m .
−α
,
(3.59)
D1 ×O + 2f uu ∞m m+1
· u∗ (m + 1)1/2 ) m
By Lemma A.2, (2/3)−α
m ∇u∗ F Dm ×Om+1 ≤ m 2
.
(3.60)
We denote by ϒP, ϒ< the projectors
ϒP : P × < → P, ϒ< : P × < → <,
and denote by ϒθ, ϒI, ϒu the projectors of P = CN/(2πZ)N × CN × + on the first, second, third factor, respectively. Note that the Hamiltonian vector field with Hamiltonian $\varepsilon_mF$ is
\[ \dot\theta = \varepsilon_m\partial_IF, \qquad \dot I = -\varepsilon_m\partial_\theta F, \qquad \dot u^* = \varepsilon_mJ_{2m+3}\nabla_{u^*}F. \tag{3.61} \]
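In view of (3.58)–(3.60), and using the contraction mapping principle, we get that the flow $S^t_{m+1}$ of (3.61) exists for t ∈ [−1, 1] if the initial data (I, θ, u) ∈ $D_m^3$ and the parameter ξ ∈ Om+1; moreover,
\[ |\Upsilon_\theta\circ(S^t_{m+1} - \Upsilon_P)|^{D_m^3\times O_{m+1}} \le \varepsilon_m^{(1/3)-\alpha} < \frac{1}{2}\delta_{m+1}, \tag{3.62} \]
\[ |\Upsilon_I\circ(S^t_{m+1} - \Upsilon_P)|^{D_m^3\times O_{m+1}} \le \varepsilon_m^{1-\alpha} < \frac{1}{2}\varepsilon_{m+1}^{2/3}, \tag{3.63} \]
\[ \|\Upsilon_u\circ(S^t_{m+1} - \Upsilon_P)\|^{D_m^3\times O_{m+1}} \le \varepsilon_m^{(2/3)-\alpha} < \frac{1}{2}\varepsilon_{m+1}^{1/3}. \tag{3.64} \]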
Let h(t) = (θ(t), I(t), u∗(t)) be the solution of (3.61) with the initial data h(0) = h. The variational equations for δh = (δθ, δI, δu∗) ∈ CN × CN × + along the solution h(t) have the form
\[ \delta\dot\theta = \varepsilon_m\partial^2_{\theta I}F(h(t))\delta\theta, \qquad \delta\dot I = -\varepsilon_m\partial^2_{I\theta}F(h(t))\delta I, \qquad \delta\dot u^* = \varepsilon_mJ_{2m+3}D^2F(h(t))\delta u^*, \tag{3.65} \]
where D 2 is the Hessian with respect to the usual metric in the usual square-summable space +2 . Note that F is a polynomial in I of order 1, in u∗ of order 2. Using Lemma 3.3 and Cauchy’s estimate, we get (1/3)−α
2 F |Dm ×Om+1 ≤ m |∂θ f I |Um ×Om+1 ≤ m
m |∂θI 2
2
,
(3.66)
(where | · | is the norm in CN ), and D2 ×Om+1
m D 2 F ∞m
U 2 ×Om+1
≤ m f uu ∞m
(1/3)−α
≤ m
.
Moreover, by Lemma A.2, (1/3)−α
m |D 2 F |Dm ×Om+1 ≤ m 2
.
(3.67)
By (3.65)–(3.67) we get (1/3)−α
t |(ϒθ )∗ ◦ ((Sm+1 )∗ − (ϒP )∗ )|Dm ×Om+1 ≤ m 3
3 t )∗ − (ϒP )∗ )|Dm ×Om+1 |(ϒI )∗ ◦ ((Sm+1 3 t )∗ − (ϒP )∗ )|Dm ×Om+1 |(ϒu )∗ ◦ ((Sm+1
,
(3.68)
≤
(1/3)−α
m ,
(3.69)
≤
(1/3)−α
m ,
(3.70)
t t . where (Sm+1 )∗ is the tangent map of Sm
Lemma 3.4. The quantities
m |{F, P 2 }|Dm ×Om+1 , m |{F, P2m }|Dm ×Om+1 , m |{F, P 3 }|Dm ×Om+1 3
3
3
are bounded by C(m + 1) m+1 ; the quantities
m |∇u {F, P 2 }|Dm ×Om+1 , m |∇u {F, P2m }|Dm ×Om+1 , m |∇u {F, P 3 }|Dm ×Om+1 3
3
2/3
are bounded by C(m + 1) m+1 .
3
Proof. Recall that P 2 = n∈Z W (u1n+1 − u1n ) which does not involve the action-angle variables (I, θ ). Note that F does not involve the variables uj with |j | > m + 1. Thus, {F, P 2 } = ∇u∗ F J2m+3 ∇u∗ P 2 . Write ∇u∗ P 2 = (η−m−1 , · · · , η−1 , η0 , η1 , · · · , ηm+1 )T , here η ∈ C2 . Then |ηj | |{F, P 2 }| ≤ ∇u∗ F J2m+3 ∞ ≤ ∇u∗ F ∞ (2m + 3)( |ηj |2 exp(j/a))1/2
(3.71)
≤ ∇u∗ F ∞ (2m + 3)∇u P 2 . Noting that, by the fact the coupling potential W is cubic, |∂uj P 2 | = |(W (u1j +1 − u1j ) − W (u1j − u1j −1 ), 0)| ≤ C(|u1j +1 − u1j |2 + |u1j − u1j −1 |2 )
(3.72)
≤ C(|uj +1 | + |uj | + |uj −1 | ), 2
and using the fact
exp(|j ±1|/a) exp(|j |/a)
∇u P 2 Dm ×Om+1 ≤ C 3
2
2
= e±1/a , we get
1/2 (|uj +1 |2 + |uj |2 + |uj −1 |2 )2 exp(|j |/a) j
(3.73)
≤ Cu ≤ 2
2/3 C m .
By (3.60), (3.71) and (3.73) we get 4
m |{F, P 2 }|Dm ×Om+1 ≤ C(m + 1) m3 3
−α
1+ρ ≤ C(m + 1) m = C(m + 1) m+1 . (3.74)
Now we are in position to estimate m |∇u {F, P 2 }|Dm ×Om+1 . Noting that ∂P 2 /∂u1i ∂u2j ≡ 0 and 0, |i − j | ≥ 2 −W (u1 − u1 ) − W (u1 − u1 ), i = j ∂P 2 j +1 j j j −1 = , W (u1 − u1 ) i =j +1 ∂u1i ∂u1j j1 +1 1 j W (uj − uj −1 ) i =j −1 3
we get that, for any v = (vj1 , vj2 )j ∈Z ∈ +, 2 2 ∂P 2 2 2 vj exp(|i|/a) ≤ Cu2 v2 . D P (u)v = 1 1 ∂ui ∂uj i∈Z |i−j |≤1 Then |D 2 P 2 (u)|Dm ×Om+1 ≤ Cu ≤ C m . 3
1/3
(3.75)
By (3.60), (3.67), (3.73) and (3.75) we have
m |∇u {F, P 2 }|Dm ×Om+1 ≤ m (|D 2 F | · ∇u P 2 + |D 2 P 2 | · ∇u F 3
(1/3)−α 2/3
m
≤ C( m
1/3 2/3−α
+ m m
2/3
) ≤ C(m + 1) m+1 .
The remaining quantities are similarly estimated. We omit the details.
!
Lemma 3.5. The quantities
m |{F, P 2 } ◦ /tm |Dm ×Om+1 , m |{F, P2m } ◦ /tm |Dm ×Om+1 , m |{F, P 3 } ◦ /tm |Dm ×Om+1 4
4
4
are bounded by C(m + 1) m+1 ; the quantities
m |∇u {F, P 2 } ◦ /tm |Dm ×Om+1 , m |∇u {F, P2m } ◦ /tm |Dm ×Om+1 , 4
4
m |∇u {F, P 3 } ◦ /tm |Dm ×Om+1 4
2/3
are bounded by C(m + 1) m+1 . t Proof. In view of (3.62)–(3.64), using /tm+1 (u) = (Sm+1 (u∗ ), u∗∗ ), we get that 4 → Dm+1 , /tm+1 : Dm
for
t ∈ [0, 1].
(3.76)
By (3.68)–(3.70) we get that |(/tm+1 )∗ |Dm ×Om+1 ≤ 2, where (/tm+1 )∗ is the tangent map of /tm+1 . Using the above facts and Lemma 3.4, we can easily finish the proof of Lemma 3.5. ! 4
Inserting (3.24) into (3.23) we get 1 2 t{P2m , F } ◦ /tm dt + m P3m ◦ /1m
m+1 Pm+1 = m 0
+ m
1
{P
0
2
, F } ◦ /tm dt
+ m
1
(3.77) {P
0
3
, F } ◦ /tm dt.
By application of Lemmas 3.4 and 3.5 as well as Lemma 3.2 ,c) to (3.77) we get that −1/3
|Pm+1 |Dm+1 ×Om+1 ≤ C(m + 1),
∇u Pm+1 Dm+1 ×Om+1 ≤ C(m + 1) m+1 .
This implies that the proof of (l.4) with l = m+1 is finished. (l.1) with l = m+1 holds true obviously. Finally, we verify that assumption (l.0) with l = m + 1 is satisfied. In fact, by Lemma 3.2 ,d)., P2m and P3m do not involve the variables un with |n| > m + 1. Note that F does not involves the variables un with |n| > m + 1, too. Thus, 1 2 t 1 ∂uj m t{P2m , F } ◦ /m dt + m P3m ◦ /m ≡ 0, when |j | > m + 1. 0
Observing that P3 =
O(|un |3 ) + O(|I |2 ),
n∈Z
P (u) = 2
n∈Z
W (u1n+1 − u1n ),
and that F does not involve the variables un with |n| > m + 1, we get that {P 2 , F } and {P 3 , F } do not involve the variables un with |n| > m + 2. By combination of the above facts, we get ∂Pm+1 /∂uj ≡ 0 for |j | > m + 2. The proof of (l.5) with l = m + 1 is given in Lemma 4.4. This completes the proof of Lemma 3.1. !
Remark. Recall that /tm+1(I, θ, u) = (Stm+1(I, θ, u∗), u∗∗), where u = (u∗, u∗∗) (see the formula between (3.21) and (3.22)). This implies that at the (m + 1)-th step of the KAM construction only the variable u∗ = (un)|n|≤m+1 is involved. This is quite surprising, since at such a step the perturbation is of order $\varepsilon_{m+1}$, but it is due to the fact that the coupling potential W is cubic. More exactly, before we perform the (m + 1)-th KAM iteration, we have the Hamiltonian (3.17). Since $|\varepsilon_mP_{3m}| = O(\varepsilon_{m+1})$ by Lemma 3.2′, and since we are seeking the low-dimensional invariant torus of the form TN × {0}, we need only kill the term P2m by the symplectic transformation /m+1. Since P2m does not involve the variables un with |n| > m + 1, neither does the transformation /m+1. However, the perturbed term $\varepsilon_{m+1}P_{m+1}$ obtained after carrying out the transformation /m+1 is, in general, not necessarily of order $O(\varepsilon_{m+1})$ if we do not kill P2. Fortunately, when the coupling potential W is cubic, the perturbed term $\varepsilon_{m+1}P_{m+1}$ is indeed of order $O(\varepsilon_{m+1})$. See the proof of Lemma 3.4 for details.
4. Measure Estimates
Recall that
\[ \omega_{m+1}(\xi) = \omega(\xi) + \sum_{j=0}^{m}\varepsilon_j\hat h_j^{1I}(0;\xi), \tag{4.1} \]
\[ A_{m+1}(\xi) = H_{2(m+1)+1} + \sum_{j=0}^{m}\varepsilon_j\hat h_j^{1uu}(0;\xi), \qquad H_{2(m+1)+1} = \mathrm{diag}(\beta, \cdots, \beta), \tag{4.2} \]
where hˆ 1I (0; ξ ), hˆ 1uu (0; ξ ) (j = 0, 1, · · · , m) are analytic, real for real arguments in the domain Oj . In what follows, we denote by | · |d the determinant of a matrix. Let R1k = {ξ ∈ <m : |k · ωm+1 | < γ˜m+1 /|k|τ˜m+1 }, R2k R3k R4k
(4.3) τ˜m+1
= {ξ ∈ <m : |k · ωm+1 ± β ∓ β| < γ˜m+1 /|k| }, (4.4) √ −1 τ˜m+1 = {ξ ∈ <m : | −1k · ωm+1 E4m+6 − Am+1 J2m+3 ) |d < γ˜m+1 /|k| }, (4.5) √ = {ξ ∈ <m : | −1k · ωm+1 E(4m+6)2 − E4m+6 ⊗ (J2m+3 Am+1 )
− (J2m+3 Am+1 ) ⊗ E4m+6 )|d < γ˜m+1 /|k|τ˜m+1 }, √ R5k = {ξ ∈ <m : | −1k · ωm+1 E16m+24 − E4 ⊗ (J2m+3 Am+1 ) − (βJ2 ) ⊗ E4m+6 )|d < γ˜m+1 /|k|τ˜m+1 }.
(4.6) (4.7)
Lemma 4.1. Let K1 = maxξ∈< |∂ξ ω|. If m + 1 > m0 and $K_m < |k|$, or if m + 1 ≤ m0 and k ≠ 0, then
\[ \mathrm{meas}\,R_k^j \le K_1\gamma/|k|^{2N}, \qquad j = 1, \cdots, 5. \tag{4.8} \]
Proof. Recall that Ol is the complex ql -neighborhood of
1/3−α Ol | l hˆ 1uu ≤ l , l (0; ξ )|
0 ≤ l ≤ m.
(4.9)
Let j
j
j
Ol = O(
where j = 0, 1, 2, · · · , 6, ql = (1 −
Note that j
j +1
ql − ql
=
j j )ql + ql+1 , Oj0 = Oj . 6 6
1 1 (ql − ql+1 ) > ql . 6 12
Then by Cauchy’s estimate we get j Ol1 | l ∂ξ hˆ 1I l (0; ξ )| ,
j 1/3−α −j Ol1 | l ∂ξ hˆ 1uu ≤ 12j l ql , l (0; ξ )|
0 ≤ l ≤ m. −(1+τ
1/ l 6
(4.10)
)l
1 Recall that ql = l , as l ≥ m0 ;ql = ( 8(β+K) γm0 +1 )l Ml+1 l+1 , as l < m0 , where K = maxξ ∈< {|ω(ξ )|} and m0 is a constant defined in item 9 in Sect. 3.2. Then by (4.9), −α 1/4 Ol2
l |∂ξ hˆ 1I ≤ 12 l3 ql−1 < l , l | 1
0 ≤ l ≤ m.
(4.11)
In view of (4.1) and |∂ξ ω(ξ )| ≥ δa , we get that, if 0 < < ∗ < δa , |∂ξ ωm+1 (ξ )|Om ≥ δa − 2
m j =0
1/4
j
>
1 δa . 2
(4.12)
−1 Moreover, by the inverse function theorem, there exists the inverse ωm+1 (ω) for ω ∈ def.
2 ) = 2 , and ωm+1 (Om m −1 −2 (ω)|m ≤ K/δa ≤ m |∂ω ωm+1 ∗, 2
here
K = max |ω(ξ )|. ξ ∈<
(4.13)
Case 1. m + 1 > m0 . Note m ˆ = 4(2m + 3)2 and m∗ = 3m1/3 . Then, for 0 ≤ j ≤ m, ˆ l ≥ m∗ , 1 1 1 j m ˆ − 6 > − ∗6 > . 3 l 3 m 12
(4.14)
By (4.10) and (4.14) as well as the definitions of ql , j
Ol | l ∂ξ hˆ 1uu l | j 1 3 −α− l 6 (1/12)−α −j ≤ 12j l ≤ 12j l ≤ m∗ /m + 1, 0 ≤ j ≤ m, ˆ l ≥ m∗ , l ≥ m 0 ; 1 1 −α −j −α −j ≤ 12j l3 ql ≤ 12j l3 | ln |j C ≤ m∗ /m + 1, 0 ≤ j ≤ m, ˆ l ≥ m∗ , l < m 0 . (4.15) 2
When l ≤ m∗ , by (4.10), j Ol2 | l ∂ξ hˆ 1uu l | j 1 3 −α− l 6 −j/2 −j ≤ 12j l ≤ l ≤ m∗ /m + 1, 0 ≤ j ≤ m, ˆ l ≤ m∗ , l ≥ m 0 1/3−α −j ≤ 12j l | ln l |−j C ≤ m∗ /m + 1, 0 ≤ j ≤ m, ˆ l ≤ m∗ , l < m0 ,
(4.16)
where C is a constant depending on m0 . Note that O ⊃ O1 ⊃ · · · ⊃ Om . Thus, by (4.15), (4.16) and (4.2), j
−j
|∂ξ Am+1 (ξ )|Om ≤ m∗ , 2
1 ≤ j ≤ m. ˆ
(4.17)
Similarly, j
−j
|∂ξ ωm+1 (ξ )|Om ≤ m∗ , Note that
2
1 ≤ j ≤ m. ˆ
(4.18)
−1 (ω) ≡ ω. ωm+1 ◦ ωm+1
j
Applying ∂ω to this identity and using the induction, in view of (4.12) and (4.18), we get −2j
−1 |∂ωj ωm+1 (ω)|m ≤ m∗ , 2
1 ≤ j ≤ m. ˆ
(4.19)
Let −1 . Am+1 (ω) := Am+1 ◦ ωm+1
(4.20)
Then, the combination of (4.19) and (4.17) leads to −4j
|∂ωj Am+1 (ω)|m ≤ m∗ , 2
1 ≤ j ≤ m. ˆ
(4.21)
Let B(ω) = E4m+6 ⊗ (J2m+3 Am+1 (ω)) + (J2m+3 Am+1 (ω)) ⊗ E4m+6 , √ Mk (ω) = −1k · ω − B(ω) . d
(4.22) (4.23)
Then, by (4.21), 2
−4j
∂ωj B(ω)∞m ≤ m∗ ,
1 ≤ j ≤ m. ˆ
(4.24)
j
We are now in a position to estimate ∂ω Mk (ω). To this end, write B(ω) = (bij ). Then ˆ ϕr (ω)(k · ω)m−r ; (4.25) Mk (ω) = ±(k · ω)mˆ + 1≤r≤m−1 ˆ
here ϕr (ω) =
1≤ji ≤m ˆ
±(or ±
√
−1)b1j1 · · · biji · · · brjr .
(4.26)
We can assume that |k| ≤ N |k1 | without loss of generality. Observe that, for 1 ≤ j ≤ m, ˆ 2
−4j
|∂ωj 1 bij | ≤ ∂ωj B(ω)∞m ≤ m∗ , where ω1 is the first entry of ω = (ω1 , · · · , ωN ). Thus, for ω ∈ 2m , |
ds (b1j1 · · · brjr )| = | dω1s s
(s )
1 +···+sr =s
≤
s1 +···+sr =s
(s )
b1j11 · · · brjrr |
−4s s −4s
m ∗ ≤ 2 m∗ .
(4.27)
s d ˆ −4s ≤ m ϕ (ω) 2 s m ∗ . dωs r r
Thus,
1
It follows that mˆ d mˆ dω1
ˆ ϕr (ω)(k · ω)m−r 1≤r≤m−1 ˆ d mˆ m−r ˆ ≤ mˆ ϕr (ω)(k · ω) dω 1 1≤r≤m−1 ˆ m−s m d ˆ ˆ d s m−r ˆ ϕr m−s (k · ω) ≤ s s dω ˆ dω1 1 1≤r≤m−1 ˆ r≤s≤m ˆ m ˆ m ˆ −4s m−s ˆ 2 s m |k · ω|s−r ≤ ∗ |k1 | r s
(4.28a)
1≤r≤m−1 ˆ r≤s≤m ˆ
· (m ˆ − r)(m ˆ − r − 1) · · · (s − r + 1) m ˆ m ˆ −4m ˆ m ˆ m−1 ˆ m! ˆ
m∗ ≤ (N|ω|) |k1 | r s 1≤r≤m−1 ˆ
≤ ≤
r≤s≤m ˆ
−4m ˆ ˆ
m ˆ (4N|ω|) |k1 |m−1 ∗ m! −4 m ˆ ˆ c0mˆ |k1 |m−1
m∗ m!, ˆ m ˆ
where c0 = 4N max< |ω(ξ )| = 4N K(ω). ˆ = 4(2m + 3)2 , we have l ≤ m0 − 1 Case 2. m + 1 ≤ m0 . When 1 ≤ l ≤ m, 0 ≤ j ≤ m −j 2 and j ≤ 4(2m0 + 3) . Thus we have ql ≤ (| ln l |)C(m0 ) , where C(m0 ) is a constant depending on m0 . Hence j 1/3 −j 1/3 1/4 Ol2 ≤ 12j l ql ≤ C(| ln l |)C(m0 ) l < l . | l ∂ξ hˆ 1uu l |
Similarly,
j
Ol < l l ∂ξ hˆ 1I l 2
1/4
.
It follows from the last two inequalities that −1 (ω)|m < C 1/4 < 1/5 . |∂ωj Am+1 (ω) ◦ ωm+1 2
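Hence
$$\|\partial_\omega^j B(\omega)\|^{O_l^2} < \epsilon^{1/5}.$$
It follows that
$$\Bigl|\frac{d^s}{d\omega_1^s}\varphi_r(\omega)\Bigr| \le 2^s\binom{\hat m}{r}\epsilon^{r/5}.$$
Then
$$\Bigl|\frac{d^{\hat m}}{d\omega_1^{\hat m}}\sum_{1\le r\le\hat m-1}\varphi_r(\omega)\,(k\cdot\omega)^{\hat m-r}\Bigr| \le C(m_0,\delta_a)\,|k_1|^{\hat m-1}\,\hat m!\,\epsilon^{1/5}, \tag{4.28b}$$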
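where $C(m_0,\delta_a)$ is a constant depending on $m_0$ and $\delta_a$. Obviously,
$$\Bigl|\frac{d^{\hat m}}{d\omega_1^{\hat m}}(k\cdot\omega)^{\hat m}\Bigr| = \hat m!\,|k_1|^{\hat m}. \tag{4.29}$$
The combination of (4.28a) and (4.29) leads to the following:
$$\Bigl|\frac{d^{\hat m}}{d\omega_1^{\hat m}}M_k(\omega)\Bigr| \ge \hat m!\,|k_1|^{\hat m}\bigl(1 - |k_1|^{-1}c_0^{\hat m}\epsilon_{m*}^{-4\hat m}\bigr) > \tfrac12\,\hat m!\,|k_1|^{\hat m}, \tag{4.30a}$$
if
$$|k_1| > 2c_0^{\hat m}\epsilon_{m*}^{-4\hat m}. \tag{4.31}$$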
Note that the last inequality holds true if $|k| > K_m$, $m+1 > m_0$. The combination of (4.28b) and (4.29) leads to
$$\Bigl|\frac{d^{\hat m}}{d\omega_1^{\hat m}}M_k(\omega)\Bigr| \ge \hat m!\,|k_1|^{\hat m}\bigl(1 - |k_1|^{-1}C(m_0,\delta_a)\epsilon^{1/5}\bigr) > \tfrac12\,\hat m!\,|k_1|^{\hat m}, \tag{4.30b}$$
if $m+1\le m_0$ and $\epsilon\ll 1$ is such that $C(m_0,\delta_a)\epsilon^{1/5} < 1/2$.

In order to estimate the measure of $R_k^4$ we need the following lemma, which was used in [B-M-S, Chap. 5], [Bo1, 2] and [C-Y]. The proof is very simple.

Lemma 4.2. Let $I$ be an interval in $\mathbb{R}^1$ and $\bar I$ its closure. Suppose that $g:\bar I\to\mathbb{C}$ is $k$-times continuously differentiable. Let $I_h = \{x\in\bar I : |g(x)|\le h\}$, $h>0$. If for some constant $d>0$, $\bigl|\frac{d^k g(x)}{dx^k}\bigr|\ge d$ for any $x\in I$, then $\operatorname{meas} I_h \le c\,h^{1/k}$ with $c = 2(2+3+\cdots+k+d^{-1})$.

Let $c_1 = 2(2+3+\cdots+\hat m+1) = \hat m(\hat m+1)$, and let $D$ be the diameter of $\omega_{m+1}(\Pi)$. Recall that (see Sect. 3.2)
$$\tilde\gamma_{m+1} = \bigl(\hat m(\hat m+1)D^{N-1}\gamma^{-1}\bigr)^{-\hat m};\qquad \tilde\tau_{m+1} = 2N\hat m.$$
Write
$$\tilde R_k^4 = \{\omega\in\omega_{m+1}^{-1}(\Pi_m) : |M_k(\omega)| \le \tilde\gamma_{m+1}/|k|^{\tilde\tau_{m+1}}\}.$$
Therefore, by Lemma 4.2 and (4.30), we get
$$\operatorname{meas}\tilde R_k^4 \le c_1\bigl(\tilde\gamma_{m+1}/|k|^{\tilde\tau_{m+1}}\bigr)^{1/\hat m}D^{N-1} \le \gamma/|k|^{2N}. \tag{4.32}$$
Thus,
$$\operatorname{meas}R_k^4 \le K_1(\omega)\,\gamma/|k|^{2N}. \tag{4.33}$$
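This completes the proof of Lemma 4.1. □

Lemma 4.3. Let
$$\Pi_{m+1} := \Pi_m\setminus\bigcup_{1\le j\le 5}\ \bigcup_{0\ne|k|<K_{m_0}} R_k^j(\xi)\quad\text{for } m+1 = m_0, \tag{4.34a}$$
$$\Pi_{m+1} := \Pi_m\setminus\bigcup_{1\le j\le 5}\ \bigcup_{K_m\le|k|<K_{m+1}} R_k^j(\xi)\quad\text{for } m+1 > m_0, \tag{4.34b}$$
$$\Pi_{m+1} := \Pi_m\setminus\bigcup_{1\le j\le 5}\ \bigcup_{0\ne|k|<M_{m+1}} R_k^j(\xi)\quad\text{for } m+1 < m_0. \tag{4.34c}$$
Then
$$\operatorname{meas}\Pi_{m+1} \ge \operatorname{meas}\Pi_m - 5K_1\gamma\sum_{0\ne|k|<K_{m_0}} 1/|k|^{2N}, \tag{4.35a}$$
$$\operatorname{meas}\Pi_{m+1} \ge \operatorname{meas}\Pi_m - 5K_1\gamma\sum_{K_m\le|k|<K_{m+1}} 1/|k|^{2N}, \tag{4.35b}$$
$$\operatorname{meas}\Pi_{m+1} \ge \operatorname{meas}\Pi_m - 5K_1\gamma\sum_{0\ne|k|<M_{m+1}} 1/|k|^{2N}, \tag{4.35c}$$
in particular,
$$\operatorname{meas}\bigcap_{m\ge0}\Pi_m = (\operatorname{meas}\Pi)\,(1 - O(\gamma)).$$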
Assume $\xi\in\Pi_{m+1}$. If one of the following conditions:
(i) $K_m\le|k|<K_{m+1}$ with $m+1>m_0$,
(ii) $m+1=m_0$ and $0\ne|k|<K_{m_0}$,
(iii) $m+1<m_0$ and $0\ne|k|<M_{m+1}$,
is fulfilled, then the following inequalities hold true:
$$|k\cdot\omega_{m+1}|^{-1} \le |k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.36}$$
$$|k\cdot\omega_{m+1}\pm\beta\mp\beta|^{-1} \le |k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.37}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{4m+6} - A_{m+1}J_{2m+3})^{-1}\bigr\|_\infty \le |k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.38}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{(4m+6)^2} - E_{4m+6}\otimes(J_{2m+3}A_{m+1}) - (J_{2m+3}A_{m+1})\otimes E_{4m+6})^{-1}\bigr\|_\infty \le |k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.39}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{16m+24} - E_4\otimes(J_{2m+3}A_{m+1}) - (\beta J_2)\otimes E_{4m+6})^{-1}\bigr\|_\infty \le |k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.40}$$
where $\tau_{m+1} = (2N+1)\hat m - 1$, $\hat m = 4(2m+3)^2$, and
$$\gamma_{m+1} = \bigl((4\beta+2K_1(\omega))\,\hat m(\hat m+1)\,D^{N-1}\gamma^{-1}\bigr)^{-\hat m}.$$
Proof. It follows from Lemma 4.1 that (4.35) holds true. Note that $\tau_{m+1}\ge\tilde\tau_{m+1}$ and $\tilde\gamma_{m+1}\ge\gamma_{m+1}$. Thus, (4.36) and (4.37) hold true, since $\xi$ is not in the sets $R_k^1$ and $R_k^2$. We give the proof for (4.39) only. The proofs for (4.40) and (4.38) are similar. Let
$$B(\xi) = -E_{4m+6}\otimes(J_{2m+3}A_{m+1}(\xi)) - (J_{2m+3}A_{m+1}(\xi))\otimes E_{4m+6}, \tag{4.41}$$
$$G(\xi) = \pm\sqrt{-1}\,k\cdot\omega_{m+1}(\xi)E_{4m+6} + B(\xi). \tag{4.42}$$
It is easy to check
$$\max\{\|B\|_\infty,\ |\omega_{m+1}(\xi)|^{\Pi_{m+1}}\} \le 2(\beta+K). \tag{4.43}$$
Therefore,
$$\|G(\xi)\|_\infty^{O_m} \le 2(\beta+K)|k|. \tag{4.44}$$
If the inverse of $G$ exists, then
$$G^{-1} = \frac{1}{|G|_d}\operatorname{adj}G, \tag{4.45}$$
where $\operatorname{adj}G$ is the adjoint of the matrix $G$. Observe that any entry of $\operatorname{adj}G$ is of the form
$$(\ast) = \sigma_{\hat m-1}(k\cdot\omega_{m+1})^{\hat m-1} + \sigma_{\hat m-2}\binom{\hat m-1}{1}b_{1,\hat m-2}(k\cdot\omega_{m+1})^{\hat m-2} + \cdots + \sigma_0\binom{\hat m-1}{\hat m-1}b_{1,0}\cdots b_{\hat m-1,0},$$
where $\sigma_j = 0$ or $1$, and the $b_{ij}$ are the entries of the matrix $B$. Thus,
$$|(\ast)| \le |k\cdot\omega_{m+1}|^{\hat m-1} + \binom{\hat m-1}{1}\|B\|_\infty|k\cdot\omega_{m+1}|^{\hat m-2} + \cdots + \binom{\hat m-1}{\hat m-1}\|B\|_\infty^{\hat m-1} = (\|B\|_\infty + |k\cdot\omega_{m+1}|)^{\hat m-1} \le (4\beta+2K)^{\hat m-1}|k|^{\hat m-1}. \tag{4.46}$$
When $\xi\in\Pi_{m+1}$, $\xi$ is not in the set $R_k^4$; thus
$$\bigl||G|_d\bigr| = |M_k(\xi)| \ge \tilde\gamma_{m+1}/|k|^{\tilde\tau_{m+1}}. \tag{4.47}$$
The combination of (4.45), (4.46) and (4.47) leads to
$$\|G^{-1}(\xi)\|_\infty^{\Pi_{m+1}} \le (4\beta+2K)^{\hat m-1}|k|^{\hat m-1}|k|^{\tilde\tau_{m+1}}/\tilde\gamma_{m+1} = |k|^{\tau_{m+1}}/\gamma_{m+1}. \tag{4.48}$$
This completes the proof of Lemma 4.3. □
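The key step above, $G^{-1} = \operatorname{adj}G/|G|_d$, bounds the inverse by an upper bound on the entries of the adjugate together with a lower bound on the determinant. The following is a minimal Python sketch of that mechanism on a toy $3\times3$ matrix $G = \sqrt{-1}\,t\,E + B$. The matrix `B`, the value `t` and all helper names are illustrative assumptions, not objects from the paper, and the norm used is the maximal absolute entry.

```python
def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def adjugate(M):
    """adj(M)[i][j] is the (j, i) cofactor of M."""
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)] for i in range(n)]

def max_entry(M):
    """Largest absolute value of an entry."""
    return max(abs(x) for row in M for x in row)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

# Toy data (assumed for illustration): G = sqrt(-1) * t * E + B.
t = 0.7
B = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.5], [0.0, -0.5, 0.0]]
G = [[(1j * t if i == j else 0.0) + B[i][j] for j in range(3)] for i in range(3)]

d = det(G)
adjG = adjugate(G)
G_inv = [[x / d for x in row] for row in adjG]   # G^{-1} = adj G / det G

# The inverse is controlled by (entry bound on adj G) / (lower bound on |det G|).
bound = max_entry(adjG) / abs(d)
product = matmul(G, G_inv)
print(max_entry(G_inv), bound)
```

In the lemma the numerator is controlled by (4.46) and the denominator by the small-divisor condition (4.47); the sketch only checks the algebraic identity behind (4.45).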
Lemma 4.4. Let $O_{m+1}$ be a complex $q_{m+1}$-neighborhood of the set $\Pi_{m+1}$. Assume $\xi\in O_{m+1}$. If one of the following conditions:
(i) $K_m\le|k|<K_{m+1}$ with $m+1>m_0$,
(ii) $m+1=m_0$ and $0\ne|k|<K_{m_0}$,
(iii) $m+1<m_0$ and $0\ne|k|<M_{m+1}$,
is fulfilled, then for $m = 0, 1, \cdots$ we have
$$|k\cdot\omega_{m+1}|^{-1} \le 2|k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.49}$$
$$|k\cdot\omega_{m+1}\pm\beta\mp\beta|^{-1} \le 2|k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.50}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{4m+6} - A_{m+1}J_{2m+3})^{-1}\bigr\|_\infty \le 2|k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.51}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{(4m+6)^2} - E_{4m+6}\otimes(J_{2m+3}A_{m+1}) - (J_{2m+3}A_{m+1})\otimes E_{4m+6})^{-1}\bigr\|_\infty \le 2|k|^{\tau_{m+1}}/\gamma_{m+1}, \tag{4.52}$$
$$\bigl\|(\sqrt{-1}\,k\cdot\omega_{m+1}E_{16m+24} - E_4\otimes(J_{2m+3}A_{m+1}) - (\beta J_2)\otimes E_{4m+6})^{-1}\bigr\|_\infty \le 2|k|^{\tau_{m+1}}/\gamma_{m+1}. \tag{4.53}$$
Proof. We give the proof for (4.52) only. We claim that if one of the conditions (i), (ii) and (iii) is fulfilled, then
$$2(\beta+K)\,\frac{|k|^{1+\tau_{m+1}}}{\gamma_{m+1}}\,\frac{q_{m+1}}{q_m-q_{m+1}} \le \frac12. \tag{4.54}$$
Indeed, when one of conditions (i) and (ii) is fulfilled, (4.54) is a corollary of Lemma A.4. Now assume condition (iii) is fulfilled, namely $m+1<m_0$. By the definition of $q_l$ with $l<m_0$ (see item 10 in Sect. 3.2), the left-hand side of (4.54) is bounded by
$$\frac{\gamma_{m_0+1}}{2\gamma_{m+1}} \le \frac12.$$
This completes the proof of Claim (4.54).
For $\xi\in O_{m+1}$, there is $\xi_0\in\Pi_{m+1}$ such that $|\xi-\xi_0|<q_{m+1}$. Then
$$\|G(\xi)-G(\xi_0)\|_\infty \le \|\nabla_\xi G(\xi)\|_\infty^{O_{m+1}}|\xi-\xi_0| \le \|G(\xi)\|_\infty^{O_m}\,q_{m+1}/(q_m-q_{m+1}) \overset{(4.44)}{\le} 2(\beta+K)|k|\,q_{m+1}/(q_m-q_{m+1}).$$
Thus,
$$\|G^{-1}(\xi)\|_\infty = \bigl\|\bigl(E + G^{-1}(\xi_0)(G(\xi)-G(\xi_0))\bigr)^{-1}G^{-1}(\xi_0)\bigr\|_\infty \le \frac{\|G^{-1}(\xi_0)\|_\infty}{1-\|G^{-1}(\xi_0)\|_\infty\cdot\|G(\xi)-G(\xi_0)\|_\infty}$$
$$\overset{(4.48)}{\le} \frac{|k|^{\tau_{m+1}}}{\gamma_{m+1}}\Bigl(1 - 2(\beta+K)|k|\,\frac{|k|^{\tau_{m+1}}}{\gamma_{m+1}}\,\frac{q_{m+1}}{q_m-q_{m+1}}\Bigr)^{-1} \overset{(4.54)}{<} 2\,\frac{|k|^{\tau_{m+1}}}{\gamma_{m+1}}.$$
This finishes the proof. □

5. Proof of the Theorems

Proof of Theorem 2. The proof is finished by applying Lemma 3.1 iteratively. See [K1] and [P2] for details. We sketch the proof for the convenience of the reader. Obviously, the Hamiltonian $H$ defined by (3.4) satisfies the conditions (l.1)–(l.4), (l.6) and (l.7) with $l=0$. Lemma 4.4 shows that (l.5) with $l=0$ holds true. Thus, the iterative lemma (Lemma 3.1) works. Inductively, we get the following sequences:
$$D_{m+1}\times\Pi_{m+1}\subset D_m\times\Pi_m,$$
$$\Psi^m = \Phi_1\circ\cdots\circ\Phi_{m+1} : D_{m+1}\times\Pi_{m+1}\to D_0,$$
$$H\circ\Psi^m = H_{m+1} = H_0^{m+1} + \epsilon_{m+1}P_{m+1} + P^2 + P^3.$$
Let $\Pi_\infty = \bigcap_{m=0}^\infty\Pi_m$. By arguments similar to those of Lemmas 2.4 and 2.5 in [K1, pp. 63–65], in view of (5.59)–(5.61), we conclude that $H_0^{m+1}$, $\Psi^m$, $D\Psi^m$, $H_m$ converge uniformly on the domain $D_\infty\times\Pi_\infty = D(\tfrac12\delta_0,0,0)\times\Pi_\infty$, and
$$H_0^\infty := \lim_{m\to\infty}H_0^m = \omega_\infty(\xi)\cdot I + \tfrac12\langle A_\infty(\xi)u,u\rangle,$$
$$H_\infty := H_0^\infty + P^2 + P^3 = \omega_\infty(\xi)\cdot I + \tfrac12\langle A_\infty(\xi)u,u\rangle + P^2 + P^3.$$
Thus, $\mathbb{T}^N\times\{0\}\times\{0\}$ is an embedded torus of the Hamiltonian $H_\infty$ with rotational frequencies $\omega_\infty(\xi)\in S := \omega_\infty(\Pi_\infty)$. Returning to the original Hamiltonian $H$, it has an embedded torus $\Phi(\mathbb{T}^N\times\{0\}\times\{0\})$ with frequencies $\omega_\infty(\xi)$, where $\Phi := \lim_{m\to\infty}\Psi^m$. This proves the theorem. □

Proof of Theorem 1. By Sect. 2, Theorem 1 is just a corollary of Theorem 2.
Remark 5.1. Here is the proof of the fact that the breathers are super-exponentially localized in space. Denote by ϒk the projector + on the k-th variable uk , namely, ϒk u = uk for u = (· · · , u0 , u1 , · · · , uk , · · · ). By the remark in the end of Sect. 3, the symplectic transformation /m+1 does not involve the variables un with |n| > m + 1; in strictly speaking, ϒn /m+1 = I d for |n| > m + 1. Therefore, we have ϒm+1 / = ϒm+1 ( lim /m+1 ◦ /m+2 ◦ · · · ◦ /m+k ). k→∞
(5.1)
Note that for any η ∈ TN × {0} × {0}, /(η) is a breather. Let xN+k+1 be the oscillation of the N + k + 1-th particle where |k| = m. Without loss of generality, let k = m. By (5.1), xN+m+1 = ϒm+1 /(η) = ϒm+1 ( lim /m+1 ◦ /m+2 ◦ · · · ◦ /m+k )(η). k→∞
Since the range of /m+1 is in the domain Dm+1 = Um+1 × O( m+1 , CN ) × O( m+1 , +) (See (3.76) and item 5 of Sect. 3.2), so 2/3
1/3
|xN+m+1 | ≤ Ce−|m+1|/a m+1 . 1/3
The proof is now complete. !
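Appendix

Lemma A.1. For $\delta>0$, $\nu>0$, the following inequality holds true:
$$\sum_{k\in\mathbb{Z}^N}e^{-2|k|\delta}|k|^\nu \le \Bigl(\frac{\nu}{e}\Bigr)^\nu\frac{1}{\delta^{\nu+N}}(1+e)^N.$$

Proof. This lemma can be found in [B-M-S]. We shall find the value of $z\ge1$ yielding a maximum value for the expression $\nu\ln z-\delta z$. Differentiating it in $z$ and equating the result to zero, we get $\nu/z=\delta$, $z=\nu/\delta>1$. From this it follows that
$$\nu\ln z-\delta z \le \nu\Bigl(\ln\frac{\nu}{\delta}-1\Bigr).$$
This yields
$$z^\nu \le \exp(\delta z)\exp\Bigl(\nu\Bigl(\ln\frac{\nu}{\delta}-1\Bigr)\Bigr) = \exp(\delta z)\Bigl(\frac{\nu}{e}\Bigr)^\nu\frac{1}{\delta^\nu}.$$
Thus,
$$\sum_{k\in\mathbb{Z}^N}e^{-2|k|\delta}|k|^\nu \le \Bigl(\frac{\nu}{e}\Bigr)^\nu\frac{1}{\delta^\nu}\sum_{k}e^{-|k|\delta} = \Bigl(\frac{\nu}{e}\Bigr)^\nu\frac{1}{\delta^\nu}\Bigl(\frac{1+\exp(-\delta)}{1-\exp(-\delta)}\Bigr)^N \le \Bigl(\frac{\nu}{e}\Bigr)^\nu\frac{1}{\delta^\nu}\Bigl(\frac{1+e}{\delta}\Bigr)^N.$$
This proves the lemma. □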
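As a numerical sanity check of Lemma A.1, the sketch below compares a truncated lattice sum with the stated bound. It rests on assumptions that are not spelled out in the text: $|k|$ is taken to be the $\ell^1$ norm $|k_1|+\cdots+|k_N|$ (the norm for which the geometric-series step in the proof factorises), $\delta$ is restricted to $0<\delta\le1$ (where the last inequality of the proof is easy to verify), and the sum is cut off at $|k_i|\le 60$. The function names are hypothetical.

```python
import math
from itertools import product

def lattice_sum(delta, nu, dim, cutoff=60):
    """Truncated sum over k in Z^dim of exp(-2|k| delta) |k|^nu, with |k| the l1 norm."""
    total = 0.0
    for k in product(range(-cutoff, cutoff + 1), repeat=dim):
        norm = sum(abs(c) for c in k)
        if norm:  # the k = 0 term vanishes because nu > 0
            total += math.exp(-2.0 * norm * delta) * norm ** nu
    return total

def lemma_a1_bound(delta, nu, dim):
    """Right-hand side of Lemma A.1: (nu/e)^nu * delta^-(nu+N) * (1+e)^N."""
    return (nu / math.e) ** nu * delta ** (-(nu + dim)) * (1.0 + math.e) ** dim

# Compare both sides on a small grid of parameters (illustrative values).
results = []
for delta in (0.2, 0.5, 1.0):
    for nu in (1.0, 2.0, 3.5):
        for dim in (1, 2):
            results.append((delta, nu, dim,
                            lattice_sum(delta, nu, dim),
                            lemma_a1_bound(delta, nu, dim)))

for delta, nu, dim, lhs, rhs in results:
    print(f"delta={delta}, nu={nu}, N={dim}: sum={lhs:.4g} <= bound={rhs:.4g}")
```

The truncation only lowers the left-hand side, so the check is one-sided: it can reveal a violation of the bound but does not prove the lemma.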
Lemma A.2. Let A be a symmetric matrix of order 2(m + 1) + 1 and B an 2(m + 1) + 1vector. Denote by · ∞ the sup-norm for vector and matrix. Then 1 −α A∞ ≤ |A| ≤ m A∞ . K −α B∞ ≤ B ≤ m B∞ ,
where K > 0 is a constant. Proof. First, we prove that A∞ ≤ K|A|. Write A = (aij ). Recall that ∗ A 0 |A| := 0 0 ∞×∞ where | · |∗ is the operator norm reduced by the inner product ·, ·+ in the space +.In the following we omit the star. For any fixed k ∈ Z, pick u = (uj ) ∈ + with uk = exp(−|k|/2a) and uj = 0 with j = k. Then Au, u = akk u2k e|k|/a = akk . But |Au, u+ | ≤ |A|u2 ≤ |A|, since u = 1. Thus, |akk | ≤ |A|. For any fixed k, l ∈ Z, pick u = (uj ) ∈ + with uk = e−|k|/2a , ul = e−|l|/2a and uj = 0 with j = k and j = l. Then u2 = 2. In view of the definition of ·, ·+ , Au, u+ = (akk uk + akl ul )uk e|k|/a + (alk uk + all ul )ul e|l|/a |k| − |l| |l| − |k| = akk + all + akl exp + alk exp 2a 2a In view of |Au, u+ | ≤ |A|u2 ≤ 2|A|,and the fact |akk |, |all | ≤ |A| and akl = alk ,we get that |akl | ≤ C|A|, namely A∞ ≤ C|A|. −α |A| . Let w = e|i|/a . For any u ∈ +, Second, we prove |A| ≤ m ∞ i 2 2 aij uj wi |Au| = |i|≤m+1 |j |≤m+1 2 −1/2 1/2 = a w u w ij j j j wi |i|≤m+1 |j |≤m+1 ≤ aij2 wj−1 u2j wj wi |i|≤m+1
= u2
|j |≤m+1
|i|≤m+1
≤ A2∞ u2
|j |≤m+1
|j |≤m+1
aij2 wj−1 wi
|i|≤m+1 |j |≤m+1
Thus, |A| ≤ A∞
wi wj−1 .
|i|≤m+1 |j |≤m+1
wi wj−1 .
Note that A is a matrix of order 4(m + 1) + 2. Then
|i|≤m+1 |j |≤m+1
wi wj−1 =
exp(
|i|≤m+1 |j |≤m+1
i−j ) a
≤ C(m + 1) exp(2(m + 1)/a) −2α ≤ m .
This proves this lemma for the matrix A. The proof for the vector B is simpler. We omit it. ! Lemma A.3. Let A, B, C be r × r, s × s r × s matrices, respectively; and let X be a r × s unknown matrix. Then the matrix equation AX + XB = C is solvable if and only if the vector equation (Es ⊗ A + B T ⊗ Er )X = C is solvable, where X = (X1T , · · · , XsT )T , C = (C1T , · · · , CsT )T if we write X = (X1 , · · · , Xs ) and C = (C1 , · · · , Cs ). Moreover, X∞ ≤ (Es ⊗ A + B T ⊗ Er )−1 ∞ C∞ if the inverse exists. Proof. This lemma can be found in many textbooks on matrix theory, for example [L, p. 256]. ! Lemma A.4. Assume that m0 > 0 is a constant such that when m + 1 ≥ m0 , the following inequalities: (1 + 2)(m + 1)−6 − m−6 > (2N + 1)−1
2 (m + 1)−6 , 2
(1)
2 1/3 (1 + 2)m−3(m+1) (m + 1)−6 (2m + 3)−2 (2m + 5)−2 − 1 > 1 (2) 128
are fulfilled, then we have 1+τ Km+1m+1 qm+1 1 2(β + K(ω)) ≤ , γm+1 qm − qm+1 2
m + 1 ≥ m0 ,
if 0 < 1 Proof. When m + 1 > m0 , namely m ≥ m0 , by the definition of qm we have ! " qm+1 = ∧ (1 + 2)m ((1 + 2)(m + 1)−6 − m−6 ) , qm #2 $ (by(1)) ≤ ∧ (1 + 2)m (m + 1)−6 . 2
(3)
When m + 1 = m0 , −m0 +1 qm+1 1 qm0 (1+τm )(m0 −1) 1/m6 = = m0 0 Mm0 0 qm qm0 −1 8(β + K(ω))γm0 +1 #2 $ ≤ ∧ (1 + 2)m (m + 1)−6 , 2
(4)
where 1/γm+1 = O(C(m)) and Mm+1 = O(C(m)| ln |) are used. By (3) and (4) we get qm+1 < 21 qm and then qm − qm+1 > 21 qm . By item (9) in Sect. 3.2, we have qm+1 1+τ Km+1m+1 qm − qm+1 % 1/3 ∧ ≤ C(m + 2) (64(2N + 1)(1 + 2)3(m+1) (2m + 3)2 (2m + 5)2 ) ·
(1 + 2)m−3(m+1)
1/3
(2)
2 (2N + 1)−1 (m + 1)−6 (2m + 3)−2 (2m + 5)−2 − 1 128
&
→ ≤C(m + 2) ∧ {(64(2N + 1)(1 + 2)3(m+1) (2m + 3)2 (2m + 5)2 )}. 1/3
−1 In view of γm+1 = O(C(m)), we have 1+τ Km+1m+1 qm+1 1 ≤ , 2(β + K) γm+1 qm − qm+1 2
if 0 < 1.
!
Acknowledgements. This work was carried out during the author's stay at CMAF, University of Lisbon, under a Post-Doctoral Fellowship grant from FCT. I am grateful to A. Nunes for useful discussions, to D. Qian for drawing my attention to MacKay and Aubry's work [M-A], to D. Bambusi and C. E. Wayne for pointing out a fatal error in the previous version of this paper, and to J. Wan for her invaluable encouragement. I am indebted to Robert MacKay for his kind invitation to visit the Univ. of Warwick and for useful discussions. My cordial thanks go to the referee for his invaluable comments and suggestions.
References

[Ah] Ahn, T.: Multisite oscillations in networks of weakly coupled autonomous oscillators. Nonlinearity 11, 965–989 (1998)
[A1] Aubry, S.: The concept of anti-integrability applied to dynamical systems and to structural and electronic models in condensed matter physics. Physica D 71, 196–221 (1994)
[A2] Aubry, S.: Anti-integrability in dynamical and variational problems. Physica D 86, 284–296 (1995)
[A3] Aubry, S.: Breathers in nonlinear lattices: Existence, linear stability and quantization. Physica D 103, 201–250 (1997)
[A-A] Aubry, S. and Abramovici, G.: Chaotic trajectories in the standard map. The concept of anti-integrability. Physica D 43, 199–219 (1990)
[Ar] Arnol'd, V.I.: Mathematical Methods of Classical Mechanics. New York: Springer-Verlag, 1978
[B] Bambusi, D.: Exponential stability of breathers in Hamiltonian networks of weakly coupled oscillators. Nonlinearity 9, 433–457 (1996)
[Bo1] Bourgain, J.: Quasi-periodic solutions of Hamiltonian perturbations for 2D linear Schrödinger equation. Ann. Math. 148, 363–439 (1998)
[Bo2] Bourgain, J.: Construction of quasi-periodic solutions for Hamiltonian perturbations of linear equations and application to nonlinear pde. International Math. Research Notices 11, 475–497 (1994)
[Bo3] Bourgain, J.: On Melnikov's persistency problem. Math. Res. Lett. 4, 445–458 (1997)
[B-M-S] Bogolyubov, N.N., Mitropol'skii, Yu.A. and Samoilenko, A.M.: Methods of Accelerated Convergence in Nonlinear Mechanics. New York: Springer-Verlag, 1976 [Russian Original: Kiev: Naukova Dumka, 1969]
[C-W] Craig, W. and Wayne, C.E.: Newton's method and periodic solutions of nonlinear wave equation. Commun. Pure. Appl. Math. 46, 1409–1501 (1993)
[C-Y] Chierchia, L. and You, J.: KAM tori for 1D nonlinear wave equations with periodic boundary conditions. Comm. Math. Phys. 211, 497–525 (2000)
[F-S-W] Fröhlich, J., Spencer, T. and Wayne, C.E.: Localization in Disordered, Nonlinear Dynamical Systems. J. Statistical Physics 42, 247–274 (1986)
[J-A] Johansson, M., Aubry, S., Gaididei, Y.B., Christiansen, P.L. and Rasmussen, K.: Dynamics of breathers in discrete nonlinear Schrödinger models. Physica D 119, 115–124 (1998)
[K1] Kuksin, S.B.: Nearly integrable infinite-dimensional Hamiltonian systems. (Lecture Notes in Math. 1556). New York: Springer-Verlag, 1993
[K2] Kuksin, S.B.: Elements of a qualitative theory of Hamiltonian PDEs. Doc. Math. J. DMV (Extra Volume ICM) II, 819–829 (1998)
[K-P] Kuksin, S.B. and Pöschel, J.: Invariant Cantor manifolds of quasiperiodic oscillations for a nonlinear Schrödinger equation. Ann. Math. 142, 149–179 (1995)
[L] Lancaster, P.: Theory of Matrices. New York and London: Academic Press, 1969
[M] MacKay, R.S.: Recent progress and outstanding problems in Hamiltonian dynamics. Physica D 86, 122–133 (1995)
[M-A] MacKay, R.S. and Aubry, S.: Proof of existence of breathers of time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)
[P1] Pöschel, J.: Small divisors with spatial structure in infinite dimensional Hamiltonian systems. Comm. Math. Phys. 127, 351–393 (1990)
[P2] Pöschel, J.: A KAM-theorem for some nonlinear PDEs. Ann. Scuola Norm. Sup. Pisa, Cl. Sci., IV Ser. 15 23, 119–148 (1996)
[P3] Pöschel, J.: Quasiperiodic solutions for nonlinear wave equation. Comment. Math. Helvetici 71, 269–296 (1996)
[V-B] Vittot, M. and Bellissard, J.: Invariant tori for an infinite lattice of coupled classical rotators. Preprint, CPT-Marseille (1985)
[W1] Wayne, C.E.: Periodic and quasi-periodic solutions of nonlinear wave equations via KAM theory. Commun. Math. Phys. 127, 479–528 (1990)
[W2] Wayne, C.E.: Bounds on the trajectories of a system of weak coupled rotators. Commun. Math. Phys. 104, 21–36 (1986)
[Yu] Yu, J.: The existence of discrete breather of non-weakly coupled oscillators. Preprint, Fudan University (2000)
[Y] Yuan, X.: Invariant tori for Duffing-Type equations. J. Differential Equations 142(2), 231–262 (1998)

Communicated by G. Gallavotti
Commun. Math. Phys. 226, 101 – 130 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Possible Loss and Recovery of Gibbsianness During the Stochastic Evolution of Gibbs Measures

A.C.D. van Enter1, R. Fernández2, F. den Hollander3, F. Redig4

1 Instituut voor Theoretische Natuurkunde, Rijksuniversiteit Groningen, Nijenborg 4, 9747 AG Groningen, The Netherlands
2 Labo de Maths Raphael SALEM, UMR 6085, CNRS-Université de Rouen, Mathematiques, Site Colbert, 76821 Mont Saint Aignan, France
3 EURANDOM, Postbus 513, 5600 MB Eindhoven, The Netherlands
4 Faculteit Wiskunde en Informatica, Technische Universiteit Eindhoven, Postbus 513, 5600 MB Eindhoven, The Netherlands

Received: 26 April 2001 / Accepted: 10 October 2001
Abstract: We consider Ising-spin systems starting from an initial Gibbs measure ν and evolving under a spin-flip dynamics towards a reversible Gibbs measure µ ≠ ν. Both ν and µ are assumed to have a translation-invariant finite-range interaction. We study the Gibbsian character of the measure νS(t) at time t and show the following: (1) For all ν and µ, νS(t) is Gibbs for small t. (2) If both ν and µ have a high or infinite temperature, then νS(t) is Gibbs for all t > 0. (3) If ν has a low non-zero temperature and a zero magnetic field and µ has a high or infinite temperature, then νS(t) is Gibbs for small t and non-Gibbs for large t. (4) If ν has a low non-zero temperature and a non-zero magnetic field and µ has a high or infinite temperature, then νS(t) is Gibbs for small t, non-Gibbs for intermediate t, and Gibbs for large t. The regime where µ has a low or zero temperature and t is not small remains open. This regime presumably allows for many different scenarios.

1. Introduction

Changing interaction parameters, like the temperature or the magnetic field, in a thermodynamic system is the preeminent way of studying such a system. In the theory of interacting particle systems, which are used as microscopic models for thermodynamic systems, one associates with each such interaction parameter a class of stochastic evolutions, like Glauber dynamics or Kawasaki dynamics. In recent years there has been extensive interest in the quenching regime, in which one starts from a high- or infinite-temperature Gibbs state and considers the behavior of the system under a low- or zero-temperature dynamics. This is interpreted as a fast cooling procedure (which is different from the slow cooling procedure of simulated annealing). One is interested in the asymptotic behavior of the system, in particular, the occurrence of trapping in metastable frozen or semi-frozen states (see [13, 36, 37, 14, 35, 38, 6]).
Another regime that has been intensively studied is the one where, starting from a low-non-zero-temperature Gibbs state of Ising spins in a positive magnetic field, one considers a low-non-zero-temperature negative-magnetic-field Glauber dynamics (see [40] and references therein). Under an appropriate rescaling of the time and the magnetic-field strength, one finds a metastable transition from the initial plus-state to the final minus-state.

In this paper we concentrate on the opposite case of the unquenching regime, in which one starts from a low-non-zero-temperature Gibbs state of Ising spins and considers the behavior of the system under a high- or infinite-temperature Glauber dynamics. This is interpreted as a fast heating procedure. As far as we know, this regime has not been studied much (see e.g. [1]), as no singular behavior was expected to occur. Although we indeed know that there is exponentially fast convergence (cf. [25], Chapter 1, Theorem 4.1, and [33, 34]) to the high- or infinite-temperature Gibbs state (i.e., the asymptotic behavior is unproblematic), we will show that at sharp finite times there can be transitions between regimes where the evolved state is Gibbsian and regimes where the evolved state is non-Gibbsian.

If a measure is Gibbsian, then it is at least moderately well-behaved in the sense that it is possible to associate a reasonable interaction to it, and by taking the inverse norm of this interaction, some notion of effective temperature. As has been discussed in detail in [10], for discrete bounded spins, such as the Ising spins which we consider here, uniform summability of the interaction is one of the weakest notions of “well-behavedness” one could desire. The problem we study here can thus be interpreted as the question whether a system can lose (instead of change) its temperature by fast heating.
A more detailed question, which we do not address here (see Open Problem 7.4.1), would be to compute explicitly the changing temperatures (interactions) in those regimes where they exist. Also one might hope to obtain better regularity properties than just uniform summability.

In the light of the results in [10], Chapter 4, on renormalization-group transformations, it should perhaps not come as a surprise that such transitions can happen. Indeed, we can view the time-evolved measure as a kind of (single-site) renormalized Gibbs measure. Even though the image spin σt (x) at time t at site x is not a (random) function of the original spins σ0 (y) at time 0 for y in only a finite block around x, by the Feller character of the Glauber dynamics it depends only weakly on the spins σ0 (y) with y large. In that sense the time evolution is close to a standard renormalization-group transformation, without rescaling, and so a priori we can expect Griffiths-Pearce pathologies.

We will prove the following:

(1) For an arbitrary initial Gibbs measure and an arbitrary Glauber dynamics, both having finite range, the measure stays Gibbsian in a small time interval, whose length depends on both the initial measure and the dynamics (Theorem 4.1). This result, though somewhat surprising, essentially comes from the fact that for small times the set of sites where a spin flip has occurred consists of “small islands” that are far apart in a “sea” of sites where no spin flip has occurred.

(2) For a high- or infinite-temperature initial Gibbs measure and a high- or infinite-temperature Glauber dynamics, the measure is Gibbsian for all t > 0 (Theorems 5.1 and 6.2).

(3) For a low-non-zero-temperature initial Gibbs measure and a high- or infinite-temperature Glauber dynamics, there is a transition from Gibbs to non-Gibbs (Theorems 5.2 and 6.3). This result is somewhat counter-intuitive: after some time of heating the system it reaches a high temperature, where a priori we would expect the measure to be well-behaved because it should be exponentially close to a Completely Analytic (see [8]) high-temperature Gibbs measure. As we will see, this intuition is wrong because the system in fact loses its temperature. However, from the results of [31] it follows that this transition does not occur when the initial measure is a rigid ground state (zero-temperature) measure (i.e., a Dirac measure).

(4) For a low-non-zero-temperature initial Gibbs measure and a high- or infinite-temperature Glauber dynamics, there possibly is a transition back from non-Gibbs to Gibbs when the Hamiltonian of the initial Gibbs measure has a non-zero magnetic field (Theorems 5.2 and 6.3).

The complementary regimes, with a low- or zero-temperature Glauber dynamics acting over large times, are left open.

In Sect. 2 we start by giving some basic notations and definitions, and formulating some general facts. In Sect. 3 we give representations of the conditional probabilities of the time-evolved measure and clarify the link between the Gibbsian character of the time-evolved measure and the Feller property of the backwards process. These results are useful for proving the “positive side”, i.e., for showing that the time-evolved measure is Gibbsian. We use a criterion of [10], Chapter 4, Step 1, or [11] to identify bad configurations (points of essential discontinuity of every version of the conditional probabilities) as those configurations for which the constrained system (i.e., the measure at time 0 conditioned on the future bad configuration at time t > 0) exhibits a phase transition. This criterion will serve for the “negative side”, i.e., for showing that the time-evolved measure is non-Gibbsian.

In Sect. 4 we prove that for an arbitrary initial measure and an arbitrary dynamics, both having finite-range interactions, the measure at time t is Gibbs for all t ∈ [0, t0 ], where t0 depends on the interactions.

In Sect. 5 we treat the case of infinite-temperature dynamics, i.e., a product of independent Markov chains. This example already exhibits all the transitions between Gibbs and non-Gibbs we are after. Moreover, it has the advantage of fitting exactly in the framework of the renormalization-group transformations: the time-evolved measure is nothing but a single-site Kadanoff transform of the original measure, where the parameter p(t) of this transform varies continuously from p(0) = ∞ to p(∞) = 0. For the case of a low-temperature initial measure we restrict ourselves to the d-dimensional Ising model.

In Sect. 6 we show that the results of Sect. 5 also apply in the case of a high-temperature dynamics. The basic ingredient is a cluster expansion in space and time, as developed in [30] and worked out in detail in [27]. This is formulated in Theorem 6.1 and is the technical tool needed to develop the “perturbation theory” around the infinite-temperature case.

In Sect. 7 we give a dynamical interpretation of the transition from Gibbs to non-Gibbs in terms of a change in the most probable history of an improbable configuration. We show that the transition is not linked with a wrong behavior in the large deviations at fixed time, and we close by formulating a number of open problems.

2. Notations and Definitions
2.1. Configuration space. The configuration space of our system is $\Omega = \{-1,+1\}^{\mathbb{Z}^d}$, endowed with the product topology. Elements of $\Omega$ are denoted by $\sigma, \eta$. A configuration $\sigma$ assigns to each lattice point $x\in\mathbb{Z}^d$ a spin value $\sigma(x)\in\{-1,+1\}$. The set of all finite subsets of $\mathbb{Z}^d$ is denoted by $\mathcal{S}$. For $\Lambda\in\mathcal{S}$ and $\sigma\in\Omega$, we denote by $\sigma_\Lambda$ the restriction of $\sigma$ to $\Lambda$, while $\Omega_\Lambda$ denotes the set of all such restrictions. A function $f:\Omega\to\mathbb{R}$ is called local if there exists a finite set $\Delta\subset\mathbb{Z}^d$ such that $f(\eta) = f(\sigma)$ for $\sigma$ and $\eta$ coinciding on $\Delta$. The minimal such $\Delta$ is called the dependence set of $f$ and is denoted by $D_f$. The vector space of all local functions is denoted by $\mathcal{L}$. This is a uniformly dense subalgebra of the set of all continuous functions $C(\Omega)$. A local function $f:\Omega\to\mathbb{R}$ with dependence set $D_f\subset\Lambda$ can be viewed as a function on $\Omega_\Lambda$. With a slight abuse of notation we use $f$ for both objects. For $\sigma,\eta\in\Omega$ and $\Lambda\subset\mathbb{Z}^d$, we denote by $\sigma_\Lambda\eta_{\Lambda^c}$ the configuration whose restriction to $\Lambda$ (resp. $\Lambda^c$) coincides with $\sigma_\Lambda$ (resp. $\eta_{\Lambda^c}$). For $x\in\mathbb{Z}^d$ and $\sigma\in\Omega$, we denote by $\tau_x\sigma$ the shifted configuration defined by $\tau_x\sigma(y) = \sigma(y+x)$. The shift on functions $f\in C(\Omega)$ is defined by $\tau_x f(\sigma) = f(\tau_x\sigma)$.

For a sequence of real numbers $a_\Lambda$, indexed by finite subsets of $\mathbb{Z}^d$, we write
$$\lim_{\Lambda\uparrow\mathbb{Z}^d}a_\Lambda = a \tag{2.1}$$
if for any $\epsilon>0$ there exists a finite set $\Lambda_0\subset\mathbb{Z}^d$ such that for any $\Lambda\supset\Lambda_0$
$$|a_\Lambda - a| \le \epsilon. \tag{2.2}$$
A sequence of probability measures $\mu_\Lambda$ on $\Omega$ is said to converge to a probability measure $\mu$ on $\Omega$ (notation $\mu_\Lambda\to\mu$) if
$$\lim_{\Lambda\uparrow\mathbb{Z}^d}\int f\,d\mu_\Lambda = \int f\,d\mu\qquad\forall f\in\mathcal{L}. \tag{2.3}$$
2.2. Dynamics. The dynamics we consider in this paper is governed by a collection of spin-flip rates $c(x,\sigma)$, $x\in\mathbb{Z}^d$, $\sigma\in\Omega$, satisfying the following conditions:

1. Finite range: $c_x:\sigma\mapsto c(x,\sigma)$ is a local function of $\sigma$ for all $x$, with $\operatorname{diam}(D_{c_x})\le R<\infty$.
2. Translation invariance: $\tau_x c_0 = c_x$ for all $x$.
3. Strict positivity: $c(x,\sigma)>0$ for all $x$ and $\sigma$.

Note that these conditions imply that there exist $\epsilon, M\in(0,\infty)$ such that
$$0<\epsilon\le c(x,\sigma)\le M<\infty\qquad\forall x\in\mathbb{Z}^d,\ \sigma\in\Omega. \tag{2.4}$$
Given the rates $(c_x)$, we consider the generator defined on local functions $f\in\mathcal{L}$ by
$$Lf = \sum_{x\in\mathbb{Z}^d}c_x\nabla_x f, \tag{2.5}$$
where
$$\nabla_x f(\sigma) = f(\sigma^x) - f(\sigma). \tag{2.6}$$
Here, $\sigma^x$ denotes the configuration defined by $\sigma^x(x) = -\sigma(x)$ and $\sigma^x(y) = \sigma(y)$ for $y\ne x$. In [25], Theorem 3.9, it is proved that the closure of $L$ on $C(\Omega)$ is the generator of a unique Feller process $\{\sigma_t : t\ge0\}$. We denote by $S(t) = \exp(tL)$ the corresponding semigroup, by $P_\sigma$ the path-space measure given $\sigma_0 = \sigma$, and by $E_\sigma$ expectation over $P_\sigma$. The semigroup acts on probability measures $\nu$ on $\Omega$ via
$$\int(S(t)f)(\sigma)\,\nu(d\sigma) = \int f(\sigma)\,\nu S(t)(d\sigma), \tag{2.7}$$
stated in words: $\nu S(t)$ is the distribution of the configuration at time $t$ if the initial distribution at time zero is $\nu$. A probability measure $\mu$ on the Borel $\sigma$-field of $\Omega$ is called invariant for the process with generator $L$ if
$$\int Lf\,d\mu = 0\qquad\forall f\in\mathcal{L}. \tag{2.8}$$
It is called reversible if
$$\int(Lf)\,g\,d\mu = \int f\,(Lg)\,d\mu\qquad\forall f,g\in\mathcal{L}. \tag{2.9}$$
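The definitions (2.5), (2.8) and (2.9) can be tried out on a system small enough to enumerate. The sketch below is a finite caricature under explicit assumptions: a ring of four spins replaces $\mathbb{Z}^d$, the weight is a nearest-neighbour Ising weight with an arbitrarily chosen coupling `J`, and the rates are heat-bath rates. None of these specific choices come from the paper; they are just one convenient example of strictly positive, finite-range rates.

```python
import math
from itertools import product

N_SITES = 4   # sites 0..3 on a ring (illustrative finite stand-in for Z^d)
J = 0.6       # nearest-neighbour coupling, assumed for the example

def energy(sigma):
    """Ising energy on the ring, with the inverse temperature absorbed into J."""
    return -J * sum(sigma[i] * sigma[(i + 1) % N_SITES] for i in range(N_SITES))

def flip(sigma, x):
    """The configuration sigma^x: spin at x reversed, all others unchanged."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def rate(x, sigma):
    """Heat-bath flip rate c(x, sigma); strictly positive and of finite range."""
    return 1.0 / (1.0 + math.exp(energy(flip(sigma, x)) - energy(sigma)))

CONFIGS = list(product((-1, +1), repeat=N_SITES))
Z = sum(math.exp(-energy(s)) for s in CONFIGS)
MU = {s: math.exp(-energy(s)) / Z for s in CONFIGS}   # candidate reversible measure

def generator(f):
    """Return Lf, where (Lf)(sigma) = sum_x c(x, sigma) [f(sigma^x) - f(sigma)]."""
    return lambda s: sum(rate(x, s) * (f(flip(s, x)) - f(s)) for x in range(N_SITES))

def integral(f):
    """Integral of f with respect to MU."""
    return sum(f(s) * MU[s] for s in CONFIGS)

# Two arbitrary test functions.
f = lambda s: s[0] * s[1] + 0.5 * s[2]
g = lambda s: s[1] - s[3] * s[0]
Lf, Lg = generator(f), generator(g)

reversibility_defect = abs(integral(lambda s: Lf(s) * g(s)) - integral(lambda s: f(s) * Lg(s)))
invariance_defect = abs(integral(Lf))
print(reversibility_defect, invariance_defect)
```

Both defects vanish up to rounding because, for these rates, `rate(x, s) * MU[s]` is unchanged when `s` is replaced by `flip(s, x)`; that symmetry makes the bilinear form in (2.9) symmetric, and taking $g\equiv1$ gives (2.8).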
Reversibility implies invariance. For spin-flip dynamics with generator $L$ defined by (2.5), reversibility of $\mu$ is equivalent to
$$c(x,\sigma^x)\,\frac{d\mu^x}{d\mu} = c(x,\sigma)\qquad\forall x\in\mathbb{Z}^d,\ \sigma\in\Omega, \tag{2.10}$$
where $\mu^x$ denotes the distribution of $\sigma^x$ when $\sigma$ is distributed according to $\mu$. Note that (2.10) implies the existence of a continuous version of the Radon-Nikodým (RN) derivative $d\mu^x/d\mu$. This will be important in the sequel.

2.3. Interactions and Gibbs measures. A good interaction is a function
$$U:\mathcal{S}\times\Omega\to\mathbb{R}, \tag{2.11}$$
such that the following two conditions are satisfied:

1. Local potentials in the interaction: $U(A,\sigma)$ depends on $\sigma(x)$, $x\in A$, only.
2. Uniform summability:
$$\sum_{A\ni x}\ \sup_{\sigma\in\Omega}|U(A,\sigma)|<\infty\qquad\forall x\in\mathbb{Z}^d. \tag{2.12}$$

The set of all good interactions will be denoted by $\mathcal{B}$. A good interaction is called translation invariant if
$$U(A+x,\tau_{-x}\sigma) = U(A,\sigma)\qquad\forall A\in\mathcal{S},\ x\in\mathbb{Z}^d,\ \sigma\in\Omega. \tag{2.13}$$
The set of all translation-invariant good interactions is denoted by $\mathcal{B}_{ti}$. An interaction $U$ is called finite-range if there exists an $R>0$ such that $U(A,\sigma) = 0$ for all $A\in\mathcal{S}$ with $\operatorname{diam}(A)>R$. The set of all finite-range interactions is denoted by $\mathcal{B}^{fr}$ and the set of all translation-invariant finite-range interactions by $\mathcal{B}_{ti}^{fr}$.

For $U\in\mathcal{B}$, $\zeta\in\Omega$, $\Lambda\in\mathcal{S}$, we define the finite-volume Hamiltonian with boundary condition $\zeta$ as
$$H_\Lambda^\zeta(\sigma) = \sum_{A\cap\Lambda\ne\emptyset}U(A,\sigma_\Lambda\zeta_{\Lambda^c}) \tag{2.14}$$
and the Hamiltonian with free boundary condition as
$$H_\Lambda(\sigma) = \sum_{A\subset\Lambda}U(A,\sigma), \tag{2.15}$$
which depends only on the spins inside $\Lambda$. Corresponding to the Hamiltonian in (2.14) we have the finite-volume Gibbs measures $\mu_\Lambda^{U,\zeta}$, $\Lambda\in\mathcal{S}$, defined on $\Omega$ by
$$\int f(\xi)\,\mu_\Lambda^{U,\zeta}(d\xi) = \sum_{\sigma_\Lambda\in\Omega_\Lambda}f(\sigma_\Lambda\zeta_{\Lambda^c})\,\frac{\exp[-H_\Lambda^\zeta(\sigma)]}{Z_\Lambda^\zeta}, \tag{2.16}$$
where $Z_\Lambda^\zeta$ denotes the partition function normalizing $\mu_\Lambda^{U,\zeta}$ to a probability measure. Because of the uniform summability condition (2.12), the objects $H_\Lambda^\zeta$ and $\mu_\Lambda^{U,\zeta}$ are continuous as a function of the boundary condition $\zeta$.

For a probability measure $\mu$ on $\Omega$, we denote by $\mu_\Lambda^\zeta$ the conditional probability distribution of $\sigma(x)$, $x\in\Lambda$, given $\sigma_{\Lambda^c} = \zeta_{\Lambda^c}$. Of course, this object is only defined on a set of $\mu$-measure one. For $\Lambda\in\mathcal{S}$, $\Gamma\in\mathcal{S}$ and $\Lambda\subset\Gamma$, we denote by $\mu_\Gamma(\sigma_\Lambda|\zeta)$ the conditional probability to find $\sigma_\Lambda$ inside $\Lambda$, given that $\zeta$ occurs on $\Gamma\setminus\Lambda$. For $U\in\mathcal{B}$, we call $\mu$ a Gibbs measure with interaction $U$ if its conditional probabilities coincide with the ones prescribed in (2.16), i.e., if
$$\mu_\Lambda^\zeta = \mu_\Lambda^{U,\zeta}\qquad\mu\text{-a.s.}\quad\forall\Lambda\in\mathcal{S},\ \zeta\in\Omega. \tag{2.17}$$
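To make (2.14), (2.16) and the consistency requirement behind (2.17) concrete, the sketch below builds finite-volume Gibbs measures for a one-dimensional nearest-neighbour Ising interaction and checks that conditioning the measure in a larger volume reproduces the measure in a smaller one. The interaction (pair term $-J\sigma(x)\sigma(x+1)$, single-site term $-h\sigma(x)$), the numerical values of `J` and `h`, and all function names are illustrative assumptions rather than objects defined in the paper.

```python
import math
from itertools import product

J, H_FIELD = 0.8, 0.3   # illustrative coupling and magnetic field (assumed values)

def hamiltonian(volume, sigma, zeta):
    """H_Lambda^zeta(sigma): sum of U(A, .) over all A meeting the volume.

    sigma maps sites in the volume to spins; zeta supplies the boundary spins
    on the neighbouring sites outside the volume.
    """
    full = dict(zeta)
    full.update(sigma)
    bonds = set()
    energy = 0.0
    for x in volume:
        energy += -H_FIELD * full[x]   # single-site sets A = {x}
        bonds.add((x - 1, x))          # pairs A = {x-1, x} and {x, x+1} meet the volume
        bonds.add((x, x + 1))
    for x, y in bonds:
        energy += -J * full[x] * full[y]
    return energy

def gibbs(volume, zeta):
    """Finite-volume Gibbs measure as a dict: spins on the sorted volume -> probability."""
    volume = sorted(volume)
    weights = {}
    for spins in product((-1, +1), repeat=len(volume)):
        sigma = dict(zip(volume, spins))
        weights[spins] = math.exp(-hamiltonian(volume, sigma, zeta))
    partition_function = sum(weights.values())
    return {spins: w / partition_function for spins, w in weights.items()}

# Volume {0, 1, 2} with boundary spins at sites -1 and 3.
zeta = {-1: +1, 3: -1}
mu = gibbs([0, 1, 2], zeta)

# Consistency: conditioning mu on the spins at 0 and 2 should give the
# single-site Gibbs measure at site 1 with those spins as boundary condition.
max_gap = 0.0
for s0, s2 in product((-1, +1), repeat=2):
    normalisation = mu[(s0, -1, s2)] + mu[(s0, +1, s2)]
    inner = gibbs([1], {0: s0, 2: s2})
    for s1 in (-1, +1):
        max_gap = max(max_gap, abs(mu[(s0, s1, s2)] / normalisation - inner[(s1,)]))
print(max_gap)
```

The vanishing gap is the finite-volume form of the statement that the conditional probabilities are those prescribed by the interaction; an infinite-volume Gibbs measure is required to satisfy this for every finite volume and almost every boundary condition.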
We denote by G(U ) the set of all Gibbs measures with interaction U . For any U ∈ B, G(U ) is a non-empty compact convex set. The set of all Gibbs measures is G(U ). (2.18) G= U ∈B
Note that $\mathcal{G}$ is not a convex set, since, for $U$ and $V$ in $\mathcal{B}_{ti}$, convex combinations of $\mathcal{G}(U)$ and $\mathcal{G}(V)$ are not in $\mathcal{G}$ unless $\mathcal{G}(U) = \mathcal{G}(V)$ (see [10, Sect. 4.5.1]).
Remark. We will often use the notation $H(\cdot) = \sum_A U(A,\cdot)$ for the “Hamiltonian” corresponding to the interaction $U$. This formal sum has to be interpreted as the collection of finite-volume Hamiltonians in (2.14), or as “energy differences”, i.e., if $\sigma$ and $\eta$ agree outside a finite volume $\Lambda$, then
\[ H(\eta) - H(\sigma) = H_\Lambda^\eta(\eta) - H_\Lambda^\eta(\sigma). \tag{2.19} \]
Definition 2.1. A measure $\mu$ is called Gibbsian if $\mu \in \mathcal{G}$, otherwise it is called non-Gibbsian.
2.4. Gibbsian and non-Gibbsian measures. In this paper we study the time-dependence of the Gibbsian property of a measure under the stochastic evolution $S(t)$. In other words, we want to investigate whether or not $\nu S(t) \in \mathcal{G}$ at a given time $t > 0$.
Proposition 2.2. The following three statements are equivalent:
1. $\mu \in \mathcal{G}$.
2. $\mu$ admits a continuous and strictly positive version of its conditional probabilities $\mu_\Lambda^\zeta$, $\Lambda \in \mathcal{S}$, $\zeta \in \Omega$.
3. $\mu$ admits a continuous version of the RN-derivatives $d\mu^x/d\mu$, $x \in \mathbb{Z}^d$.
Proof. See [23] and [41].
Stochastic Evolution of Gibbs Measures
We will mainly use item 3 and look for a continuous version of the RN-derivatives $d\mu^x/d\mu$ by approximating them uniformly with local functions. A necessary and sufficient condition for $\mu$ to be non-Gibbsian ($\mu \notin \mathcal{G}$) is the existence of a bad configuration, i.e., a point of essential discontinuity. This is defined as follows:
Definition 2.3. A configuration $\eta \in \Omega$ is called bad for a probability measure $\mu$ if there exist $\epsilon > 0$ and $x \in \mathbb{Z}^d$ such that for all $\Lambda \in \mathcal{S}$ there exist $\Gamma \supset \Lambda$ and $\xi, \zeta \in \Omega$ such that:
\[ \left| \mu_\Gamma(\sigma(x) \mid \eta_{\Lambda \setminus \{x\}} \zeta_{\Gamma \setminus \Lambda}) - \mu_\Gamma(\sigma(x) \mid \eta_{\Lambda \setminus \{x\}} \xi_{\Gamma \setminus \Lambda}) \right| > \epsilon. \tag{2.20} \]
Note that in this definition only the finite-dimensional distributions of $\mu$ enter. It is clear that a bad configuration is a point of discontinuity of every version of the conditional probabilities of $\mu$. Therefore, for such a configuration the equality (2.17) cannot hold. Conversely, a measure that has no bad configurations is Gibbsian (see e.g. [29]).
2.5. Main question. Our starting points in this paper are the following ingredients:
1. A translation invariant initial measure $\nu \in \mathcal{G}(U_\nu)$, corresponding to a finite-range translation-invariant interaction $U_\nu \in \mathcal{B}_{ti}^{fr}$ as introduced in Sect. 2.3.
2. A spin-flip dynamics, with flip rates as introduced in Sect. 2.2. This dynamics is supposed to have a translation-invariant reversible measure $\mu$, which thus satisfies
\[ \frac{d\mu^x}{d\mu} = \frac{c(x,\sigma)}{c(x,\sigma^x)}. \tag{2.21} \]
Hence, by Proposition 2.2 there exists an interaction $U_\mu \in \mathcal{B}$ such that $\mu \in \mathcal{G}(U_\mu)$. Since the rates are translation invariant and have finite range, this interaction can actually be chosen in $\mathcal{B}_{ti}^{fr}$ and satisfies (recall (2.14) and (2.17))
\[ \frac{d\mu^x}{d\mu} = \exp\Big( \sum_{A \ni x} [U_\mu(A,\sigma) - U_\mu(A,\sigma^x)] \Big). \tag{2.22} \]
Without loss of generality we can take the rates $c(x,\sigma)$ of the form
\[ c(x,\sigma) = \exp\Big( \frac{1}{2} \sum_{A \ni x} [U_\mu(A,\sigma) - U_\mu(A,\sigma^x)] \Big). \tag{2.23} \]
A finite-volume approximation of the rates in (2.23) that we will often use is given by
\[ c_\Lambda(x,\sigma) = \exp\Big( \frac{1}{2} \big[ H_\Lambda^\mu(\sigma) - H_\Lambda^\mu(\sigma^x) \big] \Big), \tag{2.24} \]
where $H_\Lambda^\mu$ is the Hamiltonian with free boundary condition (with a slight abuse of notation the upper index $\mu$ is referring here to the measure $\mu$, not to the boundary condition) associated with the interaction $U_\mu$ (recall (2.15)). These rates generate a pure-jump process on $\Omega_\Lambda = \{-1,+1\}^\Lambda$ with generator
\[ (L_\Lambda f)(\cdot) = \sum_{x \in \Lambda} c_\Lambda(x,\cdot)\, \nabla_x f(\cdot) \qquad \forall\, f \in \mathcal{L}. \tag{2.25} \]
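The reversibility relation (2.21) can be checked by brute force in a small volume. The sketch below assumes a nearest-neighbour Ising chain with free boundary condition as the Hamiltonian of $\mu$ and rates of the form (2.23) restricted to that chain; the model choice and the function names are illustrative assumptions only.

```python
import itertools
import math

def energy(sigma, beta, h):
    """Free-boundary Hamiltonian (2.15) of an assumed nearest-neighbour Ising chain."""
    bonds = sum(sigma[i] * sigma[i + 1] for i in range(len(sigma) - 1))
    return -beta * bonds - h * sum(sigma)

def flip(sigma, x):
    """The configuration sigma^x, with the spin at site x reversed."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def rate(x, sigma, beta, h):
    """c(x, sigma) = exp((1/2) [H(sigma) - H(sigma^x)]), cf. (2.23)."""
    return math.exp(0.5 * (energy(sigma, beta, h) - energy(flip(sigma, x), beta, h)))

def max_detailed_balance_error(n, beta, h):
    """Largest |mu(s) c(x, s) - mu(s^x) c(x, s^x)| with unnormalised mu = exp(-H)."""
    worst = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        for x in range(n):
            flipped = flip(sigma, x)
            lhs = math.exp(-energy(sigma, beta, h)) * rate(x, sigma, beta, h)
            rhs = math.exp(-energy(flipped, beta, h)) * rate(x, flipped, beta, h)
            worst = max(worst, abs(lhs - rhs))
    return worst

balance_error = max_detailed_balance_error(n=5, beta=0.3, h=0.1)
```

The error vanishes up to rounding because both sides equal $\exp(-\tfrac12[H(\sigma) + H(\sigma^x)])$, which is symmetric under the flip.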
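Since $L_\Lambda f$ converges to $Lf$ as $\Lambda \uparrow \mathbb{Z}^d$ for any local function $f \in \mathcal{L}$, the corresponding semigroup $S_\Lambda(t)$ converges strongly in the uniform topology on $C(\Omega)$ to the semigroup $S(t)$, i.e., $S_\Lambda(t) f \to S(t) f$ as $\Lambda \uparrow \mathbb{Z}^d$ in the uniform topology for any $f \in C(\Omega)$. Therefore we have the following useful approximation result. Let $\nu$ be a probability measure on $\Omega$ and $\nu_\Lambda$ its restriction to $\Omega_\Lambda$ (viewed as a subset of $\Omega$). Then
\[ \lim_{\Lambda \uparrow \mathbb{Z}^d} \nu_\Lambda S_\Lambda(t) = \nu S(t), \tag{2.26} \]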
where the limit is in the sense of (2.3). If $\nu \in \mathcal{G}(U_\nu)$ is a Gibbs measure, then we can replace the finite-volume restriction $\nu_\Lambda$ by the free-boundary-condition finite-volume Gibbs measure (in the uniqueness regime), or by the appropriate finite-volume Gibbs measure with generalized boundary condition that approximates $\nu$ (in the phase coexistence regime). The main question that we will address in this paper is the following:
Question. Is $\nu S(t)$ a Gibbs measure?
In order to study this rather general question we have to distinguish between different regimes, as defined next.
Definition 2.4. $U \in \mathcal{B}$ is a high-temperature interaction if
\[ \sup_{x \in \mathbb{Z}^d} \sum_{A \ni x} (|A| - 1) \sup_{\sigma, \sigma' \in \Omega} |U(A,\sigma) - U(A,\sigma')| < 2. \tag{2.27} \]
Inequality (2.27) implies the Dobrushin uniqueness condition for the associated conditional probabilities $\mu_\Lambda^{U,\zeta}$, $\Lambda \in \mathcal{S}$, $\zeta \in \Omega$ (see [15], p. 143, Proposition 8.8). In particular, it implies that $|\mathcal{G}(U)| = 1$ (i.e., no phase transition). Note that it is independent of the “single-site part” of the interaction, i.e., of the interactions $U(\{x\},\sigma)$.
Remark. The left-hand side of (2.27) defines a norm that we interpret as “inverse temperature”, i.e., small norm means high temperature.
Definition 2.5. We call:
1. a Gibbs measure $\nu$ “high-temperature” if it has an interaction satisfying (2.27).
2. a measure $\nu$ “infinite-temperature” if it is a product measure (i.e., if the corresponding interaction $U_\nu$ satisfies $U_\nu(A,\sigma) = 0$ for all $A$ with $|A| > 1$).
3. a dynamics “high-temperature” if it has an associated reversible Gibbs measure $\mu$ with an interaction $U_\mu$ satisfying (2.27).
4. a dynamics “infinite-temperature” if it has a reversible product measure $\mu$ (i.e., if the corresponding interaction $U_\mu$ satisfies $U_\mu(A,\sigma) = 0$ for all $A$ with $|A| > 1$).
We will use the following abbreviations: $T_\nu \gg 1$ is shorthand for “the Gibbs measure $\nu$ is high-temperature”, $T_\nu = \infty$ for “the Gibbs measure $\nu$ is infinite-temperature”, $T_\mu \gg 1$ for “the dynamics is high-temperature” and $T_\mu = \infty$ for “the dynamics is infinite-temperature”. Thus, $T_\nu$ denotes the “temperature” of the initial measure $\nu$, while $T_\mu$ denotes the “temperature” of the dynamics. As we will see in Sect. 5, the study of infinite-temperature dynamics is particularly instructive, since it can be treated essentially completely and already contains all the interesting phenomena we are after. In the regime we will study ($T_\mu \gg 1$), we automatically have the uniqueness of the reversible measure $\mu$ and the convergence $\nu S(t) \to \mu$ as $t \to \infty$.
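To make the high-temperature condition (2.27) concrete, here is a small sketch that evaluates its left-hand side for a nearest-neighbour Ising pair interaction on $\mathbb{Z}^d$. The interaction $U(\{x,y\},\sigma) = -\beta\sigma(x)\sigma(y)$ and the function names are assumptions chosen for the example; single-site terms drop out because $|A| - 1 = 0$.

```python
def ising_high_temperature_norm(beta, d):
    """Left-hand side of (2.27) for the assumed pair interaction -beta*s(x)*s(y)."""
    pairs_containing_x = 2 * d      # nearest neighbours of x in Z^d
    size_minus_one = 1              # |A| - 1 for a pair A = {x, y}
    oscillation = 2.0 * abs(beta)   # sup |U(A,s) - U(A,s')| = |beta - (-beta)|
    return pairs_containing_x * size_minus_one * oscillation

def is_high_temperature(beta, d):
    """True when the norm is strictly below 2, as (2.27) requires."""
    return ising_high_temperature_norm(beta, d) < 2.0
```

For this interaction the norm equals $4 d \beta$, so under these assumptions (2.27) holds exactly when $\beta < 1/(2d)$.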
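3. General Facts
3.1. Representation of the RN-derivative. As summarized in Proposition 2.2, an object of particular use in the investigation of the Gibbsian character of a measure is its RN-derivative $d\mu^x/d\mu$ w.r.t. a spin flip at site $x$. In this section we show how to exploit the reversibility of the dynamics in order to obtain a sequence of continuous functions converging to the RN-derivative of the time-evolved measure $\nu_t = \nu S(t)$ w.r.t. spin flip. Let us first consider the finite-volume case. We start from the finite-volume generator
\[ [L_\Lambda f](\sigma) = \sum_{x \in \Lambda} c_\Lambda(x,\sigma)\, (f(\sigma^x) - f(\sigma)), \tag{3.1} \]
where the finite-volume rates $c_\Lambda(x,\cdot)$ are given by (2.24). Suppose that our starting measure $\nu \in \mathcal{G}(U_\nu)$ is such that $|\mathcal{G}(U_\nu)| = 1$, which implies that the free-boundary-condition finite-volume approximations $\nu_\Lambda$ converge to $\nu$. The free-boundary-condition finite-volume Gibbs measure $\mu_\Lambda$, corresponding to the interaction $U_\mu$, is the reversible measure of the generator $L_\Lambda$. We can then compute, using reversibility,
\[
\begin{aligned}
\frac{d[\nu_\Lambda S_\Lambda(t)]^x}{d[\nu_\Lambda S_\Lambda(t)]}(\sigma)
&= \frac{d[\nu_\Lambda S_\Lambda(t)]^x}{d[\mu_\Lambda S_\Lambda(t)]^x}(\sigma)\,
   \frac{d[\mu_\Lambda S_\Lambda(t)]^x}{d[\mu_\Lambda S_\Lambda(t)]}(\sigma)\,
   \frac{d[\mu_\Lambda S_\Lambda(t)]}{d[\nu_\Lambda S_\Lambda(t)]}(\sigma) \\
&= \frac{d[\nu_\Lambda S_\Lambda(t)]}{d[\mu_\Lambda S_\Lambda(t)]}(\sigma^x)\,
   \frac{d\mu_\Lambda^x}{d\mu_\Lambda}(\sigma)\,
   \frac{d[\mu_\Lambda S_\Lambda(t)]}{d[\nu_\Lambda S_\Lambda(t)]}(\sigma) \\
&= \Big( S_\Lambda(t) \frac{d\nu_\Lambda}{d\mu_\Lambda} \Big)(\sigma^x)\,
   \frac{d\mu_\Lambda^x}{d\mu_\Lambda}(\sigma)\,
   \Big[ \Big( S_\Lambda(t) \frac{d\nu_\Lambda}{d\mu_\Lambda} \Big)(\sigma) \Big]^{-1}.
\end{aligned}
\tag{3.2}
\]
Definition 3.1. We define the “difference Hamiltonian” $H_\Lambda^{\mu,\nu}(\sigma) = \sum_{A \subset \Lambda} [U_\mu(A,\sigma) - U_\nu(A,\sigma)]$. Note that $H_\Lambda^{\mu,\nu}$ depends on both the initial measure and the dynamics.
Using this definition, we may rewrite (3.2) as
\[ \frac{d[\nu_\Lambda S_\Lambda(t)]^x}{d[\nu_\Lambda S_\Lambda(t)]}(\sigma) = \frac{d\mu_\Lambda^x}{d\mu_\Lambda}(\sigma)\, \frac{E^\Lambda_{\sigma^x} \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]}{E^\Lambda_\sigma \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]}, \tag{3.3} \]
where $E^\Lambda_\sigma$ denotes the expectation for the process with semigroup $S_\Lambda(t)$ starting from $\sigma$. Since this semigroup converges to the semigroup $S(t)$ of the infinite-volume process as $\Lambda \to \mathbb{Z}^d$, we obtain the following:
Proposition 3.2. If the right-hand side of (3.3) converges uniformly as $\Lambda \uparrow \mathbb{Z}^d$, then for any $\sigma \in \Omega$ and $t \ge 0$:
\[ \frac{d\nu S(t)^x}{d\nu S(t)}(\sigma) = \frac{d\mu^x}{d\mu}(\sigma) \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{E^\Lambda_{\sigma^x} \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]}{E^\Lambda_\sigma \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]}, \tag{3.4} \]
and $\nu S(t)$ is a Gibbs measure.
Proof. The claim follows from a combination of (2.26) and (3.3) with Lemma 3.3 below.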
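The finite-volume identity (3.3) can be verified numerically. The sketch below assumes a three-site Ising chain with free boundary condition, inverse temperatures `beta_mu` for the dynamics and `beta_nu` for the initial measure, and reversible rates of the form $\exp(\tfrac12[H^\mu(\sigma) - H^\mu(\sigma^x)])$. The semigroup is computed with a hand-written matrix exponential so that only the standard library is needed. All names and parameter values are illustrative assumptions.

```python
import itertools
import math

N = 3  # sites in the finite volume
STATES = list(itertools.product((-1, 1), repeat=N))
INDEX = {s: i for i, s in enumerate(STATES)}

def chain_energy(sigma, beta):
    return -beta * sum(sigma[i] * sigma[i + 1] for i in range(N - 1))

def flip(sigma, x):
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def generator(beta_mu):
    """Matrix of L_Lambda with rates exp((1/2)[H(sigma) - H(sigma^x)])."""
    q = [[0.0] * len(STATES) for _ in STATES]
    for s in STATES:
        for x in range(N):
            c = math.exp(0.5 * (chain_energy(s, beta_mu)
                                - chain_energy(flip(s, x), beta_mu)))
            q[INDEX[s]][INDEX[flip(s, x)]] += c
            q[INDEX[s]][INDEX[s]] -= c
    return q

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm(q, t, squarings=10, terms=20):
    """exp(t*q): Taylor series for t*q/2^squarings, then repeated squaring."""
    n = len(q)
    scale = t / 2 ** squarings
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v * scale / k for v in row] for row in matmul(term, q)]
        result = [[r + v for r, v in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def max_error(beta_mu, beta_nu, t):
    """Largest gap between the two sides of (3.3) over all sigma and x."""
    p = expm(generator(beta_mu), t)
    size = len(STATES)
    w_mu = [math.exp(-chain_energy(s, beta_mu)) for s in STATES]
    w_nu = [math.exp(-chain_energy(s, beta_nu)) for s in STATES]
    # Unnormalised nu S(t); the normalisation cancels in every ratio below.
    nu_t = [sum(w_nu[i] * p[i][j] for i in range(size)) for j in range(size)]
    # exp[H^{mu,nu}] with H^{mu,nu} = H^mu - H^nu (Definition 3.1).
    g = [math.exp(chain_energy(s, beta_mu) - chain_energy(s, beta_nu)) for s in STATES]
    expect = [sum(p[i][j] * g[j] for j in range(size)) for i in range(size)]
    worst = 0.0
    for s in STATES:
        for x in range(N):
            i, ix = INDEX[s], INDEX[flip(s, x)]
            lhs = nu_t[ix] / nu_t[i]
            rhs = (w_mu[ix] / w_mu[i]) * expect[ix] / expect[i]
            worst = max(worst, abs(lhs - rhs))
    return worst

error_33 = max_error(beta_mu=0.2, beta_nu=1.0, t=0.7)
```

The two sides agree up to rounding because reversibility gives $\nu_\Lambda S_\Lambda(t)(\eta) = \mu_\Lambda(\eta)\, E^\Lambda_\eta[(d\nu_\Lambda/d\mu_\Lambda)(\sigma_t)]$, which is the step used in (3.2).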
Lemma 3.3. If $\nu_n \to \nu$ weakly as $n \to \infty$, and $d\nu_n^x/d\nu_n \in C(\Omega)$ exists for any $n \in \mathbb{N}$ and converges uniformly to a continuous function $\Psi$, then $\Psi$ defines a continuous version of $d\nu^x/d\nu$.
Proof. Let $f : \Omega \to \mathbb{R}$ be a continuous function. Define $\theta_x : \Omega \to \Omega$ by $\theta_x(\sigma) = \sigma^x$. Then also $f \circ \theta_x : \Omega \to \mathbb{R}$ is a continuous function. Therefore
\[
\begin{aligned}
\int f\, d\nu^x &= \int (f \circ \theta_x(\sigma))\, \nu(d\sigma) = \lim_{n \uparrow \infty} \int (f \circ \theta_x(\sigma))\, \nu_n(d\sigma) \\
&= \lim_{n \uparrow \infty} \int \frac{d\nu_n^x}{d\nu_n}(\sigma)\, f(\sigma)\, \nu_n(d\sigma) \\
&= \lim_{n \uparrow \infty} \int \Psi(\sigma) f(\sigma)\, \nu_n(d\sigma) \\
&= \int \Psi f\, d\nu,
\end{aligned}
\tag{3.5}
\]
where the fourth equality follows from
\[ \lim_{n \uparrow \infty} \left| \int \Big( \frac{d\nu_n^x}{d\nu_n}(\sigma) - \Psi(\sigma) \Big) f(\sigma)\, \nu_n(d\sigma) \right| \le \lim_{n \uparrow \infty} \|f\|_\infty \left\| \frac{d\nu_n^x}{d\nu_n} - \Psi \right\|_\infty = 0. \tag{3.6} \]
Since (3.5) holds for any continuous function $f$, the statement of the lemma follows from the Riesz representation theorem.
Proposition 3.2, combined with Proposition 2.2, will be used in Sects. 4–6 to prove Gibbsianness.
3.2. Path-space representation of the RN-derivative. An alternative representation of the RN-derivative $d\nu_t^x/d\nu_t$ is obtained by observing that $\nu_t = \nu S(t)$ is the restriction to the “layer” $\{t\} \times \Omega$ of the path-space measure $P_\nu^{[0,t]}$. In some sense, this path-space measure can be given a Gibbsian representation with the help of Girsanov’s formula. The “relative energy for spin flip” of this path-space measure is a well-defined (though unbounded) random variable. Conditioning the path-space measure RN-derivative for a spin flip at site $x$ on the layer $\{t\} \times \Omega$, we get the RN-derivative $d\nu_t^x/d\nu_t$. More formally, let us denote by $\pi_t$ the projection on time $t$ in path space, i.e., $\pi_t(\omega) = \omega_t$ with $\omega \in D([0,t],\Omega)$, the Skorokhod space. By a spin flip at site $x$ in path space we mean a transformation
\[ \Theta_x : D([0,t],\Omega) \to D([0,t],\Omega) \tag{3.7} \]
such that
\[ (\pi_t(\omega))^x = \pi_t(\Theta_x(\omega)). \tag{3.8} \]
Different choices are possible, but in this section we choose
\[ (\Theta_x(\omega))(s,y) = \begin{cases} -\omega(s,x) & \text{for } y = x,\ 0 \le s \le t, \\ \omega(s,y) & \text{otherwise.} \end{cases} \tag{3.9} \]
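Let $\mathcal{F}_{[t]}$ denote the $\sigma$-field generated by the projection $\pi_t$. Then we can write the following formula:
\[ \frac{d\nu S(t)^x}{d\nu S(t)} = E_\nu^{[0,t]} \left( \frac{dP_\nu^{[0,t]} \circ \Theta_x}{dP_\nu^{[0,t]}} \,\Big|\, \mathcal{F}_{[t]} \right). \tag{3.10} \]
This equality is useful because of the Gibbsian form of the RHS of (3.10) given by Girsanov’s formula, as shown in the proof of the following:
Proposition 3.4. Let $\nu$ be a Gibbs measure on $\Omega$. For any $t > 0$,
\[ \nu S(t)^x \ll \nu S(t), \tag{3.11} \]
and the RN-derivative can be written in the form
\[ \frac{d\nu S(t)^x}{d\nu S(t)} = E_\nu^{[0,t]} \left( \Big( \frac{d\nu^x}{d\nu} \circ \pi_0 \Big) \Psi_x \,\Big|\, \mathcal{F}_{[t]} \right), \tag{3.12} \]
where $\Psi_x : D([0,t],\Omega) \to \mathbb{R}$ is a continuous function on path space (in the Skorokhod topology).
Proof. We first approximate our process by finite-volume pure-jump processes and use Girsanov’s formula to obtain the densities of these processes w.r.t. the independent spin-flip process. Indeed, denote by $P_\sigma^\Lambda$ the path-space measure of the finite-volume approximation with generator (2.25) and by $P_\sigma^{\Lambda,0}$ the path-space measure of the independent spin-flip process in $\Lambda$, i.e., the process with generator
\[ L_\Lambda^0 f = \sum_{x \in \Lambda} \nabla_x f, \qquad f \in \mathcal{L}. \tag{3.13} \]
We have for $f : \Omega \to \mathbb{R}$ such that $D_f \subset \Lambda$,
\[
\begin{aligned}
\int f(\sigma)\, \nu S(t)^x(d\sigma) &= \lim_{\Lambda \uparrow \mathbb{Z}^d} \int \nu(d\sigma) \int P_\sigma^\Lambda(d\omega)\, f(\pi_t(\Theta_x(\omega))) \\
&= \lim_{\Lambda \uparrow \mathbb{Z}^d} \int \nu(d\sigma) \int P_\sigma^{\Lambda,0}(d\omega)\, \frac{dP_\sigma^\Lambda}{dP_\sigma^{\Lambda,0}}(\omega)\, f(\pi_t(\Theta_x(\omega))).
\end{aligned}
\tag{3.14}
\]
Since $P_\sigma^{\Lambda,0}$ is the path-space measure of the independent spin-flip process, the transformed measure $P_\sigma^{\Lambda,0} \circ \Theta_x$ equals $P_{\sigma^x}^{\Lambda,0}$. Abbreviate
\[ F_\Lambda(\omega) = \frac{dP_{\omega_0}^\Lambda}{dP_{\omega_0}^{\Lambda,0}}(\omega). \tag{3.15} \]
Then we obtain
\[
\begin{aligned}
\int \nu(d\sigma) \int P_\sigma^{\Lambda,0}(d\omega)\, F_\Lambda(\omega) f(\pi_t(\Theta_x(\omega)))
&= \int \nu(d\sigma) \int P_{\sigma^x}^{\Lambda,0}(d\omega)\, F_\Lambda(\Theta_x(\omega)) f(\pi_t(\omega)) \\
&= \int \nu(d\sigma) \int P_{\sigma^x}^\Lambda(d\omega)\, \frac{dP_{\sigma^x}^{\Lambda,0}}{dP_{\sigma^x}^\Lambda}(\omega)\, F_\Lambda(\Theta_x(\omega)) f(\pi_t(\omega)) \\
&= \int \nu(d\sigma)\, \frac{d\nu^x}{d\nu}(\sigma) \int P_\sigma^\Lambda(d\omega)\, \Psi_{x,\Lambda}(\omega) f(\pi_t(\omega)),
\end{aligned}
\tag{3.16}
\]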
where $\Psi_{x,\Lambda}$ can be computed from Girsanov’s formula (see [26] p. 314) and for $\Lambda$ large enough reads
\[ \Psi_{x,\Lambda}(\omega) = \exp\Big( \sum_{|y-x| \le R} \int_0^t \log \frac{c(y,\omega_s^x)}{c(y,\omega_s)}\, dN_s^y(\omega) + \sum_{|y-x| \le R} \int_0^t [c(y,\omega_s) - c(y,\omega_s^x)]\, ds \Big), \tag{3.17} \]
where $N_t^y(\omega)$ is the number of spin flips at site $y$ up to time $t$ along the trajectory $\omega$. We thus obtain the representation of (3.12) by observing that $\Psi_{x,\Lambda}$ does not depend on $\Lambda$ for $\Lambda$ large enough and using the convergence of $P_\sigma^\Lambda$ to $P_\sigma$ as $\Lambda \uparrow \mathbb{Z}^d$. Indeed, the only point to check is that
\[ \Big( \frac{d\nu^x}{d\nu} \circ \pi_0 \Big) \Psi_x \in L^1(P_\nu), \tag{3.18} \]
so that the conditional expectation in (3.12) is well-defined. However, this is a consequence of the following two observations:
1. $d\nu^x/d\nu$ is uniformly bounded because $\nu \in \mathcal{G}$.
2. For $\Psi_x$ we have the bound
\[ |\Psi_x(\omega)| \le e^{2Ct} \Big( \frac{M}{\epsilon} \Big)^{N_t^{R,x}(\omega)}, \tag{3.19} \]
where, as in (2.4), $M$ and $\epsilon$ are the maximum and minimum rates, and $N_t^{R,x}(\omega)$ is the total number of spin flips in the region $\{y : |y - x| \le R\}$ up to time $t$ along the trajectory $\omega$. Since the rates are bounded from above, the expectation of the RHS of (3.19) over $P_\sigma$ is finite uniformly in $\sigma$.
3.3. Backwards process. Proposition 3.4 provides us with a representation of the RN-derivative $d\nu_t^x/d\nu_t$ that can be interpreted as the expectation of a continuous function on path space in the backwards process. The backwards process is the Markov process with a time-dependent transition operator given by
\[ (S_\nu^*(s,t) f)(\sigma) = E_\nu(f \circ \pi_s \mid \sigma_t = \sigma), \qquad 0 \le s \le t, \tag{3.20} \]
where Eν (·|σt = σ ) is conditional expectation with respect to the σ -field at time t. Note that this transition operator depends on the initial Gibbs measure ν and is a function of s and t (time-inhomogeneous process). Although the evolution has a reversible measure µ, at any finite time the distribution at time t is not µ. This causes essential differences between the forward and the backwards process. The dependence of Sν∗ (s, t) on ν is crucial and shows that even for innocent dynamics, like the independent spin-flip process, the transition operators of the backwards process may fail to be Feller for certain choices of ν (see Sect. 5 below). In general, the independence of the Poisson clocks that govern the spin flips (in the backwards process this means were flipped) is lost. In order to have continuity of the RN-derivative dνtx /dνt , it is sufficient that the operators Sν∗ (s, t) have the Feller property, i.e., map continuous functions to continuous functions.
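Proposition 3.5. If $\nu$ is a Gibbs measure, then:
\[ S_\nu^*(s,t)\, C(\Omega) \subset C(\Omega) \quad \forall\, 0 \le s < t \le t_0 \qquad \Longrightarrow \qquad \nu S(t) \in \mathcal{G} \quad \forall\, 0 \le t \le t_0. \tag{3.21} \]
Proof. This is an immediate consequence of Proposition 3.4. See also [22].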
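As in Sect. 3.1, we can thus hope to approximate the transition operators of the backwards process by “local operators” (operators mapping $\mathcal{L}$ onto $\mathcal{L}$).
Proposition 3.6. For any $\sigma \in \Omega$ and $0 \le s < t$, if, for $f \in \mathcal{L}$, the sequence of functions
\[ \sigma \mapsto \frac{E^\Lambda_\sigma \big( \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]\, f(\sigma_{t-s}) \big)}{E^\Lambda_\sigma \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]} \tag{3.22} \]
converges uniformly as $\Lambda \uparrow \mathbb{Z}^d$, then
\[ (S_\nu^*(s,t) f)(\sigma) = \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{E^\Lambda_\sigma \big( \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]\, f(\sigma_{t-s}) \big)}{E^\Lambda_\sigma \exp[H_\Lambda^{\mu,\nu}(\sigma_t)]}, \tag{3.23} \]
and $S_\nu^*(s,t) f \in C(\Omega)$, i.e., $S_\nu^*(s,t)$ is Feller.
Proof. Let us first compute $S_\nu^*(s,t)$ in the case of the finite-volume reversible Markov chain with generator (2.25). For the sake of notational simplicity, we omit the indices referring to the finite volume, and abbreviate $\nu_s = \nu S(s)$:
\[
\begin{aligned}
(S_\nu^*(s,t) f)(\sigma) &= \sum_\eta p_{t-s}(\eta,\sigma)\, \frac{\nu_s(\eta)}{\nu_t(\sigma)}\, f(\eta) \\
&= \sum_\eta p_{t-s}(\sigma,\eta)\, \frac{\mu_t(\sigma)}{\nu_t(\sigma)}\, \frac{\nu_s(\eta)}{\mu_s(\eta)}\, f(\eta) \\
&= \Big[ \Big( S(t) \frac{d\nu}{d\mu} \Big)(\sigma) \Big]^{-1} \sum_\eta p_{t-s}(\sigma,\eta) \Big( S(s) \frac{d\nu}{d\mu} \Big)(\eta)\, f(\eta) \\
&= \frac{S(t-s) \Big( \big( S(s) \frac{d\nu}{d\mu} \big) f \Big)(\sigma)}{\big( S(t) \frac{d\nu}{d\mu} \big)(\sigma)} \\
&= \frac{E_\sigma \big( \exp[H^{\mu,\nu}(\sigma_t)]\, f(\sigma_{t-s}) \big)}{E_\sigma \exp[H^{\mu,\nu}(\sigma_t)]},
\end{aligned}
\tag{3.24}
\]
where $H^{\mu,\nu}$ is defined in Definition 3.1.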
Propositions 3.5 and 3.6 are the analogues of Propositions 2.2 and 3.2. We will not actually use them, but they provide useful insight. The relation between the Feller property of the backwards process and the Gibbsianness of the stationary measure has been observed in [21].
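The chain of equalities in (3.24) can be checked on the smallest possible example, a single spin. The sketch below assumes $H^\mu(\sigma) = -h_\mu \sigma$ and $H^\nu(\sigma) = -h_\nu \sigma$, with reversible rates $c(\sigma) = \exp(-h_\mu \sigma)$, and compares the first line of (3.24) (Bayes' formula) with its last line (the ratio of expectations). The model and all names are assumptions made for illustration.

```python
import math

SPINS = (+1, -1)

def kernel(t, h_mu):
    """p_t(a, b) for the single-spin chain with flip rates c(s) = exp(-h_mu * s)."""
    out_rate = {s: math.exp(-h_mu * s) for s in SPINS}
    total = out_rate[+1] + out_rate[-1]
    decay = 1.0 - math.exp(-total * t)
    p = {}
    for a in SPINS:
        p[(a, -a)] = out_rate[a] / total * decay
        p[(a, a)] = 1.0 - p[(a, -a)]
    return p

def backward_bayes(f, s, t, h_mu, h_nu):
    """First line of (3.24): sum_eta p_{t-s}(eta, sigma) nu_s(eta) f(eta) / nu_t(sigma)."""
    p_s, p_t, p_ts = kernel(s, h_mu), kernel(t, h_mu), kernel(t - s, h_mu)
    nu0 = {a: math.exp(h_nu * a) for a in SPINS}  # unnormalised initial measure
    nu_s = {b: sum(nu0[a] * p_s[(a, b)] for a in SPINS) for b in SPINS}
    nu_t = {b: sum(nu0[a] * p_t[(a, b)] for a in SPINS) for b in SPINS}
    return {sig: sum(p_ts[(eta, sig)] * nu_s[eta] * f[eta] for eta in SPINS) / nu_t[sig]
            for sig in SPINS}

def backward_expectation(f, s, t, h_mu, h_nu):
    """Last line of (3.24): E_sigma(exp[H^{mu,nu}(s_t)] f(s_{t-s})) / E_sigma exp[H^{mu,nu}(s_t)]."""
    p_s, p_t, p_ts = kernel(s, h_mu), kernel(t, h_mu), kernel(t - s, h_mu)
    g = {a: math.exp((h_nu - h_mu) * a) for a in SPINS}  # exp[H^mu - H^nu]
    out = {}
    for sig in SPINS:
        num = sum(p_ts[(sig, eta)] * f[eta]
                  * sum(p_s[(eta, xi)] * g[xi] for xi in SPINS)
                  for eta in SPINS)
        den = sum(p_t[(sig, xi)] * g[xi] for xi in SPINS)
        out[sig] = num / den
    return out

test_function = {+1: 2.0, -1: -1.0}
via_bayes = backward_bayes(test_function, s=0.4, t=1.1, h_mu=0.3, h_nu=-0.8)
via_expectation = backward_expectation(test_function, s=0.4, t=1.1, h_mu=0.3, h_nu=-0.8)
```

Both routes give the same function of $\sigma$, and a constant $f$ is mapped to itself, as it must be for a transition operator.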
3.4. Criterion for Gibbsianness of $\nu S(t)$. A useful tool to study whether $\nu S(t) \in \mathcal{G}$ is to consider the joint distribution of $(\sigma_0, \sigma_t)$, where $\sigma_0$ is distributed according to $\nu$. Let us denote this joint distribution by $\hat\nu_t$, which can be viewed as a distribution on $\{-1,+1\}^S$ with $S = \mathbb{Z}^d \oplus \mathbb{Z}^d$ consisting of two “layers” of $\mathbb{Z}^d$. The correspondence between $\hat\nu_t$ and $\nu S(t)$ is made explicit by the formula
\[ \int \hat\nu_t(d\sigma, d\eta)\, f(\sigma) g(\eta) = \int \nu(d\sigma)\, (f\, S(t) g)(\sigma), \qquad f, g \in \mathcal{L}. \tag{3.25} \]
A priori the joint distribution $\hat\nu_t$ has more chance of being Gibbsian than $\nu S(t)$ and for the high-temperature dynamics we study this is actually the case. The measure $\nu S(t)$ can then be viewed as the restriction of a Gibbs measure of a two-layer system to the second layer. Restrictions of Gibbs measures have been studied e.g. in [39, 31, 11, 29, 28], and it is well-known that they can fail to be Gibbsian. In fact, most examples of non-Gibbsian measures can be viewed as restrictions of Gibbs measures. Formally, the Hamiltonian of $\hat\nu_t$ is
\[ H_t(\sigma,\eta) = H_\nu(\sigma) - \log p_t(\sigma,\eta), \tag{3.26} \]
where $p_t(\sigma,\eta)$ is the transition kernel of the dynamics. Of course, the object $\log p_t(\sigma,\eta)$ has to be interpreted in the sense of the formal sums $\sum_A U(A,\sigma)$ introduced in Sect. 2.3. More precisely, if $\delta_\sigma S(t)$ is a Gibbs measure for any $\sigma$, then $\log p_t(\sigma,\eta)$ is the Hamiltonian of this Gibbs measure. In order to prove or disprove Gibbsianness of the measure $\nu S(t)$, one has to study the Hamiltonian (3.26) for fixed $\eta$. Let us denote by $\mathcal{G}(H_t^\eta)$ the set of Gibbs measures associated with the Hamiltonian $H_t^\eta(\cdot) = H_t(\cdot,\eta)$. From [11] we have the following:
Proposition 3.7. For any $t \ge 0$, if $\hat\nu_t$ is Gibbs, then
1. If $|\mathcal{G}(H_t^\eta)| = 1$ for all $\eta \in \Omega$, then $\nu S(t)$ is a Gibbs measure.
2. For monotone specifications, if $|\mathcal{G}(H_t^\eta)| \ge 2$, then $\eta$ is a bad configuration for $\nu S(t)$, so $\nu S(t)$ is not a Gibbs measure (by Proposition 2.2).
Remark. Part 2 is expected to be true without the requirement of monotonicity but this has not been proved. A monotone specification arises e.g. when the Hamiltonian of (3.26) comes from a ferromagnetic pair potential and an arbitrary single-site part (possibly an inhomogeneous magnetic field).
4. Conservation of Gibbsianness for Small Times
Having put the technical machinery in place in Sects. 2–3, we are now ready to formulate and prove our main results in Sects. 4–6. In this section we prove that, for every finite-range spin-flip dynamics starting from a Gibbs measure $\nu$ corresponding to a finite-range interaction, the measure $\nu S(t)$ remains Gibbsian in a small interval of time $[0, t_0]$. The intuition behind this theorem is that for small times the set of sites where a spin flip has occurred consists of “small islands” that are far apart in a “sea” of sites where no spin flip has occurred. This means that sites that are far apart have more or less disjoint histories.
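Theorem 4.1. Let both the initial measure $\nu$ and the reversible measure $\mu$ be Gibbs measures for finite-range interactions $U_\nu$, resp. $U_\mu$. Then there exists $t_0 = t_0(\mu,\nu) > 0$ such that $\nu S(t)$ is a Gibbs measure for all $0 \le t \le t_0$.
Proof. During the proof we abbreviate $H_\Lambda = H_\Lambda^{\mu,\nu}$. We prove that the limit
\[ \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{E^\Lambda_{\sigma^x}(\exp[H_\Lambda(\sigma_t)])}{E^\Lambda_\sigma(\exp[H_\Lambda(\sigma_t)])} \tag{4.1} \]
converges uniformly in $t \in [0, t_0]$ for $t_0$ small enough when $U_\nu, U_\mu \in \mathcal{B}^{fr}$. The $t_0$ depends on both $U_\nu$ and $U_\mu$. Let us write $R_\nu$, $R_\mu$ to denote the range of $U_\nu$, $U_\mu$ (see Sect. 2.2).
I: $R_\nu < \infty$, $R_\mu = 0$. To warm up, we first deal with unbiased independent spin-flip dynamics. For this dynamics the distribution of $\sigma_t$ under $P^0_{\sigma^x}$ coincides with the distribution of $\sigma_t^x$ under $P^0_\sigma$. Therefore we can write
\[
\begin{aligned}
\frac{E^0_\sigma \exp[H_\Lambda(\sigma_t^x)]}{E^0_\sigma(\exp[H_\Lambda(\sigma_t)])}
&= \frac{\sum_{A \subset \Lambda} \delta_t^{|A|} (1-\delta_t)^{|\Lambda|-|A|} \exp[(H_\Lambda^{A \Delta \{x\}} - H_\Lambda)(\sigma)]}{\sum_{A \subset \Lambda} \delta_t^{|A|} (1-\delta_t)^{|\Lambda|-|A|} \exp[(H_\Lambda^A - H_\Lambda)(\sigma)]} \\
&= \left\{ \frac{\sum_{A \subset \Lambda} \big( \frac{\delta_t}{1-\delta_t} \big)^{|A|} \exp[(H_\Lambda^{A \Delta \{x\}} - H_\Lambda^{\{x\}})(\sigma)]}{\sum_{A \subset \Lambda} \big( \frac{\delta_t}{1-\delta_t} \big)^{|A|} \exp[(H_\Lambda^A - H_\Lambda)(\sigma)]} \right\} \Psi_x(\sigma),
\end{aligned}
\tag{4.2}
\]
where
\[ \Psi_x(\sigma) = \exp[(H_\Lambda^{\{x\}} - H_\Lambda)(\sigma)] \tag{4.3} \]
is a continuous function of $\sigma$, the sum runs over
\[ A = \{ y \in \Lambda : \sigma_t(y) \neq \sigma_0(y) \}, \tag{4.4} \]
while
\[ \delta_t = P^0_\sigma(\sigma_t(x) \neq \sigma_0(x)) = \tfrac{1}{2} (1 - e^{-2t}). \tag{4.5} \]
The notation $H_\Lambda^A$, $A \subset \Lambda$, is defined by
\[ H_\Lambda^A(\sigma) = H_\Lambda(\sigma^A) \tag{4.6} \]
with $\sigma^A$ the configuration obtained from $\sigma$ by flipping all the spins in $A$. Suppose first that $R_\nu = 1$. Then
\[ H_\Lambda^{A \cup B} - H_\Lambda^A = H_\Lambda^B - H_\Lambda \qquad \forall\, A, B : d(A,B) > 1. \tag{4.7} \]
For $A \subset \Lambda$ we can decompose $A$ into disjoint nearest-neighbor connected subsets $\gamma_1, \ldots, \gamma_k$ and thus rewrite (4.2) as follows:
\[ \frac{E^0_\sigma \exp[H_\Lambda(\sigma_t^x)]}{E^0_\sigma(\exp[H_\Lambda(\sigma_t)])} = \Psi_x\, \frac{\sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma^x(\gamma_i)}{\sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma(\gamma_i)} \tag{4.8} \]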
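with
\[ w_\sigma^x(\gamma) = \epsilon_t^{|\gamma|} \exp[H_\Lambda^{\gamma \Delta \{x\}}(\sigma) - H_\Lambda^{\{x\}}(\sigma)], \qquad w_\sigma(\gamma) = \epsilon_t^{|\gamma|} \exp[H_\Lambda^\gamma(\sigma) - H_\Lambda(\sigma)], \tag{4.9} \]
and $\epsilon_t = \delta_t/(1-\delta_t)$. Note that $w_\sigma^x(\gamma) = w_\sigma(\gamma)$ for all $\gamma$ that do not contain $x$. Next, since
\[ |(H_\Lambda^\gamma - H_\Lambda)(\sigma)| \le |\gamma|\, C \tag{4.10} \]
with
\[ C = 2 \sup_\Lambda \sup_\sigma \frac{|H_\Lambda(\sigma)|}{|\Lambda|} < \infty, \tag{4.11} \]
we have the estimate
\[ |w_\sigma(\gamma)| \le \exp(-\alpha_t |\gamma|) \qquad \text{with } \alpha_t = -C + \log(1/\epsilon_t). \tag{4.12} \]
A similar estimate holds for $|w_\sigma^x(\gamma)|$. Since $\alpha_t \uparrow \infty$ as $t \downarrow 0$, it follows that for $t$ small enough we can expand the logarithm of both the numerator and the denominator in (4.8) in a uniformly convergent cluster expansion:
\[
\begin{aligned}
\log \sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma^x(\gamma_i) &= \sum_\Gamma a(\Gamma)\, w_\sigma^x(\Gamma), \\
\log \sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma(\gamma_i) &= \sum_\Gamma a(\Gamma)\, w_\sigma(\Gamma).
\end{aligned}
\tag{4.13}
\]
The weights $w_\sigma^x(\Gamma)$ and $w_\sigma(\Gamma)$ differ only for clusters $\Gamma$ containing $x$. By the uniformity in $\sigma$ of the estimate (4.12) we have, for $t$ small enough,
\[ \sum_{\Gamma \ni x} |a(\Gamma)| \sup_\sigma |w_\sigma(\Gamma)| < \infty, \tag{4.14} \]
and the same holds for $w_\sigma(\Gamma)$ replaced by $w_\sigma^x(\Gamma)$. Therefore
\[ \limsup_{\Lambda \uparrow \mathbb{Z}^d} \sum_{\Gamma \ni x,\ \Gamma \not\subset \Lambda} \sup_\sigma |a(\Gamma)(w_\sigma^x(\Gamma) - w_\sigma(\Gamma))| = 0 \qquad \forall\, x \in \mathbb{Z}^d \tag{4.15} \]
and hence we obtain uniform convergence of the limit in (4.1). The case $R_\nu < \infty$ is treated in the same way. We only have to redefine the $\gamma_i$’s as the $R_\nu$-connected decomposition of $A$. Note that $t_0$ depends on $R_\nu$ and converges to zero when $R_\nu \uparrow \infty$.
II: $R_\nu < \infty$, $R_\mu < \infty$. Next we prove that the limit (4.1) converges uniformly if both interactions $U_\mu$, $U_\nu$ are finite range. For the sake of notational simplicity we first restrict ourselves to the case $R_\nu = R_\mu = 1$. We abbreviate $U = U_\mu - U_\nu$. The idea is that we go back to the independent spin-flip dynamics via Girsanov’s formula. After that we can again set up a cluster expansion, which includes additional factors in the weights due to the dynamics.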
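The factorization behind the cluster expansion of Part I rests on the additivity property (4.7). The following sketch checks it by exhaustive enumeration for a range-one Hamiltonian. The nearest-neighbour Ising chain with free boundary condition and the function names are assumptions made for the example.

```python
import itertools

def energy(sigma, beta, h):
    """Free-boundary Hamiltonian of an assumed nearest-neighbour Ising chain."""
    bonds = sum(sigma[i] * sigma[i + 1] for i in range(len(sigma) - 1))
    return -beta * bonds - h * sum(sigma)

def flip_set(sigma, sites):
    """The configuration sigma^A: all spins in `sites` reversed."""
    return tuple(-s if i in sites else s for i, s in enumerate(sigma))

def distance(a, b):
    return min(abs(i - j) for i in a for j in b)

def max_additivity_violation(n, beta, h, min_distance):
    """Largest |(H^{A u B} - H^A) - (H^B - H)| over sigma and d(A, B) >= min_distance."""
    subsets = [frozenset(c) for r in range(1, n + 1)
               for c in itertools.combinations(range(n), r)]
    worst = 0.0
    for a in subsets:
        for b in subsets:
            if distance(a, b) < min_distance:
                continue
            for sigma in itertools.product((-1, 1), repeat=n):
                lhs = (energy(flip_set(sigma, a | b), beta, h)
                       - energy(flip_set(sigma, a), beta, h))
                rhs = energy(flip_set(sigma, b), beta, h) - energy(sigma, beta, h)
                worst = max(worst, abs(lhs - rhs))
    return worst

separated = max_additivity_violation(5, beta=0.7, h=0.2, min_distance=2)
adjacent = max_additivity_violation(5, beta=0.7, h=0.2, min_distance=1)
```

When $d(A,B) > 1$ no bond joins $A$ to $B$, so the violation is zero up to rounding. Allowing adjacent sets produces a violation of order $\beta$, which is why the islands in the decomposition must be nearest-neighbor connected components.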
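The first step is to rewrite (4.1) in terms of the independent spin-flip dynamics:
\[ \frac{E^\Lambda_{\sigma^x}(\exp[H_\Lambda(\sigma_t)])}{E^\Lambda_\sigma(\exp[H_\Lambda(\sigma_t)])} = \frac{E^0_\sigma \Big( \exp\Big[ \sum_{y \in \Lambda} \Big( \int_0^t \log c(y,\sigma_s^x)\, dN_s^y + \int_0^t (1 - c(y,\sigma_s^x))\, ds \Big) \Big] \exp[H_\Lambda(\sigma_t^x)] \Big)}{E^0_\sigma \Big( \exp\Big[ \sum_{y \in \Lambda} \Big( \int_0^t \log c(y,\sigma_s)\, dN_s^y + \int_0^t (1 - c(y,\sigma_s))\, ds \Big) \Big] \exp[H_\Lambda(\sigma_t)] \Big)}. \tag{4.16} \]
For a given realization $\omega$ of the independent spin-flip process, we say that a site $y$ is $\omega$-active if the spin at that site has flipped at least once. The set of all $\omega$-active sites is denoted by $J(\omega)$. Let $\bar\sigma$ denote the trajectory that stays fixed at $\sigma$ over the time interval $[0,t]$. For $A \subset \Lambda$, define
\[
\begin{aligned}
U_1(A,\omega) &= \begin{cases} \int_0^t \log c(y,\omega_s)\, dN_s^y(\omega) + \int_0^t (1 - c(y,\omega_s))\, ds & \text{if } A = D_{c_y}, \\ 0 & \text{if } A \neq D_{c_y}, \end{cases} \\
U_2(A,\omega) &= U(A,\omega_t),
\end{aligned}
\tag{4.17}
\]
and put
\[ \mathcal{U}(A,\omega) = U_1(A,\omega) + U_2(A,\omega). \tag{4.18} \]
Also define
\[ \mathcal{U}^x(A,\omega) = \mathcal{U}(A,\omega^x), \tag{4.19} \]
where the trajectory $\omega^x$ is defined as
\[ (\omega^x)_s = (\omega_s)^x, \qquad 0 \le s \le t. \tag{4.20} \]
With this notation we can rewrite the right-hand side of (4.16) as
\[ \left\{ \frac{E^0_\sigma \exp\Big( \sum_{A \subset \Lambda} [\mathcal{U}^x(A,\omega) - \mathcal{U}^x(A,\bar\sigma)] \Big)}{E^0_\sigma \exp\Big( \sum_{A \subset \Lambda} [\mathcal{U}(A,\omega) - \mathcal{U}(A,\bar\sigma)] \Big)} \right\} \Psi_x(\sigma), \tag{4.21} \]
where
\[ \Psi_x(\sigma) = \exp\Big( \sum_{A \ni x} [\mathcal{U}^x(A,\bar\sigma) - \mathcal{U}(A,\bar\sigma)] \Big) \tag{4.22} \]
is a continuous function of $\sigma$. In order to obtain the uniform convergence of (4.1), it suffices now to prove the uniform convergence of the expression between brackets in (4.21). As in Part I, we decompose the set of $\omega$-active sites into disjoint nearest-neighbor connected sets $\gamma_1, \ldots, \gamma_k$ and rewrite, using the product character of $E^0_\sigma$,
\[ \frac{E^0_\sigma \exp\Big( \sum_{A \subset \Lambda} [\mathcal{U}^x(A,\omega) - \mathcal{U}^x(A,\bar\sigma)] \Big)}{E^0_\sigma \exp\Big( \sum_{A \subset \Lambda} [\mathcal{U}(A,\omega) - \mathcal{U}(A,\bar\sigma)] \Big)} = \frac{\sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma^x(\gamma_i)}{\sum_{n=0}^\infty \frac{1}{n!} \sum_{\gamma_1,\ldots,\gamma_n \subset \Lambda,\ \gamma_i \cap \gamma_j = \emptyset} \prod_{i=1}^n w_\sigma(\gamma_i)}. \tag{4.23} \]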
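The cluster weights are now given by
\[ w_\sigma(\gamma) = e^{t|\gamma|}\, E^0_\sigma \Big( 1\{J(\omega) \supset \gamma\} \exp\Big( \sum_{A \cap \gamma \neq \emptyset} [\mathcal{U}(A, \omega_\gamma \bar\sigma_{\Lambda \setminus \gamma}) - \mathcal{U}(A,\bar\sigma)] \Big) \Big), \tag{4.24} \]
and an analogous expression for $w_\sigma^x$ after we replace $\mathcal{U}$ by $\mathcal{U}^x$. The factor $e^{t|\gamma|}$ arises from the probability
\[ P^0_\sigma \big( J(\omega)^c \supset \Lambda \setminus \cup_i \gamma_i \big) = e^{-t|\Lambda \setminus \cup_i \gamma_i|} = e^{-t|\Lambda|} \prod_i e^{t|\gamma_i|}. \tag{4.25} \]
Having arrived at this point, we can proceed as in the case of the independent spin-flip dynamics. Namely, we estimate the weights $w_\sigma$ and prove that
\[ w_\sigma(\gamma) \le e^{-\alpha_t |\gamma|} \tag{4.26} \]
with $\alpha_t \uparrow \infty$ as $t \downarrow 0$. To obtain this estimate, note that
\[ P^0_\sigma(J(\omega) \supset \gamma) \le (1 - e^{-t})^{|\gamma|}. \tag{4.27} \]
Then apply to (4.24) Cauchy-Schwarz, the bounds in (2.4) on the flip rates, and the estimate
\[ C = \sup_\Lambda \sup_\sigma \frac{1}{|\Lambda|} \sum_{A \cap \Lambda \neq \emptyset} |U(A,\sigma)| < \infty, \tag{4.28} \]
to obtain
\[ w_\sigma(\gamma) \le e^{Kt|\gamma|} (1 - e^{-t})^{\frac{1}{2}|\gamma|} \qquad \text{for some } K = K(\epsilon, M, C). \tag{4.29} \]
This clearly implies (4.26). The case $R_\nu, R_\mu < \infty$ is straightforward after redefinition of the $\gamma_i$’s.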
5. Infinite-Temperature Dynamics
5.1. Set-up. In this section we consider the evolution of a Gibbs measure $\nu$ under a product dynamics, i.e., the flip rates $c(x,\sigma)$ depend only on $\sigma(x)$. The associated process $\{\sigma_t : t \ge 0\}$ is a product of independent Markov chains on $\{-1,+1\}$:
\[ P_\sigma = \otimes_{x \in \mathbb{Z}^d} P_{\sigma(x)}, \tag{5.1} \]
where $P_{\sigma(x)}$ is the Markov chain on $\{-1,+1\}$ with generator working on $\varphi : \{-1,+1\} \to \mathbb{R}$:
\[ L_x \varphi(\alpha) = c(x,\alpha)[\varphi(-\alpha) - \varphi(\alpha)]. \tag{5.2} \]
Let us denote by $p_t^x(\alpha,\beta)$ the probability for this Markov chain to go from $\alpha$ to $\beta$ in time $t$. The Hamiltonian (3.26) of the joint distribution of $(\sigma_0,\sigma_t)$ is then given by
\[ H_t(\sigma,\eta) = H_\nu(\sigma) - \sum_x \log p_t^x(\sigma(x),\eta(x)). \tag{5.3} \]
This equation can be rewritten as
\[ H_t(\sigma,\eta) = H_\nu(\sigma) - \sum_x h_1^x(t)\sigma(x) - \sum_x h_2^x(t)\eta(x) - \sum_x h_{12}^x(t)\sigma(x)\eta(x) \tag{5.4} \]
with
\[
\begin{aligned}
h_1^x(t) &= \frac{1}{4} \log \frac{p_t^x(+,+)\, p_t^x(+,-)}{p_t^x(-,+)\, p_t^x(-,-)}, \\
h_2^x(t) &= \frac{1}{4} \log \frac{p_t^x(+,+)\, p_t^x(-,+)}{p_t^x(+,-)\, p_t^x(-,-)}, \\
h_{12}^x(t) &= \frac{1}{4} \log \frac{p_t^x(+,+)\, p_t^x(-,-)}{p_t^x(+,-)\, p_t^x(-,+)}.
\end{aligned}
\tag{5.5}
\]
The fields $h_1^x$, resp. $h_2^x$, tend to pull $\sigma$, resp. $\eta$, in their direction, while $h_{12}^x$ is a coupling between $\sigma$ and $\eta$ that tends to align them. Indeed, note that $h_{12}^x(t)$ is positive because
\[ p_t^x(+,+)\, p_t^x(-,-) - p_t^x(+,-)\, p_t^x(-,+) = \det(\exp(tL_x)) \ge 0. \tag{5.6} \]
In what follows we will consider the case where the single-site generators $L_x$ are independent of $x$ and are given by
\[ L = \frac{1}{2} \begin{pmatrix} -1+\epsilon & 1-\epsilon \\ 1+\epsilon & -1-\epsilon \end{pmatrix} \qquad \text{for some } 0 \le \epsilon < 1. \tag{5.7} \]
For $\epsilon > 0$ this means independent spin flips favoring plus spins, for $\epsilon = 0$ it means independent unbiased spin flips. The invariant measure of the single-site Markov chain is $(\nu(+),\nu(-)) = \frac{1}{2}(1+\epsilon, 1-\epsilon)$. The relevant parameter in what follows is
\[ \delta = \frac{\nu(-)}{\nu(+)} = \frac{1-\epsilon}{1+\epsilon}. \tag{5.8} \]
In terms of this parameter the fields in (5.5) become
\[
\begin{aligned}
h_1(t) &= \frac{1}{4} \log \frac{1 + \delta e^{-t}}{1 + \frac{1}{\delta} e^{-t}}, \\
h_2(t) &= -\frac{1}{2} \log \delta + h_1(t), \\
h_{12}(t) &= \frac{1}{4} \log \frac{(1 + \delta e^{-t})(1 + \frac{1}{\delta} e^{-t})}{(1 - e^{-t})^2}.
\end{aligned}
\tag{5.9}
\]
In particular, for $\delta = 1$ we get $h_1(t) = h_2(t) = 0$ and
\[ h_{12}(t) = \frac{1}{2} \log \frac{1 + e^{-t}}{1 - e^{-t}}. \tag{5.10} \]
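The passage from (5.5) to (5.9) can be checked numerically. The sketch below builds the transition probabilities of the generator (5.7) from the fact that $L = \Pi - I$, where $\Pi$ has both rows equal to the invariant measure, so that $\exp(tL) = \Pi + e^{-t}(I - \Pi)$. It then compares the fields obtained from (5.5) with the closed forms in (5.9). The function names are hypothetical.

```python
import math

def transition_probabilities(t, eps):
    """p_t(a, b) for the single-site generator (5.7) with bias parameter eps."""
    nu = {+1: 0.5 * (1 + eps), -1: 0.5 * (1 - eps)}  # invariant measure
    decay = math.exp(-t)
    return {(a, b): nu[b] + ((1 if a == b else 0) - nu[b]) * decay
            for a in (+1, -1) for b in (+1, -1)}

def fields_from_kernel(t, eps):
    """(h1, h2, h12) computed from the transition probabilities as in (5.5)."""
    p = transition_probabilities(t, eps)
    pp, pm, mp, mm = p[(1, 1)], p[(1, -1)], p[(-1, 1)], p[(-1, -1)]
    h1 = 0.25 * math.log(pp * pm / (mp * mm))
    h2 = 0.25 * math.log(pp * mp / (pm * mm))
    h12 = 0.25 * math.log(pp * mm / (pm * mp))
    return h1, h2, h12

def fields_closed_form(t, eps):
    """(h1, h2, h12) from (5.9), with delta = (1 - eps) / (1 + eps) as in (5.8)."""
    delta = (1 - eps) / (1 + eps)
    e = math.exp(-t)
    h1 = 0.25 * math.log((1 + delta * e) / (1 + e / delta))
    h2 = -0.5 * math.log(delta) + h1
    h12 = 0.25 * math.log((1 + delta * e) * (1 + e / delta) / (1 - e) ** 2)
    return h1, h2, h12

numeric = fields_from_kernel(0.8, 0.3)
closed = fields_closed_form(0.8, 0.3)
```

For `eps = 0` the same functions reproduce (5.10), which can also be written as $-\tfrac12 \log\tanh(t/2)$.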
5.2. $1 \ll T_\nu \le \infty$, $T_\mu = \infty$.
Theorem 5.1. Let $\nu$ be a high- or infinite-temperature Gibbs measure, i.e., its interaction $U_\nu$ satisfies (2.27). Let $S(t)$ be the semigroup of an arbitrary infinite-temperature dynamics. Then $\nu S(t)$ is a Gibbs measure for all $t \ge 0$.
Proof. The joint distribution of $(\sigma_0,\sigma_t)$ is Gibbs with Hamiltonian (recall (3.26) and (5.4))
\[ H_t(\sigma,\eta) = H_\nu(\sigma) - \sum_x [h_1(t) + h_{12}(t)\eta(x)]\sigma(x) - h_2(t) \sum_x \eta(x). \tag{5.11} \]
For fixed $\eta$, the last term is constant in $\sigma$ and can therefore be forgotten. Since $H_t(\cdot,\eta)$ differs from $H_\nu(\cdot)$ only in the single-site interaction, $H_t(\cdot,\eta)$ satisfies (2.27) if and only if $H_\nu(\cdot)$ satisfies (2.27). Hence $|\mathcal{G}(H_t(\cdot,\eta))| = 1$ for any $\eta$, and we conclude from Proposition 3.7 that $\nu S(t)$ is Gibbsian.
Theorem 5.1 should not come as a surprise: the infinite-temperature dynamics acts as a single-site Kadanoff transformation and in the Dobrushin uniqueness regime such renormalized measures stay Gibbsian [16, 20, 10].
5.3. $0 < T_\nu \ll 1$, $T_\mu = \infty$, $\delta = 1$. For the initial measure we choose the low-temperature plus-phase of the $d$-dimensional Ising model, $\nu = \nu_{\beta,h}$, i.e., the Hamiltonian $H_\nu$ is specified to be
\[ H_\nu(\sigma) = -\beta \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - h \sum_x \sigma(x), \tag{5.12} \]
where $\sum_{\langle x,y \rangle}$ denotes the sum over nearest-neighbor pairs, and $\beta \gg \beta_c$ with $\beta_c$ the critical inverse temperature. The dynamics has generator
\[ Lf = \sum_x \nabla_x f, \tag{5.13} \]
corresponding to the case $\delta = 1$. The joint measure has a Hamiltonian as in (5.11), with $h_1(t) = h_2(t) = 0$ and $h_{12}(t) = h_t$:
\[ H_t(\sigma,\eta) = -\beta \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - h \sum_x \sigma(x) - h_t \sum_x \sigma(x)\eta(x). \tag{5.14} \]
The “dynamical field” is given by $h_t = -(1/2) \log[\tanh(t/2)]$.
Theorem 5.2. For $\beta \gg \beta_c$:
1. There exists a $t_0 = t_0(\beta,h)$ such that $\nu_{\beta,h} S(t)$ is a Gibbs measure for all $0 \le t \le t_0$.
2. If $h > 0$, then there exists a $t_1 = t_1(h)$ such that $\nu_{\beta,h} S(t)$ is a Gibbs measure for all $t \ge t_1$.
3. If $h = 0$, then there exists a $t_2 = t_2(\beta)$ such that $\nu_{\beta,0} S(t)$ is not a Gibbs measure for all $t \ge t_2$.
4. For $d \ge 3$, if $0 < h \le h(\beta)$ small enough, then there exist $t_3 = t_3(\beta,h) < t_4 = t_4(\beta,h)$ such that $\nu_{\beta,h} S(t)$ is not a Gibbs measure for all $t_3 \le t \le t_4$.
Proof. The proof uses (5.14).
1. For small $t$ the dynamical field $h_t$ is large and, for given $\eta$, forces $\sigma$ in the direction of $\eta$. Rewrite the joint Hamiltonian in (5.14) as
\[ H_t(\sigma,\eta) = \sqrt{h_t} \Big( -\frac{\beta}{\sqrt{h_t}} \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - \frac{h}{\sqrt{h_t}} \sum_x \sigma(x) - \sqrt{h_t} \sum_x \sigma(x)\eta(x) \Big) = \sqrt{h_t}\, \tilde H_t(\sigma,\eta). \tag{5.15} \]
For $0 \le t \le t_0$ small enough, $\tilde H_t(\cdot,\eta)$ has the unique ground state $\eta$ and so, for $\lambda \ge \lambda_0$ large enough, $\lambda \tilde H_t$ satisfies the Dobrushin uniqueness criterion (see [15], Example 2, p. 147). Therefore, for $0 \le t \le t_0$ such that $\sqrt{h_t} \ge \lambda_0$, $H_t(\cdot,\eta)$ has a unique Gibbs measure for any $\eta$. Hence, $\nu S(t)$ is Gibbs by Proposition 3.7(1).
2. For large $t$ the dynamical field $h_t$ is small and cannot cancel the effect of the external field $h > 0$. Rewrite the joint Hamiltonian as
\[ H_t(\sigma,\eta) = \sqrt{\beta} \Big( -\sqrt{\beta} \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - \frac{h}{\sqrt{\beta}} \sum_x \sigma(x) - \frac{h_t}{\sqrt{\beta}} \sum_x \sigma(x)\eta(x) \Big) = \sqrt{\beta}\, \tilde H_t(\sigma,\eta). \tag{5.16} \]
For $t \ge t_1 = t_1(h)$ large enough, $\tilde H_t(\cdot,\eta)$ has the unique ground state $\sigma = h/|h|$. Hence, for $\beta$ large enough, $\sqrt{\beta}\, \tilde H_t(\cdot,\eta)$ satisfies the Dobrushin uniqueness criterion (again, see [15], Example 2, p. 147). Hence, $\nu S(t)$ is Gibbs by Proposition 3.7(1).
3. This fact is a consequence of the results in [10], Sect. 4.3.4, for the single-site Kadanoff transformation. Choose $\eta = \eta_a$ to be a fully alternating configuration. For $t \ge t_2$ large enough, $H_t(\cdot,\eta_a)$ has two ground states, and by an application of Pirogov–Sinai theory (see [10] Appendix B), it follows that, for $\beta$ large enough, $|\mathcal{G}(H_t(\cdot,\eta_a))| \ge 2$. Therefore $\eta_a$ is a bad configuration for $\nu S(t)$, implying that $\nu S(t)$ is not Gibbs by Proposition 3.7(2).
4. In this case we rewrite the Hamiltonian in (5.14) as
\[ H_t(\sigma,\eta) = -\beta \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - \sum_x [h + h_t \eta(x)]\sigma(x). \tag{5.17} \]
For “intermediate” $t$ we have that $h$ and $h_t$ are of the same order. More precisely, choose $t$ such that $\frac{3}{2} h \le h_t \le \frac{5}{2} h$. As explained in [10], Sect. 4.3.6, we can find a bad configuration $\eta_{spec}$ such that the dynamical-field term $\sum_x h_t \eta(x)\sigma(x)$ in the Hamiltonian “compensates” the homogeneous-field term $\sum_x h\sigma(x)$ (i.e., $\sum_{x \in \Lambda} (h + h_t \eta_{spec}(x)) = o(|\Lambda|)$ for large blocks $\Lambda$). For this $\eta_{spec}$, $H_t(\cdot,\eta_{spec})$ has two ground states, which are predominantly plus and minus. Since the proof of existence of $\eta_{spec}$ requires analysis of the (non-symmetric) random field Ising model, we have to restrict to the case $d \ge 3$. Unlike the previous case, $\eta_{spec}$ is not constructed, but chosen from a (Bernoulli) measure one set. If in (5.17) we choose $\eta$ to be Bernoulli distributed with probability $p$ for a $+1$ (notation $\nu_p$), then we are exactly in the situation of the asymmetric random field Ising model. Zahradnik’s result (see [42]) shows that there exists a value of $p$ such that, for $\beta$ large enough and $\nu_p$-almost every $\eta$, $H_t(\cdot,\eta)$ exhibits a phase transition. If we choose now $\eta_{spec}$ to be an element of this set of $\nu_p$-measure one, then $|\mathcal{G}(H_t(\cdot,\eta_{spec}))| \ge 2$, implying that $\nu S(t)$ is not Gibbs by Proposition 3.7(2).
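The condition $\frac32 h \le h_t \le \frac52 h$ used in item 4 can be turned into an explicit time interval, because $h_t = -\frac12 \log\tanh(t/2)$ is strictly decreasing and inverts to $t = 2\,\mathrm{artanh}(e^{-2h_t})$. The sketch below computes that interval. It illustrates only this intermediate-time condition, not the thresholds $t_3, t_4$ of Theorem 5.2, and the function names are hypothetical.

```python
import math

def dynamical_field(t):
    """h_t = -(1/2) log tanh(t/2), decreasing from +infinity to 0 on t > 0."""
    return -0.5 * math.log(math.tanh(t / 2.0))

def time_for_field(value):
    """Inverse of dynamical_field: the t > 0 with h_t = value (value > 0)."""
    return 2.0 * math.atanh(math.exp(-2.0 * value))

def compensation_window(h):
    """Times t with (3/2) h <= h_t <= (5/2) h, returned as (t_low, t_high)."""
    return time_for_field(2.5 * h), time_for_field(1.5 * h)

t_low, t_high = compensation_window(0.1)
```

Because the field decreases in time, the larger bound $\frac52 h$ gives the earlier endpoint of the window.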
Remark. From the estimate (B89) in [10], Appendix B, we can conclude the following:
1. $t_0(\beta,h) \to 0$ as β → ∞, $t_0(\beta,h) \to \infty$ as h → ∞.
2. $t_1(h) \to 0$ as h → ∞, $t_1(h) \to \infty$ as h → 0.
3. $t_2(\beta) \to 0$ as β → ∞.
4. $t_3(\beta,h) \to 0$ as β → ∞, $t_4(\beta,h) \to \infty$ as β → ∞.
5.4. $0 < T_\nu \ll 1$, $T_\mu = \infty$, δ < 1. Let us now consider a biased dynamics. At first sight one might expect this case to be analogous to the case of an unbiased dynamics with an initial measure having h > 0. However, this intuition is false.

Theorem 5.3. The same results as in Theorem 5.2 hold, but with the $t_i$’s also depending on δ. For item 4 the additional restrictions d ≥ 3 and $|h + \tfrac14\log\delta|$ small enough are needed.

Proof. The last term in (5.4) being irrelevant, we can drop it and study the Hamiltonian

$$\hat H_t(\sigma,\eta) = -\beta\sum_{\langle x,y\rangle}\sigma(x)\sigma(y) - \sum_x\sigma(x)\,[(h + h_1(t)) + h_{12}(t)\eta(x)]. \tag{5.18}$$

This Hamiltonian is of the same form as (5.14), but with h becoming t-dependent. We have $\lim_{t\uparrow\infty}h_1(t) = 0$ and $\lim_{t\uparrow\infty}h_{12}(t) = 0$ with

$$\lim_{t\uparrow\infty}\frac{h_{12}(t)}{h_1(t)} = \frac{1+\delta}{1-\delta} > 1, \tag{5.19}$$

so that, in the regime where $\beta \gg \beta_c$, h = 0, $t \gg 1$, we find that the effect of $h_{12}(t)$ dominates. Hence we can find a special configuration that compensates the effect of the field $h_1(t)$ and for which the Hamiltonian (5.18) has two ground states, implying that νS(t) ∉ G. Similarly, when h > 0 we can find t intermediate such that $\sum_x(h + h_1(t))\sigma(x)$ is “compensated” by $h_{12}(t)\sum_x\sigma(x)\eta(x)$, as in the proof of item 4 of the previous theorem.

Remark. Note that if $T_\nu = 0$, $T_\mu = \infty$, then νS(t) is a product measure for all t > 0 and hence is Gibbs.

6. High-Temperature Dynamics

6.1. Set-up. In this section we generalize our results in Sect. 5 for the infinite-temperature dynamics to the case of a high-temperature dynamics. The key technical tool is a cluster expansion that allows us to obtain Gibbsianness of the joint distribution of $(\sigma_0,\sigma_t)$ with a Hamiltonian of the form (3.26). The main difficulty is to give meaning to the term $\log p_t(\sigma,\eta)$, i.e., to obtain Gibbsianness of the measure $\delta_\sigma S(t)$ for any σ. In the whole of this section we will assume that the rates c(x, σ) satisfy the conditions in Sect. 2.2 and, in addition,

$$c(x,\sigma) = 1 + \epsilon(x,\sigma) \tag{6.1}$$

with

$$\sup_{\sigma,x}|\epsilon(x,\sigma)| = \delta \ll 1, \qquad \epsilon(x,\sigma) = \epsilon(x,-\sigma). \tag{6.2}$$
The latter corresponds to a high-temperature unbiased dynamics, i.e., a small unbiased perturbation of the unbiased independent spin-flip process. For the initial measure we consider two cases: 1. A high- or infinite-temperature Gibbs measure ν. In that case we will find that νS(t) stays Gibbsian for all t > 0. 2. The plus-phase of the low-non-zero-temperature d-dimensional Ising model, νβ,h , corresponding to the Hamiltonian in (5.12). In that case we will find the same transitions as for the infinite-temperature dynamics.
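As a sanity check on conditions (6.1)–(6.2) alone, the sketch below tests a hypothetical perturbation of the rates on a finite ring, namely δ σ(x−1) σ(x+1). This choice is ours and is not taken from the paper; it is only meant to show what "small" and "unbiased" require, and it says nothing about the additional conditions of Sect. 2.2.

```python
import math
from itertools import product


def epsilon(x, sigma, delta):
    """Hypothetical perturbation delta * sigma(x-1) * sigma(x+1) on a ring."""
    n = len(sigma)
    return delta * sigma[(x - 1) % n] * sigma[(x + 1) % n]


def rate(x, sigma, delta):
    """Flip rate of the form (6.1): c(x, sigma) = 1 + epsilon(x, sigma)."""
    return 1.0 + epsilon(x, sigma, delta)


def satisfies_conditions(n, delta):
    """Check (6.2) by brute force: sup |epsilon| = delta and invariance under sigma -> -sigma."""
    sup_eps = 0.0
    for sigma in product((-1, 1), repeat=n):
        flipped = tuple(-s for s in sigma)
        for x in range(n):
            e = epsilon(x, sigma, delta)
            if not math.isclose(e, epsilon(x, flipped, delta)):
                return False
            sup_eps = max(sup_eps, abs(e))
    return math.isclose(sup_eps, delta)


if __name__ == "__main__":
    print(satisfies_conditions(n=6, delta=0.05))
```

A perturbation that is odd under the global flip, such as δ σ(x), would fail the second condition; this is the sense in which (6.2) excludes a bias.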
6.2. Representation of the joint Hamiltonian. In this section we formulate the main result of the space-time cluster expansion in [30] and [27]. We indicate the line of proof of this result, and refer the reader to [27] for the complete details.

Theorem 6.1. Let ν be a Gibbs measure with Hamiltonian $H_\nu$, and let the dynamics be governed by rates satisfying (6.1–6.2). Then the joint distribution of $(\sigma_0,\sigma_t)$, when $\sigma_0$ is distributed according to ν, is a Gibbs measure with Hamiltonian

$$H_t(\sigma,\eta) = H_\nu(\sigma) + H^t_{\mathrm{dyn}}(\sigma,\eta). \tag{6.3}$$

The Hamiltonian $H^t_{\mathrm{dyn}}(\sigma,\eta)$ corresponds to an interaction $U^t_{\mathrm{dyn}}(A,\sigma,\eta)$, A ∈ S, that has the following properties:

1. The interaction splits into two terms

$$U^t_{\mathrm{dyn}} = U^t_0 + U^t_\delta, \tag{6.4}$$

where $U^t_0$ is the single-site potential corresponding to the Kadanoff transformation:

$$U^t_0(\{x\},\sigma,\eta) = -\tfrac12\log[\tanh(t/2)]\,\sigma(x)\eta(x), \quad x\in\mathbb{Z}^d; \qquad U^t_0(A,\sigma,\eta) = 0 \quad\text{if } |A| \ne 1. \tag{6.5}$$

2. The term $U^t_\delta = U^t_\delta(A,\sigma,\eta)$ decays exponentially in the diameter of A, i.e., there exists α(δ) > 0 such that

$$\sup_{t\ge 0}\,\sup_x\sum_{A\ni x}e^{\alpha(\delta)\,\mathrm{diam}(A)}\sup_{\sigma,\eta}|U^t_\delta(A,\sigma,\eta)| < \infty, \tag{6.6}$$

and α(δ) ↑ ∞ as δ ↓ 0.

3. The potential $U^t_{\mathrm{dyn}}$ converges exponentially fast to the potential $U_\mu$ of the high-temperature reversible Gibbs measure:

$$\lim_{t\uparrow\infty}\,\sup_x\sum_{A\ni x}e^{\alpha(\delta)\,\mathrm{diam}(A)}\sup_{\sigma,\eta}|U^t_\delta(A,\sigma,\eta) - U_\mu(A,\eta)| = 0. \tag{6.7}$$

4. The term $U^t_\delta$ is a perturbation of the term $U^t_0$, i.e.,

$$\lim_{\delta\downarrow 0}\,\sup_{t\ge 0}\frac{\sup_x\sum_{A\ni x}\sup_{\sigma,\sigma',\eta}|U^t_\delta(A,\sigma,\eta) - U^t_\delta(A,\sigma',\eta)|}{\log[\tanh(t/2)]} = 0. \tag{6.8}$$
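The exponential-decay condition (6.6) asks that a diameter-weighted sum over all finite sets containing a given site stay finite. A minimal sketch of what this means, assuming a hypothetical one-dimensional pair interaction with $|U(\{0,r\})| = J e^{-\kappa|r|}$ (our toy choice, not the interaction of the theorem): the weighted sum is a geometric series that converges exactly when the weight rate α is smaller than the decay rate κ.

```python
import math


def weighted_norm(J, kappa, alpha, r_max=10_000):
    """Partial sum over pairs {0, r}, 1 <= |r| <= r_max, of e^{alpha|r|} * J * e^{-kappa|r|}."""
    return 2.0 * sum(J * math.exp((alpha - kappa) * r) for r in range(1, r_max + 1))


def closed_form(J, kappa, alpha):
    """Value of the full series; finite only for alpha < kappa."""
    if alpha >= kappa:
        raise ValueError("the weighted sum diverges for alpha >= kappa")
    return 2.0 * J / (math.exp(kappa - alpha) - 1.0)


if __name__ == "__main__":
    J, kappa = 0.3, 1.0
    for alpha in (0.0, 0.5, 0.9):
        print(alpha, weighted_norm(J, kappa, alpha), closed_form(J, kappa, alpha))
```

In this toy picture a faster decay of the interaction (larger κ) allows a larger admissible weight rate, which is the qualitative content of α(δ) ↑ ∞ as δ ↓ 0.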
Remarks.
1. Equation (6.5) corresponds to the infinite-temperature dynamics (i.e., c ≡ 1).
2. Equation (6.8) expresses that the potential as a function of the rates c is continuous at the point c ≡ 1, and that the Kadanoff term is dominant for δ ≪ 1.

Main steps in the proof of Theorem 6.1 in [27].

• Discretization. The semigroup S(t) can be approximated in a strong sense by discrete-time probabilistic cellular automata with transition operators of the form $P_n(\sigma'|\sigma) = \prod_x P_n(\sigma'(x)|\sigma)$, where

$$P_n(\sigma'(x)|\sigma) = \Big(1 - \frac1n c(x,\sigma)\Big)\delta_{\sigma'(x),\sigma(x)} + \frac1n c(x,\sigma)\,\delta_{\sigma'(x),-\sigma(x)}. \tag{6.9}$$

• Space-time cluster expansion for fixed discretization n. For n fixed the quantity

$$\Psi^x_n(\sigma,\eta) = \log\frac{d\delta_{\sigma^x}P_n^{\lfloor nt\rfloor}}{d\delta_{\sigma}P_n^{\lfloor nt\rfloor}} \tag{6.10}$$

is defined by the convergent cluster expansion

$$\Psi^x_n(\sigma,\eta) = \sum_{\Gamma\ni x,\ \Gamma\in C} w^{x,n}_{\sigma,\eta}(\Gamma), \tag{6.11}$$

where C is an appropriate set of clusters on $\mathbb{Z}^{d+1}$.

• Uniformity in the discretization n. The functions $\Psi^x_n$ converge uniformly as n ↑ ∞ to a continuous function $\Psi^x$ (which defines a continuous version of $d\mu^x/d\mu$). This is shown in two steps:

1. Uniform boundedness.

$$\sup_n\,\sup_x\,\sup_{\sigma,\eta}|\Psi^x_n(\sigma,\eta)| < \infty. \tag{6.12}$$

2. Uniform continuity.

$$\lim_{\Lambda\uparrow\mathbb{Z}^d}\,\sup_{\zeta,\xi}\,\sup_x\,\sup_n|\Psi^x_n(\sigma_\Lambda\zeta_{\Lambda^c},\eta_\Lambda\xi_{\Lambda^c}) - \Psi^x_n(\sigma,\eta)| = 0 \qquad \forall\,\sigma,\eta\in\Omega. \tag{6.13}$$

By (6.12) and (6.13) we obtain that $\Psi^x_n$ as a function of n contains a uniformly convergent subsequence. The limit $\Psi^x$ is independent of the subsequence, since it is a continuous version of $d\mu^x/d\mu$.

6.3. $1 \ll T_\nu \le \infty$, $1 \ll T_\mu < \infty$. Given the result of Theorem 6.1, the case of a high- or infinite-temperature initial measure is dealt with via Dobrushin’s uniqueness criterion (recall Theorem 5.1 in Sect. 5.2).

Theorem 6.2. Let ν be a high-temperature Gibbs measure, i.e., its interaction $U_\nu$ satisfies (2.27). Let the rates satisfy (6.1–6.2). Then, for δ small enough, νS(t) is a Gibbs measure for all t ≥ 0.
Proof. For fixed η, the Hamiltonian $H_t(\cdot,\eta)$ of (6.3) corresponds to an interaction $U^{\eta,\delta}_t$. By (6.6) and (6.8), this interaction satisfies

$$\lim_{\delta\downarrow 0}\,\sup_t\,\sup_x\sum_{A\ni x}(|A|-1)\sup_{\sigma,\sigma'}|U^{\eta,\delta}_t(A,\sigma) - U^{\eta,\delta}_t(A,\sigma')| = \sup_x\sum_{A\ni x}(|A|-1)\sup_{\sigma,\sigma'}|U_\nu(A,\sigma) - U_\nu(A,\sigma')| < 2. \tag{6.14}$$

Therefore, for δ small enough, (2.27) is satisfied for the interaction $U^{\eta,\delta}_t$ for all t ≥ 0 and all η. Hence $|G(H_t(\cdot,\eta))| = 1$, and we conclude from Proposition 3.7(1) that νS(t) ∈ G.

6.4. $0 < T_\nu \ll 1$, $1 \ll T_\mu < \infty$. We consider as the initial measure the plus-phase of the low-temperature Ising model $\nu_{\beta,h}$, introduced in Sect. 5.3. The joint distribution of $(\sigma_0,\sigma_t)$ has the Hamiltonian

$$H_t(\sigma,\eta) = -\beta\sum_{\langle x,y\rangle}\sigma(x)\sigma(y) - h\sum_x\sigma(x) - \tfrac12\log[\tanh(t/2)]\sum_x\sigma(x)\eta(x) + H^\delta_t(\sigma,\eta), \tag{6.15}$$

where $H^\delta_t$ corresponds to the interaction $U^t_\delta$ introduced in (6.4). The following is the analogue of Theorem 5.2:
Theorem 6.3. For $\beta \gg \beta_c$ and 0 < δ ≪ 1:
1. There exists $t_0 = t_0(\beta,h,\delta)$ such that $\nu_{\beta,h}S(t)$ is a Gibbs measure for all $0 \le t \le t_0$.
2. If h > 0, then there exists $t_1 = t_1(\beta,h,\delta)$ such that $\nu_{\beta,h}S(t)$ is a Gibbs measure for all $t \ge t_1$.
3. If h = 0, then there exists $t_2 = t_2(\beta,\delta)$ such that $\nu_{\beta,0}S(t)$ is not a Gibbs measure for all $t \ge t_2$.
4. For d ≥ 3, if 0 < h < h(β) and 0 < δ < δ(β, h), then there exist $t_3(\beta,h,\delta) < t_4(\beta,h,\delta)$ such that $\nu_{\beta,h}S(t)$ is not a Gibbs measure for all $t_3 \le t \le t_4$.

Proof.
1. This is a consequence of Theorem 4.1.
2. This is proved in exactly the same way as the corresponding item in Theorem 5.2.
3. Let $\eta_a$ be the fully alternating configuration. We cannot rely on monotonicity in this case, because of the presence of the dynamical part of the Hamiltonian $H_t(\cdot,\eta_a)$, which is not a single-site interaction. It is therefore not sufficient to show that, for the fully alternating configuration $\eta_a$, the Hamiltonian $H_t(\cdot,\eta_a)$ exhibits a phase transition. In order to prove that $\eta_a$ is a bad configuration, we have to show the following slightly stronger fact. There exists γ > 0 such that for all $\Lambda \subset \mathbb{Z}^d$, if $m^+(d\sigma)$ is a Gibbs measure corresponding to the interaction $H_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$, then

$$\int m^+(d\sigma)\,\sigma(0) > \gamma > 0. \tag{6.16}$$
Indeed, if we can show this, then for any $m^-$ that is a Gibbs measure with Hamiltonian $H_t(\cdot,\eta^a_\Lambda -_{\Lambda^c})$:

$$\int m^-(d\sigma)\,\sigma(0) < -\gamma, \tag{6.17}$$

and, as a consequence,

$$\big|\nu S(t)\big(\sigma(0)\,\big|\,\eta^a_\Lambda +_{\Lambda^c}\big) - \nu S(t)\big(\sigma(0)\,\big|\,\eta^a_\Lambda -_{\Lambda^c}\big)\big| \ge 2\gamma > 0. \tag{6.18}$$

This shows that $\eta_a$ is an essential point of discontinuity for the conditional probabilities of νS(t). The proof of (6.16) relies on Pirogov–Sinai theory for the Hamiltonian $H_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$. The first step is to prove that the all-plus-configuration is the unique ground state of this Hamiltonian. Since the Ising Hamiltonian satisfies the Peierls condition, we conclude from [10], Proposition B.24, that the ground states of $H_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$ are either the all-plus-configuration or the all-minus-configuration. If we drop the term $H^\delta_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$ (i.e., if δ = 0), then the remaining Hamiltonian has as the unique ground state the all-plus-configuration and satisfies the Peierls condition. Therefore, for δ small enough, we conclude from [10], Proposition B.24, that $H_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$ has the all-plus-configuration as the only possible ground state. From (6.15) it is easy to verify that the all-plus-configuration is actually a ground state for δ small enough. In order to conclude that, for β large enough, the unique phase of $H_t(\cdot,\eta^a_\Lambda +_{\Lambda^c})$ is a weak perturbation of the all-plus-configuration (uniformly in Λ), the only extra complication is given by the infinite-range character of the dynamic part of the Hamiltonian $H_t(\cdot,\eta)$. However, for this we can rely on the theory developed in [3], or [7], which allows exponentially decaying perturbations of a finite-range interaction satisfying the Peierls condition (see e.g. Eqs. (1.3), (2.2) of [3]). Similarly, for β large enough, $H_t(\cdot,\eta^a_\Lambda -_{\Lambda^c})$ has a unique phase which is a weak perturbation of the all-minus-configuration.
4. We can use the line of proof of item 4 of Theorem 5.2, introducing a special configuration $\eta_{\mathrm{spec}}$ such that, in the Hamiltonian $H_t(\cdot,\eta_{\mathrm{spec}})$, the effect of the uniform magnetic field is “compensated”. This requires analysis of the Ising model perturbed by a uniformly small random interaction and therefore we have to restrict to d ≥ 3.
More precisely, if we consider $H_t(\cdot,\eta)$, where η is distributed according to the Bernoulli measure $\nu_p$, then we are exactly in the framework of [42], except that our random perturbation is not of finite range but exponentially decaying. However, as Pirogov–Sinai analyses do not distinguish between finite-range and exponentially decaying interactions, arguments similar to those developed in [42] should still work in our case to conclude the existence of a value of p such that, for $\nu_p$-almost every η, $H_t(\cdot,\eta)$ exhibits a phase transition. However, we have not written out the details here.

Remark. A result related to Theorem 6.2 was obtained in [30]. Although the abstract of that paper is formulated in a somewhat ambiguous manner, its results apply only to initial measures that are product measures (in particular, Dirac measures). This includes the case $T_\nu = 0$ and $1 \ll T_\mu < \infty$. The results of [30] (or [27]) then imply that the measure is Gibbs for all t > 0. This seems surprising, because $t_2(\beta,\delta) \to 0$ as β → ∞. It is therefore better for the intuition to imagine a Dirac measure as a product measure, rather than to view it as a limit of low-temperature measures.
7. Discussion

7.1. Dynamical interpretation. In the case of renormalization-group pathologies, the interpretation of non-Gibbsianness is typically linked to the presence of a hidden phase transition in the original system conditioned on the image spins (the constrained system). In the context of the present paper, we view the phenomenon of transition from Gibbs to non-Gibbs as a change in the choice of most probable history of an improbable configuration at time t > 0. To that end we offer the following heuristic picture.

Let us consider the case of the low-temperature plus-phase of the Ising model in zero magnetic field ($\beta \gg \beta_c$, h = 0) with an unbiased (δ = 1) infinite-temperature dynamics. Consider the spin at the origin at time t conditioned on a neutral (say alternating) configuration in a sufficiently large annulus Λ around it. For small times the occurrence of such an improbable configuration indicates that with overwhelming probability a very similar configuration was present already at time 0. As the initial measure is an Ising Gibbs measure, the distribution at time 0 of the spin at the origin is determined by its local environment only and does not depend on what happens outside the annulus Λ. As all spins flip independently, no such dependence can appear within small times. However, after a sufficient amount of time (larger than the transition time $t_2$ in Theorem 6.3), if the same improbable configuration is observed, then it is much more likely to have been recently created (due to atypical fluctuations in the spin-flip processes) than to be the survivor of an initial state. Indeed, having been there at time 0 is improbable, but having survived for a large time is even more improbable.

Suppose now that outside the annulus Λ we observe an enormous annulus Γ in which the magnetization is more negative than $-m^*(t)/2$, where $m^*(t)$ is the value of the evolved magnetization (which starts from $m^*(0)$ and decays exponentially fast to zero). Because a large droplet of the minus-phase shrinks only at finite speed and typically carries a magnetization characteristic of the evolved minus-phase, with large probability there was an enormous droplet of the minus-phase (even a bit larger than Γ) at time 0, which the spin at the origin remembers. Indeed, the probability of this happening is governed by the size of the surface of Γ. In contrast, the probability of a large negatively magnetized droplet, arising through a large fluctuation in the spin-flip process starting from a typical plus-phase configuration, is governed by the volume of Γ. Therefore, this second scenario can safely be forgotten. Although for any size of the initial droplet of the minus-phase there is a time after which it has shrunk away, for each fixed time t we can choose an initial droplet size such that at time t it has shrunk no more than to size Γ. Since we want the shrinkage until time t to be negligible with respect to the linear size of Γ, we need to choose Γ larger when t is larger.

Thus, the transition reflects a cross-over between two improbable histories for seeing an improbable (alternating) annulus configuration. It can be viewed as a kind of large deviation phenomenon for a time-inhomogeneous system. One could alternatively describe it by saying that for small times a large alternating droplet must have occurred at time 0, while after the transition time $t_2$ a large alternating droplet must have been created by the random spin-flips: a “nature to nurture” transition [37]. The mathematical analysis of this interpretation would rely on finding the (constrained) minimum of an entropy function on the space of trajectories. Alternatively, one could try to study the large deviation rate function for the magnetization of the measure at time 0 conditioned on an alternating configuration at time t. This rate function should exhibit a unique minimum for $0 \le t < t_2$ and two minima for $t > t_2$.
7.2. Variational principle and large deviations. If νS(t) is a Gibbs measure, then the relative entropy densities h(µ|νS(t)) exist for any translation invariant probability measure µ. In a forthcoming publication, we will prove that this weaker property of existence of relative entropy density is true in a much more general context: it only depends on the positivity and locality of the rates c(x, σ) and it is true for all t ≥ 0. This means that the non-Gibbsianness of νS(t) is not related to “wrong large deviation properties”.

7.3. Reversibility. Throughout the whole paper, we have assumed the stationary measure µ to be reversible. However, this is a condition that only serves to make formulas nicer. It is not at all a necessary condition: if we consider any high-temperature spin-flip dynamics, then we know that the stationary measure µ is a high-temperature Gibbs measure. Equation (3.2) can be rewritten in the general situation: we have to replace S(t) in the right-hand side by $S^*(t)$, where $S^*(t)$ is the semigroup corresponding to the rates of the reversed process, i.e., the rates

$$c^*(x,\sigma) = c(x,\sigma^x)\,\frac{d\mu^x}{d\mu}. \tag{7.1}$$

In all the formulas of Sect. 2, we then have to replace $E_\sigma$ by $E^*_\sigma$, referring to expectation in the process with semigroup $S^*(t)$.

7.4. Open problems.

1. Trajectory of the interaction. In the regime $1 \ll T_\nu \le \infty$, $1 \ll T_\mu \le \infty$, what can we say about the trajectory $t \to U_t$? It is not hard to prove that it is analytic in $B_{\mathrm{ti}}$ and converges to $U_\mu$. In fact, since the interaction of the two-layer system is exponentially decaying, we expect the analyticity of the curve $t \to U^t_\nu$ to hold in a subspace of B with a stronger norm. But can we say something about the rate of convergence? Note that we can view the curve $\{U^t_\nu : t \ge 0\}$ as a continuous trajectory in the space B, interpolating between $U_\nu$ and $U_\mu$, which implies that G contains an arc-connected subset (i.e., we can pass from one high-temperature Gibbs measure ν to another µ along a weakly continuous curve not leaving G). Other topological characteristics of G are discussed in [10], Sect. 4.5.6.

2. Uniqueness of the transitions and estimates for the transition times. Even in the case $T_\mu = \infty$ we have not proved that the transition from Gibbs to non-Gibbs is unique, e.g. that $t_0(\beta,0) = t_2(\beta)$ in Theorem 5.2. However, we expect that when h = 0 the alternating configuration is “the worst configuration”, i.e., the transition is sharp and occurs at the first time at which the alternating configuration is bad. Another issue is to find good estimates for the $t_i$’s as a function of e.g. the temperatures, the magnetic fields and the ranges of the interaction in ν and µ.

3. Weak Gibbsianness. In the regimes where νS(t) is not a Gibbs measure we expect the measure to be “almost Gibbs” and “weakly Gibbs”. Almost Gibbs means that the measure of the set of bad configurations is zero: this property has recently been proved for several transformations of the Ising model, including the Kadanoff transformation (see [12]). Weakly Gibbs means that we can define a νS(t)-a.s.
summable interaction Ut such that the conditional probabilities of νS(t) can be written in Gibbsian form (see [9, 29]). The interaction Ut can e.g. be constructed along similar lines as are followed in the proof of Kozlov’s theorem (see [23, 28]) and its summability is to be controlled by the decay of “quenched correlations”, i.e., the decay of correlations in
the measure at time 0 conditioned on having a fixed configuration η at time t. These correlations are expected to decay exponentially for νS(t)-a.e. η, which would lead to νS(t)-a.s. summability of the Kozlov potential.

4. Low-temperature dynamics. The main problem of analyzing the regime $0 < T_\mu \ll 1$ for large t is the impossibility of a perturbative representation of $-\log p_t(\sigma,\eta)$. If we still continue to work with the picture of the joint Hamiltonian in (3.26), then the term $-\log p_t(\sigma,\eta)$ will not converge to a σ-independent Hamiltonian as t ↑ ∞. Therefore we cannot argue that for large t the Gibbsianness of the measure νS(t) depends only on the presence or absence of a phase transition in the Hamiltonian $H_\nu$ of the initial measure ν. The dynamical part of the joint Hamiltonian can induce a phase transition. The regime $0 < T_\mu \ll 1$ is very delicate and there is no reason to expect a robust result for general models. Metastability phenomena will enter.

5. Zero-temperature dynamics. What happens when $T_\mu = 0$? In this case there is only nature, no nurture. We therefore expect the behavior to be different from $0 < T_\mu \ll 1$. Trapping phenomena will enter.

6. Other dynamics. Do similar phenomena occur under spin-exchange dynamics, like Kawasaki dynamics? In particular, how do conservation laws influence the picture (see [18, 19, 1])?

Acknowledgements. We thank C. Maes and K. Netocny for fruitful discussions. A.C.D. v.E. thanks H. van Beijeren for pointing out reference [1] to him. Part of this collaboration was made possible by the Dutch “Samenwerkingsverband Mathematische Fysica”. R. F. thanks the Department of Theoretical Physics at Groningen for kind hospitality.
References

1. Aspelmeier, T., Schmittman, B., Zia, R.K.P.: Microscopic kinetics and time-dependent structure factors. Preprint, http://xxx.lanl.gov, cond-mat/0101189, 2001
2. Bertini, L., Cirillo, E.N.M., Olivieri, E.: Renormalization-group transformations under strong mixing conditions: Gibbsianness and convergence of renormalized interactions. J. Stat. Phys. 97, 831–915 (1999)
3. Borgs, C., Kotecky, R., Ueltschi, D.: Low-temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409–446 (1996)
4. Bricmont, J., Kupiainen, A.: Phase transitions in the 3-dimensional random field Ising model. Commun. Math. Phys. 116, 539–572 (1988)
5. Bruce, A.D., Pryce, J.M.: Statistical mechanics of image restoration. J. Phys. A. 28, 511–532 (1995)
6. Camia, F., De Santis, E., Newman, C.M.: Clusters and recurrence in the two-dimensional zero-temperature stochastic Ising model. Preprint, http://xxx.lanl.gov, PR/0103050, 2001
7. Datta, N., Fernández, R., Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems I. Stability for quantum perturbations of classical systems with finitely many ground states. J. Stat. Phys. 84, 455–534 (1996)
8. Dobrushin, R.L., Shlosman, S.B.: Completely analytical interactions: Constructive description. J. Stat. Phys. 46, 983–1014 (1987)
9. Dobrushin, R.L., Shlosman, S.B.: Non-Gibbsian states and their Gibbsian description. Commun. Math. Phys. 200, 125–179 (1999)
10. van Enter, A.C.D., Fernández, R., Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: Scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
11. Fernández, R., Pfister, C.E.: Global specifications and non-quasilocality of projections of Gibbs measures. Ann. Probab. 25, 1284–1315 (1997)
12. Fernández, R., Le Ny, A., Redig, F.: Variational principle and almost sure quasilocality for some renormalized measures. Preprint, http://www.xxx.lanl.gov, PR/010708, 2001
13. Fontes, L.R., Isopi, M., Newman, C.M.: Chaotic time dependence in a disordered spin system. Probab. Theory and Relat. Fields 115, 417–443 (1999)
14. Gandolfi, A., Newman, C.M., Stein, D.L.: Zero-temperature dynamics of ±J spin glasses and related models. Commun. Math. Phys. 214, 373–387 (2000)
15. Georgii, H.-O.: Gibbs Measures and Phase Transitions. Berlin: Walter de Gruyter & Co., 1988
16. Griffiths, R.B., Pearce, P.A.: Mathematical properties of position-space renormalization-group transformations. J. Stat. Phys. 20, 499–545 (1979)
17. Haller, K., Kennedy, T.: Absence of renormalization group pathologies: Two examples. J. Stat. Phys. 85, 607–638 (1996)
18. den Hollander, F., Olivieri, E., Scoppola, E.: Metastability and nucleation for conservative dynamics. J. Math. Phys. 41, 1424–1498 (2000)
19. den Hollander, F., Olivieri, E., Scoppola, E.: Nucleation in fluids: Some rigorous results. Physica A279, 110–122 (2000)
20. Israel, R.B.: Banach algebras and Kadanoff transformations. In: Random Fields, Esztergom, 1979, eds. J. Fritz, J.L. Lebowitz and D. Szász, Vol. II, Amsterdam: North-Holland, 1981, pp. 593–608
21. Künsch, H.: Non-reversible stationary measures for infinite interacting particle systems. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 66, 407–421 (1984)
22. Künsch, H.: Time reversal and stationary Gibbs measures. Stoch. Proc. Appl. 17, 159–166 (1984)
23. Kozlov, O.K.: Gibbs description of a system of random variables. Probl. Info. Trans. 10, 258–265 (1974)
24. Lebowitz, J.L., Schonmann, R.H.: Pseudo-free energies and large deviations for non-Gibbsian FKG-measures. Probab. Theory and Relat. Fields 77, 49–64 (1988)
25. Liggett, T.M.: Interacting Particle Systems. New York: Springer-Verlag, 1985
26. Liptser, R.S., Shiryayev, A.N.: Statistics of Random Processes, part II. New York: Springer-Verlag, 1977
27. Maes, C., Netocny, K.: Space-time expansions for weakly interacting particle systems. Preprint, http://www.tfdec1.fys.kuleuven.ac.be/ christ, 2001
28. Maes, C., Redig, F., Shlosman, S., Van Moffaert, A.: Percolation, path large deviations and weakly Gibbs states. Commun. Math. Phys. 209, 517–545 (2000)
29. Maes, C., Redig, F., Van Moffaert, A.: The restriction of the Ising model to a layer. J. Stat. Phys. 94, 893–912 (1999)
30. Maes, C., Vande Velde, K.: The interaction potential of the stationary measure of a high-noise spin-flip process. J. Math. Phys. 34, 3030–3038 (1993)
31. Maes, C., Vande Velde, K.: Relative energies for non-Gibbsian states. Commun. Math. Phys. 189, 277–286 (1997)
32. Malyshev, V.A., Minlos, R.A.: Gibbs Random Fields. Cluster expansions. Dordrecht: Kluwer, 1991
33. Martinelli, F., Olivieri, E.: Approach to equilibrium of Glauber dynamics in the one phase region. I. The attractive case. Commun. Math. Phys. 161, 447–486 (1994)
34. Martinelli, F., Olivieri, E.: Approach to equilibrium of Glauber dynamics in the one phase region. II. The general case. Commun. Math. Phys. 161, 487–514 (1994)
35. Nanda, S., Newman, C.M., Stein, D.L.: Dynamics of Ising spin systems at zero temperature. In: On Dobrushin’s way. From Probability Theory to Statistical Physics, Providence, RI: Amer. Math. Soc., 2000, pp. 183–194
36. Newman, C.M., Stein, D.L.: Blocking and persistence in the zero-temperature dynamics of ordered and disordered Ising models. Phys. Rev. Lett. 82, 3944–3947 (1999)
37. Newman, C.M., Stein, D.L.: Metastable states in spin glasses and disordered ferromagnets. Phys. Rev. E 60, 5244–5260 (1999)
38. Newman, C.M., Stein, D.L.: Zero-temperature dynamics of Ising spin systems following a deep quench: Results and open problems. Physica A 279, 159–168 (2000)
39. Schonmann, R.: Projections of Gibbs measures may be non-Gibbsian. Commun. Math. Phys. 124, 1–7 (1989)
40. Schonmann, R.H., Shlosman, S.B.: Wulff droplets and the metastable relaxation of kinetic Ising models. Commun. Math. Phys. 194, 389–462 (1998)
41. Sullivan, W.G.: Potentials for almost Markovian random fields. Commun. Math. Phys. 33, 61–74 (1973)
42. Zahradník, M.: On the structure of low-temperature phases in three-dimensional spin models with random impurities: A general Pirogov-Sinai approach. In: Phase Transitions: Mathematics, Physics, Biology, ed. R. Kotecký, Singapore: World Scientific, 1992, pp. 225–237

Communicated by H. Spohn
Commun. Math. Phys. 226, 131 – 162 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Non-Equilibrium Steady States of Finite Quantum Systems Coupled to Thermal Reservoirs

V. Jakšić1, C.-A. Pillet2,3,4

1 Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC, H3A 2K6, Canada
2 Université de Toulon, B.P. 132, 83957 La Garde Cedex, France
3 CPT-CNRS Luminy, Case 907, 13288 Marseille Cedex 9, France
4 FRUMAM

Received: 12 July 2001 / Accepted: 11 October 2001

Dedicated to Jean Michel Combes on the occasion of his sixtieth birthday

Abstract: We study the non-equilibrium statistical mechanics of a 2-level quantum system, S, coupled to two independent free Fermi reservoirs $R_1$, $R_2$, which are in thermal equilibrium at inverse temperatures $\beta_1 \ne \beta_2$. We prove that, at small coupling, the combined quantum system $S + R_1 + R_2$ has a unique non-equilibrium steady state (NESS) and that the approach to this NESS is exponentially fast. We show that the entropy production of the coupled system is strictly positive and relate this entropy production to the heat fluxes through the system. A part of our argument is general and deals with spectral theory of NESS. In the abstract setting of algebraic quantum statistical mechanics we introduce the new concept of the C-Liouvillean, L, and relate the NESS to zero resonance eigenfunctions of $L^*$. In the specific model $S + R_1 + R_2$ we study the resonances of $L^*$ using the complex deformation technique developed previously by the authors in [JP1].

1. Introduction

1.1. The framework. This paper deals with some thermodynamical aspects of a class of models in non-equilibrium quantum statistical mechanics which are commonly used to describe the interaction of a small quantum system S with finitely many heat reservoirs $R_i$. We will study the simplest non-trivial model, namely in our work S is a 2-level atom (spin 1/2) and each reservoir $R_i$ is a free Fermi gas in thermal equilibrium at inverse temperature $\beta_i$. Various generalizations of our results will be discussed in Sect. 1.3 and in the forthcoming paper [JP4]. We will work in the framework of algebraic quantum statistical mechanics [BR1, BR2, Ha]. For the reader's convenience and notational purposes, in this section we review some basic notions of this framework. In the algebraic formalism a physical system is described either by a $C^*$- or a $W^*$-dynamical system.
The advantage of Fermi reservoirs is that we can deal with $C^*$-systems, which are conceptually simpler. A $C^*$-dynamical system is a pair (O, τ), where O is a $C^*$-algebra with identity and τ is a strongly continuous group of automorphisms of O (that is, the map $\mathbb{R} \ni t \mapsto \tau^t(A)$ is norm continuous for each A ∈ O). The elements of O describe observables of the physical system under consideration and the group τ specifies their time evolution. A physical state is described by a mathematical state on O, that is, a positive linear functional ω such that ω(1) = 1. The set E(O) of all states is a convex, weak-∗ compact subset of the dual $O^*$. A state ω is called faithful if $\omega(A^*A) = 0 \Rightarrow A = 0$ and τ-invariant if $\omega\circ\tau^t = \omega$ for all t.

The thermal equilibrium states of (O, τ) are characterized by the KMS condition. Let β ≠ 0 be the inverse temperature (although the physically relevant case is β > 0, it is mathematically convenient to define KMS-states for all non-zero β). The state ω is (τ, β)-KMS if for any pair A, B ∈ O there exists a complex function $F_{A,B}$, analytic inside the strip $\{z \mid 0 < \mathrm{sign}(\beta)\,\mathrm{Im}\,z < |\beta|\}$, bounded and continuous on its closure, and satisfying the KMS boundary conditions

$$F_{A,B}(t) = \omega(A\tau^t(B)), \qquad F_{A,B}(t + i\beta) = \omega(\tau^t(B)A).$$

A (τ, β)-KMS state is faithful and τ-invariant.

Let (O, τ) be a $C^*$-dynamical system and let δ be the generator of τ ($\tau^t = e^{t\delta}$). The operator δ is a ∗-derivation: its domain D(δ) is a ∗-subalgebra of O and for A, B ∈ D(δ),

$$\delta(A)^* = \delta(A^*), \qquad \delta(AB) = \delta(A)B + A\delta(B).$$
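In finite dimensions both the KMS boundary condition and the derivation property can be checked directly. The sketch below is a toy illustration only: it assumes a 2-level system with a diagonal Hamiltonian H (the energy levels are arbitrary), the dynamics $\tau^t(B) = e^{itH}Be^{-itH}$ with generator δ(A) = i[H, A], and the Gibbs state $\omega(X) = \mathrm{Tr}(e^{-\beta H}X)/\mathrm{Tr}(e^{-\beta H})$. The helper names are ours.

```python
import cmath
import math

E = (0.0, 1.3)  # hypothetical energy levels; H = diag(E)


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[j][l] * B[l][k] for l in range(2)) for k in range(2)] for j in range(2)]


def tau(z, B):
    """tau^z(B) = e^{izH} B e^{-izH}; z may be complex since H is a finite matrix."""
    return [[cmath.exp(1j * z * (E[j] - E[k])) * B[j][k] for k in range(2)] for j in range(2)]


def gibbs(beta, X):
    """omega(X) = Tr(e^{-beta H} X) / Tr(e^{-beta H})."""
    w = [math.exp(-beta * e) for e in E]
    return sum(w[j] * X[j][j] for j in range(2)) / sum(w)


def kms_defect(beta, t, A, B):
    """|F(t + i beta) - omega(tau^t(B) A)| with F(z) = omega(A tau^z(B))."""
    lhs = gibbs(beta, matmul(A, tau(t + 1j * beta, B)))
    rhs = gibbs(beta, matmul(tau(t, B), A))
    return abs(lhs - rhs)


def delta(A):
    """Generator of tau: delta(A) = i[H, A]."""
    return [[1j * (E[j] - E[k]) * A[j][k] for k in range(2)] for j in range(2)]


def leibniz_defect(A, B):
    """Largest entry of delta(AB) - delta(A)B - A delta(B)."""
    lhs = delta(matmul(A, B))
    first = matmul(delta(A), B)
    second = matmul(A, delta(B))
    return max(abs(lhs[j][k] - first[j][k] - second[j][k]) for j in range(2) for k in range(2))


if __name__ == "__main__":
    A = [[1.0, 2.0 - 1.0j], [0.5j, -1.0]]
    B = [[0.3, 1.0j], [2.0, 0.7 + 0.2j]]
    print(kms_defect(0.7, 0.4, A, B), leibniz_defect(A, B))
```

Both defects vanish up to rounding, for negative β as well, which matches the remark that KMS states are defined for all non-zero β.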
Let $V = V^* \in O$ be a perturbation (such perturbations are called local). The generator of the perturbed dynamics is $\delta_V(A) = \delta(A) + i[V, A]$. The operator $\delta_V$ is also a ∗-derivation and $D(\delta_V) = D(\delta)$. The perturbed dynamics is described by

$$\tau^t_V(A) := e^{t\delta_V}(A) = \tau^t(A) + \sum_{n\ge 1} i^n\int_0^t dt_1\int_0^{t_1}dt_2\cdots\int_0^{t_{n-1}}dt_n\,[\tau^{t_n}(V),[\cdots[\tau^{t_1}(V),\tau^t(A)]]].$$

Until the end of this section we fix a $C^*$-dynamical system (O, τ), a state ω, and a local perturbation V. The non-equilibrium steady states (NESS) of the locally perturbed system $(O, \tau_V)$ associated to the initial state ω are the weak-∗ limit points of the set of states

$$\frac1T\int_0^T\omega\circ\tau^t_V\,dt, \tag{1.1}$$

for T > 0. In other words, $\omega^+_V$ is a NESS if there is a sequence $T_n \to \infty$ such that for all A ∈ O,

$$\lim_{n\to\infty}\frac{1}{T_n}\int_0^{T_n}\omega\circ\tau^t_V(A)\,dt = \omega^+_V(A).$$

The set $\Sigma^+_V(\omega)$ of NESS associated to ω is a non-empty weak-∗ compact subset of E(O) whose elements are $\tau_V$-invariant.

One of the key concepts of non-equilibrium thermodynamics is the notion of entropy production. Within the framework of algebraic quantum statistical mechanics this notion has been precisely defined in the recent works [Ru2, JP3], see also [Sp1, O1, O2, OHI]. We recall the definitions and the results we will need.
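The time average (1.1) can be illustrated in a toy finite-dimensional setting. The sketch assumes a diagonal Hamiltonian with non-degenerate levels standing in for the perturbed dynamics and a density matrix ρ representing the initial state; it is not a model of the reservoirs studied here, since every state of a finite-dimensional system is normal and carries no entropy production. It only shows the averaging mechanism: oscillating terms are damped like 1/T, leaving the part of ρ that is diagonal in the energy basis.

```python
import cmath


def cesaro_mean(rho, A, energies, T):
    """(1/T) * integral_0^T Tr(rho tau^t(A)) dt, with tau^t(A) = e^{itH} A e^{-itH}, H diagonal."""
    n = len(energies)
    total = 0.0 + 0.0j
    for j in range(n):
        for k in range(n):
            gap = energies[j] - energies[k]
            if gap == 0.0:
                weight = 1.0  # time-independent term survives the average
            else:
                # exact value of (1/T) * integral_0^T e^{i*gap*t} dt
                weight = (cmath.exp(1j * gap * T) - 1.0) / (1j * gap * T)
            total += rho[k][j] * weight * A[j][k]
    return total


def limiting_value(rho, A):
    """Limit of the average for non-degenerate energies: sum_j rho_jj A_jj."""
    return sum(rho[j][j] * A[j][j] for j in range(len(rho)))


if __name__ == "__main__":
    rho = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical pure state
    A = [[1.0, 2.0], [2.0, -1.0]]    # hypothetical observable
    for T in (1.0, 1e2, 1e4, 1e6):
        print(T, cesaro_mean(rho, A, (0.0, 1.0), T))
    print("limit:", limiting_value(rho, A))
```

Each off-diagonal weight is bounded by 2/(|gap| T), so the convergence rate in this toy case is explicit.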
For positive linear functionals $\eta, \xi \in \mathcal{O}^*$, let $\mathrm{Ent}(\eta \mid \xi)$ be the relative entropy of Araki (we use the ordering and the sign convention of Bratteli–Robinson [BR2, Don]). For the definition and properties of Araki's relative entropy we refer the reader to [Ar1, Ar2, BR2, Don, OP]. We make the following assumption:
(E1) There exists a $C^*$-dynamics $\sigma_\omega$ such that $\omega$ is a $(\sigma_\omega, -1)$-KMS state.
The choice of reference temperature $\beta = -1$ is made for mathematical convenience. If (E1) holds, then for any $\beta \neq 0$ there is a $C^*$-dynamics $\sigma_{\omega,\beta}$ such that $\omega$ is a $(\sigma_{\omega,\beta}, \beta)$-KMS state (set $\sigma_{\omega,\beta}^t = \sigma_\omega^{-t/\beta}$). Let $\delta_\omega$ be the generator of $\sigma_\omega$. Our second assumption concerns the local perturbation $V$.
(E2) $V \in D(\delta_\omega)$.
Until the end of this section we assume that (E1) and (E2) hold. We set $\sigma_V := \delta_\omega(V)$ and call $\mathrm{Ep}(\eta) = \eta(\sigma_V)$ the entropy production (w.r.t. the reference state $\omega$) of the perturbed system $(\mathcal{O}, \tau_V)$ in the state $\eta \in E(\mathcal{O})$. The following identity was proven in [JP3]:
$$\mathrm{Ent}(\omega \circ \tau_V^t \mid \omega) = -\int_0^t \omega(\tau_V^s(\sigma_V))\, ds. \tag{1.2}$$
This identity motivates the definition of entropy production and is the starting point for the study of this notion [JP3, JP4]. In particular, since the relative entropy is non-positive, Relation (1.2) yields that for any $\omega_V^+ \in \Sigma_V^+(\omega)$, $\mathrm{Ep}(\omega_V^+) \geq 0$. The NESS $\omega_V^+$ is thermodynamically non-trivial if $\mathrm{Ep}(\omega_V^+) > 0$. One of the central problems of the mathematical theory of non-equilibrium quantum statistical mechanics is to show that the NESS of concrete physically relevant models are thermodynamically non-trivial. We describe below one simple criterion which ensures strict positivity of entropy production and which will be used in this paper.
Let $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ be the GNS-representation of the algebra $\mathcal{O}$ associated to $\omega$. The states in $\mathcal{O}^*$ which are represented by density matrices on $\mathcal{H}_\omega$ are called $\omega$-normal. The set of all $\omega$-normal states is a norm closed subset of $E(\mathcal{O})$ which we denote by $\mathcal{N}_\omega$. One can show that the entropy production of $\omega$-normal NESS is zero, see [JP4].
Theorem 1.1. Assume that the NESS $\omega_V^+$ satisfies the following:
(a) $\omega_V^+ \notin \mathcal{N}_\omega$.
(b) $\sup_{T > 0} \left| \int_0^T \left(\omega(\tau_V^t(\sigma_V)) - \omega_V^+(\sigma_V)\right) dt \right| < \infty$.
Then $\mathrm{Ep}(\omega_V^+) > 0$.
We will prove this theorem in Sect. 5. One of the main results of this paper is that the class of systems we study has strictly positive entropy production. For additional information about NESS and entropy production we refer the reader to [JP4].
1.2. The model and the results. We now describe the specific model we will study in this paper. The $C^*$-algebra of observables of the system $S$ is $\mathcal{O}_s \equiv M(\mathbb{C}^2)$, the matrix algebra on $\mathcal{H}_s \equiv \mathbb{C}^2$. Let $\sigma_x, \sigma_y, \sigma_z$ be the usual Pauli matrices. The dynamics is specified by the automorphisms
$$\tau_s^t(A) = e^{itH_s} A e^{-itH_s}, \tag{1.3}$$
where $H_s \equiv \sigma_z$ is the Hamiltonian of the system $S$.
Let $\mathfrak{h}$ be the Hilbert space of a single fermion and $h$ its energy operator. Let $\mathcal{H}_f \equiv \Gamma_-(\mathfrak{h})$ be the Fermi Fock space and $a(f)$, $a^*(f)$ the corresponding annihilation and creation operators on $\mathcal{H}_f$. In the sequel $a^\#$ stands either for $a$ or $a^*$. It follows from CAR (canonical anti-commutation relations) that $\|a^\#(f)\| = \|f\|$. The algebra of observables of the free Fermi gas, $\mathcal{O}_f$, is the $C^*$-algebra of operators generated by $\{a^\#(f) \mid f \in \mathfrak{h}\}$ and the identity $1$. The field operators are defined by
$$\varphi(f) \equiv \frac{1}{\sqrt{2}}\left(a(f) + a^*(f)\right).$$
The Hamiltonian and the dynamics are specified by $H_f = d\Gamma(h)$ and
$$\tau_f^t(a^\#(f)) = e^{itH_f} a^\#(f) e^{-itH_f} = a^\#(e^{ith} f).$$
The pair $(\mathcal{O}_f, \tau_f)$ is a $C^*$-dynamical system describing a free Fermi gas. For each $\beta > 0$ there exists a unique $(\tau_f, \beta)$-KMS state $\omega_{f,\beta}$ on $\mathcal{O}_f$. $\omega_{f,\beta}$ is a quasi-free, gauge-invariant state uniquely determined by the two point function
$$\omega_{f,\beta}(a^*(f)a(f)) = (f, (e^{\beta h} + 1)^{-1} f).$$
Notation. In the sequel, whenever the meaning is clear within the context, we denote by $A$ the operators $A \otimes 1$, $1 \otimes A$.
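The two point function above is the Fermi-Dirac distribution in disguise. A minimal sketch, assuming a hypothetical discretization in which $h$ is diagonal with finitely many single-particle energies and $f$ is a finite vector of mode amplitudes (neither the discretization nor the function names come from the paper):

```python
import math


def fermi_dirac(beta, energy):
    """Mode occupation (e^{beta * energy} + 1)^{-1}."""
    return 1.0 / (math.exp(beta * energy) + 1.0)


def two_point(beta, energies, f):
    """(f, (e^{beta h} + 1)^{-1} f) for h = diag(energies) and f a complex vector."""
    return sum(abs(c) ** 2 * fermi_dirac(beta, e) for c, e in zip(f, energies))


energies = [0.0, 0.5, 1.0, 2.0]       # illustrative single-particle energies
f = [1.0, 0.5j, -0.25, 0.1 + 0.1j]    # illustrative mode amplitudes
norm_sq = sum(abs(c) ** 2 for c in f)

for beta in (0.5, 1.0, 5.0):
    value = two_point(beta, energies, f)
    # The value always lies between 0 and ||f||^2, as it must since ||a(f)|| = ||f||.
    print(beta, value, 0.0 <= value <= norm_sq)
```

Lowering the temperature (raising `beta`) empties the modes with positive energy while the zero-energy mode stays half filled.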
We consider now two identical reservoirs $(\mathcal{O}_f^{(i)}, \tau_f^{(i)})$, $i = 1, 2$. The $C^*$-algebra of observables of the combined system $S + R_1 + R_2$ is
$$\mathcal{O} \equiv \mathcal{O}_s \otimes \mathcal{O}_f^{(1)} \otimes \mathcal{O}_f^{(2)}, \tag{1.4}$$
the tensor product algebra of operators on $\mathcal{H} \equiv \mathcal{H}_s \otimes \mathcal{H}_f \otimes \mathcal{H}_f$. The free dynamics is given by the group of automorphisms $\tau = \tau_s \otimes \tau_f^{(1)} \otimes \tau_f^{(2)}$. The pair $(\mathcal{O}, \tau)$ is a $C^*$-dynamical system describing the combined system in absence of interaction. Note that $\tau^t(A) = e^{itH} A e^{-itH}$, where
$$H = H_s + H_f^{(1)} + H_f^{(2)}.$$
We now describe the interaction of S with the reservoirs. Choose form-factors αi ∈ h, i = 1, 2, and set V1 = σx ⊗ ϕ(α1 ) ⊗ 1, V2 = σx ⊗ 1 ⊗ ϕ(α2 ), V = V1 + V2 .
(1.5)
Obviously, $V = V^* \in \mathcal{O}$. The Hamiltonian and the dynamics of the interacting system are specified by
$$H_\lambda = H + \lambda V, \qquad \tau_\lambda^t(A) = e^{itH_\lambda} A e^{-itH_\lambda},$$
where $\lambda$ is a real coupling constant. The pair $(\mathcal{O}, \tau_\lambda)$ is a $C^*$-dynamical system.
In what follows we fix the inverse temperatures $\beta_i > 0$ of the reservoirs. Let $\omega_s$ be a state on $\mathcal{O}_s$ and $\omega_{f,\beta_i}^{(i)}$ be the $(\tau_f^{(i)}, \beta_i)$-KMS state on $\mathcal{O}_f^{(i)}$ describing the thermal equilibrium state of the $i$-th reservoir. Consider first the initial states of the form
$$\omega = \omega_s \otimes \omega_{\beta_1} \otimes \omega_{\beta_2}, \qquad \omega_s \in E(\mathcal{O}_s). \tag{1.6}$$
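We denote the set of all such states by $\mathcal{N}_s$. For $\omega \in \mathcal{N}_s$, let $\mathcal{N} \subset E(\mathcal{O})$ be the set of all $\omega$-normal states ($\mathcal{N}$ does not depend on the choice of $\omega \in \mathcal{N}_s$). Our goal is to study NESS of $(\mathcal{O}, \tau_\lambda)$ associated to initial states in $\mathcal{N}$.
For technical reasons related to the use of the complex deformation technique of [JP1], we impose some regularity assumptions on the reservoirs and form factors. Our first assumption is:
(A1) $\mathfrak{h} \equiv L^2(\mathbb{R}_+; G)$ for some auxiliary Hilbert space $G$, and $h$ is the operator of multiplication by $s \in \mathbb{R}_+$.
Let $I(\delta) \equiv \{z \in \mathbb{C} : |\mathrm{Im}\, z| < \delta\}$. We denote by $H^2(\delta)$ the Hardy class of all analytic functions $f : I(\delta) \to G$ such that
$$\|f\|_{H^2(\delta)} \equiv \sup_{|\theta| < \delta} \int_{\mathbb{R}} \|f(s + i\theta)\|_G^2\, ds < \infty.$$
We fix a complex conjugation $f \mapsto \bar{f}$ on $\mathfrak{h}$ which commutes with $h$. To any $f \in \mathfrak{h}$ we associate a function $\tilde{f} : \mathbb{R} \to G$ by
$$\tilde{f}(s) = \begin{cases} f(s) & \text{if } s \geq 0, \\ \bar{f}(|s|) & \text{if } s < 0. \end{cases} \tag{1.7}$$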
Our second regularity assumption is: (A2) For some δ > 0, e−βi s/2 α˜ i ∈ H 2 (δ) for i = 1, 2. Our third assumption ensures that the small system S is effectively coupled to the reservoirs. (A3) αi (2)G > 0 for i = 1, 2. To illustrate the above assumptions with a concrete example, assume that h = L2 (Rd , dk) and that h is operator of multiplication by k 2 /2m. Passing to polar coordinates and changing the variable one sees that (A1) holds with G = L2 (S d−1 , dσ ), where d−2 4 S d−1 is the unit sphere in Rd and dσ is the surface measure. If αi (k) = |k| 2 e−|k| , then (A2) and (A3) hold (in this example (A2) holds for all δ and βi ). Our first result is:
Theorem 1.2. Assume that (A1)–(A3) hold. Then, for some 1 > 0 and 0 < |λ| < 1, there is a state ωλ+ on O so that the following hold: (i) For all η ∈ N and A ∈ O, lim η(τλt (A)) = ωλ+ (A).
t→∞
(1.8)
(ii) The limit (1.8) is exponentially fast in the following sense: There exist γ (λ) > 0, a norm dense set of states N0 ⊂ N , and a norm-dense ∗-subalgebra O0 ⊂ O such that for η ∈ N0 , A ∈ O0 , and t > 0, |η(τ t (A)) − ωλ+ (A)| ≤ CA,η,λ e−γ (λ)t .
(1.9)
Moreover, Ns ⊂ N0 and Os ⊂ O0 . (iii) For A ∈ O0 , the functions λ → ωλ+ (A) are analytic for |λ| < 1. Remark 1. Our proof gives that 1 = O(min 1/βi ), and thus the above theorem is a hightemperature result. It is an interesting question whether the techniques of [BFS] or [DJ1, DJ2] can be adapted to prove the above theorem for 1 independent of the temperatures βi . Remark 2. If β1 = β2 , then ωλ+ is not a (τλ , β)-KMS state for any β. Remark 3. In the thermal equilibrium case β1 = β2 = β Theorem 1.2 was proven in [JP1, JP2] (ωλ+ is then the unique (τλ , β)-KMS state of (O, τλ )). The method of this paper is suited to non-equilibrium situations and, when restricted to the thermal equilibrium case, differs from the method of [JP1, JP2]. In particular, here we require a stronger regularity condition than [JP1, JP2] (there it suffices that αi ∈ H 2 (δ)) but we also obtain a slightly stronger result (the method of [JP1, JP2] fails to show that Ns ⊂ N0 and Os ⊂ O0 ). Remark 4. Our proof gives that γ (λ) = γ0 λ2 + O(λ4 ), where γ0 =
$$\frac{\pi}{2}\left(\|\alpha_1(2)\|_G^2 + \|\alpha_2(2)\|_G^2\right).$$
Remark 5. Regarding (iii), it follows from our arguments that there exist linear functionals ωk+ : O0 → C, k ≥ 0, such that for A ∈ O0 , ωλ+ (A) =
$$\sum_{k=0}^{\infty} \lambda^k \omega_k^+(A). \tag{1.10}$$
The first term ω0 is computed from a linear eigenvalue problem on Hs . This eigenvalue problem is determined by the second order correction (Fermi’s Golden Rule) for the resonances of a suitable non-self-adjoint operator (C-Liouvillean). Although formulas for the higher order terms become quickly very complicated, in principle it is possible to compute all terms in the expansion (1.10). We will discuss this point at the end of Sect. 4.
Theorem 1.2 establishes the basic thermodynamical property of the system S + R1 + R2 , namely that the set of initial states N belongs to the basin of attraction of a single NESS ωλ+ . We now discuss the other thermodynamical properties of this system. The first question is whether ωλ+ belongs to the set N of normal states. Theorem 1.3. Assume that (A1) − (A3) hold and that β1 = β2 . Then there is 4 > 0 such that for 0 < |λ| < 4 there are no τλ -invariant states in N . In particular, if 0 < |λ| < min(1, 4), then ωλ+ ∈ N . Remark 1. This result can be proven under a more general condition than (A2), see [DJ1, DJ2]. Remark 2. The constant 4 differs from the constant 1 in Theorem 1.2. In contrast to 1, 4 can be chosen independently of the size of βi ’s as βi → ∞ (see [DJ2] for details). On the other hand, 4 depends on d = |β1 − β2 | and 4 ↓ 0 as d ↓ 0. The constant 1 can be chosen independently of d as long as 0 ≤ d ≤ const. Recall that the entropy production depends on the choice of the initial state ω. Let s be the set of states in Ns with the property that ωs > 0 and is τs -invariant. The N s . If (A2) holds, then (E2) holds for the assumption (E1) of Sect. 1.1 holds for all ω ∈ N perturbation V . s , Theorem 1.4. Under the assumptions of Theorem 1.3, for any initial state ω ∈ N Ep(ωλ+ ) = ωλ+ (δω (λV )) > 0, for 0 < |λ| < min(1, 4). Moreover, Ep(ωλ+ ) does not depend on the choice of the initial s . state ω ∈ N Remark. This theorem can be proven in two different ways. The short proof (the one we will give in this paper) is based on Theorem 1.1. This proof gives no estimate on the size of entropy production. The second proof is based on the perturbative expansion of the state ωλ+ . Although computationally tedious, this proof has the advantage of showing that the entropy production is strictly positive to the lowest non-trivial order (the first non-trivial term can be also computed using the van Hove weak coupling limit, see [LS]). 
We will discuss the perturbative proof of Theorem 1.4 in [JP4]. We finish this section with a brief discussion of the heat fluxes. Let δi be the generator (i) of τf . (A2) implies that Vi ∈ D(δi ). The observable describing the heat flux (energy transfer) from the rest of the system into the i th reservoir is 5i := δi (λVi ). Theorem 1.5. Assume that (A1)–(A3) hold and that β1 = β2 . Then, for 0 < |λ| < min(1, 4), the following relations hold: ωλ+ (51 ) + ωλ+ (52 ) = 0, β1 ωλ+ (51 ) + β2 ωλ+ (52 ) = −Ep(ωλ+ ) < 0,
(1.11)
where in the second relation the entropy production is computed w.r.t. any initial state s . in N Remark 1. Relations (1.11) are respectively the first and the second law of thermodynamics for the model S + R1 + R2 .
Remark 2. If β1 > β2 , then ωλ+ (51 ) > 0. Thus, in NESS ωλ+ there is a constant nonvanishing heat flow from the hotter to the colder reservoir across the system S. Remark 3. Except for the strict positivity of entropy production, the relations (1.11) follow only from a few structural properties of the model S + R1 + R2 , and can be proven in considerable generality, see [JP4] for details.
1.3. Remarks. Although in this paper we have chosen to study the simplest non-trivial model, our results can be easily extended to the case where S is an N -level atom, there are M-reservoirs instead of two, and Vi is a finite sum of terms of the form Qi ⊗ ϕ(αi1 ) . . . ϕ(αin )in(n−1)/2 , (one assumes that (αik , αij ) = 0 for k = j and Qi = Q∗i ∈ M(CN )). In this case, Assumption (A3) has to be replaced with a more complicated algebraic condition which ensures that a suitable N × N matrix has zero as a simple eigenvalue. This condition is studied in detail in [DJ2] and is closely related to the non-degeneracy condition discussed in the context of the master equation approach to the non-equilibrium thermodynamics [Da, LS, Sp2, Fr]. We will discuss both the more general model and the relation of our results with the master equation technique in the continuation of this paper [JP4]. If the Fermi reservoirs are replaced with Bose reservoirs, then the combined system has to be described within the framework of W ∗ -dynamical systems. In this case the perturbation V is an unbounded operator and this leads to some technical difficulties in the study of the L∞ -Liouvillean (the analog of C-Liouvillean for W ∗ -systems). It is an important open problem to prove the analog of Theorem 1.2 for Bose reservoirs. Among the works related to ours, we mention the one of Davies [Da], where the dynamics of the system S + i Ri is studied in the van Hove weak-coupling limit t = λ2 t, λ ↓ 0, t ↑ ∞. In particular, Davies proves the existence and uniqueness of NESS in the van Hove limit (this state coincides with ω0 in the expansion (1.10)). Lebowitz and Spohn [LS] have used Davies results to study the thermodynamics of the system S + i Ri in the van Hove limit steady state ω0 . There is a substantial literature on the use of the van Hove limit and Markovian master equations in statistical mechanics, see [GFV, Hak] for references and additional information. 
The results beyond van Hove limit are scarce. In [JP1, JP2] Theorem 1.2 was proven in the thermal equilibrium case where β1 = β2 . The method of the proof was based on quantum Koopmanism and the spectral analysis of the quantum Koopman operator – the (standard) Liouvillean – of the system S + i Ri . Various extensions and generalizations of these results are given in [BFS, DJ1, DJ2, M]. An alternative (abstract) approach to the study of non-equilibrium steady states of finite quantum systems coupled to thermal reservoirs was recently proposed in [Ru1]. This proposal is based on the scattering theory of C ∗ -dynamical systems and an ergodicity hypothesis called L1 -asymptotic abelianness. This hypothesis is difficult to verify in concrete models, and in particular it is not known whether it holds for the model studied in this paper. We would like to add the following general remark. It is known that the ergodic properties of C ∗ -dynamical systems in thermal equilibrium are encoded in the spectrum of a suitable self-adjoint operator, the quantum Koopman operator or Liouvillean, see e.g. [JP2]. In non-equilibrium situations, the quantum Koopmanism is not applicable, and it has been generally believed that the understanding of NESS requires the development
of scattering theory. In the models of physical interest this is a difficult task, and the progress has been slow (see however [DG1, DG2, FGS]). A perhaps surprising aspect of our method is that at least in some situations, the spectral approach to NESS is possible, and that the structure of NESS is encoded in the spectral resonances of a suitable non-self-adjoint operator, the C-Liouvillean.
The paper is organized as follows. The method of the proof is described in the abstract setting in Sect. 2 where we introduce the concept of the C-Liouvillean, $L$, and show how the NESS of an abstract $C^*$-dynamical system are related to the resonances of $L^*$. The results of Sect. 2 are quite general and, we believe, shed some light on the structure of non-equilibrium quantum statistical mechanics. In Sections 3 and 4 we apply the abstract formalism of Sect. 2 to the specific model $S + R_1 + R_2$: in Sect. 3 we explicitly compute the modular structure and C-Liouvillean $L$, and in Sect. 4 we study the resonances of $L^*$ using the complex deformation technique previously developed in [JP1].
2. Liouvilleans and NESS
The goal of this section is to introduce the basic new ingredient of our method, the C-Liouvillean. In Sect. 2.1 we recall the basic notions of Tomita–Takesaki modular theory and in particular the notion of the standard Liouvillean. In Sect. 2.2 we introduce C-Liouvilleans. In Sect. 2.3 we describe the relation between the C-Liouvilleans and NESS.
Throughout this section we adopt the following framework. Let $(\mathcal{O}, \tau)$ be a $C^*$-dynamical system and $\omega$ a given faithful state. Let $(\mathcal{H}, \pi, \Omega)$ be the GNS-representation of the algebra $\mathcal{O}$ associated to $\omega$ (for simplicity, we write $\mathcal{H}$ for $\mathcal{H}_\omega$, etc). Since $\omega$ is faithful, $\pi$ is an injection and we can identify $\mathcal{O}$ and $\pi(\mathcal{O})$ (with a slight abuse of notation, we write $A$ for $\pi(A)$). We set $\mathcal{M} = \pi(\mathcal{O})''$ and assume that $\Omega$ is a separating vector for the von Neumann algebra $\mathcal{M}$ ($A \in \mathcal{M}$, $A\Omega = 0 \Rightarrow A = 0$).
We denote by $\mathcal{N} \subset E(\mathcal{O})$ the set of all $\pi$-normal states, that is, the states represented by density matrices on $\mathcal{H}$. Every element of $\mathcal{N}$ extends uniquely to a state on $\mathcal{M}$. In what follows we assume that $\omega$ is $\tau$-invariant. Then $\tau$ has a unique extension to a weakly continuous group of automorphisms of $\mathcal{M}$ which we denote by the same letter. The state $\omega(A) = (\Omega, A\Omega)$ is a $\tau$-invariant state on $\mathcal{M}$. Let $V \in \mathcal{O}$ be a local perturbation and $\tau_V$ the perturbed $C^*$-dynamics. The group $\tau_V$ also extends to a weakly continuous group of automorphisms of $\mathcal{M}$ which we denote by the same letter.
2.1. The standard Liouvillean. There exists a unique self-adjoint operator $\mathcal{L}$ on $\mathcal{H}$ such that for $A \in \mathcal{M}$,
$$\tau^t(A) = e^{it\mathcal{L}} A e^{-it\mathcal{L}}, \qquad \mathcal{L}\Omega = 0.$$
We call the operator $\mathcal{L}$ the standard Liouvillean. Note that the perturbed time evolution $\tau_V$ also has a unitary implementation
$$\tau_V^t(A) = e^{it(\mathcal{L} + V)} A e^{-it(\mathcal{L} + V)}.$$
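Let $\Delta$, $J$ and $\mathcal{P}$ be the modular operator, the modular conjugation and the natural cone of the pair $(\mathcal{M}, \Omega)$. By definition of the modular structure, $\mathcal{M}\Omega \subset D(\Delta^{1/2})$ and for $A \in \mathcal{M}$,
$$J\Delta^{1/2} A\Omega = A^*\Omega. \tag{2.1}$$
By the Tomita–Takesaki theorem,
$$\Delta^{it}\mathcal{M}\Delta^{-it} = \mathcal{M}, \qquad J\mathcal{M}J = \mathcal{M}'.$$
For every normal state $\eta \in \mathcal{N}$ there is a unique vector $\Omega_\eta \in \mathcal{P}$ such that $\eta(A) = (\Omega_\eta, A\Omega_\eta)$. Let
$$\mathcal{L}_V \equiv \mathcal{L} + V - JVJ.$$
We will call $\mathcal{L}_V$ the standard Liouvillean for the perturbation $V$. The operator $\mathcal{L}_V$ is the unique self-adjoint operator satisfying
$$\tau_V^t(A) = e^{it\mathcal{L}_V} A e^{-it\mathcal{L}_V}, \qquad e^{-it\mathcal{L}_V}\mathcal{P} \subset \mathcal{P},$$
see [BR2, DJP]. An immediate consequence of these relations is:
Proposition 2.1. The state $\eta \in \mathcal{N}$ is $\tau_V$-invariant iff $\mathcal{L}_V\Omega_\eta = 0$.
By this proposition, the study of normal, $\tau_V$-invariant states reduces to the study of $\mathrm{Ker}\, \mathcal{L}_V$. If $\omega$ is $(\tau, \beta)$-KMS, then by the fundamental result of Araki there exists a state $\omega_V \in \mathcal{N}$ which is $(\tau_V, \beta)$-KMS. Thus, in thermal equilibrium $\mathrm{Ker}\, \mathcal{L}_V$ is never trivial. On the other hand, if $\omega$ is not a KMS-state, then typically $\mathrm{Ker}\, \mathcal{L}_V = \{0\}$ and to study NESS using spectral techniques we need new concepts.
2.2. C-Liouvillean. The vector space $\mathcal{O}\Omega = \{A\Omega \mid A \in \mathcal{O}\}$ equipped with the norm
$$\|A\Omega\|_\infty = \|A\|, \tag{2.2}$$
is a Banach space which we denote by $C(\mathcal{O}, \Omega)$. Note that every $A \in \mathcal{O}$ defines, by right multiplication, a bounded linear map on $C(\mathcal{O}, \Omega)$. This map we again denote by $A$. Obviously, the map
$$\mathcal{O} \ni A \mapsto A\Omega \in C(\mathcal{O}, \Omega),$$
is a Banach space isomorphism. Under this isomorphism, the group $\tau_V^t$ is mapped into a continuous group $T_V^t$ of isometries of $C(\mathcal{O}, \Omega)$. Clearly,
$$T_V^t A\Omega = \tau_V^t(A)\Omega, \tag{2.3}$$
and
$$T_V^t\Omega = \Omega, \qquad T_V^t A T_V^{-t} = \tau_V^t(A). \tag{2.4}$$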
The generator of the group $T_V^t$ we denote by $L_V$ and call it C-Liouvillean. It is convenient to include the imaginary unit in the definition of $L_V$ so that $T_V^t = e^{itL_V}$. By (2.3),
$$D(L_V) = \{A\Omega \mid A \in D(\delta_V)\},$$
and $iL_V A\Omega = \delta_V(A)\Omega$. We proceed to compute the operator $L_V$ in terms of the modular structure. Let $A \in D(\delta_V) = D(\delta)$ be given. Differentiating the relation
$$e^{itL_V} A\Omega = e^{it(\mathcal{L} + V)} A e^{-it(\mathcal{L} + V)}\Omega,$$
and setting $t = 0$ we derive
$$L_V A\Omega = (\mathcal{L} + V)A\Omega - (VA^*)^*\Omega.$$
Applying (2.1) twice we obtain
$$(VA^*)^*\Omega = J\Delta^{1/2} V J\Delta^{1/2} A\Omega.$$
Since $J\Delta^{1/2} = \Delta^{-1/2}J$ on $\mathcal{O}\Omega$, the operator $L_V$ has the form
$$L_V = \mathcal{L} + V - J\Delta^{1/2} V \Delta^{-1/2} J. \tag{2.5}$$
1
is a bounded operator with norm V . We now identify conditions under which TVt extends to a strongly continuous group on H. The formula (2.5) implies that the operator LV extends to a dense subspace D := D(L) ∩ O. Moreover, since D ⊂ D(L∗V ), the linear operator LV with domain D is closable. We denote its closure by the same letter. It follows that TVt extends to a strongly continuous group on H iff LV satisfies the conditions of Hille–Yosida–Phillips theorem: (R1) For some a > 0, σ (LV ) ⊂ {z | |Imz| ≤ a}. (R2) There is a M > 0 such that for all z with |Imz| > a and all integers n ≥ 1, (z − LV )−n ≤ M(|Imz| − a)−n . In the next proposition we summarize some elementary consequences of the assumptions (R1) and (R2). In the sequel L#V stands either for LV or L∗V . Proposition 2.2. Assume that (R1) and (R2) hold. Then the operators iL#V are generators of strongly continuous groups on H. Moreover: (i) eitLV ≤ Mea|t| . #
(ii) If $\mathrm{Im}\, z > a$, then
$$(z - L_V^\#)^{-1} = \frac{1}{i}\int_0^\infty e^{izt} e^{-itL_V^\#}\, dt. \tag{2.6}$$
(iii) For all $A \in \mathcal{M}$,
$$\tau_V^t(A) = e^{itL_V} A e^{-itL_V} = e^{itL_V^*} A e^{-itL_V^*}.$$
(iv) $L_V\Omega = 0$.
Proof. Parts (i) and (ii) are well-known properties of strongly continuous groups. Parts (iii) and (iv) follow from (2.4). $\Box$
It is convenient to introduce conditions on the perturbation $V$ which can be easily checked in concrete models and which imply (R1) and (R2) above. We describe one such condition below. For self-adjoint $V \in \mathcal{O}$ and $t \in \mathbb{R}$ we set
$$V_t \equiv \Delta^{it} V \Delta^{-it}.$$
(R3) The function $\mathbb{R} \ni t \mapsto V_t \in \mathcal{M}$ has an analytic continuation to the strip $\{z \mid |\mathrm{Im}\, z| < 1/2\}$ which is bounded and continuous on its closure.
Note that since $V_t$ is self-adjoint we must have $V_z^* = V_{\bar{z}}$. Clearly, (R3) implies (R1) and (R2), and
$$L_V = \mathcal{L} + V - JV_{-i/2}J, \qquad L_V^* = \mathcal{L} + V - JV_{i/2}J.$$
Moreover, if (R3) holds, then one can take $a = \|V_{i/2}\| = \|V_{-i/2}\|$ and $M = 1$ in (R1)–(R2).
If $\omega$ is a $(\tau, \beta)$-KMS state, there is an important relation between the standard Liouvillean $\mathcal{L}_V$ and the C-Liouvillean $L_V$. A simple computation shows that for $t \in \mathbb{R}$,
$$\mathcal{L} + V - JV_tJ = e^{-i\beta t(\mathcal{L} + V)}\, \mathcal{L}_V\, e^{i\beta t(\mathcal{L} + V)}.$$
If (R3) holds, then by analytic continuation the relation
$$L_V = e^{\beta(\mathcal{L} + V)/2}\, \mathcal{L}_V\, e^{-\beta(\mathcal{L} + V)/2}, \tag{2.7}$$
holds in quadratic form sense on a domain $D(e^{-\beta(\mathcal{L} + V)/2}) \cap D(e^{\beta(\mathcal{L} + V)/2})$. The identity (2.7) leads to a simpler proof of some fundamental results of Araki's theory of perturbations of $W^*$-dynamical systems (see [DJP] for details). It can also be used to relate the method of the proof of Theorem 1.2, restricted to the thermal equilibrium case $\beta_1 = \beta_2 = \beta$, to the method of [JP1, JP2]. For reasons of space we omit the details. If $\omega$ is not a KMS-state, then there is no direct relation between $\mathcal{L}_V$ and $L_V$.
2.3. Spectral theory of NESS. Our goal is to study NESS using spectral theory of C-Liouvilleans. For this reason it is more convenient to deal with NESS defined using Abelian limits. The weak-$*$ limit points of the set of states
$$\epsilon \int_0^\infty e^{-\epsilon t}\, \omega \circ \tau_V^t\, dt,$$
as $\epsilon \downarrow 0$ we denote by $\Sigma_{V,\mathrm{Ab}}^+(\omega)$. The set $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ is a non-empty weak-$*$ compact subset of $E(\mathcal{O})$ whose elements are $\tau_V$-invariant. Moreover:
Proposition 2.3. If either $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ or $\Sigma_V^+(\omega)$ consists of a single state, then $\Sigma_{V,\mathrm{Ab}}^+(\omega) = \Sigma_V^+(\omega)$.
The proof of this proposition follows from standard Abelian and Tauberian theorems [Si]. With a slight abuse of terminology we will also call the elements of $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ the NESS of $(\mathcal{O}, \tau_V)$ associated to the initial state $\omega$.
In what follows we assume that the assumptions (R1) and (R2) hold. Our goal is to characterize NESS in $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ in terms of the corresponding C-Liouvillean. To motivate this characterization, for $\mathrm{Im}\, z > a$ let
$$\Omega_z := (z - L_V^*)^{-1}\Omega,$$
and let $\omega_z \in \mathcal{O}^*$ be defined by $\omega_z(A) = (\Omega, A\Omega_z)$. Then, since
$$\omega_z(A) = \frac{1}{i}\int_0^\infty e^{izt}\, \omega(\tau_V^t(A))\, dt,$$
the functionals $\omega_z$ have weak-$*$ analytic extension to the half-plane $\mathrm{Im}\, z > 0$ and $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ is the weak-$*$ limit point set of the set of states $\{i\epsilon\omega_{i\epsilon} \mid \epsilon > 0\}$ as $\epsilon \downarrow 0$. We wish to go further along these lines and characterize $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ directly in terms of the vectors $\Omega_z$. Our main tool is an axiomatic abstract version of the complex deformation technique.
Let $D \geq 0$ be a bounded operator on $\mathcal{H}$ such that $\mathrm{Ran}\, D$ is dense in $\mathcal{H}$ and $D\Omega = \Omega$. Set
$$R_D(z) := D(z - L_V^*)^{-1}D.$$
Our first assumption is:
(DL1) The vector-valued function $z \mapsto R_D(z)\Omega$, originally defined for $\mathrm{Im}\, z > a$, has an analytic continuation to the half-plane $\mathrm{Im}\, z > 0$ such that
$$\sup_{\epsilon > 0} \epsilon\|R_D(i\epsilon)\Omega\| < \infty. \tag{2.8}$$
Note that since $(\Omega, R_D(i\epsilon)\Omega) = (i\epsilon)^{-1}$, $\inf_{\epsilon > 0} \epsilon\|R_D(i\epsilon)\Omega\| \geq 1$. We define a vector subspace $\mathcal{O}_D \subset \mathcal{O}$ by
$$\mathcal{O}_D = \{A \in \mathcal{O} \mid A^*\Omega \in D(D^{-1})\}.$$
Let $\mathcal{O}_D^{\mathrm{cl}}$ be the norm closure of $\mathcal{O}_D$. Our next two assumptions are:
(DL2) $\mathcal{O}_D^{\mathrm{cl}} = \mathcal{O}$.
(DL3) The set $\{D^{-1}A^*\Omega \mid A \in \mathcal{O}_D\}$ is dense in $\mathcal{H}$.
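As a numerical aside on the Abelian limits used in this subsection, the sketch below compares the Abel mean $\epsilon\int_0^\infty e^{-\epsilon t} f(t)\, dt$ with the Cesàro mean $\frac{1}{T}\int_0^T f(t)\, dt$ for a model signal $f(t) = c + \mathrm{amp}\cdot e^{-\gamma t}\cos(wt)$. The signal is a hypothetical stand-in for $t \mapsto \omega(\tau_V^t(A))$, not something computed from the model, and both means are evaluated from their closed forms (which require $\gamma$ and $w$ not both zero).

```python
import cmath


def abel_mean(eps, c, amp, gamma, w):
    """eps * int_0^inf e^{-eps t} f(t) dt for f(t) = c + amp * e^{-gamma t} cos(w t)."""
    return c + amp * eps * (eps + gamma) / ((eps + gamma) ** 2 + w ** 2)


def cesaro_mean(T, c, amp, gamma, w):
    """(1/T) * int_0^T f(t) dt for the same f; assumes gamma and w are not both zero."""
    z = complex(-gamma, w)
    integral = ((cmath.exp(z * T) - 1.0) / z).real  # int_0^T e^{-gamma t} cos(w t) dt
    return c + amp * integral / T


# With gamma = 0 the signal oscillates forever and has no pointwise limit,
# yet both averages still approach the constant c.
for gamma in (0.3, 0.0):
    print(gamma,
          abel_mean(1e-6, c=0.25, amp=1.0, gamma=gamma, w=2.0),
          cesaro_mean(1e6, c=0.25, amp=1.0, gamma=gamma, w=2.0))
```

Both averages single out the same constant, which is the behaviour Proposition 2.3 guarantees whenever one of the two limit sets consists of a single state.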
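Let $W_V^+$ be the weak limit point set of $i\epsilon R_D(i\epsilon)\Omega$ as $\epsilon \downarrow 0$. Since the unit ball in a Hilbert space is weakly compact, (2.8) implies that $W_V^+$ is non-empty.
Proposition 2.4. Assume that (DL1) and (DL2) hold. Then there is an injection
$$W_V^+ \ni \Omega_V^+ \mapsto \omega_V^+ \in \Sigma_{V,\mathrm{Ab}}^+(\omega) \tag{2.9}$$
such that for $A \in \mathcal{O}_D$,
$$\omega_V^+(A) = (D^{-1}A^*\Omega, \Omega_V^+). \tag{2.10}$$
If in addition (DL3) holds, then the map (2.9) is a bijection.
Remark. The vectors in $W_V^+$ are naturally interpreted as the zero-resonance eigenvectors associated to the triple $(L_V^*, D, \Omega)$, and in this sense Proposition 2.4 identifies NESS with zero resonance eigenvectors of $L_V^*$.
Proof. Proposition 2.2 yields that for $A \in \mathcal{O}_D$,
$$\epsilon\int_0^\infty e^{-\epsilon t}\, \omega(\tau_V^t(A))\, dt = i\epsilon(D^{-1}A^*\Omega, R_D(i\epsilon)\Omega). \tag{2.11}$$
Since $\mathcal{O}_D^{\mathrm{cl}} = \mathcal{O}$, from this relation it follows that each $\Omega_V^+ \in W_V^+$ determines a unique state $\omega_V^+ \in \Sigma_{V,\mathrm{Ab}}^+(\omega)$ and that (2.10) holds for $A \in \mathcal{O}_D$. If in addition (DL3) holds, then Relation (2.11) and the uniform bound (2.8) imply that each $\omega_V^+ \in \Sigma_{V,\mathrm{Ab}}^+(\omega)$ determines a unique $\Omega_V^+ \in W_V^+$. $\Box$
An immediate consequence of Proposition 2.4 is that under the assumptions (DL1)–(DL3), $\Sigma_{V,\mathrm{Ab}}^+(\omega)$ consists of a single state $\omega_V^+$ iff
$$\mathrm{w-}\lim_{\epsilon \downarrow 0} i\epsilon R_D(i\epsilon)\Omega = \Omega_V^+,$$
and in this case for all $A \in \mathcal{O}$ we have
$$\lim_{T \to \infty} \frac{1}{T}\int_0^T \omega(\tau_V^t(A))\, dt = \lim_{\epsilon \downarrow 0} \epsilon\int_0^\infty e^{-\epsilon t}\, \omega(\tau_V^t(A))\, dt = \omega_V^+(A).$$
To refine the above result, we need additional assumptions. Let
$$\mathcal{M}_D' = \{C \in \mathcal{M}' \mid C^*C\Omega \in D(D^{-1})\},$$
and let $(\mathcal{M}_D'\Omega)^{\mathrm{cl}}$ be the closure of $\mathcal{M}_D'\Omega$ in $\mathcal{H}$.
(DL4) $(\mathcal{M}_D'\Omega)^{\mathrm{cl}} = \mathcal{H}$.
Note that since $\Omega$ is a separating vector for $\mathcal{M}$, $(\mathcal{M}'\Omega)^{\mathrm{cl}} = \mathcal{H}$. We denote by $\mathcal{N}_D$ the set of vector states
$$\eta(\,\cdot\,) = (C\Omega, \,\cdot\; C\Omega),$$
where $C \in \mathcal{M}_D'$ and $\|C\Omega\| = 1$. (DL4) implies that $\mathcal{N}_D$ is norm-dense in $\mathcal{N}$.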
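We will replace assumption (DL1) with
(DL5) The operator-valued function $z \mapsto R_D(z)$, originally defined for $\mathrm{Im}\, z > a$, has an analytic continuation to the region $\mathrm{Im}\, z > 0$ and there is a bounded operator $P_V^+$ such that
$$\mathrm{w-}\lim_{\epsilon \downarrow 0} i\epsilon R_D(i\epsilon) = P_V^+.$$
Proposition 2.5. Assume that the assumptions (DL2), (DL4) and (DL5) hold and that $\dim \mathrm{Ran}\, P_V^+ = 1$. Then, for all $\eta \in \mathcal{N}$,
$$\Sigma_{V,\mathrm{Ab}}^+(\eta) = \Sigma_{V,\mathrm{Ab}}^+(\omega) = \{\omega_V^+\}.$$
Proof. Note that since $P_V^{+*}\Omega = \Omega$ and $\dim \mathrm{Ran}\, P_V^+ = 1$, $P_V^+(\,\cdot\,) = (\Omega, \,\cdot\,)\Omega_V^+$. To prove the proposition it suffices to show that for $\eta \in \mathcal{N}_D$ and $A \in \mathcal{O}_D$,
$$\lim_{\epsilon \downarrow 0} \epsilon\int_0^\infty e^{-\epsilon t}\, \eta(\tau_V^t(A))\, dt = (D^{-1}A^*\Omega, \Omega_V^+). \tag{2.12}$$
Let $\eta \in \mathcal{N}_D$ and $A \in \mathcal{O}_D$ be given. Let $C \in \mathcal{M}_D'$ be such that $\eta(\,\cdot\,) = (C\Omega, \,\cdot\; C\Omega)$. Since $[C, \tau_V^t(A)] = 0$, we derive from Proposition 2.2 that
$$\int_0^\infty e^{-\epsilon t}\, \eta(\tau_V^t(A))\, dt = \int_0^\infty e^{-\epsilon t}(C\Omega, \tau_V^t(A)C\Omega)\, dt = i(D^{-1}A^*\Omega, R_D(i\epsilon)D^{-1}C^*C\Omega).$$
Therefore
$$\lim_{\epsilon \downarrow 0} \epsilon\int_0^\infty e^{-\epsilon t}\, \eta(\tau_V^t(A))\, dt = (D^{-1}A^*\Omega, P_V^+D^{-1}C^*C\Omega). \tag{2.13}$$
Relations $P_V^+(\,\cdot\,) = (\Omega, \,\cdot\,)\Omega_V^+$, $D^{-1}\Omega = \Omega$, and $\|C\Omega\| = 1$ yield
$$P_V^+D^{-1}C^*C\Omega = \Omega_V^+, \tag{2.14}$$
and (2.12) follows from (2.13) and (2.14). $\Box$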
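The integral identity used in the proof of Proposition 2.5 can be unpacked as follows. This is a sketch under the stated assumptions only: the inner product is taken conjugate-linear in its first argument, $D = D^*$ because $D \geq 0$, and the resolvent formula (2.6) is applied at $z = i\epsilon$, directly when $\epsilon > a$ and otherwise in the sense of the analytic continuation in (DL5).

```latex
\begin{align*}
(C\Omega, \tau_V^t(A)\, C\Omega)
 &= (\Omega, \tau_V^t(A)\, C^*C\Omega)
  && \text{since } C \in \mathcal{M}' \text{ commutes with } \tau_V^t(A) \in \mathcal{M}, \\
 &= (\tau_V^t(A^*)\Omega, C^*C\Omega)
  = (e^{itL_V} A^*\Omega, C^*C\Omega)
  && \text{by (2.3)}, \\
 &= (A^*\Omega, e^{-itL_V^*}\, C^*C\Omega), \\
\int_0^\infty e^{-\epsilon t}\, e^{-itL_V^*}\, dt
 &= i\,(i\epsilon - L_V^*)^{-1}
  && \text{by (2.6) with } z = i\epsilon, \\
\int_0^\infty e^{-\epsilon t}\,(C\Omega, \tau_V^t(A)\, C\Omega)\, dt
 &= i\,(A^*\Omega, (i\epsilon - L_V^*)^{-1} C^*C\Omega)
  = i\,(D^{-1}A^*\Omega, R_D(i\epsilon)\, D^{-1}C^*C\Omega).
\end{align*}
```

The last equality inserts $D D^{-1}$ on both sides of the resolvent, which is legitimate because $A^*\Omega$ and $C^*C\Omega$ lie in $D(D^{-1})$ by the definitions of $\mathcal{O}_D$ and $\mathcal{M}_D'$.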
The last result we wish to discuss concerns conditions under which the approach to NESS is exponentially fast. For $\mu \in \mathbb{R}$ let $P(\mu)$ be the half-plane $\{z \mid \mathrm{Im}\, z > \mu\}$. We replace (DL5) with:
(DL6) The operator-valued function $z \mapsto R_D(z)$, originally defined for $z \in P(a)$, has a meromorphic continuation to a half-plane $P(\mu)$ for some $\mu < 0$.
Since $(\Omega, R_D(z)\Omega) = 1/z$, zero is always a pole of $R_D(z)$. It is not difficult to show that if in addition (DL3) holds, then zero is a simple pole of $R_D(z)$ and all other poles are in the half-plane $\mathrm{Im}\, z \leq 0$. In particular, (DL3) $\wedge$ (DL6) $\Rightarrow$ (DL5). We will not make use of assumption (DL3) below.
Assume in addition to (DL6) that the function $R_D(z)$ has only finitely many poles $\{z_0, z_1, \ldots, z_n\}$ ($z_0 = 0$) in the half-plane $P(\mu)$ and let $m_k$ be the order of the pole $z_k$. Then we can decompose $R_D(z)$ as
$$R_D(z) = R_D^a(z) + R_D^s(z), \tag{2.15}$$
where $R_D^a(z)$ is an analytic operator-valued function in the half-plane $P(\mu)$ and
$$R_D^s(z) = \sum_{k=0}^{n} S_k(z), \qquad S_k(z) = \sum_{i=1}^{m_k} \frac{S_{ki}}{(z - z_k)^i}. \tag{2.16}$$
Let $P_V^+$ be the residue of $R_D(z)$ at $z = 0$. Then
$$P_V^+ = \frac{1}{2\pi i}\oint_\gamma R_D(z)\, dz = S_{01},$$
where $\gamma$ is a small circle around zero such that inside $\gamma$ zero is the only singularity of $R_D(z)$.
Theorem 2.1. Assume the following:
(a) Assumptions (DL2), (DL4) and (DL6) hold.
(b) The function $R_D(z)$ has only finitely many singularities $\{z_0, z_1, \ldots, z_n\}$ in $P(\mu)$, where $z_0 = 0$ and $\mathrm{Im}\, z_k < 0$ for $k \geq 1$.
(c) $\dim \mathrm{Ran}\, P_V^+ = 1$.
(d) For all $\Psi \in \mathcal{H}$ and $j = 0, 1$,
$$\sup_{y > \mu} \int_{\mathbb{R}} \left|\partial_x^j(\Psi, R_D^a(x + iy)\Psi)\right|^{2-j} dx < \infty.$$
Then,
(i) For all $\eta \in \mathcal{N}$, $\Sigma_{V,\mathrm{Ab}}^+(\eta) = \Sigma_{V,\mathrm{Ab}}^+(\omega) = \{\omega_V^+\}$. Moreover, for all $A \in \mathcal{O}$,
$$\lim_{t \to \infty} \eta(\tau_V^t(A)) = \omega_V^+(A). \tag{2.17}$$
(ii) For all η ∈ ND , A ∈ OD , and t > 0, t η(τ (A)) − ω+ (A) ≤ Cη,A e−γ t &t'r−1 , V V where γ ≡ min1≤k≤n |Imzk | and r is the maximum order of dominant poles (the poles in {z1 , · · · , zn } closest to the real axis). Proof. Since (ii) ⇒ (i), we have to prove (ii) only. Fix η ∈ ND , η( · ) = (C, · C), and A ∈ OD . Then, ∞ eizt η(τVt (A))dt = i(D −1 A∗ , RD (z)D −1 C ∗ C) ≡ 4(z). 0
Fix δ > 0 and µ! such that µ < µ! < −γ . Let α > 0 be a large number and %α the rectangle with vertices {±α + iδ, ±α + iµ! }. Then, for any ; > 0, α 1 1 −itz e 4(z)dz = − e−it (x+iδ) 4(x + iδ)dx + S(α) + B(α) 2π %α 2π −α = −ωV+ (A) −
mk n (−it)i−1 k=1 i=1
(i − 1)!
(D −1 A∗ , Ski D −1 C ∗ C)e−itzk , (2.18)
where S(α) is the integral of 4 over the vertical sides of the rectangle %α and B(α) is the integral over the bottom side. Integration by parts and (d) with j = 1 yield that for ! t > 1 and uniformly in α, |B(α)| = O(eµ t ). Using (d) with j = 0, a standard argument (see e.g. Theorem 19.2 in [Rud]) yields that for some sequence αn → ∞, |S(αn )| → 0. Moreover, the sequence αn can be chosen independently of δ as long as δ < const. Pick a subsequence αnk such that αn k 1 lim e−it (x+iδ) 4(x + iδ)dx = η(τVt (A)), k→∞ 2π −αn k for Lebesgue a.e. t > 0 and set α = αnk in (2.18). Taking k → ∞ we derive that for a.e. t > 0, t η(τ (A)) − ω+ (A) ≤ CA,η e−γ t &t'r−1 . (2.19) V V Since both sides in (2.19) are continuous functions of t, the estimate (2.19) holds for all t > 0. # $ 3. Modular Structure of the Model In this section we return to the model S + R1 + R2 . We explicitly compute the modular structure associated to (O, τ ) and the states in Ns . We then use these results to compute the standard and the C-Liouvillean of the locally perturbed system. Since the results of this section are either well-known or follow from simple computations we will omit the proofs. Notation. If A is a linear operator on Hs , we denote by A the linear operator Aψ = Aψ, where on the right-hand side · is the usual complex conjugation on Hs = C2 . We begin by computing the modular structure associated to the small system S. Set Hs = Hs ⊗ Hs , πs (A) = A ⊗ 1, πs# (A) = 1 ⊗ A. = Tr(ρs A). The Let ωs be a state on Os . Then there is a density matrix ρs such that ωs (A) state ωs is faithful iff ρs > 0 and τs -invariant iff [Hs , ρs ] = 0. If ρs ( · ) = pi (ψi , · )ψi , let 1 s := pi2 ψi ⊗ ψ i . Recall that the dynamics of S is specified by automorphisms (1.3). Let Ls ≡ Hs ⊗ 1 − 1 ⊗ Hs .
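Proposition 3.1. The triple $(\mathcal{H}_s, \pi_s, \Omega_s)$ is the GNS representation of $\mathcal{O}_s$ associated to $\omega_s$. If $\omega_s$ is $\tau_s$-invariant, then $L_s$ is the corresponding standard Liouvillean. If $\omega_s$ is faithful, consider the pair $(\pi_s(\mathcal{O}_s), \Omega_s)$.
(i) Its modular operator is $\Delta_s = \rho_s \otimes \overline{\rho_s}^{\,-1}$.
(ii) Its modular conjugation is $J_s(\varphi \otimes \psi) = \overline{\psi} \otimes \overline{\varphi}$.
(iii) $J_s \pi_s(A) J_s = \pi_s^\#(A)$.

We now discuss the modular structure associated to a free Fermi reservoir in thermal equilibrium at inverse temperature $\beta$. We fix a complex conjugation (an anti-unitary involution) $f \mapsto \overline{f}$ which commutes with the single particle Hamiltonian $h$. Let $\hat\Omega_f$ be the Fock vacuum on the fermionic Fock space $\Gamma_-(\mathfrak{h})$ over the single particle space $\mathfrak{h}$, $N$ the number operator, $\vartheta \equiv \Gamma(-1) = (-1)^N$, and

$$G_\beta \equiv \left(e^{\beta h} + 1\right)^{-1}.$$

The complex conjugation on $\mathfrak{h}$ naturally extends to a complex conjugation on $\Gamma_-(\mathfrak{h})$ which we denote by the same symbol, i.e. $\Psi \mapsto \overline{\Psi}$. Let

$$\mathcal{H}_f \equiv \Gamma_-(\mathfrak{h}) \otimes \Gamma_-(\mathfrak{h}), \qquad \Omega_f = \hat\Omega_f \otimes \hat\Omega_f.$$

The Araki–Wyss representation $\pi_\beta$ of $\mathcal{O}_f$ on $\mathcal{H}_f$ is defined by

$$\pi_\beta(a(f)) = a\big((1 - G_\beta)^{1/2} f\big) \otimes 1 + \vartheta \otimes a^*\big(G_\beta^{1/2} \overline{f}\big),$$
$$\pi_\beta(a^*(f)) = a^*\big((1 - G_\beta)^{1/2} f\big) \otimes 1 + \vartheta \otimes a\big(G_\beta^{1/2} \overline{f}\big).$$

The dual representation $\pi_\beta^\#$ is defined by

$$\pi_\beta^\#(a^*(f)) = \vartheta\, a\big(G_\beta^{1/2} f\big) \otimes \vartheta + 1 \otimes a^*\big((1 - G_\beta)^{1/2} \overline{f}\big)\vartheta,$$
$$\pi_\beta^\#(a(f)) = a^*\big(G_\beta^{1/2} f\big)\vartheta \otimes \vartheta + 1 \otimes \vartheta\, a\big((1 - G_\beta)^{1/2} \overline{f}\big).$$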
The representations $\pi_\beta$ and $\pi_\beta^\#$ were introduced for the first time in [AW] (see also Example 5.2.20 in [BR2]). Let $L_f \equiv H_f \otimes 1 - 1 \otimes H_f$.

Proposition 3.2. The triple $(\mathcal{H}_f, \pi_\beta, \Omega_f)$ is the GNS representation of $\mathcal{O}_f$ associated to the KMS-state $\omega_{f,\beta}$ and $L_f$ is the corresponding standard Liouvillean. The vector $\Omega_f$ is separating for the enveloping von Neumann algebra $\mathfrak{M}_{f,\beta} \equiv \pi_\beta(\mathcal{O}_f)''$. Consider the pair $(\mathfrak{M}_{f,\beta}, \Omega_f)$.
(i) Its modular operator is $\Delta_f = e^{-\beta L_f}$.
(ii) Its modular conjugation is $J_f(\Phi \otimes \Psi) = u\overline{\Psi} \otimes u\overline{\Phi}$, where $u \equiv (-1)^{N(N-1)/2}$.
(iii) $J_f \pi_\beta(A) J_f = \pi_\beta^\#(A)$.

If (A1) holds, then the GNS representation and modular structure of a free Fermi gas can be described in a somewhat different form which is more suitable for the spectral analysis. In what follows we assume that (A1) holds. Let $\tilde{\mathfrak{h}} = L^2(\mathbb{R}, G)$. To any $f \in \mathfrak{h}$ we associate a pair of functions $f_\beta, f_\beta^\# \in \tilde{\mathfrak{h}}$ by

$$f_\beta(s) = \left(e^{-\beta s} + 1\right)^{-1/2} \tilde f(s), \qquad f_\beta^\#(s) = i e^{-\beta s/2} f_\beta(s) = i\,\overline{f_\beta}(-s),$$

($\tilde f$ is defined by (1.7)). For later purposes we make the following remark. Assume that $\tilde f \in H^2(\delta)$ for some $0 < \delta < \pi/\beta$. Then

$$C(\delta, \beta) \equiv \sup_{|\mathrm{Im}\,z| < \delta} \left|1 + e^{-\beta z}\right|^{-1/2} < \infty.$$
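The finiteness of $C(\delta, \beta)$ for $0 < \delta < \pi/\beta$ can be checked by hand, and the following sketch compares a candidate closed form with a brute-force grid estimate. The closed form is not taken from the paper; it is our own elementary computation (writing $e^{-\beta z} = u e^{-i\varphi}$ and minimising $1 + 2u\cos\varphi + u^2$ over $u > 0$), and the function names and grid parameters are hypothetical.

```python
import cmath
import math


def c_closed_form(delta, beta):
    """Candidate closed form for sup_{|Im z| < delta} |1 + exp(-beta z)|^(-1/2)."""
    phase = beta * delta
    if not 0.0 < phase < math.pi:
        raise ValueError("need 0 < delta < pi / beta")
    # With exp(-beta z) = u * exp(-i phi), u > 0 and |phi| < phase, one has
    # |1 + u exp(-i phi)|^2 = 1 + 2 u cos(phi) + u^2.  Its infimum over u > 0
    # is 1 when cos(phi) >= 0 and sin(phi)^2 (at u = -cos(phi)) otherwise.
    if phase <= math.pi / 2:
        return 1.0
    return 1.0 / math.sqrt(math.sin(phase))


def c_grid(delta, beta, x_max=10.0, nx=1001, ny=201):
    """Brute-force estimate of the supremum on a grid in the closed strip."""
    best = 0.0
    for ix in range(nx):
        x = -x_max + 2.0 * x_max * ix / (nx - 1)
        for iy in range(ny):
            y = -delta + 2.0 * delta * iy / (ny - 1)
            value = abs(1.0 + cmath.exp(-beta * complex(x, y))) ** -0.5
            best = max(best, value)
    return best


if __name__ == "__main__":
    for delta in (1.0, 2.5, 3.0):
        print(delta, c_closed_form(delta, 1.0), c_grid(delta, 1.0))
```

Under these assumptions the constant stays equal to 1 while $\beta\delta \le \pi/2$ and grows like $(\sin\beta\delta)^{-1/2}$ as $\beta\delta \uparrow \pi$, which is why the strip width has to stay below $\pi/\beta$.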
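It then follows that $f_\beta, f_\beta^\# \in H^2(\delta)$,

$$\|f_\beta\|_{H^2(\delta)} = \|f_\beta^\#\|_{H^2(\delta)} = \|e^{\beta s/2} f_\beta^\#\|_{H^2(\delta)} = \|e^{-\beta s/2} f_\beta\|_{H^2(\delta)},$$
$$\|f_\beta\|_{H^2(\delta)} \le C(\delta, \beta)\, \|\tilde f\|_{H^2(\delta)}, \tag{3.1}$$
$$\|e^{-\beta s/2} f_\beta^\#\|_{H^2(\delta)} \le C(\delta, \beta)\, \|e^{-\beta s/2} \tilde f\|_{H^2(\delta)}.$$

We denote by $s$ the operator of multiplication by $s \in \mathbb{R}$. Let $\tilde\Omega_f$ be the vacuum on $\Gamma_-(\tilde{\mathfrak{h}})$.

Theorem 3.3. There exists a unitary map $U : \mathcal{H}_f \to \Gamma_-(\tilde{\mathfrak{h}})$ such that

$$U\Omega_f = \tilde\Omega_f, \qquad U L_f U^{-1} = d\Gamma(s),$$
$$U \pi_\beta(\varphi(f)) U^{-1} = \varphi(f_\beta), \qquad U \pi_\beta^\#(\varphi(f)) U^{-1} = i\,\Gamma(-1)\varphi(f_\beta^\#).$$

Proof. This result follows from the identification $\mathfrak{h} \oplus \mathfrak{h} = L^2(\mathbb{R}, G)$ and the exponential law for fermionic systems (see Theorem 3.2 in [BSZ]). $\Box$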
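In what follows we will work exclusively in the representation given by Theorem 3.3 and we identify the quantities related by $U$ ($\mathcal{H}_f$ now stands for $\Gamma_-(\tilde{\mathfrak{h}})$, $\Omega_f$ for $\tilde\Omega_f$, $L_f$ for $d\Gamma(s)$, etc.).

Consider now two identical reservoirs $(\mathcal{O}_f^{(i)}, \tau_i)$ and let $\mathcal{O}$ be given by (1.4). Let $\omega_{\beta_i}$ be a $(\tau_i, \beta_i)$-KMS state on $\mathcal{O}_f^{(i)}$ for some $\beta_i > 0$. Set

$$\mathcal{H} = \mathcal{H}_s \otimes \mathcal{H}_f^{(1)} \otimes \mathcal{H}_f^{(2)}, \qquad \Omega = \Omega_s \otimes \Omega_f^{(1)} \otimes \Omega_f^{(2)},$$
$$\pi = \pi_s \otimes \pi_{\beta_1} \otimes \pi_{\beta_2}, \qquad \pi^\# = \pi_s^\# \otimes \pi_{\beta_1}^\# \otimes \pi_{\beta_2}^\#,$$
$$L = L_s + L_f^{(1)} + L_f^{(2)}.$$

Proposition 3.4. The GNS representation of $\mathcal{O}$ associated to $\omega_s \otimes \omega_{\beta_1} \otimes \omega_{\beta_2}$ is $(\mathcal{H}, \pi, \Omega)$. If $\omega_s$ is $\tau_s$-invariant, then $L$ is the corresponding standard Liouvillean. If $\omega_s$ is faithful, then $\Omega$ is a separating vector for the enveloping von Neumann algebra $\mathfrak{M} \equiv \pi(\mathcal{O})'' = \pi_s(\mathcal{O}_s) \otimes \mathfrak{M}_{f,\beta_1} \otimes \mathfrak{M}_{f,\beta_2}$. For $\omega_s$ faithful, consider the pair $(\mathfrak{M}, \Omega)$.
(i) Its modular operator is $\Delta = \Delta_s \otimes \Delta_f^{(1)} \otimes \Delta_f^{(2)}$.
(ii) Its modular conjugation is $J = J_s \otimes J_f \otimes J_f$.
(iii) $J\pi(A)J = \pi^\#(A)$.

Let now $V$ be the perturbation (1.5). The standard Liouvillean $L_V$ for the perturbed dynamics is now easily computed in the representation $\pi$. With a slight abuse of notation we identify $V$ and $\pi(V)$. Moreover, we denote the field and number operators on $\mathcal{H}_f^{(i)}$ by $\varphi^{(i)}$ and $N_i$. Then,

$$V = (\sigma_x \otimes 1) \otimes \varphi^{(1)}(\alpha_{1\beta_1}) + (\sigma_x \otimes 1) \otimes \varphi^{(2)}(\alpha_{2\beta_2}),$$
$$JVJ = (1 \otimes \sigma_x) \otimes i(-1)^{N_1} \varphi^{(1)}(\alpha_{1\beta_1}^\#) + (1 \otimes \sigma_x) \otimes i(-1)^{N_2} \varphi^{(2)}(\alpha_{2\beta_2}^\#).$$

Proposition 3.5. The standard Liouvillean of the perturbed system $(\mathcal{O}, \tau_\lambda)$ in the representation $\pi$ is $L_\lambda = L + \lambda V - \lambda JVJ$.

Assume now that (A2) holds. Then, the assumption (R3) of Sect. 2.2 holds and

$$J V_{-i/2} J = i \sum_i \big(1 \otimes \overline{\rho_s}^{\,1/2} \sigma_x \overline{\rho_s}^{\,-1/2}\big) \otimes \frac{1}{\sqrt{2}} (-1)^{N_i} \Big( a^{(i)}\big(e^{-\beta_i s/2} \alpha_{i\beta_i}^\#\big) + a^{(i)*}\big(e^{\beta_i s/2} \alpha_{i\beta_i}^\#\big) \Big),$$
$$J V_{i/2} J = i \sum_i \big(1 \otimes \overline{\rho_s}^{\,-1/2} \sigma_x \overline{\rho_s}^{\,1/2}\big) \otimes \frac{1}{\sqrt{2}} (-1)^{N_i} \Big( a^{(i)}\big(e^{\beta_i s/2} \alpha_{i\beta_i}^\#\big) + a^{(i)*}\big(e^{-\beta_i s/2} \alpha_{i\beta_i}^\#\big) \Big).$$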
Proposition 3.6. If $\omega_s$ is faithful and Hypothesis (A2) holds, then Hypothesis (R3) of Sect. 2.2 holds for the perturbation $V$ and the C-Liouvillean is $L_\lambda = L + \lambda V - \lambda J V_{-i/2} J$. The adjoint of $L_\lambda$ is $L_\lambda^* = L + \lambda V - \lambda J V_{i/2} J$.

Although the standard Liouvillean does not depend on the choice of the initial state of the small system, the C-Liouvillean does, through the term $J V_{-i/2} J$. It is often convenient to take a simple choice for the initial state $\omega_s$, namely

$$\omega_s(A) = \mathrm{Tr}(A)/2, \tag{3.2}$$

whose density matrix is $\rho_s = 1/2$. In this case $L_\lambda$ takes a slightly simpler form and

$$\|V_{\pm i/2}\| \le 2 \sum_i \|e^{\beta_i s/2} \alpha_i\|.$$
4. Spectral Analysis

The spectral analysis of the operators $L_\lambda$ and $L_\lambda^*$ follows closely [JP1]. In this section we will state the main results of this analysis and discuss some of their consequences. We will only indicate the main steps of the proofs and the interested reader should consult [JP1] for details. Throughout this section we assume that Assumptions (A1) and (A2) hold.

Recall that $\tilde{\mathfrak{h}} = L^2(\mathbb{R}, G)$ and $\mathcal{H}_f = \Gamma_-(\tilde{\mathfrak{h}})$. Let $p \equiv i\partial_s$ be the generator of the group of translations on $\tilde{\mathfrak{h}}$ and $P = d\Gamma(p)$ its second quantization. We adopt the shorthand $\langle P \rangle = (1 + P^2)^{1/2}$. Let $\delta > 0$ be as in (A2). In what follows we fix $\kappa$ such that $0 < \kappa < \min(\pi/\beta_1, \pi/\beta_2, \delta)$. Let

$$D := 1 \otimes e^{-\kappa\langle P\rangle} \otimes e^{-\kappa\langle P\rangle}.$$

Obviously, $\mathrm{Ran}\,D$ is dense in $\mathcal{H}$ and the vectors of the form $\psi \otimes \Omega_f^{(1)} \otimes \Omega_f^{(2)}$, $\psi \in \mathcal{H}_s$, are invariant under $D$. Recall that $P(\mu) \equiv \{z \mid \mathrm{Im}\,z > \mu\}$. We deal first with the standard Liouvillean and Theorem 1.3.

Theorem 4.1. For any $\mu > -\kappa$ there is a constant $\Lambda > 0$ such that for $|\lambda| < \Lambda$ the operator-valued function

$$z \mapsto D(z - L_\lambda)^{-1} D, \tag{4.1}$$

originally defined for $\mathrm{Im}\,z > 0$, has a meromorphic continuation to the half-plane $P(\mu)$. The function (4.1) has at most four poles in $P(-\mu)$. If in addition (A3) holds and $\beta_1 \neq \beta_2$, then for all sufficiently small non-zero $\lambda$ none of the poles is on the real axis. In particular, for such $\lambda$ the spectrum of $L_\lambda$ is purely absolutely continuous and there are no $\tau_\lambda$-invariant states in the set $\mathcal{N}$ of normal states.
The last part of Theorem 4.1, the absence of $\tau_\lambda$-invariant states in $\mathcal{N}$, is the statement of Theorem 1.3. The proof of Theorem 4.1 follows the argument in [JP1, JP2]. Although in these works the Bose reservoirs are studied, the same (in fact, slightly simpler) argument applies to Fermi reservoirs. For the reader's convenience and for later applications, we recall the main steps of the argument in [JP1, JP2].

4.1. Sketch of the proof of Theorem 4.1. Let $u(\theta) \equiv e^{-i\theta P} = \Gamma(e^{-i\theta p})$ be the second quantization of the group of translations and $U(\theta) = 1 \otimes u(\theta) \otimes u(\theta)$. We set $L_\lambda(\theta) \equiv U(\theta) L_\lambda U(-\theta)$. Let $N = N_1 + N_2$. Note that $U(\theta) L U(-\theta) = L + \theta N$, $U(\theta) N U(-\theta) = N$, and

$$U(\theta) V U(-\theta) = \sum_i (\sigma_x \otimes 1) \otimes \varphi^{(i)}\big(e^{-i\theta p} \alpha_{i\beta_i}\big),$$
$$U(\theta) JVJ\, U(-\theta) = i \sum_i (1 \otimes \sigma_x) \otimes (-1)^{N_i} \varphi^{(i)}\big(e^{-i\theta p} \alpha_{i\beta_i}^\#\big).$$

If $V_{\mathrm{tot}}(\theta) = U(\theta)(V - JVJ)U(-\theta)$, then $L_\lambda(\theta) = L + \theta N + \lambda V_{\mathrm{tot}}(\theta)$. By (A2) and (3.1) the operator $V_{\mathrm{tot}}(\theta)$ is defined for all $\theta \in I(\kappa)$ and the map $I(\kappa) \ni \theta \mapsto V_{\mathrm{tot}}(\theta)$ is an analytic operator-valued function satisfying

$$C := \sup_{\theta \in I(\kappa)} \|V_{\mathrm{tot}}(\theta)\| \le 2\sqrt{2} \sum_i C(\kappa, \beta_i)\, \|\tilde\alpha_i\|_{H^2(\kappa)}.$$

Obviously, the operator $L_\lambda(\theta)$ is also defined for $\theta \in I(\kappa)$. For $\mathrm{Im}\,\theta \neq 0$, $L_\lambda(\theta)$ is a closed operator with domain $D(L) \cap D(N)$. Let $I^-(\kappa) = \{z \mid -\kappa < \mathrm{Im}\,z < 0\}$. The function $I^-(\kappa) \times \mathbb{C} \ni (\theta, \lambda) \mapsto L_\lambda(\theta)$, with values in the closed operators on $\mathcal{H}$, is an analytic family of type A in each variable separately. Note that the spectrum of $L_0(\theta)$ consists of two simple eigenvalues $\pm 2$, a doubly degenerate eigenvalue $0$, and of the sequence of lines $\{in\,\mathrm{Im}\,\theta + \mathbb{R} \mid n \ge 1\}$. Let $\Lambda$ be such that $\Lambda C < (\kappa - |\mu|)/4$. Then, for $|\lambda| < \Lambda$ and $-\kappa < \mathrm{Im}\,\theta < -(\kappa + |\mu|)/2$, the essential spectrum of $L_\lambda(\theta)$ is contained in the half-plane $\{z \mid \mathrm{Im}\,z < \mu\}$. The location of the discrete spectrum inside $P(\mu)$ can be computed using regular perturbation
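theory. By possibly taking $\Lambda$ smaller, one can show that this discrete spectrum consists of four points (resonances) $\{e_{\pm 2}(\lambda), e_0^{1,2}(\lambda)\}$, where $e_{\pm 2}(\lambda)$ are near $\pm 2$ and $e_0^{1,2}(\lambda)$ are near $0$, see Fig. 1. These resonances do not depend on $\theta$. Moreover, the functions $\lambda \mapsto e_{\pm 2}(\lambda)$ are analytic for $|\lambda| < \Lambda$,

$$e_{\pm 2}(\lambda) = \pm 2 + \sum_{j=1}^{\infty} \lambda^{2j} a_{2j}^\pm,$$

and one can compute $a_2^\pm$ explicitly:

$$a_2^\pm = \sum_i \left( -i\pi\, \|\tilde\alpha_i(2)\|_G^2 \pm \frac{1}{2}\, \mathrm{PV}\!\int_{\mathbb{R}} \frac{\|\tilde\alpha_i(s)\|_G^2}{s - 2}\, ds \right),$$

where PV stands for Cauchy's principal value.

[Fig. 1. Resonances of the standard Liouvillean $L_\lambda$. Labels recoverable from the figure: $-2$, $0$, $2$ on the real axis; $-\mathrm{Im}(\theta)$; $\min \pi/\beta_i$; $O(\lambda)$.]

The resonances $e_0^{1,2}(\lambda)$ are the eigenvalues of a $2 \times 2$ matrix $\Sigma(\lambda)$ which is analytic for $|\lambda| < \Lambda$,

$$\Sigma(\lambda) = \sum_{j=1}^{\infty} \lambda^{2j} \Sigma_{2j},$$

and one can compute $\Sigma_2$ explicitly:

$$\Sigma_2 = -i\pi \sum_i \|\tilde\alpha_i(2)\|_G^2\, T_i,$$

where

$$T_i = \frac{1}{2\cosh\beta_i} \begin{pmatrix} e^{\beta_i} & -1 \\ -1 & e^{-\beta_i} \end{pmatrix}.$$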
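As a quick sanity check on the matrices $T_i$, the following sketch verifies numerically that $T_i$ is a rank-one orthogonal projection (eigenvalues $0$ and $1$) whose kernel is spanned by $(e^{-\beta_i/2}, e^{\beta_i/2})$, and that a sum $T_1 + T_2$ is strictly positive only when the two inverse temperatures differ. The helper names are hypothetical and the weights $\|\tilde\alpha_i(2)\|_G^2$ are set to 1 purely for illustration.

```python
import math


def t_matrix(beta):
    """T = (2 cosh beta)^(-1) [[e^beta, -1], [-1, e^(-beta)]]."""
    norm = 2.0 * math.cosh(beta)
    return [[math.exp(beta) / norm, -1.0 / norm],
            [-1.0 / norm, math.exp(-beta) / norm]]


def apply(mat, vec):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1]]


def kernel_vector(beta):
    """Candidate kernel vector (e^(-beta/2), e^(beta/2))."""
    return [math.exp(-beta / 2.0), math.exp(beta / 2.0)]


def eigenvalues_sym2(mat):
    """Eigenvalues (small, large) of a real symmetric 2x2 matrix."""
    half_trace = (mat[0][0] + mat[1][1]) / 2.0
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    disc = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return half_trace - disc, half_trace + disc


def weighted_sum(betas):
    """Sum of T matrices with unit weights (an illustrative assumption)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for beta in betas:
        t = t_matrix(beta)
        for r in range(2):
            for c in range(2):
                total[r][c] += t[r][c]
    return total


if __name__ == "__main__":
    print(eigenvalues_sym2(t_matrix(1.0)))
    print(eigenvalues_sym2(weighted_sum([0.5, 0.5])))
    print(eigenvalues_sym2(weighted_sum([0.5, 2.0])))
```

The smallest eigenvalue of the sum vanishes for equal temperatures because both kernels coincide, and becomes positive as soon as the kernel vectors point in different directions.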
If (A3) holds, then $\mathrm{Im}\,a_2^\pm < 0$ and for $\lambda$ non-zero and sufficiently small, $\mathrm{Im}\,e_{\pm 2}(\lambda) < 0$. Notice also that the matrices $T_i$ are self-adjoint and non-negative with a simple eigenvalue $0$ and corresponding eigenvector

$$\psi_{\beta_i} = \begin{pmatrix} e^{-\beta_i/2} \\ e^{\beta_i/2} \end{pmatrix}.$$

Thus, unless $\beta_1 = \beta_2$, $i\Sigma_2 > 0$, and for $\lambda$ non-zero and sufficiently small, $\mathrm{Im}\,e_0^{1,2}(\lambda) < 0$.

To finish the proof, we have to relate $D(z - L_\lambda)^{-1}D$ and $(z - L_\lambda(\theta))^{-1}$. To do so, we fix $z$ with $\mathrm{Im}\,z$ large enough. Then, one shows that

$$\operatorname*{s-lim}_{\mathrm{Im}\,\theta \uparrow 0}\, (z - L_\lambda(\theta))^{-1} = (z - L_\lambda(\mathrm{Re}\,\theta))^{-1}. \tag{4.2}$$

Let $D(\theta) = 1 \otimes e^{-\kappa\langle P\rangle + \theta P} \otimes e^{-\kappa\langle P\rangle + \theta P}$, and consider the function $I^-(\kappa) \ni \theta \mapsto D(\theta)(z - L_\lambda(\theta))^{-1}D(-\theta)$. By analyticity, this function is constant in $\theta$. By (4.2) and continuity, the relation

$$D(z - L_\lambda)^{-1}D = D(\theta)(z - L_\lambda(\theta))^{-1}D(-\theta) \tag{4.3}$$

holds for $-\kappa < \mathrm{Im}\,\theta \le 0$. If $\theta$ in (4.3) satisfies $-\kappa < \mathrm{Im}\,\theta < -(\kappa + |\mu|)/2$, then the right-hand side in (4.3) provides the desired meromorphic continuation of the function $D(z - L_\lambda)^{-1}D$. Since $\mathrm{Ran}\,D$ is dense in $\mathcal{H}$ and $D(z - L_\lambda)^{-1}D$ has no poles on the real axis, the spectrum of $L_\lambda$ is purely absolutely continuous for all sufficiently small non-zero $\lambda$. In particular, $\mathrm{Ker}\,L_\lambda = \{0\}$, and, by Proposition 2.1, there are no $\tau_\lambda$-invariant states in the set $\mathcal{N}$ of normal states. $\Box$

In the proof of Theorem 4.1 we have not used the full strength of Assumption (A2) and for this theorem it suffices that $\tilde\alpha_i \in H^2(\kappa)$. In fact, if the complex deformation technique is replaced with Mourre theory, then the main conclusion of the theorem can be derived under a much weaker regularity condition on $\tilde\alpha_i$, see [DJ1, DJ2].

We now deal with the C-Liouvillean and Theorem 1.2. As we have remarked at the end of the last section, it is convenient to take for the initial state of the small system the state $\omega_s$ defined by (3.2). In what follows $L_\lambda$ is the C-Liouvillean associated to $\omega = \omega_s \otimes \omega_{\beta_1} \otimes \omega_{\beta_2}$. Let $R_D(z) = D(z - L_\lambda^*)^{-1}D$.
Theorem 4.2. For any $\mu > -\kappa$ there is a constant $\Lambda > 0$ such that for $|\lambda| < \Lambda$ the operator-valued function $R_D(z)$, originally defined for $z \in P(a)$, has a meromorphic continuation to the half-plane $P(\mu)$. The function $R_D(z)$ has at most four poles in $P(\mu)$, and zero is one of its poles. Let $P_\lambda^+$ be the residue of $R_D(z)$ at $0$. If in addition (A3) holds and $\lambda \neq 0$, then $\dim \mathrm{Ran}\,P_\lambda^+ = 1$ and all singularities of $R_D(z)$ except zero are contained in the half-plane $\mathrm{Im}\,z < 0$. Moreover, $P_\lambda^+$ is an analytic function of $\lambda$ for $|\lambda| < \Lambda$.

The proof of this theorem is a slight elaboration of the arguments in [JP1, JP2] which we have already sketched above. We give below an outline of the proof.
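4.2. Sketch of the proof of Theorem 4.2. We use the notation introduced in the proof of Theorem 4.1. For real $\theta$ let

$$L_\lambda^*(\theta) \equiv U(\theta) L_\lambda^* U(-\theta), \qquad \tilde V_{\mathrm{tot}}(\theta) \equiv U(\theta) V U(-\theta) - U(\theta) J V_{i/2} J\, U(-\theta).$$

Clearly, $L_\lambda^*(\theta) = L + \theta N + \lambda \tilde V_{\mathrm{tot}}(\theta)$. Assumption (A2) implies that $I(\kappa) \ni \theta \mapsto \tilde V_{\mathrm{tot}}(\theta)$ is an analytic operator-valued function satisfying

$$\tilde C := \sup_{\theta \in I(\kappa)} \|\tilde V_{\mathrm{tot}}(\theta)\| \le \frac{1}{\sqrt{2}} \sum_i C(\kappa, \beta_i) \Big( 3\|\tilde\alpha_i\|_{H^2(\kappa)} + \|e^{-\beta_i s/2} \tilde\alpha_i\|_{H^2(\kappa)} \Big). \tag{4.4}$$

The function $\mathbb{C} \times I^-(\kappa) \ni (\lambda, \theta) \mapsto L_\lambda^*(\theta)$, with values in the closed operators on $\mathcal{H}$, is an analytic family of type A in each variable separately. One now repeats the analysis outlined in the proof of Theorem 4.1. For $\Lambda\tilde C < (\kappa - |\mu|)/4$ and $|\lambda| < \Lambda$ the essential spectrum of $L_\lambda^*(\theta)$ is contained in the half-plane $\{z \mid \mathrm{Im}\,z < \mu\}$. Here, again, the location of the discrete spectrum inside $P(\mu)$ can be computed using regular perturbation theory. This discrete spectrum consists of four points $\{\tilde e_{\pm 2}(\lambda), \tilde e_0^{1,2}(\lambda)\}$, where $\tilde e_{\pm 2}(\lambda)$ are near $\pm 2$ and $\tilde e_0^{1,2}(\lambda)$ are near $0$, see Fig. 2. Since $(L_\lambda^*(\theta))^*\Omega = 0$, we have $\tilde e_0^1(\lambda) = 0$. Moreover, the functions $\lambda \mapsto \tilde e_{\pm 2}(\lambda)$ are analytic for $|\lambda| < \Lambda$,

$$\tilde e_{\pm 2}(\lambda) = \pm 2 + \sum_{j=1}^{\infty} \lambda^{2j} \tilde a_{2j}^\pm,$$

and one finds that $\tilde a_2^\pm = a_2^\pm$.

[Fig. 2. Resonances of $L_\lambda^*$. Labels recoverable from the figure: $-2$, $0$, $2$ on the real axis.]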
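The resonances $\tilde e_0^{1,2}(\lambda)$ are the eigenvalues of a $2 \times 2$ matrix $\tilde\Sigma(\lambda)$ which is analytic for $|\lambda| < \Lambda$,

$$\tilde\Sigma(\lambda) = \sum_{j=1}^{\infty} \lambda^{2j} \tilde\Sigma_{2j},$$

and

$$\tilde\Sigma_2 = -i\pi \sum_i \|\tilde\alpha_i(2)\|_G^2\, \tilde T_i,$$

where

$$\tilde T_i = e^{-\beta_i \sigma_z/2}\, T_i\, e^{\beta_i \sigma_z/2} = \frac{1}{2\cosh\beta_i} \begin{pmatrix} e^{\beta_i} & -e^{-\beta_i} \\ -e^{\beta_i} & e^{-\beta_i} \end{pmatrix}.$$

Notice that

$$\tilde T_i^* \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0,$$

and so zero is always an eigenvalue of $\tilde\Sigma_2$. The second eigenvalue of $\tilde\Sigma_2$ is equal to

$$\mathrm{Tr}(\tilde\Sigma_2) = -i\pi \sum_i \|\tilde\alpha_i(2)\|_G^2.$$

If (A3) holds, then this eigenvalue has negative imaginary part. Thus, for $\lambda$ non-zero and sufficiently small, $\mathrm{Im}\,\tilde e_0^2(\lambda) < 0$.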
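The three algebraic facts used above for $\tilde T_i$ (the explicit form of the conjugated matrix, the vanishing of $\tilde T_i^*(1,1)^T$, and $\mathrm{Tr}\,\tilde T_i = 1$) can be confirmed with a few lines of arithmetic. The sketch below is only an illustration; the function names are hypothetical, and $\sigma_z$ is taken to be $\mathrm{diag}(1, -1)$, which is an assumption about the convention in use.

```python
import math


def t_matrix(beta):
    """T = (2 cosh beta)^(-1) [[e^beta, -1], [-1, e^(-beta)]]."""
    norm = 2.0 * math.cosh(beta)
    return [[math.exp(beta) / norm, -1.0 / norm],
            [-1.0 / norm, math.exp(-beta) / norm]]


def t_tilde(beta):
    """Explicit form of the conjugated matrix as displayed in the text."""
    norm = 2.0 * math.cosh(beta)
    return [[math.exp(beta) / norm, -math.exp(-beta) / norm],
            [-math.exp(beta) / norm, math.exp(-beta) / norm]]


def conjugate_by_sigma_z(beta):
    """Compute exp(-beta sigma_z / 2) T exp(beta sigma_z / 2), sigma_z = diag(1, -1)."""
    left = [math.exp(-beta / 2.0), math.exp(beta / 2.0)]   # diagonal of the left factor
    right = [math.exp(beta / 2.0), math.exp(-beta / 2.0)]  # diagonal of the right factor
    t = t_matrix(beta)
    return [[left[r] * t[r][c] * right[c] for c in range(2)] for r in range(2)]


def column_sums(mat):
    """Entries of mat^T (1, 1)^T; for a real matrix this is mat^* (1, 1)^T."""
    return [mat[0][0] + mat[1][0], mat[0][1] + mat[1][1]]


def trace(mat):
    return mat[0][0] + mat[1][1]


if __name__ == "__main__":
    beta = 0.8
    print(conjugate_by_sigma_z(beta))
    print(t_tilde(beta))
    print(column_sums(t_tilde(beta)), trace(t_tilde(beta)))
```

Since each $\tilde T_i$ has trace 1 and a zero eigenvalue with the same left eigenvector $(1, 1)$, any positive combination of them again has eigenvalue 0, and its remaining eigenvalue is the sum of the weights.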
Following the argument in the proof of Theorem 4.1, we see that

$$R_D(z) = D(\theta)(z - L_\lambda^*(\theta))^{-1}D(-\theta)$$

provides the required meromorphic continuation of $R_D$. By this formula, the residue $P_\lambda^+$ is related to the spectral projection $Q_\lambda^{(1)}(\theta)$ corresponding to the zero eigenvalue of $L_\lambda^*(\theta)$ by

$$P_\lambda^+ = D(\theta)\, Q_\lambda^{(1)}(\theta)\, D(-\theta). \tag{4.5}$$

This implies that $\dim \mathrm{Ran}\,P_\lambda^+ = 1$.

To prove the last statement of the theorem we must show that $Q_\lambda^{(1)}(\theta)$ is analytic for $|\lambda| < \Lambda$. We prove this by relating this operator to the spectral projection $\tilde L(\lambda)$ corresponding to the zero eigenvalue of the analytic matrix $A(\lambda) = \lambda^{-2}\tilde\Sigma(\lambda)$. Notice that since $0$ is a simple eigenvalue of $A(0)$, $\tilde L(\lambda)$ is analytic for $\lambda$ small enough.

Let us recall the construction of the operator $\tilde\Sigma(\lambda)$ [JP1, HP]. By taking $\Lambda$ possibly smaller, one can find a contour $\gamma$ around $0$ such that for $\theta$ with $\mathrm{Im}\,\theta$ sufficiently close to $-\kappa$ and for $|\lambda| < \Lambda$, the spectral projection corresponding to the group $\{\tilde e_0^1(\lambda), \tilde e_0^2(\lambda)\}$ is given by

$$Q_\lambda(\theta) = \frac{1}{2\pi i} \oint_\gamma (z - L_\lambda^*(\theta))^{-1}\, dz. \tag{4.6}$$

$Q_\lambda(\theta)$ is an analytic function of $\lambda$ and $\|Q_\lambda(\theta) - Q_0(\theta)\| < 1$. Notice that $Q_0(\theta) = Q_0$ does not depend on $\theta$ and is the spectral projection of $L$ corresponding to the doubly degenerate eigenvalue $0$. It follows that the maps

$$Q_0 : \mathrm{Ran}\,Q_\lambda(\theta) \to \mathrm{Ran}\,Q_0, \qquad Q_\lambda(\theta) : \mathrm{Ran}\,Q_0 \to \mathrm{Ran}\,Q_\lambda(\theta),$$

are isomorphisms. Setting $T(\lambda) \equiv Q_0 Q_\lambda(\theta) Q_0$, one easily checks that the operator $S_\lambda(\theta) = Q_0 Q_\lambda(\theta) : \mathrm{Ran}\,Q_\lambda(\theta) \to \mathrm{Ran}\,Q_0$ has inverse $S_\lambda(\theta)^{-1} = Q_\lambda(\theta) Q_0 T(\lambda)^{-1}$. Using the isomorphism $S_\lambda(\theta)$, we transport the reduced operator $Q_\lambda(\theta) L_\lambda^*(\theta) Q_\lambda(\theta)$ to $\mathrm{Ran}\,Q_0 = \mathbb{C}^2$. A simple calculation yields:

$$\tilde\Sigma(\lambda) \equiv S_\lambda(\theta)\, Q_\lambda(\theta) L_\lambda^*(\theta) Q_\lambda(\theta)\, S_\lambda(\theta)^{-1} = M(\lambda) T(\lambda)^{-1}, \tag{4.7}$$
where $M(\lambda) \equiv Q_0 Q_\lambda(\theta) L_\lambda^*(\theta) Q_\lambda(\theta) Q_0$. The operators $T(\lambda)$ and $M(\lambda)$ are independent of $\theta$ as long as $|\lambda| < \Lambda$ and $\mathrm{Im}\,\theta$ is sufficiently close to $-\kappa$. Moreover, they are analytic functions of $\lambda$. Formula (4.7) yields that

$$\tilde L(\lambda) = S_\lambda(\theta)\, Q_\lambda^{(1)}(\theta)\, S_\lambda(\theta)^{-1}. \tag{4.8}$$

Inverting this formula we derive that $Q_\lambda^{(1)}(\theta)$ (and hence $P_\lambda^+$) is an analytic function for $\lambda$ small enough. $\Box$

Theorem 4.3. Assume that (A3) holds. Then there is $\Lambda > 0$ such that for $0 < |\lambda| < \Lambda$ all the assumptions of Theorem 2.1 hold.

Proof. Choose $0 > \mu > -\kappa$ and $\Lambda$ so that Theorem 4.2 holds. This theorem verifies assumptions (DL6), (b) and (c) of Theorem 2.1. To verify (d) it suffices to show that for some $r > 0$ large enough, all $\Psi \in \mathcal{H}$ and $j = 0, 1$,

$$\sup_{y > \mu} \int_{|x| > r} \left|\partial_x^j (\Psi, R_D(x + iy)\Psi)\right|^{2-j} dx < \infty.$$

Since $R_D(z) = D(\theta)(z - L_\lambda^*(\theta))^{-1}D(-\theta)$, it suffices to show that for $\mathrm{Im}\,\theta$ close enough to $-\kappa$, $\lambda$ small enough, all $\Psi \in \mathcal{H}$ and $j = 0, 1$,

$$\sup_{y > \mu} \int_{|x| > r} \left|(\Psi, (x + iy - L_\lambda^*(\theta))^{-1-j}\Psi)\right|^{2-j} dx < \infty. \tag{4.9}$$

Note that $L_0^*(\theta) = L + \theta N$ is a normal operator, and that the bounds

$$\sup_{y > \mu} \int_{|x| > r} \|(x + iy - L_0^*(\theta))^{-1}\Psi\|^2\, dx < \infty, \qquad \sup_{y > \mu,\, |x| > r} \|(x + iy - L_0^*(\theta))^{-1}\| < \infty, \tag{4.10}$$

follow from the spectral theorem. The second relation in (4.10) and the resolvent identity yield that for $\lambda$ small enough,

$$(x + iy - L_\lambda^*(\theta))^{-1} = G\,(x + iy - L_0^*(\theta))^{-1} = (x + iy - L_0^*(\theta))^{-1}\,\tilde G, \tag{4.11}$$

where the operators $G$ and $\tilde G$ (which depend on $\theta, \lambda, x, y$) have uniformly bounded norms. The first relations in (4.10) and (4.11) yield (4.9) for $j = 0$. The case $j = 1$ follows from the estimate

$$\left|(\Psi, (x + iy - L_\lambda^*(\theta))^{-2}\Psi)\right| \le \|G\|\,\|\tilde G\|\, \|(x + iy - L_0^*(\theta))^{-1}\Psi\|^2.$$
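It remains to verify (DL2) and (DL4). Let

$$\mathfrak{h}_{\mathrm{test}} = \left\{ f \in \mathfrak{h} \mid \tilde f \in D(e^{\kappa\langle p\rangle}) \right\}, \tag{4.12}$$

and let $\mathcal{O}_{f,\mathrm{test}}$ be the vector subspace of $\mathcal{O}_f$ generated by $1$ and

$$\left\{ a^\#(f_1) \cdots a^\#(f_n) \mid n \in \mathbb{N},\ f_i \in \mathfrak{h}_{\mathrm{test}} \right\}.$$

Set

$$\mathcal{O}_{\mathrm{test}} = \mathcal{O}_s \otimes \mathcal{O}_{f,\mathrm{test}}^{(1)} \otimes \mathcal{O}_{f,\mathrm{test}}^{(2)}.$$

Note that $\mathcal{O}_{\mathrm{test}}$ is a $*$-subalgebra of $\mathcal{O}$. Obviously, $\mathcal{O}_{\mathrm{test}} \subset \mathcal{O}_D$. Since the set $\mathfrak{h}_{\mathrm{test}}$ is dense in $\mathfrak{h}$, $\mathcal{O}_{\mathrm{test}}^{\mathrm{cl}} = \mathcal{O}$ and (DL2) follows.

To establish (DL4), note that $J\pi(\mathcal{O}_{\mathrm{test}})J \subset \mathfrak{M}_D'$. Since $\mathcal{O}_{\mathrm{test}}^{\mathrm{cl}} = \mathcal{O}$ and $\pi(\mathcal{O}_{\mathrm{test}})'' = \mathfrak{M}$, $\pi(\mathcal{O}_{\mathrm{test}})\Omega$ is dense in $\mathcal{H}$. Thus, $\mathfrak{M}_D'\Omega$ is also dense in $\mathcal{H}$.

Following the above argument one can also easily verify Hypothesis (DL3) in our model. We will not make use of this hypothesis below. $\Box$

We are now ready to finish:

Proof of Theorem 1.2. Parts (i) and (ii) follow from Theorems 2.1 and 4.3 with $\mathcal{N}_0 = \mathcal{N}_D$ and $\mathcal{O}_0 = \mathcal{O}_D$. From the construction of $\mathcal{N}_D$ and $\mathcal{O}_D$ it is immediate that $\mathcal{N}_s \subset \mathcal{N}_D$ and $\mathcal{O}_s \subset \mathcal{O}_D$. Since for $A \in \mathcal{O}_D$,

$$\omega_\lambda^+(A) = (D^{-1}A^*\Omega, \Omega_\lambda^+) = (D^{-1}A^*\Omega, P_\lambda^+\Omega),$$

Part (iii) follows from the last statement of Theorem 4.2. $\Box$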
As we have pointed out in Remark 3 after Theorem 1.2, Part (iii) of Theorem 1.2 yields that for $A \in \mathcal{O}_D$ we have an expansion

$$\omega_\lambda^+(A) = \sum_{k=0}^{\infty} \lambda^k \omega_k^+(A). \tag{4.13}$$

It is an important question whether the functionals $\omega_k^+$ can be (at least in principle) computed. If

$$\Omega_\lambda^+ = \sum_{k=0}^{\infty} \lambda^k \Omega_k^+,$$

then $\omega_k^+(A) = (D^{-1}A^*\Omega, \Omega_k^+)$, so $\omega_k^+$ is determined by $\Omega_k^+$ ((DL3) implies that the opposite is also true). To compute the expansion of $\Omega_\lambda^+$, one uses that $P_\lambda^+\Omega = \Omega_\lambda^+$ and the identity (4.5). First, using (4.6), one expands $Q_\lambda(\theta)$ in powers of $\lambda$. Using this result, one expands $T(\lambda)$, $S_\lambda(\theta)$, $M(\lambda)$ and $\tilde\Sigma(\lambda)$. The expansion of $\tilde\Sigma(\lambda)$ and regular perturbation theory yield the expansion of $\tilde L(\lambda)$. The formulas (4.8) and (4.5) then yield the expansions of $Q_\lambda^{(1)}(\theta)$ and $P_\lambda^+$. Although the resulting formulas are clearly complicated, at least in principle it is possible to compute any term in the expansion (4.13). In particular, the first term $\omega_0^+$ is determined by the vector

$$\Omega_0^+ = P_0^+\Omega = (\tilde L(0)\Omega_s) \otimes \Omega_f^{(1)} \otimes \Omega_f^{(2)}.$$
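5. Entropy Production

Proof of Theorem 1.1. We assume that the reader is familiar with basic properties of relative entropy (a particularly clear review is given in [Don]). Let $\mathfrak{M} = \pi_\omega(\mathcal{O})''$ and let $\mathfrak{M}_*$ be the predual of $\mathfrak{M}$. Assume that (a) and (b) hold, and that $\mathrm{Ep}(\omega_V^+) = \omega_V^+(\sigma_V) = 0$. Then, by the formula (1.2) and (b),

$$\mathrm{Ent}(\omega \circ \tau_V^t \mid \omega) = -\int_0^t \big( \omega(\tau_V^s(\sigma_V)) - \omega_V^+(\sigma_V) \big)\, ds \ge -C,$$

for all $t > 0$ and some $C > 0$. Set

$$\omega_T \equiv \frac{1}{T} \int_0^T \omega \circ \tau_V^t\, dt.$$

The convexity and the upper semicontinuity of the relative entropy yield that

$$\mathrm{Ent}(\omega_T \mid \omega) \ge \frac{1}{T} \int_0^T \mathrm{Ent}(\omega \circ \tau_V^t \mid \omega)\, dt \ge -C.$$

Since the set of all states $\eta \in \mathcal{N}_\omega$ such that $\mathrm{Ent}(\eta \mid \omega) \ge -C$ is $\sigma(\mathfrak{M}_*, \mathfrak{M})$-compact, the set of weak-* limit points of $\{\omega_T \mid T > 0\}$ is contained in $\mathcal{N}_\omega$. It follows that $\omega_V^+ \in \mathcal{N}_\omega$, and this contradicts (a). $\Box$

Proof of Theorem 1.4. Theorem 1.3 yields that assumption (a) of Theorem 1.1 holds. Let us verify (b) for the initial state $\omega = \omega_s \otimes \omega_{\beta_1} \otimes \omega_{\beta_2}$, where $\omega_s$ is given by (3.2). By Takesaki's theorem [BR1],

$$\delta_\omega = -\beta_1\delta_1 - \beta_2\delta_2, \tag{5.1}$$

and

$$\delta_\omega(V) = -\beta_1\, \sigma_x \otimes \varphi(is\alpha_1) \otimes 1 - \beta_2\, \sigma_x \otimes 1 \otimes \varphi(is\alpha_2).$$

Since $is\alpha_i \in \mathfrak{h}_{\mathrm{test}}$ ($\mathfrak{h}_{\mathrm{test}}$ is given by (4.12)), $\delta_\omega(V) \in \mathcal{O}_0$. Hence, by Part (ii) of Theorem 1.2, assumption (b) of Theorem 1.1 holds, and $\mathrm{Ep}(\omega_\lambda^+) > 0$.

It remains to show that the entropy production does not depend on the choice of the initial state in $\mathcal{N}_s$. Let $\eta = \eta_s \otimes \omega_{\beta_1} \otimes \omega_{\beta_2} \in \mathcal{N}_s$. Then, by Theorem 1.1 in [JP3],

$$\mathrm{Ent}(\omega \circ \tau_V^t \mid \eta) = \mathrm{Ent}(\omega \mid \eta) - \int_0^t \eta(\tau_V^s(\delta_\eta(\lambda V)))\, ds.$$

By the proof of Proposition 1.3 in [JP3], $\mathrm{Ent}(\omega \circ \tau_V^t \mid \omega) = \mathrm{Ent}(\omega \circ \tau_V^t \mid \eta) + O(1)$, uniformly for $t > 0$. This implies that

$$\omega_\lambda^+(\delta_\omega(\lambda V)) = \omega_\lambda^+(\delta_\eta(\lambda V)). \tag{5.2}$$

$\Box$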
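Relation (5.2) has one important consequence. Let $\omega$ and $\eta$ be as in the above proof and $\eta_s(A) = \mathrm{Tr}(Ae^{H_s})/\mathrm{Tr}(e^{H_s})$. Then, $\delta_\eta(\,\cdot\,) = i[H_s, \,\cdot\,] + \delta_\omega(\,\cdot\,)$, and (5.2) yields that

$$\omega_\lambda^+([H_s, V]) = 0. \tag{5.3}$$

Proof of Theorem 1.5. The second relation in (1.11) follows from the definition of entropy production and Relation (5.1). To prove the first, note that

$$\delta(\,\cdot\,) = i[H_s, \,\cdot\,] + \delta_1(\,\cdot\,) + \delta_2(\,\cdot\,), \qquad \delta_\lambda(\,\cdot\,) = \delta(\,\cdot\,) + i\lambda[V, \,\cdot\,],$$

are the generators of the free and the perturbed dynamics. Since $\omega_\lambda^+$ is $\tau_\lambda$-invariant and $V \in D(\delta) = D(\delta_\lambda)$,

$$0 = \omega_\lambda^+(\delta_\lambda(\lambda V)) = \omega_\lambda^+(\delta(\lambda V)) = i\lambda\,\omega_\lambda^+([H_s, V]) + \omega_\lambda^+(\Phi_1) + \omega_\lambda^+(\Phi_2) = \omega_\lambda^+(\Phi_1) + \omega_\lambda^+(\Phi_2),$$

where we used (5.3). $\Box$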
Acknowledgements. We are grateful to Jan Dereziński for many discussions on the subject of this paper, for remarks on the manuscript, and for pointing out to us an argument which led to the proof of Theorem 1.3. The research of the first author was partly supported by NSERC. Part of this work was performed during the visit of the first author to the University of Toulon and during the visit of the second author to the University of Ottawa, which was supported by NSERC. The main part of this work was done during the visit of the first author to Johns Hopkins University. V.J. is grateful to Steve Zelditch for his friendship and to the Mathematics Department of Johns Hopkins University for generous support.
References

[Ar1] Araki, H.: Relative entropy of states of von Neumann algebras. Pub. R.I.M.S., Kyoto Univ. 11, 809 (1976)
[Ar2] Araki, H.: Relative entropy of states of von Neumann algebras II. Pub. R.I.M.S., Kyoto Univ. 13, 173 (1977)
[AW] Araki, H., Wyss, W.: Representations of canonical anticommutation relations. Helv. Phys. Acta 37, 136 (1964)
[BSZ] Baez, J.C., Segal, I.E., Zhou, Z.: Introduction to Algebraic and Constructive Quantum Field Theory. Princeton, NJ: Princeton University Press, 1991
[BFS] Bach, V., Fröhlich, J., Sigal, I.: Return to equilibrium. J. Math. Phys. 41, 3985 (2000)
[BR1] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 1. Berlin: Springer-Verlag, Second edition, 1987
[BR2] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin: Springer-Verlag, Second edition, 1996
[Da] Davies, E.B.: Markovian master equations. Commun. Math. Phys. 39, 91 (1974)
[Don] Donald, M.J.: Relative Hamiltonians which are not bounded from above. J. Func. Anal. 91, 143 (1990)
[DG1] Dereziński, J., Gérard, C.: Asymptotic completeness in quantum field theory. Massive Pauli-Fierz Hamiltonians. Rev. Math. Phys. 11, 383 (2000)
[DG2] Dereziński, J., Gérard, C.: Spectral and scattering theory of spatially cut-off $P(\varphi)_2$ Hamiltonians. Commun. Math. Phys. 213, 39 (2000)
[DJ1] Dereziński, J., Jakšić, V.: Spectral theory of Pauli-Fierz operators. J. Func. Anal. 180, 243 (2001)
[DJ2] Dereziński, J., Jakšić, V.: Return to equilibrium for Pauli-Fierz systems. Submitted
[DJP] Dereziński, J., Jakšić, V., Pillet, C.-A.: Perturbation theory of KMS-states. Preprint
[Fr] Frigerio, A.: Quantum dynamical semigroups and approach to equilibrium. Lett. Math. Phys. 2, 79 (1977)
[FGS] Fröhlich, J., Griesmer, M., Schlein, B.: Asymptotic completeness for Rayleigh scattering. Preprint
[GFV] Gorini, V., Frigerio, A., Verri, M., Kossakowski, A., Sudarshan, E.C.G.: Properties of quantum Markovian master equations. Rep. Math. Phys. 13, 149 (1978)
[Ha] Haag, R.: Local Quantum Physics. Berlin: Springer-Verlag, 1993
[Hak] Haake, F.: Statistical treatment of open systems by generalized master equations. Springer Tracts in Modern Physics 66. Berlin: Springer-Verlag, 1973
[HP] Hunziker, W., Pillet, C.-A.: Degenerate asymptotic perturbation theory. Commun. Math. Phys. 90, 219 (1983)
[JP1] Jakšić, V., Pillet, C.-A.: On a model for quantum friction II. Fermi's golden rule and dynamics at positive temperature. Commun. Math. Phys. 176, 619 (1996)
[JP2] Jakšić, V., Pillet, C.-A.: On a model for quantum friction III. Ergodic properties of the spin-boson system. Commun. Math. Phys. 178, 627 (1996)
[JP3] Jakšić, V., Pillet, C.-A.: On entropy production in quantum statistical mechanics. Commun. Math. Phys. 217, 285 (2001)
[JP4] Jakšić, V., Pillet, C.-A.: In preparation
[LS] Lebowitz, J., Spohn, H.: Irreversible thermodynamics for quantum systems weakly coupled to thermal reservoirs. Adv. Chem. Phys. 38, 109, New York: John Wiley and Sons, 1978
[O1] Ojima, I.: Entropy production and non-equilibrium stationarity in quantum dynamical systems: physical meaning of van Hove limit. J. Stat. Phys. 56, 203 (1989)
[O2] Ojima, I.: Entropy production and non-equilibrium stationarity in quantum dynamical systems. In: Proceedings of international workshop on quantum aspects of optical communications. Lecture Notes in Physics 378, 164. Berlin: Springer-Verlag, 1991
[OHI] Ojima, I., Hasegawa, H., Ichiyanagi, M.: Entropy production and its positivity in nonlinear response theory of quantum dynamical systems. J. Stat. Phys. 50, 633 (1988)
[OP] Ohya, M., Petz, D.: Quantum Entropy and its Use. Berlin: Springer-Verlag, 1993
[M] Merkli, M.: Positive commutators in non-equilibrium quantum statistical mechanics. Commun. Math. Phys. 223, 327 (2001)
[Si] Simon, B.: Functional Integration and Quantum Physics. New York: Academic Press, 1979
[Ru1] Ruelle, D.: Natural nonequilibrium states in quantum statistical mechanics. J. Stat. Phys. 98, 57 (2000)
[Ru2] Ruelle, D.: Entropy production in quantum spin systems. Preprint
[Rud] Rudin, W.: Real and Complex Analysis. New York: McGraw Hill, Inc, 1974
[Sp1] Spohn, H.: Entropy production for quantum dynamical semigroups. J. Math. Phys. 19, 227 (1978)
[Sp2] Spohn, H.: An algebraic condition for the approach to equilibrium of an open N-level system. Lett. Math. Phys. 2, 33 (1977)
Communicated by H. Spohn
Commun. Math. Phys. 226, 163 – 181 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Dynamical Triangulation Models with Matter: High Temperature Region
V. A. Malyshev
I.N.R.I.A., Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France. E-mail: [email protected]
Received: 19 June 2001 / Accepted: 12 October 2001
Abstract: We consider a canonical ensemble with a fixed number $N$ of triangles for planar dynamical triangulation models with compact spin in the high temperature region. We find the asymptotics of the partition function $Z(N)$ and reveal the analytic properties of the generating function $U(x) = \sum_N Z(N)x^N$. New cluster expansion techniques are developed for this case. For a fixed triangulation this would be quite standard, but for random triangulations one has to deal with the non-zero entropy of the space between clusters. It is a multiscale expansion, where the role of scale is played by a topological parameter: the maximal length of chains of imbedded not simply connected clusters.
1. Definitions and Main Results

1.1. Introduction. We consider a model, related to quantum gravity, called the planar dynamical triangulation model with matter fields in the high temperature region (see the exact definitions below). Planar models without matter have an extensive history and are sufficiently well understood, both on physical and mathematical levels. In the physical literature there is a powerful random matrix method; in mathematics, earlier combinatorial results by Tutte (see [4, 7]) solve the problem without spin. Random matrix methods in physics give some information about the Ising model on dynamical triangulations, see reviews [2, 1]. On the contrary, there are almost no rigorous mathematical results for models with matter fields.

We develop new cluster expansion techniques. For a fixed triangulation it would be quite a standard exercise. But for dynamical triangulations the space outside clusters has non-zero entropy. It is related to diffeomorphism invariance for original continuous models. Note that for the lattice case we would have only a translation group conserving the form and the distances between clusters. Our cluster expansion is a kind of multiscale expansion where the induction parameter $n = 1, 2, \ldots$ has a topological nature. Each $n$ corresponds to the summation over all configurations with clusters of level not greater than $n$. The level of a cluster is defined by induction: a cluster has level $n$ if inside it all clusters have level less than $n$.

One can roughly describe the situation as follows. In the absence of spin the asymptotics of the partition function is defined by an algebraic singularity of the generating function at some point $x_+$ on the positive half-axis. If the perturbation is imposed, an infinite number of new algebraic singularities $x_n$, $n = 1, 2, \ldots$, positive numbers close to $x_+$, appear. They have an accumulation point $x_{\mathrm{acc}} = \lim x_n$. If there exists $n$ such that $x_n < x_{\mathrm{acc}}$, then the asymptotics is canonical with the critical exponent $-\frac{7}{2}$. Otherwise, for example when $x_n < x_{n-1}$ for all $n$, then the asymptotics is not canonical. The proof consists of two parts. The first part presents an inductive formal cluster expansion. The second part uses complex analysis to get inductive estimates.

1.2. Triangulations and partition function. Graphs here can have multiple edges but no loops. Whenever necessary the graphs are considered as 1-dimensional complexes. In this paper we call triangulation a pair $(G, \phi)$, where $G$ is a graph and $\phi$ is an imbedding of $G$ in a closed two-dimensional sphere with a hole, that is, a closed disk $D$. The following conditions are assumed to hold: if $l$ is an edge of $G$ then $\phi(l)$ is a smooth curve in $D$; each of the open components of $D \setminus \phi(G)$ is homeomorphic to an open disk; and the closure of each open component contains 3 different vertices and 3 different edges, that is, 3 smooth curves $\phi(l_i)$, where $l_i$ are edges of $G$. Note that two vertices of $G$ can be connected by more than one edge. A triangulation is called rooted if an edge (the root) on the boundary $\partial D$ is specified together with its, say clockwise, direction (orientation). Two triangulations are called equivalent if there is a homeomorphism $D \to D$ which respects orientation, vertices, edges and the root.

Let $T_0(N, m)$ be the set of all equivalence classes (called further on also triangulations for brevity) of rooted triangulations with $N$ triangles and $m$ boundary edges. Let $C_0(N, m) = |T_0(N, m)|$. It is very convenient to assume the following conditions, which will be the boundary conditions for the systems of equations below:

$$C_0(N, 0) = C_0(N, 1) = 0, \qquad C_0(0, m) = \delta_{m,2}, \qquad C_0(1, m) = \delta_{m,3}.$$

Only the case $N = 0$, $m = 2$ needs comment: this corresponds to a degenerate disk, an edge with two vertices. Let $V(T)$, $L(T)$, $F(T)$, $B(T)$ be correspondingly the sets of all vertices, edges, triangles and boundary edges of $T$. We denote by $T^*$ the dual graph of the triangulation $T$; its vertices $v \in V(T^*)$ correspond to triangles of $T$, and its edges $l \in L(T^*)$ to pairs of adjacent triangles. All vertices of $T^*$ have degree 3 except vertices corresponding to the triangles incident to at least one boundary edge (there are not more than $m = |B(T)|$ such triangles). In each triangle of $T$, or in each vertex $v$ of the dual graph $T^*$, there is a spin $\sigma_v$ with values in the set $S$; this set is assumed finite for simplicity. The partition function for the canonical ensemble (with fixed number $N \ge 0$ of triangles and fixed number $m \ge 2$ of boundary edges) is defined as

$$Z(N, m) = Z_\beta(N, m) = \sum_{T :\, |F(T)| = N,\ |B(T)| = m} Z(T),$$
where the partition function Z(T ) for a given triangulation T ∈ T0 (N, m) is exp(−β (σv , σv )), N = |F (T )| = V (T ∗ ) , Z(T ) = |S|−N {σv :v∈V (T ∗ )}
Dynamical Triangulation Models with Matter: High Temperature Region
where $\langle v, v' \rangle$ means a pair of nearest neighbor vertices (that is, of adjacent triangles) $v, v' \in V(T^*)$, $\Phi(s, s')$ is a real function on S × S, and β > 0 is the inverse temperature. The set of all symmetric interactions for given S is the Euclidean space $R^d$ of dimension $d = |S| + \frac{|S|(|S|-1)}{2}$. We call a set of interactions generic if its complement has measure 0 in $R^d$. Further on, for technical reasons only, we shall consider the ensemble with boundary conditions empty on the internal boundary, that is there are no spins on the triangles of F(T) adjacent to the boundary of the disk. Somewhere we shall say how to treat more general boundary conditions.
1.3. Main results. It is known (see [4, 7]) that for fixed m as N → ∞ so that N + m is even (we always assume this condition in the sequel)
\[ Z_0(N, m) = C_0(N, m) \sim \phi(m) N^{-5/2} c^N, \qquad c = \sqrt{\tfrac{27}{2}}, \qquad \phi(m) > 0. \]
Note that $Z_0(N, m) = 0$ if N + m is odd. We want to stress that getting the asymptotics of the partition function itself is certainly more difficult than getting the asymptotics of its logarithm. The asymptotics $c N^{-5/2} c_1^N$ is called canonical, and the critical exponent $\alpha = -\frac{5}{2}$ is also called canonical. Our goal is to prove similar results in the situation with spins. We prove that in many cases the partition function has canonical asymptotics (with the canonical critical exponent). In general, there is a constant c = c(Φ, β) such that
\[ Z(N, m) \sim \phi(m, \Phi, \beta) N^{-5/2} c^N. \]
For example, we have the following result.
Theorem 1. Let
\[ k = \sum_{\sigma, \sigma'} \big[\exp(-\beta\Phi(\sigma, \sigma')) - 1\big] < 0. \]
Then for β sufficiently small Z(N, m) has canonical asymptotics. We shall see below that this asymptotics, that is the constant c(Φ, β), is defined by the level 1 of the multiscale expansion. However, there are exceptions.
Theorem 2. If Φ ≤ 0 is not identically constant, then the asymptotics is not canonical.
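The sign condition of Theorem 1 is easy to evaluate for a concrete interaction. The following is a minimal sketch, assuming a two-valued spin set and two hypothetical interactions chosen only for illustration (neither is taken from the paper); the helper name `theorem1_k` is also an assumption.

```python
import math
from itertools import product

def theorem1_k(spins, phi, beta):
    """Return k = sum over all pairs (s, t) of [exp(-beta * phi(s, t)) - 1]."""
    return sum(math.exp(-beta * phi(s, t)) - 1.0
               for s, t in product(spins, repeat=2))

spins = (-1, +1)

def ising(s, t):
    # hypothetical Ising-type coupling, used only as an illustration
    return -s * t

def constant_positive(s, t):
    # hypothetical constant positive interaction
    return 1.0

k_ising = theorem1_k(spins, ising, beta=0.1)
k_constant = theorem1_k(spins, constant_positive, beta=0.1)
```

Under these assumptions the Ising-type coupling gives k = 4(cosh β − 1) > 0, so it does not satisfy the hypothesis k < 0, whereas the constant positive interaction does.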
Non-rooted triangulations. The easy corollary is that for non-rooted triangulations of the sphere (thus there is no boundary) we have $Z_0(N) \sim \phi(\beta) N^{-7/2} c(\beta)^N$ in the canonical case, and for Φ ≤ 0 the asymptotics is not canonical. The canonical critical exponent is defined for non-rooted triangulations to be $-\frac{7}{2}$.
Example: scaling transformation. Introduce the constant nearest-neighbor interaction $\Phi_\mu(\sigma, \sigma') \equiv \mu$. For non-rooted triangulations the term $\Phi_\mu$ gives an overall factor $\exp(-\beta\mu |L^*|) = \exp(-\frac{3}{2}\beta\mu N)$. However, for the ensembles where m is not fixed, such an interaction $\Phi_\mu$ is a nontrivial interaction; it leads to an interesting phase transition in µ. Appending $\Phi_\mu$ to some interaction Φ results in a scaling transformation of the generating functions (see below). In other words, appending such an interaction changes only the constant c(β) in the asymptotics, and does not change the canonical exponent.
V. A. Malyshev
2. Formal Expansion
2.1. Cluster representation for a fixed triangulation. Assume that the triangulation T is fixed. Expanding the exponent as
\[ \exp(-\beta\Phi(\sigma_v, \sigma_{v'})) = 1 + \big[\exp(-\beta\Phi(\sigma_v, \sigma_{v'})) - 1\big] \]
one can write
\[ Z(T) = \sum_{L^*} z(L^*), \]
where $\sum_{L^*}$ is over all subsets $L^*$ of edges of the dual graph $T^*$, and
\[ z(L^*) = |S|^{-N} \sum_{\{\sigma_v :\, v \in V(T^*)\}} \prod_{l = (v, v') \in L^*} k_l, \qquad k_l = k_l(\sigma_v, \sigma_{v'}) = \exp(-\beta\Phi(\sigma_v, \sigma_{v'})) - 1. \]
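The identity Z(T) = Σ z(L*) is just the expansion of a product of factors (1 + k_l) over the edges of the dual graph, and it can be checked by brute force on a small graph. The sketch below assumes a hypothetical 4-vertex graph and a hypothetical interaction; it is not a dual graph of any particular triangulation from the paper.

```python
import itertools
import math

def direct_Z(n_vertices, edges, spins, phi, beta):
    """Z(T) computed directly as a normalized sum of Boltzmann weights."""
    total = 0.0
    for sigma in itertools.product(spins, repeat=n_vertices):
        energy = sum(phi(sigma[u], sigma[v]) for u, v in edges)
        total += math.exp(-beta * energy)
    return total / len(spins) ** n_vertices

def cluster_Z(n_vertices, edges, spins, phi, beta):
    """Z(T) computed as the sum of z(L*) over all edge subsets L*."""
    total = 0.0
    for r in range(len(edges) + 1):
        for subset in itertools.combinations(edges, r):
            for sigma in itertools.product(spins, repeat=n_vertices):
                term = 1.0
                for u, v in subset:
                    # k_l = exp(-beta * phi) - 1 for each chosen edge
                    term *= math.exp(-beta * phi(sigma[u], sigma[v])) - 1.0
                total += term
    return total / len(spins) ** n_vertices

# Hypothetical small graph and interaction, for illustration only.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
spins = (-1, +1)
phi = lambda s, t: 0.5 * s * t + 0.2 * (s + t)

z_direct = direct_Z(4, edges, spins, phi, beta=0.3)
z_cluster = cluster_Z(4, edges, spins, phi, beta=0.3)
```

The two values agree up to floating-point rounding; the empty subset contributes the leading 1 that appears in the cluster representation.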
For each pair $(T, L^*)$ a triangle δ of T is called colored if it corresponds to a vertex of some edge of $L^*$, and blank otherwise. Denote the set of colored triangles as $V(L^*) = V(T, L^*)$. Recall that the distance between triangles in dynamical triangulation models is the distance between the corresponding vertices in the dual graph, that is the length (number of edges) of the shortest path between them in $T^*$. A set Δ of triangles is called 1-connected (or connected) if between each pair of triangles t, s ∈ Δ there is a path, belonging to Δ, in which any pair of consecutive triangles are at distance not greater than d = 1. For each set Δ define the external boundary $\partial_e \Delta$ as the set of triangles at distance 1 from Δ, and the internal boundary $\partial_i \Delta$ as the set of triangles in Δ at distance 1 from the complement of Δ. For each pair $(T, L^*)$ there is a unique decomposition of the closure of $V(L^*)$,
\[ cl(V(L^*)) =_{def} V(L^*) \cup \partial_e V(L^*) = \cup V_i, \]
where $V_i = V_i(T, L^*)$ are maximal connected subsets of $cl(V(L^*))$. Finally
\[ Z(T) = 1 + \sum_{p=1}^{\infty} \sum_{\{V_1, \ldots, V_p\}} k(V_1) \cdots k(V_p), \qquad (1) \]
where $\Lambda = \{V_1, \ldots, V_p\}$ is any system (called configuration) of connected subsets of $V(T^*)$ such that $dist(V_i, V_j) > 1$ for any $i \neq j$, and for any such V,
\[ k(V) = \sum_{L^* :\, cl(V(L^*)) = V} |S|^{-|V(L^*)|} \sum_{\{\sigma_v :\, v \in V(T^*)\}} \prod_{l \in L^*} k_l. \qquad (2) \]
We call $V_i$ clusters for given (T, Λ), or (T, Λ)-clusters. That is, for a given triangulation T and the subset Λ ⊂ F(T), (T, Λ)-clusters are maximal connected components of Λ. Thus in (1) for any i there exists at least one nonempty $L^*_i$ such that $cl(V(L^*_i)) = V_i$.
2.2. Hierarchy of clusters and generating functions. For any set V ⊂ F(T) the complement F(T) \ V consists of two parts: the exterior part Ext(F(T) \ V), consisting of all triangles of F(T) \ V which can be connected with the boundary by a connected path belonging to F(T) \ V, and the interior part Int(F(T) \ V), containing all other triangles. Let V be one of the (T, Λ)-clusters. Then the interior part of its complement F(T) \ V consists of some number r of connected components $V_1, \ldots, V_r$. For given T a set $V \subset F(T) = V(T^*)$ of triangles is called simple if it is connected (that is, 1-connected) and its interior part is empty. We say that a (T, Λ)-cluster has level 1 if it is simple. We define (T, Λ)-clusters of level n > 1 by induction: a cluster V has level n if n is the minimal number such that in its interior part there are only clusters of level less than n. Thus the (T, Λ)-clusters form a forest (a set of connected trees), where clusters are vertices of this forest. Two vertices of the tree are connected by an edge if one of the corresponding clusters is in the interior part of the other one, and their levels differ by 1. For given T a configuration Λ is said to be of level 1 if either there are no clusters at all or all (T, Λ)-clusters are simple. For given T a configuration Λ is said to be of level n > 1 if all (T, Λ)-clusters have level not greater than n and at least one of them has level n. For given T denote for n ≥ 1,
\[ Z^{(n)}(T) = 1 + \sum_{p=1}^{\infty} \sum^{(\le n)}_{\Lambda = \{V_1, \ldots, V_p\}} k(V_1) \cdots k(V_p), \qquad \Theta^{(n)}(T) = \delta_{n1} + \sum_{p=1}^{\infty} \sum^{(n)}_{\Lambda = \{V_1, \ldots, V_p\}} k(V_1) \cdots k(V_p), \]
where in the first case the sum with index (≤ n) is over all configurations $\Lambda = \{V_1, \ldots, V_p\}$ of level at most n. The sum with index (n) is over all configurations $\Lambda = \{V_1, \ldots, V_p\}$ of level n. Let us put $Z^{(0)}(T) = 1$, $\Theta^{(0)}(T) = 0$, and for n ≥ 0,
\[ Z^{(n)}(N, m) = \sum_{T \in T_0(N, m)} Z^{(n)}(T), \qquad \Theta^{(n)}(N, m) = \sum_{T \in T_0(N, m)} \Theta^{(n)}(T). \]
For n ≥ 0 let us call
\[ U^{(n)}(x, y) = \sum_{N=0}^{\infty} \sum_{m=2}^{\infty} Z^{(n)}(N, m) x^N y^m, \qquad Y^{(n)}(x, y) = \sum_{N=0}^{\infty} \sum_{m=2}^{\infty} \Theta^{(n)}(N, m) x^N y^m \]
the generating function of level at most n and of level n correspondingly. Then obviously
\[ U^{(n)}(x, y) = \sum_{1 \le k \le n} Y^{(k)}(x, y), \]
\[ Y^{(1)}(x, y) = U^{(1)}(x, y), \qquad Y^{(n)}(x, y) = U^{(n)}(x, y) - U^{(n-1)}(x, y), \quad n \ge 2. \]
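The inductive definition of levels amounts to a depth computation on the forest of nested clusters. The following sketch assumes the nesting is already given as a dictionary mapping each cluster to the clusters lying directly inside it; the cluster names and the helper `cluster_levels` are hypothetical.

```python
def cluster_levels(interior):
    """Assign levels: a cluster with empty interior has level 1, otherwise
    its level is one more than the largest level found inside it."""
    levels = {}

    def level(cluster):
        if cluster not in levels:
            inner = interior.get(cluster, [])
            levels[cluster] = 1 if not inner else 1 + max(level(c) for c in inner)
        return levels[cluster]

    for cluster in interior:
        level(cluster)
    return levels

# Hypothetical nesting: A contains B and C, and B contains D.
nesting = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
levels = cluster_levels(nesting)
configuration_level = max(levels.values())
```

Here the level of the configuration is the largest cluster level, matching the requirement that all clusters have level at most n and at least one has level n.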
Lemma 1. There exists δ > 0 such that the functions $U^{(n)}(x, y)$, $Y^{(n)}(x, y)$ and
\[ U(x, y) =_{def} \sum_{N=0}^{\infty} \sum_{m=2}^{\infty} Z(N, m) x^N y^m = \lim_{n \to \infty} U^{(n)}(x, y) \]
are analytic for |x|, |y| < δ.
Proof. Note first that the limit
\[ Z(N, m) = \lim_{n \to \infty} Z^{(n)}(N, m) \]
exists (in fact, for fixed N, m, the sequence $Z^{(n)}(N, m) \le Z^{(n+1)}(N, m)$ stabilizes as n → ∞) and is the partition function for the ensemble with the boundary conditions defined above. Moreover, there are a priori exponential bounds, easy to prove, $Z(N, m) < C^{N+m}$ for some C > 0 depending only on Φ and β.
We shall study properties of the functions $U^{(n)}$ by induction in n. If β ≤ 0 we call the model with the partition function $Z^{(n)}$ the random cluster model with clusters of level not greater than n. For example, the level 0 partition function is the case when there is no spin at all, and the level 1 corresponds to a special random cluster model, where only simple clusters are taken into account.
2.3. Level 1 cluster expansion. As is standard in cluster expansions, we have started with a resummation formula (polymer expansion, or cluster representation, see [3, 6]) for a given triangulation T. After this, in the standard theory some kind of correlation equations are used. However, here we will have to follow a different way, by incorporating our expansion into the recurrent formulae of Tutte, which allow one to take a census of all possible triangulations together with spin configurations on them. The reason is that the “empty space”, outside the clusters of the expansion, can vary considerably. In other words, the empty space has nonzero entropy. Our cluster expansion is inductive: it consists of steps n = 1, 2, .... On each step a new cluster expansion has to be done. It is a kind of multi-scale cluster expansion, where the role of scale is played by a topological parameter n, the length of maximal chains of imbedded non-simply connected clusters.
2.3.1. Level 1 cluster function. The nonempty (T, Λ)-cluster V is called complete if it contains all triangles of T. It is obviously simple and thus Λ consists only of this cluster. The complete (T, Λ)-cluster V is unique and we put K(T) = k(V). Then the cluster function is defined as
\[ W(x, y) = W^{(1)}(x, y) = \sum_{N=3}^{\infty} \sum_{m=2}^{\infty} W_{N,m} x^N y^m, \qquad W_{N,m} = W^{(1)}_{N,m} = \sum_{T :\, T \in T_0(N, m)} K(T). \]
Lemma 2. There exists a constant $C_2 > 0$ such that for any β sufficiently small
\[ |K(T)| \le (C_2 \beta)^{N/6}, \qquad N = |V(T^*)|. \qquad (3) \]
It follows that the function W(x, y) is analytic in $|x|, |y| < C\beta^{-a}$ for some C > 0, a > 0, and W(x, y) together with its first partial derivatives are $O(\beta |x|^3 |y|^2)$ for |x|, |y| ≤ 1.
Proof. We have obviously for small β
\[ \Big| \prod_{l \in L^*} k_l \Big| < (2\beta \max|\Phi|)^{|L^*|}. \]
At the same time, the dual graph is 3-regular and thus $2|L^*| \ge |V(L^*)| \ge \frac{|L^*|}{3}$. As the number of $L^*$ such that $V(L^*) = V$ is bounded by $2^{3|V|}$, from (2) it follows that
\[ |k(V)| \le (C_1 \beta)^{|V|/2} \]
for some $C_1 > 0$. For any triangulation T and any $L^*$, given a complete (T, Λ)-cluster, we have $|V(L^*)| \ge \frac{|V(T^*)|}{3}$. The number of V giving the complete (T, Λ)-cluster for given T is not greater than $2^{|V(T^*)|}$. Thus we have the cluster estimate (3). From the exponential estimate for the number $C_0(N; m)$ of rooted triangulations we have the result if we note also that m ≤ 2N for N > 1. As clusters are assumed to be nonempty, one can factor out $\beta x^3 y^2$: for any c > 0 and |x|, |y| < c we have by the cluster estimate $W(x, y) = \beta |x|^3 |y|^2 O(1)$.
2.3.2. Recurrent equations. Now we shall give a procedure to construct all configurations with simple clusters only from complete clusters and the degenerate disk, that is the configuration with N = 0, m = 2. It is of primary importance that all complete clusters have blank triangles on their internal boundary. The canonical functional equation is the following equation in a small neighbourhood $\Omega \subset C^2$ of x = y = 0:
\[ F(x, y) = F(x, y) x y^{-1} + F^2(x, y) x y^{-1} + y^2 + J(x, y) - x y F_2(x), \qquad (4) \]
where we denote
\[ F(x, y) = \sum_{N=0}^{\infty} \sum_{m=2}^{\infty} F_{N,m} x^N y^m, \qquad F_m(x) = \sum_{N=0}^{\infty} F_{N,m} x^N. \]
F and $F_2$ are unknown functions, $J = J(x, y) = \sum_{N=3}^{\infty} \sum_{m=2}^{\infty} J_{N,m} x^N y^m$ is given and analytic in Ω.
Fig. 1. Recurrent relation
Lemma 3. The level 1 generating function $U^{(1)} = U_1$ satisfies the canonical functional equation
\[ U_1(x, y) = U_1(x, y) x y^{-1} + U_1^2(x, y) x y^{-1} + y^2 + W^{(1)}(x, y) - x y S(x) \qquad (5) \]
with
\[ S(x) = S_2^{(1)}(x) = \sum_{N=0}^{\infty} Z^{(1)}(N, 2) x^N, \]
where it is convenient to use the notation S instead of $(U_1)_2$.
Proof. We use the idea of Tutte’s algorithm for taking a census of all triangulations. We have
\[ Z^{(1)}(N, m) = Z^{(1)}(N-1, m+1) + \delta_{N,0}\delta_{m,2} + W_{N,m} + \sum_{N_1 + N_2 = N-1,\ m_1 + m_2 = m+1} Z^{(1)}(N_1, m_1) Z^{(1)}(N_2, m_2) \qquad (6) \]
for m ≥ 2, N ≥ 0 and $Z^{(1)}(-1, m) = Z^{(1)}(N, 0) = Z^{(1)}(N, 1) = 0$. It follows that $Z^{(1)}(0, m) = \delta_{m,2}$, $Z^{(1)}(1, m) = \delta_{m,3}$. Figure 1 shows the meaning of this recurrent relation. The first term on the right-hand side of (6) corresponds to appending a triangle; the next two terms correspond to taking the degenerate triangulation with N = 0, m = 2 (omitted in the picture) and a complete cluster with the chosen root edge; the last term corresponds to joining together two already constructed triangulations by appending a triangle. Fat edges denote the root edges and show the rule to choose the root. Multiplying (6) by $x^N y^m$ and summing $\sum_{N=0}^{\infty} \sum_{m=2}^{\infty}$ we get the result.
2.4. Inductive resummation formula. The (T, Λ)-cluster V is called a boundary cluster if it contains all boundary triangles. Let $V_{co}$ denote all coloured triangles of V, and let V be $V_{co}$ together with the blank triangles, adjacent to the boundary. The complement of the closure cl(F(T) \ V) = F(T) \ V consists of some number r ≥ 0 of nonempty maximal connected components $R_1, \ldots, R_r$, isomorphic to a disk, and such that for each i there is at least one triangle in $R_i$ not belonging to V. Note that if r = 0 then the boundary cluster V is a complete cluster. Denote by $m_i = m_i(V)$ the number of edges on the boundary of $R_i$. In other words, the boundary of cl(F(T) \ V) can be uniquely subdivided into r circles (they can intersect in some vertices of T), and $m_i$ are the lengths of these circles.
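The recurrence (6) can be iterated directly. The following is a minimal sketch for the spin-free case only, that is with all cluster terms W set to zero (an assumption made here so that the numbers are plain integers); the function name `Z0` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def Z0(N, m):
    """Spin-free version of recurrence (6): the cluster term W_{N,m} is dropped."""
    if N < 0 or m < 2:
        return 0
    total = Z0(N - 1, m + 1)              # append a triangle to the root edge
    if (N, m) == (0, 2):
        total += 1                        # degenerate disk: one edge, two vertices
    for N1 in range(0, N):                # N1 + N2 = N - 1
        for m1 in range(2, m):            # m1 + m2 = m + 1 with m1, m2 >= 2
            total += Z0(N1, m1) * Z0(N - 1 - N1, m + 1 - m1)
    return total

first_values = [Z0(N, 2) for N in range(0, 7)]
```

The values reproduce the boundary conditions stated for $C_0(0, m)$ and $C_0(1, m)$ and vanish when N + m is odd. With a nonzero W the same loop applies once the numbers $W_{N,m}$ are supplied.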
A configuration of level n is called basic if there is only one cluster of level n and it is a boundary cluster. We shall construct all configurations of level n from the basic configurations of level n. Introduce the basic generating function of level n
\[ W^{(n)}(x, y) = \sum_{N,m} x^N y^m W^{(n)}_{N,m}, \qquad W^{(n)}_{N,m} = \sum_{T \in T_0(N, m)} \sum_{p=0}^{\infty} \sum_{\{V_0, V_1, \ldots, V_p\}} k(V_0) k(V_1) \cdots k(V_p), \]
where the sum $\sum_{\{V_0, V_1, \ldots, V_p\}}$ is over all configurations $\Lambda = \{V_0, V_1, \ldots, V_p\}$ (for given T) with a boundary cluster of level n, which we denote $V_0$. Assume now that we know the generating functions $U^{(k)}(x, y)$ for k < n and thus we know all $S_m^{(k)}(x)$, k < n, m = 2, 3, ..., defined similarly to the functions $F_m$ in the canonical equation. Put
\[ Y_m^{(0)}(x) = S_m^{(0)}(x), \qquad Y_m^{(1)}(x) = S_m^{(1)}(x), \qquad Y_m^{(k)}(x) = S_m^{(k)}(x) - S_m^{(k-1)}(x), \quad k \ge 2 \]
or
\[ S_m^{(k)}(x) = \sum_{N} Z^{(k)}(N, m) x^N = \sum_{j \le k} Y_m^{(j)}(x). \]
Let the boundary cluster $V_0$ have m edges on its exterior boundary, and $m_i$, i = 1, ..., r, edges on the “interior” boundaries of $V_0$. Then resummation of the latter formula gives
\[ W^{(n)}(x, y) = \sum_{m} y^m \sum_{V_0} k(V_0) x^{N(V_0)} \sum_{k_1, \ldots, k_r} \prod_{i=1}^{r} Y_{m_i}^{(k_i)}(x), \qquad (7) \]
where the sum $\sum_{V_0}$ is over all boundary clusters having m edges on their exterior boundary, $N(V_0)$ is the number of triangles in $V_0$, and $m_i$, i = 1, ..., r, is the number of edges on the i-th component of the interior boundary of $V_0$. Moreover, it is assumed that in $\sum_{k_1, \ldots, k_r}$ any $k_i \le n - 1$ and at least one $k_i$ equals n − 1.
2.4.1. Equation for the nth level generating function.
Lemma 4. The function $U^{(n)}$ satisfies the canonical functional equation (4) with $F = U^{(n)}$, $F_2 = S_2^{(n)}(x)$, $J = J^{(n)} = W^{(1)} + \cdots + W^{(n)}$. We will call such an equation the nth level equation.
The proof is quite similar to the proof for n = 1. Compared to the case without spins, the function J in the canonical recurrent equation contributes n new terms to the right-hand side, corresponding to the basic functions $W^{(i)}$, i = 1, ..., n. In Fig. 2 each of them is represented as a generic (third in the right-hand side) term with a boundary cluster. This corresponds to the recurrent equation (6) where instead of $W_{N,m}$ we substitute $W^{(n)}_{N,m}$. One can check that in the consecutive iterations of this recurrent relation only
configurations of level not more than n appear. And each configuration of level not more than n can be constructed with this recurrent relation. Note that all $U^{(k)}$ are analytic in some neighborhood of x = y = 0, because $|Z^{(k)}(N, m)| \le \tilde{Z}(N, m) \le C^{N+m}$, where $\tilde{Z}(N, m)$ is the partition function for the interaction −|Φ|. Thus the generating functions and the recurrent equations will be well-defined in some neighborhood of x = y = 0.
Fig. 2. Additional term with boundary cluster
3. Analytic Part
3.1. Level 1.
3.1.1. Algebraic functions. Here we solve the functional equation (5) for the case when there is no spin, that is when W = 0. We rewrite it in the following form:
\[ (2xU_1(x, y) + x - y)^2 = 4x^2 y^2 S^{(1)}(x) + (x - y)^2 - 4xy^3 - 4xyW(x, y) \qquad (8) \]
and denote by D its right-hand side. Consider the analytic set $\{(x, y) : 2xU_1 + x - y = 0\}$ in a small neighbourhood of x = y = 0. Note that it is not empty, (0, 0) belongs to this set and it defines a function $y(x) = x + O(x^2)$ in a neighbourhood of x = 0. In particular, it will be shown that y(x) and S(x) are algebraic functions if W = 0. We have two equations valid at the points of this analytic set
\[ D = 0, \qquad \frac{\partial D}{\partial y} = 0 \]
or
\[ 4x^2 y^2 S(x) + (x - y)^2 - 4xy^3 - 4xyW(x, y) = 0, \qquad (9) \]
\[ 8x^2 y S(x) - 2(x - y) - 12xy^2 - 4xW(x, y) - 4xyW_y(x, y) = 0, \]
from where one can exclude the function S(x) by multiplying the second equation (9) by $\frac{y}{2}$ and subtracting it from the first equation. Then
\[ y = x + 2y^3 - 2yW + 2y^2 W_y \qquad (10) \]
or
\[ y = \frac{x}{1 - 2y^2 - 2(yW_y - W)}. \qquad (11) \]
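The elimination of S(x) that leads to (10) can be written out term by term. The following is a short worked check, using only the two equations (9) and the notation D for the right-hand side of (8):

```latex
\begin{align*}
D - \tfrac{y}{2}\,\frac{\partial D}{\partial y}
  &= (x-y)^2 + y(x-y) - 4xy^3 + 6xy^3 - 4xyW + 2xyW + 2xy^2 W_y \\
  &= x\,\bigl[\,x - y + 2y^3 - 2yW + 2y^2 W_y\,\bigr].
\end{align*}
```

The terms containing S(x) cancel, and since both D and its y-derivative vanish on the analytic set, dividing by x gives (10).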
Here the functions W = W(x, y, β), S = S(x, β), y = y(x, β) are also functions of the parameter β. By the theorem on implicit functions this equation gives the unique function y(x, β), analytic for small x with y(0, β) = 0. It is evident from (11) that the convergence radius of y(x, β) is finite. Note that y(x; β) is odd and S(x; β) is even, because for any triangulation N − m is even, and thus the coefficients of monomials $x^i y^j$ of $yW_y - W$ have even i + j. Such symmetry will also hold in all future constructions.
Now we consider the case β = 0 (or W = 0) in more detail. We rederive here Tutte's results in a different way, suitable for further generalizations. y(x) = y(x, 0) is an algebraic function satisfying the equation $y^3 + py + q = 0$ with $p = -\frac{1}{2}$, $q = \frac{x}{2}$. The polynomial $f(y) = y^3 + py + q$ can have multiple roots only when $f = f_y = 0$, which gives $x_\pm = \pm\sqrt{\frac{2}{27}}$. These roots are double roots because $f_{yy} \neq 0$ at these points. For $x_+ = \sqrt{\frac{2}{27}}$ we have $y_+ = y(x_+) = \frac{1}{\sqrt{6}}$, which can be seen from $f_y = 3y^2 - \frac{1}{2} = 0$ and f = 0. From (11) it follows that x(−y) = −x(y) and thus y(x) is odd. It follows that y(x) has both $x_\pm = \pm\sqrt{\frac{2}{27}}$ as its singular points. From (9) we know S(x) = S(x, 0), after that $U_1(x, y)$ is explicit from Eq. (8). The unique branch y(x), defined by Eq. (11), is related to the unique branch of S(x) by the equation
\[ S = \frac{1 - 3y^2(x)}{(1 - 2y^2(x))^2} = x^{-2} y^2 (1 - 3y^2), \]
which is obtained by substituting $x = y - 2y^3$ into the first equation (9).
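The relation between S and y can be checked on power series at β = 0. The sketch below iterates $y = x + 2y^3$ on truncated integer series and then reads off the coefficients of S from $x^2 S = y^2 - 3y^4$; the helper names are hypothetical.

```python
def mul(a, b, n):
    """Product of two power series given as coefficient lists, truncated below degree n."""
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j in range(min(len(b), n - i)):
                out[i + j] += ai * b[j]
    return out

def y_series(n):
    """Coefficients of y(x) solving y = x + 2*y**3, for degrees 0 .. n-1."""
    y = [0] * n
    for _ in range(n):                       # each pass fixes further coefficients
        y3 = mul(mul(y, y, n), y, n)
        y = [(1 if k == 1 else 0) + 2 * y3[k] for k in range(n)]
    return y

def s_series(n):
    """Coefficients of S(x) for degrees 0 .. n-1, using x**2 * S = y**2 - 3*y**4."""
    m = n + 2
    y = y_series(m)
    y2 = mul(y, y, m)
    y4 = mul(y2, y2, m)
    return [y2[k + 2] - 3 * y4[k + 2] for k in range(n)]

y_coeffs = y_series(8)
s_coeffs = s_series(7)
```

The series for y contains only odd powers and the series for S only even powers, in line with the parity remark above, and the coefficients of S are nonnegative integers.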
We know that S(x) has positive coefficients, that is why $x = \sqrt{\frac{2}{27}}$ should be among its first singularities. Then $x = -\sqrt{\frac{2}{27}}$ should also be a singularity of both y(x) and S(x). The principal part of the singularity at the double root $x_+$ is $y(x) = A(x - x_+)^{d + \frac{1}{2}}$ for some integer d. As $y_+ = y(x_+)$ is finite, d ≥ 0. At the same time $y'(x) = \frac{1}{1 - 6y^2(x)}$, which is ∞ for $x = x_+$. It follows that d = 0. For S we have the same type of singularity $A(x - x_+)^{d + \frac{1}{2}}$ but here d = 1, as $S(x_+)$ and $S'(x_+)$ are finite but $S''(x_+)$ is infinite. If we introduce the Riemann surface $S_0$ of the algebraic function y(x), then x(s) and y(s), $s \in S_0$, are analytic on $S_0$, except only the points where x = y = ∞. The function S(x(s)) is meromorphic on $S_0$ with poles in the points s where x(s) = 0, $y(s) = \pm\frac{1}{\sqrt{2}}$. Thus, S(x(s)) does not have poles if, for example, $|x| \le \frac{1}{2}$. Denote
\[ \max_{|x| \le \frac{1}{2}} |y(x(s))| = \bar{y} < \infty. \]
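The location of the branch points at β = 0 can be confirmed with exact arithmetic. This sketch uses the standard discriminant $-4p^3 - 27q^2$ of a depressed cubic $y^3 + py + q$ (a textbook formula, not taken from the paper) with $p = -\frac{1}{2}$ and $q = \frac{x}{2}$; the helper name is hypothetical.

```python
import math
from fractions import Fraction

def depressed_cubic_discriminant(p, q_squared):
    """Discriminant -4 p^3 - 27 q^2 of y^3 + p*y + q, expressed through q^2."""
    return -4 * p**3 - 27 * q_squared

p = Fraction(-1, 2)
x_plus_squared = Fraction(2, 27)
# q = x/2, hence q^2 = x^2/4; working with q^2 keeps everything rational
disc_at_x_plus = depressed_cubic_discriminant(p, x_plus_squared / 4)

x_plus = math.sqrt(x_plus_squared)
y_plus = 1 / math.sqrt(6)
residual = y_plus**3 + float(p) * y_plus + x_plus / 2   # f(y_+) at x = x_+
slope = 3 * y_plus**2 + float(p)                          # f_y(y_+)
```

A vanishing discriminant at $x^2 = \frac{2}{27}$ means the cubic has a multiple root there, and the floating-point residuals show that this root is $y_+ = \frac{1}{\sqrt{6}}$.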
3.1.2. The perturbation. Now let β be sufficiently small. To find the first positive singularity $x_+(\beta)$ of y(x, β) put
\[ f(x, y; \beta) = -y + x + 2y^3 + Z(x, y), \qquad Z(x, y) = -2yW + 2y^2 W_y, \qquad (12) \]
\[ f_y = -1 + 6y^2 + Z_y. \]
We can rewrite Eqs. (12) as
\[ y^2 = \frac{1}{6} - \frac{1}{6} Z_y, \qquad x = \frac{2}{3} y + \frac{y}{3} Z_y - Z. \]
Consider first the case when we allow only simple clusters with the size $N \le N_0$ for some $N_0$, that is when W(x, y) is a polynomial. The same argument as for the level 0 case gives that $y^{(1)}(x, \beta)$, $S_m^{(1)}(x, \beta)$ are algebraic functions, as well as $U^{(1)}(x, y)$. However, the first singularity $x_+(\beta)$ will be different. The fixed point $(x, y) = (\sqrt{\frac{2}{27}}, \frac{1}{\sqrt{6}})$ is perturbed, and the first singularity $x_+^{(1)}(\beta) = x_+(\beta)$ will be an analytic function of β for small β. Denote now $y^{(0)}(x) = y(x, 0)$, $y^{(1)}(x) = y(x, \beta)$. We will need the following bounds: for some C > 0,
\[ |x_+^{(1)}(\beta) - x_+| < C\beta, \qquad (13) \]
and, for example, if x is inside the circle of convergence of both one-valued branches $y^{(0)}(x)$ and $y^{(1)}(x)$,
\[ |y^{(1)}(x) - y^{(0)}(x)| < C\beta. \qquad (14) \]
However this bound holds also for any x, $|x| \le \frac{1}{2}$, if $y^{(0)}(x)$ and $y^{(1)}(x)$ are corresponding branches. Correspondence between branches is established uniquely by analytic continuation if it is fixed for one point x = 0. It is convenient to use for this continuation the Riemann surfaces $S_0$, $S_1$ of these two functions correspondingly. In fact, we have for any x, $|x| \le \frac{1}{2}$,
\[ f(x, y, \beta) - f(x, y, 0) = O(\beta), \qquad f_y(x, y, \beta) - f_y(x, y, 0) = O(\beta). \]
Fix some δ such that $0 < \beta \ll \delta \ll 1$. Then there is such c > 0 that for all pairs (x, y(x)), outside some cδ-neighborhoods $O(x_\pm)$ of the branching points $x_\pm$, one can choose closed contours Γ(x, y(x)) in the y-plane around y = y(x) so that |f(x, y, 0)| = δ for all y ∈ Γ(x, y(x)). Then the bound (14) is readily obtained by the Cauchy formula,
\[ y^{(1)}(x) - y^{(0)}(x) = \frac{1}{2\pi i} \oint_{\Gamma(x, y(x))} y \Big( \frac{f_y(x, y, \beta)}{f(x, y, \beta)} - \frac{f_y(x, y, 0)}{f(x, y, 0)} \Big) dy. \]
To get (14) inside $O(x_\pm)$ it could be more convenient to use the convergent Puiseux series (see for example [8]) at the branching points.
The bound (13) follows from the graphs of two functions y and $x + 2y^3 + Z(x, y)$ for fixed $0 < x < \frac{1}{2}$. They have a common tangent at the point $x = x_+^{(1)}(\beta) = x_+(\beta)$. If the sign of the O(β) term in Z(x, y) is plus then $x_+^{(1)}(\beta) < x_+(0)$, if the sign is minus then $x_+^{(1)}(\beta) > x_+(0)$. If the O(β) term in Z(x, y) is zero then one should consider the $O(\beta^2)$ term. In the double root $x_+(\beta)$ the function y(x) has the main part of the singularity equal to $A(\beta)(x - x_+(\beta))^{d + \frac{1}{2}}$ with d = 0 because $y(x_+(\beta))$ is finite, but
\[ y'(x_+(\beta)) = \frac{1 + Z_x}{1 - 6y^2 - Z_y} = -\frac{1 + Z_x}{f_y} = \infty. \]
At the same time from the first equation (9) it follows that
\[ xS = -\frac{x}{4y^2} + \frac{1}{2y} - \frac{1}{4x} + y + \frac{W}{y} \qquad (15) \]
and
\[ (xS)'_x = \frac{y' f(x, y, \beta)}{2y^3} - \frac{1}{4y^2} + \frac{1}{4x^2} + \frac{W_x}{y} \]
are finite at $x = x_+(\beta)$, but $S''_{xx}(x_+(\beta)) = \infty$. We have also that
\[ \bar{S} = \max_{|x| \le \frac{1}{2}} |S| < \infty. \]
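The dependence of the first singularity on the sign of the perturbation can be illustrated numerically. The sketch below assumes a toy cluster function $W(x, y) = \varepsilon x^4 y^2$, which is purely hypothetical and not derived from any interaction in the paper; for it $Z = 2\varepsilon x^4 y^3$ and $Z_y = 6\varepsilon x^4 y^2$. The singular point is found by iterating the two rewritten equations $y^2 = \frac{1}{6} - \frac{1}{6}Z_y$, $x = \frac{2}{3}y + \frac{y}{3}Z_y - Z$.

```python
import math

def first_singularity(eps, iterations=50):
    """Solve f = 0, f_y = 0 by fixed-point iteration for the toy choice
    W(x, y) = eps * x**4 * y**2 (a hypothetical W, for illustration only)."""
    y = 1 / math.sqrt(6)                 # unperturbed y_+
    x = math.sqrt(2 / 27)                # unperturbed x_+
    for _ in range(iterations):
        Z = 2 * eps * x**4 * y**3        # Z = -2 y W + 2 y^2 W_y
        Zy = 6 * eps * x**4 * y**2       # derivative of Z in y
        y = math.sqrt((1 - Zy) / 6)
        x = 2 * y / 3 + y * Zy / 3 - Z
    return x, y

x0, y0 = first_singularity(0.0)
x_pos, y_pos = first_singularity(0.1)
x_neg, y_neg = first_singularity(-0.1)
```

With this toy W a positive perturbation moves the singularity to the left of the unperturbed value and a negative one moves it to the right, which is the sign rule described above.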
3.1.3. Asymptotics. In our case S(x, β) has two singular points $x_+(\beta)$ and $-x_+(\beta)$ on the radius of convergence. The main parts of the singularities are $c_+ (1 - \frac{x}{x_+(\beta)})^{\frac{3}{2}}$ and $c_+ (1 + \frac{x}{x_+(\beta)})^{\frac{3}{2}}$ correspondingly for some positive constant $c_+ = c_+(\beta)$. Thus the asymptotics of the coefficients of S(x, β), by the Darboux theorem (see [5]), is
\[ c_+ \Gamma^{-1}(-\tfrac{3}{2}) N^{-\frac{5}{2}} (x_+(\beta))^{-N} + c_+ \Gamma^{-1}(-\tfrac{3}{2}) N^{-\frac{5}{2}} (-x_+(\beta))^{-N}. \]
Thus for even N
\[ Z^{(1)}(N, 2) \sim 2c_+ \Gamma^{-1}(-\tfrac{3}{2}) N^{-\frac{5}{2}} (x_+(\beta))^{-N}. \]
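The canonical exponent can be observed numerically in the spin-free case β = 0. The sketch below works with $t = y^2$ and $u = x^2$, for which $x = y - 2y^3$ gives $t(1 - 2t)^2 = u$ and $uS = t - 3t^2$; it then estimates the exponent from the ratio of two consecutive nonzero coefficients. The helper names and the truncation order 60 are arbitrary choices.

```python
import math
from fractions import Fraction

def mul(a, b, n):
    """Truncated product of two integer power series (degrees below n)."""
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j in range(min(len(b), n - i)):
                out[i + j] += ai * b[j]
    return out

def rooted_counts(n_max):
    """Return [Z(0,2), Z(2,2), ..., Z(2*n_max,2)] at beta = 0."""
    n = n_max + 2                        # t is needed up to degree n_max + 1 in u
    t = [0] * n
    for _ in range(n):                   # t = u + 4 t^2 - 4 t^3, one more degree per pass
        t2 = mul(t, t, n)
        t3 = mul(t2, t, n)
        t = [(1 if k == 1 else 0) + 4 * t2[k] - 4 * t3[k] for k in range(n)]
    t2 = mul(t, t, n)
    return [t[k + 1] - 3 * t2[k + 1] for k in range(n_max + 1)]

counts = rooted_counts(60)
growth = float(Fraction(counts[60], counts[59]))
# For a_n ~ C * n**alpha * rho**n with rho = 27/2, this ratio estimates alpha.
alpha_estimate = math.log(growth / 13.5) / math.log(60 / 59)
```

At this modest order the estimate is already close to $-\frac{5}{2}$, and the ratio of consecutive coefficients approaches $x_+^{-2} = \frac{27}{2}$ from below.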
3.1.4. Arbitrary m > 2. The generating functions $S_m^{(1)}(x) = \sum_N Z^{(1)}(N, m) x^N$ for m > 2 can be obtained easily by the following recurrent procedure. Put $U = y^2 R$, then
\[ R(x, y) = \sum_{m=2}^{\infty} S^{(1)}_{m-2}(x) y^{m-2} \]
and we can rewrite the functional equation as
\[ yR = y + xy^2 R^2 + x(R - S^{(1)}) + y^{-1} J(x, y), \qquad S^{(1)}(x) = S_2^{(1)}(x) = R(x, 0). \qquad (16) \]
Then all $S_m^{(1)}$ are defined recursively by
\[ x S_m^{(1)}(x) = S_{m-1}^{(1)}(x) + x \sum_{j+k=m-2} S_j^{(1)}(x) S_k^{(1)}(x) + J_{m+1}(x), \]
where $J(x, y) = \sum_m J_{m+1}(x) y^m$. We see from this that $S_m^{(1)}(x)$ have similar singularities as $S_0^{(1)}(x)$, and for $|x| \le \frac{1}{2}$ and some C > 0,
\[ |S_m^{(0)}(x)| < c^m, \qquad |S_m^{(1)}(x, \beta)| < (c + C\beta)^m. \]
Thus we have proved
Lemma 5. The asymptotics for the level 1 partition function is
\[ Z^{(1)}(N, m) \sim \phi^{(1)}(m, \beta) N^{-\frac{5}{2}} (x_+(\beta))^{-N}. \]
3.2. Inductive estimates. The scheme of the induction is the following. We use the functional equation in Lemma 4, and for fixed n denote the solutions of the nth level equation as $y^{(n)}(x, \beta)$, $S_2^{(n)}(x, \beta)$, $U^{(n)}(x, y)$. These functions will be initially defined as one-valued and analytic functions in |x|, |y| < ε for sufficiently small ε > 0. However, they are branches of some multivalued functions. It is convenient to consider these multi-valued functions as functions on $S_n$ or $S_n \times C$, where $S_n$ is the Riemann surface of $y^{(n)}(x, \beta)$ and C is the complex plane. Thus we shall define a sequence of Riemann surfaces $S_n$, n = 2, ..., and analytic covering maps $\phi^{(n)} : S_n \to S_{n-1}$, that is
\[ \cdots \to S_n \xrightarrow{\phi^{(n)}} S_{n-1} \to \cdots \to S_1 \xrightarrow{\phi^{(1)}} C. \]
In fact, we will not need the complete Riemann surface $S_n$, but only some open part $D_n$ of it. $D_n$ will be defined inductively. The induction procedure depends on the case. We consider first the case of Theorem 1.
Definition of $D_1 \subset S_1$. Let $O_{\delta\beta}(x_\pm^{(1)}(\beta))$ be the δβ-neighborhood of the points $x_\pm^{(1)}(\beta) \in C$. In the complex plane C the function $y^{(1)}(x, \beta)$ has the unique analytic continuation to the set
\[ A^{(1)}_{\delta\beta} = \big\{ |x| < x_+^{(1)}(\beta) + \delta\beta \big\} \setminus \big( O_{\delta\beta}(x_+^{(1)}(\beta)) \cup O_{\delta\beta}(-x_+^{(1)}(\beta)) \big) \]
of the x-plane C. If $y^{(1)}(s^{(1)}) = y^{(1)}(x(s^{(1)}), \beta)$ on $S_1$, then put
\[ D_1 = \big\{ s : y^{(1)}(s) = y^{(1)}(x(s), \beta),\ x \in A^{(1)}_{\delta\beta} \big\} \cup (\phi^{(1)})^{-1}\big( O_{\delta\beta}(x_+^{(1)}(\beta)) \cup O_{\delta\beta}(x_-^{(1)}(\beta)) \big). \]
∪ (φ (1) )−1 (Oδβ (x+ (β)) ∪ Oδβ (x− (β))). It is instructive to start with the case n = 2. Before studying the 2-level equation the domain of analyticity the function W (2) (x, y) should be established. There exist
Dynamical Triangulation Models with Matter: High Temperature Region
177
constants C, a > 0 such that the function W (2) (x, y) is analytic on D1 × D(y), ¯ D(y) ¯ ⊂ C, and (2) (1) ¯ W (s , y) ≤ (Cβ)2a , (s (1) , y) ∈ D1 × D(y) This will be proved below. Note that it is no longer true that W (i) for i > 1 have radius of convergence of order β −a . The functions (2) y (2) (x, β), Sm (x, β), Ym(2) (x, β), U (2) (x, y), Y (2) (x, y)
are defined initially in a neighborhood of x = y = 0 as the solutions of the 2-level functional equation. All these functions have an analytic continuation to $D_2 \subset S_2$ defined as
\[ D_2 = (\phi^{(2)})^{-1}(D_1) \cap \big\{ s^{(2)} : y^{(2)}(s^{(2)}) = y^{(2)}(x(s^{(2)}), \beta),\ x \in A_{\delta\beta} \cup \big( O_{\delta\beta}(x_+^{(1)}(\beta)) \cup O_{\delta\beta}(-x_+^{(1)}(\beta)) \big) \big\}. \]
Then the functions $x(s^{(2)}, \beta)$, $y^{(2)}(s^{(2)}, \beta)$, $S_m^{(2)}(s^{(2)}, \beta)$, $Y_m^{(2)}(s^{(2)}, \beta)$ are defined as one-valued analytic functions on $D_2$. Moreover, $U^{(2)}(s^{(2)}, y)$, $Y^{(2)}(s^{(2)}, y)$ are analytic functions on $D_2 \times D(\bar{y})$. Now we can formulate inductive assumptions, definitions and estimates for
\[ W^{(n)},\ y^{(n)}(x, \beta),\ r^{(n)}(\beta),\ Y_2^{(n)}(x, \beta),\ Y_m^{(n)}(x, \beta), \]
where $r^{(n)}(\beta)$ is the convergence radius of $y^{(n)}(x, \beta)$. Using the functional equation and resummation formula (7) we prove the inductive assumptions for
\[ W^{(n+1)},\ y^{(n+1)}(x, \beta),\ x_+^{(n+1)}(\beta),\ Y_2^{(n+1)}(x, \beta),\ Y_m^{(n+1)}(x, \beta) \]
in this order.
1. Let $s^{(n)} \in S_n$ and assume that $D_n$ is already defined. Then the functions $x(s^{(n)})$, $y^{(n)}(s^{(n)}, \beta)$, $S_2^{(n)}(s^{(n)}, \beta)$, $S_m^{(n)}(s^{(n)}, \beta)$, $Y_m^{(n)}(s^{(n)}, \beta)$ are defined as analytic functions on $D_n$. The functions $U^{(n)}(s^{(n)}, y)$, $Y^{(n)}(s^{(n)}, y)$, $W^{(n+1)}(s^{(n)}, y)$ will be one-valued analytic functions on $D_n \times D(\bar{y})$. Then $D_{n+1} \subset S_{n+1}$ is defined as
\[ D_{n+1} = (\phi^{(n+1)})^{-1}(D_n) \cap \big\{ s^{(n+1)} : y^{(n+1)}(s^{(n+1)}) = y^{(n+1)}(x(s^{(n+1)}), \beta),\ x \in A_{\delta\beta} \cup \big( O_{\delta\beta}(x_+^{(1)}(\beta)) \cup O_{\delta\beta}(-x_+^{(1)}(\beta)) \big) \big\}. \]
2. The function $W^{(n)}(x, y)$ is analytic on $D_{n-1} \times D(c)$ and
\[ |W^{(n)}(s^{(n-1)}, y)| \le (C\beta)^{an}, \qquad (s^{(n-1)}, y) \in S_{n-1} \times D(\bar{y}). \]
3. The (n + 1)-level functional equation is defined on $D_n \times D(\bar{y})$, because $W^{(n+1)}(s^{(n)}, y)$ is analytic on $S_n \times D(\bar{y})$. However, its unknown functions $U^{(n+1)}(s^{(n)}, y)$ and $Y^{(n+1)}(s^{(n)}, y)$ will have branching points on $S_n \times D(\bar{y})$, and thus it is reasonable to consider them as functions on $S_{n+1} \times D(\bar{y})$ without branching points.
4. $x_+^{(n)}(\beta)$ is defined as the positive singularity of the curve
\[ f(x, y; \beta) = -y + x + 2y^3 + Z^{(n)}(x, y) = 0 \]
in $S_n \times D(\bar{y})$. We will prove that $x_+^{(n)}(\beta) > x_+^{(1)}(\beta)$ for all n > 1 and $r^{(n)}(\beta) = x_+^{(1)}(\beta)$.
5. For any $s^{(n)} \in D_n$,
\[ |y^{(n)}(s^{(n)}) - y^{(n-1)}(\phi^{(n)}(s^{(n)}))| < (C\beta)^{an}. \]
6. The functions $Y_m^{(n)}$, m = 2, 3, ... are analytic on $D_n$ and
\[ |Y_m^{(n)}(s^{(n)})| \le (C\beta)^{an}, \qquad s^{(n)} \in D_n, \]
and have the same (canonical) main singularities at $x_\pm^{(1)}(\beta)$.
Lemma 6. Assume that the inductive assumptions hold for all k ≤ n. Then they hold also for k = n + 1.
3.2.1. Bounds on the cluster function. To prove the inductive assumptions we need the following cluster estimate for the cluster functions of levels j = 2, 3, ....
Lemma 7. Consider the sum $\sum_{V_0}$ over all boundary clusters $V_0$ with fixed r, m, $m_1, \ldots, m_r$ and $N_0 = N(V_0)$. Then there exist C, a > 0 such that
\[ \sum_{V_0} |k(V_0)| \le (C\beta)^{aN_0} \le (C\beta)^{\frac{a}{2}(N_0 + (m + m_1 + \cdots + m_r))}, \]
where m is the length of the boundary and $m_i$ are the lengths of the interior boundaries of $V_0$.
Proof. As in the proof of (3) we have
\[ |k(V_0)| \le (C\beta)^{\frac{N_0}{6}} \]
and $m + m_1 + \cdots + m_r \le 3N_0$. This gives the result.
Consider for example the case n = 1. Then the function $W^{(2)}(x, y)$ is analytic in $S_1 \times \{y : |y| < \bar{y}\}$ and, by the resummation formula, is bounded as
\[ |W^{(2)}(x, y)| \le \sum_{N,m} |y|^m \sum_{N_0, r, m_1, \ldots, m_r} (C\beta)^{\frac{a}{2}(N_0 + (m + m_1 + \cdots + m_r))} |x|^{N_0} \prod_{i=1}^{r} |S^{(1)}_{m_i}(x)| \]
\[ \le \sum_{N,m} |y|^m \sum_{N_0, r, m_1, \ldots, m_r} (C\beta)^{\frac{a}{2}(N_0 + (m + m_1 + \cdots + m_r))} |x|^{N_0} c^{m_1 + \cdots + m_r} = O(\beta^{2a}). \]
In general the estimation of $W^{(n+1)}(x, y)$ is quite similar. It follows that the function $W^{(n+1)}(x, y)$ is analytic, and has the same main singularities at $x_\pm^{(k)}(\beta)$, k = 1, ..., n, as $U^{(n)}$.
3.2.2. Singular points.
Lemma 8. The convergence radius of $y^{(2)}$ is equal to $x_+^{(1)}(\beta)$.
Proof. Note that the sign of k or of
\[ \beta \sum_{\sigma, \sigma'} \Phi(\sigma, \sigma') \]
defines whether $x_+^{(1)}(\beta)$ is less or greater than $x_+(0)$. The fact that $x_+^{(2)}(\beta) > x_+^{(1)}(\beta)$ is also defined by k together with two other facts. The first one is the existence of boundary clusters of level 2 of the first order in β. This will give first order terms in $W^{(2)}$. An example of a triangulation which gives a first order term is shown in Fig. 3. Here the triangles 2 and 3 are the colored triangles of the boundary cluster of level 1 (an interaction between them is shown by the fat horizontal line), the triangles 1 and 4 are blank triangles of the cluster adjacent to the “exterior” boundary with m = 4, and region 5 (not made precise) is the interior part of the cluster, containing blank triangles of the cluster itself and possibly other blank triangles but no other clusters.
Fig. 3. First order boundary cluster term of level 2
We have the following equation:
\[ y^{(2)} = x + 2(y^{(2)})^3 + Z(x, y^{(2)}), \qquad Z(x, y^{(2)}) = -2y(W^{(1)} + W^{(2)}) + 2y^2 (W^{(1)} + W^{(2)})'_y. \]
We have
\[ -W^{(2)}(x, y) + yW_y^{(2)} = 3k y^4 x^4 S_2^{(1)}(x) + \cdots. \]
Note that $y'_x = -\frac{f_x}{f_y}$ is never zero for x, y of order one because $f_x$ is of order one for such x, y. We have from k < 0 that $x_+^{(2)}(\beta) - x_+^{(1)}(\beta) > \varepsilon\beta$ for some ε > 0. The second fact is the absence of first order terms for boundary clusters of level greater than 2. It follows that all higher perturbations give that $x_+^{(k)} - x_+^{(2)} = O(\beta^2)$ for any k > 2.
V. A. Malyshev
3.2.3. Bounds on $y^{(n+1)}$. Denote the right hand side of (10) by
$$f^{(n+1)}(s^{(n)}, y, \beta) = x + 2y^3 - 2yJ^{(n+1)} + 2y^2 \bigl(J^{(n+1)}\bigr)_y .$$
We want to find the difference $\Delta^{(n+1)}(s^{(n)}, \beta) = y^{(n+1)}(s^{(n)}, \beta) - y^{(n)}(s^{(n)}, \beta)$. We have
$$\bigl|f^{(n+1)}(s^{(n)}, y, \beta) - f^{(n)}(s^{(n)}, y, \beta)\bigr| < (C\beta)^{a(n+1)}, \qquad \Bigl|\frac{df^{(n+1)}}{dy}(s^{(n)}, y, \beta) - \frac{df^{(n)}}{dy}(s^{(n)}, y, \beta)\Bigr| = (C\beta)^{a(n+1)} .$$
Outside a vicinity of the branching points we have by the Cauchy formula
$$\Delta^{(n+1)}(s^{(n)}, \beta) = \frac{1}{2\pi i} \oint_\Gamma y \left( \frac{\frac{d}{dy} f^{(n+1)}(s^{(n)}, y, \beta)}{f^{(n+1)}(s^{(n)}, y, \beta)} - \frac{\frac{d}{dy} f^{(n)}(s^{(n)}, y, \beta)}{f^{(n)}(s^{(n)}, y, \beta)} \right) dy,$$
where $\Gamma = \Gamma(x(s^{(n)}, y^{(n)}(s^{(n)}, \beta)))$ are the same family of contours which were used in step 1 of the cluster expansion. Thus
$$\bigl|\Delta^{(n+1)}(s^{(n)}, \beta)\bigr| < (C\beta)^{a(n+1)} .$$
In the vicinity of the branching points one can again use Puiseux series. To prove the bound
$$\bigl|x_+^{(n+1)}(\beta) - x_+^{(n)}(\beta)\bigr| < (C\beta)^{a(n+1)}$$
one could use the same trick for $f_y$ instead of $f$, or simply consider graphs of $f^{(n+1)}(x, y)$ for fixed $0 < x < \frac12$, as in step 1. There exists $x_0(\beta)$ such that for fixed $x = x_0(\beta)$ these functions have a common tangent. Then put $x_+^{(n+1)}(\beta) = x_0(\beta)$.
3.2.4. Bounds on the functions $Y_m^{(n+1)}$. From
$$x S_2^{(n+1)}(s^{(n+1)}, \beta) = -\frac{x}{4\bigl(y^{(n+1)}\bigr)^2} + \frac{1}{2y^{(n+1)}} - \frac{1}{4x} + y^{(n+1)} + \frac{J^{(n+1)}}{y^{(n+1)}},$$
we get
$$\bigl|Y_2^{(n+1)}(s^{(n+1)})\bigr| = \bigl|S_2^{(n+1)}(s^{(n+1)}) - S_2^{(n+1)}(\phi^{(n+1)} s^{(n)})\bigr| < (C\beta)^{a(n+1)} .$$
For $Y_m^{(n+1)}(s^{(n+1)}) = S_m^{(n+1)}(s^{(n+1)}) - S_m^{(n+1)}(\phi^{(n+1)} s^{(n)})$ the derivation is similar to step 1.
3.3. Proof of Theorem 2. Let now $\beta \le 0$. The induction procedure for this case should be done in a different way, but the details are quite similar, and we give only the main points for brevity. Iterating equation (11) we get that the expansion of $y(x, \beta)$ at $x = 0$ has all coefficients positive, because $yW_y - W$ has positive coefficients. For the same reason, the coefficients of $y(x, \beta)$ increase when $\beta$ increases, and the radius of convergence $x_+^{(n)}(\beta)$ of $y^{(n)}(x, \beta)$ decreases as $n \to \infty$. Note also that
$$\max_{|x| \le \sqrt{2/27}} |y(x, 0)| = \frac{1}{\sqrt 6}$$
and is attained at $|x| = \sqrt{2/27}$, because the power series has positive coefficients. Note that $y(x, \beta)$ also has its maximum on the circle of convergence on the positive half-axis. From the first equation we see also that this maximum $y(x_+(\beta), \beta)$ decreases when $\beta$ increases. It is interesting to remark that it follows that cut-off models of level not greater than $n$, that is the random cluster models defined at the end of Sect. 2.2, have the canonical behaviour. However, the complete model does not. The domains $D_n$ should be chosen as
$$D_n = \bigl\{ s^{(n)} : y^{(n)}(s^{(n)}) = y^{(n)}(x(s^{(n)}), \beta),\ x \in D(\delta c_n) \bigr\},$$
where $c_n = x_+^{(n-1)}(\beta) - x_+^{(n)}(\beta)$ and $\delta > 0$ does not depend on $\beta$ and is sufficiently small. Here $y^{(n)}(x, \beta)$ is considered as a two valued function on $D(\delta c_n)$.
Assume that the asymptotics is canonical, that is of the form $\phi(m) N^{-5/2} c^N$. Then $c^{-1}$ is the convergence radius and thus equals $\lim_{n\to\infty} x_+^{(n)}(\beta)$, because if $\beta \le 0$ then $x_+^{(n+1)}(\beta) < x_+^{(n)}(\beta)$. As above one can prove that
$$\Theta^{(n)}(N, m) \sim \phi^{(n)}(m, \beta)\, N^{-5/2} \bigl(x_+^{(n)}(\beta)\bigr)^{-N} .$$
Then the asymptotics is bounded above by the sum
$$\sum_n \phi^{(n)}(m, \beta)\, N^{-5/2} \bigl(x_+^{(n)}(\beta)\bigr)^{-N} .$$
As $\sum_n \phi^{(n)}(m, \beta)$ converges, $\phi(m) < \varepsilon$ for any $\varepsilon > 0$.
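The statements about $y(x, 0)$ can be checked on a small example. The sketch below assumes that at $\beta = 0$ the equation for $y$ reduces to $y = x + 2y^3$ (the equation of Sect. 3.2.2 with $Z$ dropped); the truncation order `ORDER` and the helper names are illustrative choices. It iterates the equation on truncated power series, compares the coefficients with the Lagrange inversion formula, and locates the branch point where $dx/dy = 0$.

```python
from fractions import Fraction
from math import comb, isclose, sqrt

ORDER = 11  # highest power of x kept in the truncated series (illustrative choice)


def mul(p, q):
    """Multiply two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > ORDER:
                break
            out[i + j] += a * b
    return out


def series_solution():
    """Iterate y -> x + 2 y^3; each pass fixes at least two more orders in x."""
    y = [Fraction(0)] * (ORDER + 1)
    for _ in range(ORDER + 1):
        cube = mul(y, mul(y, y))
        y = [2 * c for c in cube]
        y[1] += 1
    return y


coeffs = series_solution()

# Lagrange inversion for x = y - 2 y^3: [x^(2j+1)] y = 2^j * C(3j, j) / (2j + 1).
closed = [Fraction(2**j * comb(3 * j, j), 2 * j + 1) for j in range(ORDER // 2 + 1)]

# Branch point: dx/dy = 1 - 6 y^2 vanishes at y_c = 1/sqrt(6).
y_c = 1 / sqrt(6)
x_c = y_c - 2 * y_c**3

print([int(c) for c in closed])  # odd-order coefficients, all positive
print(x_c**2, 2 / 27)            # squared radius of convergence
```

Under this assumption the coefficients are positive, and the maximum $1/\sqrt6$ is reached at $x = \sqrt{2/27}$ on the positive half-axis.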
References
1. Di Francesco, P., Ginsparg, P., Zinn-Justin, J.: 2D Gravity and Random Matrix Models. Physics Reports 254, 1–133 (1995)
2. Ginsparg, P., Moore, G.: Lectures on 2D Gravity and 2D String Theory. TASI Summer School, 1992
3. Glimm, J., Jaffe, A.: Quantum Physics. New York: Springer-Verlag, 1981
4. Goulden, I., Jackson, D.: Combinatorial Enumeration. New York: Wiley, 1983
5. Henrici, P.: Applied and computational complex analysis. V. 2, New York: John Wiley, 1977
6. Malyshev, V., Minlos, R.: Gibbs Random Fields. Dordrecht: Kluwer, 1990
7. Tutte, W.: A Census of Planar Triangulations. Canad. J. of Math. 14, 21–38 (1962)
8. Abhyankar, Sh.: Algebraic geometry for scientists and engineers. Providence, RI: Am. Math. Soc., 1990
Communicated by M. Aizenman
Commun. Math. Phys. 226, 183 – 203 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Yangian and Quantum Universal Solutions of Gervais–Neveu–Felder Equations
D. Arnaudon$^1$, J. Avan$^2$, L. Frappat$^{1,*}$, E. Ragoucy$^1$
1 Laboratoire d'Annecy-le-Vieux de Physique Théorique, LAPTH, CNRS, UMR 5108, Université de Savoie, B.P. 110, 74941 Annecy-le-Vieux Cedex, France
2 Laboratoire de Physique Théorique et Hautes Énergies, LPTHE, CNRS, UMR 7589, Universités Paris VI/VII, 4, place Jussieu, B.P. 126, 75252 Paris Cedex 05, France
* Member of Institut Universitaire de France
Received: 11 May 2001 / Accepted: 16 October 2001
Abstract: We construct universal Drinfel'd twists defining deformations of Hopf algebra structures based upon simple Lie algebras and contragredient simple Lie superalgebras. In particular, we obtain deformed and dynamical double Yangians. Some explicit realisations as evaluation representations are given for $sl_N$, $sl(1|2)$ and $osp(1|2)$.
Contents
1. Introduction
2. General Setting
2.1 Notations
2.2 Quasi-Hopf algebras
2.3 Drinfel'd twist
2.4 Represented R-matrices
3. Deformed Double Yangian $DY_r(\mathfrak g)$
3.1 Universal form
3.2 In representation for $\mathfrak g = sl_N$
3.3 In representation for $\mathfrak g = sl(1|2)$
4. Twist from $U_q(\mathfrak g)$ to $B_{q,\lambda}(\mathfrak g)$: A Summary
4.1 Universal form
4.2 In representation for $\mathfrak g = sl_N$
4.3 In representation for $\mathfrak g = osp(1|2)$
5. Twist from $U(\mathfrak g)$ to $U_s(\mathfrak g)$
5.1 Universal form and cocycle condition
5.2 In representation for $\mathfrak g = sl_N$
5.3 In representation for $\mathfrak g = osp(1|2)$
5.4 In representation for $\mathfrak g = sl(1|2)$
6. Twist from $U_q(\hat{\mathfrak g})$ to $U_{q,\lambda}(\hat{\mathfrak g})$
6.1 Universal form
6.2 In representation for $\mathfrak g = sl_N$
6.3 In representation for $\mathfrak g = osp(1|2)$
7. Twist from $DY(\mathfrak g)$ to $DY_s(\mathfrak g)$
7.1 Universal form
7.2 In representation for $\mathfrak g = sl_N$
7.3 In representation for $\mathfrak g = sl(1|2)$
A. Notations
B. Definition of $B_{q,p,\lambda}(\widehat{sl}_N)$
1. Introduction
Several consistent deformations of Yangian algebras have been proposed in the past years, starting with the scaling limit, defined in [1, 2], of vertex-type quantum elliptic algebras $A_{q,p}(\widehat{sl}_2)$ [3]. Extension of these scaling limits to face-type (so-called "dynamical") elliptic algebras [4, 5], and clarification of their connections at the level of evaluation representations, were proposed in [6] for structures based upon the Lie algebra $sl_2$. The double degeneracy limits of elliptic R-matrices, whether vertex-type [1, 2, 7] or face-type [6], give rise to algebraic structures which have been variously characterised as scaled elliptic algebras [2, 7], or double Yangian algebras [6, 8, 9]. As pointed out earlier [1, 2], although represented by formally identical Yang–Baxter relations $RLL = LLR$ [10], these two classes of objects differ fundamentally in their structures (as is reflected in the very different mode expansions of $L$ defining their individual generators) and must be considered separately. It appears clearly here that the universal algebraic structures associated with any limit of evaluated R-matrices may not be taken for granted, but must be explicitly constructed. This will be achieved here by identification of these particular limits as evaluation representations of universal R-matrices for the deformations by particular Drinfel'd twists [11], known as "shifted-cocycle" twists [12, 13], of Hopf algebra structures. Construction of several deformations of Yangian algebras at the universal level, and understanding thereof as Drinfel'd twists of the centrally extended double Yangian $DY(sl_2)$ [9, 14], was achieved in [15], following the schemes developed in the elucidation as Drinfel'd twists of face and vertex affine elliptic algebras based upon $sl_N$ [16] and face finite quantum algebras based upon any simple (contragredient super) Lie algebra [17].
The deformed double Yangians were thus characterised as Quasitriangular Quasi-Hopf Algebras (QTQHA). Our purpose here is first of all to extend these universal constructions to the case of deformations of the centrally extended double Yangians $DY(\mathfrak g)$, where $\mathfrak g$ is a simple Lie algebra of type $sl_N$ or a contragredient simple Lie superalgebra of type $sl(M|N)$ ($M \ne N$). We will also construct, by the same techniques, consistent deformations of $U(\mathfrak g)$ and $U_q(\hat{\mathfrak g})$, for any $\mathfrak g$. These constructions systematically endow these deformations with a Gervais–Neveu–Felder type QTQHA structure. It is characterised by a particular form of the universal Yang–Baxter equation, to be made explicit below. It must be emphasised that in general the universal Drinfel'd twists considered here are not obtained as (scaling or quasi-classical) limits of the universal Drinfel'd twists appearing in the elliptic algebras.
We shall first of all construct a deformation $DY_r(\mathfrak g)$, with $\mathfrak g$ (super) unitary, along the derivation generator $d$, of the centrally extended double Yangian $DY(\mathfrak g)$. When $\mathfrak g$ is taken to be $sl_N$, the evaluation representation of the R-matrix for this QTQHA is identified, up to a gauge transformation, with the scaling limit of the R-matrix for the vertex-type elliptic quantum affine algebra $A_{q,p}(\widehat{sl}_N)$, obtained by sending $q$, $p$ and the spectral parameter $z$ to 1 whilst keeping the ratio of their logarithms as finite parameters [18]. We then give as the simplest illustration of the superalgebra case the evaluation representation of $DY_r(sl(1|2))$.
We will then propose different deformations of Hopf algebra structures, this time along the Cartan subalgebra of the underlying (finite) Lie algebra. For historical reasons, they are called dynamical deformations. We first recall the previous construction [17] of a twist of the finite quantum enveloping algebra $U_q(\mathfrak g)$ to the dynamical algebra $B_{q,\lambda}(\mathfrak g)$. A consistent semi-classical limit of this procedure then yields a universal twist of shifted-cocycle form [12, 13], from the undeformed enveloping algebra $U(\mathfrak g)$ to a dynamical deformation $U_s(\mathfrak g)$ for any simple Lie (contragredient super) algebra $\mathfrak g$, thereafter evaluated for $\mathfrak g = sl_N$, $\mathfrak g = osp(1|2)$ and $\mathfrak g = sl(1|2)$.
Using now the Hopf algebra inclusion of $U_q(\mathfrak g)$ into $U_q(\hat{\mathfrak g})$, the same twist acts on $U_q(\hat{\mathfrak g})$ to yield the QTQHA $U_{q,\lambda}(\hat{\mathfrak g})$, the R-matrix of which may be obtained in the $sl_N$ case (under an evaluated form) as a trigonometric limit $p \to 0$ of the R-matrix for the elliptic affine face-type algebra $B_{q,p,\lambda}(\widehat{sl}_N)$.
Using finally the Hopf algebra inclusion of $U(\mathfrak g)$ into the extended double Yangian $DY(\mathfrak g)$, the previous twist from $U(\mathfrak g)$ to $U_s(\mathfrak g)$ leads from $DY(\mathfrak g)$ to the dynamical double Yangian $DY_s(\mathfrak g)$. The R-matrix of this QTQHA may also be obtained in the $sl_N$ case (under an evaluated form) as the scaling limit of the R-matrix for the previous algebra $U_{q,\lambda}(\widehat{sl}_N)$.
2. General Setting
2.1. Notations.
Let $\mathfrak g$ be a simple Lie algebra (or a contragredient simple Lie superalgebra different from $psl(N|N)$) of rank $r_{\mathfrak g}$, with symmetrised Cartan matrix $A = (a_{ij})$ and inverse $A^{-1} = (d_{ij})$. In the superalgebra case, we denote by $[\,\cdot\,]$ its $\mathbb Z_2$ grading. We denote by $H$ the Cartan subalgebra of $\mathfrak g$ with basis $\{h_i\}$ and dual basis $\{h_i^\vee\}$. Let $\Delta^+$ be the set of positive roots of $\mathfrak g$ endowed with a normal ordering $<$, i.e. if $\alpha, \beta, \alpha + \beta \in \Delta^+$ with $\alpha < \beta$, then $\alpha < \alpha + \beta < \beta$. Let $\rho$ be the half-sum of the positive roots (resp. even positive roots) for a simple Lie algebra (resp. superalgebra). We consider the corresponding quantum universal enveloping (super) algebra $U_q(\mathfrak g)$. It is endowed with a Hopf structure. In the case of superalgebras, the tensor product is graded: $(a_1 \otimes b_1)(a_2 \otimes b_2) = (-1)^{[b_1][a_2]} (a_1 a_2 \otimes b_1 b_2)$. We use the following coproduct for the generators related to simple roots:
$$\Delta(e_i) = e_i \otimes 1 + q^{h_i} \otimes e_i, \tag{2.1}$$
$$\Delta(f_i) = f_i \otimes q^{-h_i} + 1 \otimes f_i, \tag{2.2}$$
$$\Delta(h_i) = h_i \otimes 1 + 1 \otimes h_i. \tag{2.3}$$
With this choice, the corresponding universal R-matrix was given (up to $q \leftrightarrow q^{-1}$) in [19]. We also consider the quantum affine universal enveloping (super) algebra $U_q(\hat{\mathfrak g})$, with universal R-matrix, compatible with Eqs. (2.1)–(2.3), given in [20]. In that case, the Cartan subalgebra is completed with the derivation and central charge generators $d$ and $c$ respectively.
We introduce the double Yangian $DY(sl_N)$ following [9, 14] with the generators
$$e_i^\pm(u) \equiv \pm \sum_{\substack{k \ge 0 \\ k < 0}} e_{i,k}\, u^{-k-1}, \qquad f_i^\pm(u) \equiv \pm \sum_{\substack{k \ge 0 \\ k < 0}} f_{i,k}\, u^{-k-1}, \qquad h_i^\pm(u) \equiv 1 \pm \sum_{\substack{k \ge 0 \\ k < 0}} h_{i,k}\, u^{-k-1}, \tag{2.4}$$
where the sum runs over $k \ge 0$ for the upper sign and over $k < 0$ for the lower sign, satisfying relations given in [14]. The generators related to non-simple roots are derived from suitable combinations of generators related to simple roots, using Chevalley type relations. Its universal R-matrix, defined in [9], obeys the Yang–Baxter equation. Denoting by $\pi_e$ the evaluation representation of $DY(sl_N)$, the Lax matrix $L = (\pi_e \otimes \mathbb{1})(\mathcal R)$ realises an FRT-type formalism of $DY(sl_N)$ with an R-matrix defined by $R = (\pi_e \otimes \pi_e)(\mathcal R)$.
Our constructions regarding the double Yangian deformations will rely upon the following conjecture [9]: The universal R-matrix of the centrally extended double Yangian $DY(\mathfrak g)$ is given, for any Lie algebra $\mathfrak g$, by the general formula (5.3) in [9]. We will also assume that the evaluation of this R-matrix corresponds to the defining R-matrix of the Yangian $Y(sl_N)$.
We will consider the double Yangian $DY(sl(M|N))$. In this case, the universal R-matrix is supposed to be obtained similarly from the double of the corresponding super-Yangian. Again, central extensions of $DY(\mathfrak g)$ will contain in addition the derivation $d$ and central charge $c$.
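2.2. Quasi-Hopf algebras.
Definition 2.1. A unital associative $\mathbb C$-algebra $A$ is called a quasi-Hopf algebra if it is endowed with a coalgebra structure: the coproduct $\Delta : A \to A \otimes A$ and counit $\epsilon : A \to \mathbb C$ are algebra homomorphisms, the product $m : A \otimes A \to A$ and unit $\iota : \mathbb C \to A$ are coalgebra homomorphisms, and $A$ is equipped with an antihomomorphism $S : A \to A$ (antipode), elements $\alpha, \beta \in A$, and an invertible element $\Phi \in A \otimes A \otimes A$ (coassociator), with
$$(\mathrm{id} \otimes \Delta)(\Delta(x)) = \Phi\, (\Delta \otimes \mathrm{id})(\Delta(x))\, \Phi^{-1} \qquad (\forall x \in A),$$
$$(\mathrm{id} \otimes \epsilon) \circ \Delta = (\epsilon \otimes \mathrm{id}) \circ \Delta = \mathrm{id},$$
$$(\mathrm{id} \otimes \mathrm{id} \otimes \Delta)(\Phi) \cdot (\Delta \otimes \mathrm{id} \otimes \mathrm{id})(\Phi) = (\mathbb{1} \otimes \Phi) \cdot (\mathrm{id} \otimes \Delta \otimes \mathrm{id})(\Phi) \cdot (\Phi \otimes \mathbb{1}),$$
$$(\mathrm{id} \otimes \epsilon \otimes \mathrm{id})(\Phi) = \mathbb{1} \otimes \mathbb{1},$$
and for the antipode
$$\sum_i S(x_i^{(1)})\, \alpha\, x_i^{(2)} = \epsilon(x)\, \alpha, \qquad \sum_i x_i^{(1)}\, \beta\, S(x_i^{(2)}) = \epsilon(x)\, \beta,$$
$$\sum_i S(\varphi_i^{(1)})\, \alpha\, \varphi_i^{(2)}\, \beta\, S(\varphi_i^{(3)}) = \mathbb{1}, \qquad \sum_i \psi_i^{(1)}\, \beta\, S(\psi_i^{(2)})\, \alpha\, \psi_i^{(3)} = \mathbb{1},$$
where $x \in A$ with $\Delta(x) = \sum_i x_i^{(1)} \otimes x_i^{(2)}$ and
$$\Phi = \sum_i \varphi_i^{(1)} \otimes \varphi_i^{(2)} \otimes \varphi_i^{(3)}, \qquad \Phi^{-1} = \sum_i \psi_i^{(1)} \otimes \psi_i^{(2)} \otimes \psi_i^{(3)} .$$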
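The element $\Phi$ measures the lack of coassociativity of the coproduct.
Definition 2.2. A quasi-Hopf algebra $A$ is said to be quasi-triangular if an invertible element $\mathcal R \in A \otimes A$ exists, called the universal R-matrix, such that
$$\Delta^{op}(x) = \mathcal R\, \Delta(x)\, \mathcal R^{-1} \qquad (\forall x \in A),$$
$$(\Delta \otimes \mathrm{id})(\mathcal R) = \Phi^{(312)}\, \mathcal R_{13}\, \bigl(\Phi^{(132)}\bigr)^{-1} \mathcal R_{23}\, \Phi^{(123)},$$
$$(\mathrm{id} \otimes \Delta)(\mathcal R) = \bigl(\Phi^{(231)}\bigr)^{-1} \mathcal R_{13}\, \Phi^{(213)}\, \mathcal R_{12}\, \bigl(\Phi^{(123)}\bigr)^{-1} .$$
It follows that $\mathcal R$ satisfies the generalised Yang–Baxter equation (in $A \otimes A \otimes A$):
$$\mathcal R_{12}\, \Phi^{(312)}\, \mathcal R_{13}\, \bigl(\Phi^{(132)}\bigr)^{-1} \mathcal R_{23}\, \Phi^{(123)} = \Phi^{(321)}\, \mathcal R_{23}\, \bigl(\Phi^{(231)}\bigr)^{-1} \mathcal R_{13}\, \Phi^{(213)}\, \mathcal R_{12} . \tag{2.5}$$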
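The notation $\Phi^{(312)}$ means that if $\Phi^{(123)} = \sum_i \varphi_i^{(1)} \otimes \varphi_i^{(2)} \otimes \varphi_i^{(3)}$, then $\Phi^{(312)} = \sum_i \varphi_i^{(3)} \otimes \varphi_i^{(1)} \otimes \varphi_i^{(2)}$, and so on. A quasi-Hopf algebra with $\Phi = \mathbb{1}^{\otimes 3}$ is coassociative. It is a Hopf algebra.
2.3. Drinfel'd twist. The notion of Drinfel'd twist allows us to associate to a given quasi-triangular quasi-Hopf algebra another quasi-triangular quasi-Hopf algebra in the following way [11]. Consider an invertible element $\mathcal F \in A \otimes A$ such that $(\mathrm{id} \otimes \epsilon)\mathcal F = (\epsilon \otimes \mathrm{id})\mathcal F = \mathbb{1}$ (when $A$ is a quantum universal enveloping algebra, this means that the "leading" term in $\mathcal F$ is $\mathbb{1} \otimes \mathbb{1}$). One sets
$$\Delta^{\mathcal F}(x) = \mathcal F_{12}\, \Delta(x)\, \mathcal F_{12}^{-1}, \tag{2.6}$$
$$\mathcal R^{\mathcal F} = \mathcal F_{21}\, \mathcal R_{12}\, \mathcal F_{12}^{-1}, \tag{2.7}$$
$$\Phi^{\mathcal F} = \mathcal F_{23}\, (\mathrm{id} \otimes \Delta)(\mathcal F)\; \Phi\; \bigl( \mathcal F_{12}\, (\Delta \otimes \mathrm{id})(\mathcal F) \bigr)^{-1}, \tag{2.8}$$
$$\alpha^{\mathcal F} = \sum_i S(w_i^{(1)})\, \alpha\, w_i^{(2)} \quad \text{and} \quad \beta^{\mathcal F} = \sum_i v_i^{(1)}\, \beta\, S(v_i^{(2)}), \tag{2.9}$$
where
$$\mathcal F_{12} = \sum_i v_i^{(1)} \otimes v_i^{(2)} \quad \text{and} \quad \mathcal F_{12}^{-1} = \sum_i w_i^{(1)} \otimes w_i^{(2)} . \tag{2.10}$$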
Proposition 2.3 (Drinfel'd). If $(A, \Phi, \Delta, \epsilon, S, \alpha, \beta, \mathcal R)$ is a quasi-triangular quasi-Hopf algebra (QTQHA), then $(A, \Phi^{\mathcal F}, \Delta^{\mathcal F}, \epsilon, S, \alpha^{\mathcal F}, \beta^{\mathcal F}, \mathcal R^{\mathcal F})$ is also a QTQHA. $\mathcal F$ is called a Drinfel'd twist.
In the following, we will mainly be concerned with twists acting on Hopf algebras. From now on, we consider the case where $A$ is a Hopf algebra ($\Phi = \mathbb{1}^{\otimes 3}$) and $\mathcal F$ depends on parameters $\lambda \in H$, where $H$ is an Abelian subalgebra of $A$.
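Definition 2.4. A Drinfel'd twist $\mathcal F$ satisfying the so-called shifted cocycle condition ($h \in H$)
$$\mathcal F_{12}(\lambda)\, (\Delta \otimes \mathrm{id})(\mathcal F(\lambda)) = \mathcal F_{23}(\lambda + h^{(1)})\, (\mathrm{id} \otimes \Delta)(\mathcal F(\lambda)) \tag{2.11}$$
is called a Gervais–Neveu–Felder (GNF) twist.
We denote by $\lambda$ a vector with coordinates $(s_1, \dots, s_{r_{\mathfrak g}})$ in the basis $\{h_i\}$ in the finite case, or with coordinates $(s_1, \dots, s_{r_{\mathfrak g}}, r, s')$ in the basis $\{h_i, d, c\}$ otherwise. Then $\lambda + h^{(1)}$ is the vector with coordinates $s_i + h_i^{\vee (1)}$ in the finite case, and $(s'_i, r + c^{(1)}, s')$ in other cases. The coefficients $s'_i$ are given by $s'_i = 0$ if all $s_j$ are zero and $s'_i = s_i + h_i^{\vee (1)}$ otherwise.
In the case of a GNF twist, the coassociator $\Phi^{\mathcal F}$ is given by $\Phi_{\mathcal F}^{(123)} = \mathcal F_{23}(\lambda)\, \mathcal F_{23}(\lambda + h^{(1)})^{-1}$ and the universal R-matrix $\mathcal R^{\mathcal F}$ satisfies the so-called GNF or dynamical Yang–Baxter equation:
$$\mathcal R^{\mathcal F}_{12}(\lambda + h^{(3)})\, \mathcal R^{\mathcal F}_{13}(\lambda)\, \mathcal R^{\mathcal F}_{23}(\lambda + h^{(1)}) = \mathcal R^{\mathcal F}_{23}(\lambda)\, \mathcal R^{\mathcal F}_{13}(\lambda + h^{(2)})\, \mathcal R^{\mathcal F}_{12}(\lambda). \tag{2.12}$$
The classes of QTQHA we obtain in this paper are all of GNF type.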
2.4. Represented R-matrices. In representation the spectral parameter (if any) is explicit and the dynamical Yang–Baxter equation takes the forms
$$R_{12}(z, \lambda + h^{(3)})\, R_{13}(zz', \lambda)\, R_{23}(z', \lambda + h^{(1)}) = R_{23}(z', \lambda)\, R_{13}(zz', \lambda + h^{(2)})\, R_{12}(z, \lambda), \tag{2.13}$$
or
$$R_{12}(\beta, \lambda + h^{(3)})\, R_{13}(\beta + \beta', \lambda)\, R_{23}(\beta', \lambda + h^{(1)}) = R_{23}(\beta', \lambda)\, R_{13}(\beta + \beta', \lambda + h^{(2)})\, R_{12}(\beta, \lambda), \tag{2.14}$$
depending upon the multiplicative or additive nature of the spectral parameter. In the case of superalgebras, the R-matrix obtained as the evaluation of the universal R-matrix satisfies the graded Yang–Baxter equation
$$(R_{12})_{i_1 i_2}^{j_1 j_2}\, (R_{13})_{j_1 i_3}^{k_1 j_3}\, (R_{23})_{j_2 j_3}^{k_2 k_3}\, (-1)^{[i_1][i_2] + [i_3][j_1] + [j_2][j_3]} = (R_{23})_{i_2 i_3}^{j_2 j_3}\, (R_{13})_{i_1 j_3}^{j_1 k_3}\, (R_{12})_{j_1 j_2}^{k_1 k_2}\, (-1)^{[i_3][i_2] + [i_1][j_3] + [j_2][j_1]} . \tag{2.15}$$
Redefining the R-matrix as
$$\tilde R_{i_1 i_2}^{j_1 j_2} = (R)_{i_1 i_2}^{j_1 j_2}\, (-1)^{[i_1][i_2]}, \tag{2.16}$$
the R-matrix $\tilde R$ satisfies now the ordinary (i.e. non-graded) Yang–Baxter equation (this notation will be used throughout the paper when dealing with superalgebras). In the $osp(1|2)$ case (resp. $sl(1|2)$), the basis vectors $v_1, v_2, v_3$ of the three-dimensional representation have $\mathbb Z_2$ gradings 1, 0, 0 (resp. 1, 0, 1).
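3. Deformed Double Yangian $DY_r(\mathfrak g)$
3.1. Universal form. Our aim is to construct a Drinfel'd twist $\mathcal F$ from $DY(\mathfrak g)$ to a deformed double Yangian $DY_r(\mathfrak g)$, which is thereby endowed with a QTQHA structure. Let $A = DY(\mathfrak g)$ with universal R-matrix $\mathcal R$. Following [15, 17], we consider the linear equation in $A \otimes A$, inspired by [21]:
$$\mathcal F \equiv \mathcal F(r) = \mathrm{Ad}(\phi^{-1} \otimes \mathbb{1})(\mathcal F) \cdot \mathcal C, \tag{3.1}$$
with
$$\phi = \psi\, e^{(r+c)d} = \omega^{h_{0,\rho}}\, e^{(r+c)d}, \tag{3.2}$$
$$\mathcal C \equiv e^{\frac12 (c \otimes d + d \otimes c)}\, \mathcal R, \tag{3.3}$$
where $\omega = \exp\bigl(\frac{2i\pi}{r_{\mathfrak g}+1}\bigr)$ is a root of unity.
Theorem 3.1. The expression
$$\mathcal F(r) = \prod_{k}^{\longleftarrow} \mathcal F_k(r), \qquad \mathcal F_k(r) = \phi_1^{k}\, \mathcal C_{12}^{-1}\, \phi_1^{-k}, \tag{3.4}$$
is a solution of the linear equation (3.1). It satisfies the following shifted cocycle relation:
$$\mathcal F^{(12)}(r)\, (\Delta \otimes \mathrm{id})(\mathcal F(r)) = \mathcal F^{(23)}\bigl(r + c^{(1)}\bigr)\, (\mathrm{id} \otimes \Delta)(\mathcal F(r)), \tag{3.5}$$
i.e. $\mathcal F(r)$ is a GNF twist.
Proof. The operator $d$ in the double Yangian $DY(\mathfrak g)$ is defined by $[d, e_\alpha(u)] = \frac{d}{du} e_\alpha(u)$ for any root $\alpha$ (see [14]). It satisfies $\Delta(d) = d \otimes 1 + 1 \otimes d$. The generator $h_{0,\rho}$ of $DY(\mathfrak g)$ is such that
$$h_{0,\rho}\, e_\alpha(u) = e_\alpha(u)\,(h_{0,\rho} + (\rho|\alpha)), \qquad h_{0,\rho}\, f_\alpha(u) = f_\alpha(u)\,(h_{0,\rho} - (\rho|\alpha)), \qquad [h_{0,\rho}, h_\alpha(u)] = 0, \tag{3.6}$$
and hence $\tau = \mathrm{Ad}(\psi \otimes \mathbb{1})$ is idempotent, since all the scalar products $(\rho|\alpha)$ are rational. As in [16], the $\mathcal F_k$ satisfy the following properties:
$$(\Delta \otimes \mathrm{id})(\mathcal F_k(r)) = \mathcal F_k^{(23)}(r + c^{(1)})\, \mathcal F_k^{(13)}\Bigl(r + c^{(2)} - \frac{1}{2k}\, c^{(2)}\Bigr), \tag{3.7}$$
$$(\mathrm{id} \otimes \Delta)(\mathcal F_k(r)) = \mathcal F_k^{(12)}(r)\, \mathcal F_k^{(13)}\Bigl(r + \frac{1}{2k}\, c^{(2)}\Bigr), \tag{3.8}$$
and
$$\mathcal F_k^{(12)}(r)\, \mathcal F_{k+l}^{(13)}\Bigl(r + \frac{l + \frac12}{k+l}\, c^{(2)}\Bigr)\, \mathcal F_l^{(23)}(r + c^{(1)}) = \mathcal F_l^{(23)}(r + c^{(1)})\, \mathcal F_{k+l}^{(13)}\Bigl(r + \frac{l - \frac12}{k+l}\, c^{(2)}\Bigr)\, \mathcal F_k^{(12)}(r). \tag{3.9}$$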
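Using Eq. (3.9), one can prove by induction the following relation:
$$\prod_{l \ge k \ge 1}^{\longleftarrow} \mathcal F_k^{(23)}(r + c^{(1)})\; (\mathrm{id} \otimes \Delta)(\mathcal F(r)) = \prod_{k \ge 1}^{\longleftarrow} \mathcal F_k^{(12)}(r)\, \mathcal F_{k+l}^{(13)}\Bigl(r + \frac{l + \frac12}{k+l}\, c^{(2)}\Bigr) \times \prod_{l \ge k \ge 1}^{\longleftarrow} \mathcal F_k^{(23)}(r + c^{(1)})\, \mathcal F_k^{(13)}\Bigl(r + c^{(2)} - \frac{1}{2k}\, c^{(2)}\Bigr). \tag{3.10}$$
Letting then $l \to \infty$ and taking into account (3.7), one recovers the shifted cocycle condition.
The twist $\mathcal F(r)$ defines a QTQHA denoted $DY_r(\mathfrak g)$ with R-matrix $\mathcal R_{DY_r}(r) = \mathcal F_{21}(r)\, \mathcal R_{DY}\, \mathcal F_{12}(r)^{-1}$. Denoting by $\pi_{ev}(u)$ an evaluation representation of $DY(\mathfrak g)$ with evaluation parameter $u$, the Lax matrix $L(u) = (\pi_{ev}(u) \otimes \mathrm{id})\, \mathcal R_{DY_r}$ realises an FRT-type formalism of $DY_r(\mathfrak g)$ with an evaluated R-matrix defined by $R(u_1 - u_2, r) = (\pi_{ev}(u_1) \otimes \pi_{ev}(u_2))\, \mathcal R_{DY_r}$. The RLL relations take the form
$$R_{12}(u_1 - u_2, r + c)\, L_1(u_1, r)\, L_2(u_2, r + c^{(1)}) = L_2(u_2, r)\, L_1(u_1, r + c^{(2)})\, R_{12}(u_1 - u_2, r). \tag{3.11}$$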
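The way an ordered product solves a linear equation of the type (3.1) can be illustrated on a finite-dimensional toy model. The sketch below is not the double Yangian: it assumes a diagonal $3 \times 3$ matrix `PHI` playing the role of $\phi$ and a unipotent upper triangular matrix `C` playing the role of $\mathcal C$, both chosen arbitrarily so that the product converges. It builds $F = \cdots F_2 F_1$ with $F_k = \phi^k C^{-1} \phi^{-k}$ and checks $F = \phi^{-1} F \phi \cdot C$ numerically.

```python
# Toy data (hypothetical, chosen only so that the infinite product converges).
PHI = [0.25, 0.5, 1.0]                      # diagonal entries of phi
C = [[1.0, 1.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
C_INV = [[1.0, -1.0, 1.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
TERMS = 60                                  # truncation of the ordered product


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def conj(m, power):
    """Return phi^power * m * phi^(-power) for the diagonal matrix phi."""
    n = len(m)
    return [[(PHI[i] / PHI[j]) ** power * m[i][j] for j in range(n)] for i in range(n)]


def ordered_product():
    """F = F_TERMS ... F_2 F_1 with F_k = phi^k C^{-1} phi^{-k}."""
    f = [[float(i == j) for j in range(3)] for i in range(3)]
    for k in range(1, TERMS + 1):
        f = matmul(conj(C_INV, k), f)
    return f


F = ordered_product()
RHS = matmul(conj(F, -1), C)                # Ad(phi^{-1})(F) . C
residual = max(abs(F[i][j] - RHS[i][j]) for i in range(3) for j in range(3))
print(residual)
```

In this toy model the factor $F_0 = C^{-1}$ produced by the shift $\mathrm{Ad}(\phi^{-1})$ is cancelled by $C$, which is the same mechanism as in the universal formula. The entry `F[0][1]` is the geometric sum $-\sum_{k \ge 1} 2^{-k} = -1$.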
3.2. In representation for $\mathfrak g = sl_N$. A deformed double Yangian $DY_r(sl_N)$ with R-matrix $R(u, r)$ was obtained in [18] (and in [1] for $sl_2$) by taking the scaling limit of the RLL representation of $A_{q,p}(\widehat{sl}_N)$. The matrix $R(u, r)$ has $N^3$ non-vanishing entries. We now characterise it, up to gauge transformation, as an evaluation representation of the action of the Drinfel'd twist (3.4) on the R-matrix of $DY(sl_N)$. In this way, we identify the structure in [18] as a QTQHA.
3.2.1. Gauge transformation of $DY_r(sl_N)$.
Proposition 3.2. There exists a gauge transformation that connects the $N^3$-vertex R-matrix $R(u, r)$ of $DY_r(sl_N)$ to a more sparsely filled $N(2N-1)$-vertex R-matrix $\tilde R(u, r)$.
Proof. The R-matrix $R(u, r)$ of $DY_r(sl_N)$ is given by
$$R_{a,b}^{c,a+b-c}(u, r) = -\rho_{DY_r}(u)\, \frac{1}{N}\, \frac{\sin\frac{\pi u}{r}\, \sin\frac{\pi}{r}}{\sin\frac{\pi(u+1)}{r}}\; S_{a,b}^{c,a+b-c}(u, r), \tag{3.12}$$
where
$$S_{a,b}^{c,a+b-c}(u, r) = \frac{\sin\frac{\pi}{Nr}\bigl(u + 1 + (b-a)r\bigr)}{\sin\frac{\pi}{Nr}\bigl(u + (b-c)r\bigr)\; \sin\frac{\pi}{Nr}\bigl(1 - (a-c)r\bigr)} . \tag{3.13}$$
The normalisation factor $\rho_{DY_r}(u)$ is defined by
$$\rho_{DY_r}(u) = \frac{S_2(-u\,|\,r, N)\, S_2(1+u\,|\,r, N)}{S_2(u\,|\,r, N)\, S_2(1-u\,|\,r, N)}, \tag{3.14}$$
where $S_2(x\,|\,\omega_1, \omega_2)$ is Barnes' double sine function of periods $\omega_1$ and $\omega_2$. We perform the gauge transformation
$$\tilde R := (V \otimes V)\, R\, (V \otimes V)^{-1}, \tag{3.15}$$
$$\tilde S := (V \otimes V)\, S\, (V \otimes V)^{-1}, \tag{3.16}$$
with
$$V_i^{\,j} = N^{-1/2}\, \omega^{(i-1)j} . \tag{3.17}$$
A similar transformation was recently exhibited in [22]. Equation (3.15) leads to the following expression of the matrix elements of $\tilde S$:
$$\tilde S_{i_1 i_2}^{j_1 j_2}(u) = \delta_{i_1+i_2}^{j_1+j_2} \Bigl[ \delta_{i_2}^{j_1}\, \Phi_{i_2-i_1}\Bigl(\frac{u}{r}\Bigr) + \delta_{i_2}^{j_2}\, \Phi_{i_2-i_1}\Bigl(\frac{1}{r}\Bigr) \Bigr], \tag{3.18}$$
where the function $\Phi_n(x)$ is defined by
$$\Phi_n(x) := \sum_{k=0}^{N-1} \omega^{nk} \cot\frac{\pi}{N}(x + k), \tag{3.19}$$
that is
$$\Phi_n(x) = N \Bigl( \frac{e^{i\pi x}}{\sin \pi x}\, e^{-2i\pi n x/N} - i\,\delta_{n0} \Bigr) \qquad \text{for } n \in \{0, \dots, N-1\}. \tag{3.20}$$
Note that in (3.20) the integer $n$ has to be taken in the interval $[0, \dots, N-1]$ since the expression (3.20) is not explicitly $N$-periodic in $n$. In particular, all the non-zero entries of $\tilde S$ are
$$\tilde S_{aa}^{aa}(u) = N \Bigl( \cot\frac{\pi}{r} + \cot\frac{\pi u}{r} \Bigr), \tag{3.21}$$
$$\tilde S_{ab}^{ab}(u) = N\, \frac{e^{i\pi/r}}{\sin\frac{\pi}{r}}\, e^{-2i\pi(b-a)/Nr} \qquad \text{for } b - a \in \{1, \dots, N-1\}, \tag{3.22}$$
$$\tilde S_{ab}^{ba}(u) = N\, \frac{e^{i\pi u/r}}{\sin\frac{\pi u}{r}}\, e^{-2i\pi(b-a)u/Nr} \qquad \text{for } b - a \in \{1, \dots, N-1\}. \tag{3.23}$$
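The equality between the finite sum (3.19) and the closed form (3.20) can be checked numerically. The sketch below assumes $\omega = e^{2i\pi/N}$ (the $sl_N$ value of the root of unity introduced in Sect. 3.1); the function names and the sample points are illustrative.

```python
import cmath
import math


def phi_sum(n, x, big_n):
    """Finite sum (3.19) with omega = exp(2 i pi / N)."""
    omega = cmath.exp(2j * math.pi / big_n)
    return sum(omega ** (n * k) / cmath.tan(math.pi * (x + k) / big_n) for k in range(big_n))


def phi_closed(n, x, big_n):
    """Closed form (3.20), valid for n in {0, ..., N-1}."""
    value = cmath.exp(1j * math.pi * x) / cmath.sin(math.pi * x)
    value *= cmath.exp(-2j * math.pi * n * x / big_n)
    if n == 0:
        value -= 1j
    return big_n * value


def max_defect(big_n, samples=(0.37, 1.23, 0.5 + 0.2j)):
    """Largest deviation between (3.19) and (3.20) over n and the sample points."""
    return max(
        abs(phi_sum(n, x, big_n) - phi_closed(n, x, big_n))
        for n in range(big_n)
        for x in samples
    )


for size in (2, 3, 5):
    print(size, max_defect(size))
```

For $n = 0$ the identity reduces to the classical relation $\sum_{k} \cot\frac{\pi}{N}(x + k) = N \cot \pi x$, which is why the term $-i\,\delta_{n0}$ appears.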
3.2.2. Twist from $DY(sl_N)$ to $DY_r(sl_N)$.
Proposition 3.3. The evaluation of the twisted universal R-matrix $\mathcal R_{DY_r}(r)$ is precisely the sparsely filled $N(2N-1)$-vertex R-matrix $\tilde R(u, r)$ given by the gauge transformation (3.15).
Proof. We start with the R-matrix of $DY(sl_N)$. Its non-vanishing matrix elements $R_{i_1 i_2}^{j_1 j_2}$ are given by ($1 \le a, b \le N$)
$$R_{aa}^{aa} = \rho_{DY}(u), \qquad R_{ab}^{ab} = \rho_{DY}(u)\, \frac{u}{u+1}, \qquad R_{ab}^{ba} = \rho_{DY}(u)\, \frac{1}{u+1}, \tag{3.24}$$
$$\rho_{DY}(u) = \frac{\Gamma_1(u\,|\,N)\, \Gamma_1(u + N\,|\,N)}{\Gamma_1(u + 1\,|\,N)\, \Gamma_1(u + N - 1\,|\,N)} . \tag{3.25}$$
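Up to the scalar factor $\rho_{DY}(u)$, the matrix elements (3.24) are those of Yang's rational R-matrix $(u\,\mathbb{1} + P)/(u + 1)$, with $P$ the permutation operator. The sketch below drops the normalisation (a scalar factor does not affect the Yang–Baxter equation) and checks the additive Yang–Baxter equation, i.e. (2.14) without the dynamical shifts, in exact rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def yang_r(u, n):
    """Entries of (u*Id + P)/(u + 1), keyed by (out1, out2, in1, in2)."""
    r = {}
    for o1, o2, i1, i2 in product(range(n), repeat=4):
        value = Fraction(0)
        if (o1, o2) == (i1, i2):
            value += u          # identity part: R^{ab}_{ab} gets u/(u+1)
        if (o1, o2) == (i2, i1):
            value += 1          # permutation part: R^{ba}_{ab} gets 1/(u+1)
        r[(o1, o2, i1, i2)] = value / (u + 1)
    return r


def embed(r, a, b, n):
    """Let a two-site matrix act on sites a and b of a three-site chain."""
    basis = list(product(range(n), repeat=3))
    spectator = 3 - a - b
    return [
        [
            r[(out[a], out[b], inp[a], inp[b])] if out[spectator] == inp[spectator] else Fraction(0)
            for inp in basis
        ]
        for out in basis
    ]


def matmul(x, y):
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)] for i in range(size)]


def ybe_defect(u, v, n):
    """Largest entry of R12(u) R13(u+v) R23(v) - R23(v) R13(u+v) R12(u)."""
    r12 = embed(yang_r(u, n), 0, 1, n)
    r13 = embed(yang_r(u + v, n), 0, 2, n)
    r23 = embed(yang_r(v, n), 1, 2, n)
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return max(abs(p - q) for row_l, row_r in zip(lhs, rhs) for p, q in zip(row_l, row_r))


print(ybe_defect(Fraction(3, 7), Fraction(5, 2), 2))
print(ybe_defect(Fraction(3, 7), Fraction(5, 2), 3))
```

Exact fractions are used so that the defect is identically zero rather than small; the spectral parameters are arbitrary values away from the pole $u = -1$.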
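We then solve the evaluated linear equation, that is:
$$F(u + r) = (\psi \otimes \mathbb{1})^{-1}\, F(u)\, (\psi \otimes \mathbb{1})\, R(u + r), \tag{3.26}$$
where
$$\psi = \mathrm{diag}(\omega^{-a}), \tag{3.27}$$
and $F$ is now an $N^2 \times N^2$ matrix which reduces to $1 \times 1$ blocks $F_{aa}^{aa} = 1$ and to $2 \times 2$ blocks:
$$\begin{pmatrix} F_{ab}^{ab}(u) & F_{ab}^{ba}(u) \\ F_{ba}^{ab}(u) & F_{ba}^{ba}(u) \end{pmatrix} = \begin{pmatrix} b_{ab}(u) & c_{ab}(u) \\ c'_{ab}(u) & b'_{ab}(u) \end{pmatrix} . \tag{3.28}$$
The linear equation (3.26) is then equivalent to:
$$\omega^{b-a}(u + 2r + 1)\, b_{ab}(u + 2r) - \bigl(u + r + \omega^{b-a}(u + 2r)\bigr)\, b_{ab}(u + r) + (u + r - 1)\, b_{ab}(u) = 0, \tag{3.29}$$
$$(u + 2r + 1)\, c_{ab}(u + 2r) - \bigl(u + r + \omega^{a-b}(u + 2r)\bigr)\, c_{ab}(u + r) + \omega^{a-b}(u + r - 1)\, c_{ab}(u) = 0, \tag{3.30}$$
together with similar equations for $b'_{ab}(u)$ and $c'_{ab}(u)$, deduced from (3.29) and (3.30) by the change $\omega \to \omega^{-1}$.
The solution is expressed in terms of hypergeometric functions ${}_2F_1\bigl(\begin{smallmatrix} a \;\; b \\ c \end{smallmatrix}; z\bigr)$:
$$b_{ab}(u) = \frac{\Gamma\bigl(\frac{u-1}{r} + 1\bigr)}{\Gamma\bigl(\frac{u}{r} + 1\bigr)}\; {}_2F_1\Bigl(\begin{smallmatrix} -\frac1r \;\;\; \frac{u-1}{r} + 1 \\ \frac{u}{r} + 1 \end{smallmatrix}; \omega^{b-a}\Bigr), \tag{3.31}$$
$$c_{ab}(u) = -\frac{\omega^{b-a}}{r}\, \frac{\Gamma\bigl(\frac{u-1}{r} + 1\bigr)}{\Gamma\bigl(\frac{u}{r} + 2\bigr)}\; {}_2F_1\Bigl(\begin{smallmatrix} -\frac1r + 1 \;\;\; \frac{u-1}{r} + 1 \\ \frac{u}{r} + 2 \end{smallmatrix}; \omega^{b-a}\Bigr). \tag{3.32}$$
Similarly, $b'_{ab}(u)$ and $c'_{ab}(u)$ are obtained from $b_{ab}(u)$ and $c_{ab}(u)$ by changing $\omega \to \omega^{-1}$.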
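The twist $F(u)$, given by the collection of $2 \times 2$ blocks (3.28) and $1 \times 1$ blocks $F_{aa}^{aa} = 1$, is applied to the R-matrix (3.24) of $DY(sl_N)$, i.e. $F_{21}(-u)\, R(u)\, F_{12}^{-1}(u)$. It provides the R-matrix $\tilde R(u, r)$ of the deformed double Yangian $DY_r(sl_N)$. This last result follows by a direct computation using properties of the hypergeometric functions ${}_2F_1$ such as the connection formula
$${}_2F_1\Bigl(\begin{smallmatrix} a \;\; b \\ c \end{smallmatrix}; z\Bigr) = \frac{\Gamma(c)\,\Gamma(b-a)}{\Gamma(b)\,\Gamma(c-a)}\, (-z)^{-a}\; {}_2F_1\Bigl(\begin{smallmatrix} a \;\;\; 1-c+a \\ 1-b+a \end{smallmatrix}; \frac1z\Bigr) + \frac{\Gamma(c)\,\Gamma(a-b)}{\Gamma(a)\,\Gamma(c-b)}\, (-z)^{-b}\; {}_2F_1\Bigl(\begin{smallmatrix} b \;\;\; 1-c+b \\ 1-a+b \end{smallmatrix}; \frac1z\Bigr). \tag{3.33}$$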
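3.3. In representation for $\mathfrak g = sl(1|2)$. The evaluated R-matrix of the double Yangian of $sl(1|2)$ is assumed to have the canonical Yang-type simplest rational form:
$$\tilde R(u) = \frac{u}{u+1} \sum_{\substack{a,b \\ a \ne b}} (-1)^{[a][b]}\, E_{aa} \otimes E_{bb} + \frac{1}{u+1} \sum_{\substack{a,b \\ a \ne b}} E_{ab} \otimes E_{ba} + \sum_a E_{aa} \otimes E_{aa}, \tag{3.34}$$
where the gradation is $[1] = [3] = 1$, $[2] = 0$ (the conventions used for $sl(1|2)$ are those of [23] with the fermionic basis). A similar evaluation of the twist (3.4) now leads to the following expression for the R-matrix of the QTQHA $DY_r(sl(1|2))$ (with $N = 3$):
$$\begin{aligned} \tilde R(u) = {} & -\frac{\sin\frac{\pi(u-1)}{r}}{\sin\frac{\pi(u+1)}{r}}\, (E_{11} \otimes E_{11} + E_{33} \otimes E_{33}) + E_{22} \otimes E_{22} \\ & + \frac{\sin\frac{\pi u}{r}}{\sin\frac{\pi(u+1)}{r}} \sum_{a<b} (-1)^{[a][b]} \Bigl( e^{i\pi/r + 2i\pi(a-b)/Nr}\, E_{aa} \otimes E_{bb} + e^{-i\pi/r - 2i\pi(b-a)/Nr}\, E_{bb} \otimes E_{aa} \Bigr) \\ & + \frac{\sin\frac{\pi}{r}}{\sin\frac{\pi(u+1)}{r}} \sum_{a<b} \Bigl( e^{i\pi u/r + 2i\pi(a-b)u/Nr}\, E_{ab} \otimes E_{ba} + e^{-i\pi u/r - 2i\pi(b-a)u/Nr}\, E_{ba} \otimes E_{ab} \Bigr). \end{aligned} \tag{3.35}$$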
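4. Twist from $U_q(\mathfrak g)$ to $B_{q,\lambda}(\mathfrak g)$: A Summary
4.1. Universal form. Let $A = U_q(\mathfrak g)$ with universal R-matrix $\mathcal R$ given by
$$\mathcal R = \hat{\mathcal R}\, \mathcal K; \qquad \hat{\mathcal R} = \prod_{\gamma \in \Delta^+}^{>} \hat{\mathcal R}_\gamma, \tag{4.1}$$
where the product is ordered with respect to $>$ (the reversed normal order on $\Delta^+$), and the objects $\hat{\mathcal R}_\gamma$ and $\mathcal K$ are given by
$$\hat{\mathcal R}_\gamma = \exp_{q^2}\bigl(-(q - q^{-1})\, e_\gamma \otimes f_\gamma\bigr) \tag{4.2}$$
and
$$\mathcal K = q^{-\sum_{ij} d_{ij}\, h_i \otimes h_j} . \tag{4.3}$$
$e_\gamma$, $f_\gamma$ are the root generators and $h_i$ the Cartan generators in the Serre–Chevalley basis. In (4.2) the q-exponential is defined by
$$\exp_q(x) \equiv \sum_{n \in \mathbb Z_+} \frac{x^n}{(n)_q!}, \qquad \text{where } (n)_q! \equiv (1)_q (2)_q \cdots (n)_q \quad \text{and} \quad (k)_q \equiv \frac{1 - q^k}{1 - q} . \tag{4.4}$$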
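The q-exponential of (4.4) can be made concrete with a few series coefficients. The sketch below computes the coefficients $1/(n)_q!$ exactly and checks the classical inversion identity $\exp_q(x)\, \exp_{q^{-1}}(-x) = 1$ order by order. The value of `Q`, the truncation order and the function names are illustrative assumptions.

```python
from fractions import Fraction

Q = Fraction(3, 2)   # sample value of q (any q != 0, 1 that is not a root of unity)
ORDER = 8            # number of series coefficients compared


def q_number(k, q):
    """(k)_q = (1 - q^k) / (1 - q)."""
    return (1 - q**k) / (1 - q)


def q_exp_coeffs(q, order, sign=1):
    """Coefficients of exp_q(sign * x) up to x^order."""
    coeffs, factorial = [], Fraction(1)
    for n in range(order + 1):
        if n > 0:
            factorial *= q_number(n, q)
        coeffs.append(Fraction(sign) ** n / factorial)
    return coeffs


def cauchy_product(a, b):
    """Coefficients of the product of two truncated power series."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]


forward = q_exp_coeffs(Q, ORDER)
inverse = q_exp_coeffs(1 / Q, ORDER, sign=-1)
print(cauchy_product(forward, inverse))
```

On a nilpotent argument with $x^2 = 0$ the series truncates to $1 + x$, which is what happens for each root factor (4.2) in a fundamental representation.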
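Theorem 4.1 ([16, 17]).
1. The linear equation in $A \otimes A$
$$\mathcal F = \mathrm{Ad}(\phi^{-1} \otimes \mathbb{1})(\mathcal F)\, \mathcal K^{-1} \hat{\mathcal R}\, \mathcal K \tag{4.5}$$
has a unique solution in $(U_q(B_+) \otimes U_q(B_-))^c$, with projection $\mathbb{1} \otimes \mathbb{1}$ on $(U_q(H)^{\otimes 2})^c$, where the superscript $c$ denotes a suitable completion. It is expressed as
$$\mathcal F_{[U_q(\mathfrak g) \to B_{q,\lambda}(\mathfrak g)]} = \mathcal K^{-1} \hat{\mathcal F}\, \mathcal K, \qquad \hat{\mathcal F} = \prod_{k \ge 1}^{\longleftarrow} \mathrm{Ad}(\phi \otimes \mathbb{1})^k\, \hat{\mathcal R}^{-1}, \tag{4.6}$$
with
$$\phi \equiv q^{X} \equiv q^{\sum_{ij} d_{ij} h_i h_j + 2 \sum_i s_i h_i} \qquad (s_i \in \mathbb C). \tag{4.7}$$
2. This solution satisfies the following shifted cocycle relation:
$$\mathcal F_{12}(w)\, (\Delta \otimes \mathbb{1})(\mathcal F(w)) = \mathcal F_{23}\bigl(w q^{h^{\vee (1)}}\bigr)\, (\mathbb{1} \otimes \Delta)(\mathcal F(w)) \tag{4.8}$$
with $w = (w_1, \dots, w_{r_{\mathfrak g}}) = q^s = (q^{s_1}, \dots, q^{s_{r_{\mathfrak g}}}) \in \mathbb C^{r_{\mathfrak g}}$, $w q^{h^\vee} = (w_1 q^{h_1^\vee}, \dots, w_{r_{\mathfrak g}} q^{h^\vee_{r_{\mathfrak g}}})$ and $h_i^\vee = \sum_j d_{ij} h_j$. Hence $\mathcal F(w)$ is a GNF twist.
3. This twist connects $U_q(\mathfrak g)$ to $B_{q,\lambda}(\mathfrak g)$.
The subsequent formulae (4.10), (4.11) taken from [17] will be used in Sect. 5 in their $q \to 1$ limit. Expanding the product formula (4.1) with respect to a Poincaré–Birkhoff–Witt basis ordered with $<$ reads
$$\hat{\mathcal R} = \mathcal R\, \mathcal K^{-1} = \mathbb{1} \otimes \mathbb{1} + \sum_{m \in Z^*} \sigma_m\, e^m \otimes f^m, \tag{4.9}$$
where $Z = \mathrm{Map}(\Delta^+, \mathbb Z_+)$, and $Z^* = Z \setminus \{(0, \dots, 0)\}$. The term $e^m$ (resp. $f^m$) denotes an element of the PBW basis of the deformed enveloping nilpotent subalgebra $U_q(N^+)$ (resp. $U_q(N^-)$). Under the assumptions of the theorem,
$$\hat{\mathcal F} = \mathbb{1} \otimes \mathbb{1} + \sum_{\{p,r\} \in (Z^*)^2} \varphi_{pr}(w)\, e^p \otimes f^r, \tag{4.10}$$
where the $\varphi_{pr}(w)$ belong to $\mathbb C[[s_1, \dots, s_{r_{\mathfrak g}}, s_1^{-1}, \dots, s_{r_{\mathfrak g}}^{-1}, \hbar]] \otimes (U_q(H)^{\otimes 2})^c$. They are defined recursively (using (4.5)) by
$$\Bigl(1 - q^{(-2h^{\vee (1)} + \gamma_p - s\,|\,\gamma_p)}\Bigr)\, \varphi_{pr}(w) = \sum_{\substack{k+m=p \\ l+m=r \\ m \ne 0}} (-1)^{[l][m]}\, a^p_{km}\, b^r_{lm}\, \sigma_m\, q^{(-2h^{\vee (1)} + \gamma_k - s\,|\,\gamma_k)}\, \varphi_{kl}(w). \tag{4.11}$$
In the above equation, $\gamma_p$ is the element of the root lattice associated to $e^p$. The scalar product $(\cdot|\cdot)$ is given by $(x|y) \equiv \sum_{i,j} a_{ij}\, x_i y_j$. The numbers $a^p_{km}$ and $b^r_{lm}$ are defined by
$$e^k e^m = \sum_{p \in Z} a^p_{km}\, e^p \quad \text{and} \quad f^l f^m = \sum_{r \in Z} b^r_{lm}\, f^r . \tag{4.12}$$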
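4.2. In representation for $\mathfrak g = sl_N$. In the fundamental representation for $\mathfrak g = sl_N$, we get
$$R = q^{1/N} \Bigl( \mathbb{1} \otimes \mathbb{1} + (q^{-1} - 1) \sum_a E_{aa} \otimes E_{aa} + (q^{-1} - q) \sum_{a<b} E_{ab} \otimes E_{ba} \Bigr), \tag{4.13}$$
the $N \times N$ matrices $E_{ab}$ being the usual elementary matrices with entry 1 in position $(a, b)$ and 0 elsewhere. The twist is represented by
$$\hat F = \prod_{k \ge 1}^{\longleftarrow} (B \otimes \mathbb{1})^k\, \hat R^{-1}\, (B \otimes \mathbb{1})^{-k}, \tag{4.14}$$
$$F_{[U_q(sl_N) \to B_{q,\lambda}(sl_N)]} = \mathbb{1} \otimes \mathbb{1} + (q - q^{-1}) \sum_{a<b} \frac{w_{ab}}{1 - w_{ab}}\, E_{ab} \otimes E_{ba}, \tag{4.15}$$
with $B = q^{\frac{N-1}{N}}\, \mathrm{diag}(q^{x_a})$, $w_{ab} = q^{x_a - x_b}$ and $x_a = 2s_a - 2s_{a-1}$ with $s_0 = s_N = 0$. The non-vanishing elements $R_{i_1 i_2}^{j_1 j_2}$ of the R-matrix of $B_{q,\lambda}(sl_N)$ are then given by ($1 \le a, b \le N$)
$$R_{aa}^{aa} = q^{1/N}\, \frac{1}{q}, \qquad R_{ab}^{ab} = q^{1/N} \begin{cases} 1 & \text{if } b > a, \\[2pt] \dfrac{(1 - q^2 w_{ab})(1 - q^{-2} w_{ab})}{(1 - w_{ab})^2} & \text{if } b < a, \end{cases} \qquad R_{ab}^{ba} = q^{1/N}\, (q - q^{-1})\, \frac{1}{w_{ab} - 1} . \tag{4.16}$$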
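4.3. In representation for $\mathfrak g = osp(1|2)$. The universal R-matrix of $U_q(osp(1|2))$ was initially obtained in [24, 25]. As indicated in Section 2, we use the finite R-matrix obtained as the evaluation of the universal R-matrix given in [19] (changing $q$ to $q^{-1}$):
$$\tilde R_{12} = \begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 0 & q^2 - 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & q^{-1} - q & 0 & 0 \\
0 & q^{-1} - q & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & q^{-1} & 0 & 0 & 0 & 0 \\
q^{-1} - q & 0 & 0 & 0 & 0 & q & 0 & (q^{-1} - q)(q + 1) & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & q & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & q^{-1}
\end{pmatrix}$$
The twist from $U_q(osp(1|2))$ to $B_{q,\lambda}(osp(1|2))$ is represented by
$$\begin{aligned} F = \mathbb{1} \otimes \mathbb{1} & - \frac{w(q - q^{-1})}{w - q}\, (E_{13} \otimes E_{31} - q\, E_{13} \otimes E_{12}) \\ & - \frac{qw(q - q^{-1})}{qw - 1}\, (E_{21} \otimes E_{12} - q^{-1} E_{21} \otimes E_{31}) \\ & - \frac{w^2 (q - q^{-1})(q + 1)}{(qw - 1)(w + 1)}\, (E_{23} \otimes E_{32}). \end{aligned} \tag{4.17}$$
The resulting R-matrix of $B_{q,\lambda}(osp(1|2))$ is given by $F_{21}\, R\, F_{12}^{-1}$. The corresponding matrix $\tilde R$ has the following expression:
$$\tilde R = \begin{pmatrix}
r_{1111} & 0 & 0 & 0 & 0 & \frac{(q^2-1)qw}{q-w} & 0 & r_{1132} & 0 \\
0 & \frac{(q^3 w-1)(w-q)}{q(qw-1)^2} & 0 & \frac{(q^2-1)w}{1-qw} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & \frac{q^2-1}{w-q} & 0 & 0 \\
0 & \frac{q-q^{-1}}{qw-1} & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & q^{-1} & 0 & 0 & 0 & 0 \\
\frac{q-q^{-1}}{qw-1} & 0 & 0 & 0 & 0 & q & 0 & \frac{(q^2-1)(q+1)}{(w-q)(w+1)} & 0 \\
0 & 0 & \frac{(q-q^{-1})w}{q-w} & 0 & 0 & 0 & \frac{(q^3-w)(1-qw)}{q(q-w)^2} & 0 & 0 \\
r_{3211} & 0 & 0 & 0 & 0 & \frac{(q^2-1)(q+1)w^2}{(1-qw)(w+1)} & 0 & \frac{(q^2 w+1)(q^2+w)}{q(w+1)^2} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & q^{-1}
\end{pmatrix},$$
where
$$r_{1111} = \frac{q^2 (w-1)^2 + w(q-1)(q^3-1)}{q(qw-1)(q-w)}, \qquad r_{1132} = \frac{(q^2-1)(q^2+w)(1-qw)}{(q-w)^2 (w+1)} \qquad \text{and} \qquad r_{3211} = \frac{(q^2-1)\, w\, (q^2 w+1)(w-q)}{q^2 (qw-1)^2 (w+1)} .$$
It obeys the ordinary dynamical Yang–Baxter equation.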
5. Twist from U(g) to Us (g) 5.1. Universal form and cocycle condition. We construct a GNF-twist F(s), obtained as the scaling limit of the twist F(λ) (see Eq. (4.6)), which, once applied to the Hopf
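algebra $U(g)$, leads to a QTQHA denoted $U_s(g)$, as illustrated by the following diagram:
\[
\begin{array}{ccc}
U_q(g) & \xrightarrow{\;F(\lambda)\;} & B_{q,\lambda}(g) \\[4pt]
\Big\downarrow{\scriptstyle\, q\to 1} & & \Big\downarrow{\scriptstyle\, q\to 1,\ \lambda\to 1\ \text{(scaling limit)}} \\[4pt]
U(g) & \xrightarrow{\;F(s)\;} & U_s(g)
\end{array}
\tag{5.1}
\]
Taking the scaling limit corresponds here to considering representatives in the quotient $U_\hbar(g)/(\hbar U_\hbar(g))$, since there is no spectral parameter. In particular the scaling limit of $F(\lambda)$ is given by
\[
F(s) = \mathbb{1}\otimes\mathbb{1} + \sum_{\{p,r\}\in(\mathbb{Z}^{*})^{2}} \varphi_{pr}(s)\, e^{p}\otimes f^{r},
\tag{5.2}
\]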
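where the functions $\varphi_{pr}(s)$ are the representatives of those appearing in (4.10).
Proposition 5.1. $F(s)$ can be written as a product (for any $\xi \neq 0$)
\[
F(s) = \prod_{k\ge 1}^{\leftarrow} \bigl((\xi\mathbb{1} + X)\otimes\mathbb{1}\bigr)^{-k}
\Bigl(\mathbb{1}\otimes\mathbb{1} + \bigl((\xi\mathbb{1} + X)\otimes\mathbb{1}\bigr)^{-1} r\Bigr)
\bigl((\xi\mathbb{1} + X)\otimes\mathbb{1}\bigr)^{k},
\tag{5.3}
\]
where $X$ was defined in (4.7) and $R = \mathbb{1}\otimes\mathbb{1} + \hbar\, r + o(\hbar)$. It satisfies the linear equation
\[
[X, F(s)] = F(s)\, r.
\tag{5.4}
\]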
It obeys in $U(g)^{\otimes 3}$ the shifted cocycle equation
\[
F_{12}(s)\,(\Delta\otimes\mathbb{1})(F(s)) = F_{23}(s + h^{\vee(1)})\,(\mathbb{1}\otimes\Delta)(F(s))
\tag{5.5}
\]
defining a QTQHA denoted $U_s(g)$ with R-matrix $R(s) = F_{21}(s)\,F_{12}(s)^{-1}$.
Proof. The consistency of the procedure follows from the well-known Hopf algebra identification $U_\hbar(g)/(\hbar U_\hbar(g)) \simeq U(g)$ [26]. In this quotient, the twist $F(s)$ coincides with the representative of the twist $F(\lambda)$ (see (4.6)). It is given by (5.2), a formula analogous to (4.10). The functions $\varphi_{pr}(s)$ satisfy recursion equations obtained as the leading order in $\hbar$ of (4.11). This procedure is well-defined since the coefficient of $\varphi$ in the left-hand side of (4.11) is of order $\hbar$ and the coefficients in the right-hand side are at least of order 1 in $\hbar$ (due to the presence of $\sigma_m$). The leading order in $\hbar$ of (4.11) can be expressed as Eq. (5.4) for $F$. Equation (5.4) is also the first non-trivial term in the expansion in $\hbar$ of (4.5). Under a similar hypothesis on $F$ as in Sect. 4, replacing q-deformed enveloping algebras by classical enveloping algebras, Eq. (5.4) has a unique solution expressed either by (5.2) or as the infinite product (5.3). Note that for $\xi \neq 0$, $\xi\mathbb{1} + X$ is invertible in formal power series in $U(g)$, hence formula (5.3) makes sense. By uniqueness of the twist, the result is independent of $\xi$. It then follows from the Hopf algebra identification that $F(s)$ satisfies the shifted cocycle condition (5.5).
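5.2. In representation for $g = sl_N$. In the fundamental representation for $g = sl_N$, the evaluated infinite product expression for $F$ reads
\[
F_{[U(sl_N)\to U_s(sl_N)]} = \prod_{k\ge 1}^{\leftarrow} (X\otimes\mathbb{1})^{-k}\; Y\; (X\otimes\mathbb{1})^{k}
\tag{5.6}
\]
with
\[
X = \frac{N-1}{N}\,\mathbb{1} + \mathrm{diag}(x_a), \qquad x_a = 2s_a - 2s_{a-1}, \quad s_0 = s_N = 0,
\tag{5.7}
\]
\[
Y = \mathbb{1}\otimes\mathbb{1} + (X\otimes\mathbb{1})^{-1}\, r
  = \mathbb{1}\otimes\mathbb{1} + (X\otimes\mathbb{1})^{-1}\Bigl(-2\sum_{a<b} E_{ab}\otimes E_{ba}\Bigr),
\tag{5.8}
\]
where
\[
r = -2\sum_{a<b} E_{ab}\otimes E_{ba}
\tag{5.9}
\]
is the r-matrix of $U(sl_N)$.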
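Regarding the R-matrix of $U_s(sl_N)$, its non-vanishing matrix elements $R^{j_1 j_2}_{i_1 i_2}$ are given by ($1 \le a, b \le N$)
\[
R^{aa}_{aa} = 1, \qquad
R^{ab}_{ab} =
\begin{cases}
1 & \text{if } b > a \\[4pt]
1 - \dfrac{4}{(x_a - x_b)^{2}} & \text{if } b < a
\end{cases},
\qquad
R^{ba}_{ab} = \frac{2}{x_a - x_b} \quad \text{for } a \neq b,
\tag{5.10}
\]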
which indeed satisfies the dynamical Yang–Baxter equation (2.12).
5.3. In representation for $g = osp(1|2)$. Using again the infinite product expression (5.3), the twist from $U(osp(1|2))$ to $U_s(osp(1|2))$ is represented by
\[
F = \mathbb{1}\otimes\mathbb{1}
- \frac{2}{s-1}\,\bigl(E_{13}\otimes E_{31} - E_{13}\otimes E_{12}\bigr)
- \frac{2}{s+1}\,\bigl(E_{21}\otimes E_{12} + E_{23}\otimes E_{32} - E_{21}\otimes E_{31}\bigr).
\tag{5.11}
\]
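The statement that (5.11) is the scaling limit of (4.17) can be checked numerically on the scalar prefactors. The sketch below is only an illustration: it assumes the parametrisation $q = e^{h}$, $w = q^{s}$ with $h \to 0$, which is not spelled out in this section, and the function names are hypothetical.

```python
import math

def trigonometric_coeffs(q, w):
    """Scalar prefactors of the three terms of (4.17), overall signs stripped."""
    c13 = w * (q - 1 / q) / (w - q)
    c21 = q * w * (q - 1 / q) / (q * w - 1)
    c23 = w * w * (q - 1 / q) * (q + 1) / ((q * w - 1) * (w + 1))
    return c13, c21, c23

def rational_coeffs(s):
    """Scalar prefactors of the corresponding terms of (5.11)."""
    return 2 / (s - 1), 2 / (s + 1), 2 / (s + 1)

def max_deviation(s, h):
    # Assumed parametrisation of the scaling limit: q = e^h, w = q^s, h -> 0.
    q = math.exp(h)
    w = q ** s
    pairs = zip(trigonometric_coeffs(q, w), rational_coeffs(s))
    return max(abs(a - b) for a, b in pairs)
```

Under this assumption the deviation shrinks linearly with $h$; the relative factors $q$ and $q^{-1}$ inside the brackets of (4.17) tend to 1 and reproduce the signs in (5.11).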
The R-matrix of $U_s(osp(1|2))$ is then straightforwardly $R(s) = F_{21}(s)\,F_{12}^{-1}(s)$. The corresponding matrix R satisfies the ordinary dynamical Yang–Baxter equation.
5.4. In representation for $g = sl(1|2)$. Similarly, the twist from $U(sl(1|2))$ to $U_s(sl(1|2))$ is represented by
\[
F = \mathbb{1}\otimes\mathbb{1}
+ \frac{1}{s_2 - 1}\,E_{31}\otimes E_{13}
+ \frac{1}{s_1 + s_2}\,E_{21}\otimes E_{12}
- \frac{1}{s_1 + 1}\,E_{32}\otimes E_{23}.
\tag{5.12}
\]
The R-matrix of $U_s(sl(1|2))$ is then given by $R(s) = F_{21}(s)\,F_{12}^{-1}(s)$. Again, the corresponding matrix R satisfies the ordinary dynamical Yang–Baxter equation.
6. Twist from $U_q(\hat g)$ to $U_{q,\lambda}(\hat g)$
6.1. Universal form. Lemma 6.1. The twist (4.6) applied to the universal R-matrix of $U_q(\hat g)$ leads to the R-matrix of a QTQHA denoted $U_{q,\lambda}(\hat g)$.
Proof. Using the fact that $U_q(g)$ is a Hopf subalgebra of $U_q(\hat g)$, the twist (4.6) can be used to construct the dynamical algebra $U_{q,\lambda}(\hat g)$. Indeed, $F_{[U_q(g)\to B_{q,\lambda}(g)]}$ seen as an element of $U_q(\hat g)^{\otimes 2}$ satisfies the shifted cocycle condition, yielding a dynamical R-matrix
\[
R_{U_{q,\lambda}(\hat g)}(w) = F_{21}(w)\, R_{U_q(\hat g)}\, F_{12}^{-1}(w).
\tag{6.1}
\]
Note that in [27], a twist inherited from a Hopf subalgebra was already used to obtain a Hopf algebra deformation of the Yangian Y (sl2 ).
6.2. In representation for $g = sl_N$. Proposition 6.2. The evaluation representation of $R_{U_{q,\lambda}(\hat g)}(w)$ for $g = sl_N$ is identified with the $p \to 0$ limit of the evaluation representation of the elliptic $B_{q,p,\lambda}(\widehat{sl}_N)$ R-matrix defined in Appendix B.
Proof. Direct computation of the matrix elements $R^{j_1 j_2}_{i_1 i_2}$ of the R-matrix of $U_{q,\lambda}(\widehat{sl}_N)$ ($1 \le a, b \le N$) gives:
\[
\begin{aligned}
R^{aa}_{aa} &= \rho_{U_{q,\lambda}}(z), \\
R^{ab}_{ab} &= \rho_{U_{q,\lambda}}(z)
\begin{cases}
\dfrac{q(1 - z)}{1 - q^{2}z} & \text{if } b > a \\[8pt]
\dfrac{q(1 - z)}{1 - q^{2}z}\;\dfrac{(1 - w_{ab}\,q^{2})(1 - w_{ab}\,q^{-2})}{(1 - w_{ab})^{2}} & \text{if } b < a
\end{cases}, \\
R^{ba}_{ab} &= \rho_{U_{q,\lambda}}(z)\,\frac{(1 - q^{2})(1 - w_{ab}z)}{(1 - q^{2}z)(1 - w_{ab})},
\end{aligned}
\tag{6.2}
\]
the normalisation factor being given by
\[
\rho_{U_{q,\lambda}}(z) = q^{-\frac{N-1}{N}}\,
\frac{(q^{2}z; q^{2N})_\infty\,(q^{2N-2}z; q^{2N})_\infty}{(z; q^{2N})_\infty\,(q^{2N}z; q^{2N})_\infty}.
\tag{6.3}
\]
We recognise the limit $p \to 0$ of the R-matrix (B.3). This R-matrix satisfies the dynamical Yang–Baxter equation (2.13).
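The $p \to 0$ identification can be illustrated numerically for the entries $R^{ab}_{ab}$, with the common normalisation factor divided out. The sketch below is a hedged check, not part of the proof: the helper names, the truncation of the infinite products at 200 factors and the sample values of $q$, $w_{ab}$, $z$ are all assumptions made for the example.

```python
def qpoch(z, p, terms=200):
    """Truncated infinite product (z; p)_infty = prod_{n>=0} (1 - z p^n)."""
    prod = 1.0
    for n in range(terms):
        prod *= 1 - z * p ** n
    return prod

def theta(z, p):
    """Theta_p(z) = (z; p)_infty (p/z; p)_infty (p; p)_infty, as in Appendix B."""
    return qpoch(z, p) * qpoch(p / z, p) * qpoch(p, p)

def elliptic_diag(q, p, w, z, b_gt_a):
    """R^{ab}_{ab} / rho read off from (B.3)."""
    pref = q * theta(z, p) / theta(q * q * z, p)
    if b_gt_a:
        num = qpoch(p * q * q / w, p) * qpoch(p / (q * q * w), p)
        return pref * num / qpoch(p / w, p) ** 2
    num = qpoch(q * q / w, p) * qpoch(1 / (q * q * w), p)
    return pref * num / qpoch(1 / w, p) ** 2

def trig_diag(q, w, z, b_gt_a):
    """R^{ab}_{ab} / rho read off from (6.2)."""
    pref = q * (1 - z) / (1 - q * q * z)
    if b_gt_a:
        return pref
    return pref * (1 - w * q * q) * (1 - w / (q * q)) / (1 - w) ** 2
```

For small $p$ the two functions agree up to corrections of order $p$, for both orderings of $a$ and $b$.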
6.3. In representation for g = osp(1|2). We first construct a represented R-matrix of through a Baxterisation procedure [28]. We get two R-matrices with spectral osp(1|2) 12 , R −1 and P 12 (the non-graded perparameter constructed as a linear combinations of R 21 mutation) which obey the non-graded Yang–Baxter equation (with spectral parameter). They read: = R(z)
1−z 12 R (1 − za)(1 − zq 2 ) aq 2 z(1 − z) −1 z(1 − a)(1 − q 2 ) − + R P12 21 (1 − za)(1 − zq 2 ) (1 − za)(1 − zq 2 )
(6.4)
with a = −q or q 3 , 12 , R(∞) −1 and R(1) where the normalisations are such that R(0) = R = q −2 R = 21 −1 3 q P12 . Explicitly, for a = q , we get: zq 2 (1−z)(q 2 −1) (1−z)(q 2 −1) r1111
0 0 0 = R(z) 0 (1−z)(q −1 −q) (1−zq 2 )(1−zq 3 ) 0 zq(1−z)(q 2 −1) (1−zq 2 )(1−zq 3 )
0
−
0
0
0
0
1−z 1−zq 2
0
(q −1 −q)z 1−zq 2
0
0
0
0
0
0
1−z 1−zq 2
0
0
0
q −1 −q 1−zq 2
0
0
q −1 −q 1−zq 2
(1−zq 2 )(1−zq 3 )
0
0
(1−zq 2 )(1−zq 3 )
0
1−z 1−zq 2
0
0
0
0
0
0
0
q −1
0
0
0
0
0
0
0
0
q(1−z)(1−zq) (1−zq 2 )(1−zq 3 )
0
r2332
0
0
(q −1 −q)z 1−zq 2
0
1−z 1−zq 2
0
0
0
0
0
0
0
0
0
r3223
0
q(1−z)(1−zq) (1−zq 2 )(1−zq 3 )
0
0
0
0
0
0
0
−z2 q 4 +zq 5 +zq 4 −zq 3 −zq 2 +zq+z−q , q(1−zq 2 )(1−zq 3 ) 3 2 2 −1 z(q−q )(zq +zq −q −1) . (1−zq 2 )(1−zq 3 )
with r1111 =
r2332 =
0
q −1
(q−q −1 )(zq 3 +zq−q−1) (1−zq 2 )(1−zq 3 )
and
r3223 =
Applying to the above matrix the twist given by formula (4.17) allows one to get the explicit evaluated R-matrix for the QTQHA $U_{q,\lambda}(osp(1|2))$. Due to its cumbersome nature, we omit this explicit form here.
7. Twist from $DY(g)$ to $DY_s(g)$
7.1. Universal form. Lemma 7.1. The twist (5.2) applied to the universal R-matrix of $DY(g)$ leads to the R-matrix of a QTQHA denoted $DY_s(g)$.
Proof. We similarly use the fact that $U(g)$ is a Hopf subalgebra of $DY(g)$, $g$ belonging to the (super) unitary series. Hence the cocycle identity (5.5) is also an identity in $DY(g)^{\otimes 3}$ for $F$ defined as in (5.2), now considered as an element of $DY(g)^{\otimes 2}$. This twist, applied to the universal R-matrix of $DY(g)$ (given in [9]), yields a dynamical R-matrix
\[
R(s) = F_{21}(s)\, R\, F_{12}^{-1}(s)
\tag{7.1}
\]
which characterises $DY_s(g)$ as a QTQHA.
7.2. In representation for $g = sl_N$. Proposition 7.2. The evaluation representation of $R(s)$ for $g = sl_N$ is identified with the scaling limit of the R-matrix (6.2) of $U_{q,\lambda}(\widehat{sl}_N)$.
Proof. We apply the twist (5.9) to the R-matrix of $DY(sl_N)$, given in (3.24)–(3.25), to get the R-matrix of $DY_s(sl_N)$. In the fundamental representation, it has the following non-vanishing elements $R^{j_1 j_2}_{i_1 i_2}$ ($1 \le a, b \le N$):
\[
\begin{aligned}
R^{aa}_{aa} &= \rho_{DY_s}(u), \\
R^{ab}_{ab} &= \rho_{DY_s}(u)
\begin{cases}
\dfrac{u}{u+1} & \text{if } b > a \\[8pt]
\Bigl(1 - \dfrac{4}{(x_a - x_b)^{2}}\Bigr)\dfrac{u}{u+1} & \text{if } b < a
\end{cases}, \\
R^{ba}_{ab} &= \rho_{DY_s}(u)\,\Bigl(1 + \frac{2u}{x_a - x_b}\Bigr)\frac{1}{u+1},
\end{aligned}
\tag{7.2}
\]
the normalisation factor being $\rho_{DY_s}(u) = \rho_{DY}(u)$. We recognise the scaling limit of the R-matrix (6.2) of $U_{q,\lambda}(\widehat{sl}_N)$. This R-matrix satisfies the dynamical Yang–Baxter equation (2.14).
7.3. In representation for $g = sl(1|2)$. Similarly, by applying the twist (5.12) to the R-matrix of $DY(sl(1|2))$, one gets the evaluated R-matrix for the QTQHA $DY_s(sl(1|2))$:
1−u = R
1+u
0
0
0
us2 (s2 −2) (1+u)(s2 −1)2
0
0
0
0
s2 −1−u (1+u)(s2 −1)
0 0 0
0
u(1−(s1 +s2 )2 ) (1+u)(s1 +s2 )2
0
0
0
0
0
s2 −1+u (1+u)(s2 −1) 0
0
0
0
0
0
s1 +s2 +u (1+u)(s1 +s2 )
0
0
0
0
0
u 1+u
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
s1 +s2 −u (1+u)(s1 +s2 )
us1 (s1 +2) (1+u)(s1 +1)2
0
0
0
0
0
0
0
0
0
0
s1 +1−u (1+u)(s1 +1)
0
0
0
0
0
0
0 −u 1+u
s1 +1+u (1+u)(s1 +1)
0
0
0
0
u 1+u
0
0
0
1−u 1+u
.
It satisfies the dynamical Yang–Baxter equation (2.14).
A. Notations
Multiple Gamma functions are defined by
\[
\Gamma_r(x|\omega_1,\dots,\omega_r) = \exp\Bigl(\frac{\partial}{\partial s}\sum_{n_1,\dots,n_r\ge 0} (x + n_1\omega_1 + \dots + n_r\omega_r)^{-s}\Big|_{s=0}\Bigr).
\tag{A.1}
\]
Barnes' multiple sine function $S_r(x|\omega_1,\dots,\omega_r)$ of periods $\omega_1,\dots,\omega_r$ is defined by [29]
\[
S_r(x|\omega_1,\dots,\omega_r) = \Gamma_r(\omega_1 + \dots + \omega_r - x|\omega_1,\dots,\omega_r)^{(-1)^{r}}\;\Gamma_r(x|\omega_1,\dots,\omega_r)^{-1}.
\tag{A.2}
\]
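They satisfy for each $i \in [1,\dots,r]$,
\[
\frac{\Gamma_r(x + \omega_i|\omega_1,\dots,\omega_r)}{\Gamma_r(x|\omega_1,\dots,\omega_r)} = \frac{1}{\Gamma_{r-1}(x|\omega_1,\dots,\omega_{i-1},\omega_{i+1},\dots,\omega_r)},
\tag{A.3}
\]
\[
\frac{S_r(x + \omega_i|\omega_1,\dots,\omega_r)}{S_r(x|\omega_1,\dots,\omega_r)} = \frac{1}{S_{r-1}(x|\omega_1,\dots,\omega_{i-1},\omega_{i+1},\dots,\omega_r)}.
\tag{A.4}
\]
In particular,
\[
\Gamma_1(x|\omega_1) = \frac{\omega_1^{x/\omega_1}}{\sqrt{2\pi\omega_1}}\;\Gamma\Bigl(\frac{x}{\omega_1}\Bigr), \qquad
S_1(x|\omega_1) = 2\sin\frac{\pi x}{\omega_1}.
\tag{A.5}
\]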
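The closed forms (A.5) make it easy to test the recurrences (A.3) and (A.4) and the definition (A.2) numerically for $r = 1$. The following sketch is illustrative only; it assumes that the definitions extend to $r = 0$ in the natural way (empty lattice sum), which gives $\Gamma_0(x) = 1/x$ and $S_0(x) = -1$. The function names are hypothetical.

```python
import math

def gamma1(x, w):
    """Gamma_1(x|w) from (A.5): w**(x/w) / sqrt(2 pi w) * Gamma(x/w)."""
    return w ** (x / w) / math.sqrt(2 * math.pi * w) * math.gamma(x / w)

def sine1(x, w):
    """S_1(x|w) = 2 sin(pi x / w) from (A.5)."""
    return 2 * math.sin(math.pi * x / w)

def gamma0(x):
    # Assumption: (A.1) with r = 0 reduces to exp(d/ds x**(-s) at s = 0) = 1/x.
    return 1.0 / x

def sine0(x):
    # Assumption: (A.2) with r = 0 reads Gamma_0(-x) / Gamma_0(x), which is -1.
    return gamma0(-x) / gamma0(x)

x, w = 0.7, 1.3
shift_gamma = gamma1(x + w, w) / gamma1(x, w)          # compare with 1 / Gamma_0(x)
shift_sine = sine1(x + w, w) / sine1(x, w)             # compare with 1 / S_0(x)
sine_from_gammas = 1 / (gamma1(w - x, w) * gamma1(x, w))  # (A.2) with r = 1
```

The last line uses the reflection formula $\Gamma(t)\Gamma(1-t) = \pi/\sin(\pi t)$ implicitly: the product of the two $\Gamma_1$ factors collapses to $1/(2\sin(\pi x/\omega_1))$.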
B. Definition of $B_{q,p,\lambda}(\widehat{sl}_N)$
The quantum affine elliptic algebra $B_{q,p,\lambda}(\widehat{sl}_N)$ was originally defined in [4] using the RLL formalism. The characteristic R-matrix takes the following form ($1 \le a, b \le N$):
\[
R(z) = \rho_{B_{q,p,\lambda}}(z)\Bigl(\sum_{a} E_{aa}\otimes E_{aa}
+ q^{2}\sum_{a\neq b}\frac{\Theta_p(q^{-2}w_{ab})\,\Theta_p(z)}{\Theta_p(w_{ab})\,\Theta_p(q^{2}z)}\,E_{aa}\otimes E_{bb}
+ \sum_{a\neq b}\frac{\Theta_p(w_{ab}z)\,\Theta_p(q^{2})}{\Theta_p(w_{ab})\,\Theta_p(q^{2}z)}\,E_{ab}\otimes E_{ba}\Bigr),
\tag{B.1}
\]
where $\Theta_p(z) = (z;p)_\infty\,(pz^{-1};p)_\infty\,(p;p)_\infty$ and $w_{ab} = q^{x_a - x_b}$, the multiple infinite products being defined by $(z;p_1,\dots,p_m)_\infty = \prod_{n_i\ge 0}(1 - z\,p_1^{n_1}\cdots p_m^{n_m})$. The normalisation factor $\rho_{B_{q,p,\lambda}}(z)$ is given by
\[
\rho_{B_{q,p,\lambda}}(z) = q^{-\frac{N-1}{N}}\,
\frac{(q^{2}z; q^{2N},p)_\infty\,(q^{2N-2}z; q^{2N},p)_\infty}{(z; q^{2N},p)_\infty\,(q^{2N}z; q^{2N},p)_\infty}
\times
\frac{(pz^{-1}; q^{2N},p)_\infty\,(pq^{2N}z^{-1}; q^{2N},p)_\infty}{(pq^{2}z^{-1}; q^{2N},p)_\infty\,(pq^{2N-2}z^{-1}; q^{2N},p)_\infty}.
\tag{B.2}
\]
It was proven in [16] that this quantum affine elliptic algebra was a QTQHA obtained by a Drinfel'd twist of shifted-cocycle type from $U_q(\widehat{sl}_N)$. The R-matrix thus obtained is actually a gauge transform of (B.1) where the coefficients of $E_{aa}\otimes E_{bb}$ become
\[
R^{ab}_{ab} = \rho_{B_{q,p,\lambda}}(z)
\begin{cases}
q\,\dfrac{\Theta_p(z)}{\Theta_p(q^{2}z)}\;\dfrac{(p\,w_{ab}^{-1}q^{2};p)_\infty\,(p\,w_{ab}^{-1}q^{-2};p)_\infty}{(p\,w_{ab}^{-1};p)_\infty^{2}} & \text{if } b > a \\[10pt]
q\,\dfrac{\Theta_p(z)}{\Theta_p(q^{2}z)}\;\dfrac{(w_{ab}^{-1}q^{2};p)_\infty\,(w_{ab}^{-1}q^{-2};p)_\infty}{(w_{ab}^{-1};p)_\infty^{2}} & \text{if } b < a
\end{cases}.
\tag{B.3}
\]
Acknowledgements. This work was supported in part by CNRS, EC network contract number FMRX-CT960012 and PICS contract number 911. J.A. wishes to thank the LAPTH for its kind hospitality. The authors thank M. Rossi for discussions at an early stage of this work.
References
1. Konno, H.: Degeneration of the elliptic algebra $A_{q,p}(sl(2))$ and form factors in the Sine-Gordon theory. In: Proceedings of the Nankai-CRM joint meeting on "Extended and Quantum Algebras and their Applications to Physics", Tianjin, China, 1996, to appear in the CRM series in mathematical physics, Springer Verlag, and hep-th/9701034
2. Khoroshkin, S., Lebedev, D., Pakuliak, S.: Elliptic algebra $A_{q,p}(\widehat{sl}_2)$ in the scaling limit. Commun. Math. Phys. 190, 597 (1998) and q-alg/9702002
3. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T., Yan, H.: An elliptic quantum algebra for $\widehat{sl}_2$. Lett. Math. Phys. 32, 259 (1994) and hep-th/9403094
4. Felder, G.: Elliptic quantum groups. Proc. ICMP Paris 1994, pp. 211 and hep-th/9412207
5. Gervais, J.L., Neveu, A.: Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B238, 125 (1984)
6. Arnaudon, D., Avan, J., Frappat, L., Ragoucy, E., Rossi, M.: Towards a cladistics of double Yangian and elliptic algebras. J. Phys. A (Math. Gen.) 33, 6279 (2000) and math.QA/9906189
7. Jimbo, M., Konno, H., Miwa, T.: Massless XXZ model and degeneration of the elliptic algebra $A_{q,p}(\widehat{sl}(2))$. In: Ascona 1996, "Deformation theory and symplectic geometry", pp. 117–138 and hep-th/9610079
8. Bernard, D., LeClair, A.: The quantum double in integrable quantum field theory. Nucl. Phys. B399, 709 (1993)
9. Khoroshkin, S.M., Tolstoy, V.N.: Yangian double (and rational R matrix). Lett. Math. Phys. 36, 373 (1996) and hep-th/9406194
10. Faddeev, L.D., Reshetikhin, N.Yu., Takhtajan, L.A.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193 (1990)
11. Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419 (1990)
12. Babelon, O.: Universal Exchange Algebra for Bloch Waves and Liouville Theory. Commun. Math. Phys. 139, 619 (1991)
13. Babelon, O., Bernard, D., Billey, E.: A quasi-Hopf algebra interpretation of quantum 3-j and 6-j symbols and difference equations. Phys. Lett. B375, 89 (1996)
14. Khoroshkin, S.M.: Central extension of the Yangian double. Collection SMF, 7ème rencontre du contact franco-belge en algèbre, Reims 1995. q-alg/9602031
15. Arnaudon, D., Avan, J., Frappat, L., Ragoucy, E., Rossi, M.: On the Quasi-Hopf structure of deformed double Yangians. Lett. Math. Phys. 51, 193 (2000) and math.QA/0001034
16. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. "Transformation Groups" and q-alg/9712029
17. Arnaudon, D., Buffenoir, E., Ragoucy, E., Roche, Ph.: Universal solutions of quantum dynamical Yang–Baxter equations. Lett. Math. Phys. 44, 201 (1998) and q-alg/9712037
18. Arnaudon, D., Avan, J., Frappat, L., Rossi, M.: Deformed double Yangian structures. Rev. Math. Phys. 12, 945 (2000) and math.QA/9905100
19. Khoroshkin, S.M., Tolstoy, V.N.: Universal R-matrix for quantized (super)algebras. Commun. Math. Phys. 141, 599 (1991)
20. Khoroshkin, S., Tolstoy, V.: Twisting of quantum (super)algebras. Connection of Drinfeld's and Cartan–Weyl realizations for quantum affine algebras. hep-th/9404036
21. Buffenoir, E., Roche, Ph.: Harmonic analysis on the quantum Lorentz group. Commun. Math. Phys. 207, 499–555 (1999) and q-alg/9710022
22. Yang, W.-L., Zhen, Y.: Modular transformation and twist between trigonometric limits of sl(n) elliptic R-matrix. math-ph/0103028
23. Arnaudon, D., Chryssomalakos, C., Frappat, L.: Classical and Quantum sl(1|2) Superalgebras, Casimir Operators and Quantum Chain Invariants. J. Math. Phys. 36, 5262 (1995) and q-alg/9503021
24. Kulish, P.P., Reshetikhin, N.Y.: Universal R matrix of the quantum superalgebra osp(2|1). Lett. Math. Phys. 18, 143 (1989)
25. Saleur, H.: Quantum osp(1|2) and solutions of the graded Yang–Baxter equation. Nucl. Phys. B336, 363 (1990)
26. Drinfeld, V.G.: Quantum Groups. Proc. Int. Congress of Mathematicians, Berkeley, California, Vol. 1. New York: Academic Press, 1986, p. 798
27. Khoroshkin, S., Stolin, A., Tolstoy, V.: Deformation of Yangian $Y(sl_2)$. Comm. Algebra 26, 1041 (1998) and q-alg/9511005
28. Jones, V.F.R.: Baxterization. Int. J. Mod. Phys. B4, 701 (1990). Proceedings of "Yang-Baxter equations, conformal invariance and integrability in statistical mechanics and field theory". Canberra, 1989
29. Barnes, E.W.: The theory of the double gamma function. Philos. Trans. Roy. Soc. A196, 265 (1901)
Communicated by L. Takhtajan
Commun. Math. Phys. 226, 205 – 220 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
On the Existence of the Absolutely Continuous Component for the Measure Associated with Some Orthogonal Systems Sergey A. Denisov1,2 1 California Institute of Technology, Mathematics, 253-37, Caltech, Pasadena, CA 91125, USA.
E-mail: [email protected]
2 Moscow State University, Faculty of Computational Mathematics and Cybernetics, Vorob’evy Gory,
119899 Moscow, Russia. E-mail: [email protected] Received: 26 June 2001 / Accepted: 18 October 2001
Abstract: In this article, we consider two orthogonal systems: Sturm–Liouville operators and Krein systems. For Krein systems, we study the behavior of generalized polynomials at infinity for spectral parameters in the upper half-plane. That makes it possible to establish the presence of an absolutely continuous component of the associated measure. For the Sturm–Liouville operator on the half-line with bounded potential $q$, we prove that the essential support of the absolutely continuous component of the spectral measure is $[m, \infty)$ if $\limsup_{x\to\infty} q(x) = m$ and $q' \in L^2(\mathbb{R}^+)$. That holds for all boundary conditions at zero. This result partially solves one open problem stated recently by S. Molchanov, M. Novitskii, and B. Vainberg. We also consider some other classes of potentials.
1. Introduction
The contents of the paper are as follows. In the first section, we prove the asymptotics of generalized polynomials for Krein systems with coefficients of a special kind. We also establish the presence of an absolutely continuous component of the associated measure. In the second section, the results obtained for Krein systems are applied to Sturm–Liouville operators. In this introductory section, we recall some results for the Krein systems. The Krein systems are defined by the equations
\[
\begin{cases}
\dfrac{dP(x,\lambda)}{dx} = i\lambda P(x,\lambda) - \overline{A(x)}\,P_*(x,\lambda), & P(0,\lambda) = 1, \\[8pt]
\dfrac{dP_*(x,\lambda)}{dx} = -A(x)\,P(x,\lambda), & P_*(0,\lambda) = 1,
\end{cases}
\tag{1}
\]
where $A(x)$ is a locally summable function on $\mathbb{R}^+$.
In his famous work [6], M. G. Krein showed that $P(x,\lambda)$ have many properties of polynomials orthogonal on the unit circle$^1$. For example, there exists a non-decreasing function $\sigma(\lambda)$ (the spectral measure) defined on the whole line such that the mapping $U_P(f) = \int_0^\infty f(x)P(x,\lambda)\,dx$ is an isometry from $L^2(\mathbb{R}^+)$ to $L^2(\sigma,\mathbb{R})$. We will call $P(x,\lambda)$ the generalized polynomials. The following Theorem was stated in [6]. It was proved later in [12].
Theorem 1 ([6, 12]). The following statements are equivalent:
(1) The integral $\displaystyle\int_{-\infty}^{\infty}\frac{\ln\sigma'(\lambda)}{1+\lambda^{2}}\,d\lambda$ is finite.
(2) At least at some $\lambda$, $\Im\lambda > 0$, the integral
\[
\int_0^\infty |P(x,\lambda)|^{2}\,dx
\tag{2}
\]
converges.
(3) At least at some $\lambda$, $\Im\lambda > 0$, the function $P_*(x,\lambda)$ is bounded.
(4) On any compact set in the open upper half-plane, integral (2) converges uniformly. That is equivalent to the existence of the uniform limit $\Pi(\lambda) = \lim_{x_n\to\infty} P_*(x_n,\lambda)$ [19].
Consider some measure $\mu$ on $\mathbb{R}$. Let $I$ be a finite union of intervals on $\mathbb{R}$. Let us assume that for any measurable set $\Omega \subset I$ with positive Lebesgue measure ($|\Omega| > 0$), we have $\mu(\Omega) > 0$. That already means that $\mu$ has a nontrivial absolutely continuous component. Not every measure with a nontrivial absolutely continuous component has this property, however. If this condition holds, we say that the essential support of the absolutely continuous component of the measure $\mu$ is $I$ ($I \subseteq \mathrm{essupp}\{\mu_{ac}\}$). Thus, if one of the conditions (2)–(4) holds, then the essential support of $\sigma_{ac}(\lambda)$ is $\mathbb{R}$. It is easy to show [6] that $A(x) \in L^2(\mathbb{R})$ or $A(x) \in L^1(\mathbb{R})$ yields (3). In [3], we proved the criterion for (3) to hold in terms of coefficients $A(x)$ from the so-called Stummel class. In the next section, we will study another class of perturbations that does not shrink the essential support for the absolutely continuous component of the measure. Similar problems for Sturm–Liouville operators were studied in numerous publications (see, for example, [7] and the bibliography there). But first, let us outline some relations between Krein systems and some other orthogonal systems. Consider the Dirac system
\[
\begin{cases}
\varphi' = -\lambda\psi - a_1\varphi + a_2\psi, & \varphi(0) = 1, \\
\psi' = \lambda\varphi + a_2\varphi + a_1\psi, & \psi(0) = 0,
\end{cases}
\tag{3}
\]
where $a_1 = 2\,\Re A(2x)$, $a_2 = 2\,\Im A(2x)$. It turns out that $e^{-i\lambda x}P(2x,\lambda) = \varphi(x,\lambda) + i\psi(x,\lambda)$. That allows us to say [6] that $\rho_{Dir}(\lambda) = 2\sigma(\lambda)$, where $\rho_{Dir}(\lambda)$ is the spectral measure of the Dirac system (3). In case $a_2 = 0$ and $a_1$ is absolutely continuous, we have
\[
\begin{cases}
\psi'' - q\psi + \lambda^{2}\psi = 0, & \psi(0) = 0, \quad \psi'(0) = \lambda, \\
\varphi'' - q_1\varphi + \lambda^{2}\varphi = 0, & \varphi(0) = 1, \quad \varphi'(0) + a_1(0)\varphi(0) = 0,
\end{cases}
\tag{4}
\]
$^1$ The spectral theory of Krein systems was developed further in [1, 3, 11–14].
where $q = a_1^{2} + a_1'$, $q_1 = a_1^{2} - a_1'$. Therefore, the spectral measure $\rho_d(\lambda)$ of the Sturm–Liouville operator $l(u) = -u'' + qu$ with Dirichlet boundary condition $u(0) = 0$ is related to $\sigma(\lambda)$ by
\[
\rho_d(\lambda) = 4\int_0^{\sqrt{\lambda}} \xi^{2}\,d\sigma(\xi), \qquad \lambda > 0.
\tag{5}
\]
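The relation $e^{-i\lambda x}P(2x,\lambda) = \varphi(x,\lambda) + i\psi(x,\lambda)$ between the Krein system (1) and the Dirac system (3) can be illustrated numerically. The sketch below is only an example under stated assumptions: the coefficient `A` is a hypothetical real-valued sample (so $a_2 = 0$ and complex conjugation plays no role), the integrator is a plain fourth-order Runge–Kutta scheme, and all function names are invented for the illustration.

```python
import cmath
import math

def rk4(f, y, x0, x1, n):
    """Integrate y' = f(x, y) from x0 to x1 with n classical RK4 steps."""
    h = (x1 - x0) / n
    x = x0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y

def A(x):
    # Hypothetical real coefficient chosen only for this example.
    return 0.4 + 0.2 * math.cos(x)

def krein_rhs(lam):
    # y = [P, P_*]:  P' = i*lam*P - A*P_*,  P_*' = -A*P   (A real)
    return lambda x, y: [1j * lam * y[0] - A(x) * y[1], -A(x) * y[0]]

def dirac_rhs(lam):
    # y = [phi, psi] with a1(x) = 2*A(2x) and a2 = 0
    return lambda x, y: [-lam * y[1] - 2 * A(2 * x) * y[0],
                         lam * y[0] + 2 * A(2 * x) * y[1]]

lam, X = 1.3, 1.5
P, P_star = rk4(krein_rhs(lam), [1 + 0j, 1 + 0j], 0.0, 2 * X, 4000)
phi, psi = rk4(dirac_rhs(lam), [1.0, 0.0], 0.0, X, 4000)
lhs = cmath.exp(-1j * lam * X) * P    # e^{-i lam x} P(2x, lam)
rhs = phi + 1j * psi                  # phi(x, lam) + i psi(x, lam)
```

The same computation also gives $e^{-i\lambda x}P_*(2x,\lambda) = \varphi - i\psi$ for real $A$, which follows by differentiating both sides and comparing with (3).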
2. Krein Systems with Coefficients of a Special Kind
In this section, we will prove the following theorem.
Theorem 2. If the bounded coefficient $A(x)$ of the Krein system is real valued, $0 < l_1 = \liminf_{x\to\infty} A(x) \le \limsup_{x\to\infty} A(x) = l_2$, and $A' \in L^2(\mathbb{R}^+)$, then $(-\infty, -2l_2] \cup [2l_2, \infty) \subset \mathrm{essupp}\{\sigma_{ac}\}$.
But first we will prove one auxiliary lemma. Consider the Krein system
\[
\begin{cases}
P' = i\lambda P - AP_*, & P(0) = 1, \\
P_*' = -AP, & P_*(0) = 1,
\end{cases}
\tag{6}
\]
where the real valued $A$ is bounded, $0 < l_1 = \liminf_{x\to\infty} A \le \limsup_{x\to\infty} A = l_2$, and $A' \in L^2(\mathbb{R}^+)$. The matrix
\[
\aleph = \begin{pmatrix} i\lambda & -A \\ -A & 0 \end{pmatrix}
\quad\text{has eigenvalues}\quad
\mu_1 = \frac{i\lambda - \sqrt{4A^{2} - \lambda^{2}}}{2}, \qquad
\mu_2 = \frac{i\lambda + \sqrt{4A^{2} - \lambda^{2}}}{2}.
\]
Consider $\lambda = \tau + ik$, where $\tau > 2l_2$ is fixed and $k > 0$. Assume for simplicity that $A$ is such that $l_1 < 2A < \tau/2 + l_2$ for all $x \in \mathbb{R}^+$. We will explain later why this assumption can always be made without loss of generality. The symbol $C$ is reserved for positive constants whose value might change from one formula to another. It is easy to verify that $\Re\mu_1 > 0$ for all $x > 0$, $k > 0$. The following Lemma is true.
Lemma 1. The following asymptotics holds at infinity:
\[
P_*(x, \tau + ik) = \exp\Bigl(\int_0^x \mu_1(s,k)\,ds\Bigr)\,O(x,k),
\tag{7}
\]
where $|O(x,k)| < \exp(C/k)$ for all $x > 0$, $k > 0$.
Proof. Many estimates in this proof are very crude, but they will be good enough for our purposes. Let $J$ be a $2\times 2$ matrix that satisfies the equation $J' = \aleph J$. We will find $J$ in the form $J = LQ$, where
\[
L = \begin{pmatrix} -\mu_1/A & -\mu_2/A \\ 1 & 1 \end{pmatrix}
\]
consists of eigenvectors of $\aleph$. We have the following equation for $Q$: $Q' = L^{-1}\aleph LQ - L^{-1}L'Q$. Multiplying the matrices, we have
\[
Q' = \begin{pmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{pmatrix} Q + VQ,
\tag{8}
\]
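where
\[
V = \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix}
= \frac{1}{\sqrt{4A^{2} - \lambda^{2}}}
\begin{pmatrix}
-\dfrac{A'}{A}\mu_1 + \mu_1' & -\dfrac{A'}{A}\mu_2 + \mu_2' \\[8pt]
\dfrac{A'}{A}\mu_1 - \mu_1' & \dfrac{A'}{A}\mu_2 - \mu_2'
\end{pmatrix}.
\tag{9}
\]
Let us notice that the following inequality:
\[
\int_0^\infty \|V\|^{2}\,dx < C
\tag{10}
\]
holds uniformly in $k$. Introduce the matrix
\[
Q^{\circ} = \begin{pmatrix}
\exp\Bigl(\displaystyle\int_0^x (\mu_1(s,k) + v_{11}(s,k))\,ds\Bigr) & 0 \\
0 & \exp\Bigl(\displaystyle\int_0^x (\mu_2(s,k) + v_{22}(s,k))\,ds\Bigr)
\end{pmatrix}.
\tag{11}
\]
If $Q = Q^{\circ}S$, then for $S = \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix}$, $S(0) = I$, we have
\[
S' = \begin{pmatrix}
0 & v_{12}\exp\Bigl(\displaystyle\int_0^x \nu(s,k)\,ds\Bigr) \\
v_{21}\exp\Bigl(-\displaystyle\int_0^x \nu(s,k)\,ds\Bigr) & 0
\end{pmatrix} S,
\tag{12}
\]
where
\[
\nu = \sqrt{4A^{2} - \lambda^{2}} + \frac{i\lambda A'}{A\sqrt{4A^{2} - \lambda^{2}}}.
\]
For $s_{11}$ and $s_{21}$, we have
\[
\begin{cases}
s_{11}' = v_{12}\exp\Bigl(\displaystyle\int_0^x \nu(s,k)\,ds\Bigr)\,s_{21}, & s_{11}(0) = 1, \\[8pt]
s_{21}' = v_{21}\exp\Bigl(-\displaystyle\int_0^x \nu(s,k)\,ds\Bigr)\,s_{11}, & s_{21}(0) = 0.
\end{cases}
\tag{13}
\]
Consider the system of the corresponding integral equations
\[
\begin{cases}
s_{11}(x) = 1 + \displaystyle\int_0^x v_{12}(\tau)\exp\Bigl(\int_0^{\tau} \nu(s,k)\,ds\Bigr)\,s_{21}(\tau)\,d\tau, \\[8pt]
s_{21}(x) = \displaystyle\int_0^x v_{21}(t)\exp\Bigl(-\int_0^{t} \nu(s,k)\,ds\Bigr)\,s_{11}(t)\,dt.
\end{cases}
\tag{14}
\]
Substituting the second formula into the first one and changing the order of integration, we have the following integral equation for $s_{11}$:
\[
s_{11}(x) = 1 + \int_0^x s_{11}(t)\,v_{21}(t)\int_t^x v_{12}(s)\exp\Bigl(\int_t^s \nu(\xi)\,d\xi\Bigr)\,ds\,dt.
\]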
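This equation yields the integral inequality
\[
|s_{11}| \le 1 + \int_0^x |s_{11}(t)|\,|v_{21}(t)|\int_t^\infty |v_{12}(s)|\,\Bigl|\exp\int_t^s \nu(\xi)\,d\xi\Bigr|\,ds\,dt.
\tag{15}
\]
Notice that
\[
\int_t^s \Re\,\frac{i\lambda A'}{A\sqrt{4A^{2} - \lambda^{2}}}\,d\xi < C
\]
uniformly in $k$, $t$, $s$. Because $\Re\mu_1 > 0$, we have the estimate
\[
\Re\sqrt{4A^{2} - \lambda^{2}} < -k.
\tag{16}
\]
Consequently, the Gronwall Lemma, being applied to (15), yields
\[
|s_{11}| \le \exp\Bigl(C\int_0^x |v_{21}(t)|\int_t^\infty |v_{12}(s)|\exp(-k[s - t])\,ds\,dt\Bigr) \le \exp(C/k).
\tag{17}
\]
At the final step, we used (10) and the Young inequality for convolutions. Therefore, for $s_{21}$, we have the estimate
\[
|s_{21}| \le \exp(C/k)\int_0^x |v_{21}(s)|\,\Bigl|\exp\Bigl(-\int_0^s \nu(\xi,k)\,d\xi\Bigr)\Bigr|\,ds,
\]
which follows from the second equation of system (13) and the estimate on $s_{11}$. In the same way, the estimates for $s_{12}$, $s_{22}$ can be obtained. They are as follows:
\[
|s_{12}| \le \frac{C}{\sqrt{k}}\exp\Bigl(\frac{C}{k}\Bigr),
\tag{18}
\]
\[
|s_{22}| \le \frac{C}{\sqrt{k}}\exp\Bigl(\frac{C}{k}\Bigr)\int_0^x |v_{21}(s)|\,\Bigl|\exp\Bigl(-\int_0^s \nu(\xi,k)\,d\xi\Bigr)\Bigr|\,ds.
\tag{19}
\]
If $J$ is such that $J(0) = I$, then $J = LQ^{\circ}SL^{-1}(0)$. Therefore, for $P_*$, we will have
\begin{align}
P_* = \exp\Bigl(\int_0^x (\mu_1(t) + v_{11}(t))\,dt\Bigr)\Bigl\{ & \alpha s_{11} + \beta s_{12} \tag{20} \\
& + \exp\Bigl(\int_0^x (\mu_2(s) - \mu_1(s) + v_{22}(s) - v_{11}(s))\,ds\Bigr)(s_{21}\alpha + s_{22}\beta)\Bigr\}. \tag{21}
\end{align}
Constants $\alpha$ and $\beta$ are chosen in such a way that the initial condition $P_*(0,k) = 1$ is satisfied. We have
\[
\begin{pmatrix} \alpha \\ \beta \end{pmatrix}
= \frac{1}{\sqrt{4A^{2}(0) - \lambda^{2}}}
\begin{pmatrix} A(0) & \mu_2(0) \\ -A(0) & -\mu_1(0) \end{pmatrix}
\times \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\tag{22}
\]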
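therefore, $\alpha$ and $\beta$ are bounded uniformly in $k$. Denote by $O(x,k)$,
\begin{align}
O = \exp\Bigl(\int_0^x v_{11}(t)\,dt\Bigr)\Bigl\{ & \alpha s_{11} + \beta s_{12} \tag{23} \\
& + \exp\Bigl(\int_0^x (\mu_2(s) - \mu_1(s) + v_{22}(s) - v_{11}(s))\,ds\Bigr)(s_{21}\alpha + s_{22}\beta)\Bigr\}. \tag{24}
\end{align}
Notice that $\int_0^x v_{11}(t)\,dt$ and $\int_0^x v_{22}(t)\,dt$ are bounded uniformly in $x$ and $k$. Due to the estimates on $s_{21}$ and $s_{22}$, we have
\begin{align}
& \Bigl|\exp\Bigl(\int_0^x (\mu_2(s) - \mu_1(s) + v_{22}(s) - v_{11}(s))\,ds\Bigr)(s_{21}\alpha + s_{22}\beta)\Bigr| \tag{25} \\
& \quad \le \exp(C/k)\,\Bigl|\exp\int_0^x \nu(\eta,k)\,d\eta\Bigr| \tag{26} \\
& \qquad \times \int_0^x |v_{21}(s)|\,\Bigl|\exp\Bigl(-\int_0^s \nu(\xi,k)\,d\xi\Bigr)\Bigr|\,ds \tag{27} \\
& \quad \le \exp(C/k)\int_0^x |v_{21}(s)|\,\Bigl|\exp\int_s^x \nu(\xi,k)\,d\xi\Bigr|\,ds \le \exp(C/k)/\sqrt{k}. \tag{28}
\end{align}
To get the last estimate, we used the Cauchy inequality. Finally, bounds on $s_{11}$, $s_{12}$ lead to $|O(x,k)| < \exp(C/k)$. That finishes the proof of the lemma.
Remark. More accurate estimates on $\alpha$, $\beta$, $s_{11}$ allow us to write the inequality
\[
|O(x,z) - 1| < 1/2,
\tag{29}
\]
where $x \in \mathbb{R}^+$, $\tau \le \Re z \le \tau + 1$, $\Im z > k_0$, and $k_0$ is some positive constant. Indeed, it is easy to verify that $\alpha \to 1$ and $|\beta| < \dfrac{C}{\Im z}$ if $\Im z \to +\infty$, $\tau \le \Re z \le \tau + 1$. From (13), by the Cauchy inequality we have
\[
|s_{11} - 1| \le \exp(C/k)/k
\tag{30}
\]
uniformly in $x \in \mathbb{R}^+$, $\tau \le \Re z \le \tau + 1$, where $k = \Im z > k_0$. One can verify that $\int_0^x v_{11}(t)\,dt \to 0$ uniformly in $x$ if $\Im z \to \infty$.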
Therefore, from (23) and (26)–(28), we infer (29). Let us prove Theorem 2 now.
Proof of Theorem 2. Fix any $\tau > 2l_2$. We will show that $[\tau, \infty) \subset \mathrm{essupp}\{\sigma_{ac}\}$. Because $\tau$ is chosen arbitrarily larger than $2l_2$ and $\sigma$ is odd (since $A$ is real), this inclusion is sufficient for Theorem 2 to be true. Once $\tau$ is fixed, we can assume that $l_1 < 2A < \tau/2 + l_2$ for all $x \in \mathbb{R}^+$. Due to the standard trace-class perturbation argument applied to the Dirac system (3) [9], we can always make this assumption. Indeed, since $l_1 = \liminf_{x\to\infty} A \le \limsup_{x\to\infty} A = l_2$, it suffices to multiply $A$ by some smooth function which is equal to 0 on $[0, M_A]$ and 1 on $[M_A + 1, \infty)$. The absolutely continuous part of $\rho_{Dir}$ will not change because of the trace-class argument. On the other hand, if $M_A$ is sufficiently large, we will satisfy the imposed condition. Consider the Krein system with coefficients
\[
A^{(n)}(x) = \begin{cases} A(x), & x < n, \\ 0, & x \ge n. \end{cases}
\]
Denote the corresponding measure by $\sigma_n$. We have the formula ((3.14) from [12]):
\[
\sqrt{2\pi}\,P_*(\infty, z) = \sqrt{2\pi}\,P_*(n, z) = e^{i\alpha_n}\exp\Bigl(\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{(1 + tz)\ln\sigma_n'(t)}{(z - t)(1 + t^{2})}\,dt\Bigr), \qquad \Im z > 0,
\tag{31}
\]
where $\alpha_n$ are some real constants. Because $A(x)$ is real, the functions $\sigma_n$ are odd. Therefore, if we take $z = i$, the left-hand side of (31), together with the exponent from the right-hand side, are real valued. Thus, $\alpha_n$ can be chosen equal to zero. The asymptotics of $P_*(n,z)$ as $|z| \to \infty$, $\Im z > 0$ is $P_*(n,\infty) = 1$ [11]. Therefore, we can rewrite this formula as follows:
\[
-2\pi i\,\ln P_*(n, z) = \int_{-\infty}^{\infty}\frac{(1 + tz)\ln(2\pi\sigma_n'(t))}{(t - z)(1 + t^{2})}\,dt
\tag{32}
\]
if $\Im z > 0$. Here we used the identity
\[
\sqrt{2\pi} = \exp\Bigl(-\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{(1 + tz)\ln(2\pi)}{(z - t)(1 + t^{2})}\,dt\Bigr), \qquad \Im z > 0.
\]
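The identity above amounts to $\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{1+tz}{(z-t)(1+t^{2})}\,dt = -\frac12$ for $\Im z > 0$ (the residues at $t = z$ and $t = i$ are $-1$ and $\frac12$). A quick numerical illustration is sketched below; the substitution $t = \tan\theta$, the midpoint rule and the function name are choices made for this example only.

```python
import cmath
import math

def kernel_integral(z, n=2000):
    """Midpoint-rule value of the integral of (1+tz)/((z-t)(1+t^2)) over the real line.

    With t = tan(theta) the integrand becomes
    (cos(theta) + z sin(theta)) / (z cos(theta) - sin(theta)) on (-pi/2, pi/2),
    which is smooth and pi-periodic when Im z > 0.
    """
    h = math.pi / n
    total = 0j
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * h
        c, s = math.cos(theta), math.sin(theta)
        total += (c + z * s) / (z * c - s)
    return total * h

z = 3.0 + 0.5j
value = kernel_integral(z) / (2j * math.pi)          # expected to be close to -1/2
recovered = cmath.exp(-math.log(2 * math.pi) * value)  # right-hand side of the identity
```

Here `recovered` reproduces $\sqrt{2\pi}$ up to quadrature error, independently of the chosen $z$ in the upper half-plane.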
Also we used the fact that $2\pi\sigma_n'(t) \to 1$ if $|t| \to \infty$ [11]. The right-hand side $RHS(z)$ of (32) satisfies the condition $RHS(z) = \overline{RHS(\bar z)}$. Let us define the left-hand side for $\Im z < 0$ according to this rule. So we have some analytic function $LHS(z)$ defined in the region $\Im z \neq 0$. Recall the asymptotic formula for $P_*(x,z)$ ($\Im z > 0$, $\Re z = \tau > 2l_2$):
\[
P_*(x,z) = \exp\Bigl(\int_0^x \mu_1(s,z)\,ds\Bigr)\,O(x,z).
\tag{33}
\]
S. A. Denisov
Then, n LH S(z) = −2π i
µ1 (s, z)ds − 2π i ln O(n, z) 0
for z > 0. It is easy to see that the continuation according to the chosen rule for the function n −2πi µ1 (s, z)ds is exactly the Schwarz analytic continuation. It follows from the fact 0
that the value of this function on the half-line z = 0, z ≥ τ is real. Consider z = τ + ik. Let ln O(n, z) = r1 (n, k) + ir2 (n, k). From the definition of LH S(z) in z < 0, we have i(r1 (n, k) + ir2 (n, k)) = −ir1 (n, k) − r2 (n, k) = ir1 (n, −k) − r2 (n, −k). Thus, r1 (n, k) is odd and r2 (n, k) is even. For fixed n, they are well defined functions for all k = 0 with finite right and left limits at k = 0. Notice also that for k > 0, r1 (n, k) = ln |O(n, τ + ik)| and r2 (n, k) = ArgO(n, τ + ik). For r1 , r2 , we have r1 (n, ∞) = 0, r2 (n, ∞) = 0 because P∗ (n, ∞) = 1 and µ1 (s, ∞) = 0. Then, integrate both sides of (32), together with some auxiliary function (z−τ )3 , along the contour Con. We choose Con as the contour that consists of ((z−τ )2 +m2 )4 the complex numbers τ + ik (|k| ≤ m − 1, |k| ≥ m + 1 ) and two right semicircles with radii 1 centered at τ ∓ im. The direction of integration is upward. For the right-hand side, we have ∞ (z − τ )3 (1 + tz) ln(2π σn (t)) dtdz ((z − τ )2 + m2 )4 (1 + t 2 )(t − z) −∞
Con
∞ = −∞
ln(2πσn (t)) 1 + t2 ∞
= 2πi τ
Con
(1 + tz)(z − τ )3 dzdt ((z − τ )2 + m2 )4 (t − z)
(t − τ )3 ln(2π σn (t))dt. ((t − τ )2 + m2 )4
Here we changed the order of integration by the Fubini Theorem and then used the Cauchy formula. Integrating the LH S(z) with the same function, we have n (z − τ )3 (z − τ )3 − 2πi µ1 (s, z)ds dz − 2π i ln O(n, z)dz ((z − τ )2 + m2 )4 ((z − τ )2 + m2 )4 Con 0
= 0 − 2πi Con
because
n
Con
(z − τ )3 ln O(n, z)dz ((z − τ )2 + m2 )4
µ1 (s, z)ds is analytic and bounded in z ≥ τ . Thus, we have the equality
0
− Con
(z − τ )3 ln O(n, z)dz = ((z − τ )2 + m2 )4
∞ τ
(t − τ )3 ln(2π σn (t))dt. ((t − τ )2 + m2 )4
(34)
We have the following inequality [12]
\[
\int_{-\infty}^{\infty}\frac{d\sigma_n(t)}{1 + t^{2}} < C
\tag{35}
\]
uniformly in $n$. Define $\ln^{+}h = \ln h$ for $h > 1$ and zero otherwise, $\ln^{-}h = -\ln h$ if $0 < h < 1$ and zero otherwise. The trivial inequality $\ln^{+}h < h$, together with (35), guarantees that the right-hand side of (34) is bounded above uniformly in $n$. Our goal is to show that it is bounded from below as well. Thus, we want to show that
\[
\int_{Con}\frac{(z - \tau)^{3}}{((z - \tau)^{2} + m^{2})^{4}}\ln O(n,z)\,dz < C
\]
uniformly in $n$. The left-hand side is equal to $I_1 + I_2$, where
\[
I_1 = i\int_{|k|<m-1,\ |k|>m+1}\frac{(ik)^{3}}{(m^{2} - k^{2})^{4}}\,(r_1(n,k) + i r_2(n,k))\,dk
= 2\int_{0<k<m-1,\ k>m+1}\frac{k^{3}}{(m^{2} - k^{2})^{4}}\,r_1(n,k)\,dk \le C
\]
uniformly in $n$ due to the estimates on $\ln|O(n,k)|$. We also used the fact that $r_2(n,k)$ is even. It is crucial since we do not have control over the argument of $O(n,k)$. We only know the upper bound on $\ln|O(n,k)|$. The term $I_2$ corresponds to integration along the semicircles. Choose $m$ sufficiently large. Then, for $I_2$, we have the estimate $|I_2| \le C$ uniformly in $n$ because of the estimate (29). To finish the proof, we will use one argument from [2]. Consider any compact set $Co \subset (\tau, \infty)$ of positive Lebesgue measure. Then, as it follows from (35) and the boundedness of the right-hand side in (34), $\int_{Co}\ln^{-}\sigma_n'(t)\,dt$ is bounded uniformly in $n$. Jensen's inequality yields
\[
\ln^{-}\Bigl(\frac{1}{|Co|}\int_{Co}\sigma_n'(t)\,dt\Bigr) \le \frac{1}{|Co|}\int_{Co}\ln^{-}\sigma_n'(t)\,dt.
\]
Therefore, $\sigma_n(Co)$ is greater than some positive constant $d(Co)$ for all $n$. The Weyl–Titchmarsh function of the system with coefficients $A^{(n)}$ converges to the Weyl–Titchmarsh function of the system with coefficient $A$ [12]. This convergence is uniform on any compact set of the upper half-plane. By the Stone–Weierstrass Theorem, we have weak convergence of $\sigma_n$ to $\sigma$. Consequently, $\sigma(Co) \ge \limsup_{n\to\infty}\sigma_n(Co) > 0$ for each compact $Co$ with $|Co| > 0$.
In the next Theorem, we consider a different class of coefficients.
Theorem 3. If A(x) = −l −v(x), where l ∈ R, v(x) is real valued, and v(x) ∈ L2 (R + ), then for the corresponding Krein system essupp{σac } = (−∞, −2|l|] ∪ [2|l|, ∞). For l = 0, the result follows from [6]. Therefore, we will consider the case l = 0. To make calculations more simple, let l = −1/2. The idea of the proof is the same as that of Theorem 2. We need the asymptotics on P∗ (x, λ). Consider the Krein system written in the following way: # P = iλP + P∗ /2 + vP∗ , P (0) = 1, (36) P∗ = P /2 + vP , P∗ (0) = 1.
√ iλ 1/2 has eigenvalues µ± = (iλ ± 1 − λ2 )/2. Consider λ = 1/2 0 τ + ik, τ > 1 is fixed, k > 0. One can verify that µ− > 0 for all k > 0. The matrix
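Lemma 2. The following asymptotics holds at infinity:
\[ P_*(x, \tau + ik) = \exp\Big( \mu_-(\lambda)x - \frac{1}{\sqrt{1-\lambda^2}} \int_0^x v(s)\,ds \Big)\, O(x,k), \tag{37} \]
where |O(x, k)| < exp(C/k) for all x > 0, k > 0.

Proof. We give only a sketch of the proof, because it essentially repeats the proof of Lemma 1. Consider the following matrix differential system:
\[ X' = \begin{pmatrix} i\lambda & 1/2 \\ 1/2 & 0 \end{pmatrix} X + v(x) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} X, \qquad X(0) = E, \tag{38} \]
where E is the 2 × 2 identity matrix. The matrix
\[ X_0 = \begin{pmatrix} \exp(\mu_+(\lambda)x) & \exp(\mu_-(\lambda)x) \\ (-i\lambda + \sqrt{1-\lambda^2}) \exp(\mu_+(\lambda)x) & (-i\lambda - \sqrt{1-\lambda^2}) \exp(\mu_-(\lambda)x) \end{pmatrix} \]
is the solution of the equation $X_0' = \begin{pmatrix} i\lambda & 1/2 \\ 1/2 & 0 \end{pmatrix} X_0$. Introduce the matrix
\[ V = X_0^{-1}(0) = \frac{1}{2\sqrt{1-\lambda^2}} \begin{pmatrix} i\lambda + \sqrt{1-\lambda^2} & 1 \\ -i\lambda + \sqrt{1-\lambda^2} & -1 \end{pmatrix}. \]
We will find X in the form X = X0 Y. Then we have an equation for Y: $Y' = v(x) X_0^{-1} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} X_0 Y$, and Y(0) = V. The multiplication of the matrices yields
\[ Y' = \frac{v(x)}{\sqrt{1-\lambda^2}} \begin{pmatrix} 1 & (\lambda^2 - i\lambda\sqrt{1-\lambda^2}) \exp(-\sqrt{1-\lambda^2}\,x) \\ -(\lambda^2 + i\lambda\sqrt{1-\lambda^2}) \exp(\sqrt{1-\lambda^2}\,x) & -1 \end{pmatrix} Y. \tag{39} \]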
Let
\[ Y = \begin{pmatrix} \exp\Big( \frac{1}{\sqrt{1-\lambda^2}} \int_0^x v(s)\,ds \Big) & 0 \\ 0 & \exp\Big( -\frac{1}{\sqrt{1-\lambda^2}} \int_0^x v(s)\,ds \Big) \end{pmatrix} T. \]
For T, we have the equation
\[ T' = \frac{v(x)}{\sqrt{1-\lambda^2}} \begin{pmatrix} 0 & (\lambda^2 - i\lambda\sqrt{1-\lambda^2}) \exp(-\phi(x,\lambda)) \\ -(\lambda^2 + i\lambda\sqrt{1-\lambda^2}) \exp(\phi(x,\lambda)) & 0 \end{pmatrix} T, \tag{40} \]
where $\phi(x,\lambda) = \sqrt{1-\lambda^2}\,x + \frac{2}{\sqrt{1-\lambda^2}} \int_0^x v(s)\,ds$. Systems (40) and (12) have the same structure. Therefore, we can use the arguments that were applied to the system (12) in Lemma 1.

Remark. Let us apply the Taylor formula for the square root to µ1 from (7). We see that
\[ \int_0^x \mu_1(s,k)\,ds = \frac{(i\lambda - \sqrt{1-\lambda^2})\,x}{2} - \frac{1}{\sqrt{1-\lambda^2}} \int_0^x v(s)\,ds + O(1). \]
Taking the exponent, we have exactly the main factor in the asymptotics from Lemma 2. The proof of Theorem 3 is straightforward now.

Proof of Theorem 3. It suffices to use Lemma 2 and the arguments from the proof of Theorem 2 to establish the inclusion (−∞, −2|l|] ∪ [2|l|, ∞) ⊆ essupp{σac }. The converse inclusion follows from the fact that the essential spectrum of the Dirac system (3) is (−∞, −2|l|] ∪ [2|l|, ∞) due to the Weyl Theorem [8].

Remark. It is likely that the condition v ∈ L2 (R + ) can be relaxed as was done in [3].

3. Sturm–Liouville Operators

The main goal of this section is to prove the following theorem. The idea of the proof follows [4], but it requires more technical details.

Theorem 4. Consider the Sturm–Liouville operator on the half-line given by the differential expression
\[ l(u) = -u'' + qu, \tag{41} \]
where q is bounded, lim supx→∞ q = m, q(x) is absolutely continuous, and $q' \in L^2(\mathbb{R}^+)$. Then, for any boundary condition at zero, the essential support of the absolutely continuous component of the corresponding spectral measure is [m, ∞).

Problems of this kind for Sturm–Liouville operators were treated in many publications. We mention some of the results obtained. P. Deift and R. Killip proved
Theorem 5 ([2]). If in (41), q(x) ∈ L2 (R + ), then the essential support of the absolutely continuous component of the spectral measure is R + for all boundary conditions.

In the paper of S. Molchanov, M. Novitskii, and B. Vainberg [7], some other results were obtained. For example, it was proved that $q \in L^3(\mathbb{R}^+)$ and $q' \in L^2(\mathbb{R}^+)$ lead to the same property of the spectral measure. Consider the particular case of Theorem 4, where q → 0 as x → ∞ and $q' \in L^2(\mathbb{R}^+)$. We see that the condition $q \in L^3(\mathbb{R}^+)$ used in [7] can be relaxed to q → 0 as x → ∞. In [7], the authors also pose the following open problem: Is it true that for all boundary conditions at zero, essupp{ρac } = [m, ∞) provided that q is bounded, lim supx→∞ q = m, and for some integer p, $q^{(p)} \in L^2(\mathbb{R}^+)$? Here $q^{(p)}$ means the derivative of order p. Theorem 4 solves this problem for p = 1. Because any bounded function admits a supremum, we actually characterize the absolutely continuous component of the spectrum for bounded potentials with a square summable first derivative. We think that the method developed here can be used to deal with any p. No doubt, it will require more calculations to establish the asymptotics. Let us prove the following lemma first.

Lemma 3. If bounded q is such that lim supx→∞ q = m and $q' \in L^2(\mathbb{R}^+)$, then for some large γ > 0, there exists bounded v(x) such that $\limsup_{x\to\infty} v = \sqrt{\gamma^2 + m} - \gamma$, $v' \in L^2(\mathbb{R}^+)$, and $q = v^2 + 2\gamma v + v'$.

Proof. Consider the corresponding integral equation
\[ v(x) = e^{-2\gamma x} \int_0^x e^{2\gamma s} \big( q(s) - v^2(s) \big)\,ds. \tag{42} \]
If v(x) is a solution of this integral equation, then it satisfies the differential equation as well. Write (42) as v = OPγ v, where OPγ is the corresponding formal nonlinear operator. Consider the complete metric space $M = \{f \in C(\mathbb{R}^+),\ \|f\|_\infty \le 1\}$ with the $\|\cdot\|_\infty$ metric. By C(R + ) we denote continuous functions on R + . If γ is large enough, the operator OPγ acts from M to M. Naturally, the choice of γ depends on $\|q\|_\infty$. For large γ , OPγ has a contracting property. Indeed,
\[ |OP_\gamma g_1 - OP_\gamma g_2| \le e^{-2\gamma x} \int_0^x e^{2\gamma s}\, |g_1 - g_2|\,|g_1 + g_2|\,ds \le \frac{1}{2} \|g_1 - g_2\|_\infty \]
for γ large enough. Therefore, there is a unique fixed point in M. Let us call this function v. Differentiate (42). After integration by parts, we will have
\[ v' = e^{-2\gamma x} \big( q(0) - v^2(0) \big) + e^{-2\gamma x} \int_0^x e^{2\gamma s} q'(s)\,ds - 2 e^{-2\gamma x} \int_0^x e^{2\gamma s} v(s) v'(s)\,ds. \]
Consider the integral equation
\[ b = e^{-2\gamma x} \big( q(0) - v^2(0) \big) + e^{-2\gamma x} \int_0^x e^{2\gamma s} q'(s)\,ds - 2 e^{-2\gamma x} \int_0^x e^{2\gamma s} v(s) b(s)\,ds. \tag{43} \]
It has a unique solution in C(R + ). Therefore, its solution is v'. The uniqueness follows from the convergence of the corresponding iterated series. Write (43) as $b = OP^1_\gamma b$ and note that
\[ e^{-2\gamma x} \int_0^x e^{2\gamma s} q'(s)\,ds = \int_0^\infty r_\gamma(x - s)\, q'(s)\,ds, \]
where $r_\gamma(s) = e^{-2\gamma s}$ for positive s, and zero otherwise. Because $q' \in L^2(\mathbb{R}^+)$, the Young inequality for convolutions yields that $OP^1_\gamma$ acts from the ball $\|\cdot\|_2 \le 1$ into itself, provided that γ is large enough. For large γ , it has a contracting property. Therefore, there is a unique fixed point b in the ball $\|\cdot\|_2 \le 1$, which is equal to v'. We also used the fact that $\|v\|_\infty \le 1$. It is clear that v'(x) → 0 as x → ∞. Therefore, solving the equation $v^2 + 2\gamma v + v' - q = 0$ gives the formula
\[ v = -\gamma \pm \sqrt{\gamma^2 - v' + q}. \tag{44} \]
We know that q is bounded and $\|v\|_\infty \le 1$ for large γ > 0. Consequently, to obtain the asymptotics of v at infinity, one should take the sign + in (44). Therefore, we have $\limsup_{x\to\infty} v(x) = -\gamma + \sqrt{\gamma^2 + m}$.

Now the proof of Theorem 4 is straightforward.

Proof of Theorem 4. Notice that the essential support of the absolutely continuous component does not depend on the boundary condition at zero. It follows, for example, from the subordinacy theory [5, 15]. Consider (41) with Dirichlet boundary condition at zero. Add to the potential q some constant γ². Denote the corresponding self-adjoint operator by Hγ . Obviously, the spectral measure of Hγ is the shift of the spectral measure of the initial operator H0 . On the other hand, for large γ , we can solve the equation $q + \gamma^2 = (v + \gamma)^2 + (v + \gamma)'$. That is due to Lemma 3 of this section. Now we can apply results of the first section for Krein systems with coefficient A(x) = γ /2 + v(x/2)/2. We can use Theorem 2 because for large γ ,
\[ 0 < \liminf_{x\to\infty} A(x) \le \limsup_{x\to\infty} A(x) = \frac{\sqrt{\gamma^2 + m}}{2}. \]
Let us use formula (5) from the Introduction. Thus, we see that $[\gamma^2 + m, \infty) \subset$ essupp{ρac (Hγ )}, which is equivalent to [m, ∞) ⊂ essupp{ρac (H0 )}.
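The contraction argument of Lemma 3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the sample potential q(x) = cos((1 + x)^0.4), the value γ = 4, the grid, and all function names are hypothetical choices, not taken from the paper. It iterates the map v ↦ OPγ v from (42) on a uniform grid, starting from v = 0, and records the sup-norm gap between successive iterates.

```python
import math


def apply_op(q_vals, v_vals, gamma, h):
    """One application of v -> exp(-2*gamma*x) * int_0^x exp(2*gamma*s) (q - v^2) ds on a grid.

    The integral is advanced step by step with the trapezoid rule, so no large
    exponentials are ever formed.
    """
    decay = math.exp(-2.0 * gamma * h)
    g = [qv - vv * vv for qv, vv in zip(q_vals, v_vals)]
    out = [0.0]
    for n in range(1, len(g)):
        out.append(decay * out[-1] + 0.5 * h * (decay * g[n - 1] + g[n]))
    return out


def iterate(gamma=4.0, h=0.01, x_max=20.0, steps=6):
    """Picard iteration for (42); returns the last iterate and the sup-norm gaps."""
    n_pts = int(round(x_max / h)) + 1
    # Hypothetical sample potential: bounded and slowly oscillating.
    q_vals = [math.cos((1.0 + n * h) ** 0.4) for n in range(n_pts)]
    v = [0.0] * n_pts
    gaps = []
    for _ in range(steps):
        new_v = apply_op(q_vals, v, gamma, h)
        gaps.append(max(abs(a - b) for a, b in zip(new_v, v)))
        v = new_v
    return v, gaps


if __name__ == "__main__":
    v, gaps = iterate()
    print("sup |v| =", max(abs(x) for x in v))
    print("gaps between successive iterates:", gaps)
```

With these choices the iterates stay in the unit ball and the gaps shrink by at least a factor 1/2 per step, as the estimate in the proof predicts for large γ.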
To prove that it is actually an equality, one should use a result from [17] which goes back to [16]. Before formulating this result, let us introduce some notation used in [17]. Consider V real valued and locally integrable on (0, ∞). Assume that $-\frac{d^2}{dx^2} + V$ is limit point at +∞ and $\int_0^\varepsilon |V(s)|\,ds < \infty$. Let $T = -\frac{d^2}{dx^2} + V$ in $L^2(\mathbb{R}^+)$ with any fixed boundary condition at zero. Consider also V◦ , an integrable and bounded from below function. Denote by T◦ the operator generated by $-\frac{d^2}{dx^2} + V_\circ$ in $L^2(\mathbb{R})$. Then, the following theorem holds.

Theorem 6 ([17], Theorem 2.1). Assume that −∞ ≤ α < β ≤ +∞ and (α, β) ∩ Spectrum(T◦ ) = ∅ and that intervals In ⊂ [0, ∞) exist such that |In | → ∞ as n → ∞ and
\[ \sup_{x \in \cup_n I_n} |V(x) - V_\circ(x)| \le \delta. \]
Then, essupp{ρac (T )} ∩ (α + δ, β − δ) = ∅.
Now, consider V◦ = m, V = q. Because lim supx→∞ q = m and $q' \in L^2(\mathbb{R}^+)$, one can easily show that for any δ > 0, essupp{ρac (H0 )} ∩ (−∞, m − δ) = ∅. Indeed, it suffices to choose xn → ∞ such that |q(xn ) − m| < δ/2 and to take In as some neighboring intervals. We specify them as follows. Notice that $q(x) - q(x_n) = \int_{x_n}^x q'(s)\,ds$. Therefore,
\[ |q(x) - q(x_n)| \le \Big( \int_{x_n}^\infty \big( q'(s) \big)^2\,ds \Big)^{1/2} \sqrt{x - x_n} = \varepsilon_n \sqrt{x - x_n}, \]
where εn → 0 as n → ∞. Choose as In the intervals $[x_n, x_n + \varepsilon_n^{-1}]$ for n so large that $\varepsilon_n < \delta^2/4$. Evidently, |In | → ∞ and for each x ∈ In , we have |q(x) − m| < δ. Therefore, the Stolz Theorem yields essupp{ρac (H0 )} = [m, ∞).

Remark. There are many functions that satisfy the conditions of Theorem 4. In particular, these are some slowly oscillating functions. One can think about $\cos(x^\mu)$ for 0 < µ < 1/2. In that case lim supx→∞ q(x) = 1 and $q' \in L^2(\mathbb{R}^+)$. Consequently, essupp{ρac } = [1, ∞) for all boundary conditions. In paper [17], the author used the theory of subordinate solutions to show that for any 0 < µ < 1, the spectrum is actually purely absolutely continuous on [1, ∞). The interval [−1, 1] is filled by singular spectrum.

Remark. The condition $q' \in L^2(\mathbb{R}^+)$ from Theorem 4 is optimal [7]. That means that the statement can be false if $q' \in L^p(\mathbb{R}^+)$, p > 2. The famous von Neumann–Wigner potential [18, 10] satisfies the conditions of Theorem 4. That means that under the conditions of Theorem 4, the singular component of the spectrum can appear on the interval [m, ∞) which supports the absolutely continuous part.

Remark. It should be noted that the class of Sturm–Liouville operators with decaying potentials matches very well the class of Krein systems with coefficients that tend to a nonzero constant. For example, Sturm–Liouville operators of this kind admit negative eigenvalues with zero as the only possible point of accumulation. For the Krein systems, these eigenvalues of the discrete spectrum might accumulate only near the edges of the corresponding symmetric interval centered at zero. This relation can be explained as follows. Consider a potential from the $L^p(\mathbb{R}^+)$ space (1 ≤ p < ∞) and the Dirichlet boundary condition, for example. Then, for γ large enough, Eq. (42) can be solved so that $v \in L^\infty(\mathbb{R}^+) \cap L^p(\mathbb{R}^+)$.
Consequently, formula (5) from the Introduction allows us to study the Krein system with coefficient A = γ /2 + v(x/2)/2 instead of the initial Sturm–Liouville operator. Theorem 3 from the first section lets us prove the theorem of P. Deift and R. Killip for a more general class of potentials.

Theorem 7. Consider the Sturm–Liouville operator (41) with potential q(x). If q is a uniformly square summable function from the $H^{-1}(\mathbb{R}^+)$ space, i.e.
\[ \int_x^{x+1} q^2(s)\,ds < C \]
uniformly in x ∈ R + , and
\[ e^{-x} \int_0^x e^s q(s)\,ds \in L^2(\mathbb{R}^+), \tag{45} \]
then, for any boundary conditions at zero, the essential support of the absolutely continuous component of the corresponding spectral measure is the positive half-line.

This result solves one open problem stated in [4]. The proof is essentially the same as the proof of Theorem 4.

Lemma 4. Under the conditions imposed on q(x), we can find an absolutely continuous function a(x) which is square summable on the positive half-line and satisfies the equation q(x) = a'(x) + a(x).

Proof. Consider the function $a(x) = e^{-x} \int_0^x e^s q(s)\,ds$. Obviously, a(x) ∈ AC(R + ) and q = a' + a. From (45), we have $a(x) \in L^2(\mathbb{R}^+)$.
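The identity q = a' + a from Lemma 4 is easy to test on a grid. The following sketch is an illustration only: the sample potential q = cos, the step size, and the function names are assumptions made here. It builds a(x) by a trapezoid recursion and measures the residual |a' + a − q| with central differences.

```python
import math


def lemma4_a(q, x_max=10.0, h=0.001):
    """Grid values of a(x) = exp(-x) * int_0^x exp(s) q(s) ds via a stable trapezoid recursion."""
    decay = math.exp(-h)
    n_pts = int(round(x_max / h)) + 1
    a = [0.0]
    for n in range(1, n_pts):
        a.append(decay * a[-1] + 0.5 * h * (decay * q((n - 1) * h) + q(n * h)))
    return a


def max_residual(q, x_max=10.0, h=0.001):
    """Largest |a' + a - q| over interior grid points, with a' from central differences."""
    a = lemma4_a(q, x_max, h)
    worst = 0.0
    for n in range(1, len(a) - 1):
        deriv = (a[n + 1] - a[n - 1]) / (2.0 * h)
        worst = max(worst, abs(deriv + a[n] - q(n * h)))
    return worst


if __name__ == "__main__":
    # For q = cos one has the closed form a(x) = (cos x + sin x - exp(-x)) / 2.
    print("max residual:", max_residual(math.cos))
```

For q = cos the closed form a(x) = (cos x + sin x − e^{−x})/2 gives an independent check of the recursion.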
Proof of Theorem 7. For given q(x), find the corresponding function a(x) and consider the Krein system (1) with A(x) = a(x/2)/2 + 1/4. The absolutely continuous component of the corresponding measure σ has essential support (−∞, −1/2] ∪ [1/2, ∞). That is due to Lemma 4 and Theorem 3. Consequently, for the Sturm–Liouville operator on the half-line with potential $q^* = a^2 + a + 1/4 + a'$ and Dirichlet boundary condition at zero, the essential support of the absolutely continuous component of the spectral measure is [1/4, ∞). The essential support of the absolutely continuous component does not depend on the boundary condition. Therefore, this property holds for all boundary conditions at zero. Because $a \in L^2(\mathbb{R}^+)$ and q = a' + a, the standard trace-class perturbation argument yields that for the operator with potential q + 1/4, the essential support of the absolutely continuous component is again the interval [1/4, ∞). It suffices to subtract 1/4 from the operator to complete the proof of the theorem.

Remark. Let q(x) be zero for x < 0. Then condition (45) means that q(x) is from the $H^{-1}(\mathbb{R})$ class. Indeed, since q(x) = 0 for x < 0, we can write
\[ e^{-x} \int_0^x e^s q(s)\,ds = \int_{-\infty}^{+\infty} q(s)\, r(x - s)\,ds, \]
where $r(t) = e^{-t}$ for t > 0 and r(t) = 0 otherwise. Taking the Fourier transform, we have
\[ \hat q(\omega)\, \hat r(\omega) = \frac{1}{\sqrt{2\pi}\,(1 - i\omega)}\, \hat q(\omega) \in L^2(\mathbb{R}). \]
That means $q(x) \in H^{-1}(\mathbb{R})$.
Acknowledgements. The author thanks Professor J. Goodman for useful discussion. Most of this work was done during a stay at the Courant Institute of Mathematical Sciences.
References

1. Akhiezer, N.I., Rybalko, A.M.: On the continuous analogues of polynomials orthogonal on the unit circle. Ukrainian Math. J. 20, no. 1, 3–24 (1968)
2. Deift, P., Killip, R.: On the absolutely continuous spectrum of one-dimensional Schrödinger operators with square summable potentials. Commun. Math. Phys. 203, 341–347 (1999)
3. Denisov, S.A.: To the spectral theory of Krein systems (to appear in Integral Equations Operator Theory)
4. Denisov, S.A.: On the application of some M.G. Krein's results to the spectral analysis of Sturm–Liouville operators. Journal of Math. Analysis and Applications 261, 1, 171–191 (2001)
5. Gilbert, D.J., Pearson, D.B.: On the subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. 128, 30–56 (1987)
6. Krein, M.G.: Continuous analogues of propositions on polynomials orthogonal on the unit circle. Dokl. Akad. Nauk SSSR 105, 637–640 (1955)
7. Molchanov, S., Novitskii, M., Vainberg, B.: First KdV integrals and absolutely continuous spectrum for 1-D Schrödinger operator. Commun. Math. Phys. 216, 195–213 (2001)
8. Reed, M., Simon, B.: Methods of modern mathematical physics. Vol. 1. London: Academic Press, 1972
9. Reed, M., Simon, B.: Methods of modern mathematical physics. Vol. 3. London: Academic Press, 1979
10. Reed, M., Simon, B.: Methods of modern mathematical physics. Vol. 4. London: Academic Press, 1978
11. Rybalko, A.M.: On the theory of continual analogues of orthogonal polynomials. Teor. Funktsii, Funktsional. Anal. i Prilozhen. 3, 42–60 (1966)
12. Sakhnovich, L.A.: Spectral theory of a class of canonical systems. Funct. Anal. Appl. 34, no. 2, 119–129 (2000)
13. Sakhnovich, L.A.: On a class of canonical systems on half-axis. Integral Equations Operator Theory 31, 92–112 (1998)
14. Sakhnovich, L.A.: Spectral theory of canonical differential systems: method of operator identities. Oper. Theory, Adv. and Applications. Vol. 107. Basel, Boston: Birkhäuser Verlag, 1999
15. Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators. Proc. Amer. Math. Soc. 124, no. 11, 3361–3369 (1996)
16. Simon, B., Spencer, T.: Trace class perturbations and the absence of absolutely continuous spectra. Commun. Math. Phys. 125, 113–125 (1989)
17. Stolz, G.: Spectral theory for slowly oscillating potentials II. Schrödinger operators. Math. Nachr. 183, 275–294 (1997)
18. von Neumann, J., Wigner, E.: Über merkwürdige diskrete Eigenwerte. Z. Phys. 30, 465–467 (1929)
19. Teplyaev, A.V.: Continuous analogues of random polynomials that are orthogonal on the circle. Theory Probab. Appl. 39, no. 3, 476–489 (1994)

Communicated by B. Simon
Commun. Math. Phys. 226, 221 – 232 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Quantum Groups and Fuss–Catalan Algebras Teodor Banica Institut de Mathématiques de Jussieu, 175 rue du Chevaleret, 75013 Paris, France. E-mail: [email protected] Received: 9 March 2001 / Accepted: 12 November 2001
Abstract: The categories of representations of compact quantum groups of automorphisms of certain inclusions of finite dimensional C∗ -algebras are shown to be isomorphic to the categories of Fuss–Catalan diagrams.

Introduction

Let (D, τ ) be a finite dimensional C∗ -algebra together with a trace. In [14] Wang constructs an algebra Aaut (D, τ ): the biggest Hopf C∗ -algebra co-acting on (D, τ ). From the point of view of noncommutative geometry the algebra D corresponds to a noncommutative finite space $\widehat{D}$ and the trace τ corresponds to a measure $\widehat{\tau}$ on $\widehat{D}$. Thus Aaut (D, τ ) corresponds to the “quantum symmetry group” $G_{aut}(\widehat{D}, \widehat{\tau})$.

If D = $\mathbb{C}^n$ with n = 1, 2, 3, then Aaut (D, τ ) is just the algebra of functions on the nth symmetric group. But if dim(D) ≥ 4, then Aaut (D, τ ) is infinite-dimensional ([14]). The corepresentations of Aaut (D, τ ) are studied in [1]: under a suitable assumption on the trace τ , the algebras of symmetries of the fundamental corepresentation (i.e. the one on D) are shown to be isomorphic to the Temperley–Lieb algebras.

Let B ⊂ D be an inclusion of finite dimensional C∗ -algebras and let ϕ be a state on D. Following Wang one can construct a Hopf algebra Aaut (B ⊂ D, ϕ): the biggest Hopf C∗ -algebra co-acting on D such that B and ϕ are left invariant. The main result in this paper is that, under suitable assumptions on ϕ, the category of finite dimensional corepresentations of Aaut (B ⊂ D, ϕ) is isomorphic to the completion of the category of Fuss–Catalan diagrams. (These are certain colored Temperley–Lieb diagrams, discovered by Bisch and Jones in connection with intermediate subfactors [5].) The proof (Sect. 1–Sect. 4) uses [1, 5, 16] and reconstruction techniques.

The Fuss–Catalan diagrams were recently shown to appear in several contexts, related to subfactors, planar algebras and integrable lattice models. See e.g. Bisch and Jones [5, 6], Di Francesco [7], Landau [12] and the references therein. In the last section of the paper (Sect. 5) we discuss the relation between Aaut (B ⊂ D, ϕ) and subfactors.
1. Preliminaries The Fuss–Catalan category, as well as other categories to be used in this paper, is a tensor C∗ -category having (N, +) as monoid of objects. For simplifying writing, such a tensor category will be called a N-algebra. If C is a N-algebra we use the notations C(m, n) = HomC (m, n),
C(m) = EndC (m).
As a first class of examples, associated to any object O in a tensor C∗ -category is the N-algebra NO given by NO(m, n) = Hom($O^{\otimes m}$, $O^{\otimes n}$).

Fix δ > 0. The N-algebra $TL^2$ is defined as follows. The space $TL^2(m, n)$ consists of linear combinations of Temperley–Lieb diagrams between 2m points and 2n points,
\[ TL^2(m,n) = \Big\{ \sum \alpha\, W \ :\ W \text{ a diagram with } 2m \text{ upper points, } 2n \text{ lower points and } m+n \text{ strings} \Big\} \]
(strings join points and don't cross), and the operations ◦, ⊗ and ∗ are induced by vertical and horizontal concatenation and upside-down turning of diagrams, with the following rule: erasing a circle is the same as multiplying by δ. In pictures: A ◦ B is B drawn on top of A, A∗ is A turned upside down, A ⊗ B is A and B drawn side by side, and a circle equals δ.

Consider the following two elements $u \in TL^2(0, 1)$ and $m \in TL^2(2, 1)$:
\[ u = \delta^{-1/2}\, \cap, \qquad m = \delta^{1/2}\, |\cup| . \]
Theorem 1. The following relations (i) mm∗ = δ 2 1, (ii) u∗ u = 1, (iii) m(m ⊗ 1) = m(1 ⊗ m), (iv) m(1 ⊗ u) = m(u ⊗ 1) = 1, (v) (m ⊗ 1)(1 ⊗ m∗ ) = (1 ⊗ m)(m∗ ⊗ 1) = m∗ m are a presentation of T L2 by u ∈ T L2 (0, 1) and m ∈ T L2 (2, 1). The result says that u and m satisfy the relations and that if C is a N-algebra and v ∈ C(0, 1) and n ∈ C(2, 1) satisfy the relations then there exists a N-algebra morphism T L2 → C,
u → v,
m → n.
This is proved in [1]. Actually in [1] the “index” δ² is an integer and u and m are certain explicit operators, but these extra structures are not used.

Let D be a finite dimensional C∗ -algebra with a state ϕ on it. We have a scalar product ⟨x, y⟩ = ϕ(y∗ x) on D, so D is an object in the category of finite dimensional Hilbert spaces. Consider the unit u and the multiplication m of D,

u ∈ ND(0, 1),   m ∈ ND(2, 1).
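For a concrete instance, the relations of Theorem 1 can be checked by hand or by machine in the commutative case. The sketch below is a minimal illustration under stated assumptions: it takes D = C^n with the uniform state ϕ(x) = (x_1 + ... + x_n)/n, for which δ² = n, and writes u and m as real matrices in the orthonormal basis f_i = √n e_i, so that adjoints are transposes. All helper names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker product; the basis vector f_i (x) f_j gets index i * len + j."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def unit_and_mult(n):
    """Matrices of u and m for D = C^n with the uniform state, in the basis f_i = sqrt(n) e_i."""
    s = math.sqrt(n)
    u = [[1.0 / s] for _ in range(n)]      # 1_D = sum_i e_i = n^(-1/2) sum_i f_i
    m = [[0.0] * (n * n) for _ in range(n)]
    for i in range(n):
        m[i][i * n + i] = s                # m(f_i (x) f_i) = sqrt(n) f_i, other products vanish
    return u, m


def check_theorem1(n):
    """Truth values of relations (i)-(v) of Theorem 1 with delta^2 = n."""
    u, m = unit_and_mult(n)
    one = identity(n)
    ms, us = transpose(m), transpose(u)    # adjoints, since the basis is orthonormal
    msm = matmul(ms, m)
    return {
        "i": close(matmul(m, ms), scale(float(n), one)),
        "ii": close(matmul(us, u), [[1.0]]),
        "iii": close(matmul(m, kron(m, one)), matmul(m, kron(one, m))),
        "iv": close(matmul(m, kron(one, u)), one) and close(matmul(m, kron(u, one)), one),
        "v": close(matmul(kron(m, one), kron(one, ms)), msm)
             and close(matmul(kron(one, m), kron(ms, one)), msm),
    }


if __name__ == "__main__":
    print(check_theorem1(3))
```

Relation (i) reads mm∗ = n·1 in this example, so the uniform state on C^n is a δ-form with δ = √n in the terminology of this section.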
The relations in Theorem 1 are satisfied if and only if the first one, namely mm∗ = δ²1, is satisfied. If $D = \oplus M_{n_\beta}$ is a decomposition of D, we must have $Tr(Q_\beta^{-1}) = \delta^2$ for any block $Q_\beta$ of the unique Q ∈ D such that ϕ = T r(Q ·). This can be checked by direct computation; see [1] for the case ϕ = trace. A linear form ϕ such that mm∗ = δ²1, where the adjoint of the multiplication is taken with respect to the scalar product associated to ϕ, will be called a δ-form. One can deduce from Theorem 1 that if ϕ is a δ-form then the category of corepresentations of the Hopf C∗ -algebra Aaut (D, ϕ) is the completion of $TL^2$. The case ϕ = trace is studied in [1]; for the general case, see Sect. 4 below.

2. The Fuss–Catalan Category

A Fuss–Catalan diagram is a planar diagram formed by an upper row of 4m points, a lower row of 4n points and by 2m + 2n non-crossing strings joining them. Both rows of points are colored from left to right in the following standard way: white, black, black, white, white, black, black, ... and strings have to join pairs of points having the same color.

Fix β > 0 and ω > 0. The N-algebra F C is defined as follows. The spaces F C(m, n) consist of linear combinations of Fuss–Catalan diagrams,
\[ FC(m,n) = \Big\{ \sum \alpha\, W \ :\ W \text{ has } 4m \text{ upper and } 4n \text{ lower points, colored } w,b,b,w,w,b,b,w,\ldots, \text{ joined by } m+n \text{ white and } m+n \text{ black strings} \Big\}, \]
and the operations ◦, ⊗ and ∗ are induced by vertical and horizontal concatenation and upside-down turning of diagrams, with the following rule: erasing a black/white circle is the same as multiplying by β/ω. In pictures: A ◦ B is B drawn on top of A, A ⊗ B is A and B drawn side by side, A∗ is A turned upside down, a black circle equals β and a white circle equals ω.
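The sizes of these diagram spaces can be explored by brute force. The sketch below is an illustration with hypothetical function names: it reads the upper row from left to right and then the lower row from right to left, which turns a diagram into a non-crossing perfect matching of a single row of colored points, and counts the matchings that only join points of equal color. For comparison it uses two closed formulas, the Catalan number for Temperley–Lieb diagrams and the Fuss–Catalan number binom(3k, k)/(2k + 1) with k = m + n, the latter being the count known from Bisch and Jones [5].

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_matchings(colors):
    """Number of non-crossing perfect matchings of a row of points joining equal colors only."""
    if not colors:
        return 1
    total = 0
    # The first point is joined to some point j. Non-crossing forces the points strictly
    # between them, and the points after j, to be matched among themselves, so j is odd.
    for j in range(1, len(colors), 2):
        if colors[j] == colors[0]:
            total += count_matchings(colors[1:j]) * count_matchings(colors[j + 1:])
    return total


def dim_tl(m, n):
    """Number of Temperley-Lieb diagrams between 2m and 2n points (a single color)."""
    return count_matchings("x" * (2 * m + 2 * n))


def dim_fc(m, n):
    """Number of Fuss-Catalan diagrams between 4m and 4n points colored w,b,b,w,..."""
    # The block w,b,b,w is a palindrome, so reversing the lower row keeps the pattern.
    return count_matchings("wbbw" * (m + n))


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, dim_tl(0, k), comb(2 * k, k) // (k + 1),
              dim_fc(0, k), comb(3 * k, k) // (2 * k + 1))
```

Only the number of diagrams is computed here; the loop parameters β and ω play no role in the count.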
Let δ = βω. The following bicolored analogues of the elements u and m in Sect. 1,
\[ u = \delta^{-1/2}\, \cap, \qquad m = \delta^{1/2}\, ||\cup|| , \]
generate in F C a N-subalgebra which is isomorphic to $TL^2$. Consider also the black and white Jones projections
\[ e = \omega^{-1}\, |\substack{\cup \\ \cap}|, \qquad f = \beta^{-1}\, |||\substack{\cup \\ \cap}||| . \]
We have $f = \beta^{-2}(1 \otimes me)m^*$, so we won't need f for presenting F C. For simplifying writing we identify x and x ⊗ 1, for any x.

Theorem 2. The following relations, with $f = \beta^{-2}(1 \otimes me)m^*$ and δ = βω:
(1) the relations in Theorem 1,
(2) e = e² = e∗ , f = f ∗ and (1 ⊗ f )f = f (1 ⊗ f ),
(3) eu = u,
(4) mem∗ = m(1 ⊗ e)m∗ = β²1,
(5) mm(e ⊗ e ⊗ e) = emm(e ⊗ 1 ⊗ e)
are a presentation of F C by m ∈ F C(2, 1), u ∈ F C(0, 1) and e ∈ F C(1).

Proof. As for any presentation result, we have to prove two assertions.
(I) The elements m, u, e satisfy the relations (1–5) and generate the N-algebra F C.
(II) If M, U and E in a N-algebra C satisfy the relations (1–5), then there exists a morphism of N-algebras F C → C sending m → M, u → U and e → E.
The proof will be based on the results in the paper of Bisch and Jones [5], plus some graphic computations for (I) and some purely algebraic computations for (II).

(I) First, the relations (1–5) are easily verified by drawing pictures. Let us show that the N-subalgebra C = ⟨m, u, e⟩ of F C is equal to F C. First, C contains the infinite sequence of black and white Jones projections
\[ p_1 = e = \omega^{-1}\, |\substack{\cup \\ \cap}|, \quad p_2 = f = \beta^{-1}\, |||\substack{\cup \\ \cap}|||, \quad p_3 = 1 \otimes e = \omega^{-1}\, |||||\substack{\cup \\ \cap}|, \quad p_4 = 1 \otimes f = \beta^{-1}\, |||||||\substack{\cup \\ \cap}|||, \quad \ldots \]
as well as the infinite sequence of bicolored Jones projections
\[ e_1 = uu^* = \delta^{-1}\, \substack{\cup \\ \cap}, \quad e_2 = \delta^{-2} m^* m = \delta^{-1}\, ||\substack{\cup \\ \cap}||, \quad e_3 = 1 \otimes uu^* = \delta^{-1}\, ||||\substack{\cup \\ \cap}, \quad e_4 = \delta^{-2}(1 \otimes m^* m) = \delta^{-1}\, ||||||\substack{\cup \\ \cap}||, \quad \ldots \]
which by the result of Bisch and Jones ([5]) generate the diagonal N-algebra ΔF C. (If X is a N-algebra, its diagonal ΔX is defined by ΔX(m) = X(m) and ΔX(m, n) = ∅ if m ≠ n.) Thus we have inclusions

ΔF C ⊂ C ⊂ F C
so we can use the following standard argument. First, we have ΔF C = ΔC. Second, the existence of semicircles shows that the objects of C and F C are selfdual and by Frobenius reciprocity we get
\[ \dim(C(m,n)) = \dim C\Big( \frac{m+n}{2} \Big) = \dim FC\Big( \frac{m+n}{2} \Big) = \dim(FC(m,n)) \]
for m + n even. By tensoring with u and u∗ we get embeddings

C(m, n) ⊂ C(m, n + 1),   F C(m, n) ⊂ F C(m, n + 1),

and this shows that the dimension equalities hold for any m and n. Together with ΔF C ⊂ C ⊂ F C, this shows that C = F C.

(II) Assume that M, U and E in a N-algebra C satisfy the relations (1–5). We have to construct a morphism F C → C sending m → M, u → U, e → E. This will be done in two steps. First, we restrict attention to diagonals: we would like to construct a morphism ΔF C → ΔC sending m∗ m → M ∗ M, uu∗ → U U ∗ , e → E. By constructing the corresponding Jones projections Ei and Pi , we must send

ei → Ei ,   pi → Pi   (i = 1, 2, 3, . . . ).

The presentation result for ΔF C of Bisch and Jones ([5]) reduces this to an algebraic computation. More precisely, it is proved in [5] that the following relations
(a) $e_i^2 = e_i$, $e_i e_j = e_j e_i$ if |i − j| ≥ 2 and $e_i e_{i\pm 1} e_i = \delta^{-2} e_i$,
(b) $p_i^2 = p_i$ and $p_i p_j = p_j p_i$,
(c) $e_i p_i = p_i e_i = e_i$ and $p_i e_j = e_j p_i$ if |i − j| ≥ 2,
(d) $e_{2i\pm 1} p_{2i} e_{2i\pm 1} = \beta^{-2} e_{2i\pm 1}$ and $e_{2i} p_{2i\pm 1} e_{2i} = \omega^{-2} e_{2i}$,
(e) $p_{2i} e_{2i\pm 1} p_{2i} = \beta^{-2} p_{2i\pm 1} p_{2i}$ and $p_{2i\pm 1} e_{2i} p_{2i\pm 1} = \omega^{-2} p_{2i} p_{2i\pm 1}$
are a presentation of ΔF C. So it remains to verify that (1–5) ⇒ (a–e), where m, u and e are abstract objects and we are no longer allowed to draw pictures. First, by using $e_{n+2} = 1 \otimes e_n$ and $p_{n+2} = 1 \otimes p_n$ these relations reduce to:
(α) $e_i^2 = e_i$ for i = 1, 2, $e_1 e_2 e_1 = \delta^{-2} e_1$ and $e_2 e_1 e_2 = \delta^{-2} e_2$,
(β) $p_i^2 = p_i$ for i = 1, 2 and $[p_1, p_2] = [1 \otimes p_1, p_2] = [1 \otimes p_2, p_2] = 0$,
(γ) $[e_2, 1 \otimes p_2] = [p_2, 1 \otimes e_2] = 0$ and $e_i p_i = p_i e_i = e_i$ for i = 1, 2,
(δ1) $e_1 p_2 e_1 = \beta^{-2} e_1$ and $(1 \otimes e_1) p_2 (1 \otimes e_1) = \beta^{-2} (1 \otimes e_1)$,
(δ2) $e_2 p_1 e_2 = e_2 (1 \otimes p_1) e_2 = \omega^{-2} e_2$,
(ε1) $\beta^2 p_2 e_1 p_2 = \omega^2 p_1 e_2 p_1 = p_1 p_2$,
(ε2) $\beta^2 p_2 (1 \otimes e_1) p_2 = \omega^2 (1 \otimes p_1) e_2 (1 \otimes p_1) = (1 \otimes p_1) p_2$.
With $e_1 = uu^*$, $e_2 = \delta^{-2} m^* m$, $p_1 = e$ and $p_2 = f$ one can see that most of them are trivial. What is left can be reformulated in the following way:
(x) $em^*me = \beta^2 f^* e$,
(y) $(1 \otimes e) m^* m (1 \otimes e) = \beta^2 f^* (1 \otimes e)$,
(z) $f^* = f^* f$,
(t) $[e, f] = [1 \otimes e, f] = [m^* m, 1 \otimes f] = [f, 1 \otimes m^* m] = 0$.

By multiplying the relation (5) by u and by 1 ⊗ 1 ⊗ u to the right we get the following useful formula, to be used many times in what follows:

m(e ⊗ e) = em(1 ⊗ e) = eme.

Let us verify (x–t). First, we have $\beta^2 f^* e = m(e \otimes e)(1 \otimes m^*)$ and by replacing m(e ⊗ e) with eme we get $em^*me$, so (x) is true. We have $(1 \otimes e) m^* m (1 \otimes e) = m(1 \otimes (em(1 \otimes e))^*)$ and by replacing em(1 ⊗ e) with eme we get $\beta^2 f^* (1 \otimes e)$, so (y) is true. We have $f^* f = \beta^{-4} m(1 \otimes em^*me) m^*$ and by replacing $em^*me$ with $eme(1 \otimes m^*)$, then eme with m(e ⊗ e) we get $f^*$, so (z) is true. The first two commutators are zero, because f e and f (1 ⊗ e) are selfadjoint. Same for the others, because of the formulae
\[ mm^*(1 \otimes ff^*) = \beta^{-4} (1 \otimes 1 \otimes me)\, m^* m^* m m\, (1 \otimes 1 \otimes em^*), \]
\[ (1 \otimes m^* m) ff^* = \beta^{-4} (1 \otimes m^* m e)\, m^* m\, (1 \otimes e m^* m). \]
The conclusion is that we constructed a certain N-algebra morphism ΔJ : ΔF C → ΔC that we have to extend now to a morphism J : F C → C sending u → U and m → M. We will use a standard argument (see [11]). For w bigger than k and l we define
\[ \phi : FC(l,k) \to FC(w), \qquad x \mapsto (u^{\otimes (w-k)} \otimes 1_k)\, x\, ((u^*)^{\otimes (w-l)} \otimes 1_l), \]
\[ \theta : FC(w) \to FC(l,k), \qquad x \mapsto ((u^*)^{\otimes (w-k)} \otimes 1_k)\, x\, (u^{\otimes (w-l)} \otimes 1_l), \]
where $1_k = 1^{\otimes k}$, and where the convention x = x ⊗ 1 is no longer used. We define Φ and Θ in C by similar formulae. We have θφ = ΘΦ = Id. We define a map J by
\[ \begin{array}{ccc} FC(l,k) & \xrightarrow{\ J\ } & C(l,k) \\ \phi \downarrow & & \uparrow \Theta \\ FC(w) & \xrightarrow{\ \Delta J\ } & C(w) \end{array} \]
As J (a) doesn't depend on the choice of w, these J 's are the components of a map J : F C → C. This map J extends ΔJ and sends u → U and m → M. It remains to prove that J is a morphism. We have
\[ Im(\phi) = \{ x \in FC(w) \mid x = ((uu^*)^{\otimes (w-k)} \otimes 1_k)\, x\, ((uu^*)^{\otimes (w-l)} \otimes 1_l) \} \]
as well as a similar description of Im(Φ), so J sends Im(φ) to Im(Φ). On the other hand we have ΘΦ = Id, so ΦΘ = Id on Im(Φ). Thus
\[ \begin{array}{ccc} FC(l,k) & \xrightarrow{\ J\ } & C(l,k) \\ \phi \downarrow & & \downarrow \Phi \\ FC(w) & \xrightarrow{\ \Delta J\ } & C(w) \end{array} \]
commutes, so J is multiplicative:

J (ab) = Θ(ΔJ φ(a) ΔJ φ(b)) = Θ(ΦJ (a) ΦJ (b)) = ΘΦ(J (a)J (b)) = J (a)J (b).

It remains to prove that J (a ⊗ b) = J (a) ⊗ J (b). We have $a \otimes b = (a \otimes 1_s)(1_t \otimes b)$ for certain s and t, so it is enough to prove it for pairs (a, b) of the form $(1_t, b)$ or $(a, 1_s)$. For $(a, 1_s)$ this is clear, so it remains to prove that the set

B = {b ∈ F C | $J(1_t \otimes b) = 1_t \otimes J(b)$, ∀ t ∈ N}

is equal to F C. First, ΔJ being a N-algebra morphism, we have ΔF C ⊂ B. On the other hand, computation gives $J(1_t \otimes u \otimes 1_s) = 1_t \otimes U \otimes 1_s$. Also, J being involutive and multiplicative, B is stable by involution and multiplication. We conclude that B contains the compositions of elements of ΔF C with $1_t \otimes u \otimes 1_s$'s and $1_t \otimes u^* \otimes 1_s$'s. But any b in F C is equal to θφ(b), so it is of this form and we are done. □

3. Inclusions of Finite Dimensional C∗ -Algebras

Let B ⊂ D be an inclusion of finite dimensional C∗ -algebras and let ϕ be a state on D. We have the scalar product ⟨x, y⟩ = ϕ(y∗ x) on D. The multiplication m of D, the unit u of D and the orthogonal projection e from D onto B

m : D ⊗ D → D,   u : C → D,   e : D → D
can be regarded as elements of the N-algebra ND given by ND(m, n) = L($D^{\otimes m}$, $D^{\otimes n}$). We say that ϕ is a (β, ω)-form on B ⊂ D if it is a βω-form on D, if its restriction ϕ|B is a β-form on B and if e is a B–B bimodule morphism. (For δ-forms, see Sect. 1.) As a first example, if φ is a β-form on B and ψ is a ω-form on W , then φ ⊗ ψ is a (β, ω)-form on B ⊂ B ⊗ W . In particular a δ-form on D is a (1, δ)-form on C ⊂ D.

Theorem 3. If ϕ is a (β, ω)-form on B ⊂ D, then ⟨m, u, e⟩ = F C.

Proof. We prove that m, u, e satisfy the relations (1–5). The formulae e = e² = e∗ and (3) are true, (1) is equivalent to the fact that ϕ is a βω-form and (5) says that e(b)e(c)e(d) = e(e(b)ce(d)) for any b, c, d in B, i.e. that e is a morphism of B–B bimodules. Let $\{b_{-i}\}_{i \in N}$ be an orthonormal basis of B and let $\{b_j\}_{j \in N}$ be an orthonormal basis of $B^\perp$. We denote by $\{b_n\}_{n \in Z}$ the orthonormal basis $\{b_{-i}, b_j\}_{i,j \in N}$ of D. We have
\[ m^*(b) = \sum_{k,s \in Z} b_k \otimes b_s \langle b, b_k b_s \rangle = \sum_{k,s \in Z} b_k \otimes b_s \langle b_k^* b, b_s \rangle = \sum_{k \in Z} b_k \otimes b_k^* b, \]
so ϕ is a δ-form if and only if $\sum_k b_k b_k^* = \delta^2 1$. On the other hand, we get
\[ mem^*(b) = m\Big( \sum_{k \in Z} e(b_k) \otimes b_k^* b \Big) = \sum_{k \in Z} e(b_k) b_k^* b = \sum_{i \in N} b_{-i} b_{-i}^* b, \]
so the formula $mem^* = \beta^2 1$ in (4) is equivalent to the fact that ϕ|B is a β-form on B. It remains to check the following three formulae, with $f = \beta^{-2}(1 \otimes me)m^*$:
\[ f = f^*, \qquad (1 \otimes f) f = f (1 \otimes f), \qquad m(1 \otimes e) m^* = \beta^2 1. \tag{$\star$} \]
By using the fact that e is a bimodule morphism we get successively that

σ (B) = B,   e∗ = ∗e,

where σ : D → D is such that ϕ(ab) = ϕ(bσ (a)). By using the above formula for m∗ we get
\[ f(b_m \otimes b_n) = \beta^{-2} (1 \otimes me) m^* (b_m \otimes b_n) = \beta^{-2} \sum_{k \in Z} b_k \otimes e(b_k^* b_m) b_n. \]
This allows us to prove the first ($\star$) formula, because we have
\[ \langle f(b_m \otimes b_n), b_M \otimes b_N \rangle = \beta^{-2} \phi(b_N^* e(b_M^* b_m) b_n) = \langle b_m \otimes b_n, f(b_M \otimes b_N) \rangle \]
(here $\phi$ stands for the state ϕ) for any m, n, M, N. The second ($\star$) formula follows from
\[ \langle (1 \otimes f) f (x \otimes y \otimes z), b_k \otimes b_s \otimes w \rangle = \beta^{-4} \langle e(b_s^* a y) z, w \rangle, \]
\[ \langle f (1 \otimes f)(x \otimes y \otimes z), b_k \otimes b_s \otimes w \rangle = \beta^{-4} \sum_{t \in Z} \langle a b_t, b_s \rangle \langle e(b_t^* y) z, w \rangle, \]
with $a = e(b_k^* x)$, for any x, y, z, w, k, s. For the third ($\star$) formula, we have
\[ m^*(b) = \sum_{k,s \in Z} b_k \otimes b_s \langle b, b_k b_s \rangle = \sum_{k,s \in Z} \langle b\sigma(b_s^*), b_k \rangle\, b_k \otimes b_s = \sum_{s \in Z} b\sigma(b_s^*) \otimes b_s, \]
and this gives $m(1 \otimes e) m^*(b) = bq$ with q given by
\[ q = \sum_{s \in Z} \sigma(b_s^*) e(b_s) = \sum_{i \in N} \sigma(b_{-i}^*) b_{-i} = m_B m_B^*(1) = \beta^2 1, \]
where $m_B$ is the multiplication of B, which was computed in a similar way. Thus m, u, e satisfy the relations (1–5), so Theorem 2 applies and gives a certain N-algebra surjective morphism J : F C → ⟨m, u, e⟩. It remains to prove that J is faithful. Consider the maps
\[ \phi_n : FC(n) \to FC(n-1), \qquad x \mapsto (1^{\otimes (n-1)} \otimes v^*)(x \otimes 1)(1^{\otimes (n-1)} \otimes v), \]
\[ \psi_n : C(n) \to C(n-1), \qquad x \mapsto (1^{\otimes (n-1)} \otimes J(v)^*)(x \otimes 1)(1^{\otimes (n-1)} \otimes J(v)), \]
where $v = m^* u \in FC(0, 2)$. These make the following diagram commutative:
\[ \begin{array}{ccc} FC(n) & \xrightarrow{\ J\ } & C(n) \\ \phi_n \downarrow & & \downarrow \psi_n \\ FC(n-1) & \xrightarrow{\ J\ } & C(n-1) \end{array} \]
and by gluing such diagrams we get a factorisation by J of the composition on the left of conditional expectations, which is the Markov trace. By positivity J is faithful on ΔF C, then by Frobenius reciprocity faithfulness has to hold on the whole F C.
4. Quantum Automorphism Groups of Inclusions

Let $B\subset D$ be an inclusion of finite dimensional C*-algebras and let $\varphi$ be a state on $D$. Following Wang ([14]) we define the universal C*-algebra $A_{aut}(B\subset D, \varphi)$ generated by the coefficients $v_{ij}$ of a unitary matrix $v$ subject to the conditions
$$m\in Hom(v^{\otimes 2}, v), \qquad u\in Hom(1, v), \qquad e\in End(v),$$
where $m : D\otimes D\to D$ is the multiplication, $u : \mathbb{C}\to D$ is the unit and $e : D\to D$ is the projection onto $B$, with respect to the scalar product $\langle x, y\rangle = \varphi(y^* x)$.

This definition has to be understood as follows. Let $n = \dim(D)$ and fix a vector space isomorphism $D\simeq\mathbb{C}^n$. Let $F$ be the free ∗-algebra on $n^2$ variables $\{v_{ij}\}_{i,j=1,\dots,n}$ and let $v = (v_{ij})\in M_n\otimes F$. For any $k\in\mathbb{N}$ define $v^{\otimes k}$ to be
$$v^{\otimes k} = v_{1,k+1}\,v_{2,k+1}\cdots v_{k,k+1}\in M_n^{\otimes k}\otimes F.$$
If $a, b\in\mathbb{N}$ and $t\in L(M_n^{\otimes a}, M_n^{\otimes b})$, the collection of relations between the $v_{ij}$'s and their adjoints obtained by identifying coefficients in the formula $(t\otimes id)v^{\otimes a} = v^{\otimes b}(t\otimes id)$ can be called "the relation $t\in Hom(v^{\otimes a}, v^{\otimes b})$". With this definition, let $J\subset F$ be the two-sided ∗-ideal generated by the relations $m\in Hom(v^{\otimes 2}, v)$, $u\in Hom(1, v)$ and $e\in End(v)$, together with the relations obtained by identifying coefficients in $vv^* = v^*v = 1$. The ∗-algebra $F/J$ being generated by the coefficients of a unitary matrix, its enveloping C*-algebra is well-defined. We call it $A_{aut}(B\subset D, \varphi)$.

By universality we can construct a C*-algebra morphism
$$\Delta : A_{aut}(B\subset D, \varphi)\to A_{aut}(B\subset D, \varphi)\otimes A_{aut}(B\subset D, \varphi),$$
sending $v_{ij}\mapsto\sum_k v_{ik}\otimes v_{kj}$ for any $i$ and $j$ (the tensor product being the "min" tensor product). We have $(id\otimes\Delta)v = v_{12}v_{13}$, so the comultiplication relation $(id\otimes\Delta)\Delta = (\Delta\otimes id)\Delta$ holds on the generating set $\{v_{ij}\}_{i,j=1,\dots,n}$. It follows that the comultiplication relation holds on the whole $A_{aut}(B\subset D, \varphi)$.

Summing up, we have constructed a pair $(A_{aut}(B\subset D, \varphi), v)$ consisting of a unital Hopf C*-algebra together with a generating corepresentation, i.e. a compact matrix pseudogroup in the sense of Woronowicz ([15]). The matrix $v$ is a corepresentation of $A_{aut}(B\subset D, \varphi)$ on the Hilbert space $D$. The three "Hom" conditions translate into the fact that $v$ corresponds to a coaction of $A_{aut}(B\subset D, \varphi)$ on the C*-algebra $D$, which leaves $\varphi$ and $B$ invariant. See Wang ([14]) and [1] for details and comments, in the case $B = \mathbb{C}$. See Sect. 3 in [3] for a general construction of such universal Hopf C*-algebras.

We recall from Woronowicz ([16]) that the completion of a tensor C*-category is by definition the smallest semisimple tensor C*-category containing it.
Theorem 4. If $\varphi$ is a $(\beta, \omega)$-form on $B\subset D$, then the tensor C*-category of finite dimensional corepresentations of $A_{aut}(B\subset D, \varphi)$ is the completion of $FC$.

Proof. The unital Hopf C*-algebra $A_{aut}(B\subset D, \varphi)$ being presented by the relations corresponding to $m$, $u$ and $e$, its tensor C*-category of corepresentations has to be the completion of the tensor C*-category $\langle m, u, e\rangle$ generated by $m$, $u$ and $e$. (This is a direct consequence of Tannakian duality [16], cf. Theorem 3.1 in [3].) On the other hand, the linear form $\varphi$ being a $(\beta, \omega)$-form, Theorem 3 applies and gives an isomorphism $\langle m, u, e\rangle\simeq FC$. □

In the case $B = \mathbb{C}$ and $\varphi$ = trace, studied in [1], we have $FC = TL^2$. Note the following corollary of Theorem 4: If $\varphi$ is a $(\beta, \omega)$-form, then $Nv = FC$. (This follows from the basic facts about completion in [16].) Note also, as an even weaker version of Theorem 4, the dimension equalities
$$\dim(Hom(v^{\otimes m}, v^{\otimes n})) = \dim(FC(m, n))$$
for any $m$ and $n$. Together with standard reconstruction tricks (see e.g. [3]) and with the results in [5], these equalities could be used for classifying the irreducible corepresentations of $A_{aut}(B\subset D, \varphi)$ and for finding their fusion rules.

5. Fixed Point Subfactors

If $D$ is a finite dimensional C*-algebra then there exists a unique $\delta$-trace on it: the canonical trace $\tau_D$. This is by definition the restriction to $D$ of the unique unital trace of matrices, via the left regular representation. We have $\delta = \sqrt{\dim(D)}$. See [1].

In [4] we construct inclusions of fixed point von Neumann algebras of the form
$$(P\otimes B)^K\subset (P\otimes D)^K,$$
where $P$ is a II$_1$-factor, $B\subset D$ is an inclusion of finite dimensional C*-algebras with a trace $\tau$ and $K$ is a compact quantum group acting minimally on $P$ and acting on $D$ such that $B$ and $\tau$ are left invariant. (See [4] for technical details, in terms of unital Hopf C*-algebras.)
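We also show that if $K$ acts on $(C, \tau)$ and acts minimally on a II$_1$-factor $P$ then we have the following implications:
$$Z((P\otimes C)^K) = \mathbb{C}\iff Z(C)^K = \mathbb{C}\implies\tau = \tau_C,$$
and we deduce from this that we have the following sequence of implications:
$$(P\otimes B)^K\subset (P\otimes D)^K\ \text{is a subfactor}\iff\begin{cases} Z(D)^K = \mathbb{C}\\ Z(B)^K = \mathbb{C}\end{cases}\implies\begin{cases}\tau = \tau_D\\ \tau|_B = \tau_B\end{cases}$$
which can be glued to the following sequence of elementary implications
$$\begin{cases}\tau = \tau_D\\ \tau|_B = \tau_B\end{cases}\implies\tau_D|_B = \tau_B\implies Ind(B\subset D) = \frac{\dim(D)}{\dim(B)}\in\mathbb{N}.$$
The following question is raised in [4]: Is $\tau_D|_B = \tau_B$ the only restriction on $B\subset D$?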
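The condition $\tau_D|_B = \tau_B$ can be tested by hand on small inclusions. The following is a minimal sketch, not taken from the paper: it assumes that $B$ and $D$ are given as direct sums of matrix algebras and that the inclusion is described by a matrix of multiplicities. The function names, the `inclusion` matrix convention and the sample inclusions are all hypothetical bookkeeping chosen for illustration. It uses the description of the canonical trace via the left regular representation, for which a rank-one projection in a block $M_d$ of $D$ has trace $d/\dim(D)$.

```python
from fractions import Fraction

def canonical_trace_weights(block_sizes):
    """Canonical trace of a rank-one projection in each matrix block.

    For D = M_{d_1} + ... + M_{d_r}, the left regular representation gives
    tau_D(x) = sum_i d_i * Tr(x_i) / dim(D), so a rank-one projection in
    block i has trace d_i / dim(D).
    """
    dim = sum(d * d for d in block_sizes)
    return [Fraction(d, dim) for d in block_sizes]

def block_sizes_of_D(b_sizes, inclusion):
    """Block sizes of D, where inclusion[i][j] is the (assumed) multiplicity
    of the j-th block of B inside the i-th block of D."""
    return [sum(m * b for m, b in zip(row, b_sizes)) for row in inclusion]

def commutes_with_canonical_traces(b_sizes, inclusion):
    """Check tau_D|_B = tau_B on a rank-one projection of each block of B."""
    d_sizes = block_sizes_of_D(b_sizes, inclusion)
    w_D = canonical_trace_weights(d_sizes)
    w_B = canonical_trace_weights(b_sizes)
    for j in range(len(b_sizes)):
        # A rank-one projection of block j of B has rank inclusion[i][j]
        # in block i of D.
        restricted = sum(inclusion[i][j] * w_D[i] for i in range(len(d_sizes)))
        if restricted != w_B[j]:
            return False
    return True

def dimension_ratio(b_sizes, inclusion):
    """dim(D) / dim(B) as an exact fraction."""
    d_sizes = block_sizes_of_D(b_sizes, inclusion)
    return Fraction(sum(d * d for d in d_sizes), sum(b * b for b in b_sizes))

# Diagonal matrices C + C inside M_2: the traces are compatible.
print(commutes_with_canonical_traces([1, 1], [[1, 1]]), dimension_ratio([1, 1], [[1, 1]]))
# C + C inside M_2 + C (first summand doubled): the traces are not compatible.
print(commutes_with_canonical_traces([1, 1], [[2, 0], [0, 1]]), dimension_ratio([1, 1], [[2, 0], [0, 1]]))
```

In the first sample inclusion the ratio $\dim(D)/\dim(B)$ is the integer 2, in line with the elementary implications above; in the second the traces fail to match and the ratio 5/2 is not an integer.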
Theorem 5. For an inclusion $B\subset D$ the following are equivalent:
– there exist subfactors of the form $(P\otimes B)^K\subset (P\otimes D)^K$,
– $B\subset D$ commutes with the canonical traces of $B$ and $D$.

Proof. The canonical trace $\tau_D$ is a $\delta$-form, with $\delta = \sqrt{\dim(D)}$. Its restriction to $B$ is the canonical trace of $B$, so it is a $\beta$-form, with $\beta = \sqrt{\dim(B)}$. Also $\tau_D$ being a trace, the projection $e$ has to be a $B$–$B$ bimodule morphism. Thus $\tau_D$ is a $(\beta, \omega)$-form, with
$$\beta = \sqrt{\dim(B)}, \qquad \omega = \sqrt{\dim(D)/\dim(B)}.$$
Thus Theorem 4 applies to $\tau_D$. In terms of quantum groups, we get that the fundamental representation $\pi$ of $K = G_{aut}(B\subset D, \tau_D)$ satisfies
$$\dim(Hom(\pi^{\otimes m}, \pi^{\otimes n})) = \dim(FC(m, n))$$
for any $m$ and $n$. With $m = 0$ and $n = 1$ we get
$$\dim(Hom(1, \pi)) = \dim(FC(0, 1)) = 1.$$
Together with the canonical isomorphism $D^K\simeq Hom(1, \pi)$, this shows that the action of $K$ on $D$ is ergodic. In particular $Z(D)^K = Z(B)^K = \mathbb{C}$, so by the above, if $P$ is a II$_1$-factor with minimal action of $K$ (the existence of such a $P$ is shown by Ueda in [13]), then $(P\otimes B)^K\subset (P\otimes D)^K$ is a subfactor and we are done. □

Note that by [4] the standard invariant of $P^K\subset (P\otimes D)^K$ is the Popa system associated to $\pi$, which by Theorem 4 is the Fuss–Catalan Popa system. Equivalently, $P^K\subset (P\otimes B)^K\subset (P\otimes D)^K$ is isomorphic to a free composition of $A_\infty$ subfactors.

In [2] and [4] standard invariants of subfactors are associated to actions of compact quantum groups on objects of the form $(\mathbb{C}\subset M_n, \varphi)$ and $(B\subset D, \tau)$. This gives evidence for the existence of a general construction, starting with objects of the form $(B\subset D, \varphi)$, subject to certain conditions. The results in this paper suggest that there should be only one condition on $(B\subset D, \varphi)$, namely "$\varphi$ is a $(\beta, \omega)$-form on $B\subset D$".

References
1. Banica, T.: Symmetries of a generic coaction. Math. Ann. 314, 763–780 (1999)
2. Banica, T.: Representations of compact quantum groups and subfactors. J. Reine Angew. Math. 509, 167–198 (1999)
3. Banica, T.: Fusion rules for representations of compact quantum groups. Exposition. Math. 17, 313–337 (1999)
4. Banica, T.: Subfactors associated to compact Kac algebras. Integral Equations Operator Theory 39, 1–14 (2001)
5. Bisch, D. and Jones, V.: Algebras associated to intermediate subfactors. Invent. Math. 128, 89–157 (1997)
6. Bisch, D. and Jones, V.: A note on free composition of subfactors, Geometry and physics (Aarhus, 1995). Lecture Notes in Pure and Appl. Math. 184, 339–361 (1997)
7. Di Francesco, P.: New integrable lattice models from Fuss–Catalan algebras. Nucl. Phys. B 532, 609–634 (1998)
8. Jones, V.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
9. Jones, V.: Planar algebras, I. Preprint
10. Kauffman, L.: State models and the Jones polynomial. Topology 26, 395–407 (1987)
11. Kazhdan, D. and Wenzl, H.: Reconstructing monoidal categories. Adv. in Soviet Math. 16, 111–136 (1993)
12. Landau, Z.: Fuss–Catalan algebras and chains of intermediate subfactors. Pacific J. Math. 197, 325–367 (2001)
13. Ueda, Y.: A minimal action of the quantum group $SU_q(n)$ on a full factor. J. Math. Soc. Japan 51, 449–461 (1999)
14. Wang, S.: Quantum symmetry groups of finite spaces. Comm. Math. Phys. 195, 195–211 (1998)
15. Woronowicz, S.: Compact matrix pseudogroups. Comm. Math. Phys. 111, 613–665 (1987)
16. Woronowicz, S.: Tannaka–Krein duality for compact matrix pseudogroups. Invent. Math. 93, 35–76 (1988)

Communicated by A. Connes
Commun. Math. Phys. 226, 233 – 268 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Geometric Quantization and the Generalized Segal–Bargmann Transform for Lie Groups of Compact Type Brian C. Hall Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail: [email protected] Received: 28 June 2001 / Accepted: 17 September 2001
Abstract: Let $K$ be a connected Lie group of compact type and let $T^*(K)$ be its cotangent bundle. This paper considers geometric quantization of $T^*(K)$, first using the vertical polarization and then using a natural Kähler polarization obtained by identifying $T^*(K)$ with the complexified group $K_{\mathbb{C}}$. The first main result is that the Hilbert space obtained by using the Kähler polarization is naturally identifiable with the generalized Segal–Bargmann space introduced by the author from a different point of view, namely that of heat kernels. The second main result is that the pairing map of geometric quantization coincides with the generalized Segal–Bargmann transform introduced by the author. This means that the pairing map, in this case, is a constant multiple of a unitary map. For both results it is essential that the half-form correction be included when using the Kähler polarization. These results should be understood in the context of results of K. Wren and of the author with B. Driver concerning the quantization of (1 + 1)-dimensional Yang–Mills theory. Together with those results the present paper may be seen as an instance of "quantization commuting with reduction".
Contents
1. Introduction
2. The Main Results
3. Quantization, Reduction, and Yang–Mills Theory
4. The Geodesic Flow and the Heat Equation
5. The $\mathbb{R}^n$ Case
6. Appendix: Calculations with $\zeta$ and $\kappa$
7. Appendix: Lie Groups of Compact Type
1. Introduction

The purpose of this paper is to show how the generalized Segal–Bargmann transform introduced by the author in [H1] fits into the theory of geometric quantization. I begin this introduction with an overview of the generalized Segal–Bargmann transform and its applications. I continue with a brief description of geometric quantization and I conclude with an outline of the results of this paper. The reader may wish to begin with Sect. 5, which explains how the results work out in the $\mathbb{R}^n$ case.

1.1. The generalized Segal–Bargmann transform. See the survey paper [H7] for a summary of the generalized Segal–Bargmann transform and related results. Consider a classical system whose configuration space is a connected Lie group $K$ of compact type. Lie groups of compact type include all compact Lie groups, the Euclidean spaces $\mathbb{R}^n$, and products of the two (and no others – see Sect. 7). As a simple example, consider a rigid body in $\mathbb{R}^3$, whose rotational degrees of freedom are described by a system whose configuration space is the compact group SO(3).

For a system whose configuration space is the group $K$, the corresponding phase space is the cotangent bundle $T^*(K)$. There is a natural way to identify $T^*(K)$ with the complexification $K_{\mathbb{C}}$ of $K$. Here $K_{\mathbb{C}}$ is a certain connected complex Lie group whose Lie algebra is the complexification of Lie($K$) and which contains $K$ as a subgroup. For example, if $K = \mathbb{R}^n$ then $K_{\mathbb{C}} = \mathbb{C}^n$ and if $K = SU(n)$ then $K_{\mathbb{C}} = SL(n; \mathbb{C})$.

The paper [H1] constructs a generalized Segal–Bargmann transform for $K$. (More precisely, [H1] treats the compact case; the $\mathbb{R}^n$ case is just the classical Segal–Bargmann transform, apart from minor differences of normalization.) The transform is a unitary map $C_\hbar$ of $L^2(K, dx)$ onto $\mathcal{H}L^2(K_{\mathbb{C}}, \nu_\hbar(g)\,dg)$, where $dx$ and $dg$ are the Haar measures on $K$ and $K_{\mathbb{C}}$, respectively, and where $\nu_\hbar$ is the $K$-invariant heat kernel on $K_{\mathbb{C}}$. Here $\hbar$ is Planck's constant, which is a parameter in the construction (denoted $t$ in [H1]).
The transform itself is given by
$$C_\hbar f = \text{analytic continuation of } e^{\hbar\Delta_K/2} f,$$
where the analytic continuation is from $K$ to $K_{\mathbb{C}}$ with $\hbar$ fixed. The results of the present paper and of [Wr] and [DH] give other ways of thinking about the definition of this transform. (See below and Sect. 3 for a discussion of [Wr, DH].)

The results of [H1] can also be formulated in terms of coherent states and a resolution of the identity, as described in [H1] and in much greater detail in [HM]. The isometricity of the transform and the resolution of the identity for the coherent states are just two different ways of expressing the same mathematical result.

The results of [H1] extend to systems whose configuration space is a compact homogeneous space, such as a sphere, as shown in [H1, Sect. 11] and [St]. However the group case is special both mathematically and for applications to gauge theories. In particular the results of the present paper do not extend to the case of compact homogeneous spaces.

The generalized Segal–Bargmann transform has been applied to the Ashtekar approach to quantum gravity in [A], as a way to deal with the "reality conditions" in the original version of this theory, formulated in terms of complex-valued connections. (See also [Lo].) More recently progress has been made in developing a purely real-valued version of the Ashtekar approach, using compact gauge groups. In a series of six papers (beginning
with [T2]) T. Thiemann has given in this setting a diffeomorphism-invariant construction of the Hamiltonian constraint, thus giving a mathematically consistent formulation of quantum gravity. In an attempt to determine whether this construction has ordinary general relativity as its classical limit, Thiemann and co-authors have embarked on a program [T3, TW1, TW2, TW3, STW] to construct coherent states that might approximate a solution to classical general relativity. These are to be obtained by gluing together the coherent states of [H1] for a possibly infinite number of edges in the Ashtekar scheme. This program requires among other things a detailed understanding of the properties of the coherent states of [H1] for one fixed compact group $K$, which has been worked out in the case $K = SU(2)$ in [TW1].

In another direction, K. K. Wren [Wr], using a method proposed by N. P. Landsman [La1], has shown how the coherent states of [H1] arise naturally in the canonical quantization of (1 + 1)-dimensional Yang–Mills theory on a spacetime cylinder. The way this works is as follows. (See Sect. 3 for a more detailed explanation.) For the canonical quantization of Yang–Mills on a cylinder, one has an infinite-dimensional "unreduced" configuration space consisting of $K$-valued connections over the spatial circle, where $K$ is the structure group. One is then supposed to pass to the "reduced" or "physical" configuration space consisting of connections modulo gauge transformations. It is convenient to work at first with "based" gauge transformations, those equal to the identity at one fixed point in the spatial circle. In that case the reduced configuration space, consisting of connections modulo based gauge transformations over $S^1$, is simply the structure group $K$. (This is because the one and only quantity invariant under based gauge transformations is the holonomy around the spatial circle.)
Wren considers the ordinary "canonical" coherent states for the space of connections and then "projects" these (using a suitable regularization procedure) onto the gauge-invariant subspace. The remarkable result is that after projection the ordinary coherent states for the space of connections become precisely the generalized coherent states for $K$, as originally defined in [H1]. Wren's result was elaborated on by Driver–Hall [DH] and Hall [H8], in a way that emphasizes the Segal–Bargmann transform and uses a different regularization scheme. These results raise interesting questions about how geometric quantization behaves under reduction – see Sect. 3.

Finally, as mentioned above, we can think of the Segal–Bargmann transform for $K$ as a resolution of the identity for the corresponding coherent states. The coherent states then "descend" to give coherent states for any system whose configuration space is a compact homogeneous space [H1, Sect. 11], [St]. Looked at this way, the results of [H1, St] fit into the large body of results in the mathematical physics literature on generalized coherent states. It is very natural to try to construct coherent states for systems whose configuration space is a homogeneous space, and there have been previous constructions, notably by C. Isham and J. Klauder [IK] and De Bièvre [De]. However, these constructions, which are based on extensions of the Perelomov [P] approach, are not equivalent to the coherent states of [H1, St]. In particular the coherent states of [IK] and [De] do not in any sense depend holomorphically on the parameters, in contrast to those of [H1, St].

More recently, the coherent states of Hall–Stenzel for the case of a 2-sphere were independently re-discovered, from a substantially different point of view, by K. Kowalski and J. Rembieliński [KR1]. (See also [KR2].) The forthcoming paper [HM] explains in detail the coherent state viewpoint, taking into account the new perspectives offered by Kowalski and Rembieliński [KR1] and Thiemann [T1]. In the group case, the present paper shows that the coherent states of [H1] can be obtained by means of geometric quantization and are thus of "Rawnsley type" [Ra1, RCG].
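The two steps in the definition of the transform in Sect. 1.1 (apply the heat operator, then continue analytically) can be made concrete in the simplest case $K = \mathbb{R}$, where the Laplacian is $d^2/dx^2$. The sketch below is an illustration only, under stated assumptions: it works on polynomials, which are not square-integrable, so it shows the formal recipe rather than the unitary map itself; the function names are hypothetical. On a polynomial the exponential series for the heat operator terminates, and the analytic continuation amounts to evaluating the resulting polynomial at a complex argument.

```python
from math import factorial

def derivative(coeffs):
    """Derivative of the polynomial sum_j coeffs[j] * x**j."""
    return [j * c for j, c in enumerate(coeffs)][1:]

def heat_operator(coeffs, hbar):
    """Apply exp((hbar/2) d^2/dx^2) to a polynomial.

    The exponential series terminates because high derivatives vanish.
    """
    result = [0.0] * len(coeffs)
    term = list(coeffs)  # holds the (2m)-th derivative of the input
    m = 0
    while term:
        scale = (hbar / 2) ** m / factorial(m)
        for j, c in enumerate(term):
            result[j] += scale * c
        term = derivative(derivative(term))
        m += 1
    return result

def transform(coeffs, hbar):
    """Heat operator followed by continuation to complex arguments."""
    heated = heat_operator(coeffs, hbar)

    def continued(z):
        return sum(c * z ** j for j, c in enumerate(heated))

    return continued

# f(x) = x^2 becomes F(z) = z^2 + hbar.
F = transform([0, 0, 1], hbar=0.5)
print(heat_operator([0, 0, 1], 0.5), F(1j))
```

The semigroup property of the heat operator (applying it with parameters $a$ and then $b$ agrees with applying it once with $a + b$) gives a quick consistency check of such a sketch.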
1.2. Geometric quantization. A standard example in geometric quantization is to show how the Segal–Bargmann transform for $\mathbb{R}^n$ can be obtained by means of this theory. Furthermore, the standard method for constructing other Segal–Bargmann-type Hilbert spaces of holomorphic functions (and the associated coherent states) is by means of geometric quantization. Since [H1] is not formulated in terms of geometric quantization, it is natural to apply geometric quantization in that setting and see how the results compare. A first attempt at this was made in [H4, Sect. 7], which used "plain" geometric quantization and found that the results were not equivalent to those of [H1]. The present paper uses geometric quantization with the "half-form correction" and the conclusion is that geometric quantization with the half-form correction does give the same results as [H1]. In this subsection I give a brief overview of geometric quantization, and in the next subsection I summarize how it works out in the particular case at hand. See also Sect. 5 for how all this works in the standard $\mathbb{R}^n$ case.

For quantum mechanics of a particle moving in $\mathbb{R}^n$ there are several different ways of expressing the quantum Hilbert space, including the position Hilbert space (or Schrödinger representation) and the Segal–Bargmann (or Bargmann, or Bargmann–Fock) space. The position Hilbert space is $L^2(\mathbb{R}^n)$, with $\mathbb{R}^n$ thought of as the position variables. The Segal–Bargmann space is the space of holomorphic functions on $\mathbb{C}^n$ that are square-integrable with respect to a Gaussian measure, where $\mathbb{C}^n = \mathbb{R}^{2n}$ is the phase space. (There are also the momentum Hilbert space and the Fock symmetric tensor space, which will not be discussed in this paper.) There is a natural unitary map that relates the position Hilbert space to the Segal–Bargmann space, namely the Segal–Bargmann transform. One way to understand these constructions is in terms of geometric quantization. (See Sect. 5.)
In geometric quantization one first constructs a pre-quantum Hilbert space over the phase space $\mathbb{R}^{2n}$. The prequantum Hilbert space is essentially just $L^2(\mathbb{R}^{2n})$. It is generally accepted that this Hilbert space is "too big"; for example, the space of position and momentum operators does not act irreducibly. To get an appropriate Hilbert space one chooses a "polarization", that is (roughly) a choice of $n$ out of the $2n$ variables on $\mathbb{R}^{2n}$. The quantum Hilbert space is then the space of elements of the prequantum Hilbert space that are independent of the chosen $n$ variables. So in the "vertical polarization" one considers functions that are independent of the momentum variables, hence functions of the position only. In this case the quantum Hilbert space is just the position Hilbert space $L^2(\mathbb{R}^n)$. Alternatively, one may identify $\mathbb{R}^{2n}$ with $\mathbb{C}^n$ and consider complex variables $z_1, \dots, z_n$ and $\bar z_1, \dots, \bar z_n$. The Hilbert space is then the space of functions that are "independent of the $\bar z_k$'s", that is, holomorphic. In this case the quantum Hilbert space is the Segal–Bargmann space.

More precisely, the prequantum Hilbert space for a symplectic manifold $(M, \omega)$ is the space of sections of a line-bundle-with-connection $L$ over $M$, where the curvature of $L$ is given by the symplectic form $\omega$. A real polarization for $M$ is a foliation of $M$ into Lagrangian submanifolds. A Kähler polarization is a choice of a complex structure on $M$ that is compatible with the symplectic structure, in such a way that $M$ becomes a Kähler manifold. The quantum Hilbert space is then the space of sections that are covariantly constant along the leaves of the foliation (for a real polarization) or covariantly constant in the $\bar z$-directions (for a complex polarization). Since the leaves of a real polarization are required to be Lagrangian, the curvature of $L$ (given by $\omega$) vanishes along the leaves and so there exist, at least locally, polarized sections.
Similarly, the compatibility condition between the complex structure and the symplectic structure in a complex polarization guarantees the existence, at least locally, of polarized sections.
A further ingredient is the introduction of "half-forms", which is a technical necessity in the case of the vertical polarization and which can be useful even for a Kähler polarization. The inclusion of half-forms in the Kähler-polarized Hilbert space is essential to the results of this paper.

If one has two different polarizations on the same manifold then one gets two different quantum Hilbert spaces. Geometric quantization gives a canonical way of constructing a map between these two spaces, called the pairing map. The pairing map is not unitary in general, but it is unitary in the case of the vertical and Kähler polarizations on $\mathbb{R}^{2n}$. In the $\mathbb{R}^{2n}$ case, this unitarity can be explained by the Stone–von Neumann theorem. I do the calculations for the $\mathbb{R}^{2n}$ case in Sect. 5; the reader may wish to begin with that section.

Besides the $\mathbb{R}^{2n}$ case, there have not been many examples where pairing maps have been studied in detail. In particular, the only works I know of that address unitarity of the pairing map outside of $\mathbb{R}^{2n}$ are those of J. Rawnsley [Ra2] and K. Furutani and S. Yoshizawa [FY]. Rawnsley considers the cotangent bundle of spheres, with the vertical polarization and also a certain Kähler polarization. Furutani and Yoshizawa consider a similar construction on the cotangent bundle of complex and quaternionic projective spaces. In these cases the pairing map is not unitary (nor a constant multiple of a unitary map).

1.3. Geometric quantization and the Segal–Bargmann transform. An interesting class of symplectic manifolds having two different natural polarizations is the following. Let $X$ be a real-analytic Riemannian manifold and let $M = T^*(X)$. Then $M$ has a natural symplectic structure and a natural vertical polarization, in which the leaves of the Lagrangian foliation are the fibers of $T^*(X)$.
By a construction of Guillemin and Stenzel [GStenz1, GStenz2] and Lempert and Szőke [LS], $T^*(X)$ also has a canonical "adapted" complex structure, defined in a neighborhood of the zero section. This complex structure is compatible with the symplectic structure and so defines a Kähler polarization on an open set in $T^*(X)$.

This paper considers the special case in which $X$ is a Lie group $K$ with a bi-invariant Riemannian metric. Lie groups that admit a bi-invariant metric are said to be of "compact type"; these are precisely the groups of the form (compact)$\times\mathbb{R}^n$. In this special case, the adapted complex structure is defined on all of $T^*(K)$, so $T^*(K)$ has two polarizations, the vertical polarization and the Kähler polarization coming from the adapted complex structure. If $K = \mathbb{R}^n$ then the complex structure is just the usual one on $T^*(\mathbb{R}^n) = \mathbb{R}^{2n} = \mathbb{C}^n$.

There are two main results, generalizing what is known in the $\mathbb{R}^n$ case. First, the Kähler-polarized Hilbert space constructed over $T^*(K)$ is naturally identifiable with the generalized Segal–Bargmann space defined in [H1] in terms of heat kernels. Second, the pairing map between the vertically polarized and the Kähler-polarized Hilbert space over $T^*(K)$ coincides (up to a constant) with the generalized Segal–Bargmann transform of [H1]. Thus by [H1, Thm. 2] a constant multiple of the pairing map is unitary in this case. Both of these results hold only if one includes the "half-form correction" in the construction of the Kähler-polarized Hilbert space. In the case $K = \mathbb{R}^n$ everything reduces to the ordinary Segal–Bargmann space and the Segal–Bargmann transform (Sect. 5).

The results are surprising for two reasons. First, the constructions in [H1] involve heat kernels, whereas geometric quantization seems to have nothing to do with heat kernels or the heat equation. Second, in the absence of something like the Stone–von Neumann theorem there does not seem to be any reason that pairing maps ought to be
unitary. The discussion in Sect. 4 gives some partial explanation for the occurrence of the heat kernel. (See also [JL].)

If one considers Yang–Mills theory over a space-time cylinder, in the temporal gauge, the "unreduced phase space" is a certain infinite-dimensional linear space of connections. The reduced phase space, obtained by "reducing" by a suitable gauge group, is the finite-dimensional symplectic manifold $T^*(K)$, where $K$ is the structure group for the Yang–Mills theory. Thus the symplectic manifold $T^*(K)$ considered here can also be viewed as the "symplectic quotient" of an infinite-dimensional linear space by an infinite-dimensional group. It is reasonable to ask whether "quantization commutes with reduction", that is, whether one gets the same results by first quantizing and then reducing as by first reducing and then quantizing. Surprisingly (to me), the answer in this case is yes, as described in Sect. 3.

I conclude this introduction by discussing two additional points. First, it is reasonable to consider the more general situation where the group $K$ is allowed to be a symmetric space of compact type. In that case the geometric quantization constructions make perfect sense, but the main results of this paper do not hold. Specifically, the Kähler-polarized Hilbert space does not coincide with the heat kernel Hilbert space of M. Stenzel [St], and I do not know whether the pairing map of geometric quantization is unitary. This discrepancy reflects special properties that compact Lie groups have among all compact symmetric spaces. See the discussion at the end of Sect. 2.3.

Second, one could attempt to construct a momentum Hilbert space for $T^*(K)$. In the case $K = \mathbb{R}^n$ this may be done by considering the natural horizontal polarization. The pairing map between the vertically polarized and horizontally polarized Hilbert spaces is in this case just the Fourier transform. By contrast, if $K$ is non-commutative, then there is no natural horizontal polarization.
(For example, the foliation of $T^*(K)$ into the left orbits of $K$ is not Lagrangian.) Thus, even though there is a sort of momentum representation given by the Peter–Weyl theorem, it does not seem possible to obtain a momentum representation by means of geometric quantization.

It is a pleasure to thank Bruce Driver for valuable discussions, Dan Freed for making an important suggestion regarding the half-form correction, and Steve Sontz for making corrections to the manuscript.

2. The Main Results

2.1. Preliminaries. Let $K$ be a connected Lie group of compact type. A Lie group is said to be of compact type if it is locally isomorphic to some compact Lie group. Equivalently, a Lie group $K$ is of compact type if there exists an inner product on the Lie algebra of $K$ that is invariant under the adjoint action of $K$. So $\mathbb{R}^n$ is of compact type, being locally isomorphic to an $n$-torus, and every compact Lie group is of compact type. It can be shown that every connected Lie group of compact type is isomorphic to a product of $\mathbb{R}^n$ and a connected compact Lie group. So all of the constructions described here for Lie groups of compact type include as a special case the constructions for $\mathbb{R}^n$. On the other hand, all the new information (beyond the $\mathbb{R}^n$ case) is contained in the compact case. See [He, Chap. II, Sect. 6] (including Proposition 6.8) for information on Lie groups of compact type.

Let $\mathfrak{k}$ denote the Lie algebra of $K$. We fix once and for all an inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{k}$ that is invariant under the adjoint action of $K$. For example we may take $K = SU(n)$, in which case $\mathfrak{k} = \mathfrak{su}(n)$ is the space of skew-Hermitian matrices with trace zero. An invariant inner product on $\mathfrak{k}$ is
$$\langle X, Y\rangle = \mathrm{Re}\,\mathrm{trace}(X^* Y).$$
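The invariance of the inner product Re trace(X*Y) under the adjoint action can be checked numerically in the smallest case, K = SU(2). The following is a small sketch with hypothetical helper names and arbitrarily chosen sample parameters; it uses plain 2×2 lists of complex numbers, and it uses the fact that for a unitary matrix $g$ the adjoint action on a Lie algebra element $X$ is $X\mapsto gXg^*$.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inner(x, y):
    """<X, Y> = Re trace(X* Y)."""
    p = matmul(dagger(x), y)
    return (p[0][0] + p[1][1]).real

def su2_algebra_element(x, y, z):
    """Skew-Hermitian, trace-zero matrix with real parameters x, y, z."""
    return [[1j * x, y + 1j * z], [-y + 1j * z, -1j * x]]

def su2_group_element(theta, phi, psi):
    """Unitary matrix of determinant one, parametrized by three angles."""
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def adjoint_action(g, x):
    """Ad_g(X) = g X g^{-1}, with g^{-1} = g* for unitary g."""
    return matmul(matmul(g, x), dagger(g))

g = su2_group_element(0.3, 1.1, -0.7)
X = su2_algebra_element(1.0, -2.0, 0.5)
Y = su2_algebra_element(0.2, 0.4, 3.0)
print(inner(X, Y), inner(adjoint_action(g, X), adjoint_action(g, Y)))
```

The two printed numbers should agree up to rounding, which is the statement of Ad-invariance for this pair of elements.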
Now let $K_{\mathbb{C}}$ be the complexification of $K$. If $K$ is simply connected then the complexification of $K$ is the unique simply connected Lie group whose Lie algebra $\mathfrak{k}_{\mathbb{C}}$ is $\mathfrak{k} + i\mathfrak{k}$. In general, $K_{\mathbb{C}}$ is defined by the following three properties. First, $K_{\mathbb{C}}$ should be a connected complex Lie group whose Lie algebra $\mathfrak{k}_{\mathbb{C}}$ is equal to $\mathfrak{k} + i\mathfrak{k}$. Second, $K_{\mathbb{C}}$ should contain $K$ as a closed subgroup (whose Lie algebra is $\mathfrak{k}\subset\mathfrak{k}_{\mathbb{C}}$). Third, every homomorphism of $K$ into a complex Lie group $H$ should extend to a holomorphic homomorphism of $K_{\mathbb{C}}$ into $H$. The complexification of a connected Lie group of compact type always exists and is unique. (See [H1, Sect. 3].)

Example 2.1. If $K = \mathbb{R}^n$ then $K_{\mathbb{C}} = \mathbb{C}^n$. If $K = SU(n)$ then $K_{\mathbb{C}} = SL(n; \mathbb{C})$. If $K = SO(n)$ then $K_{\mathbb{C}} = SO(n; \mathbb{C})$. In the first two examples, $K$ and $K_{\mathbb{C}}$ are simply connected. In the last example, neither $K$ nor $K_{\mathbb{C}}$ is simply connected.

We have the following structure result for Lie groups of compact type. This result is a modest strengthening of Corollary 2.2 of [Dr] and allows all the relevant results for Lie groups of compact type to be reduced to two cases, the compact case and the $\mathbb{R}^n$ case.

Proposition 2.2. Suppose that $K$ is a connected Lie group of compact type, with a fixed Ad-invariant inner product on its Lie algebra $\mathfrak{k}$. Then there exists an isomorphism $K\cong H\times\mathbb{R}^n$, where $H$ is compact and where the associated Lie algebra isomorphism $\mathfrak{k} = \mathfrak{h} + \mathbb{R}^n$ is orthogonal.

The proof of this result is given in an appendix.

2.2. Prequantization. We let $\theta$ be the canonical 1-form on $T^*(K)$, normalized so that in the usual sort of coordinates we have
$$\theta = \sum p_k\,dq_k.$$
We then let $\omega$ be the canonical 2-form on $T^*(K)$, which I normalize as $\omega = -d\theta$, so that in coordinates $\omega = \sum dq_k\wedge dp_k$. We then consider a trivial complex line bundle $L$ on $T^*(K)$,
$$L = T^*(K)\times\mathbb{C},$$
with trivial Hermitian structure. Sections of this bundle are thus just functions on $T^*(K)$. We define a connection (or covariant derivative) on $L$ by
$$\nabla_X = X - \frac{1}{i\hbar}\theta(X). \tag{2.1}$$
Note that the connection, and hence all subsequent constructions, depends on $\hbar$ (Planck's constant). The curvature of this connection is given by
$$[\nabla_X, \nabla_Y] - \nabla_{[X,Y]} = \frac{1}{i\hbar}\omega(X, Y).$$
We let ε denote the Liouville volume form on T ∗ (K), given by ε=
1 n ω , n!
240
B. C. Hall
where n = dim K = (1/2) dim T ∗ (K). Integrating this form gives the associated Liouville volume measure. Concretely we have the identification T ∗ (K) ∼ =K ×k
(2.2)
by means of left-translation and the inner product on k. Under this identification we have [H3, Lemma 4] fε = f (x, Y ) dx dY, (2.3) T ∗ (K)
k K
where $dx$ is Haar measure on $K$, normalized to coincide with the Riemannian volume measure, and $dY$ is Lebesgue measure on $\mathfrak{k}$, normalized by means of the inner product.

The prequantum Hilbert space is then the space of sections of $L$ that are square integrable with respect to $\varepsilon$. This space may be identified with $L^2(T^*(K), \varepsilon)$. One motivation for this construction is the existence of a natural mapping $Q$ from functions on $T^*(K)$ into the space of symmetric operators on the prequantum Hilbert space, satisfying $[Q(f), Q(g)] = -i\hbar\,Q(\{f, g\})$, where $\{f, g\}$ is the Poisson bracket. Explicitly, $Q(f) = i\hbar\nabla_{X_f} + f$, where $X_f$ is the Hamiltonian vector field associated to $f$. This “prequantization map” will not play an important role in this paper. See [Wo, Chap. 8] for more information.

2.3. The Kähler-polarized subspace. Let me summarize what the results of this subsection will be. The cotangent bundle $T^*(K)$ has a natural complex structure that comes by identifying it with the “complexification” of $K$. This complex structure allows us to define a notion of Kähler-polarized sections of the bundle $L$. There exists a natural trivializing polarized section $s_0$ such that every other polarized section is a holomorphic function times $s_0$. The Kähler-polarized Hilbert space is then identifiable with an $L^2$ space of holomorphic functions on $T^*(K)$, where the measure is the Liouville measure times $|s_0|^2$. We then consider the “half-form” bundle $\delta_1$. The half-form corrected Kähler Hilbert space is the space of polarized sections of $L \otimes \delta_1$. This may be identified with an $L^2$ space of holomorphic functions on $T^*(K)$, where now the measure is the Liouville measure times $|s_0|^2|\beta_0|^2$, where $\beta_0$ is a trivializing polarized section of $\delta_1$. The main result is that this last measure coincides up to a constant with the $K$-invariant heat kernel measure on $T^*(K)$ introduced in [H1].
Thus the half-form-corrected Kähler-polarized Hilbert space of geometric quantization coincides (up to a constant) with the generalized Segal–Bargmann space of [H1, Thm. 2].

We let $K_{\mathbb{C}}$ denote the complexification of $K$, as described in Sect. 2.1, and we let $T^*(K)$ denote the cotangent bundle of $K$. There is a diffeomorphism of $T^*(K)$ with $K_{\mathbb{C}}$ as follows. We identify $T^*(K)$ with $K \times \mathfrak{k}^*$ by means of left-translation and then with $K \times \mathfrak{k}$ by means of the inner product on $\mathfrak{k}$. We consider the map $\Phi : K \times \mathfrak{k} \to K_{\mathbb{C}}$ given by
$$\Phi(x, Y) = xe^{iY}, \qquad x \in K,\ Y \in \mathfrak{k}. \tag{2.4}$$
The map $\Phi$ is a diffeomorphism. If we use $\Phi$ to transport the complex structure of $K_{\mathbb{C}}$ to $T^*(K)$, then the resulting complex structure on $T^*(K)$ is compatible with the symplectic structure on $T^*(K)$, so that $T^*(K)$ becomes a Kähler manifold. (See [H3, Sect. 3].)

Consider the function $\kappa : T^*(K) \to \mathbb{R}$ given by
$$\kappa(x, Y) = |Y|^2. \tag{2.5}$$
This function is a Kähler potential for the complex structure on $T^*(K)$ described in the previous paragraph. Specifically we have
$$\operatorname{Im}\bar\partial\kappa = \theta. \tag{2.6}$$
Then because $\omega = -d\theta$ it follows that
$$i\partial\bar\partial\kappa = \omega. \tag{2.7}$$
An important feature of this situation is the natural explicit form of the Kähler potential. This formula for $\kappa$ comes as a special case of the general construction of Guillemin–Stenzel [GStenz1, Sect. 5] and Lempert–Szőke [LS, Cor. 5.5]. In this case one can compute directly that $\kappa$ satisfies (2.6) and (2.7) (see the first appendix).

We define a smooth section $s$ of $L$ to be Kähler-polarized if $\nabla_X s = 0$ for all vectors of type $(0, 1)$. Equivalently $s$ is polarized if $\nabla_{\partial/\partial\bar z_k} s = 0$ for all $k$, in holomorphic local coordinates. The Kähler-polarized Hilbert space is then the space of square-integrable Kähler-polarized sections of $L$. (See [Wo, Sect. 9.2].)

Proposition 2.3. If we think of sections $s$ of $L$ as functions on $T^*(K)$ then the Kähler-polarized sections are precisely the functions $s$ of the form
$$s = F e^{-|Y|^2/2\hbar},$$
with $F$ holomorphic and $|Y|^2 = \kappa(x, Y)$ the Kähler potential (2.5). The notion of holomorphic is via the identification (2.4) of $T^*(K)$ with $K_{\mathbb{C}}$.

Proof. If we work in holomorphic local coordinates $z_1, \ldots, z_n$ then we want sections $s$ such that $\nabla_{\partial/\partial\bar z_k} s = 0$ for all $k$. The condition (2.6) on $\kappa$ says that in these coordinates
$$\theta = \frac{1}{2i}\sum_k\left(\frac{\partial\kappa}{\partial\bar z_k}\,d\bar z_k - \frac{\partial\kappa}{\partial z_k}\,dz_k\right).$$
So
$$\theta\!\left(\frac{\partial}{\partial\bar z_k}\right) = \frac{1}{2i}\frac{\partial\kappa}{\partial\bar z_k}.$$
Then we get, using definition (2.1) of the covariant derivative,
$$\nabla_{\partial/\partial\bar z_k} e^{-\kappa/2\hbar} = \frac{\partial}{\partial\bar z_k}e^{-\kappa/2\hbar} - \frac{1}{i\hbar}\,\theta\!\left(\frac{\partial}{\partial\bar z_k}\right)e^{-\kappa/2\hbar} = \left(-\frac{1}{2\hbar}\frac{\partial\kappa}{\partial\bar z_k} - \frac{1}{i\hbar}\frac{1}{2i}\frac{\partial\kappa}{\partial\bar z_k}\right)e^{-\kappa/2\hbar} = 0.$$
Now any smooth section $s$ can be written uniquely as $s = F\exp(-\kappa/2\hbar)$, where $F$ is a smooth complex-valued function. Such a section is polarized precisely if
$$0 = \nabla_{\partial/\partial\bar z_k}\left(F e^{-\kappa/2\hbar}\right) = \frac{\partial F}{\partial\bar z_k}e^{-\kappa/2\hbar} + F\,\nabla_{\partial/\partial\bar z_k}e^{-\kappa/2\hbar} = \frac{\partial F}{\partial\bar z_k}e^{-\kappa/2\hbar}$$
for all $k$, that is, precisely if $F$ is holomorphic.
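As a sanity check on Proposition 2.3, the following minimal sketch treats the simplest case $K = \mathbb{R}$, where $K_{\mathbb{C}} = \mathbb{C}$, the identification (2.4) becomes $z = x + iy$ in additive notation, $\theta = y\,dx$, and $\kappa = y^2$. It approximates $\nabla_{\partial/\partial\bar z}$ by central differences and compares a section built from a holomorphic $F$ with one built from a non-holomorphic $F$. The value of `HBAR`, the test point, and all function names are illustrative assumptions, not taken from the paper.

```python
import cmath

HBAR = 0.7  # illustrative value of Planck's constant (assumption)

def dbar(s, z, h=1e-5):
    """Central-difference approximation of ds/d(z-bar) = (ds/dx + i ds/dy) / 2."""
    ds_dx = (s(z + h) - s(z - h)) / (2 * h)
    ds_dy = (s(z + 1j * h) - s(z - 1j * h)) / (2 * h)
    return 0.5 * (ds_dx + 1j * ds_dy)

def nabla_dbar(s, z):
    """Covariant derivative (2.1) along d/d(z-bar) for K = R, where theta = y dx.

    Here theta(d/d z-bar) = y / 2, so
    nabla s = ds/d(z-bar) - (1 / (i hbar)) * (y / 2) * s.
    """
    y = z.imag
    return dbar(s, z) - (1.0 / (1j * HBAR)) * (y / 2.0) * s(z)

def section(F):
    """Return s = F * exp(-kappa / (2 hbar)) with kappa(x, y) = y**2."""
    return lambda z: F(z) * cmath.exp(-(z.imag ** 2) / (2 * HBAR))

z0 = complex(0.3, -0.8)                       # arbitrary test point
holo = section(lambda z: z ** 2 + 1)          # F holomorphic
non_holo = section(lambda z: z.conjugate())   # F not holomorphic

residual_holo = abs(nabla_dbar(holo, z0))          # tiny (finite-difference error only)
residual_non_holo = abs(nabla_dbar(non_holo, z0))  # clearly nonzero
print(residual_holo, residual_non_holo)
```

For the non-holomorphic choice $F(z) = \bar z$ the residual is $|\partial F/\partial\bar z|\,e^{-\kappa/2\hbar} = e^{-y^2/2\hbar}$, in line with the last display of the proof.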
The norm of a polarized section $s$ (as in Proposition 2.3) is computed as
$$\|s\|^2 = \int_{T^*(K)} |F|^2 e^{-\kappa/\hbar}\,\varepsilon = \int_{\mathfrak{k}}\int_K \left|F(xe^{iY})\right|^2 e^{-|Y|^2/\hbar}\,dx\,dY.$$
Here $F$ is a holomorphic function on $K_{\mathbb{C}}$ which we are “transporting” to $T^*(K)$ by means of the map $\Phi(x, Y) = xe^{iY}$. (Recall (2.2) and (2.3).) Thus if we identify the section $s$ with the holomorphic function $F$, the Kähler-polarized Hilbert space will be identified with
$$\mathcal{H}L^2(T^*(K), e^{-|Y|^2/\hbar}\varepsilon).$$
Here $\varepsilon$ is the Liouville volume measure and $\mathcal{H}L^2$ denotes the space of holomorphic functions that are square-integrable with respect to the indicated measure.

In Sect. 7 of [H4] I compared the measure $e^{-|Y|^2/\hbar}\varepsilon$ to the “$K$-invariant heat kernel measure” $\nu_\hbar$ on $K_{\mathbb{C}} \cong T^*(K)$. The measure $\nu_\hbar$ is the one that is used in the generalized Segal–Bargmann transform of [H1, Thm. 2]. In the commutative case the two measures agree up to a constant. However, in the non-commutative case the two measures differ by a non-constant function of $Y$, and it is easily seen that this discrepancy cannot be eliminated by choosing a different trivializing polarized section of $L$. In the remainder of this section we will see that this discrepancy between the heat kernel measure and the geometric quantization measure can be eliminated by the “half-form correction”. I am grateful to Dan Freed for suggesting to me that this could be the case.

We now consider the canonical bundle for $T^*(K)$ relative to the complex structure obtained from $K_{\mathbb{C}}$. The canonical bundle is the complex line bundle whose sections are complex-valued $n$-forms of type $(n, 0)$. The forms of type $(n, 0)$ may be described as those $n$-forms $\alpha$ for which $X \lrcorner\, \alpha = 0$ for all vectors of type $(0, 1)$. We then define the polarized sections of the canonical bundle to be the $(n, 0)$-forms $\alpha$ such that $X \lrcorner\, d\alpha = 0$ for all vector fields of type $(0, 1)$. (Compare [Wo, Eq. (9.3.1)].) These are nothing but the holomorphic $n$-forms. We define a Hermitian structure on the canonical bundle by defining for an $(n, 0)$-form $\alpha$
$$|\alpha|^2 = \frac{\bar\alpha \wedge \alpha}{b\,\varepsilon}.$$
Here the ratio means the only thing that is reasonable: $|\alpha|^2$ is the unique function such that $|\alpha|^2\, b\,\varepsilon = \bar\alpha \wedge \alpha$. The constant $b$ should be chosen in such a way as to make $|\alpha|^2$ positive; we may take $b = (2i)^n(-1)^{n(n-1)/2}$.

In this situation the canonical bundle may be trivialized as follows. We think of $T^*(K)$ as $K_{\mathbb{C}}$, since at the moment the symplectic structure is not relevant. If $Z_1, \ldots, Z_n$ are linearly independent left-invariant holomorphic 1-forms on $K_{\mathbb{C}}$ then their wedge product is a nowhere-vanishing holomorphic $n$-form. We now choose a square root $\delta_1$ of the canonical bundle in such a way that there exists a smooth section of $\delta_1$ whose square is $Z_1 \wedge \cdots \wedge Z_n$. This section of $\delta_1$ will be denoted by the mnemonic $\sqrt{Z_1 \wedge \cdots \wedge Z_n}$. There then exists a unique notion of polarized sections of $\delta_1$ such that 1) a locally defined, smooth, nowhere-zero section $\nu$ of $\delta_1$ is polarized if and only if $\nu^2$ is a polarized section of the canonical bundle, and 2) if $\nu$ is a locally defined, nowhere-zero, polarized section of $\delta_1$ and $F$ is a smooth function, then $F\nu$ is polarized if and only if $F$ is holomorphic. (See [Wo, p. 186].) Concretely the polarized sections of $\delta_1$ are of the form
$$s = F(g)\sqrt{Z_1 \wedge \cdots \wedge Z_n}$$
with $F$ a holomorphic function on $K_{\mathbb{C}}$. The absolute value of such a section is defined as
$$|s|^2 := \sqrt{(s^2, s^2)} = |F|^2\sqrt{\frac{\bar Z_1 \wedge \cdots \wedge \bar Z_n \wedge Z_1 \wedge \cdots \wedge Z_n}{b\,\varepsilon}}.$$
Now the half-form corrected Kähler-polarized Hilbert space is the space of square-integrable polarized sections of $L \otimes \delta_1$. (The polarized sections of $L \otimes \delta_1$ are precisely those that can be written locally as the product of a polarized section of $L$ and a polarized section of $\delta_1$.) Such sections are precisely those that can be expressed as
$$s = F e^{-|Y|^2/2\hbar} \otimes \sqrt{Z_1 \wedge \cdots \wedge Z_n} \tag{2.8}$$
with $F$ holomorphic. The norm of such a section is computed as
$$\|s\|^2 = \int_{T^*(K)} |F|^2 e^{-|Y|^2/\hbar}\,\eta\,\varepsilon,$$
where $\eta$ is the function given by
$$\eta = \sqrt{\frac{\bar Z_1 \wedge \cdots \wedge \bar Z_n \wedge Z_1 \wedge \cdots \wedge Z_n}{b\,\varepsilon}}, \tag{2.9}$$
and where $b = (2i)^n(-1)^{n(n-1)/2}$. We may summarize the preceding discussion in the following theorem.

Theorem 2.4. If we write elements of the half-form corrected Kähler Hilbert space in the form (2.8) then this Hilbert space may be identified with $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$, where $\gamma_\hbar$ is the measure given by
$$\gamma_\hbar = e^{-|Y|^2/\hbar}\,\eta\,\varepsilon.$$
Here $\varepsilon$ is the canonical volume form on $T^*(K)$, $|Y|^2$ is the Kähler potential (2.5), and $\eta$ is the “half-form correction” defined in (2.9) and given explicitly in (2.10) below. Here as elsewhere $\mathcal{H}L^2$ denotes the space of square-integrable holomorphic functions.

Note that $\bar Z_1 \wedge \cdots \wedge \bar Z_n \wedge Z_1 \wedge \cdots \wedge Z_n$ is a left-invariant $2n$-form on $K_{\mathbb{C}}$, so that the associated measure is simply a multiple of Haar measure on $K_{\mathbb{C}}$. Meanwhile $\varepsilon$ is just the Liouville volume form on $T^*(K)$. Thus $\eta$ is the square root of the density of Haar measure with respect to Liouville measure, under our identification of $K_{\mathbb{C}}$ with $T^*(K)$. Both measures are $K$-invariant, so in our $(x, Y)$ coordinates on $T^*(K)$, $\eta$ will be a function of $Y$ only. By [H3, Lem. 5] we have that $\eta(Y)$ is the unique Ad-$K$-invariant function on $\mathfrak{k}$ such that in a maximal abelian subalgebra
$$\eta(Y) = \prod_{\alpha \in R^+}\frac{\sinh\alpha(Y)}{\alpha(Y)}, \tag{2.10}$$
where $R^+$ is a set of positive roots.

Meanwhile there is the “$K$-invariant heat kernel measure” $\nu_\hbar$ on $K_{\mathbb{C}} \cong T^*(K)$, used in the construction of the generalized Segal–Bargmann transform in [H1, Thm. 2]. When written in terms of the polar decomposition $g = xe^{iY}$, $\nu_\hbar$ is given explicitly by
$$d\nu_\hbar = (\pi\hbar)^{-n/2}e^{-|\rho|^2\hbar}e^{-|Y|^2/\hbar}\,\eta(Y)\,dx\,dY.$$
(See [H3, Eq. (13)].) Here $\rho$ is half the sum of the positive roots for the group $K$. Thus apart from an overall constant, the measure on $T^*(K)$ coming from geometric quantization coincides exactly with the heat kernel measure of [H1]. So we have proved the following result.

Theorem 2.5. For each $\hbar > 0$ there exists a constant $c_\hbar$ such that the measure $\gamma_\hbar$ coming from geometric quantization and the heat kernel measure $\nu_\hbar$ are related by $\nu_\hbar = c_\hbar\gamma_\hbar$, where
$$c_\hbar = (\pi\hbar)^{-n/2}e^{-|\rho|^2\hbar},$$
and where $\rho$ is half the sum of the positive roots for the group $K$.

Let us try to understand, at least in part, the seemingly miraculous agreement between these two measures. (See also Sect. 4.) The cotangent bundle $T^*(K)$ has a complex structure obtained by identification with $K_{\mathbb{C}}$. The metric tensor on $K$ then has an analytic continuation to a holomorphic n-tensor on $T^*(K)$. The restriction of the analytically continued metric tensor to the fibers of $T^*(K)$ is the negative of a Riemannian metric $g$. Each fiber, with this metric, is isometric to the non-compact symmetric space $K_{\mathbb{C}}/K$. (See [St].) This reflects the well-known duality between compact and non-compact symmetric spaces. Each fiber is also identified with $\mathfrak{k}$, and under this identification the Riemannian volume measure with respect to $g$ is given by
$$\sqrt{g}\,dY = \eta(Y)^2\,dY.$$
That is, the “half-form factor” $\eta$ is simply the square root of the Jacobian of the exponential mapping for $K_{\mathbb{C}}/K$.

Now on any Riemannian manifold the heat kernel measure (at a fixed base point, written in exponential coordinates) has an asymptotic expansion of the form
$$d\mu_\hbar(Y) \sim (\pi\hbar)^{-n/2}e^{-|Y|^2/\hbar}\,j^{1/2}(Y)\left(1 + t\,a_1(Y) + t^2 a_2(Y) + \cdots\right)dY. \tag{2.11}$$
Here $j(Y)$ is the Jacobian of the exponential mapping, also known as the Van Vleck–Morette determinant. (I have written $\hbar$ for the time variable and normalized the heat equation to be $du/dt = (1/4)\Delta u$.) Note that this is the expansion for the heat kernel measure; in the expansion of the heat kernel function one has $j^{-1/2}$ instead of $j^{1/2}$.

In the case of the manifold $K_{\mathbb{C}}/K$ we have a great simplification. All the higher terms in the series are just constant multiples of $j^{1/2}$ and we get an exact convergent expression of the form
$$d\mu_\hbar(Y) = (\pi\hbar)^{-n/2}e^{-|Y|^2/\hbar}\,j^{1/2}(Y)\,f(t)\,dY. \tag{2.12}$$
Here explicitly $f(t) = \exp(-|\rho|^2 t)$, where $\rho$ is half the sum of the positive roots. The measure $\nu_\hbar$ in [H1] is then simply this measure times the Haar measure $dx$ in the $K$-directions. So we have
$$d\nu_\hbar = e^{-|\rho|^2 t}(\pi\hbar)^{-n/2}e^{-|Y|^2/\hbar}\,j^{1/2}(Y)\,dx\,dY.$$
So how does geometric quantization produce a multiple of $\nu_\hbar$? The Gaussian factor in $\nu_\hbar$ comes from the simple explicit form of the Kähler potential. The factor of $j^{1/2}$ in $\nu_\hbar$ is the half-form correction, that is, $j^{1/2}(Y) = \eta(Y)$.

If we begin with a general compact symmetric space $X$ then much of the analysis goes through: $T^*(X)$ has a natural complex structure, $|Y|^2$ is a Kähler potential, and the fibers are identifiable with non-compact symmetric spaces. (See [St, p. 48].) Furthermore, the half-form correction is still the square root of the Jacobian of the exponential mapping. What goes wrong is that the heat kernel expansion (2.11) does not simplify to an expression of the form (2.12). So the heat kernel measure used in [St] and the measure coming from geometric quantization will not agree up to a constant. Nevertheless the two measures do agree “to leading order in $\hbar$”. I do not know whether the geometric quantization pairing map is unitary in the case of general compact symmetric spaces $X$. There is, however, a unitary Segal–Bargmann-type transform, given in terms of heat kernels and described in [St].
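The exact formula (2.12) can be checked numerically in the simplest non-commutative case. The sketch below assumes $K = SU(2)$ (so $n = 3$ and $K_{\mathbb{C}}/K$ is hyperbolic 3-space), with the inner product on $\mathfrak{k} \cong \mathbb{R}^3$ scaled so that the single positive root satisfies $\alpha(Y) = |Y|$ on a maximal abelian subalgebra. Under that assumption (2.10) gives $\eta(Y) = \sinh|Y|/|Y|$ on all of $\mathfrak{k}$ and $|\rho|^2 = 1/4$. The code evaluates $\eta$ and integrates the density $(\pi\hbar)^{-n/2}e^{-|\rho|^2\hbar}e^{-|Y|^2/\hbar}\eta(Y)$ over $\mathfrak{k}$ in polar coordinates; a heat kernel measure should have total mass one. The function names and numerical parameters are illustrative choices.

```python
import math

def eta(root_values):
    """Half-form factor (2.10): product of sinh(a)/a over the values a = alpha(Y)."""
    result = 1.0
    for a in root_values:
        # sinh(a)/a tends to 1 as a tends to 0
        result *= 1.0 if abs(a) < 1e-12 else math.sinh(a) / a
    return result

def total_mass(hbar, r_max=20.0, steps=20000):
    """Total mass of the measure (2.12) on k = R^3, by the midpoint rule in r = |Y|.

    Assumptions (not from the paper): K = SU(2), n = 3, one positive root
    with alpha(Y) = |Y|, hence |rho|^2 = 1/4.
    """
    n, rho_sq = 3, 0.25
    prefactor = (math.pi * hbar) ** (-n / 2) * math.exp(-rho_sq * hbar)
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        # dY = 4 pi r^2 dr for a function of |Y| only
        total += math.exp(-r * r / hbar) * eta([r]) * 4.0 * math.pi * r * r * dr
    return prefactor * total

print(total_mass(1.0))  # expected to be close to 1
print(total_mass(0.5))  # expected to be close to 1
```

In this case the radial integral can also be done in closed form by completing the square in $r\,e^{r - r^2/\hbar}$, which produces a factor $e^{\hbar/4}$ that cancels $e^{-|\rho|^2\hbar}$ exactly; this is the role of $f(t)$ in (2.12).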
2.4. The vertically polarized Hilbert space. After much sound and fury, the vertically polarized Hilbert space will be identified simply with $L^2(K, dx)$, where $dx$ is Haar measure on $K$. Nevertheless, the fancy constructions described below are important for two reasons. First, the vertically polarized Hilbert space does not depend on a choice of measure on $K$. The Hilbert space is really a space of “half-forms”. If one chooses a smooth measure $\mu$ on $K$ (with nowhere-vanishing density with respect to Lebesgue measure in each local coordinate system) then this choice gives an identification of the vertically polarized Hilbert space with $L^2(K, \mu)$. Although Haar measure is the obvious choice for $\mu$, the choice of measure is needed only to give a concrete realization of the space as an $L^2$ space; the vertically polarized Hilbert space exists independently of this choice. Second, the description of the vertically polarized Hilbert space as a space of half-forms will be essential to the construction of the pairing map in Sect. 2.5.

The following description follows Sect. 9.3 of [Wo]. Roughly speaking our Hilbert space will consist of objects whose squares are $n$-forms on $T^*(K)$ that are constant along the fibers and thus descend to $n$-forms on $K$. The norm of such an object is computed by squaring and then integrating the resulting $n$-form over $K$.

We consider sections of $L$ that are covariantly constant in the directions parallel to the fibers of $T^*(K)$. Note that each fiber of $T^*(K)$ is a Lagrangian submanifold of $T^*(K)$, so that $T^*(K)$ is naturally foliated into Lagrangian submanifolds. Suppose that $X$ is a tangent vector to $T^*(K)$ that is parallel to one of the fibers. Then it is easily seen that $\theta(X) = 0$, where $\theta$ is the canonical 1-form on $T^*(K)$. Thus, recalling definition (2.1) of the covariant derivative and thinking of the sections of $L$ as functions on $T^*(K)$, the vertically polarized sections are simply the functions that are constant along the fibers. Such a section cannot be square-integrable with respect to the Liouville measure (unless it is zero almost everywhere). This means that we cannot construct the vertically polarized Hilbert space as a subspace of the prequantum Hilbert space.

We consider, then, the canonical bundle of $T^*(K)$ relative to the vertical polarization. This is the real line bundle whose sections are $n$-forms $\alpha$ such that
$$X \lrcorner\, \alpha = 0 \tag{2.13}$$
for all vectors parallel to the fibers of $T^*(K)$. We call such a section polarized if in addition we have
$$X \lrcorner\, d\alpha = 0 \tag{2.14}$$
for all vectors $X$ parallel to the fibers. (See [Wo, Eq. (9.3.1)].) Now let $Q$ be the space of fibers (or the space of leaves of our Lagrangian foliation). Clearly $Q$ may be identified with $K$ itself, the “configuration space” corresponding to the “phase space” $T^*(K)$. Let $\mathrm{pr} : T^*(K) \to K$ be the projection map. It is not hard to verify that if $\alpha$ is an $n$-form on $T^*(K)$ satisfying (2.13) and (2.14) then there exists a unique $n$-form $\beta$ on $K$ such that $\alpha = \mathrm{pr}^*(\beta)$. We may think of such an $n$-form $\alpha$ as being constant along the fibers, so that it descends unambiguously to an $n$-form $\beta$ on $K$. In this way the polarized sections of the canonical bundle may be identified with $n$-forms on $K$.

Since $K$ is a Lie group it is orientable. So let us pick an orientation on $K$, which we think of as an equivalence class of nowhere-vanishing $n$-forms on $K$. Then if $\beta$ is a nowhere-vanishing oriented $n$-form on $K$, we define the “positive” part of each fiber of the canonical bundle to be the half-line in which $\mathrm{pr}^*(\beta)$ lies. We may then construct a unique trivial real line bundle $\delta_2$ such that 1) the square of $\delta_2$ is the canonical bundle and 2) if $\gamma$ is a nowhere-vanishing section of $\delta_2$ then $\gamma^2$ lies in the positive part of the canonical bundle. We have a natural notion of polarized sections of $\delta_2$, such that 1) a locally defined, smooth, nowhere-zero section $\nu$ of $\delta_2$ is polarized if and only if $\nu^2$ is a polarized section of the canonical bundle and 2) if $\nu$ is a locally defined, nowhere-zero, polarized section of $\delta_2$ and $f$ is a smooth function, then $f\nu$ is polarized if and only if $f$ is constant along the fibers.

Now let $\beta$ be any nowhere-vanishing oriented $n$-form on $K$. Then there exists a polarized section of $\delta_2$ (unique up to an overall sign) whose square is $\mathrm{pr}^*(\beta)$. This section is denoted $\sqrt{\mathrm{pr}^*(\beta)}$. Any other polarized section of $\delta_2$ is then of the form $f(x)\sqrt{\mathrm{pr}^*(\beta)}$, where $f(x)$ denotes a real-valued function on $T^*(K)$ that is constant along the fibers.

Finally we consider polarized sections of $L \otimes \delta_2$, i.e. those that are locally the product of a vertically polarized section of $L$ and a polarized section of $\delta_2$. These are precisely the sections that can be expressed in the form
$$s = f(x) \otimes \sqrt{\mathrm{pr}^*(\beta)},$$
where $f$ is a complex-valued function on $T^*(K)$ that is constant along the fibers. The norm of such a section is computed as
$$\|s\|^2 = \int_K |f(x)|^2\,\beta.$$
It is easily seen that this expression for $\|s\|$ is independent of the choice of $\beta$. Note that the integration is over the quotient space $K$, not over $T^*(K)$. In particular we may choose linearly independent left-invariant 1-forms $\eta_1, \ldots, \eta_n$ on $K$ in such a way that $\eta_1 \wedge \cdots \wedge \eta_n$ is oriented. Then every polarized section of $L \otimes \delta_2$ is of the form
$$s = f(x) \otimes \sqrt{\mathrm{pr}^*(\eta_1 \wedge \cdots \wedge \eta_n)}$$
and the norm of a section is computable as
$$\|s\|^2 = \int_K |f(x)|^2\,\eta_1 \wedge \cdots \wedge \eta_n = \int_K |f(x)|^2\,dx, \tag{2.15}$$
where $dx$ is Haar measure on $K$. Thus we may identify the vertically polarized Hilbert space with $L^2(K, dx)$. More precisely, if we assume up to now that all sections are smooth, then we have the subspace of $L^2(K, dx)$ consisting of smooth functions. The vertically polarized Hilbert space is then the completion of this space, which is just $L^2(K, dx)$.

2.5. Pairing. Geometric quantization gives a way to define a pairing between the Kähler-polarized and vertically polarized Hilbert spaces, that is, a sesquilinear map from $H_{\text{Kähler}} \times H_{\text{Vertical}}$ into $\mathbb{C}$. This pairing then induces a linear map between the two spaces, called the pairing map. The main results are: (1) the pairing map coincides up to a constant with the generalized Segal–Bargmann transform of [H1], and (2) a constant multiple of the pairing map is unitary from the vertically polarized Hilbert space onto the Kähler-polarized Hilbert space.

Now the elements of the Kähler-polarized Hilbert space are polarized sections of $L \otimes \delta_1$ and the elements of the vertically polarized Hilbert space are polarized sections of $L \otimes \delta_2$. Here $\delta_1$ and $\delta_2$ are square roots of the canonical bundle for the Kähler polarization and the vertical polarization, respectively. The pairing of the Hilbert spaces will be achieved by appropriately pairing the sections at each point and then integrating over $T^*(K)$ with respect to the canonical volume form $\varepsilon$. (See [Wo, p. 234].)

A polarized section $s_1$ of $L \otimes \delta_1$ can be expressed as $s_1 = f_1 \otimes \beta_1$, where $f_1$ is a Kähler-polarized section of $L$ and $\beta_1$ is a polarized section of $\delta_1$. Similarly, a polarized section of $L \otimes \delta_2$ is expressible as $s_2 = f_2 \otimes \beta_2$ with $f_2$ a vertically polarized section of $L$ and $\beta_2$ a polarized section of $\delta_2$. We define a pairing between $\beta_1$ and $\beta_2$ by
$$(\beta_1, \beta_2) = \sqrt{\frac{\overline{\beta_1^2} \wedge \beta_2^2}{c\,\varepsilon}},$$
where $c$ is a constant which I will take to be $c = (-i)^n(-1)^{n(n+1)/2}$. (This constant is chosen so that things come out nicely in the $\mathbb{R}^n$ case. See Sect. 5.) Note that $\beta_1^2$ and $\beta_2^2$ are $n$-forms on $T^*(K)$, so that $\overline{\beta_1^2} \wedge \beta_2^2$ is a $2n$-form on $T^*(K)$. Note that $(\beta_1, \beta_2)$ is a complex-valued function on $T^*(K)$. There are at most two continuous ways of choosing the sign of the square root, which differ just by a single overall sign. That there is at least one such choice will be evident below.

We then define the pairing of two sections $s_1$ and $s_2$ (as in the previous paragraph) by
$$\langle s_1, s_2\rangle_{\mathrm{pair}} = \int_{T^*(K)} (f_1, f_2)\,(\beta_1, \beta_2)\,\varepsilon \tag{2.16}$$
whenever the integral is well-defined. Here as usual $\varepsilon$ is the Liouville volume form on $T^*(K)$. It is easily seen that this expression is independent of the decomposition of $s_i$ as $f_i \otimes \beta_i$. The quantity $(f_1, f_2)$ is computed using the (trivial) Hermitian structure on the line bundle $L$. Although the integral in (2.16) may not be absolutely convergent in general, there are dense subspaces of the two Hilbert spaces for which it is. Furthermore, Theorem 2.6 below will show that the pairing can be extended by continuity to all $s_1, s_2$ in their respective Hilbert spaces.

Now, we have expressed the polarized sections of $L \otimes \delta_1$ in the form
$$F e^{-|Y|^2/2\hbar} \otimes \sqrt{Z_1 \wedge \cdots \wedge Z_n},$$
where $F$ is a holomorphic function on $K_{\mathbb{C}}$ and $Z_1, \ldots, Z_n$ are left-invariant holomorphic 1-forms on $K_{\mathbb{C}}$. As always we identify $K_{\mathbb{C}}$ with $T^*(K)$ as in (2.4). The function $|Y|^2$ is the Kähler potential (2.5). We have expressed the polarized sections of $L \otimes \delta_2$ in the form
$$f(x) \otimes \sqrt{\mathrm{pr}^*(\eta_1 \wedge \cdots \wedge \eta_n)},$$
where $f(x)$ is a function on $T^*(K)$ that is constant along the fibers, $\eta_1, \ldots, \eta_n$ are left-invariant 1-forms on $K$, and $\mathrm{pr} : T^*(K) \to K$ is the projection map. Thus we have the following expression for the pairing:
$$\langle F, f\rangle_{\mathrm{pair}} = \int_K\int_{\mathfrak{k}} \overline{F(xe^{iY})}\,f(x)\,e^{-|Y|^2/2\hbar}\,\zeta(Y)\,dx\,dY, \tag{2.17}$$
where $\zeta$ is the function on $T^*(K)$ given by
$$\zeta = \sqrt{\frac{\bar Z_1 \wedge \cdots \wedge \bar Z_n \wedge \mathrm{pr}^*(\eta_1 \wedge \cdots \wedge \eta_n)}{c\,\varepsilon}}, \tag{2.18}$$
where $c = (-i)^n(-1)^{n(n+1)/2}$. I have expressed things in terms of the functions $F$ and $f$, and I have used the identification (2.2) of $T^*(K)$ with $K \times \mathfrak{k}$. It is easily seen that $\zeta(x, Y)$ is independent of $x$, and so I have written $\zeta(Y)$.

Theorem 2.6. Let us identify the vertically polarized Hilbert space with $L^2(K)$ as in (2.15) and the Kähler-polarized Hilbert space with $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$ as in Theorem 2.4. Then there exists a unique bounded linear operator $\Pi_\hbar : L^2(K) \to \mathcal{H}L^2(T^*(K), \gamma_\hbar)$ such that
$$\langle F, f\rangle_{\mathrm{pair}} = \langle F, \Pi_\hbar f\rangle_{\mathcal{H}L^2(T^*(K), \gamma_\hbar)} = \langle \Pi_\hbar^* F, f\rangle_{L^2(K)}$$
for all $f \in L^2(K)$ and all $F \in \mathcal{H}L^2(T^*(K), \gamma_\hbar)$. We call $\Pi_\hbar$ the pairing map. The pairing map has the following properties.

(1) There exists a constant $a_\hbar$ such that for any $f \in L^2(K)$, $\Pi_\hbar f$ is the unique holomorphic function on $T^*(K)$ whose restriction to $K$ is given by
$$\Pi_\hbar f\big|_K = a_\hbar\,e^{\hbar\Delta_K/2}f.$$
Equivalently,
$$\Pi_\hbar f(g) = a_\hbar\int_K \rho_\hbar(gx^{-1})\,f(x)\,dx, \qquad g \in K_{\mathbb{C}},$$
where $\rho_\hbar$ is the heat kernel on $K$, analytically continued to $K_{\mathbb{C}}$.

(2) The map $\Pi_\hbar^*$ may be computed as
$$\Pi_\hbar^* F(x) = \int_{\mathfrak{k}} F(xe^{iY})\,e^{-|Y|^2/2\hbar}\,\zeta(Y)\,dY,$$
where $\zeta$ is defined by (2.18) and computed in Proposition 2.7 below.

(3) There exists a constant $b_\hbar$ such that $b_\hbar\Pi_\hbar$ is a unitary map of $L^2(K)$ onto $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$. Thus $\Pi_\hbar^* = b_\hbar^{-2}\,\Pi_\hbar^{-1}$.

The constants $a_\hbar$ and $b_\hbar$ are given explicitly as $a_\hbar = (2\pi\hbar)^{n/2}e^{-|\rho|^2\hbar/2}$ and $b_\hbar = (4\pi\hbar)^{-n/4}$, where $\rho$ is half the sum of the positive roots for $K$.
Remarks. (1) The map $\Pi_\hbar$ coincides (up to the constant $a_\hbar$) with the generalized Segal–Bargmann transform for $K$, as described in [H1, Thm. 2].

(2) The formula for $\Pi_\hbar^* F$ may be taken literally on a dense subspace of $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$. For general $F$, however, one should integrate over a ball of radius $R$ in $\mathfrak{k}$ and then take a limit in $L^2(K)$, as in [H2, Thm. 1].

(3) The formula for $\Pi_\hbar^*$ is an immediate consequence of the formula (2.17) for the pairing. By computing $\zeta(Y)$ explicitly we may recognize $\Pi_\hbar^*$ as simply a constant times the inverse Segal–Bargmann transform for $K$, as described in [H2].

(4) In [H2] I deduce the unitarity of the generalized Segal–Bargmann transform from the inversion formula. However, I do not know how to prove the unitarity of the pairing map without recognizing that the measure in the formula for $\Pi_\hbar^*$ is related to the heat kernel measure for $K_{\mathbb{C}}/K$.

(5) Since $F$ is holomorphic, there can be many different formulas for $\Pi_\hbar^*$ (or $\Pi_\hbar^{-1}$). In particular, if one takes the second expression for $\Pi_\hbar$ and computes the adjoint in the obvious way, one will not get the given expression for $\Pi_\hbar^*$. Nevertheless, the two expressions for $\Pi_\hbar^*$ do agree on holomorphic functions.

Proof. We begin by writing the explicit formula for $\zeta$.

Proposition 2.7. The function $\zeta$ is an Ad-$K$-invariant function on $\mathfrak{k}$ which is given on a maximal abelian subalgebra by
$$\zeta(Y) = \prod_{\alpha \in R^+}\frac{\sinh\alpha(Y/2)}{\alpha(Y/2)},$$
where $R^+$ is a system of positive roots.
The proof of this proposition is a straightforward but tedious calculation, which I defer to an appendix. Directly from the formula (2.16) for the pairing map we see that
$$\langle F, f\rangle_{\mathrm{pair}} = \langle \Pi_\hbar^* F, f\rangle_{L^2(K)}, \tag{2.19}$$
where $\Pi_\hbar^*$ is defined by
$$\Pi_\hbar^* F(x) = \int_{\mathfrak{k}} F(xe^{iY})\,e^{-|Y|^2/2\hbar}\,\zeta(Y)\,dY.$$
At the moment it is not at all clear that $\Pi_\hbar^*$ is a bounded operator, but there is a dense subspace of $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$ on which $\Pi_\hbar^*$ makes sense and for which (2.19) holds. We will see below that $\Pi_\hbar^*$ extends to a bounded operator on all of $\mathcal{H}L^2(T^*(K), \gamma_\hbar)$, for which (2.19) continues to hold. Then by taking the adjoint of $\Pi_\hbar^*$ we see that $\langle F, f\rangle_{\mathrm{pair}} = \langle F, \Pi_\hbar f\rangle_{\mathcal{H}L^2(T^*(K), \gamma_\hbar)}$ as well.

Using the explicit formula for $\zeta$ and making the change of variable $Y' = \tfrac{1}{2}Y$ we have
$$\Pi_\hbar^* F(x) = 2^n\int_{\mathfrak{k}} F(xe^{2iY'})\left[e^{-2|Y'|^2/\hbar}\prod_{\alpha \in R^+}\frac{\sinh\alpha(Y')}{\alpha(Y')}\right]dY'.$$
We recognize from [H3] the expression in square brackets as a constant times the heat kernel measure on $K_{\mathbb{C}}/K$, written in exponential coordinates and evaluated at time $t = \hbar/2$. It follows from the inversion formula of [H2] that
$$\Pi_\hbar^* = c_\hbar\,C_\hbar^{-1},$$
for some constant $c_\hbar$ and where $C_\hbar$ is the generalized Segal–Bargmann transform of [H1, Thm. 2]. Now, $C_\hbar$ is unitary if we use on $K_{\mathbb{C}} \cong T^*(K)$ the heat kernel measure $\nu_\hbar$. But in Theorem 1 we established that this measure coincides up to a constant with the measure $\gamma_\hbar$. Thus $\Pi_\hbar$ is a constant multiple of a unitary and coincides with $C_\hbar$ up to a constant. This gives us what we want except for computing the constants, which I leave as an exercise for the reader.

3. Quantization, Reduction, and Yang–Mills Theory

Let me summarize the results of this section before explaining them in detail. It is possible to realize a compact Lie group $K$ as the quotient $K = \mathcal{A}/L(K)$, where $\mathcal{A}$ is a certain infinite-dimensional Hilbert space and $L(K)$ is the based loop group over $K$, which acts freely and isometrically on $\mathcal{A}$. (Here $\mathcal{A}$ is to be interpreted as a space of connections over $S^1$ and $L(K)$ as a gauge group.) The cotangent bundle of $\mathcal{A}$ may be identified with the associated complex Hilbert space $\mathcal{A}_{\mathbb{C}}$ and the symplectic quotient $\mathcal{A}_{\mathbb{C}}/\!/L(K)$ is identifiable with $T^*(K)$. The results of [DH, Wr] (see also the exposition in [H8]) together with the results of this paper may be interpreted as saying that in this case quantization commutes with reduction. This means two things. First, if we perform geometric quantization on $\mathcal{A}_{\mathbb{C}}$ and then reduce by $L(K)$ the resulting Hilbert space is naturally unitarily equivalent to the result of first reducing by $L(K)$ and then quantizing the reduced manifold $\mathcal{A}_{\mathbb{C}}/\!/L(K) = T^*(K)$. This result holds using either the vertical or the Kähler polarization; in the Kähler case it is necessary to include the half-form correction. Second, the pairing map between the vertically polarized and Kähler-polarized Hilbert spaces over $\mathcal{A}_{\mathbb{C}}$ descends to the reduced Hilbert spaces and then coincides (up to a constant) with the pairing map for $T^*(K)$. Additional discussion of these ideas is found in [H7, H8].

The first result contrasts with those of Guillemin and Sternberg in [GStern]. That paper considers the geometric quantization of compact Kähler manifolds, without half-forms, and exhibits (under suitable regularity assumptions) a one-to-one onto linear map between the “first quantize then reduce” space and the “first reduce and then quantize” space. However, they do not show that this map is unitary, and it seems very unlikely that it is unitary in general. In the case considered in this paper and [DH], quantization commutes unitarily with reduction.

Consider then a Lie group $K$ of compact type, with a fixed Ad-$K$-invariant inner product on its Lie algebra $\mathfrak{k}$. Then consider the real Hilbert space $\mathcal{A} := L^2([0, 1]; \mathfrak{k})$. Let $L(K)$ denote the based loop group for $K$, namely the group of maps $l : [0, 1] \to K$ such that $l_0 = l_1 = e$. (For technical reasons I also assume that $l$ has one derivative in $L^2$, i.e. that $l$ has “finite energy”.) There is a natural action of $L(K)$ on $\mathcal{A}$ given by
$$(l \cdot A)_\tau = l_\tau A_\tau l_\tau^{-1} - \frac{dl}{d\tau}\,l_\tau^{-1}. \tag{3.1}$$
Here $l$ is in $L(K)$, $A$ is in $\mathcal{A}$, and $\tau$ is in $[0, 1]$. Then we have the following result: the based loop group $L(K)$ acts freely and isometrically on $\mathcal{A}$, and the quotient $\mathcal{A}/L(K)$ is a finite-dimensional manifold that is isometric to $K$. Thus $K$, which is finite-dimensional but with non-trivial geometry, can be realized as a quotient of $\mathcal{A}$, which is infinite-dimensional but flat.

Explicitly the quotient map is given in terms of the holonomy. For $A \in \mathcal{A}$ we define the holonomy $h(A) \in K$ by the “path-ordered integral”
$$h(A) = P e^{\int_0^1 A_\tau\,d\tau} = \lim_{N \to \infty} e^{\int_0^{1/N} A_\tau\,d\tau}\,e^{\int_{1/N}^{2/N} A_\tau\,d\tau} \cdots e^{\int_{(N-1)/N}^{1} A_\tau\,d\tau}. \tag{3.2}$$
Then it may be shown that $A$ and $B$ are in the same orbit of $L(K)$ if and only if $h(A) = h(B)$. Furthermore, every $x \in K$ is the holonomy of some $A \in \mathcal{A}$, and so the $L(K)$-orbits are in one-to-one correspondence with points in $K$.

The motivation for these constructions comes from gauge theory. The space $\mathcal{A}$ is to be thought of as the space of connections for a trivial principal $K$-bundle over $S^1$, in which case $L(K)$ is the based gauge group and (3.1) is a gauge transformation. For connections $A$ over $S^1$ the only quantity invariant under (based) gauge transformations is the holonomy $h(A)$ around the circle. See [DH] or [H8] for further details.

Meanwhile, we may consider the cotangent bundle of $\mathcal{A}$, $T^*(\mathcal{A})$, which may be identified with $\mathcal{A}_{\mathbb{C}} := L^2([0, 1]; \mathfrak{k}_{\mathbb{C}})$. Then $\mathcal{A}_{\mathbb{C}}$ is an infinite-dimensional flat Kähler manifold. The action of the based loop group $L(K)$ on $\mathcal{A}$ extends in a natural way to an action on $\mathcal{A}_{\mathbb{C}}$ (given by the same formula). Starting with $\mathcal{A}_{\mathbb{C}}$ we may construct the symplectic (or Marsden–Weinstein) quotient $\mathcal{A}_{\mathbb{C}}/\!/L(K)$. This quotient is naturally identifiable with $T^*(\mathcal{A}/L(K)) = T^*(K)$.
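The invariance of the holonomy (3.2) under the based gauge action (3.1) can be illustrated numerically. The following is a minimal sketch, assuming $K = SU(2)$ realized as unit quaternions, with its Lie algebra realized as imaginary quaternions (vectors in $\mathbb{R}^3$). The connection `connection`, the based loop `loop`, and the step count are hypothetical choices made only for illustration; the ordered product in (3.2) is approximated with a midpoint rule on each subinterval.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions p = (w, x, y, z) and q."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def qconj(q):
    """Quaternion conjugate; equals the inverse for unit quaternions."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def qexp(v):
    """Exponential of the imaginary quaternion with vector part v."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm < 1e-15:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(norm) / norm
    return (math.cos(norm), s * v[0], s * v[1], s * v[2])

def holonomy(A, steps=2000):
    """Ordered product (3.2), earliest interval on the left, midpoint rule per interval."""
    h = (1.0, 0.0, 0.0, 0.0)
    dt = 1.0 / steps
    for i in range(steps):
        a = A((i + 0.5) * dt)
        h = qmul(h, qexp(tuple(dt * c for c in a)))
    return h

AXIS = (0.0, 0.0, 1.0)

def loop(tau):
    """Hypothetical based loop l_tau = exp(2 pi tau * AXIS); l_0 = l_1 = identity."""
    return qexp(tuple(2.0 * math.pi * tau * c for c in AXIS))

def connection(tau):
    """Hypothetical connection A_tau, chosen only for illustration."""
    return (math.cos(3.0 * tau), 0.5 * tau, math.sin(tau) - 0.2)

def gauge_transformed(tau):
    """(l . A)_tau = l A l^{-1} - (dl/dtau) l^{-1}, as in (3.1)."""
    l = loop(tau)
    a = connection(tau)
    rotated = qmul(qmul(l, (0.0,) + tuple(a)), qconj(l))
    # For this particular loop, (dl/dtau) l^{-1} = 2 pi * AXIS.
    return tuple(rotated[k + 1] - 2.0 * math.pi * AXIS[k] for k in range(3))

h_original = holonomy(connection)
h_transformed = holonomy(gauge_transformed)
difference = max(abs(p - q) for p, q in zip(h_original, h_transformed))
print(difference)  # small: only discretization error remains
```

The check rests on the observation that if $h_\tau$ solves $dh_\tau/d\tau = h_\tau A_\tau$ with $h_0 = e$, then $l_0 h_\tau l_\tau^{-1}$ solves the same equation with $A$ replaced by $l \cdot A$, so the two holonomies agree when $l_0 = l_1 = e$. The sketch only illustrates that one direction of the orbit criterion, for one loop and one connection.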
B. C. Hall
One may also realize the symplectic quotient as A_C/L(K_C), where L(K_C) is the based loop group over K_C. The quotient A_C/L(K_C) is naturally identifiable with K_C. So we have ultimately
\[ T^*(K) \cong T^*(\mathcal{A}/L(K)) \cong \mathcal{A}_{\mathbb{C}}/L(K_{\mathbb{C}}) \cong K_{\mathbb{C}}. \]
The resulting identification of T*(K) with K_C is nothing but the one used throughout this paper. The quotient A_C/L(K_C) may be expressed in terms of the complex holonomy. For Z ∈ A_C we define h_C(Z) ∈ K_C similarly to (3.2). Then the L(K_C)-orbits are labeled precisely by the value of h_C.

So the manifold T*(K) that we have been quantizing is a symplectic quotient of the infinite-dimensional flat Kähler manifold A_C. Looking at T*(K) in this way we may say that we have first reduced A_C by the loop group L(K), and then quantized. One may attempt to do things the other way around: first quantize A_C and then reduce by L(K). Motivated by the results of K. Wren [Wr] (see also [La2, Chap. IV.3.8]), Bruce Driver and I considered precisely this procedure [DH]. Although there are technicalities that must be attended to in order to make sense of this, the upshot is that in this case quantization commutes with reduction, as explained in the first paragraph of this section.

In the end we have three different procedures for constructing the generalized Segal–Bargmann space for K and the associated Segal–Bargmann transform. The first is the heat kernel construction of [H1], the second is geometric quantization of T*(K) with a Kähler polarization, and the third is by reduction from A_C. It is not obvious a priori that any two of these constructions should agree. That all three agree is an apparent miracle that should be understood better. I expect that if one replaces the compact group K with some other class of Riemannian manifolds, then these constructions will not agree.

Let me now explain how the quantization of A_C and the reduction by L(K) are done in [DH]. (See also the expository article [H8].)
In the interest of conveying the main ideas I will permit myself to gloss over various technical issues that are dealt with carefully in [DH]. Although [DH] does not use the language of geometric quantization, it can easily be reformulated in those terms.

Now, the constructions of geometric quantization are not directly applicable in the infinite-dimensional setting. On the other hand, A_C is just a flat Hilbert space and there are by now many techniques for dealing with its quantization. Driver and I want to first perform quantization on C^n and then let n tend to infinity. If one performs geometric quantization on C^n with a Kähler polarization and the half-form correction one gets HL^2(C^n, ν_ħ), where
\[ d\nu_\hbar = e^{-(\mathrm{Im}\, z)^2/\hbar}\, dz. \]
See Sect. 5 below. In this form we cannot let the dimension go to infinity because the measure is Gaussian only in the imaginary directions. So we introduce a regularization parameter s > ħ/2 and modify the measure to
\[ dM_{s,\hbar} = (\pi r)^{-n/2} (\pi\hbar)^{-n/2}\, e^{-(\mathrm{Im}\, z)^2/\hbar}\, e^{-(\mathrm{Re}\, z)^2/r}\, dz, \]
where r = 2(s − ħ/2). The constants are chosen so that M_{s,ħ} is a probability measure. If one rescales M_{s,ħ} by a suitable function of s and then lets s tend to infinity one recovers the measure ν_ħ. Our Hilbert space is then just HL^2(C^n, M_{s,ħ}). Now we can let the dimension tend to infinity, and we get
\[ \mathcal{H}L^2\bigl(\overline{\mathcal{A}}_{\mathbb{C}}, M_{s,\hbar}\bigr), \]
Geometric Quantization and Segal–Bargmann Transform
where M_{s,ħ} is a Gaussian measure on a certain “extension” Ā_C of A_C. (See [DH, Sect. 4.1].) This we think of as the (regularized) Kähler-polarized Hilbert space.

Our next task is to perform the reduction by L(K), which means looking for functions in HL^2(A_C, M_{s,ħ}) that are “invariant” in the appropriate sense under the action of L(K). The notion of invariance should itself come from geometric quantization, by “quantizing” the action of L(K) on A_C. Note that L(K) acts on A by a combination of rotations and translations; the action of L(K) on A_C is then induced from its action on A. Let us revert temporarily to the finite-dimensional situation as in Sect. 4. Then the way we have chosen our 1-form θ and our Kähler potential κ means that the rotations and translations of R^n act in the Kähler-polarized Hilbert space HL^2(C^n, ν_ħ) in the simplest possible way, namely by rotating and translating the variables. (This is not the case in the conventional form of the Segal–Bargmann space.) We will then formally extend this notion to the infinite-dimensional case, which means that an element l of L(K) acts on a function F ∈ HL^2(A_C, M_{s,ħ}) by F(Z) → F(l^{-1} · Z). We want functions in HL^2(A_C, M_{s,ħ}) that are invariant under this action, i.e. such that F(l^{-1} · Z) = F(Z) for all l ∈ L(K). Since our functions are holomorphic they must also (at least formally) be invariant under L(K_C). So we expect the invariant functions to be those of the form
\[ F(Z) = \Phi(h_{\mathbb{C}}(Z)), \]
where Φ is a holomorphic function on K_C. (Certainly every such function is L(K)-invariant. Although Driver and I did not prove that every L(K)-invariant function is of this form, this is probably the case.) The norm of such a function may be computed as
\[ \int_{\overline{\mathcal{A}}_{\mathbb{C}}} |F(Z)|^2\, dM_{s,\hbar}(Z) = \int_{K_{\mathbb{C}}} |\Phi(g)|^2\, d\mu_{s,\hbar}(g), \]
where μ_{s,ħ} is the push-forward of M_{s,ħ} to K_C under h_C. Concretely μ_{s,ħ} is a certain heat kernel measure on K_C. See [DH] or [H5] for details. So our regularized reduced quantum Hilbert space is HL^2(K_C, μ_{s,ħ}).

At this point we may remove the regularization by letting s tend to infinity. It can be shown that
\[ \lim_{s\to\infty} \mu_{s,\hbar} = \nu_\hbar, \]
where ν_ħ is the K-invariant heat kernel measure of [H1]. So without the regularization our reduced quantum Hilbert space becomes finally HL^2(K_C, ν_ħ), which (up to a constant) is the same as HL^2(T*(K), γ_ħ), using our identification of T*(K) with K_C.

Meanwhile the vertically polarized Hilbert space for C^n also requires a regularization before we let n tend to infinity. So we consider L^2(R^n, P_s), where P_s is the Gaussian measure given by
\[ dP_s(x) = (2\pi s)^{-n/2}\, e^{-|x|^2/2s}\, dx. \]
Rescaling P_s by a function of s and then letting s tend to infinity gives back the Lebesgue measure on R^n. We then consider the Segal–Bargmann transform S_ħ, which coincides with the pairing map of geometric quantization (Sect. 5). This is given by
\[ S_\hbar f(z) = (2\pi\hbar)^{-n/2} \int_{\mathbb{R}^n} e^{-(z-x)^2/2\hbar} f(x)\, dx. \]
With the constants adjusted as above this map has the property that it is unitary between our regularized spaces L^2(R^n, P_s) and HL^2(C^n, M_{s,ħ}), for all s > ħ/2. (See [DH, Sect. 3.1] or [H5].) Letting the dimension tend to infinity we get a unitary map [DH, Sect. 4.1]
\[ S_\hbar : L^2\bigl(\overline{\mathcal{A}}, P_s\bigr) \to \mathcal{H}L^2(\mathcal{A}_{\mathbb{C}}, M_{s,\hbar}). \tag{3.3} \]
It seems reasonable to think of this as the infinite-dimensional regularized version of the pairing map for A_C.

To reduce by L(K) we consider functions in L^2(Ā, P_s) that are L(K)-invariant. According to an important theorem of Gross [G1] these are (as expected) precisely those of the form
\[ f(A) = \phi(h(A)), \tag{3.4} \]
where φ is a function on K. The norm of such a function is computed as
\[ \int_{\overline{\mathcal{A}}} |f(A)|^2\, dP_s(A) = \int_K |\phi(x)|^2\, d\rho_s(x). \]
Thus with the vertical polarization our reduced Hilbert space becomes L^2(K, ρ_s). Since
\[ \lim_{s\to\infty} d\rho_s(x) = dx \]
we recover in the limit the vertically polarized subspace for K. (Compare [Go].)

Theorem 3.1. [DH] Consider the Segal–Bargmann transform S_ħ of (3.3). Then consider a function f ∈ L^2(Ā, P_s) of the form f(A) = φ(h(A)), with φ a function on K. Then
\[ S_\hbar f(Z) = \Phi(h_{\mathbb{C}}(Z)), \]
where Φ is the holomorphic function on K_C given by
\[ \Phi = \text{analytic continuation of } e^{\hbar\Delta_K/2}\phi. \]
Restricting S_ħ to the L(K)-invariant subspace and then letting s → ∞ gives the unitary map C_ħ : L^2(K, dx) → HL^2(K_C, ν_ħ) given by φ → analytic continuation of e^{ħΔ_K/2} φ.
If we restrict S_ħ to the L(K)-invariant subspace but keep s finite, then we get a modified form of the Segal–Bargmann transform for K, a unitary map L^2(K, ρ_s) → HL^2(K_C, μ_{s,ħ}), still given by φ → analytic continuation of e^{ħΔ_K/2} φ. This transform is examined from a purely finite-dimensional point of view in [H5].

So if we accept the constructions of [DH] as representing regularized forms of the geometric quantization Hilbert spaces and pairing map, then we have the following conclusions. First, the Kähler-polarized and vertically polarized Hilbert spaces for A_C, after reducing by L(K) and removing the regularization, are naturally unitarily equivalent to the Kähler-polarized and vertically polarized Hilbert spaces for T*(K) = A_C//L(K). (I am including the half-forms in the construction of the Kähler-polarized Hilbert spaces.) Second, the pairing map for A_C, after restricting to the L(K)-invariant subspace and removing the regularization, coincides with the pairing map for T*(K). Both of these statements are to be understood “up to a constant”.

4. The Geodesic Flow and the Heat Equation

This section describes how the complex polarization on T*(K) can be obtained from the vertical polarization by means of the imaginary-time geodesic flow. This description is supposed to make the appearance of the heat equation in the pairing map seem more natural. After all, the heat operator is nothing but the imaginary-time quantized geodesic flow. This point of view is due to T. Thiemann [T1, T3].

Suppose that f is a function on K and let π : T*(K) → K be the projection map. Then f ∘ π is the extension of f to T*(K) that is constant along the fibers. A function of the form f ∘ π is a “vertically polarized function”, that is, constant along the leaves of the vertical polarization. Now recall the function κ : T*(K) → R given by κ(x, Y) = |Y|^2. Let B_t be the Hamiltonian flow on T*(K) generated by the function κ/2.
This is the geodesic flow for the bi-invariant metric on K determined by the inner product on the Lie algebra. The following result gives a way of using the geodesic flow to produce a holomorphic function on T*(K).

Theorem 4.1. Let f : K → C be any function that admits an entire analytic continuation to T*(K) ≅ K_C, for example, a finite linear combination of matrix entries. Let π : T*(K) → K be the projection map, and let B_t be the geodesic flow on T*(K). Then for each m ∈ T*(K) the map t → f(π(B_t(m))) admits an entire analytic continuation (in t) from R to C. Furthermore the function f_C : T*(K) → C given by f_C(m) = f(π(B_i(m))) is holomorphic on T*(K) and agrees with f on K ⊂ T*(K).

Note that f_C is the analytic continuation of f from K to T*(K), with respect to the complex structure on T*(K) obtained by identifying it with K_C. So in words: to analytically continue f from K to T*(K), first extend f by making it constant along the fibers and then compose with the time i geodesic flow. So we can say that the
Kähler-polarized functions (i.e. holomorphic) are obtained from the vertically polarized functions (i.e. constant along the fibers) by composition with the time i geodesic flow.

Now if g is any function on T*(K) then g ∘ B_t may be computed formally as
\[ g \circ B_t = \sum_{n=0}^{\infty} \frac{(t/2)^n}{n!}\, \underbrace{\{\ldots\{\{g,\kappa\},\kappa\},\ldots,\kappa\}}_{n}. \]
Thus formally we have
\[ f_{\mathbb{C}} = \sum_{n=0}^{\infty} \frac{(i/2)^n}{n!}\, \underbrace{\{\ldots\{\{f\circ\pi,\kappa\},\kappa\},\ldots,\kappa\}}_{n}. \tag{4.1} \]
(Compare [T1, Eq. (2.3)].) In fact, this series converges provided only that f has an analytic continuation to T*(K). This series is the “Taylor series in the fibers” of f_C; that is, on each fiber the nth term of (4.1) is a homogeneous polynomial of degree n.

Theorem 4.2. Suppose f is any function on K that admits an entire analytic continuation to T*(K), denoted f_C. Then the series on the right in (4.1) converges absolutely at every point and the sum is equal to f_C.

As an illustrative example, consider the case K = R so that T*(K) = R^2. Then consider the function f(x) = x^k on R, so that (f ∘ π)(x, y) = x^k. Using the standard Poisson bracket on R^2,
\[ \{g, h\} = \frac{\partial g}{\partial x}\frac{\partial h}{\partial y} - \frac{\partial g}{\partial y}\frac{\partial h}{\partial x}, \]
it is easily verified that
\[ \sum_{n=0}^{\infty} \frac{(i/2)^n}{n!}\, \underbrace{\{\ldots\{\{x^k, y^2\}, y^2\},\ldots, y^2\}}_{n} = (x+iy)^k. \]
(The series terminates after the n = k term.) So f_C(x + iy) = (x + iy)^k is indeed the analytic continuation of x^k.

So “classically” the transition from the vertical polarization (functions constant along the fibers) to the Kähler polarization (holomorphic functions) is accomplished by means of the time i geodesic flow. Let us then consider the quantum counterpart of this, namely the transition from the vertically polarized Hilbert space to the Kähler-polarized Hilbert space. In the position Hilbert space the quantum counterpart of the function κ/2 is the operator
\[ H := -\hbar^2 \Delta_K/2. \]
(Possibly one should add an “author-dependent” multiple of the scalar curvature to this operator [O], but since the scalar curvature of K is constant, this does not substantively affect the answer.) The quantum counterpart of the geodesic flow is then the operator
\[ \hat{B}_t := \exp(itH/\hbar) \]
and so the time i quantized geodesic flow is represented by the operator
\[ \hat{B}_i = e^{\hbar\Delta_K/2}. \]
Since this is precisely the heat operator for K, the appearance of the heat operator in the formula for the pairing map perhaps does not seem quite so strange as at first glance.
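The K = R example can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: polynomials in (x, y) are stored as dictionaries mapping exponent pairs to coefficients, and the helper names (`deriv`, `mul`, `bracket`, `continue_monomial`, `binomial_expansion`) are invented here. It sums the series (4.1) for f(x) = x^k with κ = y^2 and compares the result with the binomial expansion of (x + iy)^k.

```python
from math import comb, factorial

def deriv(poly, var):
    """Partial derivative of {(a, b): coeff} with respect to x (var=0) or y (var=1)."""
    out = {}
    for (a, b), c in poly.items():
        e = (a, b)[var]
        if e > 0:
            key = (a - 1, b) if var == 0 else (a, b - 1)
            out[key] = out.get(key, 0) + e * c
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (a, b), c in p.items():
        for (s, t), d in q.items():
            out[(a + s, b + t)] = out.get((a + s, b + t), 0) + c * d
    return out

def bracket(g, h):
    """Standard Poisson bracket {g, h} = g_x h_y - g_y h_x on R^2."""
    out = mul(deriv(g, 0), deriv(h, 1))
    for key, c in mul(deriv(g, 1), deriv(h, 0)).items():
        out[key] = out.get(key, 0) - c
    return out

def continue_monomial(k):
    """Sum over n of (i/2)^n / n! times the n-fold bracket of x^k with kappa = y^2."""
    kappa = {(0, 2): 1}
    term = {(k, 0): 1}                 # f o pi = x^k
    total = {}
    for n in range(k + 1):             # the iterated brackets vanish beyond n = k
        coeff = (0.5j) ** n / factorial(n)
        for key, c in term.items():
            total[key] = total.get(key, 0) + coeff * c
        term = bracket(term, kappa)
    return total

def binomial_expansion(k):
    """Coefficients of (x + i y)^k."""
    return {(k - j, j): comb(k, j) * 1j ** j for j in range(k + 1)}
```

For each k, `continue_monomial(k)` and `binomial_expansion(k)` should have the same monomials with coefficients agreeing up to rounding, since the n-fold bracket equals 2^n k!/(k−n)! x^{k−n} y^n.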
This way of thinking about the complex structure and the associated Segal–Bargmann transform is due to T. Thiemann [T1]. The relationship between the complex structure and the imaginary time geodesic flow is also implicit in the work of Guillemin–Stenzel, motivated by the work of L. Boutet de Monvel. (See the discussion between Thm. 5.2 and 5.3 in [GStenz2].) Thiemann proposes a very general scheme for building complex structures and Segal–Bargmann transforms (and their associated “coherent states”) based on these ideas. However, there are convergence issues that need to be resolved in general, so it is not yet clear when one can carry this program out. Although results similar to Theorems 4.1 and 4.2 are established in [T3, Lem. 3.1], I give the proofs here for completeness. Similar results hold for the “adapted complex structure” on the tangent bundle of a real-analytic Riemannian manifold, which will be described elsewhere.

Proof. According to a standard result [He, Sect. IV.6], the geodesics in K are the curves of the form γ(t) = x e^{tX}, with x ∈ K and X ∈ k. This means that if we identify T*(K) with K × k by left-translation, then the geodesic flow takes the form
\[ B_t(x, Y) = \bigl(x e^{tY}, Y\bigr). \]
Thus if f is a function on K then
\[ f(\pi(B_t(x, Y))) = f\bigl(x e^{tY}\bigr). \]
We are now supposed to fix x and Y and consider the map t → f(x e^{tY}). If f has an analytic continuation to K_C, denoted f_C, then the map t → f(x e^{tY}) has an analytic continuation (in t) given by
\[ t \to f_{\mathbb{C}}\bigl(x e^{tY}\bigr), \quad t \in \mathbb{C}. \]
(This is because the exponential mapping from k_C to K_C is holomorphic.) Thus
\[ f(\pi(B_i(x, Y))) = f_{\mathbb{C}}\bigl(x e^{iY}\bigr). \]
Now we simply note that the map (x, Y) → f_C(x e^{iY}) is holomorphic on T*(K), with respect to the complex structure obtained by the map Φ(x, Y) = x e^{iY}. This establishes Theorem 4.1.

To establish the series form of this result, Theorem 4.2, we note that (almost) by the definition of the geodesic flow we have
\[ \left(\frac{d}{dt}\right)^{n} (f\circ\pi)\circ B_t \Big|_{t=0} = \frac{1}{2^n}\, \underbrace{\{\ldots\{\{f\circ\pi,\kappa\},\kappa\},\ldots,\kappa\}}_{n}. \tag{4.2} \]
On the other hand, if f has an entire analytic continuation to T*(K) ≅ K_C, then as established above, the map t → (f ∘ π) ∘ B_t has an entire analytic continuation. This analytic continuation can be computed by an absolutely convergent Taylor series at t = 0, where the Taylor coefficients at t = 0 are computable from (4.2). Thus
\[ f_{\mathbb{C}} = (f\circ\pi)\circ B_i = \sum_{n=0}^{\infty} \frac{(i/2)^n}{n!}\, \underbrace{\{\ldots\{\{f\circ\pi,\kappa\},\kappa\},\ldots,\kappa\}}_{n}. \]
This establishes Theorem 4.2.
5. The R^n Case

It is by now well known that geometric quantization can be used to construct the Segal–Bargmann space for C^n and the associated Segal–Bargmann transform. (See for example [Wo, Sect. 9.5].) In this section I repeat that construction, but in a manner that is nonstandard in two respects. First, I trivialize the quantum line bundle in such a way that the measure in the Segal–Bargmann space is Gaussian only in the imaginary directions. This is preferable for generalizing to the group case and it is a simple matter in the R^n case to convert back to the standard Segal–Bargmann space (see below). Second, I initially compute the pairing map “backward,” that is, from the Segal–Bargmann space to L^2(R^n). I then describe this backward map in terms of the backward heat equation, which leads to a description of the forward map in terms of the forward heat equation. By contrast, Woodhouse uses the reproducing kernel for the Segal–Bargmann space in order to compute the pairing map in the forward direction. Although I include the half-form correction on the complex side, this has no effect on the calculations in the R^n case.

We consider the phase space R^{2n} = T*(R^n). We use the coordinates q_1, ..., q_n, p_1, ..., p_n, where the q's are the position variables and the p's are the momentum variables. We consider the canonical one-form
\[ \theta = \sum p_k\, dq_k, \]
where here and in the following the sum ranges from 1 to n. Then
\[ \omega := -d\theta = \sum dq_k \wedge dp_k \]
is the canonical 2-form. We consider a trivial complex line bundle L = R^{2n} × C with a notion of covariant derivative given by
\[ \nabla_X = X - \frac{1}{i\hbar}\,\theta(X). \]
Here ∇_X acts on smooth sections of L, which we think of as smooth functions on R^{2n}. The prequantum Hilbert space is the space of sections of L that are square-integrable with respect to the canonical volume measure on R^{2n}. The canonical volume measure is the one given by integrating the Liouville volume form defined as
\[ \varepsilon = \frac{1}{n!}\, \omega\wedge\cdots\wedge\omega \ (n \text{ times}) = dq_1\wedge dp_1\wedge\cdots\wedge dq_n\wedge dp_n. \]
Since our prequantum line bundle is trivial we may identify the prequantum Hilbert space with L^2(R^{2n}, ε).

We now consider the usual complex structure on R^{2n} = C^n. We think of this complex structure as defining a Kähler polarization on R^{2n}. This means that we define a smooth section s of L to be polarized if
\[ \nabla_{\partial/\partial\bar{z}_k}\, s = 0 \quad \text{for all } k. \tag{5.1} \]
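Proposition 5.1. If we think of sections s of L as functions then a smooth section s satisfies (5.1) if and only if s is of the form
\[ s(q, p) = F(q_1 + ip_1, \ldots, q_n + ip_n)\, e^{-p^2/2\hbar}, \tag{5.2} \]
where F is a holomorphic function on C^n. Here p^2 = p_1^2 + ··· + p_n^2.

Proof. To prove this we first compute ∇_{∂/∂z̄_k} as
\[ \nabla_{\partial/\partial\bar{z}_k} = \frac{\partial}{\partial\bar{z}_k} - \frac{1}{i\hbar}\,\theta\!\left(\frac{\partial}{\partial\bar{z}_k}\right) = \frac{1}{2}\left(\frac{\partial}{\partial q_k} + i\frac{\partial}{\partial p_k}\right) - \frac{1}{2i\hbar}\, p_k. \]
Then we note that
\[ \nabla_{\partial/\partial\bar{z}_k}\, e^{-p^2/2\hbar} = \left[\frac{1}{2}\left(\frac{\partial}{\partial q_k} + i\frac{\partial}{\partial p_k}\right)\left(-\frac{p^2}{2\hbar}\right) - \frac{1}{2i\hbar}\, p_k\right] e^{-p^2/2\hbar} = \left(-i\frac{p_k}{2\hbar} - \frac{1}{2i\hbar}\, p_k\right) e^{-p^2/2\hbar} = 0. \]
Then if s is any section, we can write s in the form s = F e^{-p^2/2ħ}, for some complex-valued function F. Such a section s is polarized if and only if
\[ 0 = \nabla_{\partial/\partial\bar{z}_k}\left(F e^{-p^2/2\hbar}\right) = \frac{\partial F}{\partial\bar{z}_k}\, e^{-p^2/2\hbar} + F\, \nabla_{\partial/\partial\bar{z}_k}\, e^{-p^2/2\hbar} = \frac{\partial F}{\partial\bar{z}_k}\, e^{-p^2/2\hbar}, \]
for all k, that is, if and only if F is holomorphic.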
We then define the Kähler-polarized Hilbert space to be the space of square-integrable Kähler-polarized sections of L. Note that the L^2 norm of the section s in (5.2) is computable as
\[ \|s\|^2 = \int_{\mathbb{C}^n} |F(z)|^2\, e^{-p^2/\hbar}\, d^nq\, d^np, \]
where z = q + ip with q, p ∈ R^n. If we identify the polarized section s with the holomorphic function F then we identify the Kähler-polarized Hilbert space as the space
\[ \mathcal{H}L^2(\mathbb{C}^n, e^{-p^2/\hbar}\, d^nq\, d^np). \tag{5.3} \]
Here HL^2 denotes the space of square-integrable holomorphic functions with respect to the indicated measure. This space is a form of the Segal–Bargmann space. The conventional description [Wo, Sect. 9.2] of the Segal–Bargmann space is slightly different from what we have here, for two reasons. First, it is conventional to insert a factor of √2 into the identification of R^{2n} with C^n. Second, it is common to use a different
trivialization of L, resulting in a different Gaussian measure on C^n. The map F → e^{z^2/4ħ} F maps “my” Segal–Bargmann space unitarily to HL^2(C^n, e^{-|z|^2/2ħ} d^nq d^np), which is the standard Segal–Bargmann space (apart from the above-mentioned factor of √2). The normalization used here for the R^n case is the one that generalizes to the group case.

We also define the canonical bundle (relative to the given complex structure) to be the bundle whose sections are n-forms of type (n, 0). We then define the half-form bundle δ_1 to be the square root of the canonical bundle. The polarized sections of δ_1 are objects of the form
\[ F(z)\, \sqrt{dz_1\wedge\cdots\wedge dz_n}, \]
where F is holomorphic. Here the square root is a mnemonic for a polarized section of δ_1 whose square is dz_1 ∧ ··· ∧ dz_n. The absolute value of such a section is computed by setting
\[ \left|\sqrt{dz_1\wedge\cdots\wedge dz_n}\right|^2 = \left[\frac{d\bar{z}_1\wedge\cdots\wedge d\bar{z}_n\wedge dz_1\wedge\cdots\wedge dz_n}{b\,\varepsilon}\right]^{1/2} = 1, \tag{5.4} \]
where the constant b is given by b = (2i)^n (−1)^{n(n−1)/2}.

The half-form-corrected Hilbert space is then the space of square-integrable polarized sections of L ⊗ δ_1. Polarized sections of L ⊗ δ_1 may be expressed uniquely as
\[ s = F(z)\, e^{-p^2/2\hbar} \otimes \sqrt{dz_1\wedge\cdots\wedge dz_n}. \tag{5.5} \]
In light of (5.4) our Hilbert space may again be identified with the Segal–Bargmann space HL^2(C^n, e^{-p^2/ħ} d^nq d^np). Although in this flat case the half-form correction does not affect the description of the Hilbert space, it still has an important effect on certain subsequent calculations, such as the WKB approximation. (See [Wo, Chap. 10].)

Next we consider the vertically polarized sections. A vertically polarized section s of L is one for which ∇_{∂/∂p_k} s = 0 for all k. Identifying sections with functions and using θ = Σ p_k dq_k we see that ∇_{∂/∂p_k} = ∂/∂p_k. Thus the vertically polarized sections are simply functions f(q, p) that are independent of p. Unfortunately, such a section cannot be square-integrable (over R^{2n}) unless it is zero almost everywhere.

So we now consider the canonical bundle (relative to the vertical polarization). This is the real line bundle whose sections are n-forms α satisfying (∂/∂p_k) ⌟ α = 0 for all k. Concretely such forms are precisely those expressible as
\[ \alpha = f(q, p)\, dq_1\wedge\cdots\wedge dq_n, \]
where f is real-valued. Such an n-form is called polarized if (∂/∂p_k) ⌟ dα = 0 for all k. Such forms are precisely those expressible as
\[ \alpha = f(q)\, dq_1\wedge\cdots\wedge dq_n. \]
We now choose an orientation on R^n and we construct a square root δ_2 of the canonical bundle in such a way that the square of a section of δ_2 is a non-negative multiple of dq_1 ∧ ··· ∧ dq_n, where q_1, ..., q_n is an oriented coordinate system for R^n. There is a natural notion of polarized sections of δ_2, namely those whose squares are polarized
sections of the canonical bundle. The polarized sections of δ_2 are precisely those of the form
\[ \beta = f(q)\, \sqrt{dq_1\wedge\cdots\wedge dq_n}. \tag{5.6} \]
We then consider the space of polarized sections of L ⊗ δ_2. Every such section may be written uniquely in the form
\[ s = f(q) \otimes \sqrt{dq_1\wedge\cdots\wedge dq_n}, \tag{5.7} \]
where now f is complex-valued. We define the inner product of two such sections s_1 and s_2 by
\[ (s_1, s_2) = \int_{\mathbb{R}^n} \overline{f_1(q)}\, f_2(q)\, dq_1\wedge\cdots\wedge dq_n. \tag{5.8} \]
Note that the integration is over R^n, not R^{2n}. The vertically polarized Hilbert space is the space of polarized sections s of L ⊗ δ_2 for which (s, s) < ∞. (This construction is explained in a more manifestly coordinate-independent way in the general group case, in Sect. 2.4.)

Finally, we introduce the pairing map between the vertically polarized and Kähler-polarized Hilbert spaces. First we define a pointwise pairing between sections of δ_1 and sections of δ_2 by setting
\[ \left(\sqrt{dz_1\wedge\cdots\wedge dz_n},\, \sqrt{dq_1\wedge\cdots\wedge dq_n}\right) = \left[\frac{d\bar{z}_1\wedge\cdots\wedge d\bar{z}_n\wedge dq_1\wedge\cdots\wedge dq_n}{c\,\varepsilon}\right]^{1/2} = 1, \]
where the constant c is given by c = (−i)^n (−1)^{n(n+1)/2}. Then we may pair a section of L ⊗ δ_1 with a section of L ⊗ δ_2 by applying the above pairing of δ_1 and δ_2 and the Hermitian structure on L, and then integrating with respect to ε. So if s_1 is a polarized section of L ⊗ δ_1 as in (5.5) and s_2 is a polarized section of L ⊗ δ_2 then we have explicitly
\[ \langle F, f\rangle_{\mathrm{pair}} = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \overline{F(q+ip)}\, f(q)\, e^{-p^2/2\hbar}\, d^nq\, d^np. \tag{5.9} \]
Here I have expressed things in terms of F ∈ HL^2(C^n, e^{-p^2/ħ} d^nq d^np) and f ∈ L^2(R^n).
Theorem 5.2. Let us identify the vertically polarized Hilbert space with L^2(R^n) as in (5.8) and the Kähler-polarized Hilbert space with HL^2(C^n, e^{-p^2/ħ} d^nq d^np) as in (5.5). Then there exists a unique bounded linear operator Π_ħ : L^2(R^n) → HL^2(C^n, e^{-p^2/ħ} d^nq d^np) such that
\[ \langle F, f\rangle_{\mathrm{pair}} = \langle F, \Pi_\hbar f\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\, e^{-p^2/\hbar} d^nq\, d^np)} = \langle \Pi_\hbar^* F, f\rangle_{L^2(\mathbb{R}^n)}. \]
We call Π_ħ the pairing map. We then have the following results.

(1) The map Π_ħ : L^2(R^n) → HL^2(C^n, e^{-p^2/ħ} d^nq d^np) is given by
\[ \Pi_\hbar f(z) = a_\hbar \int_{\mathbb{R}^n} e^{-(z-q)^2/2\hbar}\, f(q)\, d^nq, \]
where a_ħ = (πħ)^{−n/2} (2πħ)^{−n}.
(2) The map Π_ħ^* may be computed as
\[ \Pi_\hbar^* F(q) = \int_{\mathbb{R}^n} F(q+ip)\, e^{-p^2/2\hbar}\, d^np. \]

(3) The map b_ħ Π_ħ is unitary, where b_ħ = (πħ)^{n/4} (2πħ)^{n/2}.

Note that the formula for Π_ħ^* (mapping from the Segal–Bargmann space to L^2(R^n)) comes almost directly from the formula (5.9) for the pairing. The unitarity (up to a constant) of the pairing map in this R^n case is “explained” by the Stone–von Neumann theorem. The map Π_ħ, as given in (1), is the “invariant” form of the Segal–Bargmann transform, as described, for example, in [H6, Sect. 6.3]. In the expression for Π_ħ^* the integral is not absolutely convergent in general, so more precisely one should integrate over the set |p| ≤ R and then take a limit (in L^2(R^n)) as R → ∞. (Compare [H2, Thm. 1].)

There are doubtless many ways of proving these results. I will explain here simply how the heat equation creeps into the argument, since the heat equation is essential to the proof in the group case. Fix a holomorphic function F on C^n that is square-integrable over R^n and that has moderate growth in the imaginary directions. Then define a function f_ħ on R^n by
\[ f_\hbar(q) = \int_{\mathbb{R}^n} F(q+ip) \left[\frac{e^{-p^2/2\hbar}}{(2\pi\hbar)^{n/2}}\right] d^np. \tag{5.10} \]
Note that the Gaussian factor in the square brackets is just the standard heat kernel in the p-variable and in particular satisfies the forward heat equation ∂u/∂ħ = (1/2)Δu. Let us then differentiate under the integral sign, integrate by parts, and use the Cauchy–Riemann equations in the form ∂F/∂p_k = i ∂F/∂q_k. This shows that
\[ \frac{\partial f_\hbar}{\partial\hbar} = -\frac{1}{2}\Delta f_\hbar, \tag{5.11} \]
which is the backward heat equation. Furthermore, letting ħ tend to zero we see that
\[ \lim_{\hbar\downarrow 0} f_\hbar(q) = F(q). \tag{5.12} \]
Thus (up to a factor of (2πħ)^{n/2}) Π_ħ^* F is obtained by applying the inverse heat operator to the restriction of F to R^n. Turning this the other way around we have
\[ \left(\Pi_\hbar^*\right)^{-1} f = (2\pi\hbar)^{n/2}\ \text{analytic continuation of } e^{\hbar\Delta/2} f, \tag{5.13} \]
where e^{ħΔ/2} f means the solution to the heat equation at time ħ, with initial condition f. Of course, e^{ħΔ/2} f can be computed by integrating f against a Gaussian, so we have
\[ \left(\Pi_\hbar^*\right)^{-1} f(z) = \int_{\mathbb{R}^n} e^{-(z-q)^2/2\hbar}\, f(q)\, d^nq, \]
where the factors of 2πħ in (5.13) have canceled those in the computation of the heat operator on R^n.

We now recognize (Π_ħ^*)^{-1} as coinciding up to a constant with the “invariant” form C_ħ of the Segal–Bargmann transform, as described in [H6, Sect. 6.3]. The unitarity of C_ħ then implies that Π_ħ is unitary up to a constant. The argument in the compact group case goes in much the same way, using the inversion formula [H2] for the generalized Segal–Bargmann transform of [H1].
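The role of the backward heat equation in (5.10)–(5.12) can be illustrated numerically in the case n = 1. The sketch below is a hedged example, not part of the argument: the function name `f_hbar`, the trapezoid rule, the cutoff `half_width` and the sample functions F are all assumptions chosen for illustration. For a polynomial F the inverse heat operator can be applied by hand (for instance e^{−ħΔ/2} q^3 = q^3 − 3ħq), so the integral (5.10) can be compared with the expected answer.

```python
import math

def f_hbar(F, q, hbar, half_width=12.0, steps=4000):
    """Trapezoid-rule approximation of the integral (5.10) for n = 1.

    F is a holomorphic function of one complex variable; the integration in p
    is truncated to |p| <= half_width * sqrt(hbar), where the Gaussian weight
    is negligible for the sample functions used here.
    """
    L = half_width * math.sqrt(hbar)
    dp = 2.0 * L / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        p = -L + j * dp
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * F(complex(q, p)) * math.exp(-p * p / (2.0 * hbar))
    return total * dp / math.sqrt(2.0 * math.pi * hbar)

def backward_heat_cubic(q, hbar):
    """exp(-hbar * Laplacian / 2) applied to q^3, computed by hand."""
    return q ** 3 - 3.0 * hbar * q
```

For example, `f_hbar(lambda z: z ** 3, 0.7, 0.5)` should agree with `backward_heat_cubic(0.7, 0.5)` up to quadrature error, and taking hbar small should return approximately F(q), in line with (5.12).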
6. Appendix: Calculations with ζ and κ

We will as always identify T*(K) with K × k by means of left-translation and the inner product on k. We choose an orthonormal basis for k and we let y_1, ..., y_n be the coordinates with respect to this basis. Then all forms on K × k can be expressed in terms of the left-invariant 1-forms η_1, ..., η_n on K and the translation-invariant 1-forms dy_1, ..., dy_n on k. Since the canonical projection pr : T*(K) → K in this description is just projection onto the K factor, pr*(η_k) is just identified with η_k.

We identify the tangent space at each point in K × k with k + k. Meanwhile we identify the tangent space of K_C at each point with k_C ≅ k + k. We then consider the map Φ that identifies T*(K) ≅ K × k with K_C, Φ(x, Y) = x e^{iY}. Since we are identifying the tangent space at every point of both K × k and K_C with k + k, the differential of Φ at any point will be described as a linear map of k + k to itself. Explicitly we have [H3, Eq. (14)] at each point (x, Y)
\[ \Phi_* = \begin{pmatrix} \cos \mathrm{ad}Y & \dfrac{1-\cos \mathrm{ad}Y}{\mathrm{ad}Y} \\[2mm] -\sin \mathrm{ad}Y & \dfrac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} \end{pmatrix}. \tag{6.1} \]

Our first task is to compute the function ζ(Y) defined in (2.18). So let us use Φ to pull back the left-invariant anti-holomorphic forms Z̄_k to T*(K). To do this we compute the adjoint Φ* of the matrix (6.1), keeping in mind that adY is skew, since our inner product is Ad-K-invariant. We then get that
\[ \Phi^*\bar{Z}_k = \text{terms involving } \eta_l\ -\ i\left[\frac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} + i\,\frac{\cos \mathrm{ad}Y - 1}{\mathrm{ad}Y}\right]_{lk} dy_l. \]
Thus
\[ \bar{Z}_1\wedge\cdots\wedge\bar{Z}_n\wedge\eta_1\wedge\cdots\wedge\eta_n = (-i)^n \zeta(Y)^2\, \eta_1\wedge\cdots\wedge\eta_n\wedge dy_1\wedge\cdots\wedge dy_n = \pm(-i)^n \zeta(Y)^2\, \varepsilon, \]
where
\[ \zeta(Y)^2 = \det\left[\frac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} + i\,\frac{\cos \mathrm{ad}Y - 1}{\mathrm{ad}Y}\right]. \]
Here ε = η_1 ∧ dy_1 ∧ ··· ∧ η_n ∧ dy_n is the Liouville volume form, and the factor of ±(−i)^n is accounted for by the constant c in the definition of ζ. Computing in terms of the roots we have
\[ \zeta(Y)^2 = \prod_{\alpha\in R} \frac{\sinh\alpha(Y) + \cosh\alpha(Y) - 1}{\alpha(Y)} = \prod_{\alpha\in R} \frac{e^{\alpha(Y)} - 1}{\alpha(Y)} = \prod_{\alpha\in R^+} \frac{\left(e^{\alpha(Y)} - 1\right)\left(1 - e^{-\alpha(Y)}\right)}{\alpha(Y)^2}. \]
Since (e^x − 1)(1 − e^{−x}) = 4 sinh^2(x/2) we get
\[ \zeta(Y)^2 = \prod_{\alpha\in R^+} \frac{\sinh^2\alpha(Y/2)}{\alpha(Y/2)^2}. \]
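As a hedged numerical sanity check of the root formula for ζ(Y)^2, the following Python sketch compares it with the determinant formula in one concrete case. The setting is an assumption made for illustration: the Lie algebra is so(3), identified with R^3 so that adY is the skew matrix `hat(v)`, whose nonzero eigenvalues are ±i|v|, so there is a single positive root with α(Y) = |v|. The function names and the truncation of the power series are likewise choices made here.

```python
import math

def hat(v):
    """Matrix of ad Y on so(3) ~ R^3 for Y corresponding to the vector v."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix (entries may be complex)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def zeta_squared_det(v, terms=20):
    """det[ sin(adY)/adY + i (cos(adY) - 1)/adY ] via truncated power series."""
    K = hat(v)
    S = [[0.0] * 3 for _ in range(3)]      # sin(K)/K     = sum (-1)^m K^(2m)   / (2m+1)!
    C = [[0.0] * 3 for _ in range(3)]      # (cos K - 1)/K = sum (-1)^m K^(2m-1) / (2m)!
    P = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]  # K^0
    for j in range(2 * terms):             # P holds K^j at the top of each pass
        m = (j + 1) // 2
        coeff = (-1) ** m / math.factorial(j + 1)
        target = S if j % 2 == 0 else C
        for r in range(3):
            for c in range(3):
                target[r][c] += coeff * P[r][c]
        P = matmul(P, K)
    M = [[S[r][c] + 1j * C[r][c] for c in range(3)] for r in range(3)]
    return det3(M)

def zeta_squared_roots(v):
    """Root formula with the single positive root alpha(Y) = |v|."""
    half = 0.5 * math.sqrt(sum(c * c for c in v))
    return (math.sinh(half) / half) ** 2
```

For a sample vector such as `[0.3, -1.1, 0.8]`, the two functions should agree up to rounding, and the determinant should come out real.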
Taking a square root gives the desired expression for ζ(Y).

Now we turn to the Kähler potential κ. As usual we identify T*(K) with K × k by means of left-translation and the inner product on k. The canonical projection π : T*(K) → K in this description is simply the map (x, Y) → x. The canonical 1-form θ is defined by setting
\[ \theta(X) = \langle Y, \pi_*(X)\rangle, \]
where X is a tangent vector to T*(K) at the point (x, Y). Choose an orthonormal basis e_1, ..., e_n for k and let y_1, ..., y_n be the coordinates on k with respect to this basis. Let α_1, ..., α_n be left-invariant 1-forms on K whose values at the identity are the vectors e_1, ..., e_n in k ≅ k*. Then it is easily verified that at each point (x, Y) ∈ T*(K) we have
\[ \theta = \sum_{k=1}^{n} y_k\, \alpha_k. \]
Now let κ be the function on T*(K) given by
\[ \kappa(x, Y) = |Y|^2 = \sum_{k=1}^{n} y_k^2. \]
We want to verify that
\[ \mathrm{Im}\, \bar{\partial}\kappa = \theta. \]
We start by observing that
\[ d\kappa = \sum_{k=1}^{n} 2 y_k\, dy_k. \]
To compute ∂̄κ we need to transport dκ to K_C, where the complex structure is defined. On K_C we express things in terms of left-invariant 1-forms η_1, ..., η_n and Jη_1, ..., Jη_n. We then want to pull back dκ to K_C by means of Φ^{-1}. So we need to compute the inverse transpose of the matrix (6.1) describing Φ_*. This may be computed as
\[ \left(\Phi_*^{-1}\right)^{tr} = \frac{\mathrm{ad}Y}{\sin \mathrm{ad}Y} \begin{pmatrix} \dfrac{\sin \mathrm{ad}Y}{\mathrm{ad}Y} & -\sin \mathrm{ad}Y \\[2mm] \dfrac{1-\cos \mathrm{ad}Y}{\mathrm{ad}Y} & \cos \mathrm{ad}Y \end{pmatrix}. \]
In terms of our basis for 1-forms on T*(K), dκ is represented by the vector
\[ \begin{pmatrix} 0 \\ Y \end{pmatrix} \]
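so we have to apply the matrix above to this vector. But of course adY(Y) = 0, and so we get simply
\[ \left(\Phi^{-1}\right)^*(d\kappa) = 2\sum_{k=1}^{n} y_k\, J\eta_k = 2\sum_{k=1}^{n} y_k\, \frac{1}{2i}\bigl((\eta_k + iJ\eta_k) - (\eta_k - iJ\eta_k)\bigr). \]
Thus taking only the term involving the anti-holomorphic 1-forms η_k − iJη_k we have
\[ \bar{\partial}\kappa = \sum_{k=1}^{n} i y_k\, (\eta_k - iJ\eta_k), \]
which is represented by the vector
\[ \begin{pmatrix} iY \\ Y \end{pmatrix}. \]
We now transfer this back to T*(K) by means of Φ*. So applying the transpose of the matrix (6.1) we get
\[ \bar{\partial}\kappa = \sum_{k=1}^{n} \left(i y_k\, \alpha_k + y_k\, dy_k\right), \]
and so
\[ \mathrm{Im}\, \bar{\partial}\kappa = \sum_{k=1}^{n} y_k\, \alpha_k = \theta. \]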
7. Appendix: Lie Groups of Compact Type

In this appendix I give a proof of Proposition 2.2, the structure result for connected Lie groups of compact type.

We consider a connected Lie group K of compact type, with a fixed Ad-invariant inner product on its Lie algebra k. Since the inner product is Ad-invariant, the orthogonal complement of any ideal in k will be an ideal. Thus k decomposes as a direct sum of subalgebras that are either simple or one-dimensional. Collecting together the simple factors in one group and the one-dimensional factors in another, we obtain a decomposition of k as k = k_1 + z, where k_1 is semisimple and z is commutative. Since k_1 is semisimple and admits an Ad-invariant inner product, the connected subgroup K_1 of K with Lie algebra k_1 will be compact. (By Cor. II.6.5 of [He], the adjoint group of K_1 is a closed subgroup of Gl(k_1) ∩ O(k_1) and is therefore compact. Then Thm. II.6.9 of [He] implies that K_1 itself is compact.)

Now let B be the subset of z given by
\[ B = \left\{ Z \in \mathfrak{z} \mid e^{Z} = \mathrm{id} \right\}, \]
where id is the identity in K. Since z is commutative, B is a discrete additive subgroup of z, hence there exist vectors X_1, ..., X_k, linearly independent over R, such that B is the set of integer linear combinations of the X_k's. (See [Wa, Exer. 3.18] or [BtD, Lemma 3.8].)
Now let z1 be the real span of $X_1, \ldots, X_k$, and let z2 be the orthogonal complement of z1 in z, with respect to the fixed Ad-invariant inner product. Since z1 is commutative, the image of z1 under the exponential mapping is a connected subgroup of K, which is isomorphic to a torus, hence compact. Thus the connected subgroup H of K whose Lie algebra is k1 + z1 is a quotient of K1 × (z1/B), hence compact.

Next consider the map G : H × z2 → K given by $G(h, X) = he^{X}$, which is a homomorphism because z2 is central. I claim that this map is injective. To see this, suppose (h, X) is in the kernel. Then $h = e^{-X}$, which means that h is in the center of K, hence in the center of H. Now, H is a quotient of K1 × (z1/B), so there exist x ∈ K1 and y ∈ (z1/B) such that h = xy. Since h is central and y is central, x is central as well. But the center of K1 is finite, so there exists m such that $x^{m} = \mathrm{id}$. Since y and $e^{X}$ are central, this means that $h^{m}e^{mX} = x^{m}y^{m}e^{mX} = y^{m}e^{mX} = \mathrm{id}$. But $y = e^{Y}$ for some Y ∈ z1, so we have $e^{mY}e^{mX} = e^{mY+mX} = \mathrm{id}$, which means that mY + mX ∈ B. This means that X = 0, since z is the direct sum of the real span of B and z2, and so also $h = e^{-X} = \mathrm{id}$.

Thus G is an injective homomorphism of H × z2 into K. The associated Lie algebra homomorphism is clearly an isomorphism (k = (k1 + z1) + z2). It follows that G is actually a diffeomorphism.

To finish the argument, we need to show that the Lie algebra of H (namely, k1 + z1) is orthogonal to z2. To see this, note that k1 and z2 are automatically orthogonal with respect to any Ad-invariant inner product (since the orthogonal projection of k1 onto z2 is a Lie algebra homomorphism of a semisimple algebra into a commutative algebra), and z1 and z2 are orthogonal with respect to the chosen inner product, by the construction of z2.

References

[A] Ashtekar, A., Lewandowski, J., Marolf, D., Mourão, J. and Thiemann, T.: Coherent state transforms for spaces of connections. J. Funct. Anal. 135, 519–551 (1996)
[B] Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform, Part I. Comm. Pure Appl. Math. 14, 187–214 (1961)
[BtD] Bröcker, T. and tom Dieck, T.: Representations of compact Lie groups. Graduate Texts in Mathematics 98. New York, Berlin: Springer-Verlag, 1995
[De] De Bièvre, S.: Coherent states over symplectic homogeneous spaces. J. Math. Phys. 30, 1401–1407 (1989)
[Dr] Driver, B.: On the Kakutani–Itô–Segal–Gross and Segal–Bargmann–Hall isomorphisms. J. Funct. Anal. 133, 69–128 (1995)
[DH] Driver, B. and Hall, B.: Yang–Mills theory and the Segal–Bargmann transform. Commun. Math. Phys. 201, 249–290 (1999)
[F] Folland, G.: Harmonic analysis in phase space. Princeton: Princeton Univ. Press, 1989
[FY] Furutani, K. and Yoshizawa, S.: A Kähler structure on the punctured cotangent bundle of complex and quaternion projective spaces and its application to a geometric quantization. II. Japan. J. Math. (N.S.) 21, 355–392 (1995)
[Go] Gotay, M.: Constraints, reduction, and quantization. J. Math. Phys. 27, 2051–2066 (1986)
[G1] Gross, L.: Uniqueness of ground states for Schrödinger operators over loop groups. J. Funct. Anal. 112, 373–441 (1993)
[G2] Gross, L.: Heat kernel analysis on Lie groups. Preprint
[GM] Gross, L. and Malliavin, P.: Hall's transform and the Segal–Bargmann map. In: Itô's stochastic calculus and probability theory, Fukushima, M., Ikeda, N., Kunita, H. and Watanabe, S., eds., New York, Berlin: Springer-Verlag, 1996, pp. 73–116
[GStenz1] Guillemin, V. and Stenzel, M.: Grauert tubes and the homogeneous Monge–Ampère equation. J. Differ. Geom. 34, 561–570 (1991)
[GStenz2] Guillemin, V. and Stenzel, M.: Grauert tubes and the homogeneous Monge–Ampère equation. II. J. Differ. Geom. 35, 627–641 (1992)
[GStern] Guillemin, V. and Sternberg, S.: Geometric quantization and multiplicities of group representations. Invent. Math. 67, 515–538 (1982)
[H1] Hall, B.: The Segal–Bargmann "coherent state" transform for compact Lie groups. J. Funct. Anal. 122, 103–151 (1994)
[H2] Hall, B.: The inverse Segal–Bargmann transform for compact Lie groups. J. Funct. Anal. 143, 98–116 (1997)
[H3] Hall, B.: Phase space bounds for quantum mechanics on a compact Lie group. Commun. Math. Phys. 184, 233–250 (1997)
[H4] Hall, B.: Quantum mechanics in phase space. In: Perspectives on quantization, Coburn, L. and Rieffel, M., eds., Contemp. Math. Vol. 214. Providence, R.I.: Am. Math. Soc., 1998, pp. 47–62
[H5] Hall, B.: A new form of the Segal–Bargmann transform for Lie groups of compact type. Canad. J. Math. 51, 816–834 (1999)
[H6] Hall, B.: Holomorphic methods in analysis and mathematical physics. In: First Summer School in Analysis and Mathematical Physics, Pérez-Esteva, S. and Villegas-Blas, C., eds., Contemp. Math. Vol. 260. Providence, R.I.: Am. Math. Soc., 2000, pp. 1–59
[H7] Hall, B.: Harmonic analysis with respect to heat kernel measure. Bull. (N.S.) Am. Math. Soc. 38, 43–78 (2001)
[H8] Hall, B.: Coherent states and the quantization of (1 + 1)-dimensional Yang–Mills theory. Rev. Math. Phys. 13, 1281–1306 (2001)
[HM] Hall, B. and Mitchell, J.: Coherent states for spheres. J. Math. Phys., to appear. quant-ph/0109086; http://xxx.lanl.gov
[HS] Hall, B. and Sengupta, A.: The Segal–Bargmann transform for path-groups. J. Funct. Anal. 152, 220–254 (1998)
[He] Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. San Diego: Academic Press, 1978
[Ho] Hochschild, G.: The structure of Lie groups. San Francisco: Holden-Day, 1965
[IK] Isham, C. and Klauder, J.: Coherent states for n-dimensional Euclidean groups E(n) and their application. J. Math. Phys. 32, 607–620 (1991)
[JL] Jorgenson, J. and Lang, S.: The ubiquitous heat kernel. In: Mathematics unlimited – 2001 and beyond. New York, Berlin: Springer-Verlag, 2001, pp. 655–683
[Ki] Kirillov, A.: Geometric quantization. In: Dynamical Systems IV, Arnol'd, V. and Novikov, S., eds., Encyclopaedia of Mathematical Sciences, Vol. 4. New York, Berlin: Springer-Verlag, 1990
[KR1] Kowalski, K. and Rembieliński, J.: Quantum mechanics on a sphere and coherent states. J. Phys. A 33, 6035–6048 (2000)
[KR2] Kowalski, K. and Rembieliński, J.: The Bargmann representation for the quantum mechanics on a sphere. J. Math. Phys. 42, 4138–4147 (2001)
[La1] Landsman, N.: Rieffel induction as generalized quantum Marsden–Weinstein reduction. J. Geom. Phys. 15, 285–319 (1995)
[La2] Landsman, N.: Mathematical topics between classical and quantum mechanics. Springer Monographs in Mathematics. New York, Berlin: Springer-Verlag, 1998
[LS] Lempert, L. and Szőke, R.: Global solutions of the homogeneous complex Monge–Ampère equation and complex structures on the tangent bundle of Riemannian manifolds. Math. Ann. 290, 689–712 (1991)
[Lo] Loll, R.: Non-perturbative solutions for lattice quantum gravity. Nucl. Phys. B 444, 619–639 (1995)
[O] Onofri, E.: Mathematical Reviews, Review 83b:58038, of the paper "Geometric quantization for the mechanics on spheres," by Ii, K.: Tôhoku Math. J. 33, 289–295 (1981)
[P] Perelomov, A.: Generalized coherent states and their applications. Texts and Monographs in Physics. New York, Berlin: Springer-Verlag, 1986
[Ra1] Rawnsley, J.: Coherent states and Kähler manifolds. Quart. J. Math. Oxford Ser. (2) 28, 403–415 (1977)
[Ra2] Rawnsley, J.: A nonunitary pairing of polarizations for the Kepler problem. Trans. Am. Math. Soc. 250, 167–180 (1979)
[RCG] Rawnsley, J., Cahen, M. and Gutt, S.: Quantization of Kähler manifolds. I. Geometric interpretation of Berezin's quantization. J. Geom. Phys. 7, 45–62 (1990)
[STW] Sahlmann, H., Thiemann, T. and Winkler, O.: Coherent states for canonical quantum general relativity and the infinite tensor product extension. Nucl. Phys. B 606, 401–440 (2001)
[S1] Segal, I.: Mathematical problems of relativistic physics, Chap. VI. In: Proceedings of the Summer Seminar, Boulder, Colorado, 1960, Vol. II, Kac, M., ed., Lectures in Applied Mathematics. Providence, R.I.: Am. Math. Soc., 1963
[S2] Segal, I.: Mathematical characterization of the physical vacuum for a linear Bose–Einstein field. Illinois J. Math. 6, 500–523 (1962)
[S3] Segal, I.: The complex wave representation of the free Boson field. In: Topics in functional analysis: Essays dedicated to M.G. Krein on the occasion of his 70th birthday (Gohberg, I. and Kac, M., eds.). Advances in Mathematics Supplementary Studies, Vol. 3. San Diego: Academic Press, 1978, pp. 321–343
[St] Stenzel, M.: The Segal–Bargmann transform on a symmetric space of compact type. J. Funct. Anal. 165, 44–58 (1999)
[T1] Thiemann, T.: Reality conditions inducing transforms for quantum gauge field theory and quantum gravity. Classical Quantum Gravity 13, 1383–1403 (1996)
[T2] Thiemann, T.: Quantum spin dynamics (QSD). Classical Quantum Gravity 15, 839–873 (1998)
[T3] Thiemann, T.: Gauge field theory coherent states (GCS): I. General properties. Classical Quantum Gravity 18, 2025–2064 (2001)
[TW1] Thiemann, T. and Winkler, O.: Gauge field theory coherent states (GCS): II. Peakedness properties. Classical Quantum Gravity 18, 2561–2636 (2001)
[TW2] Thiemann, T. and Winkler, O.: Gauge field theory coherent states (GCS): III. Ehrenfest theorems. Classical Quantum Gravity 18, 4629–4681 (2001)
[TW3] Thiemann, T. and Winkler, O.: Gauge field theory coherent states (GCS): IV. Infinite tensor product and thermodynamical limit. Classical Quantum Gravity 18, 4997–5053 (2001)
[Wa] Warner, F.: Foundations of differentiable manifolds and Lie groups. Graduate Texts in Mathematics 94. New York, Berlin: Springer-Verlag, 1983
[Wo] Woodhouse, N.: Geometric Quantization, Second Ed. Oxford, New York: Oxford Univ. Press, 1991
[Wr] Wren, K.: Constrained quantisation and θ-angles. II. Nuclear Phys. B 521, 471–502 (1998)
Communicated by A. Connes
Commun. Math. Phys. 226, 269 – 287 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Generalized Longo–Rehren Subfactors and α-Induction Yasuyuki Kawahigashi Department of Mathematical Sciences, University of Tokyo, Komaba, Tokyo 153-8914, Japan. E-mail: [email protected] Received: 11 September 2001 / Accepted: 7 October 2001
Abstract: We study the recent construction of subfactors by Rehren which generalizes the Longo–Rehren subfactors. We prove that if we apply this construction to a nondegenerately braided subfactor N ⊂ M and α ± -induction, then the resulting subfactor is dual to the Longo–Rehren subfactor M ⊗ M opp ⊂ R arising from the entire system of irreducible endomorphisms of M resulting from α ± -induction. As a corollary, we solve a problem on existence of braiding raised by Rehren negatively. Furthermore, we generalize our previous study with Longo and Müger on multi-interval subfactors arising from a completely rational conformal net of factors on S 1 to a net of subfactors and show that the (generalized) Longo–Rehren subfactors and α-induction naturally appear in this context.
1. Introduction

In subfactor theory initiated by V. F. R. Jones [11], Ocneanu's construction of asymptotic inclusions [22] has been studied by several people as a subfactor analogue of the quantum double construction. (See [5, Chap. 12] on the general theory of asymptotic inclusions.) Popa's construction of symmetric enveloping inclusions [23] generalizes it and is important in the analytic aspects of subfactor theory. Longo and Rehren gave another construction of subfactors in [19] in the setting of sector theory [15, 16], and Masuda [21] has proved that the asymptotic inclusion and the Longo–Rehren subfactor are essentially the same construction. Izumi [8, 9] gave very detailed and interesting studies of the Longo–Rehren subfactors. Recently, Rehren [25] gave a construction generalizing the Longo–Rehren subfactor, and we call the resulting subfactor a generalized Longo–Rehren subfactor. This construction uses certain extensions of systems of endomorphisms from subfactors (of type III) to larger factors. We will analyze this construction in detail in this paper. (This construction will be explained in more detail in Sect. 2 below.)
Longo and Rehren also defined such an extension of endomorphisms for nets of subfactors in the same paper [19, Prop. 3.9], based on an old suggestion of Roberts [26]. Essentially the same construction of new endomorphisms was also given by Xu [27, p. 372], and several very interesting properties and examples were found by him in [27, 28]. We call this extension of endomorphisms α-induction. In this paper, we study the generalized Longo–Rehren subfactors arising from α-induction based on the above works, Böckenhauer–Evans [1] and our previous work [2–4]. In their papers, Longo, Rehren, and Xu study nets of subfactors and have a certain condition arising from locality of the larger net, now called chiral locality as in [2, Sect. 3.3], but we do not assume this condition in this paper. We assume only a non-degenerate braiding in the sense of [24]. (See [2, Section 3.3] for more on this matter. We only need a braiding in order to define α-induction, but we also assume non-degeneracy in this paper. If we start with a completely rational net on the circle in the sense of [13], non-degeneracy of the braiding holds automatically by [13].) Izumi's work [8, 9] on a half-braiding is closely related to the theory of α-induction, and a theory of induction for bimodules generalizing these works has been recently given by Kawamuro [14].

Results in [4] suggest that if we apply the construction of the generalized Longo–Rehren subfactor to $\alpha^{\pm}$-induction for $N \subset M$, then the resulting subfactor $N \otimes N^{\mathrm{opp}} \subset P$ would be dual to the Longo–Rehren subfactor $M \otimes M^{\mathrm{opp}} \subset R$ applied to the system of endomorphisms of M arising from $\alpha^{\pm}$-induction. In this paper we will prove that this is indeed the case. The proof involves several calculations of certain intertwiners related to a half-braiding in the sense of Izumi [8] arising from a relative braiding in Böckenhauer–Evans [1]. As an application, we give a negative answer to a problem on existence of braiding raised by Rehren [25].
Furthermore, we generalize our previous study with Longo and Müger [13] on multi-interval subfactors arising from a completely rational conformal net of factors on $S^{1}$ to a net of subfactors. That is, we have studied "multi-interval subfactors" arising from such a net on $S^{1}$, whose definitions will be explained below, and proved in [13] that the resulting subfactor is isomorphic to the Longo–Rehren subfactor arising from all superselection sectors of the net. We apply the construction of multi-interval subfactors to conformal nets of subfactors with finite index and prove that the resulting subfactor is isomorphic to the Longo–Rehren subfactor arising from the system of α-induced endomorphisms. We then also explain a relation of this result to the generalized Longo–Rehren subfactors. The results in Sect. 2 were announced in [12].

2. Generalized Longo–Rehren Subfactors

Let $N \subset M$ be a type III subfactor with finite index and finite depth. Let ${}_N\mathcal{X}_N$, ${}_N\mathcal{X}_M$, ${}_M\mathcal{X}_N$, ${}_M\mathcal{X}_M$ be finite systems of irreducible morphisms of type N-N, N-M, M-N, M-M, respectively, and suppose that the four systems together make a closed system under conjugations, compositions and irreducible decompositions, and the inclusion map from N into M decomposes into irreducible N-M morphisms within ${}_N\mathcal{X}_M$, as in [2, Assumpt. 4.1]. We assume that the system ${}_N\mathcal{X}_N$ is non-degenerately braided as in [24], [2, Def. 2.3]. Then we have positive and negative α-inductions, corresponding to positive and negative braidings, and the system ${}_M\mathcal{X}_M$ is generated by both α-inductions because of the non-degeneracy as in [2, Thm. 5.10]. We do not assume the chiral locality condition, which arises from locality of the larger net of factors, in this paper. (See [3, Sect. 5] for more on the role of chiral locality.)
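To make the notion of a finite system closed under composition and irreducible decomposition concrete, the following Python sketch works with the fusion rules of SU(2) at level k. This example is an assumption made here for illustration and is not taken from the text; the function names are hypothetical. It checks that the statistical dimensions are multiplicative under fusion and computes the sum of their squares, which is the quantity called the global index for a system later in this section.

```python
import math


def su2_dims(k):
    """Statistical dimensions d_0, ..., d_k for the SU(2)_k fusion rules
    (illustrative example, not from the paper)."""
    s = math.sin(math.pi / (k + 2))
    return [math.sin((j + 1) * math.pi / (k + 2)) / s for j in range(k + 1)]


def su2_fusion(k, i, j):
    """Labels m appearing in the product of i and j, each with multiplicity 1."""
    return list(range(abs(i - j), min(i + j, 2 * k - i - j) + 1, 2))


def global_index(dims):
    """Sum of squared statistical dimensions of a finite system."""
    return sum(d * d for d in dims)


def is_closed_system(k, tol=1e-9):
    """Check d_i * d_j == sum of d_m over the fusion channels m, for all i, j."""
    d = su2_dims(k)
    return all(
        abs(d[i] * d[j] - sum(d[m] for m in su2_fusion(k, i, j))) < tol
        for i in range(k + 1)
        for j in range(k + 1)
    )


if __name__ == "__main__":
    k = 4
    print(su2_fusion(k, 2, 2))
    print(is_closed_system(k))
    print(global_index(su2_dims(k)))
```

The truncation `min(i + j, 2 * k - i - j)` is what keeps the product inside the finite set of labels, so the system is closed in the sense used above.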
Now recall a new construction of subfactors due to Rehren [25] arising from two systems of endomorphisms and two extensions to the same factor as follows. Let $\Delta$ be a system of endomorphisms of a type III factor N and consider a subfactor $N \subset M$ with finite index. An extension of $\Delta$ is a pair $(\iota, \alpha)$, where $\iota$ is the embedding map of N into M and $\alpha$ is a map $\Delta \to \mathrm{End}(M)$, $\lambda \mapsto \alpha_\lambda$, satisfying the following properties:
1. Each $\alpha_\lambda$ has a finite dimension.
2. We have $\iota\lambda = \alpha_\lambda\iota$ for $\lambda \in \Delta$.
3. We have $\iota(\mathrm{Hom}(\lambda\mu, \nu)) \subset \mathrm{Hom}(\alpha_\lambda\alpha_\mu, \alpha_\nu)$ for $\lambda, \mu, \nu \in \Delta$.

Next let $N_1$, $N_2$ be two subfactors of a type III factor M, and let $(\iota_1, \alpha^1)$ and $(\iota_2, \alpha^2)$ be two extensions of finite systems $\Delta_1$, $\Delta_2$ of endomorphisms of $N_1$, $N_2$ to M, respectively. For $\lambda \in \Delta_1$ and $\mu \in \Delta_2$, we set $Z_{\lambda,\mu} = \dim \mathrm{Hom}(\alpha^1_\lambda, \alpha^2_\mu)$. Then Rehren proved in [25] that we have a subfactor $N_1 \otimes N_2^{\mathrm{opp}} \subset R$ such that the canonical endomorphism restricted on $N_1 \otimes N_2^{\mathrm{opp}}$ has a decomposition $\bigoplus_{\lambda\in\Delta_1,\,\mu\in\Delta_2} Z_{\lambda,\mu}\,\lambda \otimes \mu^{\mathrm{opp}}$ by constructing the corresponding Q-system explicitly. This is a generalization of the Longo–Rehren construction [19, Prop. 4.10] in the sense that if $N_1 = N_2 = M$, Rehren's Q-system coincides with the one given in [19]. We call it a generalized Longo–Rehren subfactor.

The most natural example of such extensions seems to be the α-induction, and then we can take $\Delta = {}_N\mathcal{X}_N$, $\alpha^1 = \alpha^+$, $\alpha^2 = \alpha^-$ for α-induction from N to M based on a braiding $\varepsilon^{\pm}$ on the system ${}_N\mathcal{X}_N$, and then $Z_{\lambda,\mu}$ is the "modular invariant" matrix as in [2, Def. 5.5, Thm. 5.7]. Our aim is to study the generalized Longo–Rehren subfactor arising from ${}_N\mathcal{X}_N$ and $\alpha^{\pm}$-induction in this way. The result in [4, Cor. 3.11] suggests that this subfactor is dual to the Longo–Rehren subfactor arising from ${}_M\mathcal{X}_M$, and we prove this is indeed the case. For this purpose, we study the Longo–Rehren subfactor arising from ${}_M\mathcal{X}_M$ first as follows. Let $M \otimes M^{\mathrm{opp}} \subset R$ be the Longo–Rehren subfactor [19, Prop. 4.10] arising from the system ${}_M\mathcal{X}_M$ on M and $(\Gamma, V, W)$ be the corresponding Q-system [17].
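As a numerical illustration of the matrix $Z_{\lambda,\mu}$ and of the decomposition of the canonical endomorphism, the sketch below takes one well-known coupling matrix, the one for $|\chi_0 + \chi_4|^2 + 2|\chi_2|^2$ at SU(2) level 4. This choice, the S and T matrices used, and all function names are assumptions made for this example and do not come from the text. The code checks that Z commutes with S and T, and evaluates $\sum_{\lambda,\mu} Z_{\lambda,\mu} d_\lambda d_\mu$, the dimension of $\bigoplus Z_{\lambda,\mu}\,\lambda\otimes\mu^{\mathrm{opp}}$ and hence the index of the corresponding subfactor.

```python
import cmath
import math


def su2_S(k):
    """Modular S-matrix of SU(2)_k (assumed example data)."""
    n = k + 2
    c = math.sqrt(2.0 / n)
    return [[c * math.sin((i + 1) * (j + 1) * math.pi / n) for j in range(k + 1)]
            for i in range(k + 1)]


def su2_T_diagonal(k):
    """Diagonal of the modular T-matrix of SU(2)_k."""
    n = k + 2
    central_charge = 3.0 * k / n
    return [cmath.exp(2j * math.pi * (j * (j + 2) / (4.0 * n) - central_charge / 24.0))
            for j in range(k + 1)]


def d4_coupling_matrix():
    """Z for |chi_0 + chi_4|^2 + 2|chi_2|^2 at level 4 (illustrative choice)."""
    Z = [[0] * 5 for _ in range(5)]
    for a in (0, 4):
        for b in (0, 4):
            Z[a][b] = 1
    Z[2][2] = 2
    return Z


def commutes_with_S(Z, S, tol=1e-9):
    """Compare the matrix products ZS and SZ entry by entry."""
    n = len(Z)
    for i in range(n):
        for j in range(n):
            zs = sum(Z[i][m] * S[m][j] for m in range(n))
            sz = sum(S[i][m] * Z[m][j] for m in range(n))
            if abs(zs - sz) > tol:
                return False
    return True


def commutes_with_T(Z, t, tol=1e-9):
    """For diagonal T, ZT = TZ means Z[i][j] * (t[j] - t[i]) = 0."""
    n = len(Z)
    return all(abs(Z[i][j] * (t[j] - t[i])) < tol
               for i in range(n) for j in range(n))


def total_dimension(Z, S):
    """Sum of Z[i][j] * d_i * d_j with d_j = S[0][j] / S[0][0]."""
    n = len(S)
    dims = [S[0][j] / S[0][0] for j in range(n)]
    return sum(Z[i][j] * dims[i] * dims[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    S, t, Z = su2_S(4), su2_T_diagonal(4), d4_coupling_matrix()
    print(commutes_with_S(Z, S), commutes_with_T(Z, t))
    print(total_dimension(Z, S))
```

In this example the total dimension agrees with the sum of squared statistical dimensions of the underlying system, which is what one expects whenever Z commutes with S and has vacuum entry 1.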
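(Actually, the subfactor we deal with here is the dual to the original one constructed in [19, Prop. 4.10]. This dual version is called the Longo–Rehren subfactor in [4, 13].) That is, we have that $\Gamma \in \mathrm{End}(R)$ is the canonical endomorphism of the subfactor, $V \in \mathrm{Hom}(\mathrm{id}, \Gamma) \subset R$, and $W \in \mathrm{Hom}(\Gamma, \Gamma^2)$. We also have
\[
W \in M \otimes M^{\mathrm{opp}}, \quad R = (M \otimes M^{\mathrm{opp}})V, \quad W^{*}V = \Gamma(V^{*})W = w^{-1/2}, \quad W^{*}\Gamma(W) = WW^{*}, \quad \Gamma(W)W = W^{2},
\]
where $w = \sum_{\beta\in{}_M\mathcal{X}_M} d_\beta^2$ is the global index of the system ${}_M\mathcal{X}_M$ and equal to the index $[R : M \otimes M^{\mathrm{opp}}]$. By the definition of the original Longo–Rehren subfactor in [19], the Q-system $(\Theta, W, \Gamma(V))$ is given as follows. We have
\[
\Theta(x) = \sum_{\beta\in{}_M\mathcal{X}_M} W_\beta\,(\beta \otimes \beta^{\mathrm{opp}})(x)\,W_\beta^{*}, \qquad \text{for } x \in M \otimes M^{\mathrm{opp}},
\]
where $\Theta$ is the dual canonical endomorphism, the restriction of $\Gamma$ to $M \otimes M^{\mathrm{opp}}$, the family $\{W_\beta\}$ is that of isometries with mutually orthogonal ranges satisfying $\sum_{\beta\in{}_M\mathcal{X}_M} W_\beta W_\beta^{*} = 1$, and also have
\[
\Gamma(V) = \sum_{\beta_1,\beta_2,\beta_3\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1 d_2}{w d_3}}\;\Theta(W_{\beta_2})\,W_{\beta_1}\,T^{\beta_3}_{\beta_1\beta_2}\,W^{*}_{\beta_3}, \tag{1}
\]
\[
T^{\beta_3}_{\beta_1\beta_2} = \sum_{l=1}^{N_{12}^{3}} T^{\beta_3}_{\beta_1\beta_2,l} \otimes j\bigl(T^{\beta_3}_{\beta_1\beta_2,l}\bigr) \in M \otimes M^{\mathrm{opp}}, \tag{2}
\]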
by definition of the Longo–Rehren subfactor [19], where $\{T^{\beta_3}_{\beta_1\beta_2,l}\}_l$ is an orthogonal basis in $\mathrm{Hom}(\beta_3, \beta_1\beta_2) \subset M$, $N_{ij}^{k}$ is the structure constant $\dim \mathrm{Hom}(\beta_k, \beta_i\beta_j)$, $d_j = d_{\beta_j}$ is the statistical dimension of $\beta_j$, and $j$ is the anti-isomorphism $x \in M \mapsto x^{*} \in M^{\mathrm{opp}}$. Starting from this explicit expression of the Q-system $(\Theta, W, \Gamma(V))$, we would like to write down the Q-system $(\Gamma, V, W)$ explicitly and identify it with the Q-system given by the construction of Rehren [25]. First, by [4, Thm. 3.9], we know that
\[
[\Gamma] = \bigoplus_{\lambda_1,\lambda_2\in{}_N\mathcal{X}_N} Z_{\lambda_1,\lambda_2}\,\bigl[\eta(\alpha^+_{\lambda_1}, +)\,\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -)\bigr],
\]
where $[\;]$ represents the sector class of an endomorphism, $\eta(\;,\;)$ is the extension of an endomorphism of $M \otimes M^{\mathrm{opp}}$ to R with a half-braiding by Izumi [8], $\alpha^{\pm}$ is the α-induction, the notations here follow those of [4], and $Z_{\lambda_1\lambda_2} = \dim \mathrm{Hom}(\alpha^+_{\lambda_1}, \alpha^-_{\lambda_2})$ is the "modular invariant" as in [2, Def. 5.5]. (Recall that we now assume non-degeneracy of the braiding on ${}_N\mathcal{X}_N$.) Furthermore, by [4, Cor. 3.10], we have equivalence of two $C^{*}$-tensor categories of $\{\eta(\alpha^+_\lambda, +)\eta^{\mathrm{opp}}(\alpha^-_\mu, -)\}$ on R and $\{\lambda \otimes \mu^{\mathrm{opp}}\}$ on $N \otimes N^{\mathrm{opp}}$, thus the canonical endomorphisms of the two Q-systems are naturally identified. So we will next compute V, W explicitly and identify them with the intertwiners in Rehren's Q-system. (Note that it does not matter that two von Neumann algebras R and $N \otimes N^{\mathrm{opp}}$ are different, since only the equivalence class of $C^{*}$-tensor categories matters in the construction of the (generalized) Longo–Rehren subfactors.) We next closely follow Izumi's arguments in [8, Sect. 7]. First we have the following lemma.

Lemma 2.1. For an operator $X \in M \otimes M^{\mathrm{opp}}$, $XV \in R$ is in $\mathrm{Hom}(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -), \Gamma)$ if and only if we have the following two conditions.
1. $X \in \mathrm{Hom}(\Theta(\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2}), \Theta)$.
2. $X\,\Theta(U^{*})\,\Gamma(V) = \Gamma(V)\,X$, where
\[
U = \sum_{\beta\in{}_M\mathcal{X}_M} W_\beta\,\bigl(E^+_{\lambda_1}(\beta) \otimes j(E^-_{\lambda_2}(\beta))\bigr)\,(\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2})(W_\beta^{*}),
\]
and $E^{\pm}$ is the half-braiding defined in [4, Sect. 3].

Proof. By a standard argument similar to the one in the proof of [8, Prop. 7.3], we easily get the conclusion.
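Next, we rewrite the second condition in the above lemma as follows. Using the definition of $\Gamma(V)$ as in (1), we have
\[
X\,\Theta(U^{*}) \sum_{\beta_1,\beta_2,\beta_3\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1 d_2}{w d_3}}\;\Theta(W_{\beta_2})W_{\beta_1}T^{\beta_3}_{\beta_1\beta_2}W^{*}_{\beta_3}
= \sum_{\beta_4,\beta_5,\beta_6\in{}_M\mathcal{X}_M} \sqrt{\frac{d_4 d_5}{w d_6}}\;\Theta(W_{\beta_5})W_{\beta_4}T^{\beta_6}_{\beta_4\beta_5}W^{*}_{\beta_6}\,X,
\]
which is equivalent to the following equations for all $\beta_3, \beta_4, \beta_5 \in {}_M\mathcal{X}_M$:
\[
W^{*}_{\beta_4}\Theta(W^{*}_{\beta_5})\,X\,\Theta(U^{*}) \sum_{\beta_1,\beta_2\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1 d_2}{d_3}}\;\Theta(W_{\beta_2})W_{\beta_1}T^{\beta_3}_{\beta_1\beta_2}
= \sum_{\beta_6\in{}_M\mathcal{X}_M} \sqrt{\frac{d_4 d_5}{d_6}}\;T^{\beta_6}_{\beta_4\beta_5}W^{*}_{\beta_6}\,X\,W_{\beta_3}.
\]
Assuming the first condition in Lemma 2.1, we compute the left hand side of this equation as follows.
\begin{align*}
&\sum_{\beta_1,\beta_2} \sqrt{\frac{d_1 d_2}{d_3}}\; W^{*}_{\beta_4}\,X\,\Theta\bigl((\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2})(W^{*}_{\beta_5})\,U^{*}\,W_{\beta_2}\bigr)\,W_{\beta_1}T^{\beta_3}_{\beta_1\beta_2} \\
&\quad= \sum_{\beta_1,\beta_2} \sqrt{\frac{d_1 d_2}{d_3}}\; W^{*}_{\beta_4}\,X\,W_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\bigl((\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2})(W^{*}_{\beta_5})\,U^{*}\,W_{\beta_2}\bigr)\,T^{\beta_3}_{\beta_1\beta_2} \\
&\quad= \sum_{\beta_1,\beta_2} \sqrt{\frac{d_1 d_2}{d_3}}\; W^{*}_{\beta_4}\,X\,W_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\Bigl((\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2})(W^{*}_{\beta_5}) \\
&\qquad\qquad \times \sum_{\beta} (\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2})(W_{\beta})\,\bigl(E^+_{\lambda_1}(\beta)^{*} \otimes j(E^-_{\lambda_2}(\beta))^{*}\bigr)\,W^{*}_{\beta}W_{\beta_2}\Bigr)\,T^{\beta_3}_{\beta_1\beta_2} \\
&\quad= \sum_{\beta_1} \sqrt{\frac{d_1 d_5}{d_3}}\; W^{*}_{\beta_4}\,X\,W_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\bigl(E^+_{\lambda_1}(\beta_5)^{*} \otimes j(E^-_{\lambda_2}(\beta_5))^{*}\bigr)\,T^{\beta_3}_{\beta_1\beta_5}.
\end{align*}
That is, our equation is now
\[
\sum_{\beta_1\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1}{d_3}}\; W^{*}_{\beta_4}\,X\,W_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\bigl(E^+_{\lambda_1}(\beta_5)^{*} \otimes j(E^-_{\lambda_2}(\beta_5))^{*}\bigr)\,T^{\beta_3}_{\beta_1\beta_5}
= \sum_{\beta_6\in{}_M\mathcal{X}_M} \sqrt{\frac{d_4}{d_6}}\; T^{\beta_6}_{\beta_4\beta_5}W^{*}_{\beta_6}\,X\,W_{\beta_3} \tag{3}
\]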
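for all $\beta_3, \beta_4, \beta_5 \in {}_M\mathcal{X}_M$. Now set $\beta_4 = \mathrm{id}$ in this equation. Then on the left hand side, we have a term $W_0^{*}XW_{\beta_1}$, which is in $\mathrm{Hom}((\beta_1 \otimes \beta_1^{\mathrm{opp}})(\alpha^+_{\lambda_1} \otimes \alpha^{-,\mathrm{opp}}_{\lambda_2}), \mathrm{id}_{M\otimes M^{\mathrm{opp}}})$. Now setting $X_{\beta_1} = W_0^{*}XW_{\beta_1}$, we get
\[
\sum_{\beta_1\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1}{d_3}}\; X_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\bigl(E^+_{\lambda_1}(\beta_5)^{*} \otimes j(E^-_{\lambda_2}(\beta_5))^{*}\bigr)\,T^{\beta_3}_{\beta_1\beta_5}
= \frac{1}{\sqrt{d_5}}\, W^{*}_{\beta_5}\,X\,W_{\beta_3}
\]
for any $\beta_3, \beta_5 \in {}_M\mathcal{X}_M$ from Eq. (3), and this implies
\[
X = \sum_{\beta_1,\beta_3,\beta_5\in{}_M\mathcal{X}_M} \sqrt{\frac{d_1 d_5}{d_3}}\; W_{\beta_5}\,X_{\beta_1}\,(\beta_1 \otimes \beta_1^{\mathrm{opp}})\bigl(E^+_{\lambda_1}(\beta_5)^{*} \otimes j(E^-_{\lambda_2}(\beta_5))^{*}\bigr)\,T^{\beta_3}_{\beta_1\beta_5}\,W^{*}_{\beta_3}. \tag{4}
\]
Consider the linear map sending $X \in M \otimes M^{\mathrm{opp}}$ with $XV \in \mathrm{Hom}(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -), \Gamma)$ to
\[
(W_0^{*}XW_\beta)_\beta \in \bigoplus_{\beta\in{}_M\mathcal{X}_M} \mathrm{Hom}(\beta\alpha^+_{\lambda_1}, \mathrm{id}) \otimes \mathrm{Hom}(\beta^{\mathrm{opp}}\alpha^{-,\mathrm{opp}}_{\lambda_2}, \mathrm{id}).
\]
The dimensions of the space of such X and the space
\[
\bigoplus_{\beta\in{}_M\mathcal{X}_M} \mathrm{Hom}(\beta\alpha^+_{\lambda_1}, \mathrm{id}) \otimes \mathrm{Hom}(\beta^{\mathrm{opp}}\alpha^{-,\mathrm{opp}}_{\lambda_2}, \mathrm{id})
\]
are both equal to $Z_{\lambda_1\lambda_2}$, and this map is injective by Eq. (4), so this map is also surjective. That is, a general form of such an X is determined now by Eq. (4), where the $X_\beta$'s are now arbitrary intertwiners in $\mathrm{Hom}(\beta\alpha^+_{\lambda_1}, \mathrm{id}) \otimes \mathrm{Hom}(\beta^{\mathrm{opp}}\alpha^{-,\mathrm{opp}}_{\lambda_2}, \mathrm{id})$.

Fix $\lambda_1, \lambda_2 \in {}_N\mathcal{X}_N$, $\beta \in {}_M\mathcal{X}_M$ and let $l_1$ and $l_2$ be indices in the sets $\{1, 2, \ldots, \dim \mathrm{Hom}(\beta\alpha^+_{\lambda_1}, \mathrm{id})\}$, $\{1, 2, \ldots, \dim \mathrm{Hom}(\beta\alpha^-_{\lambda_2}, \mathrm{id})\}$ respectively. Following [25], we use the letter $l$ for the multi-index $(\lambda_1, \lambda_2, \beta, l_1, l_2)$. Note that in order for us to get a non-trivial index, that is, $l_1 > 0$, $l_2 > 0$, the endomorphism $\beta$ must be ambichiral in the sense that it appears in irreducible decompositions of both $\alpha^+$-induction and $\alpha^-$-induction as in [2]. Let $\{T^+_{l_1}\}_{l_1}$ and $\{T^-_{l_2}\}_{l_2}$ be orthonormal bases of $\mathrm{Hom}(\alpha^+_{\lambda_1}, \bar\beta)$ and $\mathrm{Hom}(\alpha^-_{\lambda_2}, \bar\beta)$, respectively.

We now study some intertwiners using a graphical calculus in [2, Sect. 3]. First note that we have identities as in Fig. 1 by the braiding-fusion equation [8, Def. 4.2], [4, Def. 2.2 2] for a half-braiding, where crossings in the picture represent the half-braidings and the black and white small circles represent intertwiners in $\mathrm{Hom}(\beta\alpha^+_{\lambda_1}, \mathrm{id})$ and $\mathrm{Hom}(\alpha^+_{\lambda_1}, \bar\beta)$ respectively. (See [2, Sect. 3] for interpretations of the graphical calculus. Here and below, a triple point, a black or white small circle always represents an isometry or a co-isometry. One has to be careful that we have a normalizing constant involving the fourth roots of statistical dimensions as in [2, Figs. 7, 9]. From now on, we drop orientations of wires, which should cause no confusion.) We also have the following lemma to relate these two intertwiners.

Lemma 2.2. Let $T_j \in \mathrm{Hom}(\beta, \alpha^+_\lambda)$ and define $\hat T_j \in \mathrm{Hom}(\beta, \alpha^+_\lambda)$ by the graphical expression in Fig. 2. Then we have $T_k^{*}T_j = \hat T_k^{*}\hat T_j$.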
Fig. 1. An application of the braiding-fusion equation
Fig. 2. The intertwiner $\hat T_j$
Fig. 3. The inner product $\hat T_k^{*}\hat T_j$
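Proof. We compute as in Fig. 3.

Based on this, we set
\[
S^{\beta_2\beta_3}_{\beta_1} = \sum_{k=1}^{N_{23}^{1}} \bigl(T^{\beta_1}_{\beta_2\beta_3,k}\bigr)^{*} \otimes j\bigl(T^{\beta_1}_{\beta_2\beta_3,k}\bigr)^{*} \in M \otimes M^{\mathrm{opp}}
\]
and we now define $X_l \in M \otimes M^{\mathrm{opp}}$ as follows:
\[
X_l = \sqrt{d_{\lambda_1} d_{\lambda_2}} \sum_{\beta_3,\beta_5\in{}_M\mathcal{X}_M} \sqrt{\frac{d_3}{d_1 d_5}}\; W_{\beta_5}\,S^{\bar\beta_1\beta_3}_{\beta_5}\,\bigl(T^+_{l_1} \otimes j(T^-_{l_2})\bigr)\bigl(E^+_{\lambda_1}(\beta_3)^{*} \otimes j(E^-_{\lambda_2}(\beta_3))^{*}\bigr)\,W^{*}_{\beta_3}. \tag{5}
\]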
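Then by Eq. (4), the operator $U_l \in R$ defined by $U_l = X_lV$ is in $\mathrm{Hom}(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -), \Gamma)$ and $\{U_l\}_{\beta,l_1,l_2}$ is a linear basis of this intertwiner space. We next prove that $\{U_l\}_{\beta,l_1,l_2}$ is actually an orthonormal basis with respect to the usual inner product. Recall that for $s, t \in \mathrm{Hom}(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -), \Gamma)$, we have
\[
E_{M\otimes M^{\mathrm{opp}}}(st^{*}) = \frac{d_{\lambda_1} d_{\lambda_2}}{w}\, t^{*}s \in \mathbb{C}
\]
because $d_{\eta(\alpha^+_{\lambda_1},+)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2},-)} = d_{\lambda_1} d_{\lambda_2}$. (See [8, Lemma 3.1 (i)].) We then have
\[
E_{M\otimes M^{\mathrm{opp}}}(U_lU_{l'}^{*}) = \frac{1}{w}\, X_lX_{l'}^{*}
= \delta_{ll'}\,\frac{d_{\lambda_1} d_{\lambda_2}}{w} \sum_{\beta_3,\beta_5\in{}_M\mathcal{X}_M} \frac{d_3}{d_1 d_5}\, N_{13}^{5}\, W_{\beta_5}W^{*}_{\beta_5}
= \delta_{ll'}\,\frac{d_{\lambda_1} d_{\lambda_2}}{w},
\]
and this proves that $\{U_l\}_{\beta,l_1,l_2}$ is indeed an orthonormal basis. This also shows that we have
\[
\phi_\Theta(X_m^{*}X_l) = W^{*}E_{\Gamma(R)}(X_m^{*}X_l)W = W^{*}\Gamma(U_m^{*}U_l)W = \delta_{lm},
\]
where $\phi_\Theta$ is the standard left inverse of $\Theta$. (See [20] for a general theory of left inverses.) Let $l = (\lambda_1, \lambda_2, \beta_1, l_1, l_2)$, $m = (\mu_1, \mu_2, \beta_1', m_1, m_2)$, $n = (\nu_1, \nu_2, \beta_1'', n_1, n_2)$ be multi-indices as above. We compute $E_{\Gamma(R)}(X_m^{*}X_l^{*}X_n)$ as follows:
\begin{align*}
E_{\Gamma(R)}(X_m^{*}X_l^{*}X_n) &= \Gamma(V^{*}X_m^{*}X_l^{*}X_nV) \\
&= \Gamma\bigl(w^{1/2}\,V^{*}X_m^{*}\,\Gamma(V^{*})W\,X_l^{*}X_nV\bigr) \\
&= \Gamma\bigl(w^{1/2}\,V^{*}X_m^{*}\,\Gamma(V^{*}X_l^{*})\,W\,X_nV\bigr) \\
&= \Gamma\bigl(w^{1/2}\,U_m^{*}\,\Gamma(U_l^{*})\,W\,U_n\bigr).
\end{align*}
Based on this, we set
\[
Y^{n}_{lm} = w^{-1/2}\,V^{*}X_m^{*}X_l^{*}X_nV = U_m^{*}\,\Gamma(U_l^{*})\,W\,U_n \in R
\]
and then this is an element in
\[
\mathrm{Hom}\bigl(\eta(\alpha^+_{\nu_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\nu_2}, -),\; \eta(\alpha^+_{\mu_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\mu_2}, -)\,\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -)\bigr),
\]
which is then contained in
\[
\mathrm{Hom}(\nu_1, \mu_1\lambda_1) \otimes \mathrm{Hom}(\nu_2, \mu_2\lambda_2)^{\mathrm{opp}} \subset N \otimes N^{\mathrm{opp}} \subset M \otimes M^{\mathrm{opp}}
\]
by [4, Thm. 3.9]. That is, we now have
\[
E_{\Gamma(R)}(X_m^{*}X_l^{*}X_n) = w^{1/2}\,\Gamma(Y^{n}_{lm}) \in \Gamma(M \otimes M^{\mathrm{opp}})
\]
and
\[
\phi_\Theta(X_m^{*}X_l^{*}X_n) = V^{*}X_m^{*}X_l^{*}X_nV. \tag{6}
\]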
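Proposition 2.3. In the above setting, the Q-system $(\Gamma, V, W)$ is given as follows:
\[
\Gamma(x) = \sum_{l} U_l\,\bigl(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -)\bigr)(x)\,U_l^{*}, \qquad \text{for } x \in R, \tag{7}
\]
\[
V = U_{(0,0,0,1,1)}, \tag{8}
\]
\[
W = \sum_{l,m,n} \Gamma(U_l)\,U_m\,Y^{n}_{lm}\,U_n^{*}. \tag{9}
\]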
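Proof. Since $\{U_l\}_{\beta_1,l_1,l_2}$ is an orthonormal basis of $\mathrm{Hom}(\eta(\alpha^+_{\lambda_1}, +)\eta^{\mathrm{opp}}(\alpha^-_{\lambda_2}, -), \Gamma)$, we get the first identity (7). By the definition (5) of $X_l$, we have $X_{(0,0,0,1,1)} = 1$, hence $U_{(0,0,0,1,1)} = V$, which is (8). Since $Y^{n}_{lm} = U_m^{*}\Gamma(U_l^{*})WU_n$, we get (9).

Next we further compute $Y^{n}_{lm}$. We first have
\begin{align*}
Y^{n}_{lm} &= W^{*}\Gamma(Y^{n}_{lm})W \\
&= w^{-1/2}\,W^{*}E_{\Gamma(R)}(X_m^{*}X_l^{*}X_n)W \\
&= w^{-1/2}\,\phi_\Theta(X_m^{*}X_l^{*}X_n) \\
&= \sum_{\beta\in{}_M\mathcal{X}_M} \frac{d_\beta^{2}}{w^{3/2}}\,(\phi_\beta \otimes \phi_\beta^{\mathrm{opp}})(W_\beta^{*}X_m^{*}X_l^{*}X_nW_\beta),
\end{align*}
where $\phi_\beta$ is the standard left inverse of $\beta$. In this expression, we compute the term $W_\beta^{*}X_m^{*}X_l^{*}X_nW_\beta$ as follows.
\begin{align*}
W_\beta^{*}X_m^{*}X_l^{*}X_nW_\beta = {}& \sqrt{d_{\lambda_1} d_{\lambda_2} d_{\mu_1} d_{\mu_2} d_{\nu_1} d_{\nu_2}} \sum_{\beta_3,\beta_5\in{}_M\mathcal{X}_M} \frac{d_\beta}{d_{\beta_5}\sqrt{d_{\beta_1} d_{\beta_1'} d_{\beta_1''}}} \\
&\times \bigl(E^+_{\mu_1}(\beta) \otimes j(E^-_{\mu_2}(\beta))\bigr)\bigl((T^+_{m_1})^{*} \otimes j(T^-_{m_2})^{*}\bigr)\bigl(S^{\bar\beta_1'\beta}_{\beta_3}\bigr)^{*} \\
&\times \bigl(E^+_{\lambda_1}(\beta_3) \otimes j(E^-_{\lambda_2}(\beta_3))\bigr)\bigl((T^+_{l_1})^{*} \otimes j(T^-_{l_2})^{*}\bigr)\bigl(S^{\bar\beta_1\beta_3}_{\beta_5}\bigr)^{*} \\
&\times S^{\bar\beta_1''\beta}_{\beta_5}\,\bigl(T^+_{n_1} \otimes j(T^-_{n_2})\bigr)\bigl(E^+_{\nu_1}(\beta)^{*} \otimes j(E^-_{\nu_2}(\beta))^{*}\bigr).
\end{align*}
Our aim is to show that our $Y^{n}_{lm}$ coincides with Rehren's $T^{n}_{lm}$ in [25, p. 400]. Our $Y^{n}_{lm}$ is already in $\mathrm{Hom}(\nu_1, \mu_1\lambda_1) \otimes \mathrm{Hom}(\nu_2, \mu_2\lambda_2)^{\mathrm{opp}}$ as in Rehren's $T^{n}_{lm}$. So we expand our $Y^{n}_{lm}$ with respect to the basis $\{\tilde T_e = T^{1}_{e_1} \otimes j(T^{2}_{e_2})\}_{e=(e_1,e_2)}$, where $\{T^{1}_{e_1}\}_{e_1}$, $\{T^{2}_{e_2}\}_{e_2}$ are bases for $\mathrm{Hom}(\nu_1, \mu_1\lambda_1)$, $\mathrm{Hom}(\nu_2, \mu_2\lambda_2)$, respectively. We will prove that the coefficients of $Y^{n}_{lm}$ for such an expansion coincide with Rehren's coefficients $\zeta^{n}_{lm,e_1,e_2}$ in [25, p. 400].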
Let Sl+ = Sβ+1 ,λ1 ,l1 ∈ Hom(β1 , αλ+1 ) be isometries so that {Sβ+1 ,λ1 ,l1 }l1 gives an orthonormal basis in Hom(β1 , αλ+1 ). Similarly we choose Sl− = Sβ−1 ,λ2 ,l2 ∈ Hom(β1 , αλ−2 ). Rehren puts an inner product in Hom(αλ+1 , αλ−2 ) in [25, p. 400]. When we decompose this space as β∈M X 0 Hom(αλ+1 , β) ⊗ Hom(β, αλ−2 ), Rehren’s normalization implies M ∗ that his orthonormal basis consists of intertwiners of the form dλ1 /dβ Sl− Sl+ , where ± n Sl are isometries as above. This implies that Rehren’s ζlm,e1 ,e2 is given as follows:
dλ1 dλ2 dµ1 dµ2 dβ1 wdν1 dν2 dβ1 dβ1
+ −∗ Sn+∗ (Te21 )∗ ((Sl+ Sl−∗ ) × (Sm Sm ))Te12 Sn− .
(10)
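Note that we have
\[
E_{M\otimes M^{\mathrm{opp}}}(X_nV\,\tilde T_e\,V^{*}X_m^{*}X_l^{*}) = \frac{d_{\nu_1} d_{\nu_2}}{w}\,\tilde T_e\,V^{*}X_m^{*}X_l^{*}X_nV, \tag{11}
\]
where we have $\tilde T_e = T^{1}_{e_1} \otimes j(T^{2}_{e_2})$ as above. (See [8, Lemma 3.1].) We expand our $Y^{n}_{lm}$ with respect to the basis $\{\tilde T_e\}_e$. Then the coefficient is given as follows using the relations (6), (11):
\begin{align*}
(Y^{n}_{lm})^{*}\tilde T_e &= w^{-1/2}\,\phi_\Theta(X_n^{*}X_lX_m)\,\tilde T_e \\
&= w^{-1/2}\,V^{*}X_n^{*}X_lX_mV\,\tilde T_e \\
&= \frac{w^{1/2}}{d_{\nu_1} d_{\nu_2}}\,E_{M\otimes M^{\mathrm{opp}}}(X_lX_mV\,\tilde T_e\,V^{*}X_n^{*}) \\
&= \frac{1}{w^{1/2} d_{\nu_1} d_{\nu_2}}\,X_lX_m\,\Theta(\tilde T_e)\,X_n^{*}. \tag{12}
\end{align*}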
We represent $X_l$ graphically as in Fig. 4, where we follow the graphical convention of [2, Sect. 3], and $\{T_i\}_i$ is an orthonormal basis of $\mathrm{Hom}(\beta_5, \beta_1\beta_3)$. After this figure, we drop the symbols $T_i$, $S_l^{\pm *}$, and the summation $\sum_{T_i}$ for simplicity.

Fig. 4. A graphical expression for $X_l$
We next have a graphical expression for Xl Xm as in Fig. 5, where we have used a braiding-fusion equation for the half-braiding. Here we prepare two lemmas.
Fig. 5. A graphical expression for $X_lX_m$
Lemma 2.4. For an intertwiner in $\mathrm{Hom}(\beta_1\beta_2, \beta_3) \otimes \mathrm{Hom}(\beta_3, \beta_1\beta_2)$, the application of the left inverse $\phi_{\beta_1}$ is given as in Fig. 6.

Proof. Immediate by [8, Lemma 3.1 (i)] and our graphical normalization convention.

Fig. 6. A graphical expression for the left inverse
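Lemma 2.5. For a change of bases, we have a graphical identity as in Fig. 7, where we have summations over orthonormal bases of (co)-isometries for small black circles.

Proof. The change of bases produces quantum 6j-symbols, and their unitarity gives the conclusion.

Then next we compute $X_lX_m\Theta(\tilde T_e)X_n^{*}$. It is expressed as
\[
X_lX_m\Theta(\tilde T_e)X_n^{*} = \sqrt{\frac{d_{\lambda_1} d_{\lambda_2} d_{\mu_1} d_{\mu_2} d_{\nu_1} d_{\nu_2}}{d_{\beta_1} d_{\beta_1'} d_{\beta_1''}}}\,\left(\frac{d_{\nu_1} d_{\nu_2}}{d_{\lambda_1} d_{\lambda_2} d_{\mu_1} d_{\mu_2}}\right)^{1/4} \sum_{\beta_3,\beta_5,\tilde\beta_3} W_{\beta_5}\,(\text{graphical expression of Fig. 8})\,W^{*}_{\beta_5}, \tag{13}
\]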
Fig. 7. A change of orthonormal bases
where small white circles represent intertwiners corresponding to $T^{1}_{e_1}$, $T^{2}_{e_2}$ regarded as elements in M, we have applied $\phi$ graphically using Lemma 2.4, changed the orthonormal bases in the space $\mathrm{Hom}(\beta_1\beta_1'\beta_3, \beta_5)$ using Lemma 2.5, and thus we now have a summation over $\tilde\beta_3$ rather than over $\beta_3$.

Fig. 8. A graphical expression for $X_lX_m\Theta(\tilde T_e)X_n^{*}$
Then the complex number value represented by Fig. 8 can be computed as in Fig. 9, where we have used the braiding-fusion equation for a half-braiding twice. Here we have the following lemma.
Fig. 9. The value of Fig. 8
Lemma 2.6. Let β, β′ be ambichiral and choose isometries T ∈ Hom(β, αλ+ ), S ∈ Hom(β′, αµ+ ). Then we have the identity as in Fig. 10.
Fig. 10. A naturality equation
Proof. We compute both sides by the definitions of the half and the relative braidings in [4, (10)] and [1, Subsect. 3.3], respectively, and then we get β (T ∗ )S ∗ ε + (λ, µ)T T ∗ , where we have used ε+ (λ, µ)αλ+ (SS ∗ ) = SS ∗ ε + (λ, µ), which follows from the arguments and the figure in [27, p. 377]. (The chiral locality is not used in the argument in [27, p. 377].)
Then the value (Y^n_{lm})∗ T˜e is computed with the coefficients in Eqs. (12), (13), and Fig. 9. The coefficient is now 1/4 w1/2 dλ1 dλ2 dµ1 dµ2 dν1 dν2 dν1 dν2 dν1 dν2 dβ1 dβ1 dβ1 dλ1 dλ2 dµ1 dµ2 dβ3 dβ1 dβ1 dβ1 dβ5 × dβ1 dβ5 dβ3 1/4 dβ1 d d d d dν1 dν2 λ λ µ µ 1 2 1 2 −1/2 =w (14) dν1 dν2 dβ1 dβ1 dλ1 dλ2 dµ1 dµ2
and this is multiplied with the intertwiner in Fig. 11, where the two crossings of the two wires labeled with β1 , β1 represent the “ambichiral braiding” studied in [1, Subsect. 3.3].
Fig. 11. The remaining intertwiner
Then the monodromy of β1 and β1 in Fig. 11 acts on Hom(β1 β1 , β1 ) as a scalar arising from “conformal dimensions” of β1 , β1 , β1 in the ambichiral system. (See [5, Fig. 8.30].) So up to this scalar, we have Fig. 12. Since the fourth root in (14) comes from our normalization for the graphical expression (see [2, Figs. 7, 9]) and we can
absorb the above scalar arising from the conformal dimensions by changing the bases {T˜e }e , our coefficient multiplied with the number represented by Fig. 11 now coincides with Rehren’s coefficient computed as in (10). (Actually, λj and µj are interchanged and also α+ and α− are interchanged, but these are just matters of convention.)
Fig. 12. The new form of the remaining intertwiner
Now with [4, Cor. 3.10], we have proved the following theorem.
Theorem 2.7. The generalized Longo–Rehren subfactor arising from α±-induction with a non-degenerate braiding on N XN is isomorphic to the dual of the Longo–Rehren subfactor arising from M XM .
At the end of [25], Rehren asks for an Izumi type description [8] of irreducible endomorphisms of P arising from the generalized Longo–Rehren subfactor N ⊗ N opp ⊂ P and in particular, he asks whether a braiding exists or not on this system of endomorphisms of P . The above theorem in particular shows that the system of endomorphisms of P is isomorphic to the direct product system of M XM and M XM opp . This solves these problems, and the answer to the second question is negative, since this system does not have a braiding in general and it can even be non-commutative. (Note that [2, Cor. 6.9] gives a criterion for such non-commutativity.)
Remark 2.8. If N = M in the above setting, our result implies [8, Prop. 7.3], of course, but a remark on [8, p. 171] gives a “twisted Longo–Rehren subfactor” rather than the usual Longo–Rehren subfactor. This is due to the monodromy operator similar to the one in Fig. 11, but as pointed out by Rehren, one can always eliminate such a twist and then the “twisted Longo–Rehren subfactor” is actually isomorphic to the Longo–Rehren subfactor. (See “Added in proof” of [8] on this point.) We also had a similar twist in our results here, originally, but we have eliminated it thanks to this remark of Rehren.
In the above setting, we can also set N1 = N , N2 = M, 1 = N XN , 2 = M X 0 M , αλ1 = αλ+ , ατ2 = τ in the construction of the generalized Longo–Rehren subfactor.
Then the resulting subfactor M ⊗ N opp ⊂ R has a dual canonical endomorphism ⊕_{λ∈N XN , τ∈M X 0 M} b+_{τ,λ} λ ⊗ τ opp , where b+_{τ,λ} = dim Hom(αλ+ , τ ) is the chiral branching coefficient as in [3, Subsect. 3.2]. Now using the results in [4, Sect. 4] and arguments almost identical to the above, we can prove the following theorem.
Theorem 2.9. The generalized Longo–Rehren subfactor M ⊗ N opp ⊂ R arising from α+-induction as above with a non-degenerate braiding on N XN is isomorphic to the dual of the Longo–Rehren subfactor arising from M X + M .
3. Nets of Subfactors on S 1
In this section, we study multi-interval subfactors for completely rational nets of subfactors, which generalizes the study in [13]. Let {M(I )}I ⊂S 1 be a completely rational net of factors on S 1 in the sense of [13], where an “interval” I is a non-empty, non-dense connected open subset of S 1 . (That is, we assume isotony, conformal invariance, positivity of the energy, locality, existence of the vacuum, irreducibility, the split property, strong additivity, and finiteness of the µ-index. See [6, 13] for the detailed definitions.) We also suppose that we have a conformal subnet {N (I )}I ⊂S 1 of {M(I )}I ⊂S 1 with finite index as in [18]. The main result in [18] says that the subnet {N (I )}I ⊂S 1 is also completely rational.
Let E = I1 ∪ I3 be a union of two intervals I1 , I3 such that I¯1 ∩ I¯3 = ∅. Label the interiors of the two connected components of S 1 \ E as I2 , I4 so that I1 , I2 , I3 , I4 appear on the circle in a counterclockwise order. We set Nj = N (Ij ), Mj = M(Ij ), for j = 1, 2, 3, 4. (This numbering should not be confused with the basic construction.) We also set N = N1 , M = M1 .
We have a finite system of mutually inequivalent irreducible DHR endomorphisms {λ} for the net {N (I )} by complete rationality. We may and do regard this as a braided system of endomorphisms of N = N1 . By [13, Corollary 37], this braiding is non-degenerate. We write N XN for this system.
As in [2], we can apply α±-induction to get systems M XM , M X + M , M X − M , M X 0 M of irreducible endomorphisms of M. That is, they are the systems of irreducible endomorphisms of M arising from α±-induction, α+-induction, α−-induction, and the “ambichiral” system, respectively. Since the braiding on N XN is non-degenerate, [2, Thm. 5.10] and [1, Prop. 5.1] imply that the ambichiral system M X 0 M is given by the irreducible DHR endomorphisms of the net {M(I )}. By the inclusions M X 0 M ⊂ M X ± M ⊂ M XM and the Galois correspondence of [8, Thm. 2.5] (or by the characterization of the Longo–Rehren subfactor in [13, Appendix A]), we have inclusions of the corresponding Longo–Rehren subfactors M ⊗ M opp ⊂ R, M ⊗ M opp ⊂ R ± , M ⊗ M opp ⊂ R 0 with R 0 ⊂ R ± ⊂ R. We study these Longo–Rehren subfactors in connection to the results in Sect. 2.
As in [13], we identify S 1 with R ∪ {∞}, and as in [13, Prop. 36], we may and do assume that I1 = (−b, −a), I3 = (a, b), with 0 < a < b. Take a DHR endomorphism λ localized in I1 for the net {N (I )}. Let P = M(I˜), where I˜ = (−∞, 0). Let J be the modular conjugation for P with respect to the vacuum vector. We consider endomorphisms of the C∗-algebras generated by ⋃_{I ⊂(−∞,∞)} M(I ) and ⋃_{I ⊂(−∞,∞)} N (I ). The canonical endomorphism γ and the dual canonical endomorphism θ are regarded as endomorphisms of these C∗-algebras. We regard αλ+ as an endomorphism of the former C∗-algebra as in [19], and then it is not localized in I1 any more, but it is localized in (−∞, −a) by [19, Prop. 3.9]. We study an irreducible decomposition of αλ+ as an endomorphism of M1 and choose β appearing in such an irreducible decomposition
of αλ+ regarded as an endomorphism of M1 . That is, we choose an isometry W ∈ M1 with W ∗ W ∈ αλ+ (M)′ ∩ M, β(x) = W ∗ αλ+ (x)W . Using this same formula, we can regard β as an endomorphism of the C∗-algebra generated by ⋃_{I ⊂(−∞,∞)} M(I ). We next regard β as an endomorphism of P and let Vβ be the isometry standard implementation of β ∈ End(P ) as in [6, Appendix]. We now set β¯ = JβJ . Then for any X ∈ P ∨ P′, we have β β¯(X)Vβ = Vβ X as in the proof of [13, Prop. 36] since J Vβ J = Vβ . By strong additivity, we have this for all local operators X. Since λ, λ¯ = J λJ, β, β¯ are localized in (−∞, a), (a, ∞), I1 , I3 , respectively, we know that Vβ ∈ (M2 ∨ N4 )′. Consider the subfactor M1 ∨ M3 ⊂ (M2 ∨ N4 )′. By Frobenius reciprocity [7], we know that the dual canonical endomorphism for the subfactor M1 ∨ M3 ⊂ (M2 ∨ N4 )′ contains β ⊗ β opp for all β ∈ M X + M , where M3 = J M1 J is now regarded as M1 opp and M1 ∨ M3 is regarded as M1 ⊗ M1 opp .
We now compute the index of the subfactor M1 ∨ M3 ⊂ (M2 ∨ N4 )′ in two ways. On one hand, it has an intermediate subfactor (M2 ∨ M4 )′ and the index for M1 ∨ M3 ⊂ (M2 ∨ M4 )′ is the global index of the ambichiral system by [13, Thm. 33]. The index of (M2 ∨ M4 )′ ⊂ (M2 ∨ N4 )′ is simply that of the net {N (I ) ⊂ M(I )} of subfactors. We also have
\[ \frac{w}{w_+} = \frac{w_+}{w_0} = \sum_{\lambda \in {}_N\mathcal{X}_N} d_\lambda Z_{\lambda 0} = d_\theta = [M(I) : N(I)], \]
where w, w+ , w0 are the global indices of M XM , M X + M , M X 0 M , respectively, by [3, Thm. 4.2, Prop. 3.1], [27, Thm. 3.3 (1)]. (Here we have used the chiral locality condition arising from the locality of the net {M(I )}. Without the chiral locality, the results in this section would not hold in general.) These imply that
\[ [(M_2 \vee N_4)' : M_1 \vee M_3] = w_+ . \tag{15} \]
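The two-step index count behind (15) can be written out as follows. This is only a sketch of the multiplicativity argument described in the text; primes denote commutants, which we assume is the intended reading of the parenthesised algebras.

```latex
\begin{align*}
[(M_2\vee N_4)' : M_1\vee M_3]
  &= [(M_2\vee N_4)' : (M_2\vee M_4)'] \cdot [(M_2\vee M_4)' : M_1\vee M_3] \\
  &= [M(I) : N(I)] \cdot w_0
   = \frac{w_+}{w_0}\, w_0
   = w_+ .
\end{align*}
```

The first factor is the index of the net of subfactors, the second is the global index of the ambichiral system, and the displayed chain of equalities for w/w_+ supplies [M(I) : N(I)] = w_+/w_0.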
On the other hand, the dual canonical endomorphism for the subfactor M1 ∨ M3 ⊂ (M2 ∨ N4 )′ contains ⊕_{β∈M X + M} β ⊗ β opp from the above considerations since each β is irreducible as an endomorphism of M, thus the index value is at least Σ_{β∈M X + M} dβ^2 = w+ . Together with (15), we know that the dual canonical endomorphism is indeed equal to ⊕_{β∈M X + M} β ⊗ β opp . Put Rβ = dβ Vβ ∈ (M2 ∨ N4 )′. As in the proof of [13, Prop. 36], we now conclude that the subfactor M1 ∨ M3 ⊂ (M2 ∨ N4 )′ is isomorphic to the Longo–Rehren subfactor M ⊗ M opp ⊂ R + . Similarly, we know that the subfactor M1 ∨ M3 ⊂ (N2 ∨ M4 )′ is isomorphic to the Longo–Rehren subfactor M ⊗ M opp ⊂ R − . These two isomorphisms are compatible on (M2 ∨ M4 )′ and they give an isomorphism of M1 ∨ M3 ⊂ (M2 ∨ M4 )′ to the Longo–Rehren subfactor M ⊗ M opp ⊂ R 0 . We finally look at the inclusions
\[ \begin{array}{ccccc} M \otimes M^{\mathrm{opp}} & \subset & R^0 & \subset & R^+ \\ & & \cap & & \cap \\ & & R^- & \subset & R. \end{array} \tag{16} \]
The right square is a commuting square by [18, Lemma 1] and thus R is generated by R + and R − . (Or [2, Thm. 5.10] and [8, Prop. 2.4, Thm. 2.5] also give this generating property.) It means that the above isomorphisms give the following theorem.
Theorem 3.1. Under the above setting, the following system of algebras arising from four intervals on the circle is isomorphic to the system of algebras (16) arising as Longo–Rehren subfactors.
\[ \begin{array}{ccccc} M_1 \vee M_3 & \subset & (M_2 \vee M_4)' & \subset & (M_2 \vee N_4)' \\ & & \cap & & \cap \\ & & (N_2 \vee M_4)' & \subset & (N_2 \vee N_4)'. \end{array} \]
Remark 3.2. Passing to the commutant, we also conclude that the subfactor N1 ∨ N3 ⊂ (M2 ∨ M4 )′ is isomorphic to the dual of M ⊗ M opp ⊂ R and thus isomorphic to the generalized Longo–Rehren subfactor arising from the α±-induction studied in Sect. 2. In the example of the conformal inclusion SU (2)10 ⊂ Spin(5)1 in [27, Sect. 4.1], this fact was first noticed by Rehren and it can be proved also in general directly by computing the corresponding Q-system.
Acknowledgement. The author thanks K.-H. Rehren for his remarks mentioned in Remarks 2.8, 3.2 and detailed comments on a preliminary version of this paper. We also thank F. Xu for his comments on the preliminary version. We gratefully acknowledge the financial support of Grant-in-Aid for Scientific Research, Ministry of Education and Science (Japan), Japan-Britain joint research project (2000 April–2002 March) of Japan Society for the Promotion of Science, Mathematical Sciences Research Institute (Berkeley), the Mitsubishi Foundation and University of Tokyo. A part of this work was carried out at the Mathematical Sciences Research Institute, Berkeley, and Università di Roma “Tor Vergata” and we thank them for their hospitality.
References
1. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. III. Commun. Math. Phys. 205, 183–228 (1999)
2. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
3. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Commun. Math. Phys. 210, 733–784 (2000)
4. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Longo–Rehren subfactors arising from α-induction. Publ. RIMS, Kyoto Univ. 31, 1–35 (2001)
5. Evans, D.E., Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: Oxford University Press, 1998
6. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
7. Izumi, M.: Subalgebras of infinite C∗-algebras with finite Watatani indices II: Cuntz-Krieger algebras. Duke Math. J. 91, 409–461 (1998)
8. Izumi, M.: The structure of sectors associated with the Longo–Rehren inclusions I. General theory. Commun. Math. Phys. 213, 127–179 (2000)
9. Izumi, M.: The structure of sectors associated with the Longo–Rehren inclusions II. Examples. Rev. Math. Phys. 13, 603–674 (2001)
10. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
11. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
12. Kawahigashi, Y.: Braiding and extensions of endomorphisms of subfactors. In “Mathematical Physics in Mathematics and Physics”, ed. R. Longo. Fields Institute Comm. 30, AMS Publ., 2001, 261–269
13. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
14. Kawamuro, K.: An induction for bimodules arising from subfactors. Preprint 2001
15. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989)
16. Longo, R.: Index of subfactors and statistics of quantum fields. II. Commun. Math. Phys. 130, 285–309 (1990)
17. Longo, R.: A duality for Hopf algebras and for subfactors I. Commun. Math. Phys. 159, 133–150 (1994)
18. Longo, R.: Conformal subnets and intermediate subfactors. Preprint 2001, math.OA/0102196
19. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
20. Longo, R., Roberts, J.E.: A theory of dimension. K-theory 11, 103–159 (1997)
21. Masuda, T.: An analogue of Longo’s canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Internat. J. Math. 8, 249–265 (1997)
22. Ocneanu, A.: Quantized group, string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2 (Warwick, 1987), ed. D. E. Evans and M. Takesaki, London Mathematical Society Lecture Note Series 36, Cambridge: Cambridge University Press, 1988, pp. 119–172
23. Popa, S.: Symmetric enveloping algebras, amenability and AFD properties for subfactors. Math. Res. Lett. 1, 409–425 (1994)
24. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The algebraic theory of superselection sectors. ed. D. Kastler, Palermo 1989, Singapore: World Scientific, 1990, pp. 333–355
25. Rehren, K.-H.: Canonical tensor product subfactors. Commun. Math. Phys. 211, 395–406 (2000)
26. Roberts, J.E.: Local cohomology and superselection structure. Commun. Math. Phys. 51, 107–119 (1976)
27. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998)
28. Xu, F.: Applications of braided endomorphisms from conformal inclusions. Internat. Math. Research Notices 5–23 (1998)
Communicated by H. Araki
Commun. Math. Phys. 226, 289 – 322 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Cohomology of Canonical Projection Tilings
A. H. Forrest1 , J. R. Hunton2 , J. Kellendonk3
1 IMF, NTNU Lade, 7034 Trondheim, Norway. E-mail: [email protected]
2 The Department of Mathematics and Computer Science, University of Leicester, University Road, Leicester, LE1 7RH, England. E-mail: [email protected]
3 Fachbereich Mathematik, Sekr. MA 7-2, Technische Universität Berlin, 10623 Berlin, Germany. E-mail: [email protected]
Received: 24 June 1999 / Accepted: 18 October 2001
Abstract: We define the cohomology of a tiling as the cocycle cohomology of its associated groupoid and consider this cohomology for the class of tilings which are obtained from a higher dimensional lattice by the canonical projection method in Schlottmann’s formulation. We prove the cohomology to be equivalent to a certain cohomology of the lattice. We discuss one of its qualitative features, namely that it provides a topological obstruction for a generic tiling to be substitutional. We develop and demonstrate techniques for the computation of cohomology for tilings of codimension smaller than or equal to 2, presenting explicit formulae. These in turn give computations for the K-theory of certain associated non-commutative C∗-algebras.
Introduction
Quasiperiodic tilings have become an active area of research in solid state physics due to their role in modeling quasicrystals [1–4], and the projection method in its various formulations [5–8] is one of the most common techniques to construct candidates for such tilings. This raises the question of characterization and even classification of such tilings. For that to be investigated one must first decide which properties of a tiling are essential for the physical properties of the solid. We take the point of view here that it is only the local structure of the tiling that matters, and even more, only its topological content, as captured, for example, by the continuous hull [22, 23] or the tiling groupoid [15, 10]. According to this point of view the tight binding model for particle motion in the tiling is not uniquely determined by the tiling but its form is constrained by the topology of the tiling, i.e. the Hamiltonian reflects the long range order of the tiling (though additional information is required to specify the interaction strengths, etc.). Our interest is thus in the topological invariants of tilings, in particular here with the cohomology and K-theory of the tiling groupoid.
Without additional mathematical structure of the tiling it is not clear how to obtain explicit results for its cohomology. Substitution tilings provide a class of tilings where such results can be obtained [9, 10] since they possess a symmetry which relates different scales. The present article is part of a programme to compute the tiling cohomology of projection tilings, those which may be obtained by projection from higher dimensional lattices. We consider here projection tilings defined by Laguerre complexes after Schlottmann [20]; see Definition 20 and the notation at the start of Sect. 3.1 for a precise description of the class of tilings considered. We present both qualitative and quantitative results.
Our qualitative results centre around giving sufficient conditions under which a rational version of the cohomology is infinitely generated. These conditions are in some sense almost always met and since the rational cohomology of substitution tilings is finitely generated we can conclude, Corollary 55, that canonical projection tilings are rarely substitutional. We cannot as yet offer an interpretation of the fact that some tilings produce finitely generated cohomology whereas others do not, but, if understood, it could well lead to a criterion to single out a subset of tilings relevant for quasicrystal physics from the vast set of tilings which may be obtained from the canonical projection method. In this context we point out that no canonical projection tiling is known to us which has infinitely generated cohomology but allows for local matching rules, cf. [11].
Our quantitative results are restricted to canonical projection tilings with small codimension (i.e. small difference between the rank of the projected lattice and the dimension of the tiling). We give closed formulæ, Theorems 63, 64 for the cohomology of such tilings in terms of the defining projection data. Formulæ for tilings of higher codimension can in principle be derived using more sophisticated tools from algebraic topology, along the lines of the methods employed at the end of [19].
As tilings obtained by the projection method belong to a large class of tilings whose cohomology is isomorphic to the (unordered) K-theory of the associated groupoid-C∗-algebra [12], we also have explicit calculations for the K-theory of these algebras, Corollary 66. This (non-commutative) aspect of the topology of tilings has a direct interpretation in physics. The C∗-algebra is the algebra of observables for particles moving in the tiling and its ordered K0-group (or its image on a tracial state) may serve to “count” (or label) the possible gaps in the spectrum of the Hamilton operator which describes its motion [13–15]. In this context it is even more challenging to find an interpretation of the generators of the K0-group when there are infinitely many. At first sight, all but finitely many of them appear to be infinitesimal.
This article has some parallels with the series [16–18] (see also [19]). Here however we study tilings as defined by Schlottmann’s variant of the projection method [20]; the calculations we present are consequently applicable to a wider class of tilings than those considered in [18] or at the end of [19].
The article is organized as follows. We describe the continuous dynamical system which can be assigned to any reasonable tiling in Sect. 1. Its associated transformation groupoid has orbits homeomorphic to the space in which the tiling is embedded. We derive the tiling groupoid as a reduction of this groupoid in Sect. 2; it is an r-discrete groupoid and we define tiling cohomology to be the cohomology of this groupoid. Again, this can be done for arbitrary tilings but one of the main features of the particular canonical projection tilings we consider, which make a computation of the cohomology feasible, is that one can find a Z^d Cantor dynamical system whose associated transformation groupoid is continuously similar to the tiling groupoid. This material is covered in Sect. 3 where we define precisely the class of tilings for which we obtain our results. This observation allows the tiling cohomology to be formulated in terms of group cohomology. In this part our work parallels that of Bellissard et al. [21] on the K-theoretic
level. After two illustrative examples in Sect. 4 we discuss our qualitative results in Sect. 5 and the quantitative results in Sect. 6. In Sect. 7 we present the connection with K-theory and the non-commutative topological approach.
1. Continuous Tiling Dynamical Systems
In this section we set up some preliminary notions and definitions with the main aim being to introduce and begin to describe the continuous hull MT , Definition 2, of a tiling T . In fact, this idea is not particular to the projection method tilings considered in the main work of this paper and in this section our definitions and results apply to a wide class of patterns. We specialise to the canonical projection tilings in Sect. 3.1 where we formally define this class.
In general, a d-dimensional tiling is a covering of R^d by closed subsets, called its tiles, which overlap at most at their boundaries and are usually subject to various other constraints, as for example being connected, uniformly bounded in size and the closures of their interiors; they may also be decorated. For this article though we shall assume that the tiles are (possibly decorated) polytopes with non-empty interiors and which touch face to face. Moreover, we require that the tilings are of finite type, see Definition 3.
Given a tiling T of R^d , then R^d acts naturally on it by translation. Denote the tiling translated by x as T − x. The closure of the orbit T − R^d of T with respect to an appropriate metric gives rise to a dynamical system [22] whose underlying space is the continuous hull of T . Thus our precise definition of the continuous hull will follow when we have chosen our metric. There are several proposals for the metric used which are all based on comparing patches around the origin of R^d . The basic idea is as follows. Represent a tiling T as a closed subset of R^d by the boundaries of its tiles and its decorations (if any) by small compact sets.
Let B_r be the open ball of radius r around 0 ∈ R^d and let B_r(T) := (B_r ∩ T) ∪ ∂B_r, a closed set. Two tilings, T and T′, should be close to each other if B_r(T) and B_r(T′) coincide, possibly up to a small discrepancy, for large r. The different ways to quantify the allowed discrepancy lead to the different spaces which may be found in the literature.
Definition 1. For tilings T and T′ as above, define metrics D0 and D by
\[ D_0(T, T') = \inf\Big\{ \tfrac{1}{r+1} \;\Big|\; B_r(T) = B_r(T') \Big\}, \]
\[ D(T, T') = \inf\Big\{ \tfrac{1}{r+1} \;\Big|\; d_r\big(B_r(T), B_r(T')\big) < \tfrac{1}{r} \Big\}, \]
where d_r is the Hausdorff metric defined among closed subsets of the closed r-ball.
The first metric, D0 , allows no discrepancy; the completion of the R^d orbit of T under this metric would be non-compact. However, completion with respect to the metric D yields a compact space under very general conditions [22, 23]. Note also that D is not invariant under the action of R^d by translation, but this action is nevertheless uniformly continuous and can thus be extended to the completion.
Definition 2. The continuous dynamical system associated to T is the pair (MT , R^d ), the closure MT of the orbit of T with respect to the metric D, and with the action of R^d induced by translation. Call MT the continuous hull of T .
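To make the metric D0 of Definition 1 concrete, here is a minimal one-dimensional sketch. It assumes, purely for illustration, that a tiling of the line is represented by the set of its tile-boundary points and is known only on a finite window; the function name `d0_estimate` and the `window` parameter are hypothetical and not part of the paper.

```python
def d0_estimate(boundaries_a, boundaries_b, window):
    """Estimate D0 for two 1D tilings known only on (-window, window).

    Each tiling is represented (an assumption of this sketch) by the set of
    its tile-boundary points. B_r(T) = B_r(T') holds exactly when the two
    point sets agree inside the open ball (-r, r).
    """
    a = {p for p in boundaries_a if abs(p) < window}
    b = {p for p in boundaries_b if abs(p) < window}
    mismatches = a ^ b  # boundary points present in exactly one tiling
    # Largest r with agreement on (-r, r): the distance of the nearest
    # mismatch, or the window radius if the samples agree everywhere.
    r = min((abs(p) for p in mismatches), default=window)
    return 1.0 / (r + 1.0)


periodic = range(-10, 11)                     # unit tiles on the line
defect = [p for p in periodic if p != 3]      # two tiles merged at x = 3
print(d0_estimate(periodic, defect, window=10))
```

When the two samples agree on the whole window, the returned value 1/(window + 1) is only an upper bound for D0, since nothing is known outside the window.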
Let M_r(T) be the subset of (whole) tiles of T contained in B_r. As for T , think of M_r(T) as the closed subset defined by the boundaries and decorations of its tiles.
Definition 3. A tiling T is called of finite type (or of finite pattern type, or of finite local complexity) if for all r the set of translational congruence classes of sets M_r(T − x), x ∈ R^d , is finite.
The elements of the space MT may again be interpreted as tilings. While we continue to write T for the original tiling, we write T′ for a general element of MT . If T is of finite type the elements T′ ∈ MT are those tilings in which each finite part can be identified with a finite part of a translate of T . Thus, for each T′ ∈ MT and for each r, there exists an x ∈ R^d such that B_r(T′) = B_r(T − x).
Definition 4. Two tilings T , T′ are called locally isomorphic if for every r there exist x, x′ ∈ R^d such that B_r(T) = B_r(T′ − x′) and B_r(T′) = B_r(T − x). If every element of MT is locally isomorphic to T then T is called minimal.
The tilings we are interested in here are all minimal. Note that a tiling being minimal directly implies that each orbit of the associated dynamical system is dense. Finally, we have a third option for a metric on the orbit of T , linking the spaces considered here with the work of [9]. The following metric defines the same topology as the metric considered there.
Definition 5. Define the metric D_t by
\[ D_t(T, T') := \inf\Big\{ \tfrac{1}{r+1} \;\Big|\; \exists\, x, x' \in B_{\frac{1}{2r}} : B_r(T - x) = B_r(T' - x') \Big\}. \]
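The finite type condition of Definition 3 can be illustrated on a one-dimensional example. The sketch below is not taken from the paper: it uses the Fibonacci substitution a → ab, b → a as a stand-in tiling with two tile types, and it counts patches of n consecutive tiles rather than the sets M_r(T − x), which is a simplifying assumption. The helper names are hypothetical.

```python
def fibonacci_word(iterations):
    """Iterate the substitution a -> ab, b -> a, starting from 'a'."""
    word = "a"
    for _ in range(iterations):
        word = "".join("ab" if c == "a" else "a" for c in word)
    return word


def patch_classes(word, n):
    """Distinct patches of n consecutive tiles, up to translation."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


word = fibonacci_word(15)  # a finite piece of the tiling (1597 tiles)
counts = [len(patch_classes(word, n)) for n in range(1, 9)]
print(counts)
```

For each patch size the number of classes is finite and small, which is the behaviour Definition 3 asks for; in this example there are n + 1 patches of n tiles.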
In this metric discrepancy is allowed only for small translations. As soon as two tilings differ by a rotation, however small, they will have a certain minimal non-zero distance. Thus closure with respect to D_t leads, for instance for the Pinwheel tilings [24], to a non-compact space, whereas closure with respect to D would still lead to a compact space. Which kind of metric is to be used has, of course, to be adapted to the problem, but for our purposes the following result shows that the distinction between D and D_t is inessential.
Theorem 6. Let T be a finite type tiling. Then MT is compact and equal to the completion of T − R^d with respect to D_t .
Proof. We start by showing that the two metrics D and D_t yield the same completion for finite type tilings. Clearly D(T , T′) ≤ D_t(T , T′) so we have to show that any D-Cauchy sequence is also a D_t-Cauchy sequence. Suppose that (T_i)_i is a D-Cauchy sequence converging to T′ ∈ MT . Then for any r, d_r(M_r(T_i), M_r(T′)) → 0 as i → ∞. As T is a finite type tiling, we can find for all i which are larger than some i_0 an ε_i such that M_r(T_i) = M_r(T′) − ε_i and ε_i → 0 as i → ∞. But then B_{r−c}(T_i) = B_{r−c}(T′ − ε_i), where c is an upper bound on the diameter of the tiles. Now choose i_r such that ε_{i_r} ≤ 1/r. Then, for any r, D_t(T′, T_{i_r}) ≤ 1/(r + 1). Thus a D-Cauchy sequence will also be a D_t-Cauchy sequence. In particular MT is equal to the completion of T − R^d with respect to D_t . Its compactness for finite type tilings is well known, see, for example, [23].
This result allows us to identify the open sets in MT .
Definition 7. Say that a finite subset P of tiles of a tiling T is a patch (or pattern, or cluster) of it and write P ⊂ T . Define U_P := {T′ ∈ MT | P ⊂ T′}, subsets of the continuous hull.
Theorem 8. The collection of sets {B_ε + x + U_P }, ε > 0, x ∈ R^d , P a patch of T , is a base for the topology of MT .
Proof. The previous result allows us to work with the metric D_t . Let r(ε) := (1−ε)/ε and V_r(T) = {T′ ∈ MT | B_r(T) = B_r(T′)}. Then we can describe the ε-neighbourhoods of T with respect to D_t as follows.
\[ \begin{aligned} D_t(T, T') < \varepsilon \quad &\text{iff} \quad \exists\, r > r(\varepsilon)\ \exists\, x, x' \in B_{\frac{1}{2r}} : B_r(T - x) = B_r(T' - x') \\ &\text{iff} \quad T' \in \bigcup_{r > r(\varepsilon)} \ \bigcup_{x \in B_{\frac{1}{2r}}} \Big( B_{\frac{1}{2r}} + V_r(T - x) \Big). \end{aligned} \tag{1} \]
The tiling being of finite type implies that, for every r > 0 and every T′ ∈ MT , there exists a finite set of pairs (x_i , P_i ), x_i ∈ R^d , P_i a patch of T , such that B_r(T′) = B_r(T″) whenever there is an i such that P_i + x_i is a patch of T″. In other words, V_r(T′) = ⋃_i U_{P_i + x_i}. This shows that (1) is a union of sets of the above collection.
To show that B_ε + U_P is open in the metric topology (which by continuity of the action implies that also B_ε + x + U_P is open for x ∈ R^d ) we take a point T′ in it and show that a whole neighbourhood (with respect to D_t ) of it lies in B_ε + U_P . Let R be large enough so that 1/R < ε and P is a patch of B_{R − 1/(2R)}(T′) (we view here P as a closed subset much like a tiling). Then, for all x ∈ B_{1/(2R)}, P ⊂ B_R(T′ − x) + x and hence V_R(T′ − x) ⊂ U_P − x. This implies that the 1/(R+1)-neighbourhood of T′ lies in B_ε + U_P .
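The threshold used at the start of the proof can be checked directly. Writing ε for the radius of the neighbourhood (our reading of the notation) and r(ε) for the threshold, the condition 1/(r+1) < ε rearranges as follows.

```latex
\[
  \frac{1}{r+1} < \varepsilon
  \iff r + 1 > \frac{1}{\varepsilon}
  \iff r > \frac{1-\varepsilon}{\varepsilon} = r(\varepsilon),
  \qquad \varepsilon > 0 .
\]
```

Since D_t is an infimum of values 1/(r+1), the inequality D_t(T, T′) < ε holds exactly when some admissible r exceeds r(ε), which is the first equivalence in (1).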
The following observation will be useful in Sect. 3.2.
Lemma 9. Let P be a patch in a finite type tiling T . Then U_P is compact.
Proof. If D(T′, T″) is small enough, and T′, T″ ∈ U_P , then it is equal to D0(T′, T″). That U_P is complete and precompact with respect to the D0-metric is proven in [15].
2. The Groupoid Approach to Tilings
To a given tiling one may associate an r-discrete groupoid called the tiling groupoid. This groupoid is special among other groupoids which may be assigned to the tiling in that its C∗-algebra plays the role of the algebra of observables for particles moving in the tiling [15, 10]. It also determines the tiling up to topological equivalence [25]. The K-theory of the C∗-algebra and the cohomology of the groupoid are, at least for canonical projection tilings, closely related, and may be considered as (non-commutative) invariants of the tiling. It is these invariants we discuss in this paper. We define the tiling groupoid in Sect. 2.2, but first we need to briefly recall some facts about groupoids.
2.1. Generalities. For a traditional definition of a topological groupoid, and as a general reference for most of the concepts introduced below like that of reduction, continuous similarity and continuous cocycle cohomology, we refer the reader to [26]. In a slightly different but equivalent way one may say that a groupoid G is a set with partially defined associative, cancellative multiplication and with unique inverses. Partially defined refers to the fact that multiplication is not defined for all elements, but only for a subset of G × G, the composable elements. An inverse of x is a solution y of the equations xyx = x and yxy = y, and for a groupoid this solution is required to be unique. Hence we may denote the inverse of x by x^{−1}. The inverse map x ↦ x^{−1} is an involution. Multiplication is cancellative if, provided it is defined, xy = xz implies y = z, and this is the case whenever the composable elements are the pairs (x, y) for which x^{−1}x = yy^{−1}. The set G^0 = {xx^{−1} | x ∈ G} is called the set of units; it is the image of the map r : G → G^0 given by r(x) = xx^{−1}, which is called the range map. The map s : G → G^0 given by s(x) = x^{−1}x = r(x^{−1}) is called the source map. Writing u ∼ v for u, v ∈ G^0 whenever r^{−1}(u) ∩ s^{−1}(v) ≠ ∅ defines an equivalence relation; its equivalence classes are called the orbits of G.
A topological groupoid is a groupoid with a topology with respect to which multiplication and inversion are continuous maps. Such a groupoid is called r-discrete if G^0 is an open subset. This condition implies that r^{−1}(u) is a discrete set for any unit u. A groupoid is called principal if its elements are uniquely determined by their range and source, i.e. if the map G → G^0 × G^0 given by x ↦ (r(x), s(x)) is injective.
2.1.1. Transformation groupoids. Let M be a topological space with a right action of a topological group G by homeomorphisms, denoted here (x, g) ↦ x · g.
The transformation groupoid (1) $\mathcal{G}(M, G)$ is the topological space $M \times G$ with the product topology; two elements $(x, g)$ and $(x', g')$ are composable provided that $x' = x \cdot g$, and their product is then $(x, g)(x', g') = (x, gg')$. Inversion is then given by $(x, g)^{-1} = (x \cdot g, g^{-1})$. Hence $r(x, g) = (x, 0)$, where $0$ denotes the identity element of $G$, and we see that $\mathcal{G}(M, G)$ is r-discrete if $G$ is discrete. Furthermore, $\mathcal{G}(M, G)$ is principal precisely when $G$ acts fixed point freely. One of the examples we have in mind here is $\mathcal{G}(M_T, \mathbb{R}^d)$ which, however, is not r-discrete.

2.1.2. Reductions.

Definition 10. Let $\mathcal{G}$ be a groupoid, $\mathcal{G}^0$ its unit space and $L$ a closed subset of $\mathcal{G}^0$. Then

$${}_L\mathcal{G}_L := s^{-1}(L) \cap r^{-1}(L)$$

is a closed subgroupoid of $\mathcal{G}$ called the reduction of $\mathcal{G}$ to $L$. Two further conditions on $L$ will play a major role here.

• A reduction is called regular if every orbit of $\mathcal{G}$ has a non-empty intersection with $L$.
• Say that $L$ is range-open [16] if the set $r(s^{-1}(L) \cap U)$ is open whenever $U \subset \mathcal{G}$ is open.

A regular reduction of a groupoid $\mathcal{G}$ to a range-open subset $L$ is for many purposes as good as the groupoid itself. Muhly et al. have established a notion of equivalence between groupoids which captures this phenomenon in greater generality [27]. We will not discuss this notion of equivalence here; we merely record its main consequence of interest to us: The K-groups of the $C^*$-algebras associated to a groupoid $\mathcal{G}$ and its reduction ${}_L\mathcal{G}_L$ to a range-open subset $L$ which intersects each orbit are isomorphic as ordered groups.

(1) Or transformation group, as in [26].
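The rules for the transformation groupoid of Sect. 2.1.1 can be checked on a toy example. The following is only a minimal sketch under stated assumptions: the group is taken to be a finite cyclic group Z/n written additively, the space is a finite set, and the class name `TransformationGroupoid` and its methods are hypothetical names introduced here for illustration. It implements composability, product, inverse, range and source as described in the text and tests principality by brute force.

```python
from itertools import product


class TransformationGroupoid:
    """Finite model of G(M, G) for the cyclic group Z/n acting on a finite set M."""

    def __init__(self, points, n, act):
        self.points = list(points)
        self.n = n
        self.act = act  # act(x, g) = x . g, assumed to be a right action of Z/n
        self.elements = list(product(self.points, range(n)))

    def r(self, a):
        # range map: r(x, g) = (x, 0)
        return (a[0], 0)

    def s(self, a):
        # source map: s(x, g) = (x . g, 0)
        return (self.act(a[0], a[1]), 0)

    def composable(self, a, b):
        # (x, g) and (x', g') are composable when x' = x . g
        return b[0] == self.act(a[0], a[1])

    def mul(self, a, b):
        if not self.composable(a, b):
            raise ValueError("elements are not composable")
        return (a[0], (a[1] + b[1]) % self.n)

    def inv(self, a):
        # (x, g)^{-1} = (x . g, g^{-1})
        return (self.act(a[0], a[1]), (-a[1]) % self.n)

    def is_principal(self):
        # principal: the map x -> (r(x), s(x)) is injective
        pairs = {(self.r(a), self.s(a)) for a in self.elements}
        return len(pairs) == len(self.elements)


# A free action of Z/6 on itself, and a non-free action of Z/6 on a 3-point set.
free = TransformationGroupoid(range(6), 6, lambda x, g: (x + g) % 6)
nonfree = TransformationGroupoid(range(3), 6, lambda x, g: (x + g) % 3)
```

In this sketch `free.is_principal()` holds while `nonfree.is_principal()` fails, in line with the statement that the transformation groupoid is principal precisely when the action is fixed point free.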
Cohomology of Canonical Projection Tilings
2.1.3. Continuous similarity. As just noted, the concept of reduction is particularly well adapted to yield an equivalence relation on groupoids which carries over to an equivalence relation on the $C^*$-algebras they define. It turns out that for canonical projection tilings the K-groups of the $C^*$-algebras are related to the cohomology of the groupoids, as discussed further in Sect. 7, but this relation is not clear on the level of arbitrary tiling groupoids. On the other hand there is a natural equivalence relation on groupoids, that of continuous similarity, which immediately gives rise to an equality on cohomology as well as implying equivalence in the sense of Muhly et al. [28].

Definition 11. Two homomorphisms $\varphi$ and $\psi : \mathcal{G} \to R$ between (topological) groupoids are (continuously) similar if there exists a function $\Theta : \mathcal{G}^0 \to R$ such that

$$\Theta(r(x))\,\varphi(x) = \psi(x)\,\Theta(s(x)). \qquad (2)$$

Two (topological) groupoids, $\mathcal{G}$ and $R$, are called (continuously) similar if there exist homomorphisms $\varphi : \mathcal{G} \to R$, $\varphi' : R \to \mathcal{G}$ such that $\Phi_{\mathcal{G}} = \varphi' \circ \varphi$ is (continuously) similar to $\mathrm{id}_{\mathcal{G}}$ and $\Phi_R = \varphi \circ \varphi'$ is (continuously) similar to $\mathrm{id}_R$.

We are mainly interested in establishing continuous similarity of certain principal transformation groupoids. A useful lemma to test this is proved in [17, (3.3, 3.4)].

Proposition 12. Let $\mathcal{G} = \mathcal{G}(X, G)$ be a principal transformation groupoid (so $G$ acts freely on $X$) and $L$ and $L'$ closed subsets of $X \cong \mathcal{G}^0$. Suppose that $\gamma : L \to G$ and $\gamma' : L' \to G$ are two continuous functions which define continuous functions $L \to L' : x \mapsto x \cdot \gamma(x)$ and $L' \to L : x \mapsto x \cdot \gamma'(x)$. Then the reductions of $\mathcal{G}$ to $L$ and to $L'$ are continuously similar.

Remark 13. If $L = X$ then one can take $\gamma'(x)$ to be the identity in the group for all $x \in L'$ and the condition comes down to finding a continuous function $\gamma : X \to G$ such that $x \cdot \gamma(x) \in L'$ for all $x \in L$.

2.1.4. Continuous cocycle cohomology. Given a dynamical system $(M, G)$ with discrete group $G$ one standard topological invariant associated with it is the cohomology of $G$ with coefficients in the $G$-module $C(M, \mathbb{Z})$ of integer-valued continuous functions with $G$ action given by $(g \cdot f)(m) = f(m \cdot g)$. This cohomology may be interpreted as a groupoid cohomology of the groupoid $\mathcal{G}(M, G)$. This is the continuous cocycle cohomology for r-discrete groupoids and we will recall its definition here for constant coefficients following [26].

Let $A$ be an abelian group and $\mathcal{G}$ be a groupoid. Then $\mathcal{G}$ acts on the trivial $A$-bundle $\mathcal{G}^0 \times A \xrightarrow{\rho} \mathcal{G}^0$ (with product topology) partially, namely $x \in \mathcal{G}$ can act on the element $(s(x), a)$ mapping it to $(r(x), a)$. We denote this action by $\Phi$, writing the partial map given by $x \in \mathcal{G}$ as $\Phi(x)$. The action is continuous in the sense that when $f \in C(\mathcal{G}^0, A)$ is a continuous section of the bundle then the function $x \mapsto (r(x), f(s(x)))$ is continuous too.
Let $\mathcal{G}^{(0)} = \mathcal{G}^0$ and, for $n > 0$, let $\mathcal{G}^{(n)}$ be the subset of the $n$-fold Cartesian product of $\mathcal{G}$ (with relative topology) consisting of composable elements $(x_1, \dots, x_n)$, that is, with $r(x_i) = s(x_{i-1})$. The $n$-cochains are the continuous functions $f : \mathcal{G}^{(n)} \to \mathcal{G}^0 \times A$ such that $\rho(f(x_1, \dots, x_n)) = r(x_1)$ and, for $n > 0$, $f(x_1, \dots, x_n) = (r(x_1), 0)$ provided one of the $x_i$ is a unit. The $n$-cochains form an abelian group under pointwise addition. The coboundary operator $\delta^n$ is defined as

$$\delta^0(f)(x) = \Phi(x) f(s(x)) - f(r(x)),$$

and, for $n > 0$,

$$\delta^n(f)(x_0, \dots, x_n) = \Phi(x_0)\big(f(x_1, \dots, x_n)\big) + \sum_{i=1}^{n} (-1)^i f(x_0, \dots, x_{i-1}x_i, \dots, x_n) + (-1)^{n+1} f(x_0, \dots, x_{n-1}).$$

Then $H^n(\mathcal{G}, A)$, the continuous cocycle cohomology of $\mathcal{G}$ in dimension $n$ with (constant) coefficients $A$, is defined as $\ker \delta^n / \operatorname{im} \delta^{n-1}$. The following result is proved in [26].

Theorem 14. Continuously similar groupoids have isomorphic cohomology with constant coefficients.

Let us consider a transformation groupoid $\mathcal{G}(M, G)$ as an example ($G$ discrete). In that case the $n$-cochains are maps $f : M \times G^n \to M \times A$ of the form $f(m, g_1, \dots, g_n) = (m, \tilde f(g_1, \dots, g_n)(m))$, where $\tilde f : G^n \to C(M, A)$ is a continuous map which, for $n > 0$, is the zero map when applied to $(g_1, \dots, g_n)$ with any one $g_i = e$, the identity element in $G$. These are precisely the $n$-cochains of the group $G$ with coefficients in $C(M, A)$, a $G$-module with respect to the action $(g \cdot f)(m) = f(m \cdot g)$ [29]. Hence every $n$-cochain of the groupoid with coefficients in $A$ determines an $n$-cochain of the group $G$ with coefficients in $C(M, A)$, and vice versa. Moreover, under this identification $\delta^n$ becomes the usual coboundary operator in group cohomology, since the groupoid action is nothing other than the shift of base point given by the action of $G$.

Corollary 15. There is a natural isomorphism between the continuous cocycle cohomology of the transformation groupoid $\mathcal{G}(M, G)$ with constant coefficients $A$ and the group cohomology of $G$ with coefficients in $C(M, A)$,

$$H^n(\mathcal{G}(M, G), A) \cong H^n(G, C(M, A)).$$

In the main results of this paper we shall be interested in the cases $A = \mathbb{Z}$ and $A = \mathbb{Q}$.

2.2. The tiling groupoid. The tiling groupoid may be defined without referring to continuous tiling dynamical systems, as for example in [15, 10], but for the purpose of the present work it is important to draw the connection [13, 9]. Starting with the groupoid of the continuous tiling dynamical system $\mathcal{G}(M_T, \mathbb{R}^d)$ we construct the tiling groupoid as a reduction of it. We first construct a closed, range-open subset $\Omega_T$ of $M_T$.
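Returning briefly to the coboundary operator of Sect. 2.1.4: under the identification with group cochains, the formula can be evaluated mechanically for a finite example. The sketch below is illustrative only and rests on assumptions not made in the text: the group is the cyclic group Z/4 acting on itself by translation, coefficients are integers, and the helper names `act`, `coboundary`, `random_cochain` and `is_zero` are hypothetical. A cochain is represented as a function of a tuple of group elements and a base point m. The normalisation condition (vanishing on units) is not enforced, since the identity "coboundary of a coboundary is zero" does not depend on it.

```python
import random
from itertools import product

N = 4                      # G = Z/N, a hypothetical small example
M = range(N)               # the set M, here Z/N itself


def act(m, g):
    """Right action m . g of G on M (translation)."""
    return (m + g) % N


def coboundary(f, n):
    """Return the (n+1)-cochain delta^n f for an n-cochain f(gs, m)."""
    def df(gs, m):
        # first term: the action of g_0 shifts the base point
        total = f(gs[1:], act(m, gs[0]))
        # middle terms: multiply neighbouring group elements
        for i in range(1, n + 1):
            merged = gs[:i - 1] + ((gs[i - 1] + gs[i]) % N,) + gs[i + 1:]
            total += (-1) ** i * f(merged, m)
        # last term: drop the final group element
        total += (-1) ** (n + 1) * f(gs[:-1], m)
        return total
    return df


def random_cochain(n, seed=0):
    """An arbitrary integer-valued n-cochain, stored as a lookup table."""
    rng = random.Random(seed)
    table = {(gs, m): rng.randint(-5, 5)
             for gs in product(range(N), repeat=n) for m in M}
    return lambda gs, m: table[(gs, m)]


def is_zero(f, n):
    """Check that the n-cochain f vanishes on all arguments."""
    return all(f(gs, m) == 0
               for gs in product(range(N), repeat=n) for m in M)
```

For instance, applying `coboundary` twice to `random_cochain(1)` gives a 3-cochain on which `is_zero(..., 3)` holds, and a constant function of m is a 0-cocycle, as expected for an invariant function.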
Choose a point in the interior of each tile of $T$, called its puncture, in such a way that translationally congruent tiles have their puncture at the same position. Let $\Omega_T$ be the subset of tilings of $M_T$ for which a puncture of one of its tiles coincides with the origin $0 \in \mathbb{R}^d$. Note that $\Omega_T$ intersects each orbit of $\mathbb{R}^d$.

Definition 16. The tiling groupoid of $T$, denoted by $\mathcal{G}_T$, is the reduction of $\mathcal{G}(M_T, \mathbb{R}^d)$ to $\Omega_T$.

Note that, by construction, $\mathcal{G}_T$ is r-discrete.

Proposition 17. Suppose $\Omega_T$ contains only non-periodic, finite type tilings. Then $\Omega_T$ is closed and range-open and $\mathcal{G}_T$ coincides with the groupoid $R$ defined in [15].
Proof. We refer to [10] for the groupoid $R$ and its properties. Under the hypothesis $\mathbb{R}^d$ acts fixed point freely on $M_T$ and hence $\mathcal{G}_T$ is principal. Therefore the map between $\mathcal{G}_T$ and $R$ given by $(T', x) \mapsto (T', T' - x)$, which certainly preserves multiplication and inversion, is an isomorphism provided it preserves the topology. The tiling being of finite type implies that punctures of two different tiles have a minimal distance, $\delta$ say. Thus there exists an $\epsilon$ (which is roughly as large as $\delta$) such that if $D(T - x, T - x') < \epsilon$ and $T - x, T - x' \in \Omega_T$, then $D(T - x, T - x') = D_0(T - x, T - x')$. It follows that $\Omega_T$ is the metric completion with respect to $D_0$ of the set of all $T' \in \Omega_T$ which are translates of $T$. In particular, it is closed, and the existence of a minimal distance $\delta$ between punctures directly implies range-openness, cf. [16]. Furthermore, the metric $D_0$ and the metric used in [15] to define the hull lead to the same completions. This shows that the above map $(T', x) \mapsto (T', T' - x)$ restricts to a homeomorphism of the spaces of units of $\mathcal{G}_T$ and of $R$. As noted, $\mathcal{G}_T$ is r-discrete and its topology is generated by the sets $U \times \{x\}$, $U$ open in $\Omega_T$. Images of those sets under the above map generate the topology of $R$.

We conclude this section with our basic definition of the cohomology of a tiling.

Definition 18. The cohomology of the tiling $T$, denoted by $H^*(T)$, is the continuous cocycle cohomology $H^*(\mathcal{G}_T, \mathbb{Z})$ of $\mathcal{G}_T$ with constant coefficients $\mathbb{Z}$.

We shall see later on that for canonical projection tilings, $H^*(\mathcal{G}_T, \mathbb{Z})$ is isomorphic to the Čech cohomology of $M_T$. It seems to be an interesting question whether this is true in general.

3. Quasiperiodic Tilings Obtained by Cut and Projection

The projection method (or cut and projection method) is a well known way of producing quasiperiodic point sets or tilings by projection of a certain subset of a periodic set in a higher dimensional space.
In earlier versions, for example [5], the favorite set was the integer lattice $\mathbb{Z}^N$, but a price has to be paid for the simplicity of this choice if the kernel of the projection contains non-zero lattice points. An elegant way around this difficulty, which is applicable to almost all interesting examples, is to use root lattices instead of $\mathbb{Z}^N$ [30], and the construction we use here is related to that. However, rather than looking at arbitrary point sets obtained by the projection method (for example with fractal acceptance domain), we want to focus in this article on tilings where the acceptance domain is canonical; after all, these include the main candidates for the description of quasicrystals. For these tilings there is another approach which is a bit more elaborate to start with but easier to handle when it comes to the later steps in the construction of the cohomology groups. The approach we are about to describe is based on polyhedral complexes and their dualization; it is therefore sometimes called the dualization method, but in the present context, where we start with a higher-dimensional periodic set, it can simply be considered as a variant of the projection method such as used in [16, 17]. We follow its description as in the article by Schlottmann [20] and refer the reader also to the examples discussed in [31].

The organisation of this section is as follows. We formally define the construction in Sect. 3.1 and discuss some basic properties and examples. The remaining subsections form a sequence of descriptions of the associated hull for such tilings; the final description is the one which allows us to describe the tiling cohomology in the remainder of the paper.
3.1. Projection tilings after Schlottmann. We must first recall and set up some notation to discuss Laguerre complexes. Consider a point set $W$ of a euclidean space E together with a weight function $w : W \to \mathbb{R}$ on it; write $\Theta$ for the pair $(W, w)$. For $q \in W$, the set

$$L_\Theta(q) := \{x \in E \mid \forall q' \in W : |x - q|^2 - w(q) \le |x - q'|^2 - w(q')\} \qquad (3)$$

is called the Laguerre domain of $q$. It is convex and under rather weak conditions [20] on $\Theta$ all Laguerre domains are actually compact polytopes (of dimension smaller than or equal to that of E, or even empty sets) and the set of all Laguerre domains with nonempty interior provides the tiles of a tiling $T_\Theta$ which is of finite type and face to face. Laguerre domains generalise the notion of Voronoi domains and specialise to them when the weight function is constant. The concept of Voronoi domains is a familiar one in solid state physics where they arise (under the name Brillouin zone or Wigner-Seitz cell) if one takes as $W$ the dual of the crystal lattice. A non-constant weight function gives the means to enlarge certain Laguerre domains at the cost of others or even to suppress some altogether.

The faces of the Laguerre domains define a cell complex structure: this is the so-called Laguerre complex. We denote it by $L_\Theta$ and the (closed) cells of dimension $k$ by $L_\Theta^{(k)}$. The data $\Theta$ specify another complex which is dual to $L_\Theta$: the dual $\xi^*$ of a $k$-cell $\xi$ is the convex hull of the set of $q \in W$ whose corresponding Laguerre domains contain $\xi$ as a face. Note that $\xi^*$ depends on $\xi$ and $\Theta$ and not only on $\xi$ and $L_\Theta$. It has codimension $k$. This dual complex is again a Laguerre complex, denoted $L_{\Theta^*}$ for $\Theta^* = (W^*, w^*)$, where $W^*$ is the set of vertices (0-cells) of $L_\Theta$ and $w^* : W^* \to \mathbb{R}$ is given by $w^*(q^*) = |q^* - q|^2 - w(q)$ for some $q$ such that $q^*$ is a vertex of $L_\Theta(q)$. In particular, $\Theta^*$ also defines a tiling with the above properties.

We can now describe the projection method construction we shall study. Let $\Gamma \subset$ E be a lattice whose generators form a base for E, let $W$ be a finite union of $\Gamma$-orbits of points in E, and let $w : W \to \mathbb{R}$ be a $\Gamma$-periodic function. Now let E ⊂ E be a linear affine subspace and let π : E → E be the orthogonal projection. Write $d$ for the dimension of E, $d^\perp$ for that of its orthocomplement $E^\perp$, and $\pi^\perp$ for $1 - \pi$.
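The defining inequality (3) of a Laguerre domain is easy to test numerically. The following sketch is an illustration only, under assumptions not taken from the text: the point set is a finite collection of points on the real line, and the names `power_distance`, `in_laguerre_domain`, `voronoi` and `suppressed` are hypothetical. It shows the two facts stated above: a constant weight gives Voronoi domains, and a sufficiently unfavourable weight suppresses a domain altogether.

```python
def power_distance(x, q, w_q):
    """|x - q|^2 - w(q), the quantity compared in the inequality (3)."""
    return sum((a - b) ** 2 for a, b in zip(x, q)) - w_q


def in_laguerre_domain(x, q, weights, tol=1e-12):
    """True if x lies in the Laguerre domain of q.

    `weights` maps each point of W (a tuple of coordinates) to its weight w(q).
    """
    own = power_distance(x, q, weights[q])
    return all(own <= power_distance(x, p, w_p) + tol
               for p, w_p in weights.items())


# Three collinear sites; a constant weight gives ordinary Voronoi cells.
voronoi = {(0.0,): 0.0, (1.0,): 0.0, (2.0,): 0.0}
# Lowering the middle weight suppresses the middle domain altogether.
suppressed = {(0.0,): 0.0, (1.0,): -2.0, (2.0,): 0.0}
# Sample points on the interval [-1, 3].
grid = [(k / 100.0,) for k in range(-100, 301)]
```

With constant weights the domain of the site 1 is the interval [0.5, 1.5], so `in_laguerre_domain((1.4,), (1.0,), voronoi)` holds and the same call with `(1.6,)` fails. With the weights in `suppressed` the inequalities for the middle site require both x ≥ 1.5 and x ≤ 0.5, so its domain is empty.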
We shall also write $x^\perp$ as shorthand for $\pi^\perp(x)$. An element $u \in$ E is called singular if there is a $\beta \in L_\Theta^{(d^\perp - 1)}$ such that $\pi^\perp(u) \in \pi^\perp(\beta)$. Hence the set of singular points is $S = S^\perp + E$ where

$$S^\perp := \bigcup_{\beta \in L_\Theta^{(d^\perp - 1)}} \pi^\perp(\beta).$$

The set of non-singular points is denoted by $NS$. We can write it as

$$NS = E + \bigcap_{\beta \in L_\Theta^{(d^\perp - 1)}} E^\perp \setminus \beta^\perp,$$

which shows that it is a $G_\delta$ set (a countable intersection of open sets). Since $\beta^\perp$ has codimension 1 in $E^\perp$, $NS$ is dense. It is convenient to write $W_u = W + u$ and $w_u(q + u) = w(q)$ and define $\Theta_u$ as $(W_u, w_u)$.

Definition 19. For data $W$, $w$ and $E$ as above, each $u \in NS$ defines a tiling $T_u$ whose tiles are the elements of the set

$$\{\pi(\xi^*) \mid \xi \in L_{\Theta_u}^{(d^\perp)},\ \xi \cap E \neq \emptyset\}.$$

The dimension of $E^\perp$ is called the codimension of $T_u$.
That this is a tiling by Laguerre domains has been shown by Schlottmann [20]. In fact, $T_u$ is the tiling $T_{(\tilde W_u^*, \tilde w_u^*)}$ defined by the Laguerre complex dual to $L_{(\tilde W_u, \tilde w_u)}$, where $\tilde W_u = \pi(W_u)$ and $\tilde w_u(\pi(q + u)) = \max\{w(q') - |\pi^\perp(q' + u)|^2 \mid \pi(q') = \pi(q)\}$ (assuming it exists). Using this description one can see that one loses no generality in restricting to the cases in which $\pi^\perp(\Gamma)$ is dense in $E^\perp$ [20].

Definition 20. A canonical projection tiling is a tiling $T_u$ associated to data $W$, $w$, $E$ and $u$ as before that also satisfies the conditions
(a) that $\pi^\perp(\Gamma)$ lies dense in $E^\perp$;
(b) that $E \cap \Gamma = \{0\}$;
(c) that up to translation, any $\xi^* \in L_{\Theta^*}^{(d)}$ is uniquely determined by its projection $\pi(\xi^*)$;
(d) that for $\xi^*, \eta^* \in L_{\Theta^*}^{(d)}$, $\xi^* = \eta^* + x$ implies $x \in \Gamma$;
(e) that for all $\beta \in L_\Theta^{(d^\perp - 1)}$, the (affine) hyperplane $H_\beta$ which is tangent to $\beta^\perp$ is a subset of $S^\perp$.
Remark 21. Conditions (b), (c), (d) in this definition are not strictly necessary but will considerably simplify the exposition. (b) implies that the tilings are completely non-periodic. (c) and (d) can be made obsolete with the help of decorations, see Sect. 3.2.1. Condition (e) will not be relevant until Subsect. 3.3 and we shall ignore it for the remainder of this and the next subsection.

Example 22. Consider the example $W = \mathbb{Z}^N$, the integer lattice in $\mathbb{R}^N$, with standard basis $\{e_i, i = 1, \dots, N\}$ and vanishing weight function $w$. In this highly symmetric case, the dual complex to $L_{(\mathbb{Z}^N, w)}$ differs only by a shift by $\delta = \frac{1}{2} \sum_i e_i$ from the original one. Writing $\gamma = \{\sum_{i=1}^{N} c_i e_i \mid 0 \le c_i \le 1\}$ for the unit cube, its translates by $\delta + z$, $z \in \mathbb{Z}^N$, are its Laguerre domains and it is not difficult to see that, when $E$ is chosen such that $E \cap \mathbb{Z}^N = \{0\}$, the vertices of the tiling $T_u$ defined in Definition 19 are the points

$$\{\pi(z) \mid z \in (\mathbb{Z}^N + u + \delta) \cap (E + \gamma)\}. \qquad (4)$$

This set we referred to in [16] as the canonical projection pattern defined by the data $(\mathbb{Z}^N, E, u')$ with $u' = u + \delta$. $\pi^\perp(\mathbb{Z}^N)$ lies dense in $E^\perp$ if and only if $E^\perp \cap \mathbb{Z}^N = \{0\}$. In this case one sees quickly that all further conditions of Definition 20 are met. But $E^\perp \cap \mathbb{Z}^N$ is not always trivial; important examples where it is non-trivial are the Penrose tilings. This is the reason why we consider the apparently more elaborate construction with Laguerre complexes. It allows us to focus our attention on input data which satisfy (a) of Definition 20. Let $D$ be the real span of $E^\perp \cap \mathbb{Z}^N$ (assuming it is not trivial) and let $V$ be the orthocomplement of $D$ in $E^\perp$. Following [20] we factor the projection $\pi : \mathbb{R}^N \to E$ as $\pi = \pi_2 \circ \pi_1$, where $\pi_1 : \mathbb{R}^N \to E \oplus V$ is the orthogonal projection with kernel $D$ and $\pi_2 : E \oplus V \to E$ has kernel $V$. We may then perform the construction of the projection method in two steps. First we produce the (periodic) tiling defined by the data $W = \mathbb{Z}^N$, $w = 0$, the subspace $E \oplus V$ and non-singular point $u$, using the projection $\pi_1$. As noted, this tiling can be understood as a Laguerre complex, namely the one defined by the lattice $\pi_1(\mathbb{Z}^N)$ and weight function $w$ given by $w(\pi_1(z)) = \max\{w(z') - |\pi_1^\perp(z' + u)|^2 \mid \pi_1(z') = \pi_1(z)\}$. In the second step we now use this new Laguerre complex and
the projection $\pi_2$: to be precise, we use the data $\pi_1(\mathbb{Z}^N)$, $w$, $E$, $\pi_1(u)$. Note that $w$ remains zero after the first step if $\pi_1^\perp(u) \in \mathbb{Z}^N$, but, if $\pi_1^\perp(u) \notin \mathbb{Z}^N$, we have to expect that the maximal periodicity lattice of the Laguerre complex defined by $(\pi_1(\mathbb{Z}^N), w)$ is a sublattice of $\pi_1(\mathbb{Z}^N)$ containing the lattice $\mathbb{Z}^N \cap (E \oplus V)$. To summarize, even if $E^\perp \cap \mathbb{Z}^N \neq \{0\}$ we may construct tilings whose vertices are the points of (4) by Schlottmann's method from data which satisfy conditions (a) and (b) of Definition 20. The further conditions, in particular (e), have to be carefully verified.

The most famous class of tilings which may be constructed by the above method is that of the Penrose tilings. Here $N = 5$, $E$ is a two dimensional invariant subspace of the rotation $e_i \mapsto e_{i+1}$ ($i$ mod 5) and $D$ is the span of $\delta$. If $\pi_1^\perp(u) = -\delta$ then the new Laguerre complex $L_{(\pi_1(\mathbb{Z}^5), w)}$ becomes the dual of the Voronoi complex (i.e. the Delaunay complex) of the root lattice $A_4$ [30]. The resulting tilings are the usual Penrose tilings. Other choices for $\pi_1^\perp(u)$ lead to the so-called generalized Penrose tilings.

We conclude this section by establishing some important properties of canonical projection tilings which will be of use later. First, for non-singular $u$ and $v$, $T_u$ is locally isomorphic to $T_v$ and to any other element of its hull [20]; in fact, $M_{T_u} = M_{T_v}$ and the dynamical system $(M_{T_u}, E)$ is minimal (any orbit lies dense). We may therefore drop the index $u$ to write $M_T$ for the continuous hull. Given $u \in$ E (not necessarily non-singular) we define

$$\tilde P_u := \{\xi \in L_{\Theta_u}^{(d^\perp)} \mid 0 \in \pi^\perp(\xi)\}.$$

Lemma 23. Let $\xi \in \tilde P_u$, $u \in NS$ and $P = \pi(\xi^*)$.
1. If $s \in -\xi^\perp + \Gamma$ such that $u + s \in NS$ then $P$ is a tile of $T_{u+s}$.
2. If $s \in E + \Gamma$ then the converse holds: $P$ being a tile of $T_{u+s}$ implies $s \in -\xi^\perp + \Gamma$.

Proof. First, let $s \in -\xi^\perp$ such that $u + s \in NS$ and $\xi \in \tilde P_u$. Then $\xi + s \in L_{\Theta_{u+s}}^{(d^\perp)}$ and $0 \in \xi^\perp + s$. Hence $\xi + s \in \tilde P_{u+s}$ so that the dual of $\xi + s$ with respect to the data $\Theta_{u+s}$ projects (under $\pi$) onto a tile of $T_{u+s}$. This dual is $\xi^* + s$ (where $\xi^*$ is the dual of $\xi$ with respect to $\Theta_u$) and hence projects onto $P$. For the second statement split the given $s = s' + \gamma$ with $s' \in E$, $\gamma \in \Gamma$. Then $\pi(\xi^*) \in T_{u+s}$ whenever $\pi(\xi^*) - s' \in T_u$. Hence there is an $\eta \in \tilde P_u$ such that $\pi(\xi^*) - s' = \pi(\eta^*)$. By condition (c) this implies $\exists v \in E^\perp : \xi^* + v - s' = \eta^*$. By condition (d) we must have $v - s' \in \Gamma$. But then $\xi + v - s' = \eta \in \tilde P_u$. The latter implies $v \in -\xi^\perp$. The statement follows since $v + \Gamma = s' + \Gamma = s + \Gamma$.
Lemma 24. $u \in NS$ whenever $\forall \xi \in \tilde P_u : 0 \in \operatorname{Int} \xi^\perp$.

Proof. $u$ is singular whenever there is a $\xi \in L_{\Theta_u}^{(d^\perp)}$ such that $0 \in \partial \xi^\perp$. This $\xi$ then belongs to $\tilde P_u$.

For regular $u$ and a patch $P$ of $T_u$ let

$$A_u(P) = \bigcap_{\xi \in \tilde P_u \mid \pi(\xi^*) \in P} -\xi^\perp.$$

For technical reasons we set $A_u(\emptyset) = E^\perp$. $A_u(P)$ is called the acceptance domain for $P$, for reasons which become clear in Corollary 26.
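To see the vertex set (4) of Example 22 and the role of the acceptance strip in the simplest possible setting, one can take N = 2 with E a line of irrational slope. This toy case (the Fibonacci chain) is not discussed in the text and is used here only as an illustration; the slope, the shift vector and the names `projection_pattern`, `E_DIR`, `E_PERP`, `LONG`, `SHORT` are assumptions of the sketch. A lattice point is accepted when its orthogonal component lies in the projection of the unit square, and the accepted points are then projected onto E.

```python
import math

PHI = (1 + math.sqrt(5)) / 2
NORM = math.sqrt(1 + PHI ** 2)
E_DIR = (PHI / NORM, 1 / NORM)      # unit vector spanning E
E_PERP = (-1 / NORM, PHI / NORM)    # unit vector spanning the orthocomplement


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def projection_pattern(shift, n_max=40, radius=15.0):
    """Points pi(z) for z in (Z^2 + shift) lying in the strip E + [0,1]^2."""
    lo, hi = -1 / NORM, PHI / NORM   # image of the unit square under pi-perp
    points = []
    for z1 in range(-n_max, n_max + 1):
        for z2 in range(-n_max, n_max + 1):
            p = (z1 + shift[0], z2 + shift[1])
            if lo < dot(p, E_PERP) < hi:      # acceptance test in E-perp
                t = dot(p, E_DIR)             # coordinate of pi(p) along E
                if abs(t) <= radius:          # keep a window well inside the box
                    points.append(t)
    return sorted(points)


LONG, SHORT = PHI / NORM, 1 / NORM   # projected lengths of e_1 and e_2
pattern = projection_pattern(shift=(0.1234, 0.4321))
gaps = [b - a for a, b in zip(pattern, pattern[1:])]
```

In this sketch every gap between consecutive points equals either `LONG` or `SHORT` up to rounding, so the pattern is a tiling of the line by two intervals whose lengths have ratio equal to the golden mean. Which of the two tiles follows a given vertex is decided by where the corresponding lattice point sits inside the strip, which is the one-dimensional picture behind the acceptance domains introduced above.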
Lemma 25. With the notation above
1. For all $u \in NS$ and all $r > 0$ there is a $\delta > 0$ such that $t \in E^\perp$, $u + t \in NS$, $|t| < \delta$ implies $M_r(T_u) = M_r(T_{u+t})$.
2. For all $u \in NS$ and all $\epsilon > 0$ there is a $\delta > 0$ such that $|u - v| < \delta$, $v \in NS$, implies $D(T_u, T_v) < \epsilon$.

Proof. If $r$ is large enough $A_u(M_r(T_u))$ is a finite intersection of convex polytopes. Since $u$ is regular, 0 is an interior point of these polytopes and hence $A_u(M_r(T_u))$ contains an open $\delta$-neighbourhood of $0 \in E^\perp$. By Lemma 23.1 $|t| < \delta$ implies that $M_r(T_u) \subset T_{u+t}$. Hence $M_r(T_{u+t}) = M_r(T_u)$ which proves the first statement. As for the second, given u and let r > 1 − c, where c − 1 is an upper bound for the diameter of the tiles. The first statement of the lemma ensures the existence of a $\delta'$ such that $t \in E^\perp$, $u + t \in NS$, $|t| < \delta'$ implies $D(T_u, T_{u+t}) < \epsilon$. Hence if $|u - v| < \delta'$, $v \in NS$, then $D(T_u, T_v) \le D(T_u, T_{u + \pi^\perp(v - u)}) + D(T_{u + \pi^\perp(v - u)}, T_v) < \epsilon + \delta'$. Taking $\delta = \min\{\delta', \frac{\epsilon}{2}\}$ then implies $D(T_v, T_w) < \epsilon$.
Corollary 26. Let $P$ be a patch of $T_u$, $u \in NS$. Then $P \subset T_v$, for $v \in NS$, whenever $v - u \in A_u(P) + \Gamma$.

Proof. First let $P = \pi(\xi^*)$, $\xi \in \tilde P_u$. Then we only have to improve the second part of Lemma 23. Let $r > 0$ such that $P \subset B_r$. Then we find from Lemma 25.1 a $\delta$ (depending on $v$) such that $t \in E^\perp$, $u + t \in NS$, $|t| < \delta$ implies $M_r(T_v) = M_r(T_{v+t})$. Since $E + \Gamma$ lies dense we can find arbitrarily small $t \in E^\perp$ so that $v + t - u \in E + \Gamma$. If $|t| < \delta$ we can combine the above with Lemma 23.2 to obtain that $P \subset T_v$ implies $v \in -\xi^\perp + \Gamma + B_{|t|}$. Since we can choose $t$ arbitrarily small the statement of the corollary follows for $P = \pi(\xi^*)$. Now the case of a general patch $P$ is a simple consequence of the fact that $P \subset T_v$ whenever all tiles of $P$ belong to $T_v$.

Lemma 27. Let $u \in NS$. Then $A(T_u) := \bigcap_r A_u(M_r(T_u)) = \{0\}$.

Proof. Clearly $A(T_u)$ is convex and closed. If $0 \neq s \in A(T_u)$ then $A(T_u)$ must contain the interval $[0, s]$. Suppose that this is the case. Since the singular points are $\Gamma^\perp$-orbits of boundaries of compact polytopes and since $\Gamma^\perp$ is dense, $u + \operatorname{Int}[0, s]$ must contain a singular point. By convexity of the $\xi$, $u + [0, s] \in \operatorname{Int} \xi^\perp$ for all $\xi \in \tilde P_u$. In particular, $u + t$, $0 < t < s$, is an interior point of all $\xi^\perp$ for which $\xi \in \tilde P_{u+t}$. This shows by Lemma 24 that all points in $u + \operatorname{Int}[0, s]$ must be regular. This is a contradiction.

Proposition 28. Let $u, v \in NS$. Then $T_u = T_v$ whenever $u - v \in \Gamma$.

Proof. If $T_u = T_v$ then $M_r(T_u) \subset T_v$ for all $r$. Hence, by Corollary 26 and Lemma 27, $v - u \in \bigcap_r A_u(M_r(T_u)) + \Gamma = \Gamma$.

3.2. The topology of $M_T$. For canonical projection tilings we have a much better description of the topology of the continuous hull; this is one of the crucial reasons why we can so successfully compute their cohomology. First we use the tiling metric to define a metric on the space $NS$,

$$\bar D(v, w) := D(T_v, T_w) + |v - w|,$$

and let $\Pi$ be the $\bar D$-completion of $NS$.
Lemma 29. The action of $E + \Gamma$ on $NS$ (by translation), the map $\eta_0 : NS \to M_T$ by $x \mapsto T_x$, and the inclusion $\mu_0 : NS \hookrightarrow$ E all extend to continuous maps on the completion $\Pi$. Furthermore, the extension of $\eta_0$ to $\eta : \Pi \to M_T$ is a local homeomorphism and the extension of $\mu_0$ is a surjection $\mu : \Pi \to$ E that is one to one on non-singular points.

Proof. $\bar D$ is invariant under the $\Gamma$ action and for small $s \in E$ we have that $\bar D(u + s, v + s)$ differs very little from $\bar D(u, v)$; this implies that the action of $E + \Gamma$ extends to one by homeomorphisms on $\Pi$. Uniform continuity of $\eta_0$ and $\mu_0$ is clear, as one can bound the $D$-metric and the euclidean metric by the $\bar D$-metric. Hence both maps extend continuously.

To show that $\eta$ is open recall from Proposition 28 that $\eta_0^{-1}(T_u) = u + \Gamma$. Hence, different preimages under $\eta_0$ of one single point have a minimal distance. In particular, any restriction of $\eta_0$ to some small open ball, smaller than that minimal distance, will be injective and we claim that a Cauchy sequence in the image of such a restriction has a Cauchy sequence as preimage. This then shows that the restrictions extend to injective maps, implying that $\eta$ is a local homeomorphism. To prove our claim let $(T_{u_\nu})_\nu$, $u_\nu \in NS$, be a $D$-Cauchy sequence with $(u_\nu)_\nu$ belonging to a small ball (with respect to $\bar D$ in the relative topology). Observe that if $\bar D(u, v)$ is small ($u, v \in NS$) then $|\pi(u) - \pi(v)|$ is small as well and bounded by $2D(T_u, T_v)$. Hence we can choose the ball small enough so that convergence of $T_{u_\nu}$ implies that of $|\pi(u_\nu)|$ and hence also $T_{u_\nu^\perp}$ is a Cauchy sequence. But the latter is even a Cauchy sequence with respect to the metric $D_0$. Now $D_0(T_{u_\nu^\perp}, T_{u_{\nu+\mu}^\perp}) \to 0$ implies that $R_\nu = \sup\{R \mid \forall \mu : B_R(T_{u_\nu}) = B_R(T_{u_{\nu+\mu}})\}$ diverges and hence the diameter of $A_{u_\nu}(M_{R_\nu}(T_{u_\nu}))$ shrinks to zero (Lemma 27) which implies, by Lemma 23, $|u_{\nu+\mu}^\perp - u_\nu^\perp| \to 0$. This shows that $(u_\nu)_\nu$ converges in the euclidean metric and therefore also in the $\bar D$-metric.

To show that $\mu$ is almost one to one on non-singular points observe that $\mu$ can also be viewed as the extension of the identity map $\mathrm{id} : (NS, \bar D) \to (NS, \|\cdot\|)$ to the completions (here $(NS, \bar D)$ and $(NS, \|\cdot\|)$ is the standard notation for the incomplete metric spaces, $\|\cdot\|$ standing for the euclidean metric). Above we showed that id is uniformly continuous and Lemma 25.2 shows that its inverse is pointwise continuous. So if $u \in NS$ and $(x_\nu)_\nu$ is a $\bar D$-Cauchy sequence in $NS$ converging to $x \in \Pi$, then $\mu(u) = \mu(x)$ implies that $(x_\nu)_\nu$ must be a $\|\cdot\|$-Cauchy sequence converging to $u \in NS$. The pointwise continuity of the identity map $(NS, \|\cdot\|) \to (NS, \bar D)$ implies therefore that $x = u$.

Corollary 30. The map $\eta$ induces an $E$-equivariant homeomorphism between the orbit space $\Pi / \Gamma$ and $M_T$.

Proof. Proposition 28 and Lemma 29 imply that $\eta$ maps $\Gamma$-orbits onto single tilings. To show that $\eta(x) = \eta(y)$ implies $y \in x + \Gamma$ (we denote the extended action of $\gamma \in \Gamma$ on $\Pi$ also simply additively) we first recall from the last lemma that $NS$ as a subset of $\Pi$ is the preimage of $NS \subset$ E under $\mu$, a continuous map. Therefore $NS$ is also a dense $G_\delta$ subset of $\Pi$. Let $\eta(x_1) = \eta(x_2)$ but $x_1 \neq x_2$. Fix $\delta > 0$; by the Hausdorff property we may find $\bar D$-open $U_i$ such that $x_i \in U_i$, $U_i$ is contained in the $\delta$-neighbourhood (with respect to $\bar D$) of $x_i$, and $\eta(U_1) = \eta(U_2)$. Since $\eta$ is continuous and open, $\eta(U_i \cap NS)$ is a $G_\delta$-dense subset of $\eta(U_i)$. Therefore $\eta(U_1 \cap NS) \cap \eta(U_2 \cap NS)$ is not empty. So take $u_i \in U_i \cap NS$ such that $\eta(u_1) = \eta(u_2)$. By Proposition 28 we find a $\gamma \in \Gamma$ such that $u_1 - u_2 = \gamma$.
Therefore $\bar D(x_1, x_2 + \gamma) \le \bar D(x_1, u_1) + \bar D(u_2 + \gamma, x_2 + \gamma)$, which tends to 0 if $\delta \to 0$. Hence $x_1 = x_2 + \gamma$. $E$-equivariance is clear.

We have thus another dynamical system $(\Pi, E + \Gamma)$ which plays the role of a "universal covering" (not in its strict sense) of the continuous tiling dynamical system.

Remark 31. We can compare this construction with the so-called torus parametrisation of projection tilings [32]; this also parallels a discussion which was carried out for tilings related to $\mathbb{Z}^N$ (not necessarily canonical) in [16]. There is a surjection $\mu : M_T \to$ E$/\Gamma$ which makes commutative the diagram

$$\begin{array}{ccc} \Pi & \xrightarrow{\ \mu\ } & \mathrm{E} \\ {\scriptstyle \eta}\downarrow & & \downarrow \\ M_T & \xrightarrow{\ \mu\ } & \mathrm{E}/\Gamma \end{array} \qquad (5)$$

All maps are $E$-equivariant and $\mu$ is $E + \Gamma$ equivariant; $\mu$ is one to one on (classes of) non-singular points. The dense set $NS/\Gamma$ of the torus E$/\Gamma$ therefore yields a parametrization of a dense set of tilings. In fact it can be shown that E$/\Gamma$ parametrizes the remaining set of tilings up to changes on sets of tiles having zero density in the tiling. This torus parametrization is very useful for analyzing symmetry properties of the tilings [32].

We need now to describe the topology of $\Pi$. Recall from Sect. 1 that a base of the topology of $M_T$ is generated by sets $B_\epsilon + x + U_P$, $\epsilon > 0$, $x \in E$, $P$ a patch in $T$. For $u \in E^\perp \cap NS$ Lemma 23 can be reformulated to say that $P \subset T_x$ for $x \in u + E + \Gamma$ whenever $x \in A_u(P) + u + \Gamma$. For $u \in E^\perp \cap NS$ we let

$$\mathcal{A}_u = \{(A_u(P) \cap \Gamma^\perp) + u + y^\perp \mid P \subset T_u,\ y \in \Gamma\} \cup \{\emptyset\}.$$

Then, by the interpretation of $A_u(P)$ we see that $\mathcal{A}_u$ is closed under intersection. In fact, if $y \in \Gamma$ then $A_u(P) \cap (A_u(P') + y^\perp) = A_u(P \cup (P' + \pi(y)))$ provided $P \cup (P' + \pi(y)) \subset T_u$ and $\emptyset$ otherwise. It is also useful to have another description of $\mathcal{A}_u$ which shows that the collection $\mathcal{B} := \{\bar A \mid A \in \mathcal{A}_u\}$ of closed subsets in $\Pi$ does not depend on $u$. For $X \subset L_\Theta^{(d^\perp)}$, let $A(X) := \bigcap_{\xi \in X} -\xi^\perp$ and

$$\mathcal{A}'_u := \{A(X) \cap (\Gamma^\perp + u) \mid X \subset L_\Theta^{(d^\perp)} \text{ finite}\} \cup \{\emptyset\}.$$

Then $A_u(P) + u = A(X)$, where $X = \{\xi \in \tilde P_u \mid \pi(\xi^*) \in P\} + u$, which shows that $\mathcal{A}_u \subset \mathcal{A}'_u$. On the other hand let $v \in A(X) \cap (\Gamma^\perp + u)$. Then $\forall \xi \in X$: $\pi(\xi^*) \in T_v$ and $v - u = \gamma^\perp$ for some $\gamma \in \Gamma$. It follows that $\{\pi(\xi^*) \mid \xi \in X\} + \pi(\gamma)$ is a patch in $T_u$. Hence $\mathcal{A}_u = \mathcal{A}'_u$. But from the form of $\mathcal{A}'_u$ it is clear that $\mathcal{B}$ does not depend on $u$.

Theorem 32. The collection $\{B_\epsilon + x + U \mid U \in \mathcal{B}, \epsilon > 0, x \in E\}$ is a base of the topology of $\Pi$. In particular, $\Pi$ is homeomorphic to $E_c^\perp \times E$ (with the product topology) where $E_c^\perp = \overline{E^\perp \cap NS}$ (the $\bar D$-closure of $E^\perp \cap NS$ in $\Pi$).

Proof. Let $P$ be a patch of $T_u$, $u \in E^\perp \cap NS$. From Lemma 23 follows that for $x \in u + E + \Gamma$, $P \subset T_x$ whenever $x \in A_u(P) + u + \Gamma$. Let $X(P) = (A_u(P) + u) \cap NS$. Since $U_P$ is closed $\eta^{-1}(U_P) = X(P) + \Gamma$. Furthermore, if $\gamma \in \Gamma$ is not trivial then $\bar D(X(P), X(P) + \gamma) > \delta$, for some $\delta > 0$ (here we mean the obvious extension of $\bar D$
to subsets). Hence, for all $x \in E + \Gamma$, $B_\epsilon + x + X(P)$ is an open set. We conclude that the above collection consists indeed of open sets and its image under $\eta$ is a collection of sets which forms a base of the topology of $M_T$. Now let $V \subset \Pi$ be open and of diameter smaller than $\frac{\delta}{2}$. Then $\eta(V)$ is open and hence of the form $\eta(V) = \bigcup_{(\epsilon, x, P) \in I} B_\epsilon + x + U_P$, where $I$ is an index set containing triples with $\epsilon > 0$, $x \in E + \Gamma$, $P \subset T_u$. If we choose $\epsilon$ small enough and the patches $P$ large enough we can make sure that $B_\epsilon + x + X(P)$ has $\bar D$-distance at least $\delta$ to $B_\epsilon + x + \gamma + X(P)$ provided $\gamma \in \Gamma$ is non-trivial. Then $V$ is the union of those $B_\epsilon + x + X(P)$, $(\epsilon, x, P) \in I$, which contain one of its points. That $\Pi$ has the above form of a product space is now clear.

Corollary 33. The collection $\mathcal{B}$ is a base of compact open neighbourhoods for $E_c^\perp$. In particular, $E_c^\perp$ is a totally disconnected set without isolated points.

Proof. That $\mathcal{B}$ is a base of the topology follows directly from the last theorem. That its sets are compact follows from compactness, Lemma 9, of the sets $U_P$, $P \subset T_u$.

3.2.1. Decorated tilings. Sometimes it is useful to decorate the tiles of a tiling, usually with small compact sets like arrows. One reason for introducing decorations in the present framework is to get around the hypotheses (c) and (d) made in Definition 20. If it happens that two translationally non-congruent faces of $L_{\Theta^*}^{(d)}$ project onto the same tile we can distinguish them by means of a decoration: the projection images of faces are decorated by arrows which have equal shape for equal translational congruence class but different shape for different classes. Decorating has to be taken into account in the general framework in the way that tiles, patches, and tilings are decorated objects. This means for Lemma 23, for instance, that the tile $P$ is no longer just the set $\pi(\xi^*)$ but this set together with the decoration. Likewise we have to understand patches in Corollary 26 as subsets of decorated tiles.
The description of the hull and notably Theorem 32 remain as stated if one takes into account that the tiling is the decorated one. It is important to note that we need only finitely many different decorations for that, so that the decorated tiling remains finite type. In the same way we can handle the case in which the translation subgroup of $L_{\Theta^*}^{(d)}$ is larger than $\Gamma$ or a fundamental domain for it contains several translationally congruent faces. We can distinguish them again by decorations of which we need only finitely many.

A different reason for introducing decorations is to introduce matching conditions or break the symmetry of the tiles. For instance, the octagonal and decagonal tilings are canonical projection tilings which have matching rules only after (a symmetry breaking) decoration. We now indicate how certain (quasiperiodic) decorations can be incorporated in the projection method. This situation differs from the above in that we suppose we start with a canonical projection tiling which we want to decorate and ask how this modifies the topology of the hull.

We saw that the sets of $\mathcal{B}$ have the interpretation of acceptance domains. If a non-singular point $u$ belongs to such a set then this can be interpreted by saying that a certain patch occurs at $T_u$. If we introduce by hand additional faces in the Laguerre complex $L_{\Theta_u}$ we started with, we divide a $d^\perp$-cell $\xi^\perp$ into several components. Each component may serve as acceptance domain for a decorated tile; the bare tile is $\pi(\xi^*)$ and for its decoration we can take a label or a small compact set like an arrow. We need to make sure that there are as many different decorations as there are new components and we need to require that the additional faces form $\Gamma$-orbits so that the new Laguerre complex remains
Cohomology of Canonical Projection Tilings
$\Gamma$-invariant. This also ensures that the decorated tiling remains minimal. If we now take the new faces into account by taking as a base for the topology the sets corresponding to the above components, then we end up with a description of the continuous hull in the decorated case similar to that in the undecorated one. Certainly arbitrary decorations cannot be handled like this, but those which define matching rules for the (then decorated) octagonal and decagonal tilings can.

3.3. A description of the topology by singular planes. We now bring into play the final hypothesis of the main Definition 20 of canonical projection tilings:

(e) For all $\beta \in L\#^{(d^\perp-1)}$, the (affine) hyperplane $H_\beta$ which is tangent to $\beta^\perp$ is a subset of $S^\perp$.
What we require here is that for all $\beta$, the stabilizer of $H_\beta$ with respect to the action of $\Gamma$ given by $\lambda \mapsto \lambda + \gamma^\perp$ has rank at least $d^\perp$ and that its lattice spacing is small enough compared with the inner diameter of $\beta^\perp$ to ensure that $\beta^\perp$ intersects each of its orbits. This is certainly the case for $W = \mathbb{Z}^N$, $w = 0$, but holds in many other interesting cases. We call the hyperplanes $H_\beta$ singular planes. Using hypothesis (e) we get a further description of the topology of $E_c^\perp$. It allows us to write the singular points in $E^\perp$ as

$$S^\perp = \bigcup_{\beta \in L\#^{(d^\perp-1)}} H_\beta,$$

which is clearly invariant under the action of $\Gamma$ given by $\lambda \mapsto \lambda + \gamma^\perp$. The set $C$ of all singular planes is invariant under $\Gamma$ as well and, since $L\#^{(d^\perp-1)}$ contains only finitely many $\Gamma$-orbits, $C$ consists of a finite number of $\Gamma$-orbits, too.

Definition 34. A compact polytope in $E^\perp$ is called a C-tope if it is the closure of its interior and if all its boundary faces are subsets of singular planes. A subset of $E_c^\perp$ is called a C-tope if it is the $\bar{D}$-closure of the set of non-singular points of a C-tope in $E^\perp$.

Theorem 35. The characteristic functions on C-topes generate $C_c(E_c^\perp, \mathbb{Z})$, the compactly supported continuous, integer valued functions on $E_c^\perp$.

Proof. C-topes form the set of finite unions of sets of $B$. The latter being clopen and forming a base of the topology, their corresponding characteristic functions generate $C_c(E_c^\perp, \mathbb{Z})$. Since $1_{U \cup V} + 1_{U \cap V} = 1_U + 1_V$ the statement follows.
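The identity $1_{U \cup V} + 1_{U \cap V} = 1_U + 1_V$ used at the end of the proof can be checked numerically in a toy model. The sketch below is only an illustration under simplifying assumptions: half-open intervals of the real line stand in for C-topes, the functions are sampled on a finite grid, and the helper names `indicator` and `union_and_intersection` are hypothetical.

```python
def indicator(interval, points):
    """Values of the indicator function of the half-open interval [lo, hi) on the sample points."""
    lo, hi = interval
    return [1 if lo <= x < hi else 0 for x in points]


def union_and_intersection(u, v):
    """Union and intersection of two overlapping half-open intervals (again intervals)."""
    assert max(u[0], v[0]) < min(u[1], v[1]), "intervals must overlap"
    union = (min(u[0], v[0]), max(u[1], v[1]))
    inter = (max(u[0], v[0]), min(u[1], v[1]))
    return union, inter


# Sample grid and two overlapping toy "topes".
points = [k / 20 for k in range(-40, 81)]
U, V = (0.0, 1.5), (1.0, 2.5)
union, inter = union_and_intersection(U, V)

# Left- and right-hand sides of the identity, evaluated pointwise on the grid.
lhs = [a + b for a, b in zip(indicator(union, points), indicator(inter, points))]
rhs = [a + b for a, b in zip(indicator(U, points), indicator(V, points))]
print(lhs == rhs)
```

Because the two sides agree pointwise, the indicator function of a union can always be traded for indicator functions of the pieces and of their overlap, which is the step the proof relies on.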
Remark 36. For $\Gamma = \mathbb{Z}^{d+d^\perp}$, Le [11] gave a description of the topology of $E_c^\perp$ which we relate to the above. For $x \in E^\perp$ let $c_x$ be a connected component of $E^\perp \setminus \bigcup_{x \in H \in C} H$, an open subset of $E^\perp$ called a corner. Note that $c_x = E^\perp$ if $x \in NS$. Let $E_L^\perp = \{(x, c_x) \mid x \in E^\perp\}$ with topology generated by the sets $U_{(x,c_x)} = \{(y, c_y) \mid y \in c_x,\ c_x \cap c_y \neq \emptyset\}$. Clearly, the projection onto the first factor is a continuous surjective map $E_L^\perp \to E^\perp$. This is Le's description of a transversal for the continuous hull. Let $U$ be a C-tope in $E^\perp$. Then $U_L := \{(x, c_x) \mid x \in U,\ c_x \cap \mathrm{Int}\,U \neq \emptyset\}$ is a preimage of $U$ in $E_L^\perp$ which is a finite union of $U_L$'s and hence open. Let $B_L$ be the collection of all sets obtained in this way. Then the topology of $E_L^\perp$ is generated by $B_L$
A. H. Forrest, J. R. Hunton, J. Kellendonk
since we can realize the sets U(x,cx ) as (infinite) unions. We leave it to the reader to verify that the map B → BL given by U → µ(U )L is a bijection preserving the operations intersection, union, and symmetric difference. Then C0 (Ec⊥ ) is isomorphic to C0 (EL⊥ ) and Ec⊥ is homeomorphic to EL⊥ . 3.4. A variant of the tiling groupoid for canonical projection tilings. For canonical projection tilings it is convenient to use a slightly different groupoid which is isomorphic to a reduction of the tiling groupoid. It is also continuously similar to it. In [17] it is called the pattern groupoid. Let be a small vector in E which is not parallel to any of the faces of tiles. To a vertex v, associate the tile which contains in its interior v + ; this defines an injection between the vertices of a projection tiling and its tiles. We assume that is small enough so that the associated tile contains this vertex. Let 0T be the subset of MT given by those tilings which have a vertex on 0 ∈ E. As for 0T one shows that 0T is a closed range-open subset which intersects each orbit of G(MT , E)). Thus we define the reduction GT := 0T G(MT , E))0T of G(MT , E)). Now consider a new set of punctures for T , a subset of the old one, namely give only those tiles a puncture which are associated to vertices as described above. This choice can be made locally since we only have to test the vertices of the tile itself to decide whether we select its puncture to become a new one. Call 0T the subset of tilings of MT for which a new puncture lies on 0. By letting the new punctures tend to the corresponding vertices one immediately sees that the reduction 0 G(MT , E))0 is T T isomorphic to GT . Furthermore, 0 G(MT , E))0 is the reduction to 0T of GT which, T T as noted in [10] is continuously similar to it. A similar argument can also be found in [17]. 
Without loss of generality we may assume that 0 ∈ W , our 7-invariant set we start with, and that the Laguerre domain of 0 has interior and therefore 0 is a vertex of the dual complex. Let u ∈ E ⊥ ∩ N S be such that 0 is a vertex of Tu . All vertices of Tu are contained in π(Wu ) which can be written in the form π(Wu ) = x∈X x + π(7) for a finite subset X ∈ E of points which are all in different π(7) orbits, 0 being one of them. Therefore, if s ∈ E and 0 is a vertex of Tu−s then s ∈ x + π(7) for some x ∈ X. Using Proposition 28 we find that η−1 (Tu−s ) ∩ Ec⊥ × {x} is not empty provided 0 is a vertex of Tu−s . By continuity and closedness of Ec⊥ this extends to arbitrary T ∈ 0T . So if we let LT := η−1 (0T ) ∩ Ec⊥ × X then η−1 (0T ) = LT + 7. Lemma 37. GT is isomorphic to the reduction LT G(>, E + 7)LT , where LT is as above. Proof. The map LT G(>, E + 7)LT → 0T G(MT , E)0T given by (y, s + γ ) → (η(y), s) is a groupoid homomorphism. It is injective, because no two points of X belong to the same π(7) orbit, and surjective, because η(LT ) = 0T . Continuity follows from the continuity properties of η. 3.5. Discrete tiling dynamical systems for canonical projection tilings. We now bring to fruition the work of the preceding subsections and prove that the groupoids constructed so far from a canonical projection tiling are continuously similar to that arising from
a minimal action of Zd on a Cantor-set. This gives us the key, in Sects. 4, 5 and 6, to qualitatively and quantitatively describing the cohomology of these tilings. Let F be a subspace which is complementary to E, thus F ∩ E = 0 and F + E = E. We denote by π the projection onto F with kernel E (so it is not orthogonal except if F = E ⊥ ). The restriction of π to u + 7 ⊥ (u ∈ E ⊥ ∩ N S) extends to a homeomorphism between Ec⊥ and Fc = F ∩ N S (its closure in >) and we can write > = Fc × E with the product topology. Since E ∩ 7 = {0}, π (7) is isomorphic to 7 so that we have a natural minimal action of 7 on F , x · γ = x − π (γ ), without fixed points. The extension of this action to Fc defines a minimal dynamical system (Fc , 7) also without fixed points. Proposition 38. G(Fc , 7) is continuously similar to G(>, E + 7). Proof. We apply Proposition 12 taking L = Fc (which is closed) and γ : > → E + 7 to be the extension of π : E → E. ⊥ Now we decompose 7 ∼ = Zd+d into complementary subgroups, 7 = G0 ⊕ G1 , ⊥ where G0 ∼ = Zd and G0 := π (G0 ) spans F . Define
X := Fc /G0 so that we obtain (X, G1 ), a minimal dynamical system without fixed points. Proposition 39. G(Fc , 7) is continuously similar to G(X, G1 ). Proof. We claim that Fc has a clopen fundamental domain Y for G0 . The proposition follows then from Proposition 12 upon using L = Y and γ : Fc → 7, γ (x) being the unique element of G0 such that x · γ (x) ∈ Y . The latter is indeed continuous since the preimage of a lattice point is a translate of the fundamental domain and therefore open. (d ⊥ ) To prove the claim pick any ξ ∈ L# such that ξ ⊥ has interior. Since G0 spans F it has a compact fundamental Y 0 . By density of 7 ⊥ there is a finite subset domain 1 ⊥ 0 ∈ J ⊂ 7 such that Y = γ ∈J (−ξ + γ ⊥ ) covers Y 0 . It follows that ((−ξ ⊥ ∩ N S) + γ ⊥ ) Yc1 := γ ∈J
is a compact open subset of Fc and Yc1 + G0 = Fc . Now let G+ 0 be a positive cone + + of G0 which satisfies G0 = G0 ∪ (−G0 ) thus implying a total order. We claim that 1 Y := Yc1 \(Yc1 + G+ 0 \{0}) ∩ Yc is a clopen fundamental domain. Clopenness is easy to see. So let x ∈ Fc . Clearly, the set of all g ∈ G0 such that x + g ∈ Yc1 is non-empty and finite. The unique minimal element g0 of this set is the only one satisfying x + g0 ∈ Y . Proposition 40. GT is continuously similar to G(Ec⊥ , 7). Proof. From Lemma 37 we know that GT is isomorphic to the reduction LT G(>, E + 7)LT . Let (LT )x := LT ∩ Ec⊥ × {x}, x ∈ X. If u ∈ E ⊥ ∩ N S such that 0 is a vertex of Tu and v ∈ u + E ∩ (LT )x then v = u − s with s ∈ x + π(7). Hence there is a unique g ∈ 7 such that v + x − g ∈ E ⊥ . Now η(v + x − g) = η(u) contains 0 as a vertex and hence v + x − g ∈ (LT )0 . We define a map γ : (LT )x → E + 7 first on the dense set u + E ∩ (LT )x by γ (v) = x − g, with g as above, and then extend it by continuity. Applying Proposition 12 with L = LT , L = (LT )0 , γ : L → E + 7, γ (x) = 0, and
γ : LT → E + 7 as above, we find that LT G(>, E + 7)LT is continuously similar to L G(>, E + 7)L . The latter is equal to the reduction of G(Ec⊥ , 7) to L . L is clopen (in the topology of Ec⊥ ) and hence µ(L ) contains an open set. We claim that there exists a choice of decomposition 7 = G0 +G1 with the properties stated before Proposition 39 and such that L contains a clopen fundamental domain Y for G0 . It then follows again from Proposition 12 upon using the same map γ as in Proposition 39 (Y is a subset of L ) that L G(>, E + 7)L is continuously similar to G(Ec⊥ , 7). This then proves the proposition. It remains to prove the claim. Since 7 ⊥ is dense in E ⊥ we can choose d ⊥ elements ⊥ of 7 which generate a group H isomorphic to Zd , such that H ⊥ spans E ⊥ , and has a fundamental domain Y in E ⊥ contained in µ(L ). Let G0 be the group generated by H and representatives for the torsion elements of 7/H . It is a free abelian group of rank ⊥ d ⊥ which contains H and G⊥ 0 cannot be dense in E . By the same construction as in the proof of the last proposition we obtain from Y a fundamental domain Y for G0 in Ec⊥ which is contained in L since µ(Y ) ⊂ Y . ∗ ∗ ∗ ∼ ∼ Corollary 41. H (T ) = H (7, C(Fc , Z)) = H (G1 , C(X, Z)). A direct consequence of the above corollary is that H k (T ) is trivial if k exceeds the rank of G1 , which is d, the dimension of the tiling. Furthermore, using that H 0 (G1 , C(X, Z)) = {f ∈ C(X, Z)|∀g ∈ G1 : g · f = f } [29], minimality of the G1 action implies that H 0 (T ) = Z. Finally, if M is a G1 -module then H d (G1 , M) = Coinv(G1 , M) is the group of coinvariants [29] Coinv(G1 , M) := M/&{m − g · m|m ∈ M, g ∈ G1 }'. By the corollary H d (T ) is thus equal to C(X, Z)/E(G1 ) where E(G1 ) is subgroup of C(X, Z) generated by the elements f − g · f for g ∈ G1 and (g · f )(x) = f (x · g). Remark 42. The dynamical systems of the form (X, G1 ) defined above a priori depend on the position of F and on the choice of G0 . 
However, in a certain sense they are all equivalent, namely their groupoids are all continuously similar and they are all reductions of one big groupoid. They are not all isomorphic, as an investigation of the order unit of the $K_0$-group of the $C^*$-algebra they define shows. The dependence on $F$ is inessential. The map $\pi$ induces a $\Gamma$-equivariant homeomorphism between $E_c^\perp$ and $F_c$. Different $F$'s therefore lead to isomorphic dynamical systems $(F_c, \Gamma)$. Taking $F$ as the span of $G_0$ one verifies directly that $MT$ is the mapping torus of $(X, G_1)$ [16]. One consequence of this (though not one we make use of below) is the following.

Corollary 43. The tiling cohomology of non-periodic canonical projection tilings is isomorphic to the Čech cohomology of their continuous hull.

We do not know whether this result is true for general tilings.

Remark 44. Consider the case $\Gamma = \mathbb{Z}^{d^\perp+d}$, $F = E^\perp$ and $G_0$ generated by, say, the first $d^\perp$ basis elements $e_i$. Then the dynamical system is the rope dynamical system of [10].

Remark 45. We conclude Sect. 3 by summarizing the structure of $(X, G_1)$ in a commutative diagram which is the discrete analogue of (5); see [16] for the necessary proofs.

$$\begin{array}{ccc}
F_c & \xrightarrow{\ \mu\ } & F \\
{\scriptstyle \eta}\downarrow & & \downarrow \\
X & \xrightarrow{\ \mu\ } & F/G_0.
\end{array}$$
The maps are $\Gamma$ (respectively $G_1$) equivariant, where the $G_1$-action on the $d^\perp$-torus $F/G_0$ is by rotations (constant shifts). $X$ is a Cantor set and the surjection $\mu : X \to F/G_0$ is one to one for nonsingular points of $X$, which form a dense $G_\delta$ subset. Thus $(X, G_1)$ is an almost one to one extension of a relatively simple system, that of rotations on a torus. The crucial topological information is encoded in the set on which $\mu$ is not injective.

4. Examples

Before we proceed to give a qualitative picture of tiling cohomology and to describe methods for calculation, we discuss the two simplest examples which we believe show typical features. Both are one-dimensional tilings obtained from an integer lattice, so by Corollary 41 only $H^0(T)$ and $H^1(T)$ are non-zero. As noted, by minimality $H^0(T) = \mathbb{Z}$, and $H^1(T)$ is identified in the last section as a group of coinvariants.

Example 46. In our first example we take $W = \mathbb{Z}^2$, $w = 0$ and $d = 1$. Here $E$ is specified by a vector $(1, \nu)$ and $\nu$ has to be irrational to meet the requirement $E \cap \mathbb{Z}^2 = \{0\}$. Clearly, $E^\perp$ is generated by $(-\nu, 1)$ and the singular planes are simply points, namely the points of $\pi^\perp(\mathbb{Z}^2)$ (we ignore the shift by $\delta$). Identifying $E^\perp$ with $\mathbb{R}$ we have $\pi^\perp(\mathbb{Z}^2) = \mathbb{Z} + \nu\mathbb{Z}$ (after a suitable rescaling). Hence $C_c(E_c^\perp, \mathbb{Z})$ is generated by indicator functions $1_{[a,b]}$ (on the $\bar{D}$-closure of $[a, b] \cap NS$) with $a, b \in \mathbb{Z} + \nu\mathbb{Z}$, $a < b$. How many of them are cohomologous? Clearly, $1_{[a,b]} \sim 1_{[0,b-a]}$ and there are unique $n, m \in \mathbb{Z}$ such that $b - a = n + \nu m$. Defining $1_{[a,b]} = -1_{[b,a]}$ in the case of $a > b$, we get

$$1_{[0,b-a]} = 1_{[0,n]} + 1_{[n,n+\nu m]} \sim n\,1_{[0,1]} + m\,1_{[0,\nu]},$$

which shows that the coinvariants are $\mathbb{Z}^2$ provided the two generators given by the classes of $1_{[0,1]}$ and of $1_{[0,\nu]}$ are independent. This will be shown in Sect. 7. Let us mention in this context that the above tilings are very close to being substitutional [33] (they are strictly substitutional only for $\nu$ a quadratic irrationality).
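The setting of Example 46 can be explored numerically. The following is a minimal sketch, not the construction of the paper itself: it assumes the acceptance window in $E^\perp$ is the half-open interval $[u - \nu, u + 1)$ in the coordinate $j - \nu i$ (the image of the unit square, shifted by a generic offset $u$), and it measures positions along $E$ in the unnormalised coordinate $i + \nu j$. The function name `projection_tiling_gaps` and the parameters `u` and `n_cols` are hypothetical.

```python
import math


def projection_tiling_gaps(nu, u=0.1, n_cols=50):
    """Tile lengths of a 1D cut-and-project tiling built from the lattice Z^2.

    A lattice point (i, j) is accepted if its E-perp coordinate j - nu*i lies in
    the half-open window [u - nu, u + 1). Accepted points are projected to E
    using the unnormalised coordinate i + nu*j; the gaps between consecutive
    projected points are the tile lengths.
    """
    projected = []
    for i in range(-n_cols, n_cols + 1):
        lo = u - nu + nu * i   # accepted j satisfy lo <= j
        hi = u + 1 + nu * i    # ... and j < hi
        for j in range(math.ceil(lo), math.ceil(hi)):
            projected.append(i + nu * j)
    projected.sort()
    return [b - a for a, b in zip(projected, projected[1:])]


nu = (math.sqrt(5) - 1) / 2          # an irrational slope, chosen for illustration
gaps = projection_tiling_gaps(nu)
distinct = sorted({round(g, 9) for g in gaps})
print(distinct)                      # the distinct tile lengths that occur
```

In this normalisation one expects exactly two tile lengths, $1$ and $\nu$, coming from the two lattice directions; this mirrors the two generators $1_{[0,1]}$ and $1_{[0,\nu]}$ of the coinvariants found above.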
The above result shows that whatever the irrational $\nu$ is, $H^1(\mathbb{Z}^2, C_c(E_c^\perp, \mathbb{Z})) = \mathbb{Z}^2$. This demonstrates that cohomology is not a very fine invariant to distinguish tilings, at least in these low dimensions. We shall see in Sect. 7 how further structure can be added.

Example 47. In our second example we take $W = \mathbb{Z}^3$, $w = 0$ and $d = 1$. Here we consider only the case where $E^\perp \cap \mathbb{Z}^3 = \{0\}$ because the other leads essentially to the previous example. In this case, the singular planes are lines which are $\pi^\perp(\mathbb{Z}^3)$-translates of $H_\alpha = \langle e_\alpha^\perp \rangle$, $\alpha = 1, 2, 3$ (again up to the shift by $\delta$). Any two $H_\alpha$ span $E^\perp$. We claim that the result for the cohomology differs drastically from the previous example in that the coinvariants are infinitely generated. Fix $g_1, g_2 \in \pi^\perp(\mathbb{Z}^3)$ and let $U$ be the rhombus (we assume it has interior) whose boundary faces lie in $H_1 \cup (H_1 + g_1) \cup H_2 \cup (H_2 + g_2)$. Clearly, $1_U$, the indicator function on the $\bar{D}$-closure of $U \cap NS$, belongs to $C_c(E_c^\perp, \mathbb{Z})$. Let, for $\alpha = 1, 2$, $\pi_1$ ($\pi_2$) be the projection onto $H_1$ ($H_2$) which has kernel $H_2$ ($H_1$) and let $\Gamma_\alpha = \pi_\alpha(\pi^\perp(\mathbb{Z}^3))$. Then for all $\lambda_\alpha \in \Gamma_\alpha$ also $1_{U+\lambda_1+\lambda_2} \in C_c(E_c^\perp, \mathbb{Z})$. How many of them are cohomologous? Let us try to repeat the construction of the first example. Clearly

$$1_{U+\lambda_1+\lambda_2} \sim 1_{U+\lambda_1'+\lambda_2'} \quad \text{if } \lambda_1 + \lambda_2 - \lambda_1' - \lambda_2' \in \pi^\perp(\mathbb{Z}^3).$$

But since the rank of $\Gamma_\alpha$ is at least 2 (because it is dense in $H_\alpha$) we see that the number of $\pi^\perp(\mathbb{Z}^3)$-orbits of points in $\Gamma_1 + \Gamma_2$ (which is the number of elements in $(\Gamma_1 + \Gamma_2)/\pi^\perp(\mathbb{Z}^3)$) is infinite. Therefore the construction used in the first example cannot be used here to reduce the generators to a finite set. This does not prove our claim but it does indicate a crucial point, namely that there are infinitely many orbits of points which are intersections of singular planes. From this we will conclude in the next section that the tilings of the second example cannot be substitutional.

5. Conditions for Infinitely Generated Cohomology

The cohomology groups of a canonical projection tiling, as defined in Sect. 2.2, contain rich information about the tiling. With the analysis of Sect. 3 we shall see in Sect. 6 that they are completely computable, at least for projections of small codimension. In this section we examine instead the qualitative behaviour for generic projection tilings of the rationalisations of these cohomology groups. Although rational cohomology, $H^*(G_T, \mathbb{Q})$, is a somewhat cruder invariant, it still proves useful. In the following subsection it will allow us to comment on the relationship between canonical projection tilings and tilings defined by a substitution system. Recall the set of singular points $S^\perp$ in $E^\perp$, defined in Sect. 3.1, and the assumption (e) of our Definition 20 of a canonical projection tiling.

Definition 48. We call a point $x \in S^\perp$ an intersection point if there are $d^\perp$ singular planes which intersect uniquely at $x$. Let $P$ be the set of intersection points. Clearly, $P$ is invariant under the action of $\Gamma$. Let 0(P) $= P/\Gamma$ be the orbit space.

One of the main results of [19] is the following theorem (see also [17]).

Theorem 49 ([17,19]). 0(P) is an infinite set if and only if $H^*(G_T, \mathbb{Q})$ is infinitely generated.

We do not repeat its proof here, but rather explain how to obtain criteria under which 0(P) is infinite. Choose $d^\perp$ singular planes $H_\beta$, indexed now simply by $\alpha = 1, \ldots, d^\perp$, which intersect in exactly one point.
Let $S' := \bigcup_\alpha (H_\alpha + \Gamma^\perp)$ and let $P' = P \cap S'$, a subset which is clearly $\Gamma$-invariant. Write $L_\alpha$ for $\bigcap_{\alpha' \neq \alpha} H_{\alpha'}$, a line, and let $\pi_\alpha : E^\perp \to L_\alpha$ be the (not necessarily orthogonal) projection with kernel $H_\alpha$. Then $\Gamma^\alpha := \{\gamma \in \Gamma \mid L_\alpha + \gamma = L_\alpha\}$, the stabilizer of $L_\alpha$, can be naturally identified with a subgroup of $\Gamma_\alpha = \pi_\alpha(\Gamma^\perp)$.

Lemma 50. If $\mathrm{rank}\,\Gamma^\alpha < \mathrm{rank}\,\Gamma_\alpha$ then 0(P) is an infinite set.

Proof. Let $x \in L_\alpha \cap P'$. Then, by construction, $x + \Gamma_\alpha \subset P'$, too. The latter set may be decomposed into its $\Gamma^\alpha$-orbits, and if $\mathrm{rank}\,\Gamma^\alpha < \mathrm{rank}\,\Gamma_\alpha$ there are infinitely many. On the other hand, intersection points of $L_\alpha \cap P'$ which lie in different $\Gamma^\alpha$-orbits lie also in different $\Gamma$-orbits.

This gives the following easily checked criterion; it also shows that 0(P) being an infinite set is a generic feature.

Corollary 51. If $\mathrm{rank}\,\Gamma^\alpha < 2$ then 0(P) is an infinite set.

Proof. Density of $\Gamma^\perp$ implies that of $\Gamma_\alpha$. Hence $\mathrm{rank}\,\Gamma_\alpha \geq 2$.

Corollary 52. If $d^\perp > d$ then 0(P) is an infinite set.
Proof. We showed above $\mathrm{rank}\,\Gamma_\alpha \geq 2$. In particular, $\sum_\alpha \mathrm{rank}\,\Gamma_\alpha \geq 2d^\perp$. The statement of the lemma follows therefore from the observation that 0(P) is an infinite set if $\bigl(\sum_\alpha \Gamma_\alpha\bigr)/\Gamma^\perp$ is infinite, and the latter is the case whenever $\sum_\alpha \mathrm{rank}\,\Gamma_\alpha > d + d^\perp$.

The claim of our second example in Sect. 4 follows from this last result and the discussion of the next subsection. With a little more thorough analysis [17] one can show that if 0(P) is a finite set then $d/d^\perp$ must be an integer. A further result, accessible with the algebraic-topological methods of [19], is the following.

Theorem 53 ([19]). If 0(P) is a finite set then each $H^r(G_T, \mathbb{Z})$ is a finitely generated free abelian group for $r = 0, \ldots, d$ and is zero for other $r$.

5.1. Comparison with substitution tilings. In addition to those tilings which arise from the canonical projection method there is another very important class for which cohomology can be computed. These are the finite type tilings which allow for a locally invertible (primitive) substitution. We briefly discuss these tilings and show, with the aid of the results of the previous section, that tiling cohomology gives effective criteria for distinguishing whether a tiling can come from one or the other of these two classes. In particular, we shall see that generically canonical projection tilings do not allow for a locally invertible substitution. A substitution of a tiling $T$ (the terms inflation and deflation are also used in this context) is, roughly speaking, a rule according to which each tile of $T$ gets substituted by a collection of tiles (a patch) such that these patches fit together to form a new tiling which is locally isomorphic to $T$. Furthermore, the translational congruence class of the patch which substitutes a tile depends only on the translational congruence class of that tile, and the relative position between two patches only on the relative position between the two tiles which they substitute.
Therefore, the rule is specified when it is given for any translational congruence class of tiles (of which there are only finitely many) and for all possible relative positions two neighbouring tiles can have (of which there are also only finitely many). One of the major examples is the octagonal tiling whose substitution rule is shown in Fig. 1. The octagonal tiling is also an example of a tiling that can be obtained as a canonical projection tiling and the question naturally arises of obtaining criteria for deciding the possible origins, whether as substitutions, projections or both, of any given tiling. There are additional conditions which turn out to be useful to assume a substitution satisfies, such as local invertibility; we refer the reader to [9] and [10] for details.
Fig. 1. Substitution of the octagonal tiling (triangle version)
Under such suitable conditions, [9] and [10] develop methods for the computation of the cohomology of substitution tilings. Of the two approaches, that of [9] is based on the continuous dynamical system $(MT, \mathbb{R}^d)$ whereas that of [10] is based on the tiling groupoid $G_T$. We sketch here the latter. The essential observation of this approach is that a primitive invertible substitution gives rise to a homeomorphism # (the Robinson map) between $\Omega_T$ and the space of paths $P_I$ on a certain oriented graph $I$. In the case where the substitution forces its border (see [15]) the connectivity matrix $\sigma$ of $I$ is a power of the substitution matrix. A natural principal topological groupoid $G_I$ is associated with the path space, namely the one given by tail equivalence: two paths are tail equivalent if they agree up to finitely many edges. The tiling groupoid $G_T$, which is always principal for such substitution tilings, is identified via # with a subset of $P_I \times P_I$ and hence can be compared with $G_I$. In fact, $G_I$ is a subset of $G_T$ (but not a closed one). This construction allows for a description of $\mathrm{Coinv}(G_T; \mathbb{Z})$, the group of coinvariants of $G_T$ with integer coefficients, a group which coincides with the cohomology group $H^d(T)$ of Sect. 2.2 when $T$ arises also from the projection method (or, in the language of [10], when the tiling reduces to a $\mathbb{Z}^d$-decoration).

Theorem 54 ([10]). For substitution tilings as discussed, the group of coinvariants $\mathrm{Coinv}(G_T; \mathbb{Z})$ is a quotient of the group of coinvariants of $G_I$. Moreover, $\mathrm{Coinv}(G_I; \mathbb{Z})$ is the direct limit of the system

$$\mathbb{Z}^N \xrightarrow{\ \sigma\ } \mathbb{Z}^N \xrightarrow{\ \sigma\ } \cdots,$$

where $N$ is the number of vertices of $I$ (which in the border forcing case coincides with the number of translational tile-classes).

Corollary 55. A necessary condition for a canonical projection tiling to be substitutional is that 0(P) is a finite set. Consequently, canonical projection tilings are generically non-substitutional and in particular no canonical projection tiling with $d^\perp$ not dividing $d$ is substitutional.

Proof. Suppose a canonical projection tiling $T$ is substitutional. Then Theorem 54 tells us that $H^d(T)$ can be expressed as a direct limit of finitely generated free abelian groups. Such a limit need not be finitely generated itself, but when rational coefficients are considered instead of integer ones then the direct limit becomes that of the system

$$\mathbb{Q}^N \xrightarrow{\ \sigma\ } \mathbb{Q}^N \xrightarrow{\ \sigma\ } \cdots,$$

namely $\mathbb{Q}^R$ where $R$ is the rank of $\sigma^n$ for large $n$. The first part of the corollary now follows from Theorem 49; the remainder follows from the results and comments of the preceding section.

Remark 56. It is worth comparing the above result with a similar one due to Pleasants, who uses the theory of algebraic number fields [34]. In the context of tilings obtained by the projection method there is an approach to the construction of substitutions which is based on the torus-parametrization. It is most powerful not when tilings are considered but when projection point patterns are looked at (though these are closely related to tilings, see [16]). For a lattice $\Gamma \subset E$, a subspace $E$, and an acceptance domain $A \subset E^\perp$ (satisfying certain rather weak conditions) the projection point pattern given by the triple $(\Gamma, E, A)$ is the point set $P_A := \pi((E + A) \cap \Gamma)$. The canonical choice for $A$ corresponds
to one where PA = {π(ξ )|ξ ∈ P˜ 0 } with P˜ 0 the set of vertices (0-cells) of the lift of a canonical projection tiling T (constructed from the same data with constant weight function). In that case, A is a polytope, but in [34] A is allowed to be more general. For the more general acceptance domains, the notion of substitution generalises to that of an inflation, a linear map λ [34] (or even affine linear [32]) which has E as one of its eigenspaces, with eigenvalue of modulus greater than 1, preserves 7, and is contracting in a space F complementary to E. For λ to be a local inflation, i.e. an inflation which can be defined as a map on translational congruence classes, leads to a criterion on the acceptance domain A. The method of Pleasants [34] is designed to construct projection point patterns with a given (finite) symmetry group of isometries. It is based on the result that every representation of a finite isometry group acting on Rd can be written as a matrix representation where the matrices take their entries in a real algebraic number field K of (finite) degree p. This number field K is then used to construct a decomposition Rdp = E ⊕ E ⊥ , where dim E = d, and a lattice 7 so that the point pattern with the desired symmetry is the projection point pattern constructed from data (7, E) and a general acceptance domain in E ⊥ . In [34] Pleasants comes to the conclusion that local inflations always exist but, for p > 2, never for polytopal acceptance domains (so in particular not for the canonical one) whereas this obstruction is absent for p = 2. Note that dim E ⊥ ≥ dim E in his construction, with equality holding only for p = 2, a result in agreement and comparable to our Corollary 52. The direct limit of rational vector spaces in the proof of Corollary 55 is finitely generated, but the corresponding limit of underlying free abelian groups need not be finitely generated; indeed limits with divisibility can easily occur. 
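The rational direct limit appearing in the proof of Corollary 55 is determined by the eventual rank of the powers of $\sigma$. The following minimal sketch computes that rank with exact rational arithmetic. It is an illustration only: the function names are hypothetical, the example matrices are chosen by us and are not taken from the paper, and the code uses the elementary fact that the rank of $\sigma^n$ is constant once $n$ reaches the size $N$ of the matrix.

```python
from fractions import Fraction


def rank_over_q(matrix):
    """Rank of an integer (or rational) matrix, by Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a pivot in this column at or below the current rank row.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear the column below the pivot.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def stable_rank(sigma):
    """Rank of sigma^N for an N x N matrix sigma, i.e. the rank of sigma^n for all large n."""
    power = sigma
    for _ in range(len(sigma) - 1):
        power = mat_mul(power, sigma)
    return rank_over_q(power)


print(stable_rank([[1, 1], [1, 0]]))   # invertible matrix: the rank never drops
print(stable_rank([[1, 1], [1, 1]]))   # rank-one matrix with non-zero trace
print(stable_rank([[0, 1], [0, 0]]))   # nilpotent matrix: the rational limit is trivial
```

The value returned by `stable_rank` plays the role of $R$ in $\mathbb{Q}^R$; it says nothing about divisibility in the integral limit, which is exactly the information that is lost on passing to rational coefficients.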
Corollary 55 and Theorem 53 now imply the following. Corollary 57. A necessary condition for a substitution tiling T to arise also as a canonical projection tiling is that Coinv(GT ; Z) is a finitely generated free abelian group. 6. Explicit Formulae for Codimension d ⊥ ≤ 2 We turn now to methods of computation and present quantitative results for the cohomology of canonical projection tilings of codimension smaller than or equal to 2. The restriction to small codimension is a matter of simplification: in principle, the calculations can be carried out for any codimension, but in practice become quite complicated. Algebraic topology provides sophisticated tools to organize such calculations, namely spectral sequences, and we exploit their full power elsewhere [18, 19]. However, they are not really necessary for codimensions strictly less than 3 and we present here alternative, elementary methods of computation for these codimensions. Throughout this section we assume that 0(P) is finite, which we saw in Theorems 49 and 53 was necessary and sufficient to ensure that the cohomology is finitely generated and free abelian. In fact, the results below are independent of these theorems and show directly that if d ⊥ ≤ 2 then H ∗ (T ) is finitely generated and free abelian. The calculations rely on the description of the topology of Ec⊥ by singular planes developed in Sect. 3. Recall that C is a countable collection of singular planes with only finitely many 7-orbits; we index the orbits by I . We know that the normals of the singular planes span E ⊥ and that 7 ⊥ lies dense in it. We now simplify the notation in writing 7 in place of 7 ⊥ .
By Corollary 41 the task is to compute the cohomology of the group $\Gamma$ with values in $C(E_c^\perp, \mathbb{Z})$ and the strategy is as follows. We recognize $C_c(E_c^\perp, \mathbb{Z})$, the compactly supported functions, as a $\Gamma$-module in a (finite) exact sequence of $\Gamma$-modules and use the functorial properties of cohomology, in particular that it turns short exact sequences into long exact ones. As the other modules in the exact sequence are effectively lower dimensional we can proceed recursively. In practice it turns out to be more convenient to use homology in place of cohomology. This makes no essential difference: the fact that $E^\perp$ has $d^\perp$ non-compact independent directions together with Poincaré duality [29] gives an isomorphism [17].

Lemma 58. $H^k(\Gamma, C(E_c^\perp, \mathbb{Z})) \cong H_{d-k}(\Gamma, C_c(E_c^\perp, \mathbb{Z}))$.

6.1. Group homology. As a general reference to group homology we refer to [29]. Homology of a group $\Gamma$ is defined using any projective resolution of $\mathbb{Z}$ by $\mathbb{Z}\Gamma$-modules of the group; here $\mathbb{Z}\Gamma$ denotes the free $\mathbb{Z}$-module on the basis elements of $\Gamma$; we write $[\gamma]$ for the basis element corresponding to $\gamma \in \Gamma$. We choose here the following free resolution. Let $\{e_1, \ldots, e_N\}$ be a basis of $\Gamma \cong \mathbb{Z}^N$. Then $\Lambda\Gamma$, the exterior algebra over $\Gamma$, is the free graded $\mathbb{Z}$-module $\Lambda\Gamma = \bigoplus_{k=0}^{N} \Lambda_k\Gamma$, where $\Lambda_k\Gamma$ has basis $\{e_{i_1} \wedge \cdots \wedge e_{i_k} \mid 1 \leq i_j < i_{j+1} \leq N\}$ with antisymmetric multiplication (denoted by $\wedge$), i.e. the only relations are $e_i \wedge e_j = -e_j \wedge e_i$. Our resolution is

$$0 \to \Lambda_N\Gamma \otimes \mathbb{Z}\Gamma \xrightarrow{\ \partial\ } \Lambda_{N-1}\Gamma \otimes \mathbb{Z}\Gamma \xrightarrow{\ \partial\ } \cdots \xrightarrow{\ \partial\ } \Lambda_0\Gamma \otimes \mathbb{Z}\Gamma \xrightarrow{\ \epsilon\ } \mathbb{Z} \to 0,$$

where tensor products are over $\mathbb{Z}$ and the $\mathbb{Z}\Gamma$-action on $\Lambda_r\Gamma \otimes \mathbb{Z}\Gamma$ is trivial on $\Lambda_r\Gamma$ and is the permutation representation on $\mathbb{Z}\Gamma$. The maps $\partial$ are defined as follows. We may regard $\mathbb{Z}\Gamma$ as Laurent polynomials in $N$ variables $\{t_1, \ldots, t_N\}$ with integer coefficients. Addition in $\Gamma$ then corresponds to multiplication of Laurent polynomials. Then $\partial$ is the unique $\mathbb{Z}\Gamma$-linear derivation of degree 1 determined by $\partial(e_i) = (t_i - 1)$, and $\epsilon(t_i) = 1$. Given a $\Gamma$-module $M$, then $H_*(\Gamma, M)$, the homology of the group $\Gamma$ with coefficients in $M$, is defined as the homology of the complex

$$0 \to \Lambda_N\Gamma \otimes \mathbb{Z}\Gamma \otimes_\Gamma M \xrightarrow{\ \partial \otimes 1\ } \cdots \xrightarrow{\ \partial \otimes 1\ } \Lambda_0\Gamma \otimes \mathbb{Z}\Gamma \otimes_\Gamma M \to 0,$$

where, for two $\Gamma$-modules $M_1$, $M_2$, $M_1 \otimes_\Gamma M_2$ is the quotient of the algebraic tensor product (over $\mathbb{Z}$) $M_1 \otimes M_2$ by the relations $\gamma \cdot m_1 \otimes m_2 = m_1 \otimes \gamma \cdot m_2$.

Remark 59. An easy exercise in the definitions shows that $H_k(\Gamma, \mathbb{Z}\Gamma)$ is trivial for all $k > 0$ and is equal to $\mathbb{Z}$ for $k = 0$. More generally, suppose that $\Gamma = G \oplus H$ and let us compute $H_*(\Gamma, \mathbb{Z}H)$, where $\mathbb{Z}H$ is the free $\mathbb{Z}$-module generated by $H$ made into a $\Gamma$-module by the action $(g \oplus h) \cdot h' = h + h'$. Then we can identify

$$\Lambda_k\Gamma \otimes \mathbb{Z}\Gamma \otimes_\Gamma \mathbb{Z}H \cong \bigoplus_{i+j=k} \Lambda_i G \otimes \Lambda_j H \otimes \mathbb{Z}H \qquad (6)$$

and under this identification $\partial \otimes 1$ becomes $(-1)^{\deg} \otimes \partial'$, where $\partial'$ is the boundary operator for the homology of $H$. It follows that

$$H_k(\Gamma, \mathbb{Z}H) \cong \bigoplus_{i+j=k} \Lambda_i G \otimes H_j(H, \mathbb{Z}H) = \Lambda_k G.$$

As a special case, $H_k(\Gamma, \mathbb{Z}) = \Lambda_k\Gamma \cong \mathbb{Z}^{\binom{N}{k}}$.
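As a small worked instance of the resolution and of the special case above, one may take $N = 2$. The computation below is a sketch under one assumption that the text does not spell out: the sign convention for the derivation is taken to be the graded Leibniz rule stated in the first comment.

```latex
% Worked check for \Gamma \cong \mathbb{Z}^2.
% Assumed convention: \partial(a \wedge b) = \partial(a)\, b - a\, \partial(b) for a of degree 1.
\begin{align*}
\partial(e_1) &= t_1 - 1, \qquad \partial(e_2) = t_2 - 1,\\
\partial(e_1 \wedge e_2) &= (t_1 - 1)\, e_2 - (t_2 - 1)\, e_1,\\
\partial^2(e_1 \wedge e_2) &= (t_1 - 1)(t_2 - 1) - (t_2 - 1)(t_1 - 1) = 0.
\end{align*}
% With trivial coefficients M = \mathbb{Z} every t_i acts as 1, so \partial \otimes 1 = 0 and
% H_k(\Gamma, \mathbb{Z}) \cong \Lambda_k\Gamma \cong \mathbb{Z}^{\binom{2}{k}},
% that is, ranks 1, 2, 1 for k = 0, 1, 2.
```

The vanishing of $\partial \otimes 1$ for trivial coefficients is what makes the homology equal to the exterior algebra itself in the special case $H_k(\Gamma, \mathbb{Z}) = \Lambda_k\Gamma$.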
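Now let $\epsilon : \mathbb{Z}H \to \mathbb{Z}$ be the $\mathbb{Z}\Gamma$ module homomorphism given by the sum of the coefficients, i.e. $\epsilon[h] = 1$ for all $h \in H$. We shall need the following lemma later.
Lemma 60. Under the identifications $H_*(\Gamma, \mathbb{Z}H) \cong \Lambda_*G$ and $H_*(\Gamma, \mathbb{Z}) \cong \Lambda_*\Gamma$ the induced map $\epsilon_k : H_k(\Gamma, \mathbb{Z}H) \to H_k(\Gamma, \mathbb{Z})$ becomes the embedding $\Lambda_kG \hookrightarrow \Lambda_k\Gamma$.
Proof. Using the decomposition (6) the induced map
$$\epsilon_k : \bigoplus_{i+j=k} \Lambda_iG \otimes H_j(H, \mathbb{Z}H) \to \bigoplus_{i+j=k} \Lambda_iG \otimes H_j(H, \mathbb{Z})$$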
preserves the bidegree and must be the identity on the first factors in the tensor products. Since $H_k(H, \mathbb{Z}H)$ is trivial whenever $k \neq 0$ and one dimensional for $k = 0$, $\epsilon_k$ can be determined by evaluating $\epsilon_0$ on the generator of $H_0(H, \mathbb{Z}H)$; the result follows.
The basic tool in the calculations below is the following. Whenever we have a short exact sequence of $\mathbb{Z}\Gamma$-modules $0 \to A \xrightarrow{\psi} B \xrightarrow{\varphi} C \to 0$ we get a long exact sequence of homology groups
$$\cdots \xrightarrow{\varphi_{k+1}} H_{k+1}(\Gamma, C) \xrightarrow{\gamma_{k+1}} H_k(\Gamma, A) \xrightarrow{\psi_k} H_k(\Gamma, B) \xrightarrow{\varphi_k} H_k(\Gamma, C) \cdots.$$
The maps $\varphi_k$ and $\psi_k$ are the induced homomorphisms and the $\gamma_k$ are the connecting homomorphisms. For details see [29].
6.2. A CW-like complex. Let $\mathcal{C}'$ be an arbitrary countable collection of affine hyperplanes of $F$, a linear space, and define $\mathcal{C}'$-topes as before: compact polytopes which are the closures of their interiors and whose boundary faces belong to hyperplanes from $\mathcal{C}'$. For $n$ at most the dimension of $F$ let $C^n_{\mathcal{C}'}$ be the $\mathbb{Z}$-module generated by the $n$-dimensional faces of convex $\mathcal{C}'$-topes satisfying the relations $[U_1] + [U_2] = [U_1 \cup U_2]$ for any two faces $U_1$, $U_2$ for which $U_1 \cup U_2$ is again a convex face and $U_1 \cap U_2$ has no interior (i.e. nonzero codimension in $U_1$). These relations then imply $[U_1] + [U_2] = [U_1 \cup U_2] + [U_1 \cap U_2]$ if $U_1 \cap U_2$ has interior. If we take $\mathcal{C}' = \mathcal{C}$, our collection of singular planes from Sect. 3, then $C^n := C^n_{\mathcal{C}}$ is a $\mathbb{Z}\Gamma$ module under the action $\gamma \cdot [U] = [U + \gamma]$. As $\mathbb{Z}\Gamma$-modules, $C^{d^\perp} \cong C_c(E_c^\perp, \mathbb{Z})$, the isomorphism being given by assigning to $[U]$
the indicator function on the closure of $U \cap NS$ (which is clopen). Moreover, $C^0$ is a free $\mathbb{Z}\Gamma$-module with basis in one to one correspondence with the intersection points P.
Proposition 61. There exist $\Gamma$-equivariant module maps $\delta$ and $\epsilon$ such that
$$0 \to C^{d^\perp} \xrightarrow{\delta} C^{d^\perp - 1} \xrightarrow{\delta} \cdots\ C^0 \xrightarrow{\epsilon} \mathbb{Z} \to 0 \tag{7}$$
is an exact sequence of $\Gamma$-modules and $\epsilon[U] = 1$ for all vertices $U$ of $\mathcal{C}$-topes.
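Proof. Let $I$ be the indexing set for $\Gamma$ orbit classes of singular planes. For a subset $R$ of $\Gamma$ (which we identified with $\Gamma^\perp \subset E^\perp$) let $\mathcal{C}_R := \{H_i + r \mid r \in R, i \in I\}$ and $S_R = \{x \in H \mid H \in \mathcal{C}_R\}$. Let $\mathcal{R}$ be the set of subsets $R \subset \Gamma$ such that all connected components of $E^\perp \setminus S_R$ are bounded and have interior. $\mathcal{R}$ is closed under union and hence forms an upper directed system under inclusion. For any $R \in \mathcal{R}$, the $\mathcal{C}_R$-topes define a regular polytopal CW-complex
$$0 \to C^{d^\perp}_{\mathcal{C}_R} \xrightarrow{\delta_R} C^{d^\perp - 1}_{\mathcal{C}_R} \xrightarrow{\delta_R} \cdots\ C^0_{\mathcal{C}_R} \to 0, \tag{8}$$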
with boundary operators $\delta_R$ depending on the choices of orientations for the $n$-cells ($n > 0$) [35]. Moreover, this complex is acyclic ($E^\perp$ is contractible), i.e. upon replacing $C^0_{\mathcal{C}_R} \to 0$ by $C^0_{\mathcal{C}_R} \xrightarrow{\epsilon_R} \mathbb{Z} \to 0$, where $\epsilon_R[U] = 1$, (8) becomes an exact sequence. Let us constrain the orientation of the $n$-cells in the following way: for each $n < d^\perp$ there are finitely many subsets $J \subset I$ such that $\dim \bigcap_{i \in J} H_i = n$ and $J$ is maximal. Each $n$-cell belongs to a subspace parallel to one of the $\bigcap_{i \in J} H_i$ and we choose its orientation such that it depends only on the corresponding $J$ (i.e. we choose an orientation for $\bigcap_{i \in J} H_i$ and then the cell inherits it as a subset). By the same principle, all $d^\perp$-cells are supposed to have the same orientation. Then the cochains and boundary operators $\delta_R$ share two crucial properties: first, if $R \subset R'$ for $R, R' \in \mathcal{R}$, then we may identify $C^n_{\mathcal{C}_R}$ with a submodule of $C^n_{\mathcal{C}_{R'}}$ and under this identification $\delta_R(x) = \delta_{R'}(x)$ for all $x \in C^n_{\mathcal{C}_R}$, and second, if $U$ and $U + x$ are $\mathcal{C}_R$-topes then $\delta_R[U + x] = \delta_R[U] + x$. The first property implies that the directed system $\mathcal{R}$ gives rise to a directed system of acyclic cochain complexes, and hence its direct limit is an acyclic complex, and the second implies, together with the fact that for all $\gamma \in \Gamma$ and $R \in \mathcal{R}$ also $R + \gamma \in \mathcal{R}$, that this complex becomes a complex of $\Gamma$-modules. The statement now follows since $C^n$ is the direct limit of the $C^n_{\mathcal{C}_R}$ for all $n$.
6.3. Solutions for $d^\perp = 1, 2$. Based on the results of the last two sections we now calculate the homology groups $H_k(\Gamma, C^{d^\perp})$ for $d^\perp = 1, 2$.
Lemma 62. Given a CW-like complex as in Sect. 6.2,
$$H_k(\Gamma, C^0) = \begin{cases} 0 & \text{for } k > 0, \\ \mathbb{Z}^L & \text{for } k = 0, \end{cases} \tag{9}$$
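where $L$ is the number of $\Gamma$-orbits of vertices of $\mathcal{C}$-topes, i.e. L = |0(P)|.
Proof. Since $\Gamma$ acts fixpoint-freely we have $\Lambda\Gamma \otimes \mathbb{Z}\Gamma \otimes_\Gamma C^0 \cong \Lambda\Gamma \otimes \mathbb{Z}\Gamma \otimes \mathbb{Z}^L$ which directly implies the result.
Theorem 63. Let $T$ be a $d$-dimensional canonical projection tiling of codimension 1. Then
$$H^{d-k}(T) \cong \begin{cases} \mathbb{Z}^{\binom{d+1}{k+1}} & \text{for } k > 0, \\ \mathbb{Z}^{d+L} & \text{for } k = 0. \end{cases}$$
Proof. In the case $d^\perp = 1$, (7) is the short exact sequence
$$0 \to C^1 \xrightarrow{\delta} C^0 \xrightarrow{\epsilon} \mathbb{Z} \to 0 \tag{10}$$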
and we use the resulting long exact sequence of homology groups for the computation. By the last lemma, apart from the lowest degree every third homology group in that sequence is trivial so that $H_k(\Gamma, C^1) \cong H_{k+1}(\Gamma, \mathbb{Z})$ for $k > 0$. The remaining part of the sequence has the form $0 \to \mathbb{Z}^{d+1} \to H_0(\Gamma, C^1) \to \mathbb{Z}^L \to \mathbb{Z} \to 0$ and hence $H_0(\Gamma, C^1) = \mathbb{Z}^{d+L}$ as claimed. Note that at this stage (for very low codimension) we did not need to know explicitly the morphisms involved.
Recall the description of the topology of $E^\perp$ for canonical projection tilings by singular planes. These planes were organized in $\Gamma$-orbits, indexed by a finite set $I$, and we choose representatives $H_\alpha$, for each $\alpha \in I$.
Theorem 64. Let $T$ be a $d$-dimensional canonical projection tiling of codimension 2. Then
$$H^{d-k}(T) \cong \begin{cases} \mathbb{Z}^{\binom{d+2}{k+2} - r_k - r_{k+1} + \sum_{\alpha \in I} \binom{\nu_\alpha}{k+1}} & \text{for } k > 0, \\ \mathbb{Z}^{\sum_{\alpha \in I} (\nu_\alpha + l_\alpha - 1) + \binom{d+2}{2} - d - L - 1 - r_1} & \text{for } k = 0, \end{cases} \tag{11}$$
where $\nu_\alpha$ is the rank of $\Gamma^\alpha$ (the stabilizer of $H_\alpha$), $l_\alpha$ the number of $\Gamma^\alpha$-orbits of intersection points in $H_\alpha$, and $r_k$ the rank of the module generated by the submodules $\Lambda_{k+1}\Gamma^\alpha \subset \Lambda_{k+1}\Gamma$ for all $\alpha \in I$.
Proof. Inserting $C^0_0 := \delta(C^1)$ we break the exact sequence (7) into two short exact ones,
$$0 \to C^2 \xrightarrow{\delta} C^1 \xrightarrow{\delta} C^0_0 \to 0 \qquad \text{and} \qquad 0 \to C^0_0 \hookrightarrow C^0 \xrightarrow{\epsilon} \mathbb{Z} \to 0.$$
The sequence $0 \to C^0_0 \to C^0 \to \mathbb{Z} \to 0$ can be treated as in the codimension 1 case. Taking into account that the rank of $\Gamma$ is $d + 2$ one gets
$$H_k(\Gamma, C^0_0) \cong \begin{cases} \mathbb{Z}^{\binom{d+2}{k+1}} & \text{for } k > 0, \\ \mathbb{Z}^{d+L+1} & \text{for } k = 0. \end{cases} \tag{12}$$
Let us have a closer look at $C^1$. For $n$ at most 1 let $C^n_\alpha$ be the sub-module of $C^n$ generated by the $n$-dimensional faces which belong to $H_\alpha$, $\alpha \in I$. As before we denote by $\Gamma^\alpha$ the stabilizer of $H_\alpha$ and we let $\hat\Gamma^\alpha$ be a complementary subgroup, i.e. $\Gamma = \Gamma^\alpha \oplus \hat\Gamma^\alpha$ (recall that $\Gamma / \Gamma^\alpha$ has no torsion). Then
$$C^1 \cong \bigoplus_{\alpha \in I} C^1_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha, \tag{13}$$
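because any 1-dimensional face belongs to a translate of some $H_\alpha$. Moreover the action of $\Gamma^\alpha \oplus \hat\Gamma^\alpha$ on $C^1$ is such that the first summand acts non-trivially only on the first factors, $C^1_\alpha$, and the second only on the second factors, $\mathbb{Z}\hat\Gamma^\alpha$. In particular, $\mathbb{Z}\Gamma \otimes_\Gamma C^1 \cong \bigoplus_{\alpha \in I} \mathbb{Z}\Gamma^\alpha \otimes_{\Gamma^\alpha} C^1_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha$ as $\Gamma$-modules which implies
$$H_*(\Gamma, C^1) \cong \bigoplus_{\alpha \in I} H_*(\Gamma^\alpha, C^1_\alpha). \tag{14}$$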
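Restricting the boundary maps $\delta$ and $\epsilon$ to $C^n_\alpha$ we get a short exact sequence
$$0 \to C^1_\alpha \xrightarrow{\delta_\alpha} C^0_\alpha \xrightarrow{\epsilon_\alpha} \mathbb{Z} \to 0. \tag{15}$$
As in Theorem 63 and combined with Eq. (14) we obtain
$$H_k(\Gamma, C^1) \cong \begin{cases} \mathbb{Z}^{\sum_{\alpha \in I} \binom{\nu_\alpha}{k+1}} & \text{for } k > 0, \\ \mathbb{Z}^{\sum_{\alpha \in I} (\nu_\alpha + l_\alpha - 1)} & \text{for } k = 0, \end{cases} \tag{16}$$
where $\nu_\alpha$ and $l_\alpha$ are as defined in the statement of the theorem. Note that the $l_\alpha$ are all finite since we required $L$ to be finite. Equations (12, 16) give us part of the information needed to determine $H_*(\Gamma, C^2)$ from the exact sequence
$$0 \to C^2 \xrightarrow{\delta} C^1 \xrightarrow{\delta} C^0_0 \to 0, \tag{17}$$
but we have to determine explicitly one further morphism since we no longer have enough trivial groups in the resulting long exact sequence. We shall determine the induced morphism
$$\beta_* := \delta_* : H_*(\Gamma, C^1) \to H_*(\Gamma, C^0_0). \tag{18}$$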
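Consider the following commutative diagram:
$$\begin{array}{ccccccccc} 0 & \to & C^1_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha & \xrightarrow{\delta_\alpha \otimes 1} & C^0_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha & \xrightarrow{\epsilon_\alpha \otimes 1} & \mathbb{Z}\hat\Gamma^\alpha & \to & 0 \\ & & \downarrow {\scriptstyle \delta_\alpha \otimes 1} & & \downarrow & & \downarrow {\scriptstyle \epsilon^\alpha} & & \\ 0 & \to & C^0_0 & \hookrightarrow & C^0 & \xrightarrow{\epsilon} & \mathbb{Z} & \to & 0 \end{array}$$
where the middle vertical arrow is the inclusion, the right vertical arrow the sum of the coefficients, $\epsilon^\alpha[\gamma] = 1$, and the left vertical arrow the map of interest. In fact, $\beta_k$ is the direct sum over all $\alpha$ of $(\delta_\alpha \otimes 1)_k : H_k(\Gamma, C^1_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha) \to H_k(\Gamma, C^0_0)$. This diagram gives rise to two long exact sequences of homology groups together with vertical maps, all commuting, $(\delta_\alpha \otimes 1)_*$ being one of them. Now use that for $k > 0$, $H_k(\Gamma, C^0_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha) = H_k(\Gamma, C^0) = 0$ so that we can express $(\delta_\alpha \otimes 1)_*$ through $\epsilon^\alpha_*$. In fact, the triviality of these groups implies that $H_k(\Gamma, C^1_\alpha \otimes \mathbb{Z}\hat\Gamma^\alpha) \cong H_{k+1}(\Gamma, \mathbb{Z}\hat\Gamma^\alpha)$ and $H_k(\Gamma, C^0_0) \cong H_{k+1}(\Gamma, \mathbb{Z})$, for $k > 0$, and with these identifications
$$(\delta_\alpha \otimes 1)_k = \epsilon^\alpha_{k+1}.$$
By Lemma 60 the map $\epsilon^\alpha_k$ becomes the embedding $\Lambda_k\Gamma^\alpha \hookrightarrow \Lambda_k\Gamma$ under the above identifications. For $k > 0$ therefore, the rank of $\beta_k$ is equal to the rank of the span of the submodules $\Lambda_{k+1}\Gamma^\alpha$, $\alpha \in I$, in $\Lambda_{k+1}\Gamma$, the number defined as $r_k$ in the statement of the theorem. The long exact sequence corresponding to (17) implies
$$H_k(\Gamma, C^2) \cong H_{k+1}(\Gamma, C^0_0) / \operatorname{im} \beta_{k+1} \oplus H_k(\Gamma, C^1) \cap \ker \beta_k.$$
Since, for $k > 0$, $\dim H_k(\Gamma, C^1) \cap \ker \beta_k = \dim H_k(\Gamma, C^1) - r_k$ we get the desired result (the case $k = 0$ is similar), provided the homology groups are torsion free. That this is the case we know from [12].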
6.4. Example: octagonal tilings. We provide here one example, the octagonal tilings. A whole list of results for codimension 2 tilings could be obtained by evaluating (11) with a computer [36]. The (undecorated) octagonal tilings are two dimensional tilings which may be constructed from the data $(\mathbb{Z}^4, 0, E)$, the four dimensional integer lattice $\mathbb{Z}^4$ (with standard basis $\{e_i\}_{i=1,\ldots,4}$) and the two dimensional invariant subspace of the eightfold symmetry $C_8$: $e_i \to e_{i+1}$ for $i = 1, 2, 3$ and $e_4 \to -e_1$ (the group $C_8$ acts as rotation by $\pi/4$) [37, 38]. The tiling consists of squares and $45^\circ$-rhombi, all edges having equal length. $E^\perp$ is, of course, also an invariant subspace of the eightfold symmetry and the singular planes (which are lines) are well known: they are the tangents to the boundary faces of the projection of the unit cube into $E^\perp$, which is a regular octagon. They are translates under $\pi^\perp(\mathbb{Z}^4)$ of the four lines spanned by $e_i^\perp$ which form an orbit under $C_8$ (we may ignore the shift by $\delta$). From these lines we get all our information, the numbers $L$, $\nu_i$, $l_i$, $I = \{1, \ldots, 4\}$, and $r_1, r_2, r_3$ (higher $r_k$ are unnecessary since $d = 2$). Usually it is not so easy to determine $L$ but in our case it is easy to see that apart from the orbit of the intersection point at 0 there are only two other ones: the orbit of $\frac{1}{\sqrt{2}}(e_1^\perp + e_3^\perp)$ and that of $\frac{1}{\sqrt{2}}(e_2^\perp + e_4^\perp)$. Hence $L = 3$. Clearly, $\Gamma^1$ is spanned by $e_1^\perp$ and $e_2^\perp - e_4^\perp$ and hence $\nu_1 = 2$ and $l_1 = 2$, which carries over to all $i$ by symmetry. Finally, $r_1 = 3$ and $r_k = 0$ for $k \ge 2$ as $\nu_i = 2$. Inserting the numbers yields
$$H^0(T) = \mathbb{Z}, \qquad H^1(T) = \mathbb{Z}^5, \qquad H^2(T) = \mathbb{Z}^9.$$
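The arithmetic of this example can be checked with a short sketch. It assumes that formula (11) of Theorem 64 reads: for k > 0 the rank of the cohomology in degree d-k is C(d+2, k+2) - r_k - r_{k+1} + sum over alpha of C(nu_alpha, k+1), and for k = 0 it is sum over alpha of (nu_alpha + l_alpha - 1) + C(d+2, 2) - d - L - 1 - r_1. The function name and argument layout are hypothetical, and the sketch is not the program referred to in [36].

```python
from math import comb


def codim2_cohomology_ranks(d, L, nu, l, r):
    """Ranks of H^{d-k}(T), k = 0..d, from the assumed reading of formula (11).

    d  : dimension of the tiling
    L  : number of orbits of intersection points
    nu : list of ranks nu_alpha of the stabilizers, one per alpha in I
    l  : list of orbit counts l_alpha, one per alpha in I
    r  : dict k -> r_k (missing keys are treated as 0)
    Returns a dict {cohomological degree: rank}.
    """
    ranks = {}
    for k in range(d + 1):
        if k == 0:
            rank = (sum(n + m - 1 for n, m in zip(nu, l))
                    + comb(d + 2, 2) - d - L - 1 - r.get(1, 0))
        else:
            rank = (comb(d + 2, k + 2) - r.get(k, 0) - r.get(k + 1, 0)
                    + sum(comb(n, k + 1) for n in nu))
        ranks[d - k] = rank
    return ranks


# Octagonal tiling data quoted in the text: d = 2, L = 3, four lines with
# nu_i = 2 and l_i = 2, r_1 = 3 and r_k = 0 for k >= 2.
octagonal = codim2_cohomology_ranks(d=2, L=3, nu=[2] * 4, l=[2] * 4, r={1: 3})

if __name__ == "__main__":
    for degree in sorted(octagonal):
        print(f"rank H^{degree}(T) = {octagonal[degree]}")
```

With the octagonal data the sketch gives ranks 1, 5 and 9 in degrees 0, 1 and 2, in agreement with the groups stated above.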
This result is in agreement with a calculation we made using Anderson and Putnam's method [9] for substitution tilings: the octagonal tiling is also substitutional, its substitution is given in Fig. 1 of Sect. 5.1.
7. The Non-Commutative Approach
We conclude by connecting the cohomology of a tiling, as we have been discussing, with its non-commutative topological invariants. The starting point of the non-commutative approach is the observation that the orbit spaces of the dynamical systems arising from the tiling are non-Hausdorff. In fact, for a (completely) non-periodic tiling $T$, no two points in $MT/\mathbb{R}^d$ can be separated by open neighbourhoods. Connes' non-commutative geometry was motivated by the desire to analyse such spaces. In the non-commutative topological approach [39] one studies the properties of the (non-commutative) $C^*$-algebra associated with the dynamical system $(MT, \mathbb{R}^d)$. This algebra is the crossed product algebra of $C(MT)$, the algebra of continuous functions over $MT$, with the group $\mathbb{R}^d$. We denote it by $C(MT) \times \mathbb{R}^d$. Topologically, this algebra may be described by its K-theory [40, 41]. It turns out that the K-groups are closely related to the Čech cohomology of $MT$. The K-groups, however, contain additional information in the form of a natural order structure on the $K_0$-group and this is the advantage of the non-commutative approach. We have seen in Example 46 that cohomology without extra structure is not a very fine invariant. Equally well mathematically, but from a more physically motivated point of view, we can work with the formulation of the quotient $MT/\mathbb{R}^d$ as the space of orbits of the tiling groupoid GT (or of GT). The $C^*$-algebra whose K-theory provides the non-commutative topological invariant is then the corresponding groupoid-$C^*$-algebra [26, 15]. The importance of this groupoid $C^*$-algebra for physical systems lies in the fact that it provides an abstract definition of the algebra of observables [15, 10] for particles
moving in the tiling; the scaled ordered $K_0$-group and its image under a tracial state govern the gap labelling. If $T$ is a canonical projection tiling, GT and GT are equivalent in the sense of Muhly et al. to the transformation groupoid $G(X, G_1)$. This is proven directly in [16] but it also follows from our analysis of Sect. 3.5 where similarity of the two groupoids has been shown. By application of the theory of Muhly et al. [27] we obtain
Theorem 65. The K-groups of $C(MT) \times \mathbb{R}^d$ and of the groupoid-$C^*$-algebras of GT and of $G(X, G_1)$ are isomorphic, the isomorphism preserving the order on the $K_0$-group.
The isomorphism between the first two K-groups was already observed in [9]. Of particular importance for the present case is the following relationship between K-theory and cohomology proved in [12]: if $(X, \mathbb{Z}^d)$ is a minimal $\mathbb{Z}^d$-dynamical system where $X$ is homeomorphic to the Cantor set then
$$K_i(C(X) \times \mathbb{Z}^d) \cong \bigoplus_j H^{d-i+2j}(\mathbb{Z}^d, C(X, \mathbb{Z}))$$
as unordered groups. Thus, in view of Corollary 41,
Corollary 66. For a canonical projection tiling $T$,
$$K_i(C^*(GT)) \cong \bigoplus_j H^{d-i+2j}(T)$$
as unordered groups.
It is an interesting question whether this result is true for finite type tilings in general. As already mentioned, the isomorphism of the corollary neglects the information contained in the order structure on the $K_0$-group. One can remedy this at least partly by looking at the order on $H^d(T)$, the group of coinvariants, which is induced by the unique invariant probability measure on $\Omega_T$ (the dynamical system $(MT, \mathbb{R}^d)$ is uniquely ergodic). That measure defines a group homomorphism $C_c(E_c^\perp, \mathbb{Z}) \to \mathbb{R}$ which by invariance induces a homomorphism $\tau : H^d(T) \to \mathbb{R}$. The subset $\tau^{-1}(\mathbb{R}^{>0})$ is closed under addition and defines a positive cone of $H^d(T)$ which sits inside the positive cone of $K_0(C^*(GT))$ and already contains a good portion of the information, including that needed for the standard gap-labelling. In fact, for $d = 1$, where $H^1(T) = K_0(C^*(GT))$, this order is precisely the order defined on the $K_0$-group in the standard way [40]. With this information at hand let us come back to Example 46, the canonical projection tiling with data $W = \mathbb{Z}^2$, $w = 0$, $d = 1$, and $E$ specified by an irrational number $\nu$. To keep track of this dependence we write $T^{(\nu)}$ for a canonical projection tiling obtained from such data. The unique invariant probability measure on $\Omega_{T^{(\nu)}}$ is the pull back under $\mu$ of the Lebesgue measure on $E^\perp$ normalized in such a way that $\pi^\perp(\gamma)$ (the projection of the unit cell) has measure 1. From this we see that with $[1_{[a,b]}]$ denoting the coinvariant class of $1_{[a,b]}$,
$$\tau([1_{[a,b]}]) = \frac{b - a}{1 + \nu}.$$
In particular, the rank of $\tau(H^1(T^{(\nu)}))$ is 2 and hence $H^1(T^{(\nu)}) \cong \mathbb{Z}^2$. Now, $\tau(n[1_{[0,1]}] + m[1_{[0,\nu]}]) > 0$, for $n, m \in \mathbb{Z}$, whenever $(n, m)$ has positive scalar product with $(1, \nu)$ and hence belongs to the upper right half space defined by $E^\perp$ in $\mathbb{R}^2$. It follows that $K_0(GT^{(\nu)})$
is order isomorphic to $K_0(GT^{(\nu')})$ whenever there exists a matrix $M \in GL(2, \mathbb{Z})$ such that $\nu' = \frac{M_{11}\nu + M_{12}}{M_{21}\nu + M_{22}}$. Note that in the above cases $\tau$ is injective. We remark without further explanation that the order unit improves the invariant even more. $K_0(GT^{(\nu)})$ and $K_0(GT^{(\nu')})$ are order isomorphic with isomorphism preserving the order unit if and only if $\nu' = \pm\nu$. Returning to Example 47, the canonical projection tiling with data $W = \mathbb{Z}^3$, $w = 0$, $d = 1$, the unique invariant probability measure on $\Omega_T$ is again the pull back under $\mu$ of the Lebesgue measure on $E^\perp$ normalized in such a way that $\pi^\perp(\gamma)$ has measure 1. Thus all the elements $[1_{U + \lambda_1 + \lambda_2}] - [1_U]$ are mapped to 0 by $\tau$. In fact, one can show that the image of $\tau$ is finitely generated so that in this case all but finitely many generators of the $K_0$-group are neither positive nor negative, i.e. almost all are infinitesimal.
Acknowledgements. The third author thanks F. Gähler for helpful discussions. The collaboration of the first two authors was initiated by the William Gordon Seggie Brown Fellowship at The University of Edinburgh, Scotland, and was further supported by a Collaborative Travel Grant from the British Council and the Research Council of Norway with the generous assistance of The University of Leicester, England, and the EU Network “Non-commutative Geometry” at NTNU Trondheim, Norway. The collaboration of the first and third authors was supported by the Sonderforschungsbereich 288, “Differentialgeometrie und Quantenphysik” at TU Berlin, Germany, and by the EU Network and NTNU Trondheim. The first author is supported while at NTNU Trondheim, as a post-doctoral fellow of the EU Network and the third author is supported by the Sfb288 at TU Berlin. All three authors are most grateful for the financial help received from these various sources.
References
1. Steinhardt, P.J. and Ostlund, S.: The Physics of Quasicrystals. Singapore: World Scientific, 1987
2. Janot, C. and Mosseri, R.: Proc. 5th Int. Conf. on Quasicrystals. Singapore: World Scientific, 1995
3. Axel, F. and Gratias, D.: Beyond Quasicrystals. Berlin–Heidelberg–New York: Springer, 1995
4. Moody, R.V.: The Mathematics of Long Range Aperiodic Order. Dordrecht: Kluwer, 1997
5. Duneau, M. and Katz, A.: Quasiperiodic patterns and icosahedral symmetry. J. Physique 47, 181–196 (1986)
6. Oguey, C., Katz, A. and Duneau, M.: A geometrical approach to quasiperiodic tilings. Commun. Math. Phys. 118, 99–118 (1988)
7. de Bruijn, N.G.: Algebraic theory of Penrose’s nonperiodic tilings of the plane. Kon. Nederl. Akad. Wetensch. Proc. Ser. A84, 38–66 (1981)
8. Kramer, P. and Schlottmann, M.: Dualisation of Voronoi domains and Klotz construction: a general method for the generation of quasiperiodic tilings. J. Phys. A 22, L1097 (1989)
9. Anderson, J.E. and Putnam, I.F.: Topological invariants for substitution tilings and their associated C∗-algebras. Ergod. Th. and Dynam. Sys. 18, 509–537 (1998)
10. Kellendonk, J.: The local structure of tilings and their integer group of coinvariants. Commun. Math. Phys. 187(1), 115–157 (1997)
11. Le, T.T.Q.: Local rules for quasiperiodic tilings. In: R.V. Moody (ed.) The Mathematics of Long Range Aperiodic Order. Dordrecht: Kluwer, 1997, pp. 331–366
12. Forrest, A.H. and Hunton, J.: The cohomology and K-theory of commuting homeomorphisms of the Cantor set. Ergod. Th. and Dynam. Sys. 19, 611–625 (1999)
13. Bellissard, J.: Gap labelling theorems for Schrödinger’s operators. In: Waldschmidt, M., Moussa, P., Luck, J.M. and Itzykson, C. (eds.) From Number Theory to Physics. Berlin–Heidelberg–New York: Springer-Verlag, 1992, pp. 538–630
14. Bellissard, J., Bovier, A. and Ghez, J.M.: Gap labelling theorems for one dimensional discrete Schrödinger operators. Rev. Math. Phys. 4, 1–38 (1992)
15. Kellendonk, J.: Non commutative geometry of tilings and gap labelling. Rev. Math. Phys. 7, 1133–1180 (1995)
16. Forrest, A.H., Hunton, J. and Kellendonk, J.: Projection quasicrystals I: Toral rotations. SFB-preprint No. 340, 1998
17. Forrest, A.H., Hunton, J. and Kellendonk, J.: Projection quasicrystals II: Versus substitutions. SFB-preprint No. 396, 1999
18. Forrest, A.H., Hunton, J.R. and Kellendonk, J.: Projection quasicrystals III: Cohomology. SFB-preprint No. 459, 2000
19. Forrest, A.H., Hunton, J.R. and Kellendonk, J.: Topological invariants for projection method patterns. To appear in Mem. Amer. Math. Soc.
20. Schlottmann, M.: Periodic and quasi-periodic Laguerre tilings. Int. J. Mod. Phys. B 7, 1351–1363 (1993)
21. Bellissard, J., Contensou, E. and Legrand, A.L.: K-théorie des quasi-cristaux, image par la trace: le cas du réseau octogonal. C. R. Acad. Sci. Paris, Série I 326, 197–200 (1998)
22. Rudolph, D.J.: Markov tilings of Rn and representations of Rn actions. Contemporary Mathematics 94, 271–290 (1989)
23. Radin, C. and Wolff, M.: Space tilings and local isomorphism. Geom. Ded. 42, 355–360 (1992)
24. Radin, C.: The Pinwheel tilings of the plane. Annals of Math. 139, 661–702 (1994)
25. Kellendonk, J.: Topological equivalence of tilings. J. Math. Phys. 38 (4), 1823–1842 (1997)
26. Renault, J.: A Groupoid approach to C∗-Algebras. Lecture Notes in Math. 793. Berlin–Heidelberg–New York: Springer-Verlag, 1980
27. Muhly, P.S., Renault, J.N. and Williams, D.P.: Equivalence and isomorphism for groupoid C∗-algebras. J. Operator Theory 17, 3–22 (1987)
28. Renault, J.: Private communication.
29. Brown, K.S.: Cohomology of Groups. Berlin–Heidelberg–New York: Springer-Verlag, 1982
30. Baake, M., Joseph, D., Kramer, P. and Schlottmann, M.: Root lattices and quasicrystals. J. Phys. A: Math. Gen. 23, L1037–L1041 (1990)
31. Gähler, F. and Stampfli, P.: The dualisation method revisited: dualisation of product Laguerre complexes as a unifying framework. Int. J. Mod. Phys. B 7, 1333–1349 (1993)
32. Baake, M., Hermisson, J. and Pleasants, P.: The torus parametrization of quasiperiodic LI-classes. J. Phys. A 30, 3029–3056 (1997)
33. Mingo, J.A.: C∗-algebras associated with one-dimensional almost periodic tilings. Commun. Math. Phys. 183, 307–337 (1997)
34. Pleasants, P.A.B.: The construction of quasicrystals with arbitrary symmetry group. In: Janot, C. and Mosseri, R. (eds): Proc. 5th Int. Conf. on Quasicrystals. Singapore: World Scientific, 1995, pp. 22–30
35. Massey, W.S.: A Basic Course in Algebraic Topology. Berlin–Heidelberg–New York: Springer-Verlag, 1991
36. Gähler, F. and Kellendonk, J.: Cohomology groups for projection tilings of codimension 2. Material Science and Engineering 294–296, 438–440 (2000)
37. Beenker, F.P.M.: Algebraic theory of non-periodic tilings of the plane by two simple building blocks: A square and a rhombus. Thesis, Techn. Univ. Eindhoven, TH-report 82-WSK-04, 1982
38. Socolar, J.E.S.: Simple octagonal and dodecagonal quasicrystals. Phys. Rev. B 39 (15), 10519–10551 (1989)
39. Connes, A.: Non Commutative Geometry. London–New York: Academic Press, 1994
40. Blackadar, B.: K-Theory for Operator Algebras. MSRI Publications 5. Berlin–Heidelberg–New York: Springer-Verlag, 1986
41. Wegge-Olsen, N.E.: K-theory and C∗-algebras. A friendly approach. Oxford: Oxford University Press, 1993
Communicated by H. Araki
Commun. Math. Phys. 226, 323 – 375 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Asymmetric Diffusion and the Energy Gap Above the 111 Ground State of the Quantum XXZ Model Pietro Caputo, Fabio Martinelli Dip. Matematica, Universita’ di Roma Tre, L.go S. Murialdo 1, 00146 Roma, Italy. E-mail: [email protected]; [email protected] Received: 9 August 2001 / Accepted: 29 October 2001
Abstract: We consider the anisotropic three dimensional XXZ Heisenberg ferromagnet in a cylinder with axis along the 111 direction and boundary conditions that induce ground states describing an interface orthogonal to the cylinder axis. Let $L$ be the linear size of the basis of the cylinder. Because of the breaking of the continuous symmetry around the $\hat z$ axis, the Goldstone theorem implies that the spectral gap above such ground states must tend to zero as $L \to \infty$. In [3] it was proved that, by perturbing the interface ground state in a sub-cylinder with basis of linear size $R \ll L$, it is possible to construct excited states whose energy gap shrinks as $R^{-2}$. Here we prove that, uniformly in the height of the cylinder and in the location of the interface, the energy gap above the interface ground state is bounded from above and below by const. $L^{-2}$. We prove the result by first mapping the problem into an asymmetric simple exclusion process on $\mathbb{Z}^3$ and then by adapting to the latter the recursive analysis to estimate from below the spectral gap of the associated Markov generator developed in [7]. Along the way we improve some bounds on the equivalence of ensembles already discussed in [3] and we establish an upper bound on the density of states close to the bottom of the spectrum.
1. Introduction
In recent years there has been a great deal of investigation of the anisotropic spin-$\frac{1}{2}$ Heisenberg model defined by
$$H^{XXZ}_\Lambda = -\sum_{x,y \in \Lambda:\ |x-y|=1} \Big[ \frac{1}{\Delta} \big( S^1_x S^1_y + S^2_x S^2_y \big) + S^3_x S^3_y \Big] + \text{boundary conditions}, \tag{1.1}$$
where $\Lambda \subset \mathbb{Z}^d$ and $\Delta > 1$ measures the anisotropy. Sometimes the parameter $\Delta$ is expressed as $\Delta = (q + q^{-1})/2$, $0 < q < 1$, and the classical Ising model is recovered in the limit $q \to 0$. We refer the reader in particular to [1, 22, 14–16, 20, 3, 23].
As is well known, the XXZ model has two ferromagnetically ordered translation invariant ground states, but also ground states that describe domain walls between regions of opposite sign of the spins. More precisely, for $d \ge 3$ and using a quantum version of the Pirogov–Sinai theory [6], it is possible to prove the existence of low temperature states describing an interface orthogonal to the 001 direction (a kind of Dobrushin state), provided that $\Delta$ is large enough. Quite surprisingly, and this is one of the main reasons for the increasing interest in such a model, the anisotropy is able, under certain circumstances, to stabilize a domain wall against quantum fluctuations even when, classically, thermal fluctuations are too strong to allow for a stable interface. This is indeed the case for the so-called 11, 111, . . . diagonal interfaces. The Ising model is not expected to have a Gibbs state describing a diagonal interface at low temperature because the zero temperature configurations compatible with the natural geometry and corresponding boundary conditions are enormously degenerate. A rigorous proof of such a result is available so far only in the solid-on-solid approximation thanks to results of [13]. On the other hand it has been shown independently in [1] and [12] that an appropriate choice of the boundary conditions in (1.1) can lead to ground state selection that favours a diagonal interface. Let us be a little bit more precise. For definiteness we set $d = 3$. We then take as domain $\Lambda$ a cylinder with basis of linear size $L$, height $H$ and axis along the 111 direction. The state of the system is described by vectors in the tensor product Hilbert space $\mathcal{H} = (\mathbb{C}^2)^{\otimes |\Lambda|}$. Fix $\Delta > 1$ and define $A(\Delta) = \frac{1}{2}\sqrt{1 - \Delta^{-2}}$. Boundary conditions are then introduced as follows. We let
$$H_\Lambda = \sum_{b \in B_\Lambda} H_b, \tag{1.2}$$
where $B_\Lambda$ is the set of oriented bonds of $\mathbb{Z}^3$ inside $\Lambda$, the single bond hamiltonians $H_b$ are given by
$$H_b = -\Delta^{-1} \big( S^1_{x_b} S^1_{y_b} + S^2_{x_b} S^2_{y_b} \big) - S^3_{x_b} S^3_{y_b} + A(\Delta) \big( S^3_{y_b} - S^3_{x_b} \big) + \frac{1}{4},$$
and we write the bond $b$ as $b = (x_b, y_b)$ if $\ell_{y_b} > \ell_{x_b}$, where $\ell_x = x_1 + x_2 + x_3$, $x = (x_1, x_2, x_3)$, is a signed distance to the origin. Notice that the terms $A(\Delta)(S^3_{y_b} - S^3_{x_b})$ cancel everywhere except at the two bases of the cylinder and that the third component of the spin is a conserved quantity. The constant $\frac{1}{4}$ is there in order to have $H_b \ge 0$. The reason for the special choice of the coefficient $A(\Delta)$ comes mainly from the one dimensional system (see [14, 15, 1]). For $d = 1$ ($L = 1$ in our language) and boundary coefficient $A(\Delta)$ the system enjoys an $SU_q(2)$ quantum group symmetry and the ground state degeneracy is equal to $H + 1$. If instead we take the boundary coefficient different from $A(\Delta)$ then the degeneracy is lifted. Moreover, in complete analogy with the exact computation of the ground state wave function of (1.2) in $d = 1$, one can show that, in each sector with $\sum_x S^3_x = (2n - |\Lambda|)/2$, $n = 0, 1, \ldots, |\Lambda|$, there exists a unique ground state of $H_\Lambda$, denoted by $\psi_n$, with zero energy [1]. More precisely, with the convention that $|1\rangle$ and $|0\rangle$ stand for spin “up” and spin “down” respectively, the ground state $\psi_n$ can be written as
$$\psi_n = \sum_{\alpha \in \Omega_\Lambda:\ N_\Lambda(\alpha) = n} \psi_n(\alpha) \bigotimes_{x \in \Lambda} |\alpha_x\rangle, \tag{1.3}$$
where $\Omega_\Lambda := \{0, 1\}^\Lambda$, $N_\Lambda(\alpha) := \sum_{x \in \Lambda} \alpha_x$ and
$$\psi_n(\alpha) = \prod_{x \in \Lambda} q^{\ell_x \alpha_x}. \tag{1.4}$$
The square of the coefficients $\psi_n(\alpha)$ can be interpreted as the statistical weights of a (non-translation invariant) canonical Gibbs measure for a lattice gas with $n$ particles described by the variables $\{\alpha_x\}$. The typical configurations of such a measure form a sharply localized (depending on $n$) interface orthogonal to the 111 direction, separating a region almost filled with particles ($\alpha_x = 1$) from an almost empty region ($\alpha_x = 0$). That justifies the name “interface ground state” for the vector $\psi_n$. Because of the degeneracy of the ground states $\psi_n$, $n = 0, 1, \ldots, |\Lambda|$, the continuous symmetry given by rotation around the $z$-axis is broken and therefore the spectrum above zero energy must be gapless in the thermodynamic limit (see [20]). That makes, in particular, any attempt to go beyond the zero temperature case quite hard. To the best of our knowledge the only model with a state describing a 111-interface also at positive temperature is the Falicov–Kimball model [10]. The structure of the low-lying excitations above the interface ground states of (1.2) was recently studied in great detail in a series of interesting papers [3–5]. The main result in the above papers is that one can construct excitations localized in a sub-cylinder of $\Lambda$ of radius $R \ll L$ such that their energy gap is smaller than $kR^{-2}$ for a certain constant $k = k(q)$. Moreover, in an appropriate scaling, the energy spectrum of such low-lying excitations coincides with the spectrum of the $d - 1$ Laplacian on a suitable domain. An important ingredient in these works is an equivalence of ensembles result that can be roughly described as follows. If we replace in (1.3) the weights $\psi_n(\alpha)$ by their associated grand canonical weights obtained by adding a suitably chosen constant chemical potential $\lambda := \lambda(\Lambda, n)$ and if we remove the condition $N_\Lambda(\alpha) = n$, we obtain a new vector that we call grand canonical ground state and denote by $\psi^\lambda$. Then, for any local observable $X$ that commutes with the total third component of the spin, the difference between the two averages $\langle \psi_n, X\psi_n \rangle$ and $\langle \psi^\lambda, X\psi^\lambda \rangle$ vanishes as $L \to \infty$.
Let us now discuss our results. As pointed out in [3] it is generally believed that the energy gap above every ground state $\psi_n$ in the 111-cylinder with height $H$ and basis of linear size $L$ is not only bounded from above but also from below by $O(L^{-2})$, uniformly in $H$ and $n$. Our main contribution is a proof of this result, see Theorem 2.2. By making an ansatz similar to that of [3] we also show that lowest energies are produced by long wavelength spin-wave like excited states. We should emphasize that in contrast to [3] we do not have a detailed control of the $q$-dependent prefactors in the estimates but rather focus on the uniformity in $n$ (total third component of the spin) and $H$ (height of the cylinder). Another result of this paper concerns an estimate on the density of states. Namely, we consider vectors $\psi_n^f$ of the form
$$\psi_n^f = \sum_{\alpha \in \Omega_\Lambda:\ N_\Lambda(\alpha) = n} f(\alpha)\, \psi_n(\alpha) \bigotimes_{x \in \Lambda} |\alpha_x\rangle,$$
where $f$ is a local bounded function of the variables $\{\alpha_x\}_{x \in \Lambda}$ such that $\psi_n^f$ is orthogonal to $\psi_n$. Then, using the lower bound on the spectral gap, we prove that the spectral measure $\rho_f(E)$ associated to the vector $\psi_n^f$ satisfies $\rho_f(E) \le k_\varepsilon E^{1-\varepsilon}$ for any $\varepsilon > 0$ as $E \to 0$, uniformly in $n \neq 0, |\Lambda|$ and in $\Lambda$ (see Theorem 2.4). We believe that, in the above generality, a linear behaviour near the bottom of the spectrum is the correct
one. Along the way we partially improve the equivalence of ensembles results of [3] (see Sect. 3) and we provide a probabilistic proof of the known result ([15]) that the spectral gap for the linear chain XXZ is uniformly positive (but our bound is very rough compared with that of [15]). We now briefly describe our approach. Let $\mathcal{H}_n$ denote the sector of the Hilbert space $\mathcal{H}$ with $\sum_{x \in \Lambda} \alpha_x = n$ and define the normalized states
$$\nu_n(\alpha) = \frac{\psi_n^2(\alpha)}{\sum_\eta \psi_n^2(\eta)}.$$
Using the positivity of the ground states $\psi_n$ we may define a unitary transformation between $\mathcal{H}_n$ and $L^2(\Omega_\Lambda, \nu_n)$ by formally multiplying by $\psi_n^{-1}$. This transforms $H_{\Lambda,n}$, the restriction of $H_\Lambda$ to $\mathcal{H}_n$, into a new operator $G_{\Lambda,n}$ on $L^2(\Omega_\Lambda, \nu_n)$. The latter turns out to be nothing but the Markov generator of an asymmetric simple exclusion process in $\Lambda$ that can be roughly described as follows. We have $n$ particles in $\Lambda$ and each particle jumps to an empty neighbouring site with rate proportional to $q$ if the signed distance from the origin is increased (by one) and to $q^{-1}$ if it is decreased. The number of particles is a conserved quantity and by construction the measure $\nu_n$ is reversible for the process since $G_{\Lambda,n}$ is self adjoint in $L^2(\Omega_\Lambda, \nu_n)$. The spectral gap of $G_{\Lambda,n}$ coincides with the spectral gap of $H_{\Lambda,n}$ and it accounts for the smallest rate of exponential decay to equilibrium for the above process in $L^2(\Omega_\Lambda, \nu_n)$. Note that the isotropic case $q = 1$ is the usual symmetric simple exclusion process. Although we discovered such an equivalence independently, we realized later on that it was well known to physicists for some years [2]. Once the problem has become a kind of reversible Kawasaki dynamics for a classical lattice gas, we adapt to it some recent work [7] (see also [18] for a different approach) to bound from below its spectral gap, recursively in $L$.
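The reversibility statement can be illustrated numerically in a toy setting. The sketch below is not the construction of the paper: it assumes a one dimensional stick of H sites on which the signed distance of site x is x itself, takes the canonical weight of a configuration to be q raised to twice the sum of the occupied positions (the square of the ground state coefficient), and sets the proportionality constant in the jump rates to 1. All function names are hypothetical.

```python
from itertools import combinations


def configurations(H, n):
    """All 0/1 configurations on sites 0..H-1 with exactly n particles."""
    for occupied in combinations(range(H), n):
        yield tuple(1 if x in occupied else 0 for x in range(H))


def weight(alpha, q):
    """Assumed canonical weight: q^(2 * sum of occupied positions)."""
    return q ** (2 * sum(x for x, a in enumerate(alpha) if a))


def canonical_measure(H, n, q):
    """Normalized weights over all n-particle configurations."""
    w = {alpha: weight(alpha, q) for alpha in configurations(H, n)}
    total = sum(w.values())
    return {alpha: v / total for alpha, v in w.items()}


def jump_rates(alpha, q):
    """Map target configuration -> rate for nearest-neighbour exclusion jumps.

    A particle moving to the higher site (signed distance increases) has
    rate q; a particle moving to the lower site has rate 1/q.
    """
    rates = {}
    for x in range(len(alpha) - 1):
        if alpha[x] != alpha[x + 1]:
            beta = list(alpha)
            beta[x], beta[x + 1] = beta[x + 1], beta[x]
            rates[tuple(beta)] = q if alpha[x] == 1 else 1.0 / q
    return rates


def max_detailed_balance_violation(H, n, q):
    """Largest |nu(a) r(a->b) - nu(b) r(b->a)| over all allowed jumps."""
    nu = canonical_measure(H, n, q)
    worst = 0.0
    for a in nu:
        for b, r_ab in jump_rates(a, q).items():
            r_ba = jump_rates(b, q)[a]
            worst = max(worst, abs(nu[a] * r_ab - nu[b] * r_ba))
    return worst


if __name__ == "__main__":
    print(max_detailed_balance_violation(H=6, n=3, q=0.5))
```

The printed value should vanish up to floating point rounding, since moving one particle up by one site multiplies the weight by q squared, which is exactly the ratio of the upward rate q to the downward rate 1/q.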
Although our asymmetric simple exclusion has certain advantages over a high temperature truly interacting lattice gas because its grand canonical measure is product, nevertheless several new problems arise, particularly if one looks for results uniform in $n$, $H$, because of the unboundedness of the signed distance $\ell_x$ entering in the canonical measure $\nu_n$.
As a final remark we observe that all our results are restricted to spin 1/2. For higher spins one can still compute exactly the ground state (see [1]) for a suitable choice of the boundary conditions and, as described above, it is possible to unitarily transform the Hamiltonian into a Markov generator. The interacting particle process one gets in this way is however more involved than the one considered here. Particles of different kinds (namely different spin) appear and, besides the usual asymmetric simple exclusion process, new transitions are allowed in which pairs of particles of opposite spin are created or destroyed with certain rates (see [2]). We plan to analyze this new situation in the near future.
We conclude with a road map of the paper.
• In Sect. 2 we fix the model, define the unitary transformation leading to the Markov generator and state the main results.
• In Sect. 3 we provide a series of technical tools including the results on the equivalence of ensembles.
• In Sect. 4 we describe the recursive approach to prove the lower bound on the spectral gap by assuming a key result that one may call “transport theorem” (see Theorem 4.1). We also prove a lower bound on the gap in one dimension uniformly in the number of “up” spins and in the height $H$.
• In Sect. 5 we prove the transport theorem.
• Finally in Sect. 6 we prove the upper bound on the spectral gap and the result on the spectral measure of local perturbations of the ground state.

2. Setup and Main Results

2.1. Lattice, bonds, 111-planes, sticks and cylinders. We consider the 3D integer lattice $\mathbb Z^3$, and denote by $e_i$, $i = 1, \dots, 3$, the unit vectors in the $i$-th direction. For any $x \in \mathbb Z^3$ we write $x_i = x \cdot e_i$ for the $i$-th coordinate of $x$ and denote by $\ell_x$ the signed distance from the origin, $\ell_x = x_1 + x_2 + x_3$. A bond in $\mathbb Z^3$ is an oriented couple $b = (x, y)$, where $x, y \in \mathbb Z^3$ are neighbours, i.e. $\|x - y\|_1 = 1$ with $\|x\|_1 = |x_1| + |x_2| + |x_3|$. We denote by $(\mathbb Z^3)^*$ the set of all bonds. A given $b \in (\mathbb Z^3)^*$ identifies two sites $x_b, y_b \in \mathbb Z^3$ such that $b = (x_b, y_b)$. For any subset $\Lambda \subset \mathbb Z^3$ we call $\Lambda^*$ the set of $b \in (\mathbb Z^3)^*$ such that $x_b, y_b \in \Lambda$. For any $b$ we have $\ell_{x_b} - \ell_{y_b} = \pm 1$. We choose an orientation according to increasing values of $\ell$ and denote
$$B_\Lambda = \{ b \in \Lambda^* : \ell_{y_b} = \ell_{x_b} + 1 \}.$$
Given $h \in \mathbb Z_+$ we call $A_h$ the 111-plane at height $h$, i.e. $A_h = \{ x \in \mathbb Z^3 : \ell_x = h \}$. We define the infinite stick $\Sigma_\infty$ passing through the origin as the doubly infinite sequence
$$\dots,\ -e_1 - e_2 - e_3,\ -e_2 - e_3,\ -e_3,\ 0,\ e_1,\ e_1 + e_2,\ e_1 + e_2 + e_3,\ e_1 + e_2 + e_3 + e_1,\ \dots$$
We write $\Sigma_{x,\infty}$ for the infinite stick going through $x$, i.e. $\Sigma_{x,\infty} = x + \Sigma_\infty$. Note that the union of $\Sigma_{x,\infty}$, $x \in A_0$, covers all of $\mathbb Z^3$. For every positive integer $H$ we define the finite stick
$$\Sigma_H = \{ y \in \Sigma_\infty : \ell_y \in \{0, 1, 2, \dots, H - 1\} \}.$$
The finite stick through $x$ is then $\Sigma_{x,H} = x + \Sigma_H$. When no confusion arises we shall simply write $\Sigma_x$ for a generic finite stick at $x$. We will often consider cylindrical subsets of $\mathbb Z^3$ of the type
$$\Sigma_{\Gamma,H} = \bigcup_{x\in\Gamma} \Sigma_{x,H}, \qquad \Gamma \subset A_0,$$
with some finite $\Gamma \subset A_0$, called the basis. Then $\Sigma_{\Gamma,H}$ contains $H|\Gamma|$ sites, $|\Gamma|$ being the cardinality of $\Gamma$.
On the plane $A_0$ it is convenient to parametrize sites as follows. Consider the two vectors $P_u = (1, -1, 0)$ and $P_v = (0, 1, -1)$. Then any $x \in A_0$ is uniquely determined by a couple of integers $(x_u, x_v)$ with $x = x_u P_u + x_v P_v$. We consider tilted rectangles
$$R_{L,M} = \{ x \in A_0 : x_u \in \{0, 1, \dots, L - 1\},\ x_v \in \{0, 1, \dots, M - 1\} \}.$$
In this way $|R_{L,M}| = LM$. When $L = M$ we call $Q_L = R_{L,L}$ a tilted square. Corresponding cylinders $\Sigma_{Q_L,H}$ are denoted $\Sigma_{L,H}$. Note that there are no true neighbours on $A_0$. We say that two sites $x, y$ are neighbours in $A_0$ if $x, y \in A_0$ and $|x_u - y_u| + |x_v - y_v| = 1$.
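2.2. Interface ground states of the XXZ model. Consider a cylinder $\Lambda = \Sigma_{\Gamma,H}$ for some $\Gamma \subset A_0$, $H \in \mathbb Z_+$. The state of the system is described by vectors in the tensor product Hilbert space $\mathfrak H = (\mathbb C^2)^{\otimes|\Lambda|}$. Fix $q \in (0, 1)$ and define
$$\Delta = \tfrac12 (q + q^{-1}), \qquad A(\Delta) = \tfrac12 \sqrt{1 - \Delta^{-2}}.$$
The Hamiltonian operator is defined by
$$H_\Lambda = \sum_{b\in B_\Lambda} H_b^q, \tag{2.1}$$
where the single bond Hamiltonians $H_b^q$, $b = (x_b, y_b)$, are given by
$$H_b^q = -\Delta^{-1}\bigl( S^1_{x_b} S^1_{y_b} + S^2_{x_b} S^2_{y_b} \bigr) - S^3_{x_b} S^3_{y_b} + A(\Delta)\bigl( S^3_{y_b} - S^3_{x_b} \bigr) + \tfrac14. \tag{2.2}$$
Here the spin operators $S^i_x$, $i = 1, 2, 3$, are the Pauli matrices
$$S^1_x = \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}, \qquad S^2_x = \begin{pmatrix} 0 & -i/2 \\ i/2 & 0 \end{pmatrix}, \qquad S^3_x = \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}.$$
Expressions (2.1) and (2.2) give the usual XXZ Hamiltonian, with the term proportional to $A(\Delta)$ accounting for boundary conditions which favour 111-interface states. The term 1/4 has been introduced so that ground states have zero energy, see below. We choose a basis for $\mathfrak H$ labeled by the two states “up” or “down” of the third component of the spin at each site, and write it in terms of configurations $\alpha = \{\alpha_x\}_{x\in\Lambda}$, with $\alpha_x \in \{0, 1\}$, with the convention that $\alpha_x = 1$ stands for spin “up” while $\alpha_x = 0$ stands for spin “down”. $\Omega_\Lambda = \{0, 1\}^\Lambda$ denotes the set of all configurations and $|\alpha\rangle = \bigotimes_{x\in\Lambda} |\alpha_x\rangle$ stands for a generic basis vector. For every $\varphi \in \mathfrak H$ we write $\varphi(\alpha) = \langle\alpha|\varphi\rangle$.
Since $H_b^q$ only acts on the bond $b$, a simple computation shows that
$$H_b^q |\alpha\rangle = (q + q^{-1})^{-1} \bigl( q^{\alpha_{x_b} - \alpha_{y_b}} |\alpha\rangle - |\alpha^b\rangle \bigr), \tag{2.3}$$
where $\alpha^b := T_{x_b,y_b}\alpha$, and for a generic pair $x, y$, $T_{x,y}\alpha$ denotes the configuration in which $\alpha_x$ and $\alpha_y$ have been exchanged,
$$(T_{x,y}\alpha)_z = \begin{cases} \alpha_y & z = x \\ \alpha_x & z = y \\ \alpha_z & \text{otherwise.} \end{cases}$$
In particular, formula (2.3) shows that if $\alpha = \alpha^b$, then $H_b^q |\alpha\rangle = 0$. Moreover, $H_b^q = |\xi\rangle\langle\xi|$ is a projection onto the vector $\xi = \xi_b$ with
$$\xi(\alpha) = \frac{1}{\sqrt{1 + q^2}} \bigl( q\,\alpha_{x_b}(1 - \alpha_{y_b}) - (1 - \alpha_{x_b})\alpha_{y_b} \bigr).$$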
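The two descriptions of the single bond Hamiltonian given above, the action (2.3) on basis vectors and the rank-one projection onto $\xi_b$, can be compared numerically on the four-dimensional space of one bond. The following is only a minimal sketch: the function names and the ordering of the basis $(\alpha_{x_b}, \alpha_{y_b}) = 00, 01, 10, 11$ are illustrative choices, not taken from the paper.

```python
import math

# Basis of the two-site space, labelled by (alpha_x, alpha_y). The ordering is an
# arbitrary illustrative choice.
BASIS = [(0, 0), (0, 1), (1, 0), (1, 1)]
INDEX = {a: i for i, a in enumerate(BASIS)}


def bond_matrix_from_action(q):
    """Matrix of H_b^q built column by column from formula (2.3)."""
    c = 1.0 / (q + 1.0 / q)
    h = [[0.0] * 4 for _ in range(4)]
    for (ax, ay) in BASIS:
        col = INDEX[(ax, ay)]
        h[col][col] += c * q ** (ax - ay)   # term q^{alpha_x - alpha_y} |alpha>
        h[INDEX[(ay, ax)]][col] -= c        # term -|alpha^b> (occupations exchanged)
    return h


def bond_matrix_from_projection(q):
    """Matrix of |xi><xi| with xi as in the formula following (2.3)."""
    norm = math.sqrt(1.0 + q * q)
    xi = [(q * ax * (1 - ay) - (1 - ax) * ay) / norm for (ax, ay) in BASIS]
    return [[xi[i] * xi[j] for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def max_abs_diff(a, b):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    for q in (0.3, 0.7):
        h = bond_matrix_from_action(q)
        p = bond_matrix_from_projection(q)
        print(q, max_abs_diff(h, p), max_abs_diff(matmul(h, h), h))
```

Both printed differences should vanish up to rounding, which reflects that the normalization $1/\sqrt{1+q^2}$ of $\xi$ is exactly what makes $|\xi\rangle\langle\xi|$ agree with (2.3).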
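Let $N_\Lambda$ denote the operator
$$N_\Lambda |\alpha\rangle = N_\Lambda(\alpha)\, |\alpha\rangle, \qquad N_\Lambda(\alpha) = \sum_{x\in\Lambda} \alpha_x. \tag{2.4}$$
From (2.3) we see that $H_\Lambda$ commutes with $N_\Lambda$. We divide $\mathfrak H$ in $|\Lambda| + 1$ sectors corresponding to all possible values of $N_\Lambda$. Namely, given $\varphi \in \mathfrak H$ we write
$$|\varphi\rangle = \sum_{n=0}^{|\Lambda|} |\varphi_n\rangle, \qquad |\varphi_n\rangle = \sum_{\alpha\in\Omega_\Lambda :\ N_\Lambda(\alpha)=n} \varphi(\alpha)\, |\alpha\rangle.$$
In this way $\mathfrak H$ is unitarily equivalent to the direct sum $\oplus_n \mathfrak H_n$, where $\mathfrak H_n$ is the closed subspace of $\mathfrak H$ spanned by all vectors $|\alpha\rangle$ with $N_\Lambda(\alpha) = n$. Now, ground states for the Hamiltonian (2.1) are vectors $\psi$ in $\mathfrak H$ such that $H_\Lambda |\psi\rangle = 0$. As in [1], [3] and [4], in each sector $\mathfrak H_n$, $n = 0, 1, \dots, |\Lambda|$, there is a unique ground state $\psi_n$ given by
$$\psi_n(\alpha) = \begin{cases} \prod_{x\in\Lambda} q^{\ell_x \alpha_x} & N_\Lambda(\alpha) = n \\ 0 & N_\Lambda(\alpha) \ne n. \end{cases} \tag{2.5}$$
We shall interpret $\psi_n^2$ as the weights of a canonical probability distribution $\nu_n$ on $\Omega_\Lambda$, by writing
$$\nu_n(f) = \sum_{\alpha\in\Omega_\Lambda} \nu_n(\alpha) f(\alpha), \qquad f : \Omega_\Lambda \to \mathbb R,$$
with
$$\nu_n(\alpha) = \frac{\psi_n^2(\alpha)}{\sum_{\eta\in\Omega_\Lambda} \psi_n^2(\eta)}. \tag{2.6}$$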
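It is convenient to introduce the corresponding grand canonical distributions. For every $\lambda \in \mathbb R$ we define the product measure $\mu^\lambda$ on $\Omega_\Lambda$ given by
$$\mu^\lambda(f) = \sum_{\alpha\in\Omega_\Lambda} \mu^\lambda(\alpha) f(\alpha), \qquad \mu^\lambda(\alpha) = \prod_{x\in\Lambda} \frac{q^{2(\ell_x - \lambda)\alpha_x}}{1 + q^{2(\ell_x - \lambda)}}. \tag{2.7}$$
For every $\lambda \in \mathbb R$, $\nu_n$ can be obtained from $\mu^\lambda$ by conditioning on $N_\Lambda(\alpha) = n$, i.e.
$$\nu_n = \mu^\lambda(\,\cdot \mid N_\Lambda(\alpha) = n). \tag{2.8}$$
To avoid confusion we sometimes write explicitly the region $\Lambda$ we are considering and use the notations $\nu_{\Lambda,n}$ and $\mu^\lambda_\Lambda$ instead of $\nu_n$ and $\mu^\lambda$. We shall adopt the standard notation for the variance and covariances w.r.t. a measure $\mu$:
$$\mathrm{Var}_\mu(f) = \mu(f, f) = \mu\bigl( (f - \mu(f))^2 \bigr), \qquad \mu(f, g) = \mu\bigl( (f - \mu(f))(g - \mu(g)) \bigr). \tag{2.9}$$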
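The conditioning identity (2.8) says that the factor $q^{-2\lambda N_\Lambda(\alpha)}$ is constant on each sector and cancels after normalization. As a sanity check, here is a small brute-force sketch on a single stick; the function names and the list `heights` of values $\ell_x$ are illustrative assumptions, and the enumeration is only feasible for a handful of sites.

```python
from itertools import product


def canonical(q, heights, n):
    """nu_n from (2.5)-(2.6): weights q^{2 sum_x l_x alpha_x} on the sector N = n."""
    weights = {}
    for alpha in product((0, 1), repeat=len(heights)):
        if sum(alpha) == n:
            exponent = 2 * sum(l * a for l, a in zip(heights, alpha))
            weights[alpha] = q ** exponent
    z = sum(weights.values())
    return {alpha: w / z for alpha, w in weights.items()}


def grand_canonical_conditioned(q, heights, n, lam):
    """mu^lambda from (2.7), conditioned on N = n as in (2.8)."""
    weights = {}
    for alpha in product((0, 1), repeat=len(heights)):
        if sum(alpha) == n:
            prob = 1.0
            for l, a in zip(heights, alpha):
                r = q ** (2.0 * (l - lam))
                prob *= (r if a else 1.0) / (1.0 + r)
            weights[alpha] = prob
    z = sum(weights.values())
    return {alpha: w / z for alpha, w in weights.items()}


if __name__ == "__main__":
    heights = [0, 1, 2, 3, 4]          # a stick of height H = 5
    nu = canonical(0.6, heights, 2)
    for lam in (-1.5, 0.0, 2.3):
        mu = grand_canonical_conditioned(0.6, heights, 2, lam)
        print(lam, max(abs(nu[a] - mu[a]) for a in nu))
```

The printed discrepancies should be at the level of rounding errors for every value of $\lambda$.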
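2.3. The spectral gap. We call $\mathrm{gap}(H_\Lambda)$ the energy of the first excited state of $H_\Lambda$. Let us write $H_{\Lambda,n}$ for the restriction of $H_\Lambda$ to the sector $\mathfrak H_n$. For each $n$ we define the gap
$$\mathrm{gap}(H_{\Lambda,n}) = \inf_{0 \ne \varphi \in \mathfrak H_n :\ \langle\varphi|\psi_n\rangle = 0} \frac{\langle\varphi| H_{\Lambda,n} |\varphi\rangle}{\langle\varphi|\varphi\rangle}. \tag{2.10}$$
We then have
$$\mathrm{gap}(H_\Lambda) = \min_n\, \mathrm{gap}(H_{\Lambda,n}). \tag{2.11}$$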
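2.4. Ground state transformation. For each $n$ we consider now the Hilbert space $\mathcal H_n := L^2(\Omega_\Lambda, \nu_n)$ with scalar product
$$\langle \varphi, \psi \rangle_{\nu_n} = \sum_{\alpha\in\Omega_\Lambda} \nu_n(\alpha)\varphi(\alpha)\psi(\alpha). \tag{2.12}$$
The ground state transformation is defined by the unitary map
$$U_n : \mathfrak H_n \to \mathcal H_n, \qquad \varphi \mapsto U_n\varphi,$$
where, for every $\alpha \in \Omega_\Lambda$ with $N_\Lambda(\alpha) = n$,
$$(U_n\varphi)(\alpha) = (\nu_n(\alpha))^{-1/2} \varphi(\alpha). \tag{2.13}$$
Let us define the operator $\mathcal G_{\Lambda,n}$ on $\mathcal H_n$ given by
$$\mathcal G_{\Lambda,n} f(\alpha) = \frac{1}{q + q^{-1}} \sum_{b\in B_\Lambda} \frac{\psi_n(\alpha^b)}{\psi_n(\alpha)} \bigl( f(\alpha^b) - f(\alpha) \bigr). \tag{2.14}$$
A simple computation shows that $-\mathcal G_{\Lambda,n}$ is a symmetric, non-negative operator with
$$\langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n} = \frac{1}{2(q + q^{-1})} \sum_{\alpha\in\Omega_\Lambda} \sum_{b\in B_\Lambda} \nu_n(\alpha) \frac{\psi_n(\alpha^b)}{\psi_n(\alpha)} \bigl( f(\alpha^b) - f(\alpha) \bigr)^2. \tag{2.15}$$
Moreover, $\mathcal G_{\Lambda,n} \mathbf 1 = 0$ with $\mathbf 1$ denoting the constant $f \equiv 1$. We may define the gap in the spectrum of $-\mathcal G_{\Lambda,n}$ as
$$\mathrm{gap}(\mathcal G_{\Lambda,n}) = \inf_{0 \ne f \perp \mathbf 1} \frac{\langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n}}{\langle f, f \rangle_{\nu_n}}. \tag{2.16}$$
Here the orthogonality $f \perp \mathbf 1$ means $\nu_n(f) = 0$. The next proposition motivates the introduction at this stage of the operator $\mathcal G_{\Lambda,n}$ and of its spectral gap.
Proposition 2.1. For every finite $\Lambda \subset \mathbb Z^3$, for every $n = 0, 1, \dots, |\Lambda|$, we have the identity
$$H_{\Lambda,n} = U_n^{-1} (-\mathcal G_{\Lambda,n}) U_n. \tag{2.17}$$
In particular, $\mathrm{gap}(\mathcal G_{\Lambda,n}) = \mathrm{gap}(H_{\Lambda,n})$.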
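Proof. If (2.17) holds we see that for any $\varphi \in \mathfrak H_n$,
$$\langle\varphi| H_{\Lambda,n} |\varphi\rangle = \langle U_n\varphi, (-\mathcal G_{\Lambda,n}) U_n\varphi \rangle_{\nu_n}. \tag{2.18}$$
From this $\mathrm{gap}(\mathcal G_{\Lambda,n}) = \mathrm{gap}(H_{\Lambda,n})$ follows since $\langle\varphi|\psi_n\rangle = 0 \iff \nu_n(U_n\varphi) = 0$. We turn to the proof of (2.17). Let $\tilde\psi_n(\alpha) = \sqrt{\nu_n(\alpha)}$, so that $\varphi/\tilde\psi_n = U_n\varphi$. Observe that for every $b \in B_\Lambda$ we have $\ell_{y_b} = \ell_{x_b} + 1$ and therefore
$$\tilde\psi_n(\alpha^b) = q^{\alpha_{x_b} - \alpha_{y_b}}\, \tilde\psi_n(\alpha). \tag{2.19}$$
From (2.1) and (2.3) we see that
$$\begin{aligned} \langle\alpha| H_\Lambda |\varphi\rangle &= \frac{1}{q + q^{-1}} \sum_{b\in B_\Lambda} \bigl( q^{\alpha_{x_b} - \alpha_{y_b}} \varphi(\alpha) - \varphi(\alpha^b) \bigr) \\ &= \frac{1}{q + q^{-1}} \sum_{b\in B_\Lambda} \tilde\psi_n(\alpha^b) \bigl( (U_n\varphi)(\alpha) - (U_n\varphi)(\alpha^b) \bigr) \\ &= \tilde\psi_n(\alpha)\, \bigl( (-\mathcal G_{\Lambda,n}) U_n\varphi \bigr)(\alpha) = \bigl[ U_n^{-1} (-\mathcal G_{\Lambda,n}) U_n \bigr] \varphi(\alpha). \end{aligned} \tag{2.20}$$
Then (2.20) proves the claim. □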
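The identity (2.17) can also be checked by brute force on a very small system. The sketch below takes $\Lambda$ to be a single stick, so that the bonds are the pairs of consecutive heights; it builds the matrix of $H_{\Lambda,n}$ from (2.1) and (2.3) and the matrix of $U_n^{-1}(-\mathcal G_{\Lambda,n})U_n$ from (2.13) and (2.14). All function names are illustrative, and the restriction to one stick is an assumption made only to keep the example short.

```python
import math
from itertools import combinations


def sector(height, n):
    """Configurations on a stick with sites at heights 0..height-1 and n particles."""
    return [tuple(1 if l in occ else 0 for l in range(height))
            for occ in combinations(range(height), n)]


def swap(alpha, x):
    """Exchange the occupations across the bond (x, x + 1)."""
    a = list(alpha)
    a[x], a[x + 1] = a[x + 1], a[x]
    return tuple(a)


def psi(alpha, q):
    """Ground state weight (2.5) on a stick, where l_x is the site index."""
    return q ** sum(l * a for l, a in enumerate(alpha))


def hamiltonian(height, n, q):
    """Matrix of H_{Lambda,n}, assembled column by column from (2.3)."""
    configs = sector(height, n)
    idx = {a: i for i, a in enumerate(configs)}
    c = 1.0 / (q + 1.0 / q)
    m = [[0.0] * len(configs) for _ in configs]
    for a in configs:
        for x in range(height - 1):
            m[idx[a]][idx[a]] += c * q ** (a[x] - a[x + 1])
            m[idx[swap(a, x)]][idx[a]] -= c
    return m


def transformed_generator(height, n, q):
    """Matrix of U_n^{-1} (-G_{Lambda,n}) U_n, using (2.13) and (2.14)."""
    configs = sector(height, n)
    idx = {a: i for i, a in enumerate(configs)}
    z = sum(psi(a, q) ** 2 for a in configs)
    root_nu = {a: psi(a, q) / math.sqrt(z) for a in configs}
    c = 1.0 / (q + 1.0 / q)
    m = [[0.0] * len(configs) for _ in configs]
    for a in configs:
        for x in range(height - 1):
            b = swap(a, x)
            rate = c * psi(b, q) / psi(a, q)
            # (-G g)(a) = sum_b rate * (g(a) - g(b)) with g = phi / sqrt(nu);
            # U_n^{-1} multiplies the result by sqrt(nu(a)).
            m[idx[a]][idx[a]] += rate
            m[idx[a]][idx[b]] -= rate * root_nu[a] / root_nu[b]
    return m


def max_abs_diff(a, b):
    """Largest entrywise difference between two square matrices of equal size."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


if __name__ == "__main__":
    print(max_abs_diff(hamiltonian(5, 2, 0.5), transformed_generator(5, 2, 0.5)))
```

The printed number should be zero up to rounding. The same matrices can be handed to any eigenvalue routine to compare the two gaps, which is not done here in order to stay within the standard library.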
2.5. Asymmetric exclusion process. The operator $\mathcal G_{\Lambda,n}$ in (2.14) can be interpreted as the generator of an interacting particle system (see e.g. [17] for a general reference). We define
$$\nabla_{xy} f(\alpha) := f(T_{x,y}\alpha) - f(\alpha), \qquad \nabla_b f(\alpha) := \nabla_{x_b y_b} f(\alpha). \tag{2.21}$$
Let also
$$c_b(\alpha) = \frac{q^{\alpha_{x_b} - \alpha_{y_b}}}{q + q^{-1}}, \qquad b = (x_b, y_b). \tag{2.22}$$
Then (2.14) may be rewritten
$$\mathcal G_{\Lambda,n} f(\alpha) = \sum_{b\in B_\Lambda} c_b(\alpha) \nabla_b f(\alpha). \tag{2.23}$$
For every $n$, this defines a Markov process with $n$ particles in $\Lambda$ jumping to empty neighbouring sites. The rate of a jump is proportional to $q$ if a particle moves from $x$ to $y$ with $\ell_y = \ell_x + 1$, and to $q^{-1}$ if it moves from $y$ to $x$. The number of particles is conserved and the measure $\nu_n$ is reversible for the process since $\mathcal G_{\Lambda,n}$ is self adjoint in $L^2(\Omega_\Lambda, \nu_n)$.
Consider a cylinder $\Lambda := \Sigma_{L,H}$ of height $H$ and whose 111-section $\Sigma_{L,H} \cap A_0$ is a tilted square $Q_L$ containing $L^2$ sites. Since the degenerate cases $n = 0$ and $n = HL^2$ are trivial ($\nu_n$ is simply a delta on the empty/full configuration), the variable $n$ will be assumed to range from 1 to $HL^2 - 1$ in all statements below. Our main results can be stated as follows.
Theorem 2.2. For any $q \in (0, 1)$ there exists a constant $k \in (0, \infty)$ such that for every positive integer $L$ of the form $L = 2^j$ for some $j \in \mathbb N$,
$$\inf_{H,n}\, \mathrm{gap}(\mathcal G_{\Sigma_{L,H},n}) \ge k^{-1} L^{-2}, \tag{2.24}$$
$$\sup_{H,n}\, \mathrm{gap}(\mathcal G_{\Sigma_{L,H},n}) \le k L^{-2}. \tag{2.25}$$
Remark 2.3. The proof of the lower bound (2.24) is based on a recursive analysis ([7]) using successive bisections of the basis square $Q_L$ and for simplicity we stated the result only for $L$ of the form $2^j$. However, as we point out in Remark 4.2 below, it is not difficult to remove this restriction so that the result actually holds for any integer $L$.
The second result concerns the behaviour of the spectral measure associated to suitable local functions near the bottom of the spectrum.
Theorem 2.4. In the same setting of Theorem 2.2, let $f$ be a bounded function of zero mean w.r.t. $\nu_{\Lambda,n}$ and such that its support is contained in a sub-cylinder $\Lambda_0 := \Sigma_{L_0,H}$. Let $E_s$ denote the spectral projection of the operator $-\mathcal G_{\Lambda,n}$ associated to the interval $[0, s]$. Then, for any $q \in (0, 1)$ and any $\varepsilon > 0$ there exists a constant $k_\varepsilon$ depending on $\varepsilon$, $L_0$ and $\|f\|_\infty$ such that
$$\sup_{\Lambda,n}\, \langle f, E_s f \rangle_{\nu_{\Lambda,n}} \le k_\varepsilon\, s^{1-\varepsilon}. \tag{2.26}$$
Remark 2.5. Theorem 2.4 can be obviously formulated also for the quantum Hamiltonian $H_{\Lambda,n}$ thanks to the unitary equivalence stated in Proposition 2.1. In this context the result is as follows. Consider the vector $\psi_n^f$ in the Hilbert space $\mathfrak H_n$ defined by $\psi_n^f(\alpha) = f(\alpha)\psi_n(\alpha)$, with $f$ as in the theorem. Then the spectral measure of $\psi_n^f$ has an almost linear bound close to the bottom of the spectrum of $H_{\Lambda,n}$.
2.6. From tilted to straight shapes. In order to avoid unnecessary complications coming from the tilted geometry of our setting we shall make the following simple transformation which allows us to go from the 111-cylinders described above to more familiar straight cylinders in $\mathbb Z^3$, with axis along one coordinate axis. Recall that any point $x \in \mathbb Z^3$ is identified by the triple $(x_u, x_v, \ell_x)$, where $\ell_x = x_1 + x_2 + x_3$ and the pair $(x_u, x_v)$ specifies the projection of $x$ onto the $A_0$ plane, obtained as the intersection $\Sigma_x \cap A_0$. We then have an isomorphism $\Phi : \mathbb Z^3 \to \mathbb Z^3$ given by
$$(\Phi x)_1 = x_u, \qquad (\Phi x)_2 = x_v, \qquad (\Phi x)_3 = \ell_x. \tag{2.27}$$
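The change of coordinates (2.27) can be made concrete as follows. This is a sketch under the conventions of Sect. 2.1: the point of the infinite stick through the origin at height $h$ is read off from the sequence $\dots, -e_3, 0, e_1, e_1 + e_2, \dots$, and the remainder $x - s(\ell_x) \in A_0$ is expanded on $P_u = (1,-1,0)$ and $P_v = (0,1,-1)$. The helper names `stick_point`, `phi` and `phi_inverse` are hypothetical.

```python
def stick_point(h):
    """Point of the infinite stick through the origin at signed height h."""
    k, r = divmod(h, 3)                 # h = 3k + r with r in {0, 1, 2}
    return (k + (1 if r >= 1 else 0), k + (1 if r >= 2 else 0), k)


def phi(x):
    """Map x to (x_u, x_v, l_x) as in (2.27)."""
    h = sum(x)
    s = stick_point(h)
    d = tuple(xi - si for xi, si in zip(x, s))   # lies in the plane A_0
    # d = x_u * (1, -1, 0) + x_v * (0, 1, -1) = (x_u, x_v - x_u, -x_v)
    x_u, x_v = d[0], -d[2]
    return (x_u, x_v, h)


def phi_inverse(y):
    """Inverse map: rebuild x from (x_u, x_v, l_x)."""
    x_u, x_v, h = y
    s = stick_point(h)
    return (s[0] + x_u, s[1] + x_v - x_u, s[2] - x_v)


if __name__ == "__main__":
    print(phi((2, 2, 1)))    # a point of the stick through the origin
    print(phi((1, -1, 0)))   # the vector P_u
    print(phi((0, 1, -1)))   # the vector P_v
```

Points of a stick are sent to a vertical column with fixed first two coordinates, which is the sense in which the 111-cylinder becomes a straight one.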
The map $\Phi$ brings the 111-cylinder $\Sigma_{L,H}$ into the straight cylinder $\Phi\Sigma_{L,H} = \{0, 1, \dots, L - 1\}^2 \times \{0, 1, \dots, H - 1\}$. We now introduce a new exclusion process, with $n$ particles in a given $\Lambda \subset \mathbb Z^3$, jumping to empty neighbouring sites. A jump in the horizontal direction occurs with rate 1 while in the vertical direction it occurs with rate $q$ or $q^{-1}$ if the particle is going upwards or downwards, respectively. The asymmetry of the original process along the 111 direction becomes here an asymmetry along the 001 direction (the third axis). Consider the set of oriented bonds $\Lambda^*$. We choose an arbitrary orientation for the horizontal bonds, which we denote $O_\Lambda$. $O_\Lambda$ can be taken to be the set of couples $b = (x, y) \in \Lambda^*$ such that
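$x_3 = y_3$, $y_1 \ge x_1$ and $y_2 \ge x_2$. For vertical bonds, which we denote $V_\Lambda$, we choose the orientation according to increasing values of the third component. Thus $V_\Lambda = \{ b = (x, y) \in \Lambda^* : y_3 = x_3 + 1 \}$. The generator of our new process can be written
$$\mathcal L_{\Lambda,n} f(\alpha) = \sum_{b\in O_\Lambda} \nabla_b f(\alpha) + \sum_{b\in V_\Lambda} q^{\alpha_{x_b} - \alpha_{y_b}} \nabla_b f(\alpha). \tag{2.28}$$
The generator $\mathcal L_{\Lambda,n}$ is symmetric in $L^2(\Omega_\Lambda, \tilde\nu_{\Lambda,n})$, where now $\tilde\nu_{\Lambda,n}$ is again given by (2.5) and (2.6) but we interpret $\ell_x$ as the third coordinate $x_3$. The Dirichlet form associated to this process is defined by
$$\mathcal E_{\Lambda,n}(f, f) = \mathcal E^O_{\Lambda,n}(f, f) + \mathcal E^V_{\Lambda,n}(f, f), \tag{2.29}$$
where $f : \Omega_\Lambda \to \mathbb R$ and
$$\mathcal E^O_{\Lambda,n}(f, f) = \frac12 \sum_{\alpha\in\Omega_\Lambda} \sum_{b\in O_\Lambda} \tilde\nu_{\Lambda,n}(\alpha) \bigl( \nabla_b f(\alpha) \bigr)^2, \tag{2.30}$$
$$\mathcal E^V_{\Lambda,n}(f, f) = \frac12 \sum_{\alpha\in\Omega_\Lambda} \sum_{b\in V_\Lambda} \tilde\nu_{\Lambda,n}(\alpha)\, q^{\alpha_{x_b} - \alpha_{y_b}} \bigl( \nabla_b f(\alpha) \bigr)^2. \tag{2.31}$$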
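We have the following simple relation between the old process on $\Lambda$ and the new process on $\Phi\Lambda$. Given $\Lambda \subset \mathbb Z^3$ and $f : \Omega_\Lambda \to \mathbb R$, set $\tilde\Lambda = \Phi\Lambda$ and $\tilde f(\alpha) = f(\alpha \circ \Phi^{-1})$.
Lemma 2.6. For every $q \in (0, 1]$ there exists $k < \infty$ such that for every $\Lambda$ and $n = 0, 1, \dots, |\Lambda|$,
$$k^{-1} \langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n} \le \mathcal E_{\tilde\Lambda,n}(\tilde f, \tilde f) \le k\, \langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n}. \tag{2.32}$$
Proof. To prove the second estimate observe that by (2.15)
$$\mathcal E^V_{\tilde\Lambda,n}(\tilde f, \tilde f) \le (q + q^{-1})\, \langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n},$$
since for any $\tilde b \in V_{\tilde\Lambda}$, $\tilde b = (x, y)$, we have $b := (\Phi^{-1}x, \Phi^{-1}y) \in B_\Lambda$ and the rates coincide apart from the factor $(q + q^{-1})$. Therefore we only have to control the horizontal part (2.30). Let us fix a horizontal bond $\tilde b \in O_{\tilde\Lambda}$, $\tilde b = (x, y)$. The point is that $b := (\Phi^{-1}x, \Phi^{-1}y)$ is not a true bond in $B_\Lambda$ (since $\ell_{\Phi^{-1}x} = \ell_{\Phi^{-1}y}$), but we can find $b_1, b_2 \in B_\Lambda$ such that, with the notation $T_b\alpha = \alpha^b$, one has
$$\alpha^b = T_{b_1} T_{b_2} T_{b_1} \alpha, \qquad \nabla_b f(\alpha) = \nabla_{b_1} f(T_{b_2} T_{b_1}\alpha) + \nabla_{b_2} f(T_{b_1}\alpha) + \nabla_{b_1} f(\alpha).$$
Observing that $c_b \ge q/(q + q^{-1})$ and that changes of measures give at most an additional factor $q^{-2}$, e.g. $\nu_n(\alpha) \le q^{-2} \nu_n(T_{b_1}\alpha)$ for any $\alpha$, we can estimate
$$\begin{aligned} \sum_{\alpha\in\Omega_{\tilde\Lambda}} \tilde\nu_{\tilde\Lambda,n}(\alpha) \bigl( \nabla_{\tilde b} \tilde f(\alpha) \bigr)^2 &= \sum_{\alpha\in\Omega_\Lambda} \nu_{\Lambda,n}(\alpha) \bigl( \nabla_b f(\alpha) \bigr)^2 \\ &\le 3 \sum_{\alpha\in\Omega_\Lambda} \nu_{\Lambda,n}(\alpha) \Bigl[ \bigl( \nabla_{b_1} f(T_{b_2} T_{b_1}\alpha) \bigr)^2 + \bigl( \nabla_{b_2} f(T_{b_1}\alpha) \bigr)^2 + \bigl( \nabla_{b_1} f(\alpha) \bigr)^2 \Bigr] \\ &\le 3 q^{-3} (q + q^{-1}) \sum_{\alpha\in\Omega_\Lambda} \nu_{\Lambda,n}(\alpha) \Bigl[ 2 c_{b_1}(\alpha) \bigl( \nabla_{b_1} f(\alpha) \bigr)^2 + c_{b_2}(\alpha) \bigl( \nabla_{b_2} f(\alpha) \bigr)^2 \Bigr]. \end{aligned}$$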
Summing over $\tilde b \in O_{\tilde\Lambda}$ we obtain
$$\mathcal E^O_{\tilde\Lambda,n}(\tilde f, \tilde f) \le 6 q^{-3} (q + q^{-1})\, \langle f, (-\mathcal G_{\Lambda,n}) f \rangle_{\nu_n}.$$
To prove the first inequality in (2.32) we repeat the same reasoning, observing that for every bond $b \in B_\Lambda$ either $b$ is along a single stick, in which case the bound is straightforward since $\tilde b \in V_{\tilde\Lambda}$ ($\tilde b$ is the image of $b$ under $\Phi$), or $b$ connects two different sticks. In the latter case there are $\tilde b_1 \in O_{\tilde\Lambda}$ and $\tilde b_2 \in V_{\tilde\Lambda}$ such that the exchange across $\tilde b$ can be realized by successive exchanges across $\tilde b_1$ and $\tilde b_2$ and the above arguments apply. □
Thanks to Lemma 2.6 we will obtain Theorem 2.2 as a consequence of the following
Theorem 2.7. For any $q \in (0, 1)$ there exists a constant $k \in (0, \infty)$ such that for every positive integer $L$ of the form $L = 2^j$, $j \in \mathbb N$,
$$\inf_{H,n}\, \mathrm{gap}(\mathcal L_{\Sigma_{L,H},n}) \ge k^{-1} L^{-2}, \tag{2.33}$$
$$\sup_{H,n}\, \mathrm{gap}(\mathcal L_{\Sigma_{L,H},n}) \le k L^{-2}. \tag{2.34}$$
Remark 2.8. Since there is complete symmetry between particles ($\alpha_x = 1$) and holes ($\alpha_x = 0$), for any $\Lambda$ and any $n = 0, 1, \dots, |\Lambda|$ we have
$$\mathrm{gap}(\mathcal L_{\Lambda,n}) = \mathrm{gap}(\mathcal L_{\Lambda,|\Lambda|-n}). \tag{2.35}$$
Convention. In the rest of the paper we work in the straight geometrical setting described above. With some abuse we keep all the notations unchanged and write $\ell_x$ for the third coordinate $x_3$. In this way sets $A_h$ are now horizontal planes, $\Sigma_x$ denotes a vertical stick, $\Sigma_{\Gamma,H}$ denotes a straight cylinder, $R_{L,M}$ denotes a rectangle on the plane $A_0$ and so on. Moreover, the probability measure $\tilde\nu_n$ will be simply written $\nu_n$, so that $\nu_n$ and $\mu^\lambda$ are defined as in (2.6) and (2.7) provided $\ell_x$ stands for $x_3$.

3. Preliminary Results

In this section we collect several preliminary technical results that will enter at different stages in the proof of our two main results. As a rule, in what follows $k$ denotes a positive finite constant depending only on $q$, whose value may change from line to line.
3.1. Mean and variance of the number of particles. In this first paragraph we give some elementary bounds on the statistics of the number of particles in a stick and on the chemical potential as a function of the mean number of particles. Part of the results discussed below have already been derived with more accurate constants in [3] in the case of an interface sitting roughly in the middle of the cylinder. Here we need results that are uniform in the location of the interface. Let us consider a single stick of height $H$, $\Lambda := \Sigma_H$, and the grand canonical measure $\mu^\lambda$ on $\Omega_\Lambda$, $\lambda \in \mathbb R$. We have the following simple relations between $\lambda$ and $m(\lambda) := \mu^\lambda(N_\Lambda)$, the mean number of particles in $\Lambda$.
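Lemma 3.1. For each $q \in (0, 1)$ there exists $k < \infty$ such that for every $H \ge 1$,
$$(\lambda - k) \vee \frac{\lambda}{2} \le m(\lambda) \le k + \lambda, \qquad \text{if } \lambda \in [0, H - 1], \tag{3.1}$$
$$\frac1k\, q^{2|\lambda|} \le m(\lambda) \le k\, q^{2|\lambda|}, \qquad \text{if } \lambda < 0, \tag{3.2}$$
$$H - k\, q^{2(\lambda - H)} \le m(\lambda) \le H, \qquad \text{if } \lambda \ge H. \tag{3.3}$$
Proof. From (2.7) we have the identity
$$m(\lambda) = \sum_{j=0}^{H-1} \frac{q^{2(j - \lambda)}}{1 + q^{2(j - \lambda)}}. \tag{3.4}$$
If $\lambda > 0$, the summand in (3.4) is bounded below by 1/2 for all $j \le [\lambda]$ and the bound $m(\lambda) \ge \lambda/2$ is straightforward. When $\lambda \in [0, H - 1]$, writing $\lambda = [\lambda] + \{\lambda\}$ we have
$$m(\lambda) = [\lambda] + 1 - \sum_{k=0}^{[\lambda]} \frac{q^{2(k + \{\lambda\})}}{1 + q^{2(k + \{\lambda\})}} + \sum_{l=1}^{H-1-[\lambda]} \frac{q^{2(l - \{\lambda\})}}{1 + q^{2(l - \{\lambda\})}}. \tag{3.5}$$
The estimate $|m(\lambda) - \lambda| \le k$ then follows easily from (3.5). This proves (3.1). To prove (3.2) observe that if $\lambda < 0$,
$$m(\lambda) = q^{2|\lambda|} \sum_{j=0}^{H-1} \frac{q^{2j}}{1 + q^{2(j - \lambda)}}, \tag{3.6}$$
and therefore
$$q^{2|\lambda|}\, \frac{1 - q^{2H}}{2(1 - q^2)} \le m(\lambda) \le q^{2|\lambda|}\, \frac{1 - q^{2H}}{1 - q^2}.$$
Finally (3.3) follows from
$$m(\lambda) = \sum_{j=0}^{H-1} \frac{1}{q^{2(\lambda - j)} + 1} = H - q^{2(\lambda - H)} \sum_{k=0}^{H-1} \frac{q^{2(k+1)}}{1 + q^{2(k + 1 + \lambda - H)}}. \tag{3.7}$$
□
Next we consider $\sigma^2(\lambda) := \mathrm{Var}_{\mu^\lambda}(N_\Lambda)$, the variance of the number of particles in $\Lambda$.
Lemma 3.2. For each $q \in (0, 1)$ there exists $k < \infty$ such that for every $H \ge 1$, $\lambda \in \mathbb R$,
$$\frac1k \le q^{2[\lambda \wedge 0 + (H - \lambda) \wedge 0]}\, \sigma^2(\lambda) \le k. \tag{3.8}$$
Proof. Recall that
$$\sigma^2(\lambda) = \sum_{j=0}^{H-1} \frac{q^{2(j - \lambda)}}{(1 + q^{2(j - \lambda)})^2} = \sum_{j=0}^{H-1} \bigl( q^{-(j - \lambda)} + q^{(j - \lambda)} \bigr)^{-2}. \tag{3.9}$$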
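By symmetry (particle-hole duality), it is sufficient to consider the range $\lambda \le H/2$. For the upper bound observe that (3.9) is bounded above by
$$\sum_{j=0}^{H-1} q^{2|j - \lambda|} \le \begin{cases} \dfrac{1 + q^2}{1 - q^2} & 0 \le \lambda \le H/2 \\[2mm] q^{2|\lambda|}\, \dfrac{1}{1 - q^2} & \lambda < 0. \end{cases} \tag{3.10}$$
For the lower bound we estimate (3.9) below by
$$\frac14 \sum_{j=0}^{H-1} q^{2|j - \lambda|} \ge \frac14 \begin{cases} q^2 & 0 \le \lambda \le H/2 \\ q^{2(|\lambda| + 1)} & \lambda < 0. \end{cases} \tag{3.11}$$
□
Remark 3.3. A simple consequence of the bounds in Lemma 3.1 and Lemma 3.2 is the following. Let $\Lambda_L = \Sigma_{L,H}$ be the usual cylinder containing $HL^2$ sites and let $\sigma_L^2(\lambda)$ denote the variance of $N_{\Lambda_L}$ w.r.t. $\mu^\lambda$. Let also $m_L(\lambda) = \mu^\lambda(N_{\Lambda_L})$. Since $m_L(\lambda) = L^2 m(\lambda)$ and $\sigma_L^2(\lambda) = L^2 \sigma^2(\lambda)$ the estimates of the two lemmas can be combined to obtain that there exists $k = k(q) < \infty$ such that for any $\lambda \in \mathbb R$ and any $H \ge 1$,
$$\frac1k \bigl( m_L(\lambda) \wedge L^2 \bigr) \le \sigma_L^2(\lambda) \le k \bigl( m_L(\lambda) \wedge L^2 \bigr). \tag{3.12}$$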
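The explicit sums (3.4) and (3.9) are easy to evaluate, which gives a quick numerical illustration of Lemma 3.1 and Lemma 3.2 on a single stick. The sketch below is not part of the proofs: the function names are illustrative, and the numerical constants quoted in the comments follow from the geometric series bounds for the particular value $q = 1/2$ only.

```python
def occupation(q, j, lam):
    """Probability q^{2(j-lam)} / (1 + q^{2(j-lam)}) of a particle at height j."""
    r = q ** (2.0 * (j - lam))
    return r / (1.0 + r)


def mean_particles(q, height, lam):
    """m(lambda) as in (3.4)."""
    return sum(occupation(q, j, lam) for j in range(height))


def variance_particles(q, height, lam):
    """sigma^2(lambda) as in (3.9), written as a sum of p_j (1 - p_j)."""
    total = 0.0
    for j in range(height):
        p = occupation(q, j, lam)
        total += p * (1.0 - p)
    return total


if __name__ == "__main__":
    q, height = 0.5, 20
    for lam in (-6, -3, -1, 0, 5, 10, 19):
        m = mean_particles(q, height, lam)
        s2 = variance_particles(q, height, lam)
        print(f"lambda={lam:3d}  m={m:.6f}  sigma^2={s2:.6f}")
    # For integer lambda in [0, H-1] and q = 1/2 one expects |m - lambda| < 1 and
    # 1/4 <= sigma^2 <= (1 + q^2)/(1 - q^2); for lambda < 0 the ratio
    # m / q^{2|lambda|} stays between S/2 and S with S = (1 - q^{2H})/(1 - q^2).
```

The output shows the three regimes of Lemma 3.1: exponentially small mean for $\lambda < 0$, mean close to $\lambda$ in the bulk, and a variance of order one as long as the interface sits inside the stick.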
3.2. Comparison of canonical and grand canonical measures. In this second paragraph we will discuss some simple but important results on the equivalence and comparison between the finite volume canonical measure $\nu_{\Lambda,n}$ and the grand canonical one $\mu^\lambda_\Lambda$, where $\Lambda := \Sigma_{L,H}$ and the chemical potential $\lambda = \lambda(\Lambda, n)$ is chosen in such a way that $\mu^\lambda_\Lambda(N_\Lambda) = n$. Although some of the results discussed below have already been discussed in the seminal paper [3], for later purposes we need to improve the estimates obtained in [3] to get bounds similar to those established in [8] for general lattice gases. For notational convenience we drop all the super/subscripts in the measures.
Theorem 3.4. For any $q \in (0, 1)$ there exists a constant $k \in (0, \infty)$ such that for every positive integers $H$, $L$, $L_0 \le L$, for every $n = 0, 1, \dots, HL^2$ and for any bounded function $f$ such that its support is contained in the sub-cylinder $\Sigma_{L_0,H}$ we have
$$|\nu(f) - \mu(f)| \le k \|f\|_\infty \frac{L_0^2}{L^2}. \tag{3.13}$$
Remark 3.5. For the above result it is completely irrelevant whether we are in a tilted or straight geometry. Also, because of horizontal translation invariance, all that matters is that the support of $f$ is contained in some cylinder with basis of linear size $L_0$.
Remark 3.6. It is interesting to compare our result with that of [3]. There the dependence on $L_0$, $L$ is worse because the leading term is of the order of $\frac{L_0^2}{L - L_0}$ but the coefficient in front of it, $k\|f\|_\infty$ in our case, is better since it is proportional to $\sup_n |\nu_{\Lambda,n}(f)|$. On the other hand the proof given below works in any dimension so that the bound (3.13) is valid for all $d \ge 2$ with $(L_0/L)^2$ replaced by $(L_0/L)^{d-1}$.
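Proof. We begin by proving the result for $f(\alpha) = \alpha_x$, $x \in \Sigma_{L,H}$. In what follows $k$ will denote a generic constant depending only on $q$ whose value may change from time to time. It will first be convenient to fix some additional handy notation. We let
$$\sigma_y^2 := \mu(\alpha_y, \alpha_y); \qquad \sigma^2 := L^2 \sum_{y\in\Sigma_0} \sigma_y^2; \qquad \rho_x := \mu(\alpha_x);$$
$$\bar\alpha_x := \alpha_x - \rho_x; \qquad \beta_{xy} := \frac{\sigma_y^2}{\sigma_x^2}; \qquad \phi_x(t) := \mu\bigl( e^{i \frac{t}{\sigma} \bar\alpha_x} \bigr).$$
Notice that $\sigma^2 = \mu(N_\Lambda, N_\Lambda)$ because of the product structure of the measure $\mu$. Following [8] we begin by proving that for any $x, y \in \Lambda$,
$$\bigl| \nu(\alpha_y - \beta_{xy}\alpha_x) - \mu(\alpha_y - \beta_{xy}\alpha_x) \bigr| \le k\, \frac{\sigma_y^2}{\sigma^2} \tag{3.14}$$
for some constant $k = k(q)$. Once (3.14) holds then, a summation over $y$ together with the identity $\sum_y [\nu(\alpha_y) - \mu(\alpha_y)] = 0$ and the definition of $\beta_{xy}$ yields
$$|\nu(\alpha_x) - \mu(\alpha_x)| \le k\, \frac{\sigma_x^2}{\sigma^2}, \tag{3.15}$$
which is a slightly stronger result than the sought bound $\frac{k}{L^2}$ because
$$\sigma_x^2/\sigma^2 \le \sum_{z\in\Sigma_x} \sigma_z^2/\sigma^2 = \frac{1}{L^2}. \tag{3.16}$$
In order to prove (3.14) we write
$$\nu(\alpha_y - \beta_{xy}\alpha_x) - \mu(\alpha_y - \beta_{xy}\alpha_x) = \frac{1}{2\pi\sigma\, \mu(N_\Lambda = n)} \int_{-\pi\sigma}^{\pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)},\ \alpha_y - \beta_{xy}\alpha_x \bigr), \tag{3.17}$$
where
$$\mu(N_\Lambda = n) = \frac{1}{2\pi\sigma} \int_{-\pi\sigma}^{\pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} \bigr)$$
denotes the $\mu$-probability of having $n$ particles in $\Lambda$. Since $\mu$ is a product measure, the absolute value of the numerator in the integrand is bounded from above by
$$\prod_{z \ne x,y} |\phi_z(t)|\ \Bigl| \mu\bigl( e^{i \frac{t}{\sigma}(\bar\alpha_x + \bar\alpha_y)} [\bar\alpha_y - \beta_{xy}\bar\alpha_x] \bigr) \Bigr|. \tag{3.18}$$
It is quite easy to check that
$$|\phi_z(t)| \le e^{-k \frac{\sigma_z^2}{\sigma^2} t^2} \qquad \forall\, |t| \le \pi\sigma \tag{3.19}$$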
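for some constant $k = k(q)$, so that
$$\prod_{z \ne x,y} |\phi_z(t)| \le e^{-k t^2}.$$
Moreover the identity $e^{i\vartheta} = 1 + i\vartheta + R(\vartheta)$ with $|R(\vartheta)| \le \frac{\vartheta^2}{2}$ together with the definition of the coefficient $\beta_{xy}$ gives
$$\Bigl| \mu\bigl( e^{i \frac{t}{\sigma}(\bar\alpha_x + \bar\alpha_y)} [\bar\alpha_y - \beta_{xy}\bar\alpha_x] \bigr) \Bigr| = \Bigl| \phi_x(t)\, \mu\bigl( e^{i \frac{t}{\sigma}\bar\alpha_y} \bar\alpha_y \bigr) - \beta_{xy}\, \phi_y(t)\, \mu\bigl( e^{i \frac{t}{\sigma}\bar\alpha_x} \bar\alpha_x \bigr) \Bigr| \le k\, \frac{t^2}{\sigma^2}\, \sigma_y^2. \tag{3.20}$$
In conclusion, putting together (3.19) and (3.20), the numerator of (3.17) is bounded from above by $k \frac{\sigma_y^2}{\sigma^2}$. We are left with the analysis of the denominator in (3.17). We will show that
$$\sigma\, \mu(N_\Lambda = n) \ge k^{-1} \tag{3.21}$$
uniformly in $n$, $L$, $H$. We have to distinguish between the case in which $\sigma^2$ is “large” and the case in which $\sigma^2$ is “small”. To be more specific we fix a large number $A$ and start to analyze the case $\sigma \ge A^5$. Again using Fourier analysis we write
$$\begin{aligned} 2\pi\sigma\, \mu(N_\Lambda = n) &= \int_{-\pi\sigma}^{\pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} \bigr) \\ &= \int_{A \le |t| \le \pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} \bigr) + \int_{-A}^{A} dt \prod_{x\in\Lambda} \phi_x(t) \\ &= \int_{A \le |t| \le \pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} \bigr) + \int_{-A}^{A} dt \prod_{x\in\Lambda} \Bigl[ 1 - \frac{t^2}{2} \frac{\sigma_x^2}{\sigma^2} + R_x(t) \Bigr] \\ &= \int_{A \le |t| \le \pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} \bigr) + \int_{-A}^{A} dt \prod_{x\in\Lambda} \Bigl[ 1 - \frac{t^2}{2} \frac{\sigma_x^2}{\sigma^2} \Bigr] + \int_{-A}^{A} dt\ R(t), \end{aligned} \tag{3.22}$$
where $R_x(t) = \phi_x(t) - 1 + \frac{t^2}{2} \frac{\sigma_x^2}{\sigma^2}$ and a simple expansion gives
$$R(t) = \sum_{j=1}^{|\Lambda|} \frac{1}{j!} \sum_{\substack{x_1 \dots x_j \in \Lambda: \\ x_1 \ne x_2 \ne \dots \ne x_j}} \prod_{i=1}^{j} R_{x_i}(t) \prod_{z\in\Lambda\setminus\{x_1, \dots, x_j\}} \Bigl[ 1 - \frac{t^2}{2} \frac{\sigma_z^2}{\sigma^2} \Bigr]. \tag{3.23}$$
Let us examine the three terms in the r.h.s. of (3.22). The first one is smaller in absolute value than $k e^{-k' A^2}$ because of the gaussian bound (3.19). Observing that
$$\prod_{x\in\Lambda} \Bigl[ 1 - \frac{t^2}{2} \frac{\sigma_x^2}{\sigma^2} \Bigr] \ge e^{-\frac{t^2}{2} \sum_{x\in\Lambda} \frac{\sigma_x^2}{\sigma^2} - \frac{t^4}{4\sigma^2}} = e^{-\frac{t^2}{2} - \frac{t^4}{4\sigma^2}}$$
we see that the second one is greater than 1/2 provided that $A$ is large enough, uniformly in all the parameters. Finally, using
$$|R_x(t)| \le k\, \frac{|t|^3}{\sigma^3}\, \sigma_x^2$$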
and (3.23), the absolute value of the third one is smaller than
$$2A \sup_{|t| \le A} \sum_{j=1}^{|\Lambda|} \frac{1}{j!} \Bigl( \sum_{x\in\Lambda} |R_x(t)| \Bigr)^j \le 2A \sup_{|t| \le A} \sum_{j=1}^{|\Lambda|} \frac{1}{j!} \Bigl( k \frac{|t|^3}{\sigma} \Bigr)^j \le k A^{-1}.$$
In conclusion, if $\sigma \ge A^5$ and $A$ is large enough (but independent of $L$, $H$, $n$) (3.21) holds true. Let us now examine the case $\sigma \le A^5$ which, for large values of $L$ and $H$, corresponds to an extremely low density of particles (cf. (3.12)). In this case we bound $\mu(N_\Lambda = n)$ from below as follows. If $L^2 \le 2n$ then we impose that all the particles in the cylinder are packed starting from the bottom and according to an arbitrary ordering of the sites on each horizontal square $Q_L + (0, 0, \ell)$. It is not difficult to check that the probability of such an event is bounded below by $\exp(-\gamma L^2)$ for some $\gamma = \gamma(q) > 0$, uniformly in the height of the cylinder. But since $L^2 \le 2n \le k\sigma^2$ (by (3.12)) we have a lower bound $\exp(-k\gamma A^{10})$. If instead $L^2 \ge 2n$ then we impose that all the particles are at height $\ell = 0$. The probability of this last event is equal to
$$\binom{L^2}{n} p_0^n (1 - p_0)^{L^2 - n} \prod_{\ell=1}^{H-1} (1 - p_\ell)^{L^2}, \tag{3.24}$$
where $p_\ell = \frac{q^{2(\ell - \lambda)}}{1 + q^{2(\ell - \lambda)}}$ is the probability that there is a particle at a site $x$ with $\ell_x = \ell$. Notice that $p_\ell \le \frac12$ for any $\ell \ge 0$ since $L^2 \sum_{\ell \ge 0} p_\ell = n$. In particular
$$\prod_{\ell=1}^{H-1} (1 - p_\ell)^{L^2} \ge \exp\Bigl( -4 L^2 \sum_{\ell=0}^{H-1} (1 - p_\ell) p_\ell \Bigr) = \exp(-4\sigma^2) \ge \exp(-4 A^{10}). \tag{3.25}$$
Finally, since $p_0 L^2 \le n$ and $n \le k\sigma^2 \le k A^{10}$, also the factor $\binom{L^2}{n} p_0^n (1 - p_0)^{L^2 - n}$ is bounded away from zero uniformly in $L$, $H$.
We are now in a position to give the result in its full generality. For any $x \in \Lambda_0 = \Sigma_{L_0,H}$, set $\beta_{x,f} := \mu(f, N_{\Lambda_0})/\sigma_x^2$, where $N_{\Lambda_0}$ denotes the number of particles in $\Lambda_0$. Observing that $\mathrm{Var}_\mu(f) \le 2\|f\|_\infty^2 \mathrm{Var}_\mu(N_{\Lambda_0})$, and using Schwarz' inequality, it is not difficult to check that
$$|\beta_{x,f}| \le k \|f\|_\infty \frac{\mathrm{Var}_\mu(N_{\Lambda_0})}{\sigma_x^2}. \tag{3.26}$$
Thanks to (3.15),
$$|\beta_{x,f}|\, |\nu(\alpha_x) - \mu(\alpha_x)| \le k \|f\|_\infty \frac{\mathrm{Var}_\mu(N_{\Lambda_0})}{\sigma^2} = k \|f\|_\infty \frac{L_0^2}{L^2}.$$
We can thus safely replace $f$ by $f - \beta_{x,f}\alpha_x$. We proceed at this point exactly as in (3.17)–(3.20) and we observe that, because of the very definition of $\beta_{x,f}$, the analogue of (3.20) holds, namely
$$\Bigl| \mu\bigl( e^{i \frac{t}{\sigma}(\bar\alpha_x + \bar N_{\Lambda_0})} [f - \beta_{x,f}\alpha_x] \bigr) \Bigr| \le k \|f\|_\infty \frac{L_0^2}{L^2}. \tag{3.27}$$
The theorem then follows because of the bound (3.21). □
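The Fourier representation of $\mu(N_\Lambda = n)$ used in (3.17) and in (3.22) can be tested numerically. After the substitution $\theta = t/\sigma$ the integral runs over $[-\pi, \pi]$ and the integrand is a trigonometric polynomial of degree at most $|\Lambda|$, so an equispaced rule with $|\Lambda| + 1$ nodes reproduces it exactly. The sketch below compares this with the exact law of $N_\Lambda$ obtained by convolving Bernoulli variables. It assumes the straight geometry of the Convention in Sect. 2.6, and all function names are illustrative.

```python
import cmath
import math


def occupation_probabilities(q, side, height, lam):
    """p_l for every site of a straight cylinder with side^2 sites at each height l."""
    probs = []
    for l in range(height):
        r = q ** (2.0 * (l - lam))
        probs.extend([r / (1.0 + r)] * (side * side))
    return probs


def law_of_particle_number(probs):
    """Exact law of N under the product measure, by successive convolution."""
    law = [1.0]
    for p in probs:
        new = [0.0] * (len(law) + 1)
        for k, w in enumerate(law):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        law = new
    return law


def fourier_probability(probs, n):
    """mu(N = n) as (1 / 2 pi) * integral over [-pi, pi] of mu(e^{i theta (N - n)})."""
    size = len(probs) + 1               # number of equispaced nodes
    total = 0.0
    for k in range(size):
        theta = -math.pi + 2.0 * math.pi * k / size
        char = cmath.exp(-1j * theta * n)
        for p in probs:
            char *= (1.0 - p) + p * cmath.exp(1j * theta)   # one-site factor
        total += char.real
    return total / size


if __name__ == "__main__":
    probs = occupation_probabilities(0.5, 2, 4, 1.5)
    law = law_of_particle_number(probs)
    sigma = math.sqrt(sum(p * (1.0 - p) for p in probs))
    for n in range(len(law)):
        print(n, law[n], fourier_probability(probs, n), sigma * law[n])
```

The last column prints $\sigma\,\mu(N_\Lambda = n)$, the quantity bounded from below in (3.21) when $n$ is the mean number of particles. For values of $n$ far from the mean it is of course small.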
Our second result provides a simple upper bound on the canonical expectation of a rather general, non-negative function $f$ in terms of its grand canonical mean. The relevance of such bounds is that they allow one to control the canonical large deviations in terms of the grand canonical ones (see [7] and especially [9]). We begin with the very simple one dimensional case $\Lambda = \Sigma_H$.
Proposition 3.7. For any $q \in (0, 1)$ there exists a constant $k \in (0, \infty)$ such that for any one dimensional stick $\Lambda = \Sigma_H$, for every $n = 0, \dots, H$ and for every $f \ge 0$,
$$\nu(f) \le k \mu(f). \tag{3.28}$$
Proof. It suffices to observe that the probability $\mu(N_\Lambda = n)$ can be bounded from below by
$$\mu(N_\Lambda = n) \ge \prod_{x\in\Lambda :\ \ell_x \le n-1} \mu(\alpha_x) \prod_{y\in\Lambda :\ \ell_y \ge n} \mu(1 - \alpha_y) \ge k'(q) > 0, \tag{3.29}$$
since, by Lemma 3.1, there exists a constant $k = k(q)$ such that $|\lambda(n) - n| \le k$. □
Next we turn to the genuine three dimensional case $\Lambda = \Sigma_{L,H}$.
Proposition 3.8. For any $q \in (0, 1)$ and $\delta \in (0, 1)$ there exists a constant $k \in (0, \infty)$ such that for every positive integers $H$ and $L$, for every $n = 0, 1, \dots, HL^2$ and for every non-negative function $f$ whose support does not intersect more than $(1 - \delta)L^2$ sticks,
$$\nu(f) \le k \mu(f). \tag{3.30}$$
In particular,
$$\nu(f, f) \le k \mu(f, f). \tag{3.31}$$
Proof. Let us denote by $\Lambda_f$ the support of $f$. Then, using (3.21) together with the gaussian upper bound on the absolute value of the characteristic function (3.19),
$$\begin{aligned} \nu(f) &= \frac{1}{2\pi\sigma\, \mu(N_\Lambda = n)} \int_{-\pi\sigma}^{\pi\sigma} dt\ \mu\bigl( e^{i \frac{t}{\sigma}(N_\Lambda - n)} f \bigr) \\ &\le k \mu(f) \int_{-\pi\sigma}^{\pi\sigma} dt \prod_{x\in\Lambda\setminus\Lambda_f} |\phi_x(t)| \\ &\le k \mu(f) \int_{-\pi\sigma}^{\pi\sigma} dt\ e^{-k\delta t^2} \le k' \mu(f). \end{aligned} \tag{3.32}$$
The result for the variance follows at once from $\nu(f, f) \le \nu\bigl( (f - \mu(f))^2 \bigr) \le k \mu(f, f)$. □
3.3. An estimate on covariances. An important ingredient of our approach is the following version of a well known estimate due to [18] (see also [7]). Set $\Lambda = \Sigma_{L,H}$, and let $\nu$ denote the canonical measure with $n$ particles in $\Lambda$. The result given below will be used in the recursive estimate of Sect. 4 in the regime of large $L$, see Theorem 4.1. On the other hand its proof uses an estimate for small values of $L$ (see (4.4) and (4.5)) that will be proven independently later on. In the following $B$ denotes a planar section of $\Lambda$, i.e. $B = \Lambda \cap A_h$ for some integer $h \le H - 1$, and $N_B$ is the number of particles in $B$.
Proposition 3.9. For every $q \in (0, 1)$ and for every $\varepsilon > 0$ there exists $C_\varepsilon = C_\varepsilon(q) < \infty$ such that for any function $f$, any height $H \ge 1$ and for all $n = 0, \dots, |\Lambda|$ we have
$$\nu(f, N_B)^2 \le \bigl( L^2 \wedge n \bigr) \bigl[ C_\varepsilon\, \mathcal E_\nu(f, f) + \varepsilon\, \mathrm{Var}_\nu(f) \bigr]. \tag{3.33}$$
Proof. We take $R \in \mathbb Z_+$, depending only on $\varepsilon$ and $q$, and write the square $Q_L$ as the disjoint union of smaller squares $Q_R^i$ of side $R \ll L$. This is no real loss since, in view of the horizontal exchangeability of variables under $\nu$, the geometry of the basis does not play any role and we can always assume $Q_L$ to be given by the union of $\cup_i Q_R^i$ and $\bar Q$, where $\bar Q$ is a small region contained in a square of side $R$ which is inessential in the argument below. Let then $\Lambda = \cup_i \Sigma_i$, $\Sigma_i := \Sigma_{Q_R^i,H}$, $N_i := N_{\Sigma_i}$, and let $\mathcal F$ be the $\sigma$-algebra generated by the random variables $\{N_i\}$. For any pair of functions $f, g$ we have
$$\nu(f, g) = \nu\bigl( \nu(f, g \mid \mathcal F) \bigr) + \nu\bigl( f, \nu(g \mid \mathcal F) \bigr). \tag{3.34}$$
Simple estimates then allow us to write
$$\nu(f, g)^2 \le 2 \nu\bigl( \mathrm{Var}_\nu(g \mid \mathcal F) \bigr)\, \nu\bigl( \mathrm{Var}_\nu(f \mid \mathcal F) \bigr) + 2\, \mathrm{Var}_\nu(f)\, \mathrm{Var}_\nu\bigl( \nu(g \mid \mathcal F) \bigr). \tag{3.35}$$
Now define the function $g = \sum_i g_i$, with $g_i = N_{B_i} - \beta N_i$, where $B_i = B \cap \Sigma_i$ and $\beta$ is a parameter to be fixed later on. Observe that with this choice $\nu(f, N_B) = \nu(f, g)$ and
$$\mathrm{Var}_\nu(g \mid \mathcal F) = \sum_i \mathrm{Var}_\nu(N_{B_i} \mid \mathcal F_i), \tag{3.36}$$
where we used $\mathcal F_i$ to denote the $\sigma$-algebra generated by $N_i$. We fix now a value $n_i$ for $N_i$ and write $\lambda(n_i)$ for the corresponding chemical potential, i.e.
$$n_i = \mu^{\lambda(n_i)}(N_i) = |B_i| \sum_{\ell=0}^{H-1} p_\ell^{\lambda(n_i)}, \qquad p_\ell^{\lambda(n_i)} = \frac{q^{2(\ell - \lambda(n_i))}}{1 + q^{2(\ell - \lambda(n_i))}}. \tag{3.37}$$
Using Proposition 3.8 we have (recall that $h$ is the level of every $B_i$)
$$\mathrm{Var}_\nu(N_{B_i} \mid N_i = n_i) \le k\, \mathrm{Var}_{\mu^{\lambda(n_i)}}(N_{B_i}) = k |B_i|\, p_h^{\lambda(n_i)} \bigl( 1 - p_h^{\lambda(n_i)} \bigr), \tag{3.38}$$
and by Lemma 3.2 and Lemma 3.1 we have
$$\mathrm{Var}_\nu(N_{B_i} \mid N_i = n_i) \le k \bigl( |B_i| \wedge n_i \bigr). \tag{3.39}$$
In particular, together with (3.36) this implies
$$\max_{\{n_i\} :\ \sum_i n_i = n} \mathrm{Var}_\nu(g \mid \mathcal F) \le k \bigl( L^2 \wedge n \bigr). \tag{3.40}$$
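Since the measure $\nu(\cdot \mid \mathcal F)$ is a product $\otimes_i \nu(\cdot \mid \mathcal F_i)$ and each factor $\nu(\cdot \mid \mathcal F_i)$ satisfies a Poincaré inequality with a constant $W(R)$ uniform in the conditioning field (see (4.4) and (4.5)), we have
$$\begin{aligned} \nu\bigl( \mathrm{Var}_\nu(f \mid \mathcal F) \bigr) &\le \nu\Bigl( \sum_i \mathrm{Var}_\nu(f \mid \mathcal F_i) \Bigr) \\ &\le W(R) \sum_i \nu\Bigl( \sum_{b\in\Sigma_i^*} \nu\bigl( (\nabla_b f)^2 \mid \mathcal F_i \bigr) \Bigr) \\ &\le W(R)\, \mathcal E_\nu(f, f). \end{aligned} \tag{3.41}$$
Plugging (3.40) and (3.41) in (3.35) we obtain
$$\nu(f, g)^2 \le 2k W(R) \bigl( L^2 \wedge n \bigr) \mathcal E_\nu(f, f) + 2k L^2 R^{-2}\, \mathrm{Var}_\nu(f)\, \mathrm{Var}_\mu\bigl( \nu(g_1 \mid \mathcal F_1) \bigr), \tag{3.42}$$
where we have used again Proposition 3.8 to bound $\mathrm{Var}_\nu(\nu(g \mid \mathcal F))$ in terms of $\mathrm{Var}_\mu(\nu(g \mid \mathcal F)) = L^2 R^{-2}\, \mathrm{Var}_\mu(\nu(g_1 \mid \mathcal F_1))$ with $\mu := \mu^\lambda_\Lambda$ the grand canonical measure on $\Lambda$, and $\lambda := \lambda(n)$ such that $\mu(N_\Lambda) = n$. At this point we consider separately two cases corresponding to “many” and “few” particles respectively.
We start with the case of many particles. Suppose $n > \varepsilon^2 L^2$. Here the claim (3.33) will follow from
$$\mathrm{Var}_\mu\bigl( \nu(g_1 \mid \mathcal F_1) \bigr) \le k, \tag{3.43}$$
by taking $R$ sufficiently large in (3.42). To prove (3.43) we begin by observing that by Theorem 3.4,
$$\sup_{n_1} \bigl| \nu(g_1 \mid N_1 = n_1) - \mu_{\Sigma_1}^{\lambda(n_1)}(g_1) \bigr| \le k,$$
and therefore it is sufficient to show
$$\mathrm{Var}_\mu\bigl( \mu_{\Sigma_1}^{\lambda(N_1)}(g_1) \bigr) \le k. \tag{3.44}$$
An estimate on the variance of $\varphi(N_1) := \mu_{\Sigma_1}^{\lambda(N_1)}(g_1)$ can be obtained as follows. Since $\mu$ is a product $\otimes_{x\in\Sigma_i} \mu_x$, one has
$$\mathrm{Var}_\mu(\varphi) \le \sum_{x\in\Sigma_1} \mu\bigl( \mathrm{Var}_{\mu_x}(\varphi) \bigr).$$
But
$$\mu\bigl( \mathrm{Var}_{\mu_x}(\varphi) \bigr) = \sigma_x^2\, \mu\Bigl( \Bigl[ \varphi\Bigl( 1 + \sum_{y \ne x} \alpha_y \Bigr) - \varphi\Bigl( \sum_{y \ne x} \alpha_y \Bigr) \Bigr]^2 \Bigr), \qquad \sigma_x^2 = p^\lambda_{\ell_x} \bigl( 1 - p^\lambda_{\ell_x} \bigr).$$
It is not difficult now to deduce
$$\mathrm{Var}_\mu(\varphi) \le k \sum_{x\in\Sigma_1} \sigma_x^2\ \mu\bigl( [\varphi(N_1 + 1) - \varphi(N_1)]^2 \bigr). \tag{3.45}$$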
Energy Gap in the XXZ Model
343
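By Remark 3.12 we know that $\sum_{x\in\Sigma_1}\sigma_x^2 \le k\,(R^2\wedge\mu(N_1))$. Since $\mu(N_1) = nR^2/L^2$ we have

$$\mathrm{Var}_\mu(\varphi(N_1)) \le kR^2\,\mu\big([\varphi(N_1+1)-\varphi(N_1)]^2\big). \tag{3.46}$$

Now we can choose $\beta$ so that

$$\mu\big(\mu_{\Sigma_1}^{\lambda(N_1+1)}(N_{B_1}) - \mu_{\Sigma_1}^{\lambda(N_1)}(N_{B_1})\big) = \beta.$$

In this way the right-hand side of (3.46) is again a variance and a new application of (3.45) gives

$$\mathrm{Var}_\mu(\varphi) \le kR^4\,\mu\big((\Delta\varphi(N_1+1))^2\big), \qquad \Delta\varphi(m) := \varphi(m+1)+\varphi(m-1)-2\varphi(m). \tag{3.47}$$

We are going to show that

$$\sup_m|\Delta\varphi(m)|^2 \le kR^{-4}. \tag{3.48}$$

We have

$$\Delta\varphi(m) = \int_0^1\!\!\int_0^1\partial_t\partial_s\,\mu_{\Sigma_1}^{\lambda(m+s+t)}(N_{B_1})\,dt\,ds. \tag{3.49}$$

Set $\lambda_{s,t} = \lambda(m+s+t)$. Using the identities

$$\partial_t\,\mu_{\Sigma_1}^{\lambda_{s,t}}(N_{B_1}) = \partial_s\,\mu_{\Sigma_1}^{\lambda_{s,t}}(N_{B_1}) = \frac{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_{B_1})}{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_1)}, \tag{3.50}$$

we have

$$\partial_t\partial_s\,\mu_{\Sigma_1}^{\lambda_{s,t}}(N_{B_1}) = \frac{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_1,N_{B_1})}{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_1)^2} - \frac{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_1,N_1)\,\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_{B_1})}{\mu_{\Sigma_1}^{\lambda_{s,t}}(N_1,N_1)^3}.$$

Here we use the standard notation $\mu(f,g,h) = \mu\big((f-\mu(f))(g-\mu(g))(h-\mu(h))\big)$. Direct computations show that for any $\lambda$,

$$\begin{aligned}
\mu_{\Sigma_1}^\lambda(N_1,N_1) &= |B_1|\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda),\\
\mu_{\Sigma_1}^\lambda(N_1,N_{B_1}) &= |B_1|\,p_h^\lambda(1-p_h^\lambda),\\
\mu_{\Sigma_1}^\lambda(N_1,N_1,N_1) &= |B_1|\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda)(1-2p_j^\lambda),\\
\mu_{\Sigma_1}^\lambda(N_1,N_1,N_{B_1}) &= |B_1|\,p_h^\lambda(1-p_h^\lambda)(1-2p_h^\lambda).
\end{aligned} \tag{3.51}$$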
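From (3.51), using $|B_1| = R^2$ we have

$$\partial_t\partial_s\,\mu_{\Sigma_1}^{\lambda_{s,t}}(N_{B_1}) = R^{-2}\,C(h,H,\lambda_{s,t}), \qquad C(h,H,\lambda) := \frac{2p_h^\lambda(1-p_h^\lambda)\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda)\,[p_j^\lambda-p_h^\lambda]}{\big(\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda)\big)^3}. \tag{3.52}$$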
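The passage from the cumulant formulas (3.51) to the closed form (3.52) is pure algebra, and it can be checked numerically. The following is a minimal sketch under the assumption that the block has `area` $= |B_1| = R^2$ sites per plane; all function names are hypothetical and introduced only for this check.

```python
import math

def p(j, lam, q):
    """Occupation probability p_j^lam = q^{2(j-lam)} / (1 + q^{2(j-lam)})."""
    w = q ** (2.0 * (j - lam))
    return w / (1.0 + w)

def cumulants(h, H, lam, q, area):
    """The four joint cumulants listed in (3.51)."""
    ps = [p(j, lam, q) for j in range(H)]
    ph = ps[h]
    c11 = area * sum(x * (1 - x) for x in ps)
    c1B = area * ph * (1 - ph)
    c111 = area * sum(x * (1 - x) * (1 - 2 * x) for x in ps)
    c11B = area * ph * (1 - ph) * (1 - 2 * ph)
    return c11, c1B, c111, c11B

def second_derivative(h, H, lam, q, area):
    """Right-hand side of the display preceding (3.51)."""
    c11, c1B, c111, c11B = cumulants(h, H, lam, q, area)
    return c11B / c11 ** 2 - c111 * c1B / c11 ** 3

def C(h, H, lam, q):
    """The quantity C(h, H, lam) defined in (3.52)."""
    ps = [p(j, lam, q) for j in range(H)]
    ph = ps[h]
    s = sum(x * (1 - x) for x in ps)
    num = 2 * ph * (1 - ph) * sum(x * (1 - x) * (x - ph) for x in ps)
    return num / s ** 3
```

For any admissible parameters, `second_derivative(h, H, lam, q, area)` should agree with `C(h, H, lam, q) / area` up to rounding, which is the content of (3.52).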
From (3.52) and (3.49) we see that (3.48) will follow from

$$\sup_{h,H\in\mathbb N,\ \lambda\in\mathbb R}C(h,H,\lambda) < \infty. \tag{3.53}$$

A first estimate gives

$$C(h,H,\lambda) \le \frac{2\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda)\,|p_h^\lambda-p_j^\lambda|}{\big(\sum_{j=0}^{H-1}p_j^\lambda(1-p_j^\lambda)\big)^2}.$$

Then Lemma 3.2 shows that $|C(h,H,\lambda)|$ is bounded whenever $\lambda\in[0,H-1]$. On the other hand if $\lambda\le 0$ then $1-p_j^\lambda\ge 1/2$, whereas if $\lambda\ge H-1$, then $p_j^\lambda\ge 1/2$. In any case for $0\le j,h\le H-1$,

$$|p_h^\lambda-p_j^\lambda| \le 2\sum_{\ell=0}^{H-1}p_\ell^\lambda(1-p_\ell^\lambda)$$

and (3.53) follows.

We turn to analyze the case of few particles: $n\le\epsilon^2L^2$. In this case we simply take $R = 1$ and call $\psi(N_1) := \nu(g_1\mid N_1)$. We may assume that $\epsilon^2k\le\epsilon$. Thus looking back at (3.42) we see that it will be sufficient to show

$$\mathrm{Var}_\mu(\psi) \le k\,\epsilon^2\,\frac{n}{L^2}. \tag{3.54}$$

Since now $\mu(N_1) = n/L^2$, (3.45) gives

$$\mathrm{Var}_\mu(\psi) \le k\,\frac{n}{L^2}\,\mu\big([\psi(N_1+1)-\psi(N_1)]^2\big).$$

Choosing $\beta = \mu\big(\nu(N_{B_1}\mid N_1+1) - \nu(N_{B_1}\mid N_1)\big)$, (3.47) becomes

$$\mathrm{Var}_\mu(\psi) \le k\,\frac{n^2}{L^4}\,\mu\big((\Delta\psi(N_1+1))^2\big) \le k\,\epsilon^2\,\frac{n}{L^2}\,\mu\big((\Delta\psi(N_1+1))^2\big).$$

On the other hand a trivial bound (remember that now $B_1$ is just a single site) gives $|\Delta\psi(m)|\le 8$ and (3.54) follows immediately. □
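3.4. Glauber bound for the number of particles in half volume. Consider the cylinder $\Sigma_{2L,H}$ and divide it into two parts:

$$\Lambda = \Sigma_{2L,H} = \Lambda_1\cup\Lambda_2, \qquad \Lambda_1 = \Sigma_{R_{L,2L},H}, \qquad \Lambda_2 = (L,0,0)+\Sigma_{R_{L,2L},H}. \tag{3.55}$$

Fix $n\in\{1,\dots,2L^2H = |\Lambda|/2\}$, let $\nu := \nu_{\Lambda,n}$ denote the canonical measure on $\Lambda = \Sigma_{2L,H}$ with total particle number $n$ and let $p_n(m) = \nu(N_{\Lambda_1} = m)$. We begin by establishing upper and lower bounds on the ratio $p_n(m+1)/p_n(m)$ for $m\ge n/2$. In what follows $\lambda_s$ will denote the chemical potential such that $\mu^{\lambda_s}(N_{\Lambda_1}) = s$.

Lemma 3.10. For any $q\in(0,1)$, there exists $k<\infty$ such that, uniformly in $L$, $H$, $n$ and $m\in[\frac n2,n]$,

$$k^{-1}q^{2(\lambda_{m+1}-\lambda_{n-m-1})} \le \frac{p_n(m+1)}{p_n(m)} \le k\,q^{2(\lambda_m-\lambda_{n-m})}. \tag{3.56}$$

Moreover, for every $\epsilon>0$ there exists $\delta\in(0,1)$ such that

$$\frac{p_n(m+1)}{p_n(m)} \le \epsilon \tag{3.57}$$

whenever $m\in[\delta n,n]$, uniformly in all other parameters.

Proof. On the event $\{N_{\Lambda_1} = m+1\}$ there are $m+1$ particles in $\Lambda_1$ and $|\Lambda_2|-(n-m-1)$ holes in $\Lambda_2$, with $|\Lambda_2| = |\Lambda_1|$. We write

$$\begin{aligned}
p_n(m+1) &= \sum_{x\in\Lambda_1}\sum_{y\in\Lambda_2}\frac{\nu\big(\alpha_x(1-\alpha_y)\,1_{\{N_{\Lambda_1}=m+1\}}\big)}{(m+1)\,(|\Lambda_1|-(n-m-1))}\\
&= \sum_{x\in\Lambda_1}\sum_{y\in\Lambda_2}\frac{q^{2(\ell_x-\ell_y)}\,\nu\big(\alpha_y(1-\alpha_x)\mid N_{\Lambda_1}=m\big)}{(m+1)\,(|\Lambda_1|-(n-m-1))}\,p_n(m)\\
&\le k\sum_{x\in\Lambda_1}\sum_{y\in\Lambda_2}\frac{q^{2(\ell_x-\ell_y)}\,\mu^{\lambda_m}(1-\alpha_x)\,\mu^{\lambda_{n-m}}(\alpha_y)}{(m+1)\,(|\Lambda_1|-(n-m-1))}\,p_n(m)\\
&= k\,\frac{m}{m+1}\,\frac{|\Lambda_1|-(n-m)}{|\Lambda_1|-(n-m-1)}\,q^{2(\lambda_m-\lambda_{n-m})}\,p_n(m)\\
&\le k'\,q^{2(\lambda_m-\lambda_{n-m})}\,p_n(m),
\end{aligned} \tag{3.58}$$

where we used Proposition 3.7 and the fact that $\nu(\cdot\mid m)$ is the product of $\nu_{\Lambda_1,m}$ and $\nu_{\Lambda_2,n-m}$. The lower bound in (3.56) can be obtained in a similar way if we write $p_n(m)$ as

$$p_n(m) = \sum_{x\in\Lambda_1}\sum_{y\in\Lambda_2}\frac{\nu\big(\alpha_y(1-\alpha_x)\,1_{\{N_{\Lambda_1}=m\}}\big)}{(n-m)\,(|\Lambda_1|-m)}$$

and proceed as above.

In order to prove the estimate (3.57) one could use bounds on the chemical potentials $\lambda_m$, $\lambda_{n-m}$. We prefer however a different route and rewrite $p_n(m+1)$ as follows. For any $\ell$ we set $A_\ell = \Lambda_1\cap\mathcal A_\ell$ and $B_\ell = \Lambda_2\cap\mathcal A_\ell$. Let also $N_{A_\ell}(\alpha)$ be the number of particles in the plane $A_\ell$, $V_{A_\ell}(\alpha) = 2L^2-N_{A_\ell}(\alpha)$ the corresponding number of holes, and similarly for $B_\ell$. Then

$$\begin{aligned}
p_n(m+1) &= \sum_{\ell=0}^{H-1}\sum_{x\in A_\ell}\sum_{y\in B_\ell}\nu\Big(\frac{\alpha_x(1-\alpha_y)}{\sum_{h=0}^{H-1}N_{A_h}V_{B_h}}\,1_{\{N_{\Lambda_1}=m+1\}}\Big)\\
&= \sum_{\ell=0}^{H-1}\sum_{x\in A_\ell}\sum_{y\in B_\ell}\nu\Big(\frac{(1-\alpha_x)\,\alpha_y}{\sum_{h=0}^{H-1}(N_{A_h}+\delta_{\ell,h})(V_{B_h}+\delta_{\ell,h})}\;\Big|\;N_{\Lambda_1}=m\Big)\,p_n(m)\\
&\le \nu\Big(\frac{\sum_{\ell=0}^{H-1}V_{A_\ell}N_{B_\ell}}{\sum_{h=0}^{H-1}N_{A_h}V_{B_h}}\;\Big|\;N_{\Lambda_1}=m\Big)\,p_n(m).
\end{aligned} \tag{3.59}$$

On the event $N_{\Lambda_1} = m$, $N_{\Lambda_2} = n-m$ we have

$$\frac{\sum_{\ell=0}^{H-1}V_{A_\ell}N_{B_\ell}}{\sum_{h=0}^{H-1}N_{A_h}V_{B_h}} = \frac{2L^2(n-m)-\sum_{\ell=0}^{H-1}N_{A_\ell}N_{B_\ell}}{2L^2m-\sum_{h=0}^{H-1}N_{A_h}N_{B_h}} \le \frac{n-m}{2m-n} \le \frac{1-\delta}{2\delta-1}$$

for $\delta n\le m\le n$. Therefore (3.57) follows from the estimate (3.59). □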
Next we establish a Poincaré inequality for the marginal of $\nu$ on $N_{\Lambda_1}$ with respect to the corresponding Metropolis–Glauber dynamics.

Proposition 3.11. For all $q\in(0,1)$, there exists a constant $k<\infty$ such that for all integers $L$, $H$, for all $n\in\{0,1,\dots,4L^2H\}$ and for all functions $g:\mathbb N\to\mathbb R$,

$$\mathrm{Var}_\nu\big(g(N_{\Lambda_1})\big) \le k\,(n\wedge L^2)\sum_m\big[p_n(m)\wedge p_n(m+1)\big]\,[g(m+1)-g(m)]^2. \tag{3.60}$$

Proof. We follow closely the analogous result for translation invariant lattice gases proved in [7] (see Theorem 4.4 there). Assume without loss of generality that $m\ge\frac n2$ and write $p_n(m)$ as $p_n(m) = e^{-V_n(m)}\varphi_n(m)$, where

$$V_n(m) := 2\log(1/q)\,\big[m\lambda_m+(n-m)\lambda_{n-m}\big] - \log\frac{Z_\Lambda(m)}{Z_\Lambda}, \qquad \varphi_n(m) := \frac{\mu_{\Lambda_1}^{\lambda_m}(N_{\Lambda_1}=m)\,\mu_{\Lambda_2}^{\lambda_{n-m}}(N_{\Lambda_2}=n-m)}{\mu(N_\Lambda=n)}.$$

Here the partition function $Z_\Lambda(m)$ is given by

$$Z_\Lambda(m) := \prod_{x\in\Lambda_1}\big(1+q^{2(\ell_x-\lambda_m)}\big)\prod_{y\in\Lambda_2}\big(1+q^{2(\ell_y-\lambda_{n-m})}\big)$$

and similarly for $Z_\Lambda$ but without the chemical potentials $\lambda_m$, $\lambda_{n-m}$. Were the factor $\varphi_n(m)$ constant (better: of bounded variation uniformly in $m$, $n$) then the desired Poincaré inequality would follow at once, using e.g. the Cheeger inequality or Hardy's inequality [21], from a convexity bound of the form (see (3.64) and (3.65) below)

$$\frac{d^2}{dm^2}V_n(m) \ge k'(\delta,q)\,\frac{1}{n\wedge L^2}.$$

Unfortunately the ratio $\varphi(m)/\varphi(m')$ can be rather large, depending on $n$, if e.g. $m\approx n/2$ and $m'\approx n$. On the other hand Lemma 3.10 shows that the distribution $p_n(m)$ has at least exponential tails so that, as far as the Poincaré inequality is concerned, the tails should be irrelevant. That is indeed true and, according to Sect. 4 of [7], the result follows if we can show that there exists $\delta<1$ with $1-\delta\ll 1$ and a constant $k$ such that

$$\sup_{m,m'\in[\frac n2,\delta n]}\frac{\varphi(m)}{\varphi(m')} \le k, \tag{3.61}$$

$$\sup_{\delta n\le m\le n}\frac{p_n(m+1)}{p_n(m)} \le \frac12, \tag{3.62}$$

$$\min_{m\in[\frac n2,\delta n]}\big[V(m+1)-V(m)\big] \ge \frac{1}{k\,(n\wedge L^2)}. \tag{3.63}$$

Inequality (3.61) follows at once from the fact, proved in the discussion of the equivalence of ensembles, that $\mu_{\Lambda_1}^{\lambda_m}(N_{\Lambda_1}=m)$ (and similarly for $\mu_{\Lambda_2}^{\lambda_{n-m}}(N_{\Lambda_2}=n-m)$) is comparable to the inverse of the standard deviation of the number of particles in $\Lambda_1$, together with (3.12). Inequality (3.62) is nothing but (3.57) above. Finally (3.63) follows from the convexity of the "potential" $V_n(m)$. More precisely, since $V_n(m)$ is even w.r.t. $\frac n2$, all that we need is

$$\frac{d^2}{dm^2}V_n(m) = 2\log(1/q)\,\frac{d}{dm}\big[\lambda_m-\lambda_{n-m}\big] \tag{3.64}$$

together with

$$\frac{d}{dm}\lambda_m = \frac{1}{\log(1/q)}\,\frac{1}{\mathrm{Var}_{\mu_{\Lambda_1}^{\lambda_m}}(N_{\Lambda_1})} \ge k'(\delta,q)\,\frac{1}{n\wedge L^2} \qquad \forall m\in[\tfrac n2,\delta n] \tag{3.65}$$
and analogously for $\lambda_{n-m}$. Above we have used once more (3.12) to control the variance of the number of particles in terms of its mean. □

3.5. Moving particles. In this paragraph we will show how to relate "long jump terms" of the form $\nu_{\Lambda,n}([\nabla_{xy}f]^2)$ with $x,y\in\Lambda$ to the sum of nearest neighbor jumps along a path leading from $x$ to $y$. In what follows the setting and the notation will be that of the preceding subsection, cf. (3.55). We will analyze two different situations that we call, for convenience, the many particles case (MP) and the few particles case (FP). The definitions will depend on a parameter $\delta$ which will be forced to be sufficiently small when needed (see the proof of Theorem 4.1).

• Many particles: $H\ge 1$, $\delta L^2\le n\le 2L^2H$. (MP)

• Few particles: $H\ge 1$, $1\le n\le\delta L^2$. (FP)

In the MP case let $A$ and $B$ be two horizontal sections of $\Lambda_1$ and $\Lambda_2$ at height $\ell_A$ and $\ell_B$ respectively, with $\ell_A\ge\ell_B$. In the FP case, the sets $A$ and $B$ are instead given by

$$A = \{x\in\Lambda_1:\ \ell_x\le h-1\}, \qquad B = \{y\in\Lambda_2:\ \ell_y = 0\}, \tag{3.66}$$

where $h\in\mathbb N$ will be suitably tuned later on. Below we use $\nu(\cdot\mid m)$ for $\nu(\cdot\mid N_{\Lambda_1}=m)$.

Proposition 3.12. For any $q\in(0,1)$, any $n = 1,\dots,\frac{|\Lambda|}{2}$ and any $m\in[\frac n2,n]$,

$$\sum_{x\in A}\sum_{y\in B}q^{2(\ell_x-\ell_y)}\,\nu\big((\nabla_{xy}f)^2\alpha_y(1-\alpha_x)\mid m\big) \le CL^2\Big[L^2\sum_{b\in O}\nu\big((\nabla_bf)^2\mid m\big) + \sum_{b\in V}\nu\big((\nabla_bf)^2\mid m\big)\Big], \tag{3.67}$$

where $C$ is a suitable constant depending on $q$ in the MP case and on $q$, $h$ in the FP case.

The rest of this section is devoted to the proof of the above proposition. For each couple of sites $x\in A$, $y\in B$, define a third site $z = z(x,y)$ with $z_1 = y_1$, $z_2 = y_2$ and $\ell_z = \ell_x$. That is, $z$ is the unique element of $\mathcal A_{\ell_x}\cap\Sigma_y$. Since $T_{xy}\alpha = T_{yz}T_{xz}T_{yz}\alpha$, we decompose $\nabla_{xy}f$ in

$$\nabla_{xy}f(\alpha) = \nabla_{yz}f(T_{xz}T_{yz}\alpha) + \nabla_{xz}f(T_{yz}\alpha) + \nabla_{yz}f(\alpha). \tag{3.68}$$
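The identity $T_{xy} = T_{yz}T_{xz}T_{yz}$ and the resulting three-term decomposition (3.68) can be confirmed by brute force on a small system. The sketch below is only an illustration: configurations are modelled as tuples of 0/1 occupation numbers indexed by site, and the helper names are not taken from the paper.

```python
import itertools

def swap(alpha, a, b):
    """Exchange operator T_ab: swap the occupation numbers at sites a and b."""
    beta = list(alpha)
    beta[a], beta[b] = beta[b], beta[a]
    return tuple(beta)

def grad(f, alpha, a, b):
    """Exchange gradient: f(T_ab alpha) - f(alpha)."""
    return f(swap(alpha, a, b)) - f(alpha)

def check_decomposition(f, n_sites, x, y, z):
    """Check T_xy = T_yz T_xz T_yz and (3.68) on every configuration."""
    for alpha in itertools.product((0, 1), repeat=n_sites):
        t_yz = swap(alpha, y, z)          # T_yz alpha
        t_xz_yz = swap(t_yz, x, z)        # T_xz T_yz alpha
        if swap(t_xz_yz, y, z) != swap(alpha, x, y):
            return False
        lhs = grad(f, alpha, x, y)
        rhs = (grad(f, t_xz_yz, y, z)     # vertical move y <-> z
               + grad(f, t_yz, x, z)      # horizontal move x <-> z
               + grad(f, alpha, y, z))    # vertical move y <-> z
        if lhs != rhs:
            return False
    return True
```

With an integer-valued test function the comparison is exact, so no floating-point tolerance is needed.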
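We then have two vertical moves corresponding to exchanges between $y$ and $z$, and one horizontal move corresponding to the exchange between $x$ and $z$. Thus

$$\sum_{x\in A}\sum_{y\in B}q^{2(\ell_x-\ell_y)}\,\nu\big((\nabla_{xy}f)^2\alpha_y(1-\alpha_x)\mid m\big) \le 3\,\{I_O+I_V\} \tag{3.69}$$

with

$$I_O = \sum_{x\in A}\sum_{y\in B}q^{2(\ell_x-\ell_y)}\,\nu\big((\nabla_{xz}f(T_{yz}\alpha))^2\alpha_y(1-\alpha_x)\mid m\big) \tag{3.70}$$

and

$$I_V = \sum_{x\in A}\sum_{y\in B}q^{2(\ell_x-\ell_y)}\,\nu\Big(\big[(\nabla_{yz}f(T_{xz}T_{yz}\alpha))^2+(\nabla_{yz}f(\alpha))^2\big]\alpha_y(1-\alpha_x)\mid m\Big). \tag{3.71}$$

We analyze these terms separately.

Vertical moves. If we have a particle at $y$ and a hole at $x$ then

$$(T_{xz}T_{yz}\alpha)_x = \alpha_y = 1, \qquad (T_{xz}T_{yz}\alpha)_z = \alpha_x = 0, \qquad (T_{xz}T_{yz}\alpha)_y = \alpha_z.$$

Computing $\nabla_{yz}f(T_{xz}T_{yz}\alpha)$ we may thus assume $\alpha_z = 1$ (it vanishes otherwise). Since we have a particle both at $y$ and at $z$ the change of variables $\alpha\to T_{xz}T_{yz}\alpha$ produces no extra factors and

$$\nu\big((\nabla_{yz}f(T_{xz}T_{yz}\alpha))^2\alpha_y(1-\alpha_x)\mid m\big) = \nu\big((\nabla_{yz}f(T_{xz}T_{yz}\alpha))^2\alpha_y(1-\alpha_x)\alpha_z\mid m\big) = \nu\big((\nabla_{yz}f(\alpha))^2\alpha_y\alpha_x(1-\alpha_z)\mid m\big). \tag{3.72}$$

For the second term in (3.71) we have

$$\nu\big((\nabla_{yz}f(\alpha))^2\alpha_y(1-\alpha_x)\mid m\big) = \nu\big((\nabla_{yz}f(\alpha))^2\alpha_y(1-\alpha_x)(1-\alpha_z)\mid m\big),$$

therefore (3.71) becomes

$$I_V = \sum_{x\in A}\sum_{y\in B}q^{2(\ell_x-\ell_y)}\,\nu\big((\nabla_{yz}f(\alpha))^2\alpha_y(1-\alpha_z)\mid m\big). \tag{3.73}$$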
We need the following rather general result. Given any $y\in\Lambda$ and $z\in\Sigma_y$ we write $\gamma_{zy}$ for the shortest path connecting $z$ and $y$ along the stick.

Proposition 3.13. For any $q\in(0,1)$, for any $y\in\Lambda$ and $z\in\Sigma_y$,

$$q^{2[(\ell_z-\ell_y)\vee 0]}\,\nu\big((\nabla_{zy}f)^2\alpha_y(1-\alpha_z)\big) \le 4q^2(1-q^2)^{-1}\sum_{b\in\gamma_{zy}}\nu\big((\nabla_bf)^2\big). \tag{3.74}$$

Proof. Assume first that $\ell_z\ge\ell_y$. Let $M = \ell_z-\ell_y$ and consider the sequence $y = x_0,x_1,\dots,x_M = z$, with $\ell_{x_i} = \ell_{x_{i-1}}+1$. Write $\alpha_i = \alpha_{x_i}$, $T_{i,i+1}$ for the exchange operator $T_{x_ix_{i+1}}$ and $\nabla_{i,i+1}$ for the corresponding gradient. We want to prove

$$q^{2M}\,\nu\big((\nabla_{0,M}f)^2\alpha_0(1-\alpha_M)\big) \le 4q^2(1-q^2)^{-1}\sum_{i=1}^M\nu\big((\nabla_{i-1,i}f)^2\big). \tag{3.75}$$

We have a particle at $x_0$ and a hole at $x_M$. To compute $T_{0,M}$ we first bring the particle from $x_0$ to $x_M$ and then bring the hole, which sits now at $x_{M-1}$, back to $x_0$. We write

$$T_{0,M}\alpha = T_{0,1}T_{1,2}\cdots T_{M-1,M}T_{M-2,M-1}\cdots T_{1,2}T_{0,1}\alpha. \tag{3.76}$$

To fit the picture described above, formula (3.76) should be read backwards. The first part of the transformation is described by operators

$$R_0\alpha = \alpha, \qquad R_i\alpha = T_{i-1,i}\cdots T_{1,2}T_{0,1}\alpha, \qquad i = 1,2,\dots,M-1, \tag{3.77}$$

while the second part is given by operators

$$L_i\alpha = T_{i,i+1}T_{i+1,i+2}\cdots T_{M-1,M}T_{M-2,M-1}\cdots T_{1,2}T_{0,1}\alpha = T_{i,i+1}T_{i+1,i+2}\cdots T_{M-1,M}R_{M-1}\alpha, \qquad i = 1,2,\dots,M-1. \tag{3.78}$$

In this way a simple telescopic argument shows that

$$\nabla_{0,M}f(\alpha) = f(T_{0,M}\alpha)-f(\alpha) = f(L_{M-1}\alpha)-f(\alpha)+\sum_{i=1}^{M-1}\nabla_{i-1,i}f(L_i\alpha) = \sum_{i=1}^M\nabla_{i-1,i}f(R_{i-1}\alpha)+\sum_{i=1}^{M-1}\nabla_{i-1,i}f(L_i\alpha). \tag{3.79}$$
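The path construction (3.76) and the telescopic sum (3.79) can be checked mechanically. The following minimal sketch applies the $2M-1$ nearest-neighbour exchanges in the order in which they act (right to left in (3.76)), confirms that the result is $T_{0,M}\alpha$, and verifies that the gradients along the path add up to $\nabla_{0,M}f(\alpha)$. Sites $x_0,\dots,x_M$ are represented by the indices $0,\dots,M$; the helper names are assumptions made for this illustration.

```python
import itertools

def swap(alpha, a, b):
    """Exchange operator T_ab on a tuple of occupation numbers."""
    beta = list(alpha)
    beta[a], beta[b] = beta[b], beta[a]
    return tuple(beta)

def nearest_neighbour_path(M):
    """Bonds of (3.76) in order of application: up the stick, then back down."""
    forward = [(i, i + 1) for i in range(M)]               # T_{0,1}, ..., T_{M-1,M}
    backward = [(i, i + 1) for i in range(M - 2, -1, -1)]  # T_{M-2,M-1}, ..., T_{0,1}
    return forward + backward

def intermediate_configurations(alpha, M):
    """Configurations visited along the path, starting from alpha."""
    configs = [tuple(alpha)]
    for a, b in nearest_neighbour_path(M):
        configs.append(swap(configs[-1], a, b))
    return configs

def telescoping_gap(f, alpha, M):
    """Sum of nearest-neighbour gradients along the path minus the long-jump gradient."""
    configs = intermediate_configurations(alpha, M)
    steps = nearest_neighbour_path(M)
    total = sum(f(swap(c, a, b)) - f(c) for c, (a, b) in zip(configs, steps))
    return total - (f(swap(tuple(alpha), 0, M)) - f(tuple(alpha)))
```

In this representation the first $M$ visited configurations play the role of $R_0\alpha,\dots,R_{M-1}\alpha$ and the later ones that of the $L_i\alpha$; the check itself does not depend on that labelling.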
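Let us study these two contributions separately. We start with the $R_i$'s. Observe that

$$(R_i\alpha)_j = \begin{cases}\alpha_0 & j = i\\ \alpha_{j+1} & 0\le j\le i-1\\ \alpha_j & i+1\le j\le M\end{cases} \qquad i = 1,\dots,M-1. \tag{3.80}$$

The change of variable $\alpha\to R_i\alpha$ produces then a factor

$$r_i(\alpha) = \frac{\nu(\alpha)}{\nu(R_i\alpha)} = q^{2\sum_{j=0}^{i-1}\ell_{x_j}(\alpha_j-\alpha_{j+1})}\,q^{2\ell_{x_i}(\alpha_i-\alpha_0)}. \tag{3.81}$$

Writing $\ell_{x_j} = \ell_y+j$ we have

$$\sum_{j=0}^{i-1}\ell_{x_j}(\alpha_j-\alpha_{j+1})+\ell_{x_i}(\alpha_i-\alpha_0) = \sum_{j=0}^{i-1}j\,(\alpha_j-\alpha_{j+1})+i\,(\alpha_i-\alpha_0) = N_{[1,i]}(\alpha)-i\alpha_0, \tag{3.82}$$

where we have used the identity

$$\sum_{j=0}^{i-1}j\,(\alpha_j-\alpha_{j+1}) = \alpha_1-\alpha_2+2\alpha_2-2\alpha_3+\cdots+(i-1)\alpha_{i-1}-(i-1)\alpha_i = (\alpha_1+\cdots+\alpha_{i-1})-(i-1)\alpha_i = N_{[1,i]}(\alpha)-i\alpha_i, \tag{3.83}$$

and $N_{[1,i]}$ stands for the number of particles in $\{x_1,\dots,x_i\}$. Therefore we may estimate (3.81) simply with $r_i(\alpha)\le q^{-2i}$. In particular,

$$\nu\big((\nabla_{i-1,i}f(R_{i-1}\alpha))^2\big) \le q^{-2(i-1)}\,\nu\big((\nabla_{i-1,i}f(\alpha))^2\big). \tag{3.84}$$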
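The exponent computed in (3.82) can be verified exhaustively for short sticks. The sketch below assumes, as in the text, that the weight of a configuration is proportional to $q^{2\sum_j\ell_{x_j}\alpha_j}$ with $\ell_{x_j} = \ell_y+j$, so that $r_i(\alpha) = q^{2e}$ where $e$ is the difference of the two height sums; the function names and the parameter `ell_y` are illustrative choices only.

```python
import itertools

def R(alpha, i):
    """Configuration R_i alpha of (3.80) on the sites x_0, ..., x_M."""
    alpha = list(alpha)
    # Sites 0..i-1 receive alpha_1..alpha_i, site i receives alpha_0, the rest is unchanged.
    return tuple(alpha[1:i + 1] + [alpha[0]] + alpha[i + 1:])

def height_sum(alpha, ell_y):
    """sum_j ell_{x_j} alpha_j with ell_{x_j} = ell_y + j."""
    return sum((ell_y + j) * a for j, a in enumerate(alpha))

def exponent(alpha, i, ell_y):
    """Half the exponent of q in r_i(alpha) = nu(alpha) / nu(R_i alpha)."""
    return height_sum(alpha, ell_y) - height_sum(R(alpha, i), ell_y)

def predicted_exponent(alpha, i):
    """Right-hand side of (3.82): N_[1,i](alpha) - i * alpha_0."""
    return sum(alpha[1:i + 1]) - i * alpha[0]
```

Since $q<1$ and $N_{[1,i]}(\alpha)-i\alpha_0\ge -i$, the bound $r_i(\alpha)\le q^{-2i}$ used above follows directly from this exponent.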
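On the other hand by Schwarz' inequality

$$\Big(\sum_{i=1}^M\nabla_{i-1,i}f(R_{i-1}\alpha)\Big)^2 \le \sum_{j=1}^Mq^{-2j}\,\sum_{i=1}^Mq^{2i}\big(\nabla_{i-1,i}f(R_{i-1}\alpha)\big)^2. \tag{3.85}$$

From (3.84) and (3.85) we arrive at

$$q^{2M}\,\nu\Big(\Big(\sum_{i=1}^M\nabla_{i-1,i}f(R_{i-1}\alpha)\Big)^2\Big) \le \sum_{j=1}^Mq^{2(M-j)}\,\sum_{i=1}^Mq^2\,\nu\big((\nabla_{i-1,i}f)^2\big) \le q^2(1-q^2)^{-1}\sum_{i=1}^M\nu\big((\nabla_{i-1,i}f)^2\big). \tag{3.86}$$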
We turn to estimate the contribution of terms with $L_i$ in (3.79). Notice that

$$(L_i\alpha)_j = \begin{cases}\alpha_M & j = i\\ \alpha_0 & j = M\\ \alpha_{j+1} & 0\le j\le i-1\\ \alpha_j & i+1\le j\le M-1\end{cases} \qquad i = 1,\dots,M-1. \tag{3.87}$$

The change of variable $\alpha\to L_i\alpha$ gives a factor

$$l_i(\alpha) = \frac{\nu(\alpha)}{\nu(L_i\alpha)} = q^{2\sum_{j=0}^{i-1}\ell_{x_j}(\alpha_j-\alpha_{j+1})}\,q^{2\ell_{x_i}(\alpha_i-\alpha_M)+2\ell_{x_M}(\alpha_M-\alpha_0)}. \tag{3.88}$$

As in (3.83) and (3.82) we can write

$$l_i(\alpha) = q^{2N_{[1,i]}(\alpha)}\,q^{2(M-i)\alpha_M}\,q^{-2M\alpha_0} \le q^{-2M}\,q^{2N_{[1,i]}(\alpha)}. \tag{3.89}$$

In particular,

$$\nu\big(q^{-2N_{[1,i]}(\alpha)}(\nabla_{i-1,i}f(L_i\alpha))^2\big) \le q^{-2M}\,\nu\big((\nabla_{i-1,i}f(\alpha))^2\big). \tag{3.90}$$

Since we are assuming $\alpha_M = 0$, we also have $(L_i\alpha)_i = \alpha_M = 0$. Thus in order to compute $\nabla_{i-1,i}f(L_i\alpha)$ we may assume $(L_i\alpha)_{i-1} = \alpha_i = 1$ and write directly $\alpha_i\nabla_{i-1,i}f(L_i\alpha)$. Using again Schwarz' inequality

$$\Big(\sum_{i=1}^{M-1}\alpha_i\nabla_{i-1,i}f(L_i\alpha)\Big)^2 \le \sum_{j=1}^{M-1}\alpha_jq^{2N_{[1,j]}(\alpha)}\,\sum_{i=1}^{M-1}q^{-2N_{[1,i]}(\alpha)}\big(\nabla_{i-1,i}f(L_i\alpha)\big)^2. \tag{3.91}$$

But

$$\sum_{j=1}^{M-1}\alpha_jq^{2N_{[1,j]}(\alpha)} = \sum_{j=1}^{N_{[1,M-1]}(\alpha)}q^{2j} \le q^2(1-q^2)^{-1}.$$

Now we can estimate as in (3.86), using (3.90):

$$q^{2M}\,\nu\Big(\Big(\sum_{i=1}^{M-1}\nabla_{i-1,i}f(L_i\alpha)\Big)^2\alpha_0(1-\alpha_M)\Big) \le q^2(1-q^2)^{-1}\sum_{i=1}^{M-1}\nu\big((\nabla_{i-1,i}f)^2\big). \tag{3.92}$$

The estimates of (3.86) and (3.92) together with (3.79) imply the claim (3.75). It is not difficult to adapt the above argument to the case $\ell_z<\ell_y$. In this case, writing $M = \ell_y-\ell_z$, (3.75) has to be replaced by

$$\nu\big((\nabla_{0,M}f)^2\alpha_M(1-\alpha_0)\big) \le 4q^2(1-q^2)^{-1}\sum_{i=1}^M\nu\big((\nabla_{i-1,i}f)^2\big). \tag{3.93}$$

Now $r_i(\alpha)\le q^{2N_{[1,i]}(\alpha)}$ and $l_i(\alpha)\le q^{2(M-i)}$, so (3.93) follows using the estimate (3.92) for $R_i$-terms and (3.86) for $L_i$-terms. This ends the proof of Proposition 3.13. □
We can now go back to (3.73) and continue the proof of Proposition 3.12. Suppose first we are in case (MP), i.e. the sets $A$ and $B$ are the planar sections at level $\ell_A$ and $\ell_B$ respectively. Then, summing over $x\in A$ in (3.73),

$$I_V = 2L^2q^{2(\ell_A-\ell_B)}\sum_{y\in B}\nu\big((\nabla_{yz}f(\alpha))^2\alpha_y(1-\alpha_z)\mid m\big), \tag{3.94}$$

where $z$ is the unique element of $\Sigma_y\cap\mathcal A_{\ell_A}$. Since $\ell_A\ge\ell_B$, using Proposition 3.13 (with $\nu$ replaced by $\nu(\cdot\mid m)$) we easily obtain

$$I_V \le 8q^2(1-q^2)^{-1}L^2\sum_{b\in V}\nu\big((\nabla_bf)^2\mid m\big). \tag{3.95}$$

Suppose now we are in the case (FP). Here $A$ is a sub-cylinder with height $h$ while $B$ is the planar section at height 0, cf. (3.66). Then from (3.73), using Proposition 3.13 (with $\nu$ replaced by $\nu(\cdot\mid m)$), we see that

$$I_V = \sum_{x\in A}\sum_{y\in B}q^{2\ell_x}\,\nu\big((\nabla_{yz}f(\alpha))^2\alpha_y(1-\alpha_z)\mid m\big) \le 4q^2(1-q^2)^{-1}\sum_{x\in A}\sum_{b\in V}\nu\big((\nabla_bf)^2\mid m\big) = 8q^2(1-q^2)^{-1}hL^2\sum_{b\in V}\nu\big((\nabla_bf)^2\mid m\big). \tag{3.96}$$

Horizontal moves. We go back to (3.70). Observe that if there is a particle at $y$ and a hole at $z$ the change of variable $\alpha\to T_{yz}\alpha$ produces the factor $q^{2(\ell_y-\ell_z)}$, thus canceling $q^{2(\ell_x-\ell_y)}$ in (3.70). We can estimate

$$I_O \le \sum_{x\in A}\sum_{y\in B}\nu\big((\nabla_{xz}f)^2\mid m\big). \tag{3.97}$$

Consider the case (MP) first. Now both $x$ and $z$ lie on the plane $\mathcal A_{\ell_A}$. We fix a choice of paths on this plane as follows. For each couple $x,z\in\mathcal A_{\ell_A}$ we take the path $\gamma_{xz}$ obtained by connecting $x$ to $z$ first along the direction $e_1$ and then along the direction $e_2$. As in the case of vertical moves we use a telescopic sum to write $\nabla_{xz}f$, thus obtaining two sums over all bonds in the path $\gamma_{xz}$, cf. (3.79). Since here we only have horizontal exchanges there are no factors when we change variables and we simply use Schwarz' inequality to obtain

$$\nu\big((\nabla_{xz}f)^2\mid m\big) \le 2|\gamma_{xz}|\sum_{b\in\gamma_{xz}}\nu\big((\nabla_bf)^2\mid m\big) \le 8L\sum_{b\in\gamma_{xz}}\nu\big((\nabla_bf)^2\mid m\big), \tag{3.98}$$

where we used $|\gamma_{xz}|\le 4L$. Moreover, for any bond $b$ in the plane

$$\sum_{x\in A}\sum_{y\in B}1_{\{b\in\gamma_{xz}\}} \le 4L^3. \tag{3.99}$$
When we sum in (3.97) we obtain

$$I_O \le 32L^4\sum_{b\in O}\nu\big((\nabla_bf)^2\mid m\big). \tag{3.100}$$

Consider now the case (FP). Here $A$ is the sub-cylinder at height $h$ and $B$ is the planar section at height 0, see (3.66). The same estimate (3.100) applies since when summing over $x\in A$ we are now summing over all layers up to level $h-1$ and the r.h.s. in (3.100) contains all bonds in such planes. Collecting all the estimates in (3.95), (3.96) and (3.100) and plugging into (3.69) we have obtained the desired bound (3.67). This completes the proof of Proposition 3.12.

4. Recursive Proof of Theorem 2.7

We begin by describing the main ideas behind the recursive proof of Theorem 2.7. Let $\mathrm{Var}_{\Lambda,n}(f)$ denote the variance of a function $f$ w.r.t. $\nu_{\Lambda,n}$ and let

$$W(\Lambda) = \max_n\sup_f\frac{\mathrm{Var}_{\Lambda,n}(f)}{\mathcal E_{\Lambda,n}(f,f)}, \tag{4.1}$$

where the supremum is taken over all non-constant real functions $f$. When $\Lambda = \Sigma_{L,H}$ we write $W(L)$ for $\sup_HW(\Sigma_{L,H})$. The lower bound in Theorem 2.7 follows if we can prove that for any $q\in(0,1)$ there exists $k<\infty$ such that

$$W(L) \le kL^2 \tag{4.2}$$

for any $L$ of the form $L = 2^j$, $j\in\mathbb N$. In turn (4.2) follows at once if we can prove that for any $q\in(0,1)$ there exist $k<\infty$ and $L_0>0$ such that

$$W(2L) \le 3W(L)+kL^2, \qquad L\ge L_0, \tag{4.3}$$

$$W(2L) \le kW(L)+k, \qquad L\le L_0, \tag{4.4}$$

$$W(1) \le k. \tag{4.5}$$
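The reason (4.3) yields the diffusive bound (4.2) is that the factor 3 is smaller than $2^2 = 4$: writing $u_j = W(2^j)/4^j$, the worst case of (4.3) reads $u_{j+1} = \frac34u_j+\frac k4$, which stays bounded and contracts towards $k$. The sketch below iterates this worst case numerically. It is only an illustration: it ignores the threshold $L_0$ (the finitely many small scales are covered by (4.4) and (4.5)), and the values of `k` and `w1` are arbitrary.

```python
def iterate_bound(k, j_max, w1=None):
    """Iterate W(2L) = 3 W(L) + k L^2 from L = 1 and return the ratios W(L) / L^2."""
    w = k if w1 is None else w1   # starting value, playing the role of W(1)
    L = 1
    ratios = []
    for _ in range(j_max):
        w = 3 * w + k * L * L     # worst case of (4.3)
        L *= 2
        ratios.append(w / (L * L))
    return ratios
```

Starting from `w1 = k` the ratio remains equal to `k` at every scale, and for a larger starting value it decreases geometrically towards `k`.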
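4.1. Transport theorem and proof of the recursive inequalities. The starting point to prove the recursive inequalities is the formula for conditional variance that we now describe. Consider the cylinder $\Sigma_{2L,H}$ and divide it into two parts (cf. (3.55))

$$\Lambda = \Sigma_{2L,H} = \Lambda_1\cup\Lambda_2, \qquad \Lambda_1 = \Sigma_{R_{L,2L},H}, \qquad \Lambda_2 = (L,0,0)+\Sigma_{R_{L,2L},H}.$$

Fix $n\in\{1,\dots,2L^2H = |\Lambda|/2\}$ and let $\nu_{\Lambda,n}$ denote as usual the canonical measure on $\Lambda = \Sigma_{2L,H}$ with total particle number $n$. Conditioning on the number of particles in $\Lambda_1$ decomposes the variance as follows:

$$\mathrm{Var}_{\Lambda,n}(f) = \nu_{\Lambda,n}\big(\mathrm{Var}_{\Lambda,n}(f\mid N_{\Lambda_1})\big)+\mathrm{Var}_{\Lambda,n}\big(\nu_{\Lambda,n}(f\mid N_{\Lambda_1})\big). \tag{4.6}$$

Moreover, the above conditioning breaks $\nu_{\Lambda,n}$ into the product $\nu_{\Lambda_1,N_{\Lambda_1}}\otimes\nu_{\Lambda_2,n-N_{\Lambda_1}}$, and therefore

$$\nu_{\Lambda,n}\big(\mathrm{Var}_{\Lambda,n}(f\mid N_{\Lambda_1})\big) \le \nu_{\Lambda,n}\big(\mathrm{Var}_{\Lambda_1,N_{\Lambda_1}}(f)+\mathrm{Var}_{\Lambda_2,n-N_{\Lambda_1}}(f)\big). \tag{4.7}$$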
The first term in (4.6) is then estimated from above using (4.1):

$$\nu_{\Lambda,n}\big(\mathrm{Var}_{\Lambda,n}(f\mid N_{\Lambda_1})\big) \le W(\Sigma_{R_{L,2L},H})\,\nu_{\Lambda,n}\big(\mathcal E_{\Lambda_1,N_{\Lambda_1}}(f,f)+\mathcal E_{\Lambda_2,n-N_{\Lambda_1}}(f,f)\big) \le W(\Sigma_{R_{L,2L},H})\,\mathcal E_{\Lambda,n}(f,f). \tag{4.8}$$

The analysis of the second term in (4.6) is more delicate and is directly related to transport of particles. In a sense it represents the core of the proof. We will provide two different bounds on the transport term: the first one is rather subtle but it is valid only for large enough $L$. The second one, valid for any value of $L$, is much rougher and therefore it will be used only for those values of $L$ for which the first bound is not known to hold. For simplicity, in what follows, we will always refer to these two situations as the "large" or "small" $L$ case. Recall now the definitions (2.30) and (2.31) of the horizontal and vertical part of the Dirichlet form. Then we have

Theorem 4.1. (i) Large $L$. For any $\epsilon>0$, $q\in(0,1)$, there exist finite constants $C_\epsilon = C(\epsilon,q)$, $k = k(q)$ and $L_0 = L_0(\epsilon,q)$ such that for any $L>L_0$, $H\ge 1$ and for any $n = 1,2,\dots,|\Lambda|-1$,

$$\mathrm{Var}_{\Lambda,n}\big(\nu_{\Lambda,n}(f\mid N_{\Lambda_1})\big) \le k\,\big[L^2\mathcal E_n^O(f,f)+\mathcal E_n^V(f,f)\big]+C_\epsilon\,\mathcal E_{\Lambda,n}(f,f)+\epsilon\,\mathrm{Var}_{\Lambda,n}(f).$$

(ii) Small $L$. For any $q\in(0,1)$ and for any $L\ge 1$ there exists a finite constant $C = C(L,q)$ such that, for any $H\ge 1$ and any $n = 1,2,\dots,|\Lambda|-1$,

$$\mathrm{Var}_{\Lambda,n}\big(\nu_{\Lambda,n}(f\mid N_{\Lambda_1})\big) \le C\,\big[\mathcal E_{\Lambda,n}(f,f)+\nu_{\Lambda,n}\big(\mathrm{Var}_{\Lambda,n}(f\mid N_{\Lambda_1})\big)\big].$$

Once Theorem 4.1 is proven, we use (4.6) and (4.8) to obtain

$$\mathrm{Var}_{\Lambda,n}(f) \le (1-\epsilon)^{-1}\big[W(\Sigma_{R_{L,2L},H})+kL^2\big]\,\mathcal E_{\Lambda,n}(f,f) \qquad\text{for } L\ge L_0(\epsilon,q), \tag{4.9}$$

$$\mathrm{Var}_{\Lambda,n}(f) \le \big[C_0W(\Sigma_{R_{L,2L},H})+C_0\big]\,\mathcal E_{\Lambda,n}(f,f) \qquad\text{for } L\le L_0(\epsilon,q), \tag{4.10}$$

where $C_0 = C(L_0,q)$. In the large $L$ case (4.9) proves in particular that

$$W(\Sigma_{R_{2L,2L},H}) \le (1-\epsilon)^{-1}W(\Sigma_{R_{L,2L},H})+kL^2. \tag{4.11}$$

We repeat now the decomposition (4.6) for $\Lambda = \Sigma_{R_{L,2L},H}$, writing the latter cylinder as $\Lambda = \Lambda_1\cup\Lambda_2$, $\Lambda_1 = \Sigma_{L,H}$ and $\Lambda_2 = (0,L,0)+\Sigma_{L,H}$. Applying the same reasoning as above we arrive at

$$W(\Sigma_{R_{L,2L},H}) \le (1-\epsilon)^{-1}W(\Sigma_{L,H})+kL^2, \qquad \forall H\ge 1 \text{ and } L\ge L_0(\epsilon,q). \tag{4.12}$$

From (4.11) and (4.12) we finally obtain

$$W(2L) \le (1-\epsilon)^{-2}W(L)+kL^2 \qquad \forall L\ge L_0(\epsilon,q), \tag{4.13}$$

which proves (4.3) due to the arbitrariness of $\epsilon$. Equation (4.4) is proved similarly starting from (4.10). The bound (4.5) is given in the next subsection.
Remark 4.2. In order to remove the restriction that $L$ be a power of 2 one may proceed as follows (see e.g. [19], Sect. 4.2). Denote by $\mathcal R(L)$ the class of rectangles $R_{l_1,l_2}$ such that $l_1\wedge l_2\ge\frac{1}{10}(l_1\vee l_2)$ and $(l_1\vee l_2)\le L$. At each step of the iteration we divide a rectangle $R\in\mathcal R(2L)\setminus\mathcal R(L)$ as $R = R_1\cup R_2$ with $R_1$, $R_2$ the two rectangles obtained by bisecting $R$ along the longest side. Thus if $R = R_{l_1,l_2}$ with $l_2\ge l_1$ we have $R_1 = R_{l_1,[\frac{l_2}{2}]}$ and $R_2 = (0,[\frac{l_2}{2}])+R_{l_1,l_2-[\frac{l_2}{2}]}$. A careful check reveals that estimates (4.9) and (4.10) are still valid if we replace $\Lambda$ by $\Sigma_{R,H}$ and $\Sigma_{R_{L,2L},H}$ by any $\Sigma_{R_i,H}$, $i = 1,2$. We can now repeat the iteration on each $R_i$ until we arrive at rectangles which are all contained in $\mathcal R(L)$. This procedure requires, by construction, at most four steps. Thus if we define

$$\widetilde W(L) = \sup_{R\subset\mathcal R(L)}W(\Sigma_{R,H})$$

the preceding observations together with the reasoning leading to (4.13) actually enable us to establish bounds (4.3) and (4.4) for $\widetilde W(L)$ rather than $W(L)$.
4.2. Spectral gap in the one dimensional case. In this final paragraph we prove that $W(1)<\infty$. In other words we show that the spectral gap for the one dimensional asymmetric simple exclusion process with generator (2.28) in the interval $\Sigma_H := \{x = (0,0,l):\ 0\le l\le H-1\}$ is bounded away from zero uniformly in the number of particles and in the length of the interval $H$. Such a result has already been proved for the one dimensional XXZ model in [23] but we decided to present a "probabilistic" proof for completeness. Here is our formal statement. Below we write $\nu := \nu_{\Sigma_H,n}$, $\mathcal E := \mathcal E_{\Sigma_H,n}$.

Theorem 4.3. For any $q\in(0,1)$ there exists a constant $k$ such that for any $H\ge 1$, any $n\le H$ and any function $f$,

$$\mathrm{Var}_\nu(f) \le k\,\mathcal E(f,f).$$

In particular, $W(1)\le k$.

Proof. Let $\gamma(n,H)$ denote the inverse spectral gap for the process in $\Sigma_H$ with $n$ particles and let $\gamma(n) = \sup_H\gamma(n,H)$. Notice that, by the particle–hole duality, $\gamma(n,H) = \gamma(H-n,H)$ and therefore we will always assume, without loss of generality, that $n\le\frac H2$. If $n = 1$, then it is well known, by e.g. Hardy's [21] or Cheeger's inequality [11], that $\gamma(1)<\infty$. Our idea is to perform a sort of induction on the number of particles. For this purpose, for each configuration $\alpha$ with $n$ particles we denote by $\xi := \xi(\alpha)$ the position of the last particle, namely $\xi = \max\{x\in\Sigma_H:\ \alpha_x = 1\}$, and we set $\rho(x) = \nu(\xi = x)$, the probability that $\xi = x$. It is not difficult to see that the distribution of $\xi$ has an exponential falloff so that, in particular, it satisfies a Poincaré inequality with constant depending only on $q$. More precisely we have the following

Lemma 4.4. For any $q\in(0,1)$ there exists $k$ such that for any $f(\alpha) := F(\xi(\alpha))$,

$$\mathrm{Var}_\nu(f) \le k\sum_{x\ge n-1}\big[\rho(x)\wedge\rho(x+1)\big]\,[F(x+1)-F(x)]^2 \qquad \forall H\ge n. \tag{4.14}$$
Proof. Using Cheeger's inequality it is enough to prove that there exist $x_0\ge n-1$ and $\beta<1$ depending only on $q$ such that $\frac{\rho(x+1)}{\rho(x)}\le\beta$ for any $x\ge x_0$. A simple change of variables (see (4.21) below) shows that

$$\frac{\rho(x+1)}{q\rho(x)}\,\nu\big((1-\alpha_x)\mid\xi = x+1\big) = 1. \tag{4.15}$$

In order to complete the proof it is enough to prove that $\nu(\alpha_x\mid\xi = x+1)$ tends to zero for large $x$ uniformly in $n\le\frac H2$. For any $x\ge n$ we have

$$\nu(\alpha_x\mid\xi = x+1) \le \frac{\mu(\alpha_x = 1;\ \alpha_{x+1} = 1;\ \alpha_y = 0\ \forall y>x+1)}{\mu(N_{[0,n-2]} = n-1;\ \alpha_{x+1} = 1;\ \alpha_y = 0\ \forall y\ge n-1,\ y\ne x+1)} \le \frac{\mu(\alpha_x)}{\prod_{y=0}^{n-2}\mu(\alpha_y)\prod_{y\ge n-1,\ y\ne x}\mu(1-\alpha_y)} \le kq^{2(x-n)} \tag{4.16}$$

for some constant $k = k(q)$. Above we have used the explicit product structure of the measure $\mu := \mu^{\lambda(n)}$ together with the fact proved in (3.1) that $|\lambda(n)-n|\le k'$. □

We are now in a position to prove the theorem. We write

$$\mathrm{Var}_\nu(f) = \nu\big(\mathrm{Var}_\nu(f\mid\xi)\big)+\mathrm{Var}_\nu\big(\nu(f\mid\xi)\big). \tag{4.17}$$

The first term in the r.h.s. of (4.17) coincides with

$$\sum_{x\ge n-1}\rho(x)\,\mathrm{Var}_{\nu_{[0,x-1],n-1}\otimes\nu_{[x,H-1],0}}(f),$$

and therefore it can be bounded from above, using the definition of $\gamma(n,x)$, by

$$\sum_{x\ge n-1}\rho(x)\,[\gamma(n-1,x)\wedge\gamma(x-n+1,x)]\,\mathcal E_{[0,x-1],n-1}(f,f) \tag{4.18}$$

because of the holes–particles duality. Here and below $\gamma(0,x) = 0$ for all $x$. Let us examine the second term. Here we apply Lemma 4.4 to write

$$\mathrm{Var}_\nu\big(\nu(f\mid\xi)\big) \le k\sum_{x\ge n-1}\big[\rho(x)\wedge\rho(x+1)\big]\,[F(x+1)-F(x)]^2, \tag{4.19}$$

where $F(x) = \nu(f\mid\xi = x)$. In order to compute the "gradient" of $F(x)$ we write

$$\begin{aligned}
F(x) &= \sum_{\alpha;\ \xi(\alpha)=x}\frac{\nu(\alpha)}{\rho(x)}\,f(\alpha)\,\alpha_x = \frac{\rho(x+1)}{q\rho(x)}\sum_{\alpha;\ \xi(\alpha)=x+1}\frac{\nu(\alpha)}{\rho(x+1)}\,f(\alpha^{x,x+1})(1-\alpha_x)\\
&= \frac{\rho(x+1)}{q\rho(x)}\Big\{\nu\big([\nabla_{x,x+1}f](1-\alpha_x)\mid\xi = x+1\big)+\nu\big(f,(1-\alpha_x)\mid\xi = x+1\big)\Big\}\\
&\quad+\frac{\rho(x+1)}{q\rho(x)}\,\nu(f\mid\xi = x+1)\,\nu\big((1-\alpha_x)\mid\xi = x+1\big).
\end{aligned} \tag{4.20}$$

Setting $f = 1$ gives

$$\frac{\rho(x+1)}{q\rho(x)}\,\nu\big((1-\alpha_x)\mid\xi = x+1\big) = 1. \tag{4.21}$$

Therefore the last term in the r.h.s. of (4.20) is equal to $F(x+1)$. Thanks to (4.16), $\frac{\rho(x+1)}{q\rho(x)}\le k$ uniformly in $n$, $H$. In conclusion

$$[F(x+1)-F(x)]^2 \le k'\,\nu\big((\nabla_{x,x+1}f)^2\mid\xi = x+1\big)+k'\,\varepsilon(x)\,\nu\big(f,f\mid\xi = x+1\big), \tag{4.22}$$

where

$$\varepsilon(x) := \nu\big((1-\alpha_x),(1-\alpha_x)\mid\xi = x+1\big) \le kq^{2(x-n)} \tag{4.23}$$

because of (4.16). Thus the r.h.s. of (4.17) is bounded from above by

$$\sup_{n-1\le x\le H-1}\Big\{\big[\gamma(n-1,x)\wedge\gamma(x-n+1,x)\big](1+kq^{2(x-n)})\vee k'\Big\}\,\mathcal E(f,f). \tag{4.24}$$

In other words we have proved the recursive inequality

$$\gamma(n) \le \sup_{x\ge n}\Big\{\big[\gamma(n-1)\wedge\gamma(x-n)\big](1+kq^{2(x-n)})\vee k'\Big\} \le \sup_{n\le x\le 2n-1}\big(\gamma(x-n)\vee k''\big)(1+kq^{2(x-n)}). \tag{4.25}$$

It is quite simple now to conclude that $\gamma(n)$ is uniformly bounded. Indeed if $\tilde\gamma(m) := (\gamma(m)\vee k'')$ then (4.25) tells us that

$$\tilde\gamma(m) \le \sup_{1\le\ell\le m-1}\tilde\gamma(\ell)\,(1+kq^{2\ell}). \tag{4.26}$$

We then have a sequence $\ell_1<\ell_2<\cdots<\ell_s$, $\ell_s\le m-1$ such that

$$\tilde\gamma(m) \le \tilde\gamma(1)\prod_{i=1}^s(1+kq^{2\ell_i})$$

which is finite since $\tilde\gamma(1)<\infty$ and $\sum_iq^{2\ell_i}<\infty$. Thus $\tilde\gamma(m)$ is uniformly bounded and so is $\gamma(n)$. □
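The last step uses that a product $\prod_i(1+kq^{2\ell_i})$ over distinct positive integers $\ell_i$ is bounded uniformly in the number of factors. Since $1+x\le e^x$, it is at most $\exp\big(k\sum_{\ell\ge 1}q^{2\ell}\big) = \exp\big(kq^2/(1-q^2)\big)$. The following sketch compares the partial products with this bound; the function names and parameter values are illustrative assumptions, not constants from the proof.

```python
import math

def partial_product(k, q, s):
    """prod_{l=1}^{s} (1 + k q^{2l}), the largest product with s distinct exponents."""
    prod = 1.0
    for level in range(1, s + 1):
        prod *= 1.0 + k * q ** (2 * level)
    return prod

def uniform_bound(k, q):
    """exp(k q^2 / (1 - q^2)), obtained from 1 + x <= exp(x) and the geometric series."""
    return math.exp(k * q * q / (1.0 - q * q))
```

Taking all exponents $1,\dots,s$ is the worst case, because every factor exceeds 1 and omitting some of them only decreases the product.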
5. Proof of Theorem 4.1

The setting in this section is as in (3.55). For notational convenience in what follows we will drop the subscripts $\Lambda$, $n$. We also use $\nu(\cdot\mid m)$ for $\nu(\cdot\mid N_{\Lambda_1} = m)$. If we apply Proposition 3.11 to the function $g(N_{\Lambda_1}) = \nu(f\mid N_{\Lambda_1})$ we get

$$\mathrm{Var}_\nu\big(g(N_{\Lambda_1})\big) \le k\,(n\wedge L^2)\sum_m\big[p_n(m)\wedge p_n(m+1)\big]\,[g(m+1)-g(m)]^2, \tag{5.1}$$

where $p_n(m) = \nu(N_{\Lambda_1} = m)$. Therefore we need to study the gradient $g(m+1)-g(m)$. For this purpose the main idea (very roughly) is the following: Pick a configuration $\alpha$ such that $N_{\Lambda_1}(\alpha) = m+1$ and $N_\Lambda(\alpha) = n$, choose two sites $x\in\Lambda_1$, $y\in\Lambda_2$ such that $\alpha(x) = 1$, $\alpha(y) = 0$ and consider the exchanged configuration $\eta = \alpha^{xy}$. Clearly $N_{\Lambda_1}(\eta) = m$ and $N_\Lambda(\eta) = n$. Using this kind of change of variables it is not difficult to write an expression for the gradient $g(m+1)-g(m)$ in terms of suitable spatial averages of $\nabla_{xy}f$ plus a covariance term $\nu(f,F_{xy})$, where the latter originates from the action of the change of variables on the probability measures. One possibility to concretely implement this program is to write

$$\nu(f\mid m+1) = \frac{1}{(m+1)\,(|\Lambda_2|-(n-m-1))}\sum_{x\in\Lambda_1,\,y\in\Lambda_2}\nu\big(f\alpha_x(1-\alpha_y)\mid m+1\big),$$

where the normalization counts the pairs formed by a particle in $\Lambda_1$ and a hole in $\Lambda_2$, and to make the change of variables described above for each pair $(x,y)$. This idea works just fine in the context of translation invariant lattice gases [7], but has some drawback in our context due to the nature of the typical configurations of the measure $\nu(\cdot\mid m+1)$. As already shown, the $m+1$ particles in $\Lambda_1$ tend to fill the cylinder $\Lambda_1$ up to a well specified height and the same for $\Lambda_2$. Without loss of generality we can assume $m\ge n/2$ so that the resulting surface in $\Lambda_1$ will stay higher than the surface in $\Lambda_2$. Thus, if we don't want to transform a typical configuration of $\nu(\cdot\mid m+1)$ into an atypical one for $\nu(\cdot\mid m)$ via the exchange $T_{xy}$, we should only try to exchange the holes that sit on the surface in $\Lambda_2$ with the particles on the surface in $\Lambda_1$. In other words the above (deterministic) sum

$$\sum_{x\in\Lambda_1,\,y\in\Lambda_2}\alpha_x(1-\alpha_y)$$

should be replaced by a random variable

$$\sum_{x\in A,\,y\in B}\alpha_x(1-\alpha_y),$$

where $A$, $B$ denote the two surfaces. Of course, for certain rare configurations, the surfaces either do not exist or their density of particles is far from its typical value. We are forced therefore to split according to some criterion the contribution to the gradient $g(m+1)-g(m)$ coming from typical and rare configurations and apply the above reasoning only to the typical cases. The contribution coming from the rare configurations should be estimated via moderate deviation bounds for the measure $\nu(\cdot\mid m+1)$. We will now make precise what we just said. In the rest of this section we will always assume $m\ge\frac n2$.
For any event $G$ we write

$$\big[\nu(f\mid m+1)-\nu(f\mid m)\big]\,\nu(G\mid m+1) = \big[\nu(f1_G\mid m+1)-\nu(f\mid m)\,\nu(G\mid m+1)\big]-\nu(f,1_G\mid m+1). \tag{5.2}$$

We then estimate

$$\big[\nu(f\mid m+1)-\nu(f\mid m)\big]^2 \le 2\,\frac{\nu(G^c\mid m+1)}{\nu(G\mid m+1)}\,\mathrm{Var}_\nu(f\mid m+1)+2\,\frac{1}{\nu(G\mid m+1)^2}\,\big[\nu(f1_G\mid m+1)-\nu(f\mid m)\,\nu(1_G\mid m+1)\big]^2. \tag{5.3}$$
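Both (5.2) and (5.3) are elementary facts about a single probability measure and a fixed number: (5.2) is the definition of the covariance, and (5.3) follows from $(a-b)^2\le 2a^2+2b^2$ together with $\nu(f,1_G)^2\le\mathrm{Var}(f)\,\nu(G)\,\nu(G^c)$. The sketch below checks them on a finite probability space; here `probs` stands in for $\nu(\cdot\mid m+1)$ and the number `a` for $\nu(f\mid m)$, and all names are assumptions made for this illustration.

```python
import random

def expectation(values, probs):
    """Expectation of a function on a finite probability space."""
    return sum(v * p for v, p in zip(values, probs))

def check_5_2_and_5_3(f, probs, in_G, a):
    """Return (gap in identity (5.2), left side of (5.3), right side of (5.3))."""
    ind = [1.0 if g else 0.0 for g in in_G]
    mean_f = expectation(f, probs)
    p_G = expectation(ind, probs)
    f_on_G = expectation([v * i for v, i in zip(f, ind)], probs)
    cov = f_on_G - mean_f * p_G                       # nu(f, 1_G | m+1)
    var_f = expectation([(v - mean_f) ** 2 for v in f], probs)
    identity_gap = (mean_f - a) * p_G - ((f_on_G - a * p_G) - cov)
    lhs = (mean_f - a) ** 2
    rhs = 2 * (1 - p_G) / p_G * var_f + 2 * (f_on_G - a * p_G) ** 2 / p_G ** 2
    return identity_gap, lhs, rhs
```

The check requires $\nu(G\mid m+1)>0$, which is exactly what the bounds on the typical events are designed to guarantee.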
5.1. The typical events. We will provide different definitions of the typical event $G$ according to whether $L$ is "large" or "small" and whether we have "many" (MP) or "few" (FP) particles (see the beginning of Sect. 3.5).

$L$ large. We start with the MP case. Take $\lambda,\lambda'\in\mathbb R$ such that

$$\mu_{\Lambda_1}^\lambda(N_{\Lambda_1}) = m, \qquad \mu_{\Lambda_2}^{\lambda'}(N_{\Lambda_2}) = n-m. \tag{5.4}$$

Set

$$\ell_A = [\lambda\vee 0]\wedge(H-1), \qquad \ell_B = [\lambda'\vee 0]\wedge(H-1) \tag{5.5}$$

and define $A$, $B$ as the planar sections of $\Lambda_1$ and $\Lambda_2$ at height $\ell_A$ and $\ell_B$ respectively. More precisely

$$A = \Lambda_1\cap\mathcal A_{\ell_A}, \qquad B = \Lambda_2\cap\mathcal A_{\ell_B}. \tag{5.6}$$

Define the number of particles in $A$ and the number of holes in $B$:

$$N_A(\alpha) = \sum_{x\in A}\alpha_x, \qquad V_B(\alpha) = \sum_{y\in B}(1-\alpha_y), \tag{5.7}$$

and define $\bar N_A = \nu(N_A\mid m)$, $\bar V_B = \nu(V_B\mid m)$.

Definition 5.1. The event $G$ in the MP, $L$ large case. We set $G = G_A\cap G_B$, where

$$G_A = \{|N_A(\alpha)-\bar N_A|\le(n\wedge L^2)^{\frac12+\gamma}\}, \qquad G_B = \{|V_B(\alpha)-\bar V_B|\le L^{1+2\gamma}\}. \tag{5.8}$$

Here $\gamma$ is a small positive number, say $\gamma = .001$. Note that in any case when $L$ is large $G$ implies

$$N_A\ge\frac14\,(n\wedge L^2), \qquad V_B\ge L^2. \tag{5.9}$$

We turn to the case (FP). Here we do not fix two planar sections but rather confine most of the particles in a cylinder with finite height. Let $h\in\mathbb N$, $1\le h\le H$ and define

$$A = \{x\in\Lambda_1:\ \ell_x\le h-1\}, \qquad B = \{y\in\Lambda_2:\ \ell_y = 0\}. \tag{5.10}$$
Definition 5.2. The event $G$ in the FP, $L$ large case. We set $G = G_A\cap G_B$, with

$$G_A = \{N_A(\alpha)\ge m/2\}, \qquad G_B = \{|V_B(\alpha)-\bar V_B|\le L^{1+2\gamma}\}. \tag{5.11}$$

Note that here too when $L$ is large $G$ implies

$$N_A\ge\frac14\,n, \qquad V_B\ge L^2. \tag{5.12}$$

Finally we analyze the case of $L$ small. The construction of the sets $A$ and $B$ is done exactly as before in the two cases MP and FP but the definition of $G$ changes.

Definition 5.3. The event $G$ in the $L$ small case. We set $G = G_A\cap G_B$, where

$$G_A = \{N_A(\alpha)\ge 1\}, \qquad G_B = \{V_B(\alpha)\ge 1\}, \tag{5.13}$$

and $A$ and $B$ are as in (5.6) or as in (5.10) depending on whether we are in the MP or the FP case.

5.2. Bounds on the probability of the typical events. In what follows we will provide some simple estimates on the probability of the event $G^c$ in the various cases of few/many particles and large/small $L$. We begin by stating our bounds to be used when $L$ is large.

Lemma 5.4. Assume (MP). For any $q\in(0,1)$ there exists $k<\infty$ such that

$$\nu(G^c\mid m) \le k\exp(-k^{-1}L^{4\gamma}). \tag{5.14}$$

Proof. Observe that

$$\nu(G^c\mid m) \le \nu_{\Lambda_1,m}(G_A^c)+\nu_{\Lambda_2,n-m}(G_B^c).$$

We first prove

$$\nu_{\Lambda_1,m}(G_A^c) \le k\exp(-k^{-1}L^{4\gamma}). \tag{5.15}$$

We write $A = A_1\cup A_2$ with $A_1 = \Sigma_{L,H}\cap\mathcal A_{\ell_A}$ and $A_2 = \{(0,L,0)+\Sigma_{L,H}\}\cap\mathcal A_{\ell_A}$. Letting

$$G_i = \Big\{|N_{A_i}(\alpha)-\bar N_A/2|\le\frac12\,(n\wedge L^2)^{1/2+\gamma}\Big\}, \qquad i = 1,2,$$

we see that

$$\nu_{\Lambda_1,m}(G_A^c) \le \nu_{\Lambda_1,m}(G_1^c)+\nu_{\Lambda_1,m}((G_2)^c). \tag{5.16}$$

By Proposition 3.8 we can estimate (5.16) with the help of the grand canonical distribution $\mu^\lambda$, where $\lambda$ is given by (5.4). Thus

$$\nu_{\Lambda_1,m}(G_A^c) \le k\,\mu^\lambda(G_1^c)+k\,\mu^\lambda((G_2)^c) = 2k\,\mu^\lambda(G_1^c). \tag{5.17}$$

Note that $\alpha_x$, $x\in A_1$, are i.i.d. random variables under $\mu^\lambda$, with mean value $\rho_x = \mu^\lambda(\alpha_x)$. Let us consider the case $n\le L^2$ in detail. For the case $n\ge L^2$ simply replace $n$ by $L^2$ in the lines below. We have, for any $t\ge 0$,

$$\mu^\lambda(G_1^c) \le \exp(-tn^{1/2+\gamma})\,\big[\exp\big(|A_1|\varphi(t)\big)+\exp\big(|A_1|\varphi(-t)\big)\big], \tag{5.18}$$

where

$$\varphi(t) = \log\mu_{\Lambda_1}^\lambda\big(\exp(t(\alpha_x-\rho_x))\big). \tag{5.19}$$

Then $\varphi(0) = \varphi'(0) = 0$ and $\varphi''(t) = \mathrm{Var}_{\mu^{\lambda_t}}(\alpha_x)$, where $\lambda_t = \lambda+t/(-2\log q)$. Now, for any $|t|\le 1$,

$$\mathrm{Var}_{\mu^{\lambda_t}}(\alpha_x) = \mu^{\lambda_t}\big((\alpha_x-\mu^{\lambda_t}(\alpha_x))^2\big) \le \mu^{\lambda_t}\big((\alpha_x-\mu^\lambda(\alpha_x))^2\big) \le e^2\,\mathrm{Var}_{\mu^\lambda}(\alpha_x). \tag{5.20}$$

Then

$$|\varphi(t)| \le 5\,\mathrm{Var}_{\mu^\lambda}(\alpha_x)\,t^2, \qquad |t|\le 1. \tag{5.21}$$

Using Lemma 3.2 and Lemma 3.1 we have $|A_1|\,\mathrm{Var}_{\mu^\lambda}(\alpha_x) = \mathrm{Var}_{\mu^\lambda}(N_{A_1})\le km\le kn$. Therefore by (5.18), choosing $t = O(n^{-1/2+\gamma})$ we obtain

$$\mu^\lambda(G_1^c) \le k\exp(-k^{-1}n^{2\gamma}) \le k'\exp(-k'^{-1}L^{4\gamma}). \tag{5.22}$$

The estimate for $V_B$ is obtained in a similar fashion. □
We turn to analyze the case of few particles. Again our estimate will be meaningful only if L is large enough.
Lemma 5.5. Assume (FP). For any $q \in (0,1)$ there exist $k < \infty$, $h_0 < \infty$ such that for all $m \in [\frac n2, n]$ and $h \ge h_0$ we have
$$\nu(G^c \mid m) \le k \{n^{-1} q^{2h} + \exp(-k^{-1} L^{4\gamma})\}. \tag{5.23}$$
Proof. Repeating the argument leading to (5.18) and (5.21), choosing $t = O(L^{-1+2\gamma})$, we easily obtain
$$\nu_{\Lambda_2,n-m}(G_B^c) \le k \exp(-tL^{1+2\gamma}) \exp(knt^2) \le k' \exp(-k'^{-1} L^{4\gamma}). \tag{5.24}$$
Let $\bar A = \Lambda_1 \setminus A$ and write
$$\nu_{\Lambda_1,m}(G_A^c) \le \nu_{\Lambda_1,m}(N_{\bar A} \ge m/2).$$
Dividing $\bar A$ in two parts $\bar A = A_1 \cup A_2$ with $A_1 = \Sigma_{L,H} \cap \bar A$ and $A_2 = \{(0,L,0) + \Sigma_{L,H}\} \cap \bar A$, we may estimate
$$\nu_{\Lambda_1,m}(N_{\bar A} \ge m/2) \le \nu_{\Lambda_1,m}(N_{A_1} \ge m/4) + \nu_{\Lambda_1,m}(N_{A_2} \ge m/4).$$
Then by Proposition 3.8 it is sufficient to estimate $\mu^\lambda_{\Lambda_1}[N_{A_1} \ge m/4]$, where $\lambda$ is given by (5.4). Since $m \le n \le \delta L^2$ we have $\lambda \le 0$ from Lemma 3.1. Therefore
$$\mu^\lambda_{\Lambda_1}(N_{A_1}) = L^2 \sum_{j=h}^{H-1} \frac{q^{2(j-\lambda)}}{1 + q^{2(j-\lambda)}} \le 2q^{2h} L^2 \sum_{j=0}^{H-1} \frac{q^{2(j-\lambda)}}{1 + q^{2(j-\lambda)}} = 2q^{2h} \mu^\lambda_{\Lambda_1}(N_{\Lambda_1}) = 2q^{2h} m.$$
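The comparison between the sum over heights $j \ge h$ and the full sum uses only $\lambda \le 0$. The following sketch (an illustration, not taken from the paper; the function names and the sample values of q, $\lambda$, H are hypothetical) evaluates both sides for a single stick, so that the common factor $L^2$ drops out.

```python
def occupation(j, lam, q):
    """Mean occupation q^{2(j-lam)} / (1 + q^{2(j-lam)}) at height j."""
    x = q ** (2 * (j - lam))
    return x / (1 + x)

def tail_and_total(h, H, lam, q):
    """Return (sum over j = h..H-1, sum over j = 0..H-1) of the mean occupation."""
    tail = sum(occupation(j, lam, q) for j in range(h, H))
    total = sum(occupation(j, lam, q) for j in range(0, H))
    return tail, total

def comparison_holds(h, H, lam, q):
    """Check tail <= 2 q^{2h} total, the inequality used for lam <= 0."""
    tail, total = tail_and_total(h, H, lam, q)
    return tail <= 2 * q ** (2 * h) * total + 1e-12
```

The factor 2 comes from $1 + q^{2(j-\lambda)} \le 2$ when $j - \lambda \ge 0$, which is why the sign condition on $\lambda$ is needed.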
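We then estimate
$$\mu^\lambda_{\Lambda_1}[N_{A_1} \ge m/4] \le \mu^\lambda_{\Lambda_1}\big[|N_{A_1} - \mu^\lambda_{\Lambda_1}(N_{A_1})| \ge cm\big],$$
with $c > 0$, if $h \ge h_0(q)$ for some $h_0(q) < \infty$. Then
$$\mu^\lambda_{\Lambda_1}[N_{A_1} \ge m/4] \le \frac{1}{c^2 m^2} \mathrm{Var}_{\mu^\lambda_{\Lambda_1}}(N_{A_1}) = \frac{L^2}{c^2 m^2} \sum_{j=h}^{H-1} \frac{q^{2(j-\lambda)}}{(1 + q^{2(j-\lambda)})^2} \le \frac{(1-q^2)^{-1} q^{2(h-\lambda)} L^2}{c^2 m^2} \le \frac{4q^{2h}}{c^2 m}, \tag{5.25}$$
where in the last bound we use
$$q^{-2\lambda} \le \frac{4m}{L^2}(1 - q^2),$$
which follows from Lemma 3.1. Since $m \ge n/2$, (5.25) and (5.24) yield (5.23). □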
Finally we analyze the case L small. Below the event G will be that appearing in (5.13).
Lemma 5.6.
$$\nu(G \mid m) \ge c(q, L) > 0, \tag{5.26}$$
with a constant $c(q, L)$ independent of the height H of the cylinder.
Proof. Since
$$\nu(G_A \mid m) = \frac{\mu^\lambda_{\Lambda_1}(G_A \cap \{N_{\Lambda_1} = m\})}{\mu^\lambda_{\Lambda_1}(\{N_{\Lambda_1} = m\})}, \tag{5.27}$$
the claim easily follows from a slight modification of the argument given at the end of Theorem 3.4 (packing all particles at the bottom of the cylinder). The same can be done for the event $G_B$. □
5.3. Bounding the gradient $[\nu(f \mathbf 1_G \mid m+1) - \nu(f \mid m)\nu(G \mid m+1)]^2$. From Lemmas 5.4, 5.5 and 5.6 we see that, for any $\varepsilon \in (0,1)$, the first term in the r.h.s. of (5.3) can be bounded from above by
(i) L large.
$$\frac{\nu(G^c \mid m+1)}{\nu(G \mid m+1)^2} \mathrm{Var}_\nu(f \mid m+1) \le \frac{\varepsilon}{n \wedge L^2} \mathrm{Var}_\nu(f \mid m+1), \tag{5.28}$$
provided that L and the constant h in Lemma 5.5 are large enough depending on $\varepsilon$.
(ii) L small.
$$\frac{\nu(G^c \mid m+1)}{\nu(G \mid m+1)^2} \mathrm{Var}_\nu(f \mid m+1) \le C\, \mathrm{Var}_\nu(f \mid m+1), \tag{5.29}$$
where $C = C(L)$ is some finite constant independent of m.
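We now turn our attention to the second term appearing in the r.h.s. of (5.3). As before, the factor $2\nu(G \mid m+1)^{-2}$ can be bounded from above by either $\frac{2}{(1-\varepsilon)^2}$ or by $C'(L)$ for a suitable constant $C'(L)$, according to whether L is large enough (depending on $\varepsilon$) or it is small (i.e. smaller than some $L_0$). We thus concentrate on the computation of the relevant term
$$[\nu(f \mathbf 1_G \mid m+1) - \nu(f \mid m)\nu(G \mid m+1)]^2.$$
The following calculation holds irrespective of which definition of G is adopted. Defining
$$\phi_{xy}(\alpha) = \frac{\alpha_x(1 - \alpha_y)}{N_A(\alpha) V_B(\alpha)}, \qquad x \in A,\ y \in B,$$
we may write
$$\nu(f \mathbf 1_G \mid m+1) = \sum_{x \in A} \sum_{y \in B} \nu(f \mathbf 1_G \phi_{xy} \mid m+1). \tag{5.30}$$
With the change of variables $\alpha \to T_{xy}\alpha$, (5.30) becomes
$$\nu(f \mathbf 1_G \mid m+1) = \frac{p_n(m)}{p_n(m+1)} \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu(T_{xy}[f \mathbf 1_G \phi_{xy}] \mid m) = \sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu([T_{xy} f] F_{xy} \mid m), \tag{5.31}$$
where
$$\sigma_m = \frac{p_n(m)}{p_n(m+1)}, \qquad F_{xy}(\alpha) = \mathbf 1_G(\alpha^{xy}) \phi_{xy}(\alpha^{xy}). \tag{5.32}$$
Subtracting and adding f inside averages gives
$$\nu(f \mathbf 1_G \mid m+1) = \sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \{\nu([\nabla_{xy} f] F_{xy} \mid m) + \nu(f F_{xy} \mid m)\}. \tag{5.33}$$
When $f = 1$ we see that
$$\nu(G \mid m+1) = \sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu(F_{xy} \mid m). \tag{5.34}$$
Therefore, by subtracting $\nu(f \mid m)\nu(G \mid m+1)$ the last term in (5.33) becomes a covariance:
$$\nu(f \mathbf 1_G \mid m+1) - \nu(f \mid m)\nu(G \mid m+1) = \sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \{\nu([\nabla_{xy} f] F_{xy} \mid m) + \nu(f, F_{xy} \mid m)\}. \tag{5.35}$$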
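We then estimate the square of the l.h.s. of (5.35) by
$$[\nu(f \mathbf 1_G \mid m+1) - \nu(f \mid m)\nu(G \mid m+1)]^2 \le I_1 + I_2, \tag{5.36}$$
with
$$I_1 = 2\Big\{\sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu([\nabla_{xy} f] F_{xy} \mid m)\Big\}^2, \tag{5.37}$$
and
$$I_2 = 2\Big\{\sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu(f, F_{xy} \mid m)\Big\}^2. \tag{5.38}$$
Estimate of $I_1$. Using (5.34), the non-negativity of $F_{xy}$ and the Schwarz inequality we obtain
$$I_1 \le 2\sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu(F_{xy} \mid m) \left[\frac{\nu([\nabla_{xy} f] F_{xy} \mid m)}{\nu(F_{xy} \mid m)}\right]^2 \le 2\sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu((\nabla_{xy} f)^2 F_{xy} \mid m). \tag{5.39}$$
Next we observe that, by the definition of the event G, using (5.9) and (5.12) we have
$$F_{xy}(\alpha) = \mathbf 1_G(\alpha^{xy}) \frac{\alpha_y(1 - \alpha_x)}{N_A(\alpha^{xy}) V_B(\alpha^{xy})} \le \begin{cases} 4(n \wedge L^2)^{-1} L^{-2} \alpha_y(1 - \alpha_x) & \text{(MP) } L \text{ large} \\ 4n^{-1} L^{-2} \alpha_y(1 - \alpha_x) & \text{(FP) } L \text{ large} \\ \alpha_y(1 - \alpha_x) & L \text{ small.} \end{cases} \tag{5.40}$$
Therefore in both cases (MP), (FP)
$$I_1 \le 4\sigma_m L^{-2} (n \wedge L^2)^{-1} \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu((\nabla_{xy} f)^2 \alpha_y(1 - \alpha_x) \mid m), \tag{5.41}$$
if L is large, while
$$I_1 \le \sigma_m \sum_{x \in A} \sum_{y \in B} q^{2(\ell_x - \ell_y)} \nu((\nabla_{xy} f)^2 \alpha_y(1 - \alpha_x) \mid m) \tag{5.42}$$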
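if L is small. We can finally apply Proposition 3.12 to obtain
$$I_1 \le \sigma_m \begin{cases} \dfrac{C}{n \wedge L^2} \Big\{L^2 \sum_{b \in O} \nu((\nabla_b f)^2 \mid m) + \sum_{b \in V} \nu((\nabla_b f)^2 \mid m)\Big\} & \text{if } L \text{ is large} \\ L^2 \sum_{b \in O} \nu((\nabla_b f)^2 \mid m) + \sum_{b \in V} \nu((\nabla_b f)^2 \mid m) & \text{if } L \text{ is small,} \end{cases} \tag{5.43}$$
where C is a suitable constant depending on q and h (h is the constant in Lemma 5.5).
Estimate of $I_2$. Recall the definition of $I_2$ given in (5.38). It is quite clear from (5.40), Lemma 3.10 and the Schwarz inequality that
$$I_2 \le kL^8\, \mathrm{Var}_\nu(f \mid m), \tag{5.44}$$
where $k = k(q)$. Such a bound will turn out to be useful when L is “small”. The case L large is more involved and requires a more subtle analysis. We start with the case (MP).
Lemma 5.7. For every $\varepsilon > 0$ and $q \in (0,1)$ there exist finite constants $C_\varepsilon$ and $L_0$ such that for any $L \ge L_0$, H, n satisfying (MP) the following estimate holds:
$$I_2 \le (n \wedge L^2)^{-1} \{C_\varepsilon \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.45}$$
Proof. From (5.38) and Lemma 3.10 we have a first estimate
$$I_2 \le k\, \nu\Big(f, \sum_{x \in A} \sum_{y \in B} F_{xy} \mid m\Big)^2. \tag{5.46}$$
Observe that
$$\sum_{x \in A} \sum_{y \in B} F_{xy} = \frac{V_A N_B}{(N_A + 1)(V_B + 1)} \mathbf 1_{\tilde G}, \tag{5.47}$$
where $\tilde G = \tilde G_A \cap \tilde G_B$ with
$$\tilde G_A = \{|N_A(\alpha) + 1 - \bar N_A| \le (n \wedge L^2)^{1/2+\gamma}\} \quad\text{and}\quad \tilde G_B = \{|V_B(\alpha) + 1 - \bar V_B| \le L^{1+2\gamma}\}.$$
As in Lemma 5.4 we have the bounds
$$\nu(\tilde G_A^c \mid m) \le k \exp(-k^{-1} L^{4\gamma}), \qquad \nu(\tilde G_B^c \mid m) \le k \exp(-k^{-1} L^{4\gamma}). \tag{5.48}$$
Writing
$$F_A = \frac{V_A}{N_A + 1} \mathbf 1_{\tilde G_A}, \qquad F_B = \frac{N_B}{V_B + 1} \mathbf 1_{\tilde G_B},$$
(5.46) says that
$$I_2 \le k\, \nu(f, F_A F_B \mid m)^2. \tag{5.49}$$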
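We write $\nu(\cdot \mid m) = \nu_1 \otimes \nu_2$ where $\nu_1 = \nu_{\Lambda_1,m}$ and $\nu_2 = \nu_{\Lambda_2,n-m}$ and use the decomposition
$$\nu(f, F_A F_B \mid m) = \nu_2(F_B)\, \nu(f, F_A \mid m) + \nu(F_A\, \nu_2(f, F_B) \mid m). \tag{5.50}$$
We start by estimating $\nu(f, F_A \mid m)^2$. Defining
$$\rho_A = \frac{N_A}{|A|}, \qquad \bar\rho_A = \frac{\bar N_A}{|A|},$$
we may write
$$F_A = \frac{\mathbf 1_{\tilde G_A}}{\rho_A} - \mathbf 1_{\tilde G_A} + \mathbf 1_{\tilde G_A} \Big(1 - \frac{1}{\rho_A}\Big) \frac{1}{N_A + 1}. \tag{5.51}$$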
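The decomposition (5.51) is a purely algebraic identity for $V_A/(N_A+1)$ with $V_A = |A| - N_A$. A small exact-arithmetic check can be sketched as follows; this is an illustration only, with hypothetical function names, and it assumes $N_A \ge 1$ so that $\rho_A > 0$.

```python
from fractions import Fraction

def ratio_direct(size_a, n_a):
    """V_A / (N_A + 1) with V_A = |A| - N_A."""
    return Fraction(size_a - n_a, n_a + 1)

def ratio_decomposed(size_a, n_a):
    """The three terms 1/rho - 1 + (1 - 1/rho)/(N_A + 1), with rho = N_A/|A|."""
    rho = Fraction(n_a, size_a)
    return 1 / rho - 1 + (1 - 1 / rho) * Fraction(1, n_a + 1)

def identity_holds(max_size):
    """Check the identity for all 1 <= N_A <= |A| <= max_size."""
    return all(
        ratio_direct(s, n) == ratio_decomposed(s, n)
        for s in range(1, max_size + 1)
        for n in range(1, s + 1)
    )
```

Using `Fraction` keeps the comparison exact, so the check does not depend on floating-point tolerances.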
For the second term in the right side of (5.51) one can use (5.48). For the third term, recalling (5.9), one has an upper bound of order $kL^{-2}$. Therefore apart from the first term the rest contributes at most $kL^{-4}\, \mathrm{Var}_\nu(f \mid m)$ to the upper bound on $\nu(f, F_A \mid m)^2$. The first term in (5.51) is handled as follows. We expand
$$\frac{\mathbf 1_{\tilde G_A}}{\rho_A} = \frac{\mathbf 1_{\tilde G_A}}{\bar\rho_A} \Big(2 - \frac{\rho_A}{\bar\rho_A}\Big) + R_A, \tag{5.52}$$
where, on $\tilde G_A$,
$$|R_A| \le k \Big(\frac{\rho_A}{\bar\rho_A} - 1\Big)^2 \le k (n \wedge L^2)^{-1+2\gamma}. \tag{5.53}$$
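The remainder in the expansion (5.52) can be written in closed form. The following short computation is an added sketch, not taken from the paper, and assumes only that $\rho_A > 0$ on $\tilde G_A$:

```latex
\frac{1}{\rho_A}-\frac{1}{\bar\rho_A}\Bigl(2-\frac{\rho_A}{\bar\rho_A}\Bigr)
  =\frac{\bar\rho_A^{2}-2\rho_A\bar\rho_A+\rho_A^{2}}{\rho_A\,\bar\rho_A^{2}}
  =\frac{1}{\rho_A}\Bigl(\frac{\rho_A}{\bar\rho_A}-1\Bigr)^{2}.
```

Hence on $\tilde G_A$ one may take $R_A = \rho_A^{-1}(\rho_A/\bar\rho_A - 1)^2$, which is consistent with the first inequality in (5.53) as long as $\rho_A$ stays bounded away from zero there.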
In view of (5.53) and using again (5.48) to suppress the term proportional to $\mathbf 1_{\tilde G_A}$ we have obtained
$$\nu(f, F_A \mid m)^2 \le k (\bar\rho_A^2 \bar N_A^2)^{-1} \nu(f, N_A \mid m)^2 + kL^{-4+8\gamma}\, \mathrm{Var}_\nu(f \mid m). \tag{5.54}$$
An application of Proposition 3.9 together with the bound $(\bar\rho_A^2 \bar N_A^2)^{-1} \le kL^4 (n \wedge L^2)^{-4}$ yields the estimate
$$\nu(f, F_A \mid m)^2 \le L^4 (n \wedge L^2)^{-3} \{C \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.55}$$
Using $F_B \le (n \wedge L^2) L^{-2}$, the first term in (5.50) can be finally estimated by
$$\nu_2(F_B)^2\, \nu(f, F_A \mid m)^2 \le (n \wedge L^2)^{-1} \{C \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.56}$$
We turn to the second term in (5.50). Repeating the arguments given above and using $|V_B - \bar V_B| \le L^{1+2\gamma}$ we obtain the upper bound
$$\nu(\nu_2(f, F_B)^2 \mid m) \le kL^{-4}\, \nu(\nu_2(f, N_B)^2 \mid m) + kL^{-4+8\gamma}\, \mathrm{Var}_\nu(f \mid m). \tag{5.57}$$
By Proposition 3.9
$$\nu(\nu_2(f, F_B)^2 \mid m) \le (n \wedge L^2) L^{-4} \{C \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.58}$$
Recalling that $F_A \le L^2 (n \wedge L^2)^{-1}$ we can estimate the square of the second term in (5.50) by
$$\nu(F_A^2 \mid m)\, \nu(\nu_2(f, F_B)^2 \mid m) \le (n \wedge L^2)^{-1} \{C \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.59}$$
□
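We now turn to estimate $I_2$ in the case (FP). Recall that here A is the cylinder with height h, see (5.10).
Lemma 5.8. For every $\varepsilon > 0$, $q \in (0,1)$, there exist $\delta_0(\varepsilon, q) > 0$ and finite constants $C_\varepsilon$, $L_\varepsilon$ and $h_\varepsilon$ such that for any $L \ge L_\varepsilon$, H, n satisfying (FP) with $\delta \le \delta_0$, any $m \ge \frac n2$ and $h \ge h_\varepsilon$,
$$I_2 \le n^{-1} \{C_\varepsilon \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.60}$$
Proof. Define
$$F_A = \frac{\mathbf 1_{\tilde G_A}}{N_A + 1} \sum_{j=0}^{h-1} q^{2j} V_j, \qquad F_B = \frac{N_B}{V_B + 1} \mathbf 1_{\tilde G_B},$$
with
$$V_j = \sum_{x \in A_j} (1 - \alpha_x), \qquad A_j = \{x \in A : \ell_x = j\},$$
and $\tilde G_A = \{N_A + 1 \ge m/2\}$, $\tilde G_B = \{|V_B + 1 - \bar V_B| \le L^{1+2\gamma}\}$. Then as in (5.46) we have
$$I_2 \le k\, \nu(f, F_A F_B \mid m)^2, \tag{5.61}$$
and we decompose as in (5.50). Let us first estimate
$$\nu(f, F_A \mid m)^2 \le \mathrm{Var}_\nu(f \mid m)\, \mathrm{Var}_\nu(F_A \mid m) = \mathrm{Var}_\nu(f \mid m) \{\nu(\mathrm{Var}_\nu(F_A \mid N_A) \mid m) + \mathrm{Var}_\nu(\nu(F_A \mid N_A) \mid m)\}. \tag{5.62}$$
Observe that
$$\mathrm{Var}_\nu(F_A \mid N_A) = \frac{\mathbf 1_{\tilde G_A}}{(N_A + 1)^2} \mathrm{Var}_\nu\Big(\sum_j q^{2j} V_j \mid N_A\Big) \le kn^{-2} \sum_j q^{2j}\, \mathrm{Var}_\nu(V_j \mid N_A) \le kn^{-1}, \tag{5.63}$$
where we used the fact that $N_A + 1 \ge m/2 \ge n/4$ and that $\mathrm{Var}_\nu(V_j \mid N_A) = \mathrm{Var}_\nu(N_{A_j} \mid N_A) \le kn$. The latter estimate can be derived as usual from Proposition 3.8 and Lemma 3.2. For the second term in (5.62) we claim that
$$\mathrm{Var}_\nu(\nu(F_A \mid N_A) \mid m) \le kL^4 n^{-3} q^{2h}. \tag{5.64}$$
Set
$$\varphi_j(N_A) = \frac{\nu(V_j \mid N_A)}{N_A + 1},$$
so that
$$\mathrm{Var}_\nu(\nu(F_A \mid N_A) \mid m) \le k \sum_j q^{2j}\, \mathrm{Var}_\nu(\varphi_j(N_A) \mathbf 1_{\tilde G_A} \mid m). \tag{5.65}$$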
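We have
$$\mathrm{Var}_\nu(\varphi_j(N_A) \mathbf 1_{\tilde G_A} \mid m) = \sum_{\ell, \ell'} \nu(N_A = \ell \mid m)\, \nu(N_A = \ell' \mid m) \times [\varphi_j(\ell) - \varphi_j(\ell')]^2 \{\mathbf 1_{\tilde G_A}(\ell) \mathbf 1_{\tilde G_A}(\ell') + 2 \cdot \mathbf 1_{\tilde G_A}(\ell) \mathbf 1_{\tilde G_A^c}(\ell')\}. \tag{5.66}$$
Using $N_A \in [n/4, n]$, $V_j \le kL^2$ and $|\nu(V_j \mid N_A = \ell) - \nu(V_j \mid N_A = \ell')| \le n$, (5.66) implies
$$\mathrm{Var}_\nu(\varphi_j(N_A) \mid m) \le k + kL^4 n^{-4}\, \mathrm{Var}_\nu(N_A \mid m) + kL^4 n^{-2}\, \nu(\tilde G_A^c \mid m). \tag{5.67}$$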
From the equivalence of ensembles and Lemma 3.2 we have $\mathrm{Var}_\nu(N_A \mid m) \le kmq^{2h} \le knq^{2h}$. Moreover, by Lemma 5.5 we know that $\nu(\tilde G_A^c \mid m) \le kn^{-1} q^{2h}$. Thus (5.67) combined with (5.65) yields the claim (5.64). Going back to (5.62) and recalling that $F_B \le knL^{-2}$ we have the estimate
$$\nu_2(F_B)^2\, \nu(f, F_A \mid m)^2 \le k \{q^{2h} n^{-1} + nL^{-4}\}\, \mathrm{Var}_\nu(f \mid m). \tag{5.68}$$
Recall that $n \le \delta L^2$, so that $nL^{-4} \le \delta n^{-1}$ and we have to choose $\delta$ small depending on $\varepsilon$. We now estimate the term $\nu(\nu_2(f, F_B)^2 \mid m)$ in (5.50). As in (5.58) we have
$$\nu(\nu_2(f, F_B)^2 \mid m) \le nL^{-4} \{C_\varepsilon \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.69}$$
At this point the bound $F_A \le kn^{-1} L^2$ gives
$$\nu(F_A^2 \mid m)\, \nu(\nu_2(f, F_B)^2 \mid m) \le n^{-1} \{C_\varepsilon \mathcal E_\nu(f, f \mid m) + \varepsilon\, \mathrm{Var}_\nu(f \mid m)\}. \tag{5.70}$$
Choosing h sufficiently large in (5.68) and combining with (5.70) the proof of (5.60) is complete. □
5.4. The proof of the theorem completed. (i) L large. From the estimate of Proposition 3.11 applied to $g(N_{\Lambda_1}) := \nu(f \mid N_{\Lambda_1})$, (5.3), the bound (5.28) and (5.36), we see that
$$\mathrm{Var}_\nu(\nu(f \mid N_{\Lambda_1})) \le \varepsilon\, \mathrm{Var}_\nu(f) + k (n \wedge L^2) \sum_m [p_n(m) \wedge p_n(m+1)] \{I_1 + I_2\}, \tag{5.71}$$
provided that L is large enough depending on q, $\varepsilon$. Thanks to (5.43),
$$(n \wedge L^2) \sum_m [p_n(m) \wedge p_n(m+1)]\, I_1 \le k \sum_m p_n(m) \Big\{L^2 \sum_{b \in O} \nu((\nabla_b f)^2 \mid m) + \sum_{b \in V} \nu((\nabla_b f)^2 \mid m)\Big\} = k \{L^2 \mathcal E_\nu^{O}(f, f) + \mathcal E_\nu^{V}(f, f)\}. \tag{5.72}$$
On the other hand the estimates on $I_2$ given in Lemmas 5.7 and 5.8 yield
$$(n \wedge L^2) \sum_m [p_n(m) \wedge p_n(m+1)]\, I_2 \le C_\varepsilon \mathcal E_\nu(f, f) + \varepsilon\, \mathrm{Var}_\nu(f) \tag{5.73}$$
for any $\varepsilon > 0$ and a suitable constant $C_\varepsilon$ independent of L. In conclusion, for any $\varepsilon > 0$ and $q \in (0,1)$ we can choose $L_0 = L_0(\varepsilon, q)$ such that, by combining together (5.72) and (5.73), the r.h.s. of (5.71) is bounded from above by
$$k \{L^2 \mathcal E_\nu^{O}(f, f) + \mathcal E_\nu^{V}(f, f)\} + C_\varepsilon \mathcal E_\nu(f, f) + \varepsilon\, \mathrm{Var}_\nu(f) \tag{5.74}$$
for any $L \ge L_0$.
(ii) L small. Using Proposition 3.11 together with (5.29), and (5.36), we see that
$$\mathrm{Var}_\nu(\nu(f \mid N_{\Lambda_1})) \le C \Big\{\sum_m [p_n(m) \wedge p_n(m+1)] (I_1 + I_2) + \mathrm{Var}_\nu(f)\Big\} \tag{5.75}$$
for a suitable constant $C = C(L, q)$. It is enough to use at this point (5.43) together with the rough estimate (5.44) to get the desired bound. □
6. Proof of the Upper Bound in Theorem 2.7 and of Theorem 2.4
In this final section we prove the upper bound on the spectral gap of the generator $\mathcal L_{\Sigma_{L,H},n}$ and the bound on the spectral projection.
6.1. Proof of (2.34). Consider the cylinder $\Lambda := \Sigma_{L,H}$ which has the square $Q_L$ (containing $L^2$ sites) as basis. A generic point of $Q_L$ will be denoted by z and $N_z$ stands for the number of particles in the stick going through z,
$$N_z(\alpha) = \sum_{x \in \Sigma_{z,H}} \alpha_x.$$
Given a smooth function $\varphi : [0,1]^2 \to \mathbb R$, we define the function $f_\varphi$ of the configuration $\alpha$ by
$$f_\varphi(\alpha) = \sum_{z \in Q_L} \varphi_L(z) N_z(\alpha), \tag{6.1}$$
where $\varphi_L$ denotes the rescaled profile $\varphi_L(z) = \varphi(z/L)$, $z \in Q_L$. We will use the notation
$$e(\varphi) = \int_{[0,1]^2} |\nabla\varphi(u)|^2\, du, \qquad |\nabla\varphi(u)|^2 := (\partial_{u_1}\varphi(u))^2 + (\partial_{u_2}\varphi(u))^2. \tag{6.2}$$
The upper bound in Theorem 2.7 is obtained as follows.
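Proposition 6.1. For every $q \in (0,1)$, there exists $k = k(q) < \infty$ such that the following holds. For any smooth function $\varphi : [0,1]^2 \to \mathbb R$ satisfying $\int \varphi(u)\, du = 0$ and $\int \varphi(u)^2\, du = 1$ there exists $L_0$ such that for any $L \ge L_0$, $H \ge 1$ and $n = 1, \dots, HL^2 - 1$ one has
$$\mathcal E_\nu(f_\varphi, f_\varphi) \le k\, e(\varphi) L^{-2}\, \mathrm{Var}_\nu(f_\varphi). \tag{6.3}$$
Proof. Observing that $N_z$, $z \in Q_L$, are identically distributed under $\nu$ we easily see that
$$\nu(N_z, N_{z'}) = -\frac{\sigma_\nu^2}{L^2 - 1}, \qquad z \ne z', \tag{6.4}$$
where $\sigma_\nu^2 := \nu(N_z, N_z)$ is the variance of the number of particles in a single stick. Thus
$$\mathrm{Var}_\nu(f_\varphi) = \sigma_\nu^2 \sum_{z \in Q_L} \varphi_L(z)^2 - \frac{\sigma_\nu^2}{L^2 - 1} \sum_{z, z' \in Q_L :\, z \ne z'} \varphi_L(z) \varphi_L(z') \ge \sigma_\nu^2 \sum_{z \in Q_L} \varphi_L(z)^2 - \frac{\sigma_\nu^2}{L^2 - 1} \Big(\sum_{z \in Q_L} \varphi_L(z)\Big)^2. \tag{6.5}$$
Since $\int \varphi(u)\, du = 0$ and $\int \varphi(u)^2\, du = 1$, from Riemann integration we conclude that there exists a finite $L_0$ such that for any $L > L_0$,
$$\mathrm{Var}_\nu(f_\varphi) \ge \frac{\sigma_\nu^2}{2} L^2. \tag{6.6}$$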
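The covariance formula (6.4) only uses two facts: the stick occupations are exchangeable and their sum is fixed. The following brute-force sketch illustrates this on a tiny system. It is not the paper's measure $\nu$: the weight $q^{2j}$ per particle at height j is an assumed stand-in that depends on heights only, and K plays the role of $L^2$.

```python
from fractions import Fraction
from itertools import combinations

def stick_covariances(K, H, n, q):
    """Exact covariance matrix of (N_0, ..., N_{K-1}) for n particles on K sticks of height H."""
    sites = [(z, j) for z in range(K) for j in range(H)]
    norm = Fraction(0)
    first = [Fraction(0)] * K
    second = [[Fraction(0)] * K for _ in range(K)]
    for occupied in combinations(sites, n):
        weight = Fraction(1)
        counts = [0] * K
        for z, j in occupied:
            weight *= q ** (2 * j)  # assumed height-dependent weight
            counts[z] += 1
        norm += weight
        for z in range(K):
            first[z] += weight * counts[z]
            for zp in range(K):
                second[z][zp] += weight * counts[z] * counts[zp]
    mean = [m / norm for m in first]
    return [
        [second[z][zp] / norm - mean[z] * mean[zp] for zp in range(K)]
        for z in range(K)
    ]
```

For any such exchangeable weight the off-diagonal entries equal minus the diagonal entry divided by K - 1, which is the content of (6.4).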
Let us now estimate the Dirichlet form. In view of (6.6) all we have to prove is
$$\mathcal E_\nu(f_\varphi, f_\varphi) \le k\, e(\varphi)\, \sigma_\nu^2. \tag{6.7}$$
Consider a bond $(x, y) = b \in O$. Clearly $\nabla_{xy} f_\varphi = 0$ if x, y belong to the same stick, since an exchange between x and y does not change the number of particles in any stick. In particular, only horizontal bonds $b \in O$ contribute to $\mathcal E_\nu(f_\varphi, f_\varphi)$. Take $z, z' \in Q_L$ such that $x \in \Sigma_{z,H}$, $y \in \Sigma_{z',H}$ and $b = (x, y) \in O$. One has
$$\nabla_{xy} f_\varphi(\alpha) = (\alpha_y - \alpha_x)(\varphi_L(z) - \varphi_L(z')), \tag{6.8}$$
and since $\|z - z'\|_1 = 1$, $|\varphi_L(z) - \varphi_L(z')| \le 2L^{-1} |\nabla\varphi(\tilde z/L)| + O(L^{-2})$. From (6.8) we obtain
$$\mathcal E_\nu(f_\varphi, f_\varphi) \le 2L^{-2} \sum_{z \in Q_L} \{|\nabla\varphi(\tilde z/L)|^2 + O(L^{-2})\}\, C(\nu, z),$$
with
$$C(\nu, z) := \sum_{x \in \Sigma_{z,H}} \ \sum_{y \notin \Sigma_{z,H} :\, \|x - y\|_1 = 1} \nu((\alpha_x - \alpha_y)^2). \tag{6.9}$$
Since
$$L^{-2} \sum_{z \in Q_L} \{|\nabla\varphi(\tilde z/L)|^2 + O(L^{-2})\} \to e(\varphi), \qquad L \to \infty,$$
the claim (6.7) is proven once we show that there exists $k < \infty$ such that for any $z \in Q_L$,
$$C(\nu, z) \le k \sigma_\nu^2. \tag{6.10}$$
We start the proof of (6.10) by estimating with the help of Proposition 3.8:
$$\nu((\alpha_x - \alpha_y)^2) \le k \mu^\lambda((\alpha_x - \alpha_y)^2), \tag{6.11}$$
with $\mu^\lambda$ the grand canonical measure corresponding to n particles. We observe that, since x and y are at the same height,
$$\mu^\lambda((\alpha_x - \alpha_y)^2) = \mu^\lambda(\alpha_x)(1 - \mu^\lambda(\alpha_x)) + \mu^\lambda(\alpha_y)(1 - \mu^\lambda(\alpha_y)) = 2\, \mathrm{Var}_{\mu^\lambda}(\alpha_x).$$
For every $x \in \Sigma_{z,H}$ there are at most 4 horizontal neighbours $y \notin \Sigma_{z,H}$, so that $C(\nu, z) \le 8\sigma^2(\lambda)$, with $\sigma^2(\lambda) := \mathrm{Var}_{\mu^\lambda}(N_z)$. The rest of the proof is now concerned with the estimate
$$\sigma^2(\lambda) \le k \sigma_\nu^2(n) \tag{6.12}$$
with a constant k only depending on q. Once (6.12) is established we obtain (6.10) and the proposition follows. Below we restrict to the case $n \le HL^2/2$, which is no loss of generality in view of particle–hole duality. From (6.4) we have
$$\sigma_\nu^2 = \frac{L^2 - 1}{2L^2}\, \nu((N_z - N_{z'})^2), \qquad z \ne z'. \tag{6.13}$$
For any integer $m \ge -1$ consider the event $E_{z,m}$ that the stick $\Sigma_{z,H}$ is filled with particles up to level m and is empty above level m. More precisely, if $x_0 = z, x_1, \dots, x_{H-1}$ are the sites of $\Sigma_{z,H}$ with $\ell_{x_i} = i$ we define
$$E_{z,m} = \{\alpha_{x_0} = \cdots = \alpha_{x_m} = 1,\ \alpha_{x_{m+1}} = \cdots = \alpha_{x_{H-1}} = 0\}, \qquad E_{z,-1} = \{N_z = 0\}.$$
For any integer $0 \le m \le H - 1$ we have the bound
$$\nu((N_z - N_{z'})^2) \ge \nu(E_{z,m} \cap E_{z',m-1}). \tag{6.14}$$
The right-hand side above should be maximal around $m = [n/L^2]$. Indeed, simple computations as in Lemma 3.1 show that there exists $\delta = \delta(q) > 0$ such that uniformly in the height H one has
$$\mu^\lambda(E_{z,[n/L^2]}) \ge \delta q^{-2(\lambda \wedge 0)}, \qquad \mu^\lambda(E_{z,[n/L^2]-1}) \ge \delta. \tag{6.15}$$
Therefore using Theorem 3.4 we have
$$\nu(E_{z,[n/L^2]} \cap E_{z',[n/L^2]-1}) \ge \mu^\lambda(E_{z,[n/L^2]})\, \mu^\lambda(E_{z,[n/L^2]-1}) - kL^{-2} \ge \delta^2 q^{-2(\lambda \wedge 0)} - kL^{-2} \ge k^{-1} \delta^2 \sigma^2(\lambda) - kL^{-2}, \tag{6.16}$$
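with the last inequality coming from Lemma 3.2. But we know (Remark 3.12) that $\sigma^2(\lambda) \ge k^{-1}(1 \wedge \frac{n}{L^2})$, thus (6.13), (6.14) and (6.16) imply that there exist finite constants $L_0$, $N_0$ and k only depending on q such that (6.12) holds whenever $L \ge L_0$ and $n \ge N_0$. It remains to treat the case $n < N_0$. It will suffice to show
$$\sigma_\nu^2 \ge \frac{n}{kL^2}. \tag{6.17}$$
We write
$$\nu((N_z - N_{z'})^2) \ge \nu(N_z = 1, N_{z'} = 0) \ge \nu(N_z = 1, N_{z'} = 0 \mid N_w \le 1,\ \forall w \in Q_L)\, \nu(N_w \le 1,\ \forall w \in Q_L) \ge \nu(N_z = 1, N_{z'} = 0 \mid N_w \le 1,\ \forall w \in Q_L)\, \nu\Big(\sum_{w \in Q_L} \alpha_w = n\Big).$$
But
$$\nu\Big(\sum_{w \in Q_L} \alpha_w = n\Big) = \frac{\mu^\lambda\big(\sum_{w \in Q_L} \alpha_w = n,\ N_{\Lambda \setminus Q_L} = 0\big)}{\mu^\lambda(N_\Lambda = n)} \ge \mu^\lambda\Big(\sum_{w \in Q_L} \alpha_w = n,\ N_{\Lambda \setminus Q_L} = 0\Big),$$
and the latter is bounded away from 0 uniformly as in the proof of Theorem 3.4 (see (3.24)). On the other hand
$$\nu(N_z = 1, N_{z'} = 0 \mid N_w \le 1,\ \forall w \in Q_L) = \frac{\binom{L^2 - 2}{n - 1}}{\binom{L^2}{n}} = \frac{n(L^2 - n)}{L^2(L^2 - 1)} \ge \frac{n}{2L^2},$$
as soon as $L^2 \ge 2N_0$. This yields the desired bound (6.17). □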
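The counting step at the end of the proof above is a binomial identity: among the $\binom{L^2}{n}$ ways to choose n singly occupied sticks, $\binom{L^2-2}{n-1}$ contain z and avoid z'. A short exact check can be sketched as follows, with K standing for $L^2$; the function names are illustrative and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def prob_one_zero(K, n):
    """Probability that stick z is occupied and z' is empty when n of K sticks hold one particle each."""
    return Fraction(comb(K - 2, n - 1), comb(K, n))

def closed_form(K, n):
    """The closed form n (K - n) / (K (K - 1))."""
    return Fraction(n * (K - n), K * (K - 1))

def lower_bound_holds(K, n):
    """Check the bound n / (2K), expected to hold when 2n <= K."""
    return prob_one_zero(K, n) >= Fraction(n, 2 * K)
```

The lower bound reduces to $2(K - n) \ge K - 1$, which is why the condition $L^2 \ge 2N_0$ suffices for all $n < N_0$.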
Remark 6.2. The above proposition allows one to produce low-lying excitations which are localized in a sub-cylinder $\Sigma_{R,H} \subset \Sigma_{L,H}$ with $R \le L$, much in the spirit of [3]. Indeed, one can always choose a function $\varphi$ supported on $[0, R/L]^2$ with $\int \varphi^2 = 1$ and $e(\varphi) = O(R^{-2} L^2)$, and the resulting states $f_\varphi$ have energy $O(R^{-2})$.
6.2. Proof of Theorem 2.4. For simplicity we prove the result for the generator $\mathcal L_{\Lambda,n}$ instead of $\mathcal G_{\Lambda,n}$, but the argument applies essentially without modifications to the original setting of Theorem 2.4. We follow quite closely the proof of an analogous result for translation invariant lattice gases (see Theorem 2.4 in [7]). The main idea is to establish the following inequality:
$$\nu(g, f)^2 \le k_\varepsilon \{\ell^\varepsilon \mathcal E_\nu(g, g) + \ell^{-2} \nu(g, g)\} \tag{6.18}$$
for any $\varepsilon$ and any $\ell$, with the constant $k_\varepsilon$ uniform in $\ell$, $\Lambda$. Once we have (6.18) we obtain Theorem 2.4 by choosing $g := E_s f$ and optimizing over $\ell$. Indeed, with this choice
we have $\nu(f E_s f) = \nu(f, g) = \nu(g, g)$ since f (and therefore $E_s f$) has zero mean. Moreover $\mathcal E_\nu(E_s f, E_s f) \le s\, \nu(f E_s f)$, so that (6.18) implies $\nu(f E_s f) \le k_\varepsilon \{s \ell^\varepsilon + \ell^{-2}\}$, and the claim follows. In order to prove (6.18) we need the following technical lemma. In what follows f is as in Theorem 2.4.
Lemma 6.3. There exists a constant k depending on f such that
$$\mathrm{Var}_\nu(\nu(f \mid N_{\Sigma_{\ell,H}})) \le \frac{k}{\ell^2} \qquad \forall \ell \le \frac L2. \tag{6.19}$$
Proof. Without loss of generality we can assume that $\ell$ is so large that the support of f is contained in $\Sigma_{\ell,H}$. For notational convenience we set $N_\ell := N_{\Sigma_{\ell,H}}$ and $\mu_\ell^\lambda := \mu^\lambda_{\Sigma_{\ell,H}}$. Using the result on the equivalence of ensembles, see Theorem 3.4, we can safely replace $\nu(f \mid N_\ell = m)$ with its grand-canonical average $F(m) := \mu_\ell^{\lambda(m,\ell)}(f)$, where $\lambda(m, \ell)$ is such that $\mu_\ell^{\lambda(m,\ell)}(N_\ell) = m$. Moreover, thanks to Proposition 3.8, we can bound the canonical variance w.r.t. $\nu$ by the grand canonical one w.r.t. $\mu := \mu^{\lambda(n)}$, with self-explanatory notation. In conclusion
$$\mathrm{Var}_\nu(\nu(f \mid N_\ell)) \le k \mu(F, F) + C \frac{1}{\ell^2} \tag{6.20}$$
for some $k = k(q)$ and $C = C(f, q)$. Since the measure $\mu$ is a product measure over the sites of $\Lambda$, it is immediate to check (see also (3.45))
$$\mu(F, F) \le \sum_{x \in \Sigma_{\ell,H}} \sigma_x^2\, \mu\big(2[F(N_\ell + 1) - F(N_\ell)]^2 + 2[F(N_\ell) - F(N_\ell - 1)]^2\big), \tag{6.21}$$
where $\sigma_x^2 := \mu(\alpha_x, \alpha_x)$. We now bound the gradient $[F(m+1) - F(m)]$. Let $\lambda_s := s\lambda(m+1, \ell) + (1 - s)\lambda(m, \ell)$ and let $F_s := \mu_\ell^{\lambda_s}(f)$. Then, setting $a(q) = \log 1/q$, we have
$$F(m+1) - F(m) = \int_0^1 ds\, \frac{d}{ds} F_s = a(q) \int_0^1 ds\, \mu_\ell^{\lambda_s}(N_\ell, f)\, [\lambda(m+1, \ell) - \lambda(m, \ell)]. \tag{6.22}$$
In turn
$$\lambda(m+1, \ell) - \lambda(m, \ell) = \int_m^{m+1} dt\, \frac{d}{dt} \lambda(t, \ell) = \int_m^{m+1} dt\, \big[a(q)\, \mu_\ell^{\lambda(t,\ell)}(N_\ell, N_\ell)\big]^{-1}. \tag{6.23}$$
It is easy to check at this point, thanks to the results of Sect. 3.1, that $|\lambda(m+1, \ell) - \lambda(m, \ell)| \le k(m \wedge \ell^2)^{-1}$ for some $k = k(q)$. Since $|\mu_\ell^{\lambda_s}(N_\ell, f)| \le C_f$ we get that the r.h.s. of (6.22) is bounded from above by $C_f k (m \wedge \ell^2)^{-1}$. In conclusion, the r.h.s. of (6.21) is bounded from above by
$$K_f\, \ell^2 \Big(\frac{n}{L^2} \wedge 1\Big)\, \mu\big((N_\ell \wedge \ell^2)^{-2}\big) \tag{6.24}$$
for some constant $K_f$ depending on f. Standard large deviations for the product measure $\mu$ imply that the r.h.s. of (6.24) is bounded from above by $K_f' \ell^{-2}$. □
We can now complete the proof of the theorem following step by step the proof of Theorem 2.4 in [7]. We first establish (6.18) for $\varepsilon = 2$ and then show how to improve it to all values of $\varepsilon > 0$. The main ingredients are the lower bound on the spectral gap given in Theorem 2.2 together with the formula
$$\nu(g, f) = \nu(\nu(g, f \mid \mathcal F)) + \nu(g, \nu(f \mid \mathcal F)),$$
valid for any $\sigma$-algebra $\mathcal F$. If we take $\mathcal F$ as the $\sigma$-algebra generated by $N_\ell$, we get, after one Schwarz inequality,
$$\nu(g, f)^2 \le 2\nu(\nu(g, f \mid N_\ell)^2) + 2\nu(g, \nu(f \mid N_\ell))^2 \le C_f \Big\{\ell^2 \mathcal E_\nu(g, g) + \frac{1}{\ell^2} \mathrm{Var}_\nu(g)\Big\}, \tag{6.25}$$
where we used Lemma 6.3 and the Poincaré inequality $\mathrm{Var}_\nu(g \mid N_\ell) \le k\ell^2 \mathcal E_\nu(g, g \mid N_\ell)$, which follows from Theorem 2.2. Now we assume inductively that we have been able to prove (6.25) with $\ell^2$ replaced by $\ell^\varepsilon$ and $C_f$ replaced by some constant $C_{f,\varepsilon}$ for all $\ell \le \frac L2$. Then the term $\nu(g, f \mid N_\ell)^2$ in the r.h.s. of the first line of (6.25) can be bounded from above by
$$\nu(g, f \mid N_\ell)^2 \le C_{f,\varepsilon} \Big\{\ell_1^\varepsilon \mathcal E_\nu(g, g \mid N_\ell) + \frac{1}{\ell_1^2} \mathrm{Var}_\nu(g \mid N_\ell)\Big\} \le C_{f,\varepsilon}' \Big\{\ell_1^\varepsilon + \frac{\ell^2}{\ell_1^2}\Big\} \mathcal E_\nu(g, g \mid N_\ell)$$
for any $\ell_1 \le \frac{\ell}{2}$. If we optimize over $\ell_1$ for a given $\ell$ we get
$$\nu(g, f \mid N_\ell)^2 \le C_{f,\varepsilon}''\, \ell^{\frac{2\varepsilon}{2+\varepsilon}}\, \mathcal E_\nu(g, g \mid N_\ell). \tag{6.26}$$
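The inductive step leading to (6.26) replaces an exponent $\varepsilon$ by $2\varepsilon/(2+\varepsilon)$. Iterating this map from the starting value 2 can be sketched as follows; this is an illustration with hypothetical function names, not part of the paper's argument.

```python
from fractions import Fraction

def improve(eps):
    """One induction step: the exponent eps is replaced by 2*eps/(2 + eps)."""
    return 2 * eps / (2 + eps)

def exponents(steps, start=Fraction(2)):
    """Exponents obtained after 0, 1, ..., steps applications of the induction step."""
    out = [start]
    for _ in range(steps):
        out.append(improve(out[-1]))
    return out
```

Starting from 2, the k-th iterate equals 2/(k+1), so the exponent decreases to 0 and any prescribed $\varepsilon > 0$ is reached after finitely many steps, at the price of a larger constant at each step.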
In other words we have been able to replace the assumed $\ell^\varepsilon$ factor in front of the Dirichlet form of g by $\ell^{\frac{2\varepsilon}{2+\varepsilon}}$. The price is an increase of the constant $C_{f,\varepsilon}$. Since the discrete map $x \to \frac{2x}{2+x}$, $x_0 = 2$, has as unique fixed point the origin, (6.18) follows. □
Acknowledgements. We warmly thank Bruno Nachtergaele and Pierluigi Contucci for enlightening discussions on their paper [3] and on the XXZ models in general.
References
1. Alcaraz, F.C., Salinas, S.R., Wreszinski, W.F.: Anisotropic ferromagnetic quantum domains. Phys. Rev. Lett. 75, 930–933 (1995)
2. Alcaraz, F.C.: Exact steady states of asymmetric diffusion and two-species annihilation with back reaction from the ground state of quantum spin models. Intern. J. Mod. Phys. B 8, 3449–3461 (1994)
3. Bolina, O., Contucci, P., Nachtergaele, B., Starr, S.: Finite volume excitations of the 111 Interface in the quantum XXZ model. Commun. Math. Phys. 212, 63–91 (2000)
4. Bolina, O., Contucci, P., Nachtergaele, B., Starr, S.: A continuum approximation for the excitations of the (1, 1, . . . , 1) interface in the quantum Heisenberg model. Electron. J. Diff. Eqns. 04, 1–10 (2000)
5. Bolina, O., Contucci, P., Nachtergaele, B.: Path Integral Representation for Interface States of the Anisotropic Heisenberg Model. Rev. in Math. Phys. 12, no. 10, 1325–1344 (2000)
6. Borgs, C., Chayes, J.T., Fröhlich, J.: Dobrushin states for classical spin systems with complex interactions. J. Statist. Phys. 89, no. 5–6, 895–928 (1997)
7. Cancrini, N., Martinelli, F.: On the spectral gap of Kawasaki dynamics under a mixing condition revisited. J. Math. Phys. 41, no. 3, 1391–1423 (2000)
8. Cancrini, N., Martinelli, F.: Finite volume comparison of canonical and multicanonical Gibbs measures under a mixing condition. Markov Processes and Related Fields 6, no. 1, 23–73 (2000)
9. Cancrini, N., Martinelli, F., Roberto, C.: On the logarithmic Sobolev constant of Kawasaki dynamics under a mixing condition revisited. To appear in Ann. Institut H. Poincaré
10. Datta, N., Messager, A., Nachtergaele, B.: Rigidity of the 111 interface in the Falicov–Kimball model. J. Stat. Phys. 99, 461–555 (2000)
11. Diaconis, P., Saloff-Coste, L.: Logarithmic Sobolev inequality for finite Markov chains. Ann. Appl. Prob. 6, no. 3, 695–750 (1996)
12. Gottstein, C.-T., Werner, R.F.: Ground states of the infinite q-deformed Heisenberg ferromagnet. Preprint archived as cond-mat/9501123
13. Kenyon, R.: Local statistics of lattice dimers. Ann. Inst. H. Poincaré Probab. Stat. 33, 591–618 (1997)
14. Koma, T., Nachtergaele, B.: The complete set of ground states of the ferromagnetic XXZ chains. Adv. Theor. Math. Phys. 2, no. 3, 533–558 (1998)
15. Koma, T., Nachtergaele, B.: The spectral gap of the ferromagnetic XXZ chain. Lett. Math. Phys. 40, no. 1, 1–16 (1997)
16. Koma, T., Nachtergaele, B.: Low-lying spectrum of quantum interfaces. Abstracts of the AMS 17, 146 (1996)
17. Liggett, T.M.: Interacting particle systems. Berlin–Heidelberg–New York: Springer-Verlag, 1985
18. Lu, S.T., Yau, H.-T.: Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Commun. Math. Phys. 156, 399–433 (1993)
19. Martinelli, F.: Lectures on Glauber dynamics for discrete spin models. In: Lectures on probability theory and statistics (Saint-Flour, 1997), Lecture Notes in Math. 1717, Berlin: Springer, 1999, pp. 93–191
20. Matsui, T.: On the spectra of the kink for ferromagnetic XXZ models. Lett. Math. Phys. 42, no. 3, 229–239 (1997)
21. Miclo, L.: An example of application of discrete Hardy’s inequalities. Markov Process. Related Fields 5, 319–330 (1999)
22. Nachtergaele, B.: Interfaces and droplets in quantum lattice models. Preprint, archived as mp_arc/00-369, 2000
23. Nachtergaele, B., Starr, S.: Droplet States in the XXZ Heisenberg Chain. To appear in Commun. Math. Phys.
Communicated by H. Spohn
Commun. Math. Phys. 226, 377 – 391 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Automorphism Inducing Diffeomorphisms, Invariant Characterization of Homogeneous 3-Spaces and Hamiltonian Dynamics of Bianchi Cosmologies T. Christodoulakis, E. Korfiatis, G. O. Papadopoulos University of Athens, Physics Department, Nuclear & Particle Physics Section, Panepistimioupolis, Ilisia GR 157–71, Athens, Hellas. E-mail: [email protected]; [email protected] Received: 29 October 2001 / Accepted: 5 November 2001
Abstract: An invariant description of Bianchi Homogeneous (B.H.) 3-spaces is presented, by considering the action of the Automorphism Group on the configuration space of the real, symmetric, positive definite, 3 × 3 matrices. Thus, the gauge degrees of freedom are removed and the remaining (gauge invariant) degrees are the – up to 3 – curvature invariants. An apparent discrepancy between this Kinematics and the Quantum Hamiltonian Dynamics of the lower Class A Bianchi Types occurs due to the existence of the Outer Automorphism Subgroup. This discrepancy is satisfactorily removed by exploiting the quantum version of some classical integrals of motion (conditional symmetries) which are recognized as corresponding to the Outer Automorphisms.
1. Introduction
In a preceding work [1] we have shown how the presence of the linear constraints entails a reduction of the degrees of freedom for the quantum theory of Class A spatially homogeneous geometries: the initial six-dimensional configuration space spanned by the $\gamma_{\alpha\beta}$’s (the components of the spatial metric with respect to the invariant basis one-forms) is reduced to a space parameterized by the independent solutions to the linear quantum constraints (Kuchař’s physical variables [2]). For Bianchi Types $VI_0$, $VII_0$, VIII, IX these solutions are the three combinations:
$$x^1 = C^\alpha_{\mu\kappa} C^\beta_{\nu\lambda} \gamma_{\alpha\beta} \gamma^{\mu\nu} \gamma^{\kappa\lambda}, \qquad x^2 = C^\alpha_{\beta\kappa} C^\beta_{\alpha\lambda} \gamma^{\kappa\lambda}, \qquad x^3 = \gamma$$
(or any other three, independent, functions thereof) and the Wheeler–DeWitt equation becomes a P.D.E. in terms of these $x^i$’s. The Bianchi Type I, where all structure constants are zero (and thus the linear constraints vanish identically), has been exhaustively treated [3]. The Type II case, where only two linear constraints are independent, has been examined along the above lines in [4] and differently in [5]. The fact that the quantum theory (within each one of the above mentioned Bianchi Types) forces us to consider as equivalent any two points $\gamma^{(1)}_{\alpha\beta}$, $\gamma^{(2)}_{\alpha\beta}$ in the configuration
space if they form the same triplet $(x^i)$, seems quite intriguing. It is the purpose of the present work to investigate in detail the reasons for this grouping of the $\gamma_{\alpha\beta}$’s. The paper is organized as follows: Section 2 begins with a careful examination of the action of the general coordinate transformations group on $\gamma_{\alpha\beta}$. The demand that the diffeomorphisms must preserve the manifest homogeneity of the 3-spaces singles out a particular set of those transformations which has a well defined, non-trivial action on $\gamma_{\alpha\beta}$; this action is then proven to be nothing but the action of the automorphism group corresponding to each arbitrary but given Bianchi Type. The differential description of these automorphic motions is achieved by identifying the vector fields on the configuration space which, through their integral curves, induce these motions. The importance of Automorphisms in the theory of Bianchi Type Cosmologies has been stressed in [6]. Concluding this section, we prove the following: if (within a particular albeit arbitrary Bianchi Type) two points $\gamma^{(1)}_{\alpha\beta}$, $\gamma^{(2)}_{\alpha\beta}$ lying on the configuration space correspond to the same scalar combinations of $C^\alpha_{\mu\nu}$ and $\gamma_{\alpha\beta}$, then $\gamma^{(2)}_{\alpha\beta} = \Lambda^\mu_\alpha \Lambda^\nu_\beta \gamma^{(1)}_{\mu\nu}$, where $\Lambda$ is an element of the corresponding Automorphism group. In Sect. 3 we briefly recapitulate the essential features of the quantum theory developed in [1], and we compare the purely kinematical results of the previous section with the ensuing Quantum Hamiltonian dynamics. For the lower Class A Bianchi Types, this comparison reveals an apparent mismatch between the dynamics and the kinematics. The gap is bridged through the notion of conditional symmetries [7], i.e. some linear in momenta integrals of motion; their quantum counterparts constrain $\Psi$ to be a function of the geometry only. Finally, some concluding remarks are included in the discussion.
2. Automorphism Inducing Diffeomorphisms
In this section we shall first relate the action of the Automorphism group on $\gamma_{\alpha\beta}$ to the action induced on it by the class of General Coordinate Transformations (G.C.T.’s) which are subject to the restriction of preserving manifest spatial homogeneity. To this end, consider the spatial line element:
$$ds^2 = \gamma_{\alpha\beta}\, \sigma^\alpha_i(x)\, \sigma^\beta_j(x)\, dx^i dx^j, \tag{2.1}$$
where $\sigma^\alpha_i(x)\, dx^i$ are the invariant basis 1-forms of some given Bianchi Type. The spatial homogeneity of this line element is, of course, preserved under any G.C.T. of the form:
$$x^i \longrightarrow \tilde x^i = f^i(x). \tag{2.2}$$
Under such a transformation, $ds^2$ simply becomes:
$$(ds^2 \equiv)\ d\tilde s^2 = \gamma_{\alpha\beta}\, \tilde\sigma^\alpha_m(\tilde x)\, \tilde\sigma^\beta_n(\tilde x)\, d\tilde x^m d\tilde x^n, \tag{2.3}$$
where the basis one-forms are supposed to transform in the usual way:
$$\tilde\sigma^\alpha_m(\tilde x) = \sigma^\alpha_i(x)\, \frac{\partial x^i}{\partial \tilde x^m}. \tag{2.4}$$
If one were to stop at this point, then one might have concluded that all spatial diffeomorphisms act trivially on $\gamma_{\alpha\beta}$, i.e. $\gamma_{\alpha\beta} \longrightarrow \tilde\gamma_{\alpha\beta} = \gamma_{\alpha\beta}$. But as we shall immediately
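see, there are special G.C.T.’s which induce a well-defined, non-trivial action on $\gamma_{\alpha\beta}$. To uncover them, let us ask what is the change in form induced by transformation (2.2) on the line element (2.1). To find this change we have to express the line element (2.3) in terms of the old basis one-forms (at the new point) $\sigma^\alpha_i(\tilde x)$. There is always a non-singular matrix $\Lambda^\alpha_\beta(\tilde x)$ connecting $\tilde\sigma$ and $\sigma$, i.e.:
$$\tilde\sigma^\alpha_m(\tilde x) = \Lambda^\alpha_\mu(\tilde x)\, \sigma^\mu_m(\tilde x). \tag{2.5}$$
Using this matrix we can write the line element (2.3) in the form:
$$d\tilde s^2 = \gamma_{\alpha\beta}\, \Lambda^\alpha_\mu(\tilde x)\, \Lambda^\beta_\nu(\tilde x)\, \sigma^\mu_m(\tilde x)\, \sigma^\nu_n(\tilde x)\, d\tilde x^m d\tilde x^n. \tag{2.6}$$
If the functions $f^i$, defining the transformation, are such that the matrix $\Lambda^\alpha_\mu$ does not depend on the spatial point, then there is a well defined, non-trivial action of these transformations on $\gamma_{\alpha\beta}$:
$$\gamma_{\alpha\beta} \longrightarrow \tilde\gamma_{\mu\nu} = \Lambda^\alpha_\mu \Lambda^\beta_\nu \gamma_{\alpha\beta}. \tag{2.7}$$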
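In matrix notation the action (2.7) reads $\tilde\gamma = \Lambda^T \gamma \Lambda$. The following sketch is an illustration only: the sample matrices are hypothetical, and the index convention `lam[alpha][mu]` for $\Lambda^\alpha_\mu$ is an assumption. It applies the action and checks positive definiteness of the result through the leading principal minors (Sylvester's criterion). For Bianchi Type I, where all structure constants vanish, any invertible matrix is a Lie algebra automorphism, so the sample matrix is admissible there.

```python
def transform_metric(gamma, lam):
    """Return gamma~_{mu nu} = Lambda^alpha_mu Lambda^beta_nu gamma_{alpha beta}."""
    return [
        [
            sum(lam[a][m] * lam[b][n] * gamma[a][b] for a in range(3) for b in range(3))
            for n in range(3)
        ]
        for m in range(3)
    ]

def leading_minors(g):
    """Leading principal minors of a 3x3 matrix; all positive iff g is positive definite (g symmetric)."""
    d1 = g[0][0]
    d2 = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    d3 = (
        g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
        - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
        + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0])
    )
    return d1, d2, d3

# Hypothetical sample data: a diagonal metric and an invertible mixing matrix.
gamma_sample = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
lam_sample = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
gamma_new = transform_metric(gamma_sample, lam_sample)
```

The transformed matrix stays symmetric and positive definite, and its determinant is the original one multiplied by $(\det\Lambda)^2$, so the action maps the configuration space of positive definite matrices to itself.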
With the use of (2.4) and (2.5), the requirement that $\Lambda^\alpha_\mu$ does not depend on the spatial point $x^i$ places the following differential restrictions on the $f^i$’s:
$$\frac{\partial f^i(x)}{\partial x^j} = \sigma^i_\alpha(f)\, S^\alpha_\beta\, \sigma^\beta_j(x), \tag{2.8}$$
where $\sigma^i_\alpha$ and $S^\alpha_\beta$ are the matrices inverse to $\sigma^\alpha_i(x)$ and $\Lambda^\alpha_\beta$, respectively. These conditions constitute a set of first order, highly non-linear P.D.E.’s in the unknown functions $f^i$. The existence of solutions to these equations is guaranteed by the Frobenius theorem [8], as long as the necessary and sufficient conditions $\partial^2_{k,l} f^i - \partial^2_{l,k} f^i = 0$ hold. Through the use of (2.8) and the defining property of the invariant basis 1-forms (3.2), we can transform these conditions into the form:
$$2\sigma^i_\alpha(f)\, \sigma^\kappa_k(x)\, \sigma^\delta_l(x) \big(C^\rho_{\kappa\delta} S^\alpha_\rho - C^\alpha_{\mu\nu} S^\mu_\kappa S^\nu_\delta\big) = 0, \tag{2.9}$$
which are satisfied if and only if $S^\alpha_\mu$ (and thus also $\Lambda^\alpha_\mu$) is a Lie Algebra Automorphism (see (2.15) below). It is, therefore, appropriate to call the General Coordinate Transformations (2.2), when the $f^i$’s satisfy (2.8), Automorphism Inducing Diffeomorphisms (A.I.D.’s). The existence of such spatial coordinate transformations is not entirely unexpected: in the particular case $\Lambda^\alpha_\beta(\tilde x) = \delta^\alpha_\beta$ these coordinate transformations are nothing but the finite motions induced on the hypersurface by the three Killing vector fields (existing by virtue of homogeneity of the space), which leave the basis one-forms form invariant. The new thing we learn is that there are further motions leaving the basis one-forms quasi-invariant, i.e. invariant modulo a global (space independent) linear mixing, with the mixing matrix $\Lambda^\alpha_\mu$ belonging to the Automorphism Group. The notion of such transformations “leaving the invariant triads unchanged modulo a global rotation” also appears in Ashtekar’s work [3], under the terminology “Homogeneity Preserving Diffeomorphisms”; also the term global is there used in the topological sense. In order to gain a deeper understanding of the implications of the above analysis as well as the consequences of the kinematics on the dynamics, we have to carefully consider the configuration space and the differential description of the changes (2.7) induced on it by the A.I.D.’s.
Let us begin with some propositions about the space of 3×3 real, symmetric (positive definite) matrices:
T. Christodoulakis, E. Korfiatis, G. O. Papadopoulos
Proposition 1. The set of all $3 \times 3$ real, symmetric matrices forms a vector subspace of $GL(3, \mathbb{R})$, and is thus endowed with the structure of a six-dimensional manifold.

Proposition 2. The set " of all $3 \times 3$ real, symmetric, positive definite matrices is an open subset of the space of Proposition 1.

Proof. Let $\gamma_{\alpha\beta}$ be a positive definite $3 \times 3$ real, symmetric matrix and $p(s) = s^3 - As^2 + Bs - C$ its characteristic polynomial, with $A$, $B$, $C$ continuous polynomial functions of the $\gamma_{\alpha\beta}$'s. Since $\gamma_{\alpha\beta}$ is symmetric, the necessary and sufficient condition that $\gamma_{\alpha\beta}$ be positive definite is $A > 0$, $B > 0$, $C > 0$. Therefore ", as an inverse image of an open subset, is itself open.

Proposition 3. The set " is an arcwise connected subset of the space of Proposition 1.

Proof. Let $\gamma_{\alpha\beta} \in$ ". Then there is $P \in SO(3)$ such that (in matrix notation) $P \gamma P^T = D = \mathrm{diag}(a, b, c)$, with $a, b, c$ the three positive eigenvalues of $\gamma$. Since $P$ belongs to $SO(3)$, there is a continuous mapping $\omega : [0, 1] \to SO(3)$ such that $\omega(0) = P$ and $\omega(1) = I_3$. Introduce now the mapping $f : [0, 1] \to$ ", with $f(\sigma) = \omega(\sigma)\gamma\,\omega(\sigma)^T$. As $\omega(\sigma)$ belongs to $SO(3)$, its determinant is not zero for every $\sigma \in [0, 1]$. Therefore, by Sylvester's theorem, $f(\sigma)$ is positive definite, just like $\gamma$. But $f(0) = D$ and $f(1) = \gamma$, i.e. the matrix $\gamma$ is connected to $D$ by a continuous curve lying entirely on ". Consider now the mapping $\varphi : [0, 1] \to$ " with:

$$\varphi(\sigma) = \mathrm{diag}\big((a - 1)\sigma + 1,\ (b - 1)\sigma + 1,\ (c - 1)\sigma + 1\big).$$

$\varphi$ is continuous and $\varphi(\sigma) \in$ ", $\forall\, \sigma \in [0, 1]$. Since $\varphi(0) = I_3$ and $\varphi(1) = D$, this means that $\gamma$ is finally arcwise connected to $I_3$.

Let us now proceed with the differential description of motions (2.7). To this end, consider the following linear vector fields defined on ":

$$X_{(i)} = \lambda^\alpha_{(i)\rho}\, \gamma_{\alpha\beta}\, \partial^{\beta\rho}, \tag{2.10}$$

with an obvious notation for the derivative with respect to $\gamma_{\alpha\beta}$. The matrices $\lambda^\beta_{(i)\alpha} \equiv (C^\beta_{(\rho)\alpha}, \varepsilon^\beta_{(i)\alpha})$ are the generators of (the connected to the identity component of) the Automorphism group (see (2.16)), and $(i)$ labels the different generators. Depending on the particular Bianchi Type, the vector fields $X_{(i)}$ (in ") may also include, in addition to the quantum linear constraints (generators of Inner Automorphic Motions) $H_\rho = C^\alpha_{\rho\beta}\, \gamma_{\alpha\kappa}\, \dfrac{\partial}{\partial\gamma_{\beta\kappa}}$, the generators of the outer-automorphic motions $E_{(j)} \equiv \varepsilon^\sigma_{(j)\rho}\, \gamma_{\sigma\tau}\, \dfrac{\partial}{\partial\gamma_{\rho\tau}}$.
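The infinitesimal action of the generic vector field $\varepsilon^{(i)} X_{(i)}$ of (2.10) on $\gamma_{\alpha\beta}$ is given by:

$$\bar{\delta}\gamma_{\alpha\beta} \equiv \frac{1}{2}\, \varepsilon^{(i)} \left(\lambda^\mu_{(i)\alpha}\, \gamma_{\mu\beta} + \lambda^\mu_{(i)\beta}\, \gamma_{\mu\alpha}\right), \tag{2.11}$$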
Bianchi Homogenous 3-Spaces
where $\varepsilon^{(i)}$ are arbitrary infinitesimal parameters. If we now define the matrices:

$$M^\mu_\alpha = \varepsilon^{(i)} \lambda^\mu_{(i)\alpha}, \tag{2.12}$$
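we can prove that these are generators of automorphisms. To see it, let us briefly recall the notion of a Lie Algebra Automorphism: if $A$ denotes the space of third rank (1,2) tensors under $GL(3, \mathbb{R})$, antisymmetric in the two covariant indices, then the structure constants transform (as can be inferred from (3.2)) according to:

$$C^\alpha_{\mu\nu} \to \tilde{C}^\alpha_{\mu\nu} = S^\alpha_\beta\, \Lambda^\kappa_\mu\, \Lambda^\lambda_\nu\, C^\beta_{\kappa\lambda}, \tag{2.13}$$

with $\Lambda^\alpha_\mu$ and $S^\alpha_\mu = (\Lambda^{-1})^\alpha_\mu \in GL(3, \mathbb{R})$. A transformation is called a Lie Algebra Automorphism if and only if it leaves the structure constants unchanged, i.e. if:

$$C^\alpha_{\mu\nu} = S^\alpha_\beta\, \Lambda^\kappa_\mu\, \Lambda^\lambda_\nu\, C^\beta_{\kappa\lambda}, \tag{2.14}$$

or equivalently:

$$C^\rho_{\mu\nu}\, \Lambda^\alpha_\rho = \Lambda^\kappa_\mu\, \Lambda^\lambda_\nu\, C^\alpha_{\kappa\lambda}. \tag{2.15}$$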
To find the defining relation for the generators $\lambda^\alpha_\mu$ of the automorphisms $\Lambda^\alpha_\mu$, consider a path through the identity $\Lambda^\rho_\theta(\tau)$, with $\Lambda^\rho_\theta(0) = \delta^\rho_\theta$ (we are concerned only with the connected to the identity component of the automorphism group). Differentiating both sides of (2.15) with respect to the parameter $\tau$ and setting $\tau = 0$, we get the relation:

$$\lambda^\alpha_\beta\, C^\beta_{\mu\nu} = \lambda^\rho_\mu\, C^\alpha_{\rho\nu} + \lambda^\rho_\nu\, C^\alpha_{\mu\rho}, \tag{2.16}$$

where we have identified the $\lambda^\alpha_\mu$'s with the vectors tangent to the path at the identity. By virtue of the Jacobi Identities, one can see that a solution to the system (2.16) is $\lambda^\alpha_{(\kappa)\beta} = C^\alpha_{(\kappa)\beta}$, and thus the structure constants matrices are the generators of the Inner Automorphisms' proper invariant subgroup of $Aut(G)$. For Bianchi Types VIII, IX these are the only generators of automorphisms. For all other Bianchi Types, there exist extra matrices satisfying (2.16), say $\varepsilon^\alpha_{(i)\beta}$, generating the Outer Automorphisms subgroup of $Aut(G)$. We are now ready to find the finite motions induced on " by the generic vector field $X \equiv \varepsilon^{(i)} X_{(i)}$.
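The statement that the structure constants matrices solve (2.16) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it builds $C^\alpha_{\beta\gamma}$ from the decomposition (2.18) for three illustrative representatives (the chosen normalisations of $m^{\alpha\beta}$ and $\nu_\alpha$, and all function names, are assumptions) and tests the relation (2.16) entry by entry, with `lam[a][b]` standing for $\lambda^a_b$.

```python
from itertools import product

R3 = range(3)

def levi_civita(i, j, k):
    """Permutation symbol on indices 0..2, with levi_civita(0, 1, 2) == 1."""
    return (i - j) * (j - k) * (k - i) // 2

def structure_constants(m, nu):
    """Build C[a][b][c] = C^a_{bc} from the decomposition (2.18)."""
    return [[[sum(m[a][d] * levi_civita(d, b, c) for d in R3)
              + nu[b] * (a == c) - nu[c] * (a == b)
              for c in R3] for b in R3] for a in R3]

def satisfies_2_16(lam, C, tol=1e-12):
    """True if lam[a][b] = lambda^a_b obeys (2.16) for all free indices."""
    for a, p, q in product(R3, repeat=3):
        lhs = sum(lam[a][b] * C[b][p][q] for b in R3)
        rhs = sum(lam[r][p] * C[a][r][q] + lam[r][q] * C[a][p][r] for r in R3)
        if abs(lhs - rhs) > tol:
            return False
    return True

def inner_generator(C, kappa):
    """Matrix lambda_(kappa) with entries lambda^a_b = C^a_{kappa b}."""
    return [[C[a][kappa][b] for b in R3] for a in R3]

# Illustrative representatives (m, nu); the normalisations are assumptions.
BIANCHI = {
    "II": ([[1, 0, 0], [0, 0, 0], [0, 0, 0]], [0, 0, 0]),
    "V": ([[0, 0, 0], [0, 0, 0], [0, 0, 0]], [0, 0, 1]),
    "IX": ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0, 0, 0]),
}

def inner_generators_ok(name):
    """Check (2.16) for the three matrices C_(kappa) of the named type."""
    m, nu = BIANCHI[name]
    C = structure_constants(m, nu)
    return all(satisfies_2_16(inner_generator(C, k), C) for k in R3)

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

for name in BIANCHI:
    print(name, inner_generators_ok(name))
```

A generic matrix fails the test: for the Type IX representative, `satisfies_2_16(IDENTITY, ...)` returns `False`, since a uniform rescaling does not preserve the structure constants.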
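Proposition 4. Let $\gamma^{(0)}_{\alpha\beta}$ be a fixed point in ". Then the curve $\gamma : \mathbb{R} \to$ " with:

$$\gamma_{\alpha\beta}(\tau) = (\exp(\tau M))^\mu_\alpha\, (\exp(\tau M))^\nu_\beta\, \gamma^{(0)}_{\mu\nu}$$

is an integral curve (passing through $\gamma^{(0)}_{\alpha\beta}$) of the vector field $X \equiv \varepsilon^{(i)} X_{(i)}$.

Proof. We give a rigorous proof of the statement that the matrices $(\exp(\tau M))^\mu_\alpha$ are automorphisms. To this end, define the mapping $\phi_\tau : A \to A$, with $\phi_\tau(C) = \tilde{C}$, where $\tilde{C}^\alpha_{\mu\nu} = S^\alpha_\beta\, \Lambda^\kappa_\mu\, \Lambda^\lambda_\nu\, C^\beta_{\kappa\lambda}$. Define also the matrices $\Lambda(\tau) = \exp(\tau M)$, $S(\tau) = \exp(-\tau M)$, with $M$ given by (2.12). It is straightforward to verify that $\phi_\tau \circ \phi_\sigma = \phi_{\tau+\sigma}$. Using the Jacobi Identities and the definitions above, it is not difficult to see that:

$$\left.\frac{d\phi_\tau(C)}{d\tau}\right|_{\tau=0} = 0.$$

Consider now two sets $C, \tilde{C} \in A$, such that $\phi_\psi(C) = \tilde{C}$ for some $\psi$. Since the derivative of $\phi_\theta$ at 0 is zero, we have that:

$$\left.\frac{d\phi_\theta(\tilde{C})}{d\theta}\right|_{\theta=0} = \lim_{\theta\to 0} \frac{\phi_\theta(\tilde{C}) - \phi_0(\tilde{C})}{\theta} = 0,$$

which in turn implies that:

$$\lim_{\theta\to 0} \frac{\phi_\theta \circ \phi_\psi(C) - \phi_\psi(C)}{\theta} = 0 \;\Rightarrow\; \lim_{\theta\to 0} \frac{\phi_{\theta+\psi}(C) - \phi_\psi(C)}{\theta} = 0.$$

The last expression says that:

$$\frac{d\phi_\psi(C)}{d\psi} = 0, \quad \forall\, \psi,$$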
i.e. the mapping $\phi_\psi(C)$ is constant $\forall\, \psi$. Thus it holds, in particular, that $\phi_\psi(C) = \phi_0(C)$, or $\tilde{C}^\alpha_{\mu\nu} = C^\alpha_{\mu\nu}$.

We have thus proven that the finite motions induced on " by the $X_{(i)}$'s (through their integral curves) are linear transformations of $\gamma_{\alpha\beta}$ of the form (2.7) with $\Lambda \in Aut(G)$. In particular, it is deduced that the linear constraint vector fields generate inner automorphic motions (see [6, 3]).

We now turn our attention to the invariant description of Bianchi Homogeneous (B.H.) 3-Geometries. It is known that a geometry is invariantly characterized by all its metric invariants. In 3 dimensions all metric invariants are higher derivative curvature invariants [9], and homogeneity reduces any higher derivative curvature invariant to a scalar combination of $C^\lambda_{\alpha\beta}$, $\gamma_{\mu\nu}$, with the appropriate number of $C$'s. So, it is natural to expect that these scalar combinations will invariantly describe a B.H. 3-geometry. Indeed, it is straightforward to check that any given scalar combination of $C^\lambda_{\alpha\beta}$, $\gamma_{\mu\nu}$ is annihilated by all the $X_{(i)}$'s defined in (2.10). This, in turn, implies that any such scalar combination is constant (as a function of $\gamma_{\mu\nu}$) along the integral curves of the $X_{(i)}$'s. This fact, on account of Proposition 4, points to the following

Statement. Any two hexads $\gamma^{(1)}_{\alpha\beta}$, $\gamma^{(2)}_{\alpha\beta}$, for which all scalar combinations of $C^\alpha_{\mu\nu}$, $\gamma_{\alpha\beta}$ coincide, are automorphically related, i.e. (2.7) holds with $\Lambda \in Aut(G)$.

In order to proceed with the proof, and for later use as well, it is necessary to define the following scalar combinations of $C^\alpha_{\mu\nu}$, $\gamma_{\alpha\beta}$, which constitute a base in the space of all scalar contractions:

$$q^1(C^\alpha_{\mu\nu}, \gamma_{\alpha\beta}) = \frac{m^{\alpha\beta}\gamma_{\alpha\beta}}{\sqrt{\gamma}}, \tag{2.17a}$$

$$q^2(C^\alpha_{\mu\nu}, \gamma_{\alpha\beta}) = \frac{(m^{\alpha\beta}\gamma_{\alpha\beta})^2}{2\gamma} - \frac{1}{4}\, C^\alpha_{\mu\kappa}\, C^\beta_{\nu\lambda}\, \gamma_{\alpha\beta}\, \gamma^{\mu\nu}\gamma^{\kappa\lambda}, \tag{2.17b}$$

$$q^3(C^\alpha_{\mu\nu}, \gamma_{\alpha\beta}) = \frac{m}{\sqrt{\gamma}}, \tag{2.17c}$$

where $m^{\alpha\beta}$ is the symmetric second rank contravariant tensor density (under the action of $GL(3, \mathbb{R})$) in which the structure constants are uniquely decomposed, and $m$ its determinant, i.e.:

$$C^\alpha_{\beta\gamma} = m^{\alpha\delta}\varepsilon_{\delta\beta\gamma} + \nu_\beta\, \delta^\alpha_\gamma - \nu_\gamma\, \delta^\alpha_\beta \tag{2.18}$$
with $\nu_\alpha = \frac{1}{2} C^\rho_{\alpha\rho}$. At this point the following, easily provable, elements must be underlined:

E1 Concerning the number of scalar combinations: the number of independent $\gamma_{\alpha\beta}$'s in a $d$-dimensional space is $N_1 = d^2 - \binom{d}{2} = d(d+1)/2$, due to symmetry. Initially, the number of independent structure constants is $N_2 = d\binom{d}{2} = d^2(d-1)/2$, due to the antisymmetry in the lower indices. Taking into account the number of independent Jacobi identities, which is $\binom{d}{2}(d-2) = (d-2)(d-1)d/2$, one is left with $N_3 = N_2 - (d-2)(d-1)d/2 = (d-1)d$ independent structure constants. But there is also the freedom of arbitrarily choosing $N_4 = d^2$ parameters by linear mixing, i.e. the action of $GL(3, \mathbb{R})$. Thus, the number of independent scalars which one may construct from the $\gamma_{\alpha\beta}$'s and the $C^\alpha_{\mu\nu}$'s is $N_s \equiv N_1 + N_3 - N_4 = (d-1)d/2$. For $d = 3$, $N_s = 3$; note that 3 is the maximum number, which is achieved only for Bianchi Types VIII, IX. In all others, $m = 0$ and, as can be seen either by direct calculation or from the appendix of [1], the independent scalars are fewer than 3; namely, they are two for Types VI, VII, IV, one for Types II, V, and 0 for Type I. In each case, the number of the independent $q^i$'s equals the number of curvature invariants.

E2 The $q^i$'s constitute a complete set of solutions to the system of equations $X_{(i)}\Psi = 0$, i.e. $\Psi = \Psi(q^i)$ is the most general solution to these equations. Since the linear constraint vector fields $H_\alpha$ are in general a subset of the $X_{(i)}$'s, it can be inferred that the $q^i$'s are solutions to the quantum linear constraints. Except for Types VIII, IX, where there are no extra generators, the independent solutions to the quantum linear constraints include $\gamma = |\gamma_{\alpha\beta}|$ as well as other non-scalar combinations [4, 10]. This signals an apparent discrepancy between the kinematics of B.H. 3-spaces previously described and the quantum dynamics of the (lower) Class A Bianchi Cosmologies.
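The counting argument of E1 is simple enough to tabulate. The sketch below only restates the formulas for $N_1$ to $N_4$ and $N_s$ as given in the text (the function name `scalar_count` is an illustrative choice) and evaluates them for a few dimensions.

```python
from math import comb

def scalar_count(d):
    """Counting of E1 for a d-dimensional homogeneous space."""
    n1 = d * d - comb(d, 2)          # independent components of gamma
    n2 = d * comb(d, 2)              # structure constants, antisymmetric pair
    jacobi = comb(d, 2) * (d - 2)    # independent Jacobi identities, as in E1
    n3 = n2 - jacobi                 # structure constants left after Jacobi
    n4 = d * d                       # parameters of the linear mixing
    return {"N1": n1, "N2": n2, "N3": n3, "N4": n4, "Ns": n1 + n3 - n4}

for d in (2, 3, 4):
    print(d, scalar_count(d))
```

For $d = 3$ this gives $N_1 = 6$, $N_2 = 9$, $N_3 = 6$, $N_4 = 9$ and $N_s = 3$, in agreement with the closed form $(d-1)d/2$.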
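Now, to resume the line of thought for the proof of the statement, let us define the action of $GL(3, \mathbb{R})$ on " and $A$. If $\Lambda^\alpha_\mu$, $S^\alpha_\mu = (\Lambda^{-1})^\alpha_\mu \in GL(3, \mathbb{R})$, then:

$$\tilde{\gamma} = \phi_\Lambda(\gamma) \;\overset{\mathrm{def}}{\longleftrightarrow}\; \tilde{\gamma}_{\alpha\beta} = \Lambda^\mu_\alpha\, \Lambda^\nu_\beta\, \gamma_{\mu\nu}, \tag{2.19a}$$

$$\tilde{C} = \phi_\Lambda(C) \;\overset{\mathrm{def}}{\longleftrightarrow}\; \tilde{C}^\alpha_{\mu\nu} = S^\alpha_\beta\, \Lambda^\kappa_\mu\, \Lambda^\lambda_\nu\, C^\beta_{\kappa\lambda}. \tag{2.19b}$$

As can be easily inferred from (2.18), $\tilde{C} = \phi_\Lambda(C) \Rightarrow \tilde{m}^{\alpha\beta} = |S|^{-1} S^\alpha_\kappa\, S^\beta_\lambda\, m^{\kappa\lambda}$ and $\tilde{\nu}_\alpha = \Lambda^\beta_\alpha\, \nu_\beta$. It also holds that $\phi_{\Lambda_1} \circ \phi_{\Lambda_2} = \phi_{\Lambda_2 \Lambda_1}$, and obviously the $q^i$'s in (2.17) satisfy the relation:

$$q^i(\gamma, C) = q^i(\phi_\Lambda(\gamma), \phi_\Lambda(C)). \tag{2.20}$$

This has the important implication that, when $\Lambda^\alpha_\mu \in Aut(G)$, the form invariance of the $q^i$'s is guaranteed by their explicit definition as scalar combinations of the $\gamma_{\alpha\beta}$'s and $C^\alpha_{\mu\nu}$'s. The following proposition holds:

Proposition 5. Let $\gamma^{(1)}_{\alpha\beta}, \gamma^{(2)}_{\alpha\beta} \in$ ", and $C \in A$ be the structure constants of a given Bianchi Type. If $q^i(\gamma^{(1)}, C) = q^i(\gamma^{(2)}, C)$ $(i = 1, 2, 3)$, then there is $\Lambda^\alpha_\mu$ such that $\gamma^{(2)} = \phi_\Lambda(\gamma^{(1)})$ and $\Lambda^\alpha_\mu \in Aut(G)$, i.e. $C = \phi_\Lambda(C)$.

To prove this we need the following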
Lemma. If $q^i(I_3, C_{(1)}) = q^i(I_3, C_{(2)})$ $(i = 1, 2, 3)$, where $C_{(1)}, C_{(2)} \in A$ are two sets of structure constants corresponding to the same Bianchi Type and $I_3$ is the $3 \times 3$ identity matrix, then there exists a matrix $R \in SO(3)$ such that $\phi_R(C_{(1)}) = C_{(2)}$.

Proof of the lemma. We first note that the number of independent relations in the lemma's hypothesis equals the number of independent $q^i$'s and is therefore at most 3. Secondly, we observe that in Class A Bianchi Types the structure constants are characterized by the matrix $m^{\alpha\beta}$ only, and thus the relevant numbers involved are the (at most 3) real, non-zero eigenvalues of $m^{\alpha\beta}$. In Bianchi Types VIII and IX, the non-vanishing eigenvalues are exactly 3. In conclusion, in each and every Class A Bianchi Type, the number of independent relations in the lemma's hypothesis exactly equals the number of the non-vanishing eigenvalues of the matrix $m^{\alpha\beta}$.

In Class B, the null eigenvector $\nu_\alpha$ of $m^{\alpha\beta}$ is also present. In this case, $q^3$ vanishes identically, since $\mathrm{rank}(m)$ is less than 3, and the number of independent relations in the lemma's hypothesis is reduced to at most 2. An apparent complication thus emerges for Class B Types VI and VII, where the independent relations are two while the relevant numbers are 3 (the two real, non-zero eigenvalues of $m^{\alpha\beta}$ plus the non-vanishing component of $\nu_\alpha$). The resolution to this apparent complication is provided by the algebraic invariant:

$$\lambda \equiv \frac{C^\tau_{\tau\mu}\, C^\chi_{\chi\nu}\, I_3^{\mu\nu}}{C^\tau_{\chi\mu}\, C^\chi_{\tau\nu}\, I_3^{\mu\nu}}.$$

This quantity, which is not meant to replace the dynamical variable $q^3$, vanishes identically in Class A models, while in Class B it provides the third relation needed (see [1]). Thus in every Bianchi Type, 6 numbers appear: in Class A, the 3 eigenvalues of $m^{\alpha\beta}_{(1)}$ which correspond to $C_{(1)}$, and the 3 eigenvalues of $m^{\alpha\beta}_{(2)}$ which correspond to $C_{(2)}$. Similarly, in Class B, the at most 2 eigenvalues of $m^{\alpha\beta}_{(1)}$ plus the third component of its null eigenvector correspond to $C_{(1)}$, and the at most 2 eigenvalues of $m^{\alpha\beta}_{(2)}$ plus the third component of its null eigenvector correspond to $C_{(2)}$. The justification for considering only these two triplets and not, for example, the non-diagonal components of $m^{\alpha\beta}$, lies in the fact that $m^{\alpha\beta}$ can be put in diagonal form through the action of $SO(3)$, while $\nu_\alpha$ will have the proper form for being the null eigenvector of $m^{\alpha\beta}$. So, taking this irreducible form for both the matrix and its null eigenvector, we have the following relations. In Class A:

$$q^1(I_3, C_{(1)}) = q^1(I_3, C_{(2)}), \quad q^2(I_3, C_{(1)}) = q^2(I_3, C_{(2)}), \quad q^3(I_3, C_{(1)}) = q^3(I_3, C_{(2)}),$$

while in Class B:

$$q^1(I_3, C_{(1)}) = q^1(I_3, C_{(2)}), \quad q^2(I_3, C_{(1)}) = q^2(I_3, C_{(2)}), \quad \lambda(I_3, C_{(1)}) = \lambda(I_3, C_{(2)}).$$
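In each and every case, the corresponding system can be easily solved, resulting in the equality between the eigenvalues of $m^{\alpha\beta}_{(1)}$ and $m^{\alpha\beta}_{(2)}$, as well as between $\nu^{(1)}_\alpha$ and $\nu^{(2)}_\alpha$. There is thus a matrix $R \in SO(3)$ such that (in matrix notation) $m_{(2)} = |R|^{-1} R\, m_{(1)} R^T$ and $\nu_{(2)} = (R^{-1})^T \nu_{(1)} \iff C_{(2)} = \phi_R(C_{(1)})$. Of course, $|R| = 1$ and is there only as a reminder of the tensor density character of $m^{\alpha\beta}$.

Proof of Proposition 5. Since the matrices $\gamma^{(1)}$, $\gamma^{(2)}$ are positive definite, there are $\Lambda_{(1)}, \Lambda_{(2)} \in GL(3, \mathbb{R})$ such that $\gamma^{(1)} = \phi_{\Lambda_{(1)}}(I_3)$, $\gamma^{(2)} = \phi_{\Lambda_{(2)}}(I_3)$. Let $C_{(1)}$, $C_{(2)}$ be defined as $C_{(1)} = \phi_{\Lambda_{(1)}^{-1}}(C) \iff C = \phi_{\Lambda_{(1)}}(C_{(1)})$ and $C_{(2)} = \phi_{\Lambda_{(2)}^{-1}}(C) \iff C = \phi_{\Lambda_{(2)}}(C_{(2)})$, with $C$ representing again a given, albeit arbitrary, Bianchi Type. Using the above and (2.20) we have:

$$q^i(\gamma^{(1)}, C) = q^i\big(\phi_{\Lambda_{(1)}}(I_3), \phi_{\Lambda_{(1)}}(C_{(1)})\big) = q^i(I_3, C_{(1)}),$$

$$q^i(\gamma^{(2)}, C) = q^i\big(\phi_{\Lambda_{(2)}}(I_3), \phi_{\Lambda_{(2)}}(C_{(2)})\big) = q^i(I_3, C_{(2)}).$$

The hypothesis $q^i(\gamma^{(1)}, C) = q^i(\gamma^{(2)}, C)$ translates into $q^i(I_3, C_{(1)}) = q^i(I_3, C_{(2)})$, which through the lemma implies that there is $R \in SO(3)$ such that $C_{(2)} = \phi_R(C_{(1)})$. Since $R \in SO(3)$ (in matrix notation):

$$I_3 = \phi_R(I_3) \;\Rightarrow\; \phi_{\Lambda_{(2)}^{-1}}(\gamma^{(2)}) = \phi_R\big(\phi_{\Lambda_{(1)}^{-1}}(\gamma^{(1)})\big) \;\Rightarrow\; \gamma^{(2)} = \phi_{\Lambda_{(2)}} \circ \phi_R \circ \phi_{\Lambda_{(1)}^{-1}}(\gamma^{(1)}).$$

Similarly, we have:

$$C_{(2)} = \phi_R(C_{(1)}) \;\Rightarrow\; \phi_{\Lambda_{(2)}^{-1}}(C) = \phi_R\big(\phi_{\Lambda_{(1)}^{-1}}(C)\big) \;\Rightarrow\; C = \phi_{\Lambda_{(2)}} \circ \phi_R \circ \phi_{\Lambda_{(1)}^{-1}}(C).$$

The above imply that the matrix $\Lambda = \Lambda_{(1)}^{-1} R\, \Lambda_{(2)}$ satisfies $\gamma^{(2)} = \phi_\Lambda(\gamma^{(1)})$ and $C = \phi_\Lambda(C)$, i.e. $\Lambda \in Aut(G)$.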
We have thus completed the proof of the statement that whenever two hexads form the same multiplet $(q^i)$, they are in automorphic correspondence, i.e. (in matrix notation): $\exists\, \Lambda \in Aut(G) : \gamma^{(2)} = \Lambda^T \gamma^{(1)} \Lambda$.

3. Automorphisms and the Linear Constraints

We deem it appropriate to begin this section with a short recap of the main points of the quantum theory developed in [1]: our starting point is the line element describing the most general spatially homogeneous Bianchi type geometry:

$$ds^2 = \big(-N^2(t) + N_\alpha(t) N^\alpha(t)\big)\, dt^2 + 2 N_\alpha(t)\, \sigma^\alpha_i(x)\, dx^i dt + \gamma_{\alpha\beta}(t)\, \sigma^\alpha_i(x)\, \sigma^\beta_j(x)\, dx^i dx^j, \tag{3.1}$$

where $\sigma^\alpha_i$ are the invariant basis one-forms of the homogeneous surfaces of simultaneity $\Sigma_t$. Lower case Latin indices are world (tensor) indices and range from 1 to 3, while lower case Greek indices number the different basis one-forms and take values in the same range. The exterior derivative of any basis one-form (being a two-form) is expressible as a linear combination of the wedge products of any two of them, i.e.:

$$d\sigma^\alpha = C^\alpha_{\beta\gamma}\, \sigma^\beta \wedge \sigma^\gamma \iff \sigma^\alpha_{i,j} - \sigma^\alpha_{j,i} = 2\, C^\alpha_{\beta\gamma}\, \sigma^\gamma_i\, \sigma^\beta_j. \tag{3.2}$$
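The coefficients $C^\alpha_{\mu\nu}$ are, in general, functions of the point $x$. When the space is homogeneous and admits a 3-dimensional isometry group, there exist 3 one-forms such that the $C$'s become independent of $x$, and are then called structure constants of the corresponding isometry group. Einstein's Field Equations for the metric (3.1) are obtained only for the Class A subgroup [11] (i.e. those spaces with $C^\alpha_{\alpha\beta} = 0$) from the following Hamiltonian:

$$H = N(t) H_0 + N^\alpha(t) H_\alpha, \tag{3.3}$$

where

$$H_0 = \frac{1}{2}\gamma^{-1/2} L_{\alpha\beta\mu\nu}\, \pi^{\alpha\beta}\pi^{\mu\nu} + \gamma^{1/2} R \tag{3.4}$$

is the quadratic constraint, with

$$L_{\alpha\beta\mu\nu} = \gamma_{\alpha\mu}\gamma_{\beta\nu} + \gamma_{\alpha\nu}\gamma_{\beta\mu} - \gamma_{\alpha\beta}\gamma_{\mu\nu}, \qquad R = C^\beta_{\lambda\mu}\, C^\alpha_{\theta\tau}\, \gamma_{\alpha\beta}\, \gamma^{\theta\lambda}\gamma^{\tau\mu} + 2\, C^\alpha_{\beta\delta}\, C^\delta_{\nu\alpha}\, \gamma^{\beta\nu}, \tag{3.5}$$

$\gamma$ being the determinant of $\gamma_{\alpha\beta}$, $R$ being the Ricci scalar of the slice $t = \text{const}$, and

$$H_\alpha = 4\, C^\mu_{\alpha\rho}\, \gamma_{\beta\mu}\, \pi^{\beta\rho} \tag{3.6}$$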
are the linear constraints. For Bianchi Types VI0 , VII0 , VIII and IX, all three Hα ’s are independent. Following Kuchaˇr & Hajicek [2], we can quantize the system (3.3) – with the allocations (3.4), (3.5), (3.6) – by writing the operator constraint equations as ∂ β α = Cαµ H γβν = 0, ∂γµν 2 0 = − 1 ij ∂ − ij Aijk ∂ + (D − 2) R + √γ R , H 2 ∂x i x j ∂x k 4(D − 1)
(3.7) (3.8)
where x i are the independent solutions to (3.7) and ij =
∂x i ∂x j −1/2 γ Lαβµν ∂γαβ ∂γµν
is the induced metric on the reduced configuration space. Also, Aijk , R , are the corresponding Christoffel symbols and Ricci scalar respectively, while D = 3 (for details such as consistency etc. see [1]). The linear equations (3.7) constitute a system of three independent, first order, P.D.E.’s in the six variables γαβ . These equations, by virtue of the first class algebra satisfied by the operator constraints, admit three independent, non-zero, solutions which can be taken to be the combinations: β
$$x^1 = C^\alpha_{\mu\kappa}\, C^\beta_{\nu\lambda}\, \gamma_{\alpha\beta}\, \gamma^{\mu\nu}\gamma^{\kappa\lambda}, \qquad x^2 = C^\alpha_{\beta\kappa}\, C^\beta_{\alpha\lambda}\, \gamma^{\kappa\lambda}, \qquad x^3 = \gamma, \tag{3.9}$$

or any other three independent functions thereof. These are Kuchař's physical variables, which solve the linear constraints. Thus, the presence of the linear constraints at the quantum level implies that the state vector must be an arbitrary function of the three combinations (3.9) or any three independent functions thereof. This assumption is also
compatible with (3.8), which finally becomes a P.D.E. in terms of the $x^i$'s (see (2.11) of [1]). In Type II, where only two of the three $H_\alpha$'s are independent, yet another combination of the $\gamma_{\alpha\beta}$'s (besides the three $x^i$'s in (3.9)) solves (3.7); see the first part of [4]. In Type I, all six $\gamma_{\alpha\beta}$'s solve the identically vanishing quantum linear constraints.

Let us now compare this theory with the purely kinematical results of the previous section: to this end, first note that $q^1$, $q^2$ in (2.17) solve the quantum linear constraints since, as can be easily verified:

$$q^1 = \varepsilon \sqrt{\frac{x^1 - 2x^2}{2}}, \qquad q^2 = -\frac{x^2}{2},$$

where $\varepsilon = \mathrm{sign}(m^{\alpha\beta}\gamma_{\alpha\beta})$; see the appendix of [1]. In Bianchi Types VIII, IX the existence of the non-vanishing c-number density $m$ permits us to relate $x^3$ to the scalar $q^3 = m/\sqrt{x^3}$; thus the grouping entailed by the quantum Hamiltonian dynamics is completely equivalent to that enforced by the Kinematics of B.H. 3-spaces described in the previous section.

For Types VI$_0$, VII$_0$, $q^3 = 0$ (since $m = 0$) and an apparent discrepancy occurs: kinematically $q^1$, $q^2$ (or equivalently $x^1$, $x^2$) invariantly and irreducibly characterize a B.H. 3-geometry; that is, any function of the 3-geometry must necessarily and exclusively depend on $x^1$, $x^2$. On the other hand, the quantum Hamiltonian dynamics emanating from (3.3) allows $x^3 = \gamma$ as a third possible argument of the wave function which is to solve (3.8). The situation gets worse when coming to the lower Class A Types. In Type II, the single independent scalar $q^1$ is adequate for characterizing the 3-slice while, as explained above, the dynamics allows $\gamma$ plus two more combinations of the $\gamma_{\alpha\beta}$'s. In Type I, not a single $q^i$ survives, while all the $\gamma_{\alpha\beta}$'s are, in principle, candidates as arguments of the solution to the Wheeler–DeWitt equation (3.8).

The discrepancy is not of merely academic interest. Any possible argument of the wave function other than the $q^i$'s (or three independent functions thereof) is a gauge degree of freedom, since it can be affected by an appropriate A.I.D. A satisfactory solution of the puzzle can be achieved through the usage of the existing conditional symmetries of system (3.3). The detailed analysis has been given for Type I in the last of [3], for Type II in the second part of [4], and for Types VI$_0$, VII$_0$ in [12]. In the rest of this section, we give a brief outline of this idea, and present the characteristic example of Type V, where a complete matching between kinematics and Hamiltonian dynamics occurs.
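The relations between the $q^i$'s of (2.17) and the $x^i$'s of (3.9) can be spot-checked numerically for Class A models, where $C^\alpha_{\beta\gamma} = m^{\alpha\delta}\varepsilon_{\delta\beta\gamma}$. The sketch below is an illustration only: the sample matrices `m` and `g` and all helper names are assumptions, and the contractions are evaluated by brute-force summation over indices rather than by any method taken from the paper.

```python
from itertools import product
from math import sqrt

R3 = range(3)

def eps(i, j, k):
    """Permutation symbol on indices 0..2, with eps(0, 1, 2) == 1."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return sum(eps(i, j, k) * a[0][i] * a[1][j] * a[2][k]
               for i, j, k in product(R3, repeat=3))

def inverse3(a):
    """Inverse of a 3x3 matrix via cofactors."""
    d = det3(a)
    def cofactor(i, j):
        r = [x for x in R3 if x != i]
        c = [x for x in R3 if x != j]
        minor = a[r[0]][c[0]] * a[r[1]][c[1]] - a[r[0]][c[1]] * a[r[1]][c[0]]
        return (-1) ** (i + j) * minor
    return [[cofactor(j, i) / d for j in R3] for i in R3]

def class_a_constants(m):
    """C[a][b][c] = m^{ad} eps_{dbc}, i.e. (2.18) with nu = 0."""
    return [[[sum(m[a][d] * eps(d, b, c) for d in R3)
              for c in R3] for b in R3] for a in R3]

def x_variables(C, g):
    """The combinations x^1, x^2, x^3 of (3.9)."""
    gi = inverse3(g)
    x1 = sum(C[a][p][k] * C[b][q][l] * g[a][b] * gi[p][q] * gi[k][l]
             for a, b, p, q, k, l in product(R3, repeat=6))
    x2 = sum(C[a][b][k] * C[b][a][l] * gi[k][l]
             for a, b, k, l in product(R3, repeat=4))
    return x1, x2, det3(g)

def q_scalars(m, g):
    """The scalars q^1, q^2, q^3 of (2.17) for a Class A model."""
    x1, _, x3 = x_variables(class_a_constants(m), g)
    trace = sum(m[a][b] * g[a][b] for a, b in product(R3, repeat=2))
    return trace / sqrt(x3), trace ** 2 / (2 * x3) - x1 / 4, det3(m) / sqrt(x3)

# Sample data (assumed): a Type VIII-like m and a positive definite metric.
m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
g = [[2.0, 0.3, 0.1], [0.3, 1.5, -0.2], [0.1, -0.2, 1.0]]

x1, x2, x3 = x_variables(class_a_constants(m), g)
q1, q2, q3 = q_scalars(m, g)
print(abs(q1 ** 2 - (x1 - 2 * x2) / 2), abs(q2 + x2 / 2))
```

Both printed differences should vanish to rounding accuracy, which is the content of the two relations quoted above; the sign $\varepsilon$ is not needed here because only $(q^1)^2$ is compared.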
We first observe that the root of the problem lies in the existence of the generators of the outer automorphic motions $E_{(j)} = \varepsilon^\sigma_{(j)\rho}\, \gamma_{\sigma\tau}\, \dfrac{\partial}{\partial\gamma_{\rho\tau}}$ among the $X_{(j)}$'s: their classical counterparts $E_{(j)} = \varepsilon^\sigma_{(j)\rho}\, \gamma_{\sigma\tau}\, \pi^{\rho\tau}$ are, at first sight, absent from (3.3). As one can easily compute, the Lie Brackets among these and the generators of the inner automorphic motions $H_\alpha$ are:

$$\{H_\alpha, H_\beta\} = -\frac{1}{2} C^\delta_{\alpha\beta} H_\delta, \qquad \{E_{(i)}, H_\beta\} = -\frac{1}{2} \lambda^\delta_{(i)\beta} H_\delta, \qquad \{E_{(i)}, E_{(j)}\} = C^{(k)}_{(i)(j)} E_{(k)}, \tag{3.10}$$

where $\{\ ,\ \}$ stands for the Lie Bracket and the $C^{(k)}_{(i)(j)}$ are particular to each Bianchi Type. So, all the quantum analogues of the $X_{(j)}$'s can be consistently imposed on the wave
function:

$$X_{(j)}\Psi = 0, \tag{3.11}$$

as kinematics dictates. Then $\Psi$ is a function of the $q^i$'s only (see Table 4.1). The classical dynamics of action (3.3) provides us with some integrals of motion, linear in momenta, which are either the $E_{(j)}$'s themselves or linear combinations of some of them with $\gamma_{\mu\nu}\pi^{\mu\nu}$ (last part of [3], second part of [4, 12]). Adopting the recipe that these integrals of motion should also be turned into operators annihilating the wave function, we achieve the desired reconciliation between kinematics and Quantum Hamiltonian dynamics. A very interesting feature is that the corresponding constants of motion are set equal to zero due to the consistency required (Frobenius Theorem).

The following general Type V case is characteristic: although Type V is a Class B model, a valid totally scalar Hamiltonian has been found [13], having the form:

$$H^c \equiv N_0 H^c_0 + N^\rho H^0_\rho = N_0 \left(\frac{1}{2} B_{\alpha\beta\mu\nu}\, \pi^{\alpha\beta}\pi^{\mu\nu} + V\right) + N^\rho C^\alpha_{\rho\beta}\, \gamma_{\alpha\delta}\, \pi^{\beta\delta}, \tag{3.12}$$
where $B_{\mu\nu\rho\sigma}$ and $V$ are a 4th-rank contravariant tensor and a scalar respectively, constructed out of the structure constants and the $\gamma_{\alpha\beta}$'s. When quantized according to Kuchař & Hajicek, this action gives rise to a wave function depending on 3 combinations of the $\gamma_{\alpha\beta}$'s, namely $\Psi = \Psi\left(q^2, \dfrac{\gamma_{12}}{\gamma_{11}}, \dfrac{\gamma_{12}}{\gamma_{22}}\right)$ [10]. Clearly, the last two arguments are gauge degrees of freedom since, as one can see from the Table, $q^2$ is the only invariant characterizing the 3-geometry under discussion.

The elimination of these two degrees of freedom is achieved by considering the quantum analogues of the following three integrals of motion admitted by (3.12), $E_{(j)} = \varepsilon^\sigma_{(j)\rho}\, \gamma_{\sigma\tau}\, \pi^{\rho\tau}$, with:

$$\varepsilon^\alpha_{(1)\beta} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \varepsilon^\alpha_{(2)\beta} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \varepsilon^\alpha_{(3)\beta} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

One can immediately recognize that these matrices are the outer automorphisms of the Type V Lie Algebra. Consequently, the vector fields $E_{(j)} = \varepsilon^\sigma_{(j)\rho}\, \gamma_{\sigma\tau}\, \dfrac{\partial}{\partial\gamma_{\rho\tau}}$ generate outer automorphic motions in the configuration space. Turning these integrals of motion into operators imposed on $\Psi$, i.e. demanding $E_{(j)}\Psi = M_j \Psi$, and utilizing the algebra (say $C^k_{ij}$) which the previous three matrices obey, one arrives at the consistency condition $C^k_{ij} M_k = 0$, implying that the constants of integration $M_k$ should be set equal to zero. We thus retrieve all the conditions $X_{(j)}\Psi = 0$ required by the kinematics. So we have an example in which the dynamics completely complies with the kinematical/geometrical results obtained in Sect. 2.

As we have mentioned earlier, the same situation occurs for all Class A Types, when $E_{(j)}$'s exist. In the case of Types VIII, IX the Hamiltonian (3.3) is totally scalar, since $m/\sqrt{\gamma}$ is the $q^3$, $m$ being a c-number density.
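The Type V argument can be made concrete with a few lines of exact integer arithmetic. The sketch below assumes Type V structure constants of the form (2.18) with $m^{\alpha\beta} = 0$ and $\nu_\alpha = (0, 0, 1)$ (a normalisation chosen for illustration), checks that the three matrices $\varepsilon_{(j)}$ satisfy (2.16), and computes the constants of their matrix commutator algebra. The algebra of the operators $E_{(j)}$ may differ from the matrix one by signs or factors depending on conventions, which does not affect the conclusion that the consistency condition admits only $M_k = 0$.

```python
R3 = range(3)

# Type V structure constants from (2.18) with m = 0, nu = (0, 0, 1) (assumed).
NU = [0, 0, 1]
C = [[[NU[b] * (a == c) - NU[c] * (a == b) for c in R3] for b in R3] for a in R3]

# The three matrices eps_(1), eps_(2), eps_(3) quoted in the text.
EPS = [
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
]

def satisfies_2_16(lam):
    """True if lam[a][b] = lambda^a_b obeys (2.16) for the Type V constants."""
    return all(
        sum(lam[a][b] * C[b][p][q] for b in R3)
        == sum(lam[r][p] * C[a][r][q] + lam[r][q] * C[a][p][r] for r in R3)
        for a in R3 for p in R3 for q in R3)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in R3) for j in R3] for i in R3]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in R3] for i in R3]

def expand(mat):
    """Coefficients c_k such that mat == sum_k c_k * EPS[k]."""
    coeffs = [mat[0][0], mat[0][1], mat[1][0]]
    rebuilt = [[sum(coeffs[k] * EPS[k][i][j] for k in R3) for j in R3] for i in R3]
    if rebuilt != mat:
        raise ValueError("matrix is not in the span of EPS")
    return coeffs

# f[(i, j)][k] is the constant f^k_{ij} in [eps_i, eps_j] = f^k_{ij} eps_k.
f = {(i, j): expand(commutator(EPS[i], EPS[j]))
     for i in R3 for j in R3 if i < j}

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Rows of the linear system f^k_{ij} M_k = 0 for the pairs (1,2), (1,3), (2,3).
consistency_rows = [f[(0, 1)], f[(0, 2)], f[(1, 2)]]
only_trivial_solution = det3(consistency_rows) != 0
print(f, only_trivial_solution)
```

Since the coefficient matrix of the system is non-singular, the only solution is $M_1 = M_2 = M_3 = 0$, in line with the statement that the constants of integration must vanish.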
4. Discussion

In Sect. 2, we first identified the particular class of G.C.T.'s which preserve manifest homogeneity of the line element of the generic B.H. 3-space. Their action on the configuration space spanned by the $\gamma_{\alpha\beta}$'s is shown to be that of the Automorphism group. The differential description of this action on " leads us to the vector fields $X_{(j)}$. Their characteristic solutions, the $q^i$'s, irreducibly and invariantly label the 3-geometry. Thus, for any given but arbitrary Bianchi Type, points in " corresponding to the same multiplet $q^i$ are automorphically related and thus G.C.T. equivalent. A first conclusion concerning any possible quantum theory of Bianchi Cosmologies is thus reached on solely kinematical grounds: the wave function must depend on the $q^i$'s only, if it is to represent the geometry and not the coordinate system on the 3-slice.

In Sect. 3 we first present the quantization of the Hamiltonian action (3.3) according to Kuchař's and Hajicek's recipe. We see that the quantum linear constraint vector fields $H_\alpha$, corresponding to the inner automorphisms proper invariant subgroup $InAut(G)$ of $Aut(G)$, are among the $X_{(i)}$'s. As seen from the Table, for Types VIII, IX there are no outer-automorphisms and the three $x^i$'s are in one-to-one correspondence to the three $q^i$'s (essentially the three independent curvature invariants). For all other Class A Types, there is always an outer-automorphism with non-vanishing trace; the corresponding generator in configuration space " does not (weakly) commute with the quadratic constraint (3.4), nor does its corresponding quantum analogue commute with (3.8). Thus, for the lower Class A Types, the wave functions emanating from action (3.3) depend on the curvature invariants and on $\gamma$, despite the fact that $q^3 = 0$; these wave functions will therefore not be G.C.T. invariant, since $\gamma$ can be changed to anything we like by an A.I.D.

This result seems to justify (for these types) the claim made by some authors that $\gamma$ should be considered as a time variable and thus frozen out [14]. One may say that for the lower Class A Types the grouping dictated by the quantum theory resulting from action (3.3) is overcomplete: although any two hexads forming the same $x^1$, $x^2$ are geometrically identifiable (since they correspond to G.C.T. related spatial line-elements), the theory requires that $x^3 = \gamma$ be also the same in order to consider these two hexads as equivalent. At first sight, this may be seen as a defect of the classical action (3.3); although it reproduces Einstein's Equations for (Class A) spatially homogeneous spacetimes, it does not correctly reflect the full covariances of these equations. However, as is explained in [4, 10, 12, 3], the conditional symmetries of this action rectify this defect: for Bianchi Types other than VIII, IX, there are extra integrals of motion, linear in momenta, say $E_{(i)} = \varepsilon^\alpha_{(i)\rho}\, \gamma_{\alpha\sigma}\, \pi^{\rho\sigma}$, corresponding to the outer automorphisms subgroup of $Aut(G)$. It is shown how the quantum analogues of these $E_{(i)}$'s can serve to satisfactorily remove this discrepancy. Their imposition as additional conditions restricting the wave function results in forcing it to depend on the $q^i$'s only. A noteworthy feature of this procedure is that, at the quantum level, the consistency requirement of these extra conditions leads to setting to zero the classical constants of integration (which are non-essential), as the particular example of Type V exhibits.

Another important consequence of the results in Sect. 2 is the conclusion that Homogeneous 3-Geometries are completely characterized by their curvature invariants: indeed, as is well known, in 3 dimensions all metric invariants are higher derivative curvature invariants [9]; but the homogeneity of the space reduces any higher derivative curvature invariant to a scalar combination of $C^\alpha_{\mu\nu}$, $\gamma_{\alpha\beta}$ with the appropriate number of $C$'s.
Thus any two distinct Homogeneous 3-Geometries must differ by at least one curvature invariant, i.e. by at least one q i ; and vice versa, any two Homogeneous 3-metrics
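Table 4.1.

Type | Generators $\lambda^\alpha_{(i)\beta}$ | # of Indep. Parameters | # of Indep. $H_\alpha$'s | # of Indep. $E_\alpha$'s | # of Indep. $q^i$'s
I | $\begin{pmatrix} p_1 & p_2 & p_3 \\ p_4 & p_5 & p_6 \\ p_7 & p_8 & p_9 \end{pmatrix}$ | 9 | 0 | 0 | 0
II | $\begin{pmatrix} p_3 + p_6 & p_1 & p_2 \\ 0 & p_3 & p_4 \\ 0 & p_5 & p_6 \end{pmatrix}$ | 6 | 2 | 3 | 1
III | $\begin{pmatrix} p_1 & p_2 & p_3 \\ p_2 & p_1 & p_4 \\ 0 & 0 & 0 \end{pmatrix}$ | 4 | 2 | 2 | 2
IV | $\begin{pmatrix} p_1 & p_2 & p_3 \\ 0 & p_1 & p_4 \\ 0 & 0 & 0 \end{pmatrix}$ | 4 | 3 | 1 | 2
V | $\begin{pmatrix} p_1 & p_2 & p_3 \\ p_4 & p_5 & p_6 \\ 0 & 0 & 0 \end{pmatrix}$ | 6 | 3 | 2 | 1
VI | $\begin{pmatrix} p_1 & p_2 & p_3 \\ p_2 & p_1 & p_4 \\ 0 & 0 & 0 \end{pmatrix}$ | 4 | 3 | 1 | 2
VII | $\begin{pmatrix} p_1 & p_2 & p_3 \\ -p_2 & p_1 & p_4 \\ 0 & 0 & 0 \end{pmatrix}$ | 4 | 3 | 1 | 2
VIII | $\begin{pmatrix} 0 & p_1 & p_2 \\ p_1 & 0 & p_3 \\ p_2 & -p_3 & 0 \end{pmatrix}$ | 3 | 3 | 0 | 3
IX | $\begin{pmatrix} 0 & p_1 & p_2 \\ -p_1 & 0 & p_3 \\ -p_2 & -p_3 & 0 \end{pmatrix}$ | 3 | 3 | 0 | 3

Notes:
N1 The number of the independent $q^i$'s equals the number of the independent curvature invariants.
N2 Type III is characterized by the condition $h = \pm 1$, while Type VI by the condition $h = (0, \pm 1)$.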
for which all curvature invariants (i.e. all q i ’s) coincide, are necessarily G.C.T. related and thus represent the same 3-Geometry. Last but not least, we would like to underline that the partitioning of the Automorphism Group in Inner and Outer Subgroups, which quantum theory seems to favour, does have a classical analogue: the inner automorphism parameters represent genuine “gauge” degrees of freedom (i.e. can be allowed to be arbitrary functions of time) – see the 4th part of [6] –, while the outer automorphism parameters, are rigid symmetries – see the 3rd part of [6]. Acknowledgements. The authors wish to express their appreciation for the referee’s critical comments on an earlier version of the manuscript, which helped them to present a clearer version of the essence of this work.
One of us (G. O. Papadopoulos) is currently a scholar of the Greek State Scholarships Foundation (I.K.Y.) and acknowledges the relevant financial support.
References

1. Christodoulakis, T., Korfiatis, E. and Vagenas, E.C.: gr-qc/9407042
2. Kuchař, K.V. and Hajicek, P.: Phys. Rev. D 41, 1091 (1990); J. Math. Phys. 31, 1723 (1990)
3. Ashtekar, A. and Samuel, J.: Class. Quant. Grav. 8, (1991); Folomeev, V.N. and Gurovich, V.Ts.: Gravitation & Cosmology 6, No. 1, 19–26 (2000); Hervik, S.: Class. Quant. Grav. 17, 2765–2782 (2000); Christodoulakis, T., Gakis, T. and Papadopoulos, G.O.: gr-qc/0106065, to appear in Class. Quant. Grav.
4. Christodoulakis, T., Kofinas, G., Korfiatis, E. and Paschos, A.: Phys. Lett. B 390, 55–58 (1997); Christodoulakis, T. and Papadopoulos, G.O.: Phys. Lett. B 501, 264–268 (2001)
5. Lidsey, J.E.: Phys. Lett. B 352, 207 (1995)
6. Heckmann, O. and Schücking, E.: Relativistic Cosmology. In: Gravitation (an introduction to current research). Edited by L. Witten, New York: Wiley, 1962; Jantzen, R.T.: Commun. Math. Phys. 64, 211 (1979); Coussaert, O. and Henneaux, M.: Class. Quant. Grav. 10, 1607–1618 (1993); Christodoulakis, T., Kofinas, G., Korfiatis, E., Papadopoulos, G.O. and Paschos, A.: gr-qc/0008050, J. Math. Phys. 42 (8), 3580–3608 (2001)
7. Kuchař, K.V.: J. Math. Phys. 23 (9), 1647–1661 (1982)
8. Warner, F.: Foundations of Differentiable Manifolds and Lie Groups. Glenview, Illinois: Scott Foresman & Co, 1971, pp. 41–50
9. Munoz Masque, J. and Valdes Morales, A.: J. Phys. A 27 (23) (1994)
10. Christodoulakis, T., Kofinas, G., Korfiatis, E. and Paschos, A.: Phys. Lett. B 419, 30–36 (1998)
11. MacCallum, M.A.H. and Taub, A.H.: Commun. Math. Phys. 25, 173 (1972); Ellis, G.R.F. and MacCallum, M.A.H.: Commun. Math. Phys. 12, 108–141 (1969); MacCallum, M.A.H.: Commun. Math. Phys. 20, 57 (1971); General Relativity. An Einstein Centenary Survey, ed. S.W. Hawking and W. Israel, Cambridge: Cambridge University Press; Sneddon, G.E.: J. Phys. A 9, 229 (1976); Wald, R.M.: Phys. Rev. D 28, 2118 (1982); Wald, R.M.: General Relativity. Chicago, IL: University of Chicago Press, 1984; Christodoulakis, T. and Korfiatis, E.: Nuovo Cimento 109 B, 1155 (1994); Higuchi, A. and Wald, R.M.: Phys. Rev. D 51, 544 (1995)
12. Christodoulakis, T., Kofinas, G. and Papadopoulos, G.O.: Phys. Lett. B 514, 149–154 (2001)
13. Christodoulakis, T., Korfiatis, E. and Paschos, A.: Phys. Rev. D 54, 4 (1996)
14. Schirmer, J.: Class. Quant. Grav. 12, 1099 (1995)

Communicated by H. Nicolai
Commun. Math. Phys. 226, 393 – 418 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Tensor Fields of Mixed Young Symmetry Type and N-Complexes
Michel Dubois-Violette1, Marc Henneaux2
1 Laboratoire de Physique Théorique, UMR 8627, Université Paris XI, Bâtiment 210, 91405 Orsay Cedex, France. E-mail: [email protected]
2 Physique Théorique et Mathématique, Université Libre de Bruxelles, Campus Plaine C.P. 231, 1050 Bruxelles, Belgique. E-mail: [email protected]
Received: 25 October 2001 / Accepted: 13 November 2001
Abstract: We construct $N$-complexes of non-completely antisymmetric irreducible tensor fields on $\mathbb{R}^D$ which generalize the usual complex ($N = 2$) of differential forms. Although, for $N \geq 3$, the generalized cohomology of these $N$-complexes is nontrivial, we prove a generalization of the Poincaré lemma. To that end we use a technique reminiscent of the Green ansatz for parastatistics. Several results which appeared in various contexts are shown to be particular cases of this generalized Poincaré lemma. We furthermore identify the nontrivial part of the generalized cohomology. Many of the results presented here were announced in [10].

1. Introduction

Our aim in this paper is to develop differential tools for irreducible tensor fields on $\mathbb{R}^D$ which generalize the calculus of differential forms. By an irreducible tensor field on $\mathbb{R}^D$, we here mean a smooth mapping $x \mapsto T(x)$ of $\mathbb{R}^D$ into a vector space of (covariant) tensors of given Young symmetry. We recall that this implies that the representation of $GL_D$ in the corresponding space of tensors is irreducible. Throughout the following $(x^\mu) = (x^1, \dots, x^D)$ denotes the canonical coordinates of $\mathbb{R}^D$ and $\partial_\mu$ are the corresponding partial derivatives which we identify with the corresponding covariant derivatives associated to the canonical flat torsion-free linear connection $\overset{(0)}{\nabla}$ of $\mathbb{R}^D$. Thus, for instance, if $T$ is a covariant tensor field of degree $p$ on $\mathbb{R}^D$ with components $T_{\mu_1 \dots \mu_p}(x)$, then $\overset{(0)}{\nabla} T$ denotes the covariant tensor field of degree $p + 1$ with components $\partial_{\mu_{p+1}} T_{\mu_1 \dots \mu_p}(x)$. The operator $\overset{(0)}{\nabla}$ is a first-order differential operator which increases by one the tensorial degree. In this context, the space $\Omega(\mathbb{R}^D)$ of differential forms on $\mathbb{R}^D$ is the graded vector space of (covariant) antisymmetric tensor fields on $\mathbb{R}^D$ with graduation induced by the tensorial degree, whereas the exterior differential $d$ is up to a sign the composition of
the above $\overset{(0)}{\nabla}$ with antisymmetrisation, i.e.

$$d = (-1)^p \mathbf{A}_{p+1} \circ \overset{(0)}{\nabla} : \Omega^p(\mathbb{R}^D) \to \Omega^{p+1}(\mathbb{R}^D), \tag{1}$$

where $\mathbf{A}_p$ denotes the antisymmetrizer on tensors of degree $p$. The sign factor $(-1)^p$ arises because $d$ acts from the left, while we defined $(\overset{(0)}{\nabla} T)_{\mu_1 \dots \mu_{p+1}} = \partial_{\mu_{p+1}} T_{\mu_1 \dots \mu_p}$. One has $d^2 = 0$ and the Poincaré lemma asserts that the cohomology of the complex $(\Omega(\mathbb{R}^D), d)$ is trivial, i.e. that one has $H^p(\Omega(\mathbb{R}^D)) = 0$, $\forall p \geq 1$ and $H^0(\Omega(\mathbb{R}^D)) = \mathbb{R}$, where $H(\Omega(\mathbb{R}^D)) = \mathrm{Ker}(d)/\mathrm{Im}(d) = \oplus_p H^p(\Omega(\mathbb{R}^D))$ with $H^p(\Omega(\mathbb{R}^D)) = \mathrm{Ker}(d : \Omega^p(\mathbb{R}^D) \to \Omega^{p+1}(\mathbb{R}^D))/d(\Omega^{p-1}(\mathbb{R}^D))$.
From the point of view of Young symmetry, antisymmetric tensors correspond to Young diagrams (partitions) described by one column of cells, corresponding to the partition $(1^p)$, whereas $\mathbf{A}_p$ is the associated Young symmetrizer (see the next section for definitions and conventions).
There is a relatively easy way to generalize the pair $(\Omega(\mathbb{R}^D), d)$ which we now describe. Let $(Y) = (Y_p)_{p \in \mathbb{N}}$ be a sequence of Young diagrams such that the number of cells of $Y_p$ is $p$, $\forall p \in \mathbb{N}$ (i.e. such that $Y_p$ is a partition of the integer $p$ for any $p$). We define $\Omega^p_{(Y)}(\mathbb{R}^D)$ to be the vector space of smooth covariant tensor fields of degree $p$ on $\mathbb{R}^D$ which have the Young symmetry type $Y_p$ and we let $\Omega_{(Y)}(\mathbb{R}^D)$ be the graded vector space $\oplus_p \Omega^p_{(Y)}(\mathbb{R}^D)$. We then generalize the exterior differential by setting

$$d = (-1)^p \mathbf{Y}_{p+1} \circ \overset{(0)}{\nabla} : \Omega^p_{(Y)}(\mathbb{R}^D) \to \Omega^{p+1}_{(Y)}(\mathbb{R}^D), \tag{2}$$

where $\mathbf{Y}_p$ is now the Young symmetrizer on tensors of degree $p$ associated to the Young symmetry $Y_p$. This $d$ is again a first order differential operator which is of degree one (i.e. it increases the tensorial degree by one), but now, $d^2 \neq 0$ in general. Instead, one has the following result.

Lemma 1. Let $N$ be an integer with $N \geq 2$ and assume that $(Y)$ is such that the number of columns of the Young diagram $Y_p$ is strictly smaller than $N$ (i.e. $\leq N - 1$) for any $p \in \mathbb{N}$. Then one has $d^N = 0$.

In fact the indices in one column are antisymmetrized (see below) and $d^N \omega$ involves necessarily at least two partial derivatives $\partial$ in one of the columns since there are $N$ partial derivatives involved and at most $N - 1$ columns.
Thus if $(Y)$ satisfies the condition of Lemma 1, the pair $(\Omega_{(Y)}(\mathbb{R}^D), d)$ is an $N$-complex (of cochains) [19, 6, 12, 20, 7], i.e. here a graded vector space equipped with an endomorphism $d$ of degree 1, its $N$-differential, satisfying $d^N = 0$. Concerning $N$-complexes, we shall use here the notations and the results of [7] which will be recalled when needed.
Notice that $\Omega^p_{(Y)}(\mathbb{R}^D) = 0$ if the first column of $Y_p$ contains more than $D$ cells and that therefore, if $(Y)$ satisfies the condition of Lemma 1, then $\Omega^p_{(Y)}(\mathbb{R}^D) = 0$ for $p > (N - 1)D$.
One can also define a graded bilinear product on $\Omega_{(Y)}(\mathbb{R}^D)$ by setting

$$(\alpha\beta)(x) = \mathbf{Y}_{a+b}(\alpha(x) \otimes \beta(x)) \tag{3}$$

for $\alpha \in \Omega^a_{(Y)}(\mathbb{R}^D)$, $\beta \in \Omega^b_{(Y)}(\mathbb{R}^D)$ and $x \in \mathbb{R}^D$. This product is by construction bilinear with respect to the $C^\infty(\mathbb{R}^D)$-module structure of $\Omega_{(Y)}(\mathbb{R}^D)$ (i.e. with respect to multiplication by smooth functions). It is worth noticing here that one always has $\Omega^0_{(Y)}(\mathbb{R}^D) = C^\infty(\mathbb{R}^D)$.
In this paper we shall not stay at this level of generality; for each $N \geq 2$ we shall choose a maximal $(Y)$, denoted by $(Y^N) = (Y^N_p)_{p \in \mathbb{N}}$, satisfying the condition of Lemma 1. The Young diagram $Y^N_p$ with $p$ cells (Fig. 1) is defined in the following manner: write the division of $p$ by $N - 1$, i.e. write $p = (N - 1)n_p + r_p$, where $n_p$ and $r_p$ are (the unique) integers with $0 \leq n_p$ and $0 \leq r_p \leq N - 2$ ($n_p$ is the quotient whereas $r_p$ is the remainder), and let $Y^N_p$ be the Young diagram with $n_p$ rows of $N - 1$ cells and the last row with $r_p$ cells (if $r_p \neq 0$). One has $Y^N_p = ((N - 1)^{n_p}, r_p)$, that is we fill the rows maximally.
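The division with remainder that defines $Y^N_p$ can be mechanized directly. The following Python sketch returns the row lengths of $Y^N_p$; the function name `young_diagram` and the list-of-row-lengths encoding are illustrative choices, not notation from the text.

```python
def young_diagram(p: int, N: int) -> list[int]:
    """Row lengths of Y_p^N: n_p rows of N - 1 cells, then a row of r_p cells if r_p != 0.

    Hypothetical helper for illustration; a diagram is encoded as its list of row lengths.
    """
    if N < 2 or p < 0:
        raise ValueError("need N >= 2 and p >= 0")
    # p = (N - 1) * n_p + r_p with 0 <= r_p <= N - 2
    n_p, r_p = divmod(p, N - 1)
    rows = [N - 1] * n_p
    if r_p:
        rows.append(r_p)
    return rows
```

For $N = 2$ this gives a single column of $p$ cells, i.e. the diagram of differential $p$-forms, and for any $N$ no diagram has more than $N - 1$ columns, as Lemma 1 requires.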
[Fig. 1: the Young diagram $Y^N_p$]

We shall denote $\Omega_{(Y^N)}(\mathbb{R}^D)$ and $\Omega^p_{(Y^N)}(\mathbb{R}^D)$ by $\Omega_N(\mathbb{R}^D)$ and $\Omega^p_N(\mathbb{R}^D)$, respectively. It is clear that $(\Omega_2(\mathbb{R}^D), d)$ is the usual complex $(\Omega(\mathbb{R}^D), d)$ of differential forms on $\mathbb{R}^D$. The $N$-complex $(\Omega_N(\mathbb{R}^D), d)$ will be simply denoted by $\Omega_N(\mathbb{R}^D)$.
We recall [7] that the (generalized) cohomology of the $N$-complex $\Omega_N(\mathbb{R}^D)$ is the family of graded vector spaces $H_{(k)}(\Omega_N(\mathbb{R}^D))$, $k \in \{1, \dots, N - 1\}$, defined by $H_{(k)}(\Omega_N(\mathbb{R}^D)) = \mathrm{Ker}(d^k)/\mathrm{Im}(d^{N-k})$, i.e. $H_{(k)}(\Omega_N(\mathbb{R}^D)) = \oplus_p H^p_{(k)}(\Omega_N(\mathbb{R}^D))$ with

$$H^p_{(k)}(\Omega_N(\mathbb{R}^D)) = \mathrm{Ker}(d^k : \Omega^p_N(\mathbb{R}^D) \to \Omega^{p+k}_N(\mathbb{R}^D))/d^{N-k}(\Omega^{p+k-N}_N(\mathbb{R}^D)).$$

The following statement is our generalization of the Poincaré lemma.

Theorem 1. One has $H^{(N-1)n}_{(k)}(\Omega_N(\mathbb{R}^D)) = 0$, $\forall n \geq 1$ and $H^0_{(k)}(\Omega_N(\mathbb{R}^D))$ is the space of real polynomial functions on $\mathbb{R}^D$ of degree strictly less than $k$ (i.e. $\leq k - 1$) for $k \in \{1, \dots, N - 1\}$.

This statement reduces to the Poincaré lemma for $N = 2$ but it is a nontrivial generalization for $N \geq 3$ in the sense that, as we shall see, the spaces $H^p_{(k)}(\Omega_N(\mathbb{R}^D))$ are nontrivial for $p \neq (N - 1)n$ and, in fact, are generically infinite dimensional for $D \geq 3$, $p \geq N$.
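As a minimal illustration of the degree-zero part of Theorem 1, one may work out the case $N = 3$ by hand. The computation below is a sketch that assumes the normalization in which the Young symmetrizer of a one-row diagram with two cells is plain symmetrization; only the definitions (2) and the sequence $(Y^3_p)$ are used.

```latex
% N = 3: Y^3_0 is empty, Y^3_1 is one cell, Y^3_2 is one row of two cells,
% so \Omega^2_3(\mathbb{R}^D) consists of symmetric covariant 2-tensor fields.
\begin{align*}
(df)_\mu &= \partial_\mu f, \\
(d^2 f)_{\mu\nu} &= -\tfrac{1}{2}\left(\partial_\nu\partial_\mu f + \partial_\mu\partial_\nu f\right)
                  = -\partial_\mu\partial_\nu f, \\
H^0_{(1)}(\Omega_3(\mathbb{R}^D)) &= \mathrm{Ker}(d) = \{\, f = a \,\}, \\
H^0_{(2)}(\Omega_3(\mathbb{R}^D)) &= \mathrm{Ker}(d^2) = \{\, f = a + b_\mu x^\mu \,\}.
\end{align*}
```

In degree 0 nothing is quotiented out, since there are no fields of negative degree, and the two kernels are exactly the polynomials of degree strictly less than $k = 1$ and $k = 2$.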
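The connection between the complex of differential forms on $\mathbb{R}^D$ and the theory of classical gauge fields of spin 1 is well known. Namely the subcomplex

$$\Omega^0(\mathbb{R}^D) \xrightarrow{d} \Omega^1(\mathbb{R}^D) \xrightarrow{d} \Omega^2(\mathbb{R}^D) \xrightarrow{d} \Omega^3(\mathbb{R}^D) \tag{4}$$
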
has the following interpretation in terms of spin 1 gauge field theory. The space $\Omega^0(\mathbb{R}^D)$ ($= C^\infty(\mathbb{R}^D)$) is the space of infinitesimal gauge transformations, the space $\Omega^1(\mathbb{R}^D)$ is the space of gauge potentials (which are the appropriate description of spin 1 gauge fields to introduce local interactions). The subspace $d\Omega^0(\mathbb{R}^D)$ of $\Omega^1(\mathbb{R}^D)$ is the space of pure gauge configurations (which are physically irrelevant), $d\Omega^1(\mathbb{R}^D)$ is the space of field strengths or curvatures of gauge potentials. The identity $d^2 = 0$ ensures that the curvatures do not see the irrelevant pure gauge potentials whereas, at this level, the Poincaré lemma ensures that it is only these irrelevant configurations which are forgotten when one passes from gauge potentials to curvatures (by applying $d$). Finally $d^2 = 0$ also ensures that curvatures of gauge potentials satisfy the Bianchi identity, i.e. are in $\mathrm{Ker}(d : \Omega^2(\mathbb{R}^D) \to \Omega^3(\mathbb{R}^D))$, whereas at this level the Poincaré lemma implies that conversely the Bianchi identity characterizes the elements of $\Omega^2(\mathbb{R}^D)$ which are curvatures of gauge potentials.
Classical spin 2 gauge field theory is the linearization of Einstein geometric theory. In this case, the analog of (4) is a complex $E^1 \xrightarrow{d_1} E^2 \xrightarrow{d_2} E^3 \xrightarrow{d_3} E^4$, where $E^1$ is the space of covariant vector fields ($x \mapsto X_\mu(x)$) on $\mathbb{R}^D$, $E^2$ is the space of covariant symmetric tensor fields of degree 2 ($x \mapsto h_{\mu\nu}(x)$) on $\mathbb{R}^D$, $E^3$ is the space of covariant tensor fields of degree 4 ($x \mapsto R_{\lambda\mu,\rho\nu}(x)$) on $\mathbb{R}^D$ having the symmetries of the Riemann curvature tensor and where $E^4$ is the space of covariant tensor fields of degree 5 on $\mathbb{R}^D$ having the symmetries of the left-hand side of the Bianchi identity. The arrows $d_1$, $d_2$, $d_3$ are given by

$$(d_1 X)_{\mu\nu}(x) = \partial_\mu X_\nu(x) + \partial_\nu X_\mu(x),$$
$$(d_2 h)_{\lambda\mu,\rho\nu}(x) = \partial_\lambda \partial_\rho h_{\mu\nu}(x) + \partial_\mu \partial_\nu h_{\lambda\rho}(x) - \partial_\mu \partial_\rho h_{\lambda\nu}(x) - \partial_\lambda \partial_\nu h_{\mu\rho}(x),$$
$$(d_3 R)_{\lambda\mu\nu,\alpha\beta}(x) = \partial_\lambda R_{\mu\nu,\alpha\beta}(x) + \partial_\mu R_{\nu\lambda,\alpha\beta}(x) + \partial_\nu R_{\lambda\mu,\alpha\beta}(x).$$

The symmetry of $x \mapsto R_{\lambda\mu,\rho\nu}(x)$, $\begin{array}{|c|c|}\hline \lambda & \rho \\ \hline \mu & \nu \\ \hline \end{array}$, shows that $E^3 = \Omega^4_3(\mathbb{R}^D)$ and that $E^4 = \Omega^5_3(\mathbb{R}^D)$; furthermore one canonically has $E^1 = \Omega^1_3(\mathbb{R}^D)$ and $E^2 = \Omega^2_3(\mathbb{R}^D)$. One also sees that $d_1$ and $d_3$ are proportional to the 3-differential $d$ of $\Omega_3(\mathbb{R}^D)$, i.e. $d_1 \sim d : \Omega^1_3(\mathbb{R}^D) \to \Omega^2_3(\mathbb{R}^D)$ and $d_3 \sim d : \Omega^4_3(\mathbb{R}^D) \to \Omega^5_3(\mathbb{R}^D)$. The structure of $d_2$ looks different: it is of second order and increases by 2 the tensorial degree. However it is easy to see that it is proportional to $d^2 : \Omega^2_3(\mathbb{R}^D) \to \Omega^4_3(\mathbb{R}^D)$. Thus the analog of (4) is (for spin 2 gauge field theory)
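
$$\Omega^1_3(\mathbb{R}^D) \xrightarrow{d} \Omega^2_3(\mathbb{R}^D) \xrightarrow{d^2} \Omega^4_3(\mathbb{R}^D) \xrightarrow{d} \Omega^5_3(\mathbb{R}^D) \tag{5}$$
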
and the fact that it is a complex follows from $d^3 = 0$ whereas our generalized Poincaré lemma (Theorem 1) implies that it is in fact an exact sequence. Exactness at $\Omega^2_3(\mathbb{R}^D)$ is $H^2_{(2)}(\Omega_3(\mathbb{R}^D)) = 0$ and exactness at $\Omega^4_3(\mathbb{R}^D)$ is $H^4_{(1)}(\Omega_3(\mathbb{R}^D)) = 0$ (the exactness at $\Omega^4_3(\mathbb{R}^D)$ is the main statement of [17]).
Thus what plays the role of the complex of differential forms for the spin 1 (i.e. $\Omega_2(\mathbb{R}^D)$) is the 3-complex $\Omega_3(\mathbb{R}^D)$ for the spin 2. More generally, for the spin $S \in \mathbb{N}$,
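this role is played by the $(S + 1)$-complex $\Omega_{S+1}(\mathbb{R}^D)$. In particular, the analog of the sequence (4) for the spin 1 is the complex

$$\Omega^{S-1}_{S+1}(\mathbb{R}^D) \xrightarrow{d} \Omega^S_{S+1}(\mathbb{R}^D) \xrightarrow{d^S} \Omega^{2S}_{S+1}(\mathbb{R}^D) \xrightarrow{d} \Omega^{2S+1}_{S+1}(\mathbb{R}^D) \tag{6}$$
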
for the spin $S$. The fact that (6) is a complex was known [4]; it here follows from $d^{S+1} = 0$. One easily recognizes that $d^S : \Omega^S_{S+1}(\mathbb{R}^D) \to \Omega^{2S}_{S+1}(\mathbb{R}^D)$ is the generalized (linearized) curvature of [4]. Our Theorem 1 implies that sequence (6) is exact: exactness at $\Omega^S_{S+1}(\mathbb{R}^D)$ is $H^S_{(S)}(\Omega_{S+1}(\mathbb{R}^D)) = 0$ whereas exactness at $\Omega^{2S}_{S+1}(\mathbb{R}^D)$ is $H^{2S}_{(1)}(\Omega_{S+1}(\mathbb{R}^D)) = 0$ (exactness at $\Omega^S_{S+1}(\mathbb{R}^D)$ was directly proved in [5] for the case $S = 3$).
Finally, there is a generalization of Poincaré duality for $\Omega_N(\mathbb{R}^D)$, which is obtained by contractions of the columns with the Kronecker tensor $\varepsilon^{\mu_1 \dots \mu_D}$ of $\mathbb{R}^D$, that we shall describe in this paper. When combined with Theorem 1, this duality leads to another kind of result. A typical result of this kind is the following one. Let $T^{\mu\nu}$ be a symmetric contravariant tensor field of degree 2 on $\mathbb{R}^D$ satisfying $\partial_\mu T^{\mu\nu} = 0$ (like e.g. the stress energy tensor), then there is a contravariant tensor field $R^{\lambda\mu\rho\nu}$ of degree 4 with the symmetry $\begin{array}{|c|c|}\hline \lambda & \rho \\ \hline \mu & \nu \\ \hline \end{array}$ (i.e. the symmetry of the Riemann curvature tensor), such that

$$T^{\mu\nu} = \partial_\lambda \partial_\rho R^{\lambda\mu\rho\nu}. \tag{7}$$

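The easy converse of this statement, namely that any $T^{\mu\nu}$ of the form (7) is symmetric and divergence-free, can be checked by purely formal bookkeeping, since it only uses the commutativity of partial derivatives and the index symmetries of $R$. The Python sketch below does this for $D = 3$; the encoding (dictionaries keyed by sorted derivative indices and a canonical component of $R$) and all function names are illustrative assumptions, and only the pair antisymmetries and the pair-exchange symmetry of $R$ are imposed, which is all the check needs.

```python
from collections import defaultdict
from itertools import product

D = 3  # small dimension chosen only for this check

def canonical(a, b, c, d):
    """Canonical index order and sign for R^{ab,cd}, using
    R^{ab,cd} = -R^{ba,cd} = -R^{ab,dc} = R^{cd,ab}."""
    if a == b or c == d:
        return None, 0                      # vanishes by antisymmetry
    sign = 1
    if a > b:
        a, b, sign = b, a, -sign
    if c > d:
        c, d, sign = d, c, -sign
    first, second = sorted([(a, b), (c, d)])
    return first + second, sign

def T(mu, nu, extra=()):
    """Formal d_extra T^{mu nu}, with T^{mu nu} = d_l d_r R^{l mu, r nu} as in (7).
    Keys are (sorted derivative indices, canonical component of R)."""
    acc = defaultdict(int)
    for l, r in product(range(D), repeat=2):
        comp, sign = canonical(l, mu, r, nu)
        if sign:
            acc[(tuple(sorted((l, r) + tuple(extra))), comp)] += sign
    return {k: v for k, v in acc.items() if v}

def divergence(nu):
    """Formal d_mu T^{mu nu}, summed over mu."""
    acc = defaultdict(int)
    for mu in range(D):
        for key, v in T(mu, nu, extra=(mu,)).items():
            acc[key] += v
    return {k: v for k, v in acc.items() if v}

symmetric = all(T(m, n) == T(n, m) for m, n in product(range(D), repeat=2))
conserved = all(divergence(n) == {} for n in range(D))
```

Both flags come out true because every term of the divergence is paired with the term obtained by exchanging the derivative index with the first index of $R$, which carries the opposite sign. The nontrivial direction, the existence of $R$ for a given conserved $T$, is what Theorem 1 provides.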
In order to connect the result (7) with Theorem 1, define $\tau_{\mu_1 \dots \mu_{D-1} \nu_1 \dots \nu_{D-1}} = T^{\mu\nu} \varepsilon_{\mu\mu_1 \dots \mu_{D-1}} \varepsilon_{\nu\nu_1 \dots \nu_{D-1}}$. Then one has $\tau \in \Omega^{2(D-1)}_3(\mathbb{R}^D)$ and conversely, any $\tau \in \Omega^{2(D-1)}_3(\mathbb{R}^D)$ can be expressed in this form in terms of a symmetric contravariant 2-tensor. It is easy to verify that $d\tau = 0$ (in $\Omega_3(\mathbb{R}^D)$) is equivalent to $\partial_\mu T^{\mu\nu} = 0$. On the other hand, Theorem 1 implies that $H^{2(D-1)}_{(1)}(\Omega_3(\mathbb{R}^D)) = 0$, and therefore $\partial_\mu T^{\mu\nu} = 0$ implies that there is a $\rho \in \Omega^{2(D-2)}_3(\mathbb{R}^D)$ such that $\tau = d^2 \rho$. The latter is equivalent to (7) with $R^{\mu_1 \mu_2 \nu_1 \nu_2}$ proportional to $\varepsilon^{\mu_1 \mu_2 \dots \mu_D} \varepsilon^{\nu_1 \nu_2 \dots \nu_D} \rho_{\mu_3 \dots \mu_D \nu_3 \dots \nu_D}$ and one verifies that, so defined, $R$ has the correct symmetry. That symmetric tensor fields identically fulfilling $\partial_\mu T^{\mu\nu} = 0$ can be rewritten as in Eq. (7) has been used in [23] and more recently in [3] in the investigation of the consistent deformations of the free spin two gauge field action.
Besides their usefulness for computations (and for unifying various results) through the generalization of the Poincaré lemma (Theorem 1) and the generalization of the Poincaré duality, the $N$-complexes described in this paper give a class of nontrivial examples of $N$-complexes which are not related with simplicial modules. Indeed most nontrivial examples of $N$-complexes considered in [6–8, 19, 21, 20] are of simplicial type and it was shown in [7] that such $N$-complexes compute the ordinary (co)homologies of the simplicial modules (see also [20] for the Hochschild case). Furthermore results of that kind have been recently extended to the cyclic context in [24], where new proofs of the above results have been carried out. This does not mean that $N$-complexes associated with simplicial modules are not useful; for instance in [14] such an $N$-complex (related with a simplicial Hochschild module) was needed for the construction of a natural generalized BRS-theory [1, 18] for the zero modes of the $SU(2)$ WZNW-model, see [9] for a general review. It is however very desirable to produce useful examples which are not of simplicial type and, apart from the universal construction of [12] (and some finite-dimensional examples [7, 12]), the examples produced here are the first ones escaping from the simplicial frame.
Many results of this paper were announced in our letter [10] so an important part of it is devoted to the proofs of these results, in particular to the proof of Theorem 1 above which generalizes the Poincaré lemma. In order that the paper be self-contained we recall some basic definitions and results on Young diagrams and representations of the linear group which are needed here. Throughout the paper, we work in the real setting, so all vector spaces are on the field $\mathbb{R}$ of real numbers (this obviously generalizes to any commutative field $\mathbb{K}$ of characteristic zero).
The plan of the paper is the following. After this introduction we discuss Young diagrams, Young symmetry types for tensors and we define in this context a notion of contraction. Section 3 is devoted to the construction of the basic $N$-complex of tensor fields on $\mathbb{R}^D$ considered in this paper, namely $\Omega_N(\mathbb{R}^D)$, and the description of the generalized Poincaré (Hodge) duality in this context. In Sect. 4 we introduce a multicomplex on $\mathbb{R}^D$ and we analyse its cohomological properties; Theorem 2 proved there, which is by itself of interest, will be the basic ingredient in the proof of our generalization of the Poincaré lemma, i.e. of Theorem 1. Section 5 contains this proof of Theorem 1. In Sect. 6 we analyse the structure of the generalized cohomology of $\Omega_N(\mathbb{R}^D)$ in the degrees which are not exhausted by Theorem 1. The $N$-complex $\Omega_N(\mathbb{R}^D)$ is a generalization of the complex $\Omega(\mathbb{R}^D) = \Omega_2(\mathbb{R}^D)$ of differential forms on $\mathbb{R}^D$; in Sect. 7 we define another generalization $\Omega_{[N]}(\mathbb{R}^D)$ of the complex of differential forms which is also an $N$-complex and which is an associative graded algebra acting on the graded space $\Omega_N(\mathbb{R}^D)$. In Sect. 8, which plays the role of a conclusion, we sketch another possible proof of Theorem 1 based on a generalization of algebraic homotopy for $N$-complexes. In this section we also define natural $N$-complexes of tensor fields on complex manifolds which generalize the usual $\bar{\partial}$-complex (of forms in $d\bar{z}$).

2. Young Diagrams and Tensors

For the Young diagrams etc. we use throughout the conventions of [16]. A Young diagram $Y$ is a diagram which consists of a finite number $r > 0$ of rows of identical squares (referred to as the cells) of finite decreasing lengths $m_1 \geq m_2 \geq \dots \geq m_r > 0$ which are arranged with their left-hand ends under one another. The lengths $\tilde{m}_1, \dots, \tilde{m}_c$ of the columns of $Y$ are also decreasing, $\tilde{m}_1 \geq \dots \geq \tilde{m}_c > 0$, and are therefore the rows of another Young diagram $\tilde{Y}$ with $\tilde{r} = c$ rows. The Young diagram $\tilde{Y}$ is obtained by flipping $Y$ over its diagonal (from upper left to lower right) and is referred to as the conjugate of $Y$. Notice that one has $\tilde{m}_1 = r$ and therefore also $m_1 = \tilde{r} = c$ and that $m_1 + \dots + m_r = \tilde{m}_1 + \dots + \tilde{m}_c$ is the total number of cells of $Y$ which will be denoted by $|Y|$. It is convenient to add the empty Young diagram $Y_0$ characterized by $|Y_0| = 0$. The figure below describes a Young diagram $Y$ and its conjugate $\tilde{Y}$:
[Figure: a Young diagram $Y$ and its conjugate $\tilde{Y}$]
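The passage from the row lengths $m_1, \dots, m_r$ to the column lengths $\tilde{m}_1, \dots, \tilde{m}_c$ is simple to compute. The following Python sketch does so; the function name `conjugate` and the encoding of a diagram as a list of row lengths are illustrative assumptions.

```python
def conjugate(rows: list[int]) -> list[int]:
    """Column lengths of the Young diagram with the given row lengths,
    i.e. the row lengths of the conjugate diagram."""
    if any(a < b for a, b in zip(rows, rows[1:])) or (rows and rows[-1] <= 0):
        raise ValueError("row lengths must be positive and weakly decreasing")
    width = rows[0] if rows else 0
    # Column j contains one cell for every row that is longer than j.
    return [sum(1 for m in rows if m > j) for j in range(width)]
```

For example `conjugate([4, 2, 1])` gives `[3, 2, 1, 1]`: the first column has as many cells as there are rows, the number of columns equals the length of the first row, and applying `conjugate` twice returns the original diagram.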
In the following $E$ denotes a finite-dimensional vector space of dimension $D$ and $E^*$ denotes its dual. The $n$th tensor power $E^{\otimes n}$ of $E$ identifies canonically with the space of multilinear forms on $(E^*)^n$. Let $Y$ be a Young diagram and let us consider that the $|Y|$ copies of $E^*$ in $(E^*)^{|Y|}$ are labelled by the cells of $Y$ so that an element of $(E^*)^{|Y|}$ is given by specifying an element of $E^*$ for each cell of $Y$. The Schur module $E^Y$ is defined to be the vector space of all multilinear forms $T$ on $(E^*)^{|Y|}$ such that:
(i) $T$ is completely antisymmetric in the entries of each column of $Y$,
(ii) complete antisymmetrization of $T$ in the entries of a column of $Y$ and another entry of $Y$ which is on the right-hand side of the column vanishes.
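As a sanity check on this definition, the dimension of $E^Y$ can be computed from the standard hook content formula for irreducible representations of $GL(E)$. This formula is not stated in the text and is used here as an outside fact; the function name `schur_dimension` and the row-length encoding are illustrative assumptions.

```python
from fractions import Fraction

def schur_dimension(rows, D):
    """Dimension of the Schur module E^Y for dim E = D, via the hook content
    formula: product over cells (i, j) of (D + j - i) / hook(i, j)."""
    cols = [sum(1 for m in rows if m > j) for j in range(rows[0])] if rows else []
    dim = Fraction(1)
    for i, m in enumerate(rows):          # i = row index, starting at 0
        for j in range(m):                # j = column index, starting at 0
            hook = (m - j) + (cols[j] - i) - 1   # arm + leg + 1
            dim *= Fraction(D + j - i, hook)
    return int(dim)
```

With $D = 4$ this gives 6 for one column of two cells (2-forms), 10 for one row of two cells (symmetric 2-tensors), 20 for the two-by-two square (the symmetries of the Riemann tensor), and 0 as soon as the first column has more than $D$ cells.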
Notice that $E^Y = 0$ if the first column of $Y$ has length $\tilde{m}_1 > D$. One has $E^Y \subset E^{\otimes |Y|}$ and $E^Y$ is an invariant subspace for the action of $GL(E)$ on $E^{\otimes |Y|}$ which is irreducible. Furthermore each irreducible subspace of $E^{\otimes n}$ for the action of $GL(E)$ is isomorphic to $E^Y$ with the above action of $GL(E)$ for some Young diagram $Y$ with $|Y| = n$.
Let $Y$ be a Young diagram and let $T$ be an arbitrary multilinear form on $(E^*)^{|Y|}$ ($T \in E^{\otimes |Y|}$). Define the multilinear form $\mathcal{Y}(T)$ on $(E^*)^{|Y|}$ by

$$\mathcal{Y}(T) = \sum_{p \in R} \sum_{q \in C} (-1)^{\varepsilon(q)}\, T \circ p \circ q,$$

where $C$ is the group of the permutations which permute the entries of each column and $R$ is the group of the permutations which permute the entries of each row of $Y$. One has $\mathcal{Y}(T) \in E^Y$ and the endomorphism $\mathcal{Y}$ of $E^{\otimes |Y|}$ satisfies $\mathcal{Y}^2 = \lambda \mathcal{Y}$ for some number $\lambda \neq 0$. Thus $\mathbf{Y} = \lambda^{-1} \mathcal{Y}$ is a projection of $E^{\otimes |Y|}$ into itself, $\mathbf{Y}^2 = \mathbf{Y}$, with image $\mathrm{Im}(\mathbf{Y}) = E^Y$. The projection $\mathbf{Y}$ will be referred to as the Young symmetrizer (relative to $E$) of the Young diagram $Y$. The element $e_Y = \lambda^{-1} \sum_{p \in R} \sum_{q \in C} (-1)^{\varepsilon(q)} pq$ of the group algebra of the group $S_{|Y|}$ of permutations of $\{1, \dots, |Y|\}$ is an idempotent which will be referred to as the Young idempotent of $Y$.
By composition of $\mathbf{Y}$ as above with the canonical multilinear mapping of $E^{|Y|}$ into $E^{\otimes |Y|}$ one obtains a multilinear mapping $v \mapsto v^Y$ of $E^{|Y|}$ into $E^Y$. The Schur module $E^Y$ together with the mapping $v \mapsto v^Y$ are characterized uniquely up to an isomorphism by the following universal property: for any multilinear mapping $\phi : E^{|Y|} \to F$ of $E^{|Y|}$ into a vector space $F$ satisfying
(i) $\phi$ is completely antisymmetric in the entries of each column of $Y$,
(ii) complete antisymmetrization of $\phi$ in the entries of a column of $Y$ and another entry of $Y$ which is on the right-hand side of the column vanishes,
there is a unique linear mapping $\phi^Y : E^Y \to F$ such that $\phi(v) = \phi^Y(v^Y)$. By construction $v \mapsto v^Y$ satisfies the conditions (i) and (ii) above.
There is an obvious notion of inclusion for Young diagrams, namely $Y'$ is included in $Y$, $Y' \subset Y$, if one has this inclusion for the corresponding subsets of the plane whenever their upper left cells coincide. This means for instance that $Y' \subset Y$ whenever the length $c = m_1$ of the first row of $Y$ is greater than or equal to the length $c' = m'_1$ of the first row of $Y'$ and, for any $1 \leq i \leq c'$, the length $\tilde{m}_i$ of the $i$th column of $Y$ is greater than or equal to the length $\tilde{m}'_i$ of the $i$th column of $Y'$ ($c \geq c'$ and $\tilde{m}_i \geq \tilde{m}'_i$ for $1 \leq i \leq c'$). In the following we shall need a stronger notion.
A Young diagram $Y'$ is strongly included in another one $Y$, and we write $Y' \subset\subset Y$, if the length of the first row of $Y$ is greater than or equal to the length of the first row of $Y'$ and if the length of the last column of $Y$ is greater than or equal to the length of the first column of $Y'$. Notice that this relation is not reflexive: one has $Y \subset\subset Y$ if and only if $Y$ is rectangular, which means that all its columns have the same length or equivalently all its rows have the same length. It is clear that $Y' \subset\subset Y$ implies $Y' \subset Y$.
Let $Y$ and $Y'$ be Young diagrams such that $Y' \subset\subset Y$ and let $\tilde{m}_1 \geq \dots \geq \tilde{m}_c > 0$ be the lengths of the columns of $Y$ and $\tilde{m}'_1 \geq \dots \geq \tilde{m}'_{c'} > 0$ be the lengths of the columns of $Y'$; one has $c \geq c'$ and $\tilde{m}_c \geq \tilde{m}'_1$. Define the contraction of $Y$ by $Y'$ to be the Young diagram $\mathcal{C}(Y|Y')$ obtained from $Y$ by dropping $\tilde{m}'_1$ cells of the last, i.e. the $c$th, column of $Y$, $\tilde{m}'_2$ cells of the $(c - 1)$th column of $Y$, ..., $\tilde{m}'_{c'}$ cells of the $(c - c' + 1)$th column of $Y$. If $\tilde{m}_c$ is strictly greater than $\tilde{m}'_1$ then $\mathcal{C}(Y|Y')$ has $c$ columns as $Y$; however if $\tilde{m}_c = \tilde{m}'_1$ then the number of columns of $\mathcal{C}(Y|Y')$ is strictly smaller than $c$ (it is $c - 1$ if $\tilde{m}_{c-1}$ is strictly greater than $\tilde{m}'_2$, etc.). Notice that if $Y$ is rectangular then $\mathcal{C}(Y|Y') \subset\subset Y$ and $\mathcal{C}(Y|\mathcal{C}(Y|Y')) = Y'$ so that $Y' \mapsto \mathcal{C}(Y|Y')$ is then an involution on the set of Young diagrams $Y'$ which are strongly included in $Y$ ($Y' \subset\subset Y$).
Let again $Y$ and $Y'$ be Young diagrams with $Y' \subset\subset Y$. Our aim is now to define a bilinear mapping $(T, T') \mapsto C(T|T')$ of $E^Y \times E^{*Y'}$ into $E^{\mathcal{C}(Y|Y')}$. This will be obtained by restriction of a bilinear mapping $(T, T') \mapsto C(T|T')$ of $E^{\otimes |Y|} \times E^{*\otimes |Y'|}$ into $E^{\otimes |\mathcal{C}(Y|Y')|}$ which will be an ordinary (complete) tensorial contraction. Any such tensorial contraction associates to a contravariant tensor $T$ of degree $|Y|$ (i.e. $T \in E^{\otimes |Y|}$) and a covariant tensor $T'$ of degree $|Y'|$ (i.e. $T' \in E^{*\otimes |Y'|}$) a contravariant tensor of degree $|\mathcal{C}(Y|Y')|$ ($Y' \subset\subset Y$).
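The combinatorics of strong inclusion and of the contraction of diagrams can be sketched in a few lines of Python. Diagrams are encoded here by their lists of column lengths, and the function names `strongly_included` and `contract` are illustrative assumptions rather than notation from the text.

```python
def strongly_included(cols_small, cols_big):
    """True if Y' (columns cols_small) is strongly included in Y (columns cols_big):
    Y has at least as many columns as Y', and the last column of Y is at least
    as long as the first column of Y'."""
    first_small = cols_small[0] if cols_small else 0
    last_big = cols_big[-1] if cols_big else 0
    return len(cols_big) >= len(cols_small) and last_big >= first_small

def contract(cols_big, cols_small):
    """Column lengths of the contraction of Y by Y'."""
    if not strongly_included(cols_small, cols_big):
        raise ValueError("Y' must be strongly included in Y")
    c = len(cols_big)
    padded = list(cols_small) + [0] * (c - len(cols_small))
    # Column j of Y (counted from 0 on the left) loses as many cells as
    # column c - 1 - j of Y' contains: the last column of Y pairs with the
    # first column of Y', and so on.
    result = [m - padded[c - 1 - j] for j, m in enumerate(cols_big)]
    return [m for m in result if m > 0]
```

For the rectangular diagram with columns `[3, 3, 3]`, contracting by `[2, 1]` gives `[3, 2, 1]` and contracting by `[3, 2, 1]` gives back `[2, 1]`, which illustrates the involution property on diagrams strongly included in a rectangle.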
In order to specify such a contraction, one has to specify the entries of $T$, that is of $Y$, to which each entry of $T'$, that is of $Y'$, is contracted (recalling that $T$ is a linear combination of canonical images of elements of $E^{|Y|}$ and that $T'$ is a linear combination of canonical images of elements of $E^{*|Y'|}$). In order that $C(T|T')$ has the right antisymmetry in the entries of each column of $\mathcal{C}(Y|Y')$ when $T \in E^Y$ and $T' \in E^{*Y'}$, one has to contract the entries of $T'$ corresponding to the $i$th column of $Y'$ with entries of $T$ corresponding to the $(c - i + 1)$th column of $Y$. The precise choice and the order of the latter entries is irrelevant up to a sign in view of the antisymmetry in the entries of a column. Our choice is to contract the first entry of the $i$th column of $Y'$ with the last entry of the $(c - i + 1)$th column of $Y$, the second entry of the $i$th column of $Y'$ with the penultimate entry of the $(c - i + 1)$th column of $Y$, etc. for any $1 \leq i \leq c'$ (with obvious conventions). This fixes the bilinear mapping $(T, T') \mapsto C(T|T')$ of $E^{\otimes |Y|} \times E^{*\otimes |Y'|}$ into $E^{\otimes |\mathcal{C}(Y|Y')|}$. The following figure describes pictorially in a particular case the construction of $\mathcal{C}(Y|Y')$ as well as the places where the contractions are carried out in the corresponding construction of $C(T|T')$:
[Figure: the Young diagrams $Y$, $Y'$ and $\mathcal{C}(Y|Y')$, with the contracted cells indicated]
Proposition 1. Let $T$ be an element of $E^Y$ and $T'$ be an element of $E^{*Y'}$ with $Y' \subset\subset Y$. Then $C(T|T')$ is an element of $E^{\mathcal{C}(Y|Y')}$.

Proof. As before, we identify $C(T|T') \in E^{\otimes |\mathcal{C}(Y|Y')|}$ with a multilinear form on $E^{*|\mathcal{C}(Y|Y')|}$. To show that $C(T|T')$ is in $E^{\mathcal{C}(Y|Y')}$ means verifying properties (i) and (ii) above. Property (i), i.e. antisymmetry in the column entries of $\mathcal{C}(Y|Y')$, is clear. Property (ii) has to be verified for each column of $\mathcal{C}(Y|Y')$ and each entry on its right-hand side, which can be chosen to be the first entry of a column on the right-hand side (in view of the column antisymmetry). If the column is the last one it has no entry on the right-hand side so there is nothing to verify, and if the column is a full column of $Y$, i.e. has not been contracted, which is the case for the $i$th column with $i \leq c - c'$, property (ii) follows from the same property for $T$ (assumption $T \in E^Y$). Thus to achieve the proof of the proposition we only need to verify property (ii) in the case where both $Y$ and $Y'$ have exactly two columns, of lengths say $\tilde{m}_1 \geq \tilde{m}_2$ for $Y$ and $\tilde{m}'_1 \geq \tilde{m}'_2$ for $Y'$, with $\tilde{m}_2 > \tilde{m}'_1$. In this case $\mathcal{C}(Y|Y')$ has also two columns, of lengths $\tilde{m}_1 - \tilde{m}'_2$ and $\tilde{m}_2 - \tilde{m}'_1$ ($\tilde{m}_1 - \tilde{m}'_2 \geq \tilde{m}_2 - \tilde{m}'_1 > 0$), and one has to verify that antisymmetrization of the first entry of the second column of $\mathcal{C}(Y|Y')$ with the entries of the first column (of length $\tilde{m}_1 - \tilde{m}'_2$) of $\mathcal{C}(Y|Y')$ in $C(T|T')$ gives zero. We know that antisymmetrization with all entries of the first column of $Y$ gives zero (for $T$); however when contracted with $T'$ this identity implies a sum of antisymmetrizations of the entries of the first column of $Y'$ with the successive entries of its second column for $T'$, which gives zero ($T' \in E^{*Y'}$), and reduces therefore to the desired antisymmetrization with the $\tilde{m}_1 - \tilde{m}'_2$ first entries.
3. Generalized Complexes of Tensor Fields

Throughout this section $(Y)$ denotes not just one Young diagram but a sequence $(Y) = (Y_p)_{p \in \mathbb{N}}$ of Young diagrams $Y_p$ such that the number of cells of $Y_p$ is equal to $p$, that is $|Y_p| = p$, $\forall p \in \mathbb{N}$. Notice that there is no freedom for $Y_0$ and $Y_1$: $Y_0$ must be the empty Young diagram and $Y_1$ is the Young diagram with one cell.
Let us denote by $\wedge_{(Y)} E$ the direct sum $\oplus_{p \in \mathbb{N}} E^{Y_p}$ of the Schur modules $E^{Y_p}$. This is a graded vector space with $\wedge^p_{(Y)} E = E^{Y_p}$. The origin of this notation is that for the sequence $(Y^2) = (Y^2_p)$ of the one column Young diagrams, i.e. $Y^2_p$ is the Young diagram with $p$ cells in one column for any $p \in \mathbb{N}$, $\wedge_{(Y^2)} E$ is the exterior algebra $\wedge E$ of $E$.
In the following, we shall be interested in particular sequences $(Y^N) = (Y^N_p)_{p \in \mathbb{N}}$ of Young diagrams satisfying the assumption of Lemma 1 (as explained in the introduction). The sequence $(Y^N)$ contains Young diagrams $Y^N_p$ in which all the rows but the last one are of length $N - 1$, the last one being of length smaller than or equal to $N - 1$ in such a way that $|Y^N_p| = p$ ($\forall p \in \mathbb{N}$). Pictorially one has for instance for $N = 5$,

[Figure: the Young diagrams $Y^5_3$, $Y^5_{22}$ and $Y^5_{24}$]
and so on. In this case $\wedge_{(Y^N)} E$ and $\wedge^p_{(Y^N)} E = E^{Y^N_p}$ will be simply denoted by $\wedge_N E$ and $\wedge^p_N E$ respectively. Notice that $\wedge^p_N E = 0$ for $p > (N - 1)D$ ($D = \dim E$), so that $\wedge_N E = \oplus_{p=0}^{(N-1)D} \wedge^p_N E$ is finite-dimensional.
Let us assume that $E$ is equipped with a dual volume, i.e. a non-vanishing element $\varepsilon$ of $\wedge^D E$ ($= \wedge^D_2 E$), which is therefore a basis of the 1-dimensional space $\wedge^D E$. It is straightforward that $\varepsilon^{\otimes (N-1)}$ is in $\wedge^{(N-1)D}_N E = E^{Y^N_{(N-1)D}}$ because (i) is obvious whereas (ii) is trivial, i.e. empty. The Young diagram $Y^N_{(N-1)D}$ is rectangular so that each Young diagram which is included in $Y^N_{(N-1)D}$ is in fact strongly included in $Y^N_{(N-1)D}$; this is in particular the case for the $Y^N_p$ for $p \leq (N - 1)D$. One then defines a linear isomorphism $* : \wedge_N E^* \to \wedge_N E$ generalizing the algebraic part of the Poincaré (Hodge) duality by setting

$$*\omega = C(\varepsilon^{\otimes (N-1)} | \omega) \tag{8}$$

for $\omega \in \wedge_N E^*$. One has

$$* \wedge^p_N E^* = \wedge^{(N-1)D-p}_N E \tag{9}$$

for $p = 0, \dots, (N - 1)D$.
Let $(e_\mu)_{\mu \in \{1, \dots, D\}}$ be a basis of $E$ and let $(\theta^\mu)$ be the dual basis of $E^*$. Our aim is to be able to compute in terms of the components of tensors for the various concepts connected with Young diagrams. For this, one has to decide the linear order in which one writes the components of a tensor $T \in E^{\otimes |Y|}$ or, which is the same, of a multilinear form $T$ on $E^{*|Y|}$ for any given Young diagram $Y$. Since we have labelled the arguments (entries) of such a $T$ by the cells of $Y$ and since the components are obtained by taking the arguments among the $\theta^\mu$, this means that one has to choose an order for the cells of $Y$ (i.e. a way to “read the diagram” $Y$). One natural choice is to read the rows of $Y$ from left to right and then from top to bottom (like a book); another natural choice is to read the columns of $Y$ from top to bottom and then from left to right. Although the first choice is very natural with respect to the sequences $(Y^N)$ of Young diagrams introduced above and will be used later, we shall choose the second way of ordering in the following. The reason is that when $T$ belongs to the Schur module $E^Y$, then it is (property (i)) antisymmetric in the entries of each column. Thus if $Y$ has columns of lengths $\tilde{m}_1 \geq \dots \geq \tilde{m}_c$ ($> 0$ for $|Y| \neq 0$) our choice is induced by the canonical identification

$$E^Y \subset \wedge^{\tilde{m}_1} E \otimes \dots \otimes \wedge^{\tilde{m}_c} E \tag{10}$$

of the Schur module $E^Y$ as a subspace of $\wedge^{\tilde{m}_1} E \otimes \dots \otimes \wedge^{\tilde{m}_c} E$, where $\wedge^p E = \wedge^p_2 E$ is the $p$th exterior power of $E$. With the above choice, the components (relative to the basis $(e_\mu)$ of $E$) of $T \in E^{\otimes |Y|}$ read $T^{\mu_1^1 \dots \mu_1^{\tilde{m}_1}, \dots, \mu_c^1 \dots \mu_c^{\tilde{m}_c}}$ and $T \in E^Y$ if and only if these components are completely antisymmetric in the $\mu_r^1, \dots, \mu_r^{\tilde{m}_r}$ for each $r \in \{1, \dots, c\}$ and such that complete antisymmetrization in the $\mu_r^1, \dots, \mu_r^{\tilde{m}_r}$ and $\mu_s^1$ gives zero for any $1 \leq r < s \leq c$.
We have defined for a sequence $(Y) = (Y_p)$ of Young diagrams with $|Y_p| = p$ ($\forall p \in \mathbb{N}$) the graded vector space $\wedge_{(Y)} E$ which can be considered as a generalization of the exterior algebra $\wedge E$ as explained above. We now wish to define the corresponding generalization of differential forms. Let $M$ be a $D$-dimensional smooth manifold. For
any Young diagram $Y$ one has the smooth vector bundle $T^{*Y}(M)$ over $M$ of the Schur modules $(T^*_x(M))^Y$, $x \in M$. Correspondingly, for $(Y)$ as above, one has the smooth bundle $\wedge_{(Y)} T^*(M)$ over $M$ of graded vector spaces $\wedge_{(Y)} T^*_x(M)$. The graded $C^\infty(M)$-module $\Omega_{(Y)}(M)$ of smooth sections of $\wedge_{(Y)} T^*(M)$ is the generalization of differential forms corresponding to $(Y)$. In order to generalize the exterior differential one has to choose a connection $\nabla$ on the vector bundle $T^*(M)$, that is a linear connection $\nabla$ on $M$. Such a connection extends canonically as linear mappings

$$\nabla : \Omega^p_{(Y)}(M) \to \Omega^p_{(Y)}(M) \otimes_{C^\infty(M)} \Omega^1(M),$$

where $\Omega^1(M) = \Omega^1_{(Y)}(M)$ is the $C^\infty(M)$-module of smooth sections of $T^*(M)$ (i.e. of differential 1-forms), satisfying

$$\nabla(\alpha f) = \nabla(\alpha) f + \alpha \otimes df$$

for any $\alpha \in \Omega^p_{(Y)}(M)$ and $f \in C^\infty(M)$ and where $d$ is the ordinary differential of $C^\infty(M)$ into $\Omega^1(M)$. Notice that for any sequence $(Y)$ of Young diagrams as above, one has $\Omega^0_{(Y)}(M) = \Omega^0(M) = C^\infty(M)$ and $\Omega^1_{(Y)}(M) = \Omega^1(M)$ since one has no choice for $Y_0$ and $Y_1$. Let us define the generalization of the covariant exterior differential $d_\nabla : \Omega_{(Y)}(M) \to \Omega_{(Y)}(M)$ by

$$d_\nabla = (-1)^p \mathbf{Y}_{p+1} \circ \nabla : \Omega^p_{(Y)}(M) \to \Omega^{p+1}_{(Y)}(M) \tag{11}$$

for any $p \in \mathbb{N}$. Notice that $d_\nabla = d$ on $C^\infty(M) = \Omega^0_{(Y)}(M)$ and that $d_\nabla$ is a first order differential operator. Lemma 1 in the introduction admits the following generalization.

Lemma 2. Let $N$ be an integer with $N \geq 2$ and assume that $(Y)$ is such that the number of columns of the Young diagram $Y_p$ is strictly smaller than $N$ for any $p \in \mathbb{N}$. Then $(d_\nabla)^N$ is a differential operator of order strictly smaller than $N$. If $\nabla$ is torsion-free, then $(d_\nabla)^N$ is of order strictly smaller than $N - 1$. If furthermore $\nabla$ has vanishing torsion and curvature then one has $(d_\nabla)^N = 0$.

The proof is straightforward. In the case $N = 2$, if $\nabla$ is torsion-free, $(d_\nabla)^2$ is not only an operator of order zero but $(d_\nabla)^2 = 0$ follows from the first Bianchi identity; however in this case, for $(Y^2)$, $d_\nabla$ coincides with the ordinary exterior differential. For the sequences $(Y^N) = (Y^N_p)$ we denote $\Omega_{(Y^N)}(M)$ and $\Omega^p_{(Y^N)}(M)$ simply by $\Omega_N(M)$ and $\Omega^p_N(M)$. As already mentioned $\Omega_2(M)$ is the graded algebra $\Omega(M)$ of differential forms on $M$.
Not every $M$ admits a flat torsion-free linear connection. In the following we shall concentrate on $\Omega_N(\mathbb{R}^D)$ equipped with $d = d_{\overset{(0)}{\nabla}}$, where $\overset{(0)}{\nabla}$ is the canonical flat torsion-free connection of $\mathbb{R}^D$. So equipped, $\Omega_N(\mathbb{R}^D)$ is an $N$-complex. One has of course $\Omega_N(\mathbb{R}^D) = \wedge_N \mathbb{R}^{D*} \otimes C^\infty(\mathbb{R}^D)$. Let us equip $\mathbb{R}^D$ with the dual volume $\varepsilon \in \wedge^D \mathbb{R}^D$ which is the completely antisymmetric contravariant tensor of maximal degree with component $\varepsilon^{1 \dots D} = 1$ in the canonical basis of $\mathbb{R}^D$. Then the corresponding isomorphism $* : \wedge_N \mathbb{R}^{D*} \to \wedge_N \mathbb{R}^D$ extends by $C^\infty(\mathbb{R}^D)$-linearity as an isomorphism of $C^\infty(\mathbb{R}^D)$-modules, again denoted by $*$, of $\Omega_N(\mathbb{R}^D)$ into the space (of contravariant tensor fields on $\mathbb{R}^D$) $\wedge_N \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D)$ with

$$* \Omega^p_N(\mathbb{R}^D) = \wedge^{(N-1)D-p}_N \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D)$$

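for any $0 \le p \le (N-1)D$. Let us define the first-order differential operator $\delta$ of degree $-1$ on $\wedge_N \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D)$,
\[ \delta : \wedge_N^{(N-1)p+r}\, \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D) \to \wedge_N^{(N-1)p+r-1}\, \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D), \]
by setting
\[ \delta T = \mathbf{Y}^N_{(N-1)p+r-1} \circ \tilde{\delta} T \tag{12} \]
for $T \in \wedge_N^{(N-1)p+r}\, \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D)$ with $0 \le p < D$ and $1 \le r \le N-1$, $\tilde{\delta}$ being defined by
\[ (\tilde{\delta} T)^{\mu_1^1 \dots \mu_1^{p+1},\, \dots,\, \mu_{r-1}^1 \dots \mu_{r-1}^{p+1},\, \mu_r^1 \dots \mu_r^p,\, \dots,\, \mu_{N-1}^1 \dots \mu_{N-1}^p} = \partial_\mu T^{\mu_1^1 \dots \mu_1^{p+1},\, \dots,\, \mu_r^1 \dots \mu_r^p \mu,\, \dots,\, \mu_{N-1}^1 \dots \mu_{N-1}^p}, \]
where we have used the canonical identification (10) and the conventions explained below (10). It is worth noticing here that in view (essentially) of Proposition 1, one has $\delta T = \tilde{\delta} T$ for $r = N-1$, i.e. in this case (well-filled case) the projection is not necessary in formula (12). So defined, $(\delta T)(x)$ is by construction in $\wedge_N^{(N-1)p+r-1}\, \mathbb{R}^D$ and the operator $\delta$ is in each degree proportional to the operator $* d *^{-1}$, i.e. one has
\[ \delta = c_n * d *^{-1} : \wedge_N^n \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D) \to \wedge_N^{n-1} \mathbb{R}^D \otimes C^\infty(\mathbb{R}^D) \tag{13} \]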
for some $c_n \in \mathbb{R}$, $1 \le n \le (N-1)D$ ($\delta = 0$ in degree zero).

4. Digression on a Related Multicomplex

In this section, we introduce a multicomplex which will be related to our $N$-complex $\Omega_N(\mathbb{R}^D)$ in the next section. We also derive some useful cohomological results in this multicomplex, which will be the key for proving our generalization of the Poincaré lemma, that is, Theorem 1. Let $\mathcal{A}$ be the graded tensor product of $N-1$ copies of the exterior algebra $\wedge \mathbb{R}^{D*}$ of the dual space $\mathbb{R}^{D*}$ of $\mathbb{R}^D$ with $C^\infty(\mathbb{R}^D)$,
\[ \mathcal{A} = (\otimes^{N-1} \wedge \mathbb{R}^{D*}) \otimes C^\infty(\mathbb{R}^D) = \otimes^{N-1}_{C^\infty(\mathbb{R}^D)}\, \Omega(\mathbb{R}^D). \]
An element of $\mathcal{A}$ is a sum of products of the $(N-1)D$ generators $d_i x^\mu$ ($i = 1, \dots, N-1$, $\mu = 1, \dots, D$) with smooth functions on $\mathbb{R}^D$. Elements of $\mathcal{A}$ will be referred to as multiforms. The space $\mathcal{A}$ is a graded-commutative algebra for the total degree; in particular one has
\[ d_i x^\mu\, d_j x^\nu = - d_j x^\nu\, d_i x^\mu, \qquad x^\mu\, d_i x^\nu = d_i x^\nu\, x^\mu . \]
One defines $N-1$ antiderivations $d_i$ on $\mathcal{A}$ by setting
\[ d_i f = \partial_\mu f\, d_i x^\mu \quad (f \in C^\infty(\mathbb{R}^D)), \qquad d_i(d_j x^\mu) = 0. \tag{14} \]
These antiderivations anticommute,
\[ d_i d_j + d_j d_i = 0, \tag{15} \]
in particular each $d_i$ is a differential. The graded algebra $\mathcal{A}$ has a natural multidegree $(d_1, d_2, \dots, d_{N-1})$ for which $d_i(d_j x^\mu) = \delta_{ij}$ (in this formula $d_i$ denotes the $i$-th partial degree, not the antiderivation). It is useful to consider the subspaces $\mathcal{A}^{(k)}$ of multiforms that vanish at the origin, together with all their successive derivatives up to order $k-1$ included ($k \ge 1$). If $\omega \in \mathcal{A}^{(k)}$, one says that $\omega$ is of order $k$. The terminology comes from the fact that a smooth function belongs to $\mathcal{A}^{(k)}$ if and only if its Taylor expansion at the origin starts with terms of order $\ge k$. If $l \ge k$, $\mathcal{A}^{(l)} \subset \mathcal{A}^{(k)}$. The subspaces $\mathcal{A}^{(k)}$ are not stable under $d_i$ but one has $d_i \mathcal{A}^{(k)} \subset \mathcal{A}^{(k-1)}$ for $k \ge 1$ (with $\mathcal{A}^{(0)} \equiv \mathcal{A}$). The vector space $H^{(k)}(d_i, \mathcal{A})$ is defined as
\[ H^{(k)}(d_i, \mathcal{A}) \equiv \frac{Z^{(k)}(d_i, \mathcal{A})}{d_i \mathcal{A}^{(k+1)}}, \]
where $Z^{(k)}(d_i, \mathcal{A})$ is the set of $d_i$-cocycles in $\mathcal{A}^{(k)}$. Note that any multiform $\omega \in \mathcal{A}$ can be written as $\omega = p^{(k)} + \beta$, where $p^{(k)}$ is a polynomial multiform of polynomial degree $k$ and $\beta \in \mathcal{A}^{(k+1)}$. This decomposition is unique, which implies in particular that $\mathcal{A}^{(k)} \cap d_i \mathcal{A} = d_i \mathcal{A}^{(k+1)}$. It follows from the standard Poincaré lemma that
\[ H^{(1)}(d_i, \mathcal{A}) = 0. \tag{16} \]
Indeed, the cohomology of $d_i$ in $\mathcal{A}$ is isomorphic to the space of constant multiforms not involving $d_i x^\mu$. The condition that the cocycles belong to $\mathcal{A}^{(1)}$, i.e., vanish at the origin, eliminates precisely the constants. One has also $H^{(m)}(d_i, \mathcal{A}) = 0$ for all $m \ge 1$, since $\mathcal{A}^{(m)} \subset \mathcal{A}^{(1)}$ for $m \ge 1$ and $\mathcal{A}^{(m)} \cap d_i \mathcal{A} = d_i \mathcal{A}^{(m+1)}$. Let $K$ be an arbitrary subset of $\{1, 2, \dots, N-1\}$. We define $\mathcal{A}_K$ as the quotient space
\[ \mathcal{A}_K = \frac{\mathcal{A}}{\sum_{j \in K} d_j \mathcal{A}} \]
(for $K = \emptyset$, $\mathcal{A}_K = \mathcal{A}$). The differential $d_i$ induces, for each $i$, a well-defined differential in $\mathcal{A}_K$ which we still denote by $d_i$. Of course, the induced $d_i$ is equal to zero if $i \in K$.

Lemma 3. For every proper subset $K$ of $\{1, 2, \dots, N-1\}$ and for every $i \notin K$, one has $H^{(k+1)}(d_i, \mathcal{A}_K) = 0$ ($k = \#K$).

Proof. The proof proceeds by induction on the number $k$ of elements of $K$. The lemma clearly holds for $k = 0$ ($K = \emptyset$) since then $\mathcal{A}_K = \mathcal{A}$ and the lemma reduces to Eq. (16). Let us now assume that the lemma holds for all subsets $K$ (not containing $i$) with $k \le \ell$ elements. Let $K'$ be a subset not containing $i$ with $\ell + 1$ elements. Let $j \in K'$ and $K'' = K' \setminus \{j\}$. The induction hypothesis implies $H^{(\ell+1)}(d_i, \mathcal{A}_{K''}) = H^{(\ell+1)}(d_j, \mathcal{A}_{K''}) = 0$. By standard "descent equation" arguments (see below), this leads to $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''}) \simeq H^{p+1,q-1,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$. In $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$, the first superscript $p$ stands for the $d_i$-degree, the second superscript $q$ stands for the $d_j$-degree, while $(\ell+2)$ is the polynomial order. Repeated application of this isomorphism yields $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''}) \simeq H^{p+q,0,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$. But $H^{p+q,0,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''}) \equiv H^{p+q,0,(\ell+2)}(d_i, \mathcal{A}_{K''}) = 0$. Hence, the cohomological spaces $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$ vanish for all $p, q$, which is precisely the statement $H^{(\ell+2)}(d_i, \mathcal{A}_{K'}) = 0$.
The precise descent equation argument involved in this proof runs as follows: let $\alpha^{p,q,(\ell+2)}$ be a $d_i$-cocycle modulo $d_j$ in $\mathcal{A}_{K''}$, i.e., a solution of $d_i \alpha^{p,q,(\ell+2)} + d_j \alpha^{p+1,q-1,(\ell+2)} \approx 0$ for some $\alpha^{p+1,q-1,(\ell+2)}$, where the notation $\approx$ means "modulo terms in $\sum_{j' \in K''} d_{j'} \mathcal{A}$". Applying $d_i$ to this equation yields $d_j d_i \alpha^{p+1,q-1,(\ell+2)} \approx 0$ and hence, since $d_i \alpha^{p+1,q-1,(\ell+2)}$ is of order $\ell + 1$ and $H^{(\ell+1)}(d_j, \mathcal{A}_{K''}) = 0$, $d_i \alpha^{p+1,q-1,(\ell+2)} + d_j \alpha^{p+2,q-2,(\ell+2)} \approx 0$ for some $\alpha^{p+2,q-2,(\ell+2)}$. Hence, $\alpha^{p+1,q-1,(\ell+2)}$ is also a $d_i$-cocycle modulo $d_j$ in $\mathcal{A}_{K''}$. Consider the map $\alpha^{p,q,(\ell+2)} \mapsto \alpha^{p+1,q-1,(\ell+2)}$ of $d_i$-cocycles modulo $d_j$. There is an arbitrariness in the choice of $\alpha^{p+1,q-1,(\ell+2)}$ given $\alpha^{p,q,(\ell+2)}$, so this map is ambiguous; however, $H^{(\ell+1)}(d_j, \mathcal{A}_{K''}) = 0$ implies that it induces a well-defined linear mapping $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''}) \to H^{p+1,q-1,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$ in cohomology. This map is injective and surjective since $H^{(\ell+1)}(d_i, \mathcal{A}_{K''}) = 0$, and thus one has the isomorphism $H^{p,q,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''}) \simeq H^{p+1,q-1,(\ell+2)}(d_i|d_j, \mathcal{A}_{K''})$ (see [11] for additional information). A direct application of this lemma is the following

Proposition 2. Let $J$ be any non-empty subset of $\{1, 2, \dots, N-1\}$. Then
\[ \Big(\prod_{j \in J} d_j\Big) \alpha = 0 \ \text{ and } \ \alpha \in \mathcal{A}^{(\#J)} \quad \Rightarrow \quad \alpha = \sum_{j \in J} d_j \beta_j \]
for some $\beta_j$'s.

Proof. The property is clearly true for $\#J = 1$ (see Eq. (16)). Assume then that the property is true for all proper subsets with $k \le \ell < N-1$ elements. Let $J$ be a proper subset with exactly $\ell$ elements and $i \notin J$. Let $\alpha$ be a multiform in $\mathcal{A}^{(\ell+1)}$ such that $d_i (\prod_{j \in J} d_j) \alpha = 0$. This is equivalent to $(\prod_{j \in J} d_j) d_i \alpha = 0$. Application of the recursive assumption to $d_i \alpha$, which belongs to $\mathcal{A}^{(\ell)}$, implies then $d_i \alpha = \sum_{j \in J} d_j \beta_j$, from which one derives, using the previous lemma, that $\alpha = d_i \rho + \sum_{j \in J} d_j \rho_j$ for some $\rho, \rho_j$. Therefore, the property passes on to all subsets with $\ell + 1$ elements, which establishes the proposition.

We are now in a position to state and prove the main result of this section.

Theorem 2. Let $K$ be an arbitrary non-empty subset of $\{1, 2, \dots, N-1\}$. If the multiform $\omega$ is such that
\[ \Big(\prod_{i \in I} d_i\Big) \omega = 0 \quad \forall I \subset K \mid \#I = m \tag{17} \]
(with $m \le \#K$ a fixed integer), then
\[ \omega = \sum_{\substack{J \subset K \\ \#J = \#K - m + 1}} \Big(\prod_{j \in J} d_j\Big) \alpha_J + \omega_0, \tag{18} \]
where $\omega_0$ is a polynomial multiform of degree $\le m - 1$.
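As an illustration of the statement (added here, not taken from the original text), the lowest nontrivial case $N = 3$, $K = \{1, 2\}$ can be written out explicitly. The sketch below only specializes (17) and (18); the names $\alpha$, $\alpha_1$, $\alpha_2$ are placeholders for the multiforms $\alpha_J$.

```latex
% Theorem 2 specialized to N = 3, K = \{1,2\}, so \#K = 2.
%
% Case m = 1: the subsets I are \{1\} and \{2\}; \#J = \#K - m + 1 = 2,
% so the only admissible J is K itself.
\[
  d_1\omega = 0 \ \text{ and } \ d_2\omega = 0
  \quad\Longrightarrow\quad
  \omega = d_1 d_2\,\alpha + \omega_0 ,
  \qquad \omega_0 \ \text{polynomial of degree} \le 0 .
\]
% Case m = 2: the only subset I is K; \#J = 1, so J = \{1\} or J = \{2\}.
\[
  d_1 d_2\,\omega = 0
  \quad\Longrightarrow\quad
  \omega = d_1\alpha_1 + d_2\alpha_2 + \omega_0 ,
  \qquad \omega_0 \ \text{polynomial of degree} \le 1 .
\]
```

In the second case, once the polynomial part $\omega_0$ is split off, the statement is the content of Proposition 2 for $J = K$.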
Proof. The polynomial multiform $\omega_0$ is clearly a solution of the problem, so we only need to check that if $\omega \in \mathcal{A}^{(m)}$ in addition to (17), then (18) is replaced by
\[ \omega = \sum_{\substack{J \subset K \\ \#J = \#K - m + 1}} \Big(\prod_{j \in J} d_j\Big) \alpha_J . \tag{19} \]
The $\alpha_J$'s can be assumed to be of order $\#K + 1$ since one differentiates them $\#K - m + 1$ times to get $\omega$. To prove (19), we proceed recursively, keeping $m$ fixed and increasing the size of $K$ step by step from $\#K = m$ to $\#K = N - 1$. If $\#K = m$, there is nothing to be proven since $I = K$ and the theorem reduces to the previous proposition. So, let us assume that the theorem has been proven for $\#K = k \ge m$ and let us show that it extends to any set $U = K \cup \{\ell\}$, $\ell \notin K$, with $\#U = k + 1$ elements. If (17) holds for any subset $I \subset U$ of $U$ (with $\#I = m$), it also holds for any subset $I \subset K$ of $K \subset U$ (with $\#I = m$), so the recursive hypothesis implies
\[ \omega = \sum_{\substack{J \subset K \\ \#J = k - m + 1}} \Big(\prod_{j \in J} d_j\Big) \alpha_J . \tag{20} \]
Let now $A$ be an arbitrary subset of $U$ with $\#A = m$ which contains the added element $\ell$. Among all the subsets $J$ occurring in the sum (20), there is only one, namely $J' = U \setminus A$, such that $J' \cap A = \emptyset$. The condition (17) with $I = A$ implies, when applied to the expression (20) of $\omega$,
\[ \Big(\prod_{j \in U} d_j\Big) \alpha_{J'} = 0 \]
(if $J \ne J'$, the product $(\prod_{i \in A} d_i)(\prod_{j \in J} d_j)$ identically vanishes because at least one differential $d_f$ is repeated). But since $\alpha_{J'}$ is of order $k + 1 = \#U$, the previous proposition implies that
\[ \alpha_{J'} = \sum_{j \in U} d_j \beta_j^{J'} . \]
When injected into (20), this yields in turn
\[ \omega = \sum_{\substack{L \subset U \\ \#L = k - m + 2}} \Big(\prod_{j \in L} d_j\Big) \alpha_L \tag{21} \]
for some $\alpha_L$, and shows that the required property is also valid for sets with cardinal equal to $k + 1$, completing the proof of the theorem.

5. The Generalization of the Poincaré Lemma

With the result of the last section, Theorem 2, we can now proceed to the proof of Theorem 1, that is, to the proof of the generalization of the Poincaré lemma announced in the introduction. Let us first show that $\Omega_N(\mathbb{R}^D)$ identifies canonically as a graded $C^\infty(\mathbb{R}^D)$-module with the image of a $C^\infty(\mathbb{R}^D)$-linear homogeneous projection $\pi$ of $\mathcal{A}$ into itself:
\[ \Omega_N(\mathbb{R}^D) = \pi(\mathcal{A}) \subset \mathcal{A}. \]
Indeed, by using the canonical identification (10) of Sect. 3, one has the identification
\[ \wedge_N^{(N-1)n+i} E \subset \underbrace{\wedge^{n+1} E \otimes \cdots \otimes \wedge^{n+1} E}_{i} \otimes \underbrace{\wedge^{n} E \otimes \cdots \otimes \wedge^{n} E}_{N-1-i} \tag{22} \]
of the Schur module $E^{Y^N_{(N-1)n+i}} = \wedge_N^{(N-1)n+i} E$ as vector subspace of the right-hand side. However, by decomposing the right-hand side of (22) into irreducible subspaces for the action of $GL(E)$ one sees that there is only one irreducible factor isomorphic to $E^{Y^N_{(N-1)n+i}}$, which is therefore the image of a $GL(E)$-invariant projection. It follows that $\wedge_N E \subset \otimes^{N-1} \wedge E$ is the image of a $GL(E)$-invariant projection $P$ of $\otimes^{N-1} \wedge E$ into itself which is homogeneous for the total degree. The result for $\Omega_N(\mathbb{R}^D)$ follows by choosing $E$ to be the dual space $\mathbb{R}^{D*}$ of $\mathbb{R}^D$ and by setting $\pi = P \otimes I_{C^\infty(\mathbb{R}^D)}$ in view of $\Omega_N(\mathbb{R}^D) = \wedge_N \mathbb{R}^{D*} \otimes C^\infty(\mathbb{R}^D)$ and $\mathcal{A} = (\otimes^{N-1} \wedge \mathbb{R}^{D*}) \otimes C^\infty(\mathbb{R}^D)$. The projection $\pi$ is in fact by construction a projection of $\oplus_{p \in \mathbb{N}} \mathcal{A}^{[p]}$ into itself, where $\mathcal{A}^{[p]} = \mathcal{A}^{n+1, \dots, n+1, n, \dots, n}$, $p = (N-1)n + i$, with obvious notations. We now relate the $N$-differential $d$ of $\Omega_N(\mathbb{R}^D)$ to the differentials $d_i$ of $\mathcal{A}$. Let $\omega$ be an element of $\Omega^p_N(\mathbb{R}^D)$ with $p = (N-1)n + i$, $0 \le i < N-1$. One has
\[ d\omega = c_\omega\, \pi(d_{i+1} \omega), \tag{23} \]
where $c_\omega$ is a non-vanishing number that depends on the degrees of $\omega$. In general, the projection is nontrivial, in the sense that $d_{i+1} \omega$ has components not only along the irreducible Schur module $E^{Y^N_{p+1}}$ ($E = \mathbb{R}^{D*}$), but also along other Schur modules not occurring in $\Omega_N(\mathbb{R}^D)$. For instance, with $N = 3$, the covariant vector with components $v_\alpha$ defines the element $v = v_\alpha\, d_1 x^\alpha$ of $\mathcal{A}$. One has $d_2 v = -\partial_\beta v_\alpha\, d_1 x^\alpha d_2 x^\beta$. This expression contains both a symmetric part ($dv$) and an antisymmetric part, so $d_2 v = dv - v_{[\alpha,\beta]}\, d_1 x^\alpha d_2 x^\beta$. The projection removes $v_{[\alpha,\beta]}\, d_1 x^\alpha d_2 x^\beta$, which does not vanish in general. Because the projection is nontrivial, the conditions $d\omega = 0$ and $d_{i+1}\omega = 0$ are inequivalent for generic $i$. However, if $\omega$ is a well-filled tensor, that is, if $i = 0$, then
\[ d\omega = d_1 \omega \qquad (i = 0). \tag{24} \]
Indeed, $d_1 \omega$ has automatically the correct Young symmetry. Thus the conditions $d_1 \omega = 0$ and $d\omega = 0$ are equivalent. Furthermore, because of the symmetry between the columns, if $d_1 \omega = 0$, then one has also $d_2 \omega = d_3 \omega = \cdots = 0$. For instance, again for $N = 3$, the derivative of the symmetric tensor $g = g_{\alpha\beta}\, d_1 x^\alpha d_2 x^\beta$ ($g_{\alpha\beta} = g_{\beta\alpha}$) is given by $dg = d_1 g = \frac{1}{2}(g_{\alpha\beta,\rho} - g_{\rho\beta,\alpha})\, d_1 x^\rho d_1 x^\alpha d_2 x^\beta$. The completely symmetric component $g_{(\alpha\beta,\rho)}$ is absent because $d_1 x^\rho d_1 x^\alpha = -d_1 x^\alpha d_1 x^\rho$. Also, it is clear that if $d_1 g = 0$, then $d_2 g = \frac{1}{2}(g_{\alpha\beta,\rho} - g_{\alpha\rho,\beta})\, d_1 x^\alpha d_2 x^\beta d_2 x^\rho = 0$. This generalizes to the following lemma:

Lemma 4. Let $\omega \in \Omega_N^{(N-1)n}(\mathbb{R}^D)$ (well-filled, or rectangular, tensor). Then
\[ d^k \omega = 0 \quad \Leftrightarrow \quad \Big(\prod_{j \in J,\ \#J = k} d_j\Big) \omega = 0. \tag{25} \]
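For orientation, the simplest instance of (25) is $N = 3$, $n = 1$, $k = 2$, with $\omega$ the symmetric tensor $g = g_{\alpha\beta}\, d_1 x^\alpha d_2 x^\beta$ of the example above. The following sketch (an added illustration, not part of the original text) computes $d_1 d_2 g$ using only the rules (14) and the graded commutativity of the $d_i x^\mu$. Recognizing the bracket as the linearized Riemann curvature of $g_{\alpha\beta}$ is a standard fact, valid here only up to sign and normalization conventions.

```latex
% Step 1: apply d_2, which acts on the coefficient function only (rule (14)).
\[
  d_2 g = \partial_\sigma g_{\alpha\beta}\; d_2x^\sigma\, d_1x^\alpha\, d_2x^\beta .
\]
% Step 2: apply d_1, then move d_1x^\alpha to the left of d_2x^\sigma (one sign).
\[
  d_1 d_2 g
  = \partial_\rho\partial_\sigma g_{\alpha\beta}\;
    d_1x^\rho\, d_2x^\sigma\, d_1x^\alpha\, d_2x^\beta
  = -\,\partial_\rho\partial_\sigma g_{\alpha\beta}\;
    d_1x^\rho\, d_1x^\alpha\, d_2x^\sigma\, d_2x^\beta .
\]
% Step 3: only the part antisymmetric in (\rho,\alpha) and in (\sigma,\beta) survives.
\[
  d_1 d_2 g
  = -\tfrac14\big(
      \partial_\rho\partial_\sigma g_{\alpha\beta}
    - \partial_\alpha\partial_\sigma g_{\rho\beta}
    - \partial_\rho\partial_\beta g_{\alpha\sigma}
    + \partial_\alpha\partial_\beta g_{\rho\sigma}
    \big)\,
    d_1x^\rho\, d_1x^\alpha\, d_2x^\sigma\, d_2x^\beta .
\]
% For N = 3 the only subset J with \#J = 2 is \{1,2\}, so (25) reads
%   d^2 g = 0  <=>  d_1 d_2 g = 0 .
```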
Proof. One has $d^k \omega = (-1)^m d_1 d_2 \cdots d_k \omega$. Indeed, it is clear that the multiform $d_1 d_2 \cdots d_k \omega \in \mathcal{A}^{n+1, n+1, \dots, n+1, n, \dots, n}$ belongs to $\Omega_N(\mathbb{R}^D)$ because it cannot have components along the invariant subspaces corresponding to Young diagrams with the first column having $i > r + 1$ boxes, since one cannot put two derivatives $\partial_\mu, \partial_\nu$ in the same column. Hence, $d^k \omega = 0$ is equivalent to $d_1 d_2 \cdots d_k \omega = 0$. One completes the proof by observing that for well-filled tensors, the condition $d_1 d_2 \cdots d_k \omega = 0$ is equivalent to the conditions $d_{i_1} d_{i_2} \cdots d_{i_k} \omega = 0$ for all $(i_1, \dots, i_k)$ because of symmetry in the columns.

Lemma 5. Let $\omega \in \Omega_N^{(N-1)n}(\mathbb{R}^D)$ with $n \ge 1$. Then
\[ \omega = \sum_{J,\ \#J = N-k} \Big(\prod_{j \in J} d_j\Big) \alpha_J \quad \Rightarrow \quad \omega = d^{N-k} \alpha \tag{26} \]
for some $\alpha \in \Omega_N^{(N-1)n-N+k}(\mathbb{R}^D)$, $k \in \{1, \dots, N-1\}$.

Proof. First, we note that the $\alpha_J$ occurring in (26) can be chosen to have $d_i$-degrees equal to $n - 1$ or $n$ according to whether $d_i$ acts or does not act on $\alpha_J$, since $\omega$ has multidegree $(n, n, \dots, n)$. Second, one can project the right-hand side of (26) on $\Omega_N^{(N-1)n}(\mathbb{R}^D)$ without changing the left-hand side, since $\omega \in \Omega_N^{(N-1)n}(\mathbb{R}^D)$. It is easy to see that $\pi[(\prod_{j \in J} d_j)\, \alpha_J] \sim d^{N-k} \alpha'_J$, with $\alpha'_J = \pi(\tilde{\alpha}_J)$, where $\tilde{\alpha}_J$ is the element in $\mathcal{A}^{n, \dots, n, n-1, n-1, \dots, n-1}$ obtained by reordering the "columns" of $\alpha_J$ so that they have non-increasing length. In fact, when differentiated, the other irreducible components of $\tilde{\alpha}_J$ do not contribute to $\omega$ because their first column is too long to start with or because two partial derivatives find themselves in the same column, yielding zero. Injecting the above expression for $\pi[(\prod_{j \in J} d_j)\, \alpha_J]$ into (26) yields the desired result.

Lemma 6. Let $\omega \in \Omega_N^{(N-1)n}(\mathbb{R}^D)$ with $n \ge 1$ be a polynomial multiform of degree $k - 1$. Then
\[ \omega = d^{N-k} \alpha \tag{27} \]
for some polynomial multiform $\alpha \in \Omega_N^{(N-1)n-N+k}(\mathbb{R}^D)$ of degree $N - 1$, with $k \in \{1, \dots, N-1\}$.

Proof. The proof amounts to playing with Young diagrams. The coefficients of $\omega$ transform in the tensor product of the representation associated with $Y^N_{(N-1)n}$ (symmetry of $\omega$) and the completely symmetric representation with $k - 1$ boxes (symmetric polynomials in the $x^\mu$'s of degree $k - 1$). Let $T$ be this representation and $V_T$ be the carrier vector space. Similarly, the multiform $\alpha$ transforms (if it exists) in the tensor product of the representation associated with $Y^N_{(N-1)n-N+k}$ (symmetry of $\alpha$) and the completely symmetric representation with $N - 1$ boxes (symmetric polynomials in the $x^\mu$'s of degree $N - 1$). Let $S$ be this representation and $W_S$ be the carrier vector space. Now, the linear operator $d^{N-k} : W_S \to V_T$ is an intertwiner for the representations $S$ and $T$. To analyse how it acts, it is convenient to decompose both $S$ and $T$ into irreducible representations. The crucial fact is that all irreducible representations occurring in $T$ also occur in $S$. That is, if $T = \oplus_i T_i$, $V_T = \oplus_i V_i$ (where each irreducible representation $T_i$ has multiplicity one), then
\[ S = (\oplus_i T_i) \oplus (\oplus_\alpha T_\alpha), \qquad W_S = (\oplus_i W_i) \oplus (\oplus_\alpha W_\alpha), \]
where the $T_\alpha$ are some other representations, irrelevant for our purposes. Because $T_i$ is irreducible, the operator $d^{N-k}$ maps the invariant subspace $W_i$ on the invariant subspace $V_i$, and furthermore, $d^{N-k}|_{W_i}$ is either zero or bijective. It is easy to verify by taking simple examples that $d^{N-k}|_{W_i}$ is not zero. Hence, $d^{N-k}|_{W_i}$ is bijective, which implies that $d^{N-k} : W_S \to V_T$ is surjective, so that $\omega$ can indeed be written as $d^{N-k} \alpha$ for some $\alpha$.

Proof of Theorem 1. Theorem 1 is a direct consequence of the above lemmas. (i) Let $\omega \in \Omega_N^{(N-1)n}(\mathbb{R}^D)$ (with $n \ge 1$) be annihilated by $d^k$, $d^k \omega = 0$. We write $\omega = \omega' + \omega_0$, where $\omega'$ is of order $k$ and where $\omega_0$ is a polynomial multiform of polynomial degree $< k$. Both $\omega'$ and $\omega_0$ have the symmetry of $\omega$. Also, since $\omega_0$ is trivially annihilated by $d^k$, one has separately $d^k \omega' = 0$ and $d^k \omega_0 = 0$. We consider first $\omega'$. Lemma 4 implies $(\prod_{j \in J,\ \#J = k} d_j)\, \omega' = 0$, from which it follows, using the theorem of the previous section, that
\[ \omega' = \sum_{J,\ \#J = N-k} \Big(\prod_{j \in J} d_j\Big) \alpha_J \]
(see (19)). By Lemma 5, this term can be written as $d^{N-k} \alpha$. As we have also seen (Lemma 6), the same property holds for $\omega_0$. This proves the theorem for $n \ge 1$. (ii) The case of $H^0_{(k)}(\Omega_N(\mathbb{R}^D))$ is even easier to discuss: for a function, the condition $d^k f = 0$ is equivalent to $\partial_{\mu_1 \cdots \mu_k} f = 0$ and thus $f$ must be polynomial of degree strictly less than $k$. Moreover, it can never be the $d^{N-k}$ of something, since there is nothing in negative degree.

It is worth noticing here that, as explained in the introduction, Theorem 1 has a dual counterpart for the $\delta$-operator introduced at the end of Sect. 3, which allows one to integrate many generalized current conservation equations. In the last section of this paper we shall sketch another approach for proving Theorem 1 which is based on the appropriate generalization of homotopy for $N$-complexes.

6. Structure of $H^m_{(k)}(\Omega_N(\mathbb{R}^D))$ for Generic $m$

In the previous section we have shown that $H^m_{(k)}(\Omega_N(\mathbb{R}^D))$ vanishes whenever $m = (N-1)n$ with $n \ge 1$. In the case $N = 2$ this is the usual Poincaré lemma, which means that the cohomology vanishes in positive degrees. For $N \ge 3$ there are degrees $m$ which do not belong to the set $\{(N-1)(n+1) \mid n \in \mathbb{N}\}$ and it turns out that for such a (generic) degree $m$, the spaces $H^m_{(k)}(\Omega_N(\mathbb{R}^D))$ are nontrivial ($k \in \{1, \dots, N-1\}$). More precisely, for $m \in \{0, \dots, N-2\}$ these spaces are finite-dimensional of strictly positive dimensions, whereas for $m \ge N$ with $m \ne (N-1)n$ these spaces are infinite-dimensional. In the following we shall explicitly display the case $N = 3$ and indicate how to proceed for the general case $N \ge 3$. In order to simplify the notations let us denote the spaces $H^m_{(k)}(\Omega_N(\mathbb{R}^D))$ by $H^m_{(k)}$ and the graded spaces $H_{(k)}(\Omega_N(\mathbb{R}^D))$ by $H_{(k)}$ ($= \oplus_m H^m_{(k)}$). For $N = 3$, one has only $H_{(1)}$ and $H_{(2)}$, and Theorem 1 states that $H^{2n}_{(1)} = H^{2n}_{(2)} = 0$ for $n \ge 1$ and that $H^0_{(1)} \simeq \mathbb{R}$ is the space of constant functions on $\mathbb{R}^D$, whereas $H^0_{(2)}$ is the space of polynomial functions of degree less than or equal to one on $\mathbb{R}^D$, i.e. $H^0_{(2)} \simeq \mathbb{R} \oplus \mathbb{R}^D$. On the other hand, the elements of $H^1_{(1)}$ identify with the covariant vector fields (or
one-forms) $x \mapsto X(x)$ on $\mathbb{R}^D$ satisfying
\[ \partial_\mu X_\nu + \partial_\nu X_\mu = 0, \tag{28} \]
which is the equation characterizing the Killing vector fields (i.e. infinitesimal isometries) of the standard euclidean metric $\sum_{\mu=1}^{D} (dx^\mu)^2$ of $\mathbb{R}^D$. The general solution of (28) is $X_\mu = v_\mu + a_{\mu\nu} x^\nu$ with $v \in \mathbb{R}^D$ (infinitesimal translations) and $a \in \wedge^2 \mathbb{R}^D$, i.e. $a_{\mu\nu} = -a_{\nu\mu}$ constant (infinitesimal rotations). Thus one has $H^1_{(1)} \simeq \mathbb{R}^D \oplus \wedge^2 \mathbb{R}^D$. Notice that with this terminology we have implicitly identified covariant vector fields with contravariant ones by using the standard metric of $\mathbb{R}^D$. Notice also that as far as $H^0_{(1)}$, $H^0_{(2)}$ and $H^1_{(1)}$ are concerned nothing changes if $N \ge 3$. For $N = 3$, $H^1_{(2)}$ identifies with the space of covariant vector fields $x \mapsto X(x)$ on $\mathbb{R}^D$ satisfying
\[ \partial_\lambda (\partial_\mu X_\nu - \partial_\nu X_\mu) = 0 \tag{29} \]
modulo the ones of the form $X_\mu = \partial_\mu \varphi$ for some $\varphi \in C^\infty(\mathbb{R}^D)$. The general solution of (29) is $X_\mu = a_{\mu\nu} x^\nu + \partial_\mu \varphi$ with $a \in \wedge^2 \mathbb{R}^D$ and $\varphi \in C^\infty(\mathbb{R}^D)$, so that one has $H^1_{(2)} \simeq \wedge^2 \mathbb{R}^D$. Let us now show that $H^3_{(1)}$ is infinite-dimensional for $N = 3$. For this, consider an arbitrary 2-form $\omega$, i.e. an arbitrary covariant antisymmetric tensor field of degree 2 on $\mathbb{R}^D$, and consider the element $t = \mathbf{Y}^3_3 \circ \nabla^{(0)} \omega$ of $\Omega^3_3(\mathbb{R}^D)$. Up to an irrelevant normalization constant, the components of $t$ are given by
\[ t_{\mu\lambda\nu} = 2 \partial_\lambda \omega_{\mu\nu} + \partial_\mu \omega_{\lambda\nu} - \partial_\nu \omega_{\lambda\mu} \tag{30} \]
and one verifies that one has $dt = 0$ in $\Omega_3(\mathbb{R}^D)$. On the other hand one has $t = dh$ in $\Omega_3(\mathbb{R}^D)$, that is
\[ 2 \partial_\lambda \omega_{\mu\nu} + \partial_\mu \omega_{\lambda\nu} - \partial_\nu \omega_{\lambda\mu} = \partial_\nu h_{\mu\lambda} - \partial_\mu h_{\nu\lambda} \tag{31} \]
for some symmetric covariant tensor field $h \in \Omega^2_3(\mathbb{R}^D)$, if and only if $\omega$ is of the form
\[ \omega_{\mu\nu} = a_{\rho\mu\nu} x^\rho + \partial_\mu X_\nu - \partial_\nu X_\mu \tag{32} \]
for $a \in \wedge^3 \mathbb{R}^D$ and some covariant vector field $X \in \Omega^1_3(\mathbb{R}^D)$, and then $t$ is proportional to $d^2(X)$ in $\Omega_3(\mathbb{R}^D)$, i.e. $t$ is trivial in $H^3_{(1)}$. This argument shows firstly that $H^3_{(1)}$ contains the quotient of the space of 2-forms by the ones of the form given by (32), which is infinite-dimensional, and secondly that the same space identifies with a subspace of $H^3_{(2)}$, which is therefore also infinite-dimensional. In fact, as will be shown below, one has an isomorphism $H^3_{(1)} \simeq H^3_{(2)}$ which is induced by the inclusion $\mathrm{Ker}(d) \subset \mathrm{Ker}(d^2)$. By replacing the 2-form $\omega$ by an irreducible covariant tensor field $\omega_n$ of degree $2n + 2$ on $\mathbb{R}^D$ with Young symmetry type given by the Young diagram with $n$ lines of length two and two lines of length one, it can be shown similarly that $H^{2(n+1)+1}_{(1)}$ and $H^{2(n+1)+1}_{(2)}$ are infinite-dimensional spaces (we shall see that they are in fact isomorphic). The last argument for $N = 3$ admits the following generalization for $N \ge 3$. Let $Y^N_m$ be a Young diagram of the sequence $(Y^N)$ and let $Y'_{m-p}$ be a Young diagram obtained by deleting $p$ boxes of $Y^N_m$ with $0 < p < N - 1$ such that it does not belong to $(Y^N)$ (i.e. $Y'_{m-p} \ne Y^N_{m-p}$) and such that by applying $p$ derivatives (i.e. $(\nabla^{(0)})^p$) to a generic tensor field with Young symmetry $Y'_{m-p}$ one obtains a tensor which has a nontrivial component
$t$ with Young symmetry $Y^N_m$. Then generically the latter $t \in \Omega^m_N(\mathbb{R}^D)$ is a nontrivial generalized cocycle and one obtains by this procedure an infinite-dimensional subspace of the corresponding generalized cohomology, i.e. of $H^m_{(k)}$ for the appropriate $k$. Notice that this is only possible for $m \ge N$ with $m \ne (N-1)n$. We conjecture that the whole nontrivial part of the generalized cohomology of $\Omega_N(\mathbb{R}^D)$ in degree $m \ge N$ is obtained by the above construction ($N \ge 3$).

In order to complete the discussion for $N \ge 3$ in degree $m \le N - 2$ as well as to show the isomorphisms $H^{2n+1}_{(1)} \simeq H^{2n+1}_{(2)}$ for $N = 3$, $n \ge 1$ and their generalizations for $N \ge 3$, we now recall a basic lemma of the general theory of $N$-complexes [7, 12]. This lemma was formulated in [7] in the more general framework of $N$-differential modules (Lemma 1 of [7]), that is, of $\mathbf{k}$-modules equipped with an endomorphism $d$ such that $d^N = 0$, where $\mathbf{k}$ is a unital commutative ring. In this paper we only discuss $N$-complexes of (real) vector spaces. Let $E$ be an $N$-complex of cochains [7] like $\Omega_N(\mathbb{R}^D)$, that is, here $E = \oplus_{m \in \mathbb{N}} E^m$ is a graded vector space equipped with an endomorphism $d$ of degree one such that $d^N = 0$ ($N \ge 2$). The inclusions $\mathrm{Ker}(d^k) \subset \mathrm{Ker}(d^{k+1})$ and $\mathrm{Im}(d^{N-k}) \subset \mathrm{Im}(d^{N-k-1})$ induce linear mappings $[i] : H_{(k)} \to H_{(k+1)}$ in generalized cohomology for $k$ such that $1 \le k \le N - 2$. Similarly the linear mappings $d : \mathrm{Ker}(d^{k+1}) \to \mathrm{Ker}(d^k)$ and $d : \mathrm{Im}(d^{N-k-1}) \to \mathrm{Im}(d^{N-k})$ obtained by restriction of the $N$-differential $d$ induce linear mappings $[d] : H_{(k+1)} \to H_{(k)}$. One has the following lemma (for a proof we refer to [12] or [7]).

Lemma 7. Let the integers $k$ and $\ell$ be such that $1 \le k$, $1 \le \ell$, $k + \ell \le N - 1$. Then the hexagon of linear mappings
\[ H_{(k)}(E) \xrightarrow{[i]^{\ell}} H_{(\ell+k)}(E) \xrightarrow{[d]^{k}} H_{(\ell)}(E) \xrightarrow{[i]^{N-(\ell+k)}} H_{(N-k)}(E) \xrightarrow{[d]^{\ell}} H_{(N-(\ell+k))}(E) \xrightarrow{[i]^{k}} H_{(N-\ell)}(E) \xrightarrow{[d]^{N-(\ell+k)}} H_{(k)}(E) \]
(written here as a cyclic sequence, the last arrow closing the hexagon) is exact.

Since $[i]$ is of degree zero while $[d]$ is of degree one, these hexagons give long exact sequences. Let us apply the above result to the $N$-complex $\Omega_N(\mathbb{R}^D)$. For $N = 3$, there is only one hexagon as above ($k = \ell = 1$) and, by using $H^{2n}_{(k)} = 0$ for $n \ge 1$, $k = 1, 2$, it reduces to the exact sequences
\[ 0 \xrightarrow{[d]} H^0_{(1)} \xrightarrow{[i]} H^0_{(2)} \xrightarrow{[d]} H^1_{(1)} \xrightarrow{[i]} H^1_{(2)} \xrightarrow{[d]} 0 \tag{33} \]
and
\[ 0 \xrightarrow{[d]} H^{2n+1}_{(1)} \xrightarrow{[i]} H^{2n+1}_{(2)} \xrightarrow{[d]} 0 \tag{34} \]
for $n \ge 1$. The sequences (34) give the announced isomorphisms $H^{2n+1}_{(1)} \simeq H^{2n+1}_{(2)}$, while the 4-term sequence (33) allows one to compute the finite dimension of $H^1_{(2)}$ knowing the one
of $H^0_{(1)}$, $H^0_{(2)}$ and $H^1_{(1)}$. For $N \ge 3$ one has several hexagons and, by using $H^{(N-1)n}_{(k)} = 0$ for $n \ge 1$, the sequence (33) generalizes as the following $\frac{(N-2)(N-1)}{2}$ four-term exact sequences:
\[ 0 \xrightarrow{[d]^{k}} H^{k-1}_{(\ell)} \xrightarrow{[i]^{N-k-\ell}} H^{k-1}_{(N-k)} \xrightarrow{[d]^{\ell}} H^{k+\ell-1}_{(N-k-\ell)} \xrightarrow{[i]^{k}} H^{k+\ell-1}_{(N-\ell)} \xrightarrow{[d]^{N-k-\ell}} 0 \tag{35} \]
for $1 \le k, \ell$ and $k + \ell \le N - 1$. There are also two-term exact sequences generalizing (34) and giving similar isomorphisms but, for $N > 3$, there are other longer exact sequences (which are of finite length in view of $H^{(N-1)n}_{(k)} = 0$ for $n \ge 1$). Suppose that the spaces $H^m_{(k)}$ are finite-dimensional for $k + m \le N - 1$ and that we know their dimensions. Then the exact sequences (35) imply that all the $H^m_{(k)}$ for $m \le N - 2$ are finite-dimensional and allow one to compute their dimensions in terms of the dimensions of the $H^m_{(k)}$ for $k + m \le N - 1$. To complete the discussion it thus remains to show the finite-dimensionality of the $H^m_{(k)}$ for $k + m \le N - 1$. For $k + m \le N - 1$, the space $H^m_{(k)}$ identifies with the space of (covariant) symmetric tensor fields $S$ of degree $m$ on $\mathbb{R}^D$ such that
\[ \sum_{\pi \in \mathcal{S}_{k+m}} \partial_{\mu_{\pi(1)}} \cdots \partial_{\mu_{\pi(k)}} S_{\mu_{\pi(k+1)} \cdots \mu_{\pi(k+m)}} = 0 \tag{36} \]
for $\mu_i \in \{1, \dots, D\}$, where $\mathcal{S}_{k+m}$ is the group of permutations of $\{1, \dots, k + m\}$. In particular, for $k = 1$ the equation (36) means that $S$ is a Killing tensor field of degree $m$ for the canonical metric of $\mathbb{R}^D$, and it is well known and easy to show that the components of such a Killing tensor field of degree $m$ are polynomial functions on $\mathbb{R}^D$ of degree less than or equal to $m$. In fact the Killing tensor fields on $\mathbb{R}^D$ form an algebra for the symmetric product over each point of $\mathbb{R}^D$ which is generated by the Killing vector fields (which are polynomial of degree $\le 1$). Thus $H^m_{(1)}$ is finite-dimensional for $1 + m \le N - 1$. By using this together with Theorem 1, one shows by induction on $k$ that $H^m_{(k)}$ is finite-dimensional for $k + m \le N - 1$, more precisely, that the solutions of (36) are polynomial functions on $\mathbb{R}^D$ of degree less than $k + m$. The results of this section concerning the generic degrees show that our generalization of the Poincaré lemma, i.e. Theorem 1, is far from being a straightforward result and that it is optimal.

7. Algebras

Let $E \simeq \mathbb{R}^D$ be a $D$-dimensional vector space, $(Y)$ be a sequence $(Y) = (Y_p)_{p \in \mathbb{N}}$ of Young diagrams such that $|Y_p| = p$ ($\forall p \in \mathbb{N}$) and let us use the notations and conventions of Sect. 3. As we have seen, the graded space $\wedge_{(Y)} E$ is a generalization of the exterior algebra of $E$ in the sense that as a graded vector space it reduces to the latter when $(Y)$ coincides with the sequence $(Y^2) = (Y^2_p)_{p \in \mathbb{N}}$ of the one-column Young diagrams. It is also a generalization of the symmetric algebra of $E$ since it reduces to it when $(Y)$ coincides with the sequence $(\tilde{Y}^2) = (\tilde{Y}^2_p)_{p \in \mathbb{N}}$ of the one-line Young diagrams (which are the conjugates of the $Y^2_p$). In fact, for general $(Y)$ the graded vector space $\wedge_{(Y)} E$ is also a graded algebra if one defines the product by setting
\[ T T' = \mathbf{Y}_{p+p'}(T \otimes T') \tag{37} \]
for $T \in E^{Y_p}$ and $T' \in E^{Y_{p'}}$, where $\mathbf{Y}_n$ is the Young symmetrizer defined in Sect. 2. However, although it generalizes the exterior product, this product is generically a nonassociative one. Thus $\wedge_{(Y)} E$ is a generalization of the exterior algebra $\wedge E$ in which each homogeneous subspace is irreducible for the action of $GL(E) \simeq GL_D$ but in which one loses the associativity of the product. There is another closely related generalization of the exterior algebra connected with the sequence $(Y)$ in which what is retained is the associativity of the graded product but in which one generically loses the $GL(E)$-irreducibility of the homogeneous components. This generalization, denoted by $\wedge_{[(Y)]} E$, is such that $\wedge_{(Y)} E$ is a graded (right) $\wedge_{[(Y)]} E$-module. We now describe its construction. Let $T(E)$ be the tensor algebra of $E$; we use the product defined by (37) to equip $\wedge_{(Y)} E$ with a right $T(E)$-module structure by setting
\[ T \lambda_{(Y)}(X_1 \otimes \cdots \otimes X_n) = (\cdots (T X_1) \cdots) X_n \tag{38} \]
for any $X_i \in E$ and $T \in \wedge_{(Y)} E$. By definition the kernel $\mathrm{Ker}(\lambda_{(Y)})$ of $\lambda_{(Y)}$ is a two-sided ideal of $T(E)$, so that the right action of $T(E)$ on $\wedge_{(Y)} E$ is in fact an action of the quotient algebra $\wedge_{[(Y)]} E = T(E)/\mathrm{Ker}(\lambda_{(Y)})$. So $\wedge_{(Y)} E$ is a graded right $\wedge_{[(Y)]} E$-module.

Lemma 8. Let $N$ be an integer with $N \ge 2$ and assume that $(Y)$ is such that the number of columns of the Young diagram $Y_p$ is strictly smaller than $N$ for any $p \in \mathbb{N}$. Then $\mathrm{Ker}(\lambda_{(Y)})$ contains the two-sided ideal of $T(E)$ which consists of the tensors which are symmetric with respect to at least $N$ of their entries; in particular $(\lambda_{(Y)}(X))^N = 0$, $\forall X \in E$.

Stated differently, under the assumption of the lemma for $(Y)$, a monomial $X_1 \dots X_n \in \wedge_{[(Y)]} E$ with $X_i \in E$ vanishes whenever it contains $N$ times the same argument, that is, if there are $N$ distinct elements $i_1, \dots, i_N$ of $\{1, \dots, n\}$ such that $X_{i_1} = \cdots = X_{i_N}$.

Proof. This is straightforward, as for the proof of Lemma 1, since one has at least $N$ symmetrized entries which are distributed among at most $N - 1$ columns in which the entries are antisymmetrized.

The right action $\lambda_{(Y^N)}$ of $T(E)$ on $\wedge_N E$ will also be simply denoted by $\lambda_N$. In the case $N = 2$, $\wedge_2 E$ is the usual exterior algebra $\wedge E$ of $E$ and the right action $\lambda_2$ of $T(E)$ factorizes through the right action of $\wedge E$ on itself; in particular $\mathrm{Ker}(\lambda_2)$ is the two-sided ideal of $T(E)$ generated by the $X \otimes X$ for $X \in E$. Thus the graded algebra $\wedge_{[(Y)]} E = T(E)/\mathrm{Ker}(\lambda_{(Y)})$ is also a generalization of the exterior algebra of $E$. For $(Y) = (Y^N)$, $\wedge_{[(Y^N)]} E$ will be simply denoted by $\wedge_{[N]} E$. One clearly has $\wedge_{[2]} E = \wedge_2 E = \wedge E$ for $N = 2$. In the case $N = 3$, it can be shown that $\mathrm{Ker}(\lambda_3)$ is the two-sided ideal of $T(E)$ generated by the
\[ X \otimes Y \otimes Z + Z \otimes X \otimes Y + Y \otimes Z \otimes X \quad \text{and the} \quad X \otimes Y \otimes X \otimes X \]
for $X, Y, Z \in E$. This implies that one has
\[ \lambda_3(X) \lambda_3(Y) \lambda_3(Z) + \lambda_3(Z) \lambda_3(X) \lambda_3(Y) + \lambda_3(Y) \lambda_3(Z) \lambda_3(X) = 0 \quad \text{and} \quad \lambda_3(X) \lambda_3(Y) (\lambda_3(X))^2 = 0 \]
for any $X, Y, Z \in E$ and that these are the only independent relations in the associative algebra $\mathrm{Im}(\lambda_3) = \wedge_{[3]}E$. This means that $\wedge_{[3]}E$ is the associative unital algebra generated by the subspace $E$ with relations $XYZ + ZXY + YZX = 0$ and $XYX^2 = 0$ for $X, Y, Z \in E$. The grading is induced by giving degree one to the elements of $E$, which is consistent since the relations are homogeneous. It is clear from this example that the homogeneous subspaces $\wedge^p_{[N]}E$ of $\wedge_{[N]}E$ are generally not irreducible for the (obvious) action of $GL(E)$. It is not hard to see that one has $\omega_0 \wedge_{[N]}E = \wedge_N E$, where $\omega_0$ is a generator ($\simeq \mathbb{1}$) of $\wedge^0_N E \simeq \mathbb{R}$, that is, $\omega_0$ is a cyclic vector for the action of $\wedge_{[N]}E$ on $\wedge_N E$.

Corresponding to the generalization $\wedge_{[(Y)]}E$ of the exterior algebra there is a generalization $\Omega_{[(Y)]}(M)$ of differential forms on a smooth manifold $M$, which is defined in the same way as $\Omega_{(Y)}(M)$ was defined in Sect. 3. This $\Omega_{[(Y)]}(M)$ is then a graded associative algebra and $\Omega_{(Y)}(M)$ is a graded right $\Omega_{[(Y)]}(M)$-module (etc.). In the case $(Y) = (Y^N)$ one writes $\Omega_{[N]}(M)$ for this generalization. For $M = \mathbb{R}^D$ one has $\Omega_{[N]}(\mathbb{R}^D) = \wedge_{[N]}\mathbb{R}^{D*} \otimes C^\infty(\mathbb{R}^D)$ and, by identifying $\Omega_{[N]}(\mathbb{R}^D)$ as a graded subspace of $T(\mathbb{R}^{D*}) \otimes C^\infty(\mathbb{R}^D)$ and by using the canonical flat torsion-free linear connection of $\mathbb{R}^D$, one can define an $N$-differential $d$ on $\Omega_{[N]}(\mathbb{R}^D)$ by appropriate projection. One can proceed similarly for $\Omega_{[(Y)]}(\mathbb{R}^D)$ when $(Y)$ satisfies the assumption of Lemma 1 (or Lemma 2, Lemma 7).

More precisely, the $N$-complexes constructed so far are particular cases of the following general construction. Let $A = \oplus_{n\in\mathbb{N}} A^n$ be an associative unital graded algebra generated by $D$ elements of degree one $\theta^\mu$, $\mu \in \{1, \dots, D\}$, such that
$$\sum_{p\in S_N} \theta^{\mu_{p(1)}} \cdots \theta^{\mu_{p(N)}} = 0 \qquad (39)$$
for any $\mu_1, \dots, \mu_N \in \{1, \dots, D\}$. Then the algebra $A(\mathbb{R}^D)$ defined by $A(\mathbb{R}^D) = A \otimes C^\infty(\mathbb{R}^D)$ is a graded algebra and one defines an $N$-differential $d$ on $A(\mathbb{R}^D)$, i.e. a linear mapping $d$ of degree one satisfying $d^N = 0$, by setting
$$d(a \otimes f) = (-1)^n a\theta^\mu \otimes \partial_\mu f \qquad (40)$$
for $a \in A^n$ and $f \in C^\infty(\mathbb{R}^D)$. Let $\mathcal{M} = \oplus_n \mathcal{M}^n$ be a graded right $A$-module; then $\mathcal{M}(\mathbb{R}^D) = \mathcal{M} \otimes C^\infty(\mathbb{R}^D)$ is a graded space which is a graded right $A(\mathbb{R}^D)$-module and one defines an $N$-differential $d$ on $\mathcal{M}(\mathbb{R}^D)$ by setting
$$d(m \otimes f) = (-1)^n m\theta^\mu \otimes \partial_\mu f \qquad (41)$$
for $m \in \mathcal{M}^n$ and $f \in C^\infty(\mathbb{R}^D)$. The (irrelevant) sign $(-1)^n$ in formulas (40) and (41) is here in order to recover the usual exterior differential in the case where $A = \wedge\mathbb{R}^{D*} = \mathcal{M}$. It is clear that $\Omega_{[N]}(\mathbb{R}^D) = A(\mathbb{R}^D)$ for $A = \wedge_{[N]}\mathbb{R}^{D*}$ and that $\Omega_N(\mathbb{R}^D) = \mathcal{M}(\mathbb{R}^D)$ for $\mathcal{M} = \wedge_N\mathbb{R}^{D*}$. If $(Y)$ satisfies the assumption of Lemma 1 one can take (in view of Lemma 7) $A = \wedge_{[(Y)]}\mathbb{R}^{D*}$ and $\mathcal{M} = \wedge_{(Y)}\mathbb{R}^{D*}$, and then $\Omega_{[(Y)]}(\mathbb{R}^D) = A(\mathbb{R}^D)$ and $\Omega_{(Y)}(\mathbb{R}^D) = \mathcal{M}(\mathbb{R}^D)$.
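The reason why the symmetrization condition (39) forces $d^N = 0$ in (40) can be spelled out in two lines. The following is only a sketch; the symbol $\varepsilon$ is introduced here for the overall sign produced by the factors $(-1)^n$, whose exact value plays no role.

```latex
% Sketch: d^N = 0 from (39) and (40), for a in A^n and f smooth on R^D.
\begin{align*}
d^N(a\otimes f)
  &= \varepsilon\, a\,\theta^{\mu_1}\cdots\theta^{\mu_N}
     \otimes \partial_{\mu_N}\cdots\partial_{\mu_1} f ,
     \qquad \varepsilon = \pm 1, \\
  &= \frac{\varepsilon}{N!}\, a \sum_{p\in S_N}
     \theta^{\mu_{p(1)}}\cdots\theta^{\mu_{p(N)}}
     \otimes \partial_{\mu_1}\cdots\partial_{\mu_N} f
   \;=\; 0 .
\end{align*}
% Second line: partial derivatives commute, so only the part of
% \theta^{\mu_1}\cdots\theta^{\mu_N} that is symmetric in the \mu's
% survives the contraction, and that part vanishes by (39).
```

The same computation applies verbatim to (41), with $a$ replaced by an element $m$ of the module.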
8. Further Remarks

Our original unpublished project for proving Theorem 1 was based on the construction of generalized algebraic homotopy in appropriate degrees. Let us explain what this means, why it is rather cumbersome and why the proof given here, based on the introduction of the multigraded differential algebra A, is much more instructive and general and is related to the ansatz of Green for the fermionic parastatistics of order $N - 1$ (in the case $d^N = 0$). Let $\Omega = \oplus_n \Omega^n$ be an $N$-complex (of cochains) with $N$-differential $d$. An algebraic homotopy for the degree $n$ will be a family of $N$ linear mappings $h_k : \Omega^{n+k} \to \Omega^{n+k-N+1}$ for $k = 0, \dots, N-1$ such that $\sum_{k=0}^{N-1} d^{N-1-k} h_k d^k$ is the identity mapping $I_n$ of $\Omega^n$ onto itself. If such a homotopy exists for the degree $n$, then one has $H^n_{(k)} = 0$ for $k \in \{0, \dots, N-1\}$. Indeed let $\omega \in \Omega^n$ be such that $d^k\omega = 0$; then one has
$$\omega = d^{N-k}\Big(\sum_{p=0}^{k-1} d^{k-1-p}\, h_p\, d^p \omega\Big).$$
Our original strategy for proving Theorem 1 was to show that one can construct inductively such homotopies for the degrees $(N-1)p$ with $p \geq 1$ in the case of the $N$-complex $\Omega_N(\mathbb{R}^D)$, and our idea was to exhibit explicit formulas. Unfortunately this latter task seems very difficult in general. We only succeeded in producing formulas in closed form in the case $N = 3$ and we refrain from giving them here because this would require explaining our normalization conventions, which are in no way natural. The difficulty is indeed a problem of normalization. For the classical case $N = 2$, one obtains a homotopy formula by using the inner derivation $i_x$ with respect to the vector field $x$ with components $x^\mu$. In this case one uses the fact that both $d$ and $i_x$ are antiderivations and that the Lie derivative $L_x = d\,i_x + i_x d$ is the sum of the form-degree and of the degree of homogeneity in $x$. This gives homotopy formulas for forms which are homogeneous polynomials in $x$; one gets rid of the above degree by appropriately weighted radial integration and thereby obtains the usual homotopy formula for positive form-degree. In this case the normalizations are fixed by the (anti)derivation properties. In the case $N \geq 3$, $d$ has no derivation property and one has to generalize $i_x$, which is possible with $i_x^N = 0$, but there is no natural normalization since $i_x$ cannot possess the derivation property. As a consequence the appropriate generalization of the Lie derivative involves a linear combination of products of powers of $d$ and $i_x$ with coefficients which have to be fixed at each step. That this is possible constitutes a cumbersome proof of Theorem 1 but does not allow one to easily write closed formulas. The interest of the proof of Theorem 1 presented here lies in the fact that it follows from the more general Theorem 2, which can be applied to other situations, in particular to investigate the generalized cohomology of $\Omega_{[N]}(\mathbb{R}^D)$.
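The step from the homotopy condition to the exactness statement used above can be written out as follows. This is a sketch in the notation of the preceding paragraph, with $h_p$ the homotopy maps in degree $n$ and $\omega$ an element of degree $n$ satisfying $d^k\omega = 0$.

```latex
\begin{align*}
\omega &= \sum_{p=0}^{N-1} d^{\,N-1-p}\, h_p\, d^{\,p}\omega
        && \text{(homotopy condition in degree $n$)} \\
       &= \sum_{p=0}^{k-1} d^{\,N-1-p}\, h_p\, d^{\,p}\omega
        && \text{(since $d^{\,p}\omega = 0$ for $p \ge k$)} \\
       &= d^{\,N-k}\Big(\sum_{p=0}^{k-1} d^{\,k-1-p}\, h_p\, d^{\,p}\omega\Big)
        && \text{(since $N-1-p = (N-k) + (k-1-p)$)} .
\end{align*}
```

Thus every element of degree $n$ annihilated by $d^k$ lies in the image of $d^{N-k}$, which is the vanishing of the corresponding generalized cohomology.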
Moreover, the realization of $\Omega_N(\mathbb{R}^D)$ embedded in A is related to the Green ansatz for the parafermionic creation operators of order $N - 1$. Indeed, if instead of equipping A with the graded commutative product one replaces, in the definition of A, the graded tensor products of graded algebras by the ordinary tensor products of algebras (applying the appropriate Klein transformation), then the $d_i x^\mu$ and the $d_j x^\nu$ commute for $i \neq j$ and the $d_i$ defined by the same formulas (14) commute, i.e. satisfy $d_i d_j = d_j d_i$ instead of (15), from which it follows that $\sum_i d_i$ is only an $N$-differential. This latter $N$-differential induces the $N$-differential of $\Omega_N(\mathbb{R}^D) \subset$ A and the relation with the Green ansatz becomes obvious after Fourier transformation.
The basic $N$-complexes considered in this paper are $N$-complexes of smooth tensor fields on $\mathbb{R}^D$ and we have seen the difficulty of extending the formalism to an arbitrary $D$-dimensional manifold $M$. In the case of a complex (holomorphic) manifold $M$ of complex dimension $D$, there is an extension of the previous formalism at the $\bar\partial$-level which we now describe briefly. Let $M$ be a complex manifold of complex dimension $D$ and let $T$ be a smooth covariant tensor field of type $(0, p)$ (i.e. of $d\bar z$-degree $p$) with local components $T_{\bar\mu_1 \dots \bar\mu_p}$ in local holomorphic coordinates $z^1, \dots, z^D$. Then $\partial_{\bar\mu_{p+1}} T_{\bar\mu_1 \dots \bar\mu_p}$ are the components of a well-defined smooth covariant tensor field $\bar\nabla T$ of type $(0, p+1)$ since the transition functions are holomorphic, where $\partial_{\bar\mu}$ denotes the partial derivative $\partial/\partial \bar z^\mu$ of smooth functions. Let $(Y)$ be a sequence $(Y_p)_{p\in\mathbb{N}}$ of Young diagrams such that $|Y_p| = p$ ($\forall p \in \mathbb{N}$) and denote by $\Omega^{0,p}_{(Y)}(M)$ the space of smooth covariant tensor fields of type $(0, p)$ with Young symmetry type $Y_p$ (with obvious notation). Let us set $\Omega^{0,*}_{(Y)}(M) = \oplus_p \Omega^{0,p}_{(Y)}(M)$ and generalize the $\bar\partial$-operator by setting
$$\bar\partial = (-1)^p\, \mathbf{Y}_{p+1} \circ \bar\nabla : \Omega^{0,p}_{(Y)}(M) \to \Omega^{0,p+1}_{(Y)}(M)$$
with obvious conventions. It is clear that if $(Y)$ is such that for any $p \in \mathbb{N}$ the number of columns of $Y_p$ is strictly less than $N$, then one has $\bar\partial^N = 0$, so $\Omega^{0,*}_{(Y)}(M)$ is an $N$-complex (for $\bar\partial$). In particular one has the $N$-complex $\Omega^{0,*}_N(M)$ for $\bar\partial$ by taking $(Y) = (Y^N)$. One has an obvious extension of Theorem 1 ensuring that the generalized $\bar\partial$-cohomology of $\Omega^{0,*}_N(\mathbb{C}^D)$ vanishes in degree $(N-1)p$ (i.e. bidegree or type $(0, (N-1)p)$) for $p \geq 1$. It is thus natural to seek an interpretation of this generalized cohomology for $\Omega^{0,*}_N(M)$ in degrees $(N-1)p$ with $p \geq 1$ for an arbitrary complex manifold $M$, and one may wonder whether it can be computed in terms of the ordinary $\bar\partial$-cohomology of $M$.
References

1. Becchi, C., Rouet, A., Stora, R.: Renormalization models with broken symmetries. In: Renormalization Theory, Erice 1975, G. Velo, A.S. Wightman, eds, Dordrecht: Reidel, 1976
2. Boerner, H.: Representations of groups. Amsterdam: North Holland, 1970
3. Boulanger, N., Damour, T., Gualtieri, L., Henneaux, M.: Inconsistency of interacting, multigraviton theories. Nucl. Phys. B597, 127–171 (2001)
4. de Wit, B., Freedman, D.Z.: Systematics of higher-spin gauge fields. Phys. Rev. D21, 358–367 (1980)
5. Damour, T., Deser, S.: Geometry of spin 3 gauge theories. Ann. Inst. H. Poincaré 47, 277–307 (1987)
6. Dubois-Violette, M.: Generalized differential spaces with $d^N = 0$ and the q-differential calculus. Czech J. Phys. 46, 1227–1233 (1997)
7. Dubois-Violette, M.: $d^N = 0$: Generalized homology. K-Theory 14, 371–404 (1998)
8. Dubois-Violette, M.: Generalized homologies for $d^N = 0$ and graded q-differential algebras. Contemp. Math. 219, 69–79 (1998)
9. Dubois-Violette, M.: Lectures on differentials, generalized differentials and some examples related to theoretical physics. LPT-ORSAY 00/31; math.QA/0005256
10. Dubois-Violette, M., Henneaux, M.: Generalized cohomology for irreducible tensor fields of mixed Young symmetry type. Lett. Math. Phys. 49, 245–252 (1999)
11. Dubois-Violette, M., Henneaux, M., Talon, M., Viallet, C.M.: Some results on local cohomologies in field theory. Phys. Lett. B267, 81–87 (1991)
12. Dubois-Violette, M., Kerner, R.: Universal q-differential calculus and q-analog of homological algebra. Acta Math. Univ. Comenian 65, 175–188 (1996)
13. Dubois-Violette, M., Todorov, I.T.: Generalized cohomology and the physical subspace of the SU(2) WZNW model. Lett. Math. Phys. 42, 183–192 (1997)
14. Dubois-Violette, M., Todorov, I.T.: Generalized homology for the zero mode of the SU(2) WZNW model. Lett. Math. Phys. 48, 323–338 (1999)
15. Fronsdal, C.: Massless fields with integer spins. Phys. Rev. D 18, 3624 (1978)
16. Fulton, W.: Young tableaux. Cambridge: Cambridge University Press, 1997
17. Gasqui, J.: Sur les structures de courbure d’ordre 2 dans $\mathbb{R}^n$. J. Differ. Geom. 12, 493–497 (1977)
18. Henneaux, M., Teitelboim, C.: Quantization of gauge systems. Princeton, NJ: Princeton University Press, 1992
19. Kapranov, M.M.: On the q-analog of homological algebra. Preprint Cornell University 1991; q-alg/9611005
20. Kassel, C., Wambst, M.: Algèbre homologique des N-complexes et homologies de Hochschild aux racines de l’unité. Publ. RIMS, Kyoto Univ. 34, 91–114 (1998)
21. Mayer, M.: A new homology theory I, II. Ann. of Math. 43, 370–380 and 594–605 (1942)
22. Singh, L.P.S., Hagen, C.R.: Lagrangian formulation for arbitrary spin. 1. The boson case. Phys. Rev. D 9, 898–909 (1974)
23. Wald, R.M.: Spin-two fields and general covariance. Phys. Rev. D 33, 3613–3625 (1986)
24. Wambst, M.: Homologie cyclique et homologie simpliciale aux racines de l’unité. K-Theory 23, 377–397 (2001)
Communicated by A. Connes
Commun. Math. Phys. 226, 419 – 432 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Noncommutative Instantons on the 4-Sphere from Quantum Groups F. Bonechi1 , N. Ciccoli2 , M. Tarlini1 1 INFN Sezione di Firenze, Dipartimento di Fisica, Università di Firenze, Via G. Sansone 1,
50019 Sesto F.no (Fi), Italy. E-mail: [email protected]; [email protected]
2 Dipartimento di Matematica e Informatica, Università di Perugia, via Vanvitelli 1, 06123 Perugia, Italy.
E-mail: [email protected] Received: 3 January 2001 / Accepted: 14 November 2001
Abstract: We describe an approach to the noncommutative instantons on the 4-sphere based on quantum group theory. We quantize the Hopf bundle $S^7 \to S^4$ making use of the concept of quantum coisotropic subgroups. The analysis of the semiclassical Poisson–Lie structure of $U(4)$ shows that the diagonal $SU(2)$ must be conjugated to be properly quantized. The quantum coisotropic subgroup we obtain is the standard $SU_q(2)$; it determines a new deformation of the 4-sphere $\Sigma_q^4$ as the algebra of coinvariants in $S_q^7$. We show that the quantum vector bundle associated to the fundamental corepresentation of $SU_q(2)$ is finitely generated and projective and we compute the explicit projector. We give the unitary representations of $\Sigma_q^4$, we define two 0-summable Fredholm modules and we compute the Chern–Connes pairing between the projector and their characters. It turns out that even the zero class in cyclic homology is non-trivial.

1. Introduction

Since the work [25] on instantons on noncommutative $\mathbb{R}^4$ a lot of attention has been devoted to the problem of gauge theories on noncommutative four-manifolds. In ordinary differential geometry, the topological properties of instantons in $\mathbb{R}^4$ are better understood by studying fibre bundles on the sphere $S^4$. In noncommutative geometry this is not an easy task: it is more natural to define the problem directly on the noncommutative sphere. Very recently, in [10] and [12], two different deformations of $S^4$ were proposed. The one in [10] preserves the property that the first Chern class vanishes, whereas in [12] it is non-trivial. In this second case the deformation is a suspension of the quantum 3-sphere $SU_q(2)$ obtained by adding a central generator. In this paper we propose an alternative approach, based more directly on quantum groups and on Hopf algebraic techniques. In noncommutative geometry finitely generated projective modules, i.e. the quantum vector bundles, are the central objects used to develop gauge theories.
From this point of view there is no obvious notion of structure group. Quantum groups provide a construction
of quantum vector bundles which is closer to ordinary differential geometry. The first attempts go back to [14], [26] and [6], where the gauge theory is developed starting from the notion of Hopf–Galois extension, which is the analogue of principal bundles in the Hopf algebra setting, see [28]. The associated quantum vector bundles have a Hopf algebra on the fiber and, if they admit a connection, are finitely generated and projective modules [11]. Although this definition works in principle, it is not enough to explain all known interesting examples. This problem is better understood if we concentrate on the specific class of principal bundles given by homogeneous spaces. A quantum homogeneous space is an example of Hopf–Galois extension only if it is obtained as a quotient by a quantum subgroup (i.e. a Hopf algebra quotient). But quantum subgroups are very rare. For instance, among the quantum 2-spheres introduced by Podleś in [27] only one, the standard one, is such an example. It is necessary to generalize the notion of subgroup, allowing a more general quotient procedure. This is possible by using quantum coisotropic subgroups: they are quotients by a coideal which is a right (or left) ideal, so that they inherit only the coalgebra structure, while the algebra structure is weakened to that of a right (or left) module. Their semiclassical interpretation is illuminating: in a Poisson–Lie group every Poisson (resp. coisotropic) subgroup can be quantized to a quantum (resp. coisotropic) subgroup (see [8]). Nevertheless conjugation, which does not change topology, can break Poisson properties: for instance a subgroup conjugate to a Poisson subgroup can be only coisotropic or can have no Poisson properties at all (see for instance $SL(2, \mathbb{R})$ in [3]). Coisotropic subgroups can be quantized and give rise to inequivalent quantum homogeneous spaces: for instance all the Podleś quantum spheres are obtained as a quotient by a coisotropic $U(1)$.
The general scheme to describe such examples could be the so-called C-Galois extensions, see for instance [5, 4]. The principal bundle on $S^4$ corresponding to $SU(2)$ instantons with charge $-1$ has $S^7 = U(3)\backslash U(4)$ as total space and the action on the fibre is obtained by considering $SU(2)$ as the diagonal subgroup of $U(4)$. In this description $S^4$ is the double coset $U(3)\backslash U(4)/SU(2)$. In the quantum setting, odd spheres were obtained in [30] as homogeneous spaces of $U_q(N)$ with respect to the quantum subgroup $U_q(N-1)$, so that the left quotient is easily quantized. The right quotient is more problematic because the diagonal $SU(2)$ does not survive in the quantization of $U(4)$; indeed the analysis of the limit Poisson structure on $U(4)$ shows that it is not coisotropic. We then have to look for coisotropic subgroups in the conjugacy class of the diagonal one. It turns out that there is at least one which defines what we call the Poisson Hopf bundle in $S^4$ (Proposition 3). In this bundle, which is topologically equivalent to the usual Hopf bundle, both the total and the base spaces are Poisson manifolds and the projection is a Poisson map. Its quantization is straightforward: the quantum coisotropic subgroup turns out to be equivalent as a coalgebra to $SU_q(2)$ (Proposition 4) and the algebra of functions over the quantum 4-sphere $\Sigma_q^4$ is then obtained as the subalgebra of coinvariants in $S_q^7$ with respect to this $SU_q(2)$ (Proposition 6). This deformation of the algebra of functions on $S^4$ is different from those introduced in [10] and [12]. We then study the quantum vector bundle associated to the fundamental corepresentation of $SU_q(2)$ and give the explicit projector (Proposition 7). We describe the unitary irreducible representations of $\Sigma_q^4$ (Eqs. (7) and (8)); there is a 1-dimensional representation and an infinite dimensional one realized by trace class operators (Proposition 8).
Finally we study the Chern class in cyclic homology of the projector and compute the Chern–Connes pairing with a trace induced by the trace class representation (Proposition 10). It turns out that, contrary to [10] and [12], they are all non-trivial. This result is the analogue of what was obtained in [22, 18] for the standard Podleś 2-sphere.

2. Quantization of Coisotropic Subgroups

A Poisson–Lie group $(G, \{\,,\})$ is a Lie group $G$ with a Poisson bracket $\{\,,\}$ such that the multiplication map $m : G \times G \to G$ is a Poisson map with respect to the product Poisson structure on $G \times G$. The Poisson bracket $\{\,,\}$ is identified by a bivector $\omega$ (i.e. a section of $\Lambda^2 TG$) such that $\{\phi_1, \phi_2\}(x) = \omega(x)(d_x\phi_1, d_x\phi_2)$. (For more details see [7] and [29].) Every Poisson–Lie group induces a natural bialgebra structure on $\mathfrak{g} = \mathrm{Lie}(G)$ which will be called the tangent bialgebra of $G$. Indeed, $\delta : \mathfrak{g} \to \mathfrak{g} \wedge \mathfrak{g}$ is defined by $\langle X, d_e\{f, g\}\rangle = \langle \delta(X), f \otimes g\rangle$, where $X \in \mathfrak{g}$ and $f, g \in C^\infty(G)$. The point we want to discuss is the behaviour of subgroups and corresponding homogeneous spaces with respect to the Poisson structure. A Lie subgroup $H$ of $G$ is called a Poisson–Lie subgroup if it is also a Poisson submanifold of $G$, i.e. if the immersion map $\imath : H \to G$ is a Poisson map. There are various characterizations for such subgroups: as invariant subspaces for the dressing action or as unions of symplectic leaves [21]. The property of being a Poisson–Lie subgroup is, evidently, a very strong one. We then need to characterize a family of subgroups satisfying weaker hypotheses with respect to the Poisson structure. In Poisson geometry a submanifold $N$ of a Poisson manifold $(M, \omega)$ is said to be coisotropic if $\omega|_{\mathrm{Ann}(TN)} = 0$, where $\mathrm{Ann}(T_xN) = \{\alpha \in T_x^*(M) \mid \alpha(v) = 0 \ \forall v \in T_xN\}$. Coisotropy can be formulated very neatly as an algebraic property at the function algebra level (see [29]). Indeed a locally closed submanifold $N$ of the Poisson manifold $(M, \omega)$ is coisotropic if and only if for every $f, g \in C^\infty(M)$,
$$f|_N = 0,\ g|_N = 0 \ \Rightarrow\ \{f, g\}|_N = 0.$$
Thus locally closed coisotropic submanifolds correspond to manifolds whose defining ideal is not a Poisson ideal but only a Poisson subalgebra.
A Lie subgroup $H$ of $(G, \omega)$ is said to be a coisotropic subgroup if it is coisotropic as a Poisson submanifold. In the connected case there are nice characterizations, as shown for example in [20]; we will need the following one:

Proposition 1. A connected subgroup $H$ of $(G, \omega)$ with $\mathfrak{h} = \mathrm{Lie}\,H$ is coisotropic iff $\delta(\mathfrak{h}) \subset \mathfrak{g} \wedge \mathfrak{h}$ and it is Poisson–Lie iff $\delta(\mathfrak{h}) \subset \mathfrak{h} \wedge \mathfrak{h}$.

Given a Poisson–Lie group $G$ and a coisotropic subgroup $H$, the natural projection map $G \to H\backslash G$ coinduces a Poisson structure on the quotient. If $K$ is a second subgroup of $G$, a condition which guarantees that the projection onto the double coset is also Poisson is given by the following:

Proposition 2 ([20]). Let $(M, \omega_0)$ be a Poisson manifold with a Poisson action of a Poisson–Lie group $(G, \omega)$. Let $K$ be a coisotropic connected subgroup of $G$. If the orbit space $M/K$ is a manifold there exists a unique coinduced Poisson bracket such that the natural projection $M \to M/K$ is Poisson.

We now recall how these concepts can be translated to a Hopf algebra setting (see [2, 8] for more details). Given a real quantum group $(A, *, \Delta, S, \epsilon)$ we will call real coisotropic quantum right (left) subgroup $(K, \tau_K)$ a coalgebra, right (left) $A$-module $K$ such that:
(i) there exists a surjective linear map $\pi : A \to K$, which is a morphism of coalgebras and of $A$-modules (where $A$ is considered as a module on itself via multiplication);
(ii) there exists an antilinear map $\tau_K : K \to K$ such that $\tau_K \circ \pi = \pi \circ \tau$, where $\tau = * \circ S$.

A $*$-Hopf algebra $S$ is said to be a real quantum subgroup of $A$ if there exists a $*$-Hopf algebra epimorphism $\pi : A \to S$; evidently this is a particular coisotropic subgroup. We remark that a coisotropic quantum subgroup is not in general a $*$-coalgebra but only has an involution $\tau_K$ defined on it. Right (left) coisotropic quantum subgroups are obviously characterized by the kernel of the projection, which is a $\tau$-invariant two-sided coideal, right (left) ideal in $A$. It is easy to verify that if the kernel is also $*$-invariant then it is an ideal and the quotient is a real quantum subgroup. A $*$-algebra $B$ is said to be an embeddable quantum left (right) $A$-homogeneous space if there exists a coaction $\mu : B \to B \otimes A$ ($\mu : B \to A \otimes B$) and an injective morphism of $*$-algebras $\imath : B \to A$ such that $\Delta \circ \imath = (\imath \otimes \mathrm{id}) \circ \mu$ ($\Delta \circ \imath = (\mathrm{id} \otimes \imath) \circ \mu$). Embeddable quantum homogeneous spaces can be obtained as the space of coinvariants with respect to the coaction of coisotropic quantum subgroups. For instance, if $K$ is a right (left) subgroup and $\Delta_\pi = (\mathrm{id} \otimes \pi)\Delta$ (${}_\pi\Delta = (\pi \otimes \mathrm{id})\Delta$), then
$$B^\pi = \{a \in A \mid \Delta_\pi a = a \otimes \pi(1)\} \qquad ({}^\pi B = \{a \in A \mid {}_\pi\Delta a = \pi(1) \otimes a\})$$
is a homogeneous space with $\mu = \Delta$. If $\rho : V \to K \otimes V$ is a corepresentation of $K$, we define the cotensor product as
$$A \,\square_\rho\, V = \{F \in A \otimes V \mid (\Delta_\pi \otimes \mathrm{id})F = (\mathrm{id} \otimes \rho)F\}.$$
We have that $A \,\square_\rho\, V$ is a left $B^\pi$-module. Let $\rho$ be unitary and $\{e_i\}$ be an orthonormal basis of $V$; if $F = \sum_i F_i \otimes e_i$, let us define $\langle F, G\rangle = \sum_i F_i G_i^*$. It is shown in [2] that $\langle\,,\rangle$ is a sesquilinear form on $A \,\square_\rho\, V$ with values in $B^\pi$. The correspondence between coisotropic quantum subgroups and embeddable quantum homogeneous spaces is bijective only provided some faithful flatness conditions on the module and comodule structures are satisfied (see [23] for more details). The role of coisotropic subgroups can also be appreciated in the context of formal and algebraic equivariant quantization. While it is known that not every Poisson homogeneous space admits such a quantization, it holds true that every quotient of a Poisson–Lie group by a coisotropic subgroup can be equivariantly quantized. Although such quotients do not exhaust the class of quantizable Poisson spaces they provide a large subclass of it. Furthermore, in functorial quantization they correspond to embeddable quantum homogeneous spaces. More on the subject can be found in [15].
3. The Classical Instanton with k = −1

In this section we review the construction of the principal bundle corresponding to instantons with topological charge $k = -1$ (see [1]). We denote by $\mathbb{H}$ the quaternions generated by $i, j, k$ with the usual relations $i^2 = j^2 = k^2 = -1$, and $ij = -ji = k$, $jk = -kj = i$, $ki = -ik = j$. The total space of the bundle is defined as $E = \{(q_1, q_2) \in \mathbb{H}^2 \mid |q_1|^2 + |q_2|^2 = 1\}$, the base space is $P^1(\mathbb{H}) = \{[(q_1, q_2)] \mid (q_1, q_2) \sim (q_1\lambda, q_2\lambda),\ (q_1, q_2) \in \mathbb{H}^2,\ \lambda \in \mathbb{H}\}$ and the bundle projection is $p(q_1, q_2) = [(q_1, q_2)]$. The fibre is $SU(2)$ which acts on $\mathbb{H}^2$ by the diagonal right multiplication of quaternions
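of unit modulus. The quaternionic polynomial functions $B = \mathrm{Pol}(P^1(\mathbb{H}))$ on the base space are generated by $R = q_1\bar q_1$ and $Q = q_1\bar q_2$, with the relation $|Q|^2 = R(1 - R)$. The fundamental representation of $SU(2)$ can be realized again by right multiplication of unit quaternions on $\mathbb{H}$. The space $\mathcal{E}$ of sections of the associated vector bundle is the space of equivariant functions $s : E \to \mathbb{H}$, i.e. such that $s(q_1, q_2)\lambda = s(q_1\lambda, q_2\lambda)$ for $|\lambda| = 1$. It is generated as a left $B$-module by $s_1(q_1, q_2) = q_1$ and $s_2(q_1, q_2) = q_2$ and it has a hermitian structure $\langle\,,\rangle : \mathcal{E} \times \mathcal{E} \to B$ given by $\langle s_1, s_2\rangle = s_1\bar s_2$. We can define $G \in M_2(B)$ with $G_{ij} = \langle s_i, s_j\rangle$. By direct computation we obtain that
$$G = G^2 = \begin{pmatrix} R & Q \\ \bar Q & 1 - R \end{pmatrix}. \qquad (1)$$
It is then easy to verify that $\mathcal{E} \simeq B^2 G$.

For our future purposes, we have to describe this bundle in Hopf algebraic language. We first remark that $E$ is isomorphic to $S^7 = U(3)\backslash U(4)$ and $P^1(\mathbb{H})$ to $S^4 = U(3)\backslash U(4)/SU(2)$. Let $t_f = \{t_{ij}\}_{i,j=1}^4$ define the fundamental representation of $U(4)$. Then $\Delta(t_{ij}) = \sum_k t_{ik} \otimes t_{kj}$ and let $\ell : \mathrm{Pol}(U(4)) \to \mathrm{Pol}(U(3))$ be the Hopf algebra projection defined by $\ell(t_{4j}) = \ell(t_{j4}) = 0$ for $j = 1, 2, 3$, and $\ell(t_{44}) = 1$. The algebra of polynomial functions on $S^7$ is given by the coinvariants ${}^{\ell}\mathrm{Pol}(U(4))$ and it is generated by $z_i = t_{4i}$ with the relation $\sum_i |z_i|^2 = 1$. Let $r : \mathrm{Pol}(U(4)) \to \mathrm{Pol}(SU(2))$ be the Hopf algebra projection defined by
$$r(t) = \begin{pmatrix} \alpha & \beta & 0 & 0 \\ -\beta^* & \alpha^* & 0 & 0 \\ 0 & 0 & \alpha & \beta \\ 0 & 0 & -\beta^* & \alpha^* \end{pmatrix}, \qquad |\alpha|^2 + |\beta|^2 = 1.$$
As usual $\mathrm{Pol}(U(4)/SU(2))$ is obtained as the space of coinvariants $\mathrm{Pol}(U(4))^r$. The algebra of polynomial functions on $S^4$ is ${}^{\ell}\mathrm{Pol}(U(4)) \cap \mathrm{Pol}(U(4))^r$ and is generated by $R = |z_1|^2 + |z_2|^2$, $A = z_1 z_3^* + z_2 z_4^*$ and $B = z_1 z_4 - z_2 z_3$, with the relation $|A|^2 + |B|^2 = R(1 - R)$. Let $\tau_f : \mathbb{C}^2 \to \mathrm{Pol}(SU(2)) \otimes \mathbb{C}^2$ be the fundamental corepresentation of $\mathrm{Pol}(SU(2))$,
$$\tau_f \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix} \otimes \begin{pmatrix} e_1 \\ e_2 \end{pmatrix}.$$
The left $\mathrm{Pol}(S^4)$-module of sections of the associated vector bundle is obtained as $\mathcal{E} = \mathrm{Pol}(S^7) \,\square_{\tau_f}\, \mathbb{C}^2$.

As a $\mathrm{Pol}(S^4)$-module, $\mathcal{E}$ is generated by
$$f_1 = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \quad f_2 = \begin{pmatrix} z_2^* \\ -z_1^* \end{pmatrix}, \quad f_3 = \begin{pmatrix} z_3 \\ z_4 \end{pmatrix}, \quad f_4 = \begin{pmatrix} z_4^* \\ -z_3^* \end{pmatrix}.$$
With the usual hermitian structure we define $G \in \mathrm{Pol}(S^4) \otimes M_4(\mathbb{C})$ with $G_{ij} = \langle f_i, f_j\rangle$ and obtain that
$$G = G^2 = \begin{pmatrix} R & 0 & A & B \\ 0 & R & -B^* & A^* \\ A^* & -B & 1 - R & 0 \\ B^* & A & 0 & 1 - R \end{pmatrix}. \qquad (2)$$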
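The projector (2) can be cross-checked numerically. The sketch below is not part of the original text: it assumes the generators $f_1, \dots, f_4$ are the two-component vectors $(z_1, z_2)$, $(z_2^*, -z_1^*)$, $(z_3, z_4)$, $(z_4^*, -z_3^*)$ and that the hermitian structure is $\langle x, y\rangle = x_1 y_1^* + x_2 y_2^*$. The function names are hypothetical.

```python
import numpy as np

def classical_projector(z):
    """Return (G_gram, G_closed) for a point z = (z1, z2, z3, z4) on S^7."""
    z1, z2, z3, z4 = z
    c = np.conj
    # Generators f1..f4 of the module of sections, stored as rows of a 4x2 array.
    f = np.array([[z1, z2],
                  [c(z2), -c(z1)],
                  [z3, z4],
                  [c(z4), -c(z3)]])
    # Hermitian structure <x, y> = x1 conj(y1) + x2 conj(y2), so G_ij = <f_i, f_j>.
    G_gram = f @ f.conj().T
    # Closed form (2) in terms of the generators R, A, B of Pol(S^4).
    R = abs(z1) ** 2 + abs(z2) ** 2
    A = z1 * c(z3) + z2 * c(z4)
    B = z1 * z4 - z2 * z3
    G_closed = np.array([[R, 0, A, B],
                         [0, R, -c(B), c(A)],
                         [c(A), -B, 1 - R, 0],
                         [c(B), A, 0, 1 - R]])
    return G_gram, G_closed

def random_point_on_s7(seed=0):
    """A pseudo-random unit vector in C^4."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=4) + 1j * rng.normal(size=4)
    return z / np.linalg.norm(z)

if __name__ == "__main__":
    G_gram, G_closed = classical_projector(random_point_on_s7())
    print("Gram matrix equals (2):", np.allclose(G_gram, G_closed))
    print("G is idempotent:", np.allclose(G_closed @ G_closed, G_closed))
    print("trace of G:", np.trace(G_closed).real)
```

Idempotency follows from the relation $|A|^2 + |B|^2 = R(1 - R)$, and the trace equals 2, the rank of the bundle.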
With the usual representation of $\mathbb{H}$ as $\mathbb{C}^2$, where $(z_1, z_2)$ is identified with $z_1 + z_2 j$, it is easy to verify that $Q = A - Bj$, $f_1 = q_1$, $f_2 = -jq_1$, $f_3 = q_2$ and $f_4 = -jq_2$. Once we introduce the representation of the quaternions by Pauli matrices it is easy to verify that (1) and (2) define the same projector.

4. Poisson Hopf Bundle on $S^4$

Let us identify $\mathfrak{g} = \mathfrak{u}(4) = \mathrm{Lie}\,U(4)$ with its defining representation by antihermitian $4 \times 4$ matrices. The $SU(2)$ generators of the Dynkin diagram are, for $i = 1, 2, 3$,
$$H_i = i\,(e_{ii} - e_{i+1,i+1}), \qquad E_i = \frac{1}{2i}(e_{i,i+1} + e_{i+1,i}), \qquad F_i = \frac{1}{2}(e_{i,i+1} - e_{i+1,i}),$$
where $e_{ij}$ are the elementary matrices with entries $(e_{ij})_{kl} = \delta_{ik}\delta_{jl}$ and the central generator is $H = iI$. The Poisson–Lie structure of $U(4)$ is defined by the canonical coboundary bialgebra given on these generators by
$$\delta_R(H_i) = 0, \qquad \delta_R(H) = 0, \qquad \delta_R(E_i) = E_i \wedge H_i, \qquad \delta_R(F_i) = F_i \wedge H_i. \qquad (3)$$
The generators $h = \frac14 H_1 + \frac12 H_2 + \frac34 H_3 + \frac34 H$ and $\{H_i, E_i, F_i\}_{i=1,2}$ define the embedding of $\mathfrak{u}(3)$ in $\mathfrak{u}(4)$ that we want to study; from relations (3) we have that $\delta_R(\mathfrak{u}(3)) \subset \mathfrak{u}(3) \wedge \mathfrak{u}(3)$, so that $U(3)$ is a Poisson–Lie subgroup. Let us fix on $S^7 = U(3)\backslash U(4)$ the coinduced Poisson bracket $(S^7, \{\,,\})$. The bracket on $S^7$ can be written as the restriction of the following bracket on $\mathbb{C}^4$: if $z_i$, $i = 1, \dots, 4$, denote complex coordinates we let
$$\{z_i, z_j\} = z_i z_j, \quad 1 \le i < j \le 4, \qquad \{z_i^*, z_j^*\} = -z_i^* z_j^*, \quad 1 \le i < j \le 4,$$
$$\{z_i, z_j^*\} = -z_i z_j^*, \quad 1 \le i \ne j \le 4, \qquad \{z_j^*, z_j\} = \sum_{i<j} z_i z_i^*.$$
More detailed information can be found in [30]. The Lie algebra of the diagonal $SU(2)_d$ is $\mathfrak{su}(2)_d = \langle H_1 + H_3,\ E_1 + E_3,\ F_1 + F_3\rangle$; using (3) it is easy to verify that $\delta_R(\mathfrak{su}(2)_d) \not\subset \mathfrak{su}(2)_d \wedge \mathfrak{u}(4)$, so that $SU(2)_d$ is not a coisotropic subgroup. We then have to solve the following problem: does there exist any $g \in U(4)$ such that $\delta_R(\mathrm{Ad}_g(\mathfrak{su}(2)_d)) \subset \mathrm{Ad}_g(\mathfrak{su}(2)_d) \wedge \mathfrak{u}(4)$, i.e. such that $g\,SU(2)_d\,g^{-1}$ is coisotropic? We give a positive answer to this question. By direct computation we verify that if
$$g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} \in U(4),$$
then $SU(2)_d^g = g\,SU(2)_d\,g^{-1}$ is a coisotropic subgroup of $U(4)$. This is not the only solution but the general problem will be studied elsewhere. The projection onto this subgroup is then defined by
$$r_g(t) = \begin{pmatrix} \alpha & \beta & 0 & 0 \\ -\beta^* & \alpha^* & 0 & 0 \\ 0 & 0 & \alpha^* & \beta^* \\ 0 & 0 & -\beta & \alpha \end{pmatrix}, \qquad |\alpha|^2 + |\beta|^2 = 1. \qquad (4)$$
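The "direct computation" behind the coisotropy statement can be mimicked numerically with the criterion of Proposition 1. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $H_i = i(e_{ii} - e_{i+1,i+1})$, $E_i = (e_{i,i+1} + e_{i+1,i})/2i$, $F_i = (e_{i,i+1} - e_{i+1,i})/2$ as normalizations (only the spans matter), uses the cobracket (3), and works inside the 9-dimensional span of $H_i, E_i, F_i$, where the condition $\delta(\mathfrak{h}) \subset \mathfrak{g} \wedge \mathfrak{h}$ is equivalent to the vanishing of $\delta(\mathfrak{h})$ in the second exterior power of the quotient by $\mathfrak{h}$. All function names are hypothetical.

```python
import numpy as np

def e(i, j):
    """Elementary 4x4 matrix e_ij (1-based indices)."""
    m = np.zeros((4, 4), dtype=complex)
    m[i - 1, j - 1] = 1
    return m

# Assumed normalization of the generators (only their spans matter below).
H = [1j * (e(i, i) - e(i + 1, i + 1)) for i in (1, 2, 3)]
E = [(e(i, i + 1) + e(i + 1, i)) / 2j for i in (1, 2, 3)]
F = [(e(i, i + 1) - e(i + 1, i)) / 2 for i in (1, 2, 3)]
BASIS = H + E + F  # coordinates ordered as H1, H2, H3, E1, E2, E3, F1, F2, F3

def coords(X):
    """Coordinates of X in BASIS (X is assumed to lie in its span)."""
    M = np.array([b.flatten() for b in BASIS]).T
    c = np.linalg.lstsq(M, X.flatten(), rcond=None)[0]
    assert np.allclose(M @ c, X.flatten())
    return c.real

def wedge(x, y):
    """x ^ y represented as an antisymmetric 9x9 matrix."""
    return np.outer(x, y) - np.outer(y, x)

def delta(v):
    """Cobracket (3), extended linearly from the generators."""
    unit = np.eye(9)
    W = np.zeros((9, 9))
    for i in range(3):
        W += v[3 + i] * wedge(unit[3 + i], unit[i])  # delta(E_i) = E_i ^ H_i
        W += v[6 + i] * wedge(unit[6 + i], unit[i])  # delta(F_i) = F_i ^ H_i
    return W

def is_coisotropic(generators):
    """Check that delta(h) vanishes after projecting along h = span(generators)."""
    Hb = np.array([coords(X) for X in generators])
    Q = np.eye(9) - Hb.T @ np.linalg.inv(Hb @ Hb.T) @ Hb  # kernel of Q is exactly h
    return all(np.allclose(Q @ delta(v) @ Q.T, 0) for v in Hb)

g = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]], dtype=complex)
su2_d = [H[0] + H[2], E[0] + E[2], F[0] + F[2]]          # diagonal SU(2)
su2_dg = [g @ X @ np.linalg.inv(g) for X in su2_d]       # conjugated by g

print("diagonal SU(2) coisotropic:  ", is_coisotropic(su2_d))
print("conjugated SU(2) coisotropic:", is_coisotropic(su2_dg))
```

In these conventions conjugation by $g$ sends the generators to $H_1 - H_3$, $E_1 - E_3$, $F_1 + F_3$, which is why the two terms of $\delta_R$ cancel modulo the subalgebra in the conjugated case but add up in the diagonal case.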
The right action of $SU(2)_d^g$ on $S^7 \simeq U(3)\backslash U(4)$ is free and defines a principal bundle on $S^4 \simeq U(3)\backslash U(4)/SU(2)_d^g$ which is isomorphic to the Hopf bundle. Indeed it is easy to verify that $i : S^7 \to S^7$, $i(z_1, z_2, z_3, z_4) = (z_1, z_2, -z_4, z_3)$ is a bundle morphism. Nevertheless, since $SU(2)_d^g$ is coisotropic in $U(4)$, thanks to Proposition 2, a Poisson structure is coinduced on the base and the projection $S^7 \to S^4$ is a Poisson map. We call this bundle a Poisson principal bundle. The Poisson structure can be explicitly described by the restriction of the bracket of $S^7$ to the subalgebra generated by the following coinvariant functions:
$$R = z_1 z_1^* + z_2 z_2^*, \qquad a = z_1 z_4^* - z_2 z_3^*, \qquad b = z_1 z_3 + z_2 z_4,$$
which satisfy $|a|^2 + |b|^2 = R(1 - R)$. Easy calculations prove that:
$$\{a, R\} = -2aR, \qquad \{b, R\} = 2bR, \qquad \{a, b\} = -3ab, \qquad \{a, b^*\} = ab^*,$$
$$\{a, a^*\} = -2aa^* + 2R^2, \qquad \{b, b^*\} = 4bb^* - 2R.$$
This Poisson algebra clearly has rank zero at $R = 0$. Let $R \neq 0$ and define $\zeta_1 = a/R$, $\zeta_2 = b/R$. Geometrically we are just giving cartesian coordinates on the stereographic projection on $\mathbb{C}^2$. Poisson brackets between these new coordinates are given by:
$$\{\zeta_1, \zeta_2\} = \zeta_1\zeta_2, \qquad \{\zeta_1, \zeta_2^*\} = \zeta_1\zeta_2^*,$$
$$\{\zeta_1, \zeta_1^*\} = 2(1 + |\zeta_1|^2), \qquad \{\zeta_2, \zeta_2^*\} = -2(1 + |\zeta_1|^2 + |\zeta_2|^2). \qquad (5)$$
Such brackets define a symplectic structure on the 4-dimensional real space $\mathbb{R}^4$ (it can be proven, in fact, that the corresponding map between cotangent and tangent bundle has fixed maximal rank). The covariant Poisson bracket on $S^4$ thus has a very simple foliation given by a 0-dimensional leaf and a 4-dimensional linear space. We summarize this discussion in the following proposition.

Proposition 3. The embedding of $SU(2)_d^g$ into $U(4)$ defines a coisotropic subgroup. The corresponding bundle $S^7 \to S^4 \simeq S^7/SU(2)_d^g$ is a Poisson bundle.

5. The Quantum $\Sigma_q^4$

The Hopf algebra $U_q(4)$ is generated by $\{t_{ij}\}_{i,j=1}^4$, $D_q^{-1}$ and the following relations (see [16]):
$$t_{ik} t_{jk} = q\, t_{jk} t_{ik}, \qquad t_{ki} t_{kj} = q\, t_{kj} t_{ki}, \qquad i < j,$$
$$t_{i\ell} t_{jk} = t_{jk} t_{i\ell}, \qquad i < j,\ k < \ell,$$
$$t_{ik} t_{j\ell} - t_{j\ell} t_{ik} = (q - q^{-1})\, t_{jk} t_{i\ell}, \qquad i < j,\ k < \ell,$$
$$D_q D_q^{-1} = D_q^{-1} D_q = 1,$$
where $D_q = \sum_{\sigma\in P_4} (-q)^{\ell(\sigma)} t_{\sigma(1)1} \cdots t_{\sigma(4)4}$, with $P_4$ being the group of 4-permutations, is central. The Hopf algebra structure is
$$\Delta(t_{ij}) = \sum_k t_{ik} \otimes t_{kj}, \qquad \Delta(D_q) = D_q \otimes D_q, \qquad \epsilon(t_{ij}) = \delta_{ij}, \qquad \epsilon(D_q) = 1,$$
$$S(t_{ij}) = (-q)^{i-j} \sum_{\sigma\in P_3(j)} (-q)^{\ell(\sigma)}\, t_{\sigma(1)1} \cdots t_{\sigma(i)\,i+1} \cdots t_{\sigma(4)4}\, D_q^{-1};$$
the compact real structure forces us to choose $q \in \mathbb{R}$ and is defined by $t_{ij}^* = S(t_{ji})$, $D_q^* = D_q^{-1}$. Let $L = \mathrm{Span}\{t_{j4}, t_{4j}, t_{44} - 1\}_{j=1,2,3}$; it turns out that $\mathcal{L} = U_q(4)L$ is a Hopf ideal, so that $U_q(4)/\mathcal{L}$ is equivalent to $U_q(3)$ as a Hopf algebra. Let $\ell : U_q(4) \to U_q(4)/\mathcal{L} \simeq U_q(3)$ be the quotient projection. The algebra $S_q^7 = {}^{\ell}U_q(4)$ is generated by $z_i = t_{4i}$, $i = 1, \dots, 4$, with the following relations [30]:
$$z_i z_j = q\, z_j z_i \ (i < j), \qquad z_j^* z_i = q\, z_i z_j^* \ (i \ne j),$$
$$z_k^* z_k = z_k z_k^* + (1 - q^2) \sum_{j<k} z_j z_j^*, \qquad \sum_{k=1}^4 z_k z_k^* = 1.$$
The $U_q(4)$-coaction on $S_q^7$ reads $\Delta(z_i) = \sum_j z_j \otimes t_{ji}$. Let us now quantize the coisotropic subgroup $SU(2)_d^g$ of Proposition 3. Motivated by the projection $r_g$ in (4), let us define $\mathcal{R} = R\,U_q(4)$, where
$$R = \mathrm{Span}\{t_{13}, t_{31}, t_{14}, t_{41}, t_{24}, t_{42}, t_{23}, t_{32}, t_{11} - t_{44}, t_{12} + t_{43}, t_{21} + t_{34}, t_{22} - t_{33}, t_{11}t_{22} - q\,t_{12}t_{21} - 1\} = \dot R \oplus \mathrm{Span}\{t_{11}t_{22} - q\,t_{12}t_{21} - 1\}.$$
It is easy to verify that $\mathcal{R}$ is a $\tau$-invariant, right ideal, two-sided coideal. Let $r : U_q(4) \to U_q(4)/\mathcal{R}$ be the projection map. We have the following result:

Proposition 4. As a $\tau$-coalgebra $U_q(4)/\mathcal{R}$ is isomorphic to $SU_q(2)$.

Proof. We sketch here the main lines of the proof. Let $A_q(N)$ be the bialgebra generated by the $\{t_{ij}\}$. We first remark that $r(D_q) = r(1)$, so that $U_q(4)/\mathcal{R} \simeq A_q(4)/R A_q(4)$. First one can show that $A_q(4)/\dot R A_q(4) \simeq A_q(2)$. Once an order on the generators $t_{ij}$ of $A_q(4)$ is chosen, a linear basis is given by the ordered monomials in the $t_{ij}$ [17], so that
$$A_q(4)/\dot R A_q(4) = \mathrm{Span}\{r(t_{11}^{n_{11}} t_{44}^{n_{44}} t_{12}^{n_{12}} t_{43}^{n_{43}} t_{21}^{n_{21}} t_{34}^{n_{34}} t_{22}^{n_{22}} t_{33}^{n_{33}})\}.$$
Making repeated use of the following relations for $i < k$, $j < l$:
$$t_{ij}^n t_{kl} = t_{kl} t_{ij}^n - q^{-1}(1 - q^{2n})\, t_{il} t_{kj} t_{ij}^{n-1}, \qquad t_{ij} t_{kl}^m = t_{kl}^m t_{ij} + q(1 - q^{-2m})\, t_{il} t_{kj} t_{kl}^{m-1},$$
we get that $A_q(4)/\dot R A_q(4) = \mathrm{Span}\{r(t_{11}^{n_{11}} t_{12}^{n_{12}} t_{21}^{n_{21}} t_{22}^{n_{22}})\} \simeq A_q(2)$. Showing that this is a $\tau$-coalgebra isomorphism is equivalent to verifying that the projection $r$ restricted to the first quadrant of $A_q(4)$ is a $\tau$-bialgebra isomorphism. This can be done directly by using the relations. Finally the quotient by the quantum determinant gives $SU_q(2)$.

Remark 5. The projection map $r : U_q(4) \to SU_q(2)$ is not a Hopf algebra map, as can be explicitly verified, for instance, on $r(t_{11}t_{43}) \ne r(t_{11})r(t_{43})$. In the following we will denote $U_q(4)/\mathcal{R}$ by $SU_q(2)$, but we have to keep in mind that $r$ does not preserve the algebra structure but only defines a right $U_q(4)$-module structure on the quotient.

By construction $\Delta_r = (\mathrm{id} \otimes r)\Delta : S_q^7 \to S_q^7 \otimes SU_q(2)$ defines an $SU_q(2)$ coaction on $S_q^7$. The space of functions on the quantum 4-sphere $\Sigma_q^4$ is the space of coinvariants with respect to this coaction, i.e. $\Sigma_q^4 = \{a \in S_q^7 \mid \Delta_r(a) = a \otimes r(1)\}$. We describe $\Sigma_q^4$ in the following proposition, whose proof is postponed to the Appendix.
Proposition 6. The algebra $\Sigma_q^4$ is generated by $\{a, a^*, b, b^*, R\}$, where $a = z_1 z_4^* - z_2 z_3^*$, $b = z_1 z_3 + q^{-1} z_2 z_4$, $R = z_1 z_1^* + z_2 z_2^*$. They satisfy the following relations:
$$Ra = q^{-2}aR, \quad Rb = q^{2}bR, \quad ab = q^{3}ba, \quad ab^* = q^{-1}b^*a,$$
$$aa^* + q^{2}bb^* = R(1 - q^{2}R), \quad aa^* = q^{2}a^*a + (1 - q^{2})R^{2}, \quad b^*b = q^{4}bb^* + (1 - q^{2})R.$$

In terms of $r_{ij} = r(t_{ij}) \in SU_q(2)$, with $i, j = 1, 2$, the fundamental corepresentation $\tau_f : \mathbb{C}^2 \to SU_q(2) \otimes \mathbb{C}^2$ is written as
$$\tau_f \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{pmatrix} \otimes \begin{pmatrix} e_1 \\ e_2 \end{pmatrix}.$$
Let $E = S_q^7 \,\square_{\tau_f}\, \mathbb{C}^2$ be the associated quantum vector bundle. Let $\langle (a_1, a_2), (b_1, b_2) \rangle = a_1 b_1^* + a_2 b_2^* \in \Sigma_q^4$ for $(a_1, a_2), (b_1, b_2) \in E$ be the hermitian structure in $E$. Let $f_1 = q(z_1, z_2)$, $f_2 = q(z_2^*, -q z_1^*)$, $f_3 = (z_4, -z_3)$, $f_4 = q(z_3^*, q^{-1} z_4^*)$, and let $G \in M_4(\Sigma_q^4)$ be such that $G_{ij} = \langle f_i, f_j \rangle$. We then have the following description of $E$ (see the Appendix for the proof).

Proposition 7. As a $\Sigma_q^4$-module $E$ is generated by $f_i$, $i = 1, \ldots, 4$; it is isomorphic to $(\Sigma_q^4)^4 G$ where
$$G = G^2 = \begin{pmatrix} q^{2}R & 0 & qa & q^{2}b \\ 0 & q^{2}R & qb^* & -q^{3}a^* \\ qa^* & qb & 1 - R & 0 \\ q^{2}b^* & -q^{3}a & 0 & 1 - q^{4}R \end{pmatrix}. \tag{6}$$

6. Unitary Representations of $\Sigma_q^4$

Let $0 < q < 1$. By restriction of those of $S_q^7$, see for instance [7], we obtain the following two inequivalent unitary representations of $\Sigma_q^4$. The first is one dimensional and is obtained as the restriction of the counit $\varepsilon$ of $U_q(4)$:
$$\varepsilon(R) = \varepsilon(a) = \varepsilon(b) = 0. \tag{7}$$
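The second one, $\sigma : \Sigma_q^4 \to B(\ell^2(\mathbb{N})^{\otimes 2})$, is defined by
$$\sigma(R)\,|n_1, n_2\rangle = q^{2(n_1+n_2)}\,|n_1, n_2\rangle,$$
$$\sigma(a)\,|n_1, n_2\rangle = q^{n_1+2n_2-1}(1 - q^{2n_1})^{1/2}\,|n_1 - 1, n_2\rangle,$$
$$\sigma(b)\,|n_1, n_2\rangle = q^{n_1+n_2}(1 - q^{2(n_2+1)})^{1/2}\,|n_1, n_2 + 1\rangle. \tag{8}$$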
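The relations of Proposition 6 can be checked in the representation $\sigma$ by letting the operators act on basis vectors $|n_1, n_2\rangle$. The following is only a numerical sanity check under stated assumptions, not part of the proof: the value `Q = 0.6`, the helper names (`apply_word`, `residual`, `RELATIONS`) and the tested range of basis labels are illustrative choices, and the adjoints $\sigma(a^*)$, $\sigma(b^*)$ are derived here from (8) by taking matrix transposes.

```python
import math

Q = 0.6  # sample value of the deformation parameter, 0 < q < 1 (assumption)

# An operator sends a basis label (n1, n2) to (coefficient, new label), or to None for 0.
def op_R(s):
    n1, n2 = s
    return (Q ** (2 * (n1 + n2)), s)

def op_a(s):
    n1, n2 = s
    if n1 == 0:
        return None
    return (Q ** (n1 + 2 * n2 - 1) * math.sqrt(1 - Q ** (2 * n1)), (n1 - 1, n2))

def op_a_star(s):  # adjoint of op_a: raises n1
    n1, n2 = s
    return (Q ** (n1 + 2 * n2) * math.sqrt(1 - Q ** (2 * (n1 + 1))), (n1 + 1, n2))

def op_b(s):
    n1, n2 = s
    return (Q ** (n1 + n2) * math.sqrt(1 - Q ** (2 * (n2 + 1))), (n1, n2 + 1))

def op_b_star(s):  # adjoint of op_b: lowers n2
    n1, n2 = s
    if n2 == 0:
        return None
    return (Q ** (n1 + n2 - 1) * math.sqrt(1 - Q ** (2 * n2)), (n1, n2 - 1))

def apply_word(word, state):
    """Apply a product of operators to a basis vector; the rightmost factor acts first."""
    coef = 1.0
    for op in reversed(word):
        out = op(state)
        if out is None:
            return 0.0, None
        c, state = out
        coef *= c
    return coef, state

def residual(relation, size=6):
    """Largest coefficient of (sum of scalar * word) over basis vectors with n1, n2 < size."""
    worst = 0.0
    for n1 in range(size):
        for n2 in range(size):
            total, targets = 0.0, set()
            for scalar, word in relation:
                coef, target = apply_word(word, (n1, n2))
                total += scalar * coef
                if target is not None:
                    targets.add(target)
            assert len(targets) <= 1  # every term lands on the same basis vector
            worst = max(worst, abs(total))
    return worst

R, a, A, b, B = op_R, op_a, op_a_star, op_b, op_b_star
# Each relation is written as "left side minus right side = 0".
RELATIONS = {
    "Ra = q^-2 aR": [(1, [R, a]), (-Q ** -2, [a, R])],
    "Rb = q^2 bR": [(1, [R, b]), (-Q ** 2, [b, R])],
    "ab = q^3 ba": [(1, [a, b]), (-Q ** 3, [b, a])],
    "ab* = q^-1 b*a": [(1, [a, B]), (-Q ** -1, [B, a])],
    "aa* + q^2 bb* = R(1 - q^2 R)": [(1, [a, A]), (Q ** 2, [b, B]), (-1, [R]), (Q ** 2, [R, R])],
    "aa* = q^2 a*a + (1 - q^2) R^2": [(1, [a, A]), (-Q ** 2, [A, a]), (-(1 - Q ** 2), [R, R])],
    "b*b = q^4 bb* + (1 - q^2) R": [(1, [B, b]), (-Q ** 4, [b, B]), (-(1 - Q ** 2), [R])],
}

if __name__ == "__main__":
    for name, relation in RELATIONS.items():
        print(f"{name}: residual {residual(relation):.2e}")
```

All residuals should vanish up to floating-point rounding; a residual of order one would indicate a mismatch between (8) and the relations of Proposition 6.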
There are no other irreducible representations with bounded operators. In fact, let $\rho : \Sigma_q^4 \to B(H)$ be such a representation. Since $\rho(R)$ is a bounded selfadjoint operator and $Ra = q^{-2}aR$, $Rb^* = q^{-2}b^*R$, there exists a vector $|\lambda\rangle$ such that $\rho(R)\,|\lambda\rangle = \lambda\,|\lambda\rangle$ and $\rho(a)|\lambda\rangle = \rho(b^*)|\lambda\rangle = 0$. By using the relation $a^*a + bb^* = q^{-2}R(1 - R)$ we conclude
that $\lambda = 0$ or $\lambda = 1$. Since $\rho$ is irreducible, it can be verified that for $\lambda = 0$ we have $\rho = \varepsilon$ and for $\lambda = 1$ we have $\rho = \sigma$. Let us remark that such irreducible (unitary) representations are in one to one correspondence with the leaves of the symplectic foliation of the underlying Poisson 4-sphere: the 0-dimensional leaf corresponds to the counit and the symplectic $\mathbb{R}^4$ to the infinite dimensional representation. The representation $\sigma$ has the following important property.

Proposition 8. The operator $\sigma(x) \in B(\ell^2(\mathbb{N})^{\otimes 2})$ is a trace class operator for each $x \in \bar{\Sigma}_q^4 = \Sigma_q^4/\mathbb{C}1$.

Proof. Since the family of trace class operators $I_1$ is a $*$-ideal in the algebra of bounded operators, it is enough to verify the proposition on the generators $\sigma(R)$, $\sigma(a)$ and $\sigma(b)$. Indeed we have that $\mathrm{tr}(\sigma(|R|)) = \mathrm{tr}(\sigma(R)) = (1 - q^2)^{-2}$ and, since $(1 - q^{2n})^{1/2} \le 1$,
$$\mathrm{tr}(\sigma(|a|)) = \sum_{n_1, n_2} q^{n_1+2n_2-1}(1 - q^{2n_1})^{1/2} = q^{-1}(1 - q^2)^{-1} \sum_{n \ge 0} q^{n}(1 - q^{2n})^{1/2} \le q^{-1}(1 - q^2)^{-1} \sum_{n \ge 0} q^{n} = q^{-1}(1 - q)^{-1}(1 - q^2)^{-1}.$$
Analogously $\mathrm{tr}(\sigma(|b|)) \le (1 - q)^{-2}$.

Remark 9. The universal $C^*$-algebra $C(\Sigma_q^4)$, defined by $\Sigma_q^4$, is the norm closure of $\sigma(\Sigma_q^4)$. By Proposition 8 we have that $\sigma(\Sigma_q^4) \setminus \mathbb{C}1$ is contained in the algebra $K$ of compact operators on $\ell^2(\mathbb{N})^{\otimes 2}$. Using Proposition 15.16 of [13] we conclude that $C(\Sigma_q^4)$ is isomorphic to the unitization of the compacts. Note that, although they differ at the algebraic level, our 4-sphere and the standard Podleś sphere $S_q^2(c, 0)$ of [22, 27] cannot be distinguished by their $C^*$-algebras. A possible explanation for this peculiarity is that the spaces of leaves of the underlying symplectic foliations are homeomorphic.

Let $H = \ell^2(\mathbb{N})^{\otimes 2} \oplus \ell^2(\mathbb{N})^{\otimes 2}$ and
$$\pi = \begin{pmatrix} \sigma & 0 \\ 0 & \varepsilon \end{pmatrix}, \qquad \gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
As a consequence of Proposition 8 we have that $(H, \pi)$ is a 0-summable Fredholm module whose character is $\mathrm{tr}_\sigma = \mathrm{tr}(\sigma - \varepsilon)$. By explicit computation we have $\mathrm{tr}_\sigma(1) = 0$ and
$$\mathrm{tr}_\sigma(R^k) = \frac{1}{(1 - q^{2k})^2} \ (k > 0), \quad \mathrm{tr}_\sigma(a) = \mathrm{tr}_\sigma(b) = 0, \quad \mathrm{tr}_\sigma(aa^*) = \frac{1}{(1 - q^4)^2}, \quad \mathrm{tr}_\sigma(bb^*) = \frac{1}{(1 - q^2)(1 - q^4)}. \tag{9}$$
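The values in (9) follow from summing the diagonal matrix elements of $\sigma$ given in (8), since $\varepsilon$ vanishes on $R^k$, $aa^*$ and $bb^*$. The sketch below checks this numerically with truncated double sums; the sample value `Q = 0.5`, the cutoff and the function names are illustrative assumptions rather than anything fixed by the text.

```python
Q = 0.5        # sample deformation parameter, 0 < q < 1 (assumption)
CUTOFF = 200   # truncation of the sums over n1, n2 (assumption)

def truncated_trace(diagonal, cutoff=CUTOFF):
    """Sum the diagonal matrix elements <n1,n2| X |n1,n2> over n1, n2 < cutoff."""
    return sum(diagonal(n1, n2) for n1 in range(cutoff) for n2 in range(cutoff))

def tr_R_power(k):
    # sigma(R^k) is diagonal with eigenvalue q^(2k(n1+n2))
    return truncated_trace(lambda n1, n2: Q ** (2 * k * (n1 + n2)))

def tr_a_a_star():
    # sigma(a a*) is diagonal with eigenvalue q^(2(n1+2n2)) (1 - q^(2(n1+1)))
    return truncated_trace(lambda n1, n2: Q ** (2 * (n1 + 2 * n2)) * (1 - Q ** (2 * (n1 + 1))))

def tr_b_b_star():
    # sigma(b b*) is diagonal with eigenvalue q^(2(n1+n2-1)) (1 - q^(2 n2))
    return truncated_trace(lambda n1, n2: Q ** (2 * (n1 + n2 - 1)) * (1 - Q ** (2 * n2)))

def closed_form_R_power(k):
    return 1 / (1 - Q ** (2 * k)) ** 2

def closed_form_a_a_star():
    return 1 / (1 - Q ** 4) ** 2

def closed_form_b_b_star():
    return 1 / ((1 - Q ** 2) * (1 - Q ** 4))

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, tr_R_power(k), closed_form_R_power(k))
    print(tr_a_a_star(), closed_form_a_a_star())
    print(tr_b_b_star(), closed_form_b_b_star())
```

Because the summands decay geometrically, the truncation error is negligible for this cutoff and the two columns printed should agree to machine precision.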
In the representation $\sigma$, $R$ is an invertible operator. This suggests the possibility of realizing a quantum stereographic transformation on a deformation of $\mathbb{C}^2$, which we will denote $\mathbb{C}_q^2$. Let us define $\zeta_1 = R^{-1}a$, $\zeta_2 = bR^{-1}$; by direct computation we find that they satisfy the following relations:
$$\zeta_1\zeta_2 = q^{-1}\zeta_2\zeta_1, \qquad \zeta_1\zeta_1^* = q^{-2}\zeta_1^*\zeta_1 + (1 - q^2),$$
$$\zeta_1\zeta_2^* = q^{-1}\zeta_2^*\zeta_1, \qquad \zeta_2\zeta_2^* = q^{2}\zeta_2^*\zeta_2 - (1 - q^2)(q^2 + \zeta_1^*\zeta_1). \tag{10}$$
One can verify that the algebra $\mathbb{C}_q^2$ quantizes the symplectic structure on $\mathbb{C}^2$ seen in (5).
7. Chern–Connes Pairing of G

Let us compute the Chern classes in cyclic homology associated to the projector $G$. We briefly recall some basic definitions and results from cyclic homology; see [9] and [19] for details. Let $A$ be an associative $\mathbb{C}$-algebra. Let $d_i(a_0 \otimes a_1 \otimes \cdots \otimes a_n) = a_0 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_n$ for $i = 0, \ldots, n - 1$, and $d_n(a_0 \otimes a_1 \otimes \cdots \otimes a_n) = a_n a_0 \otimes a_1 \otimes \cdots \otimes a_{n-1}$; the Hochschild boundary is defined as $\beta = \sum_{i=0}^{n} (-)^i d_i$ and the Hochschild complex is $(C_*(A), \beta)$, with $C_n(A) = A^{\otimes n+1}$. As usual we denote Hochschild homology by $HH_*(A)$. Let $t(a_0 \otimes a_1 \otimes \cdots \otimes a_n) = (-)^n a_1 \otimes \cdots \otimes a_n \otimes a_0$ be the cyclic operator and $C_n^\lambda(A) = A^{\otimes n+1}/(1 - t)A^{\otimes n+1}$. The Connes complex is then $(C_*^\lambda(A), \beta)$; its homology is denoted by $H_*^\lambda(A)$. For each projector $G \in M_k(A)$, i.e. $G^2 = G$, the Chern class is defined as $\mathrm{ch}_n^\lambda(G) = \mathrm{Tr}[(-)^n G^{\otimes 2n+1}] \in H_{2n}^\lambda(A)$, where $\mathrm{Tr} : M_k(A)^{\otimes n} \to A^{\otimes n}$ is the generalized trace, i.e. $\mathrm{Tr}[M_1 \otimes \cdots \otimes M_n] = \sum_j [M_1]_{j_1 j_2} \otimes [M_2]_{j_2 j_3} \otimes \cdots \otimes [M_n]_{j_n j_1}$. Let $I : A^{\otimes n+1} \to C_n^\lambda(A)$ be the projection map and let $x \in A^{\otimes n+1}$ be such that $I(x)$ is a cycle which induces $[I(x)] \in H_n^\lambda$. Let us define the periodicity map $S : H_n^\lambda \to H_{n-2}^\lambda$ as
$$S([I(x)]) = -\frac{1}{n(n-1)} \Big[ I\Big( \sum_{0 \le i < j \le n} (-)^{i+j} d_i d_j(x) \Big) \Big].$$
There is then a long exact sequence in homology:
$$\cdots \to HH_n(A) \xrightarrow{\ I\ } H_n^\lambda(A) \xrightarrow{\ S\ } H_{n-2}^\lambda(A) \xrightarrow{\ B\ } HH_{n-1}(A) \to \cdots, \tag{11}$$
where $B$ is an operator we do not need to define. With our normalization of the Chern character, we have that for each projector $G$,
$$S(\mathrm{ch}_n^\lambda(G)) = -\frac{1}{2(2n-1)}\, \mathrm{ch}_{n-1}^\lambda(G). \tag{12}$$
Let $G \in M_4(\Sigma_q^4)$ be the projector defined in Proposition 7. Then $\mathrm{ch}_0^\lambda(G) = [\mathrm{Tr}(G)] = [2 - (1 - q^2)^2 R] \in H_0^\lambda$. The character $\mathrm{tr}_\sigma$ of the Fredholm module $(H, \pi)$ given in (9) is a well defined cyclic 0-cocycle on $\Sigma_q^4$. We then have that $\mathrm{tr}_\sigma(\mathrm{ch}_0^\lambda(G)) = -1$ and conclude that $\mathrm{ch}_0^\lambda(G)$ defines a non-trivial cyclic cycle in $H_0^\lambda(\Sigma_q^4)$; using the $S$-operator (12) and the Connes sequence (11) we conclude that $\mathrm{ch}_1^\lambda$ and $\mathrm{ch}_2^\lambda$ define non-trivial classes in cyclic homology and are not Hochschild cycles. Since $\mathrm{tr}_\sigma$ is the character of a Fredholm module, the integrality of the pairing is a manifestation of the so-called noncommutative index theorem [9]. We summarize this discussion in the following proposition.

Proposition 10. The projector $G$ defined in (6) defines non-trivial cyclic homology classes $\mathrm{ch}_n^\lambda(G) \in H_{2n}^\lambda$. The Chern–Connes pairing with $\mathrm{tr}_\sigma$ defined in (9) is:
$$\langle \mathrm{tr}_\sigma, G \rangle = -1.$$
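The value $-1$ of the pairing can be reproduced numerically from the diagonal of $G$ in (6): $\sigma$ sends $R$ to the diagonal operator with eigenvalues $q^{2(n_1+n_2)}$, while $\varepsilon(R) = 0$. The following is a minimal sketch under the assumption that the sum over basis vectors is truncated at a finite cutoff; the function name `chern_connes_pairing` and the sample values of $q$ are illustrative.

```python
def chern_connes_pairing(q, cutoff=300):
    """Truncated value of tr(sigma(Tr G) - eps(Tr G)) for the projector G of (6)."""
    total = 0.0
    for n1 in range(cutoff):
        for n2 in range(cutoff):
            r = q ** (2 * (n1 + n2))  # eigenvalue of sigma(R) on |n1, n2>
            # diagonal entries of G evaluated in sigma: q^2 R, q^2 R, 1 - R, 1 - q^4 R
            sigma_tr_g = q * q * r + q * q * r + (1 - r) + (1 - q ** 4 * r)
            # the same entries evaluated with the counit, where eps(R) = 0
            eps_tr_g = 0.0 + 0.0 + 1.0 + 1.0
            total += sigma_tr_g - eps_tr_g
    return total

if __name__ == "__main__":
    for q in (0.3, 0.5, 0.7):
        print(q, chern_connes_pairing(q))
```

The result is independent of $q$ up to rounding, which illustrates the integrality of the pairing noted above.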
Appendix. Proof of Propositions 6 and 7

To prove Propositions 6 and 7 we use the strategy adopted by Nagy in [24]. His argument is based on the general fact that the corepresentation theory of compact quantum groups is “equivalent” to the classical one. This equivalence is realized by a bijective map between quantum and classical finite dimensional corepresentations: this map preserves direct sum, tensor product and dimension. The fact that we deal with coisotropic subgroups requires some additional care. The projection $r$ induces a mapping $r[\rho] = (\mathrm{id} \otimes r)\rho$ from the corepresentations of $U_q(4)$ into the corepresentations of $SU_q(2)$; since $r$ is not an algebra morphism it is not obvious that this mapping preserves the tensor product. However, in our case the following lemma can be proved:

Lemma 11. Let $t_f$ and $t_{f^c}$ be the fundamental corepresentation of $U_q(4)$ and its contragredient; then $r[t_f^{\otimes r} \otimes t_{f^c}^{\otimes s}]$ is equivalent to $r[t_f]^{\otimes r} \otimes r[t_{f^c}]^{\otimes s}$.

Proof. Let $\tau_f$ be the fundamental corepresentation of $SU_q(2)$ and $\tau_{f^c}$ its contragredient. Let us notice that for $i, j = 1, 2$ we have that $r[t_{i,j}] = (\tau_f)_{ij}$, $r[t_{i+2,j+2}] = q^{i-j}(\tau_{f^c})_{ij}$, $r[t^*_{i,j}] = (\tau_{f^c})_{ij}$ and $r[t^*_{i+2,j+2}] = q^{j-i}(\tau_f)_{ij}$. By making use of the equivalence between $\tau_f$ and $\tau_{f^c}$ it is easy to conclude that $r[t_f^{\otimes r} \otimes t_{f^c}^{\otimes s}]$ for $r + s = 2$ is equivalent to $4\,\tau_f^{\otimes 2}$ and then to $r[t_f]^{\otimes r} \otimes r[t_{f^c}]^{\otimes s}$. The result for generic $r$ and $s$ is obtained by recurrence and by making use of the right $U_q(4)$-module structure of the projection $r$.

As a consequence the decomposition of $r[t_f^{\otimes n}]$ into irreducible corepresentations of $SU_q(2)$ is the same as the classical one. Let $t_n = \bigoplus_{r+s \le n} t_{r,s}$, where $t_{r,s} = t_f^{\otimes r} \otimes t_{f^c}^{\otimes s}$. We denote by $C(t_n) \subset U_q(4)$ the subcoalgebra of the matrix elements of $t_n$. We then have $U_q(4) = \bigcup_{n \in \mathbb{N}} C(t_n)$ and we define $S^7_{q,n} = C(t_n) \cap S^7_q$. Obviously $S^7_{q,n}$ is a $U_q(4)$-comodule with coaction $\Delta_n = \Delta|_{S^7_{q,n}}$. From the decomposition into irreducible corepresentations $\Delta_n = \bigoplus_{\lambda \in I(U_q(4))} m_\lambda \lambda$, where $m_\lambda \in \mathbb{N}$, we get
$$S^7_{q,n} = \bigoplus_{\substack{\lambda \in I(U_q(4)) \\ j = 1, \ldots, m_\lambda}} S^{7\,\lambda,j}_{q,n}.$$
Let $\rho : V \to SU_q(2) \otimes V$ be an irreducible $SU_q(2)$ corepresentation. We prove the following lemma.

Lemma 12. The dimension of $S^7_{q,n} \,\square_\rho\, V$ does not depend on $q$.

Proof. Let $P_\rho : S^7_q \otimes V \to S^7_q \,\square_\rho\, V$ be the projection defined by $P_\rho(f \otimes v) = \sum_{(f),(v)} f_{(0)}\, h(f_{(1)} S(v_{(-1)})) \otimes v_{(0)}$, where $h$ is the Haar measure on $SU_q(2)$. We obviously have that
$$P_\rho(S^7_{q,n} \otimes V) = \bigoplus_{\substack{\lambda \in I(U_q(4)) \\ j = 1, \ldots, m_\lambda}} P_\rho(S^{7\,\lambda,j}_{q,n} \otimes V).$$
Then $\dim P_\rho(S^7_{q,n} \otimes V) = \sum_{\lambda \in I(U_q(4))} m_\lambda\, m_\rho(\lambda)$, where $m_\rho(\lambda) = \dim P_\rho(S^{7\,\lambda,j}_{q,n} \otimes V)$ equals the multiplicity of $\rho$ in the decomposition of $r[\lambda] = (\mathrm{id} \otimes r)\lambda$. Since the correspondence between classical and quantum corepresentations preserves dimensions, the result follows.

Proof of Proposition 6. To show that $\{a, b, R\}$ are coinvariants is a direct computation. Let $B_q \subset \Sigma_q^4$ be the $*$-algebra generated by those elements. By the use of the diamond lemma the monomials $\{a^{*\,i_1} a^{i_2} R^{j} b^{*\,k_1} b^{k_2} \mid k_1 k_2 = 0\}$ are linearly independent and they form a basis of $B_q$. Note that the same monomials form a basis for the polynomial functions on the classical 4-sphere, and define a vector space isomorphism which maps $B_{q,n} =$
$C(t_n) \cap B_q \to P_K(S^7_{1,n})$, where $P_K = P_\rho$ with $\rho$ being the identity corepresentation. Using Lemma 12 we then have $B_q = \Sigma_q^4$.

Proof of Proposition 7. By a direct check it is easy to see that the $f_i$ are in $E = S_q^7 \,\square_{\tau_f}\, \mathbb{C}^2$ and that the mapping $f_i \to e_i G$, with $(e_i)_j = \delta_{ij}$, is a $\Sigma_q^4$-module morphism. Since in the classical case it is clearly bijective, the result follows by repeating the same arguments as in the proof of Proposition 6 and applying Lemma 12 with $\rho = \tau_f$.

Acknowledgements. The authors want to thank L. Dabrowski and G. Landi for having stimulated this work and for the useful discussions on the subject, and the referee for the constructive suggestions. One of us (N.C.) would like to thank A.J.L. Sheu for his comments on the paper.
References
1. Atiyah, M.: The geometry of Yang–Mills fields. Lezioni Fermiane. Accademia Nazionale dei Lincei e Scuola Normale Superiore, Pisa (1979)
2. Bonechi, F., Ciccoli, N., Giachetti, R., Sorace, E., Tarlini, M.: Unitarity of induced representations from coisotropic quantum subgroups. Lett. Math. Phys. 49, 17–31 (1999)
3. Bonechi, F., Ciccoli, N., Giachetti, R., Sorace, E., Tarlini, M.: The coisotropic subgroup structure of $SL_q(2, \mathbb{R})$. J. Geom. Phys. 37, 190–200 (2001)
4. Brzeziński, T.: On modules associated to coalgebra Galois extensions. J. Algebra 215, 290–317 (1999)
5. Brzeziński, T., Hajac, P.M.: Coalgebra extensions and algebra coextensions of Galois type. Comm. Alg. 27, 1347–1367 (1999)
6. Brzeziński, T., Majid, S.: Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
7. Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994
8. Ciccoli, N.: Quantization of coisotropic subgroups. Lett. Math. Phys. 42, 123–138 (1997)
9. Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
10. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. Commun. Math. Phys. 221, 141–159 (2001)
11. Dabrowski, L., Grosse, H., Hajac, P.M.: Strong connections and Chern–Connes pairing in the Hopf–Galois theory. math.QA/9912239
12. Dabrowski, L., Landi, G., Masuda, T.: Instantons on the quantum 4-spheres $S_q^4$. Commun. Math. Phys. 221, 161–168 (2001)
13. Doran, R.S., Fell, J.M.G.: Representations of *-Algebras, Locally Compact Groups, and Banach *-Algebraic Bundles: Vol I. New York: Academic Press, 1988
14. Durdevic, M.: Geometry of quantum principal bundles I. Commun. Math. Phys. 175, 457–521 (1996)
15. Etingof, P., Kazhdan, D.: Quantization of Poisson algebraic groups and Poisson homogeneous spaces. In: Proceedings of Les Houches Summer School, Session LXIV, A. Connes et al. (eds.), 1998, pp. 935–946
16. Klimyk, A., Schmüdgen, K.: Quantum Groups and Their Representations. Berlin: Springer-Verlag, 1997
17. Koelink, E.: On quantum groups and q-special function. Ph.D. thesis, University of Leiden (1991)
18. Hajac, P.M., Majid, S.: Projective module description of the q-monopole. Commun. Math. Phys. 206, 247–264 (1999)
19. Loday, J.L.: Cyclic Homology. Berlin: Springer-Verlag, 1992
20. Lu, J.H.: Multiplicative and affine Poisson structures on Lie groups. Ph.D. thesis, University of California, Berkeley (1990)
21. Lu, J.H., Weinstein, A.: Poisson–Lie groups, dressing transformation and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
22. Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative differential geometry on the quantum two sphere of Podleś. I: An algebraic viewpoint. K-theory 5, 151–175 (1991)
23. Müller, E.F., Schneider, H.J.: Quantum homogeneous spaces with faithfully flat module structure. Isr. J. Math. 31, 501–526 (1999)
24. Nagy, G.: On the Haar measure of the quantum $SU(N)$ group. Commun. Math. Phys. 153, 217–228 (1993)
25. Nekrasov, N., Schwarz, A.: Instantons on noncommutative $\mathbb{R}^4$, and (2, 0) Superconformal Six Dimensional Theory. Commun. Math. Phys. 198, 689–703 (1998)
26. Pflaum, M.: Quantum groups on fibre bundles. Commun. Math. Phys. 166, 279–316 (1994)
27. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
28. Schneider, H.J.: Principal homogeneous spaces for arbitrary Hopf algebras. Isr. J. Math. 72, 167–195 (1990)
29. Vaisman, I.: Lectures on the geometry of Poisson manifolds. Progress in Math. 118. Basel–Boston: Birkhäuser Verlag, 1994
30. Vaksman, L.L., Soibelman, Ya.: Algebra of functions on the quantum group $SU(N + 1)$ and odd dimensional quantum spheres. Leningrad Math. J. 2, 1023–1042 (1991)

Communicated by A. Connes
Commun. Math. Phys. 226, 433 – 454 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
2D Models of Statistical Physics with Continuous Symmetry: The Case of Singular Interactions
D. Ioffe1,*, S. Shlosman2,**, Y. Velenik3
1 Dept. of Industrial Engineering, Technion, Haifa, Israel. E-mail: [email protected]
2 CPT, Luminy, Marseille, France. E-mail: [email protected]
3 UMR-CNRS 6632, CMI, Marseille, France. E-mail: [email protected]
Received: 18 October 2001 / Accepted: 7 January 2002
Abstract: We show the absence of continuous symmetry breaking in 2D lattice systems without any smoothness assumptions on the interaction. We treat certain cases of interactions with integrable singularities. We also present cases of singular interactions with continuous symmetry, when the symmetry is broken in the thermodynamic limit.

1. Introduction and Results

1.1. The invariance problem: an overview. In this paper we study two-dimensional lattice models of statistical mechanics which are defined by a $G$-invariant interaction, where $G$ is some compact connected Lie group. We shall investigate both finite and infinite range interactions. The general class of finite-range models to be considered is given by the following Hamiltonians:
$$H(\phi) = \sum_{x \in \mathbb{Z}^2} U(\phi_{\cdot + x}|_{\Lambda}). \tag{1}$$
Here $\phi = \{\phi_y, y \in \mathbb{Z}^2\}$ is the field, taking values in some compact topological space $S$, $\Lambda$ is a fixed finite subset of $\mathbb{Z}^2$, and the translation-invariant interaction $U = \{U_{\Lambda + x}(\cdot), x \in \mathbb{Z}^2\}$ is specified by a real function $U$ on $S^{\Lambda}$. We suppose that a continuous action of a compact connected Lie group $G$ on $S$ is given, $a : G \times S \to S$, and for $\phi \in S$, $g \in G$ we introduce the notation $g\phi = a(g, \phi)$. This action defines the action of $G$ on $S^k$ for every $k$ by $g(\phi_1, \ldots, \phi_k) = (g\phi_1, \ldots, g\phi_k)$, and the main assumption is that the function $U$ is invariant under this action on $S^{\Lambda}$: for every $g \in G$,
$$U(g\phi_1, \ldots, g\phi_{|\Lambda|}) = U(\phi_1, \ldots, \phi_{|\Lambda|}). \tag{2}$$

* Research partly supported by the Fund for the Promotion of Research at the Technion and by the NATO grant PST.CLG.976552.
** Research supported in part by RFBR Grant 99-01-00284.
Of course, we suppose that the free measure $d\phi$ on $S$ is $G$-invariant as well. The best known examples of such models are the XY model (or plane rotator model) and the XYZ model (or classical Heisenberg model). For the XY model $S = S^1 \subset \mathbb{R}^2$ is the unit circle, and for the XYZ model $S = S^2 \subset \mathbb{R}^3$ is the unit sphere. The Hamiltonians are
$$H(\phi) = -\sum_{\substack{x, y \in \mathbb{Z}^2 \\ |x - y| = 1}} J\, \phi_x \circ \phi_y, \tag{3}$$
where $J$ is a positive constant, $(\cdot \circ \cdot)$ stands for the scalar product, and the free measures are just the Lebesgue measures on the spheres. The first rigorous result in this field is the well-known Mermin–Wagner theorem, which for the models (3) states the absence of spontaneous magnetization. Then in [DS1] a stronger result was proven, stating that under some smoothness conditions (see (4) below) on the function $U$ every Gibbs state of the model defined by the Hamiltonian (1) is $G$-invariant under the natural action of $G$ on $S^{\mathbb{Z}^2}$. Later it was proven in [MS] that for the model (3) the correlations decay at least as a power law. In [S78] the same power law was obtained for the general model (1), again under the smoothness condition (4). This result was reproved later in [N] for the case $G = SO(n)$, by means of the complex translations method of McBryan and Spencer. Another proof of $G$-invariance of the Gibbs states of models of type (1) was found in [P, FP]; with this technique it was possible to prove the result for long range interactions decaying as slowly as $r^{-4}$ (in fact better, see the remark after Theorem 2). This result is optimal since it is known [KP] that in the low temperature XY model with interactions decaying as $r^{-4+\alpha}$, $\alpha > 0$, there is spontaneous symmetry breaking. On the other hand, this technique seems unable to yield the algebraic decay of correlations. In the case of $SO(N)$-symmetric models, the technique of [MS] can be extended to cover such long-range interactions, see [MMR]. An alternative approach to these problems is via Bogoliubov inequalities, see [M, KLS, BPK, I]; it also allows one to prove absence of continuous symmetry breaking for long-range interactions. Like the technique of [P], however, it seems unable to yield algebraic decay of correlations.

The smoothness condition on the interaction, which was crucial for all the results mentioned above, is the following. Let $M \subset \Lambda$ be a subset, and $\phi_\Lambda = \phi_M \cup \phi_{\Lambda \setminus M}$ be an arbitrary configuration. Then one requires that for every choice of the subset $M$ and the configuration $\phi_\Lambda$ the functions [1]
$$V_{\phi_\Lambda, M}(g) = U(\phi_M \cup g\phi_{\Lambda \setminus M}) \text{ are } C^2 \text{ functions on } G. \tag{4}$$
Moreover, the second derivatives of the functions $V_{\phi_\Lambda, M}(\cdot)$, taken along any tangent direction in $G$, have to be bounded from above, uniformly in $\phi_\Lambda$ and $M$. In the next section we explain how that condition can be used in the proof of the $G$-invariance.
1.2. No breaking of continuous symmetry for singular interactions. The Main Result of the present paper is that the smoothness property (4) is in fact not necessary for the $G$-invariance, and the latter is implied by the mere continuity of the functions $V_{\phi_\Lambda, M}(g)$ and the invariance (2). Moreover, even the continuity is not necessary, and a certain integrability condition on $V_{\phi_\Lambda, M}(g)$ is enough (see relation (26) below). For example, for $G$ being a circle, $S^1$, with $U$ a nearest neighbour interaction, $U(\phi_1, \phi_2) = U(\phi_1 - \phi_2)$, the singularity $U(\phi) \sim \ln|\phi|$ at $\phi = 0$ does not destroy the $S^1$-invariance of the corresponding Gibbs measures. Jumps are allowed as well. However, if the interaction $U$ is “even more singular”, then the $G$-invariance can be destroyed, as Theorem 4 below shows.

[1] In fact, as noted in [P], the proofs really only use the fact that these functions are $C^1$, with a first derivative satisfying a Lipschitz condition.

To formulate our main result, we will introduce the notation $\mu^U$ for a Gibbs measure on $S^{\mathbb{Z}^2}$ corresponding to the formal Hamiltonian (1) and the free measure $d\phi$ on $S$, which is supposed to be $G$-invariant. We will denote the integration operation with respect to $\mu^U$ by $\langle \cdot \rangle^U$. By $\mu^U_{x,y}(d\phi_x, d\phi_y)$ we denote the restriction of the measure $\mu^U$ to the product $S \times S$ of the state spaces of the field variables $\phi_x$ and $\phi_y$. We will prove the following

Theorem 1. Suppose that:
• the finite range interaction function $U$ is continuous, bounded on $S^{|\Lambda|}$, and satisfies the $G$-invariance property (2),
• the free measure $d\phi$ on $S$ is $G$-invariant.
Then the measure $\mu^U$ is $G$-invariant: for every $g \in G$, every finite $V \subset \mathbb{Z}^2$ and every $\mu^U$-integrable function $f$ on $S^V$,
$$\langle f(g\,\cdot) \rangle^U = \langle f(\cdot) \rangle^U. \tag{5}$$
Moreover, it has the following correlation decay: for every $A, B \subset S$ the conditional distributions of the measure $\mu^U_{x,y}$ satisfy for every $g \in G$ the estimate
$$\left| \frac{\mu^U_{x,y}(\phi_x \in gA \mid \phi_y \in B)}{\mu^U_{x,y}(\phi_x \in A \mid \phi_y \in B)} - 1 \right| \le C(U)\, |x - y|^{-c(U)}, \tag{6}$$
with $C(U) < \infty$, $c(U) > 0$. In the case when the space $S$ is a homogeneous space of the group $G$ (e.g. $S = G$), the measure $\mu^U_{x,y}(d\phi_x, d\phi_y)$ can be written as a convex sum of two probability measures:
$$\mu^U_{x,y}(d\phi_x, d\phi_y) = c_{xy}\, \hat{\mu}^U_{x,y}(d\phi_x, d\phi_y) + (1 - c_{xy})\, \tilde{\mu}^U_{x,y}(d\phi_x, d\phi_y).$$
The measure $\hat{\mu}^U_{x,y}(d\phi_x, d\phi_y)$ can be singular, but the number $c_{xy}$ is very small:
$$0 \le c_{xy} \le \exp\{-\sqrt{|x - y|}\},$$
while the measure $\tilde{\mu}^U_{x,y}(d\phi_x, d\phi_y)$ has a density $p^U_{x,y}(\phi_x, \phi_y)$ with respect to the measure $d\phi_x\, d\phi_y$, which for every conditioning $\phi_y = \psi$ satisfies the estimate
$$\left| p^U_{x,y}(\phi_x \mid \phi_y = \psi) - 1 \right| \le C(U)\, |x - y|^{-c(U)},$$
with $c(U) > 0$. In particular, for the case $G = SO(n)$, $S = S^{n-1} \subset \mathbb{R}^n$ with $n \ge 2$ we have
$$0 \le \langle \phi_x \circ \phi_y \rangle^U \le C(U)\, |x - y|^{-c(U)}. \tag{7}$$
We remind the reader that a homogeneous space is the manifold of cosets of a compact subgroup $H \subset G$. The $G$-invariance (5) does not imply the uniqueness of the Gibbs state with the interaction $U$. The reason is that the interaction $U$ may possess an additional discrete symmetry, which may be broken. An example is constructed in [S80]. The estimate (7) cannot be improved in general. Indeed, Fröhlich and Spencer have obtained the power law decay of the pair correlations in the XY model (3) for large values of the coupling constant $J$, see [FS]. On the other hand, for the XYZ model it is expected that the pair correlations decay exponentially for all values of $J$.

1.3. Infinite range case. The preceding theorem is restricted to finite-range interactions. Let us now turn to the long-range case. The formal Hamiltonian is supposed to be of the form
$$H(\phi) = \sum_{x, y} J_{x-y}\, U(\phi_x, \phi_y). \tag{8}$$
More general Hamiltonians (e.g., without separating the spatial and spin part of the interaction, or with more than 2-body interactions) could also be treated along the lines of the approach we develop here, but for the sake of simplicity we shall restrict ourselves to the case of (8). Since the coupling constants $\{J_\cdot\}$ have to satisfy the summability condition, we can make an additional normalization assumption
$$\sum_{x \neq 0} |J_x| = 1. \tag{9}$$
Let $X_\cdot$ be the random walk on $\mathbb{Z}^2$ with transition probabilities from $x$ to $y$ given by $|J_{x-y}|$. We then have the following

Theorem 2. Suppose that
• The random walk $X_\cdot$ is recurrent.
• The 2-body interaction function $U$ is continuous on $S \times S$, and satisfies the invariance property (2).
• The free measure $d\phi$ on $S$ is $G$-invariant.
Then all Gibbs states, corresponding to the Hamiltonian (8), are $G$-invariant.

The recurrence condition is known to be optimal even in the case of smooth $U$, in the sense that there are examples of systems for which the continuous symmetry is broken as soon as the underlying random walk is transient, see [BPK] or Theorem (20.15) in [G]. Recurrence of the underlying random walk is not a very explicit condition. Explicit examples have been given in [P]. Namely, it follows from the latter that Theorem 2 applies if there exists $p < \infty$ such that the coupling constants decay for large $\|x\|_\infty$ at least like
$$\|x\|_\infty^{-4}\, \log_2 \|x\|_\infty \cdots \log_p \|x\|_\infty,$$
where $\log_k x = \log \log_{k-1} x$, and $\log_2 x = \log\log x$. On the other hand, it follows from [FILS] that the continuous symmetry is broken for the low temperature XY model with coupling constants behaving, for large $\|x\|_\infty$, like
$$\|x\|_\infty^{-4}\, \log_2 \|x\|_\infty \cdots \log_p \|x\|_\infty^{1+\varepsilon},$$
for any $p < \infty$ and $\varepsilon > 0$.
1.4. Non-compact symmetry group: Non-existence of 2D Gibbs states. Finally we mention the case of a connected non-compact Lie group $G$. The case of smooth interaction was treated in [DS2], and the corresponding long-range result was obtained in [FP]. Technically the compact and the non-compact cases are very similar, but the results are quite different. The reason is that while in the compact case the Haar measure on $G$ can be normalized to a probability measure, in the non-compact case this is not possible. Therefore, there are no $G$-invariant probability measures on $G$ for $G$ non-compact. This is the main reason behind the result of [DS2] and [FP]: the corresponding 1D and 2D Gibbs measures do not exist. Below we formulate the simplest such result for the non-compact case and singular interaction that our technique can produce. The field $\phi$ will be real-valued, $G = \mathbb{R}^1$, and
$$H(\phi) = \sum_{x, y \in \mathbb{Z}^2} J_{x-y}\, \bar{U}(\phi_x - \phi_y), \tag{10}$$
with the function $\bar{U}$ satisfying
• $\bar{U}(\phi) = \bar{U}(-\phi)$,
• $\bar{U}(\phi) = U(\phi) - \upsilon(\phi)$, where $U$ is a $C^2$ function with uniformly bounded second derivative, and $0 \le \upsilon \le \varepsilon_0$, where $\varepsilon_0$ is some technical constant, which is small,
and the coupling constants $\{J_\cdot\}$ satisfy the same hypothesis as in Theorem 2.

Theorem 3. There are no two-dimensional Gibbs fields, corresponding to the Hamiltonian (10), with interaction $\bar{U}$ and coupling constants $J_\cdot$ as above.

In particular, the last theorem covers the case of the (non-convex) interactions $\bar{U}(\phi) = |\phi|^\alpha$, $0 < \alpha \le 1$, and so answers a question which was left open in the paper [BLL]. In fact, all the results of [BLL] concerning the non-existence of the 2D Gibbs fields for interactions growing at most quadratically in $\phi$ follow from the above theorem. Notice that our techniques also make it possible to obtain lower bounds with the correct behavior for the variance of the field in a finite box. The general formulation of the above theorem and its proof will be published in a separate paper.

1.5. Continuous symmetry breaking in 2D. Our results on continuous symmetry breaking concern the Patrascioiu–Seiler model [PS]. Namely, it was argued there, and was rigorously proven later by M. Aizenman [A], that the following holds. Consider the case when $S = G = S^1$, with the interaction $U(\phi_1, \phi_2) = U(\phi_1 - \phi_2)$ given by
$$U(\phi) = \begin{cases} -\cos\phi & \text{if } |\phi| \le \theta, \\ +\infty & \text{if } |\phi| > \theta. \end{cases} \tag{11}$$
Then in the 2D case, the statement is that the two-point pair correlations in the state with free or periodic b.c. decay at most as a power law, at all temperatures including infinite temperature, provided $|\theta| < \frac{\pi}{4}$. It would be interesting to know whether the Gibbs states $\mu_0$ of this model with zero b.c., i.e. $\phi \equiv 0$, are $S^1$-invariant. To the best of our knowledge this question is open. However, one can prove the following simple:
Theorem 4. Suppose that $|\theta| < \frac{\pi}{4}$. Then at any temperature there exist Gibbs states, corresponding to the interaction (11), which are not $S^1$-invariant.

2. Proofs

2.1. Theorem 1: Smooth case. We begin by reminding the reader of the main ideas of the proof for the case of smooth interaction. The proof for the general case will be built upon it. We follow [DS1], with simplifications made in [Si]. For simplicity we will consider the case when both the space $S$ and the group $G$ are a circle, $S^1$. The general case follows easily from this special one, see [DS1], since for every element $g \in G$ there is a compact commutative subgroup (torus) $T \subset G$ such that $g \in T$. We also suppose that the interaction $U$ is a nearest neighbour translation invariant interaction, given by a symmetric function $U$ of two variables: $U(\phi_1, \phi_2) = U(\phi_2, \phi_1)$. The generalization to a finite range interaction is straightforward. The $S^1$-invariance of $U$ means that $U(\phi_1, \phi_2) = U(\phi_1 + \psi, \phi_2 + \psi)$ for every $\psi \in S^1$, so in fact we can say that $U$ is a function of one variable, $U(\phi_1, \phi_2) = U(\phi_1 - \phi_2)$, with $U(\phi) = U(-\phi)$. The smoothness we need is the following: we suppose that $U$ has a second derivative, which is bounded from above:
$$U''(\phi) \le \bar{C}. \tag{12}$$
Let $\Lambda_n$ be the box $\{x \in \mathbb{Z}^2 : \|x\|_\infty \le n\}$, and $\bar{\phi}$ be an arbitrary boundary condition outside $\Lambda_n$. Let $\langle \cdot \rangle_{n,\bar{\phi}}$ be the Gibbs state in $\Lambda_n$ corresponding to the interaction $U$ and the boundary condition $\bar{\phi}$. Let $V$ be an arbitrary finite subset of $\mathbb{Z}^2$, containing the origin. Our theorem will be proven for the interaction $U$ once we obtain the following estimate:

Lemma 5. For every function $f(\phi) = f(\phi_V)$, which depends only on the configuration $\phi$ inside $V$, we have for every $\psi \in S^1$
$$\left| \langle f(\phi + \psi) \rangle_{n,\bar{\phi}} - \langle f(\phi) \rangle_{n,\bar{\phi}} \right| \le C(\bar{C}, V)\, \|f\|_\infty\, n^{-N(U)} \tag{13}$$
for some $C(\bar{C}, V) > 0$, while the functional $N(\cdot)$ is positive for every smooth $U$.

Proof. Our system in the box $\Lambda_n$ has $(2n + 1)^2$ degrees of freedom, which makes it hard to study. We are going to fix $(2n + 1)^2 - (n + 1)$ of them, leaving only $n + 1$ degrees of freedom, and we will show that for every choice $\Phi$ of the frozen degrees of freedom we have
$$\left| \langle f(\phi + \psi) \mid \Phi \rangle_{n,\bar{\phi}} - \langle f(\phi) \mid \Phi \rangle_{n,\bar{\phi}} \right| \le C(\bar{C}, V)\, \|f\|_\infty\, n^{-N(U)} \tag{14}$$
uniformly in $\Phi$. From that (13) evidently follows by integration. These degrees of freedom are introduced in the following way. For every $k = 0, 1, 2, \ldots$, we define the layer $L_k \subset \mathbb{Z}^2$ as the subset $L_k = \{x \in \mathbb{Z}^2 : \|x\|_\infty = k\}$. For a configuration $\phi$ in $\Lambda_n$ we denote by $\Phi_k$, $k = 0, 1, 2, \ldots, n$, its restrictions to the layers $L_k$: $\Phi_k = \phi|_{L_k}$.

We define now the action $(\psi_0, \psi_1, \ldots, \psi_n)\phi$ of the group $(S^1)^{n+1}$ on configurations $\phi$ in $\Lambda_n$ by
$$((\psi_0, \psi_1, \ldots, \psi_n)\phi)(x) = \phi(x) + \psi_{k(x)},$$
where $k(x) = \|x\|_\infty$ is the number of the layer to which the site $x$ belongs. We define the torus $\Phi(\phi)$ to be the orbit of the configuration $\phi$ under this action. In other words, $\Phi(\phi)$ is the set of configurations $\Phi_0 + \psi_0, \Phi_1 + \psi_1, \ldots, \Phi_n + \psi_n$, for all possible values of the angles $\psi_i$, where the configuration $\Phi_k + \psi_k$ on the layer $L_k$ is defined by $(\Phi_k + \psi_k)(x) = \phi(x) + \psi_k$. Let us fix for every orbit $\Phi$ one representative, $\phi$, so $\Phi = \Phi(\phi)$, and let $\Phi_0, \Phi_1, \ldots, \Phi_n$ be the restrictions, $\Phi_k = \phi|_{L_k}$. We will study the conditional Gibbs distribution $\langle \cdot \mid \Phi(\phi) = \Phi \rangle_{n,\bar{\phi}}$. This distribution is again a Gibbs measure on $(S^1)^{n+1} = \{(\psi_0, \psi_1, \ldots, \psi_n)\}$, corresponding to the nearest neighbour interaction $W_{\Phi,\bar{\phi}} = \{W_k, k = 1, 2, \ldots, n\}$. It is defined for $k < n$ by
$$W_k(\psi_k, \psi_{k+1}) = \sum_{\substack{x \in L_k,\, y \in L_{k+1}: \\ |x - y| = 1}} U\big((\Phi_k + \psi_k)(x), (\Phi_{k+1} + \psi_{k+1})(y)\big), \tag{15}$$
while
$$W_n(\psi_n) = \sum_{\substack{x \in L_n,\, y \in L_{n+1}: \\ |x - y| = 1}} U\big((\Phi_n + \psi_n)(x), \bar{\phi}(y)\big). \tag{16}$$
(Note for the future that the interactions along the bonds which are contained within one layer do not contribute to the $W_k$.) We are going to show that for every $k$ the distribution of the random variable $\psi_k$ under $\langle \cdot \mid \Phi(\phi) = \Phi \rangle_{n,\bar{\phi}}$ has a density $p_k(t)$ with respect to the Lebesgue measure on $S^1$, which satisfies
$$\sup_{t \in S^1} |p_k(t) - 1| \le C \sqrt{k} \left( \frac{n}{k} \right)^{-N(U)}, \tag{17}$$
uniformly in $\Phi$, $\bar{\phi}$, with $C = C(\bar{C})$. That implies (14).

To show (17) we note that due to the $S^1$-invariance of $U$ we have $W_k(\psi_k, \psi_{k+1}) = W_k(\psi_k + \alpha, \psi_{k+1} + \alpha)$ for every $\alpha \in S^1$. Hence $W_k(\psi_k, \psi_{k+1}) = W_k(\psi_k - \psi_{k+1}, 0)$, and therefore the random variables
$$\chi_k = \begin{cases} \psi_k - \psi_{k+1} & \text{for } k < n, \\ \psi_n & \text{for } k = n \end{cases}$$
are independent. Since evidently
$$\psi_k = \chi_k + \chi_{k+1} + \cdots + \chi_n, \tag{18}$$
we are left with the question about the distribution of the sum of independent random elements of $S^1$. Were the independent random elements $\chi_i$ identically distributed, with the distribution having a density, the statement (17) would be immediate. However, they are not identically distributed, so we need to work further. Introducing $W_k(\chi_k) = W_k(\chi_k, 0)$ for $k < n$, we have that for all $k \le n$ the distribution of the random element $\chi_k$ is given by the density
$$q_k(t) = \frac{\exp\{-W_k(t)\}}{\int \exp\{-W_k(t)\}\, dt}.$$
Let $t_{\min}$ be (any) global minimum of the function $W_k(\cdot)$. Then for every $t$ the Taylor expansion implies the estimate
$$W_k(t_{\min}) \le W_k(t) \le W_k(t_{\min}) + 8\bar C (k+1)\, |t - t_{\min}|^2, \tag{19}$$
due to (12), (15), (16). (This is the point where both smoothness and two-dimensionality are crucial.) Hence
$$\max_t q_k(t) \le C_1 \sqrt{k+1} \tag{20}$$
for some $C_1 = C_1(\bar C)$. Because of (18), $p_k(t) = (q_k * \cdots * q_n)(t)$, where $*$ stands for convolution. Therefore it is natural to study the Fourier coefficients
$$a_s(q_l) = \frac{1}{2\pi} \int_0^{2\pi} q_l(t)\, e^{ist}\, dt, \qquad s = 0, \pm 1, \pm 2, \ldots,$$
since
$$a_s(p_k) = \prod_{l=k}^{n} a_s(q_l). \tag{21}$$
We want to show that for every $s \neq 0$ the last product goes to 0 as $n \to \infty$, uniformly in $s$. To estimate the coefficients $|a_s(q_l)|$ we use the following straightforward
Lemma 6. Let $P_C$ be the set of all probability densities $q(\cdot)$ on a circle, satisfying
$$\sup_{t \in S^1} q(t) \le C,$$
and $s$ be an integer. Then the functional on $P_C$, given by the integral
$$\frac{1}{2\pi} \int_0^{2\pi} q(t) \cos(st)\, dt,$$
attains its maximal value at the density
$$q_C(t) = \begin{cases} C & \text{if } \left| t - \frac{2\pi k}{s} \right| \le \frac{1}{2Cs} \text{ for some } k = 0, \ldots, s-1, \\ 0 & \text{otherwise.} \end{cases}$$
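Using this lemma and the estimate (20), we obtain that
$$\sup\{|a_s(q_l)| : s \neq 0\} \le 2 C_1 \sqrt{l+1} \int_0^{\frac{1}{2}\left(C_1\sqrt{l+1}\right)^{-1}} \left(1 - \frac{t^2}{3}\right) dt = 1 - \frac{1}{36\, (C_1)^2 (l+1)}.$$
Since
$$\sup_{t \in S^1} |p_k(t) - 1| \le \sum_{s \neq 0} |a_s(p_k)|, \tag{22}$$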
we are almost done. Namely, note that due to the Parseval identity and (20) we have for every $l$,
$$1 + \sum_{s \neq 0} |a_s(q_l)|^2 = \int (q_l(t))^2\, dt \le C_1 \sqrt{l+1}.$$
Let us introduce now the densities $p_{k,r}(t) = (q_k * \cdots * q_r)(t)$, $k \le r \le n$. Due to the Cauchy inequality,
$$1 + \sum_{s \neq 0} |a_s(p_{k,k+1})| \le C_1 \sqrt[4]{(k+1)(k+2)}.$$
Therefore by (22) and (21),
$$\sup_{t \in S^1} |p_{k,r}(t) - 1| \le C_1 \sqrt[4]{(k+1)(k+2)} \prod_{l=k+2}^{r} \left(1 - \frac{1}{36\, (C_1)^2 (l+1)}\right), \tag{23}$$
which ends the proof of (17), with $C = 2 C_1(\bar C)$ and $N(U) = \dfrac{1}{36\, (C_1(\bar C))^2}$.
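The arithmetic behind (22) and (23) can be checked numerically. The sketch below is only an illustration: the value `c1 = 1.0` and the function names are assumptions made for the example, not quantities fixed by the paper. It verifies the closed form of the integral in the bound on $|a_s(q_l)|$ and compares the product in (23) with the power-law envelope that follows from $1 - x \le e^{-x}$ and $\sum_{m=k+3}^{r+1} 1/m \ge \ln\frac{r+2}{k+3}$.

```python
import math


def coefficient_bound(c1: float, l: int) -> float:
    """Closed form 1 - 1/(36 c1^2 (l+1)) bounding |a_s(q_l)| for s != 0."""
    return 1.0 - 1.0 / (36.0 * c1 ** 2 * (l + 1))


def coefficient_bound_quadrature(c1: float, l: int, steps: int = 20000) -> float:
    """Midpoint rule for 2C * int_0^{1/(2C)} (1 - t^2/3) dt, with C = c1*sqrt(l+1)."""
    c = c1 * math.sqrt(l + 1)
    h = 1.0 / (2.0 * c) / steps
    integral = sum(1.0 - ((i + 0.5) * h) ** 2 / 3.0 for i in range(steps)) * h
    return 2.0 * c * integral


def rhs_23(c1: float, k: int, r: int) -> float:
    """Right-hand side of (23): c1 * ((k+1)(k+2))^(1/4) * prod_{l=k+2}^{r} (...)."""
    product = 1.0
    for l in range(k + 2, r + 1):
        product *= coefficient_bound(c1, l)
    return c1 * ((k + 1) * (k + 2)) ** 0.25 * product


def power_law_envelope(c1: float, k: int, r: int) -> float:
    """Envelope c1 * ((k+1)(k+2))^(1/4) * ((k+3)/(r+2))^N with N = 1/(36 c1^2)."""
    exponent = 1.0 / (36.0 * c1 ** 2)
    return c1 * ((k + 1) * (k + 2)) ** 0.25 * ((k + 3) / (r + 2)) ** exponent


if __name__ == "__main__":
    c1, k = 1.0, 3  # illustrative values only (assumption)
    for r in (10, 100, 1000, 10000):
        print(r, rhs_23(c1, k, r), power_law_envelope(c1, k, r))
```

The printed columns decay slowly in `r`, which is consistent with the polynomial rate $n^{-N(U)}$ in (17) rather than with any exponential decay.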
" !
2.2. Theorem 1: Singular case. The key step in the above proof was the use of the Taylor expansion, to bound the densities qr . There the existence of the second derivative of U and its boundedness was used in a crucial way. Yet, one can use essentially the same arguments to treat the general case, without the smoothness assumption. The main idea is to represent the singular interaction as a small perturbation of a smooth one, smallness being understood in the L1 sense. Another version of this idea was used earlier in [BI, BCPK, DV, IV]. Namely, we will consider the nearest neighbour interaction U¯ (φ) = U (φ) − υ (φ) ,
(24)
where U is a smooth function with a bounded second derivative, as above, while υ ≥ 0 is a “small” singular component. The precise meaning of smallness will be made explicit a bit later, see (26). However, already now we can say that every continuous function U¯ can be written in the form (24), with U twice differentiable and with υ satisfying 0 ≤ υ (·) ≤ ε,
(25)
with ε > 0 arbitrarily small. That follows immediately for example from the Weierstrass theorem, stating that the trigonometric polynomials are everywhere dense in the space of continuous functions on the circle. Clearly, the estimate (25) implies L1 -smallness of υ, whatever the latter may mean. We will denote by H¯ the Hamiltonian corresponding to the singular interaction U¯ , while H will be the Hamiltonian defined by the smooth part of the interaction, U . To proceed with the expansion, we introduce the set En to be the collection of all bonds of
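$\mathbb{Z}^2$ with at least one end in the box $\Lambda_n$, and rewrite the partition function $Z_n^{\bar U,\bar\phi}$ in $\Lambda_n$, corresponding to the interaction $\bar U$ and the boundary conditions $\bar\phi$, as follows:
$$\begin{aligned}
Z_n^{\bar U,\bar\phi} &= \int_{\Omega_n} \exp\{-\bar H(\phi|\bar\phi)\}\, d\phi \\
&= \int_{\Omega_n} \exp\{-H(\phi|\bar\phi)\} \prod_{\langle x,y\rangle \in E_n} \left(1 + \left(e^{\upsilon(\phi(x)-\phi(y))} - 1\right)\right) d\phi \\
&= \sum_{A \subset E_n} \int_{\Omega_n} \exp\{-H(\phi|\bar\phi)\} \prod_{\langle x,y\rangle \in A} \left(e^{\upsilon(\phi(x)-\phi(y))} - 1\right) d\phi \\
&\equiv \sum_{A \subset E_n} Z_n^{U,\bar\phi,A}.
\end{aligned}$$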
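The expansion above rests on the elementary identity $\prod_{b}(1 + w_b) = \sum_{A} \prod_{b \in A} w_b$ with $w_b = e^{\upsilon_b} - 1 \ge 0$. A minimal sketch, at one fixed configuration and with made-up values of $\upsilon$ on eight bonds (the names `product_form`, `subset_expansion`, `subset_weights` and the value `eps = 0.1` are assumptions for illustration), checks the identity and shows that the subset weights normalise to one. The paper's $\pi_n(A)$ additionally integrates over configurations, which this sketch does not attempt.

```python
import itertools
import math
import random


def product_form(weights):
    """prod_b (1 + w_b), where w_b = exp(upsilon_b) - 1 >= 0."""
    return math.prod(1.0 + w for w in weights)


def subset_expansion(weights):
    """Sum over all subsets A of prod_{b in A} w_b (the empty product equals 1)."""
    bonds = range(len(weights))
    return sum(
        math.prod(weights[b] for b in subset)
        for size in range(len(weights) + 1)
        for subset in itertools.combinations(bonds, size)
    )


def subset_weights(weights):
    """Normalised weight of each subset A at a fixed configuration."""
    total = product_form(weights)
    bonds = range(len(weights))
    return {
        subset: math.prod(weights[b] for b in subset) / total
        for size in range(len(weights) + 1)
        for subset in itertools.combinations(bonds, size)
    }


if __name__ == "__main__":
    rng = random.Random(0)
    eps = 0.1  # hypothetical bound on the singular part, 0 <= upsilon <= eps
    weights = [math.exp(rng.uniform(0.0, eps)) - 1.0 for _ in range(8)]
    print(product_form(weights), subset_expansion(weights))
    print(sum(subset_weights(weights).values()))
```

When every $\upsilon_b$ is small, the weight concentrates on small subsets, which is the mechanism behind the comparison with sparse percolation.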
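For every subset $A \subset E_n$ we now introduce the probability distribution
$$\mu_n^{U,\bar\phi,A}(d\phi) = \frac{1}{Z_n^{U,\bar\phi,A}} \exp\{-H(\phi|\bar\phi)\} \prod_{\langle x,y\rangle \in A} \left(e^{\upsilon(\phi(x)-\phi(y))} - 1\right) d\phi.$$
Then we have for the original Gibbs state $\mu_n^{\bar U,\bar\phi}$ the following decomposition:
$$\mu_n^{\bar U,\bar\phi} = \sum_{A \subset E_n} \pi_n(A)\, \mu_n^{U,\bar\phi,A},$$
with the probabilities $\pi_n(\cdot)$ given by
$$\pi_n(A) = \frac{Z_n^{U,\bar\phi,A}}{Z_n^{\bar U,\bar\phi}}.$$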
Note that the states $\mu_n^{U,\bar\phi,A}$ are themselves Gibbs states in $\Lambda_n$, corresponding to the boundary condition $\bar\phi$ and the (non-translation invariant) nearest neighbour interaction $U^A$, which for bonds outside $A$ is given by our smooth function $U(\phi_s - \phi_t)$, while on bonds from $A$ it equals $U(\phi_s - \phi_t) - \ln\left(e^{\upsilon(\phi_s - \phi_t)} - 1\right)$. (Here the positivity of the function $\upsilon$ is used.) Let us now introduce the bond percolation process $\mathcal{A}$ on $E_n$, defining its probability distribution $P_n$ by $P_n(\mathcal{A} = A) = \pi_n(A)$. This process is of course a dependent percolation process. Happily, it turns out that it is dominated by independent bond percolation, with probability of a bond to be open very small! Our claim would follow once we check that the conditional probabilities $P_n(b \in \mathcal{A} \mid (E_n \setminus b) \cap \mathcal{A} = D)$ are small uniformly in $D$. We will show this under the following condition on the smallness of the singular part $\upsilon$ of the interaction $\bar U$. We suppose that
• $\bar U(\phi) = U(\phi) - \upsilon(\phi)$, with $U$ having bounded second derivative,
• $\upsilon \ge 0$,
• for every choice of the four values $\phi_1, \phi_2, \phi_3, \phi_4$,
$$\frac{\int \exp\left\{-\sum_{i=1}^{4} U(\phi - \phi_i) + \sum_{i=1}^{4} \upsilon(\phi - \phi_i)\right\} d\phi}{\int \exp\left\{-\sum_{i=1}^{4} U(\phi - \phi_i)\right\} d\phi} \le 1 + \varepsilon, \tag{26}$$
with $\varepsilon$ small enough. In words, the last condition says that the expectation of the observable $\exp\left\{\sum_{i=1}^{4} \upsilon(\phi - \phi_i)\right\}$ with respect to a single site conditional Gibbs distribution corresponding to the (smooth) interaction $U$ and any boundary condition $\phi_1, \phi_2, \phi_3, \phi_4$ around that site, is smaller than $1 + \varepsilon$. A straightforward calculation implies that under (26),
$$P_n(b \in \mathcal{A} \mid (E_n \setminus b) \cap \mathcal{A} = D) \le \varepsilon, \tag{27}$$
uniformly in $D$. We denote by $Q_\varepsilon$ the distribution of the corresponding independent bond percolation process, $\eta_\cdot$. The strategy of the remainder of this subsection is the following:
• we will show that if the set $A$ is sparse enough, then for the measure $\mu_n^{U,\bar\phi,A}$ the analog of the estimate (13) holds.
• such sparse sets $A$ constitute the dominant contribution to the distribution $P_n$.
Let us formulate now the sparseness condition on $A$ we need. In what follows, by a path we will mean a sequence of pairwise distinct bonds of our lattice, such that any two consecutive bonds share a site. A path with coinciding beginning and end is called a loop. If a loop surrounds the origin, we will call it a circuit. Any two objects of the above will be called disjoint, if they share neither a bond nor a site. The same objects, associated with the dual lattice will be called d-sites, d-bonds, d-paths, d-loops and d-circuits. Suppose the set $A$ is given, and $\lambda_1, \lambda_2, \ldots, \lambda_\nu$ is a collection of disjoint d-circuits, avoiding $A$. The latter means that no d-bond of any $\lambda_k$ crosses any of the bonds from $A$. We suppose that these d-circuits are ordered by “inclusion”. Then we introduce layers $L_k$ by
$$L_k = \{x \in \mathbb{Z}^2 : x \in \mathrm{Int}(\lambda_k) \setminus \mathrm{Int}(\lambda_{k-1})\}, \qquad k = 1, 2, \ldots, \nu + 1,$$
with the convention that $\mathrm{Int}(\lambda_0) = \emptyset$ and $\mathrm{Int}(\lambda_{\nu+1}) = \mathbb{Z}^2$. (Note that these layers are connected sets of sites, and they surround the origin in the same way as the “old” layers did.) For every configuration $\phi$ in $\Lambda_n$ we introduce, as in the previous section, the layer configurations $\Phi_k$, $k = 1, 2, \ldots, \nu + 1$ as its restrictions to the layers $L_k$, the layer angles $\psi_1, \ldots, \psi_\nu$, the $\nu$-dimensional torus $\Phi(\phi)$, and we note that the distribution of $\psi$-s under the condition that the orbit $\Phi(\phi)$ is fixed, is a (one-dimensional) Gibbs distribution. Moreover, it is defined by the nearest neighbour interaction $W_{\Phi,\bar\phi} = \{W_k, k = 1, 2, \ldots, \nu\}$, given by almost the same formula as (15): for $k < \nu$,
$$W_k(\psi_k, \psi_{k+1}) = \sum_{x \in L_k,\, y \in L_{k+1}:\ |x-y| = 1} U\big((\Phi_k + \psi_k)(x), (\Phi_{k+1} + \psi_{k+1})(y)\big), \tag{28}$$
while for $k = \nu$,
$$W_\nu(\psi_\nu) = \sum_{x \in L_\nu,\, y \in L_{\nu+1}:\ |x-y| = 1} U\big((\Phi_\nu + \psi_\nu)(x), (\phi \vee \bar\phi)(y)\big). \tag{29}$$
(Here the configuration $\phi \vee \bar\phi$ equals $\phi$ inside $\Lambda_n$ and $\bar\phi$ outside $\Lambda_n$.) Note that the singular part of the interaction $U^A$ does not enter in these formulas, precisely because the d-circuits $\lambda_k$ avoid the set $A$! Hence we can conclude that for every $k$ the distribution of the random variable $\psi_k$ under the measure $\langle \cdot \mid \Phi(\phi) = \Phi \rangle_{n,\bar\phi}$ has a density $p_k(t)$ on $S^1$, which satisfies the following analog of (23):
$$\sup_{t \in S^1} |p_k(t) - 1| \le C_1 \sqrt[4]{|\lambda_k|\, |\lambda_{k+1}|}\; \exp\left\{-\frac{1}{36\, (C_1)^2} \sum_{l=k+2}^{\nu} \frac{1}{|\lambda_l|}\right\}, \tag{30}$$
uniformly in $\Phi$, $\bar\phi$. The last relation suggests the following
Definition 7 (of sparseness). The set $A$ of bonds in $E_n$ is $\tau$-sparse, if there exists a family of $\nu(A)$ disjoint d-circuits $\lambda_l$ in $\Lambda_n$, avoiding $A$, and such that
$$\sum_{l=1}^{\nu(A)} \frac{1}{|\lambda_l|} \ge \tau \ln n.$$
Therefore we will be done, once we show the following:
Proposition 8. For any $\kappa$, $1 > \kappa > 0$, there exists a value $\tau = \tau(\kappa) > 0$, such that
$$P_n(\mathcal{A} \text{ is not } \tau\text{-sparse}) \le e^{-n^\kappa}. \tag{31}$$
The proof of this proposition is the content of the following subsections. 2.2.1. $\tau$-sparseness is typical. For every $l = 2, 3, \ldots$, let us define the northern rectangle
$$R_N^l = [-2^l, \ldots, 2^l] \times [2^{l-1} + 1, \ldots, 2^l],$$
and let the eastern, southern and western rectangles $R_E^l$, $R_S^l$ and $R_W^l$ be the clockwise rotations of $R_N^l$ by, respectively, $\pi/2$, $\pi$ and $3\pi/2$ with respect to the origin. Define the $l$-th shell $T^l$ by
$$T^l = R_N^l \cup R_E^l \cup R_S^l \cup R_W^l.$$
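The geometry of the shells is easy to enumerate directly. The sketch below (the function names are illustrative, not from the paper) builds the four rectangles as sets of lattice sites, using the clockwise quarter turn $(x, y) \mapsto (y, -x)$, and lets one check that the $l$-th shell is exactly the set of sites with $2^{l-1} < \|x\|_\infty \le 2^l$, so that shells at different scales cannot overlap.

```python
def northern_rectangle(l: int) -> set:
    """Sites of R_N^l = [-2^l, ..., 2^l] x [2^(l-1) + 1, ..., 2^l]."""
    return {
        (x, y)
        for x in range(-2 ** l, 2 ** l + 1)
        for y in range(2 ** (l - 1) + 1, 2 ** l + 1)
    }


def rotate_clockwise(points: set, quarter_turns: int) -> set:
    """Rotate lattice sites clockwise about the origin: (x, y) -> (y, -x)."""
    rotated = set(points)
    for _ in range(quarter_turns):
        rotated = {(y, -x) for (x, y) in rotated}
    return rotated


def shell(l: int) -> set:
    """The l-th shell: union of the northern, eastern, southern, western rectangles."""
    north = northern_rectangle(l)
    return set().union(*(rotate_clockwise(north, q) for q in range(4)))


def sup_norm(site) -> int:
    return max(abs(site[0]), abs(site[1]))


if __name__ == "__main__":
    for l in range(2, 6):
        norms = {sup_norm(p) for p in shell(l)}
        print(l, min(norms), max(norms), len(shell(l)))
```

For each `l` the printed minimum and maximum sup-norms are `2**(l-1) + 1` and `2**l`, so the shell fits inside the box of half-width `n` as soon as `n >= 2**l`.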
Clearly, $T^l \subset \Lambda_n$ once $n \ge 2^l$, while different $T^l$-s are disjoint. Let a configuration $A$ of bonds be given. By a good crossing of a rectangle $R^\cdot_\cdot$ we will mean a d-path, joining the two short sides of $R^\cdot_\cdot$ and avoiding $A$. We denote the set of such crossings by $\mathcal{R}_{\leftrightarrow}$. Let $\lambda_N^l, \lambda_E^l, \lambda_S^l, \lambda_W^l$ be four good crossings of the rectangles $R_N^l, R_E^l, R_S^l, R_W^l$ respectively. Then the collection of those d-bonds of the union $\lambda_N^l \cup \lambda_E^l \cup \lambda_S^l \cup \lambda_W^l$, which are seen from the origin, form a d-circuit avoiding $A$. Therefore we want to get a
2.2.2. Lower bound on the number of disjoint good crossings of a rectangle. We claim that for all $\varepsilon$ sufficiently small there exist $\alpha = \alpha(\varepsilon) > 0$ and $c_1 = c_1(\varepsilon) > 0$ such that at each scale $k$ the $Q_\varepsilon$-probability that there are less than $\alpha 2^k$ disjoint good crossings of $R_N^k$ is smaller than $e^{-c_1 2^k}$, where $Q_\varepsilon$ is the measure of the independent bond percolation process $\eta_\cdot$, defined after (27). Indeed, by the Ford-Fulkerson min-cut/max-flow Theorem (see e.g. [R]), the number of disjoint good crossings of $R_N^k$ (which by definition are left-to-right crossings by d-paths) is bounded from below by
$$\frac{1}{2} \min_{\lambda \in \mathcal{R}_{\updownarrow}} \left(|\lambda| - |\lambda \cap A|\right),$$
where the minimum is taken over the set $\mathcal{R}_{\updownarrow}$ of all “cuts”, which are just paths in $R_N^k$, joining the bottom and top sides of $R_N^k$. The min-cut quantity $\min_\lambda \left(|\lambda| - |\lambda \cap A|\right)$ equals the maximal left-to-right flow by d-paths, avoiding $A$, and the factor 1/2 accounts for the fact that the corresponding d-paths might share the same d-sites, so in order to estimate the number of disjoint paths we have to take half of the total flow. Evidently,
$$Q_\varepsilon\left(\exists\, \lambda \in \mathcal{R}_{\updownarrow} \text{ with } |\lambda| - |\lambda \cap A| \le \alpha 2^k\right) \le \sum_{\lambda \in \mathcal{R}_{\updownarrow}} Q_\varepsilon\left(|\lambda| - |\lambda \cap A| \le \alpha 2^k\right), \tag{32}$$
while for every $\lambda$
$$Q_\varepsilon\left(|\lambda| - |\lambda \cap A| \le \alpha 2^k\right) \le 2^{|\lambda|}\, \varepsilon^{|\lambda| - \alpha 2^k} \le e^{-c_2 |\lambda|},$$
since any top-to-bottom crossing contains at least $2^{k-1}$ bonds. Here $c_2 = c_2(\alpha, \varepsilon) > 0$ satisfies
$$\lim_{\varepsilon \to 0} c_2(\alpha, \varepsilon) = \infty,$$
once $\alpha < 1/2$. Thus, choosing $\alpha < 1/2$ and $\varepsilon$ sufficiently small, we infer that there exists $c_1 > 0$, such that the right-hand side of (32) is bounded above by
$$2^k \sum_{l=2^{k-1}}^{\infty} 3^l e^{-c_2 l} \le e^{-c_1 2^k}.$$
Thus, the min-cut/max-flow theorem ensures that up to the $Q_\varepsilon$-probability $1 - e^{-c_1 2^k}$, there are at least $\alpha 2^{k-1}$ disjoint good crossings $\lambda_i$ of $R_N^k$. Moreover, observe that at least $\alpha 2^{k-2}$ of these d-paths have the length bounded above by $\alpha^{-1} 2^{k+3}$. Indeed, should this not be the case,
$$\sum_i |\lambda_i| > \alpha 2^{k-2} \cdot \frac{1}{\alpha}\, 2^{k+3} = 2 |R_N^k|,$$
which in view of the disjointedness of $\lambda_i$-s is impossible. Let us say that a left-to-right crossing d-path $\lambda$ of the $k$-th scale is $\alpha$-short, if $|\lambda| < \alpha^{-1} 2^{k+3}$, and define the event
$$T_N^{k,\alpha} = \left\{A : \text{there are at least } \alpha 2^{k-2} \text{ disjoint good } \alpha\text{-short crossings of } R_N^k\right\},$$
What we have proved up to now can be summarized as follows: There exists $c_1 > 0$, such that uniformly in $k$,
$$Q_\varepsilon\left(T_N^{k,\alpha}\right) \ge 1 - e^{-c_1 2^k}, \tag{33}$$
as soon as $\alpha$ and $\varepsilon$ are sufficiently small. 2.2.3. Proof of Proposition 8. Consider now the event $T^{k,\alpha} = T_N^{k,\alpha} \cap T_E^{k,\alpha} \cap T_S^{k,\alpha} \cap T_W^{k,\alpha}$. From the previous argument one knows that for $\varepsilon$ close enough to 0 the $Q_\varepsilon$-probability of the event $T^{k,\alpha}$ is at least $1 - 4 e^{-c_1 2^k}$. Note that under $T^{k,\alpha}$ there are at least $\alpha 2^{k-2}$ disjoint d-circuits in $T^k$, avoiding $A$, all of which have length at most $2^{k+5}/\alpha$. Also, the events $T^{k,\alpha}$ are non-decreasing, therefore their $P_n$-probability is at least $1 - 4 e^{-c_1 2^k}$ as well. The claim of Proposition 8 is now an immediate consequence: Let $1 > \rho > 0$. Then, for every $n = 2, 3, \ldots$, the event
$$T^{n,\rho,\alpha} = \bigcap_{k=[\rho \log_2 n]}^{[\log_2 n]} T^{k,\alpha}$$
has, by (33), $P_n$-probability at least $1 - c_3 e^{-c_4 n^\rho}$. However, by the very construction, the occurrence of the event $T^{n,\rho,\alpha}$ ensures that in each shell $T^k$, $k \in \{[\rho \log_2 n], \ldots,$
$[\log_2 n]\}$, it is possible to find a family of disjoint d-circuits avoiding $A$ and such that the sum of the inverse of their lengths is at least $\alpha^2/128$. Their total is at least
$$\frac{\alpha^2}{128} \cdot \frac{1 - \rho}{2}\, \log_2 n.$$
The conclusion (31) follows. 2.2.4. General finite-range interactions. We briefly describe the main modifications to the proof given above, which are needed in order to treat the case of finite-range, non nearest-neighbour interactions $\bar U_\Lambda$, $\Lambda \subset \mathbb{Z}^2$. As in (24), we decompose $\bar U = U - \upsilon$ to a smooth part $U$ and a small singular part $0 \le \upsilon \le \varepsilon$. Notice that the choice of $\varepsilon = \varepsilon(r)$ will in general depend on the diameter $r = \mathrm{diam}(\Lambda)$ of the interaction set $\Lambda$. The singular part of the interaction will be controlled by a dependent site percolation process, which we construct in two steps as follows. Define $\bar\Lambda_n = \{x : (x + \Lambda) \cap \Lambda_n \neq \emptyset\}$.
Step 1. As in the nearest-neighbour case, write
$$Z_n^{\bar U,\bar\phi} = \sum_{A \subset \bar\Lambda_n} \int_{\Omega_n} \exp\{-H(\phi|\bar\phi)\} \prod_{x \in A} \left(e^{\upsilon(\phi_{\cdot + x})} - 1\right) d\phi \;\triangleq\; \sum_{A \subset \bar\Lambda_n} Z_n^{U,\bar\phi,A}.$$
Then, exactly as before, it is easy to show that the probability distribution
$$P_n(\mathcal{A} = A) \;\triangleq\; \frac{Z_n^{U,\bar\phi,A}}{Z_n^{\bar U,\bar\phi}} \tag{34}$$
on $\{0,1\}^{\bar\Lambda_n}$ is stochastically dominated by the Bernoulli site percolation process $Q_\varepsilon$ with density $\varepsilon$.
Step 2. Let us split $\mathbb{Z}^2$ into the disjoint union of the shifts of squares $B \triangleq \{-2r, \ldots, 2r\}^2$,
$$\mathbb{Z}^2 = \bigcup_x (4rx + B).$$
Given a realization $A$ of the random set $\mathcal{A}$ (distributed according to (34)) let us say that $x \in \mathbb{Z}^2$ is good if $(4rx + B) \cap A = \emptyset$. Thus, for every $n$, $\mathcal{A}$ induces a probability distribution on $\{0,1\}^{\mathbb{Z}^2}$, which stochastically dominates Bernoulli site percolation with density $1 - (1 - \varepsilon)^{16 r^2}$. This dictates the choice of $\varepsilon$ in terms of the diameter of the interaction $r$: For example, $\varepsilon = 1/(C r^2)$ for $C$ large enough qualifies. The end of the proof is a straightforward modification of the one in the nearest-neighbour case.
2.3. Long-range case: Proof of Theorem 2. In this section we study the long-range case, by adapting the technique of [P, FP] to the setting of singular interaction. As in the previous section, we restrict our attention to the case of $S^1$-valued spins (the extension to the general case is done in the same way as before). We give here a proof only for the case when all the interactions $J_x$ in (8) are nonnegative. The proof in the general case is then straightforward. Let again $\Lambda_n$ be the box $\{x \in \mathbb{Z}^2 : \|x\|_\infty \le n\}$, $E_n^J = \{\{x,y\} : J_{x-y} \neq 0,\ \{x,y\} \cap \Lambda_n \neq \emptyset\}$, and let $\bar\phi$ be an arbitrary boundary condition outside $\Lambda_n$. The relative Hamiltonian takes the form
$$\bar H(\phi_{\Lambda_n} | \bar\phi) = \sum_{\substack{\{x,y\} \in E_n^J \\ \{x,y\} \subset \Lambda_n}} J_{x-y}\, \bar U(\phi_x - \phi_y) + \sum_{\substack{\{x,y\} \in E_n^J \\ \{x,y\} \not\subset \Lambda_n}} J_{x-y}\, \bar U(\phi_x - \bar\phi_y),$$
where as in (24) the interaction $\bar U$ consists of a smooth part $U$ and a small part $\upsilon$. Recall that due to the normalization assumption (9), we can interpret the numbers $j(x) \triangleq J_x$ as the transition probabilities of a symmetric random walk $X_\cdot$ on $\mathbb{Z}^2$. We denote by $E_X$ expectation w.r.t. this random walk conditioned to start at the origin at time 0. Our assumption on the coupling constants $J_\cdot$ is that $X_\cdot$ is recurrent. Let $\langle \cdot \rangle_{n,\bar\phi}$ be the Gibbs state in $\Lambda_n$ corresponding to the interaction $\bar U$ and the boundary condition $\bar\phi$. To prove the theorem, it is enough to show that, for any $\delta > 0$, any bounded local function $f(\phi)$ and any $\psi \in S^1$,
$$\lim_{n \to \infty} \left| \langle f(\phi + \psi) \rangle_{n,\bar\phi} - \langle f(\phi) \rangle_{n,\bar\phi} \right| \le \delta. \tag{35}$$
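2.3.1. Expansion of the measure. As in Subsect. 2.2, we expand the Gibbs measure as
$$\mu_n^{\bar U,\bar\phi} = \sum_{A \subset E_n^J} \pi_n(A)\, \mu_n^{U,\bar\phi,A},$$
with the probabilities $\pi_n(\cdot)$ given by
$$\pi_n(A) = \frac{Z_n^{U,\bar\phi,A}}{Z_n^{\bar U,\bar\phi}},$$
and consider the bond percolation process $\mathcal{A}$ on $E_n^J$ with probability distribution $P_n(\mathcal{A} = A) = \pi_n(A)$. Exactly as before, we can show that this process is stochastically dominated by an independent bond percolation process $Q_{J,\varepsilon}$ on $E_n^J$ with probabilities $Q_{J,\varepsilon}(\{x,y\} \in \mathcal{A}) = \varepsilon J_{x-y}$. From now on, we always assume that $\varepsilon$ is chosen strictly smaller than 1. We will use the following notation for the connectivities of the process $Q_{J,\varepsilon}$: $p_{x,\varepsilon} = Q_{J,\varepsilon}\big(0 \overset{\mathcal{A}}{\leftrightarrow} x\big)$. Notice that
$$p_{x,\varepsilon} \le \sum_{n=1}^{\infty} \varepsilon^n j^{(n)}(x) \;\triangleq\; d_\varepsilon(x),$$
where $j^{(n)}$ are the $n$-step transition probabilities of the random walk $X_\cdot$. Therefore
$$c(\varepsilon) \;\triangleq\; \sum_x p_{x,\varepsilon} \le \sum_x d_\varepsilon(x) = \sum_{n=1}^{\infty} \varepsilon^n = \frac{\varepsilon}{1 - \varepsilon}, \tag{36}$$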
and the numbers $c(\varepsilon)^{-1} p_{x,\varepsilon}$ can be considered as the transition probabilities of a new random walk on $\mathbb{Z}^2$, which we denote by $Y_\cdot$; expectation w.r.t. $Y_\cdot$ conditioned to start at 0 at time 0 is denoted by $E_Y$. The following lemma plays an essential role in the sequel:
Lemma 9. $X_\cdot$ recurrent $\Longrightarrow$ $Y_\cdot$ recurrent.
Proof. The recurrence of $X_\cdot$ is equivalent (see Th. 8.2 in Chapter II of [Sp]) to
$$\int_{T^2} \frac{d\theta}{1 - \phi(\theta)} = \infty, \quad \text{where} \quad \phi(\theta) \;\triangleq\; E_X e^{i(\theta, X_1)} = \sum_x e^{i(\theta, x)} j(x) = \sum_x \cos((\theta, x))\, j(x). \tag{37}$$
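One has to show that
$$\int_{T^2} \frac{d\theta}{1 - E_Y e^{i(\theta, Y_1)}} = \infty. \tag{38}$$
Now, $Y_\cdot$ is symmetric. Thus
$$\begin{aligned}
1 - E_Y e^{i(\theta, Y_1)} &= E_Y\big(1 - \cos((\theta, Y_1))\big) = \frac{1}{c(\varepsilon)} \sum_x \big(1 - \cos((\theta, x))\big)\, p_{x,\varepsilon} \\
&\le \frac{1}{c(\varepsilon)} \sum_x \sum_{n=1}^{\infty} \big(1 - \cos((\theta, x))\big)\, \varepsilon^n j^{(n)}(x) \\
&= \frac{1}{c(\varepsilon)} \sum_{n=1}^{\infty} \big(1 - \phi^n(\theta)\big)\, \varepsilon^n \\
&= \frac{1 - \phi(\theta)}{c(\varepsilon)} \sum_{n=1}^{\infty} \varepsilon^n \big(1 + \phi(\theta) + \cdots + \phi^{n-1}(\theta)\big) \\
&= \frac{1 - \phi(\theta)}{c(\varepsilon)} \sum_{n=0}^{\infty} \phi^n(\theta) \sum_{k > n} \varepsilon^k \\
&= \frac{(1 - \phi(\theta))\, \varepsilon}{c(\varepsilon)(1 - \varepsilon)(1 - \varepsilon \phi(\theta))},
\end{aligned}$$
which implies that (38) follows from (37).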
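The resummation in the proof of Lemma 9 reduces to the identity $\sum_{n \ge 1} \varepsilon^n (1 - \phi^n) = \frac{(1-\phi)\,\varepsilon}{(1-\varepsilon)(1-\varepsilon\phi)}$ for a real number $\phi \in [-1, 1]$ and $0 < \varepsilon < 1$. A small numerical check, with arbitrary sample values and illustrative function names, is sketched below; the constant $c(\varepsilon)$ is left out because it multiplies both sides.

```python
def series_side(phi: float, eps: float, terms: int = 2000) -> float:
    """Partial sum of sum_{n>=1} eps^n * (1 - phi^n)."""
    return sum(eps ** n * (1.0 - phi ** n) for n in range(1, terms + 1))


def closed_form(phi: float, eps: float) -> float:
    """(1 - phi) * eps / ((1 - eps) * (1 - eps * phi))."""
    return (1.0 - phi) * eps / ((1.0 - eps) * (1.0 - eps * phi))


if __name__ == "__main__":
    for phi in (-1.0, -0.3, 0.0, 0.5, 0.99):
        for eps in (0.1, 0.5, 0.9):
            print(phi, eps, series_side(phi, eps), closed_form(phi, eps))
```

Since $1 - \varepsilon\phi \ge 1 - \varepsilon$, the closed form is at most $\frac{\varepsilon}{(1-\varepsilon)^2}(1 - \phi)$, so $1 - E_Y e^{i(\theta, Y_1)}$ is bounded by a constant multiple of $1 - \phi(\theta)$, and divergence of the integral in (37) forces divergence in (38).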
" !
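2.3.2. The spin-wave. Let us denote by $V$ the support of $f$. Given a subset $A \subseteq E_n^J$, we define the equivalence relation $\overset{A}{\leftrightarrow}$ between sites of $\mathbb{Z}^2$ by saying that $x \overset{A}{\leftrightarrow} y$ iff there is a path made from the bonds of $A$, which connects the sites $x$ and $y$. By definition, $x \overset{A}{\leftrightarrow} x$ for any $A$. For every $x \in \Lambda_n$ we define
$$r_A(x) = \sup\{\|y\|_\infty : y \in \mathbb{Z}^2 \text{ and } y \overset{A}{\leftrightarrow} x\}.$$
Clearly, $\|x\|_\infty \le r_A(x) \le \infty$. We define
$$\rho_V = \max\{\|x\|_\infty ;\ x \in V\} \vee 1 \quad \text{and} \quad r_A(V) = \max\{r_A(x) ;\ x \in V\} \vee 1.$$
Let $R(\delta)$ be the smallest number such that
$$Q_{J,\varepsilon}\big(r_{\mathcal{A}}(V) > R(\delta)\big) \le \frac{\delta}{2 \|f\|_\infty}.$$
Notice that $R(\delta) < \infty$ since
$$Q_{J,\varepsilon}\big(r_{\mathcal{A}}(V) > R(\delta)\big) \le |V| \sum_{y:\ \|y\|_\infty > R(\delta) - \rho_V} p_{y,\varepsilon},$$
and $\sum_x p_{x,\varepsilon} = c(\varepsilon) < \infty$, see (36).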
By recurrence of the random walk $Y_\cdot$, which was established above, one can find, for any $\delta > 0$ and $0 < \psi < \infty$, a sequence of non-negative functions $D_{n,\delta,\psi}$ on $\mathbb{Z}^2$ (the spin-waves) such that $D_{n,\delta,\psi}(x) = 0$ if $x \notin \Lambda_n$, $D_{n,\delta,\psi}(x) = \psi$ if $\|x\|_\infty < R(\delta)$, and
$$\lim_{n \to \infty} \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} p_{x-y,\varepsilon} \big(D_{n,\delta,\psi}(x) - D_{n,\delta,\psi}(y)\big)^2 = 0. \tag{39}$$
The most natural candidate for such a spin-wave is given by
$$D_{n,\delta,\psi}(x) = \psi\, P_x^Y\big(\tau_{\Lambda_R} < \tau_{\Lambda_n^c}\big), \tag{40}$$
where $P_x^Y$ denotes the law of the $Y$-random walk starting at $x$, whereas $\tau_{\Lambda_R}$ and $\tau_{\Lambda_n^c}$ are the first hitting times of $\Lambda_{R(\delta)}$ and of the exterior $\Lambda_n^c = \mathbb{Z}^2 \setminus \Lambda_n$ respectively. Then (39) is related to the vanishing, as $n \to \infty$, of the escape probability from $\Lambda_n$. The function $D_{n,\delta,\psi}(\cdot)$ in (40) also represents the voltage distribution (cf. [DoS] on the interpretation of recurrence in terms of electric networks) in the network on the graph $(\mathbb{Z}^2, E^J)$ with bond conductances $p_{x-y,\varepsilon}$, once all the sites in $\Lambda_{R(\delta)}$ are kept at the constant voltage $\psi$, whereas all the sites in $\Lambda_n^c$ are grounded. In this language the vanishing of the limit in (39) means zero conductance from $\Lambda_{R(\delta)}$ to infinity, which is a characteristic property of electric networks corresponding to recurrent random walks. Let us fix a spin-wave sequence $\{D_{n,\delta,\psi}(x)\}$ so that (39) holds. For any $n$ and any $A \subset \mathbb{Z}^2$ such that $r_A(V) \le R(\delta)$, we define the corresponding $A$-deformed spin-wave by
$$\tilde D_{n,\delta,\psi,A}(x) \;\triangleq\; \min_{y:\ x \overset{A}{\leftrightarrow} y} D_{n,\delta,\psi}(y). \tag{41}$$
When $A$ is such that $r_A(V) > R(\delta)$, we simply set $\tilde D_{n,\delta,\psi,A} \equiv 0$.
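To get a feeling for the candidate (40), one can compute the corresponding voltage profile in the simplest recurrent setting. The sketch below is an illustration under explicit assumptions: it replaces the walk $Y_\cdot$ by the simple nearest-neighbour random walk on $\mathbb{Z}^2$ (unit conductances on nearest-neighbour bonds), solves the discrete Dirichlet problem by Gauss-Seidel iteration, and evaluates the nearest-neighbour analogue of the energy in (39). The function names, the inner radius and the tolerance are illustrative choices.

```python
def harmonic_spin_wave(n: int, radius: int, psi: float,
                       tol: float = 1e-9, max_sweeps: int = 50000) -> dict:
    """Discrete harmonic function with D = psi on {|x|_inf < radius} and
    D = 0 outside the box {|x|_inf <= n}, for the simple random walk."""
    size = n + 1  # the layer |x|_inf = n + 1 stands for the grounded exterior
    d = {}
    for x in range(-size, size + 1):
        for y in range(-size, size + 1):
            d[(x, y)] = psi if max(abs(x), abs(y)) < radius else 0.0
    free = [p for p in d if radius <= max(abs(p[0]), abs(p[1])) <= n]
    for _ in range(max_sweeps):
        change = 0.0
        for (x, y) in free:
            new = 0.25 * (d[(x + 1, y)] + d[(x - 1, y)]
                          + d[(x, y + 1)] + d[(x, y - 1)])
            change = max(change, abs(new - d[(x, y)]))
            d[(x, y)] = new
        if change < tol:
            break
    return d


def dirichlet_energy(d: dict) -> float:
    """Sum of (D(x) - D(y))^2 over nearest-neighbour bonds, each counted once."""
    energy = 0.0
    for (x, y), value in d.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in d:
                energy += (value - d[neighbour]) ** 2
    return energy


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, dirichlet_energy(harmonic_spin_wave(n, radius=2, psi=1.0)))
```

The printed energies decrease as the box grows, in line with the vanishing conductance to infinity described above; for the simple random walk the decrease is only logarithmic in `n`, so large boxes are needed before the energy becomes small.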
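For any $x \in \Lambda_n$ we denote by $t_A(x) \in \mathbb{Z}^2$ one of the sites $y : x \overset{A}{\leftrightarrow} y$, at which the minimum in (41) is attained. (This is a slight abuse of notation, since in fact the site $t_A(x)$ depends also on the function $D_{n,\delta,\psi}(\cdot)$.) The deformed spin-wave is less regular than $D_{n,\delta,\psi}$, but has the property, crucial for us, that $\tilde D_{n,\delta,\psi,A}(x) = \tilde D_{n,\delta,\psi,A}(y)$ whenever $x \overset{A}{\leftrightarrow} y$. In particular, $\tilde D_{n,\delta,\psi,A}(x) = 0$ whenever $x$ is $A$-connected to the outside of $\Lambda_n$. We introduce the tilted measure
$$\mu_n^{U,\bar\phi,A,\tilde D}(\,\cdot\,) = \mu_n^{U,\bar\phi,A}(\,\cdot + \tilde D_{n,\delta,\psi,A}).$$
Notice that $\mu_n^{U,\bar\phi,A,\tilde D} = \mu_n^{U,\bar\phi,A}$ whenever $A$ is such that $r_A(V) > R(\delta)$. On the other hand, if $r_A(V) \le R(\delta)$, then
$$\langle f(\phi + \psi) \rangle_n^{U,\bar\phi,A} = \langle f(\phi) \rangle_n^{U,\bar\phi,A,\tilde D}.$$
Consequently the following estimate holds:
$$\left| \langle f(\phi + \psi) \rangle_{n,\bar\phi} - \langle f(\phi) \rangle_{n,\bar\phi} \right| \le E_n \left| \langle f(\phi) \rangle_n^{U,\bar\phi,\mathcal{A},\tilde D} - \langle f(\phi) \rangle_n^{U,\bar\phi,\mathcal{A}} \right| + 2 \|f\|_\infty P_n\big(r_{\mathcal{A}}(V) > R(\delta)\big).$$
Our target assertion (35) is a consequence of the following two results:
$$\lim_{n \to \infty} E_n \left| \langle f(\phi) \rangle_n^{U,\bar\phi,\mathcal{A}} - \langle f(\phi) \rangle_n^{U,\bar\phi,\mathcal{A},\tilde D} \right| = 0, \tag{42}$$
and
$$2 \|f\|_\infty P_n\big(r_{\mathcal{A}}(V) > R(\delta)\big) \le \delta. \tag{43}$$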
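The second bound readily follows from the stochastic domination by the Bernoulli percolation process $Q_{J,\varepsilon}$ and the definition of $R(\delta)$. The next subsection is devoted to the proof of (42). Our approach is essentially that of [P, FP], but with some simplifications. The main difference between the latter works and ours is that, using a suitable relative entropy inequality, we obtain estimates on difference of expectations in finite volume; in this way, (42) follows immediately by taking the thermodynamic limit, instead of using the general theory of infinite-volume Gibbs states. 2.3.3. Relative entropy estimate. By the well known inequality (see e.g. [F], formula (3.4) on p. 133),
$$\left| \langle f(\phi) \rangle_n^{U,\bar\phi,A,\tilde D} - \langle f(\phi) \rangle_n^{U,\bar\phi,A} \right| \le \|f\|_\infty \sqrt{2\, \mathrm{H}\big(\mu_n^{U,\bar\phi,A,\tilde D} \mid \mu_n^{U,\bar\phi,A}\big)},$$
where $\mathrm{H}\big(\mu_n^{U,\bar\phi,A,\tilde D} \mid \mu_n^{U,\bar\phi,A}\big)$ is the relative entropy of $\mu_n^{U,\bar\phi,A,\tilde D}$ with respect to $\mu_n^{U,\bar\phi,A}$. By Jensen’s inequality it suffices to show that
$$\lim_{n \to \infty} E_n\, \mathrm{H}\big(\mu_n^{U,\bar\phi,\mathcal{A},\tilde D} \mid \mu_n^{U,\bar\phi,\mathcal{A}}\big) = 0. \tag{44}$$
From now on we assume that we are working on the event $r_A(V) \le R(\delta)$ (otherwise the relative entropy is 0). We follow [P], and we write:
$$\begin{aligned}
\mathrm{H}\big(\mu_n^{U,\bar\phi,A,\tilde D} \mid \mu_n^{U,\bar\phi,A}\big) &\le \mathrm{H}\big(\mu_n^{U,\bar\phi,A,\tilde D} \mid \mu_n^{U,\bar\phi,A}\big) + \mathrm{H}\big(\mu_n^{U,\bar\phi,A,-\tilde D} \mid \mu_n^{U,\bar\phi,A}\big) \\
&= \left\langle H(\phi + \tilde D_{n,\delta,\psi,A} \mid \bar\phi) + H(\phi - \tilde D_{n,\delta,\psi,A} \mid \bar\phi) - 2 H(\phi \mid \bar\phi) \right\rangle_{n,\bar\phi,A},
\end{aligned}$$
where, as before, $H(\phi|\bar\phi)$ is the Hamiltonian defined by the smooth part of the interaction. Taylor expansion yields
$$H(\phi + \tilde D_{n,\delta,\psi,A} \mid \bar\phi) + H(\phi - \tilde D_{n,\delta,\psi,A} \mid \bar\phi) - 2 H(\phi \mid \bar\phi) \le c_4 \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} J_{x-y} \big(D_{n,\delta,\psi}(t_A(x)) - D_{n,\delta,\psi}(t_A(y))\big)^2$$
with $c_4 = \max U''$. By Jensen’s inequality,
$$\begin{aligned}
\big(D_{n,\delta,\psi}(t_A(x)) - D_{n,\delta,\psi}(t_A(y))\big)^2 \le 3 \Big[ &\big(D_{n,\delta,\psi}(t_A(x)) - D_{n,\delta,\psi}(x)\big)^2 + \big(D_{n,\delta,\psi}(t_A(y)) - D_{n,\delta,\psi}(y)\big)^2 \\
&+ \big(D_{n,\delta,\psi}(x) - D_{n,\delta,\psi}(y)\big)^2 \Big].
\end{aligned} \tag{45}$$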
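The step leading to (45) can be spelled out as a telescoping decomposition followed by the elementary bound $(a+b+c)^2 \le 3(a^2+b^2+c^2)$. The following is a sketch in shortened notation, writing $D$ for $D_{n,\delta,\psi}$ and $t$ for $t_A$; the labels $a$, $b$, $c$ are introduced here only for the illustration.

```latex
\begin{align*}
D(t(x)) - D(t(y))
  &= \underbrace{\big[D(t(x)) - D(x)\big]}_{a}
   + \underbrace{\big[D(x) - D(y)\big]}_{b}
   + \underbrace{\big[D(y) - D(t(y))\big]}_{c}, \\
(a + b + c)^2
  &= 9 \left(\frac{a + b + c}{3}\right)^2
   \le 9 \cdot \frac{a^2 + b^2 + c^2}{3}
   = 3\,(a^2 + b^2 + c^2).
\end{align*}
```

The middle inequality is Jensen's inequality for the convex function $u \mapsto u^2$ applied to the uniform average of $a$, $b$, $c$.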
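The sum of the third terms of (45) is bounded by
$$\sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} J_{x-y} \big(D_{n,\delta,\psi}(x) - D_{n,\delta,\psi}(y)\big)^2 \le \frac{1}{\varepsilon} \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} p_{x-y,\varepsilon} \big(D_{n,\delta,\psi}(x) - D_{n,\delta,\psi}(y)\big)^2,$$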
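and therefore, by the very definition of $D_{n,\delta,\psi}$, goes to zero as $n \to \infty$. The contribution of the remaining two terms of (45) to $E_n \mathrm{H}$ is bounded by
$$\begin{aligned}
2\, E_n \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} J_{x-y} \big(D_{n,\delta,\psi}(t_{\mathcal{A}}(x)) - D_{n,\delta,\psi}(x)\big)^2
&\le C \sum_{x \in \Lambda_n} E_n \big(D_{n,\delta,\psi}(t_{\mathcal{A}}(x)) - D_{n,\delta,\psi}(x)\big)^2 \\
&\le C \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} Q_{J,\varepsilon}\big(x \overset{\mathcal{A}}{\leftrightarrow} y\big) \big(D_{n,\delta,\psi}(y) - D_{n,\delta,\psi}(x)\big)^2 \\
&= C \sum_{x \in \Lambda_n} \sum_{y \in \mathbb{Z}^2} p_{x-y,\varepsilon} \big(D_{n,\delta,\psi}(y) - D_{n,\delta,\psi}(x)\big)^2,
\end{aligned}$$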
and the result follows again from the definition (39) of $D_{n,\delta,\psi}$. 2.4. Continuous symmetry breaking: Proof of Theorem 4. We construct the whole family of spontaneously magnetized states $\mu_\nu$ by prescribing the corresponding boundary conditions. Let $\Lambda_n$ be the box $\{x \in \mathbb{Z}^2 : \|x\|_\infty \le n\}$. We define first the boundary condition $\tilde\phi_\tau$ by $\tilde\phi_\tau(x_1, x_2) = x_2 \tau \theta$, $\tau \in [0, 1]$, see (11). It is easy to see that the unique configuration in $\Lambda_n$ with finite energy with respect to the b.c. $\tilde\phi_{\tau=1}$ outside $\Lambda_n$ is the one which coincides with $\tilde\phi_{\tau=1}$ inside $\Lambda_n$. In principle that means that the atomic measure $\mu_{\nu=1}$, concentrated on the configuration $\tilde\phi_{\tau=1}$, is itself a Gibbs state for interaction (11), for any temperature, so we are already done. However, this measure has its finite-dimensional distributions singular with respect to the Lebesgue measure. To present a more aesthetically appealing example we proceed as follows. Consider the measure $\mu_0$ corresponding to zero boundary conditions $\tilde\phi_0 = 0$. In case it is not $S^1$-invariant at some finite temperature $\beta^{-1}$ and has nonzero spontaneous magnetization, we are done again. In the opposite case (which seems to us to be the true one) we have that for every arc $\gamma$ on a circle $S^1$,
$$P_{0,\Lambda_n}\{\phi(0,0) \in \gamma\} \to \frac{|\gamma|}{2\pi} \quad \text{as } n \to \infty,$$
where we denote by $P_{\tau,\Lambda_n}$ the conditional Gibbs distribution in $\Lambda_n$ subject to boundary conditions $\tilde\phi_\tau$ outside $\Lambda_n$, corresponding to inverse temperature $\beta$. Let us fix $\gamma$ to be the arc $\left[-\frac{\pi}{6}, \frac{\pi}{6}\right] \subset S^1$, say, so $|\gamma| = \frac{\pi}{3}$. Then $P_{0,\Lambda_n}\{\phi(0,0) \in \gamma\} \to \frac{1}{6}$ as $n \to \infty$. Note now, that for every $n$ fixed, the function $P_{\tau,\Lambda_n}\{\phi(0,0) \in \gamma\}$ is continuous in $\tau$, with $P_{\tau,\Lambda_n}\{\phi(0,0) \in \gamma\} \to 1$ as $\tau \to 1$. Therefore for every $n$ big enough we can define the value $\tau(n, \nu)$ to be the solution of the equation
$$P_{\tau(n,\nu),\Lambda_n}\left\{\phi(0,0) \in \left[-\frac{\pi}{6}, \frac{\pi}{6}\right]\right\} = \nu,$$
where $\frac{1}{6} < \nu < 1$. Denote by $\mu_\nu$ any weak limit of the sequence of the finite-dimensional Gibbs distributions $P_{\tau(n,\nu),\Lambda_n}$. Then $\mu_\nu$ is of course a Gibbs state. Since evidently $\mu_\nu\left\{\phi(0,0) \in \left[-\frac{\pi}{6}, \frac{\pi}{6}\right]\right\} = \nu$, this state is not $S^1$-invariant once $\nu > \frac{1}{6}$.
Acknowledgements. D.I. would like to thank the Centre de Physique Theorique for its warm hospitality and financial support during his visit to Marseille in the Fall of 2001.
References
[A] Aizenman, M.: On the slow decay of O(2) correlations in the absence of topological excitations: Remark on the Patrascioiu-Seiler model. J. Statist. Phys. 77, 351–359 (1994)
[BI] Bolthausen, E. and Ioffe, D.: Harmonic crystal on the wall: A microscopic approach. Commun. Math. Phys. 187, no. 3, 523–566 (1997)
[BPK] Bonato, C.A., Perez, J.F. and Klein, A.: The Mermin-Wagner phenomenon and cluster properties of one- and two-dimensional systems. J. Stat. Phys. 29, no. 2, 159–175 (1982)
[BLL] Brascamp, H.J., Lieb, E.H. and Lebowitz, J.L.: The statistical mechanics of anharmonic lattices. Bull. Inst. Intl. Stat. 46, no. 1, 393–404 (1976)
[BCPK] Bovier, A., Campanino, M., Perez, F. and Klein, A.: Smoothness of the density of states in the Anderson model at high disorder. Commun. Math. Phys. 114, 439–461 (1988)
[DS1] Dobrushin, R.L. and Shlosman, S.: Absence of breakdown of continuous symmetry in two-dimensional models of statistical physics. Commun. Math. Phys. 42, 31–40 (1975)
[DS2] Dobrushin, R.L. and Shlosman, S.: Nonexistence of one- and two-dimensional Gibbs fields with noncompact group of continuous symmetries. In: Multicomponent random systems, Adv. Probab. Related Topics, 6, New York: Dekker, 1980, pp. 199–210
[DoS] Doyle, P.G. and Snell, J.L.: Random walks and electric networks. Carus Math. Monographs 22, Washington, DC: Mathematical Association of America, 1984
[DV] Deuschel, J.D. and Velenik, Y.: Non-Gaussian surface pinned by a weak potential. Probab. Theory Related Fields 116, no. 3, 359–377 (2000)
[F] Föllmer, H.: Random fields and diffusion processes. École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87, Lecture Notes in Math. 1362, Berlin: Springer, 1988, pp. 101–203
[FILS] Fröhlich, J., Israel, R., Lieb, E.H. and Simon, B.: Phase transitions and reflection positivity. I. General theory and long range lattice models. Commun. Math. Phys. 62, no. 1, 1–34 (1978)
[FP] Fröhlich, J. and Pfister, Ch.: On the absence of spontaneous symmetry breaking and of crystalline ordering in two-dimensional systems. Commun. Math. Phys. 81, no. 2, 277–298 (1981)
[FS] Fröhlich, J. and Spencer, T.: The Kosterlitz-Thouless transition in two-dimensional abelian spin systems and the Coulomb gas. Commun. Math. Phys. 81, no. 4, 527–602 (1981)
[IV] Ioffe, D. and Velenik, Y.: A note on the decay of correlations under δ-pinning. Probab. Th. Rel. Fields 116, no. 3, 379–389 (2000)
[I] Ito, K.R.: Clustering in low-dimensional SO(n)-invariant statistical models with long-range interactions. J. Statist. Phys. 29, no. 4, 747–760 (1982)
[G] Georgii, H.-O.: Gibbs measures and phase transitions, de Gruyter Studies in Mathematics 9. Berlin: Walter de Gruyter & Co., 1988
[KLS] Klein, A., Landau, L.J. and Shucker, D.S.: On the absence of spontaneous breakdown of continuous symmetry for equilibrium states in two dimensions. J. Statist. Phys. 26, no. 3, 505–512 (1981)
[KP] Kunz, H. and Pfister, C.-E.: First order phase transition in the plane rotator ferromagnetic model in two dimensions. Commun. Math. Phys. 46, no. 3, 245–251 (1976)
[LSS] Liggett, T.M., Schonmann, R.H. and Stacey, A.M.: Domination by product measures. Ann. Probab. 25, no. 1, 71–95 (1997)
[MS] McBryan, O. and Spencer, T.: On the decay of correlations in SO(n)-symmetric ferromagnets. Commun. Math. Phys. 53, no. 3, 299–302 (1977)
[M] Mermin, N.D.: Absence of ordering in certain classical systems. J. Math. Phys. 8, no. 5, 1061–1064 (1967)
[MMR] Messager, A., Miracle-Solé, S. and Ruiz, J.: Upper bounds on the decay of correlations in SO(n)-symmetric spin systems with long range interactions. Ann. Inst. H. Poincaré Sect. A (N.S.) 40, no. 1, 85–96 (1984)
[N] Naddaf, A.: On the decay of correlations in non-analytic SO(n)-symmetric models. Commun. Math. Phys. 184, no. 2, 387–395 (1997)
[PS] Patrascioiu, A. and Seiler, E.: Phase structure of two-dimensional spin models and percolation. J. Statist. Phys. 69, no. 3-4, 573–595 (1992)
[P] Pfister, C.-E.: On the symmetry of the Gibbs states in two-dimensional lattice systems. Commun. Math. Phys. 79, 181–188 (1981)
[R] Rockafellar, R.T.: Network flows and monotropic optimization. Pure and Applied Mathematics. A Wiley-Interscience Publication. New York: John Wiley & Sons, Inc., 1984
[Si] Sinai, Ya.G.: Theory of phase transitions: Rigorous results. International Series in Natural Philosophy, 108, Oxford-Elmsford, N.Y.: Pergamon Press, 1982
[S78] Shlosman, S.: Decrease of correlations in two-dimensional models with continuous group symmetry. Teoret. Mat. Fiz. 37, no. 3, 427–430 (1978), English translation: Theor. and Math. Phys. 37, no. 3, 1118–1121 (1979)
[S80] Shlosman, S.: Phase transitions for two-dimensional models with isotropic short range interactions and continuous symmetry. Commun. Math. Phys. 71, 207–212 (1980)
[Sp] Spitzer, F.: Principles of random walk. The University Series in Higher Mathematics, Princeton, NJ–Toronto–London: D. Van Nostrand Co., Inc., 1964
Communicated by H. Spohn
Commun. Math. Phys. 226, 559 – 566 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Nonlinear Maxwell Theory and Electrons in Two Dimensions Artur Sowa 120 Monticello Avenue, Durham, NC 27707, USA. E-mail: [email protected] Received: 16 December 1999 / Accepted: 3 December 2001
Abstract: We consider a system of nonlinear equations that extends the Maxwell theory. It was pointed out in a previous paper that symmetric solutions of these equations display properties characteristic of magnetic oscillations. In this paper I study a discrete model of the equations in two dimensions. This leads to the discovery of a new mechanism of vortex lattice formation. Namely, when a parameter corresponding to a magnetic field normal to the surface increases above a certain critical level, the trivial uniform-magnetic-field solution becomes, in a certain sense, unstable and a periodic vortex lattice solution emerges. The discrete vortex solutions are proven to exist, and can also be found numerically with high accuracy. Description of magnetic vortices given by the equations is optical in spirit, and may be particularly attractive in the context of high-$T_c$ superconductivity and the quantum Hall effects. Moreover, analysis of parameters involved in the discrete theory suggests existence of continuous domain solutions, a conjecture that seems unobvious on grounds of the current topological and variational methods. 1. The Proposed Physical Model Let $A$ be the electromagnetic vector potential, and let the corresponding electromagnetic field be denoted by $F_A = dA$. Further, let $f$ be a real valued function. Consider the following system of equations:
$$dF_A = 0 \tag{1}$$
$$\delta(f F_A) = 0 \tag{2}$$
$$-\Delta f + |F_A|^2 f = \nu f. \tag{3}$$
The author is currently with the Pegasus Imaging Corporation. Neither the present nor any previous institutional affiliations of the author were of consequence for this privately conducted research.
Fig. 1.
The main goal of this paper is to describe a vortex-lattice type solution of this system of equations in two dimensions (cf. Fig. 1) by means of a finite difference approach. Results obtained in this way suggest that in the continuous domain limit such a solution consists of a continuous function f and a connection A whose curvature 2-form FA is also continuous. Moreover, the equations are satisfied in the classical sense almost everywhere. Note that the displayed numerical solutions are not defined at the vortex points. (Here, the horizontal axes are indexed by discrete lattice points.) Moreover, their scaling should be matched to a given physical system. As explained in [1] (and in a simplified Abelian version in [2]), the particular form of this system is suggested by the geometry of principal U (1)-bundles. Purely phenomenologically (1)–(3) can be motivated by evoking heuristics used frequently in the optical literature. Namely, it is often assumed in optics that the nonlinearities arising in the interaction of radiation with matter can be accounted for by representing f as a series in tensorial powers of FA with coefficients characteristic of the material. Subsequently, one attempts to deduce properties of the coefficients from a microscopic theory. The shift of paradigm in this paper consists in assuming that f depends in a geometrically invariant way on FA via Eq. (3), and f remains essentially independent of the material except for a simple scaling. We will see below that in contrast to the soft optical nonlinearities, (1)–(3) cannot be understood as a small perturbation of a linear system. However, interaction of radiation with matter is described from the “point of view” of the former, and the electronic processes inside matter are never discussed directly. In order to explain superconductivity by means of a microscopic theory, one needs to display a mechanism that will let fermions overcome the obstacle imposed by the
Pauli exclusion principle. A solution given by the famous BCS theory is based on the observation that when thermal noise is sufficiently low, it is energetically favored for electrons to join in pairs, known as Cooper pairs, which behave like bosons. The BCS theory is in a certain correspondence to the Ginzburg–Landau equations. These are nonlinear equations for a complex valued function ψ, often interpreted as an order parameter governing collective behavior of Cooper pairs. Periodic solutions of these equations in the form of vortices have been found by Abrikosov in [4]. It must be emphasized that strict validity of the BCS/GL approach, at least in its classical s-wave pairing version, is limited to low temperature superconductivity of metals. On the other hand, type II superconductivity is known to occur in materials structurally different from metals, like YBCO, and at relatively high critical temperatures. As many researchers pointed out, this suggests that mechanisms other than those encompassed by the BCS theory may be responsible for high temperature superconductivity. Those aspects of solid state theory that go beyond BCS seem particularly attractive in terms of the possibility of merging with the nonlinear Maxwell equations. In my opinion, the new mathematical pattern introduced in this paper will be helpful in the description of the interaction of magnetic fields with composite particles. For illustration, consider the proposition that f describes a locally varying filling factor. In this interpretation, a part of the field gets entrapped in composite bosons, composite fermions, and Laughlin quasiparticles, which in turn see only the remaining $1/f$-fraction of the field. If f is a constant, this allows one to replace an electron picture with a composite particle picture, a suggestion that was present in science already ten years ago.
However, if f is a vortex-type solution, the composite particles will feel the vortex in $F_A$, which should induce Josephson-type effects. Thus, while microscopic theory is always constructed with an a priori fixed filling factor, (1)–(3) would reflect the behavior of magnetic fields on a coarser scale. The above is meant to evoke some of the basic notions and ideas present in modern materials science. More thorough reviews can be found in the articles featured in the very incomplete list of references below ([5]–[13]).
2. Mathematics of a Finite Difference Approximation
Consider the system (1)–(3) on a two-dimensional flat torus $T^2$. In this case $\delta(fF) = 0$ implies $F = B/f$ for a constant B. In addition, $dF = 0$ and F is the curvature of a certain connection A, provided its cohomology class satisfies $[F] \in 2\pi\mathbb{Z}$. Thus, the system of equations reduces to
$$ -\Delta f + \frac{B^2}{f} = \nu f, \tag{4} $$
and
$$ \int_{T^2} \frac{B}{f}\, dV = 2\pi K \quad \text{for an integer } K. \tag{5} $$
Suppose f is a solution of the first equation with parameter B. Then for any c > 0, the function cf is a solution of (4) with B replaced by cB. At the same time, this rescaling does not affect (5) in any way, since the ratio B/f remains fixed. However, the system behaves differently with respect to rescaling of the independent variable. Indeed, suppose f satisfies both equations with parameters B, ν, and K, then defining
$g(x, y) = f(cx, cy)$, we have that g satisfies (4) and (5) with parameters $Bc$, $\nu c^2$, and $K/c$, while its period is $1/c$ in both directions. This is consistent with the experimental fact that as a magnetic field normal to the surface increases, vortices should eventually collide with one another. (In physical reality they will at that point disappear together with the superconducting state.) This fact is also important mathematically, as it allows us to first obtain a solution of (4) with, say, $\int_{T^2} \frac{B}{f}\, dV < 2\pi$, and then rescale the independent variable to satisfy (5). It appears that the system (4)–(5) does not yield to the standard techniques of variational calculus or topological analysis. In particular, perturbative methods do not apply to (4), and no solutions arise as a result of bifurcation. In fact, in view of the theorem below and the results of numerical simulations, solutions are objects very unlike the familiar vortex solutions of nonlinear PDEs. To give some indication of the difficulties involved, consider the following. Equation (4) is the Euler–Lagrange equation for the Lagrangian $L(f) = \left(\int \tfrac{1}{2}|\nabla f|^2 + B^2 \ln(f)\right) / \left(\int f^2\right)$. However, this functional is neither bounded below nor above. Indeed, let us note that $L(c_n) \to -\infty$ for constants $c_n \to 0$. On the other hand, let $f_n(x, y) = 1 + \varepsilon + \cos(2\pi n x)$. Since $\ln f_n(x, y) \ge \ln \varepsilon$, one easily checks that in this case $L(f_n) \to \infty$. Therefore, the best we can expect is to discover local extrema. Additional difficulty stems from the fact that since (4) always admits a trivial constant solution, we must devise a method of telling trivial and nontrivial solutions apart. We will now consider a finite difference model of the system (4)–(5). It is proven below that non-constant solutions of the discrete problem exist. The proof is independent of the number of points in the discretization ($n^2$) but relies essentially on finite-dimensionality, and does not admit a direct generalization to the analytic case.
However, all the universal parameters used in the proof, like the $L^2$-norm of f and B, are asymptotically independent of n. Thus, we conjecture existence of the continuous domain solutions of (4) that satisfy the equation a.e. in the classical sense and retain the particular vortex morphology. It is convenient to introduce the following notation:
$$ \int = \frac{1}{n^2}\sum_{i,j}, \qquad \int^{o} = \frac{1}{n^2}\sum_{(i,j)\neq(i_0,j_0)}, $$
where indices i, j run through the discrete n-by-n lattice. Also, $\Delta$ denotes the common five-point periodic discrete Laplacian, i.e.
$$ \Delta f\left(\tfrac{i}{n},\tfrac{j}{n}\right) = n^2\left[ f\left(\tfrac{i+1}{n},\tfrac{j}{n}\right) + f\left(\tfrac{i}{n},\tfrac{j+1}{n}\right) + f\left(\tfrac{i-1}{n},\tfrac{j}{n}\right) + f\left(\tfrac{i}{n},\tfrac{j-1}{n}\right) - 4f\left(\tfrac{i}{n},\tfrac{j}{n}\right) \right], $$
and $\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right)$ is the simplest two-point periodic gradient, i.e. say $\frac{\partial}{\partial x} f\left(\tfrac{i}{n},\tfrac{j}{n}\right) = n\left[ f\left(\tfrac{i+1}{n},\tfrac{j}{n}\right) - f\left(\tfrac{i}{n},\tfrac{j}{n}\right) \right]$. In particular, the discrete integration-by-parts formula holds, i.e. $-\int (\Delta f)\, g = \int (\nabla f, \nabla g)$. Consider the function
$$ \Phi(f) = \int \tfrac{1}{2}|\nabla f|^2 + B^2 \ln(f). $$
Pick arbitrary real numbers a, b, c, fix a point $(x_0, y_0) = \left(\tfrac{i_0}{n}, \tfrac{j_0}{n}\right)$, and a number
$$ m_n = \left( \frac{n^2}{b} - \frac{1}{c} \right)^{-1}. \tag{6} $$
Two submanifolds in $\mathbb{R}^{n^2}$,
$$ D^n_{a,b,c} = \left\{ f > 0 : \int f^2 = a^2,\ \int \frac{1}{f} \le \frac{1}{b},\ \min f = f(x_0, y_0) = m_n \right\} \tag{7} $$
and its boundary
$$ \partial D^n_{a,b,c} = \left\{ f > 0 : \int f^2 = a^2,\ \int \frac{1}{f} = \frac{1}{b},\ \min f = f(x_0, y_0) = m_n \right\}, \tag{8} $$
play a fundamental role in understanding the nature of critical points of $\Phi$. Depending on the particular value of a, b, and c, the set $D^n_{a,b,c}$ is either empty, an $(n^2 - 2)$-dimensional spherical disk, or it degenerates to a point. Consider the hyper-plane $H_n = \{f : f(x_0, y_0) = m_n\}$. $D^n_{a,b,c}$ is a spherical disk immersed in $H_n$ precisely when V, the point closest to the origin of an open submanifold given by $\int^{o} \frac{1}{f} = \frac{1}{b} - \frac{1}{n^2 m_n}$, is located inside the ball
$$ \int^{o} f^2 \le a^2 - \frac{m_n^2}{n^2}. $$
One easily finds $V(x, y) = \text{const} = (n^2 - 1)c$ for all $(x, y) \neq (x_0, y_0)$, and for it to be inside the ball, it is necessary and sufficient that a and c satisfy
$$ \frac{(n^2 - 1)^3}{n^2}\, c^2 < a^2 - \frac{m_n^2}{n^2}. \tag{9} $$
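The discrete operators introduced above are simple enough to try out directly. The following is a minimal sketch, not taken from the paper, that stores a lattice function as a list of rows (`f[i][j]` standing for $f(i/n, j/n)$) and checks the discrete integration-by-parts formula numerically. The function names (`laplacian`, `gradient`, `integral`, `by_parts_defect`) are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch (not the author's code): periodic finite-difference
# operators on an n-by-n lattice. f[i][j] stands for f(i/n, j/n).


def laplacian(f):
    """Five-point periodic discrete Laplacian, scaled by n^2."""
    n = len(f)
    return [[n * n * (f[(i + 1) % n][j] + f[i][(j + 1) % n]
                      + f[(i - 1) % n][j] + f[i][(j - 1) % n]
                      - 4.0 * f[i][j])
             for j in range(n)] for i in range(n)]


def gradient(f):
    """Two-point (forward) periodic differences in x and y, scaled by n."""
    n = len(f)
    dx = [[n * (f[(i + 1) % n][j] - f[i][j]) for j in range(n)] for i in range(n)]
    dy = [[n * (f[i][(j + 1) % n] - f[i][j]) for j in range(n)] for i in range(n)]
    return dx, dy


def integral(f, skip=None):
    """(1/n^2) times the lattice sum; skip=(i0, j0) omits one lattice point."""
    n = len(f)
    total = sum(f[i][j] for i in range(n) for j in range(n) if (i, j) != skip)
    return total / (n * n)


def times(f, g):
    """Pointwise product of two lattice functions."""
    return [[a * b for a, b in zip(row_f, row_g)] for row_f, row_g in zip(f, g)]


def by_parts_defect(f, g):
    """Difference between -int (Lap f) g and int (grad f, grad g)."""
    fx, fy = gradient(f)
    gx, gy = gradient(g)
    lhs = -integral(times(laplacian(f), g))
    rhs = integral(times(fx, gx)) + integral(times(fy, gy))
    return lhs - rhs
```

For any two lattice functions the defect should vanish up to rounding error, because the identity is an exact summation by parts on the periodic lattice. The `skip` argument of `integral` plays the role of the punctured sum over $(i,j) \neq (i_0,j_0)$.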
Conversely, if condition (9) holds, and in addition $m_n$ is indeed a minimum of every function that satisfies the two other conditions in (7), then $D^n_{a,b,c}$ is a nonempty spherical disk. In fact, we will see below that checking this latter condition is quite straightforward for the particular choice of constants that is of interest to us. At this point I would like to point out that in the description of local minima of $\Phi$ below, the parameter a is physical, and with good faith can be regarded as the $L^2$-norm of the critical point, whereas the parameters b, c are auxiliary and will converge to 0 as the density of discretization n increases to infinity. In particular, interpreting $\frac{1}{b}$ as an approximate value of the integral of the reciprocal of the function where $\Phi$ attains its local minimum is erroneous since the function develops a singularity at $(x_0, y_0)$. The following theorem will be proven.
Theorem 1. Fix constants B and a as above. For a certain choice of the constants b = b(n, a), and c = c(n, a), the function $f \mapsto \Phi(f)$ assumes local relative minima in $D^n_{a,b,c}$. In particular, the corresponding critical point, say $f_0 > 0$, satisfies the finite difference version of (4) everywhere except one point, i.e.
$$ -\Delta f_0(x, y) + \frac{B^2}{f_0(x, y)} = \nu f_0(x, y) \quad \text{for all } (x, y) \neq (x_0, y_0). \tag{10} $$
Moreover, if B is sufficiently large, then $f_0$ is not a constant function.
Proof. Let N denote an outward normal vector to $\partial D^n_{a,b,c}$ inside its ambient sphere, i.e. N is tangent to the $(n^2 - 2)$-dimensional sphere $S_a = \{f : \int f^2 = a^2,\ f(x_0, y_0) = m_n\}$ and points away from the region $D^n_{a,b,c}$. The main task is to show that $N\Phi > 0$. It will then follow from smoothness of $\Phi$ ($\nabla$ is a linear operator and $f \mapsto \ln(f)$ is smooth for $f > 0$) that it assumes a local minimum inside $D^n_{a,b,c}$. Let us choose for N the vector field defined by
$$ N_f(x, y) = \begin{cases} \dfrac{1}{ba^2}\, f(x, y) - \dfrac{1}{f(x, y)^2} & \text{for } (x, y) \neq (x_0, y_0) \\ 0 & \text{otherwise,} \end{cases} \tag{11} $$
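where $f \in \partial D^n_{a,b,c}$. A direct calculation shows that
$$ N_f \Phi(f) = \frac{1}{ba^2}\left[ -\int^{o} (\Delta f)\, f + B^2\left(1 - \frac{1}{n^2}\right) \right] + \int^{o} \frac{1}{f^2}\,\Delta f - B^2 \int^{o} \frac{1}{f^3}. $$
It remains to analyze terms one by one. First, since a function in $D^n_{a,b,c}$ assumes its minimum at $(x_0, y_0)$, we obtain
$$ -\int^{o} (\Delta f)\, f = \int |\nabla f|^2 + \frac{1}{n^2}\,\Delta f(x_0, y_0)\, m_n \ge \int |\nabla f|^2 \ge 0. \tag{12} $$
Next, by definition of $D^n_{a,b,c}$ and (6) we obtain
$$ \frac{1}{b} = \int \frac{1}{f} = \int^{o} \frac{1}{f} + \frac{1}{n^2}\frac{1}{m_n} = \int^{o} \frac{1}{f} + \frac{1}{b} - \frac{1}{n^2 c}, $$
so that
$$ \sum_{(i,j)\neq(i_0,j_0)} \frac{1}{f} = \frac{1}{c} \tag{13} $$
and therefore
$$ f(x, y) > c \quad \text{for} \quad (x, y) \neq (x_0, y_0). $$
As an immediate application we obtain
$$ \int^{o} \frac{1}{f^3} \le \frac{1}{c^3}\left(1 - \frac{1}{n^2}\right). \tag{14} $$
Finally, we obtain the following inequality:
$$ \begin{aligned} \int^{o} \frac{1}{f^2}\,\Delta f &= \frac{1}{n^2} \sum_{(i,j)\neq(i_0,j_0)} \frac{n^2}{f\left(\tfrac{i}{n},\tfrac{j}{n}\right)^2} \left[ f\left(\tfrac{i+1}{n},\tfrac{j}{n}\right) + f\left(\tfrac{i}{n},\tfrac{j+1}{n}\right) + f\left(\tfrac{i-1}{n},\tfrac{j}{n}\right) + f\left(\tfrac{i}{n},\tfrac{j-1}{n}\right) - 4f\left(\tfrac{i}{n},\tfrac{j}{n}\right) \right] \\ &\ge \sum_{(i,j)\neq(i_0,j_0)} \frac{1}{f\left(\tfrac{i}{n},\tfrac{j}{n}\right)^2}\left(-4f\left(\tfrac{i}{n},\tfrac{j}{n}\right)\right) = -4 \sum_{(i,j)\neq(i_0,j_0)} \frac{1}{f\left(\tfrac{i}{n},\tfrac{j}{n}\right)} \\ &= -4n^2\left(\frac{1}{b} - \frac{1}{n^2}\frac{1}{m_n}\right) = -\frac{4}{c}. \end{aligned} \tag{15} $$
Together, estimates (12), (14), (15) yield
$$ N_f \Phi(f) \ge \left(1 - \frac{1}{n^2}\right) B^2 \left(\frac{1}{ba^2} - \frac{1}{c^3}\right) - \frac{4}{c}. \tag{16} $$
So far no assumption has been made about the constants. Now, in order to guarantee existence of local minima one needs to ensure that $D^n_{a,b,c}$ is nonempty, and that the outer derivative is positive. Both these requirements are met if we pick
$$ c = c(n) = \frac{a}{2n^2}, \quad \text{and} \quad b = b(n) = \frac{a}{16n^6}. $$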
With this choice of c, (9) holds. Moreover, if f satisfying the first two conditions in (7) assumed at some other point a value at least as small as $m_n$, then with the choice of c and b as above we would have $\int \frac{1}{f} > \frac{2m_n^{-1}}{n^2} = \frac{2}{b} - \frac{2}{cn^2} > \frac{1}{b}$, which is a contradiction. Thus $D^n_{a,b,c}$ is nonempty. On the other hand, inequality (16) becomes
$$ N_f \Phi(f) \ge \frac{8B^2}{a^3}\left(1 - \frac{1}{n^2}\right) n^6 - \frac{8}{a}\, n^2, $$
which implies $N_f \Phi(f) > 0$ for n sufficiently large. Consequently, $\Phi$ assumes a local minimum inside $D^n_{a,b,c}$. Equation (10) is automatically satisfied because it expresses the fact that the component of the derivative of $\Phi$ which is tangent to the sphere $\int f^2 = a^2$ vanishes in all directions except possibly the $(x_0, y_0)$-direction. Still, the local minimum $f_0$ could a priori be a function constant everywhere, except at the discontinuity in $(x_0, y_0)$. This can be avoided by taking B sufficiently large. As B increases, the solution which is constant except at $(x_0, y_0)$, say
$$ f_1(x, y) = \begin{cases} \kappa & : (x, y) \neq (x_0, y_0) \\ m_n & : \text{otherwise} \end{cases} $$
becomes unstable, i.e. it corresponds to a saddle point on the graph of $\Phi$. Indeed, one checks directly that
$$ \frac{d^2}{d\varepsilon^2}\,\Phi(f_1 + \varepsilon\varphi) = \int |\nabla\varphi|^2 - B^2 \int \frac{\varphi^2}{f_1^2}, $$
for $\varphi$ tangent to the origin-centered sphere at $f_1$, so that $\int f_1 \varphi = 0$. Now consider $\varphi$ to be a non-constant eigenfunction of the discrete Laplacian on a torus and let $\lambda$ denote the corresponding eigenvalue. Shifting it if necessary, we can assume that $\varphi(x_0, y_0) = 0$, so that $\varphi$ is orthogonal to $f_1$. We have that $\int |\nabla\varphi|^2 = \lambda \int \varphi^2$, and thus $\frac{d^2}{d\varepsilon^2}\,\Phi(f_1 + \varepsilon\varphi) < 0$ only if $\frac{B^2}{\kappa^2} > \lambda$. Thus, $f_1$ is not a local minimum for B sufficiently large. Moreover, since $\int f_1^2 = a^2$, $\kappa$ converges to a as n increases. Naturally, one can make sure that $\lambda$ remains fixed regardless of n by always picking $\varphi$ to be a discretization of the same trigonometric function. This shows that a choice of B which guarantees that $f_0$ is a nontrivial solution does not depend on the discretization. □
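The arithmetic behind the choice $c = a/(2n^2)$, $b = a/(16n^6)$ can be checked with exact rational numbers. The sketch below is only an illustration under that choice of constants; the helper names (`constants`, `condition_9`, `lower_bound_16`, `closed_form`, `minimum_is_unique`) are hypothetical. It evaluates $m_n$ from (6), tests condition (9), compares the right-hand side of (16) with the simplified expression used in the proof, and tests the inequality that rules out a second point with a value as small as $m_n$.

```python
from fractions import Fraction


def constants(n, a):
    """Return (b, c, m_n) for c = a/(2 n^2), b = a/(16 n^6), m_n as in (6)."""
    a = Fraction(a)
    c = a / (2 * n ** 2)
    b = a / (16 * n ** 6)
    m = 1 / (n ** 2 / b - 1 / c)
    return b, c, m


def condition_9(n, a):
    """Check (n^2 - 1)^3 c^2 / n^2 < a^2 - m_n^2 / n^2."""
    a = Fraction(a)
    b, c, m = constants(n, a)
    lhs = Fraction((n ** 2 - 1) ** 3, n ** 2) * c ** 2
    rhs = a ** 2 - m ** 2 / n ** 2
    return lhs < rhs


def lower_bound_16(n, a, B):
    """Right-hand side of (16) for the chosen b and c."""
    a, B = Fraction(a), Fraction(B)
    b, c, _ = constants(n, a)
    return (1 - Fraction(1, n ** 2)) * B ** 2 * (1 / (b * a ** 2) - 1 / c ** 3) - 4 / c


def closed_form(n, a, B):
    """Simplified bound: (8 B^2 / a^3)(1 - 1/n^2) n^6 - (8/a) n^2."""
    a, B = Fraction(a), Fraction(B)
    return 8 * B ** 2 / a ** 3 * (1 - Fraction(1, n ** 2)) * n ** 6 - 8 * n ** 2 / a


def minimum_is_unique(n, a):
    """Check 2/b - 2/(c n^2) > 1/b, the inequality used for the contradiction."""
    b, c, _ = constants(n, a)
    return 2 / b - 2 / (c * n ** 2) > 1 / b
```

For example, `lower_bound_16(4, 1, 1) == closed_form(4, 1, 1)` compares the two expressions exactly, and the sign of either value shows whether the outward derivative bound is positive for given n, a and B.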
3. Closing Remarks
The proof above depends essentially on the fact that all functions and manifolds are discrete. Indeed, the constants b = b(n), c = c(n) we have used tend to zero as n → ∞, and some of the estimates make no sense in the limit. However, it is important that the magnetic induction B and the $L^2$-norm of f do not depend on n. On the other hand, simulation shows that the discrete solution $f_0$ retains its particular morphology independently of n, and is always subharmonic. In addition, multiplying (10) by $f \chi_{\{(x,y) \neq (x_0,y_0)\}}$ and inspecting the vicinity of the singularity we easily obtain $\nu \sim \frac{1}{a^2}\left(B^2 + \int |\nabla f|^2\right)$, which is also expected to converge. In summary, we have enough evidence to believe that the enclosed figure shows a good approximation to a strong solution of the continuous version of Eq. (4) that would possess a vortex of Lipschitz regularity. It is also consistent with the one-dimensional case, where solutions of the analogous nonlinear equation can be expressed in terms of a closed-form integral. I point out for completeness that continuity of such a positive solution $f_0$ guarantees continuity of $F_A = \frac{B}{f_0}$, and by rescaling the independent variables we can satisfy the
condition that the cycle of FA be an integral multiple of 2π . This is sufficient to solve dA = FA for A on a compact surface, and the solution A has sufficient regularity to retain its geometric interpretation as a connection 1-form. On the other hand, A contains all the information necessary to derive the basic tenets of superconductive electronics, like the Josephson effect. It is important to realize that none of the propositions stated above necessarily rely on the lattice being a simple square lattice. Most likely, a hexagonal lattice setting would yield the same qualitative results. The only reason for using a square lattice is to avoid overwhelming numerical complexity in experiments, as well as arithmetical nuisance in theory. I introduced the system (1)–(3) in 1993, and later investigated its properties in my thesis. Some of those early results are contained in [1]. The geometry was first given a loose but essentially correct physical interpretation in [2]. More on the physical predictions and the mathematical properties of (the classical version of) this theory can be found in [3]. Acknowledgements. I wish to thank a friend of old days Fred Warner for proofreading the manuscript. I am also grateful to the referee for a prudent review and suggesting corrections and improvements.
References
1. Sowa, A.: On an Equation Arising From the Geometry of Riemannian Submersions, J. reine angew. Math. 514, 1–8 (1999)
2. Sowa, A.: Magnetic Oscillations and Maxwell Theory, Physics Letters A 228, 347–350 (1997)
3. Sowa, A.: The (Fully) Nonlinear Maxwell Theory – an Outline. physics/0103061
4. Abrikosov, A.A.: On the Magnetic Properties of Superconductors of the Second Group. Soviet Physics JETP Vol. 5, no. 6, 1174–1182 (1957)
5. Anderson, P.W.: Science 235, 1196 (1987)
6. Anderson, P.W.: Phys. Today, Oct. 1997, 42–47
7. Laughlin, R.B.: Science 242, 525–533 (1988)
8. Laughlin, R.B.: Phys. Rev. Lett. 50, 1395–1398 (1983)
9. Kivelson, S., Lee, D.-H., Zhang, S.-C.: Phys. Rev. B 46, 2223–2238 (1992)
10. Prange, R.E., Girvin, S.M. (eds.): The Quantum Hall Effect. Berlin–Heidelberg–New York: Springer-Verlag, 1990
11. Read, N.: Phys. Rev. Lett. 62, 86–89 (1989)
12. Shankar, R. and Ganpathy Murthy: Phys. Rev. Lett. 79, 4437–4440 (1997)
13. Zhang, S.-C., Hansson, T.H. and Kivelson, S.: Phys. Rev. Lett. 62, 82–85 (1989)
Communicated by A. Connes
Commun. Math. Phys. 226, 455 – 474 (2002)
On the Structure of Stationary Solutions of the Navier–Stokes Equations Peter Wittwer Département de Physique Théorique, Université de Genève, Switzerland. E-mail: [email protected] Received: 6 March 2001 / Accepted: 4 October 2001
Abstract: We consider stationary solutions of the incompressible Navier–Stokes equations in two dimensions. We give a detailed description of the fluid flow in a half-plane by using a mathematical setup within which the idea of a change of type from an elliptic to a parabolic partial differential equation can be made precise.
1. Introduction
We consider, in d = 2 dimensions, the time independent incompressible Navier–Stokes equations
$$ -(\mathbf{u} \cdot \nabla)\mathbf{u} + \Delta\mathbf{u} - \nabla p = 0, \tag{1} $$
$$ \nabla \cdot \mathbf{u} = 0, \tag{2} $$
in a half-space $\Omega = \{(x, y) \in \mathbb{R}^2 \mid x \ge 1\}$. We are interested in modeling a situation where fluid enters the half-space through the surface $\Sigma = \{(x, y) \in \mathbb{R}^2 \mid x = 1\}$, and where the fluid flows at infinity parallel to the x-axis at a nonzero constant speed $\mathbf{u}_\infty \equiv (1, 0)$. We therefore impose the boundary conditions
$$ \lim_{\substack{x \ge 1 \\ x^2 + y^2 \to \infty}} \mathbf{u}(\mathbf{x}) = \mathbf{u}_\infty, \tag{3} $$
$$ \mathbf{u}|_{\Sigma} = \mathbf{u}_\infty + \mathbf{u}^*, \tag{4} $$
with $\mathbf{u}^* = (u^*, v^*)$ in a certain set of vector fields S satisfying $u^*(y) = u^*(-y)$, $v^*(y) = -v^*(-y)$ and $\lim_{|y| \to \infty} \mathbf{u}^*(y) = 0$. Let $(u, v) = \mathbf{u} - \mathbf{u}_\infty$ and $\mathbf{u}^*$ in S. From (1), (2) it is then easy to see that the discussion can be restricted to the case of functions u, v and p satisfying $u(x, y) = u(x, -y)$, $v(x, y) = -v(x, -y)$ and $p(x, y) = p(x, -y)$ for all x ≥ 1, i.e., to flows that are symmetric with respect to the x-axis. Such flows are expected to have better behavior at infinity than asymmetric flows (see for example [7]), and we indeed make extensive use of the symmetry property in our proofs, even though our techniques are not a priori limited to this case. The following theorem is our main result.
Supported in part by the Fonds National Suisse de la Recherche Scientifique.
Theorem 1. Let $\Omega$ and $\Sigma$ be as defined above. Then, for each $\mathbf{u}^* = (u^*, v^*)$ in a certain set of vector fields S to be defined later on, there exist a (locally unique) vector field $\mathbf{u} = \mathbf{u}_\infty + (u, v)$ and a function p satisfying the Navier–Stokes equations (1) and (2) in $\Omega$ and the boundary conditions (3) and (4). Furthermore
$$ \left| x\, v(x, y x^{1/2}) \right| \le \text{const.}, \tag{5} $$
for all $(x, y) \in \Omega$, and
$$ \lim_{x \to \infty} \sup_{y \in \mathbb{R}} \left| x^{1/2} u(x, y x^{1/2}) - \frac{c}{\sqrt{4\pi}} \exp\left(-\frac{y^2}{4}\right) \right| = 0, \tag{6} $$
with
$$ c = \lim_{k \to 0^+} \int_{\mathbb{R}} e^{iky} \left(u^*(y) + i\, v^*(y)\right) dy. \tag{7} $$
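To get a feeling for the limit profile in (6), one can undo the scaling and look at the function $x^{-1/2}\,\frac{c}{\sqrt{4\pi}}\exp(-y^2/(4x))$ at fixed x. The sketch below is an illustration only, not part of the proof; the names `wake_profile` and `cross_section_integral` are hypothetical. It integrates this profile over y with a midpoint rule and shows that the result equals c for every x, which is just the normalization of the Gaussian.

```python
import math


def wake_profile(x, y, c):
    """Unscaled Gaussian profile read off from (6): x^{-1/2} c/sqrt(4 pi) exp(-y^2/(4x))."""
    return c / math.sqrt(4.0 * math.pi * x) * math.exp(-y * y / (4.0 * x))


def cross_section_integral(x, c, half_width=12.0, steps=4000):
    """Midpoint-rule integral of wake_profile over y at fixed x.

    The interval is [-half_width * sqrt(x), half_width * sqrt(x)], which
    captures the Gaussian up to a negligible tail.
    """
    y_max = half_width * math.sqrt(x)
    h = 2.0 * y_max / steps
    return h * sum(wake_profile(x, -y_max + (m + 0.5) * h, c) for m in range(steps))
```

The width of the profile grows like $x^{1/2}$ while its height decays like $x^{-1/2}$, so the integral over a cross section stays fixed at c. This is the parabolic spreading that the scaling $y \mapsto y x^{1/2}$ in (6) encodes.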
A proof of this theorem will be given in Sect. 5. See [1] for related results.
Remark. The set S in Theorem 1 will be specified in Sect. 5, once appropriate function spaces have been introduced.
Remark. We consider Theorem 1 to be a first step in an effort to bridge the gap between the mathematically rigorous theory of the Navier–Stokes equations (see [7] and references therein), and the work on asymptotic expansions for solutions of the Navier–Stokes equations that addresses questions relevant for engineering [6, 4].
Theorem 1 has the following interpretation: consider a rigid body that is placed into a uniform stream of a homogeneous incompressible fluid, filling up all of $\mathbb{R}^2$. Experimentally, far away from the body, such a fluid flow appears to be close to a potential flow with the exception of a region downstream of the object, the so-called wake region, within which the vorticity of the fluid is concentrated. The situation in Fig. 1 is modeled by the equations
$$ -\rho\,(\tilde{\mathbf{u}} \cdot \nabla)\tilde{\mathbf{u}} + \mu\,\Delta\tilde{\mathbf{u}} - \nabla\tilde{p} = 0, \qquad \nabla \cdot \tilde{\mathbf{u}} = 0, \tag{8} $$
in $\bar{\Omega} = \mathbb{R}^2 \setminus B$, subject to the boundary condition $\tilde{\mathbf{u}}|_{\partial\bar{\Omega}} = 0$, $\lim_{|\tilde{\mathbf{x}}| \to \infty} \tilde{\mathbf{u}}(\tilde{\mathbf{x}}) = \tilde{\mathbf{u}}_\infty = (u_\infty, 0)$. If we assume that the density ρ and the dynamic viscosity µ of the fluid are constant in $\bar{\Omega}$, then we can always choose a coordinate system as indicated in Fig. 1, scale to dimensionless coordinates $\mathbf{x} = (\rho u_\infty/\mu)\,\tilde{\mathbf{x}}$, introduce a dimensionless vector field $\mathbf{u}$ and a dimensionless pressure p by defining $\tilde{\mathbf{u}}(\tilde{\mathbf{x}}) = u_\infty \mathbf{u}(\mathbf{x})$ and $\tilde{p}(\tilde{\mathbf{x}}) = \rho u_\infty^2\, p(\mathbf{x})$. In the new coordinates Eq. (8) becomes equal to (1) with $\Sigma$ located at x = 1. For solutions $\tilde{\mathbf{u}}$ of (8) which are such that the corresponding scaled vector field $(\mathbf{u} - \mathbf{u}_\infty)|_{\Sigma} \in S$ (we expect this to be all solutions of (8) for which $\mathrm{Re} = L\rho u_\infty/\mu \ll 1$, but we do not address this question here), Theorem 1 shows the existence of a parabolic wake, within
Fig. 1. Stationary fluid flow around a body
which the leading order deviation from the constant flow is universal, i.e., independent of the details of the shape of the body. On a heuristic level this is a well known fact [3]. It is related to what is called a “change of type” of Eq. (8) from an elliptic partial differential equation to a parabolic partial differential equation. The mathematical tools that we use to prove this change of type are a version of the center manifold theorem as proved in [8], combined with techniques as developed in [2]. The rest of this paper is organized as follows. In Sect. 2 we rewrite Eq. (1) and (2) as a dynamical system with the coordinate parallel to the flow playing the role of time. The discussion will be formal. At the end of the discussion we get a set of integral equations. In Sects. 3 and 4 we then prove that these integral equations admit a solution, and in Sect. 5 we finally show that this solution provides a solution for (1) and (2) with the boundary conditions (3) and (4).
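2. The Dynamical System
Equations (1), (2) are equivalent to
$$ \omega = \partial_x v - \partial_y u, \qquad 0 = -(\mathbf{u} \cdot \nabla)\omega + \Delta\omega, \qquad 0 = \partial_x u + \partial_y v, \tag{9} $$
ω being the vorticity of the fluid. The main idea underlying the tools developed in this paper is to consider the coordinate parallel to the flow as a time coordinate [3]. Let $\eta = \partial_x \omega$, and $\mathbf{u} = (1, 0) + (u, v)$. Then Eqs. (9) are equivalent to
$$ \partial_x \omega = \eta, \qquad \partial_x \eta = \eta - \partial_y^2 \omega + q, \qquad \partial_x u = -\partial_y v, \qquad \partial_x v = \partial_y u + \omega, \tag{10} $$
where
$$ q = u\eta + v\,\partial_y \omega. \tag{11} $$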
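The passage from (9) to (10) and (11) is a one-line computation that the text leaves implicit. The following worked derivation is a sketch of it, using only the substitution of $(1 + u, v)$ for the full velocity and the definition $\eta = \partial_x \omega$; nothing beyond the quantities already defined above is assumed.

```latex
% Sketch: from (9) to (10)-(11), writing the full velocity as (1 + u, v).
\begin{align*}
0 &= -\bigl((1+u)\,\partial_x + v\,\partial_y\bigr)\omega
     + \partial_x^2\omega + \partial_y^2\omega \\
  &= -\eta - \bigl(u\,\eta + v\,\partial_y\omega\bigr)
     + \partial_x\eta + \partial_y^2\omega
     \qquad (\eta = \partial_x\omega), \\
\partial_x\eta &= \eta - \partial_y^2\omega + q,
     \qquad q = u\,\eta + v\,\partial_y\omega, \\
\partial_x u &= -\partial_y v
     \quad\text{(from } \partial_x u + \partial_y v = 0\text{)}, \\
\partial_x v &= \partial_y u + \omega
     \quad\text{(from } \omega = \partial_x v - \partial_y u\text{)}.
\end{align*}
```

All nonlinear terms are collected in q, so the remaining system is linear in $(\omega, \eta, u, v)$ with x playing the role of time.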
Let
$$ \omega(x, y) = \frac{1}{2\pi} \int_{\mathbb{R}} dk\, e^{-iky}\, \hat{\omega}(k, x), $$
and accordingly for the other functions. For (10) we then get (for simplicity we drop the hats and use in Fourier space t instead of x for the “time”-variable) the dynamical system
$$ \dot{\omega} = \eta, \qquad \dot{\eta} = \eta + k^2 \omega + q, \qquad \dot{u} = ikv, \qquad \dot{v} = -iku + \omega, \tag{12} $$
where
$$ q = \frac{1}{2\pi}\left(u * \eta + v * (-ik\omega)\right), \tag{13} $$
the dot meaning derivative with respect to t and the “star” being the convolution product. Note that u is an even, real valued function of k, and that ω, η, v and q are odd functions of k with values in $i\mathbb{R}$. Equations (12) are of the form $\dot{z} = Lz + \mathbf{q}$, with $z = (\omega, \eta, u, v)$, $\mathbf{q} = (0, q, 0, 0)$ and
$$ L(k) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ k^2 & 1 & 0 & 0 \\ 0 & 0 & 0 & ik \\ 1 & 0 & -ik & 0 \end{pmatrix}. $$
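The matrix L(k) can be diagonalized. Namely, let $\sigma(k) \equiv \operatorname{signum}(k)$, and define $\Lambda$, $\Lambda_+$ and $\Lambda_-$ by
$$ \Lambda(k) = \sqrt{1 + 4k^2}, \qquad \Lambda_+(k) = \frac{1 + \Lambda(k)}{2}, \qquad \Lambda_-(k) = \frac{1 - \Lambda(k)}{2}. $$
Let $z = S\zeta$ with
$$ S(k) = \begin{pmatrix} 1 & 1 & 0 & 0 \\ \Lambda_+ & \Lambda_- & 0 & 0 \\ -\frac{i}{k}\Lambda_- & -\frac{i}{k}\Lambda_+ & 1 & 1 \\ 1 & 1 & -i\sigma & i\sigma \end{pmatrix}. $$
Then $\dot{\zeta} = D\zeta + S^{-1}\mathbf{q}$ with
$$ S^{-1}(k) = \begin{pmatrix} -\frac{\Lambda_-}{\Lambda} & \frac{1}{\Lambda} & 0 & 0 \\ \frac{\Lambda_+}{\Lambda} & -\frac{1}{\Lambda} & 0 & 0 \\ -\frac{i}{2}\left(\sigma - \frac{1}{k}\right) & -\frac{1}{2}\frac{i}{k} & \frac{1}{2} & \frac{1}{2}i\sigma \\ \frac{i}{2}\left(\sigma + \frac{1}{k}\right) & -\frac{1}{2}\frac{i}{k} & \frac{1}{2} & -\frac{1}{2}i\sigma \end{pmatrix}, $$
and $D = S^{-1}LS$ is a diagonal matrix with diagonal entries $\Lambda_+$, $\Lambda_-$, $|k|$, and $-|k|$. Note that $\Lambda_+(k) \ge 1$ and $\Lambda_-(k) \le 0$ and $\Lambda_-(k) \approx -k^2$ for small values of k. Let $\zeta = (\omega_+, \omega_-, u_+, u_-)$. Using the definitions we find that (12) is equivalent to
$$ \begin{aligned} \dot{\omega}_+ &= \Lambda_+ \omega_+ + \frac{1}{\Lambda}\, q, \\ \dot{\omega}_- &= \Lambda_- \omega_- - \frac{1}{\Lambda}\, q, \\ \dot{u}_+ &= |k|\, u_+ - \frac{1}{2}\frac{i}{k}\, q, \\ \dot{u}_- &= -|k|\, u_- - \frac{1}{2}\frac{i}{k}\, q, \end{aligned} \tag{14} $$
with q as defined in (13), with $\omega_+$ and $\omega_-$ odd functions of k with values in $i\mathbb{R}$ and with $u_+$ and $u_-$ even real valued functions of k. For convenience later on we also write $z = S\zeta$ in component form. Namely,
$$ \begin{aligned} \omega &= \omega_+ + \omega_-, \\ \eta &= \Lambda_+ \omega_+ + \Lambda_- \omega_-, \\ u &= -\frac{i}{k}\Lambda_- \omega_+ - \frac{i}{k}\Lambda_+ \omega_- + u_+ + u_-, \\ v &= \omega_+ + \omega_- - i\sigma u_+ + i\sigma u_-. \end{aligned} \tag{15} $$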
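The diagonalization can be checked numerically at a few sample wave numbers. The sketch below is an illustration, not part of the paper: it builds L, S, S⁻¹ and D as plain nested lists of complex numbers, with the unlabelled square root denoted `lam` and the two roots `lam_p`, `lam_m`. These names, and the helpers `matmul` and `max_abs_diff`, are hypothetical. The check is that $LS = SD$ and $S^{-1}S = I$ hold up to rounding.

```python
import math


def dynamical_system_matrices(k):
    """Return (L, S, S_inv, D) at a nonzero real wave number k."""
    lam = math.sqrt(1.0 + 4.0 * k * k)
    lam_p = (1.0 + lam) / 2.0          # unstable vorticity eigenvalue
    lam_m = (1.0 - lam) / 2.0          # stable vorticity eigenvalue
    sigma = 1.0 if k > 0 else -1.0     # signum(k)
    i = 1j
    L = [[0, 1, 0, 0],
         [k * k, 1, 0, 0],
         [0, 0, 0, i * k],
         [1, 0, -i * k, 0]]
    S = [[1, 1, 0, 0],
         [lam_p, lam_m, 0, 0],
         [-i * lam_m / k, -i * lam_p / k, 1, 1],
         [1, 1, -i * sigma, i * sigma]]
    S_inv = [[-lam_m / lam, 1 / lam, 0, 0],
             [lam_p / lam, -1 / lam, 0, 0],
             [-0.5 * i * (sigma - 1 / k), -0.5 * i / k, 0.5, 0.5 * i * sigma],
             [0.5 * i * (sigma + 1 / k), -0.5 * i / k, 0.5, -0.5 * i * sigma]]
    D = [[lam_p, 0, 0, 0],
         [0, lam_m, 0, 0],
         [0, 0, abs(k), 0],
         [0, 0, 0, -abs(k)]]
    return L, S, S_inv, D


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

A call such as `max_abs_diff(matmul(L, S), matmul(S, D))` for the matrices returned at k = 0.3 should be at the level of rounding error. The second column of `S_inv` reproduces the coefficients of q in (14).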
To solve (14) we convert it into an integral equation. The +-modes are unstable (remember that $\Lambda_+(k) \ge 1$) and we therefore have to integrate these modes backwards in time starting with $\omega_+(k, \infty) \equiv u_+(k, \infty) \equiv 0$ (see [8]). We get
$$ \begin{aligned} \omega_+(k, t) &= -\frac{1}{\Lambda} \int_t^\infty e^{\Lambda_+(t-s)}\, q(k, s)\, ds, \\ \omega_-(k, t) &= \omega_-^*(k)\, e^{\Lambda_-(t-1)} - \frac{1}{\Lambda} \int_1^t e^{\Lambda_-(t-s)}\, q(k, s)\, ds, \\ u_+(k, t) &= \frac{1}{2}\frac{i}{k} \int_t^\infty e^{|k|(t-s)}\, q(k, s)\, ds, \\ u_-(k, t) &= u_-^*(k)\, e^{-|k|(t-1)} - \frac{1}{2}\frac{i}{k} \int_1^t e^{-|k|(t-s)}\, q(k, s)\, ds. \end{aligned} \tag{16} $$
In Sect. 4 we will show that the initial condition $\omega_-^*$ can be re-expressed in terms of the vorticity on $\Sigma$, and we will see that $u_-^*$ adds to (u, v) a potential flow. In particular, for the case of zero vorticity, we have that $\omega_+ = \omega_- = q = u_+ = 0$ and we get that (u, v) is a pure potential flow. In order to prove the existence of a solution for (16) we will apply the contraction mapping principle to the map $\tilde{q} = N(q)$ that is formally defined by computing first $(\omega_+, \omega_-, u_+, u_-)$ from q using (16), then $(\omega, \eta, u, v)$ using (15) and then $\tilde{q}$ by using (13). As discussed above, in direct space, one expects the vorticity to be a rapidly decaying function of y for all x, and we also assume this to be the case for $\eta = \partial_x \omega$ and for $q = u\eta + v\,\partial_y \omega$. As a consequence, in Fourier space, q(k, t) ought to be smooth as a function of k (probably entire), but for our purpose it will be sufficient to assume that $k \mapsto q(k, t)$ is once continuously differentiable. The decay properties for u and v in direct space are much less obvious, and we should therefore avoid assuming any smoothness in k that
goes beyond what is necessary to show that $\lim_{y \to \infty} u(x, y) = \lim_{y \to \infty} v(x, y) = 0$ in order to satisfy the boundary conditions (3), (4). We finally note that since $\Lambda_-(k) \approx -k^2$ for small k, the time evolution of $\omega_-$ is in many ways similar to that of a solution of the heat equation. This is the origin of the appearance of a wake with a parabolic structure and motivates what follows (see [5]). Let α ≥ 0 and
$$ \mu_\alpha(k, t) = \frac{1}{1 + \left(|k|\, t^{1/2}\right)^\alpha}. \tag{17} $$
We will consider the Banach space $V_\alpha$ of even functions $f \in C^0(\mathbb{R})$ equipped with the norm
$$ \|f\|_\alpha = \sup_{k \in \mathbb{R}} \frac{|f(k)|}{\mu_\alpha(k, 1)}, $$
the Banach space $V^1_\alpha$ of imaginary valued odd functions $f \in C^1(\mathbb{R}, i\mathbb{R})$ equipped with the norm
$$ \|f\|^1_\alpha = \sup_{k \in \mathbb{R}} \frac{|f(k)|}{\mu_\alpha(k, 1)} + \sup_{k \in \mathbb{R}} \frac{|\partial_k f(k)|}{\mu_\alpha(k, 1)}, $$
and the Banach space $B_{\alpha,\beta}$ of continuous functions f from [1, ∞) to $V^1_\alpha$ equipped with the norm
$$ \|f\|_{\alpha,\beta} = \sup_{t \ge 1} t^\beta\, \|f(t^{-1/2}\, \cdot\, , t)\|^1_\alpha. $$
Theorem 2. Fix α > 0. Let $u_-^* \in V_{\alpha+1}$, $\omega_-^* \in V^1_{\alpha+1}$, and let $\varepsilon_0 = \|u_-^*\|_{\alpha+1} + \|\omega_-^*\|^1_{\alpha+1}$. Then, N is well defined as a map from $B_{\alpha,2}$ to $B_{\alpha,2}$ and contracts, for $\varepsilon_0$ sufficiently small, the ball $B_\alpha(\varepsilon_0) = \{q \in B_{\alpha,2} \mid \|q\|_{\alpha,2} \le \varepsilon_0\}$ into itself.
This theorem implies that, for small enough initial conditions $\omega_-^*$ and $u_-^*$ on $\Sigma$, the integral equations (16) admit a unique solution.
3. Proof of Theorem 2
The proof is organized as follows: we first prove that N is well defined and maps, for small enough initial conditions, a ball in $B_{\alpha,2}$ into itself. Then we show that N is a contraction on this ball. Note that Eqs. (15) and (16) contain divisions of q and $\omega_-^*$ by k and we first prove bounds on these quotients. Let $\varepsilon_0$ be as in Theorem 2. Throughout this proof we then denote by ε a constant multiple of $\varepsilon_0$, i.e., $\varepsilon = \text{const.}\ \varepsilon_0$ with a constant that may be different from instance to instance.
Proposition 3. Let $q \in B_{\alpha,2}$, with $\|q\|_{\alpha,2} \le \varepsilon$. Then,
$$ |q(k, t)| \le \frac{\varepsilon}{t^2}\, \mu_\alpha(k, t), \tag{18} $$
$$ |\partial_k q(k, t)| \le \frac{\varepsilon}{t^{3/2}}\, \mu_\alpha(k, t), \tag{19} $$
$$ |q(k, t)| \le \varepsilon\, \frac{|k|}{t^{3/2}}\, \mu_{\alpha+1}(k, t). \tag{20} $$
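Proof. The inequalities (18) and (19) follow from our definition of the norm of $B_{\alpha,\beta}$. We now prove (20). From (19) we get for $|k| \le 1/t^{1/2}$ the bound
$$ |q(k, t)| \le |k| \sup_{|k| \le 1/t^{1/2}} |\partial_k q(k, t)| \le |k|\, \frac{\varepsilon}{t^{3/2}} \le \varepsilon\, \frac{|k|}{t^{3/2}}\, \mu_{\alpha+1}(k, t). $$
The last inequality follows because $|k| \le 1/t^{1/2}$ implies that $|k|\, t^{1/2} \le 1$, and therefore $\mu_{\alpha+1}(k, t) \ge \frac{1}{2}$. Similarly, for $|k| > 1/t^{1/2}$ we find using (18) that
$$ |q(k, t)| \le \frac{\varepsilon}{t^2}\, \mu_\alpha(k, t) \le 2\, \frac{t^{1/2}|k|}{1 + t^{1/2}|k|}\, \frac{\varepsilon}{t^2}\, \mu_\alpha(k, t) \le \varepsilon\, \frac{|k|}{t^{3/2}}\, \mu_{\alpha+1}(k, t), $$
as claimed. Similarly we have:
Proposition 4. Let $\omega_-^* \in V^1_{\alpha+1}$ with $\|\omega_-^*\|^1_{\alpha+1} \le \varepsilon_0$. Then,
$$ |\omega_-^*(k)| \le \varepsilon\, \mu_{\alpha+1}(k, 1), \tag{21} $$
$$ |\partial_k \omega_-^*(k)| \le \varepsilon\, \mu_{\alpha+1}(k, 1), \tag{22} $$
$$ |\omega_-^*(k)| \le \varepsilon\, |k|\, \mu_{\alpha+2}(k, 1). \tag{23} $$
Proof. Inequalities (21) and (22) follow from the definition of the norm in $V^1_{\alpha+1}$. We now prove (23). From (22) we get for $|k| \le 1$ that
$$ |\omega_-^*(k)| \le |k| \sup_{|k| \le 1} |\partial_k \omega_-^*(k)| \le |k|\, \varepsilon_0 \le \varepsilon\, |k|\, \mu_{\alpha+2}(k, 1), $$
and from (21) we have for $|k| > 1$ that
$$ |\omega_-^*(k)| \le \varepsilon_0\, \mu_{\alpha+1}(k, 1) \le \varepsilon_0\, \frac{2|k|}{1 + |k|}\, \mu_{\alpha+1}(k, 1) \le \varepsilon\, |k|\, \mu_{\alpha+2}(k, 1), $$
as claimed.
In the following two subsections we prove the following proposition:
Proposition 5. Let α > 0, $u_-^* \in V_{\alpha+1}$, $\omega_-^* \in V^1_{\alpha+1}$, and let $\varepsilon_0 = \|u_-^*\|_{\alpha+1} + \|\omega_-^*\|^1_{\alpha+1}$. Then, for all $q \in B_{\alpha,2}$, with $\|q\|_{\alpha,2} \le \text{const.}\ \varepsilon_0$ we have the bounds
$$ |N(q)(k, t)| \le \varepsilon^2\, \mu_\alpha(k, t)/t^2, \tag{24} $$
$$ |\partial_k N(q)(k, t)| \le \varepsilon^2\, \mu_\alpha(k, t)/t^{3/2}. \tag{25} $$
The bounds (24) and (25) imply that $\|N(q)\|_{\alpha,2} \le \varepsilon^2$, and therefore N is well defined as a map from $B_{\alpha,2}$ to $B_{\alpha,2}$. Furthermore, since $\varepsilon^2 = \text{const.}\ \varepsilon_0^2$, it follows that N maps the ball $B_\alpha(\varepsilon_0) \equiv \{q \in B_{\alpha,2} \mid \|q\|_{\alpha,2} \le \varepsilon_0\}$ into itself for $\varepsilon_0$ small enough.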
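The two elementary facts about the weight $\mu_\alpha$ used in the proof of (20) can be checked numerically on a grid. The sketch below is only an illustration; the names `mu`, `worst_ratio` and `smallest_mu_inside` are hypothetical. With $x = |k|\,t^{1/2}$, it tests that $\mu_{\alpha+1} \ge 1/2$ whenever $x \le 1$, and that $\mu_\alpha \le 2x\,\mu_{\alpha+1}$ whenever $x > 1$, which is what the factor $2x/(1+x)$ in the second chain of inequalities delivers.

```python
def mu(alpha, k, t):
    """Weight function mu_alpha(k, t) = 1 / (1 + (|k| t^{1/2})^alpha) from (17)."""
    return 1.0 / (1.0 + (abs(k) * t ** 0.5) ** alpha)


def worst_ratio(alpha, ks, ts):
    """Largest mu_alpha / (x * mu_{alpha+1}) over grid points with x = |k| t^{1/2} > 1."""
    worst = 0.0
    for t in ts:
        for k in ks:
            x = abs(k) * t ** 0.5
            if x > 1.0:
                worst = max(worst, mu(alpha, k, t) / (x * mu(alpha + 1.0, k, t)))
    return worst


def smallest_mu_inside(alpha, ks, ts):
    """Smallest mu_{alpha+1} over grid points with x = |k| t^{1/2} <= 1."""
    smallest = float("inf")
    for t in ts:
        for k in ks:
            if abs(k) * t ** 0.5 <= 1.0:
                smallest = min(smallest, mu(alpha + 1.0, k, t))
    return smallest
```

In both regimes the constant 2 is enough, and this constant is then absorbed into ε, in line with the convention that ε may change from instance to instance.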
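3.1. Bound on N(q). Using the bounds (18)-(20) and (21)-(23) we can now estimate $\omega_+$, $\omega_-$, $u_+$ and $u_-$. Let α ≥ 0 and
$$ \bar{\mu}_\alpha(k, t) = 1/(1 + |kt|^\alpha). \tag{26} $$
Proposition 6. Let $u_-^*$, $\omega_-^*$ and q be as defined above. Then,
$$ |\omega_+(k, t)| \le \frac{\varepsilon}{t^2}\, \frac{1}{\Lambda}\frac{1}{\Lambda_+}\, \mu_\alpha(k, t), \tag{27} $$
$$ |\omega_-(k, t)| \le \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k, t), \tag{28} $$
$$ \left| \frac{1}{k}\, \omega_-(k, t) \right| \le \varepsilon\, \mu_{\alpha+2}(k, t), \tag{29} $$
$$ |\Lambda_- \omega_-(k, t)| \le \frac{\varepsilon}{t^{3/2}}\, \mu_\alpha(k, t), \tag{30} $$
$$ |u_+(k, t)| \le \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k, t), \tag{31} $$
$$ |u_-(k, t)| \le \varepsilon\, \bar{\mu}_{\alpha+1}(k, t) + \frac{\varepsilon}{t^{1/2}}\, \mu_{\alpha+1}(k, t). \tag{32} $$
Proof. For $\omega_+$ we have
$$ |\omega_+(k, t)| \le \frac{1}{\Lambda} \sup_{s \ge t} |q(k, s)| \int_t^\infty e^{\Lambda_+(t-s)}\, ds \le \frac{\varepsilon}{t^2}\, \frac{1}{\Lambda}\frac{1}{\Lambda_+}\, \mu_\alpha(k, t), \tag{33} $$
and (27) follows. For $\omega_-$ we have
$$ \begin{aligned} |\omega_-(k, t)| &\le |\omega_-^*(k)|\, e^{\Lambda_-(t-1)} + \frac{1}{\Lambda} \int_1^{(t+1)/2} e^{\Lambda_-(t-s)}\, |q(k, s)|\, ds + \frac{1}{\Lambda} \int_{(t+1)/2}^t e^{\Lambda_-(t-s)}\, |q(k, s)|\, ds \\ &\le |\omega_-^*(k)|\, e^{\Lambda_-(t-1)} + \frac{1}{\Lambda}\, e^{\Lambda_-(t-1)/2} \int_1^{(t+1)/2} |q(k, s)|\, ds + \frac{1}{\Lambda} \int_{(t+1)/2}^t |q(k, s)|\, ds \\ &\le \left( |\omega_-^*(k)| + \frac{\varepsilon}{\Lambda}\, |k|\, \mu_{\alpha+1}(k, 1) \int_1^\infty \frac{ds}{s^{3/2}} \right) e^{\Lambda_-(t-1)/2} + \frac{\varepsilon}{\Lambda}\, |k|\, \mu_{\alpha+1}\!\left(k, \frac{t}{2}\right) \frac{1}{t^{1/2}} \\ &\le \varepsilon\, |k|\, \mu_{\alpha+2}(k, 1)\, e^{\Lambda_-(t-1)/2} + \frac{\varepsilon}{\Lambda}\, |k|\, \frac{1}{t^{1/2}}\, \mu_{\alpha+1}\!\left(k, \frac{t}{2}\right) \\ &\le \varepsilon\, |k|\, \mu_{\alpha+2}(k, t), \end{aligned} \tag{34} $$
and (28) and (29) follow. For the last inequality we have used that
$$ \sup_{k \in \mathbb{R}} \sup_{t \ge 1}\ \mu_{\alpha+1}(k, t/2)/(\Lambda\, t^{1/2}\, \mu_{\alpha+2}(k, t)) \le \text{const.}, $$
and furthermore that $\mu_{\alpha+2}(k, 1)\, e^{\Lambda_-(t-1)/2} \le \mu_{\alpha+2}(k, t)$. This is a consequence of the following proposition: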
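Proposition 7. Let $\alpha \ge \beta \ge 0$. Then, for all $t \ge 1$ and $k \in \mathbb{R}$,
\[ \frac{1}{1+|k|^{\alpha}}\; e^{\Lambda_- \frac{t-1}{2}}\; |\Lambda_-|^{\beta} \le \mathrm{const.}\;\frac{1}{t^{\beta}}\;\frac{1}{1+\left(|k|\,t^{1/2}\right)^{\alpha-\beta}}. \tag{35} \]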
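As a numerical sanity check of inequality (35), not a substitute for the proof, the following Python sketch evaluates the ratio of the left-hand side to the $k$,$t$-dependent factor on the right-hand side over a logarithmic grid. It assumes the definition $\Lambda_-(k) = (1-\sqrt{1+4k^2})/2$, which is not restated in this section; the function names, the grid and the sample exponents $\alpha = 3$, $\beta = 1$ are illustrative choices only.

```python
import math

def lam_minus(k):
    # Assumed definition: Lambda_-(k) = (1 - sqrt(1 + 4 k^2)) / 2,
    # rewritten in a form that avoids cancellation for small k.
    return -2.0 * k * k / (1.0 + math.sqrt(1.0 + 4.0 * k * k))

def ratio_35(k, t, alpha, beta):
    """Left-hand side of (35) divided by the (k, t)-dependent factor of its right-hand side."""
    L = abs(lam_minus(k))
    lhs = math.exp(-L * (t - 1.0) / 2.0) * L ** beta / (1.0 + abs(k) ** alpha)
    rhs = 1.0 / (t ** beta * (1.0 + (abs(k) * math.sqrt(t)) ** (alpha - beta)))
    return lhs / rhs

def sup_ratio(alpha, beta, n=121):
    # Logarithmic grid: k in [1e-3, 1e3], t in [1, 1e6].
    ks = [10.0 ** (-3.0 + 6.0 * i / (n - 1)) for i in range(n)]
    ts = [10.0 ** (6.0 * j / (n - 1)) for j in range(n)]
    return max(ratio_35(k, t, alpha, beta) for k in ks for t in ts)

worst = sup_ratio(alpha=3.0, beta=1.0)
print(worst)  # stays bounded on the grid, as (35) predicts
```

The largest ratio found on the grid plays the role of the unspecified constant in (35); a bounded value over several decades of $k$ and $t$ is consistent with the proposition.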
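Proof. Since $\Lambda_-(k) \le 0$, $|\Lambda_-| \le \mathrm{const.}\,|k|$ and $\mu_{\alpha+2}(k,2) \le \mathrm{const.}\,\mu_{\alpha+2}(k,1)$, (35) is obvious for $1 \le t \le 2$. For $t > 2$ we use that
\begin{align*}
\left(1+\left(|k|\,t^{1/2}\right)^{\alpha-\beta}\right) e^{\Lambda_-\frac{t-1}{2}}\,|\Lambda_- t|^{\beta}
&\le \left(1+\left(|k|\,t^{1/2}\right)^{\alpha-\beta}\right) e^{\Lambda_-\frac{t}{4}}\,|\Lambda_- t|^{\beta}\\
&\le \mathrm{const.}\left(1+\frac{|k|^{\alpha-\beta}}{|\Lambda_-|^{(\alpha-\beta)/2}}\,|\Lambda_- t|^{(\alpha-\beta)/2}\right)|\Lambda_- t|^{\beta}\, e^{\Lambda_-\frac{t}{4}}\\
&\le \mathrm{const.}\left(1+\frac{|k|^{\alpha-\beta}}{|\Lambda_-|^{(\alpha-\beta)/2}}\right)\\
&\le \mathrm{const.}\left(1+|\Lambda_-|^{(\alpha-\beta)/2}\right)\\
&\le \mathrm{const.}\left(1+|k|^{\alpha}\right),
\end{align*}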
and (35) follows. Note that Proposition (7) will be routinely used below without mention. We now bound − ω− . We find, using the same techniques as for ω− , that ∞ ∗ t−1 |k| ds |− | e− 2 |− ω− (k, t)| ≤ ω− (k) + ε µα+1 (k, 1) 3/2 s 1 1 t + e− (t−s) |− | |q(k, s)| ds (t+1)/2 t−1
≤ ε |k| µα+2 (k, 1) |− | e− 2 t 1 ε t + µ (k, e−|− |(t−s) |− | ds ) α t2 2 t/2 ε 1 ε ≤ |k| µα+1 (k, t) + µα (k, t) t t2 ε ε ≤ 3/2 µα (k, t) + 2 µα (k, t), t t and (30) follows. We now bound u+ and u− . For u+ we find |u+ (k, t)| ≤ εµα+1 (k, t)
∞ t
ds ε ≤ 1/2 µα+1 (k, t), s 3/2 t
and for u− (proceeding as in the case of ω− ) we have ∞ t−1 ds |u− (k, t)| ≤ u∗− (k) + εµα+1 (k, 1) e−|k| 2 3/2 s 1 t 1 + e−|k|(t−s) |q(k, s)| ds |k| (t+1)/2 t−1 ε t ≤ εµα+1 (k, 1)e−|k| 2 + 1/2 µα+1 (k, ) t 2 ε ≤ ε µ¯ α+1 (k, t) + 1/2 µα+1 (k, t), t as claimed. For the last inequality we have used that µα+1 (k, t/2) ≤ const. µα+1 (k, t), and furthermore that µα+1 (k, 1)e−|k|(t−1)/2 ≤ µ¯ α+1 (k, t). This is obvious for 1 ≤ t ≤ 2, and for t > 2 we use that for all α ≥ 0, t−1 t 1 + |kt|α e−|k| 2 ≤ 1 + |kt|α e−|k| 4 ≤ const. This completes the proof of Proposition 6. From the bounds on ω+ , ω− , u+ and u− we get from (15) the following bounds on ω, η, u and v. ∗ and q be as defined above. Then, Proposition 8. Let u∗− , ω−
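\begin{align}
|\omega(k,t)| &\le \frac{\varepsilon}{t^{1/2}}\,\mu_{\alpha+1}(k,t), \tag{36}\\
|-ik\,\omega(k,t)| &\le \frac{\varepsilon}{t}\,\mu_\alpha(k,t), \tag{37}\\
|\eta(k,t)| &\le \frac{\varepsilon}{t^{3/2}}\,\mu_\alpha(k,t), \tag{38}\\
|u(k,t)| &\le \varepsilon\,\mu_{\alpha+1}(k,t), \tag{39}\\
|v(k,t)| &\le \frac{\varepsilon}{t^{1/2}}\,\mu_{\alpha+1}(k,t) + \varepsilon\,\bar\mu_{\alpha+1}(k,t). \tag{40}
\end{align}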
Proof. Inequality (37) immediately follows from (36). To prove (36) and (38)–(40) we apply the triangle inequality to Eqs. (15) and use then the bounds (27)–(32). We get ε
1 ε ε µα+1 (k, t) + 1/2 µα+1 (k, t) ≤ 1/2 µα+1 (k, t), + t t ε ε ε |η(k, t)| ≤ 3/2 µα+1 (k, t) + 3/2 µα (k, t) ≤ 3/2 µα (k, t), t t t ε 1 ε |u(k, t)| ≤ 3/2 µα+1 (k, t) + ε+ µα+2 (k, t) + 1/2 µα+1 (k, t) + εµ¯ α+1 (k, t) t + t ≤ εµα+1 (k, t), ε |v(k, t)| ≤ 1/2 µα+1 (k, t) + ε µ¯ α+1 (k, t), t
|ω(k, t)| ≤
t 3/2
as claimed. We can now bound the convolutions in (13):
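Proposition 9. Let $u_-^*$, $\omega_-^*$ and $q$ be as defined above. Then,
\begin{align}
|(u * \eta)(k,t)| &\le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t), \tag{41}\\
|(v * (-ik\omega))(k,t)| &\le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t). \tag{42}
\end{align}

Proof. Let $k \ge 0$. Then, we have for $u * \eta$,
\begin{align}
|(u * \eta)(k,t)| &\le \frac{\varepsilon^2}{t^{3/2}} \int_{-\infty}^{\infty} \mu_{\alpha+1}(k',t)\,\mu_\alpha(k-k',t)\,dk' \notag\\
&\le \frac{\varepsilon^2}{t^{3/2}} \left( \mu_\alpha(k/2,t) \int_{-\infty}^{k/2} \mu_{\alpha+1}(k',t)\,dk' + \int_{k/2}^{3k/2} \mu_{\alpha+1}(k',t)\,dk' + \mu_\alpha(k/2,t) \int_{3k/2}^{\infty} \mu_{\alpha+1}(k',t)\,dk' \right) \notag\\
&\le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t) + \frac{\varepsilon^2}{t^{3/2}}\,|k|\,\mu_{\alpha+1}(k/2,t) \le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t), \tag{43}
\end{align}
and (41) follows for $k \ge 0$. Since $u * \eta$ is odd we have the same bound for $k < 0$. Similarly, we have for $v * (-ik\omega)$,
\begin{align}
|(v * (-ik\omega))(k,t)| &\le \frac{\varepsilon^2}{t^{3/2}} \int_{-\infty}^{\infty} \mu_{\alpha+1}(k',t)\,\mu_\alpha(k-k',t)\,dk' + \frac{\varepsilon^2}{t} \int_{-\infty}^{\infty} \bar\mu_{\alpha+1}(k',t)\,\mu_\alpha(k-k',t)\,dk' \notag\\
&\le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t) + \frac{\varepsilon^2}{t} \left( \mu_\alpha(k/2,t) \int_{-\infty}^{k/2} \bar\mu_{\alpha+1}(k',t)\,dk' + \int_{k/2}^{3k/2} \bar\mu_{\alpha+1}(k',t)\,dk' + \mu_\alpha(k/2,t) \int_{3k/2}^{\infty} \bar\mu_{\alpha+1}(k',t)\,dk' \right) \notag\\
&\le \frac{\varepsilon^2}{t^2}\,\mu_\alpha(k,t) + \frac{\varepsilon^2}{t}\,|k|\,\bar\mu_{\alpha+1}(k/2,t), \tag{44}
\end{align}
and (42) follows. Note that the bounds (43) and (44) show that $|N(q)(k,t)| \le \varepsilon^2 \mu_\alpha(k,t)/t^2$ as required.

3.2. Bound on $\partial_k N(q)$. We have
\[ \partial_k q(k,t) = \frac{1}{2\pi}\left(u * \partial_k\eta + v * (-i\omega) + v * (-ik\,\partial_k\omega)\right)(k,t), \tag{45} \]
and it is therefore sufficient to have bounds on the derivatives of $\eta$ and $\omega$ to bound (45). In particular, no derivatives on $u_+$ or $u_-$ are needed. We have: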
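Proposition 10. Let $u_-^*$, $\omega_-^*$ and $q$ be as defined above. Then,
\begin{align}
|\partial_k\omega_+(k,t)| &\le \frac{\varepsilon}{t^{3/2}}\,\frac{1}{\Lambda_0}\,\frac{1}{\Lambda_+}\,\mu_\alpha(k,t), \tag{46}\\
|\partial_k\omega_-(k,t)| &\le \varepsilon\,\mu_{\alpha+1}(k,t), \tag{47}\\
|\Lambda_-\,\partial_k\omega_-(k,t)| &\le \frac{\varepsilon}{t}\,\mu_\alpha(k,t). \tag{48}
\end{align}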
Proof. Proceeding as in Sect. 3.1 we find that
|k| |k| ∞ + (t−s) |∂k ω+ (k, t)| ≤ const. |ω+ (k, t)| + 2 |t − s| |q(k, s)| ds e t ∞ 1 + e+ (t−s) |∂k q(k, s)| ds t ∞ ε 1 1 ε 1 1 µα (k, t) + 2 µα (k, t) e+ (t−s) + |t − s| ds ≤ 2 t + t + t ∞ 1 ε + µα (k, t) e+ (t−s) ds, t 3/2 t and (46) follows. Similarly we have, using the triangle inequality to bound ∂k ω− , that ∗ |k| ∗ |∂k ω− (k, t)| ≤ const. ∂k ω− (k) e− (t−1) + ω− (k) (t − 1)e− (t−1) t t |k| |k| + 2 e− (t−s) |q(k, s)| ds + 2 e− (t−s) |t −s| |q(k, s)| ds 1 1 1 t − (t−s) |∂k q(k, s)| ds . + e (49) 1 We now estimate the terms on the right-hand side of (49) individually. We have, proceeding in particular as in the bound (34) on ω− to prove (50) and (51), and using the inequality k 2 ≤ |− |, that ∂k ω∗ (k) e− (t−1) ≤ εµα+1 (k, 1)e− (t−1) ≤ εµα+1 (k, t), − 2 ∗ |k| ω (k) (t − 1)e− (t−1) ≤ ε |k| µα+2 (k, 1) |t − 1| e− (t−1) − ≤ εµα+2 (k, 1) (t − 1) |− | e− (t−1) ≤ εµα+1 (k, t), |k| t − (t−s) |k| ε |q(k, s)| ds ≤ e µα+1 (k, t), 2 1 t 1/2
(50)
|k| 2
t
e− (t−s) |t − s| |q(k, s)| ds
1
t−1 |k|2 µα+1 (k, 1)(t − 1)e− 2 2 t |k|2 t ds + ε 2 µα+1 (k, ) e− (t−s) (t − s) 3/2 2 t/2 s t−1 1 ≤ ε µα+1 (k, 1) |− | (t − 1)e− 2 t 1 t ds + ε µα+1 (k, ) e− (t−s) |− | (t − s) 3/2 2 t/2 s 1 ≤ ε µα+1 (k, t),
≤ε
and 1
1
t
e− (t−s) |∂k q(k, s)| ds ≤ εµα+1 (k, t),
(51)
and (47) follows. We now prove (48). We multiply the inequality (49) with − and again bound the terms on the right hand side individually. Namely, ∂k ω∗ (k) |− | e− (t−1) ≤ εµα+1 (k, 1) |− | e− (t−1) ≤ ε µα (k, t), − t ∗ |k| ω (k) |− | (t − 1)e− (t−1) ≤ εµα+2 (k, 1) |− |2 (t − 1)e− (t−1) − ε ≤ µα (k, t), t |− | |k| t − (t−s) ε |q(k, s)| ds ≤ µα (k, t), e 2 t 1 and furthermore |k| 2
t 1
e− (t−s) |− | |t − s| |q(k, s)| ds
t−1 |k|2 ≤ ε 2 µα+1 (k, 1) |− | (t − 1)e− 2 t |k|2 ε t + 2 3/2 µα+1 (k, ) e− (t−s) |− | (t − s) ds t 2 t/2 t−1
≤ εµα+2 (k, 1) |− |2 (t − 1)e− 2 t + ε t + 2 3/2 µα+1 (k, ) e−|− |(t−s) (|− | (t − s)) |− | ds t 2 t/2 ε ≤ µα (k, t), t
and |− | t − (t−s) |∂k q(k, s)| ds e 1 |− | (t+1)/2 − (t−s) |∂k q(k, s)| ds ≤ e 1 |− | t − (t−s) |∂k q(k, s)| ds + e t/2 t ds − (t−1)/2 ≤ εµα+1 (k, 1) |− | e 3/2 1 s t |− | + e− (t−s) |∂k q(k, s)| ds t/2 t ε 1 ε ε ≤ µα (k, t) + µ (k, t/2) e− (t−s) |− | ds ≤ µα (k, t), α 3/2 t t t t/2 and (48) follows. From the bounds on ∂k ω+ and ∂k ω− we get the following bound on ik∂k ω and ∂k η: ∗ and q be as defined above. Then, Proposition 11. Let u∗− , ω−
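\begin{align}
|ik\,\partial_k\omega(k,t)| &\le \frac{\varepsilon}{t^{1/2}}\,\mu_\alpha(k,t), \tag{52}\\
|\partial_k\eta(k,t)| &\le \frac{\varepsilon}{t}\,\mu_\alpha(k,t). \tag{53}
\end{align}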
Proof. We apply the triangle inequality to the derivatives of ω and η in (15), and get
1 1 ε |ik∂k ω(k, t)| ≤ |k| µα (k, t) + εµα+1 (k, t) ≤ ε |k| µα+1 (k, t) + t 3/2 ε ≤ 1/2 µα (k, t), t |k| |k| |∂k η(k, t)| ≤ |ω+ (k, t)| + |ω− (k, t)| + + |∂k ω+ (k, t)| + |− | |∂k ω− (k, t)| , |k| 1 ε ε |ω(k, t)| + µα (k, t) + µα (k, t), ≤ 3/2 t t |k| ε ε ε ≤ µα+1 (k, t) + µα (k, t) ≤ µα (k, t), t 1/2 t t as claimed. We can now estimate the convolution products in (45):
Proposition 12. Let $u_-^*$, $\omega_-^*$ and $q$ be as defined above. Then,
\begin{align}
|(u * \partial_k\eta)(k,t)| &\le \frac{\varepsilon^2}{t^{3/2}}\,\mu_\alpha(k,t), \tag{54}\\
|(v * (-i\omega))(k,t)| &\le \frac{\varepsilon^2}{t^{3/2}}\,\mu_\alpha(k,t), \tag{55}\\
|(v * (-ik\,\partial_k\omega))(k,t)| &\le \frac{\varepsilon^2}{t^{3/2}}\,\mu_\alpha(k,t). \tag{56}
\end{align}

Proof. We use the bounds (36), (39), (40) and (52), (53). The bounds on (54)–(56) then follow immediately since all the resulting convolutions have already been bounded in the proof of Proposition 8.

Note that the bounds (54)–(56) show that $|\partial_k N(q)(k,t)| \le \varepsilon^2 \mu_\alpha(k,t)/t^{3/2}$ as required.
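3.3. Bound on $\|N(q) - N(\tilde q)\|_{\alpha,2}$. In order to prove Theorem 2 it remains to be shown that $N$ is Lipschitz.

Proposition 13. Let $\omega_-^*$ and $u_-^*$ be as above and let $q, \tilde q \in B_\alpha(\varepsilon_0)$. Then
\[ \|N(q) - N(\tilde q)\|_{\alpha,2} \le \mathrm{const.}\;\varepsilon_0\,\|q - \tilde q\|_{\alpha,2}. \tag{57} \]

Proof. We have, using the identity $ab - \tilde a\tilde b = (a - \tilde a)b + \tilde a(b - \tilde b)$, that
\begin{align*}
N(q) - N(\tilde q) &= \frac{1}{2\pi}\Big( (u * \eta + v * (-ik\omega)) - (\tilde u * \tilde\eta + \tilde v * (-ik\tilde\omega)) \Big)\\
&= \frac{1}{2\pi}\Big( (u - \tilde u) * \eta + \tilde u * (\eta - \tilde\eta) + (v - \tilde v) * (-ik\omega) + \tilde v * ((-ik\omega) - (-ik\tilde\omega)) \Big).
\end{align*}
Furthermore, since $q - \tilde q \in B_{\alpha,2}$, we have as in (18)-(20) that
\begin{align*}
|q(k,t) - \tilde q(k,t)| &\le \mathrm{const.}\;\|q - \tilde q\|_{\alpha,2}\;\frac{1}{t^2}\,\mu_\alpha(k,t),\\
|\partial_k q(k,t) - \partial_k\tilde q(k,t)| &\le \mathrm{const.}\;\|q - \tilde q\|_{\alpha,2}\;\frac{1}{t^{3/2}}\,\mu_\alpha(k,t),\\
|q(k,t) - \tilde q(k,t)| &\le \mathrm{const.}\;\|q - \tilde q\|_{\alpha,2}\;\frac{|k|}{t^{3/2}}\,\mu_{\alpha+1}(k,t).
\end{align*}
Finally, since $\omega, \eta, u, v$ and $\tilde\omega, \tilde\eta, \tilde u, \tilde v$ are linear (respectively affine) in $q$ and $\tilde q$, the bound (57) follows mutatis mutandis from the proof of the bound on $N$ and $\partial_k N$.

In Sects. 3.1 and 3.2 we have shown that $N$ maps the ball $B_\alpha(\varepsilon_0)$ into itself, and Proposition 13 therefore shows that $N$ is a contraction of $B_\alpha(\varepsilon_0)$ into itself for $\varepsilon_0$ small enough. This completes the proof of Theorem 2.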
4. Choice of Initial Conditions

We briefly discuss the choice of initial conditions. Instead of $\omega_-^*$ it would be more natural to be able to prescribe the vorticity on the boundary $x = 1$. From (16) we see that
\begin{align}
\omega_+(k,1) &= -\frac{1}{\Lambda_0}\int_1^\infty e^{\Lambda_+(1-s)}\,q(k,s)\,ds, \tag{58}\\
\omega_-(k,1) &= \omega_-^*(k). \notag
\end{align}
Since $\omega = \omega_+ + \omega_-$ this means that we have to choose
\[ \omega_-^*(k) = \omega(k,1) - \omega_+(k,1) \tag{59} \]
to construct a solution with given vorticity $\omega(k,1)$. If we evaluate the bounds (27) and (46) at $t = 1$, we immediately see that $|\omega_+(k,1)| \le \varepsilon\mu_{\alpha+1}(k,1)$ and $|\partial_k\omega_+(k,1)| \le \varepsilon\mu_{\alpha+1}(k,1)$, and therefore $k \mapsto \omega_+(k,1) \in V^1_{\alpha+1}$. As a consequence, if we choose $k \mapsto \omega(k,1) \in V^1_{\alpha+1}$ and replace $\omega_-^*$ in (16) by the right hand side of Eq. (59), we get instead of the map $N$ a map $N_1$, where the equation for $\omega_-$ in (16) is replaced by
\[ \omega_-(k,t) = \omega(k,1)\,e^{\Lambda_-(t-1)} - \omega_+(k,1)\,e^{\Lambda_-(t-1)} - \frac{1}{\Lambda_0}\int_1^t e^{\Lambda_-(t-s)}\,q(k,s)\,ds, \]
with $\omega_+(k,1)$ given by (58). However, since $k \mapsto \omega_+(k,1) \in V^1_{\alpha+1}$, all the bounds in the proof of Theorem 1 remain unchanged, and therefore $N_1$ is well defined on $B_{\alpha,2}$ and a contraction on $B_\alpha(\varepsilon_0)$ provided $k \mapsto \omega(k,1) \in V^1_{\alpha+1}$ and $\|u_-^*\|_\alpha + \|\omega(\,.\,,1)\|^1_\alpha \le \varepsilon_0$ with $\varepsilon_0$ small enough.

We now discuss the role of the initial condition $u_-^*$. From (15) we find that $u$ and $v$ are of the form
\begin{align*}
u(k,t) &= \cdots + u_E(k,t),\\
v(k,t) &= \cdots + v_E(k,t),
\end{align*}
where
\begin{align*}
u_E(k,t) &= u_-^*(k)\,e^{-|k|(t-1)},\\
v_E(k,t) &= i\sigma(k)\,u_-^*(k)\,e^{-|k|(t-1)}.
\end{align*}
The vector field $(u_E, v_E)$ is a potential flow. Namely, let $\psi(k,t) = \frac{1}{-ik}\,u_-^*(k)\,e^{-|k|(t-1)}$, then $u_E(k,t) = -ik\,\psi(k,t)$ and $v_E(k,t) = -\partial_t\psi(k,t)$ and moreover $\left(\partial_t^2 - k^2\right)\psi(k,t) \equiv 0$, which implies that $u_E$ and $v_E$ are harmonic functions in direct space, provided $\alpha > 2$.
5. Proof of Theorem 1

In Sect. 3 we have proved (to avoid confusion we now write the hats for the Fourier transforms) that $\hat u$ and $\hat v$ are continuous functions of $k$ and $t$ that satisfy the bounds
\begin{align}
|\hat u(k,t)| &\le \varepsilon\,\mu_{\alpha+1}(k,t), \notag\\
|\hat v(k,t)| &\le \frac{\varepsilon}{t^{1/2}}\,\mu_{\alpha+1}(k,t) + \varepsilon\,\bar\mu_{\alpha+1}(k,t). \tag{60}
\end{align}
For $\alpha > 0$ it follows in particular that the functions $t \mapsto \hat u(\,.\,,t)$ and $t \mapsto \hat v(\,.\,,t)$ are continuous functions of $t \ge 1$ with values in $L^1(\mathbb{R})$ that vanish at infinity in the sense that
\[ \lim_{t\to\infty} \|\hat u(\,.\,,t)\|_1 = \lim_{t\to\infty} \|\hat v(\,.\,,t)\|_1 = 0. \]
The Fourier transforms
\begin{align*}
u(x,y) &= \frac{1}{2\pi}\int_{\mathbb{R}} e^{-iky}\,\hat u(k,x)\,dk,\\
v(x,y) &= \frac{1}{2\pi}\int_{\mathbb{R}} e^{-iky}\,\hat v(k,x)\,dk,
\end{align*}
satisfy the bounds
\begin{align*}
\sup_{x \ge 1}\,\sup_{y \in \mathbb{R}} |u(x,y)| &\le \frac{1}{2\pi}\,\sup_{x \ge 1} \|\hat u(\,.\,,x)\|_1,\\
\sup_{x \ge 1}\,\sup_{y \in \mathbb{R}} |v(x,y)| &\le \frac{1}{2\pi}\,\sup_{x \ge 1} \|\hat v(\,.\,,x)\|_1,
\end{align*}
and therefore we can generalize the proof of the Riemann-Lebesgue lemma to show that $x \mapsto u(x,\,.\,)$ and $x \mapsto v(x,\,.\,)$ are continuous functions of $x \ge 1$ with values in $C_\infty(\mathbb{R})$ (the Banach space of continuous functions that vanish at infinity) and vanish at infinity in the sense that
\[ \lim_{x\to\infty}\,\sup_{y \in \mathbb{R}} |u(x,y)| = \lim_{x\to\infty}\,\sup_{y \in \mathbb{R}} |v(x,y)| = 0. \]
Since $C_\infty([1,\infty), C_\infty(\mathbb{R})) \equiv C_\infty([1,\infty) \times \mathbb{R})$ it follows that $u$ and $v$ converge to zero whenever $|x| + |y| \to \infty$ in $\Omega$ and hence satisfy the boundary condition (3). The reconstruction of the pressure from $u$ and $v$ is standard. For $\alpha > 2$ second derivatives of $u$ and $v$ are continuous in direct space, and one easily verifies using the definitions that the triple $(u, v, p)$ satisfies the Navier–Stokes equations (1). The details are left to the reader. The set $S$ in Theorem (1) is by definition the set of all vector fields $(u, v)$ obtained this way, restricted to $\Omega$. Note that the bounds (60) imply that
\begin{align}
\sup_{y \in \mathbb{R}} |u(x,y)| &\le \frac{\varepsilon}{x^{1/2}}, \tag{61}\\
\sup_{y \in \mathbb{R}} |v(x,y)| &\le \frac{\varepsilon}{x}. \tag{62}
\end{align}
The bound (62) proves (5). Finally, to prove (6) we use the integral equations (16) to get more detailed information on the asymptotic behavior of $u$. We first prove an identity for (7) (we again drop the hats):

Proposition 14. Let $\omega_+$, $\omega_-$, $u_+$, $u_-$ be a solution of the integral equation (16) and let $u$, $v$ be as defined in (15), and $c$ as defined in (7). Then,
\[ c \equiv u(0,1) + iv(0^+,1) = -i\,\partial_k\omega_-^*(0) + i\int_1^\infty \partial_k q(0,s)\,ds. \tag{63} \]
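Proof. Note that $c \equiv u(0,1) + iv(0^+,1)$ by definition (7). From (16) we find that
\begin{align*}
\partial_k\omega_-(0,t) &= \partial_k\omega_-^*(0) - \int_1^t \partial_k q(0,s)\,ds,\\
u_+(0,t) &= \frac{i}{2}\int_t^\infty \partial_k q(0,s)\,ds,\\
u_-(0,t) &= u_-^*(0) - \frac{i}{2}\int_1^t \partial_k q(0,s)\,ds,
\end{align*}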
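and from (15) we find that
\begin{align*}
u(0,t) &= -i\,\partial_k\omega_-(0,t) + u_+(0,t) + u_-(0,t)\\
&= -i\,\partial_k\omega_-^*(0) + \frac{i}{2}\int_1^\infty \partial_k q(0,s)\,ds + u_-^*(0) = u(0,1),\\
v(0^+,t) &= -i\,u_+(0,t) + i\,u_-(0,t)\\
&= i\,u_-^*(0) + \frac{1}{2}\int_1^\infty \partial_k q(0,s)\,ds = v(0^+,1),
\end{align*}
and (63) follows.

To prove (6) we proceed in several steps:

Proposition 15. Let $u_-^*$, $\omega_-^*$ and $q$ be as defined above. Then,
\[ \left| u(k,t) + \frac{i}{k}\,\Lambda_+\,\omega_-(k,t) \right| \le \varepsilon\left( \bar\mu_{\alpha+1}(k,t) + \frac{1}{t^{1/2}}\,\mu_{\alpha+1}(k,t) \right). \tag{64} \]

Proof. This is an immediate consequence of (27)–(32).

Note that (64) implies in direct space that all contributions to $u$ with the exception of the $\omega_-$-term are bounded by $O(1/x)$. It is therefore sufficient to prove a more detailed bound on $\frac{i}{k}\Lambda_+\omega_-$ in order to prove (6). We have

Proposition 16. Let $u_-^*$, $\omega_-^*$ and $q$ be as defined above, and let
\[ W_-(k,t) = -\frac{i}{k}\,\Lambda_+\left( \omega_-^*(k)\,e^{\Lambda_-(t-1)} - \frac{1}{\Lambda_0}\,e^{\Lambda_-(t-1)}\int_1^t q(k,s)\,ds \right). \tag{65} \]
Then,
\[ \left| -\frac{i}{k}\,\Lambda_+\,\omega_-(k,t) - W_-(k,t) \right| \le \frac{\varepsilon}{t^{1/2}}\,\mu_{\alpha+1}(k,t). \tag{66} \]

Proof. We first note that, for $1 \le s \le t$,
\[ 0 \le e^{\Lambda_-(t-s)} - e^{\Lambda_-(t-1)} \le \mathrm{const.}\; e^{\Lambda_-(t-s)}\,\frac{|\Lambda_-|(s-1)}{1+|\Lambda_-|(s-1)}. \]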
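The inequality above reduces to an elementary one. Writing $x = |\Lambda_-|(s-1) \ge 0$, the difference equals $e^{\Lambda_-(t-s)}(1 - e^{-x})$, and $1 - e^{-x} \le \min(1, x) \le 2x/(1+x)$. The Python sketch below checks this last bound numerically; the explicit constant 2 is our illustrative choice (the text only asserts "const."), and the grid is arbitrary.

```python
import math

def gap(x):
    # 2x/(1+x) - (1 - exp(-x)); expm1 keeps the second term accurate for small x.
    return 2.0 * x / (1.0 + x) + math.expm1(-x)

# Logarithmic grid for x = |Lambda_-| (s - 1), from 1e-8 to 1e8.
xs = [10.0 ** (-8.0 + 16.0 * i / 400) for i in range(401)]
min_gap = min(gap(x) for x in xs)
print(min_gap > 0.0)  # True: the bound holds with constant 2 on the grid
```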
Furthermore, proceeding as in the proof of Proposition 7 we find that const. µα+1 (k, 1)e− (t−1)/2 |− | min (t − 1), t 1/2 ≤ 1/2 µα+1 (k, t), t
and therefore + 1 |k|
t
e− (t−s) − e− (t−1) |q(k, s)| ds
1
(t+1)/2
|− | (s − 1) |q(k, s)| ds 1 + |− | (s − 1) |k| 1 t − (t−s) |− | (s − 1) |q(k, s)| + e ds 1 + |− | (s − 1) |k| (t+1)/2 t t (s − 1) ds ≤ εµα+1 (k, 1) |− | e− (t−1)/2 ds + εµ (k, t/2) α+1 3/2 3/2 s 1 t/2 s ≤ εµα+1 (k, 1)e− (t−1)/2 |− | min (t − 1)2 , t 1/2 t ds + εµα+1 (k, t/2) , 3/2 t/2 s
≤ const.
e− (t−s)
and (66) follows.

Note that (66) implies in direct space that, with the exception of the $W_-$ term, all contributions from $\frac{i}{k}\Lambda_+\omega_-$ to $u$ are bounded by $O(1/x)$. It is therefore sufficient to prove a more detailed bound on $W_-$ in order to prove (6). We have

Proposition 17. Let $u_-^*$, $\omega_-^*$, $q$ and $c$ be as defined above. Then,
\[ \lim_{t\to\infty} W_-\!\left(\frac{k}{t^{1/2}},\,t\right) = c\,e^{-k^2}. \tag{67} \]

Proof. The bound (67) is immediate using the definition of $W_-$ and (63).
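The Gaussian in (67) comes from the rescaled exponential factor $e^{\Lambda_-(k/t^{1/2})(t-1)}$ in $W_-$. The following Python sketch illustrates numerically that this factor approaches $e^{-k^2}$ as $t \to \infty$. It assumes the definition $\Lambda_-(k) = (1-\sqrt{1+4k^2})/2$, which is not restated in this section; the function names and sample points are illustrative only.

```python
import math

def lam_minus(k):
    # Assumed definition: Lambda_-(k) = (1 - sqrt(1 + 4 k^2)) / 2 (stable form).
    return -2.0 * k * k / (1.0 + math.sqrt(1.0 + 4.0 * k * k))

def rescaled_decay(k, t):
    # exp(Lambda_-(k / sqrt(t)) * (t - 1)): the k-dependent decay factor of W_-(k / sqrt(t), t).
    return math.exp(lam_minus(k / math.sqrt(t)) * (t - 1.0))

ks = [0.1 * i for i in range(31)]          # k in [0, 3]
ts = [1e2, 1e4, 1e6, 1e8]
errors = [max(abs(rescaled_decay(k, t) - math.exp(-k * k)) for k in ks) for t in ts]
print(errors)  # the maximal deviation from exp(-k^2) shrinks roughly like 1/t
```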
We can now complete the proof of Theorem 1. For $\alpha > 0$ we get from (65) that
\[ \left| W_-\!\left(\frac{k}{\sqrt{t}},\,t\right) \right| \le \varepsilon\,\mu_{\alpha+1}(k,1), \tag{68} \]
and therefore it follows from Proposition 17 by the Lebesgue dominated convergence theorem that $W_-(k/t^{1/2}, t)$ converges in $L^1(\mathbb{R})$ to $c\,e^{-k^2}$ as $t \to \infty$ and (6) follows.

Acknowledgements. It is a pleasure to thank Guillaume van Baalen for several helpful discussions on the subject of this paper and on some of the techniques involved, to Thierry Gallay for insisting on the importance of analyzing the vorticity, to Alain Schenkel for his detailed comments on a previous version of this manuscript, to Marius Mantoiu for helpful discussions on Functional Analysis and to the referee for pointing out that the limit (6) is actually uniform in y by virtue of (68).
References
1. van Baalen, G.: Stationary Solutions of the Navier–Stokes Equations in a Half-Plane Down-Stream of an Object: Universality of the Wake. mp-arc 01-69 (2001)
2. van Baalen, G., Schenkel, A., Wittwer, P.: Asymptotics of Solutions in nA + nB → C Reaction-Diffusion Systems. Commun. Math. Phys. 210, 145–196 (2000)
3. Batchelor, G.K.: An Introduction to Fluid Dynamics. Cambridge: Cambridge University Press, 1967
4. Berger, S.A.: Laminar Wakes. New York: American Elsevier Publishing Company, Inc., 1971
5. Bricmont, J., Kupiainen, A., Lin, G.: Renormalization Group and Asymptotics of Solutions of Nonlinear Parabolic Equations. Communications in Pure and Applied Mathematics 47, 893–922 (1994)
6. van Dyke, M.: Perturbation Methods in Fluid Mechanics. Stanford, CA: The Parabolic Press, 1975
7. Galdi, G.P.: An Introduction to the Mathematical Theory of the Navier–Stokes Equations, Volume 1 and Volume 2. Springer Tracts in Natural Philosophy 36–39. New York: Springer Verlag, 1994
8. Gallay, T.: A center-stable manifold theorem for differential equations in Banach spaces. Commun. Math. Phys. 152, 249–268 (1993)

Communicated by A. Kupiainen
Commun. Math. Phys. 226, 475 – 495 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Quantum Dynamical Yang–Baxter Equation Over a Nonabelian Base Ping Xu Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected] Received: 19 May 2001 / Accepted: 19 November 2001
Abstract: In this paper we consider dynamical r-matrices over a nonabelian base. There are two main results. First, corresponding to a fat reductive decomposition of a Lie algebra $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$, we construct geometrically a non-degenerate triangular dynamical r-matrix using symplectic fibrations. Second, we prove that a triangular dynamical r-matrix $r : \mathfrak{h}^* \to \wedge^2\mathfrak{g}$ naturally corresponds to a Poisson manifold $\mathfrak{h}^* \times G$. A special type of quantization of this Poisson manifold, called compatible star products in this paper, yields a generalized version of the quantum dynamical Yang–Baxter equation (or Gervais–Neveu–Felder equation). As a result, the quantization problem of a general dynamical r-matrix is proposed.

1. Introduction

Recently, there has been growing interest in the so-called quantum dynamical Yang–Baxter equation:
\[ R_{12}(\lambda)\,R_{13}(\lambda + \hbar h^{(2)})\,R_{23}(\lambda) = R_{23}(\lambda + \hbar h^{(1)})\,R_{13}(\lambda)\,R_{12}(\lambda + \hbar h^{(3)}). \tag{1} \]
This equation arises naturally from various contexts in mathematical physics. It first appeared in the work of Gervais–Neveu in their study of quantum Liouville theory [24]. Recently it reappeared in Felder's work on the quantum Knizhnik–Zamolodchikov–Bernard equation [23]. It also has been found to be connected with the quantum Calogero–Moser systems [4]. As the quantum Yang–Baxter equation is connected with quantum groups, the quantum dynamical Yang–Baxter equation is known to be connected with elliptic quantum groups [23], as well as with Hopf algebroids or quantum groupoids [20, 32, 33].

The classical counterpart of the quantum dynamical Yang–Baxter equation was first considered by Felder [23], and then studied by Etingof and Varchenko [19]. This is the so-called classical dynamical Yang–Baxter equation, and a solution to such an equation (plus some other reasonable conditions) is called a classical dynamical r-matrix. More precisely, given a Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ (or over $\mathbb{C}$) with an Abelian Lie subalgebra $\mathfrak{h}$, a classical dynamical r-matrix is a smooth (or meromorphic) function $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ satisfying the following conditions:
(i) (zero weight condition) $[h \otimes 1 + 1 \otimes h,\, r(\lambda)] = 0$, $\forall h \in \mathfrak{h}$;
(ii) (normal condition) $r_{12} + r_{21} = \Omega$, where $\Omega \in (S^2\mathfrak{g})^{\mathfrak{g}}$ is a Casimir element;
(iii) (classical dynamical Yang–Baxter equation$^1$)
\[ \operatorname{Alt}(dr) - \left([r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}]\right) = 0, \tag{2} \]
where $\operatorname{Alt} dr = \sum_i \left( h_i^{(1)}\frac{\partial r^{23}}{\partial\lambda^i} - h_i^{(2)}\frac{\partial r^{13}}{\partial\lambda^i} + h_i^{(3)}\frac{\partial r^{12}}{\partial\lambda^i} \right)$.

Research partially supported by NSF grant DMS00-72171.

A fundamental question is whether a classical dynamical r-matrix is always quantizable. There has appeared a lot of work in this direction, for example, see [2, 25, 18]. In the triangular case (i.e., r is skew-symmetric: $r_{12}(\lambda) + r_{21}(\lambda) = 0$), a general quantization scheme was developed by the author using the Fedosov method, which works for a vast class of dynamical r-matrices, called splittable triangular dynamical r-matrices [34]. Recently, Etingof and Nikshych, using the vertex-IRF transformation method, proved the existence of quantizations for the so-called completely degenerate triangular dynamical r-matrices [21].

Interestingly, although the quantum dynamical Yang–Baxter equation in [23] only makes sense when the base Lie algebra $\mathfrak{h}$ is Abelian, its classical counterpart admits an immediate generalization for any base Lie algebra $\mathfrak{h}$ which is not necessarily Abelian. Indeed, all one needs to do is to change the first condition (i) to:
(i') $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is $H$-equivariant, where $H$ acts on $\mathfrak{h}^*$ by coadjoint action and on $\mathfrak{g} \otimes \mathfrak{g}$ by adjoint action.

There exist many examples of such classical dynamical r-matrices. For instance, when $\mathfrak{g}$ is a simple Lie algebra and $\mathfrak{h}$ is a reductive Lie subalgebra containing the Cartan subalgebra, there is a classification due to Etingof–Varchenko [19]. In particular, when $\mathfrak{h} = \mathfrak{g}$, an explicit formula was discovered by Alekseev and Meinrenken in their study of non-commutative Weil algebras [1]. Later, this was generalized by Etingof and Schiffermann [17] to a more general context. Moreover, under some regularity condition, they showed that the moduli space of dynamical r-matrices essentially consists of a single point once the initial value of the dynamical r-matrices is fixed. A natural question arises as to what should be the quantum counterpart of these r-matrices. And more generally, is any classical dynamical r-matrix (with nonabelian base) quantizable?
A basic question is what the quantum dynamical Yang–Baxter equation should look like when $\mathfrak{h}$ is nonabelian. In this paper, as a toy model, we consider the special case of triangular dynamical r-matrices and their quantizations. As in the Abelian case, these r-matrices naturally correspond to some invariant Poisson structures on $\mathfrak{h}^* \times G$. It is standard that quantizations of Poisson structures correspond to star products [8]. The special form of the Poisson bracket relation on $\mathfrak{h}^* \times G$ suggests a specific form that their star products should take. This leads to our definition of compatible star products. The compatibility condition (which, in this case, is just the associativity) naturally leads to a quantum dynamical Yang–Baxter equation: Eq. (33). As we shall see, this equation indeed resembles the usual quantum dynamical Yang–Baxter equation (unsymmetrized version). The only difference is that the usual pointwise multiplication on $C^\infty(\mathfrak{h}^*)$ is replaced by the PBW-star product, which is indeed the deformation quantization of the canonical Lie–Poisson structure on $\mathfrak{h}^*$. Although Eq. (33) is derived by considering triangular dynamical r-matrices, it makes perfect sense for non-triangular ones as well. This naturally leads to our definition of quantization of dynamical r-matrices over an arbitrary base Lie subalgebra which is not necessarily Abelian. The problem is that such an equation only makes sense for $R : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$. In the Abelian case, it appears that one may consider $R$ valued in a deformed universal enveloping algebra $U_\hbar\mathfrak{g}$, but in most cases $U_\hbar\mathfrak{g}$ is isomorphic to $U\mathfrak{g}[[\hbar]]$ as an algebra. So Eq. (33), in a certain sense, is general enough to include all the interesting cases. However, the physical meaning of this equation remains mysterious.

$^1$ Throughout the paper, we follow the sign convention in [4] for the definition of a classical dynamical r-matrix in order to be consistent with the quantum dynamical Yang–Baxter equation (1). This differs in sign from the one used in [19].

Another main result of the paper is to give a geometric construction of triangular dynamical r-matrices. More precisely, we give an explicit construction of a triangular dynamical r-matrix from a fat reductive decomposition of a Lie algebra $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$ (see Sect. 2 for the definition). This includes those examples of triangular dynamical r-matrices considered in [19]. Our main purpose is to show that triangular dynamical r-matrices (with nonabelian base) do arise naturally from symplectic geometry. This gives us another reason why it is important to consider their quantizations. Discussion of this part occupies Sect. 2. Section 3 is devoted to the discussion of compatible star products, whose associativity leads to a "twisted-cocycle" condition. In Sect. 4, we will derive the quantum dynamical Yang–Baxter equation from this twisted-cocycle condition. The last section contains some concluding remarks and open questions.
Finally, we note that in this paper, by a dynamical r-matrix, we always mean a dynamical r-matrix over a general base Lie subalgebra unless specified. Also Lie algebras are normally assumed to be over $\mathbb{R}$, although most results can be easily modified for complex Lie algebras. For simplicity, in this paper we assume that a dynamical r-matrix is defined on $\mathfrak{h}^*$. In reality, it may only be defined on an open submanifold $U \subseteq \mathfrak{h}^*$.

2. Classical Dynamical r-Matrices

In this section, we will give a geometric construction of triangular dynamical r-matrices. As we shall see, these r-matrices do arise naturally from symplectic geometry. We will show some interesting examples, which include triangular dynamical r-matrices for simple Lie algebras constructed by Etingof–Varchenko [19].

Below let us recall the definition of a classical triangular dynamical r-matrix. Let $\mathfrak{g}$ be a Lie algebra over $\mathbb{R}$ (or $\mathbb{C}$) and $\mathfrak{h} \subset \mathfrak{g}$ be a Lie subalgebra. A classical dynamical r-matrix $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$ is said to be triangular if it is skew symmetric: $r_{12} + r_{21} = 0$. In other words, a classical triangular dynamical r-matrix is a smooth function (or meromorphic function in the complex case) $r : \mathfrak{h}^* \to \wedge^2\mathfrak{g}$ such that
(i) $r : \mathfrak{h}^* \to \wedge^2\mathfrak{g}$ is $H$-equivariant, where $H$ acts on $\mathfrak{h}^*$ by coadjoint action and acts on $\wedge^2\mathfrak{g}$ by adjoint action.
(ii)
\[ \sum_i h_i \wedge \frac{\partial r}{\partial\lambda^i} - \frac{1}{2}[r, r] = 0, \tag{3} \]
where the bracket $[\cdot\,,\cdot]$ refers to the Schouten type bracket: $\wedge^k\mathfrak{g} \otimes \wedge^l\mathfrak{g} \to \wedge^{k+l-1}\mathfrak{g}$ induced from the Lie algebra bracket on $\mathfrak{g}$, $\{h_1, \ldots, h_l\}$ is a basis of $\mathfrak{h}$, and $(\lambda^1, \ldots, \lambda^l)$ its induced coordinate system on $\mathfrak{h}^*$.
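The following proposition gives an alternative description of a classical triangular dynamical r-matrix.

Proposition 2.1. A smooth function $r : \mathfrak{h}^* \to \wedge^2\mathfrak{g}$ is a triangular dynamical r-matrix iff
\[ \pi = \pi_{\mathfrak{h}^*} + \sum_i \frac{\partial}{\partial\lambda^i} \wedge \overrightarrow{h_i} + \overrightarrow{r(\lambda)} \]
is a Poisson tensor on $M = \mathfrak{h}^* \times G$, where $\pi_{\mathfrak{h}^*}$ denotes the standard Lie (also known as Kirillov–Kostant) Poisson tensor on the Lie algebra dual $\mathfrak{h}^*$, $\overrightarrow{h_i} \in \mathfrak{X}(M)$ is the left invariant vector field on $M$ generated by $h_i \in \mathfrak{h}$, and similarly $\overrightarrow{r(\lambda)} \in \Gamma(\wedge^2 TM)$ is the left invariant bivector field on $M$ corresponding to $r(\lambda)$.

Proof. Set
\[ \pi_1 = \pi_{\mathfrak{h}^*} + \sum_i \frac{\partial}{\partial\lambda^i} \wedge \overrightarrow{h_i}. \]
Then $\pi = \pi_1 + \overrightarrow{r(\lambda)}$. Note that, for any $(\lambda, x)$, $\pi_1|_{(\lambda,x)}$ is tangent to $\mathfrak{h}^* \times xH$, on which it is isomorphic to the standard Poisson (symplectic) structure on the cotangent bundle $T^*H$ (see, e.g., [27]). Here $T^*H$ is identified with $\mathfrak{h}^* \times H$ (hence with $\mathfrak{h}^* \times xH$) via left translations. It thus follows that $[\pi_1, \pi_1] = 0$. Therefore
\[ [\pi, \pi] = 2[\pi_1, \overrightarrow{r(\lambda)}] + [\overrightarrow{r(\lambda)}, \overrightarrow{r(\lambda)}]. \]
Now
\begin{align*}
[\pi_1, \overrightarrow{r(\lambda)}] &= [\pi_{\mathfrak{h}^*}, \overrightarrow{r(\lambda)}] + \Big[\sum_i \frac{\partial}{\partial\lambda^i} \wedge \overrightarrow{h_i},\ \overrightarrow{r(\lambda)}\Big]\\
&= [\pi_{\mathfrak{h}^*}, \overrightarrow{r(\lambda)}] + \sum_i \Big[\overrightarrow{r(\lambda)}, \frac{\partial}{\partial\lambda^i}\Big] \wedge \overrightarrow{h_i} - \sum_i \frac{\partial}{\partial\lambda^i} \wedge [\overrightarrow{r(\lambda)}, \overrightarrow{h_i}].
\end{align*}
Hence $[\pi, \pi] = I_1 + I_2$, where
\begin{align*}
I_1 &= 2\sum_i \Big[\overrightarrow{r(\lambda)}, \frac{\partial}{\partial\lambda^i}\Big] \wedge \overrightarrow{h_i} + [\overrightarrow{r(\lambda)}, \overrightarrow{r(\lambda)}], \quad\text{and}\\
I_2 &= 2[\pi_{\mathfrak{h}^*}, \overrightarrow{r(\lambda)}] - 2\sum_i \frac{\partial}{\partial\lambda^i} \wedge [\overrightarrow{r(\lambda)}, \overrightarrow{h_i}].
\end{align*}
With respect to the natural bigrading on $\wedge^3 T(\mathfrak{h}^* \times G)$, $I_1$ and $I_2$ correspond to the $(0, 3)$ and $(1, 2)$-terms of $[\pi, \pi]$, respectively. It thus follows that $[\pi, \pi] = 0$ iff $I_1 = 0$ and $I_2 = 0$. It is simple to see that
\[ I_1 = -2\sum_i \overrightarrow{h_i} \wedge \overrightarrow{\frac{\partial r}{\partial\lambda^i}} + \overrightarrow{[r(\lambda), r(\lambda)]}. \]
Hence $I_1 = 0$ is equivalent to Eq. (3).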
479 1 2
ij
fij (λ) ∂λ∂ i ∧
∂ ∂λj
(fij =
− → ∂ ∂ −−−−−→ ∂ r ∧ fij (λ) j + 2 ∧ [hi , r(λ)]. ∂λi ∂λ ∂λi i
j
Thus I2 = 0 is equivalent to [hi , r(λ)] = −
j
fij (λ)
i
∂r(λ) d ∗ = r(Adexp −1 th λ), ∀i, i ∂λj dt t=0
which means exactly that r is H -equivariant. This concludes the proof.
Remark. Note that $M$ ($= \mathfrak{h}^* \times G$) admits a left $G$-action and a right $H$-action defined as follows: $\forall (\lambda, x) \in \mathfrak{h}^* \times G$,
\begin{align*}
y \cdot (\lambda, x) &= (\lambda, yx), \quad \forall y \in G;\\
(\lambda, x) \cdot y &= (\mathrm{Ad}_y^*\lambda, xy), \quad \forall y \in H.
\end{align*}
It is clear that the Poisson structure $\pi$ is invariant under both actions. And, in short, we will say that $\pi$ is $G \times H$-invariant.

Definition 2.2. A classical triangular dynamical r-matrix $r : \mathfrak{h}^* \to \wedge^2\mathfrak{g}$ is said to be non-degenerate if the corresponding Poisson structure $\pi$ on $M$ is non-degenerate, i.e., symplectic.

In what follows, we will give a geometric construction of non-degenerate dynamical r-matrices. To this end, let us first recall a useful construction of a symplectic manifold from a fat principal bundle [26, 31]. A principal bundle $P(M, H)$ with a connection is called fat on an open submanifold $U \subseteq \mathfrak{h}^*$ if the scalar-valued form $\langle\lambda, \Omega\rangle$ is non-degenerate on each horizontal space in $TP$ for $\lambda \in U$. Here $\Omega$ is the curvature form, which is a tensorial form of type $\mathrm{Ad}_H$ on $P$ (i.e., it is horizontal, $\mathfrak{h}$-valued, and $\mathrm{Ad}_H$-equivariant).

Given a fat bundle $P(M, H)$ with a connection, one has a decomposition of the tangent bundle $TP = \mathrm{Vert}(P) \oplus \mathrm{Hor}(P)$. We may identify $\mathrm{Vert}(P)$ with a trivial bundle with fiber $\mathfrak{h}$. Thus $\mathrm{Vert}^*P \cong \mathfrak{h}^* \times P$. On the other hand, $\mathrm{Vert}^*P \cong \mathrm{Hor}^\perp(P) \subset T^*P$. Thus, by pulling back the canonical symplectic structure on $T^*P$, one can equip $\mathrm{Vert}^*P$, hence $\mathfrak{h}^* \times P$, an $H$-invariant presymplectic structure, where $H$ acts on $\mathfrak{h}^* \times P$ by $(\lambda, x) \cdot h = (\mathrm{Ad}_h^*\lambda, x \cdot h)$, $\forall h \in H$ and $(\lambda, x) \in \mathfrak{h}^* \times P$. If $U \subseteq \mathfrak{h}^*$ is an open submanifold on which $P(M, H)$ is fat, then we obtain an $H$-invariant symplectic manifold $U \times P$. In fact, the presymplectic form $\omega$ can be described explicitly. Note that $\mathrm{Vert}^*P$ admits a natural fibration with $T^*H$ being the fibers, and the connection on $P$ induces a connection on this fiber bundle. In other words, $\mathrm{Vert}^*P$ is a symplectic fibration in the sense of Guillemin–Lerman–Sternberg [26].

At any point $(\lambda, x) \in \mathfrak{h}^* \times P \cong \mathrm{Vert}^*P$, the presymplectic form $\omega$ can be described as follows: it restricts to the canonical two-form on the fiber; the vertical subspace is $\omega$-orthogonal to the horizontal subspace; and the horizontal subspace is isomorphic to the horizontal subspace of $T_xP$ and the restriction of $\omega$ to this subspace is the two form $-\langle\lambda, \Omega(x)\rangle$ obtained by pairing the curvature form with $\lambda$ (see Examples 2.2–2.3 in [26]).

Now assume that
\[ \mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m} \tag{4} \]
is a reductive decomposition of a Lie algebra g, i.e., h is a Lie subalgebra and m is stable under the adjoint action of h: [h, m] ⊂ m. By G, we denote a Lie group with Lie algebra g, and H the Lie subgroup corresponding to h. It is standard [28] that the decomposition (4) induces a left G-invariant connection on the principal bundle G(G/H, H ), where the curvature is given by (X, Y ) = −[X, Y ]h ,
h − component of [X, Y ] ∈ g.
(5)
Here X and Y are arbitrary left invariant vector fields on G belonging to m. A reductive decomposition g = h ⊕ m is said to be fat if the corresponding principal bundle G(G/H, H ) is fat on an open submanifold U ⊆ h∗ . As a consequence, a fat decomposition g = h ⊕ m gives rise to a G × H -invariant symplectic structure on M = U ×G, where the symplectic structure is the restriction of the canonical symplectic form on T ∗ G. In other words, M is a symplectic submanifold of T ∗ G. Here the embedding U × G ⊆ h∗ × G −→ g∗ × G (∼ = T ∗ G) is given by the natural inclusion (λ, x) −→ (pr ∗ λ, x), where pr : g −→ h is the projection along the decomposition g = h ⊕ m. Since the symplectic structure ω on U × G is left invariant, in order to describe ω explicitly, it suffices to specify it at a point (λ, 1). Now T(λ,1) (U × G) ∼ = h∗ ⊕ g = ∗ h ⊕ h ⊕ m. Under this identification, we have ω = ω1 ⊕ ω2 , where ω1 ∈ 2 (h∗ ⊕ h) is the canonical symplectic two-form on T ∗ H at the point (λ, 1) ∈ h∗ × H (∼ = T ∗ H ), 2 and ω2 ∈ (m) is given by ω2 (X, Y ) = λ, [X, Y ]h , ∀X, Y ∈ m. Let r(λ) ∈ ∧2 m be the inverse of ω2 , which always exists for λ ∈ U since ω2 is assumed to be non-degenerate on U . It thus follows that the Poisson structure on U × G is ∂ − → −→ π = πh ∗ + ∧ hi + r(λ). ∂λi i
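According to Proposition 2.1, $r : U \to \wedge^2 \mathfrak{m} \subset \wedge^2 \mathfrak{g}$ is a non-degenerate triangular dynamical r-matrix. Thus we have proved

Theorem 2.3. Assume that $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$ is a reductive decomposition which is fat on an open submanifold $U \subseteq \mathfrak{h}^*$. Then the dual of the linear map $\varphi : \wedge^2 \mathfrak{m} \to \mathfrak{h}$, $(X, Y) \mapsto [X, Y]_{\mathfrak{h}}$, $\forall X, Y \in \mathfrak{m}$, defines a non-degenerate triangular dynamical r-matrix $r : U\ (\subseteq \mathfrak{h}^*) \to \wedge^2 \mathfrak{m} \subset \wedge^2 \mathfrak{g}$, $\forall \lambda \in U$. Here $\mathfrak{m}^*$ is identified with $\mathfrak{m}$ using the non-degenerate bilinear form $\varphi^*(\lambda) \in \wedge^2 \mathfrak{m}^*$.

It is often more useful to express $r(\lambda)$ explicitly in terms of a basis. To this end, let us choose a basis $\{e_1, \ldots, e_m\}$ of $\mathfrak{m}$. Let $a_{ij}(\lambda) = \langle \lambda, [e_i, e_j]_{\mathfrak{h}} \rangle$, $i, j = 1, \ldots, m$. By $(c_{ij}(\lambda))$ we denote the inverse of the matrix $(a_{ij}(\lambda))$, $\forall \lambda \in U$. Then one has
$$r(\lambda) = \frac{1}{2} \sum_{ij} c_{ij}(\lambda)\, e_i \wedge e_j. \tag{6}$$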
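Formula (6) is easy to check numerically in small cases. The following is a minimal sketch, not part of the original construction: the helper names `invert` and `r_matrix_coefficients` and the sample values are our own, and the matrices fed to them assume the brackets of a pair of dual root vectors (a single block with entry $(\lambda, \alpha)$) and of Heisenberg pairs $(p_i, q_i)$ with central value $x$.

```python
from fractions import Fraction


def invert(a):
    """Invert a square matrix (entries convertible to Fraction) by Gauss-Jordan elimination."""
    n = len(a)
    # Augment the matrix with the identity.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Pick a row with a non-zero pivot; raises StopIteration if the matrix is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def r_matrix_coefficients(a):
    """Coefficients of e_i ^ e_j (i < j) in r = 1/2 * sum_ij c_ij e_i ^ e_j, with c = a^{-1}."""
    c = invert(a)
    n = len(a)
    # Since e_j ^ e_i = -e_i ^ e_j, the coefficient of e_i ^ e_j for i < j is (c_ij - c_ji) / 2.
    return {(i, j): (c[i][j] - c[j][i]) / 2
            for i in range(n) for j in range(i + 1, n)}


# One pair of dual root vectors with (lambda, alpha) = 3 (illustrative value).
print(r_matrix_coefficients([[0, 3], [-3, 0]]))
```

With the basis ordered as $(e_\alpha, e_{-\alpha})$, the single coefficient returned is the coefficient of $e_\alpha \wedge e_{-\alpha}$, which can be compared with the closed formulas in the examples of this section.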
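Remark. (i) After the completion of the first draft, we learned that a similar formula was also obtained independently by Etingof [15]. Note that this dynamical r-matrix $r$ is always singular at 0. To remove this singularity, one needs to make a shift of the dynamical parameter $\lambda \to \lambda - \mu$. (ii) It would be interesting to compare our formula with Theorem 3 in [17].

We end this section with some examples.

Example 2.1. Let $\mathfrak{g}$ be a simple Lie algebra over $\mathbb{C}$ and $\mathfrak{h}$ a Cartan subalgebra. Let
$$\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta_+} (\mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha})$$
be the root space decomposition, where $\Delta_+$ is the set of positive roots with respect to $\mathfrak{h}$. Take $\mathfrak{m} = \bigoplus_{\alpha \in \Delta_+} (\mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha})$. Then $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$ is clearly a reductive decomposition. Let $e_\alpha \in \mathfrak{g}_\alpha$ and $e_{-\alpha} \in \mathfrak{g}_{-\alpha}$ be dual vectors with respect to the Killing form: $(e_\alpha, e_{-\alpha}) = 1$. For any $\lambda \in \mathfrak{h}^*$, set $a_{\alpha\beta}(\lambda) = \langle \lambda, [e_\alpha, e_\beta]_{\mathfrak{h}} \rangle$, $\forall \alpha, \beta \in \Delta_+ \cup (-\Delta_+)$. It is then clear that $a_{\alpha\beta}(\lambda) = 0$ whenever $\alpha + \beta \neq 0$, and $a_{\alpha,-\alpha}(\lambda) = \langle \lambda, [e_\alpha, e_{-\alpha}]_{\mathfrak{h}} \rangle = (\lambda, \alpha)(e_\alpha, e_{-\alpha}) = (\lambda, \alpha)$. Therefore, from Theorem 2.3 and Eq. (6), it follows that
$$r(\lambda) = -\sum_{\alpha \in \Delta_+} \frac{1}{(\lambda, \alpha)}\, e_\alpha \wedge e_{-\alpha}$$
is a non-degenerate triangular dynamical r-matrix, so we have recovered this standard example in [19].

Example 2.2. As in the above example, let $\mathfrak{g}$ be a simple Lie algebra over $\mathbb{C}$ with a fixed Cartan subalgebra $\mathfrak{h}$, and $\mathfrak{l}$ a reductive Lie subalgebra containing $\mathfrak{h}$. There is a subset $\Delta(\mathfrak{l})_+$ of $\Delta_+$ such that
$$\mathfrak{l} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta(\mathfrak{l})_+} (\mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha}).$$
Let $\overline{\Delta}_+ = \Delta_+ - \Delta(\mathfrak{l})_+$, $\Delta(\mathfrak{l}) = \Delta(\mathfrak{l})_+ \cup (-\Delta(\mathfrak{l})_+)$, and $\overline{\Delta} = \overline{\Delta}_+ \cup (-\overline{\Delta}_+)$, and denote by $\mathfrak{m}$ the subspace of $\mathfrak{g}$:
$$\mathfrak{m} = \bigoplus_{\alpha \in \overline{\Delta}_+} (\mathfrak{g}_\alpha \oplus \mathfrak{g}_{-\alpha}).$$
It is simple to see that $\mathfrak{g} = \mathfrak{l} \oplus \mathfrak{m}$ is indeed a fat reductive decomposition, and therefore induces a non-degenerate triangular dynamical r-matrix $r : \mathfrak{l}^* \to \wedge^2 \mathfrak{g}$. To describe $r$ explicitly, we note that the dual space $\mathfrak{l}^*$ admits a natural decomposition
$$\mathfrak{l}^* = \mathfrak{h}^* \oplus \bigoplus_{\alpha \in \Delta(\mathfrak{l})_+} (\mathfrak{g}^*_\alpha \oplus \mathfrak{g}^*_{-\alpha}).$$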
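Hence any element $\mu \in \mathfrak{l}^*$ can be written as $\mu = \lambda \oplus \bigoplus_{\alpha \in \Delta(\mathfrak{l})} \xi_\alpha$, where $\lambda \in \mathfrak{h}^*$ and $\xi_\alpha \in \mathfrak{g}^*_\alpha$. Let $a_{\alpha\beta}(\mu) = \langle \mu, [e_\alpha, e_\beta]_{\mathfrak{l}} \rangle$, $\forall \alpha, \beta \in \overline{\Delta}$. It is easy to see that
$$a_{\alpha\beta}(\mu) = \begin{cases} (\lambda, \alpha), & \text{if } \alpha + \beta = 0; \\ \langle \xi_\gamma, [e_\alpha, e_\beta] \rangle, & \text{if } \alpha + \beta = \gamma \in \Delta(\mathfrak{l}); \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$
By $(c_{\alpha\beta}(\mu))$, we denote the inverse matrix of $(a_{\alpha\beta}(\mu))$. According to Eq. (6),
$$r(\mu) = \frac{1}{2} \sum_{\alpha, \beta \in \overline{\Delta}} c_{\alpha\beta}(\mu)\, e_\alpha \wedge e_\beta$$
is a non-degenerate triangular dynamical r-matrix over $\mathfrak{l}^*$. In particular, if $\mu = \lambda \in \mathfrak{h}^*$, it follows immediately that
$$r(\lambda) = -\sum_{\alpha \in \overline{\Delta}_+} \frac{1}{(\lambda, \alpha)}\, e_\alpha \wedge e_{-\alpha}. \tag{8}$$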
Equation (8) was first obtained by Etingof–Varchenko in [19]. The following example was pointed out to us by D. Vogan.

Example 2.4. Let $\mathfrak{g} = \mathbb{R}^{m+n} \oplus \mathbb{R}^{m+n} \oplus \mathbb{R}$ be a $2(m+n)+1$ dimensional Heisenberg Lie algebra and $\mathfrak{h} = \mathbb{R}^n \oplus \mathbb{R}^n \oplus \mathbb{R}$ its standard Heisenberg Lie subalgebra. By $\{p_i, q_i, c\}$, $i = 1, \ldots, n+m$, we denote the standard generators of $\mathfrak{g}$ and by $\{p_{m+i}, q_{m+i}, c\}$, $i = 1, \ldots, n$, the generators of $\mathfrak{h}$. Let $\mathfrak{m}$ be the subspace of $\mathfrak{g}$ generated by $\{p_i, q_i\}$, $i = 1, \ldots, m$. It is then clear that $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{m}$ is a reductive decomposition. Let $\{p_i^*, q_i^*, c^*\}$, $i = 1, \ldots, n+m$, be the dual basis corresponding to the standard generators of $\mathfrak{g}$. For any $\lambda \in \mathfrak{h}^*$, write $\lambda = \sum_{i=1}^n (a_i p^*_{m+i} + b_i q^*_{m+i}) + x c^*$. This induces a coordinate system on $\mathfrak{h}^*$, and therefore a function on $\mathfrak{h}^*$ can be identified with a function with variables $(a_i, b_i, x)$. It is clear that $\omega(p_i, q_j)(\lambda) = \langle \lambda, [p_i, q_j]_{\mathfrak{h}} \rangle = x \delta_{ij}$ and $\omega(p_i, p_j) = \omega(q_i, q_j) = 0$, $\forall i, j = 1, \ldots, m$. It thus follows that
$$r(a_i, b_i, x) = -\frac{1}{x} \sum_{i=1}^m p_i \wedge q_i : \mathfrak{h}^* \to \wedge^2 \mathfrak{g}$$
is a non-degenerate triangular dynamical r-matrix.

3. Compatible Star Products

From Proposition 2.1, we know that a triangular dynamical r-matrix $r : \mathfrak{h}^* \to \wedge^2 \mathfrak{g}$ is equivalent to a special type of Poisson structure on $\mathfrak{h}^* \times G$. It is thus very natural to expect that quantization of $r$ can be derived from a certain special type of star-product on $\mathfrak{h}^* \times G$. It is simple to see that the Poisson brackets on $C^\infty(\mathfrak{h}^* \times G)$ can be described as follows:
(i) for any $f, g \in C^\infty(\mathfrak{h}^*)$, $\{f, g\} = \{f, g\}_{\pi_{\mathfrak{h}^*}}$;
(ii) for any $f \in C^\infty(\mathfrak{h}^*)$ and $g \in C^\infty(G)$, $\{f, g\} = \sum_i \left(\frac{\partial f}{\partial \lambda^i}\right)(\overrightarrow{h_i}\, g)$;
(iii) for any $f, g \in C^\infty(G)$, $\{f, g\} = \overrightarrow{r(\lambda)}(f, g)$.
These Poisson bracket relations naturally motivate the following:

Definition 3.1. A star product $*_\hbar$ on $M = \mathfrak{h}^* \times G$ is called a compatible star product if
(i) for any $f, g \in C^\infty(\mathfrak{h}^*)$,
$$f(\lambda) *_\hbar g(\lambda) = f(\lambda) * g(\lambda); \tag{9}$$
(ii) for any $f(x) \in C^\infty(G)$ and $g(\lambda) \in C^\infty(\mathfrak{h}^*)$,
$$f(x) *_\hbar g(\lambda) = f(x) g(\lambda); \tag{10}$$
(iii) for any $f(\lambda) \in C^\infty(\mathfrak{h}^*)$ and $g(x) \in C^\infty(G)$,
$$f(\lambda) *_\hbar g(x) = \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k f}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g; \tag{11}$$
(iv) for any $f(x), g(x) \in C^\infty(G)$,
$$f(x) *_\hbar g(x) = \overrightarrow{F(\lambda)}(f, g), \tag{12}$$
where $F(\lambda)$ is a smooth function $F : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ such that $F = 1 + \hbar F_1 + O(\hbar^2)$.

Here $*$ denotes the standard PBW-star product on $\mathfrak{h}^*$ quantizing the canonical Lie–Poisson structure (see [12]), whose definition is recalled below. Let $\mathfrak{h}_\hbar = \mathfrak{h}[[\hbar]]$ be a Lie algebra with the Lie bracket $[X, Y]_\hbar = \hbar [X, Y]$, $\forall X, Y \in \mathfrak{h}[[\hbar]]$, and let $\sigma : S(\mathfrak{h})[[\hbar]] \cong U\mathfrak{h}_\hbar$ be the Poincaré–Birkhoff–Witt map, which is a vector space isomorphism. Thus the multiplication on $U\mathfrak{h}_\hbar$ induces a multiplication on $S(\mathfrak{h})[[\hbar]]$ $(\cong \mathrm{Pol}(\mathfrak{h}^*)[[\hbar]])$, hence on $C^\infty(\mathfrak{h}^*)[[\hbar]]$, which is denoted by $*$. It is easy to check that $*$ satisfies
$$f * g = fg + \frac{1}{2} \hbar \{f, g\}_{\pi_{\mathfrak{h}^*}} + \sum_{k \geq 2} \hbar^k B_k(f, g), \quad \forall f, g \in C^\infty(\mathfrak{h}^*),$$
where the $B_k$'s are bidifferential operators. In other words, $*$ is indeed a star product on $\mathfrak{h}^*$, which is called the PBW-star product. The following proposition is quite obvious.

Proposition 3.2. The classical limit of a compatible star product is the Poisson structure $\pi = \pi_{\mathfrak{h}^*} + \sum_i \frac{\partial}{\partial \lambda^i} \wedge \overrightarrow{h_i} + \overrightarrow{r(\lambda)}$, where $r(\lambda) = F_{12}(\lambda) - F_{21}(\lambda)$.

Below we will study some important properties of compatible star products.
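As a quick consistency check of the expansion of the PBW-star product, here is a worked example of our own. It assumes that the PBW map $\sigma$ is the symmetrization map and that the Lie–Poisson bracket of two linear functions is $\{X, Y\}_{\pi_{\mathfrak{h}^*}} = [X, Y]$ (a sign convention the text does not spell out). Take $X, Y \in \mathfrak{h}$, viewed as linear functions on $\mathfrak{h}^*$:

```latex
% Computation inside U h_hbar, where the bracket is [X,Y]_hbar = hbar [X,Y].
\begin{aligned}
\sigma(X)\,\sigma(Y)
  &= \tfrac12\bigl(\sigma(X)\sigma(Y) + \sigma(Y)\sigma(X)\bigr)
     + \tfrac12\bigl(\sigma(X)\sigma(Y) - \sigma(Y)\sigma(X)\bigr) \\
  &= \sigma(XY) + \tfrac{\hbar}{2}\,\sigma([X,Y]), \\
\text{hence}\quad X * Y
  &= XY + \tfrac{\hbar}{2}\,[X,Y]
   = XY + \tfrac{\hbar}{2}\,\{X,Y\}_{\pi_{\mathfrak{h}^*}}, \\
X * Y - Y * X &= \hbar\,[X,Y].
\end{aligned}
```

Under these assumptions the series stops at first order on linear functions, which agrees with the leading terms of the general expansion of $f * g$.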
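Proposition 3.3. A compatible star product is always invariant under the left $G$-action. It is right $H$-invariant iff $F : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ is $H$-equivariant, where $H$ acts on $\mathfrak{h}^*$ by the coadjoint action and on $U\mathfrak{g} \otimes U\mathfrak{g}$ by the adjoint action.

Proof. First of all, note that Eqs. (9–12) completely determine a star product. It is clear from these equations that $*_\hbar$ is left $G$-invariant. As for the right $H$-action, it is obvious from Eq. (10) that $*_\hbar$ is invariant for $f(x) *_\hbar g(\lambda)$. It is standard that $*$ is invariant under the coadjoint action, so it follows from Eq. (9) that $f(\lambda) *_\hbar g(\lambda)$ is also $H$-invariant. For any $h \in \mathfrak{h}$, $g(x) \in C^\infty(G)$ and any fixed $y \in H$,
$$\begin{aligned} \overrightarrow{h}(R_y^* g)(x) &= (L_x h)(R_y^* g) = (R_y L_x h)(g) = (L_{xy} \mathrm{Ad}_{y^{-1}} h)(g) \\ &= (\overrightarrow{\mathrm{Ad}_{y^{-1}} h}\, g)(xy) \\ &= [R_y^* (\overrightarrow{\mathrm{Ad}_{y^{-1}} h}\, g)](x). \end{aligned}$$
Thus it follows that
$$\overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}} (R_y^* g) = R_y^* (\overrightarrow{h'_{i_1}} \cdots \overrightarrow{h'_{i_k}}\, g), \tag{13}$$
where $h'_i = \mathrm{Ad}_{y^{-1}} h_i$, $i = 1, \ldots, n$. Let $\xi'_i = \mathrm{Ad}_y^* \xi_i$, $i = 1, \ldots, n$. Then $\{\xi'_1, \ldots, \xi'_l\}$ is a dual basis for $\{h'_1, \ldots, h'_l\}$. Let $(\lambda'^1, \ldots, \lambda'^l)$ be its corresponding induced coordinates on $\mathfrak{h}^*$. Then
$$\begin{aligned} \frac{\partial}{\partial \lambda^i} \bigl((\mathrm{Ad}_y^*)^* f\bigr) &= \frac{d}{dt}\Big|_{t=0} \bigl((\mathrm{Ad}_y^*)^* f\bigr)(\lambda + t \xi_i) \\ &= \frac{d}{dt}\Big|_{t=0} f(\mathrm{Ad}_y^* \lambda + t\, \mathrm{Ad}_y^* \xi_i) \\ &= \frac{d}{dt}\Big|_{t=0} f(\mathrm{Ad}_y^* \lambda + t \xi'_i) \\ &= \frac{\partial f}{\partial \lambda'^i} (\mathrm{Ad}_y^* \lambda) \\ &= (\mathrm{Ad}_y^*)^* \frac{\partial f}{\partial \lambda'^i}. \end{aligned}$$
Hence
$$\frac{\partial^k [(\mathrm{Ad}_y^*)^* f]}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} = (\mathrm{Ad}_y^*)^* \Bigl[ \frac{\partial^k f}{\partial \lambda'^{i_1} \cdots \partial \lambda'^{i_k}} \Bigr]. \tag{14}$$
From Eq. (11), it follows that for any $f(\lambda) \in C^\infty(\mathfrak{h}^*)$ and $g(x) \in C^\infty(G)$,
$$\begin{aligned} (R_y^* f)(\lambda) *_\hbar (R_y^* g)(x) &= \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k [(\mathrm{Ad}_y^*)^* f]}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}} (R_y^* g) \\ &= \sum_{k=0}^\infty \frac{\hbar^k}{k!} (\mathrm{Ad}_y^*)^* \Bigl[ \frac{\partial^k f}{\partial \lambda'^{i_1} \cdots \partial \lambda'^{i_k}} \Bigr] R_y^* [\overrightarrow{h'_{i_1}} \cdots \overrightarrow{h'_{i_k}}\, g] \quad \text{(by Eqs. (13–14))} \\ &= R_y^* (f(\lambda) *_\hbar g(x)). \end{aligned}$$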
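I.e., $f(\lambda) *_\hbar g(x)$ is also right $H$-invariant. Finally, $\forall f(x), g(x) \in C^\infty(G)$,
$$\begin{aligned} (R_y^* (f *_\hbar g))(\lambda, x) &= (f *_\hbar g)(\mathrm{Ad}_y^* \lambda, xy) \\ &= \overrightarrow{F(\mathrm{Ad}_y^* \lambda)}(f, g)(xy) \\ &= [L_{xy} (F(\mathrm{Ad}_y^* \lambda))](f, g). \end{aligned}$$
On the other hand,
$$\begin{aligned} (R_y^* f *_\hbar R_y^* g)(\lambda, x) &= \overrightarrow{F(\lambda)}(R_y^* f, R_y^* g)(x) \\ &= (L_x F(\lambda))(R_y^* f, R_y^* g) \\ &= (R_y L_x F(\lambda))(f, g). \end{aligned}$$
Therefore $R_y^* (f *_\hbar g) = R_y^* f *_\hbar R_y^* g$ iff $L_{xy} (F(\mathrm{Ad}_y^* \lambda)) = R_y L_x F(\lambda)$. The latter is equivalent to $F(\mathrm{Ad}_y^* \lambda) = \mathrm{Ad}_{y^{-1}} F(\lambda)$, i.e., $F$ is $H$-equivariant. This concludes the proof.

In order to give an explicit formula for $*_\hbar$, let us write
$$F(\lambda) = \sum_{\alpha\beta} a_{\alpha\beta}(\lambda)\, U_\alpha \otimes U_\beta, \tag{15}$$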
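where $a_{\alpha\beta}(\lambda) \in C^\infty(\mathfrak{h}^*)[[\hbar]]$ and $U_\alpha \otimes U_\beta \in U\mathfrak{g} \otimes U\mathfrak{g}$. Using this expression, one can indeed describe $*_\hbar$ explicitly.

Theorem 3.4. Given a compatible star product $*_\hbar$ as in Definition 3.1, for any $f(\lambda, x), g(\lambda, x) \in C^\infty(\mathfrak{h}^* \times G)[[\hbar]]$,
$$f(\lambda, x) *_\hbar g(\lambda, x) = \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g. \tag{16}$$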
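We need a couple of lemmas first.

Lemma 3.5. Under the same hypothesis as in Theorem 3.4,
(i) for any $f(\lambda, x) \in C^\infty(\mathfrak{h}^* \times G)$ and $g(\lambda) \in C^\infty(\mathfrak{h}^*)$,
$$f(\lambda, x) *_\hbar g(\lambda) = f(\lambda, x) * g(\lambda); \tag{17}$$
(ii) for any $f(x) \in C^\infty(G)$ and $g(\lambda, x) \in C^\infty(\mathfrak{h}^* \times G)$,
$$f(x) *_\hbar g(\lambda, x) = \sum_{\alpha\beta} a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} f(x)\, \overrightarrow{U_\beta}\, g(\lambda, x); \tag{18}$$
(iii) for any $f(\lambda, x) \in C^\infty(\mathfrak{h}^* \times G)$ and $g(x) \in C^\infty(G)$,
$$f(\lambda, x) *_\hbar g(x) = \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f(\lambda, x)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x). \tag{19}$$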
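Proof. (i) It suffices to show this identity for $f(\lambda, x) = f_1(x) f_2(\lambda)$, $\forall f_1(x) \in C^\infty(G)$ and $f_2(\lambda) \in C^\infty(\mathfrak{h}^*)$. Now
$$\begin{aligned} f(\lambda, x) *_\hbar g(\lambda) &= (f_1(x) f_2(\lambda)) *_\hbar g(\lambda) \\ &= (f_1(x) *_\hbar f_2(\lambda)) *_\hbar g(\lambda) \quad \text{(by Eq. (10))} \\ &= f_1(x) *_\hbar (f_2(\lambda) *_\hbar g(\lambda)) \\ &= f_1(x) (f_2(\lambda) * g(\lambda)) \quad \text{(by Eqs. (9–10))} \\ &= (f_1(x) f_2(\lambda)) * g(\lambda) \\ &= f(\lambda, x) * g(\lambda). \end{aligned}$$
(ii) Similarly, we may assume that $g(\lambda, x) = g_1(x) g_2(\lambda)$, $\forall g_1(x) \in C^\infty(G)$ and $g_2(\lambda) \in C^\infty(\mathfrak{h}^*)$. Then,
$$\begin{aligned} f(x) *_\hbar g(\lambda, x) &= f(x) *_\hbar (g_1(x) g_2(\lambda)) \\ &= f(x) *_\hbar (g_1(x) *_\hbar g_2(\lambda)) \\ &= (f(x) *_\hbar g_1(x)) *_\hbar g_2(\lambda) \\ &= \sum_{\alpha\beta} [a_{\alpha\beta}(\lambda) (\overrightarrow{U_\alpha} f(x)) (\overrightarrow{U_\beta} g_1(x))] * g_2(\lambda) \quad \text{(by Eq. (12))} \\ &= \sum_{\alpha\beta} a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} f(x)\, \overrightarrow{U_\beta}\, g(\lambda, x). \end{aligned}$$
(iii) Assume that $f(\lambda, x) = f_1(x) f_2(\lambda)$, $\forall f_1(x) \in C^\infty(G)$ and $f_2(\lambda) \in C^\infty(\mathfrak{h}^*)$. Then
$$\begin{aligned} f(\lambda, x) *_\hbar g(x) &= (f_1(x) f_2(\lambda)) *_\hbar g(x) \\ &= (f_1(x) *_\hbar f_2(\lambda)) *_\hbar g(x) \\ &= f_1(x) *_\hbar (f_2(\lambda) *_\hbar g(x)) \\ &= \sum_{\alpha\beta} a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} f_1(x)\, \overrightarrow{U_\beta} (f_2(\lambda) *_\hbar g(x)) \quad \text{(using Eq. (18))} \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \Bigl[\overrightarrow{U_\alpha} f_1(x)\, \overrightarrow{U_\beta} \Bigl( \frac{\partial^k f_2(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x) \Bigr)\Bigr] \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \Bigl[\overrightarrow{U_\alpha} f_1(x) \frac{\partial^k f_2(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x)\Bigr] \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \Bigl[\overrightarrow{U_\alpha} \frac{\partial^k (f_1(x) f_2(\lambda))}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x)\Bigr] \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f(\lambda, x)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x). \end{aligned}$$
This concludes the proof of the lemma.

Now we are ready to prove the main result of this section.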
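Proof of Theorem 3.4. Again, we may assume that $g(\lambda, x) = g_1(x) g_2(\lambda)$, $\forall g_1(x) \in C^\infty(G)$ and $g_2(\lambda) \in C^\infty(\mathfrak{h}^*)$. Then
$$\begin{aligned} f(\lambda, x) *_\hbar g(\lambda, x) &= f(\lambda, x) *_\hbar (g_1(x) g_2(\lambda)) \\ &= f(\lambda, x) *_\hbar (g_1(x) *_\hbar g_2(\lambda)) \\ &= (f(\lambda, x) *_\hbar g_1(x)) *_\hbar g_2(\lambda) \\ &= (f(\lambda, x) *_\hbar g_1(x)) * g_2(\lambda) \quad \text{(by Eq. (17))} \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!} \Bigl[ a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f(\lambda, x)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g_1(x) \Bigr] * g_2(\lambda) \quad \text{(by Eq. (19))} \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f(\lambda, x)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}} (g_1(x) g_2(\lambda)) \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k f(\lambda, x)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(\lambda, x). \end{aligned}$$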
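As a consequence of Theorem 3.4, we will see that if a function $F(\lambda) : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ defines a compatible star product, it must satisfy a “twisted-cocycle” type condition. To describe this condition explicitly, we need to introduce some notation. For any $f(\lambda) \in C^\infty(\mathfrak{h}^*)$, define $f(\lambda + \hbar h) \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{h}[[\hbar]]$ by
$$f(\lambda + \hbar h) = f(\lambda) \otimes 1 + \hbar \sum_i \frac{\partial f}{\partial \lambda^i} \otimes h_i + \frac{1}{2!} \hbar^2 \sum_{i_1 i_2} \frac{\partial^2 f}{\partial \lambda^{i_1} \partial \lambda^{i_2}} \otimes h_{i_1} h_{i_2} + \cdots + \frac{\hbar^k}{k!} \frac{\partial^k f}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} \otimes h_{i_1} \cdots h_{i_k} + \cdots. \tag{20}$$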
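Definition (20) can be sanity-checked in the simplest situation. The sketch below is our own illustration, not a case singled out in the text: it assumes $\mathfrak{h}$ is one-dimensional (hence abelian) and that the generator $h$ acts as a scalar $a$, as it does on a function such as $e^{ax}$ on $H = \mathbb{R}$. Under those assumptions the series (20) is the Taylor expansion of $f$ at $\lambda$ and should sum to $f(\lambda + \hbar a)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Derivative of the polynomial c0 + c1*t + c2*t^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, lam):
    """Evaluate the polynomial at lam."""
    return sum(c * lam ** k for k, c in enumerate(coeffs))


def shifted_by_series(coeffs, lam, hbar, a):
    """Sum_k hbar^k / k! * f^(k)(lam) * a^k: Eq. (20) with h replaced by the scalar a."""
    total, current, k = Fraction(0), list(coeffs), 0
    while current:  # terminates because f is a polynomial
        total += Fraction(hbar) ** k / factorial(k) * evaluate(current, lam) * Fraction(a) ** k
        current, k = derivative(current), k + 1
    return total


# f(t) = 1 - 2 t + 4 t^3 with illustrative values of lambda, hbar and a.
f = [1, -2, 0, 4]
lam, hbar, a = Fraction(1, 2), Fraction(1, 3), 5
print(shifted_by_series(f, lam, hbar, a) == evaluate(f, lam + hbar * a))
```

In the nonabelian case the $h_{i}$ no longer commute and the right-hand side of (20) has to be kept as an element of $C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{h}[[\hbar]]$; the scalar check only illustrates the shape of the series.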
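The correspondence $C^\infty(\mathfrak{h}^*) \to C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{h}[[\hbar]] : f(\lambda) \mapsto f(\lambda + \hbar h)$ extends naturally to a linear map from $C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ to $C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{h} \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]] \subseteq C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$, which is denoted by $F(\lambda) \mapsto F_{23}(\lambda + \hbar h^{(1)})$. More explicitly, assume that $F(\lambda) = \sum_{\alpha\beta} f_{\alpha\beta}(\lambda)\, U_\alpha \otimes U_\beta$, where $f_{\alpha\beta}(\lambda) \in C^\infty(\mathfrak{h}^*)[[\hbar]]$ and $U_\alpha \otimes U_\beta \in U\mathfrak{g} \otimes U\mathfrak{g}$. Then
$$F_{23}(\lambda + \hbar h^{(1)}) = \sum_{\alpha\beta} f_{\alpha\beta}(\lambda + \hbar h) \otimes U_\alpha \otimes U_\beta. \tag{21}$$
By a suitable permutation, one may define $F_{12}(\lambda + \hbar h^{(3)})$ and $F_{13}(\lambda + \hbar h^{(2)})$ similarly. Note that $U\mathfrak{g}$ is a Hopf algebra. By $\Delta : U\mathfrak{g} \to U\mathfrak{g} \otimes U\mathfrak{g}$ and $\epsilon : U\mathfrak{g} \to \mathbb{R}$, we denote its co-multiplication and co-unit, respectively. Then $\Delta$ naturally extends to a map $C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g}[[\hbar]] \to C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$, which will be denoted by the same symbol.

Corollary 3.6. Assume that $F : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ defines a compatible star product $*_\hbar$ as in Definition 3.1. Then
$$(\Delta \otimes id) F(\lambda) * F_{12}(\lambda + \hbar h^{(3)}) = (id \otimes \Delta) F(\lambda) * F_{23}(\lambda); \tag{22}$$
$$(\epsilon \otimes id) F(\lambda) = 1; \quad (id \otimes \epsilon) F(\lambda) = 1. \tag{23}$$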
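Proof. Equation (23) follows from the fact that $1 *_\hbar f(x) = f(x) *_\hbar 1 = f(x)$, $\forall f(x) \in C^\infty(G)$. As for Eq. (22), note that for any $f_1(x), f_2(x)$ and $f_3(x) \in C^\infty(G)$, according to Eq. (19), we have
$$(f_1(x) *_\hbar f_2(x)) *_\hbar f_3(x) = \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k (f_1(x) *_\hbar f_2(x))}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, f_3(x).$$
Now
$$\begin{aligned} (\Delta \otimes id) F(\lambda) * F_{12}(\lambda + \hbar h^{(3)}) &= (\Delta \otimes id) F(\lambda) * \Bigl( \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k F}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} \otimes h_{i_1} \cdots h_{i_k} \Bigr) \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \Delta U_\alpha \frac{\partial^k F}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} \otimes U_\beta h_{i_1} \cdots h_{i_k}. \end{aligned}$$
It thus follows that
$$\begin{aligned} \overrightarrow{(\Delta \otimes id) F(\lambda) * F_{12}(\lambda + \hbar h^{(3)})}(f_1(x), f_2(x), f_3(x)) &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \Bigl( \frac{\partial^k \overrightarrow{F(\lambda)}(f_1(x), f_2(x))}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} \Bigr) \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, f_3(x) \\ &= \sum_{\alpha\beta} \sum_{k=0}^\infty \frac{\hbar^k}{k!}\, a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} \frac{\partial^k (f_1(x) *_\hbar f_2(x))}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{U_\beta}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, f_3(x) \\ &= (f_1(x) *_\hbar f_2(x)) *_\hbar f_3(x). \end{aligned}$$
On the other hand,
$$\begin{aligned} f_1(x) *_\hbar (f_2(x) *_\hbar f_3(x)) &= f_1(x) *_\hbar \overrightarrow{F(\lambda)}(f_2(x), f_3(x)) \\ &= \sum_{\alpha\beta} a_{\alpha\beta}(\lambda) * \overrightarrow{U_\alpha} f_1(x)\, \overrightarrow{U_\beta} \bigl( \overrightarrow{F(\lambda)}(f_2(x), f_3(x)) \bigr) \quad \text{(by Eq. (18))} \\ &= \overrightarrow{(id \otimes \Delta) F(\lambda) * F_{23}(\lambda)}(f_1(x), f_2(x), f_3(x)). \end{aligned}$$
Now Eq. (22) follows from the associativity of $*_\hbar$.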
To end this section, as a special case, let us consider $M = \mathfrak{h}^* \times H \cong T^*H$, which is equipped with the canonical cotangent symplectic structure. The following proposition describes an explicit formula for a compatible star-product on it.

Proposition 3.7. For any $f(\lambda, x), g(\lambda, x) \in C^\infty(\mathfrak{h}^* \times H)[[\hbar]]$, the following equation
$$f(\lambda, x) *_\hbar g(\lambda, x) = \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k f}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g \tag{24}$$
defines a compatible star product on $M = \mathfrak{h}^* \times H \cong T^*H$, which is in fact a deformation quantization of its canonical cotangent symplectic structure.
Proof. As earlier in this section, let $\mathfrak{h}_\hbar = \mathfrak{h}[[\hbar]]$ be equipped with the Lie bracket $[X, Y]_\hbar = \hbar [X, Y]$, $\forall X, Y \in \mathfrak{h}_\hbar$, and $\sigma : S(\mathfrak{h})[[\hbar]] \to U\mathfrak{h}_\hbar$ the PBW-map. Note that $\mathfrak{h}_\hbar$ is isomorphic to $\mathfrak{h}$ as a Lie algebra. Hence $U\mathfrak{h}_\hbar$ is canonically isomorphic to $U\mathfrak{h}[[\hbar]]$, whose elements can be considered as left invariant (formal) differential operators on $H$. To each polynomial function on $T^*H \cong \mathfrak{h}^* \times H$, we assign a (formal) differential operator on $H$ according to the following rule. For $f \in C^\infty(H)$, we assign the operator multiplying by $f$; for $f \in \mathrm{Pol}(\mathfrak{h}^*) \cong S(\mathfrak{h})$, we assign the left invariant differential operator $\overrightarrow{\sigma(f)}$; in general, for $f(x) g(\lambda)$ with $f(x) \in C^\infty(H)$ and $g(\lambda) \in \mathrm{Pol}(\mathfrak{h}^*)$, we assign the differential operator $f(x) \overrightarrow{\sigma(g)}$. Then the multiplication on the algebra of differential operators induces an associative product $*_\hbar$ on $\mathrm{Pol}(T^*H)[[\hbar]]$, hence a star product on $T^*H$. It is simple to see from the above construction that
(i) for any $f(\lambda), g(\lambda) \in C^\infty(\mathfrak{h}^*)$,
$$f(\lambda) *_\hbar g(\lambda) = f(\lambda) * g(\lambda); \tag{25}$$
(ii) for any $f(x) \in C^\infty(H)$ and $g(\lambda) \in C^\infty(\mathfrak{h}^*)$,
$$f(x) *_\hbar g(\lambda) = f(x) g(\lambda); \tag{26}$$
(iii) for any $f(\lambda) \in C^\infty(\mathfrak{h}^*)$ and $g(x) \in C^\infty(H)$,
$$f(\lambda) *_\hbar g(x) = \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k f(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, g(x); \tag{27}$$
(iv) for any $f(x), g(x) \in C^\infty(H)$,
$$f(x) *_\hbar g(x) = f(x) g(x). \tag{28}$$
In other words, this is indeed a compatible star product with $F \equiv 1$. Equation (24) thus follows immediately from Theorem 3.4.

Remark. It would be interesting to compare Eq. (24) with the general construction of star products on cotangent symplectic manifolds in [10, 11].

Equation (27) implies that the element $f(\lambda + \hbar h) \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{h}[[\hbar]]$, being considered as a left invariant differential operator on $H$, admits the following expression: $\overrightarrow{f(\lambda + \hbar h)} = f(\lambda) *_\hbar \,\cdot\,$. Thus we have:

Corollary 3.8. For any $f, g \in C^\infty(\mathfrak{h}^*)$,
$$(f * g)(\lambda + \hbar h) = f(\lambda + \hbar h) * g(\lambda + \hbar h), \tag{29}$$
where the $*$ on the left hand side stands for the PBW-star product on $\mathfrak{h}^*$, while on the right hand side it refers to the multiplication on the algebra tensor product of $(C^\infty(\mathfrak{h}^*)[[\hbar]], *)$ with $U\mathfrak{h}[[\hbar]]$.
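Proof. Let $*_\hbar$ denote the star product on $T^*H$ as in Proposition 3.7. For any $\varphi(x) \in C^\infty(H)$,
$$\begin{aligned} (f(\lambda) *_\hbar g(\lambda)) *_\hbar \varphi(x) &= \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k (f(\lambda) * g(\lambda))}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, \varphi(x) \quad \text{(by Eqs. (25, 27))} \\ &= \overrightarrow{(f * g)(\lambda + \hbar h)}\, \varphi(x). \end{aligned}$$
On the other hand,
$$\begin{aligned} f(\lambda) *_\hbar (g(\lambda) *_\hbar \varphi(x)) &= \sum_{k=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k f(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}} (g(\lambda) *_\hbar \varphi(x)) \quad \text{(by Eq. (24))} \\ &= \sum_{k=0}^\infty \sum_{l=0}^\infty \frac{\hbar^k}{k!} \frac{\partial^k f(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}} \Bigl( \frac{\hbar^l}{l!} \frac{\partial^l g(\lambda)}{\partial \lambda^{j_1} \cdots \partial \lambda^{j_l}}\, \overrightarrow{h_{j_1}} \cdots \overrightarrow{h_{j_l}}\, \varphi(x) \Bigr) \\ &= \sum_{k=0}^\infty \sum_{l=0}^\infty \frac{\hbar^{k+l}}{k!\, l!} \frac{\partial^k f(\lambda)}{\partial \lambda^{i_1} \cdots \partial \lambda^{i_k}} * \frac{\partial^l g(\lambda)}{\partial \lambda^{j_1} \cdots \partial \lambda^{j_l}}\, \overrightarrow{h_{i_1}} \cdots \overrightarrow{h_{i_k}}\, \overrightarrow{h_{j_1}} \cdots \overrightarrow{h_{j_l}}\, \varphi(x) \\ &= \overrightarrow{f(\lambda + \hbar h) * g(\lambda + \hbar h)}\, \varphi(x). \end{aligned}$$
The conclusion thus follows from the associativity of $*_\hbar$.

Corollary 3.9. For any $F, G \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$,
$$(F * G)_{23}(\lambda + \hbar h^{(1)}) = F_{23}(\lambda + \hbar h^{(1)}) * G_{23}(\lambda + \hbar h^{(1)}). \tag{30}$$
In particular, if $F(\lambda) \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ is invertible, we have
$$F^{-1}_{23}(\lambda + \hbar h^{(1)}) = F_{23}(\lambda + \hbar h^{(1)})^{-1}. \tag{31}$$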
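Equation (31) follows from (30) in one line. The short derivation below is our own spelling-out; it assumes only that the shift map sends the unit $1$ to the unit, which is immediate from (20) and (21) since all derivatives of a constant vanish.

```latex
% Apply (30) to the pairs (F, F^{-1}) and (F^{-1}, F), inverses taken with respect to *.
\begin{aligned}
1 &= (F * F^{-1})_{23}(\lambda + \hbar h^{(1)})
   = F_{23}(\lambda + \hbar h^{(1)}) * (F^{-1})_{23}(\lambda + \hbar h^{(1)}), \\
1 &= (F^{-1} * F)_{23}(\lambda + \hbar h^{(1)})
   = (F^{-1})_{23}(\lambda + \hbar h^{(1)}) * F_{23}(\lambda + \hbar h^{(1)}).
\end{aligned}
```

Hence $(F^{-1})_{23}(\lambda + \hbar h^{(1)})$ is a two-sided inverse of $F_{23}(\lambda + \hbar h^{(1)})$, which is the content of (31).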
4. Quantum Dynamical Yang–Baxter Equation

The main purpose of this section is to derive the quantum dynamical Yang–Baxter equation over a nonabelian base $\mathfrak{h}$ from the “twisted-cocycle” condition (22). This was standard when $\mathfrak{h}$ is Abelian (e.g., see [6]). The proof was based on the Drinfel’d theory of quasi-Hopf algebras [13]. In our situation, however, the quasi-Hopf algebra approach does not work any more. Nevertheless, one can carry out a proof in a way completely parallel to the ordinary case. The main result of this section is the following:

Theorem 4.1. Assume that $F : \mathfrak{h}^* \to U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ satisfies the “twisted-cocycle” condition (22). Then
$$R(\lambda) = F_{21}(\lambda)^{-1} * F_{12}(\lambda) \tag{32}$$
satisfies the following generalized quantum dynamical Yang–Baxter equation (or Gervais–Neveu–Felder equation):
$$R_{12}(\lambda) * R_{13}(\lambda + \hbar h^{(2)}) * R_{23}(\lambda) = R_{23}(\lambda + \hbar h^{(1)}) * R_{13}(\lambda) * R_{12}(\lambda + \hbar h^{(3)}). \tag{33}$$
Here $*$ denotes the natural multiplication on $C^\infty(\mathfrak{h}^*) \otimes (U\mathfrak{g})^{\otimes n}[[\hbar]]$, $\forall n$, with $C^\infty(\mathfrak{h}^*)$ being equipped with the PBW-star product.
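It is simple to see that the usual relation
$$\Delta(a * b) = \Delta a * \Delta b \tag{34}$$
still holds for any $a, b \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g}[[\hbar]]$. Define $\widetilde{\Delta} : C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g}[[\hbar]] \to C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ by
$$\widetilde{\Delta} a = F(\lambda)^{-1} * \Delta a * F(\lambda), \quad \forall a \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g}[[\hbar]]. \tag{35}$$
It is simple to see, using the associativity of $*$, that
$$\widetilde{\Delta}^{op} a = R(\lambda) * \widetilde{\Delta} a * R(\lambda)^{-1}. \tag{36}$$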
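A sketch of why (36) holds, written out by us. It assumes that $\Delta$ is cocommutative (as is used in the proof of Lemma 4.3), that the flip $\sigma_{12}$ of the two $U\mathfrak{g}$ factors is an algebra automorphism for $*$, and the definition $R = F_{21}^{-1} * F_{12}$ from (32).

```latex
\begin{aligned}
\widetilde{\Delta}^{op} a
  &= \sigma_{12}\bigl(F_{12}^{-1} * \Delta a * F_{12}\bigr)
   = F_{21}^{-1} * \Delta a * F_{21}
   && \text{(cocommutativity: } \sigma_{12}\Delta a = \Delta a\text{)}, \\
\Delta a
  &= F_{12} * \widetilde{\Delta} a * F_{12}^{-1}
   && \text{(from Eq. (35))}, \\
\widetilde{\Delta}^{op} a
  &= \bigl(F_{21}^{-1} * F_{12}\bigr) * \widetilde{\Delta} a * \bigl(F_{21}^{-1} * F_{12}\bigr)^{-1}
   = R(\lambda) * \widetilde{\Delta} a * R(\lambda)^{-1}.
\end{aligned}
```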
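The following is immediate from Corollary 3.9.

Corollary 4.2.
$$R_{23}(\lambda + \hbar h^{(1)}) = F_{32}(\lambda + \hbar h^{(1)})^{-1} * F_{23}(\lambda + \hbar h^{(1)}). \tag{37}$$
Remark. Equation (37) is trivial when $\mathfrak{h}$ is Abelian. It, however, does not seem obvious in general. We can see from the proof of Corollary 3.9 that this equation essentially follows from the associativity of the star product given by Eq. (24).

For any given $F(\lambda) \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$, introduce $\Phi_{123}(\lambda) \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g} \otimes U\mathfrak{g}[[\hbar]]$ by
$$\Phi_{123}(\lambda) = F_{23}(\lambda)^{-1} * [(id \otimes \Delta) F(\lambda)^{-1}] * [(\Delta \otimes id) F(\lambda)] * F_{12}(\lambda). \tag{38}$$
Lemma 4.3.
$$(\widetilde{\Delta} \otimes id) R = \Phi_{231} * R_{13} * \Phi_{132}^{-1} * R_{23} * \Phi_{123}; \tag{39}$$
$$(id \otimes \widetilde{\Delta}) R = \Phi_{312}^{-1} * R_{13} * \Phi_{213} * R_{12} * \Phi_{123}^{-1}. \tag{40}$$
Proof. By applying the permutation $a_1 \otimes a_2 \otimes a_3 \mapsto a_1 \otimes a_3 \otimes a_2$ to Eq. (38), one obtains that
$$\begin{aligned} \Phi_{132}(\lambda) &= F_{32}(\lambda)^{-1} * \sigma_{23}[(id \otimes \Delta) F(\lambda)^{-1}] * \sigma_{23}[(\Delta \otimes id) F(\lambda)] * F_{13}(\lambda) \\ &= F_{32}(\lambda)^{-1} * [(id \otimes \Delta) F(\lambda)^{-1}] * \sigma_{23}[(\Delta \otimes id) F(\lambda)] * F_{13}(\lambda), \end{aligned}$$
since $\Delta$ is cocommutative. Similarly, applying the permutation $a_1 \otimes a_2 \otimes a_3 \mapsto a_2 \otimes a_3 \otimes a_1$ to Eq. (38), one obtains that
$$\Phi_{231}(\lambda) = F_{12}(\lambda)^{-1} * [(\Delta \otimes id) F_{21}(\lambda)^{-1}] * \sigma_{23}[(\Delta \otimes id) F(\lambda)] * F_{31}(\lambda). \tag{41}$$
On the other hand, by definition,
$$R_{13}(\lambda) = F_{31}(\lambda)^{-1} * F_{13}(\lambda), \tag{42}$$
$$R_{23}(\lambda) = F_{32}(\lambda)^{-1} * F_{23}(\lambda). \tag{43}$$
It thus follows that
$$\begin{aligned} \Phi_{231} * R_{13} * \Phi_{132}^{-1} * R_{23} * \Phi_{123} &= F_{12}(\lambda)^{-1} * (\Delta \otimes id) F_{21}(\lambda)^{-1} * (\Delta \otimes id) F(\lambda) * F_{12}(\lambda) \\ &= F_{12}(\lambda)^{-1} * (\Delta \otimes id) R(\lambda) * F_{12}(\lambda) \quad \text{(by Eq. (34))} \\ &= (\widetilde{\Delta} \otimes id) R \quad \text{(by Eq. (35))}. \end{aligned}$$
Equation (40) can be proved similarly.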
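Proof of Theorem 4.1. From Eq. (36), it follows that
$$R_{12} * (\widetilde{\Delta} \otimes id) R = (\widetilde{\Delta}^{op} \otimes id) R * R_{12}.$$
According to Eq. (39), this is equivalent to
$$R_{12} * \Phi_{231} * R_{13} * \Phi_{132}^{-1} * R_{23} * \Phi_{123} = \Phi_{321} * R_{23} * \Phi_{312}^{-1} * R_{13} * \Phi_{213} * R_{12}.$$
Thus,
$$R_{12} * (\Phi_{231} * R_{13} * \Phi_{132}^{-1}) * R_{23} = (\Phi_{321} * R_{23} * \Phi_{312}^{-1}) * R_{13} * (\Phi_{213} * R_{12} * \Phi_{123}^{-1}). \tag{44}$$
Now the twisted-cocycle condition (22) implies that
$$\Phi_{123}(\lambda) = F_{12}(\lambda + \hbar h^{(3)})^{-1} * F_{12}(\lambda). \tag{45}$$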
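The step from (22) to (45) can be spelled out as follows. This is our own expansion of the argument, where $\Phi_{123}$ denotes the element defined in Eq. (38), and it uses only the invertibility of the factors involved.

```latex
\begin{aligned}
(\Delta \otimes id) F(\lambda) * F_{12}(\lambda + \hbar h^{(3)})
  &= (id \otimes \Delta) F(\lambda) * F_{23}(\lambda)
  && \text{(Eq. (22))} \\
\Longrightarrow\quad
F_{23}(\lambda)^{-1} * [(id \otimes \Delta) F(\lambda)^{-1}] * [(\Delta \otimes id) F(\lambda)]
  &= F_{12}(\lambda + \hbar h^{(3)})^{-1} \\
\Longrightarrow\quad
\Phi_{123}(\lambda)
  &= F_{12}(\lambda + \hbar h^{(3)})^{-1} * F_{12}(\lambda)
  && \text{(multiply by } F_{12}(\lambda) \text{ on the right, Eq. (38))}.
\end{aligned}
```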
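It thus follows that
$$\begin{aligned} \Phi_{213} * R_{12} * \Phi_{123}^{-1} &= F_{21}(\lambda + \hbar h^{(3)})^{-1} * F_{21}(\lambda) * F_{21}(\lambda)^{-1} * F_{12}(\lambda) * F_{12}(\lambda)^{-1} * F_{12}(\lambda + \hbar h^{(3)}) \\ &= F_{21}(\lambda + \hbar h^{(3)})^{-1} * F_{12}(\lambda + \hbar h^{(3)}) \\ &= R_{12}(\lambda + \hbar h^{(3)}) \quad \text{(by Corollary 4.2)}. \end{aligned}$$
Applying the permutations $a_1 \otimes a_2 \otimes a_3 \mapsto a_3 \otimes a_1 \otimes a_2$ and $a_1 \otimes a_2 \otimes a_3 \mapsto a_1 \otimes a_3 \otimes a_2$ respectively to the equation above, one obtains
$$\Phi_{321} * R_{23} * \Phi_{312}^{-1} = R_{23}(\lambda + \hbar h^{(1)}) \quad \text{and} \quad \Phi_{231} * R_{13} * \Phi_{132}^{-1} = R_{13}(\lambda + \hbar h^{(2)}).$$
Equation (33) thus follows immediately from Eq. (44).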
5. Concluding Remarks

Even though our discussion so far has been mainly confined to triangular dynamical r-matrices, we should point out that there do exist many interesting examples of non-triangular ones. For instance, when the Lie algebra $\mathfrak{g}$ admits an ad-invariant bilinear form and the base Lie algebra $\mathfrak{h}$ equals $\mathfrak{g}$, Alekseev and Meinrenken found an explicit construction of an interesting non-triangular dynamical r-matrix [1] in connection with their study of the non-commutative Weil algebra. In fact, for simple Lie algebras, the existence of AM-dynamical r-matrices was already proved by Etingof and Varchenko [19]. The construction of Alekseev and Meinrenken was later generalized by Etingof and Schiffmann to a more general context [17]. So there is no doubt that there are abundant non-trivial examples of dynamical r-matrices with a nonabelian base. It is therefore desirable to know how they can be quantized. Inspired by the above discussion in the triangular case, we are ready to propose the following quantization problem along the line of Drinfel’d’s naive² quantization [14].

² Drinfel’d’s original naive quantization was proposed for a classical r-matrix in $A \otimes A$ for an associative algebra $A$. Here one can consider $A$ as the universal enveloping algebra $U\mathfrak{g}$, and $r \in \mathfrak{g} \otimes \mathfrak{g} \subset U\mathfrak{g} \otimes U\mathfrak{g}$.
Definition 5.1. Given a classical dynamical r-matrix $r : \mathfrak{h}^* \to \mathfrak{g} \otimes \mathfrak{g}$, a quantization of $r$ is $R(\lambda) = 1 + \hbar r(\lambda) + O(\hbar^2) \in U(\mathfrak{g}) \otimes U(\mathfrak{g})[[\hbar]]$ which is $H$-equivariant and satisfies the generalized quantum dynamical Yang–Baxter equation (or Gervais–Neveu–Felder equation):
$$R_{12}(\lambda) * R_{13}(\lambda + \hbar h^{(2)}) * R_{23}(\lambda) = R_{23}(\lambda + \hbar h^{(1)}) * R_{13}(\lambda) * R_{12}(\lambda + \hbar h^{(3)}). \tag{46}$$
Combining Proposition 2.1, Proposition 3.2, Corollary 3.6 and Theorem 4.1, we may summarize the main result of this paper in the following:

Theorem 5.2. A triangular dynamical r-matrix $r : \mathfrak{h}^* \to \wedge^2 \mathfrak{g}$ is quantizable if there exists a compatible star product on the corresponding Poisson manifold $\mathfrak{h}^* \times G$.

We conclude this paper with a list of questions together with some thoughts.

Question 1. Is every classical triangular dynamical r-matrix quantizable?

According to Theorem 5.2, this question is equivalent to asking whether a compatible star product always exists for the corresponding Poisson manifold $\mathfrak{h}^* \times G$. When the base Lie algebra is Abelian, a quantization procedure was found for splittable classical triangular dynamical r-matrices using Fedosov’s method [34]. Recently Etingof and Nikshych [21], using the vertex-IRF transformation method, showed the existence of quantization for the so-called completely degenerate triangular dynamical r-matrices, which leads to the hope that the existence of quantization could possibly be settled by combining both methods in [34] and [21]. However, when the base Lie algebra $\mathfrak{h}$ is nonabelian, the method in [34] does not admit a straightforward generalization. One of the main difficulties is that the Fedosov method uses Weyl quantization, while our quantization here is in normal ordering. Nevertheless, for the dynamical r-matrices constructed in Theorem 2.3, under some mild assumptions a quantization seems feasible by using the generalized Karabegov method [3, 5]. This problem will be discussed in a separate publication.

Question 2. What is the symmetrized version of the quantum dynamical Yang–Baxter equation (46)?

We derived Eq. (46) from a compatible star product, which is a normal ordering star product. The reason for us to choose the normal ordering here is that one can obtain a very explicit formula for the star product: Eq. (16). A Weyl ordering compatible star product may exist, but it may be more difficult to work with. For the canonical cotangent symplectic structure $T^*H$, a Weyl ordering star product was found by Gutt [27], but it is rather difficult to write down an explicit formula [9]. As we can see from the previous discussion, how a quantum dynamical Yang–Baxter equation looks is closely related to the choice of a star product on $T^*H$. When $H$ is Abelian, there is a very simple operator establishing an isomorphism between these two quantizations, which is indeed the transformation needed to transform an unsymmetrized QDYBE into a symmetrized one. Such an operator also exists for a general cotangent bundle $T^*Q$ [10], but it is much more complicated. Nevertheless, this viewpoint may still provide a useful method to obtain the symmetrized version of a QDYBE.

Question 3. Is every classical dynamical r-matrix quantizable?

This question may be a bit too general. As a first step, it should already be quite interesting to find a quantum analogue of Alekseev–Meinrenken dynamical r-matrices.

Question 4. What is the deformation theory controlling the quantization problem as proposed in Definition 5.1?
If $R = 1 + \hbar r + \cdots + \hbar^i r_i + \cdots$, where $r_i \in C^\infty(\mathfrak{h}^*) \otimes U\mathfrak{g} \otimes U\mathfrak{g}$, $i \geq 2$, is a solution to the QDYBE, the $\hbar$-term $r$ must be a solution of the classical dynamical Yang–Baxter equation. Indeed the quantum dynamical Yang–Baxter equation implies a sequence of equations of $r_i$ in terms of lower order terms. One should expect some cohomology theory here just as for any deformation theory [8]. However, in our case, the equation seems very complicated. On the other hand, it is quite surprising that such a theory does not seem to exist in the literature even in the case of quantization of a usual r-matrix.

Finally, we would like to point out that perhaps a more useful way of thinking of quantization of a dynamical r-matrix is to consider the quantum groupoids as defined in [33]. This is in some sense an analogue of the “sophisticated” quantization in the sense of Drinfel’d [14]. A classical dynamical r-matrix gives rise to a Lie bialgebroid $(T\mathfrak{h}^* \times \mathfrak{g}, T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ [7, 29]. Its induced Poisson structure on the base space $\mathfrak{h}^*$ is the Lie–Poisson structure $\pi_{\mathfrak{h}^*}$, which admits the PBW-star product as a standard deformation quantization. This leads to the following

Question 5. Does the Lie bialgebroid $(T\mathfrak{h}^* \times \mathfrak{g}, T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ corresponding to a classical dynamical r-matrix always admit a quantization in the sense of [33], with the base algebra being the PBW-star algebra $C^\infty(\mathfrak{h}^*)[[\hbar]]$?

To connect the quantization problem in Definition 5.1 with that of Lie bialgebroids, it is clear that one needs to consider preferred quantization of Lie bialgebroids: namely, a quantization where the total algebra is undeformed and remains $D(\mathfrak{h}^*) \otimes U\mathfrak{g}[[\hbar]]$.

Question 6. Does the Lie bialgebroid $(T\mathfrak{h}^* \times \mathfrak{g}, T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ admit a preferred quantization? How is such a preferred quantization related to the quantization of $r$ as proposed in Definition 5.1?

When $\mathfrak{h} = 0$, namely for usual r-matrices, the answer to Question 6 is positive, due to a remarkable theorem of Etingof–Kazhdan [16].
Acknowledgements. The author would like to thank Philip Boalch, Pavel Etingof, Boris Tsygan and David Vogan for fruitful discussions and comments. He is especially grateful to Pavel Etingof for explaining the paper [17], which inspired his interest in this topic. He also wishes to thank Simone Gutt and Stefan Waldmann for providing him with some useful references on star products of cotangent symplectic manifolds.
References
1. Alekseev, A. and Meinrenken, E.: The non-commutative Weil algebra. Invent. Math. 139, 135–172 (2000)
2. Arnaudon, D., Buffenoir, E., Ragoucy, E. and Roche, Ph.: Universal solutions of quantum dynamical Yang–Baxter equation. Lett. Math. Phys. 44, 201–214 (1998)
3. Astashkevich, A.: On Karabegov’s quantizations of semisimple coadjoint orbits. Advances in geometry, Progress in Mathematics 172, 1–18 (1999)
4. Avan, J., Babelon, O., and Billey, E.: The Gervais–Neveu–Felder equation and the quantum Calogero–Moser systems. Commun. Math. Phys. 178, 281–299 (1996)
5. Bressler, P., and Donin, J.: Polarized deformation quantization. math.QA/0007186
6. Babelon, O., Bernard, D., and Billey, E.: A quasi-Hopf algebra interpretation of quantum 3j and 6j symbols and difference equations. Phys. Lett. B 375, 89–97 (1996)
7. Bangoura, M., and Kosmann-Schwarzbach, Y.: Equation de Yang–Baxter dynamique classique et algébroïdes de Lie. C. R. Acad. Sci. Paris, Série I 327, 541–546 (1998)
8. Bayen, F., Flato, M., Frønsdal, C., Lichnerowicz, A., and Sternheimer, D.: Deformation theory and quantization, I and II. Ann. Phys. 111, 61–151 (1977)
9. Bieliavsky, P., and Bonneau, P.: Quantization of solvable symmetric spaces II. Work in progress
10. Bordemann, M., Neumaier, N. and Waldmann, S.: Homogeneous Fedosov star products on cotangent bundles, I. Weyl and standard ordering with differential operator representation. Commun. Math. Phys. 198, 363–396 (1998)
11. Bordemann, M., Neumaier, N., Pflaum, M., and Waldmann, S.: On representations of star product algebras over cotangent spaces on Hermitian line bundles. math.QA/9811055
12. Cannas da Silva, A., and Weinstein, A.: Geometric models for noncommutative algebras. Berkeley Mathematics Lecture Notes 10, AMS, 1999
13. Drinfel’d, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 2, 829–860 (1991)
14. Drinfel’d, V.G.: On some unsolved problems in quantum group theory. Lecture Notes in Math. 1510. Berlin: Springer, 1992, pp. 1–8
15. Etingof, P.: Private communication
16. Etingof, P., and Kazhdan, D.: Quantization of Lie bialgebras I. Selecta Mathematica. New series 2, 1–41 (1996)
17. Etingof, P., and Schiffmann, O.: On the moduli space of classical dynamical r-matrices. Math. Res. Lett. 8, 157–170 (2001)
18. Etingof, P., Schedler, T., and Schiffmann, O.: Explicit quantization of dynamical r-matrices for finite dimensional semisimple Lie algebras. J. AMS 13, 595–609 (2000)
19. Etingof, P., and Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang–Baxter equation. Commun. Math. Phys. 192, 77–120 (1998)
20. Etingof, P., and Varchenko, A.: Solutions of the quantum dynamical Yang–Baxter equation and dynamical quantum groups. Commun. Math. Phys. 196, 591–640 (1998)
21. Etingof, P., and Nikshych, D.: Vertex-IRF transformations and quantization of dynamical r-matrices. Math. Res. Lett. 8, 331–345 (2001)
22. Fedosov, B.: A simple geometrical construction of deformation quantization. J. Diff. Geom. 40, 213–238 (1994)
23. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proc. ICM Zürich, 1994, pp. 1247–1255
24. Gervais, J.-L., and Neveu, A.: Novel triangle relation and absense of tachyons in Liouville string field theory. Nucl. Phys. B 238, 125–141 (1984)
25. Jimbo, M., Konno, H., Odake, S. and Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. Transform. Groups 4, 303–327 (1999)
26. Guillemin, V., Lerman, E., and Sternberg, S.: Symplectic fibrations and multiplicity diagrams. Cambridge: Cambridge University Press, 1996
27. Gutt, S.: An explicit ∗-product on the cotangent bundle of a Lie group. Lett. Math. Phys. 7, 249–258 (1983)
28. Kobayashi, S., and Nomizu, K.: Foundations of differential geometry. Vol. I. Reprint of the 1963 original, New York: John Wiley & Sons, Inc., 1996
29. Liu, Z.-J., and Xu, P.: Dirac structures and dynamical r-matrices. Ann. Inst. Fourier, Grenoble 51, 831–859 (2001)
30. Schiffmann, O.: On classification of dynamical r-matrices. Math. Res. Lett. 5, 13–30 (1998)
31. Weinstein, A.: Fat bundles and symplectic manifolds. Adv. in Math. 37, 239–250 (1980)
32. Xu, P.: Quantum groupoids associated to universal dynamical R-matrices. C. R. Acad. Sci. Paris, Série I 328, 327–332 (1999)
33. Xu, P.: Quantum groupoids. Commun. Math. Phys. 216, 539–581 (2001)
34. Xu, P.: Triangular dynamical r-matrices and quantization. (math.QA/0005006) Advances in Math. 166, 1–49 (2002)

Communicated by L. Takhtajan
Commun. Math. Phys. 226, 497 – 530 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Integrable Fredholm Operators and Dual Isomonodromic Deformations
J. Harnad1,2, Alexander R. Its3
1 Department of Mathematics and Statistics, Concordia University, 7141 Sherbrooke W., Montréal, Québec, Canada H4B 1R6
2 Centre de recherches mathématiques, Université de Montréal, C. P. 6128, succ. centre ville, Montréal, Québec, Canada H3C 3J7. E-mail: [email protected]
3 Department of Mathematical Sciences, Indiana University – Purdue University at Indianapolis, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]
Received: 10 June 1997 / Received revised: 16 February 2001 / Accepted: 27 November 2001
Abstract: The Fredholm determinants of a special class of integrable integral operators $K$ supported on the union of $m$ curve segments in the complex $\lambda$-plane are shown to be the $\tau$-functions of an isomonodromic family of meromorphic covariant derivative operators $D_\lambda$, having regular singular points at the $2m$ endpoints of the curve segments, and a singular point of Poincaré index 1 at infinity. The rank $r$ of the corresponding vector bundle over the Riemann sphere equals the number of distinct terms in the exponential sum defining the numerator of the integral kernel. The matrix Riemann–Hilbert problem method is used to deduce an identification of the Fredholm determinant as a $\tau$-function in the sense of Segal–Wilson and Sato, i.e., in terms of abelian group actions on the determinant line bundle over a loop space Grassmannian. An associated dual isomonodromic family of covariant derivative operators $\widetilde{D}_z$, having rank $n = 2m$ and $r$ finite regular singular points located at the values of the exponents defining the kernel of $K$, is derived. The deformation equations for this family are shown to follow from an associated dual set of Riemann–Hilbert data, in which the rôles of the $r$ exponential factors in the kernel and the $2m$ endpoints of its support are interchanged. The operators $\widetilde{D}_z$ are analogously associated to an integral operator $\widetilde{K}$ whose Fredholm determinant is equal to that of $K$.
This paper is a slightly revised version of the preprint [HI], which brings it up to date with the current literature and clarifies some technical points in Sects. 1.2 and 4.2.
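1. Introduction

1.1. Dual isomonodromic deformation equations. The differential equations determining isomonodromic deformations of rational covariant derivative operators of the form
\[ D_\lambda = \frac{\partial}{\partial\lambda} - N(\lambda), \tag{1.1a} \]
\[ N(\lambda) := B + \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j}, \tag{1.1b} \]
\[ B = \mathrm{diag}(\beta_1, \dots, \beta_r), \qquad N_j \in \mathfrak{gl}(r, \mathbf{C}) \]
were derived in [JMMS, JMU]. The residue matrices $N_j(\alpha_i, \beta_a)$ may depend parametrically on the pole locations $\{\alpha_1, \dots, \alpha_n\}$ and the asymptotic eigenvalues $\{\beta_1, \dots, \beta_r\}$. This dependence must be such as to satisfy the commutativity conditions
\[ [D_\lambda, D_{\alpha_j}] = 0, \qquad [D_\lambda, D_{\beta_a}] = 0, \qquad j = 1, \dots, n, \quad a = 1, \dots, r, \tag{1.2} \]
where the differential operators $D_{\alpha_j}$, $D_{\beta_a}$ are defined by
\[ D_{\alpha_j} := \frac{\partial}{\partial\alpha_j} + \frac{N_j}{\lambda - \alpha_j}, \tag{1.3a} \]
\[ D_{\beta_a} := \frac{\partial}{\partial\beta_a} - \lambda E_a - \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{E_a \big(\sum_{j=1}^{n} N_j\big) E_b + E_b \big(\sum_{j=1}^{n} N_j\big) E_a}{\beta_a - \beta_b}, \tag{1.3b} \]
and $E_a$ is the elementary $r \times r$ matrix with elements
\[ (E_a)_{bc} := \delta_{ab}\,\delta_{ac}. \tag{1.4} \]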
In the Hamiltonian formulation [H] these relations, viewed as differential equations for the residue matrices $N_i$, are interpreted as a compatible system of nonautonomous Hamiltonian equations with respect to the Lie Poisson structure on the space $(\mathfrak{gl}(r))^{*n} = \{N_1, \dots, N_n\}$, embedded through formula (1.1b) into the loop algebra $\widetilde{\mathfrak{gl}}_R(r)$ with rational R-matrix structure. They are generated by the Poisson commuting family of Hamiltonians $\{H_j, K_a\}_{j=1,\dots,n,\ a=1,\dots,r}$ defined by
\[ H_j := \mathrm{tr}(B N_j) + \sum_{\substack{i=1 \\ i \neq j}}^{n} \frac{\mathrm{tr}(N_i N_j)}{\alpha_j - \alpha_i}, \tag{1.5a} \]
\[ K_a := \sum_{j=1}^{n} \alpha_j (N_j)_{aa} + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\big(\sum_{j=1}^{n} N_j\big)_{ab} \big(\sum_{k=1}^{n} N_k\big)_{ba}}{\beta_a - \beta_b}. \tag{1.5b} \]
The Poisson commutativity of the Hamiltonians follows from the R-matrix structure on $\widetilde{\mathfrak{gl}}_R(r)$ and implies that the differential 1-form on the parameter space $\{\alpha_1, \dots, \alpha_n, \beta_1, \dots, \beta_r\}$ defined by
\[ \omega := \sum_{k=1}^{n} H_k\, d\alpha_k + \sum_{a=1}^{r} K_a\, d\beta_a \tag{1.6} \]
is closed,
\[ d\omega = 0. \tag{1.7} \]
This, in turn, implies the existence of a tau-function $\tau(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r)$ in the sense of [JMU], determined, up to multiplication by a local constant, by
\[ \omega = d(\ln \tau). \tag{1.8} \]
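As shown in [H] it is natural to view the phase space as a Poisson quotient of a symplectic space $\mathcal{M} = \{F, G\}$ consisting of canonically conjugate pairs $(F, G)$ of rectangular $N \times r$ matrices, where
\[ N = \sum_{i=1}^{n} k_i, \qquad k_i := \mathrm{rank}\, N_i. \tag{1.9} \]
The quotient map is defined by the formula
\[ N(\lambda) = B + F^T (A - \lambda I)^{-1} G, \tag{1.10} \]
where $A \in \mathfrak{gl}(N, \mathbf{C})$ is a diagonal matrix with eigenvalues $\{\alpha_i\}_{i=1,\dots,n}$ and multiplicities $\{k_i\}$. The Hamiltonians may be pulled back to $\mathcal{M}$ through the projection map defined by (1.10), and hence determine corresponding nonautonomous Hamiltonian systems,
\[ \frac{\partial F}{\partial\alpha_i} = \{F, H_i\}, \qquad \frac{\partial G}{\partial\alpha_i} = \{G, H_i\}, \qquad i = 1, \dots, n, \tag{1.11a} \]
\[ \frac{\partial F}{\partial\beta_a} = \{F, K_a\}, \qquad \frac{\partial G}{\partial\beta_a} = \{G, K_a\}, \qquad a = 1, \dots, r, \tag{1.11b} \]
whose integral curves project to solutions of the isomonodromic deformation equations (1.2). This leads quite naturally to a “dual” isomonodromic deformation system for the family of operators
\[ \widetilde{D}_z := \frac{\partial}{\partial z} - M(z), \tag{1.12a} \]
\[ M(z) := A + F (B - zI)^{-1} G^T = A + \sum_{a=1}^{r} \frac{M_a}{z - \beta_a}, \qquad M_a \in \mathfrak{gl}(N, \mathbf{C}). \tag{1.12b} \]
The corresponding deformation equations are
\[ [\widetilde{D}_z, \widetilde{D}_{\alpha_j}] = 0, \qquad [\widetilde{D}_z, \widetilde{D}_{\beta_a}] = 0, \qquad j = 1, \dots, n, \quad a = 1, \dots, r, \tag{1.13} \]
where
\[ \widetilde{D}_{\beta_a} := \frac{\partial}{\partial\beta_a} + \frac{M_a}{z - \beta_a}, \tag{1.14a} \]
\[ \widetilde{D}_{\alpha_i} := \frac{\partial}{\partial\alpha_i} - z E_i - \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{E_i \big(\sum_{a=1}^{r} M_a\big) E_j + E_j \big(\sum_{a=1}^{r} M_a\big) E_i}{\alpha_i - \alpha_j}. \tag{1.14b} \]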
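These have the same interpretation as nonautonomous Hamiltonian equations on the phase space $(\mathfrak{gl}(n))^{*r} = \{M_1, \dots, M_r\}$, generated by the Poisson commuting family of Hamiltonians,
\[ \widetilde{K}_a := \mathrm{tr}(A M_a) + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\mathrm{tr}(M_a M_b)}{\beta_a - \beta_b}, \tag{1.15a} \]
\[ \widetilde{H}_j := \sum_{a=1}^{r} \beta_a (M_a)_{jj} + \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{\big(\sum_{a=1}^{r} M_a\big)_{jk} \big(\sum_{b=1}^{r} M_b\big)_{kj}}{\alpha_j - \alpha_k}. \tag{1.15b} \]
Pulled back to the space $\mathcal{M}$, these are shown in [H] to coincide with the Hamiltonians (1.5a), (1.5b), and hence the corresponding “dual” $\tau$-function $\widetilde{\tau}(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r)$, determined by
\[ d(\ln \widetilde{\tau}) = \widetilde{\omega} := \sum_{k=1}^{n} \widetilde{H}_k\, d\alpha_k + \sum_{a=1}^{r} \widetilde{K}_a\, d\beta_a, \tag{1.16} \]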
coincides with $\tau(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r)$.

In the present work, using the approach of [IIKS], it will be shown how a particular class of solutions to such isomonodromic deformation equations arises through the solution of an associated matrix Riemann–Hilbert problem (Theorems 2.1, 2.3), and how the corresponding $\tau$-functions may be identified with the Fredholm determinants of a simple class of “integrable” Fredholm integral operators (Theorem 2.5). We also show (Theorem 2.6) that the $\tau$-function obtained in this way may equivalently be interpreted as a $\tau$-function in the sense of Sato [Sa] and Segal and Wilson [SW,W]; that is, as the determinant of a suitably defined projection operator applied to the image of an element $W$ of a Hilbert space Grassmannian under the action of an abelian subgroup $\Gamma^r \subset \widetilde{Gl}(r)$ of the loop group $\widetilde{Gl}(r)$. The “dual” system is then shown (Theorem 3.1), under suitable restrictions on the input data, to correspond to the integral operator obtained through a Fourier–Laplace transform. In this formulation, the element $W$ of the Grassmannian is determined by the pole parameters $\{\alpha_1, \dots, \alpha_n\}$ of the first family of operators (1.1b), while the group element $\gamma \in \Gamma^r$ is parameterized by the asymptotic eigenvalues $\{\beta_1, \dots, \beta_r\}$. In the dual system, the rôles of the element $W$ and the group element $\gamma$ are interchanged.

1.2. Integrable Fredholm kernels and the Riemann–Hilbert problem. Consider a $p \times p$ matrix Fredholm integral operator acting on $\mathbf{C}^p$-valued functions $v(\lambda)$,
\[ K(v)(\lambda) = \int_\Gamma K(\lambda, \mu)\, v(\mu)\, d\mu, \tag{1.17} \]
defined along a piecewise smooth, oriented curve $\Gamma$ in the complex plane (possibly extending to $\infty$), with integral kernel of the special form
\[ K(\lambda, \mu) = \frac{f^T(\lambda)\, g(\mu)}{\lambda - \mu}, \tag{1.18} \]
where $f$, $g$ are rectangular $r \times p$ matrix valued functions, $p < r$. For the moment, we assume only that $f$ and $g$ are smooth functions along the connected components of $\Gamma$ and, in order that $K$ be nonsingular, we also require that
\[ f^T(\lambda)\, g(\lambda) = 0, \tag{1.19} \]
so that the diagonal values are given by taking the limit
\[ K(\lambda, \lambda) = f'^{\,T}(\lambda)\, g(\lambda) = -f^T(\lambda)\, g'(\lambda). \tag{1.20} \]
To simplify some technical matters below we shall also assume that the functions $f$ and $g$ can be analytically continued to a neighborhood of each of the connected components of $\Gamma$. Fredholm determinants of operators of this type appear as generating functions for correlators in many integrable quantum field theory models [WMTB, JMMS, IIKS, IIKV, KBI] and as spectral distributions for random matrix ensembles [M, TW1, TW2, HTW], the most common case being $p = 1$, $r = 2$, with $K$ a scalar integral operator.

A first important observation to note [IIKS] is that the resolvent operator
\[ R := (I - K)^{-1} K \tag{1.21} \]
is also in the same class. Thus, $R$ may also be expressed as
\[ R(v)(\lambda) = \int_\Gamma R(\lambda, \mu)\, v(\mu)\, d\mu, \tag{1.22} \]
where the resolvent kernel,
\[ R(\lambda, \mu) := \frac{F^T(\lambda)\, G(\mu)}{\lambda - \mu}, \tag{1.23} \]
is determined by
\[ F^T = (I - K)^{-1} f^T = (I + R) f^T, \tag{1.24a} \]
\[ G = g (I - K)^{-1} = g (I + R), \tag{1.24b} \]
and where $(1 - K)^{-1}$ in (1.24a) is understood as acting to the right, while in (1.24b) it acts to the left. These quantities similarly satisfy the nonsingularity condition
\[ F^T(\lambda)\, G(\lambda) = 0, \tag{1.25} \]
with the diagonal value of the resolvent kernel given by
\[ R(\lambda, \lambda) = F'^{\,T}(\lambda)\, G(\lambda) = -F^T(\lambda)\, G'(\lambda). \tag{1.26} \]
The basic deformation formula relating the Fredholm determinant to the resolvent operator is
\[ d \ln \det(I - K) = -\mathrm{tr}\big((1 - K)^{-1} dK\big) = -\mathrm{tr}\big((1 + R)\, dK\big), \tag{1.27} \]
where $d$ signifies the differential with respect to any auxiliary parameters on which $K$ may depend.
The second principal observation of [IIKS] is that the determination of $R$ is equivalent to the solution of an associated matrix Riemann–Hilbert problem. The relevant Riemann–Hilbert problem is given there for the case $p = 1$. It is not difficult however to generalize the arguments of [IIKS] to an arbitrary value of $p$. The “$p$-independent” version of this Riemann–Hilbert reduction can be done as follows: Put
\[ \chi(\lambda) = I_r + \int_\Gamma \frac{F(\mu)\, g^T(\mu)}{\lambda - \mu}\, d\mu. \tag{1.28} \]
The $r \times r$ matrix valued function $\chi(\lambda)$ is analytic on the complement of $\Gamma$, extends to $\lambda = \infty$ off $\Gamma$, with asymptotic form
\[ \chi(\lambda) \sim I_r + O(\lambda^{-1}) \tag{1.29} \]
for $\lambda \to \infty$, and
\[ \chi(\lambda) = O\big(\ln(\lambda - \alpha)\big), \qquad \lambda \sim \alpha, \tag{1.30} \]
if $\alpha$ is an endpoint of a connected component of $\Gamma$. Moreover, the function $\chi(\lambda)$ has cut discontinuities across $\Gamma$ given by
\[ \chi_-(\lambda) = \chi_+(\lambda)\, H(\lambda), \qquad \lambda \in \Gamma, \tag{1.31} \]
where $\chi_+(\lambda)$ and $\chi_-(\lambda)$ are the limiting values of $\chi(\lambda)$ as $\Gamma$ is approached from the left and the right, respectively. The $r \times r$ invertible jump matrix $H(\lambda)$ is defined as the following rank-$p$ perturbation of the identity matrix $I_r$:
\[ H(\lambda) = I_r + 2\pi i\, f(\lambda)\, g^T(\lambda) = \exp\big(2\pi i\, f(\lambda)\, g^T(\lambda)\big) \tag{1.32} \]
(where the exponential form holds here because the nonsingularity condition (1.19) implies that $f(\lambda) g^T(\lambda)$ is nilpotent).

The only statement above that is not a direct consequence of the general properties of Cauchy type integrals is Eq. (1.31). To check this equation we first observe that, in view of (1.28) and (1.19),
\[ \chi_+(\lambda) f(\lambda) = \chi_-(\lambda) f(\lambda) = f(\lambda) + \int_\Gamma \frac{F(\mu)\, g^T(\mu)}{\lambda - \mu}\, f(\lambda)\, d\mu = f(\lambda) + \int_\Gamma F(\mu)\, K^T(\lambda, \mu)\, d\mu, \qquad \lambda \in \Gamma. \tag{1.33} \]
This, together with the definition of $F$ (see the first equation in (1.24a)), implies the equation
\[ F(\lambda) = \chi(\lambda)\, f(\lambda), \qquad \lambda \in \Gamma, \tag{1.34} \]
in which the choice of branch $\chi_\pm(\lambda)$ is immaterial. From (1.28), we also have that
\[ \chi_+(\lambda) - \chi_-(\lambda) = -2\pi i\, F(\lambda)\, g^T(\lambda), \tag{1.35} \]
or, taking into account (1.34),
\[ \chi_+(\lambda) - \chi_-(\lambda) = -2\pi i\, \chi_+(\lambda)\, f(\lambda)\, g^T(\lambda), \tag{1.36} \]
and Eq. (1.31) follows.
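The nilpotency argument behind the exponential form (1.32) can be checked numerically. The following is a minimal sketch under stated assumptions: the vectors `f` and `g` are hypothetical sample data with $r = 3$, $p = 1$, chosen only so that $g^T f = 0$ as required by (1.19); they are not taken from the paper. The script checks that $f g^T$ squares to zero, that $I_r + 2\pi i f g^T$ agrees with a truncated exponential series, and that $\det H = 1$.

```python
import cmath

def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical sample data (r = 3, p = 1), chosen so that g^T f = 0.
f = [1, 2, -1]
g = [3, -1, 1]
orthogonal = sum(gi * fi for gi, fi in zip(g, f)) == 0

# f g^T squares to zero; integer arithmetic makes this check exact.
fg = [[fi * gj for gj in g] for fi in f]
nilpotent = all(x == 0 for row in mat_mul(fg, fg) for x in row)

# H = I + 2*pi*i f g^T, compared with a truncated exponential series.
c = 2j * cmath.pi
N = [[c * x for x in row] for row in fg]
H = [[identity(3)[i][j] + N[i][j] for j in range(3)] for i in range(3)]

exp_N = identity(3)
term = identity(3)
for k in range(1, 12):
    term = [[x / k for x in row] for row in mat_mul(term, N)]
    exp_N = [[exp_N[i][j] + term[i][j] for j in range(3)] for i in range(3)]

exp_error = max(abs(H[i][j] - exp_N[i][j]) for i in range(3) for j in range(3))
det_error = abs(det3(H) - 1)
print(orthogonal, nilpotent, exp_error, det_error)
```

Since all terms of the series beyond the linear one vanish up to rounding, the two errors printed at the end should be at the level of floating-point noise.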
The considerations above show that if the operator $I - K$ is invertible, then Eq. (1.28) defines a solution of the matrix Riemann–Hilbert problem consisting of finding a matrix valued function $\chi(\lambda)$ analytic on the complement of $\Gamma$ and satisfying conditions (1.29)–(1.31). We show that the converse is also true. Suppose that $\chi(\lambda)$ is a solution of Riemann–Hilbert problem (1.29)–(1.31). Note that the matrix $\chi(\lambda)$ is invertible for all $\lambda$. Moreover the following identity holds:
\[ \det \chi(\lambda) \equiv 1. \tag{1.37} \]
Indeed, since (Eq. (1.19) again) $\det H(\lambda) \equiv 1$ we conclude that $\det \chi(\lambda)$ is analytic on $\mathbf{CP}^1 \setminus \{\text{the set of end points of } \Gamma\}$ (with no jumps across $\Gamma$). To prove (1.37) it only remains to take into account the asymptotic condition (1.29), noticing that, by virtue of estimate (1.30), all the possible singularities of $\det \chi(\lambda)$ at the end points of $\Gamma$ are in fact removable singularities.

The fact that the solution $\chi(\lambda)$ is unique may be seen as follows. Suppose that $\tilde{\chi}(\lambda)$ is another piecewise analytic function satisfying (1.29)–(1.31). Consider the matrix ratio $\tilde{\chi}(\lambda) \chi^{-1}(\lambda)$. The same arguments as were just used to prove (1.37), i.e. the absence of jumps and the fact that the end point singularities are removable, yield the identity
\[ \tilde{\chi}(\lambda)\, \chi^{-1}(\lambda) \equiv I. \tag{1.38} \]
Define now the $r \times p$ matrix-valued functions $F(\lambda)$ and $G(\lambda)$, $\lambda \in \Gamma$, by the equations (cf. (1.34)),
\[ F(\lambda) := \chi(\lambda)\, f(\lambda), \tag{1.39a} \]
\[ G(\lambda) := (\chi^T)^{-1}(\lambda)\, g(\lambda), \tag{1.39b} \]
where in view of (1.31) and (1.19) the choice of branch $\chi_\pm$ is immaterial. Equation (1.19) implies that (1.25) also holds. It is worth noting that the functions $F(\lambda)$ and $G(\lambda)$ have no singularities along $\Gamma$. In fact, equations
\[ \chi_+(\lambda) f(\lambda) = \chi_-(\lambda) f(\lambda), \tag{1.40a} \]
\[ (\chi_+^T)^{-1}(\lambda)\, g(\lambda) = (\chi_-^T)^{-1}(\lambda)\, g(\lambda) \tag{1.40b} \]
imply that the functions $F(\lambda)$ and $G(\lambda)$, similar to $f(\lambda)$ and $g(\lambda)$, can be analytically continued to a neighborhood of the connected components of $\Gamma$. As one would expect, these functions satisfy the integral equations
\[ F(\lambda) - \int_\Gamma F(\mu)\, K^T(\lambda, \mu)\, d\mu = f(\lambda), \qquad \lambda \in \Gamma, \tag{1.41} \]
and
\[ G(\lambda) - \int_\Gamma G(\mu)\, K(\lambda, \mu)\, d\mu = g(\lambda), \qquad \lambda \in \Gamma. \tag{1.42} \]
To show this, we note that from the jump condition (1.31) and the Cauchy integral representation it follows that
\[ \chi(\lambda) = I_r - \int_\Gamma \frac{\chi(\mu)\, f(\mu)\, g^T(\mu)}{\mu - \lambda}\, d\mu, \tag{1.43a} \]
\[ \chi^{-1}(\lambda) = I_r + \int_\Gamma \frac{f(\mu)\, g^T(\mu)\, \chi^{-1}(\mu)}{\mu - \lambda}\, d\mu, \qquad \lambda \notin \Gamma, \tag{1.43b} \]
or, taking into account definitions (1.39a) and (1.39b),
\[ \chi(\lambda) = I_r + \int_\Gamma \frac{F(\mu)\, g^T(\mu)}{\lambda - \mu}\, d\mu, \tag{1.44a} \]
\[ \chi^{-1}(\lambda) = I_r - \int_\Gamma \frac{f(\mu)\, G^T(\mu)}{\lambda - \mu}\, d\mu. \tag{1.44b} \]
Multiplying both sides of (1.44a) by $f(\lambda)$ on the right and both sides of (1.44b) by $g^T(\lambda)$ on the left and letting $\lambda \to \Gamma$ (from which side of $\Gamma$ is again immaterial), Eqs. (1.41) and (1.42) follow.

It remains to show that the functions $F(\lambda)$ and $G(\lambda)$ define, via Eq. (1.23), the resolvent kernel for the integral operator $K$. Consider the operator product $KR$. Denoting its kernel by $KR(\lambda, \mu)$, we have
\begin{align*}
KR(\lambda, \mu) &= \int_\Gamma \frac{f^T(\lambda)\, g(\nu)}{\lambda - \nu}\, \frac{F^T(\nu)\, G(\mu)}{\nu - \mu}\, d\nu \\
&= \frac{1}{\lambda - \mu} \int_\Gamma f^T(\lambda)\, g(\nu)\, F^T(\nu)\, G(\mu) \left( \frac{1}{\lambda - \nu} + \frac{1}{\nu - \mu} \right) d\nu \\
&= \frac{f^T(\lambda)}{\lambda - \mu} \left( \int_\Gamma \frac{g(\nu)\, F^T(\nu)}{\lambda - \nu}\, d\nu + \int_\Gamma \frac{g(\nu)\, F^T(\nu)}{\nu - \mu}\, d\nu \right) G(\mu) \\
&= \frac{1}{\lambda - \mu}\, f^T(\lambda) \big( \chi^T(\lambda) - I_r \big) G(\mu) + \frac{1}{\lambda - \mu}\, f^T(\lambda) \big( I_r - \chi^T(\mu) \big) G(\mu) \\
&= \frac{F^T(\lambda)\, G(\mu)}{\lambda - \mu} - \frac{f^T(\lambda)\, g(\mu)}{\lambda - \mu}, \tag{1.45}
\end{align*}
where we have used Eq. (1.44a) in the second to last line, and (1.39a), (1.39b) in the last line. In operator notation, Eq. (1.45) reads
\[ KR = R - K. \tag{1.46} \]
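The algebra connecting (1.46) to invertibility can be illustrated in finite dimensions. The sketch below is only an analogue, assuming a hypothetical $2 \times 2$ matrix `K` in place of the integral operator: it forms $R = (I - K)^{-1} K$, then confirms that $KR = R - K$ and that $(I - K)(I + R) = I$, which is the identity obtained by expanding the product and substituting (1.46). Exact rational arithmetic is used so the comparisons are exact.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def inverse2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

identity2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

# Hypothetical finite-dimensional stand-in for the operator K.
K = [[Fraction(1, 5), Fraction(1, 10)],
     [Fraction(1, 20), Fraction(3, 10)]]

# Resolvent R := (I - K)^{-1} K, as in (1.21).
R = mat_mul(inverse2(mat_sub(identity2, K)), K)

# Identity (1.46): KR = R - K.
kr_matches = mat_mul(K, R) == mat_sub(R, K)

# Consequence: (I - K)(I + R) = I + R - K - KR = I.
product = mat_mul(mat_sub(identity2, K), mat_add(identity2, R))
inverse_matches = product == identity2
print(kr_matches, inverse_matches)
```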
Equation (1.46) implies that the operator $I - K$ is invertible and formulae (1.39a), (1.39b), and (1.23) define its resolvent kernel. The equivalence of the inversion of $I - K$ and the solution of Riemann–Hilbert problem (1.29)–(1.31) is thus established.

We shall complete our exposition of the general theory of integrable Fredholm operators by noticing that using integral representations (1.44a) and (1.44b) one can express the asymptotic form of $\chi(\lambda)$ and $\chi^{-1}(\lambda)$ for large $\lambda$ as
\[ \chi(\lambda) \sim I_r + \sum_{j=1}^{\infty} \frac{\chi_j}{\lambda^j}, \tag{1.47a} \]
\[ \chi^{-1}(\lambda) \sim I_r + \sum_{j=1}^{\infty} \frac{\hat{\chi}_j}{\lambda^j}, \tag{1.47b} \]
where the asymptotic coefficients $\chi_j$, $\hat{\chi}_j$ are given by
\[ \chi_j = \int_\Gamma \lambda^{j-1} F(\lambda)\, g^T(\lambda)\, d\lambda, \tag{1.48a} \]
\[ \hat{\chi}_j = -\int_\Gamma \lambda^{j-1} f(\lambda)\, G^T(\lambda)\, d\lambda. \tag{1.48b} \]
These must of course also satisfy the algebraic recursion relations
\[ \hat{\chi}_j = -\chi_j - \sum_{k=1}^{j-1} \chi_{j-k}\, \hat{\chi}_k, \quad j > 1, \qquad \hat{\chi}_1 = -\chi_1. \tag{1.49} \]
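The recursion (1.49) is the order-by-order statement that the two series (1.47a), (1.47b) multiply to the identity. A minimal sketch, assuming hypothetical integer $2 \times 2$ coefficients `chi[j]` (illustrative values only), computes $\hat{\chi}_j$ from (1.49) and then checks the product taken in the opposite order, $\chi^{-1}\chi = I_r$, which is not built into the recursion and so serves as a consistency test.

```python
def mat_mul(a, b):
    """Multiply two 2 x 2 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

ZERO = [[0, 0], [0, 0]]

# Hypothetical asymptotic coefficients chi_1, chi_2, chi_3 (illustrative only).
chi = {
    1: [[1, 2], [0, 1]],
    2: [[0, 1], [3, -1]],
    3: [[2, 0], [1, 1]],
}

def inverse_coefficients(chi, order):
    """Apply recursion (1.49): chi_hat_j = -chi_j - sum_k chi_{j-k} chi_hat_k."""
    chi_hat = {}
    for j in range(1, order + 1):
        total = mat_sub(ZERO, chi[j])
        for k in range(1, j):
            total = mat_sub(total, mat_mul(chi[j - k], chi_hat[k]))
        chi_hat[j] = total
    return chi_hat

chi_hat = inverse_coefficients(chi, 3)

def left_product_coefficient(j):
    """Coefficient of lambda^{-j} in (I + sum chi_hat)(I + sum chi)."""
    total = mat_add(chi_hat[j], chi[j])
    for k in range(1, j):
        total = mat_add(total, mat_mul(chi_hat[k], chi[j - k]))
    return total

left_inverse_ok = all(left_product_coefficient(j) == ZERO for j in range(1, 4))
print(chi_hat[1], left_inverse_ok)
```

Integer entries keep every comparison exact, so no tolerance is needed.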
In the next section, we specialize to certain particular choices of the functions $f(\lambda)$, $g(\lambda)$ determining the Fredholm kernel, and support curves $\Gamma$, for which the computation of the Fredholm determinant can be reduced to the solution of an associated isomonodromic deformation problem. The resulting systems, together with the generalization discussed in Sect. 4, extend those obtained in [IIKS, P] via the Riemann–Hilbert method and in [JMMS, HTW, TW1, TW2, TW3] by other related methods.

2. Special Integrable Kernels and Isomonodromic Deformations

In the following, we choose $n = 2m$ to be even, and let $\{\alpha_j\}_{j=1,\dots,n}$ be a set of distinct points along $\Gamma$, ordered according to its orientation, and denote by $\{\Gamma_j\}_{j=1,\dots,m}$ the ordered sequence of connected segments of $\Gamma$ such that $\Gamma_j$ has endpoints $(\alpha_{2j-1}, \alpha_{2j})$. Let
\[ B = \mathrm{diag}(\beta_1, \dots, \beta_r) \tag{2.1} \]
be a diagonal $r \times r$ matrix with distinct eigenvalues $\{\beta_a\}_{a=1,\dots,r}$ and $\{f_j, g_j\}_{j=1,\dots,m}$ a set of fixed $r \times p$ rectangular matrices satisfying
\[ g_j^T f_k = 0, \qquad \forall j, k. \tag{2.2} \]
Define
\[ f_0(\lambda) := \sum_{j=1}^{m} f_j\, \theta_j(\lambda), \tag{2.3a} \]
\[ g_0(\lambda) := \sum_{j=1}^{m} g_j\, \theta_j(\lambda), \tag{2.3b} \]
where $\{\theta_j(\lambda)\}_{j=1,\dots,m}$ denote the characteristic functions of the curve segments $\{\Gamma_j\}$, and
\[ f(\lambda) := \Psi_0(\lambda)\, f_0(\lambda), \tag{2.4a} \]
\[ g(\lambda) := \big(\Psi_0^T(\lambda)\big)^{-1} g_0(\lambda), \tag{2.4b} \]
where
\[ \Psi_0(\lambda) := e^{\lambda B} \tag{2.5} \]
is viewed as a “vacuum” wave function. Let $\chi_\pm(\lambda)$ be the solution to the matrix Riemann–Hilbert problem (1.29)–(1.31) corresponding to the choice (2.4a), (2.4b) for the functions $f(\lambda)$, $g(\lambda)$. Define the invertible $r \times r$ matrix valued function
\[ \Psi(\lambda) := \chi(\lambda)\, \Psi_0(\lambda), \tag{2.6} \]
with limiting values $\Psi_\pm$ on either side of the segments of $\Gamma$,
\[ \Psi_\pm(\lambda) := \chi_\pm(\lambda)\, \Psi_0(\lambda). \tag{2.7} \]
Then $\Psi_\pm$ satisfy the discontinuity conditions
\[ \Psi_-(\lambda) = \Psi_+(\lambda)\, H_0(\lambda) \tag{2.8} \]
across these segments, where
\[ H_0(\lambda) := I_r + 2\pi i\, h_0(\lambda) = \exp\big(2\pi i\, h_0(\lambda)\big), \tag{2.9} \]
and
\[ h_0(\lambda) := \sum_{j=1}^{m} f_j\, g_j^T\, \theta_j(\lambda) \tag{2.10} \]
has rank $\leq p$, is piecewise constant along the segments $\Gamma_j$ and, because of (2.2), satisfies
\[ h_0(\lambda)\, h_0(\mu) = 0, \qquad \forall\, \lambda, \mu. \tag{2.11} \]
In terms of these quantities, $H(\lambda)$ is given by
\[ H(\lambda) = \Psi_0(\lambda)\, H_0(\lambda)\, \Psi_0^{-1}(\lambda). \tag{2.12} \]
From (1.39a), (1.39b) and (2.6), it follows that the matrix-valued functions $F(\lambda)$, $G(\lambda)$ determining the resolvent kernel are given by
\[ F(\lambda) = \Psi(\lambda)\, f_0(\lambda), \tag{2.13a} \]
\[ G(\lambda) = \big(\Psi^T(\lambda)\big)^{-1} g_0(\lambda), \tag{2.13b} \]
where, again, the choice of $\Psi_\pm(\lambda)$ on the RHS is immaterial because of the orthogonality conditions (2.2).
2.1. Isomonodromic families of operators: Deformation of the endpoints. In this section, we construct an isomonodromic family of operators of the form (1.1a), (1.1b) having the “dressed” wave function $\Psi(\lambda)$ defined in (2.6) as kernel. This follows by examining the analytic properties of $\Psi(\lambda)$ in a neighborhood of the singular points $\{\lambda = \alpha_j\}_{j=1,\dots,n}$ and asymptotically as $\lambda \to \infty$. Specifically, define $\{N_j\}_{j=1,\dots,n}$ by
\[ N_j := -F_j\, G_j^T, \tag{2.14} \]
where
\[ F_j := \lim_{\lambda \to \alpha_j} F(\lambda), \qquad G_j := (-1)^j \lim_{\lambda \to \alpha_j} G(\lambda), \tag{2.15} \]
with the limit $\lambda \to \alpha_j$ taken inside the segment $\Gamma_j$. We then have the following result.

Theorem 2.1. The wave function $\Psi(\lambda)$ defined by (2.6) satisfies the equations
\[ \frac{\partial\Psi}{\partial\lambda} - \Big( B + \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j} \Big) \Psi = 0, \tag{2.16a} \]
\[ \frac{\partial\Psi}{\partial\alpha_j} + \frac{N_j}{\lambda - \alpha_j}\, \Psi = 0, \tag{2.16b} \]
with the $N_j$'s given by (2.14). This implies the commutativity
\[ [D_\lambda, D_{\alpha_j}] = 0, \qquad [D_{\alpha_i}, D_{\alpha_j}] = 0, \qquad i, j = 1, \dots, n \tag{2.17} \]
of the operators
\[ D_\lambda := \frac{\partial}{\partial\lambda} - B - \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j} = \frac{\partial}{\partial\lambda} - N(\lambda), \tag{2.18a} \]
\[ D_{\alpha_j} := \frac{\partial}{\partial\alpha_j} + \frac{N_j}{\lambda - \alpha_j}, \tag{2.18b} \]
and hence the invariance of the monodromy data of the operator $D_\lambda$ under changes in the parameters $\{\alpha_j\}$.

Proof. The discontinuity conditions (2.8) may be written
\[ \Psi_+(\lambda) - \Psi_-(\lambda) = -2\pi i\, \Psi_+(\lambda) \sum_{j=1}^{m} f_j\, g_j^T\, \theta_j(\lambda). \tag{2.19} \]
From analyticity and the fact that the $f_j g_j^T$ satisfy
\[ f_j\, g_j^T\, f_k\, g_k^T = 0, \qquad \forall j, k, \tag{2.20} \]
which follows from (2.2), and also taking into account Eq. (1.30), we conclude that, in a neighborhood of $\Gamma_j$, $\Psi(\lambda)$ can be written
\[ \Psi(\lambda) = \hat{\psi}_j(\lambda) \left( \frac{\lambda - \alpha_{2j}}{\lambda - \alpha_{2j-1}} \right)^{-f_j g_j^T}, \tag{2.21} \]
where $\hat{\psi}_j(\lambda)$ is locally holomorphic and invertible (cf. the similar representations of [JMMS] in the case of the “sine” kernel). It follows that the logarithmic derivative $\Psi_\lambda \Psi^{-1}$ is meromorphic for finite $\lambda$ with simple poles at the points $\{\lambda = \alpha_j\}_{j=1,\dots,n}$, and the residue matrices are given by
\[ N_j := (-1)^{j+1}\, \hat{\psi}_{[\frac{j+1}{2}]}(\alpha_j)\, f_{[\frac{j+1}{2}]}\, g^T_{[\frac{j+1}{2}]}\, \hat{\psi}^{-1}_{[\frac{j+1}{2}]}(\alpha_j), \tag{2.22} \]
where $[\,\cdot\,]$ denotes the integer part. The local form (2.21) implies that these $N_j$'s are given by (2.14), with $F_j$, $G_j$ defined by (2.15). Note that, by Eq. (1.25), we also have
\[ F_j^T\, G_j = 0. \tag{2.23} \]
From the asymptotic form (1.29) and Eq. (2.6), we see that, for large $|\lambda|$, $\Psi_\lambda \Psi^{-1}$ is of the form
\[ \Psi_\lambda \Psi^{-1} \sim B + O(\lambda^{-1}), \tag{2.24} \]
and hence, by Liouville's theorem, $\Psi$ satisfies (2.16a). From the local form (2.21) of $\Psi$ and the fact that $\Psi$ is analytic away from $\Gamma$, we see that $\frac{\partial\Psi}{\partial\alpha_j} \Psi^{-1}$ is holomorphic away from $\alpha_j$ and that in a neighborhood of $\alpha_j$, the sum
\[ \frac{\partial\Psi}{\partial\alpha_j}\, \Psi^{-1} + \frac{N_j}{\lambda - \alpha_j} \tag{2.25} \]
is analytic. From the asymptotic form (1.29) and (2.6) it also follows that $\frac{\partial\Psi}{\partial\alpha_j} \Psi^{-1}$ vanishes at $\lambda = \infty$ and hence, again by Liouville's theorem, the expression in (2.25) vanishes identically, hence proving that $\Psi$ satisfies (2.16b). The fact that the invertible matrix $\Psi$ satisfies both (2.16a) and (2.16b) implies the commutativity conditions (2.17) for the operators $D_\lambda$, $D_{\alpha_j}$ defined in (2.18a) and (2.18b). These deformation equations for the operator $D_\lambda$ imply the invariance of its monodromy data both at the finite regular singular points $\{\alpha_j\}_{j=1,\dots,n}$ and at the irregular singular point $\lambda = \infty$ (cf. [JMMS, JMU]).

Equivalently, we may consider the equations for the matrices $F_j$, $G_j$ implied by taking the limits $\lambda \to \alpha_k$ in (2.16a), (2.16b).

Corollary 2.2. The matrices $F_j$, $G_j$ satisfy the differential system
\[ \frac{\partial F_j}{\partial\alpha_k} = -\frac{N_k}{\alpha_j - \alpha_k}\, F_j, \qquad j \neq k, \tag{2.26a} \]
\[ \frac{\partial F_j}{\partial\alpha_j} = \Big( B + \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{N_k}{\alpha_j - \alpha_k} \Big) F_j, \tag{2.26b} \]
\[ \frac{\partial G_j}{\partial\alpha_k} = \frac{N_k^T}{\alpha_j - \alpha_k}\, G_j, \qquad j \neq k, \tag{2.26c} \]
\[ \frac{\partial G_j}{\partial\alpha_j} = -\Big( B + \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{N_k^T}{\alpha_j - \alpha_k} \Big) G_j. \tag{2.26d} \]
Proof. This follows from the definitions (2.13a), (2.13b), (2.15) and Eqs. (2.16a), (2.16b) of the theorem.

Remark. Let $N = np$ and define the pair of rectangular $N \times r$ matrices $F$, $G$ formed from the $p \times r$ blocks $\{F_j^T, G_j^T\}$,
\[ F := \begin{pmatrix} F_1^T \\ F_2^T \\ \vdots \\ F_n^T \end{pmatrix}, \qquad G := \begin{pmatrix} G_1^T \\ G_2^T \\ \vdots \\ G_n^T \end{pmatrix}. \tag{2.27} \]
Then
\[ N(\lambda) = B + \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j} = B + F^T (A - \lambda I)^{-1} G, \tag{2.28} \]
as in (1.10), where $A$ is the diagonal $N \times N$ matrix with eigenvalues $\{\alpha_j\}_{j=1,\dots,n}$ all of multiplicity $p$, and the Eqs. (2.26a)–(2.26d) are just the Hamiltonian equations (1.11a).

2.2. Deformations of exponents. We now turn to the dependence of the wave function $\Psi(\lambda)$ on the parameters $\{\beta_a\}_{a=1,\dots,r}$. By examining the analytic structure of the logarithmic derivatives $\frac{\partial\Psi}{\partial\beta_a} \Psi^{-1}$ near $\Gamma$ and asymptotically, we can similarly derive differential equations for $\Psi$ with respect to these parameters, and hence deduce the independence of the monodromy of $D_\lambda$ of the parameter values.

Theorem 2.3. The wave function $\Psi(\lambda)$ satisfies the equations
\[ D_{\beta_a} \Psi = 0, \qquad a = 1, \dots, r, \tag{2.29} \]
where the operators $\{D_{\beta_a}\}_{a=1,\dots,r}$ are defined by
\[ D_{\beta_a} := \frac{\partial}{\partial\beta_a} - \lambda E_a - \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{E_a \big(\sum_{j=1}^{n} N_j\big) E_b + E_b \big(\sum_{j=1}^{n} N_j\big) E_a}{\beta_a - \beta_b}, \qquad a = 1, \dots, r, \tag{2.30} \]
with the $N_j$'s given by (2.14), (2.15) and $E_a$ given by (1.4). This implies the commutativity conditions
\[ [D_\lambda, D_{\beta_a}] = 0, \qquad [D_{\beta_a}, D_{\beta_b}] = 0, \qquad a, b = 1, \dots, r, \tag{2.31} \]
and hence invariance of the monodromy data of $D_\lambda$ under the deformations parameterized by $\{\beta_a\}_{a=1,\dots,r}$.

Proof. Since the matrix $H_0(\lambda)$ entering the discontinuity conditions (2.8) is independent of the parameters $\beta_a$, we see that the logarithmic derivatives of $\Psi_\pm$ with respect to these parameters have no discontinuity across $\Gamma$,
\[ \frac{\partial\Psi_+}{\partial\beta_a}\, \Psi_+^{-1} = \frac{\partial\Psi_-}{\partial\beta_a}\, \Psi_-^{-1}. \tag{2.32} \]
From the local structure (2.21), it follows that $\frac{\partial\Psi}{\partial\beta_a} \Psi^{-1}$ is in fact holomorphic in a neighborhood of $\Gamma$, and hence throughout the complex $\lambda$ plane. Furthermore, from the asymptotic form (1.47a) and (2.6), it follows that the difference
\[ \frac{\partial\Psi}{\partial\beta_a}\, \Psi^{-1} - \big( \lambda E_a + [\chi_1, E_a] \big) \tag{2.33} \]
vanishes in the limit $\lambda \to \infty$. Therefore, again by Liouville's theorem this difference, being globally holomorphic, vanishes identically,
\[ \frac{\partial\Psi}{\partial\beta_a} - \big( \lambda E_a + [\chi_1, E_a] \big) \Psi = 0. \tag{2.34} \]
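Since $\Psi$ is invertible, the compatibility of (2.34) and (2.16a) implies
\[ \Big[ \frac{\partial}{\partial\lambda} - B - \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j},\ \frac{\partial}{\partial\beta_a} - \lambda E_a - [\chi_1, E_a] \Big] = 0, \qquad a = 1, \dots, r. \tag{2.35} \]
From the asymptotic form of this equation, it follows that the off diagonal terms of $\chi_1$ are given by
\[ (\beta_b - \beta_a)\, (\chi_1)_{ab} = \Big( \sum_{j=1}^{n} N_j \Big)_{ab}, \tag{2.36} \]
and hence
\[ [\chi_1, E_a]_{bc} = (\delta_{ab} - \delta_{ac})\, \frac{\big(\sum_{j=1}^{n} N_j\big)_{bc}}{\beta_b - \beta_c}. \tag{2.37} \]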
Equation (2.34) is therefore just the condition that $\Psi$ be in the joint kernel of the operators defined in (2.30), and (2.35) are the commutativity conditions implying that the monodromy of $D_\lambda$ is preserved under the deformations parameterized by $\{\beta_a\}_{a=1,\dots,r}$.

Remark. The compatibility of Eq. (2.29) with (2.16b) also implies the commutativity conditions
\[ [D_{\alpha_i}, D_{\beta_a}] = 0, \qquad i = 1, \dots, n, \quad a = 1, \dots, r. \tag{2.38} \]
Finally, we may again equivalently consider the system of equations satisfied by the quantities $\{F_j, G_j\}_{j=1,\dots,n}$, which follow from (2.16a) and (2.29).

Corollary 2.4. The matrices $F_j$, $G_j$ satisfy the differential system
\[ \frac{\partial F_j}{\partial\beta_a} = \big( \alpha_j E_a + [\chi_1, E_a] \big) F_j, \tag{2.39a} \]
\[ \frac{\partial G_j}{\partial\beta_a} = \big( -\alpha_j E_a + [\chi_1^T, E_a] \big) G_j, \tag{2.39b} \]
with the term $[\chi_1, E_a]$ given by (2.37), which are exactly the Hamiltonian equations (1.11b) under the identifications (2.27).
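2.3. The Fredholm determinant. We now proceed in a standard way to compute the logarithmic differential (1.27) with respect to the parameters $\{\alpha_j, \beta_a\}_{1 \leq j \leq n,\ 1 \leq a \leq r}$. The result is summarized by the following formula.

Theorem 2.5.
\[ d \ln \det(I - K) = \omega = \sum_{k=1}^{n} H_k\, d\alpha_k + \sum_{a=1}^{r} K_a\, d\beta_a, \tag{2.40} \]
where the individual factors may be expressed
\[ H_k = \frac{\partial \ln \det(I - K)}{\partial\alpha_k} = \mathrm{tr}(B N_k) + \sum_{\substack{j=1 \\ j \neq k}}^{n} \frac{\mathrm{tr}(N_j N_k)}{\alpha_k - \alpha_j}, \tag{2.41a} \]
\[ K_a = \frac{\partial \ln \det(I - K)}{\partial\beta_a} = \sum_{j=1}^{n} \alpha_j (N_j)_{aa} + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\big(\sum_{j=1}^{n} N_j\big)_{ab} \big(\sum_{k=1}^{n} N_k\big)_{ba}}{\beta_a - \beta_b}. \tag{2.41b} \]
Proof. To derive (2.41a) we use the equation
\[ \ln \det(I - K) = \mathrm{tr} \ln(I - K) = -\sum_{n=1}^{\infty} \frac{\mathrm{tr}\, K^n}{n}. \tag{2.42} \]
From the formula
\[ \mathrm{tr}\, K^n = \mathrm{tr} \sum_{j_1, \dots, j_n = 1}^{m} \int_{\alpha_{2j_1 - 1}}^{\alpha_{2j_1}} \cdots \int_{\alpha_{2j_n - 1}}^{\alpha_{2j_n}} K(\lambda_1, \lambda_2)\, K(\lambda_2, \lambda_3) \cdots K(\lambda_n, \lambda_1)\, d\lambda_1\, d\lambda_2 \cdots d\lambda_n, \tag{2.43} \]
we conclude that
\[ \frac{\partial\, \mathrm{tr}\, K^n}{\partial\alpha_k} = (-1)^k\, n\, \mathrm{tr}\, K^n(\alpha_k, \alpha_k), \tag{2.44} \]
where $K^n(\lambda, \mu)$ stands for the kernel of $K^n$ and “tr” in the r.h.s. is the matrix trace. Equations (2.42) and (2.44) imply (cf. [TW]) that
\[ \frac{\partial \ln \det(1 - K)}{\partial\alpha_k} = (-1)^{k+1} \sum_{n=1}^{\infty} \mathrm{tr}\, K^n(\alpha_k, \alpha_k) = (-1)^{k+1}\, \mathrm{tr}\, R(\alpha_k, \alpha_k), \tag{2.45} \]
where the second equality follows from the Neumann expansion of the resolvent operator $R$. From Eqs. (1.26), (2.13b), and (2.16a) it follows that
\[ R(\lambda, \lambda) = F^T(\lambda) \Big( B + \sum_{j=1}^{n} \frac{N_j^T}{\lambda - \alpha_j} \Big) G(\lambda). \tag{2.46} \]
Passing in the last equation to the limit $\lambda \to \alpha_k$ and taking into account (2.15) and (2.23), we conclude that
\[ R(\alpha_k, \alpha_k) = (-1)^k\, F_k^T \Big( B + \sum_{\substack{j=1 \\ j \neq k}}^{n} \frac{N_j^T}{\alpha_k - \alpha_j} \Big) G_k, \tag{2.47} \]
which, because of (2.14), implies that
\[ \mathrm{tr}\, R(\alpha_k, \alpha_k) = (-1)^{k+1} \Big( \mathrm{tr}(B N_k) + \sum_{\substack{j=1 \\ j \neq k}}^{n} \frac{\mathrm{tr}(N_j N_k)}{\alpha_k - \alpha_j} \Big), \tag{2.48} \]
and hence (2.41a). To prove (2.41b) we use (1.27), which implies
\[ K_a = -\mathrm{tr} \Big( (1 - K)^{-1} \frac{\partial K}{\partial\beta_a} \Big). \tag{2.49} \]
From (2.4a) it follows that
\[ \frac{\partial K(\lambda, \mu)}{\partial\beta_a} = f^T(\lambda)\, E_a\, g(\mu). \tag{2.50} \]
Equations (2.49) and (2.50) imply that
\[ K_a = -\mathrm{tr} \int_\Gamma F^T(\mu)\, E_a\, g(\mu)\, d\mu, \tag{2.51} \]
which in view of (1.48a) means that
\[ K_a = -\mathrm{tr}(E_a \chi_1) = -(\chi_1)_{aa}. \tag{2.52} \]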
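Formula (2.50) says that differentiating the kernel with respect to $\beta_a$ removes the denominator $\lambda - \mu$. A minimal numerical sketch, assuming the scalar case $p = 1$ with $r = 3$, hypothetical constant vectors `f0`, `g0` with $g_0^T f_0 = 0$, and two sample points `lam`, `mu` taken on the same segment (so the characteristic functions in (2.3a), (2.3b) equal 1), compares a central finite difference of the kernel with $f^T(\lambda) E_a g(\mu)$.

```python
import math

# Hypothetical data: exponents, and constant vectors with g0^T f0 = 0.
beta = [0.3, -0.5, 1.1]
f0 = [1.0, 2.0, -1.0]
g0 = [3.0, -1.0, 1.0]

def kernel(lam, mu, beta):
    """Scalar kernel f^T(lam) g(mu) / (lam - mu) with f = exp(lam B) f0, g = exp(-mu B) g0."""
    numerator = sum(fa * ga * math.exp((lam - mu) * ba)
                    for fa, ga, ba in zip(f0, g0, beta))
    return numerator / (lam - mu)

def claimed_derivative(lam, mu, a, beta):
    """Right-hand side of (2.50): the a-th component product f_a(lam) g_a(mu)."""
    f_a = math.exp(lam * beta[a]) * f0[a]
    g_a = math.exp(-mu * beta[a]) * g0[a]
    return f_a * g_a

def finite_difference(lam, mu, a, beta, h=1e-6):
    """Central difference of the kernel with respect to beta_a."""
    up = list(beta)
    down = list(beta)
    up[a] += h
    down[a] -= h
    return (kernel(lam, mu, up) - kernel(lam, mu, down)) / (2 * h)

lam, mu = 0.7, 0.2
max_error = max(
    abs(finite_difference(lam, mu, a, beta) - claimed_derivative(lam, mu, a, beta))
    for a in range(3)
)
print(max_error)
```

The discrepancy should be dominated by finite-difference truncation and rounding, well below the size of the derivatives themselves.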
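From the asymptotic form of the Eq. (2.16a), it follows (cf. (2.36)) that
\[ [\chi_1, B] = \sum_{j=1}^{n} N_j, \tag{2.53a} \]
\[ \chi_1 = [\chi_2, B] - \sum_{j=1}^{n} N_j\, \chi_1 - \sum_{j=1}^{n} \alpha_j N_j. \tag{2.53b} \]
The last two formulae determine the diagonal elements of $\chi_1$ as
\[ (\chi_1)_{aa} = -\sum_{j=1}^{n} \alpha_j (N_j)_{aa} - \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\big(\sum_{j=1}^{n} N_j\big)_{ab} \big(\sum_{k=1}^{n} N_k\big)_{ba}}{\beta_a - \beta_b}, \tag{2.54} \]
from which follows the expression (2.41b) for $K_a$.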
2.4. Monodromy data and the Fredholm determinant as a $\tau$-function. Let
\[ \frac{\partial\Psi(\lambda)}{\partial\lambda} = \Big( B + \sum_{j=1}^{n} \frac{A_j}{\lambda - \alpha_j} \Big) \Psi(\lambda) \tag{2.55} \]
be an arbitrary $r \times r$ linear system having the same singularity structure as the system (2.16a). According to [JMU], the $\tau$-function associated to a given solution
\[ A_j = A_j(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r) \tag{2.56} \]
of the corresponding isomonodromic deformation equations is defined via the differential form (see formula (5.17) in [JMU]):
\[ \omega = \frac{1}{2} \sum_{j \neq k} \mathrm{tr}(A_j A_k)\, \frac{d\alpha_j - d\alpha_k}{\alpha_j - \alpha_k} + \sum_{j=1}^{n} \sum_{a=1}^{r} (A_j)_{aa}\, d(\alpha_j \beta_a) + \frac{1}{2} \sum_{a \neq b} \Big( \sum_{j=1}^{n} A_j \Big)_{ab} \Big( \sum_{j=1}^{n} A_j \Big)_{ba} \frac{d\beta_a - d\beta_b}{\beta_a - \beta_b}. \tag{2.57} \]
More precisely, it is shown in [JMU] (and also follows from the Hamiltonian structure of the deformation equations [H]) that the form (2.57) is closed, allowing one to introduce the $\tau$-function, defined up to multiplication by local constants by the equation:
\[ d \ln \tau = \omega. \tag{2.58} \]
The monodromy data corresponding to system (2.55) consist of the set of $n + 1$ monodromy matrices,
\[ M_j, \quad j = 1, \dots, n, \qquad M_\infty, \tag{2.59} \]
\[ M_\infty M_n M_{n-1} \cdots M_1 = I_r, \tag{2.60} \]
associated to the singular points,
\[ \alpha_j, \quad j = 1, \dots, n, \qquad \alpha_\infty = \infty, \tag{2.61} \]
augmented by the two Stokes matrices,
\[ S_1, \ S_2, \tag{2.62} \]
related to $\alpha_\infty$, which is the only irregular singular point of (2.55). The Riemann–Hilbert problem (2.8) provides a special solution of the isomonodromy equations,
\[ A_j(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r) := N_j, \tag{2.63} \]
where $N_j$ is defined in (2.14). This solution is characterized by the following choice of the monodromy data (cf. (2.21)):
\[ M_{2j} = M_{2j-1}^{-1} = I_r - 2\pi i\, f_j g_j^T = \exp(-2\pi i\, f_j g_j^T), \qquad j = 1, \dots, m, \quad n = 2m, \tag{2.64a} \]
\[ M_\infty = S_1 = S_2 = I_r. \tag{2.64b} \]
Comparing now Eqs. (2.40) and (2.57) (cf. also Eqs. (1.6)–(1.8)), we conclude that the $\tau$-function evaluated on the solution (2.63) of the isomonodromy equations is given by the Fredholm determinant,
\[ \tau(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r) = \det(I - K) \tag{2.65} \]
of the integrable Fredholm operator $K$ characterized by the choice (2.4a), (2.4b) for the functions $f(\lambda)$, $g(\lambda)$ in (1.18). It is also worth noticing that Eq. (2.65) follows directly from Eqs. (1.6)–(1.8).
2.5. Grassmannian interpretation. We now show that this Fredholm determinant may also be interpreted as a τ-function in the sense of Segal–Wilson [SW,W] and Sato [Sa]; i.e., as the determinant of a projection operator on a suitably defined infinite dimensional Grassmann manifold. For this, we assume Γ is a simple, closed curve containing the origin of the λ-plane in its interior. (In [SW,W], Γ is taken as the unit circle, but this is of no significance to the general formulation.) Following [SW], for any positive integer k, let $H^k = L^2(\Gamma, \mathbb{C}^k)$ be the Hilbert space of square integrable $\mathbb{C}^k$-valued functions on Γ, and let
\[ H^k = H^k_+ + H^k_- \tag{2.66} \]
be its decomposition as the direct sum of subspaces $H^k_+$, $H^k_-$ consisting, respectively, of elements admitting a holomorphic extension to the interior (+) or the exterior (−) of Γ, the latter normalized to vanish as λ → ∞. Now, take k = r, and let $W \subset H^r$ be a subspace that is the graph of a compact linear operator,
\[ h_W : H^r_+ \to H^r_-. \tag{2.67} \]
This may be viewed as an element of the Grassmannian $Gr_{H^r_+}(H^r)$ of subspaces of $H^r$ modeled on $H^r_+$. Let
\[ \gamma(t) : H^r \to H^r, \qquad \gamma \in \Gamma^r_+ \tag{2.68} \]
denote the action of an abelian subgroup $\Gamma^r_+ \subset \widetilde{Gl}(r)_+$ of the loop group $\widetilde{Gl}(r)_+$ of nonsingular r × r matrix valued functions on Γ admitting a holomorphic nonsingular extension to the interior of Γ, acting by left multiplication on $H^r$. Let W(t) be the image of W under the induced action of γ(t) on the Grassmannian, and choose (in the sense of [SW]) an admissible basis, in which W is represented by the (2∞) × ∞ matrix
\[ W \sim \begin{pmatrix} w_+ \\ w_- \end{pmatrix}, \tag{2.69} \]
where $w_+$, $w_-$ represent maps
\[ w_+ : H^r_+ \to H^r_+, \qquad w_- : H^r_+ \to H^r_-, \tag{2.70} \]
the summed images of which give W. In this notation, we have $w_- = h_W \circ w_+$. The action of γ(t) can then be expressed in matricial form as
\[ W(t) = \gamma(t) W = \begin{pmatrix} a(t) & b(t) \\ 0 & d(t) \end{pmatrix} \begin{pmatrix} w_+ \\ w_- \end{pmatrix} \tag{2.71} \]
\[ \phantom{W(t)} = \begin{pmatrix} a(t) w_+ + b(t) w_- \\ d(t) w_- \end{pmatrix}, \tag{2.72} \]
where the maps
\[ a(t) : H^r_+ \to H^r_+, \tag{2.73a} \]
\[ b(t) : H^r_- \to H^r_+, \tag{2.73b} \]
\[ d(t) : H^r_- \to H^r_-, \tag{2.73c} \]
Integrable Fredholm Operators and Dual Isomonodromic Deformations
are defined by restriction of γ(t) to $H^r_\pm$ composed with the orthogonal projection maps $P_\pm : H^r \to H^r_\pm$,
\[ a := P_+ \circ \gamma|_{H^r_+}, \qquad b := P_+ \circ \gamma|_{H^r_-}, \qquad d := P_- \circ \gamma|_{H^r_-}. \tag{2.74} \]
A convenient way to express these projection maps is via the Cauchy operators
\[ (P_\pm v)(\lambda) = \pm \frac{1}{2\pi i} \lim_{\lambda_\pm \to \lambda} \int_\Gamma \frac{v(\mu)}{\mu - \lambda_\pm}\, d\mu =: \pm \frac{1}{2\pi i} \int_\Gamma \frac{v(\mu)}{\mu - \lambda_\pm}\, d\mu, \tag{2.75} \]
where the expression on the right is an abbreviated notation for the limit, indicating that after integration $\lambda_\pm$ approaches λ ∈ Γ from the corresponding side $\Gamma_\pm$. The Segal–Wilson–Sato τ-function for such a subspace $W \subset H^r$ is then just given by the determinant
\[ \tau_W(t) := \det\left(I + b(t) \circ h_W \circ a(t)^{-1}\right). \tag{2.76} \]
For the specific case at hand, we let Γ be a closed curve passing through the points $\{\alpha_i\}_{i=1\dots n}$ and define W to be the image of $H^r_+$ under the action of the loop group element $H_0^{-1}(\lambda)$, where $H_0(\lambda)$ is defined in (2.9),
\[ W = W(\alpha_1, \dots, \alpha_n) := H_0^{-1}(H^r_+) = \{ v_+ - 2\pi i\, h_0 v_+,\ v_+ \in H^r_+ \}. \tag{2.77} \]
The abelian subgroup $\Gamma^r_+ \subset \widetilde{Gl}(r)_+$ is taken to be the set of values of the “vacuum” wave function
\[ \gamma(t) := \Psi_0(\beta_1, \dots, \beta_r) = e^{\lambda B}, \qquad \{t_a := \beta_a\}_{a=1,\dots,r}. \tag{2.78} \]
Then, using (2.75), and the orthogonality condition (2.11), we conclude that
\[ W = \left\{ v_+ + \int_\Gamma \frac{h_0(\mu) v_+(\mu)}{\mu - \lambda_-}\, d\mu,\ v_+ \in H^r_+ \right\}. \tag{2.79} \]
Hence we may express the operator $h_W$ as
\[ h_W v_+(\lambda) = \int_\Gamma \frac{h_0(\mu) v_+(\mu)}{\mu - \lambda_-}\, d\mu, \tag{2.80} \]
while the maps a, b and d are given by
\[ a v_+(\lambda) = \Psi_0(\lambda) v_+(\lambda), \tag{2.81a} \]
\[ b v_-(\lambda) = \frac{1}{2\pi i} \int_\Gamma \frac{\Psi_0(\mu) v_-(\mu)}{\mu - \lambda_+}\, d\mu, \tag{2.81b} \]
\[ d v_-(\lambda) = -\frac{1}{2\pi i} \int_\Gamma \frac{\Psi_0(\mu) v_-(\mu)}{\mu - \lambda_-}\, d\mu, \tag{2.81c} \]
with $v_+ \in H^r_+$, $v_- \in H^r_-$. We then have the following theorem.
Theorem 2.6. The following two determinants are equal:
\[ \det(I - K) = \det\left(I + b(t) \circ h_W \circ a(t)^{-1}\right), \tag{2.82} \]
and hence the τ-function $\tau(\alpha_1, \dots, \alpha_n; \beta_1, \dots, \beta_r)$ evaluated in (2.65) is the same as the Sato–Segal–Wilson τ-function $\tau_W(t)$ defined in (2.76).
Proof. Composing the inverse $a^{-1}$ of the map defined in (2.81a) with $h_W$ and b as defined in (2.80), (2.81b) gives the map $b \circ h_W \circ a^{-1} : H^r_+ \to H^r_+$,
\[ b \circ h_W \circ a^{-1}(v_+)(\lambda) = \frac{1}{2\pi i} \int_\Gamma \int_\Gamma \frac{\Psi_0(\nu)}{\nu - \lambda_+}\, \frac{h_0(\mu) \Psi_0^{-1}(\mu) v_+(\mu)}{\mu - \nu_-}\, d\mu\, d\nu. \tag{2.83} \]
Inverting the order of integration, evaluating the residues at ν = µ and ν = λ_+ and using (2.10) gives
\[ b \circ h_W \circ a^{-1}(v_+)(\lambda) = -\int_\Gamma \frac{f(\mu) g^T(\mu) v_+(\mu)}{\mu - \lambda_+}\, d\mu + \int_\Gamma \frac{\Psi_0(\lambda) f_0(\mu) g^T(\mu) v_+(\mu)}{\mu - \lambda_+}\, d\mu, \tag{2.84} \]
with f(µ), g(µ) defined by (2.4a), (2.4b). Now note that $K : H^p \to H^p$ can be written in the form
\[ K = A_f \circ B_g, \tag{2.85} \]
where $B_g : H^p \to H^r_+$ is defined by
\[ B_g \phi(\lambda) = \int_\Gamma \frac{g(\mu) \phi(\mu)}{\lambda - \mu}\, d\mu, \qquad \phi \in H^p, \tag{2.86} \]
while $A_f : H^r_+ \to H^p$ is defined by
\[ A_f v_+(\lambda) = f^T(\lambda) v_+(\lambda), \qquad v_+ \in H^r_+. \tag{2.87} \]
On the other hand, from (2.84), we see that $b \circ h_W \circ a^{-1}$ can be expressed as
\[ b \circ h_W \circ a^{-1} = \tilde{B}_f \circ A_g, \tag{2.88} \]
where
\[ \tilde{B}_f = B_f - a \circ B_{f_0}, \tag{2.89} \]
with $B_f$ and $B_{f_0}$ defined as in (2.86), with g replaced by f and $f_0$, respectively (cf. (2.4a), (2.3a)), while $A_g$ is defined as in (2.87), with f replaced by g. The Weinstein–Aronszajn identity implies that
\[ \det(I + \tilde{B}_f \circ A_g) = \det(I + A_g \circ \tilde{B}_f) = \det(I + A_g \circ B_f), \tag{2.90} \]
where the second equality follows from the fact that, due to the orthogonality conditions (2.2), we have
\[ A_g \circ a \circ B_{f_0} = A_{g_0} \circ B_{f_0} = 0. \tag{2.91} \]
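The Weinstein–Aronszajn identity used in (2.90) can be checked numerically in finite dimensions. The following is only a minimal sketch: the rectangular matrices `A` and `B` are random stand-ins for the operators A_g and B̃_f, and the function name `wa_determinants` is hypothetical, not something defined in the paper.

```python
import numpy as np

def wa_determinants(A, B):
    """Return (det(I + B A), det(I + A B)) for A of shape (p, r) and B of shape (r, p).

    The Weinstein-Aronszajn identity states that the two values coincide,
    even though the identity matrices have different sizes (r and p).
    """
    p, r = A.shape
    assert B.shape == (r, p)
    det_big = np.linalg.det(np.eye(r) + B @ A)    # r x r determinant
    det_small = np.linalg.det(np.eye(p) + A @ B)  # p x p determinant
    return det_big, det_small

# Illustrative random data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 5))   # finite-dimensional stand-in for A_g
B = rng.standard_normal((5, 2))   # finite-dimensional stand-in for B~_f
det_big, det_small = wa_determinants(A, B)
print(det_big, det_small)
```

In the proof the same identity is applied to operators on infinite dimensional spaces, which this finite-dimensional check does not cover.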
Therefore
\[ \tau_W(t) = \det\left(I + b(t) \circ h_W \circ a(t)^{-1}\right) = \det(I - K^T), \tag{2.92} \]
where
\[ K^T := -A_g \circ B_f. \tag{2.93} \]
Comparing (2.85) with (2.93), we see that $K^T$ is really just the transpose of K, and hence the determinants in (2.82) and (2.92) are the same.

Remark. An identity related to (2.90); namely, the equation
\[ (I + \tilde{B}_f \circ A_g)^{-1} + \tilde{B}_f (I + A_g \circ \tilde{B}_f)^{-1} A_g = I, \]
lies, in fact, behind the basic relationship (1.39a), (1.39b) between the Riemann–Hilbert problem and integrable Fredholm kernels (see [DIZ], proof of Lemma 2.12).

3. Duality

3.1. The dual isomonodromic system. In the following, choose r to be even, r = 2s, and let $\tilde{\Gamma}$ be a piecewise continuous oriented curve in the complex z-plane consisting of the ordered sequence $\{\tilde{\Gamma}_a\}_{a=1,\dots,s}$ of connected components whose endpoints are given by the consecutive pairs $\{\beta_{2a-1}, \beta_{2a}\}_{a=1,\dots,s}$ of eigenvalues of the qr × qr matrix $B = \mathrm{diag}(\beta_1, \dots, \beta_{2s})$, where each eigenvalue $\beta_a$ has multiplicity q, with q ≤ n. Choose a set $\{\tilde{f}_a, \tilde{g}_a\}_{a=1,\dots,s}$ of fixed rectangular n × q matrices satisfying
\[ \tilde{f}_a^T \tilde{g}_b = 0, \qquad a, b = 1, \dots, s. \tag{3.1} \]
Let
\[ \tilde{f}(z) = e^{zA} \sum_{a=1}^{s} \tilde{f}_a \tilde{\theta}_a(z), \tag{3.2a} \]
\[ \tilde{g}(z) = e^{-zA} \sum_{a=1}^{s} \tilde{g}_a \tilde{\theta}_a(z), \tag{3.2b} \]
where $\tilde{\theta}_a$ is now the characteristic function, along $\tilde{\Gamma}$, of the curve segment $\tilde{\Gamma}_a$ and A is the diagonal n × n matrix,
\[ A = \mathrm{diag}(\alpha_1 \cdots \alpha_n). \tag{3.3} \]
Let $\tilde{\chi}(z)$ be the nonsingular n × n matrix valued function satisfying the conditions that it be analytic in the compactified complex z-plane in the complement of $\tilde{\Gamma}$, with asymptotic form
\[ \tilde{\chi}(z) = I_n + O(z^{-1}), \tag{3.4} \]
having cut discontinuities across $\tilde{\Gamma}$ given by
\[ \tilde{\chi}_-(z) = \tilde{\chi}_+(z) \tilde{H}(z), \qquad z \in \tilde{\Gamma}, \tag{3.5} \]
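where $\tilde{\chi}_+(z)$, $\tilde{\chi}_-(z)$ are the limits as $\tilde{\Gamma}$ is approached from the left and right, respectively, and $\tilde{H}(z)$ is the invertible n × n matrix function defined along $\tilde{\Gamma}$ by
\[ \tilde{H}(z) = I_n + 2\pi i\, \tilde{f}(z) \tilde{g}^T(z) = \exp\left(2\pi i\, \tilde{f}(z) \tilde{g}^T(z)\right). \tag{3.6} \]
As above, we define the q × q matrix Fredholm integral operator $\tilde{K}$ acting on $\mathbb{C}^q$-valued functions $\tilde{v}$ on $\tilde{\Gamma}$ by
\[ \tilde{K}(\tilde{v})(z) = \int_{\tilde{\Gamma}} \tilde{K}(z, w) \tilde{v}(w)\, dw, \tag{3.7} \]
where
\[ \tilde{K}(z, w) = \frac{\tilde{f}^T(z) \tilde{g}(w)}{z - w}. \tag{3.8} \]
The resolvent operator
\[ \tilde{R} = (I - \tilde{K})^{-1} \tilde{K} \tag{3.9} \]
may again be expressed as
\[ \tilde{R}(\tilde{v})(z) = \int_{\tilde{\Gamma}} \tilde{R}(z, w) \tilde{v}(w)\, dw, \tag{3.10} \]
where the kernel
\[ \tilde{R}(z, w) = \frac{\tilde{F}^T(z) \tilde{G}(w)}{z - w} \tag{3.11} \]
is determined by
\[ \tilde{F}^T = (I - \tilde{K})^{-1} \tilde{f}^T = (I + \tilde{R}) \tilde{f}^T = \tilde{f}^T \tilde{\chi}^T, \tag{3.12a} \]
\[ \tilde{G} = \tilde{g} (I - \tilde{K})^{-1} = \tilde{g} (I + \tilde{R}) = (\tilde{\chi}^T)^{-1} \tilde{g}. \tag{3.12b} \]
As above, $\tilde{\chi}(z)$, $\tilde{\chi}^{-1}(z)$ have the asymptotic expansions
\[ \tilde{\chi}(z) = I_n + \sum_{j=1}^{\infty} \frac{\tilde{\chi}_j}{z^j}, \tag{3.13a} \]
\[ \tilde{\chi}^{-1}(z) = I_n + \sum_{j=1}^{\infty} \frac{\hat{\tilde{\chi}}_j}{z^j}, \tag{3.13b} \]
where
\[ \tilde{\chi}_j = \int_{\tilde{\Gamma}} z^{j-1} \tilde{F}(z) \tilde{g}^T(z)\, dz, \tag{3.14a} \]
\[ \hat{\tilde{\chi}}_j = -\int_{\tilde{\Gamma}} z^{j-1} \tilde{f}(z) \tilde{G}^T(z)\, dz \tag{3.14b} \]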
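satisfy
\[ \hat{\tilde{\chi}}_j = -\tilde{\chi}_j - \sum_{k=1}^{j-1} \tilde{\chi}_{j-k} \hat{\tilde{\chi}}_k, \quad j > 1, \qquad \hat{\tilde{\chi}}_1 = -\tilde{\chi}_1. \tag{3.15} \]
Again, we may define the n × q rectangular matrices
\[ \tilde{F}_a := \lim_{z \to \beta_a} \tilde{F}(z), \qquad \tilde{G}_a := (-1)^a \lim_{z \to \beta_a} \tilde{G}(z), \qquad a = 1, \dots, r, \tag{3.16} \]
where the limits are taken inside the curve segments $\{\tilde{\Gamma}_a\}_{a=1,\dots,s}$, the n × n matrices
\[ \tilde{M}_a := -\tilde{F}_a \tilde{G}_a^T, \qquad a = 1, \dots, r, \tag{3.17} \]
and the pair of n × rq matrices $\tilde{F}$, $\tilde{G}$ formed from the blocks $\{\tilde{F}_a, \tilde{G}_a\}$,
\[ \tilde{F} := \left( \tilde{F}_1\ \tilde{F}_2\ \cdots\ \tilde{F}_r \right), \tag{3.18a} \]
\[ \tilde{G} := \left( \tilde{G}_1\ \tilde{G}_2\ \cdots\ \tilde{G}_r \right). \tag{3.18b} \]
Then
\[ \tilde{M}(z) := A + \sum_{a=1}^{r} \frac{\tilde{M}_a}{z - \beta_a} = A + \tilde{F} (B - zI)^{-1} \tilde{G}^T, \tag{3.19} \]
as in (1.12a), (1.12b). Now define, similarly to the previous section,
\[ \tilde{\Psi}(z) := \tilde{\chi}(z) e^{zA}, \tag{3.20} \]
with limiting values on either side of the segments of $\tilde{\Gamma}$,
\[ \tilde{\Psi}_\pm(z) := \tilde{\chi}_\pm(z) e^{zA}. \tag{3.21} \]
It then follows from the same arguments as in the preceding section that $\tilde{\Psi}$ simultaneously satisfies
\[ \tilde{D}_z \tilde{\Psi} = 0, \tag{3.22a} \]
\[ \tilde{D}_{\beta_a} \tilde{\Psi} = 0, \qquad a = 1, \dots, r, \tag{3.22b} \]
\[ \tilde{D}_{\alpha_i} \tilde{\Psi} = 0, \qquad i = 1, \dots, n, \tag{3.22c} \]
where
\[ \tilde{D}_z := \frac{\partial}{\partial z} - \tilde{M}(z) = \frac{\partial}{\partial z} - A - \sum_{a=1}^{r} \frac{\tilde{M}_a}{z - \beta_a}, \tag{3.23a} \]
\[ \tilde{D}_{\beta_a} := \frac{\partial}{\partial \beta_a} + \frac{\tilde{M}_a}{z - \beta_a}, \tag{3.23b} \]
\[ \tilde{D}_{\alpha_i} := \frac{\partial}{\partial \alpha_i} - z E_i - \sum_{\substack{j=1 \\ j \neq i}}^{n} \frac{E_i \left(\sum_{a=1}^{r} \tilde{M}_a\right) E_j + E_j \left(\sum_{a=1}^{r} \tilde{M}_a\right) E_i}{\alpha_i - \alpha_j}, \tag{3.23c} \]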
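with $\{E_i\}_{i=1,\dots,n}$ the elementary n × n matrices with entry 1 in the (ii) position. Consequently, all these operators commute and the monodromy of the parametric family of operators $\tilde{D}_z$ is invariant under the changes in the deformation parameters $\{\alpha_i, \beta_a\}_{i=1,\dots,n,\ a=1,\dots,r}$. The corresponding dynamical equations, expressed in terms of the matrices $\{\tilde{F}_a, \tilde{G}_a\}_{a=1,\dots,r}$ are
\[ \frac{\partial \tilde{F}_a}{\partial \beta_b} = -\frac{\tilde{M}_b}{\beta_a - \beta_b} \tilde{F}_a, \qquad a \neq b, \tag{3.24a} \]
\[ \frac{\partial \tilde{F}_a}{\partial \beta_a} = \Bigl( A + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\tilde{M}_b}{\beta_a - \beta_b} \Bigr) \tilde{F}_a, \tag{3.24b} \]
\[ \frac{\partial \tilde{G}_a}{\partial \beta_b} = \frac{\tilde{M}_b^T}{\beta_a - \beta_b} \tilde{G}_a, \qquad a \neq b, \tag{3.24c} \]
\[ \frac{\partial \tilde{G}_a}{\partial \beta_a} = -\Bigl( A + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\tilde{M}_b^T}{\beta_a - \beta_b} \Bigr) \tilde{G}_a, \tag{3.24d} \]
\[ \frac{\partial \tilde{F}_a}{\partial \alpha_i} = \left( \beta_a E_i + [\tilde{\chi}_1, E_i] \right) \tilde{F}_a, \tag{3.24e} \]
\[ \frac{\partial \tilde{G}_a}{\partial \alpha_i} = \left( -\beta_a E_i + [\tilde{\chi}_1^T, E_i] \right) \tilde{G}_a, \tag{3.24f} \]
with the term $[\tilde{\chi}_1, E_i]$ given by
\[ [\tilde{\chi}_1, E_i]_{jk} = (\delta_{ij} - \delta_{ik}) \frac{\left(\sum_{a=1}^{r} \tilde{M}_a\right)_{jk}}{\alpha_j - \alpha_k}. \tag{3.25} \]
These are again just the Hamiltonian equations (1.11a), (1.11b) under the identifications (3.18a), (3.18b), provided the pair $(\tilde{F}, \tilde{G})$ is identified with (F, G). Finally, we may compute the logarithmic differential of the Fredholm determinant of $\tilde{K}$ to obtain, as in Sect. 2.3,
\[ \tilde{\omega} = d \ln \det(1 - \tilde{K}) = \sum_{j=1}^{n} \tilde{H}_j\, d\alpha_j + \sum_{a=1}^{r} \tilde{K}_a\, d\beta_a, \tag{3.26} \]
where
\[ \tilde{K}_a = \frac{\partial \ln \det(I - \tilde{K})}{\partial \beta_a} = \mathrm{tr}(A \tilde{M}_a) + \sum_{\substack{b=1 \\ b \neq a}}^{r} \frac{\mathrm{tr}(\tilde{M}_a \tilde{M}_b)}{\beta_a - \beta_b}, \tag{3.27a} \]
\[ \tilde{H}_j = \frac{\partial \ln \det(I - \tilde{K})}{\partial \alpha_j} = \sum_{a=1}^{r} \beta_a (\tilde{M}_a)_{jj} + \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{\left(\sum_{a=1}^{r} \tilde{M}_a\right)_{jk} \left(\sum_{b=1}^{r} \tilde{M}_b\right)_{kj}}{\alpha_j - \alpha_k}. \tag{3.27b} \]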
3.2. Duality theorem. In order to relate the above isomonodromic deformation system to the one in Sect. 2 we henceforth make the following restrictions. First of all, we consider only the case p = q = 1, so K, $\tilde{K}$ are both scalar Fredholm integral operators. The rectangular matrices f(λ), g(λ), $f_i$, $g_i$ all become r-component column vectors while $\tilde{f}(z)$, $\tilde{g}(z)$, $\tilde{f}_a$, $\tilde{g}_a$ are n-component column vectors. Furthermore, the $f_i$’s and $\tilde{f}_a$’s are chosen to be of the special form
\[ f_j = e, \qquad \tilde{f}_a = \tilde{e}, \qquad j = 1, \dots, m, \quad a = 1, \dots, s, \tag{3.28} \]
where $e \in \mathbb{C}^r$, $\tilde{e} \in \mathbb{C}^n$ are the column vectors with all components equal to 1, while the components of the column vectors $g_j$ and $\tilde{g}_a$ are chosen such that
\[ (g_j)_{2a} = -(g_j)_{2a-1} = (\tilde{g}_a)_{2j} = -(\tilde{g}_a)_{2j-1} =: c_{ja} \tag{3.29} \]
for some constant m × s matrix with elements $\{c_{ja}\}_{j=1,\dots,m,\ a=1,\dots,s}$. We then have the following result, which shows that the isomonodromic deformation system (3.23a)–(3.23c) is, in fact, the dual system to the one defined in Sect. 2 by Eqs. (2.18a), (2.18b), (2.30).

Theorem 3.1. $\tilde{K}$ is the Fourier–Laplace transform of K along the curves Γ and $\tilde{\Gamma}$ and
\[ F = \tilde{F}, \qquad G = \tilde{G}, \tag{3.30a} \]
\[ \omega = \tilde{\omega}. \tag{3.30b} \]
Proof. On the product $\Gamma \times \tilde{\Gamma}$, define the following locally constant function
\[ \hat{K}(\lambda, z) := \sum_{j=1}^{m} \sum_{a=1}^{s} c_{ja}\, \theta_j(\lambda)\, \tilde{\theta}_a(z). \tag{3.31} \]
Taking the Fourier–Laplace transform with respect to the variables z and λ along the curves $\tilde{\Gamma}$ and Γ, respectively, gives the two Fredholm kernels
\[ K(\lambda, \mu) = \int_{\tilde{\Gamma}} \hat{K}(\mu, z) e^{z(\lambda - \mu)}\, dz = \frac{f^T(\lambda) g(\mu)}{\lambda - \mu}, \tag{3.32a} \]
\[ \tilde{K}(w, z) = \int_{\Gamma} \hat{K}(\mu, z) e^{\mu(w - z)}\, d\mu = \frac{\tilde{f}^T(w) \tilde{g}(z)}{w - z}, \tag{3.32b} \]
where f, g, $\tilde{f}$, $\tilde{g}$ are defined by (2.4a), (2.4b), (3.2a), (3.2b) with $f_j$, $g_j$, $\tilde{f}_a$, $\tilde{g}_a$ given by (3.28), (3.29).
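The first transform (3.32a) can be checked numerically for a fixed µ on a segment Γ_j. The sketch below compares a quadrature of the z-integral with the closed form f^T(λ)g(µ)/(λ − µ). It rests on several assumptions that are not stated in this part of the paper: the endpoints β_a are taken real so that each segment is a real interval, g(µ) is taken to be e^{−µB} g_j on Γ_j (by analogy with (3.2b)), and the function names are hypothetical.

```python
import numpy as np

def kernel_by_quadrature(lam, mu, c_row, beta, n_nodes=40):
    """Sum over a of c_a * integral of exp(z (lam - mu)) over [beta_{2a-1}, beta_{2a}].

    c_row holds the entries c_{ja} for the fixed segment index j containing mu.
    Gauss-Legendre quadrature is used on each (real) interval.
    """
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    total = 0.0
    for a, c in enumerate(c_row):
        lo, hi = beta[2 * a], beta[2 * a + 1]
        z = 0.5 * (hi - lo) * x + 0.5 * (hi + lo)   # map nodes from [-1, 1] to [lo, hi]
        total += c * 0.5 * (hi - lo) * np.sum(w * np.exp(z * (lam - mu)))
    return total

def kernel_closed_form(lam, mu, c_row, beta):
    """f^T(lam) g(mu) / (lam - mu) with f_j = e and (g_j)_{2a} = -(g_j)_{2a-1} = c_{ja}."""
    beta = np.asarray(beta, dtype=float)
    g = np.zeros(len(beta))
    for a, c in enumerate(c_row):
        g[2 * a] = -c       # component 2a-1 in the paper's 1-based indexing
        g[2 * a + 1] = c    # component 2a
    f_lam = np.exp(lam * beta)       # components of e^{lam B} e
    g_mu = np.exp(-mu * beta) * g    # components of e^{-mu B} g_j (assumed form of g)
    return float(f_lam @ g_mu) / (lam - mu)

# Illustrative data for s = 2 (r = 4); the numbers are arbitrary.
beta = [0.0, 1.0, 1.5, 2.5]
c_row = [0.3, -0.7]
print(kernel_by_quadrature(0.4, -0.2, c_row, beta),
      kernel_closed_form(0.4, -0.2, c_row, beta))
```

The agreement reflects the elementary integral of e^{z(λ−µ)} between β_{2a−1} and β_{2a}, which is where the alternating signs in (3.29) come from.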
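To prove (3.30a), we use the Neumann expansion of the resolvent R in (1.24a) to get the following convergent series for the components of F:
\[ \begin{aligned}
F_a(\alpha_j) &= f_a(\alpha_j) + \int_{\Gamma} K(\alpha_j, \lambda_1) f_a(\lambda_1)\, d\lambda_1 + \int_{\Gamma}\int_{\Gamma} K(\alpha_j, \lambda_2) K(\lambda_2, \lambda_1) f_a(\lambda_1)\, d\lambda_1 d\lambda_2 + \cdots \\
&= e^{\alpha_j \beta_a} + \int_{\Gamma}\int_{\tilde{\Gamma}} \hat{K}(\lambda_1, z_1) e^{z_1(\alpha_j - \lambda_1)} e^{\lambda_1 \beta_a}\, dz_1 d\lambda_1 \\
&\quad + \int_{\Gamma}\int_{\Gamma}\int_{\tilde{\Gamma}}\int_{\tilde{\Gamma}} \hat{K}(\lambda_2, z_2) \hat{K}(\lambda_1, z_1) e^{z_2(\alpha_j - \lambda_2)} e^{z_1(\lambda_2 - \lambda_1)} e^{\beta_a \lambda_1}\, dz_1 dz_2 d\lambda_1 d\lambda_2 + \cdots \\
&= e^{\alpha_j \beta_a} + \int_{\tilde{\Gamma}}\int_{\Gamma} \hat{K}(\lambda_1, z_1) e^{\lambda_1(\beta_a - z_1)} e^{z_1 \alpha_j}\, d\lambda_1 dz_1 \\
&\quad + \int_{\tilde{\Gamma}}\int_{\tilde{\Gamma}}\int_{\Gamma}\int_{\Gamma} \hat{K}(\lambda_2, z_2) \hat{K}(\lambda_1, z_1) e^{\lambda_2(z_1 - z_2)} e^{\lambda_1(\beta_a - z_1)} e^{\alpha_j z_2}\, d\lambda_1 d\lambda_2 dz_1 dz_2 + \cdots \\
&= \tilde{f}_j(\beta_a) + \int_{\tilde{\Gamma}} \tilde{K}(\beta_a, z_1) \tilde{f}_j(z_1)\, dz_1 + \int_{\tilde{\Gamma}}\int_{\tilde{\Gamma}} \tilde{K}(\beta_a, z_1) \tilde{K}(z_1, z_2) \tilde{f}_j(z_2)\, dz_1 dz_2 + \cdots \\
&= \tilde{F}_j(\beta_a),
\end{aligned} \tag{3.33} \]
where the Neumann expansion of $\tilde{R}$ in (3.12a) is used in the last line. Similarly, using the Neumann expansions of R and $\tilde{R}$ in (1.24b), (3.12b) and taking into account that
\[ \hat{K}(\alpha_j, z) = \sum_{b=1}^{s} c_{\left[\frac{j+1}{2}\right] b}\, \tilde{\theta}_b(z), \qquad \hat{K}(\lambda, \beta_a) = \sum_{k=1}^{m} c_{k \left[\frac{a+1}{2}\right]}\, \theta_k(\lambda), \qquad a = 1, \dots, r, \quad j = 1, \dots, n, \tag{3.34} \]
gives the equality
\[ G_a(\alpha_j) = \tilde{G}_j(\beta_a). \tag{3.35} \]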
The equality (3.30b) follows directly from the fact that K and $\tilde{K}$ are related by the Fourier–Laplace transforms (3.32a), (3.32b). It may also be seen from the equalities (3.30a) in view of (2.28), (3.19) and the general duality relations mentioned in the introduction and proved in [H].

4. Reductions and Generalizations

4.1. Symplectic reduction. In applications, the structure of the Fredholm integral operator K and the associated Riemann–Hilbert problem data may be such that the generic systems appearing in Sects. 2.1 and 2.2 are reduced to systems having fewer independent variables. An important example of this involves reduction from the group Gl(2s) to the symplectic subgroup Sp(2s). Let J be the symplectic 2s × 2s matrix having block form
\[ J = \begin{pmatrix} 0 & I_s \\ -I_s & 0 \end{pmatrix}, \tag{4.1} \]
and suppose that the matrix B of Sect. 2 satisfies the relations
\[ B^T J + J B = 0, \tag{4.2} \]
implying it is in the symplectic subalgebra sp(2s) ⊂ gl(2s). Since B is diagonal, with eigenvalues $\{\beta_a\}_{a=1,\dots,2s}$, this just means that
\[ \beta_{a+s} = -\beta_a, \qquad a = 1, \dots, s, \tag{4.3} \]
so B has the form
\[ B = \begin{pmatrix} B_1 & 0 \\ 0 & -B_1 \end{pmatrix}, \tag{4.4} \]
where $B_1 = \mathrm{diag}(\beta_1, \dots, \beta_s)$. This implies that the vacuum wave function $\Psi_0(\lambda)$ satisfies
\[ \Psi_0^T(\lambda) J \Psi_0(\lambda) = J, \tag{4.5} \]
and hence takes values in Sp(2s). We also require that the r × p matrices $\{f_j, g_j\}$ entering in (2.4a), (2.4b) satisfy the relations
\[ g_j f_j^T J + J f_j g_j^T = 0, \tag{4.6} \]
implying that the group element $H_0(\lambda)$ defined in (2.9), (2.10) satisfies
\[ H_0^T(\lambda) J H_0(\lambda) = J, \tag{4.7} \]
and hence also is in Sp(2s). It then follows that the dressed wave function Ψ(λ) obtained by solution of the associated Riemann–Hilbert problem also takes values in Sp(2s), and the residue matrices $N_j$ entering in the definition of the operator $D_\lambda$ have values in sp(2s). More explicitly, assuming the $(f_j, g_j)$’s all have rank p, Eq. (4.6) implies that there exist symmetric, invertible matrices $s_j \in Gl(p)$ such that
\[ f_j = J g_j s_j, \qquad s_j^T = s_j, \qquad j = 1, \dots, s. \tag{4.8} \]
The orthogonality conditions (2.2) then reduce to the symplectic isotropy conditions
\[ g_j^T J g_k = 0, \qquad j = 1, \dots, s. \tag{4.9} \]
Since the $s_j$’s are invertible and symmetric, we may take their square roots $s_j^{1/2}$ (determined up to a $\mathbb{Z}_2^r$ ambiguity) and, without changing the content of the Riemann–Hilbert problem, redefine the $f_j$’s and $g_j$’s by absorbing these square roots through the substitutions
\[ f_j \longrightarrow f_j s_j^{-1/2}, \qquad g_j \longrightarrow g_j s_j^{1/2}. \tag{4.10} \]
This just amounts to setting the $s_j$’s equal to the identity element, reducing (4.8) simply to
\[ f_j = J g_j. \tag{4.11} \]
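The algebraic conditions of this reduction are easy to verify numerically. The sketch below checks (4.2), (4.5) and (4.6) for randomly chosen data, under the assumptions that the vacuum wave function is the exponential e^{λB} of Sect. 2 and that f_j = J g_j as in (4.11). The function names and the random test data are illustrative only.

```python
import numpy as np

def symplectic_J(s):
    """Block matrix J = [[0, I_s], [-I_s, 0]] of (4.1)."""
    I = np.eye(s)
    Z = np.zeros((s, s))
    return np.block([[Z, I], [-I, Z]])

def reduction_residuals(beta1, g, lam=0.3):
    """Max-norm residuals of (4.2), (4.5) and (4.6) for B = diag(beta1, -beta1), f = J g."""
    beta1 = np.asarray(beta1, dtype=float)
    s = len(beta1)
    J = symplectic_J(s)
    diag_B = np.concatenate([beta1, -beta1])
    B = np.diag(diag_B)
    psi0 = np.diag(np.exp(lam * diag_B))   # e^{lam B} for diagonal B
    f = J @ g                              # relation (4.11)
    res_algebra = np.abs(B.T @ J + J @ B).max()           # (4.2)
    res_group = np.abs(psi0.T @ J @ psi0 - J).max()       # (4.5)
    res_fg = np.abs(g @ f.T @ J + J @ f @ g.T).max()      # (4.6)
    return res_algebra, res_group, res_fg

# Illustrative data: s = 3, p = 2 (arbitrary choices).
rng = np.random.default_rng(1)
g = rng.standard_normal((6, 2))
print(reduction_residuals([0.5, -1.2, 2.0], g))
```

All three residuals vanish up to rounding, since J is orthogonal with J² = −I.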
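Since the solution of the Riemann–Hilbert problem preserves the symplectic reduction, it follows from Eqs. (2.13a), (2.13b) and (2.15) that the matrices $(F_j, G_j)$ satisfy the relations
\[ F_j = (-1)^j J G_j. \tag{4.12} \]
This means that there exist pairs of rectangular p × s matrices $(Q_j, P_j)$ such that the $F_j$’s and $G_j$’s may be expressed
\[ F_j^T = e^{\frac{i\pi j}{2}} \left( P_j, -Q_j \right), \qquad G_j^T = (-1)^j e^{\frac{i\pi j}{2}} \left( Q_j, P_j \right), \qquad j = 1 \dots n. \tag{4.13} \]
The residue matrices $N_j$ therefore have the reduced block form
\[ N_j = -\begin{pmatrix} P_j^T Q_j & P_j^T P_j \\ -Q_j^T Q_j & -Q_j^T P_j \end{pmatrix}, \tag{4.14} \]
showing explicitly that they all belong to sp(2s). The reduced form of Eqs. (2.26a)–(2.26d) then becomes
\[ \frac{\partial Q_j}{\partial \alpha_k} = \frac{(P_j Q_k^T - Q_j P_k^T) Q_k}{\alpha_j - \alpha_k}, \qquad j \neq k, \tag{4.15a} \]
\[ \frac{\partial P_j}{\partial \alpha_k} = \frac{(P_j Q_k^T - Q_j P_k^T) P_k}{\alpha_j - \alpha_k}, \qquad j \neq k, \tag{4.15b} \]
\[ \frac{\partial Q_j}{\partial \alpha_j} = -Q_j B_1 - \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{(P_j Q_k^T - Q_j P_k^T) Q_k}{\alpha_j - \alpha_k}, \tag{4.15c} \]
\[ \frac{\partial P_j}{\partial \alpha_j} = P_j B_1 - \sum_{\substack{k=1 \\ k \neq j}}^{n} \frac{(P_j Q_k^T - Q_j P_k^T) P_k}{\alpha_j - \alpha_k}. \tag{4.15d} \]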
In particular, for p = s = 1, r = 2, these are just the equations of [JMMS], Theorem 7.5, determining the n-particle correlation functions for an impenetrable bosonic gas (or the spectral distribution function for random unitary matrices in the double scaling limit [TW1]).
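For this scalar case the Fredholm determinant itself is easy to approximate numerically. The sketch below assumes that the relevant kernel is the standard sine kernel sin(π(x − y))/(π(x − y)) on a single interval (0, s); this identification is not spelled out in the text above. It discretizes det(I − K) with Gauss–Legendre quadrature, which is a generic numerical device and not the method of this paper. The function name is hypothetical.

```python
import numpy as np

def sine_kernel_determinant(s, n_nodes=40):
    """Approximate det(I - K) for the sine kernel restricted to the interval (0, s).

    The integral operator is replaced by the n x n matrix
    sqrt(w_i) K(x_i, x_j) sqrt(w_j) built from Gauss-Legendre nodes and weights.
    """
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    x = 0.5 * s * (x + 1.0)   # map nodes from [-1, 1] to [0, s]
    w = 0.5 * s * w
    # np.sinc(t) = sin(pi t) / (pi t), with the value 1 at t = 0
    K = np.sinc(x[:, None] - x[None, :])
    sw = np.sqrt(w)
    return np.linalg.det(np.eye(n_nodes) - sw[:, None] * K * sw[None, :])

for s in (0.1, 0.5, 1.0):
    print(s, sine_kernel_determinant(s))
```

For small s the value is close to 1 − s, because the trace of the sine kernel over (0, s) equals s.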
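Similarly, the reduced form of (2.39a) and (2.39b) is given by
\[ \frac{\partial Q_j}{\partial \beta_a} = -\alpha_j Q_j E_a + Q_j \sum_{\substack{b=1 \\ b \neq a}}^{s} \frac{E_a N_{PQ} E_b + E_b N_{PQ} E_a}{\beta_a - \beta_b} - P_j \sum_{b=1}^{s} \frac{E_a N_{QQ} E_b + E_b N_{QQ} E_a}{\beta_a + \beta_b}, \tag{4.16a} \]
\[ \frac{\partial P_j}{\partial \beta_a} = \alpha_j P_j E_a - P_j \sum_{\substack{b=1 \\ b \neq a}}^{s} \frac{E_a N_{QP} E_b + E_b N_{QP} E_a}{\beta_a - \beta_b} + Q_j \sum_{b=1}^{s} \frac{E_a N_{PP} E_b + E_b N_{PP} E_a}{\beta_a + \beta_b}, \tag{4.16b} \]
for a = 1, . . . , s, j = 1, . . . , n, where
\[ N_{QQ} := \sum_{j=1}^{n} Q_j^T Q_j, \qquad N_{PP} := \sum_{j=1}^{n} P_j^T P_j, \qquad N_{QP} := \sum_{j=1}^{n} Q_j^T P_j, \qquad N_{PQ} := \sum_{j=1}^{n} P_j^T Q_j, \tag{4.17} \]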
and $E_a$ is the elementary s × s matrix with 1 in the aa position. Other discrete reductions of the generic Riemann–Hilbert data and the corresponding isomonodromic deformation families may similarly be derived along the lines indicated in [H].

4.2. Higher order poles. Using arguments similar to the Zakharov–Shabat dressing method (see e.g. [NZMP]), it is possible to generalize the above considerations to a broader class of isomonodromic deformation equations associated to matrix Riemann–Hilbert problems of a similar nature, by allowing higher order poles at one or more of the singular points. For simplicity, we restrict attention to the case where there is just one irregular singular point of arbitrary index located at ∞. Extension to the case of any number of irregular singular points is quite straightforward. Let $\Psi_0(\lambda, t) \in Gl(r)$ be an isomonodromic “vacuum” solution, satisfying
\[ \frac{\partial \Psi_0}{\partial \lambda} = U_0(\lambda, t) \Psi_0, \tag{4.18a} \]
\[ \frac{\partial \Psi_0}{\partial t_a} = V_a^0(\lambda, t) \Psi_0, \qquad a = 1, \dots, k, \tag{4.18b} \]
where $\{U_0(\lambda, t), V_a^0(\lambda, t)\}_{a=1,\dots,k}$ are r × r matrix valued polynomials in λ whose coefficients are functions of the k-component vector $t = (t_1, \dots, t_k)$. The compatibility conditions for (4.18a), (4.18b) imply that the generalized monodromy data, i.e. Stokes matrices associated with λ = ∞ (for definitions, see e.g. [JMU]), corresponding to the “vacuum” operator
\[ D_\lambda^0 := \frac{\partial}{\partial \lambda} - U_0(\lambda, t) \tag{4.19} \]
are invariant under the deformations induced by changes of the parameters $(t_1, \dots, t_k)$ satisfying these equations. For example, we could consider the simplest case, when the matrices $V_a^0(\lambda)$ are independent of the parameters $(t_1, \dots, t_k)$, and commute amongst themselves,
\[ \left[ V_a^0(\lambda), V_b^0(\lambda) \right] = 0, \qquad a, b = 1, \dots, k. \tag{4.20} \]
The vacuum wave function then has trivial Stokes matrices and can be normalized as
\[ \Psi_0(\lambda, t) = \exp\Bigl( \sum_{a=1}^{k} t_a V_a^0(\lambda) \Bigr), \tag{4.21} \]
so that its values form an abelian group. More general cases, in which the vacuum Stokes’ matrices are not necessarily trivial, include the subclass of integrable kernels considered in [TW2,TW3]. (For more on this relation see Remark 4.2 below.) As in Sect. 2, choose a set $\{f_j, g_j\}_{j=1,\dots,m}$ of fixed r × p rectangular matrices satisfying (2.2), and let $\theta_j(\lambda)$ denote the characteristic function along the curve segment $\Gamma_j$. Define
\[ f(\lambda, t) := \Psi_0(\lambda, t) \sum_{j=1}^{m} f_j \theta_j(\lambda), \tag{4.22a} \]
\[ g(\lambda, t) := \left( \Psi_0^T(\lambda, t) \right)^{-1} \sum_{j=1}^{m} g_j \theta_j(\lambda), \tag{4.22b} \]
as in (2.4a), (2.4b), but with the exponential vacuum wave function $\Psi_0(\lambda)$ replaced by the vacuum solution $\Psi_0(\lambda, t)$, evaluated along the curve segments $\Gamma_j$ (note that $\Psi_0(\lambda, t)$ is an entire function of λ). Now define $H_0(\lambda)$ again as in (2.9) and H(λ, t) as in (1.32), but with the functions f(λ), g(λ) of Eqs. (2.4a), (2.4b) replaced by f(λ, t), g(λ, t) and $\Psi_0(\lambda)$ replaced by $\Psi_0(\lambda, t)$. Note that Eq. (2.12) is still valid. Let χ(λ) again be a solution to the matrix Riemann–Hilbert problem as defined in (1.29)–(1.31), with the appropriate substitution of H(λ, t) for H(λ). Then, introducing as in Sects. 2.1 and 2.2 the “dressed” wave function (cf. (2.6)),
\[ \Psi(\lambda) := \chi(\lambda) \Psi_0(\lambda, t), \tag{4.23} \]
we can repeat the same arguments, based on Liouville’s theorem, as in Sect. 2. Indeed, by virtue of Eq. (2.12), which is still valid for the jump matrix H(λ, t), the function Ψ(λ) defined in (4.23) again satisfies the jump condition (2.8) across Γ. This in turn leads to the same local representation (2.21) and hence the analyticity of the logarithmic derivatives $\frac{\partial \Psi}{\partial \lambda} \Psi^{-1}$, $\frac{\partial \Psi}{\partial t_a} \Psi^{-1}$, and $\frac{\partial \Psi}{\partial \alpha_j} \Psi^{-1}$ on $\mathbb{C} \setminus \{\alpha_j\}$. Observe that
\[ \frac{\partial \Psi(\lambda)}{\partial \lambda} \Psi^{-1}(\lambda) = \chi(\lambda) U_0(\lambda, t) \chi^{-1}(\lambda) + \frac{\partial \chi(\lambda)}{\partial \lambda} \chi^{-1}(\lambda), \tag{4.24} \]
\[ \frac{\partial \Psi(\lambda)}{\partial t_a} \Psi^{-1}(\lambda) = \chi(\lambda) V_a^0(\lambda, t) \chi^{-1}(\lambda) + \frac{\partial \chi(\lambda)}{\partial t_a} \chi^{-1}(\lambda). \tag{4.25} \]
Taking into account these formulae and, once again, the local representation (2.21), we arrive at the equations
\[ \frac{\partial \Psi}{\partial \lambda} - U(\lambda, t) \Psi - \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j} \Psi = 0, \tag{4.26a} \]
\[ \frac{\partial \Psi}{\partial t_a} - V_a(\lambda, t) \Psi = 0, \qquad a = 1, \dots, k, \tag{4.26b} \]
\[ \frac{\partial \Psi}{\partial \alpha_j} + \frac{N_j}{\lambda - \alpha_j} \Psi = 0, \qquad j = 1, \dots, n, \tag{4.26c} \]
where, as before, $N_j$, $F_j$, $G_j$ are given by (2.14), (2.15), while
\[ U(\lambda, t) = \left( \chi(\lambda) U_0(\lambda, t) \chi^{-1}(\lambda) \right)_+, \tag{4.27a} \]
\[ V_a(\lambda, t) = \left( \chi(\lambda) V_a^0(\lambda, t) \chi^{-1}(\lambda) \right)_+, \qquad a = 1, \dots, k, \tag{4.27b} \]
where $(\ )_+$ denotes projection to the polynomial part in λ (cf. [NZMP]). It follows that the operators
\[ D_\lambda := \frac{\partial}{\partial \lambda} - U(\lambda, t) - \sum_{j=1}^{n} \frac{N_j}{\lambda - \alpha_j}, \tag{4.28a} \]
\[ D_{t_a} := \frac{\partial}{\partial t_a} - V_a(\lambda, t), \qquad a = 1, \dots, k, \tag{4.28b} \]
\[ D_{\alpha_j} := \frac{\partial}{\partial \alpha_j} + \frac{N_j}{\lambda - \alpha_j}, \qquad j = 1, \dots, n, \tag{4.28c} \]
all commute, and the generalized monodromy data for $D_\lambda$, which include the Stokes’ matrices at λ = ∞ (the same as for $D_\lambda^0$) and the monodromy matrices at $\alpha_j$, are invariant under the deformations induced by changes in the parameters $\{\alpha_j\}_{j=1,\dots,n}$, $\{t_a\}_{a=1,\dots,k}$. The τ function associated with these deformations may once again be shown to be the Fredholm determinant det(I − K) of the integral operator K defined as in (1.17), (1.18), with the appropriate substitutions for f(λ), g(λ). The coefficients $\{H_j\}_{j=1\dots n}$, $\{K_a\}_{a=1\dots k}$ of the differential form
\[ d \ln \det(I - K) = \sum_{i=1}^{n} H_i\, d\alpha_i + \sum_{a=1}^{k} K_a\, dt_a \tag{4.29} \]
may again be viewed as commuting nonautonomous Hamiltonians whose Hamiltonian equations determine the deformation conditions corresponding to the commutativity of the operators $D_\lambda$, $D_{t_a}$, $D_{\alpha_j}$. For the specific case (4.21), where the vacuum wave functions form an abelian group, the same interpretation may be given to the corresponding τ function as in Theorem 2.6. More generally, if we drop the requirement that (2.72) define an abelian group action on the Hilbert space Grassmannian $Gr_{H^r_+}(H^r)$, and simply view the maps determined by (2.72), (2.81a), (2.81b), (2.81c) as defining a set of integral curves of a compatible nonautonomous system generated by the solution $\Psi_0(\lambda, t)$ of the associated vacuum system, the same geometrical interpretation in terms of determinants of projection operators is still valid. The resulting differential operators $D_\lambda$, $\tilde{D}_z$ may also be expressed in terms of a Hamiltonian quotienting procedure through formulae similar to (1.10), (1.12b), and we may thereby again associate a dual system and corresponding dual integral operator $\tilde{K}$, as was done in Sect. 3. Details of this, as well as generalizations involving multiple higher order poles will be given elsewhere.

Remark 4.1. The linear differential equations (4.26a)–(4.26c), and hence the corresponding (nonlinear) isomonodromic deformation equations, are valid in the case of rational “vacuum” matrices $U_0(\lambda, t)$ and $V_a^0(\lambda, t)$ as well. One only has to re-define the symbol $(\ )_+$ in (4.27a) and (4.27b) as the sum of the relevant principal parts at the poles of $U_0(\lambda)$ and $V_a^0(\lambda)$ respectively. Indeed, if we assume, to avoid some technical issues, that the singularities of $U_0(\lambda)$ and $V_a^0(\lambda)$ do not lie on Γ then, by virtue of relations (4.24) and (4.25), it follows immediately that the possible multivaluedness of $\Psi_0(\lambda, t)$ will not appear in the logarithmic derivatives $\frac{\partial \Psi}{\partial \lambda} \Psi^{-1}$, $\frac{\partial \Psi}{\partial t_a} \Psi^{-1}$, and $\frac{\partial \Psi}{\partial \alpha_j} \Psi^{-1}$. Also, as before, these logarithmic derivatives have no jumps across the contour Γ. Hence we arrive again at the conclusion that $\frac{\partial \Psi}{\partial \lambda} \Psi^{-1}$, $\frac{\partial \Psi}{\partial t_a} \Psi^{-1}$, and $\frac{\partial \Psi}{\partial \alpha_j} \Psi^{-1}$ are rational functions. Equations (4.26a)–(4.26c) (with the modification of the symbol $(\ )_+$ indicated) then follow from (4.24), (4.25) and the local representation (2.21).

Remark 4.2. In the particular case when
\[ r = 2, \qquad p = 1, \qquad f_j = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad g_j = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \forall j, \tag{4.30} \]
and in the absence of the parameter t, the integrable kernels considered in this subsection were first introduced and studied in [TW2,TW3], where the systems of nonlinear PDEs describing the corresponding Fredholm determinants were derived. These PDEs were put into the context of integrable systems in [P]. Using a technique based on Cauchy– Riemann operators, it was shown in [P] that the Tracy–Widom differential equations are isomonodromic deformation equations of certain systems of linear ODEs with rational coefficients (in our notations – system (4.26a) with restrictions (4.30)), and the isomonodromic τ – function interpretation of the corresponding Fredholm determinants was obtained. For this case, the results of this subsection give a Riemann–Hilbert alternative to the approach of [P] and, as we have shown, they have an easy generalization to the general r × p case. It is also worth mentioning that the Riemann–Hilbert scheme presented here leads to the λ equation (4.26a) in a form that explicitly inherits the singularity structure of the “vacuum” system (4.18a), (4.18b) at λ = ∞. This fact greatly simplifies one of the basic technical questions in the Tracy–Widom theory; namely, identification of the nonlinear systems arising from kernels generated by classical special functions in the single interval case with the relevant Painlevé equations. In the Riemann–Hilbert approach presented here, Eqs. (4.18a), (4.18b) defining the “vacuum” determine the nature of the singularity at λ = ∞, which also has the same structure in the “dressed” system (4.26a). To determine which Painlevé transcendent is to be expected, one need only look at the known list of isomonodromic Lax pairs for the Painlevé equations, given, e.g., in [JMU]. More on this issue will appear in [BBIK]. Remark 4.3. 
Another approach to the derivation of integrable differential equations for a class of integrable kernels whose functions f(λ) and g(λ) have nontrivial monodromy properties was suggested recently in [BD]. The method used there also yields an easy Painlevé identification in the case of kernels determined by special functions. Similar to our technique, the approach of [BD] is based on the Riemann–Hilbert formalism of
[IIKS], but it is different from the scheme developed here in one important aspect; it does not make use of the “vacuum” linear system. Remark 4.4. Another class of “dual” isomonodromic deformation systems involving integrable Fredholm kernels, with applications to multi-matrix models, was studied in [BEH1, BEH2]. Although the isomonodromic deformations considered there only concern the “vacuum” systems, they have the features that: (i) the ranks r and n of the corresponding vector bundles may be arbitrarily large, (ii) there is an irregular singularity of arbitrary Poincaré index at λ = ∞ and z = ∞ (indeed, the form that duality takes there is to interchange the order of singularity with the rank) and (iii) there are non-trivial Stokes matrices present in the “vacuum” solutions. The associated Riemann–Hilbert problems and their solution will be dealt with in [BEH3]. Acknowledgements. The authors would like to thank M. Bertola for helpful remarks that led to a clearer expression of the Riemann–Hilbert problem in Sect. 1, and A. Borodin and P. Deift for letting us know of their work [BD] before it was published. This research was supported in part by the Natural Sciences and Engineering Research Council of Canada, the Fonds FCAR du Québec and the National Science Foundation, grant No. DMS-9801608.
References
[BEH1] Bertola, M., Eynard, B., Harnad, J.: Duality, Biorthogonal Polynomials and Multi-Matrix Models. Preprint CRM-2749 (2001), nlin.SI/0108049
[BEH2] Bertola, M., Eynard, B., Harnad, J.: Duality of spectral curves arising in two-matrix models. Preprint CRM-2828 (2001), nlin.SI/0112006
[BEH3] Bertola, M., Eynard, B., Harnad, J.: Formal asymptotics of dual isomonodromic families and tau functions associated to two-matrix models. In preparation, 2001
[BBIK] Bleher, P., Bolibruch, A.A., Its, A.R. and Kapaev, A.A: Linearization of the P34 Equation of Painlevé Gambier. In preparation, 2001
[BD] Borodin, A., Deift, P.A.: The Fredholm Determinant of a Class of Integrable Operators are Jimbo–Miwa–Ueno tau-functions. In preparation, 2001
[DIZ] Deift, P.A., Its, A.R., and Zhou, X.: A Riemann–Hilbert Approach to Asymptotic Problems Arising in the Theory of Random Matrix Models, And Also in the Theory of Integrable Statistical Mechanics. Ann. of Math. 146, 149–235 (1997)
[H] Harnad, J.: Dual Isomonodromic Deformations and Moment Maps to Loop Algebras. Commun. Math. Phys. 166, 337–365 (1994)
[HI] Harnad, J. and Its, A.R.: Integrable Fredholm Operators and Dual Isomonodromic Deformations. Preprint Centre de recherches mathématiques, CRM–2477, Montreal (1997)
[HTW] Harnad, J., Tracy, C.A., Widom, H.: Hamiltonian Structure of Equations Appearing in Random Matrices. In: Low Dimensional Topology and Quantum Field Theory, ed. H. Osborn, New York: Plenum, 1993, pp. 231–245
[IIKS] Its, A.R., Izergin, A.G., Korepin, V.E., Slavnov, N.A.: Differential Equations for Quantum Correlation Functions. Int. J. Mod. Phys. B4, 1003–1037 (1990)
[IIKV] Its, A.R., Izergin, A.G., Korepin, V.E., Varzugin, G.G.: Large Time and Distance Asymptotics of Field Correlation Function of Impenetrable Bosons at Finite Temperature. Physica 54D, 351–395 (1992)
[JMMS] Jimbo, M., Miwa, T., Môri, Y., Sato, M.: Density Matrix of an Impenetrable Bose Gas and the Fifth Painlevé Transcendent. Physica 1D, 80–158 (1980)
[JMU] Jimbo, M., Miwa, T., Ueno, K.: Monodromy Preserving Deformation of Linear Ordinary Differential Equations with Rational Coefficients I. Physica 2D, 306–352 (1981)
[KBI] Korepin, V.E., Bogolyubov, N.M., and Izergin, A.G.: Quantum Inverse Scattering Method and Correlation Functions. Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 1993
[M] Mehta, M.L.: Random Matrices. 2nd edition, San Diego: Academic, 1991
[NZMP] Novikov, S.P., Zakharov, V.E., Manakov, S.V., Pitaevski, L.V.: Soliton Theory: The Inverse Scattering Method. Plenum, New York, 1984
[P] Palmer, J.: Deformation Analysis of matrix models. Physica D 78, 166–185 (1995)
[Sa] Sato, M.: Soliton equations as dynamical system on infinite dimensional Grassmann manifolds. RIMS Kokyuroku 439, 30 (1981)
[SW] Segal, G., Wilson, G.: Loop Groups and Equations of KdV Type. Publications Math., I.H.E.S. 61, 5–65 (1985)
[TW1] Tracy, C.A., Widom, H.: Introduction to Random Matrices. In: Geometric and Quantum Methods in Integrable Systems. Springer Lecture Notes in Physics 424, ed. G. F. Helminck. New York–Heidelberg: Springer–Verlag, 1993, pp. 103–130
[TW2] Tracy, C.A., and Widom, H.: Fredholm determinants, differential equations and matrix models. Commun. Math. Phys. 163, 33–72 (1994)
[TW3] Tracy, C.A., and Widom, H.: Systems of partial differential equations for operator determinants. Oper. Th.: Adv. Appl. 78, 381–388 (1995)
[W] Wilson, G.: Habillage et Fonction τ. C.R. Acad. Sc., Paris, 299, 587–590 (1984)
[WMTB] Wu, T.T., McCoy, B.M., Tracy, C.A., and Barouch, E.: Spin-Spin Correlation Functions for the Two-Dimensional Ising Model: Exact Theory in the Scaling Region. Phys. Rev. B13, 316–374 (1976)
Communicated by L. Takhtajan
Commun. Math. Phys. 226, 531 – 558 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Transience, Recurrence and Critical Behavior for Long-Range Percolation Noam Berger, Department of Statistics, The University of California, Berkeley, CA 94720-3860, USA. E-mail: [email protected] Received: 27 October 2000 / Accepted: 29 November 2001
Abstract: We study the behavior of the random walk on the infinite cluster of independent long-range percolation in dimensions d = 1, 2, where x and y are connected with probability $\sim \beta / \|x - y\|^{s}$. We show that if d < s < 2d, then the walk is transient, and if s ≥ 2d, then the walk is recurrent. The proof of transience is based on a renormalization argument. As a corollary of this renormalization argument, we get that for every dimension d ≥ 1, if d < s < 2d, then there is no infinite cluster at criticality. This result is extended to the free random cluster model. A second corollary is that when d ≥ 2 and d < s < 2d we can erase all long enough bonds and still have an infinite cluster. The proof of recurrence in two dimensions is based on general stability results for recurrence in random electrical networks. In particular, we show that i.i.d. conductances on a recurrent graph of bounded degree yield a recurrent electrical network.

1. Introduction

1.1. Background. Long-range percolation (introduced by Schulman in 1983 [19]) is a percolation model on the integer lattice $\mathbb{Z}^d$ in which every two vertices can be connected by a bond. The probability of the bond between two vertices to be open depends on the distance between the vertices. The models that were studied the most are models in which the probability of a bond to be open decays polynomially with its length.

1.2. The model – definitions and known results. Let $\{P_k\}_{k \in \mathbb{Z}^d}$ be s.t. $0 \le P_k = P_{-k} < 1$ for every $k \in \mathbb{Z}^d$. We consider the following percolation model on $\mathbb{Z}^d$: for every u and v in $\mathbb{Z}^d$, the bond connecting u and v is open with probability $P_{u-v}$. The different bonds are independent of each other.

Research partially supported by NSF grant #DMS-9803597 and by a US-Israel BSF grant.
Part of the research was done while the author was at the Hebrew University of Jerusalem.
Definition 1.1. For a function $f : \mathbb{Z}^d \to \mathbb{R}$, we say that $\{P_k\}$ is asymptotic to f if
$$\lim_{\|k\|\to\infty} \frac{P_k}{f(k)} = 1.$$
We denote it by $P_k \sim f(k)$. Since the model is shift invariant and ergodic, the event that an infinite cluster exists is a zero–one event. We say that $\{P_k\}$ is percolating if a.s. there exists an infinite cluster. We consider systems for which $P_k \sim \beta\|k\|_1^{-s}$ for certain s and β. The following facts are trivial.
• If s ≤ d, then $\sum_k P_k = \infty$. Therefore, by the Borel–Cantelli Lemma, every vertex is connected to infinitely many other vertices. Thus, there exists an infinite cluster.
• If $\sum_k P_k \le 1$ then by domination by a (sub)-critical Galton–Watson tree there is no infinite cluster. Therefore, for every s > d and β one can find a set $\{P_k\}$ s.t. $P_k \sim \beta\|k\|_1^{-s}$ and s.t. there is no infinite cluster.

In [19], Schulman proved that if d = 1 and s > 2, then there is no infinite cluster. Newman and Schulman ([16]) and Aizenman and Newman ([3]) proved, among other results, the following:

Theorem 1.2. (A) If d = 1, 1 < s < 2, and $P_k \sim \beta|k|^{-s}$ for some β > 0, then there exists a $\{P'_k\}$ s.t. $P'_k = P_k$ for every k ≥ 2, $P'_1 < 1$ and $\{P'_k\}$ is percolating. I.e., if 1 < s < 2 then by increasing $P_1$ one can make the system percolating. (B) If d = 1, s = 2, β > 1, and $P_k \sim \beta|k|^{-s}$, then there exists a $\{P'_k\}$ s.t. $P'_k = P_k$ for every k ≥ 2, $P'_1 < 1$ and $\{P'_k\}$ is percolating. (C) If d = 1, s = 2, β ≤ 1, and $P_k \sim \beta|k|^{-s}$ then $\{P_k\}$ is not percolating.

These results show the existence of a phase transition for d = 1, 1 < s < 2 and β > 0, and for d = 1, s = 2 and β > 1. When considering $\mathbb{Z}^d$ for d > 1, the picture is simpler. The following fact is a trivial implication of the existence of infinite clusters for nearest-neighbor percolation:
• If d > 1, s > d and $P_k \sim \beta\|k\|_1^{-s}$ for some β > 0, then there exists a percolating $\{P'_k\}$ s.t. $P'_k = P_k$ for every $\|k\|_1 \ge 2$ and $P'_k < 1$ for every k whose norm is 1.

If d > 1, then for any s > d and β > 0 we may obtain a transition between the phases of existence and non-existence of an infinite cluster by only changing $\{P_k \mid k \in A\}$ for a finite set A.
In [8], Gandolfi, Keane and Newman proved a general uniqueness theorem. A special case of it is the following theorem:

Theorem 1.3. If $\{P_k\}_{k\in\mathbb{Z}^d}$ is percolating and for every $k \in \mathbb{Z}^d$ there exist n and $k_1, \ldots, k_n$ s.t. $k = k_1 + k_2 + \cdots + k_n$ and $P_{k_i} > 0$ for all 1 ≤ i ≤ n, then a.s. the infinite cluster is unique. In particular, if $\{P_k\}_{k\in\mathbb{Z}^d}$ is percolating and $P_k \sim \beta\|k\|_1^{-s}$ for some s and β > 0, then a.s. the infinite cluster is unique.
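The model of Sect. 1.2 can be explored numerically on a finite window. The following is a minimal sketch, not part of the paper: it assumes d = 1, the specific family $P_k = 1 - \exp(-\beta|k|^{-s})$ (which is asymptotic to $\beta|k|^{-s}$), and a finite segment {0, ..., n − 1}, so it says nothing rigorous about the infinite cluster. The function names are hypothetical.

```python
import math
import random


def connection_probability(k, beta, s):
    """P_k = 1 - exp(-beta * |k|^(-s)); asymptotic to beta * |k|^(-s) as |k| grows."""
    return 1.0 - math.exp(-beta * abs(k) ** (-s))


def sample_open_bonds(n, beta, s, rng):
    """Sample independent open bonds between all pairs of sites in {0, ..., n-1}."""
    bonds = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < connection_probability(v - u, beta, s):
                bonds.append((u, v))
    return bonds


def largest_cluster_size(n, bonds):
    """Size of the largest connected component, computed with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in bonds:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    bonds = sample_open_bonds(200, beta=1.0, s=1.5, rng=rng)
    print(largest_cluster_size(200, bonds))
```

Varying `beta` and `s` in such a finite sample gives only a rough impression of the regimes discussed above; finite-size effects are strong for long-range models.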
1.3. Goals. Random walks on percolation clusters have been studied intensively in recent years. In [10], Grimmett, Kesten and Zhang showed that a supercritical percolation in Zd is transient for all d ≥ 3. See also [6], [11] and [4]. The problem discussed in this paper, suggested by Itai Benjamini, was to determine when a random walk on the long-range percolation cluster is transient. In [9], Jespersen and Blumen worked on a model which is quite similar to the long-range percolation on Z, and they predict that when s < 2 the random walk is transient, and when s = 2 it is recurrent.
1.4. Behavior of the random walk. The main theorem proved here is:

Theorem 1.4. (I) Consider long-range percolation on $\mathbb{Z}$ with parameters $P_k \sim \beta|k|^{-s}$ such that a.s. there is an infinite cluster. If 1 < s < 2, then the infinite cluster is transient. If s = 2, then the infinite cluster is recurrent. (II) Let $\{P_k\}_{k\in\mathbb{Z}^2}$ be percolating for $\mathbb{Z}^2$ such that $P_k \sim \beta\|k\|_1^{-s}$. If 2 < s < 4, then the infinite cluster is transient. If s ≥ 4, then the infinite cluster is recurrent.

In Sect. 2, we prove the transience for the one-dimensional case where 1 < s < 2 and for the two-dimensional case where 2 < s < 4. Actually, we prove more: we show that for every q > 1 there is a flow on the infinite cluster with finite q-energy, where the q-energy of a flow f is defined as
$$E_q(f) = \sum_e f(e)^q. \tag{1}$$
It is well known that finite 2-energy is equivalent to transience of the random walk (see e.g. [18], Sect. 9), so the existence of such flows is indeed a generalization of the transience result (see also [14], [13] and [4]). In Sect. 3 we prove the recurrence for the one-dimensional case with s = 2 and for the two-dimensional case with s ≥ 4.
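As an illustration of definition (1), the following sketch computes the q-energy of a flow stored as a dictionary from edges to flow values. It is not taken from the paper: the helper names are hypothetical, absolute values are used so that non-integer q is well defined, and the test flow is the standard equal-splitting unit flow on a finite binary tree, for which level n has $2^n$ edges each carrying $2^{-n}$.

```python
def q_energy(flow, q):
    """E_q(f) = sum over edges e of |f(e)|^q, for a flow given as {edge: value}."""
    return sum(abs(value) ** q for value in flow.values())


def binary_tree_flow(depth):
    """Unit flow from the root of a binary tree that splits equally at every vertex.

    Edges are labelled (level, index); each of the 2^level edges at a given
    level carries flow 2^(-level).
    """
    flow = {}
    for level in range(1, depth + 1):
        for index in range(2 ** level):
            flow[(level, index)] = 2.0 ** (-level)
    return flow


if __name__ == "__main__":
    f = binary_tree_flow(12)
    # The level-n contribution is 2^n * 2^(-n q) = 2^(-n (q - 1)),
    # so the partial sums stay bounded exactly when q > 1.
    for q in (1.0, 1.5, 2.0):
        print(q, q_energy(f, q))
```

On this tree the q-energy of the equal-splitting flow is a geometric series with ratio $2^{-(q-1)}$, which converges for every q > 1 and diverges for q = 1.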
1.5. Critical behavior. As a corollary of the main renormalization lemma, we prove the following theorem, which applies to every dimension:

Theorem 1.5. Let d ≥ 1 and let $\{P_k\}_{k\in\mathbb{Z}^d}$ be probabilities such that $P_k \sim \beta\|k\|_1^{-s}$. Assume that d < s < 2d. Then, if $\{P_k\}$ is percolating then it is not critical, i.e. there exists an ε > 0 such that $\{P'_k = (1 - \varepsilon)P_k\}$ is also percolating.

In [12], Hara and Slade proved, among other results, that for dimension d ≥ 6 and an exponential decay of the probabilities, there is no infinite cluster at criticality. It is of interest to compare Theorem 1.5 with the results of Aizenman and Newman ([3]), which show that for d = 1 and s = 2, a.s. there exists an infinite cluster at criticality. In [1], Aizenman, Chayes, Chayes and Newman showed the same result for the Ising model: they showed that if s = 2, then at the critical temperature there is a non-zero magnetization. The technique that is used to prove Theorem 1.5 is used in Sect. 5 to prove the analogous result for the infinite volume limit of the free random cluster model, and to get:
Theorem 1.6. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be nonnegative numbers such that $P_k \sim \|k\|_1^{-s}$ (d < s < 2d) and let β > 0. Consider the infinite volume limit of the free random cluster model with probabilities $1 - e^{-\beta P_k}$ and with q ≥ 1 states. Then, at the critical inverse temperature $\beta_c = \inf(\beta \mid \text{a.s. there exists an infinite cluster})$ there is no infinite cluster.

However, this technique fails to prove this result for the wired measure, so in the wired case the question is still open. A partial answer for the case $s \le \frac{3}{2}d$ is given by Aizenman and Fernández in [2]. Consider the Ising model with $s \le \frac{3}{2}d$ when the interactions obey the reflection positivity condition (which is defined there). Denote by M(β) the magnetization at inverse temperature β. Consider the critical exponent $\hat\beta$ such that
$$M(\beta) \sim |\beta - \beta_c|^{\hat\beta}$$
for β near the critical value $\beta_c$. They proved that (under the above assumptions) $\hat\beta$ (as well as other critical exponents) exists and they showed that $\hat\beta = \frac{1}{2}$. A corollary of Theorem 1.6 is

Corollary 1.7. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be nonnegative numbers s.t. $P_k = P_{-k}$ for every k and s.t. $P_k \sim \|k\|_1^{-s}$ (d < s < 2d). Consider the Potts model with q states on $\mathbb{Z}^d$, s.t. the interaction between v and u is $P_{v-u}$. At the critical temperature, the free measure is extremal.

Another consequence of the renormalization lemma is the following:

Theorem 1.8. Let d > 1 and let $\{P_k\}_{k\in\mathbb{Z}^d}$ be probabilities s.t. $P_k \sim \beta\|k\|_1^{-s}$ for some s < 2d. Assume that the independent percolation model with $\{P_k\}$ has a.s. an infinite cluster. Then there exists N s.t. the independent percolation model with probabilities
$$P'_k = \begin{cases} P_k & \|k\|_1 < N \\ 0 & \|k\|_1 \ge N \end{cases}$$
also has, a.s., an infinite cluster.

In [15], Meester and Steif prove the analogous result for supercritical arrays of exponentially decaying probabilities. It is still unknown whether the same statement is true for probabilities that decay faster than $\|k\|_1^{-s}$ (s < 2d) and slower than exponentially.

1.6. Random electrical networks. The proof of recurrence for the two-dimensional case involves some calculations on random electrical networks. In Sect. 4 we study such networks, and prove stability results for their recurrence. One of our goals in that section is:

Theorem 1.9. Let G be a recurrent graph with bounded degree. Assign i.i.d. conductances on the edges of G. Then, a.s., the resulting electrical network is recurrent.
In [17] Pemantle and Peres studied the analogous question for the transient case, i.e. under what conditions i.i.d. conductances on a transient graph would preserve its transience. They proved that it occurs if and only if there exists p < 1 s.t. an infinite cluster for (nearest-neighbor) percolation with parameter p is transient. Comparing the results indicates that recurrence is more stable than transience for this type of perturbation. Section 4 is self-contained, i.e. it does not use any of the results proved in other sections.
2. The Transience Proof

In this section we give the proof that the d-dimensional long-range percolation cluster, with d < s < 2d, is a transient graph. Our methods use the idea of iterated renormalization for long-range percolation that was introduced in [16], where it was used in order to prove the following theorem:

Theorem 2.1 (Newman and Schulman, 1986). Let 1 < s < 2 be fixed, and consider an independent one-dimensional percolation model such that the bond between i and j is open with probability $P_{i-j} = \eta_s(\beta, |i - j|)$, where
$$\eta_s(\beta, k) = 1 - \exp(-\beta|k|^{-s}), \tag{2}$$
and each vertex is alive with probability λ ≤ 1. Then for λ sufficiently close to 1 and β large enough, there exists, a.s., an infinite cluster.

In order to prove our results, we need the following definition and the following two renormalization lemmas:

Definition 2.2. We say that the cubes $C_1 = v_1 + [0, N - 1]^d$ and $C_2 = v_2 + [0, N - 1]^d$ are k cubes away from each other if $\|v_1 - v_2\|_1 = Nk$. We will always use the notion of two cubes being k cubes away from each other for pairs of cubes of the same size that are aligned on the same grid.

Lemma 2.3. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be such that $P_k = P_{-k}$ for every k, and such that there exists d < s < 2d s.t.
$$\liminf_{\|k\|_1\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0. \tag{3}$$
Assume that the percolation model on $\mathbb{Z}^d$ with probabilities $\{P_k\}$ has, a.s., an infinite cluster. Then, for every ε > 0 and ρ there exists N such that with probability bigger than 1 − ε, inside the cube $[0, N - 1]^d$ there exists an open cluster that contains at least $\rho N^{s/2}$ vertices.

Lemma 2.3 shows that most of the cubes contain big clusters. We also want to estimate the probability that the clusters in two different cubes are connected to each other.
Lemma 2.4. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be such that $P_k = P_{-k}$ for every k, and such that there exists d < s < 2d s.t.
$$\liminf_{\|k\|_1\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0.$$
Let $k_0$ be s.t. if $\|k\|_1 > k_0$ then $P_k > 0$, and let
$$\gamma = \inf_{\|k\|_1 > k_0} \frac{-\log(1 - P_k)}{\|k\|_1^{-s}} > 0.$$
Let $\rho > 2(2k_0)^d$, and let N and l be integers. Let $C_1$ and $C_2$ be cubes of side-length N, which are l cubes away from each other. Assume further that $C_1$ and $C_2$ contain clusters $U_1$ and $U_2$, each of size $\rho N^{s/2}$. Then, the probability that there is an open bond between a vertex in $U_1$ and a vertex in $U_2$ is at least $\eta_s(\zeta\gamma\rho^2, l)$ for $\zeta = 2^{-s-1}d^{-s}$.

Proof of Lemma 2.3. Notice that by Theorem 1.3 ([8]) there is a unique infinite cluster. Choose $C_n = n^a$ and $D_n = n^{-b}$, where a > b > 1, and 2b < a(2d − s). Choose ε′ s.t.
$$2\varepsilon' \prod_{k=1}^{\infty} (1 + 3D_k) < \varepsilon. \tag{4}$$
Such an ε′ exists because the product in (4) converges. By (3), there exists $k_0$ s.t. for every $\|k\|_1 > k_0$, we have $P_k > 0$. Let
$$\lambda = \inf_{\|k\|_1 > k_0} \frac{-\log(1 - P_k)}{\|k\|_1^{-s}}.$$
Notice that since
$$\lim_{x\to 0} \frac{-\log(1 - x)}{x} = 1,$$
we get that λ > 0. By the choice of λ, for every k s.t. $\|k\|_1 > k_0$ we have that $P_k \ge \eta_s(\lambda, k)$. Denote by α the density of the infinite percolation cluster. Let $M > 2k_0/\alpha$ be s.t. with probability bigger than 1 − ε′ at least $\frac{1}{2}\alpha M^d$ of the vertices in $[0, M - 1]^d$ are in the infinite cluster. The existence of this M follows from the (d-dimensional) ergodic theorem. The infinite cluster is unique, so all of the percolating vertices in $[0, M - 1]^d$ will be connected to each other within some big cube containing $[0, M - 1]^d$. Let K be such that they are all connected inside $[-K, M + K - 1]^d$ with probability > 1 − ε′. We call a cube
$$C = \prod_{i=1}^{d} [l_i M, (l_i + 1)M - 1] \qquad (l_i \in \mathbb{Z}\ \forall i)$$
alive if there are at least $\frac{1}{2}\alpha M^d$ vertices in C that are all in the same connected component in
$$C_K = \prod_{i=1}^{d} [l_i M - K, (l_i + 1)M + K - 1].$$
By the choice of M and K, a cube of side-length M is alive with probability at least 1 − 2ε′. For every living cube, choose a semi-cluster (by semi-cluster we mean a set of vertices in the cube that is contained in a connected subset of the K-enlargement of the cube) of size at least $\frac{1}{2}\alpha M^d$ inside it. We say that two cubes $C_1$ and $C_2$ are attached to each other if there exists an open bond between the semi-cluster in $C_1$ and the semi-cluster in $C_2$. If the cubes $C_1$ and $C_2$ are alive and are k cubes away from each other, then the probability that they are connected is at least $\eta_s(\gamma M^{2d-s}, k)$ for $\gamma = \frac{1}{4}\alpha^2\lambda(2d)^{-s}$. This is true because there are at least $\frac{1}{4}\alpha^2 M^{2d}$ pairs of vertices $(v_1, v_2)$ from the semi-clusters of $C_1$ and $C_2$ respectively s.t. $\|v_1 - v_2\|_1 > k_0$. For these vertices, $\|v_1 - v_2\|_1 < 2dkM$. So, the probability that there is no edge between $v_1$ and $v_2$ is bounded by $1 - \eta_s(\lambda, 2dkM) = \exp(-\lambda(2dkM)^{-s})$. So, the probability that there is no edge between the semi-cluster in $C_1$ and the one in $C_2$ is no more than
$$\left[\exp\left(-\lambda(2dkM)^{-s}\right)\right]^{\frac{1}{4}\alpha^2 M^{2d}} = \exp\left(-\tfrac{1}{4}\alpha^2 M^{2d}\lambda(2dkM)^{-s}\right) = \exp\left(-\tfrac{1}{4}\alpha^2 (2d)^{-s}\lambda M^{2d-s} k^{-s}\right) = 1 - \eta_s(\gamma M^{2d-s}, k).$$
Choose some large number β. Take M and K s.t. $\gamma M^{2d-s} > \beta$ and s.t. the probability of a cube to be alive is more than 1 − 2ε′. The probability that two cubes that are k cubes away from each other are attached is at least $\eta_s(\beta, k)$. Let R be a number such that
$$(MR + 2K)^d < 2(MR)^d. \tag{5}$$
We want to renormalize to cubes of side length N = RM + 2K. We cannot apply the renormalization argument from [16], because the events that two (close enough) cubes are alive are dependent. Thus, we use a different technique of renormalization: Consider the M-sided cubes as first stage vertices. Then, take cubes of side-length $C_1$ of first stage vertices, and consider them as second stage vertices. Now, take cubes of side-length $C_2$ of second stage vertices and consider them as third stage vertices. Keep on taking cubes of side length $C_n$ of n-stage vertices and consider them as (n + 1)-stage vertices. Choose R to be
$$R = \prod_{n=1}^{L} C_n$$
for L large enough for (5) to hold.
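The two computations above can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it evaluates $\eta_s$ from (2), compares the product of the single-pair bounds over $\frac{1}{4}\alpha^2 M^{2d}$ pairs with $1 - \eta_s(\gamma M^{2d-s}, k)$, and searches for the smallest L such that $R = \prod_{n \le L} n^a$ satisfies (5). All function names and the sample parameter values are hypothetical.

```python
import math


def eta(beta, k, s):
    """eta_s(beta, k) = 1 - exp(-beta * |k|^(-s)), as in Eq. (2)."""
    return 1.0 - math.exp(-beta * abs(k) ** (-s))


def gamma_constant(alpha, lam, d, s):
    """gamma = (1/4) * alpha^2 * lambda * (2d)^(-s)."""
    return 0.25 * alpha ** 2 * lam * (2 * d) ** (-s)


def no_bond_probability(alpha, lam, d, s, M, k):
    """Bound on the probability that none of (1/4) alpha^2 M^(2d) pairs is joined.

    Each pair is at distance below 2dkM, so a single pair is unconnected with
    probability at most exp(-lambda * (2dkM)^(-s)).
    """
    pairs = 0.25 * alpha ** 2 * M ** (2 * d)
    single = math.exp(-lam * (2 * d * k * M) ** (-s))
    return single ** pairs


def smallest_L(M, K, d, a):
    """Smallest L such that R = prod_{n=1}^{L} n^a satisfies (MR + 2K)^d < 2 (MR)^d."""
    L, R = 0, 1
    while (M * R + 2 * K) ** d >= 2 * (M * R) ** d:
        L += 1
        R *= L ** a
    return L, R


if __name__ == "__main__":
    alpha, lam, d, s, M, k = 0.5, 1.0, 1, 1.5, 100, 3
    lhs = no_bond_probability(alpha, lam, d, s, M, k)
    rhs = 1.0 - eta(gamma_constant(alpha, lam, d, s) * M ** (2 * d - s), k, s)
    print(lhs, rhs)                        # the two expressions agree up to rounding
    print(smallest_L(M=10, K=100, d=1, a=2))
```

The search in `smallest_L` terminates because R grows without bound while K is fixed, which is exactly why a suitable L exists in the proof.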
We already have a notion of a first stage vertex being alive. Define inductively that an n-stage vertex is alive if at least $D_n (C_n)^d$ of the (n − 1)-stage vertices in it are alive, and every two of those vertices are attached to each other, i.e. there is an open bond between the big clusters in these (n − 1)-stage vertices. Denote by $\lambda_n$ the probability that an n-stage vertex is not alive. We want to bound $\lambda_n$: Denote by $\varphi_n$ the probability that there aren't enough living (n − 1)-stage vertices inside our n-stage vertex, and by $\psi_n$ the probability that not every two of them are attached to each other. Then, $\lambda_n \le \varphi_n + \psi_n$. Given $\lambda_{n-1}$, the expected number of dead (n − 1)-stage vertices in an n-stage vertex is $\lambda_{n-1} C_n^d$. Therefore, by the Markov inequality,
$$\varphi_n \le \frac{\lambda_{n-1}}{1 - D_n}.$$
Every living (n − 1)-stage vertex includes at least
$$V_n = \prod_{k=1}^{n-1} (C_k)^d D_k = ((n - 1)!)^{da-b}$$
living first-stage vertices inside its connected component. The distance between those first-stage vertices cannot exceed
$$U_n = d\prod_{k=1}^{n} C_k = d(n!)^a.$$
Therefore,
$$\psi_n \le \binom{(C_n)^d}{2} \left(1 - \eta_s(\beta, U_n)\right)^{V_n^2} \le C_n^{2d}\, e^{-\beta U_n^{-s} V_n^2},$$
i.e.
$$\psi_n \le n^{2da}\, e^{-\beta\left(d^{-s}(n!)^{-as}\cdot((n-1)!)^{2(da-b)}\right)} = n^{2ad}\cdot e^{-\beta\left(d^{-s} n^{-as}\cdot((n-1)!)^{a(2d-s)-2b}\right)}.$$
(Notice that the event that the connecting edges exist might depend on the existence of enough living vertices. However, in this case, the FKG inequality works in our favor.) This shows that $\psi_n$ decays faster than exponentially, and therefore, since we control β and can make it as large as we like, we can achieve $\psi_n < \varepsilon' D_n$ for every n. By the choice of M and K, and by the definition of $\lambda_1$, we see that $\lambda_1 < 2\varepsilon'$. In addition, for every n,
$$\lambda_n \le \psi_n + \varphi_n \le \varepsilon' D_n + \frac{\lambda_{n-1}}{1 - D_n} \le \varepsilon' D_n + \lambda_{n-1}(1 + 2D_n) \le (1 + 3D_n)\max(\lambda_{n-1}, \varepsilon').$$
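Therefore, by induction, we get that for every n
$$\lambda_n \le 2\varepsilon' \prod_{k=1}^{n} (1 + 3D_k),$$
and so, for all n, $\lambda_n \le \varepsilon_2$ where
$$\varepsilon_2 = 2\varepsilon' \prod_{k=1}^{\infty} (1 + 3D_k) < \infty.$$
So, with probability at least $1 - \varepsilon_2 > 1 - \varepsilon$ (by (4)), we have a cluster of size
$$\prod_{n=1}^{L} D_n (C_n)^d = \prod_{n=1}^{L} n^{da-b} = \prod_{n=1}^{L} C_n^{\frac{da-b}{a}} = R^{\frac{da-b}{a}}.$$
This is larger than $2\rho R^{s/2}$, if L is large enough, because
$$\frac{da - b}{a} > \frac{s}{2}.$$
So, by (5), the lemma is proved for N = RM + 2K.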
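The choice of ε′ in (4) relies only on the convergence of $\prod_k (1 + 3D_k)$ for $D_k = k^{-b}$ with b > 1. The following sketch, which is an illustration and not part of the paper, bounds the infinite product by a partial product times $\exp(3 n^{1-b}/(b-1))$ (using $1 + x \le e^x$ and an integral comparison for the tail) and then picks one admissible ε′. The function names and the safety factor 4 are assumptions made for the example.

```python
import math


def partial_product(b, n_terms):
    """prod_{k=1}^{n_terms} (1 + 3 k^(-b))."""
    prod = 1.0
    for k in range(1, n_terms + 1):
        prod *= 1.0 + 3.0 * k ** (-b)
    return prod


def product_upper_bound(b, n_terms):
    """Upper bound for the infinite product, valid for b > 1.

    The tail prod_{k > n} (1 + 3 k^(-b)) is at most
    exp(3 * sum_{k > n} k^(-b)) <= exp(3 * n^(1 - b) / (b - 1)).
    """
    tail = 3.0 * n_terms ** (1.0 - b) / (b - 1.0)
    return partial_product(b, n_terms) * math.exp(tail)


def choose_eps_prime(eps, b, n_terms=10_000):
    """One admissible eps' with 2 * eps' * prod_k (1 + 3 D_k) < eps."""
    return eps / (4.0 * product_upper_bound(b, n_terms))


if __name__ == "__main__":
    b, eps = 2.0, 0.1
    print(partial_product(b, 100), product_upper_bound(b, 100))
    print(choose_eps_prime(eps, b))
```

Any smaller positive value of ε′ works equally well; the proof only needs that some such value exists.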
Proof of Lemma 2.4. The argument repeats a calculation from the proof of Lemma 2.3: There are $\rho^2 N^s$ pairs of vertices $(v_1, v_2)$ s.t. $v_1 \in U_1$ and $v_2 \in U_2$. For every $v_1 \in U_1$ there are at most $(2k_0)^d < \frac{1}{2}\rho N^{s/2}$ vertices at distance smaller than or equal to $k_0$ from $v_1$. So, at least half of the pairs $(v_1, v_2)$ satisfy $\|v_1 - v_2\|_1 > k_0$. All of the pairs satisfy $\|v_1 - v_2\|_1 \le 2ldN$. For a given pair $(v_1, v_2)$ s.t. $\|v_1 - v_2\|_1 > k_0$, the probability that there is no edge between $v_1$ and $v_2$ is bounded by $1 - \eta_s(\gamma, 2ldN)$. So, the probability that there is no edge between $U_1$ and $U_2$ is bounded by
$$\left(1 - \eta_s(\gamma, 2ldN)\right)^{\frac{1}{2}\rho^2 N^s} = \left[\exp\left(-\gamma(2ldN)^{-s}\right)\right]^{\frac{1}{2}\rho^2 N^s} = \exp\left(-\gamma(2ldN)^{-s}\cdot\tfrac{1}{2}\rho^2 N^s\right) = \exp\left(-\tfrac{1}{2}(2d)^{-s}\gamma\rho^2 l^{-s}\right) = 1 - \eta_s(\zeta\gamma\rho^2, l).$$

We can now use Lemma 2.3 and Lemma 2.4 to prove the following extension of Theorem 1.5:

Theorem 2.5. Let d ≥ 1, and let $\{P_k\}_{k\in\mathbb{Z}^d}$ be probabilities such that there exists s < 2d for which
$$\liminf_{\|k\|\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0. \tag{6}$$
Then, if $\{P_k\}$ is percolating there exists an ε > 0 such that $\{P'_k = (1 - \varepsilon)P_k\}$ is percolating too.
Proof. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be a percolating system that satisfies (6). Let $k_0$ and γ be as in Lemma 2.4. Let, again, $\zeta = 2^{-s-1}d^{-s}$. Let λ < 1, β and δ > 0 be such that a system in which every vertex is alive with probability λ − δ and every two vertices x and y are connected to each other with probability $\eta_s(\beta(1 - \delta), \|x - y\|_1)$ is percolating. For one dimension one can choose such λ, β and δ by Theorem 2.1. For higher dimensions we may use the fact that site-bond nearest neighbor percolation with high enough parameters has, a.s., an infinite cluster. Let $\rho > 2(2k_0)^d$ be s.t. $\zeta\gamma\rho^2 \ge \beta$. By Lemma 2.3, there exists N s.t. a cube of side length N contains a cluster of size $\rho N^{s/2}$ with probability bigger than λ. A cube that contains a cluster of size bigger than or equal to $\rho N^{s/2}$ will be considered alive. For ω ≤ 1, consider the system $\{P'_k = \omega P_k\}$. The probability that in the system $\{P'_k\}$ an N-cube is alive is a continuous function of ω. If we define $k'_0$ and γ′ for $\{P'_k\}$ the same way we defined $k_0$ and γ, then we get that $k'_0 = k_0$, and γ′ is a continuous function of ω. Choose ε so small that in the system $\{P'_k = (1 - \varepsilon)P_k\}$ the probability of an N-cube to be alive is no less than λ − δ and that $\gamma' \ge (1 - \delta)\gamma$. Then, in the system $\{P'_k\}$, every N-cube is alive with probability bigger than λ − δ, and two cubes at distance k cubes from each other are connected with probability bigger than $\eta_s(\zeta\gamma'\rho^2, k) \ge \eta_s((1 - \delta)\zeta\gamma\rho^2, k) \ge \eta_s(\beta(1 - \delta), k)$. So, by the choice of β, λ and δ, a.s. there is an infinite cluster in the system $\{P'_k\}$.

Corollary 2.6. Let d ≥ 1, and let $\{P_k\}_{k\in\mathbb{Z}^d}$ be probabilities such that there exists s < 2d for which
$$\liminf_{\|k\|\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0. \tag{7}$$
If $\{P_k\}$ is critical, i.e. for every ε > 0 the system $\{(1 + \varepsilon)P_k\}$ is percolating but the system $\{(1 - \varepsilon)P_k\}$ is not percolating, then $\{P_k\}$ is not percolating.

Lemma 2.3 also serves us in proving Theorem 1.8.

Proof of Theorem 1.8. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be such that
$$\liminf_{\|k\|\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0$$
for s < 2d. Let $k_0$, γ and ζ be as before. Let ε and $\rho > 2(2k_0)^d$ be s.t. the site-bond nearest neighbor percolation on $\mathbb{Z}^d$ in which every site is alive with probability 1 − ε and every bond is open with probability $\eta_s(\zeta\gamma\rho^2, 1)$ percolates. Let N be suitable for those ε and ρ by Lemma 2.3. Now, erase all of the bonds of length bigger than 4Nd. Renormalize the space to cubes of side-length N. By erasing only bonds of length > 4Nd, we did not erase bonds that are either inside N-cubes, or between neighboring N-cubes. So, the renormalized picture still gives us site-bond percolation with probabilities 1 − ε and $\eta_s(\zeta\gamma\rho^2, 1)$, and therefore an infinite cluster exists a.s.

Returning to transience, we now prove that for large enough parameters β and λ, the infinite cluster is transient. Later we will use Lemma 2.3 and Lemma 2.4 to reduce any percolating system (with d < s < 2d) to one with these large β and λ.
Lemma 2.7. Let d ≥ 1 and d < s < 2d. Consider the independent bond-site percolation model in which every two vertices, i and j, are connected with probability $\eta_s(\beta, \|i - j\|_1)$, and every vertex is alive with probability λ < 1. If β is large enough and λ is close enough to 1, then (a.s.) the random walk on the infinite cluster is transient.

In order to prove Lemma 2.7, we need the notion of a renormalized graph: For a sequence $\{C_n\}_{n=1}^{\infty}$, we construct a graph whose vertices are marked $V_l(j_l, \ldots, j_1)$, where l = 0, 1, ... and $1 \le j_n \le C_n$. For convenience, set $V_k(0, 0, \ldots, 0, j_l, \ldots, j_1) = V_l(j_l, \ldots, j_1)$. For l ≥ m, we define $V_l(j_l, \ldots, j_m)$ to be the set $\{V_l(j_l, \ldots, j_m, u_{m-1}, \ldots, u_1) \mid 1 \le u_{m-1} \le C_{m-1}, \ldots, 1 \le u_1 \le C_1\}$.

Definition 2.8. A renormalized graph for a sequence $\{C_n\}_{n=1}^{\infty}$ is a graph whose vertices are $V_l(j_l, \ldots, j_1)$, where l = 0, 1, ... and $1 \le j_n \le C_n$, such that for every k ≥ l > 2, every $j_k, \ldots, j_{l+1}$ and every $u_l, u_{l-1}$ and $w_l, w_{l-1}$, there is an edge connecting a vertex in $V_k(j_k, \ldots, j_{l+1}, u_l, u_{l-1})$ and a vertex in $V_k(j_k, \ldots, j_{l+1}, w_l, w_{l-1})$.

One may view a renormalized graph as a graph having the following recursive structure: The nth stage of the graph is composed of $C_n$ graphs of stage (n − 1), such that every (n − 2)-stage graph in each of them is connected to every (n − 2)-stage graph in any other. (A zero stage graph is a vertex.)

Lemma 2.9. Under the conditions of Lemma 2.7, if β and λ are large enough, then a.s. the infinite cluster contains a renormalized sub-graph with $C_n = (n + 1)^{2d}$.

Proof. We will show that with a positive probability 0 belongs to a renormalized sub-graph. Then, by ergodicity of the shift operator and the fact that the event E = {There exists a renormalized sub-graph} is shift invariant we get P(E) = 1. In order to do that, we use the exact same technique used by Newman and Schulman in [16]: Take
$$W_n = 2(n + 1)^2, \qquad \theta_n = 1 - \frac{n^{-1.5}}{2}, \qquad \lambda_n = 1 - \frac{(n + 1)^{-1.5}}{4}. \tag{8}$$
Renormalize $\mathbb{Z}^d$ by viewing cubes of side-length $W_1$ as first stage vertices. (The original vertices will be viewed as zero-stage vertices.) Then, take cubes of side-length $W_2$ of first-stage vertices as second stage vertices, and continue grouping together cubes of side-length $W_n$ of (n − 1)-stage vertices to form n-stage vertices. We now define inductively the notion of an (n-stage) vertex being alive: The notion of a zero-stage vertex being alive is given to us. A first-stage vertex is alive if at least $\theta_1 W_1^d$ of its vertices are alive, and they are all connected to each other. For every living first-stage vertex, we choose $C_1$ zero-stage vertices, and call them active. The active part of a first-stage vertex is the set of active zero-stage vertices in it. The active part of a living zero-stage vertex is the singleton containing the vertex. We now define (inductively) simultaneously the notion of an n-stage vertex being alive, and of the active part of this vertex. For n ≥ 2, we say that an n-stage vertex v is alive if:
(A) At least $\theta_n W_n^d$ of its vertices are alive, and
(B) If $i_1$ is a living (n − 2)-stage vertex that belongs to a living (n − 1)-stage vertex $i_2$ that belongs to v, and $j_1$ is a living (n − 2)-stage vertex that belongs to a living (n − 1)-stage vertex $j_2$ that belongs to v, then there exists an open bond connecting a zero-stage vertex in the active part of $i_1$ to a zero-stage vertex in the active part of $j_1$.
When choosing the active vertices, if the vertex that includes 0 is alive, we choose it to be active. To define the active part: If v is a living n-stage vertex, then we choose $C_n$ of its living (n − 1)-stage vertices to be active. The active part of v is the union of the active parts of its active vertices. (Notice that the active part is always a set of zero-stage vertices.) We denote the event that (A) occurs for the n-stage vertex containing 0 by $A_n$, and by $B_n$ we denote the event that (B) occurs for the n-stage vertex containing 0. $A_n(v)$ and $B_n(v)$ will denote the same events for the n-stage vertex v. Of course, $P(A_n) = P(A_n(v))$ and $P(B_n) = P(B_n(v))$ for every v. Further, we denote by $L_n(v)$ the event that the n-stage vertex v is alive, and by $L_n$ the event that the n-stage vertex containing 0 is alive.

Let v be an n-stage vertex. Given $A_n$ we want to estimate the probability of $B_n$: We have at most
$$\binom{(W_n W_{n-1})^d}{2} < 4^d (n + 1)^{8d} \tag{9}$$
pairs of (n − 2)-vertices. If $i_1$ and $i_2$ are living (n − 2)-stage vertices in v, then the distance between a zero-stage vertex in $i_1$ and a zero-stage vertex in $i_2$ cannot exceed
$$\prod_{k=1}^{n} W_k = 2^n ((n + 1)!)^2. \tag{10}$$
The size of the active part in $i_1$ (and in $i_2$) is
$$\prod_{k=1}^{n} C_k = ((n + 1)!)^{2d}. \tag{11}$$
By (11) and (10), the probability that there is no open bond between $i_1$ and $i_2$ is bounded by
$$\left[\exp\left(-\beta\cdot 2^{-ns}((n + 1)!)^{-2s}\right)\right]^{((n+1)!)^{4d}} = \exp\left(-\beta\cdot 2^{-ns}((n + 1)!)^{4d-2s}\right)$$
and by (9) we get
$$P\left(B_n^c \mid A_n\right) \le 4^d (n + 1)^{8d}\exp\left(-\beta\cdot 2^{-ns}((n + 1)!)^{4d-2s}\right) \le \exp\left(9d\log(n) - \beta\cdot 2^{-ns}((n + 1)!)^{4d-2s}\right). \tag{12}$$
Assume that β > 1. We may assume that because we deal with "large enough" β. By (12), there exists $n_0$ s.t. if $n > n_0$ then
$$P\left(B_n^c \mid A_n\right) < e^{-n}. \tag{13}$$
We now want to prove the following claim:

Claim 2.10. There exists $n_1$ such that for every $n > n_1$, if $P(L_n) \ge \lambda_n$ then $P(L_{n+1}) \ge \lambda_{n+1}$.
Proof. Let $\psi = P(L_n^c)$ be the probability that an n-stage vertex is dead. First, we would like to estimate $P(A_{n+1})$. The event $A_{n+1}^c$ is the event that at least $(1 - \theta_{n+1})W_{n+1}^d$ vertices are dead. The number of dead vertices is a $(W_{n+1}^d, \psi)$ binomial variable, and by the induction hypothesis together with (8), $\psi \le \frac{1}{2}(1 - \theta_{n+1})$. So, by large deviation estimates,
$$P(A_{n+1}^c) < \exp\left(-\frac{\frac{1}{2}(1 - \theta_{n+1})W_{n+1}^d}{16}\right) \le \exp\left(-\frac{n^{2d-1.5}}{32}\right). \tag{14}$$
If $n_1 > n_0$, and is large enough, by (13) and (14),
$$P(L_{n+1}^c) \le P(A_{n+1}^c) + P(B_{n+1}^c \mid A_{n+1}) \le \exp\left(-\frac{n^{2d-1.5}}{32}\right) + e^{-n} \le \frac{(n + 2)^{-1.5}}{4} = 1 - \lambda_{n+1}.$$
We can take β and λ so large that $P(L_{n_1}) > \lambda_{n_1}$. But then, by Claim 2.10, for every $n > n_1$, $P(L_n) > \lambda_n$. So, since the events $L_n$ are positively correlated,
$$P\left(\bigcap_{n=1}^{\infty} L_n\right) \ge \prod_{n=1}^{\infty} P(L_n) > 0.$$
So with positive probability, 0 is in an infinite cluster. The active part of the infinite cluster (i.e. the union of the active parts of the n-stage vertex containing 0 for all n) is a renormalized sub-graph of the infinite cluster that contains 0.

Proof of Lemma 2.7. In view of Lemma 2.9 it suffices to show that for $C_n = (n + 1)^{2d}$, the renormalized graph is transient. We build, inductively, a flow F from $V_1(1)$ to infinity which has a finite energy. First, F flows $C_1^{-1}$ mass from $V_1(1)$ to each of $\{V_1(i)\}_{i=2}^{C_1}$.

Now, inductively, assume that F distributes the mass among $\{V_n(i_1, \ldots, i_n) \mid 2 \le i_k \le C_k\}$. Then, for each (n − 1)-stage graph $V_n(i)$, i ≠ 1, and every n-stage graph $V_{n+1}(j)$, j ≠ 1, there are two vertices, $p_{i,j}^{(n)} \in V_n(i)$ and $q_{i,j}^{(n)} \in V_{n+1}(j)$, which are connected to each other by an open bond. (Notice that the vertices $\{p_{i,2}^{(n)}, \ldots, p_{i,C_{n+1}}^{(n)}\}$, as well as $\{q_{2,j}^{(n)}, \ldots, q_{C_n,j}^{(n)}\}$, do not necessarily differ from each other.) Inductively, we know how to flow mass from one vertex in $V_n(i)$ to all of $V_n(i)$. We can flow it backwards in the same manner to any desired vertex. Flow the mass so that it will be distributed equally among $\{p_{i,2}^{(n)}, \ldots, p_{i,C_{n+1}}^{(n)}\}$ (if a vertex appears twice, it will get a double portion). Now flow the mass from each $p_{i,j}^{(n)}$ to the corresponding $q_{i,j}^{(n)}$, and from $q_{i,j}^{(n)}$ (again in the familiar inductive way) we will flood $V_{n+1}(j)$.

Now, we bound the energy of the flow. Denote by $E_n$ the maximal possible energy of the first n stages of the flow (i.e. the part of the flow which distributes the mass from the origin to $V_{n+1}$ and takes it backwards to $\{p_{i,j}^{(n+1)}\} \subset V_{n+1}$). It can be bounded by the energy of the first n − 1 stages of the flow, plus:
(A) Flowing between $p_{i,j}^{(n)}$ and $q_{i,j}^{(n)}$: This will have energy of $(C_n C_{n+1})^{-1}$.
(B) Flowing inside $V_{n+1}$: the energy is bounded by $E_{n-1}/C_{n+1}$.
So,
$$E_n \le \left(1 + \frac{1}{C_{n-1}}\right)E_{n-1} + \frac{1}{C_n C_{n-1}}.$$
The total energy is bounded by the supremum of $\{E_n\}$, which is finite because
$$\sum_{n=1}^{\infty} \frac{1}{C_n} < \infty.$$
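The boundedness of $\{E_n\}$ can be seen by iterating the recursion numerically. The sketch below is an illustration only: it treats the recursion as an equality, starts from a hypothetical value $E_1 = 1$ (the source does not fix an initial value), uses $C_n = (n + 1)^{2d}$, and compares the result with the crude bound $(E_1 + S)\,e^{S}$, where S is the partial sum of $1/C_{n-1}$. The function names are assumptions.

```python
import math


def C(n, d):
    """C_n = (n + 1)^(2d), the block sizes of the renormalized graph."""
    return (n + 1) ** (2 * d)


def energy_bounds(d, n_max, e1=1.0):
    """Iterate E_n = (1 + 1/C_{n-1}) E_{n-1} + 1/(C_n C_{n-1}) for n = 2..n_max.

    e1 is a hypothetical starting value for E_1.
    """
    bounds = [e1]
    for n in range(2, n_max + 1):
        prev = bounds[-1]
        bounds.append((1.0 + 1.0 / C(n - 1, d)) * prev
                      + 1.0 / (C(n, d) * C(n - 1, d)))
    return bounds


def crude_limit_bound(d, n_max, e1=1.0):
    """(e1 + S) * exp(S) with S = sum_{n=2}^{n_max} 1/C_{n-1}.

    This dominates every iterate because 1 + x <= exp(x) and
    1/(C_n C_{n-1}) <= 1/C_{n-1}.
    """
    S = sum(1.0 / C(n - 1, d) for n in range(2, n_max + 1))
    return (e1 + S) * math.exp(S)


if __name__ == "__main__":
    values = energy_bounds(d=1, n_max=2000)
    print(values[-1], crude_limit_bound(d=1, n_max=2000))
```

The iterates increase but stay below the crude bound, which remains finite as `n_max` grows because $\sum_n 1/C_n$ converges.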
Let v be a vertex. The amount of flow that goes through v is defined to be $f(v) = \sum |f(e)|$, where the sum is taken over all of the edges e that have v as an end point. Then, we get a notion of the energy of the flow through the vertices, defined as
$$E_{\text{vertices}} = \frac{1}{2}\sum_{v \text{ is a vertex}} f(v)^2.$$

Remark 2.11. The same calculation as in Lemma 2.7 yields that not only the energy of the flow on the bonds is finite, but also the energy of the flow through the vertices.

This fact allows us to obtain the main goal of this section:

Theorem 2.12. Let d ≥ 1, and let $\{P_k\}_{k\in\mathbb{Z}^d}$ satisfy:
(A) $P_k = P_{-k}$ for every $k \in \mathbb{Z}^d$.
(B) the independent percolation model in which the bond between i and j is open with probability $P_{i-j}$ has, a.s., an infinite cluster.
(C) there exists d < s < 2d s.t.
$$\liminf_{\|k\|\to\infty} \frac{P_k}{\|k\|_1^{-s}} > 0.$$
(D)
$$\sum_{k\in\mathbb{Z}^d} P_k < \infty. \tag{15}$$
Then, a.s., a random walk on the infinite cluster is transient.
Proof. By (D), the degree of every vertex in the infinite cluster is finite, so the random walk is well defined. Let β and λ be large enough for Lemma 2.7. Then, by Lemma 2.3, there exists N such that after renormalizing with cubes of side-length N we get a system whose connection probabilities dominate $\eta_s(\beta, |i - j|)$, and the probability of a vertex to live is bigger than λ. By Lemma 2.7, there is a flow on this graph whose energy is finite. For the walk to be transient, the energy of the flow should also be finite inside the N-cubes. This is true because of Remark 2.11 and the fact that inside each N-cube there are no more than $\binom{N^d}{2}$ bonds.

One can look at other types of energy as well. For any q, we define the q-energy of a flow as in Eq. (1). Theorem 2.12 says that for every $\{P_k\}$ that satisfies conditions (A) through (D), there is a flow with finite 2-energy. Actually, one can say more:

Theorem 2.13. Let $\{P_k\}_{k\in\mathbb{Z}^d}$ be as in Theorem 2.12. Then, for every q > 1, there is a flow with finite q-energy on the infinite cluster.

A sketch of the proof. The proof is essentially the same as the proof of Theorem 2.12. We can construct a renormalized sub-graph of the infinite cluster with $C_n = (n + 1)^{kd}$, for k s.t. k(q − 1) > 1. We construct the flow the same way we did it in Lemma 2.7. The same energy estimation will now yield the required finiteness of the energy. Lemma 2.3 and Remark 2.11 are used the same way they were used in Theorem 2.12. If we construct a renormalized graph with $C_n = 2^n$ (such a graph a.s. exists as a sub-graph of the infinite cluster), we get a flow whose q-energy is finite for every q > 1.
3. The Recurrence Proofs In this section we prove the recurrence results. Unlike the transient case, here we give two different proofs - one for the one-dimensional case, and the other for the two-dimensional case. We begin with the easier one-dimensional case. Theorem 3.1. Let {Pk }∞ k=1 be a sequence of probabilities s.t.: (A) the independent percolation model in which the bond between i and j is open with probability P|i−j | has, a.s., an infinite cluster, and (B) lim sup k→∞
Pk < ∞. k −2
Then, a.s., a random walk on the infinite cluster is recurrent.
The proof of the theorem relies on the Nash–Williams theorem, whose proof can be found in [18]:
Theorem 3.2 (Nash–Williams). Let $G$ be a graph with conductance $C_e$ on every edge $e$. Consider a random walk on the graph such that when the particle is at some vertex, it chooses its way with probabilities proportional to the conductances on the edges that it sees. Let $\{\Pi_n\}_{n=1}^{\infty}$ be disjoint cutsets, and denote by $C_{\Pi_n}$ the sum of the conductances in $\Pi_n$. If $\sum_n C_{\Pi_n}^{-1} = \infty$, then the random walk is recurrent.
In order to prove Theorem 3.1, we need the following definition and three easy lemmas. The following definition appeared originally in [3] and [16].
Definition 3.3 (Continuum Bond Model). Let $\beta$ be s.t.
$$\int_0^1 \int_k^{k+1} \beta (x-y)^{-2}\, dy\, dx > P_k$$
for every $k$. The continuum bond model is the two-dimensional inhomogeneous Poisson process $\xi$ with density $\beta(x-y)^{-2}$. We say that two sets $A$ and $B$ are connected if $\xi(A \times B) > 0$.
Notice that by Definition 3.3, the probability that the interval $[i, i+1]$ is connected to $[j, j+1]$ in the continuum model is not smaller than the probability that $i$ is directly connected to $j$ in the original model. (By saying that a vertex is directly connected to an interval, we mean that there is an open bond between this vertex and some vertex in the interval.) So, we get:
Claim 3.4. Let $I$ be an interval. Let $M$ be the length of the shortest interval that contains all of the vertices that are directly connected to $I$ in the original model. Let $M'$ be the length of the smallest interval $J$ s.t. $\xi(I \times (\mathbb{R} - J)) = 0$. Then, $M'$ stochastically dominates $M$.
Lemma 3.5. (A) Under the conditions of Theorem 3.1, let $I$ be an interval of length $N$. Then, the probability that there exists a vertex at distance bigger than $d$ from the interval that is directly connected to the interval is $O(N/d)$. (B) Consider the continuum bond model. Let $I$ be an interval of length $N$, and let $J$ be the smallest interval s.t. $\xi(I \times (\mathbb{R} - J)) = 0$. Then $P(|J| > d) = O(N/d)$.
Proof. (A) Let $\beta = \sup_k \frac{P_k}{k^{-2}} < \infty$. If $v$ is at distance $d$ from $I$, then the probability that $v$ is directly connected to $I$ is bounded by
$$\beta \sum_{k=d}^{d+N} k^{-2} < \frac{\beta N}{d^2}.$$
So, the probability that there is a vertex at distance bigger than $d$ that is directly connected to $I$ is bounded by
$$\sum_{k=d}^{\infty} \frac{\beta N}{k^2} = O\left(\frac{N}{d}\right).$$
(B) is proved exactly the same way.
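The two sums in the proof of part (A) can be checked numerically. The following is a minimal sketch, assuming $\beta = 1$; the function names `window_sum` and `truncated_tail` are illustrative and do not come from the paper. It verifies the elementary bounds $\sum_{k=d}^{d+N} k^{-2} \le (N+1)/d^2$ and $\sum_{k \ge d} k^{-2} \le 2/d$, which agree with the estimates above up to constant factors.

```python
def window_sum(d, n):
    """Sum of k^-2 over k = d, ..., d + n (the first sum in the proof, with beta = 1)."""
    return sum(k ** -2 for k in range(d, d + n + 1))


def truncated_tail(d, cutoff=100_000):
    """Sum of k^-2 over k = d, ..., cutoff - 1; a lower bound for the infinite tail."""
    return sum(k ** -2 for k in range(d, cutoff))


# Each of the n + 1 terms is at most d^-2, and the full tail is at most
# 1/d^2 + 1/d <= 2/d, so N times the tail is O(N/d).
for d in (5, 50, 500):
    for n in (1, 10, 100):
        assert window_sum(d, n) <= (n + 1) / d ** 2
    assert truncated_tail(d) <= 2 / d
```

Multiplying the tail bound by $N$ gives the $O(N/d)$ rate stated in the lemma.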
Lemma 3.6. Under the same conditions, and again letting $I$ be an interval of length $N$, the expected number of open bonds exiting $I$ is $O(\log N)$. There is a constant $\gamma$ s.t. the probability of having more than $\gamma \log N$ open bonds exiting $I$ is smaller than 0.5.
Proof. Again, let $\beta = \sup_k \frac{P_k}{k^{-2}} < \infty$. The expected number of open bonds exiting $I$ is
$$\sum_{v \in I,\, u \notin I} P(v \leftrightarrow u) \le \beta \sum_{v \in I,\, u \notin I} (u - v)^{-2} = 2\beta \sum_{i=1}^{N} \sum_{k=i}^{\infty} k^{-2} \le 4\beta \sum_{i=1}^{N} \frac{1}{i} = O(\log N).$$
Let $C$ be s.t. the expected value is less than $C \log N$ for all $N$. For any $\gamma > 2C$, by Markov's inequality, the probability that more than $\gamma \log N$ open bonds are exiting $I$ is smaller than 0.5.
Lemma 3.7. Let $A_i$ be independent events s.t. $P(A_i) \ge 0.5$ for every $i$. Then, a.s.,
$$\sum_{n=1}^{\infty} \frac{1_{A_n}}{n} = \infty.$$
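Proof. Let
$$U_k = \sum_{i=2^k}^{2^{k+1}-1} \frac{1_{A_i}}{i}.$$
Then,
$$U_k \ge 2^{-(k+1)} \sum_{i=2^k}^{2^{k+1}-1} 1_{A_i}. \tag{16}$$
The variables $U_k$ are independent of each other, and by (16), for every $k$ we have $P(U_k > 0.25) > 0.5$. Therefore,
$$\sum_{n=1}^{\infty} \frac{1_{A_n}}{n} = \sum_{k=0}^{\infty} U_k = \infty \quad \text{a.s.}$$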
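The dyadic grouping used in this proof can be illustrated on a sample. The sketch below is only an illustration under stated assumptions: the events are simulated as independent coin flips with $P(A_i) = 0.5$, and the helper names (`dyadic_block`, `block_sum`, `block_count`, `sample_indicators`) are hypothetical. It checks inequality (16) block by block and confirms that the blocks $U_k$ add up to the partial sum of $1_{A_n}/n$.

```python
import random


def dyadic_block(k):
    """Indices i with 2^k <= i <= 2^(k+1) - 1."""
    return range(2 ** k, 2 ** (k + 1))


def block_sum(indicator, k):
    """U_k: the sum of 1_{A_i} / i over the k-th dyadic block."""
    return sum(indicator[i] / i for i in dyadic_block(k))


def block_count(indicator, k):
    """Number of events A_i that occur in the k-th dyadic block."""
    return sum(indicator[i] for i in dyadic_block(k))


def sample_indicators(num_blocks, p=0.5, seed=0):
    """Hypothetical sample of independent events with P(A_i) = p, indexed from 1."""
    rng = random.Random(seed)
    size = 2 ** num_blocks
    # Index 0 is a placeholder so that indicator[i] refers to the event A_i.
    return [0] + [1 if rng.random() < p else 0 for _ in range(1, size)]


indicator = sample_indicators(num_blocks=12)
for k in range(12):
    # Inequality (16): every index in the block is smaller than 2^(k+1).
    assert block_sum(indicator, k) >= 2 ** -(k + 1) * block_count(indicator, k)
```

Since each block contributes at least 0.25 with probability bounded away from zero, independently of the other blocks, infinitely many blocks do so and the series diverges.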
Proof of Theorem 3.1. We will show that with probability 1, the infinite cluster satisfies the Nash–Williams condition. Let $I_0$ be some interval. We define $I_n$ inductively to be the smallest interval that contains all of the vertices that are connected directly to $I_{n-1}$. Denote $D_n = \frac{|I_{n+1}|}{|I_n|}$. The edges exiting $I_{n+1}$ are stochastically dominated by the edges exiting an interval of length $|I_{n+1}|$ (without the restriction that no edge starting at $I_n$ exits $I_{n+1}$). Furthermore, given $I_n$, the edges exiting $I_{n+1}$ are independent of those exiting $I_n$. Let $\{U_n\}_{n=1}^{\infty}$ be independent copies of the continuum bond model. Then, by Claim 3.4, $D_n$ is stochastically dominated by the sequence $D'_n = \frac{|I'_{n+1}|}{|I_n|}$, where $I'_{n+1}$ is the smallest interval s.t. $\mathbb{R} - I'_{n+1}$ is not connected to the copy of $I_n$ in $U_n$. The variables $D'_n$ are i.i.d. Therefore, by Lemma 3.5, the sequence $\{\log(D_n)\}$ is dominated by a sequence of i.i.d. variables $d_n = \log(D'_n)$, which satisfy $E(d_n) < M$. Let
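$\Pi_n$ be the set of open bonds exiting $I_n$; these sets are disjoint cutsets. Moreover,
$$\log |I_N| \le \log |I_0| + \sum_{n=1}^{N} d_n. \tag{17}$$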
By the strong law of large numbers, with probability 1, for all large enough $N$,
$$\sum_{n=1}^{N} d_n < 2MN. \tag{18}$$
Combining (17), (18) and Lemma 3.7, we get that the Nash–Williams condition is a.s. satisfied.
We now work on the two-dimensional case. Our strategy in this case will be to project the long bonds on the short ones. That is, for every open long bond we find a path of nearest-neighbor bonds s.t. the end points of the path are those of the original long bond. Then, we erase the long bond, and assign its conductance to this path. In order to keep the conductance of the whole graph, if the path is of length $n$, we add $n$ to the conductance of each of the bonds involved in it. To make the discussion above more precise, we state it as a lemma.
Lemma 3.8. Let $s > 3$ and let $P_{i,j}$ be a sequence of probabilities, such that
$$\limsup_{i,j \to \infty} \frac{P_{i,j}}{(i+j)^{-s}} < \infty.$$
Consider a shift invariant percolation model on $\mathbb{Z}^2$ in which a bond between $(x_1, y_1)$ and $(x_2, y_2)$ is open with marginal probability $P_{|x_1 - x_2|, |y_1 - y_2|}$. Assign conductance 1 to every open bond, and 0 to every closed one. Call this electrical network $G_1$. Now, perform the following projection process: for every open long (i.e. not nearest neighbor) bond $(x_1, y_1), (x_2, y_2)$ we
(A) erase the bond, and
(B) to each nearest neighbor bond in $[(x_1, y_1), (x_1, y_2)] \cup [(x_1, y_2), (x_2, y_2)]$ increase the conductance by $|x_1 - x_2| + |y_1 - y_2|$.
We call this new electrical network $G_2$. Then
(I) A.s. all of the conductances in $G_2$ are finite.
(II) The effective conductance of $G_2$ is bigger than or equal to that of $G_1$.
(III) The distribution of the conductance of an edge in $G_2$ is shift invariant.
(IV) If $s > 4$ then the conductance of an edge is in $L^1$.
(V) If $s = 4$ then the conductance $C_e$ of an edge has a Cauchy tail, i.e. there is a constant $\chi$ such that $P(C_e > n\chi) \le n^{-1}$ for every $n$.
To complete the picture, we need the following theorem about random electrical networks on $\mathbb{Z}^2$. The theorem is proved in the next section.
Theorem 3.9. Let $G$ be a random electrical network on the nearest neighbor bonds of the lattice $\mathbb{Z}^2$, such that all of the edges have the same conductance distribution, and this distribution has a Cauchy tail. Then, a.s., a random walk on $G$ is recurrent.
Lemma 3.8 and Theorem 3.9 imply the following theorem:
Theorem 3.10. Let $s \ge 4$ and let $P_{i,j}$ be probabilities, such that
$$\limsup_{i,j \to \infty} \frac{P_{i,j}}{(i+j)^{-s}} < \infty. \tag{19}$$
Consider a shift invariant percolation model on $\mathbb{Z}^2$ in which the bond between $(x_1, y_1)$ and $(x_2, y_2)$ is open with marginal probability $P_{|x_1 - x_2|, |y_1 - y_2|}$. If there exists an infinite cluster, then the random walk on this cluster is recurrent.
Proof. The case $s = 4$ follows directly from Lemma 3.8 and Theorem 3.9. For the case $s > 4$, notice that if (19) holds for some $s > 4$, then it holds for $s = 4$ too.
Proof of Lemma 3.8. (I) We calculate the expected number of bonds that are projected on the edge $(x, y), (x, y+1)$: W.l.o.g., the projected bond starts at some $(x, y_1)$ with $y_1 \le y$, continues through $(x, y_2)$ with $y_2 \ge y + 1$, and ends at some $(x_1, y_2)$. The expected number will be
$$2 \sum_{y_1 \le y,\ y_2 \ge y+1,\ x_1} P_{|y_2 - y_1|, |x_1 - x|} \le 4M \sum_{j \le 0,\ k \ge 1,\ h \ge 0} (k - j + h)^{-s} \le 4M \sum_{l > 0,\ h \ge 0} (l + h)^{1-s} \le 4M \sum_{l > 0} l^{2-s} < \infty,$$
where
$$M = \sup_{i,j} \frac{P_{i,j}}{(i+j)^{-s}} < \infty,$$
and therefore (I) is true.
(II) Let $E$ be a bond which is projected on a path of length $n$. $E$ has conductance 1, and is therefore equivalent to a sequence of $n$ edges with conductance $n$ each. So, divide $E$ that way. By identifying the endpoints of these edges with actual vertices of the lattice, we only increase the effective conductance of the network.
(III) is trivial.
(IV) and (V) follow from the same calculation performed in the proof of (I).
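The projection process of Lemma 3.8 can be sketched in a few lines. This is an illustrative sketch, not the author's construction in code form: the names `project_bond` and `series_conductance` and the dictionary representation of the network are assumptions. It projects one long bond onto the L-shaped nearest-neighbor path and checks the observation behind (II), namely that $n$ edges in series with conductance $n$ each have total conductance 1.

```python
from collections import defaultdict


def project_bond(conductance, p, q):
    """Project the long bond p-q onto the path [p, (x1, y2)] + [(x1, y2), q].

    `conductance` maps a nearest-neighbor edge (a frozenset of two lattice
    points) to its conductance; every edge on the path gains
    n = |x1 - x2| + |y1 - y2|, which is the length of the path.
    """
    (x1, y1), (x2, y2) = p, q
    n = abs(x1 - x2) + abs(y1 - y2)
    # Vertical leg from (x1, y1) to (x1, y2).
    step = 1 if y2 >= y1 else -1
    for y in range(y1, y2, step):
        conductance[frozenset({(x1, y), (x1, y + step)})] += n
    # Horizontal leg from (x1, y2) to (x2, y2).
    step = 1 if x2 >= x1 else -1
    for x in range(x1, x2, step):
        conductance[frozenset({(x, y2), (x + step, y2)})] += n
    return n


def series_conductance(values):
    """Effective conductance of edges connected in series."""
    return 1.0 / sum(1.0 / c for c in values)


network = defaultdict(float)
length = project_bond(network, (0, 0), (2, 3))
# The path has `length` edges, each now carrying conductance `length`.
assert len(network) == length
assert abs(series_conductance(network.values()) - 1.0) < 1e-12
```

When several long bonds are projected onto the same nearest-neighbor edge, the additions accumulate, which is why part (I) requires the expected number of projected bonds per edge to be finite.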
4. Random Electrical Networks
In this section we discuss random electrical networks. We have two main goals in this section:
Theorem 3.9. Let $G$ be a random electrical network on the lattice $\mathbb{Z}^2$, such that all of the edges have the same conductance distribution, and this distribution has a Cauchy tail. (Notice that we do not require any independence.) Then, a.s., a random walk on $G$ is recurrent.
and
Theorem 1.9. Let $G$ be a recurrent graph with bounded degree. Assign i.i.d. conductances on the edges of $G$. Then, a.s., the resulting electrical network is recurrent.
Notice that if in Theorem 3.9 we don't require a Cauchy tail, then the network might be transient. A good example would be the projected two-dimensional long-range percolation with $3 < s < 4$ (see Lemma 3.8).
First, we prove Theorem 3.9, which is important for the previous section. We need the following lemma, which sets some bound for the sum of random variables with a Cauchy tail:
Lemma 4.1. Let $\{f_i\}_{i=1}^{\infty}$ be identically distributed positive random variables that have a Cauchy tail. Then, for every $\epsilon > 0$ there exist $K$ and $N$ such that if $n > N$, then
$$P\left(\frac{1}{n} \sum_{i=1}^{n} f_i > K \log n\right) < \epsilon.$$
Proof. $f_i$ has a Cauchy tail, so there exists $C$ such that for every $n$, $P(f_i > n) < \frac{C}{n}$. Let $M > 2/\epsilon$ be a large number. Let $N$ be large enough that $C N^{1-M} < \frac{1}{2}\epsilon$. Choose $n > N$, and let $g_i = \min(f_i, n^M)$ for all $1 \le i \le n$. Then,
$$P\left(\frac{1}{n} \sum_{i=1}^{n} f_i \ne \frac{1}{n} \sum_{i=1}^{n} g_i\right) \le n \cdot P(f_1 \ne g_1) \le C n^{1-M} < \frac{1}{2}\epsilon.$$
$E(g_i) \le CM \log n$, and $g_i$ is positive. Therefore, by Markov's inequality, if we take $K = CM^2$, then
$$P\left(\frac{1}{n} \sum_{i=1}^{n} g_i > K \log n\right) < \frac{CM \log n}{CM^2 \log n} = \frac{1}{M} < \frac{1}{2}\epsilon,$$
and so
$$P\left(\frac{1}{n} \sum_{i=1}^{n} f_i > K \log n\right) < \epsilon.$$
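We use another lemma:
Lemma 4.2. Let $A_n$ be a sequence of events such that $P(A_n) > 1 - \epsilon$ for every $n$, and let $\{a_n\}_{n=1}^{\infty}$ be a sequence s.t.
$$\sum_{n=1}^{\infty} a_n = \infty.$$
Then, with probability at least $1 - \epsilon$,
$$\sum_{n=1}^{\infty} 1_{A_n} \cdot a_n = \infty.$$
Proof. It is enough to show that for any $M$,
$$P\left(\sum_{n=1}^{\infty} 1_{A_n} \cdot a_n < M\right) \le \epsilon.$$
Assume that for some $M$ this is false. Define $B_M$ to be the event
$$B_M = \left\{\sum_{n=1}^{\infty} 1_{A_n} \cdot a_n < M\right\}.$$
Since $P(B_M) > \epsilon$, we know that there exists $\delta > 0$ such that $P(A_n \mid B_M) > \delta$ for all $n$. Therefore,
$$E\left(\sum_{n=1}^{\infty} 1_{A_n} \cdot a_n \,\Big|\, B_M\right) \ge \delta \sum_{n=1}^{\infty} a_n = \infty,$$
which contradicts the definition of $B_M$.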
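Now, we can prove Theorem 3.9.
Proof of Theorem 3.9. Let $G$ be a random electrical network on the lattice $\mathbb{Z}^2$, such that all of the edges have the same conductance distribution, and this distribution has a Cauchy tail. Define the cutset $\Pi_n$ to be the set of edges exiting the square $[-n, n]^2$. Let $\epsilon > 0$ be arbitrary. Let $e_n(i)$ be the $i$th edge (out of $8n + 4$) in $\Pi_n$. By Lemma 4.1, there exist $K$ and $N$, such that for every $n > N$, we have
$$P\left(\sum_{i=1}^{8n+4} C(e_n(i)) \le K n \log n\right) > 1 - \epsilon. \tag{20}$$
Call the event in Eq. (20) $A_n$. Set $a_n = (K n \log n)^{-1}$ for $n = N, \ldots, \infty$. Now,
$$\sum_{n} C_{\Pi_n}^{-1} \ge \sum_{n=N}^{\infty} 1_{A_n} \cdot a_n.$$
By the definition of $\{a_n\}$,
$$\sum_{n=N}^{\infty} a_n = \infty.$$
On the other hand, $P(A_n) > 1 - \epsilon$ for all $n$. So, by Lemma 4.2,
$$P\left(\sum_{n} C_{\Pi_n}^{-1} = \infty\right) \ge 1 - \epsilon.$$
Since $\epsilon$ is arbitrary, we get that a.s.
$$\sum_{n} C_{\Pi_n}^{-1} = \infty.$$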
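The count $8n + 4$ and the divergence of $\sum_n (K n \log n)^{-1}$ can both be checked directly. The sketch below assumes that the cutset consists of the nearest-neighbor edges leaving the square $[-n, n]^2$; the function names are illustrative only.

```python
import math


def exiting_edges(n):
    """Nearest-neighbor edges of Z^2 with exactly one endpoint in [-n, n]^2."""
    edges = []
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                u, v = x + dx, y + dy
                if max(abs(u), abs(v)) > n:
                    edges.append(((x, y), (u, v)))
    return edges


def nash_williams_partial_sum(K, n_min, n_max):
    """Partial sum of a_n = 1 / (K n log n) for n_min <= n <= n_max (n_min >= 2)."""
    return sum(1.0 / (K * n * math.log(n)) for n in range(n_min, n_max + 1))


# Each of the four sides of the square has 2n + 1 boundary vertices,
# and each contributes one outgoing edge per side it lies on.
for n in range(6):
    assert len(exiting_edges(n)) == 8 * n + 4
```

The partial sums of $a_n$ grow like $\log \log n$, so they diverge, although very slowly; this is the reason a Cauchy tail is just enough for the Nash–Williams criterion in two dimensions.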
Now, we turn to prove Theorem 1.9. First, we need a lemma:
Lemma 4.3 (Yuval Peres). Let $G$ be a recurrent graph, and let $C_e$ be random conductances on the edges of $G$. Suppose that there exists $M$ such that $E(C_e) < M$ for each edge $e$. Then, a.s., $G$ with the conductances $\{C_e\}$ is a recurrent electrical network.
Proof. Let $v_0 \in G$, and let $\{G_n\}$ be an increasing sequence of finite sub-graphs of $G$, s.t. $v_0 \in G_n$ for every $n$ and s.t. $G = \cup_{n=1}^{\infty} G_n$. By the definition of effective conductance,
$$\lim_{n \to \infty} C_{\mathrm{eff}}(G_n) = C_{\mathrm{eff}}(G) = 0.$$
Let $X_n$ be the space of functions $f$ s.t. $f(v_0) = 1$ and $f(u) = 0$ for every $u \in G - G_n$. We know that
$$C_{\mathrm{eff}}(G_n) = \inf_{f \in X_n} \sum_{(v,w) \text{ is an edge in } G} (f(v) - f(w))^2.$$
If we denote by $H$ (resp. $H_n$) the electrical network of the graph $G$ (resp. $G_n$) and conductances $C_e$, then
$$C_{\mathrm{eff}}(H_n) = \inf_{f \in X_n} \sum_{(v,w) \text{ is an edge in } G} C_{v,w} (f(v) - f(w))^2.$$
Let $f \in X_n$. Denote $G_n(f)$ for
$$\sum_{(v,w) \text{ is an edge in } G} (f(v) - f(w))^2$$
and $H_n(f)$ for
$$\sum_{(v,w) \text{ is an edge in } G} C_{v,w} (f(v) - f(w))^2.$$
There exists an $f \in X_n$ such that $G_n(f) = C_{\mathrm{eff}}(G_n)$. Since $E(H_n(f)) < M G_n(f)$, we get that $E(C_{\mathrm{eff}}(H_n)) \le M C_{\mathrm{eff}}(G_n)$. So, by Fatou's lemma,
$$E(C_{\mathrm{eff}}(H)) \le \lim_{n \to \infty} E(C_{\mathrm{eff}}(H_n)) \le M \lim_{n \to \infty} C_{\mathrm{eff}}(G_n) = 0,$$
and therefore $C_{\mathrm{eff}}(H) = 0$ a.s.
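The variational formula used in this proof says that any admissible $f$ gives an upper bound on the effective conductance. As a small sanity check, the sketch below works on a hypothetical finite example that is not taken from the paper: a path with vertices $0, \ldots, m$, where $f(0) = 1$ and $f(m) = 0$, so that the effective conductance is given by the series formula. The function names are illustrative.

```python
def dirichlet_energy(conductances, f):
    """Sum of C_e * (f(v) - f(w))^2 over the edges (i, i + 1) of a path."""
    return sum(c * (f[i] - f[i + 1]) ** 2 for i, c in enumerate(conductances))


def path_effective_conductance(conductances):
    """Effective conductance between the two ends of a path (edges in series)."""
    return 1.0 / sum(1.0 / c for c in conductances)


# The linear function minimizes the energy when all conductances equal 1.
linear = [1.0, 0.75, 0.5, 0.25, 0.0]
unit = [1.0, 1.0, 1.0, 1.0]
assert dirichlet_energy(unit, linear) == path_effective_conductance(unit)

# With other conductances the same f is still admissible, so its energy
# is an upper bound for the effective conductance of the new network.
weighted = [2.0, 0.5, 1.0, 4.0]
assert dirichlet_energy(weighted, linear) >= path_effective_conductance(weighted)
```

This is the mechanism behind the inequality $E(C_{\mathrm{eff}}(H_n)) \le M C_{\mathrm{eff}}(G_n)$: the minimizer for unit conductances is reused as a test function for the random network.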
Now we can prove Theorem 1.9. The main idea is to change the conductances in a manner that will not decrease the effective conductance, but after this change, the conductances will have bounded expectations (although they might be dependent).
Proof of Theorem 1.9. Let $G$ be a recurrent graph, and let $d$ be the maximal degree in $G$. Let $\{C_e\}_{\{e \text{ is an edge in } G\}}$ be i.i.d. non-negative variables, and let $H$ be the electrical network defined on the graph $G$ with the conductances $\{C_e\}$. We want to prove that with probability one $H$ is recurrent. Let $M$ be so large that
$$P(C_e \ge M) < \frac{1}{d^5}.$$
We introduce some notation: edges whose conductances are bigger than $M$ will be called bad edges. Vertices which belong to bad edges will also be called bad. We look at connected clusters of bad edges. Edges that are good but have at least one bad vertex will be called boundary edges. By the choice of $M$, the sizes of the clusters of bad edges are dominated by sub-critical Galton–Watson trees.
Define a new network $H'$ as follows: Let $U(e)$ be the connected component to which $e$ belongs (if $e$ is bad) or to which $e$ is attached (if it is a boundary edge). If $e$ is in the boundary of two components, then we take $U(e)$ to be their union. For a bad or boundary $e$, the new conductance will be $2M \cdot (\#U(e) + \#\partial U(e))^2$, where $\#$ measures the number of edges. If $e$ is a good edge then its conductance is unchanged. The size of the connected cluster satisfies $P(\#U(e) + \#\partial U(e) > n) = o(n^{-4})$. Therefore, the expected values of the conductances of the edges are uniformly bounded. So by Lemma 4.3, $H'$ is recurrent.
All we need to prove is that the effective resistance of $H'$ is not bigger than that of $H$: Let $F$ be a flow, and let $U$ be a connected component of bad edges in $G$. The energy of $F$ on $U$ in the network $H$ will be
$$E_{U,F}(H) = \sum_{e \in U \cup \partial U} \frac{F_e^2}{C_e} \ge \sum_{e \in \partial U} \frac{F_e^2}{C_e} \ge \sum_{e \in \partial U} \frac{F_e^2}{M}.$$
For every $e$ in $U \cup \partial U$, the flow $F_e$ is smaller than $\sum_{e' \in \partial U} |F_{e'}|$, so
$$F_e^2 \le \#\partial U \cdot \sum_{e' \in \partial U} F_{e'}^2 \le M \cdot \#\partial U \cdot E_{U,F}(H).$$
Therefore,
$$E_{U,F}(H') = \sum_{e \in U \cup \partial U} \frac{F_e^2}{2M \cdot (\#U + \#\partial U)^2} \le (\#U + \#\partial U) \frac{M \cdot \#\partial U \cdot E_{U,F}(H)}{2M \cdot (\#U + \#\partial U)^2} \le E_{U,F}(H).$$
Thus, by Thomson's theorem (see [18]), the effective resistance of $H'$ is not bigger than that of $H$, and we are done.
5. Critical Behavior of the Free Long-Range Random Cluster Model
We return to the critical behavior. Our goal in this section is to prove Theorem 1.6 and Corollary 1.7. We begin with the following extension of Theorem 1.6:
Theorem 5.1. Let $d < s < 2d$ and let $\{P_k\}_{k \in \mathbb{Z}^d}$ be nonnegative numbers such that $\forall k\ (P_k = P_{-k})$ and
$$\liminf_{\|k\|_1 \to \infty} \frac{P_k}{\|k\|_1^{-s}} > 0. \tag{21}$$
Let $\beta > 0$, and consider the infinite volume limit of the free random cluster model with probabilities $1 - e^{-\beta P_k}$ and with $q \ge 1$ states. Then, a.s., at $\beta_c = \inf(\beta \mid \text{a.s. there exists an infinite cluster})$ there is no infinite cluster.
We need the following extension of Lemma 2.3:
Lemma 5.2. Let $d \ge 1$. Consider an ergodic (not necessarily independent) percolation model on $\mathbb{Z}^d$ which satisfies
$$P(i \leftrightarrow j \mid B_{i,j}) \ge P_{i-j}, \tag{22}$$
where $i \leftrightarrow j$ denotes the event of having an open bond between $i$ and $j$, and $B_{i,j}$ is the $\sigma$-field created by all of the events $\{i' \leftrightarrow j'\}_{(i',j') \ne (i,j)}$. Assume further that:
(A) The distribution has the FKG property [7].
(B) A.s. there is a unique infinite cluster.
(C) There exists $d < s < 2d$ s.t. $\liminf_{\|k\| \to \infty} \frac{P_k}{\|k\|^{-s}} > 0$.
Then, for every $\epsilon > 0$ and $\rho$ there exists $N$ such that with probability bigger than $1 - \epsilon$, inside the cube $[0, N-1]^d$ there exists an open cluster which contains at least $\rho N^{\frac{s}{2}}$ vertices.
Lemma 5.2 is proved exactly the same way as Lemma 2.3. Lemma 5.2 is valid for the free random cluster model measure considered in Theorem 5.1. We can use Lemma 5.2 to prove the following:
Lemma 5.3. Let $d < s < 2d$ and let $\{P_k\}_{k \in \mathbb{Z}^d}$ be nonnegative numbers such that $\forall k\ (P_k = P_{-k})$ and
$$\liminf_{|k| \to \infty} \frac{P_k}{\|k\|_1^{-s}} > 0. \tag{23}$$
Let $\beta > 0$, and consider the infinite volume limit of the free random cluster model with probabilities $1 - e^{-\beta P_k}$. Assume that, a.s., there is an infinite cluster. Then, for every $\epsilon$ and $\rho$ there is an $N$ such that given the values (open or closed) of all of the edges that have at least one end point out of the cube $[0, N-1]^d$, the probability of having an open cluster of size $\rho N^{\frac{1}{2}s}$ within $[0, N-1]^d$ is larger than $1 - \epsilon$.
Proof. The proof follows the guideline of the proof of Lemma 2.3: Choose $\epsilon$ and $\theta$, and let $M$ be s.t. by Lemma 5.2 with probability larger than $1 - \epsilon$ there exists an open cluster of size $\sqrt{\theta} M^{\frac{1}{2}s}$ inside $[0, M-1]^d$. Let $K$ be s.t. this probability is larger than $1 - 2\epsilon$ even if all of the edges with at least one endpoint out of $[-K, K+M-1]^d$ are closed. Such $K$ exists because the free measure on $\mathbb{Z}^d$ is the limit of the free measures on $[-K, K+M-1]^d$ when $K$ tends to infinity. Now, let $R$ be a large number. Assume that all of the edges with (at least) one endpoint out of $[-K, RM+K-1]^d$ are closed. For a cube
$$C = \prod_{i=1}^{d} [l_i M, (l_i + 1)M - 1], \qquad 0 \le l_i \le R - 1,$$
in $[-K, RM+K-1]^d$, the probability of the cube to be alive, i.e. to have an open cluster of size $\sqrt{\theta} M^{\frac{1}{2}s}$, is larger than $1 - 2\epsilon$ (because of domination). The probability that there exists an open bond between two living cubes that are $k$ cubes away from each other is larger than $\eta_s(\frac{\theta}{2}, k)$. Now, we can proceed exactly as in the proof of Lemma 2.3. With $\epsilon$, $\theta$ and $R$ properly chosen, the lemma is proved.
Now, we can prove Theorem 5.1:
Proof of Theorem 5.1. Let $\{P_k\}_{k \in \mathbb{Z}^d}$ be such that for every $k$, $P_k = P_{-k}$ and such that
$$\kappa = \liminf_{\|k\|_1 \to \infty} \frac{P_k}{\|k\|_1^{-s}} > 0.$$
Let $\beta$ be s.t. for the Random Cluster Model with interactions $\{P_k\}$ and inverse temperature $\beta$ there exists, a.s., an infinite cluster. What we need to show is that there exists an $\epsilon > 0$ s.t. there exists an infinite cluster at inverse temperature $\beta - \epsilon$. For every $a$ and $b$ consider the independent percolation model $I(a, b, s)$, where every vertex exists with probability $a$ and two vertices $x$ and $y$ are attached to each other with probability $1 - e^{-b|x-y|^{-s}}$. Let $\gamma$, $\lambda$ and $\delta$ be s.t. in $I(\lambda - \delta, \gamma - \delta, s)$ there exists, a.s., an infinite cluster. Let $N$ be so large that by Lemma 5.3 with probability larger than $\lambda$ there exists a cluster of size $\rho N^{\frac{1}{2}s}$ inside $[0, N-1]^d$, where the probability is with respect to the free measure on $[0, N-1]^d$, and $\rho$ is s.t.
$$\frac{\rho^2}{2q} > \gamma. \tag{24}$$
By the choice of $\rho$ (24) we get that the probability of having an open bond between clusters of size $\rho N^{\frac{1}{2}s}$ that are located in the cubes at $Nx$ and $Ny$ is (no matter what happens in any other bond) at least $1 - e^{-\gamma \|x-y\|_1^{-s}}$. Now, let $\epsilon > 0$ be s.t. at inverse temperature $\beta - \epsilon$ the probability of having this big cluster is larger than $\lambda - \delta$, and the probability of having an open bond is larger than $1 - e^{-(\gamma - \delta)|x-y|^{-s}}$. Such $\epsilon$ exists, because the probability of any event in a finite random cluster model is a continuous function of the (inverse) temperature. When considering the renormalized model at inverse temperature $\beta - \epsilon$, it dominates $I(\lambda - \delta, \gamma - \delta, s)$, and therefore has an infinite cluster.
We can now restate and prove Corollary 1.7:
Corollary 1.7. Let $\{P_k\}_{k \in \mathbb{Z}^d}$ be nonnegative numbers s.t. $P_k = P_{-k}$ for every $k$ and s.t. $P_k \sim \|k\|_1^{-s}$ ($d < s < 2d$). Consider the Potts model with $q$ states on $\mathbb{Z}^d$, s.t. the interaction between $v$ and $u$ is $P_{v-u}$. At the critical temperature, the free measure is extremal.
Proof of Corollary 1.7. Recall the following construction of a configuration of the free measure of the Potts model: choose a configuration of the free measure of the Random Cluster model, and color each of the clusters by one of the $q$ states. The states of different clusters are independent of each other. By Theorem 1.6, there is no infinite cluster at the critical temperature. Therefore, for every $n$ and $\epsilon$ there exists $K$ s.t. with probability $1 - \epsilon$ for every $x$ s.t. $\|x\|_1 \le n$ and $y$ s.t. $\|y\|_1 \ge K$, $x$ and $y$ belong to distinct clusters. Therefore, for the Potts model, there is an event $E$ of probability bigger than $1 - \epsilon$ s.t. given $E$, the coloring of $\{x : \|x\|_1 \le n\}$ is independent of the coloring of $\{y : \|y\|_1 \ge K\}$. Therefore, the tail $\sigma$-field
$$\bigcap_{K=1}^{\infty} \sigma(v \text{ s.t. } \|v\|_1 > K)$$
is trivial, and therefore the measure is extremal.
6. Remarks and Problems
Many more questions can be asked about these clusters. One example is the volume growth rate. It can be shown that the growth of the infinite cluster is not bigger than exponential with the constant $\sum_{k \in \mathbb{Z}^d} P_k$. In the case $d < s < 2d$, the growth can be bounded from below by $\exp(n^{\phi(s)})$, for $\phi(s) = \log_2(2d/s) - \epsilon$. This can be proved as follows: if $\beta$ is large enough, then in the proof of Theorem 2.1, we may take $C_n = \exp(2^{\phi(s) \cdot n})$. Then, the $n$th degree cluster contains $\prod_k C_k$ vertices, while its diameter is at most $2^n$. This gives a lower bound of $\exp(n^{\phi(s)})$ for the growth. If $\beta$ is not so large, then by using Lemma 2.3 we can make it large enough. In the case $s = 2d$, the volume growth rate is subexponential (see [5]). In the case $s < 2d$ it is not known. So, we get a few questions on the structure of the infinite cluster.
Question 6.1. What is the volume growth rate of the infinite clusters of super-critical long-range percolation with $d < s < 2d$? Is it exponential?
Question 6.2. How many times do two independent random walk paths on the infinite cluster of long-range percolation intersect?
Question 6.3. Are there any nontrivial harmonic functions on the infinite cluster of one-dimensional long-range percolation with $d < s < 2d$?
Other questions can be asked on the critical behavior. The renormalization lemma (Lemma 2.3) is only valid when $d < s < 2d$. So, the arguments given here say nothing about the critical behavior in other cases. In the case $d = 1$ and $s = 2$, Aizenman and Newman proved that there exists an infinite cluster at criticality (see [3]). For the other cases the following questions are still open:
Question 6.4. Does critical long-range percolation have an infinite cluster when $d \ge 2$ and $s \ge 2d$?
As remarked by G. Slade, the methods used in [12] might be used to prove that for $d > 6$ and $s > d + 2$ there is no infinite cluster at criticality. This can reduce Question 6.4 to the case $2 \le d \le 6$.
Question 6.5. Does the conclusion of Theorem 1.8 hold for sequences which decay faster than those treated in Theorem 1.8 and slower than those treated by Steif and Meester ([15])? I.e., let $d \ge 2$. For which percolating $d$-dimensional arrays of probabilities $\{P_k\}_{k \in \mathbb{Z}^d}$ does there exist an $N$ s.t. the independent percolation model with probabilities
$$P'_k = \begin{cases} P_k & \|k\|_1 < N \\ 0 & \|k\|_1 \ge N \end{cases}$$
also has, a.s., an infinite cluster?
The arguments given in this paper are not strong enough to prove that there is no infinite cluster in the wired random cluster model at the critical temperature. So, the following question is still open:
Question 6.6. Is there an infinite cluster at the critical temperature in the wired random cluster model with $d < s < 2d$?
A different formulation of the same question is
Question 6.6 (Revised). Let $d \ge 1$ and let $d < s < 2d$. Let $\{P_k\}_{k \in \mathbb{Z}^d}$ be s.t. $P_k = P_{-k}$ for every $k$ and s.t. $P_k \sim \|k\|_1^{-s}$. Consider the Potts model (with $q$ states) with interaction $P_{u-v}$ between $u$ and $v$. Let $\beta$ be the critical inverse temperature for this model. Is there a unique Gibbs measure at inverse temperature $\beta$?
Question 6.6 is related to the question whether the free and the wired measures agree at the critical point. Conjecturing that for high values of $q$, the number of states, the critical wired measure has an infinite cluster, we get the conjecture that the two measures do not agree at the critical point.
Acknowledgements. First, I thank Yuval Peres and Itai Benjamini for presenting these problems to me and for helping me during the research. I also wish to thank Omer Angel and Elchanan Mossel for helpful suggestions. I thank Jeff Steif for his help in improving the exposition of the paper and for presenting to me the question leading to Theorem 1.8. I thank Michael Aizenman for useful and interesting discussions.
References
1. Aizenman, M., Chayes, J.T., Chayes, L. and Newman, C.M.: Discontinuity of Magnetization in One Dimensional $1/|x-y|^2$ Ising and Potts Models. J. Stat. Phys. 50, 1–41 (1988)
2. Aizenman, M. and Fernández, R.: Critical Exponents for Long-Range Interactions. Lett. Math. Phys. 16, 39–49 (1988)
3. Aizenman, M. and Newman, C.M.: Discontinuity of the Percolation Density in One Dimensional $1/|x-y|^2$ Percolation Models. Commun. Math. Phys. 107, 611–647 (1986)
4. Angel, O., Benjamini, I., Berger, N., Peres, Y.: Transience of percolation clusters on wedges. Preprint (2001)
5. Benjamini, I. and Berger, N.: In Preparation (2001)
6. Benjamini, I., Pemantle, R. and Peres, Y.: Unpredictable paths and percolation. Ann. Probab. 26, no. 3, 1198–1211 (1998)
7. Fortuin, C.M., Kasteleyn, P.W. and Ginibre, J.: Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89–103 (1971)
8. Gandolfi, A., Keane, M.S. and Newman, C.M.: Uniqueness of the infinite component in a random graph with applications to percolation and spin glasses. Probab. Theory Related Fields 92, 511–527 (1992)
9. Jespersen, S. and Blumen, A.: Small-world networks: Links with long-tailed distributions. Phys. Rev. E 62, 6270–6274 (2000)
10. Grimmett, G.R., Kesten, H. and Zhang, Y.: Random walk on the infinite cluster of the percolation model. Probab. Th. Rel. Fields 96, 33–44 (1993)
11. Häggström, O. and Mossel, E.: Nearest-neighbor walks with low predictability profile and percolation in $2 + \epsilon$ dimensions. Ann. Probab. 26, 1212–1231 (1998)
12. Hara, T. and Slade, G.: Critical Behavior for Percolation in High Dimensions. Commun. Math. Phys. 128, 333–391 (1990)
13. Hoffman, C., Mossel, E.: Energy of flows on percolation clusters. Annals of Potential Analysis. To appear (1998)
14. Levin, D. and Peres, Y.: Energy and cutsets in infinite percolation clusters. Proceedings of the Cortona Workshop on Random Walks and Discrete Potential Theory, M. Picardello and W. Woess (Eds). Cambridge: Cambridge Univ. Press, 1998
15. Meester, R. and Steif, J.E.: On the continuity of the critical value for long range percolation in the exponential case. Commun. Math. Phys. 180, 483–504 (1996)
16. Newman, C.M. and Schulman, L.S.: One Dimensional $1/|j-i|^s$ Percolation Models: The Existence of a Transition for $s \le 2$. Commun. Math. Phys. 104, 547–571 (1986)
17. Pemantle, R. and Peres, Y.: On which graphs are all random walks in random environments transient? In: Random Discrete Structures, IMA Volume 76, D. Aldous and R. Pemantle (Eds), Berlin–Heidelberg–New York: Springer-Verlag, 1996
18. Peres, Y.: Probability on trees: an introductory climb. Lectures on probability theory and statistics (Saint-Flour, 1997), Lecture Notes in Math. 1717. Berlin: Springer, 1999, pp. 193–280
19. Schulman, L.S.: Long-range percolation in one dimension. J. Phys. A 16, no. 17, L639–L641 (1983)
Communicated by M. Aizenman
Commun. Math. Phys. 226, 567 – 605 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
On the Inverse Scattering Problem for Jacobi Matrices with the Spectrum on an Interval, a Finite System of Intervals or a Cantor Set of Positive Length A. Volberg, P. Yuditskii Dept. of Mathematics, Michigan State University, East Lansing, MI 48824, USA. E-mail: [email protected]; [email protected] Received: 7 September 2001 / Accepted: 3 December 2001
Abstract: When solving the inverse scattering problem for a discrete Sturm–Liouville operator with a rapidly decreasing potential, one gets reflection coefficients $s_\pm$ and invertible operators $I + H_{s_\pm}$, where $H_{s_\pm}$ is the Hankel operator related to the symbol $s_\pm$. The Marchenko–Faddeev theorem [8] (in the continuous case, for the discrete case see [4, 6]) guarantees the uniqueness of the solution of the inverse scattering problem. In this article we ask the following natural question: can one find a precise condition guaranteeing that the inverse scattering problem is uniquely solvable and that the operators $I + H_{s_\pm}$ are invertible? Can one claim that uniqueness implies invertibility or vice versa? Moreover, we are interested here not only in the case of decreasing potential but also in the case of asymptotically almost periodic potentials. So we merge here the two most developed cases of the inverse problem for Sturm–Liouville operators: the inverse problem with (almost) periodic potential and the inverse problem with the fast decreasing potential.
Main Results The asymptotics of polynomials orthogonal on a homogeneous set, which we described earlier [10], indicated strongly that there should be a scattering theory for Jacobi matrices with an almost periodic background as it exists in the classical case of a constant background. Note that in this case left and right asymptotics are not necessarily the same almost periodic coefficient sequences, but they are of the same spectral class. In this work, we present all principal ingredients of such a theory: reflection/transmission coefficients, Gelfand–Levitan–Marchenko transformation operators, a Riemann–Hilbert problem related to the inverse scattering problem. Now we can say finally that the reflectionless Jacobi matrices with homogeneous spectrum are those whose reflection coefficient is zero.
Moreover, we extend the theory in depth and show that a reflection coefficient determines uniquely a Jacobi matrix of the Szegö class, and both transformation operators are invertible if and only if the spectral density satisfies the matrix $A_2$ condition [13]. Concerning the $A_2$ condition in the inverse scattering, we have to mention, at least as indirect references, [9, Chapter 2, Sect. 4] and [2]. Generally, references to stationary scattering and inverse scattering problems in connection with spatial asymptotics can be found in [5], where explicit expressions of reflection and transmission coefficients in terms of Weyl functions and phases, asymptotic wave functions were given. Reference [12] gives a complete introduction to Jacobi operators, their spectral and perturbation theories.
Let $J$ be a Jacobi matrix defining a bounded self-adjoint operator on $l^2(\mathbb{Z})$:
$$J e_n = p_n e_{n-1} + q_n e_n + p_{n+1} e_{n+1}, \qquad n \in \mathbb{Z}, \tag{0.1}$$
where $\{e_n\}$ is the standard basis in $l^2(\mathbb{Z})$, $p_n > 0$. The resolvent matrix-function is defined by the relation
$$R(z) = R(z, J) = E^* (J - z)^{-1} E, \tag{0.2}$$
where $E : \mathbb{C}^2 \to l^2(\mathbb{Z})$ in such a way that
$$E \begin{bmatrix} c_{-1} \\ c_0 \end{bmatrix} = e_{-1} c_{-1} + e_0 c_0.$$
This matrix-function possesses an integral representation
$$R(z) = \int \frac{d\sigma}{x - z} \tag{0.3}$$
with a $2 \times 2$ matrix-measure having compact support on $\mathbb{R}$. $J$ is unitarily equivalent to the operator of multiplication by an independent variable on
$$L^2_{d\sigma} = \left\{ f = \begin{bmatrix} f_{-1}(x) \\ f_0(x) \end{bmatrix} : \int f^* \, d\sigma \, f < \infty \right\}.$$
The spectrum of $J$ is called absolutely continuous if the measure $d\sigma$ is absolutely continuous with respect to the Lebesgue measure on the real axis,
$$d\sigma(x) = \rho(x) \, dx. \tag{0.4}$$
Let $J_0$ be a Jacobi matrix with constant coefficients, $p_n = 1$, $q_n = 0$ (the so-called Chebyshev matrix). It has the following functional representation, besides the general one mentioned above. Note that the resolvent set of $J_0$ is the domain $\bar{\mathbb{C}} \setminus [-2, 2]$. Let $z(\zeta) : \mathbb{D} \to \bar{\mathbb{C}} \setminus [-2, 2]$ be a uniformization of this domain, $z(\zeta) = 1/\zeta + \zeta$. With respect to the standard basis $\{t^n\}_{n \in \mathbb{Z}}$ in
$$L^2 = \left\{ f(t) : \int_{\mathbb{T}} |f|^2 \, dm < \infty \right\},$$
the matrix of the operator of multiplication by $z(t)$, $t \in \mathbb{T}$, is the Jacobi matrix $J_0$, since $z(t) t^n = t^{n-1} + t^{n+1}$. The famous Bernstein–Szegö theorem implies the following proposition (for a matrix modification of the Szegö condition, see [1]).
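The functional model of the Chebyshev matrix can be checked numerically. The sketch below is an illustration only; the function names and the dictionary representation of a Laurent polynomial are assumptions made here. It verifies that $z(t) = 1/t + t$ is real and lies in $[-2, 2]$ on the unit circle, and that multiplication by $z(t)$ acts on the basis $\{t^n\}$ as the Jacobi matrix with $p_n = 1$, $q_n = 0$.

```python
import cmath
import math


def z(zeta):
    """The uniformization z(zeta) = 1/zeta + zeta."""
    return 1 / zeta + zeta


def circle_point(theta):
    """The point t = exp(i * theta) on the unit circle."""
    return cmath.exp(1j * theta)


def multiply_by_z(coeffs):
    """Multiply a Laurent polynomial {n: coefficient of t^n} by z(t) = 1/t + t."""
    out = {}
    for n, c in coeffs.items():
        out[n - 1] = out.get(n - 1, 0) + c
        out[n + 1] = out.get(n + 1, 0) + c
    return out


t = circle_point(0.7)
# On the unit circle z(t) = 2 cos(theta), a real number in [-2, 2].
assert abs(z(t) - 2 * math.cos(0.7)) < 1e-12
# z(t) t^n = t^(n-1) + t^(n+1): the action of J_0 on the basis vector e_n.
for n in range(-3, 4):
    assert abs(z(t) * t ** n - (t ** (n - 1) + t ** (n + 1))) < 1e-12
assert multiply_by_z({0: 1}) == {-1: 1, 1: 1}
```

In coefficient form, `multiply_by_z` shifts every coefficient one step in each direction, which is exactly relation (0.1) with constant coefficients.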
Proposition 0.1. Let $J$ be a Jacobi matrix whose spectrum is an interval $[-2, 2]$. Assume that the spectrum is absolutely continuous and the density of the spectral measure satisfies the condition
\[ \log \det \rho(z(t)) \in L^1 . \tag{0.5} \]
Then
\[ p_n \to 1, \quad q_n \to 0, \quad n \to \pm\infty. \tag{0.6} \]
Moreover, there exist generalized eigenvectors
\[ \begin{aligned} p_n e^+(n-1, t) + q_n e^+(n, t) + p_{n+1} e^+(n+1, t) &= z(t) e^+(n, t), \\ p_n e^-(-n, t) + q_n e^-(-n-1, t) + p_{n+1} e^-(-n-2, t) &= z(t) e^-(-n-1, t), \end{aligned} \tag{0.7} \]
such that the following asymptotics hold true:
\[ \begin{aligned} s(t) e^\pm(n, t) &= s(t) t^n + o(1), & n \to +\infty, \\ s(t) e^\pm(n, t) &= t^n + s_\mp(t) t^{-n-1} + o(1), & n \to -\infty \end{aligned} \tag{0.8} \]
in $L^2$.

To clarify the meaning of the words "generalized eigenvectors", we need some definitions and notation. The matrix
\[ S(t) = \begin{bmatrix} s_- & s \\ s & s_+ \end{bmatrix}(t) \tag{0.9} \]
is called the scattering matrix-function. It is a unitary-valued matrix-function with the following symmetry property:
\[ S^*(\bar t) = S(t), \tag{0.10} \]
and analytic property:
\[ s(t) \text{ is boundary values of an outer function.} \tag{0.11} \]
We still denote by $s(\zeta)$, $\zeta \in \mathbb{D}$, the values of the function inside the disk, and subsequently, we assume that $s$ meets the normalization condition $s(0) > 0$. In fact, this means that each of the entries $s_\pm$ (the so-called reflection coefficient) determines the matrix $S(t)$ in a unique way. Indeed, since $|s(t)|^2 + |s_\pm(t)|^2 = 1$, using (0.11), we have
\[ s(\zeta) = \exp\left\{ \frac{1}{2} \int_{\mathbb{T}} \frac{t + \zeta}{t - \zeta} \log\{1 - |s_\pm(t)|^2\} \, dm \right\}. \tag{0.12} \]
Then, we can solve for $s_\mp$ the relation
\[ \bar s_+ s + \bar s s_- = 0. \tag{0.13} \]
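As a rough numerical illustration of (0.12) and (0.13), the sketch below evaluates the outer function by the trapezoid rule on the circle and assembles the scattering matrix. The function names, the quadrature size and the toy choice of a constant reflection coefficient $s_+ \equiv 0.6$ (for which $s \equiv 0.8$) are assumptions made for this example only.

```python
import numpy as np

def outer_function_at(zeta, s_plus, num_nodes=4096):
    """Evaluate s(zeta) from (0.12) by the trapezoid rule on the unit circle."""
    theta = 2.0 * np.pi * np.arange(num_nodes) / num_nodes
    t = np.exp(1j * theta)
    log_weight = np.log(1.0 - np.abs(s_plus(t)) ** 2)
    kernel = (t + zeta) / (t - zeta)
    # dm is the normalized Lebesgue measure, so the integral is a plain mean.
    return np.exp(0.5 * np.mean(kernel * log_weight))

def scattering_matrix(s, s_plus):
    """Assemble S = [[s_-, s], [s, s_+]] with s_- solved from (0.13)."""
    s_minus = -np.conj(s_plus) * s / np.conj(s)
    return np.array([[s_minus, s], [s, s_plus]])

# Hypothetical reflection coefficient: the constant 0.6, so |s|^2 = 0.64.
s_at_origin = outer_function_at(0.0, lambda t: 0.6 + 0.0 * t)
s_inside = outer_function_at(0.3, lambda t: 0.6 + 0.0 * t)
S = scattering_matrix(0.8, 0.6)
unitarity_defect = np.max(np.abs(S @ S.conj().T - np.eye(2)))
```

For a non-constant $s_+$ the same routine applies pointwise, with $s_-$ computed from the boundary values of $s$ at each $t$.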
A. Volberg, P. Yuditskii
With the function $s_\pm$ we associate the metric
\[ \|f\|^2_{s_\pm} = \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_\pm(t)} \\ s_\pm(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix} \right\rangle = \langle f(t) + \bar t (s_\pm f)(\bar t), f(t) \rangle, \quad f \in L^2 . \]
Note that the conditions (0.11), (0.12) guarantee that $\|f\|_{s_\pm} = 0$ implies $f = 0$. We denote by $L^2_{dm, s_\pm}$ or $L^2_{s_\pm}$ (for shortness) the closure of $L^2$ with respect to this new metric. The following relation sets a unitary map from $L^2_{s_+}$ to $L^2_{s_-}$:
\[ s(t) f^-(t) = \bar t f^+(\bar t) + s_+(t) f^+(t), \]
moreover, in this case,
\[ \|f^+\|^2_{s_+} = \|f^-\|^2_{s_-} = \frac{1}{2} \{ \|s f^+\|^2 + \|s f^-\|^2 \}, \]
and the inverse map is of the form
\[ s(t) f^+(t) = \bar t f^-(\bar t) + s_-(t) f^-(t). \]
We say that a Jacobi matrix $J$ with the spectrum $[-2, 2]$ is of Szegö class if its spectral measure $d\sigma$ satisfies (0.4), (0.5).

Theorem 0.1. Let $J$ be a Jacobi matrix of Szegö class with the spectrum $E = [-2, 2]$. Then $J$ possesses the scattering representation, i.e.: there exists a unique unitary-valued matrix-function $S(t)$ of the form (0.9) with the properties (0.10), (0.11), and a unique pair of Fourier transforms
\[ F^\pm : l^2(\mathbb{Z}) \to L^2_{s_\pm}, \quad (F^\pm J f)(t) = z(t)(F^\pm f)(t), \tag{0.14} \]
determining each other by the relations
\[ s(t)(F^\pm f)(t) = \bar t (F^\mp f)(\bar t) + s_\mp(t)(F^\mp f)(t), \tag{0.15} \]
and having the following analytic properties
\[ s F^\pm(l^2(\mathbb{Z}_\pm)) \subset H^2, \tag{0.16} \]
and asymptotic properties
\[ e^\pm(n, t) = t^n + o(1) \ \text{in } L^2_{s_\pm}, \quad n \to +\infty, \tag{0.17} \]
where
\[ e^+(n, t) = (F^+ e_n)(t), \quad e^-(n, t) = (F^- e_{-n-1})(t). \]
(As before, $\{e_n\}$ is the standard basis in $l^2(\mathbb{Z})$.)
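Remark 0.1. Let us show that (0.17) is equivalent to (0.8). Due to
\[ \begin{bmatrix} 1 & \bar s_\pm \\ s_\pm & 1 \end{bmatrix} = \begin{bmatrix} |s|^2 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} \bar s_\pm \\ 1 \end{bmatrix} \begin{bmatrix} s_\pm & 1 \end{bmatrix}, \tag{0.18} \]
(0.17) is equivalent to ($n \to +\infty$)
\[ s(t) e^\pm(n, t) = s(t) t^n + o(1) \quad \text{in } L^2, \]
\[ s_\pm(t) e^\pm(n, t) + \bar t e^\pm(n, \bar t) = s_\pm(t) t^n + \bar t^{\,n+1} + o(1) \quad \text{in } L^2 . \]
Using (0.15), we rewrite the second relation into the form
\[ s(t) e^\mp(-n-1, t) = t^{-n-1} + s_\pm(t) t^n + o(1) \quad \text{in } L^2 . \]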
Substituting $n := -n - 1$, we get the second relation of (0.8).

A fundamental question is how to recover the Jacobi matrix from the scattering matrix, in fact, from the reflection coefficient $s_+$ (or $s_-$)? When can this be done? Do we have a uniqueness theorem? We show that for an arbitrary function $s_+(t)$ satisfying
\[ \overline{s_+(\bar t)} = s_+(t) \quad \text{and} \quad \log\{1 - |s_+(t)|^2\} \in L^1, \tag{0.19} \]
there exists a Jacobi matrix $J$ of Szegö class such that $s_+(t)$ is its reflection coefficient in the scattering representation. However we can construct a matrix with this property, at least, in two different ways. First, consider the space
\[ H^2_{s_+} = \operatorname{clos}_{L^2_{s_+}} H^2, \]
and introduce the Hankel operator
\[ H_{s_+} : H^2 \to H^2, \quad H_{s_+} f = P_+ \bar t (s_+ f)(\bar t), \quad f \in H^2, \]
where $P_+$ is the Riesz projection from $L^2$ onto $H^2$. This operator determines the metric in $H^2_{s_+}$:
\[ \|f\|^2_{s_+} = \langle f(t) + \bar t (s_+ f)(\bar t), f(t) \rangle = \langle (I + H_{s_+}) f, f \rangle, \quad \forall f \in H^2 . \]

Lemma 0.1. Under the assumptions (0.19), the space $H^2_{s_+}$ is a space of holomorphic functions with a reproducing kernel. Moreover, $sf \in H^2$ for any $f \in H^2_{s_+}$, and the reproducing vector $k_{s_+}$:
\[ \langle f, k_{s_+} \rangle = f(0), \quad \forall f \in H^2_{s_+}, \]
is of the form
\[ k_{s_+} = (I + H_{s_+})^{[-1]} 1 := \lim_{\varepsilon \to 0+} (\varepsilon + I + H_{s_+})^{-1} 1 \quad \text{in } L^2_{s_+} . \]
Put
\[ K_{s_+}(t) = k_{s_+}(t) / \sqrt{k_{s_+}(0)} . \tag{0.20} \]
Theorem 0.2. Let $s_+(t)$ satisfy (0.19). Then the system of functions $\{t^n K_{s_+ t^{2n}}(t)\}_{n\in\mathbb{Z}}$ forms an orthonormal basis in $L^2_{s_+}$. With respect to this basis, operator multiplication by $z(t)$ is a Jacobi matrix $J$ of Szegö class. Moreover, the initial function $s_+(t)$ is the reflection coefficient of the scattering matrix-function $S(t)$, associated to $J$ by Theorem 0.1, and $e^+(n, t) = t^n K_{s_+ t^{2n}}(t)$.

On the other hand, the system of functions $\{t^n K_{s_- t^{2n}}(t)\}_{n\in\mathbb{Z}}$ forms an orthonormal basis in $L^2_{s_-}$, and we are able to define a Jacobi matrix $\tilde J$ by the relation
\[ z(t) \tilde e^+(n, t) = \tilde p_n \tilde e^+(n-1, t) + \tilde q_n \tilde e^+(n, t) + \tilde p_{n+1} \tilde e^+(n+1, t), \]
where $\{\tilde e^+(n, t)\}$ is the dual system to the system $\{t^n K_{s_- t^{2n}}(t)\}$ (see (0.15)), i.e.:
\[ s(t) \tilde e^+(-n-1, t) = \bar t^{\,n+1} K_{s_- t^{2n}}(\bar t) + s_-(t) t^n K_{s_- t^{2n}}(t). \]
Even the invertibility condition for the operators $(I + H_{s_\pm})$ does not guarantee that operators $J$ and $\tilde J$ are the same (see the Example at the end of Sect. 2). But if $J = \tilde J$, then the uniqueness theorem takes place.

Theorem 0.3. Let $s_+$ satisfy (0.19). Then the reflection coefficient $s_+$ determines a Jacobi matrix $J$ of Szegö class in a unique way if and only if the following relations take place
\[ s(0) K_{s_\pm}(0) K_{s_\mp t^{-2}}(0) = 1. \tag{0.21} \]

Corollary 0.1. Let $J$ be a Jacobi matrix of Szegö class with the spectrum $[-2, 2]$ and let $\rho$ be the density of its spectral measure. If
\[ \int_{-2}^{2} \rho^{-1}(x) \, dx < \infty, \]
then there is no other Jacobi matrix of Szegö class with the same scattering matrix-function $S(t)$.

It is important to know when the operators $(I + H_{s_\pm})$, playing a central role in the inverse scattering problem, are invertible in the proper sense of the word.

Theorem 0.4. Let $J$ be a Jacobi matrix of Szegö class with the spectrum $[-2, 2]$. Let $\rho$ be the density of its spectral measure and let $s_+$ be the reflection coefficient of its scattering matrix-function. Then the following statements are equivalent.
1. The spectral density $\rho$ satisfies condition $A_2$.
2. The reflection coefficient $s_+$ determines a Jacobi matrix of Szegö class uniquely and both operators $(I + H_{s_\pm})$ are invertible.

To extend these results to the case when a spectrum $E$ is a finite system of intervals or a standard Cantor set of positive measure [15], see also [3], we need only to introduce a counterpart of the Hardy space.

Let $z(\zeta) : \mathbb{D} \to \Omega$ be a uniformization of the domain $\Omega = \bar{\mathbb{C}} \setminus E$. Thus there exists a discrete subgroup $\Gamma$ of the group $SU(1, 1)$ consisting of elements of the form
\[ \gamma = \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix}, \quad \gamma_{11} = \overline{\gamma_{22}}, \quad \gamma_{12} = \overline{\gamma_{21}}, \quad \det \gamma = 1, \]
such that $z(\zeta)$ is automorphic with respect to $\Gamma$, i.e., $z(\gamma(\zeta)) = z(\zeta)$, $\forall \gamma \in \Gamma$, and any two preimages of $z_0 \in \Omega$ are $\Gamma$-equivalent, i.e.,
\[ z(\zeta_1) = z(\zeta_2) \Rightarrow \exists \gamma \in \Gamma : \zeta_1 = \gamma(\zeta_2). \]
We normalize $z(\zeta)$ by the conditions $z(0) = \infty$, $(\zeta z)(0) > 0$.

A character of $\Gamma$ is a complex-valued function $\alpha : \Gamma \to \mathbb{T}$, satisfying
\[ \alpha(\gamma_1 \gamma_2) = \alpha(\gamma_1)\alpha(\gamma_2), \quad \gamma_1, \gamma_2 \in \Gamma. \]
The characters form an Abelian compact group denoted by $\Gamma^*$. For a given character $\alpha \in \Gamma^*$, as usual let us define
\[ H^\infty(\Gamma, \alpha) = \{ f \in H^\infty : f(\gamma(\zeta)) = \alpha(\gamma) f(\zeta), \ \forall \gamma \in \Gamma \}. \]
Generally, a group $\Gamma$ is said to be of Widom type if for any $\alpha \in \Gamma^*$ the space $H^\infty(\Gamma, \alpha)$ is not trivial (contains a non-constant function).

A group of Widom type acts dissipatively on $\mathbb{T}$ with respect to $dm$, that is there exists a measurable (fundamental) set $\mathbb{E}$, which does not contain any two $\Gamma$-equivalent points, and the union $\cup_{\gamma\in\Gamma} \gamma(\mathbb{E})$ is a set of full measure. We can choose $\mathbb{E}$ possessing the symmetry property: $t \in \mathbb{E} \Rightarrow \bar t \in \mathbb{E}$. For the space of square summable functions on $\mathbb{E}$ (with respect to the Lebesgue measure), we use the notation $L^2_{dm|\mathbb{E}}$.

Let $f$ be an analytic function in $\mathbb{D}$, $\gamma \in \Gamma$ and $k \in \mathbb{N}$. Then we put
\[ f|[\gamma]_k = \frac{f(\gamma(\zeta))}{(\gamma_{21}\zeta + \gamma_{22})^k} . \]
Notice that $f|[\gamma]_2 = f$ $\forall \gamma \in \Gamma$ means that the form $f(\zeta)\,d\zeta$ is invariant with respect to the substitutions $\zeta \to \gamma(\zeta)$ ($f(\zeta)\,d\zeta$ is an Abelian integral on $\mathbb{D}/\Gamma$). Analogically, $f|[\gamma] = \alpha(\gamma) f$ $\forall \gamma \in \Gamma$ means that the form $|f(\zeta)|^2 |d\zeta|$ is invariant with respect to these substitutions.

We recall that a function $f(\zeta)$ is of Smirnov class, if it can be represented as a ratio of two functions from $H^\infty$ with an outer denominator.

Definition. Let $\Gamma$ be a group of Widom type. The space $A^2_1(\Gamma, \alpha)$ ($A^1_2(\Gamma, \alpha)$) is formed by functions $f$, which are analytic on $\mathbb{D}$ and satisfy the following three conditions
1) $f$ is of Smirnov class,
2) $f|[\gamma] = \alpha(\gamma) f$ ($f|[\gamma]_2 = \alpha(\gamma) f$) $\forall \gamma \in \Gamma$,
3) $\int_{\mathbb{E}} |f|^2 \, dm < \infty$ ($\int_{\mathbb{E}} |f| \, dm < \infty$).

$A^2_1(\Gamma, \alpha)$ is a Hilbert space with the reproducing kernel $k^\alpha(\zeta, \zeta_0)$, moreover
\[ 0 < \inf_{\alpha\in\Gamma^*} k^\alpha(\zeta_0, \zeta_0) \le \sup_{\alpha\in\Gamma^*} k^\alpha(\zeta_0, \zeta_0) < \infty. \tag{W} \]
Put
\[ k^\alpha(\zeta) = k^\alpha(\zeta, 0) \quad \text{and} \quad K^\alpha(\zeta) = \frac{k^\alpha(\zeta)}{\sqrt{k^\alpha(0)}} . \]
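We need one more special function. The Blaschke product
\[ b(\zeta) = \zeta \prod_{\gamma\in\Gamma,\ \gamma \ne 1_2} \frac{\gamma(0) - \zeta}{1 - \overline{\gamma(0)}\zeta} \, \frac{|\gamma(0)|}{\gamma(0)} \]
is called the Green's function of $\Gamma$ with respect to the origin. It is a character-automorphic function, i.e., there exists $\mu \in \Gamma^*$ such that $b \in H^\infty(\Gamma, \mu)$. Note, if $G(z) = G(z, \infty)$ denotes the Green's function of the domain $\Omega$, then
\[ G(z(\zeta)) = -\log|b(\zeta)|. \]

Theorem ([7]). Let $\Gamma$ be a group of Widom type. The following statements are equivalent:
(1) The function $K^\alpha(0)$ is continuous on $\Gamma^*$.
(2) $\sup\{|f(0)| : f \in H^\infty(\Gamma, \alpha), \ \|f\| \le 1\} \to 1$, $\alpha \to 1_{\Gamma^*}$.
(3) The Direct Cauchy Theorem holds:
\[ \int_{\mathbb{E}} \frac{f}{b}(t) \, \frac{dt}{2\pi i} = \frac{f}{b'}(0), \quad \forall f \in A^1_2(\Gamma, \mu). \tag{DCT} \]
(4) Let $\overline{tA^2_1(\Gamma, \alpha^{-1})} = \{ g = \overline{tf} : f \in A^2_1(\Gamma, \alpha^{-1}) \}$. Then
\[ L^2_{dm|\mathbb{E}} = \overline{tA^2_1(\Gamma, \alpha^{-1})} \oplus A^2_1(\Gamma, \alpha) \quad \forall \alpha \in \Gamma^* . \]
(5) Every invariant subspace $M \subset A^2_1(\Gamma, \alpha)$ (i.e. $\phi M \subset M$ $\forall \phi \in H^\infty(\Gamma)$) is of the form
\[ M = \Delta A^2_1(\Gamma, \beta^{-1}\alpha) \]
for some character-automorphic inner function $\Delta \in H^\infty(\Gamma, \beta)$.

Definition ([3]). A measurable set $E$ is homogeneous if there is an $\eta > 0$ such that
\[ |(x - \delta, x + \delta) \cap E| \ge \eta\delta \quad \text{for all } 0 < \delta < 1 \text{ and all } x \in E. \tag{C} \]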
A standard Cantor set of positive length is an example of a homogeneous set [3], see also [10]. Let $E$ be a homogeneous set, then the domain $\Omega = \bar{\mathbb{C}} \setminus E$ (respectively the group $\Gamma$) is of Widom type and the Direct Cauchy Theorem holds.

Recall that a sequence of real numbers $\{p_n\} \in l^\infty(\mathbb{Z})$ is called uniformly almost periodic if the set of sequences $\{\{p_{n+l}\}, \ l \in \mathbb{Z}\}$ is precompact in $l^\infty(\mathbb{Z})$. The general way to produce a sequence of this type looks as follows: let $G$ be a compact Abelian group, and let $f(g)$ be a continuous function on $G$, then
\[ p_n := f(g_0 + n g_1), \quad g_0, g_1 \in G, \]
is an almost periodic sequence. A Jacobi matrix is almost periodic if the coefficient sequences are almost periodic. We denote by $J(E)$ the class of almost periodic Jacobi matrices with absolutely continuous homogeneous spectrum $E$. In fact, if $E = [-2, 2]$ then $J(E) = \{J_0\}$. In what follows the class $J(E)$ will substitute the Chebyshev matrix in the case when the spectrum $E$ is not an interval but a general homogeneous set. First of all this class can be described as follows.
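The construction $p_n = f(g_0 + n g_1)$ can be tried out on the simplest compact Abelian group, the circle $\mathbb{R}/\mathbb{Z}$. The sketch below is only an illustration under assumptions chosen here: the function $f(g) = 1 + 0.1\cos(2\pi g)$, the golden-ratio rotation $g_1$ and the finite window are hypothetical, and a finite window can only suggest, not prove, precompactness of the set of shifts.

```python
import numpy as np

GOLDEN_ROTATION = (np.sqrt(5.0) - 1.0) / 2.0  # hypothetical choice of g1

def almost_periodic_sequence(indices, g0=0.0, g1=GOLDEN_ROTATION):
    """p_n = f(g0 + n*g1) on the circle group R/Z with f(g) = 1 + 0.1*cos(2*pi*g)."""
    phase = g0 + np.asarray(indices, dtype=float) * g1
    return 1.0 + 0.1 * np.cos(2.0 * np.pi * phase)

def shift_distance(shift, window=np.arange(-2000, 2001)):
    """Sup-norm distance between {p_n} and {p_{n+shift}} over a finite window."""
    return np.max(np.abs(almost_periodic_sequence(window + shift)
                         - almost_periodic_sequence(window)))

# 89 is a Fibonacci number, so 89 * g1 is close to an integer and the shift
# by 89 is an approximate period; the shift by 1 is not.
distance_fibonacci = shift_distance(89)
distance_generic = shift_distance(1)
```

Shifts for which $l g_1$ is nearly an integer reproduce the sequence up to a small uniform error, which is the mechanism behind almost periodicity in this example.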
Theorem ([11]). Let $E$ be a homogeneous set. Let $z : \mathbb{D} \to \bar{\mathbb{C}} \setminus E$ be a uniformizing mapping. Then the systems of functions $\{b^n K^{\alpha\mu^{-n}}\}_{n\in\mathbb{Z}_+}$ and $\{b^n K^{\alpha\mu^{-n}}\}_{n\in\mathbb{Z}}$ form an orthonormal basis in $A^2_1(\Gamma, \alpha)$ and in $L^2_{dm|\mathbb{E}}$, respectively, for any $\alpha \in \Gamma^*$. With respect to this basis, the operator multiplication by $z(t)$ is a three-diagonal almost periodic Jacobi matrix $J(\alpha)$. Moreover,
\[ J(E) = \{ J(\alpha) : \alpha \in \Gamma^* \}, \]
and $J(\alpha)$ is a continuous function on $\Gamma^*$.

We say that a Jacobi matrix $J$ with the spectrum $E$ is of Szegö class if its spectral measure is absolutely continuous, $d\sigma(x) = \rho(x)\,dx$, and $\rho(z(t))$ satisfies (0.5).

Theorem 0.5. Let $J$ be a Jacobi matrix of Szegö class with a homogeneous spectrum $E$. Then $J$ possesses the scattering representation, i.e.: there exists a unique unitary-valued matrix-function $S(t)$ of the form (0.9) with the properties (0.10), (0.11), and a unique pair of Fourier transforms
\[ F^\pm : l^2(\mathbb{Z}) \to L^2_{dm|\mathbb{E}, s_\pm}, \quad (F^\pm J f)(t) = z(t)(F^\pm f)(t), \tag{0.22} \]
determining each other by the relations
\[ s(t)(F^\pm f)(t) = \bar t (F^\mp f)(\bar t) + s_\mp(t)(F^\mp f)(t), \tag{0.23} \]
and having the following analytic properties:
\[ s F^\pm(l^2(\mathbb{Z}_\pm)) \subset A^2_1(\Gamma, \alpha_\mp^{-1}), \tag{0.24} \]
and asymptotic properties
\[ e^\pm(n, t) = b^n(t) K^{\alpha_\pm \mu^{-n}}(t) + o(1) \ \text{in } L^2_{dm|\mathbb{E}, s_\pm}, \quad n \to +\infty, \tag{0.25} \]
where
\[ e^+(n, t) = (F^+ e_n)(t), \quad e^-(n, t) = (F^- e_{-n-1})(t), \]
and $L^2_{dm|\mathbb{E}, s_\pm}$ is the closure of the functions from $L^2_{dm|\mathbb{E}}$ with respect to the metric
\[ \|f\|^2_{s_\pm} = \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_\pm(t)} \\ s_\pm(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix} \right\rangle, \quad f \in L^2_{dm|\mathbb{E}} . \]
Theorems 0.2–0.4 also have their closely parallel counterparts in the case when the spectrum is a homogeneous set, see Theorems 1.1, 2.1 and 3.1 in combination with Theorem 4.1. We finish this paper with a remark on a connection between this new type inverse scattering problem and a Riemann–Hilbert problem.
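1. In the Model Space

Let $E$ be a homogeneous set. Let $z(\zeta) : \mathbb{D}/\Gamma \sim \bar{\mathbb{C}} \setminus E$ be a uniformization and $b(\zeta)$ be the Green's function. Throughout the paper we assume that $(bz)(0) = 1$. Let $\mathbb{E} \subset \mathbb{T}$ be a symmetric fundamental set ($t \in \mathbb{E} \Rightarrow \bar t \in \mathbb{E}$). With a function $s_+(t) \in L^\infty_{dm|\mathbb{E}}$ such that
\[ \overline{s_+(\bar t)} = s_+(t) \quad \text{and} \quad 1 - |s_+(t)|^2 > 0 \ \text{a.e. on } \mathbb{E}, \tag{1.1} \]
we associate the metric
\[ \|f\|^2_{s_+} = \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix} \right\rangle = \langle f(t) + \bar t (s_+ f)(\bar t), f(t) \rangle, \quad f \in L^2_{dm|\mathbb{E}} . \]
Condition (1.1) guarantees that $\|f\|_{s_+} = 0$ implies $f = 0$. We denote by $L^2_{dm|\mathbb{E}, s_+}$ or $L^2_{s_+}$ (for shortness) the closure of $L^2_{dm|\mathbb{E}}$ with respect to this metric.

Lemma 1.1. The operator multiplication by $z(t)$ in $L^2_{s_+}$ is unitary equivalent to the operator multiplication by $z(t)$ in $L^2_{dm|\mathbb{E}}$.

Proof. Let us put
\[ \begin{bmatrix} g(t) \\ \bar t g(\bar t) \end{bmatrix} = \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix}^{1/2} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \quad f \in L^2_{dm|\mathbb{E}} . \]
In this case $\|f\|_{s_+} = \|g\|$. The system of identities
\[ \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix}^{1/2} \begin{bmatrix} (zf)(t) \\ \bar t (zf)(\bar t) \end{bmatrix} = z(t) \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix}^{1/2} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix} = z(t) \begin{bmatrix} g(t) \\ \bar t g(\bar t) \end{bmatrix} = \begin{bmatrix} (zg)(t) \\ \bar t (zg)(\bar t) \end{bmatrix} \]
finishes the proof. □

Let $\alpha_+ \in \Gamma^*$. Further, we assume that $s_+ \in L^\infty(\Gamma, \alpha_+^{-2})$ and
\[ \log(1 - |s_+(t)|^2) \in L^1 . \tag{1.2} \]
We define an outer function $s$, $s(0) > 0$, by the relation
\[ |s(t)|^2 = 1 - |s_+(t)|^2, \quad t \in \mathbb{T}. \]
It is a character-automorphic function such that $\overline{s(\bar t)} = s(t)$. It is convenient to denote its character by $\alpha_+^{-1}\alpha_-^{-1}$, i.e., $s \in H^\infty(\Gamma, \alpha_+^{-1}\alpha_-^{-1})$.

Let us discuss some properties of the space
\[ H^2_{s_+}(\alpha_+) := \operatorname{clos}_{L^2_{s_+}} A^2_1(\Gamma, \alpha_+). \]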
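First of all, we define "a Hankel operator" $H_{s_+} : A^2_1(\Gamma, \alpha_+) \to A^2_1(\Gamma, \alpha_+)$,
\[ H_{s_+} f = P_{A^2_1(\Gamma, \alpha_+)} \bar t (s_+ f)(\bar t). \]
Note that this operator, indeed, does not depend on "an analytical part" of its symbol, more precisely,
\[ H_{(s_+ + \varphi)} = H_{s_+}, \quad \forall \varphi \in H^\infty(\Gamma, \alpha_+^{-2}). \]
Besides, in the classical case $E = [-2, 2]$, $\Gamma = \{1_2\}$, $\mathbb{E} = \mathbb{T}$, with a function
\[ s_+(t) = \sum_{n\in\mathbb{Z}} a_n t^n \]
is associated the operator $H_{s_+} : H^2 \to H^2$ having the representation
\[ H_{s_+} = \begin{bmatrix} a_{-1} & a_{-2} & a_{-3} & \dots \\ a_{-2} & a_{-3} & \dots & \\ a_{-3} & \dots & & \\ \dots & & & \end{bmatrix} \]
with respect to the standard basis $\{t^n\}_{n\in\mathbb{Z}_+}$ in $H^2$.

The operator $H_{s_+}$ determines the metric in $H^2_{s_+}(\alpha_+)$:
\[ \|f\|^2_{s_+} = \langle f(t) + \bar t (s_+ f)(\bar t), f(t) \rangle = \langle (I + H_{s_+}) f, f \rangle, \quad f \in A^2_1(\Gamma, \alpha_+). \]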
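In the classical case the Hankel matrix above is easy to build numerically. The following sketch is an illustration only: the helper name, the truncation size and the sample symbol $s_+(t) = 0.5\,t^{-1} + 0.2\,t^{-2}$ are assumptions made here. It forms a finite section of $H_{s_+}$ from the coefficients $a_{-1}, a_{-2}, \dots$, checks that $I + H_{s_+}$ is positive definite when $\sup|s_+| < 1$, and computes the coefficients of $(I + H_{s_+})^{-1} 1$, whose constant term is the value of the reproducing vector at the origin.

```python
import numpy as np

def hankel_matrix(negative_coefficients, size):
    """Finite section of H_{s+}: entry (j, k) equals a_{-(j+k+1)}.

    negative_coefficients[m] holds a_{-(m+1)}; missing coefficients are zero.
    """
    a = np.zeros(2 * size)
    count = min(len(negative_coefficients), 2 * size)
    a[:count] = negative_coefficients[:count]
    rows, cols = np.indices((size, size))
    return a[rows + cols]

# Hypothetical symbol s_+(t) = 0.5 / t + 0.2 / t^2, so a_{-1} = 0.5, a_{-2} = 0.2
# and sup |s_+| <= 0.7 < 1 on the unit circle.
H = hankel_matrix([0.5, 0.2], size=6)
gram = np.eye(6) + H                       # matrix of the form <(I + H) f, f>
smallest_eigenvalue = np.linalg.eigvalsh(gram).min()
k = np.linalg.solve(gram, np.eye(6)[0])    # Taylor coefficients of (I + H)^{-1} 1
value_at_origin = k[0]                     # k_{s+}(0) for this toy symbol
```

Because only $a_{-1}$ and $a_{-2}$ are non-zero here, the finite section captures the whole operator; for a general symbol the truncation is an approximation.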
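Lemma 1.2. Under the assumptions (1.2), the space $H^2_{s_+}(\alpha_+)$ is a space of holomorphic functions with a reproducing kernel. Moreover, $sf \in A^2_1(\Gamma, \alpha_-^{-1})$ for any $f \in H^2_{s_+}(\alpha_+)$, and the reproducing vector $k^{\alpha_+}_{s_+}$:
\[ \langle f, k^{\alpha_+}_{s_+} \rangle = f(0), \quad \forall f \in H^2_{s_+}(\alpha_+), \]
is of the form
\[ k^{\alpha_+}_{s_+} = (I + H_{s_+})^{[-1]} k^{\alpha_+} := \lim_{\varepsilon\to 0+} (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+} \quad \text{in } L^2_{s_+} . \tag{1.3} \]

Proof. From the inequality
\[ \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} - \begin{bmatrix} |s(t)|^2 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} |s_+(t)|^2 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \ge 0, \]
it follows that
\[ \|sf\|^2 \le 2\|f\|^2_{s_+} \quad \forall f \in L^2_{s_+} . \tag{1.4} \]
Thus, if a sequence $\{f_n\}$, $f_n \in A^2_1(\Gamma, \alpha_+)$, converges in $H^2_{s_+}(\alpha_+)$, then the sequence $\{sf_n\}$ converges in $A^2_1(\Gamma, \alpha_-^{-1})$. In the same way we have boundedness of the functional $f \to f(0)$,
\[ |f(0)|^2 \le \frac{1}{|s(0)|^2} |(sf)(0)|^2 \le \frac{2}{|s(0)|^2} \|f\|^2_{s_+} k^{\alpha_-^{-1}}(0). \]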
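Let us prove (1.3). Let $\varepsilon > 0$, then for the norm of the difference we have an estimate
\[ \begin{aligned} \|k^{\alpha_+}_{s_+} - (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\|^2_{s_+} &= k^{\alpha_+}_{s_+}(0) - 2\{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0) \\ &\quad + \langle (I + H_{s_+})(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+} \rangle \\ &\le k^{\alpha_+}_{s_+}(0) - \{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0). \end{aligned} \tag{1.5} \]
Therefore,
\[ \{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0) \le k^{\alpha_+}_{s_+}(0). \tag{1.6} \]
Besides, (1.5) implies that (1.3) follows from the relation
\[ \lim_{\varepsilon\to 0} \{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0) = k^{\alpha_+}_{s_+}(0). \tag{1.7} \]
Let us prove (1.7). Since the function
\[ \{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0) = \langle (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, k^{\alpha_+} \rangle \]
decreases with $\varepsilon$ and it is bounded by (1.6), there exists a limit
\[ \lim_{\varepsilon\to 0} \{(\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}\}(0) \le k^{\alpha_+}_{s_+}(0). \tag{1.8} \]
On the other hand, for any $f \in A^2_1(\Gamma, \alpha_+)$ and $\varepsilon > 0$ the following inequalities hold:
\[ |f(0)|^2 \le \langle (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, k^{\alpha_+} \rangle \langle (\varepsilon + I + H_{s_+}) f, f \rangle \le \{ \lim_{\varepsilon\to 0} \langle (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, k^{\alpha_+} \rangle \} \langle (\varepsilon + I + H_{s_+}) f, f \rangle, \]
that is
\[ |f(0)|^2 \le \{ \lim_{\varepsilon\to 0} \langle (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, k^{\alpha_+} \rangle \} \|f\|^2_{s_+} . \]
Putting $f = k^{\alpha_+}_{s_+}$, we have
\[ k^{\alpha_+}_{s_+}(0) \le \lim_{\varepsilon\to 0} \langle (\varepsilon + I + H_{s_+})^{-1} k^{\alpha_+}, k^{\alpha_+} \rangle . \]
Comparing this inequality with (1.8), we get (1.7), thus (1.3) is proved. □

We define $s_- \in L^\infty(\Gamma, \alpha_-^{-2})$ by
\[ s_-(t) = -\overline{s_+(t)} \, s(t) / \overline{s(t)} . \]
In this case
\[ S(t) = \begin{bmatrix} s_- & s \\ s & s_+ \end{bmatrix}(t) \]
is a unitary-valued matrix function possessing properties (0.10), (0.11).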
Lemma 1.3. The following relation sets a unitary map from $L^2_{s_+}$ to $L^2_{s_-}$:
\[ s(t) f^-(t) = \bar t f^+(\bar t) + s_+(t) f^+(t). \]
In this case,
\[ \|f^+\|^2_{s_+} = \|f^-\|^2_{s_-} = \frac{1}{2} \{ \|sf^+\|^2 + \|sf^-\|^2 \}, \]
and the inverse map is of the form
\[ s(t) f^+(t) = \bar t f^-(\bar t) + s_-(t) f^-(t). \]
Moreover, this unitary map intertwines the operator multiplication by $z(t)$ in $L^2_{s_\pm}$.

Proof. The first statement follows from the identities
\[ \begin{bmatrix} \bar s_+ & 1 \\ 1 & s_+ \end{bmatrix} \begin{bmatrix} 1/\bar s & 0 \\ 0 & 1/s \end{bmatrix} \begin{bmatrix} 1 & \bar s_- \\ s_- & 1 \end{bmatrix} \begin{bmatrix} 1/s & 0 \\ 0 & 1/\bar s \end{bmatrix} \begin{bmatrix} s_+ & 1 \\ 1 & \bar s_+ \end{bmatrix} = \begin{bmatrix} 1 & \bar s_+ \\ s_+ & 1 \end{bmatrix} \]
and (0.18). Since $\overline{z(\bar t)} = z(t)$, $t \in \mathbb{D}$, the last statement is evident. □

Lemma 1.4. Let $K^{\alpha_+}_{s_+}(t) = k^{\alpha_+}_{s_+}(t) / \sqrt{k^{\alpha_+}_{s_+}(0)}$. The system of functions $\{b^n(t) K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(t)\}$ forms an orthonormal basis in $H^2_{s_+}(\alpha_+)$ when $\{n \in \mathbb{Z}_+\}$ and in $L^2_{s_+}$ when $\{n \in \mathbb{Z}\}$. With respect to this basis the operator multiplication by $z(t)$ is a Jacobi matrix.

Proof. First, we note that
\[ \{ f : f \in H^2_{s_+}(\alpha_+), \ f(0) = 0 \} = \{ f = b\tilde f : \tilde f \in H^2_{s_+ b^2}(\alpha_+\mu^{-1}) \}. \]
Therefore,
\[ H^2_{s_+}(\alpha_+) = \{ K^{\alpha_+}_{s_+}(t) \} \oplus b H^2_{s_+ b^2}(\alpha_+\mu^{-1}). \]
Iterating this relation, we get that $\{b^n(t) K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(t)\}_{n\in\mathbb{Z}_+}$ is an orthonormal basis in $H^2_{s_+}(\alpha_+)$, since
\[ \cap_{n\in\mathbb{Z}_+} b^n H^2_{s_+ b^{2n}}(\alpha_+\mu^{-n}) = \{0\}. \]
Then, we note that an arbitrary function $f \in L^2_{s_+}$ can be approximated with the given accuracy by a function $f_1$ from $L^2_{dm|\mathbb{E}}$. This function, in its turn, can be approximated by a function $f_2 \in b^n A^2_1(\Gamma, \alpha_+\mu^{-n})$ with a suitable $n$. Therefore, linear combinations of functions from $\{b^n(t) K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(t)\}$ are dense in $L^2_{s_+}$. Since this system of functions is orthonormal, it forms a basis in $L^2_{s_+}$.

Since $bz \in H^\infty(\Gamma, \mu)$, we have
\[ z : b^n H^2_{s_+ b^{2n}}(\alpha_+\mu^{-n}) \to b^{n-1} H^2_{s_+ b^{2n-2}}(\alpha_+\mu^{-n+1}). \]
For this reason, in the basis $\{b^n(t) K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(t)\}_{n\in\mathbb{Z}}$, the matrix of the operator multiplication by $z(t)$ has only one non-zero entry over diagonal in each column. But the operator is self-adjoint, therefore, the matrix is a three-diagonal Jacobi matrix. □
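Lemma 1.5. Let $e^+(n, t) = b^n(t) K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(t)$, $n \in \mathbb{Z}$. Define
\[ s(t) e^-(n, t) = \bar t e^+(-n-1, \bar t) + s_+(t) e^+(-n-1, t). \]
Then $\{e^-(n, t)\}$ is an orthonormal basis in $L^2_{s_-}$,
\[ s(t) e^-(n, t) \in A^2_1(\Gamma, \alpha_+^{-1}), \quad n \in \mathbb{Z}_+, \tag{1.9} \]
and
\[ e^-(0, 0)(be^+)(-1, 0) = \frac{b'(0)}{s(0)} . \tag{1.10} \]

Proof. Lemma 1.3 and Lemma 1.4 imply immediately that $\{e^-(n, t)\}$ is an orthonormal basis in $L^2_{s_-}$. Moreover, $s(t) e^-(n, t) \in L^2_{dm|\mathbb{E}}$. To prove (1.9) consider a scalar product ($f \in A^2_1(\Gamma, \alpha_+)$)
\[ \begin{aligned} \langle \bar t f(\bar t), s(t) e^-(n, t) \rangle &= \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} e^+(-n-1, t) \\ \bar t e^+(-n-1, \bar t) \end{bmatrix} \right\rangle \\ &= \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} (b^{-n-1} K^{\alpha_+\mu^{n+1}}_{s_+ b^{-2n-2}})(t) \\ \bar t (b^{-n-1} K^{\alpha_+\mu^{n+1}}_{s_+ b^{-2n-2}})(\bar t) \end{bmatrix} \right\rangle \\ &= \langle b^{n+1} f, K^{\alpha_+\mu^{n+1}}_{s_+ b^{-2n-2}} \rangle_{s_+ b^{-2n-2}} = 0, \quad \forall n \ge 0. \end{aligned} \]
To prove (1.10), we write
\[ s(0) e^-(0, 0) = \langle s(t) e^-(0, t), k^{\alpha_+^{-1}}(t) \rangle . \tag{1.11} \]
Due to the Direct Cauchy Theorem, the reproducing kernel $k^\alpha$ possesses the following property:
\[ \bar t k^{\alpha_+^{-1}}(\bar t) = \frac{b'(0)}{k^{\alpha_+\mu}(0)} \frac{k^{\alpha_+\mu}(t)}{b(t)} . \tag{1.12} \]
Substituting (1.12) in (1.11), we obtain
\[ \begin{aligned} s(0) e^-(0, 0) &= \frac{b'(0)}{2k^{\alpha_+\mu}(0)} \left\langle \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \begin{bmatrix} e^+(-1, t) \\ \bar t e^+(-1, \bar t) \end{bmatrix}, \begin{bmatrix} (b^{-1} k^{\alpha_+\mu})(t) \\ \bar t (b^{-1} k^{\alpha_+\mu})(\bar t) \end{bmatrix} \right\rangle \\ &= \frac{b'(0)}{k^{\alpha_+\mu}(0)} \langle K^{\alpha_+\mu}_{s_+ b^{-2}}(t), k^{\alpha_+\mu}(t) \rangle_{s_+ b^{-2}} . \end{aligned} \]
Using (1.3), we have
\[ \begin{aligned} s(0) e^-(0, 0) &= \frac{b'(0)}{k^{\alpha_+\mu}(0) K^{\alpha_+\mu}_{s_+ b^{-2}}(0)} \lim_{\varepsilon\to 0} \langle (\varepsilon + I + H_{s_+ b^{-2}})^{-1} k^{\alpha_+\mu}, k^{\alpha_+\mu} \rangle_{s_+ b^{-2}} \\ &= \frac{b'(0)}{k^{\alpha_+\mu}(0) K^{\alpha_+\mu}_{s_+ b^{-2}}(0)} \lim_{\varepsilon\to 0} \langle (I + H_{s_+ b^{-2}})(\varepsilon + I + H_{s_+ b^{-2}})^{-1} k^{\alpha_+\mu}, k^{\alpha_+\mu} \rangle \\ &= \frac{b'(0)}{k^{\alpha_+\mu}(0) K^{\alpha_+\mu}_{s_+ b^{-2}}(0)} \{ k^{\alpha_+\mu}(0) - \lim_{\varepsilon\to 0} \varepsilon \langle (\varepsilon + I + H_{s_+ b^{-2}})^{-1} k^{\alpha_+\mu}, k^{\alpha_+\mu} \rangle \}. \end{aligned} \]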
Since the limit (1.7) exists, finally, we get
\[ s(0) e^-(0, 0) = \frac{b'(0)}{K^{\alpha_+\mu}_{s_+ b^{-2}}(0)} = \frac{b'(0)}{(be^+)(-1, 0)} . \]
The lemma is proved. □

Lemma 1.6. Let $\|s_+\| < 1$. Then
\[ K^{\alpha_\pm}_{s_\pm}(0) K^{\alpha_\mp\mu}_{s_\mp b^{-2}}(0) = \frac{b'(0)}{s(0)} . \]

Proof. Note that operators $(I + H_{s_\pm b^n})$ are invertible.
We use the notation of Lemma 1.5. As we know, $s(t) e^-(0, t) \in A^2_1(\Gamma, \alpha_+^{-1})$. But, in the case under consideration, $1/s \in H^\infty(\Gamma, \alpha_+\alpha_-)$. Hence, the function $e^-(0, t)$ itself belongs to $A^2_1(\Gamma, \alpha_-)$. Therefore, we can project each term onto $A^2_1(\Gamma, \alpha_-)$ in the relation
\[ \bar t (se^+)(-1, \bar t) = e^-(0, t) + \bar t (s_- e^-)(0, \bar t). \]
On the right-hand side we get
\[ P_{A^2_1(\Gamma, \alpha_-)} \{ e^-(0, t) + \bar t (s_- e^-)(0, \bar t) \} = (I + H_{s_-}) e^-(0, t). \]
To evaluate the left-hand side, using (1.10), we write
\[ s(t) e^+(-1, t) = s(0)(be^+)(-1, 0) \frac{k^{\alpha_-^{-1}\mu}(t)}{b(t) k^{\alpha_-^{-1}\mu}(0)} + g(t) = \frac{b'(0)}{e^-(0, 0)} \frac{k^{\alpha_-^{-1}\mu}(t)}{b(t) k^{\alpha_-^{-1}\mu}(0)} + g(t), \quad g \in A^2_1(\Gamma, \alpha_-^{-1}). \]
Using (1.12), we get
\[ P_{A^2_1(\Gamma, \alpha_-)} \{ \bar t (se^+)(-1, \bar t) \} = \frac{k^{\alpha_-}(t)}{e^-(0, 0)} = (I + H_{s_-}) e^-(0, t). \]
Thus,
\[ e^-(0, t) e^-(0, 0) = (I + H_{s_-})^{-1} k^{\alpha_-} . \]
In particular, $e^-(0, 0) = K^{\alpha_-}_{s_-}(0)$, and (1.10) becomes the statement of the lemma. □

Lemma 1.7. Assume that for some Jacobi matrix $J$ there exists a pair of unitary transforms
\[ F^\pm : l^2(\mathbb{Z}) \to L^2_{s_\pm}, \quad (F^\pm J f)(t) = z(t)(F^\pm f)(t), \]
determining each other by the relations
\[ s(t)(F^\pm f)(t) = \bar t (F^\mp f)(\bar t) + s_\mp(t)(F^\mp f)(t), \]
such that
\[ s F^\pm(l^2(\mathbb{Z}_\pm)) \subset A^2_1(\Gamma, \alpha_\mp^{-1}). \tag{1.13} \]
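As before, we put
\[ e^+(n, t) = (F^+ e_n)(t), \quad e^-(n, t) = (F^- e_{-n-1})(t). \tag{1.14} \]
Then $e^\pm(n, t)$ has at the origin zero (poles) of multiplicity $n$, $n > 0$ ($-n$, $n < 0$). Furthermore,
\[ F^\pm(l^2(\mathbb{Z}_\pm)) \supset H^2_{s_\pm}(\alpha_\pm), \]
and, hence,
\[ e^\pm(0, 0) \ge K^{\alpha_\pm}_{s_\pm}(0). \tag{1.15} \]
The equality in (1.15) takes place if and only if $e^\pm(0, t) = K^{\alpha_\pm}_{s_\pm}(t)$.

Proof. Let us show that the annihilator of the linear space $A^2_1(\Gamma, \alpha_+) \subset L^2_{s_+}$ contains $F^+\{l^2(\mathbb{Z}_-)\}$. For $f \in A^2_1(\Gamma, \alpha_+)$ and $e^+(-n-1, t)$, $n \ge 0$, we have
\[ \begin{aligned} \langle f(t), e^+(-n-1, t) \rangle_{s_+} &= \frac{1}{2} \left\langle \begin{bmatrix} 1 & \overline{s_+(t)} \\ s_+(t) & 1 \end{bmatrix} \begin{bmatrix} f(t) \\ \bar t f(\bar t) \end{bmatrix}, \begin{bmatrix} e^+(-n-1, t) \\ \bar t e^+(-n-1, \bar t) \end{bmatrix} \right\rangle \\ &= \langle f(t), e^+(-n-1, t) + \overline{t s_+(t)} \, e^+(-n-1, \bar t) \rangle \\ &= \langle f(t), \bar t (se^-)(n, \bar t) \rangle . \end{aligned} \]
By (1.13) and (DCT), the last scalar product equals zero. Therefore,
\[ H^2_{s_+}(\alpha_+) = \operatorname{clos}_{L^2_{s_+}} A^2_1(\Gamma, \alpha_+) \subset \{ F^+(l^2(\mathbb{Z}_-)) \}^\perp = F^+(l^2(\mathbb{Z}_+)). \]
Now, from the three-term recurrent relation
\[ z(t) s(t) e^+(n, t) = p_n s(t) e^+(n-1, t) + q_n s(t) e^+(n, t) + p_{n+1} s(t) e^+(n+1, t), \tag{1.16} \]
and (1.13) it follows that $e^+(n, t)$, $n > 0$, has in the origin zero, at least of multiplicity $n$. Since $K^{\alpha_+}_{s_+}(t) \in F^+(l^2(\mathbb{Z}_+))$, it possesses the decomposition
\[ K^{\alpha_+}_{s_+}(t) = \sum_{n\in\mathbb{Z}_+} a_n e^+(n, t). \]
Since $e^+(n, 0) = 0$, $n > 0$,
\[ a_0 = \frac{K^{\alpha_+}_{s_+}(0)}{e^+(0, 0)} \]
in this decomposition. But,
\[ |a_0|^2 \le \sum |a_n|^2 = \|K^{\alpha_+}_{s_+}(t)\|^2_{s_+} = 1. \]
Thus, (1.15) and the lemma are proved. □

Lemma 1.8 ([10]). Let $f \in L^\infty(\Gamma, \alpha^{-2})$. Then
\[ P_{A^2_1(\Gamma, \alpha)} \bar t (f b^n K^{\alpha\mu^{-n}})(\bar t) \to 0, \quad n \to +\infty, \]
where $P_{A^2_1(\Gamma, \alpha)}$ is the orthogonal projection from $L^2_{dm|\mathbb{E}}$ onto $A^2_1(\Gamma, \alpha)$.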
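Proof. Let us denote by $\Delta^\beta(t)$ an extremal function of the problem
\[ \Delta^\beta(0) = \sup\{ \phi(0) : \phi \in H^\infty(\Gamma, \beta), \ \|\phi\| \le 1 \}. \]
Using properties (1), (2) of a group of Widom type with (DCT), Theorem [7], and compactness of $\Gamma^*$, for any $\varepsilon > 0$, we can find a finite covering of $\Gamma^*$,
\[ \Gamma^* = \bigcup_{j=1}^{l(\varepsilon)} \{ \beta : \operatorname{dist}(\beta, \beta_j) \le \eta(\varepsilon) \} \]
such that
\[ 2\left\{ 1 - \Delta^{\beta_j^{-1}\beta}(0) \frac{K^{\beta_j}(0)}{K^\beta(0)} \right\} \le \varepsilon^2, \quad \operatorname{dist}(\beta, \beta_j) \le \eta(\varepsilon). \]
It means that
\[ \|(\Delta^{\beta_j^{-1}\beta} K^{\beta_j}) - K^\beta\|^2 \le 1 + 1 - 2\Delta^{\beta_j^{-1}\beta}(0) \frac{K^{\beta_j}(0)}{K^\beta(0)} \le \varepsilon^2, \quad \operatorname{dist}(\beta, \beta_j) \le \eta(\varepsilon). \]
For fixed $\beta$ one can find $n_0$ such that
\[ \|P_{b^n A^2_1(\Gamma, \alpha^2\beta^{-1}\mu^{-n})} \bar t (f K^\beta)(\bar t)\| \le \varepsilon, \quad \forall n > n_0 . \]
Therefore, there exists $n_0$ such that
\[ \|P_{b^n A^2_1(\Gamma, \alpha^2\beta_j^{-1}\mu^{-n})} \bar t (f K^{\beta_j})(\bar t)\| \le \varepsilon, \quad \forall n > n_0, \ 1 \le j \le l(\varepsilon). \]
Now, let $n > n_0 = n_0(\varepsilon)$ and let $\beta_j$: $\operatorname{dist}(\beta_j, \alpha\mu^{-n}) \le \eta(\varepsilon)$. For $h \in A^2_1(\Gamma, \alpha)$, we write
\[ \langle \bar t (f b^n K^{\alpha\mu^{-n}})(\bar t), h \rangle = \langle \bar t (b^n f [K^{\alpha\mu^{-n}} - \Delta^{\alpha\mu^{-n}\beta_j^{-1}} K^{\beta_j}])(\bar t), h \rangle + \langle \bar t (b^n \Delta^{\alpha\mu^{-n}\beta_j^{-1}} f K^{\beta_j})(\bar t), h \rangle . \]
Then
\[ |\langle \bar t (b^n f [K^{\alpha\mu^{-n}} - \Delta^{\alpha\mu^{-n}\beta_j^{-1}} K^{\beta_j}])(\bar t), h \rangle| \le \|f\| \|h\| \|K^{\alpha\mu^{-n}} - \Delta^{\alpha\mu^{-n}\beta_j^{-1}} K^{\beta_j}\| \le \varepsilon \|f\| \|h\|, \]
and
\[ |\langle \bar t (b^n \Delta^{\alpha\mu^{-n}\beta_j^{-1}} f K^{\beta_j})(\bar t), h \rangle| = |\langle \bar t (f K^{\beta_j})(\bar t), \overline{(b^n \Delta^{\alpha\mu^{-n}\beta_j^{-1}})(\bar t)} \, h \rangle| \le \|P_{b^n A^2_1(\Gamma, \alpha^2\beta_j^{-1}\mu^{-n})} \bar t (f K^{\beta_j})(\bar t)\| \, \|h\| \le \varepsilon \|h\| . \]
Therefore,
\[ |\langle P_{A^2_1(\Gamma, \alpha)} \{ \bar t (f b^n K^{\alpha\mu^{-n}})(\bar t) \}, h \rangle| \le \varepsilon (1 + \|f\|) \|h\| . \]
Putting $h = P_{A^2_1(\Gamma, \alpha)} \{ \bar t (f b^n K^{\alpha\mu^{-n}})(\bar t) \}$, we get
\[ \|P_{A^2_1(\Gamma, \alpha)} \{ \bar t (f b^n K^{\alpha\mu^{-n}})(\bar t) \}\| \le \varepsilon (1 + \|f\|). \]
The lemma is proved. □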
Proposition 1.1. Assume that for some Jacobi matrix $J$ there exists a pair of unitary transforms
\[ F^\pm : l^2(\mathbb{Z}) \to L^2_{s_\pm}, \quad (F^\pm J f)(t) = z(t)(F^\pm f)(t), \]
determining each other by the relations
\[ s(t)(F^\pm f)(t) = \bar t (F^\mp f)(\bar t) + s_\mp(t)(F^\mp f)(t), \tag{1.17} \]
such that (1.13) holds. Then the following relations are equivalent:
\[ e^+(n, t) = b^n(t) K^{\alpha_+\mu^{-n}} + o(1) \ \text{in } L^2_{s_+}, \tag{1.18} \]
\[ \bar t p_n \{ e^+(n, t) e^+(n-1, \bar t) - e^+(n-1, t) e^+(n, \bar t) \} = z'(t), \tag{1.19} \]
\[ s(0) e^+(0, 0)(be^-)(-1, 0) = b'(0), \tag{1.20} \]
where $\{e^\pm(n, t)\}$ is defined by (1.14).

Proof. (1.18) ⇒ (1.19). It follows from two remarks. First, the form on the left in (1.19) does not depend on $n$ (it is the Wronskian of the recurrence relation (0.7)). Second, the identity
\[ \bar t \, \frac{K^\alpha(0)}{K^{\alpha\mu}(0)} \{ K^\alpha(t)(K^{\alpha\mu}/b)(\bar t) - (K^{\alpha\mu}/b)(t) K^\alpha(\bar t) \} = z'(t) \]
holds for any $\alpha \in \Gamma^*$.

(1.19) ⇒ (1.20). Let us introduce the matrix
\[ \Phi(t) = \begin{bmatrix} e^-(-1, t) & -e^-(0, t) \\ -e^+(0, t) & e^+(-1, t) \end{bmatrix} . \tag{1.21} \]
Then (1.17) implies
\[ \bar t \, \Phi(\bar t) = -S(t)\Phi(t). \]
In particular, with the help of (1.19), we get
\[ s(t) = -\bar t \, \frac{e^+(0, t) e^+(-1, \bar t) - e^+(-1, t) e^+(0, \bar t)}{e^-(-1, t) e^+(-1, t) - e^-(0, t) e^+(0, t)} = \frac{-z'(t)}{p_0 \{ e^-(-1, t) e^+(-1, t) - e^-(0, t) e^+(0, t) \}} . \tag{1.22} \]
Since $b(t) e^\pm(-1, t)$ are holomorphic functions (in fact, of Smirnov class)
\[ s(0) = \frac{b'(0)}{p_0 (be^-)(-1, 0)(be^+)(-1, 0)} . \]
Now, we only have to mention that $p_0 (be^\pm)(-1, 0) = e^\pm(0, 0)$.

(1.20) ⇒ (1.18). This is the non-trivial part of the proposition. The main step is to prove that
\[ \lim_{n\to+\infty} \frac{(b^{-n} e^+)(n, 0)}{K^{\alpha_+\mu^{-n}}(0)} = 1. \tag{1.23} \]
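By Lemma 1.7 we have an estimate from below,
\[ (b^{-n} e^+)(n, 0) \ge K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}}(0) \ge \{ (\varepsilon + I + H_{s_+ b^{2n}})^{-1} k^{\alpha_+\mu^{-n}} \}^{1/2}(0) = \frac{1}{\sqrt{1+\varepsilon}} K^{\alpha_+\mu^{-n}}_{\frac{s_+}{1+\varepsilon} b^{2n}}(0). \tag{1.24} \]
To get an estimate from above we use (1.20). Let us note that due to the recurrence relation, the form
\[ p_n \{ e^+(n-1, t) e^-(-n-1, t) - e^+(n, t) e^-(-n, t) \} \]
also does not depend on $n$. Thus, a relation like (1.20) holds for all $n$:
\[ (b^{-n} e^+)(n, 0)(b^{n+1} e^-)(-n-1, 0) = p_n (b^{-n+1} e^+)(n-1, 0)(b^{n+1} e^-)(-n-1, 0) = e^+(0, 0)(be^-)(-1, 0) = b'(0)/s(0). \]
Therefore,
\[ \begin{aligned} (b^{-n} e^+)(n, 0) &= \frac{b'(0)}{s(0)} \frac{1}{(b^{n+1} e^-)(-n-1, 0)} \le \frac{b'(0)}{s(0)} \frac{1}{K^{\alpha_-\mu^{n+1}}_{s_- b^{-2n-2}}(0)} \\ &\le \frac{b'(0)}{s(0)} \frac{1}{\{ (\varepsilon + I + H_{s_- b^{-2n-2}})^{-1} k^{\alpha_-\mu^{n+1}} \}^{1/2}(0)} = \frac{b'(0)}{s(0)} \frac{\sqrt{1+\varepsilon}}{K^{\alpha_-\mu^{n+1}}_{s_{\varepsilon,-} b^{-2n-2}}(0)}, \end{aligned} \tag{1.25} \]
where $s_{\varepsilon,-} := s_-/(1+\varepsilon)$. With the function $s_{\varepsilon,-}$, let us associate the functions $s_\varepsilon$, $s_{\varepsilon,+}$ and the character $\alpha_{\varepsilon,+}$ (note that $s_{\varepsilon,+}$ is not $\frac{1}{1+\varepsilon} s_+$, but $s_{\varepsilon,+} = -\bar s_{\varepsilon,-}(s_\varepsilon/\bar s_\varepsilon)$). It is important that $s_\varepsilon(0)$ and $\alpha_{\varepsilon,+}$ depend continuously on $\varepsilon$. By Lemma 1.6,
\[ \frac{b'(0)}{K^{\alpha_-\mu^{n+1}}_{s_{\varepsilon,-} b^{-2n-2}}(0)} = s_\varepsilon(0) K^{\alpha_{\varepsilon,+}\mu^{-n}}_{s_{\varepsilon,+} b^{2n}}(0). \tag{1.26} \]
Substituting (1.26) in (1.25), and combining the result with (1.24), we obtain
\[ \frac{1}{\sqrt{1+\varepsilon}} K^{\alpha_+\mu^{-n}}_{\frac{s_+}{1+\varepsilon} b^{2n}}(0) \le (b^{-n} e^+)(n, 0) \le \sqrt{1+\varepsilon} \, \frac{s_\varepsilon(0)}{s(0)} K^{\alpha_{\varepsilon,+}\mu^{-n}}_{s_{\varepsilon,+} b^{2n}}(0). \]
Lemma 1.8 implies that for any $f \in L^\infty(\Gamma, \alpha_+^{-2})$ with $\|f\| < 1$ we have
\[ \lim_{n\to+\infty} \frac{K^{\alpha_+\mu^{-n}}_{f b^{2n}}(0)}{K^{\alpha_+\mu^{-n}}(0)} = 1. \tag{1.27} \]
Indeed,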
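\[ \begin{aligned} |k^{\alpha_+\mu^{-n}}_{f b^{2n}}(0) - k^{\alpha_+\mu^{-n}}(0)| &= |\langle H_{f b^{2n}} k^{\alpha_+\mu^{-n}}, (I + H_{f b^{2n}})^{-1} k^{\alpha_+\mu^{-n}} \rangle| \\ &= |\langle \bar t (f b^n k^{\alpha_+\mu^{-n}})(\bar t), b^n (I + H_{f b^{2n}})^{-1} k^{\alpha_+\mu^{-n}} \rangle| \\ &\le \|P_{A^2_1(\Gamma, \alpha_+)} \{ \bar t (f b^n k^{\alpha_+\mu^{-n}})(\bar t) \}\| \, \|b^n (I + H_{f b^{2n}})^{-1} k^{\alpha_+\mu^{-n}}\| \\ &\le \frac{1}{1 - \|f\|} \|P_{A^2_1(\Gamma, \alpha_+)} \{ \bar t (f b^n k^{\alpha_+\mu^{-n}})(\bar t) \}\| \, \|k^{\alpha_+\mu^{-n}}\| \to 0, \end{aligned} \]
as $n \to +\infty$. Also, since $\alpha_{\varepsilon,+}$ depends continuously on $\varepsilon$ and $K^{\alpha}(0)$ is a continuous function of $\alpha$ on the compact group $\Gamma^*$, for any $\delta > 0$ we can choose $\varepsilon$ so small that
\[ \frac{K^{\alpha_{\varepsilon,+}\mu^{-n}}(0)}{K^{\alpha_+\mu^{-n}}(0)} \le 1 + \delta, \quad \forall n. \]
Thus, returning to (1.27), we obtain
\[ \frac{1}{\sqrt{1+\varepsilon}} \le \liminf_{n\to\infty} \frac{(b^{-n} e^+)(n, 0)}{K^{\alpha_+\mu^{-n}}(0)} \le \limsup_{n\to\infty} \frac{(b^{-n} e^+)(n, 0)}{K^{\alpha_+\mu^{-n}}(0)} \le \sqrt{1+\varepsilon} \, \frac{s_\varepsilon(0)}{s(0)} (1 + \delta). \]
Since $\varepsilon$ and $\delta$ are arbitrarily small, (1.23) is proved.

Now we are in a position to prove (1.18). Consider the norm of the difference
\[ \|e^+(n, t) - b^n K^{\alpha_+\mu^{-n}}\|^2_{s_+} = 1 + \|b^n K^{\alpha_+\mu^{-n}}\|^2_{s_+} - 2\langle e^+(n, t), b^n K^{\alpha_+\mu^{-n}} \rangle_{s_+} . \]
Since
\[ \|b^n K^{\alpha_+\mu^{-n}}\|^2_{s_+} = 1 + \langle b^n K^{\alpha_+\mu^{-n}}, \bar t (s_+ b^n K^{\alpha_+\mu^{-n}})(\bar t) \rangle, \]
using Lemma 1.8, we conclude that
\[ \|b^n K^{\alpha_+\mu^{-n}}\|^2_{s_+} \to 1, \quad n \to +\infty. \]
Let us evaluate the scalar product
\[ \begin{aligned} \langle e^+(n, t), b^n K^{\alpha_+\mu^{-n}} \rangle_{s_+} &= \langle se^-(-n-1, t), \bar t (b^n K^{\alpha_+\mu^{-n}})(\bar t) \rangle = \langle se^-(-n-1, t), b^{-n} b^{-1} K^{\alpha_+^{-1}\mu^{n+1}} \rangle \\ &= \frac{s(0)(b^{n+1} e^-)(-n-1, 0)}{K^{\alpha_+^{-1}\mu^{n+1}}(0)} = \frac{K^{\alpha_+\mu^{-n}}(0)}{(b^{-n} e^+)(n, 0)} \to 1, \quad n \to +\infty. \end{aligned} \]
The proposition is proved. □

The following theorem shows that an arbitrary function $s_+$, possessing (1.1), (1.2), is the reflection coefficient of a Jacobi matrix of Szegö class.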
Theorem 1.1. Let a function $s_+ \in L^\infty(\Gamma, \alpha_+^{-2})$, $\|s_+\| \le 1$, $\overline{s_+(\bar t)} = s_+(t)$, be such that $\log(1 - |s_+|^2) \in L^1$. Let an outer function $s$, $s(0) > 0$, and $s_-$ be associated to $s_+$ by the relations
\[ |s|^2 = 1 - |s_+|^2, \quad s_- = -\bar s_+ s / \bar s . \]
Then the system of functions
\[ e^+(n, t) = b^n K^{\alpha_+\mu^{-n}}_{s_+ b^{2n}} \]
forms an orthonormal basis in $L^2_{s_+}$. The dual system, defined by
\[ s(t) e^-(n, t) = \bar t e^+(-n-1, \bar t) + s_+(t) e^+(-n-1, t), \]
forms an orthonormal basis in $L^2_{s_-}$. The subspaces of $L^2_{s_\pm}$ formed by functions with vanishing negative Fourier coefficients with respect to these bases are spaces of holomorphic character-automorphic forms; moreover,
\[ sf^\pm \in A^2_1(\Gamma, \alpha_\mp^{-1}) \quad \text{if } f^\pm \in \operatorname{clos}_{L^2_{s_\pm}} \operatorname{span}\{ e^\pm(n, t) : n \ge 0 \}. \]
Further,
\[ e^\pm(n, t) = b^n K^{\alpha_\pm\mu^{-n}} + o(1) \ \text{in } L^2_{s_\pm}, \]
and with respect to these bases the operator multiplication by $z(t)$ is a Jacobi matrix $J$ of Szegö class.

Proof. All statements, besides the last one, only summarize results of Lemmas 1.4, 1.5 and Proposition 1.1. To prove that $J$ is of Szegö class we evaluate its spectral density $\rho(x)$. Using the definition of the resolvent matrix-function, we get
\[ R(z) = \begin{bmatrix} \langle (z(t) - z)^{-1} e^+(-1, t), e^+(-1, t) \rangle_{s_+} & \langle (z(t) - z)^{-1} e^+(0, t), e^+(-1, t) \rangle_{s_+} \\ \langle (z(t) - z)^{-1} e^+(-1, t), e^+(0, t) \rangle_{s_+} & \langle (z(t) - z)^{-1} e^+(0, t), e^+(0, t) \rangle_{s_+} \end{bmatrix} . \]
Note that if $f^\pm \in L^2_{s_\pm}$ are related by $s(t) f^-(t) = \bar t f^+(\bar t) + s_+(t) f^+(t)$ then
\[ \left( \frac{f^+(t)}{z(t) - z} \right)^- = \frac{f^-(t)}{z(t) - z} . \]
Therefore, using Lemma 1.3, we have
\[ R(z) = \frac{1}{2} \int_{\mathbb{E}} \begin{bmatrix} e^+(-1, t) & e^+(0, t) \\ e^-(0, t) & e^-(-1, t) \end{bmatrix}^* \begin{bmatrix} e^+(-1, t) & e^+(0, t) \\ e^-(0, t) & e^-(-1, t) \end{bmatrix} \frac{|s(t)|^2 \, dm}{z(t) - z}, \]
and, substituting $s(t)$ from (1.22), we obtain
\[ \begin{aligned} R(z) &= \frac{1}{2} \int_{\mathbb{E}} \frac{ \begin{bmatrix} e^+(-1, t) & e^+(0, t) \\ e^-(0, t) & e^-(-1, t) \end{bmatrix}^* \begin{bmatrix} e^+(-1, t) & e^+(0, t) \\ e^-(0, t) & e^-(-1, t) \end{bmatrix} }{ p_0^2 |e^-(-1, t) e^+(-1, t) - e^-(0, t) e^+(0, t)|^2 } \frac{|z'(t)|^2 \, dm}{z(t) - z} \\ &= \frac{1}{2} \int_{\mathbb{E}} \frac{\tilde\Phi^{-1*}(t) \tilde\Phi^{-1}(t)}{z(t) - z} \frac{|z'(t)|^2 \, |dt|}{2\pi p_0^2}, \end{aligned} \]
where
\[ \tilde\Phi(t) = \begin{bmatrix} e^-(-1, t) & -e^+(0, t) \\ -e^-(0, t) & e^+(-1, t) \end{bmatrix} . \tag{1.28} \]
Thus,
\[ 2\pi p_0^2 \rho(z(t)) = \tilde\Phi^{-1*}(t) \tilde\Phi^{-1}(t) |z'(t)|, \tag{1.29} \]
and
\[ \det\{ 2\pi p_0 \rho(z(t)) \} = \frac{|z'(t)|^2}{p_0^2 |\det \tilde\Phi(t)|^2} = |s(t)|^2 . \]
The theorem is proved. □

Let us note, by the way, that $\Phi(t)$ (see (1.21)) and $\tilde\Phi(t)$ are related by $\tilde\Phi(\bar t) = \Phi^*(t)$ and, besides (1.28),
\[ 2\pi p_0^2 \rho(z(t)) = \Phi^{-1}(t) \Phi^{-1*}(t) |z'(t)| . \tag{1.30} \]
2. Existence and Uniqueness

We start this section with a remark that the spectral measure $d\sigma$ determines a Jacobi matrix uniquely, but it is not an arbitrary $2\times2$ matrix-measure, or, say, a real-valued (all entries are real) $2\times2$ matrix-measure. Indeed, one can represent $J$ as a two-dimensional perturbation of an orthogonal sum of a pair of one-sided Jacobi matrices, i.e.:
\[
J = \begin{bmatrix} J_- & 0\\ 0 & J_+\end{bmatrix} + p_0\langle\,\cdot\,, e_{-1}\rangle e_0 + p_0\langle\,\cdot\,, e_0\rangle e_{-1},
\]
where $J_\pm = P_{l^2(\mathbb Z_\pm)}\,J\,|\,l^2(\mathbb Z_\pm)$. This formula implies that
\[
R(z) = \begin{bmatrix} r_-^{-1}(z) & p_0\\ p_0 & r_+^{-1}(z)\end{bmatrix}^{-1}, \tag{2.1}
\]
where
\[
r_-(z) = r(z,J_-) = \langle (J_- - z)^{-1}e_{-1}, e_{-1}\rangle = \int\frac{d\sigma_-(x)}{x-z},
\qquad
r_+(z) = r(z,J_+) = \langle (J_+ - z)^{-1}e_0, e_0\rangle = \int\frac{d\sigma_+(x)}{x-z}.
\]
Thus, the real-valued matrix-measure $d\sigma$ is determined by two scalar measures $d\sigma_\pm$ (with the normalization $\int d\sigma_\pm = 1$) and a constant $p_0$. In what follows $\hat f(x) \in L^2_{d\sigma}$ denotes the image of $f \in l^2(\mathbb Z)$ in the spectral representation. Recall that
\[
\hat e_{-1} = \begin{bmatrix} 1\\ 0\end{bmatrix}, \qquad \hat e_0 = \begin{bmatrix} 0\\ 1\end{bmatrix}
\]
and
\[
\widehat{(Jf)}(x) = x\hat f(x).
\]
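The block structure behind (2.1) can be checked numerically on a finite truncation. The following is a minimal sketch, not part of the original argument: it assumes a random finite Jacobi matrix of size 2n whose indices n-1 and n play the roles of e_{-1} and e_0, and the function name `resolvent_block_check` is hypothetical. For finite matrices the identity is exact, so the two returned 2×2 matrices should agree up to rounding.

```python
import numpy as np

def resolvent_block_check(n=6, z=0.3 + 0.7j, seed=0):
    """Compare the (e_{-1}, e_0) block of (J - z)^{-1} with formula (2.1).

    J is a random real symmetric tridiagonal matrix of size 2n (a finite
    stand-in for a two-sided Jacobi matrix); z must be non-real.
    """
    rng = np.random.default_rng(seed)
    size = 2 * n
    diag = rng.uniform(-1.0, 1.0, size)        # diagonal entries q_k
    off = rng.uniform(0.5, 1.5, size - 1)      # positive off-diagonal entries p_k
    J = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

    i_m1, i_0 = n - 1, n                       # positions of e_{-1} and e_0
    p0 = J[i_m1, i_0]                          # coupling between the two halves

    J_minus = J[:n, :n]                        # one-sided matrix on the "left" half
    J_plus = J[n:, n:]                         # one-sided matrix on the "right" half
    r_minus = np.linalg.inv(J_minus - z * np.eye(n))[n - 1, n - 1]
    r_plus = np.linalg.inv(J_plus - z * np.eye(n))[0, 0]

    full = np.linalg.inv(J - z * np.eye(size))
    R = full[np.ix_([i_m1, i_0], [i_m1, i_0])]  # 2x2 resolvent block

    rhs = np.linalg.inv(np.array([[1 / r_minus, p0], [p0, 1 / r_plus]]))
    return R, rhs

R, rhs = resolvent_block_check()
print(np.allclose(R, rhs))
```

The same block formula yields the scalar relations for the diagonal entries of R(z) that are used in the existence part of the proof.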
Let $\{P_n^\pm(z)\}$ be the orthonormal polynomials with respect to the (scalar) measure $d\sigma_\pm$ and
\[
Q_n^\pm(z) := \int\frac{P_n^\pm(x) - P_n^\pm(z)}{x-z}\,d\sigma_\pm(x)
\]
(so-called polynomials of the second kind). In these terms
\[
\hat e_n(x) = \begin{bmatrix} -p_0Q_n^+(x)\\ P_n^+(x)\end{bmatrix},\quad n\ge0,
\qquad
\hat e_{-n-1}(x) = \begin{bmatrix} P_n^-(x)\\ -p_0Q_n^-(x)\end{bmatrix},\quad n\ge0. \tag{2.2}
\]
Now, we prove Theorem 0.5.

Proof of Theorem 0.5, the uniqueness part. The function $e^\pm(0,\zeta)/e^\pm(-1,\zeta)$ is $\Gamma$-automorphic, thus it defines a meromorphic function in $\bar{\mathbb C}\setminus E$,
\[
\tilde r_\pm(z(\zeta)) := -\frac{e^\pm(0,\zeta)}{p_0\,e^\pm(-1,\zeta)}.
\]
The recurrence relations imply that $\tilde r_\pm(z)$ possesses the same decomposition into a continued fraction as $r_\pm(z)$. Therefore,
\[
r_\pm(z(\zeta)) = -\frac{e^\pm(0,\zeta)}{p_0\,e^\pm(-1,\zeta)}. \tag{2.3}
\]
By Proposition 1.1 the asymptotic (1.18) implies the identity (1.19). Using this identity, we get ($t\in\mathbb T$)
\[
r_\pm(z(t)) - \overline{r_\pm(z(t))}
= -p_0\,\frac{e^\pm(0,t)\overline{e^\pm(-1,t)} - e^\pm(-1,t)\overline{e^\pm(0,t)}}{|p_0e^\pm(-1,t)|^2}
= \frac{-t z'(t)}{|p_0e^\pm(-1,t)|^2}.
\]
This means that an outer part of the function $e^\pm(-1,\zeta)$ is determined uniquely. But then (2.3) means that an outer part of $e^\pm(0,\zeta)$ is determined uniquely, and since $b(\zeta)e^\pm(-1,\zeta)$ and $e^\pm(0,\zeta)$ are of Smirnov class, these functions are determined up to a common inner factor $\Delta_\pm(\zeta)$, i.e.,
\[
e^\pm(0,\zeta) = \Delta_\pm(\zeta)\,\tilde e^\pm(0,\zeta) \quad\text{and}\quad e^\pm(-1,\zeta) = \Delta_\pm(\zeta)\,\tilde e^\pm(-1,\zeta), \tag{2.4}
\]
where the inner parts of $\tilde e^\pm(0,\zeta)$, $\tilde e^\pm(-1,\zeta)$ are relatively prime. To show that $\Delta_\pm(\zeta) = 1$, we use (0.23), (0.24). Since
\[
s(t)e^\mp(0,t) = \bar t\,e^\pm(-1,\bar t) + s_\pm(t)e^\pm(-1,t),
\qquad
s(t)e^\mp(-1,t) = \bar t\,e^\pm(0,\bar t) + s_\pm(t)e^\pm(0,t), \tag{2.5}
\]
we have
\[
s(t)\{e^\mp(-1,t)e^\pm(-1,t) - e^\mp(0,t)e^\pm(0,t)\} = \bar t\,\{e^\pm(0,\bar t)e^\pm(-1,t) - e^\pm(-1,\bar t)e^\pm(0,t)\}.
\]
Substituting (2.4) and using the symmetry
\[
\overline{\tilde e^\pm(0,\bar t)} = \tilde e^\pm(0,t), \qquad \overline{\tilde e^\pm(-1,\bar t)} = \tilde e^\pm(-1,t),
\]
we obtain
\[
s(t)b^2(t)\{e^\mp(-1,t)\tilde e^\pm(-1,t) - e^\mp(0,t)\tilde e^\pm(0,t)\}
= \bar t\,\Delta_\pm(\bar t)\,b^2(t)\{\overline{\tilde e^\pm(0,t)}\,\tilde e^\pm(-1,t) - \overline{\tilde e^\pm(-1,t)}\,\tilde e^\pm(0,t)\}
= -b^2(t)z'(t)\{p_0\overline{\Delta_\pm(\bar t)}\}^{-1}.
\]
Since the first expression here is a function of Smirnov class and $b^2z'$ is an outer function, we conclude that $\Delta_\pm(t)$ is a constant. Since
\[
\bar t\,\Phi(\bar t) = -S(t)\Phi(t) \tag{2.6}
\]
with $\Phi(t)$ defined by (1.21), $S(t)$ is also determined in a unique way. At last, by the recurrence relations we get the same conclusion with respect to all functions $\{e^\pm(n,\zeta)\}$, not only for $n = -1, 0$. □

Proof of Theorem 0.5, the existence part. The key instrument is the following theorem [11]: if $r(z)$ is a meromorphic function in $\bar{\mathbb C}\setminus E$ such that $\mathrm{Im}\,r(z)/\mathrm{Im}\,z \ge 0$ and poles of $r(z(\zeta))$ satisfy the Blaschke condition, then $r(z(\zeta))$ is a function of bounded characteristic in $\mathbb D$ without a singular component in the multiplicative representation.

Let us show that poles of $r_\pm(z(\zeta))$ satisfy the Blaschke condition. Diagonal entries $R_{-1,-1}(z)$ and $R_{0,0}(z)$ of the resolvent matrix-function $R(z)$ are holomorphic in $\bar{\mathbb C}\setminus E$. By the theorem mentioned above they are functions of bounded characteristic. By virtue of (2.1),
\[
-1/R_{-1,-1}(z) = -1/r_-(z) + p_0^2\,r_+(z), \qquad -1/R_{0,0}(z) = -1/r_+(z) + p_0^2\,r_-(z).
\]
This means that poles of $r_\pm$ are subsets of poles of $1/R_{-1,-1}$ and $1/R_{0,0}$. Thus $r_\pm(z(\zeta))$ are functions of bounded characteristic.

Now, let us use the Szegö condition $\log\det\mathrm{Im}\,R(z(t)) \in L^1$. Since
\[
\det\mathrm{Im}\,R^{-1}(z(t)) = |\det R^{-1}(z(t))|^2\,\det\mathrm{Im}\,R(z(t)),
\]
using again (2.1), we have
\[
\log\mathrm{Im}\,r_-^{-1}(z(t)) + \log\mathrm{Im}\,r_+^{-1}(z(t)) = \log\det\mathrm{Im}\,R^{-1}(z(t)) \in L^1.
\]
Therefore, each of the functions $\log\mathrm{Im}\,r_\pm(z(t))$ belongs to $L^1$. Thus we can represent $r_\pm(z)$ (uniquely) in the form
\[
r_\pm(z(\zeta)) = -\frac{e^\pm(0,\zeta)}{p_0\,e^\pm(-1,\zeta)},
\]
where $e^\pm(0,\zeta)$ and $b(\zeta)e^\pm(-1,\zeta)$ are functions of Smirnov class with coprime inner parts (in fact, they are Blaschke products) such that
\[
\bar t\,p_0\{e^\pm(0,t)e^\pm(-1,\bar t) - e^\pm(-1,t)e^\pm(0,\bar t)\} = z'(t), \tag{2.7}
\]
and $e^\pm(0,0) > 0$, $(be^\pm)(-1,0) > 0$. Note that
\[
p_0 = \frac{e^\pm(0,0)}{(be^\pm)(-1,0)}.
\]
As soon as the functions $e^\pm(0,\zeta)$ and $e^\pm(-1,\zeta)$ have been constructed we are able to introduce $S(t)$ and $F^\pm$ in their terms. First, let us write down an expression for the resolvent matrix-function:
\[
R(z(\zeta)) = \begin{bmatrix} -p_0\dfrac{e^-(-1,\zeta)}{e^-(0,\zeta)} & p_0\\[2mm] p_0 & -p_0\dfrac{e^+(-1,\zeta)}{e^+(0,\zeta)}\end{bmatrix}^{-1}
= -(p_0\Phi)^{-1}\Psi = -\tilde\Psi(p_0\tilde\Phi)^{-1}, \tag{2.8}
\]
where $\Phi$ and $\tilde\Phi$ are as in (1.21) and (1.28) respectively, and
\[
\Psi(\zeta) = \tilde\Psi(\zeta) = \begin{bmatrix} e^-(0,\zeta) & 0\\ 0 & e^+(0,\zeta)\end{bmatrix}.
\]
Therefore,
\[
p_0^2\{R(z(t)) - R^*(z(t))\} = -tz'(t)\,\Phi^{-1}(t)\Phi^{-1*}(t) = -tz'(t)\,\tilde\Phi^{-1*}(t)\tilde\Phi^{-1}(t), \tag{2.9}
\]
since (see (2.7))
\[
p_0\{\Psi\Phi^* - \Phi\Psi^*\} = p_0\{\tilde\Phi^*\tilde\Psi - \tilde\Psi^*\tilde\Phi\} = tz'.
\]
From (2.9) and $\tilde\Phi^*(t) = \Phi(\bar t)$ we get immediately that the matrix-function $S(t)$ defined by (2.6) is unitary-valued.

Let us show that its element $s(\zeta)$ is an outer function. In fact, we have to show that the function $b^2(\zeta)\det\Phi(\zeta)$ is an outer function (see (1.22)). To this end let us use the representation for the diagonal entries of $R(z)$ (see (2.8))
\[
R_{-1,-1}(z(\zeta)) = -\frac{e^+(-1,\zeta)e^-(0,\zeta)}{p_0\det\Phi(\zeta)},
\qquad
R_{0,0}(z(\zeta)) = -\frac{e^-(-1,\zeta)e^+(0,\zeta)}{p_0\det\Phi(\zeta)}.
\]
Let $\Delta$ be an inner part of $b^2(\zeta)\det\Phi(\zeta)$. Since $R_{0,0}(z(\zeta))$ is of Smirnov class, $\Delta$ is a divisor of $e^-(-1,\zeta)e^+(0,\zeta)$. If $\Delta$ is not trivial, then it has a non-trivial divisor $\Delta_1$ that is a divisor of one of these functions, say, $e^-(-1,\zeta)$. Since $e^-(-1,\zeta)$ and $e^-(0,\zeta)$ are coprime (and $\Delta_1$ is a divisor of $b^2(\zeta)\det\Phi(\zeta)$), $\Delta_1$ is a divisor of $e^+(0,\zeta)$, and, therefore, it is not a divisor of $e^+(-1,\zeta)$. Thus, $\Delta_1$ is not a divisor of the product $e^+(-1,\zeta)e^-(0,\zeta)$. But this means that $R_{-1,-1}(z(\zeta))$ is not of Smirnov class. We arrive at a contradiction, hence $\Delta$ is a constant.

We define $F^\pm$ by the formulas
\[
(F^+f)(t) = \begin{bmatrix} e^+(-1,t) & e^+(0,t)\end{bmatrix}\hat f(z(t)),
\qquad
(F^-f)(t) = \begin{bmatrix} e^-(0,t) & e^-(-1,t)\end{bmatrix}\hat f(z(t)). \tag{2.10}
\]
Evidently, $(F^\pm Jf)(t) = z(t)(F^\pm f)(t)$ and by (2.6), (0.23) are fulfilled. Using the formula for the spectral density $\rho(x) = \frac1\pi\,\mathrm{Im}\,R(x)$ and (2.9), we have
\[
\int_E \hat f^*(x)\rho(x)\hat f(x)\,dx = \frac12\int_E \hat f^*(z(t))\,(p_0\tilde\Phi)^{*-1}(p_0\tilde\Phi)^{-1}\hat f(z(t))\,|z'(t)|^2\,dm(t).
\]
Since
\[
\tilde\Phi^{-1}(t) = \frac{1}{\det\tilde\Phi(t)}\begin{bmatrix} e^+(-1,t) & e^+(0,t)\\ e^-(0,t) & e^-(-1,t)\end{bmatrix},
\]
we obtain
\[
\|f\|^2 = \|\hat f\|^2_{L^2_{d\sigma}} = \frac12\{\|sF^+f\|^2 + \|sF^-f\|^2\} = \|F^+f\|^2_{s_+} = \|F^-f\|^2_{s_-}.
\]
Thus $F^+$ is an isometry, and since this map is invertible,
\[
\begin{bmatrix}\hat f_{-1}(z(t))\\ \hat f_0(z(t))\end{bmatrix}
= -\frac{p_0}{z'(t)}\begin{bmatrix} \bar t\,e^+(0,\bar t) & -e^+(0,t)\\ -\bar t\,e^+(-1,\bar t) & e^+(-1,t)\end{bmatrix}
\begin{bmatrix} g(t)\\ \bar t\,g(\bar t)\end{bmatrix},
\]
where $g = F^+f$, it is a unitary map. Further, using (2.2), for $n\ge0$ we have
\[
e^+(n,\zeta) = \begin{bmatrix} e^+(-1,\zeta) & e^+(0,\zeta)\end{bmatrix}\begin{bmatrix} -p_0Q_n^+(z(\zeta))\\ P_n^+(z(\zeta))\end{bmatrix}.
\]
Due to the well known properties of orthogonal polynomials these functions have no singularity at the origin and hence they are functions of Smirnov class. This easily implies (0.24). At last, since our maps possess properties (1.19) (or (1.20)), by virtue of Proposition 1.1, (0.25) holds. The theorem is proved. □

Theorem 2.1. Let $s_+ \in L^\infty(\Gamma, \alpha_+^{-2})$, $\|s_+\| \le 1$, $\overline{s_+(\bar t)} = s_+(t)$, satisfy $\log(1-|s_+|^2) \in L^1$. Then the reflection coefficient $s_+$ determines a Jacobi matrix of Szegö class in a unique way if and only if
\[
s(0)\,K^{\alpha_\pm}_{s_\pm}(0)\,K^{\alpha_\mp\mu}_{s_\mp b^{-2}}(0) = b'(0). \tag{2.11}
\]
Proof. Assume on the contrary that
\[
s(0)\,K^{\alpha_+}_{s_+}(0)\,K^{\alpha_-\mu}_{s_-b^{-2}}(0) \ne b'(0). \tag{2.12}
\]
We construct two Jacobi matrices. First, we consider the basis
\[
e^+(n,t) = b^n(t)\,K^{\alpha_+\mu^{-n}}_{s_+b^{2n}}(t), \tag{2.13}
\]
and by $J$ we denote the operator multiplication by $z(t)$ in $L^2_{s_+}$ with respect to this basis (Lemma 1.4). Then, starting with the basis $\{b^n(t)K^{\alpha_-\mu^{-n}}_{s_-b^{2n}}(t)\}$ in $L^2_{s_-}$, we introduce the basis
\[
s(t)\,\tilde e^+(-n-1,t) = \bar t\,\bigl(b^nK^{\alpha_-\mu^{-n}}_{s_-b^{2n}}\bigr)(\bar t) + s_-(t)\bigl(b^nK^{\alpha_-\mu^{-n}}_{s_-b^{2n}}\bigr)(t). \tag{2.14}
\]
By $\tilde J$ we denote the operator multiplication by $z(t)$ in $L^2_{s_+}$ with respect to $\{\tilde e^+(n,t)\}$. By Lemma 1.5,
\[
s(0)\,\tilde e^+(0,0)\,K^{\alpha_-\mu}_{s_-b^{-2}}(0) = b'(0).
\]
Thus (see (2.12)), $\tilde e^+(0,0) \ne e^+(0,0)$. Due to the uniqueness part of Theorem 0.5, $\tilde J \ne J$. The “only if” part is proved.

Now, let (2.11) hold, and let $J$ be a Jacobi matrix of Szegö class and $F^\pm$ be its representations in $L^2_{s_\pm}$. By Lemma 1.7,
\[
K^{\alpha_\pm}_{s_\pm}(0) \le e^\pm(0,0) = \frac{1}{s(0)}\,\frac{b'(0)}{(be^\mp)(-1,0)} \le \frac{1}{s(0)}\,\frac{b'(0)}{K^{\alpha_\mp\mu}_{s_\mp b^{-2}}(0)}.
\]
Then (2.11) implies that, in fact, $e^\pm(0,0) = K^{\alpha_\pm}_{s_\pm}(0)$ and $(be^\mp)(-1,0) = K^{\alpha_\mp\mu}_{s_\mp b^{-2}}(0)$, thus, due to a conclusion of Lemma 1.7,
\[
e^\pm(0,t) = K^{\alpha_\pm}_{s_\pm}(t), \qquad e^\mp(-1,t) = b^{-1}(t)\,K^{\alpha_\mp\mu}_{s_\mp b^{-2}}(t).
\]
Recall that these functions determine the functions $r_\pm(z)$ and the coefficient $p_0$ (see (2.3)), and they, in their turn, determine $J$. The theorem is proved. □

Corollary 2.1. Let $J$ be a Jacobi matrix of Szegö class with a homogeneous spectrum $E$. Let $\rho(x)$ be the density of its spectral measure and $S(t)$ be its scattering matrix-function. If
\[
\int_E \rho^{-1}(x)\,dx < \infty, \tag{2.15}
\]
then there is no other Jacobi matrix of Szegö class with the same scattering matrix-function $S(t)$.

Proof. By virtue of (1.29), (2.15) is equivalent to
\[
\int_E \tilde\Phi(t)\tilde\Phi^*(t)\,dm < \infty,
\]
that is, $e^\pm(0,t)$ and $e^\pm(-1,t)$ belong to $L^2_{dm|E}$. Then a word by word repetition of the arguments in the proof of Lemma 1.6 gives us
\[
(I + H_{s_\pm})\,e^\pm(0,t)\,e^\pm(0,0) = k^{\alpha_\pm}(t),
\qquad
(I + H_{s_\pm b^{-2}})\,(be^\pm)(-1,t)\,(be^\pm)(-1,0) = k^{\alpha_\pm\mu}(t).
\]
Thus, $e^\pm(0,0) = K^{\alpha_\pm}_{s_\pm}(0)$ and $(be^\pm)(-1,0) = K^{\alpha_\pm\mu}_{s_\pm b^{-2}}(0)$. Since, generally,
\[
s(0)\,e^\pm(0,0)\,(be^\pm)(-1,0) = b'(0),
\]
(2.11) holds, the corollary is proved. □
To finish this section we give an example of a scattering matrix-function, which does not determine a Jacobi matrix of Szegö class. Moreover, in this example, the associated operators (I + Hs± ) are invertible.
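Example. Let $v_\pm \in H^\infty(\Gamma)$, $\|v_\pm\| < 1$, $\overline{v_\pm(\bar t)} = v_\pm(t)$, $v_\pm(0) = 0$. Define outer functions $u_\pm$, $u_\pm(0) > 0$, by $|u_\pm|^2 + |v_\pm|^2 = 1$. Then, we put
\[
s^0_\pm = -\bar v_\pm u_\pm/\bar u_\pm.
\]
At last,
\[
S(t) = \begin{bmatrix} s_- & s\\ s & s_+\end{bmatrix}
= \begin{bmatrix} s^0_- & 0\\ 0 & s^0_+\end{bmatrix}
+ \begin{bmatrix} u_- & 0\\ 0 & u_+\end{bmatrix}\mathcal E\left(I - \begin{bmatrix} v_- & 0\\ 0 & v_+\end{bmatrix}\mathcal E\right)^{-1}\begin{bmatrix} u_- & 0\\ 0 & u_+\end{bmatrix},
\]
where
\[
\mathcal E = \begin{bmatrix} \dfrac{1+\Delta}{2} & \dfrac{1-\Delta}{2}\\[2mm] \dfrac{1-\Delta}{2} & \dfrac{1+\Delta}{2}\end{bmatrix},
\]
and $\Delta$ is an inner function from $H^\infty(\Gamma)$, $\overline{\Delta(\bar t)} = \Delta(t)$.

In this case $H_{s_\pm} = H_{s^0_\pm}$, since their symbols differ by functions from $H^\infty(\Gamma, \alpha_\pm^{-2})$, and therefore $(I + H_{s_\pm})$ are invertible. On the other hand, the coefficient $s$ is of the form
\[
s = \frac{u_+u_-(1-\Delta)/2}{1 - (v_+ + v_-)(1+\Delta)/2 + v_+v_-\Delta},
\]
and because of the factor $(1-\Delta)/2$, $1/s$ does not belong to $H^\infty(\Gamma, \alpha_+\alpha_-)$. The simplest choice of parameters: $E = [-2,2]$, $v_\pm(t) = a_\pm t$, $a_\pm \in (0,1)$; $\Delta(t)$ is a Blaschke product, $\deg\Delta > 1$, gives us an example where $\tilde e^+(-1,t)$, defined by (2.14), does not belong to $L^2$ (this is a direct calculation), while at the same time $e^+(-1,t)$, defined by (2.13), belongs to $L^2$.

3. A Weighted Hilbert Transform

By $\mathcal H$ we denote the transform
\[
(\mathcal Hg)(z) = \int_E \frac{g(x)}{z-x}\,dx, \qquad z \in \mathbb C\setminus E, \tag{3.1}
\]
primarily defined on integrable 2D vector-functions.

Lemma 3.1. Let $J$ be of Szegö class and $F^\pm$ give its scattering representation in the model spaces $L^2_{s_\pm}$. Then
\[
\begin{bmatrix} F^-f^-\\ F^+f^+\end{bmatrix}(\zeta) = p_0\,\Phi(\zeta)\{\mathcal H(\rho\hat f)\}(z(\zeta)), \tag{3.2}
\]
for any finite vector $f = f^-\oplus f^+ \in l^2(\mathbb Z) = l^2(\mathbb Z_-)\oplus l^2(\mathbb Z_+)$.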
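For orientation, the transform (3.1) is easy to evaluate numerically away from $E$. The following is a minimal sketch for the scalar case, assuming $E$ is a single interval $[a, b]$ and using a plain midpoint rule; the function name `hilbert_type_transform` and its parameters are illustrative and do not come from the paper.

```python
import numpy as np

def hilbert_type_transform(g, z, a=-2.0, b=2.0, n=20000):
    """Midpoint-rule approximation of (Hg)(z) = int_a^b g(x) / (z - x) dx.

    Assumes E = [a, b] is one interval and z lies off [a, b];
    g must accept a NumPy array of sample points.
    """
    h = (b - a) / n
    x = a + (np.arange(n) + 0.5) * h      # midpoints of the n subintervals
    return np.sum(g(x) / (z - x)) * h

# For g = 1 on [-2, 2] the transform is log((z + 2) / (z - 2)).
val = hilbert_type_transform(lambda x: np.ones_like(x), 3.0)
print(val, np.log(5.0))
```

For a vector-valued $g$ the same quadrature can be applied componentwise; the boundary values at $x \pm i0$ that enter Theorem 3.1 would need a more careful treatment than this sketch provides.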
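Proof. Let $\tilde P_n(z)$ denote the $n$th matrix orthonormal polynomial with respect to the spectral measure $d\sigma$. Recall that
\[
\tilde P_n(z) = \begin{bmatrix}\hat e_{-n-1}(z) & \hat e_n(z)\end{bmatrix} = \begin{bmatrix} P_n^-(z) & -p_0Q_n^+(z)\\ -p_0Q_n^-(z) & P_n^+(z)\end{bmatrix},
\]
and, analogously to the scalar case,
\[
\tilde Q_n(z) := \int d\sigma(x)\,\frac{\tilde P_n(z) - \tilde P_n(x)}{z-x} = \begin{bmatrix} Q_n^-(z) & 0\\ 0 & Q_n^+(z)\end{bmatrix}. \tag{3.3}
\]
Based on (3.3), we have
\[
p_0\Phi(\zeta)\int\frac{d\sigma(x)}{z(\zeta)-x}\,\tilde P_n(x)
= p_0\Phi(\zeta)\left\{\int\frac{d\sigma(x)}{z(\zeta)-x}\,\tilde P_n(z(\zeta)) - \int d\sigma(x)\,\frac{\tilde P_n(z(\zeta)) - \tilde P_n(x)}{z(\zeta)-x}\right\}
= -p_0\Phi(\zeta)R(z(\zeta))\tilde P_n(z(\zeta)) - p_0\Phi(\zeta)\tilde Q_n(z(\zeta)).
\]
Using (2.8) and Definition (2.10), we get
\[
p_0\Phi(\zeta)\int\frac{\rho(x)\begin{bmatrix}\hat e_{-n-1}(x) & \hat e_n(x)\end{bmatrix}}{z(\zeta)-x}\,dx
= \Psi(\zeta)\tilde P_n(z(\zeta)) - p_0\Phi(\zeta)\tilde Q_n(z(\zeta))
= \begin{bmatrix} e^-(n,\zeta) & 0\\ 0 & e^+(n,\zeta)\end{bmatrix}
= \begin{bmatrix} (F^-e_{-n-1})(\zeta) & 0\\ 0 & (F^+e_n)(\zeta)\end{bmatrix}.
\]
In fact, this finishes the proof. □

Theorem 3.1. Let $\rho(x)$ be the spectral density of a Jacobi matrix $J$ of Szegö class and $s_+(t)$ be the reflection coefficient. Then the following statements are equivalent:

1. There exists $C < \infty$ such that
\[
\int_E (\mathcal Hg)^*(x-i0)\rho^{-1}(x)(\mathcal Hg)(x-i0)\,dx + \int_E (\mathcal Hg)^*(x+i0)\rho^{-1}(x)(\mathcal Hg)(x+i0)\,dx \le C\int_E g^*(x)\rho^{-1}(x)g(x)\,dx. \tag{3.4}
\]
2. $s_+$ determines $J$ and the operators $(I + H_{s_\pm})$ are invertible.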
Proof. 1 ⇒ 2. Since (see (1.30))
\[
\begin{aligned}
\|F^-f^-\|^2 + \|F^+f^+\|^2
&= \int_E \{\mathcal H(\rho\hat f)\}^*(z(t))\,(p_0\Phi)^*(t)(p_0\Phi)(t)\,\{\mathcal H(\rho\hat f)\}(z(t))\,dm\\
&= \frac{1}{2\pi}\int_E \{\mathcal H(\rho\hat f)\}^*(z(t))\,\rho^{-1}(z(t))\,\{\mathcal H(\rho\hat f)\}(z(t))\,\frac{|z'(t)|\,|dt|}{2\pi}\\
&= \left(\frac{1}{2\pi}\right)^2\int_E \{\mathcal H(\rho\hat f)\}^*(x-i0)\,\rho^{-1}(x)\,\{\mathcal H(\rho\hat f)\}(x-i0)\,dx\\
&\quad + \left(\frac{1}{2\pi}\right)^2\int_E \{\mathcal H(\rho\hat f)\}^*(x+i0)\,\rho^{-1}(x)\,\{\mathcal H(\rho\hat f)\}(x+i0)\,dx\\
&\le \frac{C}{(2\pi)^2}\int_E \hat f^*(x)\rho(x)\hat f(x)\,dx = \frac{C}{(2\pi)^2}\|f\|^2,
\end{aligned} \tag{3.5}
\]
we get $F^\pm f^\pm \in A^2_1(\Gamma,\alpha_\pm)$. Thus, $F^\pm\{l^2(\mathbb Z_\pm)\} = H^2_{s_\pm}(\alpha_\pm)$. By Lemma 1.7 and Theorem 2.1 we come to the conclusion that $s_+$ determines $J$. Further, by (3.5)
\[
\|F^-f^-\|^2 + \|F^+f^+\|^2 \le \frac{C}{(2\pi)^2}\{\|f^-\|^2 + \|f^+\|^2\} = \frac{C}{(2\pi)^2}\{\|F^-f^-\|^2_{s_-} + \|F^+f^+\|^2_{s_+}\}.
\]
Using again $F^\pm f^\pm \in A^2_1(\Gamma,\alpha_\pm)$, we can represent the last norms in the form
\[
\|F^-f^-\|^2 + \|F^+f^+\|^2 \le \frac{C}{(2\pi)^2}\{\langle (I+H_{s_-})F^-f^-, F^-f^-\rangle + \langle (I+H_{s_+})F^+f^+, F^+f^+\rangle\}. \tag{3.6}
\]
This proves the second statement in 2.

2 ⇒ 1. Recall that $H^2_{s_\pm}(\alpha_\pm) = \mathrm{clos}_{L^2_{s_\pm}}A^2_1(\Gamma,\alpha_\pm)$, but in the case under consideration, the norm in $H^2_{s_\pm}(\alpha_\pm)$ is equivalent to the norm in $A^2_1(\Gamma,\alpha_\pm)$, i.e.:
\[
h \in H^2_{s_\pm}(\alpha_\pm) \Rightarrow h \in A^2_1(\Gamma,\alpha_\pm).
\]
Further, since $s_+$ determines $J$, by Lemma 1.7, we have $F^\pm\{l^2(\mathbb Z_\pm)\} = H^2_{s_\pm}(\alpha_\pm)$. So, starting with (3.6) we obtain (3.4). □

4. Matrix $A_2$ on Homogeneous Sets

In this section our goal is to show that one can substitute (3.4) by the $A_2$ condition. We do this in a bit more general setting than we need. Let $E$ be a homogeneous set. Throughout this section $P_+$ denotes the orthoprojector from the vector-valued $L^2(\mathbb C^n)$ onto $H^2(\mathbb C^n)$ in the upper half-plane. We are interested in the boundedness of the weighted transform
\[
W^{1/2}P_+W^{-1/2} : \chi_EL^2(\mathbb C^n) \to \chi_EL^2(\mathbb C^n), \tag{4.1}
\]
where $W$ is a weight on $E$ and $\chi_E$ is the characteristic function of the set $E$.
Here is an analog of the matrix $A_2$ condition
\[
\sup_{x\in E,\ 0<\delta<1}\bigl\|\langle W\rangle^{1/2}_{I(x,\delta)}\,\langle W^{-1}\rangle^{1/2}_{I(x,\delta)}\bigr\| < \infty, \tag{4.2}
\]
where $I(x,\delta) := (x-\delta, x+\delta)$ and
\[
\langle W\rangle_{I(x,\delta)} := \frac{1}{|I(x,\delta)|}\int_{I(x,\delta)\cap E}W(t)\,dt.
\]
and
PKbz f, gL2 (Cn ) = (P+ f )(z0 ), (P+ g)(z0 )C n . 0
Because of the first of these relations we have |W 1/2 PKbz W −1/2 f, g| ≤ 2Q||χE f || ||χE g||. 0
Now, using the second one we get |(P+ W −1/2 f )(z0 ), (P+ W 1/2 g)(z0 )| ≤ 2Q||χE f || ||χE g||.
(4.3)
Let us substitute f = W −1/2
ξ , x − z0
g = W 1/2
η , x − z0
ξ, η ∈ Cn
in (4.3). This gives us | W −1 z0 ξ, W z0 η| ≤ 2Q||W −1 z0 ξ || ||W z0 η||, 1/2
1/2
where W z0 denotes an average with the Poisson kernel, Im z0 1 W z0 := W dx. π |x − z0 |2 Thus we proved an inequality with the Poisson averages W z0 ≤ 2QW −1 −1 z0 . At last let us note that c Im z0 ≥ χI , 2 |x − z0 | |I |
I = I(Re z0 ,Im z0 ) ,
with an absolute and positive constant c. Therefore (4.4) implies (4.2).
(4.4)
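To get a feeling for the quantity $Q_{2,E}(W)$ in (4.2), one can approximate it on a grid. The sketch below is only an illustration under stated assumptions: the set $E$ is given by an indicator function, integrals are replaced by Riemann sums on a uniform grid, the supremum is taken over a finite list of centers and radii supplied by the caller, and the names `a2_quantity` and `_sqrtm_psd` are hypothetical.

```python
import numpy as np

def _sqrtm_psd(a):
    """Square root of a Hermitian positive semidefinite matrix via eigh."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)              # guard against tiny negative rounding
    return (v * np.sqrt(w)) @ v.conj().T   # V diag(sqrt(w)) V^*

def a2_quantity(weight, in_E, centers, deltas, grid):
    """Approximate sup || <W>_I^{1/2} <W^{-1}>_I^{1/2} || over I = (x-d, x+d).

    weight(t) returns an invertible n x n matrix for t in E,
    in_E(grid) returns a boolean mask of grid points lying in E,
    centers should be points of E, deltas the radii to try,
    grid is a uniform 1-D array covering all intervals used.
    """
    h = grid[1] - grid[0]
    pts = grid[in_E(grid)]                      # grid points inside E
    W = np.array([weight(t) for t in pts])      # shape (N, n, n)
    W_inv = np.linalg.inv(W)                    # batched inverse
    best = 0.0
    for x in centers:
        for d in deltas:
            sel = (pts > x - d) & (pts < x + d)
            if not sel.any():
                continue
            # averages are normalized by |I| = 2d, integrated over I ∩ E
            avg_W = W[sel].sum(axis=0) * h / (2 * d)
            avg_W_inv = W_inv[sel].sum(axis=0) * h / (2 * d)
            val = np.linalg.norm(_sqrtm_psd(avg_W) @ _sqrtm_psd(avg_W_inv), 2)
            best = max(best, val)
    return best

grid = np.linspace(-1.0, 2.0, 30001)
C = np.array([[2.0, 1.0], [1.0, 2.0]])
q = a2_quantity(lambda t: C, lambda t: (t >= 0) & (t <= 1),
                centers=[0.0, 0.5], deltas=[0.1, 0.25], grid=grid)
print(q)
```

For a constant weight on $E = [0,1]$ the product of the two averages is $|I\cap E|/|I|$ times the identity, so the approximate supremum is close to 1, attained on intervals contained in $E$.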
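Lemma 4.1. If $I$ is an interval centered at $E$ and $z_0$ is the center of the square built on $I$, then
\[
W \in A_2(E) \Rightarrow \langle W\rangle_{z_0} \le C(E, Q_{2,E}(W))\,\langle W\rangle_I. \tag{4.5}
\]
Proof. First we note that for $\lambda = 2/\eta$,
\[
|\lambda I\cap E| \ge \eta|\lambda I| \ge 2|I| \ge 2|I\cap E|,
\]
and therefore $|(\lambda I\setminus I)\cap E| \ge |I\cap E|$. Let us show that
\[
W(\lambda I) \ge \left(1 + \frac{\eta^2}{\lambda^2Q^2}\right)W(I) \quad\text{for } W \in A_2(E).
\]
Integrating the inequality
\[
\begin{bmatrix} W^{-1} & 1\\ 1 & W\end{bmatrix} \ge 0
\]
over $(\lambda I\setminus I)\cap E$ we get
\[
\begin{bmatrix} W^{-1}(\lambda I\setminus I) & |(\lambda I\setminus I)\cap E|\\ |(\lambda I\setminus I)\cap E| & W(\lambda I\setminus I)\end{bmatrix} \ge 0.
\]
Therefore
\[
\begin{bmatrix} W^{-1}(\lambda I) & |(\lambda I\setminus I)\cap E|\\ |(\lambda I\setminus I)\cap E| & W(\lambda I) - W(I)\end{bmatrix} \ge 0,
\]
or
\[
W(\lambda I) - W(I) \ge |(\lambda I\setminus I)\cap E|^2\{W^{-1}(\lambda I)\}^{-1} \ge |I\cap E|^2\{W^{-1}(\lambda I)\}^{-1}.
\]
Using (4.2) we obtain
\[
W(\lambda I) - W(I) \ge \frac{|I\cap E|^2}{Q^2|\lambda I|^2}\,W(\lambda I) \ge \frac{\eta^2}{Q^2\lambda^2}\,W(I).
\]
To prove (4.5), using
\[
\frac{\mathrm{Im}\,z_0}{|x-z_0|^2} \le \frac{c}{|I|}\sum_{k\ge0}\frac{1}{\lambda^{2k}}\,\chi_{\lambda^kI}, \qquad \lambda^kI = I(\mathrm{Re}\,z_0, \lambda^k\,\mathrm{Im}\,z_0),
\]
we write the following chain of inequalities:
\[
\begin{aligned}
\langle W\rangle_{z_0}
&\le \frac{c}{|I|}\sum_{k\ge0}\frac{1}{\lambda^{2k}}\,W(\lambda^kI)\\
&\le \frac{c}{|I|}\sum_{k\ge0}\frac{Q^2|\lambda^kI|^2}{\lambda^{2k}}\,\{W^{-1}(\lambda^kI)\}^{-1}\\
&\le \frac{cQ^2}{|I|}\,|I|^2\,\{W^{-1}(I)\}^{-1}\sum_{k\ge0}\left(1 + \frac{\eta^2}{\lambda^2Q^2}\right)^{-k}\\
&\le \frac{cQ^2}{\eta^2}\sum_{k\ge0}\left(1 + \frac{\eta^2}{\lambda^2Q^2}\right)^{-k}\langle W\rangle_I.
\end{aligned} \tag{4.6}
\]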
Proof of sufficiency. We want to prove that (4.2) suffices for W 1/2 P+ W −1/2 in (4.1) to be bounded. Fix f, g ∈ χE L2 (Cn ). We need to show (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn Im z dA(z) ≤ C||f || ||g||. C+
In other words, introducing a Stolz cone &t and S(t) = (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn dA(z) &t
one needs to prove that
S(t) dt ≤ C||f || ||g||.
(4.7)
We follow closely the lines of the proof in [13]. Let us consider a nonnegative function h(t) and Sh(t) (t) = (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn dA(z), &t,h(t)
where
&t,h(t) = &t ∩ {z : Im z ≤ h(t)}.
Let us note that
S(t) dt ≤ c
Sh(t) dt
(4.8)
if the function h(t) has the following property: ∀I ⊂ R
|{t ∈ I : h(t) ≥ |I |}| ≥ a|I |.
(4.9)
Let us choose h to be maximal such that Sh(t) (t) ≤ B(M||f ||p∗ )1/p∗ (t)(M||g||p∗ )1/p∗ (t),
(4.10)
where B, p∗ ∈ (1, 2) will be chosen a bit later and M denotes the maximal function 1 |f (t)| dt. (Mf )(x) = sup δ>0 |I (x, δ)| I (x,δ) If this h satisfies (4.9), then (4.8) and (4.10) imply what we need. To choose B, p∗ and to prove that h satisfies (4.9) we follow the algorithm below. Let I0 be an arbitrary interval on the real axis. We will consider two cases: 2I0 ∩ E = ∅ and 2I0 ∩ E = ∅. In the first case we fix an interval I centered at E such that I0 ⊂ I and |I | ≤ 3|I0 |. 1/2 Let f1 = f · χ2I , g1 = g · χ2I and f2 = f − f1 , g2 = g − g1 . Denote AI = W I . Consider # $1/2 t ∈ I, S AI (fi )(t) = ||(P+ AI W −1/2 fi ) ||2 dA(z) , i = 1, 2; &t,|I |
t ∈ I,
S
A−1 I
# (gi )(t) =
&t,|I |
$1/2 1/2 ||(P+ A−1 gi ) ||2 dA(z) I W
,
i = 1, 2.
We will fix later α = α(Q2,E (W ), n) > 1. Now, % &α C(α) 1 A−1 1/2 I ||A−1 g(t)||α dt S (g1 )(t) dt ≤ I W |I | I |I | 2I $α # n
C1 (α, n) −1 1/2 ≤ ||g(t)|| ||W (t)AI ei || dt |I | 2I ≤
C2 (α, n) |I |
i=1
1 2−˜#
||g(t)||(2−˜# )α dt 2I
n
2I
i=1
1 2+#
(2+#)α ||W 1/2 (t)A−1 dt I ei ||
.
(4.11) Here (2 + #)−1 + (2 − #˜ )−1 = 1. Notice that for every vector x ∈ Cn the scalar function t → ||W (t)1/2 x|| is uniformly in the scalar A2 (E). In particular, there exists such an #0 > 0 that we have the inverse Hölder inequality for all such functions uniformly: ∀I centered at x ∈ E
1 |I |
≤C Let us choose # =
#0 2
(˜# =
#0 2+#0 ),
α =1+
I
||W (t)1/2 x||2+#0 dt
1 |I |
I
#0 2(2+#0 ) ,
||W (t)1/2 x||2 dt
1 2+#0
1 2
.
(4.12)
then we have
(2 + #)α < 2 + #0 ,
(4.13)
(2 − #˜ )α < 2.
(4.14)
We use (4.13) and the inverse Hölder inequality (4.12) in (4.11) to rewrite it as % 1/α &α −1 1 S AI (g1 )(t) dt |I | I 1 n (2−˜# )α 1 1 (2−˜# )α 2 ≤ C(α, n) ||g(t)|| dt ||W 1/2 (t)A−1 I ei || dt |I | 2I |I | 2I ≤ C1 (α, n)
1 |I |
||g(t)||(2−˜# )α dt
1 (2−˜# )α
i=1 n '
2I
)
≤ C3 (α, n, Q2,E (W )) inf M||g|| x∈I
p∗
*
i=1 1 p∗
−1/2
W I
−1/2
W 2I W I
ei , ei
−1/2
Q2
(1 2
(x),
where p∗ = (2 − #˜ )α < 2. We used the doubling property of W : W I
−1/2 W I
1 2
W 2I
≤ 2 η2 , the inequality which can be proved in the same way as (4.6). The last inequality ensures that for any τ , τ ∈ (0, 1), using Kolmogorov-type inequalities we can find a subset E(τ, I0 ) ⊂ I0 , |E(τ, I0 )| ≥ |I0 | − τ α |I | ≥ (1 − 3τ α )|I0 | such that ) *1 −1 C3 (α, n, Q2,E (W )) inf M||g||p∗ p∗ (x). (*) t ∈ E(τ, I0 ) ⇒ S AI (g1 )(t) ≤ x∈I τ
Similarly, for every τ there exists a set E(τ, I0 ), |E(τ, I0 )| ≥ (1 − 3τ α )|I0 | such that t ∈ E(τ, I0 ) ⇒
S AI (f1 )(t) ≤
) *1 C(α, n, Q2,E (W )) inf M||f ||p∗ p∗ (x). x∈I τ
(*)
Here we use the same calculations and the fact that for any I centered at E, W I W −1 2I W I 1/2
1/2
≤ 2Q2 .
Now let us work with f2 , g2 . Let cI be the center of the square built on 2I . Using the representation %
P+ AI W −1/2 f2
&
(z) =
1 2π i
(AI W −1/2 f2 )(x) dx, (x − z)2
Im z > 0,
(4.15)
clearly, we obtain for every t ∈ I , # &t,|I |
& % || P+ AI W −1/2 f2 (z)||2 dA(z)
$1/2
≤C
Im cI ||(AI W −1/2 f2 )(x)|| dx. |x − cI |2 (4.16)
Therefore, using the inverse Hölder inequality (4.12), we have again # &t,|I |
≤C
% & || P+ AI W −1/2 f2 (z)||2 dA(z) n
Im cI ||W −1/2 AI ei || ||f2 || dx |x − cI |2
i=1 n
≤ C1
i=1
≤ C2
n '
i=1
$1/2
Im cI ||W −1/2 AI ei ||2+# dx |x − cI |2
W I W −1 cI W I ei , ei 1/2
1/2
(1 2
1 2+#
Im cI ||f2 ||2−˜# dx |x − cI |2
% & 1 2−˜# inf M||f ||2−˜# (x) .
x∈I
1 2−˜#
(4.17)
Here 2 + # is close to 2 (# ≤ #0 ). Finally, using Lemma 4.1 we estimate the last sum by a constant: $1/2 # & % % & 1 2−˜# −1/2 2 || P+ AI W f2 (z)|| dA(z) ≤ C(n, E, Q) inf M||f ||2−˜# (x) . x∈I
&t,|I |
That is & 1 % 2−˜# S AI (f2 )(t) ≤ C(n, E, Q) inf M||f ||2−˜# (x) , x∈I
−1
the same for S AI (g2 )(t).
∀t ∈ I,
(*)
Combining all (*) inequalities we obtain that with a suitable C = C(n, E, W ), SI (t) := (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn dA(z) &t,|I |
*1 ) *1 ) −1 ≤ S AI (f )(t)S AI (g)(t) ≤ C 2 M||f ||p∗ (t) p∗ M||g||p∗ (t) p∗
(4.18)
at least on a quarter of I0 . Of course, SI0 (t) ≤ SI (t). In the case 2I0 ∩ E = ∅ we fix an interval I centered at E such that I0 ⊂ I and dist(I0 , E) ≥ |I |/6. Let cI be the center of the square built on I . We can use again a representation of the form (4.15): % & 1 (AI W −1/2 f )(x) −1/2 P+ AI W f (z) = dx, Im z > 0, 2π i (x − z)2 to obtain an analog of (4.16), $1/2 # % & −1/2 2 || P+ AI W f (z)|| dA(z) ≤C
Im cI ||(AI W −1/2 f )(x)|| dx |x − cI |2
&t,|I |
for all t ∈ I0 . Continuing in this way we get ) *1 S AI (f )(t) ≤ C(n, E, Q) inf M||f ||p∗ (x) p∗ ,
∀t ∈ I0 .
x∈I
−1
The same for S AI (g)(t). Thus *1 ) *1 ) SI (t) ≤ C(n, E, Q)2 M||f ||p∗ (t) p∗ M||g||p∗ (t) p∗
(4.19)
everywhere on I0 . Let B be the largest constant in (4.18), (4.19). We have already chosen p∗ < 2. Now we introduce the following function h(t): ) *1 ) *1 h(t) = sup{h : Sh (t) ≤ B M||f ||p∗ (t) p∗ M||g||p∗ (t) p∗ }. What we proved can be summarized in: if I0 : 2I0 ∩ E = ∅ then h(t) ≥ |I0 | on a quarter of measure of I0 , if I0 : 2I0 ∩ E = ∅ then h(t) ≥ |I0 | ∀t ∈ I0 . In any case, 1 (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn Im z dA(z) 4 C+ ≤ (P+ W −1/2 f ) (z), (P+ W 1/2 g) (z)Cn dA(z) dt R &t,h(t)
≤B
R
*1 ) *1 M||f ||p∗ (t) p∗ M||g||p∗ (t) p∗ dt
)
≤B
R
)
*
p∗
M||f || (t)
≤ BC(p∗ )
2 p∗
||f || (t) dt 2
R
1 2
dt 1 2
R
)
M||g|| (t)
||g|| (t) dt 2
R
*
p∗
1 2
2 p∗
dt
1 2
because p2∗ > 1, and we can use the Hardy–Littlewood maximal theorem. The theorem is proved. $ %
5. The Inverse Scattering Problem and a Riemann–Hilbert Problem

Reduction of an inverse scattering problem to a Riemann–Hilbert problem is, maybe, the most popular approach (see, e.g. [14]). In this section, we show what kind of a Riemann–Hilbert problem is associated with the problem under consideration.

Let us define $\sqrt{-b^2z'}(\zeta)$ as the square root of an outer function such that $\sqrt{-b^2z'}(0) > 0$. Put
\[
\sqrt{-z'}(\zeta) = \frac{\sqrt{-b^2z'}(\zeta)}{b(\zeta)}. \tag{5.1}
\]
In this case, $\overline{\sqrt{-z'}(\bar\zeta)} = \sqrt{-z'}(\zeta)$. Let $E_- = \{t\in E : \mathrm{Im}\,t < 0\}$. Then $-tz'(t) = i|z'(t)|$, $t\in E_-$. Thus,
\[
t\{\sqrt{-z'}(t)\}^2 = i\,\sqrt{-z'}(t)\,\overline{\sqrt{-z'}(t)} = i\,\sqrt{-z'}(t)\,\sqrt{-z'}(\bar t), \qquad t\in E_-,
\]
or
\[
\bar t\,\sqrt{-z'}(\bar t) = -i\,\sqrt{-z'}(t), \qquad t\in E_-. \tag{5.2}
\]
Besides,
\[
\sqrt{-z'}\,|[\gamma] = \epsilon(\gamma)\sqrt{-z'}(\zeta),
\]
where $\epsilon\in\Gamma^*$, $\epsilon^2 = 1_{\Gamma^*}$. But, in fact, the group $\Gamma$ is defined up to a choice of a half-period $\tilde\epsilon\in\Gamma^*$. So, we may assume that
\[
\sqrt{-z'}\,|[\gamma] = \sqrt{-z'}(\zeta). \tag{5.3}
\]
Proposition 5.1. Let $E = [b_0,a_0]\setminus\cup_{j\ge1}(a_j,b_j)$ be a homogeneous set. Then
\[
G(z(\zeta)) := \frac{\Phi(\zeta)}{\sqrt{-z'}(\zeta)}
\]
is a holomorphic matrix function in $\bar{\mathbb C}\setminus[b_0,a_0]$ satisfying the following RHP:
\[
G(x-i0) = \begin{bmatrix}\alpha_{-,j} & 0\\ 0 & \alpha_{+,j}\end{bmatrix}G(x+i0), \qquad x\in(a_j,b_j), \tag{5.4}
\]
\[
G(x-i0) = -iH(x)G(x+i0), \qquad x\in E, \tag{5.5}
\]
where $H(z(t)) := S(t)$, $t\in E_-$, with the normalization at infinity:
\[
G(z) = \begin{bmatrix} 1+\cdots & -\frac az+\cdots\\ -\frac bz+\cdots & 1+\cdots\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{bs(0)}} & 0\\ 0 & \frac{1}{\sqrt{as(0)}}\end{bmatrix}. \tag{5.6}
\]
Proof. Equation (5.4) follows from Equation (5.3). (5.5) follows from (5.2) and (2.6). To prove (5.6), we represent $G(z)$ in the form
\[
G(z) = \begin{bmatrix} 1+\cdots & -\frac az+\cdots\\ -\frac bz+\cdots & 1+\cdots\end{bmatrix}\begin{bmatrix} c_1 & 0\\ 0 & c_2\end{bmatrix}.
\]
Then, we note that
\[
c_1c_2 = \det G(\infty) = \left.\frac{e^-(-1,\zeta)e^+(-1,\zeta) - e^-(0,\zeta)e^+(0,\zeta)}{-z'(\zeta)}\right|_{\zeta=0} = \frac{1}{p_0s(0)}, \tag{5.7}
\]
and
\[
\frac{ac_2}{c_1} = \frac{bc_1}{c_2} = \left.\frac{z(\zeta)e^\pm(0,\zeta)}{e^\pm(-1,\zeta)}\right|_{\zeta=0} = p_0. \tag{5.8}
\]
Solving together (5.7), (5.8), we get (5.6). □
We want to finish this section with the following discussion. As an initial data for the inverse scattering problem in this paper we used a character-automorphic function $s_+ \in L^\infty(\Gamma,\alpha_+^{-2})$ and a character $\alpha_+\in\Gamma^*$. In fact, this set of data can be defined uniquely by a function $\sigma_+(x)$ on the spectral set $E$ ($\sigma_+(z(t)) = s_+(t)$, $t\in E_-$) and a system of unimodular multipliers $\{\alpha_{+,j}\}$, each factor $\alpha_{+,j}$ being associated with a spectral gap $(a_j,b_j)$. In terms of $\sigma_+(x)$ and $\{\alpha_{+,j}\}$ one can define a $2\times2$ matrix function $H(x)$ over the interval $[b_0,a_0]$. Then one has to solve a RHP (5.4), (5.5) with a normalization condition (5.6) at infinity. The spectral density $\rho(x)$ (and therefore $J$ itself) is determined via a solution of the RHP by $\rho^{-1} = 2\pi ab\,G^*G$. However, when solving the RHP, one has to specify carefully a class of analytic functions to which $G(z)$ belongs. Therefore, in any case, one has to introduce this or that analog of the functional space $A^2_1(\Gamma,\alpha)$.

References

1. Aptekarev, A., Nikishin, E.: The scattering problem for a discrete Sturm–Liouville operator. Russian Mat. Sb. (N.S.) 121(163), 327–358 (1983)
2. Arov, D., Dym, H.: On matricial Nehari problems, J-inner matrix functions and the Muckenhoupt condition. J. Funct. Anal. 181, 227–299 (2001)
3. Carleson, L.: On H∞ in multiply connected domains. In: Conference on harmonic analysis in honor of Antoni Zygmund. Vol. I, II. (Chicago, IL, 1981), 349–372, Wadsworth Math. Ser., Belmont, CA: Wadsworth, 1983
4. Geronimo, J., Case, K.: Scattering theory and polynomials orthogonal on the real line. Trans. Amer. Math. Soc. 258, 467–494 (1980)
5. Gesztesy, F., Nowell, R., Pötz, W.: One-dimensional scattering for quantum systems with nontrivial spatial asymptotics. Diff. Integral Eqs. 10, 521–546 (1997)
6. Guseinov, G.S.: The determination of an infinite Jacobi matrix from the scattering data. Soviet Math. Dokl. 17, 596–600 (1976)
7. Hasumi, M.: Hardy Classes on Infinitely Connected Riemann Surfaces. Lecture Notes in Math. 1027, Berlin–New York: Springer-Verlag, 1983
8. Marchenko, V.: Sturm–Liouville Operators and Applications. Basel: Birkhäuser, 1986
9. Marchenko, V.: Nonlinear equations and operator algebras. Translated from the Russian by V. I. Rublinetskiĭ. Mathematics and its Applications (Soviet Series), 17. Dordrecht–Boston, MA: D. Reidel Publishing Co., 1988
10. Peherstorfer, F. and Yuditskii, P.: Asymptotic behavior of polynomials orthonormal on a homogeneous set. J. Anal. Math., to appear
11. Sodin, M. and Yuditskii, P.: Almost periodic Jacobi matrices with homogeneous spectrum, infinite dimensional Jacobi inversion, and Hardy spaces of character-automorphic functions. Journ. of Geom. Analysis 7, 387–435 (1997)
12. Teschl, G.: Jacobi operators and completely integrable nonlinear lattices. Mathematical Surveys and Monographs, 72. Providence, RI: Am. Math. Soc., 2000
13. Volberg, A.: Matrix Ap weights via S-functions. Journ. Am. Math. Soc. 10, 445–466 (1997)
14. Beals, R., Deift, P. and Tomei, C.: Direct and inverse scattering on the line. Mathematical Surveys and Monographs, 28. Providence, RI: Am. Math. Soc., 1988
15. Nevanlinna, R.: Analytic functions. Berlin: Springer Verlag, 1970

Communicated by B. Simon
Commun. Math. Phys. 226, 607 – 626 (2002)
Communications in
Mathematical Physics
Symmetry Results for Finite-Temperature, Relativistic Thomas–Fermi Equations Michael K.-H. Kiessling Department of Mathematics, Rutgers University, 110 Frelinghuysen Rd., Piscataway, NJ 08854, USA. E-mail: [email protected] Received: 8 December 2000 / Accepted: 14 December 2001
In celebration of the 70th birthday of Joel L. Lebowitz Abstract: In the semi-classical limit the relativistic quantum mechanics of a stationary beam of counter-streaming (negatively charged) electrons and one species of positively charged ions is described by a nonlinear system of finite-temperature Thomas–Fermi equations. In the high temperature/low density limit these Thomas–Fermi equations reduce to the (semi-)conformal system of Bennett equations discussed earlier by Lebowitz and the author. With the help of a sharp isoperimetric inequality it is shown that any hypothetical particle density function which is not radially symmetric about and decreasing away from the beam’s axis would violate the virial theorem. Hence, all beams have the symmetry of the circular cylinder. 1. Introduction Modern books on charged-particle beams, e.g. [21], usually contain a chapter about the Bennett model [4], but back in the early 50’s when regular research on charged-particle beams came into sharper focus, W. H. Bennett’s pioneering pre-WWII paper [4] on the statistical mechanics of a relativistic, stationary particle beam had been forgotten, apparently, and so in 1953 Bennett sent out a reminder note [5]. For some reason or other, Bennett’s note did not appear until 1955 [5], the very year when Joel L. Lebowitz was launching his stellar career [30] with center of gravity in stationary non-equilibrium statistical mechanics [31–33]. At that time, a single issue of The Physical Review was still of a decent size and could be consumed from first to last page by an individual with huge scientific appetite such as Joel, and Bennett’s note [5] did not pass unnoticed before Joel’s hungry eyes. All this happened a few years before I was born, but when I came to spend some postdoctoral time with Joel nearly 40 years later, several interesting questions raised by Bennett’s work were still unanswered, and so we began to answer some of these [29]. 
© 2001 The author. Reproduction of this article, in its entirety, for non-commercial purposes is permitted.

One of the problems we had to leave open was that of the symmetry of a beam. Following
Bennett we only inquired into circular-cylindrically symmetric solutions. While it is a natural conjecture that in the absence of external fields an unbounded straight particle beam with finite electrical current through its cross-section necessarily possesses the symmetry of the circular cylinder, how to prove it is not quite so obvious. It is with great pleasure that in this paper I present a rigorous proof to Joel. Fitting for the occasion, the proof of the cylindrical symmetry of the beam involves statistical mechanics in an essential way. Namely, it is shown that any hypothetical stationary beam with finite electrical current whose particle density functions are not radially symmetric about and decreasing away from the beam’s axis would violate the virial theorem for this many-particle system. This symmetry proof covers Bennett’s strictly classical model as well as its semi-classical upgrade, i.e. a system of relativistic, finite-temperature Thomas–Fermi equations which in the high-temperature/low-density limit reduce to the (semi-)conformal Bennett equations. The proof is, however, restricted to a system of merely two equations because the coefficient matrix for the beam equations has rank 2. Our symmetry theorem therefore does not apply to beams that consist of the negatively charged electrons and more than one, differently positively charged ion species; but then again, our method of proof not only yields the cylindrical symmetry of the beam, it also yields monotonic radial decrease of the particle density functions. Hence, it is conceivable that monotonicity of the density functions may be violated in an electron/multi-ion species beam while cylindrical symmetry might still hold – yet to prove that would seem to require an entirely new argument. Incidentally, our result also sheds some new light on the theory of white dwarfs [8]. 
These Earth-sized, expired stellar objects shine in bright white light because they are still incredibly hot compared to our Sun, yet they are relatively cold compared to their Fermi temperature and therefore essentially in their quantum ground state. This justifies using zero-temperature Thomas–Fermi theory for the description of their overall structure [8] – a fortunate happening, for finite-temperature Thomas–Fermi theory could not be used in three dimensions since it does not have solutions with finite mass. Interestingly, the finite-temperature Thomas–Fermi equations of the two-dimensional caricature of such a white dwarf star should have solutions with finite mass, because the gravitational potential in two dimensions is sufficiently strongly confining for this purpose. In any event, relatively little is known rigorously1 for such a gravitating plasma of negative electrons and positive nuclei (all species treated as fermions) in either two or three dimensions; see the discussion of this model by W. E. Thirring in the preface to the E. H. Lieb jubilee volume [41], where Thirring gives an amusing account of the pitfalls associated with the fact that the Thomas–Fermi equations are the Euler–Lagrange equations for the saddle points of a variational functional. When dealing with saddle points, existence and symmetry of solutions via minimization by radial decreasing rearrangement [1, 6, 7,27] is not an option, and neither is symmetry via uniqueness by convexity [27] of the functional. Now recall that by the Biot-Savart law the magnetic interactions of straight, parallel electrical current filaments are attractive, with a distance law that is identical to the Newtonian gravity law in two dimensions. From this it follows that the finite-temperature 1 More is known rigorously [8, 35] for the locally neutral approximation of the model, where the positive and negative charges are distributed identically and Coulomb’s law is discarded. 
In particular, radial symmetry of solutions for this locally neutral model has been proven by energy minimization through radial rearrangement [35]. We remark that due to the enormous ratio of the electrical and gravitational coupling constants the locally neutral approximation is expected to be an excellent approximation for a white dwarf; however, this is not generally the case for a particle beam, where the ratio of electric and magnetic coupling constants may be arbitrarily close to 1.
Symmetry Results for Finite-Temperature, Relativistic Thomas–Fermi Equations
Thomas–Fermi beam equations are identical to the finite-temperature Thomas–Fermi equations of the two-dimensional caricature of a white dwarf model, with the magnetic flux function re-interpreted as the gravitational Newton potential in two dimensions, and the mean electric current of each species (positive after at most a joint space rotation) re-interpreted as the mass of that species. Our symmetry result can be rephrased thus: two-dimensional finite-temperature white dwarfs are radially symmetric. Our proof of symmetry, which is based on the Rellich [39]–Pokhozaev [38] identity (which expresses the virial theorem) and the classical isoperimetric inequality [2, 6], does involve radial rearrangements in a strategy that goes back at least as far as [2], where it is applied to Liouville’s equation2 in a disk ⊂ R2 [2]. In [10] this strategy was generalized to systems of PDEs of Liouville type in all R2 which are unrestricted in size but which have a symmetric, fully stochastic coefficient matrix of full rank. The Bennett equations also constitute a Liouville system, but are not covered by the theorem of [10] because their coefficient matrix is generally not symmetric, has some negative elements, and is always rank 2. The present paper develops the necessary generalizations of [10] to overcome the first two peculiarities of the Bennett equations, but the rank 2 restricts the proof to a system of two equations. By adapting the treatment of single PDEs with more general nonlinearities developed in [36] (cf. also [28]) and [11, 12] to the system case we are able to extend our proof of symmetry for the Bennett equations to the relativistic, finite-temperature Thomas–Fermi beam equations. Our proof simplifies considerably when the systems of Thomas–Fermi and Bennett equations are restricted to a disk with 0-Dirichlet boundary conditions for the electric and magnetic potentials. 
In this compact case, an alternate proof of the radial symmetry and decrease of the solutions to systems of PDE which includes the finite-temperature Thomas–Fermi and Bennett equations, was given by Troy [42], who exploited Alexandroff’s method of moving planes. For more on the moving-planes method, see [40, 19, 34, 14, 9, 13]. Troy’s proof has been extended to Liouville systems in unbounded domains, the Bennett equations not included though, in [15]. Presumably, the moving planes method can be made to work also for the system of Thomas–Fermi equations studied here; however, this is not done in this paper. While the present paper addresses only the question whether invariance of the PDEs under rotations implies radial symmetry of their solutions, these PDEs feature other symmetries which deserve mentioning. The system of Thomas–Fermi equations is invariant under the isometries of Euclidean space, simple gauges, and Lorentz boosts along the beam. The Bennett equations are in addition to that invariant under isotropic scaling in R2 , and for a special family of parameter values also under Kelvin transformations, in which case they are invariant under the Euclidean conformal group of R2 . In this fully conformal case the conformal orbit of the finite current solutions is connected and itself invariant [29]. Invariance under the Euclidean conformal group holds also for the Liouville systems studied in [10], but their conformal orbit of finite mass solutions is generally not connected, and each component not invariant under inversions [15]. Toda systems in R2 , which are Liouville systems with symmetric coefficient matrix given by the SU (N ) Cartan matrix, are studied in [22,23]. The distribution of negative and positive signs in the SU (2) Cartan matrix is opposite to that in our Bennett equations, and sure enough, our radial symmetry proof fails in this case. 
Interestingly, in this case one can show that radial symmetry is in fact broken by some solutions, see the bifurcation argument with n = 2 in (1.7) of [10], and see [23] for the construction of the complete 2 The elliptic Liouville equation, known from two-dimensional differential geometry, is meant and not the evolution equation on phase space known from statistical mechanics.
M. K.-H. Kiessling
solution family with finite masses. Another interesting topic not discussed further here is whether the translation invariance along the beam can be broken, as is suggested by various dynamical beam instabilities [44].

The remainder of this paper is structured as follows. In the next section we formulate the basic equations of the semi-classical beam model and its classical limit. Existence of solutions is briefly touched upon. In Sect. 3 we state our two main theorems, and in Sect. 4 we present their proofs.

2. Relativistic Beam Equations

We let $a \in S^2$ denote the fixed axis of the beam, $x \in \mathbb{R}^2$ a point in the cross-section of the beam containing the coordinate origin, and $p \in \mathbb{R}^3$ the kinematical particle momentum. The self-consistent electric field of the beam is given by $E(x) = -\nabla\phi(x)$, where $\phi$ is the electric potential, and the magnetic field by $B(x) = \nabla\psi(x)\wedge a$, where $\psi$ is the magnetic flux function. The beam consists of spin 1/2 electrons (negatively charged, thus indexed by $s = -$) and one species of positively charged spin 1/2 fermions (indexed by $s = +$), characterized by the following parameters: the particle charges $q_s$ and rest masses $m_s$; the rest frame temperatures $T_s$; the external chemical potentials $\mu_s$; and lab frame drift speeds $c\nu_s$, where $c$ is the speed of light and $\nu_s \in (-1, 1)$. We demand $\nu_+ \neq \nu_-$, as appropriate for counter-streaming particle species. The temperatures and drift speeds combine into the thermal lab frame parameters $\beta_s^{-1} = k_B T_s \sqrt{1 - \nu_s^2}$.

2.1. The semi-classical model (Thomas–Fermi theory). The finite-temperature Thomas–Fermi model of a straight, relativistic beam is set up as follows. In the lab frame the density of $s$-particles at $x$ is given by $\rho_s(x) = G_s^{\mathrm{TF}}(\phi,\psi)(x)$, where
$$G_s^{\mathrm{TF}}(\phi,\psi) = \frac{2}{h^3}\int_{\mathbb{R}^3} \frac{dp}{1 + e^{-\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a - q_s[\phi - \nu_s\psi]\right)}} \tag{1}$$
is the finite-temperature Thomas–Fermi density function for the relativistic $s$-species, which is subjected to the integrability condition
$$\int_{\mathbb{R}^2} G_s^{\mathrm{TF}}(\phi,\psi)(x)\,dx = N_s, \tag{2}$$
where $N_s$ is the number of $s$-particles per unit length of beam. The phase-space density function under the integral in (1) is the drifting Fermi–Dirac–Jüttner function [26] with local chemical self-potential $-q_s(\phi(x) - \nu_s\psi(x))$. The electric charge and current densities in the Poisson equations for the electric potential $\phi$ and the magnetic flux function $\psi$ are computed with the density functions (1), which leads to the system of nonlinear PDEs
$$-\Delta\phi = 4\pi \sum_s q_s\, G_s^{\mathrm{TF}}(\phi,\psi), \tag{3}$$
$$-\Delta\psi = 4\pi \sum_s \nu_s q_s\, G_s^{\mathrm{TF}}(\phi,\psi). \tag{4}$$
Here and in the following, $\sum_s$ or $\sum_t$ always stands for summation over the particle species, i.e. $s = \mp$ and $t = \mp$.
The Thomas–Fermi equations (3), (4), are invariant under the isometries of three-dimensional Euclidean space, Lorentz boosts along the beam’s axis $a$, and the gauge transformation
$$\phi(x) \to \phi(x) + \phi_0; \qquad \psi(x) \to \psi(x) + \psi_0; \qquad \mu_s \to \mu_s + q_s(\phi_0 - \nu_s\psi_0), \tag{5}$$
where $\phi_0$ and $\psi_0$ are arbitrary constants. Since we are interested in the beam’s natural symmetries, we will not allow “sources at infinity” which would deform the beam; hence, we supplement (3) and (4) with the asymptotic conditions that, uniformly as $|x| \to \infty$,
$$\lim_{|x|\to\infty} \frac{\phi(x)}{Q \ln\frac{1}{|x|}} = 2 = \lim_{|x|\to\infty} \frac{c\,\psi(x)}{I \ln\frac{1}{|x|}}, \tag{6}$$
with $I \neq 0$ and $Q \neq 0$, where $I = \sum_s N_s q_s \nu_s c$ is the total electrical current through the beam’s cross-section and $Q = \sum_s N_s q_s$ the total charge per unit length of beam in the lab system; if $Q = 0$, the left equation in (6) is to be replaced by the condition that $\phi(x) \to \text{const}$ uniformly as $|x| \to \infty$. The situation $I = 0$ is not considered here, for then of course there is no stationary beam.

Remark. There are good reasons to conjecture that the asymptotic conditions (6) are in fact implied by (1)–(4). Analogous results have been proven for Liouville’s equation [14] and for some Liouville systems [10, 15]. No attempt will be made here to generalize these results to (1)–(4). However, we note that such a generalization would have the interesting physical implication (within the limits of applicability of the model) that one cannot maintain a stationary straight beam of finite current, whatever the geometry of its cross section, when there are magnetic or electric multipole sources “at infinity”.

To the best of the author’s knowledge, the existence of beam solutions in the Thomas–Fermi model (1)–(4) with asymptotics (6) has not yet been studied rigorously. However, this semi-classical model is surely more regular than the classical one, addressed next.

2.2. The classical limit (Bennett theory). In the high-temperature/low-density limit, i.e. formally $0 < \beta_s \ll 1$ and $\beta_s\mu_s \ll -1$, the Fermi–Dirac–Jüttner functions [26] reduce to the Maxwell–Boltzmann–Jüttner functions [24] (see also [17], p. 46, Eq. (24)), so that the Thomas–Fermi densities (1) simplify to Boltzmann densities,
$$G_s^{B}(\phi,\psi) = \frac{2}{h^3}\int_{\mathbb{R}^3} e^{-\beta_s\left(c\sqrt{m_s^2c^2+|p|^2} - \nu_s c\,p\cdot a\right)}\,dp\; e^{\beta_s(\mu_s - q_s[\phi - \nu_s\psi])}, \tag{7}$$
and (2) becomes
$$\int_{\mathbb{R}^2} G_s^{B}(\phi,\psi)(x)\,dx = N_s. \tag{8}$$
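The passage from (1) to (7) rests on the elementary fact that the Fermi–Dirac occupation $1/(1+e^{-A})$ approaches the Boltzmann factor $e^{A}$ when the exponent $A$ is strongly negative. The following minimal sketch illustrates this pointwise; the function names and sample exponents are illustrative assumptions, and no momentum integration is attempted.

```python
import math

def fermi_dirac(a):
    """Occupation 1 / (1 + exp(-a)): the integrand of (1) for exponent a."""
    return 1.0 / (1.0 + math.exp(-a))

def boltzmann(a):
    """Boltzmann factor exp(a): the integrand of (7) for the same exponent."""
    return math.exp(a)

def relative_error(a):
    """Relative deviation of the Boltzmann factor from the Fermi-Dirac occupation."""
    return abs(fermi_dirac(a) - boltzmann(a)) / fermi_dirac(a)

# Illustrative exponents only: the deviation shrinks as a becomes more negative.
for a in (-1.0, -5.0, -10.0):
    print(a, relative_error(a))
```

Algebraically the relative deviation equals $e^{A}$, so the approximation is controlled precisely by how negative the exponent is.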
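The system of equations (3) and (4) then reduces to the Bennett equations
$$-\Delta\phi = 4\pi \sum_s N_s q_s \frac{e^{-\beta_s q_s(\phi - \nu_s\psi)}}{\int_{\mathbb{R}^2} e^{-\beta_s q_s(\phi - \nu_s\psi)}\,dx}, \tag{9}$$
$$-\Delta\psi = 4\pi \sum_s N_s q_s \nu_s \frac{e^{-\beta_s q_s(\phi - \nu_s\psi)}}{\int_{\mathbb{R}^2} e^{-\beta_s q_s(\phi - \nu_s\psi)}\,dx}; \tag{10}$$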
see [4] Eqs. (8), (9), and [5] Eq. (7) (cf. footnote 3), where we have eliminated the external chemical potentials $\mu_s$ via (8). The Bennett system is invariant under the isometries of three-dimensional Euclidean space and under Lorentz boosts along the beam’s axis, $a$. Restricted to the beam’s cross-section, it is also invariant under isotropic scaling, and in the special case when the parameters satisfy
$$\beta_s q_s\left(\nu_s c^{-1} I - Q\right) = 2, \qquad s = \mp, \tag{11}$$
also invariant under translated inversions. Thus, (11) implies invariance of the Bennett system under the conformal group of two-dimensional Euclidean space, acting in the beam’s cross-section. In addition, the Bennett equations are invariant under a gauge transformation $\phi(x) \to \phi(x) + \phi_0$, $\psi(x) \to \psi(x) + \psi_0$. Recall that we already eliminated the external chemical potentials via the constraint equations (2) in the Bennett limit. In the conformally invariant case (11), Bennett’s Ansatz (cf. footnote 4)
$$I^{-1} c\,\psi(x) = v(x) = Q^{-1}\phi(x) \tag{12}$$
maps (9) and (10) separately into Liouville’s equation [37]
$$-\Delta v = 4\pi \frac{e^{2v}}{\int_{\mathbb{R}^2} e^{2v}\,dx}. \tag{13}$$
As remarked above, it has been proven in [14] that any regular solution of (13), with the understanding that $\int \exp(2v)\,dx < \infty$, satisfies
$$\lim_{|x|\to\infty} \frac{v(x)}{\ln\frac{1}{|x|}} = 2, \tag{14}$$
uniformly as $|x| \to \infty$, which implies that the asymptotic conditions (6) are automatically satisfied if $\phi$ and $\psi$ are given by (12). It has also been proven in [14], and subsequently in [10, 16] by using alternate techniques, that (13) has only one regular family of solutions, given by
$$v(x|x_0; k) = v_0 + \ln\frac{1}{1 + k^2|x - x_0|^2}, \tag{15}$$
where $k^{-1} > 0$ is an arbitrary scale length, $x_0$ the arbitrary center of rotational symmetry of the solution, and $v_0$ an arbitrary gauge constant. The corresponding current density $j(x)$ and charge density $q(x)$ are given by
$$I^{-1} j(x) = \frac{1}{\pi}\frac{k^2}{\left(1 + k^2|x - x_0|^2\right)^2} = Q^{-1} q(x). \tag{16}$$
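That the family (15) solves Liouville’s equation (13) and has the asymptotics (14) can be checked numerically. The sketch below is only an illustration under stated assumptions: the solution is centred at $x_0 = 0$, the radial Laplacian is approximated by central finite differences with a hypothetical step `h`, and the normalizing integral in (13) is taken in closed form, $\int_{\mathbb{R}^2} e^{2v}dx = \pi e^{2v_0}/k^2$.

```python
import math

def v(r, k=1.0, v0=0.0):
    """Radial Bennett solution (15), centred at x0 = 0."""
    return v0 + math.log(1.0 / (1.0 + (k * r) ** 2))

def minus_laplacian_v(r, k=1.0, v0=0.0, h=1e-4):
    """-(v'' + v'/r), the radial Laplacian in R^2, by central finite differences."""
    d1 = (v(r + h, k, v0) - v(r - h, k, v0)) / (2.0 * h)
    d2 = (v(r + h, k, v0) - 2.0 * v(r, k, v0) + v(r - h, k, v0)) / h ** 2
    return -(d2 + d1 / r)

def liouville_rhs(r, k=1.0, v0=0.0):
    """Right-hand side of (13); the integral of exp(2v) equals pi*exp(2*v0)/k**2."""
    total = math.pi * math.exp(2.0 * v0) / k ** 2
    return 4.0 * math.pi * math.exp(2.0 * v(r, k, v0)) / total

# Sample point and parameters chosen for illustration only.
print(minus_laplacian_v(0.5, k=2.0, v0=0.3), liouville_rhs(0.5, k=2.0, v0=0.3))
# Asymptotic ratio appearing in (14), evaluated at a large radius.
print(v(1e6) / math.log(1.0 / 1e6))
```

Both sides of (13) agree up to discretization error, and the ratio in (14) is close to 2 at large radius, independently of the gauge constant $v_0$.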
3 In his papers [4, 5], Bennett employed a classical, semi-relativistic setup, assuming drifting Maxwell–Boltzmann distributions with relativistic drift speeds, yet with non-relativistic velocity dispersion in the cross-section of the beam; the relativistic model with drifting Jüttner functions was used in [3]. It should be noticed, though, that after integration over momentum space the very system of Eqs. (9), (10) results in either case, and it does so also in the strictly non-relativistic limit [29] – except for minor re-interpretations of the parameters in each case.
4 Bennett actually made the Ansatz that $\rho_+(x)/\rho_-(x) = \text{const}$, which up to gauge freedom for the potentials is equivalent to (12).
The density profile (16) is the celebrated Bennett beam profile. Bennett speculated about the existence of other solutions to (9) and (10) with asymptotics (6); see [4] p. 893, and [5] p. 1587. (In the punctured plane additional solutions are readily found, see e.g. [3]; however, they all lack regularity, due to a point source, at the origin.) In [29] we proved that in the conformal case (11), Bennett’s system of equations (9) and (10), supplemented by the asymptotic conditions (6), is in fact equivalent to (13) (with asymptotic condition (14) automatically satisfied, see above) so that (15) then exhausts all possibilities. Moreover, for the semi-conformal case where (11) does not hold, we proved the existence of a continuous parameter family of smooth radial solutions to (9) and (10) with asymptotics (6) which are not invariant under inversions.

All the solutions of our beam equations are automatically also stationary solutions of the equations of Vlasov’s relativistic kinetic theory [43]. In [29] we showed that the Bennett equations can also be realized as the transversal part of stationary dissipative kinetic equations in which the dissipation, modeled by a thermostat, compensates the action of an applied longitudinal electromotive force that drives the current. In [29] we also gave a rigorous proof that all radial solutions of (9) and (10) satisfying (6) also satisfy the Bennett identity
$$c^{-2} I^2 - Q^2 = 2\sum_s N_s k_B T_s \sqrt{1 - \nu_s^2}. \tag{17}$$
The identity (17) was originally obtained by Bennett [5] in a formal (and not entirely compelling) manner by studying the radial time-dependent virial. In this paper we will show that the Bennett identity (17), respectively its counterpart for the Thomas–Fermi model, holds a priori without assuming symmetry, and this fact will be one major ingredient in our proof of the cylindrical symmetry of the beams.

3. Main Results

To state our virial theorem, we introduce the thermodynamic potentials (per unit length of beam), given by
$$J^{\mathrm{TF}} = \sum_s \beta_s^{-1}\frac{2}{h^3}\int_{\mathbb{R}^2}\int_{\mathbb{R}^3} \ln\left(1 + e^{\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a - q_s[\phi - \nu_s\psi]\right)}\right) dp\,dx \tag{18}$$
for the semi-classical model, respectively by
$$J^{B} = \sum_s \beta_s^{-1}\frac{2}{h^3}\int_{\mathbb{R}^2}\int_{\mathbb{R}^3} e^{\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a - q_s[\phi - \nu_s\psi]\right)}\, dp\,dx \tag{19}$$
for the classical model.

Theorem 3.1 (Virial identity). Let $\phi \in C^{2,\alpha}(\mathbb{R}^2)$ and $\psi \in C^{2,\alpha}(\mathbb{R}^2)$ solve (3) and (4) under the constraints (2), respectively solve (9) and (10) under the constraints (8), $s = \mp$, in either case subjected to the asymptotic conditions (6). Then
$$c^{-2} I^2 - Q^2 = 2J, \tag{20}$$
where $J$ stands for either $J^{\mathrm{TF}}$ or $J^{B}$.
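As a consistency check (not part of the original argument), the identity (20) can be verified by hand for the Bennett model in the conformal case (11). The sketch assumes only (7), (8), (11), (19) and the definitions of $I$ and $Q$ from Sect. 2.1; it uses that the integrand of (19) is $\beta_s^{-1}$ times the Boltzmann density (7).

```latex
\begin{align*}
J^{B} &= \sum_s \beta_s^{-1} \int_{\mathbb{R}^2} G_s^{B}(\phi,\psi)(x)\,dx
       = \sum_s \beta_s^{-1} N_s
       && \text{by (7), (19) and (8)},\\
2J^{B} &= \sum_s N_s q_s \left(\nu_s c^{-1} I - Q\right)
       && \text{since } 2\beta_s^{-1} = q_s\left(\nu_s c^{-1} I - Q\right) \text{ by (11)},\\
       &= c^{-1} I \sum_s N_s q_s \nu_s \;-\; Q \sum_s N_s q_s
        = c^{-2} I^2 - Q^2 .
\end{align*}
```

The first line holds in the classical model whether or not (11) is satisfied, so with $\beta_s^{-1} = k_B T_s\sqrt{1-\nu_s^2}$ the identity (20) reads exactly as the Bennett identity (17) there.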
We also show that deviations from cylindrical symmetry violate (20), which gives us the next theorem.

Theorem 3.2 (Cylindrical symmetry). Let $\phi \in C^{2,\alpha}(\mathbb{R}^2)$ and $\psi \in C^{2,\alpha}(\mathbb{R}^2)$ solve (3) and (4) under the constraints (2), respectively solve (9) and (10) under the constraints (8), $s = \mp$, subjected to the asymptotic conditions (6). Then there exists a point $x_0 \in \mathbb{R}^2$ such that both $\phi$ and $\psi$ are radially symmetric about $x_0$, and the density functions $G_s(\phi,\psi)(x)$ are decreasing away from $x_0$, where $G_s$ here stands for either the Thomas–Fermi or the Boltzmann density function.

4. Proofs

We rewrite the Thomas–Fermi, respectively Bennett system in two equivalent versions, which may be called the “density potential representation” and the “chemical self-potential representation”. We will switch between these representations at our convenience to obtain the asymptotic estimates, as $|x| \to \infty$, and the isoperimetric estimates needed for our proofs of Theorems 3.1 and 3.2.

4.1. The alternate PDE representations. The chemical self-potentials $U_s(x)$, $x \in \mathbb{R}^2$, are given by
$$U_s = -q_s(\phi - \nu_s\psi). \tag{21}$$
We also introduce density potentials $u_s(x)$, $x \in \mathbb{R}^2$, defined by the invertible linear system
$$\phi = \sum_s q_s u_s, \tag{22}$$
$$\psi = \sum_s \nu_s q_s u_s. \tag{23}$$
Clearly,
$$U_s = \sum_t \gamma_{s,t} u_t, \tag{24}$$
where
$$\gamma_{s,t} = -q_s q_t (1 - \nu_s\nu_t) \tag{25}$$
denotes the entries of the matrix of coupling constants. Notice that
$$\det(\gamma) = -(q_+ q_-)^2 (\nu_+ - \nu_-)^2, \tag{26}$$
so that for $\nu_+ \neq \nu_-$, we have
$$\operatorname{rank}(\gamma) = 2, \tag{27}$$
hence
$$u_s = \sum_t \gamma^{-1}_{s,t} U_t, \tag{28}$$
where $\gamma^{-1}_{s,t}$ denotes the entries of the inverse matrix $\gamma^{-1}$ to $\gamma$.
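The linear algebra in (24)–(28) is easy to verify on a concrete example. The sketch below builds $\gamma$ from (25), compares its determinant with (26), and checks that (28) inverts (24). The charges, drift parameters and potentials are hypothetical sample values in dimensionless units, not taken from the paper; index 0 stands for $s = +$ and index 1 for $s = -$.

```python
def coupling_matrix(q, nu):
    """Entries gamma[s][t] = -q_s q_t (1 - nu_s nu_t) of (25)."""
    return [[-q[s] * q[t] * (1.0 - nu[s] * nu[t]) for t in range(2)] for s in range(2)]

def determinant(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    """Inverse of a 2x2 matrix with non-vanishing determinant."""
    d = determinant(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def apply(m, vec):
    """Matrix-vector product for a 2x2 matrix."""
    return [m[s][0] * vec[0] + m[s][1] * vec[1] for s in range(2)]

# Hypothetical sample parameters with nu_+ != nu_- (counter-streaming species).
q = [1.0, -1.0]
nu = [0.5, -0.5]
gamma = coupling_matrix(q, nu)

det_direct = determinant(gamma)
det_formula = -(q[0] * q[1]) ** 2 * (nu[0] - nu[1]) ** 2   # right-hand side of (26)

u = [0.3, -1.2]                      # arbitrary density potentials (u_+, u_-)
U = apply(gamma, u)                  # chemical self-potentials via (24)
u_back = apply(inverse(gamma), U)    # density potentials recovered via (28)
print(det_direct, det_formula, u_back)
```

For these values the off-diagonal entries of $\gamma$ are positive and the diagonal ones negative, the sign pattern used later in the rearrangement argument.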
Now let $G_s$ stand for either $G_s^{\mathrm{TF}}$ or $G_s^{B}$. We note that $G_s(\phi,\psi)$ depends on $\phi$ and $\psi$ only through the combination $-q_s(\phi - \nu_s\psi) = U_s$; thus we can write $G_s(\phi,\psi) = G_s(U_s) = G_s\big(\sum_t \gamma_{s,t}u_t\big)$, where of course $G_s$ stands for either $G_s^{\mathrm{TF}}$ or $G_s^{B}$. In either case, the map $w \mapsto G_s(w)$ is monotonic increasing. It then follows at once that the chemical self-potentials $U_s$ solve the system of nonlinear PDEs
$$-\Delta U_s = 4\pi \sum_t \gamma_{s,t}\, G_t(U_t), \tag{29}$$
supplemented by the integrability conditions
$$\int_{\mathbb{R}^2} G_s(U_s)\,dx = N_s \tag{30}$$
and by the asymptotic conditions that, uniformly as $|x| \to \infty$,
$$\lim_{|x|\to\infty} \frac{U_s(x)}{\ln\frac{1}{|x|}} = 2\sum_t \gamma_{s,t} N_t. \tag{31}$$
Alternately, in terms of the $u_s$ we get the following representation for our Thomas–Fermi / Bennett models,
$$-\Delta u_s = 4\pi\, G_s\Big(\sum_t \gamma_{s,t}u_t\Big), \tag{32}$$
supplemented by the integrability conditions
$$\int_{\mathbb{R}^2} G_s\Big(\sum_t \gamma_{s,t}u_t\Big)dx = N_s \tag{33}$$
and by the asymptotic conditions that, uniformly as $|x| \to \infty$,
$$\lim_{|x|\to\infty} \frac{u_s(x)}{\ln\frac{1}{|x|}} = 2N_s, \tag{34}$$
for $s = \mp$. This constitutes the density potential representation of our Thomas–Fermi / Bennett models.

Remark. For the sake of completeness, we also state the PDEs of the Bennett model explicitly as a Liouville system. We readily eliminate the $\mu_s$ in terms of the $N_s$, using (33). Setting now $u_s = N_s v_s$ and $\beta_s\gamma_{s,t}N_t = 2\kappa_{s,t}$, and furthermore $\sum_t \kappa_{s,t}v_t = V_s$ (equivalently, $\beta_s U_s = 2V_s$), with $s$ and $t$ taking the “values” $\pm$, we rewrite (32) into the form
$$-\Delta v_s = 4\pi \frac{\exp\big(2\sum_t \kappa_{s,t}v_t\big)}{\int_{\mathbb{R}^2} \exp\big(2\sum_t \kappa_{s,t}v_t\big)\,dx}, \tag{35}$$
and (29) into
$$-\Delta V_s = 4\pi \sum_t \kappa_{s,t} \frac{\exp(2V_t)}{\int_{\mathbb{R}^2} \exp(2V_t)\,dx}. \tag{36}$$
Equations (35) and (36) are explicit alternate representations of the Liouville system associated to the Bennett model. The coefficient matrix $\kappa$ is manifestly non-symmetric in general, having negative diagonal and positive off-diagonal elements. Note that in the conformal case (11), viz. $\sum_t \kappa_{s,t} = 1$ for $s = \pm$, the Ansatz $v_+ = v_- = v$ in (35), respectively $V_+ = V_- = v$ in (36), reduces both (35) and (36) to Liouville’s equation (13).
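The claims about $\kappa$ in this remark can be illustrated numerically. The sketch below picks hypothetical charges, drift parameters and particle numbers, solves the conformal condition (11) for $\beta_s$, and forms $\kappa_{s,t} = \tfrac12\beta_s\gamma_{s,t}N_t$; it is meaningful only when the resulting $\beta_s$ are positive, which holds for the sample values chosen here. Index 0 stands for $s = +$ and index 1 for $s = -$.

```python
def conformal_kappa(q, nu, N):
    """Return (beta, kappa) for parameters chosen to satisfy the conformal condition (11)."""
    Q = sum(N[s] * q[s] for s in range(2))                  # charge per unit length
    I_over_c = sum(N[s] * q[s] * nu[s] for s in range(2))   # total current divided by c
    # Solve (11), beta_s q_s (nu_s I/c - Q) = 2, for beta_s.
    beta = [2.0 / (q[s] * (nu[s] * I_over_c - Q)) for s in range(2)]
    gamma = [[-q[s] * q[t] * (1.0 - nu[s] * nu[t]) for t in range(2)] for s in range(2)]
    kappa = [[0.5 * beta[s] * gamma[s][t] * N[t] for t in range(2)] for s in range(2)]
    return beta, kappa

# Hypothetical sample values in dimensionless units, not taken from the paper.
beta, kappa = conformal_kappa(q=[2.0, -1.0], nu=[0.8, -0.9], N=[1.0, 1.0])
row_sums = [sum(row) for row in kappa]
print(beta, kappa, row_sums)
```

In this example each row of $\kappa$ sums to 1, the diagonal entries are negative, the off-diagonal entries are positive, and $\kappa_{+,-} \neq \kappa_{-,+}$.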
4.2. Isoperimetric estimates. Let $G_s$ continue to stand for either $G_s^{\mathrm{TF}}$ or $G_s^{B}$. We introduce $g_s$, the primitive of $G_s$, i.e., $g_s'(w) = G_s(w)$ for $w \in \mathbb{R}$, such that the integrals
$$\int_{\mathbb{R}^2} g_s(U_s)\,dx = M_s \tag{37}$$
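exist (notice that $M_s$ is defined by (37)). In each case this primitive $g_s$ is unique and given by
$$g_s^{\mathrm{TF}}(U_s) = \beta_s^{-1}\frac{2}{h^3}\int_{\mathbb{R}^3} \ln\left(1 + e^{\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a + U_s\right)}\right) dp \tag{38}$$
for the semi-classical model, and by
$$g_s^{B}(U_s) = \beta_s^{-1}\frac{2}{h^3}\int_{\mathbb{R}^3} e^{\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a + U_s\right)}\, dp \tag{39}$$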
for the classical model. Notice that in the classical model we have $M_s = \beta_s^{-1}N_s$, while in the semi-classical model we have $M_s > \beta_s^{-1}N_s$ by the simple convexity inequality $\ln x \le -1 + x$, with “=” only for $x = 1$. Notice furthermore that, in either case, the map $w \mapsto g_s(w)$ is monotonic increasing.

Lemma 4.1. Let the pair $(u_+, u_-)$ solve Eqs. (32) and (33), $s = \mp$, under the asymptotic conditions (34), with $\gamma$ given in (25) satisfying (26). Then
$$\frac{1}{2}\sum_{s,t} \gamma_{s,t} N_s N_t - \sum_s M_s \ge 0, \tag{40}$$
and equality in (40) holds if and only if both $u_+$ and $u_-$ are radially symmetric and decreasing about the same point.

Proof. We follow the general reasoning of [10–12]. Since, by hypothesis, the pair $(u_+, u_-)$ solves Eqs. (32) and (33), $s = \mp$, under the asymptotic conditions (34), then $(U_+, U_-)$ satisfies (29) and (33), $s = \mp$, under the asymptotic conditions (31). Therefore, as $|x| \to \infty$,
$$G_s(U_s)(x) = \frac{2}{h^3}\int_{\mathbb{R}^3} e^{\beta_s\left(\mu_s - c\sqrt{m_s^2c^2+|p|^2} + \nu_s c\,p\cdot a\right)} dp\; |x|^{-2\beta_s \sum_t \gamma_{s,t}N_t\,(1+\theta(x))}, \tag{41}$$
with $\theta(x) = o(1)$. Also by hypothesis, (30) is satisfied, so that from (41) we conclude that $\beta_s\sum_t \gamma_{s,t}N_t \ge 1$, for otherwise the right-hand side of (41) would not be integrable over $\mathbb{R}^2$. Then, by (31) again, and since $U_s \in C^{2,\alpha}$ (hence, $U_s \in C^\infty$ by bootstrapping), the level sets $\Lambda^s_\xi = \{x \,|\, U_s \ge \xi_s\}$ are compact, hence $|\Lambda^s_\xi| < \infty$. Let $x \mapsto U_s^*(|x|)$ denote the equi-measurable, radially symmetric, non-increasing rearrangement of $x \mapsto U_s(x)$, centered at the origin, and denote by $\Lambda^{s*}_\xi = \{x \,|\, U_s^* \ge \xi_s\}$ the ball of radius $r^s_\xi$, centered at the origin. By Sard’s theorem the $C^\infty$ regularity of the $U_s$ implies that the outward normal $\hat\lambda$ to $\partial\Lambda^s_\xi$ exists except at most for $\xi$-values in a set of measure zero, so that the ensuing manipulations involving $\hat\lambda$ to $\partial\Lambda^s_\xi$ are well defined $\xi$-a.e.
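The equi-measurable, radially symmetric, non-increasing rearrangement has a simple discrete analogue that may help fix ideas: on a grid of equal-area cells, sort the cells by distance from the origin and hand out the function values in decreasing order. The sketch below is only an illustration on a hypothetical grid with a hypothetical off-centre, non-radial sample function; it shows that the rearrangement permutes the values (so sums of $f(U)$ are preserved for any $f$) and produces a profile that does not increase with radius.

```python
import math

def radial_rearrangement(values, centers):
    """Discrete symmetric decreasing rearrangement on equal-area cells.

    values[i] is the function value on the cell with centre centers[i];
    the result lists the rearranged values in the same cell order.
    """
    order_by_radius = sorted(range(len(centers)), key=lambda i: math.hypot(*centers[i]))
    sorted_values = sorted(values, reverse=True)
    rearranged = [0.0] * len(values)
    for rank, cell in enumerate(order_by_radius):
        rearranged[cell] = sorted_values[rank]   # largest values go nearest the origin
    return rearranged

# Hypothetical 21 x 21 grid of unit cells centred at the origin.
n = 21
centers = [(i - n // 2, j - n // 2) for i in range(n) for j in range(n)]
# An off-centre, non-radial sample profile that decays logarithmically.
U = [-math.log(1.0 + (x - 3) ** 2 + 2.0 * (y + 2) ** 2) for x, y in centers]
U_star = radial_rearrangement(U, centers)

# Equi-measurability: sums of f(U) are unchanged, here with f = exp.
print(sum(math.exp(w) for w in U), sum(math.exp(w) for w in U_star))
```

Because the rearranged values are a permutation of the original ones, every super-level set keeps its cell count, which is the discrete counterpart of the level sets and their rearranged balls having equal area.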
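First, recalling that $G_s > 0$, we note that on $\partial\Lambda^s_\xi$ we have $-\langle\hat\lambda, \nabla U_s\rangle = |\nabla U_s|$, by the Hopf lemma. Integration of this identity over $\partial\Lambda^s_\xi$, a trivial rewriting, and an application of the Cauchy–Schwarz inequality now gives the estimate
$$-\int_{\partial\Lambda^s_\xi} \langle\hat\lambda, \nabla U_s\rangle\, d\sigma = \int_{\partial\Lambda^s_\xi} |\nabla U_s|\, d\sigma \ge \left(\int_{\partial\Lambda^s_\xi} d\sigma\right)^{2} \left(\int_{\partial\Lambda^s_\xi} \frac{1}{|\nabla U_s|}\, d\sigma\right)^{-1}, \tag{42}$$
with equality holding if and only if $|\nabla U_s|$ is constant on $\partial\Lambda^s_\xi$. Noting that
$$\int_{\partial\Lambda^s_\xi} d\sigma = |\partial\Lambda^s_\xi|, \tag{43}$$
and applying the classical isoperimetric inequality [2], we have
$$|\partial\Lambda^s_\xi| \ge |\partial\Lambda^{s*}_\xi|, \tag{44}$$
with equality holding if and only if, up to translation, $\partial\Lambda^s_\xi = \partial\Lambda^{s*}_\xi$. By the co-area formula [18],
$$\int_{\partial\Lambda^s_\xi} \frac{1}{|\nabla U_s|}\, d\sigma = \int_{\partial\Lambda^{s*}_\xi} \frac{1}{|\nabla U_s^*|}\, d\sigma. \tag{45}$$
Pulling these estimates together we have
$$-\int_{\partial\Lambda^s_\xi} \langle\hat\lambda, \nabla U_s\rangle\, d\sigma \ge |\partial\Lambda^{s*}_\xi|^{2} \left(\int_{\partial\Lambda^{s*}_\xi} \frac{1}{|\nabla U_s^*|}\, d\sigma\right)^{-1}, \tag{46}$$
with equality holding if and only if, (i), $|\nabla U_s|$ is constant on $\partial\Lambda^s_\xi$, and (ii), $\partial\Lambda^s_\xi = \partial\Lambda^{s*}_\xi$, up to translation. This last remark implies in particular that we can restate (46) as [2],
$$-\int_{\partial\Lambda^s_\xi} \langle\hat\lambda, \nabla U_s\rangle\, d\sigma \ge -\int_{\partial\Lambda^{s*}_\xi} \partial_r U_s^*\, d\sigma. \tag{47}$$
Next, using Green’s theorem and (29), then a rearrangement identity for $s = t$, then a rearrangement inequality for $s \neq t$ (in which case $t = -s$), noting that $\gamma_{s,-s} > 0$ and recalling that $w \mapsto G_s(w)$ is increasing, we have
$$-\int_{\partial\Lambda^s_\xi} \langle\hat\lambda, \nabla U_s\rangle\, d\sigma = -\int_{\Lambda^s_\xi} \Delta U_s\, dx \tag{48}$$
$$= 4\pi \sum_t \gamma_{s,t} \int_{\Lambda^s_\xi} G_t(U_t)\, dx \tag{49}$$
$$= 4\pi\gamma_{s,s} \int_{\Lambda^{s*}_\xi} G_s(U_s^*)\, dx + 4\pi\gamma_{s,-s} \int_{\Lambda^s_\xi} G_{-s}(U_{-s})\, dx \tag{50}$$
$$\le 4\pi \sum_t \gamma_{s,t} \int_{\Lambda^{s*}_\xi} G_t(U_t^*)\, dx, \tag{51}$$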
where equality in (51) can hold only if $U_t$ and $U_s$ share their level lines (up to the labelling) in $\Lambda^t_\xi$, for our $\gamma$ is irreducible. Combining inequalities (47) and (51), we arrive at the inequality
$$-\int_{\partial\Lambda^{s*}_\xi} \partial_r U_s^*\, d\sigma \le 4\pi \sum_t \gamma_{s,t} \int_{\Lambda^{s*}_\xi} G_t(U_t^*)\, dx, \tag{52}$$
where equality can hold if and only if each $\Lambda^s_\xi$ is a disk, with $|\nabla U_s|$ constant on $\partial\Lambda^s_\xi$, and all the $U_s$ share their level lines (up to the labelling). Thus, in case of equality in (52), from the first two conditions for equality it follows that the family of disks $\Lambda^+_\xi$ and the family of disks $\Lambda^-_\xi$ are separately concentric, while from the third condition for equality it then follows that the families of disks must be jointly concentric. On the other hand, if at least one of the $U_s$ is not radially symmetric decreasing about any point, let $\Xi_s$ be the image under $U_s$ of the (generally non-radial) set in $\mathbb{R}^2$ which supports the non-radial parts of $U_s$. Then $\Xi_s$ has finite measure. Since equality in (52) cannot hold for $\xi \in \Xi_s$, for $\xi \in \Xi_s$ we now conclude that we have strict inequality in (52),
$$-2\pi r^s_\xi\, U_s^{*\prime}(r^s_\xi) < 4\pi \sum_t \gamma_{s,t} \int_{\Lambda^{s*}_\xi} G_t(U_t^*)\, dx \tag{53}$$
for both $s = \mp$. We now set
$$N_s(r) = \int_{B_r(0)} G_s(U_s^*)\, dx, \tag{54}$$
and
$$M_s(r) = \int_{B_r(0)} g_s(U_s^*)\, dx. \tag{55}$$
We have $\lim_{r\to\infty} N_s(r) = N_s$ and $\lim_{r\to\infty} M_s(r) = M_s$, for
$$\int_{\mathbb{R}^2} f(U_s^*)\, dx = \int_{\mathbb{R}^2} f(U_s)\, dx, \tag{56}$$
where $f$ stands for either $g_s$ or $G_s$. By (53),
$$2\pi r\, U_s^{*\prime}(r) \ge -4\pi \sum_t \gamma_{s,t} N_t(r), \tag{57}$$
from which we conclude that
$$r M_s''(r) \ge M_s'(r) - 2N_s'(r) \sum_t \gamma_{s,t} N_t(r), \tag{58}$$
with “>” valid for all $r > 0$ for which $U_s^*(r) \in \Xi_s$, while “=” holds for $U_s^*(r) \notin \Xi_s$. We now sum (58) w.r.t. $s = \mp$, obtaining
$$\sum_s r M_s''(r) \ge \sum_s M_s'(r) - \Big(\sum_{s,t} \gamma_{s,t} N_s(r) N_t(r)\Big)', \tag{59}$$
where we made use of the fact that $\gamma$ is real symmetric. Next we integrate (59) from $r = 0$ to $r = \infty$, using integration by parts on the left-hand side. Since $g_s(U_s^*) \in L^1(\mathbb{R}^2)$ is radially decreasing, we have
$$\lim_{r\to\infty} r M_s'(r) = 0, \tag{60}$$
thus we get the result
$$\frac{1}{2}\sum_{s,t} \gamma_{s,t} N_s N_t - \sum_s M_s \ge 0. \tag{61}$$
Now, if “=” holds in (61), then all the level curves $\partial\Lambda^s_\xi$ are circles with $|\nabla U_s|$ constant on $\partial\Lambda^s_\xi$; hence [6] the circular level curves of each $U_s$ are concentric, and then $U_s(x) = U_s^*(|x - x_0^s|)$ for some $x_0^s$. Moreover, since in case of “=” in (61), (51) tells us that the two $U_s$ must share their level curves (with generally different level values, of course), we conclude that $x_0^+ = x_0^-$, i.e. $U_+$ and $U_-$ are then radially symmetric and decreasing about the same center of symmetry, $x_0$. On the other hand, if at least one of the $U_s$ is not radially symmetric and decreasing about any point, then the integration picks up all the strict inequalities from $\xi \in \Xi_s$, and “>” holds in (61). Finally, since $\operatorname{rank}(\gamma) = 2$, it follows that at least one $u_s$ is not radially symmetric and decreasing about any point if at least one $U_s$ is not. In the same vein, $u_+$ and $u_-$ are radially symmetric and decreasing about the same center of symmetry, $x_0$, whenever both $U_+$ and $U_-$ are. This proves Lemma 4.1.

4.3. Asymptotic control near infinity. Standard harmonic analysis gives us:

Proposition 4.2. Under the hypothesis stated in Lemma 4.1, each solution pair $(u_+, u_-)$ of (32), (33), (34) satisfies the integral representation
$$u_s(x) - u_s(0) = \int_{\mathbb{R}^2} \left(\ln\frac{1}{|x - y|^2} - \ln\frac{1}{|y|^2}\right) G_s\Big(\sum_t \gamma_{s,t}u_t\Big)(y)\, dy. \tag{62}$$

Corollary 4.3. By (62) we have
$$\nabla u_s(x) = -2\int_{\mathbb{R}^2} \frac{x - y}{|x - y|^2}\, G_s\Big(\sum_t \gamma_{s,t}u_t\Big)(y)\, dy. \tag{63}$$
With the help of (62) and (63) we obtain asymptotic control over the r.h.s. of (32), expressed in terms of the $U_s$.

Lemma 4.4. Under the hypotheses of Lemma 4.1, there exists an $r_0(U_s) > 0$, a constant $C_s > 0$, and a monotonic decreasing $h_s(|x|) > 0$ satisfying
$$\lim_{|x|\to\infty} |x|^{-h_s(|x|)} = 0, \tag{64}$$
such that for $s = \mp$ we have, for $|x| > r_0$,
$$G_s(U_s)(x) \le C_s |x|^{-2-h_s(|x|)}. \tag{65}$$
Furthermore, for at least one $s$, we have $h_s(|x|) \ge \epsilon_s > 0$ for $|x| > r_0$.
Proof. The bound (65), with $h_s(|x|) = O(1)$ monotonic decreasing, follows directly from (41) and (30). Furthermore, by (41) and (30) we can find $h_s$ such that $\int_1^\infty |x|^{-1-h_s(|x|)}\, d|x| < \infty$, but this is impossible if $|x|^{-h_s(|x|)} \not\to 0$; hence, (64) follows. This still allows $h_s(|x|) = o(1)$ for both $s$, but by Lemma 4.1 and the fact that $M_s \ge \beta_s^{-1}N_s$ (see the definition of the $g_s$ above), we find, after multiplying (61) by $-2$ and re-grouping terms, that
$$\sum_s \beta_s^{-1} N_s \Big(2 - \beta_s \sum_t \gamma_{s,t}N_t\Big) \le 0. \tag{66}$$
Thus, if for one of the $s$ we have $h_s(|x|) = o(1)$, say for $s = +$, then by (41) we have $\beta_+\sum_t \gamma_{+,t}N_t = 1$, so that (66) gives us right away
$$\beta_-\sum_t \gamma_{-,t}N_t \ge 2 + \frac{\beta_- N_+}{\beta_+ N_-}. \tag{67}$$
By symmetry, the analog conclusion holds for $\beta_+\sum_t \gamma_{+,t}N_t$ if $h_-(|x|) = o(1)$. Hence, $h_s(|x|) = o(1)$ for at most one of the $s$. This proves Lemma 4.4.

Corollary 4.5. For at least one $s$, we have $\int_{\mathbb{R}^2} \ln|y|\, G_s(U_s)(y)\, dy < \infty$, so that for this $s$, we have
$$\lim_{|x|\to\infty} \big(u_s(x) + 2N_s \ln|x|\big) = u_s(0) + 2\int_{\mathbb{R}^2} \ln|y|\, G_s(U_s)(y)\, dy. \tag{68}$$
Lemma 4.6. Under the hypothesis stated in Lemma 4.1, each solution pair $(u_+, u_-)$ of (32), (33), (34) satisfies the gradient estimates
$$\limsup_{|x|\to\infty} |x||\nabla u_s| \le 2N_s. \tag{69}$$

Proof. By Corollary 4.3, we have
$$|\nabla u_s(x)| \le 2\int_{\mathbb{R}^2} \frac{G_s(U_s)(y)}{|x - y|}\, dy. \tag{70}$$
After multiplying (70) by $|x|$, a simple rewriting of the r.h.s. gives
$$|x||\nabla u_s(x)| \le 2\int_{\mathbb{R}^2} G_s(U_s)(y)\, dy + 2\int_{\mathbb{R}^2} \left(\frac{|x|}{|x - y|} - 1\right) G_s(U_s)(y)\, dy. \tag{71}$$
By (30) the first integral on the r.h.s. of (71) equals $N_s$. By the triangle inequality, the second term on the r.h.s. of (71) is bounded in absolute value by
$$2\int_{\mathbb{R}^2} \frac{|y|}{|x - y|}\, G_s(U_s)(y)\, dy. \tag{72}$$
We now show that
$$\lim_{|x|\to\infty} \int_{\mathbb{R}^2} \frac{|y|}{|x - y|}\, G_s(U_s)(y)\, dy = 0, \tag{73}$$
from which the lemma follows. We split the domain of integration in (72) as follows: $\mathbb{R}^2 = \Omega_1 \cup \Omega_2 \cup \Omega_3$, with $\Omega_1 = \{y \,|\, |y| < |x|/2\}$, $\Omega_2 = \{y \,|\, |x|/2 \le |y| \le 2|x|\}$, and $\Omega_3 = \{y \,|\, |y| > 2|x|\}$. If
$G_s(U_s)(y) \le C|y|^{-2-\epsilon}$, with $0 < \epsilon < 1$, then the estimates are precisely the same as in [10], Sect. 2, with exp replaced by $G_s$; this is the case for at least one of the $s$. It remains to provide estimates when $h_s(|x|) = o(1)$ for one of the $s$. To estimate the contribution from $\Omega_1$ when $G_s(U_s)(y) \le C|y|^{-2-h_s(|y|)}$ with $h_s(|y|) = o(1)$, we note that
$$\int_{\Omega_1} \frac{|y|}{|x - y|}\, G_s(U_s)(y)\, dy \le \frac{C}{|x|}\int_{\Omega_1} |y|\, G_s(U_s)(y)\, dy \le \frac{C}{|x|}\int_0^{|x|} \zeta^{-h_s(\zeta)}\, d\zeta. \tag{74}$$
As for the right-hand side of (74), L’Hôpital’s Rule gives us
$$\lim_{|x|\to\infty} \frac{C}{|x|}\int_0^{|x|} \zeta^{-h_s(\zeta)}\, d\zeta = C\lim_{|x|\to\infty} |x|^{-h_s(|x|)} = 0, \tag{75}$$
the last step by Lemma 4.4. Hence, the l.h.s. of (74) vanishes as $|x| \to \infty$. Similarly, the contribution from $\Omega_2$ is estimated by using again that $G_s(U_s)(y) \le C|y|^{-2-h_s(|y|)}$, so that for $|x|$ large enough we have the bound
$$\int_{\Omega_2} \frac{|y|}{|x - y|}\, G_s(U_s)(y)\, dy \le \frac{C}{|x|^{1+h_s(|x|)}}\int_{|y|<4|x|} \frac{dy}{|y|} \le C|x|^{-h_s(|x|)}. \tag{76}$$
Clearly the r.h.s. of (76) $\to 0$ as $|x| \to \infty$, by the same reasoning as for $\Omega_1$. Finally, the contribution from $\Omega_3$ is dominated by
$$\int_{\Omega_3} \frac{|y|}{|x - y|}\, G_s(U_s)(y)\, dy \le C\int_{|y|>2|x|} G_s(U_s)(y)\, dy, \tag{77}$$
which vanishes as $|x| \to \infty$, by hypothesis (30). This concludes the proof of Lemma 4.6.

Lemma 4.7. Under the hypotheses of Lemma 4.1, we have, uniformly in $x$,
$$\lim_{|x|\to\infty} \langle x, \nabla u_s\rangle = -2N_s. \tag{78}$$
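Proof. Let $\hat x = x/|x|$ and $\hat y = y/|y|$, with $|x| = |y|$. Now fix $\hat x \in S^1$. By (34), we have
$$\lim_{\tau\to\infty} \frac{u_s(\tau\hat x)}{\ln\tau} = -2N_s. \tag{79}$$
Thus, by L’Hôpital’s Rule,
$$\lim_{\tau\to\infty} \tau\frac{d}{d\tau}u_s(\tau\hat x) = \lim_{|x|\to\infty} \langle x, \nabla u_s\rangle = -2N_s \tag{80}$$
for $x = |x|\hat x$. It remains to establish uniformity of (80). To this end, we show that there exist $R$ and $\delta$, such that, if $|x| > R$ and $|\hat x - \hat y| < \delta$, then
$$\big|\langle x, \nabla u_s(x)\rangle - \langle y, \nabla u_s(y)\rangle\big| < \epsilon. \tag{81}$$
We first show that for $|x| > R$ and $|x - y| < |x|/10$, we have,
$$|x|\,\big|\nabla u_s(x) - \nabla u_s(y)\big| \le C|\hat x - \hat y| + C|x|^{-h_s(|x|)}. \tag{82}$$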
By Corollary 4.3,
$$\big|\nabla u_s(x) - \nabla u_s(y)\big| \le 2\int_{\mathbb{R}^2} G_s(U_s)(z)\left|\frac{x - z}{|x - z|^2} - \frac{y - z}{|y - z|^2}\right| dz. \tag{83}$$
We break up the domain of integration in the above integral exactly as in the proof of Lemma 4.6. (Notice the integration variable is now $z$.) The integration over $\Omega_1$ is estimated exactly as in Sect. 2 of [10] to be dominated by
$$C\frac{|x - y|}{|x|^2}\int_{\mathbb{R}^2} G_s(U_s)(z)\, dz \le C\frac{|x - y|}{|x|^2}. \tag{84}$$
The integral over $\Omega_2$ is dominated by
$$\int_{|z|\sim|x|} G_s(U_s)(z)\left(\frac{1}{|x - z|} + \frac{1}{|y - z|}\right) dz \le C|x|^{-1-h_s(|x|)}. \tag{85}$$
The final estimate above was identical to that made in the proof of Lemma 4.6. Use was made of $G_s(U_s)(z) \le C|x|^{-2-h_s(|x|)}$ on $\Omega_2$, which holds by Lemma 4.4. The contribution from $\Omega_3$ is estimated once again exactly as in Sect. 2 of [10] to be dominated by
$$C|x - y|\int_{|z|>2|x|} \frac{G_s(U_s)(z)}{|z|^2}\, dz \le C\frac{|x - y|}{|x|^2}\int_{\mathbb{R}^2} G_s(U_s)\, dx \le C\frac{|x - y|}{|x|^2}, \tag{86}$$
where the last step follows by (30). By these estimates,
$$\big|\langle x, \nabla u_s(x)\rangle - \langle y, \nabla u_s(y)\rangle\big| \le |x||\hat x - \hat y|\,|\nabla u_s(x)| + |y|\,\big|\nabla u_s(x) - \nabla u_s(y)\big| \tag{87}$$
$$\le |x||\nabla u_s(x)||\hat x - \hat y| + |\hat x - \hat y| + C|x|^{-h_s(|x|)}. \tag{88}$$
By Lemma 4.6, the last expression above is at most $C\delta + C|x|^{-h_s(|x|)}$. Thus our claim (81) follows now from Lemma 4.4 for suitably large $R$ and small $\delta$. Since $S^1$ is compact, uniformity of the limit in Lemma 4.7 now follows.

Corollary 4.8. Under the hypotheses expressed in Lemma 4.1, we have, uniformly in $x$,
$$\lim_{|x|\to\infty} |x||\nabla u_s| = 2N_s. \tag{89}$$
Proof. Follows essentially verbatim [10], proof of Corollary 2.2, with exp replaced by $G_s$.

Let $\Omega_\xi = \{x \,|\, u_s(x) \ge \xi\}$, where $\xi \ll -1$. By (34) it follows that if $x \in \partial\Omega_\xi$, then $|x| \ge R(\xi)$ with $R(\xi)$ large. For such $x$, it follows from Corollary 4.8 that $\nabla u_s \neq 0$. Since $u \in C^{2,\alpha}_{\mathrm{loc}}$ we easily see that therefore $\partial\Omega_\xi \in C^{2,\alpha}$. Thus the unit outward normal $\hat\omega(x)$ to $\partial\Omega_\xi$ exists at all $x \in \partial\Omega_\xi$ for $\xi$ sufficiently negative.

Lemma 4.9. Let $\hat\omega(x)$ be the unit outward normal to $\partial\Omega_\xi$ at $x$, and let $\hat x = x/|x|$. We have, uniformly in $x$,
$$\lim_{\xi\to-\infty} \langle\hat x, \hat\omega\rangle = 1. \tag{90}$$
Proof. Identical to [10], proof of Lemma 2.8.

Remark. Lemma 4.9 implies that asymptotically for large $x$ the $\partial\Omega_\xi$ become concentric circles.

We are now in a position to prove our Theorem 3.1.
4.4. Proof of the Virial Theorem.
Proposition 4.10 (Rellich–Pokhozaev identity). Under the hypotheses expressed in Theorem 3.1, any solution pair $(u_+, u_-)$ of (32), (33), (34) satisfies the Rellich–Pokhozaev identity
$$\frac12\sum_{s,t}\gamma_{s,t}N_sN_t-\sum_s M_s = 0. \tag{91}$$
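Remark. The Rellich–Pokhozaev identity (91) is identical to the identity expressed in the Virial Theorem 3.1.
Proof of Proposition 4.10. For $(u_+, u_-)$ a solution pair of (32), (33), (34), we have the partial differential identity
$$\operatorname{div}\bigl(\langle x,\nabla u_t\rangle\nabla u_s\bigr) = \bigl\langle\nabla u_s,(1+\langle x,\nabla\rangle)\nabla u_t\bigr\rangle - 4\pi\langle x,\nabla u_t\rangle\,G_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr). \tag{92}$$
We will multiply (92) by $\gamma_{s,t}$, sum over $s$ and $t$, integrate over $B_R$, use some partial integrations, then take the limit $R\to\infty$. We evaluate first the left-hand side of (92). Green's theorem gives us
$$\int_{B_R}\operatorname{div}\bigl(\langle x,\nabla u_t\rangle\nabla u_s\bigr)\,dx = \int_{\partial B_R}|x|^{-1}\langle x,\nabla u_t\rangle\langle x,\nabla u_s\rangle\,d\sigma. \tag{93}$$
Taking the limit $R\to\infty$, using (78), we get
$$\lim_{R\to\infty}\int_{\partial B_R}|x|^{-1}\langle x,\nabla u_t\rangle\langle x,\nabla u_s\rangle\,d\sigma = 8\pi N_sN_t, \tag{94}$$
hence
$$\lim_{R\to\infty}\sum_{s,t}\gamma_{s,t}\int_{B_R}\operatorname{div}\bigl(\langle x,\nabla u_t\rangle\nabla u_s\bigr)\,dx = 8\pi\sum_{s,t}\gamma_{s,t}N_sN_t. \tag{95}$$
On the other hand, the last term in the right-hand side of (92) gives us
$$\sum_{s,t}\gamma_{s,t}\int_{B_R}\langle x,\nabla u_t\rangle\,G_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)dx = \sum_s\int_{B_R}\Bigl\langle x,\nabla g_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)\Bigr\rangle dx = \sum_s\left(\int_{\partial B_R}|x|\,g_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)d\sigma - 2\int_{B_R}g_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)dx\right). \tag{96}$$
We now take the limit $R\to\infty$ in (96). As for the surface integrals, we note that by Lemma 4.4 we have $g_s(U_s)(x) \sim C\,G_s(U_s)(x)$ as $|x|\to\infty$, so that once again by Lemma 4.4, we have
$$\lim_{R\to\infty}\int_{\partial B_R}|x|\,g_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)d\sigma = 0. \tag{97}$$
As for the volume integrals, we get
$$\lim_{R\to\infty}\int_{B_R}g_s\Bigl(\sum_t\gamma_{s,t}u_t\Bigr)dx = M_s. \tag{98}$$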
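Turning now to the first term in the right-hand side of (92), we use the symmetry of $\gamma$, an integration by parts and (32), to get
$$\sum_{s,t}\gamma_{s,t}\int_{B_R}\bigl\langle\nabla u_s,(1+\langle x,\nabla\rangle)\nabla u_t\bigr\rangle dx = \frac12\sum_{s,t}\gamma_{s,t}\int_{\partial B_R}|x|\,\langle\nabla u_s,\nabla u_t\rangle\,d\sigma \tag{99}$$
$$= \frac12\sum_{s,t}\gamma_{s,t}\int_{\partial B_R}\Bigl(|x|^{-1}\langle x,\nabla u_s\rangle\langle x,\nabla u_t\rangle + |x|\,\langle\nabla_T u_s,\nabla_T u_t\rangle\Bigr)d\sigma, \tag{100}$$
where $\nabla_T$ denotes the tangential derivative. By Lemma 4.7 and Corollary 4.8, we have
$$|x|^2|\nabla_T u_s|^2 = |x|^2|\nabla u_s|^2 - \langle x,\nabla u_s\rangle^2 \to 0, \tag{101}$$
uniformly as $|x|\to\infty$. Thus as $R\to\infty$,
$$\int_{\partial B_R}|x|\,\langle\nabla_T u_s,\nabla_T u_t\rangle\,d\sigma \to 0, \tag{102}$$
and therefore
$$\lim_{R\to\infty}\sum_{s,t}\gamma_{s,t}\int_{B_R}\bigl\langle\nabla u_s,(1+\langle x,\nabla\rangle)\nabla u_t\bigr\rangle dx = 4\pi\sum_{s,t}\gamma_{s,t}N_sN_t. \tag{103}$$
Pulling all limit results together, we obtain Proposition 4.10.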
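The last step can be spelled out as follows. This is only a sketch of the bookkeeping, assuming the limits quoted in the proof exactly as stated (the limit of the divergence term, and the limits (97), (98) and (103)); no new notation is introduced.

```latex
% Multiply (92) by \gamma_{s,t}, sum over s and t, integrate over B_R, let R -> infinity.
\begin{align*}
\underbrace{8\pi\sum_{s,t}\gamma_{s,t}N_sN_t}_{\text{divergence term}}
 &= \underbrace{4\pi\sum_{s,t}\gamma_{s,t}N_sN_t}_{\text{first term, by (103)}}
 \;-\;4\pi\sum_s\Bigl(\underbrace{0}_{\text{by (97)}}-2\underbrace{M_s}_{\text{by (98)}}\Bigr)\\[2pt]
\Longrightarrow\quad
4\pi\sum_{s,t}\gamma_{s,t}N_sN_t &= 8\pi\sum_s M_s
\quad\Longleftrightarrow\quad
\frac12\sum_{s,t}\gamma_{s,t}N_sN_t-\sum_s M_s = 0 .
\end{align*}
```

The factor $-4\pi$ in front of the bracket is the coefficient of the last term of (92), and the bracket is the limit of the right-hand side of (96).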
Remark. The proof of the virial theorem extends to situations when γ does not have full rank, hence to more-than-two species beams. 4.5. Concluding the proof of the symmetry theorem. By Lemma 4.1, and by Proposition 4.10, the solutions us of (32), (33), (34), have to be radially symmetric and decreasing about a common center x0 . Since the coupling matrix γ is invertible, the same conclusion holds for the solutions Us of (29), (30), (31). The proof is complete. Acknowledgement. The results reported here spun out of my collaboration with Sagun Chanillo. Work supported by NSF Grant DMS 9623220.
References 1. Almgren, F.J., and Lieb, E.H.: Symmetric decreasing rearrangement is sometimes continuous. J. Am. Math. Soc. 2, 683–773 (1989) 2. Bandle, C.: Isoperimetric inequalities and applications. Boston: Pitman, 1980 3. Benford, G., Book, D.L. and Sudan, R.N.: Relativistic beam equilibria with back currents. Phys. Fluids 13, 2621–2623 (1970) 4. Bennett, W.H.: Magnetically self-focussing streams. Phys. Rev. 45, 890–897 (1934) 5. Bennett, W.H.: Self-focussing streams. Phys. Rev. 98, 1584–1593 (1955) 6. Brothers, J.E., and Ziemer, W.P.: Minimal rearrangement of Sobolev functions. J. reine angew. Math. 384, 153–179 (1988) 7. Carlen, E.A., and Loss, M.: Competing symmetries, the logarithmic HLS inequality, and Onofri’s inequality on Sn . Geom. Funct. Anal. 2, 90–104 (1992) 8. Chandrasekhar, S.: The highly collapsed configurations of a stellar mass. Month. Not. R. astr. Soc. 91, 456–466 (1931); Part II, ibid. 95, 207–225 (1935) 9. Chanillo, S., and Kiessling, M.K.-H.: Rotational symmetry of solutions to some nonlinear problems in statistical mechanics and geometry. Commun. Math. Phys. 160, 217–238 (1994)
10. Chanillo, S., and Kiessling, M.K.-H.: Conformally Invariant Systems of Nonlinear PDE of Liouville type. Geom. Funct. Anal. 5, 924–947 (1995) 11. Chanillo, S., and Kiessling, M.K.-H.: Symmetry of solutions of Ginzburg–Landau equations. Compt. Rendus Acad. Sci. Paris, t. 321, 1023–1026 (1995) 12. Chanillo, S., and Kiessling, M.K.-H.: Curl-free Ginzburg–Landau vortices. Nonlinear Anal. 38, 933– 949 (1999) 13. Chanillo, S., and Kiessling, M.K.-H.: Surfaces with Prescribed Gauss Curvature. Duke Math. J. 105, 309–353 (2000) 14. Chen, W., and Li, C.: Classification of solutions of some nonlinear elliptic equations. Duke Math. J. 63, 615–622 (1991) 15. Chipot, M., Shafrir, I., Wolansky, G.: On the solutions of Liouville systems. J. Diff. Eq. 140, 59–105 (1997); and errata, ibid. 178, 630 (2002) 16. Chou, K.S., and Wan, T.Y.: Asymptotic radial symmetry for solutions of u + eu = 0 in a punctured disc. Pac. J. Math. 163, 269–276 (1994) 17. de Groot, S.R., van Leeuwen, W.A., and van Wert, Ch.G.: Relativistic Kinetic Theory. North Holland, Amsterdam (1980) 18. Federer, H.: Geometric measure theory. Berlin: Springer Verlag, 1996 19. Gidas, B., Ni., W.M., and Nirenberg, L.: Symmetry and related properties via the maximum principle. Commun. Math. Phys. 68, 209–243 (1979) 20. Gilbarg, D., and Trudinger, N.S.: Elliptic partial differential equations of second order. New York: Springer Verlag, 1983 21. Humphries, S.: Charged particle beams, pp. 582 ff. New York: Wiley, 1990 22. Jost, J., and Wang, G.: Analytic aspects of a Toda system I: a Moser-Trudinger inequality. e-print, arXiv:math-ph0011039 (2000); Commun. Pure Appl. Math. 54, 1289–1319 (2001) 23. Jost, J., and Wang, G.: Classification of solutions of a Toda system in R2 . e-print, arXiv:math-ph0105045 [2001]; Int. Math. Res. Notices 2002, 277–290 (2002) 24. Jüttner, F.: Das Maxwellsche Gesetz der Geschwindigkeitsverteilung in der Relativtheorie. Ann. Physik u. Chemie 34, 856–882 (1911) 25. 
Jüttner, F.: Die Dynamik eines bewegten Gases in der Relativtheorie. Ann. Physik u. Chemie 35, 145–161 (1911) 26. Jüttner, F.: Die relativistische Quantentheorie des idealen Gases. Z. Physik 47, 542–566 (1928) 27. Kawohl, B.: Rearrangements and convexity of level sets in PDE. New York: Springer-Verlag,1985 28. Kesavan, C., and Pacella, F.: Symmetry of positive solutions of a quasilinear elliptic equation via isoperimetric inequalities. Appl. Anal. 54, 27–37 (1994) 29. Kiessling, M.K.-H., and Lebowitz, J.L.: Dissipative Stationary Plasmas: Kinetic Modeling, Bennett’s Pinch and Generalizations. Phys. Plasmas 1, 1841–1849 (1994) 30. Lebowitz, J.L., and Bergmann, P.G.: New Approach to Nonequilibrium Processes. Phys. Rev. 99, 578–587 (1955) 31. Lebowitz, J.L., and Bergmann, P.G.: Irreversible Gibbsian Ensembles. Annals of Physics (N.Y.) 1, 1–23 (1957) 32. Lebowitz, J.L., and Frisch, H.: Model of a Nonequilibrium Ensemble: The Knudsen Gas. Phys. Rev. 107, 917–923 (1957) 33. Lebowitz, J.L.: Stationary Nonequilibrium Gibbsian Ensembles, Phys. Rev. 114, 1192–1202 (1959) 34. Li, C.-M.: Monotonicity and symmetry of solutions of fully nonlinear elliptic equations on unbounded domains. Commun. PDE 16, 585–615 (1991) 35. Lieb, E.H., and Yau, H.-T.: The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Commun. Math. Phys. 112, 147–174 (1987) 36. Lions, P.-L.: Two geometrical properties of solutions of semilinear problems. Appl. Anal. 12, 267–272 (1981) 37. Liouville, J.: Sur l’équation aux différences partielles ∂ 2 log λ/∂u∂v ± λ/2a 2 = 0. J. de Math. Pures Appl. 18, 71–72 (1853) 38. Pokhozaev, S.I.: Eigenfunctions of the equation u + λf (u) = 0. Sov. Math. Dokl. 5, 1408–1411 (1965) 39. Rellich, F.: Darstellung der Eigenwerte von u + λu = 0 durch ein Randwertintegral. Math. Z. 46, 635–636 (1940) 40. Serrin, J.: A symmetry problem in potential theory. Arch. Rational Mech. Anal. 43, 304–318 (1971) 41. Thirring, W.E.: Preface to: The State of Matter, M. 
Aizenman and H. Araki, eds. Adv. Ser. Math. Phys. 20. Singapore: World Scientific, 1994
42. Troy, W.C.: Symmetry properties in systems of semilinear elliptic equations. J. Diff. Eq. 42, 400–413 (1981) 43. Vlasov, A.A.: Many-Particle Theory and its Application to Plasma. New York: Gordon and Breach, 1961 44. Weinberg, S.: General theory of resistive beam instabilities. J. Math. Phys. 8, 614–641 (1967) Communicated by H.-T. Yau
Commun. Math. Phys. 226, 627 – 662 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Discrete Quantum Drinfeld–Sokolov Correspondence C. Grunspan The Weizmann Institute of Science, Department of Theoretical Mathematics, 76100 Rehovot, Israel Received: 3 July 2001 / Accepted: 14 December 2001
Abstract: We construct a discrete quantum version of the Drinfeld–Sokolov correspondence for the sine-Gordon system. The classical version of this correspondence is a birational Poisson morphism between the phase space of the discrete sine-Gordon system and a Poisson homogeneous space. Under this correspondence, the commuting higher mKdV vector fields correspond to the action of an Abelian Lie algebra. We quantize this picture (1) by quantizing this Poisson homogeneous space, together with the action of the Abelian Lie algebra, (2) by quantizing the sine-Gordon phase space, (3) by computing the quantum analogues of the integrals of motion generating the mKdV vector fields, and (4) by constructing an algebra morphism taking one commuting family of derivations to the other one.
1. Introduction
1.1. Background. The link between integrable systems and quantum groups has been intensively studied during the last few years from several viewpoints. The goal of this article is to present a discrete and quantum version of a natural construction occurring in the theory of integrable systems, namely the Drinfeld–Sokolov correspondence [DS]. At the classical level, this correspondence was discovered by Drinfeld and Sokolov in the eighties, by using the dressing method of Zakharov and Shabat [SZ]. It is a bijective map between phase spaces of certain evolution equations (such as the KdV, mKdV, or Toda hierarchy) and homogeneous spaces. Each phase space is equipped with an infinite commuting family of vector fields: the Hamiltonian fields generated by the integrals of motion of the KdV, mKdV, or Toda hierarchy. One of the main properties of the Drinfeld–Sokolov correspondence is that it leads to a geometric interpretation of these commutative families: they correspond to the action of an Abelian Lie algebra on a double coset space. After Drinfeld and Sokolov, Feigin and Frenkel developed a cohomological approach based on the fact that screening operators of the Toda theory satisfy the Serre relations [FF2, FF3]. This allowed them (1) to prove the existence of a commutative family of integrals of motion in the quantum case and (2) to suggest a possible discretization of the problem, generalizing those introduced much earlier by Izergin and Korepin [IK1, IK2] (see also [Pu]). At the semi-classical level, the discretized Toda system has been studied by Enriquez and Feigin in the case when the Lie algebra is $\widehat{sl}_2$ [EFe]. This is the discrete sine-Gordon theory. By imitating the cohomological approach of Feigin and Frenkel, the authors (1) proved the existence of a classical family of integrals of motion in involution and (2) constructed a Drinfeld–Sokolov correspondence between phase spaces equipped with the Hamiltonian action of the integrals of motion, and homogeneous spaces equipped with the action of an Abelian Lie algebra. Moreover, the phase spaces are endowed with a natural structure of Poisson manifold, the homogeneous spaces are Poisson homogeneous spaces and the correspondence is a Poisson isomorphism. The aim of our present work is to quantize this result. So, this article fills in the discrete-quantum square of the following array (Table 1).

Table 1. Drinfeld–Sokolov correspondence
              Classical                  Quantum
Continuous    Drinfeld–Sokolov (1981)    ?
Discrete      Enriquez–Feigin (1995)     this article (2001)

This work was supported in part by EC TMR network “Algebraic Lie Representations”, Grant No FMRXCT97-0100.
1.2. The classical Drinfeld–Sokolov correspondence. The Drinfeld–Sokolov correspondence is inspired by the application of dressing methods developed by Zakharov and Shabat in the theory of integrable systems [SZ]. The integrable systems studied by Drinfeld and Sokolov by this method are the Korteweg–de Vries hierarchies (KdV) or the modified Korteweg–de Vries hierarchies (mKdV), associated with an affine Kac–Moody algebra. For example, in the case when the Kac–Moody algebra $\tilde g$ is $\widehat{sl}_2$, the second equation of the mKdV hierarchy is (the first one being $\partial_z u = u_z$)
$$u_t = u_{zzz} + 6u^2u_z. \tag{1}$$
The main achievement of Drinfeld and Sokolov was (1) to associate to these equations matrix Lax pairs $(A(u), L(u))$ taking values in affine Kac–Moody Lie algebras, (2) to set up a bijection between the phase space and a coset space, by assigning to a point of the phase space the matrix conjugating its Lax matrix to a prescribed form, and (3) to show that the corresponding system on the coset space is “linear”. In this way, Drinfeld and Sokolov achieved the linearization of their system. More precisely, if $u$ belongs to the phase space, the matrices $K(u)$ conjugating the matrix $L(u)$ into a standard form belong to a pro-algebraic pro-unipotent subgroup $N_+$ of the Kac–Moody group $G$ associated to $\tilde g$. Moreover, such a matrix $K(u)$ is determined uniquely up to multiplication by an element of a commutative subgroup $A_+$ in $N_+$, and all the coefficients of the class of $K(u)$ in $N_+/A_+$ are differential polynomials in $u$. As a result, one gets a map from the phase space of the hierarchy to $N_+/A_+$. The Drinfeld–Sokolov theorem asserts that this map is bijective. Moreover, in the corresponding bijection between the rings of functions, the hierarchy equations viewed as commutative flows on the phase space correspond to the right action of the Lie algebra of the normalizer of $A_+$ in $G$, $N_+$ being embedded into the flag variety $B_-\backslash G$. The Hamiltonian structure (which one can associate to these hierarchies) was studied by Gelfand, Dickey and Dorfman ([GDi1, GDi2, GDo]).
1.3. The viewpoint of Feigin and Frenkel. Feigin and Frenkel reformulated the Drinfeld–Sokolov correspondence in a cohomological language. This allowed them to identify the action of $n_+$ on the phase space $U$ (which is, according to the Drinfeld–Sokolov correspondence, the same as the left action by vector fields of $n_+$ on the homogeneous space $N_+/A_+$) with the Hamiltonian action of screening charges of the Toda system associated with the Lie algebra $\tilde g$. Besides, their formalism led to a quantization as well as a discretization of the Toda system. Precisely, let $g$ be a semi-simple Lie algebra and $\tilde g$ be the affine Kac–Moody algebra built from $g$. The Toda system associated with the Lie algebra $\tilde g$ is the following system of differential equations:
$$\partial_\tau\partial_z\phi_i(z,\tau) = \sum_{j=0}^{l}(\alpha_i,\alpha_j)\exp(-\phi_j(z,\tau)),\qquad i = 1,\dots,l, \tag{2}$$
where $\alpha_0,\dots,\alpha_l$ are simple roots of $\tilde g$. Each function $\phi_i(z,\tau)$ depends on $z$ as well as the time variable $\tau$, and $\phi_0(z,\tau) = -\frac{1}{a_0}\sum_{i=1}^{l}a_i\phi_i(z,\tau)$, where $a_0,\dots,a_l$ are labels of the Dynkin diagram. In the case when $\tilde g = \widehat{sl}_2$, the system reduces to the sine-Gordon equation:
$$\partial_\tau\partial_z\phi(z,\tau) = \exp(\phi(z,\tau)) - \exp(-\phi(z,\tau)). \tag{3}$$
(n)
Let π0 be the ring of functions on U . We have π0 = C[ui ]1≤i≤l;0
i∈J
for some differential modules $\pi_{-\alpha_i} = \pi_0\otimes\exp(-\phi_i)$ equipped with derivations $\partial$ defined by $\partial(u\otimes\exp(-\phi_i)) = (\partial u)\otimes\exp(-\phi_i) - (u\,u_i^{(0)})\otimes\exp(-\phi_i)$. One of the results of Feigin and Frenkel shows that $\tilde Q_i = -T_i e_i^G$, where $e_i^G$ denotes the image under the Drinfeld–Sokolov correspondence of the left action by the vector field $e_i$ on $N_+/A_+$, and $T_i$ denotes the multiplication by $\exp(-\phi_i)$ sending $\pi_0$ to $\pi_{-\alpha_i}$. In the Hamiltonian formalism of Feigin and Frenkel, one defines $F_{-\alpha_i} := \pi_{-\alpha_i}/\operatorname{Im}\partial$. This is the space of functionals of the form
$$u \mapsto \int_{|z|=1}P\bigl(u(z),\partial_z u(z),\dots\bigr)\exp(-\phi_i(z))\,dz \tag{5}$$
with $u(z) = \bigl(u_1^{(0)},\dots,u_l^{(0)}\bigr) : S^1\to h$, where $S^1$ is the unit circle and $\phi_i$ is an anti-derivative of $u_i^{(0)}$. The space $F_0$ is the space of local functionals of the Toda system (one uses only derivatives of $u$). It was shown in [GDi2, GDo] that it may be equipped with the structure of a “vertex Poisson” algebra (this notion was first introduced in these papers, and developed in [BD, EFr2]). Such vertex Poisson structures are classical limits of families of VOA structures (as opposed to classical limits of associative algebra structures in the case of Poisson structures). It is possible to extend the Poisson bracket to bilinear maps
$$F_0\times F_{-\alpha_i}\to F_{-\alpha_i},\qquad (F,G)\mapsto\{F,G\} \tag{6}$$
satisfying the Jacobi identity on $F_0$. In other words, the $F_{-\alpha_i}$ are $F_0$-modules. By passing to the quotients, the morphisms $\tilde Q_i$ define screening operators $\bar Q_i : F_0\to F_{-\alpha_i}$. Feigin and Frenkel showed that $\bar Q_i = \{\,\cdot\,,\exp(-\phi_i)\}$, where $\exp(-\phi_i)$ inside the bracket stands for the projection of $\exp(-\phi_i)$ onto $F_{-\alpha_i}$. This projection is called the classical screening charge. Then, we define a Hamiltonian
$$H = \sum_{i\in J}\bar Q_i : F_0\to F_{-\alpha_i}. \tag{7}$$
Indeed, in this formalism, we can write the Toda equations as
$$\partial_\tau u(z) = \{u(z),H\}. \tag{8}$$
The integrals of motion of the Toda system are the local functionals which commute with all the screening charges. On the other hand, it was known [BMP] that the $\bar Q_i$ satisfy the Serre relations for $g$. This gives an action of the nilpotent Lie algebra $n_+$ on $\pi_0$ and allowed Feigin and Frenkel to interpret the space of integrals of motion as the first cohomology group of $n_+$ with coefficients in $\pi_0$. Using resolutions of BGG type, the authors managed to compute this cohomology, yet without giving explicit formulas for the integrals of motion (except for some particular cases). They showed that the space of integrals of motion forms a graded Abelian Lie subalgebra of $F_0$ (for the gradation defined by $\deg u_i^{(k)} = -k$ on $\pi_0$), generated by integrals of motion $I_m$, $m\in\mathbb{Z}_-$, with $\deg I_m = m$. This Abelian Lie subalgebra forms a maximal Abelian Lie subalgebra of the so-called W-algebra of $g$. Moreover, the Hamiltonian flow of $I_m$ corresponds to the mKdV hierarchy equations. Similarly, this Hamiltonian approach allowed them to show that the integrals of motion admit quantum deformations.
1.4. The quantum sine-Gordon system. The interpretation of the Poisson structure on $F_0$ in terms of Kirillov-Kostant structures allows the following quantum deformation ([FF3]).
1.4.1. The continuous quantum model. Consider the quantum Heisenberg algebra of $\widehat{sl}_2$, generated by $I$, $q_i$, $b_n^i$, $i\in\{0,1\}$, $n\in\mathbb{Z}$, satisfying the following relations:
$$[I,q_i] = [I,b_n^i] = 0,\qquad [q_i,b_n^j] = (\alpha_i,\alpha_j)\,\delta_{n,0}\,\beta^2 I,\qquad [b_n^i,b_m^j] = n\,(\alpha_i,\alpha_j)\,\delta_{n+m,0}\,\beta^2 I,$$
where $\delta_{i,j}$ denotes the Kronecker symbol, $\beta^2$ is a deformation parameter ($q = \exp(i\pi\beta^2)$), and $(\alpha_i,\alpha_j)$ denotes the scalar product of two roots $\alpha_i$ and $\alpha_j$ in $\widehat{sl}_2$. This algebra acts on the direct sum of Fock modules in such a way that the vertex operators
$$V_i(z) = {:}\exp\phi_i(z){:} = \exp\phi_i^-(z)\,\exp\phi_i^+(z)$$
satisfy the following commutation relations:
$$V_j(w)V_i(z) = \exp\bigl(i\pi\beta^2(\alpha_i,\alpha_j)\bigr)\,V_i(z)V_j(w) \tag{9}$$
in the domain $|z| > |w|$. Here, $\phi_i(z)$ is the free field
$$\phi_i(z) = -\sum_{n\ne0}\frac1n\,b_n^i z^{-n} - b_0^i\ln(z) - q_i,$$
and
$$\phi_i^+(z) = -\sum_{n>0}\frac1n\,b_n^i z^{-n} - b_0^i\ln(z),\qquad \phi_i^-(z) = -\sum_{n<0}\frac1n\,b_n^i z^{-n} - b_0^i\ln(z).$$
The screening charges are defined by
$$S_i = \int_{|z|=1}V_i(z)\,dz = \int_{|z|=1}{:}\exp\phi_i(z){:}\,dz \tag{10}$$
for $i\in\{0,1\}$. The integrals of motion of the system are expressions of the form
$$\int_{|z|=1}P\Bigl(\bigl(\partial_z^k\phi_i\bigr)_{k\ge0,\,i\in\{0,1\}}\Bigr)(z)\,dz$$
which commute with $S_0$ and $S_1$. Feigin and Frenkel showed that the screening charges $S_0$ and $S_1$ satisfy the quantum Serre relations of $\widehat{sl}_2$. This allowed them, as in the classical case, to interpret the space of quantum integrals of motion as the first cohomology group of a certain complex. They proved that the integrals of motion commute with each other and that the space of integrals of motion generates a quantum deformation of the classical W-algebra.
1.4.2. The discrete quantum sine-Gordon model. The first model of a discrete integrable system was introduced by Izergin and Korepin in 1982 for the sine-Gordon system in order to resolve ultraviolet divergence problems occurring in the continuous theory. The q-commutation relations between vertex operators naturally lead to the aforementioned discretization, which is the one adopted by Izergin and Korepin. Set $q = \exp(i\pi\beta^2)$, and replace the complex numbers $z$ by integers $k\in\mathbb{Z}$. The vertex operators $V_0(z)$ and $V_1(z)$ are replaced by variables $y_k$ and $x_k$ satisfying the relations:
$$\forall k<l,\qquad x_kx_l = q\,x_lx_k, \tag{11}$$
$$\phantom{\forall k<l,}\qquad y_ky_l = q\,y_ly_k, \tag{12}$$
$$\phantom{\forall k<l,}\qquad y_kx_l = q^{-1}x_ly_k, \tag{13}$$
$$\forall k\le l,\qquad x_ky_l = q^{-1}y_lx_k, \tag{14}$$
coming from (9). The analogues of screening charges are $S_0$ and $S_1$ defined by $S_0 = \sum_{k=-\infty}^{+\infty}y_k$ and $S_1 = \sum_{k=-\infty}^{+\infty}x_k$ in a certain completion of the algebra of variables on the given lattice. The Hamiltonian of the system is $H = S_0 + S_1$. The integrals of motion correspond to expressions of the form
$$\sum_{i=-\infty}^{+\infty}P\bigl(x_{i+1}^{\pm1},y_{i+1}^{\pm1},\dots,x_{i+k}^{\pm1},y_{i+k}^{\pm1}\bigr)$$
which commute with $S_0$ and $S_1$, where $P\bigl(X_1^{\pm1},Y_1^{\pm1},\dots,X_k^{\pm1},Y_k^{\pm1}\bigr)$ denotes a polynomial in the variables $X_1^{\pm1},\dots,Y_k^{\pm1}$, these variables being q-commutative. As in the continuous case, Enriquez and Feigin showed that the screening charges satisfy the Serre relations (for $\widehat{sl}_2$), which gives a cohomological interpretation for the integrals of motion. By means of Demazure desingularization, they managed to compute this cohomology in the classical limit $q\to1$, to give formulae for densities of integrals of motion, and to prove that the integrals of motion are in involution. This justifies calling the system a discrete integrable system. Moreover, Enriquez and Feigin identified the phase space of this system with the homogeneous space $H_-\backslash B_-$, where $B_-$ is a Borel subgroup of the loop group of $SL_2$, and $H_-$ is a subgroup of $B_-$ consisting of diagonal matrices, and established a discrete version of the Drinfeld–Sokolov correspondence. The Hamiltonian action by integrals of motion corresponds to a (left) action of a commutative Lie algebra $h_+$ on the homogeneous space, which is embedded into $H_-\backslash G/N_+$. Moreover, the correspondence is a Poisson morphism.
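Relations (11)–(14) can be checked mechanically on monomials. The following is a minimal sketch, not part of the original construction: it assumes the reading of (11)–(14) in which the generators sit on one ordered lattice ... < x_k < y_k < x_{k+1} < ..., two generators of the same kind q-commute with exponent +1 and two of different kinds with exponent -1. The helper names `position`, `gen_exponent` and `monomial_exponent` are hypothetical.

```python
def position(gen):
    """Order the generators on one lattice: ... < x_k < y_k < x_{k+1} < ..."""
    kind, k = gen
    return 2 * k + (1 if kind == "y" else 0)


def gen_exponent(a, b):
    """Exponent w with a*b = q**w * b*a for two generators, read off (11)-(14)."""
    if a == b:
        return 0
    # same kind: (11), (12) give +1; different kinds: (13), (14) give -1
    w = 1 if a[0] == b[0] else -1
    return w if position(a) < position(b) else -w


def monomial_exponent(m1, m2):
    """Exponent w with m1*m2 = q**w * m2*m1 for Laurent monomials.

    A monomial is a dict {generator: integer exponent}; scalar prefactors
    are ignored because they do not affect the commutation exponent.
    """
    return sum(
        e1 * e2 * gen_exponent(a, b)
        for a, e1 in m1.items()
        for b, e2 in m2.items()
    )


x1y1 = {("x", 1): 1, ("y", 1): 1}
y1x2 = {("y", 1): 1, ("x", 2): 1}
x2y2 = {("x", 2): 1, ("y", 2): 1}

print(monomial_exponent(x1y1, y1x2))  # neighbouring products q-commute
print(monomial_exponent(x1y1, x2y2))  # products two steps apart commute
```

Since the exponent is bilinear, the inverses $(x_1y_1)^{-1}$ and $(y_1x_2)^{-1}$ have the same exponent as the products themselves, which agrees with the relation $e_ie_{i+1} = q^{-1}e_{i+1}e_i$ used in Sect. 3.2.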
2. Main Results In this section, we present our main results which deal with the discrete sine-Gordon theory. Proposition 3 is proved in [G1]. In Proposition 5, we construct a quantization of the Poisson homogeneous space considered by Enriquez and Feigin [EFe]. Theorem 1 is a discrete and quantum version of the Drinfeld–Sokolov correspondence. It generalizes a theorem of Enriquez and Feigin to the quantum case.
2.1. Some basic definitions. The following notation will be used throughout the article. Let $q$ be a formal variable.
The quantum phase space of the discrete sine-Gordon system. Let $A_q$ be the algebra generated over $\mathbb{Q}[q,q^{-1}]$ by the non-commutative variables $x_i^{\pm1}$ and $y_i^{\pm1}$, $i\in\mathbb{Z}$, subject to relations (11)–(14). This is the quantum phase space of our system. It can be shown that a basis for $A_q$ is given by the family $\prod_{i=-\infty}^{+\infty}x_i^{\alpha_i}y_i^{\beta_i}$, where $(\alpha_i)$ and $(\beta_i)$ are two sequences in $\mathbb{Z}^{\mathbb{Z}}$ which have only a finite number of non-zero elements. At the semiclassical limit, $A_q$ defines a Poisson structure on $A_{cl} = \mathbb{Q}[x_i^{\pm1},y_i^{\pm1}, i\in\mathbb{Z}]$. There is a gradation $\deg$ on $A_q$ defined by $\deg(x_i) = -\deg(y_i) = 1$ for all $i\in\mathbb{Z}$. Let $T^{1/2}$ denote the half-translation antigraded automorphism defined on $A_q$ by $T^{1/2}(x_i) = y_i$ and $T^{1/2}(y_i) = x_{i+1}$ for all $i\in\mathbb{Z}$. We shall set $T = (T^{1/2})^2$. For $n\in\mathbb{Z}$, let $A_q[n]$ be the submodule of $A_q$ of all elements of degree $n$. The discrete analogue of $\pi_0$ is $A_q[0]$.
The functional spaces and integrals of motion. By definition, the functional spaces are $F_n$, $n\in\mathbb{N}$, defined by $F_n = A_q[n]/\operatorname{Im}(T-\mathrm{Id})$. If $P\in A_q[n]$, we denote by $I(P)$ its class in $F_n$. The discrete analogues of the screening charges $S_0$ and $S_1$ seen in (10) are $\Sigma_+ = I(x_0)\in F_1$ and $\Sigma_- = I(y_0)\in F_{-1}$. The Hamiltonian of the system is $H = \Sigma_- + \Sigma_+\in F_{-1}\oplus F_1$. We set $\mathcal{I} = \operatorname{Ker}[\,\cdot\,,\Sigma_-]\cap\operatorname{Ker}[\,\cdot\,,\Sigma_+]\subset F_0$. It is the space of all local integrals of motion of the system. If $P\in A_q[0]$ and if $I(P)\in\mathcal{I}$, we say that $P$ is a density of an integral of motion.
The homogeneous space of Enriquez and Feigin. We set $G = SL_2\bigl(\mathbb{C}((\lambda^{-1}))\bigr)$, $B_- = \pi^{-1}(\bar B_-)$, and $N_+ = p^{-1}(\bar N_+)$, where $\bar B_-$ (resp. $\bar N_+$) is the subgroup of $SL_2(\mathbb{C})$ defined by all lower triangular (resp. upper triangular unipotent) matrices, and $\pi$ (resp. $p$) is the induced map from $SL_2\bigl(\mathbb{C}[[\lambda^{-1}]]\bigr)$ (resp. $SL_2\bigl(\mathbb{C}[\lambda]\bigr)$) to $SL_2(\mathbb{C})$ obtained by sending $\lambda^{-1}$ (resp. $\lambda$) to 0. We also denote by $H_-$ the subgroup of $B_-$ given by all the diagonal matrices of the form $\operatorname{diag}(a,a^{-1})$, $a\in\mathbb{C}[[\lambda^{-1}]]^*$. The Poisson homogeneous space considered by Enriquez and Feigin is $H_-\backslash B_-$ endowed with the Poisson structure induced by the Poisson bivector $P_\infty = r^L - r_\infty^R$ on $G$ such that the map $H_-\backslash B_-\hookrightarrow H_-\backslash G/N_+$ is a Poisson morphism. Here, $r$ is the standard r-matrix on $G$, $r_\infty$ corresponds to the conjugation of $r$ with an element of the Weyl group with an infinite length (it is the r-matrix associated with the “new realizations” of Drinfeld) and $r^L$ (resp. $r_\infty^R$) is the left (resp. right) translation of $r$ (resp. $r_\infty$) on $G$. The homogeneous space $H_-\backslash B_-$ is endowed with an action from the left by the Abelian Lie algebra $h_+ := \{\operatorname{diag}(a,-a),\ a\in\lambda\mathbb{C}[\lambda]\}$. Enriquez and Feigin identified this action with the Hamiltonian action of the integrals of motion on $A_{cl}$. Theorem 1 generalizes this result.
2.2. The results. The first few results deal with the integrals of motion of the discrete sine-Gordon system.
Proposition 1. Let $n$ be an integer.
For any $F\in F_0$, $G\in F_n$, $P\in I^{-1}(F)\subset A_q[0]$ and $Q\in I^{-1}(G)\subset A_q[n]$, the two sums $\sum_{i=-\infty}^{+\infty}[T^iP,Q]$ and $\sum_{i=-\infty}^{+\infty}[P,T^iQ]$ are equal and depend neither on $P$ nor on $Q$ but only on $F$ and $G$. Moreover, the bilinear map
$$F_0\times F_n\to F_n,\qquad (F,G)\mapsto[F,G] \tag{15}$$
with $F = I(P)$, $G = I(Q)$, and
$$[F,G] := \frac{1}{q-1}\sum_{i=-\infty}^{+\infty}[T^iP,Q] \tag{16}$$
$$\phantom{[F,G] :}= \frac{1}{q-1}\sum_{i=-\infty}^{+\infty}[P,T^iQ] \tag{17}$$
satisfies the Jacobi identity on $F_0$. In other words, the space $F_0$ of all local functionals is a Lie algebra, and $F_n$ is an $F_0$-module.
The following proposition shows that the local functionals act on $A_q$ by adjoint action.
Proposition 2. For any $F\in F_0$ and $x\in A_q$, the sum $\sum_{k=-\infty}^{\infty}[T^kP,x]$ does not depend on the representative $P\in I^{-1}(F)\subset A_q[0]$ chosen for $F$ and defines a derivation on $A_q$ which commutes with the automorphism $T$. This derivation is the adjoint action of $F$ on $A_q$ and is denoted by $\operatorname{ad}(F)$. Moreover, if $\operatorname{Der}_T(A_q)$ denotes the Lie algebra of all derivations on $A_q$ which commute with $T$, we have a Lie algebra morphism
$$\operatorname{ad} : F_0\to\operatorname{Der}_T(A_q),\qquad F\mapsto\frac{1}{q-1}\sum_{k=-\infty}^{\infty}[T^kP,\,\cdot\,] \tag{18}$$
and the kernel of this morphism is equal to the class of constants in $F_0$.
Proposition 3 gives an explicit basis for the space $\mathcal{I}$ of local integrals of motion. This should be compared with the formulas involving quantum trace identities of Izergin and Korepin [IK2] (see also [FTT]) at least in the classical case (indeed, in the quantum case, the Izergin and Korepin formulas for the sine-Gordon system are no longer local) and with Hikami's formulas [H] which give an expression of the local integrals of motion by means of the fundamental L-operators (see also [V]).
Proposition 3. The local classical integrals of motion of the discrete sine-Gordon system can be deformed in the quantum case. A basis for $\mathcal{I}$ is given by the family $I_n = I(\psi_n)$, $n>0$. The generating function of the densities of integrals of motion $\psi_n$ is given by
$$\ln_q U + \ln_q V = \sum_{p=1}^{\infty}\frac{1}{[p]}\,\psi_p\,\lambda^{-p}, \tag{19}$$
where for all integers $p$, $[p] := \dfrac{q^p-1}{q-1}$, $U = \lim_{N\to\infty}U_N$, $V = \lim_{N\to\infty}V_N$,
$$U_N := \cfrac{1}{1-\cfrac{(\lambda x_1y_1)^{-1}}{1-\cfrac{(\lambda y_1x_2)^{-1}}{\;\ddots\;-\cfrac{(\lambda x_{N-1}y_{N-1})^{-1}}{1-(\lambda y_{N-1}x_N)^{-1}}}}}, \tag{20}$$
$$V_N := T^{1/2}U_N, \tag{21}$$
the limits being taken in the sense of the $\lambda^{-1}$-topology. By convention, we have set $\dfrac{a}{b} := ab^{-1}$, and for all power series $f$ in $\lambda^{-1}$ with non-zero constant term,
$$\ln_q f := \sum_{p=1}^{\infty}\frac{1}{[p]}\,(1-f^{-1})^p.$$
The following proposition is proved in Sect. 3.
Proposition 4. The space $\mathcal{I}$ is a commutative Lie subalgebra in $F_0$.
Proposition 5 gives a quantization of the homogeneous space considered by Enriquez and Feigin.
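Proposition 5. The algebra given by generators $u_i$, $m_i$, $i>0$, and (quadratic) relations
$$\bigl(\lambda^{-1}u(\lambda)-\mu^{-1}u(\mu)\bigr)\bigl(u(\lambda)-u(\mu)\bigr) = q\,\bigl(u(\lambda)-u(\mu)\bigr)\bigl(\lambda^{-1}u(\lambda)-\mu^{-1}u(\mu)\bigr), \tag{22}$$
$$\bigl(\lambda^{-1}m(\lambda)-\mu^{-1}m(\mu)\bigr)\bigl(m(\lambda)-m(\mu)\bigr) = q\,\bigl(m(\lambda)-m(\mu)\bigr)\bigl(\lambda^{-1}m(\lambda)-\mu^{-1}m(\mu)\bigr), \tag{23}$$
$$u(\lambda)m(\mu) = q^{-1}m(\mu)u(\lambda), \tag{24}$$
with $u(\lambda) = \sum_{i=0}^{\infty}(-1)^iu_{i+1}\lambda^{-i}$ and $m(\lambda) = \sum_{i=0}^{\infty}(-1)^im_{i+1}\lambda^{-i}$, is a quantum deformation of the algebra of functions over the Poisson homogeneous space $(H_-\backslash B_-, P_\infty)$.
The following proposition shows that the action by vector fields of the Abelian Lie subalgebra $h_+$ on $H_-\backslash B_-$ can also be quantized.
Proposition 6. There is a commutative family $(H_n)_{n\in\mathbb{N}^*}$ of derivations on $\mathbb{C}[H_-\backslash B_-]_q$ defined by the formulas
$$H(\mu)\bigl(u(\lambda)\bigr) = \frac{1}{\lambda^{-1}-\mu^{-1}}\bigl(\lambda^{-1}u(\lambda)-\mu^{-1}u(\mu)\bigr)\,v(\mu)u(\lambda) - \frac{\mu^{-1}}{\lambda^{-1}-\mu^{-1}}\bigl(u(\lambda)-u(\mu)\bigr)\bigl(1+v(\mu)u(\mu)\bigr); \tag{25}$$
$$H(\mu)\bigl(m(\lambda)\bigr) = \frac{\mu^{-1}}{\lambda^{-1}-\mu^{-1}}\bigl(1+m(\mu)w(\mu)\bigr)\bigl(m(\lambda)-m(\mu)\bigr) - \frac{1}{\lambda^{-1}-\mu^{-1}}\,m(\lambda)w(\mu)\bigl(\lambda^{-1}m(\lambda)-\mu^{-1}m(\mu)\bigr) \tag{26}$$
with $H(\mu) = \sum_{k=1}^{\infty}(-1)^kH_k\mu^{-k}$, $v(\mu) = -\bigl(u(\mu)+\mu\,m(\mu)^{-1}\bigr)^{-1}$ and $w(\mu) = -\bigl(m(\mu)+\mu\,u(\mu)^{-1}\bigr)^{-1}$. This family of derivations deforms the classical action by vector fields of $h_+$ on $H_-\backslash B_-$.
The following theorem is a quantum version of the Drinfeld–Sokolov correspondence. It shows that the quantization of the action of $h_+$ on $H_-\backslash B_-$ considered in Proposition 6 can be identified with the adjoint action of the integrals of motion on the phase space $A_q$.
Theorem 1. There is an injective and birational map $DS_q$ from $\mathbb{C}[H_-\backslash B_-]_q$ to $A_q$ defined by
$$DS_q\bigl(u(\lambda)\bigr) = \lim_{N\to\infty}\cfrac{y_0^{-1}}{1+\cfrac{(\lambda x_0y_0)^{-1}}{1+\cfrac{(\lambda y_{-1}x_0)^{-1}}{\;\ddots\;+\cfrac{(\lambda y_{-N+1}x_{-N+2})^{-1}}{1+(\lambda x_{-N+1}y_{-N+1})^{-1}}}}} \tag{27}$$
and
$$DS_q\bigl(m(\lambda)\bigr) = \lim_{N\to\infty}\cfrac{x_1^{-1}}{1+\cfrac{(\lambda x_1y_1)^{-1}}{1+\cfrac{(\lambda y_1x_2)^{-1}}{\;\ddots\;+\cfrac{(\lambda y_{N-1}x_N)^{-1}}{1+(\lambda x_Ny_N)^{-1}}}}}, \tag{28}$$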
the limit being taken with respect to the $\lambda^{-1}$-adic topology, and where, in the first case, $\dfrac{a}{b} := b^{-1}a$, and in the second case, $\dfrac{a}{b} := ab^{-1}$. We have $DS_q\circ H_n = \operatorname{ad}(I_n)\circ DS_q$ for all integers $n$. The algebras $\mathbb{C}[H_-\backslash B_-]_q$ and $A_q$ have fraction fields, and the word “birational” means that $DS_q$ induces an isomorphism of these fraction fields.
3. Commutativity of the Local Integrals of Motion of the Discrete Sine-Gordon System
We give here a (new) proof of the commutativity of the quantum local integrals of motion [IK2] (see also [FTT, V, H]). This constitutes Proposition 4. Our proof is based on the explicit form taken by the elements $I_n$ of the basis of $\mathcal{I}$ given in Proposition 3. First, we note that, as a consequence of Proposition 1, $\mathcal{I}$ is a Lie subalgebra of $F_0$. We shall use Proposition 7 given below, which is proved in [G1] and is equivalent to Proposition 3. If $a$ and $b$ are two integers, we set
$$\begin{bmatrix}a\\ b\end{bmatrix} = \begin{cases}1, & \text{if } b = 0;\\[2pt] 0, & \text{if } b<0 \text{ or } b>\operatorname{Max}(0,a);\\[2pt] \dfrac{[a]!}{[b]!\,[a-b]!}, & \text{if } 0\le b\le a;\end{cases}$$
with $[n]! = \prod_{i=1}^{n}[i]$ and $[i] = \dfrac{q^i-1}{q-1}$. Also, for any integers $N, a_1,\dots,a_N$, we set
$$F_q(a_1,\dots,a_N) = \prod_{i=1}^{N-1}\begin{bmatrix}a_i+a_{i+1}-1\\ a_{i+1}\end{bmatrix}.$$
Proposition 7. A basis for $\mathcal{I}$ is given by the family $I_n = I(\psi_n)$, with $\psi_n = A_n + B_n$, $B_n = T^{1/2}A_n$, and
$$A_n = \sum\frac{[n]}{[\alpha_1]}\,F_q(\alpha_1,\dots,\alpha_{2N-2})\,(x_1y_1)^{-\alpha_1}(y_1x_2)^{-\alpha_2}\cdots(y_{N-1}x_N)^{-\alpha_{2N-2}}, \tag{29}$$
the sum being taken over the set of indices $\alpha_1,\dots,\alpha_{2N-2}$ such that $\alpha_i\in\mathbb{N}$, $\alpha_1\in\mathbb{N}^*$, and $\alpha_1+\dots+\alpha_{2N-2} = n$. Here, $N$ is any integer such that $n\le 2(N-1)$.
The proof of the commutativity of $\mathcal{I}$ proceeds in several steps. The key step is the existence of a filtration on a Lie subalgebra of $F_0$ which contains $\mathcal{I}$.
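The coefficients in (29) are easy to tabulate numerically. The sketch below is an illustration only: it evaluates the q-integers, the q-binomial with the case conventions stated above, and the product $F_q$ at a numeric value of $q$ using exact rational arithmetic. The function names are hypothetical, and the product in `F_q` is taken over consecutive pairs of its arguments, which is an assumption about the upper limit of the product in the definition.

```python
from fractions import Fraction
from itertools import product


def q_int(n, q):
    """q-integer [n] = (q**n - 1)/(q - 1) = 1 + q + ... + q**(n-1), for n >= 0."""
    return sum(Fraction(q) ** i for i in range(n))


def q_factorial(n, q):
    """[n]! = [1][2]...[n], with the empty product equal to 1."""
    result = Fraction(1)
    for i in range(1, n + 1):
        result *= q_int(i, q)
    return result


def q_binomial(a, b, q):
    """q-binomial with the conventions of the text (b = 0 gives 1 even if a < 0)."""
    if b == 0:
        return Fraction(1)
    if b < 0 or b > max(0, a):
        return Fraction(0)
    return q_factorial(a, q) / (q_factorial(b, q) * q_factorial(a - b, q))


def F_q(alphas, q):
    """Product of q-binomials over consecutive pairs (a_i, a_{i+1})."""
    result = Fraction(1)
    for a, a_next in zip(alphas, alphas[1:]):
        result *= q_binomial(a + a_next - 1, a_next, q)
    return result


def A_coefficients(n, length, q):
    """Nonzero coefficients of A_n, keyed by the exponent tuple (alpha_1, ...)."""
    coeffs = {}
    for alphas in product(range(n + 1), repeat=length):
        if alphas[0] >= 1 and sum(alphas) == n:
            c = q_int(n, q) / q_int(alphas[0], q) * F_q(alphas, q)
            if c != 0:
                coeffs[alphas] = c
    return coeffs


# n = 2 needs N = 2, i.e. 2N - 2 = 2 exponents; evaluated here at q = 2.
print(A_coefficients(2, 2, 2))
```

For $n = 2$ this returns the coefficient 1 for the exponents $(2,0)$ and $[2] = 1+q$ for $(1,1)$, i.e. $A_2 = (x_1y_1)^{-2} + [2]\,(x_1y_1)^{-1}(y_1x_2)^{-1}$, which matches the $\lambda^{-2}$ coefficient obtained by expanding $\ln_q U$ directly from (19) and (20).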
3.1. Gradation on F0 . We denote degp the principal gradation on Aq defined by degp (xi ) = degp (yi ) = 1 for all i ∈ Z. Also, we set e2i−1 = (xi yi )−1 and e2i = (yi xi+1 )−1 . The elements ej±1 , j ∈ Z generate Aq [0]. So, any element u ∈ F0 can be represented by a sum of terms Pk , k ∈ Z with degp (Pk ) = 2k ∈ 2Z. Therefore, the gradation degp on Aq induces a gradation deg on the module F0 in the following way: an element u of F0 is said to be homogeneous of degree k if there exists P ∈ Aq [0] such that I (P ) = u and degp (P ) = 2k. The formula (16) shows that (I, deg) is a Lie subalgebra of F0 generated by In = I (ψn ), n ∈ N∗ with deg(In ) = −n. 3.2. The subalgebra Bq [0] of Aq [0]. Let Bq [0] be the subalgebra (without unit) of Aq [0] generated by the ei , i ∈ Z. For all i, j ∈ Z, we have ei ei+1 = q −1 e i+1 ei and ei ej = ej ei if |i − j | ≥ 2. A basis for Bq [0] is given by the family εα = i∈Z eiαi , where α is a non-zero sequence in NZ such that almost all elements are zeros (except for a finite number of them). We define a function l on Bq [0] in the following way. If P = α λα εα is a non-zero element in Bq [0] we set l(P ) := Inf{l(α)/ λα = 0} with l(α) := card{i ∈ Z/ αi = 0}. We have l(P ) ∈ N∗ . By convention, we consider that l(0) = +∞. Obviously, Bq [0] is invariant by the translation automorphism T defined in 2.1, and l ◦ T = l. Moreover,if P and Q ∈ Bq [0], then l(P Q) ≥ Max l(P ), l(Q) and l(P + Q) ≥ Inf l(P ), l(Q) with equality in the last inequality if l(P ) = l(Q). 3.3. The Lie subalgebra F0 of F0 . We note F0 the quotient module Bq [0]/ Im(T − I d). A basis for F0 is given by the elements I (εα ), where α is an almost zero sequence satisfying the property αi = 0 if i ≤ 0, and α1 = 0 or α2 = 0. The module F0 is a Lie subalgebra by taking the formula (16) as a definition. Furthermore, we have a natural injective map of Lie algebras F0 8→ F0 such that the following diagram is commutative: Bq [0] 8→ Aq [0] ↓ ↓ F0 8→ F0 . 
The first horizontal map is the canonical injection of $B_q[0]$ into $A_q[0]$. The vertical maps are the canonical projections. Note that $F_0'$ is a graded Lie subalgebra of $F_0$ with respect to the gradation $\deg$. Also, according to Proposition 7, $F_0'$ contains $\mathcal{I}$.

3.4. Filtration on $F_0'$. Let us start with the following lemma.

Lemma 1. Let $u \in F_0'$ with $u \neq 0$. Then $L_u := \{l(P) \mid P \in B_q[0] \text{ and } I(P) = u\}$ is a bounded non-empty set in $\mathbb{N}^*$.

Proof. Obviously, $L_u \neq \emptyset$. Consider $Q \in B_q[0]$ such that $I(Q) = u$ and $Q$ is generated by the $e_i$, $n_1 \leq i \leq n_2$. Let $V_1$ (resp. $V_2$) be the submodule generated by the monomials $\varepsilon_\alpha$ with $l(\alpha) \leq n_2 - n_1 + 1$ (resp. $l(\alpha) > n_2 - n_1 + 1$). We have $B_q[0] = V_1 \oplus V_2$, $Q \in V_1$, and $V_i$ ($i = 1, 2$) is invariant under $T$. Assume that there exists $P \in B_q[0]$ such that $I(P) = u$ and $l(P) > n_2 - n_1 + 1$. Then, there exists $R$ such that $P = Q + T(R) - R$. We set $R = R_1 + R_2$ with $R_i \in V_i$ ($i = 1, 2$). By projecting on $V_2$, we obtain $P = T(R_2) - R_2$. So, $u = 0$. But this contradicts our hypothesis. □
C. Grunspan
Lemma 1 allows us to define a length function on $F_0'$.

Definition. Let $u \in F_0'$ with $u \neq 0$. We set $l(u) = \max L_u$. By convention, $l(0) = +\infty$.

Thanks to Sect. 3.3, the following lemma allows us to compute lengths in $F_0'$ explicitly.

Lemma 2. Let $x \in B_q[0]$ be a non-zero element such that $x$ is a linear combination of monomials of the form $\varepsilon_\alpha$, with $\alpha_i = 0$ if $i \leq 0$, and $\alpha_1 \neq 0$ or $\alpha_2 \neq 0$. Then, $l\big(I(x)\big) = l(x)$.

Proof. Set $l(x) = k$. Let $V_1$ (resp. $V_2$) denote the submodule of $B_q[0]$ generated by all monomials $\varepsilon_\alpha$ of length $k$ (resp. of length different from $k$). We proceed as in the proof of Lemma 1. □

We are now able to show that $F_0'$ is a filtered Lie algebra.

Lemma 3. Let $u$ and $v$ be two elements of $F_0'$. Then, $l\big([u, v]\big) \geq \max\big(l(u), l(v)\big)$.

Proof. Set $j = l(u)$, $k = l(v)$ and $n = \max(j, k)$. There exist $P$ and $Q$ in $B_q[0]$ such that $I(P) = u$, $I(Q) = v$, $l(P) = j$ and $l(Q) = k$. For all $\alpha \in \mathbb{Z}$, we have $l\big(T^\alpha(P)\big) = l(P)$. So, $l\big(T^\alpha(P)\,Q\big) \geq n$ and $l\big(Q\,T^\alpha(P)\big) \geq n$. Hence, $l\big([T^\alpha(P), Q]\big) \geq n$. So, $l\big(\sum_{\alpha=-\infty}^{\infty} [T^\alpha(P), Q]\big) \geq n$ and $l\big([u, v]\big) \geq n$. □

3.5. End of the proof. For $n \in \mathbb{N}^*$, we set $u_n = e_1^n$ and $v_n = e_2^n$. Thanks to Proposition 7, there exists $w_n \in B_q[0]$ such that $l(w_n) \geq 2$ and $\psi_n = u_n + v_n + w_n$. Clearly we have $[I(u_n), I(u_p)] = [I(v_n), I(v_p)] = 0$ for all $n, p \in \mathbb{N}$. On the other hand, by using Lemma 2, the computation shows that $l\big([I(u_n), I(v_p)]\big) = 2$. So, we deduce from Lemma 3 and the bilinearity of the Lie bracket that $l\big([I_n, I_p]\big) \geq 2$ for all $n, p \in \mathbb{N}$. But, for degree reasons, $[I_n, I_p]$ is proportional to $I_{n+p}$ and $l(I_{n+p}) = 1$. Hence, $[I_n, I_p] = 0$.

4. Quantization of the Poisson Homogeneous Space $(H_-\backslash B_-, P_\infty)$

The aim of this section is to prove Proposition 5. As shown in Subsect. 4.1, at the semiclassical level, the generators $u_i$ and $m_i$, $i > 0$, give a natural system of coordinate functions on $H_-\backslash B_-$ which satisfy the Poisson relations obtained by taking the limit $q \to 1$ in (22), (23), (24).
Therefore, to prove Proposition 5, it is enough to show that $\mathbb{C}[H_-\backslash B_-]_q$ is a flat deformation of $\mathbb{C}[H_-\backslash B_-]$. The idea is to obtain a realization of the algebra $\mathbb{C}[H_-\backslash B_-]_q$ in $A_q$ by using Lemma 4, which asserts that the finite screening charges of the discrete sine-Gordon system satisfy the quantum Serre relations. In all the following, the ground ring is no longer $\mathbb{Q}[q, q^{-1}]$ but $\mathbb{C}[q, q^{-1}]$.

4.1. The Poisson homogeneous space $(H_-\backslash B_-, P_\infty)$. The Poisson manifold $(H_-\backslash B_-, P_\infty)$ was defined in 2.1. Any element $\bar{x} \in H_-\backslash B_-$ can be expressed uniquely in the following form:
$$\bar{x} = \mathrm{cl}_{H_-}\!\left( \begin{pmatrix} 1 & v_{cl}(\lambda)(\bar{x}) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ u_{cl}(\lambda)(\bar{x}) & 1 \end{pmatrix} \right)$$
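with $u_{cl}(\lambda) \in \mathbb{C}[H_-\backslash B_-][[\lambda^{-1}]]$ and $v_{cl}(\lambda) \in \lambda^{-1}\mathbb{C}[H_-\backslash B_-][[\lambda^{-1}]]$, and where $\mathrm{cl}_{H_-}(y)$ denotes the left class of an element $y$ in $B_-$. For $i \in \mathbb{N}^*$, we define coordinate functions $u_{i,cl}$ and $m_{i,cl}$ on $H_-\backslash B_-$ by:
$$u_{cl}(\lambda) = \sum_{i=0}^{\infty} (-1)^i u_{i+1,cl}\, \lambda^{-i}, \tag{30}$$
$$m_{cl}(\lambda) = \sum_{i=0}^{\infty} (-1)^i m_{i+1,cl}\, \lambda^{-i}, \tag{31}$$
and
$$m_{cl}(\lambda) := -\lambda\, v_{cl}(\lambda) \big(1 + u_{cl}(\lambda) v_{cl}(\lambda)\big)^{-1} \in \mathbb{C}[H_-\backslash B_-][[\lambda^{-1}]]. \tag{32}$$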
The functions $u_{i,cl}$ and $m_{i,cl}$ are algebraically independent and $\mathbb{C}[H_-\backslash B_-] = \mathbb{C}[u_{i,cl}, m_{i,cl}, i > 0]$. Moreover, computation shows that the Poisson relations between these functions (the Poisson structure is induced by the field of bivectors $P_\infty$) are precisely the ones we get from (22), (23), (24) when $q \to 1$.

4.2. The Enriquez–Feigin morphism. Let $\widehat{\mathfrak{n}}_-$ be a nilpotent subalgebra of $\widehat{\mathfrak{sl}}_2$ and let $U_q\widehat{\mathfrak{n}}_-$ be the quantum algebra generated by the generators $f_+$ and $f_-$ subject to the quantum Serre relations:
$$f_\pm^3 f_\mp - (q + 1 + q^{-1})\big(f_\pm^2 f_\mp f_\pm - f_\pm f_\mp f_\pm^2\big) - f_\mp f_\pm^3 = 0. \tag{33}$$
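Let $\deg$ be the gradation on $U_q\widehat{\mathfrak{n}}_-$ defined by $\deg f_\pm = \pm 1$. In the remainder of this article, if $(A, \deg_A)$ and $(B, \deg_B)$ are two graded algebras, we define the twisted tensor product $A \bar{\otimes} B$ by the formula:
$$(a_1 \bar{\otimes} b_1)(a_2 \bar{\otimes} b_2) = q^{-\deg_A(a_2)\deg_B(b_1)}\, (a_1 a_2 \bar{\otimes} b_1 b_2)$$
for homogeneous elements $a_1, a_2, b_1, b_2$ in $A$ and $B$. There is a unique graded algebra morphism $\bar{\Delta}$ from $U_q\widehat{\mathfrak{n}}_-$ to $U_q\widehat{\mathfrak{n}}_- \bar{\otimes}\, U_q\widehat{\mathfrak{n}}_-$ given by $\bar{\Delta}(f_\pm) = f_\pm \bar{\otimes} 1 + 1 \bar{\otimes} f_\pm$. This morphism is called the twisted comultiplication on $U_q\widehat{\mathfrak{n}}_-$.

Lemma 4 ([EFe]). Let $n \in \mathbb{N}$. Then $\Sigma_{+,n} := \sum_{i=1}^{n} x_i$ and $\Sigma_{-,n} := \sum_{i=1}^{n} y_i$ satisfy the quantum Serre relations (for $\widehat{\mathfrak{sl}}_2$). In other words, there exists a graded algebra morphism $f_n$ defined by
$$f_n : U_q\widehat{\mathfrak{n}}_- \longrightarrow A_q, \qquad f_\pm \longmapsto \Sigma_{\pm,n}. \tag{34}$$

Proof. For $1 \leq i \leq n$, we define two graded algebra morphisms $\varphi_i$ and $\psi_i$ by:
$$\varphi_i : U_q\widehat{\mathfrak{n}}_- \longrightarrow \mathbb{C}[x_i, x_i^{-1}], \qquad f_+ \longmapsto x_i, \quad f_- \longmapsto 0,$$
and
$$\psi_i : U_q\widehat{\mathfrak{n}}_- \longrightarrow \mathbb{C}[y_i, y_i^{-1}], \qquad f_+ \longmapsto 0, \quad f_- \longmapsto y_i,$$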
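with the convention $\deg x_i = -\deg y_i = 1$. Then, it appears that the map $f_n$ defined in (34) is equal to $\mathrm{mult}_{2n} \circ (\varphi_1 \bar{\otimes} \psi_1 \bar{\otimes} \cdots \bar{\otimes} \varphi_n \bar{\otimes} \psi_n) \circ \bar{\Delta}^{(2n)}$, where $\mathrm{mult}_{2n}$ is the algebra monomorphism
$$\mathrm{mult}_{2n} : \mathbb{C}[x_1, x_1^{-1}] \bar{\otimes}\, \mathbb{C}[y_1, y_1^{-1}] \bar{\otimes} \cdots \bar{\otimes}\, \mathbb{C}[x_n, x_n^{-1}] \bar{\otimes}\, \mathbb{C}[y_n, y_n^{-1}] \longrightarrow A_q, \qquad u_1 \bar{\otimes} v_1 \bar{\otimes} \cdots \bar{\otimes} u_n \bar{\otimes} v_n \longmapsto u_1 v_1 \cdots u_n v_n.$$
□

In order to realize $\mathbb{C}[H_-\backslash B_-]_q$ in $A_q$, it is useful to give another expression of the quantum algebra $U_q\widehat{\mathfrak{n}}_-$.

Definition. Let $q^{1/4}$ be an indeterminate, $q^{1/2} = (q^{1/4})^2$, $q = (q^{1/2})^2$, $K_0 = \mathbb{C}[q^{1/4}, q^{-1/4}]$, $K = \mathbb{C}[q, q^{-1}]$,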
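$$H = \begin{pmatrix} q^{-1/4} & 0 & 0 & 0 \\ 0 & q^{1/4} & 0 & 0 \\ 0 & 0 & q^{1/4} & 0 \\ 0 & 0 & 0 & q^{-1/4} \end{pmatrix} \in M_4(K_0) \simeq M_2(K_0)^{\otimes 2}, \tag{35}$$
and
$$R(\lambda, \mu) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \dfrac{\lambda - \mu}{q^{-1/2}\lambda - q^{1/2}\mu} & \dfrac{(q^{-1/2} - q^{1/2})\mu}{q^{-1/2}\lambda - q^{1/2}\mu} & 0 \\ 0 & \dfrac{(q^{-1/2} - q^{1/2})\lambda}{q^{-1/2}\lambda - q^{1/2}\mu} & \dfrac{\lambda - \mu}{q^{-1/2}\lambda - q^{1/2}\mu} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \in M_4(K_0) \simeq M_2(K_0)^{\otimes 2}. \tag{36}$$
By definition, $\mathbb{C}[N_+]_q$ is the algebra generated over $K$ by the $a_{i,j}^{(r)}$ for $i, j \in \{1, 2\}$, $r \in \mathbb{N}$, and the relations:
$$a_{2,1}^{(0)} = 0, \qquad a_{1,1}^{(0)} = a_{2,2}^{(0)} = 1, \tag{37}$$
$$R(\lambda, \mu)\, L^1(\lambda)\, H\, L^2(\mu) = L^2(\mu)\, H\, L^1(\lambda)\, R(\lambda, \mu), \tag{38}$$
$$a_{1,1}(q\lambda)\Big( a_{2,2}(\lambda) - a_{2,1}(\lambda)\, a_{1,1}(\lambda)^{-1} a_{1,2}(\lambda) \Big) = 1, \tag{39}$$
with: $a_{i,j}(\lambda) = \sum_{r=0}^{+\infty} a_{i,j}^{(r)} \lambda^r$, $L(\lambda) = [a_{i,j}(\lambda)]$, $L^1(\lambda) = L(\lambda) \otimes \mathrm{Id}$ and $L^2(\mu) = \mathrm{Id} \otimes L(\mu)$.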
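Relations (38) and (39) have coefficients in $K$. So, the above definition makes sense. The relation (39) is the quantum determinant relation. It can be shown that $\mathbb{C}[N_+]_q$ is a quantum deformation of the algebra of functions on the Poisson manifold $N_+$ defined in 2.1, equipped with the Poisson bivector $P = r^L - r^R + \frac{1}{4}(h^L \otimes h^R - h^R \otimes h^L)$, where $r$ denotes the trigonometric $r$-matrix:
$$r(\lambda, \mu) = -\frac{1}{4}\,\frac{\lambda + \mu}{\lambda - \mu}\, h \otimes h - \frac{1}{\lambda - \mu}\,(\lambda\, e \otimes f + \mu\, f \otimes e)$$
with $h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, and where $r^L$ (resp. $r^R$) denotes the left (resp. right) translation of $r$ on $G$. We can define a gradation $\deg$ on $\mathbb{C}[N_+]_q$ by $\deg(a_{i,j}^{(k)}) = i - j$ for integers $i, j, k$, as well as a twisted comultiplication $\bar{\Delta}$ given by $\bar{\Delta}\, L(\lambda) = L(\lambda) \bar{\otimes} L(\lambda)$. Moreover, the map
$$\mathcal{C} : U_q\widehat{\mathfrak{n}}_- \longrightarrow \mathbb{C}[N_+]_q, \qquad f_+ \longmapsto a_{2,1}^{(1)}, \quad f_- \longmapsto a_{1,2}^{(0)}$$
is an algebra morphism such that $\bar{\Delta} \circ \mathcal{C} = (\mathcal{C} \bar{\otimes} \mathcal{C}) \circ \bar{\Delta}$. Let us consider the morphism $f_n$ defined in Lemma 4 above. Thanks to the proof of Lemma 4, there exists a graded algebra morphism
$$g_n : \mathbb{C}[N_+]_q \longrightarrow A_q, \qquad L(\lambda) \longmapsto \prod_{i=1}^{n} \begin{pmatrix} 1 & 0 \\ \lambda x_i & 1 \end{pmatrix} \begin{pmatrix} 1 & y_i \\ 0 & 1 \end{pmatrix}. \tag{40}$$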
It is clear that for $r > n$, $a_{1,1}^{(r-1)}, a_{1,2}^{(r-1)}, a_{2,1}^{(r)}, a_{2,2}^{(r)} \in \operatorname{Ker} g_n$. Moreover, the images under $g_n$ of $a_{1,1}^{(n-1)}$, $a_{1,2}^{(n-1)}$, $a_{2,1}^{(n)}$ and $a_{2,2}^{(n)}$ are invertible. This leads to the study of the quantum algebra $\mathbb{C}[B_- w_n B_- \cap N_+]_q$ given below.
4.3. The quantum Schubert cell $\mathbb{C}[B_- w_n B_- \cap N_+]_q$. The interest in studying the algebra $\mathbb{C}[B_- w_n B_- \cap N_+]_q$ stems from the fact that the generating series of certain functions defined on this quantum algebra satisfies the same relation (22) as the generators $u_i$, $i \in \mathbb{N}^*$, in $\mathbb{C}[H_-\backslash B_-]_q$, and that we can deduce from (40) the existence of an algebra morphism from $\mathbb{C}[B_- w_n B_- \cap N_+]_q$ to $A_q$.

Definition. The algebra $\mathbb{C}[B_- w_n B_- \cap N_+]_q$ is given by generators $a_{i,j}^{(k)}$ ($i, j \in \{1, 2\}$; $k \in \{0, \ldots, n\}$), $a'^{(n-1)}_{1,1}$, $a'^{(n-1)}_{1,2}$, $a'^{(n)}_{2,1}$, $a'^{(n)}_{2,2}$, and relations (37), (38), (39) with $a_{i,j}(\lambda) = \sum_{k=0}^{n} a_{i,j}^{(k)} \lambda^k$ and $a_{1,1}^{(n)} = a_{1,2}^{(n)} = 0$, as well as relations which express the fact that $a'^{(n-1)}_{1,1}$ (resp. $a'^{(n-1)}_{1,2}$, $a'^{(n)}_{2,1}$, $a'^{(n)}_{2,2}$) is an inverse for $a_{1,1}^{(n-1)}$ (resp. $a_{1,2}^{(n-1)}$, $a_{2,1}^{(n)}$, $a_{2,2}^{(n)}$).

At the semi-classical limit, for $q \to 1$, we get the algebra of functions on the Schubert cell $(B_- w_n B_- \cap N_+, P)$ with $w_n = \operatorname{diag}(\lambda^{-n}, \lambda^n)$, with a Poisson structure given by the fact that $(B_- w_n B_- \cap N_+, P)$ may be viewed as a symplectic leaf of the Poisson manifold $(N_+, P)$. The algebra $\mathbb{C}[B_- w_n B_- \cap N_+]_q$ is just a rough quantum deformation of $(B_- w_n B_- \cap N_+, P)$. To obtain an exact quantum deformation, one should impose in the definition relations between the inverses of $a_{1,1}^{(n-1)}$, $a_{1,2}^{(n-1)}$, $a_{2,1}^{(n)}$, $a_{2,2}^{(n)}$ and all the $a_{i,j}^{(k)}$. However, we do not need to be so precise and our definition will suffice. There is a natural morphism $p : \mathbb{C}[N_+]_q \longrightarrow \mathbb{C}[B_- w_n B_- \cap N_+]_q$. Moreover, if $C$ is an algebra and if $f : \mathbb{C}[N_+]_q \longrightarrow C$ is an algebra morphism such that $a_{1,1}^{(r-1)}, a_{1,2}^{(r-1)}, a_{2,1}^{(r)}, a_{2,2}^{(r)} \in \operatorname{Ker} f$ for $r > n$, and $f\big(a_{1,1}^{(n-1)}\big)$, $f\big(a_{1,2}^{(n-1)}\big)$, $f\big(a_{2,1}^{(n)}\big)$, $f\big(a_{2,2}^{(n)}\big)$ are invertible in $C$, then there
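exists an algebra morphism $g : \mathbb{C}[B_- w_n B_- \cap N_+]_q \longrightarrow C$ such that the following diagram is commutative:
$$\begin{array}{ccc} \mathbb{C}[N_+]_q & \xrightarrow{\ p\ } & \mathbb{C}[B_- w_n B_- \cap N_+]_q \\ {\scriptstyle f}\searrow & & \swarrow{\scriptstyle g} \\ & C & \end{array}$$
By virtue of (40), it follows that there exists an algebra morphism $h_n : \mathbb{C}[B_- w_n B_- \cap N_+]_q \longrightarrow A_q$ such that:
$$h_n\big(L(\lambda)\big) = \prod_{i=1}^{n} \begin{pmatrix} 1 & 0 \\ \lambda x_i & 1 \end{pmatrix} \begin{pmatrix} 1 & y_i \\ 0 & 1 \end{pmatrix} \tag{41}$$
with $L(\lambda) = [a_{i,j}(\lambda)]_{i,j \in \{1,2\}}$ and $a_{i,j}(\lambda) = \sum_{k=0}^{n} a_{i,j}^{(k)} \lambda^k$. The element $a_{2,2}(\lambda)$ is invertible in the ring $\mathbb{C}[B_- w_n B_- \cap N_+]_q((\lambda^{-1}))$. We set $\alpha(\lambda) := a_{2,2}(\lambda)^{-1} a_{2,1}(\lambda)$.

Lemma 5. The function $\alpha(\lambda)$ satisfies the same relation (22) as the function $u(\lambda)$. We have:
$$\big(\lambda^{-1}\alpha(\lambda) - \mu^{-1}\alpha(\mu)\big)\big(\alpha(\lambda) - \alpha(\mu)\big) = q\, \big(\alpha(\lambda) - \alpha(\mu)\big)\big(\lambda^{-1}\alpha(\lambda) - \mu^{-1}\alpha(\mu)\big).$$

Proof. By definition, for two elements $a$ and $b$ of an algebra, we denote by $[a, b]$ the commutator $ab - ba$. Then,
$$\begin{aligned} [\alpha(\lambda), \alpha(\mu)] ={}& a_{2,2}(\lambda)^{-1} \big[a_{2,1}(\lambda), a_{2,2}(\mu)^{-1}\big] a_{2,1}(\mu) + \big[a_{2,2}(\lambda)^{-1}, a_{2,2}(\mu)^{-1}\big] a_{2,1}(\lambda) a_{2,1}(\mu) \\ &+ a_{2,2}(\mu)^{-1} a_{2,2}(\lambda)^{-1} \big[a_{2,1}(\lambda), a_{2,1}(\mu)\big] + a_{2,2}(\mu)^{-1} \big[a_{2,2}(\lambda)^{-1}, a_{2,1}(\mu)\big] a_{2,1}(\lambda). \end{aligned}$$
Relation (38) shows that
$$\forall\, i, j \in \{1, 2\}, \qquad [a_{i,j}(\lambda), a_{i,j}(\mu)] = 0 \tag{42}$$
and
$$\begin{aligned} \big[a_{2,1}(\lambda), a_{2,2}(\mu)\big] &= \big[a_{2,1}(\mu), a_{2,2}(\lambda)\big] && (43) \\ &= \frac{1 - q^{-1}}{\lambda - \mu}\, \big(\mu\, a_{2,2}(\mu) a_{2,1}(\lambda) - \lambda\, a_{2,2}(\lambda) a_{2,1}(\mu)\big). && (44) \end{aligned}$$
So, thanks to (42) and (43), we get
$$\begin{aligned} [\alpha(\lambda), \alpha(\mu)] &= a_{2,2}(\lambda)^{-1} a_{2,2}(\mu)^{-1} \big[a_{2,1}(\lambda), a_{2,2}(\mu)\big] \big(a_{2,2}(\lambda)^{-1} a_{2,1}(\lambda) - a_{2,2}(\mu)^{-1} a_{2,1}(\mu)\big) && (45) \\ &= \frac{1 - q^{-1}}{\lambda - \mu}\, \big(\mu\,\alpha(\lambda) - \lambda\,\alpha(\mu)\big)\big(\alpha(\lambda) - \alpha(\mu)\big). && (46) \end{aligned}$$
The result follows from this last equality. □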
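As a low-order illustration of what the relation in Lemma 5 encodes, one can extract a single coefficient. The sketch below assumes the expansion $\alpha(\lambda) = \sum_{i \ge 0} (-1)^i \alpha_{i+1} \lambda^{-i}$ used later in this section, and the shorthand $a = \lambda^{-1}$, $b = \mu^{-1}$ is introduced here only for the computation.

```latex
% Set b = 0, so that \alpha(\mu) reduces to \alpha_1, and expand in a = \lambda^{-1}:
%   \alpha(\lambda) = \alpha_1 - a\,\alpha_2 + O(a^2).
\begin{align*}
\text{LHS}\big|_{b=0} &= a\,\alpha(\lambda)\,\big(\alpha(\lambda) - \alpha_1\big)
   = a\,(\alpha_1 - a\,\alpha_2 + \cdots)(-a\,\alpha_2 + \cdots)
   = -a^2\,\alpha_1\alpha_2 + O(a^3), \\
\text{RHS}\big|_{b=0} &= q\,\big(\alpha(\lambda) - \alpha_1\big)\,a\,\alpha(\lambda)
   = q\,(-a\,\alpha_2 + \cdots)\,a\,(\alpha_1 + \cdots)
   = -q\,a^2\,\alpha_2\alpha_1 + O(a^3).
\end{align*}
% Comparing the coefficients of a^2 gives the q-commutation of the first two modes:
\[
  \alpha_1 \alpha_2 = q\, \alpha_2 \alpha_1 .
\]
```

The same coefficient comparison applied to $u(\lambda)$ gives $u_1 u_2 = q\, u_2 u_1$.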
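Let us now determine the image of $\alpha(\lambda)$ under the map $h_n$.

Lemma 6. We have
$$h_n\big(\alpha(\lambda)\big) = \cfrac{y_n^{-1}}{1 + \cfrac{(\lambda x_n y_n)^{-1}}{1 + \cfrac{(\lambda y_{n-1} x_n)^{-1}}{\ddots\ 1 + \cfrac{(\lambda y_1 x_2)^{-1}}{1 + (\lambda x_1 y_1)^{-1}}}}}\,, \qquad \text{with } \frac{a}{b} := b^{-1} a$$
for two elements $a$ and $b$ such that $b$ is invertible.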
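Before the proof, the statement can be sanity-checked in the commutative (classical, $q = 1$) situation, where the ordering conventions play no role. The following is only a sketch under that assumption: the function names `transfer_ratio` and `continued_fraction` are hypothetical, the sites are replaced by rational numbers, and the check says nothing about the genuinely non-commutative case treated in the proof.

```python
from fractions import Fraction


def transfer_ratio(lam, xs, ys):
    """Return c/d for the matrix [[a, b], [c, d]] obtained as the ordered
    product over i of [[1, 0], [lam*x_i, 1]] * [[1, y_i], [0, 1]], as in (41),
    with commuting rational entries (classical limit only)."""
    a, b, c, d = Fraction(1), Fraction(0), Fraction(0), Fraction(1)
    for x, y in zip(xs, ys):
        # right-multiply by [[1, 0], [lam*x, 1]]
        a, b, c, d = a + b * lam * x, b, c + d * lam * x, d
        # right-multiply by [[1, y], [0, 1]]
        a, b, c, d = a, a * y + b, c, c * y + d
    return c / d


def continued_fraction(lam, xs, ys):
    """Evaluate the continued fraction of Lemma 6 from the innermost level out."""
    level = 1 + 1 / (lam * xs[0] * ys[0])
    for i in range(1, len(xs)):
        level = 1 + (1 / (lam * ys[i - 1] * xs[i])) / level
        level = 1 + (1 / (lam * xs[i] * ys[i])) / level
    return (1 / ys[-1]) / level


if __name__ == "__main__":
    lam = Fraction(3, 2)
    xs = [Fraction(2), Fraction(5, 3), Fraction(7)]
    ys = [Fraction(1, 4), Fraction(3), Fraction(2, 5)]
    print(transfer_ratio(lam, xs, ys) == continued_fraction(lam, xs, ys))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.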
Proof. Let us define elements $a_n, b_n, c_n, d_n$ in $A_q[\lambda]$ such that $h_n\big(L(\lambda)\big) = \begin{pmatrix} a_n & b_n \\ c_n & d_n \end{pmatrix}$. Then, thanks to (41), we have $h_n\big(a_{2,1}(\lambda)\big) = c_n$, $h_n\big(a_{2,2}(\lambda)\big) = d_n$ and $h_n\big(\alpha(\lambda)\big) = d_n^{-1} c_n$. If $n = 1$ then $d_1 = 1 + \lambda (x_1 y_1)$ is invertible in $A_q((\lambda^{-1}))$ and $h_1\big(\alpha(\lambda)\big) = \big(1 + (\lambda x_1 y_1)^{-1}\big)^{-1} y_1^{-1}$. Let us make the hypothesis that $n > 1$. We have:
$$c_n = c_{n-1} + \lambda\, d_{n-1} x_n, \tag{47}$$
$$d_n = c_{n-1} y_n + d_{n-1}(\lambda x_n y_n + 1). \tag{48}$$
From this, the computation shows that
$$h_n\big(\alpha(\lambda)\big) = \Big[ 1 + (\lambda x_n y_n)^{-1} \Big( 1 + q^{-1} \big(d_{n-1}^{-1} c_{n-1} y_{n-1}\big)(\lambda y_{n-1} x_n)^{-1} \Big)^{-1} \Big]^{-1} y_n^{-1}.$$
Therefore, if we set $V_n := h_n\big(\alpha(\lambda)\big)\, y_n$ and $W_n^{(k)} := \big(1 + q^k V_{n-1} (\lambda y_{n-1} x_n)^{-1}\big)^{-1}$, we see that
$$V_n = \big(1 + (\lambda x_n y_n)^{-1} W_n^{(-1)}\big)^{-1}.$$
By induction on $n$, we deduce that for all integers $k$ with $k > n$, $(x_k y_k)$ and $V_n$ commute. Hence, by induction on $p$,
$$\forall p \in \mathbb{N}, \qquad (\lambda x_n y_n)^{-p}\, W_n^{(k)} = W_n^{(k+p)}\, (\lambda x_n y_n)^{-p}.$$
So,
$$V_n = \big(1 + W_n^{(0)} (\lambda x_n y_n)^{-1}\big)^{-1} = \Big(1 + \big(1 + V_{n-1} (\lambda y_{n-1} x_n)^{-1}\big)^{-1} (\lambda x_n y_n)^{-1}\Big)^{-1}.$$
The result follows from this by induction on $n$. □

Set $\alpha(\lambda) = \sum_{i=0}^{\infty} (-1)^i \alpha_{i+1} \lambda^{-i}$ and let us see what the images of the $\alpha_i$ in $A_q$ are. For that, we shall need the following proposition.

Proposition 8. Let $N$ be an integer and $I_N$ the ideal in $A_N := \mathbb{C}[q, q^{-1}]\{\{t_1, \ldots, t_N\}\}$ generated by the elements $t_i t_{i+1} - q\, t_{i+1} t_i$ for $i \in \{1, \ldots, N\}$ and $t_i t_j - t_j t_i$ for $|i - j| \geq 2$. Then, in the ring $A_N / I_N$, we have:
$$\Big(1 - \big(1 - (1 - \cdots (1 - t_N)^{-1} t_{N-1})^{-1} \cdots t_2\big)^{-1} t_1\Big)^{-1} = \sum_{\alpha_1, \ldots, \alpha_N} F_q(\alpha_1, \ldots, \alpha_N)\, t_N^{\alpha_N} \cdots t_1^{\alpha_1}.$$
We recall that the function $F_q$ has been defined in Sect. 3.

Proof. We denote by $F_N$ the quantum fraction on the left hand side, by $v_N$ the valuation on $A_N$ corresponding to the gradation given by $\deg t_j = 1$ for all $j$, and by $i_N$ the valued injection from $A_N$ to $A_{N+1}$ given by $i_N(t_j) = t_{j+1}$. If $N = 1$, the result is obvious. Let us assume that the property is true up to rank $N$. Then, $v_{N+1}\big(i_N(F_N)\, t_1\big) \geq 1$ since $v_N(F_N) \geq 0$. So, $1 - i_N(F_N)\, t_1$ is invertible in $A_{N+1}$ and $F_{N+1}$ exists. Set $t_1' = t_1, \ldots, t_{N-1}' = t_{N-1}$
and $t_N' = (1 - t_{N+1})^{-1} t_N$. Then, the $t_j'$ satisfy the same relations as the generators $t_j$ in $A_N / I_N$. Moreover,
$$F_{N+1} = \Big(1 - \big(1 - (1 - \cdots (1 - t_N')^{-1} t_{N-1}')^{-1} \cdots t_2'\big)^{-1} t_1'\Big)^{-1}.$$
So, the induction hypothesis implies that
$$F_{N+1} = \sum_{\alpha_1, \ldots, \alpha_N} F_q(\alpha_1, \ldots, \alpha_N)\, t_N'^{\alpha_N} \cdots t_1'^{\alpha_1}.$$
On the other hand, an induction on $k$ shows that:
$$\big((1 - t_{N+1})^{-1} t_N\big)^k = (1 - t_{N+1})^{-1} \cdots (1 - q^{k-1} t_{N+1})^{-1}\, t_N^k.$$
Therefore,
$$F_{N+1} = \sum_{\alpha_1, \ldots, \alpha_N} F_q(\alpha_1, \ldots, \alpha_N)\, (1 - t_{N+1})^{-1} \cdots (1 - q^{\alpha_N - 1} t_{N+1})^{-1}\, t_N^{\alpha_N} \cdots t_1^{\alpha_1}.$$
The result follows from the classical relation:
$$\prod_{s=0}^{N-1} (1 - q^s t)^{-1} = \sum_{k \geq 0} \binom{N + k - 1}{k}_{\!q}\, t^k.$$
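The classical relation used in this last step is the $q$-binomial series. A minimal numerical check is sketched below; it evaluates both sides at a fixed rational value of $q$ (an assumption made only to keep the arithmetic exact) and compares the first few coefficients of $t^k$. The function names are hypothetical and the symbol $\binom{n}{k}_q$ is taken to be the usual Gaussian binomial coefficient.

```python
from fractions import Fraction


def gaussian_binomial(n, k, q):
    """Gaussian binomial coefficient [n choose k]_q at a numeric q
    (q must not be a root of unity, so that no denominator vanishes)."""
    result = Fraction(1)
    for i in range(1, k + 1):
        result *= Fraction(q ** (n - k + i) - 1, q ** i - 1)
    return result


def product_series(N, q, order):
    """Coefficients of t^0 .. t^order in prod_{s=0}^{N-1} (1 - q^s t)^(-1)."""
    coeffs = [Fraction(1)] + [Fraction(0)] * order
    for s in range(N):
        # multiplying by 1/(1 - r t) with r = q^s: new[k] = old[k] + r * new[k-1]
        for k in range(1, order + 1):
            coeffs[k] += q ** s * coeffs[k - 1]
    return coeffs


if __name__ == "__main__":
    q, order = 3, 6
    for N in range(1, 5):
        series = product_series(N, q, order)
        expected = [gaussian_binomial(N + k - 1, k, q) for k in range(order + 1)]
        print(N, series == expected)
```

Working at a numeric $q$ only tests the identity at that value; a symbolic check in $q$ would need a polynomial arithmetic layer that is omitted here.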
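□

Lemma 6 and Proposition 8 allow us to obtain explicitly the images of the $\alpha_i$ under $h_n$.

Corollary 1. We have
$$h_n(\alpha_i) = \sum F_q(\alpha_1, \ldots, \alpha_{2n-1})\, (x_1 y_1)^{-\alpha_{2n-1}} \cdots (x_n y_n)^{-\alpha_1}\, y_n^{-1},$$
where the sum is taken over all integers $\alpha_1, \ldots, \alpha_{2n-1}$ such that $\alpha_1 + \cdots + \alpha_{2n-1} = i - 1$.

The fact that the $\alpha_i$ satisfy relation (22) leads to the study of the following quantum algebra.

4.4. The quantum homogeneous space $\mathbb{C}[S_\infty\backslash B_-]_q$. Let $S_\infty$ be the subgroup of $B_-$ consisting of all lower triangular matrices of the form $\begin{pmatrix} a & \lambda^{-1} b \\ 0 & a^{-1} \end{pmatrix}$.

Definition. We denote by $\mathbb{C}[S_\infty\backslash B_-]_q$ the algebra given by generators $u_i$, $i \in \mathbb{N}^*$, and relation (22):
$$\big(\lambda^{-1} u(\lambda) - \mu^{-1} u(\mu)\big)\big(u(\lambda) - u(\mu)\big) = q\, \big(u(\lambda) - u(\mu)\big)\big(\lambda^{-1} u(\lambda) - \mu^{-1} u(\mu)\big)$$
with $u(\lambda) = \sum_{i=0}^{\infty} (-1)^i u_{i+1} \lambda^{-i}$. We can check that the relations coming from (22) are equivalent to the equalities
$$\forall\, i < j, \qquad [u_i, u_j] = (1 - q^{-1}) \sum_{k=i}^{j-1} u_k\, u_{i+j-k}. \tag{49}$$
The algebra $\mathbb{C}[S_\infty\backslash B_-]_q$ is a graded algebra with the gradation given by $\deg u_i = 1$ for all $i$. Thanks to Lemma 5, there exists a specialization morphism:
$$r : \mathbb{C}[S_\infty\backslash B_-]_q \longrightarrow \mathbb{C}[B_- w_n B_- \cap N_+]_q, \qquad \forall\, i, \quad u_i \longmapsto \alpha_i. \tag{50}$$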
We set $h_n' = T^{-n} \circ h_n \circ r$, where $T$ is the translation automorphism on $A_q$. Thanks to Corollary 1, for all integers $i, n, m$ with $i \leq 2n$ and $i \leq 2m$ we have $h_n'(u_i) = h_m'(u_i)$. Now, if we take into account Lemma 6, we deduce the existence of a graded algebra morphism
$$h : \mathbb{C}[S_\infty\backslash B_-]_q \longrightarrow A_q, \qquad u(\lambda) \longmapsto \lim_{N \to \infty} \cfrac{y_0^{-1}}{1 + \cfrac{(\lambda x_0 y_0)^{-1}}{1 + \cfrac{(\lambda y_{-1} x_0)^{-1}}{\ddots\ 1 + \cfrac{(\lambda y_{-N+1} x_{-N+2})^{-1}}{1 + (\lambda x_{-N+1} y_{-N+1})^{-1}}}}} \tag{51}$$
with the convention that $\frac{a}{b} = b^{-1} a$ if $b$ is invertible. Explicitly, the image of $u_i$ by $h$ is given by the formula:
$$h(u_i) = \sum F_q(a_1, a_2, \ldots)\, \cdots (x_{-k} y_{-k})^{-a_{2k+1}} (y_{-k} x_{-k+1})^{-a_{2k}} \cdots (x_0 y_0)^{-a_1}\, y_0^{-1}, \tag{52}$$
the sum being taken over all integers $a_i$ such that $\sum_k a_k = i - 1$. For example, $h(u_1) = y_0^{-1}$ and $h(u_2) = (x_0 y_0)^{-1} y_0^{-1}$. We note that for all integers $i$ and $n$ with $i \leq 2(n - 1)$ we have $h_n(\alpha_i) = T^n h(u_i)$.

Proposition 9. A basis for $\mathbb{C}[S_\infty\backslash B_-]_q$ is given by the family $\xi_a := \prod_{i=1}^{\infty} u_i^{a_i}$, where $a = (a_i)_{i \in \mathbb{N}^*}$ denotes an almost zero sequence of integers.

Proof. Thanks to (49), $\mathbb{C}[S_\infty\backslash B_-]_q$ is spanned by the family $\xi_a$. But this family is also free. Indeed, this is a consequence of (1) the existence of $h$ given above, (2) the fact that each new element of the sequence $h(u_i)$ involves one and only one new element of the form $x_{-k}$ or $y_{-k}$ (according to the parity of $i$) and (3) the fact that the family $\prod_i x_i^{a_i} y_i^{b_i}$ forms a basis of $A_q$, where $(a_i)$ and $(b_i)$ are almost zero sequences in $\mathbb{Z}^{\mathbb{Z}}$. □

Note that the proof of Proposition 9 shows the following result.

Corollary 2. The algebra morphism $h$ is injective. In the classical case, when $q \to 1$, we see from (52) that $h$ is a birational map.

We can also deduce from Proposition 9 that $\mathbb{C}[S_\infty\backslash B_-]_q$ is a quantum deformation of the algebra of functions on the Poisson manifold $S_\infty\backslash B_-$ equipped with the Poisson structure induced by the field of bivectors $r^L - r^R$.

4.5. End of the proof. It is based on Proposition 9 and Lemma 7, which together show that $\mathbb{C}[H_-\backslash B_-]_q$ is in some way the quantum “double” of $\mathbb{C}[S_\infty\backslash B_-]_q$.

Definition. We denote by $\mathbb{C}[S_\infty\backslash B_-]_q^+$ the algebra given by generators $m_i$, $i \in \mathbb{N}^*$, and relations (23), i.e.,
$$\big(\lambda^{-1} m(\lambda) - \mu^{-1} m(\mu)\big)\big(m(\lambda) - m(\mu)\big) = q^{-1} \big(m(\lambda) - m(\mu)\big)\big(\lambda^{-1} m(\lambda) - \mu^{-1} m(\mu)\big)$$
with $m(\lambda) = \sum_{i=0}^{\infty} (-1)^i m_{i+1} \lambda^{-i}$.
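We also denote by $\deg$ the gradation on $\mathbb{C}[S_\infty\backslash B_-]_q^+$ defined by $\deg m_i = -1$ for all $i$, and by $\varphi^+$ the anti-isomorphism of algebras:
$$\varphi^+ : \mathbb{C}[S_\infty\backslash B_-]_q \longrightarrow \mathbb{C}[S_\infty\backslash B_-]_q^+, \qquad \forall\, i \in \mathbb{N}^*, \quad u_i \longmapsto m_i. \tag{53}$$
The map $\varphi^+$ is an anti-graded involution. On the other hand, there also exists an involution $\varphi$ on $A_q$ which is an anti-graded algebra anti-automorphism defined by:
$$\varphi : A_q \longrightarrow A_q, \qquad \forall\, i \in \mathbb{Z}, \quad x_i \longmapsto y_{1-i}, \quad y_i \longmapsto x_{1-i}. \tag{54}$$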
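By considering the map $\varphi \circ h \circ \varphi^+$, we deduce the existence of a graded algebra morphism
$$h^+ : \mathbb{C}[S_\infty\backslash B_-]_q^+ \longrightarrow A_q, \qquad m(\lambda) \longmapsto \lim_{N \to \infty} \cfrac{x_1^{-1}}{1 + \cfrac{(\lambda x_1 y_1)^{-1}}{1 + \cfrac{(\lambda y_1 x_2)^{-1}}{\ddots\ 1 + \cfrac{(\lambda y_{N-1} x_N)^{-1}}{1 + (\lambda x_N y_N)^{-1}}}}} \tag{55}$$
with the convention that $\frac{a}{b} = a b^{-1}$. Explicitly, using Proposition 8, we get
$$h^+(m_i) = \sum F_q(\alpha_1, \alpha_2, \ldots)\, x_1^{-1} (x_1 y_1)^{-\alpha_1} \cdots (x_k y_k)^{-\alpha_{2k-1}} (y_k x_{k+1})^{-\alpha_{2k}} \cdots \tag{56}$$
As usual, the sum is taken over all almost zero sequences $(\alpha_k)$ such that $\sum_k \alpha_k = i - 1$. Here also, each new term of the sequence $m_i$ involves one and only one new variable of the form $x_k$ or $y_k$ (according to the parity of $i$). Therefore, the same argument as before shows the following two results.

Proposition 10. A basis for $\mathbb{C}[S_\infty\backslash B_-]_q^+$ is given by the family $\eta_a := \prod_{i=1}^{\infty} m_i^{a_i}$, where $a = (a_i)_{i \in \mathbb{N}^*}$ denotes any almost zero sequence of integers.

Corollary 3. The map $h^+$ is injective. In the classical case, $h^+$ is also a birational isomorphism from $\mathbb{C}[S_\infty\backslash B_-]^+$ to $\mathbb{C}[x_i^{-1}, y_i^{-1}, i > 0]$.

Let us consider again the quantum algebra $\mathbb{C}[H_-\backslash B_-]_q$ defined in Sect. 2.

Lemma 7. The natural map
$$\mathbb{C}[H_-\backslash B_-]_q \longrightarrow \mathbb{C}[S_\infty\backslash B_-]_q \,\bar{\otimes}\, \mathbb{C}[S_\infty\backslash B_-]_q^+, \qquad \forall\, i, \quad u_i \longmapsto u_i \bar{\otimes} 1, \quad m_i \longmapsto 1 \bar{\otimes} m_i$$
is a graded algebra isomorphism.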
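Proof. It is sufficient to construct the inverse. But, there exist natural morphisms $f$ and $f^+$ from $\mathbb{C}[S_\infty\backslash B_-]_q$ and $\mathbb{C}[S_\infty\backslash B_-]_q^+$ to $\mathbb{C}[H_-\backslash B_-]_q$. These morphisms $q^{-1}$-commute. So, there exists a morphism:
$$f \bar{\otimes} f^+ : \mathbb{C}[S_\infty\backslash B_-]_q \,\bar{\otimes}\, \mathbb{C}[S_\infty\backslash B_-]_q^+ \longrightarrow \mathbb{C}[H_-\backslash B_-]_q.$$
One can check that this gives an inverse for the studied map. □

Hence, by virtue of Propositions 9 and 10, we deduce that $\mathbb{C}[H_-\backslash B_-]_q$ is indeed a flat deformation of $\mathbb{C}[H_-\backslash B_-]$. This completes the proof of Proposition 5. Note that $\mathbb{C}[H_-\backslash B_-]_q$ is a graded algebra with the gradation given by $\deg u_i = -\deg m_i$ for all $i$, and that a basis of $\mathbb{C}[H_-\backslash B_-]_q$ is given by the family $\prod_{i=1}^{\infty} u_i^{\alpha_i} \prod_{j=1}^{\infty} m_j^{\beta_j}$, where $(\alpha_i)$ and $(\beta_j)$ are two almost zero sequences of integers.

5. Quantum Drinfeld–Sokolov Correspondence

The aim of this section is to prove Theorem 1. As a result of the previous section, we have already proved the existence of $DS_q$. Indeed, it suffices to consider the morphisms $h$ and $h^+$ seen in (51) and (55), to note that $\operatorname{Im} h \subset \mathbb{C}[x_i^{\pm 1}, y_i^{\pm 1}, i \leq 0]_q$, $\operatorname{Im} h^+ \subset \mathbb{C}[x_i^{\pm 1}, y_i^{\pm 1}, i > 0]_q$, and to take into account Lemma 7 together with the isomorphism $A_q \simeq \mathbb{C}[x_i^{\pm 1}, y_i^{\pm 1}, i \leq 0]_q \,\bar{\otimes}\, \mathbb{C}[x_i^{\pm 1}, y_i^{\pm 1}, i > 0]_q$. Note that the injectivity of $h$ and of $h^+$ implies that of $DS_q$. It remains to prove the equality $DS_q \circ H(\mu) = \operatorname{ad}(I_\mu) \circ DS_q$. For that, the idea is first to prove the existence of $H_n$ (this will be achieved in Subsect. 5.4) and then, using the embedding of $\mathbb{C}[H_-\backslash B_-]_q$ into $A_q$, to extend $H_n$ not only to $A_q$ but also to $\bar{A}_q$, the algebra obtained from $A_q$ by adding the two half screening charges $\Sigma^\pm$ of the discrete sine-Gordon system. The interest in considering this algebra is that it is endowed with a $U_q\widehat{\mathfrak{b}}_-$-module-algebra structure, where $\widehat{\mathfrak{b}}_-$ is a Borel subalgebra of $\widehat{\mathfrak{sl}}_2$. Moreover, the adjoint actions of integrals of motion extend to $\bar{A}_q$ and commute with the action of $U_q\widehat{\mathfrak{b}}_-$. Conversely, each derivation which commutes with the action of $U_q\widehat{\mathfrak{b}}_-$ is the adjoint action of an integral of motion. We shall use this fact to complete the proof. First, we start by giving precise definitions of the quantum group $U_q\widehat{\mathfrak{b}}_-$ and of the algebras $\bar{A}_q$ and $\mathbb{C}[H_-\backslash B_- N_+]_q$.

Definition. Let $\widehat{\mathfrak{b}}_-$ be a Borel subalgebra of $\widehat{\mathfrak{sl}}_2$. We denote by $U_q\widehat{\mathfrak{b}}_-$ the quantum group given by generators $k_\varepsilon^{\pm 1}$, $f_\varepsilon$, with $\varepsilon, \varepsilon' \in \{+, -\}$, and relations:
$$k_\varepsilon k_\varepsilon^{-1} = k_\varepsilon^{-1} k_\varepsilon = 1, \tag{57}$$
$$k_\varepsilon k_{\varepsilon'} = k_{\varepsilon'} k_\varepsilon, \tag{58}$$
$$k_\varepsilon f_{\varepsilon'} k_\varepsilon^{-1} = q^{\alpha_{\varepsilon,\varepsilon'}} f_{\varepsilon'}, \tag{59}$$
together with the quantum Serre relations between $f_\pm$ and $f_\mp$:
$$f_\pm^3 f_\mp - (q + 1 + q^{-1})\big(f_\pm^2 f_\mp f_\pm - f_\pm f_\mp f_\pm^2\big) - f_\mp f_\pm^3 = 0, \tag{60}$$
with the convention that $\alpha_{\varepsilon,\varepsilon'} = 1$ if $\varepsilon = \varepsilon'$, and $-1$ otherwise. The comultiplication is given by:
$$\Delta(f_\varepsilon) = f_\varepsilon \otimes 1 + k_\varepsilon \otimes f_\varepsilon, \tag{61}$$
$$\Delta(k_\varepsilon^{\pm 1}) = k_\varepsilon^{\pm 1} \otimes k_\varepsilon^{\pm 1}. \tag{62}$$
We will use neither the antipode nor the co-unit in this article.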
5.1. The extended phase space $\bar{A}_q$. If we consider only a finite number of sites $x_i^{\pm 1}, y_i^{\pm 1}$, $i \in \{1, \ldots, n\}$, it can be shown from Lemma 4 that we get a $U_q\widehat{\mathfrak{b}}_-$-module-algebra. For all $x$ homogeneous with respect to $\deg$, the formulas are the following:
$$f_+.x = \sum_{i=1}^{n} [x_i, x]_q, \tag{63}$$
$$f_-.x = \sum_{i=1}^{n} [y_i, x]_q, \tag{64}$$
$$k_\pm.x = q^{\pm \deg x}\, x, \tag{65}$$
with, by definition,
$$[a, b]_q := ab - q^{(\deg a)(\deg b)}\, ba \tag{66}$$
for any homogeneous elements $a$ and $b$ with respect to the gradation $\deg$. If we now consider an infinite number of sites to the left of an arbitrary site, i.e. $x_i^{\pm 1}, y_i^{\pm 1}$, $i \leq N$, we also obtain a $U_q\widehat{\mathfrak{b}}_-$-module-algebra. For that, we set:
$$f_+.x = \sum_{i \leq N} [x_i, x]_q, \tag{67}$$
$$f_-.x = \sum_{i \leq N} [y_i, x]_q, \tag{68}$$
$$k_\pm.x = q^{\pm \deg x}\, x, \tag{69}$$
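for any $x$ homogeneous with respect to $\deg$. This follows from the fact that for all $x \in A_q$, $x_i$ and $x$ $q$-commute provided that $i$ is small enough. However, if we consider the whole algebra $A_q$, there is no longer a $U_q\widehat{\mathfrak{b}}_-$-module-algebra structure on it. For that, it is necessary to add the half screening charges $\Sigma^+$ and $\Sigma^-$, which correspond heuristically to $\sum_{i>0} x_i$ and $\sum_{i>0} y_i$.

Definition. We denote by $\bar{A}_q$ the algebra given by generators $\Sigma^+$, $\Sigma^-$, $x_i^{\pm 1}$, $y_i^{\pm 1}$, $i \in \mathbb{Z}$, and relations:
$$\begin{array}{lrcl} \forall\, i < j, & x_i x_j & = & q\, x_j x_i, \\ & y_i y_j & = & q\, y_j y_i, \\ & y_i x_j & = & q^{-1} x_j y_i, \\ \forall\, i \leq j, & x_i y_j & = & q^{-1} y_j x_i, \\ \forall\, i \in \mathbb{Z}, & x_i \Sigma^+ - q\, \Sigma^+ x_i & = & \sum_{j=1}^{i} [x_i, x_j]_q, \\ & x_i \Sigma^- - q^{-1} \Sigma^- x_i & = & \sum_{j=1}^{i} [x_i, y_j]_q, \\ & y_i \Sigma^+ - q^{-1} \Sigma^+ y_i & = & \sum_{j=1}^{i} [y_i, x_j]_q, \\ & y_i \Sigma^- - q\, \Sigma^- y_i & = & \sum_{j=1}^{i} [y_i, y_j]_q \end{array} \tag{70}$$
together with the quantum Serre relations between $\Sigma^\pm$ and $\Sigma^\mp$:
$$(\Sigma^\pm)^3 \Sigma^\mp - (q + 1 + q^{-1})\big((\Sigma^\pm)^2 \Sigma^\mp \Sigma^\pm - \Sigma^\pm \Sigma^\mp (\Sigma^\pm)^2\big) - \Sigma^\mp (\Sigma^\pm)^3 = 0. \tag{71}$$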
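The quadratic relations among the $x_i^{\pm 1}, y_i^{\pm 1}$ in (70) make every Laurent monomial a power of $q$ times a normally ordered monomial, so products can be tracked by integer bookkeeping. The sketch below is an illustration only. It assumes the reading of (70) in which $x_i y_j = q^{-1} y_j x_i$ for $i \le j$ and $y_i x_j = q^{-1} x_j y_i$ for $i < j$, ignores $\Sigma^\pm$ entirely, and uses hypothetical names (`comm_exponent`, `gen`, `multiply`). It then checks $e_i e_{i+1} = q^{-1} e_{i+1} e_i$ from Sect. 3.2 and $h(u_1) h(u_2) = q\, h(u_2) h(u_1)$ for the images given after (52).

```python
def comm_exponent(g, h):
    """Return c such that g*h = q**c * h*g, for generators g, h = (kind, site)
    with kind in {'x', 'y'} (assumed reading of the relations (70))."""
    if g == h:
        return 0
    (kind_g, i), (kind_h, j) = g, h
    if kind_g == kind_h:
        # x_i x_j = q x_j x_i and y_i y_j = q y_j y_i for i < j
        return 1 if i < j else -1
    if kind_g == 'x':
        # x_i y_j = q^{-1} y_j x_i for i <= j, and x_i y_j = q y_j x_i for i > j
        return -1 if i <= j else 1
    return -comm_exponent(h, g)


def gen(kind, site, power=1):
    """Monomial q^0 * generator^power, stored as (q_power, {generator: exponent})."""
    return (0, {(kind, site): power})


def multiply(m1, m2):
    """Product of two normally ordered Laurent monomials.
    The normal order is the tuple order on generators; each pair (g in m1,
    h in m2) with g > h contributes the q-power needed to move h past g."""
    (k1, e1), (k2, e2) = m1, m2
    k = k1 + k2
    for g, a in e1.items():
        for h, b in e2.items():
            if g > h:
                k += a * b * comm_exponent(g, h)
    exps = dict(e1)
    for h, b in e2.items():
        exps[h] = exps.get(h, 0) + b
    return (k, {g: e for g, e in exps.items() if e != 0})


def e(j):
    """e_{2i-1} = (x_i y_i)^{-1} and e_{2i} = (y_i x_{i+1})^{-1}, as in Sect. 3.1."""
    i = (j + 1) // 2
    if j % 2 == 1:
        return multiply(gen('y', i, -1), gen('x', i, -1))
    return multiply(gen('x', i + 1, -1), gen('y', i, -1))


# Images h(u_1) = y_0^{-1} and h(u_2) = (x_0 y_0)^{-1} y_0^{-1} quoted after (52).
hu1 = gen('y', 0, -1)
hu2 = multiply(multiply(gen('y', 0, -1), gen('x', 0, -1)), gen('y', 0, -1))
```

For instance, `multiply(hu1, hu2)` and `multiply(hu2, hu1)` have the same exponents and $q$-powers differing by one, in agreement with the lowest-order consequence $u_1 u_2 = q\, u_2 u_1$ of relation (22).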
As usual, for two elements $a$ and $b$, $[a, b]_q$ denotes the $q$-commutator of $a$ and $b$. The gradation $\deg$ on $\bar{A}_q$ is given by:
$$\forall\, i \in \mathbb{Z}, \qquad \deg x_i = \deg \Sigma^+ = -\deg y_i = -\deg \Sigma^- = 1.$$
The following result can be proved easily.

Lemma 8. A basis for $\bar{A}_q$ is given by the family $\prod_{i=-\infty}^{+\infty} x_i^{\alpha_i} y_i^{\beta_i}\, u$, where $u$ belongs to a basis $\mathcal{B}$ of the quantum algebra $\mathbb{C}[\Sigma^+, \Sigma^-]_q \simeq U_q\widehat{\mathfrak{n}}_-$ and $(\alpha_i)$, $(\beta_i)$ are two almost zero sequences in $\mathbb{Z}^{\mathbb{Z}}$.

Hence, thanks to Subsect. 2.1, we get the following lemma.

Lemma 9. There is a natural graded algebra embedding $A_q \hookrightarrow \bar{A}_q$. This embedding identifies the generators $x_i^{\pm 1}$ and $y_i^{\pm 1}$, $i \in \mathbb{Z}$, with the ones of $\bar{A}_q$.

The semi-classical limit of $\bar{A}_q$ is $\mathbb{C}\big[x_i^{\pm 1}, y_i^{\pm 1}, i \in \mathbb{Z}, \Sigma^{\varepsilon_0}, \{\Sigma^{\varepsilon_0}, \Sigma^{\varepsilon_1}\}, \{\Sigma^{\varepsilon_0}, \{\Sigma^{\varepsilon_1}, \Sigma^{\varepsilon_2}\}\}, \ldots, \varepsilon_k \in \{+, -\}\big]$. We denote this algebra by $\bar{A}_{cl}$. Let us remark that it is possible to extend the half-translation automorphism $T^{1/2}$ to $\bar{A}_{cl}$ by setting $T^{1/2} \Sigma^+ = \Sigma^-$ and $T^{1/2} \Sigma^- = \Sigma^+ - x_1$. It can be shown that $\bar{A}_{cl}$ is the localization of a subalgebra of a projective limit of algebras. Explicitly, these algebras are the ones generated by the variables $x_i$ and $y_i$ for $i \leq n$, with the obvious projection morphisms. The subalgebra considered is the one generated by $x_i, y_i$, $i \in \mathbb{Z}$, and the half screening charges $\Sigma^+$ and $\Sigma^-$, identified with $(x_1, x_1 + x_2, \ldots)$ and $(y_1, y_1 + y_2, \ldots)$. The multiplicative set is generated by the elements $x_i$ and $y_i$ for $i \in \mathbb{Z}$. It satisfies the Ore relation [D]. Therefore, we can deduce from formulas (67), (68), (69) that there exists a $U_q\widehat{\mathfrak{b}}_-$-module-algebra structure on $\bar{A}_q$ given by:
$$f_+.x = [\Sigma_+, x]_q := \sum_{i=-\infty}^{0} [x_i, x]_q + [\Sigma^+, x]_q, \tag{72}$$
$$f_-.x = [\Sigma_-, x]_q := \sum_{i=-\infty}^{0} [y_i, x]_q + [\Sigma^-, x]_q, \tag{73}$$
$$k_\pm.x = q^{\pm \deg x}\, x. \tag{74}$$
At the semi-classical limit, it also gives a $U\widehat{\mathfrak{b}}_-$-module-algebra structure on $\bar{A}_{cl}$.
5.2. The quantum homogeneous space C[H− \B− N+ ]q . Geometrically, at the classical level, adding half screening charges is the same as studying the Schubert cell B− N+ of G instead of its Borel group B− .
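Definition. We denote by $\mathbb{C}[H_-\backslash B_- N_+]_q$ the quantum algebra given by generators $\Sigma^+$, $\Sigma^-$, $u_i$, $m_i$, $i \in \mathbb{Z}$, and relations:
$$\big(\lambda^{-1} u(\lambda) - \mu^{-1} u(\mu)\big)\big(u(\lambda) - u(\mu)\big) = q\, \big(u(\lambda) - u(\mu)\big)\big(\lambda^{-1} u(\lambda) - \mu^{-1} u(\mu)\big), \tag{75}$$
$$\big(\lambda^{-1} m(\lambda) - \mu^{-1} m(\mu)\big)\big(m(\lambda) - m(\mu)\big) = q^{-1} \big(m(\lambda) - m(\mu)\big)\big(\lambda^{-1} m(\lambda) - \mu^{-1} m(\mu)\big), \tag{76}$$
$$u(\lambda)\, m(\mu) = q^{-1} m(\mu)\, u(\lambda), \tag{77}$$
$$u(\lambda)\, \Sigma^\pm = q^{\pm 1}\, \Sigma^\pm u(\lambda), \tag{78}$$
$$m(\lambda)\, \Sigma^+ - q^{-1} \Sigma^+ m(\lambda) = 1 - q^{-1}, \tag{79}$$
$$m(\lambda)\, \Sigma^- - q\, \Sigma^- m(\lambda) = -(q - 1)\, \lambda^{-1} m(\lambda)^2, \tag{80}$$
$$(\Sigma^\pm)^3 \Sigma^\mp - (q + 1 + q^{-1})\big((\Sigma^\pm)^2 \Sigma^\mp \Sigma^\pm - \Sigma^\pm \Sigma^\mp (\Sigma^\pm)^2\big) - \Sigma^\mp (\Sigma^\pm)^3 = 0, \tag{81}$$
with the same notation as before, i.e., $u(\lambda) = \sum_{i=0}^{\infty} (-1)^i u_{i+1} \lambda^{-i}$ and $m(\lambda) = \sum_{i=0}^{\infty} (-1)^i m_{i+1} \lambda^{-i}$.

The algebra $\mathbb{C}[H_-\backslash B_- N_+]_q$ is graded with the gradation given by:
$$\forall\, i \in \mathbb{Z}, \qquad \deg u_i = \deg \Sigma^+ = -\deg m_i = -\deg \Sigma^- = 1.$$
The relations between $u(\lambda)$, $m(\lambda)$ and $\Sigma^\pm$ will appear to be natural when we prove the following proposition, which claims the existence of the morphism $DS_q$.

Proposition 11. The map
$$DS_q : \mathbb{C}[H_-\backslash B_- N_+]_q \longrightarrow \bar{A}_q, \qquad \forall\, i \in \mathbb{N}^*, \quad u_i \longmapsto DS_q(u_i), \quad m_i \longmapsto DS_q(m_i), \quad \Sigma^\pm \longmapsto \Sigma^\pm$$
exists and defines a graded algebra morphism.

Proof. We need to prove some compatibility relations. The ones dealing with $DS_q(u_k)$, $k \in \mathbb{N}^*$, and $\Sigma^\pm$ follow from the fact that all the terms in $DS_q(u_k)$ are sums and products of $x_i^{\pm 1}$ and $y_i^{\pm 1}$ for $i \leq 0$. The ones dealing with $DS_q(m_k)$ and $\Sigma^+$ can be handled in the following way. If we take again the involution $\varphi$ defined in (54), we have $DS_q(m_k) = \varphi\big(DS_q(u_k)\big)$. So, for any integer $n$ large enough,
$$\big[DS_q(m_k), \Sigma^+\big]_q = \sum_{i=1}^{n} \big[DS_q(m_k), x_i\big]_q = \varphi\Big( \sum_{i=-n+1}^{0} \big[y_i, DS_q(u_k)\big]_q \Big) = (T^n \circ \varphi)\Big( \sum_{i=1}^{n} \big[y_i, T^n DS_q(u_k)\big]_q \Big).$$
Recalling the notation of Subsect. 4.3, and in particular the morphism $h_n : \mathbb{C}[B_- w_n B_- \cap N_+]_q \longrightarrow A_q$, we have $T^n DS_q(u_k) = h_n(\alpha_k)$ and $h_n\big(a_{1,2}^{(0)}\big) = \sum_{i=1}^{n} y_i$. Then the result comes from the commutation relations in $\mathbb{C}[B_- w_n B_- \cap N_+]_q$. We prove the compatibility relation between $DS_q(m_k)$ and $\Sigma^-$ using a similar method. □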
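From the commutation relations in $\mathbb{C}[H_-\backslash B_- N_+]_q$ together with the results of Sect. 4, we can deduce the following corollary.

Corollary 4. If $\mathcal{B}$ denotes a basis for $\mathbb{C}[\Sigma^-, \Sigma^+]_q \simeq U_q\widehat{\mathfrak{n}}_-$, then the family $\prod_{i=1}^{\infty} u_i^{\alpha_i} \prod_{j=1}^{\infty} m_j^{\beta_j}\, u$, where $(\alpha_i)$ and $(\beta_j)$ are two almost zero sequences of integers and $u \in \mathcal{B}$, is a basis for $\mathbb{C}[H_-\backslash B_- N_+]_q$.

We also obtain the following result.

Corollary 5. The morphism $DS_q$ is injective.

Proof. This is a consequence of Corollary 4, Lemma 8 and of the fact already seen that each new term of the sequence $u_k$ (resp. $m_k$) gives a new variable $x_{-i}$ (resp. $x_i$) or $y_{-i}$ (resp. $y_i$), $i \in \mathbb{N}$, according to the parity of $k$. □

Corollary 4 also shows that $\mathbb{C}[H_-\backslash B_- N_+]_q$ is a flat deformation of the function algebra of the Poisson manifold $(H_-\backslash B_- N_+, P_\infty)$. Poisson relations on this manifold show that we obtain a quantum deformation. Moreover, the map
$$F : \mathbb{C}[[\lambda^{-1}]] \times \lambda^{-1}\mathbb{C}[[\lambda^{-1}]] \times N_+ \longrightarrow H_-\backslash B_- N_+, \qquad \big(u_{cl}(\lambda), v_{cl}(\lambda), n_+\big) \longmapsto \mathrm{cl}_{H_-}\!\left( \begin{pmatrix} 1 & v_{cl}(\lambda) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ u_{cl}(\lambda) & 1 \end{pmatrix} n_+ \right) \tag{82}$$
is a bijection, and the elements $\Sigma^+$ and $\Sigma^-$ correspond classically to the functions $a_{2,1}^{(1)}$ and $a_{1,2}^{(0)}$ on $N_+$. On the other hand, by virtue of Subsect. 4.1, the classical limits of $u_i$ and $m_i$ correspond to coordinate functions with generating functions $u_{cl}(\lambda)$ and $m_{cl}(\lambda)$. Note that Corollary 4 also implies the following lemma.

Lemma 10. There is a natural graded algebra embedding $\mathbb{C}[H_-\backslash B_-]_q \hookrightarrow \mathbb{C}[H_-\backslash B_- N_+]_q$. This embedding identifies the generators $u_i$ (resp. $m_i$), $i \in \mathbb{Z}$, of $\mathbb{C}[H_-\backslash B_-]_q$ with the ones of $\mathbb{C}[H_-\backslash B_- N_+]_q$. Moreover, we have the following commutative diagram, where all maps are graded algebra embeddings:
$$\begin{array}{ccc} \mathbb{C}[H_-\backslash B_-]_q & \hookrightarrow & \mathbb{C}[H_-\backslash B_- N_+]_q \\ {\scriptstyle DS_q}\downarrow & & \downarrow{\scriptstyle DS_q} \\ A_q & \hookrightarrow & \bar{A}_q. \end{array}$$
Proposition 11 together with Lemma 10 lead to the following result.

Corollary 6. There is a $U_q\widehat{\mathfrak{b}}_-$-module-algebra structure on $\mathbb{C}[H_-\backslash B_- N_+]_q$ given by:
$$f_+.u(\lambda) = -(q - 1)\, \lambda^{-1} u(\lambda)^2 - (q - q^{-1})\, u(\lambda)\, \Sigma^+, \tag{83}$$
$$f_-.u(\lambda) = (1 - q^{-1}) + (q - q^{-1})\, u(\lambda)\, \Sigma^-, \tag{84}$$
$$k_\pm.u(\lambda) = q^{\pm 1} u(\lambda), \tag{85}$$
$$f_+.m(\lambda) = 1 - q + (q - q^{-1})\, m(\lambda)\, \Sigma^+, \tag{86}$$
$$f_-.m(\lambda) = (1 - q^{-1})\, \lambda^{-1} m(\lambda)^2 - (q - q^{-1})\, m(\lambda)\, \Sigma^-, \tag{87}$$
$$k_\pm.m(\lambda) = q^{\mp 1} m(\lambda), \tag{88}$$
$$f_\varepsilon.\Sigma^{\varepsilon'} = \big[\Sigma^{\varepsilon}, \Sigma^{\varepsilon'}\big]_q. \tag{89}$$
The morphism $DS_q$ defined above is a $U_q\widehat{\mathfrak{b}}_-$-module-algebra morphism.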
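As a consistency check, the formulas for $f_\pm.m(\lambda)$ can be recovered from the defining relations of $\mathbb{C}[H_-\backslash B_- N_+]_q$. The sketch below reads the half screening charge relations as $m(\lambda)\Sigma^+ - q^{-1}\Sigma^+ m(\lambda) = 1 - q^{-1}$ and $m(\lambda)\Sigma^- - q\,\Sigma^- m(\lambda) = -(q-1)\lambda^{-1} m(\lambda)^2$, and assumes that the contributions $[x_i, m(\lambda)]_q$ and $[y_i, m(\lambda)]_q$ with $i \le 0$ in (72) and (73) vanish, since $DS_q(m_k)$ only involves sites $i > 0$.

```latex
% deg m = -1 and deg \Sigma^{\pm} = \pm 1, so by (66):
%   [\Sigma^+, m]_q = \Sigma^+ m - q^{-1} m \Sigma^+ ,   [\Sigma^-, m]_q = \Sigma^- m - q\, m \Sigma^- .
\begin{align*}
\Sigma^+ m(\lambda) &= q\, m(\lambda)\Sigma^+ - (q - 1)
   && \text{(relation with } \Sigma^+ \text{, multiplied by } q\text{)} \\
f_+.m(\lambda) = [\Sigma^+, m(\lambda)]_q
   &= q\, m(\lambda)\Sigma^+ - (q-1) - q^{-1} m(\lambda)\Sigma^+
    = 1 - q + (q - q^{-1})\, m(\lambda)\Sigma^+ , \\[4pt]
\Sigma^- m(\lambda) &= q^{-1} m(\lambda)\Sigma^- + (1 - q^{-1})\,\lambda^{-1} m(\lambda)^2
   && \text{(relation with } \Sigma^- \text{, divided by } q\text{)} \\
f_-.m(\lambda) = [\Sigma^-, m(\lambda)]_q
   &= (1 - q^{-1})\,\lambda^{-1} m(\lambda)^2 - (q - q^{-1})\, m(\lambda)\Sigma^- .
\end{align*}
```

Both results agree with the expressions stated for $f_+.m(\lambda)$ and $f_-.m(\lambda)$ in Corollary 6.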
Proof. This comes from the fact that the morphism $DS_q$ is injective and from the computation of $f_\pm.DS_q(x)$ for $x \in \mathbb{C}[H_-\backslash B_- N_+]_q$. For example, thanks to formula (72), the computation of $f_+.DS_q(u_k)$ leads to the computation of $\sum_{i=-\infty}^{0} [x_i, DS_q(u_k)]_q$. This sum is finite. So, recalling the involution $\varphi$ on $A_q$ defined in (54), we have:
$$\sum_{i=-\infty}^{0} \big[x_i, DS_q(u_k)\big]_q = \sum_{i=1}^{\infty} \varphi\big(\big[DS_q(m_k), y_i\big]_q\big) = \varphi\big(\big[DS_q(m_k), \Sigma^-\big]_q\big) = (\varphi \circ DS_q)\big([m_k, \Sigma^-]_q\big).$$
Then it suffices to use (80) to get the expression of $f_+.DS_q(u_k)$. □

5.3. Adjoint action of integrals of motion on $\bar{A}_q$. Let $I$ be an integral of motion. By using the definition seen in (18) of the adjoint action of $I$ on $A_q$ together with the equality between (16) and (17), it can be shown that there exists a unique homogeneous element $R_+(I) \in A_q[1]$ without constant term such that $\operatorname{ad}(I)(x_1) = T\big(R_+(I)\big) - R_+(I)$. For instance, if $I = I_1$ is the first integral of motion with respect to the basis $(I_k)$ of $\mathcal{I}$ seen in Proposition 3, then $R_+(I_1) = -y_0^{-1}$. It follows that for any integer $n$, we have $\operatorname{ad}(I)(x_1 + \cdots + x_n) = T^n\big(R_+(I)\big) - R_+(I)$. On the other hand, given the form taken by the $I_k$, we have $\operatorname{ad}(I) \circ T^{1/2} = T^{1/2} \circ \operatorname{ad}(I)$ for all $I \in \mathcal{I}$. So, there also exists $R_-(I) \in A_q[-1]$ with $R_-(I) = T^{1/2} R_+(I)$ such that $\operatorname{ad}(I)(y_1) = T\big(R_-(I)\big) - R_-(I)$. So, for all $n$, $\operatorname{ad}(I)(y_1 + \cdots + y_n) = T^n\big(R_-(I)\big) - R_-(I)$. This leads us to extend the derivation $\operatorname{ad}(I)$ to $\bar{A}_q$ as explained in the following proposition.

Proposition 12. Let $\operatorname{Der}(\bar{A}_q)$ be the Lie algebra of derivations on $\bar{A}_q$. For $I \in \mathcal{I}$, there is a unique derivation $\operatorname{ad}(I)$ on $\bar{A}_q$ which satisfies formula (18) if $x$ belongs to $A_q$ and $\operatorname{ad}(I)(\Sigma^\pm) = -R_\pm(I)$. Moreover, the kernel of the Lie algebra morphism
$$\operatorname{ad} : \mathcal{I} \longrightarrow \operatorname{Der}(\bar{A}_q), \qquad I \longmapsto \operatorname{ad}(I) \tag{90}$$
is $\mathbb{C}[q, q^{-1}]$, i.e., the one-dimensional Lie subalgebra of all constant integrals of motion. Its image is $\operatorname{Der}_{U_q\widehat{\mathfrak{b}}_-}(\bar{A}_q)$, the Lie subalgebra of all derivations which commute with the action of $U_q\widehat{\mathfrak{b}}_-$.

Proof. Let $I \in \mathcal{I}$. In order to prove the existence of $\operatorname{ad}(I)$ on $\bar{A}_q$, it is necessary to show some compatibility relations like:
$$\forall\, j \in \mathbb{Z}, \qquad \big[\operatorname{ad}(I)(x_j), \Sigma^+\big]_q + \big[x_j, -R_+\big]_q = \sum_{k=1}^{j} \operatorname{ad}(I)\big([x_j, x_k]_q\big). \tag{91}$$
But, according to Proposition 2, $\operatorname{ad}(I)$ is well defined on $A_q$. So, for any fixed integer $j$, we have:
$$\forall\, n \in \mathbb{N}, \qquad \big[\operatorname{ad}(I)(x_j), \Sigma_{+,n}\big]_q + \big[x_j, T^n(R_+) - R_+\big]_q = \sum_{k=1}^{j} \operatorname{ad}(I)\big([x_j, x_k]_q\big)$$
with $\Sigma_{+,n} = \sum_{k=1}^{n} x_k$. This equality leads to (91) by taking $n$ large enough. The other relations, except those coming from the quantum Serre relations between $\Sigma_\pm$ and $\Sigma_\mp$, can be proved in the same way. The uniqueness of $\mathrm{ad}(I)$ is obvious. To prove that $\mathrm{ad}(I)$ and $f_\pm$ commute, we set $C_\pm = \{x \in \bar{A}_q \mid \mathrm{ad}(I) \circ f_\pm(x) = f_\pm \circ \mathrm{ad}(I)(x)\}$. We remark that for all $x$ homogeneous with respect to $\deg$, we have $\deg \mathrm{ad}(I)(x) = \deg(x)$. Hence, $C_\pm$ is a graded subalgebra of $\bar{A}_q$. Then, computation shows that $A_q \cup \{\Sigma_+, \Sigma_-\} \subset C_\pm$. It follows that $C_\pm = \bar{A}_q$.

Conversely, let us fix $D \in \mathrm{Der}_{U_q b_-}(\bar{A}_q)$. Using the fact that the result is true at the classical level ([EFe]) and the fact that classical integrals of motion can be quantized, we first show the following result: for all $\delta \in \mathrm{Der}_{U_q b_-}(\bar{A}_q)$ and all $n \in \mathbb{N}$, there exist $I \in \mathcal{I}$ and $\delta' \in \mathrm{Der}_{U_q b_-}(\bar{A}_q)$ such that, for all $x \in A_q$,
\[ \delta(x) = \mathrm{ad}(I)(x) + (q - 1)^n \delta'(x). \tag{92} \]
Then, we can deduce from (92) and from the explicit form of the basis $(I_k)$ of $\mathcal{I}$ that (1) $D$ is a graded derivation (that is, if $x \in \bar{A}_q$ is homogeneous with respect to $\deg$, then $D(x)$ is also homogeneous with respect to $\deg$ and $\deg D(x) = \deg x$), (2) $[D, T^{1/2}] = 0$, or in other words, $D$ and $T^{1/2}$ commute, and (3) $A_q$ is invariant under $D$. Hence, $D$ is entirely determined by $D(x_0)$: the natural map coming from the foregoing,
\[ \mathrm{Der}_{U_q b_-}(\bar{A}_q) \longrightarrow \mathrm{Der}_{T^{1/2}}(A_q), \qquad \delta \longmapsto \delta|_{A_q} \tag{93} \]
is injective. Let us give some definitions. Let $V_p$ be the free submodule of $A_q$ of all homogeneous elements of degree $p$ with respect to the principal gradation $\deg_p$ (see Subsect. 3.1). Let $B_p$ be a basis for $V_p$. If there exists an integer $r$ such that $p = -2r + 1$, then $B_p$ can be chosen such that $\mathrm{ad}(I_r)(x_0)$ belongs to $B_p$. There are $N \in \mathbb{N}$ and $\alpha_p \in V_p$, $p \in \{-N, \ldots, N\}$, such that $D(x_0) = \sum_{p=-N}^{N} \alpha_p$. Let $p \in \{-N, \ldots, N\}$. By projecting (92) on $V_p$ with $\delta = D$ and $x = x_0$, we see that the valuations in $q - 1$ of all coefficients of $\alpha_p$ on the basis $B_p$ (except perhaps that of the element $\mathrm{ad}(I_r)(x_0)$ of $B_p$, if there is an integer $r$ such that $p = -2r + 1$) are arbitrarily large. Hence, $\alpha_p = 0$ or $\alpha_p$ is proportional to $\mathrm{ad}(I_r)(x_0)$. Thus, there is $I \in \mathcal{I}$ such that $D(x_0) = \mathrm{ad}(I)(x_0)$. Then, the injectivity of the map (93) shows that $D = \mathrm{ad}(I)$. □
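5.4. Existence of $H_n$. The aim of this section is to prove Proposition 6. In fact, we shall prove the existence of $H_n$ not only on $\mathbb{C}[H_- \backslash B_-]_q$ but also on $\mathbb{C}[H_- \backslash B_- N_+]_q$. This will imply Proposition 6.

Proposition 13. There is a commutative family of derivations $(H_n)_{n \in \mathbb{N}^*}$ on $\mathbb{C}[H_- \backslash B_- N_+]_q$ which quantizes the classical action by vector fields of $h_+$ on $H_- \backslash B_- N_+$ and which commutes with the action of $U_q b_-$ on $\mathbb{C}[H_- \backslash B_- N_+]_q$ (for the definition of $h_+$, see Subsect. 2.1). If $H(\mu) = \sum_{k=1}^{\infty} (-1)^k H_k \mu^{-k}$ denotes the generating function of $(H_n)_{n \in \mathbb{N}^*}$, then the derivations $H_n$ are defined by the formulas:
\[ H(\mu)\bigl(u(\lambda)\bigr) = \frac{1}{\lambda^{-1} - \mu^{-1}} \bigl(\lambda^{-1} u(\lambda) - \mu^{-1} u(\mu)\bigr) v(\mu) u(\lambda) - \frac{\mu^{-1}}{\lambda^{-1} - \mu^{-1}} \bigl(u(\lambda) - u(\mu)\bigr)\bigl(1 + v(\mu) u(\mu)\bigr), \tag{94} \]
\[ H(\mu)\bigl(m(\lambda)\bigr) = \frac{\mu^{-1}}{\lambda^{-1} - \mu^{-1}} \bigl(1 + m(\mu) w(\mu)\bigr)\bigl(m(\lambda) - m(\mu)\bigr) - \frac{1}{\lambda^{-1} - \mu^{-1}}\, m(\lambda) w(\mu) \bigl(\lambda^{-1} m(\lambda) - \mu^{-1} m(\mu)\bigr), \tag{95} \]
\[ H(\mu)(\Sigma_+) = -\mu^{-1} \bigl(u(\mu) + u(\mu) v(\mu) u(\mu)\bigr), \tag{96} \]
\[ H(\mu)(\Sigma_-) = v(\mu), \tag{97} \]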
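with $v(\mu) = -\bigl(u(\mu) + \mu\, m(\mu)^{-1}\bigr)^{-1}$ and $w(\mu) = -\bigl(m(\mu) + \mu\, u(\mu)^{-1}\bigr)^{-1}$.

5.5. The classical case. For $n \in \mathbb{N}$, set $h_n = \mathrm{diag}(\lambda^n, -\lambda^n)$. With the notation of Subsect. 5.2 and in particular of the map $F$ defined in (82), we show that the left translation action of $h_n$ on the generating series $u_{cl}(\lambda)$ and $v_{cl}(\lambda)$ of coordinate functions $u_i$ and $m_i$ is given by the formulas:
\[ h_n . u_{cl} = \bigl((1 + 2 u_{cl} v_{cl}) \lambda^n\bigr)_{\le} u_{cl} - 2 \bigl(u_{cl} (1 + u_{cl} v_{cl}) \lambda^n\bigr)_{\le} - 2 \bigl(v_{cl} \lambda^n\bigr)_{<} u_{cl}^2 + \bigl((1 + 2 u_{cl} v_{cl}) \lambda^n\bigr)_{\le} u_{cl}, \tag{98} \]
\[ h_n . v_{cl} = 2 \bigl(v_{cl} \lambda^n\bigr)_{<} (1 + 2 u_{cl} v_{cl}) - 2 \bigl((1 + 2 u_{cl} v_{cl}) \lambda^n\bigr)_{\le}, \tag{99} \]
where for $x \in \mathbb{C}((\lambda^{-1}))$, $x_{\le}$ (resp. $x_{<}$) denotes the part of $x$ in $\mathbb{C}[[\lambda^{-1}]]$ (resp. $\lambda^{-1} \mathbb{C}[[\lambda^{-1}]]$). Let $h(\mu)$ be the generating series of the $h_n$. From (98) and (99), it is possible to compute the action of $\frac{1}{2} h(\mu)$ on $u_{cl}(\lambda)$ and $m_{cl}(\lambda)$. It can be checked that these relations are precisely the ones we get from (94) and (95) when $q \to 1$. In the same way, with the identification of $\Sigma_+$ and $\Sigma_-$ with $1 \otimes a^{(1)}_{2,1}$ and $1 \otimes a^{(0)}_{1,2}$, it can be shown that we have $\frac{1}{2} h(\mu).\Sigma_+ = -\mu^{-1} \bigl(u(\mu) + u(\mu)^2 v(\mu)\bigr)$ and $\frac{1}{2} h(\mu).\Sigma_- = v(\mu)$. Hence, it is clear that if the derivation $H_n$ exists, then it deforms the classical action of $\frac{1}{2} h_n$ on the homogeneous space $H_- \backslash B_- N_+$.

5.6. The algebra $U_{\lambda,\mu}$. To obtain the algebra $U_{\lambda,\mu}$, we just need to replace the generating series $u(\lambda)$ and $u(\mu)$ by variables $u_\lambda$ and $u_\mu$.

Definition. We denote by $U_{\lambda,\mu}$ the algebra over the ground ring $\mathbb{C}[\lambda^{-1}, \mu^{-1}]$ given by generators $u_\lambda$ and $u_\mu$ and the relation:
\[ \bigl(\lambda^{-1} u_\lambda - \mu^{-1} u_\mu\bigr)\bigl(u_\lambda - u_\mu\bigr) = q \bigl(u_\lambda - u_\mu\bigr)\bigl(\lambda^{-1} u_\lambda - \mu^{-1} u_\mu\bigr). \]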
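Note that $U_{\lambda,\mu}$ is a graded algebra with respect to the gradation given by $\deg u_\lambda = \deg u_\mu = 1$. It can be proved that $U_{\lambda,\mu}$ has neither torsion nor zero divisors and that a basis of $U_{\lambda,\mu}$ is given by the family $(u_\lambda^\alpha u_\mu^\beta)$ with $(\alpha, \beta) \in \mathbb{N}^2$. Thanks to the definition of $U_{\lambda,\mu}$ and $\mathbb{C}[S_\infty \backslash B_-]_q$, there is an algebra morphism:
\[ U_{\lambda,\mu} \longrightarrow \mathbb{C}[S_\infty \backslash B_-]_q[[\lambda^{-1}, \mu^{-1}]], \qquad u_\lambda \longmapsto u(\lambda), \quad u_\mu \longmapsto u(\mu). \tag{100} \]
It can be shown that this morphism is injective. There are unique coefficients $c^{k,l}_{\alpha,\beta}$ satisfying
\[ \forall\, k, l \in \mathbb{N}, \qquad (u_\mu)^k (u_\lambda)^l = \sum_{\alpha + \beta = k + l} c^{k,l}_{\alpha,\beta}\, (u_\lambda)^\alpha (u_\mu)^\beta. \tag{101} \]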
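We give below formulas which deal with the cases $k$ or $l$ equal to 1 or 2. These formulas will be useful because the relations between $u_\lambda$ and $u_\mu$ are quadratic, as is the right side of (94).

Proposition 14. There are coefficients $c_{\alpha,\beta}$, $d_{\alpha,\beta}$, $c^{(2)}_{\alpha,\beta}$, $d^{(2)}_{\alpha,\beta}$ such that, for all $n \in \mathbb{N}$,
\[ u_\mu (u_\lambda)^n = \sum_{\alpha + \beta = n + 1} c_{\alpha,\beta}\, (u_\lambda)^\alpha (u_\mu)^\beta, \tag{102} \]
\[ (u_\mu)^n u_\lambda = \sum_{\alpha + \beta = n + 1} d_{\alpha,\beta}\, (u_\lambda)^\alpha (u_\mu)^\beta, \tag{103} \]
\[ (u_\mu)^2 (u_\lambda)^n = \sum_{\alpha + \beta = n + 2} c^{(2)}_{\alpha,\beta}\, (u_\lambda)^\alpha (u_\mu)^\beta, \tag{104} \]
\[ (u_\mu)^n (u_\lambda)^2 = \sum_{\alpha + \beta = n + 2} d^{(2)}_{\alpha,\beta}\, (u_\lambda)^\alpha (u_\mu)^\beta. \tag{105} \]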
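For all $\alpha \ge 0$ and $\beta \ge 1$, we have:
\[ c_{\alpha,0} = \frac{(q^{\alpha-1} - 1)\lambda^{-1}}{q^{\alpha-1}\lambda^{-1} - \mu^{-1}}, \tag{106} \]
\[ \forall\, \beta \ne 0, \qquad c_{\alpha,\beta} = q^{\alpha-1}\, \frac{(\lambda^{-1} - \mu^{-1})(\lambda^{-1} - q\mu^{-1})\, \mu^{-(\beta-1)} \prod_{j=\alpha+1}^{\alpha+\beta-1} (q^j - 1)}{\prod_{j=\alpha-1}^{\alpha+\beta-1} (q^j \lambda^{-1} - \mu^{-1})}. \tag{107} \]
Also, for all $\alpha \ge 0$ and $\beta \ge 2$,
\[ c^{(2)}_{\alpha,0} = \frac{(q^{\alpha-2} - 1)(q^{\alpha-1} - 1)\lambda^{-2}}{(q^{\alpha-2}\lambda^{-1} - \mu^{-1})(q^{\alpha-1}\lambda^{-1} - \mu^{-1})}, \tag{108} \]
\[ c^{(2)}_{\alpha,1} = q^{\alpha-2} (q^{\alpha-1} - 1)[2]\, \frac{(\lambda^{-1} - \mu^{-1})(\lambda^{-1} - q\mu^{-1})\lambda^{-1}}{(q^{\alpha-2}\lambda^{-1} - \mu^{-1})(q^{\alpha-1}\lambda^{-1} - \mu^{-1})(q^{\alpha}\lambda^{-1} - \mu^{-1})}, \tag{109} \]
\[ c^{(2)}_{\alpha,\beta} = q^{\alpha-2}\, \frac{(\lambda^{-1} - \mu^{-1})(\lambda^{-1} - q\mu^{-1})\, \mu^{-(\beta-2)}\, P^{(2)}_{\alpha,\beta}(\lambda^{-1}, \mu^{-1}) \prod_{j=\alpha+1}^{\alpha+\beta-2} (q^j - 1)}{\prod_{j=\alpha-2}^{\alpha+\beta-1} (q^j \lambda^{-1} - \mu^{-1})} \tag{110} \]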
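with
\[ P^{(2)}_{\alpha,\beta}(\lambda^{-1}, \mu^{-1}) = q[\alpha + \beta - 1](q\lambda^{-1} - \mu^{-1})(q^{\alpha-2}\lambda^{-1} - \mu^{-1}) - [\alpha](\lambda^{-1} - q\mu^{-1})(q^{\alpha+\beta-1}\lambda^{-1} - \mu^{-1}). \tag{111} \]
The coefficients $d_{\alpha,\beta}$ (resp. $d^{(2)}_{\alpha,\beta}$) are obtained from $c_{\alpha,\beta}$ (resp. $c^{(2)}_{\alpha,\beta}$) by: for all $\alpha, \beta$,
\[ \lambda^{-(\beta-1)} c_{\alpha,\beta} = \mu^{-(\beta-1)} d_{\beta,\alpha}, \tag{112} \]
\[ \lambda^{-(\beta-2)} c^{(2)}_{\alpha,\beta} = \mu^{-(\beta-2)} d^{(2)}_{\beta,\alpha}. \tag{113} \]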
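The reordering formula (102), with the coefficients (106) and (107), can be sanity-checked by machine for small $n$. The following is a minimal sketch and not part of the original argument. It assumes the change of variables $v = u_\lambda - u_\mu$, $v' = \lambda^{-1}u_\lambda - \mu^{-1}u_\mu$ (written `v` and `w` in the code), which satisfy $v'v = q\,vv'$ by the defining relation of $U_{\lambda,\mu}$. The names `mul`, `c_coeff`, `check_102` and the numerical values chosen for $q$, $\lambda^{-1}$, $\mu^{-1}$ are illustrative assumptions; the check is carried out at those specialised rational values only.

```python
from fractions import Fraction

# An element of the quantum plane is a dict {(i, j): coeff} standing for
# sum(coeff * v**i * w**j) in normal order, with the rule w*v = q*v*w.
# Here w plays the role of v' = a*u_l - b*u_m, and v = u_l - u_m,
# where a = lambda^{-1} and b = mu^{-1}.

def mul(x, y, q):
    out = {}
    for (i, j), cx in x.items():
        for (k, l), cy in y.items():
            # moving w**j to the right of v**k costs a factor q**(j*k)
            key = (i + k, j + l)
            out[key] = out.get(key, Fraction(0)) + cx * cy * q ** (j * k)
    return {key: c for key, c in out.items() if c != 0}

def add(x, y):
    out = dict(x)
    for key, c in y.items():
        out[key] = out.get(key, Fraction(0)) + c
    return {key: c for key, c in out.items() if c != 0}

def scale(x, s):
    return {key: c * s for key, c in x.items() if c * s != 0}

def power(x, n, q):
    result = {(0, 0): Fraction(1)}
    for _ in range(n):
        result = mul(result, x, q)
    return result

def generators(a, b):
    # Inverts the linear change of variables: (a - b)*u_l = w - b*v,
    # (a - b)*u_m = w - a*v.
    u_l = {(0, 1): 1 / (a - b), (1, 0): -b / (a - b)}
    u_m = {(0, 1): 1 / (a - b), (1, 0): -a / (a - b)}
    return u_l, u_m

def c_coeff(alpha, beta, q, a, b):
    # Formula (106) for beta == 0, formula (107) otherwise.
    if beta == 0:
        return (q ** (alpha - 1) - 1) * a / (q ** (alpha - 1) * a - b)
    num = q ** (alpha - 1) * (a - b) * (a - q * b) * b ** (beta - 1)
    for j in range(alpha + 1, alpha + beta):
        num *= q ** j - 1
    den = Fraction(1)
    for j in range(alpha - 1, alpha + beta):
        den *= q ** j * a - b
    return num / den

def check_102(n, q, a, b):
    # Compares u_m * u_l**n with sum of c_{alpha,beta} u_l**alpha u_m**beta.
    u_l, u_m = generators(a, b)
    lhs = mul(u_m, power(u_l, n, q), q)
    rhs = {}
    for alpha in range(n + 2):
        beta = n + 1 - alpha
        term = mul(power(u_l, alpha, q), power(u_m, beta, q), q)
        rhs = add(rhs, scale(term, c_coeff(alpha, beta, q, a, b)))
    return lhs == rhs
```

For instance, `check_102(1, Fraction(2), Fraction(3), Fraction(5))` compares both sides of (102) for $n = 1$ at $q = 2$, $\lambda^{-1} = 3$, $\mu^{-1} = 5$. Exact rational arithmetic is used so that equality of the two normal-ordered expansions is tested without rounding.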
Proof. The coefficients $c^{(2)}_{\alpha,\beta}$ and $d^{(2)}_{\alpha,\beta}$ can be obtained by computation from $c_{\alpha,\beta}$ and $d_{\alpha,\beta}$. To prove (106) and (107), we define $c_{\alpha,0}$ and $c_{\alpha,\beta}$ according to formulas (106) and (107), and we try to prove (102). To that end, we express $u_\lambda$ and $u_\mu$ in terms of the variables $v := u_\lambda - u_\mu$ and $v' := \lambda^{-1} u_\lambda - \mu^{-1} u_\mu$. These variables being $q$-commuting, we expand the two expressions $\sum_{\alpha+\beta=n+1} c_{\alpha,\beta} (u_\lambda)^\alpha (u_\mu)^\beta$ and $u_\mu (u_\lambda)^n$ as sums of terms in $v^i v'^j$. Then, we fix $i$ and $j$ and identify the coefficients of $v^i v'^j$. This leads to proving an equality between polynomials in $\lambda^{-1}/\mu^{-1}$, which reduces to a relation between $q$-integers. In the same manner, we prove (103). □

5.7. Proof of Proposition 13. First, we define derivations $H_n$ on the free algebra $A$ generated by the elements $u_i$, $m_i$, $i > 0$, $\Sigma_+$ and $\Sigma_-$. To prove that the $H_n$ give derivations on $\mathbb{C}[H_- \backslash B_- N_+]_q$, we have to prove several relations. The most complicated one is
\[ \pi \circ H(\nu).\bigl(\text{Relation between } u(\lambda) \text{ and } u(\mu)\bigr) = 0, \tag{114} \]
where $\pi$ denotes the projection of $A$ onto $\mathbb{C}[H_- \backslash B_- N_+]_q$. To prove (114), we decompose $H(\nu).u(\lambda)$ and $H(\nu).u(\mu)$ according to the relation:
\[ H(\mu).u(\lambda) = \mu^{-1}\bigl(u(\mu) - u(\lambda)\bigr) + \sum_{k=0}^{+\infty} (-1)^{k+1} q^{\{k+1\}} \times \Bigl(\lambda^{-1} u(\lambda) u(\mu)^k u(\lambda) + \mu^{-1} u(\mu)^{k+2} - \mu^{-1} u(\mu)^{k+1} u(\lambda) - \mu^{-1} u(\lambda) u(\mu)^{k+1}\Bigr) m(\mu)^{k+1} \mu^{-(k+1)} \tag{115} \]
which comes from (94) and (77) by decomposing $v(\mu)$ into a generating series in $u(\mu)^k m(\mu)^k \mu^{-k}$. Thus, the left side of (114) is of the form $\sum_{k=0}^{+\infty} P_k(u(\lambda), u(\mu), u(\nu))\, m(\nu)^k \nu^{-k}$, where $P_k(u(\lambda), u(\mu), u(\nu))$ is a polynomial in the non-commutative variables $u(\lambda)$, $u(\mu)$ and $u(\nu)$. Let $k \in \mathbb{N}$. Using the fact that the relations between $u(\lambda)$ and $u(\mu)$ are quadratic and that the only terms in $u$ which appear in (94) are also quadratic, it is possible to reorganize the terms of the polynomial $P_k(u(\lambda), u(\mu), u(\nu))$ by using Proposition 14 and the morphism defined in (100) so as to obtain a sum of monomials of the form $(u(\lambda))^\alpha (u(\mu))^\beta (u(\nu))^\gamma$. Then, we fix $\alpha$, $\beta$, $\gamma$ and we show that the coefficient of $(u(\lambda))^\alpha (u(\mu))^\beta (u(\nu))^\gamma$ in $P_k$ is equal to 0. Thus, $P_k = 0$, and (114) is true. In the same way, we prove all other relations.
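Thus, $H_n$ exists. To prove the commutativity, we deduce from formulas (94) and (95) that
\[ H(\mu)\bigl(v(\lambda)\bigr) = \frac{1}{\lambda^{-1} - \mu^{-1}} \bigl(\mu^{-1} v(\lambda) - \lambda^{-1} v(\mu)\bigr) + \frac{1}{\lambda^{-1} - \mu^{-1}}\, v(\lambda) \bigl(\mu^{-1} u(\mu) - \lambda^{-1} u(\lambda)\bigr) v(\mu) + \frac{1}{\lambda^{-1} - \mu^{-1}}\, v(\mu) \bigl(\mu^{-1} u(\mu) - \lambda^{-1} u(\lambda)\bigr) v(\lambda). \tag{116} \]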
Then, the computation shows that $H(\mu) \circ H(\nu) = H(\nu) \circ H(\mu)$ (for this, it is not necessary to expand the generating series). On the other hand, with the help of the formulas coming from Corollary 6, the computation shows that $H(\mu) \circ f_\pm = f_\pm \circ H(\mu)$. This completes the proof of Proposition 13 together with Proposition 6.

5.8. End of the proof of Theorem 1. We are going to prove that for any integer $n$, $\mathrm{ad}(I_n) \circ DS_q = DS_q \circ H_n$. In other words, $DS_q$ sets up a Drinfeld–Sokolov correspondence for the extended phase space $\bar{A}_q$ and the quantum homogeneous space $\mathbb{C}[H_- \backslash B_-]_q$. Thanks to Lemma 10, the result will follow. To simplify, we write $U_q = \mathbb{C}[H_- \backslash B_-]_q$ and $\bar{U}_q = \mathbb{C}[H_- \backslash B_- N_+]_q$. In the remainder of this article, we shall identify the elements of $\bar{U}_q$ with their images in $\bar{A}_q$ under the algebra monomorphism $DS_q$. The following lemma will be useful.

Lemma 11. Let $D_1$ and $D_2$ be two derivations defined on $A_q$ (resp. $\bar{A}_q$) such that $D_1(x) = D_2(x)$ for all $x \in U_q$ (resp. $\bar{U}_q$). Then, $D_1 = D_2$.

Proof. Let us denote by $C$ the subalgebra of $A_q$ (resp. $\bar{A}_q$) of all elements $x$ such that $D_1(x) = D_2(x)$. If $x \in C$ is invertible in $A_q$ (resp. $\bar{A}_q$), then $x^{-1} \in C$. By induction, using the explicit forms of $u_n$ and $m_n$ given in (52) and (56), we show that for all $n \in \mathbb{Z}$, $x_n^{\pm 1}, y_n^{\pm 1} \in C$. For example, $y_0^{-1} \in C$, $x_0^{-1} = q^{-1} u_2 y_0^2 \in C$. Hence, we get the result. □

Let $n \in \mathbb{N}^*$. Then $H_n$ is a derivation on $\bar{U}_q \subset \bar{A}_q$. We have to prove that $H_n$ extends as a derivation on $\bar{A}_q$ and that $H_n = \mathrm{ad}(I_n)$.

5.8.1. First, let us assume that $H_n$ has an extension to $A_q$. Then, $H_n$ also has an extension to $\bar{A}_q$. Moreover, thanks to the definition of $H_n$ together with the relations (27) and (28), which give expressions for $u(\lambda)$ and $m(\lambda)$ as quantum continued fractions, we show that the image of $\bar{U}_q$ under $T^{-1/2}$ is generated by $\bar{U}_q$ and $u_1^{-1}$, and that $H_n \circ T^{-1/2}(x) = T^{-1/2} \circ H_n(x)$ for all $x \in \bar{U}_q$. Lemma 11 ensures that this relation is also true on $\bar{A}_q$.
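The step in the proof of Lemma 11 asserting that an invertible $x \in C$ has $x^{-1} \in C$ rests on the identity $D(x^{-1}) = -x^{-1} D(x)\, x^{-1}$, valid for any derivation $D$ of an associative algebra. As a small illustration outside the setting of $A_q$ (an assumption made purely for concreteness), the sketch below uses inner derivations $\mathrm{ad}(a) = [a, \cdot\,]$ of $2 \times 2$ rational matrices. The helper names and the matrices `A`, `X` are hypothetical. The derivations `D1 = ad(A)` and `D2 = ad(A + X)` are different, agree on `X`, and therefore agree on the inverse of `X`.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def neg(x):
    return [[-x[i][j] for j in range(2)] for i in range(2)]

def inverse(x):
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]

def ad(a):
    # Inner derivation x -> a*x - x*a of the matrix algebra.
    return lambda x: sub(mul(a, x), mul(x, a))

def inverse_rule_holds(d, x):
    # Checks d(x^{-1}) == -x^{-1} d(x) x^{-1}.
    x_inv = inverse(x)
    return d(x_inv) == neg(mul(mul(x_inv, d(x)), x_inv))

A = mat([[1, 2], [3, 4]])
X = mat([[2, 1], [1, 1]])
D1 = ad(A)
D2 = ad(add(A, X))  # differs from D1 by ad(X), which kills X and X^{-1}
X_INV = inverse(X)
```

Here `D1(X) == D2(X)` because `ad(X)(X)` vanishes, and the inverse rule then forces `D1(X_INV) == D2(X_INV)`, which mirrors how the subalgebra $C$ of Lemma 11 is closed under taking inverses.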
On the other hand, by computations, we show that for all $x \in U_q$, $H_n \circ \varphi(x) = \varphi \circ H_n(x)$, where $\varphi$ is the involution on $A_q$ defined in (54). Using Lemma 11 again, this last equality can be extended to $A_q$. So, by using the relation $\varphi \circ T^{1/2} = T^{-1/2} \circ \varphi$, we deduce that $H_n \circ T^{1/2}(x) = T^{1/2} \circ H_n(x)$ for all $x \in A_q$. It can be shown that this relation is also true for $x = \Sigma_\pm$. It follows that $H_n \circ T^{\pm 1/2} = T^{\pm 1/2} \circ H_n$ on $\bar{A}_q$. On the other hand, by virtue of Proposition 13, we have $H_n \circ f_\pm(y_0^{-1}) = f_\pm \circ H_n(y_0^{-1})$ for $u_1 = y_0^{-1}$. From the
equality $f_\pm \circ T^{-1/2} = T^{-1/2} \circ f_\mp$, we deduce that $H_n \circ f_\pm(x) = f_\pm \circ H_n(x)$ for all $x \in B_q$, where $B_q$ denotes the subalgebra of $A_q$ generated by $x_i^{-1}$ and $y_i^{-1}$, for $i \in \mathbb{Z}$. Then, the $U_q b_-$-module-algebra structure of $\bar{A}_q$ implies that $H_n \circ f_\pm(x) = f_\pm \circ H_n(x)$ for all $x \in A_q$ and also on $\bar{A}_q$, since this equality is also true if $x = \Sigma_\varepsilon$, $\varepsilon \in \{+, -\}$. Thus, $H_n \in \mathrm{Der}_{U_q b_-}(\bar{A}_q)$ and by Proposition 12, we see that $H_n$ is a linear combination of the $\mathrm{ad}(I_k)$. But the same proposition also shows that there is a gradation $\deg$ on $\mathrm{Der}_{U_q b_-}(\bar{A}_q)$ given by $\deg \delta = n$ for $\delta \in \mathrm{Der}_{U_q b_-}(\bar{A}_q)$ if there is an element $\alpha \in \bar{A}_q$, homogeneous with respect to the principal gradation $\deg_p$ on $\bar{A}_q$ (defined on $A_q$ in Subsect. 3.1 and extended to $\bar{A}_q$ by $\deg_p \Sigma_\pm = 1$), such that $\delta(\alpha)$ is also homogeneous with respect to $\deg_p$ and $\deg_p \delta(\alpha) = n + \deg_p(\alpha)$. Note that if $\delta$ is homogeneous, this last property holds not only for one special homogeneous element $\alpha$ but also for all elements $x$ of $\bar{A}_q$ that are homogeneous with respect to $\deg_p$. Now, from the definition of $H_n$, it is easy to see that $\deg H_n = \deg I_n$. Then, the computation of both $H_n(y_0^{-1})$ and $\mathrm{ad}(I_n)(y_0^{-1})$ on the basis element $y_{-k}^{-1} x_{-k+1}^{-2} y_{-k+1}^{-2} \cdots x_0^{-2} y_0^{-2}$ or $x_{-k}^{-1} y_{-k}^{-2} \cdots x_0^{-2} y_0^{-2}$ of the basis $\bigl(\prod x_i^{\alpha_i} y_i^{\beta_i}\bigr)$ (with $k$ such that $n = 2k$ or $n = 2k + 1$ according to the parity of $n$) shows that $H_n = \mathrm{ad}(I_n)$. Thus, to conclude, it suffices to extend $H_n$ to $\bar{A}_q$.
5.8.2. Proof of the existence of an extension. In the classical case, we use the fact that the classical limit $U_{cl}$ of $U_q$ possesses the same field of fractions $K$ as $A_{cl}$ to extend $H_{n,cl}$ to a derivation of $K$. The extension is unique. So, the relation $H_{n,cl} \circ T^{-1/2} = T^{-1/2} \circ H_{n,cl}$, true on $A_{cl}$, is also true on $K$. The same argument as above with the involution $\varphi$ shows that $H_{n,cl} \circ T^{\pm 1/2} = T^{\pm 1/2} \circ H_{n,cl}$. But $H_{n,cl}(y_0^{-1}) = H_{n,cl}(u_1) \in U_{cl} \subset A_{cl}$. Thus, $A_{cl}$ is invariant under $H_{n,cl}$. Hence, we get the result.

In the quantum case, it is a little bit more complicated. The problem comes from the fact that it is not obvious that the non-zero elements of $A_q$, as well as those of $\mathbb{C}[H_- \backslash B_-]_q$, satisfy the Ore conditions. Nevertheless, we show that this is true when $q$ is a formal variable "close to 1". For that, we develop the notion of extended Ore conditions.

Definition. Let $A$ be an algebra without zero divisors over a field $k$ and $(A[[t]], *)$ a formal deformation of the multiplication on $A$. For all $n$, we define $\pi_n$ to be the natural projection of $A[[t]]$ onto $A_n := A[[t]]/(t^n)$. A multiplicative set $S$ in $A[[t]]$ is said to satisfy the extended Ore conditions if, for all $n$, $\pi_n(S)$ satisfies the Ore conditions in $A_n$ equipped with the natural non-commutative product induced by $*$.

If $S$ is a multiplicative set which satisfies the extended Ore conditions in $A[[t]]$, then there are natural morphisms $(A_n)_{S_n} \to (A_p)_{S_p}$ for $n > p$ (with $S_n = \pi_n(S)$). We denote by $A[[t]]_S$ the projective limit of the $(A_n)_{S_n}$.

Examples. 1. Let $B = \mathbb{C}[X_i^{-1}, Y_i^{-1}, i \in \mathbb{Z}]$, and let $K$ be the field of fractions of $B$. Let us consider the isomorphism of free modules:
\[ B[[q - 1]] \longrightarrow \widehat{B}_q, \qquad \prod_{i=1}^{\infty} X_i^{-\alpha_i} \prod_{j=1}^{\infty} Y_j^{-\beta_j} \longmapsto \prod_{i=1}^{\infty} x_i^{-\alpha_i} \prod_{j=1}^{\infty} y_j^{-\beta_j}, \]
where $(\alpha_i)$ and $(\beta_j)$ are two almost zero sequences in $\mathbb{N}^{\mathbb{Z}}$ and $\widehat{B}_q$ is the $(q - 1)$-adic completion of the subalgebra $B_q$ aforementioned. This isomorphism leads to a formal
deformation $*'$ of the multiplication on $B[[q - 1]]$. We show that the multiplicative set $S$ of all elements non-divisible by $q - 1$ satisfies the extended Ore conditions. Moreover, $*'$ is a star-product, i.e., there are bidifferential operators $B_n$ such that $*' = \sum_n B_n (q - 1)^n$. Hence, we get a non-commutative structure on $(K[[q - 1]], *')$ which contains $(B[[q - 1]], *')$, and it can be proved that $(B[[q - 1]]_S, *') \simeq (K[[q - 1]], *')$.

2. One can also define a structure of non-commutative algebra $(\mathbb{C}[U_i, M_i, i > 0][[q - 1]], *)$ using the isomorphism of free modules:
\[ \mathbb{C}[U_i, M_i, i > 0][[q - 1]] \longrightarrow \widehat{U}_q, \qquad \prod_{i=1}^{\infty} U_i^{\alpha_i} M_i^{\beta_i} \longmapsto \prod_{i=1}^{\infty} u_i^{\alpha_i} m_i^{\beta_i}, \]
where $\widehat{U}_q$ denotes the $(q - 1)$-adic completion of $U_q$. Unlike the previous case, it is not so easy to check that the non-commutative product $*$ is a star product. So, there is a priori no reason for $(\mathbb{C}(U_i, M_i, i > 0)[[q - 1]], *)$ to exist. However, we prove the following result.

Lemma 12. For all $x \in A_q$, there is a monomial $\omega$ in $x_i^{-1}$, $y_i^{-1}$ such that $\omega \in U_q$ and $\omega x \in U_q$.

Proof. It suffices to prove the lemma for $x = x_i^{-1}$ or $x = y_i^{-1}$ with $i \in \mathbb{Z}$. Thanks to the symmetry relation between $u_n$ and $m_n$, i.e., $m_n = \varphi(u_n)$ and $u_n \in \mathbb{C}[x_j^{-1}, y_j^{-1}, j < 0]_q$, we can assume that $i < 0$. Then, we prove the result by induction. For example, $1 . y_0^{-1} = u_1$, $q^{-1} y_0^{-2} x_0^{-1} = u_2$ and $y_0^{-1} = u_1$, and so on. □

Lemma 12 implies that the multiplicative set $S$ of all elements in $\mathbb{C}[U_i, M_i, i > 0][[q - 1]]$ which are non-divisible by $q - 1$ satisfies the extended Ore conditions and that we have an isomorphism $(\mathbb{C}[U_i, M_i, i > 0][[q - 1]]_S, *) \simeq (B[[q - 1]]_S, *')$. Now, to conclude, we say that the $(q - 1)$-adic completion of $H_n$ defined on $\mathbb{C}[U_i, M_i, i > 0][[q - 1]]$ naturally has an extension to $(K[[q - 1]], *')$. Moreover, this extension is unique. The same arguments as above with the involution $\varphi$ and the half-translation automorphism $T^{\pm 1/2}$ show that $H_n$ and $T^{\pm 1/2}$ commute on $K[[q - 1]]$. But $H_n(y_0^{-1}) \in A_q$. It follows that $A_q$ is invariant under $H_n$.

6. Conclusion and Outlooks

First of all, it would be interesting to see whether it is possible to extend our result to the more general case of an arbitrary non-twisted Lie algebra and to study other possible models of discretization proposed by Enriquez and Feigin [EFe]. It would also be interesting to study in detail the case when $q$ is a root of unity.

6.1. Affine Poisson homogeneous space. Theorem 1 could suggest, although it remains to be proven, that there is a general Drinfeld–Sokolov correspondence for the discrete Toda theory and that the homogeneous space of the correspondence is a Poisson homogeneous space equipped with a Poisson structure induced by a Poisson bivector $\pi$ of the form $\pi = r^L - r'^R$, where $r$ and $r'$ are two r-matrices such that their Schouten brackets $[r, r]$ and $[r', r']$ are equal and invariant under the adjoint action of the Lie group $G$ on $\bigwedge^3 \mathrm{Lie}(G)$, and where $r^L$ (resp. $r'^R$) is the left (resp. right) translation of $r$ (resp. $r'$)
on $G$. A group $G$ endowed with such a Poisson structure is a particular case of an affine Poisson homogeneous space (APHS), according to the terminology introduced by Dazord and Sondaz [DaSo] (see also [L, Ko]). By definition, an APHS is a Poisson manifold which is a principal homogeneous space under the action of a Poisson-Lie group; in that case, there are two commuting actions on $G$ by Poisson-Lie groups.

6.2. Links with Parmentier's work. Our method of quantizing the Poisson manifold $H_- \backslash B_-$ equipped with the Poisson structure induced by the field of bivectors $P_\infty = r^L - r_\infty^R$ (which is truly a quotient of an APHS) relies on the study of the classical case and on the fact that the phase space of the discrete sine-Gordon system had a natural quantization. But it is perhaps not the easiest way to quantize $(H_- \backslash B_-, P_\infty)$. Indeed, in the case we dealt with, $r$ denotes the standard r-matrix and $r_\infty$ denotes the r-matrix corresponding to Drinfeld's new realizations. So, according to Parmentier's works, to get a quantization of the APHS $G$ with Poisson structure given by the field of bivectors $P_\infty$, it suffices to have a twist linking the two Hopf algebra structures $(U_q g, \Delta)$ and $(U_q g, \Delta_{nr})$, where in the first case the comultiplication is the "Drinfeld-Jimbo" comultiplication, and in the second case it is the one corresponding to Drinfeld's new realizations. But such a twist appears in the paper [KT]. We need to apply this method and to investigate further how the derivations $H_n$ appear in it. We plan to study this question in the future.

6.3. The continuous case. It would also be interesting to see whether it is possible to deduce from our results solutions to problems of continuous Toda theory, to compute integrals of motion explicitly, to quantize in terms of the Vertex Operator Algebra the Vertex Poisson Algebra shown by Enriquez and Frenkel on homogeneous spaces [EFr2], and to obtain a quantum version of the Drinfeld–Sokolov correspondence in terms of the V.O.A. in the continuous case.

6.4. The Drinfeld–Sokolov reduction. Finally, we indicate that there exists another correspondence, close to the one we discussed here, which is called the Drinfeld–Sokolov reduction [DS]. This correspondence allows us to construct W-algebras from Kac–Moody algebras. It is a Poisson isomorphism between the manifold of scalar differential operators of order $n$ with the second Gelfand-Dickey bracket on the one hand and a Hamiltonian reduction of the manifold of matrix differential operators of order 1, viewed as a subspace of $sl_n^*$ with the Kirillov-Kostant bracket, on the other hand. The quantization of this correspondence is studied in [FF1]. A q-deformed version of this correspondence, in which manifolds of differential operators are replaced by manifolds of q-difference operators, is proposed in [FRS, SS]. Quantization of this correspondence leads to the q-deformed W-algebra.

Acknowledgement. I am very grateful to my teacher Professor Benjamin Enriquez for his help and support during the preparation of this paper. I would also like to thank Professor Anthony Joseph and Professor Joseph Bernstein for the hospitality of the Weizmann Institute and the Tel-Aviv University.
References
[BD] Beilinson, A.A., Drinfeld, V.G.: Chiral algebras, preprint.
[BMP] Bouwknegt, P., McCarthy, J., Pilch, K.: Quantum group structure in the Fock space resolutions of sl(n) representations. Commun. Math. Phys. 131, no. 1, 125–155 (1990)
[CP] Chari, V., Pressley, A.N.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
[DaSo] Dazord, P., Sondaz, D.: Groupes de Poisson affines. Math. Sci. Res. Inst. Publ., 20. New York: Springer, 1991
[D] Dixmier, J.: Algèbres enveloppantes. Paris: Gauthier-Villars, 1974
[DS] Drinfeld, V.G., Sokolov, V.V.: Lie algebras and equations of Korteweg–de Vries type. Sov. Math. Dokl. 23, 457–62 (1981). Translated in English in J. Sov. Math. 30, 1975–2035 (1985)
[EFe] Enriquez, B., Feigin, B.L.: Integrals of motion of classical lattice sine-Gordon system. Theor. Math. Phys. 103, 738–756 (1995)
[EFr1] Enriquez, B., Frenkel, E.: Equivalence of two approaches to integrable hierarchies of KdV type. Commun. Math. Phys. 185, no. 1, 211–230 (1997)
[EFr2] Enriquez, B., Frenkel, E.: Geometric interpretation of the Poisson structure in affine Toda field theories. Duke Math. J. 92, no. 3, 459–495 (1998)
[F] Frenkel, E.: Five lectures on soliton equations. Surv. Differ. Geom. IV, Boston, MA: Int. Press, 1998, pp. 131–180
[FF1] Feigin, B.L., Frenkel, E.: Quantization of the Drinfeld–Sokolov reduction. Phys. Lett. B 246, no. 1–2, 75–81 (1990)
[FF2] Feigin, B.L., Frenkel, E.: Kac–Moody groups and integrability of soliton equations. Invent. Math. 120, no. 2, 379–408 (1995)
[FF3] Feigin, B.L., Frenkel, E.: Integrals of motion and quantum groups. Lect. Notes in Math. 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996
[FL1] Fateev, V.A., Lukyanov, S.L.: The models of two-dimensional conformal quantum field theory with Zn symmetry. Internat. J. Modern Phys. A 3, no. 2, 507–520 (1988)
[FL2] Fateev, V.A., Lukyanov, S.L.: Poisson-Lie groups and classical W-algebras. Internat. J. Modern Phys. A 7, no. 5, 853–876 (1992)
[FTT] Faddeev, L.D., Takhtadzhyan, L.A., Tarasov, V.O.: Local Hamiltonians for integrable quantum models on a lattice. Theor. Math. Phys. 57, 1059–1073 (1983)
[FRS] Frenkel, E., Reshetikhin, N., Semenov-Tian-Shansky, M.A.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras I. The case of Virasoro algebra. Commun. Math. Phys. 192, no. 3, 605–629 (1998)
[G1] Grunspan, C.: Sur les intégrales de mouvement du système de sinus-Gordon discret. Lett. Math. Phys. 54, no. 2, 101–121 (2000)
[G2] Grunspan, C.: Théorie de Toda discrète et espaces homogènes quantiques. Ph.D. thesis (2000), Ecole Polytechnique
[GDi1] Gelfand, I.M., Dickey, L.A.: Asymptotic behaviour of the resolvent of the Sturm–Liouville equations and the algebra of the Korteweg–de Vries equations. Russ. Math. Surv. 30, no. 5, 77–113 (1975)
[GDi2] Gelfand, I.M., Dickey, L.A.: A Lie algebra structure in a formal variations calculus. Funct. Anal. Appl. 10, 16–22 (1976)
[GDo] Gelfand, I.M., Dorfman, I.Y.: Hamiltonian operators and infinite-dimensional Lie algebras. Funct. Anal. Appl. 15, 173–187 (1981)
[H] Hikami, K.: The ZN symmetric quantum lattice field theory, the quantum group symmetry, the Yang-Baxter equation and the integrals of motion. Journ. of the Phys. Soc. of Japan 68, no. 1, 55–60 (1999)
[IK1] Izergin, A.G., Korepin, V.E.: The lattice sine-Gordon model. Lett. Math. Phys. 5, 199–205 (1981)
[IK2] Izergin, A.G., Korepin, V.E.: Lattice versions of quantum field theory models in two dimensions. Nucl. Phys. B 205, 401–413 (1982)
[Ka] Kac, V.G.: Infinite-dimensional Lie algebras. Cambridge: Cambridge Univ. Press, 1990
[Ko] Kosmann-Schwarzbach, Y.: Jacobian quasi-bialgebras and quasi-Poisson Lie groups. Contemp. Math. 132, Providence, RI: Am. Math. Soc., 1992
[KT] Khoroshkin, S.M., Tolstoy, V.N.: On Drinfeld's realization of quantum affine algebras. J. of Geom. Phys. 11, 445–452 (1993)
[KW] Kac, V., Wakimoto, M.: Exceptional hierarchies of soliton equations. Proc. Symp. Pure Math. 49, 191–237 (1989)
[L] Lu, J.-H.: Classical dynamical r-matrices and homogeneous Poisson structures on G/H and K/T. Commun. Math. Phys. 212, no. 2, 337–370 (2000)
[LP] Li, L.C., Parmentier, S.: Nonlinear Poisson structures and r-matrices. Commun. Math. Phys. 125, no. 4, 545–563 (1989)
[Pa1] Parmentier, S.: Twisted affine Poisson structures, decomposition of Lie algebras, and the Classical Yang–Baxter equation. Preprint MPI/91-82, Max-Planck-Institut für Mathematik, Bonn, 1991
[Pa2] Parmentier, S.: On coproducts of quasi-triangular Hopf algebras. Algebra i Analiz 6, no. 4, 204–222 (1994); translated in English in St. Petersburg Math. J. 6, no. 4, 879–894 (1995)
[Pu] Pugay, Y.P.: Lattice W algebras and quantum groups. Theor. Math. Phys. 100, no. 1, 900–911 (1994)
[S] Sevostyanov, A.: Towards Drinfeld–Sokolov reduction for quantum groups. J. Geom. Phys. 33, no. 3–4, 235–256 (2000)
[SS] Semenov-Tian-Shansky, M.A., Sevostyanov, A.: Drinfeld–Sokolov reduction for difference operators and deformations of W-algebras. II. The general semisimple case. Commun. Math. Phys. 192, no. 3, 631–647 (1998)
[SZ] Shabat, A.B., Zakharov, V.E.: Integration of the nonlinear equations of mathematical physics by the method of the inverse scattering problem II (Russian). Funktsional. Anal. i Prilozhen 13, no. 3, 13–22 (1979); translated in English in Funct. Anal. Appl. 13, 166–174 (1980)
[V] Volkov, A.Y.: Quantum Volterra model. Phys. Lett. A 167, 345–355 (1992)
[W] Wilson, G.: The modified Lax and two dimensional Toda lattice equations associated with simple Lie algebras. Ergod. Th. and Dynam. Syst. 1, 361–380 (1981)
Communicated by L. Takhtajan