Communications in Mathematical Physics - Volume 203

Commun. Math. Phys. 203, 1 – 19 (1999) Communications in Mathematical Physics © Springer-Verlag 1999 Effective Dynami...

Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 203, 1–19 (1999)
Communications in Mathematical Physics © Springer-Verlag 1999

Effective Dynamics for a Mechanical Particle Coupled to a Wave Field

Alexander Komech¹,*, Markus Kunze²,**, Herbert Spohn³

¹ Department of Mechanics and Mathematics of Moscow State University, Moscow 119899, Russia. E-mail: [email protected]
² Mathematisches Institut der Universität Köln, Weyertal 86, D-50931 Köln, Germany. E-mail: [email protected]
³ Zentrum Mathematik, TU München, D-80290 München, Germany. E-mail: [email protected]

Received: 2 September 1998 / Accepted: 13 November 1998

Abstract: We consider a particle coupled to a scalar wave field and subject to the slowly varying potential V (εq) with small ε. We prove that if the initial state is close, order ε2 , to a soliton (=dressed particle), then the solution stays forever close to the soliton manifold. This estimate implies that over a time span of order ε−1 the radiation losses are negligible and that the motion of the particle is governed by the effective Hamiltonian Heff (q, P ) = E(P ) + V (εq) with energy-momentum relation E(P ).

1. Introduction

When a particle interacts with a field its mechanical properties are renormalized, e.g. the particle acquires an effective mass. In the context of charges interacting with the Maxwell field such an effective energy-momentum relation is discussed at length already in the classical work of Abraham [1] and Lorentz [16] with the implicit understanding that this relation determines how the particle responds to external forces. Kramers [14] emphasizes the distinction between bare (appearing in the equation of motion) and physical (observable by outside means) parameters of a charge. His vision has been implemented through the renormalization of quantum electrodynamics. To our knowledge, even on the classical level, it has never been properly settled in which sense and on what scale the dynamics governed by the effective energy-momentum relation is an approximation to the true solution of the coupled equations of motion. To gain some understanding we study here the arguably simplest model, namely a single particle interacting with a scalar wave field.

* Supported partly by French–Russian A.M.Liapunov Center of Moscow State University, by Max-Planck Institute for Mathematics in the Sciences (Leipzig), and by research grants of INTAS (IR-97-113) and of Volkswagen-Stiftung.
** Supported by DAAD and NSF (through a grant of Ch. Jones) during a stay at Brown University.


Our second source of interest lies in the by now long list of examples we have for the emergence of an effective dynamics, to mention only the Boltzmann and Vlasov equations, hydrodynamics [21], homogenization in periodic and random environments [3, 8], interface and vortex dynamics in Ginzburg–Landau theories [11], quantum systems weakly coupled to a heat bath [6], and a quantum particle in the semiclassical limit [10, 18, 22]. Their common thread is a separation of space-time scales together with some sort of local stationarity in such a way that the slowly varying dynamical variables are governed by an effective dynamics. However, the detailed mechanisms differ notably from case to case. Here we add a novel item to the list. It is not covered by the mathematical techniques developed so far.

We consider a scalar wave field $\phi(x)$, in three-dimensional space, coupled to a particle with position $q$ and momentum $p$, governed by

$$
\begin{aligned}
\dot\phi(x,t) &= \pi(x,t), & \dot\pi(x,t) &= \Delta\phi(x,t) - \rho(x-q(t)),\\
\dot q(t) &= p(t)/(1+p^2(t))^{1/2}, & \dot p(t) &= \int d^3x\,\phi(x,t)\,\nabla\rho(x-q(t)).
\end{aligned}
\tag{1.1}
$$

This is a Hamiltonian system with the Hamiltonian functional

$$
H_0(\phi,\pi,q,p) = (1+p^2)^{1/2} + \frac12\int d^3x\,\big(|\pi(x)|^2 + |\nabla\phi(x)|^2\big) + \int d^3x\,\phi(x)\rho(x-q).
\tag{1.2}
$$

We have set the mechanical mass of the particle and the speed of wave propagation equal to one. In spirit the interaction term is simply $\phi(q)$. This would result, however, in an energy that is not bounded from below. Therefore we smooth out the coupling by the function $\rho(x)$. In analogy to the Maxwell–Lorentz equations we call $\rho(x)$ the "charge distribution". We assume $\rho(x)$ to belong to the Sobolev space $H^1$, to be radial, and to be compactly supported, i.e.,

$$
\rho,\ \nabla\rho \in L^2(\mathbb{R}^3), \qquad \rho(x) = \rho_r(|x|), \qquad \rho(x) = 0 \ \text{ for } |x| \ge R_\rho.
\tag{C}
$$

The system (1.1) has solutions traveling with constant velocity $v$, $|v| < 1$. They are given by

$$
S_v(t) = (\phi_v(x-q-vt),\ \pi_v(x-q-vt),\ q+vt,\ p_v), \qquad p_v = v/\sqrt{1-v^2},
\tag{1.3}
$$

with

$$
\phi_v(x) = -\int \frac{\rho(y)\,d^3y}{4\pi\big((1-v^2)(y-x)^2 + (v\cdot(y-x))^2\big)^{1/2}}, \qquad \pi_v(x) = -v\cdot\nabla\phi_v(x).
\tag{1.4}
$$

To have a short name we call $S_v(t)$ the soliton with velocity $v$ centered at $q(t) = q + vt$. We define the normalized energy of a soliton as

$$
E_s(v) = H_0(S_v) - H_0(S_0), \qquad S_v = S_v(0),
\tag{1.5}
$$

which, using the rotational invariance of $\rho$, is given by

$$
E_s(v) = (1-v^2)^{-1/2} - 1 + 3m_e\left[\frac{2-v^2}{2(1-v^2)} - \frac{1}{2|v|}\log\frac{1+|v|}{1-|v|}\right].
\tag{1.6}
$$


Here $3m_e = -\langle\rho, \Delta^{-1}\rho\rangle$, with $\langle\cdot,\cdot\rangle$ the scalar product in $L^2(\mathbb{R}^3)$; we have $m_e < \infty$ by assumption (C). Since the system (1.1) is invariant under spatial translations, the total momentum,

$$
\mathcal{P}(\phi,\pi,q,p) = p - \int d^3x\,\pi(x)\,\nabla\phi(x),
\tag{1.7}
$$

is conserved. Inserting $S_v$, the total momentum of a soliton is given by

$$
P_s(v) = \mathcal{P}(S_v) = v(1-v^2)^{-1/2} + 3m_e\,v\left[\frac{1}{2v^2(1-v^2)} - \frac{1}{4|v|^3}\log\frac{1+|v|}{1-|v|}\right].
\tag{1.8}
$$

The map $v \mapsto P_s(v)$ is invertible from $\mathcal{V} = \{v\in\mathbb{R}^3 : |v| < 1\}$ onto $\mathbb{R}^3$ with the inverse $v_s(P)$; see [12]. Therefore we obtain the effective energy-momentum relation

$$
E(P) = E_s(v_s(P)).
\tag{1.9}
$$

Then $E(P)$ is radial. In the nonrelativistic limit ($v$ small) we have

$$
E_s(v) \cong \tfrac12(1+m_e)v^2 \quad\text{and}\quad P_s(v) \cong (1+m_e)v \quad\text{for } |v| \ll 1.
\tag{1.10}
$$

Thus $m_e$ is the additional mass acquired by the particle through the coupling to the field. For large $|P|$ we have the relativistic dependence $E(P) \cong |P|$.

Now let us assume that, at some time $t$, we have the soliton $S_v(t)$ centered at $q(t)$, $v = \dot q(t)$, and that an external force is acting on the particle. This force changes the velocity to $v' \ne v$ and $S_v(t)$ is no longer a solution to the system (1.1). However, if the force is small, so is the difference $v' - v$ and, if the force is slowly varying, the wave field has enough time to reestablish a soliton with new velocity $v'$. In fact this happens essentially with the speed of wave propagation (one in our case). Geometrically in phase space, we have the 6-dimensional manifold $\mathcal{S}$ of solitons labeled by their center $q$ and velocity $v$. For zero external force each point in this manifold moves on an orbit $t \mapsto (q+vt, v)$. Under a weak, slowly varying force, the true solution should remain close to the soliton manifold, thereby inducing on it an effectively 6-dimensional motion.

With this picture in mind, we add to $H_0$ in (1.2) the slowly varying potential $V(\varepsilon q)$, $\varepsilon \ll 1$,

$$
H_\varepsilon(\phi,\pi,q,p) = (1+p^2)^{1/2} + V(\varepsilon q) + \frac12\int d^3x\,\big(|\pi(x)|^2 + |\nabla\phi(x)|^2\big) + \int d^3x\,\phi(x)\rho(x-q).
\tag{1.11}
$$

For the potential $V$ we require

$$
V \in C^2(\mathbb{R}^3), \qquad \inf_{q\in\mathbb{R}^3} V(q) > -\infty,
\tag{P}
$$

and

$$
\sup_{q\in\mathbb{R}^3}\big(|\nabla V(q)| + |\nabla\nabla V(q)|\big) < \infty.
\tag{U}
$$

We remark that, using the conservation of energy, condition (U) can be replaced by

$$
V(q) \to \infty \quad\text{as } |q| \to \infty,
\tag{U$'$}
$$

i.e., by the assumption that $V$ be confining. In the sequel we study the Hamiltonian dynamics generated by (1.11),

$$
\begin{aligned}
\dot\phi(x,t) &= \pi(x,t), & \dot\pi(x,t) &= \Delta\phi(x,t) - \rho(x-q(t)),\\
\dot q(t) &= p(t)/(1+p^2(t))^{1/2}, & \dot p(t) &= -\varepsilon\nabla V(\varepsilon q(t)) + \int d^3x\,\phi(x,t)\,\nabla\rho(x-q(t)).
\end{aligned}
\tag{1.12}
$$

The derivatives in (1.12) and below are understood in the sense of distributions. We consider the Cauchy problem for the system (1.12) with initial conditions

$$
(\phi(x,0), \pi(x,0), q(0), p(0)) = (\phi^0(x), \pi^0(x), q^0, p^0).
\tag{1.13}
$$

Under our assumptions, the global solution to the Cauchy problem (1.12), (1.13) exists and is unique for initial data with finite energy. The solution depends on $\varepsilon$ through the potential and possibly also through the initial conditions. In order not to overburden our notation, we will mostly suppress this dependence. We assume the initial state to be close to a soliton. Since the force is slowly varying, near the particle such a wave field should persist. Indeed, we prove that

$$
\|(\phi(q(t)+x,t), \pi(q(t)+x,t)) - (\phi_{v(t)}(x), \pi_{v(t)}(x))\|_R \le C_R\,\varepsilon, \qquad \forall R > 0,
\tag{1.14}
$$

uniformly in $t\in\mathbb{R}$ (with the norm $\|\cdot\|_R$ being defined by the field energy in a ball of radius $R$), provided a smallness condition on $\rho$ is satisfied. Presumably, this condition is an artifact of our method.

In (1.12) the external force is $O(\varepsilon)$. So is the self-force, since according to (1.14) the field $\phi$ deviates from the soliton only by $O(\varepsilon)$. Then $\ddot q$ is of order $\varepsilon$, whereas $\dot q$ is of order 1. The effective energy-momentum relation should be visible on a time scale $O(1)$. Therefore we define the comparison dynamics through the effective Hamiltonian $H_{\rm eff}(Q,P) = E(P) + V(\varepsilon Q)$ with the corresponding equations of motion,

$$
\dot Q(t) = \nabla E(P(t)), \qquad \dot P(t) = -\varepsilon\nabla V(\varepsilon Q(t)),
\tag{1.15}
$$

suppressing again the $\varepsilon$-dependence of $(Q(t), P(t))$. Since the energy-momentum relation $E(P)$ depends on the charge distribution only through $m_e$, the effective dynamics is a structure independent property of the coupled system particle+field in the sense of Kramers [14]. The particle loses energy through radiation, which is proportional to $\ddot q^{\,2}$ and thus $O(\varepsilon^2)$. Therefore the comparison dynamics should be a valid approximation over a time scale $\varepsilon^{-1}$, i.e., over any time interval of duration $\varepsilon^{-1}\tau$. At time $t_0$ the comparison dynamics is adjusted to the true solution through the initial conditions

$$
Q(t_0) = q(t_0), \qquad P(t_0) = P_s(\dot q(t_0)).
\tag{1.16}
$$
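The comparison dynamics (1.15) with initial data (1.16) can be explored numerically. The following is only a minimal sketch in one space dimension, under assumptions that are not part of the paper: the value `M_E` of $m_e$, the sample potential $V(q) = 1 - \cos q$ (which satisfies (P) and (U)), the bisection inversion of $P_s$, and the leapfrog integrator are all illustrative choices. It uses the closed form (1.8) for $P_s$, (1.6) for $E_s$, and the identity $\nabla E(P) = v_s(P)$ stated in Lemma 6.2.

```python
import math

M_E = 0.5  # hypothetical value of the field mass m_e (assumption)


def soliton_momentum(v, m_e=M_E):
    """P_s(v) from (1.8) for a scalar velocity |v| < 1."""
    a = abs(v)
    if a < 1e-3:
        return (1.0 + m_e) * v  # nonrelativistic limit (1.10)
    bracket = 1.0 / (2 * a * a * (1 - a * a)) - math.log((1 + a) / (1 - a)) / (4 * a**3)
    return v / math.sqrt(1 - a * a) + 3 * m_e * v * bracket


def soliton_energy(v, m_e=M_E):
    """E_s(v) from (1.6) for a scalar velocity |v| < 1."""
    a = abs(v)
    if a < 1e-3:
        return 0.5 * (1.0 + m_e) * v * v  # nonrelativistic limit (1.10)
    bracket = (2 - a * a) / (2 * (1 - a * a)) - math.log((1 + a) / (1 - a)) / (2 * a)
    return 1.0 / math.sqrt(1 - a * a) - 1.0 + 3 * m_e * bracket


def velocity_of_momentum(P, m_e=M_E):
    """v_s(P): invert the increasing map v -> P_s(v) by bisection on (-1, 1)."""
    lo, hi = -1.0, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if soliton_momentum(mid, m_e) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def effective_energy(Q, P, eps, m_e=M_E):
    """H_eff(Q, P) = E(P) + V(eps Q) with the sample potential V(q) = 1 - cos q."""
    return soliton_energy(velocity_of_momentum(P, m_e), m_e) + 1.0 - math.cos(eps * Q)


def simulate(q0, v0, eps, tau, dt=0.05, m_e=M_E):
    """Leapfrog integration of (1.15) on 0 <= t <= tau / eps, started as in (1.16)."""
    Q, P = q0, soliton_momentum(v0, m_e)
    steps = int(round(tau / (eps * dt)))
    path = [(0.0, Q, P)]
    for n in range(1, steps + 1):
        P -= 0.5 * dt * eps * math.sin(eps * Q)  # half kick: dP/dt = -eps V'(eps Q)
        Q += dt * velocity_of_momentum(P, m_e)   # drift: dQ/dt = grad E(P) = v_s(P)
        P -= 0.5 * dt * eps * math.sin(eps * Q)  # half kick
        path.append((n * dt, Q, P))
    return path
```

A call such as `simulate(q0=20.0, v0=0.3, eps=0.05, tau=2.0)` returns the sampled trajectory over a time span of order $\varepsilon^{-1}$; evaluating `effective_energy` along it gives a quick consistency check, since $H_{\rm eff}$ is conserved by (1.15).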

Let $(Q(t), P(t))$ be the solution to (1.15) with these initial values. We then establish that, for $|t - t_0| = O(\varepsilon^{-1})$,

$$
|q(t) - Q(t)| = O(1), \qquad |\dot q(t) - \dot Q(t)| = O(\varepsilon), \qquad |\ddot q(t) - \ddot Q(t)| = O(\varepsilon^2)
\tag{1.17}
$$

uniformly in $t_0$. This is our main result.


In the proof, we stick for a while to the traditional route. One solves the inhomogeneous wave equation and inserts the solution into the self-force. Thereby the force on the particle depends on its past history, but not on the field. If one expands this force at $q(t)$ up to second order, one recovers the term missing in the full energy-momentum relation. To justify such a procedure mathematically we have to know a priori that

$$
|\ddot q(t)| \sim \varepsilon \quad\text{and}\quad |\dddot q(t)| \sim \varepsilon^2
\tag{1.18}
$$

uniformly in $t$, which requires an estimate of the field difference (1.14) and a similar one to handle $\dddot q(t)$. Our experience from the past is confirmed, namely a direct analysis of the exact delay equation for $q(t)$ is hopeless. To make progress one has to switch back and forth between particle and field.

2. Main Results

To formulate our results precisely, we need some definitions. We introduce the phase space suitable for the Cauchy problem corresponding to (1.12) and (1.13). Let $L^2$ be the real Hilbert space $L^2(\mathbb{R}^3)$ with norm $||\cdot||$, and let $\dot H^1$ be the completion of $C_0^\infty(\mathbb{R}^3)$ with norm $\|\phi(x)\| = ||\nabla\phi(x)||$. Equivalently, using Sobolev's embedding theorem, $\dot H^1 = \{\phi(x) \in L^6(\mathbb{R}^3) : |\nabla\phi(x)| \in L^2\}$; see [15]. Let $||\phi||_R$ denote the norm in $L^2(B_R)$ for $R > 0$, where $B_R = \{x\in\mathbb{R}^3 : |x| \le R\}$. Then the seminorms $\|\phi\|_R = ||\nabla\phi||_R$ are continuous on $\dot H^1$.

Definition 2.1.
i) The phase space $\mathcal{E}$ is the Hilbert space $\dot H^1 \oplus L^2 \oplus \mathbb{R}^3 \oplus \mathbb{R}^3$ of states $Y = (\phi,\pi,q,p)$ with finite norm $\|Y\|_{\mathcal{E}} = \|\phi\| + ||\pi|| + |q| + |p|$.
ii) $\mathcal{E}_F$ is the space $\mathcal{E}$ endowed with the Fréchet topology defined by the local energy seminorms $\|Y\|_R = \|\phi\|_R + ||\pi||_R + |q| + |p|$, $\forall R > 0$.
iii) $\mathcal{F}$ is the Hilbert space $\dot H^1 \oplus L^2$ of the fields $\Phi = (\phi,\pi)$ with finite norm $\|\Phi\|_{\mathcal{F}} = \|\phi\| + ||\pi||$.
iv) $\mathcal{F}_F$ is the space $\mathcal{F}$ endowed with the Fréchet topology defined by the local energy seminorms $\|\Phi\|_R = \|\phi\|_R + ||\pi||_R$, $\forall R > 0$.

A point in phase space is referred to as a state. We write the Cauchy problem (1.12), (1.13) in the form

$$
\dot Y(t) = F(Y(t)), \quad t\in\mathbb{R}, \qquad Y(0) = Y^0,
\tag{2.1}
$$

where $Y(t) = (\phi(t), \pi(t), q(t), p(t))$ and $Y^0 = (\phi^0, \pi^0, q^0, p^0)$. As already mentioned, we mostly suppress the $\varepsilon$-dependence of the solutions, of the vector field $F$, and of the initial conditions. The following lemma is proved analogously to the corresponding result in [13].


Lemma 2.2. Let (C), (P), and (U), resp. (U$'$), hold. Then for every $Y^0 \in \mathcal{E}$, $|\varepsilon| \le 1$, the Cauchy problem (2.1) has a unique solution $Y \in C(\mathbb{R}, \mathcal{E})$ with speed bounded as

$$
\sup_{t\in\mathbb{R}} |\dot q(t)| \le \bar v < 1.
\tag{2.2}
$$

The bound $\bar v = \bar v(Y^0)$ is uniform in $|\varepsilon| \le 1$ and for initial values $Y^0$ in bounded subsets of $\mathcal{E}$.

If the effective dynamics is approximately valid, then the field should be close to the soliton centered at $q(t)$ with velocity $v(t) = \dot q(t)$. We therefore consider the difference

$$
Z(x,t) = \Phi(x,t) - \Phi^*(x,t),
\tag{2.3}
$$

where $\Phi(x,t) = (\phi(x,t), \pi(x,t))$, $\Phi^*(x,t) = \Phi_{v(t)}(x - q(t))$, and $\Phi_v(x) = (\phi_v(x), \pi_v(x))$ is the field part of the soliton. Defining $\boldsymbol{\rho}(x) = (0, \rho(x))$ and $A(\phi,\pi) = (\pi, \Delta\phi)$, it follows that $\Phi$ and $Z$ satisfy the equations of motion

$$
\dot\Phi(x,t) = A\Phi(x,t) - \boldsymbol{\rho}(x - q(t)),
\tag{2.4}
$$

$$
\dot Z(x,t) = AZ(x,t) - B(x,t), \qquad B(x,t) = \dot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t)).
\tag{2.5}
$$

Here, according to the chain rule,

$$
\nabla_p\Phi_v = \nabla_v\Phi_v\,dv(p),
\tag{2.6}
$$

where $dv(p)$ is the differential of the map $p \mapsto v(p) = p/\sqrt{1+p^2}$. In Cartesian coordinates, $dv(p)$ is just the Jacobi matrix $\partial v_i/\partial p_j$.

Theorem 2.3. Let the conditions of Lemma 2.2 hold and let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(\bar v, R_\rho)$. Then for every $R > 0$ there exists $C_R$ such that

$$
\sup_{t\in\mathbb{R}} \|Z(\cdot + q(t), t)\|_R \le C_R\,(\|Z(0)\|_{\mathcal{F}} + \varepsilon).
\tag{2.7}
$$
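As a concrete illustration of the differential $dv(p)$ entering (2.6): differentiating $v(p) = p/\sqrt{1+p^2}$ componentwise gives $\partial v_i/\partial p_j = \delta_{ij}(1+p^2)^{-1/2} - p_i p_j (1+p^2)^{-3/2}$. The sketch below compares this closed form with central finite differences; the function names and the step size are illustrative assumptions, not notation from the paper.

```python
import math


def velocity(p):
    """Relativistic velocity v(p) = p / sqrt(1 + p^2) for a 3-vector p."""
    norm = math.sqrt(1.0 + sum(c * c for c in p))
    return [c / norm for c in p]


def jacobi_matrix(p):
    """Closed form of dv(p): delta_ij (1+p^2)^(-1/2) - p_i p_j (1+p^2)^(-3/2)."""
    s = 1.0 + sum(c * c for c in p)
    return [
        [(1.0 if i == j else 0.0) / math.sqrt(s) - p[i] * p[j] / s**1.5 for j in range(3)]
        for i in range(3)
    ]


def jacobi_matrix_fd(p, h=1e-6):
    """Central finite-difference approximation of the matrix d v_i / d p_j."""
    columns = []
    for j in range(3):
        plus = list(p)
        minus = list(p)
        plus[j] += h
        minus[j] -= h
        vp, vm = velocity(plus), velocity(minus)
        columns.append([(vp[i] - vm[i]) / (2 * h) for i in range(3)])
    # columns[j][i] holds d v_i / d p_j; transpose to row index i, column index j
    return [[columns[j][i] for j in range(3)] for i in range(3)]
```

The matrix is symmetric and positive definite, with eigenvalue $(1+p^2)^{-3/2}$ along $p$ and $(1+p^2)^{-1/2}$ on the orthogonal complement, so $dv(p)$ is invertible for every $p$.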

For the unperturbed, $\varepsilon = 0$, system our theorem states that the distance between the true solution and the soliton manifold

$$
\mathcal{S} = \{(\phi_v(x-q), \pi_v(x-q), q, p_v) : q\in\mathbb{R}^3,\ v\in\mathcal{V}\}
\tag{2.8}
$$

remains bounded in time. This property is called orbital stability, which has been established for the system (1.1) in [12] and for related equations in [7, 2] using the Liapunov method in combination with energy and momentum conservation. For $\varepsilon > 0$ such an argument breaks down, since the Hamiltonian vector field is no longer parallel to $\mathcal{S}$. To have a stability result such as (2.7) we therefore need to exploit that through radiation damping the solution is "pushed" towards $\mathcal{S}$. In other words, through the free wave equation a small deviation from the soliton is transported to infinity, which also shows that we are not allowed to replace the local energy norm in (2.7) by the global one. An adequate mathematical argument is provided by the nonautonomous integral equation method [4, 5, 19, 20], which has been used to prove the convergence to the soliton manifold in the context of the nonlinear Schrödinger equation.

If we assume that initially $\|Z(0)\|_{\mathcal{F}} \le C\varepsilon$, then according to (2.7) the solution remains $O(\varepsilon)$ close to $\mathcal{S}$ for all times. Thus it remains to characterize the motion along $\mathcal{S}$ as given by the particle trajectory $q(t)$. To obtain its approximate equation of motion we have to estimate the self-force. By Theorem 2.3 it is of $O(\varepsilon)$. To control the error, $O(\varepsilon^2)$, the solution has to be slowly varying in time with outgoing fields, which we formalize through the notion of an adiabatic family of solutions $Y_\varepsilon(t) = (\phi_\varepsilon(t), \pi_\varepsilon(t), q_\varepsilon(t), p_\varepsilon(t))$. We denote by $U(t)$ the dynamical group on $\mathcal{F}$ generated by the free wave equation and set

$$
\Phi^0 = (\phi_\varepsilon(0), \pi_\varepsilon(0)), \qquad (\phi^0_\varepsilon(\cdot,t), \pi^0_\varepsilon(\cdot,t)) = U(t)\Phi^0.
\tag{2.9}
$$

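Definition 2.4. A family of solutions $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$, $0 < \varepsilon \le 1$, to the system (1.12) is called adiabatic, if there exist constants $a, T_0 > 0$, and $\bar v < 1$, such that the following bounds hold:

$$
\sup_{t\in\mathbb{R}} |\dot q_\varepsilon(t)| \le \bar v,
\tag{2.10}
$$

$$
\sup_{t\in\mathbb{R}} |\ddot q_\varepsilon(t)| \le a\varepsilon,
\tag{2.11}
$$

$$
\sup_{t\in\mathbb{R}} |\dddot q_\varepsilon(t)| \le a\varepsilon^2,
\tag{2.12}
$$

$$
|\langle\phi^0_\varepsilon(x,t), \nabla\rho(x-q)\rangle| \le a\varepsilon^2 \quad\text{for } |q| < |t| - T_0.
\tag{2.13}
$$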

This definition is time-invariant, i.e., a family of solutions $Y_\varepsilon(t+\theta)$ is adiabatic for any $\theta\in\mathbb{R}$ if it is for some $\theta$. Our main result is the following.

Theorem 2.5. Let the assumptions of Theorem 2.3 hold and let $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$ be an adiabatic family of solutions to (1.12). Let $(Q(t), P(t))$ be the comparison dynamics (1.15) with initial values (1.16). Then for any $\tau > 0$ there exists $C = C(\tau)$ such that for $|t - t_0| \le \varepsilon^{-1}\tau$,

$$
|q(t) - Q(t)| \le C, \qquad |\dot q(t) - \dot Q(t)| \le C\varepsilon, \qquad |\ddot q(t) - \ddot Q(t)| \le C\varepsilon^2.
\tag{2.14}
$$

The constant $C(\tau)$ can be chosen independently of $t_0$. Of course, we still need a criterion for initial states that ensures the corresponding family of solution trajectories is adiabatic. The following theorem provides sufficient conditions, which in particular show that any initial soliton $(\phi_v(x-q^0), \pi_v(x-q^0), q^0, p_v)$ defines an adiabatic family of solutions and that the set of adiabatic families of solutions is nonempty and open in an appropriate topology. We set $(\varphi^0(x), \psi^0(x)) = Z^0(x) = Z(x,0)$ with corresponding Fourier transforms $(\hat\varphi^0(k), \hat\psi^0(k))$, and we let

$$
\dot p(0) = -\varepsilon\nabla V(\varepsilon q(0)) + \int d^3x\,\phi(x,0)\,\nabla\rho(x - q(0)).
$$

Theorem 2.6. Let there exist $a^0 > 0$ such that for the initial states $Y^0_\varepsilon = Y^0 = (\phi^0, \pi^0, q^0, p^0) \in \mathcal{E}$, $0 < \varepsilon \le 1$, the following bounds hold:

$$
\|Y^0(x)\|_{\mathcal{E}} \le a^0,
\tag{2.15}
$$

$$
\|Z^0(x)\|_{\mathcal{F}} \le a^0\varepsilon,
\tag{2.16}
$$

$$
\|\nabla Z^0(x)\|_{\mathcal{F}} + |\dot p(0)| \le a^0\varepsilon^2,
\tag{2.17}
$$

$$
\int d^3k\,\big(|k||\hat\varphi^0(k)| + |\hat\psi^0(k)|\big)\,|\hat\rho(k)| \le a^0\varepsilon^2,
\tag{2.18}
$$

$$
\int d^3k\,|k|\,\big(|k||\nabla[\hat\varphi^0(k)\hat\rho(k)]| + |\nabla[\hat\psi^0(k)\hat\rho(k)]|\big) \le a^0\varepsilon,
\tag{2.19}
$$

and let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(a^0, R_\rho)$. Then the family of solutions $Y_\varepsilon(t) \in C(\mathbb{R}, \mathcal{E})$ to the Cauchy problem (2.1) is adiabatic.

Thus Theorem 2.6, in essence, requires that the deviation from the soliton has sufficient smoothness and decay.

Our paper is organized as follows. Theorem 2.3 is proved in Sect. 3, and Theorem 2.6 is established in Sect. 4. In Sect. 5 we compute the self-force, and in Sect. 6 we complete the proof of Theorem 2.5. Section 7 concerns the translation invariant system (1.1). In Appendix A we collect Fourier space computations. Finally, in Appendix B, we list some remarks on the Hamiltonian structure.

3. Stability of the Soliton Manifold

We prove Theorem 2.3 and establish first the required bound for $R = R_\rho$ from (C).

Lemma 3.1. Under the assumptions of Theorem 2.3, the bound (2.7) holds for $R = R_\rho$,

$$
\|Z(\cdot + q(t), t)\|_{R_\rho} \le C\,(\|Z(0)\|_{\mathcal{F}} + \varepsilon).
\tag{3.1}
$$

Proof. Solving Eq. (2.5) by Fourier transform we get the mild solution representation

$$
Z(t) = U(t)Z(0) - \int_0^t U(t-s)\big[\dot p(s)\cdot\nabla_p\Phi_{v(s)}(\cdot - q(s))\big]\,ds,
\tag{3.2}
$$

with $U(t)$ being the group generated by the free wave equation in $\dot H^1 \oplus L^2$. By conservation of energy for the wave equation

$$
\|[U(t)Z(0)](\cdot + q(t))\|_{R_\rho} \le \|[U(t)Z(0)](\cdot + q(t))\|_{\mathcal{F}} = \|Z(0)\|_{\mathcal{F}}.
\tag{3.3}
$$

We denote by $\varphi(x,t) = \phi(x,t) - \phi_{v(t)}(x - q(t))$ the first component of $Z(x,t)$ and observe that $\langle\phi_v(x), \nabla\rho(x)\rangle = 0$ for $|v| < 1$ because the soliton (1.3) is a solution to (1.1). Then (1.12) implies

$$
\dot p(t) = -\varepsilon\nabla V(\varepsilon q(t)) + \langle\varphi(x + q(t), t), \nabla\rho(x)\rangle.
\tag{3.4}
$$

Thus with assumption (U) we obtain

$$
|\dot p(t)| \le C\big(\varepsilon + \|Z(\cdot + q(t), t)\|_{R_\rho}\,||\rho||\big).
\tag{3.5}
$$

We further introduce $\boldsymbol{\pi}_v = \nabla_p\pi_v$, $\boldsymbol{\phi}_v = \nabla_p\phi_v$, $S_{t-s}(x) = \{y : |y - x| = t - s\}$, and

$$
(\boldsymbol{\phi}(\cdot,t,s), \boldsymbol{\pi}(\cdot,t,s)) = U(t-s)\big[\nabla_p\Phi_{v(s)}(\cdot - q(s))\big].
\tag{3.6}
$$

Then Kirchhoff's formula for $U(t-s)$ implies the representation

$$
\begin{aligned}
\nabla\boldsymbol{\phi}(x,t,s) ={}& \sum_{|\alpha|\le 1} (t-s)^{|\alpha|-2} \int_{S_{t-s}(x)} d^2y\; a_\alpha(x-y)\,\partial_y^\alpha \boldsymbol{\pi}_{v(s)}(y - q(s))\\
&+ \sum_{|\alpha|\le 2} (t-s)^{|\alpha|-3} \int_{S_{t-s}(x)} d^2y\; b_\alpha(x-y)\,\partial_y^\alpha \boldsymbol{\phi}_{v(s)}(y - q(s)),
\end{aligned}
\tag{3.7}
$$

and a similar representation for $\boldsymbol{\pi}(x,t,s)$. The coefficients $a_\alpha(\cdot)$, $b_\alpha(\cdot)$ are bounded and sums are taken over multiindices $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ with integers $\alpha_j \ge 0$. Therefore $\nabla\boldsymbol{\phi}(x + q(t), t, s)$ and $\boldsymbol{\pi}(x + q(t), t, s)$ can be represented as integrals of type (3.7) over the shifted sphere $S_{t-s}(x + q(t))$. If $|x| \le R_\rho$, we have on this sphere

$$
|y - q(s)| = |(y - x - q(t)) + (x + q(t) - q(s))| \ge (t-s) - |x| - \bar v\,(t-s) \ge (1-\bar v)(t-s) - R_\rho
\tag{3.8}
$$

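by the bound (2.2) on $\dot q(t)$. On the other hand, the integral representation (1.4) yields by Cauchy–Schwarz

$$
\sup_{|v|\le\bar v}\ \sup_{|x|\ge 2R_\rho} \Big[\,|x||\boldsymbol{\phi}_v(x)| + |x|^2\big(|\nabla\boldsymbol{\phi}_v(x)| + |\boldsymbol{\pi}_v(x)|\big) + |x|^3\big(|\nabla\nabla\boldsymbol{\phi}_v(x)| + |\nabla\boldsymbol{\pi}_v(x)|\big)\Big] \le C(\bar v, R_\rho)\,||\rho|| < \infty.
\tag{3.9}
$$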

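Inserting (3.9) and (3.8) in Kirchhoff's formula for $\nabla\boldsymbol{\phi}(x + q(t), t, s)$, we obtain the pointwise bound

$$
\begin{aligned}
|\nabla\boldsymbol{\phi}(x + q(t), t, s)| \le{}& \sum_{|\alpha|\le 1} (t-s)^{|\alpha|-2}\, \frac{C_1(\bar v, R_\rho)\,||\rho||\,(t-s)^2}{(1 + |t-s|)^{|\alpha|+2}}
+ \sum_{|\alpha|\le 2} (t-s)^{|\alpha|-3}\, \frac{C_1(\bar v, R_\rho)\,||\rho||\,(t-s)^2}{(1 + |t-s|)^{|\alpha|+1}}\\
\le{}& \frac{C_2(\bar v, R_\rho)\,||\rho||}{1 + (t-s)^2}
\end{aligned}
\tag{3.10}
$$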

for $|x| \le R_\rho$ and provided $t - s \ge 3R_\rho/(1-\bar v)$. Therefore (3.10) implies for large $t - s$, together with a similar bound for $\boldsymbol{\pi}(x + q(t), t, s)$, the integral estimate

$$
\|(\boldsymbol{\phi}(x + q(t), t, s), \boldsymbol{\pi}(x + q(t), t, s))\|_{R_\rho} \le \frac{C_3(\bar v, R_\rho)\,||\rho||}{1 + (t-s)^2}.
\tag{3.11}
$$

On the other hand, for bounded $t - s$ this integral estimate follows directly from (3.6) by energy conservation for the map $U(t-s)$, since $\|\nabla_p\Phi_v\|_{\mathcal{F}} \le C(\bar v, R_\rho)\,||\rho||$ by (C). Finally, (3.5) and (3.11) imply

$$
\|\dot p(s)\cdot(\boldsymbol{\phi}(x + q(t), t, s), \boldsymbol{\pi}(x + q(t), t, s))\|_{R_\rho} \le C_4(\bar v, R_\rho)\,||\rho||\; \frac{\varepsilon + \|Z(\cdot + q(s), s)\|_{R_\rho}\,||\rho||}{1 + (t-s)^2},
\tag{3.12}
$$

and combining (3.2) and (3.3) we arrive at

$$
\|Z(\cdot + q(t), t)\|_{R_\rho} \le \|Z(0)\|_{\mathcal{F}} + C_4(\bar v, R_\rho)\,||\rho|| \int_0^t \frac{\varepsilon + \|Z(\cdot + q(s), s)\|_{R_\rho}\,||\rho||}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
\tag{3.13}
$$

Thus, denoting $M(t) = \max_{0\le s\le t} \|Z(q(s) + x, s)\|_{R_\rho}$, we have

$$
M(t) \le \|Z(0)\|_{\mathcal{F}} + C_5(\bar v, R_\rho)\,||\rho||\,\big(\varepsilon + ||\rho||\,M(t)\big).
$$

We choose now $||\rho||$ so small that $C_5(\bar v, R_\rho)\,||\rho||^2 < 1$. Then (3.1) follows for $t \ge 0$.
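The closing step of this proof is a simple bootstrap. Since $\int_0^t ds/(1+(t-s)^2) = \arctan t \le \pi/2$, one admissible choice is $C_5 = (\pi/2)\,C_4$, and solving the resulting inequality for $M(t)$ gives $M(t) \le (\|Z(0)\|_{\mathcal{F}} + C_5||\rho||\varepsilon)/(1 - C_5||\rho||^2)$. The sketch below checks the kernel integral numerically and evaluates this bound; the numerical values of the constants are purely hypothetical and only illustrate how the smallness of $||\rho||$ enters.

```python
import math


def kernel_integral(t, n=20000):
    """Midpoint-rule value of the integral of 1 / (1 + (t - s)^2) over 0 <= s <= t."""
    h = t / n
    return sum(h / (1.0 + (t - (i + 0.5) * h) ** 2) for i in range(n))


def bootstrap_bound(z0, eps, c5, rho_norm):
    """Solve M <= z0 + c5 * rho_norm * (eps + rho_norm * M) for M.

    Requires the smallness condition c5 * rho_norm^2 < 1 used in the proof.
    """
    gain = c5 * rho_norm**2
    if gain >= 1.0:
        raise ValueError("smallness condition c5 * rho_norm^2 < 1 is violated")
    return (z0 + c5 * rho_norm * eps) / (1.0 - gain)
```

For instance, with the hypothetical values `z0 = 0.01`, `eps = 0.01`, `c5 = 2.0`, `rho_norm = 0.1`, the bound is of the same order as `z0 + eps`, which is the content of (3.1).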


We claim that the bound (3.1) implies (2.7) for any $R > 0$. Indeed, (3.11)–(3.13) hold with the norm $\|\cdot\|_R$ instead of $\|\cdot\|_{R_\rho}$ on the left hand sides and with $C_i(\bar v, \rho, R)$ instead of $C_i(\bar v, \rho)$ on the right hand sides. Then (3.13) with this generalization and (3.1) imply (2.7).

4. Adiabatic Solutions

We prove Theorem 2.6. The bound (2.10) is the assertion of Lemma 2.2. Concerning (2.13), we have

$$
U(t)\Phi^0 = U(t)\Phi_{v(0)}(\cdot - q^0) + U(t)Z^0.
\tag{4.1}
$$

Moreover, $U(t)\Phi_{v(0)}(x - q^0) = 0$ for $|x - q^0| < |t| - R_\rho$ by Kirchhoff's formula, since we have the representation

$$
\Phi_{v(0)}(x) = -\int_{-\infty}^0 \big[U(-s)\boldsymbol{\rho}(\cdot - q^0 - v(0)s)\big](x)\,ds.
\tag{4.2}
$$

Therefore with the choice $T_0 = 2R_\rho + |q^0|$, (2.13) holds for the first component of $[U(t)\Phi_{v(0)}](x)$. With the choice $T_0 = 0$, (2.18) implies (2.13) for the first component of $U(t)Z^0$, as can be seen in Fourier space representation. Thus it remains to prove (2.11) and (2.12).

Proposition 4.1. For small $||\rho||$, the following bounds hold:

$$
\sup_{t\in\mathbb{R}} |\dot v(t)| \le C(a^0, R_\rho)\,\varepsilon,
\tag{4.3}
$$

$$
\sup_{t\in\mathbb{R}} |\ddot v(t)| \le C(a^0, R_\rho)\,\varepsilon^2.
\tag{4.4}
$$

Proof. The estimate (4.3) follows from (3.5), (3.1), and (2.16). To obtain (4.4), we differentiate (3.4) using (C),

$$
\ddot p(t) = -\varepsilon^2\,v(t)\cdot\nabla\nabla V(\varepsilon q(t)) + M(t),
\tag{4.5}
$$

where $M(t) = \langle L(t)\varphi(x + q(t), t), \nabla\rho(x)\rangle$ and $L(t) = \partial_t + v(t)\cdot\nabla$. Then (U) implies

$$
|\ddot p(t)| \le C\big(\varepsilon^2 + |M(t)|\big).
\tag{4.6}
$$

Therefore (4.4) will be a consequence of

Lemma 4.2. We have

$$
\sup_{t\in\mathbb{R}} |M(t)| \le C(a^0, R_\rho)\,\varepsilon^2 \quad\text{for small } ||\rho||.
\tag{4.7}
$$


Proof. We extend the method of the previous section. Denoting $\Xi(x,t) = L(t)Z(x,t)$, we have $M(t) = \langle\Xi(x,t), \nabla\rho^*(x - q(t))\rangle$, where $\rho^*(x) = (\rho(x), 0)$. To obtain an equation for $\Xi(t)$ we apply the differential operator $L(t)$ to (2.5) in the sense of distributions to find

$$
\dot\Xi(x,t) = A\Xi(x,t) - L(t)B(x,t) + \dot v(t)\cdot\nabla Z(x,t).
\tag{4.8}
$$

Here $\dot v(t)\cdot\nabla Z(\cdot,t) \in C(\mathbb{R}, \mathcal{F})$ due to (3.2), (2.17), and (C). Also $L(t)B(\cdot,t) \in C(\mathbb{R}, \mathcal{F})$ because

$$
L(t)B(x,t) = \ddot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t)) + (\dot p(t)\cdot\nabla_p)^2\,\Phi_{v(t)}(x - q(t)).
\tag{4.9}
$$

Moreover, assumptions (2.17) and (C) imply $\Xi(\cdot,0) \in \mathcal{F}$, since

$$
\Xi(x,t) = A\Phi(x,t) - \boldsymbol{\rho}(x - q(t)) + v(t)\cdot\nabla\Phi(x,t) - \dot p(t)\cdot\nabla_p\Phi_{v(t)}(x - q(t))
\tag{4.10}
$$

by definition of $Z$ in (2.3) and by (2.5). Therefore, using the Fourier transform to solve the linear nonhomogeneous equation (4.8), we get the following integral representation, similar to (3.2),

$$
\Xi(x,t) = U(t)\Xi(\cdot,0) - \int_0^t U(t-s)L(s)B(s)\,ds + \int_0^t \dot v(s)\cdot U(t-s)\nabla Z(s)\,ds,
\tag{4.11}
$$

where both integrals converge in $\mathcal{F}$. Hence (C) implies

$$
\begin{aligned}
M(t) ={}& \langle U(t)\Xi(\cdot,0), \nabla\rho^*(\cdot - q(t))\rangle - \int_0^t \langle U(t-s)L(s)B(s), \nabla\rho^*(\cdot - q(t))\rangle\,ds\\
&+ \int_0^t \dot v(s)\cdot\langle U(t-s)\nabla Z(s), \nabla\rho^*(\cdot - q(t))\rangle\,ds.
\end{aligned}
\tag{4.12}
$$

We analyze the three summands separately.

(i) For the first summand we prove the bound

$$
\sup_{t\ge 0} |\langle U(t)\Xi(\cdot,0), \nabla\rho^*(\cdot - q(t))\rangle| \le C_1(a^0)\,||\rho||\,\varepsilon^2.
\tag{4.13}
$$

Equation (4.10) implies $\|\Xi(\cdot,0)\|_{\mathcal{F}} \le C(a^0)\varepsilon^2$ by assumptions (2.17) and (C). Energy conservation then yields the uniform bound (4.13).

(ii) For the second summand in (4.12) we will obtain

$$
\left|\int_0^t \langle U(t-s)L(s)B(s), \nabla\rho^*(\cdot - q(t))\rangle\,ds\right| \le C_2(a^0, R_\rho)\,||\rho||^2 \int_0^t \frac{\varepsilon^2 + |M(s)|}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
\tag{4.14}
$$

Equations (4.9), (4.6), and (4.3) result in $L(t)B(x,t) = e(x,t) + m(x,t)$, where again by (C),

$$
\sup_{t\ge 0} \|e(x,t)\|_{\mathcal{F}} \le C(a^0, R_\rho)\,||\rho||\,\varepsilon^2, \qquad \|m(x,t)\|_{\mathcal{F}} \le C(a^0, R_\rho)\,||\rho||\,|M(t)|.
$$

Therefore (4.14) follows by repeating the arguments from (3.6)–(3.12).

(iii) For the third summand in (4.12) we will prove

$$
\sup_{t\ge 0} \left|\int_0^t \dot v(s)\cdot\langle U(t-s)\nabla Z(s), \nabla\rho^*(\cdot - q(t))\rangle\,ds\right| \le C_3(a^0, \rho)\,\varepsilon^2.
\tag{4.15}
$$

Taking the gradient of (3.2) yields

$$
U(t-s)\nabla Z(s) = U(t)\nabla Z(0) - \int_0^s \dot p(\tau)\cdot U(t-\tau)\nabla\big[\nabla_p\Phi_{v(\tau)}(\cdot - q(\tau))\big]\,d\tau.
\tag{4.16}
$$

For the first term, by partial integration in polar coordinates of the Fourier representation, (2.19) and (2.18) imply that $|\langle U(t)\nabla Z(0), \nabla\rho^*(\cdot - q(t))\rangle| \le C(a^0)\,t^{-1}\varepsilon$. The integral is oscillatory due to the bound (2.2). The justification for this partial integration comes from an appropriate averaging process. To bound the second term we note, similarly to (3.11),

$$
\|U(t-\tau)\nabla[\nabla_p\Phi_{v(\tau)}(\cdot - q(\tau))]\|_{R_\rho} \le \frac{C(a^0, R_\rho)\,||\rho||}{1 + (t-\tau)^3},
\tag{4.17}
$$

since the bounds of type (3.9) hold for $\nabla\nabla_p\Phi_v(x)$ with an additional power of $|x|$ on the left hand side. Then (4.16)–(4.17) and (4.3) imply (4.15).

Finally we substitute (4.13), (4.14), and (4.15) into (4.12) to obtain the integral inequality

$$
|M(t)| \le C(a^0, \rho)\,\varepsilon^2 + C(a^0, R_\rho)\,||\rho||^2 \int_0^t \frac{\varepsilon^2 + |M(s)|}{1 + (t-s)^2}\,ds, \qquad t \ge 0.
$$

Therefore (4.7) for $t \ge 0$ follows, provided that $||\rho|| \le \delta(a^0, R_\rho)$.

5. Inertial Representation of the Self-Force

We study the self-action term

$$
F_s(t) = \int d^3x\,\phi(x,t)\,\nabla\rho(x - q(t)).
$$

Denote $T_1 = 2R_\rho(1-\bar v)^{-1}$, where $\bar v < 1$ is the bound from (2.10), and $T = \max(T_0, T_1)$ with $T_0$ from (2.13). We also introduce the field part of the total momentum,

$$
P_f(v) = P_s(v) - p_v,
\tag{5.1}
$$

cf. (1.8), (1.3). The corresponding "effective mass", $m_f(v)$, is given by the differential $dP_f(v) =: m_f(v)$.

Lemma 5.1. Let the assumptions of Theorem 2.5 hold. Then

$$
F_s(t) = -m_f(\dot q(t))\,\ddot q(t) + f_s(t), \qquad |f_s(t)| \le C\varepsilon^2, \quad\text{for } |t| \ge T.
\tag{5.2}
$$


Proof. We note that by (1.12) and (2.9), $\phi(x,t) = \phi^0(x,t) + \phi^r(x,t)$, where $\phi^0(x,t)$ is a solution to the free wave equation defined in (2.9), while $\phi^r$ is the retarded potential

$$
\phi^r(x,t) = -\frac{1}{4\pi}\int_0^t \frac{ds}{t-s} \int_{|x-y|=t-s} d^2y\;\rho(y - q(s)).
\tag{5.3}
$$

We decompose accordingly $F_s(t) = F^0(t) + F^r(t)$, with

$$
F^0(t) = \langle\phi^0(\cdot,t), \nabla\rho(\cdot - q(t))\rangle, \qquad F^r(t) = \langle\phi^r(\cdot,t), \nabla\rho(\cdot - q(t))\rangle.
\tag{5.4}
$$

From (2.10) we conclude that $|q(t) - q^0| \le \bar v\,t$, and therefore

$$
F^0(t) = O(\varepsilon^2) \quad\text{for } t \ge T_0
\tag{5.5}
$$

by (2.13), since the solution is adiabatic. Hence

$$
F_s(t) = F^r(t) + O(\varepsilon^2) \quad\text{for } t \ge T_0.
\tag{5.6}
$$

Equations (5.3) and (5.4) imply

$$
F^r(t) = -\frac{1}{4\pi}\int_0^t \frac{ds}{t-s} \int d^3x \int_{|x-y|=t-s} d^2y\;\rho(y - q(s))\,\nabla\rho(x - q(t)).
\tag{5.7}
$$

Now observe that for all $t, T \ge T_1$ the $\int_0^t ds\,(\dots)$-integral in (5.7) may be changed to a $\int_{t-T}^t ds\,(\dots)$-integral, since

$$
\rho(y - q(s))\,\nabla\rho(x - q(t)) = 0 \quad\text{if } |x - y| = t - s \ge T_1.
\tag{5.8}
$$

Indeed, $\rho(y - q(s))\,\nabla\rho(x - q(t)) \ne 0$ implies $|y - q(s)| < R_\rho$ and $|x - q(t)| < R_\rho$. Therefore $|x - y| < 2R_\rho + \bar v\,(t-s)$, since $|q(t) - q(s)| \le \bar v\,(t-s)$ by (2.2). Substituting $|x - y|$ by $t - s$ we obtain $t - s < 2R_\rho/(1-\bar v) = T_1$. Next we fix $t, T \ge T_1$ and substitute in (5.7) the Taylor expansion

$$
q(s) = q(t) - \dot q(t)(t-s) + \tfrac12\,\ddot q(t)(t-s)^2 + O(\varepsilon^2)
$$

according to (2.11)–(2.12). Then

$$
F^r(t) = -\frac{1}{4\pi}\int_{t-T}^t \frac{ds}{t-s} \int d^3x \int_{|x-y|=t-s} d^2y\;\rho\Big(y - q(t) + \dot q(t)(t-s) - \tfrac12\,\ddot q(t)(t-s)^2 + O(\varepsilon^2)\Big)\,\nabla\rho(x - q(t)).
$$

Combining with (5.6) we finally obtain

$$
\begin{aligned}
F_s(t) = -\frac{1}{4\pi}\int_{t-T}^t \frac{ds}{t-s} \int d^3x \int_{|x-y|=t-s} d^2y\;\Big[&\rho(y - q(t) + \dot q(t)(t-s))\\
&- \tfrac12 (t-s)^2\,\ddot q(t)\cdot\nabla\rho(y - q(t) + \dot q(t)(t-s))\Big]\,\nabla\rho(x - q(t)) + f_s(t)
\end{aligned}
\tag{5.9}
$$


with $f_s(t)$ satisfying (5.2). The integral does not depend on $T$ provided $T, t > T_1$, which reflects the strong Huygens' principle. We will show in Appendix A by taking the limit $T \to \infty$ that the integral in (5.9) in fact equals $-m_f(\dot q)\,\ddot q$. Then (5.2) follows for $t \ge T$.

6. The Adiabatic Limit

We complete the proof of Theorem 2.5. We first ensure the existence of the effective dynamics.

Lemma 6.1. Define $E(P)$ through (1.9), and let the potential $V$ satisfy (U). Then for every initial state $(Q(0), P(0)) \in \mathbb{R}^3\times\mathbb{R}^3$ the Hamiltonian system

$$
\dot Q(t) = \nabla E(P(t)), \qquad \dot P(t) = -\varepsilon\nabla V(\varepsilon Q(t))
\tag{6.1}
$$

has a unique solution $(Q(t), P(t)) \in C(\mathbb{R}, \mathbb{R}^3\times\mathbb{R}^3)$. Moreover, $|\ddot Q(t)|$ and $|\dddot Q(t)|$ are bounded uniformly in $t$.

Proof. Both $\nabla\nabla E(P)$ and $\nabla\nabla V(Q)$ are bounded and $H_{\rm eff}(P,Q)$ is bounded from below.

Let $m(v) = dP_s(v)$. From Lemma 5.1, together with definitions (1.8), (5.1) and the equations of motion (1.12), we conclude that

$$
m(\dot q(t))\,\ddot q(t) = -\varepsilon\nabla V(\varepsilon q(t)) + f_s(t).
\tag{6.2}
$$

We want to rewrite (6.2) in a Hamiltonian form. For this purpose we introduce $\Pi(t) = P_s(\dot q(t))$, which yields $m(\dot q(t))\,\ddot q(t) = \dot\Pi(t)$. To obtain $\dot q$ as a function of $\Pi$ we have to invert the map $v \mapsto P_s(v)$.

Lemma 6.2. The inverse function to $P_s(v)$ is given by

$$
v_s(P) = \nabla E(P).
\tag{6.3}
$$

Proof. Using the chain rule, Eq. (9.1) states $v = \nabla E_s(v)\,(dP_s(v))^{-1} = \nabla E(P_s(v))$.
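The relation behind Lemma 6.2 can be checked numerically from the closed forms (1.6) and (1.8). In one dimension it reads $dE_s/dv = v\,dP_s/dv$. The sketch below evaluates both sides with central differences; the helper names, the step size, and the sample values of $m_e$ are illustrative assumptions, and the formulas are evaluated only for $0 < |v| < 1$.

```python
import math


def soliton_energy(v, m_e):
    """E_s(v) from (1.6), scalar velocity with 0 < |v| < 1."""
    a = abs(v)
    bracket = (2 - a * a) / (2 * (1 - a * a)) - math.log((1 + a) / (1 - a)) / (2 * a)
    return 1.0 / math.sqrt(1 - a * a) - 1.0 + 3 * m_e * bracket


def soliton_momentum(v, m_e):
    """P_s(v) from (1.8), scalar velocity with 0 < |v| < 1."""
    a = abs(v)
    bracket = 1.0 / (2 * a * a * (1 - a * a)) - math.log((1 + a) / (1 - a)) / (4 * a**3)
    return v / math.sqrt(1 - a * a) + 3 * m_e * v * bracket


def central_difference(f, x, h=1e-5):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def both_sides(v, m_e):
    """Return (dE_s/dv, v * dP_s/dv); Lemma 6.2 predicts that they agree."""
    d_energy = central_difference(lambda u: soliton_energy(u, m_e), v)
    d_momentum = central_difference(lambda u: soliton_momentum(u, m_e), v)
    return d_energy, v * d_momentum
```

The same two functions also reproduce the nonrelativistic limit (1.10): for small $v$, `soliton_energy(v, m_e)` is close to $\tfrac12(1+m_e)v^2$ and `soliton_momentum(v, m_e)` is close to $(1+m_e)v$.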

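With these definitions, (6.2) becomes

$$
\dot q(t) = \nabla E(\Pi(t)), \qquad \dot\Pi(t) = -\varepsilon\nabla V(\varepsilon q(t)) + f_s(t).
\tag{6.4}
$$

Let $q^\varepsilon(t) = \varepsilon q(\varepsilon^{-1}t)$, $Q^\varepsilon(t) = \varepsilon Q(\varepsilon^{-1}t)$ and $\Pi^\varepsilon(t) = \Pi(\varepsilon^{-1}t)$, $P^\varepsilon(t) = P(\varepsilon^{-1}t)$. Since $\varepsilon q(\varepsilon^{-1}t) = q^\varepsilon(t)$, the equations (6.4) and (6.1) read

$$
\begin{aligned}
\dot q^\varepsilon(t) &= \nabla E(\Pi^\varepsilon(t)), & \dot\Pi^\varepsilon(t) &= -\nabla V(q^\varepsilon(t)) + \varepsilon^{-1} f_s(\varepsilon^{-1}t),\\
\dot Q^\varepsilon(t) &= \nabla E(P^\varepsilon(t)), & \dot P^\varepsilon(t) &= -\nabla V(Q^\varepsilon(t)).
\end{aligned}
$$

Since $\nabla\nabla E$ and $\nabla\nabla V$ are bounded, and $|f_s(\varepsilon^{-1}t)| \le C\varepsilon^2$ for $|t| \ge \varepsilon T$, from a Gronwall argument for $r(t) = |q^\varepsilon(t) - Q^\varepsilon(t)| + |\Pi^\varepsilon(t) - P^\varepsilon(t)|$, we conclude that

$$
r(t) \le C\,(r_0 + \varepsilon)\,e^{C|t - t_0|}.
\tag{6.5}
$$

Here $r_0 := r(t_0) = 0$ due to (1.16), if $|t_0| > T$; otherwise $r_0 := r(\pm\varepsilon T) = O(\varepsilon)$, since $q^\varepsilon(t)$, $Q^\varepsilon(t)$, $\Pi^\varepsilon(t)$, $P^\varepsilon(t)$ change by $O(\varepsilon)$ over the time interval $|t| \le \varepsilon T$. Therefore, (6.5) implies the first two bounds of (2.14). The third bound follows from the second order equation (6.2) for $\ddot q$ and a similar equation for $\ddot Q$.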
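The Gronwall step can be made concrete in a model situation. Assume, as a simplification not spelled out in the paper, that $r$ satisfies the differential inequality $r' \le L\,r + c\,\varepsilon$ with hypothetical constants $L$ and $c$ (in the text both are absorbed into $C$). The extremal case $r' = L r + c\varepsilon$ has the solution $r(t) = (r_0 + c\varepsilon/L)e^{Lt} - c\varepsilon/L$, which is dominated by $\max(1, c/L)\,(r_0 + \varepsilon)\,e^{Lt}$, a bound of the form (6.5). The sketch compares a Runge–Kutta integration of the extremal case with these expressions.

```python
import math


def extremal_solution(r0, eps, t, L, c):
    """Closed-form solution of r' = L r + c eps with r(0) = r0."""
    return (r0 + c * eps / L) * math.exp(L * t) - c * eps / L


def gronwall_bound(r0, eps, t, L, c):
    """Upper bound of the form (6.5): max(1, c/L) (r0 + eps) exp(L t)."""
    return max(1.0, c / L) * (r0 + eps) * math.exp(L * t)


def integrate_extremal(r0, eps, t, L, c, n=2000):
    """Classical RK4 integration of r' = L r + c eps on [0, t]."""
    h = t / n
    r = r0

    def rhs(x):
        return L * x + c * eps

    for _ in range(n):
        k1 = rhs(r)
        k2 = rhs(r + 0.5 * h * k1)
        k3 = rhs(r + 0.5 * h * k2)
        k4 = rhs(r + h * k3)
        r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return r
```

With $r_0 = 0$ the result is proportional to $\varepsilon$ at any fixed rescaled time. In the original variables this corresponds to $|q - Q| = \varepsilon^{-1}|q^\varepsilon - Q^\varepsilon| = O(1)$ and $|\Pi - P| = O(\varepsilon)$ on time intervals of length $\varepsilon^{-1}\tau$.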


7. The Translation Invariant Case

For $V = 0$ the velocity $\dot q(t)$ of the particle should, after a transient period, stabilize at some definite $v$ dressed by the corresponding soliton field. Such a result was established in [12], where we only had to assume the Wiener condition $\hat\rho(k) \ne 0$. The technique developed here avoids this condition at the price of $||\rho|| \ll 1$ and even yields a bound on the rate of convergence. We denote $Z(0) = (\varphi^0(x), \psi^0(x))$.

Proposition 7.1. Let $||\rho||$ be sufficiently small, $||\rho|| \le \delta(\bar v, R_\rho)$, and assume for some $\sigma \in (0,1]$,

$$
|\varphi^0(x)| + |x|\big(|\nabla\varphi^0(x)| + |\psi^0(x)|\big) + |x|^2\big(|\nabla\nabla\varphi^0(x)| + |\nabla\psi^0(x)|\big) = O(|x|^{-\sigma}) \quad\text{as } |x| \to \infty.
\tag{7.1}
$$

Then the solution to (1.1) satisfies

$$
\|Z(\cdot + q(t), t)\|_R \le C_R\,(1 + |t|)^{-1-\sigma}, \qquad \forall R > 0.
\tag{7.2}
$$

Corollary 7.2. Under the same assumptions the acceleration is bounded as

$$
|\ddot q(t)| \le C\,(1 + |t|)^{-1-\sigma}.
\tag{7.3}
$$

Therefore, the limits $\lim_{t\to\pm\infty} \dot q(t) = v_\pm \in \mathcal{V}$ exist, and

$$
|\dot q(t) - v_\pm| \le C\,(1 + |t|)^{-\sigma}.
\tag{7.4}
$$

Proof. Equations (7.1) and (3.2)–(3.11) with ε = 0 imply, similarly to (3.13), −1−σ

kZ(· + q(t), t)kRρ ≤ C(1 + |t|)

Z + C(v, ρ)|| ρ||

2 0

t

kZ(q(s) + x, s)kRρ ds (1 + |t − s|)2

for t ≥ 0. Therefore, setting M (t) = max0≤s≤t (1 + |t|)1+σ kZ(q(s) + x, s)kRρ , we find M (t) ≤ C + C(v, ρ)|| ρ|| 2 Iσ M (t), where

Z Iσ = sup(1 + |t|)1+σ t≥0

0

t

(1 + |s|)−1−σ ds < ∞ for σ ∈ (0, 1]. (1 + |t − s|)2

It remains to choose C(v, ρ)|| ρ|| 2 Iσ < 1, then (7.2) with R = Rρ follows for t ≥ 0. The corollary is a consequence of (3.4) with ε = 0. Remark. Soliton-like asymptotics are established in [17] for some translation invariant 1D completely integrable equations, in [4, 5] for small perturbations of soliton solutions to 1D translation invariant nonlinear Schr¨odinger equations, and in [19, 20] for U (1)invariant 2D and 3D nonlinear Schr¨odinger equations with a potential term decaying like a power decay at infinity; [9] studies soliton-like asymptotics for 1D translation invariant nonlinear reaction systems.
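The finiteness of the constant $I_\sigma$ used in the proof of Proposition 7.1 can be probed numerically. The following is a minimal sketch, not part of the original argument: it evaluates the weighted integral by a midpoint rule for $\sigma = 1$ and compares it with a closed form obtained here by partial fractions (that closed form, like all function names, is an addition for illustration).

```python
import math

def weighted_convolution(t, sigma, n=200_000):
    """(1+t)^(1+sigma) * int_0^t (1+s)^(-1-sigma) / (1+t-s)^2 ds, midpoint rule."""
    h = t / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += (1.0 + s) ** (-1.0 - sigma) / (1.0 + t - s) ** 2
    return (1.0 + t) ** (1.0 + sigma) * total * h

def closed_form_sigma_one(t):
    """Same quantity for sigma = 1, using 1/(ab) = (1/a + 1/b)/(a + b)
    with a = 1 + s, b = 1 + t - s, a + b = t + 2."""
    length = t + 2.0
    bracket = 2.0 * t / (1.0 + t) + 4.0 * math.log(1.0 + t) / length
    return (1.0 + t) ** 2 / length ** 2 * bracket

if __name__ == "__main__":
    for t in (1.0, 10.0, 100.0):
        print(t, weighted_convolution(t, 1.0), closed_form_sigma_one(t))
```

For $\sigma = 1$ the values stay bounded as $t$ grows, in line with the claim $I_\sigma < \infty$; the supremum itself is only sampled on a grid here, not computed.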


8. Appendix A. Fourier Integrals

As usual, we denote by $\hat f(k) = (2\pi)^{-3/2}\int d^3x\, e^{ikx} f(x)$ the Fourier transform of $f(x)$.

Solitons: The soliton (1.4) has the Fourier transform

$$\hat\phi_v(k) = -\frac{\hat\rho(k)}{k^2 - (k\cdot v)^2}, \qquad \hat\pi_v(k) = -\frac{i\,k\cdot v\,\hat\rho(k)}{k^2 - (k\cdot v)^2}. \tag{8.1}$$

Energy-momentum relation: Inserting (8.1) in (1.2) and (1.7), the energy and the total momentum of a soliton with velocity $v$ are, respectively,

$$H_0(S_v) = (1 - v^2)^{-1/2} + \frac12 \int d^3k\, |\hat\rho(k)|^2\, \frac{3(k\cdot v)^2 - k^2}{[k^2 - (k\cdot v)^2]^2}, \tag{8.2}$$

$$P_s(v) = v(1 - v^2)^{-1/2} + \int d^3k\, |\hat\rho(k)|^2\, \frac{k\cdot v}{[k^2 - (k\cdot v)^2]^2}\, k. \tag{8.3}$$

After some calculations, this yields (1.6) and (1.8).

Field mass: Equation (8.3) implies that the effective mass due to the coupling to the field is given by

$$m_f(v) = dP_f(v) = \int d^3k\, |\hat\rho(k)|^2\, \frac{k^2 + 3(k\cdot v)^2}{[k^2 - (k\cdot v)^2]^3}\; k \otimes k, \qquad |v| < 1. \tag{8.4}$$
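The step from (8.3) to (8.4) is a differentiation in $v$ under the integral sign. As a hedged sanity check, the sketch below compares, for one sample pair $(k, v)$ chosen arbitrarily, a central finite-difference Jacobian of the integrand of (8.3) with the integrand of (8.4); the common weight $|\hat\rho(k)|^2$ is dropped, and all names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def momentum_integrand(k, v):
    """Vector (k.v) k / [k^2 - (k.v)^2]^2 from (8.3), without |rho_hat|^2."""
    kv = dot(k, v)
    d = dot(k, k) - kv * kv
    return [kv * c / d ** 2 for c in k]

def mass_integrand(k, v):
    """Matrix (k^2 + 3 (k.v)^2) / [k^2 - (k.v)^2]^3 * (k tensor k) from (8.4)."""
    kv = dot(k, v)
    kk = dot(k, k)
    weight = (kk + 3.0 * kv * kv) / (kk - kv * kv) ** 3
    return [[weight * a * b for b in k] for a in k]

def jacobian_fd(k, v, h=1e-5):
    """Central differences: entry [i][j] approximates d(component i)/d(v_j)."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        v_plus, v_minus = list(v), list(v)
        v_plus[j] += h
        v_minus[j] -= h
        g_plus = momentum_integrand(k, v_plus)
        g_minus = momentum_integrand(k, v_minus)
        for i in range(3):
            jac[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * h)
    return jac

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

K_SAMPLE = [0.3, -1.2, 0.7]
V_SAMPLE = [0.2, 0.1, -0.3]

if __name__ == "__main__":
    print(max_abs_difference(jacobian_fd(K_SAMPLE, V_SAMPLE),
                             mass_integrand(K_SAMPLE, V_SAMPLE)))
```

The check is pointwise in $k$; it says nothing about convergence of the $k$-integral, which relies on the assumptions on $\rho$.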

Self-force: We compute the integral (5.9) by switching to Fourier space. The wave propagator in Fourier space is multiplication by $|k|^{-1}\sin|k|t$. Hence

$$F_s(t) = \int d^3k\, |\hat\rho(k)|^2\, ik \int_{t-T}^{t} ds\; e^{-ik\cdot\dot q(t)(t-s)} \Big[1 - \tfrac12\, \ddot q(t)\cdot(-ik)\,(t-s)^2\Big]\, |k|^{-1}\sin|k|(t-s) \;+\; f_s(t). \tag{8.5}$$

We evaluate this integral by taking the limit as $T \to \infty$, recalling that the integral does not depend on $T$ provided $T \ge T_1$. We set $F_s(t) = I_1(T) + I_2(T) + f_s(t)$. In (8.5) we integrate over $s$. Setting $v = \dot q(t)$ and $k_\pm = -k\cdot v \pm |k|$, the first integral reads

$$I_1(T) = \int d^3k\, |\hat\rho(k)|^2\, ik \int_{t-T}^{t} ds\; e^{-ik\cdot\dot q(t)(t-s)}\, \frac{\sin|k|(t-s)}{|k|}$$

$$= i \int d^3k\, |\hat\rho(k)|^2 \left[ \frac{k}{k^2 - (k\cdot v)^2} - \frac{k}{2|k|}\left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) \right]$$

$$= -i \int d^3k\, |\hat\rho(k)|^2\, \frac{k}{2|k|}\left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) =: I_1^+(T) + I_1^-(T).$$


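Introducing polar coordinates $\nu = |k|$, $\theta = k/|k|$, we have

$$I_1^+(T) = -i \int d^3k\, |\hat\rho(k)|^2\, \frac{k}{2|k|}\, \frac{e^{ik_+T}}{k_+} = -\frac{i}{2} \int_{|\theta|=1} d^2\theta\; \frac{\theta}{\theta\cdot v + 1} \int_0^\infty d\nu\; \nu\, |\hat\rho(\nu\theta)|^2\, e^{i(\theta\cdot v + 1)\nu T}. \tag{8.6}$$

The integral converges absolutely, since $\hat\rho(k)$ is smooth with all derivatives in $L^2(\mathbb{R}^3)$ by assumption (C). Therefore, integrating by parts twice in the $\nu$-integral yields $|I_1^+(T)| \le CT^{-2}$ because $|v| = |\dot q(t)| < 1$. The same argument applies to $I_1^-(T)$ and it follows that

$$|I_1(T)| \le CT^{-2} \to 0 \quad\text{as } T \to \infty. \tag{8.7}$$

The second integral reads

$$I_2(T) = -\frac12 \int d^3k\, |\hat\rho(k)|^2\, k\,\big(\ddot q(t)\cdot k\big) \int_{t-T}^{t} ds\; e^{-ik\cdot\dot q(t)(t-s)}\,(t-s)^2\, \frac{\sin|k|(t-s)}{|k|}$$

$$= -\int d^3k\, |\hat\rho(k)|^2\, k\,\big(\ddot q(t)\cdot k\big) \left[ \frac{k^2 + 3(k\cdot v)^2}{(k^2 - (k\cdot v)^2)^3} + \frac{1}{2|k|}\left( \frac{e^{ik_+T}}{k_+^3} - \frac{e^{ik_-T}}{k_-^3} \right) - \frac{iT}{2|k|}\left( \frac{e^{ik_+T}}{k_+^2} - \frac{e^{ik_-T}}{k_-^2} \right) - \frac{T^2}{4|k|}\left( \frac{e^{ik_+T}}{k_+} - \frac{e^{ik_-T}}{k_-} \right) \right].$$

The integrals containing $T$ are again oscillatory and vanish as $T \to \infty$. Therefore, comparing with (8.4), we conclude

$$I_2(T) \to -m_f(\dot q(t))\,\ddot q(t) \quad\text{as } T \to \infty. \tag{8.8}$$

Hence (5.2) follows from (8.7) and (8.8).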

9. Appendix B. The Hamiltonian Structure

Energy-momentum relation: In Sect. 6 we used the identity

$$v\, dP_s(v) = \nabla E_s(v), \qquad |v| < 1. \tag{9.1}$$

While obtained from the explicit expressions (8.2), (8.3), resp. (1.6), (1.8), this identity should be understood as a direct consequence of the conservation of total momentum, i.e., of the translation invariance of (1.1). Our argument uses the canonical transformation [12]

$$T : (\phi, \pi, q, p) \mapsto (\Phi(x), \Pi(x), Q, P) = \big(\phi(q + x),\ \pi(q + x),\ q,\ p - \langle \pi(x), \nabla\phi(x)\rangle\big).$$


In the new variables the Hamiltonian (1.2) reads

$$H_P(\Phi, \Pi) = H_0\big(\Phi(x - Q),\ \Pi(x - Q),\ Q,\ P + \langle \Pi(x), \nabla\Phi(x)\rangle\big)$$

$$= \int d^3x \left( \frac12 |\Pi(x)|^2 + \frac12 |\nabla\Phi(x)|^2 + \Phi(x)\rho(x) \right) + \Big( 1 + \big( P + \langle \Pi(x), \nabla\Phi(x)\rangle \big)^2 \Big)^{1/2}.$$

$H_P$ is bounded from below and has its unique minimum at the point $(\phi_v, \pi_v)$, the soliton at velocity $v = v_s(P)$, with minimal value $H_P(\phi_v, \pi_v) = E_s(v) + H_0(S_0)$; see [12]. Differentiating in $v$ we obtain

$$\nabla E_s(v) = \Big\langle \frac{\delta H_P}{\delta\Phi}(\phi_v, \pi_v),\ \nabla_v\phi_v \Big\rangle + \Big\langle \frac{\delta H_P}{\delta\Pi}(\phi_v, \pi_v),\ \nabla_v\pi_v \Big\rangle + \nabla_P H_P(\phi_v, \pi_v)\, dP_s(v) = v\, dP_s(v),$$

since $(\phi_v, \pi_v)$ is a critical point of $H_P$ and the first two terms vanish, while $v = \dot Q = \nabla_P H_P(\phi_v, \pi_v)$ because $T$ is a canonical transformation.

Correspondence of the Hamiltonian structures: Definitions (1.5), (1.8), and (1.9) imply that the Hamiltonian functional $H_\varepsilon$ of (1.11) restricted to the soliton $S_v = (\phi_v(x - q), \pi_v(x - q), q, p_v)$ becomes

$$H_\varepsilon(S_v) = E(P) + V(\varepsilon q) + H_0(S_0) = H_{\rm eff}(q, P) + H_0(S_0) \tag{9.2}$$

with $P = P_s(v)$. Thus the effective Hamiltonian can be understood as the restriction of $H_\varepsilon$ to the soliton manifold. We need in addition the appropriate choice of canonical variables to write Hamilton's equations in the standard form (1.15). For general reasons one expects the conserved quantities to play a distinguished role. In our case this suggests $P$ and $q$ as canonical variables. The next lemma gives an inherent geometrical meaning to this choice, which might be valuable in a more general context.

Lemma 9.1. The canonical structure $P\,dq$ on the soliton manifold $S$ is the restriction of the full canonical form $p\,dq + \langle\phi, d\pi\rangle$, i.e.,

$$P\,dq = \big( p\,dq + \langle\phi, d\pi\rangle \big)\big|_S.$$

Proof. We have $p\,dq + \langle\phi, d\pi\rangle = P\,dQ + \langle\Phi, d\Pi\rangle$, since $T$ is a canonical transformation, and

$$\langle\Phi, d\Pi\rangle\big|_S = \langle\phi_v, d\pi_v\rangle = \langle\phi_v, \nabla_v\pi_v\rangle\, dv = 0$$

by antisymmetry in Fourier space and since $|\hat\rho(-k)| = |\hat\rho(k)|$.


References

1. Abraham, M.: Theorie der Elektrizität, Band 2: Elektromagnetische Theorie der Strahlung. Leipzig: Teubner, 1905
2. Bambusi, D., Galgani, L.: Some rigorous results on the Pauli-Fierz model of classical electrodynamics. Ann. Inst. H. Poincaré, Phys. Theor. 58, 155–171 (1993)
3. Bensoussan, A., Lions, J.L., Papanicolaou, G.: Asymptotic Analysis for Periodic Structures. Studies in Mathematics and its Applications, Vol. 5, Amsterdam: North-Holland, 1978
4. Buslaev, V.S., Perelman, G.S.: On nonlinear scattering of states which are close to a soliton. In: Méthodes Semi-Classiques, Vol. 2 Colloque International (Nantes, juin 1991), Asterisque 208, 1992, pp. 49–63
5. Buslaev, V.S., Perelman, G.S.: On the stability of solitary waves for nonlinear Schrödinger equations. Trans. Amer. Math. Soc. 164, 75–98 (1995)
6. Davies, E.B.: Quantum Theory of Open Systems. London: Academic Press, 1976
7. Grillakis, M., Shatah, J., Strauss, W.A.: Stability theory of solitary waves in the presence of symmetry I and II. J. Func. Anal. 74, 160–197 (1987); 94, 308–348 (1990)
8. De Masi, A., Ferrari, P.A., Goldstein, S., Wick, W.D.: An invariance principle for reversible Markov processes. Application to random environments. J. Stat. Phys. 55, 787–855 (1989)
9. Fleckinger, J., Komech, A.: On soliton-like asymptotics for 1D nonlinear reaction systems. Russian J. Math. Phys. 5, 295–307 (1997)
10. Hagedorn, G.A.: A time dependent Born–Oppenheimer approximation. Commun. Math. Phys. 77, 1–19 (1980)
11. Jerrard, R.L., Soner, H.M.: Dynamics of Ginzburg–Landau Vortices. Preprint, 1995
12. Komech, A., Spohn, H.: Soliton-like asymptotics for a classical particle interacting with a scalar wave field. Nonlinear Anal. 33, 13–24 (1998)
13. Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differential Equations 22, 307–335 (1997)
14. Kramers, H.A.: Non-relativistic Quantum-Electrodynamics and correspondence Principle. In: Solvay Conference 1948, Rapport et Discussions, Bruxelles, 1950, pp. 241–265; in: Kramers, H.A.: Collected Scientific Papers. Amsterdam: North-Holland, 1956, pp. 845–869
15. Lions, J.L.: Problèmes aux Limites dans les Equations aux Dérivées Partielles. Montréal: Presses de l'Univ. Montréal, 1962
16. Lorentz, H.A.: Theory of Electrons, 2nd edition 1915. Reprinted by New York: Dover, 1952
17. Novikov, S.P., Manakov, S.V., Pitaevskii, L.P., Zakharov, V.E.: Theory of Solitons: The Inverse Scattering Method. Consultants Bureau, 1984
18. Robert, D.: Autour de l'Approximation Semi-Classique, Progress in Mathematics, Vol. 68. Basel: Birkhäuser, 1987
19. Soffer, A., Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations. Commun. Math. Phys. 133, 119–146 (1990)
20. Soffer, A., Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations II. The case of anisotropic potentials and data. J. Differ. Eqs. 98, 376–390 (1992)
21. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer, 1991
22. Spohn, H.: Long time asymptotics for quantum particles in a periodic potential. Phys. Rev. Lett. 77, 1198–1201 (1996)

Communicated by A. Kupiainen

Commun. Math. Phys. 203, 21 – 30 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Monopole Equations on 8-Manifolds with Spin(7) Holonomy

Ayşe Hümeyra Bilge¹, Tekin Dereli², Şahin Koçak³

1 Department of Mathematics, İstanbul Technical University, İstanbul, Turkey. E-mail: [email protected]
2 Department of Physics, Middle East Technical University, Ankara, Turkey. E-mail: [email protected]
3 Department of Mathematics, Anadolu University, Eskişehir, Turkey. E-mail: [email protected]

Received: 17 October 1997 / Accepted: 16 November 1998

Abstract: We construct a consistent set of monopole equations on eight-manifolds with Spin(7) holonomy. These equations are elliptic and admit non-trivial solutions including all the 4-dimensional Seiberg–Witten solutions as a special case.

1. Introduction

In a remarkable paper [1] Seiberg and Witten have shown that diffeomorphism invariants of 4-manifolds can be found essentially by counting the number of solutions of a set of massless, Abelian monopole equations [2, 3]. It was later noted that topological quantum field theories, which are extensively studied in this context in 2, 3 and 4 dimensions, also exist in higher dimensions [4, 5, 6, 7]. Therefore it is of interest to consider monopole equations in higher dimensions, thereby generalizing the 4-dimensional Seiberg–Witten theory. In fact Seiberg–Witten equations can be constructed on any even dimensional manifold ($D = 2n$) with a spin$^c$-structure [8]. But there are problems. The self-duality of 2-forms plays an eminent role in the 4-dimensional theory and we encounter projection maps $\rho^+(F_A) = \rho^+(F_A^+) = \rho(F_A^+)$ (see the next section). The first projection $\rho^+(F_A)$ is meaningful in any dimension $2n \ge 4$. However, a straightforward generalization of the Seiberg–Witten equations using this projection yields an overdetermined set of equations having no non-trivial solutions even locally [9]. To use the other projections, one needs an appropriately generalized notion of self-dual 2-forms. On the other hand there is no unique definition of self-duality in more than four dimensions. In a previous paper [10] we reviewed the existing definitions of self-duality and gave an eigenvalue criterion for specifying self-dual 2-forms on any even dimensional manifold. In particular, in $D = 8$ dimensions, there is a linear notion of self-duality defined on 8-manifolds with Spin(7) holonomy [11, 12]. This corresponds to a specific choice of a maximal linear subspace in the set of (non-linear) self-dual 2-forms as defined by our eigenvalue criterion [13]. Eight dimensions is special because in this particular case the set of linear Spin(7) self-duality equations can be solved by making use of octonions [14]. The existence of octonionic instantons which realise the last Hopf fibration $S^{15} \to S^8$ is closely related to the properties of the octonion algebra [15, 16, 17].

Here we use this linear notion of self-duality to construct a consistent set of Abelian monopole equations on 8-manifolds with Spin(7) holonomy. These equations turn out to be elliptic and locally they admit non-trivial solutions which include all 4-dimensional Seiberg–Witten solutions as a special case. But before giving our 8-dimensional monopole equations, we first give, in the next section, the setup and the generalization of the 4-dimensional Seiberg–Witten equations to arbitrary even dimensional manifolds with spin$^c$-structure as proposed by Salamon [8]. This is going to help us put our monopole equations into their proper context. We also wish to note that any 8-manifold with Spin(7) holonomy is automatically a spin manifold [18, 19] and thus carries a spin$^c$-structure, making the application of the general approach possible. In fact our monopole equations can always be expressed purely in the real realm, but in order to relate them to the 4-dimensional Seiberg–Witten equations, it is preferable to use the spin$^c$-structure and complex spinors.

2. Definitions and Notation

A spin$^c$-structure on a $2n$-dimensional real inner-product space $V$ is a pair $(W, \Gamma)$, where $W$ is a $2^n$-dimensional complex Hermitian space and $\Gamma : V \to \mathrm{End}(W)$ is a linear map satisfying

$$\Gamma(v)^2 = -\|v\|^2, \qquad \Gamma(v)^* = -\Gamma(v),$$

for $v \in V$. Globalizing this defines the notion of a spin$^c$-structure $\Gamma : TX \to \mathrm{End}(W)$ on a $2n$-dimensional (oriented) manifold $X$, $W$ being a $2^n$-dimensional complex Hermitian vector bundle on $X$. Such a structure exists if and only if $w_2(X)$ has an integral lift. $\Gamma$ extends to an isomorphism between the complex Clifford algebra bundle $C^c(TX)$ and $\mathrm{End}(W)$. There is a natural splitting $W = W^+ \oplus W^-$ into the $\pm i^n$ eigenspaces of $\Gamma(e_{2n}e_{2n-1}\cdots e_1)$, where $e_1, e_2, \ldots, e_{2n}$ is any positively oriented local orthonormal frame of $TX$. The extension of $\Gamma$ to $C_2(X)$ gives, via the identification of $\Lambda^2(T^*X)$ with $C_2(X)$, a map $\rho : \Lambda^2(T^*X) \to \mathrm{End}(W)$ given by

$$\rho\Big(\sum_{i<j} \eta_{ij}\, e_i^* \wedge e_j^*\Big) = \sum_{i<j} \eta_{ij}\, \Gamma(e_i)\Gamma(e_j).$$

The bundles $W^\pm$ are invariant under $\rho(\eta)$ for $\eta \in \Lambda^2(T^*X)$. Denote $\rho^\pm(\eta) = \rho(\eta)|_{W^\pm}$. The map $\rho$ (and $\rho^\pm$) extends to $\rho : \Lambda^2(T^*X) \otimes \mathbb{C} \to \mathrm{End}(W)$. (If $\eta \in \Lambda^2(T^*X) \otimes \mathbb{C}$ is real-valued then $\rho(\eta)$ is skew-Hermitian and if $\eta$ is imaginary-valued then $\rho(\eta)$ is Hermitian.) A Hermitian connection $\nabla$ on $W$ is called a spin$^c$ connection (compatible with the Levi–Civita connection) if

$$\nabla_v(\Gamma(w)\Phi) = \Gamma(w)\nabla_v\Phi + \Gamma(\nabla_v w)\Phi,$$

where $\Phi$ is a spinor (section of $W$), $v$ and $w$ are vector fields on $X$ and $\nabla_v w$ is the Levi–Civita connection on $X$. $\nabla$ preserves the subbundles $W^\pm$. There is a principal

$\mathrm{Spin}^c(2n) = \{e^{i\theta}x \mid \theta \in \mathbb{R},\ x \in \mathrm{Spin}(2n)\} \subset C^c(\mathbb{R}^{2n})$ bundle $P$ on $X$ such that $W$ and $TX$ can be recovered as the associated bundles

$$W = P \times_{\mathrm{Spin}^c(2n)} \mathbb{C}^{2^n}, \qquad TX = P \times_{\mathrm{Ad}} \mathbb{R}^{2n},$$

Ad being the adjoint action of $\mathrm{Spin}^c(2n)$ on $\mathbb{R}^{2n}$. We then get a complex line bundle $L_\Gamma = P \times_\delta \mathbb{C}$ using the map $\delta : \mathrm{Spin}^c(2n) \to S^1$ given by $\delta(e^{i\theta}x) = e^{2i\theta}$. There is a one-to-one correspondence between spin$^c$ connections on $W$ and $\mathrm{spin}^c(2n) = \mathrm{Lie}(\mathrm{Spin}^c(2n)) = \mathrm{spin}(2n) \oplus i\mathbb{R}$-valued connection 1-forms $\hat A \in \mathcal{A}(P) \subset \Omega^1(P, \mathrm{spin}^c(2n))$ on $P$. Now consider the trace-part $A$ of $\hat A$: $A = \frac{1}{2^n}\,\mathrm{trace}(\hat A)$. This is an imaginary valued 1-form $A \in \Omega^1(P, i\mathbb{R})$ which is equivariant and satisfies

$$A_p(p \cdot \xi) = \frac{1}{2^n}\,\mathrm{trace}(\xi)$$

for $v \in T_pP$, $g \in \mathrm{Spin}^c(2n)$, $\xi \in \mathrm{spin}^c(2n)$ (where $p \cdot \xi$ is the infinitesimal action). Denote the set of imaginary valued 1-forms on $P$ satisfying these two properties by $\mathcal{A}(\Gamma)$. There is a one-to-one correspondence between these 1-forms and spin$^c$ connections on $W$. Denote the connection corresponding to $A$ by $\nabla_A$. $\mathcal{A}(\Gamma)$ is an affine space with parallel vector space $\Omega^1(X, i\mathbb{R})$. For $A \in \mathcal{A}(\Gamma)$, the 1-form $2A \in \Omega^1(P, i\mathbb{R})$ represents a connection on the line bundle $L_\Gamma$. For this reason $A$ is called a virtual connection on the virtual line bundle $L_\Gamma^{1/2}$. Let $F_A \in \Omega^2(X, i\mathbb{R})$ denote the curvature of the 1-form $A$. Finally, let $D_A$ denote the Dirac operator corresponding to $A \in \mathcal{A}(\Gamma)$,

$$D_A : C^\infty(X, W^+) \to C^\infty(X, W^-),$$

defined by

$$D_A(\Phi) = \sum_{i=1}^{2n} \Gamma(e_i)\,\nabla_{A, e_i}(\Phi),$$

where $\Phi \in C^\infty(X, W^+)$ and $e_1, e_2, \ldots, e_{2n}$ is any local orthonormal frame. The Seiberg–Witten equations can now be expressed as follows. Fix a spin$^c$-structure $\Gamma : TX \to \mathrm{End}(W)$ on $X$ and consider the pair $(A, \Phi) \in \mathcal{A}(\Gamma) \times C^\infty(X, W^+)$. The Seiberg–Witten equations read

$$D_A(\Phi) = 0, \qquad \rho^+(F_A) = (\Phi\Phi^*)_0,$$

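where $(\Phi\Phi^*)_0 \in C^\infty(X, \mathrm{End}(W^+))$ is defined by $(\Phi\Phi^*)(\tau) = \langle\Phi, \tau\rangle\,\Phi$ for $\tau \in C^\infty(X, W^+)$ and $(\Phi\Phi^*)_0$ is the traceless part of $(\Phi\Phi^*)$.

3. Seiberg–Witten Equations on 4-Manifolds

Before going over to 8-manifolds, we first show that the Seiberg–Witten equations on 4-manifolds ([8, p. 232]) can be rewritten in a different form. The Dirac equation

$$D_A(\Phi) = 0 \tag{1}$$

can be explicitly written as

$$\nabla_1\Phi = I\,\nabla_2\Phi + J\,\nabla_3\Phi + K\,\nabla_4\Phi, \tag{2}$$

and

$$\rho^+(F_A) = (\Phi\Phi^*)_0 \tag{3}$$

is equivalent to the set

$$F_{12} + F_{34} = -\tfrac12\,\Phi^* I\,\Phi, \qquad F_{13} - F_{24} = -\tfrac12\,\Phi^* J\,\Phi, \qquad F_{14} + F_{23} = -\tfrac12\,\Phi^* K\,\Phi, \tag{4}$$

where $\Phi : \mathbb{R}^4 \to \mathbb{C}^2$, $\nabla_i\Phi = \dfrac{\partial\Phi}{\partial x_i} + A_i\Phi$, $A = \sum_{i=1}^4 A_i\,dx_i \in \Omega^1(\mathbb{R}^4, i\mathbb{R})$, $F_A = \sum_{i<j} F_{ij}\,dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^4, i\mathbb{R})$, and

$$I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$

In the most explicit form, these equations can be written as

$$\frac{\partial\phi_1}{\partial x_1} + A_1\phi_1 = i\Big(\frac{\partial\phi_1}{\partial x_2} + A_2\phi_1\Big) + \frac{\partial\phi_2}{\partial x_3} + A_3\phi_2 + i\Big(\frac{\partial\phi_2}{\partial x_4} + A_4\phi_2\Big),$$

$$\frac{\partial\phi_2}{\partial x_1} + A_1\phi_2 = -i\Big(\frac{\partial\phi_2}{\partial x_2} + A_2\phi_2\Big) - \Big(\frac{\partial\phi_1}{\partial x_3} + A_3\phi_1\Big) + i\Big(\frac{\partial\phi_1}{\partial x_4} + A_4\phi_1\Big) \tag{5}$$

(for $D_A(\Phi) = 0$) and

$$F_{12} + F_{34} = -\tfrac{i}{2}\,(\phi_1\bar\phi_1 - \phi_2\bar\phi_2), \qquad F_{13} - F_{24} = \tfrac12\,(\phi_1\bar\phi_2 - \phi_2\bar\phi_1), \qquad F_{14} + F_{23} = -\tfrac{i}{2}\,(\phi_1\bar\phi_2 + \phi_2\bar\phi_1) \tag{6}$$

(for $\rho^+(F_A) = (\Phi\Phi^*)_0$). We will reinterpret the second part of these equations in the following way: The 6-dimensional bundle of real-valued 2-forms on $\mathbb{R}^4$ has a 3-dimensional subbundle of self-dual forms with orthogonal basis

$$f_1 = dx_1 \wedge dx_2 + dx_3 \wedge dx_4, \qquad f_2 = dx_1 \wedge dx_3 - dx_2 \wedge dx_4, \qquad f_3 = dx_1 \wedge dx_4 + dx_2 \wedge dx_3, \tag{7}$$

in each fiber with respect to the usual metric. These forms span a 3-dimensional complex subbundle of the bundle of complex-valued 2-forms. The projection of a (global) 2-form $F = \sum_{i<j} F_{ij}\,dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^4, i\mathbb{R})$ onto this complex subbundle is given by

$$F^+ = \tfrac12 (F_{12} + F_{34})\,f_1 + \tfrac12 (F_{13} - F_{24})\,f_2 + \tfrac12 (F_{14} + F_{23})\,f_3. \tag{8}$$

We have $\rho^+(f_1) = 2I$, $\rho^+(f_2) = 2J$, $\rho^+(f_3) = 2K$, so that

$$\rho^+(F^+) = (F_{12} + F_{34})\,I + (F_{13} - F_{24})\,J + (F_{14} + F_{23})\,K. \tag{9}$$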
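The passage from (4) to (6) is a short matrix computation. The sketch below, written with plain Python complex numbers and hypothetical helper names, checks for one arbitrary sample spinor that $-\tfrac12\Phi^* M \Phi$ with $M = I, J, K$ reproduces the three right-hand sides of (6), and that $I, J, K$ satisfy the quaternion relations.

```python
I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

def sandwich(phi, m):
    """Phi^* M Phi for a column spinor Phi = (phi_1, phi_2) in C^2."""
    return sum(phi[r].conjugate() * m[r][c] * phi[c]
               for r in range(2) for c in range(2))

def rhs_of_4(phi):
    """Right-hand sides of (4): -1/2 Phi^* M Phi for M = I, J, K."""
    return [-0.5 * sandwich(phi, m) for m in (I, J, K)]

def rhs_of_6(phi):
    """Right-hand sides of (6), written out in components."""
    p1, p2 = phi
    c1, c2 = p1.conjugate(), p2.conjugate()
    return [-0.5j * (p1 * c1 - p2 * c2),
            0.5 * (p1 * c2 - p2 * c1),
            -0.5j * (p1 * c2 + p2 * c1)]

PHI_SAMPLE = (1 + 2j, 3 - 1j)  # arbitrary test spinor

if __name__ == "__main__":
    print(rhs_of_4(PHI_SAMPLE))
    print(rhs_of_6(PHI_SAMPLE))
```

All three values come out purely imaginary, as they must, since $F_A$ is an $i\mathbb{R}$-valued 2-form.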

On the other hand, the orthogonal projection $(\Phi\Phi^*)^+$ of $\Phi\Phi^*$ onto the subbundle of the positive spinor bundle generated by the (Hermitian-) orthogonal basis $(\rho^+(f_1), \rho^+(f_2), \rho^+(f_3))$ is given by

$$\langle 2I, \Phi\Phi^*\rangle\, 2I/|2I|^2 + \langle 2J, \Phi\Phi^*\rangle\, 2J/|2J|^2 + \langle 2K, \Phi\Phi^*\rangle\, 2K/|2K|^2 = \tfrac12 \langle I, \Phi\Phi^*\rangle\, I + \tfrac12 \langle J, \Phi\Phi^*\rangle\, J + \tfrac12 \langle K, \Phi\Phi^*\rangle\, K. \tag{10}$$

Since

$$\langle I, \Phi\Phi^*\rangle = -\Phi^* I\,\Phi, \qquad \langle J, \Phi\Phi^*\rangle = -\Phi^* J\,\Phi, \qquad \langle K, \Phi\Phi^*\rangle = -\Phi^* K\,\Phi, \tag{11}$$

this shows that the second part of the Seiberg–Witten equations can be expressed as follows: Given any (global, imaginary-valued) 2-form $F$, the image under the map $\rho^+$ of its self-dual part $F^+$ coincides with the orthogonal projection of $\Phi\Phi^*$ onto the subbundle of the positive spinor bundle which is the image bundle of the complexified subbundle of self-dual 2-forms under the map $\rho^+$, that is,

$$\rho^+(F^+) = (\Phi\Phi^*)^+. \tag{12}$$

Indeed, in the present case $(\Phi\Phi^*)^+$ is nothing else than $(\Phi\Phi^*)_0$. In this modified form the Seiberg–Witten equations allow a tempting generalisation. Suppose we are given a subbundle $S \subset \Lambda^2(T^*X)$. Denote the complexification of $S$ by $S^*$, the projection of an imaginary valued 2-form field $F$ onto $S^*$ by $F^+$ and the projection of $\phi\phi^*$ onto $\rho^+(S^*)$ by $(\phi\phi^*)^+$. Then the equation $\rho^+(F^+) = (\phi\phi^*)^+$ can be taken as a substitute for the 4-dimensional equation (3) in $2n$ dimensions. An arbitrary choice of $S$ would probably not give anything interesting, but stable subbundles with respect to certain structures on $X$ are likely to give useful equations.

4. Monopole Equations on 8-Manifolds

We now consider 8-manifolds with Spin(7) holonomy. In this case there are two natural choices of $S$ which have already found applications in the existing literature. In the 28-dimensional space of 2-forms $\Omega^2(\mathbb{R}^8, \mathbb{R})$, there are two orthogonal subspaces $S_1$ and $S_2$ (7 and 21 dimensional, respectively) which are $\mathrm{spin}(7) \subset \mathrm{so}(8)$ invariant [11, 12]. On an 8-manifold $X$ with Spin(7) holonomy (so that the structure group is reducible to Spin(7)) they give rise to global subbundles (denoted by the same letters) $S_1, S_2 \subset \Lambda^2(T^*X)$ which can play the above mentioned role. We will concentrate on the 7-dimensional subbundle $S_1$ and show that the resulting equations are elliptic, exemplify the local existence of non-trivial solutions and show that they are related to solutions of the 4-dimensional Seiberg–Witten equations.

We would like to point out that instead of the widely known CDFN 7-plane, we are working with another 7-plane in $\Omega^2(\mathbb{R}^8, \mathbb{R})$, which is conjugate to the CDFN 7-plane and thus invariant under a conjugate spin(7) embedding in so(8). This has the advantage that the 2-forms in this 7-plane can be expressed in an elegant way in terms of 4-dimensional self-dual and anti-self-dual 2-forms. (For a general account we refer to our previous work, [10].) We will define this 7-plane below, but before that, for the sake of clarity, we first wish to present the global monopole equations.

Let $X$ be an 8-manifold with Spin(7) holonomy and $S$ be any stable subbundle of $\Lambda^2(T^*X)$ and $S^*$ its complexification. Given an imaginary valued global 2-form $F$, let us denote its projection onto $S^*$ by $F^+$ and the projection of any global spinor $\phi$ onto the subbundle $\rho^+(S^*) \subset \mathrm{End}(W^+)$ by $\phi^+$. Then the monopole equations read

$$D_A(\phi) = 0, \tag{13}$$

$$\rho^+(F_A^+) = (\phi\phi^*)^+. \tag{14}$$

Now, we define $S_1 \subset \Omega^2(\mathbb{R}^8, \mathbb{R})$ to be the linear space of 2-forms

$$\omega = \sum_{i<j} \omega_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, \mathbb{R}),$$

which can be expressed in matrix form as

$$\omega = \omega_{12}\, f + \begin{pmatrix} \omega' & \omega'' \\ \omega'' & -\omega' \end{pmatrix}, \tag{15}$$

where $\omega_{12}$ is a real function, $\omega'$ is the matrix of a 4-dimensional self-dual 2-form, $\omega''$ is the matrix of a 4-dimensional anti-self-dual 2-form and we let $f = i\sigma_2 \otimes I_4$. These 2-forms span a 7-dimensional linear subspace $S_1$ in the 28-dimensional space of 2-forms and the square of any element in this subspace is a scalar matrix. $S_1$ is maximal with respect to this property. We choose the following orthogonal basis for this maximal linear subspace of self-dual 2-forms:

$$\begin{aligned}
f_1 &= dx_1 \wedge dx_5 + dx_2 \wedge dx_6 + dx_3 \wedge dx_7 + dx_4 \wedge dx_8, \\
f_2 &= dx_1 \wedge dx_2 + dx_3 \wedge dx_4 - dx_5 \wedge dx_6 - dx_7 \wedge dx_8, \\
f_3 &= dx_1 \wedge dx_6 - dx_2 \wedge dx_5 - dx_3 \wedge dx_8 + dx_4 \wedge dx_7, \\
f_4 &= dx_1 \wedge dx_3 - dx_2 \wedge dx_4 - dx_5 \wedge dx_7 + dx_6 \wedge dx_8, \\
f_5 &= dx_1 \wedge dx_7 + dx_2 \wedge dx_8 - dx_3 \wedge dx_5 - dx_4 \wedge dx_6, \\
f_6 &= dx_1 \wedge dx_4 + dx_2 \wedge dx_3 - dx_5 \wedge dx_8 - dx_6 \wedge dx_7, \\
f_7 &= dx_1 \wedge dx_8 - dx_2 \wedge dx_7 + dx_3 \wedge dx_6 - dx_4 \wedge dx_5.
\end{aligned} \tag{16}$$
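The claim that the square of any element of $S_1$ is a scalar matrix can be checked directly from the basis (16). The sketch below assumes one particular convention, namely that a 2-form is represented by the antisymmetric $8 \times 8$ matrix whose $(i, j)$ entry, $i < j$, is the coefficient of $dx_i \wedge dx_j$; the helper names are hypothetical. It verifies the Clifford-type relations $f_i f_j + f_j f_i = -2\delta_{ij}$, which imply the scalar-square property.

```python
# (i, j, sign) triples read off from (16): sign * dx_i ^ dx_j, indices 1-based.
BASIS_16 = {
    1: [(1, 5, 1), (2, 6, 1), (3, 7, 1), (4, 8, 1)],
    2: [(1, 2, 1), (3, 4, 1), (5, 6, -1), (7, 8, -1)],
    3: [(1, 6, 1), (2, 5, -1), (3, 8, -1), (4, 7, 1)],
    4: [(1, 3, 1), (2, 4, -1), (5, 7, -1), (6, 8, 1)],
    5: [(1, 7, 1), (2, 8, 1), (3, 5, -1), (4, 6, -1)],
    6: [(1, 4, 1), (2, 3, 1), (5, 8, -1), (6, 7, -1)],
    7: [(1, 8, 1), (2, 7, -1), (3, 6, 1), (4, 5, -1)],
}

def form_to_matrix(terms, dim=8):
    """Antisymmetric matrix of a 2-form (assumed convention, see lead-in)."""
    m = [[0] * dim for _ in range(dim)]
    for i, j, sign in terms:
        m[i - 1][j - 1] = sign
        m[j - 1][i - 1] = -sign
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def scalar_matrix(value, dim=8):
    return [[value if r == c else 0 for c in range(dim)] for r in range(dim)]

F = {index: form_to_matrix(terms) for index, terms in BASIS_16.items()}

def satisfies_clifford_relations():
    """True if f_i f_j + f_j f_i = -2 delta_ij * identity for all i, j."""
    for i in F:
        for j in F:
            ab, ba = matmul(F[i], F[j]), matmul(F[j], F[i])
            anti = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]
            if anti != scalar_matrix(-2 if i == j else 0):
                return False
    return True

def combination_square(coeffs):
    """Square of sum_i coeffs[i-1] * f_i as an 8x8 matrix."""
    combo = [[sum(c * F[i + 1][r][s] for i, c in enumerate(coeffs))
              for s in range(8)] for r in range(8)]
    return matmul(combo, combo)

if __name__ == "__main__":
    print(satisfies_clifford_relations())
```

Because only products of pairs enter, the outcome does not change if every matrix is replaced by its negative, so the check is insensitive to the sign convention chosen for the matrix of a 2-form.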

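In matrix notation we set $f_1 = f$, and take $f_2 = \sigma_3 \otimes a_1$, $f_3 = \sigma_1 \otimes b_1$, $f_4 = \sigma_3 \otimes a_2$, $f_5 = \sigma_1 \otimes b_2$, $f_6 = \sigma_3 \otimes a_3$, $f_7 = \sigma_1 \otimes b_3$, where $(\sigma_1, \sigma_2, \sigma_3)$ are the usual Pauli matrices and we have

$$a_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \qquad a_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \qquad a_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$$

and

$$b_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad b_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad b_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}. \tag{17}$$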


At this point it will be instructive to show that the above basis corresponds to a representation of the Clifford algebra $Cl_7$ induced by right multiplications in the algebra of octonions. We adopt the Cayley–Dickson approach and describe a quaternion by a pair of complex numbers so that $a = (x + iy) + j(u + iv)$, where $(i, j, ij = k)$ are the imaginary unit quaternions. In a similar way an octonion is described by a pair of quaternions $(a, b)$. Then the octonionic multiplication rule is

$$(a, b) \cdot (c, d) = (ac - \bar d b,\ da + b\bar c). \tag{18}$$

If we now represent an octonion $(a, b)$ by a vector in $\mathbb{R}^8$, its right multiplication by imaginary unit octonions corresponds to linear transformations on $\mathbb{R}^8$. We thus obtain the following correspondences:

$$(0, 1) \to f_1, \quad (i, 0) \to f_2, \quad (j, 0) \to f_3, \quad (k, 0) \to f_4, \quad (0, i) \to f_5, \quad (0, j) \to f_6, \quad (0, k) \to f_7. \tag{19}$$

The projection $F^+$ of a 2-form $F = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, i\mathbb{R})$ onto the complexification of the above self-dual subspace is given by

$$\begin{aligned}
F^+ ={}& \tfrac14 (F_{15} + F_{26} + F_{37} + F_{48})\, f_1 + \tfrac14 (F_{12} + F_{34} - F_{56} - F_{78})\, f_2 \\
&+ \tfrac14 (F_{16} - F_{25} - F_{38} + F_{47})\, f_3 + \tfrac14 (F_{13} - F_{24} - F_{57} + F_{68})\, f_4 \\
&+ \tfrac14 (F_{17} + F_{28} - F_{35} - F_{46})\, f_5 + \tfrac14 (F_{14} + F_{23} - F_{58} - F_{67})\, f_6 \\
&+ \tfrac14 (F_{18} - F_{27} + F_{36} - F_{45})\, f_7.
\end{aligned}$$

We now fix the constant spin$^c$-structure $\Gamma : \mathbb{R}^8 \to \mathbb{C}^{16\times16}$ given by

$$\Gamma(e_i) = \begin{pmatrix} 0 & \gamma(e_i) \\ -\gamma(e_i)^* & 0 \end{pmatrix}, \tag{20}$$

where $e_i$, $i = 1, 2, \ldots, 8$, is the standard basis for $\mathbb{R}^8$ and $\gamma(e_1) = \mathrm{Id}$, $\gamma(e_i) = f_{i-1}$ for $i = 2, 3, \ldots, 8$. We note that this choice is specific to 8 dimensions, because $2n = 2^{n-1}$ only for $n = 4$. We have $X = \mathbb{R}^8$, $W = \mathbb{R}^8 \times \mathbb{C}^{16}$, $W^\pm = \mathbb{R}^8 \times \mathbb{C}^8$ and $L_\Gamma = L_\Gamma^{1/2} = \mathbb{R}^8 \times \mathbb{C}$. Consider the connection 1-form

$$A = \sum_{i=1}^{8} A_i\, dx_i \in \Omega^1(\mathbb{R}^8, i\mathbb{R}) \tag{21}$$

on the line bundle $\mathbb{R}^8 \times \mathbb{C}$. Its curvature is given by

$$F_A = \sum_{i<j} F_{ij}\, dx_i \wedge dx_j \in \Omega^2(\mathbb{R}^8, i\mathbb{R}), \tag{22}$$

where $F_{ij} = \dfrac{\partial A_j}{\partial x_i} - \dfrac{\partial A_i}{\partial x_j}$. The spin$^c$ connection $\nabla = \nabla_A$ on $W^+$ is given by

$$\nabla_i \Phi = \frac{\partial\Phi}{\partial x_i} + A_i \Phi \tag{23}$$


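$(i = 1, \ldots, 8)$, where $\Phi : \mathbb{R}^8 \to \mathbb{C}^8$. Therefore the map $\rho^+ : \Lambda^2(T^*X) \otimes \mathbb{C} \to \mathrm{End}(W^+)$ can be computed for our generators $f_i$ to give

$$\begin{aligned}
\rho^+(f_1) &= \gamma(e_1)\gamma(e_5) + \gamma(e_2)\gamma(e_6) + \gamma(e_3)\gamma(e_7) + \gamma(e_4)\gamma(e_8), \\
\rho^+(f_2) &= \gamma(e_1)\gamma(e_2) + \gamma(e_3)\gamma(e_4) - \gamma(e_5)\gamma(e_6) - \gamma(e_7)\gamma(e_8), \\
\rho^+(f_3) &= \gamma(e_1)\gamma(e_6) - \gamma(e_2)\gamma(e_5) - \gamma(e_3)\gamma(e_8) + \gamma(e_4)\gamma(e_7), \\
\rho^+(f_4) &= \gamma(e_1)\gamma(e_3) - \gamma(e_2)\gamma(e_4) - \gamma(e_5)\gamma(e_7) + \gamma(e_6)\gamma(e_8), \\
\rho^+(f_5) &= \gamma(e_1)\gamma(e_7) + \gamma(e_2)\gamma(e_8) - \gamma(e_3)\gamma(e_5) - \gamma(e_4)\gamma(e_6), \\
\rho^+(f_6) &= \gamma(e_1)\gamma(e_4) + \gamma(e_2)\gamma(e_3) - \gamma(e_5)\gamma(e_8) - \gamma(e_6)\gamma(e_7), \\
\rho^+(f_7) &= \gamma(e_1)\gamma(e_8) - \gamma(e_2)\gamma(e_7) + \gamma(e_3)\gamma(e_6) - \gamma(e_4)\gamma(e_5).
\end{aligned}$$

Then for a connection $A = \sum_{i=1}^{8} A_i\, dx_i \in \Omega^1(\mathbb{R}^8, i\mathbb{R})$ and a given complex 8-spinor $\Psi = (\psi_1, \psi_2, \ldots, \psi_8) \in C^\infty(X, W^+) = C^\infty(\mathbb{R}^8, \mathbb{R}^8 \times \mathbb{C}^8)$ we state our 8-dimensional monopole equations as follows:

$$D_A(\Psi) = 0, \qquad \rho^+(F_A^+) = (\Psi\Psi^*)^+. \tag{24}$$

Here $(\Psi\Psi^*)^+$ is the orthogonal projection of $\Psi\Psi^*$ onto the spinor subbundle spanned by $\rho^+(f_i)$, $i = 1, 2, \ldots, 7$. More explicitly, $D_A(\Psi) = 0$ can be expressed as

$$\nabla_1\Psi = \gamma(e_2)\nabla_2\Psi + \gamma(e_3)\nabla_3\Psi + \cdots + \gamma(e_8)\nabla_8\Psi \tag{25}$$

and $\rho^+(F_A^+) = (\Psi\Psi^*)^+$ is equivalent to the equation

$$\rho^+(F_A^+) = \sum_{i=1}^{7} \langle \rho^+(f_i), \Psi\Psi^* \rangle\, \rho^+(f_i) / |\rho^+(f_i)|^2. \tag{26}$$

Equation (26) is equivalent to the set of equations

$$\begin{aligned}
F_{15} + F_{26} + F_{37} + F_{48} &= \tfrac18 \langle \rho^+(f_1), \Psi\Psi^* \rangle, \\
F_{12} + F_{34} - F_{56} - F_{78} &= \tfrac18 \langle \rho^+(f_2), \Psi\Psi^* \rangle, \\
F_{16} - F_{25} - F_{38} + F_{47} &= \tfrac18 \langle \rho^+(f_3), \Psi\Psi^* \rangle, \\
F_{13} - F_{24} - F_{57} + F_{68} &= \tfrac18 \langle \rho^+(f_4), \Psi\Psi^* \rangle, \\
F_{17} + F_{28} - F_{35} - F_{46} &= \tfrac18 \langle \rho^+(f_5), \Psi\Psi^* \rangle, \\
F_{14} + F_{23} - F_{58} - F_{67} &= \tfrac18 \langle \rho^+(f_6), \Psi\Psi^* \rangle, \\
F_{18} - F_{27} + F_{36} - F_{45} &= \tfrac18 \langle \rho^+(f_7), \Psi\Psi^* \rangle,
\end{aligned}$$

or still more explicitly to the equations

$$\begin{aligned}
F_{15} + F_{26} + F_{37} + F_{48} &= \tfrac14 (\psi_1\bar\psi_3 - \psi_3\bar\psi_1 - \psi_2\bar\psi_4 + \psi_4\bar\psi_2 - \psi_5\bar\psi_7 + \psi_7\bar\psi_5 - \psi_6\bar\psi_8 + \psi_8\bar\psi_6), \\
F_{12} + F_{34} - F_{56} - F_{78} &= \tfrac14 (\psi_1\bar\psi_5 - \psi_5\bar\psi_1 - \psi_2\bar\psi_6 + \psi_6\bar\psi_2 + \psi_3\bar\psi_7 - \psi_7\bar\psi_3 + \psi_4\bar\psi_8 - \psi_8\bar\psi_4), \\
F_{16} - F_{25} - F_{38} + F_{47} &= \tfrac14 (\psi_1\bar\psi_7 - \psi_7\bar\psi_1 + \psi_2\bar\psi_8 - \psi_8\bar\psi_2 - \psi_3\bar\psi_5 + \psi_5\bar\psi_3 + \psi_4\bar\psi_6 - \psi_6\bar\psi_4), \\
F_{13} - F_{24} - F_{57} + F_{68} &= \tfrac14 (\psi_1\bar\psi_2 - \psi_2\bar\psi_1 + \psi_3\bar\psi_4 - \psi_4\bar\psi_3 + \psi_5\bar\psi_6 - \psi_6\bar\psi_5 - \psi_7\bar\psi_8 + \psi_8\bar\psi_7), \\
F_{17} + F_{28} - F_{35} - F_{46} &= \tfrac14 (\psi_1\bar\psi_4 - \psi_4\bar\psi_1 + \psi_2\bar\psi_3 - \psi_3\bar\psi_2 - \psi_5\bar\psi_8 + \psi_8\bar\psi_5 + \psi_6\bar\psi_7 - \psi_7\bar\psi_6), \\
F_{14} + F_{23} - F_{58} - F_{67} &= \tfrac14 (-\psi_1\bar\psi_6 + \psi_6\bar\psi_1 - \psi_2\bar\psi_5 + \psi_5\bar\psi_2 - \psi_3\bar\psi_8 + \psi_8\bar\psi_3 + \psi_4\bar\psi_7 - \psi_7\bar\psi_4), \\
F_{18} - F_{27} + F_{36} - F_{45} &= \tfrac14 (\psi_1\bar\psi_8 - \psi_8\bar\psi_1 - \psi_2\bar\psi_7 + \psi_7\bar\psi_2 - \psi_3\bar\psi_6 + \psi_6\bar\psi_3 - \psi_4\bar\psi_5 + \psi_5\bar\psi_4).
\end{aligned}$$

5. Conclusion

We will now show that the system of monopole equations (25)-(26) forms an elliptic system. These equations can be written compactly in the form

$$\langle F, f_i \rangle = \tfrac18 \langle \rho^+(f_i), \Psi\Psi^* \rangle, \quad i = 1, \ldots, 7, \qquad D_A(\Psi) = 0.$$

If in addition we impose the Coulomb gauge condition

$$\sum_{i=1}^{8} \partial_i A_i = 0,$$

we obtain a system of first order partial differential equations consisting of eight equations for the components of the spinor $\Psi$ and eight equations for the components of the connection 1-form $A$. The characteristic determinant of this system [20] is the product of the characteristic determinants of the equations for $\Psi$ and $A$. As the Dirac operator is elliptic [19], the ellipticity of the present system depends on the characteristic determinant of the system consisting of $\langle F, f_i \rangle = \tfrac18 \langle \rho^+(f_i), \Psi\Psi^* \rangle$, $i = 1, \ldots, 7$, and the Coulomb gauge condition. In the computation of the characteristic determinant, the fifth row, for instance, is obtained from

$$F_{15} + F_{26} + F_{37} + F_{48} = \partial_1 A_5 - \partial_5 A_1 + \partial_2 A_6 - \partial_6 A_2 + \partial_3 A_7 - \partial_7 A_3 + \partial_4 A_8 - \partial_8 A_4$$

by replacing $\partial_i$ by $\xi_i$. Thus after a rearrangement of the order of the equations, the characteristic determinant can be obtained as

$$\det \begin{pmatrix}
\xi_1 & \xi_2 & \xi_3 & \xi_4 & \xi_5 & \xi_6 & \xi_7 & \xi_8 \\
-\xi_2 & \xi_1 & -\xi_4 & \xi_3 & \xi_6 & -\xi_5 & \xi_8 & -\xi_7 \\
-\xi_3 & \xi_4 & \xi_1 & -\xi_2 & \xi_7 & -\xi_8 & -\xi_5 & \xi_6 \\
-\xi_4 & -\xi_3 & \xi_2 & \xi_1 & \xi_8 & \xi_7 & -\xi_6 & -\xi_5 \\
-\xi_5 & -\xi_6 & -\xi_7 & -\xi_8 & \xi_1 & \xi_2 & \xi_3 & \xi_4 \\
-\xi_6 & \xi_5 & \xi_8 & -\xi_7 & -\xi_2 & \xi_1 & \xi_4 & -\xi_3 \\
-\xi_7 & -\xi_8 & \xi_5 & \xi_6 & -\xi_3 & -\xi_4 & \xi_1 & \xi_2 \\
-\xi_8 & \xi_7 & -\xi_6 & \xi_5 & -\xi_4 & \xi_3 & -\xi_2 & \xi_1
\end{pmatrix}.$$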
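As a sanity check on the characteristic matrix just displayed, the following sketch (an illustration with hypothetical names, not part of the original argument) encodes the matrix as a table of signed indices, verifies that its rows are mutually orthogonal with squared length $|\xi|^2$, and evaluates the determinant exactly with rational arithmetic for integer test vectors.

```python
from fractions import Fraction

# Entry e stands for sign(e) * xi[|e|]; rows copied from the displayed matrix.
ROWS = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [-2, 1, -4, 3, 6, -5, 8, -7],
    [-3, 4, 1, -2, 7, -8, -5, 6],
    [-4, -3, 2, 1, 8, 7, -6, -5],
    [-5, -6, -7, -8, 1, 2, 3, 4],
    [-6, 5, 8, -7, -2, 1, 4, -3],
    [-7, -8, 5, 6, -3, -4, 1, 2],
    [-8, 7, -6, 5, -4, 3, -2, 1],
]

def symbol(xi):
    """Characteristic matrix evaluated at the covector xi = (xi_1, ..., xi_8)."""
    return [[(1 if e > 0 else -1) * xi[abs(e) - 1] for e in row] for row in ROWS]

def gram(m):
    """M M^T; equals |xi|^2 times the identity if the rows are orthogonal."""
    n = len(m)
    return [[sum(m[r][k] * m[c][k] for k in range(n)) for c in range(n)]
            for r in range(n)]

def determinant(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def norm_squared(xi):
    return sum(x * x for x in xi)

if __name__ == "__main__":
    xi = [1, 2, 3, 4, 5, 6, 7, 8]
    print(determinant(symbol(xi)), norm_squared(xi) ** 4)
```

Row orthogonality gives $\det(M)^2 = |\xi|^{16}$, and the exact evaluation fixes the sign at the sampled points; a determinant that vanishes only at $\xi = 0$ is what ellipticity requires.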


It is equal to $(\xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2 + \xi_5^2 + \xi_6^2 + \xi_7^2 + \xi_8^2)^4$, and this proves ellipticity.

Finally we point out that the monopole equations (25)-(26) admit non-trivial solutions. For example, if the pair $(A, \Phi)$ with

$$A = \sum_{i=1}^{4} A_i(x_1, x_2, x_3, x_4)\, dx_i$$

and

$$\Phi = (\phi_1(x_1, x_2, x_3, x_4),\ \phi_2(x_1, x_2, x_3, x_4))$$

is a solution of the 4-dimensional Seiberg–Witten equations, then the pair $(B, \Psi)$ with

$$B = \sum_{i=1}^{4} A_i(x_1, x_2, x_3, x_4)\, dx_i$$

(i.e. the first four components $B_i$ of $B$ coincide with $A_i$, thus not depending on $x_5, x_6, x_7, x_8$, and the last four components of $B$ vanish) and

$$\Psi = (0, 0, \phi_1, \phi_2, 0, 0, i\phi_1, -i\phi_2),$$

where $\phi_1$ and $\phi_2$ depend only on $x_1, x_2, x_3, x_4$, is a solution of these new 8-dimensional monopole equations. It can directly be verified that $\Psi$ is harmonic with respect to $B$ and the second part of the equations is also satisfied.

References

1. Seiberg, N., Witten, E.: Nucl. Phys. B426, 19 (1994)
2. Witten, E.: Math. Res. Lett. 1, 764 (1994)
3. Flume, R., O'Raifeartaigh, L., Sachs, I.: Brief resume of the Seiberg–Witten theory. hep-th/9611118
4. Donaldson, S.K., Thomas, R.P.: Gauge theory in higher dimensions. Oxford University preprint, 1996
5. Baulieu, L., Kanno, H., Singer, I.M.: Special quantum field theories in eight and other dimensions. hep-th/9704167
6. Acharya, B.S., O'Loughlin, M., Spence, B.: Higher dimensional analogues of Donaldson-Witten Theory. hep-th/9705138
7. Hull, C.M.: Higher dimensional Yang–Mills theories and topological terms. hep-th/9710165
8. Salamon, D.: Spin Geometry and Seiberg–Witten Invariants. April 1996 version, Book to appear
9. Bilge, A.H., Dereli, T., Koçak, Ş.: Seiberg–Witten equations on R^8. In: The Proceedings of 5th Gökova Geometry-Topology Conference, Edited by S. Akbulut, T. Önder, R. Stern, TUBITAK, Ankara, 1997, p. 87
10. Bilge, A.H., Dereli, T., Koçak, Ş.: J. Math. Phys. 38, 4804 (1997)
11. Corrigan, E., Devchand, C., Fairlie, D., Nuyts, J.: Nucl. Phys. B214, 452 (1983)
12. Ward, R.S.: Nucl. Phys. B236, 381 (1984)
13. Bilge, A.H., Dereli, T., Koçak, Ş.: Lett. Math. Phys. 36, 301 (1996)
14. Gürsey, F., Tze, C.-H.: On the Role of Division, Jordan and Related Algebras in Particle Physics. Singapore: World Scientific, 1996
15. Fairlie, D., Nuyts, J.: J. Phys. A17, 2867 (1984)
16. Fubini, S., Nicolai, H.: Phys. Lett. B155, 369 (1985)
17. Grossman, B., Kephart, T.W., Stasheff, J.D.: Commun. Math. Phys. 96, 431 (1984) (Erratum: ibid. 100, 311 (1985))
18. Joyce, D.D.: Invent. Math. 123, 507 (1996)
19. Lawson, H.B., Michelsohn, M-L.: Spin Geometry. Princeton, NJ: Princeton U.P., 1989
20. John, F.: Partial Differential Equations. Berlin–Heidelberg–New York: Springer-Verlag, 1982

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 31 – 52 (1999)


Special Kähler Manifolds

Daniel S. Freed*

Schools of Mathematics and Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail: [email protected]

Received: 5 December 1997 / Accepted: 16 November 1998

Abstract: We give an intrinsic definition of the special geometry which arises in global N = 2 supersymmetry in four dimensions. The base of an algebraic integrable system exhibits this geometry, and with an integrality hypothesis any special Kähler manifold is so related to an integrable system. The cotangent bundle of a special Kähler manifold carries a hyperkähler metric. We also define special geometry in supergravity in terms of the special geometry in global supersymmetry.

Constraints on Riemannian metrics occur in many places in supersymmetry. For example, the requirement of extended supersymmetry in a two dimensional σ-model constrains the target manifold to be Kähler or hyperkähler depending on the amount of supersymmetry. The scalars in supergravity theories are often constrained to live on a particular homogeneous Riemannian manifold. These sorts of special metrics – metrics with restricted holonomy group (such as Kähler and hyperkähler metrics) and homogeneous metrics – are much studied by Riemannian geometers, but there are situations in which we meet something new. One important example occurs in four dimensional gauge theories with N = 2 supersymmetry: the scalars in the vector multiplet lie in a special Kähler manifold. This is the case pertaining to global supersymmetry; when coupled to N = 2 supergravity in four dimensions the scalars lie in a projective special Kähler manifold.¹ Notice that N = 1 supersymmetry already constrains the scalars to lie in a Kähler manifold, which must be Hodge in the supergravity case. Special geometry is the additional constraint imposed by N = 2 supersymmetry.

* The author is on leave from the Department of Mathematics at the University of Texas at Austin, Austin, TX 78712, USA, where he receives support from NSF grant DMS-962698. At the Institute for Advanced Study the author receives support from NSF grants DMS-9304580 and DMS-9627351, the Harmon Duncombe Foundation, and from the J. Seward Johnson Sr. Charitable Trust.

¹ Physicists use the term “special Kähler manifold” for both cases, and use words like “rigid” and “local” to distinguish them. Since these words have other connotations in geometry, we adopt a different terminology.

Special geometry appeared in the physics literature in 1984 in both global supersymmetry [ST,G] and supergravity [WP]. Strominger [St] gave a coordinate-free definition in the supergravity case. Projective special Kähler manifolds are important in mirror symmetry, as explained by Candelas and de la Ossa [CO]. Special Kähler manifolds in global supersymmetry have received more attention recently due to their prominent role in the seminal work of Seiberg and Witten on N = 2 supersymmetric Yang–Mills theories [SW1,SW2]. See [F,CRTP] for recent discussions of special geometry and for extensive references.

In this paper we introduce an intrinsic² definition of special geometry: A special Kähler structure is a flat connection on the tangent bundle of a Kähler manifold. The crucial condition is expressed in (1.2). From it follow the usual equations for special coordinates, the holomorphic prepotential, the Kähler potential, etc. We recount this in Sect. 1, where we also define this geometry in terms of a holomorphic cubic form.

In Sect. 2 we construct a hyperkähler metric on the cotangent bundle of a special Kähler manifold. A local version of this result appears in the physics literature [CFG]. It seems likely that there is actually a one-parameter family of hyperkähler metrics of which the one we construct is a limiting case (see [SW3]), but we have not pursued that here.

In Sect. 3 we prove the assertion made by Donagi and Witten [DW] that with a suitable integrality hypothesis a special Kähler manifold parametrizes an algebraic completely integrable system. As a consequence the total space of an algebraic integrable system carries a hyperkähler metric.
The usual definition of a projective special Kähler manifold is based on a particular type of variation of Hodge structure, which was first studied by Bryant and Griffiths [BG]. Our main observation here is that a projective special Kähler structure on a Hodge manifold M of dimension n induces a special pseudo-Kähler structure of Lorentz type on a closely related manifold $\tilde M$ of dimension n + 1. ($\tilde M$ is the total space of the Hodge line bundle with the zero section omitted.) With a suitable integrality hypothesis the associated intermediate Jacobians are an integrable system and carry a hyperkähler metric, results obtained previously [DM2,C].

Finally, in Sect. 5 we make some brief comments on the physics (in the case of global supersymmetry). We explain that supersymmetry combined with the quantization of electric and magnetic charges leads to the conclusion that integrable systems must enter into the low energy description of N = 2 supersymmetric gauge theories.

As mentioned above, the base of an algebraic integrable system is a special Kähler manifold. This is, I believe, the proper context for special Kähler geometry. There are many examples of algebraic integrable systems, and hopefully this excuses the paucity of examples presented here.

As mentioned in the footnote on the previous page, our terminology differs from that in the physics literature. We include the following table to aid in translation:

    Our Terminology              Physics Literature
    Special Kähler               Rigid Special Kähler (vector multiplets in global N = 2 supersymmetry)
    Projective Special Kähler    (Local) Special Kähler (vector multiplets in N = 2 supergravity)

² Intrinsic geometry concerns the tangent bundle and associated bundles, whereas extrinsic geometry involves bundles not constructed directly from the coordinate charts of a manifold. Definitions of special geometry in the physics literature are not intrinsic in this sense.


This paper grew out of a seminar talk explaining [DW], and it had a long gestation period since. During that time I benefited from conversations and lectures by many colleagues, including Jacques Distler, Ron Donagi, Nigel Hitchin, Graeme Segal, Nathan Seiberg, Karen Uhlenbeck, and Edward Witten. From the first version of the paper I received helpful remarks from Vicente Cortés, James Gates, Zhiqin Lu, Simon Salamon, and the referees. I thank them all.

1. Definition and Basic Properties

We introduce the following definition.

Definition 1.1. Let M be a Kähler manifold with Kähler form ω. A special Kähler structure on M is a real flat torsionfree symplectic connection ∇ satisfying
\[ d_\nabla I = 0, \tag{1.2} \]
where I is the complex structure on M.

First we examine the consequences of the connection on the underlying real symplectic structure on M. The connection ∇ determines an extension of the de Rham complex
\[ 0 \longrightarrow \Omega^0(TM) \xrightarrow{\;d_\nabla = \nabla\;} \Omega^1(TM) \xrightarrow{\;d_\nabla\;} \Omega^2(TM) \xrightarrow{\;d_\nabla\;} \cdots. \tag{1.3} \]
The flatness is the condition $d_\nabla^2 = 0$. Note that the Poincaré lemma holds for (1.3): a closed TM-valued form is locally exact. The torsionfree condition may be expressed by
\[ d_\nabla(\mathrm{id}) = 0, \tag{1.4} \]
where $\mathrm{id} \in \Omega^1(TM)$ is the identity endomorphism of TM. Now if $\{\xi_\alpha\}$ is a flat local framing of M with dual coframing $\{\theta^\alpha\}$, then (1.4) implies $d\theta^\alpha = 0$, whence $\theta^\alpha = dt^\alpha$ for some local coordinate functions $t^\alpha$.³ Since ∇ω = 0 we can choose these coordinates to be Darboux; that is, the coordinate functions are $x^i, y_j$ ($i, j = 1, \dots, n = \dim_{\mathbb C} M$) with
\[ \omega = dx^i \wedge dy_i. \tag{1.5} \]
Summarizing, a flat torsionfree symplectic connection ∇ is equivalent to a flat symplectic structure on M. This is a covering by flat Darboux coordinate systems $\{x^i, y_j\}$ whose transition functions are of the form
\[ \begin{pmatrix} x \\ y \end{pmatrix} = P \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} + \begin{pmatrix} a \\ b \end{pmatrix}, \qquad P \in Sp(2n; \mathbb R), \quad a, b \in \mathbb R^n. \tag{1.6} \]
(The coordinates are “flat” since $\nabla dx^i = \nabla dy_j = 0$.) Equation (1.5) is valid in any flat Darboux coordinate system.

³ For simplicity we always choose our coordinate systems to be defined on connected open sets, and we allow the domains of the coordinate systems to shrink when necessary.
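The step from (1.4) to $d\theta^\alpha = 0$ can be spelled out as follows. This is only a sketch: it assumes the usual convention, not stated in the text, that $d_\nabla(\theta \otimes \xi) = d\theta \otimes \xi - \theta \wedge \nabla\xi$ for a 1-form θ and a vector field ξ. With the opposite sign convention the conclusion is unchanged, since the second term vanishes for a flat frame.

```latex
% Sketch: torsion-freeness (1.4) in a flat frame \{\xi_\alpha\} with dual coframe \{\theta^\alpha\}.
% The sign convention for d_\nabla on \theta \otimes \xi is an assumption of this sketch.
\begin{align*}
\mathrm{id} &= \theta^\alpha \otimes \xi_\alpha, \\
d_\nabla(\mathrm{id}) &= d\theta^\alpha \otimes \xi_\alpha - \theta^\alpha \wedge \nabla\xi_\alpha
   = d\theta^\alpha \otimes \xi_\alpha
   \qquad (\text{since } \nabla\xi_\alpha = 0), \\
d_\nabla(\mathrm{id}) = 0 \;&\Longrightarrow\; d\theta^\alpha = 0 \quad \text{for each } \alpha .
\end{align*}
```

Because the $\xi_\alpha$ are linearly independent at each point, the vanishing of $d\theta^\alpha \otimes \xi_\alpha$ forces each $d\theta^\alpha$ to vanish separately.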


The compatibility with the complex structure is expressed⁴ by (1.2), or equivalently by
\[ d_\nabla \pi^{(1,0)} = 0, \tag{1.7} \]
where $\pi^{(1,0)} \in \Omega^{1,0}(T_{\mathbb C} M)$ is projection onto the (1, 0) part of the complexified tangent bundle. The Poincaré lemma ensures that locally we can find a complex vector field ζ with
\[ \nabla\zeta = \pi^{(1,0)}. \tag{1.8} \]
Note that ζ is unique up to a flat complex vector field. Also, ζ is not necessarily holomorphic. Let $\{x^i, y_j\}$ be a flat Darboux coordinate system and write
\[ \zeta = \frac{1}{2}\Bigl( z^i \frac{\partial}{\partial x^i} - w_j \frac{\partial}{\partial y_j} \Bigr) \tag{1.9} \]
for some complex functions $z^i, w_j$. (The choice of sign and the factor ‘1/2’ yield standard formulas for $M = \mathbb C^n$.) Since $\pi^{(1,0)}$ has type (1, 0), Eq. (1.9) implies that $z^i, w_j$ are holomorphic functions and
\[ \pi^{(1,0)} = \frac{1}{2}\Bigl( dz^i \otimes \frac{\partial}{\partial x^i} - dw_j \otimes \frac{\partial}{\partial y_j} \Bigr). \tag{1.10} \]
It follows that
\[ \operatorname{Re}(dz^i) = dx^i, \qquad \operatorname{Re}(dw_j) = -dy_j. \tag{1.11} \]
In particular, $\{z^i\}$ is a local holomorphic coordinate system on M.⁵ We easily compute
\[ \frac{\partial}{\partial z^i} = \frac{1}{2}\Bigl( \frac{\partial}{\partial x^i} - \tau_{ij} \frac{\partial}{\partial y_j} \Bigr), \tag{1.12} \]
where
\[ \tau_{ij} = \frac{\partial w_j}{\partial z^i}. \tag{1.13} \]
Now the fact that ω has type (1, 1) implies that $\tau_{ij} = \tau_{ji}$, and so there is a (local) holomorphic function F, determined up to a constant, so that
\[ w_j = \frac{\partial F}{\partial z^j}, \qquad \tau_{ij} = \frac{\partial^2 F}{\partial z^i \partial z^j}. \tag{1.14} \]
F is called the holomorphic prepotential.

⁴ We give a characterization in terms of coordinates in Proposition 1.25 below.

⁵ Of course, so is $\{w_j\}$. We call $\{z^i\}$ and $\{w_j\}$ conjugate coordinate systems (Definition 1.37).

It determines a Kähler potential
\[ K = \frac{1}{2} \operatorname{Im}\Bigl( \frac{\partial F}{\partial z^i}\, \bar z^i \Bigr) = \frac{1}{2} \operatorname{Im}(w_i \bar z^i), \tag{1.15} \]


and in terms of this data the Kähler form is
\[ \omega = \sqrt{-1}\,\partial\bar\partial K = \frac{\sqrt{-1}}{2} \operatorname{Im}\Bigl( \frac{\partial^2 F}{\partial z^i \partial z^j} \Bigr) dz^i \wedge d\bar z^j = \frac{\sqrt{-1}}{2} \operatorname{Im}(\tau_{ij})\, dz^i \wedge d\bar z^j. \tag{1.16} \]
Formulas (1.14)–(1.16) are standard in the literature on special Kähler geometry; they show that our global Definition 1.1 reproduces the usual local characterization. We term $\{z^i\}$ a special coordinate system. We characterize special coordinate systems below in Definition 1.37.

Remark 1.17. Condition (1.2) does not mean that the complex structure I is flat. Indeed, if ∇I = 0, then M is a flat Kähler manifold, locally isometric to $\mathbb C^n$. Such a manifold is special Kähler, but of a very special type. Note that the existence of a flat symplectic structure has nontrivial global topological consequences but gives no local restriction. Equation (1.2), on the other hand, is a stringent local condition.

Remark 1.18. Based on an earlier version of this paper, Zhiqin Lu [L] proved that there are no nonflat complete special Kähler manifolds.

Remark 1.19. The special Kähler condition (1.7) automatically implies that ∇ is torsionfree, since (1.4) is twice the real part of (1.7).

Remark 1.20. Locally, we may specify a special Kähler geometry by giving a holomorphic function $F(z^1, \dots, z^n)$ such that
\[ \operatorname{Im} \frac{\partial^2 F}{\partial z^i \partial z^j} \]
is positive definite. The function
\[ F(z^1, \dots, z^n) = \frac{\sqrt{-1}}{2}\bigl( (z^1)^2 + \cdots + (z^n)^2 \bigr) \]
leads to the flat metric on $\mathbb C^n$. A nontrivial example in one dimension is provided by the holomorphic function
\[ F(\tau) = \frac{\tau^3}{6}, \]
defined on the upper half plane
\[ \mathbb H = \{\tau : \operatorname{Im}\tau > 0\}. \]
The corresponding Kähler form
\[ \omega = \frac{\sqrt{-1}}{2} \operatorname{Im}(\tau)\, d\tau \wedge d\bar\tau \]
has Gauss curvature $1/\bigl(2(\operatorname{Im}\tau)^3\bigr)$. Note that the coordinate conjugate to τ is $w = \partial F/\partial\tau = \tau^2/2$. An adapted⁶ flat Darboux coordinate system $\{x, y\}$ is
\[ x = \operatorname{Re}\tau, \qquad y = -\operatorname{Re}(\tau^2/2). \]
In these coordinates the Riemannian metric is
\[ g = \frac{2(x^2 + y)\,dx^2 + 2x\,dx\,dy + dy^2}{\sqrt{x^2 + 2y}}. \]
It is the Hessian of the function $\phi = \frac{1}{3}(x^2 + 2y)^{3/2}$; see Proposition 1.24 below. This metric is incomplete; see Remark 1.18.

⁶ See Definition 1.37.
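The Hessian claim in Remark 1.20 can be checked by hand. The sketch below writes τ = x + is with s = Im τ > 0; the symbol s is introduced here for convenience and does not appear in the text. With this notation, $y = -\operatorname{Re}(\tau^2/2) = -(x^2 - s^2)/2$, so $s^2 = x^2 + 2y$.

```latex
% Check that the metric g of Remark 1.20 is the Hessian of \phi = \tfrac13 (x^2+2y)^{3/2}.
% The notation s = \operatorname{Im}\tau = \sqrt{x^2 + 2y} is an assumption of this sketch.
\begin{align*}
\phi &= \tfrac{1}{3} s^3, \qquad s_x = \frac{x}{s}, \qquad s_y = \frac{1}{s}, \\
\phi_x &= s^2 s_x = x s, \qquad \phi_y = s^2 s_y = s, \\
\phi_{xx} &= s + \frac{x^2}{s} = \frac{2(x^2 + y)}{\sqrt{x^2 + 2y}}, \qquad
\phi_{xy} = \frac{x}{\sqrt{x^2 + 2y}}, \qquad
\phi_{yy} = \frac{1}{\sqrt{x^2 + 2y}}, \\
\phi_{xx}\,dx^2 + 2\phi_{xy}\,dx\,dy + \phi_{yy}\,dy^2
  &= \frac{2(x^2 + y)\,dx^2 + 2x\,dx\,dy + dy^2}{\sqrt{x^2 + 2y}} = g .
\end{align*}
```

The same substitution gives $ds = (x\,dx + dy)/s$ and hence $g = s\,(dx^2 + ds^2) = \operatorname{Im}(\tau)\,|d\tau|^2$, which is the metric belonging to the Kähler form displayed in the remark.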


Remark 1.21. Nowhere do we use the positive definiteness of ω. Hence our discussion applies also to pseudo-Kähler manifolds. (A pseudo-Kähler metric ω is nondegenerate and dω = 0, but it is not assumed positive definite.)

We have the following easy result.

Proposition 1.22.
(a) Let (M, ω, ∇) be a special Kähler manifold. The connection ∇ determines a horizontal distribution H in the real cotangent bundle T*M. Then H is invariant under the complex structure of T*M.
(b) The (0, 1) part of the connection ∇ on the complex tangent bundle TM equals the $\bar\partial$ operator.

Proof. (a) Choose a flat Darboux coordinate system $\{x^i, y_j\}$. Then the local 1-forms $dx^i, dy_j$ define sections of T*M → M whose image is an integral manifold of H. Since $dx^i$ and $dy_j$ are the real parts of holomorphic differentials (see (1.11)) their graphs are complex submanifolds.
(b) From (1.12) we compute that $\nabla(\partial/\partial z^j)$ is a form of type (1, 0):
\[ \nabla \frac{\partial}{\partial z^j} = -\frac{1}{2} \frac{\partial \tau_{j\ell}}{\partial z^k}\, dz^k \otimes \frac{\partial}{\partial y_\ell}. \tag{1.23} \]
Since $\{\partial/\partial z^i\}$ is a local basis of holomorphic sections, the desired assertion follows. □

The Riemannian metric has a very special form in flat real coordinates – it is the Hessian of a function. This observation is due to Nigel Hitchin.

Proposition 1.24. Let (M, ω, ∇) be a special Kähler manifold. Suppose $\{u^\alpha\}$ is a ∇-flat coordinate system. (For example, it may be a flat Darboux coordinate system.) Then the Riemannian metric g is
\[ g = \frac{\partial^2 \phi}{\partial u^\alpha \partial u^\beta}\, du^\alpha \otimes du^\beta \]
for some real function φ. In fact, φ is a Kähler potential.

Proof. In these coordinates the symplectic form $\omega = \frac{1}{2}\omega_{\alpha\beta}\, du^\alpha \wedge du^\beta$ has constant coefficients. Now $g_{\alpha\beta} = \omega_{\alpha\gamma} I^\gamma_\beta$ and the special Kähler condition (1.2) implies
\[ \frac{\partial g_{\alpha\beta}}{\partial u^\gamma} = \frac{\partial g_{\alpha\gamma}}{\partial u^\beta}. \]
Hence $g_{\alpha\beta} = \partial\phi_\alpha/\partial u^\beta$ for some function $\phi_\alpha$. The symmetry of $g_{\alpha\beta}$ now implies that $\phi_\alpha = \partial\phi/\partial u^\alpha$ for some φ, as desired. To see that φ is a Kähler potential, we compute
\[ \begin{aligned}
\sqrt{-1}\,\partial\bar\partial\phi &= -\frac{1}{2}\, d I\, d\phi \\
&= -\frac{1}{2}\, d\Bigl( \frac{\partial\phi}{\partial u^\alpha}\, I^\alpha_\gamma\, du^\gamma \Bigr) \\
&= -\frac{1}{2} \Bigl( \frac{\partial^2\phi}{\partial u^\alpha \partial u^\beta}\, I^\alpha_\gamma + \frac{\partial\phi}{\partial u^\alpha} \frac{\partial I^\alpha_\gamma}{\partial u^\beta} \Bigr) du^\beta \wedge du^\gamma \\
&= -\frac{1}{2}\, g_{\alpha\beta}\, I^\alpha_\gamma\, du^\beta \wedge du^\gamma \\
&= \omega.
\end{aligned} \]


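We use the special Kähler condition to pass from the third line to the fourth. □

We next express the special Kähler condition (1.2) in terms of coordinates.

Proposition 1.25. Let (M, ω) be a Kähler manifold of dimension n and ∇ a flat torsionfree symplectic connection. Suppose $\{z^i\}$ is any local holomorphic coordinate system on M and $\{x^i, y_j\}$ a flat Darboux coordinate system. Write
\[ \frac{\partial}{\partial z^i} = \frac{1}{2}\Bigl( \sigma_i^j \frac{\partial}{\partial x^j} - \tau_{ij} \frac{\partial}{\partial y_j} \Bigr) \]
for functions $\sigma_i^j, \tau_{ij}$. Then $d_\nabla I = 0$ if and only if $\sigma_i^j, \tau_{ij}$ are holomorphic functions of $z^1, \dots, z^n$ and
\[ \frac{\partial \sigma_i^j}{\partial z^k} = \frac{\partial \sigma_k^j}{\partial z^i}, \qquad \frac{\partial \tau_{ij}}{\partial z^k} = \frac{\partial \tau_{kj}}{\partial z^i}. \]
The proof is straightforward: Compute $d_\nabla(\pi^{(1,0)}) = d_\nabla\bigl( dz^i \otimes \frac{\partial}{\partial z^i} \bigr)$. Notice that $\tau_{ij}$ is not necessarily symmetric, but rather $\tau_{ik}\sigma^k_j$ is symmetric in i, j.

There is a holomorphic cubic form Ξ on a special Kähler manifold which encodes the extent to which ∇ fails to preserve the complex structure. Namely, set
\[ \Xi = -\omega\bigl( \pi^{(1,0)}, \nabla\pi^{(1,0)} \bigr) \in H^0(M, \operatorname{Sym}^3 T^*M). \tag{1.26} \]
That Ξ is symmetric follows from the fact that ω is skew-symmetric, ∇ω = 0, and the special Kähler condition (1.7) (which says that $\nabla\pi^{(1,0)}$ is symmetric). The holomorphicity follows from the computation (1.28) below. Note the alternative local expression
\[ \Xi = -\omega(\nabla\zeta, \nabla^2\zeta), \tag{1.27} \]
where ζ is a local complex vector field satisfying (1.8). We compute (1.26) in special coordinates $\{z^i\}$ introduced above. From (1.23) and the fact that ω has type (1, 1), we have
\[ \begin{aligned}
\Xi &= -\omega\Bigl( dz^i \otimes \frac{\partial}{\partial z^i},\ \nabla\bigl( dz^j \otimes \frac{\partial}{\partial z^j} \bigr) \Bigr) \\
&= -dz^i \otimes dz^j\, \omega\Bigl( \frac{1}{2}\bigl( \frac{\partial}{\partial x^i} - \tau_{im} \frac{\partial}{\partial y_m} \bigr),\ -\frac{1}{2} \frac{\partial \tau_{j\ell}}{\partial z^k}\, dz^k \otimes \frac{\partial}{\partial y_\ell} \Bigr) \\
&= \frac{1}{4} \frac{\partial \tau_{ji}}{\partial z^k}\, dz^i \otimes dz^j \otimes dz^k \\
&= \frac{1}{4} \frac{\partial^3 F}{\partial z^i \partial z^j \partial z^k}\, dz^i \otimes dz^j \otimes dz^k.
\end{aligned} \tag{1.28} \]
Here we use (1.14) as well.

⁷ I learned this from the account in [BCOV], though it also appears in many other works.

The cubic form Ξ can also be used⁷ to relate the special Kähler connection ∇ to the Levi–Civita connection D. Write
\[ \nabla = D + A_{\mathbb R}, \tag{1.29} \]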


where $A_{\mathbb R} \in \Omega^1(M, \operatorname{End}_{\mathbb R} TM)$. Then since $D\pi^{(1,0)} = 0$, we have
\[ \Xi = -\omega\bigl( \pi^{(1,0)}, [A_{\mathbb R}, \pi^{(1,0)}] \bigr). \tag{1.30} \]
Moreover, there is a complex tensor
\[ A \in \Omega^{1,0}\bigl( \operatorname{Hom}(TM, \overline{TM}) \bigr) \tag{1.31} \]
with
\[ A_{\mathbb R} = A + \bar A. \]
To see this, note from (1.29) and Proposition 1.22(b) that $A_\xi$ vanishes on vectors of type (0, 1) if ξ is of type (1, 0). Then, since $A_\xi$ is infinitesimal symplectic, for ζ of type (1, 0) and $\bar\eta$ of type (0, 1), we have
\[ \omega(A_\xi \zeta, \bar\eta) = -\omega(\zeta, A_\xi \bar\eta) = 0. \]
Since ω has type (1, 1), this implies that $A_\xi \zeta$ is of type (0, 1). Therefore, A is as claimed in (1.31). Furthermore, A and Ξ determine each other. In particular, we recover the special Kähler structure from Ξ.

Conversely, we can start with a smooth cubic form $\Xi \in C^\infty(M, \operatorname{Sym}^3 T^*M)$ and ask for the conditions on Ξ which ensure that ∇ as defined by (1.29) and (1.30) is a special Kähler structure. Note ‘T*M’ denotes the complex cotangent bundle; we assume Ξ to be complex multilinear. The symmetry of Ξ implies that ∇ is symplectic, torsionfree, and satisfies (1.2). Setting the curvature of ∇ to zero from (1.29) yields the equation
\[ 0 = R + d_D A + A \wedge \bar A + \bar A \wedge A, \]
where R is the curvature of the Kähler metric on M. Here ‘$d_D$’ is the alternation of the Levi–Civita covariant derivative. Notice that as endomorphisms of the tangent bundle $R + A \wedge \bar A + \bar A \wedge A$ is complex linear, whereas $d_D A$ is complex antilinear (1.31); whence these separately vanish. The (1,1) piece of $d_D A$ is $\bar\partial A$, from which it follows that Ξ is holomorphic. The remaining equations are
\[ \partial_D A = 0, \qquad R = -(A \wedge \bar A + \bar A \wedge A). \tag{1.32} \]
In any local coordinate system $\{z^i\}$ we write
\[ \begin{aligned}
\omega &= \sqrt{-1}\, h_{i\bar j}\, dz^i \wedge d\bar z^j, \\
R &= \bigl( R^{j}_{\,i k \bar\ell}\, dz^k \wedge d\bar z^\ell \bigr)_{i,j}, \\
\Xi &= \Xi_{ijk}\, dz^i \otimes dz^j \otimes dz^k, \\
(A_i)^{\bar k}_{j} &= \sqrt{-1}\, \Xi_{ij\ell}\, h^{\ell\bar k}.
\end{aligned} \]
As usual, set $R_{i\bar j k\bar\ell} = h_{m\bar j} R^{m}_{\,i k \bar\ell}$. Then (1.32) is
\[ D_i A_j = D_j A_i, \qquad R_{i\bar j k\bar\ell} = -h^{\alpha\bar\beta}\, \Xi_{ik\alpha}\, \overline{\Xi_{j\ell\beta}}. \tag{1.33} \]
We summarize this discussion as follows.


Proposition 1.34.
(a) If (M, ω, ∇) is a special Kähler manifold, then there is an associated holomorphic cubic form $\Xi \in H^0(M, \operatorname{Sym}^3 T^*M)$, defined in (1.26), which satisfies (1.32).
(b) If (M, ω) is a Kähler manifold and $\Xi \in H^0(M, \operatorname{Sym}^3 T^*M)$ is a holomorphic cubic form which satisfies (1.32), then $\nabla = D + A_{\mathbb R}$ is a special Kähler structure, where D is the Levi–Civita connection and $A_{\mathbb R}$ is defined from Ξ by (1.30).

Remark 1.35. Lu [L] noticed that as a consequence of (1.33) any special Kähler manifold M has nonnegative scalar curvature ρ:
\[ \begin{aligned}
\rho &= -4\, h^{i\bar j} h^{k\bar\ell} R_{i\bar j k\bar\ell} \\
&= 4\, h^{i\bar j} h^{k\bar\ell} h^{\alpha\bar\beta}\, \Xi_{ik\alpha}\, \overline{\Xi_{j\ell\beta}} \\
&= 4|\Xi|^2.
\end{aligned} \tag{1.36} \]
Then he computes Δρ and uses a maximum principle to argue that if M is complete, then ρ = 0, from which Ξ and then R vanish.

Next, we discuss special coordinates.

Definition 1.37. Let (M, ω, ∇) be a special Kähler manifold.
(a) A holomorphic coordinate system $\{z^i\}$ is special if $\nabla \operatorname{Re}(dz^i) = 0$.
(b) We say that special coordinates $\{z^i\}$ and flat Darboux coordinates $\{x^i, y_j\}$ are adapted if $\operatorname{Re}(z^i) = x^i$.
(c) Special coordinate systems $\{z^i\}$, $\{w_j\}$ are said to be conjugate if there exists a flat Darboux coordinate system $\{x^i, y_j\}$ such that $\operatorname{Re}(z^i) = x^i$ and $\operatorname{Re}(w_j) = -y_j$.

Given adapted special coordinates $\{z^i\}$ and flat Darboux coordinates $\{x^i, y_j\}$, conjugate special coordinates $\{w_j\}$ are determined up to translation by a purely imaginary constant. For adapted coordinate systems we have Eqs. (1.9)–(1.16), but note that $\{w_j\}$, $\tau_{ij}$, F, and K are not completely determined by $\{z^i\}$ and $\{x^i, y_j\}$. The following proposition clarifies the choices involved.

Proposition 1.38. Let (M, ω, ∇) be a special Kähler manifold.
(a) Given a flat Darboux coordinate system $\{x^i, y_j\}$ there exists an adapted special coordinate system $\{z^i\}$. Any two choices $\{z^i\}$, $\{\tilde z^i\}$ satisfy $z^i = \tilde z^i + c^i$ for some purely imaginary constants $c^i$.
(b) Given a special coordinate system $\{z^i\}$ there exists an adapted flat Darboux coordinate system $\{x^i, y_j\}$. Any two choices differ by a change of variables
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ A & 1 \end{pmatrix} \begin{pmatrix} \tilde x \\ \tilde y \end{pmatrix} + \begin{pmatrix} 0 \\ b \end{pmatrix}, \]
where A is a (real) symmetric matrix and $b \in \mathbb R^n$.
(c) Given a special coordinate system $\{z^i\}$ the holomorphic prepotential F is determined up to a change
\[ F \longrightarrow F + \frac{1}{2} A_{ij} z^i z^j + B_i z^i + C, \]


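where $A = (A_{ij})$ is a real symmetric matrix, and $B_i, C \in \mathbb C$. So the conjugate coordinate system $\{w_j\}$ is determined up to a change
\[ w_j \longrightarrow w_j + A_{jk} z^k + B_j \]
and the Kähler potential (1.15) is determined up to a change
\[ K \longrightarrow K + \frac{1}{2} \operatorname{Im}(B_i \bar z^i). \]
(d) If $\{z^i\}$, $\{w_j\}$ are conjugate special coordinate systems, then any other pair $\{\tilde z^i\}$, $\{\tilde w_j\}$ of conjugate special coordinate systems are related by
\[ \begin{pmatrix} z \\ w \end{pmatrix} = P \begin{pmatrix} \tilde z \\ \tilde w \end{pmatrix} + \begin{pmatrix} a \\ b \end{pmatrix}, \qquad P \in Sp(2n; \mathbb R), \quad a, b \in \mathbb C^n. \tag{1.39} \]
The corresponding matrices $\tau, \tilde\tau$ are related by
\[ \tau = (D\tilde\tau + C)(B\tilde\tau + A)^{-1}, \qquad \text{where } P = \begin{pmatrix} A & B \\ C & D \end{pmatrix}. \tag{1.40} \]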
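The transformation law (1.40) can be recovered by differentiating (1.39). The following is a sketch only; it uses column-vector notation (an assumption of this sketch, not of the text) in which (1.13) together with the symmetry of τ reads $dw = \tau\,dz$, and likewise $d\tilde w = \tilde\tau\,d\tilde z$.

```latex
% Sketch: deriving (1.40) from (1.39), with P = (A B; C D) in n x n blocks.
% Column-vector notation dz, dw for (dz^i), (dw_j) is assumed here.
\begin{align*}
dz &= A\,d\tilde z + B\,d\tilde w = (A + B\tilde\tau)\,d\tilde z, \\
dw &= C\,d\tilde z + D\,d\tilde w = (C + D\tilde\tau)\,d\tilde z, \\
dw = \tau\,dz \;&\Longrightarrow\; \tau\,(B\tilde\tau + A) = D\tilde\tau + C
  \;\Longrightarrow\; \tau = (D\tilde\tau + C)(B\tilde\tau + A)^{-1}.
\end{align*}
```

The matrix $B\tilde\tau + A$ is invertible because both $\{z^i\}$ and $\{\tilde z^i\}$ are holomorphic coordinate systems, so the $dz^i$ are linearly independent combinations of the $d\tilde z^i$.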

2. The Associated Hyperkähler Manifold

In this section we prove the following theorem, which (in local form) is due to Cecotti, Ferrara, and Girardello [CFG].⁸

Theorem 2.1. The cotangent bundle T*M of a special Kähler manifold (M, ω, ∇) carries a canonical hyperkähler structure.

Recall that a Riemannian manifold (Y, g) is hyperkähler if it carries a triple of integrable almost complex structures I, J, K which satisfy the quaternion algebra and such that the associated 2-forms
\[ \omega_T(\xi_1, \xi_2) = g(\xi_1, T\xi_2), \qquad T = I, J, K, \tag{2.2} \]
are closed. A useful lemma of Hitchin [H, p. 64] asserts that if $\omega_I, \omega_J, \omega_K$ are closed, then I, J, K are integrable. If we consider $(Y, \omega_I)$ as a Kähler manifold with complex structure I, then
\[ \eta = \omega_J + i\omega_K \tag{2.3} \]
is a holomorphic symplectic form.

⁸ Equation (B.7) in [CFG] corresponds to our description of the metric in (2.4), where their $Z^I$ are special coordinates on M and $\{Z^I, W_J\}$ the induced coordinate system on T*M. Then (B.8b) describes the flat connection ∇.


Proof. Consider first a hermitian vector space V with complex structure I. The hermitian metric ⟨·, ·⟩ determines a metric and symplectic form on the underlying real vector space $V_{\mathbb R}$:
\[ \langle \xi_1, \xi_2 \rangle = g(\xi_1, \xi_2) + i\omega(\xi_1, \xi_2), \qquad \xi_1, \xi_2 \in V_{\mathbb R}. \]
Then $W = V \oplus V^* \cong V \oplus \bar V$ has a constant hyperkähler structure. The complex structure J is the antilinear map
\[ J : V \oplus \bar V \longrightarrow V \oplus \bar V, \qquad v_1 \oplus \bar v_2 \longmapsto -v_2 \oplus \bar v_1. \]
Now define K = IJ. Then I, J, K satisfy the quaternion algebra. The metric on $W_{\mathbb R}$ is
\[ g_W(\xi_1 \oplus \alpha_1, \xi_2 \oplus \alpha_2) = g(\xi_1, \xi_2) + g^{-1}(\alpha_1, \alpha_2), \qquad \xi_1, \xi_2 \in V_{\mathbb R}, \quad \alpha_1, \alpha_2 \in V_{\mathbb R}^*. \tag{2.4} \]
The forms $\omega_I, \omega_J, \omega_K$ are now determined by (2.2). It is straightforward to check that the holomorphic symplectic form η defined in (2.3) is the canonical form on $W = V \oplus V^* \cong T^*V$:
\[ \eta(v_1 \oplus \ell_1, v_2 \oplus \ell_2) = \ell_1(v_2) - \ell_2(v_1), \qquad v_1, v_2 \in V, \quad \ell_1, \ell_2 \in V^*. \]
Now let (M, ω, ∇) be special Kähler and let Y = T*M. Consider the distribution of horizontal spaces on Y given by the connection ∇. Here ‘horizontal’ means relative to the projection map π : Y → M. The horizontal space $H_y$ at y ∈ Y is a complex subspace of $T_yY$ by Proposition 1.22. The projection π identifies $H_y \cong T_mM$, where m = π(y), and so the splitting into horizontal and vertical is a splitting
\[ T_yY \cong T_mM \oplus T_m^*M. \tag{2.5} \]
The linear algebra of the preceding paragraph gives global endomorphisms I, J, K which satisfy the quaternion algebra. According to Hitchin’s lemma, to check that this determines a hyperkähler structure we need only verify that $\omega_I, \omega_J, \omega_K$ are closed. First, since the canonical holomorphic symplectic form η on Y = T*M is closed, Eq. (2.3) implies that $\omega_J$ and $\omega_K$ are also closed. To see that $\omega_I$ is closed we choose a flat Darboux coordinate system $\{x^i, y_j\}$ on an open set U ⊂ M. This induces a local coordinate system $\{x^i, y_j; q_i, p^j\}$ on $\pi^{-1}U \subset Y$. Since the splitting (2.5) is induced by ∇, and $dx^i, dy_j$ are ∇-flat by definition, it follows that
\[ \omega_I = dx^i \wedge dy_i + dq_i \wedge dp^i. \]
This form is closed. □

3. Integrable Systems

In the mathematical description of a (finite dimensional) classical mechanical system one meets a symplectic manifold X and a Hamiltonian function. It is an integrable system if there is a maximal set of Poisson commuting conserved momenta which includes the Hamiltonian. Under suitable hypotheses this leads to a foliation of X by lagrangian tori [GS, Sect. 44]. The complex analogue leads to the following definition [DM1], which we explain in the succeeding paragraphs.


Definition 3.1. An algebraic integrable system is a holomorphic map π : X → M where
(a) X is a complex symplectic manifold with holomorphic symplectic form $\eta \in \Omega^{2,0}(X)$;
(b) The fibers of π are compact lagrangian submanifolds, hence affine tori;
(c) There is a family of smoothly varying cohomology classes $[\rho_m] \in H^{1,1}(X_m) \cap H^2(X_m; \mathbb Z)$, m ∈ M, such that $[\rho_m]$ is a positive polarization of the fiber $X_m$. Hence $X_m$ is an abelian torsor.

The hypothesis that the fibers are compact lagrangian leads to the conclusion that they are affine tori. The fact that they are abelian torsors is an extra hypothesis. We assume that X and M are smooth.⁹

We now explain this definition and some consequences. Recall that a single¹⁰ abelian variety is a quotient A = V/Λ of a complex vector space V by a full real lattice Λ such that $H^{1,1}(A) \cap H^2(A; \mathbb Z) \neq 0$ and there is a positive class [ρ] in this intersection. Such a class is called a polarization and is represented by a unique invariant positive closed (1, 1)-form ρ on A. The polarization is principal if $\int_A \rho^n/n! = 1$. Note that ρ is a real symplectic form on A, and since it is invariant it is a symplectic form on $V_{\mathbb R}$ as well. Also, since ρ is an integral class, it induces a symplectic form on $\Lambda \cong H_1(A)$. Let $\{\gamma^i, \delta_j\} \subset \Lambda$ be a symplectic basis. Then there is a unique basis $\{\omega_i\}$ of holomorphic differentials on A with
\[ \int_{\gamma^i} \omega_j = \delta^i_j, \tag{3.2} \]
where $\delta^i_j$ is the Kronecker symbol. In fact, we can identify $\{\omega_i\}$ as the complex basis of V* dual to $\{\gamma^i\}$. Now
\[ \int_{\delta_j} \omega_i = \tau_{ij} \tag{3.3} \]
defines the period matrix τ of A. The Riemann bilinear relations state that the matrix $\tau = (\tau_{ij})$ belongs to the Siegel upper half space
\[ \mathbb H_n = \{\tau \text{ an } n \times n \text{ complex matrix} : \tau \text{ is symmetric and } \operatorname{Im}\tau \text{ is positive definite}\}. \]
The group $Sp(2n; \mathbb R)$ acts transitively on $\mathbb H_n$. A change of symplectic basis $\{\gamma^i, \delta_j\}$ transforms τ by an element of a discrete subgroup $\Gamma \subset Sp(2n; \mathbb R)$ which depends on the polarization. (For a principal polarization $\Gamma = Sp(2n; \mathbb Z)$.)

⁹ The singularities contain crucial physics, but for the geometry in this section we restrict to smooth points.

¹⁰ For convenience we use the same notation for the single abelian varieties in this explanatory paragraph as we do in the rest of the text for families of abelian varieties.

An abelian torsor X is a principal homogeneous space for an abelian variety A = V/Λ with a polarization [ρ]. Here V is the space of invariant vector fields on X and Λ ⊂ V the lattice of such vector fields which exponentiate to the identity map. We can identify A as the Albanese variety of X. Any point x ∈ X determines an isomorphism A → X, and the pullback of [ρ] is a polarization $[\hat\rho]$ of A which is independent of the choice of x. The period matrix of X is equal to the period matrix of A.

An algebraic integrable system π : X → M leads to a parametrized version of the preceding discussion. First, the holomorphic symplectic form η gives an isomorphism
\[ i : T^*M \xrightarrow{\;\cong\;} V, \]


where V → M is the bundle of invariant vector fields along the fibers of π. For a complex function $f : M \to \mathbb C$ and complex vector field ξ on X we have
\[ \eta\bigl( i(df), \xi \bigr) = \pi^* df(\xi). \]
This leads to a fiberwise action of $T^*M \cong V$ by exponentiation. Let Λ be the kernel of the action. A basic fact is that Λ is a complex lagrangian submanifold of T*M, where T*M has the canonical holomorphic symplectic structure. (See [GS, Sect. 44] for proofs of the assertions made here.) Furthermore, Λ intersects each fiber of T*M in a full lattice. The quotient A = T*M/Λ is a family of abelian varieties parametrized by M; it is the bundle of Albanese varieties of X → M. Since Λ is complex lagrangian, the canonical holomorphic symplectic form on T*M passes to a holomorphic symplectic form $\hat\eta$ on the quotient A. Now a local lagrangian section of $\pi : X|_U \to U$ over an open set U ⊂ M induces a local isomorphism $X|_U \cong A|_U$, and this isomorphism maps $\hat\eta$ to η. Such sections may not exist globally. Since any two choices of local section lead to isomorphisms which differ by a translation on each fiber, the family of polarizations $[\rho_m]$ on X → M define a family of polarizations $[\hat\rho_m]$ on A → M.

To summarize: Every algebraic integrable system X → M has a canonically associated algebraic integrable system A → M whose fibers are abelian varieties. (An analogous assertion holds for real integrable systems.) Either system determines a well-defined period map
\[ \tau : M \longrightarrow \mathcal A_n = \mathbb H_n/\Gamma \]
into the moduli space $\mathcal A_n$ of suitably polarized abelian varieties.

Now the bundle of lattices Λ determines a flat connection ∇ on T*M, hence also on TM. Since Λ is lagrangian, ∇ is torsionfree. Also, the polarization $[\hat\rho_m]$ on $A_m = T_m^*M/\Lambda_m$ determines a real symplectic form on $T_m^*M$ which restricts to an integral symplectic form on the lattice $\Lambda_m$. The dual 2-form ω on M is flat – ∇ω = 0 – and since ∇ is torsionfree it follows that ω is closed. Thus ω is a real symplectic form on M.
The holonomy group of the flat connection ∇ is contained in the integral symplectic group $Sp(\Lambda_m^*)$ at each m ∈ M, where $\Lambda_m^*$ is the dual lattice to $\Lambda_m$. Furthermore, by the definition of a polarization ω is a (positive definite) Kähler form on M. If $\{\gamma^i, \delta_j\}$ is a local symplectic basis of sections of Λ ⊂ T*M, then we can write $\omega = \gamma^i \wedge \delta_i$. There is also a global formula for ω. First, each polarization $[\hat\rho_m]$ is represented by a unique invariant closed form $\hat\rho_m \in \Omega^{1,1}(A_m)$. The family of forms $\{\hat\rho_m\}$ is flat with respect to ∇. Now the connection ∇ on T*M induces an integrable distribution of horizontal planes on A, and we extend $\{\hat\rho_m\}$ to a form $\hat\rho \in \Omega^{1,1}(A)$ by requiring that $\hat\rho$ vanish on those horizontal planes. Then $d\hat\rho = 0$. The global formula for ω is expressed in terms of $\hat\rho$ and the holomorphic symplectic form $\hat\eta \in \Omega^{2,0}(A)$:
\[ \omega = \frac{1}{4} \int_{A/M} \hat\eta \wedge \bar{\hat\eta} \wedge \frac{\hat\rho^{\,n-1}}{(n-1)!}. \]
The conclusion of this discussion is a result stated by Donagi and Witten [DW].

Theorem 3.4.
(a) Let $(X \to M, \eta, [\rho_m])$ be an algebraic integrable system. Then the Kähler form ω and the connection ∇ constructed above comprise a special Kähler structure on M. Furthermore, there is a lattice Λ* ⊂ TM whose dual Λ ⊂ T*M is a complex lagrangian submanifold, and the holonomy of ∇ is contained in the integral symplectic group defined by Λ*.


(b) Conversely, suppose (M, ω, ∇) is a special Kähler manifold. Suppose further that there is a lattice Λ* ⊂ TM, flat with respect to ∇, whose dual Λ ⊂ T*M is a complex lagrangian submanifold. Then A = T*M/Λ → M admits a canonical holomorphic symplectic form η and a family of polarizations $[\rho_m]$ so that $(A \to M, \eta, [\rho_m])$ is an algebraic integrable system whose fibers are abelian varieties.

Remark 3.5. The lattice Λ in (b) may be specified by a covering of distinguished flat Darboux coordinate systems $\{x^i, y_j\}$ whose transition functions satisfy (1.6) with $P \in Sp(2n; \mathbb Z)$. In this case we also restrict the allowable special coordinate systems $\{z^i\}$ by requiring that $\{\operatorname{Re}(z^i)\}$ be part of a distinguished flat Darboux coordinate system.

Proof. For part (a) it remains to verify the special Kähler condition (1.2), or equivalently (1.7). We work locally. Let $\{\gamma^i, \delta_j\}$ be a local symplectic basis of sections of Λ. Since $\gamma^i, \delta_j$ are closed 1-forms we can find flat Darboux coordinates $\{x^i, y_j\}$ so that $\gamma^i = dx^i$ and $\delta_j = dy_j$. Now $\gamma^i, \delta_j$ also determine families of cycles on A and we can find holomorphic functions $z^i, w_j$ such that
\[ dz^i = \int_{\gamma^i} \hat\eta, \qquad dw_j = -\int_{\delta_j} \hat\eta. \]
Here the integrals are over the families of cycles in the fibration A → M, and Stokes’ theorem shows that the integrals are holomorphic (1, 0)-forms. It is easy to check that $\operatorname{Re}(dz^i) = dx^i$ and $\operatorname{Re}(dw_j) = -dy_j$, so we can arrange that $\operatorname{Re}(z^i) = x^i$ and $\operatorname{Re}(w_j) = -y_j$. Then
\[ \zeta = \frac{1}{2}\Bigl( z^i \frac{\partial}{\partial x^i} - w_j \frac{\partial}{\partial y_j} \Bigr) \]
is a local complex vector field which satisfies (1.8). This implies (1.7). Notice that the vector fields $\omega_i = \frac{\partial}{\partial z^i}$ define local holomorphic differentials on the fibers of A → M, and they satisfy (3.2). Thus Eq. (3.3) defines the period matrix $(\tau_{ij})$ relative to $\{\gamma^i, \delta_j\}$. Equations (3.2) and (3.3) are equivalent to Eq. (1.12):
\[ \frac{\partial}{\partial z^i} = \frac{1}{2}\Bigl( \frac{\partial}{\partial x^i} - \tau_{ij} \frac{\partial}{\partial y_j} \Bigr). \]
By now the proof of (b) should be clear. Given (M, ω, ∇, Λ), the family of polarizations on A = T*M/Λ → M is represented by the dual of the Kähler form ω. Hence A → M is a family of abelian varieties. The symplectic form is induced from the canonical symplectic form on T*M. The hypothesis that Λ is complex lagrangian makes the quotient T*M/Λ complex symplectic. □

Remark 3.6. An arbitrary family of abelian varieties A → M does not admit a symplectic form. For that the differential of the period map must come from a cubic form $c \in H^0(M, \operatorname{Sym}^3 T^*M)$. (See [DM1, Sect. 7].) Here we assume a given identification of the bundle V with T*M. (Recall that V is the bundle of constant vector fields along the fibers of A → M.) The cubic condition on the period matrix is essentially the special Kähler condition (1.2), as is clear from Proposition 1.25. Of course, the cubic form is (1.26).

Special Kähler Manifolds


Remark 3.7. The preceding discussion applies to the pseudo-Kähler case with one modification: the polarization classes $[\rho_m]$ are no longer positive definite. So $X_m$ is an affine torus with an indefinite polarization. We term this an indefinite algebraic integrable system.

The discussion in Sect. 2 applies directly to the quotient $T^*M/\Lambda$, and so Theorem 2.1 yields the following.

Theorem 3.8. Let $(X \to M, \eta, [\rho_m])$ be an algebraic integrable system. Then $X$ carries a canonical hyperkähler structure.

4. Projective Special Kähler Manifolds

We term the triple $(M, L, \omega)$ a Hodge manifold if $(M, \omega)$ is Kähler and $L \to M$ is a holomorphic hermitian line bundle with curvature$^{11}$ $-2\pi i\omega$. This implies $[\omega] \in H^2(M; \mathbb{R})$ is an integral class. We begin with a geometric lemma about the principal $\mathbb{C}^\times$ bundle $\pi : \tilde M \to M$ obtained by deleting the zero section from $L \to M$. First, the hermitian connection on $L$ is also a connection on $\pi : \tilde M \to M$, that is, a $\mathbb{C}^\times$-invariant distribution of horizontal subspaces. Also, the bundle $\pi^*L \to \tilde M$ has a canonical nonzero holomorphic section $s$.

Lemma 4.1. Let $\tilde\omega \in \Omega^{1,1}(\tilde M)$ denote the form which equals $|s|^2 \pi^*\omega$ on pairs of horizontal vectors, vanishes on a horizontal vector paired with a vertical vector, and is $-1/\pi$ times the canonical Kähler form on pairs of vertical vectors. Then

$$\tilde\omega = \frac{i}{2\pi}\, \bar\partial\partial |s|^2. \tag{4.2}$$

Thus $d\tilde\omega = 0$, which implies that $\tilde\omega$ is a pseudo-Kähler metric on $\tilde M$ of Lorentz type. Finally,

$$\pi^*\omega = \frac{i}{2\pi}\, \bar\partial\partial \log |s|^2. \tag{4.3}$$

The canonical Kähler form on a hermitian line $L$ is $\frac{i}{2}\, \partial\bar\partial |s|^2$, $s \in L$. The metric $\tilde\omega$ is negative definite on fibers and positive definite on horizontal subspaces. It has signature $(n, 1)$, where $n = \dim M$.

Proof. Let $t$ be a nonzero holomorphic section of $L|_U \to U$ for an open set $U \subset M$, and set $h(z) = |t(z)|^2$, $z \in U$. We use local coordinates $\langle z, \lambda\rangle \mapsto \lambda\, t(z) \in \pi^{-1}U \subset \tilde M$, where $\lambda \in \mathbb{C}^\times$. Now $s(z, \lambda) = \lambda t(z)$, and so $|s(z, \lambda)|^2 = |\lambda|^2 h(z)$. Compute the right-hand side of (4.2). To verify the description of $\tilde\omega$ given before (4.2), note that $\xi - \lambda h^{-1}\, \partial h(\xi)\, \frac{\partial}{\partial\lambda}$ is the horizontal lift of a tangent vector $\xi$ in $U$. Formula (4.3) is the standard curvature formula for the hermitian connection. $\square$

The usual definition for what we call a projective special Kähler structure is a particular type of variation of Hodge structure, which was considered specifically in a paper of Bryant and Griffiths [BG]. We discuss this first and defer our description to Proposition 4.6(b). Our version of the usual definition emphasizes the fact that the parameter space is a Hodge manifold, but it is equivalent to the definition in [BG] (cf. [C] for the relationship to [St]).

11 Since we do not use so many indices in this section, we revert to the standard notation $i = \sqrt{-1}$.
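The computation left to the reader in the proof of Lemma 4.1 can be sketched as follows. This is only an illustrative expansion in the local coordinates $(z, \lambda)$ of the proof; the auxiliary form $\theta$ is notation introduced here, not in the text, and is chosen so that it vanishes on the horizontal lift $\xi - \lambda h^{-1}\partial h(\xi)\,\partial/\partial\lambda$.

```latex
% Local coordinates (z, \lambda) on \pi^{-1}U, with |s|^2 = |\lambda|^2 h(z).
\begin{align*}
\bar\partial\partial\bigl(|\lambda|^2 h\bigr)
  &= h\, d\bar\lambda\wedge d\lambda
   + \bar\lambda\, \bar\partial h\wedge d\lambda
   + \lambda\, d\bar\lambda\wedge\partial h
   + |\lambda|^2\, \bar\partial\partial h .
\end{align*}
% Assumed auxiliary (1,0)-form, zero on horizontal lifts:
%   \theta := d\lambda + \lambda h^{-1}\,\partial h .
\begin{align*}
h\,\bar\theta\wedge\theta
  &= h\, d\bar\lambda\wedge d\lambda
   + \lambda\, d\bar\lambda\wedge\partial h
   + \bar\lambda\, \bar\partial h\wedge d\lambda
   + |\lambda|^2 h^{-1}\, \bar\partial h\wedge\partial h , \\
\bar\partial\partial\bigl(|\lambda|^2 h\bigr)
  &= h\,\bar\theta\wedge\theta
   + |\lambda|^2 h\, \bar\partial\partial\log h , \\
\frac{i}{2\pi}\,\bar\partial\partial |s|^2
  &= \frac{i}{2\pi}\, h\,\bar\theta\wedge\theta
   + |s|^2\, \pi^*\omega .
\end{align*}
% The last line uses (4.3) together with \bar\partial\partial\log|\lambda|^2 = 0.
```

The first term vanishes whenever one argument is horizontal, and on a fiber (where $z$ is fixed) it restricts to $\frac{i}{2\pi} h\, d\bar\lambda\wedge d\lambda = -\frac{1}{\pi}\cdot\frac{i}{2}\, h\, d\lambda\wedge d\bar\lambda$, which is $-1/\pi$ times the canonical Kähler form of the hermitian line. The second term is $|s|^2\pi^*\omega$ on horizontal pairs, in agreement with the description of $\tilde\omega$ in the lemma.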


D. S. Freed

Definition 4.4. (i) A projective special Kähler structure on an $n$ dimensional Hodge manifold $(M, L, \omega)$ is a triple $(V, \nabla, Q)$ where
(a) $V \to M$ is a holomorphic vector bundle of rank $n + 1$ with a given holomorphic inclusion $L \hookrightarrow V$;
(b) $\nabla$ is a flat connection on the underlying real bundle $V_{\mathbb{R}} \to M$ such that $\nabla(L) \subset V$ and the section

$$M \longrightarrow \mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr), \qquad m \longmapsto L_m \tag{4.5}$$

is an immersion with respect to $\nabla$;
(c) $Q$ is a nondegenerate skew form on $V_{\mathbb{R}}$ which has type (1,1) with respect to the complex structure and satisfies $\nabla Q = 0$. Furthermore, we assume that $Q|_{L \times L}$ is $i/2\pi$ times the hermitian metric on $L$.
(ii) An integral projective special Kähler structure is a quadruple $(\Lambda, V, \nabla, Q)$ with $(V, \nabla, Q)$ as in (i) and $\Lambda \subset V_{\mathbb{R}}$ a flat submanifold which intersects each fiber in a full lattice such that $Q|_{\Lambda \times \Lambda}$ has integral values.

In this definition $\nabla$ and $Q$ are extended to the complexification $(V_{\mathbb{R}})_{\mathbb{C}}$ of $V_{\mathbb{R}}$. The flat connection gives a local identification of $V_{\mathbb{R}}$ – hence also of its complexification $(V_{\mathbb{R}})_{\mathbb{C}}$ and the projectivization $\mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)$ – with any fiber. The immersion condition in (b) states that $m \mapsto L_m$ is an immersion into the $2n + 1$ dimensional projective space of a local trivialization of $\mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)$.

The data in (ii) define a variation of polarized Hodge structures of weight 3 with Hodge numbers $h^{3,0} = 1$, $h^{2,1} = n$ with an extra immersion condition. This is the form of the definition in [BG]. (See [CGGH] for the basic definitions related to variations of Hodge structures.) We recover the Hodge filtration $\{F^p\}$ by setting $F^3 = L$, $F^2 = V$, $F^1 = (F^3)^{\perp}$, and $F^0 = (V_{\mathbb{R}})_{\mathbb{C}}$. (Here "$\perp$" is with respect to $Q$.) The Griffiths transversality condition $\nabla(F^3) \subset F^2$ is given in (b) above; the condition $\nabla(F^2) \subset F^1$ follows from this and the immersion condition [BG, pp. 82–83]. Proposition 4.6 below implies that $iQ|_{H^{2,1} \times H^{2,1}}$ is positive definite, where $H^{2,1} = F^2 \cap \overline{F^1}$. Variations of Hodge structure without the lattice, as in (i), were considered in [S].
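As a quick consistency check on the dimensions quoted in Definition 4.4 and the discussion after it, the following count may help. It is a sketch that uses only the rank of $V$ and the standard conjugation symmetry $h^{p,q} = h^{q,p}$ of Hodge numbers.

```latex
\begin{align*}
\operatorname{rank}_{\mathbb{C}} V &= n + 1
  \;\Longrightarrow\; \operatorname{rank}_{\mathbb{R}} V_{\mathbb{R}} = 2n + 2
  \;\Longrightarrow\; \operatorname{rank}_{\mathbb{C}} (V_{\mathbb{R}})_{\mathbb{C}} = 2n + 2 , \\
\dim_{\mathbb{C}} \mathbb{P}\bigl((V_{\mathbb{R}})_{\mathbb{C}}\bigr)_m &= (2n + 2) - 1 = 2n + 1 , \\
h^{3,0} + h^{2,1} + h^{1,2} + h^{0,3} &= 1 + n + n + 1 = 2n + 2 .
\end{align*}
```

So the fiber of the projectivization is the $2n + 1$ dimensional projective space in which $m \mapsto L_m$ is required to be an immersion, and the Hodge numbers of the weight 3 structure add up to the rank of $(V_{\mathbb{R}})_{\mathbb{C}}$.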
Our main observation in this section is the following. We prefer to take the structure in (b) as the definition of projective special Kähler.

Proposition 4.6. Let $(M, L, \omega)$ be a Hodge manifold with associated pseudo-Kähler manifold $(\tilde M, \tilde\omega)$ and canonical section $s$.
(a) A projective special Kähler structure on $(M, L, \omega)$ induces a $\mathbb{C}^\times$-invariant special pseudo-Kähler structure $\tilde\nabla$ on $(\tilde M, \tilde\omega)$ with $\tilde\nabla s = \pi^{(1,0)}$.
(b) Conversely, a $\mathbb{C}^\times$-invariant special pseudo-Kähler structure $\tilde\nabla$ on $(\tilde M, \tilde\omega)$ which satisfies $\tilde\nabla s = \pi^{(1,0)}$ induces a projective special Kähler structure on $(M, L, \omega)$.

Recall that $\tilde\omega$ is defined in Lemma 4.1. The canonical section $s$ defined there can be viewed as the holomorphic vertical vector field on $\tilde M$ induced by the $\mathbb{C}^\times$ action.

Proof. (a) Let $\tilde\nabla = \pi^*\nabla$ be the lifted flat connection on $\pi^*V$. Using the inclusion $L \hookrightarrow V$ we view $s$ as a section of $\pi^*V$. The immersion condition (4.5) implies that

$$\tilde\nabla s : T\tilde M \longrightarrow \pi^*V \tag{4.7}$$


is an isomorphism. (Note that $\tilde\nabla s \subset \pi^*V$ by the Griffiths transversality in (b).) Using the real isomorphism underlying (4.7) we obtain a real flat connection on $\tilde M$; we also denote it by '$\tilde\nabla$'. Furthermore, under (4.7) the form $\tilde Q = \pi^*Q$ pulls back to $-\tilde\omega$. This follows by differentiating the equation

$$\frac{i}{2\pi}\, |s|^2 = \tilde Q(s, \bar s),$$

assumed in (c), to obtain

$$\tilde\omega = \frac{i}{2\pi}\, \bar\partial\partial |s|^2 = -\tilde Q(\tilde\nabla s, \tilde\nabla s).$$

Thus $\tilde\nabla\tilde\omega = 0$. Now under (4.7) the section $s$ corresponds to a holomorphic vector field $\zeta$ which satisfies (1.8). This proves that $\tilde\nabla$ satisfies the special Kähler condition (1.7), and by Remark 1.19 $\tilde\nabla$ is also torsionfree.

(b) We simply indicate the construction of $(V, \nabla, Q)$. First, let $V$ be the quotient of $T\tilde M$ by the $\mathbb{C}^\times$ action. Then $V$ is a holomorphic bundle over $M$, and the inclusion of vertical vectors in $T\tilde M$ induces an inclusion $L \hookrightarrow V$. The connection $\tilde\nabla$ on $(T\tilde M)_{\mathbb{R}}$ induces a connection $\nabla$ on $V_{\mathbb{R}}$; the immersion condition in Definition 4.4(i)(b) follows from the hypothesis $\tilde\nabla s = \pi^{(1,0)}$. The form $-\tilde\omega$ on $\tilde M$ induces a skew form $Q$ on $V$. $\square$

Notice as a consequence of (c) and the description of $\tilde\omega$ in Lemma 4.1 that $iQ|_{H^{2,1} \times H^{2,1}}$ is positive definite.

Now the discussion of special coordinates, holomorphic prepotential, etc. from Sect. 1 applies to $(\tilde M, \tilde\omega, \tilde\nabla)$. We make $\mathbb{C}^\times$-equivariant choices on $\tilde M$ and consider the induced tensors on $M$. We work on $\pi^{-1}(U)$ for $U \subset M$ a sufficiently small open set. We do not choose Darboux coordinates, but only a flat local symplectic framing$^{12}$ of $T\tilde M$, which we require to be $\mathbb{C}^\times$-invariant. We say that a complex tensor field on $\tilde M$ has degree $n$ if it transforms under $\lambda \in \mathbb{C}^\times$ by multiplication by $\lambda^n$. The vector field $\zeta$ (which corresponds to $s$ under (4.7)) has degree 1. So from (1.9) we see that a special coordinate function $z^i$ also has degree 1. In other words, $z^i$ is a local holomorphic section of $L \to M$. Thus a special coordinate system $\{z^i\}$ on $\tilde M$ gives rise to local projective coordinates on $M$ (which transform as sections of $L$). From (1.13) we see that the period matrix $(\tau_{ij})$ is a scalar, and from (1.14) that the holomorphic prepotential $F$ has degree 2, i.e., $F$ is a local holomorphic section of $L^{\otimes 2}$. Because of the $\mathbb{C}^\times$-invariance there is less flexibility in choosing $\{z^i\}$ and $F$ than in the nonprojective case – different choices differ by a homogeneous function.

Proposition 4.8. Let $(M, L, \omega, \tilde\nabla)$ be a projective special Kähler manifold.
(a) Given a special projective coordinate system $\{z^i\}$ the holomorphic prepotential $F$ is determined up to a change

$$F \longrightarrow F + \frac{1}{2} A_{ij} z^i z^j,$$

where $A = (A_{ij})$ is a real symmetric matrix. Hence the conjugate special projective coordinate system $\{w_j\}$ is determined up to a change

$$w_j \longrightarrow w_j + A_{jk} z^k.$$

12 It is denoted $\{\frac{\partial}{\partial x^i}, \frac{\partial}{\partial y_j}\}$ in Sect. 1, but here we do not consider coordinate functions $x^i$ and $y_j$.


(b) If $\{z^i\}$, $\{w_j\}$ are conjugate special projective coordinate systems, then any other pair $\{\tilde z^i\}$, $\{\tilde w_j\}$ of conjugate special projective coordinate systems are related by

$$\begin{pmatrix} z \\ w \end{pmatrix} = P \begin{pmatrix} \tilde z \\ \tilde w \end{pmatrix}, \qquad P \in Sp(2n; \mathbb{R}).$$

From (4.3) we see that the lift of the metric $\omega$ to $\tilde M$ has a global "Kähler potential" $\tilde K$, which we write in special coordinates as

$$\begin{aligned}
\tilde K &\doteq -\frac{1}{\pi} \log |s|^2 \\
&\doteq -\frac{1}{\pi} \log \tilde Q(s, \bar s) \\
&\doteq -\frac{1}{\pi} \log\bigl( -\tilde\omega(\zeta, \bar\zeta) \bigr) \\
&\doteq -\frac{1}{\pi} \log \mathrm{Im}\bigl( z^i \bar w_i \bigr) \\
&\doteq -\frac{1}{\pi} \log \mathrm{Im}\Bigl( z^i\, \overline{\frac{\partial F}{\partial z^i}} \Bigr).
\end{aligned}$$

Here "$\doteq$" means "equals up to an additive constant". $\tilde K$ pulls down to a local Kähler potential on $M$ via a local holomorphic section of $\pi : \tilde M \to M$.

The cubic form $\tilde\Xi$ of the special Kähler structure on $\tilde M$ (see (1.26)) is a holomorphic section $\Xi \in H^0(M, \mathrm{Sym}^3\, T^*M \otimes L^{\otimes 2})$, as follows easily from (1.28). It is a basic ingredient in the analysis of [BG], where it is derived from an infinitesimal variation of Hodge structure. Since $\zeta$ is holomorphic of type (1,0), we have $\omega(\zeta, \tilde\nabla\zeta) = 0$, and by differentiating $\omega(\zeta, \tilde\nabla^2\zeta) = 0$. Differentiating once more we conclude from (1.27) that the cubic form in this case is

$$\Xi = \omega(\zeta, \tilde\nabla^3\zeta) = \tilde Q(\nabla^3 s, s).$$

We can use $\tilde\Xi \in H^0(\tilde M, \mathrm{Sym}^3\, T^*\tilde M)$ and the associated $\tilde A \in \Omega^{1,0}\bigl(\mathrm{Hom}(T\tilde M, \overline{T\tilde M})\bigr)$ to introduce an algebra structure on $T\tilde M \otimes_{\mathbb{R}} \mathbb{C}$. Fix $\tilde m \in \tilde M$ and denote $V = T_{\tilde m}\tilde M$. It is easy to see that $\tilde A$ vanishes on $\zeta$, and it is a well-defined map $W \otimes W \to \overline{W}$, where $W$ is the orthogonal complement to $\zeta$. (Under the projection $\pi$ we can identify $W \cong T_m M$, where $m = \pi(\tilde m)$.) We now obtain a graded algebra $C$: Set $C^0 = \mathbb{C} \cdot \zeta$, $C^1 = W$, $C^2 = \overline{W}$, and $C^3 = \mathbb{C} \cdot \bar\zeta$; then $\zeta$ acts as the identity, the multiplication $C^1 \otimes C^1 \to C^2$ is given by $\tilde A$, and the multiplication $C^1 \otimes C^2 \to C^3$ is $\alpha \otimes \bar\beta \mapsto \omega(\alpha, \bar\beta)\, \bar\zeta$. Associativity is trivial to verify.

Now we consider the implications of the lattice $\Lambda \subset V_{\mathbb{R}}$ in an integral projective special Kähler structure on $M$. Under the isomorphism (4.7) the lift $\pi^*\Lambda \subset \pi^*V_{\mathbb{R}}$ induces a lattice $\tilde\Lambda \subset T\tilde M$. Now $\tilde\Lambda$ is $\tilde\nabla$-flat by hypothesis, so by Proposition 1.22 it is a complex submanifold. Locally $\tilde\Lambda$ is the graph of a $\tilde\nabla$-flat vector field on $\tilde M$, so the dual $\tilde\Lambda^*$ is locally the graph of a $\tilde\nabla$-flat 1-form. Since $\tilde\nabla$ is torsionfree, this 1-form is also holomorphic and so $\tilde\Lambda^* \subset T^*\tilde M$ is complex lagrangian. Thus Theorem 3.4(b) and Theorem 3.8 apply to give the following conclusion.
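The ambiguity in Proposition 4.8(a) can be checked in one line. The sketch below assumes, as in Sect. 1, that the conjugate coordinates are $w_j = \partial F/\partial z^j$ and that repeated indices are summed; the primes are notation introduced here for the shifted quantities.

```latex
% Shifted prepotential: F' = F + (1/2) A_{ij} z^i z^j, with A real symmetric.
\begin{align*}
w_j' = \frac{\partial F'}{\partial z^j}
  &= w_j + \tfrac{1}{2}\bigl( A_{jk} z^k + A_{kj} z^k \bigr)
   = w_j + A_{jk} z^k , \\
\operatorname{Im}\bigl( z^i\, \bar w_i' \bigr)
  &= \operatorname{Im}\bigl( z^i\, \bar w_i \bigr)
   + \operatorname{Im}\bigl( A_{ik}\, z^i \bar z^k \bigr)
   = \operatorname{Im}\bigl( z^i\, \bar w_i \bigr) .
\end{align*}
% A_{ik} z^i \bar z^k equals its own complex conjugate when A is real symmetric,
% so its imaginary part vanishes.
```

In particular the expression $\mathrm{Im}(z^i \bar w_i)$ entering the Kähler potential $\tilde K$ does not depend on the choice of $A$, which is consistent with $\tilde K$ being determined by the geometry rather than by the choice of prepotential.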


Proposition 4.9. Suppose $(M, L, \omega, \tilde\nabla, \Lambda)$ is an integral projective special Kähler manifold of dimension $n$. Then there is an associated indefinite algebraic integrable system $X \to \tilde M$, where $\tilde M$ is $L$ with the zero section removed. The total space $X$ carries a "pseudo-hyperkähler" structure of real signature $(4n, 4)$.

The fibers of this integrable system are the intermediate Jacobians associated to the underlying variation of Hodge structure. The symplectic form on this family of intermediate Jacobians was constructed by Donagi and Markman [DM2] (for the case of a family of Calabi-Yau manifolds). The pseudo-hyperkähler structure was also given by Cortés [C].

As in the nonprojective case we restrict our local Darboux framings to lie in the lattice, and so the matrices $A$, $P$ in Proposition 4.8 must be integral.

5. Remarks on N = 2 Gauge Theories in Four Dimensions

We make some brief remarks on the role of special Kähler manifolds in global supersymmetric theories. We do not comment on their role in supergravity. References for the quantum physics are [SW1] and [SW2]. For a mathematical development of the relevant classical supersymmetry, see [DF]. The quantum aspects of our discussion have no pretension to rigor.

We first recall the origin of the local formula (1.15) for the Kähler potential. It arises from the lagrangian for the complex scalars in the four dimensional N = 2 vector multiplets. There is a superspace description in terms of the superspace $N^{4|8}$, which is an extension of ordinary four dimensional Minkowski space with eight odd dimensions. The complexification of the odd distribution splits into two pieces, and there is a corresponding notion of a chiral map $\Sigma : N^{4|8} \to \mathbb{C}$. Such a map describes an (abelian) N = 2 vector multiplet. (More precisely, it is a component of the curvature of a constrained connection on superspace.) The most general supersymmetric lagrangian for $n$ such multiplets is specified by a holomorphic function $F : \mathbb{C}^n \to \mathbb{C}$.
The theory is free if $F$ is quadratic. Upon reduction to N = 1 superspace $N^{4|4}$ each multiplet $\Sigma$ decomposes into an N = 1 chiral multiplet $\Phi$ and an N = 1 vector multiplet $A$. The lagrangian for the chiral multiplets is determined from the Kähler potential $K$, and a computation gives the formula (1.15) for $K$ in terms of $F$.

Next, we emphasize that a special Kähler manifold does not define a classical field theory for N = 2 vector multiplets. We do obtain a classical lagrangian from a special coordinate system, as explained in the previous paragraph. Furthermore, any Kähler manifold $M$ does determine a well-defined N = 1 supersymmetric field theory for a chiral field $\Phi : N^{4|4} \to M$. However, the change of special coordinates (1.39) must be accompanied by a duality$^{13}$ transformation on the gauge field in the vector multiplet $A$, and this only makes sense in the quantum theory. Moreover, this duality transformation only makes sense when the holonomy of $\nabla$ is contained in the integral symplectic group. Thus a special Kähler manifold $M$ with a lattice as in Theorem 3.4 determines$^{14}$ a quantum field theory which locally has a semiclassical description in terms of N = 2 vector multiplets. The manifold $M$ is the moduli space of quantum vacua. According to Theorem 3.4 such a theory is always specified by an algebraic integrable system.

These abelian theories describe the low energy behavior of the Coulomb branch of nonabelian N = 2 supersymmetric gauge theories, with or without matter. The simplest example [SW1] has gauge group $SU(2)$ and no matter. Then $M$ is the universal curve $M(2)$ for the modular group $\Gamma(2) \subset SL(2; \mathbb{Z})$, which we can identify as $\mathbb{CP}^1$ with 3 points omitted, say $M(2) = \mathbb{CP}^1 - \{-1, 1, \infty\}$. The universal curve $X(2) \to M(2)$ is the algebraic integrable system which defines the model. Many more examples have been found, all of course involving integrable systems. (See [D] for a review.)

13 That is, electromagnetic duality.
14 Since typically $M$ is incomplete this is not yet a full description of a theory. Also, an abelian gauge theory, which has a positive $\beta$-function, only makes sense as an effective field theory, not as a fundamental theory.

So far we have taken $M$ to be smooth. As stated above, a nonflat $M$ is not complete and an honest physical theory is formulated on some completion of $M$. For example, for the pure $SU(2)$ gauge theory the special Kähler metric on the moduli space $\mathbb{CP}^1 - \{-1, 1, \infty\}$ is complete near $\infty$, but the singular points $-1, 1$ are at finite distance. At these points other fields are massless and must be added to the low energy description.

We now remark further on the physical origin of the lattice $\Lambda$. It is a feature of four dimensional abelian gauge theories; supersymmetry is irrelevant. (See [AgZ, Sect. 3] for a recent discussion.) Consider a four dimensional gauge theory with gauge group $G = \mathbb{T}^n$, where $\mathbb{T} \cong U(1)$ is the circle group. The theory is specified by a complex bilinear form $\tau$ on the Lie algebra $\mathfrak{g}$ whose imaginary part $\mathrm{Im}\,\tau$ is an inner product.$^{15}$ The lagrangian density in Minkowski space is

$$L = \Bigl\{ -\frac{1}{8\pi}\, \mathrm{Im}\,\tau(F_A, *F_A) + \frac{1}{8\pi}\, \mathrm{Re}\,\tau(F_A, F_A) \Bigr\}\, |d^4x|, \tag{5.1}$$

where $A$ is a connection and $|d^4x|$ the standard density. There is a lattice $\mathfrak{g}_{\mathbb{Z}} \subset \mathfrak{g}$ whose elements exponentiate to the identity in $G$, and each basis of this lattice produces a matrix $(\tau_{ij}) \in \mathbb{H}_n$ which represents the form $\tau$. The group $GL(n; \mathbb{Z})$ permutes these bases. The larger duality group $Sp(2n; \mathbb{Z})$ is generated by this group together with the electromagnetic duality transformation. The latter expresses the theory in terms of a "dual" connection $\tilde A$ and the bilinear form $-\tau^{-1}$. The lagrangian has the same form as (5.1), and the operator $F_A$ in the original theory corresponds to $*F_{\tilde A}$ in the dual theory. The resulting action of $Sp(2n; \mathbb{Z})$ on $\tau$ is given by (1.40).

Fix a basis of $\mathfrak{g}_{\mathbb{Z}}$ and so write the curvature as $F_A = (F_A^i)_{i=1,\dots,n}$. There are $n$ electric charges $q^i$ and $n$ magnetic charges $g^i$ for charged matter we might put into the theory. Classically, the electric charge in a spatial region bounded by a surface $\Sigma$ is defined to be

$$q^i = \int_\Sigma \frac{\sqrt{-1}}{2\pi} * F_A^i,$$

and the enclosed magnetic charge is

$$g^i = (n_m)^i = \int_\Sigma \frac{\sqrt{-1}}{2\pi}\, F_A^i.$$

The electric and magnetic charges of a quantum state are computed from the corresponding operators in the quantum theory. Now $(n_m)^i$ is an integer by Chern-Weil theory for the compact gauge group $\mathbb{T}^n$. In the classical theory $(n_m)^i$ is an integer-valued function on the space of classical solutions; in the quantum theory it assigns an integer to each quantum state. There are other integers $(n_e)_i$ attached to quantum states from the Noether charges associated to global infinitesimal gauge transformations. Here the integrality is from the fact that certain exponentials of these infinitesimal transformations are the identity operator. These integers are related to the electric charge of states via the formula

$$q^i = \bigl[(\mathrm{Im}\,\tau)^{-1}\bigr]^{ij} \bigl( (\mathrm{Re}\,\tau)_{jk} (n_m)^k + (n_e)_j \bigr).$$

It is convenient to consider the complex quantity

$$q^i + \sqrt{-1}\, g^i = \bigl[(\mathrm{Im}\,\tau)^{-1}\bigr]^{ij} \bigl( \tau_{jk} (n_m)^k + (n_e)_j \bigr).$$

15 For $n = 1$ in the standard basis the form $\tau$ is usually written $\tau = \frac{\theta}{\pi} + \frac{8\pi\sqrt{-1}}{e^2}$, where $e$ is the coupling constant.

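As the $n_m$, $n_e$ run over all integers, this runs over the points of the (electromagnetic) charge lattice $\Lambda^*$ in $\mathbb{C}^n$. There is an integral symplectic form $\omega$ on $\Lambda^*$ defined by

$$\begin{aligned}
\omega\bigl( (g, q), (\tilde g, \tilde q) \bigr)
&= g^i \Bigl(\mathrm{Im}\,\frac{\tau}{2}\Bigr)_{ij} \tilde q^j - \tilde g^i \Bigl(\mathrm{Im}\,\frac{\tau}{2}\Bigr)_{ij} q^j \\
&= (n_m)^i (\tilde n_e)_i - (\tilde n_m)^i (n_e)_i.
\end{aligned} \tag{5.2}$$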

It is preserved by the duality group. Equation (5.2) is the form of charge quantization usually referred to as the "Dirac-Schwinger-Zwanziger condition".

Returning to an N = 2 supersymmetric abelian gauge theory, we have the moduli space $M$ of the complex scalars which carries the special geometry we have been discussing. There is a distinguished set of conjugate special coordinate systems related by integral coordinate transformations. In each such coordinate system we have a lagrangian description as a gauge theory with gauge group $G = \mathbb{T}^n$ (with distinguished basis for the Lie algebra). We should regard the $\mathbb{C}^n$ where the coordinates live as the complexified Lie algebra with its distinguished basis. The electromagnetic charge lattice $\Lambda^*$ discussed in the previous paragraph defines a global lattice in the complex conjugate cotangent bundle $\overline{T^*M} \cong TM$. Note in the notation of Sect. 1 that $\begin{pmatrix} (n_m)^i \\ (n_e)_j \end{pmatrix}$ transforms analogously to $\begin{pmatrix} dx^i \\ dy_j \end{pmatrix}$, and $q^i + \sqrt{-1}\, g^i$ transforms analogously to $dz^i$. (See formulas (1.6) and (1.12).)

There is a further geometric input from the BPS mass formula. Namely, in the classical theory the central charge $Z$ in the N = 2 supersymmetry algebra is a complex-valued locally constant function on the space of solutions to the classical field equations. In the quantum theory (at a point $m \in M$) it is an operator whose eigenvalues are complex numbers. Let $\{z^i\}$, $\{w_j\}$ be distinguished conjugate special coordinate systems. This means that there is a lagrangian description for the N = 2 theory in terms of a prepotential $F(z^1, \dots, z^n)$ with $w_j = \partial F/\partial z^j$. The BPS formula involves possible additional $\mathbb{T}$ charges $S^\alpha$ which may appear in the theory. These have integer eigenvalues. The BPS formula asserts that the eigenvalue of $Z$ is

$$z^i (n_e)_i + w_i (n_m)^i + s^\alpha \frac{m_\alpha}{\sqrt{2}},$$

where $s^\alpha$ is the eigenvalue of $S^\alpha$ and $(n_e)_i$, $(n_m)^i$ are the integers defined above. Let $\Gamma \subset \mathbb{C}$ denote the set of points so described. If there are no $S^\alpha$, then the fact that $\Gamma$ is intrinsic and the transformation law for $\begin{pmatrix} (n_m)^i \\ (n_e)_j \end{pmatrix}$ implies that there is no translation in the coordinate change (1.39) between different sets of distinguished conjugate coordinate systems. However, in the presence of charges $S^\alpha$ there may be a nonzero translational component.
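Two of the charge relations used in this section follow from short manipulations, sketched here. The first uses only $g^i = (n_m)^i$ and the formula for $q^i$ in terms of $(n_m)^i$ and $(n_e)_j$. The second assumes the convention $P^{T} J P = J$ for $P \in Sp(2n; \mathbb{Z})$; the column vector $v$ and the matrix $J$ are notation introduced here for illustration.

```latex
% (1) The complex charge, using \tau = \operatorname{Re}\tau + \sqrt{-1}\,\operatorname{Im}\tau
%     and g^i = (n_m)^i:
\begin{align*}
q^i + \sqrt{-1}\, g^i
 &= \bigl[(\operatorname{Im}\tau)^{-1}\bigr]^{ij}
    \bigl( (\operatorname{Re}\tau)_{jk} (n_m)^k + (n_e)_j \bigr)
  + \sqrt{-1}\, \bigl[(\operatorname{Im}\tau)^{-1}\bigr]^{ij}
    (\operatorname{Im}\tau)_{jk} (n_m)^k \\
 &= \bigl[(\operatorname{Im}\tau)^{-1}\bigr]^{ij}
    \bigl( \tau_{jk} (n_m)^k + (n_e)_j \bigr) .
\end{align*}
% (2) Duality invariance of the integral pairing in (5.2).
%     Assumed notation: v = (n_m, n_e)^T, J = [[0, 1], [-1, 0]] in n x n blocks.
\begin{align*}
(n_m)^i (\tilde n_e)_i - (\tilde n_m)^i (n_e)_i &= v^{T} J\, \tilde v , \\
(P v)^{T} J\, (P \tilde v) &= v^{T} \bigl( P^{T} J P \bigr) \tilde v = v^{T} J\, \tilde v .
\end{align*}
```

Since $P$ has integer entries, it also maps integer vectors $(n_m, n_e)$ to integer vectors, so the pairing stays integral after a duality transformation.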


References

[AgZ] Álvarez-Gaumé, L., Zamora, F.: Duality in quantum field theory (and string theory). hep-th/9709180
[BCOV] Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311–428 (1994), hep-th/9309140
[BG] Bryant, R.L., Griffiths, P.A.: Some observations on the infinitesimal period relations for regular threefolds with trivial canonical bundle. In: Arithmetic and Geometry, Vol. II, Progr. Math. 36, Boston: Birkhäuser, 1983, pp. 77–102
[CO] Candelas, P., de la Ossa, X.C.: Moduli space of Calabi-Yau manifolds. Nucl. Phys. B 355, 455–481 (1991)
[CGGH] Carlson, J., Green, M., Griffiths, P., Harris, J.: Infinitesimal variations of Hodge structure. I. Compositio Math. 50, 109–205 (1983)
[CFG] Cecotti, S., Ferrara, S., Girardello, L.: Geometry of type II superstrings and the moduli of superconformal field theories. Int. J. Mod. Phys. A 4, 2475–2529 (1989)
[C] Cortés, V.: On hyper-Kähler manifolds associated to Lagrangian Kähler submanifolds of $T^*\mathbb{C}^n$. Trans. Am. Math. Soc. 350, 3193–3205 (1998)
[CRTP] Craps, B., Roose, F., Troost, W., Van Proeyen, A.: What is special Kähler geometry? Nucl. Phys. B 503, 565–613 (1997), hep-th/9703082
[DF] Deligne, P., Freed, D.S.: Supersolutions. In: Quantum Fields and Strings: A Course for Mathematicians. Vol. 1, Providence, RI: American Mathematical Society, 1999
[D] Donagi, R.Y.: Seiberg-Witten integrable systems. In: Algebraic geometry – Santa Cruz 1995, Proc. Sympos. Pure Math. Vol. 62, Providence, RI: Am. Math. Soc., 1997, pp. 3–43, alg-geom/9705010
[DM1] Donagi, R.Y., Markman, E.: Spectral covers, algebraically completely integrable, Hamiltonian systems, and moduli of bundles. In: Integrable systems and quantum groups (Montecatini Terme, 1993), Lecture Notes in Math. 1620, Berlin: Springer, 1996, pp. 1–119
[DM2] Donagi, R.Y., Markman, E.: Cubics, integrable systems, and Calabi-Yau threefolds. In: Proceedings of the Hirzebruch 65 Conference on Algebraic Geometry (Ramat Gan, 1993), Israel Math. Conf. Proc. 9, 199–221 (1996)
[DW] Donagi, R.Y., Witten, E.: Supersymmetric Yang–Mills theory and integrable systems. Nucl. Phys. B 460, 299–334 (1996)
[F] Fré, P.: Lectures on special Kähler geometry and electric-magnetic duality rotations. Nucl. Phys. Proc. Suppl. 45BC, 59–114 (1996), hep-th/9512043
[G] Gates, S.J.: Superspace formulation of new non-linear sigma models. Nucl. Phys. B 238, 349–366 (1984)
[GS] Guillemin, V., Sternberg, S.: Symplectic techniques in physics. Cambridge: Cambridge University Press, 1990
[H] Hitchin, N.: Monopoles, Minimal Surfaces and Algebraic Curves. Séminaire de Mathématiques Supérieures, Vol. 105, Montréal, Quebec: Les Presses de L'Université de Montréal, 1987
[L] Lu, Z.: A note on the special Kähler manifolds. Preprint
[ST] Sierra, G., Townsend, P.K.: An introduction to N = 2 rigid supersymmetry. In: Supersymmetry and Supergravity 1983, B. Milewski, ed., Singapore: World Scientific, 1983, p. 396
[SW1] Seiberg, N., Witten, E.: Electric-magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nucl. Phys. B 426, 19–52 (1994); Erratum Nucl. Phys. B 430, 485–486 (1994), hep-th/9407087
[SW2] Seiberg, N., Witten, E.: Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nucl. Phys. B 431, 484–550 (1994), hep-th/9408099
[SW3] Seiberg, N., Witten, E.: Gauge dynamics and compactification to three dimensions. In: The Mathematical Beauty of Physics: A Memorial Volume for Claude Itzykson, J. M. Drouffe, J. B. Zuber, eds., Singapore: World Scientific, 1997, pp. 333–366, hep-th/9607163
[S] Simpson, C.T.: Higgs bundles and local systems. Inst. Hautes Études Sci. Publ. Math. 75, 5–95 (1992)
[St] Strominger, A.: Special geometry. Commun. Math. Phys. 133, 163–180 (1990)
[WP] de Wit, B., Van Proeyen, A.: Potentials and symmetries of general gauged N = 2 supergravity–Yang–Mills models. Nucl. Phys. B 245, 89–117 (1984)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 53 – 69 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Chiral BRST Cohomology of N = 2 Strings at Arbitrary Ghost and Picture Number*

Klaus Jünemann, Olaf Lechtenfeld

Institut für Theoretische Physik, Universität Hannover, Appelstraße 2, D-30167 Hannover, Germany. E-mail: [email protected]; [email protected]

Received: 5 January 1998 / Accepted: 16 November 1998

Abstract: We compute the BRST cohomology of the holomorphic part of the N = 2 string at arbitrary ghost and picture number. We confirm the expectation that the relative cohomology at non-zero momentum consists of a single massless state in each picture. The absolute cohomology is obtained by an independent method based on homological algebra. For vanishing momentum, the relative and absolute cohomologies both display a picture dependence – a phenomenon discovered recently also in the relative Ramond sector of N = 1 strings by Berkovits and Zwiebach [1].

1. Introduction

The standard approach to describe quantum string theories is the BRST procedure which consists of introducing unphysical ghost fields associated with the symmetries of the theory. Physical states are then characterised as elements of the cohomology of the nilpotent BRST charge $Q$. For the open bosonic string this so-called absolute cohomology is well known to contain twice as many states as one would expect from light-cone quantisation [2]. Each state appears in two copies – either with or without the zero mode of the reparametrisation ghost $c_0$. The true physical spectrum therefore is determined by the BRST cohomology supplemented by the condition that a representative should be annihilated by the zero mode $b_0$ of the reparametrisation anti-ghost. This space defines the relative cohomology.$^1$

* Supported in part by the "Deutsche Forschungsgemeinschaft"; grant LE-838/5-1
1 For closed strings this kind of condition gets more complicated and leads to the concept of semi-relative cohomology. In this paper we consider for simplicity the chiral cohomology (describing open strings or the holomorphic part of closed strings) only.

In the case of world-sheet supersymmetry, an additional subtlety arises due to the existence of an infinite number of inequivalent Fock space representations of the spinor ghosts – the so-called picture degeneracy labelled by $\pi \in \frac{1}{2}\mathbb{Z}$ [3]. In the N = 1 string theory this problem is partly solved by bosonising the ghost fields, which allows one to construct a picture-raising operator $X$ that maps physical states from the picture $\pi$ to $\pi + 1$. Moreover, there exists a picture-lowering operator $Y$ that inverts $X$ on the absolute cohomology spaces, implying that the picture-raising operation is an isomorphism of the cohomologies at different pictures [4]. Unfortunately, $Y$ does not commute with $b_0$. Thus this argument does not guarantee that picture-raising is an isomorphism also of the relative cohomologies at different pictures. This problem has been addressed very recently by Berkovits and Zwiebach [1], who used the momentum operator in the $-1$ picture to invert the zero mode of the picture-raising operator on states with non-vanishing momentum. This new picture-lowering operator commutes with $b_0$ and can therefore be used to prove the picture independence of the relative cohomology for non-vanishing momentum. However, these arguments do not rule out a picture dependence of the relative cohomology at zero momentum – a phenomenon which indeed occurs in the R sector of the relative cohomology of N = 1 strings [1].

In N = 2 string theory there exist two independent spinor ghost systems leading to two different picture numbers $(\pi^+, \pi^-)$. After bosonisation, one can construct picture-raising operators $X^\pm$ in complete analogy to the N = 1 case. These operators, however, cannot be inverted with local conformal fields [5,6]. There is thus the immediate question whether or not the absolute or relative BRST cohomologies are identical at different pictures. We address this question by two independent methods.

The first method consists of applying the ideas of ref. [1] to the N = 2 string. In contrast to conventional picture-lowering, this new kind of picture-lowering also works for the N = 2 string, but only for non-vanishing momentum. Since it commutes with $b_0$, we confirm the picture independence of both the absolute and the relative cohomology at non-zero momentum.
To describe the second method, let us recall that for the N = 1 theory there exists an alternative argument, due to Narganes-Quijano [7], that picture raising is an isomorphism. It makes use of the fact that bosonisation extends the Fock space by an additional oscillator and that in this extended space the absolute BRST cohomology is trivial. Some standard constructions from homological algebra then suffice to prove the isomorphy of the absolute cohomologies at different pictures. This work does not require the existence of an explicit picture-lowering operator and will be reviewed in more detail later on.

For a specific choice of bosonisation, the absolute cohomology in the extended Fock space of the N = 2 string again turns out to be trivial. However, the method of Narganes-Quijano cannot be carried over in a straightforward way, since the structure of the extended Fock space is more complicated for the N = 2 string. We therefore need to slightly modify his method and invoke the spectral flow automorphism of the N = 2 super Virasoro algebra [8]. For the massless level,$^2$ this will allow us to give an alternative proof of the picture independence of the absolute BRST cohomology at non-vanishing momentum. Unfortunately, we cannot treat the relative cohomology within this approach. It is, however, possible to extract some information about the exceptional case of zero momentum.

Nevertheless, most of our arguments fail for vanishing momentum, and we will demonstrate picture dependence of the exceptional cohomology by explicit computations. For example, we shall see that the relative zero-momentum cohomology in the $(-1, -1)$ picture consists of a single state of ghost number one, whereas in the $(-1, 0)$ picture there exist nontrivial states with any positive ghost number. In contrast to the N = 1 string, this phenomenon occurs in the absolute cohomology as well, but it is possible to show that the picture dependence of the absolute zero-momentum cohomology is restricted to ghost numbers 0, 1, 2 and 3. We have checked that this peculiar situation does not improve much when including the center-of-mass coordinate of the string [9, 1].

2 In principle, the proof works at any mass level. Its induction assumes the equality of the absolute cohomology for some pair of neighbouring pictures, which we only proved explicitly for the massless level.

The plan of the paper is as follows: In the next section we present a few basic facts about cohomology, clarify the relation between absolute and relative cohomology, and perform some explicit calculations in simple cases. Moreover, a complete computation of the cohomology in the $(-1, -1)$ picture along the lines of refs. [2,10] and the role of spectral flow for the BRST cohomology are described. In the third section we apply the ideas of Berkovits and Zwiebach [1] to the N = 2 string and show that picture raising is a bijective map on both the absolute and relative cohomology classes for non-vanishing momentum. In the fourth section we review part of the work of Narganes-Quijano and extend his method to the N = 2 string to give an alternative treatment of picture raising. In the final section the results are summarised.

2. Preliminary Investigations

BRST quantisation and picture raising of the N = 2 string have been reviewed recently in ref. [11], whose notation and conventions we adopt throughout this paper. To keep things simple we concentrate on the NS sector. Other boundary conditions can be obtained by spectral flow [12,13].

2.1. Relative and absolute cohomology. BRST-closed states with non-vanishing eigenvalues of the zero modes of the bosonic N = 2 super Virasoro generators $L_0$ or $J_0$ are always exact. For cohomology computations it is therefore sufficient to restrict oneself to the space of states that are annihilated by $L_0$ and $J_0$. Due to the relations

$$\{Q, b_0\} = L_0 , \qquad \{Q, \tilde b_0\} = J_0 , \tag{1}$$

it is possible to impose the further constraints that also the fermionic anti-ghost zero modes $b_0$ and $\tilde b_0$³ annihilate the states under consideration. This leads to the concept of relative cohomology which appears to have a more direct physical meaning than the cohomology of the full Fock space. Throughout this paper we assume that all states are annihilated by $\tilde b_0$ and thus work with the Fock space

$$F := \{\, |\psi\rangle \;;\; L_0|\psi\rangle = J_0|\psi\rangle = \tilde b_0|\psi\rangle = 0 \,\}. \tag{2}$$

The relative Fock space consists of states that are also annihilated by $b_0$:

$$F_{\rm rel} := \{\, |\psi\rangle \in F \;;\; b_0|\psi\rangle = 0 \,\}. \tag{3}$$

We treat the two types of fermionic anti-ghosts differently because it seems to be necessary to impose the conditions $J_0|\psi\rangle = \tilde b_0|\psi\rangle = 0$ as subsidiary conditions on an open N = 2 string field in order to write down a free field action. The situation is quite similar to the field theory of closed bosonic strings where the conditions $(L_0 - \bar L_0)|\psi\rangle = (b_0 - \bar b_0)|\psi\rangle = 0$ have to be imposed [14]. In contrast, $b_0|\psi\rangle = 0$ can be considered as a gauge-fixing condition (Siegel gauge), and $L_0|\psi\rangle = 0$ simply is the equation of motion. Both the spaces $F$ and $F_{\rm rel}$ possess a grading with respect to picture and ghost number:

$$F = \sum_{g,\pi^+,\pi^-} F^{g,\pi^+,\pi^-} , \qquad F_{\rm rel} = \sum_{g,\pi^+,\pi^-} F_{\rm rel}^{g,\pi^+,\pi^-} . \tag{4}$$

³ As usual, $c$ and $b$ denote the reparametrisation ghosts. We write $\tilde c$ and $\tilde b$ for the U(1) ghosts which have conformal weights 0 and 1, respectively.

We often suppress the obvious grading with respect to the center-of-mass momentum $k \in R^{2,2}$. Following ref. [5] we bosonise the (commuting) spinor ghosts,

$$\gamma^\pm \to \eta^\pm e^{\varphi^\pm} , \qquad \beta^\mp \to e^{-\varphi^\pm} \partial\xi^\mp , \tag{5}$$

and define the (total) ghost number current in a slightly unusual way [15]:

$$j_{gh} = -bc - \tilde b \tilde c + \eta^+ \xi^- + \eta^- \xi^+ . \tag{6}$$

This has the advantage of commuting with picture raising while still assigning the correct ghost number to all ghost fields and giving $\xi^\pm$ the ghost number minus one. Moreover, we define the ghost number of the ground state in the (0, 0) picture (and therefore in all pictures) to be zero. The BRST cohomology spaces inherit the various gradings and are denoted by

$$H(F) = \sum_{g,\pi^+,\pi^-} H^{g,\pi^+,\pi^-}(F) , \qquad H(F_{\rm rel}) = \sum_{g,\pi^+,\pi^-} H^{g,\pi^+,\pi^-}(F_{\rm rel}) . \tag{7}$$

$H(F)$ is called the absolute⁴ and $H(F_{\rm rel})$ the relative cohomology. These two types of cohomology are related by a well known exact sequence [2]. Although this has been described in detail in refs. [10,16], we briefly repeat this analysis here. $F$ and $F_{\rm rel}$ differ just by the possibility to apply the oscillator $c_0$, which implies the decomposition $F = F_{\rm rel} \oplus c_0 F_{\rm rel}$. The inclusion $i : F_{\rm rel} \to F$ and the projection $pr : F \to F_{\rm rel}$, defined as

$$i(\psi) := \psi + c_0\, 0 , \qquad pr(\psi + c_0 \chi) := \chi , \qquad \psi , \chi \in F_{\rm rel} , \tag{8}$$

can be combined to the following exact sequence:

$$0 \longrightarrow F_{\rm rel} \xrightarrow{\;i\;} F \xrightarrow{\;pr\;} F_{\rm rel} \longrightarrow 0 . \tag{9}$$

Since the inclusion and the projection both commute with the BRST operator $Q$, this exact sequence induces an exact cohomology triangle:⁵

$$\begin{array}{ccc}
 & H(F) & \\
 {\scriptstyle i}\nearrow & & \searrow{\scriptstyle pr} \\
 H(F_{\rm rel}) & \xleftarrow{\;\{Q,\,c_0\}\;} & H(F_{\rm rel})
\end{array}$$

⁴ Obviously, this name is not entirely logical since our absolute cohomology is still relative with respect to $\tilde b_0$. The relation between $H(F)$ and the cohomology of the full Fock space (where also the $\tilde b_0$ condition is relaxed) can be analysed straightforwardly by the methods described in this section and is not relevant to the picture degeneracy which is the subject of this paper.

⁵ This is a standard mathematical construction; see for example chapter 0 of ref. [17] for a review.


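The connecting homomorphism carries ghost number 2 and thus allows us to unwind the above triangle into the long exact cohomology sequence

$$\cdots \longrightarrow H^{g+1}(F) \xrightarrow{\;pr\;} H^{g}(F_{\rm rel}) \xrightarrow{\;\{Q,c_0\}\;} H^{g+2}(F_{\rm rel}) \xrightarrow{\;i\;} H^{g+2}(F) \xrightarrow{\;pr\;} H^{g+1}(F_{\rm rel}) \longrightarrow \cdots . \tag{10}$$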
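The form of the connecting homomorphism in (10) can be traced through the usual zig-zag construction. The following is only a sketch under the conventions of Eqs. (8) and (9); the relative sign between $pr$ and $Q$ is left open as $\pm$, since it does not affect the conclusion.

```latex
% Sketch: connecting homomorphism of 0 -> F_rel -> F -> F_rel -> 0.
% Start from a closed representative \psi \in F_{\rm rel}, Q\psi = 0, of ghost number g.
\begin{align*}
  &\text{lift along } pr: && pr(c_0\psi) = \psi , \\
  &\text{apply } Q:       && Q\,c_0\psi = \{Q,c_0\}\psi - c_0\,Q\psi = \{Q,c_0\}\psi , \\
  &\text{project again:}  && pr(Q\,c_0\psi) = \pm\,Q\,pr(c_0\psi) = \pm\,Q\psi = 0 .
\end{align*}
% The last line says that Q c_0 \psi has no c_0 component, i.e. it lies in i(F_rel),
% so the class [\psi] is sent to the class [\{Q,c_0\}\psi] in H(F_rel).
% Ghost number: gh(Q) = gh(c_0) = 1, hence the map raises g by 2,
%   H^{g}(F_{\rm rel}) \to H^{g+2}(F_{\rm rel}),
% which is the step labelled \{Q,c_0\} in Eq. (10).
```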

This sequence will turn out to be useful for explicit calculations. It is interesting that picture raising can be treated similarly [7] as we will show in Sect. 4.

2.2. Explicit computations in the massless sector. The simplest possible case for explicit computations is the massless sector in the (−1, −1) picture where all positively moded spinor ghost oscillators are annihilation operators. The relative Fock space $F_{\rm rel}^{-1,-1}(k\cdot k = 0)$ consists of a single state with ghost number $g = 1$, namely

$$c_1 \,|{-1},{-1},k\rangle , \qquad k\cdot k = 0 , \tag{11}$$

where $|\pi^+, \pi^-, k\rangle$ denotes the ground state with momentum $k$ in the $(\pi^+, \pi^-)$ picture. The state (11) is BRST invariant but not exact and thus constitutes the complete relative cohomology in the (−1, −1) picture. This is the analogue of the vanishing theorems in the massless sector for the bosonic [2] and the N = 1 string [10]. The sequence (10) implies that the absolute cohomology contains two states: the state given in (11) and

$$c_0 c_1 \,|{-1},{-1},k\rangle , \qquad k\cdot k = 0 . \tag{12}$$
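The step from the relative to the absolute cohomology in the (−1, −1) picture can be made explicit as a dimension count. The sketch below assumes only the sequence (10) and the statement that the relative cohomology is one-dimensional at $g = 1$ and zero otherwise.

```latex
% Input: dim H^g(F_rel) = 1 for g = 1 and 0 otherwise.
% The connecting map H^g(F_rel) -> H^{g+2}(F_rel) then vanishes for every g,
% because its source and target are never both non-zero. Eq. (10) splits into
\[
  0 \longrightarrow H^{g}(F_{\rm rel}) \xrightarrow{\;i\;} H^{g}(F)
    \xrightarrow{\;pr\;} H^{g-1}(F_{\rm rel}) \longrightarrow 0 ,
\]
% and therefore
\[
  \dim H^{g}(F) = \dim H^{g}(F_{\rm rel}) + \dim H^{g-1}(F_{\rm rel})
  = \begin{cases} 1, & g = 1, 2, \\ 0, & \text{otherwise.} \end{cases}
\]
% g = 1 is the state (11); g = 2 is the state (12), whose image under pr is (11).
```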

The corresponding vertex operators creating these states from the (0, 0) picture vacuum are

$$V^{(1)}_{(-1,-1)}(z) = c\, e^{-\varphi^+ - \varphi^-} e^{ik\cdot Z}(z) , \qquad V^{(2)}_{(-1,-1)}(z) = c\,\partial c\, e^{-\varphi^+ - \varphi^-} e^{ik\cdot Z}(z) . \tag{13}$$

We will see shortly that the connection between the relative and the absolute cohomology is more complicated in other pictures, since multiplying a state from the relative cohomology by $c_0$ does not in general produce a BRST-closed state. In the (−1, −1) picture everything carries over unchanged to the exceptional case $k = 0$, i.e.

$$H^{-1,-1}(k = 0) = H^{-1,-1}(k\cdot k = 0) \qquad \text{for } F \text{ and } F_{\rm rel} . \tag{14}$$

We now turn to the massless sector of the (−1, 0) picture where $\gamma^+_{1/2}$ becomes a creation operator. The relative Fock space $F_{\rm rel}^{-1,0}(k\cdot k = 0)$ is spanned by the following states with ghost number $g$:

$$\begin{aligned}
A^{\mu}_N &:= c_1 (\gamma^+_{1/2})^N (\gamma^-_{-1/2})^N \, d^{-\mu}_{-1/2} \,|{-1},0,k\rangle , & g &= 2N+1 , \\
B_N &:= c_1 (\gamma^+_{1/2})^N (\gamma^-_{-1/2})^{N+1} \,|{-1},0,k\rangle , & g &= 2N+2 , \\
C^{\mu\nu}_N &:= c_1 (\gamma^+_{1/2})^{N+1} (\gamma^-_{-1/2})^N \, d^{-\mu}_{-1/2} d^{-\nu}_{-1/2} \,|{-1},0,k\rangle , & g &= 2N+2 ,
\end{aligned} \tag{15}$$

where $N$ is a non-negative integer, $\mu = 0, 1$, and

$$\gamma^\pm_r = \oint \frac{dz}{2\pi i}\, z^{r-3/2}\, \gamma^\pm(z) , \qquad d^{\pm\mu}_r = \oint \frac{dz}{2\pi i}\, z^{r-1/2}\, i\psi^{\pm\mu}(z) \tag{16}$$

are the Fourier modes of the spinor ghosts and matter fermions. The BRST operator acts as

$$\begin{aligned}
Q A^{\mu}_N &= 2k^{-\mu} B_N + k^+_\nu C^{\nu\mu}_N , \\
Q B_N &= k^+_\mu A^{\mu}_{N+1} , \\
Q C^{\mu\nu}_N &= 2k^{-\mu} A^{\nu}_{N+1} - 2k^{-\nu} A^{\mu}_{N+1}
\end{aligned} \tag{17}$$

($Q^2 = 0$ can be checked explicitly). By inspection one learns that the cohomology $H^{-1,0}(F_{\rm rel}|k\cdot k = 0)$ resides at $g = 1$ only and is represented by

$$k^+_\mu A^{\mu}_0 = c_1\, k^+\!\cdot d^-_{-1/2} \,|{-1},0,k\rangle , \qquad k\cdot k = 0 \tag{18}$$

for any non-vanishing value of the momentum.⁶ The corresponding vertex operator creating this state from the (0, 0) picture vacuum is

$$V^{(1)}_{(-1,0)}(z) = c\, k^+\!\cdot\psi^-\, e^{-\varphi^-} e^{ik\cdot Z}(z) \tag{19}$$

which is the picture-raised version of $V^{(1)}_{(-1,-1)}$ in (13) (see the appendix of ref. [5] for a detailed list of vertex operators). This proves that in this simple case the picture-raising operation $X^-$ (and similarly $X^+$) is an isomorphism between the relative cohomologies at $k \neq 0$.

What about the absolute cohomology $H^{-1,0}(F|k\cdot k = 0)$? The sequence (10) implies that it is non-vanishing only at ghost number one and two. Obviously, the ghost number one part is simply represented by $k^+_\mu A^{\mu}_0$. Applying $Q$ to $c_0 k^+_\mu A^{\mu}_0$ yields

$$Q\, c_0 k^+_\mu A^{\mu}_0 = -4 k^+_\mu A^{\mu}_1 = -4\, Q B_0 , \tag{20}$$

showing that the cohomology class at ghost number two is represented by $c_0 k^+_\mu A^{\mu}_0 + 4B_0$. The two corresponding vertex operators are $V^{(1)}_{(-1,0)}$ and

$$V^{(2)}_{(-1,0)}(z) = \bigl( c\,\partial c\, k^+\!\cdot i\psi^-\, e^{-\varphi^-} + 4 c\,\eta^- \bigr) e^{ik\cdot Z}(z) \tag{21}$$

which are both obtained by picture raising the vertex operators in (13). For non-zero momentum we thus see that picture raising is an isomorphism in the absolute cohomology, too. Together, we have

$$X^- : \; H^{-1,-1}(k\cdot k = 0) \xrightarrow{\;\cong\;} H^{-1,0}(k\cdot k = 0) \qquad \text{at } k \neq 0 \tag{22}$$

for $F$ as well as for $F_{\rm rel}$.

In the exceptional case, $k = 0$, things are strikingly different. $Q$ vanishes identically on the relative Fock space $F_{\rm rel}^{-1,0}(k = 0)$, and any of the states in (15) represents its own nontrivial cohomology class even though the picture-raising operation annihilates the (−1, −1) vertex operator. Moreover, explicit calculations at higher pictures seem to indicate a proliferation of physical states. Therefore, the exceptional relative cohomologies $H^{\pi^+,\pi^-}(F_{\rm rel}|k = 0)$ look entirely different in various pictures.

⁶ Note that in our conventions $k^+$ and $k^-$ are related by complex conjugation and thus cannot vanish individually. This is different in a real SL(2, R) notation [18,19], where $k^+ = 0$ is possible with non-zero $k^-$. In such a case the representative (18) can be replaced by $\epsilon_{\mu\nu} k^{-\nu} A^{\mu}_0$, but the cohomology is unchanged.
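The remark that $Q^2 = 0$ can be checked explicitly for the action (17) can be illustrated as follows. This is a sketch, not the authors' computation: it assumes that the contraction $k^+_\nu k^{-\nu}$ vanishes on the massless shell $k\cdot k = 0$ (in the conventions of ref. [11]) and that $C^{\mu\nu}_N$ is antisymmetric in $\mu\nu$ for every $N$.

```latex
\begin{align*}
  Q^2 A^\mu_N
    &= 2k^{-\mu}\, Q B_N + k^+_\nu\, Q C^{\nu\mu}_N \\
    &= 2k^{-\mu} k^+_\rho A^\rho_{N+1}
       + k^+_\nu \bigl( 2k^{-\nu} A^\mu_{N+1} - 2k^{-\mu} A^\nu_{N+1} \bigr) \\
    &= 2\,(k^+_\nu k^{-\nu})\, A^\mu_{N+1} , \\[4pt]
  Q^2 B_N
    &= k^+_\mu\, Q A^\mu_{N+1}
     = 2\,(k^+_\mu k^{-\mu})\, B_{N+1} + k^+_\mu k^+_\nu\, C^{\nu\mu}_{N+1}
     = 2\,(k^+_\mu k^{-\mu})\, B_{N+1} .
\end{align*}
% The C-term in Q^2 B_N drops out by antisymmetry of C^{\nu\mu}.
% Both results are proportional to k^+_\nu k^{-\nu} and vanish under the stated assumption.
% The same computation with N = 0 shows that the representative (18) is closed:
%   Q (k^+_\mu A^\mu_0) = 2 (k^+_\mu k^{-\mu}) B_0 + k^+_\mu k^+_\nu C^{\nu\mu}_0 = 0 .
```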


To work out the exceptional absolute cohomology, we additionally have to consider the states in (15) multiplied by $c_0$. For $k = 0$ one finds that $Q$ acts on these states as

$$Q\, c_0 A^{\mu}_N = -4 A^{\mu}_{N+1} , \qquad Q\, c_0 B_N = -4 B_{N+1} , \qquad Q\, c_0 C^{\mu\nu}_N = -4 C^{\mu\nu}_{N+1} . \tag{23}$$
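The action (23) determines the absolute zero-momentum cohomology in the (−1, 0) picture by a direct kernel-modulo-image count. The sketch below uses only the decomposition $F = F_{\rm rel} \oplus c_0 F_{\rm rel}$, the vanishing of $Q$ on $F_{\rm rel}$ at $k = 0$, and Eq. (23); the symbol $X_N$ is shorthand introduced here for any one of the three families.

```latex
% Notation (introduced here): X_N stands for A^\mu_N, B_N or C^{\mu\nu}_N.
% At k = 0:  Q X_N = 0  and  Q c_0 X_N = -4 X_{N+1}.
\[
  \Psi = \sum_{N\ge 0} \bigl( a_N X_N + b_N\, c_0 X_N \bigr), \qquad
  Q\Psi = -4 \sum_{N\ge 0} b_N X_{N+1} .
\]
% closed states:  b_N = 0 for all N, i.e. \Psi \in F_rel
% exact states:   span{ X_{N+1} : N >= 0 }
% quotient:       one class per family, represented by X_0
\[
  H^{-1,0}(F\,|\,k=0) \;\cong\;
  \mathrm{span}\{ A^\mu_0 \} \ (g=1) \;\oplus\; \mathrm{span}\{ B_0 ,\, C^{\mu\nu}_0 \} \ (g=2) .
\]
```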

Obviously, the absolute zero-momentum cohomology $H^{-1,0}(F|k = 0)$ is spanned by the two states $A^{\mu}_0$ at ghost number one, by two more, $B_0$ and $C^{\mu\nu}_0 = -C^{\nu\mu}_0$, at ghost number two, and vanishes at any other ghost number.

Are these results for $H^{-1,0}(k = 0)$ consistent with the sequence (10)? At odd positive ghost number $g = 2N + 1$, the relative cohomology is spanned by $A^{\mu}_N$ which contains $N$ powers of $\gamma^+_{1/2}\gamma^-_{-1/2}$. The connecting homomorphism $\{Q, c_0\}$ acts (up to a numerical factor) by multiplication of just such a factor. We thus see that it is an isomorphism between the relative cohomologies with odd ghost number. The same is true for positive even ghost number. But this is precisely what we learn from the sequence (10) if we insert the result that the absolute cohomology vanishes at ghost number greater than 2.

Let us briefly summarise the result of the above calculations for $k\cdot k = 0$: At nonzero momentum, picture raising establishes an isomorphy between the (−1, −1) and the (−1, 0) pictures for both the relative and the absolute cohomology. For vanishing momentum, however, the cohomologies look very different. In the (−1, −1) picture both the absolute and the relative cohomology are obtained by the zero-momentum limit of the cohomology at non-vanishing momentum. In the (−1, 0) picture the BRST operator vanishes in the relative Fock space. The relative cohomology is two-dimensional at any positive ghost number. In contrast, the absolute cohomology is two-dimensional only at ghost numbers one and two and vanishes elsewhere. In other pictures one finds non-trivial absolute cohomology classes also at ghost numbers zero and three. For example, the states

$$|0, 0, k = 0\rangle \qquad \text{and} \qquad c_{-1} c_0 c_1 \,|{-2},{-2},k = 0\rangle \tag{24}$$

are both BRST invariant but not exact. This is in contrast to the N = 1 string where picture-lowering guarantees the picture independence of the absolute cohomology even in the exceptional case. In Sect. 4, however, we will prove that the absolute exceptional cohomology vanishes for ghost number $g \neq 0, 1, 2, 3$ at any picture. The picture dependence can thus only occur for these ghost numbers.

2.3. Complete calculation in the (−1, −1) picture. The above calculations were all done for $k\cdot k = 0$. But what about massive states? Surely such states would carry additional Lorentz indices and therefore describe higher spin fields. Due to the absence of transverse dimensions in the (2,2) space-time, these states should not contribute any physical degrees of freedom, leaving the ground state as the only physical state. Although this sounds very plausible it is not what one would call a rigorous computation of the relative BRST cohomology. The most powerful approach to this kind of problem has been invented by Frenkel, Garland and Zuckerman [2] and extended to the N = 1 string in the −1 picture by Lian and Zuckerman [10]. Their method consists of introducing a new kind of grading – the filtration degree – to reduce the computation of the BRST cohomology to a standard problem of Lie algebra cohomology and can be applied to the N = 2 string, as well. Its essential new feature, namely the existence of the additional bosonic current $J$, can be incorporated in a straightforward way by simply extending the definition of the relative Fock space as indicated in Sect. 2.1. Another important ingredient in this analysis is that the Fock space of the matter sector must be a free module of the algebra of the negatively moded N = 2 super Virasoro generators. This property is also satisfied for critical N = 2 strings. For non-vanishing momentum it has in fact been shown in ref. [20] that the Fock space is a direct sum of universal enveloping algebras of the negative N = 2 super Virasoro algebra.⁷ The rest of the argument works in complete analogy to the N = 1 string, and it does not seem necessary to repeat it here since it has been described in great detail in ref. [10]. One finally arrives at the expected result that the state (11) is the only physical degree of freedom in the (−1, −1) picture and that there is no room for discrete states or other surprises. For $k\cdot k > 0$ we thus have

$$H^{g,-1,-1}(F) = H^{g,-1,-1}(F_{\rm rel}) = 0 \qquad \text{for any } g . \tag{25}$$

Unfortunately, this kind of analysis applies only to the (−1, −1) picture. The latter is singled out as the only picture where the creation (annihilation) operators are precisely the negatively (positively) moded oscillators and which has a nondegenerate scalar product with itself. Perhaps it is possible to find a clever redefinition of the filtration degree to apply this method also to other pictures, but it is not obvious to the authors how this could be done.

2.4. Spectral flow. We finally discuss one further aspect of the N = 2 string, namely spectral flow [8,12]. However, this will only be needed for the discussion in Sect. 4. Spectral flow is an automorphism of the N = 2 superconformal algebra associated to the U(1) subalgebra. An explicit construction is presented in the appendix of ref. [11]. If the spectral flow parameter $\Theta$ is chosen from the interval (0, 1), the spectral flow operator $S(\Theta)$ relates sectors with different boundary conditions (see however ref. [6] for a different point of view). For $\Theta = 1$ it is a map within each sector and has a number of useful properties [12]: it has zero ghost number, commutes with $Q$, changes $\pi^+$ by $+1$, $\pi^-$ by $-1$ and is invertible (choose $\Theta = -1$). It is therefore an isomorphism of the cohomologies,

$$S(1) : \; H^{\pi^+,\pi^-}(F) \xrightarrow{\;\cong\;} H^{\pi^+ +1,\,\pi^- -1}(F) , \tag{26}$$

and it follows by induction that

$$H^{\pi^+,\pi^-}(F) \cong H^{\pi^+ +n,\,\pi^- -n}(F) \tag{27}$$

for arbitrary $\pi^+$, $\pi^-$, $k$ and any integer $n$. Moreover, $S(1)$ commutes with the picture-raising operators $X^\pm$ up to BRST trivial terms [12], i.e. it commutes with them on the cohomology spaces. Since we have seen above that, for non-vanishing momentum, $X^-$ is an isomorphism between $H^{-1,-1}(F|k\cdot k = 0)$ and $H^{-1,0}(F|k\cdot k = 0)$, the commutative diagram

$$\begin{array}{ccc}
H^{-1,-1}(F|k\cdot k = 0) & \xrightarrow[\;\cong\;]{\;X^-\;} & H^{-1,0}(F|k\cdot k = 0) \\[4pt]
{\scriptstyle S(1)^n}\Big\downarrow{\scriptstyle \cong} & & {\scriptstyle S(1)^n}\Big\downarrow{\scriptstyle \cong} \\[4pt]
H^{-1+n,-1-n}(F|k\cdot k = 0) & \xrightarrow{\;X^-\;} & H^{-1+n,-n}(F|k\cdot k = 0)
\end{array}$$

implies that $X^-$ is also an isomorphism in the bottom row. Thus, the spaces

$$H^{\pi^+,\pi^-}(F|k\cdot k = 0) \qquad \text{for } \pi^+ + \pi^- \in \{-2, -1\} \tag{28}$$

⁷ For $k = 0$ this is not true since the ground state is then annihilated by $L_{-1}$. As in other string theories, for this reason such kind of analysis does not apply in the exceptional case.

are all isomorphic for non-zero momentum. Finally, let us remark that the above argument is not true for the relative cohomology since $S(1)$ does not commute with $b_0$.

3. Picture-Lowering

In this section we apply the method of Sect. 2 of ref. [1] to the open N = 2 string. We will, however, refrain from presenting the details since the calculations carry over in a straightforward way. To begin with, let us recall the bosonisation of the spinor ghosts of the N = 2 string [5]:

$$\gamma^\pm(z) \to \eta^\pm e^{\varphi^\pm}(z) , \qquad \beta^\mp(z) \to e^{-\varphi^\pm} \partial\xi^\mp . \tag{29}$$

The zero modes $\xi^\pm_0$ of the weight-zero fields $\xi^\pm(z)$ do not take part in this process, and thus the Fock space $F$ is extended to the bigger space $\bar F$. The picture-raising operators acting on $F$ are defined as

$$X^\pm_0 := \{Q, \xi^\pm_0\} = \oint \frac{dz}{2\pi i z}\, X^\pm(z) , \qquad X^\pm(z) := \{Q, \xi^\pm(z)\} \tag{30}$$

and map a BRST-closed state $|\psi\rangle \in F$ to $Q \xi^\pm_0 |\psi\rangle$ which is trivial in $\bar F$ but not in $F$. Note that both $X^\pm_0$ do not contain any $\xi^\pm_0$ and therefore are maps within the small space $F$. Following ref. [1] we consider the momentum operators in the (−1, 0) and (0, −1) picture:

$$\tilde p^{\pm\mu} = \oint \frac{dz}{2\pi i}\, e^{-\varphi^\pm} i\psi^{\pm\mu} . \tag{31}$$

Because of

$$[Q, e^{-\varphi^\pm} i\psi^{\pm\mu}] = \partial\bigl( c\, e^{-\varphi^\pm} i\psi^{\pm\mu} \bigr) , \tag{32}$$

$\tilde p^{\pm\mu}$ is BRST invariant and satisfies the key relations

$$X^\pm_0\, \tilde p^{\mp\mu} = 2p^{\mp\mu} + \{Q, m^{\pm\mu}\} , \qquad \tilde p^{\mp\mu} X^\pm_0 = 2p^{\mp\mu} + \{Q, n^{\pm\mu}\} , \tag{33}$$

where $p^{\pm\mu}$ is the center-of-mass momentum,

$$p^{\pm\mu} = \oint \frac{dz}{2\pi i}\, i\partial Z^{\pm\mu} , \tag{34}$$

and $m^{\pm\mu}$ and $n^{\pm\mu}$ are given by

$$m^{\pm\mu} = \oint \frac{dz_1}{2\pi i z_1} \oint_{|z_2|<|z_1|} \frac{dz_2}{2\pi i} \int_{z_2}^{z_1} dw\; \partial\xi^\pm(w)\, e^{-\varphi^\mp} i\psi^{\mp\mu}(z_2) , \tag{35}$$

$$n^{\pm\mu} = \oint \frac{dz_2}{2\pi i} \oint_{|z_1|<|z_2|} \frac{dz_1}{2\pi i z_1} \int_{z_2}^{z_1} dw\; e^{-\varphi^\mp} i\psi^{\mp\mu}(z_2)\, \partial\xi^\pm(w) . \tag{36}$$

The proof of the analogue of Eqs. (33) for the N = 1 string has been given in ref. [1], Sect. 2, and works in our present case, as well. For completeness we present the calculation that establishes the first of Eqs. (33). In terms of conformal fields the expression $X^+_0 \tilde p^{-\mu}$ reads

$$X^+_0\, \tilde p^{-\mu} = \oint \frac{dz_1}{2\pi i z_1} \oint_{|z_2|<|z_1|} \frac{dz_2}{2\pi i}\; X^+(z_1)\, e^{-\varphi^-} i\psi^{-\mu}(z_2) . \tag{37}$$

As the fields $X^+$ and $e^{-\varphi^-} i\psi^{-\mu}$ approach each other, no singularity appears since they have the short distance expansion

$$X^+(z)\, e^{-\varphi^-} i\psi^{-\mu}(w) \sim c\,\partial\xi^+ e^{-\varphi^-} i\psi^{-\mu}(w) + 2i\partial Z^{-\mu}(w) + O(z - w) . \tag{38}$$

We can therefore insert the relation

$$X^+(z_1) = X^+(z_2) + \int_{z_2}^{z_1} dw\, \{Q, \partial\xi^+(w)\} \tag{39}$$

into (37) and obtain

$$\begin{aligned}
X^+_0\, \tilde p^{-\mu} = {} & \oint \frac{dz_2}{2\pi i} \bigl( c\,\partial\xi^+ e^{-\varphi^-} i\psi^{-\mu} + 2i\partial Z^{-\mu} \bigr)(z_2) \\
& + \oint \frac{dz_1}{2\pi i z_1} \oint_{|z_2|<|z_1|} \frac{dz_2}{2\pi i} \int_{z_2}^{z_1} dw\, \{Q, \partial\xi^+(w)\}\, e^{-\varphi^-} i\psi^{-\mu}(z_2) .
\end{aligned} \tag{40}$$

With the help of Eq. (32) the integrand of the last term can be rewritten as

$$\{Q, \partial\xi^+(w)\}\, e^{-\varphi^-} i\psi^{-\mu}(z_2) = \{Q, \partial\xi^+(w)\, e^{-\varphi^-} i\psi^{-\mu}(z_2)\} + \partial\xi^+(w)\, \partial\bigl( c\, e^{-\varphi^-} i\psi^{-\mu}(z_2) \bigr) . \tag{41}$$

The second integral in (40) thus becomes

$$\begin{aligned}
& \Bigl\{ Q \,,\, \oint \frac{dz_1}{2\pi i z_1} \oint_{|z_2|<|z_1|} \frac{dz_2}{2\pi i} \int_{z_2}^{z_1} dw\; \partial\xi^+(w)\, e^{-\varphi^-} i\psi^{-\mu}(z_2) \Bigr\} \\
& \quad + \oint \frac{dz_1}{2\pi i z_1} \oint_{|z_2|<|z_1|} \frac{dz_2}{2\pi i}\, \bigl( \xi^+(z_1) - \xi^+(z_2) \bigr)\, \partial\bigl( c\, e^{-\varphi^-} i\psi^{-\mu}(z_2) \bigr) \\
& = \{Q, m^{+\mu}\} - \oint \frac{dz_2}{2\pi i}\, \xi^+(z_2)\, \partial\bigl( c\, e^{-\varphi^-} i\psi^{-\mu}(z_2) \bigr) .
\end{aligned} \tag{42}$$

Substituting this back into (40) yields

$$X^+_0\, \tilde p^{-\mu} = 2p^{-\mu} + \{Q, m^{+\mu}\} + \oint \frac{dz_2}{2\pi i} \bigl( c\,\partial\xi^+ e^{-\varphi^-} i\psi^{-\mu} - \xi^+ \partial( c\, e^{-\varphi^-} i\psi^{-\mu} ) \bigr)(z_2) . \tag{43}$$

This proves the first of Eqs. (33) since the integrand in the last term is a total derivative and the integral thus vanishes. Because $p^{\pm\mu}$ is picture-neutral, the relations (33) ensure that $X^\pm_0$ are bijective maps between absolute cohomology classes at non-zero momentum and therefore prove their picture independence (see ref. [1] for more details). Moreover, all operators involved commute with $b_0$ and $\tilde b_0$ and thus generalise the results to the relative cohomologies. Although obvious, let us emphasise that the above argument is invalid on states with vanishing momentum. There is no contradiction to the results of Sect. 2.2.
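How the relations (33) yield an inverse of picture raising on cohomology can be spelled out in one step. The following is a sketch only: the operator $Y^\pm$ is a name introduced here for illustration, and it is assumed that the state is a momentum eigenstate with some component $k^{\mp\mu} \neq 0$.

```latex
% Hypothetical notation: Y^\pm is not defined in the text.
% Let Q|\psi\rangle = 0 and p^{\mp\mu}|\psi\rangle = k^{\mp\mu}|\psi\rangle with k^{\mp\mu} \neq 0
% for one fixed index \mu (no summation over \mu below). Define
\[
  Y^{\pm} := \frac{1}{2k^{\mp\mu}}\, \tilde p^{\mp\mu} .
\]
% Eq. (33), applied to the closed state |\psi\rangle, gives
\begin{align*}
  X_0^{\pm}\, Y^{\pm} |\psi\rangle
    &= |\psi\rangle + Q\Bigl( \tfrac{1}{2k^{\mp\mu}}\, m^{\pm\mu} |\psi\rangle \Bigr), \\
  Y^{\pm} X_0^{\pm} |\psi\rangle
    &= |\psi\rangle + Q\Bigl( \tfrac{1}{2k^{\mp\mu}}\, n^{\pm\mu} |\psi\rangle \Bigr),
\end{align*}
% since \{Q, m\}|\psi\rangle = Q\, m|\psi\rangle when Q|\psi\rangle = 0.
% On cohomology classes Y^\pm is therefore a two-sided inverse of X_0^\pm.
% At k = 0 no such \mu exists and the construction is not available.
```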


4. An Alternative Proof

In this section we give an alternative proof, inspired by ref. [7], that picture raising is an isomorphism of the absolute massless cohomology for non-vanishing momentum. Here we do not refer to any kind of picture-lowering, and thus this analysis is logically independent from that of Sect. 3. After all, it is good to have two separate proofs of one statement. Unfortunately, we can only treat the absolute, but not the relative cohomology within this approach. We are able, however, to obtain some information about the picture dependence of the absolute cohomology in the exceptional ($k = 0$) case. Before considering the cohomology of the N = 2 string at arbitrary picture, let us briefly review part of the work of Narganes-Quijano [7].

4.1. The N = 1 string. Bosonisation in the N = 1 theory consists of replacing the $\gamma$ and $\beta$ ghosts by [3]

$$\gamma \to \eta\, e^{\varphi} , \qquad \beta \to e^{-\varphi} \partial\xi . \tag{44}$$

As already mentioned in Sect. 3, this extends the Fock space $F$ to the larger space $\bar F = F \oplus \xi_0 F$. We thus have a situation completely analogous to that described in Sect. 2.1. Consider the inclusion $i : F \to \bar F$ and the projection $pr : \bar F \to F$, defined as

$$i(a) := a + \xi_0\, 0 , \qquad pr(a + \xi_0 b) := b , \qquad a, b \in F . \tag{45}$$

Note that the projection has ghost number one and picture number minus one. The corresponding exact sequence is

$$0 \longrightarrow F \xrightarrow{\;i\;} \bar F \xrightarrow{\;pr\;} F \longrightarrow 0 . \tag{46}$$

Since both the inclusion and the projection (anti-)commute with $Q$, this exact sequence again induces an exact cohomology triangle. The connecting homomorphism here is nothing but the picture-raising operator $X_0 = \{Q, \xi_0\}$! Including the gradation with respect to picture number yields the long exact sequence

$$\cdots \longrightarrow H^{\pi}(\bar F) \xrightarrow{\;pr\;} H^{\pi-1}(F) \xrightarrow{\;X_0\;} H^{\pi}(F) \xrightarrow{\;i\;} H^{\pi}(\bar F) \longrightarrow \cdots \tag{47}$$

between the various cohomology spaces. Now, the key observation is that the BRST cohomology of $\bar F$ is trivial! This follows immediately from the existence of the operator $W = 4 c\, \xi \partial\xi\, e^{-2\varphi}$ with the property

$$\{Q, W(z)\} = 1 . \tag{48}$$

Inserting $H^{\pi}(\bar F) = 0$ for arbitrary $\pi$ into the above sequence implies that $X_0$ is an isomorphism between $H^{\pi-1}(F)$ and $H^{\pi}(F)$, without referring to any kind of picture-lowering operator. It is, however, important to note that the operator $W$ does not commute with $b_0$. Therefore, the cohomology in the large relative space $\bar F \cap {\rm Ker}\, b_0$ need not be trivial. Correspondingly, the above construction does not imply that the picture-raising operation is an isomorphism between the relative cohomologies, as well.
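Why an operator with the property (48) trivialises the cohomology of the large space is the standard contracting-homotopy argument. The lines below are a sketch in which $W$ is treated as an operator acting on $\bar F$ that satisfies $\{Q, W\} = 1$; this operator reading of the field identity (48) is an assumption made here.

```latex
% Assumption: W acts on \bar F and obeys \{Q, W\} = Q W + W Q = 1.
% Let |\psi\rangle \in \bar F be BRST-closed, Q|\psi\rangle = 0. Then
\[
  |\psi\rangle = \{Q, W\}\,|\psi\rangle
              = Q\,W|\psi\rangle + W\,Q|\psi\rangle
              = Q\bigl( W|\psi\rangle \bigr) ,
\]
% so every closed state of \bar F is exact.
% The primitive W|\psi\rangle generally involves \xi_0, which is why the
% statement holds in \bar F but says nothing about exactness within F.
```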


4.2. The N = 2 string. In N = 2 string theory, bosonisation extends the Fock space by two additional oscillators:

$$\bar F = F \oplus \xi^+_0 F \oplus \xi^-_0 F \oplus \xi^+_0 \xi^-_0 F . \tag{49}$$

The first step is to check whether the cohomology in the large space is trivial. Indeed, there exists an operator $W$ with the right property, namely⁸

$$W(z) = -\tfrac{1}{4}\, c\, \xi^+ \xi^- e^{\varphi^+} e^{\varphi^-}(z) , \qquad [Q, W(z)] = 1 . \tag{50}$$

This result only holds for this special choice of bosonisation. If one bosonises a different linear combination of the spinor ghosts, the corresponding $W$ does not exist. For the cohomology in the small space this is however irrelevant. As in the N = 1 theory, the operator $W$ does not commute with $b_0$ so that we cannot obtain information about the relative cohomology within this approach. The situation is more complicated than for the N = 1 string, because the small and the large Fock space cannot be connected in such a simple way as in (46). We thus have to proceed in two steps. First let us define

$$F_\pm := F \oplus \xi^\pm_0 F \tag{51}$$

and the projection $pr : \bar F \to F_-$ by

$$pr(a + \xi^-_0 b + \xi^+_0 c + \xi^+_0 \xi^-_0 d) := c + \xi^-_0 d \tag{52}$$

for $a, b, c, d \in F$. The map $pr$ again has picture number minus one and anticommutes with $Q$ since

$$\begin{aligned}
(pr \circ Q)(a + \xi^-_0 b + \xi^+_0 c + \xi^+_0 \xi^-_0 d)
&= pr(Qa + Q\xi^-_0 b + Q\xi^+_0 c + Q\xi^+_0 \xi^-_0 d) \\
&= pr(Qa + X^-_0 b - \xi^-_0 Qb + X^+_0 c - \xi^+_0 Qc + X^+_0 \xi^-_0 d - \xi^+_0 Q\xi^-_0 d) \\
&= -Q(c + \xi^-_0 d) \\
&= -(Q \circ pr)(a + \xi^-_0 b + \xi^+_0 c + \xi^+_0 \xi^-_0 d) ,
\end{aligned} \tag{53}$$

where all $\xi^\pm_0$ are explicit. Together with the inclusion $i : F_- \to \bar F$ (which trivially commutes with $Q$) one can form the exact sequence

$$0 \longrightarrow F_- \xrightarrow{\;i\;} \bar F \xrightarrow{\;pr\;} F_- \longrightarrow 0 . \tag{54}$$

As in the N = 1 theory, this induces the long exact cohomology sequence

$$\cdots \longrightarrow H^{\pi^+,\pi^-}(\bar F) \xrightarrow{\;pr\;} H^{\pi^+ -1,\pi^-}(F_-) \xrightarrow{\;X^+_0\;} H^{\pi^+,\pi^-}(F_-) \xrightarrow{\;i\;} H^{\pi^+,\pi^-}(\bar F) \longrightarrow \cdots . \tag{55}$$

Using $H^{\pi^+,\pi^-}(\bar F) = 0$ we obtain the following

⁸ There is a misprint in Eq. (7.3) of ref. [11]. The second $\tilde b$ must be replaced by $b$; it is this term that produces a pole when contracted with $W$.


Lemma 1. The maps

$$X^+_0 : \; H^{\pi^+,\pi^-}(F_-) \longrightarrow H^{\pi^+ +1,\pi^-}(F_-) \tag{56}$$

and

$$X^-_0 : \; H^{\pi^+,\pi^-}(F_+) \longrightarrow H^{\pi^+,\pi^- +1}(F_+) \tag{57}$$

are isomorphisms. Thus $H^{\pi^+,\pi^-}(F_-)$ can depend only on $\pi^-$ and $H^{\pi^+,\pi^-}(F_+)$ only on $\pi^+$.

Note that this result holds for any value of the momentum. In the second step consider the projection

$$pr' : F_- \to F , \qquad pr'(a + \xi^-_0 b) = b . \tag{58}$$

Again it anticommutes with $Q$, and via the exact sequence

$$0 \longrightarrow F \xrightarrow{\;i\;} F_- \xrightarrow{\;pr'\;} F \longrightarrow 0 \tag{59}$$

and the corresponding exact triangle one obtains for each pair $(\pi^+, \pi^-)$ a long exact cohomology sequence with connecting homomorphism $X^-_0$:

$$\cdots \xrightarrow{\;X^-_0\;} H^{g,\pi^+,\pi^-}(F) \xrightarrow{\;i\;} H^{g,\pi^+,\pi^-}(F_-) \xrightarrow{\;pr'\;} H^{g+1,\pi^+,\pi^- -1}(F) \xrightarrow{\;X^-_0\;} H^{g+1,\pi^+,\pi^-}(F) \xrightarrow{\;i\;} \cdots . \tag{60}$$

We first treat the exceptional case. In Sect. 2.2 it has been shown that

$$H^{g,-1,0}(F|k = 0) = H^{g,-1,-1}(F|k = 0) = 0 \qquad \text{for } g \neq 1, 2 . \tag{61}$$

This can be inserted into (60) at $\pi^+ = -1$ and $\pi^- = 0$ to yield

$$H^{g,-1,0}(F_-|k = 0) = 0 \qquad \text{for } g \neq 0, 1, 2 . \tag{62}$$

Since spectral flow has ghost number zero, it follows from Eq. (27) that

$$H^{g,\pi^+,\pi^-}(F_-|k = 0) = 0 \qquad \text{for } \pi^+ + \pi^- = -1 \text{ and } g \neq 0, 1, 2 . \tag{63}$$

Lemma 1 guarantees that these cohomologies do not depend on $\pi^+$, which forces them to vanish at all pictures. Once more using the sequence (60) implies that $X^-_0$ is an isomorphism of the absolute exceptional cohomology for $g \neq 0, 1, 2, 3$. An analogous argument proves that $X^+_0$ generates isomorphies likewise. Since

$$H^{g,-1,-1}(F|k = 0) = 0 \qquad \text{for } g \neq 1, 2 \tag{64}$$

we conclude that

$$H^{g,\pi^+,\pi^-}(F|k = 0) = 0 \qquad \text{for } g \neq 0, 1, 2, 3 \tag{65}$$

in an arbitrary picture. For $g \in \{0, 1, 2, 3\}$ we have seen counterexamples in Sect. 2.2. Note that these results hold for the absolute cohomology only.

If $k\cdot k = 0$ but $k \neq 0$ the situation is more difficult. We can, however, extract one more piece of information from the sequence (60):


Lemma 2. If there exists a pair of numbers $\hat\pi^+$, $\hat\pi^-$ such that

$$X^+_0 : \; H^{\hat\pi^+,\hat\pi^-}(F) \longrightarrow H^{\hat\pi^+ +1,\hat\pi^-}(F) \tag{66}$$

is an isomorphism, then

$$X^+_0 : \; H^{\hat\pi^+,\pi^-}(F) \longrightarrow H^{\hat\pi^+ +1,\pi^-}(F) \tag{67}$$

is an isomorphism for arbitrary $\pi^-$. Spectral flow then establishes isomorphy of all $H^{\pi^+,\pi^-}(F)$. An analogous result holds for $X^-_0$.

This statement is again valid for all momenta. Lemma 2 is a direct application of the five-lemma from the theory of exact sequences (for example see ref. [21]). The important observation is that the sequence (60) depends on $\pi^+$. To prove that $X^+_0$ is bijective we therefore write down the sequences for $\hat\pi^+$ and $\hat\pi^+ + 1$ side by side and connect them by $X^+_0$ (due to lack of space the following diagram is rotated by 90° from its usual form):

$$\begin{array}{ccc}
H^{\hat\pi^+,\hat\pi^- +1}(F_-) & \xrightarrow[\;\cong\;]{\;X^+_0\;} & H^{\hat\pi^+ +1,\hat\pi^- +1}(F_-) \\[2pt]
\Big\downarrow{\scriptstyle pr'} & & \Big\downarrow{\scriptstyle pr'} \\[2pt]
H^{\hat\pi^+,\hat\pi^-}(F) & \xrightarrow{\;X^+_0\;} & H^{\hat\pi^+ +1,\hat\pi^-}(F) \\[2pt]
\Big\downarrow{\scriptstyle X^-_0} & & \Big\downarrow{\scriptstyle X^-_0} \\[2pt]
H^{\hat\pi^+,\hat\pi^- +1}(F) & \xrightarrow{\;X^+_0\;} & H^{\hat\pi^+ +1,\hat\pi^- +1}(F) \\[2pt]
\Big\downarrow{\scriptstyle i} & & \Big\downarrow{\scriptstyle i} \\[2pt]
H^{\hat\pi^+,\hat\pi^- +1}(F_-) & \xrightarrow[\;\cong\;]{\;X^+_0\;} & H^{\hat\pi^+ +1,\hat\pi^- +1}(F_-) \\[2pt]
\Big\downarrow{\scriptstyle pr'} & & \Big\downarrow{\scriptstyle pr'} \\[2pt]
H^{\hat\pi^+,\hat\pi^-}(F) & \xrightarrow{\;X^+_0\;} & H^{\hat\pi^+ +1,\hat\pi^-}(F)
\end{array}$$

Due to Lemma 1, the first and fourth horizontal maps from the top are isomorphisms as indicated. Moreover, the columns are exact, and we have

$$[X^+_0, pr'] = [X^+_0, i] = 0 , \qquad [X^+_0, X^-_0] = X^+_0 \xi^-_0 Q - \xi^-_0 Q \xi^+_0 Q - Q [X^+_0, \xi^-_0] . \tag{68}$$

Since $[X^+_0, \xi^-_0]$ does not contain any $\xi^\pm_0$, this implies that all these maps commute on the cohomology spaces. Hence the diagram is commutative. If we now use the assumption of Lemma 2 that also the second and the bottom horizontal maps are isomorphisms, the five-lemma tells us that the horizontal map in the middle is an isomorphism, too. The lemma thus follows for all $\pi^- > \hat\pi^-$ by induction. The case $\pi^- < \hat\pi^-$ can be treated similarly. This concludes the proof.

Lemma 2 immediately applies to the massless case for non-zero momentum. Indeed, the isomorphy in (22) now implies that $H^{\pi^+,\pi^-}(F|k \neq 0)$ is picture independent. We add that our proof extends to the massive case as well. To employ Lemma 2 however, one first needs to show that $H^{-1,-1}(F) = H^{-1,0}(F)$, for example, at $k\cdot k > 0$. This gets involved due to the infinity of candidate states.
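For reference, the five-lemma used in the proof of Lemma 2 can be stated in its usual horizontal form. The block below is a generic statement for vector spaces; the labels $A_j$, $B_j$, $f_j$ are introduced here, and the dictionary in the comments is one reading of how they match the rotated diagram of the proof.

```latex
% Five-lemma: in a commutative diagram with exact rows
\[
\begin{array}{ccccccccc}
 A_1 & \to & A_2 & \to & A_3 & \to & A_4 & \to & A_5 \\
 \downarrow{\scriptstyle f_1} & & \downarrow{\scriptstyle f_2} & & \downarrow{\scriptstyle f_3}
   & & \downarrow{\scriptstyle f_4} & & \downarrow{\scriptstyle f_5} \\
 B_1 & \to & B_2 & \to & B_3 & \to & B_4 & \to & B_5
\end{array}
\]
% if f_1, f_2, f_4, f_5 are isomorphisms, then f_3 is an isomorphism.
%
% Reading for Lemma 2 (rows and columns exchanged, all f_j = X_0^+):
%   f_1, f_4 : act on H(F_-)                        -> isomorphisms by Lemma 1
%   f_2, f_5 : act on H(F) at picture \hat\pi^-     -> isomorphisms by assumption
%   f_3      : acts on H(F) at picture \hat\pi^- + 1 -> isomorphism by the five-lemma
```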


5. Summary

To be able to properly put into context the results of this paper we again describe the cohomology of the N = 1 string. In N = 1 string theory the absolute BRST cohomology is picture independent for any value of the momentum. This can be proven either by inverting the picture-raising operator in a momentum independent way [4] or by exploiting the fact that the absolute cohomology in the extended Fock space is trivial [7]. At non-vanishing momentum this cohomology contains two copies of the space of states obtained e.g. by light-cone quantisation. At zero momentum even more states appear. To obtain a one-to-one relation between BRST and light-cone quantisation, it is necessary in addition to impose on physical states the condition $b_0 |{\rm phys}\rangle = 0$ which leads to the more relevant relative cohomology. Unfortunately, the analysis of refs. [4,7] does not apply to the relative cohomology. However, it has been shown recently [1] that the picture-raising operator can be inverted on states of non-vanishing momentum by an operator that commutes with $b_0$, thereby proving picture independence of the relative cohomology at non-vanishing momentum. At zero momentum the above argument does not work and, besides the fact that the zero-momentum cohomology is generally larger than the zero-momentum limit of the non-zero-momentum cohomology, there also is a picture dependence in the relative (but not in the absolute) case as has been demonstrated explicitly [1].

In N = 2 string theory, the relative BRST cohomology in the (−1, −1) picture can be computed rigorously along the lines of refs. [2,10], as we described in Sect. 2.3. There exists only a single massless physical state at ghost number one. However, the issue of picture independence of the BRST cohomology has long been unclear since the picture-raising operators cannot be inverted in a momentum independent way. Hence, an argument analogous to that of ref. [4] for the N = 1 string does not exist in this theory.
Nevertheless, we proved the picture independence of the relative and the absolute cohomology at non-vanishing momentum, providing two different methods for the absolute massless case. In Sect. 3 we applied the ideas of ref. [1] to the N = 2 string thus showing the picture independence at non-vanishing momentum in a rather straightforward way. In Sect. 4 we gave an alternative treatment based on the fact that, as in the N = 1 theory, the absolute cohomology in the extended Fock space is trivial. Combined with the spectral flow automorphism of the N = 2 superconformal algebra and explicit computations in simple pictures we again proved inductively that picture raising is an isomorphism of the absolute massless cohomology at non-zero momentum, without referring to any kind of picture-lowering. However, this argument is restricted to the absolute cohomology and does not apply to the relative case. Higher mass levels can principally be treated in the same way. The virtue of the rather complicated analysis of Sect. 4 is that, in contrast to the argument of Sect. 3, it also allows one to constrain the exceptional cohomology at zero momentum where the concept of picture-lowering breaks down completely. By explicit computation in this case, we found a picture dependence of both the relative and the absolute exceptional cohomology. In the (−1, −1) picture we showed that the zero-momentum cohomology is simply the zero-momentum limit of the cohomology at non-vanishing momentum. In the (−1, 0) picture, however, the relative cohomology is two-dimensional at any positive ghost number, whereas the absolute cohomology is two-dimensional for ghost numbers one and two only. In higher pictures physical states with ghost numbers zero and three also appear. About the zero-momentum case we could only prove that its absolute cohomology vanishes at any picture for ghost

68

K. Jünemann, O. Lechtenfeld

numbers $g \neq 0, 1, 2$ or $3$. The picture dependence does not disappear when taking into account the center-of-mass coordinate of the string. The computed dimensions of the zero-momentum cohomologies are summarised as follows, with $\pi := \pi^+ + \pi^-$.

g →

dim H (F )

π↓

<0

-4

0

-3

0

0

-2

0

-1

0

0

0

dim H (Frel )

2

3

>3

3

1

0

2

2

0

0

2

0

1

1

0

0

0

2

2

0

0

1

3

0

0

1

<0

1

2

3

>3

2

1

0

0

2

2

0

0

0

0

0

1

0

0

0

0

0

2

2

2

2

0

1

2

0

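The existence of a non-degenerate scalar product on the full cohomology implies a pairing
\[
\begin{aligned}
(g, \pi^+, \pi^-) &\longleftrightarrow (3 - g,\, -2 - \pi^+,\, -2 - \pi^-) && \text{for } F, \\
(g, \pi^+, \pi^-) &\longleftrightarrow (2 - g,\, -2 - \pi^+,\, -2 - \pi^-) && \text{for } F_{\rm rel},
\end{aligned}
\tag{69}
\]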

so that the dimensions of the corresponding cohomologies coincide. We know from the BRST quantisation of gauge theories that extra physical states at zero momentum are remnants of gauge and ghost degrees of freedom off the mass shell. These are necessary for a covariant formulation but disappear when fixing a gauge and going on-shell. In string theory, such states should signal gauge symmetries present in a covariant string field formulation. The observed picture dependence then suggests a proliferation of field degrees of freedom in higher pictures, in tune with the results of [19]. Work in this direction is in progress. Acknowledgements. K. J. would like to thank P. Adamietz and J. Fuchs for useful discussions.

Note added in proof. The picture dependence of the zero-momentum BRST cohomology gives rise to an infinite set of conserved non-local symmetry charges, which generate the linearised symmetries of the Plebanski equation. The corresponding Ward identities yield the well known vanishing theorem for the tree-level amplitudes [22].

References

1. Berkovits, N. and Zwiebach, B.: Nucl. Phys. B 523, 311 (1998), hep-th/9711087
2. Frenkel, I., Garland, H. and Zuckerman, G.J.: Proc. Natl. Acad. Sci. USA 83, 8446 (1986)
3. Friedan, D., Martinec, E. and Shenker, S.: Nucl. Phys. B 271, 93 (1986)
4. Horowitz, G.T., Myers, R.C. and Martin, S.P.: Phys. Lett. B 218, 309 (1989)
5. Bischoff, J., Ketov, S.V. and Lechtenfeld, O.: Nucl. Phys. B 438, 37 (1995), hep-th/9406101
6. Lu, H. and Pope, C.N.: Nucl. Phys. B 447, 297 (1995), hep-th/9411101
7. Narganes-Quijano, F.J.: Phys. Lett. B 212, 292 (1988)
8. Schwimmer, A. and Seiberg, N.: Phys. Lett. B 184, 191 (1987)
9. Astashkevich, A. and Belopolsky, A.: Commun. Math. Phys. 186, 109 (1997), hep-th/9511111
10. Lian, B.H. and Zuckerman, G.J.: Commun. Math. Phys. 125, 30 (1989)
11. Bischoff, J. and Lechtenfeld, O.: Int. J. Mod. Phys. A 27, 4933 (1997), hep-th/9612218
12. Bischoff, J. and Lechtenfeld, O.: Phys. Lett. B 390, 153 (1997), hep-th/9608196
13. Ooguri, H. and Vafa, C.: Mod. Phys. Lett. A 5, 1389 (1990)
14. Zwiebach, B.: Nucl. Phys. B 390, 33 (1993), hep-th/9206084
15. Berkovits, N.: Nucl. Phys. B 420, 332 (1994), hep-th/9308129
16. Witten, E. and Zwiebach, B.: Nucl. Phys. B 377, 55 (1992), hep-th/9201056
17. Greub, W., Halperin, S. and Vanstone, R.: Connections, Curvature and Cohomology. Volume 3, New York: Academic Press, 1976
18. Lechtenfeld, O. and Siegel, W.: Phys. Lett. B 405, 49 (1997), hep-th/9704076
19. Devchand, Ch. and Lechtenfeld, O.: Nucl. Phys. B 516, 255 (1998), hep-th/9712043; Nucl. Phys. B 539, 309 (1999), hep-th/9808053
20. Bieńkowska, J.: Phys. Lett. B 281, 59 (1992), hep-th/9111047
21. Spanier, E.H.: Algebraic topology. New York: McGraw-Hill, 1966
22. Jünemann, K., Lechtenfeld, O. and Popov, A.D.: hep-th/9901164, to appear in Nucl. Phys. B

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 71 – 105 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

A Local (Perturbative) Construction of Observables in Gauge Theories: The Example of QED

M. Dütsch*, K. Fredenhagen

II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, D-22761 Hamburg, Germany. E-mail: [email protected]; [email protected]

Received: 5 August 1998 / Accepted: 20 November 1998

Abstract: Interacting fields can be constructed as formal power series in the framework of causal perturbation theory. The local field algebra $\tilde{\mathcal F}(\mathcal O)$ is obtained without performing the adiabatic limit; the (usually bad) infrared behavior plays no role. To construct the observables in gauge theories we use the Kugo–Ojima formalism; we define the BRST-transformation $\tilde s$ as a graded derivation on the algebra of interacting fields and use the implementation of $\tilde s$ by the Kugo–Ojima operator $Q_{\rm int}$. Since our treatment is local, the operator $Q_{\rm int}$ differs from the corresponding operator $Q$ of the free theory. We prove that the Hilbert space structure present in the free case is stable under perturbations. All assumptions are shown to be satisfied in QED.

1. Introduction

The quantization of gauge theories is a longstanding problem of theoretical physics. Since the works of Tomonaga, Schwinger, Feynman and Dyson in the late forties the problem has been solved for QED from a pragmatic point of view: the predictions (e.g. on the magnetic moment of the electron) are in perfect agreement with experiment. In the sixties and seventies the quantization of nonabelian gauge theories was developed by Feynman [F], Faddeev–Popov [FP], ’t Hooft, Becchi–Rouet–Stora [BRS], Kugo–Ojima [KO] and others. Weinberg and Salam proposed to base the theory of electroweak interactions on a spontaneously broken gauge model, which has survived the last thirty years. The ultraviolet divergences appearing in quantum field theory can be removed by various renormalization methods. An elegant method is causal perturbation theory, which was developed by Epstein and Glaser [EG] on the basis of ideas due to Stückelberg and Bogoliubov and Shirkov [BS]. However, the infrared problem is only partially solved. One aspect is that charged particles cannot be eigenstates of the mass operator (they have to be “infraparticles” [Sch, Bu2]). Another aspect of the infrared problem are the
Work supported by the Alexander von Humboldt Foundation


divergences which appear in the adiabatic limit g → const. of the S-matrix, where g is a space-time dependent coupling “constant”. In QED these divergences are logarithmic and cancel in the cross section. (This is proven at least at low orders of the perturbation series [S].) Moreover, Blanchard and Seneor [BlSe] proved that the adiabatic limit of Green’s and Wightman functions exists for QED. But in nonabelian gauge theories the divergences are worse. Perturbation theory seems to be unable to describe the long distance properties of these models (“confinement”). There is the hope that these two aspects of the infrared problem are directly connected. (See e.g. Scharf in [S], Sect. 3.12, where the existence and uniqueness of the adiabatic limit of the S-matrix is proposed to be a good criterion for a selection of physical states.) The main message of the present paper is that a local construction of the observables in gauge theories is possible without performing the adiabatic limit. Hence the infrared divergences do not occur in the construction of the model. They rather appear on the level of the long distance properties of the theory. We hope that due to its local character, our construction can be generalized to curved space-times, continuing the program of [BFK,BF]. The quantization of gauge fields in a renormalizable gauge requires an indefinite metric space. Afterwards, one has to prove that the Wightman distributions of gauge invariant fields fulfill the condition of positivity. In [DHKS1,S] the free Kugo–Ojima charge Q which implements the BRST-transformation of free fields is used to select the physical Hilbert space. But the commutator of Q with the interacting gauge invariant fields vanishes only up to a divergence. One may expect that in the adiabatic limit the positivity is satisfied, but a proof in the nonabelian case seems to be rather hard. Here we adopt another point of view which avoids the discussion of the adiabatic limit. 
Our way out is to work with the interacting Kugo–Ojima charge $Q_{\rm int}$ [KO], which implements the BRST-transformation [BRS] of the interacting fields. By means of the Ward identities we prove that the commutator of $Q_{\rm int}$ with the gauge invariant interacting fields of QED is in fact zero. The infrared divergences which remained an open problem in the Kugo–Ojima formalism are absent in our treatment. However, since $Q_{\rm int} \neq Q$ before the adiabatic limit, the construction of the physical Hilbert space cannot be done on the level of free fields. We show that the physical Hilbert space of the interacting model is obtained as a deformation of the physical Hilbert space of the free theory. Here we adopt ideas from deformation quantization as developed by Bordemann and Waldmann [BW].

The paper is organized as follows. In the next section we study the interacting fields in the framework of causal perturbation theory [BS,EG]. They are formal power series of (unbounded) operators in the Fock space of free fields. We point out that up to unitary equivalence the interacting fields depend on the interaction Lagrangian only locally. In Sect. 3 we specialize to QED and compute commutators of interacting fields to all orders. Thereby we essentially use the Ward identities, which are proven in Appendix B. In Sect. 4 we turn to the problems specific to gauge theories, the elimination of the unphysical fields and the mentioned positivity. We give a general local construction of the observables in gauge theories and of the physical Hilbert space in which the observables are faithfully represented. We prove that this structure is stable if we replace the free fields by the interacting ones. This general construction relies on some assumptions. They are verified in the case of QED in Sect. 5. The main problem is the construction of the interacting Kugo–Ojima charge $Q_{\rm int}$.
To avoid a volume divergence we embed the algebra of interacting fields on an arbitrary finitely extended region into the corresponding algebra over a spacetime with


compact spatial sections. This does not change the algebraic relations. We expect that also the corresponding Hilbert space representations of the local algebras of observables are unitarily equivalent, but this remains to be proven. The technical details of the free quantum gauge fields on the spatially compactified Minkowski space are written down in Appendix A.

2. Perturbative Construction of Interacting Fields

In the framework of causal perturbation theory [BS,EG,St3,S,BF], interacting fields can be constructed as formal power series of operator valued distributions on a dense invariant domain $\mathcal D$ in the Fock space of incoming fields. The interacting fields $A_{{\rm int}\,L}(x)$ ($A$ is a Wick polynomial of incoming fields) depend on an interaction Lagrangian $L$ which is a Wick polynomial of incoming fields with test functions $g \in \mathcal D(\mathbb R^4)$ as coefficients, so that the interaction is switched on only within a finitely extended region of spacetime. The crucial observation is that the dependence of the interacting fields on the interaction Lagrangian is local, in the sense that given a causally complete finitely extended open spacetime region $\mathcal O$, Lagrangians $L_1$ and $L_2$ which differ only within a closed region which does not intersect the closure of $\mathcal O$ lead to unitarily equivalent fields within $\mathcal O$, i.e. there exists a unitary formal power series $V$ of operators on $\mathcal D$ such that
\[
V A_{{\rm int}\,L_1}(x) V^{-1} = A_{{\rm int}\,L_2}(x), \qquad \forall x \in \mathcal O,
\tag{2.1}
\]

and $V$ does not depend on $A$ [BF]. This property (2.1) is a direct consequence of causality, which we are now going to explain. Building blocks for the construction of interacting fields are the time ordered products $T(A_1(x_1) \dots A_n(x_n))$ of Wick polynomials of free fields. They are multilinear (with $C^\infty$ functions as coefficients) symmetrical operator valued distributions on the dense domain $\mathcal D$ which satisfy the causal factorization property

(Causality)
\[
T(A_1(x_1) \dots A_n(x_n)) = T(A_1(x_1) \dots A_k(x_k))\, T(A_{k+1}(x_{k+1}) \dots A_n(x_n))
\tag{2.2}
\]
if $x_j \notin \bar V_+ + x_i$, $i = 1, \dots, k$, $j = k+1, \dots, n$, where $\bar V_+$ is the closed forward light cone in Minkowski space. Causality (2.2) and symmetry determine the time ordered products on the set of pairwise different points. Moreover, if the time ordered products of less than $n$ factors are everywhere defined, the time ordered product of $n$ factors is uniquely determined up to the total diagonal $x_1 = \dots = x_n$. Thus renormalization amounts to an extension, for every $n$, of time ordered products to the total diagonal. This extension is always possible, and it can be done such that the conditions of Poincaré covariance (w.r.t. some unitary positive energy representation $U$ of the Poincaré group $\mathcal P_+^\uparrow$)

(N1)
\[
\mathrm{Ad}\,U(L)\big(T(A_1(x_1) \dots A_n(x_n))\big) = T\big(\mathrm{Ad}\,U(L)(A_1(x_1)) \dots \mathrm{Ad}\,U(L)(A_n(x_n))\big), \qquad L \in \mathcal P_+^\uparrow,
\tag{2.3}
\]
and of unitarity hold,¹

(N2)
\[
T(A_1(x_1) \dots A_n(x_n))^+ = \sum_{P \in \mathrm{Part}\{1,\dots,n\}} (-1)^{|P|+n} \prod_{p \in P} T(A_i(x_i)^+,\ i \in p).
\tag{2.4}
\]

¹ We work throughout with the conventions of [EG], not with the ones of [S].


($^+$ means the adjoint on $\mathcal D$, $(\phi, B(f)\psi) = (B^+(\bar f)\phi, \psi)$, $\phi, \psi \in \mathcal D$.) The generating functional for the time ordered products is the S-matrix $S(L)$, $L = \sum_{i=1}^N g_i A_i$, $g_i \in \mathcal D(\mathbb R^4)$,
\[
S(L) = \sum_{n=0}^\infty \frac{i^n}{n!} \int d^4x_1 \dots d^4x_n\, T(L(x_1) \dots L(x_n)),
\tag{2.5}
\]
i.e.
\[
T(A_{i_1}(x_1) \dots A_{i_n}(x_n)) = \frac{\delta^n}{i^n\, \delta g_{i_1}(x_1) \dots \delta g_{i_n}(x_n)}\, S(L)\Big|_{g_1 = \dots = g_N = 0}.
\tag{2.6}
\]
Finally, the interacting field $A_{{\rm int}\,L}$ corresponding to the Wick polynomial $A$ of the free fields is defined by [BS,EG,DKS1]
\[
A_{{\rm int}\,L}(x) = \frac{\delta}{i\,\delta h(x)}\, S(L)^{-1} S(L + hA)\Big|_{h=0}.
\tag{2.7}
\]
By inserting (2.5) into (2.7) one obtains the perturbative expansion of the interacting fields
\[
A_{{\rm int}\,L}(x) = A(x) + \sum_{n=1}^\infty \frac{i^n}{n!} \int d^4x_1 \dots d^4x_n\, R(L(x_1) \dots L(x_n); A(x)),
\tag{2.8}
\]
with the “totally retarded products”
\[
R(A_1(x_1) \dots A_n(x_n); A(x)) \stackrel{\rm def}{=} \sum_{I \subset \{1,\dots,n\}} (-1)^{|I|}\, \bar T(A_i(x_i),\ i \in I)\, T(A_j(x_j),\ j \in I^c,\ A(x)),
\tag{2.9}
\]
where $I^c \stackrel{\rm def}{=} \{1, \dots, n\} \setminus I$ and $\bar T$ denotes the “antichronological product”. The corresponding generating functional is the inverse S-matrix,
\[
S(L)^{-1} = \sum_{n=0}^\infty \frac{(-i)^n}{n!} \int d^4x_1 \dots d^4x_n\, \bar T(L(x_1) \dots L(x_n)),
\]
and the antichronological products can be obtained from the time ordered products by the usual inversion of a formal power series, namely the r.h.s. of (2.4).

The arbitrariness in the extensions of time ordered products to coinciding points can be further restricted. Let $\phi_1, \dots, \phi_N$ be the free fields in terms of which the model is defined, which satisfy the linear field equation
\[
\sum_j D_{ij}\, \phi_j = 0
\tag{2.10}
\]
(where $D_{ij}$ is a matrix whose entries are differential operators such that $D$ is a relativistically invariant hyperbolic differential operator with a unique solution of the Cauchy problem) and with C-number commutators
\[
[\phi_j(x), \phi_k(y)] = i\Delta_{jk}(x - y), \qquad \Delta_{jk} = \Delta^{\rm ret}_{jk} - \Delta^{\rm av}_{jk}
\tag{2.11}
\]


($\Delta^{\rm ret,av}_{jk}$ retarded resp. advanced Green’s function of $D_{ij}$, i.e. $\mathrm{supp}\,\Delta^{\rm ret,av}_{jk} \subset \bar V_{+,-}$). We define to every Wick polynomial $A$ the sub Wick polynomials $\frac{\partial A}{\partial\phi_j}$ by²
\[
[A(x), \phi_k(y)] = i \sum_j \frac{\partial A}{\partial\phi_j}(x)\, \Delta_{jk}(x - y).
\tag{2.12}
\]
We then require

(N3)
\[
[T(A_1(x_1) \dots A_n(x_n)), \phi_j(x)] = i \sum_{k=1}^n \sum_l T\Big(A_1(x_1) \dots \frac{\partial A_k}{\partial\phi_l}(x_k) \dots A_n(x_n)\Big)\, \Delta_{lj}(x_k - x)
\tag{2.13}
\]
and (cf. [St2])

(N4)
\[
\sum_j D^x_{ij}\, T(A_1(x_1) \dots A_n(x_n)\phi_j(x)) = i \sum_{k=1}^n T\Big(A_1(x_1) \dots \frac{\partial A_k}{\partial\phi_i}(x_k) \dots A_n(x_n)\Big)\, \delta(x_k - x).
\tag{2.14}
\]
The first condition means that the time ordered product is determined up to a C-number by the time ordered products of sub Wick polynomials, whereas the second condition defines uniquely the time ordered products with additional free field factors once it is given away from the diagonal. It translates into the differential equation

(N4′)
\[
\sum_j D^x_{ij}\, R(A_1(x_1) \dots A_n(x_n); \phi_j(x)) = i \sum_{k=1}^n R\Big(A_1(x_1) \dots \hat k \dots A_n(x_n); \frac{\partial A_k}{\partial\phi_i}(x_k)\Big)\, \delta(x_k - x)
\tag{2.15}
\]
for the totally retarded products (2.9), where the hat means the omission of the corresponding factor. By means of (2.8) we see that the requirement (2.14) implies the field equation for the interacting field $\phi_{{\rm int}\,j,L}$,
\[
\sum_j D_{ij}\, \phi_{{\rm int}\,j,L}(x) = -\Big(\frac{\partial L}{\partial\phi_i}\Big)_{{\rm int}\,L}(x).
\tag{2.16}
\]

² If $A$ contains derivatives of free fields the definition (2.12) is replaced by
\[
[A(x), \phi_k(y)] = i \int d^4z \sum_j \frac{\delta A(x)}{\delta\phi_j(z)}\, \Delta_{jk}(z - y)
\]
and similar modifications appear in the following formulas.

The remaining arbitrariness is the freedom in the extension of the expectation values $\omega(T(A_1(x_1) \dots A_n(x_n)))$, where $\omega$ is some state (e.g. the vacuum), to the diagonal. This freedom consists in adding a distribution with support on the diagonal. Its form is restricted by covariance and by the requirement that the degree of the singularity at the


diagonal, measured in terms of Steinmann scaling degree [Ste,BF], may not be increased by the extension. The requirements (2.13) and (2.14) are purely algebraic normalization conditions for the time ordered products resp. the interacting fields. They are independent of the choice of some state and, hence, are well suited for the generalization to curved spacetimes.

For a later purpose we are going to list some properties of the totally retarded products (2.9). By means of causality (2.2) one easily finds that they have totally retarded support,
\[
\mathrm{supp}\, R(A_1(x_1) \dots A_n(x_n); A(x)) \subset \{(x_1, \dots, x_n; x) \mid x_i \in x + \bar V_-,\ \forall i = 1, \dots, n\}.
\tag{2.17}
\]
This means that the interacting fields $A_{{\rm int}\,L}(x)$ (2.7–8) depend only on the interaction in the past of $x$, i.e. solely on $L|_{x + \bar V_-}$. The following lemma describes the structure of the totally retarded products with a free field factor.

Lemma 1. Let $(\phi_j)_j$ be free fields and the normalizations fulfill (N4). Then

(A)
\[
R(A_1(x_1) \dots A_n(x_n); \phi_i(x)) = i \sum_{k=1}^n \sum_l \Delta^{\rm ret}_{il}(x - x_k)\, R\Big(A_1(x_1) \dots \hat k \dots A_n(x_n); \frac{\partial A_k}{\partial\phi_l}(x_k)\Big),
\]
(B)
\[
\begin{aligned}
R(A_1(x_1) \dots A_{n-1}(x_{n-1})\phi_j(y); A(x)) ={}& i \sum_{m=1}^{n-1} \sum_h \Delta^{\rm av}_{jh}(y - x_m)\, R\Big(A_1(x_1) \dots \frac{\partial A_m}{\partial\phi_h}(x_m) \dots A_{n-1}(x_{n-1}); A(x)\Big) \\
&+ i \sum_h \Delta^{\rm av}_{jh}(y - x)\, R\Big(A_1(x_1) \dots A_{n-1}(x_{n-1}); \frac{\partial A}{\partial\phi_h}(x)\Big).
\end{aligned}
\]
Thereby note

(C)
\[
R(A_1(x_1) \dots A_n(x_n); A(x)) = 0 \quad \text{if } n \geq 1 \text{ and some } A_j\ (1 \leq j \leq n) \text{ or } A \text{ is a C-number.}
\]

Proof. The last statement (C) is easily obtained from the definition (2.9) and, if $A$ is the C-number, by taking $\sum_{I \subset \{1,\dots,n\}} (-1)^{|I|}\, \bar T(A_i(x_i),\ i \in I)\, T(A_j(x_j),\ j \in I^c) = 0$ into account. Alternatively one can argue by means of (N1) and the translation invariance of C-number fields that the non-validity of (C) would contradict the support property (2.17).

(A) Due to
\[
D_{ij}\, \Delta^{\rm ret,av}_{jk}(x) = \delta_{ik}\, \delta(x),
\tag{2.18}
\]
the expression (A) is a solution of the hyperbolic differential equation (2.15). Moreover, it is the only solution which fulfills the support property (2.17). To prove (B) we note that (N4) implies
\[
\begin{aligned}
D^y_{ij}\, R(A_1(x_1) \dots A_{n-1}(x_{n-1})\phi_j(y); A(x)) ={}& i \sum_{m=1}^{n-1} \delta(y - x_m)\, R\Big(A_1(x_1) \dots \frac{\partial A_m}{\partial\phi_i}(x_m) \dots A_{n-1}(x_{n-1}); A(x)\Big) \\
&+ i\,\delta(y - x)\, R\Big(A_1(x_1) \dots A_{n-1}(x_{n-1}); \frac{\partial A}{\partial\phi_i}(x)\Big),
\end{aligned}
\tag{2.19}
\]
analogously to (2.15). Again there is only one solution of (2.19) which respects (2.17), namely (B). □
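As a consistency check, the lowest nontrivial order can be worked out by hand. The sketch below uses only (2.2), (2.8), (2.9) and part (A) of Lemma 1. It assumes, as is not stated explicitly in the text, that the time ordered and antichronological products of a single factor equal that factor and that the retarded product with an empty first slot is $R(\,;A(x)) = A(x)$, in line with the $n = 0$ term of (2.8).

```latex
% n = 1 in (2.9): the subsets are I = \emptyset and I = \{1\}
R(A_1(x_1); A(x)) = T(A_1(x_1) A(x)) - A_1(x_1)\, A(x).

% If x_1 \notin x + \bar V_-, causality (2.2) gives T(A_1(x_1) A(x)) = A_1(x_1) A(x),
% so R vanishes there, which is the support property (2.17) for n = 1.

% Lemma 1 (A) for n = 1:
R(A_1(x_1); \phi_i(x)) = i \sum_l \Delta^{\rm ret}_{il}(x - x_1)\,
    \frac{\partial A_1}{\partial\phi_l}(x_1).

% Inserted into (2.8), to first order in L:
\phi_{{\rm int}\,i,L}(x) = \phi_i(x)
    - \int d^4x_1 \sum_l \Delta^{\rm ret}_{il}(x - x_1)\,
      \frac{\partial L}{\partial\phi_l}(x_1) + O(L^2).

% Applying \sum_i D_{ki} and using (2.10) and (2.18):
\sum_i D_{ki}\, \phi_{{\rm int}\,i,L}(x) = -\frac{\partial L}{\partial\phi_k}(x) + O(L^2).
```

The last line is the field equation (2.16) at lowest order, since the interacting field built from $\partial L/\partial\phi_k$ agrees with the free one up to terms of higher order in $L$.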

In the next section we will compute commutators of interacting fields by means of

Proposition 2. The (anti-)commutator of two interacting fields can be written in the form
\[
[A^1_{{\rm int}\,L}(x), A^2_{{\rm int}\,L}(y)]_\mp = -\sum_{n=0}^\infty \frac{i^n}{n!} \int d^4y_1 \dots d^4y_n\, \Big\{ R(L(y_1) \dots L(y_n) A^1(x); A^2(y)) \mp R(L(y_1) \dots L(y_n) A^2(y); A^1(x)) \Big\}.
\tag{2.20}
\]
The anticommutator appears only if $A^1$ and $A^2$ have either both an odd number of spinors or both an odd number of ghost fields.³

³ We work with the convention that the Fock space of the (free) incoming fields is the tensor product of the photon, spinor and ghost Fock spaces (see (A.15)). Hence a free spinor field commutes with a free ghost field.

Proof. Due to $S(L + hA)^{-1} S(L + hA) = 1$ we can write (2.7) in an alternative way
\[
A_{{\rm int}\,L}(x) = -\frac{\delta}{i\,\delta h(x)}\, S(L + hA)^{-1} S(L)\Big|_{h=0},
\tag{2.21}
\]
and, hence, we get
\[
A^1_{{\rm int}\,L}(x)\, A^2_{{\rm int}\,L}(y) = \frac{\delta^2}{\delta h_1(x)\, \delta h_2(y)}\, S(L + h_1 A^1)^{-1} S(L + h_2 A^2)\Big|_{h_1 = 0 = h_2}.
\tag{2.22}
\]
Next we note that the first term on the r.h.s. of (2.20) is equal to
\[
-\frac{\delta^2}{i^2\, \delta h_1(x)\, \delta h_2(y)}\, S(L + h_1 A^1)^{-1} S(L + h_1 A^1 + h_2 A^2)\Big|_{h_1 = 0 = h_2}.
\tag{2.23}
\]
Therefore, the assertion (2.20) is equivalent to
\[
[A^1_{{\rm int}\,L}(x), A^2_{{\rm int}\,L}(y)] = \frac{\delta^2}{\delta h_1(x)\, \delta h_2(y)} \Big[ S(L + h_1 A^1)^{-1} S(L + h_1 A^1 + h_2 A^2) - S(L + h_2 A^2)^{-1} S(L + h_1 A^1 + h_2 A^2) \Big]\Big|_{h_1 = 0 = h_2},
\tag{2.24}
\]
which holds obviously true by (2.22). If $A^i$ is fermionic the corresponding test function $h_i$ is Grassmann valued (i.e. an anticommuting C-number, see e.g. [S], Appendix D) and, hence, the commutators possibly turn into anticommutators. □

By means of the support property (2.17) we immediately see from (2.20) that the interacting fields are local,
\[
[A^1_{{\rm int}\,L}(x), A^2_{{\rm int}\,L}(y)]_\mp = 0 \quad \text{if} \quad (x - y)^2 < 0.
\tag{2.25}
\]
Of course this can also be proven in a more direct way. Note that Proposition 2 also provides a decomposition of the commutator into a retarded (i.e. $(x - y) \in \bar V_+$) and advanced part ($(x - y) \in \bar V_-$).
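The $n = 0$ term of (2.20) gives a quick sanity check of the overall sign. The sketch below treats the bosonic case (upper sign) and assumes, as above, that a single-factor time ordered or antichronological product equals the factor itself; it uses the definition (2.9) and the symmetry of the time ordered products.

```latex
% (2.9) with one factor in the first slot:
R(A^1(x); A^2(y)) = T(A^1(x) A^2(y)) - A^1(x)\, A^2(y), \qquad
R(A^2(y); A^1(x)) = T(A^2(y) A^1(x)) - A^2(y)\, A^1(x).

% n = 0 term of (2.20), using T(A^1(x) A^2(y)) = T(A^2(y) A^1(x)):
-\big\{ R(A^1(x); A^2(y)) - R(A^2(y); A^1(x)) \big\}
  = A^1(x)\, A^2(y) - A^2(y)\, A^1(x)
  = [A^1(x), A^2(y)].
```

This is the commutator of the free fields, as it must be, because the interacting fields reduce to the free ones at zeroth order in $L$ by (2.8).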


3. Commutators of Interacting Fields in QED

In QED the interaction is given by
\[
L(x) = g(x) : \bar\psi(x)\gamma_\mu A^\mu(x)\psi(x) :, \qquad g \in \mathcal D(\mathbb R^4),
\tag{3.1}
\]
where $A^\mu$ is the free photon field and $\psi$, $\bar\psi$ are the free spinor fields. In addition, we introduce a pair $u$, $\tilde u$ of free, anticommuting ghost fields,
\[
\Box u = 0 = \Box\tilde u, \qquad \{u(x), u(y)\} = 0, \qquad \{\tilde u(x), \tilde u(y)\} = 0, \qquad \{u(x), \tilde u(y)\} = -iD(x - y),
\tag{3.2}
\]
where $D$ is the massless Pauli-Jordan distribution. In QED the ghost fields do not couple and could therefore be eliminated. We are, however, interested in a formulation of the gauge structure which can be generalized to the nonabelian case, where the ghosts seem to be indispensable [F,FP,DHKS1]. The only non-vanishing anticommutator of the spinor fields is
\[
\{\psi(x), \bar\psi(y)\} = -iS(x - y) = -i(i\gamma^\mu\partial_\mu + m)\Delta(x - y)
\tag{3.3}
\]
and $\Delta$ denotes the massive Pauli-Jordan distribution. The quantization of the photon field $A^\mu$ is done in the Feynman gauge with the commutator
\[
[A^\mu(x), A^\nu(y)] = i g^{\mu\nu} D(x - y).
\tag{3.4}
\]
The ∗-operation is introduced by
\[
A^{\mu *} := A^\mu, \qquad u^* := u, \qquad \tilde u^* := -\tilde u,
\tag{3.5}
\]
and $\psi(x)^* = \bar\psi(x)\gamma_0$. Unitarity (N2) (with respect to the ∗-operation) implies for the interacting fields
\[
A^\mu_{{\rm int}\,L}(x)^* = A^\mu_{{\rm int}\,L}(x), \qquad j^\mu_{{\rm int}\,L}(x)^* = j^\mu_{{\rm int}\,L}(x), \qquad \psi_{{\rm int}\,L}(x)^*\gamma_0 = \bar\psi_{{\rm int}\,L}(x),
\tag{3.6}
\]
where $g$ is assumed to be real-valued and $j^\mu$ is the matter current
\[
j^\mu(x) \stackrel{\rm def}{=} {:}\,\bar\psi(x)\gamma^\mu\psi(x)\,{:}.
\tag{3.7}
\]
In QED the field equations (2.16) read
\[
\Box A^\mu_{{\rm int}\,L}(x) = -g(x)\, j^\mu_{{\rm int}\,L}(x),
\tag{3.8}
\]
\[
(i\gamma_\mu\partial^\mu - m)\psi_{{\rm int}\,L}(x) = -g(x)(\gamma_\mu A^\mu\psi)_{{\rm int}\,L}(x),
\tag{3.9}
\]

and similarly for $\bar\psi_{{\rm int}\,L}$. An important restriction of the normalizations (i.e. the freedom in the extension to the total diagonal) comes from the Ward identities,

(N5)
\[
\partial^y_\mu T\big(j^\mu(y) A_1(x_1) \dots A_n(x_n)\big) = i \sum_{j=1}^n \delta(y - x_j)\, T\big(A_1(x_1) \dots (\theta A_j)(x_j) \dots A_n(x_n)\big),
\tag{3.10}
\]
where $A_j$ is a sub Wick monomial of $L$ (3.1) and $(\theta A_j) \stackrel{\rm def}{=} \frac{d}{d\alpha}\big|_{\alpha=0} A_{j\alpha}$ is the infinitesimal action of the global $U(1)$ transformation
\[
\bar\psi \to \bar\psi_\alpha \stackrel{\rm def}{=} e^{-i\alpha}\bar\psi, \qquad \psi \to \psi_\alpha \stackrel{\rm def}{=} e^{i\alpha}\psi, \qquad A^\mu \to A^\mu, \qquad u \to u, \qquad \tilde u \to \tilde u, \qquad \alpha \in \mathbb R.
\tag{3.11}
\]
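For orientation, the action of $\theta$ on the fields that enter later can be read off directly from (3.11). The short computation below is only a sketch; it assumes that $\theta$, being a derivative with respect to $\alpha$, acts on a Wick monomial by the product rule.

```latex
% Elementary fields, from (3.11):
\theta\psi = i\psi, \qquad \theta\bar\psi = -i\bar\psi, \qquad
\theta A^\mu = 0, \qquad \theta u = 0 = \theta\tilde u.

% Matter current (3.7) and interaction (3.1), by the product rule:
\theta j^\mu = {:}\,(-i\bar\psi)\gamma^\mu\psi\,{:} + {:}\,\bar\psi\gamma^\mu(i\psi)\,{:} = 0,
\qquad
\theta L = g\,\big( {:}\,(-i\bar\psi)\gamma_\mu A^\mu\psi\,{:} + {:}\,\bar\psi\gamma_\mu A^\mu(i\psi)\,{:} \big) = 0.
```

Hence every term on the right-hand side of (3.10) vanishes when all factors are $j$ or $L$, while a factor $\psi$ contributes $i\,(\theta\psi) = -\psi$ and a factor $\bar\psi$ contributes $i\,(\theta\bar\psi) = +\bar\psi$.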

We prove in Appendix B that all Ward identities (N5) can be fulfilled and are compatible with the other above mentioned normalization conditions (N1), (N2), (N3) and (N4). The subset of Ward identities involving only factors $j$ and $L$,
\[
\partial^y_\mu T\big(j^\mu(y) L(x_1) \dots L(x_k) j^{\mu_1}(x_{k+1}) \dots j^{\mu_{n-k}}(x_n)\big) = 0,
\tag{3.12}
\]
is equivalent to the free (perturbative) operator gauge invariance of QED ([S,DHKS2] and Sect. 5.1). Equation (3.10) implies the same Ward identities for the antichronological products $\bar T$ up to a global factor $(-1)$ on the r.h.s. With that one easily derives from (3.10) the Ward identities for the totally retarded products
\[
\begin{aligned}
\partial^y_\mu R\big(A_1(x_1) \dots A_{n-1}(x_{n-1}) j^\mu(y); A(x)\big) ={}& i\,\delta(y - x)\, R\big(A_1(x_1) \dots A_{n-1}(x_{n-1}); (\theta A)(x)\big) \\
&+ i \sum_{k=1}^{n-1} \delta(y - x_k)\, R\big(A_1(x_1) \dots (\theta A_k)(x_k) \dots A_{n-1}(x_{n-1}); A(x)\big)
\end{aligned}
\]
and
\[
\partial^y_\mu R\big(A_1(x_1) \dots A_n(x_n); j^\mu(y)\big) = i \sum_{k=1}^n \delta(y - x_k)\, R\big(A_1(x_1) \dots \hat k \dots A_n(x_n); (\theta A_k)(x_k)\big),
\]
where $\hat k$ means that $A_k(x_k)$ is omitted. In particular we obtain
\[
\partial^x_\mu R(L(x_1) \dots L(x_n); j^\mu(x)) = 0, \qquad \text{i.e.} \qquad \partial_\mu j^\mu_{{\rm int}\,L}(x) = 0.
\tag{3.13}
\]
Hence $j^\mu_{{\rm int}\,L}$ is a conserved current. The Ward identities (N5) also imply that the corresponding charge operator implements the infinitesimal $U(1)$-action $\theta$ on the interacting fields, i.e.
\[
[j^0_{{\rm int}\,L}(f), A_{{\rm int}\,L}(x)] = i(\theta A)_{{\rm int}\,L}(x),
\]
where $A$ is a sub Wick monomial of $L$ (3.1), and for the test function $f \in \mathcal D(\mathbb R^4)$ we assume that there exists $h \in \mathcal D(\mathbb R)$ such that $f(y) = h(y_0)$ $\forall y = (y_0, \mathbf y)$ in a neighbourhood of $x + (\bar V_+ \cup \bar V_-)$ and
\[
\int dy_0\, h(y_0) = 1.
\]


To prove this we first note $[j^0_{{\rm int}\,L}(f), A_{{\rm int}\,L}(x)] = \int d^4y\, h(y_0)\, [j^0_{{\rm int}\,L}(y), A_{{\rm int}\,L}(x)]$ by the support property (2.25) of the commutator, and by means of Proposition 2 this is equal to
\[
\begin{aligned}
-\sum_{n=0}^\infty \frac{i^n}{n!} \int dy_1 \dots dy_n\, g(y_1) \dots g(y_n) \int dy\, \Big\{ & \big( [h(y_0) - h(y_0 - a)] + h(y_0 - a) \big)\, R(L(y_1) \dots L(y_n) j^0(y); A(x)) \\
& - \big( [h(y_0) - h(y_0 - b)] + h(y_0 - b) \big)\, R(L(y_1) \dots L(y_n) A(x); j^0(y)) \Big\}.
\end{aligned}
\]
Due to the support property (2.17) of the R-products we can choose $a$ and $b$ such that the contributions from $h(y_0 - a)$ and $h(y_0 - b)$ vanish. Setting $k(y) \equiv k_0(y_0) \stackrel{\rm def}{=} \int_{-\infty}^{y_0} dz\, [h(z) - h(z - a)]$, the $[h(y_0) - h(y_0 - a)]$-term is equal to
\[
\int \cdots\, dy\, k(y)\, \partial^y_\mu R(L(y_1) \dots L(y_n) j^\mu(y); A(x)) = i k(x)\, (\theta A)_{{\rm int}\,L}(x),
\]
where we have inserted a Ward identity. For the $[h(y_0) - h(y_0 - b)]$-term we obtain $i(1 - k(x))(\theta A)_{{\rm int}\,L}(x)$ by a similar procedure. So the sum of all terms is in fact $i(\theta A)_{{\rm int}\,L}(x)$.

By means of the Ward identity (3.13) and Lemma 1 (A) we obtain
\[
\partial_\mu A^\mu_{{\rm int}\,L}(x) = \partial_\mu A^\mu(x) + \sum_{n=1}^\infty \frac{i^{n+1}}{n!} \int d^4x_1 \dots d^4x_n \sum_{l=1}^n g(x_1) \dots \partial_\mu g(x_l) \dots g(x_n)\, D^{\rm ret}(x - x_l)\, R(L_0(x_1) \dots \hat l \dots L_0(x_n); j^\mu(x_l)),
\tag{3.14}
\]
where
\[
L_0(x) \stackrel{\rm def}{=} {:}\,\bar\psi(x)\gamma_\mu A^\mu(x)\psi(x)\,{:}, \qquad \text{i.e.} \qquad L(x) = g(x) L_0(x).
\tag{3.15}
\]
Thus in the formal partial adiabatic limit $g|_{x + \bar V_-} \to {\rm const.}$ the interacting field $\partial_\mu A^\mu_{{\rm int}\,L}(x)$ agrees with the corresponding free one. Let
\[
\mathcal O_{x,y} \stackrel{\rm def}{=} (x + V_+) \cap (y + V_-).
\tag{3.16}
\]
We will study the algebra $\tilde{\mathcal F}(\mathcal O)$ generated by
\[
\Big\{ A_{{\rm int}\,L}(f) = \int d^4x\, A_{{\rm int}\,L}(x) f(x) \ \Big|\ f \in \mathcal D(\mathcal O),\ A = A^\mu, \psi, \bar\psi, j^\mu \dots \Big\},
\tag{3.17}
\]
where $\mathcal O$ is the double cone
\[
\mathcal O \stackrel{\rm def}{=} \mathcal O_{(-r,0),(r,0)}, \qquad (r > 0)
\tag{3.18}
\]
which is centered at the origin, and the test function $g \in \mathcal D(\mathbb R^4)$ in (3.1), (3.15) is assumed to fulfill
\[
g(x) = e = {\rm const.}, \qquad \forall x \in \mathcal O.
\tag{3.19}
\]
We are now going to compute some commutators of $\partial_\mu A^\mu_{{\rm int}\,L}(x)$ with elements of $\tilde{\mathcal F}(\mathcal O)$.


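Proposition 3. The following relations hold for the interacting fields at points $x, y \in \mathcal O$.
\[
\begin{aligned}
&(1) \quad [\partial_\mu A^\mu_{{\rm int}\,L}(x), A^\nu_{{\rm int}\,L}(y)] = i\partial^\nu D(x - y), \\
&(2) \quad [\partial_\mu A^\mu_{{\rm int}\,L}(x), \partial_\nu A^\nu_{{\rm int}\,L}(y)] = 0, \\
&(3) \quad [\partial_\mu A^\mu_{{\rm int}\,L}(x), \psi_{{\rm int}\,L}(y)] = D(x - y)\, e\, \psi_{{\rm int}\,L}(y), \\
&(4) \quad [\partial_\mu A^\mu_{{\rm int}\,L}(x), \bar\psi_{{\rm int}\,L}(y)] = -D(x - y)\, e\, \bar\psi_{{\rm int}\,L}(y), \\
&(5) \quad [\partial_\mu A^\mu_{{\rm int}\,L}(x), L_{0\,{\rm int}\,L}(y)] = i(\partial_\mu D)(x - y)\, j^\mu_{{\rm int}\,L}(y).
\end{aligned}
\]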

Proof. Since $\partial_\mu A^\mu_{{\rm int}\,L}$ is a solution of the wave equation for $x \in \mathcal O$, it is sufficient to check these relations for the advanced or retarded part of the commutator.

(1) and (2). Inserting Lemma 1 (B) and (C) into Proposition 2 we obtain
\[
[A^\mu_{{\rm int}\,L}(x), A^\nu_{{\rm int}\,L}(y)]_{\rm av} = -i g^{\mu\nu} D^{\rm av}(x - y) - \sum_{n=1}^\infty \frac{i^n}{n!} \int d^4z_1 \dots d^4z_n\, g(z_1) \dots g(z_n)\; i \sum_{m=1}^n D^{\rm av}(x - z_m)\, R(L_0(z_1) \dots j^\mu(z_m) \dots L_0(z_n); A^\nu(y)).
\tag{3.20}
\]
From the support properties of the retarded products (2.17) we see that the integration over $z_m$ is confined to the double cone
\[
z_m \in \mathcal O_{x,y} \subset \mathcal O.
\tag{3.21}
\]
Let us now consider $[\partial_\mu A^\mu_{{\rm int}\,L}(x), A^\nu_{{\rm int}\,L}(y)]$. We want to show that the divergence $\partial^x_\mu$ of all terms in (3.20) with index $n \geq 1$ vanishes. In fact, the divergence $\partial^x_\mu D^{\rm av}(x - z_m)$ can be written as $-\partial^{z_m}_\mu D^{\rm av}(x - z_m)$. So formally, after a partial integration in $z_m$ we get two terms: (i) a divergence of $R(L_0 \dots j \dots L_0; A)$ with respect to a $j$-vertex, which vanishes due to the Ward identities (N5); (ii) a term $\sim \partial_\mu g(z_m)$ which vanishes since $g$ is constant within $\mathcal O$ (3.19, 3.21). The argument can easily be made rigorous by smearing with test functions. Hence
\[
[\partial_\mu A^\mu_{{\rm int}\,L}(x), A^\nu_{{\rm int}\,L}(y)]_{\rm av} = -i\partial^\nu D^{\rm av}(x - y), \qquad \forall x, y \in \mathcal O,
\tag{3.22}
\]
which implies (1). Formula (2) follows since $D$ is a solution of the wave equation.

(3). To prove (3) we proceed analogously to (3.20) and obtain
\[
[\partial_\mu A^\mu_{{\rm int}\,L}(x), \psi_{{\rm int}\,L}(y)]_{\rm av} = -\sum_{n=1}^\infty \frac{i^n}{n!} \int d^4z_1 \dots d^4z_n\, g(z_1) \dots g(z_n)\; i \sum_{m=1}^n R(L_0(z_1) \dots j^\mu(z_m) \dots L_0(z_n); \psi(y))\, \partial^x_\mu D^{\rm av}(x - z_m).
\tag{3.23}
\]
We integrate by parts with respect to $z_m$ and insert the Ward identity
\[
\partial^{z_m}_\mu R\big(L_0(z_1) \dots j^\mu(z_m) \dots L_0(z_n); \psi(y)\big) = -\delta(z_m - y)\, R\big(L_0(z_1) \dots \hat m \dots L_0(z_n); \psi(y)\big).
\tag{3.24}
\]


Because of $z_m \in \mathcal O$ (3.21) the terms $\sim \partial g(z_m)$ vanish and we end up with
\[
[\partial_\mu A^\mu_{{\rm int}\,L}(x), \psi_{{\rm int}\,L}(y)]_{\rm av} = -D^{\rm av}(x - y)\, g(y)\, \psi_{{\rm int}\,L}(y) \qquad \text{for } x, y \in \mathcal O.
\tag{3.25}
\]
This proves (3).

(4). The relation for the conjugate spinor follows by applying the ∗-operation.

(5). By means of Proposition 2 we get
\[
[\partial_\mu A^\mu_{{\rm int}\,L}(x), L_{0\,{\rm int}\,L}(y)]_{\rm ret} = \sum_{n=0}^\infty \frac{i^n}{n!} \int d^4z_1 \dots d^4z_n\, g(z_1) \dots g(z_n)\, \partial^x_\mu R(L_0(z_1) \dots L_0(z_n) L_0(y); A^\mu(x)).
\tag{3.26}
\]
From part (A) of Lemma 1 this is equal to
\[
\sum_{n=0}^\infty \frac{i^{n+1}}{n!} \int dz_1 \dots dz_n\, g(z_1) \dots g(z_n) \Big\{ \partial_\mu D^{\rm ret}(x - y)\, R(L_0(z_1) \dots L_0(z_n); j^\mu(y)) + \sum_{m=1}^n \partial_\mu D^{\rm ret}(x - z_m)\, R(L_0(z_1) \dots \hat m \dots L_0(z_n) L_0(y); j^\mu(z_m)) \Big\}.
\tag{3.27}
\]
By a partial integration with respect to $z_m$ and the same reasoning as above, we see that the second term does not contribute, hence we find
\[
[\partial_\mu A^\mu_{{\rm int}\,L}(x), L_{0\,{\rm int}\,L}(y)]_{\rm ret} = i(\partial_\mu D^{\rm ret})(x - y)\, j^\mu_{{\rm int}\,L}(y),
\tag{3.28}
\]
which finally implies (5). □

4. Connection of Observable Algebras and Field Algebras in Perturbative Gauge Theories

The observation (2.1) applies to the algebra $\tilde{\mathcal F}(\mathcal O)$ (3.17) with a unitary formal power series $V$. We conclude that the net of algebras of local interacting fields $\tilde{\mathcal F}(\mathcal O_1)$, $\mathcal O_1 \subset \mathcal O$, up to unitary equivalence is uniquely determined by $g|_{\mathcal O}$. Since $\mathcal O$ is arbitrary, the full net of local algebras can be constructed without performing the adiabatic limit $g \to$ constant. This general procedure can be applied also to gauge theories. But there the (local) algebras of interacting fields contain unphysical fields like vector potentials and ghosts. Therefore the question arises whether there is a local construction of the algebra of observables and the physical Hilbert space, i.e. a (pre)Hilbert space on which the observables can be faithfully represented.

4.1. Local construction of observables in gauge theories. Let $\mathcal O \to \mathcal F(\mathcal O)$ be a net of $\mathbb Z_2$-graded ∗-algebras. In addition, we are given a graded derivation $s$ on $\cup_{\mathcal O} \mathcal F(\mathcal O) =: \mathcal F$ with $s^2 = 0$, $s(\mathcal F(\mathcal O)) \subset \mathcal F(\mathcal O)$ and $s(F^*) = -(-1)^{\delta(F)} s(F)^*$, where $(-1)^{\delta(F)}$ is the $\mathbb Z_2$-gradation of $F \in \mathcal F$. The observables should be $s$-invariant. Therefore, we consider the kernel of $s$, $\mathcal A_0 := s^{-1}(0) \subset \mathcal F$, and $\mathcal A_0(\mathcal O) := \mathcal A_0 \cap \mathcal F(\mathcal O)$. $\mathcal A_0$ is a ∗-algebra:
\[
A, B \in \mathcal A_0 \implies s(AB) = s(A)B + (-1)^{\delta(A)} A s(B) = 0, \qquad \text{i.e. } AB \in \mathcal A_0.
\tag{4.1}
\]


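We set $\mathcal A_{00} := s(\mathcal F)$. Due to $s^2 = 0$ it is a subspace of $\mathcal A_0$; it is even a 2-sided ideal in $\mathcal A_0$,

$$s(F)A = s(FA) - (-1)^{\delta(F)} F\,s(A) = s(FA) \in \mathcal A_{00}, \tag{4.2}$$

and similarly $A\,s(F) \in \mathcal A_{00}$ if $A \in \mathcal A_0$. The algebra of observables is defined as the quotient

$$\mathcal A := \frac{\mathcal A_0}{\mathcal A_{00}} \tag{4.3}$$

and

$$\mathcal O \to \mathcal A(\mathcal O) := \frac{\mathcal A_0 \cap \mathcal F(\mathcal O)}{\mathcal A_{00} \cap \mathcal F(\mathcal O)} \tag{4.4}$$

is the net of algebras of local observables.

4.2. Representation of the observables in the physical pre Hilbert space. We now ask: under which conditions does $\mathcal A$ have a nontrivial representation by operators on a pre Hilbert space, such that

$$(A^*\phi, \psi) = (\phi, A\psi), \quad \forall A \in \mathcal A? \tag{4.5}$$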

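For this purpose we assume that $\mathcal F$ has a faithful representation on an inner product space $(\mathcal K, \langle\cdot,\cdot\rangle)$ such that

$$\langle F^*\phi, \psi\rangle = \langle\phi, F\psi\rangle, \quad \forall F \in \mathcal F,$$

and $s$ is implemented by an operator $Q$ on $\mathcal K$, i.e.

$$s(F) = QF - (-1)^{\delta(F)} FQ, \tag{4.6}$$

such that

$$\langle Q\phi, \psi\rangle = \langle\phi, Q\psi\rangle \quad\text{and}\quad Q^2 = 0. \tag{4.7}$$

The assumptions (4.7) are made in order to fulfill $s(F^*) = -(-1)^{\delta(F)} s(F)^*$ and $s^2 = 0$. Note that if the inner product on $\mathcal K$ is positive definite, we find $\langle Q\phi, Q\phi\rangle = \langle\phi, Q^2\phi\rangle = 0$, hence $Q = 0$ and thus also $s = 0$. Hence for nontrivial $s$ the inner product must necessarily be indefinite.

Since the physical states should be $s$-invariant, we consider the kernel of $Q$: $\mathcal K_0 := \mathrm{Ke}\,Q$. Let $\mathcal K_{00}$ be the range of $Q$. Because of $Q^2 = 0$ we have $\mathcal K_{00} \subset \mathcal K_0$. We assume (Positivity):

$$\text{(i)}\ \ \langle\phi,\phi\rangle \ge 0 \quad \forall\phi \in \mathcal K_0, \qquad\text{and}\qquad \text{(ii)}\ \ \phi \in \mathcal K_0 \,\wedge\, \langle\phi,\phi\rangle = 0 \implies \phi \in \mathcal K_{00}. \tag{4.8}$$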
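The assumptions (4.7) and (4.8) can be made concrete in a three-dimensional toy model. The sketch below is purely illustrative: the matrices `J` and `Q`, the choice $\mathcal K = \mathbb C^3$ and all helper names are assumptions introduced here and are not part of the construction in the text.

```python
# Toy model (an assumption for illustration, not taken from the paper):
# K = C^3 with indefinite inner product <a, b> = sum_i conj(a_i) (J b)_i.

def mat_vec(M, v):
    """Apply the matrix M (given as a list of rows) to the vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def inner(a, b, J):
    """Indefinite inner product <a, b> = a^dagger J b."""
    Jb = mat_vec(J, b)
    return sum(complex(x).conjugate() * y for x, y in zip(a, Jb))

# Hypothetical metric: e1 and e2 are null vectors, e3 has norm 1.
J = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]

# Hypothetical "charge": Q e2 = e1 and Q e1 = Q e3 = 0, hence Q^2 = 0.
Q = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]

BASIS = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def is_nilpotent(Q):
    """Check Q^2 = 0 on the basis vectors."""
    return all(mat_vec(Q, mat_vec(Q, e)) == [0, 0, 0] for e in BASIS)

def is_symmetric(Q, J):
    """Check <Q a, b> = <a, Q b> on all pairs of basis vectors, cf. (4.7)."""
    return all(inner(mat_vec(Q, a), b, J) == inner(a, mat_vec(Q, b), J)
               for a in BASIS for b in BASIS)

def norm_on_kernel(alpha, gamma):
    """<phi, phi> for phi = alpha*e1 + gamma*e3, a general element of Ke Q."""
    phi = [alpha, 0, gamma]
    return inner(phi, phi, J).real

if __name__ == "__main__":
    print(is_nilpotent(Q), is_symmetric(Q, J))
    print(norm_on_kernel(2, 0))                    # a vector in Ra Q
    print(norm_on_kernel(2, 3))                    # a vector in Ke Q outside Ra Q
    print(inner([1, -1, 0], [1, -1, 0], J).real)   # a vector outside Ke Q
```

In this toy model $\mathrm{Ke}\,Q$ is spanned by $e_1, e_3$ and $\mathrm{Ra}\,Q$ by $e_1$, so $\langle\phi,\phi\rangle = |\gamma|^2 \ge 0$ on the kernel and vanishes exactly on the range, as (4.8) demands, while the inner product on the full space takes negative values.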

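Then

$$\mathcal H := \frac{\mathcal K_0}{\mathcal K_{00}}, \qquad \langle[\phi_1],[\phi_2]\rangle_{\mathcal H} := \langle\psi_1,\psi_2\rangle_{\mathcal K}, \quad \psi_j \in [\phi_j] := \phi_j + \mathcal K_{00} \tag{4.9}$$

is a pre Hilbert space. (Due to (4.7) the definition of $\langle[\phi_1],[\phi_2]\rangle_{\mathcal H}$ is independent of the choice of the representatives $\psi_j \in [\phi_j]$, $j = 1, 2$.) Now we verify that

$$\pi([A])[\phi] := [A\phi] \tag{4.10}$$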


M. Dütsch, K. Fredenhagen

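is a well defined representation on $\mathcal H$ (where $A \in \mathcal A_0$, $\phi \in \mathcal K_0$, $[A] := A + \mathcal A_{00}$). Namely, let $A + s(B)$, $A \in \mathcal A_0$, $B \in \mathcal F$, be a representative of $[A] \in \mathcal A$ in $\mathcal F$, and let $\phi + Q\psi$, $\phi \in \mathcal K_0$, $\psi \in \mathcal K$, be a representative of $[\phi] \in \mathcal H$ in $\mathcal K$. We have to show that $A\phi \in \mathcal K_0$ and $(A + s(B))(\phi + Q\psi) - A\phi \in \mathcal K_{00} = Q\mathcal K$. But

$$QA\phi = s(A)\phi + (-1)^{\delta(A)} AQ\phi = 0,$$

and

$$s(B)\phi + (A + s(B))Q\psi = (QB - (-1)^{\delta(B)} BQ)(\phi + Q\psi) - (-1)^{\delta(A)}(s(A)\psi - QA\psi) = QB(\phi + Q\psi) + (-1)^{\delta(A)} QA\psi \in Q\mathcal K.$$

4.3. Stability under deformations. It is gratifying that the described structure is stable under deformations, e.g. by turning on the interaction. Let $\mathcal K$ be fixed and replace $F \in \mathcal F$ by a formal power series $\tilde F = \sum_n g^n F_n$ with $F_0 = F$ and $F_n \in \mathcal F$, $\delta(F_n) = \text{const}$. In the same way replace $s$ and $Q$ by the formal power series $\tilde s = \sum_n g^n s_n$ (each $s_n$ is a graded derivation), $\tilde Q = \sum_n g^n Q_n$, $Q_n \in L(\mathcal K)$, with $s_0 = s$, $Q_0 = Q$ and

$$\tilde s^2 = 0, \quad \tilde Q^2 = 0, \quad \langle\tilde Q\phi, \psi\rangle = \langle\phi, \tilde Q\psi\rangle \quad\text{and}\quad \tilde s(\tilde F) = \tilde Q\tilde F - (-1)^{\delta(\tilde F)} \tilde F\tilde Q. \tag{4.11}$$

We can then define $\tilde{\mathcal A} := \mathrm{Ke}\,\tilde s / \mathrm{Ra}\,\tilde s$. $\mathcal K_0$ and $\mathcal K_{00}$ have to be replaced by the formal power series $\tilde{\mathcal K}_0 := \mathrm{Ke}\,\tilde Q$ and $\tilde{\mathcal K}_{00} := \mathrm{Ra}\,\tilde Q$ with coefficients in $\mathcal K$. Due to the above result, the algebra $\tilde{\mathcal A}$ has a natural representation on $\tilde{\mathcal H} := \tilde{\mathcal K}_0 / \tilde{\mathcal K}_{00}$. The inner product on $\mathcal K$ induces an inner product on $\tilde{\mathcal H}$ which assumes values in the formal power series over $\mathbb C$. We adopt the point of view that a formal power series $\tilde b = \sum_n g^n b_n$, $b_n \in \mathbb C$, is positive if there is another formal power series $\tilde c = \sum_n g^n c_n$, $c_n \in \mathbb C$, with $\tilde c^*\tilde c = \tilde b$, i.e. $b_n = \sum_{k=0}^n \bar c_k c_{n-k}$. This is equivalent to the condition$^4$

$$b_n \in \mathbb R, \quad \forall n \in \mathbb N_0, \tag{4.12}$$

and

$$\exists k \in \mathbb N_0 \cup \{\infty\} \ \text{such that}\ b_l = 0 \ \ \forall l < 2k \quad\text{and}\quad b_{2k} > 0 \ \text{in the case } k < \infty. \tag{4.13}$$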
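The criterion (4.12)-(4.13) is easy to test on truncated series. The following is a minimal sketch under stated assumptions: a series is stored as a list of real coefficients `b[0], ..., b[N]`, the function names are hypothetical, and only a real square root $\tilde c$ is constructed, which is a special case of the complex $\tilde c$ allowed in the text.

```python
import math

def leading_order(b, tol=1e-12):
    """Index of the first nonvanishing coefficient, or None if all vanish."""
    for n, bn in enumerate(b):
        if abs(bn) > tol:
            return n
    return None

def is_positive(b, tol=1e-12):
    """Check (4.12)-(4.13) for real coefficients, as far as the truncation allows."""
    n0 = leading_order(b, tol)
    if n0 is None:
        return True  # corresponds to k = infinity within the known orders
    return n0 % 2 == 0 and b[n0] > 0

def real_square_root(b, tol=1e-12):
    """Return real coefficients c with (c*c)_n = b_n for all known orders n.

    Writes b = g^(2k) b' with b'_0 > 0 and solves c'^2 = b' order by order;
    the result c = g^k c' is known up to order len(b) - 1 - k.
    """
    n0 = leading_order(b, tol)
    if n0 is None:
        return [0.0] * len(b)
    if n0 % 2 or b[n0] <= 0:
        raise ValueError("series is not positive in the sense of (4.12)-(4.13)")
    k = n0 // 2
    bp = b[n0:]                      # coefficients of b' = g^(-2k) b
    cp = [math.sqrt(bp[0])]
    for n in range(1, len(bp)):
        cross = sum(cp[j] * cp[n - j] for j in range(1, n))
        cp.append((bp[n] - cross) / (2 * cp[0]))
    return [0.0] * k + cp

def cauchy_square(c, order):
    """Coefficients of c*c up to the given order (c real, so c^* = c)."""
    return [sum(c[j] * c[n - j] for j in range(n + 1)
                if j < len(c) and n - j < len(c))
            for n in range(order + 1)]

if __name__ == "__main__":
    b = [0, 0, 4, 4, 1]              # b = 4 g^2 + 4 g^3 + g^4
    c = real_square_root(b)
    print(is_positive(b), c, cauchy_square(c, 4))
```

The recursion divides by $2c'_0$, which is why a strictly positive leading coefficient at an even order is needed; a series whose first nonvanishing coefficient sits at an odd order is rejected.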

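We now show that the assumptions concerning the positivity of the inner product are automatically fulfilled for the deformed theory, if they hold true in the undeformed model.

Theorem 4. Let the positivity assumption (4.8) be fulfilled in zeroth order. Then

(i) $\langle\tilde\phi, \tilde\phi\rangle \ge 0 \quad \forall\tilde\phi \in \tilde{\mathcal K}_0$,
(ii) $\tilde\phi \in \tilde{\mathcal K}_0 \,\wedge\, \langle\tilde\phi, \tilde\phi\rangle = 0 \implies \tilde\phi \in \tilde{\mathcal K}_{00}$.
(iii) For every $\phi \in \mathcal K_0$ there exists a power series $\tilde\phi \in \tilde{\mathcal K}_0$ with $(\tilde\phi)_0 = \phi$.
(iv) Let $\pi$ and $\tilde\pi$ be the representations (4.10) of $\mathcal A$, $\tilde{\mathcal A}$ on $\mathcal H$, $\tilde{\mathcal H}$ respectively. Then $\tilde\pi(\tilde A) \ne 0$ if $\pi(A_0) \ne 0$.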

4 Bordemann and Waldmann [BW] work with a weaker definition of positivity in the case of a formal Laurent series with real coefficients: they only require that the smallest non-vanishing coefficient is positive, it does not need to be an even coefficient.


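Proof of Theorem 4, (i) and (ii). Let $\tilde\phi \in \tilde{\mathcal K}_0$ and $b_n = \sum_{k=0}^n \langle\phi_k, \phi_{n-k}\rangle$. $b_n$ clearly is real. $\tilde Q\tilde\phi = 0$ implies $Q_0\phi_0 = 0$, hence $\phi_0 \in \mathcal K_0$ and $b_0 \ge 0$. If $b_0 > 0$, (i) follows. If $b_0 = 0$ we know that there is some $\psi_0 \in \mathcal K$ with $\phi_0 = Q_0\psi_0$. Let $\psi_k^{(0)} := \psi_0\,\delta_{k,0}$ and $\tilde\psi^{(0)} := \sum_k g^k \psi_k^{(0)} = \psi_0$. Then

$$\tilde\eta^{(0)} := \tilde\phi - \tilde Q\tilde\psi^{(0)} \tag{4.14}$$

is a formal power series with vanishing term of zeroth order. We now proceed by induction and assume that $b_0 = b_1 = \cdots = b_{2n} = 0$ and there is some formal power series $\tilde\psi^{(n)} = \sum_k g^k \psi_k^{(n)}$ with coefficients in $\mathcal K$ such that

$$\tilde\eta^{(n)} := \sum_k g^k \eta_k^{(n)} = \tilde\phi - \tilde Q\tilde\psi^{(n)} \tag{4.15}$$

vanishes up to order $n$. Then

$$b_{2n+1} = \langle\tilde\eta^{(n)} + \tilde Q\tilde\psi^{(n)}, \tilde\eta^{(n)} + \tilde Q\tilde\psi^{(n)}\rangle_{2n+1} = \langle\tilde\eta^{(n)}, \tilde\eta^{(n)}\rangle_{2n+1} = 0 \tag{4.16}$$

and $b_{2n+2} = \langle\eta^{(n)}_{n+1}, \eta^{(n)}_{n+1}\rangle$. Since $\tilde Q\tilde\eta^{(n)} = 0$ we get $Q_0\eta^{(n)}_{n+1} = 0$, i.e. $\eta^{(n)}_{n+1} \in \mathcal K_0$ and $b_{2n+2} \ge 0$. If $b_{2n+2} > 0$ we obtain (i), otherwise $\exists\,\psi_{n+1} \in \mathcal K$ with $\eta^{(n)}_{n+1} = Q_0\psi_{n+1}$, and we can define

$$\psi_k^{(n+1)} := \psi_k^{(n)} + \delta_{n+1,k}\,\psi_{n+1}. \tag{4.17}$$

One easily verifies

$$(\tilde\phi - \tilde Q\tilde\psi^{(n+1)})_k = 0, \quad \forall k = 0, 1, \ldots, n+1. \tag{4.18}$$

Either the induction stops at some $n$ or we find a $\tilde\psi$ with $\tilde\phi = \tilde Q\tilde\psi$, i.e. $\tilde\phi \in \tilde{\mathcal K}_{00}$.

To prove (iii) we again proceed by induction and assume that there exists a power series $\tilde\phi^{(n)}$ such that $\tilde Q\tilde\phi^{(n)}$ vanishes up to order $n$. This is certainly true for $n = 0$. Then $0 = (\tilde Q^2\tilde\phi^{(n)})_{n+1} = Q_0(\tilde Q\tilde\phi^{(n)})_{n+1}$, hence $(\tilde Q\tilde\phi^{(n)})_{n+1} \in \mathcal K_0$. Moreover $0 = \langle\tilde Q\tilde\phi^{(n)}, \tilde Q\tilde\phi^{(n)}\rangle_{2n+2} = \langle(\tilde Q\tilde\phi^{(n)})_{n+1}, (\tilde Q\tilde\phi^{(n)})_{n+1}\rangle$, thus $(\tilde Q\tilde\phi^{(n)})_{n+1} \in \mathcal K_{00}$ and there exists a $\phi_{n+1} \in \mathcal K$ with $(\tilde Q\tilde\phi^{(n)})_{n+1} + Q_0\phi_{n+1} = 0$. We then set $(\tilde\phi^{(n+1)})_k := (\tilde\phi^{(n)})_k + \delta_{n+1,k}\,\phi_{n+1}$ and find that $\tilde Q\tilde\phi^{(n+1)}$ vanishes up to order $n+1$;

$$\tilde\phi := \lim_{n\to\infty} \tilde\phi^{(n)} \in \tilde{\mathcal K}_0 \tag{4.19}$$

is then the desired formal power series.

It remains to prove (iv). $\tilde\pi(\tilde A) = 0$ means $\tilde A = \sum_k g^k A_k \in \mathrm{Ke}\,\tilde s$ with $\tilde A\tilde\phi \in \tilde{\mathcal K}_{00}$, $\forall\tilde\phi \in \tilde{\mathcal K}_0$. By means of (iii) this implies in zeroth order $A_0\phi_0 \in \mathcal K_{00}$, $\forall\phi_0 \in \mathcal K_0$, i.e. $\pi(A_0) = 0$.

Note that $\phi \to \tilde\phi$ is non-unique and this holds true also for the induced relation between $\mathcal H$ and $\tilde{\mathcal H}$.

The unit $\tilde 1$ in an algebra of formal power series is $\tilde 1 = (1, 0, 0, \ldots) = 1 g^0$, and $\tilde a = \sum_{k=0}^\infty a_k g^k$ is invertible iff $a_0$ is invertible. We denote by $\tilde{\mathbb C}$ the formal power series over $\mathbb C$ and consider $\tilde{\mathcal K} := \{\tilde\phi = \sum_n \phi_n g^n \mid \phi_n \in \mathcal K\}$ and $\tilde{\mathcal F} := \{\tilde F = \sum_n F_n g^n \mid F_n \in \mathcal F\}$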


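as $\tilde{\mathbb C}$-modules. This is possible because the usual multiplication of power series yields maps

$$\tilde{\mathbb C} \times \tilde{\mathcal F} \to \tilde{\mathcal F} : (\tilde a, \tilde A) \to \tilde a\tilde A = \tilde A\tilde a, \qquad \tilde{\mathbb C} \times \tilde{\mathcal K} \to \tilde{\mathcal K} : (\tilde a, \tilde\phi) \to \tilde a\tilde\phi = \tilde\phi\tilde a,$$

which fulfill the relations

$$\tilde A(\tilde a\tilde\phi) = \tilde a(\tilde A\tilde\phi) = (\tilde a\tilde A)\tilde\phi, \qquad \langle\tilde a\tilde\phi, \tilde b\tilde\psi\rangle = \tilde a^*\tilde b\,\langle\tilde\phi, \tilde\psi\rangle \tag{4.20}$$

and

$$(\tilde a\tilde A)^* = \tilde a^*\tilde A^*, \qquad \tilde s(\tilde a\tilde A) = \tilde a\,\tilde s(\tilde A). \tag{4.21}$$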

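Also the physical pre-Hilbert space $\tilde{\mathcal H}$ and the algebra of observables $\tilde{\mathcal A}(\mathcal O)$ are $\tilde{\mathbb C}$-modules, and the multiplications by a "scalar"

$$\tilde{\mathbb C} \times \tilde{\mathcal A}(\mathcal O) \to \tilde{\mathcal A}(\mathcal O) : (\tilde a, [\tilde A]) \to \tilde a[\tilde A] = [\tilde a\tilde A] = [\tilde A]\tilde a, \tag{4.22}$$

$$\tilde{\mathbb C} \times \tilde{\mathcal H} \to \tilde{\mathcal H} : (\tilde a, [\tilde\phi]) \to \tilde a[\tilde\phi] = [\tilde a\tilde\phi] = [\tilde\phi]\tilde a \tag{4.23}$$

satisfy (4.20). We are now going to prove that every $[\tilde\phi] \in \tilde{\mathcal H}$ can be normalized:

Corollary 5. For every $[\tilde\phi] \in \tilde{\mathcal H}$, $[\tilde\phi] \ne 0$, there exist $[\tilde\psi] \in \tilde{\mathcal H}$ and $\tilde a \in \tilde{\mathbb C}$ such that

$$[\tilde\phi] = \tilde a[\tilde\psi] \quad\text{and}\quad \langle[\tilde\psi], [\tilde\psi]\rangle = 1. \tag{4.24}$$

Proof. We set $\tilde b := \langle[\tilde\phi], [\tilde\phi]\rangle \in \tilde{\mathbb C}$. From Theorem 4 we know $\tilde b = \sum_{n=2k}^\infty b_n g^n$, $b_n \in \mathbb R$, $b_{2k} > 0$.

Case (1), $k = 0$. There exists an invertible $\tilde a \in \tilde{\mathbb C}$ with $\tilde a^*\tilde a = \tilde b$. Then $[\tilde\psi] := \tilde a^{-1}[\tilde\phi]$ satisfies the assertion (4.24).

Case (2), $k > 0$. We consider a representative $\tilde\phi = \sum_n \phi_n g^n$ of $[\tilde\phi]$. Due to $\langle\phi_0, \phi_0\rangle = b_0 = 0$ and $Q_0\phi_0 = 0$, there exists $\eta_0 \in \mathcal K$ with $Q_0\eta_0 = \phi_0$. Then we can define $\tilde\tau_1$ by $g\tilde\tau_1 := \tilde\phi - \tilde Q\eta_0$, which fulfills $\tilde\tau_1 \in \tilde{\mathcal K}_0$ and $[\tilde\phi] = g[\tilde\tau_1]$. Hence $\langle[\tilde\tau_1], [\tilde\tau_1]\rangle = g^{-2}\tilde b$. If $k > 1$ we repeat this procedure (starting with $[\tilde\tau_1]$ instead of $[\tilde\phi]$) until we obtain a $\tilde\tau_k \in \tilde{\mathcal K}_0$ with $[\tilde\phi] = g^k[\tilde\tau_k]$ and hence $\langle[\tilde\tau_k], [\tilde\tau_k]\rangle = g^{-2k}\tilde b$. Similarly to case (1) we conclude that there exists an invertible $\tilde c \in \tilde{\mathbb C}$ with $\tilde c^*\tilde c = g^{-2k}\tilde b$. Then $[\tilde\psi] := \tilde c^{-1}[\tilde\tau_k]$ satisfies $\langle[\tilde\psi], [\tilde\psi]\rangle = 1$ and $[\tilde\phi] = g^k\tilde c[\tilde\psi]$, i.e. (4.24) is satisfied for $\tilde a := g^k\tilde c$. $\square$

A state $\omega$ on the algebra of observables $\tilde{\mathcal A}(\mathcal O)$ is defined by

(i) $\omega: \tilde{\mathcal A}(\mathcal O) \to \tilde{\mathbb C}$ is linear, i.e. $\omega(\tilde a[\tilde A] + [\tilde B]) = \tilde a\,\omega([\tilde A]) + \omega([\tilde B])$,
(ii) $\omega([\tilde A]^*) = \omega([\tilde A])^* \quad \forall[\tilde A] \in \tilde{\mathcal A}(\mathcal O)$,
(iii) $\omega([\tilde A]^*[\tilde A]) \ge 0 \quad \forall[\tilde A] \in \tilde{\mathcal A}(\mathcal O)$, and
(iv) $\omega(\tilde 1) = 1$.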
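As an illustration of the normalization in Corollary 5, the following worked example uses a hypothetical norm series $\tilde b$ chosen here for convenience; it is not taken from the text, and it assumes that the formal parameter satisfies $g^* = g$.

```latex
% Hypothetical example: a vector whose norm starts at order g^2, i.e. k = 1.
\begin{align*}
  \tilde b &= 4g^2 + 4g^3 + g^4, \qquad b_0 = b_1 = 0,\quad b_2 = 4 > 0, \\
  g^{-2}\tilde b &= 4 + 4g + g^2 = (2+g)^2
     \;\Longrightarrow\; \tilde c = 2 + g, \qquad \tilde c^{*}\tilde c = g^{-2}\tilde b, \\
  \tilde c^{-1} &= \tfrac12 - \tfrac14 g + \tfrac18 g^2 - \dots
     \qquad (\text{exists because } c_0 = 2 \neq 0), \\
  \tilde a &= g\,\tilde c = 2g + g^2, \qquad \tilde a^{*}\tilde a = g^2(2+g)^2 = \tilde b .
\end{align*}
```

Here $\tilde a$ itself is not invertible, since $a_0 = 0$; this is why the proof first strips off the factor $g^k$ on the level of vectors and only then divides by the invertible series $\tilde c$.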

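The constructed physical states, i.e. the vector states

$$\omega_{\tilde\phi}([\tilde A]) = \langle[\tilde\phi], [\tilde A][\tilde\phi]\rangle, \quad [\tilde\phi] \in \tilde{\mathcal H}, \tag{4.25}$$

obviously satisfy (i), (ii) and, if $\langle[\tilde\phi], [\tilde\phi]\rangle = 1$, also (iv). The positivity (iii) of the states $\omega_{\tilde\phi}$ is ensured by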


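Corollary 6 (Positivity of the Wightman distributions of gauge invariant fields). Let the algebra $\tilde{\mathcal A}$ be generated by the $\tilde s$-invariant fields $\tilde\phi_1, \ldots, \tilde\phi_l$ and let

$$\tilde A := \sum_{n=0}^{k} \int \sum_{j_1,\ldots,j_n} f_{j_1\ldots j_n}(x_1, \ldots, x_n)\,\tilde\phi_{j_1}(x_1) \cdots \tilde\phi_{j_n}(x_n)\,dx_1 \ldots dx_n, \tag{4.26}$$

$f_{j_1\ldots j_n} \in \mathcal D(\mathbb R^{4n})$, and $\tilde\phi \in \tilde{\mathcal K}_0$. Then

$$\langle\tilde\phi, \tilde A^*\tilde A\tilde\phi\rangle \ge 0. \tag{4.27}$$

Proof. Note $\tilde A\tilde\phi \in \tilde{\mathcal K}_0$ and apply part (i) of Theorem 4. $\square$

Remark. Let us assume $\tilde Q = Q_0$. This situation occurs if the adiabatic limit exists ([KO], see also Sect. 5.2), e.g. in massive gauge theories. Then $\tilde Q\tilde\phi = 0$ means $Q_0\phi_k = 0$, $\forall k$. Therefore, in this case the physical pre Hilbert space $\tilde{\mathcal H}$ is the space of formal power series with coefficients in $\mathcal H$,

$$\tilde{\mathcal H} = \tilde{\mathbb C}\mathcal H \quad (\text{if } \tilde Q = Q_0). \tag{4.28}$$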

But the states $\omega_\phi$ on $\tilde{\mathcal A}(\mathcal O)$ induced by vectors $\phi \in \mathcal H$ remain $\tilde{\mathbb C}$-valued functionals.

5. Verification of the Assumptions in the Example of QED

The construction in the previous section relies on some assumptions, which we are now going to verify in QED. The deformation is given by going over from the free theory to the interacting fields discussed in Sects. 2 and 3. For the free and the interacting theory we will first define the BRST-transformation $s$ ($\tilde s$ resp.) and then we will construct a nilpotent and hermitian operator $Q$ ($\tilde Q$) which implements $s$ ($\tilde s$) in a representation space with indefinite inner product. Then the local observables (defined by (4.4)) are naturally represented on $\mathcal H = \mathrm{Ke}\,Q/\mathrm{Ra}\,Q$ ($\tilde{\mathcal H} = \mathrm{Ke}\,\tilde Q/\mathrm{Ra}\,\tilde Q$) by (4.10). It remains to prove the positivity of the inner product induced in $\mathcal H$ ($\tilde{\mathcal H}$). For the free theory we will do this by giving explicitly (distinguished) representatives of the equivalence classes in $\mathcal H$. Then we conclude from Theorem 4 that positivity holds true also for $\tilde{\mathcal H}$.

5.1. The free theory. We consider the field algebra $\mathcal F$ which is generated by the free fields $A^\mu, \psi, \bar\psi, u, \tilde u$, the Wick monomials $j^\mu = {:}\bar\psi\gamma^\mu\psi{:}$, $\gamma_\mu A^\mu\psi$, $\bar\psi\gamma_\mu A^\mu$, $\mathcal L_0 = j_\mu A^\mu$, and the derivatives of free fields $\partial_\mu A^\mu$, $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$. This algebra has a faithful representation on the Fock space $\mathcal K = \mathcal K_A \otimes \mathcal K_\psi \otimes \mathcal K_g$ of free fields (Appendix A). The $\mathbb Z_2$-grading is $(-1)^{\delta(F)}$, where $F$ is a monomial in $\mathcal F$ and $\delta(F)$ is the ghost number,

$$[Q_u, F] = \delta(F)F, \qquad Q_u := i\int_{x_0=\text{const.}} d^3x\, {:}\tilde u(x)\,\overset{\leftrightarrow}{\partial}_0\, u(x){:}. \tag{5.1}$$

Note $\delta(u) = -1$, $\delta(\tilde u) = 1$. The graded derivation $s$ is the BRST-transformation of free fields

$$s(A^\mu) = i\partial^\mu u, \quad s(\psi) = 0, \quad s(\bar\psi) = 0, \quad s(u) = 0, \quad s(\tilde u) = -i\partial_\mu A^\mu. \tag{5.2}$$


The transformation of Wick monomials and derivatives of free fields is given by

$$s({:}\phi_1(x)\phi_2(x)\cdots{:}) = {:}s(\phi_1)(x)\phi_2(x)\cdots{:} + (-1)^{\delta(\phi_1)}\,{:}\phi_1(x)s(\phi_2)(x)\cdots{:} + \ldots \tag{5.3}$$

and by translation invariance of $s$

$$s(\partial_\mu\phi(x)) = \partial^x_\mu s(\phi)(x). \tag{5.4}$$

This transformation is implemented by the free Kugo–Ojima charge [DHKS1]

$$Q := \int_{x_0=\text{const.}} d^3x\,(\partial_\nu A^\nu(x))\,\overset{\leftrightarrow}{\partial}_0\, u(x), \tag{5.5}$$

which fulfills $Q^* = Q$,$^5$ $[Q_u, Q] = -Q$ and $Q^2 = 0$. This is verified in Appendix A. Moreover, it is shown there that the inner product $\langle\cdot,\cdot\rangle$ is positive semidefinite on $\mathrm{Ke}\,Q$ and that the space of null vectors in $\mathrm{Ke}\,Q$ is precisely $\mathrm{Ra}\,Q$ [DHS1].

The existence of the integral in (5.5) can be proven by the following method due to Requardt [R]. We smear out $J^0 = \partial_\nu A^\nu\,\overset{\leftrightarrow}{\partial}{}^0 u$ with $k(x_0)h(\mathbf x) \in \mathcal D(\mathbb R^4)$, where $\int dx_0\,k(x_0) = 1$ and $h$ is a smeared characteristic function of $\{\mathbf x \in \mathbb R^3, |\mathbf x| \le R\}$ for some $R > 0$. By scaling

$$k_\lambda(x_0) := \lambda k(\lambda x_0), \quad h_\lambda(\mathbf x) := h(\lambda\mathbf x), \quad Q_\lambda := \int d^4x\, k_\lambda(x_0)h_\lambda(\mathbf x)J^0(x)$$

one easily finds $\lim_{\lambda\to 0}\|Q_\lambda\| = 0$ w.r.t. a suitable Krein space norm (cf. Appendix A). In addition, due to current conservation, $\lim_{\lambda\to 0}[Q_\lambda, \phi(y)]_\mp = \int d^3x\,[J^0(x), \phi(y)]_\mp$ (note that the latter integral exists since the region of integration is bounded) for $R$ big enough compared to the support of $k$. Therefore, the strong limit $\lim_{\lambda\to 0} Q_\lambda$ exists on a dense subspace and agrees with (5.5).

Unfortunately, the representation (4.10) of the observables $\mathcal A = \mathrm{Ke}\,s/\mathrm{Ra}\,s$ on the physical pre Hilbert space $\mathcal H = \mathrm{Ke}\,Q/\mathrm{Ra}\,Q$ is not faithful. The counterexample is $[u(f)]$, $f \in \mathcal D(\mathbb R^4)$ real-valued, which induces a non-trivial element of $\mathcal A$ if $\int f\,d^4x \ne 0$. Namely, due to $u(f)^*u(f) = u(f)^2 = 0$, it is represented by zero on $\mathcal H$. (This holds true for each representation in which $\langle\cdot,\cdot\rangle$ is positive definite.) Since $u(\partial_\mu h) = i\,s(A_\mu(h))$, $h \in \mathcal D(\mathbb R^4)$, $\mathcal A$ has the following structure:

$$\mathcal A = \mathcal A^{(0)} \oplus u_0\mathcal A^{(0)}, \tag{5.6}$$

where $u_0$ is the rest class of $u(f)$, $\int f\,d^4x = 1$, and where $\mathcal A^{(0)}$ is the subalgebra with ghost number zero. The representation (4.10) of $\mathcal A^{(0)}$ on $\mathcal H$ is faithful. To make this plausible we mention that $\mathcal A^{(0)}$ is generated by $[F^{\mu\nu}]$, $[\psi]$, $[\bar\psi]$ and Wick monomials thereof, whereas the "canonical" representatives of $\mathcal H$ are the states containing transversal photons, electrons and positrons only (A.39).

The interaction Lagrangian of QED is $s$-invariant up to a divergence of a local field,

$$[Q, \mathcal L_0(x)] = i\partial_\mu\mathcal L_1^\mu(x), \qquad \mathcal L_1^\mu := {:}\bar\psi\gamma^\mu\psi{:}\,u. \tag{5.7}$$

5 We restrict all operators (resp. formal power series of operators) to the dense invariant domain D and, therefore, there is no difference between symmetric and self-adjoint operators.


Thus in the formal adiabatic limit the integral of the Lagrangian becomes invariant. In [DHKS1,DHKS2] the following Ward identities were postulated:

$$[Q, T(\mathcal L_0(x_1) \ldots \mathcal L_0(x_n))] = \sum_{l=1}^{n} \partial^{x_l}_\mu\, T(\mathcal L_0(x_1) \ldots \mathcal L_1^\mu(x_l) \ldots \mathcal L_0(x_n)) \tag{5.8}$$

("free (perturbative) operator gauge invariance", compare (3.12)). Provided the adiabatic limit exists, this condition implies the $s$-invariance of the S-matrix; hence the S-matrix induces a unitary operator on the physical Hilbert space $\mathcal H = \mathrm{Ke}\,Q/\mathrm{Ra}\,Q$ [DHS1,K]. The nice feature of the condition (5.8) is that its formulation makes sense independently of the adiabatic limit. So, if the normalizations are suitably chosen, the free (perturbative) operator gauge invariance (5.8) (more precisely the corresponding C-number identities which imply (5.8)) could be proven to hold to all orders in QED [DHS2,S] and also in $SU(N)$ Yang–Mills theories [DHS1,D1], and to imply (in the latter case) the usual Slavnov–Taylor identities [D2]. In addition, it determines to a large extent the possible structure of the model. Stora [St1] found that, making a general ansatz for the interaction Lagrangian of self-interacting gauge fields, the Ward identities (5.8) require the coupling parameters to be totally antisymmetric structure constants of a Lie group. Moreover, (5.8) was used for a derivation of all the couplings of the standard model of electroweak interactions (especially the Higgs potential) [DS]. We emphasize that (5.8) is a pure quantum formulation of gauge invariance, without reference to classical physics.

5.2. The interacting theory: construction of the interacting Kugo–Ojima charge. We now replace the free fields (including Wick monomials and derivatives) considered in the previous subsection by the corresponding interacting fields, which are formal power series of unbounded operators in the Fock space $\mathcal K$ of free fields. Due to $[Q_u, \mathcal L(x)] = 0$ (3.1), we can normalize the time ordered products such that

(N6) $\quad [Q_u, T(\mathcal L(x_1) \ldots \mathcal L(x_n))] = 0 \quad$ and $\quad [Q_u, T(\mathcal L(x_1) \ldots \mathcal L(x_n)F(x))] = \delta(F)\,T(\mathcal L(x_1) \ldots \mathcal L(x_n)F(x))$.

Hence $[Q_u, F_{\mathrm{int}\,\mathcal L}] = \delta(F)F_{\mathrm{int}\,\mathcal L}$ by (2.8-9). The fundamental normalization condition concerning the ghost number is (B.6) in combination with (N3); they imply (N6). Again

we fix the region $\mathcal O$ to be the double cone $\mathcal O := \mathcal O_{(-r,\mathbf 0),(r,\mathbf 0)}$, $(r > 0)$ (3.18) and assume the switching function $g \in \mathcal D(\mathbb R^4)$ to be constant on $\mathcal O$ (3.19). We study the algebra $\tilde{\mathcal F}(\mathcal O)$ (3.17) of interacting fields localized in $\mathcal O$. The ghost fields do not couple in QED, hence

$$u_{\mathrm{int}\,\mathcal L}(x) = u(x), \qquad \tilde u_{\mathrm{int}\,\mathcal L}(x) = \tilde u(x). \tag{5.9}$$

The abelian BRST-transformation $\tilde s = s_0 + g s_1$ [BRS] should be a graded derivation with zero square and compatible with the $*$-operation. In addition it shall induce the following transformations on the basic fields,$^6$

$$\tilde s(A^\mu_{\mathrm{int}\,\mathcal L}(x)) = i\partial^\mu u(x), \qquad \tilde s(u(x)) = 0, \qquad \tilde s(\tilde u(x)) = -i\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x),$$

$$\tilde s(\psi_{\mathrm{int}\,\mathcal L}(x)) = -g(x)\psi_{\mathrm{int}\,\mathcal L}(x)u(x), \qquad \tilde s(\bar\psi_{\mathrm{int}\,\mathcal L}(x)) = g(x)\bar\psi_{\mathrm{int}\,\mathcal L}(x)u(x). \tag{5.10}$$

6 In contrast to the free case $\psi_{\mathrm{int}\,\mathcal L}$ and $\bar\psi_{\mathrm{int}\,\mathcal L}$ are not observables in the sense of Sect. 4.1. This different behaviour can be understood physically by the accompanying soft photon cloud and mathematically by Gauss' law.

(The pointwise products are well defined.) Let us assume that we have constructed the interacting Kugo–Ojima charge $\tilde Q = Q_{\mathrm{int}}(g)$. Then we shall define $\tilde s$ in terms of the corresponding current such that $Q_{\mathrm{int}}(g)$ implements $\tilde s$,

$$\tilde s(F) = Q_{\mathrm{int}}(g)F - (-1)^{\delta(F)} F Q_{\mathrm{int}}(g), \quad F \in \tilde{\mathcal F}(\mathcal O). \tag{5.11}$$

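If $Q_{\mathrm{int}}(g)$ is hermitian, $\tilde s$ is compatible with the $*$-operation, and $\tilde s^2 = 0$ is implied by $Q_{\mathrm{int}}(g)^2 = 0$. To get $Q_{\mathrm{int}}(g)$ we follow Kugo and Ojima [KO] and replace the current in the free charge $Q$ (5.5) by the corresponding interacting field $\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}{}^x_\nu\, u(x)$. By means of the field equation (3.8) and the Ward identity (3.13) we find

$$\partial^\nu_x\big[\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}{}^x_\nu\, u(x)\big] = -\big(\Box\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\big)u(x) = (\partial_\mu g)(x)\,j^\mu_{\mathrm{int}\,\mathcal L}(x)u(x). \tag{5.12}$$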
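The first equality in (5.12) can be checked in one line. The sketch below assumes the convention $a\,\overset{\leftrightarrow}{\partial}_\nu b := a\,\partial_\nu b - (\partial_\nu a)\,b$ and uses that the ghost field is a free massless field, $\Box u = 0$; the abbreviation $a(x)$ is introduced here only for readability.

```latex
% Abbreviation (not in the text): a(x) := \partial_\mu A^\mu_{int L}(x).
\begin{align*}
  \partial^\nu\big[a\,\overset{\leftrightarrow}{\partial}_\nu u\big]
    &= \partial^\nu\big[a\,\partial_\nu u - (\partial_\nu a)\,u\big] \\
    &= (\partial^\nu a)(\partial_\nu u) + a\,\Box u
       - (\Box a)\,u - (\partial_\nu a)(\partial^\nu u) \\
    &= a\,\Box u - (\Box a)\,u \;=\; -(\Box a)\,u .
\end{align*}
```

The mixed first-derivative terms cancel identically, so the divergence is controlled by $\Box\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}$ alone. The second equality in (5.12) is not derived here; it relies on (3.8) and (3.13).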

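Hence the current is conserved in the region where $g$ is constant. We may therefore define $\tilde s$ on an algebra $\tilde{\mathcal F}(\mathcal O)$ in the following way: we choose $g(x) = e = \text{const}$ on a neighbourhood $\mathcal U$ of $\bar{\mathcal O}$ and set

$$\tilde s(F) := \int_{x_0=0} d^3x\,\big[\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}_0\, u(x), F\big]_\mp, \quad F \in \tilde{\mathcal F}(\mathcal O). \tag{5.13}$$

Because of current conservation, $\tilde s$ is implemented by the operators

$$Q_{\mathrm{int}}(g, k) = \int d^4x\, k^\mu(x)\,(\partial_\nu A^\nu_{\mathrm{int}\,\mathcal L}(x))\,\overset{\leftrightarrow}{\partial}{}^x_\mu\, u(x) \tag{5.14}$$

with $k^\mu \in \mathcal D(\mathcal U)$, where $k^\mu - \delta^\mu_0 h = \partial^\mu f$ for some $f \in \mathcal D(\mathcal U)$, and where $h \in \mathcal D(\mathcal U)$ is a suitably chosen smeared characteristic function of the surface $\{(0, \mathbf x), |\mathbf x| \le r\}$. Now we are well prepared to prove that the definition (5.13) of $\tilde s$ agrees with the usual expressions (5.10) on the basic fields, and to compute all further commutators of $Q_{\mathrm{int}}(g)$ with the interacting sub Wick monomials of $\mathcal L_0$ (3.15).

Theorem 7. We assume that the interacting fields are normalized as described in Sects. 2 and 3, especially that they fulfill the field equations (3.8–9) and the Ward identities (N5). Furthermore we assume $g = e = \text{const}$ on the double cone $\mathcal O = \mathcal O_{(-r,\mathbf 0),(r,\mathbf 0)}$. Let $k^\mu$ be as before. Then we find the commutation rules

$$[Q_{\mathrm{int}}(g,k), A^\mu_{\mathrm{int}\,\mathcal L}(y)] = i\partial^\mu u(y), \qquad [Q_{\mathrm{int}}(g,k), \partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(y)] = 0, \tag{5.15a, b}$$

$$[Q_{\mathrm{int}}(g,k), \psi_{\mathrm{int}\,\mathcal L}(y)] = -e\,\psi_{\mathrm{int}\,\mathcal L}(y)u(y), \tag{5.16a}$$

$$[Q_{\mathrm{int}}(g,k), \bar\psi_{\mathrm{int}\,\mathcal L}(y)] = e\,\bar\psi_{\mathrm{int}\,\mathcal L}(y)u(y), \tag{5.16b}$$

$$\{Q_{\mathrm{int}}(g,k), u(y)\} = 0, \qquad \{Q_{\mathrm{int}}(g,k), \tilde u(y)\} = -i\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(y), \tag{5.17a, b}$$

$$[Q_{\mathrm{int}}(g,k), F^{\mu\nu}_{\mathrm{int}\,\mathcal L}(y)] = 0, \qquad [Q_{\mathrm{int}}(g,k), j^\mu_{\mathrm{int}\,\mathcal L}(y)] = 0, \tag{5.18a, b}$$

$$[Q_{\mathrm{int}}(g,k), (\gamma_\mu A^\mu\psi)_{\mathrm{int}\,\mathcal L}(y)] = -e(\gamma_\mu A^\mu\psi)_{\mathrm{int}\,\mathcal L}(y)u(y) + i\gamma_\mu\psi_{\mathrm{int}\,\mathcal L}(y)\partial^\mu u(y), \tag{5.19}$$

$$[Q_{\mathrm{int}}(g,k), (\bar\psi\gamma_\mu A^\mu)_{\mathrm{int}\,\mathcal L}(y)] = e(\bar\psi\gamma_\mu A^\mu)_{\mathrm{int}\,\mathcal L}(y)u(y) + i\bar\psi_{\mathrm{int}\,\mathcal L}(y)\gamma_\mu\partial^\mu u(y), \tag{5.20}$$

$$[Q_{\mathrm{int}}(g,k), ({:}\bar\psi\gamma_\mu\psi{:}\,A^\mu)_{\mathrm{int}\,\mathcal L}(y)] = ({:}\bar\psi\gamma_\mu\psi{:})_{\mathrm{int}\,\mathcal L}(y)\,i\partial^\mu u(y), \tag{5.21}$$

where always $y \in \mathcal O$.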
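Two of these rules follow from (5.15a) by differentiation alone. As a consistency check, and assuming only that $\partial^y$ may be pulled out of the commutator with the $y$-independent operator $Q_{\mathrm{int}}(g,k)$ and that the free ghost field satisfies $\Box u = 0$, one finds:

```latex
\begin{align*}
  [Q_{\mathrm{int}}(g,k),\, \partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(y)]
    &= \partial^y_\mu\,\big(i\,\partial^\mu u(y)\big) = i\,\Box u(y) = 0
    && \text{(5.15b)}, \\
  [Q_{\mathrm{int}}(g,k),\, F^{\mu\nu}_{\mathrm{int}\,\mathcal L}(y)]
    &= \partial^\mu\big(i\,\partial^\nu u(y)\big) - \partial^\nu\big(i\,\partial^\mu u(y)\big) = 0
    && \text{(5.18a)},
\end{align*}
```

where the second line uses $F^{\mu\nu}_{\mathrm{int}\,\mathcal L} = \partial^\mu A^\nu_{\mathrm{int}\,\mathcal L} - \partial^\nu A^\mu_{\mathrm{int}\,\mathcal L}$ and the symmetry of second derivatives.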


Proof. Since the ghost fields are not influenced by the interaction, we know that the ghost and antighost fields commute with all other interacting fields. Moreover, the pointwise products of these fields with a ghost or antighost field are well defined and behave in commutators as ordinary products of operators in spite of their character as operator valued distributions. (This may be verified by using techniques from microlocal analysis as explained in [BF].) Thus the above relations follow from the commutation rules with $\partial_\nu A^\nu_{\mathrm{int}\,\mathcal L}(x)$ in Proposition 3 and the ghost antighost anticommutation relations in Eq. (3.2). With these preparations the commutators (5.15-16) can easily be computed, for example

$$[Q_{\mathrm{int}}(g), \psi_{\mathrm{int}\,\mathcal L}(y)] = \int_{x_0=0} d^3x\,[\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x), \psi_{\mathrm{int}\,\mathcal L}(y)]\,\overset{\leftrightarrow}{\partial}{}^x_0\, u(x) = e\,\psi_{\mathrm{int}\,\mathcal L}(y)\int d^3x\, D(x - y)\,\overset{\leftrightarrow}{\partial}{}^x_0\, u(x) = -e\,\psi_{\mathrm{int}\,\mathcal L}(y)u(y), \tag{5.22}$$

where we have inserted Proposition 3, (3) and (A.31). By using other commutators of $\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}$ (Proposition 3) we analogously prove (5.15a,b), (5.16b) and (5.21). Alternatively (5.15b) can be obtained by taking the divergence $\partial^y_\mu$ of (5.15a). Part (a) of (5.17) is obvious due to $\{u(x), u(y)\} = 0$; let us compute part (b),

$$\{Q_{\mathrm{int}}(g), \tilde u(y)\} = -i\int_{x_0=0} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}{}^x_0\, D(x - y). \tag{5.23}$$

From (5.12) we know $\Box\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x) = 0$, $\forall x \in \mathcal O$. Therefore, we may apply (A.31) to (5.23), which yields (5.17b). By applying $\partial^\nu_y$ to (5.15a) and using $F^{\mu\nu}_{\mathrm{int}\,\mathcal L} = \partial^\mu A^\nu_{\mathrm{int}\,\mathcal L} - \partial^\nu A^\mu_{\mathrm{int}\,\mathcal L}$ we get (5.18a). Analogously, by working with the field equations for $A^\mu_{\mathrm{int}\,\mathcal L}$ (3.8) and $\psi_{\mathrm{int}\,\mathcal L}$ (3.9), we obtain (5.18b) and (5.19-20) from (5.15a) and (5.16a,b).

In the formal adiabatic limit, $\partial_\nu A^\nu_{\mathrm{int}\,\mathcal L}$ converges to $\partial_\nu A^\nu$ (3.14) and therefore one expects that $Q_{\mathrm{int}}(g)$ will converge to the free Kugo–Ojima charge $Q$. Whereas in QED this reasoning seems to be correct, the corresponding argument does not work in nonabelian gauge theories (as can be seen by an explicit calculation of the first order of $Q_{\mathrm{int}}(g)$) [BDF]. We therefore prefer not to work in the adiabatic limit. The price to pay is that $Q_{\mathrm{int}}(g)$ does not agree with $Q$, so for the construction of the physical Hilbert space we have to check the conditions of Sect. 4. We easily find that $Q_{\mathrm{int}}(g, k)$ is hermitian for real valued $k$ and we even get the nilpotency of $Q_{\mathrm{int}}(g, k)$,

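$$Q_{\mathrm{int}}(g,k)^2 = \frac12\{Q_{\mathrm{int}}(g,k), Q_{\mathrm{int}}(g,k)\} = \frac12\int d^4x\, h(x)\int d^4y\, h(y)\,[\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x), \partial_\nu A^\nu_{\mathrm{int}\,\mathcal L}(y)]\,\overset{\leftrightarrow}{\partial}{}^x_0\,\overset{\leftrightarrow}{\partial}{}^y_0\, u(x)u(y) = 0, \tag{5.24}$$

by means of $k^\mu = \delta^\mu_0 h + \partial^\mu f$ and Proposition 3, (2).$^7$

7 We recall that the commutator $[\partial A_{\mathrm{int}\,\mathcal L}(x), \partial A_{\mathrm{int}\,\mathcal L}(y)]$ vanishes for all $x$ and $y$ for which $\mathrm{supp}\,\partial_\mu g$ does not intersect $\mathcal O_{x,y} \cup \mathcal O_{y,x}$.

But we need in addition that the zeroth order term $Q_0(k)$ of $Q_{\mathrm{int}}(g, k)$ (5.14) satisfies the positivity assumption (4.8). There seems to be no reason why this should hold for a generic choice of $k$. One might try to control the limit when $k$ tends to a smeared characteristic function of the $t = 0$ hyperplane (in order that $Q_0(k)$ becomes equal to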


the free charge $Q$ (5.5)), but without a priori information on the existence of an $\tilde s$-invariant state this appears to be a hard problem.

There is a more elegant way to get rid of these problems which relies on the local character of our construction. We may embed our double cone $\mathcal O$ isometrically into the cylinder $\mathbb R \times C_L$, where $C_L$ is a cube of length $L$, $L \gg r$, with suitable boundary conditions (see Appendix A), and where the first factor denotes the time axis. If we choose the compactification length $L$ big enough, the physical properties of the local algebra $\tilde{\mathcal F}(\mathcal O)$ are not changed. The quantization of the free fields on this cube is worked out in Appendix A. The inductive construction of the perturbation series for the S-matrix or the interacting fields is not changed by the compactification; Sects. 2 and 3 can be adopted without any modification [BF]. We assume the switching function $g$ to fulfill

$$g(x) = e = \text{constant} \quad \forall x \in \mathcal O \cup \{(x_0, \mathbf x) \mid |x_0| < \epsilon\} \quad (r > 0) \tag{5.25}$$

on $\mathbb R \times C_L$ and to have compact support in timelike directions. Now we may insert

$$k^\mu(x) := \delta^\mu_0\, h(x_0), \quad h \in \mathcal D([-\epsilon, \epsilon]), \quad \int dx_0\, h(x_0) = 1$$

into the expression (5.14) for $Q_{\mathrm{int}}$, because $(x_0, \mathbf x) \to h(x_0)$ is an admissible test function on $\mathbb R \times C_L$. We define

$$Q_{\mathrm{int}}(g) : \mathcal D \to \mathcal D, \qquad Q_{\mathrm{int}}(g) := \int dx_0\, h(x_0)\int_{C_L} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}{}^x_0\, u(x). \tag{5.26}$$

By means of (5.12) and the fact that $g$ is constant on the region of integration (the time-slice $[-\epsilon, \epsilon] \times C_L$), we conclude that the result of the integration over $C_L$ is independent of $x_0$ and hence the arbitrariness in the choice of $h(x_0)$ drops out,

$$Q_{\mathrm{int}}(g) = \int_{C_L,\ x_0=\text{const.},\ |x_0|<\epsilon} d^3x\,\partial_\mu A^\mu_{\mathrm{int}\,\mathcal L}(x)\,\overset{\leftrightarrow}{\partial}{}^x_0\, u(x). \tag{5.27}$$

By construction, $Q_{\mathrm{int}}(g)$ implements the BRST-transformation (5.13) and fulfills (see (5.24))

$$Q_{\mathrm{int}}(g)^2 = 0, \quad Q_0 = Q, \quad [Q_u, Q_{\mathrm{int}}(g)] = -Q_{\mathrm{int}}(g) \quad\text{and}\quad Q_{\mathrm{int}}(g)^* = Q_{\mathrm{int}}(g), \tag{5.28}$$

where $Q_0$ is the zeroth order and $Q$ is the free charge (5.5). The last property relies on the *-selfadjointness of $u$ and $A^\mu_{\mathrm{int}\,\mathcal L}$.

We emphasize that our construction describes locally QED also in the non-compactified Minkowski space (this is the main concern of the paper) and, therefore, should not depend on the compactification length $L$. On the level of the algebras this is evident. The local algebras of interacting fields or observables belonging to different values of $L$ are isomorphic. We conjecture that also the state space (i.e. the set of expectation functionals induced by vectors in the physical Hilbert space) is independent of $L$, but this remains to be proven.


6. Outlook

In an abstract setting, Buchholz has developed concepts for the treatment of scattering of infraparticles [Bu1]. It would be interesting to apply his ideas to perturbative QED. Our construction may be helpful for such an investigation which might lead to a more satisfactory understanding of the infrared problem in QED. The importance of a local construction of the observables becomes even more evident in nonabelian gauge theories. There, in the absence of Higgs fields, the adiabatic limit seems not to exist and, hence, only a local model makes sense in the framework of perturbation theory. In the application of our construction to nonabelian gauge theories some technical problems appear, but we see, in principle, no obstacle [BDF]. This is also the perspective for the generalization to curved space-times, where the techniques of [BF] can be used. An open question is the physical meaning of the remaining normalization conditions in a local perturbative construction, after the restrictions from gauge invariance and other symmetries are taken into account. The parameters involved may be considered as structure constants of the algebra of observables, but their usual interpretation as charge and mass involves the adiabatic limit.

Appendix A: Implementation of the Free BRST-Transformation on the Spatially Compactified Minkowski Space

In this appendix we quantize the free gauge, ghost and spinor fields in a finite spatial volume. Special care is needed in the choice of boundary conditions. In the second part of this appendix we prove that the free Kugo–Ojima charge $Q$ (5.5) is nilpotent, implements the BRST-transformation of free fields and fulfills our positivity assumption (4.8).

Let $C_L$ be the open cube of length $L$. The algebra of a free scalar field $\varphi_L$ with mass $m \ge 0$ on $\mathbb R \times C_L$ with Dirichlet boundary conditions$^8$ is the unital *-algebra generated by elements $\varphi_L(f)$, $f \in \mathcal D(\mathbb R \times C_L)$, with the relations

$$f \mapsto \varphi_L(f) \ \text{is linear}, \tag{A.1}$$

$$\varphi_L((\Box + m^2)f) = 0, \tag{A.2}$$

$$\varphi_L(f)^* = \varphi_L(\bar f), \tag{A.3}$$

$$[\varphi_L(f), \varphi_L(g)] = -i\int d^4x\, d^4y\, f(x)g(y)\,\Delta_L(x, y), \tag{A.4}$$

where $\Delta_L$ is the fundamental solution of the Klein–Gordon equation on $\mathbb R \times C_L$ with Dirichlet boundary conditions, which has the explicit form

$$\Delta_L(x^0, \mathbf x, y^0, \mathbf y) = \sum_{s\in S} (-1)^{n(s)}\,\Delta(x^0 - y^0, \mathbf x - s(\mathbf y)), \tag{A.5}$$

where $S$ is the group generated by the reflections on the planes which bound $C_L$ and $n(s)$ is the number of reflections occurring in $s$ (which is well defined modulo 2). In particular one sees that $\Delta_L$ coincides with $\Delta$ on $\mathcal O \times \mathcal O$ if the closure of the double cone $\mathcal O$ is contained in $\mathbb R \times C_L$, considered as a region in Minkowski space. Hence the algebra $\mathcal F(\mathcal O)$ associated to $\mathcal O$ is independent of the boundary conditions. Since $\Delta_L$ depends only on time differences, the algebra is invariant under time translations. The ground state is the quasifree state whose 2-point function is the positive frequency part $\Delta^+_L$ of $\Delta_L$. In the massive case it is given in terms of the Minkowski space 2-point functions by a formula analogous to the formula for the commutator function. In the massless case a corresponding sum might not converge; instead we exploit the fact that the possible frequencies must have squares which are eigenvalues of the Laplace operator on $C_L$ with Dirichlet boundary conditions. In particular zero modes do not appear. Therefore the frequency splitting can be done by convolution with the distribution

$$H(t) = \frac{i\,h(t)}{(2\pi)^{1/2}\,(t + i\varepsilon)}, \tag{A.6}$$

where $h$ is a test function from the Schwartz space $\mathcal S(\mathbb R)$ with $h(0) = 1$ and a Fourier transform with support in the interval $(-\omega, \omega)$, where $\omega^2$ is the smallest eigenvalue of the Laplace operator. $H * \Delta$ differs from $\Delta^+$ by a smooth function and decays fast in spatial directions. Hence $\Delta^+_L$ admits a representation in terms of $H * \Delta$,

$$\Delta^+_L(x^0, \mathbf x, y^0, \mathbf y) = \sum_{s\in S} (-1)^{n(s)}\, H * \Delta(x^0 - y^0, \mathbf x - s(\mathbf y)), \tag{A.7}$$

8 We first tried periodic boundary conditions, but this seems not to work for massless particles because of the existence of zero modes [DF]. For bosonic particles the algebra of the zero mode agrees with the algebra of a free Schrödinger particle in one dimension. There is no ground state on this algebra, and this makes it impossible to define the physical Hilbert space as the cohomology of the free Kugo–Ojima charge $Q$. Therefore, we choose boundary conditions which exclude the zero mode.

and $\Delta^+ - \Delta^+_L$ is smooth on $\mathcal O \times \mathcal O$. Due to a result of Verch [V], this implies that the representation of $\mathcal F(\mathcal O)$ induced by the ground state on $\mathbb R \times C_L$ is unitarily equivalent to the Minkowski space vacuum representation. We also need to compare the Wick products in both representations. Since the two point functions differ by a smooth function we may use the Minkowski space definition of Wick products also on $\mathbb R \times C_L$.$^9$ In [BF] a domain of definition of Wick polynomials was found which depends only on the equivalence class of the representation, hence also the Wick products can be identified.

9 To see this we consider

$${:}\phi(x_1) \ldots \phi(x_n){:}_\omega\,\Psi = \frac{1}{i^n}\,\frac{\delta^n}{\delta f(x_1) \ldots \delta f(x_n)}\, e^{\frac12\omega(f,f)} e^{i\phi(f)}\Psi\Big|_{f=0}$$

$$= \sum_{l=0}^{[n/2]} \sum \frac{\delta^{2l}\, e^{\frac12[\omega(f,f) - \omega_0(f,f)]}}{i^{2l}\,\delta f(x_{i_1})\delta f(x_{j_1}) \ldots \delta f(x_{i_l})\delta f(x_{j_l})}\Big|_{f=0} \cdot \frac{\delta^{(n-2l)}\, e^{\frac12\omega_0(f,f)} e^{i\phi(f)}\Psi}{i^{n-2l}\,\delta f(x_1) \ldots \hat{i_1}\hat{j_1} \ldots \hat{i_l}\hat{j_l} \ldots \delta f(x_n)}\Big|_{f=0}$$

$$= {:}\phi(x_1) \ldots \phi(x_n){:}_{\omega_0}\,\Psi + \sum_{l=1}^{[n/2]} \sum (-1)^l\,(\omega - \omega_0)(x_{i_1} - x_{j_1}) \ldots (\omega - \omega_0)(x_{i_l} - x_{j_l}) \cdot {:}\phi(x_1) \ldots \hat{i_1}\hat{j_1} \ldots \hat{i_l}\hat{j_l} \ldots \phi(x_n){:}_{\omega_0}\,\Psi,$$

where $\omega$ and $\omega_0$ denote quasifree states or the corresponding two-point functions, $\Psi$ is a suitable state vector and the hat means the omission of the corresponding factor. Hence if $(\omega - \omega_0)$ is a smooth function, the limit $(x_i - x_j) \to 0$ $\forall i \ne j$ exists in ${:}\phi(x_1) \ldots \phi(x_m){:}_\omega$ $\forall m \in \mathbb N$ iff it exists in ${:}\phi(x_1) \ldots \phi(x_m){:}_{\omega_0}$ $\forall m \in \mathbb N$.

For the electromagnetic field we may use metallic boundary conditions, i.e. the pullback of the 2-form $F$ vanishes at the boundary (which means that the tangential components of the electric field and the normal component of the magnetic field vanish). In


addition we assume that the auxiliary Nakanishi-Lautrup field B = ∂ µ AL µ (in Feynman gauge) satisfies Dirichlet boundary conditions. The corresponding commutator function is X (−1)n(s) sµλ gλν D(x 0 − y 0 , x − s(y)), (A.8) Dµν,L (x 0 , x, y 0 , y) = s∈S

where the matrix $(s^{\mu}{}_{\lambda})$ describes the action of $s$ on covectors, e.g. for a reflection $s$ on a plane $x_2 = \text{const.}$ we have $s^{\mu}{}_{\lambda} g^{\lambda\nu} = \mathrm{diag}(1,-1,1,-1)^{\mu\nu}$. The algebra generated by the vector potential $A^L_\mu$ can then be defined as in the scalar case, and again the subalgebra associated to the double cone $O$ is independent of the boundary conditions. A ground state can also be defined in terms of the positive frequency part of the two point function; as on Minkowski space, it violates the positivity condition. The ghost and antighost fields are quantized with Dirichlet boundary conditions, i.e. by the relation
$$(u_L(f) + i\tilde u_L(g))^2 = \int d^4x\, d^4y\, f(x)g(y)D_L(x,y) \tag{A.9}$$
which replaces the commutator condition. ($D_L$ is obtained from $\Delta_L$ (A.5) by setting $m = 0$.) The ground state is obtained from the two point function
$$\omega_u(\tilde u_L(x)u_L(y)) = iD_L^+(x,y) \tag{A.10}$$
which again violates positivity w.r.t. the *-operation, which is defined by (cf. (3.5))
$$(u_L(f) + i\tilde u_L(g))^* \stackrel{\text{def}}{=} (u_L(\bar f) + i\tilde u_L(\bar g)). \tag{A.11}$$

Finally, we have to find suitable boundary conditions for the electron field. For simplicity, we choose periodic boundary conditions. Because they are invariant under charge conjugation, the expectation value of the electric current (normal ordered w.r.t. the Minkowski vacuum) vanishes in the ground state (of the cube), hence the interaction Lagrangian $\mathcal{L}_0$ (3.15) keeps the same form as on Minkowski space.

We are now going to represent these algebras in Fock spaces. The one-particle Hilbert space $H$ for a massless free scalar field is the completion with respect to the scalar product
$$(f,g) = i\int_{C_L,\,x_0=\text{const.}} d^3x\, f(x)^*\,\overleftrightarrow{\partial}^x_0\, g(x) \tag{A.12}$$
of the space of all smooth functions $f:\mathbb{R}\times C_L\to\mathbb{C}$ which are positive frequency solutions of the wave equation and fulfill the considered boundary conditions. For Dirichlet boundary conditions the mode decomposition for the functions $f\in H$ reads
$$f(x) = \sum_{n_1,n_2,n_3=1}^{\infty} f_n v_n(x),\qquad f_n = \text{const.}, \tag{A.13}$$
where
$$v_n(x) \stackrel{\text{def}}{=} \frac{2}{(\omega_n L^3)^{1/2}}\sin k_{n_1}x_1\,\sin k_{n_2}x_2\,\sin k_{n_3}x_3\; e^{-i\omega_n x_0},\qquad k_n \stackrel{\text{def}}{=} \frac{\pi}{L}n,\quad \omega_n \stackrel{\text{def}}{=} \|k_n\| = \frac{\pi}{L}\|n\| \tag{A.14}$$
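The normalization of the modes (A.14) with respect to the scalar product (A.12) can be checked numerically. The following is a minimal sketch, not part of the original construction: the function names are hypothetical, the integral is evaluated on the slice $x_0 = 0$, and the product structure of the modes is used to reduce the three-dimensional integral to one-dimensional midpoint sums.

```python
import math

def sine_overlap(a, b, L, samples=400):
    """Midpoint-rule value of the integral over [0, L] of sin(a*pi*x/L) * sin(b*pi*x/L)."""
    h = L / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h
        total += math.sin(a * math.pi * x / L) * math.sin(b * math.pi * x / L)
    return h * total

def omega(n, L):
    """Mode frequency omega_n = (pi / L) * |n| from (A.14)."""
    return math.pi / L * math.sqrt(sum(c * c for c in n))

def mode_inner_product(n, m, L):
    """Scalar product (v_n, v_m) of (A.12), evaluated on the slice x0 = 0."""
    w_n, w_m = omega(n, L), omega(m, L)
    spatial = 1.0
    for a, b in zip(n, m):
        spatial *= sine_overlap(a, b, L)
    # Product of the two prefactors 2 / (omega L^3)^(1/2) from (A.14).
    amplitude = 4.0 / (math.sqrt(w_n * w_m) * L ** 3)
    # i * (conj(v_n) d0 v_m - (d0 conj(v_n)) v_m) = (w_n + w_m) conj(v_n) v_m at x0 = 0.
    return (w_n + w_m) * amplitude * spatial
```

For equal mode numbers the three sine integrals each give $L/2$, so the prefactor $4/(\omega_n L^3)$ times $2\omega_n (L/2)^3$ equals one; for different mode numbers at least one sine integral vanishes.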


(the normalization is such that $(v_n, v_m) = \delta_{n,m}$). The representation of gauge fields in Feynman gauge requires an indefinite inner product space. We describe it in terms of a Krein operator
$$J_L = (-1)^{N_0}\otimes 1\otimes J_g \quad\text{on the Hilbert space}\quad K_L = K_A\otimes K_\psi\otimes K_g. \tag{A.15}$$
The latter is the tensor product of the photon-, spinor- and ghost-Fock space. $N_0$ is the particle number operator for scalar photons and $J_g$ will be defined below. The Krein operator (A.15) fulfills
$$J_L^2 = 1,\qquad J_L^+ = J_L \tag{A.16}$$
($+$ denotes the adjoint in $K_L$), and the dense invariant domain $D_L$ can be chosen such that $J_L D_L = D_L$. The indefinite inner product is given by
$$\langle a, b\rangle \stackrel{\text{def}}{=} (a, J_L b),\qquad a, b\in K_L, \tag{A.17}$$
where $(.,.)$ denotes the (positive definite) scalar product in $K_L$, and the *-operation with respect to (A.17) is
$$O^* \stackrel{\text{def}}{=} J_L O^+ J_L,\qquad \langle Oa, b\rangle = \langle a, O^* b\rangle. \tag{A.18}$$
Let $a^\mu_n$, $a^{\mu+}_n$, $c_{jn}$, $c^+_{jn}$ ($j = 1, 2$) be the usual annihilation and creation operators of the Fock spaces $K_A$ and $K_g$ which fulfill the (anti-)commutation relations
$$[a^\mu_n, a^{\nu+}_m] = \delta_{n,m}\,\delta^{\mu\nu}\, L^3\, 2\omega_n \tag{A.19}$$
and
$$\{c_{jn}, c^+_{lm}\} = \delta_{n,m}\,\delta_{jl}\, L^3\, 2\omega_n. \tag{A.20}$$
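The relations (A.16) to (A.18) can be illustrated in finite dimensions. The sketch below is only a toy model under stated assumptions: the two-dimensional space, the matrix `J = diag(1, -1)` and all function names are hypothetical and merely stand in for the Krein operator $J_L$ on the Fock space.

```python
def dagger(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    size = len(A)
    return [[A[j][i].conjugate() for j in range(size)] for i in range(size)]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def apply(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def scalar_product(a, b):
    """Positive definite product (a, b), antilinear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def krein_product(a, b, J):
    """Indefinite product <a, b> = (a, J b) as in (A.17)."""
    return scalar_product(a, apply(J, b))

def krein_adjoint(O, J):
    """O* = J O^+ J as in (A.18)."""
    return matmul(J, matmul(dagger(O), J))
```

With $J^2 = 1$ and $J^+ = J$ one has $(Oa, Jb) = (a, O^+Jb) = (a, J\,JO^+J\,b)$, which is the identity $\langle Oa, b\rangle = \langle a, O^*b\rangle$ that the toy model reproduces; the second basis vector has negative norm, which shows the indefiniteness.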

The ghost fields $u_L$ and $\tilde u_L$ and the zeroth component of the photon field $A^0_L$ are scalar fields with Dirichlet boundary conditions and some unusual sign conventions,
$$\tilde u_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty}\frac{1}{(2\omega_n L^3)^{1/2}}\,\big(-c_{1n}v_n(x) + c^+_{2n}v_n(x)^*\big), \tag{A.21}$$
$$u_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty}\frac{1}{(2\omega_n L^3)^{1/2}}\,\big(c_{2n}v_n(x) + c^+_{1n}v_n(x)^*\big), \tag{A.22}$$
$$A^0_L(x) = \sum_{n_1,n_2,n_3=1}^{\infty}\frac{1}{(2\omega_n L^3)^{1/2}}\,\big(a^0_n v_n(x) - a^{0+}_n v_n(x)^*\big). \tag{A.23}$$

The normalizations are such that they go over into the usual Lorentz covariant conventions of the non-compactified space by replacing $\big(\frac{2\pi}{L}\big)^3\sum_{n\in\mathbb{Z}^3,\,n\neq 0}$ by $\int d^3k$. For the spatial components of the photon field $A^\mu_L$ we have a mixture of Dirichlet and Neumann boundary conditions. For example for $\mu = 2$ we define
$$v_{2n}(x) \stackrel{\text{def}}{=} \frac{2\eta_n}{(\omega_n L^3)^{1/2}}\sin k_1x_1\,\cos k_2x_2\,\sin k_3x_3\; e^{-i\omega_n x_0},\qquad n_1, n_3 = 1, 2, \ldots,\quad n_2 = 0, 1, 2, \ldots, \tag{A.24}$$
where $\eta_n = 1$ for $n_2\geq 1$ and $\eta_n = 2^{-1/2}$ for $n_2 = 0$, and similarly for $\mu = 1, 3$. Then we set
$$A^l_L(x) = \sum_n\frac{1}{(2\omega_n L^3)^{1/2}}\,\big(a^l_n v_{ln}(x) + a^{l+}_n v_{ln}(x)^*\big),\qquad l = 1, 2, 3. \tag{A.25}$$

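$J_g$ (A.15) is defined implicitly by (A.11), (A.16) and (A.21-22), i.e. we have
$$c^{+*}_{1n} = c_{2n},\qquad c^{+*}_{2n} = c_{1n}.$$
For the photon two-point function we obtain
$$(\Omega_A, A^{\mu+}_L(x)A^\nu_L(y)\Omega_A) = \langle\Omega_A, A^\mu_L(x)J_LA^\nu_L(y)\Omega_A\rangle = \delta^{\mu\nu}\sum_n v_{\mu n}(x)v_{\mu n}(y)^*,\qquad (v_{0n}\stackrel{\text{def}}{=} v_n) \tag{A.26}$$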

which is obviously positive.

We now transfer the construction of the interacting fields (Sects. 2 and 3) from Minkowski space to $\mathbb{R}\times C_L$. Due to [BF] there is no obstacle in principle and there are only a few changes in the formulas. Since spatial translation invariance is lost, the commutator functions and propagators do not depend only on the relative coordinates; they must be replaced by the expressions (A.5), (A.7), (A.8), etc. given above. Some care is required in the proof of Proposition 3. To get (3.22) we use the identity
$$\partial^x_\mu D_L^{\mathrm{av}\,\mu\rho}(x, z_m) = -\partial^\rho_{z_m}D_L^{\mathrm{av}}(x, z_m) \tag{A.27}$$

(which is an immediate consequence of (A.5), (A.8)) and the fact that the boundary terms of the “partial integration” vanish because $D_L^{\mathrm{av}}$ (A.5) fulfills Dirichlet boundary conditions. By Lemma 1 (A) we obtain
$$A^\mu_{\mathrm{int}\,L}(x) = A^\mu(x) + \sum_{n=1}^{\infty}\frac{i^{n+1}}{n!}\int d^4x_1\ldots d^4x_n\; g(x_1)\ldots g(x_n)\cdot\sum_{l=1}^{n}D_L^{\mathrm{ret}\,\mu\nu}(x, x_l)\,R\big(\mathcal{L}_0(x_1)\ldots\hat{l}\ldots\mathcal{L}_0(x_n);\, j_\nu(x_l)\big). \tag{A.28}$$
Since $D_L^{\mathrm{ret}\,\mu\nu}(x, x_l)$ fulfills the boundary conditions of $A^\mu(x)$ we conclude that $A^\mu_{\mathrm{int}\,L}$ obeys the same boundary conditions as the corresponding free field, and similarly for $\psi_{\mathrm{int}\,L}$, $u_{\mathrm{int}\,L}$, etc.

Let us turn to the implementation of the free BRST-transformation. In the following and in the main text we omit the lower index “L”. Due to $\partial^\mu[(\partial_\nu A^\nu)\overleftrightarrow{\partial}_\mu u] = 0$ the definition
$$Q \stackrel{\text{def}}{=} \int_{C_L,\,x_0=\text{const.}} d^3x\,(\partial_\nu A^\nu(x))\,\overleftrightarrow{\partial}_0\, u(x) \tag{A.29}$$
of the free Kugo–Ojima charge (5.5) is independent of $x_0$. Because of $A^{\mu*} = A^\mu$, $u^* = u$ (3.5) we immediately see $Q^* = Q$. By means of
$$\int_{C_L,\,x_0=\text{const.}} d^3x\, D(y - x)\,\overleftrightarrow{\partial}^x_0\,\varphi(x) = \varphi(y),\qquad\forall\varphi\ \text{fulfilling}\ \Box\varphi = 0, \tag{A.30}$$


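one proves that the charge $Q$ implements the BRST-transformation (5.2) of the free fields, e.g.
$$[Q, A^\mu(y)] = \int_{C_L,\,x_0=\text{const.}} d^3x\,[\partial_\nu A^\nu(x), A^\mu(y)]\,\overleftrightarrow{\partial}^x_0\, u(x) = -i\partial^\mu_y\int_{C_L,\,x_0=\text{const.}} d^3x\, D(x - y)\,\overleftrightarrow{\partial}^x_0\, u(x) = i\partial^\mu u(y). \tag{A.31}$$
The transformation (5.3–4) of Wick monomials and derivated fields is also implemented by $Q$, because of
$$[Q, :\varphi_1(x)\varphi_2(x)\cdots:]_\mp = :[Q,\varphi_1(x)]_\mp\varphi_2(x)\cdots: + (-1)^{\delta(\varphi_1)}:\varphi_1(x)[Q,\varphi_2(x)]_\mp\cdots: + \ldots \tag{A.32}$$
and
$$[Q,\partial_\mu\varphi(x)]_\mp = \partial^x_\mu[Q,\varphi(x)]_\mp.$$
We easily find that $Q$ is nilpotent
$$Q^2 = \frac12\{Q, Q\} = \frac12\int_{C_L,\,x_0=\text{const.}} d^3x\,\{Q, (\partial_\nu A^\nu)\,\overleftrightarrow{\partial}_0\, u\} = 0. \tag{A.33}$$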

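Inserting (A.22–23) and (A.25) into (A.29) we obtain
$$Q = \frac{1}{L^3\,2^{1/2}}\sum_{n_1,n_2,n_3=1}^{\infty}\big[c^+_{1n}b_{1n} + b^+_{2n}c_{2n}\big] \tag{A.34}$$
in a straightforward way (the sum in $Q$ converges in the topology of the Krein space on the dense invariant domain $D$), where
$$b_{1n} \stackrel{\text{def}}{=} \frac{1}{2^{1/2}}\Big(a^0_n + i\,\frac{k^j_n a^j_n}{\omega_n}\Big),\qquad b_{2n} \stackrel{\text{def}}{=} \frac{-1}{2^{1/2}}\Big(a^0_n - i\,\frac{k^j_n a^j_n}{\omega_n}\Big), \tag{A.35}$$
which implies
$$[b_{jn}, b^+_{lm}] = \delta_{n,m}\,\delta_{jl}\, L^3\, 2\omega_n \tag{A.36}$$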

(cf. Sect. 5 of [DHS1]). By means of Q2 = 0 one finds similarly to [K] that the dense invariant domain D has the decomposition D = Ra Q ⊕ (Ke Q ∩ Ke Q+ ) ⊕ Ra Q+

(A.37)

and these three subspaces are pairwise orthogonal with respect to the scalar product (.,.) (cf. (A.17-18)). Additionally one easily verifies Ke Q ∩ Ke Q+ = Ke {Q, Q+ }.

(A.38)
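The decomposition (A.37) and the identity (A.38) have an elementary finite-dimensional analogue that can be checked by counting dimensions. The sketch below is an illustration under stated assumptions, not the Fock space construction: the real $4\times 4$ matrix `Q = u w^T` with $w\cdot u = 0$ is a hypothetical toy charge, and the adjoint $Q^+$ with respect to the positive definite scalar product is its transpose.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def rank(A):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in A]
    rank_count = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count

def nilpotent_example():
    """Hypothetical toy charge Q = u w^T with w.u = 0, so that Q^2 = 0."""
    u = [1, 2, 0, 1]
    w = [2, -1, 3, 0]
    return [[ui * wj for wj in w] for ui in u]

def decomposition_dimensions(Q):
    """Dimensions of Ra Q, Ke{Q, Q+} and Ra Q+ for a real square matrix Q."""
    Qt = transpose(Q)
    anticommutator = add(matmul(Q, Qt), matmul(Qt, Q))
    kernel_dim = len(Q) - rank(anticommutator)
    return rank(Q), kernel_dim, rank(Qt)
```

In this toy model the three dimensions add up to the dimension of the whole space, as (A.37) requires, and the middle summand is computed as the kernel of $\{Q, Q^+\}$ in the sense of (A.38).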

Inserting (A.34) we find
$$\{Q, Q^+\} = \frac{1}{L^3}\sum_{n_1,n_2,n_3=1}^{\infty}\omega_n\,\big(b^+_{1n}b_{1n} + b^+_{2n}b_{2n} + c^+_{1n}c_{1n} + c^+_{2n}c_{2n}\big), \tag{A.39}$$


which agrees up to factors $2\omega_n^2$ with the particle number operator of the ghosts and the longitudinal and scalar photons; however, the kernels completely agree. Obviously the Krein operator $J$ (A.15) is the identity on $\mathrm{Ke}\{Q, Q^+\}$. Additionally $J$ maps $\mathrm{Ra}\,Q$ onto $\mathrm{Ra}\,Q^+$, due to $JQ = Q^+J$. From the decomposition (A.37) we conclude that our positivity assumption (4.8) is in fact satisfied, i.e. the indefinite product $\langle .,.\rangle$ is positive semidefinite on $\mathrm{Ke}\,Q$ and the null vectors in $\mathrm{Ke}\,Q$ are precisely $\mathrm{Ra}\,Q$. The vectors in $\mathrm{Ke}\{Q, Q^+\}$ are distinguished representatives of the equivalence classes in the physical space $\mathcal{H} = \mathrm{Ke}\,Q/\mathrm{Ra}\,Q$ (4.9). They provide the usual physical picture, namely that the states in $\mathcal{H}$ are built up from electrons, positrons and transversal photons only.

Appendix B: Proof of the Ward Identities

We recall the Ward identities (3.16),
$$\partial^y_\mu T\big(j^\mu(y)A_1(x_1)\ldots A_n(x_n)\big) = i\sum_{j=1}^{n}\delta(y - x_j)\,T\big(A_1(x_1)\ldots(\theta A_j)(x_j)\ldots A_n(x_n)\big), \tag{B.1}$$
where
$$(\theta A_j) \stackrel{\text{def}}{=} \frac{d}{d\alpha}\Big|_{\alpha=0}A_{j\alpha} = i(r_j - s_j)A_j\qquad\text{for}\qquad A_j = :\psi^{r_j}\bar\psi^{s_j}B_1\ldots B_l: \tag{B.2}$$

($B_1,\ldots,B_l$ are non-spinorial free fields, i.e. photon or ghost operators), and $A_{j\alpha}$ is given by the global $U(1)$-transformation (3.17). Note
$$(\theta A_j) = -i[Q_\psi, A_j]\qquad\text{with}\qquad Q_\psi \stackrel{\text{def}}{=} \int d^3x\, :\bar\psi(x)\gamma^0\psi(x):, \tag{B.3}$$
i.e. $Q_\psi$ is the infinitesimal generator of the transformation (3.17). There exist several proofs of the Ward identities in QED, e.g. [FHRW,S]. Here we want to show that they can be fulfilled in our framework; in particular we have to check that they are compatible with our normalization conditions. We follow ideas from [St2].

First we point out a consequence of the Ward identities (B.1). For a given $(x_1,\ldots,x_n)$ let $O\subset\mathbb{R}^4$ be a double cone which contains the points $x_1,\ldots,x_n$ and let $g$ be a test function which is equal to 1 on a neighbourhood of $O$. We decompose $\partial^\mu g = a^\mu - b^\mu$ such that $\mathrm{supp}\,a^\mu\cap(V_- + O) = \emptyset$ and $\mathrm{supp}\,b^\mu\cap(V_+ + O) = \emptyset$. We smear out (B.1) with this $g$ in $y$. Then, by causal factorization, the left-hand side of (B.1) becomes
$$-j^\mu(a_\mu)\,T\big(A_1(x_1)\ldots A_n(x_n)\big) + T\big(A_1(x_1)\ldots A_n(x_n)\big)\,j^\mu(b_\mu) = -\big[j^\mu(a_\mu), T\big(A_1(x_1)\ldots A_n(x_n)\big)\big] - T\big(A_1(x_1)\ldots A_n(x_n)\big)\,j^\mu(\partial_\mu g). \tag{B.4}$$
The second term vanishes because $j^\mu$ is a conserved current. Since $T(A_1(x_1)\ldots A_n(x_n))$ is localized in $O$, the term $-j^\mu(a_\mu)$ in the commutator can be replaced by $Q_\psi$.$^{10}$ Hence the validity of the following lemma is necessary for the Ward identities:

$^{10}$ This may be seen as follows. Different choices of $a_\mu$ differ only in the spacelike complement of $O$ and therefore do not affect the commutator. We may choose
$$a_\mu(x) = \partial_\mu g(x)\int_{-\infty}^{ex}h(t)\,dt, \tag{B.5}$$


Lemma 8. In agreement with (N1-4) and (N6) the normalizations can be chosen such that the vacuum expectation values of the time ordered products vanish, if the sum of the charges of the factors is different from zero
$$\langle\Omega|T(A_1\ldots A_n)|\Omega\rangle = 0\qquad\text{for}\qquad\sum_j(r_j - s_j)\neq 0. \tag{B.6}$$
Under this condition the following identity becomes true
$$\big[Q_\psi, T\big(A_1(x_1)\ldots A_n(x_n)\big)\big] = \sum_{j=1}^{n}T\big(A_1(x_1)\ldots[Q_\psi, A_j(x_j)]\ldots A_n(x_n)\big)\equiv i\sum_{j=1}^{n}T\big(A_1(x_1)\ldots(\theta A_j)(x_j)\ldots A_n(x_n)\big). \tag{B.7}$$

Proof of Lemma 8. The lemma is certainly fulfilled for $n = 1$, and we proceed inductively with respect to the order $n$. For each fixed $n$ we consider a second induction in the sum of the degrees of the Wick monomials $A_j$, $j = 1,\ldots,n$. We commute the assertion (B.7) with the free fields. After inserting (N3) we can use the inductive assumption and find that these commutators vanish. Therefore, the identity (B.7) can only be violated by a C-number. (An analogous computation is given below in Step 1 of the proof of the Ward identities.) To determine this C-number we consider the vacuum expectation value of (B.7). Since $Q_\psi = Q_\psi^*$ annihilates the vacuum we find $\langle\Omega|[Q_\psi, T(A_1\ldots A_n)]|\Omega\rangle = 0$. Moreover note
$$\sum_j\langle\Omega|T(A_1\ldots(\theta A_j)\ldots A_n)|\Omega\rangle = i\sum_j(r_j - s_j)\,\langle\Omega|T(A_1\ldots A_n)|\Omega\rangle. \tag{B.8}$$
Due to the causal factorization and the validity of (B.7) in lower orders, the expression (B.8) must be local. Hence we can require (B.6) as a normalization condition, i.e. we extend zero by zero to the total diagonal. Obviously this prescription is compatible with (N1–4) and (N6). This completes the proof of the lemma. $\Box$

Proof of the Ward identities. We show that all Ward identities can be satisfied by choosing a suitable normalization of the vacuum expectation values of the time ordered products which contain no free field factor and with vanishing total charge (B.6). We work with the same double inductive procedure as in the previous proof.

$^{10}$ (continued) where $e$ is a suitable timelike unit vector and $h$ is a test function of one variable with sufficiently small support and total integral 1, i.e. the integral in (B.5) is a $C^\infty$-approximation to $\Theta(ex - c)$ ($c\in\mathbb{R}$ is a suitable constant). Then by current conservation and partial integration we obtain
$$-j^\mu(a_\mu) = j^\mu(e_\mu\, g\, h(e\,\cdot)) = \int dt\, h(t)\int_{ex=t}g(x)\,j^\mu(x)\,\varepsilon_{\mu\nu\rho\sigma}\,dx^\nu dx^\rho dx^\sigma,$$
hence the statement follows from $g\equiv 1$ on $O$.


Step 1. Again we commute the assertion with the free fields. By means of (N3) we obtain h {∂µy T j µ (y)A1 (x1 ) . . . An (xn ) − i X δ(y − xm )T A1 (x1 ) . . . (θAm )(xm ) . . . An (xn ) }, φj (z) = −i m

∂Ak . . . An − ∂µy T j µ (y)A1 . . . ∂φl k X ∂Ak δ(y − xm )T A1 . . . . . . (θ Am ) . . . An − −i ∂φl m (m6=k) o ∂(θAk ) . . . An 1lj (xk − z) + −iδ(y − xk )T A1 . . . ∂φl ∂j µ (y)A1 . . . An 1lj (y − z). +i∂µy T ∂φl =i

Xn

(B.9)

For φj 6 = ψ, ψ the last term vanishes and we obtain zero, due to ∂A ∂(θAk ) k 1lj = θ 1lj ∂φl ∂φl

(for φj 6 = ψ, ψ)

(B.10)

and the inductive assumption. If φj = ψ (φj = ψ is analogous) the last term is equal to o n i∂µy T ψ(y)A1 . . . An γ µ S(y − z) = X ∂Ak T A1 . . . . . . An δ(xk − y)S(xk − z) = −i ∂ψ

(B.11)

k

according to (N4). Because of ∂Ak ∂(θ Ak ) = i(rk − sk ) , ∂ψ ∂ψ

θ

∂A k

∂ψ

= i(rk − 1 − sk )

∂Ak ∂ψ

(B.12)

and the inductive assumption, the commutator (B.9) vanishes also in this case. Again we conclude that a possible violation of a Ward identity (we call it an anomaly) can only appear in the vacuum sector, i.e. in the vacuum expectation values.
$$a(y, x_1,\ldots,x_n) \stackrel{\text{def}}{=} \partial^y_\mu T\big(j^\mu(y)A_1(x_1)\ldots A_n(x_n)\big) - i\sum_{j=1}^{n}\delta(y - x_j)\,T\big(A_1(x_1)\ldots(\theta A_j)(x_j)\ldots A_n(x_n)\big) =$$
$$= \partial^y_\mu\langle\Omega|T\big(j^\mu(y)A_1\ldots A_n\big)|\Omega\rangle - i\sum_{j=1}^{n}\delta(y - x_j)\,\langle\Omega|T\big(A_1\ldots(\theta A_j)\ldots A_n\big)|\Omega\rangle. \tag{B.13}$$


Moreover the anomalies are local, i.e.
$$a(y, x_1,\ldots,x_n) = P(\partial)\,\delta(x_1 - y)\ldots\delta(x_n - y), \tag{B.14}$$
where $P(\partial)$ is a polynomial in $\partial\equiv(\partial_{x_1},\ldots,\partial_{x_n})$. The latter is a consequence of the induction with respect to the order $n$ and the causal factorization (2.2) of the time ordered products.

Step 2. Next we prove the Ward identities with a free field factor. We only need to consider their vacuum expectation values. The normalization condition (N4) implies the well known identity
$$\langle\Omega|T\big(A_1(x_1)\ldots A_n(x_n)\varphi_i(x)\big)|\Omega\rangle = i\sum_{k=1}^{n}\sum_l\Delta^F_{il}(x - x_k)\,\langle\Omega|T\Big(A_1(x_1)\ldots\frac{\partial A_k}{\partial\varphi_l}(x_k)\ldots A_n(x_n)\Big)|\Omega\rangle, \tag{B.15}$$

where 1F is the Feynman propagator. By inserting this formula we obtain ∂µy < |T j µ (y)A1 (x1 ) . . . Am (xm )φi (x) | > − −i

n X j =1

δ(y − xj ) < |T A1 (x1 ) . . . (θ Aj )(xj ) . . .

. . . Am (xm )φi (x) | > −iδ(y − x) < |T A1 (x1 ) . . . Am (xm )(θ φi )(x) | >= X ∂Ak 1Fil (x − xk )∂µy < |T j µ (y)A1 . . . . . . Am | > + =i ∂φl k ∂j µ n o (y)A1 . . . Am | > + +i∂µy 1Fil (x − y) < |T ∂φl X X ∂Ak F δ(y − xj ) 1il (x − xk ) < |T A1 . . . . . . (θ Aj ) . . . Am | > + + ∂φl j k (k6=j ) X ∂(θ Ak ) δ(y − xk )1Fil (x − xk ) < |T A1 . . . . . . Am | > + + ∂φl k X ∂Ak 1F(θ i)l (x − xk ) < |T A1 . . . . . . Am | >, +δ(y − x) ∂φl k (B.16) def where m = n − 1 and 1F(θ i)l (x − y) = i < |T ((θ φi )(x)φl (y))| >. If φi 6 = ψ, ψ the second and the last term vanish (αi = 0). Due to (B.10) and the Ward identities in lower order we get zero. If φi = ψ (φi = ψ is analogous) the second term is equal to X ∂Ak < |T A1 . . . . . . Am | > [S F (xk − y)δ(y − x) − δ(xk − y)S F (y − x)] i ∂ψ k (B.17) by means of (N4). Because of (B.12) and the Ward identities in lower order all terms cancel in this case, too.


Step 3. By choosing $g$ as in (B.4) we conclude from Lemma 8,
$$0 = \int d^4y\, g(y)\,a(y, x_1,\ldots,x_n) = \int d^4y\, a(y, x_1,\ldots,x_n) \tag{B.18}$$
in $\mathcal{D}'(\mathbb{R}^{4n})$. This restricts the remaining anomalies. We want to remove them by finite renormalizations of $\langle\Omega|T(j^\mu(y)A_1(x_1)\ldots A_n(x_n))|\Omega\rangle$. This can only be done if the polynomials $P(\partial)$ (B.14) have the form
$$P(\partial) = \Big(\sum_{i=1}^{n}\partial^{x_i}_\mu\Big)P_1^\mu(\partial),\qquad P_1^\mu(\partial)\ \text{polynomial in}\ \partial\equiv(\partial_{x_1},\ldots,\partial_{x_n}). \tag{B.19}$$
To prove this we consider the Fourier transformation of the anomaly (B.14),
$$\hat a(y, p_1,\ldots,p_n) = (2\pi)^{-2n}\int dx_1\ldots dx_n\, a(y, x_1,\ldots,x_n)\,e^{i(p_1x_1+\cdots+p_nx_n)} = (2\pi)^{-2n}P(-ip_1,\ldots,-ip_n)\,e^{i(p_1+\cdots+p_n)y}. \tag{B.20}$$
From (B.18) we know that $P(-ip_1,\ldots,-ip_n)$ vanishes on the submanifold $\sum_{i=1}^{n}p_i = 0$,
$$P(-ip_1,\ldots,-ip_n)\,\delta\Big(\sum_{i=1}^{n}p_i\Big) = 0. \tag{B.21}$$
Let $\tilde P(q, p_1,\ldots,p_{n-1}) \stackrel{\text{def}}{=} P(-ip_1,\ldots,-ip_n)$, where $q \stackrel{\text{def}}{=} \sum_{i=1}^{n}p_i$. We consider the Taylor series of $\tilde P$,
$$\tilde P(q, p_1,\ldots,p_{n-1}) = \sum_{k=1}^{\mathrm{degree}\,\tilde P}\ \sum_{|\alpha|+|\beta|=k}\frac{q^\alpha p^\beta}{\alpha!\,\beta!}\,\frac{\partial^{|\alpha|}\partial^{|\beta|}}{\partial q^\alpha\,\partial p^\beta}\tilde P(0),\qquad p\equiv(p_1,\ldots,p_{n-1}). \tag{B.22}$$
The terms $|\alpha| = 0$ vanish because $\frac{\partial^{|\beta|}}{\partial p^\beta}\tilde P(0)$ is obtained by varying $\tilde P$ on the submanifold $q = 0$. There remain only terms with a factor $q^\alpha$, $|\alpha|\geq 1$. This proves (B.19).

Step 4. But there is still a problem. The renormalization
$$\langle\Omega|T\big(j^\mu(y)A_1(x_1)\ldots A_n(x_n)\big)|\Omega\rangle\ \to\ \langle\Omega|T\big(j^\mu A_1\ldots A_n\big)|\Omega\rangle + P_1^\mu(\partial)\,\delta(x_1 - y)\ldots\delta(x_n - y) \tag{B.23}$$
(which removes the anomaly) is only admissible if $P_1^\mu(\partial)\,\delta(x_1 - y)\ldots\delta(x_n - y)$ has the same symmetries as required for $\langle\Omega|T(j^\mu A_1\ldots A_n)|\Omega\rangle$. Especially if there are factors $j^{\mu_l}(x_l)$ among $A_1(x_1)\ldots A_n(x_n)$ the permutation symmetry with respect to $(y,\mu)\leftrightarrow(x_l,\mu_l)$ must be maintained (for all $l$). There is a prominent counterexample where this is impossible: the axial anomaly, i.e. $\langle\Omega|T(j_A^\mu(y)\,j_A^{\mu_1}(x_1)\,j_A^{\mu_2}(x_2))|\Omega\rangle$, where $j_A^\mu \stackrel{\text{def}}{=} :\bar\psi\gamma^\mu\gamma^5\psi:$.


We have not found a general argument (for non-axial QED) that all possible anomalies factorize, $P(\partial) = \big(\sum_i\partial^{x_i}_\mu\big)P_1^\mu(\partial)$, such that $P_1^\mu(\partial)$ has the wanted symmetries. However, taking Step 2 into account, and also the fact that the scaling degree at $x_j - y = 0$ ($\forall j = 1,\ldots,n$) [BF] of the anomaly cannot be higher than the scaling degree of the terms in the corresponding Ward identity, the number of Ward identities which can be violated is strongly reduced. In addition, due to (B.18) terms of singular order $\omega = 0$ (i.e. scaling degree $\delta = 4n$) are excluded in the anomalies. The famous Ward identity which connects the vertex function with the electron self-energy has only one factor $j$ and, hence, the renormalization (B.23) maintains the symmetries in that case. There remain the following anomalies:
$$\partial^y_\mu\langle\Omega|T\big(j^\mu(y)\mathcal{L}(x_{11})\ldots\mathcal{L}(x_{1m})\,j^{\mu_1}(x_{21})\big)|\Omega\rangle = \sum_{1\leq|a|\leq 3}C^{\mu_1}_{1a}\,\partial^a\prod_{h,j}\delta(x_{hj} - y), \tag{B.24}$$
$$\partial^y_\mu\langle\Omega|T\big(j^\mu(y)\mathcal{L}(x_{11})\ldots\mathcal{L}(x_{1m})\,j^{\mu_1}(x_{21})\,j^{\mu_2}(x_{22})\,j^{\mu_3}(x_{23})\big)|\Omega\rangle = \sum_{|a|=1}C^{\mu_1\mu_2\mu_3}_{2a}\,\partial^a\prod_{h,j}\delta(x_{hj} - y). \tag{B.25}$$
The unknown constants $C^{\ldots}_{l\ldots}$, $l = 1, 2, 3, 4$ are restricted by Lorentz covariance and the permutation symmetry in $x_{11},\ldots,x_{1m}$. The analogous Ward identity with three factors $j$ is trivially fulfilled, due to Furry’s theorem, by imposing C-invariance as a further normalization condition.$^{11}$ The anomalies in (B.24) and (B.25) can be further restricted by the symmetry in the factors $j$ of the terms on the l.h.s., e.g. $\partial^{x_{21}}_{\mu_1}\partial^y_\mu\langle\Omega|T(j^\mu(y)\mathcal{L}(x_{11})\ldots\mathcal{L}(x_{1m})\,j^{\mu_1}(x_{21}))|\Omega\rangle$ is symmetrical in $y$, $x_{21}$. By working this out one finds that the factorization (B.19) of the anomalies can be done in such a way that the symmetries are preserved in the renormalizations (B.23) [DHS2]. $\Box$

Acknowledgements. We profited from discussions with Franz-Marc Boas, Izumi Ojima, Klaus Sibold and Raymond Stora which are gratefully acknowledged. Part of this work was done at the Max-Planck-Institute for Mathematics in the Sciences in Leipzig and at the University of Leipzig. The authors wish to thank Bodo Geyer, Gert Rudolph, Klaus Sibold and Eberhard Zeidler for warm hospitality.

$^{11}$ In the inductive construction of the time ordered products C-invariance can only get lost in the extension to the total diagonal, because of the causal factorization (2.2). Starting with an extension which fulfills all other normalization conditions (N1-4), (N6), (B.6) and symmetrizing it with respect to C-invariance, we obtain an extension which satisfies all requirements.

References

[BF] Brunetti, R. and Fredenhagen, K.: Interacting quantum fields in curved space: Renormalization of φ^4. gr-qc/9701048, Proceedings of the Conference “Operator Algebras and Quantum Field Theory”, held at Accademia Nazionale dei Lincei, Rome, July 1996; Brunetti, R. and Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds. In preparation
[BDF] Boas, F.-M., Dütsch, M. and Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: Nonabelian gauge theories. Work in progress
[BFK] Brunetti, R., Fredenhagen, K. and Köhler, M.: The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes. Commun. Math. Phys. 180, 312 (1996)
[BRS] Becchi, C., Rouet, A. and Stora, R.: Renormalization of the abelian Higgs–Kibble model. Commun. Math. Phys. 42, 127 (1975); Becchi, C., Rouet, A. and Stora, R.: Renormalization of gauge theories. Annals of Physics (N.Y.) 98, 287 (1976)
[BlSe] Blanchard, P. and Seneor, R.: Green’s functions for theories with massless particles (in perturbation theory). Ann. Inst. H. Poincaré A 23, 147 (1975)
[BS] Bogoliubov, N.N. and Shirkov, D.V.: Introduction to the Theory of Quantized Fields. New York: (1959)
[Bu1] Buchholz, D., Porrmann, M. and Stein, U.: Dirac versus Wigner: Towards a universal particle concept in local quantum field theory. Phys. Lett. B 267, 377 (1991)
[Bu2] Buchholz, D.: Gauss’ law and the infraparticle problem. Phys. Lett. B 174, 331 (1986)
[BW] Bordemann, M. and Waldmann, S.: Formal GNS construction and states in deformation quantization. Commun. Math. Phys. 195, 549 (1998)
[D1] Dütsch, M.: On gauge invariance of Yang–Mills theories with matter fields. N. Cimento A 109, 1145 (1996)
[D2] Dütsch, M.: Slavnov–Taylor identities from the causal point of view. Int. J. Mod. Phys. A 12, 3205 (1997)
[DF] Dütsch, M. and Fredenhagen, K.: Deformation stability of BRST-quantization. hep-th/9807215, to appear in the proceedings of the conference “Particles, Fields and Gravitation”, Lodz, Poland (1998)
[DHS1] Dütsch, M., Hurth, T. and Scharf, G.: Causal construction of Yang–Mills theories. IV. Unitarity, N. Cimento A 108, 737 (1995)
[DHS2] Dütsch, M., Hurth, T. and Scharf, G.: Gauge invariance of massless QED. Phys. Lett. B 327, 166 (1994)
[DHKS1] Dütsch, M., Hurth, T., Krahe, F. and Scharf, G.: Causal construction of Yang–Mills theories. I. N. Cimento A 106, 1029 (1993)
[DHKS2] Dütsch, M., Hurth, T., Krahe, F. and Scharf, G.: Causal construction of Yang–Mills theories. II. N. Cimento A 107, 375 (1994)
[DKS] Dütsch, M., Krahe, F. and Scharf, G.: Interacting fields in finite QED. N. Cimento A 103, 871 (1990)
[DS] Dütsch, M. and Scharf, G.: Perturbative gauge invariance: The electroweak theory. hep-th/9612091, to appear in Annalen der Physik (Leipzig); Aste, A., Dütsch, M. and Scharf, G.: Perturbative gauge invariance: The electroweak theory II. hep-th/9702053, to appear in Annalen der Physik (Leipzig)
[EG] Epstein, H. and Glaser, V.: The role of locality in perturbation theory. Ann. Inst. H. Poincaré A 19, 211 (1973)
[F] Feynman, R.P.: Acta Phys. Polonica 24, 697 (1963)
[FHRW] Feldman, J.S., Hurd, T.R., Rosen, L. and Wright, J.D.: QED: A Proof of Renormalizability. Berlin–Heidelberg–New York: Springer-Verlag, 1988
[FP] Faddeev, L.D. and Popov, V.N.: Feynman diagrams for the Yang–Mills field. Phys. Lett. B 25, 29 (1967)
[K] Krahe, F.: A causal approach to massive Yang–Mills theories. Acta Phys. Polonica B 27, 2453 (1996)
[KO] Kugo, T. and Ojima, I.: Local covariant operator formalism of non-abelian gauge theories and quark confinement problem. Suppl. Progr. Theor. Phys. 66, 1 (1979)
[R] Requardt, M.: Symmetry conservation and integrals over local charge densities in quantum field theory. Commun. Math. Phys. 50, 259 (1976)
[S] Scharf, G.: Finite Quantum Electrodynamics. 2nd. ed., Berlin–Heidelberg–New York: Springer-Verlag, 1995
[Sch] Schroer, B.: Infrateilchen in der Quantenfeldtheorie. Fortschr. Phys. 173, 1527 (1963)
[St1] Stora, R.: Local gauge groups in quantum field theory: Perturbative gauge theories. Talk given at the workshop “Local Quantum Physics” at the Erwin-Schroedinger-institute, Vienna (1997)
[St2] Stora, R.: Lagrangian field theory. Summer School of Theoretical Physics about “Particle Physics”, Les Houches, 1971
[St3] Stora, R.: Differential algebras in Lagrangian field theory. ETH-Zürich Lectures, January–February 1993; Popineau, G. and Stora, R.: A pedagogical remark on the main theorem of perturbative renormalization theory. Unpublished preprint
[Ste] Steinmann, O.: Perturbation expansions in axiomatic field theory. Lecture Notes in Physics 11, Berlin–Heidelberg–New York: Springer-Verlag, 1971
[V] Verch, R.: Local Definiteness, Primarity and Quasiequivalence of Quasifree Hadamard Quantum States in Curved Spacetime. Commun. Math. Phys. 160, 507 (1994)

Communicated by D. C. Brydges

Commun. Math. Phys. 203, 107 – 118 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Global Existence for the Einstein–Boltzmann Equation in the Flat Robertson–Walker Spacetime

Piotr Bogusław Mucha

Institute of Applied Mathematics and Mechanics, Warsaw University, ul. Banacha 2, 02-097 Warsaw, Poland. E-mail: [email protected]

Received: 18 May 1998 / Accepted: 23 November 1998

Abstract: The initial value problem for the Einstein–Boltzmann equation in the spatially homogeneous and isotropic case is considered. A global in time mild solution is obtained.

In this paper we consider the Einstein–Boltzmann system, which describes the evolution of a collisional gas in general relativity [1, 2]:
$$p^\alpha f_{,\alpha} - \Gamma^i_{\mu\nu}\,p^\mu p^\nu f_{,p^i} = Q(f, f), \tag{1.1}$$
$$G_{\mu\nu} = T_{\mu\nu}, \tag{1.2}$$
$$T^{\mu\nu} = \int p^\mu p^\nu f(t, x, p)\,\frac{|g|^{1/2}}{p^0}\,d\bar p, \tag{1.3}$$
where $Q(f, f)$ is the collision operator and $T^{\mu\nu}$ is the energy-momentum tensor. The first equation (1.1), called the Boltzmann equation, determines the distribution function $f(t, x, p)$ of the gas particles. To describe $f$ we need the submanifold $P(M)$ of the tangent bundle $TM$ of the pseudo-Riemannian manifold $M$ which is defined by the constraint:
$$P_x(p):\quad g_x(p, p) = g_{\alpha\beta}\,p^\alpha p^\beta = 1\qquad(\alpha,\beta = 0, 1, 2, 3), \tag{2}$$
where $g_{\alpha\beta}$ is a metric of $M$ given by the Einstein equations (1.2). Then $f: P(M)\to\mathbb{R}$. Assumption (2) states that all particles have the same mass, equal to 1. We assume that the spacetime is spatially homogeneous and isotropic. The symmetry implies that the metric simplifies to the form:
$$ds^2 = dt^2 - R^2(t)\big[(dx^1)^2 + (dx^2)^2 + (dx^3)^2\big], \tag{3}$$
where $R(t) > 0$ and the distribution function $f(t,\bar x,\bar p)$ does not depend on $\bar x$ and depends only on $p = |\bar p|$ ($f(t,\bar x,\bar p) = f(t, p)$). This metric describes the cosmological model known as the flat Robertson–Walker spacetime.


We consider the initial value problem in such a case. The aim of this paper is to show the existence of a global in time mild solution. To prove this we use methods similar to the ones applied to the classical spatially homogeneous Boltzmann equation [4]. The initial value problem for the Einstein–Boltzmann system in a general case has been considered in [1, 3 and 6]. The results contained in these papers are local in time. With the above assumptions the Einstein–Boltzmann system reads (see [1, 2]):
$$f_{,t} - \frac{\dot R}{R}\,p f_{,p} = \frac{1}{p^0}\,Q(f, f), \tag{4.1}$$
$$\Big(\frac{\dot R}{R}\Big)^2 = T_{00}, \tag{4.2}$$
where
$$T_{00} = R^3(t)\int f(t, p)\,p^0\,d\bar p, \tag{4.3}$$
$$Q(f, g) = Q^+(f, g) - Q^-(f, g), \tag{4.4}$$
$$Q^+(f, g) = \int_{\mathbb{R}^3_q}\frac{R^3\,d\bar q}{q^0}\int_{S^2}d\omega\, f(p')\,g(q')\,S(\bar p,\bar q,\bar p',\bar q'), \tag{4.5}$$
$$Q^-(f, g) = \int_{\mathbb{R}^3_q}\frac{R^3\,d\bar q}{q^0}\int_{S^2}d\omega\, f(p)\,g(q)\,S(\bar p,\bar q,\bar p',\bar q'), \tag{4.6}$$
$$\bar p' = \bar p - (\omega,\bar p - \bar q)\,\omega, \tag{4.7}$$
$$\bar q' = \bar q + (\omega,\bar p - \bar q)\,\omega, \tag{4.8}$$
$$p^0 = \sqrt{1 + R^2p^2}, \tag{4.9}$$
$$0\leq S(\,\cdot\,,\,\cdot\,,\,\cdot\,,\,\cdot\,)\leq C_1, \tag{4.10}$$
where $\omega\in S^2$, $C_1$ is constant, $p'$ and $q'$ are defined the same as for the classical Boltzmann equation (this can be done because for a fixed $t$ the Riemann submanifold numbered by $t$ is flat ($E^3$)) and $S(p, q, p', q')$, a continuous function, is the cross section for the collisions. Equation (4.2) is equivalent to the system $G_{\mu\nu} = T_{\mu\nu}$ (1.2), because the integrability conditions $\nabla_\mu T^{\mu\nu} = 0$ hold for $T^{\mu\nu}$ defined by the Boltzmann equation [2, 5]. For the system (4) we consider the initial value problem with data:
$$R(0) = R_0 > 0,\qquad 0\leq f(0, p) = f_0(p) = f_0(\bar p)\in L^1(\mathbb{R}^3). \tag{5}$$

Because of (4.2) the initial data above do not ensure uniqueness. We have to add an extra condition $\dot R(0) < 0$ or $\dot R(0) > 0$. Then we can even compute $R(t)$. But first we have to define the mild solution to system (4). To reach our aim we have to reformulate the problem. For the Boltzmann equation (4.1),
$$f_{,t} - \frac{\dot R}{R}\,p f_{,p} = \frac{1}{p^0}\,Q(f, f),$$
we apply the method of characteristics. Thus we consider the system:
$$\frac{dp(t, y)}{dt} = -\frac{\dot R}{R}\,p(t, y), \tag{6.1}$$
$$\frac{df(t, y)}{dt} = \frac{1}{p^0}\,Q(f, f)(t, y), \tag{6.2}$$
where $p^0 = \sqrt{1 + R^2(t)\,p^2(t, y)}$.

Equation (6.1) gives the characteristic:
$$p(t, y) = \frac{yR(0)}{R(t)}. \tag{7}$$
It is easily seen that the Jacobian of the transformation $p\to y$ is equal to:
$$\det\Big(\frac{\partial\bar p}{\partial\bar y}\Big) = \Big(\frac{R(0)}{R(t)}\Big)^3 > 0. \tag{8}$$
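The characteristic (7) can be checked against (6.1) for any smooth positive scale factor. The following is a small numerical sketch under stated assumptions: the function names are hypothetical, the derivatives are taken by central differences, and the scale factor used in the usage note is an arbitrary test function rather than a solution of (4.2).

```python
def characteristic(R, t, y):
    """Characteristic p(t, y) = y R(0) / R(t) from (7)."""
    return y * R(0.0) / R(t)

def central_difference(func, t, h=1e-5):
    """Second-order finite-difference approximation of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

def characteristic_residual(R, t, y):
    """Residual dp/dt + (R'/R) p of equation (6.1) along the characteristic (7)."""
    dp_dt = central_difference(lambda s: characteristic(R, s, y), t)
    expansion_rate = central_difference(R, t) / R(t)
    return dp_dt + expansion_rate * characteristic(R, t, y)

def jacobian(R, t):
    """Determinant (R(0) / R(t))^3 from (8)."""
    return (R(0.0) / R(t)) ** 3
```

For instance, with the test function $R(t) = 1 + t^2$ the residual vanishes up to discretization error and the Jacobian at $t = 1$ equals $(1/2)^3$.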

The second equation (6.2) will be solved in the form:
$$f(t, y) = f_0(y) + \int_0^t\frac{1}{p^0}\,Q(f, f)(s, y)\,ds. \tag{9}$$
The solution of (9) together with (7) we call the mild solution of (4). From (7) we see that $p^0 = \sqrt{1 + R^2(0)\,y^2}$. Multiplying (9) by $p^0$ and integrating over $p$ we get
$$\int p^0 f(t, y)\,d\bar y = \int p^0 f(0, y)\,d\bar y = \text{const}.$$
The integral of the collision operator vanishes ($\int Q(f, f)\,d\bar p = 0$, which follows from properties of $S$ [3]). And this implies from (4.3) and (8) that $T_{00}(t) = T_{00}(0)$. (We have to assume that $T_{00}(0)$ is finite.)

From (4.2) we have two cases. The first case, when $\dot R(0) < 0$, gives
$$\frac{\dot R}{R} = -\sqrt{T_{00}(0)},\quad\text{then}\quad R(t) = R(0)\exp\big(-\sqrt{T_{00}(0)}\,t\big), \tag{10}$$
and then $R(t)$ is decreasing on $[0,\infty)$. The density $\|f\|_{L^1(\mathbb{R}^3_p)}$ of the gas will grow. The second case, when $\dot R(0) > 0$, gives
$$\frac{\dot R}{R} = \sqrt{T_{00}(0)},\quad\text{then}\quad R(t) = R(0)\exp\big(\sqrt{T_{00}(0)}\,t\big), \tag{11}$$
$R(t)$ is increasing and the density will diminish. From the above considerations we conclude that the problem concentrates on the Boltzmann equation, because $R(t)$ has already been given by (10) or (11).
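The closed forms (10) and (11) can be compared with a direct numerical integration of $\dot R = \pm\sqrt{T_{00}(0)}\,R$. The sketch below assumes, as derived above, that $T_{00}$ is constant in time; the function names and the classical fourth-order Runge–Kutta scheme are choices made for this illustration only.

```python
import math

def scale_factor_exact(t, R0, T00, expanding):
    """Closed form (11) if expanding, (10) otherwise."""
    sign = 1.0 if expanding else -1.0
    return R0 * math.exp(sign * math.sqrt(T00) * t)

def scale_factor_rk4(t_end, R0, T00, expanding, steps=1000):
    """Integrate dR/dt = +-sqrt(T00) * R with the classical Runge-Kutta scheme."""
    rate = math.sqrt(T00) if expanding else -math.sqrt(T00)
    h = t_end / steps
    R = R0
    for _ in range(steps):
        k1 = rate * R
        k2 = rate * (R + 0.5 * h * k1)
        k3 = rate * (R + 0.5 * h * k2)
        k4 = rate * (R + h * k3)
        R += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return R

def density_ratio(t, T00, expanding):
    """Ratio ||f(t)||_p / ||f_0||_p = (R(0) / R(t))^3 for a conserved norm in y."""
    return (1.0 / scale_factor_exact(t, 1.0, T00, expanding)) ** 3
```

In the contracting case the ratio returned by `density_ratio` grows like $\exp(3\sqrt{T_{00}(0)}\,t)$, and in the expanding case it decays at the same rate, in line with the qualitative statements above.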

Notation.
$$\bar x = (x_1, x_2, x_3),\qquad x = |\bar x| = \sqrt{\sum_{k=1}^{3}x_k^2},$$
$$\frac{\partial}{\partial t}u = u_{,t} = \dot u,\qquad\frac{\partial}{\partial x}u = u_{,x},$$
$$X_r = \Big\{f\geq 0:\int_{\mathbb{R}^3}f(x)\,dx\leq r\Big\},$$
$$\|u\|_y = \int|u|\,d\bar y,\qquad\|u\|_p = \int|u|\,d\bar p,$$
$$d\bar y = \frac{R^3(t)}{R^3(0)}\,d\bar p,\qquad\|u\|_y = \frac{R^3(t)}{R^3(0)}\,\|u\|_p.$$
The main result of the paper is the following theorem:

Theorem. If $f_0(p)\in X_r$ for $r\geq 0$ and $R_0 > 0$ and $\int p^0 f_0(p)\,R_0^3\,d\bar p < \infty$ then:

(i) the Cauchy problem for the system (4) with $\dot R(0) < 0$ has a unique global in time nonnegative mild solution such that $f\in C(0,\infty; L^1(\mathbb{R}^3_y))$ with
$$R(t) = R(0)\exp\big(-\sqrt{T_{00}(0)}\,t\big)\qquad\text{and}\qquad\|f(t)\|_p = \|f_0\|_p\exp\big(3\sqrt{T_{00}(0)}\,t\big);$$

(ii) and if $\dot R(0) > 0$, then the system (4) has a global in time nonnegative mild solution such that $f\in C(0,\infty; L^1(\mathbb{R}^3_y))$ with
$$R(t) = R(0)\exp\big(\sqrt{T_{00}(0)}\,t\big)\qquad\text{and}\qquad\|f(t)\|_p = \|f_0\|_p\exp\big(-3\sqrt{T_{00}(0)}\,t\big).$$

To prove the theorem we need some lemmas.

Lemma 1. If $f, g\in X_r$ then:
$$\Big\|\frac{1}{p^0}Q^+(f, f) - \frac{1}{p^0}Q^+(g, g)\Big\|_y\leq N(r)\,\|f - g\|_y, \tag{12}$$
$$\Big\|\frac{1}{p^0}Q^-(f, f) - \frac{1}{p^0}Q^-(g, g)\Big\|_y\leq N(r)\,\|f - g\|_y, \tag{13}$$
and $N(r) = C_1r$.


Proof. We note that
$$\frac{1}{p^0}Q(f, f) - \frac{1}{p^0}Q(g, g) = \frac{1}{p^0}\big[Q(f, f - g) + Q(f - g, g)\big].$$
One can estimate the second term like the first. $Q$ (see (4.4)) is estimated in two parts. We consider $Q^+$ and $Q^-$ separately ((4.5) and (4.6)):
$$\frac{1}{p^0}Q^+(f, f - g) = \frac{1}{p^0}\int R^3(t)\,d\bar q\,\frac{1}{q^0}\int_{S^2}d\omega\, f(p')\,(f - g)(q')\,S.$$
Taking the norm in $L^1(\mathbb{R}^3_y)$ we obtain
$$\int d\bar y\int R^3\,d\bar q\int_{S^2}d\omega\, f(p')\,|(f - g)(q')|\,\frac{S}{p^0q^0};$$
changing variables $y\to p$ (see (8)) we get
$$= \gamma\int d\bar p\int d\bar q\int_{S^2}d\omega\, f(p')\,|(f - g)(q')|\,\frac{S}{p^0q^0}.$$
By properties of $p, q, p', q'$ ($p^2 + q^2 = p'^2 + q'^2$) there is:
$$p^0q^0 = \sqrt{(1 + R^2p^2)(1 + R^2q^2)}\geq\sqrt{1 + R^2(p'^2 + q'^2)}.$$
By the transformation $(p, q)\to(p', q')$ (the Jacobian is equal to 1, see (4.7), (4.8)) we obtain:
$$\Big\|\frac{1}{p^0}Q^+(f, f - g)\Big\|_y\leq C_1R^6(t)\,\|f\|_p\,\|f - g\|_p$$
(see (4.10)). Going back to $L^1(\mathbb{R}^3_y)$ we get from (8):
$$\Big\|\frac{1}{p^0}Q^+(f, f - g)\Big\|_y\leq C_1\,\|f\|_y\,\|f - g\|_y. \tag{14}$$
For $Q^-(f, f - g)$, using the same methods as for $Q^+$, one can get
$$\int d\bar y\,\frac{1}{p^0}\int\frac{d\bar q\,R^3}{q^0}\int d\omega\, f(y)\,f(q)\,S\leq C_1\,\|f\|_y\,\|f - g\|_y. \tag{15}$$
In that way from (14) and (15) we conclude (12) and (13) with:
$$N(r) = C_1r. \tag{16}$$
$\Box$

Lemma 2. For any $r > 0$ there exists $n(r) > 0$ such that the equation
\[
n u - \frac{1}{p^0} Q(u,u) = v \tag{17}
\]
for $v \in X_r$ has a unique nonnegative solution $u$ which belongs to $L^1(\mathbb{R}^3)$ for any $n \ge n(r)$.


P. B. Mucha

Proof. We construct the sequence of approximations:
\[
u_1 = \frac{v}{n}, \qquad \ldots, \qquad u_{k+1} = \frac{v + \frac{1}{p^0} Q^+(u_k, u_k)}{n + \frac{1}{p^0} Q^-(1, u_k)} .
\]
It is obvious that $u_k \ge 0$. We have to show that $\{u_k\}_{k=1}^{\infty}$ converges in $L^1(\mathbb{R}^3_y)$ (and from now on $\|\cdot\| = \|\cdot\|_y$). Applying Lemma 1 we get:

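\[
\|u_{k+1} - u_k\| \le \frac{1}{n^2} \Big\| \Big(v + \frac{Q^+(u_k,u_k)}{p^0}\Big)\Big(n + \frac{Q^-(1,u_{k-1})}{p^0}\Big) - \Big(v + \frac{Q^+(u_{k-1},u_{k-1})}{p^0}\Big)\Big(n + \frac{Q^-(1,u_k)}{p^0}\Big) \Big\|
\]
\[
\le \frac{1}{n^2} \Big\| \Big[ \frac{n Q^+(u_k,u_k)}{p^0} + \frac{Q^-(v,u_{k-1})}{p^0} + \frac{Q^+(u_k,u_k)\, Q^-(1,u_{k-1})}{(p^0)^2} - \frac{n Q^+(u_{k-1},u_{k-1})}{p^0} - \frac{Q^-(v,u_k)}{p^0} - \frac{Q^+(u_{k-1},u_{k-1})\, Q^-(1,u_k)}{(p^0)^2} \Big] \Big\| .
\]
But from Lemma 1 we have:
\[
\frac{1}{n} \Big\| \frac{Q^+(u_k,u_k)}{p^0} - \frac{Q^+(u_{k-1},u_{k-1})}{p^0} \Big\| \le \frac{N(r)}{n} \|u_k - u_{k-1}\|,
\]
\[
\frac{1}{n^2} \Big\| \frac{Q^-(v,u_{k-1})}{p^0} - \frac{Q^-(v,u_k)}{p^0} \Big\| \le \frac{N(r)}{n^2} \|u_k - u_{k-1}\|,
\]
\[
\frac{1}{n^2} \Big\| \frac{Q^-(1,u_k)\,[Q^+(u_k,u_k) - Q^+(u_{k-1},u_{k-1})]}{(p^0)^2} \Big\| \le \frac{N^2(r)}{n^2} \|u_k - u_{k-1}\|,
\]
\[
\frac{1}{n^2} \Big\| \frac{Q^+(u_k,u_k)\,[Q^-(1,u_k) - Q^-(1,u_{k-1})]}{(p^0)^2} \Big\| \le \frac{N^2(r)}{n^2} \|u_k - u_{k-1}\| .
\]
So we obtain
\[
\|u_{k+1} - u_k\| \le \Big( \frac{N(r)}{n} + \frac{N(r)}{n^2} + \frac{2N^2(r)}{n^2} \Big) \|u_k - u_{k-1}\| .
\]
Hence if
\[
\frac{N(r)}{n} + \frac{N(r)}{n^2} + \frac{2N^2(r)}{n^2} < 1, \tag{18}
\]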


then from the Banach fixed point theorem the sequence $\{u_k\}_{k=1}^{\infty}$ has a unique limit. Thus there exists $n(r)$ such that for any $n \ge n(r)$ Eq. (17) has a solution (by (18) and Lemma 1 the uniqueness is obvious). □

Definition. Let us define the operator
\[
R(n, Q) = \Big(n - \frac{1}{p^0} Q\Big)^{-1} : X_r \to X_r \quad \text{for } n \ge n(r).
\]
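The contraction condition (18) is easy to explore numerically. The following Python sketch is only an illustration: the helper names `contraction_factor` and `smallest_n` are hypothetical, and $N(r)$ is treated as a given positive number rather than computed from the collision kernel. It evaluates the left-hand side of (18) and searches for the smallest integer $n$ at which it drops below 1; since the expression decreases in $n$, (18) then holds for every larger $n$ as well.

```python
def contraction_factor(N, n):
    """Left-hand side of (18): N/n + N/n^2 + 2 N^2 / n^2."""
    return N / n + N / n**2 + 2 * N**2 / n**2


def smallest_n(N):
    """Smallest positive integer n for which (18) holds (hypothetical helper)."""
    n = 1
    while contraction_factor(N, n) >= 1:
        n += 1
    return n


if __name__ == "__main__":
    # N stands in for N(r) = C_1 r; the sample values are arbitrary.
    for N in (0.1, 1.0, 5.0):
        n0 = smallest_n(N)
        print(N, n0, contraction_factor(N, n0))
```

For the sample values used here, the threshold $n \ge \max\{8N(r), 1\}$ that appears in Lemma 3 also satisfies (18) with room to spare.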

For $R(n, Q)$ we show the following estimates:

Lemma 3. If $g, h \in X_r$ and $n \ge \max\{8N(r), 1\}$ then
\[
\|R(n, Q) g\| \le \frac{1 + \varepsilon}{n} \|g\|, \tag{19}
\]
where $\varepsilon = \frac{4N(r)}{n^2}$, and
\[
\|R(n, Q) n u - R(n, Q) n v\| \le N_1(r) \|u - v\|, \tag{20}
\]
where $N_1(r) < 2$.

Proof. Let $R(n, Q) g = h$; then from Lemma 1 and from properties of the sequence of approximations from Lemma 2 with $n \ge \max\{8N(r), 1\}$, one easily obtains:
\[
\|h\| = \Big\| \frac{g + \frac{1}{p^0} Q^+(h,h)}{n + \frac{1}{p^0} Q^-(1,h)} \Big\| \le \frac{2}{n} \|g\| .
\]
Using the above estimate and again Lemma 1 and Lemma 2 we get (19):
\[
\|h\| = \Big\| \frac{g + \frac{1}{p^0} Q^+(h,h)}{n + \frac{1}{p^0} Q^-(1,h)} \Big\| \le \frac{1 + \frac{4N(r)}{n^2}}{n} \|g\|
\]
with
\[
\varepsilon = \frac{4N(r)}{n^2} . \tag{21}
\]

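We denote $R(n, Q) n u = U$ and $R(n, Q) n v = V$; then
\[
\|R(n, Q) n u - R(n, Q) n v\| \le \Big\| \frac{n u + \frac{1}{p^0} Q^+(U,U)}{n + \frac{1}{p^0} Q^-(1,U)} - \frac{n v + \frac{1}{p^0} Q^+(V,V)}{n + \frac{1}{p^0} Q^-(1,V)} \Big\|
\]
\[
\le \frac{1}{n^2} \Big\| \Big[ n^2 u + \frac{n Q^+(U,U)}{p^0} + \frac{n Q^-(u,V)}{p^0} + \frac{Q^+(U,U)\, Q^-(1,V)}{(p^0)^2} - \Big( n^2 v + \frac{n Q^+(V,V)}{p^0} + \frac{n Q^-(v,U)}{p^0} + \frac{Q^+(V,V)\, Q^-(1,U)}{(p^0)^2} \Big) \Big] \Big\| .
\]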


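We have
\[
\frac{1}{n} \Big\| \frac{1}{p^0} Q^+(U,U) - \frac{1}{p^0} Q^+(V,V) \Big\| \le \frac{N(2r)}{n} \|U - V\|,
\]
\[
\frac{1}{n} \Big\| \frac{Q^-(u,V)}{p^0} - \frac{Q^-(v,U)}{p^0} \Big\| = \frac{1}{n} \Big\| \frac{Q^-(u,V-U)}{p^0} - \frac{Q^-(v-u,U)}{p^0} \Big\| \le \frac{N(r)}{n} \|U - V\| + \frac{2N(r)}{n} \|u - v\|,
\]
\[
\Big\| \frac{Q^+(U,U)\, Q^-(1,V) - Q^+(V,V)\, Q^-(1,U)}{n^2 (p^0)^2} \Big\| = \Big\| \frac{Q^+(U,U)\, Q^-(1,V-U) - [Q^+(V,V) - Q^+(U,U)]\, Q^-(1,U)}{n^2 (p^0)^2} \Big\| \le \frac{2N^2(2r)}{n^2} \|U - V\|,
\]
hence we get
\[
\|U - V\| \le \Big( 1 + \frac{2N(r)}{n} \Big) \|u - v\| + \Big( \frac{N(2r)}{n} + \frac{N(r)}{n} + \frac{2N^2(2r)}{n^2} \Big) \|U - V\| .
\]
From the above inequality we conclude (20) and
\[
N_1(r) \le \frac{1 + \frac{2N(r)}{n}}{1 - \frac{4N(r)}{n}} < 2 \tag{22}
\]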

for $n \ge \max\{8N(r), 1\}$. □

Definition. We define the Yosida approximation of the operator $\frac{1}{p^0} Q$:
\[
Q_n = n R(n, Q) n - n = \frac{1}{p^0} Q R(n, Q) n . \tag{23}
\]

Lemma 4. For $Q_n$ defined by (23) we have
\[
\lim_{n \to \infty} Q_n = \frac{1}{p^0} Q
\]
in $L^1(\mathbb{R}^3)$.

Proof. We have for any $u \in L^1(\mathbb{R}^3)$,
\[
\frac{1}{p^0} Q(R(n,Q) n u, R(n,Q) n u) - \frac{1}{p^0} Q(u,u) = \frac{1}{p^0} Q(R(n,Q) n u, R(n,Q) n u - u) + \frac{1}{p^0} Q(R(n,Q) n u - u, u), \tag{24}
\]


hence by Lemma 1 and Lemma 3 we get
\[
\Big\| \frac{1}{p^0} Q(R(n,Q) n u, R(n,Q) n u) - \frac{1}{p^0} Q(u,u) \Big\| \le \gamma\, \|R(n,Q) n u - u\| .
\]
Taking (17) with $v = n u$ we get
\[
n R(n,Q) n u - \frac{1}{p^0} Q(R(n,Q) n u, R(n,Q) n u) = n u,
\]
hence by Lemma 1 and (19) we get
\[
\|R(n,Q) n u - u\| \le \frac{1}{n} \gamma \|u\|^2,
\]
which tends to zero as $n \to \infty$. □

We will approximate the solution of the Boltzmann equation (9) by the solution of the following equation on intervals $[t_0, t_0 + t]$:
\[
f_n(t_0 + t, y) = f_n(t_0, y) + \int_{t_0}^{t_0 + t} Q_n(f_n, f_n)(s, y)\, ds . \tag{25}
\]

Lemma 5. There exists a unique solution $f_n(t, y)$ of Eq. (25) in $C(t_0, t_0 + \frac{1}{3n}; L^1(\mathbb{R}^3))$ and $\|f_n(t_0 + t)\| \le \frac{1}{1 - \delta} \|f_n(t_0)\|$ for $0 < t \le \frac{1}{3n}$ and $\delta = \frac{4N(r)}{n^2}$.

Proof. Applying (25) one can get the following equation:
\[
\frac{d}{dt} \big[ e^{nt} f_n \big] = e^{nt} Q_n(f_n) + n e^{nt} f_n ;
\]
integrating over $\int_{t_0}^{t_0 + t} \cdot\, ds$ and using (23) we get
\[
f_n(t_0 + t) = e^{-nt} f_n(t_0, y) + \int_{t_0}^{t_0 + t} e^{-n(t_0 + t - s)}\, n R(n,Q) n f_n\, ds .
\]
By (20) and (22) we conclude:
\[
\sup_t \Big\| \int_{t_0}^{t_0 + t} n e^{-n(t_0 + t - s)} \big[ R(n,Q) n u - R(n,Q) n v \big]\, ds \Big\| \le 2 \sup_t \|u - v\| \int_0^t n e^{-ns}\, ds \le 2 n t \sup_t \|u - v\|,
\]
so that for $t < \frac{1}{2n}$ we get existence for (25) (in particular for $t \le \frac{1}{3n}$). We estimate the norm of the solution:
\[
\sup_t \|f_n(t_0 + t)\| \le e^{-nt} \|f_n(t_0)\| + \sup_t \|R(n,Q) n f_n(t_0 + t)\| \int_0^t n e^{-ns}\, ds \le e^{-nt} \|f_n(t_0)\| + \frac{1 + \varepsilon}{n} \sup_t \|n f_n(t_0 + t)\| \big(1 - e^{-nt}\big) .
\]


From this
\[
\sup_t \|f_n(t_0 + t)\| \le \frac{e^{-nt}}{1 - (1 + \varepsilon)(1 - e^{-nt})} \|f_n(t_0)\| \le \frac{1}{1 + \varepsilon (1 - e^{nt})} \|f_n(t_0)\|,
\]
recalling $\varepsilon$ defined by (21) and $e^{1/2} < 2$ we have:
\[
\sup_t \|f_n(t_0 + t)\| \le \frac{1}{1 - \frac{4N(r)}{n^2}} \|f_n(t_0)\|
\]
for $0 \le t \le \frac{1}{3n}$, and the sup is taken for $t \in [0, \frac{1}{3n}]$. The continuity is obvious by (20). □
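The last chain of estimates in the proof of Lemma 5 can be checked numerically. The sketch below is a sanity check, not part of the proof; the function names are hypothetical and $\varepsilon$ is passed in as a plain number. It compares the two forms of the growth factor, which coincide algebraically, with the bound $1/(1-\varepsilon)$ on the range $0 \le nt \le 1/3$.

```python
import math


def growth_bound(eps, nt):
    """e^{-nt} / (1 - (1 + eps)(1 - e^{-nt})), the factor obtained in the proof."""
    return math.exp(-nt) / (1.0 - (1.0 + eps) * (1.0 - math.exp(-nt)))


def simplified_bound(eps, nt):
    """Equivalent form 1 / (1 + eps (1 - e^{nt}))."""
    return 1.0 / (1.0 + eps * (1.0 - math.exp(nt)))


def check(eps, steps=50):
    """Return True if both forms agree and stay below 1/(1 - eps) for 0 <= nt <= 1/3."""
    for i in range(steps + 1):
        nt = i / (3.0 * steps)
        a = growth_bound(eps, nt)
        b = simplified_bound(eps, nt)
        if abs(a - b) > 1e-12 or b > 1.0 / (1.0 - eps) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    # For n >= max{8 N(r), 1} one has eps = 4 N(r) / n^2 <= 1/2.
    for eps in (0.0, 0.01, 0.5):
        print(eps, check(eps))
```

The bound holds on this range because $e^{nt} - 1 \le e^{1/3} - 1 < 1$ when $nt \le 1/3$.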

Proof of the Theorem. We can construct an approximation of the solution of the Boltzmann equation (9) on the interval $[0, T]$ for any $T > 0$ and small $\delta > 0$. We take $n_1 = [4N(r)] + l = k_0 + 1$, where $l$ is taken such that:
\[
\exp\Big( \frac{4N(2r)}{k_0} \Big) < \frac{1}{1 - \delta} . \tag{26}
\]
Then from Lemma 5 we get a unique solution of (25) on the interval $[0, T_1]$ for $T_1 = \frac{1}{3 n_1} = \frac{1}{3(k_0 + 1)}$; we denote this solution by $f_{n_1}$. By Lemma 5 we get for $0 \le t \le T_1$:
\[
\|f_{n_1}\|(T_1) \le \frac{1}{1 - \frac{4N(r)}{n_1^2}} \|f_0\| \le \frac{1}{1 - \frac{4N(2r)}{(k_0 + 1)^2}} \|f_0\| .
\]
Solving (25) with $t_0 = T_1$ and a greater $n$ we obtain $T_2$, etc. Precisely, we construct $F_{k_0}$, the approximation of the solution on $[0, T]$:

1. $F_{k_0}|_{[0, T_1]} = f_{n_1}$, where $n_1 = k_0 + 1$, $T_1 = \frac{1}{3(k_0 + 1)}$, $f_{n_1}(0, y) = f_0(y)$ and we have:
\[
\sup_{0 \le t \le T_1} \|f_{n_1}(t)\| \le \frac{1}{1 - \frac{4N(2r)}{(k_0 + 1)^2}} \|f_0\| .
\]
2. $F_{k_0}|_{[T_1, T_2]} = f_{n_2}$, where $n_2 = k_0 + 2$, $T_2 = T_1 + \frac{1}{3(k_0 + 2)}$, $f_{n_2}(T_1, y) = f_{n_1}(T_1, y)$ and we have:
\[
\sup_{T_1 \le t \le T_2} \|f_{n_2}(t)\| \le \prod_{j=1}^{2} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}} \|f_0\| .
\]
...

i. $F_{k_0}|_{[T_{i-1}, T_i]} = f_{n_i}$, where $n_i = k_0 + i$, $T_i = T_{i-1} + \frac{1}{3(k_0 + i)}$, $f_{n_i}(T_{i-1}, y) = f_{n_{i-1}}(T_{i-1}, y)$ and we have:
\[
\sup_{T_{i-1} \le t \le T_i} \|f_{n_i}(t)\| \le \prod_{j=1}^{i} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}} \|f_0\| .
\]
...


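K. $F_{k_0}|_{[T_{K-1}, T_K]} = f_{n_K}$, where $n_K = k_0 + K$, $T_K = T_{K-1} + \frac{1}{3(k_0 + K)}$, $f_{n_K}(T_{K-1}, y) = f_{n_{K-1}}(T_{K-1}, y)$ and we have:
\[
\sup_{T_{K-1} \le t \le T_K} \|f_{n_K}(t)\| \le \prod_{j=1}^{K} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}} \|f_0\|,
\]
where $K$ is so large that $T_K \ge T$, i.e. $\sum_{j=1}^{K} \frac{1}{3(k_0 + j)} > T$ (this is always possible), and from (26)
\[
\prod_{j=1}^{\infty} \frac{1}{1 - \frac{4N(2r)}{(k_0 + j)^2}} < \frac{1}{1 - \delta}
\]
(since $\sum \frac{1}{n} = \infty$ and $\sum \frac{1}{n^2} < \infty$). This implies that we have:
\[
\sup_{t \in [0, T]} \|F_{k_0}\| \le \frac{1}{1 - \delta} \|f_0\| . \tag{27}
\]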
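The construction rests on two facts: the step lengths $\frac{1}{3(k_0 + j)}$ sum to infinity, so any $T$ is reached after finitely many steps, while the accumulated growth factor stays bounded. The Python sketch below illustrates this for sample values only; `build_partition` is a hypothetical helper, and the parameter `c` merely stands in for $4N(2r)$.

```python
import math


def build_partition(k0, c, T):
    """Return (K, T_K, product) for the first K with T_K >= T.

    c plays the role of 4 N(2r); the sketch assumes c < (k0 + 1)^2
    so that every factor 1 / (1 - c / (k0 + j)^2) is positive.
    """
    t, prod, K = 0.0, 1.0, 0
    while t < T:
        K += 1
        t += 1.0 / (3.0 * (k0 + K))                      # T_K = T_{K-1} + 1/(3(k0 + K))
        prod *= 1.0 / (1.0 - c / (k0 + K) ** 2)          # growth factor of step K
    return K, t, prod


if __name__ == "__main__":
    K, T_K, prod = build_partition(k0=10, c=1.0, T=1.0)
    print(K, T_K, prod, math.exp(1.0 / 10))
```

With these sample values many steps are needed to cover $[0, 1]$, yet the product of growth factors remains below $\exp(c/k_0)$, in line with the role of condition (26).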

Thus we have constructed $F_{k_0}$. We show the convergence of this sequence. By the above construction we have
\[
F_M - F_N = \int_0^t \big[ Q_{m(s)}(f_{m(s)}) - Q_{n(s)}(f_{n(s)}) \big]\, ds,
\]
where $m(s)$ and $n(s)$ are defined by the construction of $F_M$ and $F_N$. We consider
\[
Q_m(f_m) - Q_n(f_n) = Q_m(f_m) - Q_m(f_n) + Q_m(f_n) - Q_n(f_n) .
\]
By (23), Lemma 1 and (20) we get
\[
\Big\| \frac{1}{p^0} Q(R(m,Q) m f_m, R(m,Q) m f_m) - \frac{1}{p^0} Q(R(n,Q) n f_n, R(n,Q) n f_n) \Big\| \le \gamma\, \|f_m - f_n\|,
\]
hence we get
\[
\sup_{t \in [0, T]} \|F_M - F_N\| \le T \gamma \sup_{t \in [0, T]} \|F_M - F_N\| + T \sup_{t \in [0, T]} \|Q_m(f_n) - Q_n(f_n)\| .
\]
Taking $T$ so small that $T \gamma < \frac{1}{2}$ we get by Lemma 4,
\[
\sup_{t \in [0, T]} \|F_M - F_N\| \le 2 T \varepsilon_{M,N} \to 0
\]
with $M, N \to \infty$, which gives
\[
\lim_{k_0 \to \infty} F_{k_0} = f \quad \text{in } C(0, T; L^1(\mathbb{R}^3)),
\]
hence by Lemma 4 we have obtained the solution of (11). From the free choice of $\delta > 0$ we get:
\[
\sup_{t \in [0, T]} \|f(t)\|_y \le \|f(0)\|_y . \tag{28}
\]
By (28) we can continue the solution on the intervals $[T, 2T]$, $[2T, 3T]$, etc. Thus we have constructed the solution for any $T$.

Now to obtain the norm $\|f\|_p$ it is enough to integrate (9) (using $\int Q(f,f)\, \frac{d\bar p}{p^0} = 0$, see [3]) and we get:
\[
\|f(t)\|_y = \|f_0\|_p \tag{29}
\]


from (29) and (8). For $\dot R(0) < 0$ we have
\[
\|f(t)\|_p = \|f_0\|_p \exp\big(3\sqrt{T_{00}(0)}\, t\big), \tag{30}
\]
and for $\dot R(0) > 0$ we have
\[
\|f(t)\|_p = \|f_0\|_p \exp\big(-3\sqrt{T_{00}(0)}\, t\big). \tag{31}
\]
Summing up, $p(t, y)$ defined by (7) together with the solution $f$ of (9) is the solution of (6). And from (10), (11), (30) and (31) we conclude the thesis of the theorem. The uniqueness is obvious. □

Acknowledgements. The author wishes to express his gratitude to Professor Wojciech Zajączkowski for useful discussions and for help during the preparation of the paper.

References

1. Bancel, D., Choquet-Bruhat, Y.: Existence, Uniqueness, and Local Stability for the Einstein–Maxwell–Boltzman System. Commun. Math. Phys. 33, 84–96 (1973)
2. Bernstein, J.: Kinetic Theory in the Expanding Universe. Cambridge: Cambridge University Press, 1988
3. Bichteler, K.: On the Cauchy problem for relativistic Boltzmann equation. Commun. Math. Phys. 4, 352–364 (1967)
4. Di Blasio, G.: Differentiability of Spatially Homogeneous Solution of the Boltzmann Equation in the Non Maxwellian Case. Commun. Math. Phys. 38, 331–340 (1974)
5. Ehlers, J.: Survey of General Relativity Theory. In: Israel, W. (ed.), Relativity, Astrophysics and Cosmology. Dordrecht: Reidel, 1973
6. Mucha, P.B.: The Cauchy Problem for the Einstein–Boltzmann System. J. of Appl. Anal. 4 No 1, 129–141 (1998)

Communicated by H. Araki

Commun. Math. Phys. 203, 119 – 184 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Supersymmetric Quantum Theory and Non-Commutative Geometry

J. Fröhlich1, O. Grandjean2, A. Recknagel3

1 Institut für Theoretische Physik, ETH-Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected]
2 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
3 Institut des Hautes Études Scientifiques, 35, route de Chartres, F-91440 Bures-sur-Yvette, France. E-mail: [email protected]

Received: 1 April 1997 / Accepted: 24 November 1998

Abstract: Classical differential geometry can be encoded in spectral data, such as Connes’ spectral triples, involving supersymmetry algebras. In this paper, we formulate non-commutative geometry in terms of supersymmetric spectral data. This leads to generalizations of Connes’ non-commutative spin geometry encompassing non-commutative Riemannian, symplectic, complex-Hermitian and (Hyper-)Kähler geometry. A general framework for non-commutative geometry is developed from the point of view of supersymmetry and illustrated in terms of examples. In particular, the non-commutative torus and the non-commutative 3-sphere are studied in some detail.

Contents
1 Introduction
2 Spectral Data of Non-Commutative Geometry
  2.1 The N = 1 formulation of non-commutative geometry
  2.2 The N = (1, 1) formulation of non-commutative geometry
  2.3 Hermitian and Kähler non-commutative geometry
  2.4 The N = (4, 4) spectral data
  2.5 Symplectic non-commutative geometry
3 The Non-Commutative 3-Sphere
  3.1 The N = 1 data associated to the 3-sphere
  3.2 The topology of the non-commutative 3-sphere
  3.3 The geometry of the non-commutative 3-sphere
  3.4 Remarks on N = (1, 1)
4 The Non-Commutative Torus
  4.1 The classical torus
  4.2 Spin geometry (N = 1)
  4.3 Riemannian geometry (N = (1, 1))
  4.4 Kähler geometry (N = (2, 2))
5 Directions for Future Work


1. Introduction

The study of highly singular geometrical spaces, such as the space of leaves of certain foliations, of discrete spaces, and the study of quantum theory have led A. Connes to develop a general theory of non-commutative geometry, involving non-commutative measure theory, cyclic cohomology, non-commutative differential topology and spectral calculus [Co1–5]. A broad exposition of his theory and a rich variety of interesting examples can be found in his book [Co1]. Historically, the first examples of non-commutative spaces carrying geometrical structure emerged from non-relativistic quantum mechanics, as discovered by Heisenberg, Born, Jordan, Schrödinger and Dirac. Mathematically speaking, non-relativistic quantum mechanics is the theory of quantum phase spaces, which are non-commutative deformations of certain classical phase spaces (i.e., of certain symplectic manifolds), and it is the theory of dynamics on quantum phase spaces. Geometrical aspects of quantum phase spaces and supersymmetry entered the scene implicitly in Pauli’s theory of the non-relativistic, spinning electron and in the theory of non-relativistic positronium. Later on, mathematicians discovered Pauli’s and Dirac’s theories of the electron as a powerful tool in algebraic topology and differential geometry. In a companion paper [FGR1], hereafter referred to as I, we have described a formulation of classical differential geometry in terms of the spectral data of non-relativistic, supersymmetric quantum theory, in particular in terms of the quantum theory of the non-relativistic electron and of positronium propagating on a general (spin$^c$) Riemannian manifold. The work in I is inspired by Connes’ fundamental work [Co1–5], and by Witten’s work on supersymmetric quantum theory and its applications in algebraic topology [Wi1, 2]; it attempts to merge these two threads of thought.
Additional inspiration has come from the work in [AG, FW, AGF, HKLR] on the relation between index theory and supersymmetric quantum theory and on supersymmetric non-linear σ-models, as well as from the work by Jaffe and co-workers on connections between supersymmetry and cyclic cohomology [Ja1–3]. To elucidate the roots of some of these ideas in Pauli’s non-relativistic quantum theory of the electron and of positronium has proven useful and suggestive of various generalizations. The work described in the present paper has its origins in an attempt to apply the methods of non-commutative geometry to exploring the geometry of string theory, in particular of superstring vacua; see [CF, FG]. In trying to combine quantum theory with the theory of gravitation, one observes that it is impossible to localize events in spacetime arbitrarily precisely, and that, in a compact region of space-time, one can only resolve a finite number of distinct events [DFR]. One may then argue, heuristically, that space-time itself must have quantum-mechanical features at distance scales of the order of the Planck length, and that space-time and matter should be merged into a fundamental quantum theory of space-time-matter. Superstring theory [GSW] is a theoretical framework incorporating some of the features necessary for a unification of quantum theory and the theory of gravitation. Superstring vacua are described by certain superconformal field theories, see e.g. [GSW]. The intention of the program formulated in [CF, FG] is to reconstruct space-time geometry from algebraic data of superconformal field theory. In the study of concrete examples, one observes that, in general, the target spaces (spacetimes) of superconformal field theories are non-commutative geometrical spaces, and the tools of Connes’ non-commutative geometry become essential in describing their geometry. This observation has been confirmed more recently in the theory of D-branes [Pol, Wi4].


The purpose of this paper is to cast some of the tools of non-commutative (differential) geometry into a form that makes connections to supersymmetric quantum theory manifest and that is particularly useful for applications to superconformal field theory. The methods and results of this paper are mathematically precise. Applications to physics are not treated here; but see e.g. [FGR2]. Instead, the general formalism developed in this paper is illustrated by an analysis of the geometry of the non-commutative torus and of the fuzzy 3-sphere; more details can be found in [Gr].

Next, we sketch some of the key ideas underlying our approach to non-commutative geometry; for further background see also part I and [FGR2]. Connes has shown how to formulate classical geometry in terms of algebraic data, so-called spectral triples, involving a commutative algebra $A = C^\infty(M)$ of (smooth) functions on the smooth manifold M under consideration, a Hilbert space H of spinors over M on which the algebra A acts by bounded operators, and a self-adjoint Dirac operator D on H satisfying certain properties with respect to A. As explained in [Co1], it is possible to extract complete geometrical information about M from the spectral triple (A, H, D).

The definition of spectral triples involves, in the classical case, a Clifford action on certain vector bundles over M, e.g. the spinor bundle or the bundle of differential forms. As was recalled in ref. I, the latter bundle actually carries two anti-commuting Clifford actions – which can be used to define two Dirac-Kähler operators, $D$ and $\bar D$. It turns out that the algebraic relations between these operators are precisely those of the two supercharges of N = (1, 1) supersymmetric quantum mechanics (see part I, especially Sect. 3, for the precise meaning of the terminology): These relations are $\{ D, \bar D \} = 0$ and $D^2 = \bar D^2$. The commutators $[ D, a ]$ and $[ \bar D, a ]$, for arbitrary a ∈ A, extend to bounded operators (anti-commuting sections of two Clifford bundles) acting on the Hilbert space H of square-integrable differential forms. Furthermore, if the underlying manifold M is compact, the operator $\exp(-\varepsilon D^2)$ is trace-class for any ε > 0. One may then introduce a nilpotent operator $d := D - i \bar D$, which turns out to correspond to exterior differentiation of differential forms. From the N = (1, 1) supersymmetric spectral data $(A, H, D, \bar D)$ just described, one can reconstruct the de Rham-Hodge theory and the Riemannian geometry of smooth (compact) Riemannian manifolds. N = (1, 1) supersymmetric spectral data are a variant of Connes’ approach involving spectral triples. They are very natural from the point of view of supersymmetric quantum theory and encode the differential geometry of Riemannian manifolds (not required to be spin$^c$ manifolds).

In a formulation of differential geometry in terms of spectral data $(A, H, D, \bar D, \ldots)$ with supersymmetry, additional geometrical structures, e.g. a symplectic or complex structure, appear in the form of global gauge symmetries commuting with the elements of A but acting non-trivially on the Dirac-Kähler operators $D$ and $\bar D$; see part I. For example, a global gauge symmetry group containing U(1) × U(1) generates four Dirac-Kähler operators – the “supercharges” of N = (2, 2) supersymmetry – from $D$ and $\bar D$ and identifies the underlying manifold M as a Kähler manifold. A global gauge symmetry group containing SU(2) × SU(2) leads to eight supercharges generating an N = (4, 4) supersymmetry algebra and is characteristic of Hyperkähler geometry; see also [AGF, HKLR]. Complex-Hermitian and symplectic geometry are encoded in N = (2, 2) supersymmetric spectral data with partially broken supersymmetry. A systematic classification of different types of differential geometry in terms of supersymmetric


spectral data extending the N = (1, 1) data of Riemannian geometry has been described in I (see Sect. I 3 for an overview, and [FGR2]).

In this paper, we generalize these results from classical to non-commutative geometry, starting from the simple prescription to replace the commutative algebra of functions $C^\infty(M)$ over a classical manifold by a general, possibly non-commutative *-algebra A satisfying certain properties.

Section 2 contains general definitions and introduces various kinds of spectral data: We start with an exposition of Connes’ non-commutative spin geometry; most of the material can be found in [Co1], but we add some details on metric aspects ranging from connections over curvature and torsion to non-commutative Cartan structure equations. In Subsect. 2.2, we introduce spectral data with N = (1, 1) supersymmetry that naturally lead to a non-commutative analogue of the de Rham complex of differential forms. Moreover, this “Riemannian” formulation of non-commutative geometry allows for immediate specializations to spectral data with extended supersymmetry – which, in the classical case, correspond to manifolds carrying complex, Kähler, Hyperkähler or symplectic structures. Spectral data with higher supersymmetry are treated in Subsects. 2.3–2.5. In Subsect. 2.2.5, we discuss the relationship between spectral triples, as defined by Connes, and spectral data with N = (1, 1) supersymmetry: Whereas in the classical case, one can always pass from one description of a smooth manifold to the other, the situation is not quite as clear in the non-commutative framework. We propose a procedure how to construct N = (1, 1) data from a spectral triple – heavily relying on Connes’ notion of a real structure [Co4] – but the construction is not complete for general spectral triples. Furthermore, Subsect. 2.2.6 contains proposals for definitions of non-commutative manifolds and non-commutative phase spaces, as suggested by the study of N = (1, 1) spectral data and by notions from quantum physics.

In Sects. 3 and 4 we discuss two examples of non-commutative spaces, namely the “fuzzy 3-sphere” and the non-commutative torus. The choice of the latter example does not require further explanation since it is one of the classic examples of a non-commutative space; see e.g. [Co1, Co5, Ri]. Here we add a description of the non-commutative 2-torus in terms of spectral data with N = (1, 1) and N = (2, 2) supersymmetry, thus showing that this space can be endowed with a non-commutative Riemannian and a non-commutative Kähler structure. This is not too surprising, since the non-commutative torus can be regarded as a deformation of the classical flat torus. The calculations in Sect. 4 also provide an example where the general ideas of Subsect. 2.2.5 on how to construct N = (1, 1) from N = 1 spectral data can be carried out completely.

The other example, the non-commutative 3-sphere discussed in Sect. 3 (see also [Gr]), represents a generalization of another prototype non-commutative geometrical space, namely the fuzzy 2-sphere [Ber, Ho, Ma, GKP]. We choose to study the 3-sphere for the following reasons: First, in contrast to the fuzzy 2-sphere and the non-commutative torus, it cannot be viewed as a quantization of a classical phase space. Second, it is the simplest example of a series of quantized spaces arising from so-called Wess–Zumino–Witten models – conformal field theories associated to non-linear σ-models with compact simple Lie groups as target manifolds, see [Wi3]. There is reason to expect that the spectral data arising from other WZW-models – see [FG, FGR2] for a discussion – can be treated essentially by the same methods as the fuzzy 3-sphere associated to the group SU(2).
In view of the conformal field theory origin, one is led to conjecture that, as a non-commutative space, the non-commutative 3-sphere describes the non-commutative geometry of the quantum group $U_q(sl_2)$, for $q = \exp(2\pi i/(k + 2))$, where $k \in \mathbb{Z}_+$ is the level of the WZW-model. The parameter k appears in the spectral data of the non-commutative 3-sphere in a natural way. One may expect that the fuzzy 3-sphere can actually be defined for arbitrary values of this parameter, since the same is true for the quantum group. As in the example of the non-commutative torus with rational deformation parameter, a truncation of the algebra of “functions” occurs for the special values $k \in \mathbb{Z}_+$, leading to the finite-dimensional matrix algebras used in Sect. 3.

In Sect. 5, we conclude with a list of open problems arising naturally from our discussion. In particular, we briefly comment on other, string theory motivated applications of non-commutative geometry; see also [FG, FGR2].

The present text is meant as a companion paper to I: Now and then, we will permit ourselves to refer to [FGR1] for technical details of proofs which proceed analogously to the classical case. More importantly, the study of classical geometry in part I provides the best justification – besides the one of naturality – of the expectation that our classification of (non-commutative) geometries according to the supersymmetry content of the spectral data leads to useful and fruitful definitions of non-commutative geometrical structure.

2. Spectral Data of Non-Commutative Geometry

In the following, we generalize the notions of part I from classical differential geometry to the non-commutative setting. The classification of geometrical structure according to the “supersymmetry content” of the relevant spectral data, which was uncovered in [FGR1], will be our guiding principle. In the first part, we review Connes’ formulation of non-commutative geometry using a single generalized Dirac operator, whereas, in the following subsections, spectral data with realizations of some genuine supersymmetry algebras will be introduced, allowing us to define non-commutative generalizations of Riemannian, complex, Kähler and Hyperkähler, as well as of symplectic geometry.

2.1. The N = 1 formulation of non-commutative geometry.
This section is devoted to the non-commutative generalization of an algebraic description of spin geometry – and, according to the results of Sect. I 2, of general Riemannian geometry – following the ideas of Connes [Co1]. The first two subsections contain the definition of abstract N = 1 spectral data and of differential forms. In Subsect. 2.1.3, we describe a notion of integration which leads us to a definition of square integrable differential forms. After having introduced vector bundles and Hermitian structures in Subsect. 2.1.4, we show in Subsect. 2.1.5 that the module of square integrable forms always carries a generalized Hermitian structure. We then define connections, torsion, and Riemannian, Ricci and scalar curvature in the next two subsections. Finally, in 2.1.8, we derive noncommutative Cartan structure equations. Although much of the material in Sect. 2.1 is contained (partly in much greater detail) in Connes’ book [Co1], it is reproduced here because it is basic for our analysis in later sections and because we wish to make this paper accessible to non-experts. 2.1.1. The N = 1 spectral data. Definition 2.1. A quadruple (A, H, D, γ) will be called a set of N = 1 (even) spectral data if 1) H is a separable Hilbert space; 2) A is a unital ∗ -algebra acting faithfully on H by bounded operators; 3) D is a self-adjoint operator on H such that i) for each a ∈ A, the commutator [ D, a ] defines a bounded operator on H,


ii) the operator $\exp(-\varepsilon D^2)$ is trace class for all $\varepsilon > 0$;
4) γ is a $\mathbb{Z}_2$-grading on H, i.e., $\gamma = \gamma^* = \gamma^{-1}$, such that $\{ \gamma, D \} = 0$, $[ \gamma, a ] = 0$ for all a ∈ A.

As mentioned before, in non-commutative geometry A plays the role of the “algebra of functions over a non-commutative space”. The existence of a unit in A, together with property 3 ii) above, reflects the fact that we are dealing with “compact” non-commutative spaces. Note that if the Hilbert space H is infinite-dimensional, condition 3 ii) implies that the operator D is unbounded. By analogy with classical differential geometry, D is interpreted as a (generalized) Dirac operator.

Also note that the fourth condition in Definition 2.1 does not impose any restriction on N = 1 spectral data: In fact, given a triple $(\tilde A, \tilde H, \tilde D)$ satisfying Properties 1–3 from above, we can define a set of N = 1 even spectral data (A, H, D, γ) by setting
\[
A = \tilde A \otimes 1_2, \qquad H = \tilde H \otimes \mathbb{C}^2, \qquad D = \tilde D \otimes \tau_1, \qquad \gamma = 1_{\tilde H} \otimes \tau_3,
\]
where $\tau_i$ are the Pauli matrices acting on $\mathbb{C}^2$.

2.1.2. Differential forms. The construction of differential forms follows the same lines as in classical differential geometry: We define the unital, graded, differential *-algebra of universal forms, $\Omega^\bullet(A)$, as in [Co1, CoK]:
\[
\Omega^\bullet(A) = \bigoplus_{k=0}^{\infty} \Omega^k(A), \qquad \Omega^k(A) := \Big\{ \sum_{i=1}^{N} a_0^i\, \delta a_1^i \cdots \delta a_k^i \;\Big|\; N \in \mathbb{N},\ a_j^i \in A \Big\}, \tag{2.1a}
\]

where δ is an abstract linear operator satisfying $\delta^2 = 0$ and the Leibniz rule. Note that, even in the classical case where $A = C^\infty(M)$ for some smooth manifold M, no relations ensuring (graded) commutativity of $\Omega^\bullet(A)$ are imposed. The complex conjugation of functions over M is now to be replaced by the *-operation of A. We define
\[
(\delta a)^* = -\delta(a^*) \tag{2.1b}
\]
for all a ∈ A. With the help of the (self-adjoint) generalized Dirac operator D, we introduce a *-representation π of $\Omega^\bullet(A)$ on H,
\[
\pi(a) = a, \qquad \pi(\delta a) = [ D, a ],
\]
cf. [Co1] or Eq. (I 2.12). A graded *-ideal J of $\Omega^\bullet(A)$ is defined by
\[
J := \bigoplus_{k=0}^{\infty} J^k, \qquad J^k := \ker \pi |_{\Omega^k(A)} . \tag{2.2}
\]
Since J is not a differential ideal, the graded quotient $\Omega^\bullet(A)/J$ does not define a differential algebra and thus does not yield a satisfactory definition of the algebra of differential forms. This problem is solved as in the classical case.
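The doubling construction that turns an ungraded triple into N = 1 even spectral data, and the rule π(δa) = [D, a], can be illustrated with small matrices. The following Python sketch is a toy finite-dimensional example, not taken from the paper: the matrices `D_tilde`, `a_tilde` and `b_tilde` are arbitrary assumed choices (a symmetric operator and two diagonal "functions" on a two-point space), and the helper functions are hypothetical. Integer entries keep all comparisons exact.

```python
def matmul(X, Y):
    """Matrix product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def kron(X, Y):
    """Kronecker (tensor) product X (x) Y."""
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]


def comm(X, Y):
    """Commutator [X, Y]."""
    return sub(matmul(X, Y), matmul(Y, X))


def anticomm(X, Y):
    """Anti-commutator {X, Y}."""
    return add(matmul(X, Y), matmul(Y, X))


def zero(n):
    return [[0] * n for _ in range(n)]


I2 = [[1, 0], [0, 1]]
TAU1 = [[0, 1], [1, 0]]    # Pauli matrix tau_1
TAU3 = [[1, 0], [0, -1]]   # Pauli matrix tau_3

# Assumed toy data: a symmetric "Dirac operator" and two diagonal algebra elements.
D_tilde = [[1, 2], [2, -1]]
a_tilde = [[3, 0], [0, 5]]
b_tilde = [[2, 0], [0, 7]]

# Doubling: D = D_tilde (x) tau_1, gamma = 1 (x) tau_3, a -> a (x) 1_2.
D = kron(D_tilde, TAU1)
gamma = kron(I2, TAU3)
a = kron(a_tilde, I2)
b = kron(b_tilde, I2)

if __name__ == "__main__":
    print(anticomm(gamma, D) == zero(4))   # {gamma, D} = 0
    print(comm(gamma, a) == zero(4))       # [gamma, a] = 0
    # Leibniz rule for pi(delta a) = [D, a]
    print(comm(D, matmul(a, b)) == add(matmul(comm(D, a), b), matmul(a, comm(D, b))))
```

In this toy setting the grading anti-commutes with D and commutes with the algebra, as required in Definition 2.1, and the commutator with D satisfies the Leibniz rule, which is what makes π compatible with the product of universal forms.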


Proposition 2.2. ([Co1]) The graded sub-complex
\[
J + \delta J = \bigoplus_{k=0}^{\infty} \big( J^k + \delta J^{k-1} \big),
\]
where $J^{-1} := 0$ and δ is the universal differential in $\Omega^\bullet(A)$, is a two-sided graded differential *-ideal of $\Omega^\bullet(A)$.

We define the unital graded differential *-algebra of differential forms, $\Omega_D^\bullet(A)$, as the graded quotient $\Omega^\bullet(A)/(J + \delta J)$, i.e.,
\[
\Omega_D^\bullet(A) := \bigoplus_{k=0}^{\infty} \Omega_D^k(A), \qquad \Omega_D^k(A) := \Omega^k(A)/(J^k + \delta J^{k-1}) . \tag{2.3}
\]
Since $\Omega_D^\bullet(A)$ is a graded algebra, each $\Omega_D^k(A)$ is, in particular, a bi-module over $A = \Omega_D^0(A)$. Note that π does not determine a representation of the algebra (or, for that matter, of the space) of differential forms $\Omega_D^\bullet(A)$ on the Hilbert space H: A differential k-form is an equivalence class $[\omega] \in \Omega_D^k(A)$ with some representative $\omega \in \Omega^k(A)$, and π maps this class to a set of bounded operators on H, namely
\[
\pi\big([\omega]\big) = \pi(\omega) + \pi\big(\delta J^{k-1}\big) .
\]
In general, the only subspaces where we do not meet this complication are $\pi\big(\Omega_D^0(A)\big) = A$ and $\pi\big(\Omega_D^1(A)\big) \cong \pi\big(\Omega^1(A)\big)$. However, the image of $\Omega_D^\bullet(A)$ under π is $\mathbb{Z}_2$-graded,
\[
\pi\big(\Omega_D^\bullet(A)\big) = \pi\Big( \bigoplus_{k=0}^{\infty} \Omega_D^{2k}(A) \Big) \oplus \pi\Big( \bigoplus_{k=0}^{\infty} \Omega_D^{2k+1}(A) \Big),
\]
because of the (anti-)commutation properties of the $\mathbb{Z}_2$-grading γ on H, see Definition 2.1.

2.1.3. Integration. Property 3 ii) of the Dirac operator in Definition 2.1 allows us to define the notion of integration over a non-commutative space in the same way as in the classical case, see part I. Note that, for certain sets of N = 1 spectral data, we could use the Dixmier trace, as Connes originally proposed; but the definition given below, first introduced in [CFF], works in greater generality (cf. the remarks in Sect. I 2.1.3). Moreover, it is closer to notions coming up naturally in quantum field theory.

Definition 2.3. The integral over the non-commutative space described by the N = 1 spectral data (A, H, D, γ) is a state $\int\!\!\!\!\!-$ on $\pi\big(\Omega^\bullet(A)\big)$ defined by
\[
\int\!\!\!\!\!- : \pi\big(\Omega^\bullet(A)\big) \longrightarrow \mathbb{C}, \qquad \omega \longmapsto \int\!\!\!\!\!-\, \omega := \mathrm{Lim}_{\varepsilon \to 0^+} \frac{\mathrm{Tr}_H\big( \omega\, e^{-\varepsilon D^2} \big)}{\mathrm{Tr}_H\big( e^{-\varepsilon D^2} \big)},
\]
where $\mathrm{Lim}_{\varepsilon \to 0^+}$ denotes some limiting procedure making the functional $\int\!\!\!\!\!-$ linear and positive semi-definite; the existence of such a procedure can be shown analogously to [Co1, 3], where the Dixmier trace is discussed.
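In a finite-dimensional toy model the heat-kernel regularized integral of Definition 2.3 can be evaluated directly. The sketch below assumes, purely for illustration, that D is diagonal with the listed eigenvalues and that ω is given as a matrix in the same basis; the function name `integral` and the sample data are hypothetical. In this setting the ordinary limit ε → 0⁺ exists and reduces to the normalized trace, whereas in the infinite-dimensional case a generalized limiting procedure is needed, as stated in the definition.

```python
import math


def integral(omega, eigs, eps):
    """Tr(omega e^{-eps D^2}) / Tr(e^{-eps D^2}) for D = diag(eigs).

    Only the diagonal of omega contributes, because e^{-eps D^2} is diagonal.
    """
    weights = [math.exp(-eps * lam * lam) for lam in eigs]
    numerator = sum(omega[i][i] * w for i, w in enumerate(weights))
    return numerator / sum(weights)


if __name__ == "__main__":
    eigs = [-2.0, -1.0, 1.0, 2.0]          # assumed spectrum of D
    omega = [[1.0, 0.5, 0.0, 0.0],
             [0.5, 2.0, 0.0, 0.0],
             [0.0, 0.0, 3.0, 0.0],
             [0.0, 0.0, 0.0, 6.0]]         # assumed bounded operator
    for eps in (1.0, 1e-3, 1e-9):
        print(eps, integral(omega, eigs, eps))
```

As ε decreases, the value approaches the normalized trace of ω, which is a cyclic state; this is the finite-dimensional shadow of the cyclicity property required of the integral in the text.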


J. Fröhlich, O. Grandjean, A. Recknagel

For this integral $\fint$ to be a useful tool, we need an additional property that must be checked in each example:

Assumption 2.4. The state $\fint$ on $\pi(\Omega^\bullet(A))$ is cyclic, i.e.,

$$\fint \omega\, \eta^* = \fint \eta^*\, \omega$$

for all $\omega, \eta \in \pi(\Omega^\bullet(A))$.

The state $\fint$ determines a positive semi-definite sesqui-linear form on $\Omega^\bullet(A)$ by setting

$$(\omega, \eta) := \fint \pi(\omega)\, \pi(\eta)^* \tag{2.4}$$

for all $\omega, \eta \in \Omega^\bullet(A)$. In the formulas below, we will often drop the representation symbol $\pi$ under the integral, as there is no danger of confusion. Note that the commutation relations of the grading $\gamma$ with the Dirac operator imply that forms of odd degree are orthogonal to those of even degree with respect to $(\cdot,\cdot)$. By $K^k$ we denote the kernel of this sesqui-linear form restricted to $\Omega^k(A)$. More precisely we set

$$K := \bigoplus_{k=0}^{\infty} K^k, \qquad K^k := \{\, \omega \in \Omega^k(A) \mid (\omega, \omega) = 0 \,\}. \tag{2.5}$$

Obviously, $K^k$ contains the ideal $J^k$ defined in Eq. (2.2); in the classical case they coincide. Assumption 2.4 is needed to show that $K$ is a two-sided ideal of the algebra of universal forms, so that we can pass to the quotient algebra.

Proposition 2.5. The set $K$ is a two-sided graded $*$-ideal of $\Omega^\bullet(A)$.

Proof. The Cauchy–Schwarz inequality for states implies that $K$ is a vector space. If $\omega \in K^k$, then Assumption 2.4 gives

$$(\omega^*, \omega^*) = \fint \pi(\omega)^* \pi(\omega) = \fint \pi(\omega)\pi(\omega)^* = 0,$$

i.e. $K$ is closed under the involution $*$. With $\omega$ as above and $\eta \in \Omega^p(A)$, we have that

$$(\eta\omega, \eta\omega) = \fint \pi(\eta)\pi(\omega)\pi(\omega)^*\pi(\eta)^* = \fint \pi(\omega)^*\pi(\eta)^*\pi(\eta)\pi(\omega) \le \|\pi(\eta)\|_H^2 \fint \pi(\omega)^*\pi(\omega) = 0,$$

where $\|\cdot\|_H$ is the operator norm on $B(H)$. On the other hand, we have that

$$(\omega\eta, \omega\eta) = \fint \pi(\omega)\pi(\eta)\pi(\eta)^*\pi(\omega)^* \le \|\pi(\eta)\|_H^2 \fint \pi(\omega)\pi(\omega)^* = 0,$$

and it follows that both $\omega\eta$ and $\eta\omega$ are elements of $K$, i.e., $K$ is a two-sided ideal.
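The first step of the proof of Proposition 2.5 can be spelled out as follows. This is only a sketch, assuming that $(\cdot,\cdot)$ is positive semi-definite and linear in its first argument, so that the Cauchy–Schwarz inequality applies; $\lambda \in \mathbb{C}$ is an arbitrary scalar introduced here for illustration. For $\omega, \eta \in K^k$:

```latex
\begin{aligned}
|(\omega,\eta)|^2 &\le (\omega,\omega)\,(\eta,\eta) = 0
  \quad\Longrightarrow\quad (\omega,\eta) = 0, \\
(\omega + \lambda\eta,\ \omega + \lambda\eta)
  &= (\omega,\omega) + 2\,\mathrm{Re}\big(\bar\lambda\,(\omega,\eta)\big)
     + |\lambda|^2 (\eta,\eta) = 0 .
\end{aligned}
```

Hence $\omega + \lambda\eta \in K^k$, which is the vector space property used in the proof.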

Supersymmetric Quantum Theory and Non-Commutative Geometry


We now define

$$\tilde\Omega^\bullet(A) := \bigoplus_{k=0}^{\infty} \tilde\Omega^k(A), \qquad \tilde\Omega^k(A) := \Omega^k(A)/K^k. \tag{2.6}$$

The sesqui-linear form $(\cdot,\cdot)$ descends to a positive definite scalar product on $\tilde\Omega^k(A)$, and we denote by $\tilde H^k$ the Hilbert space completion of this space with respect to the scalar product,

$$\tilde H^\bullet := \bigoplus_{k=0}^{\infty} \tilde H^k, \qquad \tilde H^k := \overline{\tilde\Omega^k(A)}^{\,(\cdot,\cdot)}. \tag{2.7}$$

$\tilde H^k$ is to be interpreted as the space of square-integrable $k$-forms. Note that $\tilde H^\bullet$ does not in general coincide with the Hilbert space that would arise from a GNS construction using the state $\fint$ on $\tilde\Omega^\bullet(A)$: Whereas in $\tilde H^\bullet$, orthogonality of forms of different degree is installed by definition, there may exist forms of even degree (or odd forms) in the GNS Hilbert space that have different degrees but are not orthogonal.

Corollary 2.6. The space $\tilde\Omega^\bullet(A)$ is a unital graded $*$-algebra. For any $\omega \in \tilde\Omega^k(A)$, the left and right actions of $\omega$ on $\tilde\Omega^p(A)$ with values in $\tilde\Omega^{p+k}(A)$,

$$m_L(\omega)\eta := \omega\eta, \qquad m_R(\omega)\eta := \eta\omega,$$

are continuous in the norm given by $(\cdot,\cdot)$. In particular, the Hilbert space $\tilde H^\bullet$ is a bi-module over $\tilde\Omega^\bullet(A)$ with continuous actions.

Proof. The claim follows immediately from the two estimates given in the proof of the previous proposition, applied to $\omega \in \tilde\Omega^k(A)$ and $\eta \in \tilde\Omega^p(A)$.

This remark shows that $\tilde\Omega^\bullet(A)$ and $\tilde H^\bullet$ are "well-behaved" with respect to the $\tilde\Omega^\bullet(A)$-action. Furthermore, Corollary 2.6 will be useful for our discussion of curvature and torsion in Subsects. 2.1.7 and 2.1.8.

Since the algebra $\tilde\Omega^\bullet(A)$ may fail to be differential, we introduce the unital graded differential $*$-algebra of square-integrable differential forms $\tilde\Omega_D^\bullet(A)$ as the graded quotient of $\Omega^\bullet(A)$ by $K + \delta K$,

$$\tilde\Omega_D^\bullet(A) := \bigoplus_{k=0}^{\infty} \tilde\Omega_D^k(A), \qquad \tilde\Omega_D^k(A) := \Omega^k(A)/(K^k + \delta K^{k-1}) \cong \tilde\Omega^k(A)/\delta K^{k-1}. \tag{2.8}$$

In order to show that $\tilde\Omega_D^\bullet(A)$ has the stated properties, one repeats the proof of Proposition 2.2. Note that we can regard the $A$-bi-module $\tilde\Omega_D^\bullet(A)$ as a "smaller version" of $\Omega_D^\bullet(A)$ in the sense that there exists a projection from the latter onto the former; whenever one deals with a concrete set of N = 1 spectral data that satisfy Assumption 2.4, it will be advantageous to work with the "smaller" algebra of square-integrable differential forms. The algebra $\Omega_D^\bullet(A)$, on the other hand, can be defined for arbitrary data.

In the classical case, differential forms are identified with the orthogonal complement of $\mathrm{Cl}^{(k-2)}$ within $\mathrm{Cl}^{(k)}$, see [Co1] and the remarks in part I, after Eq. (I 2.15). Now, we use the scalar product $(\cdot,\cdot)$ on $\tilde H^k$ to introduce, for each $k \ge 1$, the orthogonal projection


$$P_{\delta K^{k-1}} : \tilde H^k \longrightarrow \tilde H^k \tag{2.9}$$

onto the image of $\delta K^{k-1}$ in $\tilde H^k$, and we set

$$\omega^\perp := (1 - P_{\delta K^{k-1}})\, \omega \in \tilde H^k \tag{2.10}$$

for each element $[\omega] \in \tilde\Omega_D^k(A)$. This allows us to define a positive definite scalar product on $\tilde\Omega_D^k(A)$ via the representative $\omega^\perp$:

$$([\omega], [\eta]) := (\omega^\perp, \eta^\perp) \tag{2.11}$$

for all $[\omega], [\eta] \in \tilde\Omega_D^k(A)$. In the classical case, this is just the usual inner product on the space of square-integrable $k$-forms.

2.1.4. Vector bundles and Hermitian structures. Again, we simply follow the algebraic formulation of classical differential geometry in order to generalize the notion of a vector bundle to the non-commutative case:

Definition 2.7 ([Co1]). A vector bundle $E$ over the non-commutative space described by the N = 1 spectral data $(A, H, D, \gamma)$ is a finitely generated projective left $A$-module.

Recall that a module $E$ is projective if there exists another module $F$ such that the direct sum $E \oplus F$ is free, i.e., $E \oplus F \cong A^n$ as left $A$-modules, for some $n \in \mathbb{N}$. Since $A$ is an algebra, every $A$-module is a vector space; therefore, left $A$-modules are representations of the algebra $A$, and $E$ is projective iff there exists a module $F$ such that $E \oplus F$ is isomorphic to a multiple of the left-regular representation. By Swan's Lemma [Sw], a finitely generated projective left module corresponds, in the commutative case, to the space of sections of a vector bundle. With this in mind, it is straightforward to define the notion of a Hermitian structure over a vector bundle:

Definition 2.8 ([Co1]). A Hermitian structure over a vector bundle $E$ is a sesqui-linear map (linear in the first argument) $\langle\cdot,\cdot\rangle : E \times E \longrightarrow A$ such that for all $a, b \in A$ and all $s, t \in E$,

1) $\langle as, bt \rangle = a \langle s, t \rangle b^*$;

2) $\langle s, s \rangle \ge 0$;

3) the $A$-linear map

$$g : E \longrightarrow E_R^*, \qquad s \longmapsto \langle s, \cdot\, \rangle,$$

where $E_R^* := \{\, \phi \in \operatorname{Hom}(E, A) \mid \phi(as) = \phi(s)a^* \,\}$, is an isomorphism of left $A$-modules, i.e., $g$ can be regarded as a metric on $E$.

2.1.5. Generalized Hermitian structure on $\tilde\Omega^k(A)$. In this section we show that the $A$-bi-modules $\tilde\Omega^k(A)$ carry Hermitian structures in a slightly generalized sense. Let $\bar A$ be the weak closure of the algebra $A$ acting on $\tilde H^0$, i.e., $\bar A$ is the von Neumann algebra generated by $\tilde\Omega^0(A)$ acting on the Hilbert space $\tilde H^0$.


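Theorem 2.9. There is a canonically defined sesqui-linear map

$$\langle\cdot,\cdot\rangle_D : \tilde\Omega^k(A) \times \tilde\Omega^k(A) \longrightarrow \bar A$$

such that for all $a, b \in A$ and all $\omega, \eta \in \tilde\Omega^k(A)$,

1) $\langle a\omega, b\eta \rangle_D = a \langle \omega, \eta \rangle_D\, b^*$;

2) $\langle \omega, \omega \rangle_D \ge 0$;

3) $\langle \omega a, \eta \rangle_D = \langle \omega, \eta a^* \rangle_D$.

We call $\langle\cdot,\cdot\rangle_D$ a generalized Hermitian structure on $\tilde\Omega^k(A)$. It is the non-commutative analogue of the Riemannian metric on the bundle of differential forms. Note that $\langle\cdot,\cdot\rangle_D$ takes values in $\bar A$ and thus Property 3) of Definition 2.8 is not directly applicable.

Proof. Let $\omega, \eta \in \tilde\Omega^k(A)$ and define the $\mathbb{C}$-linear map

$$\varphi_{\omega,\eta}(a) = \fint a\, \eta\, \omega^*$$

for all $a \in \tilde\Omega^0(A)$. Note that $a$ on the rhs actually is a representative in $A$ of the class $a \in \tilde\Omega^0(A)$, and analogously for $\omega$ and $\eta$ (and we have omitted the representation symbol $\pi$). The value of the integral is, however, independent of the choice of these representatives, which is why we used the same letters. The map $\varphi$ satisfies

$$|\varphi_{\omega,\eta}(a)| \le \Big(\fint a a^*\Big)^{1/2} \Big(\fint \omega\eta^*\eta\,\omega^*\Big)^{1/2} \le (a, a)^{1/2} \Big(\fint \omega\eta^*\eta\,\omega^*\Big)^{1/2}.$$

Therefore, $\varphi_{\omega,\eta}$ extends to a bounded linear functional on $\tilde H^0$, and there exists an element $\langle \omega, \eta \rangle_D \in \tilde H^0$ such that

$$\varphi_{\omega,\eta}(x) = (x, \langle \omega, \eta \rangle_D)$$

for all $x \in \tilde H^0$; since $(\cdot,\cdot)$ is non-degenerate, $\langle \omega, \eta \rangle_D$ is a well-defined element; but it remains to show that it also acts as a bounded operator on this Hilbert space. To this end, choose a net $\{a_\iota\} \subset \tilde\Omega^0(A)$ which converges to $\langle \omega, \eta \rangle_D$. Then, for all $b, c \in \tilde\Omega^0(A)$,

$$( \langle \omega, \eta \rangle_D\, b, c ) = \lim_{\iota\to\infty} ( a_\iota b, c ) = \lim_{\iota\to\infty} \fint a_\iota b c^* = \lim_{\iota\to\infty} \fint a_\iota (c b^*)^* = \lim_{\iota\to\infty} ( a_\iota, c b^* ) = ( \langle \omega, \eta \rangle_D, c b^* ),$$

and it follows that

$$\begin{aligned}
|( \langle \omega, \eta \rangle_D\, b, c )| &= |( \langle \omega, \eta \rangle_D, c b^* )| = |( c b^*, \langle \omega, \eta \rangle_D )| \\
&= \Big| \fint c b^* \eta\, \omega^* \Big| = \Big| \fint \omega^* c b^* \eta \Big| = \Big| \fint b^* \eta\, \omega^* c \Big| \\
&\le \Big(\fint b^* b\Big)^{1/2} \Big(\fint c^* \omega \eta^* \eta\, \omega^* c\Big)^{1/2} \le \|\omega\eta^*\|_H \Big(\fint b^* b\Big)^{1/2} \Big(\fint c^* c\Big)^{1/2} \\
&\le \|\omega\eta^*\|_H\, (b, b)^{1/2} (c, c)^{1/2}.
\end{aligned}$$

In the third line, we first use the Cauchy–Schwarz inequality for the positive state $\fint$, and then an estimate which is true for all positive operators on a Hilbert space; the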


upper bound $\|\omega\eta^*\|_H$ again involves representatives $\omega, \eta \in \pi(\Omega^k(A))$, which was not explicitly indicated above, since any two will do. As $\tilde\Omega^0(A)$ is dense in $\tilde H^0$, we see that $\langle \omega, \eta \rangle_D$ indeed defines a bounded operator in $\tilde H^0$, which, by definition, is the weak limit of elements in $\tilde\Omega^0(A)$, i.e., it belongs to $\bar A$. Properties 1-3 of $\langle\cdot,\cdot\rangle_D$ are easy to verify.

Note that the definition of the metric $\langle\cdot,\cdot\rangle_D$ given here differs slightly from the one of refs. [CFF, CFG]. One can, however, show that in the N = 1 case both definitions agree; moreover, the present one is better suited for the N = (1, 1) formulation to be introduced later.

2.1.6. Connections.

Definition 2.10. A connection $\nabla$ on a vector bundle $E$ over a non-commutative space is a $\mathbb{C}$-linear map

$$\nabla : E \longrightarrow \tilde\Omega_D^1(A) \otimes_A E$$

such that

$$\nabla(as) = \delta a \otimes s + a \nabla s$$

for all $a \in A$ and all $s \in E$.

Given a vector bundle $E$, we define a space of $E$-valued differential forms by

$$\tilde\Omega_D^\bullet(E) := \tilde\Omega_D^\bullet(A) \otimes_A E\,;$$

if $\nabla$ is a connection on $E$, then it extends uniquely to a $\mathbb{C}$-linear map, again denoted $\nabla$,

$$\nabla : \tilde\Omega_D^\bullet(E) \longrightarrow \tilde\Omega_D^{\bullet+1}(E) \tag{2.12}$$

such that

$$\nabla(\omega s) = \delta\omega\, s + (-1)^k\, \omega\, \nabla s \tag{2.13}$$

for all $\omega \in \tilde\Omega_D^k(A)$ and all $s \in \tilde\Omega_D^\bullet(E)$.

Definition 2.11. The curvature of a connection $\nabla$ on a vector bundle $E$ is given by

$$R(\nabla) = -\nabla^2 : E \longrightarrow \tilde\Omega_D^2(A) \otimes_A E.$$

Note that the curvature extends to a map

$$R(\nabla) : \tilde\Omega_D^\bullet(E) \longrightarrow \tilde\Omega_D^{\bullet+2}(E),$$

which is left $A$-linear, as follows easily from Eq. (2.12) and Definition 2.10.

Definition 2.12. A connection $\nabla$ on a Hermitian vector bundle $(E, \langle\cdot,\cdot\rangle)$ is called unitary if

$$\delta \langle s, t \rangle = \langle \nabla s, t \rangle - \langle s, \nabla t \rangle$$

for all $s, t \in E$, where the rhs of this equation is defined by

$$\langle \omega \otimes s, t \rangle = \omega \langle s, t \rangle, \qquad \langle s, \eta \otimes t \rangle = \langle s, t \rangle\, \eta^* \tag{2.14}$$

for all $\omega, \eta \in \tilde\Omega_D^1(A)$ and all $s, t \in E$.
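The left $A$-linearity of the curvature in Definition 2.11 can be checked directly. The following is a short sketch using only the Leibniz rule of Definition 2.10, its graded extension, and $\delta^2 = 0$; here $a \in A$ and $s \in E$:

```latex
\begin{aligned}
-\nabla^2 (a s)
  &= -\nabla\big( \delta a \otimes s + a\,\nabla s \big) \\
  &= -\big( \delta^2 a \otimes s - \delta a\,\nabla s
            + \delta a\,\nabla s + a\,\nabla^2 s \big) \\
  &= a\,\big( -\nabla^2 s \big).
\end{aligned}
```

The sign $(-1)^1$ in front of $\delta a\, \nabla s$ comes from moving $\nabla$ past the 1-form $\delta a$, and it is exactly this sign that cancels the inhomogeneous term.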


2.1.7. Riemannian curvature and torsion. Throughout this section, we make three additional assumptions which limit the generality of our results, but turn out to be fulfilled in interesting examples.

Assumption 2.13. We assume that the N = 1 spectral data under consideration have the following additional properties:

1) $K^0 = 0$. (This implies that $\tilde\Omega_D^0(A) = A$ and $\tilde\Omega_D^1(A) = \tilde\Omega^1(A)$, thus $\tilde\Omega_D^1(A)$ carries a generalized Hermitian structure.)

2) $\tilde\Omega_D^1(A)$ is a vector bundle, called the cotangent bundle over $A$. ($\tilde\Omega_D^1(A)$ is always a left $A$-module. Here, we assume, in addition, that it is finitely generated and projective.)

3) The generalized metric $\langle\cdot,\cdot\rangle_D$ on $\tilde\Omega_D^1(A)$ defines an isomorphism of left $A$-modules between $\tilde\Omega_D^1(A)$ and the space of $A$-anti-linear maps from $\tilde\Omega_D^1(A)$ to $A$, i.e., for each $A$-anti-linear map

$$\phi : \tilde\Omega_D^1(A) \longrightarrow A,$$

satisfying $\phi(a\omega) = \phi(\omega)a^*$ for all $\omega \in \tilde\Omega_D^1(A)$ and all $a \in A$, there is a unique $\eta_\phi \in \tilde\Omega_D^1(A)$ with

$$\phi(\omega) = \langle \eta_\phi, \omega \rangle_D.$$

If N = 1 spectral data $(A, H, D, \gamma)$ satisfy these assumptions, we are able to define non-commutative generalizations of classical notions like torsion and curvature. Whereas torsion and Riemann curvature can be introduced whenever $\tilde\Omega_D^1(A)$ is a vector bundle, the last assumption in 2.13 will provide a substitute for the procedure of "contracting indices" leading to Ricci and scalar curvature.

Definition 2.14. Let $\nabla$ be a connection on the cotangent bundle $\tilde\Omega_D^1(A)$ over a non-commutative space $(A, H, D, \gamma)$ satisfying Assumption 2.13. The torsion of $\nabla$ is the $A$-linear map

$$T(\nabla) := \delta - m \circ \nabla : \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^2(A),$$

where $m : \tilde\Omega_D^1(A) \otimes_A \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^2(A)$ denotes the product of 1-forms in $\tilde\Omega_D^\bullet(A)$.

Using the definition of a connection, $A$-linearity of torsion is easy to verify. In analogy to the classical case, a unitary connection $\nabla$ with $T(\nabla) = 0$ is called a Levi–Civita connection. In the classical case, there is exactly one Levi–Civita connection that, in addition, is a real operator on the complexified bundle of differential forms. In contrast, for a given set of non-commutative spectral data, there may be several (real) Levi–Civita connections – or none at all.

Since we assume that $\tilde\Omega_D^1(A)$ is a vector bundle, we can define the Riemannian curvature of a connection $\nabla$ on the cotangent bundle as a specialization of Definition 2.11. To proceed further, we make use of part 2) of Assumption 2.13, which implies that there exists a finite set of generators $\{ E^A \}$ of $\tilde\Omega_D^1(A)$ and an associated "dual basis" $\{ \varepsilon_A \} \subset \tilde\Omega_D^1(A)^*$,

$$\tilde\Omega_D^1(A)^* := \{\, \phi : \tilde\Omega_D^1(A) \longrightarrow A \mid \phi(a\omega) = a\phi(\omega) \text{ for all } a \in A,\ \omega \in \tilde\Omega_D^1(A) \,\},$$


such that each $\omega \in \tilde\Omega_D^1(A)$ can be written as $\omega = \varepsilon_A(\omega)E^A$, see e.g. [Jac]. Because the curvature is $A$-linear, there is a family of elements $\{ R^A{}_B \} \subset \tilde\Omega_D^2(A)$ with

$$R(\nabla) = \varepsilon_A \otimes R^A{}_B \otimes E^B; \tag{2.15}$$

here and in the following the summation convention is used. Put differently, we have applied the canonical isomorphism of vector spaces

$$\operatorname{Hom}_A\big( \tilde\Omega_D^1(A),\ \tilde\Omega_D^2(A) \otimes_A \tilde\Omega_D^1(A) \big) \cong \tilde\Omega_D^1(A)^* \otimes_A \tilde\Omega_D^2(A) \otimes_A \tilde\Omega_D^1(A)$$

– which is valid because $\tilde\Omega_D^1(A)$ is projective – and chosen explicit generators $E^A$, $\varepsilon_A$. Then we have that $R(\nabla)\,\omega = \varepsilon_A(\omega)\, R^A{}_B \otimes E^B$ for any 1-form $\omega \in \tilde\Omega_D^1(A)$.

Note that although the components $R^A{}_B$ need not be unique, the element on the rhs of Eq. (2.15) is well-defined. Likewise, the Ricci and scalar curvature, to be introduced below, will be invariant combinations of those components, as long as we make sure that all maps we use have the correct "tensorial properties" with respect to the $A$-action.

The last part of Assumption 2.13 guarantees, furthermore, that to each $\varepsilon_A$ there exists a unique 1-form $e_A \in \tilde\Omega_D^1(A)$ such that

$$\varepsilon_A(\omega) = \langle \omega, e_A \rangle_D$$

for all $\omega \in \tilde\Omega_D^1(A)$. By Corollary 2.6, every such $e_A$ determines a bounded operator $m_L(e_A) : \tilde H^1 \longrightarrow \tilde H^2$ acting on $\tilde H^1$ by left multiplication with $e_A$. The adjoint of this operator with respect to the scalar product $(\cdot,\cdot)$ on $\tilde H^\bullet$ is denoted by

$$e_A^{\mathrm{ad}} : \tilde H^2 \longrightarrow \tilde H^1. \tag{2.16}$$

$e_A^{\mathrm{ad}}$ is a map of right $A$-modules, and it is easy to see that also the correspondence $\varepsilon_A \mapsto e_A^{\mathrm{ad}}$ is right $A$-linear: For all $b \in A$, $\omega \in \tilde\Omega_D^1(A)$, we have

$$(\varepsilon_A \cdot b)(\omega) = \varepsilon_A(\omega) \cdot b = \langle \omega, e_A \rangle\, b = \langle \omega, b^* e_A \rangle,$$

and, furthermore, for all $\xi_1 \in \tilde H^1$, $\xi_2 \in \tilde H^2$,

$$( b^* e_A(\xi_1), \xi_2 ) = ( e_A(\xi_1), b\,\xi_2 ) = ( \xi_1, e_A^{\mathrm{ad}}(b\,\xi_2) ),$$

where scalar products have to be taken in the appropriate spaces $\tilde H^k$. Altogether, the asserted right $A$-linearity follows. Therefore, the map

$$\varepsilon_A \otimes R^A{}_B \otimes E^B \longmapsto e_A^{\mathrm{ad}} \otimes R^A{}_B \otimes E^B$$

is well-defined and has the desired tensorial properties.

The definition of Ricci curvature involves another operation which we require to be similarly well-behaved:

Lemma 2.15. The orthogonal projections $P_{\delta K^{k-1}}$ on $\tilde H^k$, see Eq. (2.9), satisfy

$$P_{\delta K^{k-1}}(axb) = a\, P_{\delta K^{k-1}}(x)\, b$$

for all $a, b \in A$ and all $x \in \tilde H^k$.


Proof. Set $P := P_{\delta K^{k-1}}$, and let $y \in P\tilde H^k$. Then

$$( P(axb), y ) = ( axb, P(y) ) = ( axb, y ) = ( x, a^* y b^* ) = ( x, P(a^* y b^*) ) = ( a P(x) b, y ),$$

where we have used that $P$ is self-adjoint with respect to $(\cdot,\cdot)$, that $Py = y$, and that the image of $P$ is an $A$-bi-module.

This lemma shows that projecting onto the "2-form part" of $R^A{}_B$ is an $A$-bi-module map, i.e., we may apply

$$e_A^{\mathrm{ad}} \otimes R^A{}_B \otimes E^B \longmapsto e_A^{\mathrm{ad}} \otimes \big(R^A{}_B\big)^\perp \otimes E^B$$

with $\big(R^A{}_B\big)^\perp = (1 - P_{\delta K^1})\, R^A{}_B$ as in Eq. (2.10). Altogether, we arrive at the following definition of the Ricci curvature,

$$\mathrm{Ric}(\nabla) = e_A^{\mathrm{ad}}\Big( \big(R^A{}_B\big)^\perp \Big) \otimes E^B \in \tilde H^1 \otimes_A \tilde\Omega_D^1(A),$$

which is in fact independent of any choices. In the following, we will also use the abbreviation

$$\mathrm{Ric}_B := e_A^{\mathrm{ad}}\Big( \big(R^A{}_B\big)^\perp \Big)$$

for the components (which, again, are not uniquely defined).

From the components $\mathrm{Ric}_B$ we can pass to scalar curvature. Again, we have to make sure that all maps occurring in this process are $A$-covariant so as to obtain an invariant definition. For any 1-form $\omega \in \tilde\Omega_D^1(A)$, right multiplication on $\tilde H^0$ with $\omega$ defines a bounded operator $m_R(\omega) : \tilde H^0 \longrightarrow \tilde H^1$, and we denote by

$$\omega_R^{\mathrm{ad}} : \tilde H^1 \longrightarrow \tilde H^0 \tag{2.17}$$

the adjoint of this operator. In a similar fashion as above, one establishes that

$$(\omega a)_R^{\mathrm{ad}}(x) = \omega_R^{\mathrm{ad}}(x a^*)$$

for all $x \in \tilde H^1$ and $a \in A$. This makes it possible to define the scalar curvature $r(\nabla)$ of a connection $\nabla$ as

$$r(\nabla) = \big(E^{B*}\big)_R^{\mathrm{ad}} (\mathrm{Ric}_B) \in \tilde H^0.$$

As was the case for the Ricci tensor, acting with the adjoint of $m_R\big(E^{B*}\big)$ serves as an analogue for "contraction of indices". We summarize our results in the following

Definition 2.16. Let $\nabla$ be a connection on the cotangent bundle $\tilde\Omega_D^1(A)$ over a non-commutative space $(A, H, D, \gamma)$ satisfying Assumption 2.13. The Riemannian curvature $R(\nabla)$ is the left $A$-linear map

$$R(\nabla) = -\nabla^2 : \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^2(A) \otimes_A \tilde\Omega_D^1(A).$$

Choosing a set of generators $E^A$ of $\tilde\Omega_D^1(A)$ and dual generators $\varepsilon_A$ of $\tilde\Omega_D^1(A)^*$, and writing $R(\nabla) = \varepsilon_A \otimes R^A{}_B \otimes E^B$ as above, the Ricci tensor $\mathrm{Ric}(\nabla)$ is given by

$$\mathrm{Ric}(\nabla) = \mathrm{Ric}_B \otimes E^B \in \tilde H^1 \otimes_A \tilde\Omega_D^1(A),$$


where $\mathrm{Ric}_B := e_A^{\mathrm{ad}}\big( (R^A{}_B)^\perp \big)$, see Eqs. (2.10) and (2.16). Finally, the scalar curvature $r(\nabla)$ of the connection $\nabla$ is defined as

$$r(\nabla) = \big(E^{B*}\big)_R^{\mathrm{ad}} (\mathrm{Ric}_B) \in \tilde H^0,$$

with the notation of Eq. (2.17). (Note that, in the classical case, our definition of the scalar curvature differs from the usual one by a sign.) Both $\mathrm{Ric}(\nabla)$ and $r(\nabla)$ do not depend on the choice of generators.

2.1.8. Non-commutative Cartan structure equations. The classical Cartan structure equations are an important tool for explicit calculations in differential geometry. Non-commutative analogues of those equations were obtained in [CFF, CFG]. Since proofs were only sketched in these references, we will give a rather detailed account of their results in the following. Throughout this section, we assume that the space $\tilde\Omega_D^1(A)$ is a vector bundle over $A$. In fact, no other properties of this space are used. Therefore all the statements on the non-commutative Cartan structure equations for the curvature will hold for any finitely generated projective module $E$ over $A$; the torsion tensor, on the other hand, is defined only on the cotangent bundle over a non-commutative space.

Let $\nabla$ be a connection on the vector bundle $\tilde\Omega_D^1(A)$, then the curvature and the torsion of $\nabla$ are the left $A$-linear maps given in Definitions 2.16 and 2.14,

$$R(\nabla) : \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^2(A) \otimes_A \tilde\Omega_D^1(A), \qquad T(\nabla) : \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^2(A).$$

Since the left $A$-module $\tilde\Omega_D^1(A)$ is finitely generated, we can choose a finite set of generators $\{ E^A \}_{A=1,\dots,N} \subset \tilde\Omega_D^1(A)$, and define the components $\Omega^A{}_B \in \tilde\Omega_D^1(A)$, $R^A{}_B \in \tilde\Omega_D^2(A)$ and $T^A \in \tilde\Omega_D^2(A)$ of connection, curvature and torsion, resp., by setting

$$\nabla E^A = -\,\Omega^A{}_B \otimes E^B, \tag{2.18}$$

$$R(\nabla)\, E^A = R^A{}_B \otimes E^B, \tag{2.19}$$

$$T(\nabla)\, E^A = T^A. \tag{2.20}$$

Note that the components $\Omega^A{}_B$ and $R^A{}_B$ are not uniquely defined if $\tilde\Omega_D^1(A)$ is not a free module. Using Definitions 2.16 and 2.14, the components of the curvature and torsion tensors can be expressed in terms of the connection components:

$$R^A{}_B = \delta\,\Omega^A{}_B + \Omega^A{}_C\, \Omega^C{}_B, \tag{2.21}$$

$$T^A = \delta E^A + \Omega^A{}_B\, E^B. \tag{2.22}$$
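For orientation, Eqs. (2.21) and (2.22) can be obtained in two lines from the definitions. This is a sketch that writes $\Omega^A{}_B$ for the connection components of Eq. (2.18) (the symbol is an assumption about the notation) and uses the graded Leibniz rule (2.13) together with $T(\nabla) = \delta - m \circ \nabla$:

```latex
\begin{aligned}
R(\nabla)E^A &= -\nabla^2 E^A
   = \nabla\big( \Omega^A{}_B \otimes E^B \big)
   = \delta\Omega^A{}_B \otimes E^B - \Omega^A{}_B\, \nabla E^B \\
  &= \big( \delta\Omega^A{}_B + \Omega^A{}_C\, \Omega^C{}_B \big) \otimes E^B , \\
T(\nabla)E^A &= \delta E^A - m\big( \nabla E^A \big)
   = \delta E^A + \Omega^A{}_B\, E^B .
\end{aligned}
```

The minus sign in front of $\Omega^A{}_B\, \nabla E^B$ is the factor $(-1)^k$ of Eq. (2.13) for the 1-form $\Omega^A{}_B$.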

As they stand, Eqs. (2.21) and (2.22) cannot be applied for solving typical problems like finding a connection without torsion, because the connection components $\Omega^A{}_B$ cannot be chosen at will unless $\tilde\Omega_D^1(A)$ is free. We obtain more useful Cartan structure equations if we can relate the components $\Omega^A{}_B$ to those of a connection $\tilde\nabla$ on a free module $A^N$. To this end, we employ some general constructions valid for any finitely generated projective left $A$-module $E$.


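Let $\{ \tilde E^A \}_{A=1,\dots,N}$ be the canonical basis of the standard module $A^N$, and define a left $A$-module homomorphism

$$p : A^N \longrightarrow \tilde\Omega_D^1(A), \qquad a_A \tilde E^A \longmapsto a_A E^A \tag{2.23}$$

for all $a_A \in A$. Since $\tilde\Omega_D^1(A)$ is projective there exists a left $A$-module $F$ such that

$$\tilde\Omega_D^1(A) \oplus F \cong A^N. \tag{2.24}$$

Denote by $i : \tilde\Omega_D^1(A) \longrightarrow A^N$ the inclusion map determined by the isomorphism (2.24), which satisfies $p \circ i = \mathrm{id}$ on $\tilde\Omega_D^1(A)$. For each $A = 1, \dots, N$, we define a left $A$-linear map

$$\tilde\varepsilon_A : A^N \longrightarrow A, \qquad a_B \tilde E^B \longmapsto a_A. \tag{2.25}$$

It is clear that $\tilde\varepsilon_A(\omega)\tilde E^A = \omega$ for all $\omega \in A^N$. With the help of the inclusion $i$, we can introduce the left $A$-linear maps

$$\varepsilon_A : \tilde\Omega_D^1(A) \longrightarrow A, \qquad \omega \longmapsto \tilde\varepsilon_A\big(i(\omega)\big) \tag{2.26}$$

for all $A = 1, \dots, N$. With these, $\omega \in \tilde\Omega_D^1(A)$ can be written as

$$\omega = p\big(i(\omega)\big) = p\big( \tilde\varepsilon_A(i(\omega))\, \tilde E^A \big) = \varepsilon_A(\omega)\, E^A, \tag{2.27}$$

and we see that $\{ \varepsilon_A \}$ is the dual basis already used in Sect. 2.1.7. The first step towards the non-commutative Cartan structure equations is the following result; see also [Kar].

Proposition 2.17. Every connection $\tilde\nabla$ on $A^N$,

$$\tilde\nabla : A^N \longrightarrow \tilde\Omega_D^1(A) \otimes_A A^N,$$

determines a connection $\nabla$ on $\tilde\Omega_D^1(A)$ by

$$\nabla = (\mathrm{id} \otimes p) \circ \tilde\nabla \circ i, \tag{2.28}$$

and every connection on $\tilde\Omega_D^1(A)$ is of this form.

Proof. Let $\tilde\nabla$ be a connection on $A^N$ – which always exists (see the remarks after the proof). Clearly, $\nabla = (\mathrm{id} \otimes p) \circ \tilde\nabla \circ i$ is a well-defined map, and it satisfies

$$\nabla(a\omega) = (\mathrm{id} \otimes p)\big( \tilde\nabla(a\, i(\omega)) \big) = (\mathrm{id} \otimes p)\big( \delta a \otimes i(\omega) + a \tilde\nabla i(\omega) \big) = \delta a \otimes \omega + a \nabla\omega$$

for all $a \in A$ and all $\omega \in \tilde\Omega_D^1(A)$. This proves that $\nabla$ is a connection on $\tilde\Omega_D^1(A)$. If $\nabla'$ is any other connection on $\tilde\Omega_D^1(A)$, then

$$\nabla' - \nabla \in \operatorname{Hom}_A\big( \tilde\Omega_D^1(A),\ \tilde\Omega_D^1(A) \otimes_A \tilde\Omega_D^1(A) \big),$$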


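where $\operatorname{Hom}_A$ denotes the space of homomorphisms of left $A$-modules. Since

$$\mathrm{id} \otimes p : \tilde\Omega_D^1(A) \otimes_A A^N \longrightarrow \tilde\Omega_D^1(A) \otimes_A \tilde\Omega_D^1(A)$$

is surjective and $\tilde\Omega_D^1(A)$ is a projective module, there exists a module map

$$\varphi : \tilde\Omega_D^1(A) \longrightarrow \tilde\Omega_D^1(A) \otimes_A A^N \qquad \text{with} \qquad \nabla' - \nabla = (\mathrm{id} \otimes p) \circ \varphi.$$

Then $\tilde\varphi := \varphi \circ p \in \operatorname{Hom}_A\big( A^N, \tilde\Omega_D^1(A) \otimes_A A^N \big)$, and $\tilde\nabla + \tilde\varphi$ is a connection on $A^N$ whose associated connection on $\tilde\Omega_D^1(A)$ is given by $\nabla'$:

$$(\mathrm{id} \otimes p) \circ (\tilde\nabla + \tilde\varphi) \circ i = \nabla + (\mathrm{id} \otimes p) \circ \varphi = \nabla'.$$

This proves that every connection on $\tilde\Omega_D^1(A)$ comes from a connection on $A^N$.

The importance of this proposition lies in the fact that an arbitrary collection of 1-forms $\{ \tilde\Omega^A{}_B \}_{A,B=1,\dots,N} \subset \tilde\Omega_D^1(A)$ defines a connection $\tilde\nabla$ on $A^N$ by the formula

$$\tilde\nabla\big( a_A \tilde E^A \big) = \delta a_A \otimes \tilde E^A - a_A\, \tilde\Omega^A{}_B \otimes \tilde E^B,$$

and conversely. Thus, not only the existence of connections on $A^N$ and $\tilde\Omega_D^1(A)$ is guaranteed, but Eq. (2.28) allows us to compute the components $\Omega^A{}_B$ of the induced connection $\nabla$ on $\tilde\Omega_D^1(A)$. The action of $\nabla$ on the generators is

$$\begin{aligned}
\nabla E^A &= (\mathrm{id} \otimes p)\big( \tilde\nabla\, i(E^A) \big) = (\mathrm{id} \otimes p)\big( \tilde\nabla ( \tilde\varepsilon_B(i(E^A))\, \tilde E^B ) \big) = (\mathrm{id} \otimes p)\big( \tilde\nabla ( \varepsilon_B(E^A)\, \tilde E^B ) \big) \\
&= (\mathrm{id} \otimes p)\big( \delta\varepsilon_B(E^A) \otimes \tilde E^B - \varepsilon_B(E^A)\, \tilde\Omega^B{}_C \otimes \tilde E^C \big) \\
&= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \tilde\Omega^C{}_B \otimes E^B,
\end{aligned}$$

where we have used some of the general properties listed before. In short, we get the relation

$$\Omega^A{}_B = \varepsilon_C(E^A)\, \tilde\Omega^C{}_B - \delta\varepsilon_B(E^A) \tag{2.29}$$

expressing the components of the connection $\nabla$ on $\tilde\Omega_D^1(A)$ in terms of the components of the connection $\tilde\nabla$ on $A^N$.

Upon inserting (2.29) into (2.21, 22), one arrives at Cartan structure equations which express torsion and curvature in terms of these unrestricted components. We can, however, obtain equations of a simpler form if we exploit the fact that the map $\tilde\nabla \mapsto \nabla$ is many-to-one; this allows us to impose some extra symmetry relations on the components of the connection $\tilde\nabla$.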
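As a sanity check of Eq. (2.29), consider the special case in which $\tilde\Omega_D^1(A)$ is itself free with basis $\{E^A\}$, so that $F = 0$ in (2.24). This case is an illustration chosen here, not one treated in the text; $\delta^A{}_B$ below denotes the Kronecker symbol, not the differential:

```latex
\begin{aligned}
\varepsilon_B(E^A) &= \delta^A{}_B \cdot 1 \in A ,
  \qquad \delta\big( \varepsilon_B(E^A) \big) = 0 , \\
\Omega^A{}_B &= \varepsilon_C(E^A)\, \tilde\Omega^C{}_B - \delta\varepsilon_B(E^A)
   = \tilde\Omega^A{}_B .
\end{aligned}
```

In the free case the components can thus be chosen at will, and the correction term $\delta\varepsilon_B(E^A)$ only appears when $p$ has a non-trivial kernel.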


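Proposition 2.18. Let $\tilde\Omega^A{}_B$ be the coefficients of a connection $\tilde\nabla$ on $A^N$, and denote by $\widehat\nabla$ the connection on $A^N$ whose components are given by $\widehat\Omega^A{}_B := \varepsilon_C(E^A)\, \tilde\Omega^C{}_D\, \varepsilon_B(E^D)$. Then, these components enjoy the symmetry relations

$$\varepsilon_C(E^A)\, \widehat\Omega^C{}_B = \widehat\Omega^A{}_B, \qquad \widehat\Omega^A{}_C\, \varepsilon_B(E^C) = \widehat\Omega^A{}_B, \tag{2.30}$$

and $\widehat\nabla$ and $\tilde\nabla$ induce the same connection on $\tilde\Omega_D^1(A)$. In particular, every connection on $\tilde\Omega_D^1(A)$ is induced by a connection on $A^N$ that satisfies (2.30).

Proof. We explicitly compute the action of the connection $\nabla$ induced by $\widehat\nabla$ on a generator, using Eqs. (2.27, 28) and the fact that all maps and the tensor product are $A$-linear:

$$\begin{aligned}
\nabla E^A = -\,\Omega^A{}_B \otimes E^B &= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \widehat\Omega^C{}_B \otimes E^B \\
&= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_C(E^A)\, \varepsilon_D(E^C)\, \tilde\Omega^D{}_F\, \varepsilon_B(E^F) \otimes E^B \\
&= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_D\big( \varepsilon_C(E^A) E^C \big)\, \tilde\Omega^D{}_F \otimes \varepsilon_B(E^F) E^B \\
&= \delta\varepsilon_B(E^A) \otimes E^B - \varepsilon_D(E^A)\, \tilde\Omega^D{}_F \otimes E^F.
\end{aligned}$$

This shows that $\nabla$ is identical to the connection induced by $\tilde\nabla$. The symmetry relations (2.30) follow directly from $A$-linearity and (2.27).

We are now in a position to state the Cartan structure equations in a simple form.

Theorem 2.19. Let $\tilde\Omega^A{}_B$ and $\widehat\Omega^A{}_B$ be as in Proposition 2.18. Then the curvature and torsion components of the induced connection on $\tilde\Omega_D^1(A)$ are given by

$$R^A{}_B = \varepsilon_C(E^A)\, \delta\widehat\Omega^C{}_B + \widehat\Omega^A{}_C\, \widehat\Omega^C{}_B + \delta\varepsilon_C(E^A)\, \delta\varepsilon_B(E^C),$$

$$T^A = \varepsilon_B(E^A)\, \delta E^B + \widehat\Omega^A{}_B\, E^B.$$

Proof. With Eqs. (2.21, 29, 30) and the Leibniz rule, we get

$$\begin{aligned}
R^A{}_B &= \delta\widehat\Omega^A{}_B + \big( \widehat\Omega^A{}_C - \delta\varepsilon_C(E^A) \big) \big( \widehat\Omega^C{}_B - \delta\varepsilon_B(E^C) \big) \\
&= \delta\big( \varepsilon_C(E^A)\, \widehat\Omega^C{}_B \big) + \big( \widehat\Omega^A{}_C - \delta\varepsilon_C(E^A) \big) \big( \widehat\Omega^C{}_B - \delta\varepsilon_B(E^C) \big) \\
&= \varepsilon_C(E^A)\, \delta\widehat\Omega^C{}_B + \widehat\Omega^A{}_C\, \widehat\Omega^C{}_B + \delta\varepsilon_C(E^A)\, \delta\varepsilon_B(E^C) - \widehat\Omega^A{}_C\, \delta\varepsilon_B(E^C).
\end{aligned}$$

The last term does in fact not contribute to the curvature, as can be seen after tensoring with $E^B$:

$$\widehat\Omega^A{}_C\, \delta\varepsilon_B(E^C) \otimes E^B = -\,\delta\widehat\Omega^A{}_B \otimes E^B + \delta\widehat\Omega^A{}_C \otimes \varepsilon_B(E^C) E^B = 0,$$

where we have used the Leibniz rule, the relations (2.30) and $A$-linearity of the tensor product. To compute the components of the torsion, we use Eqs. (2.22, 29) analogously,

$$T^A = \delta E^A + \widehat\Omega^A{}_B\, E^B - \delta\varepsilon_B(E^A)\, E^B = \delta E^A + \widehat\Omega^A{}_B\, E^B - \delta E^A + \varepsilon_B(E^A)\, \delta E^B,$$

which gives the result.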


The Cartan structure equations of Theorem 2.19 are considerably simpler than those one would get directly from (2.29) and (2.21, 22). The price to be paid is that the components $\widehat\Omega^A{}_B$ are not quite independent from each other, but of course they can easily be expressed in terms of the arbitrary components $\tilde\Omega^A{}_B$ according to Proposition 2.18. Therefore, the equations of Theorem 2.19 are useful e.g. for determining connections on $\tilde\Omega_D^1(A)$ with special properties. We refer the reader to [CFG] for an explicit application of the Cartan structure equations.

2.2. The N = (1, 1) formulation of non-commutative geometry. In this section, we introduce the non-commutative generalization of the description of Riemannian geometry by a set of N = (1, 1) spectral data, which was presented, for the classical case, in Sect. 2.2 of part I. The advantage over the N = 1 formulation is that now the algebra of differential forms is naturally represented on the Hilbert space $H$. Therefore, calculations in concrete examples and also the study of cohomology rings will become much easier. There is the drawback that the algebra of differential forms is no longer closed under the $*$-operation on $H$, but we will introduce an alternative involution below and add further remarks in Sect. 5.

The N = (1, 1) framework explained in the following will also provide the basis for the definition of various types of complex non-commutative geometries in Sects. 2.3 and 2.4.

2.2.1. The N = (1, 1) spectral data.

Definition 2.20. A quintuple $(A, H, d, \gamma, *)$ is called a set of N = (1, 1) spectral data if

1) $H$ is a separable Hilbert space;

2) $A$ is a unital $*$-algebra acting faithfully on $H$ by bounded operators;

3) $d$ is a densely defined closed operator on $H$ such that
i) $d^2 = 0$,
ii) for each $a \in A$, the commutator $[\, d, a \,]$ extends uniquely to a bounded operator on $H$,
iii) the operator $\exp(-\varepsilon\triangle)$ with $\triangle = dd^* + d^*d$ is trace class for all $\varepsilon > 0$;

4) $\gamma$ is a $\mathbb{Z}_2$-grading on $H$, i.e., $\gamma = \gamma^* = \gamma^{-1}$, such that
i) $[\, \gamma, a \,] = 0$ for all $a \in A$,
ii) $\{\, \gamma, d \,\} = 0$;

5) $*$ is a unitary operator on $H$ such that
i) $* \, d = \zeta\, d^* *$ for some $\zeta \in \mathbb{C}$ with $|\zeta| = 1$,
ii) $[\, *, a \,] = 0$ for all $a \in A$.

Several remarks are in order. First of all, note that we can introduce the two operators

$$D = d + d^*, \qquad \bar D = i\,(d - d^*)$$

on $H$ which satisfy the relations

$$D^2 = \bar D^2, \qquad \{\, D, \bar D \,\} = 0,$$

cf. Definition I 2.6. Thus, our notion of N = (1, 1) spectral data is an immediate generalization of a classical N = (1, 1) Dirac bundle – except for the boundedness conditions to be required on infinite-dimensional Hilbert spaces, and the existence of the additional operator $*$ (see the comments below).
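The two relations satisfied by the operators built from $d$ and $d^*$ follow from $d^2 = 0$ alone. The computation below is a formal sketch (domain questions are ignored), and writing $\bar D$ for the second operator $i\,(d - d^*)$ is an assumption about the notation:

```latex
\begin{aligned}
D^2 &= (d + d^*)^2 = d\,d^* + d^*d , \\
\bar D^2 &= -\,(d - d^*)^2 = d\,d^* + d^*d = D^2 , \\
\{ D, \bar D \} &= i\,\big[ (d + d^*)(d - d^*) + (d - d^*)(d + d^*) \big] \\
  &= i\,\big[ -\,d\,d^* + d^*d + d\,d^* - d^*d \big] = 0 .
\end{aligned}
```

Both squares therefore coincide with the Laplacian $dd^* + d^*d$ that enters condition 3 iii) of Definition 2.20.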


As in the N = 1 case, the $\mathbb{Z}_2$-grading $\gamma$ may always be introduced if not given from the start, simply by "doubling" the Hilbert space – see the remarks following Definition 2.1. Moreover, if $(\tilde A, \tilde H, \tilde d, \tilde\gamma)$ is a quadruple satisfying Conditions 1–4 of Definition 2.20, we obtain a full set of N = (1, 1) spectral data by setting

$$H = \tilde H \otimes \mathbb{C}^2, \qquad A = \tilde A \otimes 1_2, \qquad d = \tilde d \otimes \tfrac{1}{2}(1_2 + \tau_3) - \tilde d^* \otimes \tfrac{1}{2}(1_2 - \tau_3),$$

$$\gamma = \tilde\gamma \otimes 1_2, \qquad * = 1_{\tilde H} \otimes \tau_1$$

with the Pauli matrices $\tau_i$ as usual. Note that, in this example, $\zeta = -1$, and the $*$-operator additionally satisfies $*^2 = 1$ as well as $[\, \gamma, * \,] = 0$.

The unitary operator $*$ was not present in our algebraic formulation of classical Riemannian geometry. But for a compact oriented manifold, the usual Hodge $*$-operator acting on differential forms satisfies all the properties listed above, after appropriate rescaling in each degree. (Moreover, one can always achieve $*^2 = 1$ or $\zeta = -1$.) For a non-orientable manifold, we can apply the construction of the previous paragraph to obtain a description of the differential forms in terms of N = (1, 1) spectral data including a Hodge operator.

In our approach to the non-commutative case, we will make essential use of the existence of $*$, which we will also call Hodge operator, in analogy to the classical case.

2.2.2. Differential forms. We first introduce an involution, $\natural$, called complex conjugation, on the algebra of universal forms: $\natural : \Omega^\bullet(A) \longrightarrow \Omega^\bullet(A)$ is the unique $\mathbb{C}$-anti-linear anti-automorphism such that

$$\natural(a) \equiv a^\natural := a^*, \qquad \natural(\delta a) \equiv (\delta a)^\natural := \delta(a^*) \tag{2.31}$$

for all $a \in A$. Here we choose a sign convention that differs from the N = 1 case, Eq. (2.1). If we write $\hat\gamma$ for the mod 2 reduction of the canonical $\mathbb{Z}$-grading on $\Omega^\bullet(A)$, we have

$$\delta\, \natural\, \hat\gamma = \natural\, \delta. \tag{2.32}$$

We define a representation of $\Omega^\bullet(A)$ on $H$, again denoted by $\pi$, by

$$\pi(a) := a, \qquad \pi(\delta a) := [\, d, a \,] \tag{2.33}$$

for all $a \in A$. The map $\pi$ is a $\mathbb{Z}_2$-graded representation in the sense that

$$\pi(\hat\gamma\, \omega\, \hat\gamma) = \gamma\, \pi(\omega)\, \gamma \tag{2.34}$$

for all $\omega \in \Omega^\bullet(A)$.

Although the abstract algebra of universal forms is the same as in the N = 1 setting, the interpretation of the universal differential $\delta$ has changed: In the N = (1, 1) framework, it is represented on $H$ by the nilpotent operator $d$, instead of the self-adjoint Dirac operator $D$, as before. In particular, we now have

$$\pi(\delta\omega) = [\, d, \pi(\omega) \,]_g \tag{2.35}$$


for all $\omega \in \Omega^\bullet(A)$, where $[\cdot,\cdot]_g$ denotes the graded commutator (defined with the canonical $\mathbb{Z}_2$-grading on $\pi(\Omega^\bullet(A))$, see (2.34)). The validity of Eq. (2.35) is the main difference between the N = (1, 1) and the N = 1 formalism. It ensures that there do not exist any forms $\omega \in \Omega^p(A)$ with $\pi(\omega) = 0$ but $\pi(\delta\omega) \ne 0$, in other words:

Proposition 2.21. The graded vector space

$$J = \bigoplus_{k=0}^{\infty} J^k, \qquad J^k := \ker \pi\big|_{\Omega^k(A)}$$

with $\pi$ defined in (2.33) is a two-sided graded differential $\natural$-ideal of $\Omega^\bullet(A)$.

Proof. The first two properties are obvious, the third one is the content of Eq. (2.35). Using (2.31) and the relations satisfied by the Hodge $*$-operator according to part 5) of Definition 2.20, we find that

$$\pi\big( (\delta a)^\natural \big) = \pi(\delta(a^*)) = [\, d, a^* \,] = [\, a, d^* \,]^* = \zeta\, [\, a, * \, d \, *^{-1} \,]^* = \zeta * [\, a, d \,]^* *^{-1} = -\zeta * \pi(\delta a)^* *^{-1},$$

which implies

$$\pi\big( \omega^\natural \big) = (-\zeta)^k * \pi(\omega)^* *^{-1} \tag{2.36}$$

for all $\omega \in \Omega^k(A)$. In particular, $J = \ker \pi$ is a $\natural$-ideal.

As a consequence of this proposition, the algebra of differential forms

$$\Omega_d^\bullet(A) := \bigoplus_{k=0}^{\infty} \Omega_d^k(A), \qquad \Omega_d^k(A) := \Omega^k(A)/J^k, \tag{2.37}$$

is represented on the Hilbert space $H$ via $\pi$. For later purposes, we will also need an involution on $\Omega_d^\bullet(A)$, and according to Proposition 2.21, this is given by the anti-linear map $\natural$ of (2.31). Note that the "natural" involution $\omega \mapsto \omega^*$, see Eq. (2.1), which is inherited from $H$ and was used in the N = 1 case, is no longer available here: The space $\pi(\Omega^k(A))$ is not closed under taking adjoints, because $d$ is not self-adjoint.

In summary, the space $\Omega_d^\bullet(A)$ is a unital graded differential $\natural$-algebra and the representation $\pi$ of $\Omega^\bullet(A)$ determines a representation of $\Omega_d^\bullet(A)$ on $H$ as a unital differential algebra.

2.2.3. Integration. The integration theory follows the same lines as in the N = 1 case: The state $\fint$ is given as in Definition 2.3 with $D^2$ written as $\triangle = dd^* + d^*d$. Again, we make Assumption 2.4 about the cyclicity of the integral. This yields a sesqui-linear form on $\Omega_d^\bullet(A)$ as before:

$$(\omega, \eta) = \fint \omega\, \eta^* \tag{2.38}$$

for all $\omega, \eta \in \Omega_d^\bullet(A)$, where we have dropped the representation symbols $\pi$ under the integral.

Because of the presence of the Hodge $*$-operator, the form $(\cdot,\cdot)$ has an additional feature in the N = (1, 1) setting:

Supersymmetric Quantum Theory and Non-Commutative Geometry

141

Proposition 2.22. If the phase in part 5) of Definition 2.20 is $\zeta = \pm 1$, then the inner product defined in Eq. (2.38) behaves like a real functional with respect to the involution $\natural$, i.e., for $\omega, \eta \in \Omega_d^\bullet(A)$ we have

\[ (\omega^\natural, \eta^\natural) = \overline{(\omega, \eta)}, \]

where the bar denotes ordinary complex conjugation.

Proof. First, observe that the Hodge operator commutes with the Laplacian, which is verified e.g. by taking the adjoint of the relation $*\,d = \zeta\,d^*\,*$. Then the claim follows immediately using Eq. (2.36), unitarity of the Hodge operator, and cyclicity of the trace on $H$: Let $\omega \in \Omega_d^p(A)$, $\eta \in \Omega_d^q(A)$, then

\[ (\omega^\natural, \eta^\natural) = \int\!\!\!\!\!\!-\; \omega^\natural (\eta^\natural)^* = (-\zeta)^p (-\bar\zeta)^q \int\!\!\!\!\!\!-\; *\,\omega^*\,*^{-1}\,*\,\eta\,*^{-1} = (-\zeta)^{p-q} \int\!\!\!\!\!\!-\; \omega^*\eta = (-\zeta)^{p-q} \int\!\!\!\!\!\!-\; \eta\,\omega^* = (-\zeta)^{p-q}\,\overline{(\omega, \eta)}; \]

again, we have suppressed the representation symbol $\pi$. The claim follows since the $\mathbb{Z}_2$-grading implies $(\omega, \eta) = 0$ unless $p - q \equiv 0 \pmod 2$.

Note that in examples, $p$- and $q$-forms for $p \neq q$ are often orthogonal with respect to the inner product $(\cdot,\cdot)$; then Proposition 2.22 holds independently of the value of $\zeta$. Since $\Omega_d^\bullet(A)$ is a $\natural$- and not a $*$-algebra, Proposition 2.5 is to be replaced by

Proposition 2.23. The graded kernel $K$, see Eq. (2.5), of the sesqui-linear form $(\cdot,\cdot)$ is a two-sided graded $\natural$-ideal of $\Omega_d^\bullet(A)$.

Proof. The proof that $K$ is a two-sided graded ideal is identical to the one of Proposition 2.5. That $K$ is closed under $\natural$ follows immediately from the proof of Proposition 2.22.

The remainder of Sect. 2.1.3 carries over to the $N = (1,1)$ case, with the only differences that $\widetilde\Omega^\bullet(A)$ is a $\natural$-algebra and that the quotients $\Omega^k(A)/(K^k + \delta K^{k-1}) \cong \widetilde\Omega^k(A)/\delta K^{k-1}$ are denoted by $\widetilde\Omega_d^k(A)$. While $\Omega_d^\bullet(A)$ is a differential algebra (by construction), $\widetilde\Omega^\bullet(A)$ is not, in general, a differential algebra, because the ideal $K$ may not be a differential ideal (i.e. there may exist $\omega \in K^{k-1}$ with $\delta\omega \notin K^k$). However, $K$ is trivial in many interesting examples. If $K$ is trivial then the algebra $\widetilde\Omega^\bullet(A)$ of square-integrable forms is a differential algebra which is faithfully represented on $\widetilde H^\bullet$.

2.2.4. Unitary connections and scalar curvature. Except for the notions of unitary connections and scalar curvature, all definitions and results of Sects. 2.1.4–8 literally apply to the $N = (1,1)$ case as well. The two exceptions explicitly involve the $*$-involution on the algebra of differential forms, which is no longer available now. Therefore, we have to modify the definitions for $N = (1,1)$ non-commutative geometry as follows:

Definition 2.24. A connection $\nabla$ on a Hermitian vector bundle $(E, \langle\cdot,\cdot\rangle)$ over an $N = (1,1)$ non-commutative space is called unitary if

\[ d\,\langle s, t\rangle = \langle \nabla s, t\rangle + \langle s, \nabla t\rangle \]

142

J. Fröhlich, O. Grandjean, A. Recknagel

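for all $s, t \in E$; the Hermitian structure on the rhs is extended to $E$-valued differential forms by

\[ \langle \omega \otimes s, t\rangle = \omega\,\langle s, t\rangle, \qquad \langle s, \eta \otimes t\rangle = \langle s, t\rangle\,\eta^\natural \]

for all $\omega, \eta \in \widetilde\Omega_d^\bullet(A)$ and $s, t \in E$.

Definition 2.25. The scalar curvature of a connection $\nabla$ on $\widetilde\Omega_d^1(A)$ is defined by

\[ r(\nabla) = \big(E^{B\natural}\big)^{\mathrm{ad}}_{R}\,(\mathrm{Ric}_B) \in \widetilde H^0. \]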
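The two facts invoked in the proof of Proposition 2.22, namely that the Hodge operator commutes with the Laplacian and that the phases combine to 1, can be checked in a few lines. The sketch below assumes only that $*$ is unitary, that $*\,d = \zeta\,d^*\,*$ with $|\zeta| = 1$, and that $\Delta = dd^* + d^*d$; domain questions for the unbounded operator $d$ are ignored.

```latex
% Sketch of the two facts used in the proof of Proposition 2.22.
% Assumptions: * unitary, * d = \zeta d^* * with |\zeta| = 1,
% \Delta = d d^* + d^* d; operator domains are not tracked.
\begin{align*}
&\text{adjoint of } *\,d = \zeta\,d^*\,* :\quad
   d^*\,*^{-1} = \bar\zeta\,*^{-1}\,d
   \;\Longrightarrow\; *\,d^* = \bar\zeta\,d\,* , \\
&*\,d\,d^* = \zeta\,d^*\,*\,d^* = \zeta\bar\zeta\,d^*\,d\,* = d^*\,d\,* , \qquad
 *\,d^*\,d = \bar\zeta\,d\,*\,d = \bar\zeta\zeta\,d\,d^*\,* = d\,d^*\,* , \\
&\Longrightarrow\quad *\,\Delta = \Delta\,* . \\[4pt]
&(-\zeta)^p\,\overline{(-\zeta)^q} = (-\zeta)^{p-q}, \qquad
 (-\zeta)^{p-q} = 1 \ \text{ if } \zeta = \pm 1 \text{ and } p-q \text{ is even}.
\end{align*}
```

The last line is where the hypothesis $\zeta = \pm 1$ enters: for a general phase the factor $(-\zeta)^{p-q}$ need not equal 1 even when $p - q$ is even.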

2.2.5. Remarks on the relation of $N = 1$ and $N = (1,1)$ spectral data. The definitions of $N = 1$ and $N = (1,1)$ non-commutative spectral data provide two different generalizations of classical Riemannian differential geometry. In the latter context, one can always find an $N = (1,1)$ description of a manifold originally given by an $N = 1$ set of data (see part I), whereas a non-commutative set of $N = (1,1)$ spectral data seems to require a different mathematical structure than a spectral triple, because of the additional generalized Dirac operator which must be given on the Hilbert space. Thus, it is a natural and important question under which conditions on an $N = 1$ spectral triple $(A, H, D)$ there exists an associated $N = (1,1)$ set of data $(A, \widetilde H, d, *)$ over the same non-commutative space $A$. We have not yet been able to answer the question of how to pass from $N = 1$ to $N = (1,1)$ data in a general way. But in the following we present a procedure that might lead to a solution. Our guideline is the classical case, where the main step in passing from $N = 1$ to $N = (1,1)$ data is to replace the Hilbert space $H = L^2(S)$ by $\widetilde H = L^2(S \otimes S)$ carrying two actions of the Clifford algebra and therefore two anti-commuting Dirac operators $D$ and $\bar D$, which yield a description equivalent to the one involving the nilpotent differential $d$, see the remark after Definition 2.20. It is plausible that there are other approaches to this question, in particular approaches of a more operator algebraic nature, e.g. using a "Kasparov product of spectral triples", but we will not enter these matters here. The first problem one meets when trying to copy the classical step from $N = 1$ to $N = (1,1)$ is that $H$ should be an $A$-bi-module. To ensure this, we require that the set of $N = 1$ (even) spectral data $(A, H, D, \gamma)$ is endowed with a real structure [Co4], i.e. that there exists an anti-unitary operator $J$ on $H$ such that

\[ J^2 = \epsilon\,1, \qquad J\gamma = \epsilon'\,\gamma J, \qquad JD = DJ \]

for some (independent) signs $\epsilon, \epsilon' = \pm 1$, and such that, in addition, $JaJ^*$ commutes with $b$ and $[\,D, b\,]$ for all $a, b \in A$. This definition of a real structure was introduced by Connes in [Co4]; $J$ is of course a variant of Tomita's modular conjugation (cf. the next subsection). In the present context, $J$ provides a canonical right $A$-module structure on $H$ by defining $\xi \cdot a := J a^* J^* \xi$ for all $a \in A$, $\xi \in H$, see [Co4]. We can extend this to a right action of $\Omega_D^1(A)$ on $H$ if we set

\[ \xi \cdot \omega := J \omega^* J^* \xi \]


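for all $\omega \in \Omega_D^1(A)$ and $\xi \in H$; for simplicity, the representation symbol $\pi$ has been omitted. Note that by the assumptions on $J$, the right action commutes with the left action of $A$. Thus $H$ is an $A$-bi-module, and we can form tensor products of bi-modules over the algebra $A$ just as in the classical case. From now on, we assume that $H$ contains a dense projective left $A$-module $\overset{\circ}{H}$ which is stable under $J$ and $\gamma$. In particular, $\overset{\circ}{H}$ is itself an $A$-bi-module. Since $\overset{\circ}{H}$ is projective, it carries a Hermitian structure, see Definition 2.8, that induces a scalar product on $\overset{\circ}{H} \otimes_A \overset{\circ}{H}$ (see also Sect. 4.3). We shall denote by $\widetilde H$ the Hilbert space completion of $\overset{\circ}{H} \otimes_A \overset{\circ}{H}$ with respect to this scalar product.

The real structure $J$ allows us to define the anti-linear "flip" operator

\[ \Psi : \ \Omega_D^1(A) \otimes_A \overset{\circ}{H} \longrightarrow \overset{\circ}{H} \otimes_A \Omega_D^1(A), \qquad \omega \otimes \xi \longmapsto J\xi \otimes \omega^*. \]

It is straightforward to verify that $\Psi$ is well-defined and that it satisfies

\[ \Psi(a\,s) = \Psi(s)\,a^* \]

for all $a \in A$, $s \in \Omega_D^1(A) \otimes_A \overset{\circ}{H}$.

Since $\overset{\circ}{H}$ is projective, it admits connections

\[ \nabla : \ \overset{\circ}{H} \longrightarrow \Omega_D^1(A) \otimes_A \overset{\circ}{H}, \]

i.e. $\mathbb{C}$-linear maps such that

\[ \nabla(a\xi) = \delta a \otimes \xi + a\,\nabla\xi \]

for all $a \in A$ and $\xi \in \overset{\circ}{H}$. We assume that $\nabla$ commutes with the grading $\gamma$ on $\overset{\circ}{H}$, i.e. $\nabla\gamma\,\xi = (1 \otimes \gamma)\,\nabla\xi$ for all $\xi \in \overset{\circ}{H}$. For each connection $\nabla$ on $\overset{\circ}{H}$, there is an "associated right-connection" $\bar\nabla$ defined with the help of the flip $\Psi$:

\[ \bar\nabla : \ \overset{\circ}{H} \longrightarrow \overset{\circ}{H} \otimes_A \Omega_D^1(A), \qquad \xi \longmapsto -\Psi(\nabla J^* \xi). \]

$\bar\nabla$ is again $\mathbb{C}$-linear and satisfies

\[ \bar\nabla(\xi a) = \xi \otimes \delta a + (\bar\nabla\xi)\,a. \]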
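The two claims left to the reader, that the flip is compatible with the tensor product over $A$ and that the associated right-connection obeys the right Leibniz rule, can be verified directly. The following is a sketch under the assumptions that $J^*J = JJ^* = 1$, that the right action is $\xi \cdot a = J a^* J^* \xi$, and that $\delta a$ is represented by $[\,D, a\,]$ with $D$ self-adjoint.

```latex
% Sketch: well-definedness of \Psi and the Leibniz rule for \bar\nabla.
% Assumptions: \xi\cdot a = J a^* J^* \xi, J^*J = JJ^* = 1,
% \delta a = [D,a] in \Omega^1_D(A) with D self-adjoint.
\begin{align*}
\Psi(\omega a \otimes \xi) &= J\xi \otimes a^*\omega^*
   = (J\xi)\cdot a^* \otimes \omega^*
   = J a J^* J\xi \otimes \omega^*
   = J(a\xi)\otimes\omega^* = \Psi(\omega\otimes a\xi), \\
(\delta(a^*))^* &= [D,a^*]^* = -[D,a] = -\delta a, \\
\bar\nabla(\xi a) &= -\Psi\big(\nabla J^*(J a^* J^*\xi)\big)
   = -\Psi\big(\nabla(a^* J^*\xi)\big) \\
  &= -\Psi\big(\delta(a^*)\otimes J^*\xi\big) - \Psi\big(a^*\,\nabla J^*\xi\big) \\
  &= -\,\xi\otimes(\delta(a^*))^* - \Psi(\nabla J^*\xi)\,a
   = \xi\otimes\delta a + (\bar\nabla\xi)\,a .
\end{align*}
```

The sign in the definition of the right-connection is exactly what compensates the minus sign coming from $(\delta(a^*))^* = -\delta a$.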

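A connection $\nabla$ on $\overset{\circ}{H}$, together with its associated right connection $\bar\nabla$, induces a $\mathbb{C}$-linear "tensor product connection" $\widetilde\nabla$ on $\overset{\circ}{H} \otimes_A \overset{\circ}{H}$ of the form

\[ \widetilde\nabla : \ \overset{\circ}{H} \otimes_A \overset{\circ}{H} \longrightarrow \overset{\circ}{H} \otimes_A \Omega_D^1(A) \otimes_A \overset{\circ}{H}, \qquad \xi_1 \otimes \xi_2 \longmapsto \bar\nabla\xi_1 \otimes \xi_2 + \xi_1 \otimes \nabla\xi_2. \]

Because of the position of the factor $\Omega_D^1(A)$, $\widetilde\nabla$ is not quite a connection in the usual sense. In the classical case, the last ingredient needed for the definition of the two Dirac operators of an $N = (1,1)$ Dirac bundle are the two anti-commuting Clifford actions on $\widetilde H$. Their obvious generalizations to the non-commutative case are the $\mathbb{C}$-linear maps

\[ c : \ \overset{\circ}{H} \otimes_A \Omega_D^1(A) \otimes_A \overset{\circ}{H} \longrightarrow \overset{\circ}{H} \otimes_A \overset{\circ}{H}, \qquad \xi_1 \otimes \omega \otimes \xi_2 \longmapsto \xi_1 \otimes \omega\,\xi_2 \]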


and

\[ \bar c : \ \overset{\circ}{H} \otimes_A \Omega_D^1(A) \otimes_A \overset{\circ}{H} \longrightarrow \overset{\circ}{H} \otimes_A \overset{\circ}{H}, \qquad \xi_1 \otimes \omega \otimes \xi_2 \longmapsto \xi_1\,\omega \otimes \gamma\,\xi_2. \]

With these, we may introduce two operators $D$ and $\bar D$ on $\overset{\circ}{H} \otimes_A \overset{\circ}{H}$ in analogy to the classical case:

\[ D := c \circ \widetilde\nabla, \qquad \bar D := \bar c \circ \widetilde\nabla. \]

In order to obtain a set of $N = (1,1)$ spectral data, one has to find a connection $\nabla$ on $\overset{\circ}{H}$ which makes the operators $D$ and $\bar D$ essentially self-adjoint and ensures that the relations $D^2 = \bar D^2$ and $\{\,D, \bar D\,\} = 0$ of Definition 2.20 are satisfied. The $\mathbb{Z}_2$-grading on $\widetilde H$ is simply the tensor product grading, and the Hodge operator can be taken to be $* = \gamma \otimes 1$. In Sect. 4 below, we will verify these conditions in the example of the non-commutative torus. In the general case, we have, up to now, not been able to prove the existence of a connection $\nabla$ on $\overset{\circ}{H}$ which supplies $D$ and $\bar D$ with the correct algebraic properties, but the naturality of the construction presented above as well as the similarity with the procedure of Sect. I 2.2.2 lead us to expect that this problem can be solved in many cases of interest. More precisely, we expect that the relation $\{\,D, \bar D\,\} = 0$ can be satisfied under rather general assumptions, whereas it may often be appropriate to deal with a non-vanishing operator $D^2 - \bar D^2$ that generates an $S^1$-action.

2.2.6. Riemannian and Spin$^c$ "manifolds" in non-commutative geometry. In this section, we address the following question: What is the additional structure that makes an $N = (1,1)$ non-commutative space into a non-commutative "manifold", into a Spin$^c$ "manifold", or into a quantized phase space? There is a definition of non-commutative manifolds in terms of K-homology, see e.g. [Co1]. In our search for the characteristic features of non-commutative manifolds we will, as before, be guided by the classical case and by the principle that they should be natural from the point of view of quantum physics. Extrapolating from classical geometry, we are e.g.
led to the following requirement an $N = (1,1)$ space $(A, H, d, \gamma, *)$ should satisfy in order to describe a "manifold": The data must extend to a set of $N = 2$ spectral data $(A, H, d, T, *)$ where $T$ is a self-adjoint operator on $H$ such that
i) $[\,T, a\,] = 0$ for all $a \in A$;
ii) $[\,T, d\,] = d$;
iii) $T$ has integral spectrum, and $\gamma$ is the mod 2 reduction of $T$, i.e. $\gamma = \pm 1$ on $H_\pm$, where $H_\pm = \mathrm{span}\,\{\, \xi \in H \mid T\xi = n\,\xi \text{ for some } n \in \mathbb{Z},\ (-1)^n = \pm 1 \,\}$.
Such $N = 2$ spectral data have been used in Sect. I 1.2 already, and have also been briefly discussed in Sect. I 3.

Before we can formulate further properties that we suppose to characterize non-commutative manifolds, we recall some basic facts about Tomita-Takesaki theory. Let $M$ be a von Neumann algebra acting on a separable Hilbert space $H$, and assume that $\xi_0 \in H$ is a cyclic and separating vector for $M$, i.e.

\[ \overline{M\,\xi_0} = H \]


and

\[ a\,\xi_0 = 0 \ \Longrightarrow\ a = 0 \]

for any $a \in M$, respectively. Then we may define an anti-linear operator $S_0$ on $H$ by setting $S_0\,a\,\xi_0 = a^*\xi_0$ for all $a \in M$. One can show that $S_0$ is closable, and we denote its closure by $S$. The polar decomposition of $S$ is written as

\[ S = J\,\Delta^{1/2}, \]

where $J$ is an anti-unitary involutive operator, referred to as modular conjugation, and the so-called modular operator $\Delta$ is a positive self-adjoint operator on $H$. The fundamental result of Tomita-Takesaki theory is the following theorem:

\[ J M J = M', \qquad \Delta^{it} M \Delta^{-it} = M \]

for all $t \in \mathbb{R}$; here, $M'$ denotes the commutant of $M$ on $H$. Furthermore, the vector state $\omega_0(\cdot) := (\xi_0, \cdot\,\xi_0)$ is a KMS-state for the automorphism $\sigma_t := \mathrm{Ad}_{\Delta^{it}}$ of $M$, i.e.

\[ \omega_0(\sigma_t(a)\,b) = \omega_0(b\,\sigma_{t-i}(a)) \]

for all $a, b \in M$ and all real $t$.

Let $(A, H, d, T, *)$ be a set of $N = 2$ spectral data coming from an $N = (1,1)$ space as above. We define the analogue $Cl_D(A)$ of the space of sections of the Clifford bundle,

\[ Cl_D(A) = \{\, a_0\,[\,D, a_1\,] \cdots [\,D, a_k\,] \mid k \in \mathbb{Z}_+,\ a_i \in A \,\}, \]

where $D = d + d^*$, and, corresponding to the second generalized Dirac operator $\bar D = i(d - d^*)$,

\[ Cl_{\bar D}(A) = \{\, a_0\,[\,\bar D, a_1\,] \cdots [\,\bar D, a_k\,] \mid k \in \mathbb{Z}_+,\ a_i \in A \,\}. \]

In the classical setting, the sections $Cl_D(A)$ and $Cl_{\bar D}(A)$ operate on $H$ by the two actions $c$ and $\bar c$, respectively, see Definition I 2.6. In the general case, we notice that, in contrast to the algebra $\Omega_d^\bullet(A)$ introduced before, $Cl_D(A)$ and $Cl_{\bar D}(A)$ form $*$-algebras of operators on $H$, but are neither $\mathbb{Z}$-graded nor differential. We want to apply Tomita-Takesaki theory to the von Neumann algebra $M := Cl_D(A)''$. Suppose there exists a vector $\xi_0 \in H$ which is cyclic and separating for $M$, and let $J$ be the anti-unitary conjugation associated to $M$ and $\xi_0$. Suppose, moreover, that for all $a \in {}^J\!A := JAJ$ the operator $[\,\bar D, a\,]$ uniquely extends to a bounded operator on $H$. Then we can form the algebra of bounded operators $Cl_{\bar D}({}^J\!A)$ on $H$ as above. The properties $JAJ \subset A'$ and $\{\,D, \bar D\,\} = 0$ imply that $Cl_D(A)$ and $Cl_{\bar D}({}^J\!A)$ commute in the graded sense; to arrive at truly commuting algebras, we first decompose $Cl_{\bar D}({}^J\!A)$ into a direct sum

\[ Cl_{\bar D}({}^J\!A) = Cl^+_{\bar D}({}^J\!A) \oplus Cl^-_{\bar D}({}^J\!A) \]

with

\[ Cl^\pm_{\bar D}({}^J\!A) = \{\, \omega \in Cl_{\bar D}({}^J\!A) \mid \gamma\,\omega = \pm\,\omega\,\gamma \,\}. \]

Then we define the "twisted algebra" $\widetilde{Cl}_{\bar D}({}^J\!A) := Cl^+_{\bar D}({}^J\!A) \oplus \gamma\,Cl^-_{\bar D}({}^J\!A)$. This algebra commutes with $Cl_D(A)$.
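The reason the twist by $\gamma$ converts graded commutation into ordinary commutation can be seen in one line per parity. The sketch below assumes that $x \in Cl_D(A)$ and $y \in Cl_{\bar D}({}^J\!A)$ have definite parities $|x|, |y| \in \{0, 1\}$ with respect to $\gamma$ (our notation) and that $\gamma^2 = 1$.

```latex
% Sketch: multiplying the odd part by \gamma turns graded commutation
% into ordinary commutation.
% Notation (ours): x \in Cl_D(A), y \in Cl_{\bar D}({}^J A), each of
% definite parity |x|, |y| \in \{0,1\} with respect to \gamma.
\begin{align*}
&\text{graded commutation:}\quad x\,y = (-1)^{|x||y|}\,y\,x, \qquad
  x\,\gamma = (-1)^{|x|}\,\gamma\,x, \qquad \gamma^2 = 1, \\
&|y| = 0:\quad x\,y = y\,x, \\
&|y| = 1:\quad x\,(\gamma y) = (-1)^{|x|}\,\gamma\,x\,y
      = (-1)^{|x|}(-1)^{|x|}\,\gamma\,y\,x = (\gamma y)\,x .
\end{align*}
```

General elements are sums of elements of definite parity, so the statement extends by linearity.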


We propose the following definitions: The $N = 2$ spectral data $(A, H, d, T, *)$ describe a non-commutative manifold if

\[ \widetilde{Cl}_{\bar D}({}^J\!A) = J\,Cl_D(A)\,J. \]

Furthermore, inspired by classical geometry, we say that a non-commutative manifold $(A, H, d, T, *, \xi_0)$ is spin$^c$ if the Hilbert space factorizes as a $Cl_D(A) \otimes \widetilde{Cl}_{\bar D}({}^J\!A)$ module in the form

\[ H = H_D \otimes_Z H_{\bar D}, \]

where $Z$ denotes the center of $M$.

Next, we introduce a notion of "quantized phase space". We consider a set of $N = (1,1)$ spectral data $(A, H, d, \gamma, *)$, where we now think of $A$ as the algebra of phase space "functions" (i.e. of pseudo-differential operators, in the Schrödinger picture of quantum mechanics) rather than functions over configuration space. We are, therefore, not postulating the existence of a cyclic and separating vector for the algebra $Cl_D(A)$. Instead, we define for each $\beta > 0$ the temperature or KMS state

\[ \int\!\!\!\!\!\!-_{\!\beta} : \ Cl_D(A) \longrightarrow \mathbb{C}, \qquad \omega \longmapsto \int\!\!\!\!\!\!-_{\!\beta}\ \omega := \frac{\mathrm{Tr}_H\big(\omega\,e^{-\beta D^2}\big)}{\mathrm{Tr}_H\big(e^{-\beta D^2}\big)}, \]

with no limit $\beta \to 0$ taken, in contrast to Definition 2.3. The $\beta$-integral $\int\!\!\!\!\!\!-_{\!\beta}$ clearly is a faithful state, and through the GNS-construction we obtain a faithful representation of $Cl_D(A)$ on a Hilbert space $H_\beta$ with a cyclic and separating vector $\xi_\beta \in H_\beta$ for $M$. Each bounded operator $A \in B(H)$ on $H$ induces a bounded operator $A_\beta$ on $H_\beta$; this is easily seen by computing matrix elements of $A_\beta$,

\[ \langle A_\beta\,x, y\rangle = \int\!\!\!\!\!\!-_{\!\beta}\ A\,x\,y^* \]

for all $x, y \in M \subset H_\beta$, and using the explicit form of the $\beta$-integral. We denote the modular conjugation and the modular operator on $H_\beta$ by $J_\beta$ and $\Delta_\beta$, respectively, and we assume that for each $a \in M$ the commutator

\[ [\,\bar D, J_\beta a J_\beta\,] = \frac{1}{i}\,\frac{d}{dt}\,\big(e^{it\bar D}\big)_\beta\,J_\beta a J_\beta\,\big(e^{-it\bar D}\big)_\beta\,\Big|_{t=0} \]

defines a bounded operator on $H_\beta$. Then we can define an algebra of bounded operators $\widetilde{Cl}_{\bar D}({}^{J_\beta}\!A)$ on $H_\beta$, which is contained in the commutant of $Cl_D(A)$, and we say that the $N = (1,1)$ spectral data $(A, H, d, \gamma, *)$ describe a quantized phase space if the following equation holds:

\[ J_\beta\,Cl_D(A)\,J_\beta = \widetilde{Cl}_{\bar D}({}^{J_\beta}\!A). \]

2.3. Hermitian and Kähler non-commutative geometry. In this section, we introduce the spectral data describing complex non-commutative spaces, more specifically spaces that carry a Hermitian or a Kähler structure; the terminology is of course carried over from the classical case, see part I. Since these structures are more restrictive than the data of Riemannian non-commutative geometry, we will be able to derive some appealing properties of the space of differential forms. We also find a necessary condition for a set


of $N = (1,1)$ spectral data to extend to Hermitian data. A different approach to complex non-commutative geometry has been proposed in [BC].

2.3.1. Hermitian and $N = (2,2)$ spectral data.

Definition 2.26. A set of data $(A, H, \partial, \bar\partial, T, \bar T, \gamma, *)$ is called a set of Hermitian spectral data if
1) the quintuple $(A, H, \partial + \bar\partial, \gamma, *)$ forms a set of $N = (1,1)$ spectral data;
2) $T$ and $\bar T$ are self-adjoint bounded operators on $H$, $\partial$ and $\bar\partial$ are densely defined, closed operators on $H$ such that the following (anti-)commutation relations hold:

\[ \partial^2 = \bar\partial^2 = 0, \quad \{\,\partial, \bar\partial\,\} = 0, \quad [\,T, \partial\,] = \partial, \quad [\,T, \bar\partial\,] = 0, \quad [\,\bar T, \partial\,] = 0, \quad [\,\bar T, \bar\partial\,] = \bar\partial, \quad [\,T, \bar T\,] = 0; \]

3) for any $a \in A$, $[\,T, a\,] = [\,\bar T, a\,] = 0$ and each of the operators $[\,\partial, a\,]$, $[\,\bar\partial, a\,]$ and $\{\,\partial, [\,\bar\partial, a\,]\,\}$ extends uniquely to a bounded operator on $H$;
4) the $\mathbb{Z}_2$-grading $\gamma$ satisfies $\{\,\gamma, \partial\,\} = \{\,\gamma, \bar\partial\,\} = 0$, $[\,\gamma, T\,] = [\,\gamma, \bar T\,] = 0$;
5) the Hodge $*$-operator satisfies

\[ *\,\partial = \zeta\,\bar\partial^*\,*, \qquad *\,\bar\partial = \zeta\,\partial^*\,* \]

for some phase $\zeta \in \mathbb{C}$.

Some remarks on this definition may be useful: The Jacobi identity and the equation $\{\,\partial, \bar\partial\,\} = 0$ show that Condition 3 above is in fact symmetric in $\partial$ and $\bar\partial$. As in Sect. 2.2.1, a set $(A, H, \partial, \bar\partial, T, \bar T)$ that satisfies the first three conditions but does not involve $\gamma$ or $*$, can be made into a complete set of Hermitian spectral data. In classical Hermitian geometry, the $*$-operator can always be taken to be the usual Hodge $*$-operator, up to a multiplicative redefinition in each degree, since complex manifolds are orientable. Next, we describe conditions sufficient to equip a set of $N = (1,1)$ spectral data with a Hermitian structure. In Subsect. 2.3.2, Corollary 2.34, a necessary criterion is given as well.

Proposition 2.27. Let $(A, H, d, \gamma, *)$ be a set of $N = (1,1)$ spectral data with $[\,\gamma, *\,] = 0$, and let $T$ be a self-adjoint bounded operator on $H$ such that
a) the operator $\partial := [\,T, d\,]$ is nilpotent: $\partial^2 = 0$;
b) $[\,T, \partial\,] = \partial$;
c) $[\,T, a\,] = 0$ for all $a \in A$;
d) $[\,T, \omega\,] \in \pi(\Omega^1(A))$ for all $\omega \in \pi(\Omega^1(A))$;
e) the operator $\bar\partial := d - \partial$ satisfies $*\,\partial = \zeta\,\bar\partial^*\,*$, where $\zeta$ is the phase appearing in the relations of $*$ in the $N = (1,1)$ data;
f) $[\,T, \gamma\,] = 0$ and $[\,T, \bar T\,] = 0$, where $\bar T := -\,*\,T\,*^{-1}$.
Then $(A, H, \partial, \bar\partial, T, \bar T, \gamma, *)$ forms a set of Hermitian spectral data.


Notice that Conditions a)–d) are identical to those in Definition I 2.20 of Sect. I 2.4.1. Requirement e) will turn out to correspond to part e) of that definition. The relations in f) ensure compatibility of the operators $T$, $\gamma$ and $*$ and were not needed in the classical setting.

Proof. We check each of the conditions in Definition 2.26: The first one is satisfied by assumption, since $d = \partial + \bar\partial$ is the differential of $N = (1,1)$ spectral data. The equalities $\partial^2 = \bar\partial^2 = \{\,\partial, \bar\partial\,\} = [\,T, \bar\partial\,] = 0$ follow from a) and b), as in the proof of Lemma I 2.21. With this, we compute

\[ [\,\bar T, \bar\partial\,] = -[\,*\,T\,*^{-1}, \bar\partial\,] = -\zeta\,*\,[\,T, \partial^*\,]\,*^{-1} = \bar\partial, \]

and since

\[ [\,\bar T, d\,] = [\,*\,T\,*^{-1}, d^*\,]^* = \zeta\,*\,[\,T, d\,]^*\,*^{-1} = \bar\partial, \]

we obtain $[\,\bar T, \partial\,] = 0$. The relation $[\,T, \bar T\,] = 0$ and self-adjointness of $T$ were part of the assumptions, and $\bar T^* = \bar T$ is clear from the unitarity of the Hodge $*$-operator. That $[\,\partial, a\,]$ and $[\,\bar\partial, a\,]$ are bounded for all $a \in A$ follows from the corresponding property of $d$ and from the assumption that $T$ is bounded. As in the proof of Proposition I 2.22, one shows that $\{\,\partial, [\,\bar\partial, a\,]\,\} \in \pi(\Omega_d^2(A))$, and therefore $\{\,\partial, [\,\bar\partial, a\,]\,\}$ is a bounded operator. $T$ and $*$ commute with all $a \in A$ by assumption, and thus the same is true for $\bar T$. Using f) and the Jacobi identity, we get

\[ \{\,\gamma, \partial\,\} = \{\,\gamma, [\,T, d\,]\,\} = [\,T, \{\,d, \gamma\,\}\,] + \{\,d, [\,\gamma, T\,]\,\} = 0 \]

and $\{\,\gamma, \bar\partial\,\} = \{\,\gamma, d - \partial\,\} = 0$. By assumption, $\gamma$ commutes with $T$ and $*$, therefore also with $\bar T$. Finally, the relations of Condition 5 in Definition 2.26 between the $*$-operator and $\partial$, $\bar\partial$ follow directly from e) and $*\,d = \zeta\,d^*\,*$.

As in classical differential geometry, Kähler spaces arise as a special case of Hermitian geometry. In particular, Kähler spectral data provide a realization of the $N = (2,2)$ supersymmetry algebra:

Definition 2.28. Hermitian spectral data $(A, H, \partial, \bar\partial, T, \bar T, \gamma, *)$ are called $N = (2,2)$ or Kähler spectral data if

\[ \{\,\partial, \bar\partial^*\,\} = \{\,\bar\partial, \partial^*\,\} = 0, \]
\[ \{\,\partial, \partial^*\,\} = \{\,\bar\partial, \bar\partial^*\,\}. \]

Note that the first line is a consequence of the second one in classical complex geometry, but has to be imposed as a separate condition in the non-commutative setting. One can also define Kähler spectral data, as in Sect. I 1.2, as containing a nilpotent differential $d$, together with its adjoint $d^*$, and two commuting U(1) generators $L^3$ and $J_0$, say, which satisfy the relations (I 1.49-51). This approach has the virtue that the complex structure familiar from classical differential geometry is already present in the algebraic formulation; see Eq. (I 1.54) for the precise relationship with $J_0$.
Moreover, this way of introducing non-commutative complex geometry makes the role of Lie group symmetries of the spectral data explicit, which is somewhat hidden in the formulation of Definitions 2.26 and 2.28 and in Proposition 2.27: The presence of the U(1) × U(1)


symmetry, acting in an appropriate way, ensures that a set of $N = (1,1)$ spectral data acquires an $N = (2,2)$ structure. Because of the advantages in the treatment of differential forms, we will stick to the setting using $\partial$ and $\bar\partial$ for the time being, but the data with generators $L^3$ and $J_0$ will appear naturally in the context of symplectic geometry in Sect. 2.5.

2.3.2. Differential forms. In the context of Hermitian non-commutative geometry, we have two differential operators $\partial$ and $\bar\partial$ at our disposal. We begin this section with the definition of an abstract algebra of universal forms which is appropriate for this situation.

Definition 2.29. A bi-differential algebra $B$ is a unital algebra together with two anti-commuting nilpotent derivations $\delta, \bar\delta : B \longrightarrow B$. A homomorphism of bi-differential algebras $\varphi : B \longrightarrow B'$ is a unital algebra homomorphism which intertwines the derivations.

Definition 2.30. The algebra of complex universal forms $\Omega^{\bullet,\bullet}(A)$ over a unital algebra $A$ is the (up to isomorphism) unique pair $(\iota, \Omega^{\bullet,\bullet}(A))$ consisting of a unital bi-differential algebra $\Omega^{\bullet,\bullet}(A)$ and an injective unital algebra homomorphism $\iota : A \longrightarrow \Omega^{\bullet,\bullet}(A)$ such that the following universal property holds: For any bi-differential algebra $B$ and any unital algebra homomorphism $\varphi : A \longrightarrow B$, there is a unique homomorphism $\widetilde\varphi : \Omega^{\bullet,\bullet}(A) \longrightarrow B$ of bi-differential algebras such that $\varphi = \widetilde\varphi \circ \iota$.

The description of $\Omega^{\bullet,\bullet}(A)$ in terms of generators and relations is analogous to the case of $\Omega^\bullet(A)$, and it shows that $\Omega^{\bullet,\bullet}(A)$ is a bi-graded bi-differential algebra

\[ \Omega^{\bullet,\bullet}(A) = \bigoplus_{r,s=0}^{\infty} \Omega^{r,s}(A) \tag{2.39} \]

by declaring the generators $a$, $\delta a$, $\bar\delta a$ and $\delta\bar\delta a$, $a \in A$, to have bi-degrees (0,0), (1,0), (0,1) and (1,1), respectively. As in the $N = (1,1)$ framework, we introduce an involution $\natural$, called complex conjugation, on the algebra of complex universal forms, provided $A$ is a $*$-algebra: $\natural : \Omega^{\bullet,\bullet}(A) \longrightarrow \Omega^{\bullet,\bullet}(A)$ is the unique anti-linear anti-automorphism acting on generators by

\[ \natural(a) \equiv a^\natural := a^*, \qquad \natural(\delta a) \equiv (\delta a)^\natural := \bar\delta(a^*), \qquad \natural(\bar\delta a) \equiv (\bar\delta a)^\natural := \delta(a^*), \qquad \natural(\delta\bar\delta a) \equiv (\delta\bar\delta a)^\natural := \delta\bar\delta(a^*). \tag{2.40} \]

Let $\widetilde\gamma$ be the $\mathbb{Z}_2$-reduction of the total grading on $\Omega^{\bullet,\bullet}(A)$, i.e., $\widetilde\gamma = (-1)^{r+s}$ on $\Omega^{r,s}(A)$. Then it is easy to verify that

\[ \bar\delta\,\natural\,\widetilde\gamma = \natural\,\delta. \tag{2.41} \]

This makes $\Omega^{\bullet,\bullet}(A)$ into a unital bi-graded bi-differential $\natural$-algebra. Let $(A, H, \partial, \bar\partial, T, \bar T, \gamma, *)$ be a set of Hermitian spectral data. Then we define a $\mathbb{Z}_2$-graded representation $\pi$ of $\Omega^{\bullet,\bullet}(A)$ as a unital bi-differential algebra on $H$ by setting


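\[ \pi(a) = a, \qquad \pi(\delta a) = [\,\partial, a\,], \qquad \pi(\bar\delta a) = [\,\bar\partial, a\,], \qquad \pi(\delta\bar\delta a) = \{\,\partial, [\,\bar\partial, a\,]\,\}. \tag{2.42} \]

Note that, by the Jacobi identity, the last equation is compatible with the anti-commutativity of $\delta$ and $\bar\delta$. As in the case of $N = (1,1)$ geometry, we have that

\[ \pi(\delta\omega) = [\,\partial, \pi(\omega)\,]_g, \qquad \pi(\bar\delta\omega) = [\,\bar\partial, \pi(\omega)\,]_g, \tag{2.43} \]

for any $\omega \in \Omega^{\bullet,\bullet}(A)$, and therefore the graded kernel of the representation $\pi$ has good properties: We define

\[ J^{\bullet,\bullet} := \bigoplus_{r,s=0}^{\infty} J^{r,s}, \qquad J^{r,s} := \{\, \omega \in \Omega^{r,s}(A) \mid \pi(\omega) = 0 \,\}, \tag{2.44} \]

and we prove the following statement in the same way as Proposition 2.21:

Proposition 2.31. The set $J$ is a two-sided, bi-graded, bi-differential $\natural$-ideal of $\Omega^{\bullet,\bullet}(A)$.

We introduce the space of complex differential forms as

\[ \Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A) := \bigoplus_{r,s=0}^{\infty} \Omega^{r,s}_{\partial,\bar\partial}(A), \qquad \Omega^{r,s}_{\partial,\bar\partial}(A) := \Omega^{r,s}(A)/J^{r,s}. \tag{2.45} \]

The algebra $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)$ is a unital bi-graded bi-differential $\natural$-algebra, too, and the representation $\pi$ determines a representation, still denoted $\pi$, of this algebra on $H$. Due to the presence of the operators $T$ and $\bar T$ among the Hermitian spectral data, the image of $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)$ under $\pi$ enjoys a property not present in the $N = (1,1)$ case:

Proposition 2.32. The representation of the algebra of complex differential forms satisfies

\[ \pi\big(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)\big) = \bigoplus_{r,s=0}^{\infty} \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big). \tag{2.46} \]

In particular, $\pi$ is a representation of $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)$ as a unital, bi-graded, bi-differential $\natural$-algebra. The $\natural$-operation is implemented on $\pi\big(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)\big)$ with the help of the Hodge $*$-operator and the $*$-operation on $B(H)$:

\[ \natural : \ \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big) \longrightarrow \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big), \qquad \omega \longmapsto \omega^\natural := (-\zeta)^{r+s}\,*\,\omega^*\,*^{-1}. \]

Proof. Let $\omega \in \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big)$. Then part 2) of Definition 2.26 implies that

\[ [\,T, \omega\,] = r\,\omega, \qquad [\,\bar T, \omega\,] = s\,\omega, \]

which gives the direct sum decomposition (2.46). It remains to show that the $\natural$-operation is implemented on the space $\pi\big(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)\big)$: For $a \in A$, we have that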


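\[ \pi\big((\delta a)^\natural\big) = \pi\big(\bar\delta(a^*)\big) = [\,\bar\partial, a^*\,] = -[\,\bar\partial^*, a\,]^* = -[\,\bar\zeta\,*\,\partial\,*^{-1}, a\,]^* = -\zeta\,*\,[\,\partial, a\,]^*\,*^{-1} = -\zeta\,*\,\pi(\delta a)^*\,*^{-1}, \]

and, similarly, using (2.40) and the properties of the Hodge $*$-operator,

\[ \pi\big((\bar\delta a)^\natural\big) = -\zeta\,*\,\pi(\bar\delta a)^*\,*^{-1}, \qquad \pi\big((\delta\bar\delta a)^\natural\big) = \zeta^2\,*\,\pi(\delta\bar\delta a)^*\,*^{-1}. \]

This proves that $\pi(\omega^\natural) = \pi(\omega)^\natural$.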
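The degree-counting relation $[\,T, \omega\,] = r\,\omega$ used at the start of the proof can be checked on the represented generators of (2.42). The sketch below uses only the relations of parts 2) and 3) of Definition 2.26 and the fact that the commutator with $T$ is a derivation; the analogous computation with $\bar T$ gives $[\,\bar T, \omega\,] = s\,\omega$.

```latex
% Sketch: ad_T counts the holomorphic degree on the represented generators.
% Uses [T,\partial] = \partial, [T,\bar\partial] = 0, [T,a] = 0 (Definition 2.26).
\begin{align*}
[\,T, a\,] &= 0, \\
[\,T, [\,\partial, a\,]\,] &= [\,[\,T,\partial\,], a\,] + [\,\partial, [\,T,a\,]\,]
   = [\,\partial, a\,], \\
[\,T, [\,\bar\partial, a\,]\,] &= [\,[\,T,\bar\partial\,], a\,] + [\,\bar\partial,[\,T,a\,]\,] = 0, \\
[\,T, \{\,\partial,[\,\bar\partial,a\,]\,\}\,]
   &= \{\,[\,T,\partial\,],[\,\bar\partial,a\,]\,\} + \{\,\partial,[\,T,[\,\bar\partial,a\,]\,]\,\}
    = \{\,\partial,[\,\bar\partial,a\,]\,\}, \\
[\,T, \omega_1\omega_2\,] &= [\,T,\omega_1\,]\,\omega_2 + \omega_1\,[\,T,\omega_2\,].
\end{align*}
```

By the last line the eigenvalues add under products, so a product of generators of total bi-degree $(r, s)$ is an eigenvector of $[\,T, \cdot\,]$ with eigenvalue $r$; eigenvectors belonging to different pairs $(r, s)$ are linearly independent, which yields the direct sum in (2.46).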

As an aside, we mention that the implementation of $\natural$ on $\pi\big(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)\big)$ via the Hodge $*$-operator shows that Conditions e) of the "classical" Definition I 2.20 and of Proposition 2.27 are related; more precisely, the former is a consequence of the latter. Hermitian spectral data carry, in particular, an $N = (1,1)$ structure, and thus we have two notions of differential forms available. Their relation is described in our next proposition.

Proposition 2.33. The space of $N = (1,1)$ differential forms is included in the space of Hermitian forms, i.e.,

\[ \pi\big(\Omega_d^p(A)\big) \subset \bigoplus_{r+s=p} \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big), \tag{2.47} \]

and the spaces coincide if and only if

\[ [\,T, \omega\,] \in \pi\big(\Omega_d^1(A)\big) \tag{2.48} \]

for all $\omega \in \Omega_d^1(A)$.

Proof. The inclusion (2.47) follows simply from $d = \partial + \bar\partial$. If the spaces are equal then the equation $[\,T, \omega\,] = r\,\omega$, for all $\omega \in \pi\big(\Omega^{r,s}_{\partial,\bar\partial}(A)\big)$, implies (2.48). The converse is shown as in the proof of Proposition I 2.22 in Sect. 2.4.1 of part I, concerning classical Hermitian geometry.

Note that even if the spaces of differential forms do not coincide, the algebra of complex forms contains a graded differential algebra $\big(\Omega^{\bullet}_{\partial,\bar\partial}(A), d\big)$ with $d = \partial + \bar\partial$ and

\[ \Omega^{\bullet}_{\partial,\bar\partial}(A) := \bigoplus_{p} \Omega^{p}_{\partial,\bar\partial}(A), \qquad \Omega^{p}_{\partial,\bar\partial}(A) := \bigoplus_{r+s=p} \Omega^{r,s}_{\partial,\bar\partial}(A). \tag{2.49} \]

By Proposition 2.32, we know that

\[ \pi\big(\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)\big) = \bigoplus_{p} \pi\big(\Omega^{p}_{\partial,\bar\partial}(A)\big), \]

and hence we obtain a necessary condition for $N = (1,1)$ spectral data to extend to Hermitian spectral data:

Corollary 2.34. If a set of $N = (1,1)$ spectral data extends to a set of Hermitian spectral data then

\[ \pi\big(\Omega_d^\bullet(A)\big) = \bigoplus_{p} \pi\big(\Omega_d^p(A)\big). \]


This condition is clearly not sufficient since it is always satisfied in classical differential geometry. Beyond the complexes (2.45) and (2.49), one can of course also consider the analogue of the Dolbeault complex using only the differential $\bar\partial$ acting on $\Omega^{\bullet,\bullet}_{\partial,\bar\partial}(A)$. The details are straightforward. We conclude this subsection with some remarks concerning possible variations of our Definition 2.26 of Hermitian spectral data. For example, one may wish to drop the boundedness condition on the operators $T$ and $\bar T$, in order to include infinite-dimensional spaces into the theory. This is possible, but then one has to make some stronger assumptions in Proposition 2.27. Another relaxation of the requirements in Hermitian spectral data is to avoid introducing $T$ and $\bar T$ altogether, and to replace them by a decomposition of the $\mathbb{Z}_2$-grading $\gamma = \gamma_\partial + \gamma_{\bar\partial}$ such that

\[ \{\,\gamma_\partial, \partial\,\} = 0, \qquad [\,\gamma_\partial, \bar\partial\,] = 0, \qquad \{\,\gamma_{\bar\partial}, \bar\partial\,\} = 0, \qquad [\,\gamma_{\bar\partial}, \partial\,] = 0. \]

Then the space of differential forms may be defined as above, but Propositions 2.32 and 2.33, as well as the good properties of the integral established in the next subsection, will not hold in general.

2.3.3. Integration in complex non-commutative geometry. The definition of the integral is completely analogous to the $N = (1,1)$ setting: Again we use the operator $\Delta = d\,d^* + d^*d$, where now $d = \partial + \bar\partial$. Due to the larger set of data, the space of square-integrable, complex differential forms, now obtained after quotienting by the two-sided bi-graded $\natural$-ideal $K$, has better properties than the corresponding space of forms in Riemannian non-commutative geometry. There, two elements $\omega \in \Omega_d^p(A)$ and $\eta \in \Omega_d^q(A)$ with $p \neq q$ were not necessarily orthogonal with respect to the sesqui-linear form $(\cdot,\cdot)$ induced by the integral. For Hermitian and Kähler non-commutative geometry, however, we can prove the following orthogonality statements:

Proposition 2.35. Let $\omega_i \in \pi\big(\Omega^{r_i,s_i}_{\partial,\bar\partial}(A)\big)$, $i = 1, 2$. Then

\[ (\omega_1, \omega_2) = 0 \tag{2.50} \]

if $r_1 + s_1 \neq r_2 + s_2$ in the Hermitian case; if the spectral data also carry an $N = (2,2)$ structure, then Eq. (2.50) holds as soon as $r_1 \neq r_2$ or $s_1 \neq s_2$.

Proof. In the case of Hermitian spectral data, the assertion follows immediately from cyclicity of the trace, from the commutation relations

\[ [\,T, \omega_i\,] = r_i\,\omega_i, \qquad [\,\bar T, \omega_i\,] = s_i\,\omega_i, \]

which means that $T + \bar T$ counts the total degree of a differential form, and from the equation $[\,T + \bar T, \Delta\,] = 0$. In the Kähler case, Definition 2.28 implies the stronger relations $[\,T, \Delta\,] = [\,\bar T, \Delta\,] = 0$.
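The Hermitian case of this argument can be spelled out as follows. The sketch introduces the shorthand $N := T + \bar T$ and $p_i := r_i + s_i$ (our notation), and it assumes that the integral of Definition 2.3 is a normalised limit of traces of the form $\mathrm{Tr}\big(\,\cdot\;e^{-\epsilon\Delta}\big)$, which is our reading of that definition; $X$ stands for any operator under the integral.

```latex
% Sketch of the proof of Proposition 2.35 (Hermitian case).
% Notation (ours): N := T + \bar T, p_i := r_i + s_i.
% Assumption: the integral is a normalised limit of Tr( . e^{-\epsilon\Delta}).
\begin{align*}
[\,N, d\,] &= d \;\Longrightarrow\; [\,N, d^*\,] = -d^* \;\Longrightarrow\;
[\,N, \Delta\,] = [N,d]\,d^* + d\,[N,d^*] + [N,d^*]\,d + d^*\,[N,d] = 0, \\
[\,N, \omega_1\omega_2^*\,] &= [N,\omega_1]\,\omega_2^* + \omega_1\,[N,\omega_2^*]
   = (p_1 - p_2)\,\omega_1\omega_2^*, \\
\mathrm{Tr}\big([N, X]\,e^{-\epsilon\Delta}\big)
   &= \mathrm{Tr}\big(N X e^{-\epsilon\Delta}\big) - \mathrm{Tr}\big(X e^{-\epsilon\Delta} N\big) = 0 , \\
(p_1 - p_2)\,(\omega_1,\omega_2) &= 0 .
\end{align*}
```

The third line uses that $N$ commutes with $e^{-\epsilon\Delta}$ and that the trace is cyclic. In the Kähler case the same computation can be run with $T$ and $\bar T$ separately, which gives orthogonality in each bi-degree.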


2.3.4. Generalized metric on $\widetilde\Omega^1_{\partial,\bar\partial}(A)$. The notions of vector bundles, Hermitian structure, torsion, etc. are defined just as for $N = (1,1)$ spectral data in Subsect. 2.2. The definitions of holomorphic vector bundles and connections can be carried over from the classical case; see Sect. I 2.4.4. Again, we pass from $\Omega^1_{\partial,\bar\partial}$, see Eq. (2.49), to the space $\widetilde\Omega^1_{\partial,\bar\partial}$ of all square-integrable 1-forms, which is equipped with a generalized Hermitian structure $\langle\cdot,\cdot\rangle_{\partial,\bar\partial}$ according to the construction in Theorem 2.9. Starting from here, we can define an analogue

\[ \langle\!\langle\cdot,\cdot\rangle\!\rangle : \ \widetilde\Omega^1_{\partial,\bar\partial}(A) \times \widetilde\Omega^1_{\partial,\bar\partial}(A) \longrightarrow \mathbb{C} \]

of the $\mathbb{C}$-bi-linear metric in classical complex geometry by

\[ \langle\!\langle \omega, \eta \rangle\!\rangle := \langle \omega, \eta^\natural \rangle_{\partial,\bar\partial}. \]

Proposition 2.36. The generalized metric $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ on $\widetilde\Omega^1_{\partial,\bar\partial}(A)$ has the following properties:
1) $\langle\!\langle a\,\omega, \eta\,b \rangle\!\rangle = a\,\langle\!\langle \omega, \eta \rangle\!\rangle\,b$;
2) $\langle\!\langle \omega\,a, \eta \rangle\!\rangle = \langle\!\langle \omega, a\,\eta \rangle\!\rangle$;
3) $\langle\!\langle \omega, \omega^\natural \rangle\!\rangle \geq 0$;
here $\omega, \eta \in \widetilde\Omega^1_{\partial,\bar\partial}(A)$ and $a, b \in A$. If the underlying spectral data are Kählerian, one has that

\[ \langle\!\langle \omega, \eta \rangle\!\rangle = 0 \]

if $\omega, \eta \in \widetilde\Omega^{1,0}_{\partial,\bar\partial}(A)$ or $\omega, \eta \in \widetilde\Omega^{0,1}_{\partial,\bar\partial}(A)$.

Proof. The first three statements follow directly from the definition of $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ and the corresponding properties of $\langle\cdot,\cdot\rangle_{\partial,\bar\partial}$ listed in Theorem 2.9. The last assertion is a consequence of Proposition 2.35, using the fact that the spaces $\widetilde\Omega^{r,s}_{\partial,\bar\partial}(A)$ are $A$-bi-modules.

Note that this property of the metric $\langle\!\langle\cdot,\cdot\rangle\!\rangle$ corresponds to the property $g_{\mu\nu} = g_{\bar\mu\bar\nu} = 0$ (in complex coordinates) in the classical case.

2.4. The $N = (4,4)$ spectral data. We just present the definition of spectral data describing non-commutative Hyperkähler spaces. Obviously, it is chosen in analogy to the discussion of the classical case in Sect. 2.5 of part I.

Definition 2.37. A set of data $(A, H, G^{a\pm}, \bar G^{a\pm}, T^i, \bar T^i, \gamma, *)$ with $a = 1, 2$, $i = 1, 2, 3$, is called a set of $N = (4,4)$ or Hyperkähler spectral data if
1) the subset $(A, H, G^{1+}, \bar G^{1+}, T^3, \bar T^3, \gamma, *)$ forms a set of $N = (2,2)$ spectral data;
2) $G^{a\pm}$, $a = 1, 2$, are closed, densely defined operators on $H$, and $T^i$, $i = 1, 2, 3$, are bounded operators on $H$ which satisfy $(G^{a\pm})^* = G^{a\mp}$, $(T^i)^* = T^i$ and the following (anti-)commutation relations ($a, b = 1, 2$, $i, j = 1, 2, 3$, and $\tau^i$ are the Pauli matrices):

\[ \{\,G^{a+}, G^{b+}\,\} = 0, \qquad \{\,G^{a-}, G^{b+}\,\} = \delta^{ab}\,\Box, \qquad [\,\Box, G^{a+}\,] = 0, \qquad [\,\Box, T^i\,] = 0, \]
\[ [\,T^i, T^j\,] = i\,\epsilon^{ijk}\,T^k, \qquad [\,T^i, G^{a+}\,] = \tfrac{1}{2}\,\tau^i_{ab}\,G^{b+}, \]

for some self-adjoint operator $\Box$ on $H$, which, in the classical case, is the holomorphic part of the Laplace operator;


3) the operators $\bar G^{a\pm}$, $a = 1, 2$, and $\bar T^i$, $i = 1, 2, 3$, also satisfy the conditions in 2) and (anti-)commute with $G^{a\pm}$ and $T^i$.

The construction of non-commutative differential forms and the integration theory is precisely the same as for $N = (2,2)$ spectral data. We therefore refrain from giving more details. It might, however, be interesting to see whether the additional information encoded in $N = (4,4)$ spectral data gives rise to special properties, beyond the ones found for Kähler data in Subsect. 2.3.3.

2.5. Symplectic non-commutative geometry. Once more, our description in the non-commutative context follows the algebraic characterization of classical symplectic manifolds given in Sect. 2.6 of part I. The difference between our approaches to the classical and to the non-commutative case is that, in the former, we could derive most of the algebraic relations, including the SU(2) structure showing up on symplectic manifolds, from the specific properties of the symplectic 2-form, whereas now we will instead include those relations into the defining data, as a "substitute" for the symplectic form.

Definition 2.38. The set of data $(A, H, d, L^3, L^+, L^-, \gamma, *)$ is called a set of symplectic spectral data if
1) $(A, H, d, \gamma, *)$ is a set of $N = (1,1)$ spectral data;
2) $L^3$, $L^+$ and $L^-$ are bounded operators on $H$ which commute with all $a \in A$ and satisfy the $sl_2$ commutation relations

\[ [\,L^3, L^\pm\,] = \pm 2 L^\pm, \qquad [\,L^+, L^-\,] = L^3 \]

as well as the Hermiticity properties $(L^3)^* = L^3$, $(L^\pm)^* = L^\mp$; furthermore, they commute with the grading $\gamma$ on $H$;
3) the operator $\widetilde d^* := [\,L^-, d\,]$ is densely defined and closed, and together with $d$ it forms an SU(2) doublet, i.e., the following commutation relations hold:

\[ [\,L^3, d\,] = d, \qquad [\,L^+, d\,] = 0, \qquad [\,L^-, d\,] = \widetilde d^*, \]
\[ [\,L^3, \widetilde d^*\,] = -\widetilde d^*, \qquad [\,L^+, \widetilde d^*\,] = d, \qquad [\,L^-, \widetilde d^*\,] = 0. \]

As in the classical case, there is a second SU(2) doublet spanned by the adjoints d∗ e. The Jacobi identity shows that d e∗ is nilpotent and that it anti-commutes with d. and d Differential forms and integration theory are formulated just as for N = (1, 1) spectral data, but the presence of SU(2) generators among the symplectic spectral data leads to additional interesting features, such as the following: Let ω ∈ kd (A) and η ∈ ld (A) be two differential forms. Then their scalar product, see Eq. (2.38), vanishes unless k = l: (ω, η) = 0 if k 6 = l.

(2.51)

This is true because, by the SU(2) commutation relations listed above, the operator $L^3$ induces a Z-grading on differential forms, and because $L^3$ commutes with the Laplacian $\triangle = d^* d + d d^*$. One consequence of (2.51) is that the reality property of (·, ·) stated in Proposition 2.22 is valid independently of the phase occurring in the Hodge relations. The following proposition shows that we can introduce an N = (2, 2) structure on a set of symplectic spectral data if certain additional properties are satisfied. As was the case for Definition 2.38, the extra requirements are slightly stronger than in the

Supersymmetric Quantum Theory and Non-Commutative Geometry


classical situation, where some structural elements like the almost-complex structure are given automatically. In the Kähler case, the latter allows for a separate counting of holomorphic resp. anti-holomorphic degrees of differential forms, which in turn ensures that the symmetry group of the symplectic data associated to a classical Kähler manifold is in fact SU(2) × U(1) – see also Sect. 3 of part I. Without this enlarged symmetry group, it is impossible to re-interpret the N = 4 as an N = (2, 2) supersymmetry algebra. Therefore, we explicitly postulate the existence of an additional U(1) generator in the non-commutative context – which coincides with the U(1) generator J0 in Eq. (I 1.49) of Sect. I 1.2 and is intimately related to the complex structure.

Proposition 2.39. Suppose that the SU(2) generators of a set of symplectic spectral data satisfy the following relations with the Hodge operator:
\[ * L^3 = -L^3 *, \qquad * L^+ = -\zeta^2 L^- *, \]
where ζ is the phase appearing in the Hodge relations of the N = (1, 1) subset of the symplectic data. Assume, furthermore, that there exists a bounded self-adjoint operator $J_0$ on H which commutes with all a ∈ A, with the grading γ, and with $L^3$, whereas it acts like
\[ [J_0, d] = -i\,\tilde d, \qquad [J_0, \tilde d] = i\,d \]
between the SU(2) doublets. Then the set of symplectic data carries an N = (2, 2) Kähler structure with
\[ \partial = \tfrac12 (d - i\,\tilde d), \qquad \bar\partial = \tfrac12 (d + i\,\tilde d), \qquad T = \tfrac12 (L^3 + J_0), \qquad \bar T = \tfrac12 (L^3 - J_0). \]

Proof. All the conditions listed in Definition 2.26 of Hermitian spectral data can be verified easily: Nilpotency of $\partial$ and $\bar\partial$ follows from $d^2 = \tilde d^2 = 0$ and
\[ \{ d, \tilde d \} = 0, \tag{2.52} \]

and the action of the Hodge operator on the SU(2) generators ensures that ∗ intertwines $\partial$ and $\bar\partial$ in the right way. As for the extra conditions in Definition 2.28 of Kähler spectral data, one sees that the first one is always true for symplectic spectral data, whereas the second one, namely the equality of the “holomorphic” and “anti-holomorphic” Laplacians, is again a consequence of relation (2.52).

3. The Non-Commutative 3-Sphere

Here and in the next section, we present two examples of non-commutative spaces and show how the general methods developed above can be applied. We first discuss the “quantized” or “fuzzy” 3-sphere. We draw some inspiration from the conformal field theory associated to a non-linear σ-model with target being a 3-sphere, the so-called SU(2)-WZW model, see [Wi3] and also [FGK, PS]. But while the ideas on a non-commutative interpretation of conformal field theory models proposed in [FG] are essential for placing non-commutative geometry into a string theory context, the following calculations are self-contained; the results of subsections 3.2 and 3.3 are taken from [Gr]. Although there is no doubt that the methods used in [Gr] and below can be extended to arbitrary compact, connected and simply connected Lie groups, we will, for simplicity, restrict ourselves to the case of SU(2).


We first introduce a set of N = 1 spectral data describing the non-commutative 3-sphere, then discuss the de Rham complex and its cohomology, and finally turn towards geometrical aspects of this non-commutative space. Subsect. 3.4 briefly describes the N = (1, 1) formalism.

3.1. The N = 1 data associated to the 3-sphere. In this subsection, we introduce N = 1 data describing the non-commutative 3-sphere. Since the 3-sphere is diffeomorphic to the Lie group G = SU(2), we are looking for data describing a Lie group G. Let $\{T_A\}$ be a basis of $g = T_e G$, the Lie algebra of G. By $\vartheta_A$ and $\bar\vartheta_A$ we denote the left- and right-invariant vector fields associated to the basis elements $T_A$, and by $\theta^A$ and $\bar\theta^A$ the corresponding dual basis of 1-forms. The structure constants $f_{AB}{}^{C}$ are defined, as usual, by
\[ [\vartheta_A, \vartheta_B] = f_{AB}{}^{C}\, \vartheta_C. \tag{3.1} \]

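The Killing form on g induces a canonical Riemannian metric on T G given by
\[ g_{AB} \equiv g(\vartheta_A, \vartheta_B) = -\mathrm{Tr}\,(\mathrm{ad}_{T_A} \circ \mathrm{ad}_{T_B}) = -f_{AC}{}^{D} f_{BD}{}^{C}, \tag{3.2} \]
and the Levi–Civita connection reads
\[ \nabla_A \vartheta_B \equiv \nabla_{\vartheta_A} \vartheta_B = \tfrac12 f_{AB}{}^{C}\, \vartheta_C. \tag{3.3} \]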
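As a concrete check of Eq. (3.2), the following sketch evaluates $g_{AB} = -f_{AC}{}^{D} f_{BD}{}^{C}$ for su(2). It assumes that the structure constants are given by the Levi-Civita symbol, which is the normalization adopted later in Subsect. 3.2; the helper names `levi_civita` and `killing_metric` are illustrative only and are not taken from the paper.

```python
def levi_civita(a, b, c):
    """Levi-Civita symbol on the index set {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def killing_metric():
    """Return g_AB = -f_AC^D f_BD^C, assuming f_AB^C = epsilon_ABC."""
    n = 3
    return [[-sum(levi_civita(a, c, d) * levi_civita(b, d, c)
                  for c in range(n) for d in range(n))
             for b in range(n)]
            for a in range(n)]


g = killing_metric()
print(g)
```

Under this assumption the result is twice the identity matrix, in agreement with the choice $g_{AB} = 2\delta_{AB}$ made in Subsect. 3.2.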

The left-invariant vector fields $\vartheta_A$ define a trivialization of the (co-)tangent bundle. We denote by $\nabla^L$ the flat connection associated to that trivialization, $\nabla^L \theta^A = 0$ for all A. We introduce the operators $a^{A*} = \theta^A \wedge$, $a^A = g^{AB}\, \vartheta_B \lrcorner$ on the space of differential forms, as well as the usual gamma matrices
\[ \gamma^A = a^{A*} - a^A, \qquad \bar\gamma^A = i\,( a^{A*} + a^A ). \tag{3.4} \]

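It is easy to verify that $\gamma^A$ and $\bar\gamma^A$ generate two anti-commuting copies of the Clifford algebra,
\[ \{ \gamma^A, \gamma^B \} = \{ \bar\gamma^A, \bar\gamma^B \} = -2 g^{AB}, \qquad \{ \gamma^A, \bar\gamma^B \} = 0. \tag{3.5} \]
Following the notations of Sect. I 2.2, we shall denote by S the bundle of differential forms endowed with the above structures. We define two connections $\nabla^S$ and $\bar\nabla^S$ on S by setting
\[ \nabla^S = \theta^A \otimes \bigl( \nabla^L_{\vartheta_A} + \tfrac{1}{12} f_{ABC}\, \gamma^B \gamma^C \bigr), \qquad \bar\nabla^S = \bar\theta^A \otimes \bigl( \nabla^L_{\bar\vartheta_A} - \tfrac{1}{12} f_{ABC}\, \bar\gamma^B \bar\gamma^C \bigr), \tag{3.6} \]
where $f_{ABC} = f_{AB}{}^{D} g_{DC}$, and we put
\[ J_A := i\, \nabla^L_{\vartheta_A}, \qquad \psi^A := -i\, \gamma^A, \qquad \bar J_A := -i\, \nabla^L_{\bar\vartheta_A}, \qquad \bar\psi^A := i\, \bar\gamma^A. \tag{3.7} \]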


These objects satisfy the commutation relations
\[ [ J_A, J_B ] = i f_{AB}{}^{C} J_C, \qquad \{ \psi^A, \psi^B \} = 2 g^{AB}, \tag{3.8} \]
with analogous relations for $\bar J_A$ and $\bar\psi^A$; barred and unbarred operators (anti-)commute. The two anti-commuting Dirac operators $D$ and $\bar D$ on S read [FG]
\[ D = \psi^A J_A - \frac{i}{12} f_{ABC}\, \psi^A \psi^B \psi^C, \qquad \bar D = \bar\psi^A \bar J_A - \frac{i}{12} f_{ABC}\, \bar\psi^A \bar\psi^B \bar\psi^C. \tag{3.9} \]
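The relations (3.8) can be illustrated in the smallest non-trivial case. The sketch below is not part of the original argument: it assumes the normalization $f_{AB}{}^{C} = \varepsilon_{ABC}$, $g_{AB} = 2\delta_{AB}$ (so that $2g^{AB} = \delta^{AB}$), takes $J_A = \sigma_A/2$ in the spin-1/2 representation, and uses $\psi^A = \sigma_A/\sqrt{2}$ as one possible two-dimensional realization of the relations for the $\psi^A$ alone. All function names are hypothetical.

```python
import math


def eps(a, b, c):
    """Levi-Civita symbol on the index set {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lincomb(s, x, t, y):
    """Entrywise s*x + t*y for 2x2 matrices."""
    return [[s * x[i][j] + t * y[i][j] for j in range(2)] for i in range(2)]


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Assumed realizations: J_A = sigma_A / 2, psi^A = sigma_A / sqrt(2).
J = [lincomb(0.5, s, 0, ZERO) for s in PAULI]
PSI = [lincomb(1 / math.sqrt(2), s, 0, ZERO) for s in PAULI]


def su2_relations_hold():
    """Check [J_A, J_B] = i eps_ABC J_C for all A, B."""
    for a in range(3):
        for b in range(3):
            comm = lincomb(1, matmul(J[a], J[b]), -1, matmul(J[b], J[a]))
            rhs = ZERO
            for c in range(3):
                rhs = lincomb(1, rhs, 1j * eps(a, b, c), J[c])
            if not close(comm, rhs):
                return False
    return True


def clifford_relations_hold():
    """Check {psi^A, psi^B} = 2 g^{AB} = delta^{AB} for all A, B."""
    for a in range(3):
        for b in range(3):
            anti = lincomb(1, matmul(PSI[a], PSI[b]), 1, matmul(PSI[b], PSI[a]))
            rhs = lincomb(1.0 if a == b else 0.0, ONE, 0, ZERO)
            if not close(anti, rhs):
                return False
    return True


print(su2_relations_hold(), clifford_relations_hold())
```

Both checks use a tolerance because $1/\sqrt{2}$ is not exactly representable in floating point.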

The Z2-grading operator γ on S, anti-commuting with $D$ and $\bar D$, is given by
\[ \gamma = \frac{1}{i\,(3!)^2}\; g\; \varepsilon_{ABC}\, \varepsilon_{DEF}\; \psi^A \psi^B \psi^C\, \bar\psi^D \bar\psi^E \bar\psi^F, \tag{3.10} \]

where $g = \det g_{AB}$. By $L^2(S) \simeq L^2(G) \otimes W$, where W is the irreducible representation of the Clifford algebra of Eqs. (3.4,5), we denote the Hilbert space of square integrable sections of the bundle S, with respect to the normalized Haar measure on G. In the language of Connes’ spectral triples, the classical 3-sphere is described by the N = 1 data (L2 (S), C ∞ (G), D, γ) , with D ≡ D. The Hilbert space $L^2(S)$ carries a unitary representation π of G × G given by
\[ \bigl( \pi(g_1, g_2) f \bigr)(h) = f(g_1^{-1} h\, g_2), \tag{3.11} \]
for all $g_i, h \in G$ and $f \in L^2(G)$. For each $j \in \tfrac12 \mathbb{Z}_+$ we denote by $(\pi^j, V_j)$ the irreducible unitary (2j + 1)-dimensional (spin j) representation of G, and to each vector $\xi^* \otimes \eta \in V_j^* \otimes V_j$ we associate a smooth function $f_{\xi^* \otimes \eta} \in C^\infty(G)$ by setting
\[ f_{\xi^* \otimes \eta}(g) = \frac{1}{\sqrt{2j+1}}\, \langle\, \xi^*, \pi^j(g)\, \eta \,\rangle. \tag{3.12} \]
This defines a linear isometry
\[ \varphi : \bigoplus_{j \in \frac12 \mathbb{Z}_+} V_j^* \otimes V_j \longrightarrow L^2(G), \tag{3.13} \]
and the Peter-Weyl theorem states that the image of ϕ is dense in $L^2(G)$ and also in C(G) in the supremum norm topology. It is easy to verify that the operators $J_A$ and $\bar J_A$ act on $\bigoplus_{j \in \frac12 \mathbb{Z}_+} V_j^* \otimes V_j$ as $d\pi(T_A, 1)$ and $d\pi(1, T_A)$, respectively. For each positive integer k, we denote by $P_{(k)}$ the orthogonal projection
\[ P_{(k)} : L^2(G) \longrightarrow H_0 := \bigoplus_{j = 0, \frac12, \ldots}^{k/2} V_j^* \otimes V_j. \tag{3.14} \]

The Dirac operator D and the Z2 -grading γ clearly leave the finite-dimensional Hilbert space H0 ⊗ W invariant. We define A0 to be the unital subalgebra of End(H0 ) generated by operators of the form P(k) fξ∗ ⊗η , where ξ ∗ ⊗ η ∈ H0 . The following theorem is proven in [Gr]:


Theorem 3.1. The algebra A0 coincides with the algebra of endomorphisms of H0, i.e., A0 = End (H0 ).

The proof in [Gr] shows that A0 is a full matrix algebra for any compact, connected and simply connected group. That A0 equals the endomorphism ring of H0 was only proved for SU(2), but a slight generalization of the proof for SU(2) should yield the result for all groups of the above type. We define the non-commutative 3-sphere by the N = 1 data (A0 , H0 ⊗ W, D, γ). Notice that this definition of the non-commutative 3-sphere is very close to that of the non-commutative 2-sphere [Ber, Ho, Ma, GKP]. For an alternative derivation of this definition, the reader is referred to [FG] where it is shown how this space arises as the quantum target of the WZW model based on SU(2). We note that 1/k plays the role of Planck’s constant ℏ in the quantization of symplectic manifolds, i.e., it is a deformation parameter. Formally, the classical 3-sphere emerges as the limit of non-commutative 3-spheres as the deformation parameter 1/k tends to zero.

3.2. The topology of the non-commutative 3-sphere. In this subsection, we shall apply the tools of Subsect. 2.1 to the non-commutative space (A0 , H0 ⊗ W, D, γ) describing the non-commutative 3-sphere; we follow the presentation in [Gr]. For convenience, we shall choose the basis $\{T_A\}$ of $T_e G$ in such a way that $g_{AB} = 2\delta_{AB}$. The structure constants are then given by the Levi–Civita tensor, $f_{AB}{}^{C} = \varepsilon_{ABC}$.

3.2.1. The de Rham complex. First, we determine the structure of the spaces of differential forms $\Omega^n_D(A_0)$ and the action of the exterior differentiation $\delta : \Omega^\bullet_D(A_0) \longrightarrow \Omega^\bullet_D(A_0)$. We use the same notations as in Subsect. 2.1.2. The space of 1-forms is
\[ \Omega^1_D(A_0) \simeq \pi(\Omega^1(A_0)) = \Bigl\{ \sum_i a_0^i\, [ J_A, a_1^i ] \otimes \psi^A \;\Big|\; a_j^i \in A_0 \Bigr\}. \tag{3.15} \]

Since A0 is a full matrix algebra, see Theorem 3.1, it follows that
\[ \Omega^1_D(A_0) \simeq \{\, a_A \otimes \psi^A \mid a_A \in A_0 \,\}. \tag{3.16} \]
Using the fact that any element of $\pi(\Omega^2(A_0))$ can be written as a linear combination of products of pairs of elements in $\pi(\Omega^1(A_0))$, we get
\[ \pi(\Omega^2(A_0)) = \{\, a_{AB} \otimes \psi^A \psi^B \mid a_{AB} \in A_0 \,\}. \tag{3.17} \]
Our next task is to determine the space $\pi(\delta J^1)$ of so-called “auxiliary 2-forms”, see Eq. (2.2). To this end, let $\omega = \sum_i a_i\, \delta b_i \in \Omega^1(A_0)$ be such that
\[ \pi(\omega) = \sum_i a_i\, [ D, b_i ] = 0. \tag{3.18} \]

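Using Eqs. (3.8) and (3.18), we see that the coefficient of $[ \psi^A, \psi^B ]$ in $\pi(\delta\omega)$ is proportional to
\[
\begin{aligned}
\varepsilon_{AB} \sum_i [ J_A, a_i ][ J_B, b_i ]
 &= -\varepsilon_{AB} \sum_i a_i \bigl[ J_A, [ J_B, b_i ] \bigr] \\
 &= -\frac12\, \varepsilon_{AB} \sum_i a_i \bigl[ [ J_A, J_B ], b_i \bigr]
  = -\frac{i}{2}\, \varepsilon_{AB}\, \varepsilon_{ABC} \sum_i a_i\, [ J_C, b_i ] = 0,
\end{aligned}
\]
where $\varepsilon_{AB}$ denotes the Levi–Civita antisymmetric tensor. This shows that $\pi(\delta J^1)$ is included in A0, and since A0 is a full matrix algebra, this implies that $\pi(\delta J^1)$ is either 0 or equal to A0. We construct a non-vanishing element of $\pi(\delta J^1)$ explicitly. Let $P_j$ be the orthogonal projection onto $V_j^* \otimes V_j$. We define a, b ∈ A0 by $a = P_0\, a\, P_{1/2}$, $b = P_{1/2}\, b\, P_0$ and
\[
a : \; V_{1/2}^* \otimes V_{1/2} \ni \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \otimes \begin{pmatrix} \gamma \\ \delta \end{pmatrix} \longmapsto \alpha - 2\beta + 2\gamma + \delta \in V_0^* \otimes V_0,
\]
\[
b : \; V_0^* \otimes V_0 \ni \alpha \longmapsto \alpha \cdot \begin{pmatrix} 1 \\ -2 \end{pmatrix} \otimes \begin{pmatrix} 2 \\ 1 \end{pmatrix} \in V_{1/2}^* \otimes V_{1/2}.
\]
It is straightforward to verify that ω := aδb satisfies π(ω) = 0 and π(δω) ≠ 0. This proves that $\pi(\delta J^1) = A_0$, and we get
\[ \Omega^2_D(A_0) \simeq \{\, a_{AB} \otimes \psi^A \psi^B \mid a_{AB} = -a_{BA} \in A_0 \,\}. \tag{3.19} \]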

In order to determine the space of 3-forms, we first notice that
\[ \pi(\Omega^3(A_0)) = \{\, a_{ABC} \otimes \psi^A \psi^B \psi^C \,\}, \tag{3.20} \]
and we compute the space $\pi(\delta J^2)$. Let $a_i, b_i, c_i \in A_0$ be such that $\omega = \sum_i a_i\, \delta b_i\, \delta c_i$ satisfies
\[ \pi(\omega) = \sum_i a_i\, [ D, b_i ][ D, c_i ] = 0. \tag{3.21} \]
The coefficient of $\psi^1 \psi^2 \psi^3$ in $\pi(\delta\omega)$ is proportional to
\[
\begin{aligned}
\varepsilon_{ABC} \sum_i [ J_A, a_i ][ J_B, b_i ][ J_C, c_i ]
 &= -\varepsilon_{ABC} \sum_i a_i \bigl[ J_A, [ J_B, b_i ][ J_C, c_i ] \bigr] \\
 &= -\varepsilon_{ABC} \sum_i a_i \Bigl( \bigl[ J_A, [ J_B, b_i ] \bigr] [ J_C, c_i ] + [ J_B, b_i ] \bigl[ J_A, [ J_C, c_i ] \bigr] \Bigr) \\
 &= -\frac{i}{2}\, \varepsilon_{ABC} \sum_i a_i \Bigl( \varepsilon_{ABD}\, [ J_D, b_i ][ J_C, c_i ] + \varepsilon_{ACD}\, [ J_B, b_i ][ J_D, c_i ] \Bigr) = 0,
\end{aligned}
\]
where we have used Eq. (3.21) and the Jacobi identity. Thus, $\pi(\delta J^2)$ is included in $\pi(\Omega^1(A_0))$, and since A0 is a full matrix algebra, it is either 0 or equal to $\pi(\Omega^1(A_0))$. Let $\omega, \eta \in \Omega^1(A_0)$ be such that $\pi(\omega) = -1 \otimes \psi^A$, π(η) = 0 and π(δη) = 1 ⊗ 1. The existence of ω and η is ensured by Eqs. (3.16) and the fact that $\pi(\delta J^1) = A_0$. We have $\omega\eta \in \Omega^2(A_0)$, π(ωη) = 0 and $\pi(\delta(\omega\eta)) = 1 \otimes \psi^A$ as π(η) = 0. This proves that $\pi(\delta J^2) = \pi(\Omega^1(A_0))$, and we get
\[ \Omega^3_D(A_0) \simeq \{\, a \otimes \psi^1 \psi^2 \psi^3 \mid a \in A_0 \,\}. \tag{3.22} \]


We proceed with the space of 4-forms. First, we notice that due to the Clifford algebra relations, Eqs. (3.4,5,8), we have
\[ \pi(\Omega^4(A_0)) = \{\, a_{AB} \otimes \psi^A \psi^B \mid a_{AB} \in A_0 \,\}. \tag{3.23} \]
Let $\omega \in \Omega^1(A_0)$ and $\eta \in \Omega^2(A_0)$ be such that π(ω) = 0, π(δω) = 1 ⊗ 1, and $\pi(\eta) = 1 \otimes \psi^A \psi^B$. The existence of ω and η is ensured by the fact that $\pi(\delta J^1) = A_0$ and by Eq. (3.17). We have $\omega\eta \in \Omega^3(A_0)$, π(ωη) = 0 and $\pi(\delta(\omega\eta)) = 1 \otimes \psi^A \psi^B$ as π(ω) = 0. Since A0 is a full matrix algebra, this proves that $\pi(\Omega^4(A_0)) = \pi(\delta J^3)$, and we get $\Omega^4_D(A_0) = 0$. Using the fact that the product of differential forms induces a surjective map
\[ \Omega^n_D(A_0) \otimes \Omega^m_D(A_0) \longrightarrow \Omega^{n+m}_D(A_0) \]
we obtain
\[ \Omega^n_D(A_0) = 0 \quad \forall\, n > 3. \tag{3.24} \]

Collecting Eqs. (3.16), (3.22) and (3.24), we arrive at the following theorem on the structure of differential forms over the non-commutative space (A0 , H0 ⊗ W, D, γ):

Theorem 3.2. The left A0-modules $\Omega^n_D(A_0)$ are all free and given as follows:
0) $\Omega^0_D(A_0) = A_0$ is one-dimensional with basis {1};
1) $\Omega^1_D(A_0)$ is three-dimensional with basis $\{1 \otimes \psi^A\}$;
2) $\Omega^2_D(A_0)$ is three-dimensional with basis $\{1 \otimes \psi^A \psi^{A+1}\}$ (where addition is taken modulo 3);
3) $\Omega^3_D(A_0)$ is one-dimensional with basis $\{1 \otimes \psi^1 \psi^2 \psi^3\}$;
4) $\Omega^n_D(A_0) = 0$ for all n > 3.

Notice that the structure of the modules $\Omega^n_D(A_0)$ is the same as that of the spaces of differential forms on SU(2) ≃ S^3. In the following, we compute the action of the exterior differential $\delta : \Omega^n_D(A_0) \longrightarrow \Omega^{n+1}_D(A_0)$. We introduce the following bases of $\Omega^1_D(A_0)$ and $\Omega^2_D(A_0)$:
\[ e^A = 1 \otimes \psi^A \in \Omega^1_D(A_0), \tag{3.25} \]
\[ f^A = \varepsilon^{ABC} \otimes \psi^B \psi^C \in \Omega^2_D(A_0), \tag{3.26} \]
which allows us to identify $\Omega^1_D(A_0)$ and $\Omega^2_D(A_0)$ with the standard free module $A_0^3$, and we decompose their elements with respect to these bases,
\[ \omega = \omega_A\, e^A \quad \text{for } \omega \in \Omega^1_D(A_0), \tag{3.27} \]
\[ \omega = \omega_A\, f^A \quad \text{for } \omega \in \Omega^2_D(A_0). \tag{3.28} \]

It is easily verified that the product of 1-forms $\omega, \eta \in \Omega^1_D(A_0)$ is given by
\[ \omega \cdot \eta = \varepsilon^{ABC}\, \omega_B\, \eta_C\, f^A. \tag{3.29} \]
By the Leibniz rule for the exterior differential δ, knowledge of the action of δ on the elements a ∈ A0, $e^A$ and $f^A$ fully determines the action of the differential on $\Omega^\bullet_D(A_0)$. By definition, we have
\[ \delta a = [ J_A, a ]\, e^A. \tag{3.30} \]


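Using Eq. (3.30) and the nilpotency of δ we get
\[ 0 = \delta^2 J_A = i \varepsilon_{BAC}\, \delta( J_C\, e^B ) = -\varepsilon_{BAC}\, \varepsilon_{DCF}\, J_F\, e^D e^B + i \varepsilon_{BAC}\, J_C\, \delta e^B, \]
from which we can successively conclude that
\[ \varepsilon_{BAE}\, \varepsilon_{DEC}\, e^D e^B = i \varepsilon_{BAC}\, \delta e^B, \qquad e^A e^C = i \varepsilon_{BAC}\, \delta e^B. \]
With Eq. (3.29), we finally get
\[ \delta e^A = -i f^A. \tag{3.31} \]
This equation, together with the nilpotency of δ, furthermore implies that
\[ \delta f^A = 0. \tag{3.32} \]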

We summarize these results in the following

Theorem 3.3. Let $g = \frac{1}{3!}\, \varepsilon_{ABC}\, \psi^A \psi^B \psi^C$ be the basis element of $\Omega^3_D(A_0)$, and $e^A$ and $f^A$ as in Eqs. (3.25,26). Then the algebra structure of $\Omega^\bullet_D(A_0)$ is given as follows:

a1) \[ [ a, e^A ] = [ a, f^A ] = [ a, g ] = 0 \quad \text{for all } a \in A_0, \tag{3.33} \]
a2) \[ e^A e^B = \varepsilon^{ABC} f^C, \qquad e^A f^B = \delta^{AB}\, g, \tag{3.34} \]
\[ e^A e^B e^C = \varepsilon^{ABC}\, g. \tag{3.35} \]

The differential structure on $\Omega^\bullet_D(A_0)$ is given by

b1) \[ \delta a = [ J_A, a ]\, e^A, \tag{3.36} \]
b2) \[ \delta e^A = -i f^A, \qquad \delta f^A = 0. \tag{3.37} \]

3.2.2. Cohomology of the de Rham complex. Let us now compute the cohomology groups of the de Rham complex $(\Omega^\bullet_D(A_0), \delta)$ of Theorems 3.2 and 3.3. The zeroth cohomology group $H^0$ consists of those elements a ∈ A0 that are closed, i.e., satisfy δa = 0. We have
\[ a \in H^0 \iff \delta a = [ J_A, a ]\, e^A = 0 \iff [ J_A, a ] = 0 \ \text{for all } A, \]
and it follows that
\[ H^0 = A_R \equiv \bigoplus_{j=0}^{k/2} 1_{V_j^*} \otimes \mathrm{End}\,(V_j), \tag{3.38} \]
and
\[ \dim_{\mathbb{C}} H^0 = \sum_{j=0}^{k/2} (2j+1)^2 = \frac16\, (2k+3)(k+2)(k+1). \tag{3.39} \]
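The closed form in Eq. (3.39) is the sum of the first k + 1 squares, since 2j + 1 runs over 1, 2, ..., k + 1 as j runs over 0, 1/2, ..., k/2. The following sketch compares the direct sum with the closed form for small k; the function names are illustrative and not taken from the paper.

```python
def dim_h0_by_sum(k):
    """Sum of (2j+1)^2 over j = 0, 1/2, ..., k/2, written with n = 2j."""
    return sum((n + 1) ** 2 for n in range(k + 1))


def dim_h0_closed_form(k):
    """Right-hand side of Eq. (3.39); the product is always divisible by 6."""
    return (2 * k + 3) * (k + 2) * (k + 1) // 6


for k in range(1, 8):
    print(k, dim_h0_by_sum(k), dim_h0_closed_form(k))
```

For k = 1, for instance, both expressions give 1 + 4 = 5.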

In order to compute the first cohomology group, we first determine the closed 1-forms. For any 1-form $\omega = \omega_A e^A \in \Omega^1_D(A_0)$, relation (3.37) implies that
\[ \delta\omega = \bigl( [ J_A, \omega_B ]\, \varepsilon_{ABC} - i \omega_C \bigr) f^C, \]
and thus δω = 0 is equivalent to
\[ [ J_A, \omega_B ]\, \varepsilon_{ABC} = i \omega_C. \tag{3.40} \]

We show that all closed 1-forms are exact. First, notice that if we view A0 as a representation space of su(2), then, for a closed 1-form, Eq. (3.40) must hold in all isotypic components. Therefore, there is no loss of generality in assuming that all coefficients $\omega_A$ transform under the spin j representation, i.e.,
\[ [ J_A, [ J_A, \omega_B ]] = j(j+1)\, \omega_B. \tag{3.41} \]
Furthermore, we can assume that j ≠ 0 since otherwise ω = 0, as follows from Eq. (3.40). We define a(ω) ∈ A0 by
\[ a(\omega) = \frac{1}{j(j+1)}\, [ J_A, \omega_A ] \]
and we compute δa. Using Eqs. (3.40,41) and the Jacobi identity, we get
\[
\begin{aligned}
\delta a(\omega) &= \frac{1}{j(j+1)}\, [ J_A, [ J_B, \omega_B ]]\, e^A \\
 &= \frac{1}{j(j+1)} \Bigl( i \varepsilon_{ABC}\, [ J_C, \omega_B ] + [ J_B, [ J_A, \omega_B ]] \Bigr) e^A \\
 &= \frac{1}{j(j+1)}\, [ J_B, [ J_B, \omega_A ]]\, e^A = \omega_A\, e^A.
\end{aligned}
\]
This proves that
\[ H^1 = 0. \tag{3.42} \]

We proceed towards the second cohomology group. The condition for a 2-form $\omega = \omega_A f^A$ to be closed reads
\[ \delta\omega = 0 \iff [ J_A, \omega_A ] = 0. \tag{3.43} \]
Again, we assume that the components $\omega_A$ belong to a spin j representation of su(2). If j = 0, then setting $\eta_A = i \omega_A$ we get $\delta(\eta_A e^A) = \omega_A f^A$, proving that ω is exact. If j ≠ 0, we set
\[ \eta_A = -\frac{1}{j(j+1)}\, \varepsilon_{ABC}\, [ J_B, \omega_C ], \]
and one easily verifies that $\delta(\eta_A e^A) = \omega_A f^A$. This proves that
\[ H^2 = 0. \tag{3.44} \]

Finally, we compute the third cohomology group. Since all 3-forms are closed, we just have to compute the image of the exterior differential in $\Omega^3_D(A_0)$. For any 2-form ω we have
\[ \delta\omega = [ J_A, \omega_A ]\, g \]
with g being the basis element of $\Omega^3_D(A_0)$ as in Theorem 3.3. This means that the image of δ in $\Omega^3_D(A_0)$ is given by
\[ \mathrm{im}\, \delta \big|_{\Omega^2_D(A_0)} = \mathrm{span} \Bigl( \bigcup_{A=1}^{3} \mathrm{im}\, (\mathrm{ad}\, J_A) \Bigr) \cdot g, \]
and this space consists of linear combinations of elements of A0 transforming under a spin j representation for j ≠ 0, multiplied by g. Thus, the quotient $\Omega^3_D(A_0) / \mathrm{im}\, \delta$ is given by
\[ H^3 \simeq A_R \equiv \bigoplus_{j=0}^{k/2} 1_{V_j^*} \otimes \mathrm{End}\,(V_j). \tag{3.45} \]

Collecting our results of Eqs. (3.38,39,42,44) and (3.45), we get the following

Theorem 3.4. The cohomology groups of the de Rham complex of Theorem 3.3 are
\[ H^0 \simeq H^3 \simeq A_R, \qquad H^1 = H^2 = 0 \]
with dimensions
\[ \dim_{\mathbb{C}} H^0 = \dim_{\mathbb{C}} H^3 = \frac16\, (2k+3)(k+2)(k+1). \]

This theorem shows that the cohomology groups of the fuzzy 3-sphere – which is the quantum target of the WZW model based on SU(2) [FG, Gr] – look very much like those of the classical SU(2) group manifold, except for the unexpected dimensions of the spaces $H^0$ and $H^3$. We observe that in the classical setting, the cohomology groups are modules over the ring $H^0$ and that, for a connected space, the Betti numbers coincide with the dimensions of these modules. We are thus led to the idea that the dimensions of the cohomology groups over C may be less relevant than their dimensions as modules over $H^0$. Of course, it may happen in general that some $H^0$-module is not free, and we would, in that case, lose the notion of dimension. For the cohomology groups of the de Rham complex $(\Omega^\bullet_D(A_0), \delta)$ we get
\[ \dim_{H^0} H^0 = \dim_{H^0} H^3 = 1, \qquad \dim_{H^0} H^1 = \dim_{H^0} H^2 = 0, \]
which fits perfectly with the classical result. The above proposal is obviously tailored to make sense of the cohomology groups of Theorem 3.4 and its general relevance remains to be decided by the study of other examples of non-commutative spaces.

3.3. The geometry of the non-commutative 3-sphere. The N = 1 spectral data (A0 , H0 ⊗ W, D, γ) permit us to investigate not only topological but also geometrical aspects of the quantized 3-sphere, namely integration of differential forms and Hermitian structures, as well as connections and the associated Riemann, Ricci and scalar curvatures. For a more detailed account of the results of this section, the reader is referred to [Gr].

3.3.1. Integration and Hermitian structures. We start with the canonical scalar product and the Hermitian structures on the spaces of differential forms. We use the same notations as in Subsects. 2.1.3–2.1.5. Any element $\omega \in \pi(\Omega^\bullet(A_0))$ can be written uniquely as
\[ \omega = \omega^0 + \omega^1_A\, e^A + \omega^2_A\, f^A + \omega^3\, g, \tag{3.46} \]


where $\omega^i, \omega^i_A \in A_0$. The integral $-\!\!\!\!\!\int$, as given in Definition 2.3, is just the normalized trace on H0 ⊗ W, denoted by Tr. Thus, for any element ω as above, we have
\[ -\!\!\!\!\!\!\int \omega = \mathrm{Tr}\, \omega^0. \tag{3.47} \]
It is easy to show that the sesqui-linear form (·, ·) associated to the integral is given by
\[ (\omega, \eta) = \mathrm{Tr} \bigl( \omega^0 (\eta^0)^* + \omega^1_A (\eta^1_A)^* + \omega^2_A (\eta^2_A)^* + \omega^3 (\eta^3)^* \bigr). \tag{3.48} \]
This proves that the kernels $K^i$ of the sesqui-linear form (·, ·) equal the kernels $J^i$ of the representation π. Thus, in this example we also have the equality
\[ \tilde\Omega^p_D(A_0) = \Omega^p_D(A_0). \]
Furthermore, since $\pi(\delta J^1) = A_0$ and $\pi(\delta J^2) = \pi(\Omega^1_D(A_0))$, we see that the decomposition (3.46) gives the canonical representative $\omega^\perp$ of an arbitrary differential form $\omega \in \Omega^\bullet_D(A_0)$. The Hermitian structure on $\Omega^p_D(A_0)$ is readily seen to be
\[ \langle \omega, \eta \rangle = \omega_A (\eta_A)^*, \qquad \omega, \eta \in \Omega^p_D(A_0). \tag{3.49} \]

Notice that, in this example, we get a true Hermitian structure on $\Omega^p_D(A_0)$ and not only a generalized Hermitian structure on $\tilde\Omega^p(A_0)$, cf. Subsect. 2.1.5.

3.3.2. Connections on $\Omega^1_D(A_0)$. This last property makes it possible to regard $\Omega^1_D(A_0)$ as the cotangent bundle of the non-commutative 3-sphere and to study connections on $\Omega^1_D(A_0)$. Since the space of 1-forms $\Omega^1_D(A_0)$ is a trivial left A0-module, a connection ∇ on $\Omega^1_D(A_0)$ is uniquely determined by the images of the basis elements, i.e.,
\[ \nabla e^A = -\omega^A_{BC}\, e^B \otimes e^C, \tag{3.50} \]
where $\omega^A_{BC}$ are arbitrary elements of A0.

Proposition 3.5. A connection ∇ is unitary if and only if its coefficients satisfy the Hermiticity condition
\[ \omega^{A*}_{BC} = \omega^{C}_{BA}. \tag{3.51} \]

Proof. It follows from (3.49) that $\langle e^A, e^B \rangle = \delta^{AB}$. Then we have for a unitary connection (see Definition 2.12)
\[ 0 = \delta \langle e^A, e^B \rangle = -\omega^A_{CB}\, e^C + e^C\, \omega^{B*}_{CA}. \]

Proposition 3.6. The torsion of a connection is given by
\[ T^A = \bigl( -i \delta^{AD} + \omega^A_{BC}\, \varepsilon^{BCD} \bigr) f^D. \]

Proof. Using Definition 2.14 and Eqs. (2.20), (3.34,37), we get
\[ T(\nabla)\, e^A = -i f^A + \omega^A_{BC}\, \varepsilon^{BCD} f^D. \]


Proposition 3.7. A connection is torsionless and unitary if and only if its coefficients satisfy the following conditions:

i) \[ \omega^A_{BC} - \omega^{A*}_{BC} = i \varepsilon^{ABC}, \tag{3.52} \]
ii) \[ \omega^A_{BC} = \omega^{A*}_{CB}, \tag{3.53} \]
iii) \[ \omega^A_{BC} = \omega^{C}_{AB}. \tag{3.54} \]

In particular, such a connection is uniquely determined by the nine self-adjoint elements $\omega^A_{AB} \in A_0$ and the self-adjoint part of $\omega^1_{23}$.

Proof. The condition of vanishing torsion,
\[ \omega^A_{BC}\, \varepsilon^{BCD} = i \delta^{AD}, \]
can equivalently be written as
\[ \omega^A_{BC} - \omega^A_{CB} = i \varepsilon^{ABC}. \tag{3.55} \]
Using alternatively Eqs. (3.55) and the unitarity condition Eq. (3.51) we get
\[ \omega^A_{BC} = i \varepsilon^{ABC} + \omega^A_{CB} = i \varepsilon^{ABC} + \omega^{B*}_{CA} = \omega^{B*}_{AC} = \omega^{C}_{AB}, \]
which proves the result.

This proposition shows that, as in the classical case, there are many unitary and torsionless connections. There are two possibilities to reduce the space of “natural” connections further. First, we can consider real connections, i.e., connections whose associated parallel transport maps real forms to real forms. In the classical setting, a 1-form ω is real if $\omega^* = -\omega$ (the sign comes from the fact that the Clifford matrices are anti-Hermitian). Thus, we see that our basis of 1-forms consists of imaginary 1-forms, i.e., $e^{A*} = e^A$. If the covariant derivative of an imaginary 1-form is to be imaginary, then the connection coefficients $\omega^A_{BC}$ must be anti-Hermitian. We call such a connection a real connection.

Corollary 3.8. There is a unique real unitary and torsionless connection on the cotangent bundle $\Omega^1_D(A_0)$, and its coefficients are given by
\[ \omega^A_{BC} = \frac{i}{2}\, \varepsilon^{ABC}. \]
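The three defining properties of the connection in Corollary 3.8 can be checked directly on its coefficients. The sketch below treats each coefficient as a complex number (a scalar multiple of the identity in A0) and tests the torsion condition (3.55), the unitarity condition (3.51) and anti-Hermiticity; it illustrates existence only, not uniqueness, and the variable names are hypothetical.

```python
def eps(a, b, c):
    """Levi-Civita symbol on the index set {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


IDX = range(3)

# omega[a][b][c] stands for the coefficient omega^A_{BC} = (i/2) eps^{ABC}.
omega = [[[0.5j * eps(a, b, c) for c in IDX] for b in IDX] for a in IDX]

# Eq. (3.55): omega^A_{BC} - omega^A_{CB} = i eps^{ABC}.
torsion_free = all(omega[a][b][c] - omega[a][c][b] == 1j * eps(a, b, c)
                   for a in IDX for b in IDX for c in IDX)

# Eq. (3.51): (omega^A_{BC})^* = omega^C_{BA}.
unitary = all(omega[a][b][c].conjugate() == omega[c][b][a]
              for a in IDX for b in IDX for c in IDX)

# Reality: the coefficients are anti-Hermitian.
real = all(omega[a][b][c].conjugate() == -omega[a][b][c]
           for a in IDX for b in IDX for c in IDX)

print(torsion_free, unitary, real)
```

Exact comparison is safe here because every entry is 0 or ±i/2, all of which are exactly representable.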

There is another way of reducing the number of “natural” connections. If we look at a general unitary and torsionless connection, we see that it does not have any isotropy property. For example, the coefficients $\omega^A_{AA}$ are all independent of one another. We hope that if we require the connection to be invariant under all 1-parameter groups of isometries (see [CFG, Gr]) we shall get relations among these coefficients. We shall not pursue this route here, but we refer the reader to [Gr] for a detailed analysis.

We proceed with the computation of the scalar curvature of the real connection $\tilde\nabla$ of Corollary 3.8. For any connection ∇ with coefficients $\omega^A_{BC}$ as defined in Eq. (3.50), the curvature tensor is given by (see Definition 2.11)
\[ R(\nabla)\, e^A = -\nabla^2 e^A = \Bigl( [ J_D, \omega^A_{EC} ]\, \varepsilon^{DEB} - i\, \omega^A_{BC} + \omega^A_{DE}\, \omega^E_{FC}\, \varepsilon^{DFB} \Bigr) f^B \otimes e^C. \tag{3.56} \]
In particular, for the real connection $\tilde\nabla$ of Corollary 3.8, the curvature tensor reads
\[ R(\tilde\nabla)\, e^A = \frac14\, \varepsilon^{ABC} f^B \otimes e^C. \tag{3.57} \]
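The step from Eq. (3.56) to Eq. (3.57) can be reproduced numerically. The sketch below assumes that the coefficients $\omega^A_{BC} = \frac{i}{2}\varepsilon^{ABC}$ are scalar multiples of the identity, so that the commutator term in (3.56) drops out, and evaluates the remaining two terms; the function name `curvature_coefficient` is illustrative.

```python
def eps(a, b, c):
    """Levi-Civita symbol on the index set {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


IDX = range(3)

# Coefficients of the real connection of Corollary 3.8.
omega = [[[0.5j * eps(a, b, c) for c in IDX] for b in IDX] for a in IDX]


def curvature_coefficient(a, b, c):
    """Coefficient of f^B (x) e^C in R e^A from Eq. (3.56), scalar case.

    The term [J_D, omega^A_{EC}] eps^{DEB} vanishes because scalars commute
    with J_D, leaving -i omega^A_{BC} + omega^A_{DE} omega^E_{FC} eps^{DFB}.
    """
    linear = -1j * omega[a][b][c]
    quadratic = sum(omega[a][d][e] * omega[e][f][c] * eps(d, f, b)
                    for d in IDX for e in IDX for f in IDX)
    return linear + quadratic


matches_3_57 = all(abs(curvature_coefficient(a, b, c) - 0.25 * eps(a, b, c)) < 1e-12
                   for a in IDX for b in IDX for c in IDX)
print(matches_3_57)
```

The linear term contributes $\frac12 \varepsilon^{ABC}$ and the quadratic term $-\frac14 \varepsilon^{ABC}$, which together give the factor 1/4.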

In order to compute the Ricci curvature, we use a dual basis to the generators $e^A$, as in Subsect. 2.1.7, before Eq. (2.15). It is clear that the elements $\varepsilon_A \in \Omega^1_D(A_0)^*$ defined by
\[ \varepsilon_A(\omega) = \varepsilon_A(\omega_B\, e^B) := \omega_A \tag{3.58} \]
for all $\omega \in \Omega^1_D(A_0)$, form a dual basis to $e^A$. Using Eq. (3.49) it is then easy to verify that the dual 1-forms $e_A$ and their dual maps $e_A^{\mathrm{ad}}$, Eq. (2.16), are given by
\[ e_A = e^A, \qquad e_A^{\mathrm{ad}}(f^B) = -\varepsilon^{ABC}\, e^C. \tag{3.59} \]
For the real connection $\tilde\nabla$, we get from Eq. (3.57),
\[ \mathrm{Ric}(\tilde\nabla) = -\frac12\, e^A \otimes e^A. \tag{3.60} \]
We proceed with the computation of the scalar curvature. The right dual maps $(e^A_R)^{\mathrm{ad}}$ to the basis 1-forms $e^A$, Eq. (2.17), act as
\[ (e^A_R)^{\mathrm{ad}}(e^B) = \delta^{AB}. \tag{3.61} \]
The scalar curvature of the real connection $\tilde\nabla$ follows from Eq. (3.60) and is given by
\[ r(\tilde\nabla) = -\frac32. \tag{3.62} \]

It is the same as the scalar curvature of the unique real unitary and torsionless connection for the classical SU(2) – recall that the definition of the scalar curvature in the noncommutative setting differs from the classical one by a sign, see the remark in Definition 2.16. This completes our study of the non-commutative 3-sphere in terms of N = 1 spectral data. Our results show that the non-commutative 3-sphere has striking similarities with its classical counterpart. As we saw in Subsects. 3.2.1 and 3.2.2, the spaces of differential forms have the same structure as left-modules over the algebra of functions, and the cohomology groups have the same dimensions as modules over the zeroth cohomology group, H 0 . Furthermore, geometric invariants like the scalar curvature, too, coincide for the classical and the quantized 3-sphere. 3.4. Remarks on N = (1, 1). In the following, we consider N = (1, 1) data for the algebra A0 . The construction of the first subsection starts from the BRST operator of the group G and leads to a deformation of the de Rham complex for the classical 3-sphere in the form of N = (1, 1) data for the non-commutative 3-sphere. In the second subsection, we return to the two generalized Dirac operators provided by superconformal field theory [FG], which lead to a different formulation of N = (1, 1) data, displaying “spontaneously broken supersymmetry”.


3.4.1. N = (1, 1) data from BRST. One way to arrive at N = (1, 1) data for the algebra A0 and at the associated (non-commutative) de Rham complex for the quantized 3-sphere is to use the action of the group G on the Hilbert space H0 for introducing a BRST operator (see also Sect. I 2.3). Let $\{ J_A \}$ be the basis of the complexified Lie algebra $g^{\mathbb{C}}$ of G introduced in Eq. (3.7). The BRST operator Q for the group G is defined as usual: We introduce ghosts $c^A$ and anti-ghosts $b_A$ satisfying the ghost algebra
\[ \{ c^A, c^B \} = \{ b_A, b_B \} = 0, \qquad \{ c^A, b_B \} = \delta^A_B. \tag{3.63} \]
Then the BRST operator is given by
\[ Q = c^A J_A - \frac{i}{2}\, f_{AB}{}^{C}\, c^A c^B b_C, \tag{3.64} \]
and the ghost number operator is
\[ T = c^A b_A. \tag{3.65} \]
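Nilpotency of the BRST operator (3.64) can be illustrated in a small matrix model. The sketch below is an assumption-laden toy example rather than the construction used in the text: it realizes the ghost algebra (3.63) on three fermionic modes via a Jordan-Wigner construction, takes $J_A = \sigma_A/2$ in the spin-1/2 representation with $f_{AB}{}^{C} = \varepsilon_{ABC}$, and checks $Q^2 = 0$ on the resulting 16-dimensional space. All helper names are hypothetical.

```python
def eps(a, b, c):
    """Levi-Civita symbol on the index set {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def kron(x, y):
    """Kronecker product of two matrices given as nested lists."""
    return [[xe * ye for xe in xr for ye in yr] for xr in x for yr in y]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def add(x, y):
    return [[p + q for p, q in zip(xr, yr)] for xr, yr in zip(x, y)]


def scale(s, x):
    return [[s * p for p in row] for row in x]


def kron_all(factors):
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out


I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
CREATE = [[0, 0], [1, 0]]   # raises the occupation of one fermionic mode
DESTROY = [[0, 1], [0, 0]]  # lowers it
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def ghost(op, a):
    """Jordan-Wigner operator on mode a, acting trivially on the spin space."""
    return kron_all([Z] * a + [op] + [I2] * (2 - a) + [I2])


c = [ghost(CREATE, a) for a in range(3)]    # ghosts c^A
b = [ghost(DESTROY, a) for a in range(3)]   # anti-ghosts b_A
J = [kron_all([I2, I2, I2, scale(0.5, PAULI[a])]) for a in range(3)]


def brst_operator():
    """Q = c^A J_A - (i/2) eps_ABC c^A c^B b_C, cf. Eq. (3.64)."""
    q = [[0] * 16 for _ in range(16)]
    for a in range(3):
        q = add(q, matmul(c[a], J[a]))
        for bb in range(3):
            for cc in range(3):
                if eps(a, bb, cc):
                    term = matmul(matmul(c[a], c[bb]), b[cc])
                    q = add(q, scale(-0.5j * eps(a, bb, cc), term))
    return q


Q = brst_operator()
Q2 = matmul(Q, Q)
max_entry = max(abs(v) for row in Q2 for v in row)
print(max_entry < 1e-12)
```

The cancellation mirrors the usual argument: the square of $c^A J_A$ is compensated by its anti-commutator with the cubic ghost term, while the square of the cubic term vanishes by the Jacobi identity.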

The Hilbert space of the N = (1, 1) data will be of the form H0 ⊗ W where W is a representation space for the ghost algebra. In order to obtain N = (1, 1) data, we require that the ghost algebra acts unitarily on W with respect to the natural ∗-operation, namely
\[ c^{A*} = g^{AB}\, b_B. \tag{3.66} \]
This choice is compatible with positive definiteness of the scalar product on W, and it renders the ghost number operator T self-adjoint.¹ Furthermore, this choice of ∗-operation leads to identifying the ghost algebra with the CAR
\[ \{ c^A, c^B \} = 0, \qquad \{ c^A, c^{B*} \} = g^{AB}, \tag{3.67} \]
and the BRST operator can be written
\[ Q = c^A J_A - \frac{i}{2}\, f_{AB}{}^{C}\, c^A c^B c_C^*, \tag{3.68} \]
where indices are raised and lowered with the metric $g_{AB}$ as usual. Under the identifications $c^A \sim a^{A*} := -i\, \theta^A \wedge$, where $\{ \theta^A \}$ is a basis of 1-forms dual to $\{ \vartheta_A \}$, Eq. (3.68) for the BRST operator formally coincides with the exterior derivative on G. This fact was already mentioned in Sect. I 2.3. In order to complete our construction of N = (1, 1) data, we introduce the Hodge ∗-operator
\[ * = \frac{1}{n!}\, \sqrt{g}\; \varepsilon_{A_1 \ldots A_n}\, ( c^{A_1} + c^{A_1 *} ) \cdots ( c^{A_n} + c^{A_n *} ), \tag{3.69} \]
where n = dim G. This operator clearly commutes with the algebra A0 of Theorem 3.1. Moreover, it is easy to verify that ∗ is unitary and satisfies
\[ *^2 = (-1)^{\frac{n(n-1)}{2}} \tag{3.70} \]
as well as
\[ *\, Q = (-1)^{n-1}\, Q^*\, *. \tag{3.71} \]

¹ In the context of gauge theories, one considers representations such that $c^{A*} = c^A$, $b_A^* = b_A$. These Hermiticity conditions together with the defining relations (3.63) imply that the inner product of the representation space is not positive definite – which is why $c^A$ and $b_A$ are called ghosts.

It follows that (A, H, d, γ, ∗) with A = A0, H = H0 ⊗ W, d = Q and where γ is the modulo 2 reduction of the Z-grading T, form a set of N = (1, 1) data in the sense of Definition 2.20. We refrain from presenting the details of the construction of differential forms and of the other geometrical quantities, since the computations are fairly straightforward. For example, the space of k-forms is given by
\[ \Omega^k_d(A_0) = \{\, a_{A_1 \ldots A_k}\, c^{A_1} \cdots c^{A_k} \mid a_{A_1 \ldots A_k} \in A_0 \,\}. \tag{3.72} \]
For G = SU(2), we see that these spaces are isomorphic to $\Omega^k_D(A_0)$ as left A0-modules. Furthermore, it is easy to see that $\Omega^\bullet_d(A_0)$ and $\Omega^\bullet_D(A_0)$ are isomorphic as complexes, which proves that, in particular, their cohomologies coincide. Of course, the same constructions and results apply to the BRST operator associated with the right-action of G on H0 given by the generators $\bar J_A$ of Eq. (3.7). The Hilbert space H = H0 ⊗ W can be decomposed into a direct sum of eigenspaces of the Z-grading operator T,
\[ H = \bigoplus_{k=0}^{n} H^{(k)}, \]
where $H^{(0)} = H_0$, n = dim G ( = 3 for G = SU(2)). The subspaces $H^{(k)}$ are left-modules for A0. Furthermore, it follows from Eqs. (3.65) and (3.68) that d := Q maps $H^{(k)}$ into $H^{(k+1)}$ for k = 0, . . . , n (with $H^{(n+1)} := \{0\}$). Since $d^2 = 0$, H is a complex. Viewed as linear spaces, the cohomology groups of $\Omega^\bullet_d(A_0)$ and (H, Q) are isomorphic, although the latter do not carry a ring structure. As a side remark, consider an odd operator H on H. Then $\tilde d := d + H$ is nilpotent if and only if $\{ d, H \} + H^2 = 0$. If H commutes with A0, then $\Omega^\bullet_{\tilde d}(A_0)$ and $\Omega^\bullet_d(A_0)$ are identical complexes. In the next subsection, we will meet a conformal field theory motivated example for $\tilde d = d + H$ which is nilpotent but for which H does not commute with A0.

3.4.2. Spontaneously broken supersymmetry. In Sect. 3.1 we introduced two connections $\nabla^S$ and $\bar\nabla^S$ and their associated Dirac operators $D$ and $\bar D$, see Eqs. (3.6)–(3.9). Since these two Dirac operators correspond to different connections, they are not Dirac operators on an $N = (1,1)$ Dirac bundle in the sense of Definition I 2.6. It is interesting to notice that $D$ and $\bar D$ nevertheless satisfy the $N = (1,1)$ algebra [FG],

$$ D^2 = \bar D^2, \qquad \{D, \bar D\} = 0. \tag{3.73} $$

The easiest way to prove (3.73) is to verify that the generalized exterior derivative

$$ \tilde d := \tfrac{1}{2}\,(D + i \bar D) \tag{3.74} $$

is nilpotent. Let $\{\vartheta_A\}$ and $\{\theta^A\}$ denote a basis of the Lie algebra and the dual basis of 1-forms, respectively, as before. We define the operators $a^{A*} = \theta^A \wedge$ and $a_A = \vartheta_A \lrcorner$ (contraction) as usual, and we can express the fermionic operators $\psi^A$ and $\bar\psi^A$ as

$$ \psi^A = -i\,(a^{A*} - a^A), \qquad \bar\psi^A = -(a^{A*} + a^A), \tag{3.75} $$

where indices are raised and lowered with the metric $g_{AB}$. Using Eqs. (3.9) and (3.74), we can rewrite the operator $\tilde d$ as a sum of terms of degree 1, $-1$ and $-3$,

$$ \tilde d = \tilde d_1 + \tilde d_{-1} + \tilde d_{-3}, \tag{3.76} $$

where

$$ \tilde d_1 = a^{A*} J_A^+ - \tfrac{1}{4} f_{ABC}\, a^{A*} a^{B*} a^C, \qquad \tilde d_{-1} = -a^A J_A^-, \qquad \tilde d_{-3} = -\tfrac{1}{12} f_{ABC}\, a^A a^B a^C, \tag{3.77} $$

with

$$ J_A^\pm = -\tfrac{i}{2}\,(J_A \pm \bar J_A). \tag{3.78} $$

It is then straightforward to show that $\tilde d$ given by Eqs. (3.76) and (3.77) satisfies $\tilde d^2 = 0$ and that the associated Laplacian $\triangle = \{\tilde d, \tilde d^*\}$ is given by

$$ \triangle = g^{AB} J_A J_B + \frac{\dim G}{24} = g^{AB} \bar J_A \bar J_B + \frac{\dim G}{24}. \tag{3.79} $$

Thus, $\triangle$ is a strictly positive operator, corresponding to what one calls spontaneously broken supersymmetry in the context of field theory. This implies that the cohomology of the complex $(H, \tilde d)$ is trivial. However, the cohomology of the complex $\Omega^\bullet_{\tilde d}(A_0)$, as introduced in Sect. 2.2, is not trivial. Notice that $\tilde d_1$ is the BRST operator associated to the generators $J_A^+$ and hence nilpotent. This implies that the BRST cohomology of the fuzzy 3-sphere can be extracted from $\Omega^\bullet_{\tilde d}(A_0)$.

4. The Non-Commutative Torus

As a second, “classic” example of non-commutative spaces, we discuss the geometry of the non-commutative 2-torus [Ri, Co1, Co5]. After a short review of the classical torus in Subsect. 4.1, we analyze the spin geometry ($N = 1$) of the non-commutative torus in Subsect. 4.2 along the lines of [FGR2, Gr]. In Subsects. 4.3 and 4.4, we successively extend the $N = 1$ data to $N = (1,1)$ and $N = (2,2)$ data, according to the general procedure proposed in Subsect. 2.2.5 above. In these last two subsections, we do not give detailed proofs, but merely state the results, since the computations, although straightforward, are tedious and not very illuminating.

4.1. The classical torus. To begin with, we describe the $N = 1$ data associated to the classical 2-torus $T^2_0$. By Fourier transformation, the algebra of smooth functions over $T^2_0$ is isomorphic to the Schwartz space $A_0 := S(\mathbb{Z}^2)$ over $\mathbb{Z}^2$, endowed with the (commutative) convolution product:

$$ (a \bullet b)(p) = \sum_{q \in \mathbb{Z}^2} a(q)\, b(p - q), \tag{4.1} $$

where $a, b \in A_0$ and $p \in \mathbb{Z}^2$. Complex conjugation of functions translates into a $*$-operation:

$$ a^*(p) = \overline{a(-p)}, \qquad a \in A_0. \tag{4.2} $$

If we choose a spin structure over $T^2_0$ in such a way that the spinors are periodic along the elements of a homology basis, then the associated spinor bundle is a trivial rank 2 vector bundle. With this choice, the space of square integrable spinors is given by the direct sum

$$ H = l^2(\mathbb{Z}^2) \oplus l^2(\mathbb{Z}^2), \tag{4.3} $$

where $l^2(\mathbb{Z}^2)$ denotes the space of square summable functions over $\mathbb{Z}^2$. The algebra $A_0$ acts diagonally on $H$ by the convolution product. We choose a flat metric $(g_{\mu\nu})$ on $T^2_0$ and we introduce the corresponding 2-dimensional gamma matrices

$$ \{\gamma^\mu, \gamma^\nu\} = -2\, g^{\mu\nu}, \qquad \gamma^{\mu *} = -\gamma^\mu. \tag{4.4} $$

Then, the Dirac operator $D$ on $H$ is given by

$$ (D\xi)(p) = i\, p_\mu \gamma^\mu\, \xi(p), \qquad \xi \in H. \tag{4.5} $$

Finally, the $\mathbb{Z}_2$-grading on $H$, denoted by $\sigma$, can be written as

$$ \sigma = \frac{i}{2}\, \sqrt{g}\; \varepsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu, \tag{4.6} $$

where $\varepsilon_{\mu\nu}$ is the Levi-Civita tensor. The data $(A_0, H, D, \sigma)$ are the canonical $N = 1$ data associated to the compact spin manifold $T^2_0$, and it is thus clear that they satisfy all the properties of Definition 2.1.

4.2. Spin geometry ($N = 1$). The non-commutative torus is obtained by deforming the product of the algebra $A_0$. For each $\alpha \in \mathbb{R}$, we define the algebra $A_\alpha := S(\mathbb{Z}^2)$ with the product

$$ (a \bullet_\alpha b)(p) = \sum_{q \in \mathbb{Z}^2} a(q)\, b(p - q)\, e^{i\pi\alpha\,\omega(p,q)}, \tag{4.7} $$

where $\omega$ is the integer-valued anti-symmetric bilinear form on $\mathbb{Z}^2 \times \mathbb{Z}^2$,

$$ \omega(p, q) = p_1 q_2 - p_2 q_1, \qquad p, q \in \mathbb{Z}^2. \tag{4.8} $$

The $*$-operation is defined as before. Alternatively, we could introduce the algebra $A_\alpha$ as the unital $*$-algebra generated by the elements $U$ and $V$ subject to the relations

$$ U U^* = U^* U = V V^* = V^* V = 1, \qquad U V = e^{-2\pi i \alpha}\, V U. \tag{4.9} $$

Having chosen an appropriate closure, the equivalence of the two descriptions is easily seen if one makes the following identifications:

$$ U(p) = \delta_{p_1, 1}\, \delta_{p_2, 0}, \qquad V(p) = \delta_{p_1, 0}\, \delta_{p_2, 1}. \tag{4.10} $$
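The relation (4.9) can be checked directly from the deformed product (4.7) with the identifications (4.10). The following is a minimal Python sketch, assuming that elements of the algebra are represented by finitely supported dictionaries mapping lattice points to coefficients; the helper names `star_product`, `adjoint` and `close` are illustrative and not part of the text.

```python
import cmath

def star_product(a, b, alpha):
    """Deformed convolution (4.7) for finitely supported functions on Z^2.

    Elements are dicts mapping (p1, p2) -> complex coefficient.
    """
    out = {}
    for q, aq in a.items():
        for r, br in b.items():                      # r plays the role of p - q
            p = (q[0] + r[0], q[1] + r[1])
            omega = p[0] * q[1] - p[1] * q[0]        # omega(p, q) as in (4.8)
            phase = cmath.exp(1j * cmath.pi * alpha * omega)
            out[p] = out.get(p, 0) + aq * br * phase
    return out

def adjoint(a):
    """*-operation: complex conjugate of a(-p)."""
    return {(-p[0], -p[1]): complex(v).conjugate() for p, v in a.items()}

def close(a, b, tol=1e-12):
    """Compare two finitely supported functions coefficient by coefficient."""
    return all(abs(a.get(k, 0) - b.get(k, 0)) < tol for k in set(a) | set(b))

# Generators as in (4.10), and the unit of the algebra.
U = {(1, 0): 1 + 0j}
V = {(0, 1): 1 + 0j}
ONE = {(0, 0): 1 + 0j}

alpha = 0.3819660112501051                           # an arbitrary sample value
UV = star_product(U, V, alpha)
VU = star_product(V, U, alpha)
twist = cmath.exp(-2j * cmath.pi * alpha)
print(close(UV, {p: twist * v for p, v in VU.items()}))   # U V = e^{-2 pi i alpha} V U
print(close(star_product(U, adjoint(U), alpha), ONE))     # U U* = 1
```

The phase depends only on $\omega(p, q)$, so the same function also serves to test associativity on small examples.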

If $\alpha$ is a rational number, $\alpha = M/N$, where $M$ and $N$ are co-prime integers, then the centre $Z(A_\alpha)$ of $A_\alpha$ is infinite-dimensional:

$$ Z(A_\alpha) = \mathrm{span}\,\{\, U^{mN} V^{nN} \mid m, n \in \mathbb{Z} \,\}. \tag{4.11} $$

Let $I_\alpha$ denote the ideal of $A_\alpha$ generated by $Z(A_\alpha) - 1$. Then it is easy to see that the quotient $A_\alpha / I_\alpha$ is isomorphic, as a unital $*$-algebra, to the full matrix algebra $M_N(\mathbb{C})$.

If $\alpha$ is irrational, then the centre of $A_\alpha$ is trivial and $A_\alpha$ is of type II$_1$, the trace being given by the evaluation at $p = 0$. Unless stated differently, we shall only study the case of irrational $\alpha$. We define the non-commutative 2-torus $T^2_\alpha$ by its $N = 1$ data $(A_\alpha, H, D, \sigma)$, where $H$, $D$ and $\sigma$ are as in Eqs. (4.3), (4.5) and (4.6), and $A_\alpha$ acts diagonally on $H$ by the deformed product, Eq. (4.7).

When $\alpha = M/N$ is rational, one may work with the data $(A_\alpha / I_\alpha,\, M_N(\mathbb{C}) \otimes \mathbb{C}^2,\, D_\alpha,\, \sigma)$, where the Dirac operator $D_\alpha$ is given by

$$ D_\alpha = i\, \gamma^\mu\, \frac{N}{\pi}\, \sin\!\Bigl(\frac{\pi}{N}\, p_\mu\Bigr). \tag{4.12} $$
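For rational $\alpha = M/N$, the isomorphism of the quotient with $M_N(\mathbb{C})$ can be made concrete by a pair of $N \times N$ matrices that satisfy (4.9) and whose $N$-th powers are the identity. The sketch below uses the standard clock and shift matrices; this particular choice of representation is an assumption made for illustration and is not taken from the text, and all function names are hypothetical.

```python
import cmath

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, e):
    out = identity(len(a))
    for _ in range(e):
        out = matmul(out, a)
    return out

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def clock_shift(M, N):
    """Return (U, V, w) with U the shift, V the clock, w = exp(2 pi i M / N)."""
    w = cmath.exp(2j * cmath.pi * M / N)
    U = [[1 if i == (j + 1) % N else 0 for j in range(N)] for i in range(N)]
    V = [[w ** i if i == j else 0 for j in range(N)] for i in range(N)]
    return U, V, w

U, V, w = clock_shift(1, 3)
print(close(matmul(U, V), scale(1 / w, matmul(V, U))))   # U V = e^{-2 pi i M/N} V U
print(close(matpow(U, 3), identity(3)))                  # U^N = 1
print(close(matpow(V, 3), identity(3)))                  # V^N = 1
```

In this representation the central elements $U^{mN} V^{nN}$ all act as the identity, which is the effect of dividing by the ideal $I_\alpha$.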

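4.2.1. Differential forms. Recall that there is a representation $\pi$ of the algebra of universal forms $\Omega^\bullet(A_\alpha)$ on $H$ (see Subsect. 2.1.2). The images of the homogeneous subspaces of $\Omega^\bullet(A_\alpha)$ under $\pi$ are given by

$$ \pi\bigl(\Omega^0(A_\alpha)\bigr) = A_\alpha \quad \text{(by definition)}, \tag{4.13} $$

$$ \pi\bigl(\Omega^{2k-1}(A_\alpha)\bigr) = \{\, a_\mu \gamma^\mu \mid a_\mu \in A_\alpha \,\}, \tag{4.14} $$

$$ \pi\bigl(\Omega^{2k}(A_\alpha)\bigr) = \{\, a + b\sigma \mid a, b \in A_\alpha \,\} \tag{4.15} $$

for all $k \in \mathbb{Z}_+$. In principle, one should then compute the kernels $J^n$ of $\pi$ (see Eq. (2.2)), but these are generally huge and difficult to describe explicitly. To determine the space of $n$-forms, it is simpler to use the isomorphism

$$ \Omega^n_D(A_\alpha) = \Omega^n(A_\alpha) \big/ \bigl(J^n + \delta J^{n-1}\bigr) \simeq \pi\bigl(\Omega^n(A_\alpha)\bigr) \big/ \pi\bigl(\delta J^{n-1}\bigr). \tag{4.16} $$

First, we have to compute the spaces of “auxiliary forms” $\pi(\delta J^{n-1})$.

Lemma 4.1. The spaces $\pi(\delta J^{n-1})$ of auxiliary forms are given by

$$ \pi\bigl(\delta J^1\bigr) = A_\alpha, \tag{4.17} $$

$$ \pi\bigl(\delta J^{2k}\bigr) = \pi\bigl(\Omega^{2k+1}(A_\alpha)\bigr), \tag{4.18} $$

$$ \pi\bigl(\delta J^{2k+1}\bigr) = \pi\bigl(\Omega^{2k+2}(A_\alpha)\bigr), \tag{4.19} $$

for all $k \geq 1$.

Proof. Let $a_i, b_i \in A_\alpha$ be such that the universal 1-form $\eta = \sum_i a_i\, \delta b_i \in \Omega^1(A_\alpha)$ satisfies $\pi(\eta) = 0$. This means that

$$ i \sum_{j,q} (p - q)_\mu \gamma^\mu\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} = 0 \tag{4.20} $$

for all $p \in \mathbb{Z}^2$. Using Eq. (4.20), we have

$$ \pi(\delta\eta) = -\sum_{j,q} q_\mu (p - q)_\nu\, \gamma^\mu \gamma^\nu\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} = -\sum_{j,q} (q^2 - p^2)\, a_j(q)\, b_j(p - q)\, e^{i\pi\alpha\,\omega(p,q)} \in A_\alpha. \tag{4.21} $$

This proves that $\pi(\delta J^1) \subset A_\alpha$. Then, we construct an explicit non-vanishing element of $\pi(\delta J^1)$. We set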

$$ a_1(p) = b_2(p) = \delta_{p_1, -1}\, \delta_{p_2, 0}, \qquad a_2(p) = b_1(p) = \delta_{p_1, 1}\, \delta_{p_2, 0}, $$

and an easy computation shows that the element $\eta = \sum_{i=1}^{2} a_i\, \delta b_i$ satisfies

$$ \pi(\eta) = 0, \qquad \pi(\delta\eta) = -g^{11}. $$

Since $\pi(\delta J^1)$ is an $A_\alpha$-bimodule, Eq. (4.17) follows. Let $k \geq 3$ and $\eta \in \Omega^k(A_\alpha)$. Then, using Eqs. (4.14) and (4.15), we see that there exists an element $\psi \in \Omega^{k-2}(A_\alpha)$ with $\pi(\eta) = \pi(\psi)$. The first part of the proof ensures the existence of an element $\phi \in \Omega^1(A_\alpha)$ with $\pi(\phi) = 0$ and $\pi(\delta\phi) = 1$. Then we have $\phi\psi \in J^{k-1}$, and $\pi(\delta(\phi\psi)) = \pi(\psi) = \pi(\eta)$, proving that $\pi(\eta) \in \pi(\delta J^{k-1})$, and therefore Eqs. (4.18) and (4.19).

As a corollary to this lemma, we obtain the following

Proposition 4.2. Up to isomorphism, the spaces of differential forms are given by

$$ \Omega^0_D(A_\alpha) = A_\alpha, \tag{4.22} $$

$$ \Omega^1_D(A_\alpha) \cong \{\, a_\mu \gamma^\mu \mid a_\mu \in A_\alpha \,\}, \tag{4.23} $$

$$ \Omega^2_D(A_\alpha) \cong \{\, a\sigma \mid a \in A_\alpha \,\}, \tag{4.24} $$

$$ \Omega^n_D(A_\alpha) = 0 \quad \text{for } n \geq 3, \tag{4.25} $$

where we have chosen special representatives on the right-hand side.

Notice that $\Omega^1_D(A_\alpha)$ and $\Omega^2_D(A_\alpha)$ are free left $A_\alpha$-modules of rank 2 and 1, respectively. This reflects the fact that the bundles of 1- and 2-forms over the 2-torus are trivial and of rank 2 and 1, respectively.

4.2.2. Integration and Hermitian structure over $\Omega^1_D(A_\alpha)$. It follows from Eqs. (4.13)–(4.15) that there is an isomorphism $\pi(\Omega^\bullet(A_\alpha)) \simeq A_\alpha \otimes M_2(\mathbb{C})$. Applying the general definition of the integral (see Subsect. 2.1.3) to the non-commutative torus, one finds for an arbitrary element $\omega \in \pi(\Omega^\bullet(A_\alpha))$,

$$ \int\!\!\!\!\!\!-\; \omega = \mathrm{Tr}_{\mathbb{C}^2}\bigl(\omega(0)\bigr), \tag{4.26} $$

where $\mathrm{Tr}_{\mathbb{C}^2}$ denotes the normalized trace on $\mathbb{C}^2$. The cyclicity property, Assumption 2.4 in Subsect. 2.1.3, follows directly from the definition of the product in $A_\alpha$ and the cyclicity of the trace on $M_2(\mathbb{C})$. The kernels $K^n$ of the canonical sesqui-linear form on $\pi(\Omega^\bullet(A_\alpha))$, see Eq. (2.5), coincide with the kernels $J^n$ of $\pi$, and we get for all $n \in \mathbb{Z}_+$:

$$ \tilde\Omega^n(A_\alpha) = \Omega^n(A_\alpha), \qquad \tilde\Omega^n_D(A_\alpha) = \Omega^n_D(A_\alpha). \tag{4.27} $$

Note that the equality $K^n = J^n$ holds in all explicit examples of non-commutative $N = 1$ spaces studied so far. It is easy to see that the canonical representatives $\omega^\perp$ on $H$ of differential forms $[\omega] \in \Omega^n_D(A_\alpha)$, see Eq. (2.10), coincide with the choices already made in Eqs. (4.22)–(4.25). The canonical Hermitian structure on $\Omega^1_D(A_\alpha)$ is simply given by

$$ \langle \omega, \eta \rangle_D = \omega_\mu\, g^{\mu\nu}\, \eta_\nu^* \in A_\alpha \tag{4.28} $$

for all $\omega, \eta \in \Omega^1_D(A_\alpha)$. Note that this is a true Hermitian metric, i.e., it takes values in $A_\alpha$ and not in the weak closure $A''_\alpha$. Again, this is also the typical situation in other examples.

4.2.3. Connections on $\Omega^1_D(A_\alpha)$, and cohomology. Since $\Omega^1_D(A_\alpha)$ is a free left $A_\alpha$-module, it admits a basis which we can choose to be $E^\mu := \gamma^\mu$. A connection $\nabla$ on $\Omega^1_D(A_\alpha)$ is uniquely specified by its coefficients $\Gamma^\mu_{\nu\lambda} \in A_\alpha$,

$$ \nabla E^\mu = -\Gamma^\mu_{\nu\lambda}\, E^\nu \otimes E^\lambda \in \Omega^1_D(A_\alpha) \otimes_{A_\alpha} \Omega^1_D(A_\alpha), \tag{4.29} $$

and these coefficients can be chosen arbitrarily. Note that in the classical case ($\alpha = 0$) the basis $E^\mu$ consists of real 1-forms. Accordingly, we say that the connection $\nabla$ is real if its coefficients in the basis $E^\mu$ are self-adjoint elements of $A_\alpha$. A simple computation shows that there is a unique real, unitary, torsionless connection $\nabla^{L.C.}$ on $\Omega^1_D(A_\alpha)$ given by

$$ \nabla^{L.C.} E^\mu = 0. \tag{4.30} $$

In the remainder of this subsection, we determine the de Rham complex and its cohomology. Let $U$ and $V$ be the elements of $A_\alpha$ defined in Eq. (4.10); then it is easy to verify that the elements $E^\mu$ of $\Omega^1_D(A_\alpha)$ given by

$$ E^1 = U^* \delta U, \qquad E^2 = V^* \delta V, \tag{4.31} $$

form a basis of $\Omega^1_D(A_\alpha)$ and that they are closed,

$$ \delta E^1 = \delta E^2 = 0. \tag{4.32} $$

A word of caution is in order here: Eq. (4.32) does not mean that $\delta E^\mu$ is zero as an element of $\Omega^2(A_\alpha)$, but that $\delta E^\mu \in \delta J^1$, which is zero in the quotient space $\Omega^2_D(A_\alpha)$. As a basis of $\Omega^2_D(A_\alpha)$ we choose

$$ F = \tfrac{1}{2}\, \varepsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu \tag{4.33} $$

and we get for the product of basis 1-forms,

$$ E^\mu E^\nu = \varepsilon^{\mu\nu} F. \tag{4.34} $$

This completely specifies the de Rham complex, and we can now compute the cohomology groups $H^p$. For $a \in A_\alpha$, we have the equivalences

$$ [D, a] = 0 \iff i\, p_\mu \gamma^\mu\, a(p) = 0 \ \ \forall p \iff a(p) = \delta_{p,0}\, \tilde a \tag{4.35} $$

for some $\tilde a \in \mathbb{C}$. This shows that $H^0 \simeq \mathbb{C}$. Let $a_\mu E^\mu$ be a 1-form; then we obtain with Eqs. (4.32) and (4.34) that

$$ \delta(a_\mu E^\mu) = 0 \iff i\, \hat p_\mu a_\nu\, \varepsilon^{\mu\nu} F = 0, \tag{4.36} $$

where $\hat p_\mu$ denotes the multiplication operator by $p_\mu$, i.e., $(\hat p_\mu a)(q) = q_\mu a(q)$. Suppose that the 1-form $a_\mu E^\mu$ is closed and satisfies $a_\mu(0) = 0$, and define the algebra element $b$ by $b = (2i\, \hat p_\mu)^{-1} a_\mu$. Using Eq. (4.36), we see that

$$ \delta b = a_\mu E^\mu. \tag{4.37} $$

This proves that any closed 1-form is cohomologous to a “constant” 1-form $c_\mu E^\mu$ with $c_\mu \in \mathbb{C}$. On the other hand, a non-vanishing constant 1-form $c_\mu E^\mu$ cannot be exact since the equation

$$ (\delta a)(p) = i\, p_\mu E^\mu\, a(p) = \delta_{p,0}\, c_\mu E^\mu \tag{4.38} $$

has no solution. Thus, we have $H^1 \simeq \mathbb{C}^2$. The same argument shows that a constant 2-form $cF$, with $c \in \mathbb{C}$, is not exact. If a 2-form $aF$ satisfies $a(0) = 0$, then it is the coboundary of the 1-form $(i\, \varepsilon_{\mu\nu}\, \hat p_\nu)^{-1} a\, E^\mu$, and this proves the following

Proposition 4.3. In the basis $\{\, 1,\ E^\mu = \gamma^\mu,\ F = \tfrac{1}{2} \varepsilon_{\mu\nu} \gamma^\mu \gamma^\nu \,\}$ of $\Omega^\bullet_D(A_\alpha)$, the de Rham differential algebra is specified by the following relations:

$$ E^\mu E^\nu = \varepsilon^{\mu\nu} F, \qquad \delta E^\mu = \delta F = 0, \qquad \delta a = i\, (\hat p_\mu a)\, E^\mu \quad \forall a \in A_\alpha. $$

Furthermore, the cohomology of the de Rham complex is given by

$$ H^0 \simeq H^2 \simeq \mathbb{C}, \qquad H^1 \simeq \mathbb{C}^2. \tag{4.39} $$
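The argument for $H^1$ can be tried out on finitely supported coefficients. The sketch below is an illustration only: it represents a function on $\mathbb{Z}^2$ as a dictionary, takes the components of $\delta b$ to be $i\,p_\mu b(p)$ as in Eq. (4.38), and builds a primitive of a closed 1-form mode by mode, dividing by whichever momentum component is non-zero. This mode-by-mode choice is an assumption of the sketch, not the formula used in the text, and the function names are hypothetical.

```python
def delta0(b):
    """Components (a_1, a_2) of delta b, with a_mu(p) = i p_mu b(p)."""
    a1 = {p: 1j * p[0] * v for p, v in b.items() if p[0] != 0}
    a2 = {p: 1j * p[1] * v for p, v in b.items() if p[1] != 0}
    return a1, a2

def is_closed(a1, a2, tol=1e-12):
    """Closedness: p_1 a_2(p) - p_2 a_1(p) = 0 for every p."""
    support = set(a1) | set(a2)
    return all(abs(p[0] * a2.get(p, 0) - p[1] * a1.get(p, 0)) < tol
               for p in support)

def primitive(a1, a2):
    """Return b with delta b = a for a closed 1-form whose zero mode vanishes."""
    b = {}
    for p in set(a1) | set(a2):
        if p[0] != 0:
            b[p] = a1.get(p, 0) / (1j * p[0])
        elif p[1] != 0:
            b[p] = a2.get(p, 0) / (1j * p[1])
        else:
            # A constant 1-form c_mu E^mu has no primitive.
            raise ValueError("constant mode is not exact")
    return b

b = {(1, 0): 2 + 1j, (2, -3): -1j, (0, 5): 0.5}
a1, a2 = delta0(b)
print(is_closed(a1, a2))
recovered = primitive(a1, a2)
print(all(abs(recovered[p] - b[p]) < 1e-12 for p in b))
```

The only obstruction is the mode $p = 0$, which carries the two constants $c_\mu$ that span $H^1$.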

This completes our study of the $N = 1$ data describing the non-commutative 2-torus at irrational deformation parameter.

4.3. Riemannian geometry ($N = (1,1)$). At the end of our discussion of the non-commutative 3-sphere, in Subsect. 3.4, we have briefly outlined a description in terms of “Riemannian” $N = (1,1)$ data, with the two generalized Dirac operators borrowed from conformal field theory, see [FG]. In the following, we will treat the non-commutative torus (at irrational deformation parameter) as a Riemannian space. Here we can, moreover, construct a set of $N = (1,1)$ data from the Connes spectral triple along the general lines of Subsect. 2.2.5. Our first task is to find a real structure $J$ on the $N = 1$ data $(A_\alpha, H, D, \sigma)$. To this end, we introduce the complex conjugation $\kappa : H \to H$, $(\kappa x)(p) := \bar x(p) := \overline{x(p)}$, as well as the charge conjugation matrix $C : H \to H$ as the unique (up to a sign) constant matrix such that

$$ C \gamma^\mu = -\bar\gamma^\mu C, \tag{4.40} $$

$$ C = C^* = C^{-1}. \tag{4.41} $$

Then it is easy to verify that $J = C\kappa$ is a real structure. The right actions of $A_\alpha$ and $\Omega^1_D(A_\alpha)$ on $H$ (see Subsect. 2.2.5) are given as follows:

$$ \xi \bullet a \equiv J a^* J^* \xi = \xi \bullet_\alpha a^\vee, \tag{4.42} $$

$$ \xi \bullet \omega \equiv J \omega^* J^* \xi = \gamma^\mu\, \xi \bullet_\alpha \omega_\mu^\vee, \tag{4.43} $$

where $\xi \in H$, $a \in A_\alpha$, $\omega \in \Omega^1_D(A_\alpha)$, $\xi \bullet_\alpha a$ denotes the diagonal right action of $a$ on $\xi$ by the deformed product, and $a^\vee(p) := a(-p)$.

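Notice that $(a \bullet_\alpha b)^\vee = a^\vee \bullet_\alpha b^\vee$. We denote by $\mathring H$ the dense subspace $S(\mathbb{Z}^2) \oplus S(\mathbb{Z}^2) \subset H$ of smooth spinors. The space $\mathring H$ is a two-dimensional free left $A_\alpha$-module with canonical basis $\{e_1, e_2\}$. Then, any connection $\nabla$ on $\mathring H$ is uniquely determined by its coefficients $\omega_i{}^j \in \Omega^1_D(A_\alpha)$:

$$ \nabla e_i = \omega_i{}^j \otimes e_j = \omega_{\mu i}{}^j\, \gamma^\mu \otimes e_j \in \Omega^1_D(A_\alpha) \otimes_{A_\alpha} \mathring H. \tag{4.44} $$

The “associated right connection” $\bar\nabla$ is then given by

$$ \bar\nabla e_i = e_j \otimes \bar\omega_i{}^j \in \mathring H \otimes_{A_\alpha} \Omega^1_D(A_\alpha), \tag{4.45} $$

where

$$ \bar\omega_j{}^i = -C_{ki}\, (\omega_l{}^k)^*\, C_{jl} = C_{ki}\, (\omega_{\mu l}{}^k)^*\, C_{jl}\, \gamma^\mu. \tag{4.46} $$

An arbitrary element in $\mathring H \otimes_{A_\alpha} \mathring H$ can be written as $e_i \otimes a^{ij} e_j$, where $a^{ij} \in A_\alpha$. As in Subsect. 2.2.5, the “Dirac operators” $D$ and $\bar D$ on $\mathring H \otimes_{A_\alpha} \mathring H$ associated to the connection $\nabla$ are given by

$$ D\bigl(e_i \otimes a^{ij} e_j\bigr) = e_i \otimes \bigl(\delta a^{ij} + \bar\omega_k{}^i a^{kj} + a^{ik} \omega_k{}^j\bigr) \bullet e_j, \tag{4.47} $$

$$ \bar D\bigl(e_i \otimes a^{ij} e_j\bigr) = e_i \bullet \bigl(\delta a^{ij} + \bar\omega_k{}^i a^{kj} + a^{ik} \omega_k{}^j\bigr) \otimes \sigma e_j. \tag{4.48} $$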

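In order to be able to define a scalar product on $\mathring H \otimes_{A_\alpha} \mathring H$, we need a Hermitian structure on the right module $\mathring H$, denoted by $\langle \cdot, \cdot \rangle$, with values in $A_\alpha$. It is defined by

$$ \int\!\!\!\!\!\!-\; \langle \xi, \zeta \rangle\, a = (\xi, \zeta a) \qquad \forall\, \xi, \zeta \in \mathring H,\ \forall\, a \in A_\alpha. \tag{4.49} $$

This Hermitian structure can be written explicitly as

$$ \langle \xi, \zeta \rangle = \sum_{i=1}^{2} \bar\xi_i \bullet_\alpha \zeta_i^{\vee}, \tag{4.50} $$

and it satisfies

$$ \langle \xi a, \zeta b \rangle = a^* \langle \xi, \zeta \rangle\, b \tag{4.51} $$

for all $\xi, \zeta \in \mathring H$ and $a, b \in A_\alpha$. Then we define the scalar product on $\mathring H \otimes_{A_\alpha} \mathring H$ as (see [Co1])

$$ (\xi_1 \otimes \xi_2,\ \zeta_1 \otimes \zeta_2) = \bigl(\xi_2,\ \langle \xi_1, \zeta_1 \rangle\, \zeta_2\bigr). \tag{4.52} $$

This expression can be written in a more suggestive way if one introduces a Hermitian structure, denoted $\langle \cdot, \cdot \rangle_L$, on the left module $\mathring H$:

$$ \langle \xi, \zeta \rangle_L := \langle J\xi, J\zeta \rangle. $$

This Hermitian structure satisfies

$$ \langle a \xi, b \zeta \rangle_L = a\, \langle \xi, \zeta \rangle_L\, b^* $$

for all $a, b \in A_\alpha$ and $\xi, \zeta \in \mathring H$, and the scalar product on $\mathring H \otimes_{A_\alpha} \mathring H$ can be written as follows:

$$ (\xi_1 \otimes \xi_2,\ \zeta_1 \otimes \zeta_2) = \int\!\!\!\!\!\!-\; \langle \xi_1, \zeta_1 \rangle\, \langle \zeta_2, \xi_2 \rangle_L. $$

A tedious computation shows that the relations

$$ D^* = D, \qquad \bar D^* = \bar D, \qquad \{D, \bar D\} = 0, \qquad D^2 = \bar D^2 \tag{4.53} $$

are equivalent to

$$ \tilde\nabla\, e_i \otimes e_j = 0 \qquad \forall\, i, j. \tag{4.54} $$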

In particular, we see that the original $N = 1$ data uniquely determine the operators $D$ and $\bar D$ satisfying the $N = (1,1)$ algebra (cf. Definition 2.20)

$$ D^2 = \bar D^2, \qquad \{D, \bar D\} = 0. $$

One can prove that there are unique $\mathbb{Z}_2$-grading operators

$$ \gamma = 1 \otimes \sigma, \qquad \bar\gamma = \sigma \otimes 1 \tag{4.55} $$

commuting with $A_\alpha$ and such that

$$ \{D, \gamma\} = \{\bar D, \bar\gamma\} = 0, \qquad [D, \bar\gamma] = [\bar D, \gamma] = 0. $$

The combined $\mathbb{Z}_2$-grading

$$ \Gamma = \gamma \bar\gamma $$

together with the Hodge operator

$$ * = \gamma $$

complete our data to a set of $N = (1,1)$ data $(A_\alpha,\ H \otimes_{A_\alpha} H,\ D,\ \bar D,\ \Gamma,\ *)$. Furthermore, these data admit a unique $\mathbb{Z}$-grading

$$ T = \frac{1}{2i}\, g_{\mu\nu}\, \gamma^\mu \otimes \gamma^\nu \sigma $$

commuting with $A_\alpha$, whose mod 2 reduction equals $\Gamma$, and such that $[T, d] = d$.

4.4. Kähler geometry ($N = (2,2)$). The classical torus can be regarded as a complex Kähler manifold, and thus it is natural to ask whether we can extend the $N = (1,1)$ spectral data to $N = (2,2)$ data in the non-commutative case, too. The simplest way to determine such an extension is to look for an anti-selfadjoint operator $I$ commuting with $A_\alpha$, $\Gamma$, $*$, and $T$, and then to define a new differential by

$$ d_I = [I, d]. \tag{4.56} $$

The nilpotency of $d_I$ implies further constraints on the operator $I$. The idea behind this construction is to identify $I$ with $i(T - \bar T)$, where $T$ and $\bar T$ are as in Definition 2.26. In the classical setting, this operator clearly has the above properties. The most general operator $I$ on $H \otimes_{A_\alpha} H$ that commutes with all elements of $A_\alpha$ is of the form

$$ I = \sum_{\mu,\nu=0}^{3} \gamma^\mu \otimes \gamma^\nu\, I^R_{\mu\nu}, \tag{4.57} $$

where $I^R_{\mu\nu}$ are elements of $A_\alpha$ acting on $H \otimes_{A_\alpha} H$ from the right, and where we have set

$$ \gamma^0 = 1, \qquad \gamma^3 = \sigma. \tag{4.58} $$

The vanishing of the commutators of $I$ with $\Gamma$ and $*$ implies that $I^R_{\mu\nu} = 0$ unless $\mu, \nu \in \{0, 3\}$. The equation $[I, T] = 0$ requires $I^R_{03} = I^R_{30}$ and leaves the coefficients $I^R_{00}$ and $I^R_{33}$ undetermined. Since the operator $I$ appears only through commutators, its trace part is irrelevant and we can set $I^R_{00} = 0$. All constraints together give

$$ I = (\sigma \otimes 1 + 1 \otimes \sigma)\, I^R_{03} + (\sigma \otimes \sigma)\, I^R_{33}, \tag{4.59} $$

where $I^R_{03}$ and $I^R_{33}$ are anti-selfadjoint elements of $A_\alpha$. We decompose $I$ into two parts

$$ I_1 = (\sigma \otimes 1 + 1 \otimes \sigma)\, I^R_{03}, \tag{4.60} $$

$$ I_2 = (\sigma \otimes \sigma)\, I^R_{33}, \tag{4.61} $$

and we introduce the new differentials according to Eq. (4.56),

$$ d_1 = d = \tfrac{1}{2}\,(D - i \bar D), \tag{4.62} $$

$$ d_2 = [I_1, d], \tag{4.63} $$

$$ d_3 = [I_2, d]. \tag{4.64} $$

The nilpotency of $d_2$ and $d_3$ implies that $I^R_{03}$ and $I^R_{33}$ are multiples of the identity, and we normalize them as follows:

$$ I_1 = \tfrac{i}{2}\,(\sigma \otimes 1 + 1 \otimes \sigma), \tag{4.65} $$

$$ I_2 = i\,(\sigma \otimes \sigma). \tag{4.66} $$

Comparing Eqs. (4.66) and (4.55), we see that

$$ I_2 = i\, \gamma \bar\gamma \tag{4.67} $$

and it follows, using Eqs. (4.62) and (4.64), that

$$ d_3 = [I_2, d] = 2 i\, d\, \gamma \bar\gamma. \tag{4.68} $$

Thus, the differential $d_3$ is a trivial modification of $d$, and we discard it. It is then easy to verify that $(A_\alpha,\ H \otimes_{A_\alpha} H,\ d_1,\ d_2,\ \Gamma,\ *,\ T)$ form a set of $N = (2,2)$ spectral data together with a $\mathbb{Z}$-grading. Furthermore, they are, as we have shown, canonically determined by the original $N = (1,1)$ data. Therefore, a Riemannian non-commutative torus (at irrational deformation parameter $\alpha$) admits a canonical Kähler structure. Notice that if we choose the metric $g^{\mu\nu} = \delta^{\mu\nu}$ in Eq. (4.4), then $\partial = -\tfrac{1}{2}(d_1 + i d_2)$ coincides with the holomorphic differential obtained in [Co1] from cyclic cohomology and using the equivalence of conformal and complex structures in two dimensions.

We have only given the definitions of the spectral data in the $N = (1,1)$ and the $N = (2,2)$ setting. As a straightforward application of the general methods described in Sect. 2, we could compute the associated de Rham resp. Dolbeault complexes, or geometrical quantities like curvature. We do not carry out these calculations. Instead, let us emphasize the following feature: In Sect. 3, we already saw that the topology of “the” non-commutative 3-sphere depends on the spectral data other than the algebra. Now, we learn once again that, for rational deformation parameter $\alpha = M/N$, the algebra $A_\alpha$ does not specify the geometry of the underlying non-commutative space. It is only the selection of a specific K-cycle $(H, D)$ that allows us to identify this space as a deformed torus. By choosing different K-cycles $(H, D)$ for the same algebra $A = M_N(\mathbb{C})$ (with $N = \sum_{j=1}^{l} j^2$) we are able to describe either a fuzzy three-sphere, as discussed in Sect. 3, or a non-commutative torus. In other words, choosing different spectral data, but keeping the algebra $A$ fixed, may lead to different non-commutative geometries. Yet, it is plausible that the sequence $A_N := M_N(\mathbb{C})$, $N = 1, 2, 3, \dots$, of algebras may be associated uniquely with non-commutative tori, while the sequence $A_N := M_N(\mathbb{C})$, $N = \sum_{j=1}^{l} j^2$, $l = 1, 2, 3, \dots$, may be associated uniquely with fuzzy three-spheres.


5. Directions for Future Work

In this work and in part I, we have presented an approach to (non-commutative) geometry rooted in supersymmetric quantum theory. We have classified the various types of classical and of non-commutative geometries according to the symmetries, or to the “supersymmetry content”, of their associated spectral data. Obviously, many natural and important questions remain to be studied. In this concluding section, we describe a few of these open problems and sketch, once more, some of the physical motivations underlying our work.

(1) An obvious question is whether one can give a complete classification of the possible types of spectral data in terms of graded Lie algebras (and, perhaps, q-deformed graded Lie algebras). As an example, we recall the structure of $N = 4^+$ spectral data, describing an extension of Kähler geometry (see Sects. 1.2 and 3 of part I). The spectral data involve the operators $d$, $d^*$, $\tilde d$, $\tilde d^*$, $L^3$, $L^+$, $L^-$, $J_0$ and $\triangle$, which close under taking (anti-)commutators: They generate a graded Lie algebra defined by

$$ [L^3, L^\pm] = \pm 2 L^\pm, \qquad [L^+, L^-] = L^3, \qquad [J_0, L^3] = [J_0, L^+] = 0, $$

$$ [L^3, d] = d, \qquad [L^+, d] = 0, \qquad [L^-, d] = \tilde d^*, \qquad [J_0, d] = -i\, \tilde d, $$

$$ [L^3, \tilde d] = \tilde d, \qquad [L^+, \tilde d] = 0, \qquad [L^-, \tilde d] = -d^*, \qquad [J_0, \tilde d] = i\, d, $$

$$ \{d, d\} = \{\tilde d, \tilde d\} = \{d, \tilde d\} = \{d, \tilde d^*\} = 0, \qquad \{d, d^*\} = \{\tilde d, \tilde d^*\} = \triangle, $$

where $\triangle$, the Laplacian, is a central element. The remaining (anti-)commutation relations follow by taking adjoints, with the rules that $\triangle$, $J_0$ and $L^3$ are self-adjoint, and $(L^-)^* = L^+$. It would be interesting to determine all graded Lie algebras (and their representations) occurring in spectral data of a (non-commutative) space. In the case of classical geometry, we have given a classification up to $N = (4,4)$ spectral data, and there appears to be enough information in the literature to settle the problem completely; see [Bes, HKLR, Joy].
In the non-commutative setting, however, further algebraic structures might occur, including q-deformations of graded Lie algebras. To give a list of all graded Lie algebras that are, in principle, admissible appears possible; see [FGR2] for additional discussion. However, in view of the classical case, where we only found the groups U(1), SU(2), Sp(4) and direct products thereof (see part I, Sect. 3), we expect that not all Lie group symmetries that may arise in principle are actually realized in (non-commutative) geometry. Determining the graded Lie algebras that actually occur in the spectral data of geometric spaces is clearly just the first step towards a classification of non-commutative spaces. A more difficult problem will be to characterize the class of all $*$-algebras $A$ that admit a given type of spectral data, i.e. the class of algebras that possess a K-cycle $(H, d_i)$ with a collection of differentials $d_i$ generating a certain graded Lie algebra such that the ordinary Lie group generators $X_j$ contained in the graded Lie algebra commute with the elements of $A$.

(2) Given some non-commutative geometry defined in terms of spectral data, it is natural to investigate its symmetries, i.e. to introduce a notion of diffeomorphisms. For definiteness, we start from a set of data $(A, H, d, d^*, T, *)$ with an $N = 2$ structure, cf. Sect. 2.2.6. To study notions of diffeomorphisms, it is useful to introduce an algebra $\Phi^\bullet_d(A)$ defined as the smallest $*$-algebra of (unbounded) operators containing $B := \pi(\Omega^\bullet(A)) \vee \pi(\Omega^\bullet(A))^*$ and arbitrary graded commutators of $d$ and $d^*$ with elements of $B$. Due to the existence of the $\mathbb{Z}$-grading $T$, $\Phi^\bullet_d(A)$ decomposes into a direct sum

$$ \Phi^\bullet_d(A) := \bigoplus_{n \in \mathbb{Z}} \Phi^n_d(A), \qquad \Phi^n_d(A) := \{\, \phi \in \Phi^\bullet_d(A) \mid [T, \phi]_g = n\, \phi \,\}. $$

Note that both positive and negative degrees occur. Thus, $\Phi^\bullet_d(A)$ is a graded $*$-algebra. This algebra is quite a natural object to introduce when dealing with $N = 2$ spectral data, as the algebra $\Omega^\bullet_d(A)$ of differential forms does not have a $*$-representation on $H$, because $d$ is not self-adjoint. Ignoring operator domain problems arising because the (anti-)commutator of $d$ with the adjoint of a differential form is unbounded, in general, we observe that $\Phi^\bullet_d(A)$ has the interesting property that it forms a complex with respect to the action of $d$ by graded commutation, and, in view of examples from quantum field theory, we call it the field complex in the following. For $N = (2,2)$ non-commutative Kähler data with holomorphic and anti-holomorphic gradings $T$ and $\bar T$, see Definition 2.26, one may introduce a bi-graded complex $\Phi^{\bullet,\bullet}_{\partial,\bar\partial}(A)$ in a similar way. A slight generalization of such bi-graded field complexes containing operators $\phi$ of degree $(n, m)$ with $n$ and $m$ real, but $n + m \in \mathbb{Z}$, naturally occurs in $N = (2,2)$ superconformal field theory, see e.g. [FG, FGR2] and references given there.

Next, we show how the field complex appears when we attempt to introduce a notion of diffeomorphisms of a (non-commutative) geometric space described in terms of $N = 2$ spectral data: One possible generalization of the notion of diffeomorphisms to non-commutative geometry is to identify them with $*$-automorphisms of the algebra $A$ of “smooth functions”. It may be advantageous, though, to follow concepts from classical geometry more closely: An infinitesimal diffeomorphism is then given by a derivation $\delta(\cdot) := [L, \cdot\,]$ of $A$, where $L$ is an element of $\Phi^0_d$ such that $\delta$ commutes with $d$, i.e. $[d, L] = 0$. The derivation $\delta$ can then be extended to all of $\pi(\Omega^\bullet_d(A))$, and $\delta$ preserves the degree of differential forms iff $L$ commutes with $T$, i.e. iff $L \in \Phi^0_d$.
For a classical manifold $M$, it turns out that each $L$ with the above properties can be written as $L = \{d, X\}$ for some vector field $X \in \Phi^{-1}_d$, i.e. $L$ is the Lie derivative in the direction of this vector field. In the non-commutative situation, however, it might happen that the cohomology of the field complex at the zeroth position is non-trivial. In this case, the study of diffeomorphisms of the non-commutative space necessitates studying the cohomology of the field complex $\Phi^\bullet_d(A)$ in degree zero.

As in classical differential geometry, it is interesting to investigate special diffeomorphisms, i.e. ones that preserve additional structure in the spectral data. As an example, consider derivations $\delta(\cdot) = [L, \cdot\,]$ such that $L$ commutes with $d$ and $d^*$: They generate isometries of the non-commutative space. For complex spectral data, we may consider derivations commuting not only with $d$ but also with $\partial$: They generate one-parameter groups of holomorphic diffeomorphisms. In the example of symplectic spectral data, we are interested in diffeomorphisms preserving the symplectic forms, i.e., in symplectomorphisms. One-parameter groups of symplectomorphisms are generated by derivations commuting with $d$ and $\tilde d^*$.

(3) Another important topic in non-commutative geometry is deformation theory. Given spectral data specified in terms of generators $\{X_j, d, d_\alpha, \triangle\}$ of a graded Lie algebra as in remark (1), we may study one-parameter families $\{X_j^{(t)}, d, d_\alpha^{(t)}, \triangle^{(t)}\}_{t \in \mathbb{R}}$ of deformations. Here, we choose to keep one generator, $d$, fixed, and we require that the graded Lie algebras are isomorphic to one another for all $t$. This means that we study deformations of the (non-commutative) complex or symplectic structure of a given space $A$ while preserving the differential and the de Rham complex. Only those deformations of spectral data are of interest which cannot be obtained from the original ones by $*$-automorphisms of the algebra $A$ commuting with $d$ (i.e. by “diffeomorphisms”). In classical geometry, the deformation theory of complex structures is well-developed (Kodaira-Spencer theory), and there are non-trivial results in the deformation theory of symplectic structures (e.g. Moser’s theorem); but this last topic is still a subject of active research.

Next, we consider deformations $d'$ of the differential $d$ of a given set of $N = 2$ spectral data $(A, H, d, d^*, T, *)$ which are of the form $d' := d + \omega$, for some operator $\omega \in \Phi^\bullet_d(A)$ of odd degree. We require that $d'$ again squares to zero, which implies that $\omega$ has to satisfy a zero curvature condition

$$ \omega^2 + \{d, \omega\} = 0. \tag{5.1} $$
We distinguish between several possibilities: First, we require that the deformed data still carry an $N = 2$ structure with the same $\mathbb{Z}$-grading $T$ as before. Then $\omega$ must be an element of $\Phi^1_d(A)$ satisfying (5.1), and we can identify it with the connection 1-form of a flat connection on some vector bundle; for an example, see the discussion of the structure of classical $N = (1,1)$ Dirac bundles in Sect. 2.2.3 of part I. More generally, we only require the deformed data to be of $N = (1,1)$ type, with a $\mathbb{Z}_2$-grading $\gamma$ given by the mod 2 reduction of $T$. As a simple example, consider an operator $\omega$ in $\Phi^\bullet_d(A)$ of degree $2n + 1$, with $n \neq 0$. Then condition (5.1) implies that $\omega^2 = 0$ and $\{d, \omega\} = 0$. If $\omega = [d, \beta]$ and $[\beta, \omega] = 0$ then $d' = e^{-\beta}\, d\, e^{\beta}$. We then say that $d$ and $d'$ are equivalent. If $\omega$ represents a non-trivial cohomology class of the field complex $\Phi^\bullet_d(A)$ then $d$ and $d'$ are inequivalent.

(4) In the introduction to paper I and in [FGR2] we have remarked that, from the point of view of physics, it is quite unnatural to attribute special importance to the algebra of functions over configuration space. The natural algebra in Hamiltonian mechanics is the algebra of functions over phase space, and, in quantum mechanics, it is a non-commutative deformation thereof, denoted $F_\hbar$ (where $\hbar$ is Planck’s constant), which is the natural algebra to study. In examples where phase space is given as the cotangent bundle $T^*M$ of a smooth manifold $M$, the configuration space, one may ask whether there are natural mathematical relations between spectral data involving the algebra $A = C^\infty(M)$ and ones involving the algebra $F_\hbar$. For example, it may be possible to represent $A$ and $F_\hbar$ on the same Hilbert space $H$ and consider spectral data $(A, H, d, T, *)$ and $(F_\hbar, H, d, T, *)$ with the same choice of operators $d$, $T$ and $*$ on $H$. It is well known that from $(A, H, d, T, *)$ configuration space $M$ can be reconstructed (Gelfand’s theorem and extensions thereof). This leads to the natural question whether $M$ can also be reconstructed from $(F_\hbar, H, d, T, *)$, or whether at least some of the topological properties of $M$, e.g. its Betti numbers, can be determined from these data. It is known that, in string theory, spectral data generalizing $(F_\hbar, H, d, T, *)$ do not determine configuration space uniquely; this is related to the subject of stringy dualities and symmetries, more precisely to T-dualities, see e.g. [GPR] and also [KS, FG].

The distinction between “algebras of functions on configuration space” $A$ and “algebras of functions on phase space” $F$ remains meaningful in many examples of non-commutative spaces. Typically, $F$ arises as a crossed product of $A$ by some group $G$ of “diffeomorphisms”. Under what conditions properties of the algebra $A$ can be inferred from spectral data $(F_\hbar, H, d, T, *)$ without knowing explicitly how the group $G$ acts on $F$ represents a problem of considerable interest in quantum theory. For another perspective concerning the distinction between “algebras of functions on configuration space” and “algebras of functions on phase space” see Sect. 2.2.6. It is worth emphasizing that in quantum field theory and string theory, where $M$ is an infinite-dimensional space, the analogue of the “algebra of functions on $M$”, i.e. of $A$, does not exist, while the analogue of the “algebra of functions on phase space $T^*M$”, i.e. of $F$, still makes sense. For additional discussion of these matters see also [FGR2].

(5) A topic in the theory of complex manifolds that has attracted a lot of interest recently is mirror symmetry. For a definition of mirrors of classical Calabi-Yau manifolds, see e.g. [Y] and references given there, and cf. the remarks at the end of Sect. I 2.4.3. It is natural to ask whether one can define mirrors of non-commutative spaces, and whether some classical manifolds may have non-commutative mirrors.
Superconformal field theory with N = (2, 2) supersymmetry suggests how one might define a mirror map in the context of non-commutative geometry (see [FG, FGR2]): Assume that two sets of N = (2, 2) spectral data $(\mathcal{A}_i, \mathcal{H}, \partial_i, \bar\partial_i, T_i, \bar T_i, *_i)$, i = 1, 2, are given, where the algebras $\mathcal{A}_i$ act on the Hilbert spaces $\mathcal{H}_i$ which are subspaces of a single Hilbert space $\mathcal{H}$ on which the operators $\partial_i$, $\bar\partial_i$, $T_i$, $\bar T_i$ and $*_i$ are defined. We say that the space $\mathcal{A}_2$ is the mirror of $\mathcal{A}_1$ if $\partial_2 = \partial_1$, $\bar\partial_2 = \bar\partial_1^{\,*}$, $T_2 = T_1$, $\bar T_2 = -\bar T_1$, and if the dimensions $b_i^{p,q}$ of the cohomology of the Dolbeault complexes (2.45) satisfy $b_2^{p,q} = b_1^{n-p,q}$, where n is the top dimension of differential forms (recall that in Definition 2.26 we required T and $\bar T$ to be bounded operators).

Let $\mathcal{A}$ be a non-commutative Kähler space with mirror $\tilde{\mathcal{A}}$. Within superconformal field theory, there is the following additional relation between the two algebras: Viewing $\mathcal{A}$ as the algebra of functions over a (non-commutative) target M, and analogously for $\tilde{\mathcal{A}}$ and $\tilde M$, the phase spaces over the loop spaces over M and $\tilde M$ coincide.

(6) The success of the theory presented in this paper will ultimately be measured in terms of the applications it has to concrete problems of geometry and physics. In particular, one should try to apply the notions developed here to further examples of truly non-commutative spaces such as quantum groups, or the non-commutative complex projective spaces (see e.g. [Ber, Ho, Ma, GKP]), non-commutative Riemann surfaces [KL], and non-commutative symmetric spaces [BLU, BLR, GP, BBEW]. In most of these cases, it is natural to ask whether the "deformed" spaces carry a complex or Kähler structure in the sense of Sect. 2.3 above.
From our point of view, however, the most interesting examples for the general theory and the strongest motivation to study spectral data with supersymmetry come from string theory: The “ground states” of string theory are described by certain N = (2, 2) superconformal field theories. They provide the


spectral data of the loop space over a target which is a "quantization" of classical space – or rather of an internal compact manifold. It may happen that the conformal field theory is the quantization of a σ-model of maps from a parameter space into a classical target manifold. In general, the target space reconstructed from the spectral data of the conformal field theory then turns out to be a (non-commutative) deformation of the target space of the classical σ-model. The example of the superconformal SU(2) Wess–Zumino–Witten model, which is the quantization of a σ-model with target SU(2), has been studied in some detail in [FG, Gr, FGR2] and has motivated the results presented in Sect. 3. A more interesting class of examples would consist of N = (2, 2) superconformal field theories which are quantizations of σ-models whose target spaces are given by three-dimensional Calabi-Yau manifolds. But one may also apply the methods developed in this paper to superconformal field theories which, at the outset, are not quantizations of some classical σ-models. They may enable us to reconstruct (typically non-commutative) geometric spaces from the supersymmetric spectral data of such conformal field theories. This leads to the idea that, quite generally, superconformal field theories are (quantum) σ-models, but with target spaces that tend to be non-commutative spaces. An interesting family of examples of this kind consists of the Gepner models, which are expected to give rise to non-commutative deformations of certain Calabi-Yau three-folds. For further discussion of these ideas see also [FG, FGR2].

Acknowledgement. We thank A. Connes for useful criticism and suggestions that led to improvements of this paper. We thank O. Augenstein, A.H. Chamseddine, G. Felder and K. Gawędzki for previous collaborations that generated many of the ideas underlying our work. Useful discussions with A. Alekseev are gratefully acknowledged.
We thank the referee for pointing out some imprecisions in Sect. 2.2.5. The work of O.G. is supported in part by the Department of Energy under Grant DE-FG02-94ER-25228 and by the National Science Foundation under Grant DMS-94-24334, and by the Clay Mathematics Institute. The work of A.R. has been supported in part by the Swiss National Foundation.

References

[AG] Alvarez-Gaumé, L.: Supersymmetry and the Atiyah–Singer index theorem. Commun. Math. Phys. 90, 161–173 (1983)
[AGF] Alvarez-Gaumé, L., Freedman, D.Z.: Geometrical structure and ultraviolet finiteness in the supersymmetric σ-model. Commun. Math. Phys. 80, 443–451 (1981)
[BC] Beazley Cohen, P.: Structure complexe non commutative et superconnexions. Preprint MPI für Mathematik, Bonn, MPI/92-19
[Ber] Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
[Bes] Besse, A.L.: Einstein Manifolds. Berlin–Heidelberg–New York: Springer-Verlag, 1987
[BLR] Borthwick, D., Lesniewski, A., Rinaldi, M.: Hermitian symmetric superspaces of type IV. J. Math. Phys. 34, 4817–4833 (1993)
[BLU] Borthwick, D., Lesniewski, A., Upmeier, H.: Non-perturbative quantization of Cartan domains. J. Funct. Anal. 113, 153–176 (1993)
[BBEW] Bordemann, M., Brischle, M., Emmrich, C., Waldmann, S.: Phase space reduction for star-products: An explicit construction for CP^n. Lett. Math. Phys. 36, 357–371 (1996)
[CF] Chamseddine, A.H., Fröhlich, J.: Some elements of Connes' non-commutative geometry, and space-time geometry. In: Chen Ning Yang, a Great Physicist of the Twentieth Century, C.S. Liu and S.-T. Yau (eds.), Cambridge, MA: International Press 1995, pp. 10–34
[CFF] Chamseddine, A.H., Felder, G., Fröhlich, J.: Gravity in non-commutative geometry. Commun. Math. Phys. 155, 205–217 (1993)
[CFG] Chamseddine, A.H., Fröhlich, J., Grandjean, O.: The gravitational sector in the Connes-Lott formulation of the standard model. J. Math. Phys. 36, 6255–6275 (1995)
[Co1] Connes, A.: Noncommutative Geometry. London–New York: Academic Press, 1994
[Co2] Connes, A.: Noncommutative differential geometry. Inst. Hautes Études Sci. Publ. Math. 62, 257–360 (1985)
[Co3] Connes, A.: The action functional in noncommutative geometry. Commun. Math. Phys. 117, 673–683 (1988)
[Co4] Connes, A.: Reality and noncommutative geometry. J. Math. Phys. 36, 6194–6231 (1995)
[Co5] Connes, A.: C*-algèbres et géométrie différentielle. C.R. Acad. Sci. Paris Sér. A-B 290, 599–604 (1980)
[CoK] Connes, A., Karoubi, M.: Caractère multiplicatif d'un module de Fredholm. K-Theory 2, 431–463 (1988)
[DFR] Doplicher, S., Fredenhagen, K., Roberts, J.E.: The quantum structure of space-time at the Planck scale and quantum fields. Commun. Math. Phys. 172, 187–220 (1995)
[FG] Fröhlich, J., Gawędzki, K.: Conformal field theory and the geometry of strings. CRM Proceedings and Lecture Notes, Vol. 7, 57–97 (1994)
[FGR1] Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory and differential geometry. Commun. Math. Phys. 193, 527–594 (1998)
[FGR2] Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetric quantum theory, non-commutative geometry, and gravitation. In: Quantum Symmetries, A. Connes, K. Gawędzki and J. Zinn-Justin (eds.), Les Houches, Session LXIV, 1995. Amsterdam, New York: Elsevier Science, 1998
[FGK] Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess–Zumino–Witten models with arbitrary simple groups. Commun. Math. Phys. 117, 127–158 (1988)
[FW] Friedan, D., Windey, P.: Supersymmetric derivation of the Atiyah–Singer index theorem and the chiral anomaly. Nucl. Phys. B235, 395–416 (1984)
[GKP] Grosse, H., Klimčík, C., Prešnajder, P.: Towards finite quantum field theory in non-commutative geometry. Int. J. Theor. Phys. 35, 231–244 (1996)
[GP] Grosse, H., Prešnajder, P.: The construction of non-commutative manifolds using coherent states. Lett. Math. Phys. 28, 239–250 (1993)
[GPR] Giveon, A., Porrati, M., Rabinovici, E.: Target space duality in string theory. Phys. Rep. 244, 77–202 (1994)
[Gr] Grandjean, O.: Non-commutative differential geometry. Ph.D. Thesis, ETH Zürich, July 1997
[GSW] Green, M.B., Schwarz, J.H., Witten, E.: Superstring Theory I, II. Cambridge: Cambridge University Press, 1987
[HKLR] Hitchin, N.J., Karlhede, A., Lindstrom, U., Rocek, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
[Ho] Hoppe, J.: Quantum theory of a massless relativistic surface and a two-dimensional boundstate problem. Ph.D. Thesis, MIT 1982; Quantum theory of a relativistic surface. In: Constraint's theory and relativistic dynamics, G. Longhi, L. Lusanna (eds.), Proceedings Florence 1986, Singapore: World Scientific
[Ja1] Jaffe, A., Lesniewski, A., Osterwalder, K.: Quantum K-theory I: The Chern character. Commun. Math. Phys. 118, 1–14 (1988)
[Ja2] Jaffe, A., Lesniewski, A., Osterwalder, K.: On super-KMS functionals and entire cyclic cohomology. K-theory 2, 675–682 (1989)
[Ja3] Jaffe, A., Osterwalder, K.: Ward identities for non-commutative geometry. Commun. Math. Phys. 132, 119–130 (1990)
[Jac] Jacobson, N.: Basic Algebra II. San Francisco, CA: W.H. Freeman and Company, 1985
[Joy] Joyce, D.D.: Compact hypercomplex and quaternionic manifolds. J. Differ. Geom. 35, 743–762 (1992); Manifolds with many complex structures. Q. J. Math. Oxf. II. Ser. 46, 169–184 (1995)
[Kar] Karoubi, M.: Homologie cyclique et K-théorie. Société Mathématique de France, Astérisque 149 (1987)
[KL] Klimek, S., Lesniewski, A.: Quantum Riemann surfaces I. The unit disc. Commun. Math. Phys. 146, 103–122 (1992); Quantum Riemann surfaces II. The discrete series. Lett. Math. Phys. 24, 125–139 (1992)
[KS] Klimčík, C., Ševera, P.: Dual non-abelian duality and the Drinfeld double. Phys. Lett. B 351, 455–462 (1995)
[Ma] Madore, J.: The commutative limit of a matrix geometry. J. Math. Phys. 32, 332–335 (1991)
[Pol] Polchinski, J.: Dirichlet branes and Ramond–Ramond charges. Phys. Rev. Lett. 75, 4724–4727 (1995); TASI lectures on D-branes, hep-th/9611050
[PS] Pressley, A., Segal, G.: Loop groups. Oxford: Clarendon Press, 1986
[Ri] Rieffel, M.: Non-commutative tori – a case study of non-commutative differentiable manifolds. Contemp. Math. 105, 191–211 (1990)
[Sw] Swan, R.G.: Vector bundles and projective modules. Trans. Amer. Math. Soc. 105, 264–277 (1962)
[Wi1] Witten, E.: Constraints on supersymmetry breaking. Nucl. Phys. B202, 253–316 (1982)
[Wi2] Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17, 661–692 (1982)
[Wi3] Witten, E.: Non-abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)
[Wi4] Witten, E.: Bound states of strings and D-branes. Nucl. Phys. B460, 335–350 (1996)
[Y] Yau, S.T. (ed.): Essays on mirror symmetry. Cambridge, MA: International Press, 1992

Communicated by A. Connes

Commun. Math. Phys. 203, 185 – 210 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Vertex Operator Solutions to the Discrete KP-Hierarchy

M. Adler 1,*, P. van Moerbeke 1,2,**

1 Department of Mathematics, Brandeis University, Waltham, Mass 02454, USA. E-mail: [email protected]
2 Department of Mathematics, Université de Louvain, 1348 Louvain-la-Neuve, Belgium. E-mail: [email protected]; [email protected]

Received: 27 August 1998 / Accepted: 24 November 1998

Abstract: Vertex operators, which are disguised Darboux maps, transform solutions of the KP equation into new ones. In this paper, we show that the bi-infinite sequence obtained by Darboux transforming an arbitrary KP solution recursively forward and backwards yields a solution to the discrete KP-hierarchy. The latter is a KP hierarchy where the continuous space x-variable gets replaced by a discrete n-variable. The fact that these sequences satisfy the discrete KP hierarchy is tantamount to certain bilinear relations connecting the consecutive KP solutions in the sequence. At the Grassmannian level, these relations are equivalent to a very simple fact, which is the nesting of the associated infinite-dimensional planes (flag). The discrete KP hierarchy can thus be viewed as a container for an entire ensemble of vertex or Darboux generated KP solutions. It turns out that many new and old systems lead to such discrete (semi-infinite) solutions, like sequences of soliton solutions, with more and more solitons, sequences of Calogero–Moser systems, having more and more particles, just to mention a few examples; this is developed in [3]. In this paper, as another example, we show that the q-KP hierarchy maps, via a kind of Fourier transform, into the discrete KP hierarchy, enabling us to write down a very large class of solutions to the q-KP hierarchy. This was also reported in a brief note [4].

Contents

0 Introduction
1 The KP τ-Functions, Grassmannians and a Residue Formula
2 The Existence of a τ-Vector and the Discrete KP Bilinear Identity
3 Sequences of τ-Functions, Flags and the Discrete KP Equation
4 Discrete KP-Solutions Generated by Vertex Operators
5 Example of Vertex Generated Solutions: The q-KP Equation

* The support of a National Science Foundation grant # DMS-9503246 is gratefully acknowledged.
** The support of a National Science Foundation grant # DMS-9503246, a Nato, a FNRS and a Francqui Foundation grant is gratefully acknowledged.

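0. Introduction

Given the shift operator $\Lambda = (\delta_{i,j-1})_{i,j\in\mathbb{Z}}$, consider the Lie algebra

$$\mathcal{D} = \Big\{ \sum_{-\infty < i \ll \infty} a_i \Lambda^i,\ a_i \text{ diagonal operators} \Big\} = \mathcal{D}_- + \mathcal{D}_+ \tag{0.1}$$

with the usual splitting $\mathcal{D} = \mathcal{D}_- + \mathcal{D}_+$ into subalgebras

$$\mathcal{D}_+ = \Big\{ \sum_{0 \le i \ll \infty} a_i \Lambda^i \in \mathcal{D} \Big\}, \qquad \mathcal{D}_- = \Big\{ \sum_{-\infty < i < 0} a_i \Lambda^i \in \mathcal{D} \Big\}. \tag{0.2}$$

The discrete KP-hierarchy equations

$$\frac{\partial L}{\partial t_n} = [(L^n)_+, L], \qquad n = 1, 2, \ldots \tag{0.3}$$

are deformations of an infinite matrix

$$L = \sum_{-\infty < i \le 0} a_i(t) \Lambda^i + \Lambda \in \mathcal{D}, \quad \text{with } t = (t_1, t_2, \ldots) \in \mathbb{C}^\infty. \tag{0.4}$$

If we represent $L$ as a dressing up of $\Lambda$ by a wave operator $S \in I + \mathcal{D}_-$,

$$L = S \Lambda S^{-1} = W \Lambda W^{-1}, \qquad W = S\, e^{\sum_1^\infty t_i \Lambda^i}, \tag{0.5}$$

then the L-deformations are induced by S-deformations and W-deformations:

$$\frac{\partial S}{\partial t_n} = -(L^n)_- S, \qquad \frac{\partial W}{\partial t_n} = (L^n)_+ W, \qquad n = 1, 2, \ldots. \tag{0.6}$$

In terms of vectors

$$\chi(z) = (z^n)_{n\in\mathbb{Z}}, \qquad \chi^*(z) = \chi(z^{-1}), \tag{0.7}$$

such that $z\chi(z) = \Lambda\chi(z)$, $z\chi^*(z) = \Lambda^\top\chi^*(z)$, let us define wave and adjoint wave vectors $\Psi(t,z)$ and $\Psi^*(t,z)$,

$$\Psi(t,z) = W\chi(z) \quad\text{and}\quad \Psi^*(t,z) = (W^{-1})^\top\chi^*(z). \tag{0.8}$$

We find, using (0.5), (0.8), (0.6), that

$$L\Psi(t,z) = z\Psi(t,z), \quad L^\top\Psi^*(t,z) = z\Psi^*(t,z), \quad \frac{\partial\Psi}{\partial t_n} = (L^n)_+\Psi, \quad \frac{\partial\Psi^*}{\partial t_n} = -((L^n)_+)^\top\Psi^*. \tag{0.9}$$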
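The eigenvector relations $z\chi(z) = \Lambda\chi(z)$ and $z\chi^*(z) = \Lambda^\top\chi^*(z)$ used above can be checked on a finite window of indices. The following is only a minimal sketch: the function names and the window bounds are hypothetical choices made for illustration, and the truncation to a finite window means one boundary entry is dropped on each side.

```python
from fractions import Fraction

def chi(z, lo, hi):
    """Window n = lo..hi of the bi-infinite vector chi(z) = (z^n)_{n in Z}."""
    return [Fraction(z) ** n for n in range(lo, hi + 1)]

def shift(v):
    """(Lambda v)_n = v_{n+1} for Lambda = (delta_{i,j-1}); valid for n = lo..hi-1."""
    return v[1:]

def shift_transpose(v):
    """(Lambda^T v)_n = v_{n-1}; valid for n = lo+1..hi."""
    return v[:-1]

z = Fraction(3, 2)
v = chi(z, -4, 4)            # chi(z)
vs = chi(1 / z, -4, 4)       # chi^*(z) = chi(1/z)

# Lambda chi(z) = z chi(z) on the entries n = lo..hi-1
assert shift(v) == [z * x for x in v[:-1]]
# Lambda^T chi^*(z) = z chi^*(z) on the entries n = lo+1..hi
assert shift_transpose(vs) == [z * x for x in vs[1:]]
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point approximation.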

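Theorem 0.1. If $L$ satisfies the discrete KP-hierarchy (0.3), then the wave vectors $\Psi(t,z)$ and $\Psi^*(t,z)$ can be expressed in terms of one sequence of τ-functions $\tau(n,t) := \tau_n(t_1, t_2, \ldots)$, $n \in \mathbb{Z}$, to wit:

$$\Psi(t,z) = e^{\sum_1^\infty t_i z^i}\psi(t,z) = \Big( \frac{\tau_n(t-[z^{-1}])}{\tau_n(t)}\, e^{\sum_1^\infty t_i z^i} z^n \Big)_{n\in\mathbb{Z}},$$
$$\Psi^*(t,z) = e^{-\sum_1^\infty t_i z^i}\psi^*(t,z) = \Big( \frac{\tau_{n+1}(t+[z^{-1}])}{\tau_{n+1}(t)}\, e^{-\sum_1^\infty t_i z^i} z^{-n} \Big)_{n\in\mathbb{Z}}, \tag{0.10}$$

satisfying the bilinear identity

$$\oint_{z=\infty} \Psi_n(t,z)\Psi^*_m(t',z)\,\frac{dz}{2\pi i z} = 0 \tag{0.11}$$

for all $n > m$. It follows that

$$\Psi = W\chi(z) = e^{\sum_1^\infty t_i z^i} S\chi(z), \qquad \Psi^* = (W^\top)^{-1}\chi^*(z) = e^{-\sum_1^\infty t_i z^i}(S^{-1})^\top\chi^*(z),$$

with¹

$$S = \sum_{n=0}^\infty \frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\,\Lambda^{-n} \quad\text{and}\quad S^{-1} = \sum_{n=0}^\infty \Lambda^{-n}\,\tilde\Lambda\Big(\frac{p_n(\tilde\partial)\tau(t)}{\tau(t)}\Big). \tag{0.12}$$

Then $L^k$ has the following expression in terms of τ-functions²,

$$L^k = \sum_{\ell=0}^\infty \mathrm{diag}\Big( \frac{p_\ell(\tilde\partial)\tau_{n+k-\ell+1}\circ\tau_n}{\tau_{n+k-\ell+1}\tau_n} \Big)_{n\in\mathbb{Z}} \Lambda^{k-\ell} \tag{0.13}$$

with the $\tau_n$'s satisfying

$$\Big( \frac{\partial}{\partial t_k} - \sum_{r=0}^{\ell-1} (\ell-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial) \Big)\tau_n\circ\tau_{n-\ell} = 0, \quad\text{for } \ell, k = 1, 2, 3, \ldots$$

and

$$\Big( \frac12\frac{\partial^2}{\partial t_1\partial t_k} - p_{k+1}(\tilde\partial) \Big)\tau_n\circ\tau_n = 0, \quad\text{for } k = 1, 2, 3, \ldots. \tag{0.14}$$

¹ In an expression like $S = \sum_n a^{(n)}\Lambda^n$, we have $a^{(n)} = \mathrm{diag}(a^{(n)}_k)_{k\in\mathbb{Z}}$; also introduce the notation $(\tilde\Lambda a)_k = a_{k+1}$, so that $\Lambda a = (\tilde\Lambda a)\Lambda$.

² where the $p_\ell$ are elementary Schur polynomials and where $p_\ell(\tilde\partial)f\circ g$ refers to the usual Hirota operation, to be defined in Sect. 1.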
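Since the elementary Schur polynomials enter every formula of Theorem 0.1, a small computational sketch may help. It assumes the standard conventions, which this part of the text does not restate: $e^{\sum_{i\ge1} t_i z^i} = \sum_{k\ge0} p_k(t) z^k$ and $[\alpha] = (\alpha, \alpha^2/2, \alpha^3/3, \ldots)$. The function names `schur` and `bracket` are hypothetical. Under these assumptions $p_k([\alpha]) = \alpha^k$, which is what turns $\tau(t - [z^{-1}])$ into the series $\sum_n p_n(-\tilde\partial)\tau(t)\, z^{-n}$.

```python
from fractions import Fraction

def schur(t, kmax):
    """Return [p_0, ..., p_kmax] for exp(sum_i t_i z^i) = sum_k p_k(t) z^k.

    Differentiating the generating function in z gives the recurrence
    k * p_k = sum_{i=1}^{k} i * t_i * p_{k-i}.  Missing t_i are treated as 0.
    """
    t = [Fraction(x) for x in t]
    p = [Fraction(1)]
    for k in range(1, kmax + 1):
        s = sum((i * t[i - 1] * p[k - i] for i in range(1, min(k, len(t)) + 1)),
                Fraction(0))
        p.append(s / k)
    return p

def bracket(alpha, n):
    """First n entries of [alpha] = (alpha, alpha^2/2, alpha^3/3, ...)."""
    a = Fraction(alpha)
    return [a ** i / i for i in range(1, n + 1)]

# p_1 = t_1, p_2 = t_2 + t_1^2/2, p_3 = t_3 + t_1 t_2 + t_1^3/6
assert schur([1, 1, 1], 3) == [1, 1, Fraction(3, 2), Fraction(13, 6)]

# p_k([alpha]) = alpha^k, because exp(sum (alpha z)^i / i) = 1 / (1 - alpha z)
alpha = Fraction(2, 3)
assert schur(bracket(alpha, 6), 6) == [alpha ** k for k in range(7)]
```

The recurrence avoids expanding the exponential directly, which keeps the sketch short and exact.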

Remark. Equation (0.13) reads, upon using (0.14),

$$L^k = \Lambda^k + \Big( \frac{\partial}{\partial t_1}\log\frac{\tau_{n+k}}{\tau_n} \Big)_{n\in\mathbb{Z}}\Lambda^{k-1} + \cdots + \Big( \frac{\partial}{\partial t_k}\log\frac{\tau_{n+1}}{\tau_n} \Big)_{n\in\mathbb{Z}}\Lambda^{0} + \Big( \frac{\partial^2}{\partial t_1\partial t_k}\log\tau_n \Big)_{n\in\mathbb{Z}}\Lambda^{-1} + \cdots. \tag{0.15}$$

With each component of the wave vector $\Psi$, or, what is the same, with each component of the τ-vector, we associate a sequence of infinite-dimensional planes in the Grassmannian $\mathrm{Gr}^{(n)}$,

$$\mathcal{W}_n = \mathrm{span}_{\mathbb{C}}\Big\{ \Big(\frac{\partial}{\partial t_1}\Big)^k \Psi_n(t,z),\ k = 0, 1, 2, \ldots \Big\} = e^{\sum_1^\infty t_i z^i}\,\mathrm{span}_{\mathbb{C}}\Big\{ \Big(\frac{\partial}{\partial t_1} + z\Big)^k \psi_n(t,z),\ k = 0, 1, 2, \ldots \Big\} =: e^{\sum_1^\infty t_i z^i}\,\mathcal{W}_n^t. \tag{0.16}$$

Note that the plane $z^{-n}\mathcal{W}_n \in \mathrm{Gr}^{(0)}$ has so-called virtual genus zero, in the terminology of [13]; in particular, this plane contains an element of order $1 + O(z^{-1})$. Setting $\{f, g\} = f'g - fg'$ for $' = \partial/\partial t_1$, we have the following statement:

Theorem 0.2. The following six statements are equivalent:

(i) The discrete KP-equations (0.3);

(ii) $\Psi$ and $\Psi^*$, with the proper asymptotic behaviour, provided by (0.8), satisfy the bilinear identities for all $t, t' \in \mathbb{C}^\infty$,

$$\oint_{z=\infty} \Psi_n(t,z)\Psi^*_m(t',z)\,\frac{dz}{2\pi i z} = 0, \quad\text{for all } n > m; \tag{0.17}$$

(iii) the τ-vector satisfies the following bilinear identities for all $n > m$ and $t, t' \in \mathbb{C}^\infty$:

$$\oint_{z=\infty} \tau_n(t-[z^{-1}])\,\tau_{m+1}(t'+[z^{-1}])\,e^{\sum_1^\infty (t_i - t'_i)z^i} z^{n-m-1}\,dz = 0; \tag{0.18}$$

(iv) The components $\tau_n$ of a τ-vector correspond to a flag of planes in Gr,

$$\cdots \supset \mathcal{W}_{n-1} \supset \mathcal{W}_n \supset \mathcal{W}_{n+1} \supset \cdots; \tag{0.19}$$

(v) A sequence of KP-τ-functions $\tau_n$ satisfying the equations

$$\{\tau_n(t-[z^{-1}]), \tau_{n+1}(t)\} + z\big(\tau_n(t-[z^{-1}])\tau_{n+1}(t) - \tau_{n+1}(t-[z^{-1}])\tau_n(t)\big) = 0; \tag{0.20}$$

(vi) A sequence of KP-τ-functions $\tau_n$ satisfying the first set of equations (0.14) for $\ell = 1$, i.e.,

$$\Big( \frac{\partial}{\partial t_k} - p_k(\tilde\partial) \Big)\tau_{n+1}\circ\tau_n = 0 \quad\text{for } k = 2, 3, \ldots \text{ and } n \in \mathbb{Z}. \tag{0.21}$$
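To make condition (vi) concrete, the lowest non-trivial case $k = 2$ of (0.21) can be written out. This is only a worked illustration under an assumption not restated here, namely the standard generating function $e^{\sum t_i z^i} = \sum_k p_k(t) z^k$, which gives $p_2(t) = t_2 + \tfrac12 t_1^2$; the Hirota operation is the one of Sect. 1 and primes denote $\partial/\partial t_1$.

```latex
p_2(\tilde\partial) = \frac12\frac{\partial}{\partial t_2} + \frac12\frac{\partial^2}{\partial t_1^2}
\quad\Longrightarrow\quad
\Big(\frac{\partial}{\partial t_2} - p_2(\tilde\partial)\Big)\tau_{n+1}\circ\tau_n
  = \frac12\Big(\frac{\partial}{\partial t_2} - \frac{\partial^2}{\partial t_1^2}\Big)\tau_{n+1}\circ\tau_n = 0,
\\[4pt]
\text{i.e.}\qquad
\frac{\partial\tau_{n+1}}{\partial t_2}\,\tau_n - \tau_{n+1}\,\frac{\partial\tau_n}{\partial t_2}
  = \tau_{n+1}''\,\tau_n - 2\,\tau_{n+1}'\,\tau_n' + \tau_{n+1}\,\tau_n''.
```

For $k = 1$ the operator $\partial/\partial t_1 - p_1(\tilde\partial)$ vanishes identically, which is consistent with (0.21) starting at $k = 2$.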

Remark. The 2-Toda lattice, studied in [15], amounts to two coupled discrete KP-hierarchies, thus introducing two sets of times $t_n$ and $s_n$. Actually, every discrete KP-hierarchy can naturally be extended to a 2-Toda lattice; this is the content of Theorem 3.4.

How to construct discrete KP-solutions. A wide class of examples of discrete KP-solutions is given in Sect. 4 by the following construction, involving the simple vertex operators,

$$X(t,z) := e^{\sum_1^\infty t_i z^i}\, e^{-\sum_1^\infty \frac{z^{-i}}{i}\frac{\partial}{\partial t_i}}, \tag{0.22}$$

which are disguised Darboux transformations acting on KP τ-functions. We now state:

Theorem 0.3. Consider an arbitrary τ-function for the KP equation and a family of weights $\ldots, \nu_{-1}(z)dz, \nu_0(z)dz, \nu_1(z)dz, \ldots$ on $\mathbb{R}$. The infinite sequence of τ-functions: $\tau_0 = \tau$ and, for $n > 0$,

$$\tau_n := \Big( \int X(t,\lambda)\nu_{n-1}(\lambda)\,d\lambda \cdots \int X(t,\lambda)\nu_0(\lambda)\,d\lambda \Big)\tau(t),$$
$$\tau_{-n} := \Big( \int X(-t,\lambda)\nu_{-n}(\lambda)\,d\lambda \cdots \int X(-t,\lambda)\nu_{-1}(\lambda)\,d\lambda \Big)\tau(t),$$

form a discrete KP-τ-vector, i.e., the bi-infinite matrix

$$L = \sum_{\ell=0}^\infty \mathrm{diag}\Big( \frac{p_\ell(\tilde\partial)\tau_{n+2-\ell}\circ\tau_n}{\tau_{n+2-\ell}\tau_n} \Big)_{n\in\mathbb{Z}} \Lambda^{1-\ell} \tag{0.23}$$

satisfies the discrete KP-hierarchy (0.3).

As an interesting special case of this situation, we study in Sect. 5 the q-KP equation. A wide variety of examples are captured by this construction, like q-approximations to KP, discussed in Sect. 5, but also soliton formulas, matrix integrals, certain integrals leading to band matrices, the Calogero–Moser system and others, discussed in [3]. The technology in Theorem 0.3 might also be applicable to other approximations to integrable systems, like the K & M-lattice [10]. For a detailed discussion of discrete approximations to integrable systems, see [7, 12].

Remark. A semi-infinite discrete KP-hierarchy with $\tau_0(t) = 1$ is equivalent to a bi-infinite discrete KP-hierarchy with $\tau_{-n}(t) = \tau_n(-t)$ and $\tau_0(t) = 1$; this also amounts to $\mathcal{W}_{-n} = \mathcal{W}_n^*$, with $\mathcal{W}_0 = H_+$. In such cases, one only keeps the lower right-hand corner of $L$, while the lower left-hand corner completely vanishes.

1. The KP τ-Functions, Grassmannians and a Residue Formula

As is well known [5], the bilinear identity

$$\oint_{z=\infty} \Psi(t,z)\Psi^*(t',z)\,dz = 0, \tag{1.1}$$

together with the asymptotics

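$$\Psi(t,z) = e^{\sum_1^\infty t_i z^i}\Big(1 + O\Big(\frac1z\Big)\Big), \qquad \Psi^*(t,z) = e^{-\sum_1^\infty t_i z^i}\Big(1 + O\Big(\frac1z\Big)\Big), \tag{1.2}$$

force $\Psi$, $\Psi^*$ to be expressible in terms of τ-functions

$$\Psi(t,z) = e^{\sum_1^\infty t_i z^i}\frac{\tau(t-[z^{-1}])}{\tau(t)}, \qquad \Psi^*(t,z) = e^{-\sum_1^\infty t_i z^i}\frac{\tau(t+[z^{-1}])}{\tau(t)};$$

moreover the KP τ-functions satisfy the differential Fay identity³, for all $y, z \in \mathbb{C}$, as shown in [1, 16]:

$$\{\tau(t-[y^{-1}]), \tau(t-[z^{-1}])\} + (y-z)\big(\tau(t-[y^{-1}])\tau(t-[z^{-1}]) - \tau(t)\tau(t-[y^{-1}]-[z^{-1}])\big) = 0. \tag{1.3}$$

In fact this identity characterizes the τ-function, as shown in [14]. From (1.1), it follows that

$$0 = \oint \tau(t-a-[z^{-1}])\,\tau(t+a+[z^{-1}])\,e^{-2\sum_1^\infty a_i z^i}\,\frac{dz}{2\pi i} = \sum_{k=1}^\infty a_k\Big( \frac{\partial^2}{\partial t_1\partial t_k} - 2p_{k+1}(\tilde\partial_t) \Big)\tau\circ\tau + O(a^2). \tag{1.4}$$

The Hirota notation used here is the following: Given a polynomial $p\big(\frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2}, \ldots\big)$ in $\frac{\partial}{\partial t_i}$, define the symbol

$$p\Big(\frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2}, \ldots\Big)(f\circ g)(t) := p\Big(\frac{\partial}{\partial u_1}, \frac{\partial}{\partial u_2}, \ldots\Big) f(t+u)\,g(t-u)\Big|_{u=0}, \tag{1.5}$$

and

$$\tilde\partial_t := \Big( \frac{\partial}{\partial t_1}, \frac12\frac{\partial}{\partial t_2}, \frac13\frac{\partial}{\partial t_3}, \ldots \Big).$$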
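The definition (1.5) can be unpacked in the lowest orders. The following worked cases are a sketch obtained directly from (1.5) by differentiating $f(t+u)g(t-u)$ in $u$ and setting $u = 0$; primes denote $\partial/\partial t_1$, a shorthand introduced here only for this illustration.

```latex
\frac{\partial}{\partial t_k}\, f\circ g = \frac{\partial f}{\partial t_k}\, g - f\,\frac{\partial g}{\partial t_k},
\qquad\text{in particular}\qquad
\frac{\partial}{\partial t_1}\, f\circ g = f'g - fg' = \{f, g\},
\\[4pt]
\frac{\partial^2}{\partial t_1^2}\, f\circ g = f''g - 2f'g' + fg'',
\qquad
\frac{\partial^2}{\partial t_1^2}\, f\circ f = 2\big(ff'' - f'^2\big) = 2f^2\,\frac{\partial^2}{\partial t_1^2}\log f .
```

Odd-order symbols vanish when $f = g$, for example $\frac{\partial}{\partial t_k} f\circ f = 0$, which is why the identities for $\tau\circ\tau$ in (1.4) start at second order.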

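For future use, we state the following proposition shown in [1]:

Proposition 1.1. Consider τ-functions $\tau_1$ and $\tau_2$, the corresponding wave functions

$$\Psi_j = e^{\sum_{i\ge1} t_i z^i}\frac{\tau_j(t-[z^{-1}])}{\tau_j(t)} = e^{\sum_{i\ge1} t_i z^i}\big(1 + O(z^{-1})\big) \tag{1.6}$$

and the associated infinite-dimensional planes, as points in the Grassmannian Gr,

$$\tilde{\mathcal{W}}_i = \mathrm{span}\Big\{ \Big(\frac{\partial}{\partial t_1}\Big)^k \Psi_i(t,z),\ \text{for } k = 0, 1, 2, \ldots \Big\} \quad\text{with}\quad \tilde{\mathcal{W}}_i^t = \tilde{\mathcal{W}}_i\, e^{-\sum_1^\infty t_k z^k};$$

then the following statements are equivalent:

(i) $z\tilde{\mathcal{W}}_2 \subset \tilde{\mathcal{W}}_1$;

(ii) $z\Psi_2(t,z) = \frac{\partial}{\partial t_1}\Psi_1(t,z) - \alpha\Psi_1(t,z)$, for some function $\alpha = \alpha(t)$;

(iii) $$\{\tau_1(t-[z^{-1}]), \tau_2(t)\} + z\big(\tau_1(t-[z^{-1}])\tau_2(t) - \tau_2(t-[z^{-1}])\tau_1(t)\big) = 0. \tag{1.7}$$

When (i), (ii) or (iii) holds, $\alpha(t)$ is given by

$$\alpha(t) = \frac{\partial}{\partial t_1}\log\frac{\tau_2}{\tau_1}. \tag{1.8}$$

Proof. To prove that (i) ⇒ (ii), the inclusion $z\tilde{\mathcal{W}}_2 \subset \tilde{\mathcal{W}}_1$, hence $z\tilde{\mathcal{W}}_2^t \subset \tilde{\mathcal{W}}_1^t$, implies by (0.16) that $z\psi_2(t,z) = z(1 + O(z^{-1})) \in \tilde{\mathcal{W}}_1^t$ must be a linear combination⁴

$$z\psi_2 = \frac{\partial\psi_1}{\partial t_1} + z\psi_1 - \alpha(t)\psi_1, \quad\text{and thus}\quad z\Psi_2 = \frac{\partial}{\partial t_1}\Psi_1 - \alpha(t)\Psi_1. \tag{1.9}$$

The expression (1.8) for $\alpha(t)$ follows from equating the $z^0$-coefficient in (ii), upon using the τ-function representation (1.6). To show that (ii) ⇒ (i), note that

$$z\Psi_2 = \frac{\partial}{\partial t_1}\Psi_1 - \alpha\Psi_1 \in \tilde{\mathcal{W}}_1,$$

and taking $t_1$-derivatives, we have

$$z\Big(\frac{\partial}{\partial t_1}\Big)^j\Psi_2 = \Big(\frac{\partial}{\partial t_1}\Big)^{j+1}\Psi_1 + \beta_1\Big(\frac{\partial}{\partial t_1}\Big)^j\Psi_1 + \cdots + \beta_{j+1}\Psi_1,$$

for some $\beta_1, \ldots, \beta_{j+1}$ depending on $t$ only; this implies the inclusion (i). The equivalence (ii) ⟺ (iii) follows from a straightforward computation using the τ-function representation (1.6) of (ii) and the expression for $\alpha(t)$.

Lemma 1.2. The following integral along a clockwise circle in the complex plane encompassing $z = \infty$ and $z = \alpha^{-1}$ can be evaluated as follows:

$$\oint_{z=\infty} f(t+[\alpha]-[z^{-1}])\,g(t-[\alpha]+[z^{-1}])\,\frac{z^{m+1}}{(z-\alpha^{-1})^2}\,\frac{dz}{2\pi i z} = \alpha^{1-m}\sum_{k=1}^\infty \alpha^k\Big( -\frac{\partial}{\partial t_k} + \sum_{r=0}^{m-1}(m-r)\,p_r(-\tilde\partial)\,p_{k-r}(+\tilde\partial) \Big) f\circ g.$$

Proof. By the residue theorem, the integral above is the sum of the residues at $z = \infty$ and at $z = \alpha^{-1}$:

$$\oint_{z=\infty} f(t+[\alpha]-[z^{-1}])\,g(t-[\alpha]+[z^{-1}])\,\frac{z^{m+1}}{(z-\alpha^{-1})^2}\,\frac{dz}{2\pi i z} = \frac{1}{(m-1)!}\Big(\frac{d}{du}\Big)^{m-1}\frac{f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])}{(1-u\alpha^{-1})^2}\bigg|_{u=0} - \frac{d}{dz}\Big( z^m f(t+[\alpha]-[z^{-1}])\,g(t-[\alpha]+[z^{-1}]) \Big)\bigg|_{z=\alpha^{-1}}. \tag{1.10}$$

³ $\{f, g\} := \frac{\partial f}{\partial t_1}\,g - f\,\frac{\partial g}{\partial t_1}$.

⁴ Remember $\psi_i$ is the same as $\Psi_i$, but without the exponential.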

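Evaluating each of the pieces requires a few steps.

Step 1.

$$\frac{1}{k!}\Big(\frac{d}{du}\Big)^k f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])\bigg|_{u=0} = \sum_{\ell=0}^\infty \alpha^\ell\,p_k(-\tilde\partial)\,p_\ell(\tilde\partial) f\circ g.$$

At first note

$$\Big(\frac{d}{du}\Big)^k F([u])\bigg|_{u=0} = k!\,p_k(\tilde\partial_s)F(s)\Big|_{s=0} \tag{1.11}$$

and, by (1.5) and (1.11),

$$\frac{1}{k!}\Big(\frac{d}{du}\Big)^k f(t+[u])\,g(t-[u])\bigg|_{u=0} = p_k(\tilde\partial)f\circ g = p_k(-\tilde\partial)g\circ f = \sum_{i+j=k} p_i(-\tilde\partial)g\cdot p_j(\tilde\partial)f. \tag{1.12}$$

Indeed

$$\begin{aligned}
\frac{1}{k!}\Big(\frac{d}{du}\Big)^k & f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])\bigg|_{u=0} \\
&= p_k(\tilde\partial_s)\,g(t-[\alpha]+s)\,f(t+[\alpha]-s)\Big|_{s=0}, && \text{using (1.11)} \\
&= p_k(\tilde\partial_s)\sum_{\ell=0}^\infty \alpha^\ell\,p_\ell(\tilde\partial_t)\,f(t-s)\circ g(t+s)\Big|_{s=0}, && \text{using (1.12)} \\
&= \sum_{\ell=0}^\infty \alpha^\ell\,p_k(\tilde\partial_s)\,p_\ell(\tilde\partial_w)\,f(t+w-s)\,g(t-w+s)\Big|_{s=w=0}, && \text{expressing Hirota,} \\
&= \sum_{\ell=0}^\infty \alpha^\ell\,p_k(\tilde\partial_s)\,p_\ell(-\tilde\partial_w)\,f(t-w-s)\,g(t+w+s)\Big|_{s=w=0}, && \text{flipping signs,} \\
&= \sum_{\ell=0}^\infty \alpha^\ell\,p_k(\tilde\partial_v)\,p_\ell(-\tilde\partial_v)\,f(t-v)\,g(t+v)\Big|_{v=0} \\
&= \sum_{\ell=0}^\infty \alpha^\ell\,p_k(-\tilde\partial)\,p_\ell(\tilde\partial) f\circ g, && \text{using (1.5).}
\end{aligned}$$

Step 2. Residue at ∞: Note

$$\Big(\frac{d}{du}\Big)^\ell\frac{1}{(1-u\alpha^{-1})^2}\bigg|_{u=0} = \Big(\frac{d}{du}\Big)^\ell\sum_{i=1}^\infty i\,(u\alpha^{-1})^{i-1}\bigg|_{u=0} = (\ell+1)!\,\alpha^{-\ell}; \tag{1.13}$$

then we find

$$\begin{aligned}
\frac{1}{(m-1)!}\Big(\frac{d}{du}\Big)^{m-1} & \frac{f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])}{(1-u\alpha^{-1})^2}\bigg|_{u=0} \\
&= \frac{1}{(m-1)!}\sum_{r=0}^{m-1}\binom{m-1}{r}\Big(\frac{d}{du}\Big)^r f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])\;\Big(\frac{d}{du}\Big)^{m-1-r}\frac{1}{(1-u\alpha^{-1})^2}\bigg|_{u=0} \\
&= \sum_{r=0}^{m-1}\sum_{\ell=0}^\infty (m-r)\,\alpha^{\ell-m+r+1}\,p_r(-\tilde\partial)\,p_\ell(\tilde\partial) f\circ g, \quad\text{using Step 1 and (1.13)} \\
&= m\alpha^{1-m}f(t)g(t) + \alpha^{1-m}\sum_{k=1}^\infty \alpha^k\sum_{r=0}^{m-1}(m-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial) f\circ g, \quad\text{using } p_0 = 1.
\end{aligned} \tag{1.14}$$

Step 3. Residue at $z = \alpha^{-1}$:

$$\begin{aligned}
\frac{d}{dz}\Big( z^m f(t+[\alpha]-[z^{-1}])\,g(t-[\alpha]+[z^{-1}]) \Big)\bigg|_{z=\alpha^{-1}}
&= -u^2\frac{d}{du}\Big( u^{-m} f(t+[\alpha]-[u])\,g(t-[\alpha]+[u]) \Big)\bigg|_{u=\alpha} \\
&= m\alpha^{-m+1}f(t)g(t) - \alpha^{2-m}\frac{d}{du}\, f(t+[\alpha]-[u])\,g(t-[\alpha]+[u])\bigg|_{u=\alpha} \\
&= m\alpha^{1-m}f(t)g(t) + \sum_{k=1}^\infty \alpha^{1-m+k}\frac{\partial}{\partial t_k} f\circ g, \quad\text{by elementary differentiation.}
\end{aligned} \tag{1.15}$$

Finally, putting Step 2 and Step 3 in (1.10) yields Lemma 1.2.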
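The elementary expansion (1.13) behind Step 2 can be checked with exact arithmetic. This is a small sketch only: the function name and the sample value of $\alpha$ are hypothetical, and the check uses the fact that the $\ell$-th derivative at $u = 0$ equals $\ell!$ times the $\ell$-th Taylor coefficient.

```python
from fractions import Fraction
from math import factorial

def inverse_square_coeffs(alpha, order):
    """Taylor coefficients in u of 1/(1 - u/alpha)^2 up to u^order,
    obtained by squaring the truncated geometric series sum_i (u/alpha)^i."""
    a = Fraction(alpha)
    geo = [a ** (-i) for i in range(order + 1)]
    return [sum((geo[i] * geo[l - i] for i in range(l + 1)), Fraction(0))
            for l in range(order + 1)]

alpha = Fraction(2, 5)
coeffs = inverse_square_coeffs(alpha, 6)

# (1.13): (d/du)^l [1/(1 - u/alpha)^2] at u = 0 equals (l+1)! * alpha^(-l)
for l, c in enumerate(coeffs):
    assert factorial(l) * c == factorial(l + 1) * alpha ** (-l)
```

The same coefficients $(\ell+1)\alpha^{-\ell}$ are what produce the weights $(m-r)$ in (1.14).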

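Lemma 1.3. The Hirota symbol acts on functions $f(t_1, t_2, \ldots)$ and $g(t_1, t_2, \ldots)$ as follows:

$$\frac{1}{fg}\frac{\partial^n}{\partial t_1\cdots\partial t_n} f\circ g = \text{a polynomial } P_n \text{ in } \begin{cases} \dfrac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\dfrac{f}{g} & \text{for } k \text{ odd} \\[8pt] \dfrac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log fg & \text{for } k \text{ even} \end{cases} \tag{1.16}$$

over all subsets $\{i_1, \ldots, i_k\} \subset \{1, \ldots, n\}$. Upon granting degree 1 to each partial in $t_i$, the polynomial $P_n$ is homogeneous of degree $n$.

Proof. By induction, we assume the statement to be valid for a Hirota symbol involving $\ell$ partials, and we prove the statement for a symbol involving $\ell + 1$ partials:

$$\begin{aligned}
\frac{1}{fg}\frac{\partial}{\partial t_{\ell+1}}\frac{\partial^\ell}{\partial t_1\cdots\partial t_\ell} f(t)\circ g(t)
&= \frac{1}{fg}\frac{\partial}{\partial u_{\ell+1}}\bigg( \frac{\frac{\partial^\ell}{\partial t_1\cdots\partial t_\ell} f(t+u)\circ g(t-u)}{f(t+u)\,g(t-u)}\, f(t+u)\,g(t-u) \bigg)\bigg|_{u=0} \\
&= \bigg[ \Big(\frac{\partial}{\partial t_{\ell+1}}\log\frac{f}{g}\Big)\frac{1}{fg}\frac{\partial^\ell}{\partial t_1\cdots\partial t_\ell} f(t+u)\circ g(t-u) \\
&\qquad + \frac{\partial}{\partial u_{\ell+1}} P\Big( \ldots, \frac{\partial^m}{\partial t_{i_1}\cdots\partial t_{i_m}}\log\frac{f(t+u)}{g(t-u)}, \ldots, \frac{\partial^n}{\partial t_{j_1}\cdots\partial t_{j_n}}\log f(t+u)g(t-u), \ldots \Big) \bigg]\bigg|_{u=0}
\end{aligned} \tag{1.17}$$

where $m$ is odd and $n$ even. The result follows from the simple computation:

$$\begin{aligned}
\frac{\partial}{\partial u_{\ell+1}}\frac{\partial^m}{\partial t_{i_1}\cdots\partial t_{i_m}}\log\frac{f(t+u)}{g(t-u)}\bigg|_{u=0} &= \frac{\partial^{m+1}}{\partial t_{i_1}\cdots\partial t_{i_m}\partial t_{\ell+1}}\log f(t)g(t), \\
\frac{\partial}{\partial u_{\ell+1}}\frac{\partial^n}{\partial t_{i_1}\cdots\partial t_{i_n}}\log f(t+u)g(t-u)\bigg|_{u=0} &= \frac{\partial^{n+1}}{\partial t_{i_1}\cdots\partial t_{i_n}\partial t_{\ell+1}}\log\frac{f(t)}{g(t)}.
\end{aligned} \tag{1.18}$$

Remark. The induction formula (1.17) can be made into an explicit formula for $P_n$, involving partitions of the set $\{1, 2, \ldots, n\}$.

2. The Existence of a τ-Vector and the Discrete KP Bilinear Identity

Before proving Theorem 0.1, we shall need two lemmas, which are analogues of basic lemmas in the theory of differential operators. So the main purpose of this section is threefold, namely, to prove the bilinear identities for the wave and adjoint wave vectors, to prove the existence of a τ-vector and finally to give a closed form for $L^k$.

Lemma 2.1. For $z$-independent $U, V \in \mathcal{D}$, the following matrix identities hold⁵

$$UV = \oint_{z=\infty} U\chi(z)\otimes V^\top\chi^*(z)\,\frac{dz}{2\pi i z}. \tag{2.1}$$

Proof. Set

$$U = \sum_\alpha u_\alpha\Lambda^\alpha \quad\text{and}\quad V = \sum_\beta \Lambda^\beta v_\beta,$$

where $u_\alpha$ and $v_\beta$ are diagonal matrices. To prove (2.1), it suffices to compare the $(i,j)$-entries on each side. On the left side of (2.1), we have

$$(UV)_{ij} = \Big( \sum_{\alpha,\beta} u_\alpha\Lambda^{\alpha+\beta}v_\beta \Big)_{ij} = \sum_{\alpha,\beta} u_\alpha(i)\,(\Lambda^{\alpha+\beta})_{ij}\,v_\beta(j) = \sum_{\substack{\alpha,\beta \\ \alpha+\beta=j-i}} u_\alpha(i)\,v_\beta(j).$$

On the right side of (2.1), we have

$$\begin{aligned}
\oint_{z=\infty} \big(U\chi(z)\big)_i\big(V^\top\chi(z^{-1})\big)_j\,\frac{dz}{2\pi i z}
&= \oint_{z=\infty} \Big( \sum_\alpha u_\alpha z^\alpha\chi(z) \Big)_i\Big( \sum_\beta v_\beta z^\beta\chi(z^{-1}) \Big)_j\,\frac{dz}{2\pi i z} \\
&= \oint_{z=\infty} \sum_{\alpha,\beta} u_\alpha(i)\,v_\beta(j)\,z^{\alpha+\beta+i-j}\,\frac{dz}{2\pi i z} \\
&= \sum_{\substack{\alpha,\beta \\ \alpha+\beta=j-i}} u_\alpha(i)\,v_\beta(j),
\end{aligned}$$

establishing (2.1).

⁵ $(A\otimes B)_{ij} = A_iB_j$ and remember $\chi^*(z) = \chi(z^{-1})$. The contour in the integration runs clockwise about ∞; i.e., opposite to the usual orientation.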
Lemma 2.2. For $W(t)$ a wave operator of the discrete KP-hierarchy,

$$W(t)W^{-1}(t') \in \mathcal{D}_+, \quad \forall t, t'. \tag{2.2}$$

Proof. Setting $h(t,t') = W(t)W^{-1}(t')$, compute from (0.6),

$$\frac{\partial h}{\partial t_n} = (L^n(t))_+\,h, \qquad \frac{\partial h}{\partial t'_n} = -h\,(L^n(t'))_+. \tag{2.3}$$

Since $h(t,t) = I \in \mathcal{D}_+$, it follows that $h(t,t')$ evolves in $\mathcal{D}_+$.

Consider the wave function, already defined in the introduction, and the adjoint wave function:

$$\begin{aligned}
\Psi(t,z) &= W\chi(z) = e^{\sum_1^\infty t_i z^i} S\chi(z) = e^{\sum t_i z^i}\Big( z^n + \sum_{i<n} s_i(n)z^i \Big)_{n\in\mathbb{Z}}, \\
\Psi^*(t,z) &= (W^{-1})^\top\chi^*(z) = e^{-\sum_1^\infty t_i z^i}(S^{-1})^\top\chi^*(z) = e^{-\sum t_i z^i}\Big( z^{-n} + \sum_{i<-n} s^*_i(n)z^i \Big)_{n\in\mathbb{Z}}.
\end{aligned} \tag{2.4}$$

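Proof of Theorem 0.1. Step 1. Setting

$$U := W(t) \quad\text{and}\quad V^\top := (W^{-1}(t'))^\top$$

in formula (2.1) of Lemma 2.1, and using formula (0.8) for $\Psi$ and $\Psi^*$ in terms of $W$, one finds for all $t, t' \in \mathbb{C}^\infty$,

$$W(t)W^{-1}(t') = \oint_{z=\infty} \Psi(t,z)\otimes\Psi^*(t',z)\,\frac{dz}{2\pi i z}. \tag{2.5}$$

But, according to Lemma 2.2, $W(t)W^{-1}(t') \in \mathcal{D}_+$ and thus (2.5) is upper-triangular, yielding

$$\oint_{z=\infty} \Psi_n(t,z)\Psi^*_m(t',z)\,\frac{dz}{2\pi i z} = 0 \quad\text{for all } n > m. \tag{2.6}$$

Defining

$$\Phi_n(t,z) := z^{-n}\Psi_n(t,z) = e^{\sum t_i z^i}\big(1 + O(z^{-1})\big), \qquad \Phi^*_n(t,z) := z^{n-1}\Psi^*_{n-1}(t,z) = e^{-\sum t_i z^i}\big(1 + O(z^{-1})\big),$$

upon using the asymptotics (2.4), we have, by setting $m = n - 1$ in (2.6),

$$\oint_{z=\infty} \Phi_n(t,z)\Phi^*_n(t',z)\,dz = \oint_{z=\infty} \Psi_n(t,z)\Psi^*_{n-1}(t',z)\,\frac{dz}{z} = 0.$$

From the KP-theory, there exists a τ-function $\tau_n(t)$ for each $n$, such that

$$\Phi_n(t,z) = e^{\sum t_i z^i}\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)}, \qquad \Phi^*_n(t,z) = e^{-\sum t_i z^i}\frac{\tau_n(t+[z^{-1}])}{\tau_n(t)},$$

yielding the τ-function representation (0.10) for $\Psi_n$ and $\Psi^*_n$.

Step 2. The following holds for $n \in \mathbb{Z}$:

$$\Big( \frac12\frac{\partial^2}{\partial t_1\partial t_k} - p_{k+1}(\tilde\partial) \Big)\tau_n\circ\tau_n = 0, \quad\text{for } k = 1, 2, 3, \ldots, \tag{2.7}$$

$$\Big( \frac{\partial}{\partial t_k} - \sum_{r=0}^{\ell-1}(\ell-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial) \Big)\tau_n\circ\tau_{n-\ell} = 0, \quad\text{for } \ell, k = 1, 2, 3, \ldots. \tag{2.8}$$

Indeed the bilinear identity (2.6), upon setting $m = n - \ell - 1$, shifting $t \mapsto t + [\alpha]$, $t' \mapsto t - [\alpha]$, using the τ-function representation (0.10) of $\Psi$ and $\Psi^*$, and Lemma 1.2 with $m = \ell$, yields⁶

$$\begin{aligned}
0 &= -\alpha^2\oint_{z=\infty} \Psi_n(t+[\alpha],z)\,\Psi^*_{n-\ell-1}(t-[\alpha],z)\,\tau_n(t+[\alpha])\,\tau_{n-\ell}(t-[\alpha])\,\frac{dz}{2\pi i z} \\
&= -\oint_{z=\infty} \tau_n(t+[\alpha]-[z^{-1}])\,\tau_{n-\ell}(t-[\alpha]+[z^{-1}])\,e^{2\sum_1^\infty(\alpha z)^i/i}\,\alpha^2 z^{\ell+1}\,\frac{dz}{2\pi i z} \\
&= \alpha^{1-\ell}\sum_{k=1}^\infty \alpha^k\Big( \frac{\partial}{\partial t_k} - \sum_{r=0}^{\ell-1}(\ell-r)\,p_r(-\tilde\partial)\,p_{k-r}(\tilde\partial) \Big)\tau_n\circ\tau_{n-\ell},
\end{aligned}$$

establishing (2.8). As for (2.7), set $m = n - 1$, $t \mapsto t - a$ and $t' \mapsto t + a$ in the bilinear identity, and use (1.4). This establishes the two Eqs. (0.14).

Step 3. To check the formulas (0.12) for $S$, compute

$$\begin{aligned}
e^{\sum_1^\infty t_i z^i} S\chi(z) =: \Psi(t,z) &= e^{\sum_1^\infty t_i z^i}\frac{\tau(t-[z^{-1}])}{\tau(t)}\chi(z) \quad\text{(by (0.10))} \\
&= e^{\sum_1^\infty t_i z^i}\sum_{n=0}^\infty \frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\,z^{-n}\chi(z) \\
&= e^{\sum_1^\infty t_i z^i}\sum_{n=0}^\infty \frac{p_n(-\tilde\partial)\tau(t)}{\tau(t)}\,\Lambda^{-n}\chi(z).
\end{aligned}$$

Similarly one checks the formula for $S^{-1}$ using the formulas for $\Psi^*(t,z)$ in terms of $S^{-1}$ and $\tau(t)$. Finally to check the formula (0.13) for $L^k$, use the formulas (0.12) for $S$ and $S^{-1}$ (for $\tilde\Lambda$, see footnote 1):

$$\begin{aligned}
L^k = S\Lambda^kS^{-1} &= \sum_{i,j\ge0} \frac{p_i(-\tilde\partial)\tau}{\tau}\,\Lambda^{-i-j+k}\,\tilde\Lambda\Big(\frac{p_j(\tilde\partial)\tau}{\tau}\Big) \\
&= \sum_{i,j\ge0} \frac{p_i(-\tilde\partial)\tau}{\tau}\,\tilde\Lambda^{-i-j+k+1}\Big(\frac{p_j(\tilde\partial)\tau}{\tau}\Big)\Lambda^{-i-j+k} \\
&= \sum_{\ell\ge0}\Big( \sum_{\substack{i,j\ge0 \\ i+j=\ell}} \frac{p_i(-\tilde\partial)\tau_n}{\tau_n}\,\frac{p_j(\tilde\partial)\tau_{n+k-\ell+1}}{\tau_{n+k-\ell+1}} \Big)_{n\in\mathbb{Z}}\Lambda^{k-\ell} \\
&= \sum_{\ell\ge0}\Big( \frac{p_\ell(\tilde\partial)\tau_{n+k-\ell+1}\circ\tau_n}{\tau_{n+k-\ell+1}\tau_n} \Big)_{n\in\mathbb{Z}}\Lambda^{k-\ell} \quad\text{(using (1.12))}
\end{aligned}$$

yielding (0.13) and (0.15), upon noting,

$$\begin{aligned}
\mathrm{coef}_{\Lambda^{k-1}}L^k &= \Big( \frac{p_1(\tilde\partial)\tau_{n+k}\circ\tau_n}{\tau_{n+k}\tau_n} \Big)_{n\in\mathbb{Z}} = \Big( \frac{\partial}{\partial t_1}\log\frac{\tau_{n+k}}{\tau_n} \Big)_{n\in\mathbb{Z}}, \\
\mathrm{coef}_{\Lambda^{0}}L^k &= \Big( \frac{p_k(\tilde\partial)\tau_{n+1}\circ\tau_n}{\tau_{n+1}\tau_n} \Big)_{n\in\mathbb{Z}} = \Big( \frac{\partial}{\partial t_k}\log\frac{\tau_{n+1}}{\tau_n} \Big)_{n\in\mathbb{Z}}, \quad\text{by (2.8),} \\
\mathrm{coef}_{\Lambda^{-1}}L^k &= \Big( \frac{p_{k+1}(\tilde\partial)\tau_n\circ\tau_n}{\tau_n^2} \Big)_{n\in\mathbb{Z}} = \Big( \frac{\partial^2}{\partial t_1\partial t_k}\log\tau_n \Big)_{n\in\mathbb{Z}}, \quad\text{by (2.7),}
\end{aligned}$$

concluding the proof of Theorem 0.1.

⁶ $e^{m\sum_1^\infty(\alpha z)^i/i} = (1-\alpha z)^{-m}$.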

Corollary 2.3. Setting $\gamma(t) := (\tilde\Lambda\tau(t)/\tau(t))$, the wave operator $W(t)$ for the discrete KP-hierarchy has the following property:

$$(W(t)W^{-1}(t'))_- = 0, \qquad (W(t)W^{-1}(t'))_0 = \frac{\gamma(t)}{\gamma(t')}.$$

Proof. That $h(t,t') = W(t)W^{-1}(t') \in \mathcal{D}_+$ was shown in Lemma 2.2. Concerning its diagonal $h_0$, we deduce from (2.3) that⁷

$$\frac{\partial}{\partial t_k}\log h_0 = (L^k(t))_0, \qquad \frac{\partial}{\partial t'_k}\log h_0 = -(L^k(t'))_0, \quad\text{with } h_0(t,t) = I.$$

Note that $\gamma(t)/\gamma(t')$ satisfies the same differential equations as $h_0(t,t')$ with the same initial condition, upon using (0.15):

$$\Big( \frac{\partial}{\partial t_k}\log\frac{\gamma(t)}{\gamma(t')} \Big)_n = \frac{\partial}{\partial t_k}\log\frac{\tau_{n+1}(t)}{\tau_n(t)} = L^k(t)_{nn},$$
$$\Big( \frac{\partial}{\partial t'_k}\log\frac{\gamma(t)}{\gamma(t')} \Big)_n = -\frac{\partial}{\partial t'_k}\log\frac{\tau_{n+1}(t')}{\tau_n(t')} = -L^k(t')_{nn},$$

with $\gamma(t)/\gamma(t')\big|_{t=t'} = I$.

⁷ $M_0$ := diagonal part of $M$.
3. Sequences of τ-Functions, Flags and the Discrete KP Equation

In this section, we prove Theorem 0.2; it will be broken up into three propositions: the first one is very similar to the analogous statement for the KP theory (see [5, 16]). One could make an argument unifying both cases, in the context of Lie theory. The second statement uses Grassmannian technology.

Proposition 3.1. The following equivalences (i) ⟺ (ii) ⟺ (iii) stated in Theorem 0.2 hold.

Proof. (i) ⇒ (ii) was already shown in Theorem 0.1. Regarding the converse (ii) ⇒ (i), we show that vectors $\Psi(t,z)$ and $\Psi^*(t,z)$ having the asymptotics (0.8) and satisfying the bilinear identity (ii) are discrete KP-hierarchy vectors. The point of the proof is to show that the matrices $S$ and $T^\top \in I + D_-$ defined through
$$\Psi(t,z) =: e^{\sum_1^\infty t_i z^i} S\chi(z), \qquad \Psi^*(t,z) =: e^{-\sum_1^\infty t_i z^i} T\chi^*(z)$$
satisfy the vector fields (0.6) with $T^\top = S^{-1}$.

Step 1. $T^\top = S^{-1}$. Assuming the bilinear identities (assumption (ii) of Theorem 0.2),
$$\begin{aligned}
0 &= \Bigl(\oint_{z=\infty} \Psi(t,z)\otimes\Psi^*(t,z)\,\frac{dz}{2\pi i z}\Bigr)_-\\
&= \Bigl(\oint_{z=\infty} e^{\sum_1^\infty t_i z^i} S\chi(z)\otimes e^{-\sum_1^\infty t_i z^i} T\chi(z^{-1})\,\frac{dz}{2\pi i z}\Bigr)_-\\
&= (ST^\top)_-, \quad\text{by (2.1)},
\end{aligned}$$
but since $S, T^\top \in I + D_-$, $ST^\top = I$, yielding $T^\top = S^{-1}$.

Step 2. $W(t)W^{-1}(t') \in D_+$, upon defining $W(t) := S(t)e^{\sum_1^\infty t_i\Lambda^i}$. According to the bilinear identity, the left-hand side of
$$\begin{aligned}
\oint_{z=\infty} \Psi(t,z)\otimes\Psi^*(t',z)\,\frac{dz}{2\pi i z}
&= \oint_{z=\infty} e^{\sum t_i z^i} S\chi(z)\otimes e^{-\sum_1^\infty t'_i z^i}(S^{-1})^\top\chi(z^{-1})\,\frac{dz}{2\pi i z}\\
&= \oint_{z=\infty} S(t)e^{\sum t_i\Lambda^i}\chi(z)\otimes (S^{-1}(t'))^\top e^{-\sum t'_i\Lambda^{\top i}}\chi(z^{-1})\,\frac{dz}{2\pi i z}\\
&= S(t)e^{\sum t_i\Lambda^i}e^{-\sum t'_i\Lambda^i}S^{-1}(t'), \quad\text{using Lemma 2.1}\\
&= W(t)W^{-1}(t')
\end{aligned}$$
belongs to $D_+$, and hence so does the right-hand side.

Step 3.
$$\begin{aligned}
\Bigl(\frac{\partial}{\partial t_n} - (L^n)_+\Bigr)\Psi(t,z)
&= \Bigl(\frac{\partial}{\partial t_n} - (L^n)_+\Bigr)S\chi(z)e^{\sum_1^\infty t_i z^i}\\
&= \Bigl(\frac{\partial S}{\partial t_n} - (L^n)_+S + Sz^n\Bigr)\chi(z)e^{\sum_1^\infty t_i z^i}\\
&= \Bigl(\frac{\partial S}{\partial t_n} - (L^n)_+S + S\Lambda^n(S^{-1}S)\Bigr)\chi(z)e^{\sum_1^\infty t_i z^i}\\
&= \Bigl(\frac{\partial S}{\partial t_n} - (L^n)_+S + L^nS\Bigr)\chi(z)e^{\sum_1^\infty t_i z^i}\\
&= \Bigl(\frac{\partial S}{\partial t_n} + (L^n)_-S\Bigr)\chi(z)e^{\sum_1^\infty t_i z^i}.
\end{aligned}$$

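Step 4. From $W(t)W^{-1}(t') \in D_+$, since $D_+$ is an algebra, deduce
$$\begin{aligned}
D_+ \ni \Bigl(\frac{\partial}{\partial t_n} - (L^n)_+\Bigr)W(t)\,W^{-1}(t')\Big|_{t'=t}
&= \oint_{z=\infty}\Bigl(\frac{\partial}{\partial t_n} - (L^n)_+\Bigr)\Psi(t,z)\otimes\Psi^*(t,z)\,\frac{dz}{2\pi i z}, \quad\text{by Lemma 2.1}\\
&= \oint_{z=\infty}\Bigl(\frac{\partial S(t)}{\partial t_n} + (L^n)_-S(t)\Bigr)\chi(z)e^{\sum_1^\infty t_i z^i}\otimes (S^\top(t))^{-1}\chi(z^{-1})e^{-\sum_1^\infty t_i z^i}\,\frac{dz}{2\pi i z}, \quad\text{by Step 3}\\
&= \Bigl(\frac{\partial S(t)}{\partial t_n} + (L^n)_-S(t)\Bigr)S(t)^{-1}, \quad\text{by Lemma 2.1},
\end{aligned}$$
and thus, since $S \in I + D_-$ and $D_-$ is an algebra,
$$\Bigl(\frac{\partial S(t)}{\partial t_n} + (L^n)_-S(t)\Bigr)S(t)^{-1} \in D_+\cap D_- = 0;$$
therefore, we have the discrete KP-hierarchy equations on $S$,
$$\frac{\partial S(t)}{\partial t_n} + (L^n)_-S = 0, \qquad n = 1, 2, \ldots,$$
and on $L = S\Lambda S^{-1}$,
$$\frac{\partial L}{\partial t_n} = [-(L^n)_-, L],$$
ending the proof that (ii) ⇒ (i). Finally (ii) ⟺ (iii) upon using the equivalence (i) ⟺ (ii) and the τ-function representation (0.10) of $\Psi$ and $\Psi^*$, shown in Theorem 0.1; this establishes Proposition 3.1.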


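With each component of the wave vector $\Psi$, or, what is the same, with each component of the τ-vector, we associate a sequence of infinite-dimensional planes in the Grassmannian $Gr^{(n)}$,
$$\begin{aligned}
W_n &= \operatorname{span}_{\mathbb{C}}\Bigl\{\Bigl(\frac{\partial}{\partial t_1}\Bigr)^k\Psi_n(t,z),\ k = 0, 1, 2, \ldots\Bigr\}\\
&= e^{\sum_1^\infty t_i z^i}\operatorname{span}_{\mathbb{C}}\Bigl\{\Bigl(\frac{\partial}{\partial t_1} + z\Bigr)^k\psi_n(t,z),\ k = 0, 1, 2, \ldots\Bigr\}\\
&=: e^{\sum_1^\infty t_i z^i}\,W_n^t,
\end{aligned}\tag{3.1}$$
and planes
$$W_n^* = \operatorname{span}_{\mathbb{C}}\Bigl\{\frac{1}{z}\Bigl(\frac{\partial}{\partial t_1}\Bigr)^k\Psi^*_{n-1}(t,z),\ k = 0, 1, 2, \ldots\Bigr\}, \tag{3.2}$$
which are the orthogonal complements of $W_n$ in $Gr^{(n)}$, by the residue pairing
$$\langle f, g\rangle_\infty := \oint_{z=\infty} f(z)g(z)\,\frac{dz}{2\pi i}. \tag{3.3}$$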

Proposition 3.2. The equivalences (ii) ⟺ (iv) ⟺ (v) of Theorem 0.2 hold.

Proof. The inclusion $\cdots \supset W_{n-1} \supset W_n \supset W_{n+1} \supset \cdots$ in (iv) implies that $W_n$, given by (3.1), is also given by
$$W_n = \operatorname{span}_{\mathbb{C}}\{\Psi_n(t,z), \Psi_{n+1}(t,z), \ldots\}.$$
Moreover the inclusions $\cdots \supset W_n \supset W_{n+1} \supset \cdots$ imply, by orthogonality, the inclusions $\cdots \subset W_n^* \subset W_{n+1}^* \subset \cdots$, and thus $W_n^*$, given by (3.2) and thus specified by $\Psi^*_{n-1}$ and $\tau_n$, is also given by
$$W_n^* = \Bigl\{\frac{\Psi^*_{n-1}(t,z)}{z}, \frac{\Psi^*_{n-2}(t,z)}{z}, \ldots\Bigr\}.$$
The bilinear identities (1.1) yield $W_n^* = W_n^\perp$, with respect to the residue pairing. Indeed, since each $\tau_n$ is a τ-function, we have that
$$\begin{aligned}
\Bigl\langle\Psi_n(t,z), \frac{\Psi^*_{n-1}(t',z)}{z}\Bigr\rangle_\infty
&= \oint_{z=\infty}\Psi_n(t,z)\Psi^*_{n-1}(t',z)\,\frac{dz}{2\pi i z}\\
&= \frac{1}{\tau_n(t)\tau_n(t')}\oint_{z=\infty}\tau_n(t-[z^{-1}])\,\tau_n(t'+[z^{-1}])\,e^{\sum_1^\infty (t_i-t'_i)z^i}\,\frac{dz}{2\pi i} = 0.
\end{aligned}$$
Since
$$W_n \subset W_{m+1} = (W^*_{m+1})^*, \qquad\text{all } n > m,$$
we have the orthogonality $W_n \perp W^*_{m+1}$ for all $n > m$, with respect to the residue pairing; since $\Psi_n(t,z) \in W_n$ and $\Psi^*_m(t',z)/z \in W^*_{m+1}(t',z)$, we have
$$0 = \Bigl\langle\Psi_n(t,z), \frac{\Psi^*_m(t',z)}{z}\Bigr\rangle_\infty = \oint_{z=\infty}\Psi_n(t,z)\Psi^*_m(t',z)\,\frac{dz}{2\pi i z}, \qquad\text{all } n > m, \tag{3.4}$$
which is (ii). Now assume (ii); then, for fixed $n > m$, we have
$$0 = \oint_{z=\infty}\Bigl(\frac{\partial}{\partial t_1}\Bigr)^k\Psi_n(t,z)\,\Bigl(\frac{\partial}{\partial t'_1}\Bigr)^\ell\Psi^*_m(t',z)\,\frac{dz}{2\pi i z}, \qquad n > m,$$
and thus by (3.1) and (3.2),
$$W_n \subseteq (W^*_{m+1})^* = W_{m+1}, \qquad\text{for } n > m,$$
which implies the flag condition $\cdots \supset W_{n-1} \supset W_n \supset W_{n+1} \supset \cdots$, stated in (iv). (iv) ⟺ (v) follows from the equivalence of (i) and (iii) in Proposition 1.1, by setting $\tau_1 := \tau_{n-1}$, $\tau_2 = \tau_n$, $\tilde W_1 = z^{-n+1}W_{n-1}$ and $\tilde W_2 = z^{-n}W_n$ and noting
$$z(z^{-n}W_n) \subset (z^{-n+1}W_{n-1}), \quad\text{i.e. } W_n \subset W_{n-1},$$
concluding the proof of the proposition.

Proposition 3.3. (v) ⟺ (vi), as in Theorem 0.2, holds.

Proof. Step 1. For a given $n \in \mathbb{Z}$, statement (v), namely
$$R_k^{(n)} := \{p_{k-1}(-\tilde\partial)\tau_n, \tau_{n+1}\} + \tau_{n+1}\,p_k(-\tilde\partial)\tau_n - \tau_n\,p_k(-\tilde\partial)\tau_{n+1} = 0, \qquad k \ge 2,$$
implies
$$R_k'^{(n)} = \Bigl(\frac{\partial}{\partial t_k} - p_k(\tilde\partial)\Bigr)\tau_{n+1}\circ\tau_n = 0, \qquad k \ge 2.$$
Since $R_k^{(n)}$ are the Taylor coefficients of relation (v) in Theorem 0.2, statement (v)$_n$ is equivalent to (iv)$_n$ (i.e. $W_n \supset W_{n+1}$). The latter is equivalent to the bilinear identity (iii)$_n$ (i.e., (0.18) with $n \to n+1$ and $m \to n-1$). According to the arguments used in the proof of Theorem 0.1, (iii)$_n$ implies $R_k'^{(n)} = 0$.

Step 2. The converse holds, because, upon using an inductive argument,
$$R_k^{(n)} = \alpha R_k'^{(n)} + \text{partials of } (R_1'^{(n)}, \ldots, R_{k-1}'^{(n)});$$
thus the vanishing of the $R_1'^{(n)}, \ldots, R_k'^{(n)}$ implies the vanishing of $R_k^{(n)}$.

Theorem 3.4. Every discrete KP-hierarchy is equivalent to a 2-Toda lattice.

Proof. The 1-Toda theory implies for $S_1 := S \in I + D_-$, $L_1 := L$,
$$\frac{\partial S_1}{\partial t_n} = -(L_1^n)_-S_1(t), \qquad\text{where } L_1 = S_1\Lambda S_1^{-1}.$$
Then, in view of the 2-Toda theory, define $S_2(t) \in D_+$ by means of the differential equations
$$\frac{\partial S_2(t)}{\partial t_n} = (L_1^n)_+S_2(t), \qquad n = 1, 2, \ldots,$$
with initial condition $S_2(0)$ = (an invertible element $d_+ \in D_+$). Then define$^8$ $S_{1,2}(t,s)$ and $L_{1,2} = S_{1,2}\Lambda^{\pm 1}S_{1,2}^{-1}$, flowing according to the commuting differential equations
$$\frac{\partial S_{1,2}(t,s)}{\partial s_n} = \pm(L_2^n(t,s))_\mp S_{1,2}(t,s) \quad\text{with } S_{1,2}(t,0) = S_{1,2}(t). \tag{3.5}$$
$S_{1,2}(t,s)$ satisfies the $t$-equations of 2-Toda for $s = 0$, by construction; now we must check that this holds for $s \ne 0$; therefore, set
$$F_{1,2}^{(n)}(t,s) = \frac{\partial S_{1,2}(t,s)}{\partial t_n} \pm (L_1^n(t,s))_\mp S_{1,2}(t,s), \qquad\text{for } n = 1, 2, \ldots. \tag{3.6}$$
Compute, using (3.5) and $[\partial/\partial t_n, \partial/\partial s_n] = 0$, the system of two differential equations,
$$\frac{\partial F_{1,2}^{(n)}}{\partial s_k}(t,s) = \pm[F_{2,1}^{(n)}S_2^{-1}, L_2^k]_\mp S_{1,2} \pm (L_2^k)_\mp F_{1,2}^{(n)}, \qquad k, n = 1, 2, \ldots;$$
since $F_{1,2}^{(n)}(t,0) = 0$, we have $F_{1,2}^{(n)}(t,s) = 0$ for all $s$. Thus, by (3.5) and (3.6), $S_{1,2}(t,s)$ flow according to 2-Toda.

4. Discrete KP-Solutions Generated by Vertex Operators

An important construction leading to Toda solutions is contained in Theorem 0.3, which is based on the following lemma:

Lemma 4.1. Particular solutions to equation
$$\{\tau_1(t-[z^{-1}]), \tau_2(t)\} + z\bigl(\tau_1(t-[z^{-1}])\tau_2(t) - \tau_2(t-[z^{-1}])\tau_1(t)\bigr) = 0 \tag{4.1}$$
are given, for arbitrary measures $\nu(\lambda)d\lambda$, $\nu'(\lambda)d\lambda$, by pairs $(\tau_1, \tau_2)$, defined by:
$$\tau_2(t) = \int X(t,\lambda)\nu(\lambda)d\lambda\ \tau_1(t) = \int e^{\sum t_i\lambda^i}\tau_1(t-[\lambda^{-1}])\,\nu(\lambda)d\lambda, \tag{4.2}$$
or
$$\tau_1(t) = \int X(-t,\lambda)\nu'(\lambda)d\lambda\ \tau_2(t) = \int e^{-\sum t_i\lambda^i}\tau_2(t+[\lambda^{-1}])\,\nu'(\lambda)d\lambda. \tag{4.3}$$

Proof. Using
$$e^{-\sum_1^\infty \frac{1}{i}\left(\frac{\lambda}{z}\right)^i} = 1 - \frac{\lambda}{z},$$
it suffices to check, before even integrating, that $\tau_2(t) = X(t,\lambda)\tau_1(t)$ satisfies the above Eq. (4.1):
$$\begin{aligned}
&e^{-\sum t_i\lambda^i}\Bigl(\{\tau_1(t-[z^{-1}]), \tau_2(t)\} + z\bigl(\tau_1(t-[z^{-1}])\tau_2(t) - \tau_2(t-[z^{-1}])\tau_1(t)\bigr)\Bigr)\\
&\quad= e^{-\sum t_i\lambda^i}\{\tau_1(t-[z^{-1}]), e^{\sum t_i\lambda^i}\tau_1(t-[\lambda^{-1}])\}\\
&\qquad + z\Bigl(\tau_1(t-[z^{-1}])\tau_1(t-[\lambda^{-1}]) - \Bigl(1-\frac{\lambda}{z}\Bigr)\tau_1(t)\tau_1(t-[z^{-1}]-[\lambda^{-1}])\Bigr)\\
&\quad= \{\tau_1(t-[z^{-1}]), \tau_1(t-[\lambda^{-1}])\}\\
&\qquad + (z-\lambda)\bigl(\tau_1(t-[z^{-1}])\tau_1(t-[\lambda^{-1}]) - \tau_1(t)\tau_1(t-[z^{-1}]-[\lambda^{-1}])\bigr) = 0,
\end{aligned}$$
using the differential Fay identity (1.3) for the τ-function $\tau_1$; a similar proof works for the second solution, given by $\tau_1(t) = X(-t,\lambda)\tau_2(t)$. Since Eq. (4.1) is linear in $\tau_1(t)$, and also in $\tau_2(t)$, the equation remains valid after integrating with regard to $\lambda$.

Proof of Theorem 0.3. Note, from the definition of $\tau_{\pm n}$ in Theorem 0.3, that each $\tau_n$ is defined inductively by
$$\tau_{n+1} = \int X(t,\lambda)\nu_n(\lambda)d\lambda\ \tau_n \quad\text{and}\quad \tau_{-n-1} = \int X(-t,\lambda)\nu_{-n-1}(\lambda)d\lambda\ \tau_{-n};$$
thus by Lemma 4.1, the functions $\tau_{n+1}$ and $\tau_n$ are a solution of Eq. (v) of Theorem 0.2. Therefore, Theorem 0.2 implies that the $\tau_n$'s form a τ-vector of the discrete KP hierarchy.

$^8$ The first index in $L_{1,2}$ and $S_{1,2}$ corresponds to the upper sign.

5. Example of Vertex Generated Solutions: The q-KP Equation

Consider the class of q-pseudo-difference operators, with $y$-dependent coefficients, acting on functions $f(y)$,
$$\mathcal{D}_q = \Bigl\{\sum a_i(y)D^i\Bigr\}, \qquad\text{with } Df(y) := f(qy),$$
and the q-derivative $D_q$, defined by
$$D_qf(y) := \frac{f(qy) - f(y)}{(q-1)y} = -\lambda(y)(D-1)f(y), \qquad\text{with } \lambda(y) := -\frac{1}{(q-1)y}.$$
Consider the following q-pseudo-difference operators:
$$Q = D + u_0(x)D^0 + u_{-1}D^{-1} + \cdots \quad\text{and}\quad Q_q = D_q + v_0(x)D_q^0 + v_{-1}(x)D_q^{-1} + \cdots$$
and the following q-deformations, which were proposed respectively by E. Frenkel [6] and Khesin, Lyubashenko and Roger [11], for $n = 1, 2, \ldots$:
$$\frac{\partial Q}{\partial t_n} = [(Q^n)_+, Q] \quad\text{(Frenkel system)}, \tag{5.1}$$
$$\frac{\partial Q_q}{\partial t_n} = [(Q_q^n)_+, Q_q] \quad\text{(KLR system)}, \tag{5.2}$$
where $(\ )_+$ and $(\ )_-$ refer to the q-difference and strictly q-pseudo-differential part of $(\ )$. Haine and Iliev [8] have constructed q-Schur polynomials, solutions to Eqs. (5.2), by inserting the vector $c(x)$ below in the usual Schur polynomials. In an elegant paper, Iliev [9] has obtained q-bilinear identities and q-tau functions, as well, purely within the KP theory. Defining
$$c(x) = \Bigl(\frac{(1-q)x}{1-q}, \frac{(1-q)^2x^2}{2(1-q^2)}, \frac{(1-q)^3x^3}{3(1-q^3)}, \ldots\Bigr) \in \mathbb{C}^\infty \quad\text{and}\quad \lambda_n^{-1}(x) = (1-q)xq^{n-1}, \tag{5.3}$$
one checks for $n \ge 1$, $D^n\lambda_0(x) = \lambda_n(x)$, and
$$D^nc(x) = c(x) - \sum_1^n[\lambda_i^{-1}(x)], \qquad D^{-n}c(x) = c(x) + \sum_1^n[\lambda_{-i+1}^{-1}(x)]. \tag{5.4}$$
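The shift relations (5.4) can be checked numerically on truncated vectors. The sketch below is a sanity check under stated assumptions, not part of the argument: it uses $c(x)$ and $\lambda_n^{-1}(x) = (1-q)xq^{n-1}$ exactly as in (5.3), takes $[\alpha] = (\alpha, \alpha^2/2, \alpha^3/3, \ldots)$, and reads $D^n c(x)$ as $c(q^n x)$. The function names are hypothetical.

```python
from fractions import Fraction

def c(x, q, k_max):
    """First k_max entries of c(x): c_k = ((1-q) x)^k / (k (1 - q^k))."""
    return [((1 - q) * x)**k / (k * (1 - q**k)) for k in range(1, k_max + 1)]

def bracket(alpha, k_max):
    """[alpha] = (alpha, alpha^2/2, alpha^3/3, ...), truncated."""
    return [alpha**k / k for k in range(1, k_max + 1)]

def lam_inv(n, x, q):
    """lambda_n^{-1}(x) = (1 - q) x q^(n-1), the convention of (5.3)."""
    return (1 - q) * x * q**(n - 1)

def shifted_c(n, x, q, k_max):
    """D^n c(x) = c(q^n x), for any integer n."""
    return c(q**n * x, q, k_max)

def rhs_54(n, x, q, k_max):
    """Right-hand sides of (5.4): n >= 0 for D^n c, n < 0 for D^{-|n|} c."""
    out = c(x, q, k_max)
    if n >= 0:
        for i in range(1, n + 1):
            out = [a - b for a, b in zip(out, bracket(lam_inv(i, x, q), k_max))]
    else:
        for i in range(1, -n + 1):
            out = [a + b for a, b in zip(out, bracket(lam_inv(-i + 1, x, q), k_max))]
    return out

x, q, k_max = Fraction(4, 5), Fraction(1, 3), 6
ok = all(shifted_c(n, x, q, k_max) == rhs_54(n, x, q, k_max) for n in range(-3, 4))
print(ok)
```

The values of `x`, `q` and the truncation length `k_max` are arbitrary; any rational $q \neq 0, \pm 1$ should work equally well.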

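Theorem 5.1. There is an algebra isomorphism $\hat{\ } : \mathcal{D}_q \to \mathcal{D}$, which maps the Frenkel and KLR system into the discrete KP-hierarchy
$$\frac{\partial L}{\partial t_n} = [(L^n)_+, L], \qquad n = 1, 2, \ldots. \tag{5.5}$$

Theorem 5.2. Consider the matrices
$$L = \Lambda + \sum_{-\infty<\ell\le 0}\operatorname{diag}\Bigl(\frac{p_{1-\ell}(\tilde\partial)\tau_{n+\ell+1}\circ\tau_n}{\tau_{n+\ell+1}\,\tau_n}\Bigr)_{n\in\mathbb{Z}}\Lambda^\ell \quad\text{and}\quad \tilde L = \varepsilon L\varepsilon^{-1},$$
where $\varepsilon$ is defined in (5.11) and where $\tau_0 = \tau(c(x)+t)$ and
$$\begin{aligned}
\tau_n &= X(t,\lambda_n)\cdots X(t,\lambda_1)\tau(c(x)+t) = r_n(\lambda)\Bigl(\prod_{k=1}^n e^{\sum_{i=1}^\infty t_i\lambda_k^i}\Bigr)D^n\tau(c(x)+t),\\
\tau_{-n} &= X(-t,\lambda_{-n+1})\cdots X(-t,\lambda_0)\tau(c(x)+t) = r_{-n}(\lambda)\Bigl(\prod_{k=1}^n e^{-\sum_{i=1}^\infty t_i\lambda_{-k+1}^i}\Bigr)D^{-n}\tau(c(x)+t).
\end{aligned}\tag{5.6}$$
The matrices $L$ and $\tilde L$ transform, using the map $\hat{\ }$, respectively into solutions to the q-KP deformations (5.1) and (5.2):
$$Q = D + \sum_{-\infty<i\le 0}a_i(y)D^i \quad\text{or}\quad Q_q = D_q + \sum_{-\infty<i\le 0}b_i(y)D_q^i,$$
where the $b_i$ are related to the $a_i$ by (5.12) and$^9$
$$a_\ell(y) = \text{polynomial in }\begin{cases}\dfrac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\bigl(\tau(c(y)+t)^{\pi(k)}\,D^{\ell+1}\tau(c(y)+t)\bigr) & \text{for } k \ge 2,\\[2ex] \displaystyle\sum_{i=1}^{\ell+1}\lambda_i^j(y) + \frac{\partial}{\partial t_j}\log\frac{D^{\ell+1}\tau(c(y)+t)}{\tau(c(y)+t)} & \text{for } k = 1.\end{cases}$$

$^9$ $\pi(k)$ = parity of $k$ = 1, when $k$ is even, and = −1, when $k$ is odd.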

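The proofs of these theorems, which rely heavily on the next lemma, will be given later. Consider an appropriate space of functions $f(y)$ representable by "Fourier" series
$$f(y) = \sum_{-\infty}^{\infty}f_n\varphi_n(y)$$
in the basis$^{10}$ $\varphi_n(y) := \delta(q^{-n}x^{-1}y)$ for fixed $q \ne 1$, and a parameter $x \in \mathbb{R}$. Also, remember
$$\lambda_i(x) := D^i\lambda_0(x) = \lambda(xq^i). \tag{5.7}$$

Lemma 5.3. Then the Fourier transform, $f \mapsto \mathcal{F}f = (\ldots, f_n, \ldots)_{n\in\mathbb{Z}}$, induces an algebra isomorphism $\hat{\ }$, mapping $D$-operators into $\Lambda$-operators
$$\hat{\ } : \mathcal{D}_q \to \mathcal{D}, \qquad \sum_i a_i(y)D^i \mapsto \sum_i \hat a_i\Lambda^i := \sum_i\operatorname{diag}(\ldots, a_i(xq^n), \ldots)_{n\in\mathbb{Z}}\,\Lambda^i. \tag{5.8}$$
Moreover
$$\sum_{i=0}^n b_i(y)D_q^i = \sum_{i=0}^n a_i(y)(-\lambda D)^i \mapsto \varepsilon\Bigl(\sum_{i=0}^n \hat a_i\Lambda^i\Bigr)\varepsilon^{-1}, \tag{5.9}$$
where the $\Lambda$-operator in brackets is monic, with$^{11}$
$$\hat\lambda = (\ldots, \lambda_{-1}(x), \lambda_0(x), \lambda_1(x), \ldots) = (\ldots, D^{-1}\lambda, \lambda, D\lambda, \ldots), \tag{5.10}$$
$$\varepsilon := \operatorname{diag}\Bigl(\ldots, \lambda_{-2}\lambda_{-1}, -\lambda_{-1}, 1, -\frac{1}{\lambda_0}, \frac{1}{\lambda_0\lambda_1}, -\frac{1}{\lambda_0\lambda_1\lambda_2}, \ldots\Bigr) \quad\text{with } \varepsilon_0 = 1, \tag{5.11}$$
$$a_i(y) := \sum_{0\le k\le n-i}\begin{bmatrix}k+i\\k\end{bmatrix}(-y(q-1)q^i)^k\,b_{k+i}(y). \tag{5.12}$$

Proof. The operators $D$ and multiplication by a function $a(y)$ act on basis elements, as follows:
$$D\varphi_n(y) = \varphi_{n-1}(y) \quad\text{and}\quad a(y)\varphi_n(y) = a(xq^n)\varphi_n(y).$$

$^{10}$ The δ-function $\delta(z) := \sum_{i\in\mathbb{Z}}z^i$ enjoys the property $f(za)\delta(z) = f(a)\delta(z)$.

$^{11}$ $\begin{bmatrix}n\\k\end{bmatrix} := \dfrac{[n][n-1]\cdots[n-k+1]}{[k][k-1]\cdots[1]}$ with $[j] := \dfrac{1-q^j}{1-q}$.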

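Therefore $D^k$ and $a(y)$ act on functions $f(y)$, as
$$f(y) = \sum_{n\in\mathbb{Z}}f_n\varphi_n(y) \mapsto D^kf(y) = \sum_{n\in\mathbb{Z}}f_nD^k\varphi_n(y) = \sum_{n\in\mathbb{Z}}f_n\varphi_{n-k}(y) = \sum_{n\in\mathbb{Z}}f_{n+k}\varphi_n(y), \tag{5.13}$$
and
$$f(y) = \sum_{n\in\mathbb{Z}}f_n\varphi_n(y) \mapsto a(y)f(y) = \sum_{n\in\mathbb{Z}}f_na(y)\varphi_n(y) = \sum_{n\in\mathbb{Z}}f_na(xq^n)\varphi_n(y), \tag{5.14}$$
from which it follows that
$$(D^k)\hat{\ } = \Lambda^k, \tag{5.15}$$
$$\hat a(y) = \operatorname{diag}(\ldots, a(xq^n), \ldots)_{n\in\mathbb{Z}}. \tag{5.16}$$
To establish the algebra isomorphism (5.8), one checks that
$$\begin{aligned}
(a(y)D^i)\hat{\ }\,(b(y)D^j)\hat{\ } &= \hat a(y)\Lambda^i\,\hat b(y)\Lambda^j\\
&= \hat a(y)\bigl(\Lambda^i\hat b(y)\Lambda^{-i}\bigr)\Lambda^{i+j}\\
&= \operatorname{diag}(\ldots, a(xq^n)b(xq^{n+i}), \ldots)_{n\in\mathbb{Z}}\,\Lambda^{i+j}\\
&= (a(y)b(yq^i)D^{i+j})\hat{\ }\\
&= (a(y)D^i\,b(y)D^j)\hat{\ }.
\end{aligned}\tag{5.17}$$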
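The multiplicativity check (5.17) can be illustrated on a finite window of indices. The following sketch rests on assumptions not in the text: it truncates the bi-infinite matrices to $n = 0, \ldots, N-1$ and only treats nonnegative powers $i, j \ge 0$, for which the truncated products are exact. It represents $a(y)D^i$ by $\operatorname{diag}(a(xq^n))\Lambda^i$ with $\Lambda_{n,n+1} = 1$, as in (5.8), and compares both sides of (5.17). The names `hat`, `matmul`, `compose` are illustrative.

```python
from fractions import Fraction

N = 6  # size of the finite window n = 0, ..., N-1 (a truncation, assumed here)

def hat(a, i, x, q):
    """Matrix of a(y) D^i on the window: diag(a(x q^n)) * Lambda^i, i >= 0."""
    M = [[Fraction(0)] * N for _ in range(N)]
    for n in range(N - i):
        M[n][n + i] = Fraction(a(x * q**n))
    return M

def matmul(A, B):
    """Plain N x N matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]

def compose(a, i, b, j, q):
    """(a(y) D^i)(b(y) D^j) = a(y) b(y q^i) D^(i+j), as (coefficient, power)."""
    return (lambda y: a(y) * b(y * q**i)), i + j

x, q = Fraction(2), Fraction(1, 2)
a = lambda y: 1 + y * y   # arbitrary test coefficients
b = lambda y: 3 - y
i, j = 1, 2
ab, k = compose(a, i, b, j, q)
lhs = hat(ab, k, x, q)                          # hat of the composed operator
rhs = matmul(hat(a, i, x, q), hat(b, j, x, q))  # product of the hats
print(lhs == rhs)
```

Negative powers of $D$ are left out on purpose: on a finite window $\Lambda^{-1}\Lambda \neq I$ at the boundary, so the comparison would no longer be exact.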

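Using the inductively established identity
$$D_q^n = \frac{1}{y^n(q-1)^nq^{\frac{n(n-1)}{2}}}\sum_{k=0}^n(-1)^kq^{k(k-1)/2}\begin{bmatrix}n\\k\end{bmatrix}D^{n-k},$$
the first identity of (5.9) is immediate. Then, using, by virtue of (5.10) and (5.11), $\hat\lambda\Lambda = -\varepsilon\Lambda\varepsilon^{-1}$ and $\varepsilon\hat a\varepsilon^{-1} = \hat a$ (since $\hat a$ is diagonal), one computes
$$\bigl(a_i(y)(-\lambda(y)D)^i\bigr)\hat{\ } = \hat a_i(-\hat\lambda\hat D)^i = \hat a_i(-\hat\lambda\Lambda)^i = \hat a_i(\varepsilon\Lambda\varepsilon^{-1})^i = \varepsilon(\hat a_i\Lambda^i)\varepsilon^{-1}, \tag{5.18}$$
establishing (5.9).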
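The expansion of $D_q^n$ in powers of $D$ used in this proof can be tested directly. The sketch below is a check under stated assumptions rather than a proof: it applies $D_q f(y) = (f(qy) - f(y))/((q-1)y)$ repeatedly to an arbitrary rational test function and compares with the closed form, with $[j] = (1-q^j)/(1-q)$ and the q-binomial coefficient as in footnote 11. All function names are hypothetical.

```python
from fractions import Fraction

def q_int(j, q):
    """q-number [j] = (1 - q^j) / (1 - q)."""
    return (1 - q**j) / (1 - q)

def q_binom(n, k, q):
    """q-binomial [n][n-1]...[n-k+1] / ([k][k-1]...[1])."""
    num = den = Fraction(1)
    for i in range(k):
        num *= q_int(n - i, q)
        den *= q_int(i + 1, q)
    return num / den

def D_q(f, q):
    """q-derivative: (D_q f)(y) = (f(qy) - f(y)) / ((q - 1) y)."""
    return lambda y: (f(q * y) - f(y)) / ((q - 1) * y)

def D_q_power_direct(f, n, q):
    """Apply D_q n times."""
    for _ in range(n):
        f = D_q(f, q)
    return f

def D_q_power_expanded(f, n, q):
    """Closed form: sum over k of (-1)^k q^(k(k-1)/2) [n choose k]_q D^(n-k)."""
    def g(y):
        total = sum((-1)**k * q**(k * (k - 1) // 2) * q_binom(n, k, q)
                    * f(q**(n - k) * y) for k in range(n + 1))
        return total / (y**n * (q - 1)**n * q**(n * (n - 1) // 2))
    return g

q, y = Fraction(3, 2), Fraction(5, 7)
f = lambda u: u**5 - 2 * u**2 + 1 / (1 + u)   # arbitrary test function
checks = [D_q_power_direct(f, n, q)(y) == D_q_power_expanded(f, n, q)(y)
          for n in range(5)]
print(checks)
```

The closed form reflects the factorization $D_q^n = (q-1)^{-n}y^{-n}q^{-n(n-1)/2}\prod_{j=0}^{n-1}(D - q^j)$, which follows from $D\,y^{-1} = q^{-1}y^{-1}D$.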

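Proof of Theorem 5.1. Indeed the Frenkel system (5.1) maps at once under the isomorphism (5.8) into (5.5), whereas, using (5.9), the KLR-system maps into
$$\frac{\partial(\varepsilon L\varepsilon^{-1})}{\partial t_n} = \bigl[(\varepsilon L^n\varepsilon^{-1})_+, \varepsilon L\varepsilon^{-1}\bigr] \tag{5.19}$$
$$= \varepsilon\bigl[(L^n)_+, L\bigr]\varepsilon^{-1}, \tag{5.20}$$
which upon conjugation by $\varepsilon^{-1}$ leads to (5.5) as well.

Proof of Theorem 5.2. From Theorem 0.3, it follows that $L$ with the $\tau_n$'s defined by (5.6), satisfies the discrete KP-hierarchy; the second equality in (5.6) follows from (5.4) and (0.22). According to Lemma 1.3,
$$\frac{p_{1-\ell}(\tilde\partial)\tau_{n+\ell+1}\circ\tau_n}{\tau_{n+\ell+1}\,\tau_n} = \text{a polynomial in }\Bigl\{\frac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\bigl(\tau_{n+\ell+1}\tau_n^{\pi(k)}\bigr)\Bigr\},$$
where by (5.6) and (5.16),
$$\begin{aligned}
\Bigl(\frac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\bigl(\tau_{n+\ell+1}\tau_n^{\pi(k)}\bigr)\Bigr)_{n\in\mathbb{Z}}
&= \Bigl(D^n\frac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\bigl(\tau(c(y)+t)^{\pi(k)}D^{\ell+1}\tau(c(y)+t)\bigr)\Bigr)_{n\in\mathbb{Z}}\\
&= \Bigl(\frac{\partial^k}{\partial t_{i_1}\cdots\partial t_{i_k}}\log\bigl(\tau(c(y)+t)^{\pi(k)}D^{\ell+1}\tau(c(y)+t)\bigr)\Bigr)^{\wedge},
\end{aligned}$$
and
$$\begin{aligned}
\Bigl(\frac{\partial}{\partial t_j}\log\frac{\tau_{n+\ell+1}}{\tau_n}\Bigr)_{n\in\mathbb{Z}}
&= \Biggl(\frac{\partial}{\partial t_j}\log\frac{\bigl(\prod_{\alpha=1}^{n+\ell+1}e^{\sum_{i=1}^\infty t_i\lambda_\alpha^i}\bigr)D^{n+\ell+1}\tau(c(y)+t)}{\bigl(\prod_{\alpha=1}^{n}e^{\sum_{i=1}^\infty t_i\lambda_\alpha^i}\bigr)D^{n}\tau(c(y)+t)}\Biggr)_{n\in\mathbb{Z}}\\
&= \Bigl(\sum_{\alpha=n+1}^{n+\ell+1}\lambda_\alpha^j(y) + \frac{\partial}{\partial t_j}\log\frac{D^{n+\ell+1}\tau(c(y)+t)}{D^n\tau(c(y)+t)}\Bigr)_{n\in\mathbb{Z}}\\
&= \Bigl(D^n\Bigl(\sum_{i=1}^{\ell+1}\lambda_i^j(y) + \frac{\partial}{\partial t_j}\log\frac{D^{\ell+1}\tau(c(y)+t)}{\tau(c(y)+t)}\Bigr)\Bigr)_{n\in\mathbb{Z}}\\
&= \Bigl(\sum_{i=1}^{\ell+1}\lambda_i^j(y) + \frac{\partial}{\partial t_j}\log\frac{D^{\ell+1}\tau(c(y)+t)}{\tau(c(y)+t)}\Bigr)^{\wedge},
\end{aligned}$$
establishing Theorem 5.2.

Remark. Note the ε-conjugation has no counterpart in the $D_q$-world.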

Defining the simple q-vertex operators:
$$X_q(x,t,z) := e_q^{xz}X(t,z) \quad\text{and}\quad \tilde X_q(x,t,z) := (e_q^{xz})^{-1}X(-t,z)$$
in terms of the vertex operator (0.22) and the q-exponential $e_q^x = e^{\sum_1^\infty\frac{(1-q)^kx^k}{k(1-q^k)}}$, we now state:

Corollary 5.4. Any K-P τ-function leads to a q-K-P τ-function $\tau(c(x)+t)$ satisfying q-bilinear relations below for all $x \in \mathbb{R}$, $t, t' \in \mathbb{C}^\infty$ and all $n > m$, which tends to the standard K-P bilinear identity when $q$ goes to 1:
$$\begin{aligned}
&\oint_{z=\infty}D^n\bigl(X_q(x,t,z)\tau(c(x)+t)\bigr)\,D^{m+1}\bigl(\tilde X_q(x,t',z)\tau(c(x)+t')\bigr)\,dz = 0\\
&\qquad\longrightarrow \int_{z=\infty}X(t,z)\tau(\bar x+t)\,X(t',z)\tau(\bar x+t')\,dz = 0.
\end{aligned}$$

Proof. The τ-functions $\tau_n$ defined in Theorem 5.2 satisfy the usual bilinear identity (0.11), and so, using the following identity:
$$\frac{z^{n-m-1}}{\prod_{k=m+2}^n(-\lambda_k)}\prod_{k=m+2}^n e^{-\sum_{i=1}^\infty\frac{1}{i}\left(\frac{\lambda_k}{z}\right)^i} = \prod_{k=m+2}^n\Bigl(1-\frac{z}{\lambda_k}\Bigr) = \prod_{k=m+2}^n e^{-\sum_{i=1}^\infty\frac{1}{i}\left(\frac{z}{\lambda_k}\right)^i} = D^ne_q^{xz}\,D^{m+1}(e_q^{xz})^{-1}$$

in computing τn (t − [z −1 ]) in the usual bilinear identity yields, up to a multiplicative factor α(λ, ν): I P∞ 0 i dz τn (t − [z −1 ])τm+1 (t0 + [z −1 ])e 1 (ti −ti )z z n−m α(λ, ν) z z=∞ I n m+1 X X 0 −1 τ (c(x) + t − [z −1 ] − [λ−1 ]+ [λ−1 = i ])τ (c(x) + t + [z i ]) z=∞

n Y k=m+2

I = z=∞

1

P∞ 0 i z 1− e 1 (ti −ti )z dz λk

1

Dn (Xq (x, t, z)τ (c(x) + t))Dm+1 (X˜ q (x, t0 , z)τ (c(x) + t0 ))dz = 0,

the latter tending as q → 1 to the usual KP bilinear identity, upon using (5.3).

Corollary 5.5. If we take $\tau_0(t) = \tau(c(x)+t)$ in Theorem 5.2, with $\tau(t)$ a $N$-KdV τ-function$^{12}$, then
$$L^N = (L^N)_+ \quad\text{and}\quad \tilde L^N = (\tilde L^N)_+. \tag{5.21}$$
They map to solutions $Q^N$ and $Q_q^N$ of the $N$-Frenkel and $N$-KLR systems:
$$Q^N = (Q^N)_+ \quad\text{and}\quad Q_q^N = (Q_q^N)_+. \tag{5.22}$$

$^{12}$ $N$-KdV means the $N$th Gel'fand-Dickey reduction of KP; in particular, $\partial\tau/\partial t_{iN} = 0$ for $i = 1, 2, \ldots$.

The q-differential operator QN q has the form below and tends to the differential operator of the N -KdV hierarchy as q goes to 1: ∂ τ (DN c + t) N −1 Dq log ∂t1 τ (c + t) N −1 X ∂2 + log τ (Di c + t) 2 ∂t 1 i=0 N −2 X τ (DN c + t) τ (Di+1 c + t) ∂ ∂ λi+1 log log − − ∂t1 τ (DN −1 c + t) ∂t1 τ (Di c + t) i=0  i+1 j+1 X ∂ τ (D c + t) ∂ τ (D c + t)  N −2 Dq log log + ... + ∂t1 τ (Di c + t) ∂t1 τ (Dj c + t)

N + QN q = Dq

0≤i≤j≤N −2

−→

∂ ∂x

N

+N

∂2 log τ (x¯ + t) ∂t21

∂ ∂x

N −2 + ...,

(5.23)

Proof. Note that for $W \in Gr^{(0)}$, $z^NW \subset W$ if and only if its tau function is of the form $e^{\sum_1^\infty c_it_{iN}}\tau(t)$, with $\partial\tau(t)/\partial t_{iN} = 0$, $i = 1, 2, \ldots$. Thus by hypothesis, we have for each $W_k = \operatorname{span}\{\psi_k(t,z), \psi_{k+1}(t,z), \ldots\}$, $z^NW_k \subset W_k$; since $L\psi = z\psi$, we have
$$z^N\psi_k = \sum_{j=0}^{N-1}a_j\psi_{k+j} + \psi_{k+N} = (L^N\psi)_k,$$
and so $L^N$ is upper-triangular, yielding (5.21), which by the isomorphism $\wedge$ of Lemma 5.3 yields (5.22). From (0.13) and the relationship between $a_i(y)$ and $b_i(y)$ given in (5.12), deduce (5.23).

References

1. Adler, M. and van Moerbeke, P.: Birkhoff strata, Bäcklund transformations and limits of isospectral operators. Adv. in Math. 108, 140–204 (1994)
2. Adler, M. and van Moerbeke, P.: Matrix integrals, Toda symmetries, Virasoro constraints and orthogonal polynomials. Duke Math. J. 80 (3), 863–911 (1995)
3. Adler, M. and van Moerbeke, P.: Polynomials associated with weights and discrete KP. Commun. Math. Phys. (1999)
4. Adler, M., Horozov, E. and van Moerbeke, P.: The solution to the q-KdV equation. Phys. Lett. A 242, 139–151 (1998)
5. Date, E., Jimbo, M., Kashiwara, M., Miwa, T.: Transformation groups for soliton equations. Proc. RIMS Symp. Nonlinear integrable systems, Classical and quantum theory (Kyoto 1981), Singapore: World Scientific, 1983, pp. 39–119
6. Frenkel, E.: Deformations of the KdV hierarchy and related soliton equations. Int. Math. Res. Notices 2, 55–76 (1996)
7. Gieseker, D.: The Toda hierarchy and the KdV hierarchy. Preprint, alg-geom/9509006
8. Haine, L. and Iliev, P.: The bispectral property of a q-deformation of the Schur polynomials and the q-KdV hierarchy. J. of Phys. A: Math. Gen. 30, 7217–7227 (1997)
9. Iliev, P.: Tau-function solutions to a q-deformation of the KP-hierarchy. Preprint (1997)
10. Kac, M., van Moerbeke, P.: On some periodic Toda lattices. Proc. Nat. Acad. Sci. 72, 1627–1629 (1975)
11. Khesin, B., Lyubashenko, V. and Roger, C.: Extensions and contractions of the Lie algebra of q-Pseudodifferential symbols on the circle. J. Funct. Anal. 143, 55–97 (1997)
12. Kupershmidt, B.A.: Discrete Lax equations and differential-difference calculus. Astérisque 123 (1985)
13. Segal, G., Wilson, G.: Loop groups and equations of KdV type. Publ. Math. IHES 61, 5–65 (1985)
14. Takasaki, K., Takebe, T.: Integrable hierarchies and dispersionless limit. Rev. Math. Phys. 7, 743–808 (1995)
15. Ueno, K., Takasaki, K.: Toda Lattice Hierarchy. Adv. Studies in Pure Math. 4, 1–95 (1984)
16. van Moerbeke, P.: Integrable foundations of string theory. In: Lectures on Integrable systems, Proceedings of the CIMPA-school, 1991, Ed.: O. Babelon, P. Cartier, Y. Kosmann-Schwarzbach, Singapore: World Scientific, 1994, pp. 163–267

Communicated by G. Felder

Commun. Math. Phys. 203, 211 – 247 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Face Algebras and Unitarity of SU(N)$_L$-TQFT

Takahiro Hayashi

Graduate School of Mathematics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464, Japan. E-mail: [email protected]

Received: 24 March 1998 / Accepted: 26 November 1998

Abstract: Using face algebras (i.e. algebras of L-operators of IRF models), we construct modular tensor categories with positive definite inner product, whose fusion rules and S-matrices are the same as (or slightly different from) those obtained by $U_q(\mathfrak{sl}_N)$ at roots of unity. Also we obtain state-sums of ABF models on framed links which give quantum SU(2)-invariants of corresponding 3-manifolds.

1. Introduction

As is well known, quantum groups have their origin in the theory of the quantum inverse scattering method. More specifically, they first appeared as so-called algebras of L-operators of lattice models (of vertex type). For example, the simplest quantum group $U_q(\mathfrak{sl}(2))$ can be viewed as the algebra of L-operators of the 6-vertex model without spectral parameter. It seems worthwhile to study algebras of L-operators independently of the framework of the Drinfeld–Jimbo algebra. By investigating the algebraic structure of lattice models of face type, we found a new class of quantum groups, which is called the class of face algebras (cf. [14]–[22] and also [3,24,36]). It contains all bialgebras as a subclass. Moreover, as with bialgebras, face algebras produce monoidal categories as their (co-)module categories.

In this paper, we give a detailed study of face algebras $\mathfrak{S}(A_{N-1}; t)_\epsilon$, which are obtained as algebras of L-operators of RSOS models of type $A_{N-1}$ ($N \ge 2$) (cf. [23]), where $\epsilon = \pm 1$ and $t$ denotes a primitive $2(N+L)$th root of unity with $L \ge 1$. We also give two applications of $\mathfrak{S}(A_{N-1}; t)_\epsilon$ to 3-dimensional topological quantum field theory (TQFT) and corresponding quantum invariants of 3-manifolds. We show that the algebra $\mathfrak{S}(A_{N-1}; t)_\epsilon$ is finite-dimensional cosemisimple and that its dual is a $C^*$-algebra for a suitable $t$. Also, we classify irreducible comodules of $\mathfrak{S}(A_{N-1}; t)_\epsilon$ and determine their dimensions. Moreover, we construct various structures on $\mathfrak{S}(A_{N-1}; t)_\epsilon$, such as the antipode, the braiding and the ribbon functional.
The algebra $\mathfrak{S}(A_{N-1}; t)_\epsilon$ is constructed as a quotient of the face version $\mathfrak{A}(w_{N,t,\epsilon})$ of FRT


construction modulo one additional relation "det = 1", where $w_{N,t,\epsilon}$ is the Boltzmann weight of RSOS models of type $A_{N-1}$ without the spectral parameter and det denotes the "(quantum) determinant" of $\mathfrak{A}(w_{N,t,\epsilon})$. Since the representation theory of $\mathfrak{A}(w_{N,t,\epsilon})$ is relatively easily established using a result on Iwahori–Hecke algebras due to H. Wenzl, the core of our work is to study the properties of the element det or the corresponding "exterior" algebra.

Next, we explain the unitarity of 3-dimensional TQFT briefly. Roughly speaking, a 3-dimensional TQFT is a map which assigns to each 3-cobordism $(M, \partial_-M, \partial_+M)$ a linear map $\tau(M) : \mathcal{T}(\partial_-M) \to \mathcal{T}(\partial_+M)$. Here, by a 3-dimensional cobordism, we mean a compact 3-dimensional manifold $M$ whose boundary is a disjoint union of two closed surfaces $\partial_-M$, $\partial_+M$. A 3-dimensional TQFT is called unitary if $\mathcal{T}(\partial_\pm M)$ are (finite-dimensional) Hilbert spaces and $\tau(-M) = \tau(M)^*$ for each 3-cobordism $(M, \partial_-M, \partial_+M)$. It is established in [40] that to obtain a (unitary) 3-dimensional TQFT, it suffices to construct a (unitary) modular tensor category (MTC) (i.e. a braided category which satisfies certain additional properties). The most important examples of MTC are constructed as semisimple quotients $\mathcal{C}(\mathfrak{g}, q)$ of some module categories of the Drinfeld–Jimbo algebra $U_q(\mathfrak{g})$ at roots $q$ of unity (cf. [34,1,8,28], and see also [7,41,42] for other constructions of MTC's). For a suitable $q$, it is also expected that $\mathcal{C}(\mathfrak{g}, q)$ is a unitary MTC. However, it seems that it is not easy to verify this directly, since $U_q(\mathfrak{g})$ is non-semisimple and cannot have a $C^*$-algebra structure (cf. A. Kirillov, Jr. [28] and V. Turaev and H. Wenzl [42]).

The first application of $\mathfrak{S}(A_{N-1}; t)_\epsilon$ is to show that the category $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ of all finite-dimensional right $\mathfrak{S}(A_{N-1}; t)_\epsilon$-comodules is a MTC, and that (for a suitable $t$) the category $\mathcal{C}^u_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ of all finite-dimensional unitary $\mathfrak{S}(A_{N-1}; t)_\epsilon$-comodules is a unitary MTC, whose fusion rules agree with those of $\mathcal{C}(\mathfrak{sl}_N, q)$.
Here a "unitary" comodule means a comodule with an inner product which satisfies some conditions. The quantum dimensions and S-matrices of $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ are the same as (or slightly different from) those of $\mathcal{C}(\mathfrak{sl}_N, q)$ (according to the choice of $\epsilon$ and another sign parameter $\iota$ when $N$ is even). Although we use $U_q(\mathfrak{sl}_N)$ to obtain some combinatorial formulas, the essential part of our theory is independent of $U_q(\mathfrak{sl}_N)$. Hence the equivalence of $\mathcal{C}(\mathfrak{sl}_N, q)$ and $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ is left as an open problem. However, by the result of Kazhdan and Wenzl [27], these two categories are equivalent up to a "twist". Unlike the module category of $U_q(\mathfrak{g})$, the category $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ itself is semisimple. Moreover, it has an apparent similarity to the spaces of the conformal blocks of Wess–Zumino–Witten (WZW) models. We hope that there exists a direct connection between $\mathfrak{S}(A_{N-1}; t)_\epsilon$ and WZW models, similarly to Drinfeld [6].

The second application is to give an explicit description of the quantum SU(2)-invariant $\tau(M)$ of closed 3-manifolds $M$, which is associated with $\mathcal{C}_{\mathfrak{S}}(A_1, t)_\epsilon$. More precisely, we express $\tau(M)$ as a state sum on each generic link diagram $D$ which represents $M$. It gives a direct connection between the invariant and the ABF model.

The paper is organized as follows. We start in Sects. 2–4 by recalling elementary properties of face algebras $\mathfrak{H}$, various structures on $\mathfrak{H}$ and their relations to the comodule category of $\mathfrak{H}$. In Sect. 5, we recall the notion of star-triangular (Yang–Baxter) face models and flat face models. The latter is introduced in [18], and is a variant of Ocneanu's notion of a flat biunitary connection in an operator algebra. Flat face models play a crucial role in determining the representation theory of $\mathfrak{S}(A_{N-1}; t)_\epsilon$. In Sects. 6–7, we define the algebras $\mathfrak{S}(A_{N-1}; t)_\epsilon$ and state the main result on their representation theory. In Sects. 9–10, we construct several structures on $\mathfrak{S}(A_{N-1}; t)_\epsilon$, which we call the transpose, the costar structure, the antipode and the ribbon functional. Consequently, we see that $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ (resp. $\mathcal{C}^u_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$) is a (unitary) ribbon category. In Sect. 11, we prove that $\mathcal{C}_{\mathfrak{S}}(A_{N-1}, t)_\epsilon$ is a modular tensor category, by computing its S-matrix. Section 8 and Sect. 12 are devoted to some technical calculations on face analogues of the exterior algebras. In Sects. 13–14, we give an explicit description of $\mathcal{C}_{\mathfrak{S}}(A_1, t)_\epsilon$ and give a state sum expression of the quantum SU(2)-invariant stated above.

After submitting the manuscript, one of the referees informed me of the existence of the following two papers: Wenzl, H.: $C^*$-tensor categories from quantum groups, J. Amer. Math. Soc. 11, 261–282 (1998); Blanchet, C.: Hecke algebras, modular categories and 3-manifolds quantum invariants, preprint. The former gives a proof of the unitarity of $\mathcal{C}(\mathfrak{g}, q)$ for each $\mathfrak{g}$. The latter gives a construction of modular tensor categories via Iwahori–Hecke algebras at roots of unity.

Throughout this paper, we use Sweedler's sigma notation for coalgebras, such as $(\Delta\otimes\mathrm{id})(\Delta(a)) = \sum_{(a)}a_{(1)}\otimes a_{(2)}\otimes a_{(3)}$ (cf. [37]).

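2. Face Algebras

In this section, we give the definition of the face algebra and various structures on it. Let $\mathfrak{H}$ be an algebra over a field $K$ equipped with a coalgebra structure $(\mathfrak{H}, \Delta, \varepsilon)$. Let $\mathcal{V}$ be a finite non-empty set and let $e_\lambda$ and $\mathring{e}_\lambda$ ($\lambda \in \mathcal{V}$) be elements of $\mathfrak{H}$. We say that $(\mathfrak{H}, e_\lambda, \mathring{e}_\lambda)$ is a $\mathcal{V}$-face algebra if the following relations are satisfied:
$$\Delta(ab) = \Delta(a)\Delta(b), \tag{2.1}$$
$$e_\lambda e_\mu = \delta_{\lambda\mu}e_\lambda, \qquad \mathring{e}_\lambda\mathring{e}_\mu = \delta_{\lambda\mu}\mathring{e}_\lambda, \qquad e_\lambda\mathring{e}_\mu = \mathring{e}_\mu e_\lambda, \tag{2.2}$$
$$\sum_{\nu\in\mathcal{V}}e_\nu = 1 = \sum_{\nu\in\mathcal{V}}\mathring{e}_\nu, \tag{2.3}$$
$$\Delta(\mathring{e}_\lambda e_\mu) = \sum_{\nu\in\mathcal{V}}\mathring{e}_\lambda e_\nu\otimes\mathring{e}_\nu e_\mu, \qquad \varepsilon(\mathring{e}_\lambda e_\mu) = \delta_{\lambda\mu}, \tag{2.4}$$
$$\varepsilon(ab) = \sum_{\nu\in\mathcal{V}}\varepsilon(ae_\nu)\,\varepsilon(\mathring{e}_\nu b) \tag{2.5}$$
for each $a, b \in \mathfrak{H}$ and $\lambda, \mu \in \mathcal{V}$. We call elements $e_\lambda$ and $\mathring{e}_\lambda$ face idempotents of $\mathfrak{H}$. It is known that the notion of a bialgebra is equivalent to that of a $\mathcal{V}$-face algebra with $\operatorname{card}(\mathcal{V}) = 1$. For a $\mathcal{V}$-face algebra $\mathfrak{H}$, we have the following formulas:
$$\varepsilon(\mathring{e}_\lambda a) = \varepsilon(e_\lambda a), \qquad \varepsilon(a\mathring{e}_\lambda) = \varepsilon(ae_\lambda), \tag{2.6}$$
$$\sum_{(a)}a_{(1)}\,\varepsilon(e_\lambda a_{(2)}e_\mu) = e_\lambda ae_\mu, \tag{2.7}$$
$$\sum_{(a)}\varepsilon(e_\lambda a_{(1)}e_\mu)\,a_{(2)} = \mathring{e}_\lambda a\mathring{e}_\mu, \tag{2.8}$$
$$\Delta(a) = \sum_{\nu,\xi}\sum_{(a)}e_\nu a_{(1)}e_\xi\otimes\mathring{e}_\nu a_{(2)}\mathring{e}_\xi, \tag{2.9}$$
$$\sum_{(a)}e_\lambda a_{(1)}e_\mu\otimes a_{(2)} = \sum_{(a)}a_{(1)}\otimes\mathring{e}_\lambda a_{(2)}\mathring{e}_\mu, \tag{2.10}$$
$$\Delta(\mathring{e}_\lambda e_\mu\,a\,\mathring{e}_{\lambda'}e_{\mu'}) = \sum_{(a)}\mathring{e}_\lambda a_{(1)}\mathring{e}_{\lambda'}\otimes e_\mu a_{(2)}e_{\mu'} \tag{2.11}$$
for each $a \in \mathfrak{H}$ and $\lambda, \mu, \lambda', \mu' \in \mathcal{V}$.

Let $\mathcal{G}$ be a finite oriented graph with set of vertices $\mathcal{V} = \mathcal{G}^0$. For an edge $\mathbf{p}$, we denote by $s(\mathbf{p})$ and $r(\mathbf{p})$ its source (start) and its range (end) respectively. For each $m \ge 1$, we denote by $\mathcal{G}^m = \coprod_{\lambda,\mu\in\mathcal{V}}\mathcal{G}^m_{\lambda\mu}$ the set of paths of $\mathcal{G}$ of length $m$, that is, $\mathbf{p} \in \mathcal{G}^m_{\lambda\mu}$ if $\mathbf{p}$ is a sequence $(\mathbf{p}_1, \ldots, \mathbf{p}_m)$ of edges of $\mathcal{G}$ such that $s(\mathbf{p}) := s(\mathbf{p}_1) = \lambda$, $r(\mathbf{p}_n) = s(\mathbf{p}_{n+1})$ ($1 \le n < m$) and $r(\mathbf{p}) := r(\mathbf{p}_m) = \mu$. Let $\mathfrak{H}(\mathcal{G})$ be the linear span of the symbols $e\binom{\mathbf{p}}{\mathbf{q}}$ ($\mathbf{p}, \mathbf{q} \in \mathcal{G}^m$, $m \ge 0$). Then $\mathfrak{H}(\mathcal{G})$ becomes a $\mathcal{V}$-face algebra by setting
$$\mathring{e}_\lambda = \sum_{\mu\in\mathcal{V}}e\binom{\lambda}{\mu}, \qquad e_\mu = \sum_{\lambda\in\mathcal{V}}e\binom{\lambda}{\mu}, \tag{2.12}$$
$$e\binom{\mathbf{p}}{\mathbf{q}}\,e\binom{\mathbf{r}}{\mathbf{s}} = \delta_{r(\mathbf{p})s(\mathbf{r})}\,\delta_{r(\mathbf{q})s(\mathbf{s})}\,e\binom{\mathbf{p}\cdot\mathbf{r}}{\mathbf{q}\cdot\mathbf{s}}, \tag{2.13}$$
$$\Delta\Bigl(e\binom{\mathbf{p}}{\mathbf{q}}\Bigr) = \sum_{\mathbf{t}\in\mathcal{G}^m}e\binom{\mathbf{p}}{\mathbf{t}}\otimes e\binom{\mathbf{t}}{\mathbf{q}}, \qquad \varepsilon\Bigl(e\binom{\mathbf{p}}{\mathbf{q}}\Bigr) = \delta_{\mathbf{p}\mathbf{q}} \tag{2.14}$$
for each $\mathbf{p}, \mathbf{q} \in \mathcal{G}^m$ and $\mathbf{r}, \mathbf{s} \in \mathcal{G}^n$ ($m, n \ge 0$). Here for paths $\mathbf{p} = (\mathbf{p}_1, \ldots, \mathbf{p}_m)$ and $\mathbf{r} = (\mathbf{r}_1, \ldots, \mathbf{r}_n)$, we set $\mathbf{p}\cdot\mathbf{r} = (\mathbf{p}_1, \ldots, \mathbf{p}_m, \mathbf{r}_1, \ldots, \mathbf{r}_n)$ if $r(\mathbf{p}) = s(\mathbf{r})$. We note that $K\mathcal{G}^m = \bigoplus_{\mathbf{p}\in\mathcal{G}^m}K\mathbf{p}$ ($m \ge 0$) becomes a right $\mathfrak{H}(\mathcal{G})$-comodule via
$$\mathbf{q} \mapsto \sum_{\mathbf{p}\in\mathcal{G}^m}\mathbf{p}\otimes e\binom{\mathbf{p}}{\mathbf{q}}. \tag{2.15}$$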
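The path-algebra construction $\mathfrak{H}(\mathcal{G})$ lends itself to a small computational illustration. The sketch below makes several simplifying assumptions that are not in the text: the graph is a hypothetical three-edge example, edges are identified with their (source, range) pairs, only paths of length $m \ge 1$ are handled, and a basis symbol $e\binom{\mathbf{p}}{\mathbf{q}}$ is stored as the pair `(p, q)`. It checks on basis elements that the coproduct (2.14) is multiplicative with respect to the product (2.13), and that the counit satisfies $(\varepsilon\otimes\mathrm{id})\Delta = \mathrm{id}$.

```python
from collections import Counter

EDGES = [(0, 1), (1, 0), (1, 1)]  # illustrative oriented graph on vertices {0, 1}

def paths(m):
    """All paths of length m >= 1, as tuples of edges with matching endpoints."""
    out = [(e,) for e in EDGES]
    for _ in range(m - 1):
        out = [p + (e,) for p in out for e in EDGES if p[-1][1] == e[0]]
    return out

def src(p): return p[0][0]     # source s(p)
def rng(p): return p[-1][1]    # range r(p)

def mul(u, v):
    """Product (2.13) of basis symbols u = (p, q), v = (r, s); None encodes 0."""
    (p, q), (r, s) = u, v
    if rng(p) == src(r) and rng(q) == src(s):
        return (p + r, q + s)
    return None

def coproduct(u):
    """Coproduct (2.14) as a Counter of tensors (e(p,t), e(t,q))."""
    p, q = u
    return Counter(((p, t), (t, q)) for t in paths(len(p)))

def counit(u):
    p, q = u
    return 1 if p == q else 0

def coproduct_of_product(u, v):
    w = mul(u, v)
    return Counter() if w is None else coproduct(w)

def product_of_coproducts(u, v):
    out = Counter()
    for (a1, a2), ca in coproduct(u).items():
        for (b1, b2), cb in coproduct(v).items():
            left, right = mul(a1, b1), mul(a2, b2)
            if left is not None and right is not None:
                out[(left, right)] += ca * cb
    return out

basis1 = [(p, q) for p in paths(1) for q in paths(1)]
basis2 = [(p, q) for p in paths(2) for q in paths(2)]
multiplicative = all(coproduct_of_product(u, v) == product_of_coproducts(u, v)
                     for u in basis1 for v in basis2)
counital = all(Counter({b: c for (a, b), c in coproduct(u).items() if counit(a)})
               == Counter({u: 1}) for u in basis2)
print(multiplicative, counital)
```

Length-zero paths (the vertices, which give the face idempotents in (2.12)) are omitted only to keep the sketch short; including them would require a separate encoding of a vertex as a path.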

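Proposition 2.1 ([16]). Every finitely generated $\mathcal{V}$-face algebra is isomorphic to a quotient of $\mathfrak{H}(\mathcal{G})$ for some $\mathcal{G}$.

We say that a linear map $S : \mathfrak{H} \to \mathfrak{H}$ is an antipode of $\mathfrak{H}$, or $(\mathfrak{H}, S)$ is a Hopf $\mathcal{V}$-face algebra if
$$\sum_{(a)}S(a_{(1)})a_{(2)} = \sum_{\nu\in\mathcal{V}}\varepsilon(ae_\nu)e_\nu, \qquad \sum_{(a)}a_{(1)}S(a_{(2)}) = \sum_{\nu\in\mathcal{V}}\varepsilon(e_\nu a)\mathring{e}_\nu, \tag{2.16}$$
$$\sum_{(a)}S(a_{(1)})a_{(2)}S(a_{(3)}) = S(a) \tag{2.17}$$
for each $a \in \mathfrak{H}$. An antipode of a $\mathcal{V}$-face algebra is an antialgebra-anticoalgebra map, which satisfies
$$S(\mathring{e}_\lambda e_\mu) = \mathring{e}_\mu e_\lambda \qquad (\lambda, \mu \in \mathcal{V}). \tag{2.18}$$
The antipode of a $\mathcal{V}$-face algebra is unique if it exists.

Let $\mathfrak{H}$ be a $\mathcal{V}$-face algebra with product $m$ and let $\mathcal{R}^+$ and $\mathcal{R}^-$ be elements of $(\mathfrak{H}\otimes\mathfrak{H})^*$. We say that $(\mathfrak{H}, \mathcal{R}^\pm)$ is a coquasitriangular (or CQT) $\mathcal{V}$-face algebra if the following relations are satisfied:
$$\mathcal{R}^+m^*(1) = \mathcal{R}^+, \qquad m^*(1)\mathcal{R}^- = \mathcal{R}^-, \tag{2.19}$$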


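$$\mathcal{R}^-\mathcal{R}^+ = m^*(1), \qquad \mathcal{R}^+\mathcal{R}^- = (m^{\mathrm{op}})^*(1), \tag{2.20}$$
$$\mathcal{R}^+m^*(X)\mathcal{R}^- = (m^{\mathrm{op}})^*(X) \qquad (X \in \mathfrak{H}^*), \tag{2.21}$$
$$(m\otimes\mathrm{id})^*(\mathcal{R}^+) = \mathcal{R}^+_{13}\mathcal{R}^+_{23}, \qquad (\mathrm{id}\otimes m)^*(\mathcal{R}^+) = \mathcal{R}^+_{13}\mathcal{R}^+_{12}. \tag{2.22}$$
Here, as usual, for each $Z \in (\mathfrak{H}\otimes\mathfrak{H})^*$ and $\{i, j, k\} = \{1, 2, 3\}$, we define $Z_{ij} \in (\mathfrak{H}^{\otimes 3})^*$ by $Z_{ij}(a_1, a_2, a_3) = Z(a_i, a_j)\varepsilon(a_k)$ ($a_1, a_2, a_3 \in \mathfrak{H}$). We note, for example, that the second formula of (2.22) is equivalent to
$$\mathcal{R}^+(a, bc) = \sum_{(a)}\mathcal{R}^+(a_{(1)}, c)\,\mathcal{R}^+(a_{(2)}, b) \qquad (a, b, c \in \mathfrak{H}). \tag{2.23}$$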

It is known that R± satisfies − − − ∗ − (m⊗id)∗ (R− ) = R− 23 R13 , (id⊗m) (R ) = R12 R13 , ◦

◦

◦

◦

◦

◦

◦

◦

R+ (eλ eµ a eν eξ , b) = R+ (a, eµ eξ beλ eν ), −

−

R (eλ eµ a eν eξ , b) = R (a, eν eλ beξ eµ ), ◦

◦

R+ (eλ eµ , a) = ε(eµ aeλ ), R+ (a, eλ eµ ) = ε(eλ aeµ ), ◦

−

−

◦

R (eλ eµ , a) = ε(eλ aeµ ), R (a, eλ eµ ) = ε(eµ aeλ )

(2.24) (2.25) (2.26) (2.27) (2.28)

for each λ, µ, ν, ξ ∈ V, a, b ∈ H. If, in addition, H has an antipode S, then (S ⊗ id)∗ (R+ ) = R− , (id ⊗ S)∗ (R− ) = R+ .

(2.29)

For a CQT Hopf V-face algebra H, we define its Drinfeld functionals $U_i \in H^*$ (i = 1, 2) via

\[ U_1(a) = \sum_{(a)} R^+(a_{(2)}, S(a_{(1)})), \qquad U_2(a) = \sum_{(a)} R^-(S(a_{(1)}), a_{(2)}) \quad (a \in H). \tag{2.30} \]

The Drinfeld functionals are invertible as elements of the dual algebra H∗ and satisfy the following relations [20]:

\[ U_1^{-1}(a) = \sum_{(a)} R^-(S(a_{(2)}), a_{(1)}), \qquad U_2^{-1}(a) = \sum_{(a)} R^+(a_{(1)}, S(a_{(2)})), \tag{2.31} \]

\[ U_i^{\pm 1}(\mathring{e}_\lambda e_\mu) = \delta_{\lambda\mu}, \qquad U_i^{\pm 1}(\mathring{e}_\lambda a \mathring{e}_\mu) = U_i^{\pm 1}(e_\lambda a e_\mu), \tag{2.32} \]

\[ m^*(U_1) = R^- R^-_{21} (U_1 \otimes U_1) = (U_1 \otimes U_1) R^- R^-_{21}, \tag{2.33} \]

\[ m^*(U_2) = R^+_{21} R^+ (U_2 \otimes U_2) = (U_2 \otimes U_2) R^+_{21} R^+, \tag{2.34} \]

\[ U_i X U_i^{-1} = (S^2)^*(X), \qquad S^*(U_1^{\pm 1}) = U_2^{\mp 1}, \qquad S^*(U_2^{\pm 1}) = U_1^{\mp 1} \tag{2.35} \]

for each i = 1, 2, a ∈ H, X ∈ H∗ and λ, µ ∈ V. In particular, S is bijective and $U_1 U_2^{-1}$ is a central element of H∗. Let (H, R±) be a CQT Hopf V-face algebra and V an invertible central element of H∗. We say that V is a ribbon functional of H, or (H, V) is a coribbon Hopf V-face algebra, if

\[ m^*(V) = R^- R^-_{21} (V \otimes V), \tag{2.36} \]

\[ S^*(V) = V. \tag{2.37} \]

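Let g (resp. G) be an element of (resp. a linear functional on) a V-face algebra H. We say that g (resp. G) is group-like if the following relations (2.38)–(2.39) (resp. (2.40)–(2.41)) are satisfied:

\[ \Delta(g) = \sum_{\nu \in V} g e_\nu \otimes g \mathring{e}_\nu, \tag{2.38} \]

\[ g\, \mathring{e}_\lambda e_\mu = \mathring{e}_\lambda e_\mu\, g, \qquad \varepsilon(g\, \mathring{e}_\lambda e_\mu) = \delta_{\lambda\mu}, \tag{2.39} \]

\[ G(ab) = \sum_{\nu \in V} G(a e_\nu)\, G(\mathring{e}_\nu b), \tag{2.40} \]

\[ G(\mathring{e}_\lambda a \mathring{e}_\mu) = G(e_\lambda a e_\mu), \qquad G(\mathring{e}_\lambda e_\mu) = \delta_{\lambda\mu} \tag{2.41} \]

for each a, b ∈ H and λ, µ ∈ V. If H has an antipode S, then every group-like element g and group-like functional G are invertible and satisfy

\[ S(g) = g^{-1}, \qquad S^*(G) = G^{-1}. \tag{2.42} \]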

We denote by GLE(H) the set of all group-like elements of H.

Proposition 2.2 ([21]). Let H be a CQT Hopf V-face algebra and V an invertible element of H∗. Then (H, V) is a coribbon Hopf V-face algebra if and only if $M = U_1 V^{-1}$ is group-like and satisfies the following relations:

\[ M X M^{-1} = (S^2)^*(X) \quad (X \in H^*), \tag{2.43} \]

\[ M^2 = U_1 U_2. \tag{2.44} \]

For a coribbon Hopf V-face algebra (H, V), we call $M = U_1 V^{-1}$ the modified ribbon functional of H corresponding to V.

Example 2.1. When L = 1, the algebra $S(A_{N-1}; t)_\epsilon$ is rather degenerate. Hence, we treat this case here separately from the case L ≥ 2 (cf. [11]). Let N ≥ 2 be an integer and V the cyclic group Z/NZ. Let t ∈ C be a primitive 2(N + 1)th root of unity and ϵ either 1 or −1. We define the V-face algebra $S(A_{N-1}; t)_\epsilon$ to be the C-linear span of the symbols $e^i_j(m)$ (i, j, m ∈ V) equipped with the structure of V-face algebra given by

\[ e^i_j[p]\, e^k_l[q] = \delta_{i+p,k}\, \delta_{j+p,l}\, e^i_j[p+q] \quad (p, q \in \mathbb{Z}_{\ge 0}), \tag{2.45} \]

\[ \Delta(e^i_k(m)) = \sum_{j} e^i_j(m) \otimes e^j_k(m), \qquad \varepsilon(e^i_j(m)) = \delta_{ij}, \tag{2.46} \]

\[ \mathring{e}_i = \sum_{j} e^i_j(0), \qquad e_j = \sum_{i} e^i_j(0). \tag{2.47} \]

Here, in (2.45), we set

\[ e^i_j[qN + r] = (-\epsilon)^{(i-j)q(N-1)}\, e^i_j(r + N\mathbb{Z}) \tag{2.48} \]


for each q ∈ Z and 0 ≤ r < N. The algebra $S(A_{N-1}; t)_\epsilon$ becomes a coribbon Hopf V-face algebra via

\[ S(e^i_j[p]) = e^{j+p}_{i+p}[-p], \tag{2.49} \]

\[ R^+(e^i_j[p], e^k_l[q]) = \delta_{i,k+q}\, \delta_{jk}\, \delta_{i+p,l+q}\, \delta_{j+p,l}\, (-\zeta t)^{-pq}, \tag{2.50} \]

\[ V_\iota(e^i_j[p]) = \delta_{ij}\, \iota^p\, (-\zeta t)^{p^2}, \tag{2.51} \]

where i, j, k, l ∈ V, p, q ∈ Z≥0, ζ denotes a solution of $\zeta^N = \epsilon^{N-1} t$, ι = ±1 if N ∈ 2Z and ι = 1 if N ∈ 1 + 2Z.

3. Comodules of Face Algebras

In this section, we recall categorical properties of comodules of face algebras (cf. [16]). We refer the readers to [26] for the terminologies on monoidal (or tensor) categories. Let M be a right comodule of a V-face algebra H. We define its face space decomposition $M = \bigoplus_{\lambda,\mu \in V} M(\lambda, \mu)$ by

\[ M(\lambda, \mu) = \Bigl\{ \sum_{(u)} u_{(0)}\, \varepsilon(e_\lambda u_{(1)} e_\mu) \;\Bigm|\; u \in M \Bigr\}. \tag{3.1} \]

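Here we denote the coaction M → M ⊗ H by $u \mapsto \sum_{(u)} u_{(0)} \otimes u_{(1)}$ (u ∈ M). Let N be another H-comodule. We define the truncated tensor product $M \bar{\otimes} N$ to be the H-comodule given by

\[ (M \bar{\otimes} N)(\lambda, \mu) = \bigoplus_{\nu \in V} M(\lambda, \nu) \otimes N(\nu, \mu), \tag{3.2} \]

\[ u \otimes v \mapsto \sum_{(u),(v)} u_{(0)} \otimes v_{(0)} \otimes u_{(1)} v_{(1)} \quad (u \in M(\lambda, \nu),\ v \in N(\nu, \mu),\ \lambda, \mu, \nu \in V). \tag{3.3} \]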
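Taking dimensions in (3.2) shows that the face-space dimensions of a truncated tensor product are obtained by multiplying the dimension matrices of the two factors. The following is a minimal sketch of that bookkeeping only. It assumes that the vertex set V is indexed by 0, ..., n−1; the function name `truncated_tensor_dims` and the sample matrices are illustrative choices of ours and do not come from the text.

```python
def truncated_tensor_dims(dim_m, dim_n):
    """Dimension matrix of a truncated tensor product.

    dim_m[l][v] stands for dim M(l, v) and dim_n[v][u] for dim N(v, u).
    By (3.2) the result is the ordinary matrix product of the two tables.
    """
    size = len(dim_m)
    return [
        [sum(dim_m[l][v] * dim_n[v][u] for v in range(size)) for u in range(size)]
        for l in range(size)
    ]


# Illustrative data only: face spaces M(l, l +/- 1) of dimension one
# on a three-vertex chain 0 - 1 - 2.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# A group-like comodule (for instance the unit comodule) has dimension
# matrix delta_{l u}.
unit = [[1 if l == u else 0 for u in range(3)] for l in range(3)]

square = truncated_tensor_dims(chain, chain)
```

In this toy example the unit table acts as the identity for the product, mirroring the role of the unit object of the monoidal category.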

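With this operation, the category $\mathrm{Com}^f_H$ of all finite-dimensional right H-comodules becomes a monoidal (or tensor) category whose unit object KV is given by

\[ KV = \bigoplus_{\mu \in V} K\mu; \qquad \mu \mapsto \sum_{\lambda \in V} \lambda \otimes \mathring{e}_\lambda e_\mu \quad (\mu \in V). \tag{3.4} \]

If, in addition, H is CQT, then $\mathrm{Com}^f_H$ becomes a braided monoidal category with braiding $c_{MN} : M \bar{\otimes} N \cong N \bar{\otimes} M$ given by

\[ c_{MN}(u \otimes v) = \sum_{(u),(v)} v_{(0)} \otimes u_{(0)}\, R^+(u_{(1)}, v_{(1)}) \quad (u \in M(\lambda, \nu),\ v \in N(\nu, \mu),\ \lambda, \mu, \nu \in V). \tag{3.5} \]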


Next, suppose that H has a bijective antipode and M is finite-dimensional. Then there exists a unique right H-comodule $M^\vee$ whose underlying vector space is the dual of M and whose coaction satisfies

\[ \sum_{(u)} \langle v, u_{(0)} \rangle\, S(u_{(1)}) = \sum_{(v)} \langle v_{(0)}, u \rangle\, v_{(1)} \quad (u \in M,\ v \in M^\vee). \tag{3.6} \]

As vector spaces, we have

\[ M^\vee(\lambda, \mu) \cong M(\mu, \lambda)^* \quad (\lambda, \mu \in V). \tag{3.7} \]

The comodule $M^\vee$ becomes a left dual object of M via maps

\[ b_M : KV \to M \bar{\otimes} M^\vee; \qquad \lambda \mapsto \sum_{\mu \in V} \sum_{p \in M_{\lambda\mu}} p \otimes p^\vee \quad (\lambda \in V), \tag{3.8} \]

\[ d_M : M^\vee \bar{\otimes} M \to KV; \qquad p^\vee \otimes q \mapsto \delta_{pq}\, \mu \quad (p \in M_{\nu\lambda},\ q \in M_{\nu\mu}), \tag{3.9} \]

where $M_{\lambda\mu}$ denotes a basis of M(λ, µ) and $\{p^\vee \mid p \in M_{\lambda\mu}\}$ its dual basis. By replacing S in (3.6) with $S^{-1}$, we obtain another H-comodule structure on M∗, which gives the right dual $M^\wedge$ of M. We note that the canonical linear isomorphism $I_{M^\wedge} : M^\wedge \to M^\vee$ satisfies

\[ I_{M^\wedge}(Xu) = (S^{-2})^*(X)\, I_{M^\wedge}(u) \tag{3.10} \]

for each X ∈ H∗ and $u \in M^\wedge$. Here, as usual, we regard M as a left H∗-module via

\[ Xu = \sum_{(u)} u_{(0)} \langle X, u_{(1)} \rangle \quad (u \in M,\ X \in H^*). \tag{3.11} \]

Finally, suppose that H is a coribbon Hopf V-face algebra. Then $\mathrm{Com}^f_H$ becomes a ribbon category with twist $\theta_M : M \cong M$ given by

\[ \theta_M(u) = V^{-1} u \quad (u \in M). \tag{3.12} \]

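Lemma 3.1. Let H be a coribbon Hopf V-face algebra with modified ribbon functional M such that its unit comodule KV is absolutely irreducible. Then, the quantum trace of the ribbon category $\mathrm{Com}^f_H$ is given by

\[ \mathrm{Tr}_q(f) = \frac{1}{\mathrm{card}(V)}\, \mathrm{Tr}(Mf) \tag{3.13} \]

\[ \phantom{\mathrm{Tr}_q(f)} = \mathrm{Tr}\bigl(Mf|_{M(\lambda,-)}\bigr) \tag{3.14} \]

for each comodule $M \in \mathrm{ob}(\mathrm{Com}^f_H)$ and $f \in \mathrm{End}_H(M)$, where λ is an arbitrary element of V and $M(\lambda, -) = \bigoplus_\mu M(\lambda, \mu)$.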


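Proof. We define $u_M : M \to M^{\vee\vee}$ by the composition

\[ M \cong M \bar{\otimes} KV \xrightarrow{\ \mathrm{id} \bar{\otimes} b\ } M \bar{\otimes} M^\vee \bar{\otimes} M^{\vee\vee} \xrightarrow{\ c \bar{\otimes} \mathrm{id}\ } M^\vee \bar{\otimes} M \bar{\otimes} M^{\vee\vee} \xrightarrow{\ d \bar{\otimes} \mathrm{id}\ } KV \bar{\otimes} M^{\vee\vee} \cong M^{\vee\vee}. \]

Then, as a consequence of the fact that $\mathrm{Com}^f_H$ is a rigid braided monoidal category, we have

\[ d_M \circ c_{M M^\vee} = d_{M^\vee} \circ (u_M \bar{\otimes} \mathrm{id}_{M^\vee}). \tag{3.15} \]

On the other hand, since $u_M(v) = I_M(U_1 v)$ (v ∈ M), we have

\[ (u_M \circ \theta_M)(v) = I_M(Mv) \quad (v \in M), \tag{3.16} \]

where $I_M$ is as in (3.10). Hence, by the definition of $\mathrm{Tr}_q$, we obtain

\[ \mathrm{Tr}_q(f)(\lambda) = d_{M^\vee} \circ (u_M \theta_M f \bar{\otimes} \mathrm{id}_{M^\vee}) \circ b_M(\lambda) = \sum_{\mu \in V} \sum_{p \in M_{\lambda\mu}} \langle I_M(Mf(p)), p^\vee \rangle\, \lambda = \mathrm{Tr}\bigl(Mf|_{M(\lambda,-)}\bigr)\, \lambda \]

as required, where $M_{\lambda\mu}$ and $p^\vee$ are as in (3.8). □

We say that a right comodule M of a V-face algebra H is group-like if dim(M(λ, µ)) = δλµ for each λ, µ ∈ V. For each g ∈ GLE(H), we can define a group-like comodule $KVg = \bigoplus_{\mu \in V} K\mu g$ by the coaction $\mu g \mapsto \sum_\lambda \lambda g \otimes \mathring{e}_\lambda e_\mu g$. Conversely, we have the following.

Lemma 3.2 ([19]). For each group-like comodule M and its basis $g_\mu \in M(\mu, \mu)$ (µ ∈ V), we obtain a group-like element g by $\mathring{e}_\lambda e_\mu g = g^\lambda_\mu$ and $g_\mu \mapsto \sum_\lambda g_\lambda \otimes g^\lambda_\mu$. Moreover, every group-like element is obtained in this manner.

Let H be a V-face algebra and $\{L_\psi \mid \psi \in \Lambda\}$ its finite-dimensional comodules such that $H \cong \bigoplus_{\psi \in \Lambda} \mathrm{End}(L_\psi)^*$ as coalgebras. Let g be a central group-like element of H. We say that g is simply reducible if there exists a set $\bar{\Lambda}$ and a bijection $\varphi : \bar{\Lambda} \times \mathbb{Z}_{\ge 0} \cong \Lambda$ such that $L_{\varphi(\lambda,n)} \cong KVg^n \bar{\otimes} L_{\varphi(\lambda,0)}$ for each $\lambda \in \bar{\Lambda}$ and n ≥ 0.

Lemma 3.3 ([18]). Let H, g etc. be as above. Then, (1) the element g is not a zero-divisor of H. (2) The quotient $\bar{H} = H/(g - 1)$ is isomorphic to $\bigoplus_{\lambda \in \bar{\Lambda}} \mathrm{End}(\bar{L}_\lambda)^*$ as coalgebras, where $\bar{L}_\lambda$ is $L_{\varphi(\lambda,0)}$ viewed as an $\bar{H}$-comodule. As $\bar{H}$-comodules, we have $L_{\varphi(\lambda,n)} \cong \bar{L}_\lambda$ for each $\lambda \in \bar{\Lambda}$ and n.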


4. Compact Face Algebras

Let H be a V-face algebra over the complex number field C and $\times : H \to H;\ a \mapsto a^\times$ an antilinear map. We say that (H, ×) is a costar V-face algebra [17] if

\[ (a^\times)^\times = a, \qquad (ab)^\times = a^\times b^\times, \tag{4.1} \]

\[ \Delta(a^\times) = \sum_{(a)} a_{(2)}^\times \otimes a_{(1)}^\times, \tag{4.2} \]

\[ e_\lambda^\times = \mathring{e}_\lambda \tag{4.3} \]

for each λ ∈ V and a, b ∈ H. Let $[t_{pq}]_{p,q \in I}$ be a finite-size matrix whose entries are elements of H. Then $[t_{pq}]$ is called a unitary matrix corepresentation if $\Delta(t_{pq}) = \sum_r t_{pr} \otimes t_{rq}$, $\varepsilon(t_{pq}) = \delta_{pq}$ and $t_{pq}^\times = t_{qp}$. A costar V-face algebra H is called compact if H is spanned by entries of unitary matrix corepresentations (cf. [29,17]). For each costar V-face algebra H, its dual H∗ becomes a ∗-algebra via

\[ \langle X^*, a \rangle = \overline{\langle X, a^\times \rangle} \quad (X \in H^*,\ a \in H). \tag{4.4} \]

When H is a Hopf V-face algebra, we also set $a^* = S(a^\times)$ and $X^\times = S^*(X)^*$ for each a ∈ H and X ∈ H∗. These operations satisfy

\[ (a^*)^* = a, \qquad (X^\times)^\times = X \quad (a \in H,\ X \in H^*). \tag{4.5} \]

For each compact Hopf V-face algebra H, its Woronowicz functional [29,17] Q ∈ H∗ is a unique group-like functional such that

\[ Q X Q^{-1} = (S^2)^*(X) \quad (X \in H^*), \tag{4.6} \]

\[ \sum_p Q(t_{pp}) = \sum_p Q^{-1}(t_{pp}), \tag{4.7} \]

and that $[Q(t_{pq})]$ is a positive matrix, where $[t_{pq}]$ denotes an arbitrary unitary matrix corepresentation. The functional Q also satisfies

\[ Q^* = Q, \qquad Q^\times = Q^{-1}. \tag{4.8} \]

Let (H, ×) be a costar V-face algebra and M a finite-dimensional H-comodule equipped with a Hilbert space structure ( | ). We say that (M, ( | )) is unitary if

\[ \sum_{(u)} (u_{(0)} \mid v)\, u_{(1)}^\times = \sum_{(v)} (u \mid v_{(0)})\, v_{(1)} \quad (u, v \in M). \tag{4.9} \]

We note that (M, ( | )) is unitary if and only if (3.11) gives a ∗-representation of H∗. A costar V-face algebra (H, ×) is compact if and only if every finite-dimensional H-comodule is unitary for some ( | ).

Proposition 4.1. Let H be a compact V-face algebra. (1) The unit comodule CV becomes a unitary comodule via (λ | µ) = δλµ. (2) For each unitary comodule M and N, $M \bar{\otimes} N$ becomes a unitary comodule via $(u \otimes v \mid u' \otimes v') = (u \mid u')(v \mid v')$ ($u \in M(\lambda, \nu)$, $v \in N(\nu, \mu)$, $u' \in M(\lambda', \nu')$, $v' \in N(\nu', \mu')$). (3) If H has an antipode, then the left dual $M^\vee$ becomes a unitary comodule via

\[ (u \mid v) = (Q \Upsilon^{-1}(v) \mid \Upsilon^{-1}(u)), \tag{4.10} \]

where $\Upsilon : M \to M^\vee$ denotes the antilinear isomorphism defined by

\[ \langle \Upsilon(u), v \rangle = (v \mid u) \quad (u, v \in M). \tag{4.11} \]


Proof. The proof of Part (1) and Part (2) is straightforward. Using the second equality of (4.5), we obtain $(S^2)^*(X^\times) = S^*(X^*)$. Using this together with (4.6) and

\[ X \Upsilon(u) = \Upsilon(X^\times u) \quad (u \in M,\ X \in H^*), \tag{4.12} \]

we obtain

\[ (u \mid Xv) = (S^*(X^*)\, Q \Upsilon^{-1}(v) \mid \Upsilon^{-1}(u)) = (Q \Upsilon^{-1}(v) \mid (X^*)^\times \Upsilon^{-1}(u)) = (X^* u \mid v) \]

for each $u, v \in M^\vee$, as required. □

By the proposition above, the category $\mathrm{Com}^{fu}_H$ of all finite-dimensional unitary comodules of a compact (Hopf) V-face algebra H becomes a (rigid) monoidal category. Let M and N be unitary H-comodules and f : M → N an H-comodule map. We define a linear map $\bar{f} : N \to M$ by $(\bar{f}(n) \mid m) = (n \mid f(m))$. We have

\[ \bar{\bar{f}} = f, \qquad \overline{f \circ g} = \bar{g} \circ \bar{f}, \qquad \overline{f \bar{\otimes} g} = \bar{f} \bar{\otimes} \bar{g}, \tag{4.13} \]

\[ \overline{f + g} = \bar{f} + \bar{g}, \qquad \overline{cf} = \bar{c} \bar{f} \quad (c \in \mathbb{C}). \tag{4.14} \]

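Proposition 4.2. For each compact Hopf V-face algebra H and its comodule M, we have

\[ \bar{d}_M = \bigl( (I_{M^\wedge} \circ Q) \bar{\otimes} \mathrm{id}_M \bigr) \circ b_{M^\wedge}, \tag{4.15} \]

\[ \bar{b}_M = d_{M^\wedge} \circ \bigl( \mathrm{id}_M \bar{\otimes} (I_{M^\wedge} \circ Q)^{-1} \bigr). \tag{4.16} \]

Proof. Using (4.8) and (4.12), we obtain

\[ (q^\vee \mid Q p^\vee) = \langle p^\vee, \Upsilon^{-1}(q^\vee) \rangle, \qquad (\Upsilon(q) \mid r^\vee) = \langle r^\vee, Qq \rangle, \tag{4.17} \]

where {p} and $\{p^\vee\}$ are as in (3.8). Using the first equality of (4.17), we obtain

\[ \bigl( q^\vee \otimes r \bigm| ((I_{M^\wedge} \circ Q) \bar{\otimes} \mathrm{id}_M) \circ b_{M^\wedge}(\lambda) \bigr) = \sum_{\mu \in V} \sum_{p \in M_{\mu\lambda}} \bigl( q^\vee \otimes r \bigm| Q p^\vee \otimes p \bigr) = \sum_{\mu \in V} \sum_{p \in M_{\mu\lambda}} \bigl( r \bigm| \langle p^\vee, \Upsilon^{-1}(q^\vee) \rangle\, p \bigr) = \delta_{\lambda\, r(q)}\, \delta_{qr}, \]

where the first equality follows from (3.10). This proves (4.15). Similarly, (4.16) follows from the second equality of (4.17). □

Let H be a costar CQT V-face algebra. We say that H is of unitary type if

\[ R^+(a^\times, b^\times) = \overline{R^-(a, b)} \quad (a, b \in H). \tag{4.18} \]

In this case, the Drinfeld functionals of H satisfy

\[ U_1^* = U_2, \qquad U_2^* = U_1. \tag{4.19} \]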


Remark 4.1. When q > 0, the function algebras of the usual quantum groups (such as $\mathrm{Fun}(SL_q(N))$, $\mathrm{Fun}(Sp_q(2N))$) are both CQT and compact. However, they are not of unitary type but rather of “Hermitian type.”

Proposition 4.3 ([22]). For each compact CQT Hopf V-face algebra H of unitary type, its Woronowicz functional is a modified ribbon functional of H. Moreover, the corresponding ribbon functional $V_Q$ satisfies

\[ V_Q^* = V_Q^{-1}. \tag{4.20} \]

We call $V_Q$ the canonical ribbon functional of (H, ×). We note that the expression $U_1 = V_Q Q$ gives the “polar decomposition” of $U_1$. Let H be a compact CQT Hopf V-face algebra of unitary type, equipped with the canonical ribbon functional. By (4.18) and (4.20), we have

\[ \overline{c_{MN}} = (c_{MN})^{-1}, \qquad \overline{\theta_M} = \theta_M^{-1}. \tag{4.21} \]

Moreover, by (3.15), (3.16), (4.16), we have

\[ \bar{b}_M = d_M \circ c_{M M^\vee} \circ (\theta_M \bar{\otimes} \mathrm{id}_{M^\vee}). \tag{4.22} \]

Furthermore, by (3.14) and the defining properties of Q, we have

\[ \mathrm{Tr}_q(\bar{f} f) = \frac{1}{\mathrm{card}(V)}\, \mathrm{Tr}(\bar{f} Q f) > 0 \tag{4.23} \]

for every f ≠ 0. Thus, the category $\mathrm{Com}^{fu}_H$ is a unitary ribbon category (cf. [40]).

Example 4.1. Let L = 1 and $S(A_{N-1}; t)_\epsilon$ as in Example 2.1. Then $S(A_{N-1}; t)_\epsilon$ is a compact CQT Hopf V-face algebra of unitary type with costar structure $e^i_j(m)^\times = e^j_i(m)$. Its Woronowicz functional agrees with the counit ε.

5. Flat Face Models

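Let G be a finite oriented graph with set of vertices V. We say that a quadruple $\bigl( r\, {p \atop q}\, s \bigr)$ or a diagram

\[ \begin{array}{ccc} \lambda & \xrightarrow{\ p\ } & \mu \\ {\scriptstyle r}\big\downarrow & & \big\downarrow{\scriptstyle s} \\ \nu & \xrightarrow{\ q\ } & \xi \end{array} \tag{5.1} \]

is a face if $p, q, r, s \in G^1$ and

\[ s(p) = \lambda = s(r), \quad r(p) = \mu = s(s), \quad r(r) = \nu = s(q), \quad r(q) = \xi = r(s). \tag{5.2} \]

When G has no multiple edge, we also write $\bigl( r\, {p \atop q}\, s \bigr) = \begin{pmatrix} \lambda & \mu \\ \nu & \xi \end{pmatrix}$. We say that (G, w) is a (V-)face model over a field K if w is a map which assigns a number $w\bigl[ r\, {p \atop q}\, s \bigr] \in K$ to each face $\bigl( r\, {p \atop q}\, s \bigr)$ of G. We call w the Boltzmann weight of (G, w). For convenience, we set $w\bigl[ r\, {p \atop q}\, s \bigr] = 0$ unless $\bigl( r\, {p \atop q}\, s \bigr)$ is a face. For a face model (G, w), we identify w with the linear operator on $KG^2 = \bigoplus_{p \in G^2} Kp$ given by

\[ w(p \cdot q) = \sum_{r \cdot s \in G^2} w\Bigl[ r\, {p \atop s}\, q \Bigr]\, r \cdot s \quad (p \cdot q \in G^2). \tag{5.3} \]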

For m ≥ 2 and 1 ≤ i < m, we define an operator $w_i = w_{i/m}$ on $KG^m$ by $w_{i/m}(p \cdot q \cdot r) = p \otimes w(q) \otimes r$ ($p \in G^{i-1}$, $q \in G^2$, $r \in G^{m-i-1}$), where we identify $(p_1, \ldots, p_m) \in G^m$ with $p_1 \otimes \cdots \otimes p_m \in (KG^1)^{\otimes m}$. A face model is called invertible if w is invertible as an operator on $KG^2$. An invertible face model is called star-triangular (or Yang–Baxter) if w satisfies the braid relation $w_1 w_2 w_1 = w_2 w_1 w_2$ in $\mathrm{End}(KG^3)$. For a star-triangular face model (G, w), the operators $w_{i/m}$ (1 ≤ i < m) define an action of the m-string braid group $B_m$ on $KG^m_{\lambda\mu}$ for each m ≥ 2 and λ, µ ∈ V. The following proposition gives a face version of the FRT construction.

Proposition 5.1 ([35,30,13,19]). Let (G, w) be a V-face model and H(G) as in § 1. Let I be an ideal of H(G) generated by the following elements:

\[ \sum_{r \cdot s \in G^2} w\Bigl[ r\, {p \atop s}\, q \Bigr]\, e\binom{a \cdot b}{r \cdot s} - \sum_{c \cdot d \in G^2} w\Bigl[ a\, {c \atop b}\, d \Bigr]\, e\binom{c \cdot d}{p \cdot q} \quad (p \cdot q,\ a \cdot b \in G^2). \tag{5.4} \]

Then I is a coideal of H(G) and the quotient A(w) := H(G)/I becomes a V-face algebra. If (G, w) is star-triangular, then there exist unique bilinear pairings R± on A(w) such that (A(w), R±) is a CQT V-face algebra and that

\[ R^+\Bigl( e\binom{p}{q},\, e\binom{r}{s} \Bigr) = w\Bigl[ r\, {q \atop p}\, s \Bigr] \tag{5.5} \]

for each $p, q, r, s \in G^1$.

We denote the image of $e\binom{p}{q}$ by the projection H(G) → A(w) again by $e\binom{p}{q}$. Then $A_m(w) := \sum_{p, q \in G^m} K e\binom{p}{q}$ becomes a subcoalgebra of A(w) for each m ≥ 0. As in the usual FRT construction (cf. [12, Prop. 2.1]), we have the following.

Proposition 5.2. For each star-triangular face model (G, w), we have $A_m(w)^* \cong \mathrm{Hom}_{B_m}(KG^m)$ (m ≥ 2) as K-algebras.

We say that $\bigl( r\, {p \atop q}\, s \bigr)$ or (5.1) is a boundary condition of size m × n if $p, q \in G^n$, $r, s \in G^m$ and the relation (5.2) is satisfied for some λ, µ, ν, ξ. For a face model (G, w), we define its partition function to be an extension w : {boundary conditions of size m × n; m, n ≥ 1} → K of the map w which is determined by the following two recursion relations:

\[ w\Bigl[ r\, {p \cdot p' \atop q \cdot q'}\, s \Bigr] = \sum_{a \in G^m} w\Bigl[ r\, {p \atop q}\, a \Bigr]\, w\Bigl[ a\, {p' \atop q'}\, s \Bigr], \tag{5.6} \]

\[ w\Bigl[ r \cdot r'\, {p \atop q}\, s \cdot s' \Bigr] = \sum_{a \in G^n} w\Bigl[ r\, {p \atop a}\, s \Bigr]\, w\Bigl[ r'\, {a \atop q}\, s' \Bigr] \tag{5.7} \]

($p, q \in G^n$, $p', q' \in G^{n'}$, $r, s \in G^m$, $r', s' \in G^{m'}$). Also, we set $w\bigl[ r\, {p \atop q}\, s \bigr] = \delta_{pq}$ (respectively $w\bigl[ r\, {p \atop q}\, s \bigr] = \delta_{rs}$) if $r, s \in G^0$ (respectively $p, q \in G^0$). With this notation, the relation (5.5) holds for every star-triangular face model (G, w) and $p, q \in G^n$, $r, s \in G^m$ (m, n ≥ 0).

Next, we recall the notion of flat face model [18], which is a variant of A. Ocneanu’s notion of flat biunitary connection (cf. [33]). Let (G, w) be an invertible face model with a fixed vertex $* \in V = G^0$. We assume that $\Lambda^m_G \ne \emptyset$ for each m ≥ 0 and that $V = \bigcup_{m \ge 0} V(m)$, where $\Lambda_G$, $\Lambda^m_G$ and V(m) (all taken with respect to the base vertex ∗) are defined by

\[ \Lambda_G = \bigl\{ (\lambda, m) \in V \times \mathbb{Z}_{\ge 0} \bigm| G^m_{*\lambda} \ne \emptyset \bigr\}, \tag{5.8} \]

\[ \Lambda^m_G = \Lambda_G \cap (V \times \{m\}), \tag{5.9} \]

\[ V(m) = \bigl\{ \lambda \in V \bigm| (\lambda, m) \in \Lambda_G \bigr\}. \tag{5.10} \]

For each m ≥ 0, we define the algebra $\mathrm{Str}^m(G, *)$ by

\[ \mathrm{Str}^m(G, *) = \prod_{\lambda \in V(m)} \mathrm{End}(KG^m_{*\lambda}) \tag{5.11} \]

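and call it a string algebra of (G, w, ∗). For each m, n ≥ 0, we define the algebra map $\iota_{mn} : \mathrm{Str}^m(G, *) \to \mathrm{Str}^{m+n}(G, *)$ by $\iota_{mn}(x)(p \cdot q) = xp \otimes q$ ($p \in G^m_{*\lambda}$, $q \in G^n_{\lambda\mu}$). For each 1 ≤ i < m, we define the element $\overset{*}{w}_i = \overset{*}{w}_{i/m}$ of $\mathrm{Str}^m(G, *)$ to be the restriction of $w_{i/m}$ on $KG^m_{*-}$, where $G^m_{*-} = \coprod_{\lambda \in V} G^m_{*\lambda}$. We say that (G, w, ∗) is a flat face model if

\[ \iota_{mn}(x)\, \overset{*}{w}_{nm}\, \iota_{nm}(y)\, \overset{*}{w}{}_{nm}^{-1} = \overset{*}{w}_{nm}\, \iota_{nm}(y)\, \overset{*}{w}{}_{nm}^{-1}\, \iota_{mn}(x) \tag{5.12} \]

for each $x \in \mathrm{Str}^m(G, *)$ and $y \in \mathrm{Str}^n(G, *)$ (m, n ≥ 0), where $\overset{*}{w}_{mn} \in \mathrm{Str}^{m+n}(G, *)$ is defined by

\[ \overset{*}{w}_{mn} = (\overset{*}{w}_n \overset{*}{w}_{n+1} \cdots \overset{*}{w}_{m+n-1})(\overset{*}{w}_{n-1} \overset{*}{w}_n \cdots \overset{*}{w}_{m+n-2}) \cdots (\overset{*}{w}_1 \overset{*}{w}_2 \cdots \overset{*}{w}_m). \tag{5.13} \]

For each flat V-face model (G, w, ∗), n ≥ 0 and λ, µ ∈ V, there exists a unique left action Γ of $\mathrm{Str}^n(G, *)$ on $KG^n_{\lambda\mu}$ such that

\[ p \otimes (\Gamma(x) q) = \overset{*}{w}_{nm}\, \iota_{nm}(x)\, \overset{*}{w}{}_{nm}^{-1} (p \cdot q) \tag{5.14} \]

for each m ≥ 0, $p \in G^m_{*\lambda}$, $q \in G^n_{\lambda\mu}$ and $x \in \mathrm{Str}^n(G, *)$. Using this action, we define the costring algebra

\[ \mathrm{Cost}(w, *) = \bigoplus_{m \ge 0} \mathrm{Cost}^m(w, *) \tag{5.15} \]

to be the quotient V-face algebra of $\bigoplus_{m \ge 0} \mathrm{End}_K(KG^m)^* \cong H(G)$ given by

\[ \mathrm{Cost}^m(w, *) = \mathrm{End}_{\mathrm{Str}^m(G,*)}(KG^m)^*. \tag{5.16} \]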

For each λ, µ ∈ V and $(\nu, m) \in \Lambda_G$, we define the non-negative integer $N^\mu_{\lambda\nu}(m)$ by the irreducible decomposition of the $\mathrm{Str}^m(G, *)$-module $KG^m_{\lambda\mu}$:

\[ [KG^m_{\lambda\mu}] = \sum_{\nu \in V(m)} N^\mu_{\lambda\nu}(m)\, [KG^m_{*\nu}], \tag{5.17} \]

where, for each $\mathrm{Str}^m(G, *)$-module V, [V] denotes the element of the Grothendieck group $K_0(\mathrm{Str}^m(G, *))$ corresponding to V (see e.g. [4, §5.1]). We call $N^\mu_{\lambda\nu}(m)$ the fusion rules of (G, w, ∗).

Theorem 5.3 ([18]). Let (G, w, ∗) be a flat V-face model with fusion rule $N^\mu_{\lambda\nu}(m)$. (1) For each $(\lambda, m) \in \Lambda_G$, up to isomorphism there exists a unique right $\mathrm{Cost}^m(w, *)$-comodule $L_{(\lambda,m)}$ such that

\[ \dim L_{(\lambda,m)}(*, \mu) = \delta_{\lambda\mu} \tag{5.18} \]

for each µ ∈ V. As coalgebras, we have

\[ \mathrm{Cost}^m(w, *) \cong \bigoplus_{\lambda \in V(m)} \mathrm{End}(L_{(\lambda,m)})^*. \tag{5.19} \]

(2) In the corepresentation ring $K_0(\mathrm{Com}^f_{\mathrm{Cost}(w,*)})$, we have

\[ [L_{(*,0)}] = 1, \tag{5.20} \]

\[ [L_{(\lambda,m)}][L_{(\mu,n)}] = \sum_{\nu \in V(m+n)} N^\nu_{\lambda\mu}(n)\, [L_{(\nu,m+n)}]. \tag{5.21} \]

Moreover, for each $\mathrm{Cost}^m(w, *)$-comodule M, we have

\[ [M] = \sum_{\lambda \in V(m)} \dim(M(*, \lambda))\, [L_{(\lambda,m)}]. \tag{5.22} \]

(3) We have

\[ \dim L_{(\nu,m)}(\lambda, \mu) = N^\mu_{\lambda\nu}(m). \tag{5.23} \]

Lemma 5.4. For each flat star-triangular face model, we have

\[ \Gamma(\overset{*}{w}_{i/n}) = w_{i/n} \quad (n \ge 2,\ 1 \le i < n). \tag{5.24} \]

Proof. By the braid relation, we have

\[ \overset{*}{w}_{nm}\, \overset{*}{w}_{i/n+m}\, \overset{*}{w}{}_{nm}^{-1} = \overset{*}{w}_{i+m/n+m} \tag{5.25} \]

for each 1 ≤ i < n. Using this together with $\iota_{nm}(\overset{*}{w}_{i/n}) = \overset{*}{w}_{i/n+m}$, we obtain

\[ p \otimes (\Gamma(\overset{*}{w}_{i/n}) q) = p \otimes w_{i/n} q \tag{5.26} \]

for each m ≥ 0, $p \in G^m_{*\lambda}$ and $q \in G^n_{\lambda\mu}$, as required. □

Proposition 5.5. Let (G, w) be a star-triangular V-face model with a fixed vertex ∗ ∈ V. Then (G, w, ∗) is flat if $KG^m_{*\lambda}$ is an absolutely irreducible $B_m$-module for each $(\lambda, m) \in \Lambda_G$. In this case, we have Cost(w, ∗) = A(w) as quotients of H(G).

Proof. Using (5.25), we see that (5.12) holds for every $x \in \mathrm{Str}^m(G, *)$ and $y = \overset{*}{w}_i$ (1 ≤ i < n). Hence (G, w, ∗) is flat if $\mathrm{Str}^m(G, *) = \langle \overset{*}{w}_1, \ldots, \overset{*}{w}_{m-1} \rangle$ for each m > 1. The second assertion follows from Proposition 5.2 and the lemma above.


6. SU(N)_L-SOS Models

In order to construct the algebras $S(A_{N-1}; t)_\epsilon$, we first recall SU(N)_L-SOS models (without spectral parameter) [23], which are equivalent to H. Wenzl’s representations of Iwahori–Hecke algebras (cf. [45]) and also to the monodromy representations of the braid group arising from conformal field theory (cf. A. Tsuchiya and Y. Kanie [39]). Let N ≥ 2 and L ≥ 2 be integers. For each 1 ≤ i ≤ N, we define the vector $\hat{i} \in \mathbb{R}^N$ by $\hat{1} = (1 - 1/N, -1/N, \ldots, -1/N)$, ..., $\hat{N} = (-1/N, \ldots, -1/N, 1 - 1/N)$. Let $V = V_{NL}$ be the subset of $\mathbb{R}^N$ given by

\[ V_{NL} = \bigl\{ \lambda_1 \hat{1} + \cdots + \lambda_N \hat{N} \bigm| \lambda_1, \ldots, \lambda_N \in \mathbb{Z},\ L \ge \lambda_1 \ge \cdots \ge \lambda_N = 0 \bigr\}. \tag{6.1} \]

For λ ∈ V, we define integers $\lambda_1, \ldots, \lambda_N$ and |λ| by $\lambda = \sum_i \lambda_i \hat{i}$, $\lambda_N = 0$ and $|\lambda| = \sum_i \lambda_i$. For m ≥ 0, we define a subset $G^m$ of $V^{m+1}$ by

\[ G^m = V^{m+1} \cap \bigl\{ p = (\lambda \mid i_1, \ldots, i_m) \bigm| \lambda \in V,\ 1 \le i_1, \ldots, i_m \le N \bigr\}, \tag{6.2} \]

where for $\lambda \in \mathbb{R}^N$ and $1 \le i_1, \ldots, i_m \le N$, we set

\[ (\lambda \mid i_1, \ldots, i_m) = (\lambda,\ \lambda + \hat{i}_1,\ \ldots,\ \lambda + \hat{i}_1 + \cdots + \hat{i}_m). \tag{6.3} \]

Then $(V, G^1)$ defines an oriented graph $G = G_{N,L}$ and $G^m$ is identified with the set of paths of G of length m. For p = (λ | i, j), we set $p^\dagger = (\lambda \mid j, i)$ and

\[ d(p) = \lambda_i - \lambda_j + j - i. \tag{6.4} \]

We define subsets $G^2[\to]$, $G^2[\downarrow]$ and $G^2[\searrow]$ of $G^2$ by

\[ G^2[\to] = \bigl\{ p \in G^2 \bigm| p^\dagger = p \bigr\}, \tag{6.5} \]

\[ G^2[\downarrow] = \bigl\{ p \in G^2 \bigm| p^\dagger \notin G^2 \bigr\}, \tag{6.6} \]

\[ G^2[\searrow] = \bigl\{ p \in G^2 \bigm| p \ne p^\dagger \in G^2 \bigr\}. \tag{6.7} \]
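The combinatorics of (6.1)–(6.7) is easy to enumerate for small N and L. The following is a minimal sketch, not part of the original construction: it assumes that a vertex λ is stored as the integer tuple (λ_1, ..., λ_N) with λ_N = 0, and that adding $\hat{N}$ is implemented by renormalising with the relation $\hat{1} + \cdots + \hat{N} = 0$. All function names and the labels 'right', 'down', 'diag' (for the three subsets of length-two paths) are hypothetical.

```python
from itertools import product


def vertices(n, level):
    """All (l_1, ..., l_n) with level >= l_1 >= ... >= l_n = 0, as in (6.1)."""
    return [
        lam + (0,)
        for lam in product(range(level + 1), repeat=n - 1)
        if all(lam[k] >= lam[k + 1] for k in range(n - 2))
    ]


def step(lam, i):
    """Return lam + i-hat (i is 1-based), renormalised so the last entry is 0.

    Since 1-hat + ... + N-hat = 0, a uniform shift of all coordinates
    does not change the vector, so the last coordinate can be reset to 0.
    """
    mu = list(lam)
    mu[i - 1] += 1
    shift = mu[-1]
    return tuple(x - shift for x in mu)


def is_vertex(lam, level):
    """Membership test for V_{NL} on a renormalised tuple."""
    decreasing = all(a >= b for a, b in zip(lam, lam[1:]))
    return lam[-1] == 0 and decreasing and lam[0] <= level


def d(lam, i, j):
    """d(p) = lam_i - lam_j + j - i for p = (lam | i, j), as in (6.4)."""
    return lam[i - 1] - lam[j - 1] + j - i


def classify(lam, i, j, level):
    """Return 'right', 'down' or 'diag' for p = (lam | i, j), or None if p is not a path."""

    def is_path(a, b):
        mid = step(lam, a)
        return is_vertex(mid, level) and is_vertex(step(mid, b), level)

    if not is_path(i, j):
        return None
    if i == j:
        return "right"  # p-dagger equals p
    return "diag" if is_path(j, i) else "down"
```

For N = 2 and L = 2 the vertex set is the chain 0, 1, 2, and for instance the path 0 → 1 → 0 has no partner $p^\dagger$, so it is classified as 'down'.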

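Let t ∈ C be a primitive 2(N + L)th root of 1. Let ϵ be either 1 or −1 and ζ a nonzero parameter. We define a face model $(G, w_{N,t,\epsilon}) = (G_{N,L}, w_{N,t,\epsilon,\zeta})$ by setting

\[ w_{N,t,\epsilon} \begin{bmatrix} \lambda & \lambda + \hat{i} \\ \lambda + \hat{i} & \lambda + \hat{i} + \hat{j} \end{bmatrix} = -\zeta^{-1} t^{-d(p)}\, \frac{1}{[d(p)]}, \tag{6.8} \]

\[ w_{N,t,\epsilon} \begin{bmatrix} \lambda & \lambda + \hat{i} \\ \lambda + \hat{j} & \lambda + \hat{i} + \hat{j} \end{bmatrix} = \epsilon\, \zeta^{-1}\, \frac{[d(p) - 1]}{[d(p)]}, \tag{6.9} \]

\[ w_{N,t,\epsilon} \begin{bmatrix} \lambda & \lambda + \hat{k} \\ \lambda + \hat{k} & \lambda + 2\hat{k} \end{bmatrix} = \zeta^{-1} t \tag{6.10} \]

for each $p = (\lambda \mid i, j) \in G^2[\searrow] \sqcup G^2[\downarrow]$ and $(\lambda \mid k, k) \in G^2[\to]$, where $[n] = (t^n - t^{-n})/(t - t^{-1})$ for each n ∈ Z. We call $(G, w_{N,t,\epsilon})$ an SU(N)_L-SOS model (without spectral parameter) [23]. It is known that $(G, w_{N,t,\epsilon})$ is star-triangular. Moreover, H. Wenzl [45] showed that $\mathbb{C}G^m_{0\lambda}$ is an irreducible $B_m$-module for each m ≥ 0 and $\lambda \in \Lambda^m_{G0}$. Therefore $(G, w_{N,t,\epsilon}, 0)$ is flat by Proposition 5.5. In [10], F. Goodman and H. Wenzl showed that the fusion rule of $(G, w_{N,t,\epsilon}, 0)$ agrees with that of SU(N)_L-WZW model. We give another proof of their result in the next section.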
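The q-numbers [n] and the weights (6.8)–(6.10) can be checked numerically. The sketch below is only an illustration under stated assumptions: the particular primitive root $t = e^{\pi i/(N+L)}$ is our choice, the helper names are hypothetical, and the sign ϵ is placed on the off-diagonal weight (6.9), which is our reading of the formula. On the two-dimensional span of p and $p^\dagger$ a short computation gives trace $\zeta^{-1}(t - t^{-1})$ and determinant $-\zeta^{-2}$, i.e. the eigenvalues $\zeta^{-1}t$ and $-\zeta^{-1}t^{-1}$ of a Hecke-type operator. The code also checks the identity [a + b + 1] + [a][b] = [a + 1][b + 1], which is used in the proof of Lemma 8.2.

```python
import cmath


def primitive_root(n, level):
    """One primitive 2(N+L)-th root of unity; this particular choice is ours."""
    return cmath.exp(1j * cmath.pi / (n + level))


def q_number(k, t):
    """[k] = (t^k - t^-k) / (t - t^-1)."""
    return (t**k - t**(-k)) / (t - 1 / t)


def hecke_block(d, t, zeta=1.0, eps=1):
    """2x2 block of w on span{p, p-dagger} when d(p) = d, so d(p-dagger) = -d.

    Diagonal entries follow (6.8); off-diagonal entries follow (6.9),
    with the sign eps placed there by assumption.
    """
    diag_p = -t**(-d) / (zeta * q_number(d, t))
    diag_pd = -t**d / (zeta * q_number(-d, t))
    off_p = eps * q_number(d - 1, t) / (zeta * q_number(d, t))
    off_pd = eps * q_number(-d - 1, t) / (zeta * q_number(-d, t))
    return [[diag_p, off_p], [off_pd, diag_pd]]


def trace_and_det(block):
    """Trace and determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, e) = block
    return a + e, a * e - b * c


def identity_defect(a, b, t):
    """[a+b+1] + [a][b] - [a+1][b+1]; this vanishes for all integers a, b."""
    lhs = q_number(a + b + 1, t) + q_number(a, t) * q_number(b, t)
    return lhs - q_number(a + 1, t) * q_number(b + 1, t)


t = primitive_root(3, 2)  # N = 3, L = 2
trace, det = trace_and_det(hecke_block(2, t))
```

Note that [N + L] = 0 for such a t, which is what makes the off-diagonal weight (6.9) vanish on paths at the boundary of the vertex set.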


Remark 6.1. (1) Strictly speaking, Wenzl deals with $w_{N,t,\epsilon}$ only when ϵ = −1. However, it is clear that his arguments are applicable to the case ϵ = 1. The results for $A(w_{N,t,1})$ also follow from those of $A(w_{N,t,-1})$, since the former is a 2-cocycle deformation of the latter (cf. [5]). Hence, ϵ may be viewed as a gauge parameter. (2) In order to avoid using square roots of complex numbers, we use a different normalization of $w_{N,t,-1}$ from Wenzl [45]. For each $p \in G^m$ (m ≥ 1), we define κ(p) ∈ C by

\[ \kappa(p \cdot q) = \kappa(p)\kappa(q) \quad (p \in G^m,\ q \in G^n,\ m, n > 0), \]

\[ \kappa(\lambda \mid i) = \prod_{k=i+1}^{N} A_{d(0 \mid i,k)+1}\, A_{d(0 \mid i,k)+2} \cdots A_{d(\lambda \mid i,k)} \quad ((\lambda \mid i) \in G^1), \tag{6.11} \]

where $A_d = a_d / \sqrt{a_d a_{-d}}$ and $a_d = [d + 1]/[2][d]$. Note that κ(p) satisfies

\[ \kappa(p) = A_{d(p)}\, \kappa(p^\dagger) \tag{6.12} \]

for each $p \in G^2[\searrow]$. By replacing the basis {p} of $\mathbb{C}G^m$ with {κ(p)p}, we obtain Wenzl’s original expression of the Hecke algebra representation. It is also useful to use $\{\kappa(p)^2 p\}$ instead of {p} (see Sect. 12). The corresponding Boltzmann weight $w^\Sigma_{N,t,\epsilon}$ satisfies

\[ w^\Sigma_{N,t,\epsilon}\Bigl[ r\, {p \atop s}\, q \Bigr] := \Bigl( \frac{\kappa(p \cdot q)}{\kappa(r \cdot s)} \Bigr)^2 w_{N,t,\epsilon}\Bigl[ r\, {p \atop s}\, q \Bigr] = w_{N,t,\epsilon}\Bigl[ p\, {r \atop q}\, s \Bigr]. \tag{6.13} \]

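We call {p} and $\{\kappa(p)^2 p\}$ a rational basis of type Ω and type Σ respectively.

7. The Algebra $S(A_{N-1}; t)_\epsilon$

Applying Proposition 5.1 to $(G, w_{N,t,\epsilon,\zeta})$, we obtain a CQT V-face algebra $A(w_{N,t,\epsilon,\zeta}) = \mathrm{Cost}(w_{N,t,\epsilon}, 0)$. In order to define the “(quantum) determinant” of $A(w_{N,t,\epsilon})$, we introduce an algebra $\Omega = \Omega_{N,L,\epsilon}$, which is a face-analogue of the exterior algebra. It is defined by generators ω(p) ($p \in G^m$; m ≥ 0) with defining relations:

\[ \sum_{k \in V} \omega(k) = 1, \tag{7.1} \]

\[ \omega(p)\omega(q) = \delta_{r(p)s(q)}\, \omega(p \cdot q), \tag{7.2} \]

\[ \omega(p) = -\epsilon\, \omega(p^\dagger) \quad (p \in G^2[\searrow]), \tag{7.3} \]

\[ \omega(p) = 0 \quad (p \in G^2[\to]). \tag{7.4} \]

It is easy to verify that $\Omega^m := \sum_{p \in G^m} \mathbb{C}\omega(p)$ becomes an $A(w_{N,t,\epsilon})$-comodule via

\[ \omega(q) \mapsto \sum_{p \in G^m} \omega(p) \otimes e\binom{p}{q} \quad (q \in G^m) \tag{7.5} \]

for each m ≥ 0. For each m ≥ 0, we set

\[ B_m = \Bigl\{ \bigl( \lambda,\ \lambda + \sum_{k \in I} \hat{k} \bigr) \in V^2 \Bigm| I \subset \{1, \ldots, N\},\ \mathrm{card}(I) = m \Bigr\}. \tag{7.6} \]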


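Also we define $L : G^m \to \mathbb{Z}_{\ge 0}$ by

\[ L(\lambda \mid i_1, \ldots, i_m) = \mathrm{Card}\{ (k, l) \mid 1 \le k < l \le N,\ i_k < i_l \}. \tag{7.7} \]

Proposition 7.1. For each $(\lambda, \mu) \in B_m$, $G^m_{\lambda\mu} \ne \emptyset$ and $\omega_m(\lambda, \mu) := (-\epsilon)^{L(p)} \omega(p)$ does not depend on the choice of $p \in G^m_{\lambda\mu}$. Moreover $\{\omega_m(\lambda, \mu) \mid (\lambda, \mu) \in B_m\}$ is a basis of $\Omega^m$. In particular, $\Omega^m = 0$ if m > N.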
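The role of the counting function L in Proposition 7.1 can be seen in a few lines of code. This is a minimal sketch under two assumptions of ours: the pairs (k, l) in (7.7) are taken over the m positions of the path, which is the only range where $i_l$ is defined, and the sign attached to ω(p) has base −ϵ, as suggested by the relation (7.3). Swapping two adjacent distinct indices changes L by exactly one, so the sign flips by −ϵ, which is what compensates the relation between ω(p) and $\omega(p^\dagger)$. The function names are hypothetical.

```python
def ordered_pairs(indices):
    """L(lam | i_1, ..., i_m): number of position pairs k < l with i_k < i_l."""
    m = len(indices)
    return sum(
        1
        for k in range(m)
        for l in range(k + 1, m)
        if indices[k] < indices[l]
    )


def sign(indices, eps):
    """The factor (-eps)^L(p) attached to omega(p); eps is +1 or -1."""
    return (-eps) ** ordered_pairs(indices)


def swap_adjacent(indices, pos):
    """Exchange the entries at positions pos and pos + 1 (0-based)."""
    out = list(indices)
    out[pos], out[pos + 1] = out[pos + 1], out[pos]
    return tuple(out)
```

For example, the index sequences (1, 2, 3) and (2, 1, 3) have L equal to 3 and 2 respectively, so for ϵ = 1 their signs differ by −1, while for ϵ = −1 both signs are +1.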

Proof. We will prove this proposition by means of Bergman’s diamond lemma [2], or rather its obvious generalization to the quotient algebras of $\mathbb{C}\langle G \rangle$, where $\langle G \rangle = \coprod_m G^m$. We define a “reduction system” $S = S_1 \sqcup S_2 \subset \langle G \rangle \times \mathbb{C}\langle G \rangle$ by setting

\[ S_1 = \bigl\{ (p, -\epsilon\, p^\dagger) \bigm| p = (\lambda \mid i, j) \in G^2[\searrow],\ i < j \bigr\}, \]

\[ S_2 = \bigl\{ (p, 0) \bigm| p = (\lambda \mid i_1, \ldots, i_m) \in G^m,\ m \ge 2,\ \mathrm{card}\{i_1, \ldots, i_m\} < m \bigr\}. \]

It is straightforward to verify that the quotient $\mathbb{C}\langle G \rangle / \langle W - f \mid (W, f) \in S \rangle$ is isomorphic to Ω and that all ambiguities of S are resolvable. Next, we introduce a semigroup partial order ≤ on $\langle G \rangle$ by setting $(\lambda \mid i_1, \ldots, i_m) < (\lambda \mid j_1, \ldots, j_n)$ if either m < n, or m = n and $i_1 = j_1, \ldots, i_{k-1} = j_{k-1}$, $i_k > j_k$ for some 1 ≤ k ≤ m. Then ≤ is compatible with S and satisfies the descending chain condition. This completes the proof of the proposition. □

For each 0 ≤ m ≤ N, we set $\Lambda_m = \hat{1} + \cdots + \hat{m}$. As an immediate consequence of (5.22) and the proposition above, we obtain the following result.

Proposition 7.2. For each 0 ≤ m ≤ N, we have $\Omega^m \cong L_{(\Lambda_m, m)}$ as $A(w_{N,t,\epsilon})$-comodules.

Now we define the “determinant” $\det = \sum_{\lambda,\mu \in V} \det{}^\lambda_\mu$ of $A(w_{N,t,\epsilon})$ to be the group-like element which corresponds to the group-like comodule $\Omega^N$ and its basis $\{\bar{\omega}(\lambda)\}$ via Lemma 3.2, where $\bar{\omega}(\lambda) = D(\lambda)\, \omega_N(\lambda, \lambda)$ and

\[ D(\lambda) = \prod_{1 \le i < j \le N} \frac{[d(\lambda \mid i, j)]}{[d(0 \mid i, j)]} \quad (\lambda \in V). \tag{7.8} \]

Explicitly, we have

\[ \det{}^\lambda_\mu = \frac{D(\mu)}{D(\lambda)} \sum_{p \in G^N_{\lambda\lambda}} (-\epsilon)^{L(p)+L(q)}\, e\binom{p}{q}, \tag{7.9} \]

where q denotes an arbitrary element of $G^N_{\mu\mu}$. By (2.38) and (2.39) for g = det, the quotient

\[ S(A_{N-1}; t)_\epsilon := A(w_{N,t,\epsilon})/(\det - 1) \tag{7.10} \]

naturally becomes a V-face algebra, which we call an SU(N)_L-SOS algebra. The proof of the following lemma will be given in Sects. 8 and 13.

Lemma 7.3. For each $p \in G^1$, we have

\[ c_{\Omega^N \Omega^1}\bigl( \bar{\omega}(s(p)) \otimes \omega(p) \bigr) = \epsilon^{N-1} \zeta^{-N} t\ \omega(p) \otimes \bar{\omega}(r(p)), \tag{7.11} \]

\[ c_{\Omega^1 \Omega^N}\bigl( \omega(p) \otimes \bar{\omega}(r(p)) \bigr) = \epsilon^{N-1} \zeta^{-N} t\ \bar{\omega}(s(p)) \otimes \omega(p). \tag{7.12} \]


Proposition 7.4. The element det belongs to the center of $A(w_{N,t,\epsilon})$. Moreover, if ζ satisfies

\[ \zeta^N = \epsilon^{N-1} t, \tag{7.13} \]

then

\[ R^\pm(\det - 1, a) = 0 = R^\pm(a, \det - 1) \quad (a \in A(w_{N,t,\epsilon,\zeta})). \tag{7.14} \]

Hence, $S(A_{N-1}; t)_\epsilon$ naturally becomes a quotient CQT V-face algebra of $A(w_{N,t,\epsilon,\zeta})$.

Proof (cf. [13]). By computing the coaction of $A(w_{N,t,\epsilon})$ on $\omega(p) \otimes \bar{\omega}(r(p))$ in two ways via (7.11), we obtain the first assertion. We show the first equality of (7.14) for ± = + and $a = e\binom{p}{q}$ ($p, q \in G^m$, m ≥ 0). By (2.25) and (2.27), it suffices to show

\[ R^+\Bigl( \det{}^{r(p)}_{s(p)},\ e\binom{p}{q} \Bigr) = \delta_{pq} \quad (p, q \in G^m). \tag{7.15} \]

For m = 0, this follows from (2.27) and (2.25). By computing the left-hand side of (7.11) via (3.5), we obtain (7.15) for m = 1. For m ≥ 2, (7.15) follows from (2.23) and (2.38) for g = det by induction on m. □

Since the braiding of $S(A_{N-1}; t)_\epsilon$ depends on the choice of the discrete parameter ζ satisfying (7.13), we sometimes write $S(A_{N-1}; t)_{\epsilon,\zeta}$ instead of $S(A_{N-1}; t)_\epsilon$. To state our first main result, we recall the fusion rule of the SU(N)_L-WZW model in conformal field theory. By [9], it is characterized as the structure constant $N^\nu_{\lambda\mu}$ of a commutative Z-algebra F (called the fusion algebra of the SU(N)_L-WZW model) with free basis $\{\chi_\lambda\}_{\lambda \in V}$ (i.e., $\chi_\lambda \chi_\mu = \sum_{\nu \in V} N^\nu_{\lambda\mu} \chi_\nu$) such that

\[ N^\mu_{\lambda \Lambda_m} = \begin{cases} 1 & (\lambda, \mu) \in B_m \\ 0 & \text{otherwise} \end{cases} \tag{7.16} \]

for each λ, µ ∈ V and 0 ≤ m < N. See Kac [25] or Walton [44] for an explicit formula of $N^\nu_{\lambda\mu}$.

Theorem 7.5 ([11]). (1) For each λ ∈ V, up to isomorphism there exists a unique right $S(A_{N-1}; t)_\epsilon$-comodule $L_\lambda$ such that

\[ \dim L_\lambda(0, \mu) = \delta_{\lambda\mu} \quad (\mu \in V). \tag{7.17} \]

Moreover, we have

\[ S(A_{N-1}; t)_\epsilon \cong \bigoplus_{\lambda \in V} \mathrm{End}(L_\lambda)^* \tag{7.18} \]

as coalgebras. In particular, $L_\lambda$ is irreducible for each λ ∈ V. (2) The corepresentation ring $K_0(\mathrm{Com}^f_{S(A_{N-1};t)_\epsilon})$ is identified with the fusion algebra F of SU(N)_L-WZW model via $\chi_\lambda = [L_\lambda]$. That is, we have

\[ [L_0] = 1, \tag{7.19} \]

\[ [L_\lambda][L_\mu] = \sum_{\nu \in V} N^\nu_{\lambda\mu}\, [L_\nu]. \tag{7.20} \]

(3) We have

\[ \dim(L_\nu(\lambda, \mu)) = N^\mu_{\lambda\nu}. \tag{7.21} \]
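The characterization (7.16) of fusion with the fundamental weights $\Lambda_m$ is purely combinatorial and can be tabulated directly from the definition of $B_m$ in (7.6). The following is a minimal sketch, assuming that a vertex λ is stored as the integer tuple (λ_1, ..., λ_N) with λ_N = 0 and that sums of the vectors $\hat{k}$ are renormalised using $\hat{1} + \cdots + \hat{N} = 0$. The function names are hypothetical and nothing here is taken from the cited explicit formulas of Kac or Walton.

```python
from itertools import combinations, product


def vertices(n, level):
    """All (l_1, ..., l_n) with level >= l_1 >= ... >= l_n = 0."""
    return [
        lam + (0,)
        for lam in product(range(level + 1), repeat=n - 1)
        if all(lam[k] >= lam[k + 1] for k in range(n - 2))
    ]


def add_hats(lam, indices):
    """lam + sum of k-hat over the given 1-based indices, with last entry reset to 0."""
    mu = list(lam)
    for k in indices:
        mu[k - 1] += 1
    shift = mu[-1]
    return tuple(x - shift for x in mu)


def fusion_with_fundamental(lam, m, n, level):
    """All mu with (lam, mu) in B_m, i.e. the mu whose coefficient in (7.16) is 1."""
    allowed = set(vertices(n, level))
    found = []
    for subset in combinations(range(1, n + 1), m):
        mu = add_hats(lam, subset)
        if mu in allowed:
            found.append(mu)
    return sorted(found)
```

For N = 2 and L = 2 this reproduces the familiar truncated rule on the chain 0, 1, 2: the middle vertex fuses with $\Lambda_1$ to both neighbours, while each end vertex fuses only to the middle one.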


Proof. It is easy to verify that $(\lambda, m) \in \Lambda_{G_{N,L}}$ if and only if $m \in |\lambda| + N\mathbb{Z}_{\ge 0}$. Since $\mathbb{C}V\det^n \bar{\otimes} L_{(\lambda,m)} \cong L_{(\lambda,m+Nn)}$ by (3.2) and (5.18), we see that det satisfies all conditions of Lemma 3.3, where $\bar{\Lambda} = V$ and $\varphi(\lambda, n) = (\lambda, |\lambda| + Nn)$. Therefore, we have Part (1), (7.19) and

\[ [L_\lambda][L_\mu] = \sum_{\nu \in V} N^\nu_{\lambda\mu}(|\mu|)\, [L_\nu], \qquad \dim(L_\nu(\lambda, \mu)) = N^\mu_{\lambda\nu}(m) \quad ((\nu, m) \in \Lambda_{G_{N,L}}). \tag{7.22} \]

In particular, $\tilde{N}^\nu_{\lambda\mu} := N^\nu_{\lambda\mu}(m)$ does not depend on the choice of m. Using Proposition 7.2, (5.22) and (7.17), we obtain

\[ [L_\lambda][L_{\Lambda_m}] = \sum_{\mu \in V} \dim\bigl( (L_\lambda \bar{\otimes} \Omega^m)(0, \mu) \bigr)\, [L_\mu] = \sum_{\mu \in V} \dim\bigl( \Omega^m(\lambda, \mu) \bigr)\, [L_\mu]. \tag{7.23} \]

Thus the numbers $\tilde{N}^\nu_{\lambda\mu}$ satisfy the condition of $N^\nu_{\lambda\mu}$ stated above. □

Proposition 7.6. The element det is not a zero-divisor of $A(w_{N,t,\epsilon})$. In particular, we have $\det{}^\lambda_\mu \ne 0$ for each λ, µ ∈ V. Moreover, we have

\[ \mathrm{GLE}\bigl( A(w_{N,t,\epsilon}) \bigr) = \Bigl\{ \sum_{\lambda,\mu \in V} \frac{c(\lambda)}{c(\mu)}\, \mathring{e}_\lambda e_\mu \det{}^m \Bigm| m \in \mathbb{Z}_{\ge 0},\ c(\lambda) \in \mathbb{C}^\times\ (\lambda \in V) \Bigr\}. \tag{7.24} \]

Proof. The first assertion follows from the fact that det is simply reducible (see the proof of the theorem above). By Theorem 5.3 (1), every group-like comodule of $A(w_{N,t,\epsilon})$ is isomorphic to $(\Omega^N)^{\otimes m}$ for some m ≥ 0. Hence the second assertion follows from Lemma 3.2. □

8. Module Structure of Ω

Let H be a CQT V-face algebra over K. As in the case where H is a CQT bialgebra, the correspondence $a \mapsto R^+(\ , a)$ defines an antialgebra-coalgebra map from H into the dual face algebra $H^\circ$ (cf. [16]). Let W be a right H-comodule. Combining the above map with the left action (3.11) of H∗ on W, we obtain a right action of H on W given by

\[ wa = \sum_{(w)} w_{(0)}\, R^+(w_{(1)}, a) \quad (w \in W,\ a \in H). \tag{8.1} \]

Let V be another H-comodule. Then we have

\[ (v \otimes w)\, a = \sum_{(a)} v a_{(1)} \otimes w a_{(2)} \quad (v \in V(\lambda, \nu),\ w \in W(\nu, \mu),\ a \in H). \tag{8.2} \]

If H has an antipode and W is finite-dimensional, then we have

\[ \langle va, w \rangle = \langle v, w S^{-1}(a) \rangle \quad (v \in W^\vee,\ w \in W), \tag{8.3} \]

by (3.6) and (2.29). In this section, we give an explicit description of the right $A(w_{N,t,\epsilon})$-module structure of Ω. By (7.5) and (5.5), we obtain the following.


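Lemma 8.1. For each $s \in G^m$ and $p, q \in G^n$ (n ≥ 0), we have

\[ \omega(s)\, e\binom{p}{q} = \sum_{r \in G^m} w_{N,t,\epsilon}\Bigl[ p\, {s \atop r}\, q \Bigr]\, \omega(r). \tag{8.4} \]

In particular, we have

\[ \Omega(\lambda, \mu)\, e\binom{p}{q} \subset \delta_{\lambda,s(p)}\, \delta_{\mu,s(q)}\, \Omega(r(p), r(q)). \tag{8.5} \]

Lemma 8.2. Let (λ, µ) be an element of $B_m$ (m > 0) and $p = (\lambda \mid i_1, \ldots, i_m)$ an element of $G^m_{\lambda\mu}$. Define the set I and C(λ | k, l) ∈ C (k ≠ l) by $I = \{i_1, \ldots, i_m\}$ and

\[ C(\lambda \mid k, l) = \frac{[d(\lambda \mid k, l) + 1]}{[d(\lambda \mid k, l)]} \tag{8.6} \]

respectively. Then for each $(\lambda \mid i), (\mu \mid j) \in G^1$, we have:

\[ \omega(p)\, e\binom{\lambda \mid i}{\mu \mid j} = (-\zeta)^{-m}\, t^{-d(\lambda \mid i,j)}\, \frac{1}{[d(\lambda \mid i, j)]} \prod_{k \in I \setminus \{i\}} C(\lambda \mid i, k)\ \omega(\lambda + \hat{i} \mid i_2, \ldots, i_m, j) \quad (i = i_1,\ j \notin I), \tag{8.7} \]

\[ \omega(p)\, e\binom{\lambda \mid i}{\mu \mid j} = -\delta_{ij}\, (-\zeta)^{-m}\, t \prod_{k \in I \setminus \{i\}} C(\lambda \mid i, k)\ \omega_m(\lambda + \hat{i} \mid i_2, \ldots, i_m, i) \quad (i = i_1,\ j \in I), \tag{8.8} \]

\[ \omega_m(p)\, e\binom{\lambda \mid i}{\mu \mid j} = \delta_{ij}\, (\epsilon \zeta^{-1})^m \prod_{k \in I} C(\lambda \mid i, k)\ \omega_m(\lambda + \hat{i} \mid i_1, \ldots, i_m) \quad (i, j \notin I), \tag{8.9} \]

\[ \omega_m(p)\, e\binom{\lambda \mid i}{\mu \mid j} = 0 \quad (i \notin I,\ j \in I). \tag{8.10} \]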

Proof. These formulas are proved by induction on m in a similar manner. Here we give the proof of (8.7). Since $\Omega^m$ is a quotient module of $\mathbb{C}G^{m-1} \bar{\otimes} \mathbb{C}G^1$, the left-hand side of (8.7) is rewritten as $\sum_q A_q B_q$ with

\[ A_q = \omega_{m-1}(\lambda, \nu)\, e\binom{\lambda \mid i}{\nu \mid q}, \qquad B_q = \omega_1(\nu, \mu)\, e\binom{\nu \mid q}{\mu \mid j} \tag{8.11} \]

by (8.2), where $\nu = \lambda + \hat{i}_1 + \cdots + \hat{i}_{m-1}$ and the summation is taken over all 1 ≤ q ≤ N such that $(\nu \mid q) \in G^1$. Since $(\nu + \hat{q}, \mu + \hat{j}) \in G^1$ only if $q = i_m$ or j, we have

\[ \omega_m(\lambda, \mu)\, e\binom{\lambda \mid i}{\mu \mid j} = \begin{cases} A_{i_m} B_{i_m} + A_j B_j & (\nu \mid j) \in G^1 \\ A_{i_m} B_{i_m} & \text{otherwise.} \end{cases} \tag{8.12} \]

Using the inductive assumption, we see that the right-hand side of (8.12) equals

\[ (-\zeta)^{-m}\, t^{-d(i,j)} \Bigl\{ \frac{1}{[d(i, i_m)][d(i_m, j)]} + \frac{[d(i_m, j) - 1]}{[d(i, j)][d(i_m, j)]} \Bigr\} \prod_{n=2}^{m-1} C(\lambda \mid i, i_n)\ \omega_m(\lambda + \hat{i}, \mu + \hat{j}) \tag{8.13} \]

if $(\nu \mid j) \in G^1$, where $d(k,l) = d(\lambda \mid k,l)$. Applying $[a+b+1] + [a][b] = [a+1][b+1]$ to $a = d(i,i_m)$ and $b = d(i_m,j) - 1$, we see that (8.13) equals the right-hand side of (8.7) (even if $(\nu \mid j) \notin G^1$). Next suppose that $(\nu \mid j) \notin G^1$. It suffices to verify that the second term in the parentheses of (8.13) is zero. In case $j = 1$, we obtain $\nu_1 = L$. Using this together with $1 \notin I$, we see that $\lambda_1 = L$. On the other hand, since $(\mu \mid 1) \in G^1$, we have $L - 1 \ge \mu_1 = L - \delta_{i_m N}$. Hence, $i_m = N$ and $d(i_m,j) - 1 = -L - N$. In case $j > 1$, we obtain $d(i_m,j) - 1 = 0$ in a similar manner. Thus we complete the proof of (8.7). □

The following lemma is frequently used in the sequel.

Lemma 8.3. As a right $A(w_{N,t,\epsilon})$-module, $\mathbb{C}G^1 \cong \Omega^1$ is irreducible. Hence $\mathbb{C}G^1$ is also irreducible as a left $A(w_{N,t,\epsilon})^*$-module.

Proof. Let $W$ be a non-zero submodule of $\mathbb{C}G^1$. Since $W = \bigoplus_{\lambda\mu} W \mathring{e}_\lambda e_\mu$ and $W \mathring{e}_\lambda e_\mu \subset \mathbb{C}G^1_{\lambda\mu}$, we have $s_0 \in W$ for some $s_0 = (\lambda \mid i) \in G^1$. To show $\mathbb{C}G^1 = s_0 A(w_{N,t,\epsilon})$, we introduce the oriented graph $H$ determined by $H^0 = G^1$ and

$$\operatorname{card} H^1_{sr} = \begin{cases} 1 & s\, e\binom{p}{q} \in \mathbb{C}^\times r \ \ (\exists\, p, q \in G^1) \\ 0 & \text{otherwise.} \end{cases} \tag{8.14}$$

It suffices to show that $\langle H \rangle_{s_0 r} \ne \emptyset$ for every $r \in G^1$, where $\langle H \rangle_{sr} = \bigcup_m H^m_{sr}$. We note that

$$H^1_{(\mu \mid j)\,(\mu+\hat{j} \mid k)} \ne \emptyset \quad \text{if } (\mu \mid j),\ (\mu+\hat{j} \mid k) \in G^1 \text{ and } j \ne k, \tag{8.15}$$

$$H^1_{(\mu \mid j)\,(\mu+\hat{k} \mid j)} \ne \emptyset \quad \text{if } (\mu \mid j),\ (\mu+\hat{k} \mid j) \in G^1 \text{ and } j \ne k, \tag{8.16}$$

$$H^1_{(\mu \mid j)\,(\mu+\hat{j} \mid j)} \ne \emptyset \quad \text{if } (\mu \mid j,j) \in G^2. \tag{8.17}$$

Using (8.16), we obtain $\langle H \rangle_{s_0 s_1} \ne \emptyset$, where $s_1 = (L\Lambda_{i-1} + \sum_{k=i}^{N-1} \lambda_k \hat{k} \mid i)$. Suppose $i \ne N$. Using (8.17), and then using (8.16), we obtain $\langle H \rangle_{s_1 s_2} \ne \emptyset$, where $s_2 = ((L-1)\Lambda_{N-1} + \Lambda_{i-1} \mid i)$. Using (8.15) and (8.16) respectively, we obtain $\langle H \rangle_{s_2 s_3}, \langle H \rangle_{s_4 (0|1)} \ne \emptyset$ and $\langle H \rangle_{s_3 s_4} \ne \emptyset$ respectively, where $s_3 = ((L-1)\Lambda_{N-1} + \Lambda_{N-2} \mid N-1)$ and $s_4 = (\Lambda_{N-2} \mid N-1)$. Therefore we obtain $\langle H \rangle_{s_0 (0|1)} \ne \emptyset$ if $i \ne N$. By similar consideration, we also obtain $\langle H \rangle_{s_0 (0|1)} \ne \emptyset$ in case $i = N$, and also $\langle H \rangle_{(0|1) r} \ne \emptyset$ for every $r \in G^1$. Thus, we have verified the first assertion. The second assertion is obvious since the image of $a \mapsto R^+(\ , a)$ is a subalgebra of $A(w_{N,t,\epsilon})^*$. □
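The q-integer identity used in the proof of (8.7) can be checked numerically. The sketch below assumes the standard convention $[n] = (t^n - t^{-n})/(t - t^{-1})$; the helper names are ours and do not come from the paper. It also confirms that $[-L-N]$ vanishes at $t = \exp(\pi i/(N+L))$, which is what makes the second term of (8.13) drop out in the case $j = 1$.

```python
import cmath


def qint(n, t):
    """q-integer [n] = (t^n - t^-n) / (t - t^-1); assumed convention."""
    return (t**n - t**(-n)) / (t - 1 / t)


def identity_defect(a, b, t):
    """Return [a+b+1] + [a][b] - [a+1][b+1], which should vanish."""
    return (qint(a + b + 1, t) + qint(a, t) * qint(b, t)
            - qint(a + 1, t) * qint(b + 1, t))


def root_of_unity(N, L):
    """t = exp(pi*i/(N+L)), a primitive 2(N+L)th root of unity."""
    return cmath.exp(1j * cmath.pi / (N + L))


# Example values (N = 3, L = 4 are illustrative choices only).
t_example = root_of_unity(3, 4)
defect_example = abs(identity_defect(2, 5, t_example))
vanishing_example = abs(qint(-(3 + 4), t_example))
```

Both `defect_example` and `vanishing_example` should be zero up to floating-point rounding.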


Now we begin to prove Lemma 7.3. By (8.5) and Lemma 8.2, we have λ|i ˆ ¯ + i). ω(ν)e ¯ = N−1 ζ −N tδij δνλ δνµ ω(ν µ|j

(8.18)

1 → CG1 ; ω(s(p)) ¯ ¯ ⊗ p 7→ Using (8.2) and this equality, we see that both N ⊗CG ¯ N → CG1 ; p ⊗ ω(s(p)) ¯ 7 → p ⊗ ω(r(p)) ¯ are isomorphisms of p ⊗ ω(r(p)) ¯ and CG1 ⊗ right A(wN,t, )-modules. Hence, by Lemma 8.3 and Schur’s Lemma, we have

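$$c_{N,1}\bigl(\bar\omega(s(p)) \otimes p\bigr) = \vartheta\, p \otimes \bar\omega(r(p)) \qquad (p \in G^1) \tag{8.19}$$

for some constant $\vartheta$. We will prove $\vartheta = \epsilon^{N-1} \zeta^{-N} t$ in Sect. 12.

9. Transposes and Complex Conjugates

The following proposition is an immediate consequence of the following reflection symmetry:

$$w_{N,t,\epsilon}\begin{bmatrix} & p & \\ r & & q \\ & s & \end{bmatrix} = \left(\frac{\kappa(r \cdot s)}{\kappa(p \cdot q)}\right)^{2} w_{N,t,\epsilon}\begin{bmatrix} & r & \\ p & & s \\ & q & \end{bmatrix}, \tag{9.1}$$

where $\kappa$ is as in (6.11).

Proposition 9.1. There exists an algebra-anticoalgebra map $A(w_{N,t,\epsilon}) \to A(w_{N,t,\epsilon})$; $a \mapsto a^T$ given by

$$e\binom{p}{q}^{T} = \left(\frac{\kappa(p)}{\kappa(q)}\right)^{2} e\binom{q}{p} \qquad (p, q \in G^m,\ m \ge 0). \tag{9.2}$$

Moreover it satisfies $(a^T)^T = a$ and

$$R^{\pm}\bigl(a^T, b^T\bigr) = R^{\pm}(b, a) \tag{9.3}$$

for each $a, b \in A(w_{N,t,\epsilon})$ and $\zeta \in \mathbb{C}^\times$.

The following proposition is needed to construct the “cofactor matrix”.

Proposition 9.2. The element $\det$ satisfies $\det^T = \det$. Hence, $T$ induces an algebra-anticoalgebra involution of $S(A_{N-1}; t)_\epsilon$, which satisfies (9.3).

Proof. Since $\det^T$ is a group-like element of $A(w_{N,t,\epsilon})$, we have

$$\det{}^T = g \det; \qquad g = \sum_{\lambda,\mu \in V} \frac{c(\lambda)}{c(\mu)}\, \mathring{e}_\lambda e_\mu \tag{9.4}$$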

by (7.24), where c(λ) (λ ∈ V) denotes some nonzero constant. Since both det and detT are central and det is not a zero divisor, g is central. Hence, by Lemma 8.3 and Schur’s lemma, we have p g = c p (p ∈ G1 ) for some c ∈ C. Hence we have c(λ) = c|λ| c(0) T for each λ ∈ V. In order to prove c = 1, we compute det 01ˆ in two ways. Using † p p (p ∈ G2 [ & ], q ∈ G2 [ ↓ ]), (9.5) [d(p) − 1] e = −[d(p) + 1] e q q


we obtain


k X (−)L(pi ) e i=1

pi 0 | 1, . . . , N

= (−)L(pk ) [k]2 e

pk 0 | 1, . . . , N

(9.6)

by induction on k, where pi = (1ˆ | 2, . . . , i, 1, i + 1, . . . , N). Substituting k = N in this equality, we get T 1ˆ 0 ˆ = c|1|−|0| det det ˆ1 0 1ˆ | 2, 3, . . . , N, 1 . = (−)N(N−1)/2+L(pN ) c[N ] e 0 | 1, 2, . . . , N − 1, N On the other hand, using (7.9) and (9.2), we see that the right-hand side of the above T t equality agrees with c det 01ˆ . This completes the proof of the proposition. u πi i ), or − exp(± Nπ+L ) with N + L ∈ 2Z. Then, we have Next suppose t = exp(± N+L m κ(p) > 0 for each p ∈ G (m > 0). Moreover, for each ζ with |ζ | = 1, the Boltzmann weight wN,t satisfies h p i κ(r · s) 2 r −1 wN,t, p s . (9.7) r q = wN,t, κ(p · q) q s

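Similarly to Proposition 9.1 and Proposition 9.2, we obtain the following.

Proposition 9.3. Set $t = \pm\exp(\pm\frac{\pi i}{N+L})$ if $N + L \in 2\mathbb{Z}$, and $t = \exp(\pm\frac{\pi i}{N+L})$ if $N + L \in 1 + 2\mathbb{Z}$. Then for each solution $\zeta$ of (7.13), both $S(A_{N-1}; t)_{\epsilon,\zeta}$ and $A(w_{N,t,\epsilon,\zeta})$ are compact CQT $V$-face algebras of unitary type with costar structure

$$e\binom{p}{q}^{\times} = \left(\frac{\kappa(p)}{\kappa(q)}\right)^{2} e\binom{q}{p} \qquad (p, q \in G^m,\ m \ge 0). \tag{9.8}$$

In fact, both $S(A_{N-1}; t)_\epsilon$ and $A(w_{N,t,\epsilon})$ are spanned by unitary matrix corepresentations $[e_u\binom{p}{q}]_{p,q \in G^m}$ $(m \ge 0)$ given by

$$e_u\binom{p}{q} = \frac{\kappa(q)}{\kappa(p)}\, e\binom{p}{q} \qquad (p, q \in G^m,\ m \ge 0). \tag{9.9}$$

10. Antipodes and Ribbon Functionals

Lemma 10.1. The $V$-face algebra $S(A_{N-1}; t)_\epsilon$ has an antipode given by

$$S\Bigl(e\binom{\lambda \mid i}{\mu \mid j}\Bigr) = (-)^{i+j}\, \frac{D(\lambda + \hat{i})}{D(\mu + \hat{j})} \sum_{a} (-)^{L(a)+L(b)}\, e\binom{a}{b}, \tag{10.1}$$

where $b$ denotes an arbitrary element of $G^{N-1}_{\lambda+\hat{i},\,\lambda}$ and the summation is taken over all $a \in G^{N-1}_{\mu+\hat{j},\,\mu}$. Moreover, we have

$$S^2\Bigl(e\binom{p}{q}\Bigr) = \frac{D(r(p))\,D(s(q))}{D(s(p))\,D(r(q))}\, e\binom{p}{q} \qquad (p, q \in G^m,\ m \ge 0). \tag{10.2}$$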


Proof (cf. [38], [12]). Let Y(λ | i) (µ | j ) denote the right-hand side of (10.1) viewed as an element of A(wN,t, ). By [17, §7], it suffices to verify that X X ◦ Ypr Xrq = δpq er(p) det, Xpr Yrq = δpq es(p) det, (10.3) r∈G1

r∈G1

¯ p) ˜ p ∈ G1 of N −1 by where Xpq = e qp (p, q ∈ G1 ). We define a basis ω( ˆ N−1 (λ + i, ˆ λ), where p˜ = (µ, λ) ∈ BN −1 for ˆ λ) = (−)i−1 D(λ + i)ω ω(λ ¯ + i, 1 N −1 is given by ω( ˜ 7→ ¯ q) p P= (λ, µ) ∈ G . Then, the coaction of A(wN,t, ) on ˜ ⊗ Yqp and the multiplication of gives maps ¯ p) p∈G1 ω( ¯ 1 → N ; ω( ¯ q) ˜ ⊗ ω(p) 7→ δpq ω(r(p)), ¯ N−1 ⊗ D(r(p)) ¯ N−1 → N ; ω(p) ⊗ ω( ˜ 7 → δpq (−)N −1 ω(s(p)). ¯ ¯ q) 1 ⊗ D(s(p)) Since these maps are compatible with the coaction of A(wN,t, ), we have the first formula of (10.3) and X Wrp Yqr = δpq es(p) det, (10.4) r

where Wpq ∈ A(wN,t, ) denotes the right-hand side of (10.2). Applying T to this equality, we obtain X

◦

Xpr Zrq = δpq es(p) det; Zpq =

r

D(r(p))D(s(q))κ(p)2 T Y . D(s(p))D(r(q))κ(q)2 qp

(10.5)

Computing $\sum_{rs} Y_{pr} X_{rs} Z_{sq}$ in two ways, we obtain $Y_{pq} = Z_{pq}$. This proves the second equality of (10.3). Finally, computing $\sum_{rs} W_{rq}\, S(X_{sr})\, S^2(X_{ps})$ in the algebra $S(A_{N-1}; t)_\epsilon$ in two ways, we obtain (10.2). □

Proposition 10.2. For each $t$ and $\zeta$, $S(A_{N-1}; t)_{\epsilon,\zeta}$ becomes a coribbon Hopf $V$-face algebra, whose braiding, antipode $S$ and modified ribbon functional $M = M_1$ are given by (5.5), (10.1) and the following formula:

$$M\Bigl(e\binom{p}{q}\Bigr) = \delta_{pq}\, \frac{D(r(p))}{D(s(p))} \qquad (p, q \in G^m,\ m \ge 0). \tag{10.6}$$

Moreover, we have

$$\dim_q(L_\lambda) = D(\lambda) \tag{10.7}$$

for each $\lambda \in V$. When $N$ is even, there exists another ribbon functional $M_{-1}$ given by

$$M_{-1}\Bigl(e\binom{p}{q}\Bigr) = \delta_{pq}\, (-1)^{|r(p)|-|s(p)|}\, \frac{D(r(p))}{D(s(p))} \qquad (p, q \in G^m,\ m \ge 0). \tag{10.8}$$

The quantum dimension of the corresponding ribbon category is given by

$$\dim_q^{-1}(L_\lambda) = (-1)^{|\lambda|} D(\lambda). \tag{10.9}$$


Proof. Let $M \in H^*$ be as in (10.6). Using (10.2), we see that $M$ satisfies (2.43). Hence, it suffices to verify that

$$(U_1 U_2)\Bigl(e\binom{p}{q}\Bigr) = M^2\Bigl(e\binom{p}{q}\Bigr) \qquad (p, q \in G^m) \tag{10.10}$$

for each $m \ge 0$. By (2.34) and (2.43), $U_1 M^{-1}$ is a central element of $A(w_{N,t,\epsilon})^*$. Hence by Lemma 8.3 and Schur's lemma, we have $U_1 p = \vartheta M p$ $(p \in G^1)$ for some constant $\vartheta$. Using (2.31), (2.29) and (10.2), we compute

$$U_1^{-1}\Bigl(e\binom{0 \mid 1}{0 \mid 1}\Bigr) = \sum_{r \in G^1} R^+\Bigl(S^2\Bigl(e\binom{r}{0 \mid 1}\Bigr),\ e\binom{0 \mid 1}{r}\Bigr) \tag{10.11}$$

$$= \sum_{\nu = 2\hat{1},\ \hat{1}+\hat{2}} \frac{D(0)\,D(\nu)}{D(\hat{1})^2}\; w_{N,t}\begin{bmatrix} 0 & \hat{1} \\ \hat{1} & \nu \end{bmatrix} \tag{10.12}$$

$$= \zeta^{-1} t^{N}\, \frac{1}{[N]}. \tag{10.13}$$

This shows that $\vartheta = \zeta t^{-N}$, and similarly, we obtain $U_2\, p = \vartheta^{-1} M p$ $(p \in G^1)$. This proves (10.10) for $m = 1$. For $m \ge 2$, (10.10) follows from (10.10) for $m = 1$ by induction on $m$, using the fact that both $M^2$ and $U_1 U_2$ are group-like (cf. (2.32), (2.33)). The second assertion follows from (10.6) and Lemma 3.1. □

We denote by $S(A_{N-1}; t)^{\iota}_{\epsilon,\zeta}$ the coribbon Hopf $V$-face algebra $(S(A_{N-1}; t)_{\epsilon,\zeta}, M_\iota)$, where $\iota = \pm 1$ if $N$ is even and $\iota = 1$ if $N$ is odd.

Lemma 10.3. If $t = \exp(\pm\frac{\pi i}{N+L})$, or $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N$ is odd, then we have $D(\lambda) > 0$ for every $\lambda \in V$. If $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N$ is even, then we have $(-1)^{|\lambda|} D(\lambda) > 0$.

Proof. Straightforward. □

Proposition 10.4. When $t = \exp(\pm\frac{\pi i}{N+L})$, or $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N, L \in 1 + 2\mathbb{Z}$, the Woronowicz functional $Q$ of $S(A_{N-1}; t)_\epsilon$ is given by (10.6). While when $t = -\exp(\pm\frac{\pi i}{N+L})$ and $N, L \in 2\mathbb{Z}$, $Q$ is given by (10.8).

Proof. We will prove the first assertion. We set

$$\mathbf{Q} = \Bigl[Q\Bigl(e_u\binom{p}{q}\Bigr)\Bigr]_{p,q \in G^1}, \qquad \mathbf{M} = \Bigl[M\Bigl(e_u\binom{p}{q}\Bigr)\Bigr]_{p,q \in G^1}, \tag{10.14}$$

where $Q$ denotes the Woronowicz functional of $S(A_{N-1}; t)_\epsilon$ and $e_u\binom{p}{q}$ is as in (9.9). By (2.43), (4.6) and Lemma 8.3, we have $\mathbf{M} = \vartheta \mathbf{Q}$ for some $\vartheta$. Since $\mathbf{M}$ is positive by (10.6) and the lemma above, we have $\vartheta > 0$. Since the quantum dimension satisfies $\dim_q L = \dim_q L^\vee$ for every $L$, we have $\operatorname{Tr}(\mathbf{M}) = \operatorname{Tr}(\mathbf{M}^{-1})$. By (4.7), this proves $\mathbf{M} = \mathbf{Q}$. Now the assertion $M(e_u\binom{p}{q}) = Q(e_u\binom{p}{q})$ $(p, q \in G^m)$ easily follows from the fact that both $M$ and $Q$ are group-like, by induction on $m$. □
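For $N = 2$ the positivity statement of Lemma 10.3 can be illustrated numerically. The sketch below assumes that $V = \{0, 1, \ldots, L\}$, $|\lambda| = i$ and $D(i) = [i+1]$ with $[n] = (t^n - t^{-n})/(t - t^{-1})$; these identifications are the ones suggested by Sect. 13 and 14 and are assumptions here, as are the function names.

```python
import cmath


def qint(n, t):
    """q-integer [n] = (t^n - t^-n) / (t - t^-1); assumed convention."""
    return (t**n - t**(-n)) / (t - 1 / t)


def signed_dims(L, sign):
    """Real parts of sign^i * D(i), i = 0..L, at t = sign * exp(pi*i/(L+2)).

    Assumes N = 2 and D(i) = [i+1]. For sign = +1 this is D(i) itself;
    for sign = -1 it is (-1)^i D(i), the quantity Lemma 10.3 asserts
    to be positive when N is even.
    """
    t = sign * cmath.exp(1j * cmath.pi / (L + 2))
    return [(sign**i * qint(i + 1, t)).real for i in range(L + 1)]


# Illustrative level; any L >= 1 can be used.
dims_plus = signed_dims(6, 1)
dims_minus = signed_dims(6, -1)
```

Under these assumptions both lists should contain only positive numbers, and they should agree entry by entry, since $[n]_{-t} = (-1)^{n-1}[n]_t$.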


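11. The Modular Tensor Category

Let $\mathcal{C}$ be a ribbon category, which is additive over a field $K$. We say that $\mathcal{C}$ is semisimple if there exist a set $V_{\mathcal{C}}$, an involution ${}^\vee : V_{\mathcal{C}} \to V_{\mathcal{C}}$, an element $0 \in V_{\mathcal{C}}$ and simple objects $L^{\mathcal{C}}_\lambda$ $(\lambda \in V_{\mathcal{C}})$ such that every object of $\mathcal{C}$ is isomorphic to a finite direct sum of $L^{\mathcal{C}}_\lambda$'s, and that

$$L^{\mathcal{C}}_0 \cong \mathbf{1}, \qquad (L^{\mathcal{C}}_\lambda)^\vee \cong L^{\mathcal{C}}_{\lambda^\vee}, \tag{11.1}$$

$$\mathcal{C}(L^{\mathcal{C}}_\lambda, L^{\mathcal{C}}_\mu) = \begin{cases} K & (\lambda = \mu) \\ 0 & (\lambda \ne \mu) \end{cases} \tag{11.2}$$

for each $\lambda, \mu \in V_{\mathcal{C}}$, where $\mathbf{1}$ denotes the unit object of $\mathcal{C}$. For a semisimple ribbon category $\mathcal{C}$, we define its fusion rule $N^\nu_{\lambda\mu}$ $(\lambda, \mu, \nu \in V_{\mathcal{C}})$ and S-matrix $S^{\mathcal{C}} = [S^{\mathcal{C}}_{\lambda\mu}]_{\lambda,\mu \in V_{\mathcal{C}}}$ by

$$[L^{\mathcal{C}}_\lambda][L^{\mathcal{C}}_\mu] = \sum_{\nu \in V_{\mathcal{C}}} N^\nu_{\lambda\mu}\, [L^{\mathcal{C}}_\nu], \tag{11.3}$$

$$S^{\mathcal{C}}_{\lambda\mu} = \operatorname{Tr}_q\bigl(c_{L^{\mathcal{C}}_\mu L^{\mathcal{C}}_\lambda} \circ c_{L^{\mathcal{C}}_\lambda L^{\mathcal{C}}_\mu}\bigr). \tag{11.4}$$

By definition, we have

$$S^{\mathcal{C}}_{\lambda 0} = \dim_q L^{\mathcal{C}}_\lambda. \tag{11.5}$$

Since the twist $\theta$ satisfies

$$c_{WV} \circ c_{VW} = \theta_{V \otimes W} \circ (\theta_V \otimes \theta_W)^{-1}, \tag{11.6}$$

$S^{\mathcal{C}}$ satisfies

$$S^{\mathcal{C}}_{\lambda\mu} = \sum_{\nu \in V_{\mathcal{C}}} \frac{\theta_\nu}{\theta_\lambda \theta_\mu}\, N^\nu_{\lambda\mu}\, \dim_q(L^{\mathcal{C}}_\nu), \tag{11.7}$$

where $\theta_\lambda \in K$ is defined by $\theta_{L^{\mathcal{C}}_\lambda} = \theta_\lambda\, \mathrm{id}_{L^{\mathcal{C}}_\lambda}$. Moreover, $S = S^{\mathcal{C}}$ satisfies the following Verlinde's formula (cf. [43,32,40]):

$$S_{\nu,0} \sum_{\xi \in V} N^\xi_{\lambda\mu}\, S_{\xi\nu} = S_{\lambda\nu}\, S_{\mu\nu}, \tag{11.8}$$

where $\lambda$, $\mu$ and $\nu$ denote arbitrary elements of $V = V_{\mathcal{C}}$.

Let $\mathcal{C}$ be a semisimple ribbon category. We say that $\mathcal{C}$ is a modular tensor category (or MTC) if $V_{\mathcal{C}}$ is finite and the matrix $S^{\mathcal{C}}$ is invertible. If, in addition, $\mathcal{C}$ is unitary as a ribbon category, then it is called a unitary MTC. It is known that each (unitary) MTC gives rise to a (unitary) 3-dimensional topological quantum field theory (TQFT), hence, in particular, an invariant of 3-manifolds of Witten–Reshetikhin–Turaev type (cf. V. Turaev [40]).

The most well-known example of MTC is obtained as a certain semisimple quotient $\mathcal{C}(\mathfrak{g}, \kappa)$ of a category of representations of the quantized enveloping algebra $U_q(\mathfrak{g})$ of finite type in the case when $q$ is a root of unity [1,8,28,41]. When $\mathfrak{g} = sl_N$, the simple objects $L^U_\lambda$ of $\mathcal{C}(sl_N, N+L)$ $(L \ge 1)$ are also indexed by the set $V = V_{NL}$ given by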


(6.1) and the fusion rules agree with those of the $SU(N)_L$-WZW model. The quantum dimension and the constant $\theta_\lambda$ for $\mathcal{C}(sl_N, N+L)$ are given by

$$\dim_q(L^U_\lambda) = D(\lambda)_{t_0}, \qquad \theta_\lambda = \zeta_0^{(\lambda | \lambda + 2\rho)^\sim} \tag{11.9}$$

respectively, where $\zeta_0 = \exp(\frac{\pi i}{N(N+L)})$, $t_0 = \zeta_0^N$, $\rho = \Lambda_1 + \cdots + \Lambda_{N-1}$, $(\,|\,)^\sim = N(\,|\,)$ and $(\,|\,)$ denotes the usual inner product of $\mathbb{R}^N$. Moreover, the S-matrix of $\mathcal{C}(sl_N, N+L)$ is given by $S^U = S^1(\zeta_0)$. Here, for each primitive $2N(N+L)$th root $\zeta$ of unity, we define the matrix $S^\iota(\zeta)$ by the following Kac–Peterson formula (cf. [28]):

$$S^\iota(\zeta)_{\lambda\mu} = \iota^{|\lambda|+|\mu|}\, \frac{\sum_{w \in S_N} (-1)^{l(w)}\, \zeta^{-2(w(\lambda+\rho) | \mu+\rho)^\sim}}{\sum_{w \in S_N} (-1)^{l(w)}\, \zeta^{-2(w(\rho) | \rho)^\sim}}, \tag{11.10}$$

where $\iota = \pm 1$ if $N \in 2\mathbb{Z}$, $\iota = 1$ if $N \in 1 + 2\mathbb{Z}$ and the action of the symmetric group $S_N$ on $\bigoplus_i \mathbb{C}\hat{i}$ is given by $w\,\hat{i} = \widehat{w(i)}$. Note that $(\lambda | \mu)^\sim \in \mathbb{Z}$ for every $\lambda, \mu \in \bigoplus_i \mathbb{Z}\Lambda_i$.

Lemma 11.1. Let $\zeta$ be a primitive $2N(N+L)$th root of unity and $t = \zeta^N$. Then the matrix $S = S^\iota(\zeta)$ is both symmetric and invertible, and satisfies Verlinde's formula (11.8). Moreover, we have:

$$S^\iota(\zeta)_{\lambda 0} = \iota^{|\lambda|} D(\lambda)_t, \tag{11.11}$$

$$S^\iota(\zeta)_{\Lambda_q \Lambda_r} = \iota^{q+r} (\zeta t)^{-2qr} \sum_s t^{2s(q+r-s+1)}\, D(\Lambda_{q+r-s} + \Lambda_s)_t \tag{11.12}$$

for each $\lambda \in V_{NL}$ and $0 < r \le q < N$, where the summation in (11.12) is taken over $\max\{0, q+r-N\} \le s \le r$.

Proof. Suppose $\zeta = \zeta_0$. Then the formula (11.12) follows from (11.7) for $\mathcal{C}(sl_N, N+L)$, (7.16) and (11.9). In this case, the other assertions also follow from the results for $\mathcal{C}(sl_N, N+L)$. For other $\zeta$, the assertions follow from Galois theory for $\mathbb{Q}(\zeta_0)/\mathbb{Q}$. □

Lemma 11.2. Let $N^\nu_{\lambda\mu}$ be the fusion rules of $SU(N)_L$-WZW models and let $S$ and $S'$ be symmetric matrices whose entries are indexed by $V = V_{NL}$. If these satisfy Verlinde's formula (11.8), $S_{\lambda 0} = S'_{\lambda 0} \ne 0$ $(\lambda \in V)$ and

$$S_{\Lambda_q \Lambda_r} = S'_{\Lambda_q \Lambda_r} \qquad (0 < r \le q < N), \tag{11.13}$$

then we have $S = S'$.

Proof. We recall that there exists an algebra surjection from $\mathbb{Z}[x_1, \ldots, x_N]^{S_N}$ onto $F$ (cf. Theorem 7.5 (2)), which sends the Schur function $s_{(\lambda_1, \ldots, \lambda_N)}$ (see e.g. [31]) to $[L_\lambda]$ for each $\lambda \in V$ (see e.g. [9]). For each $\xi = (\xi_1, \ldots, \xi_m)$ such that $N > \xi_1 \ge \ldots \ge \xi_m > 0$, we define $E_\xi \in F$ to be the image of the elementary symmetric function $e_\xi$ via this map, that is $E_\xi = [\Omega^{\xi_1}] \cdots [\Omega^{\xi_m}]$. Since $\{e_\xi\}$ is a basis of $\mathbb{Z}[x_1, \ldots, x_N]^{S_N}$, $\{E_\xi\}$ spans $F$. We define the symmetric bilinear forms $S$ and $S'$ on $F$ by setting

$$S\bigl([L_\lambda], [L_\mu]\bigr) = S_{\lambda\mu}, \qquad S'\bigl([L_\lambda], [L_\mu]\bigr) = S'_{\lambda\mu}. \tag{11.14}$$

Then, Verlinde's formula for $S$ is rewritten as

$$S\Bigl(ab, \frac{[L_\nu]}{S_{\nu 0}}\Bigr) = S\Bigl(a, \frac{[L_\nu]}{S_{\nu 0}}\Bigr)\, S\Bigl(b, \frac{[L_\nu]}{S_{\nu 0}}\Bigr), \tag{11.15}$$


where $\nu \in V$ and $a, b \in F$. By (11.13) and this formula, we obtain

$$S(E_\xi, [\Omega^r]) = S'(E_\xi, [\Omega^r]) \tag{11.16}$$

for each $\xi$ and $0 \le r < N$, or equivalently, we obtain

$$S([L_\lambda], [\Omega^r]) = S'([L_\lambda], [\Omega^r]) \tag{11.17}$$

for each $\lambda \in V$ and $0 \le r < N$. Repeating a similar consideration, we conclude that $S_{\lambda\mu} = S'_{\lambda\mu}$ holds for every $\lambda, \mu \in V$. □
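Verlinde's formula (11.8), on which Lemma 11.2 rests, can be sanity-checked numerically in the simplest case $N = 2$. The sketch below is only an illustration under stated assumptions: it uses the $SU(2)_L$ fusion rule in the form given later in (13.3)–(13.4), and it assumes the standard sine form of the $SU(2)_L$ S-matrix, normalised so that $S_{i0} = [i+1]$. The function names are ours.

```python
import math


def fusion(i, j, k, L):
    """SU(2)_L fusion coefficient N_{ij}^k, following (13.3)-(13.4)."""
    ok = (abs(i - j) <= k <= i + j
          and (i + j + k) % 2 == 0
          and i + j + k <= 2 * L)
    return 1 if ok else 0


def s_matrix(L):
    """Assumed S-matrix: S_ij = sin(pi(i+1)(j+1)/(L+2)) / sin(pi/(L+2))."""
    n = L + 2
    return [[math.sin(math.pi * (i + 1) * (j + 1) / n) / math.sin(math.pi / n)
             for j in range(L + 1)] for i in range(L + 1)]


def verlinde_defect(L):
    """Largest violation of S_{nu,0} * sum_xi N S_{xi,nu} = S_{lam,nu} S_{mu,nu}."""
    S = s_matrix(L)
    worst = 0.0
    for lam in range(L + 1):
        for mu in range(L + 1):
            for nu in range(L + 1):
                lhs = S[nu][0] * sum(fusion(lam, mu, xi, L) * S[xi][nu]
                                     for xi in range(L + 1))
                rhs = S[lam][nu] * S[mu][nu]
                worst = max(worst, abs(lhs - rhs))
    return worst
```

Calling `verlinde_defect(L)` for small levels should return a value at the level of floating-point rounding.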

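For $S = S(A_{N-1}; t)^{\iota}_{\epsilon,\zeta}$, we denote the semisimple ribbon category $\mathrm{Com}^f_S$ (resp. unitary ribbon category $\mathrm{Com}^{fu}_S$) by $\mathcal{C}_S(A_{N-1}, t)^{\iota}_{\epsilon,\zeta}$ (resp. $\mathcal{C}^u_S(A_{N-1}, t)_{\epsilon,\zeta}$).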

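Theorem 11.3. Let $N \ge 2$ and $L \ge 1$ be integers, $\iota = \pm 1$ if $N \in 2\mathbb{Z}$, $\iota = 1$ if $N \in 1 + 2\mathbb{Z}$ and $\epsilon = \pm 1$. Let $\zeta$ be a primitive $2N(N+L)$th root of unity.
(1) Suppose $N$ is odd or $\epsilon = 1$ and set $t = \zeta^N$. Then the category $\mathcal{C}_S(A_{N-1}, t)^{\iota}_{\epsilon,\zeta}$ is a modular tensor category with S-matrix $S^\iota(\zeta)$.
(2) Suppose $N$ is even, $\epsilon = -1$ and $t := -\zeta^N$ is a primitive $2(N+L)$th root of unity. (Note that this implies $L \in 2\mathbb{Z}$.) Then the category $\mathcal{C}_S(A_{N-1}, t)^{\iota}_{-1,\zeta}$ is a modular tensor category with S-matrix $S^{-\iota}(\zeta)$.

Proof. We will prove Part (2). By (11.5), (10.7), (10.9), (11.11) and $D(\lambda)_{-t} = (-1)^{|\lambda|} D(\lambda)_t$, we have $S_{\lambda 0} = S^{-\iota}(\zeta)_{\lambda 0}$. Hence by the lemma above, it suffices to show that $S_{\Lambda_q \Lambda_r} = S^{-\iota}(\zeta)_{\Lambda_q \Lambda_r}$ for each $0 < r \le q < N$. As we will see in the next section, the action of $c_{\Omega^r \Omega^q} \circ c_{\Omega^q \Omega^r}$ on the one-dimensional space $(\Omega^q \bar\otimes \Omega^r)(0, \Lambda_{q+r-s} + \Lambda_s) = \mathbb{C}\,\omega_q(0, \Lambda_q) \otimes \omega_r(\Lambda_q, \Lambda_{q+r-s} + \Lambda_s)$ is given by the scalar $(\zeta t)^{-2qr} t^{2s(q+r-s+1)}$. Hence, by (3.14), we obtain

$$S_{\Lambda_q \Lambda_r} = \sum_s \operatorname{Tr}\Bigl(M_\iota \circ c_{\Omega^r \Omega^q} \circ c_{\Omega^q \Omega^r}\big|_{(\Omega^q \bar\otimes \Omega^r)(0, \Lambda_{q+r-s} + \Lambda_s)}\Bigr) = (-\iota)^{q+r} \sum_s (\zeta t')^{-2qr}\, t'^{\,2s(q+r-s+1)}\, D(\Lambda_{q+r-s} + \Lambda_s)_{t'}, \tag{11.18}$$

where $t' = \zeta^N$ and the summation is taken over $\max\{0, q+r-N\} \le s \le r$. (The sign exponent is $q + r = |\Lambda_{q+r-s} + \Lambda_s|$, as required for agreement with (11.12).) Since the right-hand side of (11.18) equals $S^{-\iota}(\zeta)_{\Lambda_q \Lambda_r}$ by (11.12), this completes the proof of Part (2). □

Corollary 11.4. Let $N \ge 2$ and $L \ge 1$ be integers, $\epsilon = \pm 1$ and $t = \exp(\pm\frac{\pi i}{N+L})$. If $N + L \in 1 + 2\mathbb{Z}$, $\mathcal{C}^u_S(A_{N-1}, t)_{\epsilon,\zeta}$ is a unitary MTC provided that $\epsilon = 1$ or $N$ is odd, where $\zeta$ denotes an arbitrary primitive $2N(N+L)$th root of unity such that $\zeta^N = t$. If $N + L \in 2\mathbb{Z}$, $\mathcal{C}^u_S(A_{N-1}, \pm t)_{\epsilon,\zeta}$ is a unitary MTC for each primitive $2N(N+L)$th root $\zeta$ of unity such that $\zeta^N = \pm\epsilon^{N-1} t$.

Remark 11.1. (1) When $N \in 1 + 2\mathbb{Z}$, $S(A_{N-1}, t)^{\iota}_{-1,\zeta}$ is isomorphic to a 2-cocycle deformation of $S(A_{N-1}, t)^{\iota}_{1,\zeta}$. Hence $\mathcal{C}_S(A_{N-1}, t)^{\iota}_{1,\zeta}$ and $\mathcal{C}_S(A_{N-1}, t)^{\iota}_{-1,\zeta}$ are equivalent.
(2) For $\mathfrak{g} = so_N$ and $sp_N$, a category-theoretic construction of unitary MTC's related to $\mathcal{C}(\mathfrak{g}, \kappa)$ is given by Turaev and Wenzl [42].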


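12. Braidings on $\Omega$

In this section, we give some explicit calculation of the braiding $c_{q,r} := c_{\Omega^q \Omega^r}$ in order to complete the proof of Lemma 7.3 and Theorem 11.3. Since the braiding is a natural transformation and the multiplication of $\Omega$ gives an $S(A_{N-1}; t)_\epsilon$-comodule map $m_{q,r} : \Omega^q \bar\otimes \Omega^r \to \Omega^{q+r}$, $c_{q,r}$ satisfies

$$c_{q+q',r} \circ (m_{q,q'} \bar\otimes \mathrm{id}_r) = (\mathrm{id}_r \bar\otimes m_{q,q'}) \circ (c_{q,r} \bar\otimes \mathrm{id}_{q'}) \circ (\mathrm{id}_q \bar\otimes c_{q',r}), \tag{12.1}$$

$$c_{q,r+r'} \circ (\mathrm{id}_q \bar\otimes m_{r,r'}) = (m_{r,r'} \bar\otimes \mathrm{id}_q) \circ (\mathrm{id}_r \bar\otimes c_{q,r'}) \circ (c_{q,r} \bar\otimes \mathrm{id}_{r'}). \tag{12.2}$$

Lemma 12.1. For each $1 \le p < N$ and $1 \le q \le N - p$, we have

$$c_{q,1}\bigl(\omega(\Lambda_p \mid p+1, \ldots, p+q) \otimes \omega(\Lambda_{p+q} \mid 1)\bigr) = -(-\zeta)^{-q} t^{p+1} \frac{[q]}{[p+1]}\, \omega(\Lambda_p \mid p+1) \otimes \omega(\Lambda_{p+1} \mid p+2, \ldots, p+q, 1) + (-\zeta)^{-q} (-)^q \frac{[p+q+1]}{[p+1]}\, \omega(\Lambda_p \mid 1) \otimes \omega(\Lambda_p + \hat{1} \mid p+1, \ldots, p+q). \tag{12.3}$$

Proof. Suppose (12.3) is valid for each $p$ and for some $q < N - p$. Then, by (12.1), we obtain

$$c_{1+q,1}\bigl(\omega(\Lambda_p \mid p+1, \ldots, p+q+1) \otimes \omega(\Lambda_{p+q+1} \mid 1)\bigr) = (\mathrm{id}_1 \bar\otimes m_{1,q}) \circ (c_{1,1} \bar\otimes \mathrm{id}_q) \Bigl[ \omega(\Lambda_p \mid p+1) \otimes \Bigl( -(-\zeta)^{-q} t^{p+2} \frac{[q]}{[p+2]}\, \omega(\Lambda_{p+1} \mid p+2) \otimes \omega(\Lambda_{p+2} \mid p+3, \ldots, p+q+1, 1) + (-\zeta)^{-q} (-)^q \frac{[p+q+2]}{[p+2]}\, \omega(\Lambda_{p+1} \mid 1) \otimes \omega(\Lambda_{p+1} + \hat{1} \mid p+2, \ldots, p+q+1) \Bigr) \Bigr] \tag{12.4}$$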

=

[p + q + 2] [q] 3p 3p+1 3p 3p+1 wN,t wN,t + −t 3p+1 3p+2 3p+1 3p+1 + 1ˆ [p + 2] [p + 2] −q (−ζ ) ω(3p |p + 1) ⊗ ω 3p+1 | p + 2, . . . , p + q + 1, 1 3p 3p+1 −q q [p + q + 2] wN,t + (−ζ ) (−) 3p + 1ˆ 3p+1 + 1ˆ [p + 2] ω 3p | 1 ⊗ ω 3p + 1ˆ | p + 1, . . . , p + q + 1 . p+2

Computing the right-hand side of the above equality, we obtain (12.3) for $q + 1$. □

Using (12.3) for $p = 1$, $q = N - 1$ together with (12.1), we obtain

$$c_{N,1}\bigl(\omega(0 \mid 1, \ldots, N) \otimes \omega(0 \mid 1)\bigr) = -(-\zeta)^{-N}\, t\, [N]\; \omega(0 \mid 1) \otimes \omega(\hat{1} \mid 2, \ldots, N, 1). \tag{12.5}$$

This shows that the constant $\vartheta$ in (8.19) equals $\epsilon^{N-1} \zeta^{-N} t$ and completes the proof of Lemma 7.3.


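Lemma 12.2. We have the following relations:

$$c_{q,1}\bigl(\omega(\Lambda_p \mid p+1, \ldots, p+s, 1, \ldots, q-s) \otimes \omega(\Lambda_{p+s} + \Lambda_{q-s} \mid q-s+1)\bigr) \in -(-\zeta)^{-q} t^{p-q+s+1} \frac{[s]}{[p+1]}\, \omega(\Lambda_p \mid p+1) \otimes \omega(\Lambda_{p+1} \mid p+2, \ldots, p+s, 1, \ldots, q-s+1) + \Omega^1(\Lambda_p \mid 1) \otimes \Omega^q(\Lambda_p + \hat{1},\ \Lambda_{p+s} + \Lambda_{q-s+1})$$

$$(1 \le p < N,\ 1 \le s \le N-p,\ s < q \le p+2s-1), \tag{12.6}$$

$$c_{q,r}\bigl(\omega(\Lambda_p \mid p+1, \ldots, p+q) \otimes \omega(\Lambda_{p+q} \mid 1, \ldots, r)\bigr) \in (-1)^r (-\zeta)^{-qr} t^{pr+r} \frac{[q]!\,[p]!}{[p+r]!\,[q-r]!}\, \omega(\Lambda_p \mid p+1, \ldots, p+r) \otimes \omega(\Lambda_{p+r} \mid p+r+1, \ldots, p+q, 1, \ldots, r) + \sum_{\lambda \ne \Lambda_{p+r}} \Omega^r(\Lambda_p, \lambda) \otimes \Omega^q(\lambda, \Lambda_{p+q} + \Lambda_r)$$

$$(0 \le p \le N-1,\ 0 \le q \le N-p,\ 0 \le r \le q), \tag{12.7}$$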

where $[n]! = [n] \cdots [2][1]$ and $[0]! = 1$.

Proof. The relation (12.6) follows from (12.1), (12.3) and

$$c_{q-s,1}\bigl(\omega(\Lambda_{p+s} \mid 1, \ldots, q-s) \otimes \omega(\Lambda_{p+s} + \Lambda_{q-s} \mid q-s+1)\bigr) = (-\zeta t)^{s-q}\, \omega(\Lambda_{p+s} \mid 1) \otimes \omega(\Lambda_{p+s} + \hat{1} \mid 2, \ldots, q-s+1). \tag{12.8}$$

The relation (12.7) is easily proved by induction on $r$, using (12.2) and (12.6). □

Using (12.7), (12.2) and

$$c_{q,r-s}\bigl(\omega(0 \mid 1, \ldots, q) \otimes \omega(\Lambda_q \mid q+1, \ldots, q+r-s)\bigr) = (-\zeta t)^{qs-qr}\, \omega(0 \mid 1, \ldots, r-s) \otimes \omega(\Lambda_{r-s} \mid r-s+1, \ldots, q+r-s), \tag{12.9}$$

we obtain the following.

Lemma 12.3. For each $0 \le q, r < N$ and $\max\{0, q+r-N\} \le s \le \min\{q, r\}$, we have

$$c_{q,r}\bigl(\omega(0 \mid 1, \ldots, q) \otimes \omega(\Lambda_q \mid q+1, \ldots, q+r-s, 1, \ldots, s)\bigr) = (-1)^s (-\zeta t)^{-qr}\, t^{s(q+r-s+1)}\, \frac{[q]!\,[r-s]!}{[r]!\,[q-s]!}\; \omega(0 \mid 1, \ldots, r) \otimes \omega(\Lambda_r \mid r+1, \ldots, q+r-s, 1, \ldots, s). \tag{12.10}$$

As an immediate consequence of the lemma above, we see that $c_{r,q} \circ c_{q,r}$ acts on $(\Omega^q \bar\otimes \Omega^r)(0, \Lambda_{q+r-s} + \Lambda_s)$ as the scalar $(\zeta t)^{-2qr} t^{2s(q+r-s+1)}$: the factorial ratios in (12.10) for $(q,r)$ and $(r,q)$ are mutually inverse, and the remaining factors multiply to this value. Thus we complete the proof of Theorem 11.3.
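As a small consistency check of this scalar (our own illustration, not part of the original argument), take $N = 2$, $\epsilon = 1$, so that $t = \zeta^2$, and $q = r = 1$, so that $s \in \{0, 1\}$. The check assumes that $\Lambda_2$ is identified with $0$ in $V$ and that the twist is $\theta_\lambda = \zeta^{(\lambda|\lambda+2\rho)^\sim}$ as in (11.9) with $\zeta$ in place of $\zeta_0$, which gives $\theta_i = \zeta^{i(i+2)}$ for $\lambda = i\Lambda_1$, i.e. $\theta_0 = 1$, $\theta_1 = \zeta^3$, $\theta_2 = \zeta^8$. Under these assumptions (11.6) predicts the eigenvalue $\theta_k/\theta_1^2$ on the summand $L_k$ of $L_1 \otimes L_1$, and the two values agree with the scalar above:

```latex
\begin{align*}
s = 0:\quad & (\zeta t)^{-2} = \zeta^{-6} = \frac{\theta_0}{\theta_1^{2}}, \\
s = 1:\quad & (\zeta t)^{-2}\, t^{2\cdot 1\cdot(1+1-1+1)} = \zeta^{-6}\,\zeta^{8} = \zeta^{2} = \frac{\theta_2}{\theta_1^{2}}.
\end{align*}
```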


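13. ABF Models and $SU(2)_L$-SOS Algebras

In this section, we give an explicit description of the representation theory of $S(A_1; t)_\epsilon$. We identify $G = G_{2,L}$ with the Dynkin diagram of type $A_{L+1}$:

$$\overset{0}{\circ} \rightleftarrows \overset{1}{\circ} \rightleftarrows \cdots \rightleftarrows \overset{L-1}{\circ} \rightleftarrows \overset{L}{\circ}. \tag{13.1}$$

Also, we identify $V$ and $G^k_{ij}$ with

$$\{0, 1, \ldots, L\} \quad \text{and} \quad \{(i_0, i_1, \ldots, i_k) \mid 0 \le i_0, \ldots, i_k \le L,\ |i_\nu - i_{\nu-1}| = 1\ (1 \le \nu \le k)\} \tag{13.2}$$

respectively. We define the set $B$ by

$$B = \Bigl\{ \binom{k}{i\,j} \Bigm| i, j, k \in V,\ |i - j| \le k \le i + j,\ i + j + k \in 2\mathbb{Z},\ i + j + k \le 2L \Bigr\}. \tag{13.3}$$

Then, we have

$$N^k_{ij} = \begin{cases} 1 & \binom{k}{i\,j} \in B \\ 0 & \text{(otherwise).} \end{cases} \tag{13.4}$$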
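The fusion rule (13.3)–(13.4) is easy to tabulate. The following sketch (function names are ours) lists the summands of $[L_i][L_j]$ for a given level $L$ and checks commutativity and associativity of the resulting product by brute force.

```python
def in_B(i, j, k, L):
    """Membership test for the set B of (13.3)."""
    return (0 <= i <= L and 0 <= j <= L and 0 <= k <= L
            and abs(i - j) <= k <= i + j
            and (i + j + k) % 2 == 0
            and i + j + k <= 2 * L)


def fusion_product(i, j, L):
    """All k with N_{ij}^k = 1, i.e. the summands of [L_i][L_j]."""
    return [k for k in range(L + 1) if in_B(i, j, k, L)]


def is_commutative(L):
    """Check N_{ij}^k = N_{ji}^k for all i, j at level L."""
    rng = range(L + 1)
    return all(fusion_product(i, j, L) == fusion_product(j, i, L)
               for i in rng for j in rng)


def is_associative(L):
    """Check sum_m N_{ij}^m N_{mk}^l = sum_m N_{jk}^m N_{im}^l for all i, j, k, l."""
    rng = range(L + 1)
    for i in rng:
        for j in rng:
            for k in rng:
                for l in rng:
                    left = sum(in_B(i, j, m, L) and in_B(m, k, l, L) for m in rng)
                    right = sum(in_B(j, k, m, L) and in_B(i, m, l, L) for m in rng)
                    if left != right:
                        return False
    return True
```

For instance, `fusion_product(1, 1, 2)` lists the summands of $[L_1][L_1]$ at level $L = 2$; the truncation $i + j + k \le 2L$ is what distinguishes these rules from the classical Clebsch–Gordan rule.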

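In order to simplify the formula for quantum invariants stated in the introduction, we use the rational basis of type $\Sigma$ instead of type $\Omega$ (cf. Remark 6.1). The corresponding Boltzmann weight $w = w^{\Sigma}_{2,t,\epsilon}$ is given by

$$w\begin{bmatrix} i & i \pm 1 \\ i \pm 1 & i \end{bmatrix} = -\zeta^{-1}\, \frac{\pm t^{\mp(i+1)}}{[i+1]}, \qquad w\begin{bmatrix} i & i \pm 1 \\ i \mp 1 & i \end{bmatrix} = \zeta^{-1}\, \frac{[i+1 \pm 1]}{[i+1]}, \tag{13.5}$$

$$w\begin{bmatrix} i & i \pm 1 \\ i \pm 1 & i \pm 2 \end{bmatrix} = \zeta^{-1} t, \qquad w\bigl[\text{otherwise}\bigr] = 0. \tag{13.6}$$

Next, we recall a realization of $L_k$ introduced in [14]. Let $\Sigma$ be an algebra generated by the symbols $\sigma(p)$ $(p \in G^k,\ k \ge 0)$ with defining relations:

$$\sum_{k \in V} \sigma(k) = 1, \tag{13.7}$$

$$\sigma(p)\,\sigma(q) = \delta_{r(p) s(q)}\, \sigma(p \cdot q), \tag{13.8}$$

$$\sigma(i, i+1, i) = \sigma(i, i-1, i) \qquad (0 < i < L), \tag{13.9}$$

$$\sigma(0, 1, 0) = \sigma(L, L-1, L) = 0. \tag{13.10}$$

We define the grading $\Sigma = \bigoplus_{k \ge 0} \Sigma^k$ via $\Sigma^k = \operatorname{span}\{\sigma(p) \mid p \in G^k\}$. Then each component $\Sigma^k$ becomes a right $S(A_1; t)_\epsilon$-comodule via

$$\sigma(q) \mapsto \sum_{p \in G^k} \sigma(p) \otimes e\binom{p}{q} \qquad (q \in G^k). \tag{13.11}$$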


For each $\binom{k}{i\,j} \in B$, the element $\sigma_k(i,j) := \epsilon^{L(q)} \sigma(q)$ does not depend on the choice of $q \in G^k_{ij}$, where $L$ is as in Sect. 7, that is,

$$L(i, i-1, \ldots, (i+j-k)/2, \ldots, j-1, j) = 0, \tag{13.12}$$

$$L((\ldots, n, n+1, n, \ldots)) = L((\ldots, n, n-1, n, \ldots)) + 1. \tag{13.13}$$

It is easy to see that $\Sigma^k(i,j) = \mathbb{C}\sigma_k(i,j)$ for each $\binom{k}{i\,j} \in B$ and that $\{\sigma_k(i,j) \mid \binom{k}{i\,j} \in B\}$ is a linear basis of $\Sigma$. Since $\dim \Sigma^k(0,l) = \dim \Sigma^k(l,0) = \delta_{kl}$, we have $\Sigma^k \cong L_k \cong (\Sigma^k)^\vee$ by Theorem 7.5 and (3.7). More explicitly, we have the following.

Proposition 13.1. The map $\Sigma^k \to (\Sigma^k)^\vee$; $\sigma_k(i,j) \mapsto c\binom{k}{i\,j}\, \sigma_k^\vee(i,j)$ gives an identification of $S(A_1; t)_\epsilon$-comodules, where $\{\sigma_k^\vee(j,i)\}$ denotes the dual basis of $\{\sigma_k(i,j)\}$ and the constant $c\binom{k}{i\,j}$ is given by

$$c\binom{k}{i\,j} = (-)^{(i-j)/2}\, \frac{[(i+j+k)/2+1]!\; [(i-j+k)/2]!\; [(-i+j+k)/2]!}{[i+1]\; [(i+j-k)/2]!}. \tag{13.14}$$

Under this identification, the maps $d_{\Sigma^k}$ and $b_{\Sigma^k}$ in (3.8)-(3.9) are given by

$$d_{\Sigma^k}(i) = \sum_j c\binom{k}{j\,i}^{-1} \sigma_k(i,j) \otimes \sigma_k(j,i), \tag{13.15}$$

$$b_{\Sigma^k}\bigl(\sigma_k(i,j) \otimes \sigma_k(j,l)\bigr) = \delta_{il}\, c\binom{k}{i\,j}\, i \tag{13.16}$$

respectively, where the summation in (13.15) is taken over all $j \in V$ such that $\binom{k}{j\,i} \in B$.

Proof. It suffices to show the first assertion. Since $\Sigma^k(i,j) = \mathbb{C}\sigma_k(i,j)$ and $(\Sigma^k)^\vee(i,j) = \mathbb{C}\sigma_k^\vee(i,j)$, there exists an isomorphism $\Sigma^k \cong (\Sigma^k)^\vee$ of the form $\sigma_k(i,j) \mapsto c\binom{k}{i\,j}\, \sigma_k^\vee(i,j)$ for some nonzero constant $c\binom{k}{i\,j}$ ($\binom{k}{i\,j} \in B$). To compute $c\binom{k}{i\,j}$, we consider the $S(A_1; t)_\epsilon$-right module structure on $\Sigma$ given by (8.1). Similarly to Lemma 8.2, we obtain

$$\sigma_k(i,j)\, e\binom{i,\ i \pm 1}{j,\ j \pm 1} = \zeta^{-k}\, \epsilon^{(\pm i \mp j + k)/2}\, t^{(\mp i \pm j + k)/2}\, \frac{[\frac{i+j \mp k}{2} + 1]}{[i+1]}\, \sigma_k(i \pm 1, j \pm 1). \tag{13.17}$$

On the other hand, by (10.1), we obtain

$$S\Bigl(e\binom{i,\ i \pm 1}{j,\ j \pm 1}\Bigr) = \frac{[j+1]}{[i+1]}\, e\binom{j \pm 1,\ j}{i \pm 1,\ i}, \qquad S\Bigl(e\binom{i,\ i \pm 1}{j,\ j \mp 1}\Bigr) = -\frac{[j+1]}{[i+1]}\, e\binom{j \mp 1,\ j}{i \pm 1,\ i}. \tag{13.18}$$

Using this together with (8.3) and (13.17)$_-$, we obtain

$$\sigma_k^\vee(i,j)\, e\binom{i,\ i+1}{j,\ j+1} = \zeta^{-k}\, \epsilon^{(i-j+k)/2}\, t^{(-i+j+k)/2}\, \frac{[\frac{i+j+k}{2} + 2]}{[i+2]}\, \sigma_k^\vee(i+1, j+1). \tag{13.19}$$


By (13.17)$_+$ and (13.19), we obtain

$$\frac{c\binom{k}{i+1,\ j+1}}{c\binom{k}{i,\ j}} = \frac{[i+1]\; [(i+j+k)/2 + 2]}{[i+2]\; [(i+j-k)/2 + 1]}. \tag{13.20}$$

Similarly, by computing $\sigma_k(i,j)\, e\binom{i,\ i \pm 1}{j,\ j \mp 1}$, we obtain

$$\frac{c\binom{k}{i \pm 1,\ j \mp 1}}{c\binom{k}{i,\ j}} = -\frac{[i+1]\; [(\pm i \mp j + k)/2 + 1]}{[i+1 \pm 1]\; [(\mp i \pm j + k)/2]}. \tag{13.21}$$
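The closed form (13.14) can be tested against the recursions (13.20) and (13.21) numerically. The sketch below is a check under simplifying assumptions of ours: a generic real $t$ (so the level truncation $i + j + k \le 2L$ plays no role), the convention $[n] = (t^n - t^{-n})/(t - t^{-1})$, and $k$ and $i - j$ even so that the sign $(-)^{(i-j)/2}$ is unambiguous. Only the upper-sign case of (13.21) is tested.

```python
def qint(n, t=1.3):
    """q-integer [n] at a generic real t (assumed convention)."""
    return (t**n - t**(-n)) / (t - 1 / t)


def qfact(n, t=1.3):
    """q-factorial [n]! = [n]...[2][1], with [0]! = 1."""
    out = 1.0
    for m in range(1, n + 1):
        out *= qint(m, t)
    return out


def c_const(k, i, j, t=1.3):
    """Closed form (13.14); this sketch requires k and i - j to be even."""
    sign = -1 if ((i - j) // 2) % 2 else 1
    num = (qfact((i + j + k) // 2 + 1, t) * qfact((i - j + k) // 2, t)
           * qfact((-i + j + k) // 2, t))
    den = qint(i + 1, t) * qfact((i + j - k) // 2, t)
    return sign * num / den


def rhs_13_20(k, i, j, t=1.3):
    """Right-hand side of (13.20)."""
    return (qint(i + 1, t) * qint((i + j + k) // 2 + 2, t)
            / (qint(i + 2, t) * qint((i + j - k) // 2 + 1, t)))


def rhs_13_21_upper(k, i, j, t=1.3):
    """Right-hand side of (13.21) for the upper sign (i+1, j-1)."""
    return (-qint(i + 1, t) * qint((i - j + k) // 2 + 1, t)
            / (qint(i + 2, t) * qint((-i + j + k) // 2, t)))


def max_relative_defect(kmax=6, imax=8):
    """Largest relative mismatch between (13.14) and the two recursions."""
    worst = 0.0
    for k in range(0, kmax + 1, 2):
        for i in range(imax + 1):
            for j in range(imax + 1):
                if (i - j) % 2 or not (abs(i - j) <= k <= i + j):
                    continue
                base = c_const(k, i, j)
                r20 = c_const(k, i + 1, j + 1) / base
                worst = max(worst, abs(r20 / rhs_13_20(k, i, j) - 1))
                if (-i + j + k) // 2 >= 1 and j >= 1:
                    r21 = c_const(k, i + 1, j - 1) / base
                    worst = max(worst, abs(r21 / rhs_13_21_upper(k, i, j) - 1))
    return worst
```

A value of `max_relative_defect()` at the level of rounding error indicates that (13.14) solves both recursions on the tested range.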

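By solving these recursion relations under some initial condition, we get (13.14). □

Proposition 13.2. (1) The braiding $c : \Sigma^m \bar\otimes \Sigma^n \xrightarrow{\ \sim\ } \Sigma^n \bar\otimes \Sigma^m$ and its inverse are given by

$$c^{\pm 1}\bigl(\sigma_m(h,i) \otimes \sigma_n(i,k)\bigr) = \sum_j w^{\pm}_{mn}\begin{bmatrix} h & i \\ j & k \end{bmatrix} \sigma_n(h,j) \otimes \sigma_m(j,k), \tag{13.22}$$

$$w^{\pm}_{mn}\begin{bmatrix} h & i \\ j & k \end{bmatrix} := \sum_{u \in G^n_{hj}} \sum_{v \in G^m_{jk}} \epsilon^{L(u)+L(v)+L(q)+L(r)}\; w^{\pm}\begin{bmatrix} & q & \\ u & & r \\ & v & \end{bmatrix}, \tag{13.23}$$

respectively, where $q$ and $r$ denote arbitrary elements of $G^m_{hi}$ and $G^n_{ik}$ respectively, and the summation in (13.22) is taken over all $j \in V$ such that $\binom{n}{h\,j}, \binom{m}{j\,k} \in B$.

(2) The ribbon functional of $S(A_1, t)^{\iota}_{\epsilon,\zeta}$ acts on $L_i$ as the scalar $\theta_i^{-1}$ given by

$$\theta_i = \iota^i\, \zeta^{i(i+2)}. \tag{13.24}$$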
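The twist (13.24) can be cross-checked against (11.7) numerically. The sketch below assumes $\epsilon = \iota = 1$, $\zeta = \exp(\pi i/(2(L+2)))$ (so that $t = \zeta^2$ is a primitive $2(L+2)$th root of unity), $\dim_q(L_i) = [i+1]$, the fusion rule (13.3)–(13.4), and the standard sine form of the $SU(2)_L$ S-matrix. These choices and the function names are ours.

```python
import cmath
import math


def fusion(i, j, k, L):
    """SU(2)_L fusion coefficient N_{ij}^k, following (13.3)-(13.4)."""
    ok = (abs(i - j) <= k <= i + j
          and (i + j + k) % 2 == 0
          and i + j + k <= 2 * L)
    return 1 if ok else 0


def qdim(i, L):
    """Quantum dimension [i+1] at t = exp(pi*i/(L+2))."""
    return math.sin(math.pi * (i + 1) / (L + 2)) / math.sin(math.pi / (L + 2))


def twist(i, L):
    """theta_i = zeta^{i(i+2)} with zeta = exp(pi*i/(2(L+2))), iota = 1."""
    zeta = cmath.exp(1j * math.pi / (2 * (L + 2)))
    return zeta ** (i * (i + 2))


def s_from_twists(i, j, L):
    """Right-hand side of (11.7) specialised to this setting."""
    return sum(fusion(i, j, k, L) * twist(k, L) / (twist(i, L) * twist(j, L))
               * qdim(k, L) for k in range(L + 1))


def s_closed(i, j, L):
    """Assumed closed form sin(pi(i+1)(j+1)/(L+2)) / sin(pi/(L+2))."""
    return (math.sin(math.pi * (i + 1) * (j + 1) / (L + 2))
            / math.sin(math.pi / (L + 2)))


def max_defect(L):
    """Largest difference between the two expressions over all i, j."""
    return max(abs(s_from_twists(i, j, L) - s_closed(i, j, L))
               for i in range(L + 1) for j in range(L + 1))
```

Agreement of the two expressions for small $L$ is consistent with the induction on $i$ mentioned in the proof that follows.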

Proof. Since the braiding is a natural transformation and the map $\mathbb{C}G^m \to \Sigma^m$; $p \mapsto \sigma(p)$ is an $S(A_1; t)_\epsilon$-comodule map, we have

$$c^{\pm 1}\bigl(\sigma_m(h,i) \otimes \sigma_n(i,k)\bigr) = \sum_j \sum_{u \in G^n_{hj}} \sum_{v \in G^m_{jk}} \epsilon^{L(q)+L(r)}\; w^{\pm}\begin{bmatrix} & q & \\ u & & r \\ & v & \end{bmatrix} \sigma(u) \otimes \sigma(v). \tag{13.25}$$

Since $\sigma(p) = \epsilon^{L(p)} \sigma_m(i,j)$ for each $p \in G^m_{ij}$, this proves Part (1). When $i = 1$, (13.24) follows from the proof of Proposition 10.2. For $i > 1$, (13.24) follows from (11.7) by induction on $i$. □

14. The State Sum Invariants

Let $L$ be a positive integer and let $\epsilon$, $\iota$ be elements of $\{\pm 1\}$ such that $\epsilon = 1$ if $L$ is an odd integer. Let $t$ be a primitive $2(L+2)$th root of unity and $\zeta$ a solution of $\zeta^2 = \epsilon t$. Applying the general theory of TQFT to the MTC $\mathcal{C}_S(A_1, t)^{\iota}_{\epsilon,\zeta}$, we obtain an invariant $\tau^{\iota}_{\epsilon,\zeta}$ of oriented 3-manifolds. To give an explicit description of $\tau^{\iota}_{\epsilon,\zeta}$, we prepare some terminology on link diagrams.

Let $D$ be a generic link diagram in $\mathbb{R} \times (0,1)$ (viewed as a union of line segments $AB$), which presents a framed link $L$ with components $K_1, \ldots, K_p$ (see e.g. [26]). A point of $D$ is called extremal if the height function on $D$ attains its local maximum or local minimum in this point, where the height function ht is the restriction of the projection $\mathbb{R} \times (0,1) \to (0,1)$ on $D$. A point of $D$ is called singular if it is either an extremal point or a crossing point. We denote by $\sharp D$ the set of all singular points of $D$. Let $E$ be the set of all connected components of $D \setminus \sharp D$. We say that $E \in E$ belongs to $K_q$ $(1 \le q \le p)$ if $E$ is a subset of the image of $K_q$ via the projection $L \to D$. Let $c$ be a map from $\{K_1, \ldots, K_p\}$ to $V$. We say that a map $\lambda : E \to B$; $E \mapsto \binom{\lambda_1(E)}{\lambda_2(E)\,\lambda_3(E)}$ is a state on $D$ of color $c$ if $\lambda_1(E) = c(K_q)$ for each component $K_q$ and $E \in E$ belonging to $K_q$. We denote by $c_\lambda$ the color of a state $\lambda$, and by $S(D)$ the set of all states on $D$. Figure (A) shows a state $\lambda$ on a diagram of the Hopf link with 6 singular points, such that $c_\lambda(K_1) = 1$, $c_\lambda(K_2) = 2$.

[Fig. (A): a state $\lambda$ on a diagram of the Hopf link with components $K_1$ and $K_2$.]

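Next, we assign a complex number $\langle \lambda | A \rangle$ to each state $\lambda \in S(D)$ and singular point $A \in \sharp D$ as follows. When $(\lambda, A)$ is as in Fig. (B) or Fig. (C), then we set

$$\langle \lambda | A \rangle = c\binom{l}{i\,h}^{-1} \delta_{hk}\, \delta_{ij} \quad \text{or} \quad \langle \lambda | A \rangle = c\binom{l}{h\,i}\, \delta_{hk}\, \delta_{ij} \tag{14.1}$$

respectively, where $c\binom{l}{h\,i}$ is as in (13.14).

[Fig. (B), Fig. (C): an extremal point $A$ joining two strands carrying the states $\binom{l}{h\,i}$ and $\binom{l}{j\,k}$.]

When $(\lambda, A)$ is as in Fig. (D$^\pm$), we set

$$\langle \lambda | A \rangle = \delta_{ij}\, \delta_{hc}\, \delta_{de}\, \delta_{fk}\; w^{\pm}_{mn}\begin{bmatrix} h & i \\ e & f \end{bmatrix}, \tag{14.2}$$

where $w^{\pm}_{mn}$ is as in (13.23).

[Fig. (D$^+$), Fig. (D$^-$): a crossing point $A$ with strands carrying the states $\binom{m}{h\,i}$, $\binom{n}{j\,k}$, $\binom{n}{c\,d}$ and $\binom{m}{e\,f}$.]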

The following result follows from Sect. 13 by a method quite similar to “vertex models on link invariants” (see e.g. [40] Appendix II), hence we omit the proof.

Theorem 14.1. Let $\tau^{\iota}_{\epsilon,\zeta}$ be the invariant of closed oriented 3-manifolds associated with the modular tensor category $\mathcal{C}_S(A_1, t)^{\iota}_{\epsilon,\zeta}$ (cf. [40]). Let $M$ be a 3-manifold obtained by surgery on $S^3$ along a framed link $L$ with $p$ components $K_1, \ldots, K_p$. Let $D$ be a generic diagram in $\mathbb{R} \times (0,1)$ which presents $L$. Then $\tau^{\iota}_{\epsilon,\zeta}(M)$ is given by

$$\tau^{\iota}_{\epsilon,\zeta}(M) = \Delta^{\sigma(L)}\, \mathcal{D}^{-\sigma(L)-p-1} \sum_{\lambda \in S(D)} \prod_{q=1}^{p} \iota^{c_\lambda(K_q)}\, [c_\lambda(K_q) + 1] \prod_{A \in \sharp D} \langle \lambda | A \rangle. \tag{14.3}$$

Here $\Delta$ denotes a fixed square root of $\sum_{i \in V} [i+1]^2$, $\mathcal{D} = \sum_{i \in V} \iota^i \zeta^{-i(i+2)} [i+1]^2$ and $\sigma(L)$ denotes the signature of the linking matrix of $L$.

References

1. Andersen, H.: Tensor products of quantized tilting modules. Commun. Math. Phys. 149, 149–159 (1992)
2. Bergman, G.: The diamond lemma for ring theory. Adv. Math. 29, 178–218 (1978)
3. Böhm, G. and Szlachányi, K.: A coassociative C∗-quantum group with non-integral dimensions. Lett. Math. Phys. 35, 437–456 (1996)
4. Chari, V. and Pressley, A.: A guide to quantum groups. Cambridge: Cambridge University Press, 1994
5. Doi, Y. and Takeuchi, M.: Multiplication alternation by two-cocycles – The quantum version. Commun. Alg. 22, 5715–5732 (1994)
6. Drinfeld, V.G.: On quasitriangular Quasi-Hopf algebras and a group closely connected with Gal(Q̄/Q). Leningrad Math. J. 2, 829–860 (1991)
7. Finkelberg, M.: An equivalence of fusion categories. Geom. Funct. Analysis 6, 249–267 (1996)
8. Gelfand, S. and Kazhdan, D.: Examples of tensor categories. Invent. Math. 109, 595–617 (1992)
9. Goodman, F. and Nakanishi, T.: Fusion algebras in integrable systems in two dimensions. Phys. Lett. B 262, 259–264 (1991)
10. Goodman, F. and Wenzl, H.: Littlewood-Richardson coefficients for Hecke algebras at roots of unity. Adv. Math. 82, 244–265 (1990)
11. Hayashi, T.: An algebra related to the fusion rules of Wess–Zumino–Witten models. Lett. Math. Phys. 22, 291–296 (1991)
12. Hayashi, T.: Quantum deformation of classical groups. Publ. RIMS, Kyoto Univ. 28, 57–81 (1992)
13. Hayashi, T.: Quantum groups and quantum determinants. J. Algebra 152, 146–165 (1992)
14. Hayashi, T.: Quantum group symmetry of partition functions of IRF models and its application to Jones' index theory. Commun. Math. Phys. 157, 331–345 (1993)
15. Hayashi, T.: Face algebras and their Drinfeld doubles. In: Proceedings of Symposia in Pure Mathematics, Vol. 56, Part 2, Providence, RI: American Mathematical Society, 1994
16. Hayashi, T.: Face algebras I – A generalization of quantum group theory. To appear in J. Math. Soc. Japan
17. Hayashi, T.: Compact quantum groups of face type. Publ. RIMS, Kyoto Univ. 32, 351–369 (1996)
18. Hayashi, T.: Galois quantum groups of II1-subfactors. Preprint
19. Hayashi, T.: Face algebras II – Standard generator theorems. In preparation
20. Hayashi, T.: Quantum groups and quantum semigroups. To appear in J. Algebra
21. Hayashi, T.: In preparation
22. Hayashi, T.: In preparation
23. Jimbo, M., Miwa, T. and Okado, M.: Solvable lattice models related to the vector representation of classical simple Lie algebras. Commun. Math. Phys. 116, 507–525 (1988)
24. Jurčo, B. and Schupp, P.: AKS scheme for face and Calogero–Moser–Sutherland type models. Preprint
25. Kac, V.: Infinite dimensional Lie algebras, 3rd ed. Cambridge: Cambridge Univ. Press, 1990
26. Kassel, C.: Quantum groups. New York: Springer-Verlag, 1995
27. Kazhdan, D. and Wenzl, H.: Reconstructing monoidal categories. Adv. in Soviet Math. 16, 111–136 (1993)
28. Kirillov, A., Jr.: On an inner product in modular tensor categories. J. of AMS 9, 1135–1169 (1996)
29. Koornwinder, T.: Compact quantum groups and q-special functions. Preprint
30. Larson, R. and Towber, J.: Two dual classes of bialgebras related to the concepts of “quantum group” and “quantum Lie algebra”. Commun. Alg. 19, 3295–3345 (1991)
31. Macdonald, I.: Symmetric functions and Hall polynomials, 2nd ed. Oxford: Oxford Univ. Press, 1995
32. Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
33. Ocneanu, A.: Quantized group, string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2. London Math. Soc. Lecture note series 136, 119–172 (1989)
34. Reshetikhin, N. and Turaev, V.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, 547–598 (1991)
35. Reshetikhin, N., Takhtadzhyan, L. and Faddeev, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
36. Schauenburg, P.: Face algebras are ×R-bialgebras. Preprint
37. Sweedler, M.: Hopf algebras. New York: Benjamin Inc., 1969
38. Takeuchi, M.: Matric bialgebras and quantum groups. Israel J. Math. 72, 232–251 (1990)
39. Tsuchiya, A. and Kanie, Y.: Vertex operators in conformal field theory on P1 and monodromy representations of braid groups. In: Adv. Stud. Pure Math. Vol. 16.
40. Turaev, V.: Quantum invariants of knots and 3-manifolds. Berlin, New York: Walter de Gruyter, 1994
41. Turaev, V. and Wenzl, H.: Quantum invariants of 3-manifolds associated with classical simple Lie algebras. Int. J. of Modern Math. 4, 323–358 (1993)
42. Turaev, V. and Wenzl, H.: Semisimple and modular categories from link invariants. Math. Ann. 309, 411–461 (1997)
43. Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)
44. Walton, M.: Algorithm for WZW fusion rules: A proof. Phys. Lett. B 241, 365–368 (1990)
45. Wenzl, H.: Hecke algebras of type An and subfactors. Invent. Math. 92, 349–383 (1988)
46. Wenzl, H.: C∗ tensor categories from quantum groups. J. AMS 11, 261–282 (1998)
47. Witten, E.: Quantum field theory and the Jones polynomial. Comm. Math. Phys. 121, 351–399 (1989)

Communicated by T. Miwa

Commun. Math. Phys. 203, 249 – 267 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Donaldson Invariants for Non-Simply Connected Manifolds Marcos Mariño, Gregory Moore Department of Physics, Yale University, New Haven, CT 06520, USA. E-mail: [email protected], [email protected] Received: 2 May 1998 / Accepted: 14 September 1998

Abstract: We study Coulomb branch (“u-plane”) integrals for $N = 2$ supersymmetric $SU(2)$, $SO(3)$ Yang–Mills theory on 4-manifolds $X$ of $b_1(X) > 0$, $b_2^+(X) = 1$. Using wall-crossing arguments we derive expressions for the Donaldson invariants for manifolds with $b_1(X) > 0$, $b_2^+(X) > 0$. Explicit expressions for $X = \mathbb{CP}^1 \times F_g$, where $F_g$ is a Riemann surface of genus $g$, are obtained using Kronecker’s double series identity. The result might be useful in future studies of quantum cohomology.

1. Introduction

The Donaldson invariants of four-manifolds have been a source of fascination both in mathematics and in physics. While there has been much progress in understanding these invariants, there is more to learn, particularly in terms of the relation of the invariants to Floer homology [1] and to Gromov–Witten invariants [2, 3]. Understanding the invariants in these contexts leads to the need to understand the Donaldson invariants for nonsimply connected four-manifolds $X$. Most investigations of Donaldson invariants have focussed on the case $\pi_1(X) = 0$. There exists a mathematical definition in the nonsimply connected case [4] but comparatively little is known about this case. This paper derives some new results on Donaldson invariants for 4-manifolds with first Betti number $b_1(X) > 0$. We do not consider the effects of torsion in $H_*(X; \mathbb{Z})$, nor the effects of a nonabelian fundamental group.

Recently, using Witten’s physical approach to Donaldson theory [5–8], a fairly systematic (physical) procedure has been developed for deriving various properties of the Donaldson invariants, including wall-crossing and blowup formulae, and the relation to Seiberg–Witten invariants [9–13]. The systematic procedure, which begins with certain integrals over the Coulomb branch of vacua of an $N = 2$ SYM theory, can be extended to higher rank gauge groups and to nonsimply connected manifolds. In [9, 10] partial results were obtained for nonsimply connected manifolds. In this paper a more complete treatment is given for the rank one groups $SU(2)$, $SO(3)$.


Our main results are:

1. A wall-crossing formula for the Donaldson invariants in Eqs. (3.4) and (3.16) below.
2. An expression for the Donaldson invariants in terms of SW invariants. For manifolds of simple type this is given in Eq. (4.17) below. It is a natural generalization of Witten’s formula [7], obtained in the simply connected case.
3. Explicit expressions for the Donaldson invariants for $X = \mathbb{CP}^1 \times F_g$, where $F_g$ is a Riemann surface of genus $g$. We give answers valid in both the chambers $\mathrm{vol}(\mathbb{CP}^1) \to 0$ and $\mathrm{vol}(F_g) \to 0$. The expressions in Eqs. (5.18), (5.20) below might prove useful in future studies of the Gromov–Witten invariants of the moduli space of flat connections on $F_g$.

2. The u-Plane Integral for $b_1 > 0$

In this section we extend and elaborate on the results of Sect. 10 of [9]. We consider an arbitrary insertion of observables, using the proposal for the contact terms in [10]. Consider an $N = 2$ $SU(2)$ or $SO(3)$ supersymmetric Yang–Mills theory¹ on a compact oriented 4-manifold $X$ of $b_1(X) > 0$. As explained in [9] the Donaldson–Witten generating function can be written as $Z_{DW} = Z_u + Z_{SW}$, where $Z_u$ is the Coulomb branch integral and $Z_{SW}$ is the contribution of the Seiberg–Witten invariants. The Coulomb branch integral is only nonvanishing for $b_2^+(X) = 1$, but, by a procedure explained in [9, 12], can be taken as the starting point for a systematic derivation of $Z_{DW}$.

In the case of $b_1(X) > 0$ the Coulomb integral can be obtained by a simple generalization of the arguments in section three of [9]. First of all, the photon partition function includes [8] an integration over $b_1$ zero modes of the gauge field corresponding to flat connections. These zero modes span the tangent space to a torus of dimension $b_1$, $\mathbf{T}^{b_1} = H^1(X, \mathbb{R})/H^1(X, \mathbb{Z})$. The zero modes of the one-forms $\psi$ live in this tangent space. As a consequence of having these extra zero modes, the photon partition function is

$$(\mathrm{Im}\,\tau)^{\frac{1}{2}(b_1-1)}\,\theta_0(\tau,\bar\tau), \tag{2.1}$$

where $\theta_0(\tau,\bar\tau)$ is the Siegel–Narain theta function introduced in [8, 14, 9]. The next ingredient comes from the measure for the $\psi$-fields. The expansion in zero modes reads $\psi = \sum_{i=1}^{b_1} c_i\beta_i$, where $\beta_i$, $i = 1, \dots, b_1$ is an integral basis of harmonic one-forms, and we identify $H_1(X) \simeq H^1(X, \mathbb{Z})$. The $c_i$ are Grassmann variables. The measure for the $\psi$ fields is then:

$$\prod_{i=1}^{b_1}\frac{dc_i}{(\mathrm{Im}\,\tau)^{1/2}} = (\mathrm{Im}\,\tau)^{-\frac{b_1}{2}}\prod_{i=1}^{b_1}dc_i. \tag{2.2}$$

We can consider the $c_i$ as a basis of one-forms $\beta_i^\sharp \in H^1(\mathbf{T}^{b_1}, \mathbb{Z})$, dual to $\beta_i \in H^1(X, \mathbb{Z})$. In this way we can identify

$$\psi = \sum_{i=1}^{b_1}\beta_i^\sharp\otimes\beta_i = c_1(\mathcal{L}), \tag{2.3}$$

where $\mathcal{L}$ is the universal flat line bundle over $\mathbf{T}^{b_1} \times X$.

¹ For simplicity we do not include hypermultiplets.

Taking into account (2.1) and (2.2), we see that the Coulomb integral (without any insertion of observables) can be written as [9]:

$$\begin{aligned}
Z_u = 2\int[da\,d\bar a\,d\eta\,d\chi]\int_{\mathrm{Pic}(X)}d\psi\int dD\;&A^\chi B^\sigma y^{-1/2}\exp\Big[\frac{1}{8\pi}\int(\mathrm{Im}\,\tau)\,D\wedge *D\Big]\\
&\cdot\exp\Big[-i\pi\bar\tau\lambda_+^2-i\pi\tau\lambda_-^2+\pi i(\lambda,w_2(X))\Big]\\
&\cdot\exp\Big[-\frac{i\sqrt2}{16\pi}\frac{d\bar\tau}{d\bar a}\int\eta\chi\wedge(D_++4\pi\lambda_+)+\frac{i\sqrt2}{2^7\pi}\frac{d\tau}{da}\int(\psi\wedge\psi)\wedge(4\pi\lambda_-+D_+)\\
&\qquad\quad+\frac{1}{3\cdot2^{11}\pi i}\frac{d^2\tau}{da^2}\int\psi\wedge\psi\wedge\psi\wedge\psi\Big],
\end{aligned}\tag{2.4}$$

where $\int_{\mathrm{Pic}(X)}$ denotes a sum over line bundles and an integration over $\mathbf{T}^{b_1}$. The integration over $\psi$ is understood as integration of differential forms on $\mathbf{T}^{b_1}$. The orientation of the measure of the finite-dimensional integral (2.4) corresponds to a choice of Donaldson orientation of the moduli space of instantons [15].

Consider now the generating function for an arbitrary insertion of zero, one, two and three observables. We introduce the formal sums of cycles

$$\gamma=\sum_{i=1}^{b_1}\zeta_i\delta_i,\qquad S=\sum_{i=1}^{b_2}\lambda_iS_i,\qquad \Sigma^3=\sum_{i=1}^{b_3}\theta_i\Sigma^3_i,\tag{2.5}$$

where $\delta_i$, $i = 1, \dots, b_1$, $S_i$, $i = 1, \dots, b_2$ and $\Sigma^3_i$, $i = 1, \dots, b_3 = b_1$ are a basis of one, two, and three cycles, respectively. The basis of one-cycles $\delta_i$ is dual to $\beta_i \in H^1(X, \mathbb{Z})$. The $\lambda_i$ are complex numbers, and $\zeta_i$, $\theta_i$ are Grassmann variables. The insertion of observables corresponding to these cycles is

$$a_1\int_\gamma Ku+a_2\int_SK^2u+a_3\int_{\Sigma^3}K^3u,\tag{2.6}$$

where $a_1$, $a_2$ and $a_3$ are constants that should be fixed by comparison to known mathematical results. The constant for the two-observable has been already fixed in [9], $a_2 = i/(\sqrt2\pi)$. $K$ is the canonical descent operator in the normalization of [9], and we have, explicitly:

$$\begin{aligned}
Ku&=\frac{1}{4\sqrt2}\frac{du}{da}\,\psi,\\
K^2u&=\frac{1}{32}\frac{d^2u}{da^2}\,\psi\wedge\psi-\frac{\sqrt2}{4}\frac{du}{da}(F_++D),\\
K^3u&=\frac{1}{2^7\sqrt2}\frac{d^3u}{da^3}\,\psi\wedge\psi\wedge\psi-\frac{3}{16}\frac{d^2u}{da^2}\,\psi\wedge(F_++D)-\frac{3\sqrt2\,i}{8}\frac{du}{da}(2d\chi-*d\eta).
\end{aligned}\tag{2.7}$$

In addition, we have to take into account various contact terms associated to the intersecting cycles [9, 10]. These come from intersections of two, three and four cycles on the manifold $X$. For the intersection of two cycles we have

$$S\cap S\in H_0(X,\mathbb{Z}),\quad \Sigma^3\cap\gamma\in H_0(X,\mathbb{Z}),\quad \Sigma^3\cap S\in H_1(X,\mathbb{Z}),\quad \Sigma^3\cap\Sigma^3\in H_2(X,\mathbb{Z}).\tag{2.8}$$

For intersection of three cycles, we have the possibilities

$$S\cap\Sigma^3\cap\Sigma^3\in H_0(X,\mathbb{Z}),\qquad \Sigma^3\cap\Sigma^3\cap\Sigma^3\in H_1(X,\mathbb{Z}),\tag{2.9}$$

and for intersection of four cycles we only have the possibility

$$\Sigma^3\cap\Sigma^3\cap\Sigma^3\cap\Sigma^3\in H_0(X,\mathbb{Z}).\tag{2.10}$$

The contact term corresponding to $S \cap S$ was obtained in [9]. A proposal for the structure of the contact terms corresponding to more general intersecting cycles was made in [10]. According to this proposal, the contact term associated to the intersection of $p$ cycles is given by the appropriate descendant of the $p$-th derivative of the Seiberg–Witten prepotential with respect to a deformation parameter $\tau_0$. Moreover [11, 12], this deformation parameter $\tau_0$ can be related to the dynamically generated scale of the theory $\Lambda$ (in the case of the asymptotically free theories) or to the microscopic gauge coupling, in the case of self-dual gauge theories. For $SU(2)$ $N = 2$ supersymmetric Yang–Mills theory, the relation between $\tau_0$ and $\Lambda$ is given by $\Lambda^4 = e^{i\pi\tau_0}$. This is related to the fact that $\Lambda$ can be identified with the first slow time of the Toda–Whitham hierarchy [16]. Following the proposal of [10] the contact terms for the various intersections in (2.8), (2.9) and (2.10) are of the form

$$\begin{aligned}
&S^2\,T(u)+a_{13}\,T(u)\,(\Sigma^3\cap\gamma)+a_{32}\int_{\Sigma^3\cap S}KT(u)+a_{33}\int_{\Sigma^3\cap\Sigma^3}K^2T(u)\\
&\quad+a_{332}\,F^{(3)}_{\tau_0}\,(S\cap\Sigma^3\cap\Sigma^3)+a_{333}\int_{\Sigma^3\cap\Sigma^3\cap\Sigma^3}KF^{(3)}_{\tau_0}+a_{3333}\,F^{(4)}_{\tau_0}\,(\Sigma^3\cap\Sigma^3\cap\Sigma^3\cap\Sigma^3),
\end{aligned}\tag{2.11}$$

where the $a$’s are constants which will be determined below. In this equation, $F^{(p)}_{\tau_0}$ denotes the $p$th derivative of the prepotential with respect to $\tau_0$, and $T(u) = (4/\pi i)F^{(2)}_{\tau_0}$. The contact term for $S \cap S$ can be written as [10, 11]:

$$T(u)=\frac14\Big[2u-a\frac{du}{da}\Big].\tag{2.12}$$

The constants $a_{ij}$, $a_{ijk}$ and $a_{ijkl}$ will be obtained in terms of $a_i$, $i = 1, 2, 3$, using single-valuedness of the integrand on the u-plane. Notice that one expects, on physical grounds, that $a_{ij}$ is proportional to $a_ia_j$, and so on.

We can already plug the observables (2.6) and the corresponding contact terms (2.11) into the generating function and write an explicit expression for the u-plane integral. It is important, however, to check that the resulting expression has good properties under duality transformations, that is, that the integrand is single-valued in the u-plane. This is not obvious due to the holomorphic “functions” that appear in (2.7) and (2.11). Following the strategy of [9], we will first of all integrate the auxiliary field $D$. Comparing to the expressions in the simply-connected case, we see that the new terms coupling to $D$ appear in the combination

$$-\frac{i}{4\pi}\frac{du}{da}\Big[(4\pi\lambda_-+D)\wedge\tilde S\Big],\tag{2.13}$$

where

$$\tilde S=S-\frac{\sqrt2}{32}\frac{d\tau}{du}\,\psi\wedge\psi+\frac{3\pi}{4i}\,a_3\frac{d^2u}{da^2}\frac{da}{du}\,\psi\wedge\Sigma^3-\sqrt2\,\pi i\,a_{33}\frac{dT}{du}\,\Sigma^3\wedge\Sigma^3,\tag{2.14}$$

and we interpret $\Sigma^3$ as a harmonic one-form using Poincaré duality and the Hodge theorem. To guarantee that the resulting lattice sum over first Chern classes is well-behaved under duality transformations, the two-form $\tilde S$ should be invariant under duality. To achieve this, we redefine the $\psi$ field as

$$\tilde\psi=\psi-\frac{12\pi}{\sqrt2\,i}\,a_3\frac{d^2u}{da^2}\frac{da}{d\tau}\,\Sigma^3.\tag{2.15}$$

Then, if we choose

$$a_{33}=-9\pi^2a_3^2,\tag{2.16}$$

$\tilde S$ becomes

$$\tilde S=S-\frac{\sqrt2}{32}\frac{d\tau}{du}\,\tilde\psi\wedge\tilde\psi+\frac{9\pi^3\sqrt2\,i}{4}\,a_3^2\,\Sigma^3\wedge\Sigma^3.\tag{2.17}$$

We then see that, if $\tilde\psi$ is a modular form of weight $(1, 0)$, then $\tilde S$ is a modular form of weight zero. Notice that the redefinition of $\psi$ in (2.15) does not change the $\psi$ measure. To obtain the above expression for $\tilde S$, we have taken into account that

$$4\pi i\frac{dT}{da}-\Big(\frac{d^2u}{da^2}\Big)^2\frac{da}{d\tau}=\pi i\frac{du}{da}.\tag{2.18}$$
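The step from (2.14) to (2.17) can be sketched as follows. This is only an outline of the algebra, assuming that the Grassmann-odd one-forms $\tilde\psi$ and $\Sigma^3$ commute under the wedge product, so that $(\tilde\psi + c\,\Sigma^3)^2$ expands like an ordinary square. The shorthand $c$ is introduced here for illustration and is not part of the original notation.

```latex
% Shorthand (an assumption of this sketch): c is the coefficient of \Sigma^3 in (2.15).
\begin{align*}
\psi &= \tilde\psi + c\,\Sigma^3, \qquad
c \equiv \frac{12\pi}{\sqrt{2}\,i}\,a_3\,\frac{d^2u}{da^2}\frac{da}{d\tau},\\[4pt]
-\frac{\sqrt2}{32}\frac{d\tau}{du}\,\psi\wedge\psi
 &= -\frac{\sqrt2}{32}\frac{d\tau}{du}\,\tilde\psi\wedge\tilde\psi
    -\frac{3\pi}{4i}a_3\frac{d^2u}{da^2}\frac{da}{du}\,\tilde\psi\wedge\Sigma^3
    +\frac{9\sqrt2\,\pi^2}{4}a_3^2\Big(\frac{d^2u}{da^2}\Big)^{2}
     \frac{da}{d\tau}\frac{da}{du}\,\Sigma^3\wedge\Sigma^3,\\
\frac{3\pi}{4i}a_3\frac{d^2u}{da^2}\frac{da}{du}\,\psi\wedge\Sigma^3
 &= \frac{3\pi}{4i}a_3\frac{d^2u}{da^2}\frac{da}{du}\,\tilde\psi\wedge\Sigma^3
    -\frac{9\sqrt2\,\pi^2}{2}a_3^2\Big(\frac{d^2u}{da^2}\Big)^{2}
     \frac{da}{d\tau}\frac{da}{du}\,\Sigma^3\wedge\Sigma^3,\\
-\sqrt2\,\pi i\,a_{33}\frac{dT}{du}\,\Sigma^3\wedge\Sigma^3
 &= 9\sqrt2\,\pi^3 i\,a_3^2\,\frac{dT}{du}\,\Sigma^3\wedge\Sigma^3
 \qquad (a_{33}=-9\pi^2a_3^2).
\end{align*}
% The \tilde\psi\wedge\Sigma^3 terms cancel. The \Sigma^3\wedge\Sigma^3 coefficients add up to
\begin{equation*}
\frac{9\sqrt2\,\pi^2}{4}\,a_3^2\,\frac{da}{du}
\Big[4\pi i\frac{dT}{da}-\Big(\frac{d^2u}{da^2}\Big)^{2}\frac{da}{d\tau}\Big]
=\frac{9\sqrt2\,\pi^3 i}{4}\,a_3^2 ,
\end{equation*}
% where the last equality uses (2.18) and (da/du)(du/da) = 1.
```

Under these assumptions the choice (2.16) is what removes the $u$-dependence from the $\Sigma^3\wedge\Sigma^3$ term, leaving the constant coefficient that appears in (2.17).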

Once this redefinition has been made, the u-plane integral involves a lattice sum identical to the one in the simply-connected case but with the substitution $S \to \tilde S$, and additional holomorphic insertions (coming from the observables and contact terms) that, once they are reexpressed in terms of $\tilde\psi$ and $\tilde S$, should be modular forms of weight zero. This is in fact the case for appropriate choices of the constants in (2.11). The computation is lengthy but straightforward. Duality invariance fixes all the constants in (2.11) in terms of $a_1$, $a_2$ and $a_3$. The final result is:

$$\begin{aligned}
a_{32}&=-6\sqrt2\,\pi i\,a_3,\qquad a_{13}=-6\pi^2a_1a_3,\\
a_{332}&=-72\sqrt2\,\pi i\,a_3^2,\qquad a_{333}=36\pi^2i\,a_3^3,\qquad a_{3333}=-216\pi^3i\,a_3^4.
\end{aligned}\tag{2.19}$$

The u-plane integral is therefore given by: √ Z Z dxdy 2 dτ ˜ 2 2ˆ ˜ u (S, ψ˜ ) µ(τ ) dψ exp 2pu + S T (u) + Zu = 1/2 0 y 32 du P ic(X) 0¯ (4)\H Z 3π i du ˜ a1 du a3 (S, ψ˜ ∧ 6 3 ) ψ˜ ∧ [γ ] − 3π 2 a1 a3 u(6 3 ∩ γ ) − + √ da 8 da 4 2 X √ 3 √ 7 dτ 2 4 2π i dτ 3 3 2π i 2 ˜ 3 a3 u(S, 6 ∧ 6 3 ) + a3 (ψ˜ ∧ 6 3 ) ψ˜ − u + 10 2 3·2 du 64 da √ 4 Z 9 2π du 9π 3 i 2 dτ 2 3 a u (ψ˜ , 6 ∧ 6 3 ) − a3 ψ˜ ∧ 6 3 ∧ 6 3 ∧ 6 3 + 64 3 du 256 3 da X 135π 6 4 ˜ + a3 u(6 3 ∩ 6 3 ∩ 6 3 ∩ 6 3 ) 9(S), 16 (2.19) where

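$$\begin{aligned}
\mu(\tau)&=-\frac{\sqrt2}{2}\frac{da}{d\tau}A^\chi B^\sigma,\\
\Psi(\tilde S)&=\exp(2i\pi\lambda_0^2)\exp\Big[-\frac{1}{8\pi y}\Big(\frac{du}{da}\Big)^2\tilde S_-^2\Big]\sum_{\lambda\in H^2+\frac12w_2(E)}\exp\Big[-i\pi\bar\tau(\lambda_+)^2-i\pi\tau(\lambda_-)^2+\pi i(\lambda-\lambda_0,w_2(X))\Big]\\
&\qquad\cdot\exp\Big[-i\frac{du}{da}(\tilde S_-,\lambda_-)\Big]\Big[(\lambda_+,\omega)+\frac{i}{4\pi y}\frac{du}{da}(\tilde S_+,\omega)\Big],\\
\hat T(u)&=T(u)+\frac{1}{8\pi\,\mathrm{Im}\,\tau}\Big(\frac{du}{da}\Big)^2.
\end{aligned}\tag{2.20}$$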

Notice that, in the above exponential, all the terms are modular forms of weight zero if $\tilde\psi$ is a modular form of weight $(1, 0)$. There is another check one can do of the above functions: one can formally assign an R-charge to the cycles in such a way that the observables have R-charge zero, namely: $R(\gamma) = -3$, $R(S) = -2$, $R(\Sigma^3) = -1$. Taking into account that $R(\psi) = 1$, $R(a) = 2$, we see that the definitions of $\tilde\psi$, $\tilde S$ are consistent with this R-charge assignment and that, moreover, all the terms in the exponential of (2.20) have zero R-charge. This is in fact an example of Seiberg’s trick since the insertions of (2.6), (2.11) may be regarded as operators in some UV theory (e.g. in a brane configuration). The remaining constants $a_1$, $a_3$ will be fixed below by comparing to certain topological results [17]. In principle, the coefficients in (2.20) are fixed by the proposal of Eq. (2.14) of the first paper in [10]. Although we find the proposal of Losev, Nekrasov, and Shatashvili natural, because of the many conventions involved we have not checked that the coefficients derived in (2.20) are consistent with their Eq. (2.14).

3. Donaldson Wall-Crossing

In this section we want to compute the wall-crossing formulae associated to the region at infinity of the u-plane. To compare to mathematical results, it is better to write the expression in terms of $\psi$, $S$. There are two standard mathematical facts for manifolds of $b_2^+ = 1$ that will be very useful in writing the resulting wall-crossing formula [17]. First, for any $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ in $H^1(X, \mathbb{Z})$, one has $\beta_1 \wedge \beta_2 \wedge \beta_3 \wedge \beta_4 = 0$. Second, the image of the map

$$\wedge : H^1(X,\mathbb{Z})\otimes H^1(X,\mathbb{Z})\longrightarrow H^2(X,\mathbb{Z})\tag{3.1}$$

is generated by a single rational cohomology class $\Sigma$ (not to be confused with the three-cycle $\Sigma^3$). We introduce now the antisymmetric matrix $a_{ij}$ associated to the basis $\beta_i$ of $H^1(X, \mathbb{Z})$, $i = 1, \dots, b_1$, as $\beta_i \wedge \beta_j = a_{ij}\Sigma$. Finally, we introduce the two-form on $\mathbf{T}^{b_1}$ as

$$\Omega=\sum_{i<j}a_{ij}\,\beta_i^\sharp\wedge\beta_j^\sharp,\tag{3.2}$$

which does not depend on the choice of basis. This is a volume element for the torus, hence Z b1 /2 b1 . (3.3) vol(T ) = Tb1 (b1 /2)! (For the manifolds under discussion b1 is even.) As a simple example, if X = CP 1 × Fg , where Fg is a Riemann surface of genus g, then 6 = [CP 1 ] (the Poincaré dual to the two-homology class of CP 1 ) and vol(Tb1 ) = 1. With all these ingredients, we can already write the wall-crossing formulae. Notice that, because of the first fact, many terms on the u-plane integral (2.21) vanish. We will write the formula first for an insertion of two and four observables. In this case, a straightforward generalization of the arguments in [9] gives: W C(λ) = −32i(−1)(λ−λ0 ,w2 (X)) e2π iλ0 (8i)−b1 /2 b du 2 σ/8 dτ 1− 21 ( 2i da 2 2 π dτ ) · q −λ /2 (u − 1) dτ du u2 − 1 · exp 2pu + S 2 T (u) − i(λ, S)/ h √ 2 Z dτ i 2 d u S 2 + λ, ψ dψ exp , · 32 da 2 2π da Tb1 q0 2

(3.4)

where we use the value obtained in [9] for the universal constants $\alpha$, $\beta$, and

$$h(\tau)=\frac{da}{du}=\frac12\vartheta_2\vartheta_3.\tag{3.5}$$

We can actually compute the integral over $\mathbf{T}^{b_1}$ and give an expression in terms of modular forms which generalizes the expressions given in [9, 18, 19]. Using the explicit expressions given in [9] for the Seiberg–Witten solution in terms of modular forms, we find that

$$\frac{d^2u}{da^2}=4f_1(q),\qquad \frac{d\tau}{da}=\frac{16i}{\pi}f_2(q),\tag{3.6}$$

where

$$\begin{aligned}
f_1(q)&=\frac{2E_2+\vartheta_2^4+\vartheta_3^4}{3\vartheta_4^8}=1+24q^{1/2}+\cdots,\\
f_2(q)&=\frac{\vartheta_2\vartheta_3}{2\vartheta_4^8}=q^{1/8}+18q^{5/8}+\cdots.
\end{aligned}\tag{3.7}$$
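The leading coefficients quoted in (3.7) can be checked with exact power-series arithmetic. The following is a minimal sketch, assuming the conventions $q = e^{2\pi i\tau}$, $\vartheta_2 = \sum_n q^{(n+1/2)^2/2}$, $\vartheta_3 = \sum_n q^{n^2/2}$, $\vartheta_4 = \sum_n (-1)^n q^{n^2/2}$ and $E_2 = 1 - 24\sum_{n\ge1}\sigma_1(n)q^n$. These conventions are an assumption of the sketch, chosen because they reproduce the expansions stated in the text. All helper names are illustrative.

```python
from fractions import Fraction

N = 24  # series in t = q**(1/8), truncated modulo t**N


def series(terms):
    """Build a truncated series from (exponent, coefficient) pairs."""
    s = [Fraction(0)] * N
    for e, c in terms:
        if e < N:
            s[e] += c
    return s


def mul(a, b):
    """Product of two truncated series."""
    out = [Fraction(0)] * N
    for i, x in enumerate(a):
        if x:
            for j in range(N - i):
                out[i + j] += x * b[j]
    return out


def power(a, n):
    out = series([(0, 1)])
    for _ in range(n):
        out = mul(out, a)
    return out


def inverse(a):
    """Inverse of a series whose constant term is nonzero."""
    inv = [Fraction(0)] * N
    inv[0] = 1 / a[0]
    for k in range(1, N):
        inv[k] = -sum(a[j] * inv[k - j] for j in range(1, k + 1)) / a[0]
    return inv


ns = range(-N, N + 1)
# In the variable t, q**((n + 1/2)**2 / 2) = t**((2n + 1)**2) and q**(n*n/2) = t**(4*n*n).
theta2 = series(((2 * n + 1) ** 2, 1) for n in ns)
theta3 = series((4 * n * n, 1) for n in ns)
theta4 = series((4 * n * n, 1 if n % 2 == 0 else -1) for n in ns)
E2 = series([(0, 1)] + [(8 * n, -24 * sum(d for d in range(1, n + 1) if n % d == 0))
                         for n in range(1, N // 8 + 1)])

inv_theta4_8 = inverse(power(theta4, 8))
numerator = [2 * e + x + y for e, x, y in zip(E2, power(theta2, 4), power(theta3, 4))]
f1 = [c / 3 for c in mul(numerator, inv_theta4_8)]
f2 = [c / 2 for c in mul(mul(theta2, theta3), inv_theta4_8)]

# Nonzero coefficients as (power of t, coefficient); t**4 = q**(1/2), t**5 = q**(5/8).
print([(k, int(c)) for k, c in enumerate(f1) if c][:2])
print([(k, int(c)) for k, c in enumerate(f2) if c][:2])
```

With these conventions the index `4` of `f1` corresponds to $q^{1/2}$ and the indices `1` and `5` of `f2` to $q^{1/8}$ and $q^{5/8}$, so the first two nonzero entries can be compared directly with (3.7).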

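Using (2.3) and (3.2) we find

$$\psi^2=2(\Sigma\otimes\Omega),\tag{3.8}$$

hence we can perform the integral over $\mathbf{T}^{b_1}$ to write a very explicit expression for (3.4):

$$\begin{aligned}
WC(\lambda)=-\frac{i}{2}(-1)^{(\lambda-\lambda_0,w_2(X))}e^{2\pi i\lambda_0^2}\,2^{-b_1/4}\,\mathrm{vol}(\mathbf{T}^{b_1})\Big[&q^{-\lambda^2/2}h(\tau)^{b_1-2}\vartheta_4^\sigma\exp\Big\{2pu+S^2T(u)-i(\lambda,S)/h\Big\}\\
&\cdot\sum_{b=0}^{b_1/2}\binom{b_1/2}{b}\frac{1}{(8i)^b}\,f_1^{\,b}(q)\,f_2^{\,b_1/2-b-1}(q)\,(S,\Sigma)^b(\lambda,\Sigma)^{b_1/2-b}\Big]_{q^0}.
\end{aligned}\tag{3.9}$$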

Notice that this result confirms the conjecture on p. 18 of [17]. As a check of this expression, and also to fix an overall coefficient depending on b1 , we will compute the wall-crossing for the correlator p r S d−2r on a ruled surface, where d is half the dimension of the moduli space, and compare it to the expressions in [17]. Introduce the integer cohomology class ζ = 2λ. In [17], Muñoz computes the wall-crossing for walls satisfying ζ 2 = p1 and ζ 2 = p1 + 4, where p1 is the Pontryagin number of the instanton bundle. For ζ 2 = p1 , it is easy to see that only the first coefficients in the expansion in q contribute to (3.9), and there is no contribution from the contact term T (u). To compare the formulae, notice that we have to multiply our wall-crossing expression by r!(d −2r)!, as we are considering the Donaldson–Witten generating function. We finally obtain, δζ,λ0 (pr S d−2r ) =

1 (−1)(λ−λ0 ,w2 (X)) (−i)b1 /2 2−3b1 /4−b−d (−1)r+d pr vol(Tb1 ) 2 bX 1 /2 d − 2r (b1 /2)! (S, ζ )d−2r−b (S, 6)b (ζ, 6)b1 /2−b . (b1 /2 − b)! b b=0

(3.10) If we compare with [17], we find perfect agreement except for an overall factor 1/2 (a standard discrepancy between topological and quantum field theory normalizations), and a factor (−i)b1 /2 2−3b1 /4 . The latter factor is due to the normalization of the fermions in the physical theory. In order to make the identification in (2.3) and to use the normalization of topologists, we have to correct the ψ measure with an overall factor i b1 /2 29b1 /4 . As we will see in the next section, with this normalization the above formula gives the right answer for the generalized Seiberg–Witten wall-crossing formula of [20, 21]. The case ζ 2 = p1 + 4 also agrees with [17]. In this case the computation is more involved, as one has to take into account the first two coefficients in the q-expansion of the various functions in (3.9).


We consider now an arbitrary insertion of observables associated to one, two, and three cycles. We have new contact terms as well as new terms in the integration over the torus. To write these in a convenient way, notice that we can use Poincaré duality and the isomorphism $H_1(X, \mathbb{Z}) \simeq H^1(\mathbf{T}^{b_1}, \mathbb{Z})$ to obtain a one-form $\Sigma^{3\sharp}$ in $H^1(\mathbf{T}^{b_1}, \mathbb{Z})$. Define

$$\iota\beta_k^\sharp=\sum_pa_{kp}\,\beta_p^\sharp.\tag{3.11}$$

Using (2.3), we find

$$\psi\wedge\Sigma^3=(\iota\Sigma^{3\sharp})\,\Sigma.\tag{3.12}$$

In the same way, using the isomorphism $H_1(X, \mathbb{Z}) \simeq H^1(X, \mathbb{Z})$ given by $\delta_i \to \beta_i$, we can define $\gamma^\sharp = \sum_{i=1}^{b_1}\zeta_i\beta_i^\sharp$, where the $\zeta_i$ were defined in (2.5). We then obtain, using (2.3) again,

$$\int_\gamma\psi=\gamma^\sharp.\tag{3.13}$$

The functions appearing in the u-plane integral can be written in terms of Jacobi theta functions and Eisenstein series as follows,

$$\frac{dT}{da}=-8f_3(q),\qquad F^{(3)}_{\tau_0}=-\frac{\pi^2}{4}f_4(q),\tag{3.14}$$

where

$$\begin{aligned}
f_3(q)&=\frac{1}{16\vartheta_2\vartheta_3}\Big[\frac{1}{9\vartheta_4^8}\big(2E_2+\vartheta_2^4+\vartheta_3^4\big)^2-1\Big]=q^{3/8}+12q^{7/8}+\cdots,\\
f_4(q)&=\frac{1}{8(\vartheta_2\vartheta_3)^2}\Big[\frac12(\vartheta_2^4+\vartheta_3^4)-E_2+\frac{1}{54\vartheta_4^8}\cdot\big(2E_2+\vartheta_2^4+\vartheta_3^4\big)^3\Big]=q^{1/4}+8q^{3/4}+\cdots.
\end{aligned}\tag{3.15}$$
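The expansions of $f_3$, $f_4$ and the compatibility of (3.5), (3.6), (3.14) with the relation (2.18) can be checked in the same exact-arithmetic style. The sketch below is hedged in two ways. The theta and Eisenstein conventions ($q = e^{2\pi i\tau}$, $\vartheta_2 = \sum_n q^{(n+1/2)^2/2}$, and so on) are assumptions, and the closed forms used for $f_3$, $f_4$ are one reading of (3.15), adopted here because it reproduces the stated leading terms. Dividing (2.18) by $\pi i$ and inserting $dT/da = -8f_3$, $d^2u/da^2 = 4f_1$, $d\tau/da = (16i/\pi)f_2$, $du/da = 2/(\vartheta_2\vartheta_3)$ gives $f_1^2/f_2 - 32f_3 = 2/(\vartheta_2\vartheta_3)$, which is tested after clearing denominators.

```python
from fractions import Fraction

N = 32  # series in t = q**(1/8), truncated modulo t**N


def series(terms):
    s = [Fraction(0)] * N
    for e, c in terms:
        if e < N:
            s[e] += c
    return s


def mul(a, b):
    out = [Fraction(0)] * N
    for i, x in enumerate(a):
        if x:
            for j in range(N - i):
                out[i + j] += x * b[j]
    return out


def power(a, n):
    out = series([(0, 1)])
    for _ in range(n):
        out = mul(out, a)
    return out


def inverse(a):
    inv = [Fraction(0)] * N
    inv[0] = 1 / a[0]
    for k in range(1, N):
        inv[k] = -sum(a[j] * inv[k - j] for j in range(1, k + 1)) / a[0]
    return inv


def shift(a, k):
    """Divide by t**k (the k lowest coefficients must vanish)."""
    assert not any(a[:k])
    return a[k:] + [Fraction(0)] * k


ns = range(-N, N + 1)
th2 = series(((2 * n + 1) ** 2, 1) for n in ns)
th3 = series((4 * n * n, 1) for n in ns)
th4 = series((4 * n * n, 1 if n % 2 == 0 else -1) for n in ns)
E2 = series([(0, 1)] + [(8 * n, -24 * sum(d for d in range(1, n + 1) if n % d == 0))
                         for n in range(1, N // 8 + 1)])

th2_4, th3_4 = power(th2, 4), power(th3, 4)
inv48 = inverse(power(th4, 8))
num = [2 * e + x + y for e, x, y in zip(E2, th2_4, th3_4)]  # 2 E2 + th2^4 + th3^4
th23 = mul(th2, th3)
inv23 = inverse(shift(th23, 1))  # inverse of (theta2 * theta3) / t

f1 = [c / 3 for c in mul(num, inv48)]
f2 = [c / 2 for c in mul(th23, inv48)]

# Reading of (3.15): f3 = [num^2 / (9 th4^8) - 1] / (16 th2 th3)
b3 = [c / 9 for c in mul(power(num, 2), inv48)]
b3[0] -= 1
f3 = shift([c / 16 for c in mul(b3, inv23)], 1)

# Reading of (3.15): f4 = [(th2^4 + th3^4)/2 - E2 + num^3 / (54 th4^8)] / (8 (th2 th3)^2)
cube = mul(power(num, 3), inv48)
b4 = [(x + y) / 2 - e + c / 54 for x, y, e, c in zip(th2_4, th3_4, E2, cube)]
f4 = shift([c / 8 for c in mul(b4, power(inv23, 2))], 2)

# (2.18) multiplied by f2 * th2 * th3 / (pi i):  f1^2 th2 th3 - 32 f3 f2 th2 th3 = 2 f2
lhs = [x - 32 * y for x, y in zip(mul(power(f1, 2), th23), mul(mul(f3, f2), th23))]
identity_ok = lhs[:N - 2] == [2 * c for c in f2][:N - 2]
print(identity_ok)
```

The comparison is restricted to the first `N - 2` coefficients because dividing by powers of $t$ leaves the top coefficients of `f3` and `f4` undetermined at this truncation order.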

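The wall-crossing formula now reads,

$$\begin{aligned}
WC(\lambda)=-\frac{i}{2}(-1)^{(\lambda-\lambda_0,w_2(X))}e^{2\pi i\lambda_0^2}\,2^{7b_1/4}(-i\pi)^{b_1/2}\Big[&q^{-\lambda^2/2}h(\tau)^{b_1-2}f_2(q)^{-1}\vartheta_4^\sigma\\
&\cdot\exp\Big\{2pu+\big(S^2-6\pi^2a_1a_3(\Sigma^3\cap\gamma)\big)T(u)-i(\lambda,S)/h\\
&\qquad-72\sqrt2\,\pi^3a_3^2f_3(q)(\Sigma^3\wedge\Sigma^3,\lambda)+18\sqrt2\,\pi^3i\,a_3^2(\Sigma^3\cap\Sigma^3\cap S)f_4(q)\Big\}\\
&\cdot\int_{\mathbf{T}^{b_1}}\exp\Big[2(P(q),\Sigma)\,\Omega+(R(q),\Sigma)\,\iota\Sigma^{3\sharp}+Q(q)\,\gamma^\sharp\Big]\Big]_{q^0},
\end{aligned}\tag{3.16}$$

where

$$\begin{aligned}
P(q)&=\frac{\sqrt2\,i}{16\pi}\Big[f_1(q)S+8if_2(q)\lambda\Big],\\
R(q)&=3\pi a_3\Big[4if_3(q)S-f_1(q)\lambda\Big],\\
Q(q)&=\frac{\sqrt2\,a_1}{8}\frac{1}{h}.
\end{aligned}\tag{3.17}$$

In the case of ruled surfaces and for $\zeta^2 = p_1$, we find again agreement with the expressions obtained by Muñoz. In fact, comparing to the formula in p. 13 of [17], we can find the values of $a_1$, $a_3$:

$$a_1=\pi^{-1/2}\,2^{3/4}\,e^{-\frac{\pi i}{4}},\qquad a_3=\frac{\pi^{-3/2}}{6}\,2^{1/4}\,e^{\frac{\pi i}{4}}.\tag{3.18}$$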
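As a quick numerical sanity check of the values in (3.18), one can evaluate two combinations of $a_1$, $a_3$ that enter earlier formulas: the coefficient $-6\pi^2a_1a_3$ multiplying $(\Sigma^3\cap\gamma)\,T(u)$ in (3.16), and the constant $(9\sqrt2\,\pi^3 i/4)\,a_3^2$ multiplying $\Sigma^3\wedge\Sigma^3$ in (2.17). This is a minimal sketch. The variable names are illustrative, and the values of $a_1$, $a_3$ are taken as read from (3.18).

```python
import cmath
import math

# Values of a1 and a3 as read from (3.18).
a1 = math.pi ** -0.5 * 2 ** 0.75 * cmath.exp(-1j * math.pi / 4)
a3 = math.pi ** -1.5 / 6 * 2 ** 0.25 * cmath.exp(1j * math.pi / 4)

# Coefficient of (Sigma^3 cap gamma) T(u) in (3.16): -6 pi^2 a1 a3.
coeff_gamma = -6 * math.pi ** 2 * a1 * a3

# Constant multiplying Sigma^3 wedge Sigma^3 in (2.17): (9 sqrt(2) pi^3 i / 4) a3^2.
coeff_sigma = 9 * math.sqrt(2) * math.pi ** 3 * 1j / 4 * a3 ** 2

print(coeff_gamma, coeff_sigma)
```

Both combinations come out real: the phases $e^{\mp\pi i/4}$ cancel in $a_1a_3$, and $a_3^2$ is purely imaginary. This is in line with the later remark that all coefficients in (4.16) are real.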

4. The Seiberg–Witten Contribution

We can now follow the strategy in [9] and find the form of the Seiberg–Witten contribution at the monopole and dyon cusps. We focus on the monopole cusp, as the contribution at the dyon cusp can be obtained using the $\mathbb{Z}_2$ symmetry on the u-plane. In fact, as the functions involved in the monopole contribution are universal and they have been obtained in [9] in the simply-connected case, we will be able to derive the general wall-crossing formula of [20, 21].

A crucial ingredient in the discussion of the Seiberg–Witten contribution for nonsimply connected manifolds is that we have to consider generalized Seiberg–Witten invariants, i.e., we have to consider correlation functions of one-observables. Recall that the basic observable in Seiberg–Witten theory is the two-form $a_D$ on $M_\lambda$. The first descendant of $a_D$ (in the topological field theory associated to the Seiberg–Witten monopole equations) is $\psi$, which is a one-form on $X$ and also a one-form on $M_\lambda$. It can be written as:

$$\psi=\sum_{i=1}^{b_1}\nu_i\beta_i,\tag{4.1}$$

where $\beta_i \in H^1(X, \mathbb{Z})$, $i = 1, \dots, b_1$ is the basis of one-forms considered before, and

$$\nu_i=\int_{\delta_i}\psi\tag{4.2}$$

are the one-observables of Seiberg–Witten theory. The generalized Seiberg–Witten invariant associated to the one-forms $\beta_1, \dots, \beta_r$ is defined by intersection theory on $M_\lambda$:

$$SW(\lambda,\beta_1\wedge\cdots\wedge\beta_r)=\int_{M_\lambda}\nu_1\wedge\cdots\wedge\nu_r\wedge a_D^{\frac{d_\lambda-r}{2}},\tag{4.3}$$

where $d_\lambda = \lambda^2 - (2\chi + 3\sigma)/4$ is the virtual dimension of $M_\lambda$. These generalized invariants (and their wall-crossing) have been considered in [21].

The Seiberg–Witten twisted Lagrangian near $u = 1$ (with the monopole fields included) can be written as [9, 11]

$$\begin{aligned}
\{Q,W\}&+\frac{i}{16\pi}\tilde\tau_M\,F\wedge F+p(u)\,\mathrm{Tr}R\wedge R+\ell(u)\,\mathrm{Tr}R\wedge\tilde R\\
&-\frac{i\sqrt2}{2^7\cdot\pi}\frac{d\tilde\tau_M}{da_D}(\psi\wedge\psi)\wedge F+\frac{i}{3\cdot2^{11}\pi}\frac{d^2\tilde\tau_M}{da_D^2}\,\psi\wedge\psi\wedge\psi\wedge\psi,
\end{aligned}\tag{4.4}$$

and, as we see, the terms which are not $Q$-exact do not depend on the metric. The exponentiation of the terms involving the densities $\mathrm{Tr}R\wedge R$, $\mathrm{Tr}R\wedge\tilde R$ gives, after integration on $X$, the gravitational factors $P(u)^{\sigma/8}$, $L(u)^{\chi/4}$ considered in [9]. The term involving $F\wedge F$ gives a factor $C(u)^{\lambda^2/2}$, where $F = 4\pi\lambda$ and $C(u) = e^{-2\pi i\tilde\tau_M}$. The terms $C(u)$, $L(u)$ and $P(u)$ are universal (they do not depend on the manifold $X$) and they were found in [9] using matching of wall-crossing in the simply-connected case. They can be written as

$$L(u)=\pi i\alpha^4(u^2-1)\frac{d\tau_D}{du},\qquad P(u)=-\pi^2\beta^8a_D^{-1}(u^2-1),\qquad C(u)=\frac{a_D}{q_D},\tag{4.5}$$

where $q_D = e^{2\pi i\tau_D}$. The last relation tells us that the gauge coupling $\tilde\tau_M$ appearing in (4.4) is given by

$$\tilde\tau_M=\tau_D-\frac{1}{2\pi i}\log a_D,\tag{4.6}$$

and therefore it is smooth at the monopole cusp. This defines the prepotential $\tilde F_M(a_D)$ through the equation $\tilde F_M''(a_D) = \tilde\tau_M$.

First we analyze the Seiberg–Witten contribution when only two and four observables are introduced. It can be written as

$$\begin{aligned}
\big\langle e^{pO+I_2(S)}\big\rangle_{\lambda,u=1}=\int_{M_\lambda}&2e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}C(u)^{\lambda^2/2}P(u)^{\sigma/8}L(u)^{\chi/4}\\
&\cdot\exp\Big[2pu+i\frac{du}{da_D}(S,\lambda)+S^2T_M(u)\Big]\exp\Big[(P_M(u),\Sigma)\sum_{i,j=1}^{b_1}a_{ij}\nu_i\nu_j\Big],
\end{aligned}\tag{4.7}$$

where

$$P_M(u)=\frac{i\sqrt2}{32}\Big[\frac{1}{2\pi}\frac{d^2u}{da_D^2}\,S+\frac{d\tilde\tau_M}{da_D}\,\lambda\Big],\tag{4.8}$$

which is the magnetic version of the function in the $\mathbf{T}^{b_1}$ integral of (3.4), but with the smooth coupling constant (4.6). When we expand the exponential involving the one-observables, we obtain the generalized Seiberg–Witten invariants with $2b$ insertions, $b = 0, \dots, b_1$. The final expression is:

$$\begin{aligned}
\big\langle e^{pO+I_2(S)}\big\rangle_{\lambda,u=1}=\sum_{b=0}^{b_1/2}\frac{1}{b!}\,\mathrm{Res}_{a_D=0}\Big[&2e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}C(u)^{\lambda^2/2}P(u)^{\sigma/8}L(u)^{\chi/4}\\
&\cdot\exp\Big[2pu+i\frac{du}{da_D}(S,\lambda)+S^2T_M(u)\Big]a_D^{-d_\lambda/2+b-1}(P_M(u),\Sigma)^b\Big]\\
&\cdot\sum_{i_p,j_p=1}^{b_1}a_{i_1j_1}\cdots a_{i_bj_b}\,SW(\lambda,\beta_{i_1}\wedge\beta_{j_1}\wedge\cdots\wedge\beta_{i_b}\wedge\beta_{j_b}).
\end{aligned}\tag{4.9}$$

On the other hand, the wall-crossing formula for the u-plane integral near $u = 1$ can be written as

$$\begin{aligned}
WC(\lambda)=2\pi i\,2^{11b_1/4}\,i^{b_1/2}\alpha^\chi\beta^\sigma e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\,\mathrm{vol}(\mathbf{T}^{b_1})\cdot\mathrm{Res}_{a_D=0}\Big[&q_D^{-\lambda^2/2}\Big((u^2-1)\frac{d\tau}{du}\Big)^{\chi/4}(u^2-1)^{\sigma/8}\\
&\cdot\exp\Big[2pu+i\frac{du}{da_D}(S,\lambda)+S^2T_M(u)\Big](P(q_D),\Sigma)^{b_1/2}\Big],
\end{aligned}\tag{4.10}$$

where we have included the extra factors depending on $b_1$ that we obtained in the previous section by comparing to the expressions in [17]. Notice that, from (4.6), one has

$$P(q_D)=P_M(u)+\frac{\sqrt2}{64\pi}\frac{1}{a_D}\,\lambda,\tag{4.11}$$

and (4.10) can then be expanded as:

$$\begin{aligned}
WC(\lambda)=2\pi i\Big(\frac{i}{\pi}\Big)^{b_1/2}\alpha^\chi\beta^\sigma e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\,\mathrm{vol}(\mathbf{T}^{b_1})\cdot\sum_{b=0}^{b_1/2}\binom{b_1/2}{b}\mathrm{Res}_{a_D=0}\Big[&q_D^{-\lambda^2/2}\Big((u^2-1)\frac{d\tau}{du}\Big)^{\chi/4}(u^2-1)^{\sigma/8}\\
&\cdot\exp\Big[2pu+i\frac{du}{da_D}(S,\lambda)+S^2T_M(u)\Big]\\
&\cdot2^{11b/2}\pi^b(P_M(u),\Sigma)^b(\lambda,\Sigma)^{b_1/2-b}a_D^{\,b-b_1/2}\Big].
\end{aligned}\tag{4.12}$$

Now we can compare the expressions for wall-crossing obtained from (4.9) and (4.12). Notice again that, to identify the fields $\psi$ with the one-observables in (4.2) we have to be careful with possible normalization factors needed to agree with the normalization used by topologists. But for the terms with no insertions, i.e. $b = 0$ in (4.9), we should be able to obtain the wall-crossing formula for this Seiberg–Witten invariant. In fact, we find:

$$WC(SW(\lambda))=(-1)^{b_1/2}(\lambda,\Sigma)^{b_1/2}\,\mathrm{vol}(\mathbf{T}^{b_1}).\tag{4.13}$$

As $\lambda \in H^2(X, \mathbb{Z}) + \frac12w_2(X)$, and $2\lambda$ is the determinant line bundle of the corresponding $\mathrm{Spin}^c$ structure, we find perfect agreement with [20, 21] (notice that the wall-crossing formula in Theorem 1.2 of [20] should have an extra factor of $2^{-b_1/2}$). For the wall-crossing of Seiberg–Witten invariants with one-observable insertions, the general formula of [21] is (in our notation):

$$WC(SW(\lambda,\beta_1\wedge\cdots\wedge\beta_r))=\frac{(-1)^{\frac{b_1-r}{2}}}{\big(\frac{b_1-r}{2}\big)!}\,(\lambda,\Sigma)^{\frac{b_1-r}{2}}\int_{\mathbf{T}^{b_1}}\beta_1^\sharp\wedge\cdots\wedge\beta_r^\sharp\wedge\Omega^{\frac{b_1-r}{2}},\tag{4.14}$$

and we see that, if we introduce a normalization factor $2^{-9/4}\pi^{-1/2}i$ for the fields $\psi$, we obtain from the matching of wall-crossing the expression

$$\sum_{i_p,j_p=1}^{b_1}a_{i_1j_1}\cdots a_{i_bj_b}\,WC(SW(\lambda,\beta_{i_1}\wedge\beta_{j_1}\wedge\cdots\wedge\beta_{i_b}\wedge\beta_{j_b}))=\frac{(-1)^{b_1/2-b}}{2^b\,(b_1/2-b)!}\,(\lambda,\Sigma)^{b_1/2-b}\,\mathrm{vol}(\mathbf{T}^{b_1}),\tag{4.15}$$

again in agreement with (4.14). The expression (4.14) can be actually derived by considering general insertions of one and three-observables. We then see that, in general, the Donaldson invariants for a non-simply connected manifold should be written in terms of the generalized Seiberg–Witten invariants, and only in this case we have consistent formulae for matching of wall-crossing.

When we include arbitrary observables associated to one- and three-cycles, the above expressions are more complicated and we have to take into account the terms involving $\gamma$ and $\Sigma^3$, as well as the new contact terms. For manifolds of $b_2^+ > 1$, all the contact terms appear in the Seiberg–Witten contribution. They are given by the expression in (2.11), where the prepotential is now the prepotential $\tilde F_M(a_D)$ (notice that, as all the contact terms are regular at the monopole cusp, we can take as well the dual prepotential $F_D(a_D)$). In general we have a complicated expression which can be written explicitly using the previous results. In the simple type case, $d_\lambda = 0$, the Seiberg–Witten moduli space consists of a finite set of points and counting them with appropriate signs we obtain $SW(\lambda)$. We only have to compute the different functions at $u = 1$ (the first term in the expansion in $a_D$). Using also (3.18), we can already write an explicit expression for the SW contribution of $\lambda$ to the Donaldson invariants at $u = 1$, with an arbitrary insertion of observables:

$$\begin{aligned}
\big\langle e^{pO+I_2(S)+I_1(\gamma)+I_3(\Sigma^3)}\big\rangle_{\lambda,u=1}=\;&2^{1+\frac{7\chi}{4}+\frac{11\sigma}{4}}e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}\\
&\cdot\exp\Big\{2p+\frac12S^2-(\Sigma^3\cap\gamma)-\frac14(\Sigma^3\wedge\Sigma^3,S)+\frac{1}{96}(\Sigma^3\cap\Sigma^3\cap\Sigma^3\cap\Sigma^3)\Big\}\\
&\cdot\exp\Big\{2(S,\lambda)-\frac14(\Sigma^3\wedge\Sigma^3,\lambda)\Big\}\,SW(\lambda).
\end{aligned}\tag{4.16}$$

We see that the contact terms give, on the one hand, new contributions which depend on the intersection theory on $X$. On the other hand, there is a term coming from the intersection of the two-cycle $\Sigma^3 \cap \Sigma^3$ with the Poincaré dual of the basic class $\lambda$, as was suspected in [6] using the cosmic string approach. Notice that all the coefficients in (4.16) are real, as needed for consistency.

We can now write an expression for the Donaldson invariants of non-simply connected manifolds of simple type, summing over the basic classes and considering both the monopole and the dyon contributions. Using the $\mathbb{Z}_2$ symmetry on the u-plane, and taking into account the R-charges of the different terms in (4.16), we obtain:²

$$\begin{aligned}
\big\langle e^{pO+I_2(S)+I_1(\gamma)+I_3(\Sigma^3)}\big\rangle=2^{1+\frac{7\chi}{4}+\frac{11\sigma}{4}}\Big(&\exp\Big\{2p+\frac12S^2-(\Sigma^3\cap\gamma)-\frac14(\Sigma^3\wedge\Sigma^3,S)+\frac{1}{96}(\Sigma^3\cap\Sigma^3\cap\Sigma^3\cap\Sigma^3)\Big\}\\
&\cdot\sum_\lambda e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}SW(\lambda)\exp\Big\{2(S,\lambda)-\frac14(\Sigma^3\wedge\Sigma^3,\lambda)\Big\}\\
&+i^{\frac{\chi+\sigma}{4}-w_2^2(E)}\exp\Big\{-2p-\frac12S^2+(\Sigma^3\cap\gamma)+\frac14(\Sigma^3\wedge\Sigma^3,S)-\frac{1}{96}(\Sigma^3\cap\Sigma^3\cap\Sigma^3\cap\Sigma^3)\Big\}\\
&\cdot\sum_\lambda e^{2i\pi(\lambda_0\cdot\lambda+\lambda_0^2)}SW(\lambda)\exp\Big\{-2i(S,\lambda)+\frac{i}{4}(\Sigma^3\wedge\Sigma^3,\lambda)\Big\}\Big).
\end{aligned}\tag{4.17}$$

5. Donaldson Invariants for Product Ruled Surfaces

In this section we will use the above results and the formulae in Sect. 8 of [9] to write general expressions for the Donaldson invariants of product ruled surfaces $X = \mathbb{CP}^1 \times F_g$, where $F_g$ is a Riemann surface of genus $g$.³ For these manifolds, $b_1 = 2g$, $b_2 = 2$, $b_2^+ = 1$, so $\sigma = 0$ and $\chi = 4 - 2b_1$. $H^2(X, \mathbb{Z})$ is generated by the cohomology classes $[\mathbb{CP}^1]$, $[F_g]$, with intersection form $II^{1,1}$. Consider a general period point $\omega$ given by

$$\omega(\theta)=\frac{1}{\sqrt2}\big(e^\theta[\mathbb{CP}^1]+e^{-\theta}[F_g]\big).\tag{5.1}$$

The standard constant curvature metrics for $\mathbb{CP}^1$, $F_g$ give natural representatives for $[\mathbb{CP}^1]$, $[F_g]$. Thus, choosing coordinates $z \in \mathbb{C}$ for $\mathbb{CP}^1$ and representing $F_g$ (for $g > 1$) as a quotient by a Fuchsian group of the Poincaré disk $D = \{w : |w| < 1\}$, we have

$$[\mathbb{CP}^1]=\frac{i}{2\pi(g-1)}\frac{dw\wedge d\bar w}{(1-|w|^2)^2},\qquad [F_g]=\frac{i}{2\pi}\frac{dz\wedge d\bar z}{(1+|z|^2)^2}.\tag{5.2}$$

A metric with period point (5.1) then has scalar curvature $8\pi(e^\theta - e^{-\theta}(g-1))$ and will hence be positive for $e^{2\theta} > g - 1$. By [7] there are no SW contributions to the Donaldson invariants for this metric. Thus, to compute the invariants in the chamber corresponding to the limit of a small volume for $\mathbb{CP}^1$ (with $\theta$ positive and large), we need only evaluate the u-plane integral.

² Curiously, the coefficient of $(\Sigma^3)^4$ is $1/96 = 2^{-5}\cdot3^{-1}$. The factor of 3 would seem to imply divisibility properties by 3 for certain intersection numbers. These are easily confirmed in simple examples, but we did not find a general proof.
³ This case has also been discussed in Eq. (2.15) of [10].

The value of the integral depends on the non-abelian magnetic flux w2 (E). The vanishing argument of Sect. 5 of [9] applies when w2 (E) · [CP 1 ] 6= 0, (i.e. there is nontrivial flux through the small rational fiber). In this case we have simply that Zu = 0. Now consider w2 (E) = 0. Here we can use the computation of Sect. 8 of [9]. To do this, we choose the primitive null vector z = [CP 1 ] = 6. The expression derived in [9] 2 = (z, ω)2 is small, which is the case in the chamber under will be valid as long as z+ consideration (corresponding to θ large and positive). If we consider the expression (2.20) on a product ruled surface, we see that it involves a lattice sum 9(e S) and a term involving the observables and the measure. From this last term we can derive an expression generalizing the holomorphic function f appearing in Sect. 8 of [9]: √ dτ 1−g du 2 (8i)1−g (u2 − 1) f = 8πi du dτ √ 1 2 dτ 2 3 3 a (S, ψ˜ 2 ) · exp 2pu + S − (S, 6 ∧ 6 ) T (u) + 8 64 da (5.3) Z 3π i du a1 du 3 3 ˜ ˜ a3 (S, ψ ∧ 6 ) ψ ∧ [γ ] − u(6 ∩ γ ) − + √ 8 da 4 2 da X 1 − u(S, 6 3 ∧ 6 3 ) , 12 where we have taken into account the relation (2.17), and a3 is given by (3.18). We also ˜ z) = (S, 6). Using the computation leading to Eq. (8.15) of [9], we see have that (S, that the Donaldson invariants of product ruled surfaces in the limit of a small volume for CP 1 and for w2 (E) = 0 are given by Z (S, [CP 1 ]) √ ˜ d ψ f h coth i . (5.4) −8 2πi 2h Tb1 q0 When we consider only insertions of zero and two-observables, the integration over the torus is easily done and we obtain the following expression for the Donaldson invariants of CP 1 × Fg , which generalizes the expression obtained in [19, 9] for CP 1 × CP 1 : g (S, [CP 1 ]) 2 . (5.5) −16i h(u2 − 1)e2pu+S T (u) 2h2 f1 (q)(S, [CP 1 ]) coth i 2h q0 If w2 (E) = [CP 1 ], one can analyze the lattice reduction as in [9] including this nonzero flux, which gives extra phases function. 
The effect of these phases in the lattice theta 1 is simply to change the coth i(S, [CP ])/2h in (5.4) by −i csc (S, [CP 1 ])/2h , and we then obtain the generating function for CP 1 × Fg and with w2 (E) = [CP 1 ]: g (S, [CP 1 ]) 2 −16 h(u2 − 1)e2pu+S T (u) 2h2 f1 (q)(S, [CP 1 ]) csc , (5.6) 2h q0 where we have included only zero and two-observables. For applications to Floer theory and Gromov–Witten theory the most interesting chamber is on the other side of the Kähler cone, namely when θ → −∞ giving a very small volume to Fg . We can obtain the Donaldson invariants for this chamber by summing over all the wall-crossing discontinuities. In general, denoting the lift 2λ0 of


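$w_2(E)$ by $\lambda_0 = \tfrac{1}{2}\epsilon_1[\mathbb{CP}^1] + \tfrac{1}{2}\epsilon_2[F_g]$ with $\epsilon_{1,2} = 0, 1$, we have walls at $\omega \cdot \lambda_{n,m} = 0$ where

$$\lambda_{n,m} = \left(n + \tfrac{1}{2}\epsilon_1\right)[\mathbb{CP}^1] + \left(m + \tfrac{1}{2}\epsilon_2\right)[F_g] \tag{5.7}$$

for $(n + \tfrac{1}{2}\epsilon_1)(m + \tfrac{1}{2}\epsilon_2) < 0$, $n, m \in \mathbb{Z}$.⁴ The infinite sums over wall-crossings can be written explicitly in terms of modular forms using a result of Kronecker [22]:⁵

$$\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} q^{mn}\left(e^{2\pi i(n\theta_1 + m\theta_2)} - e^{-2\pi i(n\theta_1 + m\theta_2)}\right) = -1 + \frac{1}{1 - e^{-2\pi i\theta_1}} + \frac{1}{1 - e^{-2\pi i\theta_2}} - i\eta^3(\tau)\,\frac{\vartheta_1(\theta_1 + \theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\,\vartheta_1(\theta_2|\tau)}. \tag{5.8}$$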

We will consider in detail the cases $2\lambda_0 = [\mathbb{CP}^1]$, $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$, which are relevant for the Floer cohomology of $F_g \times S^1$ [2, 24] as well as to the quantum cohomology of the moduli space of stable bundles over $F_g$ [3, 25]. We consider only insertions of zero, one, and two observables (i.e. we put $\Sigma_3 = 0$). As $w_2(X) = 0$ for product ruled surfaces, the only $\lambda$-dependent factors in the wall-crossing expression (3.16) are

$$q^{-\lambda^2/2}\exp\left(-\frac{i}{h}(\tilde S, \lambda)\right), \tag{5.9}$$

where $\tilde S$ has been defined in (2.14). Define now the formal variables

$$2\pi\theta_1 = \frac{1}{h}([\mathbb{CP}^1], \tilde S) = \frac{1}{h}([\mathbb{CP}^1], S), \qquad 2\pi\theta_2 = -\frac{1}{h}([F_g], \tilde S) = -\frac{1}{h}([F_g], S) + \frac{\sqrt{2}}{16}\frac{d\tau}{da}. \tag{5.10}$$

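The sum of wall-crossings can then be written, for $2\lambda_0 = [\mathbb{CP}^1]$, as

$$-\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} q^{m(n-1/2)}\left(e^{2\pi i((n-1/2)\theta_1 + m\theta_2)} - e^{-2\pi i((n-1/2)\theta_1 + m\theta_2)}\right) = \frac{i}{2\sin(\pi\theta_1)} + i\eta^3(\tau)\,\frac{\vartheta_4(\theta_1 + \theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\,\vartheta_4(\theta_2|\tau)}, \tag{5.11}$$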

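and for $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$ as

$$-\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} q^{(m-1/2)(n-1/2)}\left(e^{2\pi i((n-1/2)\theta_1 + (m-1/2)\theta_2)} - e^{-2\pi i((n-1/2)\theta_1 + (m-1/2)\theta_2)}\right) = i\eta^3(\tau)\,\frac{\vartheta_1(\theta_1 + \theta_2|\tau)}{\vartheta_4(\theta_1|\tau)\,\vartheta_4(\theta_2|\tau)}. \tag{5.12}$$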

⁴ We require $\lambda^2 < 0$ rather than $\lambda^2 \le 0$ because the “walls” at $\lambda^2 = 0$ are on the light-cone, and are never crossed when moving from one chamber to another.
⁵ Curiously, this identity showed up in recent studies of elliptic quantum cohomology [23].


These explicit expressions are obtained from (5.8) by shifting $\theta_1$, $\theta_2$ appropriately. The quotients of theta functions can be written in terms of Weierstrass $\sigma$ functions, as in the blow-up formulae of [26]:

$$\frac{\vartheta_1(\theta|\tau)}{\vartheta_1'(0|\tau)} = \frac{1}{\omega_2}\,e^{-\eta_2\omega_2\theta^2}\sigma(t), \qquad \frac{\vartheta_4(\theta|\tau)}{\vartheta_4(0|\tau)} = e^{-\eta_2\omega_2\theta^2}\sigma_3(t), \tag{5.13}$$

where $t = \omega_2\theta$ and $\omega_2$ corresponds to the a-period of the Seiberg–Witten curve, $\omega_2 = \frac{8\pi}{\sqrt{2}}h$. The value of the Weierstrass zeta function at $\omega_2$ can be written in terms of $E_2(\tau)$ as

$$\eta_2 = \frac{\pi^2}{6\omega_2}E_2(\tau). \tag{5.14}$$

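In fact, the terms in (5.13) involving $\eta_2$ cancel the $E_2$ factors in the wall-crossing formula, and the resulting expressions are modular forms of weight zero (after integrating on $T^{b_1}$). The coefficients in the expansion of the $\sigma$ functions only depend on the zero-observable $u$. After writing the Seiberg–Witten elliptic curve in Weierstrass form, one has that

$$g_2 = \frac{1}{4}\left(\frac{u^2}{3} - \frac{1}{4}\right), \qquad g_3 = \frac{1}{48}\left(\frac{2u^3}{9} - \frac{u}{4}\right), \tag{5.15}$$

and the root relevant for $\sigma_3(t)$ is $e_3 = -u/12$. We then have the expansions

$$\sigma(t) = t - \frac{g_2 t^5}{2^4 \cdot 3 \cdot 5} + \cdots, \qquad \sigma_3(t) = 1 + \frac{u}{24}t^2 + \cdots. \tag{5.16}$$

Using (5.13) we find

$$\eta^3(\tau)\,\frac{\vartheta_4(\theta_1 + \theta_2|\tau)}{\vartheta_1(\theta_1|\tau)\,\vartheta_4(\theta_2|\tau)} = -\frac{4}{\sqrt{2}}\,h\,e^{-2\eta_2\omega_2\theta_1\theta_2}\,\frac{\sigma_3(t_1 + t_2)}{\sigma(t_1)\,\sigma_3(t_2)}, \qquad \eta^3(\tau)\,\frac{\vartheta_1(\theta_1 + \theta_2|\tau)}{\vartheta_4(\theta_1|\tau)\,\vartheta_4(\theta_2|\tau)} = -\frac{\sqrt{2}}{4}\,h\,e^{-2\eta_2\omega_2\theta_1\theta_2}\,\frac{\sigma(t_1 + t_2)}{\sigma_3(t_1)\,\sigma_3(t_2)}. \tag{5.17}$$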
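As a quick sanity check on (5.15) and (5.16), the following sketch verifies in exact rational arithmetic that $e_3 = -u/12$ is a root of the Weierstrass cubic $4x^3 - g_2x - g_3$, and that the $t^2$ coefficient of $\sigma_3(t)$ is $u/24$. It assumes the standard expansions $\sigma(t) = t - g_2t^5/240 - \cdots$ and $\sigma_3(t) = 1 - e_3t^2/2 + \cdots$; the function names are illustrative and not part of the original text.

```python
from fractions import Fraction


def weierstrass_invariants(u):
    """Return (g2, g3) as given in (5.15) for a value u of the zero-observable."""
    u = Fraction(u)
    g2 = Fraction(1, 4) * (u**2 / 3 - Fraction(1, 4))
    g3 = Fraction(1, 48) * (2 * u**3 / 9 - u / 4)
    return g2, g3


def cubic(x, g2, g3):
    """Evaluate the Weierstrass cubic 4x^3 - g2 x - g3."""
    return 4 * x**3 - g2 * x - g3


def check(u):
    """True if e3 = -u/12 is a root and the t^2 coefficient of sigma_3 is u/24."""
    u = Fraction(u)
    g2, g3 = weierstrass_invariants(u)
    e3 = -u / 12
    # Assumed standard expansion: sigma_3(t) = 1 - (e3 / 2) t^2 + ...
    return cubic(e3, g2, g3) == 0 and -e3 / 2 == u / 24


# Sample values of u, chosen arbitrarily for illustration.
results = [check(u) for u in (Fraction(1, 3), 2, -5, Fraction(7, 2))]
print(results)
```

The denominator $2^4 \cdot 3 \cdot 5$ in (5.16) equals 240, in agreement with the assumed expansion of $\sigma(t)$.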

Using these expressions, we can obtain an explicit answer for the Donaldson–Witten function of product ruled surfaces in the chamber where the volume of $F_g$ is small. For $2\lambda_0 = [\mathbb{CP}^1]$, we have to add the expression for the chamber where the volume of $[\mathbb{CP}^1]$ is small, and the infinite sum of wall-crossing terms. When the volume of $[\mathbb{CP}^1]$ is small, we have to use (5.4) but with the $-i\csc\left((S, [\mathbb{CP}^1])/2h\right)$ function as in (5.6). We then obtain

$$Z_{DW}^{[\mathbb{CP}^1]} = -2^{(7g+1)/2}(-i\pi)^g\left[h^{2g-1}f_2(q)^{-1}e^{(2p + \frac{S^2}{3})u}\int_{T^{b_1}}\exp\left(\frac{i\sqrt{2}}{24\pi}M(q)(S, [\mathbb{CP}^1]) + Q(q)\gamma\right)\cdot\frac{\sigma_3(t_1 + t_2)}{\sigma(t_1)\,\sigma_3(t_2)}\right]_{q^0}, \tag{5.18}$$


where

$$M(q) = \frac{\vartheta_2^4 + \vartheta_3^4}{\vartheta_4^8}. \tag{5.19}$$

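For $2\lambda_0 = [\mathbb{CP}^1] + [F_g]$ we obtain:

$$Z_{DW}^{[\mathbb{CP}^1]+[F_g]} = \frac{\sqrt{2}}{8}\,2^{7g/2}(-i\pi)^g\left[h^{2g-1}f_2(q)^{-1}e^{(2p + \frac{S^2}{3})u}\int_{T^{b_1}}\exp\left(\frac{i\sqrt{2}}{24\pi}M(q)(S, [\mathbb{CP}^1]) + Q(q)\gamma\right)\cdot\frac{\sigma(t_1 + t_2)}{\sigma_3(t_1)\,\sigma_3(t_2)}\right]_{q^0}. \tag{5.20}$$

As a check of (5.20), one can easily derive, using the expansion of the $\sigma$ functions, the result

$$D^{[\mathbb{CP}^1]+[F_1]}\left((I(\mathbb{CP}^1))^2\right) = -2, \tag{5.21}$$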

where $D^{[\mathbb{CP}^1]+[F_1]}$ denotes the Donaldson invariant in the chamber $\mathrm{vol}(F_1) \to 0$. This agrees with the computation in [2]. Recently the proof of the Atiyah conjecture on the relation of symplectic and instanton Floer homology has been completed [24, 25]. It is possible that the above expressions can be used to give another proof of this conjecture. In fact,

$$Z_{DW}^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_g])} = Z_{DW}^{[\mathbb{CP}^1]} + Z_{DW}^{[\mathbb{CP}^1]+[F_g]} \tag{5.22}$$

is a generating function for the Gromov–Witten invariants of the moduli space of flat connections [2, 3, 24, 25]. In the simple case $g = 1$, one finds that

$$D^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_1])}(1) = -1, \qquad D^{([\mathbb{CP}^1],\,[\mathbb{CP}^1]+[F_1])}(O) = 2, \tag{5.23}$$

where $O$ is the zero-observable. Taking into account that the generator of the quantum cohomology of the moduli space of flat connections on $F_1$ (or of the Floer cohomology of $F_1 \times S^1$) is given by $\beta = -4O$, we obtain the relation $\beta = 8$, which is the first quantum correction to the classical cohomology ring in genus one.

Acknowledgements. We would like to thank Y. Ruan and E. Witten for some discussions. We would also like to thank V. Muñoz and Tianjun Li for very useful and clarifying correspondence. This work is supported by DOE grant DE-FG02-92ER40704.

References
1. The Floer Memorial Volume. H. Hofer et al. eds., Boston–Basel: Birkhäuser, 1995
2. Donaldson, S.: Floer homology and algebraic geometry. In: Vector bundles in algebraic geometry, N.J. Hitchin et al. eds. Cambridge: Cambridge University Press, 1995
3. Bershadsky, M., Johansen, A., Sadov, V. and Vafa, C.: Topological Reduction of 4D SYM to 2D σ-Models. hep-th/9501096; Nucl. Phys. B448, 166 (1995)
4. Morgan, J. and Mrowka, T.: A note on Donaldson’s polynomial invariants. Int. Math. Res. Not. 10, 223 (1992)
5. Witten, E.: Topological Quantum Field Theory. Commun. Math. Phys. 117, 353 (1988)
6. Witten, E.: Supersymmetric Yang–Mills theory on a four-manifold. hep-th/9403193; J. Math. Phys. 35, 5101 (1994)
7. Witten, E.: Monopoles and four-manifolds. hep-th/9411102; Math. Res. Letters 1, 769 (1994)
8. Witten, E.: On S-duality in abelian gauge theory. hep-th/9505186; Selecta Mathematica 1, 383 (1995)
9. Moore, G. and Witten, E.: Integration over the u-plane in Donaldson theory. hep-th/9709193
10. Losev, A., Nekrasov, N. and Shatashvili, S.: Issues in topological gauge theory. hep-th/9711108; Testing Seiberg–Witten solution. hep-th/9801061
11. Mariño, M. and Moore, G.: Integrating over the Coulomb branch in N = 2 gauge theory. hep-th/9712062
12. Mariño, M. and Moore, G.: The Donaldson–Witten function for gauge groups of rank larger than one. hep-th/9802185
13. Takasaki, K.: Integrable Hierarchies and Contact Terms in u-plane Integrals of Topologically Twisted Supersymmetric Gauge Theories. hep-th/9803217
14. Verlinde, E.: Global aspects of electric-magnetic duality. hep-th/9506011; Nucl. Phys. B455, 211 (1995)
15. Donaldson, S.K. and Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford: Clarendon Press, 1990
16. Gorsky, A., Marshakov, A., Mironov, A. and Morozov, A.: RG equations from Whitham hierarchy. hep-th/9802007
17. Muñoz, V.: Wall-crossing formulae for algebraic surfaces with q > 0. alg-geom/9709002
18. Göttsche, L.: Modular forms and Donaldson invariants for 4-manifolds with b+ = 1. alg-geom/9506018; J. Am. Math. Soc. 9, 827 (1996)
19. Göttsche, L. and Zagier, D.: Jacobi forms and the structure of Donaldson invariants for 4-manifolds with b+ = 1. alg-geom/9612020
20. Li, T.J. and Liu, A.: General wall-crossing formula. Math. Res. Lett. 2, 797 (1995)
21. Okonek, C. and Teleman, A.: Seiberg–Witten invariants for manifolds with b2+ = 1, and the universal wall-crossing formula. alg-geom/9603003; Int. J. Math. 7, 811 (1996)
22. Weil, A.: Elliptic Functions according to Eisenstein and Kronecker. Berlin–Heidelberg–New York: Springer-Verlag, 1976
23. Polishchuk, A.: Massey and Fukaya products on elliptic curves. alg-geom/9803017
24. Muñoz, V.: Ring structure of the Floer cohomology of Σ × S1. dg-ga/9710029
25. Muñoz, V.: Quantum cohomology of the moduli space of stable bundles over a Riemann surface. alg-geom/9711030
26. Fintushel, R. and Stern, R.J.: The blowup formula for Donaldson invariants. alg-geom/9405002; Ann. Math. 143, 529 (1996)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 269 – 295 (1999)


Gleason’s Theorem for Rectangular JBW∗-Triples⋆
C. Martin Edwards¹, Gottfried T. Rüttimann²
¹ The Queen’s College, Oxford, United Kingdom. E-mail: [email protected]
² University of Berne, Berne, Switzerland. E-mail: [email protected]

Received: 3 August 1998 / Accepted: 20 October 1998

Abstract: A JBW∗-triple B is said to be rectangular if there exists a W∗-algebra A and a pair (p, q) of centrally equivalent elements of the complete orthomodular lattice P(A) of projections in A such that B is isomorphic to the JBW∗-triple pAq. Any weak∗-closed injective operator space provides an example of a rectangular JBW∗-triple. The principal order ideal $CP(A)_{(p,q)}$ of the complete ∗-lattice CP(A) of centrally equivalent pairs of projections in a W∗-algebra A, generated by (p, q), forms a complete lattice that is order isomorphic to the complete lattice I(B) of weak∗-closed inner ideals in B and to the complete lattice S(B) of structural projections on B. Although not itself, in general, orthomodular, $CP(A)_{(p,q)}$ possesses a complementation that allows for definitions of orthogonality, centre, and central orthogonality to be given. A less familiar notion in lattice theory, that is well-known in the theory of Jordan algebras and Jordan triple systems, is that of rigid collinearity of a pair $(e_1, f_1)$ and $(e_2, f_2)$ of elements of $CP(A)_{(p,q)}$. This is defined and characterized in terms of properties of P(A). A W∗-algebra A is sometimes thought of as providing a model for a statistical physical system. In this case B, or, equivalently, pAq, may be thought of as providing a model for a fixed sub-system of that represented by A. Therefore, $CP(A)_{(p,q)}$ may be considered to represent the set consisting of a particular kind of sub-system of that represented by pAq. Central orthogonality and rigid collinearity of pairs of elements of $CP(A)_{(p,q)}$ may be regarded as representing two different types of disjointness, the former, classical disjointness, and the latter, decoherence, of the two sub-systems. It is therefore natural to consider bounded measures m on $CP(A)_{(p,q)}$ that are additive on centrally orthogonal and rigidly collinear pairs of elements. Using results of J.D.M. Wright, it is shown that, provided that neither of the two hereditary sub-W∗-algebras pAp and qAq of A has a weak∗-closed ideal of Type $I_2$, such measures are precisely those that are the restrictions of bounded sesquilinear functionals $\phi_m$ on pAp × qAq with the property that the action of the centroid Z(B) of B commutes with the adjoint operation. When B is a complex Hilbert space of dimension greater than two, this result reduces to Gleason’s Theorem.

⋆ Research partially supported by the Royal Society and the Swiss National Science Foundation.

1. Introduction

This paper is concerned with the structure of rectangular JBW∗-triples. A JBW∗-triple B is said to be rectangular if there exists a W∗-algebra A and a pair (p, q) of projections in A such that B is isomorphic to the JBW∗-triple pAq. The family of rectangular JBW∗-triples includes, for example, all weak∗-closed injective operator spaces [31]. Since all the properties that will be discussed in this paper are invariant under isomorphisms, without loss of generality, B will always be identified with pAq. For the general theory of JBW∗-triples the reader is referred to [3,14,21,23,27] and [33]. In the study of the structure of the rectangular JBW∗-triple B, the weak∗-closed subspaces J that arise naturally are the inner ideals, which are defined by the property that, for each element a in J and each element b in B, the element $ab^*a$ lies in J. A pair (e, f) of projections in the W∗-algebra A is said to be centrally equivalent if e and f have the same central support projection. The authors showed in [15] that every weak∗-closed inner ideal in a W∗-algebra A is of the form eAf, for a unique centrally equivalent pair (e, f) of projections in A. Since, for arbitrary projections p and q in A, pAq is a weak∗-closed inner ideal in A, there is no loss of generality in assuming throughout that the pair (p, q) is centrally equivalent. The results of [15] show that every weak∗-closed inner ideal in B is of the form eAf, where (e, f) is a centrally equivalent pair of projections in A, with e and f dominated by p and q, respectively. A linear projection P on B is said to be structural if, for each element a in B, the elements $(Pa)a^*(Pa)$ and $P(a(Pa)^*a)$ of B coincide.
It follows from the results of [13,15,17] and [18] that every such projection is weak∗-continuous and contractive and is of the form $a \mapsto eaf$ for a centrally equivalent pair (e, f) of projections in A, with e and f dominated by p and q, respectively. It is a consequence of the results referred to above that the sets $CP(A)_{(p,q)}$ of centrally equivalent pairs of projections in A such that e and f are dominated by p and q, respectively, and S(B) of structural projections on B, with appropriate partial orderings, form complete lattices which are order isomorphic to the complete lattice I(B) of weak∗-closed inner ideals in B. A W∗-algebra A is often thought of as a model for a statistical quantum-mechanical system, the bounded observables of which are represented by self-adjoint elements of A, and the propositions of which are represented by elements of the complete orthomodular lattice P(A) of projections in A. Weak∗-continuous contractive projections on A can be thought of as representing filtering processes on the physical system, and their ranges as representing sub-systems. Consequently, the sub-systems corresponding to structural projections are represented by weak∗-closed inner ideals in A. It follows that the rectangular JBW∗-triple B may be considered as representing a physical system, which, though not itself classical or quantum, may possess sub-systems that are. Its structural sub-systems are represented by weak∗-closed inner ideals or, equivalently, by elements of $CP(A)_{(p,q)}$. In this paper, sometimes using the corresponding results for CP(A) discussed in [20], the properties of the complete lattice $CP(A)_{(p,q)}$ of centrally equivalent pairs of projections in the W∗-algebra A, dominated by (p, q), are examined in some detail. It is shown that $CP(A)_{(p,q)}$ possesses a rich structure involving notions of compatibility, orthogonality, central orthogonality and rigid collinearity, all of which have physical interpretations.


For a physical system represented by a W∗ -algebra A, a discussion of its statistics may be approached in two ways. In the traditional approach, states of the system are represented by bounded ortho-additive measures on P(A), whilst a second approach, sometimes referred to as that involving quantum histories ([24–26,34]), states are represented by measures on CP(A). The generalized Gleason Theorem of Bunce and Wright ([6–8]) shows that, provided that A has no weak∗ -closed ideal of Type I2 , states of the first kind are the restrictions of bounded linear functionals on A, and the results of [20] show that, under the same conditions, the relevant measures on CP(A) are the restrictions of bounded sesquilinear functionals, or decoherence functionals, on A × A. When A is a Type I factor, the second approach subsumes the first. For a system represented by a rectangular JBW∗ -triple B, where B is isomorphic to pAq and p and q are not equal, the first approach is not available. However, it is possible to appeal to the second approach. The central orthogonality and rigid collinearity of a pair (e1 , f1 ), (e2 , f2 ) of elements of CP(A)(p,q) correspond to two different kinds of disjointness of the corresponding sub-systems, the first classical disjointness, and the second, decoherence. Consequently, it is natural to study those bounded measures m on CP(A)(p,q) which have the property that, for each pair (e1 , f1 ), (e2 , f2 ) of either centrally orthogonal or rigidly collinear elements of CP(A)(p,q) , m((e1 , f1 ) ∨ (e2 , f2 )) = m((e1 , f1 )) + m((e2 , f2 )). Using results of [20] and [34], it is shown that, provided that neither of the W∗ -algebras pAp or qAq has a direct summand of Type I2 , such measures are the restrictions of a particular class of bounded sesquilinear functionals on pAp × qAq. In the case in which p and q coincide, these are the decoherence functionals, mentioned above, and discussed in [24–26] and [34–36]. 
Furthermore, a measure is normal if and only if the corresponding sesquilinear functional is separately weak∗ -continuous. The results obtained can be couched equivalently as properties of S(B) or I(B). The paper is organized as follows. In Sect. 2 definitions are given, notation is established and certain preliminary results are described. In Sect. 3 rectangular JBW∗ -triples are defined and the notion of compatibility in the complete lattice CP(A)(p,q) , which is order isomorphic to the complete lattice I(B) of weak∗ -closed inner ideals in the rectangular JBW∗ -triple B, is introduced. A more detailed study of the structure of CP(A)(p,q) is carried out in Sect. 4. In particular the notions of orthogonality, centre, central orthogonality and rigid collinearity are introduced and related to the centroid of B. Whilst the structure of CP(A)(p,q) and its physical interpretation are of independent interest, the main results of the paper are the generalization of Gleason’s Theorem [22], and the identification of normal measures on CP(A)(p,q) as the restrictions of separately weak∗ -continuous sesquilinear functionals, which are proved in Sect. 5. The last section is devoted to a discussion of examples, including that of the rectangular JBW∗ -triple Mr,s (C) of r × s complex matrices. In many ways, the most illuminating example is provided by the rectangular JBW∗ triple B which is itself a complex Hilbert space. In this case, the complete lattice I(B) of weak∗ -closed inner ideals is the complete lattice of closed subspaces of B and the complete lattice S(B) of structural projections is the complete lattice P(B(B)) of projections in the W∗ -algebra B(B) of bounded linear operators on the Hilbert space B. Since the centre ZS(B) of S(B) is trivial, there are no non-trivial centrally orthogonal pairs of elements of S(B). More surprisingly, there are no non-trivial orthogonal pairs of elements of S(B). 
However, two elements Q1 and Q2 of S(B) are rigidly collinear if and only if


they are orthogonal in the complete orthomodular lattice P(B(B)). Consequently, the bounded measures on S(B), discussed in Sect. 5, reduce to ortho-additive measures on P(B(B)) and the main results, Theorems 5.2 and 5.3, reduce to Gleason’s Theorem [22]. The conclusion that can be deduced from this is that, in generalizing Gleason’s Theorem to rectangular JBW∗-triples, it is rigid collinearity, not orthogonality, that is the relevant binary relation for the additivity of measures.

2. Preliminaries

Recall that a partially ordered set P is said to be a lattice if, for e and f in P, the supremum $e \vee f$ and the infimum $e \wedge f$ exist. The partially ordered set P is said to be a complete lattice if, for any family $(e_j)_{j\in\Lambda}$ of P, the supremum $\bigvee_{j\in\Lambda} e_j$ and the infimum $\bigwedge_{j\in\Lambda} e_j$ exist. A complete lattice has a greatest element, denoted by 1 and a least element, denoted by 0. For each element p in the complete lattice P, the subset $P_p$ consisting of elements e in P majorized by p forms a complete lattice with greatest element p and least element 0, and both the infimum and supremum of a family of elements of $P_p$ is the same whether calculated in P or in $P_p$. The complete lattice $P_p$ is said to be the principal order ideal in P generated by p. A complete lattice P together with an anti-order automorphism $e \mapsto e'$ of P such that, for all elements e in P, $e \vee e' = 1$, $e'' = e$, and, for all elements e and f in P with $e \le f$, $f = e \vee (f \wedge e')$, is said to be orthomodular and the mapping $e \mapsto e'$ is said to be an orthocomplementation of P. Elements e and f in the complete orthomodular lattice P are said to be orthogonal, denoted $e \perp f$, if $e \le f'$. An element z in P is said to be central if, for all elements e in P, $e = (z \wedge e) \vee (z' \wedge e)$. The set ZP of central elements of the complete orthomodular lattice P contains 0 and 1, and if z is contained in ZP then so also is $z'$. The centre ZP of P forms a sub-complete orthomodular lattice of P which, with the restricted order and orthocomplementation, is Boolean. The central support c(e) of an element e in P is the infimum of the set of elements in ZP which dominate e. Observe that, for elements e and f in P,

$$c(e \wedge c(f)) = c(e) \wedge c(f), \tag{2.1}$$

and, for every family $(e_j)_{j\in\Lambda}$ of elements of P,

$$c\Bigl(\bigvee_{j\in\Lambda} e_j\Bigr) = \bigvee_{j\in\Lambda} c(e_j). \tag{2.2}$$

When endowed with the product ordering the Cartesian product P × P of the complete orthomodular lattice P with itself forms a complete lattice. An element (e, f) in P × P is said to be centrally equivalent if the central supports c(e) of e and c(f) of f coincide and, in this case, the common central support is denoted by c(e, f). Let CP be the collection of centrally equivalent elements of P × P. It follows from (2.2) that, with the ordering inherited from P × P, CP is a complete lattice with least element (0, 0) and greatest element (1, 1), and the supremum of a subset of CP coincides with its supremum when regarded as a subset of P × P. In general, this is not the case for the infimum. However, for any element (e, f) in CP, $(e, f) = (c(f), f) \wedge (e, c(e)) = (c(f), f) \wedge_{P\times P} (e, c(e))$. For details the reader is referred to [32].
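The axioms above can be made concrete on a very small example. The sketch below, which is an illustration added here and not taken from the paper, uses the power set of a three-element set with set complement as orthocomplementation. This lattice is Boolean, so it is orthomodular and every element is central; all function names are hypothetical.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # greatest element 1 of the lattice


def subsets(s):
    """All subsets of s, i.e. the elements of the power-set lattice."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


LATTICE = subsets(UNIVERSE)


def join(e, f):
    return e | f


def meet(e, f):
    return e & f


def comp(e):
    """Orthocomplementation e -> e' (set complement)."""
    return UNIVERSE - e


def is_orthomodular():
    """Check e v e' = 1, e'' = e, order reversal, and the orthomodular law."""
    for e in LATTICE:
        if join(e, comp(e)) != UNIVERSE or comp(comp(e)) != e:
            return False
        for f in LATTICE:
            if e <= f:
                if not comp(f) <= comp(e):
                    return False
                if f != join(e, meet(f, comp(e))):
                    return False
    return True


def is_central(z):
    """z is central if e = (z ^ e) v (z' ^ e) for every e."""
    return all(e == join(meet(z, e), meet(comp(z), e)) for e in LATTICE)


print(is_orthomodular(), all(is_central(z) for z in LATTICE))
```

In a non-commutative example such as P(A) for a matrix algebra A, the orthomodular law still holds but most elements fail the centrality test, which is why the centre ZP is singled out.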


Let A be a W∗-algebra and let P(A) be the family of self-adjoint idempotents, or projections in A. For e and f in P(A), write $e \le f$ if $ef = e$ and let $e' = 1 - e$, where 1 is the unit in A. These define a partial ordering on P(A), with respect to which it forms a complete lattice, and an orthocomplementation on P(A) which is therefore a complete orthomodular lattice. Observe that, for orthogonal elements e and f in P(A), $e \vee f = e + f$ and, by (2.2), $c(e + f) = c(e) \vee c(f)$. Furthermore, for each increasing net $(e_j)_{j\in\Lambda}$ in P(A), the supremum $\bigvee_{j\in\Lambda} e_j$ coincides with the limit of the net $(e_j)_{j\in\Lambda}$ in the weak∗ topology. The centre ZP(A) of P(A) coincides with the complete Boolean lattice of projections in the algebraic centre Z(A) of A. Observe that, for z in ZP(A) and e in P(A),

$$z \wedge e = ze, \qquad z \vee e = e + z - ez = z + z'e. \tag{2.3}$$

A subspace J of the W∗-algebra A is said to be a left ideal if $AJ \subseteq J$, is said to be a right ideal if $JA \subseteq J$, and is said to be an ideal if it is both a left and a right ideal. For each weak∗-closed left ideal J in A there exists a unique projection e such that J coincides with Ae. The left ideal Ae is an ideal if and only if e is central. For each element a in A, the unique projection e(a) such that the left ideal $\{b \in A : ba = 0\}$ coincides with $Ae(a)'$ is said to be the left support of a. It is the least element of P(A) for which $e(a)a = a$. The right support f(a) is similarly defined. Clearly, $e(a) = f(a^*)$ and, therefore, the left and right supports of a self-adjoint element a coincide. This element, denoted by s(a), is the unit in the sub-W∗-algebra of A generated by a and is said to be the support projection of a. An element u in A is said to be a partial isometry if $uu^*u = u$ or, equivalently, if either $uu^*$ or $u^*u$ is a projection. For any subset B of A, the set of partial isometries in B is denoted by U(B). For each partial isometry u in A, $e(u) = uu^*$, $f(u) = u^*u$ and the central supports c(e(u)) and c(f(u)) coincide. For each element a in A, there exists a unique partial isometry r(a) in A such that $a = r(a)|a|$ and $f(r(a)) = s(|a|)$, where $|a| = (a^*a)^{1/2}$. Moreover,

$$r(a)^* = r(a^*), \qquad f(a) = f(r(a)), \qquad e(a) = e(r(a)), \qquad a = r(a)a^*r(a). \tag{2.4}$$

The partial isometry r(a) is said to be the support of a. For details of these and other results on W∗-algebras the reader is referred to [29] and [30]. The Jordan triple product $\{a\,b\,c\}$ of three elements a, b and c in A is defined by

$$\{a\,b\,c\} = \tfrac{1}{2}(ab^*c + cb^*a).$$

A subspace J of A is said to be a subtriple of A if the subset $\{J\,J\,J\}$ is contained in J, is said to be an inner ideal if the subset $\{J\,A\,J\}$ is contained in J and is said to be an ideal if the subsets $\{A\,J\,A\}$ and $\{J\,A\,A\}$ are contained in J. Observe that, for each pair a, b of elements of A the subspace aAb is an inner ideal in A. Since the intersection of a family of weak∗-closed inner ideals in A is a weak∗-closed inner ideal in A, the set I(A) of weak∗-closed inner ideals in A, when ordered by set inclusion, forms a complete lattice. The following important result is proved in [15].

Lemma 2.1. Let A be a W∗-algebra, let P(A) be the complete orthomodular lattice of projections in A, and let CP(A) be the complete lattice of centrally equivalent pairs of elements of P(A). Then, the mapping $(e, f) \mapsto eAf$ is an order isomorphism from CP(A) onto the complete lattice I(A) of weak∗-closed inner ideals in A, with inverse $J \mapsto (e_J, f_J)$, where

$$e_J = \bigvee\{e(u) : u \in U(J)\}, \qquad f_J = \bigvee\{f(u) : u \in U(J)\}.$$


Since, for arbitrary elements e and f in P(A), the set eAf is a weak∗-closed inner ideal in A, the corresponding element of CP(A) is $(c(f)e, c(e)f)$, the common central support $c(c(f)e, c(e)f)$ of $c(f)e$ and $c(e)f$ being $c(e)c(f)$. The complete lattice CP(A) has a complicated structure, some of which is described briefly below. For details, the reader is referred to [15] and [20]. For each element (e, f) in CP(A) and each element a in A, let

$$P_2^A(e, f)a = eaf, \qquad P_1^A(e, f)a = eaf' + e'af, \qquad P_0^A(e, f)a = e'af'. \tag{2.5}$$

Then, $P_0^A(e, f)$, $P_1^A(e, f)$ and $P_2^A(e, f)$ are weak∗-continuous norm non-increasing pairwise orthogonal linear projections on A with sum $I_A$, the identity operator on A. The ranges $A_0(e, f)$ and $A_2(e, f)$ of $P_0^A(e, f)$ and $P_2^A(e, f)$ are weak∗-closed inner ideals in A and the range $A_1(e, f)$ of $P_1^A(e, f)$ is a weak∗-closed subtriple of A. The decomposition $A = A_0(e, f) \oplus A_1(e, f) \oplus A_2(e, f)$ of A is said to be the generalized Peirce decomposition of A relative to (e, f). Let $D^A(e, f)$ be the weak∗-continuous bounded linear operator on A defined, for each element a in A, by

$$D^A(e, f)a = \bigl(\tfrac{1}{2}P_1^A(e, f) + P_2^A(e, f)\bigr)a = \tfrac{1}{2}(ea + af). \tag{2.6}$$
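The generalized Peirce projections of (2.5) and the operator of (2.6) are easy to experiment with in a matrix algebra. The following sketch, added as an illustration, takes A to be the 3 × 3 matrices (a factor, so any two non-zero projections are centrally equivalent) with the hypothetical choices e = diag(1, 1, 0) and f = diag(1, 0, 0); the helper names are assumptions of this example and exact rational arithmetic is used throughout.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of square matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def add(*ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(m[i][j] for m in ms) for j in range(len(ms[0][0]))]
            for i in range(len(ms[0]))]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def diag(*d):
    return [[d[i] if i == j else 0 for j in range(len(d))]
            for i in range(len(d))]


def peirce(e, f, a):
    """Return (P0 a, P1 a, P2 a) as in (2.5), with e' = 1 - e and f' = 1 - f."""
    one = diag(*[1] * len(a))
    e_c = add(one, scale(-1, e))
    f_c = add(one, scale(-1, f))
    p2 = matmul(matmul(e, a), f)
    p1 = add(matmul(matmul(e, a), f_c), matmul(matmul(e_c, a), f))
    p0 = matmul(matmul(e_c, a), f_c)
    return p0, p1, p2


def d_op(e, f, a):
    """D(e, f)a = (ea + af) / 2 as in (2.6)."""
    return scale(Fraction(1, 2), add(matmul(e, a), matmul(a, f)))


E = diag(1, 1, 0)
F = diag(1, 0, 0)
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
P0, P1, P2 = peirce(E, F, A)
print(P0, P1, P2, sep="\n")
```

Summing the three components recovers the original matrix, and the combination ½P₁ + P₂ reproduces `d_op(E, F, A)`, which is the content of (2.6).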

Recall that a bounded linear operator T on a complex Banach space X is said to be hermitian if the numerical range of T in the Banach algebra B(X) of bounded linear operators on X is contained in $\mathbb{R}$, or, equivalently, if, for all real numbers t, the bounded linear operator $e^{itT}$ is an isometry (see [5], Lemma 5.2). It can easily be seen that the weak∗-continuous linear operator $D^A(e, f)$ is hermitian. A pair $(e_1, f_1)$, $(e_2, f_2)$ of elements of CP(A) is said to be compatible, written $(e_1, f_1)\,C^A\,(e_2, f_2)$, if, for j and k equal to 0, 1 or 2, the commutator $[P_j^A(e_1, f_1), P_k^A(e_2, f_2)]$ is equal to zero. It can be seen that the compatibility of $(e_1, f_1)$ and $(e_2, f_2)$ is equivalent to either of the conditions: $[D(e_1, f_1), D(e_2, f_2)] = 0$; $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$. For an element (e, f) in CP(A) define

$$(e, f)' = (c(f')e', c(e')f'). \tag{2.7}$$

Then

$$(e, f)'' = ((c(f')e')', (c(e')f')'), \tag{2.8}$$

and $c((e, f)'')$ is equal to $c(e, f)$. Furthermore, the mapping $(e, f) \mapsto (e, f)'$ is order reversing, and

$$(e, f) \le (e, f)'' = (e, f)'''' = \dots, \qquad (e, f)' = (e, f)''' = (e, f)''''' = \dots. \tag{2.9}$$
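To see that the inequality in (2.9) can be strict, consider the following small example. It is an illustration added here, writing a prime for the complementation of (2.7), and it assumes only that A is a factor (so that c(g) = 1 for every non-zero projection g, and c(0) = 0) and that f is a projection different from 0 and 1. Then (1, f) is centrally equivalent, and

```latex
\begin{align*}
(1, f)'  &= \bigl(c(f')\,1',\; c(1')\,f'\bigr) = (1 \cdot 0,\; 0 \cdot f') = (0, 0),\\
(1, f)'' &= (0, 0)' = \bigl(c(1)\,1,\; c(1)\,1\bigr) = (1, 1) \;>\; (1, f).
\end{align*}
```

In agreement with the text, $c((1, f)'') = 1 = c(1, f)$, and a further application of the complementation returns (0, 0), so the sequence stabilizes as (2.9) asserts.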

An element $(e_1, f_1)$ in CP(A) is said to be orthogonal to an element $(e_2, f_2)$ in CP(A), written $(e_1, f_1) \perp (e_2, f_2)$, if $(e_2, f_2) \le (e_1, f_1)'$. This relation is symmetric and holds if and only if $e_1 \perp e_2$ and $f_1 \perp f_2$. It follows that orthogonal elements are compatible. An element (g, h) in CP(A) is said to be central if, for each element (e, f) of CP(A),

$$(e, f) = ((g, h) \wedge (e, f)) \vee ((g, h)' \wedge (e, f)). \tag{2.10}$$


An element (g, h) in CP(A) is central if and only if g is equal to h and lies in the centre ZP(A) of P(A). Two elements $(e_1, f_1)$ and $(e_2, f_2)$ of CP(A) are said to be centrally orthogonal if there exists an element z in the centre ZP(A) of P(A) such that $(e_1, f_1) \le (z, z)$ and $(e_2, f_2) \le (z, z)'$. Observe that centrally orthogonal elements are orthogonal and therefore compatible. Furthermore, the elements $(e_1, f_1)$ and $(e_2, f_2)$ are centrally orthogonal if and only if the central supports $c(e_1, f_1)$ and $c(e_2, f_2)$ are orthogonal, and, in this case,

$$(e_1, f_1) \vee (e_2, f_2) = (e_1 + e_2, f_1 + f_2). \tag{2.11}$$

A pair $(e_1, f_1)$ and $(e_2, f_2)$ of elements of CP(A) is said to be rigidly collinear, written $(e_1, f_1) \top (e_2, f_2)$, if

$$A_2(e_1, f_1) \subseteq A_1(e_2, f_2), \qquad A_2(e_2, f_2) \subseteq A_1(e_1, f_1).$$

This occurs if and only if there exist pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 such that

$$w_1e_1 = w_1e_2, \quad w_1f_1 \perp w_1f_2, \quad w_2e_1 \perp w_2e_2, \quad w_2f_1 = w_2f_2, \quad w_3e_1 = w_3e_2 = w_3f_1 = w_3f_2 = 0,$$

and in this case

$$c(e_1, f_1) = c(e_2, f_2) \le w_1 + w_2.$$

Furthermore, a unique such triple $w_1$, $w_2$ and $w_3$ of elements of ZP(A), satisfying the additional conditions $w_1 \le c(e_1, f_1)$ and $w_2 \le c(e_1, f_1)$, exists. In this case $c(e_1, f_1) = c(e_2, f_2) = w_1 + w_2$ and, writing

$$w_1e_1 = w_1e_2 = e_0, \quad w_2f_1 = w_2f_2 = f_0, \quad w_2e_1 + w_2e_2 = e, \quad w_1f_1 + w_1f_2 = f,$$

$c(e_0) = w_1$ and $c(f_0) = w_2$, and $(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of CP(A) such that

$$(e_1, f_1) \vee (e_2, f_2) = (e_0, f) \vee (e, f_0) = (e_0 + e, f + f_0).$$

A family $((e_j, f_j))_{j\in\Lambda}$ of elements of CP(A) is said to be rigidly collinear if every pair of distinct elements of the family is rigidly collinear. For j and k in $\Lambda$ and l equal to 1, 2 or 3, let $w(j, k)_l$ be the unique element of ZP(A) such that

$$w(j, k)_1f_j \perp w(j, k)_1f_k, \quad w(j, k)_1e_j = w(j, k)_1e_k, \quad w(j, k)_2f_j = w(j, k)_2f_k, \quad w(j, k)_2e_j \perp w(j, k)_2e_k,$$

$$w(j, k)_3e_j = w(j, k)_3f_j = w(j, k)_3e_k = w(j, k)_3f_k = 0, \quad w(j, k)_1 \le c(e_j, f_j), \quad w(j, k)_2 \le c(e_j, f_j), \quad \sum_{l=1}^{3} w(j, k)_l = 1.$$

Then, there exist uniquely pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of ZP(A) of sum 1 and elements $e_0$ and $f_0$ in P(A) such that for all distinct j and k in $\Lambda$, and l equal to 1, 2 or 3, $w(j, k)_l = w_l$, for all j in $\Lambda$,

$$w_1e_j = e_0, \quad w_2f_j = f_0, \quad w_3e_j = w_3f_j = 0, \quad c(e_0) = w_1, \quad c(f_0) = w_2, \quad c(e_j, f_j) = w_1 + w_2, \tag{2.12}$$


and $(w_1f_j)_{j\in\Lambda}$ and $(w_2e_j)_{j\in\Lambda}$ are families of pairwise orthogonal elements of P(A). Writing

$$e = \bigvee_{j\in\Lambda} w_2e_j, \qquad f = \bigvee_{j\in\Lambda} w_1f_j,$$

$(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of CP(A) such that

$$\bigvee_{j\in\Lambda} (e_j, f_j) = (e_0, f) \vee (e, f_0) = (e_0 + e, f_0 + f).$$
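A minimal illustration of rigid collinearity, added here as a sketch: take $A = M_2(\mathbb{C})$, which is a factor, so that ZP(A) = {0, 1}, and let $E_{ij}$ denote the matrix units (notation assumed for this example only). With $(e_1, f_1) = (E_{11}, E_{11})$ and $(e_2, f_2) = (E_{11}, E_{22})$ the two defining inclusions can be checked directly:

```latex
\begin{align*}
A_2(e_1, f_1) &= E_{11}AE_{11} = \mathbb{C}E_{11}
  \subseteq \mathbb{C}E_{11} \oplus \mathbb{C}E_{22}
  = E_{11}AE_{11} + E_{22}AE_{22} = A_1(e_2, f_2),\\
A_2(e_2, f_2) &= E_{11}AE_{22} = \mathbb{C}E_{12}
  \subseteq \mathbb{C}E_{12} \oplus \mathbb{C}E_{21}
  = E_{11}AE_{22} + E_{22}AE_{11} = A_1(e_1, f_1).
\end{align*}
```

Here the central projections are $w_1 = 1$ and $w_2 = w_3 = 0$, so that $e_0 = E_{11}$, $f = f_1 + f_2 = 1$, $e = 0$ and $f_0 = 0$, and the supremum is $(E_{11}, 1)$, whose inner ideal $E_{11}A$ is the first row of A. The two elements share the left projection and have orthogonal right projections, which is the pattern that reduces to ordinary orthogonality in the Hilbert space example of the Introduction.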

3. Rectangular JBW∗-Triples

Recall that a complex Banach space B equipped with a triple product $(a, b, c) \mapsto \{a\,b\,c\}$ from B × B × B to B which is symmetric and linear in the first and third variables, conjugate linear in the second variable, and, for elements a, b, c and d in B, satisfies the identity

$$[D(a, b), D(c, d)] = D(\{a\,b\,c\}, d) - D(c, \{d\,a\,b\}), \tag{3.1}$$

where D is the mapping from B × B to the linear operators on B defined by $D(a, b)c = \{a\,b\,c\}$, is said to be a Jordan ∗-triple. When D is continuous from B × B to the Banach space of bounded linear operators on B and, for each element a in B, D(a, a) is hermitian with non-negative spectrum and satisfies $\|D(a, a)\| = \|a\|^2$, the Jordan ∗-triple B is said to be a JB∗-triple. If B is the dual of a Banach space $B_*$ then B is called a JBW∗-triple. A subspace J of B is said to be a subtriple of B if $\{J\,J\,J\}$ is contained in J, is said to be an inner ideal in B if $\{J\,B\,J\}$ is contained in J, and is said to be an ideal in B if $\{B\,J\,B\}$ and $\{J\,B\,B\}$ are contained in J. Observe that a W∗-algebra A endowed with the Jordan triple product forms a JBW∗-triple. Let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A. Then, the weak∗-closed inner ideal pAq in A is a JBW∗-triple. A JBW∗-triple B is said to be rectangular if there exists a W∗-algebra A and an element (p, q) in CP(A) such that B is isomorphic as a Jordan ∗-triple to pAq. The remainder of this section will be concerned with a fixed rectangular JBW∗-triple B, which will be identified with the rectangular JBW∗-triple pAq.

Lemma 3.1. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, and let B be the rectangular JBW∗-triple pAq. Then, there exists an order isomorphism $(e, f) \mapsto eBf$ from the principal order ideal $CP(A)_{(p,q)}$ in CP(A) onto the complete lattice I(B) of weak∗-closed inner ideals in B.

Proof. Let J be a weak∗-closed inner ideal in B. Then, for each partial isometry u in J, the weak∗-closed inner ideal $e(u)Bf(u)$ lies in J. However, $e(u)Bf(u) = e(u)pAqf(u) = e(u)Af(u)$, since, by Lemma 2.1, $e(u) \le p$ and $f(u) \le q$. It follows from [17], Lemma 2.3 that J is an inner ideal in A. The result now follows from Lemma 2.1. □
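The identity (3.1) can be spot-checked numerically for the triple product $\{a\,b\,c\} = \tfrac{1}{2}(ab^*c + cb^*a)$ on rectangular matrices. The sketch below is an illustration added here: it works in the space of 2 × 3 complex matrices, uses arbitrarily chosen sample elements, and reports the largest entry of the difference between the two sides of (3.1) applied to a test element x. The helper names are hypothetical.

```python
def matmul(x, y):
    """Product of matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def adjoint(x):
    """Conjugate transpose."""
    return [[complex(x[i][j]).conjugate() for i in range(len(x))]
            for j in range(len(x[0]))]


def combine(x, y, sign=1):
    """Entrywise x + sign * y."""
    return [[u + sign * v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def triple(a, b, c):
    """Jordan triple product {a b c} = (a b* c + c b* a) / 2."""
    bs = adjoint(b)
    s = combine(matmul(matmul(a, bs), c), matmul(matmul(c, bs), a))
    return [[v / 2 for v in row] for row in s]


def jordan_identity_defect(a, b, c, d, x):
    """Largest entry of [D(a,b), D(c,d)]x - (D({a b c}, d)x - D(c, {d a b})x)."""
    lhs = combine(triple(a, b, triple(c, d, x)),
                  triple(c, d, triple(a, b, x)), sign=-1)
    rhs = combine(triple(triple(a, b, c), d, x),
                  triple(c, triple(d, a, b), x), sign=-1)
    return max(abs(v) for row in combine(lhs, rhs, sign=-1) for v in row)


# Hypothetical sample elements of the 2 x 3 complex matrices.
a0 = [[1, 2j, 0], [1 + 1j, 0, 3]]
b0 = [[0, 1, 1j], [2, -1, 0]]
c0 = [[1j, 0, 1], [0, 1 - 1j, 2]]
d0 = [[1, 1, 0], [0, 1j, -1]]
x0 = [[2, 0, 1j], [1, -1j, 0]]

print(jordan_identity_defect(a0, b0, c0, d0, x0))
```

Note that the product of three 2 × 3 matrices in the pattern $ab^*c$ is again 2 × 3, which is why rectangular matrix spaces are closed under the triple product even though they are not algebras.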

Gleason’s Theorem for Rectangular JBW∗ -Triples


Recall that the principal order ideal $P(A)_p$ in the complete orthomodular lattice P(A) coincides with the complete orthomodular lattice P(pAp) of projections in the hereditary sub-W∗-algebra pAp of A. In order to simplify notation at a later stage, for each element e in $P(A)_p$, let $e^{\prime p} = p - e$. Let (e, f) lie in $CP(A)_{(p,q)}$ and observe that, since (e, f) ≤ (p, q), (e, f) and (p, q) are compatible in A. It follows that, for j equal to 0, 1 or 2,

\[ P_j^A(e, f)P_2^A(p, q) = P_2^A(p, q)P_j^A(e, f), \tag{3.2} \]

and the restriction $P_j^B(e, f)$ of $P_j^A(e, f)$ to B is a weak∗-continuous norm-non-increasing linear projection onto a weak∗-closed subspace $B_j(e, f)$ of B. Furthermore, for each element a in B,

\[ P_2^B(e, f)a = eaf, \qquad P_1^B(e, f)a = eaf^{\prime q} + e^{\prime p}af, \qquad P_0^B(e, f)a = e^{\prime p}af^{\prime q}, \tag{3.3} \]

and $P_0^B(e, f)$, $P_1^B(e, f)$ and $P_2^B(e, f)$ are pairwise orthogonal with sum $I_B$, the identity operator on B. Clearly $B_0(e, f)$ and $B_2(e, f)$ are inner ideals in B and $B_1(e, f)$ is a subtriple of B. The decomposition $B = B_0(e, f) \oplus B_1(e, f) \oplus B_2(e, f)$ of B is said to be the generalized Peirce decomposition of B relative to (e, f). From (2.6) it is clear that $D^A(e, f)P_2^A(p, q) = P_2^A(p, q)D^A(e, f)$, and, therefore, the restriction $D^B(e, f)$ of $D^A(e, f)$ to B is a weak∗-continuous linear operator on B defined, for each element a in B, by

\[ D^B(e, f)a = \bigl(\tfrac12 P_1^B(e, f) + P_2^B(e, f)\bigr)a = \tfrac12(ea + af). \tag{3.4} \]
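The second equality in (3.4) can be checked directly from (3.3). The following short computation is only a sketch; it uses the relations $e + e^{\prime p} = p$ and $f + f^{\prime q} = q$, together with the fact that $pa = a = aq$ for each element a in B = pAq.

```latex
\begin{aligned}
\bigl(\tfrac12 P_1^B(e,f) + P_2^B(e,f)\bigr)a
  &= \tfrac12\bigl(eaf^{\prime q} + e^{\prime p}af\bigr) + eaf \\
  &= \tfrac12\bigl(eaf^{\prime q} + eaf\bigr) + \tfrac12\bigl(e^{\prime p}af + eaf\bigr) \\
  &= \tfrac12\, ea\bigl(f^{\prime q} + f\bigr) + \tfrac12\bigl(e^{\prime p} + e\bigr)af \\
  &= \tfrac12\,(eaq + paf) = \tfrac12\,(ea + af).
\end{aligned}
```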

Lemma 3.2. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Let (e, f) be an element of $CP(A)_{(p,q)}$ and, for j equal to 0, 1, or 2, let the operators $P_j^B(e, f)$ and $D^B(e, f)$ be defined by (3.3) and (3.4). Then:
(i) for each complex number λ of unit modulus, the weak∗-continuous linear operator $S^B(e, f)(\lambda)$ on B, defined by

\[ S^B(e, f)(\lambda) = P_0^B(e, f) + \lambda P_1^B(e, f) + \lambda^2 P_2^B(e, f) \]

is a Jordan triple automorphism of B and an isometry from B onto B such that, for each real number t, $S^B(e, f)(e^{it}) = \exp(2itD^B(e, f))$;
(ii) the weak∗-continuous linear operator $D^B(e, f)$ is hermitian.
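The statement of Lemma 3.2(i) can be tested numerically in the simplest rectangular case. The sketch below is not part of the original argument: it assumes that B is identified with the 2 × 3 complex matrices, that p and q are the identities of M_2(C) and M_3(C), and that the triple product is {a b c} = (ab∗c + cb∗a)/2. The projections e and f, the matrices a, b, c and the number lam are hypothetical test data.

```python
import cmath

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def adjoint(x):
    """Conjugate transpose."""
    return [[x[i][j].conjugate() for i in range(len(x))]
            for j in range(len(x[0]))]

def lincomb(c1, x, c2, y):
    """Entrywise c1*x + c2*y."""
    return [[c1 * u + c2 * v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def triple(a, b, c):
    """Assumed Jordan triple product {a b c} = (a b* c + c b* a) / 2."""
    bs = adjoint(b)
    return lincomb(0.5, matmul(matmul(a, bs), c), 0.5, matmul(matmul(c, bs), a))

def peirce(e, f, a):
    """Return (P0 a, P1 a, P2 a) as in (3.3), with p and q the identities."""
    ec = lincomb(1, identity(len(e)), -1, e)   # complement p - e
    fc = lincomb(1, identity(len(f)), -1, f)   # complement q - f
    p2 = matmul(matmul(e, a), f)
    p1 = lincomb(1, matmul(matmul(e, a), fc), 1, matmul(matmul(ec, a), f))
    p0 = matmul(matmul(ec, a), fc)
    return p0, p1, p2

def s_op(e, f, lam, a):
    """S(lam) a = P0 a + lam P1 a + lam^2 P2 a."""
    p0, p1, p2 = peirce(e, f, a)
    return lincomb(1, p0, 1, lincomb(lam, p1, lam * lam, p2))

def dist(x, y):
    """Largest entrywise modulus of x - y."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))

# Hypothetical test data: e is a projection in M_2, f a projection in M_3.
e = [[1.0, 0.0], [0.0, 0.0]]
f = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]
a = [[1 + 2j, 0.5, -1j], [2.0, 1 - 1j, 3.0]]
b = [[0.0, 1j, 1.0], [1 + 1j, -2.0, 0.5j]]
c = [[2.0, -1.0, 1j], [0.5, 1 + 3j, -1.0]]
lam = cmath.exp(0.7j)  # a complex number of unit modulus

# S(lam){a b c} should agree with {S(lam)a S(lam)b S(lam)c}.
lhs = s_op(e, f, lam, triple(a, b, c))
rhs = triple(s_op(e, f, lam, a), s_op(e, f, lam, b), s_op(e, f, lam, c))
automorphism_error = dist(lhs, rhs)

# (3.4): (1/2) P1 a + P2 a should agree with (ea + af)/2.
p0, p1, p2 = peirce(e, f, a)
d_error = dist(lincomb(0.5, p1, 1, p2),
               lincomb(0.5, matmul(e, a), 0.5, matmul(a, f)))

# The three Peirce projections should sum to the identity operator.
sum_error = dist(s_op(e, f, 1.0, a), a)
```

All three error values are expected to vanish up to rounding. The automorphism property holds here because S(lam)a = u a v with u = (p − e) + lam e and v = (q − f) + lam f unitary, so the factors u∗u and vv∗ cancel inside the triple product.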


C. M. Edwards, G. T. Rüttimann

Proof. This follows from the corresponding result for A, Lemma 3.1 of [20], and (3.2) and (3.3). □

Following the definition in [28], a pair $(e_1, f_1)$, $(e_2, f_2)$ of elements of $CP(A)_{(p,q)}$ is said to be compatible, written $(e_1, f_1)\,C_B\,(e_2, f_2)$, if, for j and k equal to 0, 1 or 2, the commutator $[P_j^B(e_1, f_1), P_k^B(e_2, f_2)]$ is equal to zero. The next lemma describes other conditions equivalent to that of compatibility.

Lemma 3.3. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q) and, for an element (e, f) in $CP(A)_{(p,q)}$, let the operator $D^B(e, f)$ be defined by (3.4). Then, for elements $(e_1, f_1)$ and $(e_2, f_2)$ of $CP(A)_{(p,q)}$, the following conditions are equivalent:
(i) $(e_1, f_1)\,C_A\,(e_2, f_2)$;
(ii) $(e_1, f_1)\,C_B\,(e_2, f_2)$;
(iii) $[D^B(e_1, f_1), D^B(e_2, f_2)] = 0$;
(iv) $D^B(e_1, f_1)(e_2Af_2) \subseteq e_2Af_2$ and $D^B(e_1, f_1)(e_2^{\prime p}Af_2^{\prime q}) \subseteq e_2^{\prime p}Af_2^{\prime q}$;
(v) $D^B(e_2, f_2)(e_1Af_1) \subseteq e_1Af_1$ and $D^B(e_2, f_2)(e_1^{\prime p}Af_1^{\prime q}) \subseteq e_1^{\prime p}Af_1^{\prime q}$;
(vi) $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$.
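That commuting projections give commuting operators $D^B$ can also be seen by a direct computation. The following is offered as a sketch under the formula (3.4) alone, not as the argument of [20]. For each element a in B,

```latex
\begin{aligned}
D^B(e_1,f_1)D^B(e_2,f_2)a &= \tfrac14\bigl(e_1e_2a + e_1af_2 + e_2af_1 + af_2f_1\bigr),\\
D^B(e_2,f_2)D^B(e_1,f_1)a &= \tfrac14\bigl(e_2e_1a + e_2af_1 + e_1af_2 + af_1f_2\bigr),\\
[D^B(e_1,f_1), D^B(e_2,f_2)]a &= \tfrac14\bigl((e_1e_2 - e_2e_1)a - a(f_1f_2 - f_2f_1)\bigr).
\end{aligned}
```

In particular, the commutator vanishes on B whenever $e_1e_2 = e_2e_1$ and $f_1f_2 = f_2f_1$.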

Proof. That (i) implies (ii) follows from (3.2). That (ii) and (iii) are equivalent and imply (iv) and (v) are proved, using (3.2) and (3.3) and Lemma 3.2, in exactly the same way as the corresponding results for A proved in [20], Lemma 3.2. Since, by the same result, (vi) implies (i), it remains to show that (iv) implies (vi). Using (3.4), for all elements a in A,

\[ e_2\bigl(e_1(e_2af_2) + (e_2af_2)f_1\bigr)f_2 = e_1(e_2af_2) + (e_2af_2)f_1, \tag{3.5} \]

\[ e_2^{\prime p}\bigl(e_1(e_2^{\prime p}af_2^{\prime q}) + (e_2^{\prime p}af_2^{\prime q})f_1\bigr)f_2^{\prime q} = e_1(e_2^{\prime p}af_2^{\prime q}) + (e_2^{\prime p}af_2^{\prime q})f_1. \tag{3.6} \]

Multiplying (3.5) on the left by $e_2^{\prime p}$ and (3.6) on the left by $e_2$ gives

\[ e_2^{\prime p}e_1e_2af_2 = 0, \tag{3.7} \]

\[ e_2e_1e_2^{\prime p}af_2^{\prime q} = 0. \tag{3.8} \]

Hence, from (3.7), for all b in A,

\[ b^*e_2^{\prime p}e_1e_2af_2 = 0, \tag{3.9} \]

and, choosing a equal to $e_2e_1e_2^{\prime p}b$,

\[ (e_2e_1e_2^{\prime p}b)^*(e_2e_1e_2^{\prime p}b)f_2 = 0. \tag{3.10} \]

Choosing a equal to b in (3.8) yields

\[ (e_2e_1e_2^{\prime p}b)f_2 = (e_2e_1e_2^{\prime p}b)q, \tag{3.11} \]


and, substituting from (3.11) in (3.10), gives

\[ q(e_2e_1e_2^{\prime p}b)^*(e_2e_1e_2^{\prime p}b)q = 0. \tag{3.12} \]

Therefore, by (3.11), for all b in A,

\[ e_2e_1e_2^{\prime p}bq = 0. \]

Hence, the weak∗-closed inner ideal $e(e_2e_1e_2^{\prime p})Aq$ is equal to zero and, from [30], 1.10.7,

\[ c(e(e_2e_1e_2^{\prime p})) \perp c(q). \tag{3.13} \]

But

\[ p(e_2e_1e_2^{\prime p}) = e_2e_1e_2^{\prime p}, \]

and, therefore, $e(e_2e_1e_2^{\prime p}) \le p$. Hence,

\[ c(e(e_2e_1e_2^{\prime p})) \le c(p) = c(q), \tag{3.14} \]

and (3.13) and (3.14) imply that $c(e(e_2e_1e_2^{\prime p}))$ is equal to zero. It follows that $e(e_2e_1e_2^{\prime p})$ and, hence, $e_2e_1e_2^{\prime p}$, is equal to zero. Therefore,

\[ e_2e_1 = e_2e_1p = e_2e_1e_2 = (e_2e_1e_2)^* = pe_1e_2 = e_1e_2, \]

as required. Similarly $f_1f_2$ and $f_2f_1$ coincide. This completes the proof of the lemma. □

For a more detailed investigation into the concept of compatibility of subtriples of a Jordan ∗-triple, the reader is referred to [12].

4. The Centroid of a Rectangular JBW∗-Triple

Let B be an arbitrary JBW∗-triple. Recall that the centroid Z(B) of B is the set of bounded linear operators T on B such that, for all elements a in B,

\[ TD(a, a) = D(a, a)T. \tag{4.1} \]

For each element T in Z(B) there exists a unique element $T^{\dagger}$ in Z(B) such that, for all elements a and b in B,

\[ T\{a\; b\; a\} = \{Ta\; b\; a\} = \{a\; T^{\dagger}b\; a\}. \tag{4.2} \]

A bounded linear operator T on B is said to be a multiplier if, for each element x in the set $\partial_e B_1^*$ of extreme points of the unit ball $B_1^*$ in the dual space $B^*$ of B, there exists a complex number $\lambda_T(x)$ such that

\[ T^*x = \lambda_T(x)x. \tag{4.3} \]

In [11] it is shown that the centroid Z(B) of B coincides with the centralizer of B, namely, the set of multipliers T on B for which there exists a multiplier $T^{\dagger}$ on B such that, for all elements x in $\partial_e B_1^*$,

\[ T^{\dagger *}x = \overline{\lambda_T(x)}\,x. \tag{4.4} \]


Recall that a linear projection P on B is said to be an M-projection if, for each element a in B,

\[ \|a\| = \max\{\|Pa\|, \|a - Pa\|\}. \]

A subspace of B is said to be an M-summand if it is the range of a necessarily unique M-projection. The results of [3] and [23] show that the M-summands of B coincide with its weak∗-closed ideals. The following result is an immediate consequence of those of [1,2,4,9] and [10].

Lemma 4.1. Let B be a JBW∗-triple, with centroid Z(B), and let $\partial_e B_1^*$ be the set of extreme points of the unit ball $B_1^*$ in the dual space $B^*$ of B. Then:
(i) with respect to the operator norm and product, and the involution $T \mapsto T^{\dagger}$, defined by (4.2), Z(B) forms a commutative W∗-algebra;
(ii) the mapping $T \mapsto \lambda_T$ defined by (4.3) is an isometric ∗-isomorphism from Z(B) onto a sub-W∗-algebra of the space of bounded complex-valued functions on $\partial_e B_1^*$;
(iii) the set of M-projections on B, when ordered by the set inclusion of the corresponding M-summands, with complementation $P \mapsto I_B - P$ is identical to the complete Boolean lattice P(Z(B)) of self-adjoint idempotents in the commutative W∗-algebra Z(B).

For the remainder of this section B will denote the rectangular JBW∗-triple pAq discussed in Sect. 3. Before embarking upon a further discussion of the complete lattice $CP(A)_{(p,q)}$, some preliminary results are needed. Recall that the mapping $z \mapsto pz$ is a ∗-isomorphism from the commutative W∗-algebra c(p, q)Z(A) onto the centre Z(pAp) of the hereditary sub-W∗-algebra pAp of A. It follows that the same mapping determines an order isomorphism from the complete Boolean lattice $ZP(A)_{c(p,q)}$ onto ZP(pAp) or, equivalently, $Z(P(A)_p)$. In order to simplify notation, for e in $P(A)_p$, let

\[ c_p(e) = \bigwedge \{zp : z \in ZP(A)_{c(p,q)},\ e \le z\}. \]

It is clear that $c_p(e)$ coincides with c(e)p.

Lemma 4.2. Let A be a W∗-algebra, with centre Z(A), let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, and let B be the rectangular JBW∗-triple pAq, with centroid Z(B). Then, the mapping µ, defined, for each element z of the commutative W∗-algebra c(p, q)Z(A), and each element a in B, by µ(z)(a) = za, is an isometric ∗-isomorphism from c(p, q)Z(A) onto Z(B).

Proof. It is clear that for each element z in c(p, q)Z(A), µ(z) lies in Z(B) and that µ is a ∗-homomorphism. Suppose that z is an element of c(p, q)Z(A) such that zB is equal to zero. Since the support r(z) of z in A is the weak∗-limit of a sequence of finite linear combinations of elements consisting of products of z and z∗, it follows that r(z) lies in c(p, q)Z(A). Similarly, e(z) lies in c(p, q)Z(A) and, by commutativity, f(z) and e(z) coincide. Therefore, it follows that pe(z)Ae(z)q is equal to zero. By [30], 1.10.7, it can be seen that

\[ e(z)c(p, q) = c(pe(z)) \perp c(e(z)q) = e(z)c(p, q), \]

and it follows that e(z)c(p, q) is equal to zero. But e(z) ≤ c(p, q), and, therefore, e(z) is equal to zero, which implies that z is equal to zero. It follows that µ is a ∗-isomorphism


into Z(B). Let P be an M-projection on B. Then, by [23], the range PB of P is a weak∗-closed ideal in B such that $B = PB \oplus_M (PB)^{\perp}$, where $(PB)^{\perp}$ is the set of elements b in B such that D(PB, b) is equal to zero. It follows from Lemma 3.1 and the results of [19] that there exists an element (e, f) in $CP(A)_{(p,q)}$ such that

\[ B = eAf \oplus_M e^{\prime p}Af^{\prime q}. \]

Therefore, by Lemma 3.2,

\[ eAf^{\prime q} = e^{\prime p}Af = \{0\}. \]

Hence, by [30], 1.10.7,

\[ f^{\prime q} \le c_q(f^{\prime q}) \le c_q(e)^{\prime q} = c(e, f)'q \le f^{\prime q}. \]

It follows that

\[ f^{\prime q} = c(e, f)'q, \qquad e^{\prime p} = c(e, f)'p, \]

and, hence, that

\[ e = c(e, f)p, \qquad f = c(e, f)q. \]

Therefore, P = µ(c(e, f)). Lemma 4.1 shows that µ maps onto Z(B), as required. □

This result has the following immediate corollary.

Corollary 4.3. The mapping µ defined in Lemma 4.2, when restricted to the complete Boolean lattice $ZP(A)_{c(p,q)}$, is an order isomorphism onto the complete Boolean lattice of M-projections on B.

It is now possible to examine the complete lattice $CP(A)_{(p,q)}$ in more detail.

Lemma 4.4. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q) and, for an element (e, f) in $CP(A)_{(p,q)}$, let

\[ (e, f)^{\prime(p,q)} = \bigl(c(f^{\prime q})e^{\prime p},\ c(e^{\prime p})f^{\prime q}\bigr). \]

Then:
(i) for each element (e, f) in $CP(A)_{(p,q)}$,

\[ (e, f)^{\prime(p,q)\prime(p,q)} = \bigl((c(f^{\prime q})e^{\prime p})^{\prime p},\ (c(e^{\prime p})f^{\prime q})^{\prime q}\bigr), \]

and

\[ c\bigl((e, f)^{\prime(p,q)\prime(p,q)}\bigr) = c(e, f); \]

(ii) the mapping $(e, f) \mapsto (e, f)^{\prime(p,q)}$ is order reversing;

(4.5)


(iii) for each element (e, f) in $CP(A)_{(p,q)}$,

\[ (e, f) \le (e, f)^{\prime(p,q)\prime(p,q)} = (e, f)^{\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)} = \dots, \]

\[ (e, f)^{\prime(p,q)} = (e, f)^{\prime(p,q)\prime(p,q)\prime(p,q)} = (e, f)^{\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)\prime(p,q)} = \dots. \]

Proof. Observe that

\[ (c(e^{\prime p})f^{\prime q})^{\prime q} = (c(e^{\prime p})q \wedge f^{\prime q})^{\prime q} = (c(e^{\prime p})q)^{\prime q} \vee (f^{\prime q})^{\prime q} = (c(e^{\prime p})q)^{\prime q} \vee f. \tag{4.6} \]

Therefore,

\[ c\bigl((c(e^{\prime p})f^{\prime q})^{\prime q}\bigr) = c\bigl((c(e^{\prime p})q)^{\prime q}\bigr) \vee c(e, f). \tag{4.7} \]

Observe that, since e ≤ c(e), $c(e)'p \le e^{\prime p} \le c(e^{\prime p})$, which implies that $c(e)'c(p, q) \le c(e^{\prime p})$. Therefore, $c(e)'q \le c(e^{\prime p})q$, which shows that

\[ (c(e^{\prime p})q)^{\prime q} \le (c(e)'q)^{\prime q} = c(e)q. \]

Hence

\[ c\bigl((c(e^{\prime p})q)^{\prime q}\bigr) \le c(e, f)c(p, q) = c(e, f), \]

and it follows from (4.7) that $c((c(e^{\prime p})f^{\prime q})^{\prime q}) = c(e, f)$. Similarly, $c((c(f^{\prime q})e^{\prime p})^{\prime p})$ is also equal to c(e, f) and (i) follows. Let $(e_1, f_1)$ and $(e_2, f_2)$ be elements of $CP(A)_{(p,q)}$ such that $(e_1, f_1) \le (e_2, f_2)$. Then, $e_1 \le e_2$ and $f_1 \le f_2$, and $e_2^{\prime p} \le e_1^{\prime p}$ and $f_2^{\prime q} \le f_1^{\prime q}$. Hence,

\[ c(f_2^{\prime q})e_2^{\prime p} \le c(f_2^{\prime q})e_1^{\prime p} \le c(f_1^{\prime q})e_1^{\prime p}, \]

and, similarly, $c(e_2^{\prime p})f_2^{\prime q} \le c(e_1^{\prime p})f_1^{\prime q}$. Therefore, $(e_2, f_2)^{\prime(p,q)} \le (e_1, f_1)^{\prime(p,q)}$ and the proof of (ii) is complete. From (i) it can be seen that $(e, f) \le (e, f)^{\prime(p,q)\prime(p,q)}$, and, combining this result with (ii), completes the proof of (iii). □

In general, the complete lattice $CP(A)_{(p,q)}$ is not orthomodular. However, it is still possible to have a notion of orthogonality in $CP(A)_{(p,q)}$. An element $(e_1, f_1)$ in $CP(A)_{(p,q)}$ is said to be orthogonal to an element $(e_2, f_2)$ in $CP(A)_{(p,q)}$, written $(e_1, f_1) \perp_{(p,q)} (e_2, f_2)$, if $(e_2, f_2) \le (e_1, f_1)^{\prime(p,q)}$. The next lemma shows that this is a reasonable definition.

Lemma 4.5. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Then, for elements $(e_1, f_1)$ and $(e_2, f_2)$ in $CP(A)_{(p,q)}$, the following conditions are equivalent:


(i) $(e_1, f_1) \perp_{(p,q)} (e_2, f_2)$;
(ii) $(e_2, f_2) \perp_{(p,q)} (e_1, f_1)$;
(iii) $e_1 + e_2 \le p$ and $f_1 + f_2 \le q$.

Proof. The equivalence of (i) and (ii) is immediate from Lemma 4.4. Furthermore, if (i) holds, then

\[ e_2 \le c(f_1^{\prime q})e_1^{\prime p} \le e_1^{\prime p}, \]

and $e_1 + e_2 \le p$ as required. Similarly, $f_1 + f_2 \le q$ and (iii) holds. Conversely, if these conditions hold then $e_2 \le e_1^{\prime p}$, and, since $f_2 \le f_1^{\prime q}$,

\[ e_2 \le c(e_2, f_2) \le c(f_1^{\prime q}). \]

Therefore, $e_2 \le c(f_1^{\prime q})e_1^{\prime p}$ and, similarly, $f_2 \le c(e_1^{\prime p})f_1^{\prime q}$, which together imply that (i) holds. □

Lemma 4.2 and Lemma 4.5 lead to the following result.

Corollary 4.6. Let $(e_1, f_1)$, $(e_2, f_2)$ be a pair of orthogonal elements in $CP(A)_{(p,q)}$. Then $(e_1, f_1)$ and $(e_2, f_2)$ are compatible.

An element (g, h) in $CP(A)_{(p,q)}$ is said to be central if, for each element (e, f) of $CP(A)_{(p,q)}$,

\[ (e, f) = \bigl((g, h) \wedge (e, f)\bigr) \vee \bigl((g, h)^{\prime(p,q)} \wedge (e, f)\bigr). \tag{4.8} \]

The next result describes the central elements of $CP(A)_{(p,q)}$.

Lemma 4.7. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Then, an element (g, h) in $CP(A)_{(p,q)}$ is central if and only if there exists an element z in the complete Boolean lattice $ZP(A)_{c(p,q)}$ such that

\[ g = zp, \qquad h = zq, \]

and, in this case, z is uniquely defined.

Proof. Suppose that z is an element of $ZP(A)_{c(p,q)}$. Observe that

\[ (zp, zq)^{\prime(p,q)} = \bigl(c((zq)^{\prime q})(zp)^{\prime p},\ c((zp)^{\prime p})(zq)^{\prime q}\bigr) = \bigl(c(p, q)z'p,\ c(p, q)z'q\bigr) = (z'p, z'q). \]

It follows that, for an arbitrary element (e, f) of CP(A),

\[ \bigl((zp, zq) \wedge (e, f)\bigr) \vee \bigl((z'p, z'q) \wedge (e, f)\bigr) = (e, f), \]

as required. Conversely, let (g, h) be a central element of $CP(A)_{(p,q)}$. Then, for each element (e, f) in $CP(A)_{(p,q)}$,

\[ \begin{aligned} (e, f) &= \bigl((g, h) \wedge (e, f)\bigr) \vee \bigl((g, h)^{\prime(p,q)} \wedge (e, f)\bigr) \\ &= \bigl((g, h) \wedge (e, f)\bigr) \vee \bigl((c(h^{\prime q})g^{\prime p},\ c(g^{\prime p})h^{\prime q}) \wedge (e, f)\bigr). \end{aligned} \]


Therefore,

\[ \begin{aligned} e &= c(h \wedge f)(g \wedge e) \vee c\bigl(c(g^{\prime p})h^{\prime q} \wedge f\bigr)\bigl(c(h^{\prime q})g^{\prime p} \wedge e\bigr) \\ &= c(h \wedge f)(g \wedge e) \vee c(g^{\prime p})c(h^{\prime q})c(h^{\prime q} \wedge f)(g^{\prime p} \wedge e) \\ &\le (g \wedge e) \vee (g^{\prime p} \wedge e) \le e. \end{aligned} \]

It follows that

\[ e = (g \wedge e) \vee (g^{\prime p} \wedge e) \]

and, hence, that g is an element of $Z(P(A)_p)$. Therefore, by the remarks prior to Lemma 4.2, there exists a unique element z in $ZP(A)_{c(p,q)}$ such that g is equal to zp. Similarly, there exists a unique element w in $ZP(A)_{c(p,q)}$ such that h is equal to wq. However,

\[ z = zc(p, q) = c(zp) = c(g) = c(g, h) = c(h) = c(wq) = wc(p, q) = w, \]

and the proof of the lemma is complete. □

Combining this result with Lemma 3.1 and Corollary 4.3 yields the following result.

Corollary 4.8. The centre $ZCP(A)_{(p,q)}$ of the complete lattice $CP(A)_{(p,q)}$ is a complete Boolean lattice that is order isomorphic to the complete Boolean lattice of M-projections on B.

Observe that it is a consequence of this result that the centre ZI(B) of the complete lattice of weak∗-closed inner ideals in B is the complete Boolean lattice of weak∗-closed ideals in B and the centre ZS(B) of the complete lattice of structural projections on B is the complete Boolean lattice of M-projections on B.

Two elements $(e_1, f_1)$ and $(e_2, f_2)$ of $CP(A)_{(p,q)}$ are said to be centrally orthogonal if there exists an element z in $ZP(A)_{c(p,q)}$ such that $(e_1, f_1) \le (zp, zq)$ and $(e_2, f_2) \le (zp, zq)^{\prime(p,q)}$. Observe that centrally orthogonal elements are orthogonal and therefore compatible. The proof of the following result, which follows closely that of [20], Theorem 4.6, is straightforward.

Theorem 4.9. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Then:
(i) the elements $(e_1, f_1)$ and $(e_2, f_2)$ of $CP(A)_{(p,q)}$ are centrally orthogonal if and only if $c(e_1, f_1) + c(e_2, f_2) \le c(p, q)$ and, in this case, $(e_1, f_1) \vee (e_2, f_2) = (e_1 + e_2, f_1 + f_2)$;
(ii) for each element (e, f) in $CP(A)_{(p,q)}$ and each element z in $ZP(A)_{c(p,q)}$, the pairs (ze, zf) and (z′e, z′f) are centrally orthogonal elements of $CP(A)_{(p,q)}$ such that $(ze, zf) \vee (z'e, z'f) = (e, f)$.


The notion of rigid collinearity, discussed earlier for a W∗-algebra, can be extended to the rectangular JBW∗-triple B. A pair $(e_1, f_1)$, $(e_2, f_2)$ of elements in $CP(A)_{(p,q)}$ is said to be rigidly collinear, denoted by $(e_1, f_1) \top_{(p,q)} (e_2, f_2)$, if

\[ B_2(e_1, f_1) \subseteq B_1(e_2, f_2), \qquad B_2(e_2, f_2) \subseteq B_1(e_1, f_1). \]

Observe that, because (p, q) and $(e_1, f_1)$ and (p, q) and $(e_2, f_2)$ are compatible, it follows that $(e_1, f_1) \top_{(p,q)} (e_2, f_2)$ if and only if $(e_1, f_1) \top (e_2, f_2)$, and $(e_1, f_1) \le (p, q)$ and $(e_2, f_2) \le (p, q)$. Therefore, Theorem 5.3 of [20] and the previous results of this section lead immediately to the following characterization of rigid collinearity.

Theorem 4.10. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $(e_1, f_1)$ and $(e_2, f_2)$ be elements of the principal order ideal $CP(A)_{(p,q)}$ in CP(A) generated by (p, q). Then $(e_1, f_1) \top_{(p,q)} (e_2, f_2)$ if and only if there exist elements $w_1$, $w_2$ and $w_3$ of $ZP(A)_{c(p,q)}$ such that

\[ \begin{gathered} w_1 + w_2 + w_3 = c(p, q), \\ w_1e_1 = w_1e_2, \qquad w_1f_1 + w_1f_2 \le w_1q, \\ w_2e_1 + w_2e_2 \le w_2p, \qquad w_2f_1 = w_2f_2, \\ w_3e_1 = w_3e_2 = w_3f_1 = w_3f_2 = 0, \end{gathered} \]

and, in this case,

\[ c(e_1, f_1) = c(e_2, f_2) \le w_1 + w_2. \]

Furthermore, a unique such triple $w_1$, $w_2$ and $w_3$ of elements of $ZP(A)_{c(p,q)}$ satisfying the additional conditions $w_1 \le c(e_1, f_1)$ and $w_2 \le c(e_1, f_1)$ exists. In this case:
(i) $c(e_1, f_1) = c(e_2, f_2) = w_1 + w_2$;
(ii) writing

\[ w_1e_1 = w_1e_2 = e_0, \quad w_2f_1 = w_2f_2 = f_0, \quad w_2e_1 + w_2e_2 = e, \quad w_1f_1 + w_1f_2 = f, \]

e and $e_0$ are elements of $P(A)_p$ and f and $f_0$ are elements of $P(A)_q$, such that $c(e_0) = w_1$ and $c(f_0) = w_2$, and $(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of $CP(A)_{(p,q)}$ such that

\[ (e_1, f_1) \vee (e_2, f_2) = (e_0, f) \vee (e, f_0) = (e_0 + e, f + f_0). \]

The notion of rigid collinearity can be extended to any family $((e_j, f_j))_{j \in \Lambda}$ of elements of $CP(A)_{(p,q)}$. Such a family is said to be rigidly collinear if every pair of distinct elements of the family is rigidly collinear. The next result, the proof of which follows closely that of Theorem 5.5 of [20], describes such families.

Theorem 4.11. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $((e_j, f_j))_{j \in \Lambda}$ be a rigidly collinear family of elements of the principal order ideal $CP(A)_{(p,q)}$ in CP(A) generated by (p, q). For j and k in Λ and l equal to 1, 2 or 3, let $w(j, k)_l$ be the unique element of $ZP(A)_{c(p,q)}$ such that

\[ \begin{gathered} w(j, k)_1e_j = w(j, k)_1e_k, \qquad w(j, k)_1f_j + w(j, k)_1f_k \le w(j, k)_1q, \\ w(j, k)_2e_j + w(j, k)_2e_k \le w(j, k)_2p, \qquad w(j, k)_2f_j = w(j, k)_2f_k, \\ w(j, k)_3e_j = w(j, k)_3f_j = w(j, k)_3e_k = w(j, k)_3f_k = 0, \\ w(j, k)_1 \le c(e_j, f_j), \qquad w(j, k)_2 \le c(e_j, f_j), \qquad \sum_{l=1}^{3} w(j, k)_l = c(p, q). \end{gathered} \]


Then, there exist uniquely pairwise orthogonal elements $w_1$, $w_2$ and $w_3$ of $ZP(A)_{c(p,q)}$ of sum c(p, q) and elements $e_0$ in $P(A)_p$ and $f_0$ in $P(A)_q$ such that:
(i) for all distinct j and k in Λ, and l equal to 1, 2 or 3, $w(j, k)_l = w_l$;
(ii) for all j in Λ,

\[ \begin{gathered} w_1e_j = e_0, \qquad w_2f_j = f_0, \qquad w_3e_j = w_3f_j = 0, \\ c(e_0) = w_1, \qquad c(f_0) = w_2, \qquad c(e_j, f_j) = w_1 + w_2, \end{gathered} \]

where $(w_2e_j)_{j \in \Lambda}$ and $(w_1f_j)_{j \in \Lambda}$ are families of pairwise orthogonal elements of $P(A)_p$ and $P(A)_q$, respectively;
(iii) writing

\[ e = \bigvee_{j \in \Lambda} w_2e_j, \qquad f = \bigvee_{j \in \Lambda} w_1f_j, \]

$(e_0, f)$ and $(e, f_0)$ are centrally orthogonal elements of $CP(A)_{(p,q)}$ such that

\[ \bigvee_{j \in \Lambda} (e_j, f_j) = (e_0, f) \vee (e, f_0) = (e_0 + e, f_0 + f). \]

Recall that a W∗-algebra is said to be a factor if its centre consists of complex multiples of its unit. In the case in which c(p, q)A is a factor, Theorem 4.11 has a particularly simple form.

Corollary 4.12. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that c(p, q)A is a factor, let B be the rectangular JBW∗-triple pAq, and let $((e_j, f_j))_{j \in \Lambda}$ be a rigidly collinear family of elements of the principal order ideal $CP(A)_{(p,q)}$ in CP(A) generated by (p, q). Then, either there exists an element $e_0$ in $P(A)_p$ such that, for all j in Λ, $e_j$ is equal to $e_0$ and $(f_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of $P(A)_q$, or there exists an element $f_0$ in $P(A)_q$ such that, for all j in Λ, $f_j$ is equal to $f_0$ and $(e_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of $P(A)_p$.

Proof. Since $ZP(A)_{c(p,q)}$ is the set {0, c(p, q)} it follows that, in the theorem, one of $w_1$, $w_2$ and $w_3$ is equal to c(p, q) and the other two are equal to zero. If $w_3$ is equal to c(p, q), then all elements of the rigidly collinear family are zero, giving a contradiction. If $w_1$ is equal to c(p, q) and $w_2$ is equal to zero, then, by Theorem 4.11, for all j in Λ, $e_j$ is equal to $e_0$ and $(f_j)_{j \in \Lambda}$ is a family of pairwise orthogonal elements of $P(A)_q$. The other possibility occurs if $w_1$ is equal to zero and $w_2$ is equal to c(p, q). □

5. Measures on the Complete Lattice $CP(A)_{(p,q)}$

In this section certain measures on the complete lattice $CP(A)_{(p,q)}$ are analyzed. A measure m on $CP(A)_{(p,q)}$ is a mapping from $CP(A)_{(p,q)}$ to C such that, for each pair $(e_1, f_1)$ and $(e_2, f_2)$ of elements of $CP(A)_{(p,q)}$ that are either centrally orthogonal or rigidly collinear,

\[ m\bigl((e_1, f_1) \vee (e_2, f_2)\bigr) = m((e_1, f_1)) + m((e_2, f_2)). \]


The measure m is said to be bounded if $\{m((e, f)) : (e, f) \in CP(A)_{(p,q)}\}$ is a bounded subset of C. Recall that, according to [34], a mapping ν from $P(A)_p \times P(A)_q$ to C is said to be a quantum bimeasure if, for $e_1$, $e_2$ in $P(A)_p$ such that $e_1 + e_2 \le p$ and f in $P(A)_q$,

\[ \nu(e_1 + e_2, f) = \nu(e_1, f) + \nu(e_2, f), \]

and, for e in $P(A)_p$ and $f_1$ and $f_2$ in $P(A)_q$ with $f_1 + f_2 \le q$,

\[ \nu(e, f_1 + f_2) = \nu(e, f_1) + \nu(e, f_2). \]

The quantum bimeasure ν is said to be bounded if the set $\{\nu(e, f) : e \in P(A)_p, f \in P(A)_q\}$ is a bounded subset of C. The first lemma describes the relationship that exists between measures on $CP(A)_{(p,q)}$ and quantum bimeasures on $P(A)_p \times P(A)_q$. Its proof is very similar to that of Lemma 6.1 of [20].

Lemma 5.1. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A, let B be the rectangular JBW∗-triple pAq, and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Then, there exists a bijection $m \mapsto \nu_m$ from the set of measures m on $CP(A)_{(p,q)}$ onto the set of quantum bimeasures ν on $P(A)_p \times P(A)_q$ with the property that, for all elements z in $ZP(A)_{c(p,q)}$, and all elements e in $P(A)_p$ and f in $P(A)_q$, $\nu(ze, f) = \nu(e, zf)$, defined, for e in $P(A)_p$ and f in $P(A)_q$, by

\[ \nu_m(e, f) = m((c(f)e, c(e)f)). \]

The mapping sends the set of bounded measures onto the set of bounded quantum bimeasures.

Using the results of the previous section, this result can now be combined with those of Wright [34] to give a precise description of the bounded measures on $CP(A)_{(p,q)}$, at least in the situation in which Gleason's Theorem holds.

Theorem 5.2. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that neither of the hereditary sub-W∗-algebras pAp and qAq of A contains a weak∗-closed ideal of Type $I_2$, let B be the rectangular JBW∗-triple pAq, and let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q). Let $S_b(pAp \times qAq)$ be the space of bounded sesquilinear functionals φ from pAp × qAq to C such that, for all elements a in pAp, b in qAq and c in c(p, q)Z(A),

\[ \phi(ca, b) = \phi(a, c^*b). \]

Then, the mapping $\phi \mapsto m_\phi$ defined, for each element φ in $S_b(pAp \times qAq)$ and each element (e, f) of $CP(A)_{(p,q)}$, by

\[ m_\phi((e, f)) = \phi(e, f) \]

is a bijection from $S_b(pAp \times qAq)$ onto the space $M_b(CP(A)_{(p,q)})$ of bounded measures on $CP(A)_{(p,q)}$.
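To make the correspondence of Theorem 5.2 concrete, the following sketch treats the factor case in which pAp and qAq are both identified with M_3(C), so that c(p, q)Z(A) consists of scalars and the condition φ(ca, b) = φ(a, c∗b) holds automatically. The functional φ(a, b) = trace(a t b∗ t∗), the matrix T and the projections e, f1, f2 are hypothetical choices made only for illustration; they are not taken from the source.

```python
def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def adjoint(x):
    """Conjugate transpose."""
    return [[x[i][j].conjugate() for i in range(len(x))]
            for j in range(len(x[0]))]

def add(x, y):
    """Entrywise sum."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

# Hypothetical 3 x 3 matrix that fixes the sesquilinear functional phi.
T = [[1 + 1j, 0.5, -2.0], [0.0, 2 - 1j, 1j], [3.0, -1j, 0.25]]

def phi(a, b):
    """phi(a, b) = trace(a T b* T*): linear in a, conjugate linear in b."""
    return trace(matmul(matmul(matmul(a, T), adjoint(b)), adjoint(T)))

def m(e, f):
    """Candidate measure m_phi((e, f)) = phi(e, f)."""
    return phi(e, f)

e = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
f1 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
f2 = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]

# (e, f1) and (e, f2) share the first component and f1 f2 = 0, so in the
# factor case the pair is rigidly collinear with join (e, f1 + f2).
join_value = m(e, add(f1, f2))
sum_value = m(e, f1) + m(e, f2)
additivity_error = abs(join_value - sum_value)

# Conjugate linearity in the second variable: phi(a, i b) = -i phi(a, b).
i_f1 = [[1j * v for v in row] for row in f1]
conjugate_linear_error = abs(phi(e, i_f1) - (-1j) * phi(e, f1))

# With these choices phi(e, f1) reduces to |T[0][0]|^2 = 2.
value_e_f1 = m(e, f1)
```

The additivity checked here is the defining property of a measure on a rigidly collinear pair. The substance of the theorem is the converse direction, that every bounded measure arises from such a functional, which a finite check of this kind does not address.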


Proof. Let φ be an element of $S_b(pAp \times qAq)$ and denote by $\nu_\phi$ its restriction to $P(A)_p \times P(A)_q$. Then, $\nu_\phi$ is clearly a quantum bimeasure and, for all elements e in $P(A)_p$, f in $P(A)_q$ and z in $ZP(A)_{c(p,q)}$,

\[ \nu_\phi(ze, f) = \phi(ze, f) = \phi(e, z^*f) = \phi(e, zf) = \nu_\phi(e, zf). \]

It follows from Lemma 5.1 that the function $m_\phi$ from $CP(A)_{(p,q)}$ to C, defined above, is a bounded measure. Let m be a bounded measure on $CP(A)_{(p,q)}$ and let $\nu_m$ be the bounded quantum bimeasure defined in Lemma 5.1. Then, by [34], Theorem 1, there exists a unique bounded bilinear functional $\psi_m$ on pAp × qAq extending $\nu_m$. For a in pAp and b in qAq, define a mapping $\phi_m$ from pAp × qAq to C by

\[ \phi_m(a, b) = \psi_m(a, b^*). \]

Then, $\phi_m$ is a bounded sesquilinear functional from pAp × qAq to C extending $\nu_m$. Moreover, for e in $P(A)_p$, f in $P(A)_q$, and z in $ZP(A)_{c(p,q)}$,

\[ \phi_m(ze, f) = \nu_m(ze, f) = \nu_m(e, zf) = \phi_m(e, zf). \]

Since the set of finite linear combinations of elements of $P(A)_p$ is dense in pAp for the norm topology, it follows that, for all elements a in pAp, b in qAq, and z in $ZP(A)_{c(p,q)}$,

\[ \phi_m(za, b) = \phi_m(a, zb). \]

The space of finite linear combinations of elements of $ZP(A)_{c(p,q)}$ is dense in c(p, q)Z(A) in the norm topology. Recalling that $\phi_m$ is conjugate linear in the second variable, it follows that, for all elements a in pAp, b in qAq and c in c(p, q)Z(A),

\[ \phi_m(ca, b) = \phi_m(a, c^*b). \]

Hence, $\phi_m$ lies in $S_b(pAp \times qAq)$ and, for (e, f) in $CP(A)_{(p,q)}$,

\[ m_{\phi_m}((e, f)) = \phi_m(e, f) = \psi_m(e, f) = \nu_m(e, f) = m((e, f)). \]

This completes the proof of the theorem. □

A measure m on $CP(A)_{(p,q)}$ is said to be normal if, for every centrally orthogonal or rigidly collinear family $((e_j, f_j))_{j \in \Lambda}$ of elements of $CP(A)_{(p,q)}$,

\[ m\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) = \sum_{j \in \Lambda} m((e_j, f_j)), \]

where the sum is defined to be the limit of the net formed by taking sums over finite subsets of Λ. The final result describes the normal bounded measures on $CP(A)_{(p,q)}$.

Theorem 5.3. Let A be a W∗-algebra, let (p, q) be an element of the complete lattice CP(A) of centrally equivalent pairs of projections in A such that neither of the hereditary sub-W∗-algebras pAp and qAq of A contains a weak∗-closed ideal of Type $I_2$, let B be the rectangular JBW∗-triple pAq, let $CP(A)_{(p,q)}$ be the principal order ideal in CP(A) generated by (p, q) and let $\phi \mapsto m_\phi$ be the bijection, defined in Theorem 5.2, from the space $S_b(pAp \times qAq)$ of bounded sesquilinear functionals φ from pAp × qAq to C such that, for all elements a in pAp, b in qAq, and c in c(p, q)Z(A), $\phi(ca, b) = \phi(a, c^*b)$, onto the space $M_b(CP(A)_{(p,q)})$ of bounded measures on $CP(A)_{(p,q)}$. Then φ is separately weak∗-continuous if and only if $m_\phi$ is normal.


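Proof. Suppose that φ is separately weak∗-continuous and let $((e_j, f_j))_{j \in \Lambda}$ be a centrally orthogonal family in $CP(A)_{(p,q)}$. Let Γ be the directed set of finite subsets of Λ. Then, using [20], Lemma 2.3, and the fact that, for j and k distinct elements of Λ, $\phi(e_j, f_k) = 0$,

\[ \begin{aligned} m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) &= m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} e_j, \bigvee_{j \in \Lambda} f_j\Bigr)\Bigr) = \phi\Bigl(\bigvee_{j \in \Lambda} e_j, \bigvee_{k \in \Lambda} f_k\Bigr) \\ &= \lim_{\alpha \in \Gamma} \phi\Bigl(\sum_{j \in \alpha} e_j, \bigvee_{k \in \Lambda} f_k\Bigr) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi\Bigl(e_j, \bigvee_{k \in \Lambda} f_k\Bigr) \\ &= \lim_{\alpha \in \Gamma} \lim_{\beta \in \Gamma} \sum_{j \in \alpha} \sum_{k \in \beta} \phi(e_j, f_k) = \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(e_j, f_j) \\ &= \sum_{j \in \Lambda} m_\phi((e_j, f_j)), \end{aligned} \]

as required. Now let $((e_j, f_j))_{j \in \Lambda}$ be a rigidly collinear family in $CP(A)_{(p,q)}$. Let $w_1$, $w_2$ and $w_3$ be the elements of $ZP(A)_{c(p,q)}$ and $(e_0, f)$ and $(e, f_0)$ the centrally orthogonal elements of $CP(A)_{(p,q)}$ defined in Theorem 4.11. Then, using the separate weak∗-continuity of φ and [20], Lemma 2.3,

\[ \begin{aligned} m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) &= m_\phi\bigl((e_0, f) \vee (e, f_0)\bigr) = m_\phi((e_0, f)) + m_\phi((e, f_0)) \\ &= \phi(e_0, f) + \phi(e, f_0) = \phi\Bigl(e_0, \bigvee_{j \in \Lambda} w_1f_j\Bigr) + \phi\Bigl(\bigvee_{j \in \Lambda} w_2e_j, f_0\Bigr) \\ &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(e_0, w_1f_j) + \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \phi(w_2e_j, f_0) \\ &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \bigl(\phi(w_1e_j, w_1f_j) + \phi(w_2e_j, w_2f_j)\bigr) \\ &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} \bigl(m_\phi((w_1e_j, w_1f_j)) + m_\phi((w_2e_j, w_2f_j))\bigr) \\ &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi\bigl((w_1e_j, w_1f_j) \vee (w_2e_j, w_2f_j)\bigr), \end{aligned} \]

using the fact that $(w_1e_j, w_1f_j)$ and $(w_2e_j, w_2f_j)$ are centrally orthogonal. Hence,

\[ \begin{aligned} m_\phi\Bigl(\bigvee_{j \in \Lambda} (e_j, f_j)\Bigr) &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi((w_1e_j + w_2e_j, w_1f_j + w_2f_j)) \\ &= \lim_{\alpha \in \Gamma} \sum_{j \in \alpha} m_\phi((e_j, f_j)) = \sum_{j \in \Lambda} m_\phi((e_j, f_j)), \end{aligned} \]

since $w_3e_j = w_3f_j = 0$. It follows that the measure $m_\phi$ is normal.

Suppose now that the measure $m_\phi$ is normal and that f is an element of $P(A)_q$. Let x be the bounded linear functional on pAp defined, for each element a in pAp, by x(a) = φ(a, f).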


Let $(e_j)_{j \in \Lambda}$ be a centrally orthogonal family of elements of $P(A)_p$. Using the properties of φ, and the fact that $((c(f)e_j, c(e_j)f))_{j \in \Lambda}$ is a centrally orthogonal family in $CP(A)_{(p,q)}$, observe that

\[ \begin{aligned} x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) &= \phi\Bigl(\bigvee_{j \in \Lambda} e_j, f\Bigr) = \phi\Bigl(c(f)\bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr)f\Bigr) \\ &= m_\phi\Bigl(\Bigl(c(f)\bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr)f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} c(f)e_j,\ \bigvee_{j \in \Lambda} c(e_j)f\Bigr)\Bigr) \\ &= m_\phi\Bigl(\bigvee_{j \in \Lambda} (c(f)e_j, c(e_j)f)\Bigr) = \sum_{j \in \Lambda} m_\phi((c(f)e_j, c(e_j)f)) \\ &= \sum_{j \in \Lambda} \phi(e_j, f) = \sum_{j \in \Lambda} x(e_j). \end{aligned} \]

Now, let $(e_j)_{j \in \Lambda}$ be an orthogonal family of elements of $P(A)_p$ each having central support equal to z. Notice that $((c(f)e_j, zf))_{j \in \Lambda}$ is a rigidly collinear family of elements of $CP(A)_{(p,q)}$. Therefore,

\[ \begin{aligned} x\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr) &= \phi\Bigl(\bigvee_{j \in \Lambda} e_j, f\Bigr) = \phi\Bigl(c(f)\bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr)f\Bigr) \\ &= m_\phi\Bigl(\Bigl(c(f)\bigvee_{j \in \Lambda} e_j,\ c\Bigl(\bigvee_{j \in \Lambda} e_j\Bigr)f\Bigr)\Bigr) = m_\phi\Bigl(\Bigl(\bigvee_{j \in \Lambda} c(f)e_j,\ \bigvee_{j \in \Lambda} c(e_j)f\Bigr)\Bigr) \\ &= m_\phi\Bigl(\bigvee_{j \in \Lambda} (c(f)e_j, zf)\Bigr) = \sum_{j \in \Lambda} m_\phi((c(f)e_j, zf)) \\ &= \sum_{j \in \Lambda} \phi(e_j, f) = \sum_{j \in \Lambda} x(e_j). \end{aligned} \]

Lemma 6.5 of [20] shows that the bounded linear functional x is weak∗-continuous on pAp. Since the set of finite linear combinations of elements of $P(A)_q$ is dense in qAq in the norm topology, by approximating a fixed element b in qAq by a finite linear combination of elements of $P(A)_q$, it can be seen that the bounded linear functional mapping $a \mapsto \phi(a, b)$ is also weak∗-continuous on pAp. Similarly, for each fixed element a in pAp, the mapping $b \mapsto \phi(a, b)$ is weak∗-continuous on qAq. This completes the proof of the theorem. □

6. Examples

In this section three examples will be considered. The first one to be considered is that of a commutative W∗-algebra A. In this case there exists a hyperstonian space Ω such that A is isomorphic to the commutative W∗-algebra C(Ω) of continuous complex-valued functions on Ω, and the complete orthomodular lattice P(A) is Boolean and corresponds to the family of characteristic functions of clopen subsets of Ω. It follows that a pair of elements e and f in P(A) are centrally equivalent if and only if they coincide. For a fixed element p in P(A), the rectangular JBW∗-triple pAp is the weak∗-closed ideal pA in A, and it is a commutative W∗-algebra that is isomorphic to the commutative


W∗-algebra $C(\Omega_p)$, where $\Omega_p$ is the clopen subset of $\Omega$ corresponding to $p$. There is clearly no loss of generality in assuming that $p$ is the identity in $A$ and that $A$ and $B$ coincide. Observe that the centre $Z(A)$ of $A$ coincides with $A$, and that the mapping $T \mapsto T1$ is an isomorphism from the centroid of $A$ onto $A$. In this example the structure of $\mathcal{CP}(A)$ is quite simple, because the mapping $e \mapsto (e, e)$ is an order isomorphism from the complete Boolean lattice $\mathcal{P}(A)$ onto $\mathcal{CP}(A)$. Notice that a pair $(e_1, e_1)$ and $(e_2, e_2)$ of elements of $\mathcal{CP}(A)$ is always compatible, is orthogonal or centrally orthogonal if and only if $e_1 \perp e_2$, and is rigidly collinear if and only if $e_1$ and $e_2$ are both zero. Let $\phi$ be a bounded sesquilinear functional on $A \times A$ such that, for elements $a$, $b$ and $c$ in $A$,
\[ \phi(ca, b) = \phi(a, c^* b). \tag{6.1} \]
Notice that, from (6.1), for $e$ and $f$ in $\mathcal{P}(A)$ with $e \le f$,
\[ \phi(e, f) = \phi(ef, f) = \phi(f, ef) = \phi(f, e). \tag{6.2} \]
Hence, for arbitrary $e$ and $f$, using (6.2),
\[ \phi(e, f) + \phi(f, f) - \phi(ef, f) = \phi(e + f - ef, f) = \phi(e \vee f, f) = \phi(f, e \vee f) = \phi(f, e + f - ef) = \phi(f, e) + \phi(f, f) - \phi(f, ef), \]
from which it follows that (6.2) holds for all $e$ and $f$ in $\mathcal{P}(A)$. Let $m_\phi$ be the bounded measure on $\mathcal{CP}(A)$, defined, according to Theorem 5.2, by $m_\phi((e, e)) = \phi(e, e)$. Recall that every bounded measure on $\mathcal{CP}(A)$ arises in this way. Furthermore, the mapping $\pi_\phi$ from $\mathcal{P}(A)$ to $\mathbb{C}$ defined, for $e$ in $\mathcal{P}(A)$, by $\pi_\phi(e) = m_\phi((e, e))$ is clearly a bounded complex measure on the complete Boolean lattice $\mathcal{P}(A)$. Therefore, there exists a unique bounded linear functional $x_\phi$ on $A$, the corresponding integral, extending $\pi_\phi$. From the definition of a measure, it can be seen that, for $e_1$ and $e_2$ orthogonal,
\[ m_\phi((e_1, e_1)) + m_\phi((e_2, e_2)) = m_\phi((e_1 + e_2, e_1 + e_2)) = \phi(e_1 + e_2, e_1 + e_2) = \phi(e_1, e_1) + \phi(e_2, e_2) + \phi(e_1, e_2) + \phi(e_2, e_1). \]
Therefore, from (6.2),
\[ 2\phi(e_1, e_2) = \phi(e_1, e_2) + \phi(e_2, e_1) = 0. \tag{6.3} \]

Let $a$ and $b$ be elements of $A$ that are finite linear combinations of elements of $\mathcal{P}(A)$. Then, without loss of generality, there exists a family $e_1, e_2, \dots, e_r$ of pairwise orthogonal elements of $\mathcal{P}(A)$ and complex numbers $\alpha_1, \alpha_2, \dots, \alpha_r$ and $\beta_1, \beta_2, \dots, \beta_r$ such that
\[ a = \sum_{j=1}^{r} \alpha_j e_j, \qquad b = \sum_{k=1}^{r} \beta_k e_k. \]


It follows that
\[ \phi(a, b) = \sum_{j,k=1}^{r} \alpha_j \bar{\beta}_k \phi(e_j, e_k) = \sum_{j=1}^{r} \alpha_j \bar{\beta}_j m_\phi((e_j, e_j)) = \sum_{j=1}^{r} \alpha_j \bar{\beta}_j x_\phi(e_j) = x_\phi(ab^*). \tag{6.4} \]
Since the set of finite linear combinations of elements of $\mathcal{P}(A)$ is dense in $A$, (6.4) holds for arbitrary elements $a$ and $b$ in $A$. Identifying $A$ and $C(\Omega)$, it follows that the bounded sesquilinear functional $\phi$ is given, for $a$ and $b$ in $A$, by
\[ \phi(a, b) = \int_\Omega a(\omega)\overline{b(\omega)}\,\pi_\phi(d\omega). \]
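The commutative example can be checked numerically in a finite-dimensional toy case. The sketch below assumes that the space of points is a three-point set, so that A is the algebra of complex triples with pointwise operations, projections are 0/1 vectors, and the measure is a hypothetical list of complex weights; none of these choices come from the paper. Under those assumptions it evaluates the discrete analogue of the integral formula and tests the identities (6.1), (6.2) and (6.3).

```python
# Toy model (an assumption of this sketch): A = C^3 with pointwise operations.

def phi(a, b, weights):
    """Discrete analogue of the integral: sum over points of a * conj(b) * weight."""
    return sum(x * y.conjugate() * w for x, y, w in zip(a, b, weights))

def multiply(a, b):
    """Pointwise product in the commutative algebra."""
    return [x * y for x, y in zip(a, b)]

def adjoint(a):
    """Pointwise complex conjugation, the involution of the algebra."""
    return [x.conjugate() for x in a]

# Hypothetical complex measure on three points and three test elements.
weights = [0.5 + 1j, -2.0, 3j]
a = [1 + 2j, 0.5j, -1.0]
b = [2 - 1j, 3.0, 1j]
c = [1j, 1 - 1j, 2.0]

# Both sides of (6.1): phi(ca, b) and phi(a, c* b).
lhs_61 = phi(multiply(c, a), b, weights)
rhs_61 = phi(a, multiply(adjoint(c), b), weights)

# Two orthogonal projections and a projection f dominating e1.
e1 = [1, 0, 0]
e2 = [0, 1, 1]
f = [1, 1, 0]
```

With these definitions, `phi(e1, e2, weights)` vanishes as in (6.3), and `phi(e1, f, weights)` agrees with `phi(f, e1, weights)` as in (6.2).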

It is clear from the definition and Theorem 5.3 that normal measures on $\mathcal{CP}(A)$ give rise to normal linear functionals on $C(\Omega)$. Commutative W∗-algebras are used as models for classical physical systems. It can be observed that the results of Sect. 5, as described above, show that ortho-additive bounded measures on $\mathcal{P}(A)$ and bounded measures on $\mathcal{CP}(A)$ are essentially the same. Integration with respect to a measure on $\mathcal{P}(A)$ creates a bounded linear functional on $A$ and a bounded sesquilinear functional on $A \times A$.

Let $r$ and $s$ be positive integers greater than two, and let $B$ be the rectangular JBW∗-triple $M_{r,s}(\mathbb{C})$ of $r \times s$ complex matrices. There exist projections $p$ and $q$ in the complete orthomodular lattice $\mathcal{P}(A)$ of projections in the Type I factor $A$, which is equal to the set $M_{r+s}(\mathbb{C})$ of $(r + s) \times (r + s)$ matrices over $\mathbb{C}$, such that $B$ is isomorphic to $pAq$. Furthermore the central supports of $p$ and $q$ are equal to 1. In this example, since the centroid of $B$ is trivial, the structure of $\mathcal{CP}(A)_{(p,q)}$ is also much simplified. Notice that a pair $(e_1, f_1)$ and $(e_2, f_2)$ of non-zero elements of $\mathcal{CP}(A)_{(p,q)}$ is compatible if and only if $e_1$ commutes with $e_2$ and $f_1$ commutes with $f_2$, is orthogonal if and only if $e_1 \perp e_2$ and $f_1 \perp f_2$, is never centrally orthogonal, and is rigidly collinear if and only if either $e_1$ is equal to $e_2$ and $f_1 + f_2 \le q$, or $f_1$ is equal to $f_2$ and $e_1 + e_2 \le p$. In this example the bounded measures on $\mathcal{P}(A)_{(p,q)}$ are precisely the quantum bi-measures on $M_r(\mathbb{C}) \times M_s(\mathbb{C})$ discussed in [34]. In this finite-dimensional case all measures are normal. The analysis above can easily be extended to the rectangular JBW∗-triple $B(H, K)$ of bounded linear operators from the complex Hilbert space $H$ into the complex Hilbert space $K$, provided that neither $H$ nor $K$ is of dimension two. In the special case in which $H$ and $K$ coincide, the rectangular JBW∗-triple $B(H, H)$ is the Type I factor $B(H)$, originally taken to represent a quantum-mechanical system.
The final example shows that the results of Sect. 5 are genuine generalizations of Gleason’s original theorem. Let $B$ be a complex Hilbert space. Define the triple product of elements $a$, $b$ and $c$ in $B$ by
\[ \{a\, b\, c\} = \tfrac{1}{2}\bigl(\langle a, b\rangle c + \langle c, b\rangle a\bigr). \]
Then, it can be easily seen that $B$ is a Jordan∗-triple. Let $B^*$ be the Banach dual space of $B$ and let $a \mapsto \hat{a}$ be the conjugate linear mapping from $B$ onto $B^*$, defined, for $b$ in $B$, by
\[ \hat{a}(b) = \langle b, a\rangle. \]


Then, $B^*$ is a complex Hilbert space with respect to the inner product, defined for $a$ and $b$ in $B$, by
\[ \langle \hat{a}, \hat{b}\rangle = \langle b, a\rangle, \]
and the mapping $a \mapsto \hat{a}$ is isometric. For $T$ in $B(B)$, define $\hat{T}$ on $B^*$, for $a$ in $B$, by
\[ \hat{T}\hat{a} = \widehat{Ta}. \]
Then $\hat{T}$ lies in $B(B^*)$ and the map $T \mapsto \hat{T}$ is a conjugate linear ∗-isomorphism from $B(B)$ onto $B(B^*)$ and, in particular, is an ortho-order isomorphism from the complete orthomodular lattice $\mathcal{P}(B(B))$ onto $\mathcal{P}(B(B^*))$.

Consider the complex Hilbert space $B^* \oplus \mathbb{C}$ and let $A$ be the Type I factor $B(B^* \oplus \mathbb{C})$. Define elements $p$ and $q$ in $\mathcal{P}(B(B^* \oplus \mathbb{C}))$, for $(\hat{a}, \alpha)$ in $B^* \oplus \mathbb{C}$, by
\[ p(\hat{a}, \alpha) = (0, \alpha), \qquad q(\hat{a}, \alpha) = (\hat{a}, 0). \]
For each element $b$ in $B$ and $(\hat{a}, \alpha)$ in $B^* \oplus \mathbb{C}$, let
\[ \eta(b)(\hat{a}, \alpha) = (0, \langle b, a\rangle). \]
Then, $\eta$ is a linear mapping from $B$ into $pAq$, and a simple calculation shows that, regarded as a bounded linear operator on $B^* \oplus \mathbb{C}$,
\[ \eta(b)^*(\hat{a}, \alpha) = (\alpha\hat{b}, 0). \]
Let $b_1$ and $b_2$ lie in $B$ and let $(\hat{a}, \alpha)$ lie in $B^* \oplus \mathbb{C}$. Then,
\[ \eta(b_1)\eta(b_2)^*\eta(b_1)(\hat{a}, \alpha) = \eta(b_1)\eta(b_2)^*(0, \langle b_1, a\rangle) = \eta(b_1)(\langle b_1, a\rangle\hat{b}_2, 0) = \bigl(0, \langle b_1, \overline{\langle b_1, a\rangle}\, b_2\rangle\bigr) = \eta(\{b_1\, b_2\, b_1\})(\hat{a}, \alpha), \]
and $\eta$ is a Jordan ∗-triple isomorphism from $B$ into $pAq$. Let $c$ be an arbitrary element of $A$. Then $pcq$ lies in $B(B^*, \mathbb{C})$ and, therefore, there exists an element $b$ in $B$ such that, for all $\hat{a}$ in $B^*$,
\[ (pcq)(\hat{a}) = \langle b, a\rangle, \]
and it can be seen that $\eta(b)$ and $pcq$ coincide. Hence $\eta$ is a Jordan ∗-triple isomorphism from $B$ onto $pAq$ and $B$ is a rectangular JBW∗-triple. Since $A$ is a factor, $(p, q)$ lies in $\mathcal{CP}(A)$, and a pair $(e, f)$ in $\mathcal{CP}(A)$ lies in the order ideal $\mathcal{CP}(A)_{(p,q)}$ if and only if $e \le p$ and $f \le q$. Since $p$ is minimal in $\mathcal{P}(A)$, $e$ is equal either to 0 or to $p$. Recall that the order interval $[0, q]$ can be identified with the order ideal $\mathcal{P}(A)_q$ or, equivalently, with $\mathcal{P}(qAq)$ and $\mathcal{P}(B(B^*))$. The remarks above show that the mapping $Q \mapsto (p, \hat{Q})$ is an order isomorphism from $\mathcal{P}(B(B))$ onto $\mathcal{CP}(A)_{(p,q)}$.

It follows that the complete lattice $\mathcal{S}(B)$ of structural projections on $B$ coincides with the complete lattice $\mathcal{P}(B(B))$ of orthogonal projections on $B$, and the complete lattice $\mathcal{I}(B)$ of weak∗-closed inner ideals in $B$ coincides with the complete lattice of closed subspaces of $B$. Furthermore, for $Q_1$ and $Q_2$ in $\mathcal{S}(B)$, $Q_1$ and $Q_2$ are compatible if and only if $(p, \hat{Q}_1)$ and $(p, \hat{Q}_2)$ are compatible, which occurs when $\hat{Q}_1$ and $\hat{Q}_2$ or, equivalently, $Q_1$ and $Q_2$, commute. Moreover, $Q_1$ and $Q_2$ are orthogonal if and only if $(p, \hat{Q}_1) \perp (p, \hat{Q}_2)$, which occurs if and only if at least one of $Q_1$ and $Q_2$ is zero. Finally, $Q_1$ and $Q_2$ are rigidly collinear if and only if $(p, \hat{Q}_1) \top_{(p,q)} (p, \hat{Q}_2)$, which occurs if and only if $Q_1$ and $Q_2$ are orthogonal.
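The claim that η respects the triple product can be tested numerically when B is finite-dimensional. The following sketch rests on several assumptions that are not in the paper: B is taken to be the space of complex n-tuples with inner product linear in the first argument, the functional attached to a vector is stored through the complex conjugate of that vector (so that the dual space carries the standard inner product), and the triple product on the operator side is taken to be the usual one for rectangular matrices, half of x y* z plus z y* x. All function names are illustrative.

```python
def inner(a, b):
    """<a, b> on B = C^n, linear in a and conjugate-linear in b (assumed convention)."""
    return sum(complex(x) * complex(y).conjugate() for x, y in zip(a, b))

def triple(a, b, c):
    """Hilbert-space triple product {a b c} = (<a,b> c + <c,b> a) / 2."""
    ab, cb = inner(a, b), inner(c, b)
    return [(ab * z + cb * x) / 2 for x, z in zip(a, c)]

def eta(b):
    """Matrix of eta(b) on B* (+) C, where a-hat is stored as the vector conj(a)."""
    n = len(b)
    m = [[0j] * (n + 1) for _ in range(n + 1)]
    # Last row sends (x, alpha) to (0, sum_i b_i x_i), which equals (0, <b, a>) for x = conj(a).
    m[n][:n] = [complex(x) for x in b]
    return m

def adjoint(m):
    """Conjugate transpose of a square matrix."""
    size = len(m)
    return [[m[j][i].conjugate() for j in range(size)] for i in range(size)]

def matmul(x, y):
    """Product of two square matrices of equal size."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def operator_triple(x, y, z):
    """Assumed operator triple product (x y* z + z y* x) / 2."""
    left = matmul(matmul(x, adjoint(y)), z)
    right = matmul(matmul(z, adjoint(y)), x)
    size = len(x)
    return [[(left[i][j] + right[i][j]) / 2 for j in range(size)] for i in range(size)]

def max_difference(x, y):
    """Largest entrywise distance between two matrices."""
    return max(abs(u - v) for row_x, row_y in zip(x, y) for u, v in zip(row_x, row_y))

# Hypothetical test vectors in B = C^3.
b1 = [1 + 1j, 2.0, -1j]
b2 = [0.5, 1 - 2j, 3.0]
b3 = [2j, -1.0, 1 + 1j]

gap = max_difference(eta(triple(b1, b2, b3)),
                     operator_triple(eta(b1), eta(b2), eta(b3)))
```

In this model `gap` is zero up to rounding, which is the finite-dimensional counterpart of the statement that η is a Jordan ∗-triple isomorphism into pAq.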


Suppose now that the dimension of $B$ is not two and let $m$ be a bounded measure on $\mathcal{S}(B)$. Then, it follows that the mapping $\hat{m}$ defined, for each element $(p, \hat{Q})$ in $\mathcal{CP}(A)_{(p,q)}$, by
\[ \hat{m}(p, \hat{Q}) = m(Q), \]
is a bounded measure. Since the centroid of $pAq$ is trivial, by Theorem 5.2, there exists a unique bounded sesquilinear functional $\xi_{\hat{m}}$ on $\mathbb{C}p \times B(B^*)$ such that, for all $Q$ in $\mathcal{S}(B)$,
\[ \xi_{\hat{m}}(p, \hat{Q}) = \hat{m}(p, \hat{Q}) = m(Q). \]
For each element $T$ in $B(B)$, let
\[ x_m(T) = \xi_{\hat{m}}(p, \hat{T}). \]
Then $x_m$ is a bounded linear functional on $B(B)$ extending $m$ and, by Theorem 5.2, $x_m$ is the unique such extension. Furthermore, by Theorem 5.3, $x_m$ is weak∗-continuous if and only if $m$ is normal. In this case there exists a unique trace class operator $\rho_m$ on $B$ such that, for all $T$ in $B(B)$,
\[ x_m(T) = \mathrm{Tr}(\rho_m T). \]
In other words, Gleason’s Theorem [22] holds. It should, however, be observed that, since the results depend upon the generalized Gleason Theorem for W∗-algebras proved by Bunce and Wright [6,7], this is not an alternative proof of Gleason’s Theorem.

References

1. Alfsen, E.M., Effros, E.G.: Structure in real Banach spaces I. Ann. of Math. 96, 98–128 (1972)
2. Alfsen, E.M., Effros, E.G.: Structure in real Banach spaces II. Ann. of Math. 96, 129–174 (1972)
3. Barton, T.J., Timoney, R.M.: Weak∗-continuity of Jordan triple products and its applications. Math. Scand. 59, 177–191 (1986)
4. Behrends, E.: M-structure and the Banach-Stone Theorem. Lecture Notes in Mathematics 736, Berlin–Heidelberg–New York: Springer, 1979
5. Bonsall, F.F., Duncan, J.: Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras. London Mathematical Society Lecture Note Series 2, Cambridge: Cambridge University Press, 1971
6. Bunce, L.J., Wright, J.D.M.: The Mackey-Gleason problem. Bull. Am. Math. Soc. 26, 288–293 (1992)
7. Bunce, L.J., Wright, J.D.M.: Complex measures on projections in von Neumann algebras. J. London Math. Soc. 46, 269–279 (1992)
8. Bunce, L.J., Wright, J.D.M.: The Mackey-Gleason problem for vector measures on projections in von Neumann algebras. J. London Math. Soc. 49, 133–149 (1994)
9. Cunningham, F.: M-structure in Banach spaces. Math. Proc. Cambridge Philos. Soc. 63, 613–629 (1967)
10. Cunningham, F., Effros, E.G., Roy, N.M.: M-structure in dual Banach spaces. Israel J. Math. 14, 304–309 (1973)
11. Dineen, S., Timoney, R.M.: The centroid of a JB∗-triple system. Math. Scand. 62, 327–342 (1988)
12. Edwards, C.M., Lörch, D., Rüttimann, G.T.: Compatible subtriples of Jordan ∗-triples. J. Algebra (to appear)
13. Edwards, C.M., McCrimmon, K., Rüttimann, G.T.: The range of a structural projection. J. Funct. Anal. 139, 196–224 (1996)
14. Edwards, C.M., Rüttimann, G.T.: On the facial structure of the unit balls in a JBW∗-triple and its predual. J. London Math. Soc. 38, 317–322 (1988)
15. Edwards, C.M., Rüttimann, G.T.: Inner ideals in W∗-algebras. Michigan Math. J. 36, 147–159 (1989)
16. Edwards, C.M., Rüttimann, G.T.: On inner ideals in ternary algebras. Math. Z. 204, 309–318 (1990)
17. Edwards, C.M., Rüttimann, G.T.: A characterization of inner ideals in JB∗-triples. Proc. Am. Math. Soc. 116, 1049–1057 (1992)
18. Edwards, C.M., Rüttimann, G.T.: Structural projections on JBW∗-triples. J. London Math. Soc. 53, 354–368 (1996)
19. Edwards, C.M., Rüttimann, G.T.: Peirce inner ideals in Jordan ∗-triples. J. Algebra 180, 41–66 (1996)


20. Edwards, C.M., Rüttimann, G.T.: The lattice of weak∗-closed inner ideals in a W∗-algebra. Commun. Math. Phys. 197, 131–166 (1998)
21. Friedman, Y., Russo, B.: Structure of the predual of a JBW∗-triple. J. Reine Angew. Math. 356, 67–89 (1985)
22. Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Maths. and Mechanics. 6, 885–894 (1957)
23. Horn, G.: Characterization of the predual and the ideal structure of a JBW∗-triple. Math. Scand. 61, 117–133 (1987)
24. Isham, C.J.: Quantum logic and histories approach to quantum theory. J. Math. Phys. 35, 2157–2185 (1994)
25. Isham, C.J., Linden, N.: Quantum temporal logic and decoherence functionals in the histories approach to generalised quantum theory. J. Math. Phys. 35, 5452–5476 (1994)
26. Isham, C.J., Linden, N., Schreckenberg, S.: The classification of decoherence functionals: An analogue of Gleason’s theorem. J. Math. Phys. 35, 6360–6370 (1994)
27. Kaup, W.: Riemann mapping theorem for bounded symmetric domains in complex Banach spaces. Math. Z. 183, 503–529 (1983)
28. McCrimmon, K.: Compatible Peirce decomposition of Jordan triple systems. Pac. J. Math. 83, 415–439 (1979)
29. Pedersen, G.K.: C∗-algebras and their automorphism groups. (London Mathematical Society Monographs 14). London: Academic Press, 1979
30. Sakai, S.: C∗-algebras and W∗-algebras. Berlin–Heidelberg–New York: Springer, 1971
31. Ruan, Z.-J.: Injectivity of operator spaces. Trans. Am. Math. Soc. 315, 89–104 (1989)
32. Rüttimann, G.T.: Non-commutative measure theory. Habilitationsschrift, Universität Bern 1980
33. Upmeier, H.: Symmetric Banach manifolds and Jordan C∗-algebras. Amsterdam: North Holland, 1985
34. Wright, J.D.M.: The structure of decoherence functionals for von Neumann quantum histories. J. Math. Phys. 36, 5409–5413 (1995)
35. Wright, J.D.M.: Linear representations of bilinear forms on operator algebras. Expositiones Mathematicae (to appear)
36. Wright, J.D.M.: Decoherence functionals for von Neumann quantum histories: boundedness and countable additivity. Commun. Math. Phys. 191, 493–500 (1998)

Communicated by H. Araki

Commun. Math. Phys. 203, 297 – 324 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Quantized Flag Manifolds and Irreducible ∗-Representations

Jasper V. Stokman1,*,**, Mathijs S. Dijkhuizen2,***

1 KdV Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands
2 Department of Mathematics, Faculty of Science, Kobe University, Rokko, Kobe 657, Japan

Received: 24 February 1998 / Accepted: 22 October 1998

* Current address: Université Louis Pasteur, Institut de Recherche Mathématique Avancée, 7, rue René Descartes, F-67084 Strasbourg, France
** The first author was supported by a NISSAN-fellowship of the Netherlands Organization of Scientific Research (NWO)
*** Current address: Schweizer Strasse 21, D-60594 Frankfurt/Main, Germany. E-mail: [email protected]

Abstract: We study irreducible ∗-representations of a certain quantization of the algebra of polynomial functions on a generalized flag manifold regarded as a real manifold. All irreducible ∗-representations are classified for a subclass of flag manifolds containing in particular the irreducible compact Hermitian symmetric spaces. For this subclass it is shown that the irreducible ∗-representations are parametrized by the symplectic leaves of the underlying Poisson bracket. We also discuss the relation between the quantized flag manifolds studied in this paper and the quantum flag manifolds studied by Soibel’man, Lakshmibai & Reshetikhin, Jurčo & Šťovíček and Korogodsky.

1. Introduction

The irreducible ∗-representations of the “standard” quantization $\mathbb{C}_q[U]$ of the algebra of functions on a compact connected simple Lie group $U$ were classified by Soibel’man [40]. He showed that there is a 1–1 correspondence between the equivalence classes of irreducible ∗-representations of $\mathbb{C}_q[U]$ and the symplectic leaves of the underlying Poisson bracket on $U$ (cf. [39,40]). This Poisson bracket is sometimes called Bruhat–Poisson, because its symplectic foliation is a refinement of the Bruhat decomposition of $U$ (cf. Soibel’man [39,40]). The symplectic leaves are naturally parametrized by $W \times T$, where $T \subset U$ is a maximal torus and $W$ is the Weyl group associated with $(U, T)$.

The 1–1 correspondence between equivalence classes of irreducible ∗-representations of $\mathbb{C}_q[U]$ and symplectic leaves of $U$ can be formally explained by the observation that in the semi-classical limit the kernel of an irreducible ∗-representation should tend


to a maximal Poisson ideal. The quotient of the Poisson algebra of polynomial functions on $U$ by this ideal is isomorphic to the Poisson algebra of functions on the symplectic leaf.

In recent years many people have studied quantum homogeneous spaces (see for example [45,31,36,13,30,35,4]). The results referred to above raise the obvious question whether the irreducible ∗-representations of quantized function algebras on $U$-homogeneous spaces can be classified and related to the symplectic foliation of the underlying Poisson bracket. This question was already raised in a paper by Lu & Weinstein [25, Question 4.8], where they studied certain Poisson brackets on $U$-homogeneous spaces that arise as a quotient of the Bruhat–Poisson bracket on $U$. To our knowledge, affirmative answers to the above mentioned question have been given so far for only three different types of $U$-homogeneous spaces, namely Podleś’s family of quantum 2-spheres [36] (the relation with the symplectic foliation of certain covariant Poisson brackets on the 2-sphere seems to have been observed for the first time by Lu & Weinstein [26]), odd-dimensional complex quantum spheres $SU(n+1)/SU(n)$ (cf. Vaksman & Soibel’man [45]), and Stiefel manifolds $U(n)/U(n-l)$ (cf. Podkolzin & Vainerman [35]).

In this paper we study the irreducible ∗-representations of a certain quantized ∗-algebra of functions on a generalized flag manifold. To be more specific, let $G$ denote the complexification of $U$, and let $P \subset G$ be a parabolic subgroup containing the standard Borel subgroup $B$ with respect to a fixed choice of Cartan subalgebra and system of positive roots (compatible with the choice of Bruhat–Poisson bracket on $U$, see [25]). The generalized flag manifold $U/K$ with $K := U \cap P$ naturally becomes a Poisson $U$-homogeneous space (cf. Lu & Weinstein [25]).
The quotient Poisson bracket on $U/K$ is also called Bruhat–Poisson in [25], and its symplectic leaves coincide with the Schubert cells of the flag manifold $G/P \simeq U/K$. It is straightforward to realize a quantum analogue $\mathbb{C}_q[K]$ of the algebra of polynomial functions on $K$ as a quantum subgroup of $\mathbb{C}_q[U]$. The corresponding ∗-subalgebra $\mathbb{C}_q[U/K]$ of $\mathbb{C}_q[K]$-invariant functions in $\mathbb{C}_q[U]$ may be regarded as a quantization of the Poisson algebra of functions on $U/K$ endowed with the Bruhat–Poisson bracket.

The main result in this paper is a classification of all the irreducible ∗-representations of $\mathbb{C}_q[U/K]$ for an important subclass of flag manifolds containing in particular the irreducible Hermitian symmetric spaces of compact type. For this subclass we show that the equivalence classes of irreducible ∗-representations are parametrized by the Schubert cells of $U/K$. Let us emphasize that we regard here the flag manifold $U/K$ as a real manifold. This means that the algebra of functions on $U/K$ has a natural ∗-structure, which survives quantization and allows us to study ∗-representations in a way analogous to Soibel’man’s approach [40].

For an arbitrary generalized flag manifold $U/K$ we describe in detail how irreducible ∗-representations of $\mathbb{C}_q[U]$ decompose under restriction to $\mathbb{C}_q[U/K]$. This decomposition corresponds precisely to the way symplectic leaves in $U$ project to Schubert cells in the flag manifold $U/K$. It leads immediately to a classification of the irreducible ∗-representations of the $C^*$-algebra $C_q(U/K)$, where $C_q(U/K)$ is obtained by taking the closure of $\mathbb{C}_q[U/K]$ with respect to the universal $C^*$-norm on $\mathbb{C}_q[U]$. The equivalence classes of irreducible ∗-representations of $C_q(U/K)$ are naturally parametrized by the symplectic leaves of $U/K$ endowed with the Bruhat–Poisson bracket.
For the classification of the irreducible ∗-representations of the quantized function algebra $\mathbb{C}_q[U/K]$ itself it is important to have a kind of Poincaré–Birkhoff–Witt (PBW) factorization of $\mathbb{C}_q[U/K]$ (which in turn is closely related to the irreducible decomposition of tensor products of certain finite-dimensional irreducible $U$-modules). Such a factorization is needed in order to develop a kind of highest weight representation theory for $\mathbb{C}_q[U/K]$. In Soibel’man’s paper [40], a crucial role is played by a similar factorization of $\mathbb{C}_q[U]$. From Soibel’man’s results one easily derives a factorization of the algebra $\mathbb{C}_q[U/T]$ (corresponding to $P$ minimal parabolic in $G$). In this paper we derive a PBW type factorization for a different subclass of flag manifolds using the so-called Parthasarathy–Ranga Rao–Varadarajan (PRV) conjecture. This conjecture was formulated as a follow-up to certain results in the paper [34] and was independently proved by Kumar [18] and Mathieu [29] (see also Littelmann [22]). The subclass of flag manifolds $U/K$ we consider here can be characterized by the two conditions that $(U, K)$ is a Gel’fand pair and that the Dynkin diagram of $K$ can be obtained from the Dynkin diagram of $U$ by deleting one node (cf. Koornwinder [14]). Note that the corresponding $P \subset G$ is always maximal parabolic. These two conditions are satisfied for the irreducible compact Hermitian symmetric pairs $(U, K)$.

Roughly speaking, the PBW factorization in the above mentioned cases states that the quantized function algebra $\mathbb{C}_q[U/K]$ coincides with the quantized algebra of zero-weighted complex valued polynomials on $U/K$. The quantized algebra of zero-weighted complex valued polynomials can be naturally defined for an arbitrary generalized flag manifold $U/K$. It is always a ∗-subalgebra of $\mathbb{C}_q[U/K]$ and invariant under the $\mathbb{C}_q[U]$-coaction (we shall call it the factorized ∗-algebra associated with $U/K$). The factorized ∗-algebra is closely related to the quantized algebra of holomorphic polynomials on generalized flag manifolds studied by Soibel’man [41], Lakshmibai & Reshetikhin [19, 20], and Jurčo & Šťovíček [9] (for the classical groups) as well as to the function spaces considered recently by Korogodsky [15].
In this paper we classify the irreducible ∗-representations of the factorized ∗-algebra associated with an arbitrary flag manifold U/K and we show that the equivalence classes of irreducible ∗-representations are naturally parametrized by the symplectic leaves of U/K endowed with the Bruhat–Poisson bracket. In particular, we obtain a complete classification of the irreducible ∗-representations of Cq [U/K] whenever a PBW type factorization holds for Cq [U/K] (i.e., Cq [U/K] is equal to its factorized ∗-algebra). The paper is organized as follows. In Sect. 2 we review the results by Lu & Weinstein [25] and Soibel’man [40] concerning the Bruhat–Poisson bracket on U and the quotient Poisson bracket on a flag manifold. In Sect. 3 we recall some well-known results on the “standard” quantization of the universal enveloping algebra of a simple complex Lie algebra and its finite-dimensional representations. We also recall the construction of the corresponding quantized function algebra Cq [U ] and give some commutation relations between certain matrix coefficients of irreducible corepresentations of Cq [U ]. They will play a crucial role in the classification of the irreducible ∗-representations of the factorized ∗-algebra. In Sect. 4 we define the quantized algebra Cq [U/K] of functions on a flag manifold U/K and its associated factorized ∗-subalgebra. We prove that the factorized ∗-algebra is equal to Cq [U/K] for the subclass of flag manifolds referred to above. In Sect. 5 we study the restriction of an arbitrary irreducible ∗-representation of Cq [U ] to Cq [U/K]. We use here Soibel’man’s explicit realization of the irreducible ∗-representations of Cq [U ] as tensor products of irreducible ∗-representations of Cq [SU (2)] (cf. [40], see also [12,45] for SU (n)). As a corollary we obtain a complete classification of the irreducible ∗-representations of the C ∗ -algebra Cq (U/K). 
Section 6 is devoted to the classification of the irreducible ∗-representations of the factorized ∗-algebra associated with an arbitrary flag manifold. The techniques in Sect. 6


are similar to those used by Soibel’man [40] for the classification of the irreducible ∗-representations of $\mathbb{C}_q[U]$, and to those used by Joseph [8] to handle the more general problem of determining the primitive ideals of $\mathbb{C}_q[U]$.

2. Bruhat–Poisson Brackets on Flag Manifolds

In this section we review some results by Soibel’man [40] and Lu & Weinstein [25] concerning the Bruhat–Poisson bracket on a compact connected simple Lie group $U$ and its flag manifolds. For unexplained terminology in this section we refer the reader to [2] and [25].

Let $\mathfrak{g}$ be a complex simple Lie algebra with a fixed Cartan subalgebra $\mathfrak{h} \subset \mathfrak{g}$. Let $G$ be the connected simply connected Lie group with Lie algebra $\mathfrak{g}$ (regarded here as a real analytic Lie group). Let $R \subset \mathfrak{h}^*$ be the root system associated with $(\mathfrak{g}, \mathfrak{h})$ and write $\mathfrak{g}_\alpha$ for the root space associated with $\alpha \in R$. Let $\Delta = \{\alpha_1, \dots, \alpha_r\}$ be a basis of simple roots for $R$, and let $R^+$ (resp. $R^-$) be the set of positive (resp. negative) roots relative to $\Delta$. We identify $\mathfrak{h}$ with its dual by the Killing form $\kappa$. The non-degenerate symmetric bilinear form on $\mathfrak{h}^*$ induced by $\kappa$ is denoted by $(\cdot, \cdot)$. Let $W \subset GL(\mathfrak{h}^*)$ be the Weyl group of the root system $R$ and write $s_i = s_{\alpha_i}$ for the simple reflection associated with $\alpha_i \in \Delta$. For $\alpha \in R$ write $d_\alpha := (\alpha, \alpha)/2$. Let $H_\alpha \in \mathfrak{h}$ be the element associated with the coroot $\alpha^\vee := d_\alpha^{-1}\alpha \in \mathfrak{h}^*$ under the identification $\mathfrak{h} \simeq \mathfrak{h}^*$. Let us choose nonzero $X_\alpha \in \mathfrak{g}_\alpha$ ($\alpha \in R$) such that for all $\alpha, \beta \in R$ one has $[X_\alpha, X_{-\alpha}] = H_\alpha$, $\kappa(X_\alpha, X_{-\alpha}) = d_\alpha^{-1}$ and $[X_\alpha, X_\beta] = c_{\alpha,\beta} X_{\alpha+\beta}$ with $c_{\alpha,\beta} = -c_{-\alpha,-\beta} \in \mathbb{R}$ whenever $\alpha + \beta \in R$. Let $\mathfrak{h}_0$ be the real form of $\mathfrak{h}$ defined as the real span of the $H_\alpha$’s ($\alpha \in R$). Then
\[ \mathfrak{u} := \sum_{\alpha \in R^+} \mathbb{R}(X_\alpha - X_{-\alpha}) \oplus \sum_{\alpha \in R^+} \mathbb{R}\, i(X_\alpha + X_{-\alpha}) \oplus i\mathfrak{h}_0 \tag{2.1} \]
is a compact real form of $\mathfrak{g}$.

Set $\mathfrak{b}_+ := \mathfrak{h}_0 \oplus \mathfrak{n}_+$ with $\mathfrak{n}_+ := \bigoplus_{\alpha \in R^+} \mathfrak{g}_\alpha$. Then, by the Iwasawa decomposition for $\mathfrak{g}$, the triple $(\mathfrak{g}, \mathfrak{u}, \mathfrak{b}_+)$ is a Manin triple with respect to the imaginary part of the Killing form $\kappa$ (cf. [25, §4]). The corresponding coboundary cocommutator on $\mathfrak{u}$ can be integrated to a Sklyanin bracket on the connected Lie subgroup $U \subset G$ with Lie algebra $\mathfrak{u}$. The corresponding Poisson tensor is explicitly given by $\Omega_g = l_g^{\otimes 2} r - r_g^{\otimes 2} r$ ($g \in U$), where $l_g$ resp. $r_g$ denotes infinitesimal left resp. right translation and with the classical r-matrix $r \in \mathfrak{g} \wedge \mathfrak{g}$ given by the following well-known skew solution of the Modified Classical Yang–Baxter Equation:
\[ r = i \sum_{\alpha \in R^+} d_\alpha \bigl(X_{-\alpha} \otimes X_\alpha - X_\alpha \otimes X_{-\alpha}\bigr) \in \mathfrak{u} \wedge \mathfrak{u}. \tag{2.2} \]

This particular Sklyanin bracket is often called Bruhat–Poisson, since its symplectic foliation is closely related to the Bruhat decomposition of G. Let us explain this in more detail. Let B+ be the connected subgroup of G with Lie algebra b+ , let T ⊂ U be the maximal torus in U with Lie algebra ih0 , and set B := T B+ . The Weyl group NU (T )/T , where NU (T ) is the normalizer of T in U , is isomorphic to W . More explicitly, the


isomorphism sends the simple reflection $s_i$ to $\exp\bigl(\tfrac{\pi}{2}(X_{\alpha_i} - X_{-\alpha_i})\bigr)/T$. The disjoint union of $G$ in double $B_+$-cosets (cf. [46, Prop. 1.2.3.6]),
\[ G = \coprod_{m \in N_U(T)} B_+ m B_+ \tag{2.3} \]
is a refinement of the Bruhat decomposition $G = \coprod_{w \in W} BwB$. For $m \in N_U(T)$ we set $\Sigma_m := U \cap B_+ m B_+$. Then $\Sigma_m \neq \emptyset$ for all $m \in N_U(T)$, and we have the disjoint union
\[ U = \coprod_{m \in N_U(T)} \Sigma_m. \tag{2.4} \]
By the Iwasawa decomposition for $G$ there exists for any $b \in B_+$ and $u \in U$ a unique $u^b \in U$ such that $bu \in u^b B_+$. The map
\[ U \times B_+ \to U, \qquad (u, b) \mapsto u^{b^{-1}} \tag{2.5} \]

is a right action of $B_+$ on $U$, and the corresponding decomposition of $U$ into $B_+$-orbits coincides with the decomposition (2.4). On the other hand, if we regard $B_+$ as the Poisson–Lie group dual to $U$, the action (2.5) becomes the right dressing action of the dual group on $U$ (cf. [25, Thm. 3.14]). The orbits in $U$ under the right dressing action are exactly the symplectic leaves of the Poisson bracket on $U$ (cf. [38, Thm. 13]; [25, Thm. 3.15]), hence (2.4) coincides with the decomposition of $U$ into symplectic leaves (cf. [40, Thm. 2.2]).

Next, we recall some results by Lu & Weinstein [25] concerning certain quotient Poisson brackets on generalized flag manifolds. Let $S \subset \Delta$ be a set of simple roots, and let $P_S$ be the corresponding standard parabolic subgroup of $G$. The Lie algebra $\mathfrak{p}_S$ of $P_S$ is given by
\[ \mathfrak{p}_S := \mathfrak{h} \oplus \bigoplus_{\alpha \in \Gamma_S} \mathfrak{g}_\alpha \tag{2.6} \]
with $\Gamma_S := R^+ \cup \{\alpha \in R \mid \alpha \in \mathrm{span}(S)\}$. Let $\mathfrak{l}_S$ be the Levi factor of $\mathfrak{p}_S$,
\[ \mathfrak{l}_S := \mathfrak{h} \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}_\alpha, \tag{2.7} \]
and set $\mathfrak{k}_S := \mathfrak{p}_S \cap \mathfrak{u} = \mathfrak{l}_S \cap \mathfrak{u}$. Then $\mathfrak{k}_S$ is a compact real form of $\mathfrak{l}_S$. Set $K_S := U \cap P_S \subset U$, then $K_S \subset U$ is a Poisson–Lie subgroup of $U$ with Lie algebra $\mathfrak{k}_S$ (cf. [25, Thm. 4.7]). Hence there is a unique Poisson bracket on $U/K_S$ such that the natural projection $\pi : U \to U/K_S$ is a Poisson map. This bracket is also called Bruhat–Poisson. It is covariant in the sense that the natural left action $U \times U/K_S \to U/K_S$ is a Poisson map. Let $W_S$ be the subgroup of $W$ generated by the simple reflections in $S$. The decomposition $P_S = B W_S B$ (cf. [46, Thm. 1.2.1.1]) implies the Schubert cell decomposition of $U/K_S \simeq G/P_S$:
\[ U/K_S = \coprod_{\bar{w} \in W/W_S} X_{\bar{w}}, \qquad X_{\bar{w}} := (U \cap BwP_S)/K_S \simeq Bw/P_S, \tag{2.8} \]
where $\bar{w} \in W/W_S$ is the right $W_S$-coset in $W$ which contains $w$. Since $K_S \subset U$ is a Poisson–Lie subgroup, we have that the right dressing-action (2.5) of $B_+$ on $U$ induces a right Poisson action of $B_+$ on $U/K_S$ (cf. [25, Thm. 4.6]). The


corresponding $B_+$-orbits in $U/K_S$ coincide exactly with the Schubert cells. On the other hand, the symplectic leaves of the Poisson manifold $U/K_S$ are exactly the orbits under the $B_+$-action (see [25, Thm. 4.6]). We conclude (cf. [25, Thm. 4.7]):

Theorem 2.1. The decomposition into symplectic leaves of the flag manifold $U/K_S$ endowed with the Bruhat–Poisson bracket coincides with its decomposition into Schubert cells.

Consider now the set of minimal coset representatives
\[ W^S := \{w \in W \mid l(w s_\alpha) > l(w) \ \ \forall \alpha \in S\}. \tag{2.9} \]
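The definition (2.9) can be explored by brute force in a small case. The sketch below assumes type A, so that W is a symmetric group acting on positions 0, ..., n-1, simple reflections are adjacent transpositions, and the length of a permutation is its number of inversions; these identifications and all function names are assumptions of the illustration, not notation from the paper.

```python
from itertools import permutations

def length(w):
    """Length of a permutation in one-line notation, counted as its number of inversions."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def compose(u, v):
    """Product u v, acting as x -> u(v(x))."""
    return tuple(u[v[x]] for x in range(len(v)))

def simple_reflection(i, n):
    """Adjacent transposition exchanging positions i and i + 1 (0-based)."""
    s = list(range(n))
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)

def generated_subgroup(generators, n):
    """All products of the given generators, found by closure under right multiplication."""
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        w = frontier.pop()
        for s in generators:
            ws = compose(w, s)
            if ws not in seen:
                seen.add(ws)
                frontier.append(ws)
    return seen

def minimal_coset_representatives(n, S):
    """Elements w with l(w s) > l(w) for every simple reflection s indexed by S."""
    generators = [simple_reflection(i, n) for i in S]
    return [w for w in permutations(range(n))
            if all(length(compose(w, s)) > length(w) for s in generators)]

# Hypothetical example: the symmetric group on 4 letters with S = {s_0, s_2}.
n, S = 4, [0, 2]
representatives = minimal_coset_representatives(n, S)
parabolic = generated_subgroup([simple_reflection(i, n) for i in S], n)
```

In this example there are six representatives and the parabolic subgroup has four elements, so the products of a representative with an element of the subgroup account for all 24 permutations, each exactly once.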

$W^S$ is a complete set of coset representatives for $W/W_S$. Any element $w \in W$ can be uniquely written as a product $w = w_1 w_2$ with $w_1 \in W^S$, $w_2 \in W_S$. The elements of $W^S$ are minimal in the sense that
\[ l(w_1 w_2) = l(w_1) + l(w_2), \qquad (w_1 \in W^S,\ w_2 \in W_S), \tag{2.10} \]

where $l(w) := \#(R^+ \cap wR^-)$ is the length function on $W$. Observe that $\pi$ maps the symplectic leaf $\Sigma_m \subset U$ onto the symplectic leaf $X_{w(m)} \subset U/K_S$, where $w(m) := m/T \in W$. We write $\pi_m : \Sigma_m \to X_{w(m)}$ for the surjective Poisson map obtained by restricting $\pi$ to the symplectic leaf $\Sigma_m$. The minimality condition (2.9) translates to the following property of the map $\pi_m$.

Proposition 2.2. Let $m \in N_U(T)$. Then $\pi_m : \Sigma_m \to X_{w(m)}$ is a symplectic automorphism if and only if $w(m) \in W^S$.

Proof. For $w \in W$ set
\[ \mathfrak{n}_w := \bigoplus_{\alpha \in R^+ \cap wR^-} \mathfrak{g}_\alpha, \qquad N_w := \exp(\mathfrak{n}_w). \]
Observe that the complex dimension of $N_w$ is equal to $l(w)$. Write $\mathrm{pr}_U : G \simeq U \times B_+ \to U$ for the canonical projection. It is well known that for $m \in N_U(T)$ and for $w \in W^S$ with representative $m_w \in N_U(T)$, the maps
\[ \phi_m : N_{w(m)} \to \Sigma_m, \quad n \mapsto \mathrm{pr}_U(nm), \qquad \psi_w : N_w \to X_w, \quad n \mapsto \pi\bigl(\mathrm{pr}_U(n m_w)\bigr) \]
are surjective diffeomorphisms (see for example [1, Prop. 1.1 & 5.1]). The map $\psi_w$ is independent of the choice of representative $m_w$ for $w$. It follows now from (2.10) by a dimension count that $\pi_m$ can only be a diffeomorphism if $w(m) \in W^S$. On the other hand, if $m \in N_U(T)$ such that $w(m) \in W^S$, then $\pi_m = \psi_{w(m)} \circ \phi_m^{-1}$ and hence $\pi_m$ is a diffeomorphism. $\square$

Soibel’man [40] gave a description of the symplectic leaves $\Sigma_m$ ($m \in N_U(T)$) as a product of two-dimensional leaves which turns out to have a nice generalization to the quantized setting (cf. Sect. 5). For $i \in [1, r]$, let $\gamma_i : SU(2) \hookrightarrow U$ be the embedding corresponding to the $i$th node of the Dynkin diagram of $U$. After a possible renormalization of the Bruhat–Poisson structure on $SU(2)$, $\gamma_i$ becomes an embedding of Poisson–Lie groups. Recall that the two-dimensional leaves of $SU(2)$ are given by
\[ S_t := \left\{ \begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix} \in SU(2) \;\middle|\; \arg(\beta) = \arg(t) \right\} \qquad (t \in \mathbb{T}), \]


where $\mathbb{T} \subset \mathbb{C}$ is the unit circle in the complex plane. The restriction of the embedding $\gamma_i$ to $S_1 \subset SU(2)$ is a symplectic automorphism from $S_1$ onto the symplectic leaf $\Sigma_{m_i} \subset U$, where $m_i = \exp\bigl(\tfrac{\pi}{2}(X_{\alpha_i} - X_{-\alpha_i})\bigr)$. Recall that $m_i \in N_U(T)$ is a representative of the simple reflection $s_i \in W$. For arbitrary $m \in N_U(T)$ let $w(m) = s_{i_1} s_{i_2} \cdots s_{i_l}$ be a reduced expression for $w(m) := m/T \in W$, and let $t_m \in T$ be the unique element such that $m = m_{i_1} m_{i_2} \cdots m_{i_l} t_m$. Note that $t_m$ depends on the choice of reduced expression for $w(m)$. The map
\[ (g_1, \dots, g_l) \mapsto \gamma_{i_1}(g_1)\gamma_{i_2}(g_2) \cdots \gamma_{i_l}(g_l)\, t_m \]
defines a symplectic automorphism from $S_1^{\times l}$ onto the symplectic leaf $\Sigma_m \subset U$ (cf. [40, §2]; [42]). Note that the image of the map is independent of the choice of reduced expression for $w(m)$, although the map itself is not. Combined with Proposition 2.2 we now obtain the following description of the symplectic leaves of the generalized flag manifold $U/K_S$.

Proposition 2.3. Let $m \in N_U(T)$ and set $w := m/T \in W$. Let $w_1 \in W^S$, $w_2 \in W_S$ be such that $w = w_1 w_2$ and choose reduced expressions $w_1 = s_{i_1} \cdots s_{i_p}$ and $w_2 = s_{i_{p+1}} \cdots s_{i_l}$. Then the map
\[ (g_1, g_2, \dots, g_l) \mapsto \gamma_{i_1}(g_1)\gamma_{i_2}(g_2) \cdots \gamma_{i_l}(g_l)/K_S \]
is a surjective Poisson map from $S_1^{\times l}$ onto the Schubert cell $X_w$. It factorizes through the projection $\mathrm{pr} : S_1^{\times l} = S_1^{\times p} \times S_1^{\times (l-p)} \to S_1^{\times p}$. The quotient map from $S_1^{\times p}$ onto $X_w$ is a symplectic automorphism. In particular, we have
\[ X_w = \Sigma_{m_{i_1}} \Sigma_{m_{i_2}} \cdots \Sigma_{m_{i_p}}/K_S. \]
See Lu [24] for more details in the case of the full flag manifold ($K_S = T$).

3. Preliminaries on the Quantized Function Algebra $\mathbb{C}_q[U]$

In this section we introduce some notations which we will need throughout the remainder of this paper. First, we recall the definition of the quantized universal enveloping algebra associated with the simple complex Lie algebra $\mathfrak{g}$. We use the notations introduced in the previous section. Set $d_i := d_{\alpha_i}$ and $H_i := H_{\alpha_i}$ for $i \in [1, r]$. Let $A = (a_{ij})$ be the Cartan matrix, i.e. $a_{ij} := d_i^{-1}(\alpha_i, \alpha_j)$. Note that $H_i \in \mathfrak{h}$ is the unique element such that $\alpha_j(H_i) = a_{ij}$ for all $j$. The weight lattice is given by
\[ P = \{\lambda \in \mathfrak{h}^* \mid \lambda(H_i) = (\lambda, \alpha_i^\vee) \in \mathbb{Z} \ \ \forall i\}. \tag{3.1} \]

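The fundamental weights $\varpi_{\alpha_i} = \varpi_i$ ($i \in [1, r]$) are characterized by $\varpi_i(H_j) = (\varpi_i, \alpha_j^\vee) = \delta_{ij}$ for all $j$. The set of dominant weights $P_+$ resp. regular dominant weights $P_{++}$ is equal to $\mathbb{K}$-span$\{\varpi_\alpha\}_{\alpha \in \Delta}$ with $\mathbb{K} = \mathbb{Z}_+$ resp. $\mathbb{N}$. We fix $q \in (0, 1)$. The quantized universal enveloping algebra $U_q(\mathfrak{g})$ associated with the simple Lie algebra $\mathfrak{g}$ is the unital associative algebra over $\mathbb{C}$ with generators $K_i^{\pm 1}$, $X_i^{\pm}$ ($i \in [1, r]$) and relations

$$\begin{aligned}
&K_i K_j = K_j K_i, \qquad K_i K_i^{-1} = K_i^{-1} K_i = 1, \qquad K_i X_j^{\pm} K_i^{-1} = q_i^{\pm\alpha_j(H_i)} X_j^{\pm},\\
&X_i^+ X_j^- - X_j^- X_i^+ = \delta_{ij}\, \frac{K_i - K_i^{-1}}{q_i - q_i^{-1}},\\
&\sum_{s=0}^{1-a_{ij}} (-1)^s \binom{1-a_{ij}}{s}_{q_i} (X_i^{\pm})^{1-a_{ij}-s} X_j^{\pm} (X_i^{\pm})^{s} = 0 \quad (i \neq j),
\end{aligned} \tag{3.2}$$

where $q_i := q^{d_i}$,

$$[a]_q := \frac{q^a - q^{-a}}{q - q^{-1}} \ \ (a \in \mathbb{N}), \qquad [0]_q! := 1, \qquad [a]_q! := [a]_q [a-1]_q \cdots [1]_q, \qquad \binom{a}{n}_q := \frac{[a]_q!}{[a-n]_q!\,[n]_q!}.$$

A Hopf algebra structure on $U_q(\mathfrak{g})$ is uniquely determined by the formulas

$$\begin{aligned}
&\Delta(X_i^+) = X_i^+ \otimes 1 + K_i \otimes X_i^+, \qquad \Delta(X_i^-) = X_i^- \otimes K_i^{-1} + 1 \otimes X_i^-, \qquad \Delta(K_i^{\pm 1}) = K_i^{\pm 1} \otimes K_i^{\pm 1},\\
&S(K_i^{\pm 1}) = K_i^{\mp 1}, \qquad S(X_i^+) = -K_i^{-1} X_i^+, \qquad S(X_i^-) = -X_i^- K_i,\\
&\varepsilon(K_i^{\pm 1}) = 1, \qquad \varepsilon(X_i^{\pm}) = 0.
\end{aligned} \tag{3.3}$$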
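The $q$-numbers that enter the Serre relations in (3.2) are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original text: the function names `q_number`, `q_factorial`, `q_binomial` and `serre_coefficients` are illustrative choices, and the empty-product convention $[0]_q! = 1$ is assumed.

```python
def q_number(a, q):
    """[a]_q = (q^a - q^(-a)) / (q - q^(-1)) for an integer a >= 0 and 0 < q < 1."""
    return (q ** a - q ** (-a)) / (q - 1 / q)


def q_factorial(a, q):
    """[a]_q! = [a]_q [a-1]_q ... [1]_q, with the empty product [0]_q! = 1 (assumed)."""
    result = 1.0
    for k in range(1, a + 1):
        result *= q_number(k, q)
    return result


def q_binomial(a, n, q):
    """q-binomial coefficient [a]_q! / ([a-n]_q! [n]_q!) for 0 <= n <= a."""
    return q_factorial(a, q) / (q_factorial(a - n, q) * q_factorial(n, q))


def serre_coefficients(a_ij, q_i):
    """Coefficients (-1)^s * binom(1 - a_ij, s)_{q_i} for s = 0, ..., 1 - a_ij."""
    m = 1 - a_ij
    return [(-1) ** s * q_binomial(m, s, q_i) for s in range(m + 1)]
```

For $a_{ij} = -1$ the three coefficients are $1$, $-(q_i + q_i^{-1})$, $1$, so the Serre relation reduces to $(X_i^\pm)^2 X_j^\pm - (q_i + q_i^{-1}) X_i^\pm X_j^\pm X_i^\pm + X_j^\pm (X_i^\pm)^2 = 0$.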

In fact, $U_q(\mathfrak{g})$ may be regarded as a quantization of the co-Poisson-Hopf algebra structure (cf. [2, Ch. 6]) on $U(\mathfrak{g})$ induced by the Lie bialgebra $(\mathfrak{g}, -i\delta)$, $\delta$ being the cocommutator of $\mathfrak{g}$ associated with the $r$-matrix (2.2). $U_q(\mathfrak{g})$ becomes a Hopf ∗-algebra with ∗-structure on the generators given by

$$(K_i^{\pm 1})^* = K_i^{\pm 1}, \qquad (X_i^+)^* = q_i^{-1} X_i^- K_i, \qquad (X_i^-)^* = q_i K_i^{-1} X_i^+. \tag{3.4}$$

In the classical limit $q \to 1$, the ∗-structure becomes an involutive, conjugate-linear anti-automorphism of $\mathfrak{g}$ with $-1$ eigenspace equal to the compact real form $\mathfrak{u}$ defined in (2.1). Let $U^{\pm} = U_q(\mathfrak{n}^{\pm})$ be the subalgebra of $U_q(\mathfrak{g})$ generated by $X_i^{\pm}$ ($i \in [1, r]$) and write $U^0 := U_q(\mathfrak{h})$ for the commutative subalgebra generated by $K_i^{\pm 1}$ ($i \in [1, r]$). Let $Q$ (resp. $Q^+$) be the integral (resp. positive integral) span of the positive roots. We have the direct sum decomposition

$$U^{\pm} = \bigoplus_{\alpha \in Q^+} U^{\pm}_{\pm\alpha},$$

where $U^{\pm}_{\pm\alpha} := \{\phi \in U^{\pm} \mid K_i \phi K_i^{-1} = q_i^{\pm\alpha(H_i)} \phi\}$. The Poincaré–Birkhoff–Witt Theorem for $U_q(\mathfrak{g})$ states that multiplication defines an isomorphism of vector spaces

$$U^- \otimes U^0 \otimes U^+ \to U_q(\mathfrak{g}).$$

In particular, $U_q(\mathfrak{g})$ is spanned by elements of the form $b_{-\eta} K^{\alpha} a_{\zeta}$, where $b_{-\eta} \in U^-_{-\eta}$, $a_{\zeta} \in U^+_{\zeta}$ ($\eta, \zeta \in Q^+$) and $\alpha \in Q$. Here we used the notation $K^{\alpha} = K_1^{k_1} \cdots K_r^{k_r}$ if $\alpha = \sum_i k_i \alpha_i$.

For a left $U_q(\mathfrak{g})$-module $V$, we say that $0 \neq v \in V$ has weight $\mu \in \mathfrak{h}^*$ if $K_i \cdot v = q_i^{\mu(H_i)} v = q^{(\mu, \alpha_i)} v$ for all $i$. We write $V_\mu$ for the corresponding weight space. Recall that a $P$-weighted finite-dimensional irreducible representation of $U_q(\mathfrak{g})$ is a highest weight module $V = V(\lambda)$ with highest weight $\lambda \in P_+$. If $v_\lambda \in V(\lambda)$ is a highest weight vector, we have $V(\lambda) = \bigoplus_{\alpha \in Q^+} U^-_{-\alpha} v_\lambda$ by the PBW Theorem, hence the set of weights $P(\lambda)$ of $V(\lambda)$ is a subset of the weight lattice $P$ satisfying $\mu \leq \lambda$ for all $\mu \in P(\lambda)$. Here $\leq$ is the dominance order on $P$ (i.e. $\mu \leq \nu$ if $\nu - \mu \in Q^+$, and $\mu < \nu$ if $\mu \leq \nu$ and $\mu \neq \nu$).

We define irreducible finite-dimensional $P$-weighted right $U_q(\mathfrak{g})$-modules with respect to the opposite Borel subgroup. So the irreducible finite-dimensional right $U_q(\mathfrak{g})$-module $V(\lambda)$ with highest weight $\lambda \in P_+$ has the weight space decomposition $V(\lambda) = \bigoplus_{\alpha \in Q^+} v_\lambda U^+_{\alpha}$, where $v_\lambda \in V(\lambda)$ is the highest weight vector of $V(\lambda)$. The weights of the right $U_q(\mathfrak{g})$-module $V(\lambda)$ coincide with the weights of the left $U_q(\mathfrak{g})$-module $V(\lambda)$ and the dimensions of the corresponding weight spaces are the same.

The quantized algebra $\mathbb{C}_q[G]$ of functions on the connected simply connected complex Lie group $G$ with Lie algebra $\mathfrak{g}$ is the subspace in the linear dual $U_q(\mathfrak{g})^*$ spanned by the matrix coefficients of the finite-dimensional irreducible representations $V(\lambda)$ ($\lambda \in P_+$). The Hopf ∗-algebra structure on $U_q(\mathfrak{g})$ induces a Hopf ∗-algebra structure on $\mathbb{C}_q[G] \subset U_q(\mathfrak{g})^*$ by the formulas

$$\begin{aligned}
&(\phi\psi)(X) = (\phi \otimes \psi)\Delta(X), \qquad 1(X) = \varepsilon(X),\\
&\Delta(\phi)(X \otimes Y) = \phi(XY), \qquad \varepsilon(\phi) = \phi(1),\\
&S(\phi)(X) = \phi(S(X)), \qquad (\phi^*)(X) = \overline{\phi(S(X)^*)},
\end{aligned} \tag{3.5}$$

where $\phi, \psi \in \mathbb{C}_q[G] \subset U_q(\mathfrak{g})^*$ and $X, Y \in U_q(\mathfrak{g})$. The algebra $\mathbb{C}_q[G]$ can be regarded as a quantization of the Poisson algebra of polynomial functions on the algebraic Poisson–Lie group $G$, where the Poisson structure on $G$ is given by the Sklyanin bracket associated with the classical $r$-matrix $-ir$ (cf. (2.2)). Since the ∗-structure (3.5) on $\mathbb{C}_q[G]$ is associated with the compact real form $U$ of $G$ in the classical limit, we will write $\mathbb{C}_q[U]$ for $\mathbb{C}_q[G]$ with this particular choice of ∗-structure. Note that $\mathbb{C}_q[U]$ is a $U_q(\mathfrak{g})$-bimodule with the left respectively right action given by

$$(X.\phi)(Y) := \phi(YX), \qquad (\phi.X)(Y) := \phi(XY), \tag{3.6}$$

where $\phi \in \mathbb{C}_q[U]$ and $X, Y \in U_q(\mathfrak{g})$. The finite-dimensional irreducible $U_q(\mathfrak{g})$-module $V(\lambda)$ of highest weight $\lambda \in P_+$ is known to be unitarizable (say with inner product $(\cdot, \cdot)$). So we can choose an orthonormal basis consisting of weight vectors

$$\{v_\mu^{(i)} \mid \mu \in P(\lambda),\ i \in [1, \dim(V(\lambda)_\mu)]\}, \tag{3.7}$$

where $v_\mu^{(i)} \in V(\lambda)_\mu$ (we omit the index $i$ if $\dim(V(\lambda)_\mu) = 1$). Set

$$C^{\lambda}_{\mu,i;\nu,j}(X) := (X.v_\nu^{(j)}, v_\mu^{(i)}), \qquad X \in U_q(\mathfrak{g}), \tag{3.8}$$

for $\mu, \nu \in P(\lambda)$ and $1 \leq i \leq \dim(V(\lambda)_\mu)$, $1 \leq j \leq \dim(V(\lambda)_\nu)$. If $\dim(V(\lambda)_\mu) = 1$ respectively $\dim(V(\lambda)_\nu) = 1$ we omit the dependence on $i$ respectively $j$ in (3.8). It is sometimes also convenient to use the notation

$$C^{\lambda}_{v;w}(X) := (X.w, v), \qquad v, w \in V(\lambda),\ X \in U_q(\mathfrak{g}).$$

Note that when $\lambda$ runs through $P_+$ and $\mu$, $i$, $\nu$ and $j$ run through the above-mentioned sets the matrix elements (3.8) form a linear basis of $\mathbb{C}_q[G]$. Furthermore, we have the formulas

$$\Delta(C^{\lambda}_{\mu,i;\nu,j}) = \sum_{\sigma,s} C^{\lambda}_{\mu,i;\sigma,s} \otimes C^{\lambda}_{\sigma,s;\nu,j}, \qquad \varepsilon(C^{\lambda}_{\mu,i;\nu,j}) = \delta_{\mu,\nu}\delta_{i,j}, \qquad (C^{\lambda}_{\mu,i;\nu,j})^* = S(C^{\lambda}_{\nu,j;\mu,i}). \tag{3.9}$$

(Sums for which the summation sets are not specified are taken over the “obvious” choice of summation sets.) Using the relations (3.9) and the Hopf algebra axiom for the antipode $S$ we obtain

$$\sum_{\sigma,s} (C^{\lambda}_{\sigma,s;\mu,i})^* C^{\lambda}_{\sigma,s;\nu,j} = \delta_{\mu,\nu}\delta_{i,j}. \tag{3.10}$$

The elements $(C^{\lambda}_{\mu,i;\nu,j})^*$ are matrix coefficients of the contragredient representation $V(\lambda)^* \simeq V(-\sigma_0\lambda)$ (here $\sigma_0$ is the longest element in $W$). To be precise, let $\pi : U_q(\mathfrak{g}) \to \mathrm{End}(V(\lambda))$ be the representation of highest weight $\lambda$, and let $(\cdot, \cdot)$ be an inner product with respect to which $\pi$ is unitarizable. Fix an orthonormal basis of weight vectors $\{v_\mu^{(r)}\}$. Let $(\pi^*, V(\lambda)^*)$ be the contragredient representation, i.e. $\pi^*(X)\phi = \phi \circ \pi(S(X))$ for $X \in U_q(\mathfrak{g})$ and $\phi \in V(\lambda)^*$. For $u \in V(\lambda)$ set $u^* := (\cdot, u) \in V(\lambda)^*$. We define an inner product on $V(\lambda)^*$ by

$$(u^*, v^*) := \bigl(\pi(K^{-2\rho})v, u\bigr), \qquad u, v \in V(\lambda),$$

where $\rho = \tfrac{1}{2}\sum_{\alpha \in R^+} \alpha \in \mathfrak{h}^*$. By using the fact that $S^2(u) = K^{-2\rho} u K^{2\rho}$ ($u \in U_q(\mathfrak{g})$) one easily deduces that $\pi^*$ is unitarizable with respect to the inner product $(\cdot, \cdot)$ on $V(\lambda)^*$ and that $\{\phi_{-\mu}^{(i)} := q^{(\mu,\rho)} (v_\mu^{(i)})^*\}$ is an orthonormal basis of $V(\lambda)^*$ consisting of weight vectors (here $\phi_{-\mu}^{(i)}$ has weight $-\mu$). Defining the matrix coefficients $C^{-\sigma_0\lambda}_{-\mu,i;-\nu,j}$ of $(\pi^*, V(\lambda)^*)$ with respect to the orthonormal basis $\{\phi_{-\mu}^{(i)}\}$, we then have

$$(C^{\lambda}_{\mu,i;\nu,j})^* = q^{(\mu-\nu,\rho)}\, C^{-\sigma_0\lambda}_{-\mu,i;-\nu,j} \tag{3.11}$$

(cf. [40, Prop. 3.3]).

A fundamental role in Soibel’man’s theory of irreducible ∗-representations of $\mathbb{C}_q[U]$ is played by a Poincaré–Birkhoff–Witt (PBW) type factorization of $\mathbb{C}_q[U]$. For $\lambda \in P_+$, set

$$B_\lambda := \mathrm{span}\{C^{\lambda}_{v;v_\lambda} \mid v \in V(\lambda)\}. \tag{3.12}$$

Note that $B_\lambda$ is a right $U_q(\mathfrak{g})$-submodule of $\mathbb{C}_q[U]$ isomorphic to $V(\lambda)$. Set

$$A_+ := \bigoplus_{\lambda \in P_+} B_\lambda, \qquad A_{++} := \bigoplus_{\lambda \in P_{++}} B_\lambda. \tag{3.13}$$

The subalgebra and right $U_q(\mathfrak{g})$-module $A_+$ is equal to the subalgebra of left $U^+$-invariant elements in $\mathbb{C}_q[U]$ (cf. [8]). The existence of a PBW type factorization of $\mathbb{C}_q[U]$ now amounts to the following statement.

Theorem 3.1 ([40, Thm. 3.1]). The multiplication map $m : (A_{++})^* \otimes A_{++} \to \mathbb{C}_q[U]$ is surjective.

A detailed proof can be found in [8, Prop. 9.2.2]. The proof is based on certain results concerning decompositions of tensor products of irreducible finite-dimensional $U_q(\mathfrak{g})$-modules which can be traced back to Kostant in the classical case [16, Thm. 5.1]. The close connection between Theorem 3.1 and the decomposition of tensor products of irreducible $U_q(\mathfrak{g})$-modules becomes clear by observing that

$$(B_\lambda)^* B_\mu \simeq V(\lambda)^* \otimes V(\mu) \tag{3.14}$$

as right $U_q(\mathfrak{g})$-modules.

Important for the study of ∗-representations of $\mathbb{C}_q[U]$ is some detailed information about the commutation relations between matrix elements in $\mathbb{C}_q[U]$. In view of Theorem 3.1, we are especially interested in commutation relations between the $C^{\lambda}_{\mu,i;\lambda}$ and $C^{\Lambda}_{\nu,j;\Lambda}$ resp. between the $C^{\lambda}_{\mu,i;\lambda}$ and $(C^{\Lambda}_{\nu,j;\Lambda})^*$, where $\lambda, \Lambda \in P_+$. To state these commutation relations we need to introduce certain vector subspaces of $\mathbb{C}_q[U]$. Let $\lambda, \Lambda \in P_+$ and $\mu \in P(\lambda)$, $\nu \in P(\Lambda)$, then we set

$$\begin{aligned}
N(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} \mid (v, w) \in sN\},\\
N^{\mathrm{opp}}(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mid (v, w) \in sN\},
\end{aligned} \tag{3.15}$$

where $sN := sN(\mu, \lambda; \nu, \Lambda)$ is the set of pairs $(v, w) \in V(\lambda)_{\mu'} \times V(\Lambda)_{\nu'}$ with $\mu' > \mu$, $\nu' < \nu$ and $\mu' + \nu' = \mu + \nu$. Furthermore set

$$\begin{aligned}
O(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{(C^{\lambda}_{v;v_\lambda})^* C^{\Lambda}_{w;v_\Lambda} \mid (v, w) \in sO\},\\
O^{\mathrm{opp}}(\mu, \lambda; \nu, \Lambda) &:= \mathrm{span}\{C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^* \mid (v, w) \in sO\},
\end{aligned} \tag{3.16}$$

where $sO := sO(\mu, \lambda; \nu, \Lambda)$ is the set of pairs $(v, w) \in V(\lambda)_{\mu'} \times V(\Lambda)_{\nu'}$ with $\mu' < \mu$, $\nu' < \nu$ and $\mu - \mu' = \nu - \nu'$. If $sN$ (resp. $sO$) is empty, then let $N = N^{\mathrm{opp}} = \{0\}$ (resp. $O = O^{\mathrm{opp}} = \{0\}$). We now have the following proposition.

Proposition 3.2. Let $\lambda, \Lambda \in P_+$ and $v \in V(\lambda)_\mu$, $w \in V(\Lambda)_\nu$.
(i) The matrix elements $C^{\lambda}_{v;v_\lambda}$ and $C^{\Lambda}_{w;v_\Lambda}$ satisfy the commutation relation

$$C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} = q^{(\lambda,\Lambda)-(\mu,\nu)}\, C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mod N(\mu, \lambda; \nu, \Lambda).$$

Moreover, we have $N = N^{\mathrm{opp}}$.
(ii) The matrix elements $(C^{\lambda}_{v;v_\lambda})^*$ and $C^{\Lambda}_{w;v_\Lambda}$ satisfy the commutation relation

$$(C^{\lambda}_{v;v_\lambda})^* C^{\Lambda}_{w;v_\Lambda} = q^{(\mu,\nu)-(\lambda,\Lambda)}\, C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^* \mod O(\mu, \lambda; \nu, \Lambda).$$

Moreover, we have $O = O^{\mathrm{opp}}$.

Soibel’man [40] derived commutation relations using the universal $R$-matrix whereas Joseph [8, §9.1] used the Poincaré–Birkhoff–Witt Theorem for $U_q(\mathfrak{g})$ and the left, respectively right, action (3.6) of $U_q(\mathfrak{g})$ on $\mathbb{C}_q[U]$. Although the commutation relations formulated here are slightly sharper, the proof can be derived in a similar manner and will therefore be omitted. As a corollary of Proposition 3.2 (i) we have

Corollary 3.3. Let $\lambda, \Lambda \in P_+$ and $v \in V(\lambda)_\mu$, $w \in V(\Lambda)_\nu$. Then

$$C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda} = q^{(\mu,\nu)-(\lambda,\Lambda)}\, C^{\Lambda}_{w;v_\Lambda} C^{\lambda}_{v;v_\lambda} \mod N(\nu, \Lambda; \mu, \lambda). \tag{3.17}$$

Note that Proposition 3.2 (i) and Corollary 3.3 give two different ways to rewrite $C^{\lambda}_{v;v_\lambda} C^{\Lambda}_{w;v_\Lambda}$ as elements of the vector space

$$W_{\lambda,\Lambda} := \mathrm{span}\{C^{\Lambda}_{w';v_\Lambda} C^{\lambda}_{v';v_\lambda} \mid v' \in V(\lambda),\ w' \in V(\Lambda)\}.$$

We will need both “inequivalent” commutation relations (Proposition 3.2 (i) and Corollary 3.3) in later sections. It follows in particular that, when $v' \in V(\lambda)$ and $w' \in V(\Lambda)$ run through a basis, the elements $C^{\Lambda}_{w';v_\Lambda} C^{\lambda}_{v';v_\lambda}$ are (in general) linearly dependent. This also follows from the following two observations. On the one hand, $W_{\lambda,\Lambda} \simeq V(\lambda + \Lambda)$ as right $U_q(\mathfrak{g})$-modules. On the other hand, $V(\lambda + \Lambda)$ occurs with multiplicity one in $V(\lambda) \otimes V(\Lambda)$, whereas in general $V(\lambda) \otimes V(\Lambda)$ has other irreducible components too. By contrast, the commutation relation given in Proposition 3.2 (ii) is unique in the sense that, when $v \in V(\lambda)$ and $w \in V(\Lambda)$ run through a basis, the $C^{\Lambda}_{w;v_\Lambda} (C^{\lambda}_{v;v_\lambda})^*$ are linearly independent (cf. (3.14)).

We end this section by recalling the special case $\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$. Set

$$t_{11} := C^{\varpi_1}_{\varpi_1;\varpi_1}, \qquad t_{12} := C^{\varpi_1}_{\varpi_1;-\varpi_1}, \qquad t_{21} := C^{\varpi_1}_{-\varpi_1;\varpi_1}, \qquad t_{22} := C^{\varpi_1}_{-\varpi_1;-\varpi_1}. \tag{3.18}$$

Then it is well known that the $t_{ij}$’s generate the algebra $\mathbb{C}_q[SU(2)]$. The commutation relations

$$\begin{aligned}
&t_{k1} t_{k2} = q\, t_{k2} t_{k1}, \qquad t_{1k} t_{2k} = q\, t_{2k} t_{1k} \quad (k = 1, 2),\\
&t_{12} t_{21} = t_{21} t_{12}, \qquad t_{11} t_{22} - t_{22} t_{11} = (q - q^{-1})\, t_{12} t_{21},\\
&t_{11} t_{22} - q\, t_{12} t_{21} = 1
\end{aligned} \tag{3.19}$$

characterize the algebra structure of $\mathbb{C}_q[SU(2)]$ in terms of the generators $t_{ij}$. The ∗-structure is uniquely determined by the formulas $t_{11}^* = t_{22}$, $t_{12}^* = -q\, t_{21}$.

4. Quantized Function Algebras on Generalized Flag Manifolds

Let $S$ be any subset of the simple roots $\Delta$. We will sometimes identify $S$ with the index set $\{i \mid \alpha_i \in S\}$. Let $\mathfrak{p}_S \subset \mathfrak{g}$ be the corresponding standard parabolic subalgebra, given explicitly by (2.6). We define the quantized universal enveloping algebra $U_q(\mathfrak{l}_S)$ associated with the Levi factor $\mathfrak{l}_S$ of $\mathfrak{p}_S$ as the subalgebra of $U_q(\mathfrak{g})$ generated by $K_i^{\pm 1}$ ($i \in [1, r]$) and $X_i^{\pm}$ ($i \in S$). Note that $U_q(\mathfrak{l}_S)$ is a Hopf ∗-subalgebra of $U_q(\mathfrak{g})$.

For later use in this section we briefly discuss the finite-dimensional representation theory of $U_q(\mathfrak{l}_S)$. Recall that $\mathfrak{l}_S$ is a reductive Lie algebra with centre

$$Z(\mathfrak{l}_S) = \bigcap_{i \in S} \mathrm{Ker}(\alpha_i) \subset \mathfrak{h}. \tag{4.1}$$

Moreover, we have direct sum decompositions

$$\mathfrak{h} = Z(\mathfrak{l}_S) \oplus \mathfrak{h}_S, \qquad \mathfrak{l}_S = Z(\mathfrak{l}_S) \oplus \mathfrak{l}_S^{ss}, \tag{4.2}$$

where $\mathfrak{h}_S = \mathrm{span}\{H_i\}_{i \in S}$ and $\mathfrak{l}_S^{ss}$ is the semisimple part of $\mathfrak{l}_S$. The semisimple part $\mathfrak{l}_S^{ss}$ is explicitly given by

$$\mathfrak{l}_S^{ss} := \mathfrak{h}_S \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}_\alpha. \tag{4.3}$$

We define the quantized universal enveloping algebra $U_q(\mathfrak{l}_S^{ss})$ associated with the semisimple part $\mathfrak{l}_S^{ss}$ of $\mathfrak{l}_S$ as the subalgebra of $U_q(\mathfrak{g})$ generated by $K_i^{\pm 1}$ and $X_i^{\pm}$ for all $i \in S$. Observe that $U_q(\mathfrak{l}_S^{ss})$ is a Hopf ∗-subalgebra of $U_q(\mathfrak{g})$.

There are obvious notions of weight vectors and weights for $U_q(\mathfrak{l}_S)$-modules. With a suitably extended interpretation of the notion of highest weight, the irreducible finite-dimensional $U_q(\mathfrak{l}_S)$-modules may be characterized in terms of highest weights. By relating the finite-dimensional representation theory of $U_q(\mathfrak{l}_S)$ to the representation theory of $U_q(\mathfrak{l}_S^{ss})$, which is well known since $\mathfrak{l}_S^{ss}$ is semisimple, one easily derives the following result.

Proposition 4.1. (i) Any finite-dimensional $U_q(\mathfrak{l}_S)$-module $V$ which is completely reducible as $U_q(\mathfrak{h})$-module, is completely reducible as $U_q(\mathfrak{l}_S)$-module.
(ii) The multiplicity of any irreducible $U_q(\mathfrak{l}_S)$-module in the irreducible decomposition of the restriction of the $U_q(\mathfrak{g})$-module $V(\lambda)$ to $U_q(\mathfrak{l}_S)$ is the same as in the classical case.

Next, we define the quantized algebra of functions on $U/K_S$. The mapping $\iota_S^* : U_q(\mathfrak{g})^* \twoheadrightarrow U_q(\mathfrak{l}_S)^*$ dual to the Hopf ∗-embedding $\iota_S : U_q(\mathfrak{l}_S) \hookrightarrow U_q(\mathfrak{g})$ is surjective, and we set $\mathbb{C}_q[L_S] := \iota_S^*(\mathbb{C}_q[G]) = \{\phi \circ \iota_S \mid \phi \in \mathbb{C}_q[G]\}$. The formulas (3.5) uniquely determine a Hopf ∗-algebra structure on $\mathbb{C}_q[L_S]$, and $\iota_S^*$ then becomes a Hopf ∗-algebra morphism. We write $\mathbb{C}_q[K_S]$ for $\mathbb{C}_q[L_S]$ with this particular choice of ∗-structure. Define a ∗-subalgebra $\mathbb{C}_q[U/K_S] \subset \mathbb{C}_q[U]$ by

$$\begin{aligned}
\mathbb{C}_q[U/K_S] &:= \{\phi \in \mathbb{C}_q[U] \mid (\mathrm{id} \otimes \iota_S^*)\Delta(\phi) = \phi \otimes 1\}\\
&\phantom{:}= \{\phi \in \mathbb{C}_q[U] \mid X.\phi = \varepsilon(X)\phi \ \ \forall\, X \in U_q(\mathfrak{l}_S)\}.
\end{aligned} \tag{4.4}$$

The algebra $\mathbb{C}_q[U/K_S]$ is a left $\mathbb{C}_q[U]$-subcomodule of $\mathbb{C}_q[U]$. We call it the quantized algebra of functions on the generalized flag manifold $U/K_S$. In a similar way, one can define the quantized function algebra $\mathbb{C}_q[K_S^{ss}]$ corresponding to the semisimple part $K_S^{ss}$ of $K_S$ as the image of the dual of the natural embedding $U_q(\mathfrak{l}_S^{ss}) \hookrightarrow U_q(\mathfrak{g})$. Its Hopf ∗-algebra structure is again given by the formulas (3.5). The subalgebra $\mathbb{C}_q[U/K_S^{ss}]$ then consists by definition of all right $\mathbb{C}_q[K_S^{ss}]$-invariant elements in $\mathbb{C}_q[U]$. Note that $\mathbb{C}_q[U/K_S^{ss}] \subset \mathbb{C}_q[U]$ is a left $U_q(\mathfrak{h})$-submodule and that $\mathbb{C}_q[U/K_S]$ coincides with the subalgebra of $U_q(\mathfrak{h})$-invariant elements in $\mathbb{C}_q[U/K_S^{ss}]$.

We now turn to PBW type factorizations of the algebra $\mathbb{C}_q[U/K_S]$. Write $P(S)$, $P_+(S)$, resp. $P_{++}(S)$ for $\mathbb{K}$-span$\{\varpi_\alpha\}_{\alpha \in S}$ with $\mathbb{K} = \mathbb{Z}$, $\mathbb{Z}_+$ resp. $\mathbb{N}$. Set $S^c := \Delta \setminus S$. The quantized algebra $A_S^{hol}$ of holomorphic polynomials on $U/K_S$ is defined by

$$A_S^{hol} := \bigoplus_{\lambda \in P_+(S^c)} B_\lambda \subset \mathbb{C}_q[U], \tag{4.5}$$

where $B_\lambda$ is given by (3.12) (cf. [19,20,41,9] and [15]). Note that $A_S^{hol}$ is a right $U_q(\mathfrak{g})$-comodule subalgebra of $\mathbb{C}_q[U]$, (4.5) being the (multiplicity free) decomposition of $A_S^{hol}$ into irreducible $U_q(\mathfrak{g})$-modules. The right $U_q(\mathfrak{g})$-module algebra $(A_S^{hol})^* \subset \mathbb{C}_q[U]$ is called the quantized algebra of antiholomorphic polynomials on $U/K_S$.

Lemma 4.2. The linear subspace

$$A_S^{ss} := m\bigl((A_S^{hol})^* \otimes A_S^{hol}\bigr) \subset \mathbb{C}_q[U],$$

where $m$ is the multiplication map of $\mathbb{C}_q[U]$, is a right $U_q(\mathfrak{g})$-submodule ∗-subalgebra of $\mathbb{C}_q[U]$.

Proof. Proposition 3.2 (ii) implies that $A_S^{ss}$ is a subalgebra of $\mathbb{C}_q[U]$. The other assertions are immediate. $\square$

The subalgebra $A_S^{ss}$ may be considered as a quantum analogue of the algebra of complex-valued polynomial functions on the real manifold $U/K_S^{ss}$.

Remark 4.3. In the classical setting ($q = 1$), the algebra $A_S^{ss}$ ($\#S^c = 1$) can be interpreted as an algebra of functions on the product of an affine spherical $G$-variety with its dual. The $G$-module structure on $A_S^{ss}$ is then related to the doubled $G$-action (see [32,33] for the terminology). These (and related) $G$-varieties have been studied in several papers, see for example [33,32] and [23].

The algebra $A_S^{ss} \subset \mathbb{C}_q[U]$ is stable under the left $U_q(\mathfrak{h})$-action, so we can speak of $U_q(\mathfrak{h})$-weighted elements in $A_S^{ss}$. Let $A_S$ be the left $U_q(\mathfrak{h})$-invariant elements of $A_S^{ss}$. Then $A_S \subset \mathbb{C}_q[U]$ is a right $U_q(\mathfrak{g})$-module ∗-subalgebra of $\mathbb{C}_q[U]$. We now have the following lemma.

Lemma 4.4. We have $A_S^{ss} \subset \mathbb{C}_q[U/K_S^{ss}]$, so in particular $A_S \subset \mathbb{C}_q[U/K_S]$. Furthermore,

$$A_S = \mathrm{span}\{(C^{\lambda}_{v;v_\lambda})^* C^{\lambda}_{w;v_\lambda} \mid \lambda \in P_+(S^c),\ v, w \in V(\lambda)\}. \tag{4.6}$$

Proof. Choose $\lambda \in P_+(S^c)$ and $i \in S$. Then we have $X_i^+ \cdot v_\lambda = 0$ and $K_i \cdot v_\lambda = v_\lambda$. It follows that $\mathbb{C}v_\lambda \subset V(\lambda)$ is a one-dimensional $U_{q_i}(\mathfrak{sl}(2; \mathbb{C}))$-submodule, where we consider the $U_{q_i}(\mathfrak{sl}(2; \mathbb{C}))$ action on $V(\lambda)$ via the embedding $\phi_i : U_{q_i}(\mathfrak{sl}(2; \mathbb{C})) \hookrightarrow U_q(\mathfrak{g})$. It follows that $X_i^- \cdot v_\lambda = 0$. This readily implies that $A_S^{ss} \subset \mathbb{C}_q[U/K_S^{ss}]$. The remaining assertions are immediate. $\square$

Definition 4.5. We call $A_S \subset \mathbb{C}_q[U/K_S]$ the factorized ∗-subalgebra associated with $U/K_S$.

In Theorem 4.10(i) below we show that Theorem 3.1 directly implies that $A_\emptyset = \mathbb{C}_q[U/K_\emptyset]$. In fact, we conjecture the following.

Conjecture 4.6. $A_S = \mathbb{C}_q[U/K_S]$ for all subsets $S$ of the simple roots $\Delta$.

In Theorem 4.10(ii) below we will prove the conjecture for a certain subclass of generalized flag manifolds that we shall define and classify in the following proposition. For the proof in these cases we use the so-called Parthasarathy–Ranga Rao–Varadarajan (PRV) conjecture, which was proved independently by Kumar [18] and Mathieu [29]. The PRV conjecture gives information about which irreducible constituents occur in tensor products of irreducible finite-dimensional $U$-modules.

Recall the notation introduced in Sect. 2. A pair $(U, K)$ with $K$ a subgroup of $U$ is called a Gel’fand pair if for every irreducible representation of $U$, the subspace of $K$-fixed vectors is at most one dimensional. The following proposition was observed by Koornwinder [14].

Proposition 4.7 ([14]). Let $U$ be a connected, simply connected compact Lie group with Lie algebra $\mathfrak{u}$, and let $\mathfrak{p} \subset \mathfrak{g}$ be a standard maximal parabolic subalgebra. Let $K \subset U$ be the connected subgroup with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{u}$. Then $(U, K)$ is a Gel’fand pair if and only if one of the following three conditions is satisfied:
(i) $(U, K)$ is an irreducible compact Hermitian symmetric pair;
(ii) $(U, K) \simeq (SO(2l + 1), U(l))$ ($l \geq 2$);
(iii) $(U, K) \simeq (Sp(l), U(1) \times Sp(l - 1))$ ($l \geq 2$).

Proof. For a list of the irreducible compact Hermitian symmetric pairs see [6, Ch. X, Table V]. The proposition follows from this and the classification of compact Gel’fand pairs $(U, K)$ with $U$ simple (cf. [17, Tab. 1]). $\square$

Let $(U, K)$ be a pair from the list (i)–(iii) in Proposition 4.7, and let $(\mathfrak{u}, \mathfrak{k})$ be the associated pair of Lie algebras. Then $\mathfrak{k} = \mathfrak{k}_S$ for some subset $S \subset \Delta$ with $\#S^c = 1$. We call the simple root $\alpha \in S^c$ the Gel’fand node associated with $(U, K)$. A dominant weight $\lambda \in P_+$ is called spherical if the subspace of $K$-fixed vectors in $V(\lambda)$ is one dimensional. The corresponding representation $V(\lambda)$ is then also called spherical. We write $P_+^K \subset P_+$ for the subset of dominant spherical weights.

Proposition 4.8. Let $(U, K)$ be a pair from the list (i)–(iii) in Proposition 4.7, and let $\alpha \in \Delta$ be the associated Gel’fand node with corresponding fundamental weight $\varpi := \varpi_\alpha$. Then we have a multiplicity free irreducible decomposition of $U_q(\mathfrak{g})$-modules of the form

$$V(\varpi)^* \otimes V(\varpi) \simeq \bigoplus_{i=0}^{l} V(\mu_i)$$

for certain $l \in \mathbb{N}$, where $\mu_0 := 0 \in P_+$ and $\{\mu_i\}_{i=1}^{l}$ is a subset of the dominant spherical weights $P_+^K$. Furthermore, every $\lambda \in P_+^K$ can be uniquely written as a $\mathbb{Z}_+$-linear combination of the $\mu_i$’s ($i \in [1, l]$).

Definition 4.9. The spherical weights $\mu_i$ ($i \in [1, l]$) are called the fundamental spherical weights associated with $(U, K)$.

Proof. It is well known that the trivial representation $V(0)$ occurs with multiplicity one in the tensor product decomposition of $V(\varpi)^* \otimes V(\varpi)$. Furthermore, observe that

$$V(\varpi)^* \otimes V(\varpi) \simeq (B_\varpi)^* B_\varpi \subset A_{\{\alpha\}^c} \subset \mathbb{C}_q[U/K] \tag{4.7}$$

as right $U_q(\mathfrak{g})$-modules. By Proposition 4.1 we have the multiplicity free decomposition as right $U_q(\mathfrak{g})$-modules

$$\mathbb{C}_q[U/K] \simeq \bigoplus_{\lambda \in P_+^K} V(\lambda), \tag{4.8}$$

from which it follows that the decomposition of $V(\varpi)^* \otimes V(\varpi)$ is multiplicity free, and that its irreducible constituents are all spherical. Krämer [17, Tab. 1] presented for each pair $(U, K)$ from the list (i)–(iii) in Proposition 4.7 a set of dominant spherical weights $\{\mu_i\}_{i=1}^{l}$ satisfying the property that every $\lambda \in P_+^K$ can be uniquely written as a $\mathbb{Z}_+$-linear combination of the $\mu_i$’s ($i \in [1, l]$).

The $\mu_i$’s are explicitly given as a $\mathbb{Z}_+$-linear combination of the fundamental dominant weights $\varpi_j$ ($j \in [1, r]$). In case of the Hermitian symmetric spaces $U/K$, there is an elegant procedure to recover the $\mu_i$’s as linear combinations of the fundamental dominant weights from the corresponding Satake diagrams [44].

We show now that all spherical representations $V(\mu_i)$ ($i \in [1, l]$) are constituents of $V(\varpi)^* \otimes V(\varpi)$ by using the PRV conjecture, which states the following. Let $\lambda, \mu \in P_+$ and $w \in W$. Let $[\lambda + w\mu]$ be the unique element in $P_+$ which lies in the $W$-orbit of $\lambda + w\mu$. Then $V([\lambda + w\mu])$ occurs with multiplicity at least one in $V(\lambda) \otimes V(\mu)$ (for a proof, see [18,29] or [22]).

For each pair $(U, K)$ from the list (i)–(iii) of Proposition 4.7, it is now possible to find explicit Weyl group elements $w_i \in W$ such that $[\varpi - w_i\varpi] = \mu_i$ ($i \in [1, l]$). Combined with the PRV conjecture and the fact that $V(\varpi)^* \simeq V(-\sigma_0\varpi)$, this implies that $V(\mu_i)$ is a constituent of $V(\varpi)^* \otimes V(\varpi)$ for all $i \in [1, l]$.

As an example, let us follow the procedure for the compact Hermitian symmetric pair $(U, K) = (SO(2l), U(l))$ ($l \geq 2$). We use the standard realization of the root system $R$ of type $D_l$ in the $l$-dimensional vector space $V = \bigoplus_{i=1}^{l} \mathbb{R}\varepsilon_i$, with basis given by $\alpha_i = \varepsilon_i - \varepsilon_{i+1}$ ($i \in [1, l - 1]$) and $\alpha_l = \varepsilon_{l-1} + \varepsilon_l$. The fundamental weights are given by

$$\varpi_i = \varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_i \ \ (i < l - 1), \qquad \varpi_{l-1} = (\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{l-1} - \varepsilon_l)/2, \qquad \varpi_l = (\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_{l-1} + \varepsilon_l)/2.$$

We set $\varpi = \varpi_l$ (i.e. $S^c = \{\alpha_l\}$). Let $\sigma_i$ be the linear map defined by $\varepsilon_j \mapsto -\varepsilon_j$ ($j = i, i + 1$) and $\varepsilon_j \mapsto \varepsilon_j$ otherwise. Then $\sigma_i \in W$ ($i \in [1, l - 1]$). If $l = 2l' + 1$, then

$$\begin{aligned}
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2i-1}\varpi &= \varpi_{2i} \quad (i \in [1, l' - 1]),\\
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2l'-1}\varpi &= \varpi_{l-1} + \varpi_l.
\end{aligned} \tag{4.9}$$

If $l = 2l'$ then we have

$$\begin{aligned}
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2i-1}\varpi &= \varpi_{2i} \quad (i \in [1, l' - 1]),\\
\varpi - \sigma_1\sigma_3 \cdots \sigma_{2l'-1}\varpi &= 2\varpi_l.
\end{aligned} \tag{4.10}$$
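The identities (4.9) and (4.10) can be checked by direct computation in the coordinates $\varepsilon_1, \ldots, \varepsilon_l$. The following Python sketch is an illustration only: the helper names `fundamental_weight`, `sigma` and `spherical_candidate` are hypothetical, and weights are stored as lists of exact fractions.

```python
from fractions import Fraction


def fundamental_weight(l, i):
    """Fundamental weight of D_l with 1-based index i, in the basis eps_1..eps_l."""
    half = Fraction(1, 2)
    if i < l - 1:
        return [Fraction(1)] * i + [Fraction(0)] * (l - i)
    if i == l - 1:
        return [half] * (l - 1) + [-half]
    return [half] * l  # i == l


def sigma(i, v):
    """sigma_i: change the sign of the coordinates eps_i and eps_{i+1} (1-based i)."""
    w = list(v)
    w[i - 1] = -w[i - 1]
    w[i] = -w[i]
    return w


def spherical_candidate(l, i):
    """Return w - sigma_1 sigma_3 ... sigma_{2i-1} w for w the l-th fundamental weight."""
    top = fundamental_weight(l, l)
    v = top
    for k in range(1, 2 * i, 2):  # k = 1, 3, ..., 2i-1; these sign changes commute
        v = sigma(k, v)
    return [a - b for a, b in zip(top, v)]
```

For instance, with $l = 7$ (so $l' = 3$) the call `spherical_candidate(7, 3)` returns the coordinates of $\varpi_6 + \varpi_7$, and with $l = 6$ the call `spherical_candidate(6, 3)` returns those of $2\varpi_6$.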

By comparison with [17, Tab. 1] we see that (4.9) (resp. (4.10)) are exactly the spherical weights $\{\mu_i\}_{i=1}^{l}$ for the pair $(U, K) = (SO(2l), U(l))$. The other cases are checked in a similar manner.

To complete the proof, we have to show that the $V(\mu_i)$ ($i \in [0, l]$) are the only irreducible constituents which can occur in the tensor product decomposition of $V(\varpi)^* \otimes V(\varpi)$. This is also proved case by case. The cases corresponding to the exceptional groups can be directly verified using for instance the maple-package “qtensor” of Stembridge [43]¹. The special case $(U, K) = (SU(p + l), S(U(p) \times U(l)))$ (i.e. for which $U/K$ is a complex Grassmannian) follows easily from the Pieri formula for Schur functions [28, Ch. I, (5.17)] (see [4] for more details). The remaining cases can be checked by showing that for $\lambda \in P_+^K \setminus \{\mu_i\}_{i=0}^{l}$, we have $\lambda \not\leq \varpi - \sigma_0\varpi$, which implies that $V(\lambda)$ cannot occur as constituent of $V(\varpi)^* \otimes V(\varpi)$. $\square$

¹ http://www.math.lsa.umich.edu/~jrs/maple.html

The following main theorem of this section gives a positive answer to Conjecture 4.6 for $S = \emptyset$ and for the pairs classified in Proposition 4.7.

Theorem 4.10. The factorized ∗-subalgebra $A_S$ is equal to $\mathbb{C}_q[U/K_S]$ if
(i) $S = \emptyset$, i.e. $U/K_S = U/T$ is the full flag manifold;
(ii) $\#S^c = 1$ and the simple root $\alpha \in S^c$ is a Gel’fand node.

Proof. To prove (i) we look at the simultaneous eigenspace decomposition of $\mathbb{C}_q[U]$ with respect to the left $U_q(\mathfrak{h})$-action on $\mathbb{C}_q[U]$. The simultaneous eigenspace corresponding to the character $\varepsilon$ of $U_q(\mathfrak{h})$ is exactly $\mathbb{C}_q[U/T]$. Using Soibel’man’s factorization of $\mathbb{C}_q[U]$ (cf. Theorem 3.1) and Lemma 4.4, it is then easily checked that $\mathbb{C}_q[U/T] = A_\emptyset$. To prove (ii) we note that

$$\bigoplus_{i=0}^{l} V(\mu_i) \simeq (B_\varpi)^* B_\varpi \subset A_{\{\alpha\}^c}$$

as right $U_q(\mathfrak{g})$-modules by Proposition 4.8 and (4.7) (here we use the notation introduced in Proposition 4.8). Now $\mathbb{C}_q[U]$ is an integral domain (cf. [8, Lemma 9.1.9 (i)]), hence $v_\lambda v_\mu \in A_{\{\alpha\}^c}$ is a highest weight vector of highest weight $\lambda + \mu$ if $v_\lambda, v_\mu \in A_{\{\alpha\}^c}$ are highest weight vectors of highest weight $\lambda$ respectively $\mu$. It follows that

$$\bigoplus_{\lambda \in P_+^K} V(\lambda) \hookrightarrow A_{\{\alpha\}^c}$$

as right $U_q(\mathfrak{g})$-modules. Combining with (4.8), it follows that $A_{\{\alpha\}^c} = \mathbb{C}_q[U/K_{\{\alpha\}^c}]$, as required. $\square$

In the remainder of the paper we study the irreducible ∗-representations of the ∗-algebras $A_S$ and $\mathbb{C}_q[U/K_S]$. In the next section we first consider the restriction of the irreducible ∗-representations of $\mathbb{C}_q[U]$ to the ∗-algebras $A_S$ and $\mathbb{C}_q[U/K_S]$.

5. Restriction of Irreducible ∗-Representations to $\mathbb{C}_q[U/K]$

Let us first recall some results from Soibel’man [40] concerning the irreducible ∗-representations of $\mathbb{C}_q[U]$. Let $\{e_i\}_{i \in \mathbb{Z}_+}$ be the standard orthonormal basis of $l_2(\mathbb{Z}_+)$. Write $B(l_2(\mathbb{Z}_+))$ for the algebra of bounded linear operators on $l_2(\mathbb{Z}_+)$. Then the formulas

$$\begin{aligned}
\pi_q(t_{11}) e_j &= \sqrt{1 - q^{2j}}\; e_{j-1}, & \pi_q(t_{12}) e_j &= -q^{j+1} e_j,\\
\pi_q(t_{21}) e_j &= q^{j} e_j, & \pi_q(t_{22}) e_j &= \sqrt{1 - q^{2(j+1)}}\; e_{j+1}
\end{aligned} \tag{5.1}$$

(here $\pi_q(t_{11}) e_0 = 0$) uniquely determine an irreducible ∗-representation $\pi_q : \mathbb{C}_q[SU(2)] \to B(l_2(\mathbb{Z}_+))$. Now the dual of the injective Hopf ∗-algebra morphism $\phi_i : U_{q_i}(\mathfrak{sl}(2; \mathbb{C})) \hookrightarrow U_q(\mathfrak{g})$ corresponding to the $i$th node of the Dynkin diagram ($i \in [1, r]$) is a surjective Hopf ∗-algebra morphism $\phi_i^* : \mathbb{C}_q[U] \twoheadrightarrow \mathbb{C}_{q_i}[SU(2)]$. Hence we obtain irreducible ∗-representations $\pi_i := \pi_{q_i} \circ \phi_i^* : \mathbb{C}_q[U] \to B(l_2(\mathbb{Z}_+))$.
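That the operators in (5.1) satisfy the relations (3.19) can be confirmed numerically. The sketch below is not from the original text; it assumes that the coefficients of $\pi_q(t_{11})$ and $\pi_q(t_{22})$ in (5.1) are the square roots $\sqrt{1 - q^{2j}}$ and $\sqrt{1 - q^{2(j+1)}}$. Each generator sends a basis vector $e_j$ to a multiple of a single basis vector, and all terms of one relation shift the index by the same amount, so it suffices to compare scalar coefficients. The names `generators`, `coeff` and `max_residual` are hypothetical.

```python
import math


def generators(q):
    """Operators of (5.1): each maps j to (coefficient, new index) for the vector e_j."""
    return {
        "t11": lambda j: (math.sqrt(1 - q ** (2 * j)), j - 1),
        "t12": lambda j: (-(q ** (j + 1)), j),
        "t21": lambda j: (q ** j, j),
        "t22": lambda j: (math.sqrt(1 - q ** (2 * (j + 1))), j + 1),
    }


def coeff(gens, word, j):
    """Coefficient obtained by applying the product `word` to e_j (rightmost factor first).

    Returns 0.0 as soon as a factor annihilates the vector (t11 acting on e_0).
    """
    c = 1.0
    for name in reversed(word):
        factor, j = gens[name](j)
        c *= factor
        if c == 0.0:
            return 0.0
    return c


def max_residual(q, n_max=20):
    """Largest violation of the relations (3.19) and the *-structure on e_0..e_{n_max-1}."""
    g = generators(q)
    worst = 0.0
    for j in range(n_max):
        det = coeff(g, ["t12", "t21"], j)
        checks = [
            coeff(g, ["t11", "t12"], j) - q * coeff(g, ["t12", "t11"], j),
            coeff(g, ["t21", "t22"], j) - q * coeff(g, ["t22", "t21"], j),
            coeff(g, ["t11", "t21"], j) - q * coeff(g, ["t21", "t11"], j),
            coeff(g, ["t12", "t22"], j) - q * coeff(g, ["t22", "t12"], j),
            det - coeff(g, ["t21", "t12"], j),
            coeff(g, ["t11", "t22"], j) - coeff(g, ["t22", "t11"], j) - (q - 1 / q) * det,
            coeff(g, ["t11", "t22"], j) - q * det - 1.0,
            g["t11"](j + 1)[0] - g["t22"](j)[0],      # t11^* = t22 (matrix entries agree)
            g["t12"](j)[0] + q * g["t21"](j)[0],      # t12^* = -q t21 (both real diagonal)
        ]
        worst = max(worst, max(abs(x) for x in checks))
    return worst
```

Calling `max_residual(0.5)` returns a value at the level of floating-point rounding, which is consistent with (5.1) defining a ∗-representation of the relations (3.19).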

On the other hand, there is a family of one-dimensional ∗-representations $\tau_t$ of $\mathbb{C}_q[U]$ parametrized by the maximal torus $t \in T \simeq \mathbb{T}^r$ ($\mathbb{T} \subset \mathbb{C}$ denoting the unit circle in the complex plane). More explicitly, let $\iota_T : U_q(\mathfrak{h}) \hookrightarrow U_q(\mathfrak{g})$ be the natural Hopf ∗-algebra embedding, and set $\mathbb{C}_q[T] := \mathrm{span}\{\phi_\mu\}_{\mu \in P} \subset U_q(\mathfrak{h})^*$, where $\phi_\mu(K^\sigma) := q^{(\mu,\sigma)}$ for $\sigma \in Q$. As in (3.5) we get a Hopf ∗-algebra structure on $\mathbb{C}_q[T]$. Then $\iota_T^* : \mathbb{C}_q[U] \to \mathbb{C}_q[T]$, $\iota_T^*(\phi) := \phi \circ \iota_T$ is a surjective Hopf ∗-algebra morphism. Any irreducible ∗-representation of $\mathbb{C}_q[T]$ is one-dimensional and can be written as $\tilde{\tau}_t(\phi_\mu) := t^\mu$ for a unique $t \in T \simeq \mathbb{T}^r$. Here $t^\mu := t_1^{m_1} \cdots t_r^{m_r}$ for $\mu = \sum_{i=1}^{r} m_i\varpi_i$. So we obtain a one-dimensional ∗-representation $\tau_t := \tilde{\tau}_t \circ \iota_T^*$ of $\mathbb{C}_q[U]$, which is given explicitly on matrix elements $C^{\lambda}_{\mu,i;\nu,j}$ by the formula

$$\tau_t(C^{\lambda}_{\mu,i;\nu,j}) = \delta_{\mu,\nu}\delta_{i,j}\, t^\mu. \tag{5.2}$$

The following theorem completely describes the irreducible ∗-representations of $\mathbb{C}_q[U]$.

Theorem 5.1 (Soibel’man [40]). Let $\sigma \in W$, and fix a reduced expression $\sigma = s_{i_1} s_{i_2} \cdots s_{i_l}$. The ∗-representation

$$\pi_\sigma := \pi_{i_1} \otimes \pi_{i_2} \otimes \cdots \otimes \pi_{i_l} \tag{5.3}$$

does not depend on the choice of reduced expression (up to equivalence). The set $\{\pi_\sigma \otimes \tau_t \mid t \in T,\ \sigma \in W\}$ is a complete set of mutually inequivalent irreducible ∗-representations of $\mathbb{C}_q[U]$.

Here tensor products of ∗-representations are defined in the usual way by means of the coalgebra structure on $\mathbb{C}_q[U]$. The irreducible representation $\pi_e$ with respect to the unit element $e \in W$ is by definition the one-dimensional ∗-representation associated with the counit on $\mathbb{C}_q[U]$. In Soibel’man’s terminology, the representations $\pi_\sigma \otimes \tau_t$ are said to be associated with the Schubert cell $X_\sigma$ of $U/T$ (cf. Sect. 2).

We also mention here an important property of the kernel of $\pi_\sigma$, which we will repeatedly need later on. Let $U_q(\mathfrak{b})$ be the subalgebra of $U_q(\mathfrak{g})$ generated by the $K_i^{\pm 1}$ and the $X_i^+$ ($i \in [1, r]$). For any $\lambda \in P_+$, the ∗-representation $\pi_\sigma$ satisfies

$$\pi_\sigma(C^{\lambda}_{v;v_\lambda}) = 0 \quad (v \notin U_q(\mathfrak{b})v_{\sigma\lambda}), \qquad \pi_\sigma(C^{\lambda}_{v_{\sigma\lambda};v_\lambda}) \neq 0 \tag{5.4}$$

(cf. [40, Thm. 5.7]). Formula (5.4) combined with [1, Lemma 2.12] shows that the classical limit of the kernel of $\pi_\sigma$ formally tends to the ideal of functions vanishing on $X_\sigma$.

Fix now a subset $S \subset \Delta$. We freely use the notation introduced earlier. Our next goal is to describe how the ∗-representations $\pi_\sigma$ decompose under restriction to the subalgebra $\mathbb{C}_q[U/K_S]$. Consider the selfadjoint operators

$$L_{\sigma\lambda;\lambda} := \pi_\sigma\bigl((C^{\lambda}_{\sigma\lambda;\lambda})^* C^{\lambda}_{\sigma\lambda;\lambda}\bigr) \tag{5.5}$$

for $\lambda \in P_+(S^c)$. Let $\sigma = s_{i_1} \cdots s_{i_l}$ be a reduced expression for $\sigma$, and set $\pi_\sigma = \pi_{i_1} \otimes \pi_{i_2} \otimes \cdots \otimes \pi_{i_l}$. Then it follows from [40, Proof of Prop. 5.2] (see also [40, Proof of Prop. 5.8]) that

$$\pi_\sigma(C^{\lambda}_{\sigma\lambda;\lambda}) = c\, \pi_{q_{i_1}}(t_{21})^{(\lambda,\gamma_1^\vee)} \otimes \pi_{q_{i_2}}(t_{21})^{(\lambda,\gamma_2^\vee)} \otimes \cdots \otimes \pi_{q_{i_l}}(t_{21})^{(\lambda,\gamma_l^\vee)}, \tag{5.6}$$

where the scalar $c \in \mathbb{T}$ depends on the particular choices of bases for the irreducible representations $V(\mu)$ ($\mu \in P_+$), and with

$$\gamma_k := s_{i_l} s_{i_{l-1}} \cdots s_{i_{k+1}}(\alpha_{i_k}) \quad (1 \leq k \leq l - 1), \qquad \gamma_l := \alpha_{i_l}. \tag{5.7}$$
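The roots $\gamma_k$ of (5.7) are easy to compute in a concrete case. The following Python sketch is a toy illustration in type $A_n$ only, an assumption made here for simplicity: roots are integer vectors in the coordinates $\varepsilon_1, \ldots, \varepsilon_{n+1}$ and the simple reflection $s_i$ swaps the coordinates $i$ and $i + 1$. All function names are hypothetical. It also computes the set of positive roots that $\sigma$ maps to negative roots, so that the two sets can be compared for a reduced word.

```python
def simple_root(n, i):
    """alpha_i = eps_i - eps_{i+1} in type A_n, as a tuple of length n + 1 (1-based i)."""
    v = [0] * (n + 1)
    v[i - 1], v[i] = 1, -1
    return tuple(v)


def reflect(i, v):
    """Simple reflection s_i: swap the coordinates eps_i and eps_{i+1}."""
    w = list(v)
    w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)


def gammas(n, word):
    """gamma_k = s_{i_l} ... s_{i_{k+1}}(alpha_{i_k}) for the word [i_1, ..., i_l]."""
    result = []
    for k, i_k in enumerate(word):
        v = simple_root(n, i_k)
        for i in word[k + 1:]:  # s_{i_{k+1}} acts first, s_{i_l} acts last
            v = reflect(i, v)
        result.append(v)
    return result


def act(word, v):
    """Apply sigma = s_{i_1} ... s_{i_l} to v (the rightmost reflection acts first)."""
    for i in reversed(word):
        v = reflect(i, v)
    return v


def is_positive(v):
    """eps_a - eps_b is a positive root exactly when a < b."""
    return v.index(1) < v.index(-1)


def inversion_set(n, word):
    """Positive roots of A_n that sigma sends to negative roots."""
    roots = []
    for a in range(n + 1):
        for b in range(a + 1, n + 1):
            v = [0] * (n + 1)
            v[a], v[b] = 1, -1
            roots.append(tuple(v))
    return {v for v in roots if not is_positive(act(word, v))}
```

For the reduced word `[1, 2, 1]` in type $A_2$ one finds $\gamma_1 = \alpha_2$, $\gamma_2 = \alpha_1 + \alpha_2$, $\gamma_3 = \alpha_1$, i.e. all three positive roots, as expected for the longest Weyl group element.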

The proof of (5.6), which was given in [40] under the assumption that $\lambda \in P_{++}$, is in fact valid for all dominant weights $\lambda \in P_+$. It follows from (5.1), (5.5) and (5.6) that $l_2(\mathbb{Z}_+)^{\otimes l(\sigma)}$ decomposes as an orthogonal direct sum of eigenspaces for $L_{\sigma\lambda;\lambda}$,

$$l_2(\mathbb{Z}_+)^{\otimes l(\sigma)} = \bigoplus_{\gamma \in I(\lambda)} H_\gamma(\lambda), \tag{5.8}$$

where $I(\lambda) \subset (0, 1]$ denotes the set of eigenvalues of $L_{\sigma\lambda;\lambda}$, and $H_\gamma(\lambda)$ denotes the eigenspace of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue $\gamma \in I(\lambda)$ (we suppress the dependence on $\sigma$ if there is no confusion possible). Observe that $1 \in I(\lambda)$ and that $L_{\sigma\lambda;\lambda}$ is injective.

Recall the definition of the set $W^S$ of minimal coset representatives (cf. (2.9)). An alternative characterization of $W^S$ is given by

$$W^S = \{\sigma \in W \mid \sigma(R_S^+) \subset R^+\}, \tag{5.9}$$

where $R_S^+ := R^+ \cap \mathrm{span}\{S\}$ (cf. [1, Prop. 5.1 (iii)]). Using this alternative description of $W^S$ we obtain the following properties of $L_{\sigma\lambda;\lambda}$ for $\lambda \in P_{++}(S^c)$.

Proposition 5.2. Suppose that $\sigma \in W^S$ and $\lambda \in P_{++}(S^c)$. Then
(i) $L_{\sigma\lambda;\lambda}$ is a compact operator;
(ii) The eigenspace $H_1(\lambda)$ of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue 1 is spanned by the vector $e_0^{\otimes l(\sigma)}$.

Proof. Fix a $\lambda \in P_{++}(S^c)$, and let $\sigma = s_{i_1} s_{i_2} \cdots s_{i_l}$ be a reduced expression of a minimal coset representative $\sigma \in W^S$. It is well known that

$$R^+ \cap \sigma^{-1}(R^-) = \{\gamma_k\}_{k=1}^{l}, \tag{5.10}$$

where the $\gamma_k$ are defined by (5.7). We have $\gamma_k \in R^+ \setminus R_S^+$ by (5.9). It follows that $(\lambda, \gamma_k^\vee) > 0$ for all $k$, since $\lambda \in P_{++}(S^c)$. By (5.1) and (5.6) it follows that $H_1(\lambda) = \mathrm{span}\{e_0^{\otimes l(\sigma)}\}$ and that $H_\gamma(\lambda)$ is finite-dimensional for all $\gamma \in I(\lambda)$. Since the spectrum of $L_{\sigma\lambda;\lambda}$ (which is equal to $I(\lambda) \cup \{0\}$) does not have a limit point except 0, we conclude that $L_{\sigma\lambda;\lambda}$ is a compact operator (cf. [37, Thm. 12.30]). $\square$

Let us recall the following well known inequalities for weights of finite-dimensional irreducible representations of $\mathfrak{g}$ (or, equivalently, $U_q(\mathfrak{g})$).

Proposition 5.3. Let $\lambda \in P_+$ and $\mu, \nu \in P(\lambda)$. Then $(\lambda, \lambda) \geq (\mu, \nu)$, and equality holds if and only if $\mu = \nu \in W\lambda$.

For a proof of the proposition, see for instance [10, Prop. 11.4]. The proof is based on the following lemma, which we will also need later on. The lemma is a slightly weaker version of [10, Lemma 11.2].

Lemma 5.4. Let $\lambda \in P_+$ and $\mu \in P(\lambda) \setminus \{\lambda\}$, and let $m_i \in \mathbb{Z}_+$ ($i \in [1, r]$) be the expansion coefficients defined by $\lambda - \mu = \sum_i m_i\alpha_i$. Then there is an $i \in [1, r]$ with $m_i > 0$ and $\lambda(H_i) \neq 0$.


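We now have the following proposition, which can be regarded as a quantum analogue of the “if” part of Proposition 2.2.

Proposition 5.5. Let $\sigma \in W^S$. Then $\pi_\sigma$ restricts to an irreducible ∗-representation of the factorized ∗-algebra $A_S$. In particular, $\pi_\sigma$ restricts to an irreducible ∗-representation of $C_q[U/K_S]$.

Proof. Let $\lambda \in P_{++}(S^c)$ and $\sigma \in W^S$. Suppose $H \subset l_2(\mathbb{Z}_+)^{\otimes l(\sigma)}$ is a non-zero closed subspace invariant under $\pi_\sigma|_{A_S}$. Set $\gamma := \|L_{\sigma\lambda;\lambda}|_H\|$. Then $\gamma > 0$, since $L_{\sigma\lambda;\lambda}$ is injective and $\gamma$ is an eigenvalue of $L_{\sigma\lambda;\lambda}|_H$ by Proposition 5.2(i). Let $H_\gamma$ be the corresponding eigenspace. We claim that

$$\pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) H_\gamma = 0, \qquad \mu \neq \sigma\lambda. \tag{5.11}$$

Suppose for the moment that the claim is correct. Then (3.10) and (5.11) imply $\gamma = 1$, hence $H_\gamma = \mathrm{span}\{e_0^{\otimes l(\sigma)}\}$ by Proposition 5.2(ii). So every non-zero closed invariant subspace contains the vector $e_0^{\otimes l(\sigma)}$. Since $H^\perp$ is also a closed invariant subspace, we must have $H^\perp = \{0\}$, i.e. $H = l_2(\mathbb{Z}_+)^{\otimes l(\sigma)}$. It remains therefore to prove the claim (5.11).

By (5.4) we have $\pi_\sigma(C^\lambda_{\mu,i;\lambda}) = 0$ if $\mu < \sigma\lambda$. Hence

$$\begin{aligned}
L_{\sigma\lambda;\lambda}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\mu,i;\lambda}\big)
&= q^{(\lambda,\lambda)-(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda} C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda} C^\lambda_{\mu,i;\lambda}\big) \\
&= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* (C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda} C^\lambda_{\mu,i;\lambda}\big) \\
&= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \pi_\sigma\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\sigma\lambda;\lambda}\big)\, \pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big),
\end{aligned}$$

where we used Proposition 3.2(i) in the second equality and Proposition 3.2(ii) in the first and third equality. So (5.11) will then follow from

$$\pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) H_\gamma = 0, \qquad \mu \neq \sigma\lambda, \tag{5.12}$$

in view of the injectivity of $L_{\sigma\lambda;\lambda}$. Fix $h \in H_\gamma$ and $\mu \in P(\lambda)$ with $\mu \neq \sigma\lambda$. By Lemma 4.4 we have $(C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda} \in A_S \subset C_q[U/K_S]$, hence the vector

$$\tilde{h} := \pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\mu,i;\lambda}\big) h \tag{5.13}$$

lies in the invariant subspace $H$. Again using the commutation relations given in Proposition 3.2 and Corollary 3.3, we see that $\tilde{h}$ is an eigenvector of $L_{\sigma\lambda;\lambda}$ with eigenvalue $\tilde{\gamma} := q^{2(\lambda,\sigma^{-1}(\mu)-\lambda)}\gamma$. We have $\tilde{\gamma} > \gamma$ by Proposition 5.3. By the maximality of $\gamma$, we conclude that $\tilde{h} = 0$. This proves (5.12), hence also the claim (5.11). □

Definition 5.6. We say that the irreducible ∗-representation $\pi_\sigma$ ($\sigma \in W^S$) of $C_q[U/K_S]$ is associated with the Schubert cell $X_\sigma \subset U/K_S$.

The following proposition can be regarded as a quantum analogue of Proposition 2.3 as well as of the “only if” part of Proposition 2.2.

Proposition 5.7. Let $\sigma \in W$, and let $\sigma = uv$ be the unique decomposition of $\sigma$ with $u \in W^S$ and $v \in W_S$. For $\pi_\sigma = \pi_u \otimes \pi_v$ (cf. (2.10)) and $t \in T$, we have

$$(\pi_\sigma \otimes \tau_t)(a) = \pi_u(a) \otimes \mathrm{id}^{\otimes l(v)}, \qquad a \in C_q[U/K_S].$$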


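Proof. Recall that the one-dimensional ∗-representation $\tau_t$ factorizes through $\iota_T^*: C_q[U] \to C_q[T]$ and that $\pi_i$ factorizes through $\phi_i^*: C_q[U] \to C_{q_i}[SU(2)]$. The maps $\iota_T^*$ and $\phi_i^*$ ($i \in S$) factorize through $\iota_S^*: C_q[U] \to C_q[K_S]$ since the ranges of $\iota_T$ and $\phi_i$ ($i \in S$) lie in the Hopf-subalgebra $U_q(\mathfrak{l}_S)$. Hence $\pi_v \otimes \tau_t$ ($v \in W_S$, $t \in T$) factorizes through $\iota_S^*$, say $\pi_v \otimes \tau_t = \pi_{v,t} \circ \iota_S^*$. Then we have for $a \in C_q[U/K_S]$,

$$\begin{aligned}
(\pi_\sigma \otimes \tau_t)(a) &= (\pi_u \otimes \pi_v \otimes \tau_t) \circ \Delta(a) = (\pi_u \otimes \pi_{v,t}) \circ (\mathrm{id} \otimes \iota_S^*)\Delta(a) \\
&= \pi_u(a) \otimes \pi_{v,t}(1) = \pi_u(a) \otimes \mathrm{id}^{\otimes l(v)},
\end{aligned}$$

which completes the proof of the proposition. □

Lemma 5.8. The ∗-representations $\{\pi_\sigma\}_{\sigma \in W^S}$, considered as ∗-representations of $A_S$ respectively $C_q[U/K_S]$, are mutually inequivalent.

Proof. Let $\sigma, \sigma' \in W^S$ with $\sigma \neq \sigma'$ and $\lambda \in P_{++}(S^c)$. Then $\sigma\lambda \neq \sigma'\lambda$, since the isotropy subgroup $\{\sigma \in W \mid \sigma\lambda = \lambda\}$ is equal to $W_S$ by Chevalley’s Lemma (cf. [11, Prop. 2.72]). Without loss of generality we may assume that $\sigma\lambda \not\geq \sigma'\lambda$. Then we have $\pi_{\sigma'}\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda}\big) = 0$ by (5.4). On the other hand, $L_{\sigma\lambda;\lambda}$ is injective. It follows that $\pi_\sigma \not\simeq \pi_{\sigma'}$ as ∗-representations of $A_S$. □

Let now $\|\cdot\|_u$ be the universal $C^*$-norm on $C_q[U]$ (cf. [3, §4]), so

$$\|a\|_u := \sup_{\sigma \in W,\, t \in T} \|(\pi_\sigma \otimes \tau_t)(a)\|, \qquad a \in C_q[U]. \tag{5.14}$$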

Let $C_q(U)$ (resp. $C_q(U/K_S)$) be the completion of $C_q[U]$ (resp. $C_q[U/K_S]$) with respect to $\|\cdot\|_u$. All ∗-representations $\pi_\sigma \otimes \tau_t$ of $C_q[U]$ extend to ∗-representations of the $C^*$-algebra $C_q(U)$ by continuity. The results of this section can now be summarized as follows.

Theorem 5.9. Let $S \subset \Delta$. Then $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of mutually inequivalent irreducible ∗-representations of $C_q(U/K_S)$.

Proof. This follows from the previous results, since every irreducible ∗-representation of $C_q(U/K_S)$ appears as an irreducible component of $\sigma|_{C_q(U/K_S)}$ for some irreducible ∗-representation $\sigma$ of $C_q(U)$ (cf. [5, Prop. 2.10.2]). □

Theorem 5.9 does not imply that $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of irreducible ∗-representations of the ∗-algebra $C_q[U/K_S]$ itself. Indeed, it is not clear that any irreducible ∗-representation of $C_q[U/K_S]$ can be continuously extended to a ∗-representation of $C_q(U/K_S)$. In the remainder of this paper we will deal with the classification of the irreducible ∗-representations of $A_S$. In particular, this will yield a complete classification of the irreducible ∗-representations of $C_q[U/K_S]$ for the generalized flag manifolds $U/K_S$ for which the PBW factorization is valid (cf. Theorem 4.10).

6. Irreducible ∗-Representations of $A_S$

Let $S \subset \Delta$ be any subset. In this section we show that $\{\pi_\sigma\}_{\sigma \in W^S}$ exhausts the set of irreducible ∗-representations of $A_S$ (up to equivalence). We fix therefore an arbitrary irreducible ∗-representation $\tau: A_S \to B(H)$ and we will show that $\tau \simeq \pi_\sigma$ for a (unique) $\sigma \in W^S$. In order to associate the proper minimal coset representative $\sigma \in W^S$ with $\tau$, we need to study the range $\tau(A_S) \subset B(H)$ of $\tau$ in more detail. For $\lambda \in P_+(S^c)$ and $\mu, \nu \in P(\lambda)$, let $\tau^\lambda(\mu;\nu), \tau^\lambda(\nu) \subset B(H)$ be the linear subspaces

$$\begin{aligned}
\tau^\lambda(\mu;\nu) &:= \{\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \mid v \in V(\lambda)_\mu,\ w \in V(\lambda)_\nu\}, \\
\tau^\lambda(\nu) &:= \{\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \mid v \in V(\lambda),\ w \in V(\lambda)_\nu\}.
\end{aligned} \tag{6.1}$$

For $\lambda \in P_+(S^c)$ set

$$D(\lambda) := \{\nu \in P(\lambda) \mid \tau^\lambda(\nu) \neq \{0\}\} \tag{6.2}$$

and let $D_m(\lambda)$ be the set of weights $\nu \in D(\lambda)$ such that $\nu' \notin D(\lambda)$ for all $\nu' < \nu$. By (3.10), we have $D(\lambda) \neq \emptyset$, hence also $D_m(\lambda) \neq \emptyset$. We start with a lemma which is useful for the computation of commutation relations in $\tau(A_S) \subset B(H)$.

Lemma 6.1. Let $\lambda, \Lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Let $v \in V(\lambda)$, $v' \in V(\lambda)_{\nu'}$ with $\nu' < \nu$ and $w, w' \in V(\Lambda)$. Then the product of the four matrix elements $(C^\lambda_{v;v_\lambda})^*$, $C^\lambda_{v';v_\lambda}$, $(C^\Lambda_{w;v_\Lambda})^*$, and $C^\Lambda_{w';v_\Lambda}$, taken in an arbitrary order, is contained in $\mathrm{Ker}(\tau)$.

Proof. Since $\mathrm{Ker}(\tau)$ is a two-sided ∗-ideal in $A_S$, it follows from the definitions that

$$(C^\Lambda_{w;v_\Lambda})^* C^\Lambda_{w';v_\Lambda} (C^\lambda_{v;v_\lambda})^* C^\lambda_{v';v_\lambda} \in \mathrm{Ker}(\tau).$$

If the product of the four matrix coefficients is taken in a different order, then we can rewrite it by Proposition 3.2 and by Corollary 3.3 as a linear combination of products of matrix elements

$$(C^\Lambda_{u;v_\Lambda})^* C^\Lambda_{u';v_\Lambda} (C^\lambda_{x;v_\lambda})^* C^\lambda_{x';v_\lambda}$$

with $x' \in V(\lambda)_{\nu''}$ and $\nu'' \leq \nu' < \nu$. These are all contained in $\mathrm{Ker}(\tau)$, since $\nu \in D_m(\lambda)$. □

Lemma 6.2. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Then
(i) $\tau^\lambda(\nu;\nu) \neq \{0\}$;
(ii) $\nu = \sigma\lambda$ for some $\sigma \in W^S$.

Proof. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Fix weight vectors $v \in V(\lambda)_\mu$, $w \in V(\lambda)_\nu$ such that $T_{v;w} := \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \neq 0$. By Lemma 6.1, we compute

$$\begin{aligned}
(T_{v;w})^* T_{v;w} &= q^{(\mu,\nu)-(\lambda,\lambda)}\, \tau\big(C^\lambda_{v;v_\lambda} (C^\lambda_{v;v_\lambda} C^\lambda_{w;v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \\
&= \tau\big(C^\lambda_{v;v_\lambda} (C^\lambda_{v;v_\lambda})^*\big)\, T_{w;w},
\end{aligned} \tag{6.3}$$

where we used Proposition 3.2(ii) in the first equality and Proposition 3.2(i) in the second equality. On the other hand, $(T_{v;w})^* T_{v;w} \neq 0$ since $B(H)$ is a $C^*$-algebra, so we conclude that $T_{w;w} \neq 0$. In particular, $\tau^\lambda(\nu;\nu) \neq \{0\}$. Formula (6.3) for $v = w$ gives

$$0 \neq (T_{w;w})^* T_{w;w} = \tau\big(C^\lambda_{w;v_\lambda} (C^\lambda_{w;v_\lambda})^*\big)\, T_{w;w} = q^{(\lambda,\lambda)-(\nu,\nu)}\, T_{w;w} T_{w;w},$$

where we have used Proposition 3.2(ii) in the last equality. It follows that $(\lambda,\lambda) = (\nu,\nu)$, since $T_{w;w}$ is selfadjoint. By Proposition 5.3 we obtain $\nu = \sigma\lambda$ for some $\sigma \in W^S$. □
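The last step of the proof of Lemma 6.2, from selfadjointness of $T_{w;w}$ to $(\lambda,\lambda) = (\nu,\nu)$, can be spelled out as follows. This is only a sketch, and it assumes that $q$ is real and positive with $q \neq 1$, which is not restated in this section; the shorthand $T$ is introduced here for illustration.

```latex
% Assumption (not restated in this section): q > 0 and q \neq 1.
% Shorthand: T := T_{w;w} = \tau((C^\lambda_{w;v_\lambda})^* C^\lambda_{w;v_\lambda}).
\begin{align*}
T^* &= T \neq 0
  && \text{($T$ has the form $\tau(x^*x)$ and is non-zero)},\\
T^*T &= q^{(\lambda,\lambda)-(\nu,\nu)}\, T\,T
  && \text{((6.3) with $v = w$, then Proposition 3.2(ii))},\\
\bigl(1 - q^{(\lambda,\lambda)-(\nu,\nu)}\bigr)\, T^2 &= 0,
  \qquad \|T^2\| = \|T^*T\| = \|T\|^2 \neq 0
  && \text{($C^*$-identity in $B(H)$)},\\
q^{(\lambda,\lambda)-(\nu,\nu)} &= 1
  \;\Longrightarrow\; (\lambda,\lambda) = (\nu,\nu).
\end{align*}
```

With $\mu = \nu$, the equality case of Proposition 5.3 then gives $\nu \in W\lambda$.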


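For $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$ we set

$$L_{\nu;\lambda} := \tau\big((C^\lambda_{\nu;\lambda})^* C^\lambda_{\nu;\lambda}\big). \tag{6.4}$$

This definition makes sense since $\dim(V(\lambda)_\nu) = 1$ by Lemma 6.2(ii). Furthermore, $L_{\nu;\lambda}$ is a non-zero selfadjoint operator which commutes with the elements of $\tau(A_S)$ in the following way.

Lemma 6.3. Let $\lambda, \Lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. For $v \in V(\Lambda)_\mu$, $w \in V(\Lambda)_{\mu'}$ we have

$$L_{\nu;\lambda}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big) = q^{2(\nu,\mu'-\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big)\, L_{\nu;\lambda}.$$

Proof. By Lemma 6.1 and the commutation relations in Sect. 3 we compute

$$\begin{aligned}
L_{\nu;\lambda}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big)
&= q^{(\lambda,\Lambda)-(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda} C^\lambda_{v_\nu;v_\lambda})^* C^\lambda_{v_\nu;v_\lambda} C^\Lambda_{w;v_\Lambda}\big) \\
&= q^{2(\lambda,\Lambda)-2(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* (C^\lambda_{v_\nu;v_\lambda})^* C^\lambda_{v_\nu;v_\lambda} C^\Lambda_{w;v_\Lambda}\big) \\
&= q^{(\nu,\mu')+(\lambda,\Lambda)-2(\nu,\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* (C^\lambda_{v_\nu;v_\lambda})^* C^\Lambda_{w;v_\Lambda} C^\lambda_{v_\nu;v_\lambda}\big) \\
&= q^{2(\nu,\mu'-\mu)}\, \tau\big((C^\Lambda_{v;v_\Lambda})^* C^\Lambda_{w;v_\Lambda}\big)\, L_{\nu;\lambda},
\end{aligned}$$

where we used Proposition 3.2(ii) for the first and fourth equality, Proposition 3.2(i) for the second equality, and Corollary 3.3 for the third equality. □

It follows from Lemma 6.3 that $\mathrm{Ker}(L_{\nu;\lambda}) \subsetneq H$ is a closed invariant subspace. By the irreducibility of $\tau$, we thus obtain the following corollary.

Corollary 6.4. Let $\lambda \in P_+(S^c)$ and $\nu \in D_m(\lambda)$. Then $L_{\nu;\lambda}$ is injective.

The minimal coset representative $\sigma$ of Lemma 6.2(ii) is unique and independent of $\lambda \in P_+(S^c)$ in the following sense.

Lemma 6.5. There exists a unique $\sigma \in W^S$ such that $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$.

Proof. Let $\Lambda \in P_{++}(S^c)$ and $\nu \in D_m(\Lambda)$. Then there exists a unique $\sigma \in W^S$ such that $\nu = \sigma\Lambda$ by Lemma 6.2(ii) and by Chevalley’s Lemma (cf. [11, Prop. 2.72]). Fix furthermore arbitrary $\lambda \in P_+(S^c)$ and $\nu' \in D_m(\lambda)$. Choose a $\sigma' \in W$ such that $\nu' = \sigma'\lambda$. By Lemma 6.1 and the commutation relations of Sect. 3, we compute

$$\begin{aligned}
L_{\nu;\Lambda} L_{\nu';\lambda}
&= q^{(\Lambda,\lambda)-(\nu,\nu')}\, \tau\big((C^\lambda_{\nu';\lambda} C^\Lambda_{\nu;\Lambda})^* C^\Lambda_{\nu;\Lambda} C^\lambda_{\nu';\lambda}\big) \\
&= q^{3(\Lambda,\lambda)-3(\nu,\nu')}\, \tau\big((C^\lambda_{\nu';\lambda})^* (C^\Lambda_{\nu;\Lambda})^* C^\lambda_{\nu';\lambda} C^\Lambda_{\nu;\Lambda}\big) \\
&= q^{2(\Lambda,\lambda)-2(\nu,\nu')}\, L_{\nu';\lambda} L_{\nu;\Lambda},
\end{aligned}$$

where we used Proposition 3.2(ii) in the first and third equality and Proposition 3.2(i) twice in the second equality. If we repeat the same computation, but now using Corollary 3.3 twice in the second equality, then we obtain

$$L_{\nu;\Lambda} L_{\nu';\lambda} = q^{2(\nu,\nu')-2(\Lambda,\lambda)}\, L_{\nu';\lambda} L_{\nu;\Lambda},$$

hence

$$\big(q^{2(\Lambda,\lambda)-2(\nu,\nu')} - q^{2(\nu,\nu')-2(\Lambda,\lambda)}\big)\, L_{\nu';\lambda} L_{\nu;\Lambda} = 0.$$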


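By Corollary 6.4 we have $L_{\nu';\lambda} L_{\nu;\Lambda} \neq 0$, so we conclude that

$$(\Lambda,\lambda) - (\nu,\nu') = (\Lambda, \lambda - \sigma^{-1}\sigma'\lambda) = 0.$$

Since $\Lambda \in P_{++}(S^c)$ and $\lambda \in P_+(S^c)$, it follows from Lemma 5.4 that $\lambda = \sigma^{-1}\sigma'\lambda$, i.e. $\nu' = \sigma\lambda$. Hence, $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$. □

In the remainder of this section we write $\sigma$ for the unique minimal coset representative such that $D_m(\lambda) = \{\sigma\lambda\}$ for all $\lambda \in P_+(S^c)$. We are going to prove that $\tau \simeq \pi_\sigma$. First we look for the analogue of the distinguished vector $e_0^{\otimes l(\sigma)}$ (cf. Proposition 5.2(ii)) in the representation space $H$ of $\tau$. The spectrum $I(\lambda)$ of $L_{\sigma\lambda;\lambda}$ is contained in $[0,\infty)$, since $L_{\sigma\lambda;\lambda}$ is a positive operator. By considering the spectral decomposition of $L_{\sigma\lambda;\lambda}$, one obtains the following corollary of Lemma 6.3 and [12, Lemma 4.3].

Corollary 6.6. Let $\lambda \in P_+(S^c)$. Then $I(\lambda) \subset [0,\infty)$ is a countable set with no limit points, except possibly 0.

The proof of Corollary 6.6 is similar to the proof of [40, Prop. 3.9] and of [12, Prop. 4.2]. By Corollary 6.6 we have an orthogonal direct sum decomposition

$$H = \bigoplus_{\gamma \in I(\lambda) \cap \mathbb{R}_{>0}} H_\gamma(\lambda) \tag{6.5}$$

into eigenspaces of $L_{\sigma\lambda;\lambda}$, where $H_\gamma(\lambda)$ is the eigenspace of $L_{\sigma\lambda;\lambda}$ corresponding to the eigenvalue $\gamma$. Let $\gamma_0(\lambda) > 0$ be the largest eigenvalue of $L_{\sigma\lambda;\lambda}$.

Lemma 6.7. Let $\lambda \in P_+(S^c)$, $v \in V(\lambda)$, $w \in V(\lambda)_\nu$ and assume that $\nu \neq \sigma\lambda$. Then

$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big)\big(H_{\gamma_0(\lambda)}(\lambda)\big) = \{0\}.$$

Proof. Let $\lambda \in P_+(S^c)$, $v \in V(\lambda)_\mu$ and $w \in V(\lambda)_\nu$. By Lemma 6.1 and the commutation relations in Sect. 3, we compute

$$\begin{aligned}
L_{\sigma\lambda;\lambda}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{w;v_\lambda}\big)
&= \tau\big(C^\lambda_{v_{\sigma\lambda};v_\lambda} (C^\lambda_{v;v_\lambda} C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \\
&= q^{(\lambda,\lambda)-(\mu,\sigma\lambda)}\, \tau\big(C^\lambda_{v_{\sigma\lambda};v_\lambda} (C^\lambda_{v;v_\lambda})^* (C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big) \\
&= q^{2(\lambda,\lambda)-2(\mu,\sigma\lambda)}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)\, \tau\big((C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big),
\end{aligned}$$

where we used Proposition 3.2(i) in the second equality and Proposition 3.2(ii) in the first and third equality. This computation, together with the injectivity of $L_{\sigma\lambda;\lambda}$, shows that it suffices to give a proof of the lemma for the special case that $v = v_{\sigma\lambda}$. So we fix $h \in H_{\gamma_0(\lambda)}(\lambda)$ and $w \in V(\lambda)_\nu$ with $\nu \in P(\lambda)$ and $\nu \neq \sigma\lambda$. It follows from Lemma 6.3 that $\tilde{h} := \tau\big((C^\lambda_{v_{\sigma\lambda};v_\lambda})^* C^\lambda_{w;v_\lambda}\big)h$ is an eigenvector of $L_{\sigma\lambda;\lambda}$ with eigenvalue $\tilde{\gamma}_0(\lambda) = q^{2(\lambda,\sigma^{-1}(\nu)-\lambda)}\gamma_0(\lambda)$. By Proposition 5.3 we have $\tilde{\gamma}_0(\lambda) > \gamma_0(\lambda)$, hence $\tilde{h} = 0$ by the maximality of the eigenvalue $\gamma_0(\lambda)$. □

Corollary 6.8. $\gamma_0(\lambda) = 1$ for all $\lambda \in P_+(S^c)$.

Proof. Follows from (3.10) and Lemma 6.7. □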

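The linear subspace of $C_q[U]$ spanned by the matrix elements $\{C^\mu_{\sigma\mu;\mu}\}_{\mu \in P_+}$ is a subalgebra of $C_q[U]$ with algebraic generators $C^{\varpi_i}_{\sigma\varpi_i;\varpi_i}$ ($i \in [1,r]$), since $C^\mu_{\sigma\mu;\mu} C^\nu_{\sigma\nu;\nu} = \lambda_{\mu,\nu}\, C^{\mu+\nu}_{\sigma(\mu+\nu);\mu+\nu}$, where the scalar $\lambda_{\mu,\nu} \in T$ depends on the particular choices of orthonormal bases for the finite-dimensional irreducible representations $V(\mu)$ and $V(\nu)$ (cf. [40, Proof of Prop. 3.12]). Then it follows from Proposition 3.2 and Lemma 6.1 that

$$L_{\sigma(\mu+\nu);\mu+\nu} = L_{\sigma\mu;\mu} L_{\sigma\nu;\nu} \tag{6.6}$$

for all $\mu, \nu \in P_+(S^c)$, hence $\mathrm{span}\{L_{\sigma\lambda;\lambda}\}_{\lambda \in P_+(S^c)}$ is a commutative subalgebra of $B(H)$. Set

$$H_1 := \bigcap_{i \in S^c} H_1(\varpi_i), \tag{6.7}$$

then $H_1 \subset H_1(\lambda)$ for all $\lambda \in P_+(S^c)$ by (6.6).

Lemma 6.9. $H_1 = H_1(\lambda)$ for all $\lambda \in P_{++}(S^c)$. In particular, $H_1 \neq \{0\}$.

Proof. For $\mu \in P_+(S^c)$ we have $\|L_{\sigma\mu;\mu}\| = 1$. Moreover, for any $h \in H$,

$$h \in H_1(\mu) \quad \Leftrightarrow \quad \|L_{\sigma\mu;\mu} h\| = \|h\|. \tag{6.8}$$

This follows from the eigenspace decomposition (6.5) for $L_{\sigma\mu;\mu}$ and the fact that 1 is the largest eigenvalue of $L_{\sigma\mu;\mu}$. Let $\lambda \in P_{++}(S^c)$ and choose arbitrary $i \in S^c$. Then $\lambda = \mu + \varpi_i$ for certain $\mu \in P_+(S^c)$. By (6.6), we obtain for $h \in H_1(\lambda)$,

$$\|h\| = \|L_{\sigma\lambda;\lambda} h\| = \|L_{\sigma\mu;\mu} L_{\sigma\varpi_i;\varpi_i} h\| \leq \|L_{\sigma\varpi_i;\varpi_i} h\| \leq \|h\|,$$

hence we have equality everywhere. By (6.8), it follows that $h \in H_1(\varpi_i)$. Since $i \in S^c$ was arbitrary, we conclude that $h \in H_1$. □

Lemma 6.10. Let $\lambda \in P_+(S^c)$. For all $v \in V(\lambda)_\mu$ with $\mu \neq \sigma\lambda$ we have

$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1) \subset H_1^\perp.$$

Proof. Let $\Lambda \in P_{++}(S^c)$, $\lambda \in P_+(S^c)$, and $v \in V(\lambda)_\mu$ with $\mu \neq \sigma\lambda$ and $\mu \in P(\lambda)$. Then

$$L_{\sigma\Lambda;\Lambda}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big) = q^{2(\Lambda,\lambda-\sigma^{-1}(\mu))}\, \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)\, L_{\sigma\Lambda;\Lambda}$$

by Lemma 6.3. By Lemma 5.4 we have $(\Lambda, \lambda - \sigma^{-1}(\mu)) > 0$. Hence,

$$\tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1) = \tau\big((C^\lambda_{v;v_\lambda})^* C^\lambda_{v_{\sigma\lambda};v_\lambda}\big)(H_1(\Lambda)) \subset \bigoplus_{\gamma < 1} H_\gamma(\Lambda) = H_1(\Lambda)^\perp = H_1^\perp, \tag{6.9}$$

which completes the proof of the lemma. □

Corollary 6.11. $\dim(H_1) = 1$.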


Proof. By Lemma 6.7 and Lemma 6.10 we obtain for any $0 \neq h \in H_1$,

$$\overline{\tau(A_S)h} \subset \mathrm{span}\{h\} \oplus H_1^\perp,$$

where the overbar means closure. By the irreducibility of $\tau$, we conclude that $\mathrm{span}\{h\} = H_1$. □

Any vector $h \in H_1$ with $\|h\| = 1$ can serve now as the analogue in the representation space $H$ of the distinguished vector $e_0^{\otimes l(\sigma)}$ in the representation space of $\pi_\sigma$. By comparing the Gel’fand–Naimark–Segal states of $\tau$ and $\pi_\sigma$ taken with respect to the cyclic vector $h \in H_1$ ($\|h\| = 1$) resp. $e_0^{\otimes l(\sigma)}$, we obtain the following lemma.

Lemma 6.12. We have $\tau \simeq \pi_\sigma$ as irreducible ∗-representations of $A_S$.

Proof. Fix an $h \in H_1$ with $\|h\| = 1$, and define the Gel’fand–Naimark–Segal states $\phi_\tau, \phi_{\pi_\sigma}: A_S \to \mathbb{C}$ by

$$\phi_\tau(a) := (\tau(a)h, h), \qquad \phi_{\pi_\sigma}(a) := \big(\pi_\sigma(a)\, e_0^{\otimes l(\sigma)},\, e_0^{\otimes l(\sigma)}\big). \tag{6.10}$$

Then we have for $\phi = \phi_\tau$ (resp. $\phi = \phi_{\pi_\sigma}$),

$$\phi\big((C^\lambda_{\mu,i;\lambda})^* C^\lambda_{\nu,j;\lambda}\big) = \delta_{\mu,\sigma\lambda}\, \delta_{\nu,\sigma\lambda} \tag{6.11}$$

for $\lambda \in P_+(S^c)$, $\mu, \nu \in P(\lambda)$, $i \in [1, \dim(V(\lambda)_\mu)]$, and $j \in [1, \dim(V(\lambda)_\nu)]$. Indeed, (6.11) for $\phi = \phi_\tau$ follows from Lemma 6.7 and Lemma 6.10. For $\phi = \phi_{\pi_\sigma}$, recall that $\pi_\sigma$ is an irreducible ∗-representation of $A_S$ (Proposition 5.5). We have seen in the previous section that $L_{\sigma\lambda;\lambda} = \pi_\sigma\big((C^\lambda_{\sigma\lambda;\lambda})^* C^\lambda_{\sigma\lambda;\lambda}\big)$ is injective for all $\lambda \in P_+(S^c)$, hence $\sigma\lambda \in D(\lambda)$ (cf. (6.2)) for all $\lambda \in P_+(S^c)$. By (5.4), we actually have $\sigma\lambda \in D_m(\lambda)$ for all $\lambda \in P_+(S^c)$. Hence the labeling $\sigma \in W^S$ of $\pi_\sigma$ coincides with its (unique) minimal coset representative defined in Lemma 6.5. Furthermore, the one-dimensional subspace $H_1$ for $\pi_\sigma$ is equal to $\mathrm{span}\{e_0^{\otimes l(\sigma)}\}$ (cf. Proposition 5.2(ii), Corollary 6.11). So (6.11) for $\phi = \phi_{\pi_\sigma}$ follows again from Lemma 6.7 and Lemma 6.10. By linearity it follows from (6.11) that $\phi_\tau = \phi_{\pi_\sigma}$, hence $\tau$ and $\pi_\sigma$ are unitarily equivalent ∗-representations (cf. [5, Prop. 2.4.1]). □

We may summarize the results of this section as follows.

Theorem 6.13. For all $S \subset \Delta$, $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of mutually inequivalent, irreducible ∗-representations of the factorized ∗-subalgebra $A_S$.

Combining Proposition 5.7, Theorem 4.10 and Theorem 6.13 we obtain the following theorem.

Theorem 6.14. $\{\pi_\sigma\}_{\sigma \in W^S}$ is a complete set of mutually inequivalent, irreducible ∗-representations of $C_q[U/K_S]$ in the following cases:
(i) $S = \emptyset$, i.e. $U/K_S = U/T$ is the full flag manifold;
(ii) $\#S^c = 1$ and the simple root $\alpha \in S^c$ is a Gel’fand node.
For these cases the restriction to $C_q[U/K_S]$ of the universal $C^*$-norm on $C_q[U]$ coincides with the universal $C^*$-norm on $C_q[U/K_S]$.


Acknowledgements. The research for this work was started while the first author was a guest at the University of Kobe during a period of ten weeks in the Spring of 1997, for which he received financial support by NWO/Nissan. He greatly acknowledges the hospitality of the Department of Mathematics at the University of Kobe. The authors would like to thank Erik T. Koelink for helpful discussions about the topic of this paper. The authors thank Prof. P. Littelmann for pointing out the connection of the algebra Ass S with doubled G-varieties (cf. Remark 4.3).

Note added in proof

The first author recently proved Conjecture 4.6 using an analogue of the Stone–Weierstrass theorem for type I $C^*$-algebras. This result implies that Theorem 6.14 is valid for an arbitrary flag manifold $U/K_S$ ($S \subset \Delta$).

References

1. Bernstein, I.N., Gel’fand, I.M., Gel’fand, S.I.: Schubert cells and cohomology of the spaces G/P. Uspekhi Mat. Nauk 28, no. 3, 3–26 (1973); English translation in: Russ. Math. Surveys 28, no. 3, 1–26 (1973)
2. Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994
3. Dijkhuizen, M.S., Koornwinder, T.H.: CQG algebras: A direct approach to compact quantum groups. Lett. Math. Phys. 32, 315–330 (1994)
4. Dijkhuizen, M.S., Stokman, J.V.: Some limit transitions between BC type orthogonal polynomials interpreted on quantum complex Grassmannians. Report 98-17, Univ. of Amsterdam (1998). To appear in Publ. Res. Inst. Math. Sci.
5. Dixmier, J.: C∗-Algebras. North-Holland Math. Lib. vol. 15, Amsterdam: North-Holland, 1977
6. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. Boston: Academic Press, 1978
7. Joseph, A.: On the prime and primitive spectra of the algebra of functions on a quantum group. J. Algebra 169, 441–511 (1994)
8. Joseph, A.: Quantum Groups and Their Primitive Ideals. Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, Bd. 29. Berlin: Springer-Verlag, 1995
9. Jurčo, B., Šťovíček, P.: Coherent states for quantum compact groups. Commun. Math. Phys. 182, 221–251 (1996)
10. Kac, V.G.: Infinite-dimensional Lie Algebras (3rd ed.). Cambridge: Cambridge University Press, 1990
11. Knapp, A.W.: Lie Groups Beyond an Introduction. Progress in Math. 140. Boston: Birkhäuser Verlag, 1996
12. Koelink, H.T.: On ∗-representations of the Hopf ∗-algebra associated with the quantum group $U_q(n)$. Compositio Math. 77, 199–231 (1991)
13. Koornwinder, T.H.: Orthogonal polynomials in connection with quantum groups. In: Orthogonal polynomials: Theory and Practice. NATO ASI series C, vol. 294. Dordrecht: Kluwer, 1990, pp. 257–292
14. Koornwinder, T.H.: Dynkin diagrams of rank n with a subdiagram of rank n − 1. Informal note (1994)
15. Korogodsky, L.I.: Complimentary series representations and quantum orbit method. Preprint (1997)
16. Kostant, B.: A formula for the multiplicity of a weight. Trans. Am. Math. Soc. 93, 53–73 (1959)
17. Krämer, M.: Sphärische Untergruppen in kompakten zusammenhängenden Liegruppen. Compositio Math. 38, 129–153 (1979)
18. Kumar, S.: Proof of the Parthasarathy–Ranga Rao–Varadarajan conjecture. Invent. Math. 93, 117–130 (1988)
19. Lakshmibai, V., Reshetikhin, N.Yu.: Quantum deformations of flag and Schubert schemes. C.R. Acad. Sci. Paris Sér. I 313, 121–126 (1991)
20. Lakshmibai, V., Reshetikhin, N.Yu.: Quantum flag and Schubert schemes. In: M. Gerstenhaber, J. Stasheff (eds.) Deformation Theory and Quantum Groups with Applications to Mathematical Physics. Contemporary Math. 134, Providence, RI: Am. Math. Soc., 1991, pp. 145–181
21. Levendorskii, S., Soibel’man, Ya.S.: Algebras of functions on compact quantum groups, Schubert cells and quantum tori. Commun. Math. Phys. 139, 141–170 (1991)
22. Littelmann, P.: A Littlewood-Richardson rule for symmetrizable Kac–Moody algebras. Invent. Math. 116, 329–346 (1994)
23. Littelmann, P.: On spherical double cones. J. Algebra 166, 142–157 (1994)
24. Lu, J.-H.: Coordinates on Schubert cells, Kostant’s harmonic forms, and the Bruhat–Poisson structure on G/B. Preprint (1997)
25. Lu, J.-H., Weinstein, A.: Poisson–Lie groups, dressing transformations, and Bruhat decompositions. J. Diff. Geom. 31, 501–526 (1990)
26. Lu, J.-H., Weinstein, A.: Classification of SU(2)-covariant Poisson structures on $S^2$. Appendix to: Sheu, A.J.-L.: Quantization of the Poisson SU(2) and its Poisson homogeneous space - the 2-sphere. Commun. Math. Phys. 135, 229–231 (1991)
27. Majid, S.: Matched pairs of Lie groups associated to solutions of the Yang–Baxter equations. Pacific J. Math. 141, no. 2, 311–332 (1990)
28. Macdonald, I.G.: Symmetric Functions and Hall Polynomials (2nd ed.). Oxford: Oxford University Press, 1995
29. Mathieu, O.: Construction d’un groupe de Kac–Moody et applications. Compositio Math. 69, 37–60 (1989)
30. Noumi, M.: Macdonald’s symmetric polynomials as zonal spherical functions on some quantum homogeneous spaces. Adv. Math. 123, no. 1, 16–77 (1996)
31. Noumi, M., Yamada, H., Mimachi, K.: Finite dimensional representations of the quantum group $GL_q(n; \mathbb{C})$ and the zonal spherical functions on $U_q(n-1) \backslash U_q(n)$. Japanese J. Math. 19 (1), 31–80 (1993)
32. Panyushev, D.I.: Complexity and rank of double cones and tensor product decompositions. Comment. Math. Helv. 68, 455–468 (1993)
33. Panyushev, D.I.: Reductive group actions on affine varieties and their doubling. Ann. Inst. Fourier (Grenoble) 45, 929–950 (1995)
34. Parthasarathy, K.R., Ranga Rao, R., Varadarajan, V.S.: Representations of complex semisimple Lie groups and Lie algebras. Ann. Math. 85, 383–429 (1967)
35. Podkolzin, G.B., Vainerman, L.I.: Quantum Stiefel manifold and double cosets of quantum unitary group. Pacific J. Math. 188, no. 1, 179–199 (1999)
36. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
37. Rudin, W.: Functional analysis. New York: Tata McGraw-Hill, 1973
38. Semenov-Tian-Shansky, M.A.: Dressing transformations and Poisson–Lie group actions. Publ. RIMS 21, 1237–1260 (1985)
39. Soibel’man, Ya.S.: Irreducible representations of the function algebra on the quantum group SU(n), and Schubert cells. Soviet Math. Dokl. 40, no. 1, 34–38 (1990)
40. Soibel’man, Ya.S.: The algebra of functions on a compact quantum group, and its representations. Leningrad Math. J. 2, no. 1, 161–178 (1991)
41. Soibel’man, Ya.S.: On quantum flag manifolds. Funct. Anal. Appl. 26, 225–227 (1992)
42. Steinberg, R.: Lectures on Chevalley groups. Lecture Notes. New Haven, CT: Yale Univ., 1968
43. Stembridge, J.R.: A maple package for root systems and finite Coxeter groups. Manuscript (1997)
44. Sugiura, M.: Representations of compact groups realized by spherical functions on symmetric spaces. Proc. Japan Acad. 38, 111–113 (1962)
45. Vaksman, L.L., Soibel’man, Ya.S.: Algebra of functions on the quantum group SU(N + 1) and odd-dimensional quantum spheres. Leningrad Math. J. 2, no. 5, 1023–1042 (1991)
46. Warner, G.: Harmonic Analysis on Semi-Simple Lie Groups. Vol. I. New York: Springer-Verlag, 1972

Communicated by T. Miwa

Commun. Math. Phys. 203, 325 – 340 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Exact Absorption Probabilities for the D3-Brane

Steven S. Gubser¹, Akikazu Hashimoto²

¹ Joseph Henry Laboratories, Princeton University, Princeton, NJ 08544, USA. E-mail: [email protected]
² Institute for Theoretical Physics, University of California, Santa Barbara, CA 93106, USA

Received: 29 June 1998 / Accepted: 16 November 1998

Abstract: We consider a minimal scalar in the presence of a three-brane in ten dimensions. The linearized equation of motion, which is just the wave equation in the three-brane metric, can be solved in terms of associated Mathieu functions. An exact expression for the reflection and absorption probabilities can be obtained in terms of the characteristic exponent of Mathieu’s equation. We describe an algorithm for obtaining the low-energy behavior as a series expansion, and discuss the implications for the world-volume theory of D3-branes.

1. Introduction

One of the intriguing aspects of Ramond–Ramond solitons in string theory is the existence of two alternative descriptions, one in terms of supergravity [1] and the other in terms of Dirichlet branes (D-branes) [2]. The description in terms of D-branes is essentially perturbative in nature: each boundary picks up a factor of $gN$, which is the square of the open string coupling times a Chan–Paton factor. As realized in [3], the low-energy dynamics of $N$ coincident D-branes is dictated by maximally supersymmetric gauge theory with gauge group $U(N)$, and $gN$ is recognized as the ’t Hooft parameter. That the gauge theory and supergravity descriptions should be related was implicit in much early work on absorption and Hawking emission (see for example [4, 5]). A precise formulation of the duality between the two descriptions was conjectured recently in [6] by taking the so-called “decoupling limit”. The simplest example comes from considering D3-branes in the type IIB theory. In the decoupling limit one obtains a duality between $\mathcal{N} = 4$ supersymmetric Yang-Mills theory in four dimensions and string theory on the near horizon $AdS_5 \times S^5$ background [6]. The $AdS_5$ and the $S^5$ have the same radius of curvature $R$, where $R^4 = 4\pi g N \alpha'^2$. It is difficult to find non-trivial checks of the duality because it relates two things that are rather poorly understood away from certain limits. On the AdS side, it is widely felt that the supergravity description should be capable of being elevated to a full closed string theory description, similar to non-linear sigma models; but it is not understood how to include Ramond–Ramond backgrounds in a non-linear sigma model.¹ We must for the present content ourselves with the supergravity limit. The validity of this limit relies on having a large number $N$ of coincident branes, with a small closed string coupling $g$, but large $gN$. Large $gN$ is exactly where the gauge theory is difficult to deal with: after ’t Hooft scaling, the Feynman rules associate a factor $gN$ with each vertex, so for generic amplitudes one must consider large graphs.

How then can we study the relation between the two dual descriptions concretely? Aside from the calculation of entropy [9], one of the simplest quantities that can be computed on both sides of the correspondence is the absorption cross-section of scalar fields incident on the branes. Suppression of stringy corrections on the supergravity side relies on having $\omega\sqrt{\alpha'} \ll 1$ and $\sqrt{\alpha'}/R \ll 1$; but $\omega R$ can be arbitrary, suggesting the existence of a double scaling limit [5, 10]. Indeed, the wave equation for the fields propagating in the supergravity background of branes depends on $gN$ only in the combination $\omega R$.

Remarkably, the leading order behavior in small $\omega R$ of the semi-classical cross-section is reproduced by a tree level gauge theory calculation (leading order in $gN$) [5, 10]. The relevant gauge theory amplitude apparently suffers no radiative corrections. An argument for why this is so was advanced in [11] for graviton absorption, and other examples have emerged in [12, 13]. A natural question which arises at this point is whether this pattern persists to higher order in $\omega R$ [14]. In order to address this question, one must examine higher order corrections in both D-brane and supergravity computations. On the supergravity side, a first step in this direction was taken in [15] where terms subleading by order $(\omega R)^4$ were examined.
The coefficient of the $(\omega R)^4$ correction turns out to have a piece which depends logarithmically on $\omega R$:

$$\sigma = \frac{\kappa^2 N^2 \omega^3}{32\pi} \Big[ 1 + c_1' (\omega R)^4 \log \omega R + c_1 (\omega R)^4 + O\big((\omega R)^8\big) \Big] \tag{1}$$

and the numerical value of $c_1'$ was found to be $-1/6$. The goal of this paper is to describe an algorithm for computing the absorption cross-section as a power series expansion in $\omega R$ to all orders. The absorption cross-section is determined by comparing the flux of incident partial waves at the asymptotic region and the near horizon region. We are therefore interested in finding the solution to the wave equation of scalar fields in the background of the D3-brane metric. It turns out that the wave equation in question is equivalent to Mathieu’s modified differential equation²

$$\left[ \frac{\partial^2}{\partial z^2} + 2q \cosh 2z - a \right] \psi(z) = 0 \tag{2}$$

under appropriate change of variables and field redefinitions. The exact solution of Mathieu’s modified differential equation is known in the form of a power series expansion with respect to $q$. From this, we can read off the absorption cross-section. For reviews of Mathieu functions see [16–20]. In view of the relative obscurity of these functions, most of the relevant details will be included in our exposition.

¹ See however [7, 8] for interesting recent work on including Ramond–Ramond fields in a world-sheet formulation.
² The usual form of Mathieu’s equation is obtained from (2) via the replacement $z \to iz$.
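The replacement $z \to iz$ mentioned in footnote 2 can be made explicit. The short computation below is only a sketch: the auxiliary function $w$ and the real variable $x$ are names introduced here for illustration, and "usual form" is taken to mean the standard convention $w'' + (a - 2q\cos 2x)\,w = 0$.

```latex
% Illustration of footnote 2; w and x are auxiliary names, not from the paper.
\begin{align*}
w(x) &:= \psi(ix), \qquad w''(x) = -\,\psi''(ix), \qquad \cosh(2ix) = \cos 2x,\\
0 &= \psi''(ix) + \bigl(2q\cosh(2ix) - a\bigr)\,\psi(ix)
   = -\,w''(x) + \bigl(2q\cos 2x - a\bigr)\,w(x),\\
&\Longrightarrow\quad w''(x) + \bigl(a - 2q\cos 2x\bigr)\,w(x) = 0 .
\end{align*}
```

So the modified equation (2) and the ordinary Mathieu equation share the same parameters $a$ and $q$, and differ only by the rotation of the independent variable to the imaginary axis.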


First, let us see how Mathieu’s modified equation arises from the wave equation of scalar fields. The supergravity background for the D3-brane has the simple form [1]

$$ds^2 = H^{-1/2}\big(-dt^2 + dx_\parallel^2\big) + H^{1/2}\, dx_\perp^2$$

as well as some RR 4-form background, where

$$H = 1 + \frac{R^4}{r^4}, \qquad R^4 = 4\pi g N \alpha'^2 = \frac{N\kappa}{2\pi^{5/2}}.$$

For scalar fields decoupled from the RR 4-form (the example we will always have in mind is the dilaton), the equation of motion is simply

$$\frac{1}{\sqrt{g}}\, \partial_\mu \sqrt{g}\, g^{\mu\nu} \partial_\nu \phi = 0.$$

The radial wave equation for the $l$-th partial wave of energy $\omega$ which follows from this equation is

$$\left[ \frac{\partial^2}{\partial r^2} + \frac{5}{r} \frac{\partial}{\partial r} - \frac{l(l+4)}{r^2} + \omega^2 \Big( 1 + \frac{R^4}{r^4} \Big) \right] \phi^{(l)}(r) = 0. \tag{3}$$

In order to relate (3) to Mathieu’s equation alluded to earlier, one performs the following change of variables:

$$r = R e^{-z}, \qquad \phi(r) = e^{2z} \psi(z).$$

In terms of these new variables, Eq. (3) reads

$$\left[ \frac{\partial^2}{\partial z^2} + 2(\omega R)^2 \cosh 2z - (l+2)^2 \right] \psi(z) = 0, \tag{4}$$

which is precisely of the form (2) for $q = (\omega R)^2$ and $a = (l+2)^2$. Note that we have reduced the problem of particle absorption by three-branes to the computation of the tunneling S-matrix for a one-dimensional Schrödinger equation.³

The rest of the paper is organized as follows. In Sect. 2 we present the method for obtaining the absorption probability from Mathieu’s equation. This method will be of primary interest to the mathematically oriented reader, but those concerned with the string theory implications may wish to skip directly to the final answer, (34). Section 3 is concerned with the world-volume interpretation of this probability. Section 4 concludes with a brief discussion. The appendix includes some formulas judged too cumbersome to include in the main text.

³ As an aside we note that the equations of motion for supergravity fields other than minimal scalars generically do not lead to the Mathieu equation. For example, the fixed scalar considered in [15] experiences a “transmutation of angular momentum”, in the sense that the low-energy radial function at infinity and near the horizon are Bessel functions of different orders. To put it differently, the potential function in the Schrödinger operator is asymmetric.
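The reduction of the radial equation (3) to the Mathieu form (4) can be checked in a few lines. The following is a sketch of that check; the shorthand $D = d/dz$ is introduced here for convenience and is not notation from the paper.

```latex
% Shorthand (assumption for this sketch): D = d/dz. Since r = R e^{-z}, r\,\partial_r = -D.
\begin{align*}
r^2\partial_r^2 + 5r\,\partial_r
  &= \bigl[(r\partial_r)^2 - r\partial_r\bigr] + 5r\,\partial_r
   = (D^2 + D) - 5D = D^2 - 4D,\\
(D^2 - 4D)\,\bigl(e^{2z}\psi\bigr)
  &= e^{2z}\bigl(\psi'' + 4\psi' + 4\psi\bigr) - 4e^{2z}\bigl(\psi' + 2\psi\bigr)
   = e^{2z}\bigl(\psi'' - 4\psi\bigr),\\
\omega^2 r^2\Bigl(1 + \frac{R^4}{r^4}\Bigr)
  &= (\omega R)^2\bigl(e^{-2z} + e^{2z}\bigr) = 2(\omega R)^2\cosh 2z,\\
r^2 \times \bigl[\text{l.h.s. of (3)}\bigr]
  &= e^{2z}\Bigl[\psi'' + 2(\omega R)^2\cosh 2z\;\psi - \bigl(l(l+4) + 4\bigr)\psi\Bigr].
\end{align*}
```

Since $l(l+4) + 4 = (l+2)^2$, the bracket in the last line is exactly the operator in (4), which fixes $q = (\omega R)^2$ and $a = (l+2)^2$.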


S. S. Gubser, A. Hashimoto

2. Cross-Sections from Mathieu Functions

Mathieu functions arise in the study of a variety of physical problems: for example, the solution of the flat-space Laplace equation in elliptical coordinates; Bloch waves for the potential cos 2x; the Faraday instability; classical motion of a driven pendulum; the sine-Gordon model [21]; and, in the present context, as tunneling wavefunctions in the potential − cosh 2z. Our analysis is an extension of [22], and our conventions will be a hybrid of those of [16] and [22]. The so-called Floquet solutions of (2) can be expressed in the form

\[ J(\nu, z) = \sum_{n=-\infty}^{\infty} \phi\!\left(n + \tfrac{1}{2}\nu\right) e^{(2n+\nu)z}. \tag{5} \]

These solutions are analogous to Bloch waves because of the property

\[ J(\nu, z + i\pi) = e^{i\pi\nu} J(\nu, z). \tag{6} \]

The quantity $\nu$ is termed the Floquet exponent and is determined in terms of $a$ and $q$. Clearly, $J(\nu, -z)$ is also a solution of (2). Since $J(\nu, -z)$ acquires a phase $e^{-i\pi\nu}$ under $z \to z + i\pi$, $J(\nu, -z)$ is also a Floquet solution with exponent $-\nu$. It follows that there is a proportionality relation

\[ J(-\nu, z) \propto J(\nu, -z), \tag{7} \]

which will become useful in the later discussions. It is straightforward to see that (5) solves (2) if

\[ \phi(z+1) + \phi(z-1) = \frac{z^2 - r^2}{\lambda^2}\,\phi(z), \tag{8} \]

where we have defined $r = \tfrac{1}{2}\sqrt{a}$ and $\lambda = \tfrac{1}{2}\sqrt{q}$. A meromorphic function $\phi$ was found in [22] which satisfies the recursion relation (8) and in addition has the property $\phi \to 0$ as

\[ \phi(z) = \ldots \sum_{n=0} \ldots, \tag{9} \]

\[ A_z^{(0)} = 1, \qquad A_z^{(q)} = \sum_{p_1=0}^{\infty} \sum_{p_2=2}^{\infty} \cdots \sum_{p_q=2}^{\infty} a_{z+p_1}\, a_{z+p_1+p_2} \cdots a_{z+p_1+\ldots+p_q}, \]

\[ a_z = \frac{1}{(z+r+1)(z+r+2)(z-r+1)(z-r+2)}. \]

The value of $\nu = 2\mu$ is determined by relation (7), which implies

\[ \frac{\phi(\mu)}{\phi(\mu-1)} \times \frac{\phi(-\mu+1)}{\phi(-\mu)} = 1. \tag{10} \]

Exact Absorption Probabilities for D3-Brane


The recursion relation (8) can be written in the form

\[ V(z) = \frac{\phi(z+1)}{\phi(z)} + \frac{\phi(z-1)}{\phi(z)} = G_{z+1} + \frac{1}{G_z}, \]

where we have defined $V(z) = (z^2 - r^2)/\lambda^2$ and $G_z = \phi(z)/\phi(z-1)$. Then, we can express the first factor of (10) as a continued fraction:

\[ \frac{\phi(\mu)}{\phi(\mu-1)} = G_\mu = \frac{1}{V(\mu) - G_{\mu+1}} = \cfrac{1}{V(\mu) - \cfrac{1}{V(\mu+1) - \cdots}}. \]

Similarly, the recursion relation (8) can be written in yet another form

\[ V(z) = \frac{\phi(-z+1)}{\phi(-z)} + \frac{\phi(-z-1)}{\phi(-z)} = H_{z-1} + \frac{1}{H_z}, \]

where this time we have defined $H_z = \phi(-z)/\phi(-z-1)$. Now we can also express the second factor of (10) as a continued fraction:

\[ \frac{\phi(-\mu+1)}{\phi(-\mu)} = H_{\mu-1} = \frac{1}{V(\mu-1) - H_{\mu-2}} = \cfrac{1}{V(\mu-1) - \cfrac{1}{V(\mu-2) - \cdots}}. \]

It is now straightforward to solve for $\mu$ order by order in $\lambda$. We simply substitute the ansatz $\nu = \nu_0 + \nu_1\lambda^4 + \nu_2\lambda^8 + \cdots$ into (10) expressed in terms of the continued fractions. If we are only interested in the value of $\nu$ to some finite order in $\lambda$, we can truncate the continued fraction by finite iteration. In Eq. (46) of the appendix we give the first few terms of the series for the partial waves l = 0, l = 1, and l = 2.

There is a remarkable resummation of the Bloch wave expansion (5) in terms of Bessel functions:4

\[ J(\nu, z) = \sum_{n=-\infty}^{\infty} \frac{\phi\!\left(n + \tfrac{1}{2}\nu\right)}{\phi(\nu/2)}\, J_n(\sqrt{q}\,e^{-z})\, J_{n+\nu}(\sqrt{q}\,e^{z}). \tag{11} \]

The expansion (11) is uniformly convergent everywhere and is convenient for extracting the asymptotic behavior for large |z| [22, 19]. For $\nu \notin \mathbb{Z}$, the Floquet solutions $J(\pm\nu, z)$ are independent. It is useful, however, to consider other linear combinations, $N(\nu, z)$, $H^{(1)}(\nu, z)$, and $H^{(2)}(\nu, z)$, in analogy with Bessel functions:

\[ N(\nu, z) = \frac{J(\nu, z)\cos\pi\nu - J(-\nu, z)}{\sin\pi\nu}, \]
\[ H^{(1)}(\nu, z) = J(\nu, z) + iN(\nu, z) = \frac{J(-\nu, z) - e^{-i\pi\nu} J(\nu, z)}{i\sin\pi\nu}, \tag{12} \]
\[ H^{(2)}(\nu, z) = J(\nu, z) - iN(\nu, z) = \frac{J(-\nu, z) - e^{i\pi\nu} J(\nu, z)}{-i\sin\pi\nu}. \]

4 We use a notational convention where $J_\nu(z)$ with subscript $\nu$ denotes Bessel functions whereas $J(\nu, z)$ with argument $\nu$ denotes solutions to Mathieu’s equation (2).
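As a numerical cross-check on the continued-fraction form of (10), the sketch below truncates both continued fractions at finite depth and solves for $\mu$ by bisection. Using $G_\mu = 1/(V(\mu) - G_{\mu+1})$, condition (10) is rewritten as $V(\mu) - G_{\mu+1} - H_{\mu-1} = 0$. The function names, the truncation depth, the bracket and the sample values r = 0.7, λ = 0.2 are illustrative assumptions and do not come from the text. The non-physical value of r is chosen deliberately so that ν stays away from an integer; for the physical values r = 1, 3/2, 2 the mismatch function has poles close to the root and a naive bracket is unreliable. The comparison value is the standard leading small-q estimate $\nu^2 \approx a - q^2/(2(a-1))$ for the Mathieu characteristic exponent, with $a = 4r^2$ and $q = 4\lambda^2$.

```python
import math

def V(z, r, lam):
    """V(z) = (z^2 - r^2) / lambda^2, as defined below Eq. (10)."""
    return (z * z - r * r) / (lam * lam)

def cf_up(mu, r, lam, depth=40):
    """Truncated G_{mu+1} = 1/(V(mu+1) - 1/(V(mu+2) - ...))."""
    g = 0.0
    for n in range(depth, 0, -1):
        g = 1.0 / (V(mu + n, r, lam) - g)
    return g

def cf_down(mu, r, lam, depth=40):
    """Truncated H_{mu-1} = 1/(V(mu-1) - 1/(V(mu-2) - ...))."""
    h = 0.0
    for n in range(depth, 0, -1):
        h = 1.0 / (V(mu - n, r, lam) - h)
    return h

def mismatch(mu, r, lam):
    """Zero exactly when G_mu * H_{mu-1} = 1, i.e. when (10) holds."""
    return V(mu, r, lam) - cf_up(mu, r, lam) - cf_down(mu, r, lam)

def floquet_exponent(r, lam, lo, hi, tol=1e-12):
    """Bisection for mu in [lo, hi]; returns nu = 2*mu."""
    f_lo, f_hi = mismatch(lo, r, lam), mismatch(hi, r, lam)
    if f_lo * f_hi > 0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid, r, lam)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return lo + hi

# Illustrative (non-physical) parameters, chosen to avoid integer nu.
r, lam = 0.7, 0.2
nu = floquet_exponent(r, lam, 0.65, 0.75)
a, q = 4 * r * r, 4 * lam * lam
nu_leading = math.sqrt(a - q * q / (2 * (a - 1)))
```

Increasing `depth` should leave `nu` unchanged to the quoted tolerance, which is the numerical counterpart of truncating the continued fraction by finite iteration.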


Some useful relations among the various solutions are

\[ J(\nu, z) = \frac{H^{(1)}(\nu, z) + H^{(2)}(\nu, z)}{2}, \qquad J(-\nu, z) = \frac{e^{i\pi\nu} H^{(1)}(\nu, z) + e^{-i\pi\nu} H^{(2)}(\nu, z)}{2}. \tag{13} \]

Using (10) and the standard relation $J_{-n} = (-1)^n J_n$, it is straightforward to show that solutions (12) also admit expansions in terms of Bessel functions, generalizing (11):

\[ Z^{(j)}(\nu, z) = \sum_{n=-\infty}^{\infty} \frac{\phi\!\left(n + \tfrac{1}{2}\nu\right)}{\phi(\nu/2)}\, J_n(\sqrt{q}\,e^{-z})\, Z^{(j)}_{n+\nu}(\sqrt{q}\,e^{z}). \tag{14} \]

Here, Z (j ) runs over J , N , H (1) , and H (2) . These solutions are termed associated Mathieu functions of the first, second, third, and fourth kinds.5 We will primarily be interested in the third kind, since that is the one which describes tunneling from asymptotic infinity into the three-brane. The asymptotic behavior for
(15)

The behavior for
φ(−ν/2) J (ν, −z). φ(ν/2)

(16)

It is useful at this point to introduce the two quantities

\[ \eta = e^{i\pi\nu}, \qquad \chi = \frac{\phi(-\nu/2)}{\phi(\nu/2)}. \tag{17} \]

Now the behavior of H (1) (z) as
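\[ H^{(1)}(\nu, z) = \frac{1}{2i\sin\pi\nu}\left[ \left(\chi - \frac{1}{\chi}\right) H^{(1)}(\nu, -z) + \left(\chi - \frac{e^{-2i\pi\nu}}{\chi}\right) H^{(2)}(\nu, -z) \right]. \tag{18} \]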

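Recalling the asymptotics

\[ H^{(1)}_\nu(\xi) \to \sqrt{\frac{2}{\pi\xi}}\; e^{\,i\left(\xi - \frac{\pi}{2}\nu - \frac{\pi}{4}\right)}, \qquad H^{(2)}_\nu(\xi) \to \sqrt{\frac{2}{\pi\xi}}\; e^{-i\left(\xi - \frac{\pi}{2}\nu - \frac{\pi}{4}\right)} \qquad \text{as } \Re\,\xi \to \infty, \tag{19} \]

5 We emphasize, however, that of these only $J(\nu, z)$ is a Floquet solution.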


we obtain

q  √ 1 i ( qez−π4 ) √2  η− for
From (20) we read off the amplitudes $A = \chi - 1/\chi$, $B = \chi\eta - 1/(\chi\eta)$, and $C = \eta - 1/\eta$ for the reflected, incident, and transmitted waves, respectively. A consistency check on (20) is the unitarity relation, $|B|^2 = |A|^2 + |C|^2$. One way to prove this relation is to send $z \to z + i\pi/2$ so that the − cosh potential is inverted to + cosh. Clearly there are wavefunctions in this potential which are everywhere real and exponentially decaying on one side (but not the other unless $a$ is an eigen-energy).6 In fact, $H^{(1)}(z + i\pi/2)$ is just such a solution, up to a constant overall phase. Hence $A/C$ is pure imaginary. Now, $2\cos\pi\nu = \eta + 1/\eta$ is always real for real $q$ (a consequence of Hill’s equation). Hence $\eta$ is always either real or of unit modulus. The statement that $A/C$ is imaginary means that the same is true of $\chi$, and moreover $\chi$ is real when $\eta$ is of unit modulus and vice versa. The verification of unitarity,

\[ \left|\eta\chi - \frac{1}{\eta\chi}\right|^2 = \left|\eta - \frac{1}{\eta}\right|^2 + \left|\chi - \frac{1}{\chi}\right|^2, \tag{21} \]

is now straightforward. It proves easiest in practice to compute the absorption probability from

\[ P = \frac{\left|\eta - \frac{1}{\eta}\right|^2}{\left|\eta - \frac{1}{\eta}\right|^2 + \left|\chi - \frac{1}{\chi}\right|^2}, \tag{22} \]

but of course there are several equivalent alternative forms.

Following the methods of [22], it is straightforward though tedious to obtain a series expansion of $\chi$ in $q$. The first observation is that any formal sum

\[ A_q = \sum_{p_1=-\infty}^{\infty} \sum_{p_2=2}^{\infty} \cdots \sum_{p_q=2}^{\infty} t_{p_1}\, t_{p_1+p_2} \cdots t_{p_1+\ldots+p_q}, \tag{23} \]

where the $t_n$ are regarded as independent variables, can be written in terms of products of single sums of products of the $t_n$. A recursion relation is derived in [22] to demonstrate this fact:

\[ qA_q = \sum_{n=-\infty}^{\infty} t_n \frac{\partial A_q}{\partial t_n} = \sum_{n=-\infty}^{\infty} \left[ t_n - t_n\,(t_{n-1} + t_n + t_{n+1})\,\frac{\partial}{\partial t_n} + t_{n-1} t_n t_{n+1}\, \frac{\partial^2}{\partial t_{n-1}\,\partial t_{n+1}} \right] A_{q-1}. \tag{24} \]

6 Incidentally, (20) provides an implicit equation for the eigen-energies of the + cosh potential: namely $\chi = \pm 1$ for even/odd wavefunctions.
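The unitarity relation (21) and the absorption probability (22) are easy to check numerically in the two regimes described above ($\eta$ of unit modulus with $\chi$ real, and vice versa). The sketch below uses made-up sample values of $\eta$ and $\chi$ purely for illustration; they are not values of any physical partial wave, and the function names are ours.

```python
import cmath

def amplitudes(eta, chi):
    """Return (A, B, C): reflected, incident and transmitted amplitudes."""
    A = chi - 1 / chi
    B = chi * eta - 1 / (chi * eta)
    C = eta - 1 / eta
    return A, B, C

def absorption_probability(eta, chi):
    """Absorption probability P of Eq. (22)."""
    A, _, C = amplitudes(eta, chi)
    return abs(C) ** 2 / (abs(C) ** 2 + abs(A) ** 2)

def unitarity_gap(eta, chi):
    """|B|^2 - |A|^2 - |C|^2, which should vanish by Eq. (21)."""
    A, B, C = amplitudes(eta, chi)
    return abs(B) ** 2 - abs(A) ** 2 - abs(C) ** 2

# Case 1 (illustrative values): eta on the unit circle, chi real.
gap_1 = unitarity_gap(cmath.exp(0.3j), 2.5)
# Case 2 (illustrative values): eta real, chi on the unit circle.
gap_2 = unitarity_gap(1.7, cmath.exp(1.1j))
P_1 = absorption_probability(cmath.exp(0.3j), 2.5)
```

When unitarity holds, P coincides with $|C|^2/|B|^2$, the ratio of transmitted to incident flux.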


Let us introduce the notation

\[ S[\alpha_0, \alpha_1, \ldots, \alpha_k] = \sum_{n=-\infty}^{\infty} \prod_{j=0}^{k} t_{n+j}^{\alpha_j}, \tag{25} \]

where the $\alpha_j$ are natural numbers with $\alpha_0$ and $\alpha_k$ nonzero.7 Then the map $H : A_{q-1} \to qA_q$ defined in (24) can be viewed formally as a linear operator on the infinite-dimensional vector space whose basis is 1 together with all possible products of the $S[\alpha_0, \alpha_1, \ldots, \alpha_k]$.8 We have

\[ A_q = \frac{1}{q!} H^q(1), \tag{26} \]

where $H^q(1)$ is the operator $H$ acting $q$ times on unity. Amusingly, the problem of computing the generalization of the function $v$ in (9) to arbitrary $t_n$ at finite $\lambda$ is formally identical to Euclidean evolution by the Hamiltonian $H$:

\[ v \equiv \sum_{q=0}^{\infty} (-\lambda^4)^q A_q = e^{-\lambda^4 H}(1). \tag{27} \]
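The claim that the nested sums (23) reduce to polynomials in the loop variables (25) can be tested by brute force on a finite sequence. The sketch below evaluates $A_q$ directly from its definition (index sets whose consecutive members differ by at least 2) and compares it with the loop-variable expressions for $A_2$, $A_3$ and $A_4$ listed in Eq. (47) of the appendix. The helper names and the sample sequence $t_n = 1/(n+2)$ on seven sites are arbitrary choices made for illustration; exact rational arithmetic is used so the comparison is an identity check rather than a floating-point one.

```python
from fractions import Fraction
from itertools import combinations

def A_direct(t, q):
    """A_q of Eq. (23) for a finite list t (t_n = 0 outside the list)."""
    total = Fraction(0)
    for idx in combinations(range(len(t)), q):
        # consecutive indices must differ by at least 2 (p_2, ..., p_q >= 2)
        if all(b - a >= 2 for a, b in zip(idx, idx[1:])):
            prod = Fraction(1)
            for i in idx:
                prod *= t[i]
            total += prod
    return total

def S(t, *alpha):
    """Loop variable S[alpha_0, ..., alpha_k] of Eq. (25) for a finite list t."""
    width = len(alpha)
    total = Fraction(0)
    for n in range(len(t) - width + 1):
        prod = Fraction(1)
        for j, power in enumerate(alpha):
            prod *= t[n + j] ** power
        total += prod
    return total

t = [Fraction(1, n + 2) for n in range(7)]  # arbitrary sample sequence
S1, S2, S3, S4 = (S(t, p) for p in (1, 2, 3, 4))
S11, S12, S21 = S(t, 1, 1), S(t, 1, 2), S(t, 2, 1)

A2_loop = S1**2 / 2 - S2 / 2 - S11
A3_loop = (S1**3 / 6 - S1 * S2 / 2 + S3 / 3 - S1 * S11
           + S12 + S21 + S(t, 1, 1, 1))
A4_loop = (S1**4 / 24 - S1**2 * S2 / 4 + S2**2 / 8 + S1 * S3 / 3 - S4 / 4
           - S1**2 * S11 / 2 + S2 * S11 / 2 + S11**2 / 2
           + S1 * S12 - S(t, 1, 3) + S1 * S21 - 3 * S(t, 2, 2) / 2 - S(t, 3, 1)
           + S1 * S(t, 1, 1, 1) - S(t, 1, 1, 2) - 2 * S(t, 1, 2, 1)
           - S(t, 2, 1, 1) - S(t, 1, 1, 1, 1))
```

The direct enumeration grows combinatorially with the length of the sequence, which is exactly why the loop-variable form is the practical one.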

In Eq. (47) of the appendix we write out the first four $A_q$ in terms of the $S[\alpha_0, \alpha_1, \ldots, \alpha_k]$. Now let us specialize to $A_q = A_z^{(q)}$ by setting

\[ t_n = \begin{cases} a_{z+n} & \text{for } n \ge 0 \\ 0 & \text{otherwise.} \end{cases} \tag{28} \]

The sums $S[\alpha_0, \alpha_1, \ldots, \alpha_q]$ then have the general form $\sum_{n=0}^{\infty} 1/f(z+n)$, where $f(z)$ is a polynomial of degree $4\sum_{i=0}^{q} \alpha_i$. Such sums can be performed explicitly in terms of the function $\psi(z) = \Gamma'(z)/\Gamma(z)$ and its derivatives. The first step is to make a partial fraction decomposition:

\[ \frac{1}{f(z)} = \sum_{f(y)=0} \sum_{\ell=1}^{\infty} \frac{c_y^{(\ell)}}{(z-y)^{\ell}}. \tag{29} \]

The first sum is over the roots of $f(z)$. For a root $y$ of multiplicity $k$, only the first $k$ of the constants $c_y^{(\ell)}$ can be nonzero. Each term in the partial fraction decomposition makes a contribution to the sum over $n$ which can be read off from

\[ \psi(z) = -C - \sum_{n=0}^{\infty} \left( \frac{1}{z+n} - \frac{1}{n+1} \right), \qquad \psi^{(k)}(z) = (-1)^{k+1}\, k! \sum_{n=0}^{\infty} \frac{1}{(z+n)^{k+1}}, \tag{30} \]

7 Note that only questions of convergence stand in the way of extending the following discussion to arbitrary real sequences $\{\alpha_j\}_{j=-\infty}^{\infty}$ modulo the equivalence relation $\{\alpha_j\}_{j=-\infty}^{\infty} \sim \{\alpha_{j+k}\}_{j=-\infty}^{\infty}$ for integer $k$.
8 This space is reminiscent of the loop spaces encountered, for instance, in the c = 0 matrix model [23, 24]. In this analogy, $H$ plays the role of the Fokker–Planck Hamiltonian.


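where $C = \log\gamma \approx 0.5772$ is Euler’s constant. This leads to the sum

\[ \sum_{n=0}^{\infty} \frac{1}{f(z+n)} = \sum_{f(y)=0} \sum_{\ell=1}^{\infty} \frac{(-1)^{\ell}}{(\ell-1)!}\, c_y^{(\ell)}\, \psi^{(\ell-1)}(z-y). \tag{31} \]

The coefficients $c_y^{(1)}$ satisfy the relation

\[ \sum_{f(y)=0} c_y^{(1)} = \frac{1}{2\pi i} \oint_{\gamma} \frac{dz}{f(z)} = 0, \tag{32} \]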

where $\gamma$ is a contour that encloses all the roots of $f(z)$, ensuring that the divergences from the various $1/(z-y)$ terms in the partial fraction decomposition cancel. In effect this allows us to use the second line of (30) even at $k = 0$. Explicit expressions for the first few $S[\{\alpha_i\}]$’s are included in Eq. (48) of the appendix.

To complete the task of computing the absorption cross-section, we need to determine the value of $\chi = \phi(-\nu/2)/\phi(\nu/2)$. All that remains to be done now is to substitute the expansion for $\nu$ given in (46) into (9) and collect terms of given order in $\lambda$. Because $\nu$ is an integer plus powers of $\lambda$, the $\psi$ functions can all be Taylor expanded around integers or half-integers. To simplify the final expressions, it is useful to recall the relation of $\psi$ to the Riemann zeta function $\zeta(s)$ and its generalizations $\zeta(s, z)$:

\[ \psi(1) = -C, \qquad \psi^{(k)}(z) = (-1)^{k+1}\, k!\, \zeta(k+1, z), \]
\[ \zeta(s, z+1) = \zeta(s, z) - \frac{1}{z^s}, \qquad \zeta(s, 1) = \zeta(s), \qquad \zeta\!\left(s, \tfrac{1}{2}\right) = (2^s - 1)\,\zeta(s). \tag{33} \]
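As a concrete instance of (31), the simplest loop variable $S[1] = \sum_{n \ge 0} a_{z+n}$ has four simple poles, and its closed form in terms of $\psi$ is the first line of Eq. (48) in the appendix. The sketch below compares that closed form with a direct partial sum. Python's standard library has no digamma function, so a small helper is included that uses the recurrence $\psi(x) = \psi(x+1) - 1/x$ followed by the standard asymptotic series; this helper, the number of summed terms, and the sample points are our own choices and only valid for real positive arguments.

```python
import math

def digamma(x):
    """psi(x) for real x > 0: shift x upward, then use the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    u = 1.0 / (x * x)
    series = u * (1 / 12 - u * (1 / 120 - u * (1 / 252 - u * (1 / 240 - u / 132))))
    return acc + math.log(x) - 0.5 / x - series

def a_coeff(z, r):
    """a_z as defined with Eq. (9)."""
    return 1.0 / ((z + r + 1) * (z + r + 2) * (z - r + 1) * (z - r + 2))

def S1_direct(z, r, terms=20000):
    """Partial sum of S[1] = sum_{n>=0} a_{z+n}; terms fall off like n^-4."""
    return sum(a_coeff(z + n, r) for n in range(terms))

def S1_closed(z, r):
    """Closed form of S[1], first line of Eq. (48)."""
    rational = (2 * z + 3) / ((4 * r * r - 1) * (z + 1 - r) * (z + 1 + r))
    return rational + (digamma(z + 1 - r) - digamma(z + 1 + r)) / (4 * r**3 - r)

direct_l1 = S1_direct(1.0, 1.5)   # r = 3/2 corresponds to l = 1
closed_l1 = S1_closed(1.0, 1.5)
closed_l0 = S1_closed(0.5, 1.0)   # r = 1 corresponds to l = 0
```

For r = 1 the ψ functions differ by integer shifts, so the closed form collapses to the rational function $1/(3z(z+1)(z+2))$, which gives a second, independent check.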

The final expressions for the absorption probability of the $l$th partial wave have the form

\[ P_l = \frac{4\pi^2}{[(l+1)!]^4\,(l+2)^2} \left(\frac{\omega R}{2}\right)^{8+4l} \sum_{n=0}^{\infty} \sum_{k=0}^{n} b_{n,k}\, (\omega R)^{4n}\, (\log \omega\bar{R})^k, \tag{34} \]

where $\bar{R} = e^{C} R/2$. The overall normalization has been chosen so that $b_{0,0} = 1$. We have computed the values of the first few $b_{n,k}$’s for l = 0, l = 1, and l = 2 which we summarize in Table 1. We find that $b_{n,k}$ is rational for $n - k < 2$, whereas for $n - k \ge 2$ it is a linear combination of $\zeta(2), \zeta(3), \ldots, \zeta(n-k)$ with rational coefficients. The absorption cross-section for the $l$th partial wave can now be computed from a version of the Optical Theorem:

\[ \sigma_l = \frac{8\pi^2/3}{\omega^5}\, (l+1)(l+2)^2(l+3)\, P_l. \tag{35} \]

The generalization of this formula to arbitrary dimensions was derived in [25].
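To see how (34) and (35) fit together, the sketch below evaluates only the leading term of (34) (n = 0, where $b_{0,0} = 1$ by the stated normalization) and feeds it into (35). The function names and the sample values of ω and R are ours. For the s-wave the arithmetic gives $P_0 = \pi^2(\omega R)^8/256$ and hence $\sigma_0 = \pi^4\omega^3 R^8/8$ at leading order, which the code reproduces.

```python
import math

def leading_absorption_probability(l, omega, R):
    """n = 0 term of Eq. (34), using b_{0,0} = 1; higher orders are dropped."""
    prefactor = 4 * math.pi**2 / (math.factorial(l + 1) ** 4 * (l + 2) ** 2)
    return prefactor * (omega * R / 2) ** (8 + 4 * l)

def cross_section(l, omega, P_l):
    """Partial-wave absorption cross-section from Eq. (35)."""
    return (8 * math.pi**2 / 3) / omega**5 * (l + 1) * (l + 2) ** 2 * (l + 3) * P_l

omega, R = 0.1, 1.0  # sample values with omega*R small
P0 = leading_absorption_probability(0, omega, R)
sigma0 = cross_section(0, omega, P0)
```

Including the $b_{n,k}$ of Table 1 would multiply `P0` by the double sum in (34); this sketch leaves that out.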


Table 1. Leading coefficients $b_{n,k}$ for the expansion with respect to ωR for the absorption cross-section (34) of l = 0, l = 1, and l = 2 partial waves

          l = 0                                                  l = 1                                                     l = 2
b_{1,1}   −1/6                                                   −1/24                                                     −1/60
b_{1,0}   7/72                                                   53/1152                                                   19/800
b_{2,2}   17/576                                                 1/1152                                                    1/7200
b_{2,1}   −161/4608                                              −757/276480                                               −821/1728000
b_{2,0}   5561/663552 − 11ζ(2)/576                               261343/132710400 − ζ(2)/4608                              44071/103680000 − ζ(2)/28800
b_{3,3}   −11/2592                                               −1/82944                                                  −1/1296000
b_{3,2}   623/82944                                              7/69120                                                   479/103680000
b_{3,1}   −39037/9953280 + 49ζ(2)/6912                           −554911/3185049600 + ζ(2)/110592                          −1731599/174182400000 + ζ(2)/1728000
b_{3,0}   1093099/2388787200 − 1379ζ(2)/331776 + 5ζ(3)/41472     65129557/764411904000 − 101ζ(2)/2211840 − ζ(3)/663552     1148018521/167215104000000 − 479ζ(2)/414720000 − ζ(3)/10368000

3. The World-Volume Dynamics

Let us now consider the world-volume interpretation for the case where the minimal scalar is the dilaton. In the ’t Hooft limit g → 0, N → ∞ with gN fixed, quantum fluctuations of bulk fields decouple and the dynamics is strictly on the brane world-volume. The only sense in which bulk fields enter is as a source of world-volume fluctuations in the form of a local operator. The s-wave of the dilaton corresponds in the world-volume theory to the operator $O$ which slides the gauge coupling. The absorption probability $P_{l=0}$ then translates directly into the discontinuity of the cut in the two-point function of $O$ through the formula [11]

\[ P_{l=0} = \frac{\pi^3 \omega^4 R^8}{8 i N^2}\, \mathrm{Disc}\, \Pi(p^2), \qquad \Pi(p^2) = \int d^4x\; e^{ip\cdot x}\, \Pi(x^2), \tag{36} \]

where

\[ \Pi(x^2) = \langle O(x)\, O(0) \rangle. \tag{37} \]

The dynamics of the world-volume theory at leading order in energy is captured by its superconformal limit in the infrared. To higher order in energy, however, one must account for the effect of irrelevant perturbations which take the theory away from the fixed point. The correlator $\langle \ldots \rangle$ is therefore taken with respect to some quantum effective action which we will describe later in this section. In (36) it should be noted that the discontinuity is taken across the cut positioned along the positive real axis of the complex $s = -p^2$ plane, evaluated at $s = \omega^2$. Working backward, one can read off $\Pi(x^2)$ from $P_{l=0}$, with the result

\[ \Pi(x^2) = \frac{3N^2}{\pi^4 x^8} \sum_{n=0}^{\infty} \sum_{k=0}^{n} c_{n,k} \left(\frac{R^2}{x^2}\right)^{2n} \left(\log\frac{R^2}{x^2}\right)^{k}. \tag{38} \]


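To obtain $P_{l=0}$ from (38) we must specify a regularization scheme for the Fourier integrals. The minimal scheme, following [26, 27], is to analytically continue the formula

\[ \int d^4x\; \frac{e^{ip\cdot x}}{x^{2h}} = \pi^2 \left(\frac{4}{p^2}\right)^{2-h} \frac{\Gamma(2-h)}{\Gamma(h)} \tag{39} \]

beyond its radius of convergence $|h - 1| < 1$ to a meromorphic function on the entire complex $h$ plane, and then read off the behavior near the poles at positive integer $h$ by matching terms in the Taylor expansions in $a$ of

\[ \int d^4x\; e^{ip\cdot x}\, \frac{(\mu x)^{2a}}{x^{2n}} = \pi^2 \left(\frac{4}{p^2}\right)^{2-n} \left(\frac{4\mu^2}{p^2}\right)^{a} \frac{\Gamma(2-n+a)}{\Gamma(n-a)}, \]
\[ \mathrm{Disc} \int d^4x\; e^{ip\cdot x}\, \frac{(\mu x)^{2a}}{x^{2n}} = -\frac{2\pi^3 i \left(\frac{4}{\omega^2}\right)^{2-n} \left(\frac{4\mu^2}{\omega^2}\right)^{a}}{\Gamma(n-a)\,\Gamma(n-a-1)}. \tag{40} \]

For the expansions in $a$ one uses

\[ (\mu x)^{2a} = \sum_{n=0}^{\infty} \frac{a^n}{n!} (\log \mu^2 x^2)^n, \qquad \log\Gamma(1+a) = \frac{1}{2}\log\frac{\pi a}{\sin\pi a} - Ca - \sum_{n=1}^{\infty} \frac{a^{2n+1}}{2n+1}\,\zeta(2n+1). \tag{41} \]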
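The expansion of $\log\Gamma(1+a)$ in (41) is what brings the odd zeta values into the coefficients, and it can be checked directly against the standard library's `math.lgamma`. In the sketch below the zeta helper (a direct sum with an Euler–Maclaurin tail correction), the truncation orders, and the sample value a = 0.3 are our own illustrative choices; the series requires |a| < 1.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C

def zeta(s, N=1000):
    """Riemann zeta for real s > 1: direct sum plus Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, N + 1))
    tail = N ** (1 - s) / (s - 1) - 0.5 * N ** -s + s * N ** (-s - 1) / 12
    return head + tail

def log_gamma_series(a, terms=40):
    """Right-hand side of the second identity in Eq. (41), for 0 < |a| < 1."""
    odd = sum(a ** (2 * n + 1) / (2 * n + 1) * zeta(2 * n + 1)
              for n in range(1, terms + 1))
    return 0.5 * math.log(math.pi * a / math.sin(math.pi * a)) - EULER_C * a - odd

a_sample = 0.3
lhs = math.lgamma(1 + a_sample)
rhs = log_gamma_series(a_sample)
```

The even zeta values are resummed into the $\log(\pi a/\sin\pi a)$ term, which is why only $\zeta(2n+1)$ appears explicitly.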

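Upon setting the energy scale $\mu = 1/R$ one obtains the $c_{n,k}$ as numbers involving $\zeta(s)$ in the same way as the $b_{n,k}$: explicitly,

\[ c_{0,0} = 1, \quad c_{1,1} = -320, \quad c_{2,2} = 571200, \quad c_{1,0} = -1024, \quad c_{2,1} = 4408560, \quad c_{2,0} = \tfrac{14}{3}\left(1422697 - 12000\pi^2\right). \tag{42} \]

One can formally define a dimension $\Delta$ for the operator $O$ in (37) through a version of the Callan–Symanzik equation:

\[ \left( x\frac{\partial}{\partial x} + 2\Delta \right) \Pi(x^2) = 0. \tag{43} \]

For $R^4/x^4 \ll 1$, this results in a series of the same form as (38):

\[ \Delta = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \Delta_{n,k} \left(\frac{R^2}{x^2}\right)^{2n} \left(\log\frac{R^2}{x^2}\right)^{k} = 4 - 64\,\frac{R^4}{x^4}\left(37 + 10\log\frac{R^2}{x^2}\right) + \ldots. \tag{44} \]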
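The first correction in (44) follows from (38), (42) and (43) by a short calculation, sketched here with exact arithmetic. Writing $u = R^4/x^4$ and $L = \log(R^2/x^2)$, one has $x\,\partial_x u = -4u$ and $x\,\partial_x L = -2$, so $\Delta = -\tfrac{1}{2}\, x\,\partial_x \log\Pi = 4 + u\,(2c_{1,0} + c_{1,1} + 2c_{1,1} L) + O(u^2)$. The function name below is ours; the inputs are the values of $c_{1,0}$ and $c_{1,1}$ quoted in (42).

```python
from fractions import Fraction

def dimension_first_order(c10, c11):
    """Return (constant, log coefficient) of the R^4/x^4 term of Delta - 4.

    Derived from Delta = -(1/2) x d/dx log Pi with Pi taken from Eq. (38),
    keeping only the n = 0 and n = 1 terms.
    """
    return 2 * c10 + c11, 2 * c11

const_term, log_term = dimension_first_order(Fraction(-1024), Fraction(-320))
# const_term and log_term are the coefficients of u and u*L respectively
```

Both coefficients share the common factor −64, matching the form in which (44) is written.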

The challenge at this point is to reproduce (38) and its generalizations to higher partial waves through a quantum field theory analysis. As we mentioned earlier in this section, this requires a knowledge of the world-volume dynamics beyond the superconformal limit in the infrared. In principle, this theory is well defined as a low-energy effective action of the full string theory. At present, however, no concrete formulation of this


effective theory is known. Therefore, instead of trying to reproduce (38), we can attempt to learn about this effective theory from the data provided by (38).

The leading term has precisely the form one expects in a conformal theory. The leading correction, $(R^4/x^4)\log(R^2/x^2)$, has the form one would obtain by perturbing the conformal field theory by a dimension eight operator. It was speculated in [15] that this correction and perhaps the full semi-classical cross-section would eventually find its world-volume explanation in the non-abelian Dirac–Born–Infeld (DBI) action, with the symmetrized trace prescription proposed in [28] to pick out the leading correction at dimension eight ($\mathrm{Tr}[F^4]$), rather than dimension six ($\mathrm{Tr}[F^3]$) as one would expect from other prescriptions. However, the DBI action arises from summing disc diagrams, so it defines a classical field theory, and in no way captures the effect of a resummation of infinite insertions of boundaries in the large gN limit. Furthermore, the non-renormalizability of the action makes it impossible to proceed to the quantum theory from a knowledge of the tree-level amplitudes alone, as was the standard strategy in deriving low-energy renormalizable quantum field theories from string theory.

We require some further input from the string theory. It was conjectured in [29, 30] that all operators in the gauge theory except those in short multiplets acquire large anomalous dimensions in the strong ’t Hooft coupling limit, and perhaps even decouple from the operator algebra.9 The supergravity fields corresponding to the operators in short multiplets have been tabulated in [31]. Inspection of this table reveals that the only scalar SO(6) singlet operators are the renormalizable lagrangian $O_4$ (coupling to the s-wave of the dilaton) and a dimension eight operator $O_8$ which couples to uniform dilations of the $S^5$ part of the near-horizon geometry.
There is also a dimension four pseudo-scalar which couples to the axion, which we shall ignore in the following. On the grounds of group theory and large anomalous scaling dimensions, we are then led to the tentative conclusion that the effective lagrangian for the low-energy dynamics at large gN is

\[ \mathcal{L} = O_4 + R^4\, O_8. \tag{45} \]

The relation to DBI is merely that the low-energy effective lagrangian of the same system at small gN is the DBI action. On this view, the phrase “DBI action” must be interpreted in [15] (and in the many other papers in the literature, e.g. [32], where it was invoked in the context of an effective world-volume theory of D-brane black holes) as a metonym for its strong-coupling relative.

Equation (45) is a fantastic simplification over the still incompletely known non-abelian DBI action. But in a way it is no less problematical as a specification of a quantum theory. The natural interpretation of (45) is as the Wilsonian effective action with cutoff on the order R.10 The difficulties with this approach include pinning down the normalization of $O_8$ at a given cutoff, defining an appropriate regularization scheme which allows one to recover maximal supersymmetry, and the apparent vanishing of $\langle O_4 O_4 O_8 \rangle$ in the AdS/CFT prescription to leading order in large gN. Nevertheless, let us try to argue that (45) at least has the potential to reproduce all the correction terms in (38).

9 We thank T. Banks for a discussion on this point.
10 If the cutoff $\Lambda$ is made arbitrary, then one must introduce a coupling $\lambda(\Lambda)$ in front of $O_8$ which runs precisely in order to keep the physical observables, e.g. correlation functions, invariant with respect to the change in the choice of the cut-off.

Fig. 1. A diagram with n quartic vertices contributing at order $O(R^{4n})$

Following [15], we can consider as a toy model free U(1)

gauge theory with an $F^4$ interaction. From graphs such as the one in Fig. 1, one indeed obtains a $(R^4/x^4)^n (\log R^2/x^2)^n$ correction to the two-point function. It is fascinating that the final forms (34) and (38) of the absorption probability and two-point function are so simple and suggestive of Feynman integrals, regulated at the scale $\mu = 1/R$. For small ωR, it seems that the perturbative expansion around the conformal limit may be better defined than we have any right to expect based on previous experience with non-renormalizable divergences in quantum field theories. Quite remarkably, one type of interaction alone is sufficient to reproduce the form of (38). This might indeed be a consequence of superconformal invariance and the decoupling of non-chiral operators in the large gN limit severely restricting the dynamics away from the infrared fixed point. We regretfully leave a more detailed study for future work.

4. Discussion

The biggest obstacle to finding evidence for the conjectured throat-brane equivalence [6, 29, 30] between N = 4 super Yang-Mills theory and supergravity on $AdS_5 \times S^5$ is that supergravity’s validity is restricted to the region of strong ’t Hooft coupling, where gauge theory calculations are difficult. Let us adopt units where the radius of $S^5$ is 1. Briefly, since $1/\alpha' \sim g_{YM}\sqrt{N}$ in these units, the $\alpha'$ corrections to the supergravity action are important except in the limit of large $g_{YM}\sqrt{N}$. For example, the supergravity fields on $AdS_5$ (with Kaluza–Klein masses on the order 1/R) are much lighter than massive string states (with masses on the order $1/\sqrt{\alpha'}$) only in this limit. The corresponding nonchiral fields in the gauge theory “freeze out” on account of an anomalous dimension on the order $(g_{YM}\sqrt{N})^{1/2}$ [29].
Large N can be regarded as a separate requirement: since powers of κ ∼ 1/N suppress quantum loop corrections to supergravity, the identification of the classical supergravity action with the generator of connected Green’s functions can only capture the leading large N asymptotics.

To proceed to finite or small $g_{YM}\sqrt{N}$ seems difficult without some profound new insight into the description of string theory in Ramond–Ramond backgrounds. Any hope of systematic perturbative field theory evidence in favor of the throat-brane conjecture would seem to depend on finding some other small coupling parameter. The only candidate seems to be ωR, where ω is the energy of a given process (i.e. absorption). As a first step in investigating a possible perturbation expansion in ωR, we have given an algorithm, which can be readily implemented on a computer, for extracting the absorption cross-section of a minimal scalar in an arbitrary partial wave.

The notion [15, 33] that the DBI action of D3-branes can in any meaningful way “holograph” supergravity or string theory on the full extremal three-brane geometry must be viewed with skepticism. It is perhaps more reasonable to hope that a quantum field theoretic derivation of at least the leading log terms in the ωR series expansion might be achieved (in part because these terms have a simpler cutoff dependence than terms with fewer powers of logarithms). In geometrical terms, the hope would be to see the r/R corrections to the near-horizon geometry (where r is the usual radial variable entering into the harmonic function $H = 1 + R^4/r^4$) reflected order by order in the non-renormalizable contributions to the Green’s functions for some quantum effective world volume theory.

While the motivation for this work was primarily our hope to achieve a better understanding of the double scaling limit described in [5, 10], our main technical results can be stated in the more prosaic setting of Schrödinger operators in one dimension. For a particle moving in a potential $V(z) = -2q\cosh 2z$, we have found a simple expression (22) for the transmission coefficient in terms of the Floquet exponent $\nu$ and a quantity $\chi$ related to the transformation properties of Floquet solutions under parity. The computation of the Floquet exponent is well understood in terms of continued fractions. We implement the methods of [22] to give a method for computing $\chi$ as well. The Hamiltonian form of (27), and the surprising symmetry in the transmission probability between $\eta = e^{i\pi\nu}$ and $\chi$, tantalizes us with the hope that one might be able to give a treatment of Mathieu functions which puts $\eta$ and $\chi$ on an equal footing.

Acknowledgements. We would like to thank M. Fisher for introducing us to Mathieu functions. Thanks to R. Goldstein, E. Lieb, R. Askey, and the participants of the String Dualities program at ITP, Santa Barbara, for discussions. We appreciate S. de Alwis’ prompt reading of the manuscript. S.S.G. is grateful to the ITP for hospitality during the initial stages of this work. This research was supported in part by the National Science Foundation under Grant No. PHY94-07194, by the Department of Energy under Grant No. DEFG02-91ER40671, and by the James S. McDonnell Foundation under Grant No. 91-48. S.S.G. also thanks the Hertz Foundation for its support.

Appendix A. Explicit Formulas

In this appendix we present explicit forms for some results which were considered too lengthy to write out in the main text. Most of the computations were done with Mathematica. First, the Floquet exponent for r = 1, r = 3/2, and r = 2 (corresponding to l = 0, l = 1, and l = 2) can be expanded as a power series in λ as follows:

\[ r = 1: \quad \nu = 2 - \frac{i\sqrt{5}}{3}\lambda^4 + \frac{7i}{108\sqrt{5}}\lambda^8 + \frac{11851\,i}{31104\sqrt{5}}\lambda^{12} + \cdots, \]
\[ r = \tfrac{3}{2}: \quad \nu = 3 - \frac{1}{6}\lambda^4 + \frac{133}{4320}\lambda^8 + \frac{311}{1555200}\lambda^{12} + \cdots, \tag{46} \]
\[ r = 2: \quad \nu = 4 - \frac{1}{15}\lambda^4 - \frac{137}{27000}\lambda^8 + \frac{305843}{680400000}\lambda^{12} + \cdots. \]

By iterating (26) one can obtain expressions for the formal series $A_q$ defined in (23) in terms of the “loop variables” $S[\alpha_0, \alpha_1, \ldots, \alpha_k]$. These grow in size very rapidly:

\[ A_1 = S[1], \]
\[ A_2 = \frac{S[1]^2}{2} - \frac{S[2]}{2} - S[1,1], \]
\[ A_3 = \frac{S[1]^3}{6} - \frac{S[1]S[2]}{2} + \frac{S[3]}{3} - S[1]S[1,1] + S[1,2] + S[2,1] + S[1,1,1], \]
\[ A_4 = \frac{S[1]^4}{24} - \frac{S[1]^2 S[2]}{4} + \frac{S[2]^2}{8} + \frac{S[1]S[3]}{3} - \frac{S[4]}{4} - \frac{S[1]^2 S[1,1]}{2} + \frac{S[2]S[1,1]}{2} + \frac{S[1,1]^2}{2} \tag{47} \]
\[ \qquad + S[1]S[1,2] - S[1,3] + S[1]S[2,1] - \frac{3S[2,2]}{2} - S[3,1] + S[1]S[1,1,1] - S[1,1,2] - 2S[1,2,1] - S[2,1,1] - S[1,1,1,1], \]

and so on. After making the identification (28), the formal sums $S[\{\alpha_i\}]$ may be evaluated explicitly in the manner indicated in the paragraph following (28).

\[ S[1] = \frac{-3 - 2z}{(-1+2r)(1+2r)(-1+r-z)(1+r+z)} + \frac{\psi(1-r+z) - \psi(1+r+z)}{-r + 4r^3}, \]

\[ S[2] = \frac{35 + 84z + 70z^2 + 20z^3 + 8r^4(1+2z) - 2r^2\left(35 + 50z + 28z^2 + 8z^3\right)}{\left(-1+4r^2\right)^3 (1-r+z)^2 (1+r+z)^2} + \frac{\left(-1+20r^2\right)\left(\psi(1-r+z) - \psi(1+r+z)\right)}{2r^3\left(-1+4r^2\right)^3} + \frac{\left(1+4r^2\right)\left(\psi^{(1)}(1-r+z) + \psi^{(1)}(1+r+z)\right)}{2r^2\left(1-4r^2\right)^2}, \tag{48} \]

\[ S[1,1] = -\frac{4r^6(3+2z) + r^4\left(35 - 26z - 36z^2 - 8z^3\right) - r^2\left(109 + 143z + 65z^2 + 10z^3\right) + 2(2+z)^2}{4(-1+r)(1+r)\left(r - 4r^3\right)^2 (-2+r-z)(-1+r-z)(1+r+z)(2+r+z)} + \frac{\left(-1+10r^2\right)\left(\psi(1-r+z) - \psi(1+r+z)\right)}{4r^3\left(1-4r^2\right)^2\left(-1+r^2\right)} + \frac{\psi^{(1)}(2-r+z) + \psi^{(1)}(2+r+z)}{4r^2 - 16r^4}. \]

These formulas also become very lengthy, and they have many different forms because of the various identities for the ψ function.

References

1. Horowitz, G.T. and Strominger, A.: Black strings and p-branes. Nucl. Phys. B360, 197–209 (1991)
2. Polchinski, J.: Dirichlet Branes and Ramond–Ramond charges. Phys. Rev. Lett. 75, 4724–4727 (1995), hep-th/9510017
3. Witten, E.: Bound states of strings and p-branes. Nucl. Phys. B460, 335–350 (1996), hep-th/9510135
4. Das, S.R. and Mathur, S.D.: Comparing decay rates for black holes and D-branes. Nucl. Phys. B478, 561–576 (1996), hep-th/9606185
5. Klebanov, I.R.: World volume approach to absorption by nondilatonic branes. Nucl. Phys. B496, 231 (1997), hep-th/9702076
6. Maldacena, J.: The Large N limit of superconformal field theories and supergravity. hep-th/9711200
7. Russo, J.G. and Tseytlin, A.A.: Green–Schwarz superstring action in a curved magnetic Ramond–Ramond background. J. High Energy Phys. 04, 14 (1998), hep-th/9804076
8. Metsaev, R.R. and Tseytlin, A.A.: Type IIB superstring action in AdS(5) × S5 background. hep-th/9805028
9. Gubser, S.S., Klebanov, I.R. and Peet, A.W.: Entropy and temperature of black 3-branes. Phys. Rev. D54, 3915–3919 (1996), hep-th/9602135
10. Gubser, S.S., Klebanov, I.R. and Tseytlin, A.A.: String theory and classical absorption by three-branes. Nucl. Phys. B499, 217 (1997), hep-th/9703040
11. Gubser, S.S. and Klebanov, I.R.: Absorption by branes and Schwinger terms in the world volume theory. Phys. Lett. B413, 41–48 (1997), hep-th/9708005
12. Freedman, D.Z., Mathur, S.D., Matusis, A. and Rastelli, L.: Correlation functions in the CFT(d)/AdS(d+1) correspondence. hep-th/9804058
13. Chalmers, G., Nastase, H., Schalm, K. and Siebelink, R.: R current correlators in N = 4 super Yang–Mills theory from anti-de Sitter supergravity. hep-th/9805105
14. Das, S.R.: The Effectiveness of D-branes in the description of near extremal black holes. Phys. Rev. D56, 3582–3590 (1997), hep-th/9703146
15. Gubser, S.S., Hashimoto, A., Klebanov, I.R. and Krasnitz, M.: Scalar absorption and the breaking of the world volume conformal invariance. hep-th/9803023
16. Abramowitz, M. and Stegun, I.A. (eds): Handbook of Mathematical Functions. Washington, DC: US Government Printing Office, 1964
17. Gradshteyn, I.S. and Ryzhik, I.M.: Table of Integrals, Series, and Products. San Diego: Academic Press, 1994
18. Whittaker, E.T. and Watson, G.N.: A Course of Modern Analysis. Cambridge: Cambridge University Press, 1969
19. Erdelyi, A., Magnus, W., Oberhettinger, F. and Tricomi, F.G. (eds): Higher Transcendental Functions. New York: McGraw-Hill, 1953–1955
20. McLachlan, N.W.: Theory and Application of Mathieu Functions. Oxford: Clarendon Press, 1947
21. Neuberger, H.: Semiclassical Calculation of the Energy Dispersion Relation in the Valence Band of the Quantum Pendulum. Phys. Rev. C17, 498 (1978)
22. Dougall, J.: The Solution of Mathieu’s Differential Equation. Proceedings of the Edinburgh Mathematical Society XXXIV, 176–196 (1916)
23. Ishibashi, N. and Kawai, H.: String field theory of noncritical strings. Phys. Lett. B314, 190–196 (1993), hep-th/9307045
24. Gubser, S.S.: Geodesic distance in two-dimensional quantum gravity. Undergraduate thesis, Princeton University, 1994
25. Gubser, S.S.: Can the effective string see higher partial waves? Phys. Rev. D56, 4984–4993 (1997), hep-th/9704195
26. Freedman, D.Z., Johnson, K. and Latorre, J.I.: Differential regularization and renormalization: A New method of calculation in quantum field theory. Nucl. Phys. B371, 353–414 (1992)
27. Anselmi, D.: Central functions and their physical implication. hep-th/9702056
28. Tseytlin, A.A.: On non-Abelian generalization of Born–Infeld action in string theory. Nucl. Phys. B501, 41 (1997), hep-th/9701125
29. Gubser, S.S., Klebanov, I.R. and Polyakov, A.M.: Gauge theory correlators from noncritical string theory. hep-th/9802109
30. Witten, E.: Anti-de Sitter space and holography. hep-th/9802150
31. Kim, H.J., Romans, L.J. and van Nieuwenhuizen, P.: Mass spectrum of chiral ten-dimensional N = 2 supergravity on S5. Phys. Rev. D32, 389–399 (1985)
32. Callan, C.G., Gubser, S.S., Klebanov, I.R. and Tseytlin, A.A.: Absorption of fixed scalars and the D-brane approach to black holes. Nucl. Phys. B489, 65–94 (1997), hep-th/9610172
33. de Alwis, S.P.: Supergravity, the DBI action and black hole physics. hep-th/9804019

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 341 – 347 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On the Absolutely Continuous Spectrum of One-Dimensional Schrödinger Operators with Square Summable Potentials

P. Deift1, R. Killip2

1 Courant Institute, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]
2 Mathematics 253-37, Caltech, Pasadena, CA 91125, USA. E-mail: [email protected]

Received: 30 October 1998 / Accepted: 16 November 1998

Abstract: For continuous and discrete one-dimensional Schrödinger operators with square summable potentials, the absolutely continuous part of the spectrum is essentially supported by [0, ∞) and [−2, 2] respectively. This fact is proved by considering a priori estimates for the transmission coefficient.

1. Introduction

In this paper we consider the absolutely continuous spectrum of one-dimensional Schrödinger operators with potentials decaying slowly at infinity. For example, we consider the half-line operators with Dirichlet boundary conditions:

\[ H_D\psi(x) = -\frac{d^2\psi}{dx^2} + q(x)\psi(x) \quad \text{on } L^2([0,\infty)) \text{ with } \psi(0) = 0 \tag{1} \]

and

\[ h_D\psi(n) = \begin{cases} \psi(n+1) + \psi(n-1) + q(n)\psi(n) & n = 1, 2, \ldots \\ \psi(1) + q(0)\psi(0) & n = 0. \end{cases} \tag{2} \]

Recent papers by Christ, Kiselev and Remling [1,2,11] have shown that for potentials obeying $|q(x)| \le C(1+|x|)^{-1/2-\epsilon}$, $\epsilon > 0$, the a.c. spectrum of $H_D$ (resp. $h_D$) is essentially supported by [0, ∞) (resp. [−2, 2]). By essentially supported we mean that every subset of positive Lebesgue measure has positive measure with respect to some spectral measure. Note that the spectrum may be very far from purely absolutely continuous: Naboko [8] and Simon [13] constructed potentials with $xq(x) \to \infty$ arbitrarily slowly for which $H_D$ has point spectrum dense in [0, ∞). Our purpose here is to prove

Theorem 1. If $q \in L^2$ then the a.c. spectrum of $H_D$ is essentially supported by [0, ∞). In the discrete case, $q \in \ell^2$ implies that $h_D$ has a.c. spectrum essentially supported by [−2, 2].
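To make the discrete operator (2) concrete, the sketch below applies $h_D$ to a finite stretch of a sequence and checks the free case $q = 0$: the sequence $\psi(n) = \sin((n+1)\theta)$ satisfies $h_D\psi = 2\cos\theta\,\psi$ at every site including $n = 0$, so the generalized eigenvalues $2\cos\theta$ sweep out exactly the interval [−2, 2] that appears in Theorem 1. The function name, the truncation length and the sample θ are illustrative choices of ours.

```python
import math

def apply_hD(psi, q):
    """Apply the half-line operator (2) to psi(0), ..., psi(N-1).

    The last site is omitted from the output because it would need psi(N).
    """
    out = []
    for n in range(len(psi) - 1):
        left = psi[n - 1] if n >= 1 else 0.0  # no psi(-1) term at n = 0
        out.append(psi[n + 1] + left + q[n] * psi[n])
    return out

theta = 0.9
N = 50
psi = [math.sin((n + 1) * theta) for n in range(N)]
q = [0.0] * N                      # free case
h_psi = apply_hD(psi, q)
energy = 2 * math.cos(theta)
residual = max(abs(h - energy * p) for h, p in zip(h_psi, psi))
```

These sequences are not square summable, which is the usual signature of continuous rather than point spectrum.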


This result is of interest for two reasons in particular: first, it is optimal in the sense that the result fails in any $L^p$ (or $\ell^p$) space with $p > 2$; secondly, the methods used here are far simpler than those previously employed. That this result is optimal in terms of $L^p$-type decay was demonstrated by Kiselev, Last and Simon [7]. Indeed their analysis of sparse ‘bump’ potentials shows that there exist potentials decaying to zero at infinity and belonging to $\bigcap_{p>2} L^p$ […]

$$H\psi = -\frac{d^2\psi}{dx^2} + q(x)\psi(x) \quad\text{on } L^2(\mathbb{R}), \tag{3}$$

$$h\psi(n) = \psi(n+1) + \psi(n-1) + q(n)\psi(n) \quad\text{on } \ell^2(\mathbb{Z}), \tag{4}$$
have a.c. spectrum of multiplicity two essentially supported by [0, ∞) (resp. [−2, 2]).

The next section introduces the main object of our analysis, the transmission coefficient, and derives the basic estimates for this quantity. Section 3 uses the Weyl m-function and its relationship to the transmission coefficient to prove Theorem 1. Of course, by Weyl’s (relative) compactness theorem, $\sigma_{ess}(H) = [0, \infty)$ and $\sigma_{ess}(h) = [-2, 2]$, and so it is sufficient to limit our considerations to these two intervals respectively.

2. The Transmission Coefficient

Throughout this section we shall assume q to be continuous on R and supported on a compact subset of [0, ∞). For such q the following are well known:

1. In the continuum case, for every $k \in \mathbb{C}$ there exists a solution to

$$-\psi''(x) + q(x)\psi(x) = k^2\psi(x) \quad \forall x \in \mathbb{R} \tag{5}$$

such that $\psi(x) = e^{ikx}$ for all x to the right of the support of q. Moreover for each x, $\psi(x)$ and $\psi'(x)$ are analytic functions of k.

2. In the discrete case, for every $k \in \mathbb{C} \setminus \{0\}$ there exists a solution to

$$\psi(n+1) + \psi(n-1) + q(n)\psi(n) = \left(k + \tfrac{1}{k}\right)\psi(n) \quad \forall n \in \mathbb{Z} \tag{6}$$

such that $\psi(n) = k^n$ for all n to the right of the support of q. Again $\psi(n)$ is an analytic function of $k \in \mathbb{C} \setminus \{0\}$ for each n.

To the left of the support of q, ψ must satisfy the free Schrödinger equation and so take the form

$$\psi(x) = ae^{ikx} + be^{-ikx} \ \text{(continuum)}, \qquad \psi(n) = ak^n + bk^{-n} \ \text{(discrete)}.$$


As ψ depends analytically on k so do a and b. Since $e^{\pm ikx}$, $k^{\pm n}$ represent waves propagating to the right/left in the time dependent picture, it is natural to term $t = 1/a$ and $r = b/a$ the transmission and reflection coefficients respectively. The proof of Theorem 1 rests on the estimates

$$\int_{\mathbb{R}} \log\left|\frac{1}{t}\right| k^2\,dk \le \frac{\pi}{8}\int q^2(x)\,dx \quad\text{(continuum)}, \tag{7}$$

$$\int_{-\pi}^{\pi} \log\left|\frac{1}{t}\right| \sin^2(\theta)\,d\theta \le \frac{\pi}{4}\sum_n q(n)^2 \quad\text{(discrete)}, \tag{8}$$

which are by no means new. They follow from well-known inverse scattering identities (10), (13) which are true for more general potentials: $\int |q(x)|(1+|x|)\,dx < \infty$ (or $\sum |q(n)|(1+|n|) < \infty$) is sufficient. See, for example, [4]. These identities are not so well known in the Schrödinger operator community and so, for the convenience of the reader, we include proofs under the mild simplification that q is continuous of compact support.

2.1. Continuum. We begin by rewriting (5) as the integral equation

$$v(x) = 1 + \frac{i}{2k}\int_x^{\infty}\left[1 - e^{2ik(y-x)}\right] q(y)v(y)\,dy, \tag{9}$$

where $v(x) = e^{-ikx}\psi(x)$. For Im(k) ≥ 0 and |k| > 0 this may be solved by repeated substitution. In turn this gives

$$a = 1 + \frac{i}{2k}\int q(x)\,dx - \frac{1}{8k^2}\iint\left[1 - e^{2ik|x-y|}\right] q(x)q(y)\,dx\,dy + O(k^{-3}).$$

Let $\gamma_R$ denote the anticlockwise contour parameterized by $k(\theta) = Re^{i\theta}$, θ ∈ [0, π]. In light of the above

$$\lim_{R\to\infty}\int_{\gamma_R} \log(a)\,k^2\,dk = -\frac{\pi}{8}\int q(x)^2\,dx.$$

Since a may have zeros, log(a) need not be analytic in the upper half-plane. However, at any point Im(k) > 0 where a(k) vanishes, (5) must have an $L^2$ solution. That is, a(k) = 0 precisely when $k^2$ is a negative eigenvalue of the whole line Schrödinger equation (5). Since q is compactly supported there are only finitely many such eigenvalues, indeed no more than $1 + \int |xq(x)|\,dx$; refer to [12]. Let $\{i\beta_m\}$ enumerate the (necessarily simple) zeros of a and define the corresponding Blaschke product for the upper half-plane

$$B = \prod_m \frac{k - i\beta_m}{k + i\beta_m}.$$

So log(Bt) = −log(a/B) is analytic in the upper half-plane. Because $a(-k) = \overline{a(k)}$ and $B(-k) = \overline{B(k)}$ (for k real), Im[log(Bt)] is an odd function of k ∈ R. Thus

$$\int_{\mathbb{R}} \log\left|\frac{1}{t}\right| k^2\,dk = \lim_{R\to\infty}\int_{\gamma_R}\left[-\log(a) + \log(B)\right] k^2\,dk = \frac{\pi}{8}\int q(x)^2\,dx - \frac{2\pi}{3}\sum_m \beta_m^3, \tag{10}$$

which provides the desired estimate (7).
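As a rough numerical sanity check of (7) and (10), one can take a rectangular barrier q = V0 on [0, L] with V0 > 0. The whole-line operator then has no negative eigenvalues, so (10) predicts equality in (7). The sketch below is not part of the argument: it assumes the standard textbook formula for |t|² of a rectangular barrier (in units where the equation reads −ψ'' + qψ = k²ψ), and the names `log_inv_t`, `trace_integral`, `V0`, `L`, `k_max` and `steps` are illustrative choices. Because the integrand is non-negative, truncating the k-integral can only lower it, so the computed value should fall slightly below (π/8)·V0²·L.

```python
import math

def log_inv_t(k, V0, L):
    """log|1/t(k)| for the barrier q = V0 on [0, L] (assumed textbook formula for |t|^2)."""
    E = k * k
    d = E - V0
    if abs(d) < 1e-12:
        s2 = L * L                                  # limit of (sin(kappa L)/kappa)^2 as kappa -> 0
    elif d > 0:
        kappa = math.sqrt(d)
        s2 = (math.sin(kappa * L) / kappa) ** 2     # energy above the barrier
    else:
        kappa = math.sqrt(-d)
        s2 = (math.sinh(kappa * L) / kappa) ** 2    # tunnelling regime
    inv_T = 1.0 + V0 * V0 * s2 / (4.0 * E)          # 1/|t|^2
    return 0.5 * math.log(inv_T)

def trace_integral(V0, L, k_max=200.0, steps=200_000):
    """Midpoint-rule approximation of the integral of log|1/t| k^2 over [-k_max, k_max]."""
    h = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += log_inv_t(k, V0, L) * k * k * h
    return 2.0 * total                              # the integrand is even in k

V0, L = 1.0, 1.0
lhs = trace_integral(V0, L)
rhs = math.pi / 8.0 * V0 ** 2 * L                   # (pi/8) times the integral of q^2
print(lhs, rhs)
```

For a well (V0 < 0) the operator has bound states, and the same computation should then show a strict gap, in line with the sum over the zeros in (10).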


2.2. Discrete. In the discrete case one may rewrite (6) as

$$v(n) = 1 - \frac{k}{k^2-1}\sum_{m=n}^{\infty}\left[1 - k^{2(n-m)}\right] q(m)v(m) \tag{11}$$

with $v(n) = k^n\psi(n)$. For |k| > 1 one may solve this by repeated substitution, from which

$$a = 1 - \frac{1}{k}\sum q(n) + \frac{1}{2k^2}\sum\sum\left[1 - k^{-2|m-n|}\right] q(m)q(n) + O(k^{-3}). \tag{12}$$

Just as above a(k) has zeros in {|k| > 1} at precisely those points for which k + 1/k is an eigenvalue of (6). Let $\{\beta_m\}$ enumerate these zeros. As h is self-adjoint, these $\beta_m$ must be real. Since q is compactly supported, $u(n) \mapsto q(n)u(n)$ is a finite rank operator, hence there are only finitely many such $\beta_m$ [6]. We introduce the Blaschke product

$$B = \prod_m \frac{\beta_m - k}{1 - \beta_m k}\,\frac{\beta_m}{|\beta_m|}.$$

If $k = e^{i\theta}$ then Im[log(Bt)] is an odd function of θ and so, by (12),

$$\frac{2}{\pi}\int_{-\pi}^{\pi} \log\left|\frac{1}{t}\right| \sin^2(\theta)\,d\theta = \frac{1}{2\pi i}\oint_{|k|=1} \log(B/a)\,\frac{(k^2-1)^2}{k^3}\,dk = \frac{1}{2}\sum_n q(n)^2 + \sum_m \log(\beta_m^2) - \sum_m \frac{\beta_m^2 - \beta_m^{-2}}{2}, \tag{13}$$

where the integral dk is taken anticlockwise. As $\beta^2 - \beta^{-2} - \log(\beta^4) > 0$ for β > 1, we have the crucial estimate for the discrete case (8).

3. Application to Spectral Theory

We now return our attention to half-line Schrödinger operators: (1) and (2). The central tool in the spectral analysis of such operators is the Weyl m-function [3,5]. The spectral significance of the m-function is the representation

$$m(z) = A + \int_{\mathbb{R}}\left[\frac{1}{E - z} - \frac{E}{1 + E^2}\right] d\mu(E) \quad\text{for } z \notin \operatorname{supp}(\mu) \tag{14}$$

with $A \in \mathbb{C}^+$ and μ a locally finite measure such that $H_D$ (or $h_D$) is unitarily equivalent to multiplication by E in $L^2(\mathbb{R}; d\mu)$. In fact m(z) has a simple expression in terms of the Green’s function: $m(z) = \frac{\partial^2 G_z}{\partial x\,\partial y}(0,0)$ in the continuum case and $m(z) = G_z(0,0)$ in the discrete case. As such, it can be recovered from the solutions ψ of (5) and (6) provided q is of compact support. In the continuum case $m(k^2) = \psi'(0)/\psi(0)$ for Im(k) > 0, Re(k) ≠ 0, and $m(k + k^{-1}) = -\psi(0)/\psi(-1)$ for |k| > 1, Re(k) ≠ 0 in the discrete case. As mentioned in Sect. 2, ψ depends analytically on k ∈ C (or k ∈ C \ {0} in the discrete case), so the above definitions meromorphically continue m. Standard techniques show that (14) implies

$$\operatorname*{weak-lim}_{\epsilon\to 0}\ \operatorname{Im}[m(E + i\epsilon)]\,dE = \pi\,d\mu(E). \tag{15}$$


For q of compact support $H_D$ (resp. $h_D$) has purely absolutely continuous spectrum in [0, ∞) (resp. [−2, 2]); see, for instance, Ch. XIII §8, 13 of [10]. Hence (15) becomes

$$d\mu(E) = \frac{1}{\pi}\operatorname{Im}[m(E + i0)]\,dE$$

for E ∈ [0, ∞) or [−2, 2]. By m(E + i0) we mean the limiting value for m(E + iε) as ε → 0. Since m has a meromorphic continuation in k this is equivalent to taking the corresponding value for k.

Thus far we have considered only potentials q which are continuous and of compact support. To prove Theorem 1 for general $L^2$ (or $\ell^2$) potentials we require two lemmas regarding limits.

Lemma 1. If $q_n \to q$ in $L^2$ (or $\ell^2$) then the corresponding Weyl m-functions converge uniformly on compact subsets of the upper half-plane. And so $\mu_n$ converges weakly to μ, $\mu_n \rightharpoonup \mu$.

Proof. This follows from the relation between the m-function and the Green’s function. For Im(z) > 0 the second resolvent equation implies that the Green’s function (and the requisite derivatives) converge uniformly on compact subsets of C \ R. That $\mu_n \rightharpoonup \mu$ then follows by the Stone–Weierstrass Theorem. □

Lemma 2. Let Ω be an open subset of R and let $w(x) \in L^1_{loc}(\Omega)$ be a.e. positive. If $\mu_n$ is a sequence of positive measures which converge weakly to μ then

$$\varliminf_n \frac{1}{w(K)}\int_K -\log\left(\frac{d\mu_n}{dx}\,\frac{1}{w(x)}\right) w(x)\,dx \ \ge\ \log\frac{w(K)}{\mu(K)} \tag{16}$$

for each compact K ⊂ Ω of positive Lebesgue measure. By w(K) we mean $\int_K w(x)\,dx$.

Proof. Let K be fixed and let Φ denote the set of positive continuous functions of compact support which take the value 1 on K. Since μ is regular,

$$\mu(K) = \inf_{\phi\in\Phi}\int \phi\,d\mu = \inf_{\phi\in\Phi}\lim_n \int \phi\,d\mu_n \ \ge\ \inf_{\phi\in\Phi}\varlimsup_n \int \phi\,\frac{d\mu_n}{dx}\,dx \ \ge\ \varlimsup_n \int_K \frac{d\mu_n}{dx}\,\frac{1}{w(x)}\,w(x)\,dx.$$

Dividing by w(K) and applying g(t) = −log(t) to both sides, (16) follows from Jensen’s inequality. Note that because g is decreasing the inequality is reversed and $\varlimsup$ becomes $\varliminf$. □

At last we come to the

Proof of Theorem 1. Let $\{q_n\}$ be a sequence of continuous functions of compact support which converge to q in $L^2$. To each such potential we shall associate its m-function $m_n$, m and corresponding measure $\mu_n$, μ. From our foregoing remarks, Theorem 1 can be proved by demonstrating that μ(K) > 0 for all compact sets K ⊆ (0, ∞) (or (−2, 2) in the discrete case) of positive Lebesgue measure. From Lemmas 1 & 2 we see that it is sufficient to prove that

$$\int_K -\log\left(\frac{d\mu_n}{dE}\,\frac{1}{w(E)}\right) w(E)\,dE = \int_K -\log\left(\frac{\operatorname{Im} m_n(E + i0)}{\pi\,w(E)}\right) w(E)\,dE \tag{17}$$


is bounded above (uniformly in n) for some positive weight function. At this point the proofs for the continuum and discrete cases diverge slightly. We shall treat them separately.

Continuum. As $q_n$ is continuous of compact support, the m-function is just $\psi'(0)/\psi(0)$ and so

$$m_n(k^2) = ik\,\frac{a_n(k) - b_n(k)}{a_n(k) + b_n(k)} = ik\,\frac{1 - r_n(k)}{1 + r_n(k)}$$

or conversely $r_n = (ik - m_n)/(ik + m_n)$. Thus for $k^2 \in K$, k > 0 (which corresponds to $E = k^2 + i0$)

$$|t_n|^2 = 1 - |r_n|^2 = \frac{4k\operatorname{Im} m_n}{|m_n + ik|^2} \le \frac{4\operatorname{Im} m_n}{k}.$$

The last inequality follows from the fact that Im(m) ≥ 0 for such k; this follows from (14). Thus choosing weight function $w(E) = \sqrt{E}/4\pi$ we have

$$\begin{aligned}
\text{RHS}(17) &= \frac{1}{4\pi}\int_K -\log\left(\frac{4\operatorname{Im} m_n}{\sqrt{E}}\right)\sqrt{E}\,dE \\
&\le \frac{1}{\pi}\int_{k^2\in K,\,k>0} \log\left|\frac{1}{t_n}\right| k^2\,dk \\
&\le \frac{1}{\pi}\int_{\mathbb{R}} \log\left|\frac{1}{t_n}\right| k^2\,dk \\
&\le \frac{1}{8}\int q_n^2\,dx,
\end{aligned}$$

which proves that (17) is indeed uniformly bounded.

Discrete. As in the continuous case we may write m in terms of r for compactly supported potentials:

$$r_n = k^{-1}\,\frac{1 + k m_n}{m_n + k}.$$

So for $k = e^{i\theta}$, θ ∈ (0, π), which corresponds to E = 2 cos(θ) + i0,

$$|t|^2 = 1 - |r|^2 = \frac{4\sin(\theta)\operatorname{Im}(m_n)}{|m_n + k|^2} \le \frac{4\operatorname{Im}(m_n)}{\sin(\theta)}.$$

As before the final inequality follows because Im m ≥ 0 for such k. Define $\tilde{K} = \{\theta \in (-\pi, \pi) : 2\cos(\theta) \in K\}$. Then with weight function $w(E) = \sqrt{4 - E^2}/8\pi$,

$$\begin{aligned}
\text{RHS}(17) &= \frac{1}{2\pi}\int_{\tilde{K}} -\log\left(\frac{4\operatorname{Im} m_n}{\sin(\theta)}\right)\sin^2(\theta)\,d\theta \\
&\le \frac{1}{2\pi}\int_{\tilde{K}} \log\left|\frac{1}{t_n}\right| \sin^2(\theta)\,d\theta \\
&\le \frac{1}{2\pi}\int_{-\pi}^{\pi} \log\left|\frac{1}{t_n}\right| \sin^2(\theta)\,d\theta \\
&\le \frac{1}{2}\sum q_n^2.
\end{aligned}$$

Thus (17) is bounded uniformly in n. □
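The discrete estimate (8) used in the final step can be checked numerically in the simplest case, a potential supported at a single site: q(0) = λ and q(n) = 0 otherwise. The sketch below rests on two assumptions that come from a short hand computation (matching the solution at site 0) rather than from the text: that |a|² = 1 + λ²/(4 sin²θ) on k = e^{iθ}, and that for λ > 0 the only zero of a in {|k| > 1} is the root β > 1 of β − 1/β = λ. The function names are illustrative. It compares the left side of (8) with the value predicted by (13) and with the bound (π/4)λ².

```python
import math

def lhs_of_8(lam, steps=200_000):
    """Midpoint-rule value of the integral over (-pi, pi) of log|1/t| sin^2(theta)."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi + (i + 0.5) * h
        s2 = math.sin(theta) ** 2
        # log|1/t| = (1/2) log|a|^2 with the assumed |a|^2 = 1 + lam^2 / (4 sin^2 theta)
        total += 0.5 * math.log(1.0 + lam * lam / (4.0 * s2)) * s2 * h
    return total

def predicted_by_13(lam):
    """Right side of (13), times pi/2, for one bound state with beta - 1/beta = lam > 0."""
    beta = (lam + math.sqrt(lam * lam + 4.0)) / 2.0
    bracket = 0.5 * lam ** 2 + math.log(beta ** 2) - (beta ** 2 - beta ** -2) / 2.0
    return 0.5 * math.pi * bracket

lam = 1.0
numeric = lhs_of_8(lam)
print(numeric, predicted_by_13(lam), math.pi / 4.0 * lam ** 2)
```

Since a single-site potential always binds a state, the inequality in (8) should be strict here, with the gap given by the β-terms of (13).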


Acknowledgements. The authors would like to thank Barry Simon for initiating their collaboration. The work of the first author was supported in part by NSF Grant #DMS-9500867.

References

1. Christ, M., Kiselev, A., Remling, C.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials. Math. Res. Lett. 4, no. 5, 719–723 (1997)
2. Christ, M., Kiselev, A.: Absolutely continuous spectrum for one-dimensional Schrödinger operators with slowly decaying potentials: some optimal results. J. Am. Math. Soc. 11, 771–797 (1998)
3. Coddington, E. A., Levinson, N.: Theory of ordinary differential equations. New York–Toronto–London: McGraw-Hill Book Company, Inc., 1955, MR 16, 1022b
4. Faddeev, L. D., Takhtajan, L. A.: Hamiltonian methods in the theory of solitons. Translated from the Russian by A. G. Reĭman, Berlin: Springer, 1987, MR 89m:58103
5. Gesztesy, F., Simon, B.: m-functions and inverse spectral analysis for finite and semi-infinite Jacobi matrices. J. Anal. Math. 73, 267–297 (1997)
6. Kato, T.: Perturbation theory for linear operators. Reprint of the 1980 edition, Berlin: Springer, 1995; MR 96a:47025
7. Kiselev, A., Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators. Commun. Math. Phys. 194, 1–45 (1998)
8. Naboko, S. N.: On the dense point spectrum of Schrödinger and Dirac operators. Teoret. Mat. Fiz. 68, no. 1, 18–28 (1986); MR 88h:81029
9. Reed, M., Simon, B.: Methods of modern mathematical physics. I. Functional analysis. New York–London: Academic Press, 1972
10. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York–London: Academic Press, 1978
11. Remling, C.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials. Commun. Math. Phys. 193, no. 1, 151–170 (1998)
12. Simon, B.: An introduction to the self-adjointness and spectral analysis of Schrödinger operators. Acta Phys. Aust. 17, 19–42 (1977)
13. Simon, B.: Some Schrödinger operators with dense point spectrum. Proc. Am. Math. Soc. 125, no. 1, 203–208 (1997); MR 97c:34179

Communicated by B. Simon

Commun. Math. Phys. 203, 349 – 364 (1999)


© Springer-Verlag 1999

Spin Holonomy of Einstein Manifolds

Brett McInnes

Department of Mathematics, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Republic of Singapore. E-mail: [email protected]

Received: 1 September 1997 / Accepted: 28 November 1998

Abstract: Berger’s Theorem classifies the linear holonomy groups of irreducible, simply connected Riemannian manifolds. For physical applications, however, it is at least as important to have a classification of the possible spin holonomy groups (defined by the parallel transport of spinors) of non-simply-connected manifolds. We establish a complete classification of the spin holonomy groups of all compact, locally irreducible, Einstein Riemannian spin manifolds of non-negative scalar curvature.

1. Introduction

The recent results of Joyce [7, 8] complete the answer to one of the basic questions in Riemannian geometry: which subgroups of the orthogonal group O(n) can occur as linear holonomy groups of compact, irreducible, simply connected, n-dimensional Riemannian manifolds? Some progress has also been made in the non-simply-connected case; for example, it is known that precisely seven subgroups of O(4) occur as linear holonomy groups of compact locally irreducible Riemannian four-manifolds, and that there are exactly eight possibilities in six dimensions [13, 14].

All of these results pertain, however, to linear holonomy groups, defined by the parallel transport of vectors. In applications to physics, one would expect the parallel transport of spinors to be more relevant. On a connected Riemannian spin manifold with a specified spin structure [10], the parallel transport of spinors around closed (not necessarily contractible) loops does indeed define a spin holonomy group, a certain subgroup of Spin(n), where henceforth n denotes the (real) dimension of the manifold. Spin holonomy groups have in fact already appeared in physics: for when the SU(3) “holonomy group” of a Calabi–Yau string compactification [5] is embedded in E8 (that is, when the “spin connection is embedded in the gauge group”), this is done through a certain Spin(6) subgroup of E8, not through O(6).
Evidently SU(3) is being treated as a spin holonomy group, and not as a linear holonomy group, in this case. In fact, the “holonomy groups” discussed in the physics literature are usually spin holonomy groups. The distinction between linear and spin holonomy, while purely global, is by no means physically insignificant. For example, the disconnected group Z2 × SU(3) occurs as a spin holonomy group over some Calabi–Yau manifolds, though it cannot occur as a linear holonomy group. Note that if Z2 × SU(3) is embedded in E8, the latter is broken to [U(1) × Spin(10)]/Z4, not E6.

In this work, we explain the techniques needed to classify the spin holonomy groups of all compact, locally irreducible, Einstein spin manifolds of non-negative scalar curvature. Explicit results are given for dimensions n ≤ 10, but the reader can readily extend the results to any specific higher dimension. The results are almost optimal, in the sense that, with just one exception, we can give examples for each class. The main result is the following.

Theorem 1. Let G be a subgroup of Spin(n), n ≤ 10, and let M be a compact locally irreducible Einstein spin manifold of non-negative scalar curvature and real dimension n. Then G is a spin holonomy group over such an M only if G is conjugate in Spin(n) to one of the following groups:

n = 2: Spin(2),
n = 3: Spin(3),
n = 4: Spin(3), Spin(4),
n = 5: Spin(5),
n = 6: SU(3), Z2 × SU(3), U(3), Spin(6),
n = 7: G2, Z2 × G2, Spin(7),
n = 8: Spin(5), Spin(6), Spin(7), Spin(8), Spin(2) · Spin(3) · Spin(3), Spin(2) · Spin(6), Spin(3) · Spin(3), Spin(3) · Spin(5), Z3 × Spin(5), [Spin(2) · Spin(3) · Spin(3)] ⋊ Z2, Spin(6) ⋊ Z2, [Spin(2) · Spin(6)] ⋊ Z2, PSU(3), Z2 × PSU(3),
n = 9: Spin(9),
n = 10: SU(5), Z2 × SU(5), U(5), Spin(10), SO(5), Z2 × SO(5).

Furthermore, if G is conjugate in Spin(n) to any of these groups with the possible exception of Z3 × Spin(5), then G is a spin holonomy group over such an M.

Notice that we do not assume that M is simply connected. The dot denotes a direct product factored by Z2, while ⋊ denotes a semi-direct product. We give the proof in four sections: for n ≤ 5, n = 6 or 7, n = 8, and n = 9 or 10.
The reader may find the following general remarks useful. The basic strategy is to take the lists of possible linear holonomy groups given in Ref. [13], and to compute the subgroups of Spin(n) which project onto those subgroups of SO(n). (This case-by-case approach is reasonable because of the highly non-trivial fact [13] that only finitely many subgroups of O(n) occur as linear holonomy groups of compact locally irreducible Riemannian n-manifolds.) Note that both linear and spin holonomy groups over non-simply-connected manifolds can be disconnected Lie groups, and this can cause these computations to be quite delicate. At the same time, we must ask whether there are in fact any spin manifolds in a given linear holonomy class: sometimes there is no such manifold of the desired kind. Finally, the reader should bear in mind the fact that the spin holonomy group is defined relative to a particular choice of spin structure. We must therefore be prepared for the possibility that a given Riemannian manifold could have distinct spin holonomy groups, depending on this choice. We now proceed to the proof.
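To make the phrase “subgroups of Spin(n) which project onto those subgroups of SO(n)” concrete in the smallest interesting case, the following sketch realises Spin(3) as the unit quaternions acting on R^3 by conjugation. This is the standard model of the double cover Spin(3) → SO(3), not a construction taken from Ref. [13], and the helper names are illustrative. It checks that q and −q give the same rotation, and that the pre-image of the order-two subgroup generated by the half-turn about the z-axis is cyclic of order four rather than a product with Z2.

```python
def qmul(p, q):
    """Hamilton product of quaternions written as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qneg(q):
    return tuple(-x for x in q)

def rotation(q):
    """3x3 matrix of v -> q v q^{-1} on imaginary quaternions (q a unit quaternion)."""
    basis = ((0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))
    images = [qmul(qmul(q, e), qconj(q))[1:] for e in basis]   # images[j] = image of e_j
    return tuple(tuple(images[j][i] for j in range(3)) for i in range(3))

def order(q, identity=(1, 0, 0, 0), bound=16):
    """Smallest n >= 1 with q^n = 1, searched up to a small bound."""
    power = q
    for n in range(1, bound + 1):
        if power == identity:
            return n
        power = qmul(power, q)
    return None

k = (0, 0, 0, 1)
half_turn = rotation(k)
print(half_turn)
print(rotation(qneg(k)) == half_turn)
print(order(k))
```

The same phenomenon, on a larger scale, is what makes pre-images such as Z4 × Spin(3) appear in the four-dimensional discussion.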


2. Dimensions ≤ 5

Recall [10] that a spin manifold is an orientable Riemannian manifold M such that SO(M), a bundle of oriented orthonormal frames over M, has a non-trivial double cover which is also a Spin(n) bundle over M, where n is the dimension of M. Such a double cover of SO(M) is called a spin structure, and M has such a structure if and only if its second Stiefel–Whitney class is zero. Spin structures need not be unique; they are counted by the cohomology group H^1(M, Z2), which is easily computed from the fundamental group of M.

Let Spin(M) be a spin structure over M. Because Spin(n) and SO(n) are locally isomorphic, the pull-back of the Levi-Civita connection (regarded as a one-form ω_L on SO(M)) defines a connection ω_D, the Dirac connection, on Spin(M). Clearly ω_D pushes forward to ω_L, and so, by the standard connection mapping theorem [9], the holonomy group of ω_D projects onto the linear holonomy group under the canonical projection Spin(n) → SO(n). That is, the spin holonomy group is contained in the pre-image of the linear holonomy group in Spin(n). The first step in the proof is therefore to consider the list of possible linear holonomy groups and to study the structure of the projection Spin(n) → SO(n).

For n = 2 or 3 this is trivial: the linear holonomy group of any orientable locally irreducible manifold is SO(2) or SO(3), and so every spin holonomy group is isomorphic to Spin(2) or Spin(3) respectively. Note that every orientable Riemannian manifold of dimension less than four is spin.

In four dimensions there are seven possible linear holonomy groups [14], of which six occur in the orientable case. These six are U(2), U(2) ⋊ Z2, SO(4), SU(2), Z4 · SU(2), and Q8 · SU(2). Here U(2) is embedded in SO(4) as follows: if A + iB is any (complex) matrix in U(2), map it to

$$\begin{pmatrix} A & -B \\ B & A \end{pmatrix}$$

in SO(4). Conjugation by the SO(4) matrix diag(1, 1, −1, −1) maps U(2) into itself, and so we can form the semi-direct product U(2) ⋊ Z2 as a subgroup of SO(4).
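The embedding of U(2) in SO(4) just described is easy to test numerically. The sketch below is only an illustration: the sample matrix U (a phase times an SU(2) matrix) and all helper names are arbitrary choices, not taken from the paper. It checks that the real 4×4 image is orthogonal with determinant 1, and that conjugation by diag(1, 1, −1, −1) corresponds to complex conjugation on U(2), which is why it preserves the image.

```python
import cmath
import math

def embed(U):
    """Send the 2x2 complex matrix U = A + iB to the real 4x4 matrix [[A, -B], [B, A]]."""
    A = [[z.real for z in row] for row in U]
    B = [[z.imag for z in row] for row in U]
    top = [A[i] + [-b for b in B[i]] for i in range(2)]
    bottom = [B[i] + A[i] for i in range(2)]
    return top + bottom

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det(X):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

def close(X, Y, tol=1e-12):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

s = 1.0 / math.sqrt(2.0)
phase = cmath.exp(0.3j)                      # puts U in U(2) but outside SU(2)
U = [[phase * s, phase * s * 1j], [phase * s * 1j, phase * s]]
U_bar = [[z.conjugate() for z in row] for row in U]
R = embed(U)
I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
D = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

print(close(matmul(R, transpose(R)), I4))            # orthogonality of the image
print(abs(det(R) - 1.0) < 1e-12)                     # determinant one
print(close(matmul(matmul(D, R), D), embed(U_bar)))  # conjugation by D acts as U -> conjugate of U
```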
Next, the cyclic group Z4 and the quaternionic group of order 8, Q8, embed in SU(2) in a natural way [23]; therefore if we regard [4] SO(4) as [SU(2) × SU(2)]/Z2 = SU(2) · SU(2), it is clear that Z4 · SU(2) and Q8 · SU(2) are subgroups of SO(4).

The pre-image of SO(4) itself is of course Spin(4), which is the spin holonomy group of the unique spin structure on the four-sphere S^4. The pre-image of U(2), which may be written as U(1) · SU(2), is U(1) × SU(2). This group does not occur, however, as the spin holonomy group of any compact locally irreducible Einstein spin four-manifold of non-negative scalar curvature. To see this recall first that the restricted holonomy group [2] of a locally Kählerian manifold must be contained in SU(n) in the Ricci-flat case, and so we need only consider the positive case. Thus we take M to be a compact irreducible Kähler–Einstein 4-manifold with a positive first Chern class [2]. Such manifolds have been classified: as a result of the work of Matsushima [11], Yau [24], and Tian [20], one knows that M must be CP^2 blown up at r points in general position, 3 ≤ r ≤ 8 (and that these manifolds, denoted Σ_r, do admit Kähler–Einstein metrics). The first Chern classes can therefore be examined, and, since [10] the second Stiefel–Whitney class is the mod 2 reduction of the first Chern class, one can verify directly that none of the Σ_r, 3 ≤ r ≤ 8, is a spin manifold. However, the following approach is more instructive. The Dolbeault cohomology of a blown-up manifold is related in a simple way to that of the original manifold [6], and so we can evaluate the Hodge numbers as follows: h^{0,0} = h^{2,2} = 1, h^{1,1} = 1 + r, h^{i,j} = 0, i ≠ j. The signature [10] is therefore

$$\tau(\Sigma_r) = \sum_{p,q} (-1)^q h^{p,q} = 1 - r, \quad 3 \le r \le 8.$$
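The signature computation and the divisibility test used with it can be written out mechanically. In the sketch below the function names are illustrative, and the Hodge numbers of a K3 surface are the standard values, included only for comparison (they are not listed in the text). The check applied is the necessary condition from Rochlin’s theorem: the signature of a compact spin four-manifold is a multiple of 16.

```python
def signature(hodge):
    """Signature of a compact Kaehler surface as the sum over (p, q) of (-1)^q h^{p,q}."""
    return sum((-1) ** q * h for (p, q), h in hodge.items())

def hodge_blown_up_cp2(r):
    """Non-zero Hodge numbers of CP^2 blown up at r points, as quoted in the text."""
    return {(0, 0): 1, (1, 1): 1 + r, (2, 2): 1}

# Standard Hodge numbers of a K3 surface (an outside fact, used here for comparison).
HODGE_K3 = {(0, 0): 1, (2, 0): 1, (1, 1): 20, (0, 2): 1, (2, 2): 1}

def passes_rochlin(tau):
    """Necessary condition for a compact spin 4-manifold: 16 divides the signature."""
    return tau % 16 == 0

for r in range(3, 9):
    tau = signature(hodge_blown_up_cp2(r))
    print(r, tau, passes_rochlin(tau))
print("K3", signature(HODGE_K3), passes_rochlin(signature(HODGE_K3)))
```

Passing the test does not by itself show that a manifold is spin; failing it rules the spin property out.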

As this is not a multiple of 16, the fact that Σ_r is not spin now follows from Rochlin’s theorem [10]. Thus U(1) × SU(2) must be removed from our list.

The manifolds of linear holonomy U(2) ⋊ Z2 are not Kähler manifolds: they are obtained from Kähler manifolds of holonomy U(2) by taking the quotient by finite groups acting freely but not holomorphically. For compact locally irreducible Einstein manifolds, the relevant index theorem therefore expresses the signatures as quotients of 1 − r, with r as above, and again Rochlin’s theorem implies that none of these manifolds is spin.

We now turn to the linear holonomy groups with identity component isomorphic to SU(2). For SU(2) itself, it is well known that the manifold must be [2] a K3 surface with a Ricci-flat Yau metric. Similarly, for Z4 · SU(2) the manifold must be an Enriques surface [14], and for Q8 · SU(2) it must be a Hitchin manifold [14]. However, Enriques surfaces and Hitchin manifolds are not spin manifolds. We shall give two very different proofs of this. First, as is well known, the signature of the underlying manifold [2] of a K3 surface is −16, and so, as the Enriques surfaces and Hitchin manifolds are non-trivially covered by K3, Rochlin’s theorem gives the result. Second, the pre-image of Z4 · Spin(3) in Spin(4) is Z4 × Spin(3), and no proper subgroup of Z4 × Spin(3) projects onto Z4 · SU(2); therefore, if an Enriques surface were spin, its spin holonomy group (with respect to either spin structure) would have to be Z4 × Spin(3), with four connected components. Now it is easy to see that the order of the fundamental group of any manifold must be at least as large as the number of connected components of the holonomy group of any connection on any principal fibre bundle over that manifold; but the fundamental group of an Enriques surface is of order two [6].
The contradiction shows that these manifolds cannot be spin, and a similar argument applies to the Hitchin manifolds (which have fundamental group isomorphic to Z2 × Z2). K3 itself is, of course, a spin manifold, and its spin holonomy group with respect to a Yau metric must be either Z2 × Spin(3) or Spin(3) (since both of these project onto SU(2)). As K3 is simply connected, Z2 × Spin(3) is impossible, and so we conclude that Spin(3) is the holonomy group of the Dirac connection on the unique spin structure on K3.

We see, then, that the range of possibilities for the spin holonomy groups of compact, locally irreducible, Einstein four-manifolds of non-negative scalar curvature is very much narrower than in the case of linear holonomy. In the latter case there are seven possibilities, but in the spin case only two: Spin(4) and Spin(3).

Only two groups occur as linear holonomy groups of compact locally irreducible orientable five-manifolds: SO(5) and SO(3). The pre-image of SO(5) is of course Spin(5), and the sphere S^5 provides an example. On the other hand, SO(3) occurs only as the linear holonomy group of the symmetric space [23] of the form SU(3)/SO(3), as well as its space forms. But none of these is a spin manifold; in fact, SU(3)/SO(3) is usually cited (Ref. [10], p. 393) as the standard example of a manifold which does not even have a Spin^c structure. Hence it has no spin structure, and the same is true of its space forms. Thus Spin(5) is the only possibility in five dimensions.

3. Dimensions 6 and 7

The linear holonomy groups of compact, locally irreducible six-manifolds fall into eight conjugacy classes in O(6), of which, however, only four are possible in the orientable case. These correspond to SO(6), U(3), SU(3), and SO(2) × SO(3). This last occurs as the holonomy group of the Grassmannian manifold G_{2,5}(R), the space of oriented two-dimensional subspaces of R^5. In general, the Grassmannians of the form G_{2,p+2}(R) are Kähler manifolds, and their first Chern classes are given by [3]

c1(G_{2,p+2}(R)) = pα,

where c1 is regarded as an element of the cohomology group H^2(G_{2,p+2}(R), Z), and where α is indivisible. Thus we find that c1(G_{2,5}(R)) = 3α, and so, since the second Stiefel–Whitney class is the mod 2 reduction of this, G_{2,5}(R) is not spin. Thus we may confine attention to SO(6), U(3), and SU(3). As usual, S^6 gives us an example of a spin manifold of spin holonomy Spin(6).

The pre-image of U(3) in Spin(6) is again isomorphic to U(3). As this is not completely obvious (recall that the pre-image of U(2) in Spin(4) is not isomorphic to U(2) but rather to U(1) × SU(2)), we shall give the proof. Let G be a compact connected Lie group and let H be a connected maximal-rank subgroup. From the fact that G is covered by its maximal tori, these tori being mutually conjugate [4], it follows that the centre of H contains the centre of G. Let H be of maximal rank in SO(n), and let Ĥ be the identity component of the pre-image of H in Spin(n); then of course Ĥ is of maximal rank in Spin(n), and so the centre of Ĥ contains the centre of Spin(n).
In particular, Ĥ contains {±1}, so the pre-image of H is connected if H is connected. In our case, U(3) is of maximal rank in SO(6), so its pre-image in Spin(6) must be a connected group which projects onto U(3) under factoring by {±1}. Since U(3) = [U(1) × SU(3)]/Z3, the pre-image can only be U(3). (In the case of U(2) in SO(4), the centre of the pre-image must contain Z2 × Z2, the centre of Spin(4). If the pre-image were U(2), this would be impossible, since U(1) does not contain Z2 × Z2; but the centre of U(1) × SU(2) is U(1) × Z2, which does. Similarly, the centre of Spin(8) is [4] isomorphic to Z2 × Z2, so the pre-image of U(4) in SO(8) must be [U(1) × SU(4)]/Z2, with centre [U(1) × Z4]/Z2 containing Z2 × Z2, and not U(4) = [U(1) × SU(4)]/Z4.) The simplest example of a six-dimensional manifold with spin holonomy U(3) is the projective space CP^3 with its usual (Fubini–Study) metric; recall [10] that CP^n is spin if and only if n is odd.

Turning now to SU(3), which is the linear holonomy group of the familiar Calabi–Yau manifolds arising in string theory [5], we note first that SU(3) is not of maximal rank in SO(6), and so we expect the pre-image in Spin(6) to be a disconnected group. Indeed, since the centre of SU(3) is isomorphic to Z3, it is clear that the subgroup of U(3) (as a subgroup of Spin(6)) which projects onto SU(3) is Z2 × SU(3), with two connected components. We conclude that every spin holonomy group of a Calabi–Yau manifold is isomorphic either to SU(3) or to Z2 × SU(3).

As every Calabi–Yau six-manifold is spin (this follows [10] directly from the fact that the linear holonomy group is connected and simply connected), we can easily find examples of manifolds with spin holonomy SU(3): take any simply connected Calabi–Yau manifold, and consider its unique spin structure. The question now is whether there are non-simply-connected Calabi–Yau manifolds with spin holonomy Z2 × SU(3). The question is of considerable interest, because the Calabi–Yau manifolds used in applications [5] are usually not simply connected, and because we are claiming that the “holonomy groups” appearing in physics are in fact spin holonomy groups.

Examples may be constructed as follows. Let M be a Calabi–Yau manifold with H^1(M, Z2) non-trivial. In group-theoretic terms this means that there are non-trivial homomorphisms from the first homology group H_1(M, Z) to Z2. This in turn means that M has at least one non-trivial (connected) double cover. Let M̃ be any double cover of M. Since M is a Kähler manifold, SO(M) is reducible to the sub-bundle of unitary frames, U(M), with structural group U(3). (Throughout this discussion, 3 can always be replaced by any larger odd integer.) Let Spin(M) denote the spin structure over M corresponding to M̃, and let SpinU(M) be the sub-bundle of Spin(M) which projects onto U(M). As SU(3) is normal in U(3), with U(3)/SU(3) = U(1), we can define a U(1) bundle K(M) over M by K(M) = U(M)/SU(3). Similarly we define SpinK(M) = SpinU(M)/SU(3), and so obtain the commuting diagram shown. (Here the horizontal arrows are Z2 projections, the downward arrows are SU(3) projections, and the upward arrows are inclusions.) Now the holonomy group of the reduced connection on U(M) is precisely SU(3) (and not some disconnected group with SU(3) as identity component). The holonomy group of the push-forward of this connection to K(M) is therefore precisely the trivial group (and not some other discrete group). Thus K(M) is a trivial bundle, K(M) = M × U(1).
Therefore SpinK(M), which we may call the spin canonical bundle corresponding to this choice of spin structure (since K(M) is usually called the canonical bundle over M), is constructed as follows. Let f : M̃ → M̃ be the fixed-point-free isometry generating Z2 such that M̃/Z2 = M. Then Z2 has a natural action on M̃ × U(1) by (x, u) → (f(x), −u). Setting N = [M̃ × U(1)]/Z2, we see that N is a U(1) bundle over M, and that N/Z2 = K(M); in fact N is just SpinK(M). Our objective now is to determine the holonomy group of the connection on SpinK(M) pushed forward from SpinU(M).

Suppose first that M̃ is the trivial double cover, M̃ = Z2 × M. Then SpinK(M) is trivial, SpinK(M) = M × U(1). Since holonomy bundles must be connected, it is clear that the holonomy group is trivial; hence the holonomy group of SpinU(M) must be SU(3). As this is of course also the holonomy group of Spin(M), M has spin holonomy SU(3) with respect to this spin structure. But now let M̃ be connected. Then the holonomy group of the pushed-forward connection on SpinK(M) cannot be trivial, for if it were, then SpinK(M) would be a trivial U(1) bundle over M, which is not the case. Thus in fact the holonomy group of the connection on the spin canonical bundle must be isomorphic to Z2; and so we conclude that the spin holonomy group itself is Z2 × SU(3).

In short, then, the structure of the spin holonomy group over a given Calabi–Yau manifold may not be fixed: it depends on the choice of spin structure. If M has only the trivial double cover (if it is simply connected or its fundamental group is Z3, for example) then its spin holonomy group with respect to its sole spin structure is isomorphic to SU(3). But if M has H^1(M, Z2) non-trivial, so that it has more than one spin structure, then it has spin holonomy SU(3) with respect to one spin structure, but it has spin holonomy Z2 × SU(3) with respect to all of the others. This situation has no analogue in the case of linear holonomy.
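The counting implicit in this summary can be sketched in a few lines. The code assumes that H_1(M, Z) is supplied as a list of orders of cyclic factors (with 0 standing for a copy of Z), which is a hypothetical input format chosen for illustration, and it uses the identification of H^1(M, Z2) with the homomorphisms from H_1(M, Z) to Z2 mentioned above. It then tallies the spin structures of a Calabi–Yau six-manifold by spin holonomy group as the text describes: one with SU(3), all others with Z2 × SU(3).

```python
from itertools import product

def count_homs_to_z2(orders):
    """Count homomorphisms from Z_{n1} x ... x Z_{nk} to Z2; an order of 0 stands for a factor Z."""
    count = 0
    for images in product((0, 1), repeat=len(orders)):
        # the image x of a generator of order n must satisfy n * x = 0 in Z2
        if all((n * x) % 2 == 0 for n, x in zip(orders, images)):
            count += 1
    return count

def calabi_yau_spin_holonomies(h1_orders):
    """Tally of spin structures by spin holonomy group, following the rule stated in the text."""
    total = count_homs_to_z2(h1_orders)        # number of spin structures = |H^1(M, Z2)|
    return {"SU(3)": 1, "Z2 x SU(3)": total - 1}

print(calabi_yau_spin_holonomies([]))      # simply connected
print(calabi_yau_spin_holonomies([3]))     # H_1 = Z3
print(calabi_yau_spin_holonomies([2]))     # H_1 = Z2
```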


Spin(M)   ──→  SO(M)
   ↑             ↑
SpinU(M)  ──→  U(M)
   ↓             ↓
SpinK(M)  ──→  K(M)

Figure 1.

Explicit examples of this behaviour are readily found. Let H3 be the non-singular intersection of four quadrics in CP^7. It is possible [14] to arrange the coefficients so that H3 admits a free holomorphic action by Z2. Then H3/Z2 is a Calabi–Yau manifold with linear holonomy isomorphic to SU(3) and two distinct spin structures: the spin holonomy group is isomorphic to SU(3) for one spin structure, and to Z2 × SU(3) for the other.

The situation in dimension seven is quite similar. There are four possible linear holonomy groups, of which two occur in the orientable case: SO(7) and the exceptional group G2. The latter is the linear holonomy group of certain compact manifolds constructed by Joyce [7]. As G2 is simply connected and has a trivial centre, it is clear that the subgroup of Spin(7) which projects onto G2 must be isomorphic to Z2 × G2; and so every spin holonomy group is isomorphic either to Z2 × G2 or to G2. Just as in the Calabi–Yau case, every manifold of linear holonomy G2 is spin, and there is exactly one spin structure with spin holonomy group isomorphic to G2; all other spin structures have spin holonomy Z2 × G2. (Recall [10] that these manifolds have a global parallel form, defining an associative structure: this plays the same role here as the global holomorphic three-form which trivialises the canonical bundle of a Calabi–Yau manifold.) Some of Joyce’s examples [7] have fundamental group isomorphic to Z2; these have two spin structures, one with spin holonomy G2, and the other with spin holonomy Z2 × G2. (Notice that Z2 × G2 does occur as a linear holonomy group also [12], but any confusion on this point can be resolved by noting that a seven-manifold of linear holonomy Z2 × G2 cannot be orientable and so cannot be spin.)

4. Dimension 8

The possible linear holonomy groups of compact locally irreducible Riemannian 8-manifolds have been classified [14]: there are 23 relevant (conjugacy classes of) subgroups of O(8). As usual, we eliminate those which are not contained in SO(8), and then study the corresponding subgroups of Spin(8). We arrange these manifolds according to their linear holonomy.

(a) U(4) and U(4) ⋊ Z2. Here U(4), the linear holonomy group of a generic Kähler 8-manifold, is embedded in SO(8) through

\[ A + iB \longmapsto \begin{pmatrix} A & -B \\ B & A \end{pmatrix}. \]

Conjugation by the SO(8) matrix diag(1, 1, 1, 1, −1, −1, −1, −1) maps U(4) into itself, and so we construct U(4) ⋊ Z2, which occurs as the linear holonomy group of the quotient of a generic Kähler 8-manifold by a Z2 generated by a fixed-point-free antiholomorphic involution. Recall that U(4) = [U(1) × SU(4)]/Z4, and that we saw in the preceding section that the pre-image of U(4) in Spin(8) is not isomorphic to U(4) but rather to


[U(1) × SU(4)]/Z2. In this particular case there is a completely different and more enlightening way of seeing this, as follows. Let {ei}, i = 1, …, 8, be a basis of R^8 generating the Clifford algebra which defines Spin(8). Setting α = e1 e2 e3 e4 e5 e6 e7 e8, one finds [4] that the centre of Spin(8) is {±1, ±α}. Now let Spin(2) be generated by e1 and e2, and let Spin(6) be generated by e3, …, e8. Then Spin(8) contains Spin(2) · Spin(6), where we use the dot to distinguish this product from the direct product: the product here is not direct, because Spin(2) and Spin(6) intersect non-trivially, in {±1}. Of course, Spin(2) · Spin(6) is the pre-image in Spin(8) of SO(2) × SO(6). Now let τ denote the triality [18] automorphism of Spin(8). We claim that τ[Spin(2) · Spin(6)] is the pre-image of U(4), so that τ[Spin(2) · Spin(6)]/{±1} = U(4). To see this note that, since e1 e2 is contained in Spin(2) and e3 e4 e5 e6 e7 e8 is contained in Spin(6), α is contained in Spin(2) · Spin(6), and so [Spin(2) · Spin(6)]/{1, α} is well defined. (Note that α² = 1.) Since Spin(2) is isomorphic to U(1), and Spin(6) is isomorphic to SU(4), we have

[Spin(2) · Spin(6)]/{1, α} = [(U(1) × SU(4))/Z2]/Z2 = [U(1) × SU(4)]/Z4 = U(4).

Now τ cyclically permutes the three non-trivial elements of the centre of Spin(8), so τ(α) = −1, and thus τ{1, α} = {±1}. Hence indeed τ[Spin(2) · Spin(6)]/{±1} is isomorphic to the U(4) subgroup of SO(8). Thus the pre-image of U(4) in Spin(8) is τ[Spin(2) · Spin(6)], which is of course isomorphic to Spin(2) · Spin(6) as an abstract group.

Next we turn to U(4) ⋊ Z2, where Z2 is generated by the SO(8) matrix $\Theta_4 = \mathrm{diag}(1, 1, 1, 1, -1, -1, -1, -1)$. This matrix is covered by $\hat{\Theta}_4 = \pm e_5 e_6 e_7 e_8$ in Spin(8), which satisfies $\hat{\Theta}_4^2 = 1$. The complex conjugation automorphism of U(4) lifts to an automorphism $s \to \bar{s}$ of Spin(2) · Spin(6), and clearly

\[ \hat{\Theta}_4 \, s \, (\hat{\Theta}_4)^{-1} = \pm \bar{s} \]

for each s in Spin(2) · Spin(6). Since the latter is connected, a continuity argument rules out the minus sign, and so conjugation by $\hat{\Theta}_4$ induces the canonical outer automorphism on Spin(2) · Spin(6). Thus $\hat{\Theta}_4$ allows us to define a subgroup of Spin(8) with two connected components, [Spin(2) · Spin(6)] ⋊ Z2, where ⋊ denotes a semi-direct product. This subgroup projects onto U(4) ⋊ Z2 in SO(8).

Examples may be given as follows. Let X4,4 denote the Fermat sub-manifold of CP^5 given by

\[ Z_1^4 + Z_2^4 + Z_3^4 + Z_4^4 + Z_5^4 + Z_6^4 = 0, \]

where the co-ordinates are homogeneous as usual. The first Chern class is 2g, where g is the canonical generator of H^2(X4,4, Z), that is, the Kähler form induced from CP^5. Therefore X4,4 is a spin manifold, and, by the work of Tian [20], it has a Kähler–Einstein metric of positive Ricci curvature. This manifold is simply connected, and the spin holonomy group of its unique spin structure is isomorphic, as above, to Spin(2) · Spin(6). The isometry of CP^5 defined by complex conjugation of the homogeneous coordinates restricts to an antiholomorphic involution on X4,4, and this involution clearly reverses the sign of g but has no fixed point. By the Bando–Mabuchi theorem [2], the Kähler–Einstein metric on X4,4 descends to an Einstein metric on X4,4/Z2. The spin holonomy group (with respect to either spin structure) is isomorphic to [Spin(2) · Spin(6)] ⋊ Z2.
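The sign bookkeeping in case (a) can be checked mechanically. The following is a minimal sketch, not taken from the paper: the helper `multiply` and the tuple encoding of Clifford monomials are illustrative assumptions, and the convention e_i² = −1 is assumed (for products of an even number of distinct generators the square comes out the same under e_i² = +1). It confirms that α² = 1, that the lift e5 e6 e7 e8 squares to 1, that e1 e2 times e3 ⋯ e8 equals α, and that α commutes with every e_i e_j and hence with Spin(8).

```python
# Minimal Clifford-algebra sign bookkeeping (illustrative helper, not from the paper).
# A basis monomial e_{i1} e_{i2} ... is stored as a tuple of increasing indices.

SQUARE = -1  # assumed convention: e_i * e_i = -1


def multiply(a, b):
    """Return (sign, monomial) for the product of two basis monomials."""
    word = list(a) + list(b)
    sign = 1
    # Insertion sort; each swap of two distinct generators contributes -1.
    for i in range(1, len(word)):
        j = i
        while j > 0 and word[j - 1] > word[j]:
            word[j - 1], word[j] = word[j], word[j - 1]
            sign = -sign
            j -= 1
    # Equal generators are now adjacent; cancel each pair using e_i^2 = SQUARE.
    result = []
    for g in word:
        if result and result[-1] == g:
            result.pop()
            sign *= SQUARE
        else:
            result.append(g)
    return sign, tuple(result)


ALPHA = tuple(range(1, 9))  # alpha = e1 e2 ... e8
THETA4 = (5, 6, 7, 8)       # lift of diag(1,1,1,1,-1,-1,-1,-1), up to sign

alpha_squared = multiply(ALPHA, ALPHA)
theta4_squared = multiply(THETA4, THETA4)
spin2_times_spin6 = multiply((1, 2), (3, 4, 5, 6, 7, 8))
# Spin(8) is generated by the products e_i e_j, so commuting with all of them
# is enough for alpha to be central.
alpha_is_central = all(
    multiply((i, j), ALPHA) == multiply(ALPHA, (i, j))
    for i in range(1, 9)
    for j in range(i + 1, 9)
)
```

A sign of +1 together with the empty tuple stands for the identity element, so `alpha_squared` and `theta4_squared` both encode the statement "squares to 1".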


(b) SO(2) · SO(4) and [SO(2) · SO(4)] ⋊ Z2. The real subgroup of SU(4) is of course SO(4), and the corresponding subgroup of U(4) is SO(2) · SO(4), where the dot reminds us that SO(2), embedded in SO(8) through

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \longmapsto \begin{pmatrix} I_4\cos\theta & -I_4\sin\theta \\ I_4\sin\theta & I_4\cos\theta \end{pmatrix}, \]

intersects SO(4), embedded through

\[ A \longmapsto \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}, \]

in ±I8, so the product is not direct. This group is the linear holonomy group of the Grassmannian of oriented 2-planes in R^6, G2,6(R). As SO(2) · SO(4) is a subgroup of U(4), G2,6(R) can be regarded as a Kähler manifold, and its first Chern class is, in our earlier notation, 4α; thus G2,6(R) is spin, and its spin holonomy group with respect to its unique spin structure is the subgroup of Spin(2) · Spin(6) which projects onto SO(2) · SO(4). Recalling that Spin(6) is isomorphic to SU(4), and noting that SO(4) contains ±I4, we see that this subgroup takes the form Spin(2) · SO(4). Writing SO(4) = [Spin(3) × Spin(3)]/Z2 = Spin(3) · Spin(3), we can say that the spin holonomy group of G2,6(R) is isomorphic to Spin(2) · Spin(3) · Spin(3). G2,6(R) admits a fixed-point-free antiholomorphic involution ω, defined by reversing the orientation of any 2-plane (and its orthogonal complement) in R^6. The mod 2 cohomology ring of the quotient, G2,6(R)/Z2, is isomorphic [3] to that of the quaternionic projective plane; since the latter is known [10] to be spin, so is G2,6(R)/Z2. The spin holonomy group of either spin structure must be isomorphic to the relevant subgroup of [Spin(2) · Spin(6)] ⋊ Z2, namely [Spin(2) · Spin(3) · Spin(3)] ⋊ Z2.

(c) Sp(1) · Sp(2) and SO(4). The linear holonomy group of any quaternion-Kähler 8-manifold is contained in Sp(1) · Sp(2), where Sp(m) is the compact symplectic group of rank m, and where Sp(1) · Sp(2) = [Sp(1) × Sp(2)]/Z2. To compute the corresponding subgroup of Spin(8), we consider the subgroup Spin(3) · Spin(5), which covers SO(3) × SO(5). Recall that the pre-image of U(4) in Spin(8) is τ[Spin(2) · Spin(6)]: one might expect, in view of the isomorphisms [4] Sp(1) = Spin(3) and Sp(2) = Spin(5), that τ[Spin(3) · Spin(5)] should be the pre-image of Sp(1) · Sp(2). However, this is not so. To see this, recall that the central element α in Spin(8) is contained in Spin(2) · Spin(6), so that the quotient of the latter was well-defined.
But e1 e2 e3 is not an element of Spin(3), and e4 e5 e6 e7 e8 is not an element of Spin(5); thus α is not contained in Spin(3) · Spin(5), and so the quotient τ [Spin(3) · Spin(5)]/{±1} is not defined. In fact the pre-image of Sp(1)·Sp(2) is the disconnected group Z2 ×τ [Spin(3)·Spin(5)], where Z2 is {±1}. We conclude that a spin holonomy group of a quaternion-Kähler manifold must be contained either in Z2 × Spin(3) · Spin(5) or in Spin(3) · Spin(5) itself. The situation here is therefore very different from the case of U (4). (Notice that Sp(1)·Sp(2) has a non-trivial double cover, namely Sp(1) × Sp(2), which does not occur as a spin holonomy group.) The quaternionic projective plane, Sp(3)/[Sp(1) × Sp(2)], is a simply connected spin [10] 8-manifold with spin holonomy group isomorphic to Spin(3) · Spin(5). In fact,


every compact quaternion-Kähler manifold of positive Ricci curvature must be [19] simply connected, so there is no positive Ricci curvature example with spin holonomy Z2 × Spin(3) · Spin(5). (Examples may well exist in the negative case, but this is beyond our scope here.) The symmetric space [23] G2 /SO(4), where G2 is the exceptional group of rank 2, is the only example of an irreducible compact quaternion-Kähler 8-manifold with linear holonomy group properly contained in Sp(1) · Sp(2): note that SO(4) = Sp(1) · Sp(1) ⊂ Sp(1) · Sp(2). The mod 2 cohomology of this manifold is given in Ref. [3], and one can use this to show that G2 /SO(4) is spin. The pre-image of SO(4) in Spin(8) is not, as one might expect, isomorphic to Spin(4), but rather to Z2 × SO(4) = Z2 × Spin(3) · Spin(3) ⊂ Z2 ×Spin(3)·Spin(5). However, as above, Z2 ×SO(4) cannot occur in the case of positive Ricci curvature; that is, G2 /SO(4) has no non-trivial space forms. The spin holonomy group of G2 /SO(4) itself is isomorphic to SO(4), which we write as Spin(3) · Spin(3). (d) Spin(7), SU (4), and Sp(2). It is shown in Ref. [22] that a compact 8-manifold of linear holonomy Spin(7) must be simply connected. In fact, a much stronger statement is possible: if M is a compact 8-manifold with a restricted linear holonomy group isomorphic to Spin(7), then it must be simply connected. (Recall [9] that the restricted holonomy group is defined by parallel transport around contractible loops.) This follows [12] from the group theory of SO(8): the latter simply has no disconnected subgroup with Spin(7) as identity component, and so the full holonomy group is forced to be isomorphic to the restricted holonomy group. As Spin(7) is connected and simply connected, such manifolds must be spin. As in the quaternion-Kähler case, the obvious Spin(7) subgroup of Spin(8) does not contain α, and so the pre-image of Spin(7) in Spin(8) is Z2 × τ [Spin(7)], where Z2 = {±1} and τ denotes triality. 
Since M must be simply connected in the compact case, Z2 × τ[Spin(7)] cannot occur as a spin holonomy group (though it may well occur in the non-compact case) and so the spin holonomy group of any such manifold must be τ[Spin(7)], which is of course abstractly isomorphic to Spin(7) itself. Joyce [8] has given examples of such manifolds.

The case of manifolds with restricted linear holonomy group contained in SU(4) = Spin(6) is very different, for it is certainly not true that a compact 8-manifold with SU(4) as restricted linear holonomy group must be simply connected, and the analogous statement for the Sp(2) = Spin(5) subgroup of SU(4) is also probably false [14]. The following theorem greatly simplifies our task, however.

Theorem 2. Let M be a compact, locally irreducible, Riemannian, Ricci-flat manifold of real dimension n, with restricted linear holonomy group isomorphic to a proper subgroup of SO(n). If n is not a multiple of 4, then M is spin if and only if it is orientable. If n = 4r, then M is spin only if it can be described in one of the following ways:
(a) If r is odd, then either M = M^HK/Zk, where M^HK is a compact hyperKähler manifold [2], Zk acts freely and holomorphically, and k is an odd divisor of 1 + r, or M is a simply connected Calabi–Yau manifold.
(b) If r is even, then either (i) M = M^J8 if r = 2, where M^J8 is a compact, necessarily simply connected Joyce 8-fold, or (ii) M = M^HK/Zk, where M^HK is as above, and k is any divisor of 1 + r, or (iii) M = M^CY or M^CY/Z2, where M^CY is a simply connected Calabi–Yau manifold, and the generator of Z2 acts antiholomorphically.


Proof. Berger’s theorem [2] applied to the universal Riemannian cover implies that the latter is a Joyce manifold, a Calabi–Yau manifold, or a hyperKähler manifold. If n is not a multiple of 4, then the generalised Berger theorem [12] shows that the full (not restricted) holonomy group is connected and simply connected when M is orientable, and so M must be spin in that case. Henceforth we put n = 4r.

Suppose first that r is odd. If the universal Riemannian cover is a Calabi–Yau manifold, then the generalised Hitchin theorems [15] show that M^CY can cover three kinds of manifold: M^CY/Z2^+, M^CY/Z2^−, and M^CY/[Z2^+ × Z2^−], where Z2^± is generated by a fixed-point-free holomorphic (respectively, antiholomorphic) involution. The linear holonomy groups are, respectively, Z4r · SU(2r) (which is isomorphic to [Z4r × SU(2r)]/Z2r), SU(2r) ⋊ Z2, and [Z4r · SU(2r)] ⋊ Z2. The product with Z2 is semi-direct, the action on SU(2r) being conjugation by the SO(4r) matrix $\Theta_{2r}$, a diagonal matrix with the first 2r entries equal to +1, and the remainder equal to −1. The pre-image of Z4r · SU(2r) in Spin(4r) is [Z4r × SU(2r)]/Zr, and no proper subgroup of this group projects onto Z4r · SU(2r). If M^CY/Z2^+ were a spin manifold, then its spin holonomy group with respect to either of its two spin structures would be isomorphic to [Z4r × SU(2r)]/Zr. But this group has four connected components, so there can exist no homomorphism from the fundamental group of M^CY/Z2^+ onto the group of components of the spin holonomy group. (The group of components of a (disconnected) Lie group G is the quotient G/G0, where G0 is the identity component.) But the existence of such a “holonomy homomorphism” is a basic property of any connection on any principal bundle. The contradiction shows that M^CY/Z2^+ cannot be a spin manifold. A precisely similar argument shows that M^CY/[Z2^+ × Z2^−] can never be spin. Finally, $\Theta_{2r}$ is covered by the Spin(4r) element

\[ \hat{\Theta}_{2r} = e_{2r+1} e_{2r+2} \cdots e_{4r}, \]

which satisfies

\[ (\hat{\Theta}_{2r})^2 = (-1)^r. \]

Thus if r is odd, SU(2r) ⋊ Z2 is covered by a Spin(4r) subgroup isomorphic to SU(2r) ⋊ Z4, and no proper subgroup of this group projects onto SU(2r) ⋊ Z2. Again, the fact that there is no homomorphism from Z2 onto Z4 shows that M^CY/Z2^− cannot be spin. In short, if r is odd, a 4r-dimensional Calabi–Yau manifold cannot cover a spin manifold.

Suppose instead that r is odd but the universal Riemannian cover is a hyperKähler manifold, M^HK. Then [12] M has one of the following forms:

M = M^HK/Zk, k ≥ 1, k divides r + 1,
M = M^HK/D2k, k ≥ 2, k divides r + 1,
M = M^HK/P2k, k = 6, 12, or 30, k divides r + 1.

Here D2k is the dihedral group of order 2k (with D4 = Z2 × Z2), and P2k is the polyhedral group of order 2k. The corresponding linear holonomy groups are Z2k · Sp(r), Q4k · Sp(r), and B4k · Sp(r), where Q4k is the quaternionic group of order 4k, B4k is the binary polyhedral group of order 4k, and Sp(r) is the symplectic group. These groups are subgroups of Sp(1) · Sp(r), which, in order to avoid confusion in the discussion below, we denote by Spin(3) · Sp(r). As subgroups of SO(4r), both Spin(3) and Sp(r) contain the matrix −I4r. Now we consider the corresponding elements in Spin(4r). Let us suppose that the centre of Spin(3) is generated by α = e1 e2 ⋯ e4r; then we must determine whether the centre of Sp(r) is generated by α or by −α. (This determines whether Spin(3) and Sp(r) intersect non-trivially in Spin(4r), as their counterparts do in SO(4r).) To do this, recall that Sp(r) contains Sp(1) × Sp(1) × ⋯ (r factors). We associate the first


Sp(1) with {e1, e2, e3, e4}, the second with {e5, e6, e7, e8}, and so on. The centres of the respective Sp(1) factors are ±e1 e2 e3 e4, ±e5 e6 e7 e8 and so on, where the sign must be constant throughout. We can see that the sign is in fact negative, as follows. When r = 1, we know that SO(4) = Spin(3) · Sp(1) is covered by Spin(4) = Spin(3) × Sp(1). Evidently the two factors in Spin(4) intersect trivially (that is, in the identity only), so if e1 e2 e3 e4 generates the centre of Spin(3) in this embedding, then the centre of Sp(1) must be generated by −e1 e2 e3 e4. Now the centre of Sp(r) is diagonal between the centres of its Sp(1) subgroups, and so we conclude that the centre of Sp(r) is generated by (−1)^r α. Therefore Spin(3) and Sp(r) intersect, as subgroups of Spin(4r), as follows [22]:

\[ \mathrm{Spin}(3) \cap \mathrm{Sp}(r) = \begin{cases} \{1, \alpha\}, & r \text{ even}, \\ \{1\}, & r \text{ odd}, \end{cases} \]

and so it follows that if r is odd, then the pre-image of Q4k · Sp(r) in Spin(4r) is Q4k × Sp(r), while that of B4k · Sp(r) is B4k × Sp(r). No proper subgroup of Q4k × Sp(r) projects onto Q4k · Sp(r), yet Q4k × Sp(r) has 4k connected components while the order of the fundamental group, D2k, is 2k. We conclude, in the usual way, that manifolds of linear holonomy Q4k · Sp(r) are not spin, and similarly for manifolds of linear holonomy B4k · Sp(r). If k is even, the reader can easily extend this argument to manifolds of linear holonomy Z2k · Sp(r). If k is odd, however, then Z2k × Sp(r) = Z2 × Zk × Sp(r), which has Zk × Sp(r) as a subgroup; this subgroup projects onto the linear holonomy group (note that Z2k · Sp(r) = Zk × Sp(r) if k is odd) and it has k connected components, consistent with the fact that the fundamental group is cyclic of order k. To summarise: if r is odd, then M can be spin only if it is a simply connected Calabi–Yau manifold M^CY, a simply connected hyperKähler manifold M^HK, or a quotient M^HK/Zk, where k is an odd divisor of the even integer r + 1.

If r is even, then the above proof that M^CY/Z2^+ and M^CY/[Z2^+ × Z2^−] are not spin still goes through, but now $(\hat{\Theta}_{2r})^2 = +1$, and so SU(2r) ⋊ Z2 is covered in Spin(4r) not by SU(2r) ⋊ Z4 but rather by Z2 × SU(2r) ⋊ Z2. This group has a subgroup, SU(2r) ⋊ Z2, which has two connected components, and so we can no longer exclude the possibility that M^CY/Z2^− is spin. Finally, when r is even, it can be shown [12] that a simply connected hyperKähler manifold M^HK can only cover manifolds of the form M^HK/Zk, where k divides 1 + r. The linear holonomy group is Zk × Sp(r), and, as above, this is covered by a subgroup of Spin(4r) which is likewise isomorphic to Zk × Sp(r). This concludes the proof of Theorem 2.

The following corollaries may be of interest.

Corollary 1. Let M be a compact, locally irreducible, Riemannian, Ricci-flat spin manifold of dimension 4r, with restricted linear holonomy group isomorphic to a proper subgroup of SO(4r). Then the fundamental group of M is a cyclic group of order ≤ 1 + r.

Corollary 2. Let M be a compact, locally irreducible, Riemannian, Ricci-flat manifold with restricted linear holonomy group isomorphic to a proper subgroup of SO(n), n = dim(M). Suppose that n = 4, 12, 28, …, 2^m − 4, m ≥ 3. Then M is spin if and only if it is simply connected.
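Both corollaries can be read off from the case list in Theorem 2 by simple divisor arithmetic. The sketch below is a hedged illustration, not part of the paper: the function `spin_candidates` and its string labels are hypothetical names, and the list records only the necessary conditions of Theorem 2 (the theorem says "spin only if"), paired with the order of the fundamental group in each case.

```python
def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def spin_candidates(r):
    """Cases permitted by Theorem 2 in dimension n = 4r.

    Returns (label, order of fundamental group) pairs. The labels are
    illustrative; k = 1 stands for the simply connected manifold itself.
    """
    cases = []
    if r % 2 == 1:
        # r odd: simply connected Calabi-Yau, or M^HK/Z_k with k an odd divisor of r + 1.
        cases.append(("M^CY", 1))
        cases += [("M^HK/Z_%d" % k, k) for k in divisors(r + 1) if k % 2 == 1]
    else:
        # r even: Joyce 8-fold (r = 2 only), M^HK/Z_k with k | r + 1, M^CY, M^CY/Z_2.
        if r == 2:
            cases.append(("M^J8", 1))
        cases += [("M^HK/Z_%d" % k, k) for k in divisors(r + 1)]
        cases += [("M^CY", 1), ("M^CY/Z_2", 2)]
    return cases


# Corollary 1: every permitted fundamental group has order at most 1 + r.
corollary_1_holds = all(
    order <= r + 1 for r in range(1, 40) for _, order in spin_candidates(r)
)

# Corollary 2: for n = 2^m - 4 we have r = 2^(m-2) - 1, which is odd, and
# r + 1 is a power of 2, so its only odd divisor is 1.
corollary_2_holds = all(
    order == 1
    for m in range(3, 10)
    for _, order in spin_candidates(2 ** (m - 2) - 1)
)

cases_in_dimension_8 = spin_candidates(2)
```

For r = 2 the list has five entries, since the divisors of 1 + r = 3 are 1 and 3.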


We may also observe that Wang’s classification [22] of the linear holonomy groups of compact locally irreducible spin manifolds admitting a parallel spinor is also a corollary of Theorem 2. (On the other hand, Wang’s sophisticated representation-theoretic approach can be extended to non-compact Riemannian manifolds.) When n = 8, Theorem 2 gives five possibilities: M^J8, with spin holonomy Spin(7), as discussed earlier; M^CY, with spin holonomy SU(4) = Spin(6); M^CY/Z2^−, with spin holonomy Spin(6) ⋊ Z2 with respect to both spin structures; M^HK, with spin holonomy Sp(2) = Spin(5); and M^HK/Z3, with spin holonomy Z3 × Spin(5) with respect to its unique spin structure. The problem of constructing examples of such manifolds is discussed in detail in Refs. [1] and [14], and we shall not go into it here; suffice it to say that spin examples are known in every case except the last: as far as the author is aware, no example of a manifold of the form M^HK/Z3 has yet been given.

(e) PSU(3). The underlying manifold of the Lie group SU(3) is an eight-dimensional spin manifold. With respect to the canonical metric, the linear holonomy group of SU(3) is PSU(3) = SU(3)/Z3, where Z3 is the centre of SU(3). Manifolds of the form SU(3)/Γ, where Γ is a finite, freely acting group of isometries of SU(3), have PSU(3) as restricted holonomy group. In some cases, Γ is not contained in the identity component of the isometry group; but all such quotients are non-orientable, as is shown in Ref. [14]. For our purposes, then, it suffices to assume that Γ consists of combinations of left and right SU(3) translations. It can be shown that, under these circumstances, the full linear holonomy group of SU(3)/Γ must be connected, and so it, too, must be isomorphic to PSU(3). As the latter has no non-trivial double cover, its pre-image in Spin(8) must be Z2 × PSU(3), with Z2 = {±1}; and so we conclude that the possible spin holonomy groups here are PSU(3) and Z2 × PSU(3). Both of these do occur. Let f : SU(3) → SU(3) be the fixed-point-free isometry defined by left translation by the SU(3) matrix diag(1, −1, −1). Since the Lie algebra consists of left-invariant vector fields, SU(3)/Z2 (which is not itself a group manifold of course) is parallelisable, so we have SO(SU(3)/Z2) = (SU(3)/Z2) × SO(8). One sees that SU(3)/Z2 is a spin manifold by directly constructing the two distinct spin structures; we have

Spin^(1)(SU(3)/Z2) = (SU(3)/Z2) × Spin(8)

and

Spin^(2)(SU(3)/Z2) = (SU(3) × Spin(8))/Z2,

where, in Spin^(2)(SU(3)/Z2), Z2 acts on SU(3) × Spin(8) by (x, y) → (f(x), −y). If the spin holonomy group of Spin^(1)(SU(3)/Z2) were isomorphic to Z2 × PSU(3), then the corresponding holonomy bundles would take the form (SU(3)/Z2) × (Z2 × PSU(3)); but this is impossible, since holonomy bundles are connected, by their definition [9]. Similarly, it is clear that Spin^(2)(SU(3)/Z2) is only reducible to a sub-bundle with a structural group containing {±1}. The holonomy reduction theorem [9] therefore implies that the spin holonomy group of Spin^(2)(SU(3)/Z2) cannot be PSU(3). In short, both PSU(3) and Z2 × PSU(3) occur as spin holonomy groups over SU(3)/Z2: the first is the spin holonomy group of Spin^(1)(SU(3)/Z2), while the second is the spin holonomy group of Spin^(2)(SU(3)/Z2).

Figure 2. Diagram of subgroup inclusions in Spin(8), with nodes Spin(8); Z2 × PSU(3); [Spin(2) · Spin(6)] ⋊ Z2; Spin(3) · Spin(5); Spin(7); Spin(6) ⋊ Z2; PSU(3); Spin(3) · Spin(3); Spin(2) · Spin(6); [Spin(2) · Spin(3) · Spin(3)] ⋊ Z2; Spin(6); Spin(2) · Spin(3) · Spin(3); Z3 × Spin(5); Spin(5).

According to Ref. [14], this exhausts the list of possible spin holonomy groups of compact locally irreducible 8-manifolds of non-negative Ricci curvature. All of the examples given here are Einstein manifolds. For the reader’s convenience, we summarise the results in the following diagram of subgroup inclusions.

5. Dimensions 9 and 10

Apart from SO(9) and O(9), the only possible linear holonomy groups in the nine-dimensional case are those of the real Grassmannian G3,6(R) and its space forms [23]. However, these manifolds are not spin manifolds. The simplest way to see this is to exploit the local isomorphism of SO(6) with SU(4) and of SO(3) × SO(3) with SO(4), so that we have G3,6(R) = SO(6)/[SO(3) × SO(3)] = SU(4)/SO(4). This is useful, because the cohomology of the symmetric spaces of the form SU(n)/SO(n) is quite simple and is given explicitly in Ref. [17]. From this one can verify directly that neither G3,6(R) nor its space forms are spin, by computing the second Stiefel–Whitney class. Hence Spin(9) is the only possibility here.

The situation in ten dimensions is quite similar to the six-dimensional case: the ten-sphere has spin holonomy Spin(10), CP^5 has spin holonomy U(5), and one can readily construct a Calabi–Yau


five-fold with two spin structures, one with spin holonomy SU(5), the other with spin holonomy Z2 × SU(5). In ten dimensions, however, we also have the group manifold Spin(5). The linear holonomy group of this manifold, and also of all of its space forms [14], is isomorphic to SO(5), embedded in SO(10) through the adjoint representation. As SO(5) has a non-trivial double cover, it is not immediately clear whether the pre-image of SO(5) in Spin(10) is isomorphic to Spin(5) or to Z2 × SO(5). The following simple trick can be used to settle such questions. Consider first the adjoint representation of SO(6). If B is any 6 × 6 antisymmetric matrix, then the representation is B → A B A^T for any matrix A in SO(6). This is a faithful representation of SO(6)/Z2, and so it embeds SO(6)/Z2 in SO(15). Of course, the 5 × 5 antisymmetric matrices may be regarded as 6 × 6 antisymmetric matrices, so we have the inclusions

SO(5) → SO(6)/Z2 → SO(15),
SO(5) → SO(10) → SO(15).

Now lift everything to Spin(15):

Pre(SO(5)) → Pre(SO(6)/Z2) → Spin(15),
Pre(SO(5)) → Spin(10) → Spin(15),

where Pre denotes the pre-image. Clearly Pre(SO(6)/Z2) must be isomorphic either to SO(6) or to Z2 × SO(6)/Z2; but neither of these groups contains Spin(5). Hence Pre(SO(5)) can only be Z2 × SO(5), which is contained in both SO(6) and Z2 × SO(6)/Z2. We conclude that the spin holonomy groups of space forms of Spin(5) are isomorphic either to SO(5) or to Z2 × SO(5). A (perhaps slightly confusing) example is provided by the group manifold SO(5), with its two spin structures,

Spin^(1)(SO(5)) = SO(5) × Spin(10),
Spin^(2)(SO(5)) = [Spin(5) × Spin(10)]/Z2.

The same argument we used in the preceding section for SU(3)/Z2 shows that the spin holonomy group of Spin^(1)(SO(5)) is SO(5) (as a subgroup of Spin(10)), while the spin holonomy group of Spin^(2)(SO(5)) is Z2 × SO(5). Thus, both possibilities do actually occur.
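The linear-algebra facts behind this trick are easy to verify by hand or by machine. The following is a small sketch in plain Python, offered as an illustration only: the helper names (`adjoint_action`, `basis_element`, and so on) are assumptions of this example and do not come from the paper. It checks that the antisymmetric 6 × 6 and 5 × 5 matrices have dimensions 15 and 10, that −I6 acts trivially under B → A B A^T (so the representation factors through SO(6)/Z2), and that a quarter-turn in the (1,2)-plane acts non-trivially.

```python
def identity(n, scale=1):
    """n x n scalar matrix scale * I as a list of rows."""
    return [[scale if i == j else 0 for j in range(n)] for i in range(n)]


def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def adjoint_action(A, B):
    """The map B -> A B A^T on antisymmetric matrices."""
    return matmul(matmul(A, B), transpose(A))


def basis_element(n, i, j):
    """Antisymmetric matrix E_ij - E_ji (0-based indices, i < j)."""
    B = [[0] * n for _ in range(n)]
    B[i][j], B[j][i] = 1, -1
    return B


def antisymmetric_basis(n):
    return [basis_element(n, i, j) for i in range(n) for j in range(i + 1, n)]


MINUS_I6 = identity(6, scale=-1)

# Rotation by 90 degrees in the plane of the first two coordinates, an element of SO(6).
ROT = identity(6)
ROT[0][0], ROT[0][1], ROT[1][0], ROT[1][1] = 0, -1, 1, 0

dim_so6 = len(antisymmetric_basis(6))
dim_so5 = len(antisymmetric_basis(5))
minus_identity_acts_trivially = all(
    adjoint_action(MINUS_I6, B) == B for B in antisymmetric_basis(6)
)
rotated = adjoint_action(ROT, basis_element(6, 0, 2))
```

Integer matrices are used throughout so that the equality checks are exact.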
(The spin holonomy group of Spin(5) itself, with respect to its unique spin structure, is SO(5), a somewhat curious result.) This completes the proof of Theorem 1. □

6. Conclusion

Berger’s theorem [2] classifies the linear holonomy groups of irreducible simply connected Riemannian manifolds. Essentially, then, it describes the behaviour of vectors (or other tensors) under parallel transport around contractible loops. For physical applications, however, one might well be interested in the parallel transport of vectors or spinors around non-contractible loops (on, for example, a non-simply-connected Calabi–Yau or Joyce manifold). Some progress has been made [13] in the classification of linear holonomy groups of compact locally irreducible non-simply-connected Riemannian


manifolds; the theory is complete, however, only in the case of positive Ricci curvature [16]. Theorem 1 is the analogue of Berger’s theorem for the parallel transport of spinors on compact locally irreducible non-simply-connected Riemannian manifolds of non-negative Ricci curvature. The reader is particularly urged to note that a given Calabi–Yau six-manifold (or a given Joyce seven-manifold) may have two different spin holonomy groups. This is perhaps the most important novelty in moving from linear to spin holonomy; it is striking that it arises precisely in the cases of most interest to string theorists.

References

1. Beauville, A.: Variétés Kähleriennes dont la première classe de Chern est nulle. J. Diff. Geom. 18, 755–782 (1983)
2. Besse, A. L.: Einstein Manifolds. Berlin: Springer, 1987
3. Borel, A., Hirzebruch, F.: Characteristic classes and homogeneous spaces II. Am. J. Math. 81, 315–382 (1959)
4. Curtis, M. L.: Matrix Groups. New York: Springer, 1984
5. Green, M. B., Schwarz, J. H., Witten, E.: Superstring Theory. Cambridge: Cambridge University Press, 1987
6. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: John Wiley and Sons, 1978
7. Joyce, D. D.: Compact Riemannian seven-manifolds with holonomy G2. J. Diff. Geom. 43, 291–375 (1996)
8. Joyce, D. D.: Compact Riemannian eight-manifolds with holonomy Spin(7). Invent. Math. 123, 507–552 (1996)
9. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry I. New York: Interscience, 1963
10. Lawson, H. B., Michelsohn, M. L.: Spin Geometry. Princeton: Princeton University Press, 1989
11. Matsushima, Y.: Remarks on Kähler–Einstein manifolds. Nagoya Math. J. 46, 161–173 (1972)
12. McInnes, B.: Methods of holonomy theory for Ricci-flat Riemannian manifolds. J. Math. Phys. 32, 888–896 (1991)
13. McInnes, B.: Holonomy groups of compact Riemannian manifolds: A classification in dimensions up to ten. J. Math. Phys. 34, 4273–4286 (1993)
14. McInnes, B.: Examples of Einstein manifolds with all possible holonomy groups in dimensions less than seven. J. Math. Phys. 34, 4287–4304 (1993)
15. McInnes, B.: The quotient construction for a class of compact Einstein manifolds. Commun. Math. Phys. 154, 307–312 (1993)
16. McInnes, B.: Holonomy groups and holonomy representations. J. Math. Phys. 36, 4450–4460 (1995)
17. Mimura, M., Toda, H.: Topology of Lie Groups. Providence, RI: American Mathematical Society, 1991
18. Porteous, I. R.: Clifford Algebras and the Classical Groups. Cambridge: Cambridge University Press, 1995
19. Salamon, S. M.: Quaternionic Kähler manifolds. Invent. Math. 67, 143–171 (1982)
20. Tian, G.: On Kähler–Einstein metrics on certain Kähler manifolds with c1(M) > 0. Invent. Math. 89, 225–243 (1987)
21. Wang, M. Y.: Parallel spinors and parallel forms. Ann. Global Anal. Geom. 7, 59–70 (1989)
22. Wang, M. Y.: On non-simply connected manifolds with non-trivial parallel spinors. Ann. Global Anal. Geom. 13, 31–42 (1995)
23. Wolf, J. A.: Spaces of Constant Curvature. Wilmington: Publish or Perish, 1984
24. Yau, S. T.: On the curvature of compact Hermitian manifolds. Invent. Math. 25, 213–239 (1974)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 365 – 383 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Global Existence of Smooth Solution to Non-Linear Thermoviscoelastic System with Clamped Boundary Conditions in Solid-like Materials

Boling Guo, Peicheng Zhu

Center for Nonlinear Studies and Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Beijing 100088, P. R. China. E-mail: [email protected]; [email protected]

Received: 18 May 1998 / Accepted: 1 December 1998

Abstract: In this paper, we prove the global existence and uniqueness of smooth solution to a system of one-dimensional non-linear thermoviscoelasticity which describes the thermomechanical processes in a class of solid-like materials, such as rubber, etc. The materials satisfy that both ends of the rod are fixed. This assumption was proposed by Dafermos in 1982 (see [6]). A new approach is developed to obtain the crucial estimate of the L∞-norm of the strain u.

1. Introduction

This paper is concerned with the global existence and uniqueness of a smooth solution to the following system of one-dimensional nonlinear thermoviscoelasticity, which describes the thermomechanical processes in a class of solid-like materials, such as rubber, etc. The referential (Lagrangian) form of the balance laws of mass, momentum and energy for one-dimensional, heat-conducting, viscous solid-like materials with reference density ρ0 = 1 can be written as

\[ u_t - v_x = 0, \tag{1.1} \]
\[ v_t - \sigma_x = 0, \tag{1.2} \]
\[ \left( e + \tfrac{1}{2} v^2 \right)_t - (\sigma v)_x + q_x = 0, \tag{1.3} \]

where u, v, σ, e, θ and q denote deformation gradient (strain), velocity, stress, internal energy, absolute temperature and heat flux, respectively. The quantities u, θ and e may take only positive values. The above system should be supplemented with the second law of thermodynamics expressed by the Clausius–Duhem inequality

\[ \eta_t + \left( \frac{q}{\theta} \right)_x \ge 0 \quad \text{in } Q_t := (0, 1) \times (0, t). \tag{1.4} \]

Here, $\eta$ stands for the specific entropy. We assume that the reference configuration is the unit interval $[0,1]$. For one-dimensional homogeneous thermoviscoelastic materials, internal energy, stress, entropy and heat flux are given by the following constitutive relations:

$$e = \hat e(u,\theta),\quad \sigma = \hat\sigma(u,\theta,v_x),\quad \eta = \hat\eta(u,\theta),\quad q = \hat q(u,\theta,\theta_x), \tag{1.5}$$

which, in order to comply with (1.4), must satisfy

$$\hat\sigma(u,\theta,0) = \hat\psi_u(u,\theta),\quad \hat\eta(u,\theta) = -\hat\psi_\theta(u,\theta), \tag{1.6}$$
$$\bigl(\hat\sigma(u,\theta,w) - \hat\sigma(u,\theta,0)\bigr)w \ge 0,\quad \hat q(u,\theta,g)\,g \le 0, \tag{1.7}$$

where $\psi = e - \theta\eta$ is the Helmholtz free energy. The following assumptions are made on our model:

(i) The material has viscoelastic damping of rate type, i.e.,

$$\sigma(x,t) = \hat\sigma(u,\theta,v_x) = -\hat p(u,\theta) + \hat\mu(u)v_x, \tag{1.8}$$

where $\hat\mu$ satisfies

$$\hat\mu(u)\,u \ge \mu_0 > 0,\quad 0 < u < +\infty. \tag{1.9}$$

Here, $\mu_0$ is a positive constant.

(ii) We assume that the heat flux $q$ satisfies

$$q = \hat q(u,\theta,\theta_x) = -k\theta_x \tag{1.10}$$

with $k = \mathrm{const} > 0$.

(iii) The Helmholtz free energy $\psi$ is assumed to be of the form

$$\psi(u,\theta) = -C_V\theta\ln\theta + C'\theta + F_1(u)\theta + F_2(u) \tag{1.11}$$

with positive constants $C_V$, $C'$ and smooth functions $F_1$, $F_2$.

(iv) Concerning $F_1$, $F_2$, to include the model for the study of thermomechanical processes in a class of solid-like materials, we assume that the functions $f_1 := F_1'$, $f_2 := F_2'$ satisfy

$$f_1(u), f_2(u) < 0,\ \forall\, 0 < u < \bar u;\qquad f_1(u), f_2(u) > 0,\ \forall\, \infty > u > \bar U, \tag{1.12}$$

so that the elastic part of the stress $p = \hat p(u,\theta) := -(f_1\theta + f_2)$ satisfies, for any $0 \le \theta < \infty$,

$$\hat p(u,\theta) > 0,\ \forall\, 0 < u < \bar u;\qquad \hat p(u,\theta) < 0,\ \forall\, \infty > u > \bar U. \tag{1.13}$$

Here, $\bar u$, $\bar U$ are positive constants satisfying $\bar u \le \bar U$. The condition (1.13) implies that $\hat p(u,\theta)$ is compressive at high density and tensile at low density, at any temperature.

Remark 1.1. More general models in which $k = \hat k(u,\theta)$, $p = \hat p(u,\theta)$ grow at higher rates in $\theta$, for instance, the same as those in [6], can also be treated by using our technique for getting the estimate of $u$ and the technique in [6] for reducing the degree of superlinear terms.

Thus the system (1.1)–(1.3) turns out to be

$$u_t - v_x = 0, \tag{1.14}$$
$$v_t - \bigl(f_1(u)\theta + f_2(u) + \hat\mu v_x\bigr)_x = 0, \tag{1.15}$$
$$C_V\theta_t - \theta f_1(u)v_x - \hat\mu v_x^2 - (k\theta_x)_x = 0. \tag{1.16}$$
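As a check on (1.16), the reduction can be sketched from the free energy (1.11). The intermediate formulas for $\eta$ and $e$ below are not written out in the text; they are derived here from $\psi = e - \theta\eta$ and (1.6), so they should be read as a worked verification rather than as the authors' own computation.

```latex
\begin{align*}
\eta &= -\psi_\theta = C_V \ln\theta + C_V - C' - F_1(u), \\
e &= \psi + \theta\eta = C_V\theta + F_2(u), \\
e_t &= C_V\theta_t + f_2(u)\,u_t = C_V\theta_t + f_2(u)\,v_x
      \quad\text{(by (1.14))}, \\
0 &= \bigl(e + \tfrac12 v^2\bigr)_t - (\sigma v)_x + q_x
   = e_t + v\,(v_t - \sigma_x) - \sigma v_x + q_x
   = e_t - \sigma v_x + q_x \quad\text{(by (1.2))}, \\
0 &= C_V\theta_t + f_2 v_x - (f_1\theta + f_2 + \hat\mu v_x)\,v_x - (k\theta_x)_x
   = C_V\theta_t - \theta f_1(u)\,v_x - \hat\mu v_x^2 - (k\theta_x)_x .
\end{align*}
```

The $f_2 v_x$ terms cancel, which is why $F_2$ enters the temperature equation only through the momentum equation (1.15).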

Equations (1.14)–(1.16) are supplemented by the following boundary conditions

$$\theta_x = 0 \quad \text{for } x = 0, 1 \tag{1.17}$$

and

$$v = 0 \quad \text{for } x = 0, 1, \tag{1.18}$$

as well as by the initial conditions

$$u|_{t=0} = u_0,\quad v|_{t=0} = v_0,\quad \theta|_{t=0} = \theta_0. \tag{1.19}$$

Boundary conditions (1.17) and (1.18) physically mean that there is no heat flux through the boundary and the rod is clamped at both ends. For the initial data, we assume that

(v) $u_0 \in C^{1,\alpha}[0,1]$, $v_0 \in C^{2,\alpha}[0,1]$, $\theta_0 \in C^{2,\alpha}[0,1]$ with $u_0(x), \theta_0(x) > 0$ for $x \in [0,1]$. Furthermore, the following compatibility conditions are satisfied: $v_0|_{x=0,1} = 0$, $\theta_{0x}|_{x=0,1} = 0$ and $\bigl(f_1(u_0)\theta_0 + f_2(u_0) + \hat\mu(u_0)v_0'\bigr)'\big|_{x=0,1} = 0$.

Before stating and proving our results, let us first recall the related results in the literature. For an ideal gas, in which the constitutive laws are given as follows:

$$e = C_V\theta,\quad \sigma = -R\frac{\theta}{u} + \mu\frac{v_x}{u},\quad q = -K\frac{\theta_x}{u}, \tag{1.20}$$

where $C_V$, $R$, $\mu$ and $K$ are positive constants, Nagasawa proved in [13] that the solution $(u, v, \theta)$ to the problem (1.14)–(1.16), which is subjected to (1.20) and the non-fixed and thermally insulated boundary conditions at both ends of the rod, satisfies

$$u(x,t) \ge C\log(1+t). \tag{1.21}$$

The first result on global existence of a large solution to this problem with clamped boundary conditions was obtained by Kazhikhov and Shelukhin [10] in the ideal gas case. Their analysis depends strongly on the special form of the constitutive relations (1.20). On the other hand, in 1982, Dafermos and Hsiao [7] proved the global existence of classical solutions to the system (1.1)–(1.3) for a fairly general class of solid-like materials (1.5) with stress free boundary conditions at least at one end of the rod. Furthermore, Dafermos in [6] overcame the limitations of [7] by establishing an additional estimate which is motivated by the second law of thermodynamics and embodies the dissipative effects of viscosity and thermal diffusion. The asymptotic behavior of a smooth solution as time tends to infinity has been investigated in Luo's thesis [12] for a class of solid-like materials. R. Racke and S. Zheng [15] investigated the global existence, uniqueness and asymptotic behavior of a weak solution to the model in shape memory alloys, also with a stress free boundary condition for at least one end of the rod. In all the above papers except [10], a variant of Andrews' technique (see [1]), used in Dafermos [6] and Pego [14] etc., was crucial to obtain a uniform a priori estimate on the $L^\infty$-norm of $u$. However, this technique does not apply to the case that both ends of the rod are fixed. As for the system in shape memory alloys, the problem of global existence and uniqueness of the classical solution for the case that both ends of the rod are clamped was open for about ten years until the paper by Chen and Hoffmann [4]. It should be pointed out that in that paper the estimates on the solution crucially depend at least on the $H^1$-norm of the initial data of $u$. However, Shen, Zheng and Zhu [16] (see also Chapter 2 in Zhu's Ph.D. thesis [17]) proved, by developing a new approach to get the $L^\infty$-norm of the strain $u$ (which does not depend on the $H^1$-norm of $u$), the global existence, uniqueness and asymptotic behavior of a solution to the same system but in a different framework.

The purely viscoelastic case has been intensively studied over the past twenty years. There are a large number of references in this direction. We refer to Andrews [1], Andrews and Ball [2], Antman and Seidman [3], Dafermos [5], Greenberg and MacCamy [8], and Pego [14], etc.

Now we can state our main theorem.

Theorem 1.1. Suppose that the assumptions (i)–(v) are satisfied. Assume that the following growth conditions hold:

$$\hat\mu(u) \sim \gamma_2 u^{-\mu_2} \ \text{as } u \to 0^+,\qquad \hat\mu(u) \sim \gamma_1 u^{\mu_1} \ \text{as } u \to +\infty, \tag{1.22}$$
$$f_1(u) \sim -\gamma_4 u^{-b_1} \ \text{as } u \to 0^+,\qquad f_1(u) \sim \gamma_3 u^{a_1} \ \text{as } u \to +\infty, \tag{1.23}$$
$$f_2(u) \sim -\gamma_6 u^{-b_2} \ \text{as } u \to 0^+,\qquad f_2(u) \sim \gamma_5 u^{a_2} \ \text{as } u \to +\infty. \tag{1.24}$$

Here, the symbol "$\sim$" denotes asymptotic equivalence as $u \to 0^+$ or $u \to \infty$, and $\gamma_i$ ($i = 1, \dots, 6$), $\mu_i$, $a_i$, $b_i$ ($i = 1, 2$) are non-negative constants. We denote

$$a := \max\{a_1, a_2\},\quad b := \max\{b_1, b_2\}. \tag{1.25}$$

Furthermore, we assume that $a, b > 0$ and $\mu_2 \ge 1$ (which is just the condition (1.9)). Then there exists a unique global solution $(u, v, \theta)$ to the problem (1.14)–(1.19) such that the functions $u, u_x, u_t, u_{xt}$; $v, v_x, v_t, v_{xx}$; $\theta, \theta_x, \theta_t, \theta_{xx}$ are all in $C^{\alpha,\alpha/2}$ and $u_{tt}, v_{xt}, \theta_{xt}$ are in $L^2(Q_T)$. Moreover, we have $\theta(x,t) > 0$, $C^{-1} \le u(x,t) \le C$ for all $(x,t) \in [0,1]\times[0,T]$, where $C$ is a positive constant which may depend on $T$.

Remark 1.2. Our assumptions include the rubber model in which the pressure $\hat p$ is

$$\hat p(u,\theta) = -\gamma\theta\left(u - \frac{1}{u^2}\right). \tag{1.26}$$

Here, $f_2 = 0$, so $a_2 = b_2 = 0$.

Remark 1.3. The assumption $\mu_2 \ge 1$ is just the condition (1.9). The physical meaning of this condition is as follows: when the strain $u$ is small, the viscosity becomes strong enough and prevents $u$ from going to 0. However, if $0 < \mu_2 < 1$, then $\hat M(u) \to 0$ as $u \to 0^+$ (for the definition of $\hat M(u)$, see below), and the viscosity is so weak that it cannot prevent $u$ from going to 0. So our technique does not work in this case.
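For the rubber model (1.26) of Remark 1.2, the exponents entering Theorem 1.1 can be read off directly. The following sketch assumes $\gamma > 0$ and takes the integration constant in $F_1$ to be zero; neither choice is stated explicitly in the text.

```latex
\begin{align*}
\hat p(u,\theta) &= -(f_1(u)\theta + f_2(u))
  \;\Longrightarrow\; f_1(u) = \gamma\,(u - u^{-2}),\quad f_2 \equiv 0, \\
f_1(u) &< 0 \ \text{for } 0<u<1, \qquad f_1(u) > 0 \ \text{for } u>1
  \quad (\bar u = \bar U = 1), \\
f_1(u) &\sim \gamma\,u \ \text{as } u\to+\infty
  \quad (\gamma_3 = \gamma,\ a_1 = 1), \\
f_1(u) &\sim -\gamma\,u^{-2} \ \text{as } u\to 0^+
  \quad (\gamma_4 = \gamma,\ b_1 = 2), \\
F_1(u) &= \gamma\Bigl(\tfrac12 u^2 + u^{-1}\Bigr), \qquad
a = \max\{a_1,a_2\} = 1,\quad b = \max\{b_1,b_2\} = 2 .
\end{align*}
```

In particular $a, b > 0$, as required in Theorem 1.1.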

In our proof of global existence, the main difficulty lies in the estimate of the $L^\infty$-norm of the strain $u$. This difficulty arises from the appearance of non-local terms when we handle the boundary terms. The non-local terms involve the values of both the minimum and maximum of $u$. So we need a delicate analysis of the growth rates in order to derive the relation between the minimum and maximum values of $u$ (see Lemma 2.3). This relation is crucial in our proof. Another difficulty in proving Theorem 1.1 is the appearance of superlinear terms.

The notations used in this paper are as follows: Let $\Omega = (0,1)$ (the unit interval), and $Q_t = \Omega\times(0,t)$. By $\|\cdot\|_p$ we denote the $L^p$-norm ($1 \le p \le \infty$) over $\Omega$. In particular, for $p = 2$, we denote the $L^2$-norm over $\Omega$ simply by $\|\cdot\|$. $(f, g)$ denotes the inner product of the functions $f$ and $g$. $C^{\alpha,\alpha/2}$ denotes the Hölder space of functions on $\bar Q_t = [0,1]\times[0,t]$ which are uniformly Hölder continuous with exponent $\alpha$ in $x$ and exponent $\alpha/2$ in $t$.

This article is organized as follows. In Sect. 2 we derive pointwise uniform a priori estimates, from both below and above, of the strain $u$, which play a crucial role in our proof. This section is the heart of the present paper. In Sect. 3 we derive further estimates on the unknown functions. In Sect. 4, we prove global existence and uniqueness of a smooth solution by making use of the framework devised in [7]. Therefore, our principal objective is to show that $(u, v, \theta)$ is a priori bounded in the space $B$ referred to in [6], i.e. to prove Proposition 4.1.

2. Bounds of the Strain

In this section, we derive the uniform pointwise boundedness, from both below and above, of the unknown function $u(x,t)$. This will be done in a series of lemmas leading to the desired a priori estimates. The letters $C$, $C_i$ ($i \in \mathbb{N}$) will denote various constants which may depend on $T$. In the sequel, without loss of generality, we assume that $C_V = 1$, $\int_0^1 u_0(x)\,dx = 1$.

Lemma 2.1. For any $t > 0$, the following estimates hold:

$$\int_0^1 u(x,t)\,dx = \int_0^1 u(x,0)\,dx = 1,\qquad \theta(x,t) > 0 \ \text{in } [0,1]\times\mathbb{R}_+, \tag{2.1}$$
$$\|\theta(t)\|_1 + \|v(t)\|^2 + \|u\|_{a+1}^{a+1} + \|u^{-b+1}\|_1 \le C, \tag{2.2}$$
$$\int_0^t\!\!\int_0^1 \left(\frac{\theta_x^2}{\theta^2} + \frac{\hat\mu v_x^2}{\theta}\right)dx\,d\tau \le C, \tag{2.3}$$
$$\int_0^t \|\theta(\tau)\|^2\,d\tau \le C\int_0^t \sup_{x\in[0,1]}\theta(x,\tau)\,d\tau \le C + Ct. \tag{2.4}$$

Proof. First, applying the maximum principle to (1.16), we find that

$$\theta(x,t) > 0,\quad \forall\,(x,t) \in [0,1]\times\mathbb{R}_+. \tag{2.5}$$

On the other hand, integrating (1.14) with respect to x, t over Qt and using the boundary condition (1.18), we arrive at (2.1).

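Next, multiplying (1.15) by $v$, adding the result to (1.16) and integrating the resultant with respect to $x$ over $\Omega$, we arrive at

$$\frac{d}{dt}\int_0^1 \left(\theta + F_2(u) + \tfrac12 v^2\right)(t)\,dx = 0. \tag{2.6}$$

Hence,

$$\int_0^1 \left(\theta + F_2(u) + \tfrac12 v^2\right)(t)\,dx =: E_1, \tag{2.7}$$

where $E_1$ is a constant depending only on the initial data. Multiplying (1.16) by $\theta^{-1}$ and integrating the resultant with respect to $x$ yield

$$\frac{d}{dt}\left(\int_0^1 \log\theta\,dx - \int_0^1 F_1(u)\,dx\right) = \int_0^1 \left(\frac{k\theta_x^2}{\theta^2} + \frac{\hat\mu v_x^2}{\theta}\right)dx \ge 0. \tag{2.8}$$

Since $\log\theta \le \theta - 1$ for all $\theta > 0$, we obtain

$$\int_0^1 F_1(u)\,dx + \int_0^t\!\!\int_0^1 \left(\frac{k\theta_x^2}{\theta^2} + \frac{\hat\mu v_x^2}{\theta}\right)dx\,d\tau \le C + C\int_0^1 (\theta - 1)\,dx \le C. \tag{2.9}$$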
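The identity (2.8) can be checked term by term. The computation below is a sketch that uses $C_V = 1$, the relation $u_t = v_x$ from (1.14), $f_1 = F_1'$, the constancy of $k$ from (1.10), and the boundary condition (1.17).

```latex
\begin{align*}
0 &= \frac{1}{\theta}\Bigl(\theta_t - \theta f_1(u)v_x - \hat\mu v_x^2 - k\theta_{xx}\Bigr)
   = (\log\theta)_t - \bigl(F_1(u)\bigr)_t - \frac{\hat\mu v_x^2}{\theta} - \frac{k\theta_{xx}}{\theta}, \\
\int_0^1 \frac{\theta_{xx}}{\theta}\,dx
  &= \Bigl[\frac{\theta_x}{\theta}\Bigr]_{x=0}^{x=1} + \int_0^1 \frac{\theta_x^2}{\theta^2}\,dx
   = \int_0^1 \frac{\theta_x^2}{\theta^2}\,dx, \\
\frac{d}{dt}\Bigl(\int_0^1 \log\theta\,dx - \int_0^1 F_1(u)\,dx\Bigr)
  &= \int_0^1 \Bigl(\frac{k\theta_x^2}{\theta^2} + \frac{\hat\mu v_x^2}{\theta}\Bigr)dx \;\ge\; 0 .
\end{align*}
```

The sign of the right-hand side relies on $\theta > 0$ from (2.5), $k > 0$, and $\hat\mu > 0$ from (1.9).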

Invoking the definition of $f_1$, $f_2$, using the above estimates and Young's inequality, we assert that (2.2) is valid. To prove (2.4), we write

$$\theta^{1/2}(x,t) = \theta^{1/2}(y(t),t) + \int_{y(t)}^{x} \frac{\theta_x}{2\theta^{1/2}(x,t)}\,dx, \tag{2.10}$$

where $y(t) \in [0,1]$ is chosen so that, for any $t$, $\theta(y(t),t) = \int_0^1 \theta\,dx$. Thus,

$$\int_0^t \|\theta(\tau)\|_\infty\,d\tau \le C\int_0^t\!\!\int_0^1 \theta\,dx\,d\tau + C\int_0^t \left(\int_0^1 \frac{\theta_x^2}{\theta^2}\,dx\right)\left(\int_0^1 \theta\,dx\right)d\tau \le C + Ct. \tag{2.11}$$

Thus the proof is complete. □

The following lemma plays an important role in estimating the $L^\infty$-norm of $u$. Motivated by [10], using a procedure similar to [9], we have

Lemma 2.2. The following expression holds:

$$\frac{\partial}{\partial t}\left[\int_1^x v\,dy + \left(v, \int_0^x u\,dy\right) + \int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx - \hat M(u)\right] = \int_0^1 (v^2 + \hat p u)\,dx - \hat p(u,\theta). \tag{2.12}$$

Here, $\hat M(u) = \int_1^u \hat\mu(y)\,dy$.
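Lemma 2.2 rests on a few elementary identities. They are sketched here under the assumption that the solution is smooth, using only $u_t = v_x$ from (1.14), the boundary condition (1.18), and the normalization $\int_0^1 u\,dx = 1$ from (2.1).

```latex
\begin{align*}
\frac{\partial}{\partial t}\hat M(u)
  &= \hat\mu(u)\,u_t = \hat\mu(u)\,v_x, \\
\frac{d}{dt}\int_0^1\!\!\int_1^{u} y\,\hat\mu(y)\,dy\,dx
  &= \int_0^1 u\,\hat\mu(u)\,u_t\,dx = \bigl(\hat\mu v_x,\,u\bigr), \\
\Bigl(v,\int_0^x u_t\,dy\Bigr)
  &= \Bigl(v,\int_0^x v_y\,dy\Bigr) = \bigl(v,\,v - v|_{x=0}\bigr) = \|v\|^2, \\
\int_0^x u\,dy\,\Big|_{x=0} &= 0, \qquad \int_0^x u\,dy\,\Big|_{x=1} = 1 .
\end{align*}
```

The last line is why integrating by parts against $\int_0^x u\,dy$ leaves only the boundary value of $\hat\mu v_x - \hat p$ at $x = 1$.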

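Proof. Firstly, we deduce, by integrating Eq. (1.15) over $[1, x]$, that

$$0 = \frac{\partial}{\partial t}\int_1^x v\,dy - \bigl(\hat\mu v_x - \hat p(u,\theta)\bigr)\Big|_1^x = \frac{\partial}{\partial t}\left(\int_1^x v\,dy - \hat M(u)\right) + \hat p(u,\theta) + \bigl(\hat\mu v_x - \hat p(u,\theta)\bigr)\big|_{x=1}. \tag{2.13}$$

Next, we multiply (1.15) by $\int_0^x u\,dy$ and integrate the resultant with respect to $x$ to get

$$\begin{aligned}
0 &= \left(v_t, \int_0^x u\,dy\right) - \left((\hat\mu v_x - \hat p)_x, \int_0^x u\,dy\right) \\
&= \frac{d}{dt}\left(v, \int_0^x u\,dy\right) - \left(v, \int_0^x u_t\,dy\right) + (\hat\mu v_x - \hat p,\, u) - (\hat\mu v_x - \hat p)\big|_{x=1} \\
&= \frac{d}{dt}\left[\left(v, \int_0^x u\,dy\right) + \int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx\right] - \int_0^1 (v^2 + \hat p u)\,dx - (\hat\mu v_x - \hat p)\big|_{x=1}.
\end{aligned} \tag{2.14}$$

Combining (2.14) with (2.13) yields (2.12). □

Now we introduce $A$, $m$ to measure the growth rates of various quantities. For any given $T > 0$, let

$$A = A(T) = \sup_{(x,t)\in[0,1]\times[0,T]} u(x,t) \tag{2.15}$$

and

$$m = m(T) = \inf_{(x,t)\in[0,1]\times[0,T]} u(x,t). \tag{2.16}$$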

In the sequel, we may assume $A \ge 1$ and $m \le 1$. The following lemma gives the relation between $m$ and $A$.

Lemma 2.3. Assume that all the conditions in Theorem 1.1 are met. Then the following relations hold:

i) for $\mu_2 > 1$,

$$A^{\mu_1+1} \le C_1 m^{-\beta} + C_2,\qquad m^{-\mu_2+1} \le C_3 A^{\alpha} + C_4; \tag{2.17}$$

ii) for $\mu_2 = 1$,

$$A^{\mu_1+1} \le C_5\sqrt{\log(m^{-1})} + C_6,\qquad \log(m^{-1}) \le C_7 A^{\alpha} + C_8, \tag{2.18}$$

where $\alpha$, $\beta$, $C_i$ ($i = 1, \dots, 8$) are positive constants. The $C_i$ ($i = 1, \dots, 8$) are independent of $m$, $A$ but may depend on $T$, and $\alpha$, $\beta$ satisfy

$$\mu_1 + 1 - a < \alpha < \mu_1 + 1,\qquad \mu_2 - 1 - b < \beta < \mu_2 - 1. \tag{2.19}$$

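Proof. Let

$$q(x,t) = \int_1^x v\,dy + \left(v, \int_0^x u\,dy\right) + \int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx - \hat M(u) - \int_0^t\!\!\int_0^1 (v^2 + \hat p u)\,dx\,d\tau. \tag{2.20}$$

Then, by (2.12) we have

$$\frac{\partial q}{\partial t}(x,t) = -\hat p(u,\theta). \tag{2.21}$$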
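The relation (2.21) is (2.12) rewritten. A short sketch, in which the bracket collects the first four terms of (2.20):

```latex
\begin{align*}
\frac{\partial q}{\partial t}
 &= \frac{\partial}{\partial t}\Bigl[\int_1^x v\,dy + \Bigl(v,\int_0^x u\,dy\Bigr)
    + \int_0^1\!\!\int_1^u y\,\hat\mu(y)\,dy\,dx - \hat M(u)\Bigr]
    - \int_0^1 (v^2 + \hat p\,u)\,dx \\
 &= \Bigl(\int_0^1 (v^2 + \hat p\,u)\,dx - \hat p(u,\theta)\Bigr)
    - \int_0^1 (v^2 + \hat p\,u)\,dx
  \;=\; -\hat p(u,\theta).
\end{align*}
```

So the sign of $\partial q/\partial t$ at a point is fixed by the sign of $\hat p$ there, which by (1.13) depends only on whether $u$ lies below $\bar u$ or above $\bar U$.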

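It is easy to infer from (2.2) and (2.4) that

$$\begin{aligned}
\left|\int_0^t\!\!\int_0^1 (v^2 + \hat p u)\,dx\,d\tau\right|
&\le Ct + C\int_0^t \sup_{x\in[0,1]}|\theta|(x,\tau)\int_0^1 \bigl(|F_1(u)| + |F_2(u)|\bigr)\,dx\,d\tau \\
&\le Ct + C\int_0^t \sup_{x\in[0,1]}|\theta|(x,\tau)\,d\tau \\
&\le C + Ct.
\end{aligned} \tag{2.22}$$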

In what follows, the argument is divided into two steps.

Step 1. First, we consider the case $\mu_2 > 1$. For this case, we handle the third term in (2.20) as follows. By using the Young inequality and the growth condition on $\hat\mu$, we have

$$\left|\int_1^u y\hat\mu(y)\,dy\right| \le C + C\left(u^{\mu_1+2} + u^{2-\mu_2}\right). \tag{2.23}$$

Hence, by virtue of (2.3), one has

$$\begin{aligned}
\left|\int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx\right|
&\le C + C\left(\sup_{Q_t} u^{\mu_1+1-a}\int_0^1 u^{a+1}\,dx + \sup_{Q_t}\,(u^{-1})^{\mu_2-1-b}\int_0^1 u^{-b+1}\,dx\right) \\
&\le C + C\left(\bigl(\sup_{Q_t} u\bigr)^{\mu_1+1-a} + \bigl(\sup_{Q_t} u^{-1}\bigr)^{\mu_2-1-b}\right).
\end{aligned} \tag{2.24}$$

Since (2.1)–(2.3) and $u_0 \in L^\infty$ hold, we have, for any $x \in [0,1]$,

$$|q(x,0)| \le A_0 \tag{2.25}$$
$$\phantom{|q(x,0)|} \le A^{\alpha} + m^{-\beta}, \tag{2.26}$$

where $A_0$, $\alpha$, $\beta$ are positive constants and $\alpha$, $\beta$ are the same as in (2.19). In what follows, we want to show that

$$|q(x,t)| \le A^{\alpha} + m^{-\beta} \quad \text{for all } (x,t) \in [0,1]\times[0,T]. \tag{2.27}$$

We employ the contradiction argument. Suppose now that for some $x_0 \in [0,1]$ there exists $t_0 \in [0,T]$ such that

$$q(x_0,t_0) > A^{\alpha} + m^{-\beta} + \ell, \tag{2.28}$$

where $\ell$ is any given positive constant. We may assume the solution is smooth. Therefore, the mapping $t \mapsto q(x_0,t)$, for this fixed $x_0 \in [0,1]$, is continuously differentiable. This implies that there exists $t^* \in (0,t_0)$ such that

$$q(x_0,t^*) = A^{\alpha} + m^{-\beta} + \ell, \tag{2.29}$$
$$q(x_0,t) \le A^{\alpha} + m^{-\beta} + \ell \quad \text{for all } t \in [0,t^*], \tag{2.30}$$

and

$$\frac{\partial q}{\partial t}(x_0,t^*) \ge 0. \tag{2.31}$$

On the other hand, we write (2.24) as follows:

$$\left|\int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx\right| \le C\left(A^{\mu_1+1-a} + m^{-(\mu_2-1-b)}\right). \tag{2.32}$$

We deduce from the fact $\hat\mu(u)u \ge \mu_0$, $\forall\, 0 < u < \infty$, that

$$\hat M(u) \ge \mu_0\log u. \tag{2.33}$$

In the sequel, without loss of generality, we let $\mu_0 = 1$. Recalling (2.29), (2.22), (2.2), (2.1), and (2.19), and invoking the definition of $q$, we get

$$\begin{aligned}
A^{\alpha} + m^{-\beta} + \ell &= q(x_0,t^*) \\
&= \left[\int_1^x v\,dy + \left(v,\int_0^x u\,dy\right) + \int_0^1\!\!\int_1^u y\hat\mu(y)\,dy\,dx - \hat M(u)\right]_{x=x_0,\,t=t^*} - \int_0^{t^*}\!\!\int_0^1 (v^2 + \hat p u)\,dx\,d\tau \\
&\le -\hat M(u)\big|_{t=t^*} + C + \tfrac12\left(A^{\alpha} + m^{-\beta} + \ell\right).
\end{aligned} \tag{2.34}$$

Therefore, combining this with (2.33) yields

$$\log u\big|_{t=t^*} \le \hat M(u)\big|_{t=t^*} \le C - \tfrac12\left(A^{\alpha} + m^{-\beta}\right). \tag{2.35}$$

Without loss of generality, we can assume that the term on the right-hand side of (2.35) is negative. Otherwise, if $C - \tfrac12(A^{\alpha} + m^{-\beta}) \ge 0$, i.e. $C \ge \tfrac12(A^{\alpha} + m^{-\beta})$, the proof is done. Hence, we assert from (2.35) that

$$u\big|_{t=t^*} \le e^{\,C - \frac12\left(A^{\alpha} + m^{-\beta}\right)} \to 0 \quad \text{as } A \to \infty \text{ or } m \to 0. \tag{2.36}$$

Therefore, if we take $A$ or $m^{-1}$ large enough, then

$$u\big|_{t=t^*} < \bar u. \tag{2.37}$$

By virtue of the property of $\hat p$, we get immediately

$$\hat p(u,\theta)\big|_{t=t^*} > 0. \tag{2.38}$$

Combining this with (2.21) yields

$$\frac{\partial q}{\partial t}(x_0,t^*) < 0, \tag{2.39}$$

which contradicts the inequality (2.31). We conclude from the above argument that either $A$, $m^{-1}$ are uniformly bounded in $T$ or, for all $(x,t)$ in $Q_T$,

$$q(x,t) \le A^{\alpha} + m^{-\beta}. \tag{2.40}$$

As for the lower bound on $q(x,t)$, we also use the contradiction argument. There is a slight difference from the derivation of the upper bound. Suppose now that for some $x_0 \in [0,1]$, there exists some $t_0 \in [0,T]$ such that

$$q(x_0,t_0) < -\left(A^{\alpha} + m^{-\beta} + \ell\right), \tag{2.41}$$

where $\ell$ is any given positive constant. Using the mean value theorem, we know that there exists $t^* \in (0,t_0)$ such that

$$q(x_0,t^*) = -\left(A^{\alpha} + m^{-\beta}\right) - \ell, \tag{2.42}$$
$$q(x_0,t) \ge -\left(A^{\alpha} + m^{-\beta}\right) - \ell \quad \text{for all } t \in [0,t^*], \tag{2.43}$$

and

$$\frac{\partial q}{\partial t}(x_0,t^*) \le 0. \tag{2.44}$$

We deduce from (2.42) that

$$\hat M(u^*) = A^{\alpha} + m^{-\beta} + \int_0^1\!\!\int_1^{u^*} y\hat\mu(y)\,dy\,dx + \cdots. \tag{2.45}$$

Here, $u^* = u(x_0,t^*)$ and we denote the bounded terms by dots $\cdots$. It follows from (2.32) and (2.19) that the right-hand side of (2.45) grows at most as $A^{\alpha} + m^{-\beta}$. On the other hand, since $\mu_2 > 1$, we conclude that

$$\hat M(u) \to -\infty \ \text{as } u \to 0^+;\qquad \hat M(u) \to +\infty \ \text{as } u \to \infty. \tag{2.46}$$

We deduce from the above property of $\hat M(u)$ that $u^*$ in (2.45) must be very large provided $A$, $m^{-1}$ are suitably large. Hence, the terms with positive exponents will be very large and play a dominant role, while the terms with negative exponents are very small, so they can be omitted. Therefore, it follows from (2.45) and the Young inequality that

$$u(x_0,t^*)^{\mu_1+1} \ge \tfrac12\left(A^{\alpha} + m^{-\beta}\right) - C. \tag{2.47}$$

We can assume that the right-hand side of (2.47) is positive; otherwise, the proof is done. Hence, we can solve for $u(x_0,t^*)$ from (2.47):

$$u(x_0,t^*) \ge \left(\tfrac12\left(A^{\alpha} + m^{-\beta}\right) - C\right)^{\frac{1}{\mu_1+1}}. \tag{2.48}$$

Whence, if we take $A$ or $m^{-1}$ large enough, then

$$u(x_0,t^*) > \bar U. \tag{2.49}$$

By virtue of the property of $\hat p$, we get immediately

$$\hat p(u,\theta)\big|_{t=t^*} < 0. \tag{2.50}$$

Combining this with (2.21) yields

$$\frac{\partial q}{\partial t}(x_0,t^*) > 0, \tag{2.51}$$

which contradicts the inequality (2.44). We conclude from the above argument that either $A$ and $m^{-1}$ are uniformly bounded in $T$ or, for all $(x,t) \in Q_T$,

$$q(x,t) \ge -A^{\alpha} - m^{-\beta}. \tag{2.52}$$

Hence, we conclude from (2.40) and (2.52) that either $A$ is uniformly bounded or, for all $(x,t) \in Q_T$,

$$|q(x,t)| \le A^{\alpha} + m^{-\beta}. \tag{2.53}$$

We now can show that if (2.53) holds, then $A$, $m$ satisfy the relation (2.17). Indeed, suppose $u(x,t)$ achieves its minimum and maximum at $(x_1,t_1)$ and $(x_2,t_2)$, respectively. Substituting $(x_i,t_i)$ ($i = 1, 2$) into (2.53), we arrive easily at the relation (2.17).

Step 2. We now investigate the case $\mu_2 = 1$. For this case, we should replace the bound $A^{\alpha} + m^{-\beta}$ of $q(x,t)$ in (2.53) with $A^{\alpha} + \sqrt{\log(m^{-1})}$. Then, by repeating the argument in Step 1, we obtain the relation (2.18). Thus the proof is complete. □

As a corollary of Lemma 2.3, we have

Lemma 2.4. There exists a positive constant $C$ such that

$$A \le C,\qquad m \ge C^{-1}, \tag{2.54}$$
$$C^{-1} \le u(x,t) \le C. \tag{2.55}$$

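Proof. It follows from Lemma 2.3 that

$$A^{\mu_1+1} \le C_1\left(C_3 A^{\alpha} + C_4\right)^{\frac{\beta}{\mu_2-1}} + C_2 \le C_9 A^{\alpha\cdot\frac{\beta}{\mu_2-1}} + C_{10} \tag{2.56}$$

for $\mu_2 > 1$ and

$$A^{\mu_1+1} \le C_5\sqrt{C_7 A^{\alpha} + C_8} + C_6 \le C_{11} A^{\frac{\alpha}{2}} + C_{12} \tag{2.57}$$

for $\mu_2 = 1$. Recalling (2.19) and using Young's inequality, we conclude that in both cases the following holds:

$$A \le C_{13}, \tag{2.58}$$

whence

$$m \ge \left(C_3 C_{13}^{\alpha} + C_4\right)^{-\frac{1}{\mu_2-1}} \quad \text{for } \mu_2 > 1, \tag{2.59}$$
$$m \ge \exp\left\{-\left(C_7 C_{13}^{\alpha} + C_8\right)\right\} \quad \text{for } \mu_2 = 1. \tag{2.60}$$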
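The step from (2.56) to (2.58) is a comparison of exponents. A sketch for the case $\mu_2 > 1$, writing $\kappa := \alpha\beta/(\mu_2 - 1)$ and $C_\kappa$ for a constant depending only on $C_9$, $\kappa$ and $\mu_1$ (both symbols are introduced here only for illustration), and using the constraints (2.19) together with $A \ge 1$:

```latex
\begin{align*}
0 < \beta < \mu_2 - 1 \ \text{and}\ 0 < \alpha < \mu_1 + 1
 &\;\Longrightarrow\;
 \kappa := \frac{\alpha\beta}{\mu_2 - 1} < \alpha < \mu_1 + 1, \\
C_9 A^{\kappa}
 &\le \tfrac12 A^{\mu_1+1} + C_\kappa
 \quad\text{(Young's inequality with exponents }
 \tfrac{\mu_1+1}{\kappa},\ \tfrac{\mu_1+1}{\mu_1+1-\kappa}\text{)}, \\
A^{\mu_1+1} \le C_9 A^{\kappa} + C_{10}
 &\;\Longrightarrow\;
 \tfrac12 A^{\mu_1+1} \le C_\kappa + C_{10}
 \;\Longrightarrow\;
 A \le \bigl(2C_\kappa + 2C_{10}\bigr)^{1/(\mu_1+1)} .
\end{align*}
```

For $\mu_2 = 1$ the same argument applies to (2.57) with $\kappa$ replaced by $\alpha/2$, which is again strictly smaller than $\mu_1 + 1$.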

Thus the proof is complete. □

3. Further a priori Estimates

In this section we derive further bounds on the unknown functions and their derivatives. After the preparations in the previous section, we are now in a position to prove the following estimates. Since the assumptions on the growth rates in our model are weaker than those in [6], we can get more delicate estimates which are somewhat different from those derived in [6].

Lemma 3.1. The following estimates hold for all $t \in [0,T]$ and for any $\alpha \in (0,1]$:

$$\int_0^t \|v_x\|^2\,d\tau \le C, \tag{3.1}$$
$$\int_0^t\!\!\int_0^1 \left(\frac{\theta_x^2}{\theta^{1+\alpha}} + \frac{v_x^2}{\theta^{\alpha}}\right)dx\,d\tau \le C, \tag{3.2}$$
$$\int_0^t \|\theta_x\|^2\,d\tau \le C\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^{1+\alpha} + C. \tag{3.3}$$

Proof. First, we prove (3.1). Integrating (1.16) with respect to $x$, $t$ over $Q_t$ and using the boundary conditions yield

$$\begin{aligned}
\int_0^t\!\!\int_0^1 \hat\mu v_x^2\,dx\,d\tau
&= -\int_0^t\!\!\int_0^1 f_1\theta v_x\,dx\,d\tau + \int_0^1 \theta\,dx\,\Big|_0^t \\
&\le C + \int_0^t \left(C_\varepsilon\|\theta\|^2 + \varepsilon\|v_x\|^2\right)d\tau \\
&\le C + \varepsilon\int_0^t \|v_x\|^2\,d\tau.
\end{aligned} \tag{3.4}$$

Here, we have used Lemma 2.1 and Lemma 2.4. Taking $\varepsilon$ small enough we arrive at (3.1).

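Next, since it is easy to see that (3.3) follows from (3.2), we need only to prove (3.2). To this end, recalling Lemmas 2.1, 2.4, we find that (3.2) holds for $\alpha = 1$. To prove (3.2) with $\alpha = \tfrac12$, we multiply (1.16) by $\theta^{-1/2}$ and integrate the resulting equation with respect to $x$ over the interval $(0,1)$ to get

$$\begin{aligned}
\int_0^1 \frac{k\theta_x^2}{2\theta^{3/2}}\,dx + \int_0^1 \frac{\hat\mu v_x^2}{\theta^{1/2}}\,dx
&= 2\frac{d}{dt}\int_0^1 \theta^{1/2}\,dx - \int_0^1 f_1\theta^{1/2}v_x\,dx \\
&= 2\frac{d}{dt}\int_0^1 \theta^{1/2}\,dx - \int_0^1 f_1\left(\theta^{1/2} - \overline{\theta^{1/2}}\right)v_x\,dx - \overline{\theta^{1/2}}\int_0^1 f_1 v_x\,dx.
\end{aligned} \tag{3.5}$$

Here, $\overline{\theta^{1/2}} = \int_0^1 \theta^{1/2}\,dx$. By virtue of (3.1) and Lemmas 2.1, 2.4, we find

$$\int_0^1 \theta^{1/2}\,dx \le C, \tag{3.6}$$

$$\begin{aligned}
\left|\int_0^t\!\!\int_0^1 f_1\left(\theta^{1/2} - \overline{\theta^{1/2}}\right)v_x\,dx\,d\tau\right|
&\le C\int_0^t \left\|\theta^{1/2} - \overline{\theta^{1/2}}\right\|_\infty \|v_x\|_1\,d\tau \\
&\le C\int_0^t\!\!\int_0^1 \frac{|\theta_x|}{\theta^{1/2}}\,dx\,\|v_x\|_1\,d\tau \\
&\le C\int_0^t \left\|\frac{\theta_x}{\theta}\right\|\,\|\theta\|_1^{1/2}\,\|v_x\|_1\,d\tau \\
&\le C\int_0^t \left\|\frac{\theta_x}{\theta}\right\|^2 d\tau + C\int_0^t \|v_x\|^2\,d\tau \le C,
\end{aligned} \tag{3.7}$$

and

$$\left|\int_0^t \overline{\theta^{1/2}}\int_0^1 f_1 v_x\,dx\,d\tau\right| \le C\int_0^t \|v_x\|^2\,d\tau + C \le C. \tag{3.8}$$

Therefore, we can infer from (3.5)–(3.8) that (3.2) holds for $\alpha = \tfrac12$. Assuming now that (3.2) holds for $\alpha = 2^{-n}$, $n \in \mathbb{N}$, we deduce inductively that (3.2) is valid for $\alpha = 2^{-(n+1)}$. By the interpolation technique, we find that for any $\alpha \in [2^{-n-1}, 2^{-n}]$ ($n = 0, 1, 2, 3, \dots$),

$$\begin{aligned}
\int_0^t\!\!\int_0^1 \frac{\theta_x^2}{\theta^{1+\alpha}}\,dx\,d\tau
&= \int_0^t\!\!\int_0^1 \frac{\theta_x^{2\beta}}{\theta^{\beta(1+2^{-n})}}\cdot\frac{\theta_x^{2(1-\beta)}}{\theta^{(1-\beta)(1+2^{-n-1})}}\,dx\,d\tau \\
&\le \int_0^t \left(\int_0^1 \frac{\theta_x^2}{\theta^{1+2^{-n}}}\,dx\right)^{\beta}\left(\int_0^1 \frac{\theta_x^2}{\theta^{1+2^{-n-1}}}\,dx\right)^{1-\beta}d\tau \\
&\le \left(\int_0^t\!\!\int_0^1 \frac{\theta_x^2}{\theta^{1+2^{-n}}}\,dx\,d\tau\right)^{\beta}\left(\int_0^t\!\!\int_0^1 \frac{\theta_x^2}{\theta^{1+2^{-n-1}}}\,dx\,d\tau\right)^{1-\beta} \\
&\le C.
\end{aligned} \tag{3.9}$$

Here, $\beta = \dfrac{\alpha - 2^{-n-1}}{2^{-n} - 2^{-n-1}}$, so that $\alpha = \beta 2^{-n} + (1-\beta)2^{-n-1}$. Similarly, we can prove that the term involving $v$ in (3.2) is also bounded. Then we assert that (3.2) holds for all $\alpha \in (0,1]$. Thus the proof of the lemma is complete. □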
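The exponent bookkeeping in (3.9) can be verified directly. The sketch below only uses the definition of $\beta$ and Hölder's inequality with the conjugate pair $(1/\beta,\ 1/(1-\beta))$; the abbreviation $s_n$ is introduced here for illustration and does not appear in the text.

```latex
\begin{align*}
s_n := 2^{-n}, \qquad
\beta &= \frac{\alpha - s_{n+1}}{s_n - s_{n+1}} \in [0,1]
 \quad\text{for } \alpha \in [s_{n+1}, s_n], \\
\beta(1 + s_n) + (1-\beta)(1 + s_{n+1})
 &= 1 + s_{n+1} + \beta\,(s_n - s_{n+1}) = 1 + \alpha, \\
\int_0^1 \frac{\theta_x^{2}}{\theta^{1+\alpha}}\,dx
 = \int_0^1 \Bigl(\frac{\theta_x^{2}}{\theta^{1+s_n}}\Bigr)^{\beta}
            \Bigl(\frac{\theta_x^{2}}{\theta^{1+s_{n+1}}}\Bigr)^{1-\beta} dx
 &\le \Bigl(\int_0^1 \frac{\theta_x^{2}}{\theta^{1+s_n}}\,dx\Bigr)^{\beta}
     \Bigl(\int_0^1 \frac{\theta_x^{2}}{\theta^{1+s_{n+1}}}\,dx\Bigr)^{1-\beta} .
\end{align*}
```

Applying Hölder's inequality once more in the time variable, with the same pair of exponents, gives the third line of (3.9).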

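In what follows, we derive the relation between $u_x$ and $\theta$.

Lemma 3.2. The following relation holds for all $t \in [0,T]$, and for any $\alpha \in (0,1]$:

$$\|u_x(t)\|^2 \le C + C\sup_{0\le\tau\le t}\|\theta\|_\infty^{\alpha}. \tag{3.10}$$

Proof. We rewrite (1.15) as follows:

$$\frac{\partial}{\partial t}\left(v - \hat M(u)_x\right) + \hat p_x = 0. \tag{3.11}$$

Then multiplying (3.11) by $v - \hat M(u)_x$ and integrating it with respect to $x$ over $(0,1)$, we arrive at

$$\begin{aligned}
0 &= \frac12\frac{d}{dt}\|v - \hat M(u)_x\|^2 + \left(\hat p_x,\, v - \hat M(u)_x\right) \\
&= \frac12\frac{d}{dt}\|v - \hat M(u)_x\|^2 + \left(\hat p_u u_x,\, v - \hat M(u)_x\right) - \left(f_1\theta_x,\, v - \hat M(u)_x\right).
\end{aligned} \tag{3.12}$$

The right-hand side of (3.12) is handled as follows. Invoking Lemma 2.4,

$$\begin{aligned}
\left|\left(\hat p_u u_x,\, v - \hat M(u)_x\right)\right|
&\le C\left(\|\theta(t)\|_\infty + 1\right)\int_0^1 |\hat M(u)_x|\cdot|v - \hat M(u)_x|\,dx \\
&\le C\left(\|\theta(t)\|_\infty + 1\right)\left(\|v - \hat M(u)_x\|^2 + \|v\|^2\right) \\
&\le C\left(\|\theta(t)\|_\infty + 1\right)\left(\|v - \hat M(u)_x\|^2 + 1\right),
\end{aligned} \tag{3.13}$$

$$\begin{aligned}
\left|\left(f_1\theta_x,\, v - \hat M(u)_x\right)\right|
&\le C\int_0^1 |\theta_x|\,|v - \hat M(u)_x|\,dx \\
&= C\int_0^1 |\theta_x|\,\theta^{-(1+\alpha)/2}\,\theta^{\alpha/2}\,\theta^{1/2}\,|v - \hat M(u)_x|\,dx \\
&\le C\int_0^1 \theta^{\alpha}\frac{\theta_x^2}{\theta^{1+\alpha}}\,dx + C\int_0^1 \theta\,|v - \hat M(u)_x|^2\,dx \\
&\le C\|\theta(t)\|_\infty^{\alpha}\int_0^1 \frac{\theta_x^2}{\theta^{1+\alpha}}\,dx + C\|\theta(t)\|_\infty\|v - \hat M(u)_x\|^2.
\end{aligned} \tag{3.14}$$

Using (3.2), (2.4), and applying Gronwall's inequality (see, e.g. [11]), we find

$$\begin{aligned}
\|v - \hat M(u)_x\|^2
&\le C\left(1 + \int_0^t \left(\int_0^1 \frac{\theta_x^2}{\theta^{1+\alpha}}\,dx\cdot\|\theta\|_\infty^{\alpha} + \|\theta\|_\infty\right)d\tau\right)\cdot\exp\left(C\int_0^t (\|\theta\|_\infty + 1)\,d\tau\right) \\
&\le C\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^{\alpha} + C.
\end{aligned} \tag{3.15}$$

However, we have

$$\|u_x\|^2 \le C\|\hat M(u)_x\|^2 \le C\left(\|v\|^2 + \|v - \hat M(u)_x\|^2\right), \tag{3.16}$$

whence (3.10) follows. Thus the lemma is proved. □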

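Concerning the estimates on the higher order derivatives of $v$, we have

Lemma 3.3. The following estimate holds for all $t \in [0,T]$, and for any $\alpha \in (0,1]$:

$$\|v_x\|^2 + \int_0^t \left(\|v_t(\tau)\|^2 + \|v_{xx}(\tau)\|^2\right)d\tau \le C\left(1 + \sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^{1+\alpha} + \sqrt{\int_0^t \|\theta_t(\tau)\|^2\,d\tau}\right). \tag{3.17}$$

Proof. Step 1. Multiplying (1.15) by $v_t$ and integrating the resulting equation with respect to $x$ yield

$$\begin{aligned}
0 &= \|v_t\|^2 + \frac{d}{dt}\int_0^1 \frac{\hat\mu}{2}v_x^2\,dx - (\hat p, v_{xt}) - \frac12\int_0^1 \hat\mu_t v_x^2\,dx \\
&= \|v_t\|^2 + \frac{d}{dt}\left[\int_0^1 \frac{\hat\mu}{2}v_x^2\,dx - (\hat p, v_x)\right] + (\hat p_t, v_x) - \frac12\int_0^1 \hat\mu_t v_x^2\,dx \\
&= \|v_t\|^2 + \frac{d}{dt}\left[\int_0^1 \frac{\hat\mu}{2}v_x^2\,dx - (\hat p, v_x)\right] - (f_1\theta_t, v_x) - (f_1'\theta + f_2',\, v_x^2) - \frac12\int_0^1 \hat\mu_t v_x^2\,dx.
\end{aligned} \tag{3.18}$$

But, by Nirenberg's inequality, we have

$$\|v_x\|_3 \le C\|v_{xx}\|^{\frac16}\|v_x\|^{\frac56} + C\|v_x\|. \tag{3.19}$$

Therefore,

$$\begin{aligned}
\left|\int_0^t\!\!\int_0^1 \hat\mu_t v_x^2\,dx\,d\tau\right|
&\le C\int_0^t\!\!\int_0^1 |v_x|^3\,dx\,d\tau \\
&\le C\int_0^t \left(\|v_{xx}\|^{\frac12}\|v_x\|^{\frac52} + \|v_x\|^3\right)d\tau \\
&\le C\left(\int_0^t \|v_{xx}\|^2\,d\tau\right)^{\frac14}\left(\int_0^t \|v_x\|^{\frac{10}{3}}\,d\tau\right)^{\frac34} + C\int_0^t \|v_x\|^3\,d\tau \\
&\le C\left[\left(\int_0^t \|v_{xx}\|^2\,d\tau\right)^{\frac14}\left(\int_0^t \|v_x\|^2\,d\tau\right)^{\frac34} + \int_0^t \|v_x\|^2\,d\tau\right]\sup_{0\le\tau\le t}\|v_x(\tau)\| \\
&\le \varepsilon\int_0^t \|v_{xx}\|^2\,d\tau + \varepsilon\sup_{0\le\tau\le t}\|v_x(\tau)\|^2 + C_\varepsilon.
\end{aligned} \tag{3.20}$$

Here, we have used Lemma 3.1, Hölder's inequality and the Young inequality of the following type: $abc \le \varepsilon(a^4 + b^2) + C_\varepsilon c^4$. We notice that

$$|(\hat p, v_x)| \le \varepsilon\|v_x\|^2 + C_\varepsilon\|\hat p\|^2 \le \varepsilon\|v_x\|^2 + C_\varepsilon\left(\|\theta\|_\infty + 1\right). \tag{3.21}$$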
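The three-factor Young inequality quoted after (3.20) follows from two applications of the elementary bound $xy \le \delta x^2 + y^2/(4\delta)$. A sketch for $a, b, c \ge 0$ and any $\varepsilon > 0$; the explicit, non-optimal constant is an assumption of this illustration and not the authors' choice.

```latex
\begin{align*}
abc = b\,(ac) &\le \varepsilon b^2 + \frac{1}{4\varepsilon}\,a^2c^2, \\
a^2c^2 &\le 4\varepsilon^2 a^4 + \frac{1}{16\varepsilon^2}\,c^4, \\
abc &\le \varepsilon\,(a^4 + b^2) + \frac{1}{64\,\varepsilon^3}\,c^4 ,
 \qquad\text{i.e. } C_\varepsilon = \frac{1}{64\,\varepsilon^3}.
\end{align*}
```

In (3.20) this is applied with $a = \bigl(\int_0^t \|v_{xx}\|^2 d\tau\bigr)^{1/4}$, $b = \sup_{0\le\tau\le t}\|v_x(\tau)\|$ and $c = \bigl(\int_0^t \|v_x\|^2 d\tau\bigr)^{3/4}$, where $c$ is bounded by (3.1).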

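Combining this with (3.1), (3.20) and (3.18) yields

$$\|v_x\|^2 + \int_0^t \|v_t(\tau)\|^2\,d\tau \le C\left(1 + \sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty + \sqrt{\int_0^t \|\theta_t(\tau)\|^2\,d\tau}\right) + \varepsilon\int_0^t \|v_{xx}(\tau)\|^2\,d\tau. \tag{3.22}$$

Step 2. On the other hand, from (1.15) we have

$$v_t - \hat\mu v_{xx} - \hat\mu' u_x v_x + \hat p_x = 0. \tag{3.23}$$

Hence,

$$\begin{aligned}
\|\hat\mu v_{xx}\|^2 &\le C\left(\|v_t\|^2 + \|\hat p_x\|^2 + \|\hat\mu' u_x v_x\|^2\right) \\
&\le C\left(\|v_t\|^2 + \|\theta_x\|^2 + \left(\|\theta(t)\|_\infty^2 + 1\right)\|u_x\|^2 + \|u_x\|^2\|v_x\|_\infty^2\right).
\end{aligned} \tag{3.24}$$

Furthermore, we use Nirenberg's inequality

$$\|v_x\|_\infty \le C\|v_{xx}\|^{\frac12}\|v_x\|^{\frac12} + C\|v_x\|. \tag{3.25}$$

Then, integrating (3.24) with respect to $t$ over $(0,t)$ and making use of (2.3) and Lemma 3.2 yields

$$\begin{aligned}
\int_0^t \|\hat\mu v_{xx}(\tau)\|^2\,d\tau
&\le C\int_0^t \left(\|v_t(\tau)\|^2 + \|\theta_x(\tau)\|^2\right)d\tau + C\int_0^t \left[\left(\|\theta\|_\infty^2 + 1\right)\|u_x(\tau)\|^2 + \|u_x(\tau)\|^2\|v_x(\tau)\|_\infty^2\right]d\tau \\
&\le C\int_0^t \|v_t(\tau)\|^2\,d\tau + C\int_0^t \|\theta_x(\tau)\|^2\,d\tau + C\sup_{0\le\tau\le t}\|u_x(\tau)\|^2\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty\int_0^t \|\theta(\tau)\|_\infty\,d\tau \\
&\quad + C\int_0^t \|u_x\|^2\left(\|v_{xx}\|\|v_x\| + \|v_x\|^2\right)d\tau,
\end{aligned} \tag{3.26}$$

and the last term in (3.26) may be handled as follows:

$$\begin{aligned}
\int_0^t \|u_x\|^2\left(\|v_{xx}\|\|v_x\| + \|v_x\|^2\right)d\tau
&\le C\left(\int_0^t \|u_x\|^4\|v_x\|^2\,d\tau\right)^{\frac12}\left(\int_0^t \|v_{xx}\|^2\,d\tau\right)^{\frac12} + \sup_{0\le\tau\le t}\|u_x(\tau)\|^2\int_0^t \|v_x\|^2\,d\tau \\
&\le \varepsilon\int_0^t \|v_{xx}\|^2\,d\tau + C_\varepsilon\sup_{0\le\tau\le t}\|u_x(\tau)\|^4 + C.
\end{aligned} \tag{3.27}$$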

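Therefore, combining (3.26) with (3.27) and recalling Lemma 3.2, we have

$$\begin{aligned}
\int_0^t \|v_{xx}\|^2\,d\tau
&\le C\int_0^t \left(\|v_t\|^2 + \|\theta_x\|^2\right)d\tau + C\sup_{0\le\tau\le t}\|u_x\|^4 + C\sup_{0\le\tau\le t}\|u_x(\tau)\|^2\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty + C \\
&\le C\int_0^t \|v_t\|^2\,d\tau + C\left(1 + \sup_{0\le\tau\le t}\|\theta\|_\infty^{2\alpha} + \sup_{0\le\tau\le t}\|\theta\|_\infty^{1+\alpha}\right).
\end{aligned} \tag{3.28}$$

From (3.22), (3.28), Lemma 3.2 and the fact that $\alpha$ can be chosen suitably small (here, we may choose $\alpha \le \tfrac12$), we conclude that (3.17) is valid. Thus the proof is complete. □

Lemma 3.4. For all $t \in [0,T]$, the following estimates hold:

$$\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty + \|\theta_x\|^2 + \int_0^t \|\theta_t(\tau)\|^2\,d\tau \le C, \tag{3.29}$$
$$\|v_x\|^2 + \int_0^t \left(\|v_t(\tau)\|^2 + \|v_{xx}(\tau)\|^2\right)d\tau \le C, \tag{3.30}$$
$$\|u_x\|^2 \le C. \tag{3.31}$$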

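Proof. Multiplying (1.16) by $\theta_t$ and integrating it with respect to $x$ yields

$$\begin{aligned}
\frac12\frac{d}{dt}\|k^{\frac12}\theta_x\|^2 + \|\theta_t\|^2
&= \int_0^1 \hat\mu v_x^2\theta_t\,dx + \int_0^1 f_1(u)\theta v_x\theta_t\,dx \\
&\le C\|\theta_t\|\left(\|\theta v_x\| + \|v_x\|_4^2\right) \\
&\le C\|\theta_t\|\left(\|\theta\|_\infty\|v_x\| + \|v_{xx}\|^{\frac12}\|v_x\|^{\frac32} + \|v_x\|^2\right) \\
&=: I_1 + I_2 + I_3.
\end{aligned} \tag{3.32}$$

It follows from estimates in the previous lemmas that

$$\begin{aligned}
\int_0^t I_1\,d\tau &\le \frac16\int_0^t \|\theta_t\|^2\,d\tau + C\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^2\int_0^t \|v_x\|^2\,d\tau \\
&\le \frac16\int_0^t \|\theta_t\|^2\,d\tau + C\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^2.
\end{aligned} \tag{3.33}$$

Using Lemma 3.3 with sufficiently small $\alpha$, we then have

$$\begin{aligned}
\int_0^t I_2\,d\tau
&\le C\sup_{0\le\tau\le t}\|v_x(\tau)\|\int_0^t \|v_{xx}\|^{\frac12}\|v_x\|^{\frac12}\|\theta_t\|\,d\tau \\
&\le C\sup_{0\le\tau\le t}\|v_x(\tau)\|\left(\int_0^t \|v_{xx}\|^2\,d\tau\right)^{\frac14}\left(\int_0^t \|v_x\|^2\,d\tau\right)^{\frac14}\left(\int_0^t \|\theta_t\|^2\,d\tau\right)^{\frac12} \\
&\le C\left(1 + \sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^{1+\alpha} + \sqrt{\int_0^t \|\theta_t(\tau)\|^2\,d\tau}\right)^{\frac34}\left(\int_0^t \|\theta_t\|^2\,d\tau\right)^{\frac12} \\
&\le C\left(1 + \sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^{\beta}\right) + \frac16\int_0^t \|\theta_t\|^2\,d\tau.
\end{aligned} \tag{3.34}$$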

Here, $\beta = \frac{3(1+\alpha)}{2}$ can be chosen satisfying $0 < \beta < 3$, since $\alpha$ in Lemma 3.3 can be taken small. Next,

$$\begin{aligned}
\int_0^t I_3\,d\tau
&\le C\sup_{0\le\tau\le t}\|v_x(\tau)\|\int_0^t \|v_x\|\|\theta_t\|\,d\tau \\
&\le C\sup_{0\le\tau\le t}\|v_x(\tau)\|\left(\int_0^t \|v_x\|^2\,d\tau\right)^{\frac12}\left(\int_0^t \|\theta_t\|^2\,d\tau\right)^{\frac12} \\
&\le C + C\sup_{0\le\tau\le t}\|\theta(\tau)\|_\infty^2 + \frac16\int_0^t \|\theta_t\|^2\,d\tau.
\end{aligned} \tag{3.35}$$

By Nirenberg's inequality, we have

$$\|\theta\|_\infty \le C\|\theta_x\|^{2/3}\|\theta\|_1^{1/3} + C'\|\theta\|_1 \le C\|\theta_x\|^{2/3} + C. \tag{3.36}$$

Hence, combining (3.32)–(3.35) and applying Young's inequality, one has

$$\|\theta_x(t)\|^2 + \int_0^t \|\theta_t(\tau)\|^2\,d\tau \le C + \frac12\sup_{0\le\tau\le t}\|\theta_x(\tau)\|^2. \tag{3.37}$$

Taking the supremum with respect to $t$ on both sides, we have

$$\sup_{0\le\tau\le t}\|\theta_x(\tau)\|^2 + \int_0^t \|\theta_t(\tau)\|^2\,d\tau \le C \tag{3.38}$$

and

$$\sup_{0\le\tau\le t}\|\theta\|_\infty \le C\sup_{0\le\tau\le t}\|\theta_x\|^{2/3} + C \le C.$$

Thus the proof is complete. □

After these preparations, we can obtain the estimates of the higher order derivatives of the unknown functions. We have

Lemma 3.5. For all $t \in [0,T]$, the following estimates hold:

$$\sup_{0\le\tau\le t}\|\theta_t\|^2 + \int_0^t \|\theta_{xt}(\tau)\|^2\,d\tau \le C, \tag{3.39}$$
$$\sup_{0\le\tau\le t}\|\theta_{xx}(\tau)\|^2 \le C, \tag{3.40}$$
$$\sup_{0\le\tau\le t}\|v_t\|^2 + \int_0^t \|v_{xt}(\tau)\|^2\,d\tau \le C, \tag{3.41}$$
$$\sup_{0\le\tau\le t}\|v_{xx}\|^2 \le C. \tag{3.42}$$

Proof. We differentiate the second and the third equations in the system with respect to $t$, multiply the resulting equations by $v_t$ and $\theta_t$, respectively, and integrate by parts. After a lengthy sequence of routine estimates, we obtain the estimates of the lemma. □

4. Global Existence

In this section, we prove Proposition 4.1, in which the a priori bounds of $u$, $v$, $\theta$ in some Banach space are given. By making use of the framework in [7], we prove the global existence of a smooth solution.

Proposition 4.1. There hold

$$\|u\|_{C^{\frac13,\frac16}(Q_T)} \le C, \tag{4.1}$$
$$\|v\|_{C^{\frac13,\frac16}(Q_T)} \le C,\qquad \|v_x\|_{C^{\frac13,\frac16}(Q_T)} \le C, \tag{4.2}$$
$$\|\theta\|_{C^{\frac13,\frac16}(Q_T)} \le C,\qquad \|\theta_x\|_{C^{\frac13,\frac16}(Q_T)} \le C. \tag{4.3}$$

Proof. For the proof, we refer to [6] or [11]. □

Acknowledgements. The authors would like to express their sincere thanks to the referee for pointing out several mistakes in an earlier version of our manuscript and for his/her valuable comments. The authors are also very grateful to Prof. Song Jiang for useful discussions.

References

1. Andrews, G.: On the existence of solutions to the equation $u_{tt} = u_{xxt} + \sigma(u_x)_x$. J. Differ. Eqs. 35, 200–231 (1980)
2. Andrews, G. and Ball, J.M.: Asymptotic behavior and changes of phase in one-dimensional nonlinear viscoelasticity. J. Differ. Eqs. 44, 306–341 (1982)
3. Antman, S., Seidman, T.: Quasilinear hyperbolic-parabolic equations of one-dimensional viscoelasticity. J. Differ. Eqs. 124, 132–195 (1996)
4. Chen, Z. and Hoffmann, K. H.: On a one-dimensional nonlinear thermoviscoelastic model for structural phase transitions in shape memory alloys. J. Differ. Eqs. 112, 325–350 (1994)
5. Dafermos, C.M.: The mixed initial-boundary value problem for the equations of one-dimensional viscoelasticity. J. Differ. Eqs. 6, 71–86 (1969)
6. Dafermos, C.M.: Global smooth solutions to the initial boundary value problem for the equations of one-dimensional nonlinear thermoviscoelasticity. SIAM J. Math. Anal. 13, 397–408 (1982)
7. Dafermos, C.M., Hsiao, L.: Global smooth thermomechanical processes in one-dimensional nonlinear thermoviscoelasticity. Nonlinear Analysis, T.M.A. 6, 435–454 (1982)
8. Greenberg, J.M. and MacCamy, R.C.: On the exponential stability of solutions of $E(u_x)u_{xx} + \lambda u_{xtx} = \rho u_{tt}$. J. Math. Anal. Appl. 31, 406–417 (1970)
9. Guo, Boling and Zhu, Peicheng: Asymptotic behavior of the solution to the system for a viscous reactive gas. J. Differ. Eqs. 154 (1999)
10. Kazhikhov, A.V. and Shelukhin, V.V.: Unique global solution with respect to time of initial boundary value problems for one-dimensional equations of a viscous gas. Prikl. Mat. Mekh. 41, 282–291 (1977)
11. Ladyzenskaja, O., Solonnikov, V. and Uralceva, N.: Linear and Quasilinear Equations of Parabolic Type. Transl. Math. Monographs, Vol. 23, Providence, RI: AMS, 1968
12. Luo, T.: Qualitative behavior to nonlinear evolution equations with dissipation. Ph.D. Thesis, Institute of Mathematics, Academy of Sciences of China, Beijing (1994)
13. Nagasawa, T.: On the one-dimensional motion of the polytropic ideal gas non-fixed on the boundary. J. Differ. Eqs. 65, 49–67 (1986)
14. Pego, R. L.: Phase transitions in one-dimensional nonlinear viscoelasticity: Admissibility and stability. Arch. Rat. Mech. Anal. 97, 353–394 (1987)
15. Racke, R. and Zheng, S.: Global existence and asymptotic behavior in nonlinear thermoviscoelasticity. J. Differ. Eqs. 134, 46–67 (1997)
16. Shen, W., Zheng, S. and Zhu, P.: Global existence and asymptotic behavior of weak solutions to nonlinear thermoviscoelastic system with clamped boundary conditions. To appear in 1998, Q. Appl. Math.
17. Zhu, P.: Global existence and asymptotic behavior of weak solutions to some hyperbolic-parabolic coupled systems. Ph.D. Thesis, Fudan University, 1997

Communicated by H. Araki

Commun. Math. Phys. 203, 385 – 419 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Lower Dimensional Invariant Tori in the Regions of Instability for Nearly Integrable Hamiltonian Systems

Chong-Qing Cheng*

Department of Mathematics, Nanjing University, Nanjing 210093, China. E-mail: [email protected]

Received: 21 January 1998 / Accepted: 3 December 1998

Abstract: Consider a Hamiltonian system of KAM type, $H(p, q) = N(p) + P(p, q)$, with $n$ degrees of freedom ($n > 2$), where the Hessian of $N$ is nondegenerate. For one resonance condition $\langle I, N_p \rangle = 0$ ($I \in \mathbb{Z}^n$), there is an immersed $(n-1)$-dimensional submanifold $M$ in action variable space, where almost every point corresponds to a resonant torus for the unperturbed system, which is foliated by $(n-1)$-dimensional ergodic components. It is shown in this paper that there is a subset of $M$ with positive $(n-1)$-dimensional Lebesgue measure such that, for each resonant torus corresponding to a point in this set, at least two $(n-1)$-dimensional tori survive perturbations. Generically, one is hyperbolic and the other one is elliptic.

* Supported by NNSF of China and Qiu Shi Foundation

1. Introduction and Results

By the KAM theory it is well known that most nonresonant tori of a nondegenerate integrable Hamiltonian system survive small perturbations with only slight deformation. In contrast to this, the destruction of resonant tori appears inevitable under such perturbations in general. Nevertheless, their disintegration does not imply the nonexistence of regular motions, such as periodic or quasi-periodic motions, in these regions of instability. The Poincaré–Birkhoff fixed point theorem for area-preserving twist maps of the annulus or of the cylinder, and the results of Bernstein and Katok [BK] for systems with $n$ degrees of freedom, prove that at least $n$ periodic solutions are preserved. Both of these results are essentially variational. The disintegration of a resonant torus turns out to be difficult to investigate by a variational approach when the ergodic components have dimension higher than 1. By applying Mather's theory [Ma] to this situation we can get the existence of an invariant measure; however, we know little about its topological structure, for that theory also applies to systems far from integrable ones. Using local methods involving small divisors it is shown here that for most of those resonant tori which have a fibration in $(n-1)$-dimensional ergodic components at least two elements survive perturbation. In the generic case one is hyperbolic and the other one is elliptic. These tori are expected to have great influence on the dynamical behavior of Arnold diffusion.

Let us consider a Hamiltonian system in action-angle variables $(p, q)$,

$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}, \tag{1.1}$$

with a real analytic Hamiltonian of KAM type

$$H(p, q) = N(p) + P(p, q), \qquad q \in \mathbb{T}^n,\ p \in D \subset \mathbb{R}^n,$$

where $D$ is an open set, $P$ is a small perturbation and $N$ is the main part. When $P$ vanishes the system is integrable and $D \times \mathbb{T}^n$ is stratified by a family of $n$-dimensional invariant tori $p = \text{const.}$ carrying a quasi-periodic or periodic flow $q = \omega^* t + q_o$ with $n$-torus frequency $\omega^*$ given by

$$\omega^*(p) = (\omega_1, \omega_2, \cdots, \omega_n) = \frac{\partial H}{\partial p}. \tag{1.2}$$

Now we assume that there is a $p_o \in D$ such that $\omega = \omega(p_o)$ satisfies a rational resonance condition, which means that

$$\langle I_n, \omega \rangle = \sum_{i=1}^{n} I_{ni}\,\omega_i = 0 \tag{1.3}$$

holds for some $I_n \in \mathbb{Z}^n \setminus \{0\}$. Here we assume $\lambda I_n \notin \mathbb{Z}^n$ for real $0 \ne |\lambda| < 1$. Since the frequency map $\partial H/\partial p$ from the action variable space to frequency space is a local diffeomorphism, there is an immersed $(n-1)$-dimensional manifold $M$ containing $p_o$. The theory of abelian groups implies that there is a unimodular matrix $I$ (i.e. $\det I = 1$), $I = [I_1, I_2, \cdots, I_n]$, in which $I_n$ is specified as above and all $I_j \in \mathbb{Z}^n$. Introducing a linear symplectic transformation $S$: $(p, q) \to (x, y)$,

$$x = qI, \qquad p = yI^t,$$

we find that in the new coordinates

$$\frac{\partial H}{\partial y} = (\omega_1, \omega_2, \cdots, \omega_{n-1}, 0) = (\omega, 0) \tag{1.4}$$

holds for $y \in SM$. In the following we still use $(p, q)$ to denote the new coordinates and assume that (1.4) holds for $p \in M$, to simplify the notation. Without loss of generality we also assume that $M$ is an $(n-1)$-submanifold, since we work locally in the action coordinates. Therefore $M$ is real analytic. With these preliminaries we then have the following results:

Theorem. Let the Hamiltonian be of KAM type, $H(p, q) = N(p) + P(p, q)$, and be real analytic in a complex neighbourhood

$$\Sigma_{\sigma,\tau}: \quad |\mathrm{Im}\, q| \le \tau, \qquad |p - p_o| \le \sigma,\ \forall p_o \in D$$

of the set $D \times \mathbb{T}^n$, where $|\cdot| = \max_{1 \le j \le n} |\cdot_j|$, and assume the following conditions:

1. Nondegeneracy: the main part $N$ satisfies Kolmogorov's nondegeneracy condition, i.e. the determinant of its Hessian satisfies

$$\left| \det \left( \frac{\partial^2 N}{\partial p_i \partial p_j} \right) \right| \ge \lambda > 0, \qquad \forall p \in D.$$

2. Resonance: for all $p \in M$ the following holds:

$$\frac{\partial N}{\partial p_n}\Big|_{p \in M} = 0, \qquad \frac{\partial^2 N}{\partial p_n^2}\Big|_{p \in M} \ne 0.$$

Then there is a positive $d = d(N, n, \sigma, \tau) > 0$ such that if $|P| < d$ in $\Sigma_{\sigma,\tau}$ there is a set $S \subset M$ with positive $(n-1)$-dimensional Lebesgue measure, such that if $p_o \in S$, the Hamiltonian flow defined by Eq. (1.1) admits at least two $(n-1)$-dimensional invariant submanifolds of the form

$$\begin{aligned} p &= p_o + \Gamma_j(\phi_1, \phi_2, \cdots, \phi_{n-1}, q_{nj}), \\ q &= (\phi_1, \phi_2, \cdots, \phi_{n-1}, q_{nj}) + \Theta_j(\phi_1, \phi_2, \cdots, \phi_{n-1}, q_{nj}), \end{aligned} \qquad j = 1, 2, \tag{1.5}$$

with $\Gamma_j$, $\Theta_j$ being real analytic functions of period $2\pi$ in the complex domain $|\mathrm{Im}\, \phi| \le \frac{\tau}{2}$, and $q_{nj}$ some constant. The parametrization is chosen so that the induced flow on $\mathbb{T}^{n-1}$ is still given by

$$\phi = \phi_o + \frac{\partial N}{\partial p}(p_o)\, t.$$

Moreover, for each $\varepsilon > 0$ there exists a positive $d' = d'(N, n, \sigma, \tau, \varepsilon) \le d$ such that if in addition $|P| < d'$ in $\Sigma_{\sigma,\tau}$, then the functions $\Gamma_j$ and $\Theta_j$ satisfy

$$|\Gamma_j| + |\Theta_j| < \varepsilon.$$

As $d \to 0$, Lebesgue meas$(S)$ $\to$ Lebesgue meas$(M)$. For every point $p$ in the set $S$ the corresponding frequency $\omega$ of the flow satisfies a relative Diophantine condition

$$|\langle k, \omega \rangle| \ge D |k|_1^{-\mu}, \qquad \forall k \in \mathbb{Z}^{n-1} \setminus \{0\}, \tag{1.6}$$

where $D > 0$, $\mu \ge 4n + 2$, $|k|_1 = \sum_{j=1}^{n-1} |k_j|$ and $(\omega, 0) = \frac{\partial N}{\partial p}(p)$. However, not every point $p$ whose $\omega$ satisfies (1.6) is in $S$.

The situation we have here lies between that of KAM theory and that of the Birkhoff fixed point theorem. With the results here and those in [BK] in mind one may ask: if the resonant torus has a fibration and each ergodic component is $(n-k)$-dimensional, do at least $k + 1$ such components survive perturbation? In the work of Treschev [Tr] the existence of one hyperbolic lower dimensional torus is proved when the maximal torus is destroyed. However, the perturbation cannot be arbitrary: apart from smallness it is assumed to satisfy some extra condition. In our previous work [Cg] it is shown that for those resonant tori which have a fibration into $(n-1)$-dimensional ergodic components whose frequencies satisfy a Diophantine condition, no small perturbation destroys all of these submanifolds: at least one survives. In the generic case it is of hyperbolic type.
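The relative Diophantine condition (1.6) can be probed numerically over a finite range of integer vectors. The following is only a minimal sketch: the function name, the toy exponent `mu=1` and the cutoff `k_max` are our own illustrative choices (the theorem itself asks for $\mu \ge 4n + 2$), and a finite search can never prove the condition for all $k$.

```python
import itertools
import math


def diophantine_margin(omega, mu, k_max):
    """Return the minimum of |<k, omega>| * |k|_1**mu over 0 < |k|_1 <= k_max.

    A strictly positive value is a candidate for the constant D in
    |<k, omega>| >= D * |k|_1**(-mu) on the searched range only; it says
    nothing about integer vectors with |k|_1 > k_max.
    """
    best = math.inf
    ranges = [range(-k_max, k_max + 1)] * len(omega)
    for k in itertools.product(*ranges):
        norm1 = sum(abs(kj) for kj in k)
        if norm1 == 0 or norm1 > k_max:
            continue
        divisor = abs(sum(kj * wj for kj, wj in zip(k, omega)))
        best = min(best, divisor * norm1 ** mu)
    return best


# Hypothetical frequency vectors for an (n - 1) = 2 dimensional torus.
golden = (1.0 + math.sqrt(5.0)) / 2.0
margin_good = diophantine_margin((1.0, golden), mu=1, k_max=30)
margin_resonant = diophantine_margin((1.0, 2.0), mu=1, k_max=30)
print(margin_good, margin_resonant)
```

For the golden-mean frequency the margin stays bounded away from zero on the searched range, while the rationally dependent pair $(1, 2)$ gives margin zero because $k = (2, -1)$ is an exact resonance.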


The analyticity of the Hamiltonian in our theorem is clearly not necessary. We require it just for simplicity. The same result can be expected in the $C^m$ case if some well developed techniques are employed ([Mo, SZ]). The classical KAM theory has been established not only for Hamiltonian systems but also for volume-preserving diffeomorphisms as well as reversible systems ([CS1, Se, Xi, Yo]). It seems also possible to get similar results for those systems, e.g. periodically invariant tori [CS2].

2. Outline of the Proof

The main point of the proof in this paper is to avoid resonance between the torus frequencies and the normal frequency. We are able to give a measure estimate for the set where such resonance does not take place. The proof is completed in two main steps. As the first step we fix a resonant $n$-torus where the $(n-1)$-torus frequency $\omega$ satisfies some Diophantine condition and suppose that "the coupling of the torus frequencies and normal frequency" does not invalidate the small divisor condition. With these prerequisites we can invoke the full strength of our method in [Cg] to construct two types of iteration schemes, according to whether the problem is nondegenerate or not at that iterative step. In addition one must find a smooth transition between these two types of iteration scheme for the measure estimate in the second part of the proof. Formally, for any step of the iteration a symplectic transformation $M$: $(p, q) \to (p_+, q_+)$ is introduced, taking the form

$$p = p_+ + W_q, \qquad q_+ = q + W_{p_+}, \tag{2.1}$$

with a generating function $W(p_+, q)$. At the beginning of the iteration $W$ is determined by solving the homological equation, as in the proof of classical KAM theory [Ar]:

$$\langle (\omega, 0), W_q \rangle = -P(p_+, q) + N_+(p_+, q_n),$$

where $\omega = (\omega_1, \omega_2, \cdots, \omega_{n-1})$. The difference is that one has to let $N_+$ depend not only on $p_+$ but also on $q_n$ so that the above equation is solvable, because the last component of the frequencies is zero. $N_+$ is chosen by expanding $P(p_+, q)$ into a Fourier series in $\hat q = (q_1, q_2, \cdots, q_{n-1})$,

$$P(p_+, q) = \sum_{k \in \mathbb{Z}^{n-1}} P_k(p_+, q_n)\, e^{i \langle k, \hat q \rangle},$$

and letting $N_+ = P_o$, the $\hat q$-average of $P$; this is what remains when the homological equation is averaged over $\hat q$. After such a change of variables the Hamiltonian takes the form

$$H(p_+, q_+) = (N + N_+)(p_+, q_{n+}) + P_+(p_+, q_+).$$

The new perturbation term $P_+$ is much smaller than the original one, and the main part $N + N_+$ depends on $q_{n+}$ also. So the $n$-torus at $p_o$ is no longer invariant for the Hamiltonian flow. Nevertheless there are at least two $(n-1)$-dimensional tori embedded in such an $n$-torus which are invariant for the flow determined by the Hamiltonian $H = N + N_+$.
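The solvability argument for the homological equation can be checked on a toy case. The sketch below assumes $n = 2$, so that $\hat q = q_1$ and $\omega$ is a single number; the frequency `OMEGA` and the Fourier coefficients in `P_COEFFS` are hypothetical illustrative data, not taken from the paper. Mode by mode, $i k \omega W_k = -P_k$ for $k \ne 0$, and the $k = 0$ mode is absorbed into $N_+$.

```python
import cmath
import math

OMEGA = math.sqrt(2.0)  # hypothetical torus frequency (n - 1 = 1)

# Hypothetical Fourier coefficients P_k(q_n) of a toy real perturbation.
P_COEFFS = {
    0: lambda qn: 0.3 * math.cos(qn),
    1: lambda qn: 0.5 + 0.1 * math.sin(qn),
    -1: lambda qn: 0.5 + 0.1 * math.sin(qn),
    2: lambda qn: 0.05j,
    -2: lambda qn: -0.05j,
}


def perturbation(q1, qn):
    """P(q1, qn) = sum_k P_k(qn) * exp(i k q1)."""
    return sum(c(qn) * cmath.exp(1j * k * q1) for k, c in P_COEFFS.items())


def averaged_part(qn):
    """N_+: the average of P over q1, i.e. its k = 0 coefficient."""
    return P_COEFFS[0](qn)


def w_dq1(q1, qn):
    """d/dq1 of the generating function, with W_k = i * P_k / (k * OMEGA)."""
    total = 0.0
    for k, c in P_COEFFS.items():
        if k == 0:
            continue  # the zero mode cannot be removed; it goes into N_+
        w_k = 1j * c(qn) / (k * OMEGA)
        total += 1j * k * w_k * cmath.exp(1j * k * q1)
    return total


def residual(q1, qn):
    """OMEGA * W_{q1} + P - N_+, which vanishes when the equation is solved."""
    return OMEGA * w_dq1(q1, qn) + perturbation(q1, qn) - averaged_part(qn)


print(abs(residual(0.7, 1.3)))
```

The division by $k\,\omega$ is the one-frequency shadow of the small divisors $\langle k, \omega \rangle$; with more than one torus frequency these can become arbitrarily small, which is where the Diophantine condition (1.6) enters.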


Lemma 1 ([Cg]). Suppose $N(p)$ is nonsingular in $D$, i.e. the eigenvalues of its Hessian satisfy $|\lambda_i(p)| \ge \lambda > 0$, and $\frac{\partial N}{\partial p}(p_o) = (\omega_1, \omega_2, \cdots, \omega_k, 0, \cdots, 0) =: \Omega$. Suppose that $N_+ = N_+(p, q_{k+1}, \cdots, q_n)$, defined on $\mathbb{T}^{n-k} \times D$, satisfies the smallness condition $|N_{+p}| < \lambda n^{-1}\, \mathrm{dist}\{p_o, \partial D\}$ and that the spectrum of its Hessian matrix lies in a ball with radius smaller than $\lambda$, centered at the origin. Then the flow determined by the Hamiltonian $N + N_+$ admits at least $n - k + 1$ invariant tori

$$T^k: \quad p = p_l, \qquad q_j = \omega_j t + q_{jl} \ \ (1 \le j \le k), \qquad q_j = q_{jl} \ \ (k < j \le n),$$

with $1 \le l \le n - k + 1$ and

$$|p_l - p_o| \le \frac{n}{\lambda} \max_D |N_{+p}|.$$
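In the case $k = n - 1$ the lemma comes down to the fact that a smooth function on the circle has at least a maximum and a minimum. A minimal sketch in one resonant pair $(p_n, q_n)$, assuming the hypothetical toy data $N = p^2/2$, $N_+ = \mathrm{EPS}\,(1 + p)\cos q$ and a vanishing resonant frequency component (none of these are from the paper):

```python
import math

EPS = 0.1  # hypothetical size of the q_n-dependent correction N_+


def total(p, q):
    """(N + N_+)(p, q) for the toy choice N = p^2/2, N_+ = EPS*(1 + p)*cos(q)."""
    return 0.5 * p * p + EPS * (1.0 + p) * math.cos(q)


def total_dp(p, q):
    """d/dp of (N + N_+)(p, q)."""
    return p + EPS * math.cos(q)


def rho(q, target=0.0, tol=1e-14):
    """Solve d/dp (N + N_+)(rho, q) = target by Newton's method."""
    p = 0.0
    for _ in range(50):
        step = (total_dp(p, q) - target) / 1.0  # second p-derivative is 1 here
        p -= step
        if abs(step) < tol:
            break
    return p


def phi(q):
    """phi(q) = (N + N_+)(rho(q), q) - target * rho(q), with target = 0."""
    return total(rho(q), q)


def local_extrema(num_points=400):
    """Grid points of the circle where phi is a strict local max or min."""
    values = [phi(2.0 * math.pi * i / num_points) for i in range(num_points)]
    found = []
    for i, v in enumerate(values):
        left, right = values[i - 1], values[(i + 1) % num_points]
        if (v > left and v > right) or (v < left and v < right):
            found.append(2.0 * math.pi * i / num_points)
    return found


print(local_extrema())
```

For this toy choice the two extrema sit at $q = 0$ and $q = \pi$; each would correspond to one invariant lower dimensional torus of $N + N_+$.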

Indeed, such lower dimensional tori are found by searching for the critical points of the function $N(p) + N_+(p, \tilde q) - \langle \Omega, p \rangle$ ($\tilde q = (q_{k+1}, \cdots, q_n)$). Let $p = \rho(\tilde q)$ be the function such that

$$\frac{\partial}{\partial p}(N + N_+)(\rho(\tilde q), \tilde q) = \Omega;$$

then they are exactly the critical points of the function

$$\phi(\tilde q) = (N + N_+)(\rho(\tilde q), \tilde q) - \langle \Omega, \rho(\tilde q) \rangle.$$

If $k = n - 1$, such a function is defined on $\mathbb{T}$; its minimum and maximum are critical values, which correspond to two invariant tori $T^{n-1}$ for $N + N_+$. The task of the proof in this part is to show that the iteration scheme for a symplectic transformation works on these $(n-1)$-tori instead of on the whole resonant $n$-torus. To proceed to the next step of iteration one needs to investigate whether or not these critical points have "stronger persistency" under perturbations from the average part of $P_+$ with respect to $\hat q = (q_1, q_2, \cdots, q_{n-1})$. "Stronger persistency" is a condition which guarantees that the critical points of $N + N_+ - \langle \omega, \hat p \rangle$ stay sufficiently close to those of $N - \langle \omega, \hat p \rangle$. Since Kolmogorov's nondegeneracy of $N$ is assumed, the displacement of the critical points is certainly small enough in the direction of $p$. The problem is that one does not know whether or not this is also the case in the direction of $q_n$. It is dealt with in two ways.

2.1. The critical point has weaker persistency. By definition, the following relation holds:

$$\left| \frac{d^2}{dq_n^2} \Big( N(\rho(q_n), q_n) - \langle \omega, \hat\rho(q_n) \rangle \Big) \right|_{q_n = q_n^*} \le \beta d^{\frac{2}{8n+21}} := \beta s^2 \tag{2.2}$$

at the critical point $(\rho(q_n^*), q_n^*)$. There is no precise criterion to decide whether the critical point has stronger persistency; it depends on how one runs the KAM machine. Usually we choose $\beta$ neither too big nor too small, say $\beta \in [1, 2]$. When the critical point has weak persistency we are in the same situation as in our previous work [Cg]. Since the change of variables $(p_n, q_n) \to (p_n + \rho_n(q_n), q_n)$ is symplectic we can assume that $\rho_n(q_n) = 0$. Expanding the perturbation in a Taylor series in $p_n$ and retaining the sum up to order $\kappa = [\frac{8n}{3}] + 8$, where $[a]$ denotes the largest integer among those smaller than $a$,

$$T_w P = \sum_{j=0}^{\kappa} P_j(\hat p, q)\, p_n^j, \tag{2.3}$$

the remaining part $RP = P - T_w P$ is much smaller than $P$ in some smaller domain of $p_n$. We also develop $T_w P$ into a Fourier series in $\hat q$ and truncate it at order $K$,

$$T_K P_j = \sum_{\substack{k \in \mathbb{Z}^{n-1} \setminus \{0\} \\ |k| \le K}} P_{jk}\, e^{i \langle k, \hat q \rangle}.$$
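Truncating a Fourier series at order $K$ costs little for analytic functions, because the discarded modes decay exponentially on any narrower strip. The sketch below illustrates this with a hypothetical model function whose coefficients are exactly $e^{-t|k|}$ (our choice, not the paper's $P_{jk}$), and compares the discarded part with the elementary geometric-series bound on the strip $|\mathrm{Im}\, q| \le t_+ < t$.

```python
import cmath
import math


def coefficient(k, t):
    """Model Fourier coefficient of a function analytic in |Im q| < t."""
    return math.exp(-t * abs(k))


def remainder(q, t, cutoff_order, k_total=400):
    """R_K f(q): the modes with cutoff_order < |k| <= k_total."""
    return sum(
        coefficient(k, t) * cmath.exp(1j * k * q)
        for k in range(-k_total, k_total + 1)
        if abs(k) > cutoff_order
    )


def tail_bound(t, t_plus, cutoff_order):
    """Geometric-series bound for sup |R_K f| on the strip |Im q| <= t_plus < t."""
    rate = math.exp(-(t - t_plus))
    return 2.0 * rate ** (cutoff_order + 1) / (1.0 - rate)


# On the edge of the narrower strip the remainder stays below the bound.
print(abs(remainder(0.3 + 0.5j, 1.0, 10)), tail_bound(1.0, 0.5, 10))
```

The bound decays like $e^{-(t - t_+)K}$, which is why the loss of analyticity width $t - t_+$ and the truncation order $K$ have to be balanced against each other at every step.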

The remaining part $R_K(T_w P) = T_w P - T_K(T_w P)$ can also be made much smaller than the truncated part. We choose the generating function $W$ in the form

$$W(p, q) = \sum_{j=0}^{\kappa} W_j(\hat p, q)\, p_n^j, \tag{2.4}$$

where $W_j$ is the solution of the following equation:

$$\langle \omega, W_{j \hat q} \rangle - j\, \frac{\partial^2 N}{\partial p_n \partial q_n}(\rho(q_n), q_n)\, W_j = -T_K P_j(\hat p, q). \tag{2.5}$$

By restricting the domain of $q_n$ to be suitably small, $|\mathrm{Im}\, q_n| \le s^2$, $\mathrm{Im}(N_{p_n q_n}(\rho(q_n), q_n))$ can be made so small that

$$|\langle \omega, k \rangle - j\, \mathrm{Im}(N_{p_n q_n}(\rho(q_n), q_n))| \ge \frac{1}{2} |\langle \omega, k \rangle|, \qquad \forall j \le \kappa,\ \forall |k| \le K,$$

if we choose $K$ carefully as in [MP]. In this way we find a well defined generating function $W$ if $\omega$ satisfies the Diophantine condition (1.6). The symplectic transformation $M$: $(p, q) \to (p_+, q_+)$ of (2.1) with this generating function takes the Hamiltonian $H(p, q)$ into the form

$$H(p, q) = (N + N_+)(p_+, q_{n+}) + P_+(p_+, q_+), \tag{2.6}$$

where

$$\begin{aligned}
N_+ &= \frac{1}{(2\pi)^{n-1}} \int_0^{2\pi} P(p, q)\, d\hat q, \\
P_+ &= (M_1 + M_2 + M_3 + M_4)(p_+, q_+), \\
M_1 &= N(p_+ + W_q, q_n) - N(p_+, q_n + W_{p_{n+}}) - \langle \omega, W_{\hat q} \rangle + \frac{\partial^2 N}{\partial p_n \partial q_n}(\rho(q_n), q_n)\, p_{n+} W_{p_{n+}}, \\
M_2 &= (P - N_+)(p, q) - (P - N_+)(p_+, q), \\
M_3 &= R(P - N_+)(p_+, q) + R_K(T_w(P - N_+))(p_+, q), \\
M_4 &= N_+(p, q) - N_+(p_+, q_+).
\end{aligned}$$

Let $\phi(q_n) = N(\rho(q_n), q_n) - \langle \omega, \hat\rho(q_n) \rangle$. We have $N(p, q_n) = \phi(q_n) + \langle \omega, \hat p \rangle + \tilde N(p, q_n)$, where

$$\tilde N_{p_i q_n} = N_{p_i q_n}, \qquad \tilde N_{p_i p_j} = N_{p_i p_j},$$

$$\tilde N(\rho(q_n), q_n) = \tilde N_p(\rho(q_n), q_n) = \tilde N_{q_n}(\rho(q_n), q_n) = 0,$$

since on the real line $\{\rho(q_n), q_n,\ q_n \in L\}$ they are identically zero. Let $M_1 = M_{11} + M_{12} + M_{13}$,

$$\begin{aligned}
M_{11} &= \phi(q_{n+} - W_{p_{n+}}) - \phi(q_{n+}) = \phi'(q_{n+}) W_{p_{n+}} + \frac{1}{2} \phi''(q_{n+} + \eta W_{p_{n+}}) W_{p_{n+}}^2, \\
M_{12} &= \tilde N(p_+ + W_q, q_n) - \tilde N(p_+, q_n) \\
&= \sum_{i,j=1}^{n} \Big( N_{p_i p_j}(\rho(q_n) + \xi_1 \Delta p_+, q_n)\, \Delta p_{i+} W_{q_j} + \frac{1}{2} N_{p_i p_j}(p_+ + \xi_2 W_q, q_n)\, W_{q_i} W_{q_j} \Big), \\
M_{13} &= \tilde N(p_+, q_n) - \tilde N(p_+, q_n + W_{p_{n+}}) + N_{p_n q_n}(\rho(q_n), q_n)\, p_{n+} W_{p_{n+}} \\
&= -\sum_{i=1}^{n-1} N_{p_i q_n}(\rho(q_n), q_n)\, \Delta p_{i+} W_{p_{n+}} - \frac{1}{2} N_{q_n^2}(p_+, q_n + \eta W_{p_{n+}})\, W_{p_{n+}}^2 \\
&\quad - \frac{1}{2} \sum_{i,j=1}^{n} N_{p_i p_j q_n}(\rho(q_n) + \xi_3 \Delta p_+, q_n)\, \Delta p_{i+} \Delta p_{j+} W_{p_{n+}},
\end{aligned}$$

where the following notation is used:

$$\xi_j(W_q) = (\xi_{j1} W_{q_1}, \cdots, \xi_{jn} W_{q_n}), \qquad \Delta p_+ = p_+ - \rho(q_n),$$

$|\xi_{ij}| \le 1$ ($i = 1, 2, 3$; $j \le n$) and $|\eta| \le 1$. Finally, the change of variables $(p_n, q_n) \to (p_n + \rho_{+n}(q_n), q_n)$ is introduced so that $p_{n+}$ is defined in the domain $|p_{n+}| \le s_+^3$, where $\rho_+$ is determined by setting

$$\frac{\partial}{\partial p}(N + N_+)(\rho_+(q_n), q_n) = (\omega, 0). \tag{2.7}$$

If the perturbation is small enough, such a transformation can then be repeated. In every step, a symplectic change of coordinates $M_l$ is set up in this way so that $H_l \circ M_l = N_{l+1} + P_{l+1}$ with another main part $N_{l+1}$ and a much smaller error term $P_{l+1}$, for instance $|P_{l+1}| \le |P_l|^{\xi}$ for some $\xi > 1$. Such a procedure might be repeated infinitely many times, if those critical points always have weaker persistency. It results in a sequence of transformations $M_0, M_1, \cdots$ whose infinite product converges on a set containing at least one trivially embedded $(n-1)$-torus $(p, q_n) = (\tilde p, \tilde q_n)$ and maps it to a set containing at least one $(n-1)$-torus, with its tangent map taking constant vector fields on $(\tilde p, \tilde q_n)$ to the vector field governed by (1.1). Since such a transformation is designed around each critical point, one finally obtains at least two invariant $(n-1)$-tori. The iterative step may also be repeated only finitely many times. When the critical point acquires stronger persistency we may switch to the other procedure described in the following; we shall show that the critical point then always has stronger persistency in the subsequent iteration steps. In this way the designed iteration scheme can also work smoothly. The existence of two $(n-1)$-tori follows from the fact that a periodic function has at least two critical points.
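The model bound $|P_{l+1}| \le |P_l|^{\xi}$ with $\xi > 1$ gives super-exponential decay of the error term, which is what lets the infinite product of transformations converge despite the shrinking domains. A minimal numerical illustration, with the starting size `1e-2` and exponent `1.5` chosen by us purely for the example:

```python
def error_sequence(delta0, xi, steps):
    """Iterate the model bound delta_{l+1} = delta_l ** xi."""
    deltas = [delta0]
    for _ in range(steps):
        deltas.append(deltas[-1] ** xi)
    return deltas


# Hypothetical initial error size and exponent, for illustration only.
seq = error_sequence(1e-2, 1.5, 8)
for step, delta in enumerate(seq):
    print(step, delta)
```

After $l$ steps the error is $\delta_0^{\xi^l}$, so the number of correct digits grows geometrically with the step count.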


2.2. The critical point has stronger persistency. In this case (2.2) does not hold, which implies that the critical point of $N + N_+ - \langle \omega, \hat p \rangle$ remains close to that of $N - \langle \omega, \hat p \rangle$. It is shown in [Cg] that once the critical point acquires stronger persistency, the critical points in the following steps of iteration possess stronger persistency as well. Thus one can restrict $(p_n, q_n)$ to a smaller domain $\{|p_n| \le O(s^3),\ |q_n - q_n^*| \le O(s^3)\}$, expand $P$ in a Taylor series in $(p_n, q_n)$,

$$P = \sum P_{ij}(\hat p, \hat q)\, p_n^i (q_n - q_n^*)^j,$$

and use the truncation $T_s P$ of $P$ to approximate $P$. $T_s P$ will be chosen such that these sums only extend over $|i + j| \le \kappa$. By restricting the domains of $(p_n, q_n)$ to be suitably small, $P - T_s P$ is much smaller than $P$ in the relevant domain. To find a symplectic transformation (2.1) we introduce the generating function $W$ by

$$W(p_+, q) = \sum_{|i+j| \le \kappa} W_{ij}(\hat p_+, \hat q)\, p_{n+}^i (q_n - q_n^*)^j, \tag{2.8}$$

where the $W_{ij}$ are determined by the following equations:

$$\partial W_l + A_l W_l = -\tilde P_l, \qquad (0 \le l \le \kappa), \tag{2.9}$$

with

$$W_l = (W_{l0}, W_{(l-1)1}, \cdots, W_{0l})^t, \qquad \tilde P_l = (\tilde P_{l0}, \tilde P_{(l-1)1}, \cdots, \tilde P_{0l})^t,$$

$$\partial W_{ij} = \langle \omega, W_{ij \hat q} \rangle, \qquad \tilde P_{ij} = P_{ij} - [P_{ij}], \qquad [P_{ij}] = \frac{1}{(2\pi)^{n-1}} \int_0^{2\pi} P_{ij}(\hat p, \hat q)\, d\hat q,$$

and

$$A_0 = 0, \qquad A_1 = \begin{pmatrix} -N_{p_n q_n} & N_{p_n^2} \\ \phi'' & 0 \end{pmatrix} := \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}.$$

The other matrices are defined through the entries of $A_1$ in the following way: for $2 \le l \le \kappa$,

$$A_l = (a_{ij})_{(l+1) \times (l+1)}, \qquad a_{jj} = (l + 1 - j) b_{11} + (j - 1) b_{22}, \qquad a_{j(j+1)} = j\, b_{12}, \qquad a_{(j+1)j} = (l + 1 - j) b_{21}, \qquad a_{ij} = 0 \ \ (|i - j| \ge 2).$$

All entries in these matrices are evaluated at the critical point. The equations in (2.9) can be solved in this way. First we expand $W_{ij}$ and $\tilde P_{ij}$ into Fourier series,

$$W_{ij} = \sum_{k \in \mathbb{Z}^{n-1} \setminus \{0\}} W_{ijk}(\hat p)\, e^{i \langle k, \hat q \rangle}, \qquad \tilde P_{ij} = \sum_{k \in \mathbb{Z}^{n-1} \setminus \{0\}} P_{ijk}(\hat p)\, e^{i \langle k, \hat q \rangle},$$

then

$$(i \langle k, \omega \rangle I + A_j) W_{jk} = -\tilde P_{jk}, \qquad i, j \le \kappa,$$

where $I$ is the identity matrix. In the light of the representation theory of groups it is easy to show that the spectrum of $A_m$ is as follows:

$$\mathrm{Sp}(A_m) = \{(m - l)\sigma_1 + l \sigma_2,\ l = 0, 1, \cdots, m\}, \qquad \sigma_{1,2} = \frac{1}{2} \Big\{ -N_{p_n q_n} \pm \sqrt{N_{p_n q_n}^2 - 4 \phi'' N_{p_n^2}} \Big\},$$

if one realizes that $A_m$ comes from the following equation:

$$\partial W^m + (p_n,\ q_n - q_n^*) \begin{pmatrix} b_{12} & -b_{11} \\ b_{22} & -b_{21} \end{pmatrix} \begin{pmatrix} W^m_{q_n} \\ -W^m_{p_n} \end{pmatrix} = P^m,$$

where $W^m = \sum_{i+j=m} W_{ij}\, p_n^i (q_n - q_n^*)^j$. Thus we find

$$W_{0k} = \frac{i \tilde P_{0k}}{\langle k, \omega \rangle}, \qquad W_{mk} = -\frac{(i \langle k, \omega \rangle I + A_m)^* \tilde P_{mk}}{\prod_{\lambda_l \in \mathrm{Sp}(A_m)} (i \langle k, \omega \rangle + \lambda_l)}, \qquad (m \le \kappa), \tag{2.10}$$

where $A_m^*$ denotes the adjoint matrix of $A_m$. To make sure that the series for $W$ converges one has to deal with the small divisors not only for $\langle k, \omega \rangle$ but also for $i \langle k, \omega \rangle + \lambda_l$, if $\lambda_l$ has an imaginary part much bigger than its real part. In this case the "coupling of frequencies" is brought into action, which complicates the handling of the small divisor conditions. This corresponds to the critical point where $\phi$ achieves its minimum if $N_{p_n^2} > 0$, as supposed. However, we leave this difficulty to the second part of the proof and suppose that the following holds:

$$|i \langle k, \omega \rangle + \lambda_l| \ge D |k|^{-\mu}, \qquad \forall k \in \mathbb{Z}^{n-1} \setminus \{0\},\ |l| \le \kappa. \tag{2.11}$$
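The stated spectrum of the tridiagonal matrices $A_m$ can be checked exactly on an example. The sketch below builds $A_m$ from the entry rules given above for a hypothetical integer matrix $A_1 = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ (our choice, picked so that its eigenvalues $\sigma_1 = 4$, $\sigma_2 = -1$ are integers) and verifies with rational arithmetic that every predicted value $(m - l)\sigma_1 + l\sigma_2$ is a root of the characteristic polynomial.

```python
from fractions import Fraction


def build_a(level, b11, b12, b21, b22):
    """Tridiagonal (level+1) x (level+1) matrix A_level built from the entries of A_1."""
    size = level + 1
    a = [[Fraction(0)] * size for _ in range(size)]
    for j in range(1, size + 1):  # 1-based index j, as in the text
        a[j - 1][j - 1] = Fraction((level + 1 - j) * b11 + (j - 1) * b22)
        if j <= level:
            a[j - 1][j] = Fraction(j * b12)
            a[j][j - 1] = Fraction((level + 1 - j) * b21)
    return a


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def char_poly_at(matrix, lam):
    """det(matrix - lam * I), evaluated exactly."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return determinant(shifted)


SIGMA_1, SIGMA_2 = 4, -1  # eigenvalues of the hypothetical A_1 = [[1, 2], [3, 2]]
spectrum_ok = all(
    char_poly_at(build_a(m, 1, 2, 3, 2), (m - l) * SIGMA_1 + l * SIGMA_2) == 0
    for m in range(1, 5)
    for l in range(m + 1)
)
print(spectrum_ok)
```

Since the $m + 1$ predicted values are distinct in this example, they exhaust the spectrum of the $(m+1) \times (m+1)$ matrix; the divisors in (2.10) are then exactly the numbers $i \langle k, \omega \rangle + \lambda_l$ controlled by (2.11).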

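Under this assumption the generating function is well defined, and the symplectic map of (2.1) transforms the Hamiltonian $H(p, q)$ into the form of (2.6), where

$$\begin{aligned}
N_+ &= \frac{1}{(2\pi)^{n-1}} \int_0^{2\pi} T_s P(p_+, q)\, d\hat q, \\
P_+ &= (M_1 + M_2 + M_3 + M_4)(p_+, q_+), \\
M_1 &= N(p_+ + W_q, q_n) - N(p_+, q_n + W_{p_{n+}}) - \langle \omega, W_{\hat q} \rangle - (p_{n+},\ q_n - q_n^*) \begin{pmatrix} N_{p_n^2} & N_{p_n q_n} \\ 0 & \phi'' \end{pmatrix} \begin{pmatrix} W_{q_n} \\ -W_{p_{n+}} \end{pmatrix}, \\
M_2 &= (P - T_s P)(p, q), \\
M_3 &= (T_s P - N_+)(p, q) - (T_s P - N_+)(p_+, q), \\
M_4 &= N_+(p, q_n) - N_+(p_+, q_{n+}),
\end{aligned}$$

where the entries of the matrix are evaluated at the critical point. We write $M_1 = M_{11} + M_{12} + M_{13} + M_{14} + M_{15}$, with

$$\begin{aligned}
M_{11} &= \phi(q_n) - \phi(q_n + W_{p_{n+}}) + \phi''(q_n^*)(q_n - q_n^*) W_{p_{n+}}, \\
M_{12} &= \tilde N(p_+ + W_q, q_n) - \tilde N(p_+, q_n) - N_{p_n^2}(\rho(q_n), q_n)\, p_{n+} W_{q_n} \\
&= \sum_{i,j=1}^{n} N_{p_i p_j}(\rho(q_n), q_n)\, \Delta p_{i+} W_{q_j} - N_{p_n^2}(\rho(q_n), q_n)\, \Delta p_{n+} W_{q_n} \\
&\quad + \frac{1}{2} \sum_{i,j,k=1}^{n} N_{p_i p_j p_k}(\rho(q_n) + \xi_1 \Delta p_+, q_n)\, \Delta p_{i+} \Delta p_{j+} W_{q_k} + \frac{1}{2} \sum_{i,j=1}^{n} N_{p_i p_j}(p_+ + \xi_2 W_q, q_n)\, W_{q_i} W_{q_j}, \\
M_{13} &= \tilde N(p_+, q_n) - \tilde N(p_+, q_n + W_{p_{n+}}) - N_{p_n q_n}(\rho(q_n), q_n)\, p_{n+} W_{q_n} \\
&= -\sum_{i=1}^{n-1} N_{p_i q_n}(\rho(q_n), q_n)\, \Delta p_{i+} W_{p_{n+}} - \frac{1}{2} N_{q_n^2}(p_+, q_n + \eta W_{p_n})\, W_{p_{n+}}^2 \\
&\quad - \frac{1}{2} \sum_{i,j=1}^{n} N_{p_i p_j q_n}(\rho(q_n) + \xi_3 \Delta p_+, q_n)\, \Delta p_{i+} \Delta p_{j+} W_{p_{n+}}, \\
M_{14} &= \big( N_{p_n^2}(\rho(q_n), q_n) - N_{p_n^2}(p^*, q_n^*) \big)\, p_{n+} W_{q_n}, \\
M_{15} &= \big( -N_{p_n q_n}(\rho(q_n), q_n) + N_{p_n q_n}(p^*, q_n^*) \big)\, p_{n+} W_{p_n}.
\end{aligned}$$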

Such a procedure yields a sequence of transformations $M_0, M_1, \cdots$ whose infinite product converges on a trivially embedded $(n-1)$-torus $(p, q_n) = (\tilde p, \tilde q_n)$ and maps it to an $(n-1)$-torus, with its tangent map taking a constant vector field on $(\tilde p, \tilde q_n)$ to the vector field determined by (1.1). Since such transformations are designed around each critical point, one finally obtains at least two invariant $(n-1)$-tori.

2.3. Smooth transition between two generating functions. To get a measure estimate on the set where (2.11) holds we will exploit the differentiability of $\lambda_l$ in $\omega$, to some extent in the sense of Whitney [Wh]. For this, the generating function should change smoothly in $\omega$ at any step of iteration. That is why a smooth transition between the generating functions defined by (2.4) and (2.8) is introduced here. Let $\mu(x)$: $\mathbb{R} \to \mathbb{R}_+$ be the $C^\infty$ function defined as follows:

$$\mu(x) = \int_{-\infty}^{x} \alpha(t)\, dt \Big/ \int_{-\infty}^{\infty} \alpha(t)\, dt, \qquad \alpha(t) = \begin{cases} e^{-\frac{1}{t(1-t)}}, & 0 < t < 1, \\ 0, & \text{otherwise}, \end{cases}$$

so $\mu(x) = 0$ for $x \le 0$ and $\mu(x) = 1$ for $x \ge 1$. With this function we define a smooth transition between the generating functions in the weaker and stronger persistency cases,

$$W(p, q) = \mu(\beta - 1) \hat W + (1 - \mu(\beta - 1)) \tilde W, \tag{2.12}$$

in which $\hat W$ is the generating function defined through (2.8), $\tilde W$ is the one defined through (2.4) and $\beta$ is as in (2.2). With such a definition we regard the critical point as being at an intermediate stage when $1 < \beta < 2$ in (2.2). The generating function (2.12) also defines a symplectic transformation (2.1). In the new coordinates $(p_+, q_+)$, the Hamiltonian $H(p, q)$ has the following form:

$$H = (N + N_+)(p_+, q_{n+}) + P_+(p_+, q_+), \tag{2.13}$$

where

$$\begin{aligned}
N_+ &= \frac{1}{(2\pi)^{n-1}} \int_0^{2\pi} \big( (1 - \mu) P(p, q) + \mu T_s P(p, q) \big)\, d\hat q = (1 - \mu) \tilde N_+ + \mu \hat N_+, \\
P_+ &= M_o + \sum_{j=1,2,3} \big( \mu M_{js} + (1 - \mu) M_{jw} \big) + M_4, \\
M_o &= N(p_+ + W_q, q_n) - N(p_+, q_n + W_{p_{n+}}) - \mu N(p_+ + \hat W_q, q_n) + \mu N(p_+, q_n + \hat W_{p_{n+}}) \\
&\quad - (1 - \mu) N(p_+ + \tilde W_q, q_n) + (1 - \mu) N(p_+, q_n + \tilde W_{p_{n+}}), \\
M_{1w} &= N(p_+ + \tilde W_q, q_n) - N(p_+, q_n + \tilde W_{p_{n+}}) - \langle \omega, \tilde W_{\hat q} \rangle + \frac{\partial^2 N}{\partial p_n \partial q_n}(\rho(q_n), q_n)\, p_{n+} \tilde W_{p_{n+}}, \\
M_{2w} &= R(P - \tilde N_+)(p_+, q) + R_K(T_w(P - \tilde N_+))(p_+, q), \\
M_{3w} &= (P - \tilde N_+)(p, q) - (P - \tilde N_+)(p_+, q), \\
M_{1s} &= N(p_+ + \hat W_q, q_n) - N(p_+, q_n + \hat W_{p_{n+}}) - \langle \omega, \hat W_{\hat q} \rangle - (p_{n+},\ q_n - q_n^*) \begin{pmatrix} N_{p_n^2} & N_{p_n q_n} \\ 0 & \phi'' \end{pmatrix} \begin{pmatrix} \hat W_{q_n} \\ -\hat W_{p_{n+}} \end{pmatrix}, \\
M_{2s} &= (P - T_s P)(p, q), \\
M_{3s} &= (T_s P - \hat N_+)(p, q) - (T_s P - \hat N_+)(p_+, q), \\
M_4 &= N_+(p, q_n) - N_+(p_+, q_{n+}).
\end{aligned}$$

In view of the experience we have in dealing with the cases of weaker or stronger persistency, by shrinking the domain of the new variables $(p_+, q_+)$ properly, the new error term $P_+$ is expected to be much smaller than the original one, so that the iteration can be carried on further.

So far we have designed an iterative scheme to obtain a sequence of symplectic transformations which converges on an $(n-1)$-torus, provided the coupling of the torus frequencies and the normal frequency does not break condition (2.11). However, this is not guaranteed, since $\lambda_l$ comes from the perturbation and it is indeed possible that $i \langle k, \omega \rangle + \lambda_l = 0$. What we hope to show is that this is not always the case, which is the second part of our proof in this paper. Consider the $(n-1)$-dimensional manifold $M$ in action variable space where (1.3) always holds. For those $(\omega, 0) \in \frac{\partial N}{\partial p}(M)$ which satisfy some Diophantine condition a formal iterative scheme can be set up. In this way we obtain a Hamiltonian depending on a parameter, $H = H(p, q, \omega)$, which is real analytic in the domain

$$\Sigma_{\tau,\sigma}: \quad |\mathrm{Im}\, q| \le \tau, \qquad |p - p(\omega)| \le \sigma,\ \forall p(\omega) \in M,$$

where $p(\omega) = \big( \frac{\partial N}{\partial p} \big)^{-1}(\omega, 0)$. At the beginning of the iteration $H = H(p + p(\omega), q)$. Along the line we followed in [CS3] we shall show that if the iteration is done at those $\omega$ which are Diophantine, the Hamiltonian $H(p_m, q_m, \omega)$ at the $m$th step of iteration still has some differentiability in $\omega$ in the sense of Whitney. Consequently, $\lambda_l(\omega)$ has such differentiability in the regions where the critical point of $\phi(q_n)$ has stronger persistency. With some technical lemmas developed later it is shown that there is indeed a set in $\frac{\partial N}{\partial p}(M)$ with positive Lebesgue measure where (2.11) holds, which is the prerequisite for the validity of the first part of the proof.

3. An Iteration Step

In this section the details of one step of the iteration are given. At the beginning of this transformation step we assume that the main part of the Hamiltonian $N(p, q_n)$ coincides with that obtained from $N_o + P_o$ by the composition of earlier steps of KAM transformation on the domains $\bigcup_j \Sigma_s^j \cup \bigcup_j D_s^j$, where

$$\begin{aligned}
\Sigma_s^j &= \{ |\hat p - \hat\rho(q_n)| \le s^4,\ |p_n| \le s^3,\ |\mathrm{Im}\, q_n| \le s^2,\ \mathrm{Re}\, q_n \in L_j \}, \\
D_s^j &= \{ |\hat p - \hat\rho(q_n)| \le s^4,\ |p_n| \le s^3,\ |q_n - q_{nj}^*| \le s^3 \}.
\end{aligned}$$

Here $q_{nj}^* \in \mathbb{T}$ is the critical point of the function $\phi(q_n)$ with stronger persistency. In each domain $L_j$ there is at least one point where $|\phi'| \le s^5$ and $|\phi''| \le s_+^2$. On the real domain $\{ |\hat p - \hat\rho(q_n)| \le s^4,\ |p_n| \le s^3,\ q_n \in \mathbb{T} \}$, $N$ satisfies the following:

$$\left| \frac{\partial^{|k+l|} N}{\partial p^k \partial q_n^l} \right| < B_1, \qquad \forall |k + l| \le 2n + 5,\ l \ge 1,\ q_n \in \mathbb{T}. \tag{3.1}$$

Denote $\{ |q_n - q_{nj}^*| \le s^3,\ q_n \in \mathbb{T} \} \subset D_s^j$ by $I_j$. These $L_j$ and $I_j$ are disjoint from each other. In the complement of these sets in $\mathbb{T}$ there might be some critical points of $\phi(q_n)$ where $|\phi''| \ge s^2$ and $\phi'' N_{p_n^2} > 0$. The reason is that at some earlier step of iteration a resonance between the torus frequencies and the normal one occurs, which leads to the interruption of the successive transformations. Denoting an $s^3$-neighbourhood of such points also by $I_j$, which is also disjoint from the other $I_k$ and $L_k$, we have

$$|\phi'(q_n)| > \frac{1}{2} s^5, \qquad q_n \notin L_j,\ q_n \notin I_j. \tag{3.2}$$

The case when the union of all $L_j$ covers the whole $\mathbb{T}$ can be treated in the same way as in the following. To proceed with the details, we construct such an iteration scheme around critical points in three ways, according to the persistency of the corresponding critical point.

3.1. Weaker persistency. As shown in [Cg], in this case one needs to construct the nested domains for the canonical variables in an implicit way, so that the critical point for the next step of iteration is still in the chosen domain. It is assumed that the Hamiltonian $N(p, q_n) + P(p, q)$ is real analytic in the domain $A_t \times \Sigma_s^j$, where

$$A_t = \{ |\mathrm{Im}\, \hat q| \le t,\ \mathrm{Re}\, \hat q \in \mathbb{T}^{n-1} \},$$

and $L_j$ is either an interval contractible to one point or the whole $\mathbb{T}$. If some $L_j \subset \mathbb{T}$ is homotopic to a point, then $L_j \supseteq [q_{nj}^* - s^2, q_{nj}^* + s^2]$. $q_{nj}^*$ is a local minimum or local maximum point of the function $\phi(q_n)$. For different subindices $j$, the corresponding critical points may be strongly persistent, weakly persistent, or at an intermediate stage; they need not be of the same type. It is worth pointing out that $\rho(q_n)$ depends on the parameter $\omega$, i.e.

$$\frac{\partial N}{\partial p}(\rho(q_n), q_n) = (\omega, 0).$$

Assume some $q_{nj}^*$ in $L_j$ has weaker persistency; then we can proceed as in [Cg] to assume the following. Let $s_+$, $t_+$ be positive numbers such that

$$s_+^{4l-2} \ge s^{4l} \ \ (l \le 2n + 5), \qquad 2 s_+ \le s, \qquad s \le t \le 1, \qquad s_+ \le t_+, \qquad s_+ \le \frac{1}{4}(t - t_+),$$

(3.3) (3.4)

j

We assume further in At × 6s , 3.1.1. The smallness of the perturbation P . |P (p, q)| ≤ δ,

(3.5)

where δ ≤ min

n D 6 (t − t )6µ λ 2 10 1 9 o + 4 3 (s 4 − 3s+ )(s 3 − 3s+ ), s . s s , c3 12n + 4c7 +

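3.1.2. The nondegeneracy of $N$. The Hessian matrix of $N$ in $p$ is nonsingular and

$$\min_{\Sigma_s} \left\| \frac{\partial^2 N}{\partial p^2} \xi \right\| \ge \lambda \|\xi\|, \qquad \forall \xi \in \mathbb{C}^n.$$

Without loss of generality we assume $0 < \lambda \le 1$.

3.1.3. There is at least one point in $L_j$ where $|\phi'| \le s^5$, $|\phi''| \le s_+^2$, and $\phi(q_n) = N(\rho(q_n), q_n) - \langle \omega, \hat\rho(q_n) \rangle$ ($q_n \in L_j$), in which $\rho(q_n)$ is determined by

$$\frac{\partial}{\partial p} \big( N(p, q_n) - \langle \omega, \hat p \rangle \big) \Big|_{p = \rho(q_n)} = 0 \tag{3.6}$$

and is real analytic in $\{ |\mathrm{Im}\, q_n| \le s^2,\ \mathrm{Re}\, q_n \in L_j \}$. On the whole line $\{ (\rho(q_n), q_n),\ q_n \in L_j \}$, $|\phi''(q_n)| \le 2 s^2$. If $L_j$ is a contractible interval, denoted by $[l, r]$, it is assumed that $l \le q_n^* - s^2$, $r \ge q_n^* + s^2$, and that $|\phi'(q_n)| \ge 2 s^4$ at each endpoint $q_n = l$, $q_n = r$.

3.1.4. In $\Sigma_s$,

$$\max_{j \le 3} \left| \frac{\partial^j N}{\partial p^j} \right| \le \eta, \qquad \max_{|(l_1, l_2)| \le 3,\ l_2 \ge 1} \left| \frac{\partial^{l_1 + l_2} N}{\partial^{l_1} p\, \partial^{l_2} q_n} \right| \le \frac{1}{7 n^2 2^{\mu}},$$

$$\max_{j \le 3} |\phi^{(j)}(q_n)| \le \frac{1}{64}, \qquad \max_{j = 1, 2} |\rho^{(j)}(q_n)| \le \frac{1}{2}.$$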

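Introduce several intermediate domains which are defined somewhat implicitly:

$$A_{t_+} \subset A_1 \subset A_2 \subset A_t, \qquad \Sigma_{s_+} \subset \Sigma_o \subset \Sigma_1 \subset \Sigma_2 \subset \Sigma_s,$$

where

$$\begin{aligned}
A_{t_+} &= \{ |\mathrm{Im}\, \hat q| \le t_+ \}, \qquad A_1 = \{ |\mathrm{Im}\, \hat q| \le t_+ + s_+^2 \}, \qquad A_2 = \{ |\mathrm{Im}\, \hat q| \le \tfrac{1}{2}(t + t_+) \}, \\
\Sigma_{s_+} &= \{ |\hat p - \hat\rho_+(q_n)| \le s_+^4,\ |p_n - p_{n+}^*| \le s_+^3,\ |\mathrm{Im}\, q_n| \le s_+^2,\ \mathrm{Re}\, q_n \in L_+ \}, \\
\Sigma_o &= \{ |\hat p - \hat\rho(q_n)| \le 2 s_+^4,\ |p_n| \le 2 s_+^3,\ |\mathrm{Im}\, q_n| \le s_+^2,\ \mathrm{Re}\, q_n \in L_+ \}, \\
\Sigma_1 &= \{ |\hat p - \hat\rho(q_n)| \le 3 s_+^4,\ |p_n| \le 3 s_+^3,\ |\mathrm{Im}\, q_n| \le 2 s_+^2,\ \mathrm{Re}\, q_n \in L_1 \}, \\
\Sigma_2 &= \{ |\hat p - \hat\rho(q_n)| \le \tfrac{1}{2}(s^4 + 3 s_+^4),\ |p_n| \le \tfrac{1}{2}(s^3 + 3 s_+^3),\ |\mathrm{Im}\, q_n| \le \tfrac{1}{2}(s^3 + 3 s_+^3),\ \mathrm{Re}\, q_n \in L_2 \}.
\end{aligned}$$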

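When $L \ne \mathbb{T}$, $L_2 = [l + \frac{1}{2}(l_1 - l),\ r - \frac{1}{2}(r - r_1)]$, where $L_1 = [l_1, r_1]$ is set so that $|\phi'|$ is smaller than $4 s_+^4$ and attains this value only at the endpoints of $L_j$. $L_+ = L_1 - s_+^2$. Under condition (3.1.3) it is easy to show, by an argument basically the same as in [Cg], that

$$|\phi'(q_n)| \le 8 s_+^4, \qquad |\phi''(q_n)| \le \frac{3}{2} s_+^2, \tag{3.7}$$

for $q_n \in \{ |\mathrm{Im}\, q_n| \le 2 s_+^2,\ \mathrm{Re}\, q_n \in L_1 \}$. As in [Cg], we make use of the generating function defined by (2.4), with

$$K = \left[ \frac{1}{n - 1} \left( \frac{D}{2 s^2} \right)^{\frac{1}{\mu}} \right].$$

We find a symplectic transformation $M$ as in (2.1) which maps $A_{t_+} \times \Sigma_o$ into $A_t \times \Sigma_s$, and in $A_{t_+} \times \Sigma_o$

$$|M - \mathrm{id}| \le 4 \theta (s^4 - 3 s_+^4)^{-1}, \qquad \left| \frac{\partial(p, q)}{\partial(p_+, q_+)} - \mathrm{Id} \right| \le 80 \theta (s^4 - 3 s_+^4)^{-2}, \tag{3.8}$$

where

$$\theta = \frac{c_1 \delta}{D (t - t_+)^{\mu}}.$$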

Such a map takes the Hamiltonian $H(p, q)$ into the form $(N + N_+)(p_+, q_{n+}) + P_+(p_+, q_+)$, so that (3.1.2) and (3.1.4) also hold for $N + N_+$ with $s$ replaced by $s_+$; (3.1.3) is guaranteed for $N + N_+$ provided the degree of $\phi'$ on $L_j$ satisfies $\deg(\phi') = \pm 1$. When $\deg(\phi') = 0$ there may be no critical point for $\phi_+(q_n)$. Nevertheless, if there is no point where $|\phi_+'| \le s_+^5$ and $|\phi_+''| \le s_+^2$ we eliminate $L_j$ in the following steps of iteration. In the intermediate domain

$$\{ |\mathrm{Im}\, \hat q| \le t_+ + s_+^2 \} \times \{ |\hat p - \hat\rho(q_n)| \le 3 s_+^4,\ |p_n| \le 3 s_+^3,\ |\mathrm{Im}\, q_n| \le 2 s_+^2,\ \mathrm{Re}\, q_n \in L_+ + s_+^2 \}$$

the following estimates hold:

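$$\begin{aligned}
|M_{11}| &\le |\phi'(q_n)|\, |W_{p_{n+}}| + \max |\phi^{(3)}|\, |W_{p_{n+}}|^2 \\
&\le 16 \theta s_+^4 (s^3 - 3 s_+^3)^{-1} + 4 \theta^2 (s^3 - 3 s_+^3)^{-2}, \\
|M_{12}| &\le n^2 \eta \big( |\hat p_+ - \hat\rho(q_n)|\, |W_{\hat q}| + |p_{n+}|\, |W_{q_n}| + |W_q|^2 \big) \\
&\le n^2 \eta \big( 6 \theta s_+^3 (s^2 - 2 s_+^2)^{-1} + 12 \theta s_+^4 (t - t_+)^{-1} + 4 \theta^2 (s^2 - 2 s_+^2)^{-2} \big), \\
|M_{13}| &\le \eta \big( 6 n \theta s_+^4 (s^3 - 3 s_+^3)^{-1} + 4 \theta^2 (s^3 - 3 s_+^3)^{-2} + 9 n^2 \theta s_+^6 (s^3 - 3 s_+^3)^{-1} \big), \\
|M_2| &\le 8 n \delta \theta (s^4 - 3 s_+^4)^{-1} s^{-4}, \\
|M_3| &\le 2 \delta \frac{s_+^{8n+22}}{s} + \theta s^2, \\
|M_4| &\le 16 n \delta \theta (s^4 - 3 s_+^4)^{-1} s^{-4},
\end{aligned}$$

thus

$$|P_+| \le c_2 \Big\{ \theta s_+^3 (s^2 - 2 s_+^2)^{-1} + \delta \frac{s_+^{8n+22}}{s} \Big\} := \delta_+.$$

Therefore in $A_{t_+} \times \Sigma_o$

$$\max_{|l_1 + l_2| = k} \left| \frac{\partial^k P_+}{\partial p_+^{l_1} \partial q_+^{l_2}} \right| \le \frac{k!\, \delta_+}{s_+^{4k}}, \tag{3.9}$$

which clearly holds in $A_+ \times \Sigma_+$. (3.1.1) can also be made to hold for $P_+$, as is shown in Sect. 5 below.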

3.2. Intermediate case. By our classification the following holds for $1 < \beta < 2$:

$$|\phi''(q_n)|_{q_n = q_n^*} = \beta s_+^2 \tag{3.10}$$

at the critical point $(\rho(q_n^*), q_n^*)$. Under such a condition the critical point $(p_{n+}^*, q_{n+}^*)$ of $(N + N_+)(p, q_n) - \langle \omega, \hat p \rangle$, determined by the same variational principle, remains close to that of $N(p, q_n) - \langle \omega, \hat p \rangle$; indeed, we have ([Cg])

$$|q_n^* - q_{n+}^*| < s^5, \qquad |p_n^* - p_{n+}^*| < s^5. \tag{3.11}$$

Such a property enables us to find nested domains in an explicit way and a relevant symplectic transformation of coordinate (2.1) with the generating function (2.12). Instead of considering the Hamiltonian N + P in At × 6s , we only suppose it is real analytic in the domain At × Ds Ds+ ⊆ 6o [Cg]. We deal with it in this way because the transition from weaker persistency to the intermediate case and to the stronger persistency case is irreversible. To construct such a step of iteration we introduce several intermediate domains: At+ ⊂ A1 ⊂ A2 ⊂ At ,

Ds+ ⊂ D0 ⊂ D1 ⊂ D2 ⊂ Ds ,


C.-Q. Cheng

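$$
\begin{aligned}
A_{t_+} &= \{|\mathrm{Im}\,\hat q| \le t_+,\ \mathrm{Re}\,\hat q \in \mathbb{T}^{n-1}\},\\
A_1 &= \{|\mathrm{Im}\,\hat q| \le t_+ + s_+^2,\ \mathrm{Re}\,\hat q \in \mathbb{T}^{n-1}\},\\
A_2 &= \{|\mathrm{Im}\,\hat q| \le t - \tfrac12 (t - t_+),\ \mathrm{Re}\,\hat q \in \mathbb{T}^{n-1}\},\\
D_{s_+} &= \{|\hat p - \hat\rho_+(q_n)| \le s_+^4,\ |p_n - \rho_{n+}(q_n)| \le s_+^3,\ |q_n - q_{n+}^*| \le s_+^3\},\\
D_0 &= \{|\hat p - \hat\rho(q_n)| \le 2s_+^4,\ |p_n| \le 2s_+^3,\ |q_n - q_n^*| \le 2s_+^3\},\\
D_1 &= \{|\hat p - \hat\rho(q_n)| \le 3s_+^4,\ |p_n| \le 3s_+^3,\ |q_n - q_n^*| \le 3s_+^3\},\\
D_2 &= \{|\hat p - \hat\rho(q_n)| \le s^4 - \tfrac12 (s^4 - 3s_+^4),\ |p_n| \le s^3 - \tfrac12 (s^3 - 3s_+^3),\ |q_n - q_n^*| \le s^3 - \tfrac12 (s^3 - 3s_+^3)\}.
\end{aligned}
$$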

Supposing $|P| \le \delta$ in $A_t \times D_s$ and developing $P$ in a Taylor series in $(p_n, q_n)$, we have Cauchy's estimate in $\{|\mathrm{Im}\,\hat q| \le t\} \times \{|\hat p - \hat\rho(q_n)| \le s^4\}$,

$$ \max\{|P_{ij}(\hat p, \hat q)|\} \le \frac{(i+j)!\,\delta}{s^{3(i+j)}}. \tag{3.12} $$
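The mechanism behind an estimate of the type (3.12) can be illustrated in one complex variable. The sketch below is only an illustration under stated assumptions: the test function (`cmath.exp`), the radius `r = 0.5`, and the helper names `taylor_coefficient`, `sup_on_circle` and `cauchy_check` are hypothetical and do not come from the paper. It approximates Taylor coefficients by a discretized Cauchy integral and compares $|f^{(k)}(0)|$ with the bound $k!\,M/r^k$, where $M$ is the supremum of $|f|$ on the circle $|z| = r$.

```python
import cmath
import math

def taylor_coefficient(f, k, r, samples=256):
    """Approximate a_k = f^(k)(0)/k! by the trapezoidal rule on the circle |z| = r."""
    total = 0j
    for j in range(samples):
        theta = 2.0 * math.pi * j / samples
        z = r * cmath.exp(1j * theta)
        total += f(z) * cmath.exp(-1j * k * theta)
    return total / (samples * r**k)

def sup_on_circle(f, r, samples=256):
    """Sampled supremum of |f| on the circle |z| = r."""
    return max(abs(f(r * cmath.exp(2j * math.pi * j / samples))) for j in range(samples))

def cauchy_check(f, r, max_order):
    """Return rows (k, |f^(k)(0)|, k! M / r^k) for k = 0..max_order."""
    m = sup_on_circle(f, r)
    rows = []
    for k in range(max_order + 1):
        deriv = abs(taylor_coefficient(f, k, r)) * math.factorial(k)
        bound = math.factorial(k) * m / r**k
        rows.append((k, deriv, bound))
    return rows

# Hypothetical test case: f = exp on the disc of radius 0.5.
rows = cauchy_check(cmath.exp, 0.5, 6)
for k, deriv, bound in rows:
    print(k, deriv, bound)
```

In (3.12) the same argument is applied in the two variables $(p_n, q_n)$ with radius $s^3$ in each, which is where the factor $s^{3(i+j)}$ in the denominator comes from.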

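Also by Cauchy's method we find in $A_2 \times D_2$,

$$
\begin{aligned}
&|\tilde W(p_+, q)| \le \frac{c_1\delta}{D (t - t_+)^{\mu}} := \theta,\\
&|\tilde W_{\hat q}| \le 4\theta (t - t_+)^{-1}, \qquad |\tilde W_{\hat p}| \le 2\theta (s^4 - 3s_+^4)^{-1},\\
&|\tilde W_{q_n}| \le 2\theta (s^2 - 2s_+^2)^{-1}, \qquad |\tilde W_{p_n}| \le 2\theta (s^3 - 3s_+^3)^{-1},\\
&\max\{|\tilde W_{p_i p_j}|, |\tilde W_{p_i q_j}|, |\tilde W_{q_i q_j}|\} \le 8\theta (s^4 - 3s_+^4)^{-2}, \qquad \forall i, j \le n,
\end{aligned}
\tag{3.13}
$$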

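if $i\omega$ and $\lambda_j$ ($j \le \kappa$) are nonresonant as (2.11) shows. We also find in $A_2 \times D_2$,

$$
\begin{aligned}
&|\hat W(p_+, q)| \le \frac{c_3\delta}{D^6 (t - t_+)^{6\mu}} := \gamma,\\
&|\hat W_{\hat q}| \le 4\gamma (t - t_+)^{-1}, \qquad |\hat W_{\hat p}| \le 2\gamma (s^4 - 3s_+^4)^{-1},\\
&|\hat W_{q_n}| \le 2\gamma (s^3 - 2s_+^3)^{-1}, \qquad |\hat W_{p_n}| \le 2\gamma (s^3 - 3s_+^3)^{-1},\\
&\max\{|\hat W_{p_i p_j}|, |\hat W_{p_i q_j}|, |\hat W_{q_i q_j}|\} \le 8\gamma (s^4 - 3s_+^4)^{-2}, \qquad \forall i, j \le n.
\end{aligned}
\tag{3.14}
$$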

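If (3.3) is assumed we then have that $M: (p_+, q_+) \to (p, q)$, generated by the function $W = \mu\hat W + (1 - \mu)\tilde W$, maps $A_1 \times D_1$ into $A_2 \times D_2$, and in $A_1 \times D_1$,

$$
\begin{aligned}
|M - \mathrm{id}| &\le 4\max\{\theta, \gamma\}(s^4 - 3s_+^4)^{-1},\\
\left|\frac{\partial(p, q)}{\partial(p_+, q_+)} - \mathrm{Id}\right| &\le 80\max\{\theta, \gamma\}(s^4 - 3s_+^4)^{-2}.
\end{aligned}
\tag{3.15}
$$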


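As for the estimation on the new error term $P_+$ in the domain $A_1 \times D_1$, we only need to figure out the upper bounds of $M_o$, $M_{js}$ ($j = 1, 2, 3$) in (2.13), which are

$$
\begin{aligned}
|M_o| &\le \max\left\{\left|\frac{\partial^2 N}{\partial p_i \partial p_j}\right|, \left|\frac{\partial^2 N}{\partial p_i \partial q_n}\right|, \left|\frac{\partial^2 N}{\partial q_n \partial q_n}\right|\right\}\max\{|W_q|^2, |W_{p_n}|^2\}
 \le c_4\max\{\theta^2, \gamma^2\}(s^4 - 3s_+^4)^{-2},\\
|M_{11s}| &\le \max|\phi''||W_{p_{n+}}|^2 + \max|\phi^{(3)}||q_n - q_n^*|^2|W_{p_{n+}}|
 \le 4\gamma^2 (s^3 - 3s_+^3)^{-2} + 18\gamma s_+^6 (s^3 - 3s_+^3)^{-1},\\
|M_{12s}| &\le 6n^2\eta\gamma s_+^4 (s^3 - 3s_+^3)^{-1} + 9n^3\eta s_+^6 (s^3 - 3s_+^3)^{-1} + 2n^2\eta\gamma^2 (s^3 - 3s_+^3)^{-2},\\
|M_{13s}| &\le 6(n-1)\eta\gamma s_+^4 (s^3 - 3s_+^3)^{-1} + 2\eta\gamma^2 (s^3 - 3s_+^3)^{-2} + 9n^2\eta\gamma s_+^6 (s^3 - 3s_+^3)^{-1},\\
|M_{14s}| &\le 18(n+2)\eta\gamma s_+^6 (s^3 - 3s_+^3)^{-1},\\
|M_{15s}| &\le 18(n+2)\eta\gamma s_+^6 (s^3 - 3s_+^3)^{-1},\\
|M_{2s}| &\le 2\delta\Big(\frac{s_+}{s}\Big)^{8n+22},\\
|M_{3s}| &\le 4\delta\gamma (s^3 + 3s_+^3)^{-1}(s^3 - 3s_+^3)^{-1},\\
|M_{4s}| &\le 8\delta\gamma (s^3 + 3s_+^3)^{-1}(s^3 - 3s_+^3)^{-1}.
\end{aligned}
$$

Putting these terms together we finally obtain

$$ |P_+| \le c_5\max\{\theta, \gamma\}\, s_+^4 (s^3 - 3s_+^3)^{-1} + \delta\Big(\frac{s_+}{s}\Big)^{8n+22} := \delta_+, $$

where $c_4$, $c_5$ depend on $N$. Shrinking $A_1 \times D_1$ to $A_{t_+} \times D_o$, we then have

$$ \max_{|l_1+l_2|=k}\left|\frac{\partial^k P_+}{\partial p_+^{l_1}\,\partial q_+^{l_2}}\right| \le \frac{k!\,\delta_+}{s_+^{4k}}, \tag{3.16} $$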

which certainly holds in $A_{t_+} \times D_{s_+}$ since $D_{s_+} \subset D_o$.

3.3. Stronger persistency. In this case $\beta \ge 2$ in (3.10), therefore $\mu(\beta - 1) = 1$ and the generating function is a special form of (2.12). All of the iteration is the same as above by setting $\mu = 1$. We also introduce a change of coordinates

$$ (p_n, q_n) \to (p_n + \rho_{n+}(q_n), q_n) \tag{3.17} $$

so that the Hamiltonian is defined in the domain $\{|\hat p - \hat\rho_+(q_n)| \le s_+^4,\ |p_n| \le s_+^3\}$. We observe that $N_+$ is real analytical on the domains $\Sigma_s^j$ and $D_s^j$. Thus it can be extended to $\mathbb{T}$ in $q_n$ as a real function by Whitney's extension theorem shown in the next section, and in view of (3.9) and (3.16) it is bounded by

$$ \left|\frac{\partial^{|k+l|} N_+}{\partial p^k\,\partial q_n^l}\right| \le c_6\,\frac{4\,k!\,l!\,\delta}{s^{4|k|+3l}}, \qquad \forall |k+l| \le 2n+5, \tag{3.18} $$


which brings about the following: ∂(N + N+ ) ≥ max{ 1 s 5 , (1 − 2B)s 5 } − c7 δ > 1 s 5 + ∂q 4 2 2 + s+ n+ when $q_n$ is outside $L_{+j}$ and $I_{+j}$, if we note the construction of these sets, the transformation (3.17) and the conditions imposed in (3.3), (3.5), in which $c_7$ depends on $n$ and $\lambda$. On the whole $\mathbb{T}$,

$$ \left|\frac{\partial^{|k+l|}(N+N_+)}{\partial p^k\,\partial q_n^l}\right| < B + c_7\frac{\delta}{s^{4|k+l|}}, \qquad \forall |k+l| \le 2n+5,\ l \ge 1. \tag{3.19} $$

For the next step of iteration, some critical point may turn out to have stronger persistency, and a simple calculation shows that the corresponding $I_{+j}$ has distance larger than $s_+^2$ to the adjacent $L_{+k}$ or $I_{+k}$. Also the distances among those $L_{+j}$ are larger than $s_+^2$ if we note how the intermediate domain $\Sigma_1^j$ is constructed and that $L_{+j}$ is determined by shrinking $\mathrm{Re}\,\Sigma_1^j$ by $s_+^2$ at each endpoint. The above argument shows that $\phi_+$ has at least two critical points which cannot be outside $L_{+k}$ or $I_{+k}$, which makes it possible to proceed with the iteration if there is no resonance between torus and tangent frequencies.

4. Differentiability in ω

From now on we let $\omega$ range over some open subset of $\mathbb{R}^{n-1}$ to see how the Hamiltonian depends on $\omega$ after some iteration steps. At the beginning of the iteration, the Hamiltonian depends on $\omega$ in the way $H_0(p, q, \omega) = H_0(p + g_0^{-1}(\omega), q)$, where $g_0$ is defined as $g_0(p) = N_p(p) = (\omega, 0)$. Since the small divisor problem is involved in the iteration, which depends on $\omega$, after some transformation steps $H_m(p, q, \omega) = N_m(p, q_n, \omega) + P_m(p, q, \omega)$ is only defined on some Cantor subset of this open set with respect to $\omega$. With the experience of our previous work [CS3], we believe in the differentiability of $H_m$ in $\omega$ in the sense of Whitney, which will be demonstrated in this section.

Definition 2. Let $S$ be a closed subset of $\mathbb{R}^n$ in the Euclidean topology. $H^{m,\alpha}(S)$ ($m \in \mathbb{Z}^n_+$, $0 < \alpha \le 1$) is the class of all families $u = \{u_l\}$ ($|l| \le m$) of functions defined on $S$ which satisfy, for some finite $M$,

$$ |u_k(x)| \le M, \qquad |u_k(x) - P_k(x, y, u)| \le M|x - y|^{m - |k| + \alpha}, \tag{4.1} $$

for all $x, y \in S$ and $|k| \le m$. Here we use the usual multi-index notation, and

$$ P_k(x, y, u) = \sum_{|l+k| \le m} u_{k+l}(y)\frac{(x - y)^l}{l!} $$

is the analogue of the $k$-th Taylor polynomial. We define a norm $\|u\|_{m,\alpha,S} = \inf M$


as the smallest $M$ for which both inequalities in (4.1) hold. On the other hand, for some open set $I$, $H^{m,\alpha}(I)$ is defined to be the class of all functions $u$ on $I$ with derivatives $u^{(l)}$ for all $|l| \le m$ which satisfy, for some finite $M$,

$$ |u^{(l)}(x)| \le M, \qquad |u^{(l)}(x) - P_l(x, y, u)| \le M|x - y|^{m - |l| + \alpha}, \tag{4.2} $$

for all $x, y \in I$ and all $|l| \le m$. In this case $P_l(x, y, u)$ is its Taylor polynomial like that in (4.1) with $u_k$ replaced by $u^{(k)}$. Moreover, the norm $\|u\|_{m,\alpha,I} = \inf M$ is defined as the smallest $M$ for which both inequalities in (4.2) hold. A mapping $g(x): S \to \mathbb{R}^n$ is said to be an element of $\{H^{m,\alpha}(S)\}^n$ if all its components $g_j \in H^{m,\alpha}(S)$. Its norm is given by

$$ \|g\|_{m,\alpha,S} = \max_{1 \le j \le n}\|g_j\|_{m,\alpha,S}. $$
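The two inequalities in (4.1) can be checked by hand on a small example. The following sketch works in one variable and under loud assumptions: the closed set is replaced by a short list of sample points, the family $u_k$ is taken to be the ordinary derivatives of $\sin$, and the names `sin_jet`, `taylor_poly` and `whitney_constant` are hypothetical helpers, not notation from the paper. It computes the smallest $M$ for which both inequalities hold on the sample.

```python
import math

def sin_jet(k, x):
    """k-th derivative of sin at x, used here as the family u_k."""
    return math.sin(x + k * math.pi / 2.0)

def taylor_poly(jet, k, x, y, m):
    """P_k(x, y, u): sum over l with l + k <= m of u_{k+l}(y) (x - y)^l / l!."""
    return sum(jet(k + l, y) * (x - y) ** l / math.factorial(l)
               for l in range(m - k + 1))

def whitney_constant(jet, points, m, alpha):
    """Smallest M satisfying both inequalities of (4.1) on the sample points."""
    best = 0.0
    for k in range(m + 1):
        for x in points:
            best = max(best, abs(jet(k, x)))
            for y in points:
                if x == y:
                    continue
                gap = abs(jet(k, x) - taylor_poly(jet, k, x, y, m))
                best = max(best, gap / abs(x - y) ** (m - k + alpha))
    return best

# A crude stand-in for a Cantor-like closed set: a few scattered points in [0, 1].
sample = [0.0, 0.1, 0.13, 0.4, 0.45, 0.9, 1.0]
M = whitney_constant(sin_jet, sample, m=3, alpha=1.0)
print(M)
```

For a genuinely smooth function the Taylor remainder controls the second inequality, so the constant is dominated by the sup bound on the derivatives; on a Cantor set the point of the definition is that the same inequalities make sense even though no neighbourhood of a point lies in $S$.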

Sometimes, for simplicity, we say a function u is of H m,α (S) if there is a family of function {ul } in H m,α (S) such that u = u0 , and when there is no danger of confusion we sometimes call ul the l-derivative of u in the sense of Whitney. Lemma 3 (Whitney). For each closed S and positive integer m suppose u(s, x) belongs to H m,α (S) in variable s, and is analytical in x on some domain (resp. C r differentiable). There exists a linear extension operator < H m,α (S) → H m,α (I), u → U =
(4.3)

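with a constant $c_6$ depending on $m$, $\alpha$ but not on $S$. To verify Whitney's differentiability of $H_m$, as the first step one needs to check whether the generating function has such a property.

Lemma 4. Suppose that $f_j(\omega)$ ($1 \le j \le \kappa + 2$), $h(\cdot, \omega) \in H^{m,1}(S)$ with the $j$-th norm bounded by $\sigma_j$, $\delta_j$ ($j \le m$) respectively, $h(\cdot, \omega)$ is analytical in $q$ on $\{|\mathrm{Im}\,q| \le s_o\}$ with period $2\pi$, $h_k(\omega)$ are its Fourier coefficients, and

$$ S = \big\{\omega \in \mathbb{R}^n : |i\langle k, \omega\rangle + f_l(\omega)| \ge D|k|^{-\mu};\ \forall k \in \mathbb{Z}^n\big\}. $$

Then the function

$$ u(q, \omega) = \sum_{k \in \mathbb{Z}^n\setminus\{0\}} \frac{f_{\kappa+1}(\omega)\langle k, \omega\rangle + f_{\kappa+2}(\omega)}{\prod_{l=1}^{\kappa}\big(i\langle k, \omega\rangle + f_l(\omega)\big)}\, h_k(\omega)\, e^{i\langle k, q\rangle} $$

also belongs to $H^{m,1}(S)$ in $\omega$. If $q$ is confined to a smaller domain $\{|\mathrm{Im}\,q| \le t_1 < t_o\}$, then one has the following estimates: $\|u(\cdot, \omega)\|_{m,1,S} \le$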

c8 m (σ, δ) , 0 +2 m D (to − t1 )µm

(4.4)


where m (σ, δ) = max{δm−j

k Y i=1

k σlνi i : 6i=1 li νi = j, j = 0, 1, 2, · · · , m, σo = 1, δo = δ}

and c8 depends only on m, n, µm = µ(m0 + 2) + m0 + n + 1, and m0 = max{m, κ}. Proof. We consider (ihk, ωi + fl (ω))−1 as the composition of the functions x −1 and ihk, ωi + fl (ω). Since we have, by straightforward calculation X 1 (x1 − x2 )l (x1 − x2 )j +1 1 − = (−1)l + (−1)j , l+1 j +1 x1 x2 x2 x x1 1≤l≤j

1 x1k

−

1 x2k

=

2

X

(−1)l

1≤l≤j −k

(l + k)!(x1 − x2 )l l!x2l+1

+

i nX i=1

αik

j +2−i i x1

x2

o

(x1 − x2 )j −k+1 ,

αik is some integer depending on (i, k). From the fact that fl (ω) is Whitney differentiable and ω ∈ S we get −1

k ihk, ωi + fl (ω)

k k Y X kj,1,S ≤ c9 D −(j +2) |k|(j +2)µ max{ σlνi i : li νi = j }. i=1

i=1

Q Next we consider l (ihk, ωi + fl (ω))−1 as the composition of the functions of l xl and (ihk, ωi + fl (ω))−1 and to make use of the estimate above. In this way we have Q

k

κ k k Y Y X 0 0 (ihk, ωi + fl (ω))−1 kj,1,S ≤ c10 D −j −2 |k|(j +2)µ max{ σlνi i : li νi = j }. l

i=1

i=1

Since h is included in H m,1 (S) in ω, the well-known Cauchy method leads to khk (ω)kj,1,S ≤ δj e−|k|r , (j ≤ m) if q is confined to a smaller domain {|Imq| ≤ t1 < to − r}. Therefore we have the estimate (4.4). To apply Lemma 4 we need first to estimate the Whitney norm of the eigenvalues in (2.10). Observe that Sp(Am ) = {(m − l)σ1 + lσ2 , l = 0, 1, · · · , m}, q o 1n − Npn qn ± Np2n qn − 4φ 00 Npn2 , σ1,2 = 2 q 2 . We only consider the case when |Np2n qn − 4φ 00 Npn2 | ≥ |Npn qn |, where |φ 00 | ≥ s+ otherwise the relevant tori are hyperbolic, so there is no necessity to study the Whitney 8n+20 differentiability. If s+ = s 8n+21 , as shown later, simple calculation leads to the estimate kλkl,1 ≤

O(kφ 00 kl,1 , kNkl,1 ) , s 2l

l ≤ 2n + 5, λ ∈ Sp(Am ).


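We also assume that the main part $N(p, q_n, \omega)$ and the function $\phi(q_n, \omega)$ satisfy the following (cf. (3.1)):

$$
\left|\frac{\partial^{|i+j+k|} N}{\partial p^i\,\partial q_n^j\,\partial\omega^k}\right| \le B \ll 1, \qquad
\left|\frac{\partial^{|j+k|}\phi}{\partial q_n^j\,\partial\omega^k}\right| \le B \ll 1, \qquad |i+j+k| \le 2n+5,\ j \ge 1,
\tag{4.5}
$$

and the error term $P(p, q, \omega)$ is bounded by

$$ \|P(\cdot, \cdot, \omega)\|_{l,1} \le \frac{c_{11}\delta}{s^{4(l+1)}}, \qquad l \le 2n+5. \tag{4.6} $$

Notice that $\sup\left|\frac{d^l P}{d\omega^l}\right| = \|P\|_{l-1,1}$ for $l < m$ if these are considered as the usual derivatives. According to Lemma 4 and from (2.5) we find that the relevant generating function $W$ is bounded by

$$ \|W(\cdot, \cdot, \omega)\|_{l,1} \le \frac{c_{12}\delta}{D^{l'+2}(t - t_+)^{\mu_l}\, s^{4(l+1)}}, \qquad l \le 2n+5. \tag{4.7} $$

In the case of stronger persistency, besides (4.5) and (4.6) we need the bound for $q_n^*$, which is determined by the relation $\phi'(q_n^*, \omega) \equiv 0$. Thus

$$ \frac{\partial q_n^*}{\partial\omega} = -\left(\frac{\partial^2\phi}{\partial q_n^2}\right)^{-1}\frac{\partial^2\phi}{\partial q_n\,\partial\omega}, $$

which leads to the estimate

$$ \|q_n^*(\omega)\|_{l,1} \le \frac{c_{13}}{s_+^{4l+2}}, \qquad l \le 2n+4. \tag{4.8} $$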
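The formula for $\partial q_n^*/\partial\omega$ behind (4.8) is an instance of implicit differentiation of $\phi'(q_n^*, \omega) = 0$, and it can be tested on a toy example. In the sketch below everything specific is an assumption made for illustration: the function $\phi(q, \omega) = -\cos q + 0.3\,\omega\sin q$, the scalar parameter $\omega$, and the helper names are hypothetical and are not taken from the paper. The critical point is located by Newton's method and the formula is compared with a central finite difference.

```python
import math

def phi_q(q, w):
    """d(phi)/dq for the toy phi(q, w) = -cos(q) + 0.3 * w * sin(q)."""
    return math.sin(q) + 0.3 * w * math.cos(q)

def phi_qq(q, w):
    """Second derivative of the toy phi in q."""
    return math.cos(q) - 0.3 * w * math.sin(q)

def phi_qw(q, w):
    """Mixed derivative of the toy phi in q and w."""
    return 0.3 * math.cos(q)

def critical_point(w, q0=0.0, steps=50):
    """Newton iteration for phi_q(q, w) = 0, started near the minimum at q = 0."""
    q = q0
    for _ in range(steps):
        q -= phi_q(q, w) / phi_qq(q, w)
    return q

def dq_star_formula(w):
    """Implicit-function formula: dq*/dw = -phi_qw / phi_qq at the critical point."""
    q = critical_point(w)
    return -phi_qw(q, w) / phi_qq(q, w)

def dq_star_finite_difference(w, h=1e-6):
    """Central finite difference of the critical point with respect to w."""
    return (critical_point(w + h) - critical_point(w - h)) / (2.0 * h)

print(dq_star_formula(1.0), dq_star_finite_difference(1.0))
```

For this toy function the critical point is $q^* = -\arctan(0.3\,\omega)$, so the derivative is $-0.3/(1 + 0.09\,\omega^2)$, which gives an independent check. The division by $\phi''$ is also the source of the negative power of $s_+$ in (4.8), since $|\phi''|$ is only bounded below by a power of $s_+$.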

Consequently, from (3.3) and Lemma 4 we have the bound for $W(\cdot, \cdot, \omega)$ of the same form as (4.7) with $c_{12}$ replaced by a bigger constant which, for simplicity, is still denoted by $c_{12}$. Clearly, in the intermediate case between stronger and weaker persistency we also have an upper bound for $W(\cdot, \cdot, \omega)$ like (4.7). To get the upper bound for $\|P_+\|_{l,1}$ ($l \le 2n+5$) we first notice that $W$ is involved in the form $W(p_+, q, \omega)$. Since the map $(p, q) \leftrightarrow (p_+, q_+)$ is determined by (2.1) through $W$ implicitly, which is parameterized by $\omega$, $(p, q)$ is then considered as a function of $(p_+, q_+, \omega)$. From the relation $q = q_+ + W_{p_+}(p_+, q, \omega)$ one readily has

$$ \frac{\partial q}{\partial\omega} = \left[I - \frac{\partial^2 W}{\partial p\,\partial q}\right]^{-1}\frac{\partial^2 W}{\partial p\,\partial\omega}. \tag{4.9} $$
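Formula (4.9) can likewise be checked in the scalar case. The sketch below is a minimal illustration under assumptions that are not in the paper: the toy generating function $W(p, q, \omega) = \varepsilon\, p\,\omega\sin q$ with $\varepsilon = 0.1$, a scalar $\omega$, and hypothetical helper names. The implicit relation $q = q_+ + W_p(p_+, q, \omega)$ is solved by fixed-point iteration, which converges here because $|W_{pq}| \le \varepsilon\omega < 1$.

```python
import math

EPS = 0.1  # hypothetical coupling strength of the toy generating function

def w_p(q, w):
    """W_p for the toy W(p, q, w) = EPS * p * w * sin(q); it does not depend on p."""
    return EPS * w * math.sin(q)

def w_pq(q, w):
    """Mixed derivative W_{pq} of the toy generating function."""
    return EPS * w * math.cos(q)

def w_pw(q, w):
    """Mixed derivative W_{p w} of the toy generating function."""
    return EPS * math.sin(q)

def solve_q(q_plus, w, steps=80):
    """Fixed-point iteration for q = q_plus + W_p(q, w)."""
    q = q_plus
    for _ in range(steps):
        q = q_plus + w_p(q, w)
    return q

def dq_dw_formula(q_plus, w):
    """Scalar version of (4.9): dq/dw = (1 - W_pq)^(-1) * W_pw."""
    q = solve_q(q_plus, w)
    return w_pw(q, w) / (1.0 - w_pq(q, w))

def dq_dw_finite_difference(q_plus, w, h=1e-6):
    """Central finite difference of the solved q with respect to w."""
    return (solve_q(q_plus, w + h) - solve_q(q_plus, w - h)) / (2.0 * h)

print(dq_dw_formula(0.7, 1.0), dq_dw_finite_difference(0.7, 1.0))
```

The smallness of $\partial^2 W/\partial p\,\partial q$ plays the same role in the text: it keeps $I - \partial^2 W/\partial p\,\partial q$ invertible with a bounded inverse.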

By inductive inference |l|−1 X ∂ 2 W i−1 ∂ |l|+1 W ∂ |j |+1 W h ∂ |l| q = Gj + I − , ∂ωl ∂p∂ωj ∂p∂q ∂p∂ωl l≤l

where Gj is the sum of N(|j |) monomials, N(|j |) is integer only depending on j , each |ξ |+|i|+1 of which is the product of ∂∂p∂q ξ ∂ωWi (|i + j | ≤ l, i < l). The sum of |ξ | over all these


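factors equals exactly the number of these factors; their coefficients are the elements of $\left[I - \frac{\partial^2 W}{\partial p\,\partial q}\right]^{-1}$. So if $q$ is restricted to the domain $|\mathrm{Im}\,q| \le \frac12 (t + t_+)$, and

$$ \|W(\cdot, \cdot, \omega)\|_{n,1} \le \frac12 (t - t_+) \le \frac12, \qquad \left|\frac{\partial^2 W}{\partial p\,\partial q}\right| \le \frac{1}{2n}, \tag{4.9} $$

then

$$ \|q(\cdot, \omega)\|_{l,1} \le \frac{c_{14}\,\|W(\cdot, \cdot, \omega)\|_{l,1}}{s^4 - 3s_+^4}; \tag{4.10} $$

together with (3.3), we have $\|q(\cdot, \omega)\|_{l,1} \le s^4$. Therefore

$$ \|W(\cdot, q(\cdot, \omega), \omega)\|_{l,1} \le \frac{c_{15}\delta}{D^{l'+2}(t - t_+)^{\mu_l}\, s^{4(l+1)}}, \qquad l \le 2n+5. \tag{4.11} $$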

With the preliminaries as above we can figure out an upper bound for kP (·, ·, ω)kl,1 for l ≤ 2n+5. Note both (4.6) and (4.11) provide estimates for the derivatives in ω for P and W in the domain A2 ×6s . Thus the standard Cauchy technique can be used to obtain the estimates for mixed derivatives in (p, q, ω) by shrinking the domain to a smaller one. In the case of weaker persistency, from (2.6) we obtain by inductive inference ∂ |l| φ ∂ |l| φ ∂φ ∂ |l+1| W ∂ |l| M11 = (qn ) − (qn+ ) + l l l ∂ω ∂ω ∂ω ∂qn ∂pn ∂ωl X

+

cl1 l2

2≤|l1 +l2 |≤|l| |l2 |<|l|

∂ N˜ ∂ |l+1| W ∂ |l| M12 = + l ∂ω ∂p ∂q∂ωl +

X

cl1 l2

2≤|l1 +l2 |≤|l| |l2 |<|l|

|l1 | Y

∂ |l1 +l2 | φ

∂qnl1 ∂ωl2 6|ki |=|l−l2 | X

cl1 l2 l3

2≤|l1 +l2 +l3 |≤|l| |l1 |>1

∂ |l1 +l2 | N˜ ∂qnl1 ∂ωl2

(p, qn ) −

∂ |ki +1| W , ∂pn ∂ωki |l1Y +l2 |

∂ |l1 +l2 +l3 | N˜ ∂p l1 ∂qnl2 ∂ωl3

∂ |l1 +l2 | N˜ ∂qnl1 ∂ωl2

6|ki |=|l−l3 | |ξi +ηi |=1

∂ |ki +ξi +ηi | W ξ

∂pni ∂q ηi ∂ωki

|l1 | Y

(p+ , qn )

6|ki |=|l−l2 |

∂ |ki +1| W , ∂pn ∂ωki

|l+1| W ∂ |l| N˜ ∂ |l| M13 ˜ qn (p+ , ·)) ∂ = (N (ρ(q ), ·)p − N + (·, qn ) p q n n+ n n ∂ωl ∂pn ∂ωl ∂ωl

−

+

∂ |l| N˜ (·, q+n ) + ∂ωl X |l1 +l2 |=|l| |l2 |<|l|

X 2≤|l1 +l2 |≤|l| |l2 |<|l|

cl1 l2

∂ |l1 +l2 | N˜

|l1 | Y

∂qnl1 ∂ωl2

6|ki |=|l−l2 |

∂ |ki +1| W , ∂pn ∂ωki

∂ |l1 |+2 N ∂ |l2 | W l (ρ(q , ω), q , ω)p , n n n+ l1 ∂pn ∂qn ∂ωl1 ∂pn ∂ωl2

where cl1 l2 and cl1 l2 l3 are constants. By virtue of (3.7), (4.7) and the property that Np (ρ(qn ), qn ) = Nq (ρ(qn ), qn ) = 0, in A1 × 61 we have


|l| ∂ M1 s4 δ c16 δ ∂ωl ≤ D l 0 +2 (t − t )µl s 4l s 3 − 3s 3 + D l 0 +2 (t − t )µl . + + + By simple calculation it is derived from (2.6) that |l| ∂ M2 c16 δ ∂ωl ≤ D l 0 +2 (t − t )µl (s 4 − 3s 4 )s 4l+4 , + + |l| δ s 8n+22 ∂ M3 δs 2 + + l 0 +2 , ∂ωl ≤ c16 s 4l s D (t − t+ )µl s 4l |l| ∂ M4 c16 δ 2 ≤ ∂ωl D l 0 +2 (t − t )µl (s 4 − 3s 4 )s 4l+4 . + + In conclusion we have the following by putting these estimates together |l| 4 s 8n+22 o ∂ P+ c17 δ n s+ + , ∂ωl ≤ s 4l D l 0 +2 (t − t )µl 0 (s − s )3 + s + + +

(4.12)

if (3.3), (3.4) and (3.5) are taken into account. Shrinking A1 × 61 to At+ × 60 , one can obtain an upper bound for mixed derivatives of P+ in p, q and ω. When the critical points have stronger persistency, in view of (2.11) and by inductive inference, we have that ∂ |l| φ ∂ |l| φ ∂ |l|+1 φ ∂W ∂ |l| M11 = (qn ) − (qn+ ) + 2 l (qn∗ )(qn − qn∗ ) l l l ∂ω ∂ω ∂ω ∂qn ∂ω ∂pn +

∂ |l|+2 φ ∂qn2 ∂ω

(q ∗ )(qn − qn∗ ) − l n

X

+

cl1 l2

2≤|l1 |+l2 ≤l

∂ |l1 |+l2 φ ∂qnl2 ∂ωl1

X

+

cl1 ,··· ,l5

|l1 +l4 +(1−l3 )l5 |+l2 ≤l 0≤l3 ≤1

×

∂ |l5 |+1 W ∂pn ∂ωl5

1−l3

∂ |l|+1 φ ∂ |l|+1 W ∂qn ∂ωl ∂pn ∂ωl l2 Y

(qn )

6|ki |=|l−l1 |

∂ |l1 |+l2 +2 φ ∂qnl2 +2 ∂ωl1

∂ |ki |+1 W ∂pn ∂ωki

(qn∗ )(qn − qn∗ )l3

l2 +1−l Y3 6|ki |=|l−l1 −l4 −(1−l3 )l5 |

∂ |ki | qn∗ , ∂ωki

∂ |l4 |+1 W ∂pn ∂ωl4


∂ N˜ ∂ |l+1| W ∂ 2 N˜ ∂ |l+1| W ∂ N˜ ∂ N˜ ∂ |l| M12 (p, q (p, q (p+ , qn ) = ) (ρ, q ) + ) − n n n ∂ωl ∂p ∂q∂ωl ∂pn2 ∂qn ∂ωl ∂ω ∂ω +

X

cl1 l2 l3

2≤|l1 +l2 +l3 |≤|l| |l1 |>1

+

X

cl1 l2

2≤|l1 +l2 |≤|l| |l2 |<|l|

×

|l1 | Y 6|ki |=|l−l2 |

∂ |l| M13 ∂ωl

|l1Y +l2 |

∂ |l1 +l2 +l3 | N˜ ∂pl1 ∂qnl2 ∂ωl3

∂ |l1 +l2 | N˜ ∂qnl1 ∂ωl2

∂ |ki +ξi +ηi | W

6|ki |=|l−l3 | |ξi +ηi |=1

(p, qn ) −

ξ

∂pni ∂q ηi ∂ωki

∂ |l1 +l2 | N˜ ∂qnl1 ∂ωl2

(p+ , qn )

∂ |ki +1| W , ∂pn ∂ωki

∂ |l+1| W ∂ |l| N˜ = (Npn qn (ρ(qn ), ·)pn+ − N˜ qn (p+ , ·)) + (·, qn ) ∂pn ∂ωl ∂ωl −

+

∂ |l| N˜ (·, q+n ) + ∂ωl X |l1 +l2 |=|l| |l2 |<|l|

X 2≤|l1 +l2 |≤|l| |l2 |<|l|

cl1 l2

∂ |l1 +l2 | N˜

|l1 | Y

∂qnl1 ∂ωl2

6|ki |=|l−l2 |

∂ |ki +1| W , ∂pn ∂ωki

∂ |l2 | W ∂ |l1 |+2 N l (ρ(q , ω), q , ω)p , n n n+ l1 ∂pn ∂qn ∂ωl1 ∂pn ∂ωl2

∂ |l|+2 N ∂ |l|+2 N ∗ ∗ ∂W ∂ |l| M14 = (ρ(q ), q ) − (p , qn ) pn+ n n l 2 l 2 l ∂ω ∂pn ∂ω ∂pn ∂ω ∂qn ∂ 2N ∂ 2N ∗ ∗ ∂ |l|+1 W + (ρ(qn ), qn ) − (p , qn ) pn+ ∂pn2 ∂pn2 ∂qn ∂W −

X |l2 |+l1 ≤|l| l1 ≥1

cl1 l2

d l1 ∂ N ∂ |l2 |+1 W (ρ(qn∗ ), qn∗ )pn+ ∗ 2 dqn ∂pn ∂qn ∂ωl2

l1 Y σ |ki |=|l−l2 |

∂ ki qn∗ , ∂ωki

∂ |l|+2 N ∂W ∂ |l| M15 ∂ |l|+2 N ∗ ∗ = (ρ(q ), q ) − (p , q ) pn+ n n n ∂ωl ∂pn ∂qn ∂ωl ∂pn ∂qn ∂ωl ∂pn ∂ 2N ∂ 2N ∂ |l|+1 W + (ρ(qn ), qn ) − (p∗ , qn∗ ) pn+ ∂pn ∂qn ∂pn ∂qn ∂pn ∂W −

X

d l1 ∂ N ∂ |l2 |+1 W cl1 l2 ∗ (ρ(qn∗ ), qn∗ )pn+ dqn ∂pn ∂qn ∂pn ∂ωl2 ≤|l|

|l2 |+l1 l1 ≥1

l1 Y σ |ki |=|l−l2 |

∂ ki qn∗ . ∂ωki

Thanks to (4.7) and (4.8), the following holds in A1 × 61 by virtue of (3.3), (3.4) and (3.5): |l| ∂ M1 δ c18 δ s4 ≤ 0 ∂ωl D l +2 (t − t )µl s 4l s 3 − 3s 3 + D l 0 +2 (t − t )µl . + + +


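Also by straightforward calculation we have

$$
\left|\frac{\partial^{|l|} M_2}{\partial\omega^l}\right| \le \frac{c_{18}\delta}{s^{4l}}\Big(\frac{s_+}{s}\Big)^{8n+22}, \qquad
\max\left\{\left|\frac{\partial^{|l|} M_3}{\partial\omega^l}\right|, \left|\frac{\partial^{|l|} M_4}{\partial\omega^l}\right|\right\} \le \frac{c_{18}\delta^2}{D^{l'+2}(t - t_+)^{\mu_l}(s^4 - 3s_+^4)\, s^{4l+4}}.
$$

Thus

$$ \left|\frac{\partial^{|l|} P_+}{\partial\omega^l}\right| \le \frac{c_{19}\delta}{s^{4l}}\left\{\frac{s_+^4}{D^{l'+2}(t - t_+)^{\mu_{l'}}(s^3 - 3s_+^3)} + \Big(\frac{s_+}{s}\Big)^{8n+22}\right\}, \qquad l \le 2n+5. \tag{4.13} $$

In the intermediate case we note that the generating function is chosen through (2.12). Since $\beta(\omega) = \phi''(q_n^*, \omega)s_+^{-2}$, by the chain rule it is easy to show that (4.11) still holds for the generating function with some constant possibly bigger than $c_{15}$, but only depending on $n$, so we still write it as $c_{15}$. Thus, combining (4.13) with (4.12), we can also obtain an upper bound. So, if we let

$$ \delta_+ = \max\{c_2, c_5, c_{17}, c_{19}\}\,\delta\left\{\frac{s_+^4}{D^{\kappa+2}(t - t_+)^{\mu_{2n+5}}(s^3 - 3s_+^3)} + \Big(\frac{s_+}{s}\Big)^{8n+22}\right\}, \tag{4.14} $$

we then have in $A_1 \times \Sigma_1$ ($A_1 \times D_1$),

$$ \left|\frac{\partial^{|l|} P_+}{\partial\omega^l}\right| \le \frac{\delta_+}{s_+^{4l}}. \tag{4.15} $$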

Shrinking to At+ × 6s+ (At+ × Ds+ ), the estimate on mixed derivatives of P+ in p, q and ω are obtained by using the Cauchy technique. ˆ n )| ≤ s 4 , |pn | ≤ Finally, we extend (N + N+ )(p, qn , ω) to the domain {|pˆ − ρ(q 3 s , qn ∈ T, ω ∈ } inductively for the estimation of measure. For one iteration step it is assumed that N(p, qn , ω) has been extended. Thus we only need to extend N+ (p, qn , ω); it is no doubt differentiable. For certain fixed ω, N+ is defined on some Lj and Ij as a real analytical function in qn . As it is shown in the end of the last section, when the distance between two such intervals is shorter than s 2 , N+ has a natural extension in a larger interval containing them. If the distance is longer than s 2 , one readily gets (j, 1)-Whitney’s norm in qn ,

|i+k|

∂ i!(j + 1)!k! max |N+ | N+

,

∂pi ∂ωk ≤ 2 s 4|i+k|+3(j +1) j,1 which verifies (3.18) and on the domain {|pˆ − ρ(q ˆ n )| ≤ s 4 , |pn | ≤ s 3 , qn ∈ T, ω ∈ }, N+ is bounded by |i+j +k| ∂ N+ i!j !k! max |N+ | ∂pi ∂q ∂ωk ≤ 2c6 s 4|i+k|+3j , k ≤ 2n + 5, n by Lemma 3.

(4.16)


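5. Convergence

In this section we complete the convergence proof of a sequence of symplectic maps for certain $\omega$ under the assumption that this $\omega$ always avoids resonance with the normal frequency at every iteration step. As assumed above, the Hamiltonian $H_o(p, q, \omega) = H_o(p + g^{-1}(\omega)) + P_o(p + g^{-1}(\omega), q)$ is real analytical in the domain $\Sigma_o = \{|\mathrm{Im}\,q| \le \tau_*\} \times \{|p| \le s_*\}$. $N_p(g^{-1}(\omega)) = (\omega, 0)$ satisfies the relative Diophantine condition

$$ |\langle k, \omega\rangle| \ge D|k|^{-\mu}, \qquad \forall k \in \mathbb{Z}^{n-1}\setminus\{0\}. $$

The Hessian of $N_o$ is far from singular in $\mathrm{Re}\,\Sigma_o$:

$$ \min_{\Sigma_o}\left\|\frac{\partial^2 N_o}{\partial p^2}\,\xi\right\| \ge \lambda_o\|\xi\|, \quad \forall\xi \in \mathbb{C}^n, \qquad \sup_{|l| \le n+3}\left|\frac{\partial^{|l|} N}{\partial p^l}\right| \le \eta_o. $$

Let $\tau_o = \tau$, $s_o^3 = \sigma$ and assume

$$ \max_{\Sigma_o}|P_o(p_o, q_o)| \le \delta_o. $$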

As usual, we construct a sequence of symplectic transformations of coordinates, $M_m$ on the domain $\Sigma_m$, whose range is contained in $\Sigma_{m-1}$. In $\Sigma_m$ the Hamiltonian $H_o$ is transformed into

$$ H_m(p_m, q_m) = N_m(p_m, q_{nm}) + P_m(p_m, q_m), \tag{5.1} $$

where the Hessian of $N_m$ in $p_m$ is still far from singular for any $q_{nm}$. The critical point line $\rho_m(q_{nm})$ is defined by the equation $N_{mp}(p_m, q_{nm}) = \omega$. The function $\phi_m(q_{nm}) = N_m(\rho_m(q_{nm}), q_{nm}) - \langle\omega, \hat\rho_m(q_{nm})\rangle$ attains its maximum (minimum) at the point $q_{nm}^*$. Let $p_m^* = \rho_m(q_{nm}^*)$. The domain $\Sigma_m$ is prescribed in two ways. In the case of weaker persistency,

$$ \Sigma_m = \{|\mathrm{Im}\,\hat q_m| \le \tau_m,\ |\hat p_m - \hat\rho(q_{nm})| \le s_m^4,\ |p_{nm}| \le s_m^3,\ |\mathrm{Im}\,q_{nm}| \le s_m^2,\ q_{nm} \in L_m\}; $$

in the intermediate stage as well as the stronger persistency case,

$$ \Sigma_m = \{|\mathrm{Im}\,\hat q_m| \le \tau_m,\ |\hat p_m - \hat\rho(q_{nm})| \le s_m^4,\ |p_{nm}| \le s_m^3,\ |q_{nm} - q_{nm}^*| \le s_m^3\}, $$

where $L_m = [l_m, r_m] \supset [q_{nm}^* - s_m^2,\ q_{nm}^* + s_m^2]$.

(a) $t_m = \frac{\tau_o}{2}(1 + 2^{-m})$;

(b) $s_m = s_{m-1}^{\frac{8n+22}{8n+21}}$, $s_o \le \tau_o$;

(c) $\delta_m = s_m^{8n+21}$;

(d) The modification of the main part of the Hamiltonian after each iteration step is small: $|N_m - N_{m+1}| \le 2\delta_m$ (in $\Sigma_{m+1}$);

(e) $\displaystyle \max_{|l_1|+|l_2|=|l|}\left|\frac{\partial^{|l|} P}{\partial p_m^{l_1}\,\partial q_m^{l_2}}\right| \le l!\,\delta_m^{1 - \frac{4|l|}{8n+21}}$ (in $\Sigma_m$);

(f) $|M_m - \mathrm{id}| \le \delta_m^{\frac{8n+17}{8n+21}}$, $|M_m' - \mathrm{Id}| \le \delta_m^{\frac{8n+13}{8n+21}}$ (in $\Sigma_m$), where $M_m'$ is the Jacobian of $M_m$.

At the beginning of the iteration we have to deal with the problem in the way of weaker persistency because $N_o$ is independent of $q_n$. Such a process may be repeated finitely or infinitely many times, depending on whether the critical point at each step has weaker persistency or not. Once it gets stronger persistency at some iteration step, this remains the case in all following steps. Such a construction is possible if all conditions introduced in the sections above are satisfied for each $m$, which needs to be verified. Notice that

$$ N_m(p_n, q_{nm}) = \sum_{\nu=0}^{m-1}\bar P_\nu(p_n, q_{nm}) + N_o(p_m), $$

where $\bar P_m$ is the average part of $P_m$ with respect to $\hat q_m$. It follows from (e) that in $\Sigma_m$,

$$ \min_{\Sigma_m}\left\|\frac{\partial^2 N_m}{\partial p^2}\,\xi\right\| \ge \Big(\lambda_o - 2\sum_{\nu=1}^{m-1}\delta_\nu^{\frac{8n+13}{8n+21}}\Big)\|\xi\|, \qquad \forall\xi \in \mathbb{C}^n. $$

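Since $N_o$ is independent of $q_{n0}$, $N_m$ ($m > 0$) becomes dependent on $q_{nm}$ due to the contribution from $\bar P_i$ ($i < m$) and the change of coordinates $p_{nm} \to p_{nm} + \rho_{nm}(q_{nm})$. So it is easy to show that

$$ \sup_{(l_1, l_2) = l,\ l_2 \ge 1}\left|\frac{\partial^{|l_1|+l_2} N_m}{\partial p_m^{l_1}\,\partial q_{nm}^{l_2}}\right| \le l!\sum_{\nu=1}^{\infty}\delta_\nu^{1 - \frac{4|l|}{8n+21}}, \qquad |l| \le 2n+5. \tag{5.2} $$

Let $\lambda = \lambda_o - 2\sum_{\nu=1}^{\infty}\delta_\nu^{\frac{8n+13}{8n+21}}$ and $\eta = \eta_o + 6\sum_{\nu=1}^{\infty}\delta_\nu^{\frac{8n+6}{8n+21}}$; then there is a $d_1(N_o, n) > 0$ such that if $\delta_o \le d_1$ then $\lambda \ge \frac{\lambda_o}{2}$, $\eta \le 2\eta_o$ and $6\sum_{\nu=1}^{\infty}\delta_\nu^{\frac{8n+6}{8n+21}} \le (7n^2 2^{\mu})^{-1}$. Note that $|\bar P_m| \le 2\delta_m$. We have

$$
|\rho_{m+1}^{(l)}(q_n) - \rho_m^{(l)}(q_n)| \le \frac{c_{20}\delta_m}{s_m^{4l+4}}, \qquad
|\phi_{m+1}^{(l)}(q_n) - \phi_m^{(l)}(q_n)| \le \frac{c_{20}\delta_m}{s_m^{4l+4}},
\tag{5.3}
$$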

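where $c_{20}$ depends on $\lambda$ and $n$. Since $\rho_o^{(j)} = \phi_o^{(j)} = 0$ and

$$ |\rho_m^{(j)}| \le \sum_{\nu=1}^{m}|\rho_\nu^{(j)} - \rho_{\nu-1}^{(j)}|, \qquad |\phi_m^{(j)}| \le \sum_{\nu=1}^{m}|\phi_\nu^{(j)} - \phi_{\nu-1}^{(j)}|, $$

there exists $d_2(N_o, n) > 0$ such that

$$ \max_{\Sigma_m,\ j \le 3}|\phi_m^{(j)}(q_n)| \le \frac{1}{64}, \qquad \max_{\Sigma_m,\ j = 1, 2}|\rho_m^{(j)}(q_n)| \le \frac12, $$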

if δo ≤ d2 further. Up to now (3.1.2) and (3.1.4) have been verified. Let

d3 = min

8n+21 n 65D 6 τ 6µ 8n+14

212µ+7 c3

m 1+ 8n+21

Since δm ≤ δo

,2

2

− 6µ(8n+21) 8n+14

λ , 12n

8n+21 8n+18

o 8n+21 , (4c7 )− 8n+11 .

1

, we see that if δ ≤ d3 , so = δo8n+21 then 2sm+1 ≤ sm , 4 sm+1

sm ≤ tm ≤ 1, 1 5 ≥ sm , sm+1 ≤ (tm − tm+1 ) 4

and δm ≤ min

n 65D 6 τ 6µ s 7 λ o 1 o m 10 2 s , s , , m m+1 9 26µ(m+2)+7 c3 12 4c7 sm+1

which verifies (3.1) and (3.3). Clearly, ∃ d4 > 0, if δo ≤ d4 , then so | log so |µ ≤ 1. Let n (8n+21)µ D 3τo , d5 = min d4 , 2 64(n − 1)

o e n−1 τo n 8n+21 2 . 2n − 2 32

Equation (3.2) is also satisfied if δo ≤ d5 . Finally, we let 1 (8n+21)2

d6

= min

n1 2

−1

max{c2 , c5 , c17 , c19 }

D κ+2 τ µ2n+5 , 22µ2n+5 +4

13 o

and set $\delta_o = \min\{d_k;\ 1 \le k \le 6\}$. If $|P_m| \le \delta$ in $\Sigma_m$, from (4.14) we have $|P_{m+1}| \le \delta_{m+1}$ in $\Sigma_{m+1}$, which verifies (e). Clearly (3.8) and (3.15) result in (f) by the choice of $\delta_o$. What is left in this section is just the standard argument showing that the sequence of coordinate transformations constructed above converges on an $(n-1)$-dimensional torus; it is the same as, for instance, in [Cg], and we omit it here. □
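The behaviour of the iteration parameters in (a), (b) and (c) can be tabulated numerically. The sketch below assumes the reading $t_m = \frac{\tau_o}{2}(1 + 2^{-m})$, $s_m = s_{m-1}^{(8n+22)/(8n+21)}$, $\delta_m = s_m^{8n+21}$ with $s_o = \delta_o^{1/(8n+21)}$; the function name `iteration_parameters` and the sample values of $n$, $\delta_o$, $\tau_o$ are hypothetical choices for illustration only.

```python
from fractions import Fraction
import math

def iteration_parameters(n, delta0, tau0, steps):
    """Return a list of (t_m, s_m, delta_m) for m = 0..steps, following (a)-(c)."""
    ratio = Fraction(8 * n + 22, 8 * n + 21)
    log_s0 = math.log(delta0) / (8 * n + 21)   # s_0 = delta_0 ** (1 / (8n + 21))
    rows = []
    for m in range(steps + 1):
        t_m = 0.5 * tau0 * (1.0 + 2.0 ** (-m))
        log_s = log_s0 * float(ratio ** m)     # log s_m = ratio**m * log s_0
        s_m = math.exp(log_s)
        delta_m = math.exp((8 * n + 21) * log_s)
        rows.append((t_m, s_m, delta_m))
    return rows

# Hypothetical sample values: n = 2, delta_0 = 1e-3, tau_0 = 1.
rows = iteration_parameters(2, 1e-3, 1.0, 5)
for m, (t_m, s_m, delta_m) in enumerate(rows):
    print(m, t_m, s_m, delta_m)
```

Under this reading $\delta_m = \delta_o^{(1 + \frac{1}{8n+21})^m}$, so $\log\delta_m$ grows geometrically in absolute value, while the analyticity widths lose only $t_m - t_{m+1} = \tau_o 2^{-(m+2)}$ at each step and stay above $\tau_o/2$.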


6. Measure Estimation

The convergence proved in the last section works only under the assumption that the coupling of torus and normal frequencies does not break up the small divisor condition:

$$ |i\langle k, \omega\rangle + \lambda_l| \ge D|k|^{-\mu}, \qquad \forall k \in \mathbb{Z}^{n-1}\setminus\{0\},\ |l| \le \kappa. \tag{2.11} $$
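The reason a condition of this type removes only a small set of frequencies can be seen in a one-dimensional toy model. The sketch below makes strong simplifying assumptions that are not in the paper: $\omega$ is a scalar in $[0, 1]$, the normal frequency is replaced by a constant `lam` (in the text it depends on $\omega$), and the names `bad_intervals` and `union_length` are hypothetical. For each integer $l \ge 1$ the resonant zone $\{|l\omega - \mathrm{lam}| \le D l^{-\mu}\}$ is an interval of length at most $2D l^{-\mu-1}$, so the total excluded measure is of order $D$.

```python
def bad_intervals(lam, D, mu, max_l):
    """Intervals of omega in [0, 1] on which |l * omega - lam| <= D * l**(-mu)."""
    out = []
    for l in range(1, max_l + 1):
        half = D * l ** (-mu) / l          # half-width of the resonant zone
        centre = lam / l
        lo, hi = max(0.0, centre - half), min(1.0, centre + half)
        if lo < hi:
            out.append((lo, hi))
    return out

def union_length(intervals):
    """Total length of a union of intervals, merging overlaps."""
    total, current_lo, current_hi = 0.0, None, None
    for lo, hi in sorted(intervals):
        if current_hi is None or lo > current_hi:
            if current_hi is not None:
                total += current_hi - current_lo
            current_lo, current_hi = lo, hi
        else:
            current_hi = max(current_hi, hi)
    if current_hi is not None:
        total += current_hi - current_lo
    return total

# Hypothetical sample values for the toy model.
D, mu, lam = 0.01, 3, 0.37
zones = bad_intervals(lam, D, mu, max_l=200)
excluded = union_length(zones)
bound = 2.0 * D * sum(l ** (-(mu + 1)) for l in range(1, 201))
print(excluded, bound)
```

The difficulty addressed in this section is precisely that $\lambda_l$ is not constant: when its derivative in $\omega$ nearly cancels the term coming from $\langle k, \omega\rangle$, the resonant zone is no longer of length proportional to $D|k|^{-\mu}$, and second-order information is needed.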

A necessary condition for (2.11) to fail is that the imaginary part of $\lambda_l$ is bigger than its real part; for instance, we require that $|\mathrm{Re}\,\lambda_l| \le |\mathrm{Im}\,\lambda_l|$. Otherwise (2.11) holds automatically for those $\omega$ satisfying (1.6). To see this, suppose that $N_{o p_n^2} > 0$. We have $\phi''(q_n^*) > 0$, i.e. $\phi(q_n)$ reaches its minimum, which corresponds to an elliptic $(n-1)$-dimensional torus. To find those $\omega$ satisfying condition (2.11) we consider $\lambda_l$ as a function of $\omega$ and skip those points in $\mathbb{R}^{n-1}$ which lie between the two hypersurfaces $S_k^- = \{\omega \in \mathbb{R}^{n-1};\ \langle k, \omega\rangle + \mathrm{Im}\,\lambda_l = -D|k|^{-\mu}\}$ and $S_k^+ = \{\omega \in \mathbb{R}^{n-1};\ \langle k, \omega\rangle + \mathrm{Im}\,\lambda_l = D|k|^{-\mu}\}$ for all $k \ne 0$ when $|\mathrm{Re}\,\lambda_l| \le |\mathrm{Im}\,\lambda_l|$. However, such surfaces are only piecewise smooth where $\phi''$ takes non-zero value. Nevertheless, this causes no trouble, since there is no coupling problem between torus and normal frequencies when $\phi''$ is very small, which is true for the case of weaker persistency. Also, $\lambda_l(\omega)$ may be multi-valued because there might be more than one local minimum point of $\phi(q_n, \omega)$ for fixed $\omega$, as each component extends to where $\phi''$ remains positive. For a given $k \in \mathbb{Z}^{n-1}\setminus\{0\}$, choose $k_i \ne 0$ such that $|k_i|$ is the largest one and let $\tilde\omega = (\omega_1, \cdots, \omega_{i-1}, \omega_{i+1}, \cdots, \omega_{n-1})$ be fixed. We see that the small divisor condition (2.11) fails only when $\omega_i$ falls into the interval such that

$$ |k_i\omega_i + \langle\tilde\omega, \tilde k\rangle + \mathrm{Im}\,\lambda_l(\omega_i, \tilde\omega)| \le D|k|^{-\mu}. $$

To measure how many $\omega_i$ do not fall into this set, we make use of the differentiability of $\lambda$ in $\omega$ in each component and investigate whether the first order derivative of $\mathrm{Im}\,\lambda$ is very close to $k_i$. If not, those $\omega_i$ outside an interval with length proportional to $D|k|^{-\mu}$ will be candidates. If yes, the second order derivatives should be examined. If they are not very small, the $\omega_i$ outside an interval with length proportional to $(D|k|^{-\mu})^{\frac12}$ will satisfy the requirement. We claim that for the majority of $\omega$ this is indeed the case.

First we consider $\lambda_l = \lambda = N_{p_n q_n} + i\sqrt{4N_{p_n^2}\phi'' - N_{p_n q_n}^2}$ and choose those $q_n(\omega)$ where $\phi(\omega)$ reaches its global minimum in $q_n \in \mathbb{T}$. By Lemma 5 below we have

(P1): |φ 000 φ 00 − 2 | ≤ c21 δ1 , where δ1 = max{|Npi q j |, |i + j | = 4, j ≥ 1} is contributed k n by perturbation. Lemma 6 below guarantees that 1

1

(P2): |φω0 φ 00 1− m | ≤ 1; 00 2 φ 0 2 φ (P3): φω00 > √ω200 when the right-hand side is not smaller than |φ 00 |− m which holds φ

in a set S with positive Lebesgue measure meas S ∩ {ω˜ = const.} ≥ (1 − 2) meas ∩ {ω˜ = const.} , where → 0 as δo → 0, and m will be specified later.

(6.1)


Let 3 := Imλ =

√ 1, 1 = 4φ 00 Npn2 − Np2n qn , and suppose that d3 10 φω0 1ω = − = k 1. + dω 23φ 00 23

(6.2)

Here we use the symbol ω to denote a scalar variable, q i.e. ω ∈ R. By virtue of the condition |Imλ| ≥ |Reλ| we see that |Npn qn | ≤ 2Npn2 φ 00 , consequently, |10 | ≤ q √ √ 10δ1 max{Npn2 , Npn2 } φ 00 + O(φ 00 ) and 1ω = 4φω00 Npn2 + O( φ 00 ). If 3 does not fit into the condition then the main part of

m

3 ≤ |k|− 2 , d1 dω

is

4Npn2 φω00

(6.3)

and 0 0

1φ 2 d 21 −1 = − 00ω + δ1 φ 00 m kO(1), 2 dω φ therefore

if δ1 ≤

1 2

1 1 d 21 d1 2 d 23 = − + dω2 433 dω 23 dω2 1 −1 =− k 2 + δ1 φ 00 m kO(1) 23

(6.4)

q max{Npn2 , Npn2 } and both (P2) and (P3) are applied.

Consider a divisor for arbitrarily given l 6 = 0. By translation we can suppose it has the form of F (ω) = lω − 3(ω). Without loss of generality, we assume l > 0. Let s n 12m Npn2 o . I3c (l) = ω ∈ R : −Dl −µ < lω < Dl −µ + 2 lm For any ω outside I3c (l), if |F (ω)| ≤ Dl −µ and |Imλ(ω)| ≥ |Reλ(ω)|, then we have 1

φ 00 m ≥ 12l −1 . Note 3(ω) remains positive. Such a case happens only when ω > 0. Suppose for certain ω ∈ S ∩ {ω˜ = const.}\I3c , F (ωo ) ≤ Dl −µ holds at the global minimum point of φ. At this point the conditions (P1–3) guarantee that one of the following holds: 1. dF dω ≥ 1; d2F l2 when dF ≤ 1. 2. 2 ≥ dω

23

dω

We observe √ 00 from the √ 00argument in the proof of Lemma 6 that q(ω) is well defined in φo φ [ωo − 30l , ωo + 30lo ] for some ωo ∈ S ∩ {ω˜ = const.}. In this interval we have d 2F 3(l − 1)2 (l − 1)2 ≤ ≤ 2 33o dω 23o if δ1 ≤ d7 for some d7 > 0. So, in case√2 only when ω ranges over a small interval, µ can |F (ω)| remain less than Dl −µ . centered at ωo , with length shorter than 3 Dl − 2 −1 √ It implies that for such an interval not shorter than

φo00 15l

only in a portion smaller than


√ 1 m 1 4 D12− 2 +1 l − 2 (µ−m) is it possible that |F (ω)| ≤ Dl −µ (since φo00 m ≥ 12l −1 ). As to case 1 it occupies an even smaller portion. Such an argument given here is due to the observation that λ(ω) may be multivalued. This case happens when some qn∗ (ω) starting from a global minimum point of φ becomes local instead of the global minimum point while some other global minimum point qn∗∗ emerges as ω changes. Thus |F (ω)| may take value zero at many points. Nevertheless what we care about is to find a set where |F (ω)| can take values not less than Dl −µ . Denote the set by I c (l) where |F (ω)| may take value less than Dl −µ then its Lebesgue measure √ n o m 1 meas I c (l) ≤ meas I3c (l) + 4 D12− 2 +1 l − 2 (µ−m) + 2 meas ∩ {ω˜ = const.} s 12m Npn2 ≤2Dl −µ−1 + 2 l m+2 √ n o m 1 + 4 D12− 2 +1 l − 2 (µ−m) + 2 meas ∩ {ω˜ = const.} . Back to the original form of the small divisor hk, ωi + λl (ω) (|l| ≤ κ), we denote by c the set c = {ω ∈ go (D) : |hk, ωi + λl (ω)| ≤ D|k|−µ , ∀k ∈ Zn−1 \{0}, |l| ≤ κ} and set m = 2n + 2, µ ≥ 4n + 2. Since |k|∞ ≤ |k| ≤ (n − 1)|k|∞ we have X √ |k|−ν meas c ≤ c22 min{ D, |ko |−1 , } meas() k∈Zn−1 \{0}

√ = c23 min{ D, |ko |−1 , } meas go (D) ,

(6.5)

where ν = min{µ + 1, m2 , 21 (µ + 2 − m)} ≥ n. ko is such an integer valued vector that λl (ω) is involved in the problem of resonance with tangent frequencies ω only when |l| ≥ |ko |. Since c23 is a constant only depending on n, µ, No and κ, D is a Diophantine constant which can be set arbitrarily small. |ko |−1 can also be small enough if the perturbation Po is small enough. Therefore, the right-hand side of (6.5) can be set as small as one wishes. We should notice that the estimate (6.5) is valid only for a single step of symplectic transformation. Some persistency of the set go (D)\c in the sense of measure needs to show when the coordinate transformation is subjected to perturbation. To do this we observe that at any iteration step the scheme for stronger persistency is used at some ω ∈ when |φ 00 |4n+10 ≥ |P |, where φ 00 is valued at the critical point qn∗ (ω). In the region where this relation does not hold, we may get measure estimation by an argument the same as in classical KAM theory. In the region where this relation holds, it is endowed with some nondegeneracy. Denoting by Sjs the set which is in S for the j th iteration step where |φ 00 |4n+10 ≥ |P |, we readily get that Sjs +1 ⊃ Sjs if conditions (P2) and (P3) are φ0 φ 00 2 1 modified as |φω0 φ 00 1− m | ≤ 1 + O(δj ) and φω00 > (1 + O(δj )) √ω200 for (j + 1)th step. φ Such modification does not result in any essential change of the arguments above in this section since δj goes to zero faster than exponentially. Under such consideration one has that ∞ [ cj ≤ 2 meas c , meas j =1


by some standard argument similar in [Ar] and [CS3], where cj denotes the set in which (2.11) does not hold for the j th iteration step. So, it is left to state and to prove the following two lemmas which verify the conditions (P1–P3). Lemma 5. Assume the periodic function y ∈ C 4 (T) satisfying the condition |y (4) | ≤ . At the global minimum and global maximum point of y, if y 00 6 = 0, then y 000 2 00 ≤ . y Proof. Suppose xo is a local minimum point of y, where y 00 ≤ y 000 2 . We expand y in Taylor formula 1 1 1 y − yo = 1x 2 yo00 + yo000 1x + y (4) (xo + λ1x)1x 2 , 2 6 24 y 00

and if we choose 1x = −4sign(y 000 ) y 000o , we find y(xo +1x)−y(xo ) < 0, which implies o t xo is not a global minimum point of y. u 1

Lemma 6. Assume φ(ω, q) ∈ C m+2 (I × T, R), |φωi q j | i ≤ σ 1 for |i + j | ≤ m + 2, j ≤ 4. Let C1 = {(ω, q) ∈ I × T; φ 0 (ω, q) = 0}, C2 = {(ω, q) ∈ C1 ; φ(ω, q) = 1 mins∈T φ(ω, s)}, I1 = {ω ∈ I ; (ω, q) ∈ C1 , |φω0 |1+ m |φ 00 |−1 ≤ 1 when φ 00 6= 0, or φ 00 = φ0

00

φ 0}, I2c = {ω ∈ I1 ; (ω, q) ∈ C2 ; | √ω 00 |2 ≤ | √ω200 | = k 2 , k ≥ 12|φ 00 |− m , }, where the φ

1

φ

prime denotes derivative with respect to q. Then

|I2c | ≤ |I |,

|I1 | ≥ (1 − )|I |,

where $|I_j|$ stands for the Lebesgue measure, and → 0 as $\sigma \to 0$. Proof. Consider a function $g \in C^{m+1}$; let
\[
a_k = \frac{d^k g(0)}{dy^k}, \qquad c = \max_{k \le m+1} \big\{ |g^{(k)}|^{1/k} \big\}.
\]
If $a_o \ne 0$ we claim that in the interval $\Big(-\frac{a_o^{1/m}}{2c}, \frac{a_o^{1/m}}{2c}\Big)$ there are at most $m$ roots of $g$. Indeed, if $g$ is expanded into a Taylor series
\[
g(x) = a_o + a_1 x + \cdots + \frac{a_m}{m!}\, x^m + \frac{g^{(m+1)}(\lambda x)}{(m+1)!}\, x^{m+1},
\]

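by the definition of $c$ we know that there exists $k_o \le m$ such that
\[
|a_{k_o}| \ge c^{k_o} a_o^{1-\frac{k_o}{m}}, \qquad |a_k| < c^{k} a_o^{1-\frac{k}{m}}, \quad \forall\, k_o < k \le m.
\]
When $|x| \le \frac{a_o^{1/m}}{2c}$,
\[
|g^{(k_o)}(x)| = \Big| a_{k_o} + a_{k_o+1} x + \cdots + \frac{a_m}{(m-k_o)!}\, x^{m-k_o} + \frac{g^{(m+1)}(\lambda x)}{(m-k_o+1)!}\, x^{m-k_o+1} \Big|
\ge c^{k_o} a_o^{1-\frac{k_o}{m}} \Big( 1 - \sum_{k=1}^{m-k_o} \frac{1}{2^k k!} - \frac{a_o^{1/m}}{2^{m-k_o+1}(m-k_o+1)!} \Big) > 0.
\]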


Since g (ko ) (x) keeps positive (negative) in this interval, g(x) has at most ko roots. Next, we define a non-negative function for arbitrarily fixed y, h(ω) = min{|I |, sup |ω0 − ω|}, where the supremum is taken over the set where φω (ω, q) has at most m + 1 roots. By applying the result above to the function φω0 in this situation, we find h(ω) ≥

1 0 1 |φ | m . 2σ ω

Clearly, for fixed q there are at most countably many ωi , where φ 0 = 0, φω0 6 = 0. So we are able to number these h(x), and for every q, X hi (ω) ≤ (m + 1)|I |. i

For those (ω, q) ∈ C1 where φ 0 ω 6 = 0, we find a locally well defined function q = q(ω) such that f (ω, q(ω)) ≡ 0 and XZ (6.6) 2(m + 1)π |I | ≥ hi (ω, q)dq, where the integration is over the subset of C1 where φ 0 ω 6 = 0. Thus φ0 XZ hi ω00 dω the right-hand side of (6.6) ≥ φ I \I1 0 1+ m1 XZ φ ω dω ≥ 00 I \I1 σ φ ≥

1 meas(I \I1 ), σ

from which we get the estimate on I1 . To get the second estimate, let us assume that there is a point (ωo , q(ωo )) ∈ C2 , k . First we claim that through this point the function q = q(ω) ωo ∈ I2c , so |βo | ≤ 12 √ 00 φ induced by φ 0 (ω, q(ω)) = 0 is well defined in the interval [ωo , ωo + | 30ko |), denoted by L+ (ωo ). Indeed, since φ takes its global minimum at (ωo , qo ) with respect to q, p φ0 dq | = | φω00 | is not bigger than k6 in the interval |φo000 | ≤ σ φo00 by Lemma 5. If |β| = | dω √ 00 φ [ωo , ωo + λ k o ), (0 < λ) then ∂ 3φ 1 p ∂ 3φ i j (ω, q(ω)) − i j (ωo , q(ωo )) ≤ λσ φo00 ∂q ∂ω ∂q ∂ω 3 for |i + j | = 3, i ≥ 1 and it follows that 00 φ (ω) − φ 00 (ωo ) ≤ sup{|φ 000 β| + |φ 00 |}1ω ω

2 ≤ λσ |φ 00 (ωo )|. 3


Note that β is governed by the equation β˙ = −

φω0 2 φ 00

+2

φω00 φ 000 2 β − β . φ 00 φ 00

By Gronwell’s inequality we have ) Z ω 0 Z ω 000 Z ω 00 ( φω φω2 φ 2 2 00 dω βo + β ≤ exp 00 dω + 00 β dω , φ ωo ωo φ ωo φ 3 ≤ exp5λ βo + λk . 2 1 1 , |β| remains smaller than k6 . In this case |φ 00 (ω) − φo00 | ≤ 45 |φo00 |, If σ ≤ 19 and λ ≤ 30 which implies that the function q = q(ω) with q(ω √ o ) = qo is well defined in L(ωo ). The φ 00

same argument also applies to L( ωo ) = (ωo − | 30ko |, ωo ]. Clearly |L+ | = |L− | := |L|. Now we pick out those ωj ∈ I2c so that L± (ωj ) ∩ L± (ωk ) = ∅ for l 6= k and ∪j L± (ωj ) ⊇ I2c . By the argument as above the function q = q(ω) is well defined in 0 φωj2 1 ˙ ≥ each L± (ωj ) and |β| ˙ takes values 2 φ 00 in this interval. The function q˙ = q(ω) j

ranging over an interval Qj with |Qj | ≥ 8|L(ωj )| either when ω ranges over L+ (ωj ) 0 φ 2 1 qωj . Note |φω0 | < |φω0 2 | and min |φω0 2 | ≥ or when over L− (ωj ), where 8(ωj ) = 60 00 k φj

(1 − σ )|φω0 2 | in each L(ωj ) we find that h(ω) ≥ j

2(m + 1)π |I | ≥ ≥

XZ X j

≥

1 1 0 m 2σ |φω2 | , j

hk (ω, q)dq

1 1 8(ωj )|φω0 2 | m |L(ωj )| 2(1 − σ )σ j

1 meas(I2c ), 120(1 − σ )σ

which completes the proof. □

References

[Ar] Arnold, V.I.: Proof of a theorem of A.N. Kolmogorov on the invariance of quasi-periodic motions under small perturbations of the Hamiltonian. Russ. Math. Surv. 18, 9–36 (1963)
[BK] Bernstein, D., Katok, A.: Birkhoff periodic orbits for small perturbations of completely integrable Hamiltonian systems with convex Hamiltonians. Invent. Math. 88, 224–241 (1987)
[Cg] Cheng, C.-Q.: Birkhoff–Kolmogorov–Arnold–Moser tori in convex Hamiltonian systems. Commun. Math. Phys. 177, 529–559 (1996)
[CS1] Cheng, C.-Q., Sun, Y.-S.: Existence of invariant tori in three dimensional measure-preserving mappings. Celest. Mech. 45, 275–292 (1990)
[CS2] Cheng, C.-Q., Sun, Y.-S.: Existence of periodically invariant tori in three dimensional measure-preserving mappings. Celest. Mech. 45, 293–303 (1990)
[CS3] Cheng, C.-Q., Sun, Y.-S.: Existence of KAM tori in degenerate Hamiltonian systems. J. Diff. Eqns. 114, 288–335 (1994)
[El] Eliasson, L.H.: Perturbations of stable invariant tori for Hamiltonian systems. Ann. Scuola Norm. Sup. Pisa 15, 115–148 (1988)
[Ma] Mather, J.N.: Action minimizing invariant measures for positive definite Lagrangian systems. Math. Z. 207, 169–207 (1991)
[Mo] Moser, J.K.: On invariant curves of area-preserving mappings of an annulus. Nachr. Akad. Wiss. Gött. Math. Phys. Kl., 1–20 (1962)
[MP] Moser, J., Pöschel, J.: An extension of a result by Dinaburg and Sinai on quasi-periodic potentials. Comment. Math. Helv. 59, 39–85 (1984)
[Se] Sevryuk, M.M.: Some problems of the KAM theory: Conditionally periodic motions in typical systems. Russ. Math. Surv. 50, 341–353 (1995)
[SZ] Salamon, D., Zehnder, E.: KAM theory in configuration space. Comment. Math. Helv. 64, 84–132 (1989)
[Tr] Treschev, D.V.: The mechanism of destruction of resonance tori of Hamiltonian systems. Math. USSR Sbornik 68, 181–203 (1991)
[Wh] Whitney, H.: Analytical extension of differentiable functions defined in closed sets. Trans. Am. Math. Soc. 36, 63–89 (1934)
[Xi] Xia, Z.H.: Existence of invariant tori in volume-preserving diffeomorphisms. Ergod. Th. & Dynam. Sys. 12, 621–631 (1992)
[Yo] Yoccoz, J.C.: Travaux de Herman sur les tores invariants. Astérisque 206, 311–344 (1992)

Communicated by A. Jaffe

Commun. Math. Phys. 203, 421 – 444 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Scattering Problem for Local Perturbations of the Free Quantum Gas

Yu. G. Kondratiev1,2, A. Yu. Konstantinov3, M. Röckner4, G. V. Shchepan’uk5

1 Bonn University, D-53013 Bonn, Germany. E-mail: [email protected]; BiBoS Research Center, Bielefeld University, D-33615 Bielefeld, Germany
2 Institute for Mathematics of the Ukrainian National Science Academy, 3 Tereshchenkivs’ka St, MSP, Kyiv-4, 252601, Ukraine
3 Department for Mathematics, Kyiv University, 64 Volodymyrs’ka St, 252033, Kyiv, Ukraine. E-mail: [email protected]
4 Department for Mathematics, Bielefeld University, D-33615 Bielefeld, Germany. E-mail: [email protected]
5 Institute for Mathematics of the Ukrainian National Science Academy, 3 Tereshchenkivs’ka St, MSP, Kyiv-4, 252601, Ukraine. E-mail: [email protected]

Received: 25 June 1998 / Accepted: 3 December 1998

Abstract: Scattering theory for perturbations of the intrinsic Dirichlet (Laplace–Beltrami) operator $H_0 = -\operatorname{div}^\Gamma \nabla^\Gamma$ on $L^2(\Gamma, \pi_z)$, i.e. the space of $\pi_z$-square integrable functions on the configuration space $\Gamma$ over $\mathbb{R}^d$, is studied. Here $\pi_z$ denotes Poisson measure with intensity $z$. We show that for an arbitrary regular non-zero potential $V$ the standard wave operators $W^\pm(H_0, H_0 + V)$ do not exist, and propose to consider Dirichlet operators of perturbed Poisson measures instead of potential perturbations of the Hamiltonian $H_0$. As case studies, cylindric smooth densities and finite volume Gibbs perturbations of the Poisson measure are considered. In these cases the existence of the corresponding wave operators is proved.

Introduction

Finite as well as zero temperature states of free quantum gases are usually described in terms of the Araki–Woods representation (see, e.g., [5]). In particular, the free Bose gas of zero density has a standard representation in Fock space, which leads to the known Schrödinger representation in the functional realization of the Fock space as a space of functions square integrable with respect to the corresponding Gaussian measure, see e.g. [4]. In contrast to this, the free Bose gas of positive density cannot be properly described in the Gaussian space representation, and a rigorous mathematical description of the ground state for this system can only be achieved in terms of Poisson analysis (see [3] as well as the references and historical comments therein).

In this paper we study scattering theory for different types of perturbations of the intrinsic Dirichlet (Laplace–Beltrami) operator $H_0 = -\operatorname{div}^\Gamma \nabla^\Gamma$ on $L^2(\Gamma, \pi_z)$ [1], where $\pi_z$ is the Poisson measure with Lebesgue intensity measure $z\,dx$, $z > 0$, and $\Gamma$ is the corresponding configuration space over $\mathbb{R}^d$. It should be mentioned here that from the physical point of view the operator $H_0$ is the energy operator, or Hamiltonian, of the corresponding system of the free Bose gas [3].
In Sect. 2 we show that for an arbitrary regular non-zero potential perturbation $V$ the standard wave operators $W^\pm(H_0, H_0 + V)$ do not exist. (An analogous result in the Gaussian case is well known, see e.g. [4].) Hence, to study scattering theory for the pair of operators $H_0$ and $H_0 + V$, we need some renormalization of the perturbed Hamiltonian $H_0 + V$. As in the case of Gaussian analysis [4], such a renormalization can be given by the Dirichlet operator of a perturbed Poisson measure (see [2,3]). In Sect. 3 we consider perturbations of $\pi_z$ by cylindric smooth densities and prove existence of the corresponding wave operators. Most important from the mathematical physics point of view is the case of a local perturbation of the Poisson measure by a Gibbs factor, describing a locally perturbed Bose gas in which interactions between particles take place only in a fixed bounded domain of $\mathbb{R}^d$. But first, we need to recall the notions of configuration space, Poisson measure and the corresponding Dirichlet form [1,2] (Sect. 1).

1. Poisson Analysis and Intrinsic Differential Geometry of Poisson Spaces

In this section we provide a brief review of Poisson analysis and the intrinsic differential geometry of the corresponding Poisson spaces needed for this article. For a more detailed exposition of different aspects of Poisson analysis and the intrinsic differential geometry of Poisson spaces as well as their applications to quantum field theory and statistical mechanics see [7,8,1–3,18,20,13,14,12,16] and the references therein.

Let $\mathcal{L}(\Lambda)$, $\Lambda \subseteq \mathbb{R}^d$, denote the system of all bounded Borel subsets in $\Lambda$ and define the configuration space $\Gamma$ over $\mathbb{R}^d$ as the set of all locally finite configurations in $\mathbb{R}^d$, i.e.
\[
\Gamma := \big\{ \gamma \subset \mathbb{R}^d \;\big|\; \#(\gamma \cap \Lambda) < \infty \text{ for all } \Lambda \in \mathcal{L}(\mathbb{R}^d) \big\},
\]
$\#(\gamma \cap \Lambda)$ being the cardinality of the set $\gamma \cap \Lambda$. Let $\mathcal{B}_\Lambda(\Gamma)$, $\Lambda \in \mathcal{B}(\mathbb{R}^d)$, be the $\sigma$-algebra on $\Gamma$ generated by all mappings of the form
\[
\Gamma \ni \gamma \mapsto \#(\gamma \cap \Lambda') \in \mathbb{Z}_+ := \{0, 1, 2, \dots\},
\]
where $\Lambda' \in \mathcal{L}(\Lambda)$. We set $\mathcal{B}(\Gamma) := \mathcal{B}_{\mathbb{R}^d}(\Gamma)$.
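The Poisson measure $\pi_{\tilde z}$ over $\mathcal{B}(\Gamma)$ is defined so that for all $\Lambda \in \mathcal{L}(\mathbb{R}^d)$ and any $\mathcal{B}_\Lambda(\Gamma)$-measurable function $F \ge 0$,
\[
\int_\Gamma F(\gamma)\,\pi_{\tilde z}(d\gamma) = \exp\Big[-\int_\Lambda \tilde z(x)\,dx\Big] \sum_{n=0}^{\infty} \frac{1}{n!} \int_{\Lambda^n} F\big(\{x_i\}_{i=1}^n\big)\,(\tilde z\,dx)_1^n, \tag{1.1}
\]
where $(\tilde z\,dx)_1^n := \tilde z(x_1) \cdots \tilde z(x_n)\,dx_1 \cdots dx_n$ and $\tilde z$ is some non-negative function from $L_1^{\mathrm{loc}}(\mathbb{R}^d)$. The existence of $\pi_{\tilde z}$ is ensured by Kolmogorov’s theorem.

Let us set, for an arbitrary $\mathcal{B}(\mathbb{R}^d)$-measurable function $f$ over $\mathbb{R}^d$ with bounded essential support,
\[
\langle f, \gamma \rangle := \sum_{x \in \gamma} f(x).
\]
This means that we identify a configuration $\gamma \in \Gamma$ with the following point measure over $\mathcal{B}(\mathbb{R}^d)$:
\[
\gamma \equiv \sum_{x \in \gamma} \delta_x,
\]
where $\delta_x$ is the Dirac measure at the point $x$, i.e., for all $A \in \mathcal{B}(\mathbb{R}^d)$,
\[
\delta_x(A) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{otherwise.} \end{cases}
\]

Lemma 1. Let $\xi \in S(\mathbb{R}^d)$, $\Lambda \in \mathcal{L}(\mathbb{R}^d)$, and let $F : \Gamma \to \mathbb{R}$ be a $\mathcal{B}_\Lambda(\Gamma)$-measurable function and $\tilde z \in L_1^{\mathrm{loc}}(\mathbb{R}^d)$. Then
\[
\int_\Gamma e^{\langle \xi, \gamma \rangle} F(\gamma)\,\pi_{\tilde z}(d\gamma) = \exp\Big[\int_{\mathbb{R}^d} \big(e^{\xi(x)} - 1\big)\tilde z(x)\,dx\Big] \int_\Gamma F(\gamma)\,\pi_{\tilde z e^{\xi}}(d\gamma), \tag{1.2}
\]
provided the integrals with respect to $\tilde z\,dx$ and the Poisson measure $\pi_{\tilde z e^{\xi}}$ on the right-hand side of (1.2) exist.

Proof. Since $D(\mathbb{R}^d) := C_0^\infty(\mathbb{R}^d)$ is dense in $S(\mathbb{R}^d)$, it is enough to prove (1.2) for $\xi \in D(\mathbb{R}^d)$. We may assume that $\Lambda \supset \operatorname{supp} \xi$. Then, using formula (1.1), we obtain:
\[
\begin{aligned}
\int_\Gamma e^{\langle \xi, \gamma \rangle} F(\gamma)\,\pi_{\tilde z}(d\gamma)
&= \exp\Big[-\int_\Lambda \tilde z(x)\,dx\Big] \sum_{n=0}^{\infty} \frac{1}{n!} \int_{\Lambda^n} e^{\xi(x_1) + \cdots + \xi(x_n)} F\big(\{x_i\}_{i=1}^n\big)\,(\tilde z\,dx)_1^n \\
&= \exp\Big[\int_\Lambda \big(e^{\xi(x)} - 1\big)\tilde z(x)\,dx\Big] \exp\Big[-\int_\Lambda \tilde z(x) e^{\xi(x)}\,dx\Big] \sum_{n=0}^{\infty} \frac{1}{n!} \int_{\Lambda^n} F\big(\{x_i\}_{i=1}^n\big)\,(\tilde z e^{\xi}\,dx)_1^n \\
&= \exp\Big[\int_{\mathbb{R}^d} \big(e^{\xi(x)} - 1\big)\tilde z(x)\,dx\Big] \int_\Gamma F(\gamma)\,\pi_{\tilde z e^{\xi}}(d\gamma). \qquad \Box
\end{aligned}
\]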
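As a quick sanity check on (1.2), here is a sketch under the assumption that all integrals involved converge. Taking $F \equiv 1$, which is $\mathcal{B}_\Lambda(\Gamma)$-measurable for every $\Lambda$, and using that $\pi_{\tilde z e^{\xi}}$ is a probability measure, the lemma reduces to the familiar Laplace transform of a Poisson measure:

```latex
% Special case F = 1 of formula (1.2): Laplace transform of the Poisson measure.
\[
\int_\Gamma e^{\langle \xi, \gamma \rangle}\,\pi_{\tilde z}(d\gamma)
= \exp\Big[\int_{\mathbb{R}^d} \big(e^{\xi(x)} - 1\big)\tilde z(x)\,dx\Big]
  \int_\Gamma \pi_{\tilde z e^{\xi}}(d\gamma)
= \exp\Big[\int_{\mathbb{R}^d} \big(e^{\xi(x)} - 1\big)\tilde z(x)\,dx\Big].
\]
```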

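For an arbitrary measurable function $f_n$ over $\mathbb{R}^{dn}$ with bounded essential support and $\gamma \in \Gamma$ we define a Wick monomial $\langle f_n, :\gamma^{\otimes n}: \rangle$ as a function of $\gamma \in \Gamma$ by the following formula [12]:
\[
\langle f_n, :\gamma^{\otimes n}: \rangle := \sum_{\{x_1, \dots, x_n\} \subset \gamma}\ \sum_{\pi \in S_n} f_n(x_{\pi(1)}, \dots, x_{\pi(n)}), \tag{1.3}
\]
where $S_n$ denotes the group of all permutations of the set $\{1, \dots, n\}$.

Remark 1. Formula (1.3) allows us to regard $:\gamma^{\otimes n}:$, $n \in \mathbb{N}$, as the following point measure over $\mathcal{B}(\mathbb{R}^{dn})$ [15,18,20]:
\[
:\prod_{i=1}^{n} \gamma(dx_i): \;=\; \prod_{i=1}^{n} \Big(\gamma - \sum_{j=1}^{i-1} \delta_{x_j}\Big)(dx_i). \tag{1.4}
\]

Lemma 2. For any $\mathcal{B}(\mathbb{R}^{dn})$-measurable function $f_n$ over $\mathbb{R}^{dn}$ with bounded essential support
\[
\int_\Gamma \langle f_n, :\gamma^{\otimes n}: \rangle\,\pi_{\tilde z}(d\gamma) = \int_{\mathbb{R}^{dn}} f_n(x_1, \dots, x_n)\,(\tilde z\,dx)_1^n, \tag{1.5}
\]
if the integral on the right-hand side of (1.5) exists.

Proof. Choosing $\Lambda \in \mathcal{L}(\mathbb{R}^d)$ in such a way that $\Lambda^n \supset \operatorname{ess\,supp} f_n$ and applying formulae (1.1) and (1.3), we get
\[
\begin{aligned}
\int_\Gamma \langle f_n, :\gamma^{\otimes n}: \rangle\,\pi_{\tilde z}(d\gamma)
&= \exp\Big[-\int_\Lambda \tilde z(x)\,dx\Big] \sum_{m=0}^{\infty} \frac{1}{m!} \int_{\Lambda^m} \sum_{\substack{i_1, \dots, i_n = 1 \\ i_j \ne i_k,\ j \ne k}}^{m} f_n(x_{i_1}, \dots, x_{i_n})\,(\tilde z\,dx)_1^m \\
&= \exp\Big[-\int_\Lambda \tilde z(x)\,dx\Big] \sum_{m=n}^{\infty} \frac{1}{(m-n)!} \int_{\Lambda^n} f_n(x_1, \dots, x_n)\,(\tilde z\,dx)_1^n \int_{\Lambda^{m-n}} (\tilde z\,dx)_{n+1}^m \\
&= \int_{\Lambda^n} f_n(x_1, \dots, x_n)\,(\tilde z\,dx)_1^n = \int_{\mathbb{R}^{dn}} f_n(x_1, \dots, x_n)\,(\tilde z\,dx)_1^n. \qquad \Box
\end{aligned}
\]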
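To make (1.3) and (1.5) concrete, the two lowest orders can be written out explicitly. This is a sketch that only unpacks the definitions above, assuming the integrals on the right exist: for $n = 1$ the Wick monomial is the linear functional $\langle f, \gamma \rangle$, and for $n = 2$ the double sum in (1.3) runs over all ordered pairs of distinct points of $\gamma$.

```latex
% n = 1: the Wick monomial is <f, gamma>, and (1.5) gives its mean.
\[
\int_\Gamma \langle f, \gamma \rangle\,\pi_{\tilde z}(d\gamma)
= \int_{\mathbb{R}^d} f(x)\,\tilde z(x)\,dx .
\]
% n = 2: by (1.3), sum over 2-point subsets and both permutations,
% i.e. over ordered pairs (x, y) of distinct points of gamma.
\[
\langle f_2, :\gamma^{\otimes 2}: \rangle
= \sum_{\substack{x, y \in \gamma \\ x \ne y}} f_2(x, y),
\qquad
\int_\Gamma \langle f_2, :\gamma^{\otimes 2}: \rangle\,\pi_{\tilde z}(d\gamma)
= \int_{\mathbb{R}^d}\!\int_{\mathbb{R}^d} f_2(x, y)\,\tilde z(x)\,\tilde z(y)\,dx\,dy .
\]
```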

In view of formula (1.5), for the calculation of averages of a product of Wick monomials with respect to the Poisson measure $\pi_{\tilde z}$, it is useful to have a rule for expressing a product of Wick monomials as a linear combination of Wick monomials. In order to introduce such a rule we need some more definitions. Let $\alpha = \{\alpha_1, \dots, \alpha_k\}$ be a partition of the set $\{1, \dots, n\}$, $n \in \mathbb{N}$, into $k$ non-empty subsets, i.e.,
\[
\alpha_1 \cup \alpha_2 \cup \cdots \cup \alpha_k = \{1, \dots, n\}, \qquad \alpha_i \ne \emptyset, \qquad \alpha_i \cap \alpha_j = \emptyset, \quad i \ne j, \quad i, j = 1, \dots, k,
\]
and let a function $\zeta_\alpha : \{1, \dots, n\} \to \{1, \dots, k\}$, $k = \#\alpha$, be defined in such a way that $i \in \alpha_{\zeta_\alpha(i)}$, where the subsets $\alpha_1, \dots, \alpha_k$ are supposed to be ordered somehow, for example by the smallest number they contain. Now, for an arbitrary $\mathcal{B}(\mathbb{R}^{dn})$-measurable function $f_n$ over $\mathbb{R}^{dn}$ we can introduce a mapping $f_n \mapsto f_n^\alpha$, $k = \#\alpha$, by the following formula:
\[
f_n^\alpha(x_1, \dots, x_{\#\alpha}) = f_n(x_{\zeta_\alpha(1)}, \dots, x_{\zeta_\alpha(n)}), \tag{1.6}
\]
and formulate the rule as follows:

Theorem 1. For all partitions $\alpha$ of the set $\{1, \dots, n\}$ into $k$ non-empty subsets $\alpha_1, \dots, \alpha_k$,
\[
\langle f_n, :\gamma^{\otimes \#\alpha_1}: \otimes \cdots \otimes :\gamma^{\otimes \#\alpha_k}: \rangle
= \sum_{\substack{\tilde\alpha \,:\, |\tilde\alpha_i \cap \alpha_j| \le 1, \\ i = 1, \dots, \#\tilde\alpha,\ j = 1, \dots, k}} \langle f_n^{\tilde\alpha}, :\gamma^{\otimes \#\tilde\alpha}: \rangle. \tag{1.7}
\]
The proof is just a combinatorial exercise based on formula (1.4) and definition (1.6), see [18,20] and the references therein.

Remark 2. Using the tradition of graphical representation for complicated formulae, widely followed in quantum field theory, it is possible to restate formula (1.7) in the form of a generalized Wick theorem for Poisson fields $\gamma$, see [18,20,16,12] and the references therein.
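The smallest non-trivial instance of (1.7) may help to read the notation. This is a sketch obtained directly from definitions (1.3) and (1.6), with $n = 2$ and the partition $\alpha = \{\{1\}, \{2\}\}$, so that the left-hand side is the ordinary product $\langle f_2, \gamma \otimes \gamma \rangle$. The admissible partitions $\tilde\alpha$ are $\{\{1\}, \{2\}\}$ and $\{\{1, 2\}\}$; for the latter, $\zeta_{\tilde\alpha}(1) = \zeta_{\tilde\alpha}(2) = 1$ and $f_2^{\tilde\alpha}(x) = f_2(x, x)$.

```latex
% Formula (1.7) for n = 2, alpha = {{1},{2}}:
% the full double sum splits into off-diagonal and diagonal parts.
\[
\langle f_2, \gamma \otimes \gamma \rangle
= \sum_{x, y \in \gamma} f_2(x, y)
= \underbrace{\sum_{\substack{x, y \in \gamma \\ x \ne y}} f_2(x, y)}_{\langle f_2,\, :\gamma^{\otimes 2}: \rangle}
+ \underbrace{\sum_{x \in \gamma} f_2(x, x)}_{\langle f_2^{\tilde\alpha},\, :\gamma^{\otimes 1}: \rangle,\ \tilde\alpha = \{\{1,2\}\}} .
\]
```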

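Along with the Wick monomials defined by formula (1.3), in this article we also use the following Charlier polynomials:
\[
Q_0 = 1, \qquad Q_n(f_n, \gamma) := \langle f_n, :(\gamma - \tilde z)^{\otimes n}: \rangle, \quad n \in \mathbb{N}, \tag{1.8}
\]
where by definition
\[
:(\gamma - \tilde z)^{\otimes n}: \;=\; \sum_{k=0}^{n} C_n^k (-1)^k\, \tilde z^{\otimes k} \otimes :\gamma^{\otimes (n-k)}: .
\]
For example,
\[
\begin{aligned}
Q_1(\varphi, \gamma) &= \langle \varphi, \gamma - \tilde z \rangle, \\
Q_2(\varphi_1 \otimes \varphi_2, \gamma) &= \langle \varphi_1, \gamma - \tilde z \rangle \langle \varphi_2, \gamma - \tilde z \rangle - \langle \varphi_1 \varphi_2, \gamma \rangle, \\
Q_3(\varphi_1 \otimes \varphi_2 \otimes \varphi_3, \gamma) &= \langle \varphi_1, \gamma - \tilde z \rangle \langle \varphi_2, \gamma - \tilde z \rangle \langle \varphi_3, \gamma - \tilde z \rangle - \langle \varphi_1 \varphi_2, \gamma \rangle \langle \varphi_3, \gamma - \tilde z \rangle \\
&\quad - \langle \varphi_1 \varphi_3, \gamma \rangle \langle \varphi_2, \gamma - \tilde z \rangle - \langle \varphi_2 \varphi_3, \gamma \rangle \langle \varphi_1, \gamma - \tilde z \rangle + 2 \langle \varphi_1 \varphi_2 \varphi_3, \gamma \rangle.
\end{aligned}
\]
Alternatively, Charlier polynomials can be defined through the following generating functional [6–8,20]:
\[
\exp\big[\langle \ln(1 + \varphi), \gamma \rangle - \langle \varphi, \tilde z \rangle\big], \tag{1.9}
\]
where $\varphi \in S(\mathbb{R}^d)$. It is easy to see from (1.8) and (1.5) or directly from (1.9) that the Charlier polynomials of different order are mutually orthogonal, more precisely:
\[
\int_\Gamma Q_n(\varphi_1 \otimes \cdots \otimes \varphi_n, \gamma)\, Q_m(\phi_1 \otimes \cdots \otimes \phi_m, \gamma)\,\pi_{\tilde z}(d\gamma)
= \delta_{m,n} \sum_{\pi \in S_n} \langle \varphi_1, \tilde z \phi_{\pi(1)} \rangle \cdots \langle \varphi_n, \tilde z \phi_{\pi(n)} \rangle,
\]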

where δm,n equals 1 if n = m, and 0 otherwise. In [13] a generalized Wick theorem for compensated Poisson fields was obtained which gives the rule for expressing a product of Charlier polynomials by their linear combination (see also [19,9]). For the case of the product of two Charlier polynomials it gives the following formula: b ,γ) + bfm , γ ) + Qn+m−1 (vn ⊗f Qn (vn , γ )Qm (fm , γ ) = Qn+m (vn ⊗ m b m , γ ), bfm , γ ) + Qn+m−3 (vn ⊗f + Qn+m−2 (vn ⊗

(1.10)

where by definition bfm )(x1 , . . . , xn+m ) := Psn+m vn (x1 , . . . , xn )fm (xn+1 , . . . , xn+m ), (vn ⊗ b m )(x1 , . . . , xn+m−1 ) := mnP n+m−1 vn (x1 , . . . , xn )fm (xn , . . . , xn+m−1 ), (vn ⊗f s mnPsn+m−2

bfm )(x1 , . . . , xn+m−2 ) := (vn ⊗ Z · vn (x1 , . . . , xn−1 , x)fm (x, xn , . . . , xn+m−2 )˜zdx, Rd

n+m−3 b )(x , . . . , x (vn ⊗f m 1 n+m−3 ) := m(m − 1)n(n − 1)Ps Z · vn (x1 , . . . , xn−1 , x)fm (x, xn−1 , . . . , xn+m−3 )˜zdx,

Rd


and Psn denotes the projection on the symmetric subspace of D(Rnd ). Any vector field v ∈ C01 (Rd 7 → Rd ) defines a one-parameter group of diffeomorphisms ψt,v of Rd , t ∈ R. Let for any γ ∈ 0, ψt,v γ := {ψt,v x | x ∈ γ }. Then, we can define the directional derivative of a functional F : 0 7 → Rd along the vector field v ∈ C01 (Rd 7 → Rd ) as follows: (∇v0 F )(γ ) =

d F (ψt,v γ ) t=0 , dt

(1.11)

provided the right-hand side of (1.11) makes sense. For the case when F belongs to the set of smooth cylinder functionals FCb∞ (D, 0), i. e. has the following form: (1.12) F (γ ) = gF hϕ1 , γ i, . . . , hϕn , γ i for some n ∈ N, gF ∈ Cb∞ (Rn ) and ϕ1 , . . . , ϕn ∈ D := C0∞ (Rd ), it is possible to write its directional derivative explicitly: (∇v0 F )(γ ) =

n X

∂i gF hϕ1 , γ i, . . . , hϕn , γ i h∇v ϕi , γ i,

(1.13)

i=1

where (∇v ϕ)(x) = h∇ϕ(x), v(x)iTx (Rd ) is the usual directional derivative on Rd along the vector field v. Note that the set of smooth cylinder functionals is dense in L2 (0, πz˜ ). We define the tangent space Tγ (0) at the point γ ∈ 0 as the Hilbert space of B(Rd )measurable γ -square-integrable vector fields V : Rd 7→ Rd with the scalar product Z hV1 (x), V2 (x)iTx (Rd ) γ (dx), V1 , V2 ∈ Tγ (0). hV1 , V2 iTγ (0) := Rd

The intrinsic 0-gradient of a function F ∈ FCb∞ (D, 0) is a mapping 0 3 γ 7→ (∇ 0 F )(γ ) ∈ Tγ (0) such that (∇v0 F )(γ ) = h(∇ 0 F )(γ ), viTγ (0) for any v ∈ C01 (Rd 7→ Rd ). As follows from (1.13) for smooth cylinder functions of the form (1.12) we have ∇ 0 F (γ , x) =

n X

∂i gF hϕ1 , γ i, . . . , hϕn , γ i ∇ϕi (x),

i=1 d

where ∇ = ∇ R is the usual gradient in Rd . As can be shown from the definitions of Charlier polynomials and the intrinsic 0gradient above we have that ∇ 0 Qn (ϕ1 ⊗ · · · ⊗ ϕn , γ ) = Y X (−1)|τ |−1 |τ | − 1 !Q|τ¯ | ⊗ ϕi , γ ∇ ϕi , = τ ⊂{1,...,n}, τ 6=∅

where by definition τ¯ := {1, . . . , n} \ τ .

i∈τ¯

i∈τ

(1.14)


For any F, G ∈ FCb∞ (D, 0) introduce the pre-Dirichlet form Eπ0z by Z Eπ0z (F, G) = h∇ 0 F, ∇ 0 GiTγ (0) πz (dγ ). 0

(1.15)

As is shown in [1] this pre-Dirichlet form is closable and its closure Eπ0z , Dom(Eπ0z ) is a Dirichlet form on L2 (0, πz ). Moreover, the corresponding operator Hπ0z , Dom(Hπ0z ) associated with this Dirichlet form, in the sense that Eπ0z (F, G) = hHπ0z F, GiL2 (0,πz ) ,

for all F, G ∈ Dom(Hπ0z ),

is the Friedrichs’ extension of the operator − div0 ∇ 0 , where the 0-divergence div0 is defined in such a way that Z Z hV(γ ), ∇ 0 F (γ )iTγ (0) πz (dγ ) = − (div0 V)(γ )F (γ )πz (dγ ), F ∈ FCb∞ (D, 0) 0

0

for any vector field V of the form V=

n X i=1

Fi vi , vi ∈ C01 (Rd 7 → Rd ), Fi ∈ FCb∞ (D, 0), i = 1, . . . , n, n ∈ N,

i. e., div0 V = div0

n X i=1

vi Fi :=

n X

h∇ 0 Fi , vi iT (0) + Fi div vi .

i=1

It should be mentioned that Hπ0z is essentially self-adjoint on the set of smooth cylinder functions FCb∞ (D, 0) (cf. [1, Sect. 5]). 2. Potential Perturbations. Non-Existence of Standard Wave Operators Set H0 := Hπ0z . The operator H0 has the following second quantization structure [1]: H0 1 = 0, i. e., 1 is a ground state of H0 , and for any fn ∈ S(Rdn ), H0 Qn (fn , ·) = Qn (−1n fn , ·),

n ≥ 1,

(2.1)

where −1n is the Laplacian in L2 (Rdn ), 1 := 11 . Under the Wiener–Itô–Segal isomorphism between L2 (0, πz ) and the Fock space over L2 (Rd , zdx) the operator H0 transforms into the second quantization operator from the one particle operator −1. It immediately follows that H0 is a non-negative essentially self-adjoint operator and σ (H0 ) = σac (H0 ) = [0, ∞), where σ (H0 ) and σac (H0 ) are the spectrum and the absolutely continuous spectrum of the operator H0 , respectively, [4, Vol. II, Chapter 6]. Moreover, the absolutely continuous subspace Hac corresponding to the operator H0 coincides with the orthogonal complement to the ground state Hac (H0 ) = {1}⊥ . By the spectral representation theorem, (2.1) implies that for any bounded Borel function ϕ : R 7 → C, fn ∈ L2 (Rdn ). ϕ(H0 )Qn (fn , ·) = Qn ϕ(−1n )fn , · ,


In particular, e−itH0 Qn (fn , ·) = Qn e−it1n fn , · ,

fn ∈ L2 (Rdn ).

(2.2)

Let V ∈ L2 (0, πz ) be a real-valued function, and H be some self-adjoint extension of the symmetric operator H0 + V defined on FCb∞ (D, 0). Such a self-adjoint extension exists because H0 + V commutes on FCb∞ (D, 0) with the operator of complex conjugation, i. e., (H0 + V )F (γ ) = (H0 + V )F (γ ),

F ∈ FCb∞ (D, 0).

For example, if V is semi-bounded from below, H can be the Friedrich’s extension of the corresponding symmetric operator. From the point of view of standard non-stationary scattering theory [10,17] it would be natural to define wave operators W ± (H, H0 ) as the strong limits: W ± (H, H0 ) = s-lim eitH e−itH0 Pac (H0 ), t→∓∞

(2.3)

if they exist. Here Pac (H0 ) is the projection on the absolutely continuous subspace Hac (H0 ) of the operator H0 . Let us show that the wave operators W ± (H, H0 ) do not exist for an arbitrary non-zero potential V . Theorem 2. Suppose that V is a real-valued non-zero potential V ∈ L2 (0, πz ) and that the operator V e−H0 is bounded in H. Let H be a self-adjoint extension of the symmetric operator H0 + V defined on FCb∞ (D, 0). Then the wave operators defined by (2.3) do not exist. Proof. Suppose that W ± (H, H0 ) exist. Then (see, e. g., [10]) (H + i)−1 W ± = W ± (H0 + i)−1 and s-lim (H + i)−1 − (H0 + i)−1 e−itH0 Pac (H0 ) = 0.

t→±∞

(2.4)

Denote the closure of a linear operator T by T . Clearly, e−H0 V = (V e−H0 )∗ is a bounded operator in H and e−H0 V (H0 + i)−1 = e−H0 V (H0 + i)−1 = = e−H0 (H0 + V + i)(H + i)−1 V (H0 + i)−1 = = − (H0 + i)e−H0 + (V e−H0 )∗ (H + i)−1 − (H0 + i)−1 . Hence, by (2.4) s-lim e−H0 V (H0 + i)−1 e−itH0 Pac (H0 ) = 0.

t→±∞

As the set (H0 + i)−1 Hac (H0 ) is dense in Hac (H0 ), we have s-lim e−H0 V e−itH0 Pac (H0 ) = 0.

t→±∞

(2.5)


We will now show that condition (2.5) is not valid, and therefore wave operators do not exist. Define uϕ (γ ) = Q1 (ϕ, γ ) ≡ hϕ, γ i − hϕ, zi ∈ Pac (H0 )H. Here ϕ ∈ S(Rd ), where S(Rd ) is the space of tempered test functions. Note that e−itH0 uϕ = uϕt , see (2.2), where Z 2 ei|x−y| /4t ϕ(y)dy. (2.6) ϕt (x) := (e−it1 ϕ)(x) = (4πit)−d/2 Rd

It is sufficient to show that for some ϕ 6 = 0, lim ke−H0 V uϕt kH 6= 0.

t→∞

Let V =

∞ P n=0

Qn (vn , γ ) be the expansion of V in a series of orthogonal Charlier poly-

nomials [7,8]. Then, we have V uϕ(t) =

∞ X

Qn (vn , γ )Q1 ϕt , γ .

n=0

By formula (1.10), b n , γ ) + Qn−1 (ϕt ⊗ bvn , γ ) + Qn (ϕt ⊗v bvn , γ ). Q1 ϕt , γ Qn (vn , γ ) = Qn+1 (ϕt ⊗ b k As kϕt kL∞ (Rd ) → 0 when t → ∞, we have that kϕt ⊗v n L2 (Rdn ) → 0 when t → ∞. bvn kL (Rd(n−1) ) → 0 as t → ∞. This is clear for Analogously, one can prove that kϕt ⊗ 2 vn ∈ S(Rdn ). For an arbitrary vn ∈ L2 (Rdn ), it follows from the estimate bvn kL (Rd(n−1) ) ≤ z nkϕkL2 (Rd ) kvn kL2 (Rdn ) kϕt ⊗ 2 and the fact that S(Rdn ) is dense in L2 (Rdn ), n ∈ N. Let n0 be the minimal integer for which kvn0 k 6 = 0 and let Pn , n ∈ N, be the projector on the subspace generated by Charlier polynomials of order n. It suffices to show that kPn0 +1 e−H0 V uϕt kH 6 → 0 as t → ∞. We have

b n +1 , γ +Qn +1 ϕt ⊗ bvn0 , γ + Qn0 +1 ϕt ⊗v bvn0 +2 , γ . Pn0 +1 V uϕt (γ ) = Qn0 +1 ϕt ⊗ 0 0

Clearly, the norm of the second and the third terms go to zero as t tends to infinity. Hence, bvn0 , γ kH = (2.7) lim kPn0 +1 e−H0 V uϕt kH = lim kQn0 +1 e−1n0 +1 ϕt ⊗ t→∞

t→∞

1

((n0 + 1)!) 2 z

n0 +1 2

ke−1 ϕkL2 (Rd ) ke−1n0 vn0 kL2 (Rdn0 ) 6 = 0.

Here we have used the following equality Z vˆn (x1 , . . . , xk−1 , xk+1 , . . . , xn+1 )ϕˆt (xk ) × lim t→∞ Rd(n+1)

× vˆn (x1 , . . . , xl−1 , xl+1 , . . . , xn+1 )ϕˆt (xl )(dx)1n+1 = 0, k 6= l,

(2.8)


where vˆn := e−1n vn , ϕˆ := e−1 ϕ and (dx)1n := dx1 . . . dxn , n ∈ N. Note that (2.8) follows from the weak convergence w

ϕt = e−it1 ϕ −→ 0, as t → ∞. By (2.7), we see that (2.5) is not valid. Hence, the wave operators W ± (H, H0 ) do not exist. We have proved in Theorem 2 that V uϕt → 0 as t → ∞ iff ϕ = 0. In Sects. 3 and 4, we will need the following generalization of this fact. Lemma 3. Let V be a non-zero function from L2 (0, πz ) and suppose that all the operators V Pn , n ∈ N, are bounded. Then F ∈ H | lim kV e−itH0 F kH = 0 = {0}. t→∞

Proof. Let V =

∞ X

Qn (vn , γ )

and

F =

n=0

∞ X

Qm (fn , γ )

m=0

be expansions of V and F in the series of Charlier polynomials, and set n0 := min{n | kvn kL2 (Rdn ) 6 = 0}, and suppose that

kV e−itH0 F k

H

m0 := min{m | kfm kL2 (Rdm ) 6 = 0}

→ 0 as t → ∞. Then

kPn0 +m0 V e−itH0 F kH → 0 as t → ∞.

(2.9)

By (1.10), we have that b m ,t , γ ) + bfm0 ,t , γ ) + Qn0 +m0 (vn0 +1 ⊗f Pn0 +m0 (V e−itH0 F ) = Qn0 +m0 (vn0 ⊗ 0 b , γ ) + + Qn0 +m0 (vn0 ⊗f m0 +1,t X X b b Qn0 +m0 (vn ⊗fm,t , γ ) + Qn0 +m0 (vn ⊗f + m,t , γ ), n≥n0 , m≥m0 n+m=n0 +m0 +2

n≥n0 , m≥m0 n+m=n0 +m0 +3

where by definition fn,t := e−it1 fn for any fn ∈ L2 (Rdm ), n ∈ N. Clearly (see the proof of Theorem 2), for any fn ∈ S(Rdn ), n ∈ N, bfm0 ,t kL (Rd(n0 +m0 ) ) → kvn0 kL (Rdn0 ) kfm0 kL (Rdm0 ) , as t → ∞, kvn0 ⊗ 2 2 2

b m,t k kvn ⊗f L2 (Rd(n+m−1) ) → 0, as t → ∞, bfm,t kL (Rd(n+m−2) ) → 0, as t → ∞, kvn ⊗ 2 b kvn ⊗fm,t kL2 (Rd(n+m−3) ) → 0, as t → ∞. Hence, for all fn ∈ S(Rdn ), kPn0 +m0 V e−itH0 F kH → ((n0 + m0 )!) 2 z 1

n0 +m0 2

kvn0 kL2 (Rdn0 ) kfm0 kL2 (Rdm0 ) 6= 0,

as t → ∞. (2.10) Since the operator Pn0 +m0 V = (V Pn0 +m0 )∗ is bounded in H, one can extend (2.10) to any vector F ∈ H. This leads to a contradiction to condition (2.9), and therefore, t fm = 0 for all m ≥ 0 from which it implies that F ≡ 0. u


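3. Existence of Wave Operators for Cylindric Darboux Transforms

Consider the operator $H$ associated with the Dirichlet form corresponding to Poisson measure perturbed by a smooth cylindric density
\[
\mathcal{E}^\Gamma_{\Phi^2 \pi_z}(F, G) = \int_\Gamma \langle \nabla^\Gamma F(\gamma), \nabla^\Gamma G(\gamma) \rangle_{T_\gamma(\Gamma)}\, \Phi^2(\gamma)\,\pi_z(d\gamma).
\]
We shall assume that $\Phi(\gamma) = \exp g_\Phi\big(\langle \phi_1, \gamma \rangle, \dots, \langle \phi_n, \gamma \rangle\big)$, where $g_\Phi \in C_b^1(\mathbb{R}^n)$, $\phi_1, \dots, \phi_n \in C^1(\mathbb{R}^d) \cap L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$. The action of the operator $H$ on exponentials $F(\gamma) = \exp\langle \xi, \gamma \rangle$, $\xi \in S(\mathbb{R}^d)$, can be written [3] in the following form:
\[
(HF)(\gamma) = (H_0 F)(\gamma) + 2 \sum_{i=1}^{n} \partial_i g_\Phi\big(\langle \phi_1, \gamma \rangle, \dots, \langle \phi_n, \gamma \rangle\big)\, \big\langle \langle \nabla \phi_i, \nabla \xi \rangle_{\mathbb{R}^d}, \gamma \big\rangle\, F(\gamma). \tag{3.1}
\]
From now on, we will assume that $H$ is some non-negative definite self-adjoint extension of the symmetric operator defined by (3.1) in the Hilbert space $\tilde{\mathcal{H}} = L^2\big(\Gamma, \Phi^2(\gamma)\,\pi_z(d\gamma)\big)$. Then $\tilde{\mathcal{H}} \supset \mathcal{H}$, and we can consider the wave operators defined by (2.3) as operators from $\mathcal{H}$ into $\tilde{\mathcal{H}}$.

Theorem 3. Let $H_0$ and $H$ be self-adjoint operators associated with the Dirichlet forms $\mathcal{E}^\Gamma_{\pi_z}$ and $\mathcal{E}^\Gamma_{\Phi^2 \pi_z}$ correspondingly, and assume that

(i) $g_\Phi \in C_b^1(\mathbb{R}^n)$;
(ii) $\phi_j \in C^1(\mathbb{R}^d)$ and for some $c > 0$, $\alpha > d/2 + 1$,
\[
|\nabla \phi_j(x)| \le \frac{c}{(1 + |x|)^\alpha}, \qquad x \in \mathbb{R}^d, \quad j = 1, \dots, n. \tag{3.2}
\]
Then the wave operators defined by the formula
\[
W^\pm(H, H_0) = \operatorname*{s-lim}_{t \to \mp\infty} e^{itH} e^{-itH_0}
\]
exist, and moreover, $\sigma_{ac}(H) = \sigma_{ac}(H_0) = [0, +\infty)$.

Remark 3. Since $H 1 = H_0 1 = 0$ and $\mathcal{H}_{ac}(H_0) = \{1\}^\perp$, we can omit the projector $P_{ac}$ in the definition of the wave operators.

Remark 4. Assumption (3.2) can be replaced by some $L^p$ condition. For example, it suffices to assume that for some $\varepsilon > 0$,
\[
|\nabla \phi_i| \in L^2\big(\mathbb{R}^d, (1 + |x|)^{2 + \varepsilon}\,dx\big), \qquad i = 1, \dots, n.
\]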


Proof. According to the Cook criterion, existence of the wave operators $W^\pm(H, H_0)$ will be proved if we show that for any function $F$ from some dense subset of $\mathcal{H}$ there exists $t_F > 0$ such that
\[
\int_{t_F}^{\infty} \big\| (H - H_0)\, e^{\pm itH_0} F \big\|_{\tilde{\mathcal{H}}}\, dt < \infty. \tag{3.3}
\]
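For orientation, the standard argument behind the Cook criterion can be sketched as follows. This is a formal outline, not the authors' text: it assumes that $e^{-i\tau H_0} F$ lies in the domains of both $H$ and $H_0$ for the times considered, and that $\mathcal{H}$ is regarded as a subset of $\tilde{\mathcal{H}}$ as above. The symbol $W(t)$ is introduced here only for illustration.

```latex
% Cook's method (formal sketch): W(t) is a hypothetical shorthand,
% not notation used in the paper.
\[
W(t) F := e^{itH} e^{-itH_0} F,
\qquad
\frac{d}{dt}\, W(t) F = i\, e^{itH} (H - H_0)\, e^{-itH_0} F .
\]
% Since e^{itH} is unitary on \tilde{H}, integrating the derivative gives
\[
\big\| W(t) F - W(s) F \big\|_{\tilde{\mathcal{H}}}
\le \int_s^t \big\| (H - H_0)\, e^{-i\tau H_0} F \big\|_{\tilde{\mathcal{H}}}\, d\tau ,
\qquad t_F \le s \le t .
\]
% If the integrand is integrable on [t_F, \infty), then W(t)F is Cauchy
% as t -> \infty; the case t -> -\infty is analogous.
```

Because the operators $W(t)$ are uniformly bounded, convergence on a dense set of vectors $F$ extends to all of $\mathcal{H}$.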

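It is convenient for our purpose to choose as such a dense subset the linear span of all functions of the form $F(\gamma) = \exp[\langle \log(1 + \varphi), \gamma \rangle]$, where $\varphi \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$, $\varphi > -1$, with $\mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$ denoting the image under the inverse Fourier transform $\mathcal{F}^{-1}$ in $L^2(\mathbb{R}^d)$ of the set of all infinitely differentiable functions whose support is a bounded subset of $\mathbb{R}^d \setminus \{0\}$. Then [1]
\[
e^{\pm itH_0} F(\gamma) = \exp\big[\langle \log(1 + \varphi_{\pm t}), \gamma \rangle\big], \qquad \varphi_{\pm t} := e^{\pm it\Delta} \varphi,
\]
and (3.3) can be rewritten as follows:
\[
\int_{t_F}^{\infty} \Big\| (H - H_0) \exp\big[\langle \log(1 + \varphi_{\pm t}), \cdot \rangle\big] \Big\|_{\tilde{\mathcal{H}}}\, dt < \infty. \tag{3.4}
\]
By (2.6),
\[
\|\varphi_{\pm t}\|_{L^\infty(\mathbb{R}^d)} \le (4\pi t)^{-d/2}\, \|\varphi\|_{L^1(\mathbb{R}^d)}, \tag{3.5}
\]
hence we can assume that there exists $t_F > 0$ such that for all $t \ge t_F$
\[
\|\varphi_{\pm t}\|_{L^\infty(\mathbb{R}^d)} \le \frac{1}{2}, \tag{3.6}
\]
and so expression (3.4) makes sense. Applying formula (3.1) and taking into account the boundedness of $\partial_i g_\Phi$, $i = 1, \dots, n$, and $g_\Phi$, we obtain that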

h

i

(H − H0 ) exp log(1 + ϕ±t ), ·

e

H

n

E h

X D

i ∇ϕ±t

= 2 ∂i g8 hφ1 , ·i, . . . , hφn , ·i ∇φi , log 1 + ϕ±t , · d , · exp

e 1 + ϕ±t R H i=1

D

n E h

X

i ∇ϕ±t

≤2 sup |∂i g8 |

∇φi , 1 + ϕ Rd , · exp log 1 + ϕ±t , · e ±t H i=1

D

n E h X

i ∇ϕ±t

. ≤ 2(sup eg8 )1/2 | sup ∂i g8 | ∇φ , , · exp log 1 + ϕ , · i ±t d

1 + ϕ±t R H i=1

Consequently, (3.4) will be proved if we show that for some tF so that (3.6) holds, we have

Z ∞ D E h

i

∇φi , ∇ϕ±t log 1 + ϕ±t , · i = 1, . . . , n. d , · exp

dt < ∞,

1+ϕ R tF

±t

H

(3.7)


Define η±t := h∇φj , ∇ log(1 + ϕ±t )iRd , and ξ±t := log(1 + ϕ±t ). Then Z

2

hη±t , ·i exp hξ±t , ·i = hη±t ⊗ η±t , γ ⊗2 i exp 2hRe ξ±t , γ i πz (dγ ) = H 0 iZ h Z 2 Re ξ±t (x) (e − 1)dx hη±t ⊗ η±t , γ ⊗2 iπze2 Re ξ±t (dγ ) = = exp z d R 0 i h Z (e2 Re ξ±t (x) − 1)dx × = exp z d ZR 2 Z 2 e2 Re ξ±t (x) η±t (x)dx + e2 Re ξ±t (x) η±t (x) dx , × z Rd

Rd

(3.8)

where we used formulae (1.2), (1.7) and (1.5) from Sect. 1. Since e2 Re ξ±t (x) = |eξ±t (x) |2 = |1 + ϕ±t |2 , kϕ±t kL2 (Rd ) = kϕkL2 (Rd ) and h Z exp z

Rd

R Rd

(3.9)

ϕ±t (x)dx = (2π )d/2 (Fϕ±t )(0) = 0, we get that

i h Z (e2 Re ξ±t − 1)dx = exp z

Rd

i

|1 + ϕ±t |2 − 1

= exp zkϕk2L

d 2 (R )

.

(3.10)

Moreover, taking into account (3.9), (3.6) and the equality ∇ϕ±t = (∇ϕ)±t , we obtain that 1/2 Z 2 Z 2 e2 Re ξ±t (x) η±t (x)dx + e2 Re ξ±t (x) η±t (x) dx ≤ z Rd

Rd

2

∇ϕ±t 2 |1 + ϕ±t | ∇φi , + d dx 1 + ϕ±t R Rd 1/2 Z

∇ϕ±t 2 |1 + ϕ±t |2 ∇φi , dx ≤ +z d 1 + ϕ±t R Rd Z ∇ϕ 2 2 ±t 1 + |ϕ±t | |∇φi | ≤ z dx + 1 + ϕ±t Rd 1/2 Z ∇ϕ 2 ±t 2 ≤ 1 + |ϕ±t | |∇φi |2 +z dx 1 + ϕ±t Rd Z 2 2 |∇ϕ±t | dx + 1 + |ϕ±t | |∇φi | ≤ z 1 − |ϕ±t | Rd 1/2 Z 2 |∇ϕ±t |2 ≤ 1 + |ϕ±t | |∇φi |2 +z 2 dx 1 − |ϕ±t | Rd

Z ≤ z


Z 1/2 Z 2 3 2 2 ≤3 |∇φi | |∇ϕ±t |dx + z |∇φi | |∇ϕ±t | dx ≤ z d 2 Rd Z R Z 1/2 3 |∇φi | |∇ϕ±t |dx + z |∇φi |2 |∇ϕ±t |2 dx . (3.11) ≤3 z 2 Rd Rd So, as one can see from (3.8), (3.10), (3.11) and the fact that all components of ∇ϕ also belong to F −1 C0∞ (Rd \ {0}) , (3.7) will be proved if we show that Z ∞

|∇φi | ϕ±t dt < ∞, i = 1, . . . , n (3.12) L (Rd ) 1

tF

and

Z

∞ tF

|∇φi | ϕ±t

L2 (Rd )

dt < ∞, i = 1, . . . , n

(3.13)

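for all $\phi \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$. It is well known (see Theorem XI.24 in [17]) that (3.13) holds if

\[
|\nabla\varphi_i(x)| \le \frac{c}{(1 + |x|)^{1 + \varepsilon}}, \qquad i = 1, \dots, n,
\]

for some $c, \varepsilon > 0$. But this follows from condition (ii) of the theorem. To prove (3.12), it suffices to show (see (3.5) and the proof of Theorem XI.16 in [17]) that for all $b > a > 0$,

\[
I := \int_{t_F}^{\infty} t^{-d/2} \int_{at \le |x| \le bt} |\nabla\varphi_i(x)| \, dx \, dt < \infty, \qquad i = 1, \dots, n.
\]

By condition (ii) of the theorem, we have that

\[
\int_{at \le |x| \le bt} |\nabla\varphi_i(x)| \, dx \le c \int_{at \le |x| \le bt} \frac{dx}{(1 + |x|)^{\alpha}} \le \tilde{c} \, t^{d - \alpha}
\]

for some constant $\tilde{c}$. And so,

\[
I \le \tilde{c} \int_{t_F}^{\infty} t^{d/2 - \alpha} \, dt < \infty
\]

for all $\alpha > d/2 + 1$.

To show that $\sigma_{ac}(H) = \sigma_{ac}(H_0) = [0, +\infty)$, let us introduce the following subspaces: $\mathcal{H}^0_{in} = (\operatorname{Ker} W^+)^\perp$, $\mathcal{H}_{in} = \operatorname{Ran} W^+$. Then $\mathcal{H}^0_{in}$ and $\mathcal{H}_{in}$ are invariant subspaces of the operators $H_0$ and $H$ respectively, and the restriction $H_0 \upharpoonright \mathcal{H}^0_{in}$ is unitarily equivalent to $H \upharpoonright \mathcal{H}_{in}$ (see Prop. 4 of Sect. XI.3 in [17]). In our case $\mathcal{H}^0_{in} = \mathcal{H}$. To prove this, let us show that $\operatorname{Ker} W^+ = \{0\}$. Let $F \in \operatorname{Ker} W^+$. Then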

kW + F k = lim k82 e−itH0 F k = 0. t→+∞

By Lemma 3, we see that $F = 0$. Therefore, $H_0$ is unitarily equivalent to $H \upharpoonright \mathcal{H}_{in}$ and $\sigma_{ac}(H) \supset \sigma_{ac}(H_0) = [0, \infty)$. On the other hand $H \ge 0$, and so $\sigma(H) \subset [0, \infty)$. $\square$

4. Scattering Theory for Local Gibbs Perturbations of Poisson Measures

As we have mentioned in the introduction, the Poisson space $L_2(\Gamma, \pi_z)$ describes a functional space realization of the Araki-Woods representation for the free non-relativistic Bose gas of positive density $z > 0$, see [3] and the references therein. The Poisson measure plays the role of a vacuum measure in this realization. Let us consider now the case of a perturbed vacuum measure $\mu$. This perturbation will be local in space and of Gibbs type. Due to the canonical quantization approach from [3], the new vacuum measure again creates a state of a locally perturbed non-relativistic Bose gas. In this approach the Hamiltonian of the perturbed system appears as an operator $H$ associated with the Dirichlet form of the Gibbs measure $\mu$ on $\Gamma$ defined by

\[
\mu(d\gamma) = \Xi_z^{-1} \exp\Big[ -\frac{1}{2} \langle V, :\gamma^{\otimes 2}: \rangle \Big] \pi_z(d\gamma),
\]

where

\[
V(x, y) = v(x - y) \, a(x) \, a(y), \tag{4.1}
\]

$a \in C_0^\infty(\mathbb{R}^d)$, $a \ge 0$, $v, \nabla v \in L_2^{loc}(\mathbb{R}^d)$ and $v(x) = v(-x)$. Here and below, we assume that there exists a bounded non-negative measurable function $b(\cdot)$ with bounded support such that the inequality

\[
\frac{1}{2} \langle V, :\gamma^{\otimes 2}: \rangle \ge -\langle b, \gamma \rangle \tag{4.2}
\]

holds for $\pi_z$-almost all $\gamma \in \Gamma$.

Remark 5. Assumption (4.2) is similar to the well-known stability condition on the potential $v$:

\[
\exists B : \ \forall n \in \mathbb{N}, \ \forall x_1, \dots, x_n \in \mathbb{R}^d \qquad \sum_{1 \le i < j \le n} v(x_i - x_j) \ge -Bn. \tag{4.3}
\]

In particular, (4.2) can be obtained from (4.3) if we take $a(\cdot) = \chi_\Lambda(\cdot)$, where $\chi_\Lambda$ is the characteristic function of some bounded subset $\Lambda$ in $\mathbb{R}^d$.

Remark 6. Both assumptions (4.2) and (4.3) are obviously fulfilled provided $v \ge 0$ and, in this case, we can give a simpler proof of Theorem 4 below, see Appendix A.

The action of the operator $H$ on cylinder functions $F = g(\langle \xi_1, \cdot \rangle, \dots, \langle \xi_n, \cdot \rangle) \in \mathcal{F}C_b^\infty(S, \Gamma)$, $S := S(\mathbb{R}^d)$, is defined by the formula (see [3]):

\[
(HF)(\gamma) = (H_0 F)(\gamma) + \frac{1}{2} \sum_{i=1}^{n} \langle K_{\xi_i}, :\gamma^{\otimes 2}: \rangle \, \partial_i g\big(\langle \xi_1, \cdot \rangle, \dots, \langle \xi_n, \cdot \rangle\big), \tag{4.4}
\]

where

\[
K_\xi(x, y) := \big\langle \nabla_x V(x, y), (\nabla\xi)(x) \big\rangle_{\mathbb{R}^d} + \big\langle \nabla_y V(x, y), (\nabla\xi)(y) \big\rangle_{\mathbb{R}^d}.
\]

We will assume that $H$ is some non-negative self-adjoint extension of the symmetric operator defined by (4.4) in the Hilbert space $\mathcal{H}_\mu = L_2(\Gamma, \mu)$.

Recall that $P_n$ is the orthogonal projector on the subspace generated by all Charlier polynomials of order $n$ in the Hilbert space $\mathcal{H}$ and consider the operator $J_n$ acting from $\mathcal{H}$ into $\mathcal{H}_\mu$ by the formula: $J_n F = P_n F$, $n \ge 1$. As we will show below (Lemma 4), the operators $J_n$, $n \ge 1$, are bounded from $\mathcal{H}$ into $\mathcal{H}_\mu$, and therefore, one can define wave operators (from $\mathcal{H}$ into $\mathcal{H}_\mu$)

\[
W^{\pm}(H, H_0, J_n) := \operatorname*{s-lim}_{t \to \mp\infty} e^{itH} J_n e^{-itH_0} \equiv \operatorname*{s-lim}_{t \to \mp\infty} e^{itH} e^{-itH_0} P_n, \tag{4.5}
\]

provided the limit exists. We will show that such wave operators indeed exist for all $n$, and hence the standard wave operators

\[
W^{\pm}(H, H_0) F = \lim_{t \to \mp\infty} e^{itH} e^{-itH_0} F \tag{4.6}
\]

exist for all polynomials $F$.

Lemma 4. For any $n \ge 1$, $J_n$ are bounded operators from $\mathcal{H}$ to $\mathcal{H}_\mu$.

Proof. By (4.2),

\begin{align*}
\| Q_n(f_n, \cdot) \|^2_{\mathcal{H}_\mu}
&= \Xi_z^{-1} \int_\Gamma |Q_n(f_n, \gamma)|^2 \exp\Big[ -\frac{1}{2} \langle V, :\gamma^{\otimes 2}: \rangle \Big] \pi_z(d\gamma) \\
&\le \Xi_z^{-1} \int_\Gamma |Q_n(f_n, \gamma)|^2 \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma).
\end{align*}

Since

\[
\| Q_n(f_n, \cdot) \|^2_{\mathcal{H}} = z^n \, n! \, \| f_n \|^2_{L_2(\mathbb{R}^{dn})},
\]

to prove the lemma it is sufficient to show that for some $c_n > 0$ and any symmetric function $f_n \in L_2(\mathbb{R}^{dn})$,

\[
\int_\Gamma |Q_n(f_n, \gamma)|^2 \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma) \le c_n \| f_n \|^2_{L_2(\mathbb{R}^{dn})}.
\]

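Denote $\tilde{b} := e^b - 1$,

\[
e(\phi, \gamma) := \exp\big[ \langle \log(1 + \phi), \gamma \rangle - z \langle \phi \rangle \big] = \sum_{n=0}^{\infty} \frac{1}{n!} Q_n(\phi^{\otimes n}, \gamma),
\]

and consider for $\phi \in C_0^\infty(\mathbb{R}^d)$,

\begin{align*}
\int_\Gamma e(t\phi, \gamma) \, Q_n(f_n, \gamma) \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma)
&= \exp\big[ z \langle \tilde{b} + t\phi\tilde{b} \rangle \big] \int_\Gamma e\big( \tilde{b} + t\phi(1 + \tilde{b}), \gamma \big) \, Q_n(f_n, \gamma) \, \pi_z(d\gamma) \\
&= z^n \exp\big[ z \langle \tilde{b} + t\phi\tilde{b} \rangle \big] \Big\langle \big( \tilde{b} + t\phi(1 + \tilde{b}) \big)^{\otimes n}, f_n \Big\rangle_{L_2(\mathbb{R}^{dn})}. \tag{4.7}
\end{align*}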

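By (4.7), we have

\[
\int_\Gamma Q_n(\phi^{\otimes n}, \gamma) \, Q_n(f_n, \gamma) \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma)
= n! \, z^n \exp\big[ z \langle \tilde{b} \rangle \big] \sum_{k=0}^{n} C_n^k \, \frac{\big( z \langle \phi, \tilde{b} \rangle \big)^k}{k!} \Big\langle \tilde{b}^{\otimes k} \otimes (\phi + \phi\tilde{b})^{\otimes(n-k)}, f_n \Big\rangle_{L_2(\mathbb{R}^{dn})}.
\]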

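Therefore,

\begin{align*}
\int_\Gamma |Q_n(f_n, \gamma)|^2 \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma)
&= z^n \exp\big[ z \langle \tilde{b} \rangle \big] \sum_{k=0}^{n} \frac{(n!)^2}{(k!)^2 (n-k)!} \, z^k \\
&\quad \times \int_{\mathbb{R}^{d(n+k)}} f_n(x_1, \dots, x_k, x_{k+1}, \dots, x_n) \, \overline{f_n(x_1', \dots, x_k', x_{k+1}, \dots, x_n)} \\
&\quad \times \prod_{i=1}^{k} \tilde{b}(x_i) \, \tilde{b}(x_i') \prod_{j=k+1}^{n} \big( 1 + \tilde{b}(x_j) \big) \, dx_1 \dots dx_n \, dx_1' \dots dx_k' \\
&\le c_n \| f_n \|^2_{L_2(\mathbb{R}^{dn})}. \qquad \square
\end{align*}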

Theorem 4. Let $H_0$ and $H$ be self-adjoint operators associated with the Dirichlet forms corresponding to the measures $\pi_z$ and $\mu$. Suppose also that $v$ and $|\nabla v|$ are from $L_2^{loc}(\mathbb{R}^d)$, $v(x) = v(-x)$, and the potential $V$ defined by formula (4.1) with $a \in C_0^\infty(\mathbb{R}^d)$ satisfies assumption (4.2). Then for all $n \in \mathbb{N}$ the wave operators $W^{\pm}(H, H_0, J_n)$ exist and, in particular, the wave operators $W^{\pm}(H, H_0)$ exist on all polynomials. Moreover, $\sigma_{ac}(H) = \sigma_{ac}(H_0)$.

Remark 7. By a modification of the following proof it is possible to weaken the condition $v, \nabla v \in L_2^{loc}(\mathbb{R}^d)$, allowing a non-integrable positive singularity of the potential $v$, see e.g. [2].

Proof. Let us show that for all $n \in \mathbb{N}$ the wave operators (4.5) exist. To do this, it suffices to show that for any $F$ from some total subset of $P_n \mathcal{H}$ there exists $t_F > 0$ such that

\[
\int_{t_F}^{\infty} \big\| (H - H_0) \, e^{\pm itH_0} F \big\|_{\mathcal{H}_\mu} \, dt < \infty. \tag{4.8}
\]
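The reduction to (4.8) is an instance of Cook's method (cf. Sect. XI.3 of [17]). As a sketch only, written for the limit $t \to +\infty$ and assuming that $e^{-iuH_0}F$ stays in a domain on which $H J_n - J_n H_0$ is defined, the standard argument can be written as follows; on $P_n \mathcal{H}$ the operator $J_n$ acts as the identification, so the integrand coincides with the one in (4.8).

```latex
% Cook's estimate (standard argument; the domain assumptions are ours).
\[
\frac{d}{du} \Big( e^{iuH} J_n e^{-iuH_0} F \Big)
  = i \, e^{iuH} \big( H J_n - J_n H_0 \big) e^{-iuH_0} F ,
\]
% Integrating from s to t and using that e^{iuH} is unitary on H_mu:
\[
\big\| e^{itH} J_n e^{-itH_0} F - e^{isH} J_n e^{-isH_0} F \big\|_{\mathcal{H}_\mu}
  \le \int_s^t \big\| ( H J_n - J_n H_0 ) \, e^{-iuH_0} F \big\|_{\mathcal{H}_\mu} \, du ,
  \qquad t > s \ge t_F .
\]
% If the integrand is integrable on [t_F, \infty), the right-hand side tends to 0
% as s, t -> \infty, so the family is Cauchy and the strong limit exists on F.
```

Boundedness of $J_n$ (Lemma 4) then extends the limit from a total subset to all of $P_n \mathcal{H}$.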

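We may take $F(\gamma) = Q_n(f_n, \cdot)$ with kernels $f_n = \phi_1 \otimes \cdots \otimes \phi_n$, where as before $\phi_1, \dots, \phi_n \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$, and denote $\phi_{i,\pm t} := e^{\pm it\Delta} \phi_i$, $i = 1, \dots, n$. Then, see (2.2),

\[
e^{\pm itH_0} Q_n(f_n, \gamma) = Q_n(\phi_{1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, \gamma),
\]

and using formulae (4.4) and (1.14), we get that

\[
(H - H_0) \, e^{\pm itH_0} Q_n(f_n, \gamma) = \frac{1}{2} \sum_{\substack{\tau \subset \{1, \dots, n\} \\ \tau \ne \emptyset}} (-1)^{|\tau| - 1} \big( |\tau| - 1 \big)! \, \langle K_{\phi_{\tau,\pm t}}, :\gamma^{\otimes 2}: \rangle \, Q_{|\bar\tau|}\Big( \bigotimes_{i \in \bar\tau} \phi_{i,\pm t}, \gamma \Big),
\]

where, by definition, $\phi_{\tau,\pm t} := \prod_{i \in \tau} \phi_{i,\pm t}$ and $\bar\tau := \{1, \dots, n\} \setminus \tau$. So,

\[
\big\| (H - H_0) \, e^{\pm itH_0} Q_n(f_n, \gamma) \big\|_{\mathcal{H}_\mu} \le \frac{1}{2} \sum_{\substack{\tau \subset \{1, \dots, n\} \\ \tau \ne \emptyset}} \big( |\tau| - 1 \big)! \, \Big\| \langle K_{\phi_{\tau,\pm t}}, :\gamma^{\otimes 2}: \rangle \, Q_{|\bar\tau|}\Big( \bigotimes_{i \in \bar\tau} \phi_{i,\pm t}, \gamma \Big) \Big\|_{\mathcal{H}_\mu},
\]

and because of the finite number of terms in the sum over $\tau$, (4.8) can be reduced to

\[
\int_{t_F}^{\infty} \Big\| \langle K_{\phi_{\tau,\pm t}}, :\gamma^{\otimes 2}: \rangle \, Q_{|\bar\tau|}\Big( \bigotimes_{i \in \bar\tau} \phi_{i,\pm t}, \gamma \Big) \Big\|_{\mathcal{H}_\mu} \, dt < \infty \qquad \forall \tau \subset \{1, \dots, n\}, \ \tau \ne \emptyset. \tag{4.9}
\]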

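We may assume that $\tau = \{1, \dots, k\}$, $k = 1, \dots, n$, $n \in \mathbb{N}$. Taking into account formula (1.8) and our assumption about the functions $\phi_1, \dots, \phi_n$:

\[
\langle \phi_{i,\pm t} \rangle = (2\pi)^{d/2} (\mathcal{F}\phi_i)(0) = 0,
\]

one can see that

\begin{align*}
Q_{n-k}\Big( \bigotimes_{i=k+1}^{n} \phi_{i,\pm t}, \gamma \Big)
&= \Big\langle \bigotimes_{i=k+1}^{n} \phi_{i,\pm t}, :(\gamma - z)^{\otimes(n-k)}: \Big\rangle \\
&= \Big\langle \bigotimes_{i=k+1}^{n} \phi_{i,\pm t}, :\gamma^{\otimes(n-k)}: \Big\rangle, \qquad k \in \mathbb{N},
\end{align*}

and so (4.9) is equivalent to

\[
\int_{t_F}^{\infty} \Big\| \langle K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}}, :\gamma^{\otimes 2}: \rangle \, \langle \phi_{k+1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, :\gamma^{\otimes(n-k)}: \rangle \Big\|_{\mathcal{H}_\mu} \, dt < \infty \qquad \forall k \in \{1, \dots, n\}, \ n \in \mathbb{N}. \tag{4.10}
\]

As follows from the definition of the measure $\mu$, formulae (4.2), (1.2) and Lemma 5 (see Appendix B),

\begin{align*}
&\Big\| \langle K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}}, :\gamma^{\otimes 2}: \rangle \, \langle \phi_{k+1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, :\gamma^{\otimes(n-k)}: \rangle \Big\|^2_{\mathcal{H}_\mu} \\
&\quad = \Xi_z^{-1} \int_\Gamma \Big| \langle K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}}, :\gamma^{\otimes 2}: \rangle \, \langle \phi_{k+1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, :\gamma^{\otimes(n-k)}: \rangle \Big|^2 \exp\Big[ -\frac{1}{2} \langle V, :\gamma^{\otimes 2}: \rangle \Big] \pi_z(d\gamma) \\
&\quad \le \Xi_z^{-1} \int_\Gamma \Big| \langle K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}}, :\gamma^{\otimes 2}: \rangle \, \langle \phi_{k+1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, :\gamma^{\otimes(n-k)}: \rangle \Big|^2 \exp\big[ \langle b, \gamma \rangle \big] \pi_z(d\gamma) \\
&\quad = \Xi_z^{-1} e^{\langle \tilde{b}, z \rangle} \int_\Gamma \Big| \langle K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}}, :\gamma^{\otimes 2}: \rangle \, \langle \phi_{k+1,\pm t} \otimes \cdots \otimes \phi_{n,\pm t}, :\gamma^{\otimes(n-k)}: \rangle \Big|^2 \pi_{z e^{b}}(d\gamma) \\
&\quad \le \Xi_z^{-1} e^{\langle \tilde{b}, z \rangle} \, P\Big( z e^{B}, |\operatorname{supp} a|, z \|\tilde{b}\|_{L_2(\mathbb{R}^d)}, \|\phi_i\|_{L_2(\mathbb{R}^d)}, \|\phi_{i,\pm t}\|_{L_\infty(\mathbb{R}^d)}, \ i = k+1, \dots, n \Big) \\
&\qquad \times \big\| K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}} \big\|^2_{L_2(\mathbb{R}^{2d})},
\end{align*}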

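where, as in the proof of Lemma 4, $\tilde{b} := e^b - 1$, $B = \operatorname{ess\,sup} b$, and $P$ is a non-negative polynomial of $2n + 3$ real variables $z e^{B}$, $|\operatorname{supp} a|$, $z \|\tilde{b}\|_{L_2(\mathbb{R}^d)}$, $\|\phi_{k+1}\|_{L_2(\mathbb{R}^d)}, \dots, \|\phi_n\|_{L_2(\mathbb{R}^d)}$, $\|\phi_{k+1,\pm t}\|_{L_\infty(\mathbb{R}^d)}, \dots, \|\phi_{n,\pm t}\|_{L_\infty(\mathbb{R}^d)}$. By (3.5), $\|\phi_{i,\pm t}\|_{L_\infty(\mathbb{R}^d)}$, $i = k+1, \dots, n$, are bounded for all $t > t_F$. So, consider $\| K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}} \|_{L_2(\mathbb{R}^{2d})}$. We have that

\begin{align*}
\big\| K_{\phi_{1,\pm t} \cdots \phi_{k,\pm t}} \big\|_{L_2(\mathbb{R}^{2d})}
&= \Big\| \big\langle \nabla_x V(x, y), \nabla(\phi_{1,\pm t} \cdots \phi_{k,\pm t})(x) \big\rangle_{\mathbb{R}^d} + \big\langle \nabla_y V(x, y), \nabla(\phi_{1,\pm t} \cdots \phi_{k,\pm t})(y) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} \\
&\le \Big\| \big\langle \nabla_x V(x, y), \nabla(\phi_{1,\pm t} \cdots \phi_{k,\pm t})(x) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} + \Big\| \big\langle \nabla_y V(x, y), \nabla(\phi_{1,\pm t} \cdots \phi_{k,\pm t})(y) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} \\
&= 2 \, \Big\| \big\langle \nabla_x V(x, y), \nabla(\phi_{1,\pm t} \cdots \phi_{k,\pm t})(x) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} \\
&= 2 \, \Big\| \sum_{i=1}^{k} \big\langle \nabla_x V(x, y), \nabla\phi_{i,\pm t}(x) \big\rangle_{\mathbb{R}^d} \prod_{\substack{j=1 \\ j \ne i}}^{k} \phi_{j,\pm t}(x) \Big\|_{L_2(\mathbb{R}^{2d})} \\
&\le 2 \sum_{i=1}^{k} \Big\| \big\langle \nabla_x V(x, y), \nabla\phi_{i,\pm t}(x) \big\rangle_{\mathbb{R}^d} \prod_{\substack{j=1 \\ j \ne i}}^{k} \phi_{j,\pm t}(x) \Big\|_{L_2(\mathbb{R}^{2d})} \\
&\le 2 \sum_{i=1}^{k} \Big\| \prod_{\substack{j=1 \\ j \ne i}}^{k} \phi_{j,\pm t}(x) \Big\|_{L_\infty(\mathbb{R}^d)} \Big\| \big\langle \nabla_x V(x, y), \nabla\phi_{i,\pm t}(x) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})}. \tag{4.11}
\end{align*}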

By (3.5), (4.10) will follow from the estimate

\[
\int_{t_F}^{\infty} \Big\| \big\langle \nabla_x V(x, y), \nabla\phi_{\pm t}(x) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} \, dt < \infty \qquad \forall \phi \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big). \tag{4.12}
\]

But because our potential $V$ has bounded support, it follows from the method of stationary phase (see the corollary after Theorem XI.14 in [17]) that $\big\| \langle \nabla_x V(x, y), \nabla\phi_{\pm t}(x) \rangle_{\mathbb{R}^d} \big\|_{L_2(\mathbb{R}^{2d})}$ vanishes faster than any power of $t$, and therefore (4.12) holds. Since $e^{itH_0}$ and $e^{itH}$ are contractions on $\mathcal{H}$ and $\mathcal{H}_\mu$, respectively, for all $t \in \mathbb{R}$, and $J_n : \mathcal{H} \to \mathcal{H}_\mu$ is bounded (Lemma 4), the above implies the existence of the wave operators $W^{\pm}(H, H_0, J_n)$ for all $n \in \mathbb{N}$. The proof of the equality $\sigma_{ac}(H) = \sigma_{ac}(H_0)$ is analogous to that in the proof of Theorem 3. $\square$

Corollary 1. Suppose, in addition to the assumptions of Theorem 4, that $V \ge 0$. Then the wave operators $W^{\pm}(H, H_0)$ defined by (4.6) exist for all $F \in \mathcal{H}$.

Proof. Since $\| \cdot \|_{\mathcal{H}_\mu} \le \| \cdot \|_{\mathcal{H}}$, the operators $e^{itH} e^{-itH_0}$, $t \in \mathbb{R}$, are uniformly bounded from $\mathcal{H}$ into $\mathcal{H}_\mu$. Hence, from the existence of the limits in (4.6) on the set of all polynomials (which is dense in $\mathcal{H}$), the existence of the wave operators $W^{\pm}(H, H_0)$ on the whole space $\mathcal{H}$ follows. $\square$

Appendix A

Here, just as in Sect. 4, we consider a local Gibbs perturbation of the Poisson measure $\pi_z$ and give another, simpler, proof of Theorem 4 for the case when the potential $V$ is a non-negative function¹. Namely, we verify the Cook criterion on the same exponents as in the proof of Theorem 3, i.e. for any $\phi \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$, $\phi > -1$, there exists $t_\phi > 0$ such that

\[
\int_{t_\phi}^{\infty} \Big\| (H - H_0) \exp\big[ \langle \log(1 + \phi_{\pm t}), \cdot \rangle \big] \Big\|_{\mathcal{H}_\mu} \, dt < \infty, \tag{A.1}
\]

where $\phi_{\pm t} := e^{\pm it\Delta} \phi$. Indeed, using the definition of the measure $\mu$, formula (4.4), the assumption about the positivity of $V$, and formulae (1.2), (3.10), one obtains that

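\begin{align*}
&\Big\| (H - H_0) \exp\big[ \langle \log(1 + \phi_{\pm t}), \cdot \rangle \big] \Big\|^2_{\mathcal{H}_\mu} \\
&\quad = \frac{1}{4} \, \Xi_z^{-1} \int_\Gamma \Big| \langle K_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \rangle \, e^{\langle \xi_{\pm t}, \gamma \rangle} \Big|^2 \exp\Big[ -\frac{1}{2} \langle V, :\gamma^{\otimes 2}: \rangle \Big] \pi_z(d\gamma) \\
&\quad \le \frac{1}{4} \, \Xi_z^{-1} \int_\Gamma \Big| \langle K_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \rangle \, e^{\langle \xi_{\pm t}, \gamma \rangle} \Big|^2 \pi_z(d\gamma) \\
&\quad = \frac{1}{4} \, \Xi_z^{-1} \int_\Gamma \big\langle K_{\xi_{\pm t}} \otimes \bar{K}_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes 2}: \big\rangle \exp\big[ 2 \langle \operatorname{Re} \xi_{\pm t}, \gamma \rangle \big] \pi_z(d\gamma) \\
&\quad = \frac{1}{4} \, \Xi_z^{-1} \exp\Big[ z \int_{\mathbb{R}^d} \big( e^{2 \operatorname{Re} \xi_{\pm t}(x)} - 1 \big) \, dx \Big] \int_\Gamma \big\langle K_{\xi_{\pm t}} \otimes \bar{K}_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes 2}: \big\rangle \, \pi_{z e^{2 \operatorname{Re} \xi_{\pm t}}}(d\gamma) \\
&\quad = \frac{1}{4} \, \Xi_z^{-1} \exp\Big[ z \|\phi\|^2_{L_2(\mathbb{R}^d)} \Big] \int_\Gamma \big\langle K_{\xi_{\pm t}} \otimes \bar{K}_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes 2}: \big\rangle \, \pi_{z e^{2 \operatorname{Re} \xi_{\pm t}}}(d\gamma),
\end{align*}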

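where just as in the proof of Theorem 2, $\xi_{\pm t} := \log(1 + \phi_{\pm t})$. Next, using formulae (1.7), (1.5), (3.9), (3.6) and Schwarz' inequality we get

\begin{align*}
&\int_\Gamma \big\langle K_{\xi_{\pm t}} \otimes \bar{K}_{\xi_{\pm t}}, :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes 2}: \big\rangle \, \pi_{z e^{2 \operatorname{Re} \xi_{\pm t}}}(d\gamma) \\
&\quad = \Big| \int_{\mathbb{R}^{2d}} K_{\xi_{\pm t}}(x_1, x_2) \, z e^{\operatorname{Re} \xi_{\pm t}(x_1)} \, z e^{\operatorname{Re} \xi_{\pm t}(x_2)} \, dx_1 dx_2 \Big|^2 \\
&\qquad + 4 \int_{\mathbb{R}^{3d}} K_{\xi_{\pm t}}(x_1, x_2) \, \bar{K}_{\xi_{\pm t}}(x_1, \tilde{x}_2) \, z e^{\operatorname{Re} \xi_{\pm t}(x_1)} \, z e^{\operatorname{Re} \xi_{\pm t}(x_2)} \, z e^{\operatorname{Re} \xi_{\pm t}(\tilde{x}_2)} \, dx_1 dx_2 d\tilde{x}_2 \\
&\qquad + 2 \int_{\mathbb{R}^{2d}} \big| K_{\xi_{\pm t}}(x_1, x_2) \big|^2 \, z e^{\operatorname{Re} \xi_{\pm t}(x_1)} \, z e^{\operatorname{Re} \xi_{\pm t}(x_2)} \, dx_1 dx_2 \\
&\quad \le \Big( \frac{3}{2} z \Big)^4 \Big( \int_{\mathbb{R}^{2d}} \big| K_{\xi_{\pm t}}(x_1, x_2) \big| \, dx_1 dx_2 \Big)^2
 + 4 \Big( \frac{3}{2} z \Big)^3 \int_{\mathbb{R}^{3d}} \big| K_{\xi_{\pm t}}(x_1, x_2) \big| \, \big| K_{\xi_{\pm t}}(x_1, \tilde{x}_2) \big| \, dx_1 dx_2 d\tilde{x}_2 \\
&\qquad + 2 \Big( \frac{3}{2} z \Big)^2 \int_{\mathbb{R}^{2d}} \big| K_{\xi_{\pm t}}(x_1, x_2) \big|^2 \, dx_1 dx_2 \\
&\quad \le \Big[ \Big( \frac{3}{2} z \Big)^4 |\operatorname{supp} a|^2 + 4 \Big( \frac{3}{2} z \Big)^3 |\operatorname{supp} a| + 2 \Big( \frac{3}{2} z \Big)^2 \Big] \int_{\mathbb{R}^{2d}} \big| K_{\xi_{\pm t}}(x, y) \big|^2 \, dx dy.
\end{align*}

Therefore, (A.1) is equivalent to

\[
\int_{t_\phi}^{\infty} \| K_{\xi_{\pm t}} \|_{L_2(\mathbb{R}^{2d})} \, dt < \infty \tag{A.2}
\]

for all $\phi \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$, $\phi > -1$. But because of (3.5) we can always choose $t_\phi$ so that for all $t > t_\phi$,

\[
\| \phi_{\pm t} \|_{L_\infty(\mathbb{R}^d)} < \frac{1}{2},
\]

and so

\begin{align*}
\| K_{\xi_{\pm t}} \|_{L_2(\mathbb{R}^{2d})}
&\le 2 \, \Big\| \Big\langle \nabla_x V(x, y), \frac{\nabla\phi_{\pm t}(x)}{1 + \phi_{\pm t}(x)} \Big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})} \\
&\le 4 \, \Big\| \big\langle \nabla_x V(x, y), \nabla\phi_{\pm t}(x) \big\rangle_{\mathbb{R}^d} \Big\|_{L_2(\mathbb{R}^{2d})},
\end{align*}

which reduces (A.2) to (4.12). $\square$

¹ In this case, $\mathcal{H}_\mu \supset \mathcal{H}$ and the wave operators $W^{\pm}(H, H_0)$ exist on the whole space $\mathcal{H}$, see Corollary 1.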

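Appendix B

This appendix is devoted to the proof of the following

Lemma 5. Let $K \in L_2(\mathbb{R}^d \times \mathbb{R}^d)$ be supported in $\Lambda \times \Lambda \subset \mathbb{R}^d \times \mathbb{R}^d$, $|\Lambda| < \infty$. Then, for all $n \in \mathbb{N}$, $\phi_1, \dots, \phi_n \in \mathcal{F}^{-1}\big(C_0^\infty(\mathbb{R}^d \setminus \{0\})\big)$, and any non-negative bounded measurable function $\tilde{z}$ on $\mathbb{R}^d$ such that $(\tilde{z} - z) \in L_2(\mathbb{R}^d)$ for some $z \ge 0$,

\begin{align*}
&\int_\Gamma \Big| \langle K, :\gamma^{\otimes 2}: \rangle \, \langle \phi_1 \otimes \cdots \otimes \phi_n, :\gamma^{\otimes n}: \rangle \Big|^2 \pi_{\tilde{z}}(d\gamma) \\
&\quad \le P\Big( \sup \tilde{z}, |\Lambda|, \|\tilde{z} - z\|_{L_2(\mathbb{R}^d)}, \|\phi_i\|_{L_2(\mathbb{R}^d)}, \|\phi_i\|_{L_\infty(\mathbb{R}^d)}, \ i = 1, \dots, n \Big) \, \| K \|^2_{L_2(\mathbb{R}^{2d})}, \tag{B.1}
\end{align*}

where $P$ is a non-negative polynomial of $2n + 3$ real variables.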

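Proof. The main steps of the proof are the same as those leading to estimating the integral

\[
\int_\Gamma \big| \langle K, :\gamma^{\otimes 2}: \rangle \big|^2 \pi_{\tilde{z}}(d\gamma)
\]

in Appendix A, with the only difference that now we cannot write down explicitly all the terms arising after applying formulae (1.7) and (1.5) to

\[
\int_\Gamma \big\langle K \otimes \bar{K} \otimes \phi_1 \otimes \cdots \otimes \phi_n \otimes \bar\phi_1 \otimes \cdots \otimes \bar\phi_n, \ :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes 2}: \otimes :\gamma^{\otimes n}: \otimes :\gamma^{\otimes n}: \big\rangle \, \pi_{\tilde{z}}(d\gamma).
\]

Nevertheless, we can divide them into three groups and give an explicit rule for estimating the terms from each of the groups, leading to formula (B.1).

Let us ascribe to the first group the terms corresponding to such partitions $\tilde\alpha$ in formula (1.7), every element of which contains no more than one number from the set $\{1, 2, 3, 4\}$. For example, as in

\[
\tilde\alpha = \big\{ \{1, 5, 6 + n\}, \{2\}, \{3\}, \{4\}, \{6, 7 + n\}, \{7\}, \{8\}, \dots, \{4 + 2n\} \big\}.
\]

The contribution of the corresponding term to the integral on the left-hand side of (B.1) can be written as follows:

\begin{align*}
&\int_{\mathbb{R}^{d(2n+1)}} K(x_1, x_2) \, \bar{K}(\tilde{x}_1, \tilde{x}_2) \, \phi_1(x_1) \phi_2(y_2) \phi_3(y_3) \phi_4(y_4) \cdots \phi_n(y_n) \\
&\qquad \times \bar\phi_1(\tilde{y}_1) \bar\phi_2(x_1) \bar\phi_3(y_2) \bar\phi_4(\tilde{y}_4) \cdots \bar\phi_n(\tilde{y}_n) \, (\tilde{z} dx)_{12} \, (\tilde{z} d\tilde{x})_{12} \, (\tilde{z} dy)_{2n} \, (\tilde{z} d\tilde{y}_1) \, (\tilde{z} d\tilde{y})_{4n} \\
&\quad = \langle \phi_3, \tilde{z} \rangle \langle \phi_4, \tilde{z} \rangle \cdots \langle \phi_n, \tilde{z} \rangle \, \langle \bar\phi_1, \tilde{z} \rangle \langle \bar\phi_4, \tilde{z} \rangle \cdots \langle \bar\phi_n, \tilde{z} \rangle \\
&\qquad \times \int_{\mathbb{R}^d} \phi_2(y_2) \bar\phi_3(y_2) \tilde{z}(y_2) \, dy_2 \int_{\Lambda^4} K(x_1, x_2) \, \bar{K}(\tilde{x}_1, \tilde{x}_2) \, \phi_1(x_1) \bar\phi_2(x_1) \, (\tilde{z} dx)_{12} \, (\tilde{z} d\tilde{x})_{12} \\
&\quad \le |\langle \phi_3, \tilde{z} \rangle| \, |\langle \phi_4, \tilde{z} \rangle| \cdots |\langle \phi_n, \tilde{z} \rangle| \, |\langle \bar\phi_1, \tilde{z} \rangle| \, |\langle \bar\phi_4, \tilde{z} \rangle| \cdots |\langle \bar\phi_n, \tilde{z} \rangle| \\
&\qquad \times \|\phi_2\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \, \|\bar\phi_3\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \, \|\phi_1\|_{L_\infty(\mathbb{R}^d)} \, \|\bar\phi_2\|_{L_\infty(\mathbb{R}^d)} \\
&\qquad \times \big( \sup \tilde{z} \, |\Lambda| \big)^2 \, \| K(x, y) \|^2_{L_2(\mathbb{R}^{2d}, \tilde{z} dx \, \tilde{z} dy)}.
\end{align*}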

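We note that the sign of complex conjugation inside the modulus and norm signs in the above formula is useful for inferring the general rule. It is easy to see that in the general situation the norm of $\phi_k$ or $\bar\phi_k$ is in $L_\infty(\mathbb{R}^d)$ if the number $4 + k$ or $4 + n + k$ is contained in some $\tilde\alpha_i$, $i = 1, \dots, \#\tilde\alpha$, together with one of the numbers 1, 2, 3 or 4, and is in $L_2(\mathbb{R}^d, \tilde{z} dx)$ if it is contained in some $\tilde\alpha_i$, $i = 1, \dots, \#\tilde\alpha$, together with any other number from 5 to $4 + 2n$. Otherwise, i.e., if the number $4 + k$ or $4 + n + k$ is contained in some $\tilde\alpha_i$, $i = 1, \dots, \#\tilde\alpha$, alone, we get $|\langle \phi_k, \tilde{z} \rangle|$ or $|\langle \bar\phi_k, \tilde{z} \rangle|$ correspondingly.

Next, we ascribe to the second group the terms corresponding to the partitions $\tilde\alpha$ with only one element containing more than one number from the set $\{1, 2, 3, 4\}$. For example, as in

\[
\tilde\alpha = \big\{ \{1, 3, 5, 6 + n\}, \{2\}, \{4\}, \{6, 7 + n\}, \{7\}, \{8\}, \dots, \{4 + 2n\} \big\},
\]

which gives the following contribution to the integral on the left-hand side of (B.1):

\begin{align*}
&\langle \phi_3, \tilde{z} \rangle \langle \phi_4, \tilde{z} \rangle \cdots \langle \phi_n, \tilde{z} \rangle \, \langle \bar\phi_1, \tilde{z} \rangle \langle \bar\phi_4, \tilde{z} \rangle \cdots \langle \bar\phi_n, \tilde{z} \rangle \\
&\qquad \times \int_{\mathbb{R}^d} \phi_2(y_2) \bar\phi_3(y_2) \tilde{z}(y_2) \, dy_2 \int_{\Lambda^3} K(x_1, x_2) \, \bar{K}(x_1, \tilde{x}_2) \, \phi_1(x_1) \bar\phi_2(x_1) \, (\tilde{z} dx)_{12} \, (\tilde{z} d\tilde{x}_2) \\
&\quad \le |\langle \phi_3, \tilde{z} \rangle| \, |\langle \phi_4, \tilde{z} \rangle| \cdots |\langle \phi_n, \tilde{z} \rangle| \, |\langle \bar\phi_1, \tilde{z} \rangle| \, |\langle \bar\phi_4, \tilde{z} \rangle| \cdots |\langle \bar\phi_n, \tilde{z} \rangle| \, \|\phi_2\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \\
&\qquad \times \|\bar\phi_3\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \, \|\phi_1\|_{L_\infty(\mathbb{R}^d)} \, \|\bar\phi_2\|_{L_\infty(\mathbb{R}^d)} \, \sup \tilde{z} \, |\Lambda| \, \| K(x, y) \|^2_{L_2(\mathbb{R}^{2d}, \tilde{z} dx \, \tilde{z} dy)}.
\end{align*}

Finally, to the third group, we ascribe the terms corresponding to the partitions $\tilde\alpha$ in formula (1.7) with two different elements containing more than one number from the set $\{1, 2, 3, 4\}$ (or to be precise, exactly by two of them), as in

\[
\tilde\alpha = \big\{ \{1, 3, 5, 6 + n\}, \{2, 4\}, \{6, 7 + n\}, \{7\}, \{8\}, \dots, \{4 + 2n\} \big\}.
\]

The corresponding contribution to the integral on the left-hand side of (B.1) can be estimated by

\begin{align*}
&|\langle \phi_3, \tilde{z} \rangle| \, |\langle \phi_4, \tilde{z} \rangle| \cdots |\langle \phi_n, \tilde{z} \rangle| \, |\langle \bar\phi_1, \tilde{z} \rangle| \, |\langle \bar\phi_4, \tilde{z} \rangle| \cdots |\langle \bar\phi_n, \tilde{z} \rangle| \\
&\qquad \times \|\phi_2\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \, \|\bar\phi_3\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \, \|\phi_1\|_{L_\infty(\mathbb{R}^d)} \, \|\bar\phi_2\|_{L_\infty(\mathbb{R}^d)} \, \| K(x, y) \|^2_{L_2(\mathbb{R}^{2d}, \tilde{z} dx \, \tilde{z} dy)}.
\end{align*}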

We then see that the only difference between the estimates of the terms from the different groups lies in the power of $\sup \tilde{z} \, |\Lambda|$. To conclude the proof of the lemma it is sufficient to note that

\[
\|\phi_i\|_{L_2(\mathbb{R}^d, \tilde{z} dx)} \le (\sup \tilde{z})^{1/2} \, \|\phi_i\|_{L_2(\mathbb{R}^d, dx)}, \qquad i = 1, \dots, n,
\]

\[
\| K(x, y) \|^2_{L_2(\mathbb{R}^{2d}, \tilde{z} dx \, \tilde{z} dy)} \le (\sup \tilde{z})^2 \, \| K(x, y) \|^2_{L_2(\mathbb{R}^{2d})},
\]

and

\[
|\langle \phi, \tilde{z} \rangle| = \Big| \int_{\mathbb{R}^d} \phi(x) \tilde{z}(x) \, dx \Big| = \Big| \int_{\mathbb{R}^d} \phi(x) \big( \tilde{z}(x) - z \big) \, dx \Big| \le \|\phi\|_{L_2(\mathbb{R}^d)} \, \|\tilde{z} - z\|_{L_2(\mathbb{R}^d)}
\]

for all $\phi \in \mathcal{F}^{-1}\big(S(\mathbb{R}^d \setminus \{0\})\big)$. $\square$

Acknowledgements. The first author was supported by the DFG (Deutsche Forschungsgemeinschaft) through the project no. AL 214/9–2. The second author, A.K., would like to thank the BiBoS Research Center, Bielefeld University for the warm hospitality during his stay in May of 1997. The support by the DFG (through SFB 343) is also gratefully acknowledged. G.S. was partially supported by the project “Euklidische Gibbs Zustände” no. X 271.7, Deutsch–Ukrainische WTZ. He also gratefully acknowledges the kind hospitality of the SFB 343 and Research Center BiBoS, Bielefeld University.

References

1. Albeverio, S., Kondratiev, Yu.G. and Röckner, M.: Analysis and geometry on configuration spaces I. J. Funct. Anal. 154, 444–500 (1998)
2. Albeverio, S., Kondratiev, Yu.G. and Röckner, M.: Analysis and geometry on configuration space II. The Gibbsian case. J. Funct. Anal. 157, 242–291 (1998)
3. Albeverio, S., Kondratiev, Yu.G. and Röckner, M.: Diffeomorphism groups and current algebras: Configuration space analysis in quantum theory. Rev. Math. Phys. 11, 1–23 (1999)
4. Berezansky, Yu.M. and Kondratiev, Yu.G.: Spectral methods in infinite-dimensional analysis. Dordrecht–Boston–London: Kluwer Academic Publishers, 1995
5. Bratteli, O. and Robinson, D.: Operator Algebras and Quantum Statistical Mechanics. Vol. 2, Berlin–Heidelberg–New York: Springer, 1997
6. Ito, Y.: On a generalization of non-linear Poisson functionals. Math. Rep. Toyama Univ. 3, 111–122 (1980)
7. Ito, Y.: Generalized Poisson functionals. Probability Theory and Related Fields 77, 1–28 (1988)
8. Ito, Y. and Kubo, I.: Calculus on Gaussian and Poisson white noises. Nagoya Math. J. 111, 41–84 (1988)
9. Kabanov, Ju.M.: On extended stochastic integrals. Teorija Verojatnostej i Primenenija 20, no. 4, 725–737 (1975) (in Russian)
10. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer-Verlag, 1966
11. Kondratiev, Yu.G. and Konstantinov, A.Yu.: The scattering problem for special perturbed harmonic systems. Selecta Mathematica formerly Sovietica 13, no. 3, 217–224 (1994)
12. Kondratiev, Yu.G., Lytvynov, E.W., Rebenko, A.L., Röckner, M. and Shchepan’uk, G.V.: Euclidean Gibbs states for quantum systems with Boltzmann statistics via cluster expansions. Methods of Functional Analysis and Topology 3, no. 1, 62–81 (1997)
13. Lytvynov, E.W., Rebenko, A.L. and Shchepan’uk, G.V.: Wick theorems in non-Gaussian white noise calculus. Rep. Math. Phys. 37, no. 2, 217–232 (1996)
14. Lytvynov, E.W., Rebenko, A.L. and Shchepan’uk, G.V.: Wick calculus on spaces of generalized functions of compound Poisson white noise. Rep. Math. Phys. 39, no. 2, 219–248 (1997)
15. Menikoff, R. and Sharp, D.: Representations of a local current algebra: Their dynamical determination. J. Math. Phys. 16, no. 12, 2341–2352
16. Rebenko, A.L. and Shchepan’uk, G.V.: The convergence of cluster expansion for continuous systems with many-body interaction. J. Stat. Phys. 88, no. 3/4, 665–689 (1997)
17. Reed, M. and Simon, B.: Methods of modern mathematical physics III. Scattering theory. New York–San Francisco–London: Academic Press, 1979
18. Shchepan’uk, G.V.: Poisson fields and distribution functions in statistical mechanics of charged particles. Ukrainian Math. J. 47, no. 5, 710–719 (1995)
19. Surgailis, D.: On multiple stochastic integrals and associated Markov semigroups. Probability and Math. Stat. 3, no. 2, 217–239 (1984)
20. Shchepan’uk, G.V. and Rebenko, A.L.: Poisson field approach to classical statistical mechanics of charged balls with Yukawa interaction. Preprint N 95.9, Institute for Mathematics of Ukrainian National Science Academy, Kyiv, 1995, 42 p.

Communicated by H. Araki

Commun. Math. Phys. 203, 445 – 463 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Adiabatic Theorem without a Gap Condition

Joseph E. Avron, Alexander Elgart

Department of Physics, Technion, 32000 Haifa, Israel. E-mail: [email protected]

Received: 3 December 1998 / Accepted: 7 December 1998

Abstract: We prove the adiabatic theorem for quantum evolution without the traditional gap condition. All that this adiabatic theorem needs is a (piecewise) twice differentiable finite dimensional spectral projection. The result implies that the adiabatic theorem holds for the ground state of atoms in a quantized radiation field. The general result we prove gives no information on the rate at which the adiabatic limit is approached. With additional spectral information one can also estimate this rate.

1. Introduction and Motivation

The adiabatic theorem of Quantum Mechanics describes the long time behavior of solutions of an initial value problem where the Hamiltonian generating the evolution depends slowly on time. The theorem relates these solutions to spectral information of the instantaneous Hamiltonian. Traditionally, the adiabatic theorem is stated for Hamiltonians that have an eigenvalue which is separated by a gap from the rest of the spectrum. Folk wisdom is that some form of a gap condition is a sine qua non for an adiabatic theorem to hold. This is based on the following simple but at the same time rather forceful argument: The notion of Hamiltonians that depend slowly on time makes sense provided the system in question has a finite intrinsic time scale which determines what slow and fast mean. In quantum mechanics the intrinsic time scale is often determined by the gaps in the spectrum (and Planck’s constant) [14]. For example, a harmonic oscillator with natural frequency $\omega$ has gaps in the spectrum whose size is $\hbar\omega$. The condition for adiabaticity is $|\dot\omega| \ll \omega^2$. In the $\omega \to 0$ limit the intrinsic time diverges and $\dot\omega \ne 0$ is never adiabatic. This suggests that one cannot expect a general adiabatic theorem to hold in the absence of gaps. It is, of course, conceivable that in the absence of a gap some other property may determine a relevant and intrinsic time scale.
For example, in the case of linearly crossing eigenvalues, the difference in slopes of the eigenvalues at the point of crossing, $\alpha = (\dot{E}_1 - \dot{E}_2)$, Fig. 1, determines a time scale, $\sqrt{1/\alpha}$, that takes over as the time scale associated with the gap diverges. An adiabatic theorem that builds on this fact goes back to Born and Fock [15]. But, at the same time, a general adiabatic theorem in the absence of a gap condition which does not use some other special properties, like a slope condition, seems unlikely and, on physical grounds, morally wrong.

[Figure 1: spec(H) plotted against s]

Fig. 1. Crossing Eigenvalues in Born Fock Theory

Nevertheless, the folk wisdom is actually wrong since we shall prove a general adiabatic theorem without a gap condition. All one really needs for the adiabatic theorem is a finite dimensional spectral projection for the Hamiltonian that depends smoothly on time. The role of the gap is to provide an a-priori rate at which the adiabatic limit is approached. In the absence of a gap, there is no such a-priori information on the rate at which the adiabatic limit is approached and it could be arbitrarily slow. Our approach to an adiabatic theorem without a gap condition has some of the flavor of an operator analog of the Riemann–Lebesgue lemma [46]. If a function and also its derivative are in L1 (R) then it is an elementary exercise that its Fourier transform decays at infinity at least as fast as an inverse power of the argument. The Riemann–Lebesgue lemma says that, in fact, the Fourier transform of any L1 (R) function vanishes at infinity. The loss of a-priori information about the derivative translates to loss of information about the rate at which the function vanishes at infinity. In this analogy differentiability is the analog of the gap condition, and the L1 (R) condition is the analog of the smoothness condition on the spectral projection. A gap condition is associated with spectral stability. Situations without a gap condition often lead to spectral instabilities. This may suggest that an adiabatic theorem without a gap condition may be an academic exercise in the sense that it may have no applications and that its premise, the existence of a smooth spectral projection, is either contrived or would be hard to establish in applications. For example, for applications to atomic physics, where the essential spectrum is absolutely continuous, [18], embedded eigenvalues tend to dissolve to resonances [47] so it is unlikely that the projection associated to an embedded eigenvalue would be continuous. Indeed, we do not know of an application to embedded eigenvalues. 
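The "elementary exercise" mentioned in the Riemann–Lebesgue analogy above can be spelled out in one line. The following is a sketch under an assumed convention for the Fourier transform (the normalization and sign are our choice, not fixed by the text); it shows how information on the derivative buys an explicit rate, which is exactly what is lost when only $f \in L^1(\mathbb{R})$ is known.

```latex
% Assumed convention (ours): \hat f(k) = \int_R f(x) e^{-ikx} dx.
% If f and f' are both in L^1(R), then f(x) -> 0 as |x| -> \infty,
% so the boundary terms vanish when integrating by parts:
\[
\hat f(k) = \int_{\mathbb{R}} f(x) \, e^{-ikx} \, dx
          = \frac{1}{ik} \int_{\mathbb{R}} f'(x) \, e^{-ikx} \, dx ,
\qquad \text{hence} \qquad
|\hat f(k)| \le \frac{\| f' \|_{L^1(\mathbb{R})}}{|k|} , \quad k \ne 0 .
\]
% Without control of f', the Riemann--Lebesgue lemma still gives
% \hat f(k) -> 0 as |k| -> \infty, but with no rate.
```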
An interesting application of the adiabatic theorem without a gap condition is to eigenvalues at threshold. A ground state at threshold is a feature of any reasonable model Hamiltonian for atoms interacting with a radiation field. Models that do not have this property describe unstable atoms, or stable atoms in a world that has no soft photons. Models of atom-photon systems have the property that when the fine structure constant, α, is small, the ground state describes the bound electrons of the atom and a photon field close to the vacuum. Soft photons are responsible for the absence of a gap in these models. A relatively simple yet interesting model for which the existence (and uniqueness) of the ground state [48,1] as well as gaplessness [28] are known rigorously


is the spin-boson Hamiltonian: The model of a two level system coupled to a radiation field. This has also been established for a model of non-relativistic QED [1,6–9]: A model of nonrelativistic electrons coupled to a radiation field with an ultraviolet cutoff. Unfortunately, for real QED [13,16], where both the electrons and photons are treated as relativistic quantum fields, all that is rigorously known at present is on a perturbative level. Our original motivation was to prove an adiabatic theorem for models describing

[Figure 2: spec(H) plotted against s; the shaded region is the continuous spectrum]

Fig. 2. An Eigenvalue at Threshold

atom-photon interaction. We proved this for the Dicke model, which is the simplest model of this kind, in [3]. We then realized that one could prove a much more general adiabatic theorem without a gap condition which is not specific to models of atoms in a radiation field, but would cover these as a special case. The adiabatic theorem without a gap condition resolves a problem regarding the relation between quantum mechanics with and without a radiation field. If the folk wisdom were true, and a gap condition were a necessary ingredient in the adiabatic theorem, one would expect the adiabatic theorem to hold for a two level system, but not for the spin-boson model. Since the spin-boson model is clearly a more accurate description of nature than the model of a two level system, the success of the adiabatic theorem in numerous applications where a two level model has been used would appear like a mystery. The fact that adiabatic theorems do not really need a gap condition means that, at least as far as the adiabatic theorem is concerned, quantum mechanics without radiation and quantum mechanics with radiation sit in the same basket. An interesting problem that we do not resolve here is to show that not only is the adiabatic theory of quantum mechanics (without radiation) qualitatively correct, but it is also quantitatively accurate. For the Dicke model, some results in this direction are given in [3].

2. Formulation of the Problem and the Main Result

To formulate the problem of the adiabatic theory more precisely it is convenient, and traditional, to replace the physical time $t$ by the scaled time $s = t/\tau$. One is then concerned with the solution of the initial value problem

\[
i \, \dot\psi_\tau(s) = \tau H(s) \, \psi_\tau(s), \tag{1}
\]

in the limit of large $\tau$. $H(s)$ is a self-adjoint Hamiltonian which depends sufficiently smoothly on $s$. $\psi_\tau$ is a vector (in Hilbert space) valued function. We shall be more specific about what we mean by smoothness below. $H(s)$ evolves slowly in physical time for a long interval of time with finite variation in $H(s)$. Quantum adiabatic theorems say that the solution of the initial value problem is characterized, in the adiabatic limit $\tau \to \infty$, by spectral information. There is no single adiabatic theorem. Different adiabatic theorems focus on different aspects of the problem: What is assumed about $H(s)$ and $\dot{H}(s)$; about properties of the projection $P(s)$; the notion of smoothness, and what are the optimal error estimates, etc. All have the following structure: Let $P(s)$ be an appropriate family of spectral projections for $H(s)$. Let the initial data be such that $\psi_\tau(0) \in \operatorname{Range} P(0)$. Then, for an appropriate value of $\gamma \ge 0$,

\[
\operatorname{dist}\big( \psi_\tau(s), \operatorname{Range} P(s) \big) \le O\big( \tau^{-\gamma} \big). \tag{2}
\]

For $\gamma = 0$ we take the right-hand side to mean $O(\tau^{-0}) = o(1)$. In the present work we shall restrict ourselves to the case where $\dot H(s)$ is compactly supported. Then we can, without loss, take $s \in [0, 1]$. Second, we shall restrict ourselves to uniform error estimates, i.e. error estimates that hold for all scaled times. This is actually the easier case. In adiabatic theory it is often possible to obtain much sharper results for times outside the support of $\dot H(s)$. Our main result is the following:

Theorem 1. Suppose that $P(s)$ is a smooth, finite-rank spectral projection for the bounded, smooth Hamiltonian $H(s)$. Then the evolution of the initial state $\psi_\tau(0) \in \operatorname{Range} P(0)$ is such that $\operatorname{dist}\bigl(\psi_\tau(s), \operatorname{Range} P(s)\bigr) \le o(1)$ for all $s \in [0, 1]$.

Remark 1. This is the weakest, but at the same time the simplest and most characteristic, of our results. As it stands, it does not even apply to the Schrödinger operator because $H(s)$ is assumed to be bounded. In Sect. 5 we shall state a generalization of this result to unbounded operators. There are two reasons why we have chosen to state the weaker result. The first is that we did not want to obscure the central issue, and what is new in this work, behind a mask of technicalities. The second is almost ideological. The adiabatic problem is an infrared, low energy, problem. The central issue in an adiabatic theorem without a gap condition is to control low energy excitations. The unboundedness of Schrödinger operators is an ultraviolet problem. This problem has well developed analytical tools [32,45,50], and has nothing to do with the core of the infrared problem of adiabatic evolution. Once one has an adiabatic theorem without a gap condition for bounded operators, the extension to unbounded ones is technical.

Remark 2. We have stated the theorem with a condition of smoothness. Much less than smoothness is needed, and in Sect. 5 we shall formulate a stronger result requiring only piecewise twice differentiability of $P(s)$. One reason why we have chosen to state a weaker result is again simplicity; the second is that it is likely that even the result in Sect. 5 is not optimal.

Remark 3. The theorem, as stated, does not cover the case of eigenvalue crossings. This is because at an eigenvalue crossing the spectral projection $P(s)$ is not smooth ($\operatorname{Tr} P(s)$ is discontinuous). Eigenvalue crossings can be handled by a method due to Kato [33], and in Sect. 5 we shall state a stronger version of the theorem that allows for finitely many crossings.

Adiabatic Theorem without a Gap Condition


2.1. The Results of Davies and Spohn. Davies and Spohn [19] studied the evolution of a driven, finite dimensional quantum system coupled to a heat bath. Their prime interest was the linear response of such a system, which is closely related to the adiabatic limit. They chose a Hamiltonian of the form

$$\tau\left(H_q(s) + H_f + \sqrt{\frac{1}{\tau}}\,H_i\right),$$

where $H_q(s)$ is the time dependent Hamiltonian of the driven, finite dimensional, quantum sub-system, $H_f$ is the Hamiltonian of a quasi-free fermion field, and $H_i$ is the interaction. The coupling vanishes in the adiabatic limit $\tau \to \infty$. They show that the induced evolution of the finite dimensional sub-system is governed by a (finite dimensional) Hamiltonian of the form $\tau H_q(s) + L(s)$. Davies and Spohn then proceed to analyze the evolution of this finite dimensional system using some of the ideas that enter into the adiabatic theory of Kato [33]. Davies and Spohn do not prove an adiabatic theorem in the sense that the physical evolution adheres to a spectral subspace of the coupled Hamiltonian.

3. A Panorama of Adiabatic Theorems

In this section we recall some of the basic adiabatic theorems: adiabatic theorems with a gap condition, adiabatic theorems for crossing eigenvalues, adiabatic theorems beyond all orders, and adiabatic theorems for scattering. We examine how these relate to the adiabatic theorem without a gap condition.

3.1. Adiabatic Theorems with a Gap Condition. The first satisfactory formulation and rigorous proof of an adiabatic theorem in the then new quantum mechanics was given in 1928 by Born and Fock [15]. They were motivated by a point of view advocated by Ehrenfest [20], which identified classical adiabatic invariants as the observables that get quantized. The theorem they proved was geared to show that quantum numbers are preserved by adiabatic deformations. Born and Fock proved an adiabatic theorem for Hamiltonian operators $H(s)$ with simple discrete spectrum. They showed that in Eq. (2) one can take $\gamma \ge 1$. Their proof covers Hamiltonians like the one-dimensional harmonic oscillator, but not the hydrogen atom, which has absolutely continuous spectrum at positive energies and eigenvalues with multiplicities at negative energies.

In 1958 Kato [33] initiated a new strategy for proving adiabatic theorems. He introduced a notion of adiabatic evolution which is purely geometric. It is associated with a natural connection in the bundle of spectral subspaces. Kato's method was to compare the geometric evolution with the evolution generated by $H(s)$ and to show that in the adiabatic limit the two coincide. Using this idea, Kato was able to relax the condition that $H(s)$ has simple discrete spectrum. He showed that the adiabatic theorem holds when $P(s)$ is a finite dimensional spectral projection associated with an isolated eigenvalue. No assumption on the spectral type of $H(s)$ restricted to $\operatorname{Range} P_\perp(s)$ need be made; see Fig. 3.


J. E. Avron, A. Elgart

[Figure 3: plot with axis labels spec(H) and s; the gap is marked.]

Fig. 3. Spectrum in Kato’s Theory

Kato’s results cover the case of the Schrödinger operator for the hydrogen atom. However, they do not cover Schrödinger operators that arise, for instance, in the study of condensed matter physics, where there is no discrete spectrum at all. Kato’s results were extended in [5,42] to $P(s)$ that need not be associated with an eigenvalue, and whose rank could also be infinite. In particular, the initial data could lie in a subspace corresponding to an energy band, provided it is separated by a gap from the rest of the spectrum.

3.2. Adiabatic Theorems beyond All Orders. There are interesting and more delicate adiabatic theorems that apply provided one considers Eq. (2) for times $s$ that lie outside the support of $\dot H(s)$. Assuming a gap condition and smoothness (or analyticity) of $H(s)$, it has been shown [23,24,11,35,43] that the adiabatic theorem Eq. (2) holds with $\gamma = \infty$. Stronger results hold in the analytic case [39,31,29,11].

3.3. Adiabatic Theorems with Eigenvalue Crossings. Born and Fock also studied the adiabatic theorem for crossing eigenvalues where the spectral projections have smooth continuations through the crossing point [15]. Born and Fock showed that if the crossing is of order $m$ (a linear crossing is $m = 1$) then Eq. (2) holds with $\gamma = 1/(m + 1)$. This problem was later studied in much detail in [22] and [27]. Kato [33] also considered the adiabatic theorems for crossing eigenvalues. He did not make any explicit assumptions about how the eigenvalues behave near crossings. The only assumption he did make was that $P(s)$ could be continued through the crossings, and that there are finitely many crossings. Under these conditions he showed that Eq. (2) holds with $\gamma = 0$.

3.4. Adiabatic Theorems without a Gap Condition. We are aware of one example of an adiabatic theorem without a gap condition for operators that have essential spectrum. This is a result of [4] for rank one perturbations of dense point spectra. The rate of approach to the adiabatic limit is $\gamma = 1$ in Eq. (2).

3.5. Adiabatic Theorems for the Scattering Matrix. Adiabatic scattering theory relates the time dependent scattering matrix to the time independent scattering matrix. Results in this direction are described in [41,39]. These have very little to do with the kind of adiabatic theorems we consider here. In scattering theory a time scale is determined by


the initial data: The scattered particle spends a finite amount of time in the region of interaction, and in the limit that the interaction varies slowly, it does not see the variation in the Hamiltonian. The adiabatic theorems we are interested in consider a particle that spends a long time in the region of interaction.

4. The Adiabatic Theorem and a Commutator Equation

In this section we shall describe the proof of Theorem 1. To simplify the presentation, we shall stay away from making optimal assertions. In Sect. 5 we shall strengthen the result, dropping most of the simplifying assumptions. The center of this section, and the heart of the adiabatic theorem, is the commutator equation, Eq. (7). It is an operator valued equation for two bounded operators $X$ and $Y$. If one sets $Y = 0$ one gets a commutator equation that goes back to Kato. The commutator equation with $Y = 0$ has a bounded solution $X$ provided there is a gap. If there is no gap the equation may, in some cases, have a bounded solution, but in general it will not. The basic idea behind the adiabatic theorem without a gap condition is that one can always solve this equation with $X$ bounded and $Y$ bounded and small. The smaller $Y$, the larger is the norm of $X$ in general, but this is all right, as we shall see.

In this section $H(s)$ is a family of bounded self-adjoint Hamiltonians that depends smoothly on $s$ so that $\dot H(s)$ is supported in the interval $[0, 1]$. $H(s)$ generates unitary evolution as the solution of the initial value problem:

$$i\,\dot U_\tau(s) = \tau H(s)\,U_\tau(s), \quad U_\tau(0) = 1, \quad s \in [0, 1]. \tag{3}$$

We assume, without loss, that $H(s)$ has eigenvalue 0 and that this eigenvalue has finite multiplicity. For this eigenvalue we formulate and prove our main result. We recall the notion of adiabatic evolution [33,5]. Let $U_A(s)$ be the solution of the initial value problem:

$$i\,\dot U_A(s) = \tau\left(H(s) + \frac{i}{\tau}\,[\dot P(s), P(s)]\right) U_A(s), \quad U_A(0) = 1, \quad s \in [0, 1]. \tag{4}$$

It is known that this unitary evolution has the intertwining property [5]:

$$U_A(s)\,P(0) = P(s)\,U_A(s). \tag{5}$$

That is, $U_A(s)$ maps $\operatorname{Range} P(0)$ onto $\operatorname{Range} P(s)$. In particular, the solution of the initial value problem

$$i\,\dot\psi(s) = \tau\left(H(s) + \frac{i}{\tau}\,[\dot P(s), P(s)]\right)\psi(s), \quad \psi(0) \in \operatorname{Range} P(0), \tag{6}$$

has the property that $\psi(s) \in \operatorname{Range} P(s)$. We shall show that the Hamiltonian evolution, $U_\tau(s)$, is close to the adiabatic evolution $U_A(s)$. We first formulate the basic lemma:

Lemma 1. Let $P(s)$, $s \in [0, 1]$, be a differentiable family of spectral projections for the self-adjoint Hamiltonian $H(s)$ with (operator) norm $\|\dot P(s)\| < \infty$. Suppose that the commutator equation

$$[\dot P(s), P(s)] = [H(s), X(s)] + Y(s) \tag{7}$$


has operator-valued solutions $X(s)$ and $Y(s)$, with $X(s)$, $\dot X(s)$ and $Y(s)$ bounded. Then

$$\bigl\|(U_\tau(s) - U_A(s))\,P(0)\bigr\| \le \max_{s\in[0,1]}\left[\frac{2\,\|X(s)P(s)\| + \bigl\|\tfrac{d}{ds}\bigl(X(s)P(s)\bigr)\,P(s)\bigr\|}{\tau} + \|Y(s)P(s)\|\right]. \tag{8}$$

The commutator equation, Eq. (7), can be viewed as a definition of $Y(s)$. The issue is not to find a solution to this equation, but rather to find solutions that make $Y$ small. In the case that there is a gap $\Delta$ separating the eigenvalue from the rest of the spectrum, a solution of the commutator equation is

$$X(s) = \frac{1}{2\pi i}\int_\Gamma R(z, s)\,\dot P(s)\,R(z, s)\,dz, \quad Y(s) = 0. \tag{9}$$

Here $\Gamma$ is a circle in the complex plane, centered at the eigenvalue, and of radius $\Delta/2$, Fig. 4. $R(z, s) = (H(s) - z)^{-1}$ is the resolvent at scaled time $s$. In this case the rate at which the adiabatic limit is obtained is seen from Eq. (8) to be $1/\tau$.

Fig. 4. A contour $\Gamma$ in the Complex Plane
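As a rough check of the $1/\tau$ rate in the gap case, the norm of $X(s)$ in Eq. (9) can be estimated directly on the contour. The sketch below writes $\Delta$ for the gap and $\Gamma$ for the contour, and assumes (beyond what is stated above) that the circle of radius $\Delta/2$ stays at distance at least $\Delta/2$ from the whole spectrum, so that the resolvent of the self-adjoint $H(s)$ satisfies $\|R(z,s)\| \le 2/\Delta$ on $\Gamma$.

```latex
\begin{align*}
\|X(s)\|
  &\le \frac{1}{2\pi}\,\underbrace{|\Gamma|}_{=\,\pi\Delta}\,
       \max_{z\in\Gamma}\|R(z,s)\|^{2}\,\|\dot P(s)\| \\
  &\le \frac{1}{2\pi}\cdot\pi\Delta\cdot\frac{4}{\Delta^{2}}\,\|\dot P(s)\|
   \;=\; \frac{2\,\|\dot P(s)\|}{\Delta}.
\end{align*}
```

With $Y = 0$, and provided $\dot X$ is bounded as well, the right-hand side of Eq. (8) is then of order $1/(\tau\Delta)$: the rate is $1/\tau$, with a constant that degrades as the gap closes.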

The strategy for proving the adiabatic theorem without a gap condition is to show that one can pick $Y$ so that its norm is arbitrarily small, possibly at the expense of a large norm for $X$ and $\dot X$. So long as the norms of $X$ and $\dot X$ are finite, this can be compensated by taking $\tau$ large. This means that one can make the right-hand side of Eq. (8) arbitrarily small. The price paid is that there is, generally speaking, no information about the rate at which the adiabatic limit is obtained.

Proof. Let $W(s) = U_\tau^\dagger(s)\,U_A(s)$ be the wave operator comparing the adiabatic and Hamiltonian evolution. Since

$$\|U_\tau(s) - U_A(s)\| = \bigl\|U_\tau(s)\bigl(1 - W(s)\bigr)\bigr\| = \|1 - W(s)\|, \tag{10}$$


we need to bound $W(s) - 1$. From the definition of the adiabatic evolution, the commutator equation, and the equation of motion,

$$\begin{aligned}
\dot W(s) &= U_\tau^\dagger(s)\,[\dot P(s), P(s)]\,U_A(s) = U_\tau^\dagger(s)\,[\dot P(s), P(s)]\,U_\tau(s)\,W(s) \\
&= U_\tau^\dagger(s)\bigl([H(s), X(s)] + Y(s)\bigr)U_\tau(s)\,W(s) \\
&= -\frac{i}{\tau}\Bigl(\dot U_\tau^\dagger(s)X(s)U_\tau(s) + U_\tau^\dagger(s)X(s)\dot U_\tau(s)\Bigr)W(s) + U_\tau^\dagger(s)Y(s)U_A(s) \\
&= -\frac{i}{\tau}\Bigl(\tfrac{d}{ds}\bigl(U_\tau^\dagger(s)X(s)U_\tau(s)\bigr) - U_\tau^\dagger(s)\dot X(s)U_\tau(s)\Bigr)W(s) + U_\tau^\dagger(s)Y(s)U_A(s) \\
&= -\frac{i}{\tau}\Bigl(\tfrac{d}{ds}\bigl(U_\tau^\dagger(s)X(s)U_\tau(s)W(s)\bigr) - U_\tau^\dagger(s)X(s)U_\tau(s)\dot W(s) - U_\tau^\dagger(s)\dot X(s)U_\tau(s)W(s)\Bigr) + U_\tau^\dagger(s)Y(s)U_A(s) \\
&= -\frac{i}{\tau}\Bigl(\tfrac{d}{ds}\bigl(U_\tau^\dagger(s)X(s)U_A(s)\bigr) - U_\tau^\dagger(s)X(s)\,[\dot P(s), P(s)]\,U_A(s) - U_\tau^\dagger(s)\dot X(s)U_A(s)\Bigr) + U_\tau^\dagger(s)Y(s)U_A(s).
\end{aligned} \tag{11}$$

The lemma then follows by integration since $W(s)$ is unitary with $W(0) = 1$. □

Let us describe a solution of the commutator equation which is motivated by the solution Eq. (9) in the case of a gap. In order to have explicit error estimates, and also in order to make the presentation simple and as elementary as possible, we choose a Gaussian regularizer.

Definition 1. Let $g$ and $e$ denote the Gaussian and Error functions,¹ and $\Phi$ be the special function defined below:

$$g(\omega) = e^{-\pi\omega^2}, \quad e(\omega) = \int_{-\infty}^{\omega} ds\, g(s), \quad \Phi(\omega) = \theta(\omega) - e(\omega), \tag{12}$$

where $\theta$ is the usual step function, which vanishes for negative argument. Also, let us denote the scaling of a function by

$$g_\Delta(\omega) = g(\Delta\omega), \tag{13}$$

and the multiplication operator by the argument by

$$(\omega g)(\omega) = \omega\, g(\omega). \tag{14}$$

An elementary lemma is:

¹ The error function we use differs by a factor and shift from the canonical error function.


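Lemma 2. $\Phi$ has finite $L^1$ norm and finite moments. In particular:

$$\|\Phi\|_1 = \frac{1}{\pi}, \quad \|\omega\Phi\|_1 = \frac{1}{4\pi}. \tag{15}$$

Under scaling, $\Delta > 0$:

$$\|\Phi_\Delta\|_1 = \frac{1}{\pi\Delta}, \quad \|\omega\Phi_\Delta\|_1 = \frac{1}{4\pi\Delta^2}. \tag{16}$$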
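The constants in Lemma 2 can be checked by hand. The following is a sketch of one way to do so (not necessarily the authors' argument). It writes $\Phi$ for the special function of Definition 1 and $\Delta$ for the scaling parameter, and uses only that $|\Phi|$ is even, that $\Phi(\omega) = \int_\omega^\infty g(s)\,ds$ for $\omega > 0$ (because $\int_{\mathbb R} g = 1$), and Fubini's theorem.

```latex
\begin{align*}
\|\Phi\|_1
  &= 2\int_0^\infty\!\!\int_\omega^\infty g(s)\,ds\,d\omega
   = 2\int_0^\infty s\,e^{-\pi s^2}\,ds
   = \frac{1}{\pi}, \\
\|\omega\Phi\|_1
  &= 2\int_0^\infty \omega\int_\omega^\infty g(s)\,ds\,d\omega
   = \int_0^\infty s^2 e^{-\pi s^2}\,ds
   = \frac{1}{4\pi}, \\
\|\Phi_\Delta\|_1
  &= \int_{\mathbb R}|\Phi(\Delta\omega)|\,d\omega
   = \frac{1}{\Delta}\,\|\Phi\|_1
   = \frac{1}{\pi\Delta}, \\
\|\omega\Phi_\Delta\|_1
  &= \int_{\mathbb R}|\omega\,\Phi(\Delta\omega)|\,d\omega
   = \frac{1}{\Delta^2}\,\|\omega\Phi\|_1
   = \frac{1}{4\pi\Delta^2}.
\end{align*}
```

The last two lines are just the substitution $\omega \mapsto \omega/\Delta$, which is why each extra power of $\omega$ costs one extra power of $1/\Delta$.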

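We assume, without loss, that the spectral projection $P(s)$ is associated with the eigenvalue zero.

Lemma 3. Let $P(s)$ be a smooth spectral projection for $H(s)$ associated with the eigenvalue zero. Let $\Gamma$ be an infinitesimal contour around the origin in the complex plane.² Then the commutator equation has the solution

$$\begin{aligned}
X_\Delta(s) &= A + A^\dagger, \quad A = P(s)\,\dot P(s)\,R(0, s)\Bigl(1 - g\Bigl(\frac{H(s)}{\Delta}\Bigr)\Bigr); \\
Y_\Delta(s) &= -g\Bigl(\frac{H(s)}{\Delta}\Bigr)\dot P(s)P(s) + P(s)\dot P(s)\,g\Bigl(\frac{H(s)}{\Delta}\Bigr),
\end{aligned} \tag{17}$$

where

$$\begin{aligned}
\|X_\Delta(s)P(s)\| &\le \frac{2\,\|\dot P(s)P(s)\|}{\Delta}, \\
\Bigl\|\tfrac{d}{ds}\bigl(X_\Delta(s)P(s)\bigr)\Bigr\| &\le \frac{2\bigl(\|\ddot P(s)\| + \|\dot P^2(s)\|\bigr)}{\Delta} + \frac{\pi\,\|\dot P(s)\|\,\|\dot H\|}{\Delta^2}.
\end{aligned} \tag{18}$$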

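Proof. We start with a formal calculation. Let

$$F_\Delta(s) = g\Bigl(\frac{H(s)}{\Delta}\Bigr) - P(s), \tag{19}$$

$$X_\Delta(s) = \frac{1}{2\pi i}\int_\Gamma dz\,\bigl(1 - F_\Delta(s)\bigr)\,R(z, s)\,\dot P(s)\,R(z, s)\,\bigl(1 - F_\Delta(s)\bigr). \tag{20}$$

Since $\dot P(s) = P(s)\dot P(s) + \dot P(s)P(s)$, $X_\Delta(s)$ can be written as a sum of two adjoint terms, one of which is

$$\begin{aligned}
&\frac{1}{2\pi i}\int_\Gamma dz\,\bigl(1 - F_\Delta(s)\bigr)\,R(z, s)\,P(s)\dot P(s)\,R(z, s)\,\bigl(1 - F_\Delta(s)\bigr) \\
&\quad= \frac{1}{2\pi i}\int_\Gamma dz\,\bigl(1 - F_\Delta(s)\bigr)\,P\dot P(s)\,\frac{R(z, s)}{z}\,\bigl(1 - F_\Delta(s)\bigr) \\
&\quad= P(s)\dot P(s)\,R(0, s)\bigl(1 - P(s)\bigr)\bigl(1 - F_\Delta(s)\bigr) \\
&\quad= P(s)\dot P(s)\,R(0, s)\bigl(1 - P(s) - F_\Delta(s)\bigr) \\
&\quad= P(s)\dot P(s)\,R(0, s)\Bigl(1 - g\Bigl(\frac{H(s)}{\Delta}\Bigr)\Bigr) = A.
\end{aligned} \tag{21}$$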

² The choice of Gaussian is not optimal. It would be more convenient to choose a regularizer which is a better approximant to a characteristic function, and the reader may want to think of a Gaussian which is flattened at the top.

We have used

$$P(s)\,F_\Delta(s) = F_\Delta(s)\,P(s) = \bigl(g(0) - 1\bigr)P(s) = 0. \tag{22}$$

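Using this integral representation of $X_\Delta(s)$ we now find $Y_\Delta(s)$. By our choice of $F_\Delta(s)$ we have $[F_\Delta(s), H(s)] = 0$. Hence,

$$\begin{aligned}
[X_\Delta(s), H(s)]
&= \frac{1}{2\pi i}\int_\Gamma \Bigl[\bigl(1 - F_\Delta(s)\bigr)R(z, s)\,\dot P(s)\,R(z, s)\bigl(1 - F_\Delta(s)\bigr),\, H(s) - z\Bigr]\,dz \\
&= \frac{1}{2\pi i}\int_\Gamma dz\,\bigl(1 - F_\Delta(s)\bigr)\bigl[R(z, s), \dot P(s)\bigr]\bigl(1 - F_\Delta(s)\bigr) \\
&= \bigl(1 - F_\Delta(s)\bigr)\bigl[P(s), \dot P(s)\bigr]\bigl(1 - F_\Delta(s)\bigr) \\
&= [P(s), \dot P(s)] - \bigl\{F_\Delta(s), [P(s), \dot P(s)]\bigr\} + F_\Delta(s)\,[P(s), \dot P(s)]\,F_\Delta(s) \\
&= [P(s), \dot P(s)] + F_\Delta(s)\dot P(s)P(s) - P(s)\dot P(s)F_\Delta(s).
\end{aligned} \tag{23}$$

So a solution of the commutator equation is

$$Y_\Delta(s) = -g\Bigl(\frac{H(s)}{\Delta}\Bigr)\dot P(s)P(s) + P(s)\dot P(s)\,g\Bigl(\frac{H(s)}{\Delta}\Bigr). \tag{24}$$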

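It remains to estimate the norms of $X$ and $\dot X$. Using the fact that the Gaussian is its own Fourier transform,

$$g\Bigl(\frac{H(s)}{\Delta}\Bigr) = \Delta\int_{\mathbb R} g(\Delta t)\,\exp[2\pi i t H(s)]\,dt, \tag{25}$$

one checks that with our choice of $\Phi$

$$R(0, s)\Bigl(1 - g\Bigl(\frac{H(s)}{\Delta}\Bigr)\Bigr) = 2\pi i\int_{\mathbb R}\Phi(t\Delta)\,\exp[2\pi i t H(s)]\,dt. \tag{26}$$

Hence

$$\Bigl\|R(0, s)\Bigl(1 - g\Bigl(\frac{H(s)}{\Delta}\Bigr)\Bigr)\Bigr\| \le 2\pi\,\|\Phi_\Delta\|_1 = \frac{2}{\Delta}. \tag{27}$$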

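Using the expression (17) for $X_\Delta(s)$, this estimate proves the bound on $X_\Delta(s)P(s)$. To get a bound on the derivative of $X_\Delta(s)P(s)$, use the Duhamel formula,

$$\frac{d}{ds}\exp\bigl(2\pi i t H(s)\bigr) = 2\pi i\,t\int_0^1 dz\; e^{2\pi i z t H(s)}\,\dot H(s)\,e^{2\pi i (1 - z) t H(s)}. \tag{28}$$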

Collecting the various terms gives the claimed estimate. □

As Lemma 3 shows, as $\Delta$ shrinks, the norms of $X_\Delta(s)$ and $\dot X_\Delta(s)$ may, and in general will, grow. This, however, is of no concern as long as the norms remain finite, for one can always compensate for this growth by choosing $\tau$ large enough. The good thing about shrinking $\Delta$ is that this can be used to make the norm of $Y_\Delta$ small. Hence, we can always make the right-hand side of Eq. (8) arbitrarily small.
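The trade-off just described can be made explicit. As a sketch, combine Eq. (8) with the bounds (18) of Lemma 3, reading the second bound in (18) as a bound on the derivative of $X_\Delta(s)P(s)$. Here $\Delta$ is the width of the Gaussian regularizer and $\eta > 0$ is a hypothetical target accuracy, introduced only for this illustration.

```latex
\begin{align*}
\bigl\|(U_\tau(s)-U_A(s))\,P(0)\bigr\|
  \;\le\; \frac{1}{\tau}\,\max_{s\in[0,1]}\Bigl(
      \frac{4\,\|\dot P P\| + 2\bigl(\|\ddot P\| + \|\dot P^{2}\|\bigr)}{\Delta}
      + \frac{\pi\,\|\dot P\|\,\|\dot H\|}{\Delta^{2}}\Bigr)
  \;+\; \max_{s\in[0,1]}\|Y_\Delta(s)P(s)\| .
\end{align*}
% Step 1: choose \Delta so small that \max_s \|Y_\Delta(s) P(s)\| < \eta/2.
% Step 2: with \Delta now fixed, choose \tau so large that the first term is < \eta/2.
```

Step 1 requires knowing that $\|Y_\Delta(s)P(s)\|$ tends to zero as $\Delta \to 0$. Because nothing is assumed about how fast this happens, the argument yields $o(1)$ but no rate.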


Lemma 4. Suppose that $H(s)$ is smooth with a zero eigenvalue whose spectral projection $P(s)$ is smooth and of finite rank. Let $g\bigl(H(s)/\Delta\bigr)$ be as above. Then $\|Y_\Delta(s)P(s)\| = \bigl\|g\bigl(H(s)/\Delta\bigr)\dot P(s)P(s)\bigr\| \to 0$ uniformly as $\Delta$ shrinks to zero.

Remark 4. We owe the proof below to Michael Aizenman.

Proof. For the sake of simplicity suppose that $P(s)$ is a one-dimensional projection with $P(s)\psi(s) = \psi(s)$, where $\psi$ is normalized to 1. Let $\varphi = \dot P(s)\psi$. Then, using $P(s)\varphi(s) = P(s)\dot P(s)\psi(s) = P(s)\dot P(s)P(s)\psi(s) = 0$, we obtain

$$\begin{aligned}
\Bigl\|g\Bigl(\frac{H(s)}{\Delta}\Bigr)\dot P(s)P(s)\Bigr\|^2
&= \Bigl\|g\Bigl(\frac{H(s)}{\Delta}\Bigr)\dot P(s)\psi(s)\Bigr\|^2
= \Bigl\|g\Bigl(\frac{H(s)}{\Delta}\Bigr)\varphi(s)\Bigr\|^2 \\
&= \Bigl\langle\varphi\Bigm| g^2\Bigl(\frac{H(s)}{\Delta}\Bigr)\Bigm|\varphi\Bigr\rangle
= \int_{\sigma(H(s))} g^2(x/\Delta)\,d\mu_\varphi(x),
\end{aligned} \tag{29}$$

where $\mu_\varphi$ denotes the spectral measure. Now, $g(x/\Delta)$ is bounded by one, goes monotonically to zero for all $x \neq 0$, and $g(0) = 1$. Hence, by dominated convergence,

$$\lim_{\Delta\to 0}\int_{\sigma(H(s))} g^2(x/\Delta)\,d\mu_\varphi(x) = \mu_\varphi(0) = 0, \tag{30}$$

where the last equality holds because $P(s)\varphi = 0$. It follows that there is a sequence of $\Delta$ that makes $Y_\Delta(s)$ arbitrarily small. □

This completes the proof of Theorem 1. The physical interpretation of the adiabatic theorem without a gap condition is that although the adiabatic theorem “always” holds, it does so for different physical mechanisms. In the case that there is a gap in the spectrum, the adiabatic theorem holds because the eigenstate is protected by a gap from tunneling out of the spectral subspace. In the case that there is no gap and the spectrum near the relevant eigenvalue is essential, the adiabatic theorem holds because essential spectrum is associated with states that are supported near spatial infinity. There is little tunneling to these states because of their small overlap with the wave function corresponding to an eigenvalue, which is supported away from infinity.

5. Fine Print

In this section we extend the adiabatic theorem without a gap condition to unbounded self-adjoint operators, replace the smoothness condition by a condition on differentiability, and allow eigenvalue crossings. These extensions are technical in character and rely on existing machinery.

5.1. Unbounded Hamiltonians. The first, and perhaps the main, difficulty with unbounded operators $H(s)$ is the existence of solutions to the initial value problem, Eq. (1). For bounded operators the existence is a consequence of the Dyson formula, see e.g. Theorem X.59 in [45]. For unbounded operators existence is more subtle, so we choose a class for which existence holds:

Definition 2. A family of (possibly unbounded) self-adjoint Hamiltonians $H(s)$ is admissible if


1. The operators $H(s)$ have a common domain in Hilbert space for all $s \in [0, 1]$.
2. $H(s)$ is bounded from below by $\Lambda$.
3. $R(i, s)$ is bounded and differentiable and $\dot H(s)R(i, s)$ is bounded.

It is a consequence of our definition of admissibility that $A(t) = (H(t) - \Lambda + 1)$ is a strictly positive operator. Moreover, it follows from property (1) by the closed graph theorem that $A(t)A(s)^{-1}$ is bounded. Since, for $t - s$ small, $\|(t - s)^{-1}(A(t)A(s)^{-1} - I)\| = \|\dot A(s)A(s)^{-1}\| + o(|t - s|)$, the last expression is bounded due to property (3). The existence of the unitary evolution for an admissible family of Hamiltonians now follows from ([45], Theorem X.70):

Theorem 2. Let $X$ be a Banach space and let $I$ be an open interval in $\mathbb R$. For each $t \in I$, let $A(t)$ be the generator of a contraction semigroup on $X$ so that $0 \in \rho(A(t))$ and

1. The $A(t)$ have a common domain $D$.
2. For each $\varphi \in X$, $(t - s)^{-1}(A(t)A(s)^{-1} - I)\varphi$ is uniformly strongly continuous and uniformly bounded in $s$ and $t$ for $t \neq s$ lying in any fixed compact subinterval of $I$.
3. For each $\varphi \in X$, $C(t)\varphi \equiv \lim_{t\to s}(t - s)^{-1}(A(t)A(s)^{-1} - I)\varphi$ exists uniformly for $t$ in each compact subinterval, and $C(t)$ is bounded and strongly continuous in $t$.

Then unitary evolution exists uniformly in $s$.

Then we can prove the following result.

Theorem 3. Suppose that $P(s)$ is a finite-rank spectral projection, which is at least twice differentiable (as a bounded operator), for an admissible family $H(s)$. Then the evolution of the initial state $\psi_\tau(0) \in \operatorname{Range} P(0)$, according to Eq. (1), is such that for all $s \in [0, 1]$, $\operatorname{dist}\bigl(\psi_\tau(s), \operatorname{Range} P(s)\bigr) \le o(1)$.

Proof. Tracing the steps in Theorem 1 one sees that it is enough to check that the operators $X(s)$, $\dot X(s)$ and $Y(s)$ are bounded uniformly in $s$. Now, by Eq. (17), $X_\Delta$ and $Y_\Delta$ are made of bounded operators such as $P$, $\dot P$ and $R$. Moreover, $X_\Delta$ is also differentiable as a bounded operator, by our assumption that $P$ is twice differentiable and by the admissibility condition that guarantees that $R$ is differentiable as a bounded operator. By the functional calculus $g(H(s))$ is also differentiable as a bounded operator. The only change is in the explicit estimate of the derivative term in (18) in terms of $\Delta$, which is replaced by

$$\Bigl\|\tfrac{d}{ds}\bigl(X_\Delta(s)P(s)\bigr)\Bigr\| \le \frac{2\bigl(\|\ddot P(s)\| + \|\dot P^2(s)\|\bigr)}{\Delta} + \frac{\pi\,\|\dot P(s)(H(s) + i)\|\,\|R(i, s)\dot H\|}{\Delta^2}, \tag{31}$$

which is bounded for admissible $H(s)$. □

5.2. Piecewise Differentiability and Eigenvalue Crossing. If at some time $0 < s_0 < 1$ a crossing of eigenvalues occurs, then the spectral projection associated with one of the eigenvalues, $P(s)$, is discontinuous at $s_0$ since its rank jumps. Suppose that $P(s)$, $s \neq s_0$, is a spectral projection whose limits from the right and from the left coincide at $s_0$. In this case we can use an argument of Kato [33] that shows that global continuity together with piecewise smoothness is good enough.


Kato’s argument goes as follows: Choose a small $\varepsilon$. The physical evolution follows the adiabatic evolution up to an arbitrarily small error on the interval $[0, s_0 - \varepsilon]$. On the short interval $[s_0 - \varepsilon, s_0 + \varepsilon]$ the physical evolution takes $\operatorname{Range} P(s_0 - \varepsilon)$ close to itself. Since $P(s)$ is continuous at $s_0$, by assumption, this is equivalent to the statement that the physical evolution takes $\operatorname{Range} P(s_0 - \varepsilon)$ close to $\operatorname{Range} P(s_0 + \varepsilon)$, with an error that can be made arbitrarily small with $\varepsilon$. The physical evolution then follows the adiabatic evolution up to an arbitrarily small error on the interval $[s_0 + \varepsilon, 1]$. Summarizing we have:

Theorem 4. Suppose that $P(s)$, $s \neq s_0 \in [0, 1]$, is a finite-rank spectral projection which is piecewise twice differentiable (as a bounded operator) and is everywhere continuous on $[0, 1]$. Then the initial data $\psi_\tau \in \operatorname{Range} P(0)$ evolve according to Eq. (1) so that $\operatorname{dist}\bigl(\psi_\tau(s), \operatorname{Range} P(s)\bigr) \le o(1)$ for all $s \in [0, 1]$.

6. The Rate of Approach to the Adiabatic Limit

The general adiabatic theorems we have formulated give no information on the rate at which the adiabatic limit is approached. In fact, from the results of Born and Fock and of Kato about eigenvalue crossings, it is clear that in the absence of a gap the rate can be arbitrarily slow. Obtaining interesting results on the rate at which the adiabatic limit is approached necessarily involves additional spectral information. In particular, if the bound state is either embedded or at the threshold of essential spectrum, with good behavior of the spectral measure at nearby energies, one expects to do better. An illustration of such estimates is given below. Recall [36] that a (Borel) measure $\mu$ is called (uniformly) $\alpha$-Hölder continuous, $\alpha \in [0, 1]$, if there is a constant $C$ such that for every interval $\Delta$ with $|\Delta| < 1$,³

$$\mu(\Delta) < C\,|\Delta|^\alpha. \tag{32}$$
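Two elementary examples may help fix the definition. They are illustrations added here, not taken from [36]; $\Delta$ denotes an interval of positive length and $|\Delta|$ its Lebesgue measure, and the constant $C = 2$ is chosen only so that the strict inequality in (32) holds.

```latex
\begin{align*}
&\text{Lebesgue measure on } [0,1]:
  && \mu(\Delta) \le |\Delta| < 2\,|\Delta|
  && \Longrightarrow\ \alpha = 1,\ C = 2; \\
&d\mu(x) = \tfrac12\,x^{-1/2}\,dx \ \text{on } (0,1]:
  && \mu([a,b]) = \sqrt{b}-\sqrt{a} \le \sqrt{b-a} < 2\,|\Delta|^{1/2}
  && \Longrightarrow\ \alpha = \tfrac12,\ C = 2 .
\end{align*}
```

The second measure is not $\alpha$-Hölder continuous for any $\alpha > 1/2$, since $\mu((0,\delta]) = \sqrt{\delta}$. A point mass satisfies (32) only with $\alpha = 0$.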

The interest in such measures comes from the fact [36,17,26] that they carry dynamical information and $\alpha$-continuous measures are the limits of $\alpha$-Hölder continuous measures. Knowing something about the Hausdorff dimension of the spectrum [36] then translates to information about the rate of approach to the adiabatic limit.

Corollary 1. If the spectral measure $\mu_\varphi$ is $\alpha$-Hölder continuous, then the adiabatic limit is approached at least at rate $\gamma = \frac{\alpha}{2+\alpha}$. In the case of a family of Hamiltonians related by unitaries, $H(s) = V(s)HV^\dagger(s)$, with $\dot V V^\dagger$ bounded and differentiable, the rate is at least $\gamma = \frac{\alpha}{1+\alpha}$.

Proof. Let us note, first of all, that if the spectral measure $\mu_\varphi$ is $\alpha$-Hölder continuous then the integral in (30) is bounded by $\tilde C|\Delta|^\alpha$. Indeed,

$$\begin{aligned}
\int_{\sigma(H(s))} g^2(x/\Delta)\,d\mu_\varphi(x)
&< \sum_{n=0}^{\infty} g^2(n)\left(\int_{n\Delta}^{(n+1)\Delta} d\mu_\varphi(x) + \int_{-(n+1)\Delta}^{-n\Delta} d\mu_\varphi(x)\right) \\
&< C\,|\Delta|^\alpha\sum_{n=-\infty}^{\infty} g^2(n) = \tilde C\,|\Delta|^\alpha.
\end{aligned} \tag{33}$$

³ $|\cdot|$ denotes Lebesgue measure.


Collecting the various error estimates, one gets for the right-hand side of Eq. (8) the upper bound

$$\frac{A}{\Delta\tau} + \frac{B}{\Delta^2\tau} + \tilde C\,|\Delta|^\alpha. \tag{34}$$

$A$, $B$ and $C$ are constants. For the case of the family of Hamiltonians related by unitaries, by Eq. (39) below, $B = 0$. Optimizing the choice of $\Delta$ gives the result. □

6.1. Unitary Families. By unitary families we mean the special case where the family $H(s)$ has the form

$$H(s) = V(s)\,H\,V^\dagger(s), \tag{35}$$

with $V(s)$ unitary. There are two points that we want to make about unitary families. The first is that such families are interesting in the context of adiabatic dynamics from the perspective of applications. The second is that there is some simplification that occurs for such families. In a moving frame, the Schrödinger equation, Eq. (1), for $\psi_\tau = V\varphi_\tau$ takes the form:

$$i\,\dot\varphi_\tau = \tau\left(H + \frac{i}{\tau}\,V^\dagger(s)\dot V(s)\right)\varphi_\tau. \tag{36}$$

This leads to a time independent Hamiltonian in the very special case $V(s) = e^{is\sigma}$, with $\sigma$ a fixed self-adjoint operator. The general case of unitary families, even in the rotating frame, leads to a time dependent problem, albeit one with a weak time dependent perturbation. As this perturbation is allowed to act for a long time, there is no obvious simplification in the rotating frame.

Unitary families often enter in applications. See, for example, M. Berry's model of a spin one-half in a magnetic field [11], $H(s) = B(s)\cdot\sigma$, where $\sigma$ is a vector of Pauli matrices and $B(s)$ is a vector in $\mathbb R^3$ of unit length. This is a unitary family, which has all the intricacies of adiabatic theory associated with e.g. Zener tunneling [29].

Now we come to the simplification. In the case of unitary families one can improve the estimate of the norm of $\dot X$, which affects the estimate of the rate $\gamma$. In the case of a unitary family

$$P(s) = V(s)\,P\,V^\dagger(s), \quad \dot P(s) = [\dot V(s)V^\dagger(s), P(s)]. \tag{37}$$

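Hence, applying Eq. (37) to $X_\Delta(s) = A(s) + A^\dagger(s)$, where $A$, $A^\dagger$ are given by Eq. (17), we derive

$$A(s) = V(s)\,P\,[V^\dagger(s)\dot V(s), P]\,R(0, 0)\Bigl(1 - g\Bigl(\frac{H}{\Delta}\Bigr)\Bigr)V^\dagger(s). \tag{38}$$

Therefore,

$$\dot A(s) = [\dot V(s)V^\dagger(s), A(s)] + V(s)\,P\,\Bigl[\tfrac{d}{ds}\bigl(V^\dagger(s)\dot V(s)\bigr), P\Bigr]\,R(0, 0)\Bigl(1 - g\Bigl(\frac{H}{\Delta}\Bigr)\Bigr)V^\dagger(s). \tag{39}$$

This identity is the reason why, for unitary families, one gets an improved rate with $\gamma = \frac{\alpha}{\alpha+1}$.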
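The two rates can be recovered from the bound (34) by balancing its terms. The computation below is a sketch that treats $A$, $B$ and $\tilde C$ as fixed positive constants and ignores them when choosing the regularization width $\Delta$.

```latex
\begin{align*}
\text{General case } (B \neq 0):\quad
  & \frac{1}{\Delta^{2}\tau} \sim \Delta^{\alpha}
    \ \Longrightarrow\ \Delta = \tau^{-1/(2+\alpha)}, \\
  & \frac{A}{\Delta\tau} + \frac{B}{\Delta^{2}\tau} + \tilde C\,\Delta^{\alpha}
    = A\,\tau^{-(1+\alpha)/(2+\alpha)} + (B+\tilde C)\,\tau^{-\alpha/(2+\alpha)}
    = O\bigl(\tau^{-\alpha/(2+\alpha)}\bigr); \\[4pt]
\text{Unitary family } (B = 0):\quad
  & \frac{1}{\Delta\tau} \sim \Delta^{\alpha}
    \ \Longrightarrow\ \Delta = \tau^{-1/(1+\alpha)}, \\
  & \frac{A}{\Delta\tau} + \tilde C\,\Delta^{\alpha}
    = (A+\tilde C)\,\tau^{-\alpha/(1+\alpha)}
    = O\bigl(\tau^{-\alpha/(1+\alpha)}\bigr).
\end{align*}
```

In the general case the $1/\Delta^2$ term dominates the $1/\Delta$ term for small $\Delta$, which is why removing it ($B = 0$) improves the exponent.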


6.2. Friedrichs Models. Hölder continuity of the spectral measure gave an estimate of the rate $\gamma = \frac{\alpha}{\alpha+2} \le \frac13$ in the general case and $\gamma = \frac{\alpha}{\alpha+1} \le \frac12$ in the case of unitary families. Presumably, neither is optimal, since we used the additional spectral information only to estimate the norm of $Y_\Delta$, but not to improve the estimate on $X_\Delta$ and $\dot X_\Delta$. As a consequence, the best rate we get is $\gamma = \frac12$. It is intriguing that for classical ergodic systems the approach to the adiabatic limit in the classical adiabatic theorem is with rate $\gamma = \frac12$ [44]. This does not imply that the rate of approach to the adiabatic limit must be slow compared to the rate with a gap. In this subsection we shall consider a class of models, patterned after Friedrichs [21], where a more precise estimate of $\gamma$ can be made and where $\gamma$ can also take the value 1 in the absence of a gap.

Let us consider the family of unitarily related Hamiltonians $H(s) = V(s)HV^\dagger(s)$. At any given time $s$ there exists a representation of the Hilbert space such that $\mathcal H_s = \mathbb C \oplus L^2(\mathbb R^d, d\mu(k))$ with $\inf(\operatorname{support}\mu) > -\infty$ and $\mu(0) = 0$. A vector $\Psi \in \mathcal H$ is normalized by

$$\Psi = \begin{pmatrix}\omega \\ \psi(k)\end{pmatrix}, \quad \|\Psi\|^2 = |\omega|^2 + \int_{\mathbb R^d}|\psi(k)|^2\,d\mu(k), \quad \omega \in \mathbb C. \tag{40}$$

The (Friedrichs) Hamiltonian $H(s)$ in this representation acts on $\mathcal H_s$ like so:

$$H(s)\,\Psi = \begin{pmatrix}0 & 0 \\ 0 & k\end{pmatrix}\begin{pmatrix}\omega \\ |\psi\rangle\end{pmatrix} = \begin{pmatrix}0 \\ |k\psi\rangle\end{pmatrix}. \tag{41}$$

The projection $P(s)$ has the form

$$P(s) = \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix}, \tag{42}$$

and the formal (reduced) resolvent $R(s)$ is given by

$$R(s) = \begin{pmatrix}0 & 0 \\ 0 & k^{-1}\end{pmatrix}. \tag{43}$$

The time dependence of this unitary family can be encoded in the rate of change of two operators, namely

$$\dot P(s) = \begin{pmatrix}0 & \langle f_s| \\ |f_s\rangle & 0\end{pmatrix} \tag{44}$$

and

$$\bigl(1 - P(s)\bigr)\bigl(\dot V(s)V^\dagger(s)\bigr)P(s) = \begin{pmatrix}0 & 0 \\ |g_s\rangle & 0\end{pmatrix}. \tag{45}$$

Suppose that $\dot V(s)V^\dagger(s)$ is bounded, has a bounded derivative, and

$$\int_{B_I}|f_s|^2\,d\mu(k) \le O(|I|^{2\alpha}), \quad \int_{B_I}|g_s|^2\,d\mu(k) \le O(|I|^{2\alpha}), \quad \alpha \ge 0, \tag{46}$$

where $B_I$ stands for a ball of radius $I$.


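Proposition 1. For the Friedrichs model described above, the evolution of the state that starts as the bound state, $\psi_\tau(0) \in \operatorname{Range} P(0)$, is such that it remains close to the instantaneous bound state and

$$\operatorname{dist}\bigl(\psi_\tau(s), \operatorname{Range} P(s)\bigr) \le \begin{cases} O\bigl(\frac{1}{\tau}\bigr), & \alpha > 1; \\ O\bigl(\frac{\log\tau}{\tau}\bigr), & \alpha = 1; \\ O\bigl(\tau^{-\alpha}\bigr), & \alpha < 1, \end{cases} \tag{47}$$

for all $s \in [0, 1]$.

Proof. Formally

$$X = R\dot P + \dot P R = \begin{pmatrix}0 & \bigl\langle\frac{f_s}{k}\bigr| \\ \bigl|\frac{f_s}{k}\bigr\rangle & 0\end{pmatrix} \tag{48}$$

solves the commutator equation

$$[X, H] = [P, \dot P]. \tag{49}$$

Now choose

$$X_\varepsilon = R_\varepsilon\dot P + \dot P R_\varepsilon = \begin{pmatrix}0 & \bigl\langle\frac{f_s\,\chi(k > \varepsilon)}{k}\bigr| \\ \bigl|\frac{f_s\,\chi(k > \varepsilon)}{k}\bigr\rangle & 0\end{pmatrix}, \tag{50}$$

where $R_\varepsilon = R(s)\chi(k > \varepsilon)$, and pick $Y_\varepsilon$ according to Eq. (7),

$$Y_\varepsilon = [P, \dot P] - [X_\varepsilon, H] = \begin{pmatrix}0 & \langle f_s\,\chi(k < \varepsilon)| \\ |f_s\,\chi(k < \varepsilon)\rangle & 0\end{pmatrix}. \tag{51}$$

Then

$$\|Y_\varepsilon\|^2 \le O\left(\int_0^\varepsilon |f_s|^2\,d\mu(k)\right) \approx \varepsilon^{2\alpha}. \tag{52}$$

Since

$$\int_{\mathbb R^d/B_\varepsilon}\frac{|f|^2}{k^2}\,d\mu(k) \le \begin{cases} O\bigl(\varepsilon^{2(\alpha-1)}\bigr) & \text{if } \alpha \neq 1 \\ -O(\log\varepsilon) & \text{if } \alpha = 1 \end{cases} \tag{53}$$

we get the appropriate estimate of $\|X_\varepsilon\|$. What remains is to estimate the norm of $\tfrac{d}{ds}\bigl(X_\varepsilon P(s)\bigr)P(s)$:

$$\begin{aligned}
\tfrac{d}{ds}\bigl(X_\varepsilon P(s)\bigr)P(s) &= R_\varepsilon\ddot P(s)P(s) + \dot R_\varepsilon\dot P(s)P(s) \\
&= B R_\varepsilon\dot P(s)P(s) - R_\varepsilon\dot P(s)BP(s) + R_\varepsilon\dot B P(s) \\
&= B X_\varepsilon P(s) - X_\varepsilon BP(s) + R_\varepsilon\bigl(1 - P(s)\bigr)\dot B P(s),
\end{aligned} \tag{54}$$

where $B = \dot V(s)V^\dagger(s)$. Making use of (46) we obtain that

$$\|X_\varepsilon P(s)\|,\ \|\dot X_\varepsilon P(s)\| \le \begin{cases} O\bigl(\varepsilon^{2(\alpha-1)}\bigr) & \text{if } \alpha \neq 1 \\ -O(\log\varepsilon) & \text{if } \alpha = 1. \end{cases} \tag{55}$$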


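So, provided $\alpha > 1$, we get the adiabatic theorem with $Y = 0$ and with a rate $1/\tau$. When $\alpha \le 1$ we optimize over $\varepsilon$, which gives

$$\frac{\|X_\varepsilon P(s)\| + \|\dot X_\varepsilon P(s)\|}{\tau} + \|Y_\varepsilon\| \le \begin{cases} O\bigl(\frac{\varepsilon^{\alpha-1}}{\tau}\bigr) + O(\varepsilon^\alpha) & \text{if } \alpha < 1 \\ -O\bigl(\frac{\log\varepsilon}{\tau}\bigr) + O(\varepsilon^\alpha) & \text{if } \alpha = 1. \end{cases} \tag{56}$$

Choosing $\varepsilon = 1/\tau$ then yields the rates stated in Eq. (47). □

Acknowledgement. We are grateful to M. Aizenman for suggesting using the regularity of measures to streamline the proof of the main theorem, and to V. Bach, R. Seiler and H. Spohn for useful discussions and hospitality. This work was partially supported by a grant from the Israel Academy of Sciences, the Deutsche Forschungsgemeinschaft, and by the Fund for Promotion of Research at the Technion.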

References

1. Arai, A., Hirokawa, M.: On the existence and uniqueness of ground states of a generalized spin-boson model. J. Funct. Anal. 151 (2), 455–503 (1997)
2. Arnold, V.I.: Geometrical Methods in the theory of Ordinary Differential Equations. Berlin–Heidelberg–New-York: Springer, 1983
3. Avron, J. E., Elgart, A.: An adiabatic theorem without a gap condition: Two level system coupled to quantized radiation field. Phys. Rev. A 58, 4300–4306 (1998)
4. Avron, J. E., Howland, J. S., Simon, B.: Adiabatic theorems for dense point spectra. Commun. Math. Phys. 128, 497–507 (1990)
5. Avron, J. E., Seiler, R., Yaffe, L. G.: Adiabatic theorems and applications to the quantum Hall effect. Commun. Math. Phys. 110, 33–49 (1987), (Erratum: Commun. Math. Phys. 153, 649–650 (1993))
6. Bach, V., Fröhlich, J., Sigal, I. M.: Mathematical theory of nonrelativistic matter and radiation. Lett. Math. Phys. 34, 183–201 (1995)
7. Bach, V., Fröhlich, J., Sigal, I. M.: Quantum electrodynamics of confined non-relativistic particles. Adv. in Math. 137, 299 (1998)
8. Bach, V., Fröhlich, J., Sigal, I. M.: Renormalization group analysis of spectral problems in quantum field theory. Adv. in Math. 137, 205–298 (1998)
9. Bach, V., Fröhlich, J., Sigal, I. M., Sofer, A.: Positive commutators and spectrum of non-relativistic QED. To appear
10. Berry, M.V.: Proc. Roy. Soc. Lond. A 392, 45 (1984); The quantum phase: Five years after. In: Geometric phases in physics. Shapere, A. and Wilczek, F., eds., Singapore: World Scientific, 1989
11. Berry, M.V.: Histories of adiabatic transition. Proc. Roy. Soc. Lond. A 429, 61–72 (1990)
12. Berry, M.V., Robbins, J.M.: Chaotic classical and half classical adiabatic reactions: Geometric magnetism and deterministic friction. Proc. Roy. Soc. Lond. A 442, 659–672 (1993); Proc. Roy. Soc. A 392, 45 (1984)
13. Bethe, H.A., Salpeter, E.E.: Quantum Mechanics of one and two electron atoms. New York: Plenum, 1977
14. Born, M.: The Mechanics of the Atom. New York: Ungar, 1960
15. Born, M., Fock, V.: Beweis des Adiabatensatzes. Z. Phys. 51, 165–169 (1928)
16. Cohen-Tannoudji, C., Dupont-Roc, J., Grynberg, G.: Atoms and Photons Interactions. New York: Wiley, 1992
17. Combes, J. M.: In: Differential equations with applications to mathematical physics. Boston: Academic Press, 1993; Combes, J. M., Montcho, R.: Remarks on the relation between quantum dynamics and fractal spectra. J. Math. Anal. and Appl. 2130, 698–722 (1997)
18. Cycon, H. L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New-York: Springer, 1987
19. Davies, E. B., Spohn, H.: Open Quantum Systems with Time-Dependent Hamiltonians and Their Linear Response. J. Stat. Phys. 19, 511–523 (1978)
20. Ehrenfest, P.: Adiabatische Invarianten u. Quantentheorie. Ann. d. Phys. 51, 327 (1916)
21. Friedrichs, K. O.: On the perturbation of continuous spectra. Comm. Pure Appl. Math. 1, 361–406 (1948)
22. Friedrichs, K. O.: Special topics in quantum theory. Lecture notes, Courant Institute of Mathematical Science, New York University, (1953); On the adiabatic theorem in quantum theory, Part I. Courant Institute of Mathematical Science, New York University, (1955); On the adiabatic theorem in quantum theory, Part II. Courant Institute of Mathematical Science, New York University, 1956
23. Garrido, L. M.: Generalized adiabatic invariance. J. Math. Phys. 5, 355–362 (1964)
24. Galindo, A., Pascual, P.: Quantum mechanics. Berlin–Heidelberg–New-York: Springer-Verlag, 1991

25. Golin, S., Knauf, A., Marmi, S.: The Hannay angles: Geometry, Adiabaticity and an example. Commun. Math. Phys. 123, 95–122 (1989) 26. Guarneri, I.: On the dynamical meaning of spectral dimensions. Ann. Inst. H. Poincaré. To appear 27. Hagedorn, G.: Adiabatic Expansions near Eigenvalue Crossings. Ann. Phys. 196, 278–295 (1989) 28. Huebner, Spohn, H.: Ann. Inst. H. Poincaré Phys. Theor. 62 no. 3, 289 (1995) 29. Jakšić, V., Segert, J.: On the Landau Zener formula for two-level systems. J. Math. Phys. 34, 2807–2820 (1993) 30. Jarzynski, C.: Multiple-time-scale approach to ergodic adiabatic systems: Another look. Phys. Rev. Lett. 71, 839 (1993) 31. Joye, A., Pfister, C.E.: Exponential Estimates in Adiabatic Quantum Evolution. Proceedings of the XII ICMP, Brisbane Australia (1997); Quantum Adiabatic Evolution. In: On Three Levels. Fannes, M., Maes, C., Verbeure, A., editors, London: Plenum, 1994 32. Kato, T.: Integration of the equation of evolution in a Banach space. J. Math. Soc. Japan. 5, 208–234 (1953) 33. Kato, T.: On the adiabatic theorem of quantum mechanics. Phys. Soc. Jap. 5, 435–439 (1958) 34. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New-York: Springer, 1966 35. Klein, M., Seiler, R.: Power law corrections to the Kubo formula vanish in quantum Hall systems. Commun. Math. Phys. 128, 141 (1990) 36. Last, Y.: Quantum Dynamics and Decomposition of Singular Continuous Spectra. J. Funct. Anal. 142, 406–445 (1996) 37. Lennard, A.: Adiabatic Invariance to All Orders. Ann. Phys. 6, 261–276 (1959) 38. Lochak, P., Meunier, C.: Multiphase Averaging for Classical systems. Berlin–Heidelberg–New-York: Springer, 1988 39. Martinez, A.: Precise exponential estimates in adiabatic theory. J. Math. Phys. 35, 3889–3915 (1994) 40. Martinez, A., Nakamura, S.: Adiabatic limit and scattering. C.R. Acad. Sci. Paris. 318, 1153–1158 (1994) 41. Narnhofer, H., Thirring, W.: Adiabatic theorem in quantum statistical mechanics. Phys. Rev. A 26, 364 (1982) 42. Nenciu, G.: On the adiabatic theorem of quantum mechanics. J. Phys. A 13, L15–L18 (1980) 43. Nenciu, G.: Linear Adiabatic Theory: Exponential Estimates. Commun. Math. Phys. 152, 479–496 (1993) 44. Ott, E.: Goodness of ergodic adiabatic invariants. Phys. Rev. Lett. 42, 1628–1631 (1979); and Brown R., Ott, E., Grebogi, C.: Goodness of ergodic adiabatic invariants. J. Stat. Phys. 49, 511–550 (1987) 45. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness. London: Academic Press, 1975 46. Riemann, B.: Ueber die Darstellbarkeit einer Function durch eine trigonometrische Reihe. In: Math. Werke, Leipzig: Teubner, 1876, pp. 213–253; Lebesgue, H.: Sur les Séries Trigonométriques. Ann. Sci. École Norm. Sup. 20, 453–485 (1903) 47. Simon, B.: The theory of resonances for dilation analytic potentials and the foundations of time-dependent perturbation theory. Ann. of Math. 97, 247–274 (1973) 48. Spohn, H.: Ground state(s) of the spin-boson Hamiltonian. Comm. Math. Phys. 123, 277–304 (1989) 49. Thouless, D.J.: Topological Quantum Numbers in Nonrelativistic Physics. Singapore: World Scientific, (1998) 50. Yosida, K.: Functional Analysis. Berlin: Springer-Verlag, 1968

Communicated by B. Simon

Commun. Math. Phys. 203, 465 – 479 (1999)


Quantum Monodromy in Integrable Systems

San Vũ Ngọc1,2

1 Institut Fourier UMR5582, B.P. 74, 38402 Saint-Martin d’Hères, France.

E-mail: [email protected]

2 Mathematics Institute, P.O. Box 80010, 3508 TA Utrecht, The Netherlands

Received: 21 April 1998 / Accepted: 8 December 1998

Abstract: Let $P_1(h), \dots, P_n(h)$ be a set of commuting self-adjoint h-pseudo-differential operators on an n-dimensional manifold. If the joint principal symbol p is proper, it is known from the work of Colin de Verdière [6] and Charbonnel [3] that in a neighbourhood of any regular value of p, the joint spectrum locally has the structure of an affine integral lattice. This leads to the construction of a natural invariant of the spectrum, called the quantum monodromy. We present this construction here, and show that this invariant is given by the classical monodromy of the underlying Liouville integrable system, as introduced by Duistermaat [9]. The most striking application of this result is that all quantum integrable systems with two degrees of freedom and a focus-focus singularity have the same non-trivial quantum monodromy. For instance, this proves a conjecture of Cushman and Duistermaat [7] concerning the quantum spherical pendulum.

1. Introduction

Obstructions to the existence of global action-angle coordinates for completely integrable systems have been well known since Duistermaat’s article [9]. It was then natural to raise the question about the impact of these obstructions on quantum integrable systems, at least for the (semi)-classical pseudo-differential quantisation on cotangent bundles. The first attempts in this direction were [7] and [11], both of them concerning the monodromy invariant for the example of the spherical pendulum. This system is indeed one of the simplest (along with the Champagne bottle [1]) that exhibits a non-trivial monodromy. The first of these articles [7] proposed a particularly interesting way of detecting the monodromy by observing a shift in the lattice structure of the joint spectrum. It is the purpose of this article to state, prove and explain this idea. Surprisingly enough, this idea of quantum monodromy has been sleeping for ten years, before new interest resulted in its experimental discovery in the spectrum of excited water molecules [4,5].


Back to mathematics, it turns out that, in the framework of semi-classical microlocal analysis (developed for integrable systems in [3]), there is a natural way of defining an invariant of the joint spectrum away from singularities of the principal symbols, that precisely describes the obstruction to the existence of a global lattice structure for the spectrum. The organisation of this article is as follows: we first extract the relevant properties of joint spectra, and define the quantum monodromy invariant for any set that shares these properties (Sect. 2). Then we prove in Sect. 3 that, for spectra, the quantum monodromy is precisely given by the classical monodromy of the underlying classical Hamiltonian system. The result is applied in Sect. 4 to the particularly interesting case of systems admitting a focus-focus singularity. The last Sect. 5 finally shows how to read off the monodromy from a picture of the spectrum. As an example, we use the spectrum of the Champagne bottle computed by Child [4].

2. Construction of the Quantum Monodromy

Let $U$ be an open subset of $\mathbb{R}^n$, let $H$ be a set of positive real numbers accumulating at 0, and for any $h$ in $H$ let $\Sigma(h)$ be a discrete subset of $U$. If $B$ is an open subset of $U$, a family $(f(h))_{h \in H}$ of smooth functions on $B$ with values in $\mathbb{R}^n$ is called a symbol (of order zero) if it admits an asymptotic expansion of the form $f(h) = f_0 + h f_1 + h^2 f_2 + \cdots$ for smooth functions $f_i : B \to \mathbb{R}^n$. More precisely we require that for any $\ell \ge 0$, for any $N \ge 0$, and for any compact $K \subset B$, there is a constant $C_{\ell,N,K}$ such that for all $h \in H$,

$$\Big\| f(h) - \sum_{k=0}^{N} h^k f_k \Big\|_{\ell} \le C_{\ell,N,K}\, h^{N+1},$$

where $\|\cdot\|_{\ell}$ denotes the $C^{\ell}$ norm in $K$. The symbol $f(h)$ is elliptic if its principal part $f_0$ is a local diffeomorphism of $B$ into $\mathbb{R}^n$. The value of $f(h)$ at a point $c \in B$ will be denoted by $f(h; c)$.

A family $(r(h))_{h \in H}$ of elements of a finite dimensional vector space is said to be $O(h^\infty)$ if for any $N \ge 0$ there is a constant $C > 0$ such that $\|r(h)\| \le C h^N$, uniformly for all $h \in H$. If $S(h)$ is any family of sets depending on $h$, then the notation $f(h) \in S(h) + O(h^\infty)$ means that the function $\mathrm{dist}(f(h), S(h))$ is $O(h^\infty)$.

We will say that $\Sigma(h)$ has the structure of an “asymptotic affine lattice” whenever it can be described with a locally finite set of “asymptotic affine integral charts”, in the following sense:

Definition 1. $(\Sigma(h), U)$ is an “asymptotic affine lattice” if for any $c \in U$, there exists a small open ball $B \subset U$ around $c$, and an elliptic symbol $f(h) : B \to \mathbb{R}^n$ of order zero such that, for any family $\lambda(h) \in B$:

– $\lambda(h) \in \Sigma(h) \cap B + O(h^\infty) \iff f(h; \lambda(h)) \in h\mathbb{Z}^n + O(h^\infty)$;
– if $\lambda(h)$ and $\lambda'(h)$ are in $\Sigma(h) \cap B$, then $\lambda'(h) - \lambda(h) = O(h^\infty)$ if and only if for small $h$, $\lambda'(h) = \lambda(h)$.

Fig. 1. An asymptotic affine lattice (schematic: the chart $f(h)$ sends $\Sigma(h) \cap B$, for a ball $B \subset U$, into $h\mathbb{Z}^n$)
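Definition 1 can be made concrete on a toy example. The sketch below is not taken from the article: it assumes n = 1, U = (0, 2) and a hypothetical, h-independent chart f(c) = c + c²/2, and it builds the set f⁻¹(hZ) ∩ U. The names `chart`, `chart_inverse` and `toy_spectrum` are illustrative only. The set is not a lattice in the coordinate c, but it becomes the standard lattice after applying f and zooming by 1/h.

```python
import math

def chart(c):
    """Hypothetical elliptic chart f(c) = c + c^2/2 on U = (0, 2); f'(c) = 1 + c > 0."""
    return c + 0.5 * c * c

def chart_inverse(y):
    """Inverse of `chart` on f(U) = (0, 4)."""
    return -1.0 + math.sqrt(1.0 + 2.0 * y)

def toy_spectrum(h):
    """Toy Sigma(h): the preimage of the lattice hZ under the chart, cut down to U."""
    k_max = int(math.floor(4.0 / h))
    points = [chart_inverse(h * k) for k in range(1, k_max + 1)]
    return [c for c in points if 0.0 < c < 2.0]

h = 0.01
sigma = toy_spectrum(h)

# In the chart, every point sits on the standard lattice after zooming by 1/h.
residues = [abs(chart(c) / h - round(chart(c) / h)) for c in sigma]
print(max(residues) < 1e-9)

# In the original coordinate the spacing is not uniform: near c it is about h / f'(c).
i = min(range(len(sigma) - 1), key=lambda j: abs(sigma[j] - 1.0))
gap = sigma[i + 1] - sigma[i]
print(gap / h)  # close to 1 / f'(1) = 0.5
```

The non-uniform spacing in the coordinate c is the reason the definition is phrased through charts rather than through the set itself.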

Intuitively this means that zooming by a factor of $1/h$ inside $B$ makes $\Sigma(h) \cap B$ converge to the standard lattice as $h$ tends to zero. The issue here is to see what prevents $\Sigma(h)$ from globally converging to a lattice. Of course, the reason for this definition is that, under suitable hypotheses, the joint spectrum of a set of $n$ commuting h-pseudo-differential operators on an $n$-dimensional manifold is indeed an “asymptotic affine lattice” (see the next section).

For short, a symbol $f(h)$ satisfying Definition 1 will be referred to as an “affine chart” of $\Sigma(h)$. The main point is that the transition functions associated to these charts are elements of the affine group $GA(n, \mathbb{Z})$. (Following Berger [2], we denote by $GA(n, \mathbb{R})$ the group of invertible affine transformations of $\mathbb{R}^n$, which is the semi-direct product of the linear group $GL(n, \mathbb{R})$ by the normal subgroup of translations. Some authors use the notation $\mathrm{Aff}_n(\mathbb{R})$ instead. The subgroup $GA(n, \mathbb{Z})$ then consists of the elements $A \in GA(n, \mathbb{R})$ such that $A$ and $A^{-1}$ leave $\mathbb{Z}^n$ globally invariant.)

Proposition 1. Let $f(h)$ and $g(h)$ be two affine charts of $\Sigma(h)$, both defined on a ball $B$. Then there is a unique $A \in GA(n, \mathbb{Z}) \subset GA(n, \mathbb{R})$ such that

$$\frac{g(h)}{h} \circ \Big(\frac{f(h)}{h}\Big)^{-1} = A_{\restriction f(h)(B)/h} + O(h^\infty).$$

Suppose now that $U$ is covered by a locally finite union of balls $B_\alpha$ on each of which is defined an affine chart $f_\alpha(h)$ of $\Sigma(h)$. Proposition 1 yields a family of affine linear maps $A_{\alpha\beta}$ such that on non-empty intersections $B_\alpha \cap B_\beta$,

$$\frac{1}{h} f_\alpha(h) = A_{\alpha\beta}\Big(\frac{1}{h} f_\beta(h)\Big).$$

This in turn defines a 1-cocycle $M$ in the Čech cohomology of $U$ with values in the non-Abelian group $GA(n, \mathbb{Z})$.

Definition 2. The class $[M] \in \check{H}^1(U, GA(n, \mathbb{Z}))$ of the cocycle defined by $A_{\alpha\beta}$ is called the quantum monodromy of $(\Sigma(h), U)$.

Let $L$ be the canonical homomorphism, whose kernel is the group of translations:

$$L : GA(n, \mathbb{R}) \to GL(n, \mathbb{R}).$$
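The group structure used here can be sketched in a few lines. The following is a minimal illustration for n = 2 only, with an element of GA(2, Z) stored as a pair (M, k) acting by p ↦ M·p + k; the class name `Affine2` and the helper names are assumptions made for this sketch, not notation from the article. It checks that L is a homomorphism, that translations lie in its kernel, and that translations form a normal subgroup.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_vec(m, v):
    """Apply a 2x2 integer matrix to an integer vector."""
    return tuple(sum(m[i][r] * v[r] for r in range(2)) for i in range(2))

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

IDENTITY = ((1, 0), (0, 1))

class Affine2:
    """Element p -> M p + k of GA(2, Z): M is an integer matrix with det = +1 or -1."""

    def __init__(self, M, k):
        if det(M) not in (1, -1):
            raise ValueError("linear part must be invertible over Z")
        self.M = M
        self.k = k

    def __call__(self, p):
        Mp = mat_vec(self.M, p)
        return (Mp[0] + self.k[0], Mp[1] + self.k[1])

    def __eq__(self, other):
        return self.M == other.M and self.k == other.k

    def compose(self, other):
        """Return self o other, i.e. p -> self(other(p))."""
        Mk = mat_vec(self.M, other.k)
        return Affine2(mat_mul(self.M, other.M), (Mk[0] + self.k[0], Mk[1] + self.k[1]))

    def inverse(self):
        d = det(self.M)  # d = +1 or -1, so 1/d = d
        (a, b), (c, e) = self.M
        M_inv = ((e * d, -b * d), (-c * d, a * d))
        Mk = mat_vec(M_inv, self.k)
        return Affine2(M_inv, (-Mk[0], -Mk[1]))

def L(A):
    """Linear part of an affine map; its kernel is the group of translations."""
    return A.M

A = Affine2(((1, -1), (0, 1)), (2, 0))
B = Affine2(((0, 1), (1, 0)), (0, 3))
T = Affine2(IDENTITY, (5, -7))  # a pure translation

print(L(A.compose(B)) == mat_mul(L(A), L(B)))          # L is a homomorphism
print(A.inverse().compose(A) == Affine2(IDENTITY, (0, 0)))
conj = A.compose(T).compose(A.inverse())               # conjugate of a translation
print(L(conj) == IDENTITY, conj.k == mat_vec(A.M, T.k))
```

Conjugating the translation by k with A gives the translation by L(A)·k, which is the semi-direct product structure mentioned above.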


Let $\iota$ be the inclusion of $GL(n, \mathbb{R})$ into $GA(n, \mathbb{R})$ such that for any $M \in GL(n, \mathbb{R})$, $\iota(M)$ leaves the origin $0 \in \mathbb{R}^n$ invariant. Then $\iota$ is an injective homomorphism that depends on the choice of the origin 0, satisfying $L \circ \iota = \mathrm{Id}$. Any $A \in GA(n, \mathbb{Z})$ can be written in a unique way $A = \tau(k) \circ \iota(M)$ (which is usually written $A = M + k$), where $M = L(A) \in GL(n, \mathbb{Z})$ and $\tau(k)$ is translation by the vector $k \in \mathbb{Z}^n$. The exact sequence of group homomorphisms

$$0 \longrightarrow \mathbb{Z}^n \xrightarrow{\ \tau\ } GA(n, \mathbb{Z}) \xrightarrow{\ L\ } GL(n, \mathbb{Z}) \longrightarrow 1$$

gives rise to the following sequence of maps (which are not homomorphisms, since cohomology sets with values in a non-abelian group have no natural group structure – see [12, p. 38]):

$$\check{H}^1(U, \mathbb{Z}^n) \xrightarrow{\ \tau_*\ } \check{H}^1(U, GA(n, \mathbb{Z})) \xrightarrow{\ L_*\ } \check{H}^1(U, GL(n, \mathbb{Z})) \longrightarrow 1.$$

This sequence is “exact” in the sense that $L_*$ is surjective, and if $L_*([M]) = 1$, then there is an integer cocycle $[\omega] \in \check{H}^1(U, \mathbb{Z}^n)$ such that $[M] = \tau_*([\omega])$. The surjectivity of $L_*$ is due to the existence of the cross section $\iota$, which gives rise to the map

$$\check{H}^1(U, GA(n, \mathbb{Z})) \xleftarrow{\ \iota_*\ } \check{H}^1(U, GL(n, \mathbb{Z}))$$

such that $L_* \iota_* = \mathrm{Id}$. For the second point, we remark that if the cocycle $L(A_{\alpha\beta})$ is a coboundary, then it can be written $M_\alpha M_\beta^{-1}$. Therefore the cocycle $\iota(M_\alpha^{-1}) A_{\alpha\beta}\, \iota(M_\beta)$ (which is equivalent to $A_{\alpha\beta}$) has a linear part equal to the identity, hence is a translation.

Remark 1. The lack of injectivity for $\tau_*$ is measured by $\check{H}^0(U, GL(n, \mathbb{Z}))$: one can check that two cocycles $[k]$ and $[k']$ in $\check{H}^1(U, \mathbb{Z}^n)$ yield the same element of $\check{H}^1(U, GA(n, \mathbb{Z}))$ if and only if there is an $M \in \check{H}^0(U, GL(n, \mathbb{Z}))$ such that $[k'] = [M \cdot k]$.

Let us now give various interpretations of the quantum monodromy $M$. The action of $GA(n, \mathbb{Z})$ on $\mathbb{Z}^n$ being effective, it is a standard fact that the cohomology set $\check{H}^1(U, GA(n, \mathbb{Z}))$ classifies the isomorphism classes of fibre bundles over $U$ with structure group $GA(n, \mathbb{Z})$ and fibre $\mathbb{Z}^n$ (see for instance [12, pp. 40–41]). Let $\mathcal{L}$ be such a lattice bundle associated to $M$. The elements $A_{\alpha\beta}$ just define the transition functions between two adjacent trivialisations of $\mathcal{L}$. Since these transition functions are locally constant, there is a naturally defined parallel transport $\gamma.p$ of a point $p \in \mathcal{L}_c$ along a path $\gamma$ in the base $U$. This defines the holonomy of $\mathcal{L}$, as a map from $\pi_1(U, c)$ into $GA(\mathcal{L}_c)$. We will always identify the latter with $GA(n, \mathbb{Z})$ by choosing an affine basis of $\mathcal{L}_c$. The choice of such a basis is equivalent to that of a trivialisation $f$ of $\mathcal{L}$ above $c$ that sends this basis to the canonical basis of $\mathbb{Z}^n$; the holonomy $\mu_f$ is then defined by:

$$f(\gamma.p) = \mu_f(\gamma) f(p). \tag{1}$$

Finally, this is also equivalent to the choice of an affine chart $f(h)$ of $\Sigma(h)$ around $c$. If $M$ is any cocycle associated to this trivialisation, then

$$\mu_f(\gamma) = A_{1,\ell} \circ \cdots \circ A_{3,2} \circ A_{2,1}, \tag{2}$$

where $A_{i,j}$ denotes the transition element corresponding to a pair of intersecting open balls $(B_i, B_j)$, and $B_1, \dots, B_\ell$ enumerate elements of a cover of $U$ encountered by $\gamma(t)$ when $t$ runs from 0 to 1. We shall always assume that $U$ is connected, so that $\mu_f$ does not depend on the base point $c$. Note that since $(\gamma'\gamma).p = \gamma.(\gamma'.p)$, we have $\mu_f(\gamma'\gamma) = \mu_f(\gamma)\mu_f(\gamma')$.

It should be noticed that the bundles considered here have discrete fibres, so that we could reduce the discussion to the theory of coverings. The fibre bundle formulation seems however to be more natural when it comes to comparing them with objects arising in Hamiltonian systems. Nevertheless, the covering approach will be used in Sect. 5. Other geometric interpretations of $M$ will also be discussed in Sect. 5. For the moment just notice that the non-triviality of $[M]$ is equivalent to the non-triviality of the lattice bundle $\mathcal{L}$ and to the fact that there is no globally defined symbol $f(h)$ on $U$ sending $\Sigma(h)$ to the straight lattice $h\mathbb{Z}^n$.

Proof of Proposition 1. There are no surprises in this quite elementary proof. Let $c \in U$, and let $f(h)$, $g(h)$ be two affine charts of $\Sigma(h)$ defined on a ball $B$ around $c$. Because of Definition 1, any open ball around $c$ contains, for $h$ small enough, at least one element of $\Sigma(h)$. Therefore, there exists a family $\lambda(h) \in \Sigma(h) \cap B$ such that

$$\lim_{h \to 0} \lambda(h) = c.$$

Let $k \in \mathbb{Z}^n$ and let $\lambda'(h)$ be a family of elements of $\Sigma(h) \cap B$ such that

$$f(h; \lambda'(h)) = f(h; \lambda(h)) + hk + O(h^\infty).$$

Then, as $h$ tends to zero, $\dfrac{\lambda'(h) - \lambda(h)}{h}$ tends towards a limit $v \in \mathbb{R}^n$ satisfying

$$k = df_0(c)\, v$$

(recall that $f_0$ denotes the principal part of $f(h)$). Since $\lambda(h)$ and $\lambda'(h)$ are in $\Sigma(h)$, there is a family $k'(h) \in \mathbb{Z}^n$ such that

$$\frac{g(h; \lambda'(h)) - g(h; \lambda(h))}{h} = k'(h) + O(h^\infty).$$

The left-hand side of the above equation has limit $dg_0(c)\, v$ as $h \to 0$. Therefore $k'(h)$ is equal to a constant integer vector $k'$ for small $h$, and we have $k' = dg_0(c)(df_0(c))^{-1} k$, which implies that $dg_0(c)(df_0(c))^{-1} \in GL(n, \mathbb{Z})$. Since $GL(n, \mathbb{Z})$ is discrete, there is a constant matrix $M \in GL(n, \mathbb{Z})$ such that for all $c \in B$, $dg_0(c) = M \cdot (df_0(c))$; this in turn implies the existence of a constant $k \in \mathbb{Z}^n$ such that, on $B$, $g_0 = M \cdot f_0 + k$. But $k$ is necessarily zero: indeed, applying the above equality to $\lambda(h)$ gives a sequence $k'(h) \in \mathbb{Z}^n$ such that

$$h k'(h) \overset{\mathrm{def}}{=} g(h; \lambda(h)) - M \cdot f(h; \lambda(h)) = k + O(h).$$


Therefore $k'(h)$ must tend to zero, and hence must equal zero for small $h$, implying that $k = 0$. We have proved the existence of a smooth symbol $F(h)$ such that $M \cdot f(h) - g(h) = hF(h)$. Because $F(h; \lambda(h)) \in \mathbb{Z}^n + O(h^\infty)$ and $\lim_{h \to 0} F(h; \lambda(h)) = F_0(c)$, we must have $F_0(c) \in \mathbb{Z}^n$. So $F_0 = \mathrm{const} \in \mathbb{Z}^n$ in $B$. This easily implies that all lower order terms in $F(h)$ must vanish on $B$, so we are left with $F(h) = k + O(h^\infty)$, for a $k \in \mathbb{Z}^n$. This gives $g(h) = M \cdot f(h) - hk + O(h^\infty)$, which reads

$$\frac{1}{h} g(h) = A\Big(\frac{1}{h} f(h)\Big) + O(h^\infty),$$

with $A \in GA(n, \mathbb{Z})$ defined by $A(p) = M \cdot p - k$, $p \in \mathbb{Z}^n$. □

Remark 2. Because of the discreteness of $GA(n, \mathbb{Z})$, Proposition 1 implies that there is an $h_0 > 0$ such that the transition element $A$ is uniquely defined by $(g(h_0)/h_0) \circ (f(h_0)/h_0)^{-1}$ acting on a finite subset of $\mathbb{Z}^n$. Therefore, when restricted to any open subset of $U$ with compact closure in $U$, the cocycle $[M]$ is really a quantum object, in the sense that “you don’t need to let $h$ tend to zero” to define it.

3. Link with the Classical Monodromy

Let $P_1(h), \dots, P_n(h)$ be a set of commuting self-adjoint h-pseudo-differential operators on an $n$-dimensional manifold $X$. They will be assumed to be classical and of order zero, in the sense that in any coordinate chart their Weyl symbols $p^j(h)$ have an asymptotic expansion of the form

$$p^j(h; x, \xi) = p_0^j(x, \xi) + h\, p_1^j(x, \xi) + h^2 p_2^j(x, \xi) + \cdots.$$

Because the principal symbols $p_0^1, \dots, p_0^n$ commute with respect to the symplectic Poisson bracket on $T^*X$, the map

$$T^*X \ni (x, \xi) \overset{p}{\longmapsto} (p_0^1(x, \xi), \dots, p_0^n(x, \xi)) \in \mathbb{R}^n$$

is a momentum map for the local Hamiltonian action of $\mathbb{R}^n$ on $T^*X$ defined by the Hamiltonian flows of the $p_0^j$. We will always assume that $p$ is proper, so that the level sets $\Lambda_c = p^{-1}(c)$ are compact. Moreover, we ask that these level sets be connected. Conclusions for non-connected $\Lambda_c$ can be obtained by separately studying the different connected components.


Let $U_r$ be the open subset of regular values of the momentum map $p$, and let $U$ be an open subset of $U_r$ with compact closure. It follows from the Arnold-Liouville theorem that $p_{\restriction U}$ is a smooth fibration whose fibres are Lagrangian tori. The structure of this fibration is semi-globally (i.e. in a neighbourhood of a fibre) described with the help of action-angle coordinates. However, the flat fibre bundle $H_1(\Lambda_c, \mathbb{Z}) \to c \in U$ (with fibre $\mathbb{Z}^n$) may have non-trivial monodromy, preventing the construction of global action variables on $p^{-1}(U)$ (see Duistermaat [9]). We will denote by $[M_{cl}]$ (classical monodromy) the cocycle in $\check{H}^1(U, GL(n, \mathbb{Z}))$ associated to this lattice bundle.

On the other hand, let $\Sigma(h)$ be the intersection with $U$ of the joint spectrum of the operators $P_1(h), \dots, P_n(h)$. It is known from [3] that this spectrum is discrete and for small $h$ is composed of simple eigenvalues. Moreover, the following result holds:

Proposition 2 ([3]). $\Sigma(h)$ is an asymptotic affine lattice on $U$.

We denote by $[M_{qu}] \in \check{H}^1(U, GA(n, \mathbb{Z}))$ the quantum monodromy of the spectrum on $U$, given by Definition 2. Recall that $\iota$ denotes the inclusion of $GL(n, \mathbb{R})$ into $GA(n, \mathbb{R})$ such that for any $M \in GL(n, \mathbb{R})$, $\iota(M)$ leaves the origin $0 \in \mathbb{R}^n$ invariant. The relation between $[M_{qu}]$ and the classical monodromy $[M_{cl}]$ is then given by the following theorem:

Theorem 1. The quantum monodromy is “dual” to the classical monodromy in the following sense:

$$[M_{qu}] = \iota_*\big({}^t[M_{cl}]^{-1}\big).$$

In other words, for any $c \in U$ there exists a choice of basis of $H_1(\Lambda_c, \mathbb{Z})$ and of an affine chart of $\Sigma(h)$ such that the monodromy representations $\mu^{cl} : \pi_1(U, c) \to GL(n, \mathbb{Z})$ and $\mu^{qu} : \pi_1(U, c) \to GA(n, \mathbb{Z})$ defined by $[M_{cl}]$ and $[M_{qu}]$ satisfy:

$$\mu^{qu} = \iota \circ ({}^t\mu^{cl})^{-1}.$$

Proof. Let $\alpha$ be the Liouville 1-form on $T^*X$. Let $c_0 \in U$ and for $c$ near $c_0$ let $(\gamma_1(c), \dots, \gamma_n(c))$ be a smooth family of loops on $\Lambda_c$ whose homology classes form a basis of $H_1(\Lambda_c, \mathbb{Z})$.
It is known from [3,6] (see also [14] for a viewpoint closer to this article) that one can find an affine chart $f(h)$ for $\Sigma(h)$ around $c$ such that the principal part $f_0$ is equal to the action integral associated to $\gamma_1, \dots, \gamma_n$:

$$f_0(c) = \Big(\frac{1}{2\pi}\int_{\gamma_1(c)} \alpha, \ \dots, \ \frac{1}{2\pi}\int_{\gamma_n(c)} \alpha\Big).$$

Because of Proposition 1, any other affine chart around $c$ having the same principal part must equal $f(h)$ (modulo $O(h^\infty)$). In this way, the choice of a local smooth basis of $H_1(\Lambda_c, \mathbb{Z})$ determines an affine chart of $\Sigma(h)$. If $(\gamma_1'(c), \dots, \gamma_n'(c))$ is another basis of $H_1(\Lambda_c, \mathbb{Z})$ such that

$$(\gamma'(c)) = M(c) \cdot (\gamma(c)), \tag{3}$$

for a matrix $M(c) \in GL(n, \mathbb{Z})$ depending smoothly on $c$, then the corresponding affine charts $f(h)$ and $f'(h)$ of $\Sigma(h)$ satisfy:

$$f'(h; c) = M(c) \cdot f(h; c) + O(h^\infty).$$

Recall that the notation “$M\cdot$” here means matrix multiplication by $M$, which is of course the same as affine composition by $\iota(M)$. But formula (3) says that if $k$ and $k'$ are trivialisation functions of the bundle $H_1(\Lambda_c, \mathbb{Z}) \to c$ associated to the bases $\gamma$ and $\gamma'$, then $k' = {}^tM^{-1} k$. Therefore, if ${}^tM_{\alpha\beta}^{-1}$ are transition elements for the lattice bundle $H_1(\Lambda_c, \mathbb{Z}) \to c$, then $\iota(M_{\alpha\beta})$ define a monodromy cocycle for $\Sigma(h)$. □

Remark 3. The fact that the affine nature of quantum monodromy is here naturally reduced to an action of the linear group $GL(n, \mathbb{Z})$ is due to the global existence of a primitive of the symplectic form on $T^*X$, namely the Liouville 1-form $\alpha$.

4. Monodromy of a Focus-Focus Singularity

It is probably not worth discussing monodromy in arbitrary degrees of freedom, for it is a typical phenomenon of 4-dimensional symplectic manifolds (see [13]). More precisely, let $X$ be a 2-dimensional manifold, and let $P_1(h)$, $P_2(h)$ be two commuting self-adjoint h-pseudo-differential operators on $X$. As before, suppose that the momentum map $p = (p_0^1, p_0^2)$ defined by the principal symbols is proper with connected level sets. We shall make the following hypothesis. There exists a critical point $m \in T^*X$ of $p$ of maximal corank (i.e. both $p_0^1$ and $p_0^2$ are critical at $m$) such that, in some local symplectic coordinates $(x, y, \xi, \eta)$, the Hessians $(p_0^1)''(m)$ and $(p_0^2)''(m)$ (hereafter denoted by $\mathcal{H}(p_0^1)$ and $\mathcal{H}(p_0^2)$) generate a 2-dimensional subalgebra of the algebra $\mathcal{Q}(4)$ of quadratic forms in $(x, y, \xi, \eta)$ under Poisson bracket that admits the following basis $(q_1, q_2)$:

$$q_1 = x\xi + y\eta, \qquad q_2 = x\eta - y\xi.$$

Such a singularity $m$ is called a focus-focus singularity. The point $m$ is then isolated amongst critical points of $p$. Therefore, we can choose $U \subset U_r$ to be a small punctured disc around $o = p(m)$.
Finally, we shall always assume that $m$ is the only critical point of the critical level set $\Lambda_0 = p^{-1}(o)$. It is known (probably since [15]; see for instance [14] or [8] for discussions and more references on this topic) that the fibration $p_{\restriction U}$ has non-trivial monodromy, and can be described in the following way: Near $m$, we know from [10] that the integrable Hamiltonian system $(p_0^1, p_0^2)$ can be brought into a normal form given by $(q_1, q_2)$. In other words there exists a local diffeomorphism $F : (\mathbb{R}^2, 0) \to (\mathbb{R}^2, o)$ such that

$$(p_0^1, p_0^2) = F(q_1, q_2).$$
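That the two quadratic forms $q_1 = x\xi + y\eta$ and $q_2 = x\eta - y\xi$ of the normal form Poisson-commute can be checked directly. The sketch below is only an illustration: it assumes the sign convention $\{f, g\} = \sum (\partial_{\xi_i} f\, \partial_{x_i} g - \partial_{x_i} f\, \partial_{\xi_i} g)$ (the opposite convention changes the sign of the bracket, not its vanishing), hard-codes the gradients by hand, and evaluates the bracket in exact integer arithmetic at random points. The function names are hypothetical.

```python
import random

def grad_q1(x, y, xi, eta):
    """Gradient of q1 = x*xi + y*eta, ordered as (d/dx, d/dy, d/dxi, d/deta)."""
    return (xi, eta, x, y)

def grad_q2(x, y, xi, eta):
    """Gradient of q2 = x*eta - y*xi, in the same order."""
    return (eta, -xi, -y, x)

def poisson_bracket(grad_f, grad_g, point):
    """{f, g} at `point`, with the assumed convention {xi, x} = 1."""
    fx, fy, fxi, feta = grad_f(*point)
    gx, gy, gxi, geta = grad_g(*point)
    return (fxi * gx - fx * gxi) + (feta * gy - fy * geta)

random.seed(0)
points = [tuple(random.randint(-50, 50) for _ in range(4)) for _ in range(100)]
brackets = [poisson_bracket(grad_q1, grad_q2, p) for p in points]
print(all(b == 0 for b in brackets))
```

Expanding the bracket by hand gives $x\eta + y\xi - y\xi - x\eta = 0$, which is what the integer evaluation confirms.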


This allows one to define transversal vector fields $X_1$ and $X_2$ tangent to the fibres $\Lambda_c$ that are equal to the Hamiltonian vector fields $X_{q_1}$ and $X_{q_2}$ near $m$. Note that $X_2$ is periodic of period $2\pi$. Around each $c \in U$, we can now define the following smooth basis $(\gamma_1(c), \gamma_2(c))$ of $H_1(\Lambda_c, \mathbb{Z}) \simeq \pi_1(\Lambda_c)$:

– $\gamma_2(c)$ is a simple integral loop of $X_2$.
– Take a point on $\gamma_2(c)$; let it evolve under the flow of $X_1$. After a finite time, it goes back on $\gamma_2(c)$. Close it up on $\gamma_2(c)$. This defines $\gamma_1(c)$.

Fig. 2. The basis $(\gamma_1(c), \gamma_2(c))$ on the torus $\Lambda_c$
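The behaviour of the two model flows near $m$ can be written in closed form. The following sketch assumes Hamilton's equations in the form $\dot x = \partial H/\partial \xi$, $\dot \xi = -\partial H/\partial x$ (a convention chosen for this illustration) and works only with the quadratic model $(q_1, q_2)$, not with an actual system $(p_0^1, p_0^2)$. Under that convention the flow of $q_2$ is a simultaneous rotation of $(x, y)$ and $(\xi, \eta)$, hence $2\pi$-periodic, while the flow of $q_1$ is hyperbolic. The function names are hypothetical.

```python
import math

def q1(p):
    x, y, xi, eta = p
    return x * xi + y * eta

def q2(p):
    x, y, xi, eta = p
    return x * eta - y * xi

def flow_q1(t, p):
    """Flow of X_{q1}: expansion in (x, y), contraction in (xi, eta)."""
    x, y, xi, eta = p
    e = math.exp(t)
    return (e * x, e * y, xi / e, eta / e)

def flow_q2(t, p):
    """Flow of X_{q2}: rotation by angle t in both the (x, y) and (xi, eta) planes."""
    x, y, xi, eta = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y, c * xi - s * eta, s * xi + c * eta)

def close(p, q, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(p, q))

p0 = (0.3, -0.2, 0.5, 0.1)

print(close(flow_q2(2 * math.pi, p0), p0))   # X_{q2} has period 2*pi
print(close(flow_q2(math.pi, p0), p0))       # and no shorter period at this point
# Both q1 and q2 are constant along both flows, and the flows commute.
p1 = flow_q1(0.7, flow_q2(1.3, p0))
p2 = flow_q2(1.3, flow_q1(0.7, p0))
print(close(p1, p2), abs(q1(p1) - q1(p0)) < 1e-12, abs(q2(p1) - q2(p0)) < 1e-12)
```

The loop $\gamma_2(c)$ in the text is an orbit of the periodic flow; the hyperbolic flow alone never closes up near $m$, which is why $\gamma_1(c)$ has to be closed along $\gamma_2(c)$.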

Proposition 3 ([15]). Let $c \in U$. With respect to the basis $(\gamma_1(c), \gamma_2(c))$, the action of the classical monodromy map $\mu^{cl}$ on a simple loop $\delta \in \pi_1(U, c)$ enclosing $o$ is given by the matrix

$$\mu^{cl}(\delta) = \begin{pmatrix} 1 & 0 \\ \epsilon & 1 \end{pmatrix}.$$

Here $\epsilon$ is the sign of $\det M$, where $M \in GL(2, \mathbb{R})$ is the unique matrix such that $(\mathcal{H}(p_0^1), \mathcal{H}(p_0^2)) = M \cdot (\mathcal{H}(q_1), \mathcal{H}(q_2))$. Note also that $M = dF(0)$.

This, together with Theorem 1, proves the following result:

Theorem 2. Let $P_1(h)$, $P_2(h)$ be a quantum integrable system with a focus-focus singularity. Then there exists a small punctured neighbourhood $U$ of the critical value $o$ such that for any $c \in U$, if $f(h)$ is an affine chart of the joint spectrum $\Sigma(h)$ around $c$ having principal part

$$\Big(\frac{1}{2\pi}\int_{\gamma_1(c)} \alpha, \ \frac{1}{2\pi}\int_{\gamma_2(c)} \alpha\Big),$$

then the value of the quantum monodromy map $\mu_f^{qu}$ (with values in $GA(2, \mathbb{Z})$) at a simple loop $\delta \in \pi_1(U, c)$ enclosing $o$ is given by the matrix

$$\mu_f^{qu}(\delta) = \iota\begin{pmatrix} 1 & -\epsilon \\ 0 & 1 \end{pmatrix}.$$

Here $\epsilon$ is the sign of $\det M$, where $M \in GL(2, \mathbb{R})$ is the unique matrix such that $(\mathcal{H}(p_0^1), \mathcal{H}(p_0^2)) = M \cdot (\mathcal{H}(q_1), \mathcal{H}(q_2))$.

5. How to Detect Quantum Monodromy

5.1. Introduction. Theorem 1 wouldn’t be of much interest if one could not “read off” the quantum monodromy from a picture of the joint spectrum. This is actually easy to do, at least in a heuristic way. The rigorous mathematical formulation may however look slightly awkward.

The first idea is the following. Given a straight lattice $\mathbb{Z}^n$, and any two points $A$ and $B$ in $\mathbb{Z}^n$, there is a natural parallel translation from $A$ to $B$ acting on $\mathbb{Z}^n$, namely the translation by the integral vector $\overrightarrow{AB}$. Now, the joint spectrum $\Sigma(h)$ locally around any point $c \in U$ looks like a lattice. If the points $A$ and $B$ in $\Sigma(h)$ are close enough to $c$ and $h$ is small enough, one can still define a parallel translation from $A$ to $B$, taking points of $\Sigma(h)$ near $A$ to points in $\Sigma(h)$ near $B$. This allows us to pass from one chart to another, and hence to define the notion of parallel transport along any loop through $c$. This yields a map from $\pi_1(U, c)$ to $GL(n, \mathbb{Z})$ which is precisely the linear part of the quantum monodromy $\mu^{qu}$.

Fig. 3. Parallel transport on $\Sigma(h)$

This idea is made precise in Sect. 5.2. The problem can also be viewed the other way round. Roughly speaking, $(\Sigma(h), U)$ is an affine manifold, and hence can be defined by the data of a local diffeomorphism $f(h)$ from the universal cover $\tilde{U}$ of $U$ to $h\mathbb{R}^n$ sending $\Sigma(h)$ to $h\mathbb{Z}^n$, and of the holonomy $\nu$ associated to it:

$$f(h; \gamma.\tilde{c}) = \nu_{\tilde{c}}(\gamma) f(h; \tilde{c}), \qquad \forall \gamma \in \pi_1(U), \ \forall \tilde{c} \in \tilde{U}.$$

Of course, $\nu$ should be related to the quantum monodromy $\mu_f$. The diffeomorphism $f(h)$ can be seen as an “unwinding” of $\Sigma(h)$ onto $\mathbb{R}^n$. This viewpoint is developed in Sect. 5.3.


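5.2. Parallel transport on $\Sigma(h)$. We discuss here the notion of parallel transport on any asymptotic affine lattice $(\Sigma(h), U)$.

1. First suppose that there exists an affine chart $f(h)$ of $\Sigma(h)$ defined globally on $U$. Since $f(h)$ is elliptic and sends elements of $\Sigma(h)$ into $h\mathbb{Z}^n + O(h^\infty)$, there is an $h_0 > 0$ such that for any $h < h_0$, there is an injective map $\tilde{f}(h)$ sending elements of $\Sigma(h)$ exactly into $h\mathbb{Z}^n$ and such that $\tilde{f}(h) - f(h) = O(h^\infty)$. Because $f(h)$ is of order zero, there is a fixed open ball $\tilde{B}' \subset f(h; U)$ such that $\tilde{B}' \cap (h\mathbb{Z}^n)$ is contained in $\tilde{f}(h; \Sigma(h))$. Then, one can find a smaller ball $\tilde{B} \subset \tilde{B}'$ such that for any two points $\tilde{P}$, $\tilde{Q}$ in $\tilde{B} \cap (h\mathbb{Z}^n)$, the translation by the vector $\overrightarrow{\tilde{P}\tilde{Q}}$ takes any point of $\tilde{B} \cap (h\mathbb{Z}^n)$ into $\tilde{B}' \cap (h\mathbb{Z}^n)$ (Fig. 4). Let us denote by $B$ an open ball in $\mathbb{R}^n$ such that $f(h; B) \subset \tilde{B}$.

Fig. 4. Parallel translation (schematic: the balls $\tilde{B} \subset \tilde{B}'$ in $h\mathbb{Z}^n$, with a point $\tilde{A}$ translated to $\tilde{A}'$ by the vector $\overrightarrow{\tilde{P}\tilde{Q}}$)

Pulling back by $\tilde{f}(h)$, one thus defines the “parallel transport” $\tau_{\overrightarrow{PQ}}(A)$ of a point $A \in \Sigma(h) \cap B$ along the direction given by two points $P$ and $Q$ in $\Sigma(h) \cap B$. When the composition is defined, we have

$$\tau_{\overrightarrow{QR}} \circ \tau_{\overrightarrow{PQ}} = \tau_{\overrightarrow{PR}}. \tag{4}$$

Moreover, because translation in $\mathbb{Z}^n$ is an isometry, there exists a constant $C > 0$, independent of $h$, such that for any $A \in \Sigma(h) \cap B$

$$\big\| \overrightarrow{Q\,\tau_{\overrightarrow{PQ}}(A)} \big\| < C\, \big\| \overrightarrow{PA} \big\|. \tag{5}$$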
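The construction of item 1 can be sketched on a toy lattice. The example below is an assumption-laden illustration, not the article's setting: it takes n = 2, a fixed value of h, and a hypothetical exact chart $f(c_1, c_2) = (c_1 + c_1^2/2,\ c_1 + c_2)$, so that points of the toy lattice are labelled by the integer vector $\tilde{f}(P)/h$. Parallel transport is then a translation of labels, and property (4) holds exactly. The names `chart_inverse`, `label_to_point` and `transport` are illustrative.

```python
import math

H = 0.05  # a fixed small value of h, chosen arbitrarily for the illustration

def chart_inverse(y):
    """Inverse of the hypothetical chart f(c1, c2) = (c1 + c1^2/2, c1 + c2), for c1 > -1."""
    c1 = -1.0 + math.sqrt(1.0 + 2.0 * y[0])
    return (c1, y[1] - c1)

def label_to_point(k):
    """Point of the toy lattice whose image under f/h is the integer vector k."""
    return chart_inverse((H * k[0], H * k[1]))

def transport(P, Q, A):
    """Parallel transport of A along PQ, computed on integer labels in the chart."""
    return tuple(a + q - p for a, p, q in zip(A, P, Q))

P, Q, R, A = (3, 1), (5, 4), (8, 2), (4, 2)

# Property (4): transporting along PQ and then along QR equals transporting along PR.
print(transport(Q, R, transport(P, Q, A)) == transport(P, R, A))
# The transport along PQ sends P to Q.
print(transport(P, Q, P) == Q)

# Back in the original coordinates the displacement of A differs from that of P,
# because the chart is nonlinear: the set is only a lattice after applying f.
disp_P = label_to_point(Q)[0] - label_to_point(P)[0]
disp_A = label_to_point(transport(P, Q, A))[0] - label_to_point(A)[0]
print(disp_P, disp_A)
```

Working on integer labels keeps the composition rule exact; only the last step, converting labels back to points, involves floating-point arithmetic.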

Because of Proposition 1, any other choice of affine chart $f(h)$ gives the same parallel transport.

2. Now, let $(\Sigma(h), U)$ be a general asymptotic affine lattice. If $\gamma$ is any path in $U$, one can cover its image by open balls $B_i$ on which parallel transport is well defined for $h$ less than some $h_i > 0$. If $U$ is compact, as we shall always assume, this can be done with a finite number of such balls $B_1, \dots, B_\ell$, ordered in a way that for each $1 \le i < \ell$, $B_i \cap B_{i+1} \neq \emptyset$. In the following, take $h$ to be less than $\min_i h_i$. Let $P \in \Sigma(h) \cap B_0$ and $Q \in \Sigma(h) \cap B_\ell$. For each $i = 1, \dots, \ell - 1$, pick a point $P_i \in \Sigma(h) \cap (B_i \cap B_{i+1})$. For $h$ small enough, this set is not empty. Because of the estimate (5), the mapping

$$\tau_{\gamma,P,Q} \overset{\mathrm{def}}{=} \tau_{\overrightarrow{P_{\ell-1}Q}} \circ \cdots \circ \tau_{\overrightarrow{P_1P_2}} \circ \tau_{\overrightarrow{PP_1}}$$

is well-defined when restricted to a sufficiently small ball $B_0$ around $P$ (here again, $\Sigma(h) \cap B_0$ won’t be empty if $h$ is small enough). Equation (4) shows that this map does not depend on the choice of the intermediate points $P_i$. Therefore it depends only on $P$, $Q$, and on the homotopy class of $\gamma$ (as a path from a point in $B_1$ to a point in $B_\ell$).

If $Q = P$, and $\gamma$ is a loop ($B_\ell \cap B_1 \neq \emptyset$ and $B_0 \subset B_1$) then $\tau_{\gamma,P,P}$ is a map from $\Sigma(h) \cap B_0$ to $\Sigma(h) \cap B_1$ leaving $P$ invariant. If $f(h)$ is an affine chart for $\Sigma(h)$ on $B_1$, then $\tilde{f}(h) \circ \tau_{\gamma,P,P} \circ \tilde{f}(h)^{-1}$ is a locally defined map $\tilde{\tau}_{\gamma,f(h),P}$ from $h\mathbb{Z}^n$ to itself leaving $\tilde{f}(h; P)$ invariant. We know from Sect. 2 (formula (1)) that the choice of such an affine chart allows the quantum monodromy map $\mu_f$ to take its values in $GA(n, \mathbb{Z})$. Remember that $L$ denotes the natural homomorphism from $GA(n, \mathbb{R})$ to $GL(n, \mathbb{R})$.

Proposition 4. The map $\tilde{\tau}_{\gamma,f(h),P}$ is equal to the linearisation at $\tilde{P} = \tilde{f}(h; P)$ of the quantum monodromy along $\gamma$:

$$\overrightarrow{\tilde{P}\,\tilde{\tau}_{\gamma,f(h),P}(\tilde{R})} = L(\mu_f(\gamma))\, \overrightarrow{\tilde{P}\tilde{R}}, \qquad \forall \tilde{R} \in h\mathbb{Z}^n,$$

whenever the left-hand side of the above is defined.

Proof. If we choose affine charts $f_i(h)$ for $\Sigma(h)$ on each of the $B_i$’s with $f_1 = f$, and let $A_{i,i+1}$ be the transition elements of the monodromy cocycle, $f_i(h)/h = A_{i,i+1}(f_{i+1}(h)/h) + O(h^\infty)$ (convention $\ell + 1 \equiv 1$), then it is easy to check that

$$\overrightarrow{\tilde{P}\,\tilde{\tau}_{\gamma,f(h),P}(\tilde{R})} = L(A_{1,\ell}) \cdots L(A_{3,2}) L(A_{2,1}) \cdot \overrightarrow{\tilde{P}\tilde{R}},$$

whenever the composition is defined. Using (2) finishes the proof.

As an application, one can easily “read off” from the spectrum of the quantum Champagne bottle (Fig. 5) that the linear part of the quantum monodromy is conjugate to the matrix $\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$.
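The phrase “conjugate to” matters because the matrix read off from a picture depends on the chosen chart. As a small hedged illustration (the change of basis `S` below is an arbitrary element of $GL(2, \mathbb{Z})$ picked for the example, not something taken from Fig. 5), one can check that a conjugate of the shear still has trace 2, determinant 1, and is unipotent without being the identity, and that going $k$ times around the loop gives a shear by $-k$, so the monodromy has infinite order.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_pow(m, k):
    """k-th power of a 2x2 integer matrix, for k >= 0."""
    result = ((1, 0), (0, 1))
    for _ in range(k):
        result = mat_mul(result, m)
    return result

IDENTITY = ((1, 0), (0, 1))
N = ((1, -1), (0, 1))          # linear part of the monodromy, as read off in the text

# An arbitrary change of integral basis S (det S = 1) and its inverse.
S = ((2, 1), (1, 1))
S_INV = ((1, -1), (-1, 2))

conj = mat_mul(mat_mul(S, N), S_INV)
trace = conj[0][0] + conj[1][1]
determinant = conj[0][0] * conj[1][1] - conj[0][1] * conj[1][0]
nilpotent_part = tuple(
    tuple(conj[i][j] - IDENTITY[i][j] for j in range(2)) for i in range(2)
)

print(conj, trace, determinant)
print(mat_mul(nilpotent_part, nilpotent_part))   # (conj - I)^2 vanishes
print(mat_pow(N, 5))                             # five turns around the loop
```

In practice this means that the entries seen in a given picture may differ from the shear, while these invariants do not.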

Fig. 5. Spectrum of the Champagne bottle (axes: $E_1$ and $E_2 = hn$). The gray disc encloses the focus-focus critical value. $R' = \tau_{\gamma,P,P}(R)$

5.3. Unwinding the spectrum. We keep here the notation of the previous paragraph. In particular, $\Sigma(h)$ is any asymptotic affine lattice on $U$, $\gamma$ is a path in $U$ whose image is covered by balls $B_i$ on which local parallel translation is defined. We choose points $P \in B_1 \cap \Sigma(h)$, $Q \in B_\ell \cap \Sigma(h)$ and $P_1, P_2, \dots, P_{\ell-1}, P_\ell = Q$ such that for $i = 1, \dots, \ell - 1$, $P_i \in B_i \cap B_{i+1} \cap \Sigma(h)$.

Given an affine chart $f(h)$ on $B_1$, for $h$ small there is a unique $k_1 \in \mathbb{Z}^n$ such that the map $\tilde{f}(h) \circ \tau_{\overrightarrow{PP_1}} \circ \tilde{f}(h)^{-1}$ is just translation by $hk_1$. If $B_1, \dots, B_\ell$ are endowed with affine charts $f_1(h) = f(h), f_2(h), \dots, f_\ell(h)$, in the same way we define $k_i \in \mathbb{Z}^n$ such that

$$\tilde{f}_i(h) \circ \tau_{\overrightarrow{P_{i-1}P_i}} \circ \tilde{f}_i(h)^{-1}$$

is translation by the vector $hk_i$. We unwind the points $P, P_1, \dots, P_\ell$ onto $h\mathbb{Z}^n$ using the following procedure (see Fig. 6):

– $\tilde{P} = \tilde{f}(h; P)$;
– $\tilde{P}_1 = \tilde{P} + hk_1 = \tilde{f}(h; P_1)$;
– $\tilde{P}_2 = \tilde{P}_1 + hL(A_{2,1}) \cdot k_2$;
– ...
– $\tilde{Q} = \tilde{P}_\ell = \tilde{P}_{\ell-1} + hL(A_{\ell,\ell-1}) \cdots L(A_{2,1}) \cdot k_\ell$.

Then one easily checks that $\tilde{P}_i = hA_{1,2} \circ A_{2,3} \circ \cdots \circ A_{i-1,i}(\tilde{f}_i(h; P_i)/h)$. In particular, applying this procedure to a loop $\gamma$ ($P = Q$) proves the following:

Proposition 5. For $h$ small enough, the quantum monodromy $\mu_f$ gives the end point $\tilde{Q}$ of the unwinding of any loop $\gamma$ on $U$ through a point $P \in \Sigma(h)$ around which we are given an affine chart $f(h)$ by the following formula:

$$\tilde{Q} = h(\mu_f(\gamma))^{-1}(\tilde{f}(h; P)/h).$$

Fig. 6. Unwinding of the points $P_i$ (shown: the points $P, P_1, \dots, P_{11}, Q$ of the spectrum, in the axes $E_1$ and $E_2 = hn$, and their unwound images $\tilde{P}, \tilde{P}_1, \dots, \tilde{Q}$ under $f(h)$). We deduce that $y_{\tilde{P}} = 4$, which allows us to locate the horizontal line through the origin $0 \in h\mathbb{Z}^2$ (the dotted one)

Remark 4. There is a unique symbol $g(h)$ defined on the universal cover $\tilde{U}$ of $U$ that is an affine chart for $\Sigma(h)$ and that coincides with $f(h)$ above $B_0$. Then $Q$ can be seen as the lift $\gamma.P \in \tilde{U}$. The point is now that

$$g(h; Q) = \tilde{Q} + O(h^\infty).$$

For any $P \in \tilde{U}$, and for any $\gamma \in \pi_1(U)$, there is a unique $\nu_P(\gamma) \in GA(n, \mathbb{Z})$ such that $g(h; \gamma.P)/h = \nu_P(\gamma)(g(h; P)/h) + O(h^\infty)$. By definition, we have $\nu_P(\gamma\gamma') = \nu_{\gamma.P}(\gamma')\nu_P(\gamma)$. But one can show that for any loop $\gamma$ such that $\gamma.P = Q$, then $\nu_Q(\gamma') = \nu_P(\gamma)\nu_P(\gamma')\nu_P(\gamma)^{-1}$. Therefore, $\nu_P$ is actually a homomorphism. Proposition 5 just says that $\nu_P = \mu_f^{-1}$.

Applying this proposition together with Theorem 2 to a focus-focus singularity, we see that if the principal part of $f(h)$ is given by the action integrals $\frac{1}{2\pi}\int_{\gamma_1} \alpha$ and $\frac{1}{2\pi}\int_{\gamma_2} \alpha$ then, for a small loop $\delta$ enclosing the critical value $o$,

$$\nu(\delta) = \iota\begin{pmatrix} 1 & \epsilon \\ 0 & 1 \end{pmatrix}.$$

In particular, the whole horizontal line through the origin consists of fixed points. Of course, locating the origin on a diagram like Fig. 6 may require the computation of the action at one point. However, given $\tilde{P}$ and its image $\tilde{Q}$, it is easy to find the horizontal line through the origin, for $y_{\tilde{P}} = x_{\tilde{Q}} - x_{\tilde{P}}$.

Acknowledgements. One of the reasons for having written this article is the enthusiasm of R. Cushman for the subject; I would like to thank him for this. I would also like to thank my adviser Y. Colin de Verdière, and J. J. Duistermaat, for stimulating discussions. My research is supported by a Marie Curie Fellowship Nr. ERBFMBICT961572.

References

1. Bates, L.M.: Monodromy in the Champagne bottle. Z. Angew. Math. Phys. 6, 837–847 (1991)
2. Berger, M.: Géométrie. Vol. 1. Paris: Cedic/Nathan, 1977
3. Charbonnel, A.-M.: Comportement semi-classique du spectre conjoint d’opérateurs pseudo-différentiels qui commutent. Asymptotic Analysis 1, 227–261 (1988)
4. Child, M.S.: Quantum states in a Champagne bottle. J. Phys. A. 31, 657–670 (1998)
5. Child, M.S., Weston, T., and Tennyson, J.: Quantum monodromy in the spectrum of H2O and other systems: New insight into the level structure of quasi-linear molecules. To appear
6. Colin de Verdière, Y.: Spectre conjoint d’opérateurs pseudo-différentiels qui commutent II. Math. Z. 171, 51–73 (1980)
7. Cushman, R. and Duistermaat, J.J.: The quantum spherical pendulum. Bull. Am. Math. Soc. (N.S.) 19, 475–479 (1988)
8. Cushman, R. and Duistermaat, J.J.: Non-hamiltonian monodromy. Preprint University of Utrecht, 1997
9. Duistermaat, J.J.: On global action-angle variables. Comm. Pure Appl. Math. 33, 687–706 (1980)
10. Eliasson, L.H.: Hamiltonian systems with Poisson commuting integrals. Ph.D. thesis, University of Stockholm, 1984
11. Guillemin, V. and Uribe, A.: Monodromy in the quantum spherical pendulum. Commun. Math. Phys. 122, 563–574 (1989)
12. Hirzebruch, F.: Topological methods in algebraic geometry. Grundlehren der math. W., Vol. 131. New York: Springer, 1966
13. Nguyên Tiên, Z.: A topological classification of integrable hamiltonian systems. Séminaire Gaston Darboux de géometrie et topologie différentielle (Brouzet, R., ed.) Université Montpellier II, 1994–1995, pp. 43–54
14. Vũ Ngọc, S.: Bohr-Sommerfeld conditions for integrable systems with critical manifolds of focus-focus type. Preprint Institut Fourier 433, 1998
15. Zou, M.: Monodromy in two degrees of freedom integrable systems. J. Geom. Phys. 10, 37–45 (1992)

Communicated by H. Araki

Commun. Math. Phys. 203, 481 – 498 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Ergodic Actions of Universal Quantum Groups on Operator Algebras

Shuzhou Wang

Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 21 April 1998 / Accepted: 14 December 1998

Abstract: We construct ergodic actions of compact quantum groups on $C^*$-algebras and von Neumann algebras, and exhibit phenomena of such actions that are of a different nature from ergodic actions of compact groups. In particular, we construct: (1) an ergodic action of the compact quantum group $A_u(Q)$ on the type $\mathrm{III}_\lambda$ Powers factor $R_\lambda$ for an appropriate positive $Q \in GL(2, \mathbb{R})$; (2) an ergodic action of the compact quantum group $A_u(n)$ on the hyperfinite $\mathrm{II}_1$ factor $R$; (3) an ergodic action of the compact quantum group $A_u(Q)$ on the Cuntz algebra $\mathcal{O}_n$ for each positive matrix $Q \in GL(n, \mathbb{C})$; (4) ergodic actions of compact quantum groups on their homogeneous spaces, as well as an example of a non-homogeneous classical space that admits an ergodic action of a compact quantum group.

1. Introduction

It is well known that compact groups admit no ergodic actions on operator algebras other than the finite ones (i.e. those with finite traces) [15]. Therefore, there arose the following basic problem (cf. p. 76 of [15]): Construct an ergodic action of a semisimple compact Lie group on the Murray–von Neumann $\mathrm{II}_1$ factor $R$. Later, Wassermann developed some general theory of ergodic actions of compact groups on operator algebras and showed that $SU(2)$ cannot act ergodically on $R$ [33,34], leaving experts in doubt whether semisimple compact Lie groups admit ergodic actions on $R$ at all. In [5], Boca studied the general theory of ergodic actions of compact quantum groups [37] on $C^*$-algebras and generalized some basic results on ergodic actions of compact groups to compact quantum groups. But so far there is still a lack of non-trivial examples of ergodic actions of compact quantum groups on operator algebras. The purpose of the present paper is two-fold and is in some sense opposite to that of Boca [5]. First, we show that some new phenomena can occur for ergodic actions of quantum groups.
Second, we supply some general methods to construct ergodic actions of compact quantum groups on operator algebras and give several non-trivial examples of such actions. We show that the universal compact matrix quantum groups $A_u(Q)$ of [27, 28] admit ergodic actions on both the (infinite) injective factors of type III (for $Q \neq cI_n$, $c \in \mathbb{C}^*$) and the (infinite) Cuntz algebras (for $Q > 0$). We construct an ergodic action of the universal compact matrix quantum group of Kac type $A_u(n)$ on the hyperfinite factor $R$, which may not admit ergodic actions of any semisimple compact Lie group [34]. We also study ergodic actions of compact quantum groups on their homogeneous spaces and show that there are non-homogeneous classical spaces that admit ergodic actions of quantum groups. These results show that compact quantum groups have a much richer theory of ergodic actions on operator algebras than compact (Lie) groups.

Unlike Boca [5], we study actions of compact quantum groups on both $C^*$-algebras and von Neumann algebras, not just $C^*$-algebras. Our construction of ergodic actions of compact quantum groups on von Neumann algebras comes from their “measure preserving” actions on $C^*$-algebras, just as in the classical situation (see Theorem 2.5). One of our constructions of ergodic actions (see Sect. 3) uses tensor products of irreducible representations of compact quantum groups. This method was first used by Wassermann [35] in the setting of Lie groups (instead of quantum groups) to construct subfactors from their “product type actions”.

At the other extreme, actions of quantum groups with large fixed point algebras (i.e. prime actions) have been studied by many authors, see, e.g. [9, 7]. Generalizing the canonical action of compact Lie groups on the Cuntz algebras [8] introduced by Doplicher-Roberts [12,10], Konishi et al. [19] study the (non-ergodic) action of $SU_q(2)$ on the Cuntz algebra $\mathcal{O}_2$ and its CAR subalgebra and show that their fixed point algebras coincide (see also [20]). This result is extended to $SU_q(n)$ by Paolucci [22].
In [21], this action of the quantum group $SU_q(n)$ is induced to a (non-ergodic) action on the Powers factor $R_\lambda$ by a rather complicated method; the same conclusion follows from our Theorem 2.5 in a much simpler and more conceptual manner.

The contents of this paper are as follows. In Sect. 2, we give a general method of construction of quantum group actions on von Neumann algebras from their “measure preserving” actions on $C^*$-algebras. Using this and a result of Banica [3] on the tensor products of the fundamental representation of $A_u(Q)$, we construct in Sect. 3 an ergodic action of the universal quantum group $A_u(Q)$ on the Powers factor $R_\lambda$ of type $\mathrm{III}_\lambda$ and an ergodic action of $A_u(n)$ on the hyperfinite $\mathrm{II}_1$ factor $R$. In Sect. 4, using results of Banica [2], we show that the fixed point subalgebra of $R$ under the quantum subgroup $A_o(n)$ of $A_u(n)$ is also a factor and that the action of $A_o(n)$ on $R$ is prime. In Sect. 5, we construct ergodic actions of $A_u(Q)$ on the Cuntz algebras and on the injective factor $R_\infty$ of type $\mathrm{III}_1$ as well as the other factors of type III. It is also shown that the (unimodular) compact quantum group $A_u(n)$ of Kac type acts ergodically on the injective factor of type $\mathrm{III}_{1/n}$, a fact rather surprising to us. In the last section, Sect. 6, we study ergodic actions of compact quantum groups on their “quotient spaces”, and show that the quantum automorphism group $A_{aut}(X_4)$ acts ergodically on the classical space $X_4$ with 4 points, but $X_4$ is not isomorphic to a quotient space.

We point out that instead of using the fundamental representation of $A_u(Q)$, we can also use representations of free products of compact quantum groups [28] in the examples in Sect. 3 and Sect. 5 for the constructions of ergodic actions.

2. Lifting Actions on $C^*$-Algebras to von Neumann Algebras

In this section, we describe (Theorem 2.5) how to construct ergodic actions of compact quantum groups on von Neumann algebras from “measure preserving” actions on noncommutative topological spaces (i.e. $C^*$-algebras).


To fix notation, we first recall some basic notions concerning actions of quantum groups on operator algebras ([1,5,23,32]). For convenience in this paper, we will use the definition given in [32] for the notion of actions of compact quantum groups on $C^*$-algebras. As in [32], Woronowicz Hopf $C^*$-algebras are assumed to be full in order to define morphisms. We adopt the following convention (see [28,27,32]): when $A = C(G)$ is a Woronowicz Hopf $C^*$-algebra, we also say that $A$ is a compact quantum group, referring to the dual object $G$.

Definition 2.1 (cf. [32]). A (left) action of a compact quantum group $A$ on a $C^*$-algebra $B$ is a unital *-homomorphism $\alpha$ from $B$ to $B \otimes A$ such that
(1) $(\mathrm{id}_B \otimes \Phi)\alpha = (\alpha \otimes \mathrm{id}_A)\alpha$, where $\Phi$ is the coproduct on $A$;
(2) $(\mathrm{id}_B \otimes \varepsilon)\alpha = \mathrm{id}_B$, where $\varepsilon$ is the counit on $A$;
(3) there is a dense *-subalgebra $\mathcal{B}$ of $B$ such that $\alpha(\mathcal{B}) \subseteq \mathcal{B} \otimes \mathcal{A}$, where $\mathcal{A}$ is the canonical dense *-subalgebra of $A$.

Remarks. (1) The definition above is equivalent to the one in Podles [23]. As in [23], we do not impose the condition that $\alpha$ is injective, which is required in [1,5], though the examples constructed in this paper satisfy this condition. We conjecture that this condition is a consequence of the other conditions in the definition. A special case of this conjecture says that the coproduct of a Woronowicz Hopf $C^*$-algebra is injective, which is true for both the full Woronowicz Hopf $C^*$-algebras (because of the counital property) and the reduced ones (because of Baaj-Skandalis [1]). Even if this conjecture is false, one can still obtain an injective $\tilde{\alpha}$ from $\alpha$ by passing to the quotient of $B$ by the kernel of $\alpha$. We leave the verification of the latter as an exercise for the reader.

(2) The above notion of left action of a quantum group $G$ would be called a right coaction of the Woronowicz Hopf $C^*$-algebra $C(G)$ by some other authors. But we prefer the more geometric term “action of quantum group”. We can similarly define a right action of a quantum group $G$, which would be called a left coaction of the Woronowicz Hopf $C^*$-algebra $C(G)$ by some other specialists.

Definition 2.2. Let $\alpha$ be an action of a compact quantum group $A$ on $B$. An element $b$ of $B$ is said to be fixed under $\alpha$ (or invariant under $\alpha$) if

$$\alpha(b) = b \otimes 1_A. \tag{2.1}$$

The fixed point algebra $B^\alpha$ (or $B^A$ if no confusion arises) of the action $\alpha$ is

$$B^\alpha = \{b \in B \mid \alpha(b) = b \otimes 1_A\}. \tag{2.2}$$

The action of $A$ is said to be ergodic if $B^\alpha = \mathbb{C}I$. A continuous functional $\phi$ on $B$ is said to be invariant under $\alpha$ if

$$(\phi \otimes \mathrm{id}_A)\alpha(b) = \phi(b)1_A. \tag{2.3}$$
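As a sanity check on Definition 2.2, it may help to write out what the conditions say in the classical case. The following is only an illustrative sketch under assumptions that are not part of the text: $A = C(G)$ for a compact group $G$, the tensor product $B \otimes C(G)$ is identified with $C(G, B)$, and $\sigma_g$ (a symbol introduced here) denotes the automorphism of $B$ by which $g \in G$ acts.

```latex
% Assumption: A = C(G), B \otimes C(G) \cong C(G, B), and \sigma_g is a
% hypothetical name for the automorphism of B implemented by g \in G.
\alpha(b)(g) = \sigma_g(b), \qquad b \in B,\ g \in G.

% Fixed elements, (2.1)-(2.2):
\alpha(b) = b \otimes 1_A
  \iff \sigma_g(b) = b \ \text{ for all } g \in G.

% Invariant functionals, (2.3):
(\phi \otimes \mathrm{id}_A)\alpha(b) = \phi(b) 1_A
  \iff \phi(\sigma_g(b)) = \phi(b) \ \text{ for all } g \in G.
```

Under these assumptions, ergodicity in the sense of Definition 2.2 reduces to the usual requirement that only the scalars are fixed by every $\sigma_g$.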

Fix an action $\alpha$ of a compact quantum group $A$ on $B$. Let $h$ be the Haar state on $A$ [37,28,26]. Then we have

Proposition 2.3. (1) The map $E = (1 \otimes h)\alpha$ is a projection of norm one from $B$ onto $B^\alpha$.

(2) Let

$$\mathcal{B}^\alpha = \{b \in \mathcal{B} \mid \alpha(b) = b \otimes 1_A\}. \tag{2.4}$$

Then $\mathcal{B}^\alpha$ is norm dense in $B^\alpha$. Hence the action $\alpha$ is ergodic if and only if it is so when restricted to the dense *-subalgebra $\mathcal{B}$ of $B$.

Proof. (1) This is an easy consequence of the following form of the invariance of the Haar state (cf. [37]):

$$(\mathrm{id}_A \otimes h)\Phi(a) = h(a)1_A, \qquad a \in A.$$

(2) If $b \in B^\alpha$, then $b$ can be approximated in norm by a sequence of elements $b_l \in \mathcal{B}$. Let $\bar{b}_l$ be the average of $b_l$:

$$\bar{b}_l = (1_B \otimes h)\alpha(b_l).$$

Then from part (1) of the proposition, $\bar{b}_l \in B^\alpha$. From condition (3) of Definition 2.1, we see that $\bar{b}_l \in \mathcal{B}^\alpha$. Moreover,

$$\|\bar{b}_l - b\| = \|(1 \otimes h)\alpha(b_l - b)\| \le \|(1 \otimes h)\alpha\|\,\|b_l - b\| \to 0.$$

The rest is clear. □

Preserve the notation above. Let $\mathfrak{A}$ be the von Neumann algebra generated by the GNS representation $\pi_h$ of $A$ for the state $h$. Then $\mathfrak{A}$ is a Hopf von Neumann algebra. For later use, we need to adapt the definitions above to the situation of von Neumann algebras.

Definition 2.4. A right coaction of a Hopf von Neumann algebra $\mathfrak{A}$ on a von Neumann algebra $\mathfrak{B}$ is a normal homomorphism $\alpha$ from $\mathfrak{B}$ to $\mathfrak{B} \otimes \mathfrak{A}$ such that
(1) $(\mathrm{id}_{\mathfrak{B}} \otimes \Phi)\alpha = (\alpha \otimes \mathrm{id}_{\mathfrak{A}})\alpha$, where $\Phi$ is the coproduct on $\mathfrak{A}$;
(2) $\alpha(\mathfrak{B})(1 \otimes \mathfrak{A})$ generates the von Neumann algebra $\mathfrak{B} \otimes \mathfrak{A}$.

The main reason why we use the term “coaction of Hopf von Neumann algebra” is that von Neumann algebras are measure-theoretic objects instead of geometric-topological objects (cf. Remark (2) after Definition 2.1). Condition (2) in the above definition is an analog of the density condition as used in the Hopf $C^*$-algebra setting [1,23]. It is well known that there is no analog of the counit in the Hopf von Neumann algebra situation, simply because a von Neumann algebra corresponds to a measure space in the commutative case (the simplest case), and functions are defined only up to sets of measure zero. Hence we do not have an analog of condition (2) of Definition 2.1 for Hopf von Neumann algebra coactions.
If $\mathfrak{A}$ comes from the GNS representation of the Haar state on a compact quantum group $A$ and $\mathfrak{A}$ coacts on the right on some von Neumann algebra $\mathfrak{B}$, we will abuse the terminology by saying that the quantum group $A$ acts on $\mathfrak{B}$. Other notions such as invariant elements (or functionals), fixed point algebra and ergodic actions in the $C^*$-case above can also be carried over to the von Neumann algebra situation. The main result of this section is the following

Theorem 2.5. Let $B$ be a $C^*$-algebra endowed with an action $\alpha$ of a compact quantum group $A$. Let $\tau$ be an $\alpha$-invariant state on $B$. Then

(1) $\alpha$ lifts to a coaction $\tilde{\alpha}$ of the Hopf von Neumann algebra $\mathfrak{A} = \pi_h(A)''$ on the von Neumann algebra $\mathfrak{B} = \pi_\tau(B)''$ defined by

$$\tilde{\alpha}(\pi_\tau(b)) = (\pi_\tau \otimes \pi_h)\alpha(b), \qquad b \in B, \tag{2.5}$$

where $\pi_h$ and $\pi_\tau$ are respectively the GNS representations associated with the Haar state $h$ on $A$ and the state $\tau$ on $B$.

(2) If $\alpha$ is ergodic, then so is $\tilde{\alpha}$.

Proof. (1) We will only show that the natural map $\tilde{\alpha}$ given on the dense subalgebra $\pi_\tau(B)$ by

$$\tilde{\alpha}(\pi_\tau(b)) = (\pi_\tau \otimes \pi_h)\alpha(b), \qquad b \in B,$$

is well defined and extends to a normal morphism from $\mathfrak{B}$ to $\mathfrak{B} \otimes \mathfrak{A}$. Let $b \in B$ and $a \in A$. Denote by $\tilde{b}$ and $\tilde{a}$ respectively the corresponding elements of the Hilbert spaces $H = L^2(B, \tau)$ and $K = L^2(A, h)$. Define an operator $U$ on $H \otimes K$ by

$$U(\tilde{b} \otimes \tilde{a}) = (\pi_\tau \otimes \pi_h)\alpha(b)(\tilde{1}_B \otimes \tilde{a}). \tag{2.6}$$

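Then since $\tau$ is $\alpha$-invariant, we have

$$\langle U(\tilde{b} \otimes \tilde{a}), U(\tilde{b} \otimes \tilde{a}) \rangle = (\tau \otimes h)\bigl((1_B \otimes a^*)\alpha(b^*b)(1_B \otimes a)\bigr) = aha^*\bigl((\tau \otimes \mathrm{id}_A)\alpha(b^*b)\bigr) = aha^*(\tau(b^*b)1_A) = \langle \tilde{b} \otimes \tilde{a}, \tilde{b} \otimes \tilde{a} \rangle,$$

where $aha^*$ is the functional on $A$ defined by

$$aha^*(x) = h(a^*xa), \qquad x \in A.$$

Hence $U$ is an isometry. Since $\alpha(B)(1 \otimes A)$ is dense in $B \otimes A$, $U$ is a unitary operator. We also have

$$(\pi_\tau \otimes \pi_h)\alpha(b)U(\tilde{b}' \otimes \tilde{a}') = (\pi_\tau \otimes \pi_h)\alpha(b)(\pi_\tau \otimes \pi_h)\alpha(b')(\tilde{1}_B \otimes \tilde{a}') = (\pi_\tau \otimes \pi_h)\alpha(bb')(\tilde{1}_B \otimes \tilde{a}') = U(\pi_\tau(b)\tilde{b}' \otimes \tilde{a}') = U(\pi_\tau(b) \otimes 1)(\tilde{b}' \otimes \tilde{a}').$$

That is,

$$(\pi_\tau \otimes \pi_h)\alpha(b) = U(\pi_\tau(b) \otimes 1)U^*. \tag{2.7}$$

Condition (1) of Definition 2.4 follows immediately. Since $\alpha(B)(1 \otimes A)$ is dense in $B \otimes A$ (cf. Remark (1) after Definition 2.1 and Podles [23]), Condition (2) of Definition 2.4 follows.

(2) Assume $\alpha$ is ergodic. Let $z \in \mathfrak{B}$ be a fixed element under $\tilde{\alpha}$: $\tilde{\alpha}(z) = z \otimes 1_{\mathfrak{A}}$. Let $b_n \in B$ be a net of elements such that $\pi_\tau(b_n) \to z$ in the weak operator topology. Consider the average of $\pi_\tau(b_n)$ integrated over the quantum group $A$,

$$z_n = (\mathrm{id}_{\mathfrak{B}} \otimes h)\tilde{\alpha}(\pi_\tau(b_n)),$$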


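where we use the same letter $h$ to denote the faithful normal state on $\mathfrak{A}$ determined by the Haar state $h$ on $A$. Then one can verify that $z_n \to z$ in the weak operator topology. Moreover, using

$$(\mathrm{id}_{\mathfrak{A}} \otimes h)\Phi(a) = h(a)1_{\mathfrak{A}}, \qquad a \in \mathfrak{A},$$

where we denote the coproduct on $\mathfrak{A}$ by the same symbol as the coproduct $\Phi$ on $A$, we have

$$\tilde{\alpha}(z_n) = (\mathrm{id}_{\mathfrak{B}} \otimes \mathrm{id}_{\mathfrak{A}} \otimes h)(\tilde{\alpha} \otimes \mathrm{id}_{\mathfrak{A}})\tilde{\alpha}(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes \mathrm{id}_{\mathfrak{A}} \otimes h)(\mathrm{id}_{\mathfrak{B}} \otimes \Phi)\tilde{\alpha}(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes (\mathrm{id}_{\mathfrak{A}} \otimes h)\Phi)\tilde{\alpha}(\pi_\tau(b_n)) = (\mathrm{id}_{\mathfrak{B}} \otimes h)\tilde{\alpha}(\pi_\tau(b_n)) \otimes 1_{\mathfrak{A}} = z_n \otimes 1_{\mathfrak{A}}.$$

That is, each $z_n$ is fixed under $\tilde{\alpha}$. From part (1) of the theorem, we see

$$z_n = (\pi_\tau \otimes h)\alpha(b_n) = \pi_\tau(\bar{b}_n),$$

where $\bar{b}_n = (1 \otimes h)\alpha(b_n) \in B^\alpha$ is the average of $b_n$. Since $\alpha$ is ergodic, $\bar{b}_n$ is a scalar. This implies that each $z_n$ is also a scalar. Consequently, the operator $z$, as a limit of the $z_n$'s in the weak operator topology, is a scalar. □

Remarks. Define on the Hilbert $A$-module $H \otimes A$ (conjugate linear in the second variable) an operator $u$ by

$$u(\tilde{b} \otimes a) = (\pi_\tau \otimes 1)\alpha(b)(\tilde{1}_B \otimes a). \tag{2.8}$$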

Then one verifies that $u$ is a unitary representation of the quantum group $A$ (cf. [37,1,28]) and $(\pi_\tau, u)$ satisfies the following covariance condition in the sense of 0.3 of [1]:

$$(\pi_\tau \otimes 1)\alpha(b) = u(\pi_\tau(b) \otimes 1)u^*. \tag{2.9}$$

The operator $U$ defined above is given by $U = (1 \otimes \pi_h)u$. The pair $(\pi_\tau, U)$ along with the relation (2.7) can be called a covariant system in the framework of Hopf von Neumann algebras. Note that part (1) of the above theorem also gives a conceptual proof of Proposition 4.2.(i) of [21], where a rather complicated (and nonconceptual) proof is given.

Notation. Let $v$ be a unitary representation of a quantum group $A$ on some finite dimensional Hilbert space $H_v$ [37,1,28]. Define

$$\mathrm{Ad}_v(b) = v(b \otimes 1)v^*, \qquad b \in B(H_v). \tag{2.10}$$

Then using Proposition 3.2 of [37], we see that $\mathrm{Ad}_v$ is an action of $A$ on $B(H_v)$ (see also the remark after the proof of Theorem 4.1 in [32]). It will be called the adjoint action of the quantum group $A$ for the representation $v$. Note that unlike in the case of locally compact groups, for quantum groups we have in general

$$\mathrm{Ad}_{v \otimes_{in} w} \neq \mathrm{Ad}_v \otimes_{in} \mathrm{Ad}_w, \tag{2.11}$$

where $\otimes_{in}$ denotes the interior tensor product of representations [37,29]. For other basic notions on compact quantum groups, we refer the reader to [37,28,29].


3. Ergodic Actions of $A_u(Q)$ on the Powers Factor $R_\lambda$ and the Murray–von Neumann $\mathrm{II}_1$ Factor $R$

We construct in this section an ergodic action of the universal quantum group $A_u(Q)$ on the type $\mathrm{III}_\lambda$ Powers factor $R_\lambda$ for a proper choice of $Q$ and an ergodic action of $A_u(n)$ on the type $\mathrm{II}_1$ Murray–von Neumann factor $R$. These are obtained as consequences of Theorem 3.6 below.

Recall [28,27,30] that for every non-singular $n \times n$ complex matrix $Q$ ($n > 1$ in the rest of this paper), the universal compact quantum group $(A_u(Q), u)$ is generated by $u_{ij}$ ($i, j = 1, \cdots, n$) with defining relations (with $u = (u_{ij})$):

$$A_u(Q): \quad u^*u = I_n = uu^*, \qquad u^t Q \bar{u} Q^{-1} = I_n = Q \bar{u} Q^{-1} u^t.$$

There is also another related family of quantum groups $A_o(Q)$ [28,27,30,2]:

$$A_o(Q): \quad u^t Q u Q^{-1} = I_n = Q u Q^{-1} u^t, \qquad \bar{u} = u \quad (\text{here } Q > 0).$$

Part (1) of the next proposition gives a characterization of $A_u(Q)$ in terms of the functional $\phi_Q$ defined below.

Proposition 3.1. Consider the adjoint action $\mathrm{Ad}_u$ corresponding to the fundamental representation $u$ of the quantum group $(A_u(Q), u)$.
(1) The quantum group $(A_u(Q), u)$ is the largest compact matrix quantum group such that its action $\mathrm{Ad}_u$ on $M_n(\mathbb{C})$ leaves invariant the functional $\phi_Q$ defined by $\phi_Q(b) = \mathrm{Tr}(Q^t b)$, $b \in M_n(\mathbb{C})$.
(2) $\mathrm{Ad}_u$ is an ergodic action if and only if $Q = \lambda E$, where $\lambda$ is a nonzero scalar, $E$ is the positive matrix $(1 \otimes h)u^t\bar{u}$ (cf. [27]), and $h$ is the Haar measure of $A_u(Q)$.

Proof. (1) It is a straightforward calculation to verify that the action $\mathrm{Ad}_u$ of $A_u(Q)$ leaves the functional $\phi_Q$ invariant. Assume that $(A, v)$ is a compact quantum group such that $\mathrm{Ad}_v$ leaves $\phi_Q$ invariant ($v = (v_{ij})_{i,j=1}^n$). Then the $v_{ij}$'s satisfy the defining relations for $A_u(Q)$. Hence $(A, v)$ is a quantum subgroup of $A_u(Q)$.

(2) A matrix $S$ is fixed by $\mathrm{Ad}_u$ if and only if $S$ intertwines the fundamental representation $u$ with itself. Hence the action $\mathrm{Ad}_u$ is ergodic if and only if the fundamental representation $u$ is irreducible. When $Q = \lambda E$, then $A_u(Q) = A_u(E)$. Since $E$ is positive, $u$ is irreducible (cf. [3]). On the other hand we have (see [27])

$$(u^t)^{-1} = E\bar{u}E^{-1},$$

where $E$ is defined as in the proposition. We also have

$$(u^t)^{-1} = Q\bar{u}Q^{-1}.$$

Hence

$$E\bar{u}E^{-1} = Q\bar{u}Q^{-1} \quad \text{and} \quad Q^{-1}E\bar{u} = \bar{u}Q^{-1}E.$$

If $u$ is irreducible, then so is $\bar{u}$ and therefore $Q^{-1}E$ is a scalar. □
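The "straightforward calculation" behind the invariance claim in part (1) of Proposition 3.1 can be sketched entrywise as follows. This is an illustrative check rather than the author's written argument; it assumes the conventions $u = \sum_{i,j} e_{ij} \otimes u_{ij}$ and $\bar{u} = (u_{ij}^*)$, with $e_{ij}$ the matrix units of $M_n(\mathbb{C})$, and uses only the defining relation $u^t Q \bar{u} Q^{-1} = I_n$ of $A_u(Q)$.

```latex
% Assumed conventions: u = \sum_{i,j} e_{ij} \otimes u_{ij}, \bar{u} = (u_{ij}^*).
% Adjoint action on matrix units:
\mathrm{Ad}_u(e_{kl}) = u(e_{kl} \otimes 1)u^*
  = \sum_{i,j} e_{ij} \otimes u_{ik} u_{jl}^*.

% Values of the functional on matrix units:
\phi_Q(e_{ij}) = \mathrm{Tr}(Q^t e_{ij}) = Q_{ij}.

% Invariance, using u^t Q \bar{u} = Q:
(\phi_Q \otimes \mathrm{id})\,\mathrm{Ad}_u(e_{kl})
  = \sum_{i,j} u_{ik}\, Q_{ij}\, u_{jl}^*
  = (u^t Q \bar{u})_{kl}
  = Q_{kl}\, 1
  = \phi_Q(e_{kl})\, 1.
```

Read in the other direction, the same computation indicates why invariance of $\phi_Q$ under $\mathrm{Ad}_v$ forces the relation $v^t Q \bar{v} = Q$ on the entries of $v$.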


Note. The proof of the necessary condition in (2) above was pointed out to us by Woronowicz. Our original proof contained an error.

In general the invariant functional $\phi_Q$ defined above is not a trace, even if the action $\mathrm{Ad}_u$ is ergodic. However, for ergodic actions of compact groups on operator algebras, one has the following finiteness theorem of Høegh-Krohn–Landstad–Størmer [15]:

Theorem 3.2. If a von Neumann algebra admits an ergodic action of a compact group $G$, then (a) this von Neumann algebra is finite; (b) the unique $G$-invariant state is a trace on the von Neumann algebra.

The proposition above shows that part (b) of this finiteness theorem is no longer true for compact quantum groups in general. We now show that part (a) of the above finiteness theorem is also false for compact quantum groups: not only can compact quantum groups act on infinite algebras, they can act on purely infinite factors (type III factors).

Definition 3.3. Let $(B_i, \pi_{ji})$ be an inductive system of $C^*$-algebras ($i, j \in I$). For each $i \in I$, let $\alpha_i$ be an action of a compact quantum group $A$ on $B_i$. We say that the actions $\alpha_i$ are a compatible system of actions for $(B_i, \pi_{ji})$ if for each pair $i \le j$, the following holds:

$$(\pi_{ji} \otimes 1)\alpha_i = \alpha_j\pi_{ji}.$$

The following lemmas will be used in the next theorem. Preserve the notation in Definition 3.3. Let $\pi_i$ be the natural embedding of $B_i$ into the inductive limit $B$ of the $B_i$'s.

Lemma 3.4. Put for each $i \in I$,

$$\alpha\pi_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i), \qquad b_i \in B_i.$$

Then $\alpha$ induces a well defined action of the quantum group $A$ on $B$. The action $\alpha$ is ergodic if and only if each $\alpha_i$ is. Assume further that $\phi_i$ is an inductive system of states on $B_i$ and that each $\phi_i$ is invariant under $\alpha_i$. Then $\alpha$ leaves invariant the inductive limit state $\tau = \lim \phi_i$.

Proof. Let $j > i$, so $\pi_{ji}(b_i) \in B_j$. Then by the formula of $\alpha$ given in the lemma, we have

$$\alpha\pi_j(\pi_{ji}(b_i)) = (\pi_j \otimes 1)\alpha_j(\pi_{ji}(b_i)). \tag{*}$$

Since $\pi_j\pi_{ji} = \pi_i$, the left-hand side of the above is equal to

$$\alpha\pi_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i).$$

From the compatibility condition we see that the right-hand side of $(*)$ is equal to

$$(\pi_j \otimes 1)(\pi_{ji} \otimes 1)\alpha_i(b_i) = (\pi_i \otimes 1)\alpha_i(b_i).$$

This shows that $\alpha$ is well defined on the dense subalgebra $\mathcal{B} = \bigcup_i \pi_i(\mathcal{B}_i)$ of $B$, where $\mathcal{B}_i$ is the dense *-subalgebra of $B_i$ according to Definition 2.1. It is also clear that $\alpha$ is bounded and satisfies the conditions of Definition 2.1. Hence $\alpha$ induces a well defined action of the quantum group on $B$.

Assume that each $\alpha_i$ is ergodic. It is clear that the action $\alpha$ is ergodic on the dense *-subalgebra $\mathcal{B}$. Hence $\alpha$ is ergodic on $B$ by Proposition 2.3. Conversely, if $\alpha$ is ergodic, then each restriction $\alpha_i$ of $\alpha$ to $B_i$ is clearly ergodic.

We now show that $\tau$ is invariant under $\alpha$. Note that $\tau(\pi_i(b_i)) = \phi_i(b_i)$. From this we have

$$(\tau \otimes 1)\alpha(\pi_i(b_i)) = (\tau \otimes 1)(\pi_i \otimes 1)\alpha_i(b_i) = (\phi_i \otimes 1)\alpha_i(b_i) = \phi_i(b_i)1_A = \tau(\pi_i(b_i))1_A.$$

By density of $\mathcal{B}$ in $B$, we have

$$(\tau \otimes 1)\alpha(b) = \tau(b)1_A, \qquad b \in B.$$

This completes the proof of the lemma. □

Note. Not every action of a compact quantum group on an inductive limit of $C^*$-algebras arises from a compatible system of actions of $A$.

Lemma 3.5. Let $u_k$ be a unitary representation of a compact quantum group $A$ on $V_k$ for each natural number $k$. Assume that $\mathrm{Ad}_{u_k}$ leaves invariant a functional $\psi_k$ on $B(V_k)$. Then the action $\mathrm{Ad}_{u_1 \otimes_{in} \cdots \otimes_{in} u_k}$ leaves the functional $\phi^k = \psi_1 \otimes \cdots \otimes \psi_k$ invariant.

Proof. Straightforward calculation. □

Let $Q \in GL(n, \mathbb{C})$ be a positive matrix with trace 1. We now construct a sequence of actions $\alpha_k$ of the compact quantum group $(A_u(Q), u)$ on $M_n(\mathbb{C})^{\otimes k}$. Denote by $u^k$ the $k$-fold interior tensor product of the representation $u$, i.e.,

$$u^k = u \otimes_{in} \cdots \otimes_{in} u,$$

see [29] for the definition of the interior tensor product $\otimes_{in}$. Put

$$\alpha_k = \mathrm{Ad}_{u^k}, \qquad \phi_Q^k = \phi_Q^{\otimes k} = \phi_Q \otimes \cdots \otimes \phi_Q.$$

Let

$$B = \lim_{k \to \infty} M_n(\mathbb{C})^{\otimes k}, \qquad \tau_Q = \lim_{k \to \infty} \phi_Q^k, \qquad \mathfrak{B} = \pi_Q(B)'', \qquad \mathfrak{A} = \pi_h(A_u(Q))'',$$

where $\pi_Q$ and $\pi_h$ are respectively the GNS representations for the positive functional $\tau_Q$ and the Haar state $h$ on $A_u(Q)$.

Theorem 3.6. The actions $\alpha_k$ ($k = 1, 2, \cdots$) of $A_u(Q)$ form a compatible system of ergodic actions leaving the functionals $\phi_Q^k$ invariant. These actions give rise to a natural ergodic action on the UHF algebra $B$ leaving invariant the positive functional $\tau_Q$, which in turn lifts to an ergodic action on the von Neumann algebra $\mathfrak{B}$.


Proof. It is straightforward to verify that the actions $\alpha_k$ are a compatible system of actions. Since each $u^k$ is irreducible (cf. [3]), we see that the actions $\alpha_k$ are ergodic. By 3.1.(1), $\phi_Q$ is invariant under the action $\mathrm{Ad}_u$. Hence applying the lemmas above, we see that the functionals $\phi_Q^k$ are invariant under the actions $\alpha_k$, and these actions give rise to an ergodic action of the quantum group $A$ on $B$ leaving $\tau_Q$ invariant. By Theorem 2.5, the action $\alpha$ on $B$ induces an ergodic action $\tilde{\alpha} : \mathfrak{B} \to \mathfrak{B} \otimes \mathfrak{A}$ at the von Neumann algebra level defined by $\tilde{\alpha}(\pi_Q(b)) = (\pi_Q \otimes \pi_h)\alpha(b)$, where $b \in B$. □

Corollary 3.7. Take

$$Q = \begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix}, \qquad a \in (0, 1/2).$$

Then $\tau_Q$ is the Powers state, so the quantum group $A_u(Q)$ acts ergodically on the Powers factor $R_\lambda$ of type $\mathrm{III}_\lambda$, where $\lambda = a/(1-a)$.

Corollary 3.8 (compare with [34]). Take $Q = I_n$. Then $\tau_Q$ is the unique trace on the UHF algebra $B$ of type $n^\infty$, so the quantum group $A_u(n) = A_u(I_n)$ acts ergodically on the hyperfinite $\mathrm{II}_1$ factor $R$.

We will see in Sect. 5 that for an appropriate choice of $Q$, the quantum groups $A_u(Q)$ act on the injective factor $R_\infty$ of type $\mathrm{III}_1$ also. It would be interesting to know whether compact quantum groups admit ergodic actions on factors of type $\mathrm{III}_0$ too.

4. Fixed Point Subalgebras of Quantum Subgroups

In this section, we show that although the actions of the universal quantum groups $A_u(Q)$ constructed in the last section are ergodic, when restricted to some of their non-trivial quantum subgroups, we obtain interesting large fixed point algebras.

Let $Q = \begin{pmatrix} a & 0 \\ 0 & 1-a \end{pmatrix}$, as in 3.7. Put $q = \lambda^{1/2} = (a/(1-a))^{1/2}$. Then from the definitions of $SU_q(2)$ and $A_u(Q)$, we see that $SU_q(2)$ is a quantum subgroup of $A_u(Q)$. By restriction, we obtain from the action of $A_u(Q)$ an action of $SU_q(2)$ on $R_\lambda$. The fixed point subalgebra of $R_\lambda$ under the action of $SU_q(2)$ is generated by the Jones projections $\{1, e_1, e_2, \cdots\}$. The restriction of the Powers state $\tau_Q$ to this fixed point algebra is a trace and its values on the Jones projections give the Jones polynomial. See the book of Jones [18].

Now take $Q = \frac{1}{n}I_n$. Then we are in the situation of Corollary 3.8. For simplicity of notation, let $\tau$ denote the trace $\tau_Q$ on the UHF algebra $B$. There are two special quantum subgroups of $A_u(n)$: $SU(n)$ and $A_o(Q) = A_o(n)$. By 4.7.d. of [14], for any closed subgroup $G$ of $SU(n)$, the fixed point algebra $R^G$ is a $\mathrm{II}_1$ subfactor of $R$. We now show that the same result holds for quantum subgroups of $A_o(n)$. For this, it suffices to prove the following

Proposition 4.1. The fixed point subalgebra $R^{A_o(n)}$ of $R$ for the quantum subgroup $A_o(n)$ of $A_u(n)$ is a $\mathrm{II}_1$ factor and the action of $A_o(n)$ on $R$ is prime.


Proof. Put $\beta = n^2$. By [2], the fixed point subalgebra of $M_n(\mathbb{C})^{\otimes k}$ for the action $\alpha_k = \mathrm{Ad}_{u^k}$ is generated by $1, e_1, \cdots, e_{k-1}$, where $u$ is the fundamental representation of the quantum group $A_o(n)$,

$$e_s = I_{H^{\otimes(s-1)}} \otimes \frac{1}{n}\sum_{i,j} e_{ij} \otimes e_{ij} \otimes I_{H^{\otimes(k-s-1)}},$$

and $H = \mathbb{C}^n$. The $e$'s satisfy the relations:
(i) $e_s^2 = e_s = e_s^*$;
(ii) $e_se_t = e_te_s$, $1 \le s, t \le k-1$, $|s-t| \ge 2$;
(iii) $\beta e_se_te_s = e_s$, $1 \le s, t \le k-1$, $|s-t| = 1$.

We now show that the restriction of $\tau$ to the fixed point subalgebra of $M_n(\mathbb{C})^{\otimes k}$ satisfies the Markov trace condition of modulus $\beta$, where $\tau$ is the trace on $R$. Namely, we will verify the identity

$$\tau(we_{k-1}) = \frac{1}{\beta}\tau(w)$$

for $w$ in the subalgebra of $M_n(\mathbb{C})^{\otimes k}$ generated by $1, e_1, \cdots, e_{k-2}$. By Theorem 4.1.1 and Corollary 2.2.4 of Jones [17], this will complete the proof of the proposition. It will also follow that the action of $A_o(n)$ on $R$ is prime.

To verify this, it suffices by Proposition 2.8.1 of [14] to check the Markov trace condition for $w$ of the form

$$w = (e_{i_1}e_{i_1-1} \cdots e_{j_1})(e_{i_2}e_{i_2-1} \cdots e_{j_2}) \cdots (e_{i_p}e_{i_p-1} \cdots e_{j_p}),$$

where

$$1 \le i_1 < i_2 < \cdots < i_p \le k-2, \qquad 1 \le j_1 < j_2 < \cdots < j_p \le k-2,$$
$$i_1 \ge j_1,\ i_2 \ge j_2,\ \cdots,\ i_p \ge j_p, \qquad 0 \le p \le k-2.$$

If $i_p < k-2$ then it is easy to see that

$$\tau(we_{k-1}) = \tau(w)\tau(e_{k-1}) = \frac{1}{\beta}\tau(w),$$

noting that $\tau(e_{k-1}) = \frac{1}{\beta}$. Hence we can assume $i_p = k-2$. Let $l_w$ be the length of the word $w$. Then $w$ takes the form

$$w = \sum \Bigl(\frac{1}{n}\Bigr)^{l_w} (\cdots) \otimes e_{ab} \otimes 1,$$

where the summation is over the indices $a, b$ and some other indices that need not be specified, and the terms in $(\cdots)$ are certain elements of $M_n(\mathbb{C})^{\otimes(k-2)}$ that need not be specified either (the components in the tensor product of the terms in $(\cdots)$ are products of $e_{ij}$'s). We have then

$$\tau(we_{k-1}) = \sum\sum_{x,y} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\bigl((\cdots) \otimes e_{ab}e_{xy} \otimes e_{xy}\bigr)$$
$$= \sum\sum_{x,y} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\bigl((\cdots) \otimes e_{ab}e_{xy} \otimes 1\bigr)\,\tau\bigl(I_{H^{\otimes(k-1)}} \otimes e_{xy}\bigr)$$
$$= \sum\sum_{x} \Bigl(\frac{1}{n}\Bigr)^{l_w+1} \tau\Bigl(\frac{1}{n}(\cdots) \otimes e_{ab}e_{xx} \otimes 1\Bigr)$$
$$= \frac{1}{\beta}\tau(w).$$
The proof is complete. □

Remarks. (1) In view of the above result, fixed point algebras of quantum subgroups of $A_o(n)$ give examples of subfactors. Therefore, it would be interesting to classify finite quantum subgroups of $A_o(n)$ and study them in the light of Jones' theory, see [14] for this in the case of the Lie group $SU(2)$. Note that the quantum group $A_o(n)$ contains the quantum permutation group $A_{aut}(X_n)$ of the $n$-point space $X_n$ (see [32]) and many other interesting quantum subgroups (see [28]). It would also be interesting to determine the fixed point subalgebras of the quantum subgroups of $SU_{-1}(n)$ ($SU_{-1}(n)$ is a quantum subgroup of $A_u(n)$ because its antipode has period 2 [28,27]). We refer the reader to Banica [4] for some interesting related results.

(2) Note that since $\mathrm{Ad}_{v \otimes_{in} w} \neq \mathrm{Ad}_v \otimes_{in} \mathrm{Ad}_w$ for unitary representations $v$ and $w$ of $A_o(n)$, we do not have a commuting square like the one on p. 222 of [14] for a given quantum subgroup $G$ of $A_o(n)$.

5. Ergodic Actions of $A_u(Q)$ on the Cuntz Algebra and the Injective Factor of Type $\mathrm{III}_1$

The Cuntz algebra $\mathcal{O}_n$ is an infinite simple $C^*$-algebra without trace, hence by [15] it does not admit an ergodic action of a compact group. Recall that the Cuntz algebra $\mathcal{O}_n$ is the simple $C^*$-algebra generated by $n$ isometries $S_k$ ($k = 1, \cdots, n$) such that

$$\sum_{k=1}^n S_kS_k^* = 1. \tag{5.1}$$

Just as $U(n)$, the compact matrix quantum group $A_u(Q)$ acts on $\mathcal{O}_n$ in a natural manner [12,10,19], where $Q$ is a positive matrix of trace 1 in $GL(n, \mathbb{C})$:

$$\alpha(S_j) = \sum_{i=1}^n S_i \otimes u_{ij}, \tag{5.2}$$

the dense *-algebra $\mathcal{B}$ of Definition 2.1 being the *-subalgebra ${}^0\mathcal{O}_n$ of $\mathcal{O}_n$ generated by the $S_i$'s, see Doplicher-Roberts [11]. However, unlike the actions of compact groups on $\mathcal{O}_n$, we have


Theorem 5.1. The above action $\alpha$ of the quantum group $A_u(Q)$ on $\mathcal{O}_n$ is ergodic, and the unique $\alpha$-invariant state on $\mathcal{O}_n$ is the quasi-free state $\omega_Q$ associated with $Q$ [13].

Proof. Let $H$ be the Hilbert subspace of $\mathcal{O}_n$ linearly spanned by the $S_k$'s. Let $(H^s, H^r)$ be the linear span of elements of the form $S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_s}^* \cdots S_{j_2}^*S_{j_1}^*$. Then ${}^0\mathcal{O}_n$ is the linear span of all the spaces $(H^s, H^r)$, $r, s \ge 0$ (see [11]). Observe that each of the spaces $(H^s, H^r)$ is invariant under the action $\alpha$:

$$\alpha(S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_s}^* \cdots S_{j_2}^*S_{j_1}^*) = \sum_{k_1,\cdots,k_r,l_1,\cdots,l_s=1}^n S_{k_1}S_{k_2} \cdots S_{k_r}S_{l_s}^* \cdots S_{l_2}^*S_{l_1}^* \otimes u_{k_1i_1}u_{k_2i_2} \cdots u_{k_ri_r}u_{l_sj_s}^* \cdots u_{l_2j_2}^*u_{l_1j_1}^*.$$

Hence $(\mathrm{id} \otimes h)\alpha((H^s, H^r))$ is the space of the fixed elements of $(H^s, H^r)$ under $\alpha$, where $h$ is the Haar state on $A_u(Q)$. For $r \neq s$, the tensor product representations $u^{\otimes r}$ and $u^{\otimes s}$ of the fundamental representation $u$ of the quantum group $A_u(Q)$ are inequivalent and irreducible [3]. Hence by Theorem 5.7 of Woronowicz [37], for $r \neq s$,

$$h(u_{k_1i_1}u_{k_2i_2} \cdots u_{k_ri_r}u_{l_sj_s}^* \cdots u_{l_2j_2}^*u_{l_1j_1}^*) = 0, \tag{5.3}$$

and therefore $(H^s, H^r)$ has no fixed point other than 0. For $r = s$, identifying the elements $S_{i_1}S_{i_2} \cdots S_{i_r}S_{j_r}^* \cdots S_{j_2}^*S_{j_1}^*$ of $(H^r, H^r)$ with the matrix units $e_{i_1j_1} \otimes e_{i_2j_2} \otimes \cdots \otimes e_{i_rj_r}$ of $M_n(\mathbb{C})^{\otimes r}$, the action $\alpha$ on $(H^r, H^r)$ is identified with the action $\alpha_r$ on $M_n(\mathbb{C})^{\otimes r}$ of Theorem 3.6. Hence the fixed elements of $(H^r, H^r)$ under $\alpha$ are the scalars. Consequently, the fixed elements of ${}^0\mathcal{O}_n$ under $\alpha$ are the scalars. By Proposition 2.3, $\alpha$ is ergodic on $\mathcal{O}_n$.

Let $\phi$ be the (unique) $\alpha$-invariant state on $\mathcal{O}_n$. Then for $x \in (H^r, H^s)$ with $r \neq s$, $r, s \ge 0$, we have

$$\phi(x) = h((\phi \otimes 1)\alpha(x)) = \phi((1 \otimes h)\alpha(x)).$$

But $(1 \otimes h)\alpha(x) = 0$ according to the computation above. Hence $\phi(x) = 0$. From the consideration of the last paragraph, $\alpha$ restricts to an ergodic action on the subalgebra $(H^k, H^k)$ of $\mathcal{O}_n$. Identifying $(H^k, H^k)$ with $M_n(\mathbb{C})^{\otimes k}$ as above, we see that

$$\phi(x) = \phi_Q^k(x), \qquad x \in (H^k, H^k),$$

where $\phi_Q^k$ is the functional in Theorem 3.6. This shows that $\phi$ is the quasi-free state $\omega_Q$ associated with $Q$ (cf. [13]). □

We can assume that $Q = \mathrm{diag}(q_1, q_2, \dots, q_n)$ is a diagonal positive matrix with trace 1, since $A_u(Q)$ and $A_u(VQV^{-1})$ are similar to each other [27]. Let β be a positive number. Define numbers $\omega_1, \omega_2, \dots, \omega_n$ by

\[ \mathrm{diag}(e^{-\beta\omega_1}, e^{-\beta\omega_2}, \dots, e^{-\beta\omega_n}) = \mathrm{diag}(q_1, q_2, \dots, q_n). \tag{5.4} \]

Let $\pi_Q$ be the GNS representation of the α-invariant state $\omega_Q$ of $\mathcal{O}_n$. Then by Theorem 4.7 of Izumi [16] and Theorem 2.5, we have

Corollary 5.2. If $\omega_1/\omega_k$ is irrational for some k, then the compact quantum group $A_u(Q)$ acts ergodically on the injective factor $\pi_Q(\mathcal{O}_n)''$ of type III$_1$.

Remarks. (1) The big quantum semi-group $U_{nc}(n)$ of Brown also acts on $\mathcal{O}_n$ in the same way as $A_u(Q)$ on $\mathcal{O}_n$ above. See Brown [6] and 4.1 of Wang [28] for the quantum semi-group structure on $U_{nc}(n)$.

(2) If the $\omega_1/\omega_k$'s are rational for all k, then $\pi_Q(\mathcal{O}_n)''$ is an injective factor of type III$_\lambda$, on which $A_u(Q)$ acts ergodically, where λ is determined from an equation involving $q_1, \dots, q_n$ (see [16]). In particular, taking $A_u(Q) = A_u(n)$, we see that even the compact matrix quantum group $A_u(n)$ of Kac type admits ergodic actions on both the infinite C*-algebra $\mathcal{O}_n$ and the injective factor $\pi_Q(\mathcal{O}_n)''$ of type III$_{1/n}$. In view of Corollary 3.8, it would be interesting to solve the following problem:

Problem. Does a compact matrix quantum group of non-Kac type admit ergodic action on the hyperfinite II$_1$ factor R?

6. Ergodic Actions on Quotient Spaces

In this section, we study ergodic actions of compact quantum groups on their quantum quotient spaces. We also give an example to show that, contrary to the classical situation, not all ergodic actions arise in this way.

Fix a quantum subgroup H of a compact quantum group G, which is given by a surjective morphism θ of Woronowicz Hopf C*-algebras from C(G) to C(H). Let $h_H$ and $h_G$ be respectively the Haar states on C(H) and C(G). Then there is a natural action β of the quantum group H on G given by

\[ \beta : C(G) \longrightarrow C(H) \otimes C(G), \quad \beta = (\theta \otimes 1)\Phi_G, \tag{6.1} \]

where $\Phi_G$ is the coproduct on C(G). The quotient space H\G is defined by the fixed point algebra of β (cf. [23]):

\[ C(H\backslash G) = C(G)^\beta = \{a \in C(G) : (\theta \otimes 1)\Phi_G(a) = 1 \otimes a\}. \tag{6.2} \]

The restriction of $\Phi_G$ to C(H\G) defines a natural action α of G on C(H\G):

\[ \alpha = \Phi_G|_{C(H\backslash G)} : C(H\backslash G) \longrightarrow C(H\backslash G) \otimes C(G). \tag{6.3} \]

The dense *-subalgebras of Definition 2.1 for the actions β and α are the natural ones. Note that $E = (h_H \otimes 1)\beta = (h_H\theta \otimes 1)\Phi_G$ is a projection of norm one from C(G) to C(H\G) (cf. Proposition 2.3 and [23]).

Proposition 6.1. In the situation as above, we have
(1) the action α of G on C(H\G) is ergodic;
(2) C(H\G) has a unique α-invariant state ω satisfying

\[ h_G(a) = \omega((h_H\theta \otimes 1)\Phi_G(a)), \quad a \in C(G). \tag{6.4} \]

Namely, ω is the restriction of $h_G$ on C(H\G).

Note. Part (2) of the proposition above is the analogue of the following well known integration formula in the classical situation:

\[ \int_G a(g)\,dg = \int_{H\backslash G}\int_H a(hg)\,dh\,d\omega(g), \quad a \in C(G). \]

Proof. (1) Let $a \in C(H\backslash G)$ be fixed under α, i.e.,

\[ \alpha(a) = a \otimes 1. \tag{$**$} \]

Since $\alpha(a) = \Phi_G(a)$ and since (by the definition of C(H\G)) $(\theta \otimes 1)\Phi_G(a) = 1 \otimes a$, it follows that

\[ (\theta \otimes 1)\alpha(a) = 1 \otimes a. \]

Using (∗∗) for the left hand side of the above, we get

\[ \theta(a) \otimes 1 = 1 \otimes a. \]

This is possible only for a = λ · 1 for some scalar λ.

(2) The general result on the existence and uniqueness of the invariant state for an ergodic action is proven in [5]. For the special situation we consider here, we not only prove the existence and uniqueness of the invariant state, but also give its precise formula.

Let ω be the restriction of $h_G$ on the subalgebra C(H\G) of C(G). Since $(h_H\theta \otimes 1)\Phi_G$ is a projection from C(G) onto C(H\G) and α is the restriction of $\Phi_G$ on C(H\G), the invariance of ω for the action α follows from the invariance of the Haar state $h_G$.

Conversely, let µ be any invariant state on C(H\G). Using again the fact that $(h_H\theta \otimes 1)\Phi_G$ is a projection from C(G) onto C(H\G), a standard calculation shows that the functional

\[ \phi(a) = \mu((h_H\theta \otimes 1)\Phi_G(a)), \quad a \in C(G), \]

is a right invariant state, i.e.

\[ \phi * \psi(a) = \phi(a), \quad a \in C(G), \]

where ψ is a state on C(G) and $\phi * \psi = (\phi \otimes \psi)\Phi_G$ is the convolution operation (cf. [37]). By the uniqueness of the Haar state, it follows that

\[ \phi = h_G, \quad \mu = \omega = h_G|_{C(H\backslash G)}. \]

□

Remarks. (1) Note that the quantum groups Au (Q), Ao (Q) and Bu (Q) have many quantum subgroups. In the light of Proposition 6.1 and Theorem 2.5, it would be interesting to study the corresponding operator algebras and the actions on them. We leave this to a separate work. (2) More general than the considerations in Proposition 6.1, if two quantum groups admit commuting actions on a noncommutative space, then they act on each other’s orbit spaces (not necessarily in an ergodic manner), just as in the classical situation. Note that the notion of orbit space corresponds to fixed point algebra in the noncommutative situation.

An Example. Every transitive action of a compact group G on a topological space X is isomorphic to the natural action of G on H\G, where H is the closed subgroup of G that fixes some point of X. However, this is no longer true for quantum groups, even if the space on which the quantum group acts is a classical one. To see this, let $X_n = \{x_1, \dots, x_n\}$ be the space with n points. By Theorem 3.1 of [32], the quantum automorphism group $A_{aut}(X_4)$ of $X_4$ contains the ordinary permutation group $S_4$, hence it acts ergodically on $X_4$. The quantum subgroup of $A_{aut}(X_4)$ that fixes a point, say $x_1$, is isomorphic to $A_{aut}(X_3)$, which is the same as $C(S_3)$, a (commutative) algebra of dimension 6. From [32], we know that as a C*-algebra, $A_{aut}(X_n)$ is the same as $C(S_n)$ for n ≤ 3 and it has $C^*(\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z})$ as a quotient for n ≥ 4, where $\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z}$ is the free product of the two-element group $\mathbb{Z}/2\mathbb{Z}$ with itself, because the entries of the matrix

\[ \begin{pmatrix} p & 1-p & 0 & 0 \\ 1-p & p & 0 & 0 \\ 0 & 0 & q & 1-q \\ 0 & 0 & 1-q & q \end{pmatrix} \]

satisfy the commutation relations of the algebra $A_{aut}(X_4)$, where p, q are the projections generating the C*-algebra $C^*(\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z})$: p = (1 − u)/2 and q = (1 − v)/2, u and v being the unitary generators of the first and second copies of $\mathbb{Z}/2\mathbb{Z}$ in the free product (cf. [24]).

For simplicity of notation, let $C(G) = A_{aut}(X_4)$, and let M be the canonical dense subalgebra of C(G) generated by the coefficients of the fundamental representation of G (see [37]). Let $H = S_3$, the subgroup of G that fixes $x_1$, and let $\mathcal{H} = C(H)$. Let θ be the surjection from C(G) to C(H) that embeds H as a subgroup of G (cf. [32]). Let β be the action defined in the beginning of this section. We claim that the coset space H\G is not isomorphic to $X_4$ as a G-space (see Sect. 2 of [32] for the notion of morphism). Namely, we have

Theorem 6.2. The G-algebras C(H\G) (which is defined to be $C(G)^\beta$) and $C(X_4)$ are not isomorphic to each other.

Proof. Since $C(X_4)$ has dimension 4, it suffices to show that C(H\G) is infinite dimensional. We make M into a Hopf $\mathcal{H}$-module (i.e. a compatible system of a left $\mathcal{H}$-comodule and a left $\mathcal{H}$-module) as follows. The restriction of β to M clearly defines a left $\mathcal{H}$-comodule structure:

\[ \beta : M \longrightarrow \mathcal{H} \otimes M. \tag{6.5} \]

The left $\mathcal{H}$-module structure on M is the trivial one defined by

\[ \mathcal{H} \otimes M \longrightarrow M, \tag{6.6} \]
\[ h \cdot m = \epsilon(h)m, \quad h \in \mathcal{H},\ m \in M, \tag{6.7} \]

where ε denotes the counit of $\mathcal{H}$. By Theorem 4.1.1 of Sweedler [25], we have an isomorphism of left $\mathcal{H}$-modules

\[ \mathcal{H} \otimes M^\beta \cong M. \tag{6.8} \]

That is,

\[ \mathcal{H} \otimes A(H\backslash G) \cong M, \tag{6.9} \]
\[ h \otimes m_0 \mapsto h \cdot m_0, \quad h \in \mathcal{H},\ m_0 \in A(H\backslash G), \tag{6.10} \]

where $A(H\backslash G) = M^\beta$ is the canonical dense subalgebra of $C(H\backslash G) = C(G)^\beta$. Since M is infinite dimensional and $\mathcal{H}$ is finite dimensional, A(H\G) and therefore C(H\G) are also infinite dimensional. □

Acknowledgement. The author is indebted to Marc A. Rieffel for continual support. Part of this paper was written while the author was a member at the IHES during the year July, 1995-Aug, 1996. He thanks the IHES for its financial support and hospitality during this period. The author also wishes to thank the Department of Mathematics at UC-Berkeley for its support and hospitality while the author held an NSF Postdoctoral Fellowship there during the final stage of this paper.

References

1. Baaj, S. and Skandalis, G.: Unitaires multiplicatifs et dualité pour les produits croisés de C*-algèbres. Ann. Sci. Ec. Norm. Sup. 26, 425–488 (1993)
2. Banica, T.: Théorie des représentations du groupe quantique compact libre O(n). C. R. Acad. Sci. Paris 322, Serie I, 241–244 (1996)
3. Banica, T.: Le groupe quantique compact libre U(n). Commun. Math. Phys. 190, 143–172 (1997)
4. Banica, T.: Quantum groups acting on n points, complex Hadamard matrix and a construction of subfactors, math/9806054
5. Boca, F.: Ergodic actions of compact matrix pseudogroups on C*-algebras. In: Recent Advances in Operator Algebras. Astérisque 232, 93–109 (1995)
6. Brown, L.: Ext of certain free product of C*-algebras. J. Operator Theory 6, 135–141 (1981)
7. Ceccherini, T., Doplicher, S., Pinzari, C. and Roberts, J.E.: A generalization of the Cuntz algebras and model actions. J. Funct. Anal. 125, 416–437 (1994)
8. Cuntz, Joachim: Simple C*-algebras generated by isometries. Commun. Math. Phys. 57, no. 2, 173–185 (1977)
9. Cuntz, Joachim: Regular actions of Hopf algebras on the C*-algebra generated by a Hilbert space. In: Operator algebras, mathematical physics, and low-dimensional topology (Istanbul, 1991), Wellesley, MA: A K Peters, 1993, pp. 87–100; MR 94m:461
10. Doplicher, S.: Abstract compact group duals, operator algebras and quantum field theory. Proc. ICM1990, Kyoto: Springer, 1991
11. Doplicher, S. and Roberts, J.E.: Duals of compact Lie groups realized in the Cuntz algebras and their actions on C*-algebras. J. Funct. Anal. 74, 96–120 (1987)
12. Doplicher, S. and Roberts, J.E.: Compact group actions on C*-algebras. J. Operator Theory 19, 283–305 (1988)
13. Evans, David E.: On $\mathcal{O}_n$. Publ. RIMS, Kyoto Univ. 16, 915–927 (1980)
14. Goodman, F.M., de la Harpe, P. and Jones, V.F.R.: Coxeter Graphs and Towers of Algebras. MSRI Publ. 14, Berlin–Heidelberg–New York: Springer-Verlag, 1989
15. Høegh-Krohn, R., Landstad, M.B. and Størmer, E.: Compact ergodic groups of automorphisms. Ann. of Math. 114, 75–86 (1981)
16. Izumi, Masaki: Subalgebras of infinite C*-algebras with finite Watatani indices. I. Cuntz algebras. Commun. Math. Phys. 155, no. 1, 157–182 (1993); MR 94e:46104
17. Jones, V.F.R.: Index for Subfactors. Invent. Math. 72, 1–25 (1983)
18. Jones, V.F.R.: Subfactors and Knots. Regional Conference Series 80, Providence, RI: Am. Math. Soc., 1991
19. Konishi, Y., Nagisa, M. and Watatani, Y.: Some remarks on actions of compact matrix quantum groups on C*-algebras. Pacific J. Math. 153, 119–127 (1992)
20. Marciniak, M.: Actions of compact quantum groups on C*-algebras. Proc. AMS 126, 607–616 (1998)
21. Nakagami, Y.: Takesaki duality for the crossed product by quantum groups. In: Quantum and Non-Commutative Analysis. H. Araki, ed., Dordrecht: Kluwer Academic Publishers, 1993, pp. 263–281
22. Paolucci, A.: Coactions of Hopf algebras on Cuntz algebras and their fixed point algebras. Proc. AMS 125, 1033–1042 (1997)
23. Podles, P.: Symmetries of quantum spaces. Subgroups and quotient spaces of quantum SU(2) and SO(3) groups. Commun. Math. Phys. 170, 1–20 (1995)
24. Raeburn, I. and Sinclair, A.M.: The C*-algebra generated by two projections. Math. Scand. 65, 278–290 (1989)
25. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969
26. Van Daele, A.: The Haar measure on a compact quantum group. Proc. Am. Math. Soc. 123, 3125–3128 (1995)
27. Van Daele, A. and Wang, S.Z.: Universal quantum groups. International J. Math 7:2, 255–264 (1996)
28. Wang, S.Z.: Free products of compact quantum groups. Commun. Math. Phys. 167, 671–692 (1995)
29. Wang, S.Z.: Tensor products and crossed products of compact quantum groups. Proc. London Math. Soc. 71, 695–720 (1995)
30. Wang, S.Z.: New classes of compact quantum groups. Lecture notes for talks at the University of Amsterdam and the University of Warsaw, January and March, 1995
31. Wang, S.Z.: Problems in the theory of quantum groups. In: Quantum Groups and Quantum Spaces. Banach Center Publication 40 (1997), Inst. of Math., Polish Acad. Sci., Editors: R. Budzynski, W. Pusz, and S. Zakrzewski, pp. 67–78
32. Wang, S.Z.: Quantum symmetry groups of finite spaces. Commun. Math. Phys. 195:1, 195–211 (1998)
33. Wassermann, A.: Ergodic actions of compact groups on operator algebras I: General theory. Ann. of Math. 130, 273–319 (1989)
34. Wassermann, A.: Ergodic actions of compact groups on operator algebras III: Classification for SU(2). Invent. Math. 93, 309–355 (1988)
35. Wassermann, A.: Coactions and Yang-Baxter equations for ergodic actions and subfactors. In: Operator Algebras and Applications, no 2, ed. by D. Evans and M. Takesaki, London Math. Soc. Lecture Notes 136, 1988, pp. 203–236
36. Woronowicz, S.L.: Twisted SU(2) group. An example of noncommutative differential calculus. Publ. RIMS, Kyoto Univ. 23, 117–181 (1987)
37. Woronowicz, S.L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, 613–665 (1987)
38. Woronowicz, S.L.: Tannaka–Krein duality for compact matrix pseudogroups. Twisted SU(N) groups. Invent. Math. 93, 35–76 (1988)

Communicated by A. Connes

Commun. Math. Phys. 203, 499 – 530 (1999)

Communications in

Mathematical Physics © Springer-Verlag 1999

Solutions of the Dirac–Fock Equations for Atoms and Molecules

Maria J. Esteban1,⋆, Eric Séré2,⋆

1 Ceremade (URA CNRS 749), CNRS et Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, F-75775 Paris Cedex 16, France
2 Département de Mathématiques, Université de Cergy-Pontoise, 2 Av. Adolphe Chauvin, F-95302 Cergy-Pontoise Cedex, France

Received: 15 December 1997 / Accepted: 29 June 1998

Abstract: The Dirac–Fock equations are the relativistic analogue of the well-known Hartree–Fock equations. They are used in computational chemistry, and yield results on the inner-shell electrons of heavy atoms that are in very good agreement with experimental data. By a variational method, we prove the existence of infinitely many solutions of the Dirac–Fock equations “without projector”, for Coulomb systems of electrons in atoms, ions or molecules, with Z ≤ 124, N ≤ 41, N ≤ Z. Here, Z is the sum of the nuclear charges in the molecule, N is the number of electrons.

1. Introduction

In relativistic quantum mechanics [5], the state of a free electron is represented by a wave function $\Psi(t, x)$ with $\Psi(t, \cdot) \in L^2(\mathbb{R}^3, \mathbb{C}^4)$ for any t. This wave function satisfies the free Dirac equation:

\[ i\partial_t \Psi = H_0 \Psi, \quad \text{with } H_0 = -i\sum_{k=1}^{3} \alpha_k \partial_k + \beta. \tag{1.1} \]

Here, we have chosen a system of units such that $\hbar = c = 1$; the mass $m_e$ of the electron has also been normalized to 1.

Before going further, let us fix some notation. In the whole paper, the conjugate of $z \in \mathbb{C}$ will be denoted by $z^*$. For $X = (z_1, \dots, z_4)^T$ a column vector in $\mathbb{C}^4$, we denote by $X^*$ the row covector $(z_1^*, \dots, z_4^*)$. Similarly, if $A = (a_{ij})$ is a 4 × 4 complex matrix, we denote by $A^*$ its adjoint, $(A^*)_{ij} = a_{ji}^*$.

⋆ Present address: Ceremade (UMR CNRS 7534), Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, F-75775 Paris Cedex 16, France

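We denote by $(X, X')$ the Hermitian product of two vectors $X, X'$ in $\mathbb{C}^4$, and by $|X|$ the norm of X in $\mathbb{C}^4$, i.e. $|X|^2 = \sum_{i=1}^{4} X_i X_i^*$. The usual Hermitian product in $L^2(\mathbb{R}^3, \mathbb{C}^4)$ is denoted

\[ (\varphi, \psi)_{L^2} = \int_{\mathbb{R}^3} \big(\varphi(x), \psi(x)\big)\, d^3x. \tag{1.2} \]

In the Dirac equation, $\alpha_1, \alpha_2, \alpha_3$ and β are 4 × 4 complex matrices, whose standard form (in 2 × 2 blocks) is

\[ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \quad \alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix} \quad (k = 1, 2, 3), \]

with

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

One can easily check the following relations:

\[ \alpha_k = \alpha_k^*, \quad \beta = \beta^*, \quad \alpha_k\alpha_\ell + \alpha_\ell\alpha_k = 2\delta_{k\ell}, \quad \alpha_k\beta + \beta\alpha_k = 0. \tag{1.3} \]

These algebraic conditions are here to ensure that $H_0$ is a symmetric operator, such that

\[ H_0^2 = -\Delta + 1. \tag{1.4} \]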
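The relations (1.3) and the identity (1.4) can be checked mechanically. The following is a minimal sketch in plain Python, not part of the original paper: the helper names (`matmul`, `block`, `check_relations`, `symbol_squared`) are illustrative assumptions. It builds the standard matrices from the Pauli matrices, verifies (1.3) together with $\beta^2 = 1$ (which holds for this choice of β and is needed for the next step), and checks that the Fourier symbol $\alpha \cdot p + \beta$ of $H_0$ squares to $(|p|^2 + 1)$ times the identity, the Fourier-side counterpart of (1.4).

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose (the adjoint A* of the text)."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
Z4 = [[0] * 4 for _ in range(4)]

def check_relations():
    """True if (1.3) holds for ALPHA, BETA, and additionally beta^2 = 1."""
    ok = close(BETA, dagger(BETA)) and close(matmul(BETA, BETA), I4)
    for k in range(3):
        ok = ok and close(ALPHA[k], dagger(ALPHA[k]))
        ok = ok and close(add(matmul(ALPHA[k], BETA), matmul(BETA, ALPHA[k])), Z4)
        for l in range(3):
            anti = add(matmul(ALPHA[k], ALPHA[l]), matmul(ALPHA[l], ALPHA[k]))
            ok = ok and close(anti, scale(2 if k == l else 0, I4))
    return ok

def symbol_squared(p):
    """Square of the symbol alpha.p + beta of H0 at momentum p = (p1, p2, p3)."""
    h = BETA
    for k in range(3):
        h = add(h, scale(p[k], ALPHA[k]))
    return matmul(h, h)
```

For any real momentum `p`, `symbol_squared(p)` should agree with `scale(sum(x * x for x in p) + 1, I4)` up to rounding, which is how the anticommutation relations produce $-\Delta + 1$.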

Let us now consider an electron near a nucleus of atomic number Z. We assume that the nucleus is point-like and is situated at the origin of coordinates, and we take the system of units of Eq. (1.1). The Hamiltonian of the electron, in the coulombic field created by the nucleus, is then

\[ H_Z = H_0 - \alpha Z V(x), \quad \text{with } V(x) = \frac{1}{|x|}. \tag{1.5} \]

Here, α is a positive dimensionless constant. Its physical value is $\alpha \approx \frac{1}{137}$. Lemma 1.1 lists some properties of $H_0$ and V(x) that will be useful in this paper.

Lemma 1.1. (P1) $H_0$ is a self-adjoint operator on $L^2(\mathbb{R}^3, \mathbb{C}^4)$, with domain $D(H_0) = H^1(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is $(-\infty, -1] \cup [1, +\infty)$. There are two orthogonal projectors on $L^2(\mathbb{R}^3, \mathbb{C}^4)$, $\Lambda^+$ and $\Lambda^- = 1_{L^2} - \Lambda^+$, both with infinite rank, and such that

\[ \begin{cases} H_0\Lambda^+ = \Lambda^+ H_0 = \sqrt{1-\Delta}\,\Lambda^+ = \Lambda^+\sqrt{1-\Delta}, \\ H_0\Lambda^- = \Lambda^- H_0 = -\sqrt{1-\Delta}\,\Lambda^- = -\Lambda^-\sqrt{1-\Delta}. \end{cases} \tag{1.6} \]

(P2) The coulombic potential $V(x) = \frac{1}{|x|}$ satisfies the following Hardy-type inequalities:

\[ \big(\varphi, (\mu * V)\varphi\big)_{L^2} \le \frac{1}{2}\Big(\frac{\pi}{2} + \frac{2}{\pi}\Big)\big(\varphi, |H_0|\varphi\big)_{L^2}, \tag{1.7} \]

for all $\varphi \in \Lambda^+(H^{1/2}) \cup \Lambda^-(H^{1/2})$ and for all probability measures µ on $\mathbb{R}^3$. Moreover,

\[ \big(\varphi, (\mu * V)\varphi\big)_{L^2} \le \frac{\pi}{2}\big(\varphi, |H_0|\varphi\big)_{L^2}, \quad \forall\varphi \in H^{1/2}, \tag{1.8} \]
\[ \|(\mu * V)\varphi\|_{L^2} \le 2\|\nabla\varphi\|_{L^2}, \quad \forall\varphi \in H^1. \tag{1.9} \]

In the particular case where µ is equal to the Dirac mass at the origin $\delta_0$, an inequality more precise than (1.7) was proved in [8, 47, 48]. This inequality reads as follows:

\[ \Big(\Big(H_0 - \frac{\alpha Z}{|x|}\Big)\varphi, \varphi\Big) \ge \big((1 - \alpha Z)\varphi, \varphi\big), \]

for all $Z \le Z_c := \frac{2}{(\pi/2 + 2/\pi)\alpha}$ and for all $\varphi \in \Lambda^+(H^{1/2}(\mathbb{R}^3, \mathbb{C}^4))$. The technique used in [8, 47, 48] is based on ideas introduced by Evans, Perry and Siedentop in [18]. We refer to [27, 30] for inequality (1.8) in the case $\mu = \delta_0$. Thaller's book [46] gathers many results on the Dirac operator, including (P1) and the standard Hardy inequality (1.9) for $\mu = \delta_0$, with references. The extension of (1.7), (1.8) and (1.9) from $\mu = \delta_0$ to a general probability measure µ is immediate, since the projectors $\Lambda^\pm$, the gradient ∇ and the free Dirac operator $H_0$ commute with translations.

For completeness, we shall give the explicit form of the projectors $\Lambda^+$, $\Lambda^-$ in Sect. 3. For $\varphi \in L^2(\mathbb{R}^3, \mathbb{C}^4)$, let us denote $\varphi^+ = \Lambda^+\varphi$, $\varphi^- = \Lambda^-\varphi$. Let $E = H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, $E^+ = \Lambda^+ E$, $E^- = \Lambda^- E$. E is a Hilbert space with Hermitian product

\[ (\varphi, \psi)_E = \big(\varphi, \sqrt{1-\Delta}\,\psi\big)_{L^2} = (\varphi^+, \psi^+)_E + (\varphi^-, \psi^-)_E. \tag{1.10} \]

Since $H_0$ is unbounded from below, it is difficult to define a ground state for relativistic atoms and molecules. In order to study the stability of relativistic molecules from a mathematical viewpoint, various simplified models have been introduced. In the simplest one, $H_0$ is replaced by the positive definite Hamiltonian $\sqrt{1-\Delta}$. See for instance [27, 14, 37, 34], and the Selecta of E.H. Lieb [33] for a more detailed list of references on this topic.

A more realistic model due to Brown and Ravenhall [6] uses projection operators: $\Lambda^+(H_0 + V)\Lambda^+$ replaces $H_0 + V$, i.e., the one-particle Hilbert space is $\Lambda^+ L^2$ instead of $L^2$. The above projected operator and its multi-particle counterpart were widely discussed by J. Sucher in [43, 44]. In [26], Hardekopf and Sucher investigated numerically the operator $B := \Lambda^+(H_0 - \alpha Z|x|^{-1})\Lambda^+$, and they claimed that its ground state energy vanishes when $Z = Z_c := \frac{2}{(\pi/2 + 2/\pi)\alpha}$. The first mathematical study on the semiboundedness of B appeared in [18]. In [18], Evans, Perry and Siedentop proved that on the space of rapidly decaying smooth spinors, B is bounded from below by $\alpha Z(1/\pi - \pi/4)$ if the charge Z does not exceed $Z_c$ and unbounded from below if Z is larger than $Z_c$. As already mentioned, several authors [8, 47, 48] improved this result later by showing that B is positive and bounded from below by $(1 - \alpha Z)$ whenever $Z \le Z_c$. For results concerning multi-particle versions of B, see for instance [35].

The Dirac–Fock (DF) functional was first introduced by Swirles [45] as an approximation for the energy of a system of N electrons in an atom of large nuclear charge Z. In such atoms, the inner-shell electrons have relativistic energies, and the standard Hartree–Fock (HF) approximation, based on the nonrelativistic Schrödinger equation, is no longer valid. The Euler–Lagrange equations of the DF energy functional can be solved numerically. The solutions represent stationary states of the electrons in the atom. The numerical results are in very good agreement with experimental data (see e.g. [32, 23, 15, 38, 31, 22]). In [43, 44, 41, 24, 10], the relationship between Dirac–Fock and quantum electrodynamics is studied.

In the Dirac–Fock model, the N electrons are represented by a Slater determinant of N functions $\varphi_k \in E$, subjected to the normalization constraints

\[ (\varphi_\ell, \varphi_k)_{L^2} = \delta_{k\ell}. \tag{1.11} \]

We shall denote $\Phi = (\varphi_1, \dots, \varphi_N)$, and the constraints above will be written in the shorter form

\[ \mathrm{Gram}\,\Phi = \mathbb{1}, \quad \text{with } \big[\mathrm{Gram}\,\Phi\big]_{k\ell} := (\varphi_\ell, \varphi_k)_{L^2}. \tag{1.12} \]

We consider a molecule, with:
• nuclear charge density Zµ, where Z > 0 is the total nuclear charge and µ is a probability measure defined on $\mathbb{R}^3$. In the particular case of m point-like nuclei, each one having atomic number $Z_i$ at a fixed location $x_i$, $Z\mu = \sum_{i=1}^{m} Z_i\delta_{x_i}$ and $Z = \sum_{i=1}^{m} Z_i$.
• N relativistic electrons.

We assume that the interaction between these particles is purely electrostatic. The DF energy of the N electrons in the molecule is

\[ \mathcal{E}(\Phi) = \sum_{\ell=1}^{N} (\varphi_\ell, H_0\varphi_\ell)_{L^2} - \alpha Z\sum_{\ell=1}^{N} \big(\varphi_\ell, (\mu * V)\varphi_\ell\big)_{L^2} + \frac{\alpha}{2}\iint_{\mathbb{R}^3\times\mathbb{R}^3} V(x-y)\Big[\rho(x)\rho(y) - \mathrm{tr}\big(R(x,y)R(y,x)\big)\Big]\, d^3x\, d^3y. \tag{1.13} \]

Here, ρ is a scalar and R is a 4 × 4 complex matrix, given by

\[ \rho(x) = \sum_{\ell=1}^{N} \big(\varphi_\ell(x), \varphi_\ell(x)\big), \quad R(x,y) = \sum_{\ell=1}^{N} \varphi_\ell(x) \otimes \varphi_\ell^*(y). \tag{1.14} \]

ρ is the electronic density, and R is the exchange matrix which comes from the antisymmetry of the Slater determinant. Note that $R(y,x) = R(x,y)^*$, so that $\mathrm{tr}\big(R(x,y)R(y,x)\big) = \sum_{i,j} |R(x,y)_{ij}|^2$.

The main difference with the more standard HF functional is that the kinetic energy term $(\varphi_k, -\Delta\varphi_k)_{L^2}$ in HF is replaced by $(\varphi_k, H_0\varphi_k)_{L^2}$ in DF. This changes completely the nature of the functional, which becomes strongly indefinite: it is not bounded below, and any of its critical points has an infinite Morse index. The DF functional is invariant under the action of the group U(N):

\[ u \cdot \Phi = \Big(\sum_\ell u_{1\ell}\varphi_\ell, \dots, \sum_\ell u_{N\ell}\varphi_\ell\Big), \quad u \in U(N),\ \Phi \in E^N. \tag{1.15} \]

We denote

\[ \Sigma = \big\{\Phi \in E^N \,/\, \mathrm{Gram}\,\Phi = \mathbb{1}\big\}. \tag{1.16} \]

Using inequality (1.8), one can easily prove that the DF functional $\mathcal{E}$ is smooth on $E^N$. A critical point of $\mathcal{E}|_\Sigma$ is a weak solution of the following Euler–Lagrange equations:

\[ H_\Phi\varphi_k = \sum_{\ell=1}^{N} \lambda_{k\ell}\varphi_\ell, \quad k = 1, \dots, N. \tag{1.17} \]

Here,

\[ H_\Phi\psi = H_0\psi - \alpha Z(\mu * V)\psi + \alpha(\rho * V)\psi - \alpha\int_{\mathbb{R}^3} R(x,y)\psi(y)V(x-y)\,dy. \tag{1.18} \]

Since $H_\Phi$ is self-adjoint from $H^{1/2}$ to its dual $H^{-1/2}$, $\Lambda = (\lambda_{k\ell})$ is a self-adjoint (N × N) complex matrix. It is the matrix of Lagrange multipliers associated to the constraints $(\varphi_\ell, \varphi_k)_{L^2} = \delta_{k\ell}$. For $\Phi \in \Sigma$ a critical point whose matrix of multipliers is Λ, and $u \in U(N)$, the matrix of multipliers of the critical point $\tilde\Phi = u \cdot \Phi$ is $\tilde\Lambda = u\Lambda u^*$. So any U(N)-orbit of critical points of $\mathcal{E}|_\Sigma$ contains a weak solution of the following system of nonlinear eigenvalue problems, called the Dirac–Fock equations:

\[ H_\Phi\varphi_k = \varepsilon_k\varphi_k, \quad k = 1, \dots, N. \tag{1.19} \]

Physically, $H_\Phi$ represents the Hamiltonian of an electron in the mean field due to the nuclei and the electrons. The eigenvalues $\varepsilon_1, \dots, \varepsilon_N$ are the energies of each electron in this mean field. In the HF model, the Euler–Lagrange equations have a form similar to (1.19), with −Δ instead of $H_0$ in the expression of $H_\Phi$. The physically interesting states correspond to $\varepsilon_1 \le \cdots \le \varepsilon_N < 0$, and the ground state minimizes $\mathcal{E}_{HF}$ on Σ, which implies that $\varepsilon_1, \dots, \varepsilon_N$ are the N first eigenvalues of $H_\Phi$ (see [36]).

In the DF model, the physically interesting states correspond to $0 < \varepsilon_k < 1$: a positive energy inferior to the rest mass of the electron. The definition of a ground state is less clear: the DF functional has no minimum on Σ. This fact is at the origin of serious difficulties in the numerical implementation, as well as the interpretation, of the DF equations (see [10] and references therein). One way to deal with this problem is to restrict the energy functional to the space $(\Lambda^+ E)^N$, $\Lambda^+$ being, as defined above, the projector on the space of positive states of the free Dirac operator [43, 44]: this corresponds to a Hartree–Fock reduction of the already mentioned Brown-Ravenhall model. The associated Euler–Lagrange equations are the "projected" Dirac–Fock equations

\[ \Lambda^+ H_\Phi\Lambda^+\varphi_k = \varepsilon_k\varphi_k. \tag{1.20} \]

Note that, in the case $\varepsilon_k > 0$, (1.19) can be written formally as

\[ \Lambda_\Phi^+ H_\Phi\Lambda_\Phi^+\varphi_k = \varepsilon_k\varphi_k. \tag{1.21} \]

Here, $\Lambda_\Phi^+$ is the projector on the positive space associated to $H_\Phi$. Numerical computations using (1.19) rather than (1.20) give results that are in very good agreement with experimental data (see e.g. [23, 38]). This is not very surprising: in the presence of strong electric fields, the projector $\Lambda_\Phi^+$ seems physically more adequate than the free-energy projector $\Lambda^+$ (see [28]). In [41] Mittleman derived the DF equations with "self-consistent projector" (1.21), from a variational procedure applied to a QED Hamiltonian in Fock space, followed by the standard Hartree–Fock approximation.

Important existence results are known on the HF equations. Lieb and Simon [36] proved the existence of a ground state of $\mathcal{E}_{HF}$ on Σ, provided N < Z + 1, where Z is the total nuclear charge. P.-L. Lions [39] proved the existence of infinitely many excited states if N ≤ Z. Using inequality (1.7), one can easily extend the results of [36, 39] to the projected equations (1.20), assuming that

\[ \alpha\max(Z, N) < \frac{2}{\pi/2 + 2/\pi}, \quad N < Z + 1. \]

The only difference is that $\frac{1}{|x|}$ is not a compact perturbation of $H_0$, but this does not create any important difficulty. In the present paper, we give the first existence result for solutions of the DF equations "without projector" (1.19). Our assumptions are

\[ \alpha\max(Z, 3N - 1) < \frac{2}{\pi/2 + 2/\pi}, \quad N < Z + 1. \]

Since we find positive eigenvalues $\varepsilon_k$, the equations we solve are formally equivalent to the DF equations with "self-consistent projector" (1.21). The condition $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$ is rather restrictive, and we do not have a clear definition of the "ground state". But we hope that this first study will stimulate further mathematical research on Dirac–Fock. Our main theorem is the following:

Theorem 1.2. Assume that $\alpha\max(Z, 3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, N < Z + 1, with Z > 0 the total nuclear charge. The nuclear charge density is Zµ, where µ is a fixed probability measure on $\mathbb{R}^3$. Then, there is an infinite sequence $(\Phi^j)_{j\ge 0}$ of critical points of the DF functional $\mathcal{E}$ on

\[ \Sigma = \big\{\Phi \in E^N \,/\, \mathrm{Gram}\,\Phi = \mathbb{1}\big\}. \]

The functions $\varphi_1^j, \dots, \varphi_N^j$ satisfy the normalization constraints (1.11) and they are strong solutions, in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4) \cap \bigcap_{1\le q<3/2} W^{1,q}(\mathbb{R}^3, \mathbb{C}^4)$, of the Dirac–Fock equations

\[ H_{\Phi^j}\varphi_k^j = \varepsilon_k^j\varphi_k^j, \quad 1 \le k \le N, \tag{1.22} \]
\[ 0 < \varepsilon_1^j \le \cdots \le \varepsilon_N^j < 1. \tag{1.23} \]

Moreover,

\[ 0 < \mathcal{E}(\Phi^j) < N, \tag{1.24} \]
\[ \lim_{j\to\infty} \mathcal{E}(\Phi^j) = N. \tag{1.25} \]

Remark 1. With the physical value $\alpha \approx \frac{1}{137}$ and Z an integer, our conditions become

Z ≤ 124, N ≤ 41, N ≤ Z.

Remark 2. Since µ is arbitrary, our assumptions contain the case of point-like nuclei as well as more realistic nuclear potentials, of the form $-\alpha\sum_i \rho_i(x) * \frac{1}{|x|}$, where $\rho_i \in L^\infty \cap L^1$, $\rho_i \ge 0$, $\sum_i \int_{\mathbb{R}^3} \rho_i = Z$.
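The integer bounds of Remark 1 follow from the hypothesis $\alpha\max(Z, 3N-1) < 2/(\pi/2 + 2/\pi)$ by simple arithmetic. The sketch below, in plain Python, reproduces that arithmetic; the function names are illustrative assumptions and not part of the paper, and the remaining condition N ≤ Z is left to the caller.

```python
import math

def hardy_threshold():
    """The constant 2 / (pi/2 + 2/pi) appearing in the hypotheses."""
    return 2.0 / (math.pi / 2.0 + 2.0 / math.pi)

def largest_int_below(x):
    """Largest integer n with n < x (strict inequality)."""
    n = math.floor(x)
    return n - 1 if n == x else n

def admissible_bounds(alpha):
    """Largest integers Z and N with alpha*Z < c and alpha*(3N - 1) < c,
    where c = hardy_threshold(). The constraint N <= Z is not applied here."""
    t = hardy_threshold() / alpha
    z_max = largest_int_below(t)
    n_max = largest_int_below((t + 1.0) / 3.0)
    return z_max, n_max
```

With `alpha = 1 / 137`, the threshold $2/(\pi/2 + 2/\pi) \approx 0.906$ gives $Z < 124.13$ and $3N - 1 < 124.13$, so `admissible_bounds(1 / 137)` returns `(124, 41)`, in agreement with Remark 1.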

Remark 3. The first solution $\Phi^0$ is a good candidate for a ground state. Indeed, in the nonrelativistic limit (α → 0), it converges, after rescaling, to a ground state of Hartree–Fock. This will be proved in a forthcoming paper.

Remark 4. In the case N = 1, the Dirac–Fock equations are linear and correspond to an eigenvalue problem for Dirac operators with scalar potentials. For variational results in this case, see [17, 25, 16].

Remark 5. The functions $\varphi_1^j, \dots, \varphi_N^j$ of Theorem 1.2 are smooth outside supp µ. Moreover, if supp µ is compact, they decay exponentially fast, as well as their derivatives, when |x| goes to infinity.

Our main theorem is the analogue, for the DF model, of P.-L. Lions' result for HF [39]. In order to control the Lagrange multipliers $\varepsilon_k$ (they should be negative in his case), Lions uses an estimate on the Morse index of the critical points. Such an estimate can be obtained for a critical point associated to a "finite-dimensional" min-max, if the functional satisfies the Palais–Smale compactness condition (see e.g. [2, 3, 4, 11, 49]). However, the HF functional does not satisfy the Palais–Smale compactness condition: since the essential spectrum of −Δ is [0, ∞), only Palais–Smale sequences with negative Lagrange multipliers $\varepsilon_k$ are precompact. Lions works on approximate functionals that satisfy Palais–Smale, finds critical points of these functionals with Morse index estimates, and passes to the limit. In [19, 21], Fang and Ghoussoub give general existence results on Palais–Smale sequences with Morse-type information, for functionals that do not satisfy Palais–Smale. As an application, they rewrite Lions' proof, working directly with the HF functional.

For DF, we also need a control on $\varepsilon_k$: $0 < \varepsilon_k < 1$. Moreover, the essential spectrum of $H_0$ is $\mathbb{R} \setminus (-1, 1)$, so that the only precompact Palais–Smale sequences for DF are such that $|\varepsilon_k| < 1$. So a natural approach is to adapt the above ideas to DF. To realize this program, we faced several difficulties.

The first (and smallest) difficulty is that $\mu * \frac{1}{|x|}$ is not a compact perturbation of $H_0$. This creates some technical problems. They are easily solved, replacing $V = \frac{1}{|x|}$ by a regularized potential $V_\nu$. At the end of the proof, we can pass to the limit ν → 0, thanks to inequality (1.7).

The second difficulty is that the Morse index estimates can only give upper bounds on the multipliers in [39]. But in the DF case, we want to ensure that $\varepsilon_k > 0$. To overcome this problem, we replace the constraint $\mathrm{Gram}\,\Phi = \mathbb{1}$ by a penalization term $\pi_p(\Phi)$, subtracted from the energy functional. The new Euler equations are $H_\Phi\varphi_k = \partial_{\varphi_k}\pi_p$ (no more Lagrange multipliers). The eigenvalue $\varepsilon_k$ is now an explicit function of $\varphi_k$, which appears in the expression of the derivative $\partial_{\varphi_k}\pi_p$. This function has only positive values, so we automatically get $\varepsilon_k > 0$.

The third difficulty with DF is that all critical points have an infinite Morse index. This kind of problem is often encountered in the theory of Hamiltonian systems and in certain elliptic PDEs. One way of dealing with it is to use a concavity property of the functional to get rid of the "negative directions": see e.g. [1, 7, 9]. We shall use this method. We get a reduced functional $I_{\nu,p}$. A min-max argument gives us Palais–Smale sequences $\Phi^{n,\nu,p}$ for $I_{\nu,p}$ with finite "Morse index", thanks to [19]. Adapting the arguments of [39], we prove that the $\varepsilon_k$'s of such sequences are smaller than 1. Then we pass to the limit (ν, n, p) → (0, ∞, ∞), and get the desired solutions of DF, with $0 < \varepsilon_k < 1$.

Our concavity argument works only if $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. In the last 20 years, very powerful methods have been developed to deal with strongly indefinite functionals that do not present any concavity property [42, 13, 20, 29]. This suggests that it might be possible to weaken the assumptions on $N$ in Theorem 1.2.

2. Sketch of the Proof of Theorem 1.2

As announced in the Introduction, we replace $V(x) = \frac{1}{|x|}$, in the expression of $\mathcal{E}$ (1.13), by the regularized potential
\[
V_\nu(x) = \frac{1}{(2\pi\nu)^{3/2}}\, e^{-|x|^2/2\nu} * V(x), \qquad \nu > 0. \tag{2.1}
\]

This replacement is made for the attractive potential of the nucleus, as well as for the electronic repulsion and exchange terms. The regularized DF functional is denoted $\mathcal{E}_\nu$, and the associated one-particle Hamiltonian (1.18) is denoted $\overline{H}^{\nu}_{\Phi}$. The Gaussian $\frac{1}{(2\pi\nu)^{3/2}} e^{-|x|^2/2\nu}$ is normalized in $L^1$, so that $V_\nu$ satisfies the same inequalities (1.7–8–9) as $V$. We also replace the constraint “$\Phi \in \Sigma$” by a penalization term $\pi_p$. The penalization parameter $p$ is a positive integer. The penalized functional
\[
F_{\nu,p} = \mathcal{E}_\nu - \pi_p \tag{2.2}
\]
is defined in the domain
\[
A = \big\{ \Phi \in E^N \ /\ 0 < \mathrm{Gram}\,\Phi < \mathbf{1} \big\}, \tag{2.3}
\]

where $\mathrm{Gram}\,\Phi$ is the $N \times N$ matrix $\big((\varphi_i, \varphi_j)_{L^2}\big)_{1 \le i,j \le N}$. The penalization term has the form
\[
\pi_p(\Phi) = \mathrm{tr}\Big[ (\mathrm{Gram}\,\Phi)^p \big(\mathbf{1} - \mathrm{Gram}\,\Phi\big)^{-1} \Big]. \tag{2.4}
\]
Note that $F_{\nu,p}$ is invariant under the U(N) action (1.15). It is easy to see that $F_{\nu,p}$ is well-defined and smooth on $A$. We are going to construct approximate critical points of $F_{\nu,p}$. As $\nu \to 0$ and $p \to \infty$, these points will converge to critical points of $\mathcal{E}|_\Sigma$. Any U(N) orbit in $A$ contains a point $\Phi$ such that $\mathrm{Gram}\,\Phi$ is diagonal, with eigenvalues in nondecreasing order:
\[
\mathrm{Gram}\,\Phi = \mathrm{Diag}(\sigma_1, \dots, \sigma_N), \qquad 0 < \sigma_1 \le \cdots \le \sigma_N < 1. \tag{2.5}
\]

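We call $O$ the set of points $\Phi \in A$ satisfying (2.5). If $\Phi \in O$, then
\[
\frac{\partial F_{\nu,p}}{\partial \varphi_k}(\Phi) = \overline{H}^{\nu}_{\Phi} \varphi_k - \varepsilon_k \varphi_k, \tag{2.6}
\]
with
\[
\varepsilon_k = e_p(\sigma_k), \qquad e_p(x) = \frac{d}{dx}\,\frac{x^p}{1 - x} = \frac{p x^{p-1} - (p-1) x^p}{(1 - x)^2}. \tag{2.7}
\]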
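The positivity and monotonicity of $e_p$ on $(0,1)$ can be checked from a power series. The following is a short verification sketch that uses only the definition (2.7); it is not taken from the original text.

```latex
% For 0 < x < 1 the geometric series gives x^p/(1-x) = \sum_{n \ge p} x^n.
% Differentiating term by term:
\[
e_p(x) = \frac{d}{dx}\,\frac{x^p}{1-x} = \sum_{n \ge p} n\, x^{n-1},
\qquad
e_p'(x) = \sum_{n \ge \max(p,2)} n(n-1)\, x^{n-2}.
\]
% Every coefficient is positive, hence e_p > 0 and e_p' > 0 on (0,1)
% for every integer p >= 1.
```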

The function $e_p$ is positive and increasing on $(0, 1)$, so that $0 < \varepsilon_1 \le \cdots \le \varepsilon_N$. This is one of the advantages of the penalized functional $F_{\nu,p}$: its critical points in $O$ are solutions of a nonlinear eigenvalue problem, with positive eigenvalues.

In the proof of Theorem 1.2, we need to control not only the critical points of $F_{\nu,p}$, but also its Palais–Smale sequences. Of course, we just need to study Palais–Smale sequences in $O$, thanks to the U(N) invariance. Unfortunately, the Palais–Smale condition does not hold for $F_{\nu,p}$, exactly as in the case of the HF functional. But it can be replaced by the following lemma, which is related to the spectral properties of the Dirac operator with a potential. Its proof is based on inequality (1.7).

Lemma 2.1 (Convergence of approximate solutions). Assume that $\alpha \max(Z, N) < \frac{2}{\pi/2 + 2/\pi}$.

(a) Let $(\nu_n)$ be a sequence of real numbers in $(0, 1)$, $(p_n)$ a sequence of positive integers, and $(\Phi_n)$ a sequence in $O$, i.e. such that $\mathrm{Gram}\,\Phi_n = \mathrm{Diag}(\sigma_{1,n}, \dots, \sigma_{N,n})$, $0 < \sigma_{1,n} \le \cdots \le \sigma_{N,n} < 1$. We denote $\varepsilon_{k,n} = e_{p_n}(\sigma_{k,n})$, with $e_{p_n}(x) = \frac{d}{dx} \frac{x^{p_n}}{1 - x}$. We assume that
\[
F'_{\nu_n, p_n}(\Phi_n) \xrightarrow[n \to \infty]{} 0 \tag{2.8}
\]
for the strong topology of $\big[H^{-1/2}(\mathbb{R}^3, \mathbb{C}^4)\big]^N = (E^N)^*$. We also assume that
\[
\liminf_{n \to \infty} \sigma_{1,n} > 0. \tag{2.9}
\]
Then,
\[
\liminf_{n \to \infty} \varepsilon_{1,n} \ge h_0, \tag{2.10}
\]
where $h_0 \in (0, 1)$ is a constant which depends only on $\alpha Z$, $\alpha N$.

(b) If, moreover,
\[
\limsup_{n \to \infty} \varepsilon_{N,n} < 1, \tag{2.11}
\]

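then, after extraction of a subsequence, the functions $\varphi_{k,n}$ converge to $N$ functions $\varphi_k \in E \cap \bigcap_{1 \le q < 3/2} W^{1,q}(\mathbb{R}^3, \mathbb{C}^4)$, for the strong $H^{1/2}$ topology.

(b.1) In the case $\nu_n \to \nu \in (0, 1)$ and $p_n = p$ for $n$ large, $\Phi = (\varphi_1, \dots, \varphi_N)$ is a critical point of $F_{\nu,p}$ in $O$. Moreover, $F_{\nu,p}(\Phi) = \lim_{n \to \infty} F_{\nu_n, p_n}(\Phi_n)$.

(b.2) In the case $\nu_n \to 0$ and $p_n \to +\infty$, $\varphi_1, \dots, \varphi_N$ satisfy the orthonormality constraints $(\varphi_l, \varphi_k)_{L^2} = \delta_{kl}$. They are strong solutions, in $\Sigma \cap \bigcap_{1 \le q < 3/2} W^{1,q}(\mathbb{R}^3, \mathbb{C}^4)$, of the Dirac–Fock equations
\[
\overline{H}_{\Phi} \varphi_k = \varepsilon_k \varphi_k, \qquad \varepsilon_k = \lim_{n \to \infty} \varepsilon_{k,n} \in [h_0, 1), \tag{2.12}
\]
and the DF energy of $\Phi$ is
\[
\mathcal{E}(\Phi) = \lim_{n \to \infty} F_{\nu_n, p_n}(\Phi_n). \tag{2.13}
\]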

Lemma 2.1 will be proved in Sect. 3. Our problem now is to find sequences $\Phi_n$ satisfying the assumptions of this lemma. Assumption (2.11) is the most difficult to check. In the HF case, a similar question was solved by P.-L. Lions [39]: if $\Phi^*$ is a critical point of $\mathcal{E}_{HF}$ on $\Sigma$, the associated multipliers $\varepsilon_1 \le \cdots \le \varepsilon_N$ are eigenvalues of $\overline{H}^{HF}_{\Phi^*}$. Let us denote $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le \cdots < 0$ the sequence of negative eigenvalues of $\overline{H}^{HF}_{\Phi^*} = -\Delta - \frac{Z}{|x|} + \dots$ (assuming $N < Z$). If $\varepsilon_k < 0$, there is an integer $n(k)$ such that $\varepsilon_k = \lambda_{n(k)}$. Moreover, we may impose $n(k + 1) > n(k)$. If $\varepsilon_k \ge 0$, we take $n(k) = +\infty$. Lions proved the following inequality:
\[
m(\Phi^*) \ge \max_{1 \le k \le N} \big[ n(k) - k \big], \tag{2.14}
\]
where $m(\Phi^*)$ is the Morse index of $\Phi^*$. As a consequence, if $\Phi^*$ is a minimizer of $\mathcal{E}_{HF}$ on $\Sigma$, then $n(k) = k$ ($\forall k$). This particular case of (2.14) was proved earlier by Lieb and Simon [36].

In the DF case, we would also like to control the $\varepsilon_k$’s, using the Morse index for $F_{\nu,p}$. Unfortunately, the functional $F_{\nu,p}$ is strongly indefinite, as mentioned in the introduction. We overcome this difficulty thanks to a concavity argument, as in [1, 7, 9].

Lemma 2.2 (Concavity in the $E^-$ directions). Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Then there is a constant $s > 0$, independent of $\nu$, $p$, such that, for any $\Phi \in A$ and $\Psi^- \in (E^-)^N$,
\[
F''_{\nu,p}(\Phi) \cdot \big[\Psi^-, \Psi^-\big] \le -s \sum_{k=1}^{N} \|\psi_k^-\|_E^2. \tag{2.15}
\]

Lemma 2.2 will be proved in Sect. 4, where an explicit formula for $F''_{\nu,p}$ will be given. Now, let
\[
A^+ = A \cap (E^+)^N = \big\{ \Phi^+ \in (E^+)^N \ /\ 0 < \mathrm{Gram}(\Phi^+) < \mathbf{1} \big\}. \tag{2.16}
\]
For $\Phi^+ \in A^+$, let
\[
\Gamma(\Phi^+) = \big\{ \chi^- \in (E^-)^N \ /\ \mathrm{Gram}(\Phi^+) + \mathrm{Gram}(\chi^-) < \mathbf{1} \big\}
= \big\{ \chi^- \in (E^-)^N \ /\ \Phi^+ + \chi^- \in A \big\}. \tag{2.17}
\]

One can easily see that $\Gamma(\Phi^+)$ is an open convex subset of $(E^-)^N$, and that $F_{\nu,p}(\Phi^+ + \chi^-)$ converges to $-\infty$ as $\chi^-$ approaches the boundary of $\Gamma(\Phi^+)$, for $\Phi^+$ fixed. So, Lemma 2.2 has the following consequence:

Corollary 2.3. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Then, for any $\Phi^+ \in A^+$, the functional
\[
\chi^- \in \Gamma(\Phi^+) \mapsto F_{\nu,p}(\Phi^+ + \chi^-)
\]
has a unique maximizer $h_{\nu,p}(\Phi^+) \in \Gamma(\Phi^+)$. The mapping $h_{\nu,p} : A^+ \to (E^-)^N$ is smooth for the $(H^{1/2})^N$ norm, and equivariant under the U(N) action (1.15).

We denote
\[
I_{\nu,p}(\Phi^+) = F_{\nu,p}\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big), \qquad \Phi^+ \in A^+. \tag{2.18}
\]

$I_{\nu,p}$ is well-defined and smooth on $A^+$. Since $h_{\nu,p}$ is U(N) equivariant, $I_{\nu,p}$ is invariant, and any U(N) orbit in $A^+$ contains a point $\Phi^+$ such that $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$ satisfies (2.5). By definition of $h_{\nu,p}$, for all $\Psi^- \in (E^-)^N$, $F'_{\nu,p}\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big) \cdot \Psi^- = 0$. As a consequence, if $\Phi^+$ is a critical point of $I_{\nu,p}$, then $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$ is a critical point of $F_{\nu,p}$. So we just have to look for critical points of $I_{\nu,p}$. This is much more comfortable, because this reduced functional is not strongly indefinite.

We now give a relationship between Morse-type information on a Palais–Smale sequence $\Phi^+_n$ for $I_{\nu,p}$, and the estimate (2.11) on the $\varepsilon_k$’s. Unfortunately, we do not have a precise inequality like (2.14).

Lemma 2.4 (The Morse index controls the $\varepsilon_k$’s). Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < Z + 1$. Let $\nu \in (0, 1)$, $p \ge 2$, $M > 0$, and $(\Phi^+_n)$ a sequence in $A^+$. Denoting $\Phi_n = \Phi^+_n + h_{\nu,p}(\Phi^+_n)$, we assume that $\Phi_n \in O$, i.e. $\mathrm{Gram}\,\Phi_n = \mathrm{Diag}(\sigma_{1,n}, \dots, \sigma_{N,n})$, with $0 < \sigma_{1,n} \le \cdots \le \sigma_{N,n} < 1$. Suppose that
\[
I_{\nu,p}(\Phi^+_n) \le M, \qquad I'_{\nu,p}(\Phi^+_n) \to 0, \qquad \liminf \sigma_{1,n} > 0, \tag{2.19}
\]
and that the quadratic form on $(E^+)^N$:
\[
Q_n(\Psi^+) = I''_{\nu,p}(\Phi^+_n) \cdot \big[\Psi^+, \Psi^+\big] + \delta_n \sum_{k=1}^{N} \|\psi_k^+\|_E^2 \tag{2.20}
\]
has a negative space of dimension at most $m$, for a sequence $\delta_n \to 0$. Then, there is a constant $b_m \in (0, 1)$, independent of $\nu$, $p$, $M$, $\Phi^+_n$, $\delta_n$, such that
\[
\limsup_{n \to \infty} \varepsilon_{N,n} \le b_m, \quad \text{with } \varepsilon_{N,n} = e_p(\sigma_{N,n}). \tag{2.21}
\]

The last step in the proof of Theorem 1.2 is to find Palais–Smale sequences for $I_{\nu,p}$, with Morse-type information. For this purpose, we look for positive min-max levels of $I_{\nu,p}$ in $A^+$. Note that $A^+$ is an open subset of $(E^+)^N$, whose boundary is $\partial A^+ = G_1 \cup G_2$, with
\[
\begin{cases}
G_1 = \big\{ \Phi^+ \in (E^+)^N \ /\ \mathrm{Gram}\,\Phi^+ \le \mathbf{1},\ \det \mathrm{Gram}\,\Phi^+ = 0 \big\}, \\
G_2 = \big\{ \Phi^+ \in (E^+)^N \ /\ \mathrm{Gram}\,\Phi^+ \le \mathbf{1},\ \det(\mathbf{1} - \mathrm{Gram}\,\Phi^+) = 0 \big\}.
\end{cases}
\]
If $I_{\nu,p}$ were negative for $\Phi^+$ close to $\partial A^+$, the existence of positive min-max levels for $I_{\nu,p}$ would be a direct consequence of the topology of $(A^+, \partial A^+)$. We have $I_{\nu,p}(\Phi^+) \to -\infty$ as $\mathrm{dist}_{L^2}(\Phi^+, G_2) \to 0$, with $\|\Phi^+\|_{E^N}$ bounded. But $I_{\nu,p}(\Phi^+)$ may remain positive when $\Phi^+$ is close to $G_1$. Following [12, 13, 40], we solve this difficulty by studying the gradient vector field of $I_{\nu,p}$ near $G_1$. We prove that this field “points inward”, in the following sense:

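Lemma 2.5 (A pseudo-gradient pointing inward near $G_1$). Assume that $\alpha \max(Z, 3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Take $\nu > 0$. Then there are $d(\nu), e(\nu) > 0$ such that, if $\Phi^+ \in A^+$ satisfies
\[
\det(\mathrm{Gram}\,\Phi^+) \in [d(\nu), 2d(\nu)],
\]
then one can find a vector $X \in (E^+)^N$, with
\[
\begin{cases}
I'_{\nu,p}(\Phi^+) \cdot X \ge e(\nu)\, \|X\|_{(E)^N} \quad (\forall p \ge 2), \\
\Delta'(\Phi^+) \cdot X > 0, \quad \text{where } \Delta(\Phi^+) = \det(\mathrm{Gram}\,\Phi^+).
\end{cases} \tag{2.22}
\]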

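Note that in Lemma 2.5 there is a constant, $d(\nu)$, which depends on $\nu$. We have been unable to make $d$ independent of $\nu$. Now, let $\alpha_\nu \in C^\infty\big((d(\nu), 1), \mathbb{R}\big)$ be such that
\[
\begin{cases}
\alpha_\nu(x) = 0, & \forall x \ge 2d(\nu), \\
\alpha'_\nu(x) > 0, & \forall x < 2d(\nu), \\
\alpha_\nu(x) \to -\infty & \text{as } x \to d(\nu).
\end{cases} \tag{2.23}
\]
Let $\beta \in C^\infty(\mathbb{R}, \mathbb{R})$ be such that
\[
\begin{cases}
\beta \equiv -1 & \text{on } (-\infty, -1), \\
\beta(t) = t, & \forall t \ge 0, \\
\beta(t) \le 0, & \forall t \le 0.
\end{cases} \tag{2.24}
\]
We define a new functional $J_{\nu,p}$, by
\[
\begin{cases}
J_{\nu,p}(\Phi^+) = \beta\big( I_{\nu,p}(\Phi^+) + \alpha_\nu \circ \Delta(\Phi^+) \big) & \text{if } \Phi^+ \in A^+ \text{ and } \Delta(\Phi^+) > d(\nu), \\
J_{\nu,p}(\Phi^+) = -1 & \text{otherwise.}
\end{cases} \tag{2.25}
\]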

It is easy to see that $J_{\nu,p}$ is smooth on $(E^+)^N$. If $\Phi^+$ is a critical point of $J_{\nu,p}$ with $J_{\nu,p}(\Phi^+) \ge 0$, then $\Phi^+$ is also a critical point of $I_{\nu,p} + \alpha_\nu \circ \Delta$, at the same level. From (2.22)–(2.23), this is only possible if $\Delta(\Phi^+) > 2d(\nu)$, hence $\Phi^+$ is a critical point of $I_{\nu,p}$ and $I_{\nu,p}$ coincides with $J_{\nu,p}$ in a neighborhood of $\Phi^+$. The same holds for Palais–Smale sequences. So we can look for positive min-max levels of $J_{\nu,p}$ instead of $I_{\nu,p}$. This is much more convenient, because $J_{\nu,p}$ is defined on $(E^+)^N$, with $J_{\nu,p} = -1$ on $\partial A^+$. $J_{\nu,p}$ is invariant under the U(N) action (1.15). For $F$ a finite dimensional complex subspace of $E^+$, let
\[
D(F) = \big\{ \Phi^+ \in F^N \ /\ \mathrm{Gram}\,\Phi^+ \le \mathbf{1} \big\}. \tag{2.26}
\]
We say that a homotopy $h \in C\big([0, 1] \times (E^+)^N, (E^+)^N\big)$ is “admissible” if
\[
\begin{cases}
h(\lambda, u \cdot \Phi^+) = u \cdot h(\lambda, \Phi^+), & \forall u \in \mathrm{U}(N),\ \forall (\lambda, \Phi^+) \in [0, 1] \times (E^+)^N, \\
h(\lambda, \Phi^+) = \Phi^+, & \forall \lambda \in [0, 1],\ \forall \Phi^+ \in \partial A^+.
\end{cases} \tag{2.27}
\]

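We define the class of sets
\[
\mathcal{Q}(F) = \big\{ Q \subset (E^+)^N \ /\ \text{there is } h, \text{ admissible, such that } h(0, \cdot) = \mathrm{Id}_{(E^+)^N},\ h(1, D(F)) = Q \big\}. \tag{2.28}
\]
Finally, let
\[
c_{\nu,p}(F) = \inf_{Q \in \mathcal{Q}(F)}\ \max_{\Phi^+ \in Q} J_{\nu,p}(\Phi^+). \tag{2.29}
\]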

We have

Lemma 2.6 (The min–max levels). Assume that $\alpha \max(Z, 3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < 2Z + 1$. For any integer $j \ge 0$, there is a complex vector space $F_j \subset E^+$ with $\dim_{\mathbb{C}} F_j = N + j$, and three constants $0 < a(j) < \bar{a}(j) < N$, $p(j) \ge 1$, such that $a(j) \to N$ as $j \to \infty$, and
\[
a(j) \le c_{\nu,p}(F_j) \le \bar{a}(j), \qquad (\forall \nu \in (0, 1))\ (\forall p \ge p(j)). \tag{2.30}
\]
Note that the action of U(N) is free on the set
\[
\big\{ \Phi^+ \in (E^+)^N \ /\ J_{\nu,p}(\Phi^+) > 0 \big\}.
\]

Moreover, $D(F_j) / \mathrm{U}(N)$ has dimension $m_j = 2Nj + N^2$. It then follows from arguments by Fang and Ghoussoub [19, 21] that there is a Palais–Smale sequence at the level $c_{\nu,p}(F_j)$, with Morse-type information:

Lemma 2.7 (Palais–Smale sequences with bounded Morse index). Assume that $\alpha \max(Z, 3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < Z + 1$. Take $F_j$ as in Lemma 2.6, and $\nu \in (0, 1)$, $p \ge p(j)$. Then there is a sequence $\Phi^+_n \in A^+$, with
\[
I'_{\nu,p}(\Phi^+_n) \to 0, \qquad I_{\nu,p}(\Phi^+_n) \to c_{\nu,p}(F_j), \qquad \Delta(\Phi^+_n) > d(\nu), \tag{2.31}
\]
and a sequence $\delta_n > 0$, $\delta_n \to 0$, such that the quadratic form
\[
Q_n(\Psi^+) = I''_{\nu,p}(\Phi^+_n) \cdot \big[\Psi^+, \Psi^+\big] + \delta_n \sum_{k=1}^{N} \|\psi_k^+\|_E^2, \qquad \Psi^+ \in (E^+)^N, \tag{2.32}
\]
has a negative space of dimension at most $m_j = 2Nj + N^2$.
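The value $m_j = 2Nj + N^2$ used in Lemmas 2.6 and 2.7 can be recovered by a heuristic dimension count. The sketch below assumes that the dimension is computed over the part of $D(F_j)$ where the U(N) action is free; it is an illustration, not part of the original argument.

```latex
% D(F_j) sits in F_j^N, and dim_C F_j = N + j, so
\[
\dim_{\mathbb{R}} F_j^N = 2N(N + j), \qquad \dim_{\mathbb{R}} \mathrm{U}(N) = N^2 .
\]
% Quotienting by a free action of U(N) removes N^2 real dimensions:
\[
m_j = 2N(N + j) - N^2 = 2Nj + N^2 .
\]
```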

Proof of Theorem 1.2. We now prove Theorem 1.2 as a direct consequence of Lemmas 2.1, 2.4, 2.6 and 2.7. Let $j \ge 0$, $p \ge \max(3, p(j))$ be two integers. Take $\nu = \frac{1}{p} \in (0, 1)$. There is a sequence $\Phi^+_n$ satisfying (2.31–32) of Lemma 2.7 and such that $\Phi_n = \Phi^+_n + h_{1/p,p}(\Phi^+_n)$ satisfies (2.5). Then (2.21) of Lemma 2.4 holds, with $m = m_j$. So, from (b.1) of Lemma 2.1, $\Phi_n$ converges, after extraction of a subsequence, to a critical point $\Phi^{j,p}$ of $F_{1/p,p}$, with
\[
F_{1/p,p}(\Phi^{j,p}) = c_{1/p,p}(F_j), \qquad \mathrm{Gram}\,\Phi^{j,p} = \mathrm{Diag}(\sigma_1^p, \dots, \sigma_N^p),
\]
\[
0 < \sigma_1^p \le \cdots \le \sigma_N^p < 1, \qquad h_0 \le e_p(\sigma_k^p) \le b_{m_j} < 1,
\]
where the superscript $p$ on $\sigma_k^p$ is an index, not a power.

Since $e_p$ converges uniformly to 0 on any interval $[0, s]$, $s < 1$, we have $\lim_{p \to \infty} \sigma_1^p = 1$.
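The limit of the smallest Gram eigenvalue can be spelled out as follows. This is a short sketch based only on the formula (2.7) for $e_p$ and the lower bound $e_p(\sigma_1^p) \ge h_0$; here $\sigma_1^p$ denotes the smallest Gram eigenvalue of $\Phi^{j,p}$ (the superscript is an index).

```latex
% Uniform bound on [0, s] for fixed s < 1, from (2.7):
\[
0 \le e_p(x) = \frac{p x^{p-1} - (p-1) x^p}{(1-x)^2}
\le \frac{p\, s^{p-1}}{(1-s)^2} \xrightarrow[p \to \infty]{} 0,
\qquad x \in [0, s].
\]
% Since e_p(\sigma_1^p) >= h_0 > 0, for p large \sigma_1^p cannot lie in [0, s]:
\[
\frac{p\, s^{p-1}}{(1-s)^2} < h_0 \ \Longrightarrow\ \sigma_1^p > s .
\]
% s < 1 is arbitrary and \sigma_1^p < 1, hence \sigma_1^p -> 1.
```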

Applying (b.2) of Lemma 2.1 to the sequence $\Phi^{j,p}$, for $j$ fixed, we find, after extraction of a subsequence, a limit $\Phi^j$ which satisfies the requirements (1.22–25) of Theorem 1.2.

In Sect. 3, we study the properties of the first derivative of $F_{\nu,p}$, and we prove Lemma 2.1. In Sect. 4, we compute the Hessian of $F_{\nu,p}$, and we prove Lemmas 2.2 and 2.4. In Sect. 5, we study the min-max argument, and prove Lemmas 2.5 and 2.6.

3. The First Derivative of $F_{\nu,p}$

Our first task is to prove property (P1) of Lemma 1.1. For this purpose, we write $H_0$ in Fourier space:
\[
\widehat{H_0 \psi}(\xi) = \Big( \sum_{k=1}^{3} \alpha_k \xi_k + \beta \Big) \hat{\psi}(\xi)
= \begin{pmatrix} 1 & \xi \cdot \sigma \\ \xi \cdot \sigma & -1 \end{pmatrix} \hat{\psi}(\xi). \tag{3.1}
\]
We denote by $\hat{H}_0(\xi)$ the matrix $\begin{pmatrix} 1 & \xi \cdot \sigma \\ \xi \cdot \sigma & -1 \end{pmatrix}$, with the standard notation $\xi \cdot \sigma = \sum_{k=1}^{3} \xi_k \sigma_k$.

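$\hat{H}_0(\xi)$ is a self-adjoint $4 \times 4$ matrix, and we have: $\hat{H}_0(\xi)^2 = (1 + |\xi|^2)\, \mathbf{1}_{\mathbb{C}^4}$. Taking
\[
\hat{\Lambda}^+(\xi) = \frac{\hat{H}_0(\xi) + \sqrt{1 + |\xi|^2}\, \mathbf{1}}{2\sqrt{1 + |\xi|^2}}
= \frac{1}{2} \begin{pmatrix}
\frac{1}{\sqrt{1 + |\xi|^2}} + 1 & \frac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} \\[4pt]
\frac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} & -\frac{1}{\sqrt{1 + |\xi|^2}} + 1
\end{pmatrix} \tag{3.2}
\]
and
\[
\hat{\Lambda}^-(\xi) = \frac{-\hat{H}_0(\xi) + \sqrt{1 + |\xi|^2}\, \mathbf{1}}{2\sqrt{1 + |\xi|^2}}
= \frac{1}{2} \begin{pmatrix}
-\frac{1}{\sqrt{1 + |\xi|^2}} + 1 & -\frac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} \\[4pt]
-\frac{\xi \cdot \sigma}{\sqrt{1 + |\xi|^2}} & \frac{1}{\sqrt{1 + |\xi|^2}} + 1
\end{pmatrix}, \tag{3.3}
\]
we find that $\hat{\Lambda}^+(\xi)$, $\hat{\Lambda}^-(\xi)$ are two orthogonal projectors of rank 2, with
\[
\begin{cases}
\hat{H}_0 \hat{\Lambda}^+(\xi) = \hat{\Lambda}^+ \hat{H}_0(\xi) = \sqrt{1 + |\xi|^2}\ \hat{\Lambda}^+(\xi), \\
\hat{H}_0 \hat{\Lambda}^-(\xi) = \hat{\Lambda}^- \hat{H}_0(\xi) = -\sqrt{1 + |\xi|^2}\ \hat{\Lambda}^-(\xi), \\
\hat{\Lambda}^+ \hat{\Lambda}^-(\xi) = \hat{\Lambda}^- \hat{\Lambda}^+(\xi) = 0, \\
\hat{\Lambda}^+(\xi) + \hat{\Lambda}^-(\xi) = \mathbf{1}_{\mathbb{C}^4}.
\end{cases} \tag{3.4}
\]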
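The projector properties in (3.4) follow from the single identity $\hat{H}_0(\xi)^2 = (1 + |\xi|^2)\mathbf{1}$. The computation below is a verification sketch; the abbreviation $\omega = \sqrt{1 + |\xi|^2}$ is introduced here for brevity and does not appear in the original text.

```latex
% With \omega = \sqrt{1+|\xi|^2} and \hat H_0^2 = \omega^2 \mathbf{1}:
\[
(\hat\Lambda^\pm)^2
= \frac{\hat H_0^2 \pm 2\omega \hat H_0 + \omega^2 \mathbf{1}}{4\omega^2}
= \frac{2\omega^2 \mathbf{1} \pm 2\omega \hat H_0}{4\omega^2}
= \hat\Lambda^\pm ,
\qquad
\hat\Lambda^+ \hat\Lambda^-
= \frac{\omega^2 \mathbf{1} - \hat H_0^2}{4\omega^2} = 0 .
\]
% Eigenvalue relations and rank:
\[
\hat H_0 \hat\Lambda^\pm
= \frac{\pm \hat H_0^2 + \omega \hat H_0}{2\omega}
= \pm\, \omega\, \hat\Lambda^\pm ,
\qquad
\operatorname{tr} \hat\Lambda^\pm
= \frac{\pm \operatorname{tr} \hat H_0 + 4\omega}{2\omega} = 2 ,
\]
% because tr \hat H_0 = 0 (its diagonal blocks are +1 and -1 times the
% 2x2 identity). A projector of trace 2 has rank 2.
```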

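Finally, if we define $\Lambda^+$, $\Lambda^-$ on $L^2(\mathbb{R}^3, \mathbb{C}^4)$ by
\[
\begin{cases}
\widehat{\Lambda^+ \psi}(\xi) = \hat{\Lambda}^+(\xi)\, \hat{\psi}(\xi), \\
\widehat{\Lambda^- \psi}(\xi) = \hat{\Lambda}^-(\xi)\, \hat{\psi}(\xi),
\end{cases} \tag{3.5}
\]
we easily obtain (P1) of Lemma 1.1, as a consequence of (3.4). We now give a first consequence of inequality (1.7).

Lemma 3.1. Assume that $\alpha \max(Z, N) < \frac{2}{\pi/2 + 2/\pi}$.

(i) There is a constant $h_0 > 0$, such that for any $\nu \in [0, 1]$, $\Phi \in E^N$ such that $\mathrm{Gram}(\Phi) \le \mathbf{1}$, and $\psi \in E$,
\[
h_0 \|\psi\|_{H^{1/2}} \le \|\overline{H}^{\nu}_{\Phi} \psi\|_{H^{-1/2}}. \tag{3.6}
\]
In other words, $\overline{H}^{\nu}_{\Phi}$ is a self-adjoint isomorphism between $H^{1/2}$ and its dual $H^{-1/2}$, whose inverse is bounded independently of $\Phi$, $\nu$.

(ii) Take $\nu \in [0, 1]$, $\Phi \in E^N$ with $\mathrm{Gram}(\Phi) \le \mathbf{1}$, and $\psi \in E$, such that $\overline{H}^{\nu}_{\Phi} \psi \in L^2(\mathbb{R}^3, \mathbb{C}^4)$. Then $\psi \in \bigcap_{1 \le q < 3/2} W^{1,q}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$.

(iii) Let $\nu_n \in [0, 1]$, and $\Phi_n \in E^N$ with $\mathrm{Gram}(\Phi_n) \le \mathbf{1}$. We assume that $\|\varphi_{k,n}\|_E$ is a bounded sequence, for $k = 1, \dots, N$. Let $\psi_n \in E$ be such that the sequence $\|\overline{H}^{\nu_n}_{\Phi_n} \psi_n\|_{L^2}$ is bounded. Then $\psi_n$ is precompact in $H^{1/2}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$.

Proof. (i) Let $\psi^+ = \Lambda^+ \psi$, $\psi^- = \Lambda^- \psi$. From inequality (1.7), we have
\[
\begin{cases}
(\psi^+, \overline{H}^{\nu}_{\Phi} \psi^+)_{E \times E^*} \ge \Big(1 - \frac{(\pi/2 + 2/\pi)\alpha Z}{2}\Big) \|\psi^+\|_E^2, \\[4pt]
-(\psi^-, \overline{H}^{\nu}_{\Phi} \psi^-)_{E \times E^*} \ge \Big(1 - \frac{(\pi/2 + 2/\pi)\alpha N}{2}\Big) \|\psi^-\|_E^2.
\end{cases} \tag{3.7}
\]
Let us choose $h_0 = 1 - \frac{(\pi/2 + 2/\pi)\alpha \max(Z, N)}{2}$. We get
\[
\|\psi\|_E\, \|\overline{H}^{\nu}_{\Phi} \psi\|_{H^{-1/2}} \ge \mathrm{Re}\,(\psi^+ - \psi^-, \overline{H}^{\nu}_{\Phi} \psi)_{E \times E^*}
= (\psi^+, \overline{H}^{\nu}_{\Phi} \psi^+)_{E \times E^*} - (\psi^-, \overline{H}^{\nu}_{\Phi} \psi^-)_{E \times E^*}
\ge h_0 \|\psi\|_E^2, \tag{3.8}
\]
hence (3.6). Now, $\overline{H}^{\nu}_{\Phi}$ is obviously self-adjoint from $H^{1/2}$ to its dual $H^{-1/2}$, so it is Fredholm of index 0. Equation (3.6) tells us that $\overline{H}^{\nu}_{\Phi}$ is injective, so it is an isomorphism, and the norm of its inverse is less than or equal to $1/h_0$.

(ii) $\psi$ and $\varphi_1, \dots, \varphi_N$ are in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, so they are in $L^r(\mathbb{R}^3, \mathbb{C}^4)$, for any $2 \le r < 3$. $V_\nu$ is in the Marcinkiewicz space $M^3$, so $(\mu * V_\nu)\psi$ is in any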

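$L^q_{loc}(\mathbb{R}^3, \mathbb{C}^4)$, $1 \le q < 3/2$, and $(\rho_\Phi * V_\nu)\psi - \int R_\Phi(x, y)\psi(y)\,dy$ is in any $L^{q'}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$, $1 \le q' < 3$. As a consequence,
\[
\psi = H_0^{-1} \Big[ \alpha Z (\mu * V_\nu)\psi - \alpha(\rho_\Phi * V_\nu)\psi + \alpha \int R_\Phi(x, y)\psi(y)\,dy + \overline{H}^{\nu}_{\Phi} \psi \Big]
\]
is in $\bigcap_{1 \le q < 3/2} W^{1,q}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$.

(iii) From (i), $\|\psi_n\|_E$ is a bounded sequence. $\|\varphi_{k,n}\|_E$ is also bounded. $\psi_n$ and $\varphi_{k,n}$ are thus precompact in any $L^r_{loc}(\mathbb{R}^3, \mathbb{C}^4)$, $2 \le r < 3$. So, refining the arguments in the proof of (ii), we see that $\psi_n$ is precompact in $H^{1/2}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$.

We are now going to prove Lemma 2.1. (a) First of all, (2.8) gives
\[
\overline{H}^{\nu_n}_{\Phi_n} \varphi_{k,n} = \varepsilon_{k,n} \varphi_{k,n} + \delta_{k,n}, \tag{3.9}
\]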

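with $\lim_{n \to \infty} \|\delta_{k,n}\|_{H^{-1/2}} = 0$. As a consequence,
\[
\|\overline{H}^{\nu_n}_{\Phi_n} \varphi_{k,n}\|_{H^{-1/2}} \le \varepsilon_{k,n} \|\varphi_{k,n}\|_{L^2} + \|\delta_{k,n}\|_{H^{-1/2}}. \tag{3.10}
\]
Using (3.6) of Lemma 3.1, we get
\[
h_0 \|\varphi_{k,n}\|_E \le \varepsilon_{k,n} \|\varphi_{k,n}\|_{L^2} + \|\delta_{k,n}\|_{H^{-1/2}}. \tag{3.11}
\]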
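Inequality (3.11) combines with assumption (2.9) to give the lower bound (2.10). The step can be sketched as follows, assuming that the $E$-norm dominates the $L^2$-norm (which holds if $\|\psi\|_E^2 = (\psi, \sqrt{1 - \Delta}\,\psi)_{L^2}$, the form suggested by (3.15)).

```latex
% Take k = 1 in (3.11) and use \|\varphi\|_{L^2} \le \|\varphi\|_E:
\[
(h_0 - \varepsilon_{1,n})\, \|\varphi_{1,n}\|_{L^2}
\le h_0 \|\varphi_{1,n}\|_E - \varepsilon_{1,n} \|\varphi_{1,n}\|_{L^2}
\le \|\delta_{1,n}\|_{H^{-1/2}} \xrightarrow[n \to \infty]{} 0 .
\]
% By (2.9), \|\varphi_{1,n}\|_{L^2}^2 = \sigma_{1,n} stays bounded away from 0, so
\[
h_0 - \varepsilon_{1,n} \le \frac{\|\delta_{1,n}\|_{H^{-1/2}}}{\|\varphi_{1,n}\|_{L^2}} \to 0,
\qquad \text{i.e.} \qquad \liminf_{n \to \infty} \varepsilon_{1,n} \ge h_0 .
\]
```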

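But we assume (2.9), i.e. $\liminf \|\varphi_{1,n}\|_{L^2} > 0$. So we must have $\liminf \varepsilon_{1,n} \ge h_0$, and (2.10) follows from (2.8) (2.9), with the same $h_0$ as in Lemma 3.1.

(b) Under the additional assumption (2.11), i.e. $\limsup \varepsilon_{N,n} < 1$, let us study the convergence of $\varphi_{k,n}$ for the $H^{1/2}$ topology. After extraction of a subsequence, we may impose $\varepsilon_{k,n} \to \varepsilon_k \in [h_0, 1)$ and $\nu_n \to \nu \in [0, 1]$ as $n \to \infty$.

$\|\varepsilon_{k,n} \varphi_{k,n}\|_{L^2}$ is a bounded sequence. So (3.11) implies that $\|\varphi_{k,n}\|_E$ is bounded uniformly in $n$. Let $\chi_{k,n} \in H^{1/2}$ be defined by $\overline{H}^{\nu_n}_{\Phi_n} \chi_{k,n} = \varepsilon_{k,n} \varphi_{k,n}$. By (iii) of Lemma 3.1, we may impose, after extraction of a subsequence, $\chi_{k,n} \to \varphi_k$ in $H^{1/2}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$ as $n \to \infty$. On the other hand, we may write
\[
\overline{H}^{\nu_n}_{\Phi_n} (\varphi_{k,n} - \chi_{k,n}) = \delta_{k,n} \xrightarrow[n \to \infty]{} 0 \quad \text{in } H^{-1/2}(\mathbb{R}^3, \mathbb{C}^4).
\]
So $\Delta_{k,n} = \varphi_{k,n} - \chi_{k,n} \to 0$ in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, and $\varphi_{k,n} = \chi_{k,n} + \Delta_{k,n} \to \varphi_k$ in $H^{1/2}_{loc}(\mathbb{R}^3, \mathbb{C}^4)$ as $n \to \infty$. Then $\Phi = (\varphi_1, \dots, \varphi_N)$ is a strong solution, in $\big[E \cap \bigcap_{1 \le q < 3/2} W^{1,q}_{loc}(\mathbb{R}^3, \mathbb{C}^4)\big]^N$, of
\[
\overline{H}^{\nu}_{\Phi} \varphi_k = \varepsilon_k \varphi_k, \quad (\forall k).
\]
Our goal is to prove that $\varphi_{k,n}$ converges to $\varphi_k$ in $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$. Let $\psi_{k,n} = \varphi_{k,n} - \varphi_k$. Let us denote
\[
\tilde{\rho}_n = \sum_k |\psi_{k,n}(x)|^2, \qquad \tilde{R}_n(x, y) = \sum_k \psi_{k,n}(y)^* \otimes \psi_{k,n}(x). \tag{3.12}
\]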

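For $\psi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, let
\[
L_{k,n} \psi = H_0 \psi + \alpha (\tilde{\rho}_n * V_{\nu_n}) \psi - \alpha \int \tilde{R}_n(x, y) \psi(y) V_{\nu_n}(x - y)\,dy - \varepsilon_k \psi. \tag{3.13}
\]
We have $\lim_{n \to \infty} \|L_{k,n} \psi_{k,n}\|_{H^{-1/2}} = 0$.

Now, from inequality (1.7) and our assumptions, it is easy to see that the map
\[
\psi^- \in E^- \mapsto F_{k,n}(\psi^-) = \big(\psi_{k,n} + \psi^-,\ L_{k,n}(\psi_{k,n} + \psi^-)\big)_{L^2}
\]
is strictly concave. So, denoting $\psi^{\pm}_{k,n} = \Lambda^{\pm} \psi_{k,n}$,
\[
F_{k,n}(-\psi^-_{k,n}) \le F_{k,n}(0) - F'_{k,n}(0) \cdot \psi^-_{k,n} \le 3 \|\psi_{k,n}\|_E\, \|L_{k,n} \psi_{k,n}\|_{H^{-1/2}}, \tag{3.14}
\]
hence $\limsup_{n \to \infty} F_{k,n}(-\psi^-_{k,n}) \le 0$. But if we define $\delta = 1 - \varepsilon_N$, we have
\[
F_{k,n}(-\psi^-_{k,n}) = \big(\psi^+_{k,n}, L_{k,n} \psi^+_{k,n}\big)_{L^2} \ge \big(\psi^+_{k,n}, \delta \sqrt{1 - \Delta}\ \psi^+_{k,n}\big)_{L^2}. \tag{3.15}
\]
As a consequence, $\|\psi^+_{k,n}\|_E \to 0$ as $n \to \infty$. So, $\|L_{k,n} \psi^-_{k,n}\|_{H^{-1/2}} \to 0$ and $\lim_{n \to \infty} \big(\psi^-_{k,n}, L_{k,n} \psi^-_{k,n}\big)_{L^2} = 0$. But from inequality (1.7),
\[
\big(\psi^-_{k,n}, L_{k,n} \psi^-_{k,n}\big)_{L^2} \le -\Big(1 - \frac{\alpha N (\pi/2 + 2/\pi)}{2}\Big) \|\psi^-_{k,n}\|_E^2. \tag{3.16}
\]
So $\|\psi^-_{k,n}\|_E \to 0$, and $\|\psi_{k,n}\|_E \to 0$ as $n \to \infty$. We have thus proved that $\|\varphi_{k,n} - \varphi_k\|_E \to 0$ as $n \to \infty$.

(b.1) We now assume that $p_n = p$ for $n$ large. Being a limit of $\Phi_n$ in the strong $E^N$-topology, $\Phi$ is obviously a critical point of $F_{\nu,p}$ in $A$, and from (2.9), $\mathrm{Gram}\,\Phi > 0$. We also have $\mathrm{Gram}\,\Phi = \mathrm{Diag}(\sigma_1, \dots, \sigma_N)$, $0 < \sigma_1 \le \cdots \le \sigma_N < 1$. From (2.10),
\[
e_p(\sigma_1) = \frac{p \sigma_1^{p-1} - (p - 1)\sigma_1^p}{(1 - \sigma_1)^2} \ge h_0.
\]
There is a unique number $c_p \in (0, 1)$ such that $e_p(c_p) = h_0$, and we have $\lim_{p \to \infty} c_p = 1$. Since $e_p$ is increasing on $(0, 1)$, we get $c_p \mathbf{1} \le \mathrm{Gram}\,\Phi < \mathbf{1}$.

(b.2) Finally, let us assume that $\nu_n \to 0$ and $p_n \to \infty$. Then $1 > \sigma_{k,n} \ge c_{p_n} \to 1$ as $n \to \infty$,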

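so $\Phi \in \Sigma$. Obviously, $\Phi$ satisfies (2.12), so it is a critical point of $\mathcal{E}|_\Sigma$. Moreover, $\mathcal{E}(\Phi) = \lim_{n \to \infty} \mathcal{E}(\Phi_n)$. Now, $\pi_{p_n}(\Phi_n) = \sum_k \theta_{p_n}(\sigma_{k,n})$, with $\theta_p(x) = \frac{x^p}{1 - x}$. We recall that $\theta'_{p_n}(\sigma_{k,n}) = \varepsilon_{k,n} < 1$. But
\[
\frac{\theta'_p(x)}{\theta_p(x)} = \frac{p}{x} + \frac{1}{1 - x} \ge p, \qquad \forall x \in (0, 1).
\]
So, $\theta_{p_n}(\sigma_{k,n}) < \frac{1}{p_n} \to 0$ as $n \to \infty$. As a consequence, $\pi_{p_n}(\Phi_n) \to 0$ and
\[
\mathcal{E}(\Phi) = \lim_{n \to \infty} F_{\nu_n, p_n}(\Phi_n).
\]
This ends the proof of Lemma 2.1.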

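4. The Hessian of $F_{\nu,p}$

We shall use the following formula for the second derivative of the DF energy:
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] = \sum_{\ell} \Big[ \|\psi_\ell^+\|_E^2 - \|\psi_\ell^-\|_E^2 - \alpha Z \big(\psi_\ell, (\mu * V_\nu)\psi_\ell\big)_{L^2} \Big] + K_1 - K_2 + K_3 - K_4 + K_5, \tag{4.1}
\]
where
\[
K_1 = \alpha \iint_{\mathbb{R}^3 \times \mathbb{R}^3} V_\nu(x - y)\, \rho_\Psi(x)\, \rho_\Phi(y), \qquad \rho_\Psi(x) = \sum_{\ell} |\psi_\ell(x)|^2, \quad \rho_\Phi(y) = \sum_{m} |\varphi_m(y)|^2, \tag{4.2}
\]
\[
K_2 = \alpha \sum_{\ell \ne m} \iint V_\nu(x - y) \big(\psi_\ell(y), \varphi_m(y)\big) \big(\varphi_m(x), \psi_\ell(x)\big)
+ \alpha \sum_{\ell} \iint V_\nu(x - y)\, \mathrm{Im}\big(\varphi_\ell(x), \psi_\ell(x)\big)\, \mathrm{Im}\big(\varphi_\ell(y), \psi_\ell(y)\big), \tag{4.3}
\]
\[
K_3 = \alpha \iint k(x) k(y) V_\nu(x - y), \qquad k(x) = \sum_{\ell} \mathrm{Re}\big(\varphi_\ell(x), \psi_\ell(x)\big), \tag{4.4}
\]
\[
K_4 = 2\alpha \iint V_\nu(x - y)\, \mathrm{tr}\big( K(x, y) K(y, x) \big), \qquad K(x, y) = \frac{1}{2} \sum_{\ell} \big( \varphi_\ell^*(y) \otimes \psi_\ell(x) + \psi_\ell^*(y) \otimes \varphi_\ell(x) \big), \tag{4.5}
\]
\[
K_5 = \alpha \sum_{\ell \ne m} \iint \mathrm{Re}\big(\varphi_\ell(x), \psi_\ell(x)\big)\, \mathrm{Re}\big(\varphi_m(y), \psi_m(y)\big)\, V_\nu(x - y). \tag{4.6}
\]
$\mathcal{E}_\nu$ has a very useful concavity property: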

Lemma 4.1. If $(3N - 1)\alpha < \frac{2}{\pi/2 + 2/\pi}$, then for any $\Phi \in A$, and any $\Psi^- \in (E^-)^N$, $\nu \in [0, 1]$,
\[
\mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^-, \Psi^-\big] \le -s \sum_{k=1}^{N} \|\psi_k^-\|_E^2, \tag{4.7}
\]

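where $s > 0$ is a constant independent of $\Phi$, $\Psi$, $\nu$.

Proof of Lemma 4.1. We obviously have $(\psi_\ell, (\mu * V_\nu)\psi_\ell)_{L^2} > 0$. The Fourier transform of $V_\nu$ is a positive measure, so
\[
\iint V_\nu(x - y) f(x) f(y)^* \ge 0, \qquad \forall f \in L^1 \cap L^{3/2}(\mathbb{R}^3, \mathbb{C}). \tag{4.8}
\]
As a consequence, $K_2 \ge 0$. Now, $K(y, x) = K(x, y)^*$, so that $\mathrm{tr}\big(K(x, y) K(y, x)\big) \ge 0$ $\forall x, y$, hence $K_4 \ge 0$. We thus have
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] \le \sum_{\ell} \Big( \|\psi_\ell^+\|_E^2 - \|\psi_\ell^-\|_E^2 \Big) + K_1 + K_3 + K_5. \tag{4.9}
\]
Now, take $\Phi \in A$ and $\Psi^- \in (E^-)^N$. For $m = 1, \dots, N$, we have $\|\varphi_m\|_{L^2} \le 1$. So, using inequality (1.7), we easily get
\[
K_1 \le \frac{(\pi/2 + 2/\pi)\alpha N}{2} \sum_{\ell} \|\psi_\ell^-\|_E^2, \tag{4.10}
\]
and
\[
K_5 \le \frac{(\pi/2 + 2/\pi)\alpha (N - 1)}{2} \sum_{\ell} \|\psi_\ell^-\|_E^2. \tag{4.11}
\]
By the Cauchy–Schwarz inequality,
\[
K_3 \le K_1. \tag{4.12}
\]
Finally, for any $\Phi \in A$ and $\Psi^- \in (E^-)^N$,
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^-, \Psi^-\big] \le -\sum_{\ell} \|\psi_\ell^-\|_E^2 + \frac{(\pi/2 + 2/\pi)\alpha}{2} (3N - 1) \sum_{\ell} \|\psi_\ell^-\|_E^2 \le -s \sum_{\ell} \|\psi_\ell^-\|_E^2, \tag{4.13}
\]
with $s = 1 - \frac{(\pi/2 + 2/\pi)\alpha}{2}(3N - 1)$. Note that $s > 0$ provided
\[
\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}. \tag{4.14}
\]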
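The factor $3N - 1$ and the size of the threshold in (4.14) can be made explicit. The following sketch collects the bounds (4.10)–(4.12) and evaluates the constant numerically; the value $\alpha \approx 1/137.036$ is the physical fine-structure constant and is an assumption here, since this section does not fix $\alpha$.

```latex
% Counting the contributions of (4.10), (4.11), (4.12):
\[
K_1 + K_3 + K_5 \le 2K_1 + K_5
\le \frac{(\pi/2 + 2/\pi)\,\alpha}{2}\,\big(N + N + (N-1)\big) \sum_\ell \|\psi_\ell^-\|_E^2 ,
\qquad N + N + (N - 1) = 3N - 1 .
\]
% Numerical value of the threshold:
\[
\frac{2}{\pi/2 + 2/\pi} = \frac{4\pi}{\pi^2 + 4} \approx 0.9060 .
\]
% Illustration with the assumed value \alpha \approx 1/137.036:
\[
3N - 1 < 0.9060 \times 137.036 \approx 124.16
\quad\Longleftrightarrow\quad N \le 41 .
\]
```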

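This proves Lemma 4.1.

We now compute the second derivative of the penalization term $\pi_p$. We write $\pi_p(\Phi) = S_p \circ \mathrm{Gram}(\Phi)$, with
\[
S_p(Q) = \mathrm{tr}\big[ Q^p (1 - Q)^{-1} \big] = \sum_{n \ge p} T_n(Q), \qquad T_n(Q) = \mathrm{tr}(Q^n). \tag{4.15}
\]
Since $\pi_p$ is U(N) invariant, we just need an expression of $\pi''_p(\Phi)$ when $\Phi \in O$, i.e. $Q = \mathrm{Gram}\,\Phi = \mathrm{Diag}(\sigma_1, \dots, \sigma_N)$, $0 < \sigma_1 \le \cdots \le \sigma_N < 1$. For any self-adjoint matrix $h$ we have
\[
T'_n(Q) \cdot h = \sum_{\alpha + \beta = n - 1} \mathrm{tr}(Q^\alpha h Q^\beta) = n\, \mathrm{tr}(Q^{n-1} h) = n \sum_k \sigma_k^{n-1} h_{kk}, \tag{4.16}
\]
and
\[
T''_n(Q) \cdot \big[h, h\big] = n \sum_{a + b = n - 2} \mathrm{tr}(Q^a h Q^b h) = n \sum_{k, \ell}\ \sum_{a + b = n - 2} \sigma_k^a \sigma_\ell^b\, |h_{k\ell}|^2. \tag{4.17}
\]
Summing up, we get
\[
S'_p(Q) \cdot h = \sum_k e_p(\sigma_k) h_{kk}, \tag{4.18}
\]
\[
S''_p(Q) \cdot \big[h, h\big] = \sum_k e'_p(\sigma_k) |h_{kk}|^2 + \sum_{k \ne \ell} \frac{e_p(\sigma_k) - e_p(\sigma_\ell)}{\sigma_k - \sigma_\ell}\, |h_{k\ell}|^2, \tag{4.19}
\]
with $e_p(t) = \Big(\frac{t^p}{1 - t}\Big)' = \frac{p t^{p-1} - (p - 1)t^p}{(1 - t)^2}$.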
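The passage from (4.17) to (4.19) rests on a finite geometric sum. A sketch of the computation follows, assuming $\sigma_k \ne \sigma_\ell$ in the off-diagonal terms (for $\sigma_k = \sigma_\ell$ the difference quotient is read as $e_p'(\sigma_k)$).

```latex
% Off-diagonal terms, k \ne \ell:
\[
\sum_{n \ge p} n \sum_{a + b = n - 2} \sigma_k^a \sigma_\ell^b
= \sum_{n \ge p} n\, \frac{\sigma_k^{n-1} - \sigma_\ell^{n-1}}{\sigma_k - \sigma_\ell}
= \frac{e_p(\sigma_k) - e_p(\sigma_\ell)}{\sigma_k - \sigma_\ell},
\]
% using e_p(t) = \sum_{n \ge p} n t^{n-1}, the termwise derivative of
% t^p/(1-t) = \sum_{n \ge p} t^n.
% Diagonal terms, k = \ell:
\[
\sum_{n \ge p} n \sum_{a + b = n - 2} \sigma_k^{a + b}
= \sum_{n \ge p} n(n-1)\, \sigma_k^{n-2} = e_p'(\sigma_k) .
\]
```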

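Finally we obtain
\[
\pi'_p(\Phi) \cdot \Psi = 2 \sum_k e_p(\sigma_k)\, \mathrm{Re}(\varphi_k, \psi_k), \tag{4.20}
\]
\[
\pi''_p(\Phi) \cdot \big[\Psi, \Psi\big] = \sum_k \Big[ 2 e_p(\sigma_k) \|\psi_k\|_{L^2}^2 + e'_p(\sigma_k) \big| 2\, \mathrm{Re}(\varphi_k, \psi_k)_{L^2} \big|^2 \Big]
+ \sum_{k \ne \ell} \frac{e_p(\sigma_k) - e_p(\sigma_\ell)}{\sigma_k - \sigma_\ell}\, \big| (\varphi_k, \psi_\ell)_{L^2} + (\psi_k, \varphi_\ell)_{L^2} \big|^2. \tag{4.21}
\]
The function $e_p$ is positive and strictly increasing on $(0, 1)$. As a consequence, we have

Lemma 4.2. For any $p \ge 1$, the functional $\pi_p$ is strictly convex on
\[
A = \big\{ \Phi \in E^N \ /\ 0 < \mathrm{Gram}\,\Phi < \mathbf{1} \big\}.
\]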

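Lemma 2.2 is an immediate consequence of Lemmas 4.1 and 4.2. Our goal is now to prove Lemma 2.4. We start with an upper bound on the second derivative $I''_{\nu,p}(\Phi^+) \cdot [\Psi^+, \Psi^+]$, at a point $\Phi^+ \in A^+$, in a direction $\Psi^+ \in (E^+)^N \cap \big[H^1(\mathbb{R}^3, \mathbb{C}^4)\big]^N$, under the orthogonality conditions $(\varphi^+_k, \psi^+_\ell)_{L^2} = 0$, $\forall k, \ell$.

Lemma 4.3. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Take $\nu \in (0, 1)$ and $p \ge 2$. Consider $\Phi^+ \in A^+$ such that
\[
\mathrm{Gram}(\Phi) = \mathrm{Diag}(\sigma_1, \dots, \sigma_N), \qquad 0 < \sigma_1 \le \cdots \le \sigma_N < 1, \tag{4.22}
\]
where $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$. Let $\Psi^+ \in \big[E^+ \cap H^1(\mathbb{R}^3, \mathbb{C}^4)\big]^N$ satisfy
\[
(\varphi^+_k, \psi^+_\ell)_{L^2} = 0, \qquad \forall k, \ell. \tag{4.23}
\]
Then the following inequality holds:
\[
\frac{1}{2}\, I''_{\nu,p}(\Phi^+) \cdot \big[\Psi^+, \Psi^+\big] \le \frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^+\big] - \sum_k e_p(\sigma_k) \|\psi_k^+\|_{L^2}^2 + \bar{c} \sum_k \|\nabla \psi_k^+\|_{L^2}^2. \tag{4.24}
\]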

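Here, $\bar{c}$ depends only on $(Z, N)$.

Proof. Take $\Phi^+ \in A^+$, $\Psi^+ \in (E^+)^N \cap \big[H^1(\mathbb{R}^3, \mathbb{C}^4)\big]^N$, and denote
\[
\Phi^- = h_{\nu,p}(\Phi^+), \quad \Psi^- = h'_{\nu,p}(\Phi^+) \Psi^+, \quad \Phi = \Phi^+ + \Phi^-, \quad \Psi = \Psi^+ + \Psi^-. \tag{4.25}
\]
We may write
\[
\frac{1}{2}\, I''_{\nu,p}(\Phi^+) \cdot \big[\Psi^+, \Psi^+\big] = \frac{1}{2}\, F''_{\nu,p}(\Phi) \cdot \big[\Psi, \Psi\big]. \tag{4.26}
\]
If we impose the condition (4.23): $(\varphi^+_k, \psi^+_\ell)_{L^2} = 0$ $\forall k, \ell$, then, from (4.21), we see that, for any $\chi^- \in (E^-)^N$,
\[
\pi''_p(\Phi) \cdot \big[\Psi^+ + \chi^-, \Psi^+ + \chi^-\big] = \sum_k 2 e_p(\sigma_k) \|\psi_k^+\|_{L^2}^2 + \pi''_p(\Phi) \cdot \big[\chi^-, \chi^-\big]. \tag{4.27}
\]
As a consequence,
\[
\frac{1}{2}\, F''_{\nu,p}(\Phi) \cdot \big[\Psi, \Psi\big] = \frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^+\big] - \sum_k e_p(\sigma_k) \|\psi_k^+\|_{L^2}^2 + \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^-\big] + \frac{1}{2}\, F''_{\nu,p}(\Phi) \cdot \big[\Psi^-, \Psi^-\big]. \tag{4.28}
\]
Now, $\Psi^- = h'_{\nu,p}(\Phi^+) \Psi^+$ is solution of
\[
F''_{\nu,p}(\Phi) \cdot \big[\Psi^+, \chi^-\big] + F''_{\nu,p}(\Phi) \cdot \big[\Psi^-, \chi^-\big] = 0, \qquad \forall \chi^- \in (E^-)^N. \tag{4.29}
\]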

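Note that $F''_{\nu,p}(\Phi) \cdot [\Psi^+, \chi^-] = \mathcal{E}''_\nu(\Phi) \cdot [\Psi^+, \chi^-]$, from (4.21) and (4.27). So, applying (4.29) to $\chi^- = \Psi^-$, and using Lemma 2.2, we get
\[
s \sum_k \|\psi_k^-\|_E^2 \le \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^-\big]. \tag{4.30}
\]
From Hardy’s inequality (1.9), we have
\[
\mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^-\big] \le C(N, Z) \Big[ \sum_k \|\nabla \psi_k^+\|_{L^2}^2 \Big]^{1/2} \Big[ \sum_k \|\psi_k^-\|_{L^2}^2 \Big]^{1/2}. \tag{4.31}
\]
Combining (4.30) (4.31), we get
\[
\sum_k \|\psi_k^-\|_E^2 \le C' \sum_k \|\nabla \psi_k^+\|_{L^2}^2, \tag{4.32}
\]
and finally
\[
\frac{1}{2}\, F''_{\nu,p}(\Phi) \cdot \big[\Psi, \Psi\big] \le \frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^+\big] - \sum_k e_p(\sigma_k) \|\psi_k^+\|_{L^2}^2 + \bar{c} \sum_k \|\nabla \psi_k^+\|_{L^2}^2. \tag{4.33}
\]
Now, combining (4.26), (4.33), one easily gets (4.24), and the lemma is proved.

The next lemma gives an upper estimate on $\mathcal{E}''_\nu(\Phi) \cdot [\Psi, \Psi]$ for $\Psi$ of the form $(0, \dots, 0, \psi)$, $\psi \in E$, radial. It is inspired by [36, 39].

Lemma 4.4. For any $\Phi \in A$, $\nu \in (0, 1)$ and $\psi \in E$, of the form $\psi(x) = f(|x|)$, taking $\Psi(x) = (0, \dots, 0, \psi(x))$, we have
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] \le (\psi, H_0 \psi)_{L^2} + \alpha(N - 1)(\psi, V_\nu \psi)_{L^2} - \alpha Z \big(\psi, (\mu * V_\nu)\psi\big)_{L^2}. \tag{4.34}
\]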

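Proof. We may write
\[
\mathcal{E}_\nu(\varphi_1 \dots \varphi_N) = \mathcal{E}_\nu(\varphi_1 \dots \varphi_{N-1}) + (\varphi_N, H_0 \varphi_N)_{L^2} - \alpha Z \big(\varphi_N, (\mu * V_\nu)\varphi_N\big)_{L^2}
\]
\[
\qquad + \alpha \iint V_\nu(x - y) \sum_{k=1}^{N-1} |\varphi_k(y)|^2 |\varphi_N(x)|^2
- \alpha \iint V_\nu(x - y) \sum_{k=1}^{N-1} \big(\varphi_k(y), \varphi_N(y)\big) \big(\varphi_N(x), \varphi_k(x)\big). \tag{4.35}
\]
So $\mathcal{E}_\nu$ is a quadratic form in $\varphi_N$, when $\varphi_1, \dots, \varphi_{N-1}$ are fixed. Note that
\[
\iint V_\nu(x - y) \big(\varphi_k(y), \psi(y)\big) \big(\psi(x), \varphi_k(x)\big)
\]
is nonnegative, from (4.8). Hence,
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] \le (\psi, H_0 \psi)_{L^2} - \alpha Z \big(\psi, (\mu * V_\nu)\psi\big)_{L^2} + \alpha \iint V_\nu(x - y) \sum_{k=1}^{N-1} |\varphi_k(y)|^2 |\psi(x)|^2, \tag{4.36}
\]
for any $\Psi = (0, \dots, 0, \psi)$, $\psi \in E$.

Now, if $\rho \in L^1(\mathbb{R}^3, \mathbb{R}_+)$ is radial, then an easy computation shows that
\[
\int_{\mathbb{R}^3} \rho(x) V_\nu(x - x_0)\,dx \le \int_{\mathbb{R}^3} \rho(x) V_\nu(x)\,dx, \qquad \forall x_0 \in \mathbb{R}^3. \tag{4.37}
\]
As a consequence, if $\psi(x) = f(|x|)$, then, since $\|\varphi_k\|_{L^2} \le 1$ for $\Phi \in A$,
\[
\int_{\mathbb{R}^3} dy \sum_{k=1}^{N-1} |\varphi_k(y)|^2 \int_{\mathbb{R}^3} |\psi(x)|^2 V_\nu(x - y)\,dx
\le \int_{\mathbb{R}^3} dy \sum_{k=1}^{N-1} |\varphi_k(y)|^2 \int_{\mathbb{R}^3} |\psi(x)|^2 V_\nu(x)\,dx
\le (N - 1)(\psi, V_\nu \psi)_{L^2}. \tag{4.38}
\]
Lemma 4.4 follows directly from (4.36) and (4.38).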

We now give an upper bound on $I''_{\nu,p} \cdot (0, \dots, 0, \psi^+)^2$ for $\psi^+$ in a suitably chosen finite dimensional subspace of $E^+$.

Lemma 4.5. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$, $N < Z + 1$. Then, for any $m \ge 0$, there is a real finite dimensional subspace of $E^+$ denoted $X_m$, with $\dim_{\mathbb{R}} X_m = m + 1$, and a constant $b_m \in (0, 1)$ such that
\[
\frac{1}{2}\, I''_{\nu,p}(\Phi^+) \cdot \big[(0, \dots, 0, \psi^+)\big]^2 \le \big( b_m - e_p(\sigma_N) \big) \|\psi^+\|_E^2 \tag{4.39}
\]
for any $\nu \in (0, 1)$, $p \ge 1$, $\psi^+ \in X_m$, and any $\Phi^+ \in A^+$ such that $\mathrm{Gram}\,\Phi = \mathrm{Diag}(\sigma_1, \dots, \sigma_N)$, $0 < \sigma_1 \le \cdots \le \sigma_N < 1$, with the notation $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$.

Proof. Let $d$ be a positive integer. We choose a $d$-dimensional subspace of $H^1\big([0, \infty), r^2 dr; \mathbb{R}\big)$, denoted $V_d$. To $f(r) \in V_d$, and $\lambda > 0$, we associate
\[
\psi(x) = \begin{pmatrix} f(|x|/\lambda) \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{4.40}
\]
Obviously, $\psi \in H^1(\mathbb{R}^3, \mathbb{C}^4)$. We call $W_{d,\lambda}$ the $d$-dimensional real vector space of functions $\psi$ of the form (4.40), with $\lambda$ fixed and $f \in V_d$ arbitrary.

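It is easy to see that there are two constants $0 < c_*(d) < c^*(d) < \infty$ such that, for any $\psi \in W_{d,\lambda}$ and $\lambda$ large,
\[
(H_0 \psi, \psi) = \|\psi\|_{L^2}^2, \tag{4.41}
\]
\[
\|\nabla \psi\|_{L^2}^2 \le \frac{c^*}{\lambda^2} \|\psi\|_{L^2}^2, \tag{4.42}
\]
\[
(\psi, V_\nu \psi)_{L^2} \ge \frac{c_*}{\lambda} \|\psi\|_{L^2}^2, \qquad \forall \nu \in [0, 1], \tag{4.43}
\]
\[
\|\Lambda^- \psi\|_{L^2}^2 \le \frac{c^*}{\lambda^2} \|\psi\|_{L^2}^2, \tag{4.44}
\]
\[
\big((\mu * V_\nu)\psi, \psi\big)_{L^2} \ge (V_\nu \psi, \psi)_{L^2} - o\Big(\frac{1}{\lambda}\Big) \|\psi\|_{L^2}^2, \qquad \forall \nu \in [0, 1]. \tag{4.45}
\]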
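The powers of $\lambda$ in (4.42) and (4.43) come from a change of variables. The sketch below treats only the unregularized case $\nu = 0$, i.e. $V(x) = 1/|x|$, as an illustration; the uniformity in $\nu \in [0, 1]$ stated in the text is not addressed here. The notation $\psi_\lambda$ and $g(y) = f(|y|)$ is introduced for this sketch.

```latex
% Let \psi_\lambda(x) = (f(|x|/\lambda), 0, 0, 0), g(y) = f(|y|), x = \lambda y:
\[
\|\psi_\lambda\|_{L^2}^2 = \lambda^3 \int_{\mathbb{R}^3} |g|^2\, dy, \qquad
\|\nabla \psi_\lambda\|_{L^2}^2 = \lambda \int_{\mathbb{R}^3} |\nabla g|^2\, dy, \qquad
\Big(\psi_\lambda, \frac{1}{|x|}\psi_\lambda\Big)_{L^2} = \lambda^2 \int_{\mathbb{R}^3} \frac{|g|^2}{|y|}\, dy .
\]
% Ratios to \|\psi_\lambda\|^2 therefore scale like \lambda^{-2} and \lambda^{-1}:
\[
\frac{\|\nabla \psi_\lambda\|_{L^2}^2}{\|\psi_\lambda\|_{L^2}^2}
= \frac{1}{\lambda^2}\, \frac{\int |\nabla g|^2}{\int |g|^2}, \qquad
\frac{(\psi_\lambda, |x|^{-1} \psi_\lambda)_{L^2}}{\|\psi_\lambda\|_{L^2}^2}
= \frac{1}{\lambda}\, \frac{\int |g|^2 / |y|}{\int |g|^2} .
\]
% On the finite dimensional space V_d the two quotients in f are bounded
% above and below by positive constants, which gives c^*(d) and c_*(d).
```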

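Inequalities (4.42), (4.43) and (4.45) follow from scaling arguments, and (4.44) is a consequence of formula (3.3). Now, suppose that $\psi \in W_{d,\lambda}$ satisfies
\[
(\varphi^+_k, \psi)_{L^2} = 0, \qquad \forall k, \tag{4.46}
\]
for some $\Phi^+ = (\varphi^+_1, \dots, \varphi^+_N) \in A^+$, such that $\mathrm{Gram}\,\Phi = \mathrm{Diag}(\sigma_1, \dots, \sigma_N)$, $0 < \sigma_1 \le \cdots \le \sigma_N < 1$, with $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$. Let $\Psi^+ = (0, \dots, 0, \Lambda^+ \psi)$. From Lemma 4.3, we have, for any $\nu \in (0, 1)$, $p \ge 1$,
\[
\frac{1}{2}\, I''_{\nu,p}(\Phi^+) \cdot \big[\Psi^+, \Psi^+\big] \le \frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^+\big] - e_p(\sigma_N) \|\Lambda^+ \psi\|_{L^2}^2 + \bar{c}\, \|\nabla \psi\|_{L^2}^2. \tag{4.47}
\]
From Lemma 2.2,
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi^+, \Psi^+\big] \le \frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] - \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi^-\big], \tag{4.48}
\]
where $\Psi = (0, \dots, 0, \psi)$, $\Psi^- = (0, \dots, 0, \Lambda^- \psi)$. But from Hardy’s inequality (1.9),
\[
\big| \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi^-\big] \big| \le c\, \|\nabla \psi\|_{L^2} \|\Lambda^- \psi\|_{L^2}, \tag{4.49}
\]
for some $c > 0$ which depends only on $N$, $Z$. Moreover, using Lemma 4.4, we get
\[
\frac{1}{2}\, \mathcal{E}''_\nu(\Phi) \cdot \big[\Psi, \Psi\big] \le (\psi, H_0 \psi) + \alpha(N - 1)(\psi, V_\nu \psi)_{L^2} - \alpha Z \big((\mu * V_\nu)\psi, \psi\big)_{L^2}. \tag{4.50}
\]
Finally, combining (4.41, 4.42, . . . , 4.50), we get
\[
\frac{1}{2}\, I''_{\nu,p}(\Phi^+) \cdot \big[\Psi^+, \Psi^+\big] \le \Big[ 1 - (Z - N + 1)\, \frac{\alpha c_* + o(1)}{\lambda} \Big] \|\psi\|_{L^2}^2 - e_p(\sigma_N) \|\Lambda^+ \psi\|_{L^2}^2 + \frac{(c + \bar{c}) c^*}{\lambda^2} \|\psi\|_{L^2}^2
\]
\[
\le \Big( 1 - \frac{\alpha c_* (Z - N + 1)}{2\lambda} - e_p(\sigma_N) \Big) \|\Lambda^+ \psi\|_E^2 \tag{4.51}
\]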

for $\lambda = \lambda(d)$ large enough. Now, take $m \ge 0$. Choose $X_m$ as an $(m + 1)$-dimensional subspace of $\Lambda^+ W_{d,\lambda(d)} \cap \{\varphi^+_1, \dots, \varphi^+_N\}^{\perp}$, where $d = m + 2N + 1$ (such a space always exists). Take $b_m = 1 - \frac{\alpha c_* (Z - N + 1)}{2\lambda(d)}$. Then it is easy to check that $X_m$ satisfies (4.39), and Lemma 4.5 is proved.

Lemma 2.4 is now an immediate consequence of Lemma 4.5.

5. The Min–Max Argument

We start with a proof of Lemma 2.5. We need the following result:

Lemma 5.1. Assume that $\alpha(3N - 1) < \frac{2}{\pi/2 + 2/\pi}$. Take $\nu \in (0, 1)$. There is a constant $C(\nu) > 0$ such that, for any $p \ge 1$ and $\Phi^+ \in A^+$,
\[
\sigma_1(\Phi^+) \le C(\nu)\, \sigma_1^+(\Phi^+). \tag{5.1}
\]
Here, $\sigma_1^+(\Phi^+)$ is the smallest eigenvalue of $\mathrm{Gram}\,\Phi^+$, and $\sigma_1(\Phi^+)$ is the smallest eigenvalue of $\mathrm{Gram}\,\Phi$, where $\Phi = \Phi^+ + h_{\nu,p}(\Phi^+)$.

Remark. The constant $C$ depends on $\nu$. We have been unable to prove that $C$ remains bounded as $\nu$ tends to 0.

Proof of Lemma 5.1. Take $\Phi^+ \in A^+$, i.e. $\Phi^+ \in (E^+)^N$ with $0 < \mathrm{Gram}(\Phi^+) < \mathbf{1}$. Using the U(N) invariance, we just have to prove the lemma when
\[
\mathrm{Gram}(\Phi^+) = \mathrm{Diag}(\sigma_1^+, \dots, \sigma_N^+), \qquad 0 < \sigma_1^+ \le \cdots \le \sigma_N^+ < 1. \tag{5.2}
\]

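We denote
\[
h_{\nu,p}(\Phi^+) = \Phi^- = (\varphi_1^-, \dots, \varphi_N^-), \qquad \Phi = \Phi^+ + \Phi^- = (\varphi_1, \dots, \varphi_N).
\]
We introduce the following functional on $E^-$:
\[
F(\psi^-) = \big(\varphi^+_1 + \psi^-,\ \overline{H}^{\nu}_{\Phi_1}(\varphi^+_1 + \psi^-)\big)_{L^2} - \pi_p(\varphi^+_1 + \psi^-, \varphi_2, \dots, \varphi_N). \tag{5.3}
\]
Here, $\Phi_1 = (\varphi_2, \dots, \varphi_N) \in E^{N-1}$, and
\[
\overline{H}^{\nu}_{\Phi_1} \psi = H_0 \psi - \alpha Z (\mu * V_\nu)\psi + \alpha \sum_{k=2}^{N} \int V_\nu(x - y) \Big[ |\varphi_k(y)|^2 \psi(x) - \big(\varphi_k(y), \psi(y)\big) \varphi_k(x) \Big] dy. \tag{5.4}
\]
We have extended $\pi_p$ to $E^N$, with values in $\mathbb{R} \cup \{+\infty\}$, by defining $\pi_p(\Phi) = +\infty$ when $\mathbf{1} - \mathrm{Gram}\,\Phi$ is not positive definite. $F$ is thus well-defined on $E^-$ with values in $\mathbb{R} \cup \{-\infty\}$, and $F''(\psi^-)$ exists when $F(\psi^-) > -\infty$. From Lemma 2.2, $F$ is strictly concave, and
\[
F''(\psi^-) \cdot \big[\chi^-, \chi^-\big] \le -s \|\chi^-\|_E^2, \tag{5.5}
\]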

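for any $\chi^- \in E^-$, and $\psi^- \in E^-$ such that $F(\psi^-) > -\infty$. We have
\[
F_{\nu,p}\big(\varphi^+_1 + \psi^-, \varphi_2, \dots, \varphi_N\big) = F(\psi^-) + \mathcal{E}_\nu(\Phi_1), \tag{5.6}
\]
so $\varphi_1^-$ is the unique maximizer of $F$ on $E^-$. From (5.2), $(\varphi^+_1, \varphi_k)_{L^2} = 0$, $\forall k \ge 2$. Therefore, for any $\chi^- \in E^-$,
\[
\pi'_p\big(\varphi^+_1, \varphi_2, \dots, \varphi_N\big) \cdot \chi^-
= \sum_{n \ge p} 2n\, \mathrm{tr}\left[
\begin{pmatrix} (\sigma_1^+)^{n-1} & 0 \\ 0 & (\mathrm{Gram}\,\Phi_1)^{n-1} \end{pmatrix}
\begin{pmatrix}
0 & \mathrm{Re}(\chi^-, \varphi_2^-) & \cdots & \mathrm{Re}(\chi^-, \varphi_N^-) \\
\mathrm{Re}(\varphi_2^-, \chi^-) & & & \\
\vdots & & 0 & \\
\mathrm{Re}(\varphi_N^-, \chi^-) & & &
\end{pmatrix}
\right] = 0.
\]
As a consequence,
\[
F'(0) \chi^- = 2\, \mathrm{Re}\big(\chi^-, \overline{H}^{\nu}_{\Phi_1} \varphi^+_1\big). \tag{5.7}
\]
So there is a constant $K_1(\nu) > 0$ such that
\[
|F'(0) \chi^-| \le K_1(\nu) \|\varphi^+_1\|_{L^2} \|\chi^-\|_E, \qquad \forall \chi^- \in E^-. \tag{5.8}
\]
But (5.5) implies that
\[
F(\varphi_1^-) \le F(0) + F'(0) \varphi_1^- - s \|\varphi_1^-\|_E^2. \tag{5.9}
\]
Since $F(\varphi_1^-) \ge F(0)$, (5.8) (5.9) give
\[
\|\varphi_1^-\|_E \le K_2(\nu) \|\varphi^+_1\|_{L^2}. \tag{5.10}
\]
Finally, (5.10) gives
\[
\sigma_1(\Phi^+) = \inf_{\xi \in \mathbb{C}^N,\ \|\xi\| = 1} \Big\| \sum_{k=1}^{N} \xi_k \varphi_k \Big\|_{L^2}^2 \le \|\varphi_1\|_{L^2}^2 \le C(\nu) \|\varphi^+_1\|_{L^2}^2 = C(\nu)\, \sigma_1^+(\Phi^+). \tag{5.11}
\]
Lemma 5.1 is proved.

We are now ready to prove Lemma 2.5. Using once again the U(N) invariance, we just have to consider $\Phi^+ \in A^+$ such that, denoting $\Phi = \Phi^+ + \Phi^-$, $\Phi^- = h_{\nu,p}(\Phi^+)$, the following holds:
\[
\mathrm{Gram}(\Phi) = \mathrm{Diag}(\sigma_1, \dots, \sigma_N), \qquad 0 < \sigma_1 \le \cdots \le \sigma_N < 1. \tag{5.12}
\]
We want to find $X \in (E^+)^N$ satisfying (2.22), assuming that $\Delta(\Phi^+) = \det \mathrm{Gram}\,\Phi^+$ is in $[d(\nu), 2d(\nu)]$. We choose
\[
X = (\varphi^+_1, 0, \dots, 0). \tag{5.13}
\]
Obviously,
\[
\Delta'(\Phi^+) \cdot X = 2\Delta(\Phi^+) > 0. \tag{5.14}
\]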

Since $F'_{\nu,p}(\Phi^+)\cdot(\chi^-, 0, \dots, 0) = 0$, $\forall \chi^- \in E^-$, we may write
\[
I'_{\nu,p}(\Phi^+)\cdot X = F'_{\nu,p}(\Phi)\cdot(\varphi_1^+ - \varphi_1^-, 0, \dots, 0)
= 2\big(\varphi_1^+, H^\nu_{\Phi_1}\varphi_1^+\big)_{L^2} - 2\big(\varphi_1^-, H^\nu_{\Phi_1}\varphi_1^-\big)_{L^2} - 2e_p(\sigma_1)\big(\|\varphi_1^+\|_{L^2}^2 - \|\varphi_1^-\|_{L^2}^2\big). \tag{5.15}
\]

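From inequality (1.7), we have
\[
\begin{aligned}
(\varphi^+, H^\nu_{\Phi_1}\varphi^+)_{L^2} &\ge \Big(1 - \frac{(\pi/2 + 2/\pi)\,\alpha Z}{2}\Big)\|\varphi^+\|_E^2, \quad \forall \varphi^+ \in E^+, \\
-(\varphi^-, H^\nu_{\Phi_1}\varphi^-)_{L^2} &\ge \Big(1 - \frac{(\pi/2 + 2/\pi)\,\alpha (N-1)}{2}\Big)\|\varphi^-\|_E^2, \quad \forall \varphi^- \in E^-.
\end{aligned} \tag{5.16}
\]
As a consequence,
\[
I'_{\nu,p}(\Phi^+)\cdot X \ge 2\Big[1 - \frac{(\pi/2 + 2/\pi)\,\alpha \max(Z, N-1)}{2} - e_p(\sigma_1)\Big]\|\varphi_1^+\|_E^2. \tag{5.17}
\]
But $e_p(x) = \dfrac{x^p}{1-x} \le \dfrac{x}{1-x} = e_1(x)$ is small when $x > 0$ is small. Moreover, by assumption, $\dfrac{(\pi/2 + 2/\pi)\,\alpha \max(Z, N-1)}{2} < 1$. From Lemma 5.1,
\[
\Delta(\Phi^+) \le \sigma_1(\Phi^+) \le C(\nu)\,\sigma_1^+(\Phi^+) \le C(\nu)\big[\Delta(\Phi^+)\big]^{1/N}. \tag{5.18}
\]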
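The outer inequalities in a chain of the type (5.18) rest on an elementary fact about a positive definite matrix with spectrum in $(0,1)$. The following is only a sketch: the symbols $\mu_k$ are introduced here for illustration and are assumed to denote the ordered eigenvalues of $\mathrm{Gram}\,\Phi^+$.

```latex
% Assumption (not in the original): 0 < \mu_1 \le \dots \le \mu_N < 1 are the
% eigenvalues of Gram \Phi^+, so that \Delta(\Phi^+) = \prod_k \mu_k.
\[
\Delta(\Phi^+) = \mu_1 \prod_{k=2}^{N} \mu_k \le \mu_1 ,
\qquad
\mu_1^{N} \le \prod_{k=1}^{N} \mu_k = \Delta(\Phi^+)
\ \Longrightarrow\
\mu_1 \le \Delta(\Phi^+)^{1/N} .
\]
```

Under this reading, the smallest eigenvalue is squeezed between $\Delta(\Phi^+)$ and $\Delta(\Phi^+)^{1/N}$, so it is small exactly when the determinant is small.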

Lemma 2.5 is now an immediate consequence of (5.14), (5.17) and (5.18).

Our goal now is to prove Lemma 2.6. We start with a "linear" result that will give us the lower bound $a(j)$ in (2.30).

Lemma 5.2. Assume that $\alpha Z < \dfrac{2}{\pi/2 + 2/\pi}$. Then there is a nondecreasing sequence $\{\lambda_j,\ j \ge 0\}$ in $(0,1)$, with $\lim_{j \to \infty} \lambda_j = 1$, and a sequence $\{G_j,\ j \ge 0\}$ of complex vector subspaces of $E^+$, with $\dim_{\mathbb{C}}(E^+/G_j) = j$, and
\[
\big(\varphi^+, (H_0 - \alpha Z(\mu * V))\varphi^+\big)_{L^2} \ge \lambda_j \|\varphi^+\|_{L^2}^2, \quad \forall \varphi^+ \in G_j. \tag{5.19}
\]

Proof. The arguments below are classical (see [46], 112-117 for a similar situation). The operator $T = \Lambda^+(H_0 - \alpha Z(\mu * V))\Lambda^+$, defined as a Friedrichs extension, is self-adjoint on $\Lambda^+(L^2)$ and has essential spectrum $\sigma_{\mathrm{ess}}(T) = [1, +\infty)$. Indeed, the arguments used in [18] to prove the result when $\mu$ is a Dirac mass extend to the more general case. From (1.7), $\sigma(T) \subset (0, \infty)$. As a consequence, $\sigma(T) \cap (-\infty, 1)$ consists only of positive eigenvalues with finite multiplicity. One can easily prove, using the Rayleigh quotients, that $\sigma(T) \cap (-\infty, 1) = \{\lambda_j,\ j \ge 0\}$, with $0 < \lambda_0 \le \cdots \le \lambda_j \le \dots$, $\lim_{j \to \infty} \lambda_j = 1$. Let $G_j$ be the orthogonal space, for the $L^2$-hermitian product, of
\[
K_j = \bigoplus_{k \le j-1} \mathrm{Ker}(T - \lambda_k I_{E^+}). \tag{5.20}
\]
Obviously, $E^+/G_j \approx K_j$ has complex dimension $j$, and (5.19) holds.

We now construct the space $F_j$, and we find the upper bound $\bar a(j)$.

Lemma 5.3. Assume that $\alpha(3N - 1) < \dfrac{2}{\pi/2 + 2/\pi}$, $N < 2Z + 1$. There is a sequence $\{\bar a(j),\ j \ge 0\}$ in $(0, N)$ and a sequence $\{F_j,\ j \ge 0\}$ of complex vector subspaces of $E^+$, with $\dim_{\mathbb{C}} F_j = j + N$, and
\[
I_{\nu,p}(\Phi^+) \le \bar a(j), \quad \forall \Phi^+ \in (F_j)^N \cap A^+. \tag{5.21}
\]

Proof. Our arguments will be similar to those in the proof of Lemma 2.4, but simpler. We consider the space $W_{d,\lambda}$ of functions $\psi$ of the form (4.40), with $\lambda$ fixed and $f \in V_d$ arbitrary. We denote $W^+_{d,\lambda} = \Lambda^+(W_{d,\lambda})$. From (4.44), for $\lambda$ large enough,
\[
\dim_{\mathbb{C}} W^+_{d,\lambda} = \dim_{\mathbb{C}} W_{d,\lambda} = d. \tag{5.22}
\]

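From (4.37), for any $\Phi \in (W_{d,\lambda})^N$ such that $\mathrm{Gram}\,\Phi \le 2(\delta_{k\ell})$,
\[
\begin{aligned}
E_\nu(\Phi) &= \sum_k \Big[(\varphi_k, H_0\varphi_k)_{L^2} - \alpha Z\,(\varphi_k, (\mu * V_\nu)\varphi_k)_{L^2}\Big] \\
&\quad + \frac{\alpha}{2}\sum_{k \ne \ell} \iint V_\nu(x-y)\Big\{|\varphi_k(x)|^2|\varphi_\ell(y)|^2 - \big(\varphi_k(x), \varphi_\ell(x)\big)\big(\varphi_\ell(y), \varphi_k(y)\big)\Big\}\,dx\,dy \\
&\le \sum_k \Big(\varphi_k, \Big[H_0 + \frac{\alpha}{2}(N-1)V_\nu\Big]\varphi_k\Big)_{L^2} - \alpha Z \sum_k (\varphi_k, (\mu * V_\nu)\varphi_k)_{L^2}.
\end{aligned} \tag{5.23}
\]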

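Moreover, using inequalities (1.7) and (1.9), one can find two constants $a, b > 0$ such that
\[
|E_\nu'(\Phi)\cdot\Psi^-| \le a\Big(\sum_k \|\nabla\varphi_k\|_{L^2}^2\Big)^{1/2}\Big(\sum_k \|\psi_k^-\|_{L^2}^2\Big)^{1/2} + b\Big(\sum_k \|\varphi_k^-\|_E^2\Big)^{1/2}\Big(\sum_k \|\psi_k^-\|_E^2\Big)^{1/2}, \tag{5.24}
\]
where $\varphi_k^+ = \Lambda^+\varphi_k$, $\varphi_k^- = \Lambda^-\varphi_k$, and $\Psi^- \in (E^-)^N$ is arbitrary. Now, we take $\Phi^+ = (\varphi_1^+, \dots, \varphi_N^+) \in (W^+_{d,\lambda})^N \cap A^+$. We recall that $A^+ = \{\Phi^+ \in (E^+)^N \,/\, 0 < \mathrm{Gram}\,\Phi^+ < \mathbb{1}\}$. From (4.44), for $\lambda$ large enough, there is $\Phi \in (W_{d,\lambda})^N$ such that $\Lambda^+\varphi_k = \varphi_k^+$ ($\forall k$) and $\mathrm{Gram}(\Phi) \le 2(\delta_{k,\ell})$. Since $\pi_p \ge 0$, we may write
\[
I_{\nu,p}(\Phi^+) \le E_\nu\big(\Phi^+ + h_{\nu,p}(\Phi^+)\big) \le \sup_{\Psi^- \in (E^-)^N} E_\nu(\Phi + \Psi^-). \tag{5.25}
\]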

Combining (5.24), (5.25) and Lemma 4.1, we get, for some $a_0 > 0$,
\[
I_{\nu,p}(\Phi^+) \le E_\nu(\Phi) + a_0 \sum_k \Big(\|\nabla\varphi_k\|_{L^2}^2 + \|\Lambda^-\varphi_k\|_E^2\Big). \tag{5.26}
\]
Finally, combining (5.23), (5.26) and the estimates (4.41), ..., (4.45), we find
\[
I_{\nu,p}(\Phi^+) \le N\Big[1 - \frac{c_*}{2\lambda}\,\alpha(2Z - N + 1)\Big] + o\Big(\frac{1}{\lambda}\Big). \tag{5.27}
\]
We take $\bar\lambda(d)$ large enough, and $F_j = W^+_{j+N,\,\bar\lambda(j+N)}$. Then (5.27) gives
\[
I_{\nu,p}(\Phi^+) \le \bar a(j) < N, \quad \forall \Phi^+ \in (F_j)^N \cap A^+. \tag{5.28}
\]
From (5.22), $\dim_{\mathbb{C}} F_j = j + N$, so Lemma 5.3 is proved.

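We are now ready to prove Lemma 2.6. We take $F_j$ as in Lemma 5.3. Obviously,
\[
c_{\nu,p}(F_j) = \inf_{Q \in \mathcal{Q}(F_j)} \max_{\Phi^+ \in Q} J_{\nu,p}(\Phi^+) \le \max_{\Phi^+ \in (F_j)^N \cap A^+} J_{\nu,p}(\Phi^+) \le \bar a(j), \tag{5.29}
\]
where for any $F$, the class of sets $\mathcal{Q}(F)$ is defined in Section 2, formula (2.28). To find a lower estimate on $c_{\nu,p}(F_j)$, we define
\[
S_j = \Big\{\Phi^+ \in (G_j)^N \,/\, \mathrm{Gram}\,\Phi^+ = \frac{j+1}{j+2}\,\mathbb{1}\Big\}. \tag{5.30}
\]
Take $\Phi^+ \in S_j$. From Lemma 5.2, we have
\[
E_\nu(\Phi^+) \ge \sum_k \big(\varphi_k, (H_0 - \alpha Z(\mu * V_\nu))\varphi_k\big) \ge N\,\frac{j+1}{j+2}\,\lambda_j. \tag{5.31}
\]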

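So there is $p(j)$ such that, if $p \ge p(j)$, then
\[
\pi_p\Big(\frac{j+1}{j+2}\,\mathbb{1}\Big) = N(j+2)\Big(\frac{j+1}{j+2}\Big)^p \le \frac{1}{j+2}\,E_\nu(\Phi^+).
\]
Together with (5.31), this gives
\[
I_{\nu,p}(\Phi^+) \ge E_\nu(\Phi^+) - \pi_p(\Phi^+) \ge N\Big(\frac{j+1}{j+2}\Big)^2\lambda_j. \tag{5.32}
\]
We choose $a(j) = N\Big(\dfrac{j+1}{j+2}\Big)^2\lambda_j$. Obviously, $\lim_{j \to \infty} a(j) = N$, and Lemma 2.6 is an immediate consequence of the following intersection result:

Lemma 5.4. For any $Q \in \mathcal{Q}(F_j)$, the intersection $Q \cap S_j$ is non-empty.

Proof of Lemma 5.4 (hence of Lemma 2.6). The quotient set $S_j/U(N)$ is a submanifold of the Hilbert manifold $A^+/U(N)$, and
\[
\mathrm{codim}_{\mathbb{R}}\big(S_j/U(N), A^+/U(N)\big) = \mathrm{codim}_{\mathbb{R}}\big(S_j, (E^+)^N\big) = N^2 + \mathrm{codim}_{\mathbb{R}}\big((G_j)^N, (E^+)^N\big) = N(2j + N). \tag{5.33}
\]
Take $\varepsilon > 0$ small, and define
\[
M_j(\varepsilon) = \Big\{\Phi^+ \in (F_j)^N \cap A^+ \,/\, \det(\mathrm{Gram}\,\Phi^+)\,\det(\mathbb{1} - \mathrm{Gram}\,\Phi^+) \ge \varepsilon\Big\}. \tag{5.34}
\]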

$M_j$ is a manifold with boundary, and $\dim_{\mathbb{R}} M_j = \dim_{\mathbb{R}} (F_j)^N = 2N(j + N)$. If $h$ is "admissible", then, from (2.27) and by continuity of $h$, there is $\varepsilon_h > 0$ such that
\[
h\big([0,1] \times \partial M_j(\varepsilon_h)\big) \cap S_j = \emptyset. \tag{5.35}
\]
Now, $M_j/U(N)$ is a submanifold (with boundary) of $A^+/U(N)$, and
\[
\dim_{\mathbb{R}} M_j/U(N) = \dim_{\mathbb{R}} M_j - \dim_{\mathbb{R}} U(N) = N(2j + N) = \mathrm{codim}_{\mathbb{R}}\big(S_j/U(N), A^+/U(N)\big). \tag{5.36}
\]
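The equality of dimension and codimension used here can be checked by a direct count. The sketch below uses only $\dim_{\mathbb{C}} F_j = j + N$, $\dim_{\mathbb{C}}(E^+/G_j) = j$, and the standard facts that $U(N)$ and the space of Hermitian $N \times N$ matrices both have real dimension $N^2$.

```latex
% Dimension side: N copies of F_j, then quotient by U(N).
\[
\dim_{\mathbb{R}} (F_j)^N - \dim_{\mathbb{R}} U(N)
  = 2N(j+N) - N^2 = 2Nj + N^2 = N(2j+N).
\]
% Codimension side: the Gram condition is one Hermitian N x N matrix equation
% (N^2 real conditions), and (G_j)^N has complex codimension jN in (E^+)^N.
\[
N^2 + \operatorname{codim}_{\mathbb{R}}\big((G_j)^N, (E^+)^N\big)
  = N^2 + 2jN = N(2j+N).
\]
```

Both counts give $N(2j + N)$, which is what makes a mod 2 intersection number of the two quotient manifolds meaningful.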

Perturbing slightly $F_j$ if necessary, we may impose that $F_j$ and $G_j$ intersect transversally. Their intersection is then a complex subspace $H_j$ of $E^+$, of dimension $N$, and $S_j/U(N) \cap M_j/U(N)$ is a transverse intersection of cardinal 1. Its unique element is the $U(N)$ class of bases $(\varphi_1^+, \dots, \varphi_N^+)$ of $H_j$ such that $\mathrm{Gram}(\varphi_1^+, \dots, \varphi_N^+) = \frac{j+1}{j+2}\,\mathbb{1}$. So the intersection index of $S_j/U(N)$ and $M_j/U(N)$ (mod 2) is 1. From (5.35), we also have
\[
I_{\mathbb{Z}_2}\big(S_j/U(N),\ h(1, M_j)/U(N)\big) = 1. \tag{5.37}
\]
So $S_j$ intersects $Q = h(1, D(F_j))$, and Lemma 5.4 (hence Lemma 2.6) is proved. This ends the proof of Theorem 1.2.

Acknowledgement. The authors wish to thank B. Buffoni, P. Chaix and P. Indelicato for stimulating conversations. They are also indebted to the referee for valuable remarks.

References 1. Amann, H.: Saddle points and multiple solutions of differential equations. Math. Z. 169, 127–166 (1979) 2. Bahri, A.: Une m´ethode perturbative en th´eorie de Morse. Th`ese d’Etat, Universit´e P. et M. Curie, Paris, 1981 3. Bahri, A., Berestycki, H.: A perturbation method in critical point theory and applications. Trans. Am. Math. Soc. 267 (1), 1–32 (1981) 4. Bahri, A., Berestycki, H.: Points critiques de perturbations de fonctionnelles paires et applications. Comptes rendus Acad. Sci. Paris, S´erie A-B 291 (3), 189–192 (1980) 5. Bjorken, J.D., Drell, S.D.: Relativistic quantum mechanics. New York: McGraw-Hill, 1964 6. Brown, G.E., Ravenhall, D.G.: On the interaction of two electrons. Proc. Roy. Soc. London. A208, 552–559 (1951) 7. Buffoni, B., Jeanjean, L.: Minimax characterization of solutions for a semi-linear elliptic equation with lack of compactness. Ann. Inst. H. Poincar´e 10 (4), 377–404 (1993) 8. Burenkov, V.I., Evans, W.D.: On the evaluation of the norm of an integral operator associated with the stability of one-electron atoms. Preprint Mp-arc archive list number 97–247 9. Castro, A., Lazer, A.C.: Applications of a min-max principle. Rev. Colomb. Mat. 10, 141–149 (1976) 10. Chaix, P., Iracane, D.: The Bogoliubov-Dirac–Fock formalism. J. Phys. At. Mol. Opt. Phys. 22, 3791– 3814 (1989) 11. Coffman, C.V.: Ljusternik–Schnirelman theory: Complementary principles and the Morse index. Nonlinear Analysis, Theory and Applications 12 (5), 507–529 (1988) 12. Conley, C.: Isolated invariant sets and the Morse index. C.B.M.S. 38, Providence, RI: A.M.S. 1978 13. Conley, C., Zehnder, E.: The Birkhoff–Lewis fixed point theorem and a conjecture of V.I. Arnold. Invent. Math. 73, 33–49 (1983)


14. Daubechies, I., Lieb, E.H.: One-electron relativistic molecules with Coulomb interaction. Commun. Math. Phys. 90, 497–510 (1983) 15. Desclaux, J.: Relativistic Dirac–Fock expectation values for atoms with Z = 1 to Z = 120. Atomic Data and Nuclear Data Tables 12, 311–406 (1973) 16. Dolbeault, J., Esteban, M.J., S´er´e, E.: Variational characterization for eigenvalues of Dirac operators. Preprint mparc 98–177 17. Esteban, M.J., S´er´e, E. Existence and multiplicity of solutions for linear and nonlinear Dirac problems. In: Partial Differential Equations and Their Applications. CRM Proceedings and Lecture Notes, volume 12. Eds. P.C. Greiner, V. Ivrii, L.A. Seco and C. Sulem. Providence, RI: AMS, 1997 18. Evans, W.D., Perry, P., Siedentop, H.: The spectrum of relativistic one-electron atoms according to Bethe and Salpeter. Commun. Math. Phys. 178, 733–746 (1996) 19. Fang, G., Ghoussoub, N.: Morse-type information on Palais–Smale sequences obtained by min-max principles. Manuscripta Math. 75, 81–95 (1992) 20. Floer, A.: A relative Morse index for the symplectic action. CPAM 41, 393–407 (1988) 21. Ghoussoub, N.: Duality and perturbation methods in critical point theory. Cambridge: Cambridge Univ. Press, 1993 22. Gorceix, O., Indelicato, P., Desclaux, J.P.: Multiconfiguration Dirac–Fock studies of two-electron ions: I. Electron-electron interaction. J. Phys. B: At. Mol. Phys. 20, 639–649 (1987) 23. Grant, I.P.: Relativistic Calculation of Atomic Structures. Adv. Phys. 19, 747–811 (1970) 24. Grant, I.P., Quiney, H.M.: Foundations of the relativistic theory of atomic and molecular structure. Adv. Atom. Mol. Phys. 23, 37–86 (1988) 25. Griesemer, M., Siedentop, H.: A minimax principle for the eigenvalues in spectral gaps. Preprint mparc 97–492 26. Hardekopf, G., Sucher, J.: Relativistic wave equations in momentum space. Phys. Rev. A 30, 703–711 (1984) 27. Herbst, I.W.: Spectral theory of the operator (p2 + m2 )1/2 − Ze2 /r. Commun. Math. Phys. 53, 285–294 (1977) 28. 
Heully, J.L., Lindgren, I., Lindroh, E., Martensson-Pendrill, A.M.: Comment on relativistic wave equations and negative-energy states. Phys. Rev. A 33, 4426–4429 (1986) 29. Hofer, H., Wysocki, K.: First order elliptic systems and the existence of homoclinic orbits in Hamiltonian systems. Math. Ann. 288, 483–503 (1990) 30. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer, 1966 31. Quiney, H.M., Grant, I.P., Wilson, S.: The Dirac equation in the algebraic approximation: V. Selfconsistent field studies including the Breit interaction. J. Phys. B: At. Mol. Phys. 20, 1413–1422 (1987) 32. Kim, Y.K.: Relativistic self-consistent Field theory for closed-shell atoms. Phys. Rev. 154, 17–39 (1967) 33. Selecta of E.H. Lieb. The stability of matter: From atoms to stars. Edited by W. Thirring (second edition), 2nd edition, Berlin–Heidelberg–New York: Springer, 1997 34. Lieb, E.H., Loss, M., Siedentop, H.: Stability of relativistic matter via Thomas-Fermi theory. Helvetica Physica Acta 69 (5–6), 974–984 (1996) 35. Lieb, E.H., Siedentop, H., Solovej, J.P.: Stability and instability of relativistic electrons in classical electromagnetic fields. J. Stat. Phys. 89 (1-2), 37–59 (1997) 36. Lieb, E.H., Simon, B.: The Hartree–Fock theory for Coulomb systems. Commun. Math. Phys. 53, 185– 194 (1977) 37. Lieb, E.H., Yau, H.-T.: The stability and instability of relativistic matter. Commun. Math. Phys. 118, 177–213 (1988) 38. Lindgren, I., Rosen, A.: Relativistic self-consistent field calculations. Case Stud. At. Phys. 4, 93–149 (1974) 39. Lions, P.-L.: Solutions of Hartree–Fock equations for Coulomb systems. Commun. Math. Phys. 109, 33–97 (1987) 40. Majer, P., Terracini, S.: Periodic solutions to some problems of n-body type. Arch. Rat. Mech. Anal. 124, 381–404 (1993) 41. Mittleman, M.H.: Theory of relativistic effects on atoms: Configuration-space Hamiltonian. Phys. Rev. A 24 (3), 1167–1175 (1981) 42. 
Rabinowitz, P.H.: Periodic solutions of Hamiltonian systems. CPAM 31, 157–184 (1978) 43. Sucher, J.: Foundations of the relativistic theory of many-particle atoms. Phys. Rev. A 22 (2), 348–362 (1980) 44. Sucher, J.: Relativistic many-electron Hamiltonians. Phys. Scrypta 36, 271–281 (1987) 45. Swirles, B.: The relativistic self-consistent field. Proc. Roy. Soc. A 152, 625–649 (1935)


46. Thaller, B.: The Dirac equation. Berlin–Heidelberg–New York: Springer-Verlag, 1992 47. Tix, C.: Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall. Bull. London Math. Soc. 30 (3), 283–290 (1998) 48. Tix, C.: Lower bound for the ground state energy of the no-pair Hamiltonian. Phys. Lett. B 405, 293–296 (1997) 49. Viterbo, C.: Indice de Morse des points critiques obtenus par minimax. A.I.H.P. Analyse non lin´eaire 5 (3), 221–225 (1988) Communicated by B. Simon

Commun. Math. Phys. 203, 531 – 549 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

The Moduli of Flat PGL(2, R) Connections on Riemann Surfaces

Eugene Z. Xia

Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]

Received: 2 January 1997 / Accepted: 28 November 1998

Abstract: Suppose $X$ is a compact Riemann surface with genus $g > 1$. Each class $[\sigma] \in \mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R})$ is associated with the first and second Stiefel–Whitney classes $w_1([\sigma])$ and $w_2([\sigma])$. The set of representation classes with a fixed $w_1 \ne 0$ has two connected components. These two connected components are characterized by $w_2$ being 0 or 1. For each fixed $w_1 \ne 0$, we prove that the component characterized by $w_2 = 0$ contains an open dense set diffeomorphic to the total space of a vector bundle of rank $2g - 2$ over a once punctured algebraic torus of dimension $g - 1$. The other component, characterized by $w_2 = 1$, contains an open dense set diffeomorphic to the total space of a vector bundle of rank $2g - 2$ over an algebraic torus of dimension $g - 1$.

1. Introduction

Let $X$ be a compact Riemann surface of genus $g > 1$, and $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))$ the space of homomorphisms from $\pi_1(X)$ to $\mathrm{PGL}(2, \mathbb{R})$. The group $\mathrm{PGL}(2, \mathbb{R})$ has two connected components and is isomorphic to $\mathrm{SO}(2, 1)$. The space $\mathrm{Hom}(\pi_1(X), \mathrm{PSL}(2, \mathbb{R}))$ has $4g - 3$ connected components and these components are distinguished by the Euler class $e$ [6,9,10,18]. To obtain more detailed information on these representation spaces, Hitchin made use of the complex structure on $X$. By studying the space of rank-2 Higgs bundles over $X$, he showed that the $2g - 2$ connected components (corresponding to non-zero Euler classes) of $\mathrm{Hom}(\pi_1(X), \mathrm{PSL}(2, \mathbb{R}))/\mathrm{PSL}(2, \mathbb{R})$ are complex vector bundles over symmetric products of $X$ [9].
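The component counts quoted so far can be combined into a total for $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))$. The following is a sketch under two assumptions that are not spelled out at this point of the text: that $w_1 = 0$ singles out exactly the representations into the identity component $\mathrm{PSL}(2, \mathbb{R})$, and that the count of two components for each non-zero $w_1$ also holds before taking the quotient by conjugation.

```latex
% H^1(X, Z_2) is isomorphic to Z_2^{2g}, so it has 2^{2g} - 1 non-zero classes.
\[
\underbrace{(4g-3)}_{w_1 = 0}
\;+\;
\underbrace{2\,\big(2^{2g}-1\big)}_{w_1 \neq 0}
\;=\; 2^{2g+1} + 4g - 5 .
\]
% Example: g = 2 gives 5 + 2 * 15 = 35 = 2^5 + 8 - 5.
```

Since $w_1$ is locally constant, representations with different values of $w_1$ lie in different components, which is why the contributions simply add.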

Let $\alpha \in H^1(X, \mathbb{Z}_2)$ and define
\[
PW_\alpha = \{\sigma \in \mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R})) : w_1(\sigma) = \alpha\}.
\]
For any two non-zero classes $\alpha, \beta \in H^1(X, \mathbb{Z}_2)$, $PW_\alpha$ is homeomorphic to $PW_\beta$ [19]. Fix a non-zero class $\alpha$ and define $PW$ to be $PW_\alpha$. Then $PW$ has two connected components distinguished by the two Stiefel–Whitney classes in $H^2(X, \mathbb{Z}_2)$ [19]. This paper is a study of the topology of the space $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R})$, in particular the component $PW/\mathrm{PGL}(2, \mathbb{R})$. Each representation $\sigma \in PW$ may be lifted to an element $\tilde\sigma = \pi^*(\sigma) \in \mathrm{Hom}(\pi_1(\tilde X), \mathrm{SL}(2, \mathbb{R}))$, where
\[
\pi : \tilde X \longrightarrow X
\]
is a chosen unramified double cover of $X$. Let $PW'$ be the subset of $PW$ such that $\sigma \in PW'$ implies that $\sigma$ is irreducible and $\tilde\sigma$ is a semi-simple but non-central representation. In particular, $PW'$ is open and dense in $PW$. For a precise description of $PW'$, see Sect. 2.

Theorem 1.1. The space $PW'/\mathrm{PGL}(2, \mathbb{R})$ has two connected components $PQ_0$ and $PQ_1$. The component $PQ_0$ is the total space of a vector bundle of rank $2g - 2$ over a once punctured compact algebraic torus of dimension $g - 1$. The component $PQ_1$ is the total space of a vector bundle of rank $2g - 2$ over an algebraic torus of dimension $g - 1$.

The precise description of these two components is given in Sect. 3 and 6.

Corollary 1.2. The space $\mathrm{Hom}(\pi_1(X), \mathrm{PGL}(2, \mathbb{R}))$ has $2^{2g+1} + 4g - 5$ connected components.

A representation is called parabolic if it is reducible but not semi-simple. The set $PW \setminus PW'$ consists of representations of three types:
1. The $\mathbb{R}^*$ representations.
2. The parabolic representations.
3. The representations that lift to parabolic representations by $\pi^*$.
Together these points form a subvariety of $PW$.

2. The Pull-Back Representations of $\pi_1(\tilde X)$

Let $X$ be a compact Riemann surface of genus $g > 1$ and $G$ an algebraic group. A representation $\sigma \in \mathrm{Hom}(\pi_1(X, x), G)$ defines a flat $G$-bundle $P$ over $X$. Let
\[
\begin{aligned}
\mathrm{SL}^-(2, \mathbb{R}) &= \{g \in \mathrm{GL}(2, \mathbb{R}) : \det(g) = -1\}, \\
\mathrm{SL}_i(2, \mathbb{R}) &= \mathrm{SL}(2, \mathbb{R}) \cup i\,\mathrm{SL}^-(2, \mathbb{R}), \\
\mathrm{SL}_{\pm i}(2, \mathbb{R}) &= \mathrm{SL}_i(2, \mathbb{R}) \cup i\,\mathrm{SL}_i(2, \mathbb{R}).
\end{aligned}
\]
Then $\mathrm{SL}_i(2, \mathbb{R})$ is a subgroup of $\mathrm{SL}(2, \mathbb{C})$ and has two connected components. The projectivization of both $\mathrm{SL}_i(2, \mathbb{R})$ and $\mathrm{SL}_{\pm i}(2, \mathbb{R})$ is $\mathrm{PGL}(2, \mathbb{R})$.

The obstruction classes of $P$ give rise to the obstruction class maps
\[
o_n : \mathrm{Hom}(\pi_1(X, x), G) \longrightarrow H^n(X, \pi_{n-1}(G)).
\]
In particular, if $G$ is $\mathrm{GL}(2, \mathbb{R})$, $\mathrm{PGL}(2, \mathbb{R})$ or $\mathrm{SL}_i(2, \mathbb{R})$, then $o_1$ is the first Stiefel–Whitney class $w_1$ [17,19]. The class $o_2$ is the second Stiefel–Whitney class $w_2$ when $G$ is $\mathrm{PSL}(2, \mathbb{C})$ and the Euler class $e$ when $G$ is $\mathrm{PSL}(2, \mathbb{R})$ [6,17].

Fix a point $x \in X$. Choose a set of generators $S = \{a_i, b_i\}_{i=0}^{g-1}$ for the fundamental group $\pi_1(X, x)$ and define $R$ to be the formal expression
\[
\prod_{i=0}^{g-1} a_i b_i a_i^{-1} b_i^{-1}.
\]
Then $\pi_1(X, x)$ is generated by $S$ with the relation $R = 1$. Let $\Gamma$ be the central extension of $\pi_1(X, x)$ by $c$ with $R = c$ and $c^2 = 1$. This gives the exact sequence
\[
0 \longrightarrow \mathbb{Z}_2 \longrightarrow \Gamma \longrightarrow \pi_1(X, x) \longrightarrow 1.
\]
Let $M$ be the space $\mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$, which has two connected components depending on whether $c$ goes to $I$ or $-I$ [2,6,9]. Denote the two components by $M_0$ and $M_1$, respectively. Note $M_0$ is the space $\mathrm{Hom}(\pi_1(X, x), \mathrm{SL}(2, \mathbb{C}))$. The space $N = \mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{R}))$ has $4g - 3$ connected components consisting of $\{N_j\}_{j=2-2g}^{2g-2}$ [6,9] and is a subset of $\mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$. Each $\sigma \in \mathrm{Hom}(\Gamma, \mathrm{SL}(2, \mathbb{C}))$ acts on $\mathbb{C}^2$ via the standard representation of $\mathrm{SL}(2, \mathbb{C})$.

Definition 2.1. A representation $\sigma$ is irreducible if its action on $\mathbb{C}^2$ is irreducible, and is semi-simple if it is a direct sum of irreducible representations.

Identify $H^2(X, \mathbb{Z}_2)$ with $\mathbb{Z}_2$ and $H^2(X, \mathbb{Z})$ with $\mathbb{Z}$. Let $J_2(X)$ be the space of central representations:
\[
J_2(X) = \mathrm{Hom}(\pi_1(X, x), \{\pm I\}) \cong \mathbb{Z}_2^{2g}.
\]

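Define
\[
\begin{aligned}
PM &= \mathrm{Hom}(\pi_1(X, x), \mathrm{PSL}(2, \mathbb{C})), \\
PM_i &= w_2^{-1}(i) \subset PM, \\
PN &= \mathrm{Hom}(\pi_1(X, x), \mathrm{PSL}(2, \mathbb{R})), \\
PN_j &= e^{-1}(j) \subset PN.
\end{aligned}
\]
The space $J_2(X)$ is a group and acts on the $M_i$'s and $N_i$'s. The quotients are precisely the $PM_i$'s and $PN_i$'s (the projective representations).

Remark 2.2. We shall use the superscripts $s$ and $ss$ to denote the irreducible and semi-simple subspaces. For example, $M^s$ and $M^{ss}$ denote the subspaces of irreducible and semi-simple representations in $M$, respectively.

Definition 2.3.
\[
\begin{aligned}
W &= \{\sigma \in \mathrm{Hom}(\Gamma, \mathrm{SL}_i(2, \mathbb{R})) : \sigma(a_0) \in i\,\mathrm{SL}^-(2, \mathbb{R}) \text{ and } \sigma(S \setminus \{a_0\}) \subset \mathrm{SL}(2, \mathbb{R})\}, \\
W_0 &= \{\sigma \in W : \sigma(c) = I\}, \\
W_1 &= \{\sigma \in W : \sigma(c) = -I\}.
\end{aligned}
\]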

The subspaces $W_0$ and $W_1$ are the ones associated with the second Stiefel–Whitney class $w_2$ being 0 and 1, respectively. The group $J_2(X) \cap W$ acts on $W$. Denote by $PW$, $PW_0$, $PW_1$ the respective quotient spaces. The sets $PW_0$ and $PW_1$ are connected [19]. There exists a double cover $\tilde X$ of $X$ with covering map [1]
\[
\pi : \tilde X \longrightarrow X
\]
such that $\pi_1(\tilde X, \tilde x)$ is generated by $\tilde S = \{\tilde a_i, \tilde b_i\}_{i=1-g}^{g-1}$ with
\[
\pi_*(\tilde a_0) = a_0^2, \quad \pi_*(\tilde b_0) = b_0,
\]
and for $i > 0$,
\[
\pi_*(\tilde a_i) = a_0^{-1}\pi_*(\tilde a_{-i})a_0 = a_i, \quad \pi_*(\tilde b_i) = a_0^{-1}\pi_*(\tilde b_{-i})a_0 = b_i.
\]
The double cover admits a fixed point free involution $\tau$ that is $\pi$-invariant, i.e., the diagram
\[
\begin{array}{ccc}
\tilde X & \xrightarrow{\ \tau\ } & \tilde X \\
\downarrow \pi & & \downarrow \pi \\
X & \xrightarrow{\ \mathrm{id}\ } & X
\end{array}
\]
commutes. Composition of representations $\sigma \in PW$ with the induced map
\[
\pi_* : \pi_1(\tilde X, \tilde x) \longrightarrow \pi_1(X, x)
\]
defines a map
\[
\pi^* : PW \longrightarrow \mathrm{Hom}(\pi_1(\tilde X, \tilde x), \mathrm{PSL}(2, \mathbb{R})).
\]

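Proposition 2.4. The image of $\pi^*$ consists of representations $\tilde\sigma$ satisfying $e(\tilde\sigma) = 0$.

Proof. Let $\tilde P = \pi^*(P)$. Then $\tilde P$ admits an involution:
\[
\begin{array}{ccc}
\tilde P & \xrightarrow{\ \tau^*\ } & \tilde P \\
\downarrow & & \downarrow \\
\tilde X & \xrightarrow{\ \tau\ } & \tilde X.
\end{array}
\]
Since $\sigma(a_0) \in i\,\mathrm{SL}^-(2, \mathbb{R})$, the associated flat principal bundle $P$ is not orientable. Hence $\tau^*$ must reverse orientations on $\tilde P$. Since $\tau$ preserves orientations on $\tilde X$, but $\tau^*$ reverses orientations on $\tilde P$,
\[
e(\tilde\sigma) = \tau^*(e(\tilde\sigma)) = e(\tau^*(\tilde\sigma)) = -e(\tilde\sigma).
\]
This implies $e(\tilde\sigma) = 0$. □

Corollary 2.5. The representation $\tilde\sigma$ may be further lifted to a representation in $\mathrm{Hom}(\pi_1(\tilde X, \tilde x), \mathrm{SL}(2, \mathbb{R}))$.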

Proof. The obstruction to lifting is the mod-2 reduction of the Euler class $e(\tilde\sigma)$, which is zero by Proposition 2.4. □

Let $F_0$ be the group $\pi_*(\pi_1(\tilde X, \tilde x))$, which has index 2 in $\pi_1(X, x)$, and let $F_1 = \pi_1(X, x) \setminus F_0$. Consider the following homomorphisms of groups
\[
\pi_1(\tilde X, \tilde x) \xrightarrow{\ \pi_*\ } F_0 \xrightarrow{\ i\ } \pi_1(X, x).
\]
These homomorphisms are injective. Therefore, a representation $\sigma \in W$ induces a representation
\[
\tilde\sigma : \pi_1(\tilde X, \tilde x) \longrightarrow \mathrm{SL}(2, \mathbb{R}).
\]
This defines a map
\[
\pi^* : W \longrightarrow \mathrm{Hom}(\pi_1(\tilde X, \tilde x), \mathrm{SL}(2, \mathbb{R})).
\]

The map $\pi^*$ is equivariant with respect to the action of $\mathrm{PGL}(2, \mathbb{R})$ and thus descends to a map
\[
\pi^* : W/\mathrm{PGL}(2, \mathbb{R}) \longrightarrow \mathrm{Hom}(\pi_1(\tilde X, \tilde x), \mathrm{SL}(2, \mathbb{R}))/\mathrm{PGL}(2, \mathbb{R}).
\]

Definition 2.6.
\[
\begin{aligned}
W^s &= \{\sigma \in W : \sigma \text{ is irreducible}\}, \\
W'' &= \{\sigma \in W^s : \pi^*(\sigma) \text{ is irreducible}\}, \\
W' &= \{\sigma \in W^s : \pi^*(\sigma) \text{ is semi-simple, and } \sigma(a_0^2) \ne \pm I\} \cup W'', \\
PW^s &= (J_2(X) \cap W)\backslash W^s, \\
PW' &= (J_2(X) \cap W)\backslash W'.
\end{aligned}
\]
The subspaces associated with $w_2$ being 0 and 1 are denoted by the subscripts 0 and 1.

Proposition 2.7. The subspace $W'$ is open and dense in $W$.

Proof. The space $W^s$ is smooth and open and dense in $W$. The subvariety $W^s \setminus W'$ has real codimension at least 1 in $W^s$. Hence $W'$ is open and dense in $W^s$ and is, therefore, open and dense in $W$. □

Corollary 2.8. The subspace $PW'$ is open and dense in $PW$.

Proposition 2.9. The projection $\pi^*$ is a 2-to-1 map on $W'$ and the two points in each fibre differ by a central representation.

Proof. Let $\sigma_1, \sigma_2 \in W'$ such that $\tilde\sigma = \pi^*(\sigma_1) = \pi^*(\sigma_2)$. Then
\[
\tilde\sigma \circ \pi_*(\tilde a_0) = \sigma_1(a_0^2) = \sigma_2(a_0^2).
\]
Case (1). Suppose $\sigma_1(a_0^2) = \sigma_2(a_0^2) \ne \pm I$. Then there are exactly two elements $\pm A \in \mathrm{SL}(2, \mathbb{R})$ such that
\[
(\pm A)^2 = \sigma_1(a_0)^2 = \sigma_1(a_0^2).
\]
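The count of square roots used in Case (1) can be checked with the Cayley–Hamilton theorem. The sketch below works in $\mathrm{SL}(2, \mathbb{C})$, which contains $i\,\mathrm{SL}^-(2, \mathbb{R})$; the symbols $M$, $B$ and $t$ are introduced here only for illustration.

```latex
% Assumption: M in SL(2,C) with M^2 = B, and t = tr(M).
% Cayley-Hamilton with det(M) = 1 gives M^2 - tM + I = 0, hence
\[
B = M^2 = tM - I, \qquad \operatorname{tr}(B) = t^2 - 2 .
\]
% If B \neq -I then t \neq 0, and M is determined by B up to the sign of t:
\[
M = \frac{1}{t}\,(B + I), \qquad t = \pm\sqrt{\operatorname{tr}(B) + 2} .
\]
```

So a matrix $B \ne -I$ that has one square root $A$ in $\mathrm{SL}(2, \mathbb{C})$ has exactly the two square roots $\pm A$ there; if $A$ lies in $i\,\mathrm{SL}^-(2, \mathbb{R})$, then so does $-A$.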

Hence $\sigma_2(a_0) = \pm A = \pm\sigma_1(a_0)$. Hence the inverse image of $\tilde\sigma$ by $\pi^*$ has two points and these two points differ by a central representation.

Case (2). Let $\sigma_1, \sigma_2 \in W''$. Then $\tilde\sigma$ is irreducible. Hence $\sigma_1|_{F_0} = \sigma_2|_{F_0}$ is irreducible. Let $d \in F_0$. Then $a_0^{-1} d a_0 \in F_0$. This implies $\sigma_1(a_0^{-1} d a_0) = \sigma_2(a_0^{-1} d a_0)$. Hence
\[
\sigma_2(a_0)\sigma_1(a_0^{-1})\sigma_1(d) = \sigma_2(d)\sigma_2(a_0)\sigma_1(a_0^{-1}).
\]
That is, $\sigma_2(a_0)\sigma_1(a_0^{-1})$ intertwines $\sigma_1|_{F_0}$. Since $\sigma_1|_{F_0}$ is irreducible, by Schur's lemma, $\sigma_2(a_0)\sigma_1(a_0^{-1})$ is in the center of $\mathrm{SL}(2, \mathbb{R})$. Thus, $\sigma_2(a_0) = \pm\sigma_1(a_0)$. □

Corollary 2.10. The map $\pi^*$ is 1-to-1 on $PW'$.

Corollary 2.11.
1. The map $\pi^*$ is 2-to-1 on $W'/\mathrm{PGL}(2, \mathbb{R})$ and the two points in each fibre differ by a central representation.
2. The map $\pi^*$ is 1-to-1 on $PW'/\mathrm{PGL}(2, \mathbb{R})$.

3. The Prym Variety over $\tilde X$

Consider the given complex structure on $X$ and denote by $K$ its canonical bundle. The projection $\pi$ induces a complex structure on $\tilde X$ and the free involution $\tau$ preserves this structure. Any holomorphic bundle $E$ over $X$ pulls back to a holomorphic bundle $\tilde E$ over $\tilde X$ such that $\tau^*(\tilde E) = \tilde E$. In particular, $\tau^*\tilde K = \tilde K$, where $\tilde K$ is the canonical bundle on $\tilde X$. Let $\mathrm{Div}^0(X)$ denote the group of all degree zero divisors on $X$. The Jacobi variety $J(X)$ is the space of holomorphic line bundles over $X$ with degree zero [1].

For any holomorphic line bundle $L$ over $X$, $\pi^*L$ is a holomorphic line bundle over $\tilde X$. Hence $\pi$ induces a homomorphism
\[
\pi^* : J(X) \longrightarrow J(\tilde X).
\]
If $D \in \mathrm{Div}^0(X)$, then $\pi^{-1}(D) \in \mathrm{Div}^0(\tilde X)$. The resulting homomorphism
\[
\pi^* : \mathrm{Div}^0(X) \longrightarrow \mathrm{Div}^0(\tilde X)
\]
together with the basic epimorphism $u$ satisfy the commutative diagram [1]:
\[
\begin{array}{ccc}
\mathrm{Div}^0(X) & \xrightarrow{\ \pi^*\ } & \mathrm{Div}^0(\tilde X) \\
\downarrow u & & \downarrow u \\
J(X) & \xrightarrow{\ \pi^*\ } & J(\tilde X)
\end{array}
\]

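On the other hand, if $\tilde D \in \mathrm{Div}^0(\tilde X)$, then $\pi(\tilde D) \in \mathrm{Div}^0(X)$. Hence $\pi$ also induces a homomorphism (the norm map)
\[
\mathrm{Nm} : \mathrm{Div}^0(\tilde X) \longrightarrow \mathrm{Div}^0(X).
\]
The map $\mathrm{Nm}$ descends to a homomorphism
\[
\mathrm{Nm} : J(\tilde X) \longrightarrow J(X)
\]
and the diagram
\[
\begin{array}{ccc}
\mathrm{Div}^0(\tilde X) & \xrightarrow{\ \mathrm{Nm}\ } & \mathrm{Div}^0(X) \\
\downarrow u & & \downarrow u \\
J(\tilde X) & \xrightarrow{\ \mathrm{Nm}\ } & J(X)
\end{array}
\]
commutes [1]. For $\tilde D \in \mathrm{Div}^0(\tilde X)$, $\tau^{-1}(\tilde D)$ is in $\mathrm{Div}^0(\tilde X)$. Hence $\tau$ induces automorphisms $\tau^*$ on the groups $\mathrm{Div}^0(\tilde X)$ and $J(\tilde X)$ such that the diagram
\[
\begin{array}{ccc}
\mathrm{Div}^0(\tilde X) & \xrightarrow{\ \tau^*\ } & \mathrm{Div}^0(\tilde X) \\
\downarrow u & & \downarrow u \\
J(\tilde X) & \xrightarrow{\ \tau^*\ } & J(\tilde X)
\end{array}
\]
commutes. Let
\[
\begin{aligned}
\mathcal{P} &= \{\tilde L \in J(\tilde X) : \tau^*(\tilde L) = -\tilde L\}, \\
\mathcal{S} &= \{\tilde L \in J(\tilde X) : \tau^*(\tilde L) = \tilde L\}.
\end{aligned}
\]

Remark 3.1.
1. The space $\mathcal{S}$ is an abelian variety of dimension $g$.
2. The identity component $\mathcal{P}_0$ of $\mathcal{P}$ is, by definition, the Prym variety $\mathrm{Prym}(\tilde X, \tau)$.
3. The subgroups of 2-torsions of $J(X)$ and $J(\tilde X)$ are precisely $J_2(X)$ and $J_2(\tilde X)$.

The group $\mathcal{P}$ contains the subgroup $J_2(\tilde X) \cap \mathcal{P}$ and the quotient is denoted by
\[
P\mathcal{P} = (J_2(\tilde X) \cap \mathcal{P})\backslash\mathcal{P}.
\]

Proposition 3.2.
1. The kernel of the map $\pi^* : J(X) \longrightarrow J(\tilde X)$ has two points, namely the trivial bundle $1$ and a two torsion $T_\eta$.
2. $|\mathcal{P} \cap J_2(\tilde X)| = 2^{2g}$.
3. $\mathcal{P}$ has four connected components and $P\mathcal{P}$ is connected.
4. $\mathcal{P}$ contains $\mathrm{Ker}(\mathrm{Nm})$ as a subgroup of index 2.
5. If $\deg(\tilde L_0) = 2$ such that $\tau^*\tilde L_0 = \tilde L_0$, then there exists $\tilde L_1$ such that $\tilde L_1^2 = \tilde L_0$ and $\tau^*(\tilde L_1) = \tilde L_1 \otimes \tilde T$, where $\tilde T \in \mathrm{Ker}(\mathrm{Nm}) \setminus \mathcal{P}_0$.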

Proof. For 1, 2, 3 and 4, see [1,11]. Suppose $\deg(\tilde L_0) = 2$. Then it is immediate that there exists $\tilde L_1$ such that $\tilde L_1^2 = \tilde L_0$. Since $\tau^*(\tilde L_0) = \tilde L_0$,
\[
(\tau^*(\tilde L_1))^2 = \tau^*(\tilde L_1^2) = \tau^*(\tilde L_0) = \tilde L_0 = \tilde L_1^2.
\]
Hence,
\[
\tau^*(\tilde L_1) = \tilde L_1 \otimes \tilde T
\]
for some $\tilde T \in J_2(\tilde X)$. In addition, since $\tilde T = \tilde L_1 \otimes (\tau^*(\tilde L_1))^{-1}$ and $\deg(\tilde L_1) = 1$, $\tilde T \in \mathrm{Ker}(\mathrm{Nm}) \setminus \mathcal{P}_0$ [1]. □

4. Stable Holomorphic Pairs and the Self-Dual Equation

This section briefly reviews the rank-2 gauge theory over Riemann surfaces. The main results are due to Corlette, Hitchin and Donaldson. See [2–5,9] for details. For general smooth projective varieties, see [13–16].

4.1. The complex case. The maximum compact subgroups of $\mathrm{GL}(2, \mathbb{C})$ and $\mathrm{SL}(2, \mathbb{C})$ are $U(2)$ and $SU(2)$ with fundamental groups isomorphic to $\mathbb{Z}_2$. Let $P^c$ be a principal $\mathrm{GL}(2, \mathbb{C})$ bundle over a compact Riemann surface with first Chern class $c_1(P^c)$ being either 0 or 1. Let $V$ be the associated vector bundle. Fix a Hermitian metric $h$ on $V$. This corresponds to a reduction of $P^c$ to a $U(2)$ principal bundle $P$ over $X$. Choose a $U(2)$ (i.e. compatible with $h$) connection $D_0$ on $V$ such that the curvature $F(D_0)$ is central [2,9]. In addition, in the case of $c_1(P^c) = 0$, we choose $h$ to be the constant metric 1 and $D_0 = d$. Denote by $\mathcal{G}^c$ the $\mathrm{SL}(2, \mathbb{C})$ gauge group on $P^c$ and $\mathcal{G}$ the $SU(2)$ gauge group on $P$. The gauge group $\mathcal{G}$ preserves $h$. Let
\[
\mathrm{ad}(P) = P \times_{\mathrm{Ad}} \mathfrak{su}(2), \quad \mathrm{ad}(P^c) = P^c \times_{\mathrm{Ad}} \mathfrak{sl}(2, \mathbb{C}),
\]
where $\mathrm{Ad}$ is the adjoint representation. The difference of any two connections on $P^c$ or $P$ is a 1-form. Hence, with the choice of $D_0$, one may identify $\Omega^1(X, \mathrm{ad}(P^c))$ and $\Omega^1(X, \mathrm{ad}(P))$ with the space of connections of the fixed determinant $\det(D_0)$ on $P^c$ and $P$, respectively.

An element $\Phi$ of $\Omega^{1,0}(X, \mathrm{ad}(P^c))$ is called a Higgs field. Given $\Phi \in \Omega^{1,0}(X, \mathrm{ad}(P^c))$ and $A \in \Omega^1(X, \mathrm{ad}(P))$, one may construct connections $D_A$ and $D$:
\[
D_A = D_0 + A, \quad D = D_0 + A + (\Phi + \Phi^*),
\]
where $\Phi^*$ denotes the adjoint of $\Phi$ with respect to the metric $h$. The $(0,1)$ part of $D_0$ determines a holomorphic structure $\bar\partial_0$ on $V$ [7]. Again, let $A \in \Omega^1(X, \mathrm{ad}(P))$, i.e. $D_A$ is compatible with $h$. Then $\bar\partial_A$, the $(0,1)$ part of $D_A$, defines a holomorphic structure on $V$. Similarly, given a holomorphic structure $\bar\partial$ on $V$ with
\[
\det(\bar\partial) = \det(\bar\partial_0),
\]
there exists a unique $A \in \Omega^1(X, \mathrm{ad}(P))$ such that $D_A$ is compatible with $h$ and $\bar\partial_A = \bar\partial$.

Hence the metric $h$ determines a one-to-one correspondence between the space $\Omega^1(X, \mathrm{ad}(P))$ and the space of holomorphic structures on $V$ with determinant equal to $\det(\bar\partial_0)$.

Higgs fields are sections of the bundle $\mathrm{End}_0 V \otimes K$, where $\mathrm{End}_0 V$ is the bundle of trace free complex linear transformations of $V$, and $K$ is the canonical bundle on $X$. A holomorphic structure $\bar\partial$ on $V$ induces on $\mathrm{End}_0 V$ a holomorphic structure which, when combined with the inherent holomorphic structure on $K$, gives a holomorphic structure (which we shall also call $\bar\partial$) on $\mathrm{End}_0 V \otimes K$. A Higgs field $\Phi$ is holomorphic if
\[
\bar\partial\Phi = 0.
\]
When $\Phi$ is holomorphic, we say $(\bar\partial, \Phi)$ is a Higgs bundle. A pair $(D_A, \Phi)$ is holomorphic if
\[
\bar\partial_A\Phi = 0.
\]
Therefore, the set $HC$ of holomorphic $(D_A, \Phi)$ pairs corresponds bijectively to the set $\mathrm{Hig}$ of Higgs bundles $(\bar\partial, \Phi)$.

The complex gauge group $\mathcal{G}^c$ acts on $\mathrm{Hig}$ naturally. Since $\mathcal{G} \subset \mathcal{G}^c$, $\mathcal{G}$ acts on the Higgs fields. Hence $\mathcal{G}$ acts on $HC$. The group $\mathcal{G}^c$ is much larger than $\mathcal{G}$; hence, one can expect the space $HC/\mathcal{G}$ to be much larger than the space $\mathrm{Hig}/\mathcal{G}^c$. The key issue of this analysis is to establish an equivalence between the stable Higgs bundles in $\mathrm{Hig}/\mathcal{G}^c$ and the irreducible pairs in $HC/\mathcal{G}$ satisfying Hitchin's self-duality equation.

A holomorphic subbundle $L$ of $(V, \bar\partial)$ is $\Phi$-invariant if
\[
\Phi(L) \subseteq L \otimes K.
\]
A Higgs bundle $(\bar\partial, \Phi)$ is stable (semi-stable) if $L$ being $\Phi$-invariant implies
\[
\deg(L) < (\le)\ \tfrac{1}{2}\deg(V).
\]
A Higgs bundle is poly-stable if it is a direct sum of stable Higgs bundles of the same degree. Denote by $\mathrm{Hig}^s$ and $\mathrm{Hig}^{ss}$ the spaces of stable and poly-stable Higgs bundles on $X$. The action of $\mathcal{G}^c$ preserves $\mathrm{Hig}^s$ and $\mathrm{Hig}^{ss}$; hence, one may define the moduli spaces
\[
\mathcal{H}^s = \mathrm{Hig}^s/\mathcal{G}^c, \quad \mathcal{H}^{ss} = \mathrm{Hig}^{ss}/\mathcal{G}^c.
\]
The space $\mathcal{H}^{ss}$ is a coarse moduli space parameterizing $\mathcal{G}^c$-equivalence classes of poly-stable Higgs bundles while $\mathcal{H}^s$ is a fine moduli space of stable Higgs bundles [9,12]. Denote by $\mathcal{H}^s_0$, $\mathcal{H}^s_1$, $\mathcal{H}^{ss}_0$, $\mathcal{H}^{ss}_1$ the components of stable and poly-stable Higgs bundles associated with $c_1(P^c)$ being 0 and 1, respectively.

A pair $(D_A, \Phi)$ is irreducible if the connection $D_A + \Phi + \Phi^*$ is irreducible. A pair $(D_A, \Phi)$ is semi-simple if $D_A + \Phi + \Phi^*$ is a direct sum of irreducible connections of the same degree. A holomorphic pair $(D_A, \Phi)$ is called self-dual if it satisfies Hitchin's self-duality equation [9]:
\[
F(D_A) + [\Phi, \Phi^*] = \tfrac{1}{2}\,\mathrm{tr}(F(D_0))\,I.
\]
Let $YM^s$ and $YM^{ss}$ denote the spaces of irreducible and semi-simple self-dual pairs. The action of $\mathcal{G}$ preserves both the properties of irreducibility and self-duality; hence, one may define the moduli spaces
\[
\mathcal{YM}^s = YM^s/\mathcal{G}, \quad \mathcal{YM}^{ss} = YM^{ss}/\mathcal{G}.
\]

Hitchin showed that each $\mathcal{G}^c$ orbit in $\mathcal{H}^s$ contains a Higgs bundle $(\bar\partial, \Phi)$ such that its corresponding pair $(D_A, \Phi)$ is a self-dual pair with
\[
\bar\partial_A = \bar\partial.
\]
Moreover the Higgs bundle $(\bar\partial, \Phi)$ is unique up to $\mathcal{G}$-equivalence. In other words, the two moduli spaces $\mathcal{H}^s$ and $\mathcal{YM}^s$ are diffeomorphic. Furthermore, given any self-dual pair in $YM^s$, the connection
\[
D = D_A + \Phi + \Phi^*
\]
is flat and irreducible for $c_1 = 0$ and descends to a flat $\mathrm{PSL}(2, \mathbb{C})$ connection for $c_1 = 1$. From now on, we shall always assume $V$ to have a holomorphic structure and write $V_0$ for $\bar\partial_0$ and $(V, \Phi)$ for a poly-stable Higgs bundle instead of $(\bar\partial, \Phi)$. We call the connection $D$, so constructed from a Higgs bundle $(V, \Phi)$, the connection associated with $(V, \Phi)$. The 2-torsion subgroup $J_2(X)$ acts on $\mathcal{H}^{ss}$ by
\[
L.(V, \Phi) = (L \otimes V, \Phi).
\]

Theorem 4.1 (Hitchin [9,12]).
\[
\begin{aligned}
\mathcal{H}^s_c &\cong M^s_c/\mathrm{PSL}(2, \mathbb{C}), \\
\mathcal{H}^{ss}_c &\cong M^{ss}_c/\mathrm{PSL}(2, \mathbb{C}), \\
J_2(X)\backslash\mathcal{H}^s_c &\cong PM^s_c/\mathrm{PSL}(2, \mathbb{C}), \\
J_2(X)\backslash\mathcal{H}^{ss}_c &\cong PM^{ss}_c/\mathrm{PSL}(2, \mathbb{C}).
\end{aligned}
\]
Denote by $g$ the identification maps of these spaces.

It is straightforward to generalize the notion of stability, semi-stability and poly-stability to Higgs bundles $(V, \Phi)$ with $c_1(V)$ equal to any integer. Define $\mathcal{H}^{ss}_c$ to be the moduli space of $\mathcal{G}^c$-equivalence classes of poly-stable Higgs bundles $(V, \Phi)$ with $c_1(V) = c$. Let $(L_d, D_L)$ be a holomorphic line bundle of degree $d$ with a connection $D_L$. The line bundle $L_d$ defines an isomorphism between the space of holomorphic bundles of a fixed first Chern class $c$ with the space of holomorphic bundles with first Chern class $c + 2d$:
\[
L_d \otimes V_c \longmapsto V_{c+2d}.
\]
Moreover if $V$ has a connection $D$, then the projective bundles $(P(V), D)$ and $(P(L_d \otimes V), D_L \otimes D)$ are isomorphic. Define $U = L_d \otimes V$, where $c_1(V)$ is either 0 or 1. Then
\[
U \otimes U^* = (L_d \otimes V) \otimes (L_d \otimes V)^* = V \otimes V^*,
\]
and
\[
\mathrm{End}_0 U \otimes K = \mathrm{End}_0 V \otimes K.
\]
Hence $L_d$ defines an isomorphism
\[
(V, \Phi) \overset{L_d}{\longmapsto} (L_d \otimes V, \Phi)
\]
which is $\mathcal{G}^c$-equivariant and hence defines an isomorphism from $\mathcal{H}^{ss}_c$ to $\mathcal{H}^{ss}_{c+2d}$.

Moduli of Flat PGL(2, R) Connections on Riemann Surfaces

Corollary 4.2. The components $\mathcal{H}_c^{ss}$ and $J_2(X)\backslash\mathcal{H}_c^{ss}$ are homeomorphic to $\mathcal{M}_{c_0}^{ss}/\mathrm{PSL}(2,\mathbb{C})$ and $P\mathcal{M}_{c_0}^{ss}/\mathrm{PSL}(2,\mathbb{C})$, respectively, if $c \equiv c_0 \bmod 2$.

Fix the Hermitian metric $\tilde h = \pi^*(h)$ on $\tilde X$.

Definition 4.3. Construct the above moduli spaces on the double cover $\tilde X$ and denote these objects with a tilde. For example, $\tilde h = \pi^*(h)$ is the pull-back Hermitian metric on $\tilde V$ and $\tilde{\mathcal{H}}^{ss}$ is the coarse moduli space of poly-stable Higgs bundles on $\tilde X$. The involution $\tau$ induces a pull-back action $\tau^*$ on $\tilde{\mathcal{H}}^{ss}$.

Proposition 4.4. The involution $\tau^*$ commutes with $\tilde g$.

Proof. Since $\tau$ preserves $\tilde h$ and the complex structure on $\tilde X$, all the operations involved in the identification map $\tilde g$ commute with $\tau^*$. One can see this locally by choosing an acyclic cover $\{\tilde U_i, \tilde V_i\}$ on $\tilde X$ symmetric with respect to $\tau$ in the sense that
$$\tau(\tilde U_i) = \tilde V_i, \qquad \tau(\tilde V_i) = \tilde U_i, \qquad \tilde U_i \cap \tilde V_i = \emptyset.$$
Such a cover is possible because $\tau$ does not fix any point and preserves the complex structure on $\tilde X$. □

4.2. The real case. Now we turn our attention to the subsets of $\mathcal{H}^s$ and $\mathcal{H}^{ss}$ that correspond to $\mathcal{N}^s/\mathrm{PGL}(2,\mathbb{R})$ and $\mathcal{N}^{ss}/\mathrm{PGL}(2,\mathbb{R})$. We say a Higgs bundle $(V, \Phi)$ satisfies the reality condition or is a real Higgs bundle [9] if

1. There is a holomorphic line bundle $L$ such that $V = L \oplus (L^{-1} \otimes \det(V))$,
2. $\Phi = \Phi_1 \oplus \Phi_2$, where
$$\Phi_1 : L \longrightarrow L^{-1} \otimes \det(V) \otimes K, \qquad \Phi_2 : L^{-1} \otimes \det(V) \longrightarrow L \otimes K,$$
i.e. $\Phi$ is of the form
$$\Phi = \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix},$$
where $b$ and $c$ are holomorphic sections of the bundles $L^2 \otimes K \otimes \det(V)^{-1}$ and $L^{-2} \otimes K \otimes \det(V)$, respectively.

For such a Higgs bundle $(V, \Phi)$, the line bundle $L$ inherits a Hermitian metric $h_1$ from $h$. Let $D$ be the connection associated to $(V, \Phi)$. The metric $h_1$ defines a bundle isomorphism between $\bar L$ and $L^{-1}$. This induces an anti-holomorphic linear transformation
$$f : V \longrightarrow V \otimes \det(V)^{-1}, \qquad f(u_1, u_2) = (\bar u_2, \bar u_1).$$
In addition, $D$ commutes with $f$. Thus, $D$ is a flat connection on the projective subbundle $PE \subset P(V)$ fixed by $f$. Moreover, $(PE, D)$ is a flat PSL(2, R)-bundle and the Euler class of $PE$ equals $2\deg(L) - \deg(V)$. Let $R\mathcal{H}_e^{ss}$ be the subset of $\mathcal{H}^{ss}$ of poly-stable real Higgs bundles with Euler class $e$ and $R\mathcal{H}_e^{s}$ the subset of $R\mathcal{H}_e^{ss}$ of stable real Higgs bundles.

Theorem 4.5 (Hitchin [9]). The moduli space $R\mathcal{H}_e^{ss}$ is homeomorphic to $\mathcal{N}_e^{ss}/\mathrm{PSL}(2,\mathbb{R})$, and
$$J_2(X)\backslash R\mathcal{H}_e^{ss} \cong P\mathcal{N}_e^{ss}/\mathrm{PSL}(2,\mathbb{R}).$$
The subspaces $R\mathcal{H}_e^{s}$ and $J_2(X)\backslash R\mathcal{H}_e^{s}$ are diffeomorphic to $\mathcal{N}_e^{s}/\mathrm{PSL}(2,\mathbb{R})$ and $P\mathcal{N}_e^{s}/\mathrm{PSL}(2,\mathbb{R})$, respectively.

Corollary 4.6. Tensoring with a line bundle $L_d$ of degree $d$ gives a one-to-one correspondence between the real Higgs bundles in $\mathcal{H}_c^{ss}$ with Euler class $e$ and the real Higgs bundles in $\mathcal{H}_{c+2d}^{ss}$ with Euler class $e$.

5. Stability

This section is a study of stability criteria of real Higgs bundles.

Proposition 5.1. Suppose $V = L_1 \oplus L_2$, with $d = \deg(L_1) = \deg(L_2)$.

Then the bundles $L_1$ and $L_2$ are the only two holomorphic subbundles of $V$ of degree $d$ if and only if $L_1 \neq L_2$.

Proof. If $L_1 = L_2$, then $ts \oplus (1 - t)s$ generates a line subbundle of $V$ for all $t \in [0, 1]$, where $s$ is a meromorphic section of $L_1$ [8]. Suppose $L_1 \neq L_2$. Let $H \subset V$ be a holomorphic line bundle of degree $d$. Then $H$ corresponds to a holomorphic section $\varphi$ of the bundle
$$H^{-1} \otimes V = H^{-1} \otimes L_1 \oplus H^{-1} \otimes L_2$$
such that $\varphi$ has no zero. Thus
$$\varphi = \varphi_1 \oplus \varphi_2,$$
where $\varphi_1$ is a section of $H^{-1} \otimes L_1$ and $\varphi_2$ of $H^{-1} \otimes L_2$. Since $\varphi$ has no poles, neither do $\varphi_1$ and $\varphi_2$. However
$$\deg(H^{-1} \otimes L_1) = \deg(H^{-1} \otimes L_2) = 0,$$
so $\varphi_1$ is either identically zero or has no zero. The same is true with $\varphi_2$. If $\varphi_1 \equiv 0$, then $\varphi_2$ has no zero and $H = L_2$. If $\varphi_2 \equiv 0$, then $\varphi_1$ has no zero and $H = L_1$. If neither $\varphi_1$ nor $\varphi_2$ has any zero, then $H = L_1$ and $H = L_2$. This is the case of $L_1 = L_2$. □

Proposition 5.2. Suppose $(V, \Phi)$ is a Higgs bundle on $X$ and $\tilde V = \pi^*(V)$. In addition, suppose
$$\tilde V = \tilde L_1 \oplus \tilde L_2 \quad\text{with}\quad
\begin{cases}
\deg(\tilde L_1) = \deg(\tilde L_2) = d, \\
\tau^*(\tilde L_1) = \tilde L_2, \\
\tau^*(\tilde L_2) = \tilde L_1, \\
\tilde L_1 \neq \tilde L_2.
\end{cases}$$
Then $(V, \Phi)$ is stable.

Proof. Suppose $H \subset V$. Then $\tilde H = \pi^*(H) \subset \tilde V$. Hence, by Proposition 5.1, $\tilde H = \tilde L_1$ or $\tilde H = \tilde L_2$ or $\deg(\tilde H) < d$. On the other hand, since $\tau^*(\tilde H) = \tilde H$, it must be the case that $\deg(\tilde H) < d$. This implies $V$ is a stable holomorphic bundle. Hence $(V, \Phi)$ is stable for any $\Phi$. □

6. The Flat PGL(2, R) Structures

Suppose $(V, \Phi)$ is a Higgs bundle on $X$. Then $(V, \Phi)$ pulls back to
$$(\tilde V, \tilde\Phi) = \pi^*(V, \Phi).$$
Proposition 4.4 indicates that one needs to determine the set of stable Higgs bundles of the form $(V, \Phi)$ on $X$ such that $\pi^*(V, \Phi) \in R\tilde{\mathcal{H}}_0^{ss}$ and $\det(V)$ is $\det(V_0)$. The pull-back $\tilde V_0 = \pi^*(V_0)$ is a holomorphic bundle on $\tilde X$ and $\det(\tilde V_0)$ is a line bundle on $\tilde X$. Suppose $\deg(V_0) = 1$. Then $\deg(\tilde V_0) = 2$. By Proposition 3.2, there exists a line bundle $\tilde L_1$ and a 2-torsion line bundle $\tilde T$ such that
$$\tilde L_1^2 = \det(\tilde V_0), \qquad \tau^*(\tilde L_1) = \tilde L_1 \otimes \tilde T.$$

Definition 6.1.
$$\begin{aligned}
Q_0 &= \{(\tilde L, \tilde b) : \tilde L \in P,\ \tilde L^2 \neq 1,\ \tilde b \in H^0(\tilde X, \tilde L^2 \tilde K)\}, \\
Q_1 &= \{(\tilde L \otimes \tilde L_1, \tilde b) : \tilde L \in P,\ \tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)\}, \\
Q &= Q_0 \cup Q_1.
\end{aligned}$$
The group $J_2(\tilde X) \cap P$ acts on $Q$:
$$(J_2(\tilde X) \cap P) \times Q \longrightarrow Q, \qquad (\tilde L_0, (\tilde L, \tilde b)) \longmapsto (\tilde L_0 \otimes \tilde L, \tilde b).$$
The quotients are denoted by $PQ_0$, $PQ_1$, $PQ$, respectively.

Proposition 6.2. The spaces $PW_0^0/\mathrm{PGL}(2,\mathbb{R})$ and $PW_1^0/\mathrm{PGL}(2,\mathbb{R})$ are diffeomorphic to $PQ_0$ and $PQ_1$, respectively.

Proof. Case 1: $w_2 = 0$. Let $\sigma \in W_0^0/\mathrm{PGL}(2,\mathbb{R})$. Then $\sigma$ corresponds to an element in $\mathcal{H}_0^s$. Let $\tilde\sigma = \pi^*(\sigma)$. By Proposition 2.4 and Theorem 4.5, $\tilde\sigma$ is an SL(2, R) representation and corresponds to an element in $R\tilde{\mathcal{H}}_0^{ss}$, hence, to a poly-stable Higgs bundle $(\tilde V, \tilde\Phi)$ such that:
$$\pi^*(V) = \tilde V = \tilde L \oplus \tilde L^{-1}, \qquad
\pi^*(\Phi) = \tilde\Phi = \tilde\Phi_1 \oplus \tilde\Phi_2 = \begin{pmatrix} 0 & \tilde b \\ \tilde c & 0 \end{pmatrix},$$
with $\deg(\tilde L) = 0$.

Since $\tau$ preserves the degree of any divisor, $\tau^*$ preserves the degree of $\tilde L$. Suppose $\tilde L^2 \neq 1$. By Proposition 5.1, either
$$\begin{cases} \tau^*(\tilde L) = \tilde L \\ \tau^*(\tilde L^{-1}) = \tilde L^{-1} \end{cases}
\qquad\text{or}\qquad
\begin{cases} \tau^*(\tilde L) = \tilde L^{-1} \\ \tau^*(\tilde L^{-1}) = \tilde L. \end{cases}$$
This implies, after normalizing, the following dichotomy:

1. $\tau^*(\tilde L) = \tilde L$, $\tau^*(\tilde L^{-1}) = \tilde L^{-1}$; $\tau^*(\tilde\Phi_1) = \tilde\Phi_1$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_2$; $\tau^*(\tilde b) = \tilde b$, $\tau^*(\tilde c) = \tilde c$;
2. $\tau^*(\tilde L) = \tilde L^{-1}$, $\tau^*(\tilde L^{-1}) = \tilde L$; $\tau^*(\tilde\Phi_1) = \tilde\Phi_2$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_1$; $\tau^*(\tilde b) = \tilde c$, $\tau^*(\tilde c) = \tilde b$.

Let $\tilde E \subset \tilde V$ be the flat SL(2, R)-bundle fixed by the anti-holomorphic map $\tilde f$. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$.

With the solutions to Eq. 1, $\tau^*$ preserves orientations on $\tilde E$; therefore, the pair $(\tilde E, \tilde D)$ descends to a flat SL(2, R)-bundle $(E, D)$ ($w_1(E) = 0$) on $X$. Note $\tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde K)$ and $\tilde c \in H^0(\tilde X, \tilde L^{-2} \otimes \tilde K)$ with $\tilde L \in S$. Alternatively, $(\tilde V, \tilde\Phi)$ is a lift of a pair $(V, \Phi)$ that satisfies the reality condition. Note
$$\pi^*(V, \Phi) = \pi^*(V \otimes T_\eta, \Phi).$$
Hence the descent from $(\tilde V, \tilde\Phi)$ is not unique. However these two Higgs bundles differ by a 2-torsion; hence, the projectivized bundles are the same.

With the solutions to Eq. 2, $\tau^*$ reverses orientations on $\tilde E$. Hence the pair $(\tilde E, \tilde D)$ descends to a flat SLi(2, R)-bundle $(E, D)$ on $X$. The two equations on $\tilde L$, $\tilde L^{-1}$ are precisely the condition for $\tilde L$ to be in $P$.

The group $P$ has four connected components. Let $L_\eta \in J(X)$ such that $L_\eta^2 = T_\eta$. Let $\tilde L_\eta = \pi^*(L_\eta)$. Then
$$\tilde L_\eta^2 = \pi^*(L_\eta)^2 = \pi^*(L_\eta^2) = \pi^*(T_\eta) = 1.$$
In other words, $\tilde L_\eta$ is a 2-torsion in $S$, hence, it is also in $P$. Suppose $V$ is a rank-2 holomorphic bundle on $X$ with $\det(V) = 1$ such that $\tilde V = \pi^*(V) = \tilde L \oplus \tilde L^{-1}$, where $\tilde L \in P$. Then $\tilde L \otimes \tilde L_\eta \in P$ and
$$\pi^*(V \otimes L_\eta) = \tilde V \otimes \tilde L_\eta = (\tilde L \otimes \tilde L_\eta) \oplus (\tilde L \otimes \tilde L_\eta)^{-1}.$$
However,
$$\det(V \otimes L_\eta) = T_\eta \neq 1.$$
Therefore if $\tilde L \in P$ and
$$\tilde V = \pi^*(V) = \tilde L \oplus \tilde L^{-1},$$
then either $\det(V) = 1$ or $\det(V) = T_\eta$. Since $\det(V)$ cannot jump on connected components of $P$, it must be the case that only two components of $P$ induce vector bundles $V$ on $X$ with $\det(V) = 1$. Denote these two components $P_0^0$. The components $P \backslash P_0^0$ will induce bundles $V$ with determinant $T_\eta$. Hence only the Higgs bundles induced by $P_0^0$ correspond to SLi(2, R) representations.

Remark 6.3. The points in $Q_0$ correspond to points in the space of SL±i(2, R) representation classes.

Let
$$Q_0^0 = \{(\tilde L, \tilde b) : \tilde L \in P_0^0,\ \tilde L^2 \neq 1,\ \tilde b \in H^0(\tilde X, \tilde L^2 \tilde K)\}.$$
By Corollary 2.11, this construction describes a 2-to-1 map
$$\Pi : W_0^0/\mathrm{PGL}(2,\mathbb{R}) \longrightarrow Q_0^0.$$
Note the 2-torsions in $P_0^0$ are excluded because they correspond to the reducible representation classes $[\sigma]$ with $\sigma(a_0^2) = \pm I$, hence, are not in $W_0^0$ by definition.

To show $\Pi$ is onto, let $(\tilde L, \tilde b) \in Q_0^0$ and
$$\tilde V = \tilde L \oplus \tilde L^{-1}, \qquad \tilde\Phi = \begin{pmatrix} 0 & \tilde b \\ \tau^*(\tilde b) & 0 \end{pmatrix}.$$
The involution $\tau^*$ on $\tilde V$ preserves the subspace $\tilde E \subset \tilde V$ but reverses orientations on $\tilde E$. Let $\langle\tau^*\rangle$ be the order two group generated by $\tau^*$ and define the quotients
$$V = \tilde V/\langle\tau^*\rangle, \qquad \Phi = \tilde\Phi/\langle\tau^*\rangle, \qquad E = \tilde E/\langle\tau^*\rangle.$$
Since $\tilde L \neq \tilde L^{-1}$, by Proposition 5.2, $(V, \Phi)$ is a stable Higgs bundle over $X$. Moreover $(\tilde V, \tilde\Phi)$ and $\tilde E$ are pull-backs of $(V, \Phi)$ and $E$ by $\pi^*$, respectively. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$. Then $(\tilde E, \tilde D)$ is a flat SL(2, R)-bundle and $\tau^*$ reverses

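orientations on $\tilde E$. Hence $(E, D)$ is a flat SLi(2, R)-bundle on $X$. Hence the map $\Pi$ is onto. Note the fibre of $\Pi$ at the point $(\tilde V, \tilde\Phi)$ consists of the two points $(V, \Phi)$ and $(V \otimes T_\eta, \Phi)$.

Case 2: $w_2 = 1$. The proof is similar to the proof of Case 1. A representation class $\sigma \in W_1^0/\mathrm{PGL}(2,\mathbb{R})$ corresponds to an element in $\mathcal{H}_1^s$. Hence $\tilde\sigma = \pi^*(\sigma)$ is an SL(2, R) representation. By Theorem 4.5 and Corollary 4.6, $\tilde\sigma$ corresponds to a poly-stable Higgs bundle $(\tilde V, \tilde\Phi)$ such that:
$$\pi^*(V) = \tilde V = \tilde L_1 \otimes \tilde V_1 = \tilde L_1 \otimes \tilde L \oplus \tilde L_1 \otimes \tilde L^{-1}, \qquad
\pi^*(\Phi) = \tilde\Phi = \tilde\Phi_1 \oplus \tilde\Phi_2 = \begin{pmatrix} 0 & \tilde b \\ \tilde c & 0 \end{pmatrix},$$
with $\deg(\tilde L) = 0$. Suppose
$$\tilde L_1 \otimes \tilde L \neq \tilde L_1 \otimes \tilde L^{-1}.$$
By Proposition 5.1, either
$$\begin{cases} \tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L \\ \tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L^{-1} \end{cases}
\qquad\text{or}\qquad
\begin{cases} \tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L^{-1} \\ \tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L. \end{cases}$$
This implies, after normalizing, the following dichotomy:

1. $\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L$, $\tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L^{-1}$; $\tau^*(\tilde\Phi_1) = \tilde\Phi_1$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_2$; $\tau^*(\tilde b) = \tilde b$, $\tau^*(\tilde c) = \tilde c$;
2. $\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L^{-1}$, $\tau^*(\tilde L_1 \otimes \tilde L^{-1}) = \tilde L_1 \otimes \tilde L$; $\tau^*(\tilde\Phi_1) = \tilde\Phi_2$, $\tau^*(\tilde\Phi_2) = \tilde\Phi_1$; $\tau^*(\tilde b) = \tilde c$, $\tau^*(\tilde c) = \tilde b$.

Equation 1 has no solution. Since $\tilde V = \pi^*(V)$, the equality
$$\tau^*(\tilde L_1 \otimes \tilde L) = \tilde L_1 \otimes \tilde L$$
would imply the existence of $L_0 \subset V$ with $\tilde L_1 \otimes \tilde L = \pi^*(L_0)$. Since $\deg(\tilde L_1 \otimes \tilde L) = 1$, the degree of $L_0$ would have been $\tfrac{1}{2}$. This is not possible.

With the solutions to Eq. 2, $\tau^*$ reverses orientations on the SL(2, R)-bundle $\tilde E \subset \tilde V$. Let $\tilde D$ be the connection associated with $(\tilde V, \tilde\Phi)$. Then the pair $(\tilde E, \tilde D)$ descends to an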

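SLi(2, R)-bundle $(E, D)$ on $X$. The bundle further descends to a flat projective bundle $(P(E), D)$ on $X$.

Since $\tilde T$ is a 2-torsion in $P$, $\tau^*(\tilde T) = \tilde T^{-1} = \tilde T$. Hence $\tilde T \in S$. Since $S$ is an abelian variety, there exists $\tilde L_2 \in S$ such that $\tilde T = \tilde L_2^2$. This implies
$$\begin{cases} \tau^*(\tilde L_2) = \tilde L_2 = \tilde T \otimes \tilde L_2^{-1} \\ \tau^*(\tilde L_2^{-1}) = \tilde L_2^{-1} = \tilde T \otimes \tilde L_2. \end{cases}$$
Hence for each $\tilde L \in P$,
$$\begin{cases} \tau^*((\tilde L \otimes \tilde L_2) \otimes \tilde L_1) = (\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1 \\ \tau^*((\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1) = (\tilde L \otimes \tilde L_2) \otimes \tilde L_1. \end{cases}$$
Hence the solutions to Eq. 2 give points in $Q_1$. Note, by Proposition 3.2, $\tilde T \notin P_0$. Thus there does not exist $\tilde L \in P$ such that $\tilde L^2 \otimes \tilde T = 1$. This implies that there does not exist $\tilde L \in P$ such that
$$\tilde L \otimes \tilde L_2 \otimes \tilde L_1 = (\tilde L \otimes \tilde L_2)^{-1} \otimes \tilde L_1.$$
This gives a 2-to-1 map
$$\Pi : W_1^0/\mathrm{PGL}(2,\mathbb{R}) \longrightarrow Q_1.$$
Similar to Case 1, there are only two components of $P$ that induce vector bundles $V$ such that
$$\pi^*(V) = \tilde V = \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L) \oplus \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L)^{-1},$$
with $\det(V) = \det(V_0)$. Denote these two components $P_1^0$ and define
$$Q_1^0 = \{(\tilde L \otimes \tilde L_1, \tilde b) : \tilde L \in P_1^0,\ \tilde b \in H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)\}.$$
Let $(\tilde L, \tilde b) \in Q_1^0$ and
$$\tilde V = \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L) \oplus \tilde L_1 \otimes (\tilde L_2 \otimes \tilde L)^{-1}, \qquad \tilde\Phi = \begin{pmatrix} 0 & \tilde b \\ \tau^*(\tilde b) & 0 \end{pmatrix}.$$
The involution $\tau^*$ on $\tilde V$ preserves the subspace $\tilde E \subset \tilde V$ but reverses orientations on $\tilde E$. Let $\langle\tau^*\rangle$ be the order two group generated by $\tau^*$ and define quotient sets
$$V = \tilde V/\langle\tau^*\rangle, \qquad \Phi = \tilde\Phi/\langle\tau^*\rangle, \qquad E = \tilde E/\langle\tau^*\rangle.$$
By Proposition 5.2, $(V, \Phi)$ is a stable Higgs bundle over $X$. Moreover, $(\tilde V, \tilde\Phi)$ and $\tilde E$ are pull-backs of $(V, \Phi)$ and $E$ by $\pi^*$, respectively. Since $(\tilde E, \tilde D)$ is an SL(2, R)-bundle and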

$\tau^*$ reverses orientations on $\tilde E$, $(E, D)$ is an SLi(2, R)-bundle on $X$. The bundle $(E, D)$ descends to a flat projective bundle $(P(E), D)$ (PGL(2, R)-bundle) on $X$. Hence $\Pi$ is onto. Finally, by Corollary 2.11,
$$PQ = (J_2(\tilde X) \cap P)\backslash Q = (J_2(X) \cap W)\backslash W^0/\mathrm{PGL}(2,\mathbb{R}) = PW^0/\mathrm{PGL}(2,\mathbb{R}).$$
□

Let $Q^0 = Q_0^0 \cup Q_1^0$. Proposition 6.2 actually provides an explicit identification of $W^0/\mathrm{PGL}(2,\mathbb{R})$ with $Q^0$. This is stronger than needed to obtain Theorem 1.1. Since
$$P(\tilde V \otimes \tilde L_0) = P(\tilde V)$$
for any line bundle $\tilde L_0$, an alternative approach is to look at the equation
$$\tau^*(\tilde V, \tilde\Phi) = (\tilde V \otimes \tilde L_0, \tilde\Phi).$$
For the component $PW^0/\mathrm{PGL}(2,\mathbb{R})$, this leads to the system of equations:
$$\begin{cases}
\tau^*(\tilde L) = \tilde L^{-1} \otimes \tilde L_0, & \tau^*(\tilde L^{-1}) = \tilde L \otimes \tilde L_0, \\
\tau^*(\tilde\Phi_1) = \tilde\Phi_2, & \tau^*(\tilde\Phi_2) = \tilde\Phi_1, \\
\tau^*(\tilde b) = \tilde c, & \tau^*(\tilde c) = \tilde b.
\end{cases}$$
The solutions to this system of equations correspond to the GL(2, C) connections that project down to flat PGL(2, R) connections.

The quotient $PP$ is homeomorphic to a compact complex torus with complex dimension $g - 1$. If $w_2 = 0$, then above each $[\tilde L] \in PP \backslash \{[1]\}$ is the vector space $H^0(\tilde X, \tilde L^2 \tilde K)$. By the Riemann-Roch formula,
$$h^0(\tilde L^2 \tilde K) - h^0(\tilde L^{-2}) = 1 - (2g - 1) + \deg(\tilde L^2 \tilde K).$$
This implies
$$h^0(\tilde L^2 \tilde K) = 1 - (2g - 1) + [2(2g - 1) - 2] = 2g - 2.$$
Hence the total dimension is
$$h^0(\tilde L^2 \tilde K) + \dim(PP) = 3g - 3.$$
Note if $\tilde L$ is a 2-torsion, then $h^0(\tilde L^2 \tilde K) = 2g - 1$.

Suppose $w_2 = 1$. Again there is no $\tilde L \in P$ such that $\tilde L^2 \otimes \tilde T = 1$. This implies that above any $[\tilde L] \in PP$, the vector space $H^0(\tilde X, \tilde L^2 \otimes \tilde T \otimes \tilde K)$ is of dimension
$$h^0(\tilde L^2 \otimes \tilde T \otimes \tilde K) = 1 - (2g - 1) + [2(2g - 1) - 2] = 2g - 2.$$
Again, the total dimension is
$$h^0(\tilde L^2 \otimes \tilde T \otimes \tilde K) + \dim(PP) = 3g - 3.$$
Finally, by Corollary 2.8, $PW^0$ is open and dense in $PW$. Therefore $PW^0/\mathrm{PGL}(2,\mathbb{R})$ is open and dense in $PW/\mathrm{PGL}(2,\mathbb{R})$. This proves Theorem 1.1.

Acknowledgement. Most of this research was carried out at the University of Maryland at College Park. I thank Professor William Goldman for his encouragement and for insightful discussions over the course of this research. I thank Professors Kevin Corlette, Ron Donagi, Jonathan Poritz, Richard Schwartz and Scott Wolpert for insightful discussions. I also thank Goldman and Poritz for proof-reading previous versions. I thank the referee for detailed and helpful suggestions for improvement.
Finally, I thank IHES for hospitality and for providing an excellent research environment during the final revision of this paper.

References

1. Arbarello, E., Cornalba, M., Griffiths, P., Harris, J.: Geometry of Algebraic Curves Vol. I. Berlin–Heidelberg–New York: Springer-Verlag, 1984
2. Atiyah, M., Bott, R.: The Yang–Mills Equations Over Riemann Surfaces. Philos. Trans. Roy. Soc. London, Ser. A 308, 523–615 (1982)
3. Corlette, K.: Flat G-bundles With Canonical Metrics. J. Diff. Geom. 28, 361–382 (1988)
4. Donaldson, S.: Twisted Harmonic Maps and the Self-Duality Equations. Proc. London Math. Soc. 55, 127–131 (1987)
5. Donaldson, S., Kronheimer, S.: The Geometry of Four-Manifolds. Oxford Mathematical Monographs, 1990
6. Goldman, W.: Topological Components of Spaces of Representations. Invent. Math. 93, 557–607 (1988)
7. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: Wiley Interscience, 1978
8. Gunning, R.: Lectures on Vector Bundles Over Riemann Surfaces. Princeton, NJ: Princeton University Press, 1967
9. Hitchin, N.: The Self-Duality Equations on a Riemann Surface. Proc. London Math. Soc. 55, 59–126 (1987)
10. Milnor, J.: On the Existence of a Connection with Curvature Zero. Comment. Math. Helv. 32, 215–223 (1958)
11. Mumford, D.: Prym Varieties I. Contributions to Analysis, London–New York: Academic Press, 1974, pp. 325–350
12. Nitsure, N.: Moduli Space of Semistable Pairs on a Curve. Proc. London Math. Soc. 62, 275–300 (1991)
13. Simpson, C.: Constructing Variations of Hodge Structures Using Yang–Mills Theory and Applications to Uniformization. J. of the AMS 1, 867–918 (1988)
14. Simpson, C.: Hodge Bundles and Local Systems. Publ. Math. I.H.E.S. 75, 6–95 (1992)
15. Simpson, C.: Moduli of Representations of the Fundamental Group of a Smooth Projective Variety. I. Publ. Math. I.H.E.S. 79, 47–129 (1994)
16. Simpson, C.: Moduli of Representations of the Fundamental Group of a Smooth Projective Variety. II. Publ. Math. I.H.E.S. 80, 5–79 (1994)
17. Steenrod, N.: The Topology of Fiber Bundles. Princeton, NJ: Princeton University Press, 1951
18. Wood, J.: Bundles with Totally Disconnected Structure Group. Comment. Math. Helv. 51, 183–199 (1971)
19. Xia, E.: Components of Hom(π1, PGL(2, R)). Topology 36 No. 2, 481–499 (1997)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 551 – 572 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Yangian Realisations from Finite W-Algebras

E. Ragoucy1,*, P. Sorba2

1 Theory Division, CERN, CH-1211 Genève 23, Switzerland. E-mail: [email protected]
2 Laboratoire de Physique Théorique LAPTH**, LAPP, BP 110, F-74941 Annecy-le-Vieux Cedex, France. E-mail: [email protected]

Received: 1 April 1998 / Accepted: 28 November 1998

Abstract: We construct an algebra homomorphism between the Yangian Y(sl(n)) and the finite W-algebras W(sl(np), n.sl(p)) for any p. We show how this result can be applied to determine properties of the finite dimensional representations of such W-algebras.

Contents

1. Introduction
2. Finite W Algebras: Notation and Classification
   2.1 Classical W(G, S) algebras
   2.2 Quantum W(G, S) algebras
   2.3 Miura representations
   2.4 Example: W(sl(np), n.sl(p)) algebras
3. Yangians Y(G)
   3.1 Definition
   3.2 Evaluation representations of Y(sl(n))
4. Yangians and Classical W-Algebra
5. Yangians and Quantum W(sl(np), n.sl(p)) Algebras
6. Representations of W(sl(2n), n.sl(2)) Algebras
7. Conclusion
A. The Soldering Procedure
B. Classical W(sl(np), n.sl(p)) Algebras
   B.1 Generalities
   B.2 The generic case n ≠ 2
   B.3 The particular case of Y(sl(2))
C. Quantum W(sl(np), n.sl(p)) Algebras
D. Tensor Products of Some Finite Dimensional Representations of sl(n)

* On leave of absence from LAPTH.
** URA 14-36 du CNRS, associée à l’Université de Savoie.

1. Introduction

In the year 1985, the mathematical physics literature was enriched with two new types of symmetries: W algebras [1] and Yangians [2]. W algebras showed up in the context of two dimensional conformal field theories. Their development owes much to the fact that they are the algebras of constants of motion for Toda field theories, themselves defined as constrained WZNW models [3]. Yangians were first considered and defined in connection with some rational solutions of the quantum Yang–Baxter equation. Later, their relevance in integrable models with non-Abelian symmetry was noted [4]. Yangian symmetry has been proved for the Haldane-Shastry SU(n) quantum spin chains with inverse square exchange, as well as for the embedding of this model in the $\widehat{SU}(2)_1$ WZNW one; this last approach leads to a new classification of the states of a conformal field theory in which the fundamental quasi-particles are the spinons [5] (see also [6]). Let us also emphasize the Yangian symmetry determined in the Calogero-Sutherland-Moser models [5,7].

Coming back to W algebras, it can be shown that their zero modes provide algebras with a finite number of generators and which close polynomially. Such algebras can also be constructed by symplectic reduction of finite dimensional Lie algebras in the same way that usual (or affine) W algebras arise as reductions of affine Lie algebras: they are called finite W algebras [8] (FWA). This definition extends to any algebra which satisfies the above properties of finiteness and polynomiality [9]. Some properties of such FWA's have been developed [9–12] and in particular a large class of them can be seen as the commutant, in a generalization of the enveloping algebra $\mathcal{U}(\mathcal{G})$, of a subalgebra $\hat{\mathcal{G}}$ of a simple Lie algebra $\mathcal{G}$ [11]. This feature of FWA's has been exploited in order to get new realizations of a simple Lie algebra $\mathcal{G}$ once a differential operator realization of $\mathcal{G}$ is known. In such a framework, representations of a FWA are used for the determination of $\mathcal{G}$ representations. This method has been applied to reformulate the construction of the unitary, irreducible representations of the conformal algebra so(4, 2) and of its Poincaré subalgebra, and to compare it with the usual induced representation techniques [10]. It has also been used for building representations of observable algebras for systems of two identical particles in d = 1 and d = 2 dimensions, the algebras $\mathcal{G}$ under consideration then being symplectic ones; in each case, it has then been possible to relate the anyonic parameter to the eigenvalues of a W generator [12].

In this paper, we show that the defining relations of a Yangian are satisfied for a family of FWA's. In other words, such W algebras provide Yangian realizations. This remarkable connection between two a priori different types of symmetry deserves in our opinion to be considered more closely. Meanwhile, we will use results on the representation theory of Yangians and start to adapt them to this class of FWA's. In particular, we will show on special examples, the algebras W(sl(2n), n.sl(2)), how to get the classification of all their irreducible finite dimensional representations.

It seemed necessary to us to introduce in some detail the two main and a priori different algebraic objects needed for the purpose of this work. Hence, we propose in Sect. 2 a brief reminder on W-algebras with definitions and properties which will become useful to establish our main result. In particular, a short paragraph presents the Miura transformation. The structure of W(sl(np), n.sl(p)) algebras is also analysed.

Yangians from Finite W-Algebras

Then, the notion of Yangian $Y(\mathcal{G})$ is introduced in Sect. 3, with some basic properties of its representation theory. Such preliminaries allow us to arrive well-equipped for showing the main result of our paper, namely that there is an algebra homomorphism between the Yangian Y(sl(n)) and the finite W(sl(np), n.sl(p)) algebra (for any p). This property is proven in Sect. 4 for the classical case (i.e. W-algebras with Poisson brackets) and generalized to the quantum case (i.e. W-algebras with usual commutators) in Sect. 5. The proof requires explicit knowledge of the commutation relations among W generators. Such a result is obtained in the classical case via the soldering procedure [13]. Its extension to the quantum case requires determining sl(n) invariant tensors with well-defined symmetries. In order not to overload the paper, all these necessary intermediate results are gathered in the appendices. Finally, as an application, the representation theory of W(sl(2n), n.sl(2)) algebras is considered in Sect. 6. General remarks and a discussion about some further possible developments conclude our study.

2. Finite W Algebras: Notation and Classification

As mentioned above, the W algebras that we will be interested in can be systematically obtained by the Hamiltonian reduction technique in a way analogous to the one used for the construction and classification of affine W algebras [3,14,15]. Actually, given a simple Lie algebra $\mathcal{G}$, there is a one-to-one correspondence between the finite W algebras one can construct in $\mathcal{U}(\mathcal{G})$ and the sl(2) subalgebras in $\mathcal{G}$. We note that any sl(2) $\mathcal{G}$-subalgebra is principal in a subalgebra $\mathcal{S}$ of $\mathcal{G}$. The step generator $E_+$ in the sl(2) subalgebra which is principal in $\mathcal{S}$ is then written as a linear combination of the simple root generators of $\mathcal{S}$: $E_+ = \sum_{i=1}^{s} E_{\beta_i}$, where $\beta_i$, $i = 1, \ldots, s = \mathrm{rank}\,\mathcal{S}$, are the simple roots of $\mathcal{S}$. It can be shown that one can complete uniquely $E_+$ with two generators $E_-$ and $H$ such that $(E_\pm, H)$ is an sl(2) algebra.

It is rather usual to denote the corresponding W algebra as $\mathcal{W}(\mathcal{G}, \mathcal{S})$. It is an algebra freely generated by a finite number of generators and which has a second antisymmetric product. Depending on the nature of this second product, we will speak of a classical (the product is a Poisson bracket) or a quantum (the product is a commutator) W-algebra.

2.1. Classical $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebras. To specify the Poisson structure of the W-algebra, we start with the Poisson–Kirillov structure on $\mathcal{G}^*$. It mimics the Lie algebra structure on $\mathcal{G}$, and we will still denote by $H_i$, $E_{\pm\alpha}$ the generators in $\mathcal{G}^*$. $\mathcal{U}(\mathcal{G}^*)$ is then a Poisson–Lie enveloping algebra. We construct the classical $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra from a Hamiltonian reduction on $\mathcal{U}(\mathcal{G}^*)$, the constraints being given by the sl(2) embedding as follows. The Cartan generator $H$ of the sl(2) subalgebra under consideration provides a gradation of $\mathcal{G}$:
$$\mathcal{G} = \bigoplus_{i=-N}^{N} \mathcal{G}_i \quad\text{with}\quad [H, X] = i\,X, \ \forall\, X \in \mathcal{G}_i. \tag{2.1}$$
The root system of $\mathcal{G}$ is also graded: $\Delta = \oplus_i \Delta_i$. We have
$$H \in \mathcal{G}_0, \quad E_\pm \in \mathcal{G}_{\pm 1} \quad\text{and}\quad E_- = \sum_{\alpha \in \Delta_{-1}} \chi_\alpha E_\alpha, \quad \chi_\alpha \in \mathbb{C}. \tag{2.2}$$
Then, the first class constraints are
$$\Phi_\alpha = \Phi(E_\alpha) = E_\alpha = 0 \ \text{ if } E_\alpha \in \mathcal{G}_{<-1} \quad\text{and}\quad \Phi_\alpha = \Phi(E_\alpha) = E_\alpha - \chi_\alpha = 0 \ \text{ if } E_\alpha \in \mathcal{G}_{-1}, \tag{2.3}$$

the second class constraints being given by
$$\Phi(X) = X = 0 \ \text{ if } X \in \mathcal{G}_{\geq 0} \ \text{ and } \ \{E_+, X\} \neq 0. \tag{2.4}$$
Note that considering $\mathcal{G}^*$ as a module of the above mentioned sl(2), the generators of $\mathcal{W}(\mathcal{G}, \mathcal{S})$ are in one-to-one correspondence with the highest weights of this sl(2). The Poisson bracket structure of the $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra is then given by the Dirac brackets associated to the constraints. More explicitly, one labels the constraints $\Phi_i$, $i = 1, \ldots, I_0$, and introduces the matrix $C_{ij} \sim \{\Phi_i, \Phi_j\}$, where the symbol $\sim$ means that one has to apply the constraints after calculation of the Poisson–Kirillov brackets. Then, the Dirac brackets are given by
$$\{A, B\}_* \sim \{A, B\} - \sum_{i,j=1}^{I_0} \{A, \Phi_i\}\, C^{ij}\, \{\Phi_j, B\} \quad\text{with}\quad C^{ij} = (C^{-1})_{ij}. \tag{2.5}$$
The Dirac brackets have the remarkable property that any generator in $\mathcal{U}(\mathcal{G}^*)$ has vanishing Dirac brackets with any constraint, so that the $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra can be seen as the quotient of $\mathcal{U}(\mathcal{G}^*)$ provided with the Dirac bracket by the ideal generated by the constraints.

2.2. Quantum $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebras. These algebras are a quantization of the classical $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebras, where the Poisson structure has been replaced by a commutator. As we have an algebra on which one has imposed some constraints, there are two ways to quantize it: either one first quantizes $\mathcal{U}(\mathcal{G}^*)$ and then imposes the constraints at the quantum level, or one quantizes directly the W-algebra.

In the first case, one can just look at $\mathcal{U}(\mathcal{G})$ as the quantization of $\mathcal{U}(\mathcal{G}^*)$ and then use a BRS formalism. This has been developed in [16], where the BRS operator is constructed, and its cohomology computed. The quantum version of the $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra is then the zeroth cohomological space, which inherits an algebraic structure from $\mathcal{U}(\mathcal{G})$. It is possible to explicitly construct a representative of each cohomology class, using the highest weights in $\mathcal{G}$ of the sl(2) under consideration. Note that for super W-algebras, the same treatment can be applied, the factorisation of spin $\frac{1}{2}$ fields (and other gauging properties) being replaced by a filtration of the cohomological spaces in that formalism [17].

Here, we will look directly at the quantization of the W-algebra. In that approach, we start with the classical $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra and ask for an algebra which has a non-commutative product law, the associated commutator admitting as a limit the Poisson bracket (PB). This more pedestrian and less powerful approach will be sufficient for our purpose. In fact, we will only need to know that the quantum $\mathcal{W}(\mathcal{G}, \mathcal{S})$ algebra has the same number of generators as the classical one, as well as the same leading term in the commutator. By leading term, we mean that in each commutator, the term of highest “conformal spin” is the same as the right-hand side of the corresponding PB. In the following, we will choose this term to be a symmetrized product (see below).

2.3. Miura representations. Associated to the gradation of $\mathcal{G}$, there exists a representation of the $\mathcal{W}(\mathcal{G}, \mathcal{S})$-algebra, called the Miura map. It is an algebra morphism from $\mathcal{U}(\mathcal{G}_0^*)$ to $\mathcal{W}(\mathcal{G}, \mathcal{S})$. Note that one can show that $\mathcal{U}(\mathcal{G}_0^*)$ has the same dimension as $\mathcal{W}(\mathcal{G}, \mathcal{S})$, but the Miura map is neither an algebra isomorphism, nor a vector space isomorphism.

It is based on a restriction of the Hamiltonian reduction that leads from $\mathcal{G}$ to $\mathcal{W}(\mathcal{G}, \mathcal{S})$. More explicitly, for classical finite W-algebras, we start with the ($\mathcal{G}^*$-valued) matrix
$$J_0 = e_- + j^k t_k \quad\text{with}\quad [h, t_k] = 0, \tag{2.6}$$
where $e_-$ and $t_k$ are in the fundamental representation of $\mathcal{G}$ ($\{t_k\}$ is a basis of $\mathcal{G}_0$) and $j^k \in \mathcal{G}_0^*$. Then, we consider the transformations
$$J_0 \to J^g = g\, J_0\, g^{-1} \quad\text{with}\quad g \in G_+, \quad \mathrm{Lie}(G_+) = \mathcal{G}_+. \tag{2.7}$$
It can be shown that there exists a unique element $g$ such that
$$J_0 \to J^g = J_{hw} = e_- + W^k m_k \quad\text{with}\quad [e_+, m_k] = 0, \tag{2.8}$$
where $m_k$ (in the fundamental representation of $\mathcal{G}$) are the highest weights of $(h, e_\pm)$, the sl(2) algebra under consideration. $\{W^k\}$ generate the classical W-algebra and are determined by the Miura map (2.8). They are expressed in terms of the $j^k$ generators, and thus this map allows one to construct the $\mathcal{W}(\mathcal{G}, \mathcal{S})$-algebra generators as polynomials in the $\mathcal{G}_0$ ones (see e.g. [3] for details).

At the quantum level, one can still proceed in two ways: either work on cohomological space (the Miura map in that case corresponds to a restriction to the zero grade of the general cohomological construction), or directly quantize the classical Miura construction. Using the Miura construction then leads to (finite dimensional) representations of $\mathcal{W}(\mathcal{G}, \mathcal{S})$ associated to (finite dimensional) representations of $\mathcal{G}_0$. Note however that the irreducibility of these $\mathcal{W}(\mathcal{G}, \mathcal{S})$-representations is a priori not known, even if one starts from an irreducible representation of $\mathcal{G}_0$ (see [18] for a counter example).

2.4. Example: W(sl(np), n.sl(p)) algebras. As an example, let us first consider the W(sl(4), sl(2) ⊕ sl(2)) algebra. It is made of seven generators $J_i$, $S_i$ ($i = 1, 2, 3$) and a central element $C_2$ such that:
$$\begin{aligned}
{}[J_i, J_j] &= i\,\epsilon_{ijk}\, J_k, \qquad i, j, k = 1, 2, 3, \\
[J_i, S_j] &= i\,\epsilon_{ijk}\, S_k, \\
[S_i, S_j] &= i\,\epsilon_{ijk}\, J_k\, (C_2 - 2\vec{J}^{\,2}), \\
[C_2, J_i] &= [C_2, S_i] = 0, \qquad\text{with } \vec{J}^{\,2} = J_1^2 + J_2^2 + J_3^2.
\end{aligned} \tag{2.9}$$
We recognize the sl(2) subalgebra generated by the $J_i$'s as well as a vector representation (i.e. $S_i$ generators) of this sl(2) algebra. We note that the $S_i$'s close polynomially on the other generators.

The same type of structure can be remarked, at a higher level, for the class of algebras W(sl(np), n.sl(p)), where n.sl(p) stands for the regular embedding sl(p) ⊕ ··· ⊕ sl(p) (n times). We recall that an algebra $\mathcal{S}$ is said to be regular in $\mathcal{G}$, itself generated by the Cartan generators $H_i$, $i = 1, \ldots, \mathrm{rank}(\mathcal{G})$, and the root generators $E_\alpha$, $\alpha \in \Delta$, if $\mathcal{S}$ is, up to a conjugation, generated by a subset of the above Cartan part, as well as a subset $\{E_\alpha\}$, $\alpha \in \delta \subset \Delta$, of the $\mathcal{G}$-root generator set. In order to determine the number and the “conformal spin” of the W generators, one has first to decompose the adjoint representation of $\mathcal{G}$ under the sl(2) which is principal in $\mathcal{S}$ = n.sl(p): $\mathcal{G} = \oplus_j D_j$ with $D_j$ the $(2j + 1)$-dimensional irreducible representation of sl(2). Then to each $D_j$ will correspond one W generator of “conformal spin” $(j + 1)$ if we keep in mind that such


a generator can be seen as the zero mode of a primary field in a (Toda) conformal field theory. Actually, it is known that the G adjoint representation can be seen as arising from the direct product of the G fundamental representation by itself. Since this latter representation reduces with respect to the S principal sl(2) as $n D_{\frac{p-1}{2}}$, it leads to$^1$:

\[
n D_{\frac{p-1}{2}} \times n D_{\frac{p-1}{2}} \longrightarrow n^2 \left( D_{p-1} + D_{p-2} + \cdots + D_1 \right) + (n^2 - 1)\, D_0 .
\tag{2.10}
\]
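As a quick consistency check on (2.10), the following Python sketch counts dimensions on the right-hand side and compares them with dim sl(np) = (np)^2 − 1. It is only an illustration: the function names are ours, not the paper's, and the spin bookkeeping simply restates the assignment "each $D_k$ gives one generator of conformal spin k + 1".

```python
def adjoint_dimension_from_sl2_content(n: int, p: int) -> int:
    """Sum the dimensions on the right-hand side of (2.10).

    n^2 copies of D_k (dimension 2k + 1) for k = 1, ..., p - 1,
    plus (n^2 - 1) copies of the singlet D_0.
    """
    non_singlets = n * n * sum(2 * k + 1 for k in range(1, p))
    singlets = n * n - 1
    return non_singlets + singlets


def conformal_spin_content(n: int, p: int) -> dict:
    """Map each conformal spin (k + 1) to the number of W generators of that spin."""
    content = {1: n * n - 1}      # spin 1: the sl(n) generators coming from D_0
    for k in range(1, p):
        content[k + 1] = n * n    # n^2 - 1 adjoint elements plus one central element
    return content


if __name__ == "__main__":
    for n, p in [(2, 2), (2, 3), (3, 4)]:
        assert adjoint_dimension_from_sl2_content(n, p) == (n * p) ** 2 - 1
    print(conformal_spin_content(2, 2))
```

For n = p = 2 the count gives three spin-1 and four spin-2 generators, i.e. the seven generators of W(sl(4), sl(2) ⊕ sl(2)) listed in (2.9).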

A more careful study allows one to recognize in the $(n^2 - 1) D_0$ the generators of an sl(n) algebra, and to associate to each set of $(n^2 - 1) D_k$, k = 1, 2, . . . , p − 1, an irreducible (adjoint) representation under this sl(n) algebra, the corresponding elements being of “conformal spin” (k + 1). To each remaining $D_k$ will finally be associated a spin (k + 1) element which commutes with all the W generators: these central elements will be denoted $C_2, C_3, \ldots, C_p$. They can be identified with the first p Casimir operators of the sl(np) algebra. Using a notation which will become clear in the next sections, we call $W_0^a$, $a = 1, \ldots, n^2 - 1$, the sl(n) generators, and $W_k^a$, k = 1, 2, . . . , p − 1, the W generators of respective spin (k + 1). We can gather the above assertions in the following commutation relations:

\[
\begin{aligned}
{}[W_0^a, W_0^b] &= f^{ab}{}_c\, W_0^c, & a &= 1, 2, \ldots, n^2 - 1,\\
[W_0^a, W_k^b] &= f^{ab}{}_c\, W_k^c, & k &= 1, 2, \ldots, p - 1,\\
[C_i, W_0^a] &= [C_i, W_k^a] = 0, & i &= 2, 3, \ldots, p.
\end{aligned}
\tag{2.11}
\]

The remaining commutator takes the form

\[
[W_k^a, W_\ell^b] = P_{k\ell}^{ab}(W),
\tag{2.12}
\]

where $P_{k\ell}^{ab}$ is a polynomial in the W-generators, which is of “conformal spin” (k + ℓ − 1). It is determined using the techniques described above. The $G_0$ subalgebra associated to W(sl(np), n.sl(p)) is p.sl(n) ⊕ (p − 1).gl(1), which one can denote s(p.gl(n)), i.e. traceless matrices of gl(n) ⊕ gl(n) ⊕ · · · ⊕ gl(n) (p times). Thus, the Miura map for this kind of W-algebras leads to a realisation in terms of generators of the enveloping algebra of s(p.gl(n)).

1 For details about the conformal spin contents of W-algebras computed using sl(2) representations see [15].

3. Yangians Y(G)

3.1. Definition. We briefly recall some definitions about Yangians, most of them being gathered in [19]. Yangians are one of the two well-known families of infinite dimensional quantum groups (the other one being quantum affine algebras) that correspond to a deformation of the universal enveloping algebra of some finite-dimensional Lie algebra, called G. As such, the Yangian Y(G) is a Hopf algebra, topologically generated by elements $Q_0^a$ and $Q_1^a$, a = 1, . . . , dim G, which satisfy the following defining relations:

\[
Q_0^a \text{ generate } G: \quad [Q_0^a, Q_0^b] = f^{ab}{}_c\, Q_0^c,
\tag{3.1}
\]
\[
Q_1^a \text{ form an adjoint rep. of } G: \quad [Q_0^a, Q_1^b] = f^{ab}{}_c\, Q_1^c,
\tag{3.2}
\]
\[
\begin{aligned}
{}[Q_1^a, [Q_0^b, Q_1^c]] &+ [Q_1^b, [Q_0^c, Q_1^a]] + [Q_1^c, [Q_0^a, Q_1^b]]\\
&= f^{a}{}_{pd}\, f^{b}{}_{qx}\, f^{c}{}_{ry}\, f^{xy}{}_{e}\, \eta^{de}\, s_3(Q_0^p, Q_0^q, Q_0^r),
\end{aligned}
\tag{3.3}
\]
\[
\begin{aligned}
{}[[Q_1^a, Q_1^b], [Q_0^c, Q_1^d]] &+ [[Q_1^c, Q_1^d], [Q_0^a, Q_1^b]]\\
&= \left( f^{a}{}_{pe}\, f^{b}{}_{qx}\, f^{cd}{}_{y}\, f^{y}{}_{rz}\, f^{xz}{}_{g} + f^{c}{}_{pe}\, f^{d}{}_{qx}\, f^{ab}{}_{y}\, f^{y}{}_{rz}\, f^{xz}{}_{g} \right) \eta^{eg}\, s_3(Q_0^p, Q_0^q, Q_1^r),
\end{aligned}
\tag{3.4}
\]

where $f^{ab}{}_c$ are the totally antisymmetric structure constants of G, $\eta^{ab}$ is the Killing form, and $s_n(\cdot, \cdot, \ldots, \cdot)$ is the totally symmetrized product of n terms. The generators $Q_n^a$ for n > 1 are defined recursively through

\[
f^{a}{}_{bc}\, [Q_1^b, Q_{n-1}^c] = c_v\, Q_n^a \qquad \text{with } c_v\, \eta^{ab} = f^{a}{}_{cd}\, f^{bcd}.
\tag{3.5}
\]

It can be shown that for G = sl(2), (3.3) is a consequence of the other relations, while for G ≠ sl(2), (3.4) follows from (3.1–3.3). The coproduct on Y(G) is given by

\[
\begin{aligned}
\Delta(Q_0^a) &= 1 \otimes Q_0^a + Q_0^a \otimes 1,\\
\Delta(Q_1^a) &= 1 \otimes Q_1^a + Q_1^a \otimes 1 + \tfrac{1}{2}\, f^{a}{}_{bc}\, Q_0^b \otimes Q_0^c .
\end{aligned}
\tag{3.6}
\]

In the following, we will focus on the Yangians Y(sl(n)).

3.2. Evaluation representations of Y(sl(n)). When G = sl(n), there is a special class of finite dimensional irreducible representations called evaluation representations. They are defined from the algebra homomorphisms

\[
ev_A^{\pm}: \quad
\begin{cases}
Y(sl(n)) \to U(sl(n))\\
Q_0^a \mapsto t^a\\
Q_1^a \mapsto A\, t^a \pm d^{a}{}_{bc}\, t^b t^c
\end{cases}
\qquad \text{with } A \in \mathbb{C},
\tag{3.7}
\]

where the $t^a$'s form an sl(n) basis, and $d^{a}{}_{bc}$ is the totally symmetric invariant tensor of sl(n) (we set $d^{a}{}_{bc} = 0$ when n = 2). It can be shown that $ev_A^{+}$ and $ev_A^{-}$ are isomorphic (and indeed $ev_A^{+} = ev_A^{-} = ev_A$ when G = sl(2)). An evaluation representation of Y(G) is defined by the pull-back of a G-representation (with the help of the evaluation homomorphism $ev_A^{\pm}$). The corresponding representation space will be denoted generically by $V_A^{\pm}(\pi)$, where π is a representation of sl(n). We select hereafter two properties [19] which will be used in Sect. 6.

Theorem 1. Any finite-dimensional irreducible Y(sl(n)) module is isomorphic to a subquotient of a tensor product of evaluation representations.

Theorem 2. When G = sl(2), let $V_A(j)$ be the (2j + 1)-dimensional irreducible representation space of $ev_A$ ($j \in \frac{1}{2}\mathbb{Z}$). Then, $V_A(j) \otimes V_B(k)$ is reducible if and only if A − B = ±(j + k − m + 1) for some 0 < m ≤ min(2j, 2k). In that case, $V_A(j) \otimes V_B(k)$ is not completely reducible, and not isomorphic to $V_B(k) \otimes V_A(j)$; otherwise, $V_A(j) \otimes V_B(k)$ is irreducible and isomorphic to $V_B(k) \otimes V_A(j)$.
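The reducibility criterion of Theorem 2 is a finite test on (A, B, j, k), so it can be written down directly. The sketch below is only an illustration: the function name is ours, and we restrict A and B to rational values (an assumption made purely to get exact arithmetic; the theorem allows any complex A, B).

```python
from fractions import Fraction


def is_reducible(A, B, j, k) -> bool:
    """Criterion of Theorem 2 for the sl(2) evaluation modules V_A(j) and V_B(k).

    The tensor product is reducible iff A - B = +/-(j + k - m + 1)
    for some integer m with 0 < m <= min(2j, 2k).
    ASSUMPTION: A, B are rational, so that the comparison is exact.
    """
    A, B, j, k = (Fraction(x) for x in (A, B, j, k))
    m_max = int(min(2 * j, 2 * k))
    # j + k - m + 1 >= |j - k| + 1 > 0, so comparing with |A - B| covers both signs.
    return any(abs(A - B) == j + k - m + 1 for m in range(1, m_max + 1))


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(is_reducible(1, 0, half, half))   # spin 1/2 x spin 1/2 with A - B = 1
    print(is_reducible(2, 0, half, half))   # same spins with A - B = 2
```

For the modules $V_A(j) \otimes V_{-A}(k)$ that appear in Sect. 6, the same test is called as `is_reducible(A, -A, j, k)`, which reproduces the condition on 2A quoted there.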


4. Yangians and Classical W-Algebra

In this section, we want to show that there is an algebra morphism between the Yangian Y(sl(n)) and the classical W(sl(np), n.sl(p)) algebras (∀ p). For this purpose, we need to compute some of the Poisson brackets (PBs) of the W-algebra. This is done using the soldering procedure [13], the calculation being quite tricky (see the appendices). For the generic case of W(sl(np), n.sl(p)) algebras, the result is

\[
\{W_1^a, W_1^b\} = \frac{1}{5}\, f^{ab}{}_c\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_{v}\, f^{b}{}_{cu} - d^{bu}{}_{v}\, f^{a}{}_{cu} \right) d^{v}{}_{de}\, W_0^c W_0^d W_0^e ,
\tag{4.1}
\]

where the indices run from 0 to $n^2 - 1$, with the notation $W_0^0 = 0$ and $W_1^0 = C_2$ and the normalisation:

\[
\{W_0^a, W_k^b\} = \frac{1}{p}\, f^{ab}{}_c\, W_k^c, \qquad k = 0, 1, 2, \ldots
\tag{4.2}
\]

When p = 2, we have the constraint $W_2^a = 5\, d^{a}{}_{bc}\, W_0^b W_1^c$. Let us stress that the tensors $f^{ab}{}_c$ and $d^{ab}{}_c$ are gl(n) tensors, not sl(n) ones: see Appendix B for clarification. For the case of W(sl(2p), 2.sl(p)), the relations simplify to:

\[
\{W_1^a, W_1^b\} = f^{ab}{}_c \left[ \frac{1}{5}\, W_2^c - \frac{p^3}{2}\, \vec{W}_0^{\,2}\, W_0^c \right],
\tag{4.3}
\]
\[
\begin{aligned}
\{W_1^a, W_2^b\} = f^{ab}{}_c \Big[ & \frac{3}{14}\, W_3^c + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, W_1^c W_1^0 + \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)}\, W_1^c\, \vec{W}_0^{\,2}\\
& + 3\, W_2^0 W_0^c - 30\, W_0^c\, (\vec{W}_1 \cdot \vec{W}_0) \Big],
\end{aligned}
\tag{4.4}
\]

together with the constraints $W_2^a = 10\, [\, W_0^a W_1^0 + \delta_0^a\, (\vec{W}_1 \cdot \vec{W}_0)\, ]$ for p = 2, and $W_3^a = 0$ for p = 2 or 3. In this basis, the map is

\[
\rho_p: \quad
\begin{cases}
Y(sl(n)) \to W(sl(np), n.sl(p))\\
Q_k^a \mapsto \beta_k\, W_k^a & \text{for } k = 0, 1, \ldots, p,\\
Q_{p+l}^a \mapsto P_{p+l}^a(W_0, W_1, \ldots, W_p) & \text{for } l > 0,
\end{cases}
\tag{4.5}
\]

where $P_l^a$ are some homogeneous polynomials which preserve the “conformal spin” of $W_k^a$. A careful computation shows (see Appendix B), using the PBs (4.1–4.4), that the generators $W_k^a$ obey the relations (3.1–3.4), the commutators being replaced by PBs. As Y(sl(n)) is topologically generated by $Q_0^a$ and $Q_1^a$, it is sufficient to give $\beta_0$ and $\beta_1$. Indeed, once (4.5) is satisfied for k = 0 and 1, the relation (3.5) together with the PBs of the W-algebra ensures that (4.5) can be iteratively constructed for all k. We show in Appendix B that in our basis this relation is indeed satisfied for

\[
\beta_0 = p \quad \text{and} \quad \beta_1 = 2.
\tag{4.6}
\]

We can thus conclude:

Proposition 1. The classical algebra W(sl(np), n.sl(p)) provides a representation of the Yangian Y(sl(n)), the map being given by $\rho_p$ defined in (4.5) and (4.6).


5. Yangians and Quantum W(sl(np), n.sl(p)) Algebras

We can use the above study to deduce the same result for quantum W(sl(np), n.sl(p)) algebras. In fact, as these algebras are a quantisation of the classical ones, we can deduce that the most general form of the commutator is

\[
[W_0^a, W_k^b] = \frac{1}{p}\, f^{ab}{}_c\, W_k^c,
\tag{5.1}
\]
\[
\begin{aligned}
{}[W_1^a, W_1^b] = {} & \frac{1}{5}\, f^{ab}{}_c\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_{v}\, f^{b}{}_{cu} - d^{bu}{}_{v}\, f^{a}{}_{cu} \right) d^{v}{}_{de}\, s_3(W_0^c, W_0^d, W_0^e)\\
& + t^{ab}_{c}\, W_1^c + t^{ab}_{cd}\, s_2(W_0^c, W_0^d) + \tilde{t}^{ab}_{c}\, W_0^c
\end{aligned}
\tag{5.2}
\]

for sl(n), and in the special case of sl(2),

\[
\begin{aligned}
{}[W_1^a, W_2^b] = {} & f^{ab}{}_c \Big[ \frac{3}{14}\, W_3^c + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c)\\
& \qquad + \Big( \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)}\, \eta_{dg}\, \eta^{c}_{e} - 30\, \eta_{de}\, \eta^{c}_{g} \Big)\, s_3(W_0^d, W_0^g, W_1^e) \Big]\\
& + g^{ab}_{c}\, W_2^c + \hat{g}^{ab}_{cd}\, W_1^c W_0^d + g^{ab}_{cde}\, s_3(W_0^c, W_0^d, W_0^e)\\
& + \hat{g}^{ab}_{c}\, W_1^c + g^{ab}_{cd}\, s_2(W_0^c, W_0^d) + \tilde{g}^{ab}_{c}\, W_0^c
\end{aligned}
\tag{5.3}
\]

for some tensors $t^{ab}_{a_1 a_2 \cdots a_k}$ and $g^{ab}_{a_1 a_2 \cdots a_k}$. By construction, the t-tensors are symmetric in the lower indices and antisymmetric in the upper ones. The g-tensors are only symmetric in the lower indices, except for $\hat{g}^{ab}_{cd}$ which has no symmetry property. Moreover, the Jacobi identities with $W_0^a$ show that they are invariant tensors. Hence, we are looking for objects which belong to the trivial representation in $\Lambda_2(G) \otimes S_k(G)$, $S_2(G) \otimes S_k(G)$, or $\Lambda_2(G) \otimes \Lambda_2(G)$, where $S_k(G)$ is the totally symmetric product G ⊗ G ⊗ · · · ⊗ G (k times), while $\Lambda_2(G)$ is the antisymmetric product G ⊗ G. Computing the decomposition of these tensor products shows that the multiplicity $M_0$ of the trivial representation in these products is (for G = sl(n)):

\[
\begin{aligned}
M_0[S_2(G) \otimes G] &= \begin{cases} 1 & \text{if } n \neq 2\\ 0 & \text{for } sl(2) \end{cases}, &
M_0[\Lambda_2(G) \otimes G] &= 1,\\
M_0[\Lambda_2(G) \otimes S_2(G)] &= \begin{cases} 1 & \text{if } n \neq 2\\ 0 & \text{for } sl(2) \end{cases}, &
M_0[\Lambda_2(G) \otimes \Lambda_2(G)] &= \begin{cases} 3 & \text{if } n \neq 2\\ 1 & \text{for } sl(2) \end{cases},\\
M_0[\Lambda_2(G) \otimes S_3(G)] &= \begin{cases} 4 & \text{if } n \neq 2, 3\\ 3 & \text{for } sl(3)\\ 1 & \text{for } sl(2) \end{cases}.
\end{aligned}
\tag{5.4}
\]

Now, it is easy to show that the following tensors indeed belong to these spaces$^2$:

\[
\begin{aligned}
\Lambda_2(G) \otimes G &: \quad t^{ab}_{c} = f^{ab}{}_{c},\\
S_2(G) \otimes G &: \quad t^{ab}_{c} = d^{ab}{}_{c},\\
\Lambda_2(G) \otimes S_2(G) &: \quad t^{ab}_{cd} = f^{ab}{}_{e}\, d^{e}{}_{cd}.
\end{aligned}
\tag{5.5}
\]

2 More general formulae are given in Appendix D.


As they are evidently independent and give the correct multiplicities (with the convention that the d-tensor is null for sl(2)), we deduce that the most general form one gets is:

\[
\begin{aligned}
{}[W_1^a, W_1^b] = {} & - \frac{p^3}{16} \left( d^{au}{}_{v}\, f^{b}{}_{cu} - d^{bu}{}_{v}\, f^{a}{}_{cu} \right) d^{v}{}_{de}\, s_3(W_0^c, W_0^d, W_0^e)\\
& + f^{ab}{}_c \left[ \frac{1}{5}\, W_2^c + \mu_1\, W_1^c + \mu_2\, d^{c}{}_{de}\, s_2(W_0^d, W_0^e) + \mu_3\, W_0^c \right]
\end{aligned}
\tag{5.6}
\]

for the algebra W(sl(np), n.sl(p)). This commutator is the only one needed to prove that the algebra satisfies the defining relations of the Yangian when n ≠ 2. For the algebra W(sl(2p), 2.sl(p)), we also need the relation:

\[
\begin{aligned}
{}[W_1^a, W_2^b] = f^{ab}{}_c \Big[ & \frac{3}{14}\, W_3^c + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, s_2(W_1^c, W_1^0) + 3\, s_2(W_2^0, W_0^c)\\
& + \Big( \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)}\, \eta_{dg}\, \eta^{c}_{e} - 30\, \eta_{de}\, \eta^{c}_{g} \Big)\, s_3(W_0^d, W_0^g, W_1^e)\\
& + \nu_1\, W_2^c + \nu_1'\, W_1^c + \nu_1''\, W_0^c + \nu_2\, f^{c}{}_{de}\, W_1^d W_0^e + \nu_3\, \eta_{de}\, s_3(W_0^c, W_0^d, W_0^e) \Big].
\end{aligned}
\tag{5.7}
\]

Then, one can show that the commutators (5.6–5.7) obey the defining relations of the Yangians Y(sl(n)) for the same normalisations as in the classical case. This is done in Appendices C and D.

Proposition 2. The quantum algebra W(sl(np), n.sl(p)) provides a representation of the Yangian Y(sl(n)), the map being given by $\rho_p$ defined in (4.5) and (4.6).

At this point, let us note that the Yangian structure of the algebra W(sl(4), 2.sl(2)) has already been noticed [9,20] and used for quantum mechanics applications [20].

6. Representations of W(sl(2n), n.sl(2)) Algebras

Owing to the above identification, it is possible to adapt some known properties of Yangian representation theory to finite W representations. We first illustrate this assertion in the case of W(sl(4), 2.sl(2)).

Proposition 3. Any finite dimensional irreducible representation of the algebra W(sl(4), 2.sl(2)) is either an evaluation module $V_A(j)$ or the tensor product of two evaluation modules $V_A(j) \otimes V_{-A}(k)$. Conversely, $V_A(j)$ for any A, and $V_A(j) \otimes V_{-A}(k)$ (A ≠ 0) with 2A ≠ ±(j + k − m + 1) for any m such that 0 < m ≤ min(2j, 2k), are finite dimensional irreducible representations of the algebra W(sl(4), 2.sl(2)). The tensor product is calculated via the Yangian coproduct defined in (3.6).

The proof is done by direct calculation, using Theorem 2 of Sect. 3.2. As an (irreducible) representation of the W(sl(4), 2.sl(2)) algebra must be an (irreducible) representation of the Yangian Y(sl(2)), we deduce that the (finite dimensional) irreducible representations of W(sl(4), 2.sl(2)) are in the set of evaluation modules $V_A(j)$ or $V_A(j) \otimes V_B(k) \otimes \cdots \otimes V_C(\ell)$.


For $V_A(j)$, it is obvious that we have an irreducible representation, where the value of the W Casimir operator $C_2$ is related to A:

\[
C_2(A, j) = \left( 2j(j + 1) + A^2 \right) I.
\tag{6.1}
\]

For a product $V_A(j) \otimes V_B(k)$, calculations show that we must have A + B = 0 and

\[
C_2(A, j; B, k) = \left( 2j(j + 1) + 2k(k + 1) + \tfrac{1}{2}(A^2 + B^2) \right) I \otimes I
\tag{6.2}
\]

in order to get a representation of the W-algebra. The irreducibility is fixed by the first part of Theorem 2, Sect. 3.2, i.e. when there is no m ∈ ] 0, min(2j, 2k) ] such that 2A = ±(j + k − m + 1).

It is the second part of the theorem which ensures that the above construction exhausts the set of irreducible finite dimensional representations of W(sl(4), 2.sl(2)). Indeed, in the product $V_A(j) \otimes V_B(k) \otimes V_C(\ell)$, we already know that we must have B = −A and C = −B for $V_A(j) \otimes V_B(k)$ and $V_B(k) \otimes V_C(\ell)$ to be representations of the W-algebra. Then, the irreducibility of $V_{-A}(k) \otimes V_A(\ell)$ implies that this last representation is isomorphic to $V_A(\ell) \otimes V_{-A}(k)$. Therefore, $V_A(j) \otimes V_{-A}(k) \otimes V_A(\ell)$ is isomorphic to $V_A(j) \otimes V_A(\ell) \otimes V_{-A}(k)$. But the product $V_A(j) \otimes V_A(\ell)$ is not a representation of the W-algebra$^3$, so that the triple product is not one either.

Note that we get the surprising result that the tensor product of two representations of the W(sl(4), 2.sl(2)) algebra ($V_A(j)$ and $V_B(k)$) is not always a representation of this algebra. In some sense, this result can be interpreted as a no-go theorem for the existence of a coproduct for W-algebras.

Let us also remark that the above representations are those obtained through the Miura map (see Sects. 2.3 and 2.4), so that we have proved that the Miura map gives all the irreducible finite dimensional representations of this W-algebra. Moreover, as the $G_0$ algebra we have to consider is just sl(2) ⊕ sl(2) ⊕ gl(1) = s(2.gl(2)), the condition A + B = 0 in the tensor product $V_A(j) \otimes V_B(k)$ can just be interpreted as the traceless condition on s(2.gl(2)). Indeed, a representation of 2.gl(2) is given by a representation space $D_j \otimes D_k$ of 2.sl(2), together with the values A and B of the two gl(1) generators, while for s(2.gl(2)), one has to impose A + B = 0.

In fact, we are able to prove a more general result:

Proposition 4. Any finite dimensional irreducible representation of the algebra W(sl(2n), n.sl(2)) must be either an evaluation module $V_A^{\pm}(\pi)$ or the tensor product of two evaluation modules $V_A^{\pm}(\pi) \otimes V_{-A}^{\mp}(\pi')$, where π and π′ are irreducible finite dimensional representations of sl(n), the tensor product being calculated via the Yangian coproduct defined in (3.6). All these representations can be obtained from the Miura map: s(2.gl(n)) ≡ 2.sl(n) ⊕ gl(1) → W(sl(2n), n.sl(2)).

Note that these algebras are just the ones used in [11] to construct the finite W-algebras as commutants in U(G). It seems rather natural to conjecture that this situation will remain valid in the general case of W(sl(np), n.sl(p)) algebras [21].

3 In fact, for A = 0, the tensor product indeed provides a representation of the W-algebra (it is just a representation of sl(2)). However, in that case, the tensor product is not irreducible.


7. Conclusion

A rather surprising connection between Yangians and finite W-algebras has been developed in this paper. We have proved directly that finite W-algebras of the type W(sl(np), n.sl(p)) satisfy the defining relations of the Yangian Y(sl(n)). In particular, we have been led to explicitly compute rather non-trivial commutators of W generators, namely spin 2 – spin 2 and spin 2 – spin 3 ones, a result which is interesting in itself. The question is now to understand more deeply this relationship between Yangians and finite W-algebras. Of course, the structure of the W(sl(np), n.sl(p)) algebra (see Sect. 2.4) reveals the special role played by its (spin one) Lie subalgebra sl(n). The W generators of equal spin gather into adjoint representations of this sl(n) algebra, inducing some resemblance with the Y(sl(n)) Yangian structure. At this point, let us remark on another common point between Y(sl(n)) and W(sl(np), n.sl(p)), namely the construction of their finite dimensional representations with the help of sl(n) ones. Indeed, the evaluation homomorphism (in the case of Yangians) and the Miura map (for W-algebras) play identical roles for such a construction: the former allows one to represent Y(sl(n)) on the tensor product of sl(n) representations (with the use of additional constant numbers), while the latter uses a representation of the $G_0$ algebra p.sl(n) ⊕ (p − 1).gl(1). This clearly shows a one-to-one correspondence. Let us also stress another feature of the W(sl(np), n.sl(p)) algebras: for p = 2, they are the commutant in (a localisation of) U(sl(n)) of an Abelian subalgebra $\tilde{G}$ of sl(n) [11,18], the case p > 2 being without doubt generalisable. Finally, in seeking to understand our results, one could think of an R-matrix approach. This point of view looks natural, since an R-matrix definition of the Yangians is available, while our W-algebras are symmetry algebras of (integrable) non-Abelian lattice Toda models.

Due to the wide class of W(G, S)-algebras, it seems natural to think of generalisations of our work. First of all, one could imagine studying Yangians Y(G) (with G ≠ sl(n)) from the W-algebras point of view. However, a rapid survey of W(G, S)-algebras shows that W(sl(np), n.sl(p)) algebras are the only W(G, S)-algebras where the generators are all gathered in adjoint representations of the Lie W-subalgebra. Conversely, W(G, S)-algebras might be a way to generalize the notion of Yangians Y(G) to cases where the generators are in any representation of G. In that case, the Hopf structure remains to be determined. Finally, it would be of some interest to look for an extension to affine W-algebras.

Let us end with two comments concerning applications. The first one concerns the representation theory of finite W-algebras. Preliminary results have been given in Sect. 6 and deal with the classification of finite dimensional representations of W(sl(2n), n.sl(2)). More complete results will be available soon [21]. Secondly, the possibility of carrying out the tensor product of W representations, although only in special cases, allows one to imagine the construction of spin chain models based on a finite W(sl(np), n.sl(p)) algebra.

Acknowledgements. We have benefited from valuable discussions with M. L. Ge, Ph. Roche and particularly Ph. Zaugg.


Appendices

A. The Soldering Procedure

The soldering procedure [13] allows one to compute the Poisson brackets of the W-algebras. The basic idea is to implement the W(G, S)-transformations from G ones. Indeed, as the W(G, S)-algebra can be realised from a Hamiltonian reduction on G, one can see the W transformations as a particular class of (field dependent) G conjugations that preserve the constraints we have imposed. Thus, the soldering procedure just says that the PBs of the W(G, S) algebra can be deduced from the commutators in G. It applies to any W(G, S) algebra, but we will focus on the W(sl(np), n.sl(p)) ones. For this purpose, we define

\[
J = \sum_{a=1}^{(np)^2 - 1} J^a\, t_a \qquad \text{where } t_a \text{ are } (np) \times (np) \text{ matrices and } J^a \in G^*.
\tag{A.1}
\]

Then, we introduce the highest weight basis for the sl(2) under consideration $(E_\pm, H)$:

\[
J = E_- + \sum_{i=1}^{p\,n^2 - 1} W^i M_i \qquad \text{with } [E_+, M_i] = 0,
\tag{A.2}
\]

where $M_i$ are (np) × (np) matrices and $E_+$ is considered here in the fundamental representation, $E_+ = \sum_{i=1}^{np-1} E_{i,i+1}$, with $E_{ij}$ the matrix whose elements are $(E_{ij})_{kl} = \delta_{ik}\delta_{jl}$. To compute the PB of the generator $W^i$ of the W-algebra, one writes the variation of J under the infinitesimal action of one of the W-generators in two ways, namely:

\[
\delta_\varepsilon J = \{ \mathrm{tr}(\varepsilon J), J \}_{PB} = [\, \varepsilon J, J \,],
\tag{A.3}
\]

where $\{ \mathrm{tr}(\varepsilon J), J \}_{PB}$ is the matrix of PBs:

\[
\{ \mathrm{tr}(\varepsilon J), J \}_{PB} = \{ \mathrm{tr}(\varepsilon J), W^i \}_{PB}\, M_i,
\tag{A.4}
\]

and $[\, \varepsilon J, J \,]$ is a commutator of (np) × (np) matrices:

\[
[\, \varepsilon J, J \,] = f^{a}{}_{bc}\, \varepsilon^b J^c\, t_a .
\tag{A.5}
\]

ε is an np × np matrix such that $\delta_\varepsilon J = [\, \varepsilon J, J \,]$ keeps the form (A.2) with of course $\delta_\varepsilon E_- = 0$. This matrix ε has $p\,n^2 - 1$ free entries, which is the right number of parameters needed to describe a gauge transformation by a general element in the W-algebra. Identifying the matrix of PBs with the commutator of matrices leads to the PBs of the W-algebra. We now use the property gl(np) ∼ gl(n) ⊗ gl(p) to explicitly compute some of the PBs. In gl(n) ⊗ gl(p), a general element can be written as

\[
J = \sum_{\alpha=0}^{n^2 - 1} \sum_{s=0}^{p^2 - 1} J^{\alpha s}\, t_\alpha \otimes \tau_s
\tag{A.6}
\]


with $t_\alpha$, n × n matrices and $\tau_s$, p × p matrices. The principal sl(2) in n.sl(p) takes the form

\[
H = I_n \otimes h \quad \text{and} \quad E_\pm = I_n \otimes e_\pm,
\tag{A.7}
\]

where $(h, e_\pm)$ form the principal sl(2) in sl(p), and $I_n$ is the n × n identity matrix. Then,

\[
J = I_n \otimes e_- - \sum_{k=0}^{p-1} W_k \otimes m_k,
\tag{A.8}
\]

where $e_-$ is viewed as a p × p matrix, $e_- = \sum_{i=1}^{p-1} E_{i,i+1}$, and $m_k$ are p × p matrices representing the highest weights of the principal sl(2) in sl(p). They have been computed in [22]:

\[
m_k = \sum_{i=1}^{p-k} a_k^i\, E_{i,i+k} \qquad \text{with } a_k^i = \frac{(i + k - 1)!\, (p - i)!}{(i - 1)!\, (p - k - i)!} .
\tag{A.9}
\]

$W_k$ are n × n matrices whose entries $W_k^a$ ($a = 0, \ldots, n^2 - 1$) are the W-generators (with $W_k^0$ related to $C_{k+1}$ for k > 0 and $W_0^0 = 0$ by the traceless condition on J). Note that the indices run from 0 to $n^2 - 1$ because we are using gl(n) indices instead of sl(n) ones (see Appendix B for details). Using this notation and demanding that $\delta_\varepsilon J$ keeps the form (A.2), one can compute the commutator $[\, \varepsilon J, J \,]$ to get the relations defining the matrix ε. These relations are quite cumbersome but, for our purpose, we just need to compute the matrix elements $[\, \varepsilon J, J \,]_{1,2}$ and $[\, \varepsilon J, J \,]_{1,3}$. A rather long calculation leads to:

\[
\{ \mathrm{tr}(\mu W_0), W_k \}_{PB} = \frac{1}{p}\, [W_k, \mu], \qquad k = 0, 1, 2, \ldots
\tag{A.10}
\]
\[
\begin{aligned}
\{ \mathrm{tr}(\lambda W_1), W_1 \}_{PB} = \frac{6}{p(p^2 - 1)} \Big[ & \frac{p^2 - 4}{5}\, [W_2, \lambda] + \frac{1}{2} \Big( [W_0, \{W_1, \lambda\}] + \{W_1, [W_0, \lambda]\} \Big)\\
& - \frac{1}{2}\, [W_0, [W_0, [W_0, \lambda]]] \Big],
\end{aligned}
\tag{A.11}
\]
\[
\begin{aligned}
\{ \mathrm{tr}(\lambda W_1), W_2 \}_{PB} = \frac{6}{p(p^2 - 1)} \Big[ & \frac{3(p^2 - 9)}{14}\, [W_3, \lambda] + \{W_2, [W_0, \lambda]\} + \frac{1}{2}\, [W_0, \{W_2, \lambda\}]\\
& + \frac{1}{3}\, [\{W_1, W_1\}, \lambda] - \frac{1}{2}\, [W_1, [W_0, [W_0, \lambda]]]\\
& - \frac{1}{4}\, [W_0, [W_1, [W_0, \lambda]]] - \frac{1}{12}\, [W_0, [W_0, [W_1, \lambda]]] \Big],
\end{aligned}
\tag{A.12}
\]

where µ (resp. λ) is an n × n matrix whose entries $\mu_a$ (resp. $\lambda_a$) are the parameters of the infinitesimal transformations associated to $W_0^a$ (resp. $W_1^a$):

\[
\mu = \mu_a t^a; \qquad \lambda = \lambda_a t^a; \qquad W_k = W_k^a\, t_a, \quad k = 0, 1, 2, \qquad \text{with } t_0 = I_n .
\tag{A.13}
\]
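The coefficients $a_k^i$ of (A.9) are easy to tabulate, and the highest-weight property of the matrices $m_k$ can be checked numerically. The Python sketch below is only an illustration under a stated assumption: the text does not write out $e_+$ for sl(p), so we take the normalisation $e_+ = \sum_i i(p - i) E_{i,i+1}$, which is our choice and is the one for which the commutators $[e_+, m_k]$ vanish with the coefficients of (A.9). All function names are ours.

```python
from math import factorial

import numpy as np


def a_coeff(p: int, k: int, i: int) -> int:
    """Coefficient a_k^i of (A.9), for 1 <= i <= p - k."""
    return (factorial(i + k - 1) * factorial(p - i)) // (
        factorial(i - 1) * factorial(p - k - i)
    )


def m_matrix(p: int, k: int) -> np.ndarray:
    """p x p matrix m_k = sum_i a_k^i E_{i,i+k} (i is 1-based, as in the text)."""
    m = np.zeros((p, p), dtype=np.int64)
    for i in range(1, p - k + 1):
        m[i - 1, i + k - 1] = a_coeff(p, k, i)
    return m


def raising_operator(p: int) -> np.ndarray:
    """ASSUMPTION (not taken from the text): e_+ = sum_i i (p - i) E_{i,i+1}."""
    e_plus = np.zeros((p, p), dtype=np.int64)
    for i in range(1, p):
        e_plus[i - 1, i] = i * (p - i)
    return e_plus


if __name__ == "__main__":
    p = 4
    e_plus = raising_operator(p)
    for k in range(p):
        m_k = m_matrix(p, k)
        # Highest-weight condition: m_k commutes with the raising operator.
        assert not np.any(e_plus @ m_k - m_k @ e_plus)
        print(k, [a_coeff(p, k, i) for i in range(1, p - k + 1)])
```

With this choice $m_0$ is the identity matrix, so the k = 0 term of (A.8) is $W_0 \otimes I_p$.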


B. Classical W(sl(np), n.sl(p)) Algebras

B.1. Generalities. As we make heavy use of the isomorphism gl(np) ∼ gl(n) ⊗ gl(p) in our calculations, we are forced to use gl(n) indices instead of sl(n) ones. We denote the last index by a = 0. It corresponds to the gl(1) generator that commutes with sl(n) in gl(n). We can consistently extend the definition of the totally (anti-)symmetric tensors f and d from sl(n) to gl(n) by

\[
d^{ab}{}_{0} = 2\, \eta^{ab} \quad \text{and} \quad f^{ab}{}_{0} = 0 \qquad \forall\, a, b = 0, 1, \ldots, n^2 - 1.
\tag{B.1}
\]

In the fundamental representation of gl(n), we have then the decomposition:

\[
t^a t^b = \frac{1}{2} \left( f^{ab}{}_{c} + d^{ab}{}_{c} \right) t^c \qquad \text{with } t^0 = I_n .
\tag{B.2}
\]

Then, it is easy to show that the Jacobi identities

\[
f^{ab}{}_{c}\, f^{cd}{}_{e} + f^{bd}{}_{c}\, f^{ca}{}_{e} + f^{da}{}_{c}\, f^{cb}{}_{e} = 0,
\tag{B.3}
\]
\[
d^{ab}{}_{c}\, f^{cd}{}_{e} + d^{bd}{}_{c}\, f^{ca}{}_{e} + d^{da}{}_{c}\, f^{cb}{}_{e} = 0
\tag{B.4}
\]

are still valid for any values of $a, b, d, e = 0, 1, \ldots, n^2 - 1$. If we compute $\{\{t^a, t^b\}, t^c\} - \{\{t^c, t^b\}, t^a\} = [[t^a, t^c], t^b]$, we get also the relation between f and d tensors:

\[
d^{ab}{}_{d}\, d^{dc}{}_{e} - d^{bc}{}_{d}\, d^{da}{}_{e} = f^{ac}{}_{d}\, f^{db}{}_{e} .
\tag{B.5}
\]
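The gl(n) identities (B.1), (B.3) and (B.4) can be tested numerically in an explicit basis. The Python sketch below is only an illustration: the choice of basis (the identity plus rescaled generalized Gell-Mann matrices, normalised so that tr(t^a t^b) = n δ^{ab} and hence η = δ) and all function names are our assumptions, not the paper's. For the f–d relation, with $[t^a, t^b] = f^{ab}{}_c t^c$ read off from (B.2), this sketch finds the two sides of (B.5) to agree up to an overall sign, i.e. with $f^{db}{}_e$ replaced by $f^{bd}{}_e$; which index ordering the text intends cannot be recovered from the extraction, so the check is made up to that sign.

```python
import itertools

import numpy as np


def gln_basis(n: int) -> np.ndarray:
    """Hermitian basis of gl(n) with t^0 = I_n and tr(t^a t^b) = n delta^{ab}.
    ASSUMPTION: rescaled generalized Gell-Mann matrices (our choice of basis)."""
    basis = [np.eye(n, dtype=complex)]
    scale = np.sqrt(n / 2.0)
    for j, k in itertools.combinations(range(n), 2):
        sym = np.zeros((n, n), dtype=complex)
        sym[j, k] = sym[k, j] = 1.0
        asym = np.zeros((n, n), dtype=complex)
        asym[j, k], asym[k, j] = -1j, 1j
        basis += [scale * sym, scale * asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        idx = np.arange(l)
        diag[idx, idx] = 1.0
        diag[l, l] = -l
        basis.append(scale * np.sqrt(2.0 / (l * (l + 1))) * diag)
    return np.array(basis)


def structure_tensors(n: int):
    """Return (f, d) with [t^a,t^b] = f[a,b,c] t^c and {t^a,t^b} = d[a,b,c] t^c."""
    t = gln_basis(n)
    prod = np.einsum("aij,bjk->abik", t, t)
    comm = prod - prod.transpose(1, 0, 2, 3)
    anti = prod + prod.transpose(1, 0, 2, 3)
    f = np.einsum("abik,cki->abc", comm, t) / n   # tr([t^a,t^b] t^c) / n
    d = np.einsum("abik,cki->abc", anti, t) / n   # tr({t^a,t^b} t^c) / n
    return f, d


def jacobi_ff(f):
    """Left-hand side of (B.3)."""
    return (np.einsum("abc,cde->abde", f, f) + np.einsum("bdc,cae->abde", f, f)
            + np.einsum("dac,cbe->abde", f, f))


def jacobi_df(f, d):
    """Left-hand side of (B.4)."""
    return (np.einsum("abc,cde->abde", d, f) + np.einsum("bdc,cae->abde", d, f)
            + np.einsum("dac,cbe->abde", d, f))


def b5_sides(f, d):
    """Both sides of (B.5) as printed: (d d - d d, f f)."""
    lhs = np.einsum("abd,dce->abce", d, d) - np.einsum("bcd,dae->abce", d, d)
    rhs = np.einsum("acd,dbe->abce", f, f)
    return lhs, rhs


if __name__ == "__main__":
    f, d = structure_tensors(3)
    print(np.allclose(jacobi_ff(f), 0), np.allclose(jacobi_df(f, d), 0))
    lhs, rhs = b5_sides(f, d)
    print(np.allclose(lhs, -rhs))   # agreement up to the sign convention noted above
```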

These identities will be the only ones needed for our purpose. Note that the identity

\[
f^{ab}{}_{c}\, f_{abd} = c_v\, \eta_{cd}
\tag{B.6}
\]

is not valid in gl(n) since the left hand side is 0 for c = d = 0. As an aside, let us remark that the isomorphism gl(np) ∼ gl(n) ⊗ gl(p) together with the above conventions allows us to construct the structure constants of gl(np) from those of gl(n) and gl(p). Indeed, let $t^a$ (resp. $\bar{t}^q$ and $T^{(a,q)} = t^a \otimes \bar{t}^q$) be the generators in the fundamental representation of gl(n) (resp. gl(p) and gl(np)); let $f^{ab}{}_{c}$ (resp. $\bar{f}^{qr}{}_{s}$ and $F^{(a,q)(b,r)}{}_{(c,s)}$) be their structure constants; and let $d^{ab}{}_{c}$ (resp. $\bar{d}^{qr}{}_{s}$ and $D^{(a,q)(b,r)}{}_{(c,s)}$) be their totally symmetric invariant tensors. The calculation of $[T^{(a,q)}, T^{(b,r)}]$ and $\{T^{(a,q)}, T^{(b,r)}\}$ shows that

\[
F^{(a,q)(b,r)}{}_{(c,s)} = \frac{1}{2} \left( f^{ab}{}_{c}\, \bar{d}^{qr}{}_{s} + d^{ab}{}_{c}\, \bar{f}^{qr}{}_{s} \right),
\tag{B.7}
\]
\[
D^{(a,q)(b,r)}{}_{(c,s)} = \frac{1}{2} \left( f^{ab}{}_{c}\, \bar{f}^{qr}{}_{s} + d^{ab}{}_{c}\, \bar{d}^{qr}{}_{s} \right),
\tag{B.8}
\]

which shows that e.g.

\[
D^{(a,q)(b,r)}{}_{(0,0)} = 2\, \eta^{(a,q)(b,r)} = 2\, \eta^{ab}\, \eta^{qr},
\tag{B.9}
\]

in agreement with our conventions.
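Formulas (B.7) and (B.8) lend themselves to a direct numerical check in the smallest case n = p = 2. The Python sketch below is only an illustration: it assumes the Pauli-type basis of gl(2) for both factors, builds $T^{(a,q)} = t^a \otimes \bar{t}^q$ with a Kronecker product, extracts F and D from traces, and compares them with the right-hand sides. The helper names and the flattening of the pair (a, q) into the single index 4a + q are our choices.

```python
import numpy as np


def tensors(basis, norm):
    """(f, d) for a Hermitian basis with tr(t^a t^b) = norm * delta^{ab}."""
    t = np.array(basis)
    prod = np.einsum("aij,bjk->abik", t, t)
    comm = prod - prod.transpose(1, 0, 2, 3)
    anti = prod + prod.transpose(1, 0, 2, 3)
    f = np.einsum("abik,cki->abc", comm, t) / norm
    d = np.einsum("abik,cki->abc", anti, t) / norm
    return f, d


# gl(2) basis: t^0 = identity, t^1..t^3 = Pauli matrices (tr(t^a t^b) = 2 delta^{ab}).
PAULI = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def check_b7_b8() -> bool:
    f, d = tensors(PAULI, 2)
    # T^{(a,q)} = t^a (x) t^q, flattened to the single index 4a + q.
    big = [np.kron(ta, tq) for ta in PAULI for tq in PAULI]
    F, D = tensors(big, 4)
    F_pred = 0.5 * (np.einsum("abc,qrs->aqbrcs", f, d)
                    + np.einsum("abc,qrs->aqbrcs", d, f)).reshape(16, 16, 16)
    D_pred = 0.5 * (np.einsum("abc,qrs->aqbrcs", f, f)
                    + np.einsum("abc,qrs->aqbrcs", d, d)).reshape(16, 16, 16)
    return bool(np.allclose(F, F_pred) and np.allclose(D, D_pred))


if __name__ == "__main__":
    print(check_b7_b8())
```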


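With these conventions and properties, we deduce from the soldering procedure result (A.10–A.12) the PBs

\[
\{W_0^a, W_k^b\} = \frac{1}{p}\, f^{ab}{}_{c}\, W_k^c, \qquad k = 0, 1, 2, \ldots,
\tag{B.10}
\]
\[
\begin{aligned}
\{W_1^a, W_1^b\} = \frac{6}{p(p^2 - 1)} \Big[ & \frac{p^2 - 4}{5}\, f^{ab}{}_{c}\, W_2^c + \frac{1}{2} \left( d^{a}{}_{cu}\, f^{ub}{}_{d} - d^{b}{}_{cu}\, f^{ua}{}_{d} \right) W_1^c W_0^d\\
& + \frac{1}{2}\, f^{a}{}_{cu}\, f^{b}{}_{dv}\, f^{uv}{}_{e}\, W_0^c W_0^d W_0^e \Big],
\end{aligned}
\tag{B.11}
\]
\[
\begin{aligned}
\{W_1^a, W_2^b\} = \frac{6}{p(p^2 - 1)} \Big[ & \frac{3(p^2 - 9)}{14}\, f^{ab}{}_{c}\, W_3^c + \Big( \frac{1}{2}\, d^{a}{}_{cu}\, f^{ub}{}_{d} - d^{b}{}_{cu}\, f^{ua}{}_{d} \Big) W_0^c W_2^d + \frac{1}{6}\, f^{ab}{}_{u}\, d^{u}{}_{cd}\, W_1^c W_1^d\\
& + \Big( \frac{1}{2}\, f^{a}{}_{cu}\, f^{b}{}_{ev}\, f^{uv}{}_{d} + \frac{1}{4}\, f^{a}{}_{du}\, f^{b}{}_{cv}\, f^{uv}{}_{e} + \frac{1}{12}\, f^{a}{}_{eu}\, f^{b}{}_{cv}\, f^{uv}{}_{d} \Big) W_0^c W_0^d W_1^e \Big].
\end{aligned}
\tag{B.12}
\]

We repeat that the indices run from 0 to $n^2 - 1$. Noting the identity (proved using (B.5) and the commutativity of the product)

\[
f^{a}{}_{cu}\, f^{b}{}_{dv}\, f^{uv}{}_{e}\, W_0^c W_0^d W_0^e = \Big[ \frac{1}{2}\, f^{ab}{}_{u}\, d^{u}{}_{cv}\, d^{v}{}_{de} - \frac{3}{4} \left( d^{a}{}_{uv}\, f^{ub}{}_{c} - d^{b}{}_{uv}\, f^{ua}{}_{c} \right) d^{v}{}_{de} \Big] W_0^c W_0^d W_0^e ,
\tag{B.13}
\]

and performing a change of basis

\[
\widetilde{W}_1^a = \frac{p(p^2 - 1)}{6}\, W_1^a + \frac{p}{4}\, d^{a}{}_{bc}\, W_0^b W_0^c ,
\tag{B.14}
\]
\[
\widetilde{W}_2^a = \frac{p(p^2 - 1)(p^2 - 4)}{6}\, W_2^a + 5\, d^{a}{}_{bc}\, W_0^b\, \widetilde{W}_1^c + \frac{5p(p^2 - 4)}{24}\, d^{a}{}_{bu}\, d^{u}{}_{cd}\, W_0^b W_0^c W_0^d ,
\tag{B.15}
\]
\[
\widetilde{W}_3^a = \frac{p(p^2 - 1)(p^2 - 4)(p^2 - 9)}{6}\, W_3^a ,
\tag{B.16}
\]

we obtain the PB$^4$:

\[
\{W_1^a, W_1^b\} = \frac{1}{5}\, f^{ab}{}_{c}\, W_2^c - \frac{p^3}{16} \left( d^{au}{}_{v}\, f^{b}{}_{cu} - d^{bu}{}_{v}\, f^{a}{}_{cu} \right) d^{v}{}_{de}\, W_0^c W_0^d W_0^e .
\tag{B.17}
\]

Note that in this basis, we have $W_1^0 = C_2$ (i.e. $W_1^0$ is central). Now, as the relations that we have to verify are different if n is 2 or not, we treat the two cases separately. We begin with the general case.

4 We keep the notation $W_j^a$ for $\widetilde{W}_j^a$: throughout the text it is $\widetilde{W}_j^a$ which is used, except in Eqs. (B.10–B.12) and the convention (A.13).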
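The cubic identity (B.13), with coefficients 1/2 and 3/4, can be checked numerically for a random commuting vector $W_0^c$. The Python sketch below is only an illustration: it assumes a Hermitian gl(n) basis with $t^0 = I_n$ and tr(t^a t^b) = n δ^{ab} (so that η = δ and index positions are immaterial), and the function names are ours.

```python
import itertools

import numpy as np


def structure_tensors(n: int):
    """(f, d) of gl(n) in a Hermitian basis with t^0 = I_n, tr(t^a t^b) = n delta^{ab}.
    ASSUMPTION: rescaled generalized Gell-Mann matrices (our choice of basis)."""
    basis = [np.eye(n, dtype=complex)]
    scale = np.sqrt(n / 2.0)
    for j, k in itertools.combinations(range(n), 2):
        sym = np.zeros((n, n), dtype=complex)
        sym[j, k] = sym[k, j] = 1.0
        asym = np.zeros((n, n), dtype=complex)
        asym[j, k], asym[k, j] = -1j, 1j
        basis += [scale * sym, scale * asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        idx = np.arange(l)
        diag[idx, idx] = 1.0
        diag[l, l] = -l
        basis.append(scale * np.sqrt(2.0 / (l * (l + 1))) * diag)
    t = np.array(basis)
    prod = np.einsum("aij,bjk->abik", t, t)
    f = np.einsum("abik,cki->abc", prod - prod.transpose(1, 0, 2, 3), t) / n
    d = np.einsum("abik,cki->abc", prod + prod.transpose(1, 0, 2, 3), t) / n
    return f, d


def b13_sides(n: int, w: np.ndarray):
    """Both sides of (B.13) for commuting numbers w[c] standing in for W_0^c."""
    f, d = structure_tensors(n)
    lhs = np.einsum("acu,bdv,uve,c,d,e->ab", f, f, f, w, w, w, optimize=True)
    term_fdd = np.einsum("abu,ucv,vde,c,d,e->ab", f, d, d, w, w, w, optimize=True)
    term_dfd = (np.einsum("auv,ubc,vde,c,d,e->ab", d, f, d, w, w, w, optimize=True)
                - np.einsum("buv,uac,vde,c,d,e->ab", d, f, d, w, w, w, optimize=True))
    return lhs, 0.5 * term_fdd - 0.75 * term_dfd


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lhs, rhs = b13_sides(3, rng.normal(size=9))
    print(np.allclose(lhs, rhs))
```

In matrix language both sides reduce to $3\,[\mathrm{tr}(AWBW^2) - \mathrm{tr}(AW^2BW)] - \mathrm{tr}([A,B]W^3)$ up to the common factor 1/n, with $A = t^a$, $B = t^b$ and $W = W_0^c t_c$, which is why the check does not depend on the sample vector.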


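B.2. The generic case n ≠ 2. One has to verify that the PB (B.17) obeys the defining relations of the Yangian. We rewrite (3.3) as

\[
f^{bc}{}_{d}\, \{Q_1^a, Q_1^d\} + \text{circ. perm. } (a, b, c) = f^{a}{}_{qd}\, f^{b}{}_{rx}\, f^{c}{}_{sy}\, f^{xyd}\, Q_0^q Q_0^r Q_0^s .
\tag{B.18}
\]

Plugging the PB into the left hand side of (B.18) leads to

\[
\text{lhs} = -\beta_0 \beta_1^2\, \frac{p^3}{16}\, \frac{1}{p} \Big[ f^{bc}{}_{d} \left( d^{a\mu}{}_{\nu}\, f^{d}{}_{\pi\mu} - d^{d\mu}{}_{\nu}\, f^{a}{}_{\pi\mu} \right) + \text{circ. perm. } (a, b, c) \Big]\, d^{\nu}{}_{\gamma\rho}\, W_0^\pi W_0^\gamma W_0^\rho .
\]

This has to be compared with

\[
\text{rhs} = \beta_0^3\, f^{a}{}_{qx}\, f^{b}{}_{ry}\, f^{c}{}_{sz}\, f^{xyz}\, W_0^q W_0^r W_0^s ,
\]

where we have used latin (resp. greek) letters for sl(n) (resp. gl(n)) indices. To prove the equality between lhs and rhs, we first remark, using the Jacobi identity for f, that the index 0 can be dropped from lhs, or equivalently added to rhs. We choose to use gl(n) indices, and come back to latin letters to denote them:

\[
\begin{aligned}
\text{lhs} &= -\beta_0 \beta_1^2\, \frac{p^2}{16}\, f^{ab}{}_{d} \left( d^{cy}{}_{x}\, f^{d}{}_{qy} - d^{dy}{}_{x}\, f^{c}{}_{qy} \right) d^{x}{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c)\\
&= -\beta_0 \beta_1^2\, \frac{p^2}{8}\, f^{ab}{}_{d}\, d^{dy}{}_{x}\, f^{c}{}_{yq}\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c),
\end{aligned}
\tag{B.19}
\]

where we have used the Jacobi identities (B.3–B.4). With (B.5) and the symmetry in (q, r, s), one can rewrite rhs as:

\[
\begin{aligned}
\frac{1}{\beta_0^3}\, \text{rhs} &= f^{a}{}_{qd}\, f^{b}{}_{rx} \left( d^{cx}{}_{y}\, d^{yd}{}_{s} - d^{cd}{}_{y}\, d^{yx}{}_{s} \right) W_0^q W_0^r W_0^s\\
&= \Big[ \frac{1}{2}\, d^{cx}{}_{y}\, f^{b}{}_{rx} \left( d^{yd}{}_{s}\, f^{a}{}_{qd} + d^{yd}{}_{q}\, f^{a}{}_{sd} \right) - \frac{1}{2}\, d^{cd}{}_{y}\, f^{a}{}_{qd} \left( d^{yx}{}_{s}\, f^{b}{}_{rx} + d^{yx}{}_{r}\, f^{b}{}_{sx} \right) \Big] W_0^q W_0^r W_0^s\\
&= \frac{1}{2} \left( f^{a}{}_{qd}\, f^{by}{}_{x} - f^{ay}{}_{x}\, f^{b}{}_{qd} \right) d^{cd}{}_{y}\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s\\
&= -\frac{1}{2}\, R^{abc}{}_{qx}\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s .
\end{aligned}
\tag{B.20}
\]

Using the Jacobi identity (B.4), we have

\[
\begin{aligned}
R^{abc}{}_{qx} &= f^{by}{}_{x} \left( f^{ac}{}_{d}\, d^{d}{}_{qy} + f^{a}{}_{yd}\, d^{cd}{}_{q} \right) - (a \leftrightarrow b)\\
&= \left( f^{ac}{}_{d}\, f^{by}{}_{x} - f^{bc}{}_{d}\, f^{ay}{}_{x} \right) d_{yq}{}^{d} + \left( f^{by}{}_{x}\, f^{a}{}_{yd} - f^{ay}{}_{x}\, f^{b}{}_{yd} \right) d^{cd}{}_{q}\\
&= \left( f^{ac}{}_{d}\, f^{by}{}_{x} - f^{bc}{}_{d}\, f^{ay}{}_{x} \right) d_{yq}{}^{d} + f^{ab}{}_{y}\, f_{dx}{}^{y}\, d^{cd}{}_{q} .
\end{aligned}
\tag{B.21}
\]

Now, since rhs is invariant under cyclic permutations of (a, b, c), we can write

\[
\begin{aligned}
6\, \text{rhs} &= \beta_0^3 \Big[ 2 f^{ab}{}_{d}\, f^{cy}{}_{x}\, d_{yq}{}^{d} - f^{ab}{}_{y}\, f_{dx}{}^{y}\, d^{cd}{}_{q} + \text{circ. perm. } (a, b, c) \Big]\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s\\
&= \beta_0^3 \Big[ f^{ab}{}_{d} \left( -2 f^{cy}{}_{q}\, d_{yx}{}^{d} - f_{yx}{}^{d}\, d^{cy}{}_{q} \right) + \text{circ. perm. } (a, b, c) \Big]\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s\\
&= -3 \beta_0^3\, f^{ab}{}_{d}\, f^{cy}{}_{q}\, d_{yx}{}^{d}\, d^{x}{}_{rs}\, W_0^q W_0^r W_0^s + \text{circ. perm. } (a, b, c).
\end{aligned}
\tag{B.22}
\]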


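From the normalisation $\beta_0 = p$, we deduce that lhs and rhs are equal when $\beta_1^2 = 4$, i.e. for

\[
Q_0^a = p\, W_0^a \quad \text{and} \quad Q_1^a = 2\, W_1^a ,
\tag{B.23}
\]

which ends the proof for the generic case.

B.3. The particular case of Y(sl(2)). As a normalisation, we take for the fundamental representation of gl(2) the matrices:

\[
t_1 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}; \quad
t_2 = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}; \quad
t_3 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}; \quad
t_0 = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}.
\tag{B.24}
\]

We have in that case

\[
d^{abc} = 2\, \delta_0^a\, \eta^{bc} + \text{circ. perm. } (a, b, c) \quad \text{and} \quad c_v = -8.
\tag{B.25}
\]

Then, using the special property $f^{ij}{}_{m}\, f^{m}{}_{kl} = -4 \left( \eta^i_k\, \eta^j_l - \eta^i_l\, \eta^j_k \right)$ valid in sl(2) (i.e. when none of the indices is 0), we get the PB

\[
\{W_1^i, W_1^j\} = f^{ij}{}_{k} \left[ \frac{1}{5}\, W_2^k - \frac{p^3}{2}\, (\vec{W}_0 \cdot \vec{W}_0)\, W_0^k \right],
\tag{B.26}
\]

where $\vec{x} \cdot \vec{y} = x^1 y_1 + x^2 y_2 + x^3 y_3$ and the indices i, j, k now run from 1 to 3. Note that for p = 2, once the constraint $W_2^a = 10\, [\, W_0^a W_1^0 + \delta_0^a\, (\vec{W}_1 \cdot \vec{W}_0)\, ]$ is applied, we recover the algebra presented in Sect. 2.4, up to the normalisation $J^i = W_0^i$, $S^i = 2\, W_1^i$, $C_2 = W_1^0$ and $f^{ijk} = 2i\, \varepsilon^{ijk}$. After multiplication by $f_{ijk}\, f_{mn}{}^{l}$, the relation (3.4) can be rewritten as:

\[
f_{ij}{}^{k}\, \{\{W_1^i, W_1^j\}, W_1^l\} + f_{ij}{}^{l}\, \{\{W_1^i, W_1^j\}, W_1^k\} = 32 \left( f_{ij}{}^{k}\, \eta^l_m + f_{ij}{}^{l}\, \eta^k_m \right) W_0^m W_0^j W_1^i .
\tag{B.27}
\]

Using the above PB and the normalisation $\beta_0 = p$, we get:

\[
\begin{aligned}
\text{lhs} &= c_v\, \beta_1^3 \Big[ \frac{1}{5} \left( \{W_2^k, W_1^l\} + \{W_2^l, W_1^k\} \right) - p^2 \left( f_{ij}{}^{k}\, \eta^l_m + f_{ij}{}^{l}\, \eta^k_m \right) W_0^m W_0^j W_1^i \Big],\\
\text{rhs} &= 32\, p^2 \beta_1 \left( f_{ij}{}^{k}\, \eta^l_m + f_{ij}{}^{l}\, \eta^k_m \right) W_0^m W_0^j W_1^i .
\end{aligned}
\tag{B.28}
\]

Thus, we need to simplify the PBs (B.12) in the new basis (B.14–B.15). For sl(2) it takes the form:

\[
\begin{aligned}
\{W_1^i, W_2^j\} = f^{ij}{}_{k} \Big[ & \frac{3}{14}\, W_3^k + 3\, W_2^0 W_0^k + \frac{6(3p^2 - 7)}{p(p^2 - 1)}\, W_1^k W_1^0\\
& + \frac{(p^2 - 9)(p^2 - 4)}{2(p^2 - 1)}\, W_1^k\, \vec{W}_0^{\,2} - 30\, W_0^k\, (\vec{W}_1 \cdot \vec{W}_0) \Big]
\end{aligned}
\tag{B.29}
\]

so that it does not contribute to lhs. Hence, we have

\[
\text{lhs} = -c_v\, \beta_1^3\, p^2 \left( f^{k}{}_{ij}\, \eta^l_m + f^{l}{}_{ij}\, \eta^k_m \right) W_0^m W_0^j W_1^i .
\tag{B.30}
\]

The relation (B.27) is then satisfied for

\[
Q_0^a = p\, W_0^a \quad \text{and} \quad Q_1^a = 2\, W_1^a ,
\tag{B.31}
\]

which is the same normalisation as for the generic case.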
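The sl(2) inputs used in this subsection are small enough to verify by brute force. The Python sketch below is only an illustration, assuming $f^{ij}{}_k = 2i\,\varepsilon_{ijk}$ and η = δ as quoted in the text (function names are ours). It checks the special property used before (B.26), recovers $c_v = -8$ from the definition in (3.5), and solves the comparison of (B.30) with the rhs of (B.28), namely $-c_v \beta_1^3 p^2 = 32 p^2 \beta_1$, for $\beta_1$.

```python
import numpy as np


def sl2_structure_constants() -> np.ndarray:
    """f^{ij}_k = 2i eps_{ijk} (normalisation quoted in the text), indices 0..2."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return 2j * eps


def special_property_holds(f: np.ndarray) -> bool:
    """Check f^{ij}_m f^m_{kl} = -4 (delta_ik delta_jl - delta_il delta_jk)."""
    delta = np.eye(3)
    lhs = np.einsum("ijm,mkl->ijkl", f, f)
    rhs = -4 * (np.einsum("ik,jl->ijkl", delta, delta)
                - np.einsum("il,jk->ijkl", delta, delta))
    return bool(np.allclose(lhs, rhs))


def dual_coxeter_constant(f: np.ndarray) -> float:
    """c_v defined by c_v eta^{ab} = f^a_{cd} f^{bcd}, with eta = delta."""
    cas = np.einsum("acd,bcd->ab", f, f)
    assert np.allclose(cas, cas[0, 0] * np.eye(3))
    return float(cas[0, 0].real)


def beta_1(c_v: float) -> float:
    """Positive solution of -c_v * beta_1^3 * p^2 = 32 * p^2 * beta_1."""
    return float(np.sqrt(32.0 / (-c_v)))


if __name__ == "__main__":
    f = sl2_structure_constants()
    c_v = dual_coxeter_constant(f)
    print(special_property_holds(f), c_v, beta_1(c_v))
```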


C. Quantum W(sl(np), n.sl(p)) Algebras In the quantum case, one has to check that the corrections to the leading terms in the W-algebras do not perturb the defining relations of the Yangian5 . – In the general case n 6 = 2, the calculation is quite easy. Indeed, the commutator takes the form p3 au b d v f cv − d bu v f a cv d v de s3 (W0c , W0d , W0e ) + 16 1 c ab c c d e c W + µ1 W1 + µ2 d de s2 (W0 , W0 ) + µ3 W0 , (C.1) +f c 5 2

[W1a , W1b ] = −

where µi (i = 1, 2, 3) are undetermined constants. However, one remarks that the terms containing f ab c in [W1a , W1b ] do not contribute to (3.3). Since these are the only type of terms we add, the calculation is identical to the classical one (up to symmetrization of the products). – In the case of W(sl(2p), 2.sl(p)), we need a little more. Due to the calculations done in the classical case, we already know that proving (3.4) amounts to show that j j j j µ1 [W1i , W1 ] + [W1 , W1i ] + µ3 [W0i , W1 ] + [W0 , W1i ] + 1 i j j [W2 , W1 ] + [W2 , W1i ] = 0, (C.2) + 5 where the indices run from 1 to 3. The terms corresponding to µ1 and µ3 disappear because of the antisymmetry in (i, j ). Thus, we just need to compute the corrections to the commutator [W2a , W1b ]. Using the results of Appendix D, we compute the most general form of this commutator: [W1a , W2b ] = f ab c

6(3p2 − 7) 3 W3c + s2 (W1c , W10 ) + 3 s2 (W20 , W0c )+ 14 p(p2 − 1)

(p 2 − 9)(p2 − 4) g c c d e ηdg ηe − 30 ηde ηg ) s3 (W0 , W0 , W1 ) + +( 2(p2 − 1) +f ab c ν1 W2c + ν10 W1c + ν100 W0c + ν2 f ab c ηde s3 (W0c , W0d , W0e ) + u + ν30 ηab ηcd + ν300 (ηca ηdb + ηda ηcb ) W1c W0d + + ν3 f ab u fcd (C.3) + ν4 ηab ηcd + ν40 (ηca ηdb + ηda ηcb ) s2 (W0c , W0d ) with indices running from 0 to 3. Looking at (C.2), one sees that some of the new terms that may appear in the right-hand side of the commutator do contribute to (C.2). Thus, one has to check that they are not in the true commutator. It is done thank to the Jacobi identity based on (W1a , W1b , W1c ) which shows (for a, b, c all different) that6 5 More exactly that the modification is the same as the one introduced in replacing the commutative product in U (G ∗ ) by the (symmetrised) non Abelian product of U (G). 6 Let us note en passant that the Jacobi identity has just removed in the new terms those which are symmetric in a, b.


ν30 = ν300 = ν4 = ν40 = 0. We deduce that the commutator takes the form: 6(3p2 − 7) 3 W3c + s2 (W1c , W10 ) + 3 s2 (W20 , W0c )+ [W1a , W2b ] = f ab c 14 p(p2 − 1) 2 (p − 9)(p 2 − 4) g c c ηdg ηe − 30 ηde ηg s3 (W0 , W0d , W1e ) + + 2(p2 − 1) + ν1 W2c + ν10 W1c + ν100 W0c + i + ν2 ηde s3 (W0c , W0d , W0e ) + ν3 f c de W1d W0e so that (C.2) and hence (3.4) are satisfied. D. Tensor Products of Some Finite Dimensional Representations of sl(n) We want here to compute the tensor product of the G-adjoint representation by itself several times, for G = sl(n). We will also need to select the totally symmetric part of these products. For such a purpose, we use Young diagrams, which allow us to determine the decompositions: G ⊗ G = (2, 0, .., 0, 2) ⊕ 2 (1, 0, .., 0, 1) ⊕ (2, 0, . . . , 0, 1, 0) ⊕ (0, 1, 0, . . . , 0, 2) ⊕ (0, 1, 0, . . . , 0, 1, 0) ⊕ (0, . . . , 0), where we have denoted by G = (1, 0, . . . , 0, 1) the adjoint representation. It remains to select the (anti-)symmetric part of these products. For G ⊗ G, the calculation has already been done (see e.g. [23]) and reads: S2 (G) = (G ⊗ G)sym = (2, 0, . . . , 0, 2) ⊕ (1, 0, . . . , 0, 1) ⊕(0, 1, 0, . . . , 0, 1, 0) ⊕ (0, . . . , 0), 32 (G) = (G ⊗ G)skew = (1, 0, . . . , 0, 1) ⊕(2, 0, . . . , 0, 1, 0) ⊕ (0, 1, 0, . . . , 0, 2).

(D.1) (D.2)

As far as $S_3(G)$ is concerned, we already know that this sum of representations belongs to $(S_2(G) \otimes G)_{sym}$, which decomposes as

$$\begin{aligned}
(S_2(G) \otimes G)_{sym} = {} & (3,0,\dots,0,3) \oplus 3\,(2,0,\dots,0,2) \oplus 3\,(1,0,\dots,0,1) \\
& \oplus 3\,(0,1,0,\dots,0,1,0) \oplus 2\,(1,1,0,\dots,0,1,1) \\
& \oplus 2\,[(2,0,\dots,0,1,0) \oplus (0,1,0,\dots,0,2)] \\
& \oplus (0,0,1,0,\dots,0,1,0,0) \oplus [(3,0,\dots,0,1,1) \oplus (1,1,0,\dots,0,3)] \\
& \oplus [(1,1,0,\dots,0,1,0,0) \oplus (0,0,1,0,\dots,0,1,1)] \oplus (0,\dots,0).
\end{aligned} \tag{D.3}$$

This implies that we must have

$$\begin{aligned}
S_3(G) = {} & a\,(3,0,\dots,0,3) \oplus b\,(2,0,\dots,0,2) \oplus c\,(1,0,\dots,0,1) \\
& \oplus d\,(0,1,0,\dots,0,1,0) \oplus e\,(1,1,0,\dots,0,1,1) \\
& \oplus f\,(0,0,1,0,\dots,0,1,0,0) \oplus m\,(0,\dots,0) \\
& \oplus g\,[(2,0,\dots,0,1,0) \oplus (0,1,0,\dots,0,2)] \\
& \oplus h\,[(3,0,\dots,0,1,1) \oplus (1,1,0,\dots,0,3)] \\
& \oplus i\,[(1,1,0,\dots,0,1,0,0) \oplus (0,0,1,0,\dots,0,1,1)]
\end{aligned} \tag{D.4}$$


with each multiplicity in (D.4) lower than or equal to the corresponding multiplicity in (D.3). But we know the dimension of $S_3(G)$: it is the dimension of a totally symmetric tensor with 3 indices in a space of dimension $\dim G = n^2 - 1$, i.e. $\frac{n^2(n^4-1)}{6}$. Computing this dimension with (D.4) leads to only two possible solutions for the parameters: $a = e = f = 1$, $h = i = 0$, $b = d = m$, $c = 3 - m$ and $g = 2 - m$ with $m = 0$ or $1$. As $m$ is the multiplicity of the trivial representation in $S_3(G)$, we deduce that, for $G = sl(n)$, $n \neq 2$ (footnote 7), we have $m = 1$ (since $d_{abc}$ belongs to this space). Thus

$$\begin{aligned}
S_3(G) = {} & (3,0,\dots,0,3) \oplus (2,0,\dots,0,2) \oplus 2\,(1,0,\dots,0,1) \\
& \oplus (0,1,0,\dots,0,1,0) \oplus (1,1,0,\dots,0,1,1) \\
& \oplus (0,0,1,0,\dots,0,1,0,0) \oplus [(2,0,\dots,0,1,0) \oplus (0,1,0,\dots,0,2)] \oplus (0,\dots,0).
\end{aligned} \tag{D.5}$$
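The dimension count behind (D.5) can be checked numerically. The sketch below is an illustration and not part of the original argument: it uses Weyl's dimension formula for $sl(n)$ in terms of Dynkin labels and compares the total dimensions of the symmetric square, the antisymmetric square and (D.5) with the expected binomial counts. The value $n = 7$ and the helper names `weyl_dim` and `pad` are choices made here; $n = 7$ is the smallest value for which every label pattern in (D.5) fits.

```python
from fractions import Fraction

def weyl_dim(labels):
    """Dimension of the sl(n) irrep with Dynkin labels (a_1, ..., a_{n-1}).

    Weyl's formula for type A: one factor per positive root
    alpha_i + ... + alpha_j, equal to sum_{k=i..j}(a_k + 1) / (j - i + 1).
    """
    rank = len(labels)
    dim = Fraction(1)
    for i in range(rank):
        for j in range(i, rank):
            dim *= Fraction(sum(labels[k] + 1 for k in range(i, j + 1)), j - i + 1)
    assert dim.denominator == 1
    return dim.numerator

def pad(left, right, rank):
    """Dynkin labels (left, 0, ..., 0, right) of total length `rank`."""
    return tuple(left) + (0,) * (rank - len(left) - len(right)) + tuple(right)

n = 7            # illustrative choice; the label patterns below need n >= 7
r = n - 1        # rank of sl(n)
N = n * n - 1    # dimension of the adjoint representation G

adj = weyl_dim(pad([1], [1], r))
# Symmetric and antisymmetric squares of the adjoint
s2 = sum(weyl_dim(pad(l, rt, r)) for l, rt in
         [([2], [2]), ([1], [1]), ([0, 1], [1, 0]), ([], [])])
l2 = sum(weyl_dim(pad(l, rt, r)) for l, rt in
         [([1], [1]), ([2], [1, 0]), ([0, 1], [2])])
# Decomposition (D.5), with multiplicities
s3 = sum(mult * weyl_dim(pad(l, rt, r)) for mult, l, rt in
         [(1, [3], [3]), (1, [2], [2]), (2, [1], [1]), (1, [0, 1], [1, 0]),
          (1, [1, 1], [1, 1]), (1, [0, 0, 1], [1, 0, 0]),
          (1, [2], [1, 0]), (1, [0, 1], [2]), (1, [], [])])

print(adj == N)
print(s2 == N * (N + 1) // 2)
print(l2 == N * (N - 1) // 2)
print(s3 == N * (N + 1) * (N + 2) // 6 == n**2 * (n**4 - 1) // 6)
```

A dimension match is only a necessary condition: it cannot distinguish the two candidate solutions $m = 0$ and $m = 1$, which is why the text needs the extra argument about $d_{abc}$.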

Finally, the multiplicity of the trivial representation in the tensor products occurring in Sect. 5 is computed through the remark that, in $sl(n)$, the tensor product of two finite dimensional irreducible representations $R$ and $R'$ contains the trivial representation if and only if $R$ and $R'$ are conjugate. In that case, the multiplicity is 1. This leads to the multiplicities given in (5.4). We give also a basis for the corresponding spaces. To be complete, let us mention the bases:

$$\Lambda_2(G) \otimes \Lambda_2(G): \quad
\begin{cases}
t^{ab}_{cd} = f^{ab}{}_{e}\, f^{e}{}_{cd} \\
t^{ab}_{cd} = d^{a}{}_{ce}\, d^{eb}{}_{d} - d^{b}{}_{ce}\, d^{ea}{}_{d} \\
t^{ab}_{cd} = f^{a}{}_{ce}\, d^{eb}{}_{d} - f^{b}{}_{ce}\, d^{ea}{}_{d}
\end{cases}$$

$$\Lambda_2(G) \otimes S_3(G): \quad
\begin{cases}
t^{ab}_{cde} = f^{ab}{}_{c}\, \eta_{de} + \text{circ. perm. } (c,d,e) \\
t^{ab}_{cde} = f^{ab}{}_{g}\, d^{g}{}_{cm}\, d^{m}{}_{de} + \text{circ. perm. } (c,d,e) \\
t^{ab}_{cde} = (\eta^{a}_{c}\, d^{b}{}_{de} - \eta^{b}_{c}\, d^{a}{}_{de}) + \text{circ. perm. } (c,d,e) \\
t^{ab}_{cde} = (f^{a}{}_{cg}\, d^{gb}{}_{m} - f^{b}{}_{cg}\, d^{ga}{}_{m})\, d^{m}{}_{de} + \text{circ. perm. } (c,d,e)
\end{cases} \tag{D.6}$$

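In the case of $sl(2)$, we need more information. Fortunately, the calculation is easier in that case, and we can go further. Indeed, we have (with $D_j$ the $(2j+1)$-dimensional representation of $sl(2)$):

$$(D_1 \times D_1)_{sym} = D_0 \oplus D_2; \quad (D_1 \times D_1)_{skew} = D_1; \quad S_3(D_1) = D_1 \oplus D_3, \tag{D.7}$$

which leads to the multiplicities and tensors:

$$\begin{array}{ll}
M_0[(D_1 \times D_1)_{skew} \times (D_1 \times D_1)_{skew}] = 1: & f^{ab}{}_{u}\, f^{u}{}_{cd} \sim \eta^{a}_{c}\eta^{b}_{d} - \eta^{a}_{d}\eta^{b}_{c}, \\
M_0[(D_1 \times D_1)_{sym} \times (D_1 \times D_1)_{sym}] = 2: & \eta^{ab}\eta_{cd}, \quad \eta^{a}_{c}\eta^{b}_{d} + \eta^{a}_{d}\eta^{b}_{c}, \\
M_0[(D_1 \times D_1)_{skew} \times (D_1 \times D_1)_{sym}] = 0: & -, \\
M_0[(D_1 \times D_1)_{skew} \times S_3(D_1)] = 1: & f^{ab}{}_{c}\, \eta_{de} + \text{circ. perm. } (c,d,e), \\
M_0[(D_1 \times D_1)_{sym} \times S_3(D_1)] = 0: & -.
\end{array} \tag{D.8}$$

Footnote 7: The case $G = sl(2)$ is treated below.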


References

1. Zamolodchikov, A.B.: Infinite additional symmetries in two-dimensional conformal quantum field theory. Theor. Math. Phys. 63, 347 (1985)
2. Drinfel'd, V.G.: Hopf algebras and the quantum Yang–Baxter equation. Sov. Math. Dokl. 32, 254 (1985)
3. Feher, L., O'Raifeartaigh, L., Ruelle, P., Tsutsui, I. and Wipf, A.: On the general structure of Hamiltonian reductions of the WZWN theory. Phys. Rep. 222, 1 (1992) and ref. therein
4. Bernard, D.: Hidden Yangians in 2d massive current algebras. Commun. Math. Phys. 137, 191 (1991)
5. Haldane, F.D.M., Ha, Z.N.C., Talstra, J.C., Bernard, D. and Pasquier, V.: Yangian symmetry of integrable quantum chains with long-range interactions and a new description of states in conformal field theory. Phys. Rev. Lett. 69, 2021 (1992)
6. Schoutens, K.: Yangian symmetry in conformal field theory. Phys. Lett. B331, 335 (1994); Bouwknegt, P., Ludwig, A. and Schoutens, K.: Spinon bases, Yangian symmetry and fermionic representations of Virasoro characters in conformal field theory. Phys. Lett. B338, 448 (1994)
7. Avan, J., Babelon, O. and Billey, E.: Exact Yangian symmetry in the classical Euler–Calogero–Moser model. Phys. Lett. A188, 263 (1994)
8. De Boer, J., Harmsze, F. and Tjin, T.: Non-linear finite W-symmetries and applications in elementary systems. Phys. Rep. 272, 139 (1996)
9. Barbarin, F., Ragoucy, E. and Sorba, P.: Remarks on finite W-algebras. hep-th/9612070, Proceedings of Vth International Colloquium on Quantum Groups and Integrable Systems, Prague (Czech Republic), June 1996; Extended and Quantum Algebras and their Applications to Physics, Tianjin (China), August 1996; Selected Topics of Theoretical and Modern Mathematical Physics, Tbilisi (Georgia), September 1996
10. Barbarin, F., Ragoucy, E. and Sorba, P.: W-realization of Lie algebras: Application to so(4, 2) and Poincaré algebras. Commun. Math. Phys. 186, 393 (1997)
11. Barbarin, F., Ragoucy, E. and Sorba, P.: Non-polynomial realizations of W-algebras. Int. J. Mod. Phys. A11, 2835 (1996)
12. Barbarin, F., Ragoucy, E. and Sorba, P.: Finite W algebras and intermediate statistics. Nucl. Phys. B442, 425 (1995)
13. Balog, J., Feher, L., O'Raifeartaigh, L., Forgacs, P. and Wipf, A.: Toda theory and W-algebra from a gauged WZWN point of view. Ann. of Phys. 203, 76 (1990)
14. Bais, F.A., Tjin, T. and van Driel, P.: Covariantly coupled chiral algebras. Nucl. Phys. B357, 632 (1991)
15. Frappat, L., Ragoucy, E. and Sorba, P.: W-algebras and superalgebras from constrained WZW models: A group theoretical classification. Commun. Math. Phys. 157, 499 (1993)
16. de Boer, J. and Tjin, T.: The relation between quantum W algebras and Lie algebras. Commun. Math. Phys. 160, 317 (1994); Representation theory of finite W algebras. Commun. Math. Phys. 158, 485 (1993)
17. Madsen, J.O. and Ragoucy, E.: Quantum Hamiltonian reduction in superspace formalism. Nucl. Phys. B429, 277 (1994); Secondary quantum Hamiltonian reduction. Commun. Math. Phys. 185, 509 (1997)
18. Barbarin, F.: Algèbres W et applications. (in French), PhD thesis, p. 82-85, Preprint LAPTH
19. Chari, V. and Pressley, A.: A guide to quantum groups. Chap. 12, Cambridge: Cambridge University Press, 1994
20. Ge, M.L., Xue, K. and Cho, Y.M.: Realizations of Yangians in Quantum Mechanics and Applications. Preprint NIM-TP-97-12
21. Ragoucy, E., Sorba, P. and Zaugg, Ph.: Work in progress
22. Frappat, L., Ragoucy, E. and Sorba, P.: Folding the W-algebras. Nucl. Phys. B404, 805 (1993)
23. Gourdin, M.: Basics of Lie groups. Moriond series no. 37, Ed. Frontières, 1982

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 203, 573 – 592 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Classical r-Matrices and Compatible Poisson Structures for Lax Equations on Poisson Algebras

Luen-Chau Li

Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Received: 10 February 1998 / Accepted: 9 December 1998

Abstract: Given a classical r-matrix on a Poisson algebra, we show how to construct a natural family of compatible Poisson structures for the Hamiltonian formulation of Lax equations. Examples for which our formalism applies include the Benny hierarchy, the dispersionless Toda lattice hierarchy, the dispersionless KP and modified KP hierarchies, the dispersionless Dym hierarchy, etc.

1. Introduction

Two Poisson brackets on the same manifold are said to be compatible if their sum is also a Poisson bracket [GDO,M]. There are many examples of integrable systems which are Hamiltonian with respect to two compatible Poisson structures (see, e.g. [DO]). Indeed, when one of the structures happens to be nondegenerate, there is a simple way which allows one to produce a whole family of compatible Poisson structures [KR,RSTS1]. However, the existence of further structures is not a necessity when the two compatible structures are both degenerate. In the late seventies, we saw the beginning of the Lie algebraic approach to integrable systems [K,A]. The Korteweg–de Vries (KdV) equation, for example, was shown to be a Hamiltonian system on coadjoint orbits [A]. Furthermore, the second Poisson structure for KdV type equations was constructed on subspaces of the algebra of formal pseudo-differential operators [A,GD]. We now refer to this second structure as the Adler–Gelfand–Dickey structure. Recently, it was found to be of independent interest in conformal field theory [DFIZ]. In the meantime, the Lie algebraic approach to integrable systems was extensively developed, particularly by the Russian school in St. Petersburg (see, e.g., the survey in [RSTS2]). In the so-called r-matrix framework, the simplest Poisson structures for the Hamiltonian formulation of Lax equations on Lie algebras are the linear Poisson structures associated with the R-brackets. In the case where g is the Lie algebra of a noncommutative, associative algebra, a construction of quadratic brackets


which give Lax equations was first available for the skew-symmetric r-matrices satisfying the modified Yang–Baxter equation [STS1]. Subsequently, this was superseded by a more general construction valid for a wider class of r-matrices [LP1,LP2]. Indeed, in [LP2], even a third order structure was found. At this juncture, the reader should note that on the abstract level of associative algebras, neither the linear structure nor the quadratic structure is nondegenerate. Therefore, the recipe for producing a whole family of structures is not applicable in this context. As a matter of fact, no Poisson structures with order > 3 was ever found. In this connection, we would like to mention the thesis of Strack [ST], which showed (by using computer algebra) that beyond order 3, no Poisson structures of a certain form can exist for the Hamiltonian formulation of Lax equations. So this is the state of affairs for noncommutative, associative algebras. In this paper, we address the Hamiltonian formulation of Lax equations, as before, but in the context of Poisson algebras. Here, we show how to construct a natural family of compatible Poisson structures on the full algebra. On the group of invertible elements (if non-empty and forms an open subset), similar consideration shows we can even define structures of negative order. Thus the situation for Poisson algebras, in which multiplication is commutative, is entirely different. Recall that a Poisson algebra is by definition a commutative, associative algebra with unit 1 equipped with a Lie bracket such that the Leibniz rule holds [W1]. The most familiar examples of Poisson algebras are given by the collection of smooth functions on Poisson manifolds. For us, the particular examples which have partly motivated this work are the algebras associated with the truncated Benny equation [G-KR], and the various dispersionless equations [DM,K,TT] which are currently of interest in topological field theory [D,K]. 
As the reader will see, a family of vector fields Vn , n ≥ −1, plays the key role in this investigation. These vector fields Vn are invariants of degree 1 of the vector fields associated with the Lax equations, and satisfy the Virasoro relations [Vm , Vn ] = (n−m)Vm+n . For a given classical r-matrix on the Poisson algebra, we can construct the associated linear bracket. If we denote by π−1 the bivector field corresponding to this basic linear structure, we shall show that the Lie derivatives LVm π−1 essentially generate all higher order structures. Thus our construction works for an arbitrary classical r-matrix! This is in marked contrast to previous results on quadratic Poisson structures on noncommutative, associative algebras [STS1,LP1,LP2], where one has to make rather stringent assumptions on the r-matrix. In this connection, we would like to remind the reader of the important difference between the notions of double Lie algebras and Lie bialgebras. Recall that the former was motivated by the study of integrable systems [STS1] and is associated with classical r-matrices. On the other hand, the notion of Lie bialgebras had its origin in the geometry of Poisson Lie groups [DR]. The two do intersect, for example, in the class of double Lie algebras called Baxter Lie algebras [STS2] (where the r-matrix satisfies additional properties). In our case, as the r-matrix is assumed to be completely arbitrary, we are working within the framework of double Lie algebras here. The paper is organized as follows. In Sect. 2, we assemble a number of basic facts and definitions which will be used in the paper. In Sect. 3, we formulate the main result and display the explicit formulas for the linear, quadratic, and higher order structures. Then we study a number of basic properties. In order to prepare for the proof of the main result, we introduce the vector fields Vn in Sect. 4 and discuss their relation with the Lax equations. Then, in Sect. 
5, we give a proof of the main result. In order to illustrate the use of our construction in Sect. 3, we describe the multi-Hamiltonian formalism of some concrete partial differential equations in Sect. 6. Our examples include the hierarchy of truncated Benny equations [B,G-KR] in nonlinear waves, the dispersionless Toda lattice


hierarchy [DM], the dispersionless KP [K,TT] and modified KP hierarchies, and the dispersionless Dym hierarchy. Note that in each example, the set of Lax operators under consideration is a submanifold of the full Poisson algebra. However, this submanifold is not necessarily a Poisson submanifold of the full algebra equipped with a bracket which comes from Sect. 3. For this reason, the passage from the bracket on the algebra to the Hamiltonian structure on the submanifold of Lax operators might involve the process of reduction [MR]. Thus in our examples, we find that Dirac reduction [D,MR] (i.e. reduction with constraints) comes in naturally. For the Benny hierarchy and the dispersionless Toda lattice hierarchy, we shall compute the first few Poisson structures explicitly, and illustrate the use of Dirac reduction. Our explicit expressions for the structures not only allow us to find the Casimir functions, they also show that the structures which come from our Poisson algebras are of hydrodynamic type or its generalizations [DN,F]. Indeed, as it turns out, all the higher structures of the dispersionless Toda lattice hierarchy are nonlocal generalizations of brackets of hydrodynamic type. This shows how our construction in Sect. 3 can get complicated upon reduction to a specific submanifold of Lax operators. To close we stress again that our main result is formulated along the lines of the r-matrix approach (where Poisson structures are defined either on Lie algebras or their duals, or on Lie groups) and applies to all Poisson algebras satisfying the assumptions of Theorem 3.2. In any concrete application, the use of reduction techniques (where necessary) is perfectly natural and the reader should not feel uncomfortable under such circumstances.

2. Preliminaries

We collect in this section a number of basic facts, and introduce some terminology which will be used in the sequel. Let P be a smooth manifold.
A Poisson bracket $\{\cdot,\cdot\}$ on P is a Lie bracket on $C^\infty(P)$ which satisfies the derivation property in each argument. If $\pi$ is the bivector field corresponding to the bracket operation, i.e.

$$\{F, H\} = \pi(dF, dH), \tag{2.1}$$

then it is well-known that the Jacobi identity for $\{\cdot,\cdot\}$ is equivalent to $[\pi, \pi]_S = 0$ [W2], where $[\cdot,\cdot]_S$ is the Schouten bracket [S]. Recall that if $\Gamma(\wedge^k TM)$ is the space of sections of the vector bundle $\wedge^k TM$, and $\wedge^*(M) = \bigoplus_{k \ge 0} \Gamma(\wedge^k TM)$, the Schouten bracket $[\cdot,\cdot]_S$ is the bilinear map

$$[\cdot,\cdot]_S : \wedge^*(M) \times \wedge^*(M) \to \wedge^*(M) \tag{2.2}$$

which extends the usual Lie bracket operation on $\Gamma(TM)$ and makes $\wedge^*(M)$ into a Lie superalgebra. In particular, the following graded Jacobi identity holds:

$$(-1)^{pr}[u, [v, w]_S]_S + (-1)^{qp}[v, [w, u]_S]_S + (-1)^{rq}[w, [u, v]_S]_S = 0, \tag{2.3}$$

where $u \in \Gamma(\wedge^p TM)$, $v \in \Gamma(\wedge^q TM)$ and $w \in \Gamma(\wedge^r TM)$. As we mentioned in the introduction, two Poisson brackets on P are said to be compatible if their sum is also a Poisson bracket, i.e. satisfies the Jacobi identity [GDO,M]. In terms of the corresponding bivector fields $\pi_1$ and $\pi_2$, this is equivalent to $[\pi_1, \pi_2]_S = 0$, as $[\pi_i, \pi_i]_S = 0$, $i = 1, 2$. In this paper, we shall construct compatible Poisson structures for the Hamiltonian formulation of Lax equations (associated with r-matrices) when the underlying manifold P is a Poisson algebra.
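To make the notion of compatibility concrete, here is a small finite-dimensional sketch that is not taken from the paper. It works on $\mathbb{R}^3$ with brackets whose values on coordinate functions are affine, so the Jacobi identity only needs to be tested on $(x_1, x_2, x_3)$. The three sample structures (`so3`, `const`, `aff`) and all helper names are illustrative assumptions.

```python
# An affine function c0 + c1*x1 + c2*x2 + c3*x3 is stored as (c0, c1, c2, c3).
ZERO = (0, 0, 0, 0)

def add(f, g):
    return tuple(a + b for a, b in zip(f, g))

def scale(c, f):
    return tuple(c * a for a in f)

def coord_bracket(P, i, j):
    """{x_i, x_j} for a structure P given on some ordered pairs, using antisymmetry."""
    if (i, j) in P:
        return P[(i, j)]
    if (j, i) in P:
        return scale(-1, P[(j, i)])
    return ZERO

def bracket_with_affine(P, i, f):
    """{x_i, f} for affine f, by the derivation property (the constant part drops out)."""
    out = ZERO
    for l in (1, 2, 3):
        out = add(out, scale(f[l], coord_bracket(P, i, l)))
    return out

def jacobiator(P):
    """{x1,{x2,x3}} + {x2,{x3,x1}} + {x3,{x1,x2}}; it vanishes iff P is Poisson here."""
    total = ZERO
    for i, j, k in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
        total = add(total, bracket_with_affine(P, i, coord_bracket(P, j, k)))
    return total

def sum_structures(P, Q):
    pairs = ((1, 2), (1, 3), (2, 3))
    return {p: add(coord_bracket(P, *p), coord_bracket(Q, *p)) for p in pairs}

def compatible(P, Q):
    """Two Poisson structures are compatible iff their sum is again Poisson."""
    return jacobiator(sum_structures(P, Q)) == ZERO

so3 = {(1, 2): (0, 0, 0, 1), (2, 3): (0, 1, 0, 0), (3, 1): (0, 0, 1, 0)}  # Lie-Poisson so(3)
const = {(1, 2): (1, 0, 0, 0)}                                            # {x1, x2} = 1
aff = {(1, 2): (0, 1, 0, 0)}                                              # {x1, x2} = x1

print(compatible(so3, const))
print(compatible(so3, aff))
```

In this toy setting the vanishing of the Jacobiator of the sum plays the role of the condition $[\pi_1, \pi_2]_S = 0$: each structure is Poisson on its own, so any failure comes from the cross term.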


Definition 2.4. Let A be a commutative, associative algebra with unit 1. If there is a Lie bracket on A such that for each element a ∈ A, the operator $\mathrm{ad}_a : b \mapsto [a, b]$ is a derivation of the multiplication, then (A, [·, ·]) is called a Poisson algebra.

Thus the Poisson algebras are Lie algebras with an additional associative algebra structure (with commutative multiplication and unit 1) related by the derivation property to the Lie bracket. Note that some authors call the Lie bracket on A the Poisson structure on A (see, for example, [W1]), but we shall refrain from such usage in order to avoid confusion. We now recall the notion of a classical r-matrix [STS1]. Let g be a Lie algebra. A linear operator R in the space g is called a classical r-matrix if the R-bracket given by

$$[X, Y]_R = \tfrac{1}{2}\big([RX, Y] + [X, RY]\big), \quad X, Y \in g \tag{2.5}$$

is a Lie bracket, i.e. satisfies the Jacobi identity. Some well-known sufficient conditions for R ∈ End(g) to be a classical r-matrix are the Yang–Baxter equation and the modified Yang–Baxter equation. But in this paper, we can establish our results without assuming these conditions. To close this section, we define what we mean by Lax equations.

Definition 2.6. Let A be a Poisson algebra, and suppose R ∈ End(A) is a classical r-matrix. Equations of the form

$$\dot{L} = [R(X(L)), L], \quad L \in A, \tag{2.7}$$

where X : A → A is a smooth map satisfying

$$[X(L), L] = 0, \quad dX(L) \cdot [L', L] = [L', X(L)], \quad L, L' \in A, \tag{2.8}$$

are called Lax equations. The basic Lax equations on A are given by

$$\dot{L} = Z_m(L) = [R(L^m), L], \quad m \ge 1. \tag{2.9}$$

More generally, if H is a smooth ad-invariant function (in the sense defined in (3.1)), then $\dot{L} = [R(L^m\, dH(L)), L]$ is also a Lax equation.

3. A Family of Compatible Poisson Structures on Poisson Algebras

In what follows, we shall assume the Poisson algebra A is equipped with a non-degenerate ad-invariant pairing (·, ·). A function F defined on A is said to be smooth if there exists a map dF : A → A such that

$$\frac{d}{dt}\Big|_{t=0} F(L + tL') = (dF(L), L'), \quad L, L' \in A. \tag{3.1}$$

Theorem 3.2. Let A be a Poisson algebra with Lie bracket [·, ·] and non-degenerate ad-invariant pairing (·, ·) with respect to which the operation of multiplication is symmetric, i.e. (XY, Z) = (X, YZ), ∀ X, Y, Z ∈ A. Assume R ∈ End(A) is a classical r-matrix, then


(a) for each integer n ≥ −1, the formula

$$\{F, H\}_{(n)}(L) = \big(L,\ [R(L^{n+1} dF(L)), dH(L)] + [dF(L), R(L^{n+1} dH(L))]\big) \tag{3.3}$$

(where F and H are smooth) defines a Poisson structure on A,

(b) the structures $\{\cdot,\cdot\}_{(n)}$ are compatible with each other,

(c) if $\pi_n$ is the bivector field corresponding to $\{\cdot,\cdot\}_{(n)}$ and $D_{\pi_n} : \wedge^*(A) \to \wedge^*(A)$ is the associated coboundary operator, i.e. $D_{\pi_n} X = [\pi_n, X]_S$, $X \in \wedge^*(A)$, then there exist vector fields $V_m$ on A, m ≥ −1, satisfying the Virasoro relations $[V_m, V_n] = (n - m)V_{m+n}$ such that $D_{\pi_n} V_m = (n - m)\pi_{m+n}$, m, n ≥ −1.

We shall prove this result in Sect. 5, after we introduce the vector fields $V_m$ in Sect. 4 and explain what they are in relation to the Lax equations. As the reader will see, the relations $[\pi_n, V_m]_S = (n - m)\pi_{m+n}$ between the bivector fields which we establish at the beginning of Sect. 5 play the key role in proving parts (a) and (b) of the above theorem. They are also responsible for the following.

Corollary 3.4 (Involution of Casimir Functions). $\{H^0_{\pi_n}(A), H^0_{\pi_n}(A)\}_{(m+n)} = 0$, m, n ≥ −1, m ≠ n.

Proof. This follows from the formula $[\pi_n, V_m]_S(dF, dH) = \mathcal{L}_{V_m}\pi_n(dF, dH) = V_m\{F, H\}_{(n)} - \{V_m F, H\}_{(n)} - \{F, V_m H\}_{(n)}$. □

Remark 3.5. Note that from the compatibility of the structures, it follows that

$$\{H^0_{\pi_m}(A), H^0_{\pi_m}(A)\}_{(n)} \subset H^0_{\pi_m}(A). \tag{3.6}$$

We now give a number of basic properties of the Poisson structures $\{\cdot,\cdot\}_{(n)}$, n ≥ −1.

Theorem 3.7. (a) Smooth functions in A which are ad-invariant Poisson commute in $\{\cdot,\cdot\}_{(n)}$. (b) The Hamiltonian system generated by a smooth ad-invariant function H in the Poisson structure $\{\cdot,\cdot\}_{(n)}$ is given by the Lax equation $\dot{L} = [R(L^{n+1} dH(L)), L]$.

Proof. (a) If F and H are smooth functions in A which are ad-invariant, we have $[dF(L), L] = [dH(L), L] = 0$. Therefore,

$$\{F, H\}_{(n)}(L) = \big([dH(L), L],\ R(L^{n+1} dF(L))\big) + \big([L, dF(L)],\ R(L^{n+1} dH(L))\big) = 0.$$

(b) If H is ad-invariant, for any smooth F, we have

$$\{F, H\}_{(n)}(L) = \big(L,\ [dF(L), R(L^{n+1} dH(L))]\big) = \big(dF(L),\ [R(L^{n+1} dH(L)), L]\big). \qquad \square$$

From formula (3.3), it is clear that the bracket $\{\cdot,\cdot\}_{(n)}$ vanishes at the unit 1. Therefore, the linearization of $\{\cdot,\cdot\}_{(n)}$ defines a Lie bracket on A, and an easy calculation shows it coincides with the R-bracket $[\cdot,\cdot]_R$. The following result is reminiscent of the multiplicative property of Poisson Lie groups [DR]. However, it is in the context of a Poisson algebra and the reason for its validity is entirely different.

Theorem 3.8. Equip A with the structure $\{\cdot,\cdot\}_{(0)}$ and A × A with the product structure. Then the multiplication map m : A × A → A is a Poisson map.


Proof. Let F and H be smooth functions on A. For $L_1, L_2 \in A$, let $L = m(L_1, L_2)$. Clearly, F ∘ m depends on two variables and by taking its derivative with respect to the i-th variable, i = 1, 2, we obtain $d_1(F \circ m)(L_1, L_2) = L_2\, dF(L)$, $d_2(F \circ m)(L_1, L_2) = L_1\, dF(L)$. To simplify notation, let $X_1 = dF(L)$, $X_2 = dH(L)$ and denote the product structure on A × A also by $\{\cdot,\cdot\}_{(0)}$, then we have

$$\begin{aligned}
\{F \circ m, H \circ m\}_{(0)}(L_1, L_2) = {} & \big(L_1,\ [R(LX_1), L_2 X_2] + [L_2 X_1, R(LX_2)]\big) \\
& + \big(L_2,\ [R(LX_1), L_1 X_2] + [L_1 X_1, R(LX_2)]\big).
\end{aligned} \tag{*}$$

By the derivation property of [·, ·], the commutativity of multiplication and its symmetry with respect to the ad-invariant pairing (·, ·), we have

$$(L_1, [R(LX_1), L_2 X_2]) = (L, [R(LX_1), X_2]) - (L_2, [R(LX_1), L_1 X_2]).$$

Likewise,

$$(L_2, [L_1 X_1, R(LX_2)]) = (L, [X_1, R(LX_2)]) - (L_1, [L_2 X_1, R(LX_2)]).$$

When we insert these relations in (*), the result follows. □

Consider now $A_{inv}$, the group of invertible elements of A. We assume that $A_{inv} \neq \emptyset$ and that it forms an open subset of A. Then we can define vector fields $Z_{-m}$, $V_{-n}$ for m ≥ 1, n ≥ 2, on $A_{inv}$ as in formulas (4.2) and (4.5). If we define

$$\{F, H\}_{(-n)}(L) = \big(L,\ [R(L^{-n+1} dF(L)), dH(L)] + [dF(L), R(L^{-n+1} dH(L))]\big), \quad n \ge 2 \tag{3.9}$$

for smooth functions F and H on $A_{inv}$, it is easy to check that the analysis in Sect. 5 also holds for these objects. In particular, this means $\{\cdot,\cdot\}_{(-n)}$ are Poisson structures on $A_{inv}$.

Theorem 3.10. Let $\iota : A_{inv} \to A_{inv}$ be the inversion map, i.e. $\iota(L) = L^{-1}$. Then $\{F \circ \iota, H \circ \iota\}_{(n)}(L) = -\{F, H\}_{(-n)} \circ \iota(L)$, n ≥ 0, for all smooth functions F and H on $A_{inv}$.

Proof. We have $d(F \circ \iota)(L) = -L^{-2}\, dF(L^{-1})$ and so

$$\{F \circ \iota, H \circ \iota\}_{(n)}(L) = \big(L,\ [R(L^{n-1} dF(L^{-1})), L^{-2} dH(L^{-1})]\big) - (F \leftrightarrow H).$$

Now,

$$\begin{aligned}
\big(L, [R(L^{n-1} dF(L^{-1})), L^{-2} dH(L^{-1})]\big)
&= \big(L, L^{-2}[R(L^{n-1} dF(L^{-1})), dH(L^{-1})]\big) + \big(L\, dH(L^{-1}), [R(L^{n-1} dF(L^{-1})), L^{-2}]\big) \\
&= \big(L^{-1}, [R(L^{n-1} dF(L^{-1})), dH(L^{-1})]\big) + 2\big(dH(L^{-1}), [R(L^{n-1} dF(L^{-1})), L^{-1}]\big) \\
&= -\big(L^{-1}, [R(L^{n-1} dF(L^{-1})), dH(L^{-1})]\big).
\end{aligned}$$

Hence the assertion follows. □


4. Lax Equations on Poisson Algebras and Virasoro Invariants

According to Definition 2.6, corresponding to each smooth map X : A → A satisfying (2.8) is a Lax equation

$$\dot{L} = \tilde{X}(L) = [R(X(L)), L]. \tag{4.1}$$

To prepare for the proof of Theorem 3.2, we shall introduce vector fields $V_n$, n ≥ −1, on A which are related to the Lax equations. Before we do so, we first prove

Theorem 4.2. Let X, Y : A → A be smooth maps satisfying (2.8). Then $[\tilde{X}, \tilde{Y}] = 0$.

Proof. We have

$$\begin{aligned}
d\tilde{Y}(L) \cdot \tilde{X}(L) &= [R(dY(L) \cdot \tilde{X}(L)), L] + [R(Y(L)), \tilde{X}(L)] \\
&= [R([R(X(L)), Y(L)]), L] + [R(Y(L)), [R(X(L)), L]].
\end{aligned}$$

Therefore,

$$\begin{aligned}
[\tilde{X}, \tilde{Y}](L) &= 2[R([X(L), Y(L)]_R), L] + [R(Y(L)), [R(X(L)), L]] - [R(X(L)), [R(Y(L)), L]] \\
&= [2R([X(L), Y(L)]_R), L] - [[R(X(L)), R(Y(L))], L], \quad \text{by the Jacobi identity} \\
&= -\big[[R(X(L)), R(Y(L))] - 2R([X(L), Y(L)]_R),\ L\big].
\end{aligned}$$

Let $B_R(X, Y) = [RX, RY] - 2R([X, Y]_R)$. Then R is a classical r-matrix iff

$$[B_R(X, Y), Z] + [B_R(Y, Z), X] + [B_R(Z, X), Y] = 0, \quad \forall\, X, Y, Z \in A.$$

Using the ad-invariant pairing, this is equivalent to

$$\begin{aligned}
[B_R(X, Y), Z] = {} & R^*[RX, [Y, Z]] - R^*[X, R^*[Y, Z]] - [RX, R^*[Y, Z]] \\
& + R^*[RY, [Z, X]] - R^*[Y, R^*[Z, X]] - [RY, R^*[Z, X]].
\end{aligned}$$

If we now put X = X(L), Y = Y(L) and Z = L in the above relation, we obtain $[\tilde{X}, \tilde{Y}](L) = 0$, as asserted. □

The vector fields $V_n$, n ≥ −1, are defined as follows:

$$V_n(L) = L^{n+1}, \quad n \ge -1. \tag{4.3}$$
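The Virasoro relations satisfied by these vector fields depend only on the commutativity of the multiplication, so they can be checked in the simplest possible model. The sketch below is an illustration under that assumption: a vector field of the form $\sum_k c_k L^k$ is stored as a dictionary `{k: c_k}`, the derivative in a direction $X$ is $\sum_k k\, c_k L^{k-1} X$, and the bracket of vector fields is taken as $[V, W](L) = dW(L) \cdot V(L) - dV(L) \cdot W(L)$, the convention used for $[V_m, \tilde{X}]$ in Sect. 4. All function names are hypothetical.

```python
def clean(p):
    """Drop zero coefficients so that equal vector fields compare equal."""
    return {k: c for k, c in p.items() if c != 0}

def mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return clean(out)

def sub(p, q):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) - c
    return clean(out)

def deriv(p):
    """Coefficient form of d/dL; valid because the multiplication is commutative."""
    return clean({k - 1: k * c for k, c in p.items()})

def bracket(v, w):
    """[V, W](L) = dW(L).V(L) - dV(L).W(L)."""
    return sub(mul(deriv(w), v), mul(deriv(v), w))

def V(n, coeff=1):
    """coeff * V_n, where V_n(L) = L^(n+1)."""
    return clean({n + 1: coeff})

# Check [V_m, V_n] = (n - m) V_{m+n} on a range of indices m, n >= -1.
ok = all(bracket(V(m), V(n)) == V(m + n, n - m)
         for m in range(-1, 8) for n in range(-1, 8))
print(ok)
```

For $m = n = -1$ the right-hand side would involve $V_{-2}$, but its coefficient $n - m$ vanishes, so no negative power actually appears.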

Theorem 4.4. The vector fields $V_n$ satisfy the Virasoro relations $[V_m, V_n] = (n - m)V_{m+n}$, m, n ≥ −1.

Proof. Clear. □

Given a smooth manifold M and a vector field V on M, recall that a tensor field T is an invariant tensor field of V iff $\mathcal{L}_V T = 0$. Generalizing one step further, we shall say that T is an invariant tensor field of degree 1 iff $\mathcal{L}_V^2 T = 0$. The vector fields $V_n$ introduced in (4.3) above are invariants of degree 1 of the vector fields $\tilde{X}$ corresponding to the Lax equations. Indeed, we have

Theorem 4.5. If X : A → A is a smooth map satisfying (2.8), we have $\mathcal{L}_{V_m}\tilde{X} = \tilde{Y}$, where $Y(L) = dX(L) \cdot V_m(L)$.


Proof.

$$\begin{aligned}
[V_m, \tilde{X}](L) &= d\tilde{X}(L) \cdot V_m(L) - dV_m(L) \cdot \tilde{X}(L) \\
&= [R(dX(L) \cdot V_m(L)), L] + [R(X(L)), V_m(L)] - (m + 1)L^m [R(X(L)), L] \\
&= [R(dX(L) \cdot V_m(L)), L].
\end{aligned}$$

Thus, it remains to show $Y(L) = dX(L) \cdot V_m(L)$ satisfies (2.8). To do this, first note that from the condition $[X(L), L] = 0$, L ∈ A, we have $[dX(L) \cdot L', L] + [X(L), L'] = 0$, L, L′ ∈ A. Therefore,

$$[Y(L), L] = [dX(L) \cdot V_m(L), L] = -[X(L), V_m(L)] = -(m + 1)L^m [X(L), L] = 0,$$

for all L ∈ A. On the other hand, it follows from $dX(L) \cdot [L', L] = [L', X(L)]$, L, L′ ∈ A, that

$$(d^2X(L) \cdot L')([L'', L]) + dX(L) \cdot [L'', L'] = [L'', dX(L) \cdot L'], \quad L, L', L'' \in A. \tag{*}$$

Consequently, for all L, L′ ∈ A, we have

$$\begin{aligned}
dY(L) \cdot [L', L] &= (d^2X(L) \cdot [L', L])(V_m(L)) + dX(L) \cdot \big((m + 1)L^m [L', L]\big) \\
&= (d^2X(L) \cdot V_m(L))([L', L]) + dX(L) \cdot [L', V_m(L)] \\
&= [L', Y(L)], \quad \text{by (*)}. \qquad \square
\end{aligned}$$

Remark 4.6. For the vector fields $Z_n$ in (2.9), we have in particular the relations $\mathcal{L}_{V_m} Z_n = n Z_{m+n}$, m ≥ −1, n ≥ 1.

If we now combine Theorem 4.5 and Theorem 4.2, the nature of the vector fields $V_m$ is now revealed.

Corollary 4.7. $\mathcal{L}^2_{\tilde{X}} V_n = 0$, n ≥ 0, $\mathcal{L}_{\tilde{X}} V_{-1} = 0$.

5. Virasoro Action on the Bivector Fields and Compatibility of the Structures

The goal of this section is to prove Theorem 3.2. To do this, we consider the action of the vector fields $V_m$ on the bivector fields $\pi_n$ corresponding to $\{\cdot,\cdot\}_{(n)}$, n ≥ −1.

Theorem 5.1. $\mathcal{L}_{V_m}\pi_n = (n - m)\pi_{m+n}$, m, n ≥ −1.

As indicated in Sect. 3, this result is the key in proving Theorem 3.2. The demonstration of Theorem 5.1 is quite tedious, so we break it up into several steps. First, note that from the property of the Lie derivative, we have

$$\mathcal{L}_{V_m}\pi_n(dF, dH) = V_m\{F, H\}_{(n)} - \{V_m F, H\}_{(n)} - \{F, V_m H\}_{(n)}. \tag{5.2}$$

Using the expressions for $\{\cdot,\cdot\}_{(n)}$ and $V_m$, we obtain the identities in the next two lemmas. We shall omit the rather lengthy computations.

Lemma 5.3.

$$\begin{aligned}
V_m\{F, H\}_{(n)}(L) = {} & \big(V_m(L), [R(L^{n+1} dF(L)), dH(L)]\big) + \big(L, [R(L^{n+1} dF(L)), d^2H(L) \cdot V_m(L)]\big) \\
& + (n + 1)\big(L, [R(L^{m+n+1} dF(L)), dH(L)]\big) + \big(L, [R(L^{n+1} d^2F(L) \cdot V_m(L)), dH(L)]\big) \\
& - (F \leftrightarrow H),
\end{aligned}$$

where $(F \leftrightarrow H)$ denotes the terms obtained from the previous ones by switching F and H.


Lemma 5.4.

$$\begin{aligned}
\{V_m F, H\}_{(n)}(L) + \{F, V_m H\}_{(n)}(L) = {} & \big(L,\ [R(L^{n+1} d^2F(L) \cdot V_m(L)), dH(L)] + [d^2F(L) \cdot V_m(L), R(L^{n+1} dH(L))]\big) \\
& + (m + 1)\big(L,\ [R(L^{m+n+1} dF(L)), dH(L)] + [L^m dF(L), R(L^{n+1} dH(L))]\big) \\
& - (F \leftrightarrow H).
\end{aligned}$$

Proof of Theorem 5.1. By combining the expressions in Lemma 5.3 and Lemma 5.4 according to (5.2), it is clear that terms involving second derivatives cancel out, and we obtain

$$\begin{aligned}
\mathcal{L}_{V_m}\pi_n(L)(X_1, X_2) = {} & \big(V_m(L), [R(L^{n+1}X_1), X_2]\big) + (n - m)\big(L, [R(L^{m+n+1}X_1), X_2]\big) \\
& - (m + 1)\big(L, [L^m X_1, R(L^{n+1}X_2)]\big) - (1 \leftrightarrow 2),
\end{aligned} \tag{*}$$

where $X_1 = dF(L)$, $X_2 = dH(L)$. Now, by repeated application of the derivation property, the commutativity of multiplication and its symmetry with respect to (·, ·), we have

$$\begin{aligned}
\big(V_m(L), [R(L^{n+1}X_1), X_2]\big) - (1 \leftrightarrow 2)
&= \big(L, [R(L^{n+1}X_1), L^m X_2]\big) - \big(LX_2, [R(L^{n+1}X_1), L^m]\big) - (1 \leftrightarrow 2) \\
&= \big(L, [R(L^{n+1}X_1), L^m X_2]\big) - m\big(L^m X_2, [R(L^{n+1}X_1), L]\big) - (1 \leftrightarrow 2) \\
&= (m + 1)\big(L, [R(L^{n+1}X_1), L^m X_2]\big) - (1 \leftrightarrow 2) \\
&= (m + 1)\big(L, [L^m X_1, R(L^{n+1}X_2)]\big) - (1 \leftrightarrow 2).
\end{aligned}$$

If we substitute this in (*), the result follows. □

Remark 5.5. In the case of noncommutative, associative algebra, relations similar to the ones in Theorem 5.1 were obtained in [LP2] for the three structures there.

Corollary 5.6. For m, n ≥ −1,

$$[\pi_m, \pi_n]_S = -\frac{1}{n+2}[V_{n+1}, [\pi_{-1}, \pi_m]_S]_S + \frac{m-n-1}{n+2}[\pi_{-1}, \pi_{m+n+1}]_S.$$

Proof. From Theorem 5.1 and the graded Jacobi identity for the Schouten bracket, it follows that

$$\begin{aligned}
[\pi_m, \pi_n]_S &= -\frac{1}{n+2}[\pi_m, [V_{n+1}, \pi_{-1}]_S]_S \\
&= -\frac{1}{n+2}[V_{n+1}, [\pi_{-1}, \pi_m]_S]_S + \frac{1}{n+2}[\pi_{-1}, [V_{n+1}, \pi_m]_S]_S \\
&= -\frac{1}{n+2}[V_{n+1}, [\pi_{-1}, \pi_m]_S]_S + \frac{m-n-1}{n+2}[\pi_{-1}, \pi_{m+n+1}]_S. \qquad \square
\end{aligned}$$

Remark 5.7. The formulation of Corollary 5.6 is motivated by similar considerations in [AvM].


Proof of Theorem 3.2. If we set m = −1 in the identity in Corollary 5.6, we find $[\pi_{-1}, \pi_n]_S = -\frac{1}{2(n+2)}[V_{n+1}, [\pi_{-1}, \pi_{-1}]_S]_S = 0$, ∀ n ≥ −1, as $\pi_{-1}$ is the bivector field for the Lie–Poisson structure $\{\cdot,\cdot\}_{(-1)}$. From the same identity, it now follows that $[\pi_m, \pi_n]_S = 0$, ∀ m, n ≥ −1. Hence the brackets $\{\cdot,\cdot\}_{(n)}$ define compatible Poisson structures on A. Finally, the assertion in part (c) follows from Theorem 5.1. □

6. Some Examples In this section, we look at some concrete examples of partial differential equations which can be realized as Lax equations on Poisson algebras. In each case, we describe the multiHamiltonian formalism which follows from our universal construction in Sect. 3. The reader should note that in these applications, we are dealing with Lax operators which form submanifolds of the full Poisson subalgebras under consideration. Although these submanifolds of Lax operators are invariant under the dynamics of the associated Lax equations, however, they are not automatically Poisson submanifolds of the brackets which arise from the general construction in Sect. 3. For this reason, there are two kinds of situations in the examples which follows. In the happy case where the submanifold M of Lax operators does form a Poisson submanifold of (A, {·, ·}(n) ), there is of course an induced structure on M which can be obtained by simple restriction of {·, ·}(n) to M. On the other hand, when M is not a Poisson submanifold of (A, {·, ·}(n) ), the reader will see that the geometry in each case warrants the application of Dirac reduction, i.e. reduction with constraints [D,KO,MR]. Thus in this latter case, the brackets which arise from the construction in Sect. 3 serve as the starting point of a reduction process from which the constrained brackets on M are computed. In the following, we shall rescale the expression for {·, ·}(n) by the factor 21 . 6.1. The Benny hierarchy. The Benny equations in nonlinear waves [B] (we shall consider the simplest case here) are given by the quasi-linear system u0 1 u0 u0 = . (6.1) u−1 t u−1 x u−1 u0 We shall deal with the case where u0 , u−1 are smooth functions on the circle S 1 = R/Z. Following [G-KR], introduce the algebra A of Laurent polynomials in λ, having the form X ui (x)λi , (6.2) u(x, λ) = i

where the coefficients $u_i$ are smooth functions on the circle $S^1$. With the well-known Lie bracket defined by

$$[u, v]_{-1} = \frac{\partial u}{\partial \lambda}\frac{\partial v}{\partial x} - \frac{\partial u}{\partial x}\frac{\partial v}{\partial \lambda}, \quad u, v \in A, \tag{6.3}$$

it is clear that $(A, [\cdot,\cdot]_{-1})$ is a Poisson algebra. In [G-KR], the Benny equations are rewritten as a Lax equation in this Poisson algebra. Indeed, (6.1) is equivalent to

$$\frac{dL}{dt} = \Big[\tfrac14 R(L^2), L\Big]_{-1}, \tag{6.4}$$


where the Lax operator $L$ is an element of the Benny manifold

$$M_{\mathrm{Benny}} = \{ L \in A \mid L(x, \lambda) = \lambda + u_0(x) + u_{-1}(x)\lambda^{-1} \} \tag{6.5}$$

and the r-matrix $R$ is the one associated with the direct sum decomposition

$$A = A_{\ge 1} \oplus A_{\le 0} \tag{6.6}$$

into subalgebras

$$A_{\ge 1} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \ge 1} u_i(x)\lambda^i \Big\}, \tag{6.7a}$$

$$A_{\le 0} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \le 0} u_i(x)\lambda^i \Big\}. \tag{6.7b}$$
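As a check on (6.4), the computation can be carried out by hand for $L \in M_{\mathrm{Benny}}$. The following worked sketch uses only the bracket (6.3) and assumes that the r-matrix of the splitting (6.6) has the form $R = \Pi_{\ge 1} - \Pi_{\le 0}$, in analogy with (6.33) and (6.54); the symbols $\Pi_{\ge 1}$, $\Pi_{\le 0}$ denote the projections onto the two summands.

```latex
% Square of the Lax operator and its projection onto A_{>=1}
\[
L^2 = \lambda^2 + 2u_0\lambda + (u_0^2 + 2u_{-1}) + 2u_0u_{-1}\lambda^{-1} + u_{-1}^2\lambda^{-2},
\qquad
\Pi_{\ge 1}(L^2) = \lambda^2 + 2u_0\lambda .
\]
% Since [L^2, L]_{-1} = 0, one has [R(L^2), L]_{-1} = 2[\Pi_{\ge 1}(L^2), L]_{-1}, hence
\[
\begin{aligned}
\frac{dL}{dt}
 &= \tfrac12\,[\Pi_{\ge 1}(L^2), L]_{-1} \\
 &= \tfrac12\Big( (2\lambda + 2u_0)(u_{0x} + u_{-1x}\lambda^{-1})
      - 2u_{0x}\lambda\,(1 - u_{-1}\lambda^{-2}) \Big) \\
 &= (u_0u_{0x} + u_{-1x}) + (u_{-1}u_{0x} + u_0u_{-1x})\lambda^{-1}.
\end{aligned}
\]
```

Comparing the coefficients of $\lambda^0$ and $\lambda^{-1}$ gives $u_{0t} = u_0u_{0x} + u_{-1x}$ and $u_{-1t} = u_{-1}u_{0x} + u_0u_{-1x}$, which is the system (6.1).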

In view of the representation in (6.4), the quasi-linear system (6.1) is only a member of a hierarchy of Lax equations on $M_{\mathrm{Benny}}$, and this is what we call the Benny hierarchy. Note that the Poisson algebra introduced above admits the trace functional

$$\operatorname{tr}_{-1} u = \int u_{-1}(x)\,dx, \quad u \in A \tag{6.8}$$

(here and below we integrate over $S^1$) which satisfies the important property

$$\operatorname{tr}_{-1} [u, v]_{-1} = 0, \quad u, v \in A. \tag{6.9}$$

Therefore, we can equip $A$ with a non-degenerate ad-invariant pairing $(\cdot,\cdot)_{-1}$:

$$(u, v)_{-1} = \operatorname{tr}_{-1}(uv), \quad u, v \in A. \tag{6.10}$$

Thus we have all the ingredients which are required for the application of Theorem 3.2. Consequently, we have a family of Poisson structures $\{\cdot,\cdot\}_{(n)}$, $n \ge -1$, on $A$. It is easy to check that $M_{\mathrm{Benny}}$ is a Poisson submanifold of $(A, \{\cdot,\cdot\}_{(-1)})$. Therefore, the induced structure on $M_{\mathrm{Benny}}$ provides the first Poisson structure for the equations in the Benny hierarchy [G-KR]. Using $u = (u_0, u_{-1})$ as coordinates on $M_{\mathrm{Benny}}$, the associated Hamiltonian operator is given explicitly by

$$B_{(-1)}(u) = \begin{pmatrix} 0 & D \\ D & 0 \end{pmatrix}, \quad D = \frac{d}{dx}, \tag{6.11}$$

which is apparently well-known to people working in other frameworks (see, for example, [DN]). Clearly, this first structure is degenerate, with Casimirs given by $C_1(u) = \int u_0(x)\,dx$ and $C_2(u) = \int u_{-1}(x)\,dx$.

Remark 6.12. One of the advantages in formulating the Benny equations as a Lax equation on $A$ is that it automatically suggests a method of solution, namely, via a factorization problem on a symplectic diffeomorphism group. The analytic details, however, are nontrivial.


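We now turn to the higher structures. Here, it is easy to see that $M_{\mathrm{Benny}}$ is not a Poisson submanifold of any of the brackets $\{\cdot,\cdot\}_{(n)}$, $n \ge 0$. However, we shall see that we can apply Dirac reduction to $\{\cdot,\cdot\}_{(n)}$ with appropriate constraints to obtain the higher structures on $M_{\mathrm{Benny}}$. We shall illustrate the procedure for $n = 0$ and $n = 1$, thereby obtaining the second and third Poisson structures on $M_{\mathrm{Benny}}$. For $n = 0$, the Hamiltonian vector field generated by $H$ is of the form

$$\begin{aligned} X_H^{(0)}(u) &= u\,\Pi_{\le -2}([dH(u), u]_{-1}) - [\Pi_{\le 0}(u\,dH(u)), u]_{-1} \\ &= [\Pi_{\ge 1}(u\,dH(u)), u]_{-1} - u\,\Pi_{\ge -1}([dH(u), u]_{-1}). \end{aligned} \tag{6.13}$$

If $L \in M_{\mathrm{Benny}}$, it follows from this formula that the highest order term of $X_H^{(0)}(L)$ in $\lambda$ is $\lambda^0$, while the lowest order is in $\lambda^{-2}$. Using $u = (u_0, u_{-1}, u_{-2})$ as coordinates on the submanifold $\{\lambda + u_0(x) + u_{-1}(x)\lambda^{-1} + u_{-2}(x)\lambda^{-2} \in A\}$, the operator which gives $X_H^{(0)}(L)$ can be computed explicitly:

$$\begin{pmatrix} D & u_0 D + u_{0x} & u_{-1} D + u_{-1x} \\ u_0 D & 2u_{-1} D + u_{-1x} & 0 \\ u_{-1} D & 0 & -u_{-1}^2 D - u_{-1}u_{-1x} \end{pmatrix}. \tag{6.14}$$

Therefore, we can apply Dirac reduction with constraint $u_{-2} \equiv 0$ to obtain the second structure on $M_{\mathrm{Benny}}$:

$$\begin{aligned} B_0(u) &= \begin{pmatrix} D & u_0 D + u_{0x} \\ u_0 D & 2u_{-1} D + u_{-1x} \end{pmatrix} - \begin{pmatrix} u_{-1} D + u_{-1x} \\ 0 \end{pmatrix} \big(-u_{-1}^2 D - u_{-1}u_{-1x}\big)^{-1} \begin{pmatrix} u_{-1} D & 0 \end{pmatrix} \\ &= \begin{pmatrix} 2 & u_0 \\ u_0 & 2u_{-1} \end{pmatrix} D + \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} u_{0x} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} u_{-1x}. \end{aligned} \tag{6.15}$$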
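The reduction step in (6.15) can be made explicit. The sketch below only rewrites the three operators involved as compositions with $D$ (the symbol $\circ$ for operator composition is notation introduced here) and then simplifies the correction term.

```latex
% Rewrite the entries of the constrained row and column as compositions
\[
u_{-1}D + u_{-1x} = D\circ u_{-1},
\qquad
-u_{-1}^2 D - u_{-1}u_{-1x} = -\,u_{-1}\,D\circ u_{-1}.
\]
% The correction term in the (1,1) entry is therefore
\[
(D\circ u_{-1})\,\big(-u_{-1}\,D\circ u_{-1}\big)^{-1}\,(u_{-1}D)
 = -\,D\,u_{-1}\,u_{-1}^{-1}\,D^{-1}\,u_{-1}^{-1}\,u_{-1}\,D
 = -\,D .
\]
% Subtracting it changes only the (1,1) entry:  D - (-D) = 2D.
```

All other entries of the correction vanish because the second component of the constrained column and row is zero, so the upper left block of (6.14) is modified only in its $(1,1)$ entry, which gives the second line of (6.15).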

Note that this second structure is of hydrodynamic type [DN] because the associated Hamiltonian operator is of the form

$$B_0^{ij}(u) = g^{ij}(u)D + b^{ij}_k(u)\,u^k_x. \tag{6.16}$$

In this case, the metric which defines the structure (6.15) is non-degenerate where

$$\Delta = u_0^2 - 4u_{-1} \ne 0. \tag{6.17}$$
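As an illustration of how the first two structures fit together, one can check directly that the Benny flow (6.1) is Hamiltonian with respect to both (6.11) and (6.15). The functionals $H_1$, $H_2$ below, and their normalization, are chosen here for the purpose of the check and are not taken from the text.

```latex
% Candidate Hamiltonians (normalization is an assumption of this sketch)
\[
H_1 = \tfrac12\int u_0u_{-1}\,dx,
\qquad
H_2 = \tfrac12\int \big(u_0^2u_{-1} + u_{-1}^2\big)\,dx .
\]
% First structure (6.11) applied to the gradient of H_2
\[
\begin{pmatrix} 0 & D \\ D & 0 \end{pmatrix}
\begin{pmatrix} u_0u_{-1} \\ \tfrac12u_0^2 + u_{-1} \end{pmatrix}
=
\begin{pmatrix} u_0u_{0x} + u_{-1x} \\ u_{-1}u_{0x} + u_0u_{-1x} \end{pmatrix}.
\]
% Second structure (6.15) applied to the gradient of H_1
\[
\begin{pmatrix} 2D & u_0D + u_{0x} \\ u_0D & 2u_{-1}D + u_{-1x} \end{pmatrix}
\begin{pmatrix} \tfrac12u_{-1} \\ \tfrac12u_0 \end{pmatrix}
=
\begin{pmatrix} u_{-1x} + u_0u_{0x} \\ u_0u_{-1x} + u_{-1}u_{0x} \end{pmatrix}.
\]
```

Both right-hand sides agree with the right-hand side of (6.1), so the same flow is generated by $H_2$ in the first structure and by $H_1$ in the second.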

For $n = 1$, i.e. for the bracket $\{\cdot,\cdot\}_{(1)}$, we have a similar formula for the Hamiltonian vector field

$$\begin{aligned} X_H^{(1)}(u) &= u^2\,\Pi_{\le -2}([dH(u), u]_{-1}) - [\Pi_{\le 0}(u^2\,dH(u)), u]_{-1} \\ &= [\Pi_{\ge 1}(u^2\,dH(u)), u]_{-1} - u^2\,\Pi_{\ge -1}([dH(u), u]_{-1}). \end{aligned} \tag{6.18}$$

This time, the highest order term of $X_H^{(1)}(L)$ ($L \in M_{\mathrm{Benny}}$) in $\lambda$ is still $\lambda^0$, but the lowest order term is in $\lambda^{-3}$. Therefore, in the coordinates $u = (u_0, u_{-1}, u_{-2}, u_{-3})$, the operator which gives $X_H^{(1)}(L)$ is given by

$$\begin{pmatrix} 2u_0 D + u_{0x} & (u_0^2 + 3u_{-1})D + 2u_0u_{0x} + 2u_{-1x} & 2u_0u_{-1}D + 2u_0u_{-1x} + 2u_{-1}u_{0x} & u_{-1}^2 D + 2u_{-1}u_{-1x} \\ (u_0^2 + 3u_{-1})D + u_{-1x} & 4u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x} & u_{-1}^2 D + 2u_{-1}u_{-1x} & 0 \\ 2u_0u_{-1}D & u_{-1}^2 D & -2u_0u_{-1}^2 D - 2u_0u_{-1}u_{-1x} - u_{-1}^2u_{0x} & -u_{-1}^3 D - 2u_{-1}^2u_{-1x} \\ u_{-1}^2 D & 0 & -u_{-1}^3 D - u_{-1}^2u_{-1x} & 0 \end{pmatrix}. \tag{6.19}$$

To obtain the structure on $M_{\mathrm{Benny}}$, we have to use Dirac reduction with the constraints $u_{-2} \equiv 0$, $u_{-3} \equiv 0$. Accordingly, we have to invert the lower $2 \times 2$ block of (6.19):

$$\begin{pmatrix} -2u_0u_{-1}^2 D - 2u_0u_{-1}u_{-1x} - u_{-1}^2u_{0x} & -u_{-1}^3 D - 2u_{-1}^2u_{-1x} \\ -u_{-1}^3 D - u_{-1}^2u_{-1x} & 0 \end{pmatrix}^{-1} = -\begin{pmatrix} 0 & \dfrac{1}{u_{-1}} D^{-1} \dfrac{1}{u_{-1}^2} \\[2mm] \dfrac{1}{u_{-1}^2} D^{-1} \dfrac{1}{u_{-1}} & -\dfrac{1}{u_{-1}^2} D^{-1} \dfrac{u_0}{u_{-1}^2} - \dfrac{u_0}{u_{-1}^2} D^{-1} \dfrac{1}{u_{-1}^2} \end{pmatrix}. \tag{6.20}$$

Hence the Hamiltonian operator of the third structure is given by

$$\begin{aligned} B_1(u) &= \begin{pmatrix} 2u_0 D + u_{0x} & (u_0^2 + 3u_{-1})D + 2u_0u_{0x} + 2u_{-1x} \\ (u_0^2 + 3u_{-1})D + u_{-1x} & 4u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x} \end{pmatrix} \\ &\quad + \begin{pmatrix} 2u_0u_{-1}D + 2u_{-1}u_{0x} + 2u_0u_{-1x} & u_{-1}^2 D + 2u_{-1}u_{-1x} \\ u_{-1}^2 D + 2u_{-1}u_{-1x} & 0 \end{pmatrix} \begin{pmatrix} 0 & \dfrac{1}{u_{-1}} D^{-1} \dfrac{1}{u_{-1}^2} \\[2mm] \dfrac{1}{u_{-1}^2} D^{-1} \dfrac{1}{u_{-1}} & -\dfrac{1}{u_{-1}^2} D^{-1} \dfrac{u_0}{u_{-1}^2} - \dfrac{u_0}{u_{-1}^2} D^{-1} \dfrac{1}{u_{-1}^2} \end{pmatrix} \begin{pmatrix} 2u_0u_{-1}D & u_{-1}^2 D \\ u_{-1}^2 D & 0 \end{pmatrix} \\ &= \begin{pmatrix} 4u_0 & u_0^2 + 4u_{-1} \\ u_0^2 + 4u_{-1} & 4u_0u_{-1} \end{pmatrix} D + \begin{pmatrix} 2 & 2u_0 \\ 0 & 2u_{-1} \end{pmatrix} u_{0x} + \begin{pmatrix} 0 & 2 \\ 2 & 2u_0 \end{pmatrix} u_{-1x}. \end{aligned} \tag{6.21}$$

Again, this corresponds to a bracket of hydrodynamic type and the non-degeneracy of the metric is characterized by the same condition in (6.17).

Remark 6.22. (a) Alternatively, on the symplectic leaves of the first structure defined by the conditions $\int u_0(x)\,dx = \text{const}$, $\int u_{-1}(x)\,dx = \text{const}$, $B_{-1}$ is invertible and therefore we can compute the recursion operator

$$R = B_0 B_{-1}^{-1} = \begin{pmatrix} u_0 + u_{0x}D^{-1} & 2 \\ 2u_{-1} + u_{-1x}D^{-1} & u_0 \end{pmatrix}. \tag{6.23}$$

From this, we can check that $B_1 = RB_0$.

(b) In principle, one can compute all higher structures explicitly by applying Dirac reduction to $\{\cdot,\cdot\}_{(n)}$ or by using the recursion operator $R$, but the calculations are quite formidable and we do not know if there exists an efficient way to do this.


6.2. The dispersionless Toda lattice hierarchy. Let $A$ be the algebra introduced in Example 6.1, but now we equip it with the following Lie bracket:

$$[u, v]_0 = \lambda\left(\frac{\partial u}{\partial \lambda}\frac{\partial v}{\partial x} - \frac{\partial u}{\partial x}\frac{\partial v}{\partial \lambda}\right), \quad u, v \in A. \tag{6.24}$$

Then $(A, [\cdot,\cdot]_0)$ is also a Poisson algebra. The dispersionless Toda lattice hierarchy is defined by the Lax equations

$$\frac{dL}{dt} = [\Pi_k(L^n), L]_0 = -[\Pi_l(L^n), L]_0, \quad n = 1, 2, \ldots, \tag{6.25}$$

where the Lax operator $L$ is an element of the manifold

$$M_{\mathrm{dToda}} = \{ L \in A \mid L(x, \lambda) = u_1(x)\lambda + u_0(x) + u_1(x)\lambda^{-1} \} \tag{6.26}$$

and $\Pi_k$, $\Pi_l$ are the projection operators relative to the direct sum decomposition

$$A = k \oplus l \tag{6.27}$$

into subalgebras

$$k = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i > 0} u_i(x)(\lambda^i - \lambda^{-i}) \Big\}, \tag{6.28a}$$

$$l = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \le 0} u_i(x)\lambda^i \Big\}. \tag{6.28b}$$

When $n = 1$, the corresponding Lax equation reads

$$\frac{dL}{dt} = [\Pi_k(L), L]_0 \iff \begin{cases} u_{0t} = 4u_1u_{1x}, \\ u_{1t} = u_1u_{0x}. \end{cases} \tag{6.29}$$
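The equivalence in (6.29) follows from a short computation with the bracket (6.24) and the splitting (6.27). The sketch below writes it out; nothing beyond (6.24), (6.26) and (6.28) is used.

```latex
% Split L = u_1\lambda + u_0 + u_1\lambda^{-1} along A = k \oplus l
\[
\Pi_k(L) = u_1(\lambda - \lambda^{-1}),
\qquad
\Pi_l(L) = u_0 + 2u_1\lambda^{-1}.
\]
% Evaluate the bracket (6.24)
\[
\begin{aligned}
[\Pi_k(L), L]_0
 &= \lambda\Big( u_1(1 + \lambda^{-2})\,(u_{1x}\lambda + u_{0x} + u_{1x}\lambda^{-1})
      - u_{1x}(\lambda - \lambda^{-1})\,u_1(1 - \lambda^{-2}) \Big) \\
 &= \lambda\,u_1\big( u_{0x} + 4u_{1x}\lambda^{-1} + u_{0x}\lambda^{-2} \big) \\
 &= u_1u_{0x}\,\lambda + 4u_1u_{1x} + u_1u_{0x}\,\lambda^{-1}.
\end{aligned}
\]
```

The coefficients of $\lambda$ and $\lambda^{-1}$ both give $u_{1t} = u_1u_{0x}$, so the flow is tangent to $M_{\mathrm{dToda}}$, and the coefficient of $\lambda^0$ gives $u_{0t} = 4u_1u_{1x}$.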

These are the dispersionless Toda lattice equations and can be obtained from the periodic Toda lattice ODE system

$$\frac{da_k}{dt} = 2(b_k^2 - b_{k-1}^2), \qquad \frac{db_k}{dt} = b_k(a_{k+1} - a_k) \tag{6.30}$$

by taking a continuum (or long wave) limit. The Poisson algebra $(A, [\cdot,\cdot]_0)$ also has all the ingredients needed for the construction in Theorem 3.2. In this case, the invariant trace is of the form

$$\operatorname{tr}_0 u = \int u_0(x)\,dx, \quad u \in A \tag{6.31}$$

which gives rise to the non-degenerate ad-invariant pairing $(\cdot,\cdot)_0$:

$$(u, v)_0 = \operatorname{tr}_0(uv), \quad u, v \in A. \tag{6.32}$$
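The continuum limit from the lattice (6.30) to the system (6.29) mentioned above can be illustrated numerically. The following is a minimal sketch, not the construction of [DM]: it samples two smooth periodic profiles on a grid of spacing $\varepsilon = 1/N$, sets $a_k = u_0(k\varepsilon)$, $b_k = u_1(k\varepsilon)$, and compares the lattice right-hand side divided by $\varepsilon$ (i.e. with time rescaled by $\varepsilon$, an assumption of this sketch) with $4u_1u_{1x}$ and $u_1u_{0x}$. The profiles and the function names are hypothetical choices made for the example.

```python
import math


def toda_rhs(a, b):
    """Right-hand side of the periodic Toda lattice (6.30).

    a and b are lists of equal length N; indices are taken modulo N.
    """
    n = len(a)
    da = [2.0 * (b[k] ** 2 - b[k - 1] ** 2) for k in range(n)]  # b[-1] wraps around
    db = [b[k] * (a[(k + 1) % n] - a[k]) for k in range(n)]
    return da, db


def continuum_limit_error(n):
    """Largest deviation between the rescaled lattice RHS and the RHS of (6.29)."""
    eps = 1.0 / n
    two_pi = 2.0 * math.pi
    xs = [k * eps for k in range(n)]
    a = [math.sin(two_pi * x) for x in xs]        # samples of u0 (illustrative profile)
    b = [2.0 + math.cos(two_pi * x) for x in xs]  # samples of u1 > 0 (illustrative profile)
    da, db = toda_rhs(a, b)
    err = 0.0
    for k, x in enumerate(xs):
        u0x = two_pi * math.cos(two_pi * x)
        u1 = 2.0 + math.cos(two_pi * x)
        u1x = -two_pi * math.sin(two_pi * x)
        err = max(err,
                  abs(da[k] / eps - 4.0 * u1 * u1x),
                  abs(db[k] / eps - u1 * u0x))
    return err
```

The deviation is of first order in $\varepsilon$ because one-sided differences are used, so doubling $N$ should roughly halve it.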

As the r-matrix for the equations in (6.25) is given by

$$R = \Pi_k - \Pi_l, \tag{6.33}$$

it follows from (3.3) that the Hamiltonian vector field generated by $H$ in the structure $\{\cdot,\cdot\}_{(n)}$ is of the form

$$\begin{aligned} X_H^{(n)}(u) &= [\Pi_k(u^{n+1}dH(u)), u]_0 - u^{n+1}\,\Pi_l^*([dH(u), u]_0) \\ &= u^{n+1}\,\Pi_k^*([dH(u), u]_0) - [\Pi_l(u^{n+1}dH(u)), u]_0. \end{aligned} \tag{6.34}$$

Using this formula, we can now check that $M_{\mathrm{dToda}}$ is a Poisson submanifold of $(A, \{\cdot,\cdot\}_{(n)})$ only for $n = -1$, $0$, and $1$. Accordingly, the induced structures on $M_{\mathrm{dToda}}$ provide the first, second and third Poisson structures for the equations in the dispersionless Toda lattice hierarchy. Using $u = (u_0, u_1)$ as coordinates on $M_{\mathrm{dToda}}$, the Hamiltonian operator of the first structure is given explicitly by

$$B_{-1}(u) = \begin{pmatrix} 0 & u_1 \\ u_1 & 0 \end{pmatrix} D + \begin{pmatrix} 0 & u_{1x} \\ 0 & 0 \end{pmatrix}. \tag{6.35}$$

Clearly, the associated Hamiltonian vector fields preserve the sign of $u_1$. Therefore, $B_{-1}(u)$ restricts to a structure on

$$M^+_{\mathrm{dToda}} = \{ L \in A \mid L(x, \lambda) = u_1(x)\lambda + u_0(x) + u_1(x)\lambda^{-1},\ u_1(x) > 0 \} \tag{6.36}$$

whose symplectic leaves are the level sets of the Casimirs $\int u_0(x)\,dx$, $\int \ln u_1(x)\,dx$. Finally, we note that $B_{-1}(u)$ is obviously of hydrodynamic type and the corresponding metric is non-degenerate on $M^+_{\mathrm{dToda}}$ as $\det(g^{ij}) = -u_1^2$.

Remark 6.37. Note that the equations in the dispersionless Toda lattice hierarchy are (strictly) hyperbolic in $M^+_{\mathrm{dToda}}$ and we can take

$$w_1(u) = u_0 - 2u_1, \quad w_2(u) = u_0 + 2u_1 \tag{6.38}$$

as the Riemann invariants. We shall not give the proof as the reader can easily supply the details.

As for the second and third Poisson structures on $M_{\mathrm{dToda}}$, direct calculation shows the corresponding Hamiltonian operators have the form

$$B_0(u) = \begin{pmatrix} 4u_1^2 & u_0u_1 \\ u_0u_1 & u_1^2 \end{pmatrix} D + \begin{pmatrix} 0 & 0 \\ u_1 & 0 \end{pmatrix} u_{0x} + \begin{pmatrix} 4u_1 & u_0 \\ 0 & u_1 \end{pmatrix} u_{1x}, \tag{6.39}$$

$$B_1(u) = \begin{pmatrix} 8u_0u_1^2 & 4u_1^3 + u_0^2u_1 \\ 4u_1^3 + u_0^2u_1 & 2u_0u_1^2 \end{pmatrix} D + \begin{pmatrix} 4u_1^2 & 0 \\ 2u_0u_1 & u_1^2 \end{pmatrix} u_{0x} + \begin{pmatrix} 8u_0u_1 & u_0^2 + 8u_1^2 \\ 4u_1^2 & 2u_0u_1 \end{pmatrix} u_{1x}. \tag{6.40}$$

These structures also restrict to $M^+_{\mathrm{dToda}}$, and are obviously of hydrodynamic type. But in contrast to the first structure, the metrics associated with $B_0(u)$ and $B_1(u)$ are non-degenerate only on a subset of $M^+_{\mathrm{dToda}}$, characterized by the condition

$$w_1(u)\,w_2(u) \ne 0, \tag{6.41}$$

where $w_1(u)$, $w_2(u)$ are the Riemann invariants in (6.38).
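For the first flow (6.29) of the hierarchy, the verification omitted in Remark 6.37 is short. The sketch below treats only this flow; the higher flows are not checked here.

```latex
% Evolution of w_1 = u_0 - 2u_1 and w_2 = u_0 + 2u_1 under (6.29)
\[
\begin{aligned}
w_{1t} &= u_{0t} - 2u_{1t} = 4u_1u_{1x} - 2u_1u_{0x} = -2u_1\,(u_{0x} - 2u_{1x}) = -2u_1\,w_{1x},\\
w_{2t} &= u_{0t} + 2u_{1t} = 4u_1u_{1x} + 2u_1u_{0x} = \phantom{-}2u_1\,(u_{0x} + 2u_{1x}) = \phantom{-}2u_1\,w_{2x}.
\end{aligned}
\]
% The metric of (6.39) in terms of the same quantities
\[
\det\begin{pmatrix} 4u_1^2 & u_0u_1 \\ u_0u_1 & u_1^2 \end{pmatrix}
 = u_1^2\,(4u_1^2 - u_0^2) = -\,u_1^2\,w_1w_2 .
\]
```

The characteristic speeds $\mp 2u_1$ are real and distinct when $u_1 > 0$, which is strict hyperbolicity on $M^+_{\mathrm{dToda}}$, and the determinant formula shows why the metric of $B_0(u)$ degenerates exactly where (6.41) fails.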


In order to compute the higher structures, we have to invoke Dirac reduction, as in the last example. Here, we shall do this for the fourth structure as it presents new features which are also shared by all higher structures. First of all, we check that for $L \in M^+_{\mathrm{dToda}}$, we have $X_H^{(2)}(L) \in \operatorname{Im} \Pi_l^*$, and the highest order term in $\lambda$ is $\lambda^2$. Then we write down the operator which gives $X_H^{(2)}(L)$ using the coordinates $u = (u_0, u_1, u_2)$ on the submanifold where $X_H^{(2)}(L)$ lies:

$$\begin{pmatrix} (12u_1^4 + 12u_0^2u_1^2)D + 6(u_1^4 + u_0^2u_1^2)_x & (u_0^3u_1 + 12u_0u_1^3)D + 6u_1^3u_{0x} + 24u_0u_1^2u_{1x} + u_0^3u_{1x} & 2u_1^4 D + 8u_1^3u_{1x} \\ (u_0^3u_1 + 12u_0u_1^3)D + u_1(u_0^3 + 6u_0u_1^2)_x & (4u_1^4 + 3u_0^2u_1^2)D + 3u_0u_1^2u_{0x} + (8u_1^3 + 3u_0^2u_1)u_{1x} & u_1^3u_{0x} \\ 2u_1^4 D & -u_1^3u_{0x} & -u_1^4 D - 2u_1^3u_{1x} \end{pmatrix}. \tag{6.42}$$

Finally, we invoke Dirac reduction with constraint $u_2 \equiv 0$ to compute the Hamiltonian operator of the fourth structure, and the result is

$$\begin{aligned} B_2(u) &= \begin{pmatrix} 16u_1^4 + 12u_0^2u_1^2 & 12u_0u_1^3 + u_0^3u_1 \\ 12u_0u_1^3 + u_0^3u_1 & 4u_1^4 + 3u_0^2u_1^2 \end{pmatrix} D + \begin{pmatrix} 12u_0u_1^2 & 4u_1^3 \\ 3u_0^2u_1 + 8u_1^3 & 3u_0u_1^2 \end{pmatrix} u_{0x} \\ &\quad + \begin{pmatrix} 32u_1^3 + 12u_0^2u_1 & 24u_0u_1^2 + u_0^3 \\ 12u_0u_1^2 & 8u_1^3 + 3u_0^2u_1 \end{pmatrix} u_{1x} - \begin{pmatrix} 16u_1u_{1x}D^{-1}u_1u_{1x} & 4u_1u_{1x}D^{-1}u_1u_{0x} \\ 4u_1u_{0x}D^{-1}u_1u_{1x} & u_1u_{0x}D^{-1}u_1u_{0x} \end{pmatrix}. \end{aligned} \tag{6.43}$$

Thus, $B_2(u)$ has a nonlocal tail, and provides an example of a class of nonlocal Hamiltonian operators of the form

$$B^{ij}(u) = g^{ij}D + b^{ij}_k u^k_x + \sum_{\alpha=1}^{N} (w_\alpha)^i_k\, u^k_x\, D^{-1}\, (w^\alpha)^j_\ell\, u^\ell_x. \tag{6.44}$$
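The origin of the nonlocal tail in (6.43) can be seen in the $(1,1)$ entry of the Dirac reduction. The sketch below uses the three entries of (6.42) that involve the constrained coordinate $u_2$ in the first row and column, namely $2u_1^4D + 8u_1^3u_{1x}$, $2u_1^4D$ and $-u_1^4D - 2u_1^3u_{1x}$; the symbol $\circ$ denotes operator composition and is notation introduced here.

```latex
% Entries of (6.42) written as compositions
\[
2u_1^4D + 8u_1^3u_{1x} = 2D\circ u_1^4,
\qquad
-u_1^4D - 2u_1^3u_{1x} = -\,u_1^2\,D\circ u_1^2 .
\]
% Correction to the (1,1) entry:  -(column)(corner)^{-1}(row)
\[
-(2D\circ u_1^4)\,\big(-u_1^2\,D\circ u_1^2\big)^{-1}(2u_1^4D)
 = 4\,D\,u_1^2\,D^{-1}\,u_1^2\,D .
\]
% Use  D^{-1}u_1^2D = u_1^2 - 2D^{-1}u_1u_{1x}  and  D\,u_1^2\,D^{-1} = u_1^2 + 2u_1u_{1x}D^{-1}
\[
4\,D\,u_1^2\,D^{-1}\,u_1^2\,D
 = 4u_1^4D + 8u_1^3u_{1x} - 16\,u_1u_{1x}\,D^{-1}\,u_1u_{1x}.
\]
```

Adding the local part to the $(1,1)$ entry of (6.42) gives $(16u_1^4 + 12u_0^2u_1^2)D + 12u_0u_1^2u_{0x} + (32u_1^3 + 12u_0^2u_1)u_{1x}$, in agreement with (6.43), while the remaining term is the $(1,1)$ entry of the tail. The other entries arise in the same way from the multiplication operators $\pm u_1^3u_{0x}$.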

In the case where $\det(g^{ij}) \ne 0$, we note that the geometric root of such structures was discussed in [F] and applied to the chromatography equations. At this point, the reader can check that the subset of $M^+_{\mathrm{dToda}}$ where the metric associated with $B_2(u)$ is non-degenerate is likewise defined by (6.41). Also, on the symplectic leaves of the first structure where $\int u_0(x)\,dx = \text{const}$ and $\int \ln u_1(x)\,dx = \text{const}$, the recursion operator $R = B_0B_{-1}^{-1}$ exists and it is not hard to show that $B_1 = RB_0$ and $B_2 = R^2B_0$.

Remark 6.45. In [DM], the authors considered the dispersionless Toda lattice equations with boundary conditions $u_1(0) = 0$, $u_1(1) = 0$. We remark that the multi-Hamiltonian formalism of this problem can also be obtained in a similar fashion. Indeed, the only major change one has to make here is to replace the algebra above by the algebra of Laurent polynomials in $\lambda$, having the form $u(x, \lambda) = \sum_i u_i(x)\lambda^i$, where the coefficients $u_i$ are smooth functions on $I = [0, 1]$ satisfying the additional conditions $u_j(0) = u_j(1) = 0$, $j \ne 0$. Otherwise, everything goes through just the same as before. In particular, the formulas for the Hamiltonian operators of the first four structures are still those given in (6.35), (6.39), (6.40) and (6.43).
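The non-degeneracy statements for the metrics of (6.35), (6.39), (6.40) and (6.43) can be checked mechanically. The sketch below, with hypothetical function names, evaluates the determinants of the four metrics at integer points and compares them with $-u_1^2\,(w_1w_2)^m$ for $m = 0, 1, 2, 3$, where $w_1$, $w_2$ are as in (6.38). This closed form is an observation tested by the code, not a formula taken from the text. Since every polynomial involved has degree at most 8 in each variable, agreement on an $11 \times 11$ integer grid is enough to establish the identities.

```python
def metric_det(g):
    """Determinant of a 2x2 matrix given as nested lists."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


def g_first(u0, u1):
    # Coefficient of D in (6.35)
    return [[0, u1], [u1, 0]]


def g_second(u0, u1):
    # Coefficient of D in (6.39)
    return [[4 * u1**2, u0 * u1], [u0 * u1, u1**2]]


def g_third(u0, u1):
    # Coefficient of D in (6.40)
    off = 4 * u1**3 + u0**2 * u1
    return [[8 * u0 * u1**2, off], [off, 2 * u0 * u1**2]]


def g_fourth(u0, u1):
    # Coefficient of D in the local part of (6.43)
    off = 12 * u0 * u1**3 + u0**3 * u1
    return [[16 * u1**4 + 12 * u0**2 * u1**2, off],
            [off, 4 * u1**4 + 3 * u0**2 * u1**2]]


def check_determinants(grid=range(-5, 6)):
    """Return True if det g = -u1^2 (w1 w2)^m on the grid for m = 0, 1, 2, 3."""
    metrics = [g_first, g_second, g_third, g_fourth]
    for u0 in grid:
        for u1 in grid:
            w1, w2 = u0 - 2 * u1, u0 + 2 * u1
            for power, g in enumerate(metrics):
                if metric_det(g(u0, u1)) != -u1**2 * (w1 * w2) ** power:
                    return False
    return True
```

On $M^+_{\mathrm{dToda}}$, where $u_1 > 0$, these factorizations show that the first metric never degenerates and that the other three degenerate exactly where $w_1w_2 = 0$, consistent with (6.41).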


In the next two examples, we shall consider equations with infinitely many field variables. For simplicity of exposition, we shall not get into reduction calculations here, only remark that the number of constraints is still finite in each case.

6.3. The dispersionless KP hierarchy. Let $A$ be the algebra of formal Laurent series in $\lambda$, having the form

$$u(x, \lambda) = \sum_{i=-\infty}^{N(u)} u_i(x)\lambda^i, \tag{6.46}$$

where the coefficients $u_i$ are smooth functions on $S^1 = \mathbb{R}/\mathbb{Z}$. Define

$$[u, v]_{-1} = \frac{\partial u}{\partial \lambda}\frac{\partial v}{\partial x} - \frac{\partial u}{\partial x}\frac{\partial v}{\partial \lambda}, \quad u, v \in A, \tag{6.47}$$

then $(A, [\cdot,\cdot]_{-1})$ is a Poisson algebra. The (extended) dispersionless KP (dKP) hierarchy is defined by the equations

$$\frac{dL}{dt} = [\Pi_{\ge 0}(L^n), L]_{-1} = -[\Pi_{\le -1}(L^n), L]_{-1}, \quad n = 1, 2, \ldots, \tag{6.48}$$

where the Lax operator is an element of the (extended) dKP manifold

$$M_{\mathrm{dKP}} = \Big\{ L \in A \;\Big|\; L(x, \lambda) = \lambda + \sum_{i=-\infty}^{0} u_i(x)\lambda^i \Big\}, \tag{6.49}$$

and $\Pi_{\ge 0}$, $\Pi_{\le -1}$ are projection operators relative to the decomposition

$$A = A_{\ge 0} \oplus A_{\le -1} \tag{6.50}$$

into subalgebras

$$A_{\ge 0} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \ge 0} u_i(x)\lambda^i \Big\}, \tag{6.51a}$$

$$A_{\le -1} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i=-\infty}^{-1} u_i(x)\lambda^i \Big\}. \tag{6.51b}$$

In the standard form of the dKP equations [TT], the coefficient $u_0 \equiv 0$, but we shall not get into reduction calculations here. For the Poisson algebra $(A, [\cdot,\cdot]_{-1})$, the invariant trace is defined by

$$\operatorname{tr}_{-1} u = \int u_{-1}(x)\,dx, \quad u \in A, \tag{6.52}$$

and we have the non-degenerate ad-invariant pairing $(\cdot,\cdot)_{-1}$:

$$(u, v)_{-1} = \operatorname{tr}_{-1}(uv), \quad u, v \in A. \tag{6.53}$$

So again we can invoke Theorem 3.2, using the r-matrix

$$R = \Pi_{\ge 0} - \Pi_{\le -1} \tag{6.54}$$


in this case to obtain the corresponding brackets $\{\cdot,\cdot\}_{(n)}$, $n \ge -1$. Here, it is easy to check that $M_{\mathrm{dKP}}$ is a Poisson submanifold of $(A, \{\cdot,\cdot\}_{(n)})$ only for $n = -1, 0$. Therefore, the induced structures on $M_{\mathrm{dKP}}$ provide the first and second Hamiltonian structures for the equations in the hierarchy. For the bracket $\{\cdot,\cdot\}_{(1)}$, the slightly larger manifold $\big\{ u \in A \mid u(x, \lambda) = \sum_{i=-\infty}^{1} u_i(x)\lambda^i \big\}$ is a Poisson submanifold. Hence the third structure on $M_{\mathrm{dKP}}$ can be computed using Dirac reduction with constraint $u_1 \equiv 1$. We shall leave the details to the interested reader.

6.4. The dispersionless modified KP and the dispersionless Dym hierarchy. Let $(A, [\cdot,\cdot]_{-1})$ be the Poisson algebra in Example 6.3, with the same invariant pairing $(\cdot,\cdot)_{-1}$. Consider the decomposition

$$A = A_{\ge k} \oplus A_{\le k-1}, \quad k \ge 0 \tag{6.55}$$

with associated projection operators $\Pi_{\ge k}$ and $\Pi_{\le k-1}$, where

$$A_{\ge k} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i \ge k} u_i(x)\lambda^i \Big\}, \tag{6.56a}$$

$$A_{\le k-1} = \Big\{ u \in A \;\Big|\; u(x, \lambda) = \sum_{i=-\infty}^{k-1} u_i(x)\lambda^i \Big\}. \tag{6.56b}$$

Clearly, $A_{\ge k}$ is a subalgebra of $(A, [\cdot,\cdot]_{-1})$ for all $k$. On the other hand, simple verification shows that $A_{\le k-1}$ is a subalgebra of $(A, [\cdot,\cdot]_{-1})$ only for $k = 0, 1, 2$. Therefore, among the direct sum decompositions in (6.55), only the three cases $k = 0$, $1$, and $2$ lead to r-matrices, and the case $k = 0$ has already appeared in Example 6.3. We now consider the other two cases, with Lax equations

$$\frac{dL}{dt} = [\Pi_{\ge k}(L^n), L]_{-1} = -[\Pi_{\le k-1}(L^n), L]_{-1}, \quad n = 1, 2, \ldots;\ k = 1, 2. \tag{6.57}$$

For $k = 1$ and $L \in M_{\mathrm{dKP}}$, the equations in (6.57) constitute the dispersionless modified KP (dmKP) hierarchy. For $k = 2$, we obtain the dispersionless Dym hierarchy when the Lax operator $L$ is from the submanifold

$$M_{\mathrm{dDym}} = \Big\{ L \in A \;\Big|\; L(x, \lambda) = \sum_{i=-\infty}^{1} u_i(x)\lambda^i \Big\}. \tag{6.58}$$

These hierarchies are the semi-classical limit of the modified KP and the Dym hierarchies in [ANPV,KO]. For the dmKP hierarchy, with r-matrix given by $R = \Pi_{\ge 1} - \Pi_{\le 0}$, the manifold of Lax operators is a Poisson submanifold of the associated brackets $\{\cdot,\cdot\}_{(n)}$ for $n = -1, 0, 1$. Hence the induced structures on $M_{\mathrm{dKP}}$ provide the first three Poisson structures for the Hamiltonian description of dmKP. The higher structures, on the other hand, have to be computed using Dirac reduction. For the dispersionless Dym hierarchy, the situation is even better, for in this case the first five Poisson structures on $M_{\mathrm{dDym}}$ are obtained from the brackets $\{\cdot,\cdot\}_{(n)}$ ($-1 \le n \le 3$) associated with $R = \Pi_{\ge 2} - \Pi_{\le 1}$ by simple restriction. Again, the passage from $\{\cdot,\cdot\}_{(n)}$ ($n \ge 4$) to the higher structures requires the application of Dirac reduction.
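The "simple verification" that singles out $k = 0, 1, 2$ in the decompositions (6.55) is a degree count. For monomials one has $[f\lambda^i, g\lambda^j]_{-1} = (i\,f g_x - j\,f_x g)\lambda^{i+j-1}$, which vanishes identically only for $i = j = 0$ when $f$ and $g$ are generic. The sketch below encodes this observation; the function names and the finite search window are choices made for the illustration, and the genericity of the coefficients is an assumption.

```python
def bracket_degree(i, j):
    """Degree in lambda of [f*lam^i, g*lam^j]_{-1} for generic f, g.

    Returns None when the bracket vanishes identically (i = j = 0).
    """
    if i == 0 and j == 0:
        return None
    return i + j - 1


def upper_closed(k, window=6):
    """Check closure of A_{>=k} on the degrees k, ..., k + window."""
    degs = range(k, k + window + 1)
    return all(d is None or d >= k
               for d in (bracket_degree(i, j) for i in degs for j in degs))


def lower_closed(k, window=6):
    """Check closure of A_{<=k-1} on the degrees k - 1 - window, ..., k - 1."""
    degs = range(k - 1 - window, k)
    return all(d is None or d <= k - 1
               for d in (bracket_degree(i, j) for i in degs for j in degs))
```

For $k \ge 3$ the top monomials of $A_{\le k-1}$ already produce degree $2k - 3 > k - 1$, so closure fails, whereas for $k = 1$ the would-be obstruction comes from $i = j = 0$, where the bracket vanishes.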


References

[A] Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg de-Vries (KdV) type equations. Invent. Math. 50, 219–248 (1979)
[AvM] Adler, M., van Moerbeke, P.: Compatible Poisson structures and the Virasoro algebra. Comm. Pure Appl. Math. 47, 5–37 (1994)
[ANPV] Aratyn, H., Nissimov, E., Pacheva, S., Vaysburd, I.: R-matrix formulation of the KP hierarchies and their gauge equivalence. Phys. Lett. B 294, 167–176 (1992)
[B] Benny, D. J.: Some properties of long nonlinear waves. Stud. Appl. Math. 52, 45–50 (1973)
[D] Dubrovin, B.: Geometry of 2D topological field theories. In: Lecture Notes in Math., Vol. 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996
[DFIZ] DiFrancesco, P., Itzykson, C., Zuber, J.-B.: Classical W-algebras. Commun. Math. Phys. 140, 543–567 (1991)
[DM] Deift, P., McLaughlin, K. T-R.: A continuum limit of the Toda lattice. Memoirs of Am. Math. Soc. 131, no. 624 (1998)
[DN] Dubrovin, B., Novikov, S. P.: Hydrodynamics of weakly deformed soliton lattices, differential geometry and Hamiltonian theory. Russian Math. Surveys 44, 35–124 (1989)
[DO] Dorfman, I.: Dirac structures and integrability of nonlinear evolution equations. Chichester, England: J. Wiley, 1993
[DR] Drinfeld, V. G.: Hamiltonian structure on Lie groups, Lie bialgebras and the geometrical meaning of the Yang–Baxter equations. Sov. Math. Doklady 27, 69–71 (1983)
[F] Ferapontov, E. V.: Differential geometry of nonlocal Hamiltonian operators of hydrodynamic type. Funct. Anal. Appl. 25, 195–204 (1991)
[GD] Gelfand, I., Dickey, L.: A family of Hamiltonian structures related to nonlinear integrable differential equations. Preprint no. 136, Inst. Appl. Math. USSR Acad. Sci. 1978 (in Russian). English transl. In: Collected papers of I. M. Gelfand, Vol. 1. Berlin–Heidelberg–New York: Springer 1987, pp. 625–646
[GDO] Gelfand, I., Dorfman, I.: Hamiltonian operators and algebraic structures related to them. Funct. Anal. Appl. 13, 248–262 (1979)
[G-KR] Golenischeva-Kutuzova, M., Reiman, A. G.: Integrable equations, related with the Poisson algebra. J. Soviet Math. 169, 890–894 (1988)
[K] Kostant, B.: The solution to a generalized Toda lattice and representation theory. Adv. Math. 34, 195–338 (1979)
[Kri] Krichever, I. M.: The dispersionless Lax equations and topological minimal models. Commun. Math. Phys. 143, 415–426 (1991)
[KO] Konopelchenko, B., Oevel, W.: An r-matrix approach to nonstandard classes of integrable equations. Publ. RIMS, Kyoto Univ. 29, 581–666 (1993)
[KR] Kulish, P. P., Reiman, A. G.: Hierarchy of symplectic forms for the Schrodinger and the Dirac equations on the line. Zap. Nauchn. Sem. L. O. M. I. 77, 134–147 (1978) (in Russian), English transl. In: J. Soviet Math. 22, 1627–1637 (1983)
[LP1] Li, L. C., Parmentier, S.: A new class of quadratic Poisson structures and the Yang–Baxter equation. C. R. Acad. Sci., Paris Ser. I 307, 279–281 (1988)
[LP2] Li, L. C., Parmentier, S.: Nonlinear Poisson structures and r-matrices. Commun. Math. Phys. 125, 545–563 (1989)
[M] Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978)
[MR] Marsden, J., Ratiu, T.: Reduction of Poisson manifolds. Lett. in Math. Phys. 11, 161–169 (1986)
[RSTS1] Reiman, A. G., Semenov-Tian-Shansky, M. A.: A family of Hamiltonian structures, hierarchy of Hamiltonians, and reduction for first-order matrix differential operators. Funct. Anal. Appl. 14, 146–148 (1980)
[RSTS2] Reiman, A. G., Semenov-Tian-Shansky, M. A.: Group-theoretical methods in the theory of finite dimensional integrable systems. In: Dynamical Systems VII, ed. by V. I. Arnold, S. P. Novikov, Encyclopaedia of Mathematical Sciences, Vol. 16, Berlin–Heidelberg–New York: Springer-Verlag, 1994
[S] Schouten, J. A.: On the differential operators of first order in tensor calculus. Conv. di Geom. Differen. 1953, Roma: Ed. Cremonese, 1954
[ST] Strack, K.: r-Matrizen and assoziativen Algebren: eine systematische Suche nach Poisson-Klammern. Thesis (1990)
[STS1] Semenov-Tian-Shansky, M. A.: What is a classical r-matrix? Funct. Anal. Appl. 17, 259–272 (1983)
[STS2] Semenov-Tian-Shansky, M. A.: Dressing transformations and Poisson Lie group actions. Publ. RIMS, Kyoto University 21, 1237–1260 (1985)
[TT] Takasaki, K., Takebe, T.: Integrable hierarchies and dispersionless limit. Rev. Math. Phys. 7, 743–808 (1995)
[W1] Weinstein, A.: Coisotropic calculus and Poisson groupoids. J. Math. Soc. Japan 40, 705–727 (1988)
[W2] Weinstein, A.: The local structure of Poisson manifolds. J. Diff. Geom. 18, 523–557 (1983)

Communicated by B. Simon

Commun. Math. Phys. 203, 593 – 612 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Multifractal Analysis of Local Entropies for Expansive Homeomorphisms with Specification

Floris Takens, Evgeny Verbitski

Department of Mathematics, University of Groningen, P.O. Box 800, 9700 AV, Groningen, The Netherlands. E-mail: [email protected]; [email protected]

Received: 22 September 1998 / Accepted: 11 December 1998

Abstract: In the present paper we study the multifractal spectrum of local entropies. We obtain results, similar to those of the multifractal analysis of pointwise dimensions, but under much weaker assumptions on the dynamical systems. We assume our dynamical system to be defined by an expansive homeomorphism with the specification property. We establish the variational relation between the multifractal spectrum and other thermodynamical characteristics of the dynamical system, including the spectrum of correlation entropies.

1. Introduction

Recently, in the series of papers [10,11,2], L. Barreira, Ya. B. Pesin, J. Schmeling, and H. Weiss performed a complete multifractal analysis of local dimensions, entropies and Lyapunov exponents for conformal expanding maps and surface Axiom A diffeomorphisms with Gibbs measures. The main goal of these papers was primarily the analysis of the local (pointwise) dimensions. This is an extremely difficult problem and, for example, similar results for hyperbolic systems in dimensions 3 and higher have not been obtained. In the present work we concentrate our attention on the multifractal analysis of the local (pointwise) entropies. We are able to obtain results, which are similar to those mentioned above, for Gibbs measures of expansive homeomorphisms with the specification property. Note that such dynamical systems may not have Markov partitions, which is a crucial condition in the previous works. However, because less is known about the thermodynamical properties of these dynamical systems, we were able to obtain only the continuous differentiability of the multifractal spectrum of local entropies (compare: the same spectra for dynamical systems with Markov partitions are analytic). We believe that the smoothness of the multifractal spectrum in our case can be improved.


We have related the multifractal spectrum of the local entropies to the spectrum of correlation entropies. These correlation entropies serve as entropy-like analogues of the Hentschel–Procaccia and Renyi spectra of generalized dimensions. This allows us to complete the duality between the multifractal analyses of local dimensions and entropies.

2. Expansiveness and Specification

The following definitions and fundamental results are taken from [6,8,17]; for a compact presentation see [9, Chap. 20]. Throughout this paper we assume $(X, d)$ to be a compact metric space.

Definition 2.1. A homeomorphism $f : X \to X$ is called expansive if there exists a constant $\gamma > 0$ such that

$$\text{if } d(f^n(x), f^n(y)) < \gamma \text{ for all } n \in \mathbb{Z}, \text{ then } x = y. \tag{2.1}$$

The maximal $\gamma$ with such a property is called the expansivity constant. Another important property is the following.

Definition 2.2 (Bowen [6]). We say that $f : X \to X$ is a homeomorphism with the specification property (abbreviated to "a homeomorphism with specification") if for each $\delta > 0$ there exists an integer $p = p(\delta)$ such that the following holds: if
a) $I_1, \ldots, I_n$ are intervals of integers, $I_j \subseteq [a, b]$ for some $a, b \in \mathbb{Z}$ and all $j$,
b) $\operatorname{dist}(I_i, I_j) \ge p(\delta)$ for $i \ne j$,
then for arbitrary $x_1, \ldots, x_n \in X$ there exists a point $x \in X$ such that
1) $f^{b-a+p(\delta)}(x) = x$,
2) $d(f^k(x), f^k(x_i)) < \delta$ for $k \in I_i$.

The specification property guarantees good mixing properties of $f$ and a sufficient number of periodic orbits. Homeomorphisms that are expansive and have specification form a general class of "strongly chaotic" dynamical systems. For example, the following holds:

Theorem 2.3 ([9, Theorem 18.3.9]). Let $\Lambda$ be a topologically mixing compact locally maximal hyperbolic set for a diffeomorphism $f$. Then $f|_\Lambda$ has the specification property.

Remark. A generalization of the notion of a space with a hyperbolic diffeomorphism is the so-called Smale space [16]. For Smale spaces, mixing also implies specification.

3. Equilibrium States

For the multifractal analysis one needs an invariant probability measure. On an attractor there is usually one physically relevant measure (density of a generic orbit) called the SRB (Sinai–Ruelle–Bowen) measure, which often belongs to the class of equilibrium states or Gibbs measures. We introduce the last notion now. Again, let $(X, d)$ be a compact space, $f : X \to X$ a continuous map and $\varphi : X \to \mathbb{R}$ a continuous function. We shall use the following notation.


Definition 3.1. For every $n \in \mathbb{N}$ and any $x, y \in X$ define a new metric

$$d_n(x, y) = \max_{i=0,\ldots,n-1} d(f^i(x), f^i(y)),$$

and let $B_n(x, \varepsilon) = \{y \in X : d_n(x, y) < \varepsilon\}$ for $\varepsilon > 0$. The set $E \subset X$ is said to be $(n, \varepsilon)$-separated if for every $x, y \in E$ such that $x \ne y$ we have $d_n(x, y) > \varepsilon$. We say that a set $F \subset X$ is $(n, \varepsilon)$-spanning if for every $y \in X$ there exists $x \in F$ such that $d_n(x, y) < \varepsilon$. For any function $\varphi : X \to \mathbb{R}$ and $x \in X$ put

$$(S_n\varphi)(x) = \sum_{k=0}^{n-1} \varphi(f^k(x)).$$
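The two objects of Definition 3.1 are easy to compute along orbits. The following is a minimal sketch on the circle $\mathbb{R}/\mathbb{Z}$ with the doubling map, chosen here only because its orbits can be followed exactly with rational arithmetic. The doubling map is continuous but not a homeomorphism, so it illustrates the definition rather than the class of systems studied in this paper, and all function names are hypothetical.

```python
from fractions import Fraction


def doubling(x):
    """The doubling map x -> 2x mod 1 on the circle R/Z (illustrative choice)."""
    return (2 * x) % 1


def circle_dist(x, y):
    """Arc-length distance on R/Z."""
    d = abs(x - y) % 1
    return min(d, 1 - d)


def bowen_metric(f, d, x, y, n):
    """d_n(x, y) = max over i = 0..n-1 of d(f^i(x), f^i(y))."""
    best = d(x, y)
    for _ in range(n - 1):
        x, y = f(x), f(y)
        best = max(best, d(x, y))
    return best


def birkhoff_sum(f, phi, x, n):
    """(S_n phi)(x) = sum over k = 0..n-1 of phi(f^k(x))."""
    total = 0
    for _ in range(n):
        total += phi(x)
        x = f(x)
    return total
```

For two nearby points the distance doubles at every step until it reaches size comparable to 1, so the Bowen balls $B_n(x, \varepsilon)$ of this map shrink like $2^{-n}$, which is the mechanism behind the exponential scaling of measures of such balls used later.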

Now we introduce the topological pressure which will be defined on the space $C(X)$ of all continuous functions on $(X, d)$.

Definition 3.2. For $n \in \mathbb{N}$ and $\varepsilon > 0$ define

$$Z_n(\varphi, \varepsilon) = \sup_E \Big\{ \sum_{x \in E} \exp (S_n\varphi)(x) \Big\}, \tag{3.1}$$

where the supremum is taken over all $(n, \varepsilon)$-separated sets $E$. The pressure is then defined as

$$P(\varphi) = \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log Z_n(\varphi, \varepsilon). \tag{3.2}$$

The topological entropy of $f$, denoted by $h_{\mathrm{top}}(f)$, is by definition the topological pressure of $\varphi \equiv 0$. The topological pressure admits other equivalent definitions; see [21]. In particular, the following statement is known as the Variational Principle.

Theorem 3.3. Denote by $M_f(X)$ the set of all $f$-invariant Borel probability measures on $X$. Let $\varphi \in C(X)$. Then

$$P(\varphi) = \sup_{\mu \in M_f(X)} \Big( h_\mu(f) + \int \varphi\,d\mu \Big).$$

This result inspires the following definition.

Definition 3.4. An element $\mu$ of $M_f(X)$ is called an equilibrium state for the potential $\varphi$ if

$$P(\varphi) = h_\mu(f) + \int \varphi\,d\mu.$$

The equilibrium state for $\varphi \equiv 0$ (if it exists) is called a measure of maximal entropy. We recall some other basic properties of the topological pressure:

1. $P : C(X) \to \mathbb{R}$ is continuous and monotonically increasing, i.e., $\varphi \le \psi \Rightarrow P(\varphi) \le P(\psi)$.


2. One of the following holds: $P(\varphi) = +\infty$ for all $\varphi \in C(X)$, or $P(\varphi) < +\infty$ for all $\varphi \in C(X)$. Expansive homeomorphisms, which we will consider in the next sections, always have finite topological entropy and hence the pressure of every continuous function is finite.
3. $P : C(X) \to \mathbb{R}$ is convex, i.e., for all $\lambda \in [0, 1]$, $P(\lambda\varphi + (1 - \lambda)\psi) \le \lambda P(\varphi) + (1 - \lambda)P(\psi)$.
4. For any $\varphi \in C(X)$ and $c \in \mathbb{R}$ one has $P(\varphi + c) = P(\varphi) + c$.

We impose additional conditions on the class of potentials under consideration. We say that $\varphi \in V_f(X)$ if it is continuous and there exist $\varepsilon > 0$ and $K > 0$ such that for all $n \in \mathbb{N}$,

$$d(f^k(x), f^k(y)) < \varepsilon \text{ for } k = 0, \ldots, n - 1 \;\Rightarrow\; |(S_n\varphi)(x) - (S_n\varphi)(y)| < K.$$

For example, for a hyperbolic diffeomorphism $f$, any Hölder continuous function $\varphi$ is in $V_f(X)$ [9, Prop. 20.2.6].

Theorem 3.5 ([6,16,9]). If $f$ is an expansive homeomorphism with specification and $\varphi \in V_f(X)$ then there exists a unique measure $\mu_\varphi$ such that

$$P(\varphi) = h_{\mu_\varphi}(f) + \int \varphi\,d\mu_\varphi.$$

Moreover, $\mu_\varphi$ is ergodic, positive on open sets and mixing.

The equilibrium state $\mu_\varphi$ can be constructed from the measures concentrated on periodic points in the following way. For every $n \ge 1$ define a probability measure $\mu_{\varphi,n}$ supported on the set of periodic points $\operatorname{Fix}(f^n) = \{x \in X : f^n(x) = x\}$ as follows:

$$\mu_{\varphi,n} = \frac{1}{P(f, \varphi, n)} \sum_{x \in \operatorname{Fix}(f^n)} e^{(S_n\varphi)(x)}\,\delta_x, \tag{3.3}$$

where $\delta_x$ is a unit measure at $x$ and $P(f, \varphi, n) = \sum_{x \in \operatorname{Fix}(f^n)} e^{(S_n\varphi)(x)}$ is a normalizing constant.

Theorem 3.6 ([6,9]). An equilibrium state $\mu_\varphi$ is a weak$^*$ limit of the sequence $\{\mu_{\varphi,n}\}$, i.e., for every $h \in C(X)$,

$$\int h(x)\,d\mu_{\varphi,n} \to \int h(x)\,d\mu_\varphi \quad \text{as } n \to \infty.$$

For our purposes of analysis of local entropies the following result will play a key role.


Theorem 3.7 ([8, Proposition 2.1], [9, Theorem 20.3.4]). Let $f$ be an expansive homeomorphism with the specification property. Let $\varphi \in V_f(X)$ and denote its equilibrium state by $\mu_\varphi$. Then for a sufficiently small $\varepsilon > 0$ there exist constants $A_\varepsilon, B_\varepsilon > 0$ such that for all $x \in X$ and $n \ge 0$,

$$A_\varepsilon \le \frac{\mu_\varphi\big(\{y \in X : d(f^k(x), f^k(y)) < \varepsilon \text{ for } k = 0, \ldots, n - 1\}\big)}{\exp\big(-nP(\varphi) + (S_n\varphi)(x)\big)} \le B_\varepsilon. \tag{3.4}$$

Remark. Actually, the result above states that for expansive homeomorphisms with specification the equilibrium states are the so-called Gibbs measures (states) as well. See [8] for detailed discussion.

We have seen that for every $\varphi \in V_f(X)$ there exists a unique equilibrium state. Using (3.3) and (3.4) we are able to give necessary and sufficient conditions for potentials $\varphi, \psi \in V_f(X)$ to have the same equilibrium states $\mu_\varphi = \mu_\psi$.

Theorem 3.8. Let $f$ be an expansive homeomorphism with specification. The equilibrium states $\mu_\varphi$ and $\mu_\psi$ corresponding to the potentials $\varphi, \psi \in V_f(X)$ coincide if and only if there exists a constant $c \in \mathbb{R}$ such that

$$(S_n\varphi)(x) = (S_n\psi)(x) + nc \tag{3.5}$$

for all $x \in \operatorname{Fix}(f^n)$ and all $n$.

Proof. If (3.5) holds for all $x \in \operatorname{Fix}(f^n)$ and $n$, then by (3.3) one has $\mu_{\varphi,n} = \mu_{\psi,n}$ for all $n$. Thus $\mu_\varphi = \mu_\psi$. Suppose that $\mu_\varphi = \mu_\psi =: \mu$. Consider the "adjusted" potentials $\tilde\varphi = \varphi - P(\varphi)$ and $\tilde\psi = \psi - P(\psi)$. Let $x \in \operatorname{Fix}(f^n)$ for some $n \in \mathbb{N}$. Applying (3.4) for sufficiently small $\varepsilon > 0$, we conclude that

$$A^\varphi_\varepsilon \exp (S_n\tilde\varphi)(x) \le \mu(B_n(x, \varepsilon)) \le B^\psi_\varepsilon \exp (S_n\tilde\psi)(x).$$

This implies that $(S_n\tilde\varphi)(x) \le (S_n\tilde\psi)(x) + C'$ for some constant $C'$ independent of $x$ and $n$. Since $x \in \operatorname{Fix}(f^{kn})$ for all $k \in \mathbb{N}$ we have that

$$(S_n\tilde\varphi)(x) = \lim_{k \to \infty} \frac{(S_{kn}\tilde\varphi)(x)}{k} \le \lim_{k \to \infty} \frac{(S_{kn}\tilde\psi)(x)}{k} = (S_n\tilde\psi)(x).$$

By symmetry we obtain the opposite inequality. Hence

$$(S_n\tilde\varphi)(x) = (S_n\tilde\psi)(x)$$

for all $x \in \operatorname{Fix}(f^n)$ and $n \in \mathbb{N}$. This implies (3.5) with $c = P(\varphi) - P(\psi)$. □

4. Thermodynamical Formalism for Expansive Homeomorphisms with Specification

In this section we establish some technical results on the properties of the pressure for expansive homeomorphisms which will be exploited later in the proof of the main result.
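For orientation, the pressure function can be computed by hand in the simplest example. The sketch below takes the full shift on two symbols (an expansive homeomorphism with specification) and a potential $\varphi$ depending only on the zeroth symbol; this example, the function names and the closed-form expressions are standard facts about Bernoulli measures brought in for illustration and are not taken from the text. In this case the periodic-point sum $P(f, \varphi, n)$ from (3.3) equals $(e^{\varphi(0)} + e^{\varphi(1)})^n$, the pressure is $P(q\varphi) = \log(e^{q\varphi(0)} + e^{q\varphi(1)})$, and the equilibrium state of $q\varphi$ is the Bernoulli measure with weights proportional to $e^{q\varphi(i)}$.

```python
import math
from itertools import product


def periodic_partition_sum(phi, n):
    """P(f, phi, n) from (3.3) for the full shift on two symbols.

    Points of Fix(f^n) correspond to words of length n repeated periodically;
    phi = (phi0, phi1) is a potential depending only on the zeroth symbol.
    """
    total = 0.0
    for word in product((0, 1), repeat=n):
        total += math.exp(sum(phi[s] for s in word))  # exp of the Birkhoff sum
    return total


def pressure(phi, q=1.0):
    """Closed form of P(q*phi) for this example."""
    return math.log(math.exp(q * phi[0]) + math.exp(q * phi[1]))


def mean_under_equilibrium(phi, q=1.0):
    """Integral of phi with respect to the Bernoulli equilibrium state of q*phi."""
    w0, w1 = math.exp(q * phi[0]), math.exp(q * phi[1])
    return (phi[0] * w0 + phi[1] * w1) / (w0 + w1)
```

In this example $\frac{1}{n}\log P(f, \varphi, n)$ equals the pressure for every $n$, and a finite-difference derivative of $q \mapsto P(q\varphi)$ agrees with the mean of $\varphi$ under the equilibrium state of $q\varphi$, which is the kind of derivative formula established for general potentials in the lemma that follows.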


Lemma 4.1. Suppose $f : X \to X$ is an expansive homeomorphism with specification. Let $\varphi \in V_f(X)$. Then the function $P(q\varphi)$, $q \in \mathbb{R}$, is continuously differentiable with respect to $q$ and its derivative is given by

$$\frac{dP(q\varphi)}{dq} = \int \varphi\,d\mu_q,$$

where $\mu_q$ is the equilibrium state corresponding to the potential $q\varphi$. Moreover, $P(q\varphi)$ is a strictly convex function of $q$ provided the equilibrium state $\mu_\varphi$ for $\varphi$ is not a measure of maximal entropy. If $\mu_\varphi$ is the measure of maximal entropy then $P(q\varphi) - qP(\varphi) = (1 - q)h_{\mathrm{top}}(f)$ for all $q \in \mathbb{R}$.

Proof. We shall use several results from [21] to show that $P(q\varphi)$ is a differentiable function of $q$. For a moment we are going to use the fact that $f : X \to X$ is a continuous map on a compact metric space $(X, d)$ with finite topological entropy. Since the topological pressure is a continuous and convex function on $C(X)$, for every $\varphi, \psi \in C(X)$, the function

$$t \mapsto \frac{P(\varphi + t\psi) - P(\varphi)}{t}$$

is non-increasing as $t \downarrow 0$. Hence there exist right and left derivatives of $P(\varphi)$ in the direction of $\psi$, i.e.,

$$d^+P(\varphi)(\psi) = \lim_{t \to 0+} \frac{P(\varphi + t\psi) - P(\varphi)}{t}, \qquad d^-P(\varphi)(\psi) = \lim_{t \to 0-} \frac{P(\varphi + t\psi) - P(\varphi)}{t}.$$

We say that the pressure $P$ is Gâteaux differentiable at $\varphi$ if for every $\psi$ the following holds: $d^+P(\varphi)(\psi) = d^-P(\varphi)(\psi)$. This turns out to be equivalent to the condition that the map $\psi \mapsto d^+P(\varphi)(\psi)$ is linear. A linear functional $\alpha$ on $C(X)$ is called a tangent functional (subdifferential) to $P(\cdot)$ at $\varphi$ if

$$P(\varphi + \psi) - P(\varphi) \ge \alpha(\psi) \quad \text{for all } \psi \in C(X).$$

Applying the Riesz representation theorem we conclude that there exists a finite signed measure $\nu = \nu(\alpha)$ on $X$ such that

$$\alpha(\psi) = \int \psi\,d\nu \quad \text{for all } \psi \in C(X).$$

From now on we identify the tangent functional $\alpha$ with the corresponding measure $\nu$ from the Riesz representation. Denote by $t_\varphi(P)$ the set of all tangent functionals to $P$ at $\varphi$ and by $M_\varphi(X)$ the set of all equilibrium states corresponding to the potential $\varphi$. Applying the Variational Principle one concludes $M_\varphi(X) \subset t_\varphi(P)$.

Multifractal Analysis of Local Entropies


One can easily check that the pressure $P$ is Gâteaux differentiable at $\varphi$ if and only if there is a unique tangent functional $\nu$ to $P$ at $\varphi$ [21, Corollary 2], and that
\[ dP(\varphi)(\psi) = \int \psi \, d\nu. \]
Combining the results of Theorems 8.2 and 9.15 from [21] one has that for an expansive homeomorphism $f : X \to X$,
\[ M_\varphi(X) = t_\varphi(P) \]
for every $\varphi \in C(X)$. Since for every $\varphi \in V_f(X)$ the set $M_\varphi(X)$ consists of a single element (uniqueness of equilibrium states), the pressure $P$ is Gâteaux differentiable at any $\varphi \in V_f(X)$ and
\[ \frac{d}{dt} P(\varphi + t\psi)\Big|_{t=0} = \int \psi \, d\mu_\varphi \tag{4.1} \]
for all $\psi \in C(X)$. This proves the differentiability of the pressure function $P(q\varphi)$ at $q = 1$. The result for all other $q$ follows in the same manner since $q\varphi \in V_f(X)$ for every $q \in \mathbb{R}$ if $\varphi \in V_f(X)$.

If a convex function is differentiable, then its derivative is continuous. Since we have already established the differentiability of $P(q\varphi)$ (and it is convex) we obtain the desired result.

Now we are going to establish the strict convexity of $P(q\varphi)$. Suppose $\mu_\varphi$ is not a measure of maximal entropy. Then applying the result of Theorem 3.8 we conclude that the equilibrium states $\mu_{q_1}$ and $\mu_{q_2}$, corresponding to the potentials $q_1\varphi$ and $q_2\varphi$ respectively, are not equal if $q_1 \ne q_2$. Indeed, assume $\mu_{q_1} = \mu_{q_2}$ for some $q_1 \ne q_2$. Then by Theorem 3.8 we conclude that for some constant $c$,
\[ (S_n q_1\varphi)(x) = (S_n q_2\varphi)(x) + nc \]
for all $n$ and $x \in \mathrm{Fix}(f^n)$. This implies that $(S_n\varphi)(x) = n\tilde{c}$ with $\tilde{c} = c/(q_1 - q_2)$. Applying Theorem 3.8 again one has that the equilibrium state $\mu_\varphi$ and the equilibrium state $\mu_0$, corresponding to the potential $\psi \equiv 0$, are equal. This means that $\mu_\varphi$ is the measure of maximal entropy, which contradicts the assumption. Therefore $\mu_{q_1} \ne \mu_{q_2}$ if $q_1 \ne q_2$.

A function $h : \mathbb{R} \to \mathbb{R}$ is called strictly convex if for every $q_0 \in \mathbb{R}$ there exists $\lambda(q_0) \in \mathbb{R}$ such that
\[ h(q) > h(q_0) + \lambda(q_0)(q - q_0) \quad \text{for all } q \ne q_0. \]

Put $\lambda(q_0) = \int \varphi \, d\mu_{q_0}$ for any $q_0 \in \mathbb{R}$. Since $\mu_q \ne \mu_{q_0}$ for $q \ne q_0$ and $\mu_q$ is the unique equilibrium state for $q\varphi$, one has
\begin{align*}
P(q\varphi) &= h_{\mu_q}(f) + \int q\varphi \, d\mu_q \\
 &= \sup_{\mu \in M_f(X)} \Big( h_\mu(f) + \int q\varphi \, d\mu \Big) \\
 &> h_{\mu_{q_0}}(f) + \int q\varphi \, d\mu_{q_0} \\
 &= h_{\mu_{q_0}}(f) + \int q_0\varphi \, d\mu_{q_0} + (q - q_0)\int \varphi \, d\mu_{q_0} \\
 &= P(q_0\varphi) + \lambda(q_0)(q - q_0).
\end{align*}


This means that $P(q\varphi)$ is a strictly convex function.

If the equilibrium state $\mu_\varphi$ is indeed a measure of maximal entropy, then $\mu_\varphi = \mu_{q\varphi} =: \mu$ for all $q \in \mathbb{R}$. This is a consequence of Theorems 3.5 and 3.8. Then applying the Variational Principle to $\mu_\varphi$ and $\mu_{q\varphi}$ we conclude that
\[ P(q\varphi) = h_\mu(f) + q\int \varphi \, d\mu, \qquad P(\varphi) = h_\mu(f) + \int \varphi \, d\mu, \]
where $h_\mu(f) = h_{top}(f)$ since $\mu$ is the measure of maximal entropy. The result follows immediately. □

Remark. Much stronger results on the smoothness of the pressure are known. For example, the analyticity of the pressure has been established for Smale spaces [16], i.e., generalizations of Axiom A diffeomorphisms. The key property which these systems inherit from hyperbolic dynamical systems is the so-called local product structure, which in turn guarantees the existence of Markov partitions. The known methods of establishing the analyticity of the pressure rely strongly on this Markov structure. Expansive homeomorphisms with specification do not necessarily have Markov partitions. For expansive homeomorphisms with specification we were able to prove only the continuous differentiability of the pressure. However, we believe that this result can be improved.

Definition 4.2. We say that $E$ is a maximal $(n, \varepsilon)$-separated set if it cannot be enlarged by adding new points while preserving the separation condition.

It is easy to see that every maximal $(n, \varepsilon)$-separated set $E$ is an $(n, \varepsilon)$-spanning set as well. The following estimates from [8] will be used later.

Lemma 4.3. Let $f$ be an expansive homeomorphism and $\gamma > 0$ be its expansivity constant. Let $\varphi \in V_f(X)$. For every finite set $E$ put
\[ Z_n(\varphi, E) = \sum_{x \in E} \exp\big((S_n\varphi)(x)\big). \]

1. If $\varepsilon, \varepsilon' < \gamma/2$ and $E$, $E'$ are maximal $(n, \varepsilon)$- and $(n, \varepsilon')$-separated sets respectively, then one has $Z_n(\varphi, E) \le C Z_n(\varphi, E')$, where the constant $C = C(\varepsilon, \varepsilon')$ is independent of $n$. In particular,
\[ P(\varphi) = \lim_{n\to\infty} \frac{1}{n} \log Z_n(\varphi, E_n), \tag{4.2} \]
where $E_n$ are arbitrary maximal $(n, \varepsilon)$-separated sets.
2. If furthermore $f$ satisfies the specification property and $\varepsilon < \gamma/2$, then there exists a constant $D = D(\varphi, \varepsilon) > 0$ such that
\[ |\log Z_n(\varphi, E_n) - nP(\varphi)| < D \tag{4.3} \]
for all $n$ and all maximal $(n, \varepsilon)$-separated sets.
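The estimates (4.2) and (4.3) can be illustrated numerically. The following Python sketch is not taken from the paper: it assumes the full shift on two symbols with a potential depending on two consecutive coordinates, and it sums over the points of Fix(f^n) in place of a maximal (n, ε)-separated set. That substitution, the matrix `phi`, and all function names are assumptions made for illustration only. In this setting the pressure is the logarithm of the largest eigenvalue of the 2 × 2 matrix with entries exp(phi[i][j]), so the bounded deviation in (4.3) can be checked directly.

```python
import itertools
import math


def birkhoff_sum_periodic(word, phi):
    """S_n(phi) at the periodic point obtained by repeating `word` forever."""
    n = len(word)
    return sum(phi[word[k]][word[(k + 1) % n]] for k in range(n))


def partition_function(n, phi):
    """Z_n: sum of exp(S_n phi) over all points of period n (hypothetical choice of E_n)."""
    symbols = range(len(phi))
    return sum(
        math.exp(birkhoff_sum_periodic(word, phi))
        for word in itertools.product(symbols, repeat=n)
    )


def pressure_two_symbols(phi):
    """log of the largest eigenvalue of the 2x2 matrix M[i][j] = exp(phi[i][j])."""
    a, b = math.exp(phi[0][0]), math.exp(phi[0][1])
    c, d = math.exp(phi[1][0]), math.exp(phi[1][1])
    trace, det = a + d, a * d - b * c
    # The discriminant equals (a - d)^2 + 4bc, which is positive here.
    return math.log((trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0)


if __name__ == "__main__":
    phi = [[0.2, -0.5], [0.1, -1.0]]  # illustrative potential values
    P = pressure_two_symbols(phi)
    for n in (1, 2, 4, 8, 12):
        log_Z = math.log(partition_function(n, phi))
        print(n, log_Z / n, log_Z - n * P)
```

In this example the second printed column approaches P, as in (4.2), while the third stays bounded in n, as in (4.3).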


5. Topological Entropy for Non-Compact Sets

The generalization of the topological entropy to non-compact or non-invariant sets goes back to Bowen [5]. Later Pesin and Pitskel [13] generalized the notion of pressure to the case of non-compact sets. Note that by definition topological entropy is the topological pressure for $\varphi \equiv 0$. Now we give the formal definition of the topological entropy of a non-compact or non-invariant set.

Suppose $f : X \to X$ is a continuous map on a compact metric space $(X, d)$. Let $\mathcal{U} = \{U_1, \dots, U_M\}$ be a finite open cover of $X$. By definition, a string $\mathbf{U}$ is a sequence $U_{i_1} \dots U_{i_n}$ with $i_k \in \{1, \dots, M\}$; its length $n$ is denoted by $n(\mathbf{U})$. The collection of all strings of length $n$ is denoted by $W_n(\mathcal{U})$. For each $\mathbf{U} \in W_n(\mathcal{U})$ define the open set
\[ X(\mathbf{U}) = U_{i_1} \cap f^{-1}U_{i_2} \cap \dots \cap f^{-n+1}U_{i_n} = \{x \in X : f^{k-1}x \in U_{i_k},\ k = 1, \dots, n\}. \]
We say that a collection of strings $\Gamma$ covers a set $Z \subset X$ if
\[ \bigcup_{\mathbf{U} \in \Gamma} X(\mathbf{U}) \supset Z. \]

For every real number $s$ introduce
\[ M(Z, s, \mathcal{U}) = \lim_{N\to\infty} \inf_{\Gamma} \sum_{\mathbf{U} \in \Gamma} \exp(-n(\mathbf{U})s), \]
where the infimum is taken over all collections $\Gamma \subseteq \bigcup_{n \ge N} W_n(\mathcal{U})$ covering $Z$. There exists a unique value $\hat{s}$ at which $M(Z, \cdot, \mathcal{U})$ jumps from $+\infty$ to $0$,
\[ h(Z, \mathcal{U}) := \hat{s} = \sup\{s : M(Z, s, \mathcal{U}) = +\infty\} = \inf\{s : M(Z, s, \mathcal{U}) = 0\}. \]
Finally, one can show that the following limit exists:
\[ h_{top}(f|_Z) := \lim_{\mathrm{diam}(\mathcal{U}) \to 0} h(Z, \mathcal{U}). \]

Definition 5.1. The number $h_{top}(f|_Z)$ is called the topological entropy of $f$ restricted to the set $Z$, or, simply, the topological entropy of $Z$.

This definition of the topological entropy is similar to the definition of the Hausdorff dimension (the diameters of the covering open sets are replaced by $\exp(-n(\mathbf{U}))$, which can be treated as a "dynamical diameter" of $X(\mathbf{U})$). Indeed, these definitions are particular cases of the so-called Carathéodory dimension characteristics [14].

Theorem 5.2 ([12]). The topological entropy as defined above has the following properties:
1. $h_{top}(f|_{Z_1}) \le h_{top}(f|_{Z_2})$ for any $Z_1 \subset Z_2 \subset X$;
2. $h_{top}(f|_Z) = \sup_i h_{top}(f|_{Z_i})$, where $Z = \bigcup_{i=1}^{\infty} Z_i \subset X$;
3. if $\mu$ is an invariant measure such that $\mu(Z) = 1$, then $h_{top}(f|_Z) \ge h_\mu(f)$.


6. Local Entropy

In this section we give the definition of local entropy. The fundamental result on its existence and properties is the Brin–Katok formula below. Using the notation from Sect. 3 we introduce the lower and upper local entropies at $x \in X$ as follows:
\[ \underline{h}_\mu(f, x) := \lim_{\varepsilon\to 0} \liminf_{n\to\infty} -\frac{1}{n} \log \mu(B_n(x, \varepsilon)), \tag{6.1} \]
\[ \overline{h}_\mu(f, x) := \lim_{\varepsilon\to 0} \limsup_{n\to\infty} -\frac{1}{n} \log \mu(B_n(x, \varepsilon)). \tag{6.2} \]

Note that the limits in $\varepsilon$ exist due to monotonicity. We say that the local entropy exists at $x$ if
\[ \underline{h}_\mu(f, x) = \overline{h}_\mu(f, x). \tag{6.3} \]

In this case the common value will be denoted by $h_\mu(f, x)$.

Theorem 6.1 (Brin–Katok formula, [7]). Let $f : X \to X$ be a continuous map on a compact metric space $(X, d)$ preserving a non-atomic Borel measure $\mu$. Then
1. for $\mu$-a.e. $x \in X$ the local entropy exists, i.e., $\underline{h}_\mu(f, x) = \overline{h}_\mu(f, x) = h_\mu(f, x)$;
2. $h_\mu(f, x)$ is an $f$-invariant function of $x$, and
\[ \int h_\mu(f, x) \, d\mu = h_\mu(f), \]
where $h_\mu(f)$ is the measure-theoretic entropy of $f$.

Remark. If $\mu$ is ergodic then $h_\mu(f, x) = h_\mu(f)$ for $\mu$-a.e. $x \in X$.

Lemma 6.2. Let $f$ be an expansive homeomorphism with specification. Consider an equilibrium state $\mu = \mu_\varphi$ for the potential $\varphi \in V_f(X)$. For every $x \in X$ put
\[ \varphi_*(x) = \liminf_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)), \qquad \varphi^*(x) = \limsup_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)). \]
Then
\[ \underline{h}_\mu(f, x) = P(\varphi) - \varphi^*(x), \qquad \overline{h}_\mu(f, x) = P(\varphi) - \varphi_*(x), \]
for all $x \in X$. Therefore $\underline{h}_\mu(f, x) = \overline{h}_\mu(f, x)$ if and only if $\varphi_*(x) = \varphi^*(x)$.


Proof. Using the estimate from Theorem 3.7 we conclude that for every sufficiently small $\varepsilon > 0$ and some constants $C_1$, $C_2$ one has
\[ \frac{C_1}{n} + P(\varphi) - \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) \le -\frac{1}{n} \log \mu(B_n(x, \varepsilon)) \le \frac{C_2}{n} + P(\varphi) - \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) \]
for all $n \ge 1$ and every $x \in X$. The statement follows easily.

7. Multifractal Spectrum for Local Entropies

Following [2] we introduce a multifractal spectrum for (local) entropies. For every $\alpha$ consider a level set of local entropy
\[ K_\alpha = \{x \in X : h_\mu(f, x) = \alpha\}, \tag{7.1} \]
and the corresponding multifractal decomposition into level sets
\[ X = \bigcup_{\alpha} K_\alpha \cup \{x \in X : h_\mu(f, x) \text{ does not exist}\}. \tag{7.2} \]

We use the topological entropy, defined in Sect. 5, to measure the "size" of the sets $\{K_\alpha\}$. Namely, define a multifractal spectrum for local entropies as follows:
\[ E_E(\alpha) = h_{top}(f|_{K_\alpha}). \tag{7.3} \]

This notation needs a brief explanation: the two E's stand for the topological Entropy of a level set of local Entropy. For other multifractal spectra $D_E$, $E_D$, $D_D$, see [2].

From a general multifractal formalism one expects $E_E(\alpha)$ to be smooth and concave on a certain interval of $\alpha$'s. We are able to establish this in the case of equilibrium states for expansive homeomorphisms with specification. The crucial observation which we exploit in the proof is the following. Let $\mu = \mu_\varphi$ be an equilibrium state for a potential $\varphi$. Then applying the result of the previous section one gets that
\[ x \in K_\alpha \iff h_\mu(f, x) = \alpha \iff \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) = P(\varphi) - \alpha. \tag{7.4} \]

Therefore, the level sets of local entropies are exactly the level sets of limits of ergodic averages of $\varphi$. From the Ergodic Theorem one concludes that only one of these level sets has full measure, while the others are of measure 0.

We adopt a technique for estimating the topological entropy of these level sets from [2]. The main idea is the following: we introduce a 1-parameter family of measures such that for each $\alpha$ with $K_\alpha \ne \emptyset$ there is exactly one measure in the family for which $K_\alpha$ has full measure. These measures $\mu_q$ are the equilibrium states for the potentials $\varphi_q = q\varphi - P(q\varphi)$. However, for the correspondence between the level sets $\{K_\alpha\}$ and the measures $\{\mu_q\}$ we need a parameterization $\alpha(q)$ such that
\[ \mu_q(K_{\tilde\alpha}) = \begin{cases} 1, & \text{if } \tilde\alpha = \alpha(q), \\ 0, & \text{if } \tilde\alpha \ne \alpha(q). \end{cases} \]


The parameterization can be given as follows: first define
\[ T(q) = P(q\varphi) - qP(\varphi), \]
and $\alpha(q) = -T'(q)$ (note that $T$ is $C^1$ by Lemma 4.1). Below we will establish that
\[ h_{top}(f|_{K_{\alpha(q)}}) = h_{\mu_q}(f), \]
i.e., $\mu_q$ is the measure with maximal metric entropy among all invariant measures $\nu$ such that $\nu(K_{\alpha(q)}) = 1$. In order to complete the analysis we have to show that $K_\alpha = \emptyset$ for every $\alpha \notin [\inf_q \alpha(q), \sup_q \alpha(q)]$.

8. Main Result

In this section we state our main result. It is exactly in the form of the corresponding results from [2,10] for the multifractal analysis of local (pointwise) dimensions. We follow the same notation and order of statements. The last statement of our theorem is analogous to Remark 5 in [10]. It relates the multifractal spectra of the local entropies to the spectra $h_\mu(f, q)$ of the correlation entropies (the analogue of the Hentschel–Procaccia spectra for dimensions $HP(q)$) and $R_\mu(f, q)$ (the analogue of the Rényi spectra of dimensions $R(q)$). Although it would be natural to call $R_\mu(f, q)$ the Rényi spectra of entropies, it might cause some confusion, since there exists a different notion called the Rényi entropy of order $q$ [4,20].

Theorem 8.1. Let $f$ be an expansive homeomorphism with the specification property of a compact metric space $(X, d)$. Let $\varphi \in V_f(X)$ and $\mu = \mu_\varphi$ be the corresponding equilibrium state. Then

1. For $\mu$-a.e. $x \in X$ the local entropy at $x$ exists and
\[ h_\mu(f, x) = h_\mu(f) = P(\varphi) - \int \varphi \, d\mu. \]

2. For any $q \in \mathbb{R}$ define the function $T(q) = P(q\varphi) - qP(\varphi)$. Then $T(q)$ is a convex $C^1$ function of $q$. Moreover, $T(0) = h_{top}(f)$, $T(1) = 0$; for every $q \in \mathbb{R}$ one has
\[ T'(q) = \int \varphi \, d\mu_q - P(\varphi) \le 0, \]
where $\mu_q$ is the equilibrium state for $\varphi_q = q\varphi - P(q\varphi)$.

3. Put $\alpha(q) = -T'(q)$. Then
\[ E_E(\alpha(q)) := h_{top}(f|_{K_{\alpha(q)}}) = T(q) + q\alpha(q). \]
Define
\[ \underline{\alpha} = \inf_q \alpha(q) = \lim_{q\to+\infty} \alpha(q), \qquad \overline{\alpha} = \sup_q \alpha(q) = \lim_{q\to-\infty} \alpha(q). \]
Then $K_\alpha = \emptyset$ if $\alpha \notin [\underline{\alpha}, \overline{\alpha}]$. This means that the domain of the multifractal spectrum for local entropies $\alpha \mapsto E_E(\alpha)$ is the range of the function $q \mapsto -T'(q)$.


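4. If the equilibrium state $\mu$ for the potential $\varphi$ is not a measure of maximal entropy, then the relation between $E_E$ and $T(q)$ can be written in the following variational form:
\[ E_E(\alpha) = \inf_{q \in \mathbb{R}} \big(T(q) + q\alpha\big) \quad \text{for } \alpha \in (\underline{\alpha}, \overline{\alpha}), \]
\[ T(q) = \sup_{\alpha \in (\underline{\alpha}, \overline{\alpha})} \big(E_E(\alpha) - q\alpha\big) \quad \text{for } q \in \mathbb{R}. \]
This implies that $E_E$ is strictly concave and continuously differentiable on $(\underline{\alpha}, \overline{\alpha})$ with the derivative given by $E_E'(\alpha) = q$, where $q \in \mathbb{R}$ is such that $\alpha = -T'(q)$.

5. For every $q \in \mathbb{R}$, $q \ne 1$, the following limits exist:
\[ h_\mu(f, q) = \lim_{\varepsilon\to 0} \lim_{n\to\infty} -\frac{1}{n(q-1)} \log \int \mu(B_n(x, \varepsilon))^{q-1} \, d\mu, \]
\[ R_\mu(f, q) = \lim_{\varepsilon\to 0} \lim_{n\to\infty} -\frac{1}{n(q-1)} \log \sup_{E} \sum_{x \in E} \mu(B_n(x, \varepsilon))^{q}, \]
where the supremum is taken over all $(n, \varepsilon)$-separated sets $E$. For $q \ne 1$ one has
\[ h_\mu(f, q) = R_\mu(f, q) = -\frac{T(q)}{q-1}. \]
The family of correlation entropies $h_\mu(f, q)$ depends continuously on $q$ and
\[ h_\mu(f, 0) = h_{top}(f), \qquad h_\mu(f, 1) := \lim_{q\to 1} h_\mu(f, q) = h_\mu(f). \]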
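The quantities in Theorem 8.1 can be computed in closed form in a simple special case. The sketch below is an illustration and not part of the paper: it assumes the full shift on finitely many symbols with the potential φ(x) = log p_{x_0} for a probability vector p, so that P(qφ) = log Σ_i p_i^q and μ_q is the Bernoulli measure with weights proportional to p_i^q. The function names and the sample vector are hypothetical. Under these assumptions one can check T(0) = h_top(f), T(1) = 0, α(q) = −T′(q), and the identity T(q) + qα(q) = h_{μ_q}(f).

```python
import math


def pressure(q, p):
    """P(q*phi) = log sum_i p_i^q for the potential phi(x) = log p[x_0]."""
    return math.log(sum(pi ** q for pi in p))


def T(q, p):
    """T(q) = P(q*phi) - q*P(phi)."""
    return pressure(q, p) - q * pressure(1.0, p)


def tilted_weights(q, p):
    """One-symbol marginals of the equilibrium state mu_q (a Bernoulli measure here)."""
    z = sum(pi ** q for pi in p)
    return [pi ** q / z for pi in p]


def alpha(q, p):
    """alpha(q) = -T'(q) = P(phi) - integral of phi with respect to mu_q."""
    w = tilted_weights(q, p)
    return pressure(1.0, p) - sum(wi * math.log(pi) for wi, pi in zip(w, p))


def entropy(w):
    """Entropy of the Bernoulli measure with weights w."""
    return -sum(wi * math.log(wi) for wi in w if wi > 0.0)


def spectrum(q, p):
    """E_E(alpha(q)) = T(q) + q*alpha(q)."""
    return T(q, p) + q * alpha(q, p)


if __name__ == "__main__":
    p = [0.3, 0.7]  # illustrative probability vector
    for q in (-2.0, 0.0, 0.5, 1.0, 3.0):
        print(q, alpha(q, p), spectrum(q, p), entropy(tilted_weights(q, p)))
```

In this example the last two printed columns agree for every q, which is the equality E_E(α(q)) = h_{μ_q}(f) in this special case. At q = 1 the value is the entropy of μ itself, and at q = 0 it is log 2 = h_top(f).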

Proof. (1) The first statement is a consequence of the Brin–Katok formula for ergodic dynamical systems (Theorem 6.1).

(2) The smoothness and convexity properties of $T$ follow directly from Lemma 4.1. We calculate the derivative of $T$ with respect to $q$. Using the formula from Lemma 4.1 one gets
\[ T'(q) = \int \varphi \, d\mu_q - P(\varphi), \tag{8.1} \]
where $\mu_q$ is the equilibrium state for the potential $\varphi_q = q\varphi - P(q\varphi)$. The inequality $T'(q) \le 0$ follows from the Variational Principle applied to $\varphi$.

(3) This statement is taken from [2] where it has not been proved. For the sake of completeness we give the proof here. Let us first calculate the measure-theoretic entropy of the equilibrium state $\mu_q$. From the Variational Principle for $\mu_q$ we have
\begin{align*}
h_{\mu_q}(f) &= P(\varphi_q) - \int \varphi_q \, d\mu_q \\
 &= 0 + T(q) + qP(\varphi) - q\int \varphi \, d\mu_q \tag{8.2} \\
 &= T(q) + q\Big(P(\varphi) - \int \varphi \, d\mu_q\Big) = T(q) + q\alpha(q),
\end{align*}


where $\alpha(q) = -T'(q)$ and we use formula (8.1) for the derivative of $T(q)$. As we have seen in Lemma 6.2, for any $\alpha$ one has
\[ h_\mu(f, x) = \alpha \quad \text{if and only if} \quad \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) = P(\varphi) - \alpha. \]

Let us now apply Lemma 6.2 to the equilibrium state $\mu_q$ corresponding to the potential $q\varphi$. Similarly one gets that for every $\beta$,
\[ h_{\mu_q}(f, x) = \beta \quad \text{if and only if} \quad q \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) = P(q\varphi) - \beta. \]

Hence one concludes that
\[ h_\mu(f, x) = \alpha \quad \text{if and only if} \quad h_{\mu_q}(f, x) = P(q\varphi) - qP(\varphi) + q\alpha. \]

For $\alpha = \alpha(q)$ we get
\[ x \in K_{\alpha(q)} \quad \text{if and only if} \quad h_{\mu_q}(f, x) = T(q) + q\alpha(q). \tag{8.3} \]

Combining the results of (8.2) and (8.3) one gets
\[ h_{\mu_q}(f) = T(q) + q\alpha(q), \qquad h_{\mu_q}(f, x) = T(q) + q\alpha(q) \quad \text{if and only if } x \in K_{\alpha(q)}. \]

This means that $h_{\mu_q}(f, x) = h_{\mu_q}(f)$ if and only if $x \in K_{\alpha(q)}$. Since $\mu_q$ is ergodic, we know from the Brin–Katok formula that $h_{\mu_q}(f, x) = h_{\mu_q}(f)$ for $\mu_q$-a.e. $x \in X$. Hence we conclude that
\[ \mu_q(K_{\alpha(q)}) = \mu_q(\{x : h_{\mu_q}(f, x) = h_{\mu_q}(f)\}) = 1. \]
Therefore we have obtained the desired parameterization of the level sets.

We have to compute the topological entropy of $f$ restricted to $K_{\alpha(q)}$,
\[ E_E(\alpha(q)) := h_{top}(f|_{K_{\alpha(q)}}). \]
Using the properties of the topological entropy from Theorem 5.2 we conclude that
\[ E_E(\alpha(q)) = h_{top}(f|_{K_{\alpha(q)}}) \ge h_{\mu_q}(f) = T(q) + q\alpha(q), \]
since $\mu_q(K_{\alpha(q)}) = 1$.

We have to prove the opposite inequality. For this it is sufficient to show that
\[ h_{top}(f|_{K_{\alpha(q)}}) \le \lambda \]
for any $\lambda > T(q) + q\alpha(q)$. Choose such a $\lambda$ and let $\delta = \lambda - T(q) - q\alpha(q) > 0$. Rewriting the definition of $K_{\alpha(q)}$ in terms of $\mu_q$ and $\varphi_q$ one has
\begin{align*}
K_{\alpha(q)} &= \{x \in X : h_{\mu_q}(f, x) = h_{\mu_q}(f) = T(q) + q\alpha(q)\} \\
 &= \Big\{x \in X : \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi_q(f^i x) = -T(q) - q\alpha(q)\Big\}.
\end{align*}


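For every $x \in K_{\alpha(q)}$ there exists an integer $n(x)$ such that
\[ \Big| \frac{1}{n} \sum_{i=0}^{n-1} \varphi_q(f^i x) + T(q) + q\alpha(q) \Big| \le \frac{\delta}{2} \tag{8.4} \]
for all $n \ge n(x)$. For every integer $N$ consider the set
\[ K_{\alpha(q),N} = \{x \in K_{\alpha(q)} : n(x) \le N\}. \]
Obviously we have
\[ K_{\alpha(q)} = \bigcup_{N \ge 1} K_{\alpha(q),N}, \qquad K_{\alpha(q),N} \subset K_{\alpha(q),N+1}. \]

Using the properties of the topological entropy from Theorem 5.2 we conclude that
\[ h_{top}(f|_{K_{\alpha(q)}}) = \lim_{N\to\infty} h_{top}(f|_{K_{\alpha(q),N}}). \]

We are going to show that for any $N \in \mathbb{N}$ one has $h_{top}(f|_{K_{\alpha(q),N}}) \le \lambda$; this in turn will imply $h_{top}(f|_{K_{\alpha(q)}}) \le \lambda$.

Consider an arbitrary finite cover $\mathcal{U} = \{B(x_i, \varepsilon/2)\}_{i=1}^{M}$ of $X$ by open balls of radius $\varepsilon/2$, with $\varepsilon < \gamma/2$, where $\gamma$ is the expansivity constant for $f$. Together with $\mathcal{U}$ we consider $\tilde{\mathcal{U}}$, an open cover by balls with centers at $x_i$ and radii $\varepsilon$. Let $E = \{y_j\}$ be a maximal $(n, \varepsilon/2)$-separated set in $X$. Define a subset $E'$ of $E$ by choosing those $y_j$ which have a point from $K_{\alpha(q),N}$ close to them, namely
\[ E' = \{y_j \in E : K_{\alpha(q),N} \cap B_n(y_j, \varepsilon/2) \ne \emptyset\}. \]
This implies that
\[ K_{\alpha(q),N} \subset \bigcup_{y_j \in E'} B_n(y_j, \varepsilon/2). \]

For every $y_j \in E'$ there exists at least one string $U_{i_0,\dots,i_{n-1}}$ from $W_n(\mathcal{U})$ such that $y_j \in X(U_{i_0,\dots,i_{n-1}})$. It is easy to see that if
\[ y_j \in X(U_{i_0,\dots,i_{n-1}}) = U_{i_0} \cap f^{-1}U_{i_1} \cap \dots \cap f^{-n+1}U_{i_{n-1}}, \]
then
\[ B_n(y_j, \varepsilon/2) \subset X(\tilde{U}_{i_0,\dots,i_{n-1}}) = \tilde{U}_{i_0} \cap f^{-1}\tilde{U}_{i_1} \cap \dots \cap f^{-n+1}\tilde{U}_{i_{n-1}}. \]
In other words the collection of strings $\tilde\Gamma = \{\tilde{U}_{i_0,\dots,i_{n-1}}\}$ covers $K_{\alpha(q),N}$. Therefore
\begin{align*}
m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) &= \inf_{\substack{\Gamma \subset \bigcup_{k \ge n} W_k(\tilde{\mathcal{U}}) \\ \Gamma \text{ covers } K_{\alpha(q),N}}} \sum_{\mathbf{U} \in \Gamma} \exp(-m(\mathbf{U})\lambda) \\
 &\le \sum_{\tilde{\mathbf{U}} \in \tilde\Gamma} \exp(-m(\tilde{\mathbf{U}})\lambda) \\
 &= e^{-n\delta} \sum_{\tilde{\mathbf{U}} \in \tilde\Gamma} \exp\big(-n(T(q) + q\alpha(q))\big) \\
 &= e^{-n\delta} \sum_{y_j \in E'} \exp\big(-n(T(q) + q\alpha(q))\big). \tag{8.5}
\end{align*}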


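Since the potential $\varphi \in V_f(X)$, so is $\varphi_q$, and
\[ \big| (S_n\varphi_q)(x) - (S_n\varphi_q)(y) \big| = \Big| \sum_{k=0}^{n-1} \varphi_q(f^k(x)) - \sum_{k=0}^{n-1} \varphi_q(f^k(y)) \Big| \le |q|K \]
for all $x, y \in X$ with $d_n(x, y) < \varepsilon/2$. For any $y_j \in E'$ let $x_j$ be an arbitrary point from $K_{\alpha(q),N} \cap B_n(y_j, \varepsilon/2)$. Since $x_j \in K_{\alpha(q),N}$ and $n \ge N$, from (8.4) we have
\begin{align*}
-n(T(q) + q\alpha(q)) &\le -n(T(q) + q\alpha(q)) - (S_n\varphi_q)(x_j) + (S_n\varphi_q)(y_j) + |q|K \\
 &\le \frac{n\delta}{2} + (S_n\varphi_q)(y_j) + |q|K.
\end{align*}
Thus we can continue the estimate (8.5) as follows:
\begin{align*}
m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) &\le e^{-n\delta/2 + |q|K} \sum_{y_j \in E'} \exp\big((S_n\varphi_q)(y_j)\big) \\
 &\le C' e^{-n\delta/2} Z_n(\varphi_q, E).
\end{align*}

Using the estimates from Lemma 4.3 and the fact that $P(\varphi_q) = 0$ we conclude that
\[ m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) \le C'' e^{-n\delta/2}. \]
Hence
\[ m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}) = \lim_{n\to\infty} m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}, n) = 0, \]
and since $\mathcal{U}$ was an open cover by balls of radius $\varepsilon/2$ we get
\[ m(K_{\alpha(q),N}, \lambda) = \lim_{\mathrm{diam}(\mathcal{U}) \to 0} m(K_{\alpha(q),N}, \lambda, \tilde{\mathcal{U}}) = 0. \]

Then by the definition of the topological entropy we have $h_{top}(f|_{K_{\alpha(q),N}}) \le \lambda$ for all $N$. Hence $h_{top}(f|_{K_{\alpha(q)}}) \le \lambda$ for all $\lambda > T(q) + q\alpha(q)$. This completes the proof that
\[ h_{top}(f|_{K_{\alpha(q)}}) \le T(q) + q\alpha(q). \]
The rest of the statement is taken from [18]. It states that we have a complete description of the spectra for local entropies.

(4) If the equilibrium state for the potential $\varphi$ is not a measure of maximal entropy, then it was shown in Lemma 4.1 that $T(q)$ is strictly convex, i.e., the following holds for every $q, q_0 \in \mathbb{R}$, $q \ne q_0$:
\[ T(q) > T(q_0) + T'(q_0)(q - q_0). \]
If $\alpha \in (\underline{\alpha}, \overline{\alpha})$ then there exists $q_0 \in \mathbb{R}$ such that $\alpha = -T'(q_0)$. We have seen that in this case
\[ E_E(\alpha) = T(q_0) + \alpha q_0. \]
Using the strict convexity of $T(q)$ we obtain that for $q \in \mathbb{R}$, $q \ne q_0$, the following holds:
\[ E_E(\alpha) = T(q_0) + \alpha q_0 < T(q) + \alpha q. \]
Hence,
\[ E_E(\alpha) = \inf_{q \in \mathbb{R}} \big(T(q) + \alpha q\big) \quad \text{for } \alpha \in (\underline{\alpha}, \overline{\alpha}). \]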


In a similar manner one obtains the second relation
\[ T(q) = \sup_{\alpha \in (\underline{\alpha}, \overline{\alpha})} \big(E_E(\alpha) - q\alpha\big). \]
Using the notion of the Legendre transform [15] we can say that the functions $T(q)$ and $F(\alpha) := -E_E(-\alpha)$ form a Legendre pair, i.e., each is the Legendre transform of the other. Therefore the concavity and differentiability of $E_E$ follow from the properties of the Legendre transform. In particular, for $\alpha \in (\underline{\alpha}, \overline{\alpha})$ one has $E_E'(\alpha) = q$, where $q \in \mathbb{R}$ is such that $\alpha = -T'(q)$.

In the case when $\mu$ is the measure of maximal entropy one has $h_\mu(f, x) = h_\mu(f) = h_{top}(f)$ for all $x \in X$. This means that $E_E$ is a delta-like function
\[ E_E(\alpha) = \begin{cases} h_{top}(f), & \text{if } \alpha = h_{top}(f), \\ 0, & \text{otherwise.} \end{cases} \]
This "degenerate" behaviour of the multifractal spectrum for the measure of maximal entropy can be successfully exploited. For this see [2], where it has been used for the calculation of the multifractal spectra for Lyapunov exponents.

(5) This is an essentially new result. We prove it by means of standard thermodynamical techniques. Let $q > 1$ and let $E$ be an arbitrary $(n, \varepsilon)$-separated set. One has
\begin{align*}
\int \mu(B_n(x, \varepsilon))^{q-1} \, d\mu &\ge \sum_{x_i \in E} \int_{B_n(x_i, \varepsilon/2)} \mu(B_n(x, \varepsilon))^{q-1} \, d\mu \\
 &\ge \sum_{x_i \in E} \mu(B_n(x_i, \varepsilon/2))^{q},
\end{align*}
since $x \in B_n(x_i, \varepsilon/2)$ implies $B_n(x_i, \varepsilon/2) \subset B_n(x, \varepsilon)$. Applying inequality (3.4), and using the fact that $E$ is an arbitrary $(n, \varepsilon)$-separated set, we get
\[ \int \mu(B_n(x, \varepsilon))^{q-1} \, d\mu \ge \sup_{E} \sum_{x_i \in E} A_{\varepsilon/2}^{q} \exp\Big(-qnP(\varphi) + \sum_{j=0}^{n-1} q\varphi(f^j x_i)\Big), \]
where the supremum is taken over all $(n, \varepsilon)$-separated sets. Taking logarithms and applying the estimates from Lemma 4.3 we conclude that in the limit
\[ h_\mu(f, q) \le R_\mu(f, q) \le \frac{P(q\varphi) - qP(\varphi)}{1 - q}. \]

To finish the proof we have to show the opposite inequality. We do it in a similar manner. Let now $E$ be a maximal $(n, \varepsilon/2)$-separated set. Then
\begin{align*}
\int \mu(B_n(x, \varepsilon/2))^{q-1} \, d\mu &\le \sum_{x_i \in E} \int_{B_n(x_i, \varepsilon/2)} \mu(B_n(x, \varepsilon/2))^{q-1} \, d\mu \\
 &\le \sum_{x_i \in E} \mu(B_n(x_i, \varepsilon))^{q},
\end{align*}
since $x \in B_n(x_i, \varepsilon/2)$ implies that $B_n(x, \varepsilon/2) \subset B_n(x_i, \varepsilon)$.


Again, since $E$ is an arbitrary maximal $(n, \varepsilon/2)$-separated set, applying inequality (3.4) we obtain
\[ \int \mu(B_n(x, \varepsilon/2))^{q-1} \, d\mu \le \sup_{E} \sum_{x_i \in E} B_{\varepsilon}^{q} \exp\Big(-qnP(\varphi) + \sum_{j=0}^{n-1} q\varphi(f^j x_i)\Big). \]

Taking logarithms and using the estimates from Lemma 4.3, in the limit $n \to \infty$ we get
\[ h_\mu(f, q) \ge R_\mu(f, q) \ge \frac{P(q\varphi) - qP(\varphi)}{1 - q}. \]

Combining these inequalities we get the statement in the case $q > 1$. The case $q < 1$ is completely analogous. The continuity and other properties of $h_\mu(f, q)$ follow from the corresponding properties of $T(q)$. □

9. Final Remarks

A. Consider the irregular set
\begin{align*}
B &= \{x \in X : h_\mu(f, x) \text{ does not exist}\} \\
 &= \Big\{x \in X : \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k(x)) \text{ does not exist}\Big\}.
\end{align*}

We have seen that for the measure of maximal entropy $m_E$ this is an empty set. It was shown in [3] that in a number of cases the set $B$ is either empty or has full topological entropy and Hausdorff dimension.

B. There exists another way of defining local (pointwise) entropies. Namely, consider an arbitrary finite measurable partition $\xi$ of $X$. We can define a local entropy at $x$ with respect to $\xi$ as follows (if the limit exists):
\[ h_\mu(f, x, \xi) = \lim_{n\to\infty} -\frac{1}{n} \log \mu(\xi^{(n)}(x)), \]
where $\xi^{(n)} = \xi \vee f^{-1}\xi \vee \dots \vee f^{-n+1}\xi$ and $\xi^{(n)}(x)$ is the element of $\xi^{(n)}$ containing $x$. We can define a spectrum of local entropies with respect to $\xi$ as follows:
\[ E_E(\alpha) = h_{top}(f|_{\{x : h_\mu(f, x, \xi) = \alpha\}}). \]
The situation when $\xi$ is a finite Markov partition for an expanding dynamical system has been studied in [2,1]. One can easily check that in this case the two spectra coincide.

C. The results of this paper can be extended to the case of expansive endomorphisms (i.e., non-invertible maps) with the specification property. They are defined in exactly the same way as the expansive homeomorphisms with specification except that the set $\mathbb{Z}$ in (2.1) is replaced by $\mathbb{N}$ (positive expansiveness). The characteristic property of the equilibrium states (Theorem 3.7) remains valid [17]. Therefore our analysis works without any modifications.

In the case of expansive homeomorphisms we can give another definition of local entropies. Namely, for any $n \ge 1$ define
\[ B_n^{\pm}(x, \varepsilon) = \{y \in X : d(f^i(x), f^i(y)) < \varepsilon \text{ for all } i = -n+1, \dots, n-1\}, \]
and
\[ \underline{h}_\mu^{\pm}(f, x) = \lim_{\varepsilon\to 0} \liminf_{n\to\infty} -\frac{1}{2n-1} \log \mu(B_n^{\pm}(x, \varepsilon)), \]
\[ \overline{h}_\mu^{\pm}(f, x) = \lim_{\varepsilon\to 0} \limsup_{n\to\infty} -\frac{1}{2n-1} \log \mu(B_n^{\pm}(x, \varepsilon)). \]

Then the level sets of these local entropies will be in one-to-one correspondence with the level sets of two-sided ergodic averages of $\varphi$,
\[ \lim_{n\to\infty} \frac{1}{2n-1} \sum_{k=-n+1}^{n-1} \varphi(f^k(x)). \]

The level sets of two-sided and one-sided ergodic averages of $\varphi$ can be different. However, they have the same topological entropy with respect to $f$. Therefore the multifractal spectrum based on $h_\mu^{\pm}(f, x)$ will be the same.

D. The requirement that a Markov partition exists is stronger than the specification property, provided the dynamical system is mixing. Consider the family of one-dimensional interval maps $T_\beta$, defined by $T_\beta(x) = \beta x \pmod 1$. For $\beta > 1$ these maps are expanding and therefore expansive. The ergodic properties of $T_\beta$ depend on the number-theoretic properties of $\beta$. For these systems it turns out [19] that:
i) the set of $\beta$'s for which $T_\beta$ has a finite Markov partition is at most countable;
ii) the set of $\beta$'s for which $T_\beta$ has the specification property is uncountable and has Hausdorff dimension 1, but still has Lebesgue measure 0.
Therefore, we can see that in the family $\{T_\beta\}_{\beta > 1}$, specification is a much more general property than the property of having a finite Markov partition.

Acknowledgements. The work of the second author was supported by the Netherlands Organization for Scientific Research (NWO), grant 613-06-551. We would like to thank Luis Barreira, Yakov Pesin and Jörg Schmeling for their valuable advice and comments.

References

1. Barreira, L., Pesin, Ya. and Schmeling, J.: Multifractal spectra and multifractal rigidity for horseshoes. J. Dynam. Control Systems 3 (1), 33–49 (1997)
2. Barreira, L., Pesin, Ya. and Schmeling, J.: On a general concept of multifractality: Multifractal spectra for dimensions, entropies, and Lyapunov exponents. Multifractal rigidity. Chaos 7 (1), 27–38 (1997)
3. Barreira, L. and Schmeling, J.: Sets of "non-typical" points have full topological entropy and full Hausdorff dimension. Preprint Instituto Superior Técnico, 1997
4. Beck, Ch. and Schlögl, F.: Thermodynamics of chaotic systems. An introduction. Vol. 4 of Cambridge Nonlinear Science Series, Cambridge: Cambridge University Press, 1993
5. Bowen, R.: Topological entropy for noncompact sets. Trans. Am. Math. Soc. 184, 125–136 (1973)
6. Bowen, R.: Some systems with unique equilibrium states. Math. Systems Theory 8 (3), 193–202 (1974/75)
7. Brin, M. and Katok, A.: On local entropy. In: Geometric dynamics (Rio de Janeiro, 1981), Vol. 1007 of Lecture Notes in Math., Berlin: Springer, 1983, pp. 30–38
8. Haydn, N.T.A. and Ruelle, D.: Equivalence of Gibbs and equilibrium states for homeomorphisms satisfying expansiveness and specification. Commun. Math. Phys. 148 (1), 155–167 (1992)
9. Katok, A. and Hasselblatt, B.: Introduction to the modern theory of dynamical systems. Vol. 54 of Encyclopedia of Mathematics and its Applications, Cambridge: Cambridge University Press, 1995
10. Pesin, Ya. and Weiss, H.: A multifractal analysis of equilibrium measures for conformal expanding maps and Moran-like geometric constructions. J. Stat. Phys. 86 (1–2), 233–275 (1997)
11. Pesin, Ya. and Weiss, H.: The multifractal analysis of Gibbs measures: Motivation, mathematical foundation, and examples. Chaos 7 (1), 89–106 (1997)
12. Pesin, Ya.B.: Dimension-like characteristics for invariant sets of dynamical systems. Uspekhi Mat. Nauk 43 (4(262)), 95–128, 255 (1988)
13. Pesin, Ya.B. and Pitskel, B.S.: Topological pressure and the variational principle for noncompact sets. Funktsional. Anal. i Prilozhen. 18 (4), 50–63, 96 (1984)
14. Pesin, Ya.B.: Dimension theory in dynamical systems. Contemporary views and applications. Chicago Lectures in Mathematics. Chicago, IL: University of Chicago Press, 1997
15. Roberts, A.W. and Varberg, D.E.: Convex functions. Pure and Applied Mathematics, Vol. 57. New York–London: Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], 1973
16. Ruelle, D.: Thermodynamic formalism. Vol. 5 of Encyclopedia of Mathematics and its Applications, Reading, MA: Addison-Wesley, 1978
17. Ruelle, D.: Thermodynamic formalism for maps satisfying positive expansiveness and specification. Nonlinearity 5 (6), 1223–1236 (1992)
18. Schmeling, J.: On the completeness of multifractal spectra. Preprint WIAS, Berlin, 1996
19. Schmeling, J.: Symbolic dynamics for β-shifts and self-normal numbers. Ergodic Theory Dynam. Systems 17 (3), 675–694 (1997)
20. Takens, F. and Verbitski, E.: Generalized entropies: Rényi and correlation integral approach. Nonlinearity 11, 771–782 (1998)
21. Walters, P.: An introduction to ergodic theory. Vol. 79 of Graduate Texts in Mathematics. New York–Berlin: Springer-Verlag, 1982
22. Walters, P.: Differentiability properties of the pressure of a continuous transformation on a compact metric space. J. London Math. Soc. (2) 46 (3), 471–481 (1992)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 203, 613 – 633 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On the Algebro-Geometric Integration of the Schlesinger Equations

P. Deift1, A. Its2, A. Kapaev3, X. Zhou4

1 Courant Institute of Mathematical Sciences, New York, NY 10003, USA
2 Department of Mathematical Sciences, Indiana University – Purdue University Indianapolis, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]
3 St. Petersburg Branch of Steklov Mathematical Institute, Russian Academy of Sciences, St. Petersburg, 191011, Russia
4 Department of Mathematics, Duke University, Durham, NC 27708-0320, USA.

Received: 3 March 1998 / Accepted: 16 December 1998

Abstract: A new approach to the construction of isomonodromy deformations of 2 × 2 Fuchsian systems is presented. The method is based on a combination of the algebro-geometric scheme and Riemann–Hilbert approach of the theory of integrable systems. For a given number 2g + 1, g ≥ 1, of finite (regular) singularities, the method produces a 2g-parameter submanifold of the Fuchsian monodromy data for which the relevant Riemann–Hilbert problem can be solved in closed form via the Baker–Akhiezer function technique. This in turn leads to a 2g-parameter family of solutions of the corresponding Schlesinger equations, explicitly described in terms of Riemann theta functions of genus g. In the case g = 1 the solution found coincides with the general elliptic solution of the particular case of the Painlevé VI equation first obtained by N. J. Hitchin [H1].

Introduction

Let
\[ \Gamma = \Big\{ P = (\lambda, w) : w^2 = \prod_{j=1}^{2g+1} (\lambda - a_j) \Big\}, \qquad a_j \ne a_k \text{ for } j \ne k, \]
be a hyperelliptic curve of genus $g \ge 1$. We shall use the standard representation of $\Gamma$ as a two-sheeted covering of $\mathbb{CP}^1$ with the cuts along the intervals $[a_{2k+1}, a_{2k+2}]$, $k = 0, \dots, g$, $a_{2g+2} = \infty$, assuming that
(1)

and <λ >
π . 2

(2)


[Figure: the branch points $a_1, a_2, \dots, a_{2k-1}, a_{2k}, a_{2k+1}, a_{2k+2}, \dots, a_{2g+1}, \infty$ together with the cycles $A_k$ and $B_k$.]

Fig. 1. The basis of the canonical cycles of the curve $\Gamma$

We choose the basis $\{A_k, B_k\}_{k=1}^{g}$ of the group $H_1(\Gamma)$ so that $A_k$ lies fully on the upper sheet and encircles the intervals $[a_{2j-1}, a_{2j}]$, $j = 1, \dots, k$, clockwise, while $B_k$ emerges on the upper sheet at the point $a_{2k}$, passes to the point $a_{2k+1}$ and returns to the initial point $a_{2k}$ through the lower sheet (see Fig. 1). We normalize the corresponding basis of the holomorphic differentials, $\{\omega_j(P)\}$, by the relation
\[ \int_{A_k} \omega_j = 2\pi i \, \delta_{jk}, \qquad 1 \le j, k \le g. \]

Consider $\mathbb{C}P^1 \setminus \{a_1, \dots, a_{2g+1}, a_{2g+2} = \infty\}$, the complex sphere with $2g + 2$ punctures, and suppose that $2g + 1$ matrices $A_j \in M(2, \mathbb{C})$ are given. Then a representation of the fundamental group $\pi_1\bigl(\mathbb{C}P^1 \setminus \{a_1, \dots, a_{2g+1}, \infty\}\bigr)$ can be obtained via solutions of the 2 × 2 Fuchsian system,
\[
\frac{dY(\lambda)}{d\lambda} = \sum_{j=1}^{2g+1} \frac{A_j}{\lambda - a_j}\, Y(\lambda), \qquad Y(\lambda) \in M(2, \mathbb{C}), \tag{3}
\]

in the following way. Fix a base point $a_0 \in \mathbb{C}P^1 \setminus \{a_1, \dots, a_{2g+1}, \infty\}$ and realize $\pi_1$ as the collection of loops $\{\gamma = \gamma(t),\ 0 \le t \le 1,\ \gamma(0) = \gamma(1) = a_0\}$ in the usual way. Fix a matrix $Y_0 \in Gl(2, \mathbb{C})$. Now let $Y(\lambda)$ be the solution of (3) with $Y(a_0) = Y_0$. For each $\gamma \in \pi_1$, set $M(\gamma) = M(\gamma; a_0, Y_0) \equiv Y_0^{-1} Y(\gamma(1))$, where $Y(\gamma(1))$ denotes the analytic continuation of the solution $Y(\lambda)$ along the loop $\gamma$, $a_0 = \gamma(0) \mapsto \gamma(1) = a_0$. Then a simple calculation shows that the map $\pi_1 \ni \gamma \mapsto M(\gamma) \in Gl(2, \mathbb{C})$ is a linear representation of $\pi_1$, which is called a monodromy representation of system (3). Given $\{A_j, a_j\}$, the subgroup $\{M(\gamma),\ \gamma \in \pi_1\} \subset Gl(2, \mathbb{C})$ is called the monodromy group for the Fuchsian system (3). (This nomenclature of course involves a slight abuse of notation: the group $\{M(\gamma) = M(\gamma; a_0, Y_0)\}$ depends on the choice of $a_0$ and $Y_0$. However, all these groups are isomorphic via a conjugation, e.g. $M(\gamma; a_0, Y_0) = Y_0^{-1} M(\gamma; a_0, I) Y_0$, and when we say monodromy group we really mean the equivalence class of monodromy groups modulo conjugation.)

Algebro-Geometric Integration of Schlesinger Equations

The monodromy group, which can of course be introduced in a similar way for a Fuchsian system of an arbitrary matrix dimension, is a central object of the global theory of Fuchsian systems. We refer the reader to the review [B] for a comprehensive account on the history as well as the most recent results concerning the general monodromy theory. As we vary the $a_j$'s, $1 \le j \le 2g + 1$, and hence deform the punctured sphere $\mathbb{C}P^1 \setminus \{a_1, \dots, a_{2g+1}, \infty\}$, it is clearly of basic interest to determine which coefficients $A_j = A_j(a_1, \dots, a_{2g+1})$ give rise to the same monodromy group. Maps
\[
\{a_j\}_{j=1}^{2g+1} \mapsto \{A_j(a_1, \dots, a_{2g+1})\}_{j=1}^{2g+1}
\]
which give the same monodromy groups are called isomonodromy deformations of (3), and it turns out that $\{A_j(a_1, \dots, a_{2g+1})\}$, subject to the gauge condition, $\sum_j A_j = $ diagonal matrix, is an isomonodromy deformation if and only if the $A_j$'s satisfy the Schlesinger equations,
\[
dA_j + \sum_{k \neq j} \frac{da_j - da_k}{a_j - a_k}\, [A_j, A_k] = 0, \qquad j = 1, \dots, 2g + 1, \tag{4}
\]

(see [Sch]; see also [JMU, Mal]). In the present paper we show how to use curves $\Gamma$ of the above type to construct a 2g-parameter family of solutions of (4) (see (81)–(87) below). In our approach, the complete set of the algebro-geometric data consists of the curve $\Gamma$ and an abelian integral $\Omega(P)$ of the second kind,
\[
\Gamma \to \{\Gamma, \Omega(P)\} \to \frac{dY(\lambda)}{d\lambda} = \sum_{j=1}^{2g+1} \frac{A_j}{\lambda - a_j}\, Y(\lambda). \tag{5}
\]

The relevant monodromy group is parametrized by 2g independent parameters and is encoded in the abelian integral $\Omega(\lambda)$, more precisely, in its A, B-periods (see (33)–(34), (27), and (37)–(38) below). All the generators $M_j$ of the monodromy group are nontrivial; in fact (cf. (33)),
\[
\text{the eigenvalues of } M_j = \pm i, \qquad \forall j = 1, \dots, 2g + 2. \tag{6}
\]
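The Schlesinger equations (4) can be probed numerically. The sketch below is only an illustration under stated assumptions: it reads (4) as $\partial A_j/\partial a_k = [A_j, A_k]/(a_j - a_k)$ for $k \neq j$ and $\partial A_j/\partial a_j = -\sum_{k \neq j} [A_j, A_k]/(a_j - a_k)$, and it uses arbitrary random traceless matrices rather than the solutions constructed in this paper. It checks that $\sum_j A_j$ has zero derivative along every $a_i$, which is consistent with the gauge condition $\sum_j A_j = $ diagonal matrix being preserved. The helper names (`schlesinger_partials`, `total_drift`) are hypothetical.

```python
import random

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def schlesinger_partials(a, A):
    """D[j][i] is the coefficient of da_i in dA_j, read off from (4)."""
    n = len(a)
    D = [[[[0j, 0j], [0j, 0j]] for _ in range(n)] for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if k == j:
                continue
            C = commutator(A[j], A[k])
            for r in range(2):
                for s in range(2):
                    term = C[r][s] / (a[j] - a[k])
                    D[j][k][r][s] += term   # from +[A_j, A_k] da_k / (a_j - a_k)
                    D[j][j][r][s] -= term   # from -[A_j, A_k] da_j / (a_j - a_k)
    return D

def total_drift(a, A):
    """Largest entry of sum_j dA_j/da_i over all i; expected to vanish."""
    D = schlesinger_partials(a, A)
    n = len(a)
    return max(abs(sum(D[j][i][r][s] for j in range(n)))
               for i in range(n) for r in range(2) for s in range(2))

def random_traceless(rng):
    x, y, z = (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3))
    return [[x, y], [z, -x]]

rng = random.Random(0)
points = [0.0 + 0j, 1.0 + 0.5j, 3.0 - 1j]           # a_1, a_2, a_3 (g = 1), arbitrary
residues = [random_traceless(rng) for _ in points]  # arbitrary traceless A_j
drift = total_drift(points, residues)
partials = schlesinger_partials(points, residues)
```

Since every right-hand side is a commutator, each partial derivative is also traceless, so the condition trace $A_j = 0$ is compatible with the flow as well.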

The first arrow in diagram (5) is simply the selection of an abelian integral of the second kind on $\Gamma$. The second arrow constitutes an explicit solution of a certain matrix Riemann–Hilbert problem determined by the periods of the abelian integral (see problem (i–iii) below). This means that we construct an example of a 2g-parameter submanifold (see (33)–(34) below) of the Fuchsian monodromy data corresponding to 2g + 1 finite regular singularities for which the inverse monodromy problem can be solved in closed form (see (66), (87) below). The method we propose is based on recent technical developments in the asymptotic analysis of integrable systems via the Riemann–Hilbert approach [DVZ, DIZ]. A general scheme for constructing algebro-geometric solutions of a generalized Fuchsian equation (irregular singularities may be included) of an arbitrary matrix dimension was suggested in [JM] (see also [Ko, KoM] where the approach of [JM] has been developed and used for studying the Einstein equations). Applied to the curve $\Gamma$, the method of [JM] leads to a particular solution of the Schlesinger equation (4) generated by the diagram,
\[
\Gamma \to \frac{dY(\lambda)}{d\lambda} = \sum_{j=1}^{2g+1+m} \frac{A_j}{\lambda - a_j}\, Y(\lambda), \qquad m > g. \tag{7}
\]

As indicated, in this scheme one needs to extend the set of regular singularities by more than g extra points. Moreover, the monodromy group corresponding to the solution is very special and is determined a priori by the curve $\Gamma$. The monodromy group is in fact a finite abelian group isomorphic to the group $\mathbb{Z}_2$; the monodromy matrices associated with each of the singular points $a_j$ are given by the equations,
\[
M_j = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \equiv \sigma_1, \qquad j = 1, \dots, 2g + 1,
\]
at the branch points, and
\[
M_j = I, \qquad j = 2g + 2, \dots, 2g + 1 + m,
\]
at the added points. We emphasize that all the monodromy matrices at the added singularities are trivial. Said differently, in contrast to the method of [JM] the approach proposed in this paper allows us to construct a 2g-parameter (g ≥ 1) family of solutions of the Schlesinger equations (4) related to a 2 × 2 Fuchsian system with a fixed number 2g + 1 of finite (regular) singularities and with nontrivial monodromy at each of the singular points. The corresponding monodromy groups are infinite and non-abelian. (The method of [JM] can, in principle, be modified, using some of the constructions presented below, to allow for the inclusion of the free parameters in the monodromy group. Nevertheless, the extension of the set of regular singularities indicated above cannot be avoided in the framework of [JM].) In what follows, the new scheme is described in detail. In the particular case g = 1, this approach gives the general solution of the Painlevé VI equation with parameters $\bigl(\tfrac18, -\tfrac18, \tfrac18, \tfrac38\bigr)$, whose elliptic function representation was first obtained in [H1] and then discussed extensively in the literature (see [H2, Man, LO] and references therein). We also believe that the constructions developed here have a connection to the results from topological field theory in [D, Kr, KrPh]. The main result of the paper is stated in Theorem 1 below.

Note. After this work was completed and submitted for publication, we became aware of preprint [KK] where a different approach to the construction of the same solution of the Schlesinger equations was presented.

Integral $\Omega(P)$

Let $\Omega(P)$ be an abelian integral on $\Gamma$ with possibly one pole, placed at $P = \infty$. In addition, suppose that $\Omega(a_{2g+1}) = 0$, and that
\[
\Omega(P^*) = -\Omega(P),
\]
where $P \to P^*$ denotes the involution of the sheets of $\Gamma$,
\[
P^* = (\lambda, -w), \qquad P = (\lambda, w). \tag{8}
\]

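We note that $\Omega(P)$ can be represented as
\[
\Omega(P) = \int_{a_{2g+1}}^{P} \frac{Q}{w}\, d\lambda, \tag{9}
\]
where $Q = Q(\lambda)$ is a polynomial in $\lambda$. With $\Omega(P)$ given, we define the sets of nonzero constants, $\{\ln c_k\}_{k=1}^{g}$ and $\{\ln d_k\}_{k=1}^{g}$, by the equations involving the A- and B-periods of $\Omega(\lambda)$,
\[
\int_{A_k} d\Omega = \ln d_k, \qquad k = 1, \dots, g, \tag{10}
\]
\[
\int_{B_k} d\Omega = \ln \frac{c_k}{c_{k-1}}, \qquad c_0 = 1, \quad k = 1, \dots, g. \tag{11}
\]
These equations define the period map,
\[
\mathcal{M} : \{\Omega(P)\} \to \{\ln d_k, \ln c_k\} \in \mathbb{C}^{2g}, \tag{12}
\]
to the monodromy data $\{\ln d_k, \ln c_k\}$. Note that
\[
\ker \mathcal{M} = \{\text{meromorphic functions on } \Gamma \text{ with possibly one pole at } \infty\}, \tag{13}
\]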

and that map (12) is onto. Indeed, as $\lambda^0, \lambda^1, \dots, \lambda^g$ are analytic on the Riemann surface away from infinity, and behave like even powers of the local variable at infinity, it follows from the Weierstrass gap theorem (see e.g. [FK]) that there are no functions analytic on the finite part of the Riemann surface with odd order poles of degree < 2g at infinity. But any nonzero vector in the null space of the map (12) gives rise to an analytic function on the finite part of the Riemann surface with an odd order pole at infinity. It follows that the null space of map $\mathcal{M}$ restricted to the integrals $\Omega(P)$ with poles of degree < 2g at infinity is trivial and hence the map (12) is onto. The surjectivity of the map $\mathcal{M}$ implies that the inverse period map,
\[
\mathcal{M}^{-1} : \{\ln d_k, \ln c_k\} \to \{\Omega(P)\} / \ker \mathcal{M}, \tag{14}
\]
is well defined, so that the set $a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g$ becomes our parameter set:
\[
\{a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g\} \to \{\Gamma, \Omega(P) + \ker \mathcal{M}\}.
\]
The integral $\Omega(P)$ can also be represented as follows:
\[
\Omega(P) = h(P) + \sum_{j=1}^{g} \alpha_j \int_{a_{2g+1}}^{P} \omega_j, \tag{15}
\]


with the abelian integral $h(P)$ uniquely determined by this equation and the normalization conditions,
\[
\int_{A_j} dh = 0, \qquad j = 1, \dots, g. \tag{16}
\]
A simple calculation shows that
\[
h(P) = \Omega_\infty(\lambda) + o(1), \qquad P \to \infty, \tag{17}
\]
where $\Omega_\infty(\lambda)$ denotes the principal part of the integral $\Omega(P)$ at $P = \infty$. From (10) and (11) it follows that
\[
\alpha_k = -\frac{i}{2\pi} \ln d_k, \qquad k = 1, \dots, g, \tag{18}
\]
and
\[
\int_{B_k} dh(P) = \frac{i}{2\pi} \sum_{j=1}^{g} B_{kj} \ln d_j + \ln \frac{c_k}{c_{k-1}}, \qquad k = 1, \dots, g, \tag{19}
\]
where
\[
B_{kj} = \int_{B_k} \omega_j
\]
is the canonical B-matrix of $\Gamma$. Let
\[
\Pi_0 = \Bigl\{ (a_j; \ln c_j; \ln d_j) : \frac{i}{2\pi} \sum_{j=1}^{g} B_{kj} \ln d_j + \ln \frac{c_k}{c_{k-1}} = 0 \Bigr\} \subset \mathbb{C}^{4g+1} \tag{20}
\]
be a special submanifold of co-dimension g in the space of parameters $a_j$, $\ln c_j$, $\ln d_j$. From (19) it follows that
\[
(a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g) \notin \Pi_0 \implies \Omega_\infty(\lambda) \not\equiv 0. \tag{21}
\]
Indeed, the assumption $\Omega_\infty(\lambda) = 0$ would imply that the integral $h(P)$ is an abelian integral of the first kind with all the A-periods zero. This means $h(P)$ must be a constant (zero, in fact), and hence all the B-periods are zero as well. Because of (19), this would yield $(a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g) \in \Pi_0$. Implication (21) means that for generic $a_j$, $\ln c_j$, and $\ln d_j$, i.e. under the condition,
\[
(a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g) \notin \Pi_0, \tag{22}
\]
the integral $\Omega(P)$ is of the second kind.


The Riemann–Hilbert Problem

Let $L$ denote the oriented polygonal line (see Fig. 2),
\[
L = [\infty, a_1] \cup [a_1, a_2] \cup [a_2, a_3] \cup \dots \cup [a_{2k}, a_{2k+1}] \cup [a_{2k+1}, a_{2k+2}] \cup \dots \cup [a_{2g}, a_{2g+1}] \cup [a_{2g+1}, \infty],
\]
where we assume that $\Re\lambda <$ and $\Re\lambda >$
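The positive and the negative parts of the plane $\mathbb{C}$ with respect to the contour $L$ we shall denote as $\mathbb{C}_+$ and $\mathbb{C}_-$ respectively. We also note that the domain $\mathbb{C}_+$ contains the infinite part of the positive imaginary axis. Having chosen the monodromy set $\{\ln c_k, \ln d_k\}$, we define the 2 × 2 matrix function $Y(\lambda)$ (cf. diagram (5)) as the solution of the following Riemann–Hilbert problem (RH):

(i) $Y(\lambda)$ is analytic in $\mathbb{C}P^1 \setminus L$.

(ii) The $L^2$-limits, $Y_\pm(\lambda)$, of $Y(\lambda)$ as $\lambda \to L_\pm$ satisfy the jump conditions,
\[
Y_-(\lambda) = Y_+(\lambda) \begin{pmatrix} 0 & \dfrac{c_g}{c_k} \\[2mm] -\dfrac{c_k}{c_g} & 0 \end{pmatrix}, \tag{23}
\]
if $\lambda \in [a_{2k+1}, a_{2k+2}]$, $k = 0, \dots, g$, $a_{2g+2} = \infty$, and
\[
Y_-(\lambda) = Y_+(\lambda) \begin{pmatrix} \dfrac{d_g}{d_k} & 0 \\[2mm] 0 & \dfrac{d_k}{d_g} \end{pmatrix}, \tag{24}
\]
if $\lambda \in [a_{2k}, a_{2k+1}]$, $k = 0, \dots, g$, $a_0 = \infty$, $d_0 = 1$.

(iii) The behavior of $Y(\lambda)$ as $\lambda \to \infty$ is normalized by the equation,
\[
Y(\lambda) = \Bigl( I + O\Bigl(\frac{1}{\lambda}\Bigr) \Bigr) \lambda^{-\frac14 \sigma_3} \begin{pmatrix} 1 & i d_g \\ i d_g^{-1} & 1 \end{pmatrix}, \qquad \lambda \to \infty, \quad \lambda \in \mathbb{C}_+, \tag{25}
\]
where
\[
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]
and $\lambda^{1/4}$ denotes the branch defined in the neighborhood of $\infty$, $U_\infty = \{\lambda : |\lambda| > R > \max\{|a_j|,\ j = 1, \dots, 2g + 1\}\}$, cut along the segment $[a_{2g+1}, \infty] \cap U_\infty$ and normalized by the condition $\arg \lambda = \frac{\pi}{2}$ when $\lambda$ lies on the positive imaginary axis.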

Fig. 2. The contour L and the Riemann–Hilbert problem (i–iii)

The Fuchsian Equation, the Monodromy Data, and the Schlesinger Equations

Assume that the solution $Y(\lambda)$ of the RH problem (i–iii) exists. Then, one can see that in the neighborhoods of the points $\lambda = a_j$, $j = 1, \dots, 2g + 1$, the function $Y(\lambda)$ admits the following representations:
\[
Y(\lambda) = \hat{Y}_j(\lambda) (\lambda - a_j)^{(-1)^{j+1} \frac{\sigma_3}{4}} \begin{pmatrix} 1 & i\rho_j \\ i\rho_j^{-1} & 1 \end{pmatrix} C_j(\lambda), \qquad j = 1, \dots, 2g + 1, \tag{26}
\]
where
\[
\rho_j = \begin{cases} -\dfrac{c_g d_g}{c_{k-1} d_k} & \text{for } j = 2k, \\[3mm] -\dfrac{c_g d_g}{c_{k-1} d_{k-1}} & \text{for } j = 2k - 1, \end{cases} \tag{27}
\]
and $C_j(\lambda)$ are the piecewise constant matrix functions given by the equations,
\[
C_j(\lambda) = I, \qquad \forall \lambda \in \mathbb{C}_+, \tag{28}
\]
\[
C_j(\lambda) = \begin{cases} \Bigl(\dfrac{d_g}{d_k}\Bigr)^{\sigma_3} & \text{for } j = 2k, \\[3mm] \Bigl(\dfrac{d_g}{d_{k-1}}\Bigr)^{\sigma_3} & \text{for } j = 2k - 1, \end{cases} \qquad \forall \lambda \in \mathbb{C}_-. \tag{29}
\]
The matrix function $\hat{Y}_j(\lambda)$ is holomorphic and invertible in the neighborhood of the point $\lambda = a_j$,
\[
U_j = \{\lambda : |\lambda - a_j| < r < \min\{|a_j - a_l|,\ 1 \le l \le 2g + 1,\ l \neq j\}\},
\]
and $(\lambda - a_j)^{1/4}$ denotes an arbitrary branch defined in the same neighborhood cut along the segment $[a_{j-1}, a_j] \cap U_j$ if $j = 2k$ and along the segment $[a_j, a_{j+1}] \cap U_j$ if $j = 2k - 1$. A similar representation for $Y(\lambda)$ near $\lambda = \infty$ reads (cf. (25)) as
\[
Y(\lambda) = \hat{Y}_\infty(\lambda) \lambda^{-\frac{\sigma_3}{4}} \begin{pmatrix} 1 & i d_g \\ i d_g^{-1} & 1 \end{pmatrix} C_\infty(\lambda), \qquad \hat{Y}_\infty(\infty) = I, \tag{30}
\]
\[
C_\infty(\lambda) = \begin{cases} I & \text{for } \lambda \in \mathbb{C}_+, \\ d_g^{\sigma_3} & \text{for } \lambda \in \mathbb{C}_-, \end{cases} \tag{31}
\]
where $\hat{Y}_\infty(\lambda)$ is holomorphic and invertible at $\lambda = \infty$, and $\lambda^{1/4}$ is defined as above. By standard Liouville theorem arguments (see e.g. [FT]) based on the $\lambda$-independence of the jump matrices of the RH problem (i–iii) and Eqs. (26–29), we conclude that $Y(\lambda)$ satisfies the Fuchsian system (3) with the matrix residues $A_j$ determined by the equations,
\[
A_j = (-1)^{j+1} \hat{Y}_j(a_j) \frac{\sigma_3}{4} \hat{Y}_j^{-1}(a_j), \qquad j = 1, \dots, 2g + 1. \tag{32}
\]
Equations (26–30) yield the following set of the corresponding monodromy matrices,
\[
\tilde{Y}(\lambda)\big|_{(\lambda - a_j) \mapsto (\lambda - a_j) e^{2\pi i}} = \tilde{Y}(\lambda) M_j, \qquad M_j = (-1)^j \begin{pmatrix} 0 & \rho_j \\ -\rho_j^{-1} & 0 \end{pmatrix}, \qquad j = 1, \dots, 2g + 1, \tag{33}
\]
\[
\tilde{Y}(\lambda)\big|_{\frac{1}{\lambda} \mapsto \frac{1}{\lambda} e^{2\pi i}} = \tilde{Y}(\lambda) M_\infty, \qquad M_\infty = \begin{pmatrix} 0 & -d_g \\ d_g^{-1} & 0 \end{pmatrix}. \tag{34}
\]
Here $\tilde{Y}(\lambda)$ denotes the analytic continuation of $Y(\lambda)$ from $\mathbb{C}_+$ to the universal covering of $\mathbb{C}P^1 \setminus \{a_1, \dots, a_{2g+1}, \infty\}$. We note that trace $A_j = 0$ for all $j$, and that
\[
\sum_{j=1}^{2g+1} A_j = \text{diagonal matrix},
\]
due to the normalization condition, $\hat{Y}_\infty(\infty) = I$. The monodromy matrices $M_1, \dots, M_{2g+1}, M_\infty$ generate the monodromy group for the Fuchsian system (3), satisfy the cyclic relation, $M_\infty M_{2g+1} \cdots M_1 = I$, and depend only on $c_j$, $d_j$. Hence, the 2 × 2 matrices,
\[
A_j \equiv A_j(a_1, \dots, a_{2g+1} \mid c_1, \dots, c_g;\ d_1, \dots, d_g) = (-1)^{j+1} \hat{Y}_j(a_j) \frac{\sigma_3}{4} \hat{Y}_j^{-1}(a_j), \qquad j = 1, \dots, 2g + 1, \tag{35}
\]
which are evaluated explicitly through $Y(\lambda)$, give rise to a solution of the associated Schlesinger equations (4).
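The cyclic relation $M_\infty M_{2g+1} \cdots M_1 = I$ can be checked numerically. The following is a minimal sketch, assuming the reading $\rho_{2k} = -c_g d_g/(c_{k-1} d_k)$, $\rho_{2k-1} = -c_g d_g/(c_{k-1} d_{k-1})$ of (27), with $c_0 = d_0 = 1$, and the matrices $M_j$, $M_\infty$ as in (33)–(34). The function names and the sample values of $c_k$, $d_k$ are hypothetical.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rho(j, c, d):
    """rho_j as read from (27); c = [1, c_1, ..., c_g], d = [1, d_1, ..., d_g]."""
    g = len(c) - 1
    k = (j + 1) // 2                      # j = 2k or j = 2k - 1
    if j % 2 == 0:
        return -c[g] * d[g] / (c[k - 1] * d[k])
    return -c[g] * d[g] / (c[k - 1] * d[k - 1])

def monodromy(j, c, d):
    """M_j = (-1)^j [[0, rho_j], [-1/rho_j, 0]], cf. (33)."""
    r = rho(j, c, d)
    s = -1 if j % 2 else 1
    return [[0, s * r], [-s / r, 0]]

def monodromy_at_infinity(d):
    """M_infinity = [[0, -d_g], [1/d_g, 0]], cf. (34)."""
    return [[0, -d[-1]], [1 / d[-1], 0]]

def cyclic_product(c, d):
    """Return M_infinity * M_{2g+1} * ... * M_1."""
    g = len(c) - 1
    P = [[1, 0], [0, 1]]
    for j in range(1, 2 * g + 2):
        P = matmul(monodromy(j, c, d), P)
    return matmul(monodromy_at_infinity(d), P)

def distance_from_identity(M):
    return max(abs(M[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))

# Sample data for g = 1, 2, 3 (arbitrary nonzero complex numbers).
samples = {
    1: ([1, 0.7 + 0.2j], [1, 1.3 - 0.4j]),
    2: ([1, 0.7 + 0.2j, -1.1 + 0.5j], [1, 1.3 - 0.4j, 0.6 + 0.9j]),
    3: ([1, 0.7 + 0.2j, -1.1 + 0.5j, 2.0j], [1, 1.3 - 0.4j, 0.6 + 0.9j, -0.8 + 0.1j]),
}
errors = {g: distance_from_identity(cyclic_product(c, d))
          for g, (c, d) in samples.items()}
```

Each $M_j$ of this form has trace 0 and determinant 1, so its eigenvalues are $\pm i$, in agreement with (6).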


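Solution for the Riemann–Hilbert Problem

Here we follow the method, which for similar RH problems was first suggested in [DVZ], and then used in [DIZ] and subsequently in [IK].

Reduction to the canonical algebro-geometric RH problem. Given the monodromy data $\{\ln c_k, \ln d_k\}$ let $\Omega(P)$ be the associated abelian integral. Define
\[
Y(\lambda) = \Psi(\lambda) e^{\Omega(\lambda) \sigma_3}, \tag{36}
\]
where $\Omega(\lambda)$ denotes the natural restriction of the integral $\Omega(P)$ on $\mathbb{C}_+ \cup \mathbb{C}_-$: the integration path in (9) lies in $\mathbb{C}_+$ and in $\mathbb{C}_-$ for $\lambda \in \mathbb{C}_+$ and $\lambda \in \mathbb{C}_-$ respectively, and the integral itself is taken on the upper sheet of $\Gamma$. Because of Eqs. (10), (11), and (9) we have that
\[
(\Omega_+ + \Omega_-)\big|_{[a_{2k+1}, a_{2k+2}]} = -2 \sum_{j=k+1}^{g} \int_{a_{2j}}^{a_{2j+1}} d\Omega = -\sum_{j=k+1}^{g} \int_{B_j} d\Omega = \ln \frac{c_k}{c_g}, \qquad k = 0, \dots, g, \quad a_{2g+2} = \infty, \tag{37}
\]
and
\[
(\Omega_+ - \Omega_-)\big|_{[a_{2k}, a_{2k+1}]} = -2 \sum_{j=k}^{g-1} \int_{a_{2j+1}}^{a_{2j+2}} d\Omega = -\int_{A_g} d\Omega + \int_{A_k} d\Omega = \ln \frac{d_k}{d_g}, \qquad k = 0, \dots, g, \quad a_0 = \infty, \quad d_0 = 1. \tag{38}
\]
Equation (37) implies that in terms of $\Psi(\lambda)$ the jump condition (23) takes the form
\[
\Psi_-(\lambda) = \Psi_+(\lambda) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{39}
\]
while the jump condition (24) is replaced by
\[
\Psi_-(\lambda) = \Psi_+(\lambda) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \Psi_+(\lambda), \quad \text{i.e. no jump.} \tag{40}
\]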
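The passage between the jumps of $Y$ and those of $\Psi$ is a short conjugation by $e^{\Omega\sigma_3}$, and it can be replayed numerically. The sketch below assumes the substitution $Y = \Psi e^{\Omega\sigma_3}$ of (36), so that $Y_- = Y_+\, e^{-\Omega_+\sigma_3} J\, e^{\Omega_-\sigma_3}$ whenever $\Psi_- = \Psi_+ J$, and it assumes the reading of (23)–(24) with entries $c_g/c_k$, $-c_k/c_g$ and $d_g/d_k$, $d_k/d_g$. The boundary values $\Omega_\pm$ are arbitrary test numbers constrained only by (37) and (38); the function names are hypothetical.

```python
import cmath

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_sigma3(x):
    """The diagonal matrix e^{x sigma_3}."""
    return [[cmath.exp(x), 0], [0, cmath.exp(-x)]]

def jump_of_Y(jump_of_Psi, omega_plus, omega_minus):
    """Jump J_Y in Y_- = Y_+ J_Y induced by Psi_- = Psi_+ J_Psi and Y = Psi e^{Omega sigma_3}."""
    return matmul(matmul(exp_sigma3(-omega_plus), jump_of_Psi),
                  exp_sigma3(omega_minus))

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

c_k, c_g = 0.7 + 0.2j, -1.1 + 0.5j   # arbitrary nonzero test values
d_k, d_g = 1.3 - 0.4j, 0.6 + 0.9j
omega_plus = 0.25 - 0.6j             # arbitrary boundary value

# On [a_{2k+1}, a_{2k+2}]: Omega_+ + Omega_- = ln(c_k / c_g), cf. (37).
omega_minus = cmath.log(c_k / c_g) - omega_plus
J23 = jump_of_Y([[0, 1], [-1, 0]], omega_plus, omega_minus)
err23 = max_diff(J23, [[0, c_g / c_k], [-c_k / c_g, 0]])

# On [a_{2k}, a_{2k+1}]: Omega_+ - Omega_- = ln(d_k / d_g), cf. (38).
omega_minus = omega_plus - cmath.log(d_k / d_g)
J24 = jump_of_Y([[1, 0], [0, 1]], omega_plus, omega_minus)
err24 = max_diff(J24, [[d_g / d_k, 0], [0, d_k / d_g]])
```

The result does not depend on the particular value chosen for $\Omega_+$, only on the combinations fixed by (37) and (38).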

Therefore, in terms of the function 9(λ) the RH problem (i–iii) transforms to the following canonical RH problem (see Fig. 3): Sg (i0 ) 9(λ) is analytic in CP1 \L0 , L0 = k=0 [a2k+1 , a2k+2 ], a2g+2 = ∞. (ii0 ) The L2 -limits, 9± (λ), of 9(λ) as λ → L0± satisfy the jump condition, 0 1 . (41) 9− (λ) = 9+ (λ) −1 0


(iii0 ) The behavior of 9(λ) as λ → ∞ in C+ is normalized by the relation, σ3 1 1 1 0 1 i − σ 3 λ 4 I +O e−∞ (λ)σ3 , 9(λ) = dg 2 −i0 1 i 1 λ

(42)

where ∞ (λ) as before denotes the principle part of (P ) at infinity, and the constant 0 is given by the expansion, 1 0 1 , P → ∞. (43) (P ) = ∞ (λ) − ln dg + 1/2 + O 2 λ λ3/2

Fig. 3. The contour $L_0$ and the canonical Riemann–Hilbert problem (i′–iii′)

We intend to use the solution of the RH problem (i–iii) to solve the Schlesinger equations (4). This means that the monodromy data $\{\ln c_j, \ln d_j\}$ should be taken to be independent of the $a_j$'s. Thus, our main objective is to solve the RH problem (i–iii) for generic values of parameters $a_j$, $\ln c_j$, and $\ln d_j$, i.e. under condition (22). The generic condition (22), by virtue of the implication (21), means that we should assume that
\[
\Omega_\infty(\lambda) \not\equiv 0, \tag{44}
\]
in setting up the RH problem (i′–iii′). In fact, the solution of the RH problem (i–iii), which we will eventually obtain (see (66) below), will also be valid for the special submanifold $\Pi_0$.

Solution for the canonical RH problem. Let
\[
\beta(\lambda) = \Bigl[ (\lambda - a_{2g+1}) \prod_{l=1}^{g} \frac{\lambda - a_{2l-1}}{\lambda - a_{2l}} \Bigr]^{\frac14}. \tag{45}
\]
The function $\beta(\lambda)$ is defined as a single-valued analytic function in $\mathbb{C}P^1 \setminus L_0$, normalized by the asymptotic condition,
\[
\beta(\lambda) \sim \lambda^{1/4}, \qquad \lambda \to i\infty, \quad \arg \lambda = \frac{\pi}{2}. \tag{46}
\]
We also note that the boundary values of $\beta(\lambda)$ on the contour $L_0$ satisfy the equation,
\[
\beta_-(\lambda) = i \beta_+(\lambda), \qquad \lambda \in L_0. \tag{47}
\]
The function $\beta^4(\lambda) - 1$ has precisely $g + 1$ zeros $\lambda_0, \lambda_1, \dots, \lambda_g$ in $\mathbb{C}$. At each $\lambda_j$ either
\[
\beta(\lambda) - \beta^{-1}(\lambda) = 0, \quad \text{or} \quad \beta(\lambda) + \beta^{-1}(\lambda) = 0.
\]
Under the involution, $P \to P^*$, which switches the sheets of $\Gamma$ (see (8)), the function $\beta(\lambda)$ behaves so that
\[
\beta(\lambda) + \beta^{-1}(\lambda) \to \pm i \bigl( \beta(\lambda) - \beta^{-1}(\lambda) \bigr).
\]
Therefore the zero divisor,
\[
\mathcal{D}_0 = \sum_{j=0}^{g} P_j, \qquad P_j = (\lambda_j, w_j) \in \Gamma, \tag{48}
\]
of the sum $\beta(\lambda) + \beta^{-1}(\lambda)$ considered as a function on $\Gamma$ is well defined (though the function itself is not). Moreover,
\[
\mathcal{D}_0^* = \sum_{j=0}^{g} P_j^*
\]
is the zero divisor of the function $\beta(\lambda) - \beta^{-1}(\lambda)$. We assume that the reduced divisor,
\[
\mathcal{D} = \sum_{j=1}^{g} P_j, \tag{49}
\]
is a nonspecial positive divisor of degree g. This property is generic for $(a_1, \dots, a_{2g+1})$ in $\mathbb{C}^{2g+1}$, and is true, in particular, if the numbers $\lambda_1, \dots, \lambda_g$ are distinct in $\mathbb{C}$ (see e.g. [Mum]). Set
\[
N(\lambda) = \begin{pmatrix} \beta + \beta^{-1} & -i\beta + i\beta^{-1} \\ i\beta - i\beta^{-1} & \beta + \beta^{-1} \end{pmatrix}. \tag{50}
\]
From Eq. (47) it follows that the matrix $N(\lambda)$ satisfies the jump condition (41). Nevertheless, because of the generic assumption (44), it does not solve the RH problem (i′–iii′): the asymptotic condition (42) at infinity is not satisfied. To make $N(\lambda)$ into a genuine solution of the problem (i′–iii′) we now use a certain Baker–Akhiezer deformation of $N(\lambda)$ (cf. [DIZ]).
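The claim that the matrix of (50) inherits the jump (41) from the scalar relation (47) is a purely algebraic statement, so it can be tested with an arbitrary complex number in place of the boundary value $\beta_+$. The sketch below assumes the reading $N = \begin{pmatrix} \beta + \beta^{-1} & -i\beta + i\beta^{-1} \\ i\beta - i\beta^{-1} & \beta + \beta^{-1} \end{pmatrix}$ of (50) and the jump matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ of (41); the function name is hypothetical.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def N_of_beta(b):
    """The matrix N of (50) evaluated at a given value b of beta."""
    return [[b + 1 / b, -1j * b + 1j / b],
            [1j * b - 1j / b, b + 1 / b]]

J = [[0, 1], [-1, 0]]               # jump matrix of (41)

beta_plus = 0.8 - 1.7j              # arbitrary nonzero test value
beta_minus = 1j * beta_plus         # relation (47)

lhs = N_of_beta(beta_minus)         # N_-
rhs = matmul(N_of_beta(beta_plus), J)   # N_+ J
jump_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

Nplus = N_of_beta(beta_plus)
determinant = Nplus[0][0] * Nplus[1][1] - Nplus[0][1] * Nplus[1][0]
```

The determinant equals $(\beta + \beta^{-1})^2 - (\beta - \beta^{-1})^2 = 4$ for every nonzero $\beta$, so $N(\lambda)$ is invertible wherever $\beta$ is finite and nonzero.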


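Let us define on $\Gamma$ the following pair of the Baker–Akhiezer functions:
\[
\Psi_1(P) = \frac{\theta\bigl( \int_\infty^P \omega - \mathbf{V} - \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^P \omega - \mathbf{D} \bigr)}\, e^{-h(P)}, \tag{51}
\]
\[
\Psi_2(P) = \frac{\theta\bigl( \int_\infty^P \omega - \mathbf{V} + \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^P \omega + \mathbf{D} \bigr)}\, e^{-h(P)}, \tag{52}
\]
where

1. $\theta(z) = \sum_{m \in \mathbb{Z}^g} \exp\bigl\{ \tfrac12 (Bm, m) + (z, m) \bigr\}$ is the Riemann theta function (matrix $B$ is given in (19)) which satisfies the canonical periodicity equation,
\[
\theta(z + 2\pi i n + Bm) = \exp\Bigl\{ -\frac12 (Bm, m) - (z, m) \Bigr\}\, \theta(z). \tag{53}
\]

2. The vector $\mathbf{D}$ belongs to the Jacobian of $\Gamma$ and is defined by the equation $\mathbf{D} = A(\mathcal{D}) + K$, where $A(\mathcal{D})$ is the abelian mapping with base point at infinity,
\[
A(\mathcal{D}) = \sum_{j=1}^{g} \int_\infty^{P_j} \omega, \tag{54}
\]
$K$ is the associated vector of Riemann constants, and the divisor, $\mathcal{D} = \mathcal{D}_0 - P_0$, is the reduced zero divisor of the function $\beta(\lambda) + \beta^{-1}(\lambda)$ (see (49)). Also from (54) it follows that
\[
A(\mathcal{D}^*) = -A(\mathcal{D}), \tag{55}
\]
as points on the Jacobian.

3. The abelian integral $h(P)$ is defined in (17), (16).

4. The vector $\mathbf{V}$ is the vector of the B-periods of the integral $h$:
\[
V_k = \int_{B_k} dh = (\text{see (19)}) = \frac{i}{2\pi} \sum_{j=1}^{g} B_{kj} \ln d_j + \ln \frac{c_k}{c_{k-1}}, \qquad k = 1, \dots, g. \tag{56}
\]
Note that $\mathbf{V} = 0$ if and only if the generic condition (22) is violated.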
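The periodicity property (53) is the main tool used in the verifications that follow, and it is easy to test in genus 1, where the theta series is a sum over a single integer. The sketch below is an illustration only: it truncates the series $\theta(z) = \sum_m \exp\{\tfrac12 B m^2 + z m\}$ at a hypothetical cutoff and uses an arbitrary value of $B$ with negative real part, which is what the normalization $\int_{A_k} \omega_j = 2\pi i \delta_{jk}$ requires for convergence.

```python
import cmath

def theta(z, B, cutoff=40):
    """Truncated genus-1 theta series sum_m exp(B m^2 / 2 + z m), |m| <= cutoff."""
    return sum(cmath.exp(0.5 * B * m * m + z * m)
               for m in range(-cutoff, cutoff + 1))

B = -3.0 + 1.0j        # assumed sample period; Re B < 0 makes the series converge
z = 0.3 + 0.7j         # arbitrary argument
n, m = 1, 2            # integers entering the shift 2*pi*i*n + B*m

shifted = theta(z + 2j * cmath.pi * n + B * m, B)
predicted = cmath.exp(-0.5 * B * m * m - z * m) * theta(z, B)
periodicity_error = abs(shifted - predicted)

# theta is even in z, a fact used when passing between the two sheets.
evenness_error = abs(theta(z, B) - theta(-z, B))
```

In genus g the same check applies with $m \in \mathbb{Z}^g$ and the quadratic form $(Bm, m)$; the cutoff then has to be chosen according to the smallest eigenvalue of $-\Re B$.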


By standard arguments (see e.g. [BBEIM]) based on the periodicity property (53) of the theta function, the functions 91,2 (P ) are single-valued on 0. Moreover, by their definitions and in view of (55), the functions 91 (P ) and 92 (P ) only have poles at the divisors D and D∗ respectively. They also have an essential singularity at P = ∞, which is described by the equations, 1 −∞ (λ) θ (V + D) 1+O √ , (57) 91 (P ) = e θ (D) λ P → ∞, 92 (P ) = e

−∞ (λ) θ

1 (V − D) , 1+O √ θ (D) λ

(58)

P → ∞. Therefore, deforming the matrix N(λ) (see (50)) according to the rule, N(λ) → 9(λ) = 3Ndeformed (λ) ≡ (β + β −1 )91 (λ) −i(β − β −1 )91∗ (λ) , ≡3 i(β − β −1 )92 (λ) (β + β −1 )92∗ (λ) where

(59)

91,2 (λ) = 91,2 (P ), P = (λ, w) ∈ upper sheet of 0, ∗ (λ) = 91,2 (P ∗ ), P ∗ = (λ, −w) ∈ lower sheet of 0, 91,2

and taking into account that ∗ ]∓ (λ), λ ∈ L0 , [91,2 ]± (λ) = [91,2

we come to a solution of (i0 , ii0 ). In order to satisfy (iii0 ), and hence to produce a solution of the canonical RH problem (i0 –iii0 ), the matrix 3 must be chosen to satisfy the normalization condition (42), i.e. σ3 1 0 −1 2 N∞ , (60) 3 = dg −i0 1 where N∞ is determined by the asymptotic equation: 1 − 41 σ3 1 i λ e−∞ (λ)σ3 , Ndeformed (λ) = N∞ I + O i 1 λ C+ 3 λ → ∞. From the Eqs. (57) and (58) written up to the term λ−1 = λ−1/2 σ3 1 θ+ iθ− 2 dg 3= , iκ− θ+ κ+ θ− κ+ + κ −

2

(61)

it follows that (62)

where θ± =

θ(D) , θ (V ± D)

(63)


κ± = 1 + 2

g X


θ (D ± V) i ± ln dl , ∂zl ln θ (D) 2π

ωl1

l=1

∂zl θ (F) ≡ ∂zl θ (z) |z=F

(64)

and the constants ωl1 = ωl1 (a1 , . . . , a2g+1 ) are defined through the expansion, ∞

ωl (P ) X = ωlk λ−k−1/2 , P → ∞. dλ

(65)

k=1

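Equations (59) and (36) complete the solution of the original RH problem (i–iii). The final expression for $Y(\lambda)$ is given by the formula,
\[
Y(\lambda) = \Lambda \begin{pmatrix}
(\beta + \beta^{-1}) \dfrac{\theta\bigl( \int_\infty^\lambda \omega - \mathbf{V} - \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^\lambda \omega - \mathbf{D} \bigr)} &
-i(\beta - \beta^{-1}) \dfrac{\theta\bigl( \int_\infty^\lambda \omega + \mathbf{V} + \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^\lambda \omega + \mathbf{D} \bigr)} \\[4mm]
i(\beta - \beta^{-1}) \dfrac{\theta\bigl( \int_\infty^\lambda \omega - \mathbf{V} + \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^\lambda \omega + \mathbf{D} \bigr)} &
(\beta + \beta^{-1}) \dfrac{\theta\bigl( \int_\infty^\lambda \omega + \mathbf{V} - \mathbf{D} \bigr)}{\theta\bigl( \int_\infty^\lambda \omega - \mathbf{D} \bigr)}
\end{pmatrix}
\exp\Bigl\{ -\frac{i}{2\pi} \sum_{j=1}^{g} \ln d_j \int_{a_{2g+1}}^{\lambda} \omega_j\, \sigma_3 \Bigr\}, \tag{66}
\]
where all the abelian integrals are taken on the upper sheet of $\Gamma$, and the integration paths lie in $\mathbb{C}_+$ and in $\mathbb{C}_-$ for $\lambda \in \mathbb{C}_+$ and $\lambda \in \mathbb{C}_-$ respectively. It is worth noticing that any ambiguity (see (13)) related to the choice of the abelian integrals $\Omega(P)$ and $h(P)$ has disappeared. The integral $h(P)$ is not involved in the final formula (66) at all. The right-hand side of (66) only depends on the basic parameters, $a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g$. In fact, as soon as formula (66) for the solution $Y(\lambda)$ of the RH problem (i–iii) is written down, it can be verified by direct calculation using the periodicity property (53) of the theta function. The calculations that led to (66) were based on the assumption that certain quantities (e.g. $\theta(\mathbf{D})$) were nonzero. In order to guarantee that these calculations are valid, it is sufficient to make the following generic assumptions on $a_j$, $\ln c_j$, $\ln d_j$ in addition to (1):
\[
\theta(\mathbf{D}_0 - \mathbf{V})\, \theta(\mathbf{D}_0)\, \theta(\mathbf{D}_0 + \mathbf{V}) \neq 0, \tag{67}
\]
where
\[
\mathbf{D}_0 = A(\mathcal{D}_0) + K \equiv \sum_{j=0}^{g} \int_\infty^{P_j} \omega + K,
\]
and $\mathcal{D}_0$, as in (48), denotes a complete zero divisor of the sum $\beta(\lambda) + \beta^{-1}(\lambda)$. This condition arises for the following reasons. First we notice that
\[
\theta(\mathbf{D}_0) = \theta\Bigl( \int_\infty^{P_0} \omega + \mathbf{D} \Bigr) = \theta\Bigl( \int_\infty^{P_0^*} \omega - \mathbf{D} \Bigr),
\]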


and hence the inequality $\theta(\mathbf{D}_0) \neq 0$ means that the theta functions $\theta\bigl( \int_\infty^P \omega \pm \mathbf{D} \bigr)$ are not identically zero on $\Gamma$. This fact in turn (see e.g. [Mum]) is equivalent to the nonspeciality of the reduced divisors $\mathcal{D}$ and $\mathcal{D}^*$ (note that $\mathbf{D}^* = -\mathbf{D}$, in view of (55)). Secondly, for our construction to be consistent, we need the a priori invertibility of the matrix $N_\infty$ (see (60) and (61)). We have that
\[
\det N_\infty = \lim_{\lambda \to \infty} \det N_{\text{deformed}}(\lambda).
\]
On the other hand, since the determinants of all the jump matrices of the RH problem (i′–iii′) are equal to 1, the determinant of $N_{\text{deformed}}(\lambda)$ is holomorphic on $\mathbb{C}P^1$. Therefore, it is a constant. We can evaluate this constant taking $\lambda = \lambda_0$ in (59), which corresponds to the point $P_0$ (see (48)). The substitution makes the matrix $N_{\text{deformed}}(\lambda)$ either off diagonal (the point $P_0$ lies on the upper sheet of $\Gamma$) or diagonal (the point $P_0$ lies on the lower sheet of $\Gamma$). This in turn shows that up to a nonzero multiplier the determinant of $N_\infty$ coincides with the product, $\theta(\mathbf{D}_0 - \mathbf{V})\, \theta(\mathbf{D}_0 + \mathbf{V})$, and, together with $\theta(\mathbf{D}_0) \neq 0$, condition (67) follows. Note that the invertibility of $N_\infty$ (i.e. the existence of matrix $\Lambda$) implies the inequality,
\[
\kappa_+ + \kappa_- \neq 0. \tag{68}
\]
In the special case,
\[
(a_1, \dots, a_{2g+1};\ \ln c_1, \dots, \ln c_g;\ \ln d_1, \dots, \ln d_g) \in \Pi_0 \iff \mathbf{V} = 0, \tag{69}
\]

the theta functions disappear from all the equations above, and direct verification of formula (66) for the solution Y (λ) of the RH problem (i–iii) becomes straightforward. In this case, formula (66) reduces to the formula, Y (λ) = 30

β + β −1 −iβ + iβ −1 iβ − iβ −1 β + β −1

where 1 σ3 3 = d2 2 0

e

Rλ j =1 ln dj a2g+1

Pg

i − 2π

1 i 0 κ0 iκ− +

ωj σ3

,

(70)

,

(71)

and 0 κ±

g i X = 1 ± 0 = 1 ± ωl1 ln dl . π

(72)

l=1

If in addition we assume that ln cj = ln dj = 0, ∀j,

(73)

formula (70) in turn reduces to the trivial equation, 1 Y (λ) = 2

1i i 1

β + β −1 −iβ + iβ −1 iβ − iβ −1 β + β −1

≡β

−σ3

1i , i 1

(74)


which obviously provides a solution of the RH problem (i–iii) corresponding to the choice, $c_j = d_j = 1$, $\forall j$. Finally observe that as $Y(\lambda)$ is the (unique) solution of the RH problem (i–iii) which only depends on the $a_j$'s, $c_j$'s, and $d_j$'s, the function $Y(\lambda)$ does not depend on the particular branches of $\ln c_k$, $\ln d_k$. Thus, $a_1, \dots, a_{2g+1};\ c_1, \dots, c_g;\ d_1, \dots, d_g$ are our effective parameters. It is, in fact, not difficult to see this directly from (66); one again has to take into account the periodicity property (53) of the theta function. The explicit formulae obtained above for the solution $Y(\lambda)$ of the Riemann–Hilbert problem (i–iii) constitute the main result of the paper, which can be formulated as the following theorem.

Theorem 1. Let g be a positive integer, g ≥ 1, and let $a_1, \dots, a_{2g+1};\ c_1, \dots, c_g;\ d_1, \dots, d_g$ be complex numbers such that
\[
a_j \neq a_k, \quad c_j \neq 0, \quad d_j \neq 0, \qquad \forall j, k, \ j \neq k, \tag{75}
\]
and let
\[
\theta(z) = \sum_{m \in \mathbb{Z}^g} \exp\Bigl\{ \frac12 (Bm, m) + (z, m) \Bigr\}, \qquad B_{kj} = \int_{B_k} \omega_j,
\]
denote the Riemann theta function associated with the hyperelliptic curve,
\[
\Gamma = \Bigl\{ P = (\lambda, w) : w^2 = \prod_{j=1}^{2g+1} (\lambda - a_j) \Bigr\}.
\]
(We shall assume that the basis $\{A_k, B_k\}_{k=1}^{g}$ of the group $H_1(\Gamma)$ and the corresponding basis of the holomorphic differentials $\omega_j(P)$ are chosen as indicated in the introduction.) Denote also
\[
\mathcal{D}_0 = \sum_{j=0}^{g} P_j, \qquad P_j \in \Gamma,
\]
the zero divisor of the sum $\beta(\lambda) + \beta^{-1}(\lambda)$, where the function $\beta(\lambda)$ is defined by the equation (cf. (45) and (46)),
\[
\beta(\lambda) = \Bigl[ (\lambda - a_{2g+1}) \prod_{l=1}^{g} \frac{\lambda - a_{2l-1}}{\lambda - a_{2l}} \Bigr]^{\frac14}.
\]
Suppose that in addition to (75) the following generic assumption on the parameters $a_j$, $c_j$, $d_j$ is satisfied,
\[
\theta(\mathbf{D}_0 - \mathbf{V})\, \theta(\mathbf{D}_0)\, \theta(\mathbf{D}_0 + \mathbf{V}) \neq 0, \tag{76}
\]
where
\[
\mathbf{D}_0 = A(\mathcal{D}_0) + K \equiv \sum_{j=0}^{g} \int_\infty^{P_j} \omega + K, \qquad \mathbf{V} = (V_1, \dots, V_g), \qquad V_k = \frac{i}{2\pi} \sum_{j=1}^{g} B_{kj} \ln d_j + \ln \frac{c_k}{c_{k-1}}, \quad k = 1, \dots, g,
\]
and $K$ stands for the vector of Riemann constants associated with the abelian mapping $A$. Then, the 2 × 2 matrix function $Y(\lambda) \equiv Y(\lambda \mid a_1, \dots, a_{2g+1};\ c_1, \dots, c_g;\ d_1, \dots, d_g)$, defined by the formulae (62)–(66), solves the Riemann–Hilbert problem (i–iii). In addition, the function $Y(\lambda)$ satisfies a 2 × 2 Fuchsian system,
\[
\frac{dY(\lambda)}{d\lambda} = \sum_{j=1}^{2g+1} \frac{A_j}{\lambda - a_j}\, Y(\lambda),
\]
whose monodromy matrices at the singular points are given by the equations (cf. (33) and (34)),
\[
M_j = (-1)^j \begin{pmatrix} 0 & \rho_j \\ -\rho_j^{-1} & 0 \end{pmatrix}, \qquad \rho_j = \begin{cases} -\dfrac{c_g d_g}{c_{k-1} d_k} & \text{for } j = 2k, \\[3mm] -\dfrac{c_g d_g}{c_{k-1} d_{k-1}} & \text{for } j = 2k - 1, \end{cases} \tag{77}
\]
at $\lambda = a_j$, $j = 1, \dots, 2g + 1$, and
\[
M_\infty = \begin{pmatrix} 0 & -d_g \\ d_g^{-1} & 0 \end{pmatrix}, \tag{78}
\]

at λ = ∞. Solution for the Schlesinger Equations Equation (66) for the function Y (λ) allows us to evaluate explicitly the holomorphic factor Yˆj (λ) in representation (26). In particular, we obtain that (−1)j 2

Yˆj (aj ) = βj

1

e2

Pg

l=[(j +1)/2] Vl

3j ,

(79)

where βj =

2g+1 Y

(aj − al )

(−1)l+1 2

≡ (λ − aj )

l6=j, l=1 1

(−1)j 2

β

2 λ=aj

(the branch of (λ − aj ) 2 is the same as in (26))

,

(80)


and matrix 3j is given by the equations (cf. (62)–(65)), (−1)j −1 σ23 dg 3j = κ+ + κ− j −1 ×σ3

(j )

κ+ i (j ) iκ− 1

! j −1

σ2

(j )

θ± =

(j ) κ±

=

(−1)j −1 βj

−2

(j )

g X l=1

σ3

(−ρj )− 2 ,

×

σ2 =

0 −i i 0

,

θ(D)θ(Ej − V ∓ D) , θ(V ± D)θ (Ej ∓ D)

(j ) ωl1

(j )

!

(j )

θ+ iθ− (j ) (j ) iκ− θ+ κ+ θ−

(81)

(82)

! θ D ± V + Ej i ± ln dl . ∂zl ln 2π θ D + Ej

(83)

(j )

In (83), (82) the constants ωl1 = ωl1 (a1 , . . . , a2g+1 ) are defined through the expansion, ∞

ωl (P ) X (j ) = ωlk (λ − aj )k−3/2 , P → aj , dλ

(84)

k=1

and the half-period Ej is defined by the equations, (Ej )l = πiδl[j/2] −

1 2

g X

Brl .

(85)

r=[(j +1)/2]

It is worth mentioning that (−1)j +1 −

det 3j = βj

e

Pg

l=[(j +1)/2] Vl

.

(86)

Indeed, a priori we have that det Yˆj (aj ) = 1. Equation (86) then follows from (79). From (35) we conclude that Aj = (−1)j +1 3j

σ3 −1 3j , 4

(87)

which together with Eq. (81) yields a 2g-parameter family of explicit solutions of the Schlesinger equations (4). At the same time, equations (81)–(87) provide an explicit solution of the inverse monodromy problem, {M1 , . . . , M2g+1 ; a1 , . . . , a2g+1 , ∞} 7 → {A1 , . . . , A2g+1 ; a1 , . . . , a2g+1 , ∞},

(88)

where the monodromy matrices $M_j$ are given in (77). We emphasize that solution (81)–(87) is nonsingular as long as condition (76) is assumed to hold. It is of interest to compare this result with the general theory of singularities of the solutions of the Schlesinger equations (see [Mal, Miw]). We also notice that the choice, $c_j = d_j = 1$, $\forall j$, yields (cf. (74)) the trivial solution of the Schlesinger equations,
\[
A_j = (-1)^j \frac{\sigma_3}{4}. \tag{89}
\]
Warning. One cannot use the “intermediate” formula (70) to generate solutions of Schlesinger equations. Monodromy data corresponding to (70) depend on the $a_j$'s.
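The trivial solution can be cross-checked against the function $\beta(\lambda)$ of (45). For $c_j = d_j = 1$ formula (74) gives $Y = \beta^{-\sigma_3} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$, so $\frac{dY}{d\lambda} Y^{-1} = -\frac{\beta'}{\beta} \sigma_3$, and the residues $A_j = (-1)^j \sigma_3/4$ correspond to the scalar identity $-\beta'/\beta = \sum_j (-1)^j / (4(\lambda - a_j))$. The sketch below tests this identity by a central finite difference. It is an illustration under assumptions: the branch points, the evaluation point and the step size are arbitrary choices, and the principal branch of the fourth root is used, which is harmless here because the logarithmic derivative does not depend on the branch as long as no cut is crossed within the step.

```python
def beta(lam, a):
    """beta(lambda) of (45) for branch points a = [a_1, ..., a_{2g+1}] (principal branch)."""
    g = (len(a) - 1) // 2
    value = lam - a[2 * g]                                    # (lambda - a_{2g+1})
    for l in range(1, g + 1):
        value *= (lam - a[2 * l - 2]) / (lam - a[2 * l - 1])  # (lambda - a_{2l-1}) / (lambda - a_{2l})
    return value ** 0.25

def minus_log_derivative(lam, a, h=1e-6):
    """Central-difference approximation of -beta'(lambda) / beta(lambda)."""
    return -(beta(lam + h, a) - beta(lam - h, a)) / (2 * h) / beta(lam, a)

def fuchsian_coefficient(lam, a):
    """Coefficient of sigma_3 in sum_j A_j / (lambda - a_j) with A_j = (-1)^j sigma_3 / 4."""
    return sum((-1) ** j / (4 * (lam - a_j)) for j, a_j in enumerate(a, start=1))

branch_points = [0.0, 1.0, 3.0]     # a_1, a_2, a_3 (g = 1), arbitrary
lam = 0.5 + 2.0j                    # evaluation point away from the cuts
mismatch = abs(minus_log_derivative(lam, branch_points)
               - fuchsian_coefficient(lam, branch_points))
```

Since all the matrices $A_j = (-1)^j \sigma_3/4$ are constant and commute with each other, both terms of the Schlesinger equations (4) vanish separately for this solution.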

Remark (Hitchin’s solution of the sixth Painlevé equation). In the case $g = 1$, $a_1 = 0$, $a_2 = t$, $a_3 = 1$, the Schlesinger equations (4) lead (see [JM]) to the sixth Painlevé equation,
\[
\frac{d^2 y}{dt^2} = \frac12 \Bigl( \frac{1}{y} + \frac{1}{y - 1} + \frac{1}{y - t} \Bigr) \Bigl( \frac{dy}{dt} \Bigr)^2 - \Bigl( \frac{1}{t} + \frac{1}{t - 1} + \frac{1}{y - t} \Bigr) \frac{dy}{dt} + \frac{y(y - 1)(y - t)}{t^2 (t - 1)^2} \Bigl( \alpha + \beta \frac{t}{y^2} + \gamma \frac{t - 1}{(y - 1)^2} + \delta \frac{t(t - 1)}{(y - t)^2} \Bigr), \tag{90}
\]
for the function
\[
y(t) = \frac{t}{t + (t - 1) u(t)}, \qquad u = \frac{(A_3)_{12}}{(A_1)_{12}}. \tag{91}
\]
Using the general relations (see again [JM]) between the monodromy exponents in representation (26) and the parameters $\alpha$, $\beta$, $\gamma$, $\delta$ in (90), we see that in our case,
\[
\alpha = \frac18, \quad \beta = -\frac18, \quad \gamma = \frac18, \quad \delta = \frac38, \tag{92}
\]
as in [H1]. Equations (91) and (87) give an explicit representation for the general solution of the Painlevé equation (90), (92) in terms of the elliptic theta functions depending on the two complex constants $c_1$ and $d_1$. These formulae do not coincide exactly with the ones found in [H1]. Nevertheless, both the expressions describe the same Painlevé function since in the case g = 1 the monodromy data indicated in (77) is the same as in [H1]. Of course the two expressions can also be shown to be equal by direct calculation.

Acknowledgements. The authors would like to thank Hermann Flaschka and John Harnad for attracting the authors’ attention to the papers [H1, H2 and Man]. The authors would also like to thank Alexander Bobenko, Andrei Bolibruch, Igor Krichever, and Samson Shatashvili for many useful and stimulating discussions. Percy Deift was supported in part by NSF grant #DMS-9500867. Alexander Its was supported in part by NSF grant #DMS-9501559. Andrei Kapaev was supported in part by RFFI grant #96-01-00668. Xin Zhou was supported in part by NSF grant #DMS-9706644.

References

[B] Bolibruch, A.A.: The Riemann–Hilbert problem. Russian Math. Surveys 45:2, 1–47 (1990)
[BE] Bateman, H., Erdélyi, A.: Higher transcendental functions, v. II. New York, Toronto, London: McGraw-Hill Book Company, Inc., 1953
[BBEIM] Belokolos, E.D., Bobenko, A.I., Enol’skii, V.Z., Its, A.R., Matveev, V.B.: Algebro-geometric approach to nonlinear integrable equations. Berlin–Heidelberg: Springer-Verlag, 1994
[D] Dubrovin, B.A.: Geometry of 2D topological field theories. In: Springer LNM. 1620, 1996, pp. 120–348
[DIZ] Deift, P.A., Its, A.R. and Zhou, X.: A Riemann–Hilbert approach to asymptotic problems arising in the theory of random matrix models, and also in the theory of integrable statistical mechanics. Ann. of Math. 146, 149–235 (1997)

Algebro-Geometric Integration of Schlesinger Equations

[DVZ] [FK] [FT] [JM] [JMU] [H1] [H2] [IK] [KK] [Ko] [Kr] [KoM] [KrPh] [LO] [Mal] [Mam] [Man] [Miw] [Sch]

633

Deift, P.A., Venakides, S. and Zhou, X.: The collisionless shock region for the long-time behavior of solutions of the KdV equation. Comm. Pure Appl. Math. 47, 199–206 (1994) Farkas, H.M., Kra, I.: Riemann Surfaces. Berlin–Heidelberg: Springer-Verlag, 1980 Faddeev, L.D. and Takhtajan, L.A.: Hamiltonian methods in the theory of solitons. Berlin– Heidelberg: Springer-Verlag, 1987 Jimbo,M. and Miwa, T.: Monodromy preserving deformation of linear ordinary differential equations with rational coefficients II. Physica D2, 407–448 (1981) Jimbo, M., Miwa, T. and Ueno, K.: Monodromy preserving deformation of linear ordinary differential equations with rational coefficients I. Physica D2, 306–352 (1981) Hitchin, N.J.: Twistor spaces, Einstein metrics and isomonodromic deformations. J. Diff. Geom. 42, N.1 July, 30–112 (1995) Hitchin, N.J.: Frobenius manifolds. Preprint University of Cambridge (1996) Its, A.R. and Kapaev, A.A.: Elliptic asymptotics of the second Painlevé transcendent via the nonlinear steepest descent method. In preparation Kitaev, A.V., Korotkin, D.A.: On solutions to the Schlesinger equations in terms of 2-functions. Preprint, Max-Planck-Institute, Potsdam (1998) Korotkin, D.A.: Finite-zone solutions of the stationary, axially symmetric Einstein equations in vacuum. Teoret. Mat. Fiz. 77, 1, 25–41 (1988) Krichever, I.M.: The τ - function of the universal Whitham hierarchy, matrix models, and topological field theories. Comm. Pure Appl. Math. 47, 437–475 (1994) Korotkin, D.A. and Matveev, V.B.: Algebro-geometric solutions of the gravitational equations. Leningrad Math. J. (Algebra and Analyse), 1, 2, 379–408 (1990) Krichever, I.M., Phong, D.H.: On the integrable geometry of soliton equations and N = 2 supersymmetric gauge theories. Preprint Columbia University (1996) Levin, A.M., Olshanetsky, M.A.: Classical limit of the Knizhnik–Zamolodchikov–Bernard equations as hierarchy of isomonodromic deformations. 
Free field approach ITEP-TH45/97 hepth/9709207 (1997) Malgrange, B.: Sur les deformations isomonodromiques. Séminaire E. N. S., IV, n0 7, 8 (1981– 1982) Mamford, D.: Tata lectures on Theta I, II: Progress in Mathematics, 28, 48, Basel–Boston: Birkhäuser, 1983, 1984 Manin, Yu.I.: Sixth Painleve Equation, Universal Elliptic Curve, And Mirror of P2 . Preprint, MaxPlanck-Institute, alg-geom/9605010 (1996) Miwa, T.: Painlevé property of monodromy preserving equations and the analyticity of τ function. Publ. R. I. M. S. Kyoto University 17-2, 703–721 (1981) Schlesinger, L.: Über eine Klasse von Differentialsystemen beliebiger Ordnung mit festen kritischen Punkten. J. für Math. 141, 96–145 (1912)

Communicated by T. Miwa

Commun. Math. Phys. 203, 635 – 647 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

A Farey Fraction Spin Chain

P. Kleban1, A. E. Özlük2

1 Department of Physics and Astronomy and Laboratory for Surface Science and Technology, University of Maine, Orono, ME 04469, USA. E-mail: [email protected]

2 Department of Mathematics and Statistics, University of Maine, Orono, ME 04469, USA

Received: 20 August 1998/ Accepted: 17 December 1998

Abstract: We introduce a new number-theoretic spin chain and explore its thermodynamics and connections with number theory. The energy of each spin configuration is defined in a translation-invariant manner in terms of the Farey fractions, and is also expressed using Pauli matrices. We prove that the free energy exists and a phase transition occurs for positive inverse temperature β = 2. The free energy is the same as that of a related, non-translation-invariant number-theoretic spin chain. Using a number-theoretic argument, the low-temperature (β > 3) state is shown to be completely magnetized for long chains. The number of states of energy E = log(n) summed over chain length is expressed in terms of a restricted divisor problem. We conjecture that its asymptotic form is (n log n), consistent with the phase transition at β = 2, and suggesting a possible connection with the Riemann ζ-function. The spin interaction coefficients include all even many-body terms and are translation invariant. Computer results indicate that all the interaction coefficients, except the constant term, are ferromagnetic.

1. Introduction

There has been considerable recent interest in the area of statistical mechanical models inspired by or closely connected with number theory. An overall goal of this work is to illuminate connections between the two disciplines, so that each might provide new insights and techniques useful for the other. There is, more specifically, the hope that a direct link between the Riemann hypothesis and the Lee–Yang theory of phase transitions, based on zeros of the partition function, will eventually emerge.

In this paper, we introduce and examine a new equilibrium statistical mechanics spin chain model based on the number-theoretic Farey fractions. Our model is closely related to the number theoretic spin chain studied by Knauf (see references given below) and others (see [Cv]) and in fact has the same free energy. However, it differs from that previous work in several respects. Perhaps most important, our model is translation invariant by construction. In addition, our matrix formulation clarifies some of the results


on the number theoretic spin chain. However, we have yet to conclusively demonstrate a connection with the Riemann ζ-function, as is the case (by construction) for the previous model at low temperatures.

In Sect. 2 we define the new model, and point out its connection to the number theoretic spin chain studied by Knauf and others, which we refer to as KSC below. We also exhibit exact results for the partition function at certain temperatures. Section 3 contains a first proof that the free energy (of the infinite system) exists and has a unique phase transition at β = 2. In fact, it is equal to the KSC free energy, establishing a direct connection between the two models. In Sect. 4 we examine the number of states, and show it is related to a certain number-theoretic restricted divisor problem. Using a conjectured asymptotic form for the summed (over all chain lengths) number of states we then suggest a connection with the Riemann ζ-function. Sect. 5 uses the results of the previous section to prove the existence of a phase transition in a different way, and also shows that the infinite system is in a completely magnetized state at low temperatures. In Sect. 6 we examine the spin interaction coefficients, which include all even many-body terms and are translation invariant. Our numerical results indicate that all the interaction coefficients, except the constant term, are ferromagnetic. We also discuss the question of whether the Farey and KSC interactions are the same in the limit of long chains.

2. Definition of the Farey Spin Chain

We begin with a preliminary definition and then proceed to construct the Farey fractions.

Definition. The mediant of the rational numbers $\frac{a}{b}$ and $\frac{c}{d}$ is $\frac{a+c}{b+d}$.

Note that, for $\frac{a}{b} < \frac{c}{d}$ and $\frac{a}{b}, \frac{c}{d} \in [0, 1]$,
\[ \frac{a}{b} < \frac{a+c}{b+d} < \frac{c}{d}. \tag{1} \]

Definition. The Farey fractions are defined by starting with the set $F_1 = \{\frac{0}{1}, \frac{1}{1}\}$ and recursively inserting mediants of each neighboring pair of fractions. $F_n$ denotes the sequence of fractions generated by this procedure up to and including the nth step, so that $F_1 = \{\frac{0}{1}, \frac{1}{1}\}$, $F_2 = \{\frac{0}{1}, \frac{1}{2}, \frac{1}{1}\}$, $F_3 = \{\frac{0}{1}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{1}{1}\}$, etc. $F_n$ includes all rational fractions in [0, 1] with denominators d ≤ n, and others with larger denominators. Obviously $F_1 \subset F_2 \subset F_3 \subset \cdots \subset F_{n+1} \subset \cdots$. We let $F_\infty$ denote the set of all fractions generated in this manner. It follows that $F_\infty = \mathbb{Q} \cap [0, 1]$.

Throughout this paper we employ a matrix representation of the elements of $F_n$. Suppose $F_m = \{x_0^m, x_1^m, \ldots, x_{2^{m-1}}^m\}$, where $x_0^m = \frac{0}{1}$, $x_{2^{m-1}}^m = \frac{1}{1}$ and $x_{j+1}^m$ is the immediate right neighbor of $x_j^m$ in $F_m$. For $0 \le j \le 2^{m-1}$, let $x_j^m = \frac{a}{b}$ and $x_{j+1}^m = \frac{c}{d}$, and suppose that the binary expansion of j is $j = (a_{m-1} \ldots a_1 a_0)_2$, $a_k \in \{0, 1\}$ for $0 \le k \le m-1$. Let
\[ A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = A^t. \tag{2} \]
This leads to


Theorem 1.
\[ M^m(j) \equiv \prod_{k=0}^{m-1} A^{1-a_k} B^{a_k} \equiv A^{1-a_{m-1}} B^{a_{m-1}} A^{1-a_{m-2}} B^{a_{m-2}} \cdots A^{1-a_0} B^{a_0} = \begin{pmatrix} c & a \\ d & b \end{pmatrix}, \]
for $m \ge 1$.

Proof. Theorem 1 follows from recursion relations for the Farey fractions very similar to Eq. (5) of ([Ka], p. 79). □

Remark. We set $a_{m-1} = 0$, so that this product always starts with A, the only exception being when $j = 2^{m-1} = (100 \ldots 0)_2$. Then it is
\[ B A^{m-1} = \begin{pmatrix} m & 1 \\ m-1 & 1 \end{pmatrix}. \tag{3} \]
In this case only, one employs the immediate right neighbor of $\frac{1}{1}$, $\frac{m}{m-1} > 1$.

In what follows, we omit the $j = 2^{m-1}$ case. Thus, the fraction $\frac{1}{1}$ is not included and the denominator 1 only appears once in $F_n$. Furthermore, $M^m(j) = A M^{m-1}(l)$, with $0 \le l \le 2^{m-2}$, $l = (a_{m-2} \ldots a_1 a_0)_2$. Therefore, for $0 \le j \le 2^{m-1} - 1$, with $T^m(j)$ defined as $T^m(j) = \mathrm{Tr}(M^m(j))$, we have
\[ T^m(j) = b + c = \mathrm{Den}(x_j^m) + \mathrm{Num}(x_{j+1}^m). \tag{4} \]
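The matrix representation of Theorem 1 can be checked directly for short chains. The following is a minimal sketch, not taken from the paper: the helper names (`farey_level`, `chain_matrix`, `theorem_1_holds`) are hypothetical, and the bit convention assumed is the one stated above, with $a_{m-1}$ selecting the leftmost factor.

```python
# Illustrative check of Theorem 1; all helper names are hypothetical.

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def farey_level(m):
    """Return F_m as a list of (numerator, denominator) pairs."""
    level = [(0, 1), (1, 1)]  # F_1
    for _ in range(m - 1):
        refined = [level[0]]
        for (a, b), (c, d) in zip(level, level[1:]):
            refined.append((a + c, b + d))  # insert the mediant
            refined.append((c, d))
        level = refined
    return level


def chain_matrix(m, j):
    """M^m(j): factor k is B if bit a_k of j is 1, else A, highest bit first."""
    result = IDENTITY
    for k in reversed(range(m)):
        result = matmul(result, B if (j >> k) & 1 else A)
    return result


def theorem_1_holds(m):
    """Compare M^m(j) with ((c, a), (d, b)) for every j < 2^(m-1)."""
    level = farey_level(m)
    for j in range(2 ** (m - 1)):
        (a, b), (c, d) = level[j], level[j + 1]
        if chain_matrix(m, j) != ((c, a), (d, b)):
            return False
    return True


if __name__ == "__main__":
    print(farey_level(3))
    print(all(theorem_1_holds(m) for m in range(1, 9)))
```

Because the trace of the matrix is $b + c$, the same loop also confirms Eq. (4) whenever the matrix comparison succeeds.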

We are now in a position to define the partition function of the spin chain. First we extend the definition of $M^m(j)$ to include all possible products of m factors A or B, so there are now $2^m$ matrix products. Now $A = I + \sigma_-$, and similarly $B = I + \sigma_+$, where I is the (2 × 2) unit matrix and $\sigma_+$, $\sigma_-$ are Pauli matrices. These satisfy $\sigma_+^2 = 0 = \sigma_-^2$, $\sigma_+\sigma_- + \sigma_-\sigma_+ = I$. It follows that $M^m(j)$ is a linear combination of I, $\sigma_+$, $\sigma_-$, $\sigma_+\sigma_-$ and $\sigma_-\sigma_+$. If one exchanges A and B in the product, $\sigma_+$ and $\sigma_-$ are exchanged, but the trace remains invariant. Thus each new $T^m(j)$ is exactly the same as one from the original set. We then interpret each $T^m(j)$ as specifying the energy of a given spin state j of a periodic chain of length m via
\[ E_m(j) = \log T^m(j). \tag{5} \]
The kth spin may be regarded as down or up (or, equivalently, the kth site as empty or full) according to whether $a_k = 0$ or 1, respectively.

Definition. The partition function is given by
\[ Z_m(\beta) = \sum_j T^m(j)^{-\beta}, \tag{6} \]
where the sum extends over all $2^m$ matrix products, i.e. the first factor in $M^m(j)$ may be either A or B.

It follows that
\[ Z_1 = 2^{1-\beta}, \quad Z_2 = 2^{1-\beta} + 2 \cdot 3^{-\beta}, \quad Z_3 = 2^{1-\beta} + 6 \cdot 2^{-2\beta}, \quad Z_4 = 2^{1-\beta} + 4 \cdot 6^{-\beta} + 8 \cdot 5^{-\beta} + 2 \cdot 7^{-\beta}, \ \ldots. \tag{7} \]
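The short-chain partition functions in Eq. (7) can be reproduced by brute-force enumeration of all $2^m$ products. The sketch below is illustrative only: the function names `trace_counts` and `partition_function` are hypothetical, and the enumeration is exponential in m, so it is meant for small chains.

```python
# Illustrative enumeration behind Eq. (7); function names are hypothetical.
from collections import Counter
from itertools import product

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace_counts(m):
    """Map each trace value T to the number of length-m products with that trace."""
    counts = Counter()
    for word in product((A, B), repeat=m):
        result = IDENTITY
        for factor in word:
            result = matmul(result, factor)
        counts[result[0][0] + result[1][1]] += 1
    return counts


def partition_function(m, beta):
    """Z_m(beta) as the sum of T^(-beta) over all 2^m products, Eq. (6)."""
    return sum(mult * t ** (-beta) for t, mult in trace_counts(m).items())


if __name__ == "__main__":
    for m in range(1, 5):
        print(m, sorted(trace_counts(m).items()))
```

Each printed pair (T, multiplicity) corresponds to one term `multiplicity * T**(-beta)` of $Z_m$; for m = 3 the six mixed products all have trace 4, which is the $6 \cdot 2^{-2\beta}$ term.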


Remark. The A − B properties of $T^m(j)$ discussed above imply that the Farey chain exhibits up-down (equivalently, particle-hole) symmetry. It also follows immediately from the properties of the trace that the energy is invariant under cyclic translation of the spin matrices. Therefore the spin chain has translation invariant spin interactions (see Sect. 6).

Remark. The partition function may be evaluated exactly for certain values of β. When β = 0, $Z_m = 2^m$, the number of states. For β = −1, $Z_m$ is the sum of the trace of all possible matrix products. Since the two operations commute,
\[ Z_m(-1) = \mathrm{Tr}\,(A + B)^m \tag{8} \]
[P]. Now A + B has eigenvalues 1 and 3, so
\[ Z_m(-1) = 1^m + 3^m = 1 + 3^m. \tag{9} \]
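Equations (8) and (9) lend themselves to a quick integer check: the sum of the traces of all $2^m$ products should equal the trace of the m-th power of A + B, which in turn should equal $1 + 3^m$. A minimal sketch, with hypothetical function names, is:

```python
# Illustrative check of Eqs. (8)-(9); function names are hypothetical.
from itertools import product

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))
A_PLUS_B = ((2, 1), (1, 2))  # entrywise sum of A and B


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def sum_of_traces(m):
    """Z_m(-1) by brute force: add the traces of all 2^m products."""
    total = 0
    for word in product((A, B), repeat=m):
        result = IDENTITY
        for factor in word:
            result = matmul(result, factor)
        total += result[0][0] + result[1][1]
    return total


def trace_of_power(m):
    """Tr (A + B)^m computed by repeated multiplication."""
    result = IDENTITY
    for _ in range(m):
        result = matmul(result, A_PLUS_B)
    return result[0][0] + result[1][1]


if __name__ == "__main__":
    for m in range(1, 8):
        print(m, sum_of_traces(m), trace_of_power(m), 1 + 3 ** m)
```

The agreement of the first two columns is just linearity of the trace; the third column uses the eigenvalues 1 and 3 of A + B.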

One can also calculate correlation functions of the A and B matrices for these β values in a straightforward way. These simplifications may also be applied to the KSC. The β = −1 method gives an easy way to derive the results of [G-K].

Remark. The KSC may also be expressed using the A and B matrices. To do this, one must replace $T^m(j)$ in the formula for the partition function with
\[ D^m(j) = b = \mathrm{Den}(x_j^m) = \left(M^m(j)\right)_{2,2} \tag{10} \]
and sum only over the restricted set of matrices beginning with A. We denote the resulting partition function $Z^K_{m-1}(\beta)$. In the language of [Ka], this is the canonical partition function for a chain of length m − 1 (the leading matrix A is not counted). Since all matrix elements of $M^m(j)$ are positive,
\[ 0 < D^m(j) < T^m(j). \tag{11} \]
Thus the energy of each (restricted) state of the Farey chain is bounded below by the energy of the KSC. Since $Z_m$ may also be computed using the restricted sum, for β > 0 one has
\[ Z_m(\beta) < 2 Z^K_{m-1}(\beta). \tag{12} \]
For β < 0, the inequality is reversed.

3. Existence of the Free Energy and Phase Transition

Definition. The Farey free energy (per spin) F(β) is defined as
\[ \beta F(\beta) = \lim_{m \to \infty} \frac{-\ln[Z_m(\beta)]}{m}, \tag{13} \]
and the KSC free energy $F_K(\beta)$ is defined similarly, with Z replaced by $Z^K$.

Theorem 2. The Farey free energy satisfies $F(\beta) = F_K(\beta)$. Thus βF(β) exists for all β ≥ 0, for β any negative integer, and exhibits a unique phase transition at β = 2.


Proof. For β = 0,
\[ \beta F(\beta) = \beta F_K(\beta) = -\ln 2 \tag{14} \]
follows immediately for either spin chain by the remark above. For the other β values, we make use of Theorem 3 below, which is independent of the present argument. With $\frac{c}{d}$ the immediate right neighbor of $\frac{a}{b}$, one sees that
\[ c = \bar{b} + at; \quad t = 0, 1, 2, 3, \ldots, \tag{14a} \]
where the index t classifies the appearances of $\frac{c}{d}$ with increasing chain length m and satisfies t < m. Here $b\bar{b} \equiv 1 \pmod{a}$ and $1 \le \bar{b} \le a - 1$, so that $\bar{b} < a < b$. Hence
\[ c < (m + 1)b, \tag{15} \]
which we make use of as follows. First note that $D^m(j) = b$ and $T^m(j) = b + c$ satisfy
\[ D^m(j) < T^m(j) < (m + 2) D^m(j). \tag{16} \]
For β > 0, it follows that
\[ Z^K_{m-1} > Z_m > \frac{1}{(m+2)^\beta} Z^K_{m-1}. \tag{17} \]
Taking logarithms and dividing by −βm gives
\[ \left(1 - \frac{1}{m}\right) F^K_{m-1} < F_m < \left(1 - \frac{1}{m}\right) F^K_{m-1} + \frac{\ln(m+2)}{m}. \tag{18} \]
Our results for positive β are then established by taking the limit m → ∞, and using the rigorous results that $\beta F_K(\beta)$ exists for all β > 0 [Kd] and exhibits a unique phase transition at β = 2 [C-K]. For β < 0 the same inequality holds for $F_m$; thus, since $\beta F_K(\beta)$ is also known to exist for β any negative integer [C-K], the rest of the theorem is proved. □

Remark. Another way of establishing the phase transition and gaining information on the behavior of the system at large β is given in Theorem 5.

Remark. Note that F(−1) = ln 3 by the calculation above. Making use of [C-K] gives the same result for $F_K(-1)$.

Figures 1 and 2 illustrate the free energy and energy fluctuation per spin ΔE (which is proportional to the specific heat) obtained by exact enumeration for chains up to length m = 16. It is clear that the convergence of the free energy with length is much slower at large β, where the system is known to be completely magnetized in the infinite m limit (see Sect. 5), than for small β values. In addition, Fig. 1 shows that while the approach to the limit is non-monotonic for small chain lengths for β = 1, this is not the case at larger β values. Finally, Fig. 2 shows a peak in ΔE near β = 2 consistent with the phase transition there. ΔE increases much more rapidly with length at the peak (β = 2) than at nearby β values.
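The two-sided bound (16), on which the proof of Theorem 2 rests, can be tested numerically over the restricted set of products beginning with A. The sketch below is only a finite check for small m, not a proof, and its function names are hypothetical.

```python
# Illustrative finite check of inequality (16); function names are hypothetical.
from itertools import product

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def restricted_products(m):
    """Yield every length-m product of A's and B's whose first factor is A."""
    for tail in product((A, B), repeat=m - 1):
        result = A
        for factor in tail:
            result = matmul(result, factor)
        yield result


def bound_16_holds(m):
    """Check D < T < (m + 2) D with D the (2,2) entry and T the trace."""
    for matrix in restricted_products(m):
        d_value = matrix[1][1]
        t_value = matrix[0][0] + matrix[1][1]
        if not (d_value < t_value < (m + 2) * d_value):
            return False
    return True


if __name__ == "__main__":
    print(all(bound_16_holds(m) for m in range(1, 11)))
```

Since $T^m(j)$ and $D^m(j)$ differ by at most the factor m + 2, their logarithms differ by at most ln(m + 2), which is negligible per spin as m grows; this is the mechanism behind Eq. (18).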

Fig. 1. Free energy $F_m \equiv -\frac{\ln Z_m}{m\beta}$ vs. m for Farey spin chain at β = 1 (stars) and β = 4 (diamonds)

Fig. 2. Fluctuation of the energy per spin $\Delta E \equiv \frac{1}{m}\left(\langle E_m(j)^2 \rangle - \langle E_m(j) \rangle^2\right)$ vs. β for the Farey spin chain with m = 16

4. Number of States

In this section we consider the summed number of states of the spin chain, that is, the number of states of a given energy regardless of chain length. To begin, we derive an expression for the number of immediate right neighbors of a given Farey fraction.

Definition. If m is the smallest positive integer such that $\frac{a}{b} \in F_m$, then we call m the conductor of $\frac{a}{b}$. We write $m = \mathrm{cond}\,\frac{a}{b}$.


Let $m = \mathrm{cond}\,\frac{a}{b}$, and let $\frac{x'}{y'} < \frac{a}{b} < \frac{x}{y}$, so that $\frac{x'}{y'}, \frac{x}{y} \in F_{m-1}$ are the immediate neighbors of $\frac{a}{b}$ in $F_m$. Then we have
\[ a = x + x', \quad 0 \le x, x' < a, \qquad b = y + y', \quad 0 \le y, y' < b, \tag{19} \]
and
\[ bx - ay = (y + y')x - (x + x')y = y'x - x'y. \tag{20} \]
Similarly,
\[ bx' - ay' = (y + y')x' - (x + x')y' = yx' - xy'. \tag{21} \]
We conclude, by induction on m, that
\[ bx - ay = 1 \tag{22} \]
and
\[ bx' - ay' = -1, \tag{23} \]
so that
\[ bx \equiv 1 \pmod{a}; \quad 0 \le x < a \tag{24} \]
and
\[ ay \equiv -1 \pmod{b}; \quad 0 \le y < b. \tag{25} \]
Suppose $\bar{b}$ is the unique multiplicative inverse of b (mod a), i.e. the unique solution of (24). Then
\[ b\bar{b} - ay = 1 \tag{26} \]
so that
\[ y = \frac{b\bar{b} - 1}{a}. \tag{27} \]

Definition. Let $NR\left(\frac{a}{b}\right)$ denote the set of all immediate right neighbors of $\frac{a}{b}$ in $F_m$, for $m \ge \mathrm{cond}\,\frac{a}{b}$.

The above provides a proof of

Theorem 3. The immediate right neighbor of $\frac{a}{b}$ at step $m = \mathrm{cond}\,\frac{a}{b}$ is $\frac{\bar{b}}{(b\bar{b} - 1)/a}$, and
\[ NR\left(\frac{a}{b}\right) = \left\{ \frac{\bar{b} + at}{(b\bar{b} + abt - 1)/a} : t = 0, 1, 2, \ldots \right\}. \]


Remark. If $m = \mathrm{cond}\,\frac{a}{b}$, then in $F_m$, one can show that the immediate left neighbor of $\frac{a}{b}$ is $\frac{a - \bar{b}}{\bar{a}}$ and its immediate right neighbor is $\frac{\bar{b}}{b - \bar{a}}$, where $\bar{a}$ is the unique multiplicative inverse of a (mod b). The form for the right neighbor is a consequence of the fact that
\[ a\bar{a} + b\bar{b} = 1 + ab. \tag{28} \]

The next step is to find the number of states with a given energy, regardless of chain length. We proceed to count the number of solutions to
\[ \mathrm{Tr}(X) = n, \tag{29} \]
where n > 2 and X is any finite product of A's and B's beginning with A as described above. We count the contribution of $\frac{a}{b}$ to the left side of (29) by keeping track of the immediate right neighbors of $\frac{a}{b}$. For this, we consider Theorem 3 and
\[ b + \bar{b} + at = n; \quad t = 0, 1, 2, \ldots, \tag{30} \]
which is equivalent to
\[ b^2 + b\bar{b} + abt = bn \tag{31} \]
or
\[ b^2 - bn + 1 \equiv 0 \pmod{a}. \tag{32} \]
Therefore we look for the divisors a of $bn - b^2 - 1$ for $1 \le a < b$. There are $d_b(bn - b^2 - 1)$ of these, where $d_b(m)$ is the number of positive divisors of m that are less than b. It immediately follows that

Theorem 4. The number of solutions of (29) is
\[ \Phi(n) = \sum_{b=1}^{n-1} d_b(bn - b^2 - 1). \]
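Theorem 4 can be compared against a direct count of matrix products for small n. The sketch below is illustrative, with hypothetical function names. It relies on one assumption that is not stated in the text: a product of length m containing both A and B has trace at least m + 1, so only chain lengths m ≤ n − 1 need to be searched for trace n > 2. (Expanding the product of factors $I + \sigma_\pm$ gives a trace of at least 2 plus the number of (A, B) position pairs, which is at least m − 1.)

```python
# Illustrative comparison of Theorem 4 with brute force; names are hypothetical.
from itertools import product

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def divisors_below(b, value):
    """d_b(value): number of positive divisors of value that are less than b."""
    return sum(1 for a in range(1, b) if value % a == 0)


def phi_formula(n):
    """Phi(n) from the restricted divisor sum of Theorem 4."""
    return sum(divisors_below(b, b * n - b * b - 1) for b in range(1, n))


def phi_brute_force(n):
    """Count products beginning with A, of any length, whose trace equals n.

    Assumes lengths up to n - 1 suffice (see the lead-in paragraph).
    """
    count = 0
    for m in range(1, n):
        for tail in product((A, B), repeat=m - 1):
            result = A
            for factor in tail:
                result = matmul(result, factor)
            if result[0][0] + result[1][1] == n:
                count += 1
    return count


if __name__ == "__main__":
    for n in range(3, 9):
        print(n, phi_formula(n), phi_brute_force(n), (n + 2) * (n + 1) // 2)
```

The last printed column is the upper bound of Eq. (33), which the counts stay well below for these small n.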

Similar quantities are considered in [H-T].

Remark. We can put an upper bound on Φ(n) by considering (30). If we ignore the restrictions, the number of solutions to it is just the number of unordered partitions of n into three positive integers. Therefore
\[ \Phi(n) \le \binom{n+2}{2} = \frac{1}{2}(n + 2)(n + 1) \sim \frac{1}{2} n^2, \tag{33} \]
where the asymptotic form applies as n → ∞.

Conjecture. As n → ∞, $\Phi(n) \sim \frac{1}{2} n \log n$.

Consider the fact that $\sum_{m=1}^{n} d(m) \sim n \log n$, which implies that for large n the average value of d is log n, and the DDT theorem
\[ \sum_{m=1}^{n} \frac{d_{m^u}(m)}{d(m)} \sim \frac{2}{\pi} \sin^{-1}(\sqrt{u}) \cdot n \]
[D-D-T]. Now replace the quantities d(m) and $d_b(m)$ by the averaged functions d(x) and d(x; y), where the latter is the averaged number of divisors of y ≤ x. Taken together, the quoted results suggest that
\[ \int_1^n d(x^u; x)\,dx \sim \frac{2}{\pi} \sin^{-1}(\sqrt{u}) \cdot n \log n, \tag{34} \]
and more generally the averaged quantity satisfies
\[ d(g; x) \sim \frac{2}{\pi} \sin^{-1}\left(\sqrt{\frac{\log g}{\log x}}\right) \cdot \log n \tag{35} \]
for large x ≤ n. The argument $\frac{\log g}{\log x}$ is consistent with the fact that divisors are uniformly distributed on a log scale ([H-T], p. 62). The asymptotic form of the number of states then follows:
\[ \Phi(n) \sim \int_1^{n-1} d(b; bn - b^2 - 1)\,db \sim \frac{2}{\pi} \int_1^{n-1} \sin^{-1}\left(\sqrt{\frac{\log b}{\log b(n - b)}}\right) db \cdot \log n. \tag{36} \]
Evaluating the integral for large n, we find $\sin^{-1}\left(\frac{1}{\sqrt{2}}\right) n = \frac{\pi}{4} n$, which leads immediately to the conjectured formula.

5. Thermodynamic Consequences

The results for the (summed) number of states in Sect. 4 may be used to derive some consequences for the thermodynamics. We first express the partition function as
\[ Z_m(\beta) = 2^{1-\beta} + 2 \sum_{n=3}^{\infty} \frac{\Phi_m(n)}{n^\beta}, \tag{37} \]
where $\Phi_m(n)$ is the number of solutions of (29) for fixed chain length m, i.e. the number of states. The factor 2 appears in front of the summation of the preceding equation since (29) refers to the restricted set of matrices beginning with A. The summed number of states is then $\Phi(n) = \sum_{m=1}^{\infty} \Phi_m(n)$. If we define $Z'_m(\beta) = Z_m(\beta) - 2^{1-\beta}$ and use the conjectured asymptotic form of Φ(n), then
\[ \sum_{m=1}^{\infty} Z'_m(\beta) = 2 \sum_{n=3}^{\infty} \frac{\Phi(n)}{n^\beta} = \sum_{n=1}^{\infty} \frac{n \log n}{n^\beta} + 2 \sum_{n=1}^{\infty} \frac{\varepsilon(n)}{n^\beta} - \frac{2^{2-\beta}}{3} \log 2 = -\zeta'(\beta - 1) + 2\tilde{\varepsilon}(\beta) - \frac{2^{2-\beta}}{3} \log 2, \tag{38} \]


where ε(n) corrects the asymptotic form, $\tilde{\varepsilon}(\beta)$ is its Dirichlet transform, and ζ' is the derivative of the Riemann ζ-function. If we assume that $\tilde{\varepsilon}(\beta)$ is regular, then a singularity in the thermodynamics for β = 2 follows from the pole in the Riemann ζ-function at β − 1 = 1. We have not been able to prove this assumption, however.

Theorem 5. $Z'_m(\beta) \to 0$ as m → ∞, so that $Z_m(\beta) \to 2^{1-\beta}$, and the free energy F(β) = 0 for β > 3.

Proof. We use the upper bound on Φ(n) derived above. Now
\[ \sum_{m=1}^{\infty} Z'_m(\beta) = 2 \sum_{n=3}^{\infty} \frac{\Phi(n)}{n^\beta} \le \sum_{n=3}^{\infty} \frac{(n + 2)(n + 1)}{n^\beta}. \tag{39} \]
The summation in this equation is finite for β > 3. □

Remark. Since βF(β) is finite at β = 0, Theorem 5 establishes the existence of a phase transition (singularity in βF(β)) in a different way than used in Sect. 3. In addition, since the partition function for long chains is given by the sum over the two lowest energy ($j = 0$ or $2^m - 1$) states only, it shows that the limiting chain is in a completely magnetized state (all spins up or all spins down) at low temperatures. Similar behavior holds, by construction, for the KSC, with the partition function approaching a ratio of Riemann ζ-functions for β > 2 as m → ∞. For the Farey chain, replacing the upper bound by our conjectured asymptotic form for Φ(n) leads to F(β) = 0 and $Z_m(\beta) \to 2^{1-\beta}$ for β > 2.

6. Spin Interaction Coefficients

We define the spin interaction coefficients $J_m(t)$ as follows [Kb]. Let t, $0 \le t \le 2^m - 1$, specify the coefficient, and suppose that the binary expansion of t is $t = (b_{m-1} \ldots b_1 b_0)_2$, $b_k \in \{0, 1\}$. Let $j \cdot t = \sum_{i=0}^{m-1} a_i b_i$. Then
\[ J_m(t) = -\frac{1}{2^m} \sum_j (-1)^{j \cdot t} E_m(j), \tag{40} \]
where $E_m(j)$ is the energy of configuration j for a chain of length m, defined above. In this notation, the energy of any configuration is given via a sum over spin clusters
\[ E_m(j) = -\sum_t (-1)^{j \cdot t} J_m(t). \tag{41} \]
Note that each factor $s_i \equiv (-1)^{a_i} \in \{-1, +1\}$ in each term may be interpreted as a spin at site i on the chain. The spin $s_i$ is present or not in a given term according to whether $b_i = 1$ or $b_i = 0$, respectively. More explicitly,
\[ E_m(j) = -\sum_{\{b_i\}} \left[ \prod_{i=0}^{m-1} (s_i)^{b_i} \right] J(\{b_i\}). \tag{42} \]
Each term thus defines a spin cluster, i.e. a set of sites for which $b_i = 1$. The sites in a given cluster may be adjacent to one another or separated by $b_i = 0$ sites with no spins present.
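The transform pair (40) and (41) can be exercised numerically on short chains. The following sketch uses hypothetical function names and assumes the bit convention of Sect. 2 (bit $a_k$ of j selects B when it is 1 and A when it is 0, with the highest bit as the leftmost factor). It computes the energies $E_m(j) = \log T^m(j)$, evaluates $J_m(t)$ from (40), and can be used to confirm that (41) recovers the energies.

```python
# Illustrative evaluation of Eqs. (40)-(41); function names are hypothetical.
import math

A = ((1, 0), (1, 1))
B = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def energies(m):
    """List of E_m(j) = log Tr M^m(j) for j = 0, ..., 2^m - 1."""
    values = []
    for j in range(2 ** m):
        result = IDENTITY
        for k in reversed(range(m)):
            result = matmul(result, B if (j >> k) & 1 else A)
        values.append(math.log(result[0][0] + result[1][1]))
    return values


def sign(j, t):
    """(-1)^(j . t), where j . t counts the sites with a_i = b_i = 1."""
    return -1 if bin(j & t).count("1") % 2 else 1


def coefficient(energy, t):
    """J_m(t) from Eq. (40), given the full list of energies."""
    return -sum(sign(j, t) * e for j, e in enumerate(energy)) / len(energy)


if __name__ == "__main__":
    m = 4
    energy = energies(m)
    for t in range(2 ** m):
        print(format(t, "0{}b".format(m)), round(coefficient(energy, t), 6))
```

In the printed list, clusters with an odd number of spins should vanish up to rounding, as expected from the up-down symmetry of the trace.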


Lemma. The interaction coefficient $J_m(0)$ satisfies
\[ J_m(0) = -\frac{1}{2^m} \sum_j \ln T^m(j) \ge -m \ln\left(\frac{\sqrt{5} + 1}{2}\right) = -(0.48121\ldots)\,m \]
for large m.

Proof. By considering the generation of the Farey fractions $F_m$, it is clear that each numerator and denominator is bounded above by the Fibonacci number $F_{m+1}$. Therefore
\[ T^m(j) \le 2F_{m+1} \sim \frac{2}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^{m+1}, \tag{43} \]
as m → ∞. □

We have calculated $J_m(0)$ by computer for chains up to m = 16 as illustrated in Fig. 3. The results indicate that it approaches −(am + b), with a = 0.3962, for both the Farey spin chain and KSC. Since $J_m(0)$ is also (minus) the average energy for a chain of length m, we considered the fluctuation of the energy, i.e.
\[ \sigma_m^2 = \frac{1}{2^m} \sum_j \ln^2 T^m(j) - \left(\frac{1}{2^m} \sum_j \ln T^m(j)\right)^2. \tag{44} \]
Numerically, from calculations on chains up to length 16, $\sigma_m$ appears to approach 0.019 m as m → ∞. The corresponding quantity for the KSC apparently approaches 0.014 m. Given the numerical uncertainties, it is possible that the asymptotic value is the same for both chains.

Note that, since $T^m \ge 2$, $J_m(0) < 0$. By the up-down symmetry discussed above, it follows that $J_m(t) = 0$ whenever the cluster defined by t contains an odd number of spins, i.e. when $\sum_i b_i$ is odd, so only even interaction coefficients are non-zero. Furthermore, $J_m(t)$ exhibits cyclic symmetry in t, i.e. it is invariant under translation of the spin cluster, due to the invariance of the trace under cyclic translation of the spin matrices mentioned above.

Our computer results for short chains verify all the exact behavior mentioned in the preceding paragraph. In addition, we find several interesting features. For all non-zero clusters (with an even number of spins) $J_m(t) > 0$, so all interaction coefficients (except t = 0) are ferromagnetic. Similar behavior occurs for the KSC in what is referred to as the grand canonical ensemble [Kb]. We find that as m increases, each such $J_m(t)$ apparently approaches a finite limit. An example is shown in Fig. 3. Denoting by J(t) the limit of $J_m(t)$ as m → ∞, we find the values listed in Table 1. Note that the (approximate) numbers for the pair interaction coefficients (t = (10 . . . 01000 . . . )) are consistent with a decrease of J by a factor of 1/2 for each increase of separation of the two spins in the cluster by one site. Generally, J appears to decrease with the number of spins in the cluster and the distance between the spins.

Theorem 2 establishes that the Farey spin chain and KSC have the same free energy. This raises the question as to whether J(t) is in some sense the same in both cases. This is a more complicated issue, in part because the KSC is not translation invariant.
Our numerical results are consistent with the J(t) being the same, at least for KSC interactions with spin clusters far from the edges, i.e. with t of the form 0 . . . 0p0 . . . 0, with p = 1 . . . 1 fixed and each string of 0s of length proportional to m. For t of the form p0 . . . 0, so the cluster remains at one edge, the J(t) values are certainly different. Further, $J_m(t)$ for the KSC is rigorously known to exhibit the following behavior for m → ∞: $J_m(t) \to 0$ when t has an odd number of spins, it is small unless the length of p is small, and it has translation invariance [Ka]. All of these are consistent with what we find for the Farey chain, as described above.

Unfortunately, the bounds relating $D^m(j)$ and $T^m(j)$ used in the proof of Theorem 2 are not strong enough to establish equality, since each term in $J_m(t)$ will be bounded above and below by the corresponding term in $J^K_m(t)$ plus (depending on sign) a possible term of magnitude $\ln(m + 2)\,2^{-m}$, and on summation the latter gives rise to a divergent contribution as m → ∞. However, one can draw some conclusion about $J_m(0)$ and $\sigma_m$ in this way. We assume, consistent with the lemma and numerical results above, that both quantities are proportional to m in this limit. It is then easy to show that the respective coefficients must be the same for either spin chain, as suggested by the numerics.

Fig. 3. $J_{m+1}(0) - J_m(0)$ vs. m for the Farey chain (stars) and KSC (diamonds) with values on the right scale; pair interaction $J_m(11000 \ldots)$ vs. m for the Farey chain (boxes) with values on the left scale

Table 1. Interaction coefficients

t                    J(t)
11000 . . .          0.131
101000 . . .         0.0612
1001000 . . .        0.0291
10001000 . . .       0.0141
100001000 . . .      0.0068
1000001000 . . .     0.0033
1111000 . . .        0.0081
11011000 . . .       0.0028
110011000 . . .      0.0011
11101000 . . .       0.004

Note added. Recent work [C-K-K] has extended the analysis of the Farey spin chain, addressing in particular the magnetization in the vicinity of the phase transition.

Acknowledgements. We acknowledge stimulating and useful interactions with P. Contucci, A. Knauf and I. Peschel, thank A. Knauf for a computer program, and thank an anonymous referee for important comments.

References

[C-K] Contucci, P. and Knauf, A.: The Phase Transition of the Number-Theoretical Spin Chain. Forum Mathematicum 9, 547–567 (1997)
[C-K-K] Contucci, P., Kleban, P., and Knauf, A.: A Fully Magnetizing Phase Transition. Preprint (math-ph/9811020)
[Cv] Cvitanovic, P.: Circle Maps: Irrationally Winding. In: From Number Theory to Theoretical Physics, Berlin–Heidelberg–New York: Springer, 1992
[D-D-T] Deshouillers, J.-M., Dress, F. and Tenenbaum, G.: Lois de répartition des diviseurs, I. Acta Arithmetica XXXIV (1979)
[G-K] Guerra, F. and Knauf, A.: Free Energy and Correlations of the Number-Theoretical Spin Chain. J. Math. Phys. 39, 3188–3202 (1998) (http://wwwsfb288.math.tu_berlin.de/bulletinboard.html)
[H-T] Hall, R. and Tenenbaum, G.: Divisors. Cambridge: Cambridge University Press, 1988
[Ka] Knauf, A.: On a Ferromagnetic Spin Chain. Commun. Math. Phys. 153, 77–115 (1993)
[Kb] Knauf, A.: Phases of the Number-Theoretical Spin Chain. J. Stat. Phys. 73, 423–431 (1993)
[Kd] Knauf, A.: On a Ferromagnetic Spin Chain, Part II: Thermodynamic Limit. J. Math. Phys. 35, 228–236 (1994)
[P] Peschel, I.: Private communication

Communicated by M. E. Fisher

Commun. Math. Phys. 203, 649 – 666 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Riccati-Type Equations, Generalised WZNW Equations, and Multidimensional Toda Systems

L. A. Ferreira1, J. F. Gomes1, A. V. Razumov2, M. V. Saveliev2, A. H. Zimerman1

1 Instituto de Física Teórica - IFT/UNESP, Rua Pamplona 145, 01405-900, São Paulo - SP, Brazil. E-mail: [email protected]; [email protected]; [email protected]

2 Institute for High Energy Physics, 142284 Protvino, Moscow Region, Russia. E-mail: [email protected]; [email protected]

Received: 3 August 1998 / Accepted: 21 December 1998

Abstract: We associate to an arbitrary Z-gradation of the Lie algebra of a Lie group a system of Riccati-type first order differential equations. The particular cases under consideration are the ordinary Riccati and the matrix Riccati equations. The multidimensional extension of these equations is given. The generalisation of the associated Redheffer–Reid differential systems appears in a natural way. The connection between the Toda systems and the Riccati-type equations in lower and higher dimensions is established. Within this context the integrability problem for those equations is studied. As an illustration, some examples of the integrable multidimensional Riccati-type equations related to the maximally nonabelian Toda systems are given.

1. Introduction

At the present time there is a great number of papers in mathematics and physics devoted to various aspects of the matrix differential Riccati equation proposed in the '20s by Radon in the context of the Lagrange variational problem. In particular, this equation has been discussed in connection with the oscillation of the solutions to systems of linear differential equations, Lie group and differential geometric aspects of the theory of analytic functions of several complex variables in classical domains, probability theory, and computation schemes. For a systematic account of the development in the theory of the matrix differential Riccati equation up to the '70s see, for example, the survey [33]. More recently there appeared papers where this equation was considered as a Bäcklund-type transformation for some integrable systems of differential geometry, in particular for the Lamé and the Bourlet equations. A relevant superposition principle for the equation has been studied on the basis of the theory of Lie algebras, see, for example, [1,2,20,30–32] and references therein. The matrix Riccati equation also arises as an equation of motion on Grassmann manifolds and on homogeneous spaces attached to the Hartree–Fock–Bogoliubov problem, see, for example, [4,11] and references therein; and in some other subjects of applied mathematics and physics such as optimal control theory, plasma,


etc., see, for example, [12,29,28]. Continued–fraction solutions to the matrix differential Riccati equation were constructed in [7,8], based on a sequence of substitutions with the coefficients satisfying a matrix generalisation of the Volterra-type equations which in turn provide a Bäcklund transformation for the corresponding matrix version of the Toda lattice. In papers [5,6] the matrix differential Riccati equation occurs in the steepest descent solution to the total least squares problem as a flow on Grassmannians via the Brockett double bracket commutator equation; in the special case of projective space this is the Toda lattice flow in Moser’s variables. In the present paper we investigate the equations associated with an arbitrary Zgradation of the Lie algebra g of a Lie group G. For the case G = GL(2, C) and the principal gradation of gl(2, C) this is the ordinary Riccati equation. For the case G = GL(n, C) and some special Z-gradation of gl(n, C) we get the matrix Riccati equation. The underlying group-algebraic structure allows us to give a unifying approach to the investigation of the integrability problem for the equations under consideration which we call the Riccati-type equations. We also give a multidimensional generalisation of the Riccati-type equations and discuss their integrability. It has been very useful for the study of ordinary matrix Riccati equations to associate with them the so-called Redheffer–Reid differential system [25–27]. In our approach the corresponding generalisation of such systems appears in a natural way. The associated Redheffer–Reid system can be considered as the constraints providing some reduction of the Wess–Zumino–Novikov–Witten (WZNW) equations. On the other hand, it is well known that the Toda-type systems can be also obtained by the appropriate reduction of the WZNW equations, see, for example, [14]. This implies a deep connection of the Toda-type systems and the Riccati-type equations. 
In particular, under the relevant constraints the Riccati-type equations play the role of a Bäcklund map for the Toda systems and, in a sense, are a generalisation of the Volterra equations. Some years ago there appeared a remarkable generalisation [16] of the WZNW equations. The associated Redheffer–Reid system in the multidimensional case can be considered again as the constraints imposed on the solutions of those equations. We show that, in the same way as in the two dimensional case, the appropriate reduction of the multidimensional WZNW equations leads to the multidimensional Toda systems [22], in particular to the equations [9,10,13] describing topological and antitopological fusion.¹ The multidimensional Toda systems are integrable for the relevant integration data, with the general solution being determined by the corresponding arbitrary mappings in accordance with the integration scheme developed in [22]. Therefore the integrability problem for the multidimensional Riccati-type equations can be studied, in particular, on the basis of that fact. As an illustration of the general construction we discuss in detail some examples related to the maximally nonabelian Toda systems [23]. Analogously to the Toda systems, one can construct higher grading generalisations in the sense of [18,24] for the multidimensional Riccati-type equations.

¹ It is rather clear that the multidimensional systems suggested in [22] become two dimensional equations only under a relevant reduction. Moreover, arbitrary mappings determining the general solution to these equations are not necessarily factorised into products of mappings each depending on one coordinate only. One can easily be convinced of this by the examples considered there in detail.

2. One Dimensional Riccati-Type Equations

We shall introduce Riccati-type equations within the usual language of integrable models, through an associated linear problem involving Lie algebra and Lie group valued objects.

Riccati-Type Equations and Toda Systems


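Let $G$ be a connected Lie group and $\mathfrak{g}$ be its Lie algebra. Without any loss of generality we assume that $G$ is a matrix Lie group; otherwise we replace $G$ by its image under some faithful representation of $G$. For any fixed mapping $\lambda : \mathbb{R} \to \mathfrak{g}$ consider the equation

$$\psi^{-1} \frac{d\psi}{dx} = \lambda \tag{2.1}$$

for the mapping $\psi : \mathbb{R} \to G$. Certainly one can use the complex plane $\mathbb{C}$ instead of the real line $\mathbb{R}$. Suppose that the Lie algebra $\mathfrak{g}$ is endowed with a $\mathbb{Z}$-gradation,

$$\mathfrak{g} = \bigoplus_{m \in \mathbb{Z}} \mathfrak{g}_m .$$

Define the following nilpotent subalgebras of $\mathfrak{g}$:

$$\mathfrak{g}_{<0} = \bigoplus_{m<0} \mathfrak{g}_m, \qquad \mathfrak{g}_{>0} = \bigoplus_{m>0} \mathfrak{g}_m,$$

and represent the mapping $\lambda$ in the form $\lambda = \lambda_{<0} + \lambda_0 + \lambda_{>0}$, where the mappings $\lambda_{<0}$, $\lambda_0$ and $\lambda_{>0}$ take values in $\mathfrak{g}_{<0}$, $\mathfrak{g}_0$ and $\mathfrak{g}_{>0}$ respectively. Denote by $G_{<0}$, $G_0$ and $G_{>0}$ the connected Lie subgroups of $G$ corresponding to the subalgebras $\mathfrak{g}_{<0}$, $\mathfrak{g}_0$ and $\mathfrak{g}_{>0}$ respectively. Under the appropriate assumptions, for an element $a \in G$ belonging to some dense subset of $G$, the generalised Gauss decomposition

$$a = a_{<0}\, a_0\, a_{>0} \tag{2.2}$$

is valid, where $a_{<0} \in G_{<0}$, $a_0 \in G_0$ and $a_{>0} \in G_{>0}$. For the mapping $\psi$ we can write

$$\psi = \psi_{<0}\, \psi_0\, \psi_{>0}, \tag{2.3}$$

where the mapping $\psi_{<0}$ takes values in $G_{<0}$, the mapping $\psi_0$ takes values in $G_0$ and the mapping $\psi_{>0}$ takes values in $G_{>0}$. Using the Gauss decomposition (2.3) of the mapping $\psi$, rewrite Eq. (2.1) as

$$\psi_{>0}^{-1} \left( \psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx} \right) \psi_{>0} + \psi_{>0}^{-1} \frac{d\psi_{>0}}{dx} = \lambda, \tag{2.4}$$

where $\psi_{\le 0} = \psi_{<0} \psi_0$. From (2.4) it follows that

$$\psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx} + \frac{d\psi_{>0}}{dx}\, \psi_{>0}^{-1} = \psi_{>0}\, \lambda\, \psi_{>0}^{-1},$$

and hence, since the second term on the left takes values in $\mathfrak{g}_{>0}$,

$$\psi_{\le 0}^{-1} \frac{d\psi_{\le 0}}{dx} = (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{\le 0}, \tag{2.5}$$

where the subscript $\le 0$ denotes the corresponding component with respect to the decomposition $\mathfrak{g} = \mathfrak{g}_{\le 0} \oplus \mathfrak{g}_{>0} = (\mathfrak{g}_{<0} \oplus \mathfrak{g}_0) \oplus \mathfrak{g}_{>0}$.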


Substituting (2.5) into (2.4) one gets

$$\psi_{>0}^{-1} \frac{d\psi_{>0}}{dx} = \lambda - \psi_{>0}^{-1} (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{\le 0}\, \psi_{>0},$$

which can be rewritten as

$$\frac{d\psi_{>0}}{dx}\, \psi_{>0}^{-1} = (\psi_{>0}\, \lambda\, \psi_{>0}^{-1})_{>0}. \tag{2.6}$$

For reasons which will become clear from what follows, we call this equation for the mapping $\psi_{>0}$ a Riccati-type equation. The formal integration of Eq. (2.6) can be performed in the following way. Consider the linear differential equation

$$\frac{d\psi}{dx} = \psi \lambda \tag{2.7}$$

for the mapping $\psi : \mathbb{R} \to G$. Find the solution of this equation with the initial condition $\psi(0) = a$, where $a$ is a constant element of the Lie group $G$. Using now the Gauss decomposition (2.3) of the mapping $\psi$ we find the solution of Eq. (2.6) with the initial condition $\psi_{>0}(0) = a_{>0}$, where $a_{>0}$ is the positive grade component of $a$ arising from the Gauss decomposition (2.2). It is clear that in order to obtain the general solution of Eq. (2.6) it suffices to consider elements $a$ belonging to the Lie subgroup $G_{>0}$. Then the solution of (2.6) is expressed in terms of the solution of (2.7). Note that the solution of Eq. (2.7) with the initial condition $\psi(0) = a$ can be obtained from the solution with the initial condition $\psi(0) = e$, where $e$ is the unit element of $G$, by left multiplication by $a$.

Thus we have shown that one can associate a Riccati-type equation to any $\mathbb{Z}$-gradation of a Lie group. The integration of these equations is reduced to the integration of some matrix system of first order linear differential equations.

Let now $\chi$ be some mapping from $\mathbb{R}$ to $G$. It is clear that if the mapping $\psi$ satisfies Eq. (2.7), then the mapping $\psi' = \psi \chi^{-1}$ satisfies the equation

$$\frac{d\psi'}{dx} = \psi' \lambda',$$

where

$$\lambda' = \chi \lambda \chi^{-1} - \frac{d\chi}{dx}\, \chi^{-1}. \tag{2.8}$$
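The transformation rule (2.8) follows from a short computation for $\psi' = \psi\chi^{-1}$, written out here for convenience. The only ingredient not stated in the text is the standard identity $d(\chi^{-1})/dx = -\chi^{-1}(d\chi/dx)\chi^{-1}$ for matrix-valued functions.

```latex
\begin{aligned}
\frac{d\psi'}{dx}
  &= \frac{d\psi}{dx}\,\chi^{-1} + \psi\,\frac{d\chi^{-1}}{dx}
   = \psi\lambda\chi^{-1} - \psi\chi^{-1}\frac{d\chi}{dx}\chi^{-1} \\
  &= \psi\chi^{-1}\Bigl(\chi\lambda\chi^{-1} - \frac{d\chi}{dx}\chi^{-1}\Bigr)
   = \psi'\lambda' .
\end{aligned}
```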

If $\chi$ is a mapping from $\mathbb{R}$ to $G_0$, then the corresponding component

$$\psi'_{>0} = \chi\, \psi_{>0}\, \chi^{-1}$$

of the mapping $\psi'$ satisfies the Riccati-type equation (2.6) with $\lambda$ replaced by $\lambda'$. In this case

$$\lambda'_0 = \chi \lambda_0 \chi^{-1} - \frac{d\chi}{dx}\, \chi^{-1},$$

and it is clear that we can choose the mapping $\chi$ so that $\lambda'_0$ vanishes. Another interesting possibility arises when $\chi$ is a mapping from $\mathbb{R}$ to $G_{>0}$. Let us choose a mapping $\chi$ such that $\lambda'_{>0} = 0$. From (2.8) it follows that this case is realised if and only if

$$\frac{d\chi}{dx}\, \chi^{-1} = (\chi \lambda \chi^{-1})_{>0},$$


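i.e., $\chi$ should satisfy the Riccati-type equation (2.6). Thus, having a particular solution of the Riccati-type equation, its general solution can be constructed from the general solution of the equation with $\lambda_{>0} = 0$. As will be shown below, for this case the Riccati-type equation can be solved in a quite simple way.

3. Simplest Example

Consider first the case of the Lie group $GL(n, \mathbb{C})$, $n \ge 2$, and represent $n$ as the sum of two positive integers $n_1$ and $n_2$. For the Lie algebra $\mathfrak{gl}(n, \mathbb{C})$ there is a $\mathbb{Z}$-gradation where arbitrary elements $x_{<0}$, $x_0$ and $x_{>0}$ of the subalgebras $\mathfrak{g}_{<0}$, $\mathfrak{g}_0$ and $\mathfrak{g}_{>0}$ have the form

$$x_{<0} = \begin{pmatrix} 0 & 0 \\ (x_{<0})_{21} & 0 \end{pmatrix}, \qquad x_0 = \begin{pmatrix} (x_0)_{11} & 0 \\ 0 & (x_0)_{22} \end{pmatrix}, \qquad x_{>0} = \begin{pmatrix} 0 & (x_{>0})_{12} \\ 0 & 0 \end{pmatrix}.$$

Here $(x_{<0})_{21}$ is an $n_2 \times n_1$ matrix, $(x_{>0})_{12}$ is an $n_1 \times n_2$ matrix, $(x_0)_{11}$ and $(x_0)_{22}$ are $n_1 \times n_1$ and $n_2 \times n_2$ matrices respectively. The corresponding subgroups $G_{<0}$, $G_0$ and $G_{>0}$ are formed by the matrices

$$a_{<0} = \begin{pmatrix} I_{n_1} & 0 \\ (a_{<0})_{21} & I_{n_2} \end{pmatrix}, \qquad a_0 = \begin{pmatrix} (a_0)_{11} & 0 \\ 0 & (a_0)_{22} \end{pmatrix}, \qquad a_{>0} = \begin{pmatrix} I_{n_1} & (a_{>0})_{12} \\ 0 & I_{n_2} \end{pmatrix}.$$

Here $(a_{<0})_{21}$ is an arbitrary $n_2 \times n_1$ matrix, $(a_{>0})_{12}$ is an arbitrary $n_1 \times n_2$ matrix, $(a_0)_{11}$ and $(a_0)_{22}$ are arbitrary nondegenerate $n_1 \times n_1$ and $n_2 \times n_2$ matrices respectively. The Gauss decomposition (2.2) of an element

$$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

is given by the relations

$$(a_{<0})_{21} = a_{21} a_{11}^{-1}, \qquad (a_{>0})_{12} = a_{11}^{-1} a_{12}, \tag{3.1}$$

$$(a_0)_{11} = a_{11}, \qquad (a_0)_{22} = a_{22} - a_{21} a_{11}^{-1} a_{12}. \tag{3.2}$$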
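As a quick sanity check of relations (3.1) and (3.2), the following sketch treats the simplest case $n_1 = n_2 = 1$, where all blocks are plain numbers. The function names `gauss_decompose` and `recompose` and the sample values are illustrative assumptions, not taken from the paper.

```python
def gauss_decompose(a11, a12, a21, a22):
    """Return ((a<0)_21, (a0)_11, (a0)_22, (a>0)_12) for a 2x2 matrix with a11 != 0."""
    if a11 == 0:
        raise ValueError("the Gauss decomposition requires a11 to be invertible")
    lower = a21 / a11               # (a<0)_21 = a21 a11^{-1}
    upper = a12 / a11               # (a>0)_12 = a11^{-1} a12
    d1 = a11                        # (a0)_11
    d2 = a22 - a21 * a12 / a11      # (a0)_22 = a22 - a21 a11^{-1} a12
    return lower, d1, d2, upper


def recompose(lower, d1, d2, upper):
    """Multiply a<0 * a0 * a>0 and return the entries (a11, a12, a21, a22)."""
    return (d1, d1 * upper, lower * d1, lower * d1 * upper + d2)
```

The requirement that `a11` be invertible is the scalar form of the statement that the decomposition holds only on a dense subset of $G$.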

Returning to Eq. (2.1), since $\lambda$ is a general element of the Lie algebra $\mathfrak{g}$, it can be parametrised as

$$\lambda = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

and $\psi_{>0}$ as

$$\psi_{>0} = \begin{pmatrix} I_{n_1} & U \\ 0 & I_{n_2} \end{pmatrix}. \tag{3.3}$$

One easily sees that Eq. (2.6) takes in the case under consideration the form

$$\frac{dU}{dx} = B - AU + UD - UCU. \tag{3.4}$$

In the case $n = 2$, $n_1 = n_2 = 1$, we have the usual Riccati equation. For $n = 2m$, $n_1 = n_2 = m$, we come to the so-called matrix Riccati equation. This justifies our choice for the name of Eq. (2.6) in the general case. Note that Eq. (2.6) has arisen in the context of the factorization method, see, for example, [19]. However, the explicit connection of this equation with the Riccati equation and the matrix Riccati equation has not been traced yet.
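The reduction of Eq. (3.4) to the linear problem (2.7) can be illustrated numerically in the scalar case $n_1 = n_2 = 1$. The sketch below integrates the Riccati equation directly and, separately, the first row of the linear system with the initial row $(1, m)$, then compares $U$ with $\psi_{11}^{-1}\psi_{12}$ as in (3.1). The coefficient functions, the Runge-Kutta integrator and all names are illustrative assumptions.

```python
import math


def rk4(f, x0, y0, x1, steps):
    """Integrate y' = f(x, y) from x0 to x1 with the classical Runge-Kutta scheme."""
    h = (x1 - x0) / steps
    x, y = x0, list(y0)
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y


def coeffs(x):
    """Illustrative scalar coefficient functions A, B, C, D (an arbitrary choice)."""
    return 0.3, math.cos(x), 0.5, -0.2 * x


def riccati_rhs(x, y):
    a, b, c, d = coeffs(x)
    u = y[0]
    return [b - a * u + u * d - u * c * u]          # Eq. (3.4) for 1x1 blocks


def linear_rhs(x, y):
    a, b, c, d = coeffs(x)
    p11, p12 = y                                    # first row of psi
    return [p11 * a + p12 * c, p11 * b + p12 * d]   # first row of d(psi)/dx = psi * lambda


def compare(m=0.4, x1=1.0, steps=2000):
    """Return (U from the Riccati equation, U from the linear system) at x = x1."""
    u_direct = rk4(riccati_rhs, 0.0, [m], x1, steps)[0]
    p11, p12 = rk4(linear_rhs, 0.0, [1.0, m], x1, steps)
    return u_direct, p12 / p11
```

Only the first row of $\psi$ is needed because, for Eq. (2.7), each row of $\psi$ evolves independently of the others.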


3.1. Case B = 0. If $C = 0$ then Eq. (3.4) is linear. In the case $B = 0$, under the conditions $n_1 = n_2$ and $\det U(x) \ne 0$ for any $x$, the substitution $V = U^{-1}$ leads to the linear equation

$$\frac{dV}{dx} = VA - DV + C.$$

Nevertheless, it is instructive to consider the procedure of obtaining the general solution to equation (3.4) for $B = 0$. Recall that, having a particular solution to the Riccati-type equation, we can reduce the consideration to the case where $\lambda_{>0} = 0$. For the equation in question this is equivalent to the requirement $B = 0$. First, find the mapping $\chi : \mathbb{R} \to G_0$ such that transformation (2.8) would give $\lambda'_0 = 0$. Parametrising $\chi$ as

$$\chi = \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix},$$

one comes to the following equations for $Q$ and $R$:

$$\frac{dQ}{dx} = Q A, \qquad \frac{dR}{dx} = R D.$$

Therefore we can choose

$$Q(x) = P\exp\left( \int_0^x A(x')\, dx' \right), \qquad R(x) = P\exp\left( \int_0^x D(x')\, dx' \right), \tag{3.5}$$

where the symbol $P\exp(\cdot)$ denotes the path ordered exponential (multiplicative integral). Now solve the equation

$$\frac{d\psi'}{dx} = \psi' \lambda',$$

where

$$\lambda' = \begin{pmatrix} 0 & 0 \\ C' & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ R C Q^{-1} & 0 \end{pmatrix}.$$

The solution of this equation with the initial condition $\psi'(0) = I_n$ is

$$\psi'(x) = \begin{pmatrix} I_{n_1} & 0 \\ S(x) & I_{n_2} \end{pmatrix}$$

with

$$S(x) = \int_0^x R(x')\, C(x')\, Q^{-1}(x')\, dx'.$$

Hence, the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is given by

$$\psi = \begin{pmatrix} Q & 0 \\ SQ & R \end{pmatrix}. \tag{3.6}$$


To obtain the general solution of the equation under consideration we should have the solution of Eq. (2.7) with the initial condition

$$\psi(0) = \begin{pmatrix} I_{n_1} & m \\ 0 & I_{n_2} \end{pmatrix}, \tag{3.7}$$

where $m$ is an arbitrary $n_1 \times n_2$ matrix. Such a solution is represented as

$$\psi = \begin{pmatrix} (I_{n_1} + mS) Q & mR \\ SQ & R \end{pmatrix}.$$

Now, using (3.1) we conclude that the general solution to Eq. (3.4) in the case $B = 0$ is

$$U = Q^{-1} (I_{n_1} + mS)^{-1}\, m R,$$

where $Q$, $R$ and $S$ are given by relations (3.5) and (3.6). Thus we see that in the case when $\lambda$ is a block upper or lower triangular matrix the Riccati-type equation (3.4) can be explicitly integrated. Actually, if $\lambda$ is a constant mapping we can reduce it by a similarity transformation to the block upper or lower triangular form and solve the corresponding Riccati-type equation. The solution of the initial equation is then obtained by some algebraic calculations.

3.2. The case A = 0 and D = 0. Representing the mapping $\psi$ in the form

$$\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{pmatrix}$$

one easily sees that Eq. (2.7) is equivalent to the system

$$\frac{d\psi_{11}}{dx} = \psi_{12} C, \qquad \frac{d\psi_{12}}{dx} = \psi_{11} B, \tag{3.8}$$

$$\frac{d\psi_{21}}{dx} = \psi_{22} C, \qquad \frac{d\psi_{22}}{dx} = \psi_{21} B. \tag{3.9}$$

3.2.1. The case C = B. Consider the case $C = B$; that is certainly possible only if $n_1 = n_2$. In this case we can rewrite Eqs. (3.8) and (3.9) as

$$\frac{d(\psi_{11} + \psi_{12})}{dx} = (\psi_{11} + \psi_{12}) B, \qquad \frac{d(\psi_{11} - \psi_{12})}{dx} = -(\psi_{11} - \psi_{12}) B,$$

$$\frac{d(\psi_{22} + \psi_{21})}{dx} = (\psi_{22} + \psi_{21}) B, \qquad \frac{d(\psi_{22} - \psi_{21})}{dx} = -(\psi_{22} - \psi_{21}) B.$$

Hence, the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is given by

$$\psi = \frac{1}{2} \begin{pmatrix} F + H & F - H \\ F - H & F + H \end{pmatrix},$$

where

$$F(x) = P\exp\left( \int_0^x B(x')\, dx' \right), \qquad H(x) = P\exp\left( -\int_0^x B(x')\, dx' \right).$$


The solution of Eq. (2.7) with the initial condition of form (3.7) is

$$\psi = \frac{1}{2} \begin{pmatrix} F + H + m(F - H) & F - H + m(F + H) \\ F - H & F + H \end{pmatrix};$$

therefore, the general solution to the Riccati-type equation under consideration can be written as

$$U = (F + H + m(F - H))^{-1} (F - H + m(F + H)).$$

3.2.2. The case of constant B and C. As we noted above, the general solution to the Riccati-type equations (3.4) for the case of a constant mapping $\lambda$ can be obtained by a reduction of $\lambda$ to the block upper or lower triangular form. Nevertheless, it is interesting to consider the particular case of constant $\lambda$ in which the general solution has the simplest form. Suppose that $n_1 = n_2$ and that $B$ and $C$ are constant nondegenerate matrices. In this case the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ is

$$\psi(x) = \begin{pmatrix} \cosh(\sqrt{BC}\, x) & \sinh(\sqrt{BC}\, x)\, \sqrt{BC}\, C^{-1} \\ \sinh(\sqrt{CB}\, x)\, \sqrt{CB}\, B^{-1} & \cosh(\sqrt{CB}\, x) \end{pmatrix},$$

and for the general solution one has

$$U(x) = \left( \cosh(\sqrt{BC}\, x) + m \sinh(\sqrt{CB}\, x)\, \sqrt{CB}\, B^{-1} \right)^{-1} \left( \sinh(\sqrt{BC}\, x)\, \sqrt{BC}\, C^{-1} + m \cosh(\sqrt{CB}\, x) \right).$$

It should be noted here that the expression for $U(x)$ does not actually contain square roots of matrices, as can easily be seen from the corresponding expansions into power series.
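The constant-coefficient formula can be checked in the scalar case, where Eq. (3.4) with $A = D = 0$ reads $dU/dx = B - C U^2$. The sketch below evaluates the formula with a complex square root, so that it also covers $BC < 0$, and estimates the residual of the equation by a central difference. The function names, the step size `h` and the sample values are illustrative assumptions.

```python
import cmath


def u_constant(x, b, c, m):
    """Scalar form of the general solution for constant B and C (A = D = 0)."""
    k = cmath.sqrt(b * c)                 # purely imaginary when B*C < 0
    sh, ch = cmath.sinh(k * x), cmath.cosh(k * x)
    numerator = sh * k / c + m * ch       # sinh(kx) k C^{-1} + m cosh(kx)
    denominator = ch + m * sh * k / b     # cosh(kx) + m sinh(kx) k B^{-1}
    return numerator / denominator


def residual(x, b, c, m, h=1e-5):
    """Central-difference estimate of U' - (B - C U^2) at the point x."""
    du = (u_constant(x + h, b, c, m) - u_constant(x - h, b, c, m)) / (2 * h)
    u = u_constant(x, b, c, m)
    return du - (b - c * u * u)
```

For $BC < 0$ the intermediate quantity `k` is imaginary while the value of $U$ stays real, which is a scalar illustration of the remark that the square roots drop out of the final expression.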

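4. A Further Example

The next example is based on another $\mathbb{Z}$-gradation of the Lie algebra $\mathfrak{gl}(n, \mathbb{C})$. Here one represents $n$ as the sum of three positive integers $n_1$, $n_2$ and $n_3$ and considers an element $x$ of $\mathfrak{gl}(n, \mathbb{C})$ as a $3 \times 3$ block matrix $(x_{rs})$ with $x_{rs}$ being an $n_r \times n_s$ matrix. The subspace $\mathfrak{g}_m$ is formed by the block matrices $x = (x_{rs})$ where only the blocks $x_{rs}$ with $s - r = m$ are different from zero. Arbitrary elements $x_{<0}$, $x_0$ and $x_{>0}$ of the subalgebras $\mathfrak{g}_{<0}$, $\mathfrak{g}_0$ and $\mathfrak{g}_{>0}$ have the form

$$x_{<0} = \begin{pmatrix} 0 & 0 & 0 \\ (x_{<0})_{21} & 0 & 0 \\ (x_{<0})_{31} & (x_{<0})_{32} & 0 \end{pmatrix}, \qquad x_{>0} = \begin{pmatrix} 0 & (x_{>0})_{12} & (x_{>0})_{13} \\ 0 & 0 & (x_{>0})_{23} \\ 0 & 0 & 0 \end{pmatrix},$$

$$x_0 = \begin{pmatrix} (x_0)_{11} & 0 & 0 \\ 0 & (x_0)_{22} & 0 \\ 0 & 0 & (x_0)_{33} \end{pmatrix}.$$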


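The subgroups $G_{<0}$, $G_0$ and $G_{>0}$ are formed by the nondegenerate matrices

$$a_{<0} = \begin{pmatrix} I_{n_1} & 0 & 0 \\ (a_{<0})_{21} & I_{n_2} & 0 \\ (a_{<0})_{31} & (a_{<0})_{32} & I_{n_3} \end{pmatrix}, \qquad a_{>0} = \begin{pmatrix} I_{n_1} & (a_{>0})_{12} & (a_{>0})_{13} \\ 0 & I_{n_2} & (a_{>0})_{23} \\ 0 & 0 & I_{n_3} \end{pmatrix},$$

$$a_0 = \begin{pmatrix} (a_0)_{11} & 0 & 0 \\ 0 & (a_0)_{22} & 0 \\ 0 & 0 & (a_0)_{33} \end{pmatrix}.$$

The Gauss decomposition of an element $a \in GL(n, \mathbb{C})$ is determined by the relations

$$(a_{<0})_{21} = a_{21} a_{11}^{-1}, \qquad (a_{<0})_{31} = a_{31} a_{11}^{-1},$$

$$(a_{<0})_{32} = (a_{32} - a_{31} a_{11}^{-1} a_{12})(a_{22} - a_{21} a_{11}^{-1} a_{12})^{-1},$$

$$(a_0)_{11} = a_{11}, \qquad (a_0)_{22} = a_{22} - a_{21} a_{11}^{-1} a_{12},$$

$$(a_0)_{33} = a_{33} - a_{31} a_{11}^{-1} a_{13} - (a_{32} - a_{31} a_{11}^{-1} a_{12})(a_{22} - a_{21} a_{11}^{-1} a_{12})^{-1}(a_{23} - a_{21} a_{11}^{-1} a_{13}),$$

$$(a_{>0})_{12} = a_{11}^{-1} a_{12}, \qquad (a_{>0})_{13} = a_{11}^{-1} a_{13},$$

$$(a_{>0})_{23} = (a_{22} - a_{21} a_{11}^{-1} a_{12})^{-1}(a_{23} - a_{21} a_{11}^{-1} a_{13}).$$

We parametrise the mapping $\lambda$ as

$$\lambda = \begin{pmatrix} A_{11} & B_{12} & B_{13} \\ C_{21} & A_{22} & B_{23} \\ C_{31} & C_{32} & A_{33} \end{pmatrix}$$

and the mapping $\psi_{>0}$ as

$$\psi_{>0} = \begin{pmatrix} I_{n_1} & U_{12} & U_{13} \\ 0 & I_{n_2} & U_{23} \\ 0 & 0 & I_{n_3} \end{pmatrix}.$$

After some algebra one sees that the Riccati-type equations for the case under consideration are

$$\frac{dU_{12}}{dx} = B_{12} - A_{11} U_{12} + U_{12} A_{22} + U_{13} C_{32} - U_{12} C_{21} U_{12} - U_{13} C_{31} U_{12},$$

$$\frac{dU_{23}}{dx} = B_{23} - A_{22} U_{23} + U_{23} A_{33} - C_{21} U_{13} + C_{21} U_{12} U_{23} - U_{23} C_{31} U_{13} - U_{23} C_{32} U_{23} + U_{23} C_{31} U_{12} U_{23},$$

$$\frac{dU_{13}}{dx} = B_{13} - A_{11} U_{13} + U_{13} A_{33} + U_{12} B_{23} - U_{12} C_{21} U_{13} - U_{13} C_{31} U_{13}.$$

Consider the case where $B_{rs} = 0$. Here by transformation (2.8) we can reduce our equations to the case where additionally $A_{rr} = 0$. In the latter case the solution of Eq. (2.7) with the initial condition $\psi(0) = I_n$ has the form

$$\psi = \begin{pmatrix} I_{n_1} & 0 & 0 \\ S_{21} & I_{n_2} & 0 \\ S_{31} & S_{32} & I_{n_3} \end{pmatrix},$$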


where

$$S_{21}(x) = \int_0^x C_{21}(x')\, dx', \qquad S_{32}(x) = \int_0^x C_{32}(x')\, dx',$$

$$S_{31}(x) = \int_0^x \left( C_{31}(x') + \left( \int_0^{x'} C_{32}(x'')\, dx'' \right) C_{21}(x') \right) dx'.$$

Using the explicit expressions for the Gauss decomposition given in this section, we find that the solution to the Riccati-type equation under consideration with the initial condition

$$\psi_{>0}(0) = \begin{pmatrix} I_{n_1} & m_{12} & m_{13} \\ 0 & I_{n_2} & m_{23} \\ 0 & 0 & I_{n_3} \end{pmatrix}$$

is determined by the relations

$$U_{12} = (I_{n_1} + m_{12} S_{21} + m_{13} S_{31})^{-1} (m_{12} + m_{13} S_{32}),$$

$$U_{13} = (I_{n_1} + m_{12} S_{21} + m_{13} S_{31})^{-1} m_{13},$$

$$U_{23} = \left( I_{n_2} + m_{23} S_{32} - (S_{21} + m_{23} S_{31})(I_{n_1} + m_{12} S_{21} + m_{13} S_{31})^{-1} (m_{12} + m_{13} S_{32}) \right)^{-1} \left( m_{23} - (S_{21} + m_{23} S_{31})(I_{n_1} + m_{12} S_{21} + m_{13} S_{31})^{-1} m_{13} \right).$$

The above consideration can be directly generalised to the case of the $\mathbb{Z}$-gradation of $\mathfrak{gl}(n, \mathbb{C})$ which leads to the natural representation of $n \times n$ matrices as $p \times p$ block matrices. The corresponding equations become increasingly complicated as $p$ grows. Nevertheless, at least for the case of constant mappings $\lambda$ and for the case of block upper or lower triangular mappings $\lambda$, they can be explicitly integrated. Actually these gradations exhaust in a sense all possible $\mathbb{Z}$-gradations of the Lie algebra $\mathfrak{gl}(n, \mathbb{C})$ [23,21].

5. Multidimensional Riccati-Type Equations

Let now $\lambda_i$, $i = 1, \ldots, d$, be some $\mathfrak{g}$-valued functions on $\mathbb{R}^d$ whose standard coordinates are denoted by $x^i$. Consider the following system of equations for a mapping $\psi$ from $\mathbb{R}^d$ to the Lie group $G$:

$$\partial_i \psi = \psi \lambda_i, \tag{5.1}$$

where $\partial_i = \partial / \partial x^i$. The integrability conditions for system (5.1) look like

$$\partial_i \lambda_j - \partial_j \lambda_i + [\lambda_i, \lambda_j] = 0. \tag{5.2}$$

Similarly to the one dimensional case we obtain the following equations for the component $\psi_{>0}$ entering the Gauss decomposition of type (2.3):

$$\partial_i \psi_{>0}\, \psi_{>0}^{-1} = (\psi_{>0}\, \lambda_i\, \psi_{>0}^{-1})_{>0}. \tag{5.3}$$


We call these equations multidimensional Riccati-type equations. The integration of Eqs. (5.3) is again reduced to the integration of linear system (5.1). The transformation (2.8), where $\chi$ is a mapping from $\mathbb{R}^d$ to $G_0$, cannot in general be used now to get the Riccati-type equations with $(\lambda_i)_0 = 0$. Indeed, to this end we should solve the equations

$$\chi^{-1} \partial_i \chi = (\lambda_i)_0. \tag{5.4}$$

The integrability conditions for these equations do not in general follow from (5.2). However, for the case $(\lambda_i)_{>0} = 0$ the integrability conditions for Eqs. (5.4) are a consequence of relations (5.2), and we can, with the help of transformation (2.8), reduce these equations to the case where $(\lambda_i)_0 = 0$. Note that in the multidimensional case it is again possible to use transformation (2.8), where $\chi$ is some solution of the Riccati-type equations, to reduce the equations to the case where $(\lambda_i)_{>0} = 0$. When $\lambda_i$ are constant mappings, conditions (5.2) imply that the matrices $\lambda_i$ commute. Here, by a similarity transformation, we can reduce $\lambda_i$ to a triangular form. In such a case, and not only for constant $\lambda_i$, the multidimensional Riccati-type equations can be integrated by a procedure similar to the one used in the one dimensional case.

As a concrete example consider the Lie group $GL(n, \mathbb{C})$ with the gradation of its Lie algebra described in Sect. 3. Parametrising the mappings $\lambda_i$ as

$$\lambda_i = \begin{pmatrix} A_i & B_i \\ C_i & D_i \end{pmatrix}$$

and using for the mapping $\psi_{>0}$ parametrisation (3.3), we come to the following multidimensional Riccati-type equations:

$$\partial_i U = B_i - A_i U + U D_i - U C_i U. \tag{5.5}$$

When $A_i = 0$, $B_i = 0$ and $D_i = 0$, conditions (5.2) become $\partial_i C_j - \partial_j C_i = 0$; hence, there exists a mapping $S$ such that $C_i = \partial_i S$. Then the general solution of Eqs. (5.5) has the form

$$U = (I_{n_1} + mS)^{-1} m,$$

where $m$ is an arbitrary $n_1 \times n_2$ matrix.

6. Generalised WZNW Equations and Multidimensional Toda Equations

Consider the space $\mathbb{R}^{2d}$ as a differential manifold and denote the standard coordinates on $\mathbb{R}^{2d}$ by $z^{-i}$, $z^{+i}$, $i = 1, \ldots, d$. Let $\psi$ be a mapping from $\mathbb{R}^{2d}$ to the Lie group $G$ which satisfies the equations

$$\partial_{+j} (\psi^{-1} \partial_{-i} \psi) = 0,$$

which can be equivalently rewritten as

$$\partial_{-i} (\partial_{+j} \psi\, \psi^{-1}) = 0. \tag{6.1}$$


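Here and in what follows we use the notations $\partial_{-i} = \partial / \partial z^{-i}$ and $\partial_{+j} = \partial / \partial z^{+j}$. In accordance with [16] we call Eqs. (6.1) the generalised WZNW equations. It is well known that the two dimensional Toda equations can be considered as reductions of the WZNW equations; for a review we refer the reader to the remarkable paper [14], and for the affine case to [3,15]. Let us show that in the multidimensional situation the appropriate reductions of the generalised WZNW equations give the multidimensional Toda equations recently proposed and investigated in [22]. It is clear that the $\mathfrak{g}$-valued mappings

$$\iota_{-i} = \psi^{-1} \partial_{-i} \psi, \qquad \iota_{+j} = -\partial_{+j} \psi\, \psi^{-1} \tag{6.2}$$

satisfy the relations

$$\partial_{+j} \iota_{-i} = 0, \qquad \partial_{-i} \iota_{+j} = 0. \tag{6.3}$$

Moreover, the mappings $\iota_{-i}$ and $\iota_{+i}$ satisfy, by construction, the following zero curvature conditions:

$$\partial_{-i} \iota_{-j} - \partial_{-j} \iota_{-i} + [\iota_{-i}, \iota_{-j}] = 0, \qquad \partial_{+i} \iota_{+j} - \partial_{+j} \iota_{+i} + [\iota_{+i}, \iota_{+j}] = 0. \tag{6.4}$$

The reduction in question is realised by imposing on the mapping $\psi$ the constraints

$$(\psi^{-1} \partial_{-i} \psi)_{<0} = c_{-i}, \qquad (\partial_{+i} \psi\, \psi^{-1})_{>0} = -c_{+i}, \tag{6.5}$$

where $c_{-i}$ and $c_{+i}$ are some fixed mappings taking values in the subspaces $\mathfrak{g}_{-1}$ and $\mathfrak{g}_{+1}$ respectively. In other words, one imposes the restrictions

$$(\iota_{-i})_{<0} = c_{-i}, \qquad (\iota_{+i})_{>0} = c_{+i}.$$

From (6.3) and (6.4) it follows that we should consider only the mappings $c_{-i}$ and $c_{+i}$ which satisfy the conditions

$$\partial_{+j} c_{-i} = 0, \qquad \partial_{-i} c_{+j} = 0, \tag{6.6}$$

$$[c_{-i}, c_{-j}] = 0, \qquad [c_{+i}, c_{+j}] = 0. \tag{6.7}$$

Using the Gauss decomposition (2.3) we have

$$\psi^{-1} \partial_{-i} \psi = \psi_{>0}^{-1} \psi_0^{-1} (\psi_{<0}^{-1} \partial_{-i} \psi_{<0}) \psi_0 \psi_{>0} + \psi_{>0}^{-1} (\psi_0^{-1} \partial_{-i} \psi_0) \psi_{>0} + \psi_{>0}^{-1} \partial_{-i} \psi_{>0}.$$

Taking into account the first equality of (6.5), one sees that

$$\psi_0^{-1} (\psi_{<0}^{-1} \partial_{-i} \psi_{<0}) \psi_0 = c_{-i}. \tag{6.8}$$

Similarly one obtains the equality

$$\partial_{+i} \psi\, \psi^{-1} = \partial_{+i} \psi_{<0}\, \psi_{<0}^{-1} + \psi_{<0} (\partial_{+i} \psi_0\, \psi_0^{-1}) \psi_{<0}^{-1} + \psi_{<0} \psi_0 (\partial_{+i} \psi_{>0}\, \psi_{>0}^{-1}) \psi_0^{-1} \psi_{<0}^{-1},$$

which implies

$$\psi_0 (\partial_{+i} \psi_{>0}\, \psi_{>0}^{-1}) \psi_0^{-1} = -c_{+i}. \tag{6.9}$$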


Let us use now the observation that the generalised WZNW equations can be considered as the zero curvature condition for the connection on the trivial principal fibre bundle $\mathbb{R}^{2d} \times G$ determined by the $\mathfrak{g}$-valued 1-form $\rho$ on $\mathbb{R}^{2d}$ with the components

$$\rho_{-i} = \psi^{-1} \partial_{-i} \psi, \qquad \rho_{+i} = 0.$$

After the gauge transformation of the form $\rho$ generated by the mapping $\psi_{>0}^{-1}$ we come to the connection form $\omega$ with the components

$$\omega_{-i} = \psi_0^{-1} (\psi_{<0}^{-1} \partial_{-i} \psi_{<0}) \psi_0 + \psi_0^{-1} \partial_{-i} \psi_0, \qquad \omega_{+i} = \psi_{>0}\, \partial_{+i} \psi_{>0}^{-1}.$$

Since the zero curvature condition is invariant with respect to gauge transformations, we conclude that the generalised WZNW equations are equivalent to the zero curvature condition for the form $\omega$. Using (6.8), (6.9) and denoting $\psi_0$ by $\gamma$ we see that

$$\omega_{-i} = c_{-i} + \gamma^{-1} \partial_{-i} \gamma, \qquad \omega_{+i} = \gamma^{-1} c_{+i} \gamma. \tag{6.10}$$

These are exactly the components of the form whose zero curvature condition leads to the multidimensional Toda equations [22]² having the following explicit form:

$$\partial_{-i} (\gamma c_{-j} \gamma^{-1}) = \partial_{-j} (\gamma c_{-i} \gamma^{-1}), \tag{6.11}$$

$$\partial_{+j} (\gamma^{-1} \partial_{-i} \gamma) = [c_{-i}, \gamma^{-1} c_{+j} \gamma], \tag{6.12}$$

$$\partial_{+i} (\gamma^{-1} c_{+j} \gamma) = \partial_{+j} (\gamma^{-1} c_{+i} \gamma). \tag{6.13}$$

Thus, if a mapping $\psi$ satisfies the generalised WZNW equations (6.1) and constraints (6.5), then its component $\psi_0$, entering the Gauss decomposition (2.3), satisfies the multidimensional Toda equations (6.11)–(6.13). On the other hand, assume that $\gamma$ is a solution of the multidimensional Toda equations (6.11)–(6.13); then putting $\psi_0 = \gamma$ and choosing some $\psi_{<0}$ and $\psi_{>0}$ which satisfy (6.8) and (6.9), respectively, one can construct the solution $\psi = \psi_{<0} \psi_0 \psi_{>0}$ of the generalised WZNW equations subject to constraints (6.5). The explicit construction of the mappings $\psi_{<0}$ and $\psi_{>0}$ from a given solution of the Toda equation for the two-dimensional case was considered in [17]. Below we give the generalisation of such a construction to the multidimensional case.

First recall the procedure of obtaining the general solution to the multidimensional Toda equations [22]. Let $\gamma_-$ and $\gamma_+$ be some mappings from $\mathbb{R}^{2d}$ to $G_0$ satisfying the conditions

$$\partial_{+i} \gamma_- = 0, \qquad \partial_{-i} \gamma_+ = 0.$$

Consider the equations

$$\mu_-^{-1} \partial_{-i} \mu_- = \gamma_- c_{-i} \gamma_-^{-1}, \qquad \mu_+^{-1} \partial_{+i} \mu_+ = \gamma_+ c_{+i} \gamma_+^{-1}, \tag{6.14}$$

where $\mu_-$ and $\mu_+$ obey the conditions

$$\partial_{+i} \mu_- = 0, \qquad \partial_{-i} \mu_+ = 0.$$

² In [22] there was considered the case of constant $c_{-i}$ and $c_{+i}$. The generalisation to the case of arbitrary $c_{-i}$ and $c_{+i}$ satisfying (6.6) and (6.7) is straightforward.


The integrability conditions for Eqs. (6.14) are

$$\partial_{-i} (\gamma_- c_{-j} \gamma_-^{-1}) - \partial_{-j} (\gamma_- c_{-i} \gamma_-^{-1}) = 0, \qquad \partial_{+i} (\gamma_+ c_{+j} \gamma_+^{-1}) - \partial_{+j} (\gamma_+ c_{+i} \gamma_+^{-1}) = 0.$$

Hence, the mappings $\gamma_-$ and $\gamma_+$ cannot be arbitrary. Suppose that the above integrability conditions are satisfied and solve Eqs. (6.14). Consider the Gauss decomposition

$$\mu_+^{-1} \mu_- = \nu_-\, \eta\, \nu_+^{-1}, \tag{6.15}$$

where the mapping $\nu_-$ takes values in $G_{<0}$, the mapping $\eta$ takes values in $G_0$ and the mapping $\nu_+$ takes values in $G_{>0}$. It can be shown [22] that the mapping

$$\gamma = \gamma_+^{-1}\, \eta\, \gamma_- \tag{6.16}$$

satisfies the multidimensional Toda equations (6.11)–(6.13). Since the manifold $\mathbb{R}^{2d}$ is simply connected and the connection form $\omega$ satisfies the zero curvature condition, there exists a mapping $\varphi : \mathbb{R}^{2d} \to G$ such that

$$\omega_{-i} = \varphi^{-1} \partial_{-i} \varphi, \qquad \omega_{+i} = \varphi^{-1} \partial_{+i} \varphi.$$

As was shown in [22], the general form of the mapping $\varphi$ corresponding to the solution of the multidimensional Toda equations constructed with the help of the above described procedure is

$$\varphi = a \mu_+ \nu_- \eta \gamma_- = a \mu_- \nu_+ \gamma_-, \tag{6.17}$$

where $a$ is an arbitrary constant element of the Lie group $G$. Using (6.17) we have

$$\omega_{-i} = \varphi^{-1} \partial_{-i} \varphi = (\eta \gamma_-)^{-1} (\nu_-^{-1} \partial_{-i} \nu_-)\, \eta \gamma_- + (\eta \gamma_-)^{-1} \partial_{-i} (\eta \gamma_-).$$

Comparing this relation with the first equality in (6.10) and taking into account (6.16) we conclude that

$$(\gamma_+^{-1} \nu_- \gamma_+)^{-1} \partial_{-i} (\gamma_+^{-1} \nu_- \gamma_+) = \gamma c_{-i} \gamma^{-1}.$$

Thus we see that the general solution of Eqs. (6.8) with $\psi_0 = \gamma$ can be written as

$$\psi_{<0} = \xi_-^{-1} \gamma_+^{-1} \nu_- \gamma_+, \tag{6.18}$$

where $\xi_-$ is an arbitrary mapping which takes values in $G_{<0}$ and satisfies the conditions $\partial_{-i} \xi_- = 0$. In a similar way we obtain the relation

$$\partial_{+i} (\gamma_-^{-1} \nu_+^{-1} \gamma_-)\, (\gamma_-^{-1} \nu_+ \gamma_-) = -\gamma^{-1} c_{+i} \gamma,$$

which implies that the general solution of Eqs. (6.9) with $\psi_0 = \gamma$ is given by

$$\psi_{>0} = \gamma_-^{-1} \nu_+^{-1} \gamma_- \xi_+, \tag{6.19}$$

where $\xi_+$ is an arbitrary mapping which takes values in $G_{>0}$ and satisfies the conditions $\partial_{+i} \xi_+ = 0$.


Using relations (6.18) and (6.19) we come to the following representation for the solution of the generalised WZNW equations corresponding to the solution of the multidimensional Toda equations $\psi_0 = \gamma$:

$$\psi = \psi_{<0} \psi_0 \psi_{>0} = \xi_-^{-1} \gamma_+^{-1} \nu_- \eta \nu_+^{-1} \gamma_- \xi_+.$$

Due to relation (6.15) this representation is equivalent to

$$\psi = \xi_-^{-1} \gamma_+^{-1} \mu_+^{-1} \mu_- \gamma_- \xi_+.$$

In the next section we use this representation to construct some integrable classes of the multidimensional Riccati-type equations.

7. Multidimensional Toda Systems and Riccati-Type Equations

Let $\lambda_{-i}$ and $\lambda_{+i}$, $i = 1, \ldots, d$, be some fixed mappings from the manifold $\mathbb{R}^{2d}$ to the Lie algebra $\mathfrak{g}$ which satisfy the conditions

$$\partial_{+j} \lambda_{-i} = 0, \qquad \partial_{-j} \lambda_{+i} = 0. \tag{7.1}$$

Consider the system of equations

$$\partial_{-i} \psi = \psi \lambda_{-i}, \qquad \partial_{+i} \psi = -\lambda_{+i} \psi, \tag{7.2}$$

where $\psi$ is a mapping from $\mathbb{R}^{2d}$ to the Lie group $G$. The integrability conditions for this system are given by

$$\partial_{-i} \lambda_{-j} - \partial_{-j} \lambda_{-i} + [\lambda_{-i}, \lambda_{-j}] = 0, \qquad \partial_{+i} \lambda_{+j} - \partial_{+j} \lambda_{+i} + [\lambda_{+i}, \lambda_{+j}] = 0. \tag{7.3}$$

It is clear that the mapping $\psi$ satisfies the generalised WZNW equations. Hence we can treat system (7.2), with the mappings $\lambda_{-i}$ and $\lambda_{+i}$ satisfying (7.1) and (7.3), as a reduction of the generalised WZNW equations similar to the reduction considered in the previous section. The difference is that in the previous section we fixed only the components $(\iota_{-i})_{<0}$ and $(\iota_{+i})_{>0}$ of the mappings $\iota_{-i}$ and $\iota_{+i}$ and did it in a quite special way, but here we fix the mappings $\iota_{-i}$ and $\iota_{+i}$ completely. It is easy to show that if the mapping $\psi$ satisfies Eqs. (7.2) then the mappings $\psi_{<0}^{-1}$ and $\psi_{>0}$ satisfy the multidimensional Riccati-type equations

$$\partial_{+i} \psi_{<0}^{-1}\, \psi_{<0} = (\psi_{<0}^{-1}\, \lambda_{+i}\, \psi_{<0})_{<0}, \tag{7.4}$$

$$\partial_{-i} \psi_{>0}\, \psi_{>0}^{-1} = (\psi_{>0}\, \lambda_{-i}\, \psi_{>0}^{-1})_{>0}. \tag{7.5}$$
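The step described as easy to show can be sketched as follows for Eq. (7.5); the argument repeats the one leading to (2.6). Here we write $\psi = \psi_{\le 0}\psi_{>0}$ with $\psi_{\le 0} = \psi_{<0}\psi_0$, as in Sect. 2.

```latex
\begin{aligned}
\lambda_{-i} = \psi^{-1}\partial_{-i}\psi
  &= \psi_{>0}^{-1}\bigl(\psi_{\le 0}^{-1}\partial_{-i}\psi_{\le 0}\bigr)\psi_{>0}
     + \psi_{>0}^{-1}\partial_{-i}\psi_{>0}, \\
\psi_{>0}\,\lambda_{-i}\,\psi_{>0}^{-1}
  &= \psi_{\le 0}^{-1}\partial_{-i}\psi_{\le 0}
     + \partial_{-i}\psi_{>0}\,\psi_{>0}^{-1}.
\end{aligned}
```

The first term on the right of the second line takes values in $\mathfrak{g}_{\le 0}$ and the second in $\mathfrak{g}_{>0}$, so taking the $>0$ component gives (7.5). Equation (7.4) follows in the same way from $\partial_{+i}\psi\,\psi^{-1} = -\lambda_{+i}$, written with $\psi = \psi_{<0}(\psi_0\psi_{>0})$, together with the identity $\partial_{+i}(\psi_{<0}^{-1})\,\psi_{<0} = -\psi_{<0}^{-1}\partial_{+i}\psi_{<0}$.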

Equations (7.2) are a multidimensional generalisation of the so-called associated Redheffer–Reid system [25–27]. The investigation of that system is very useful for studying one dimensional Riccati and matrix Riccati equations, see for example [33]. We believe that our generalisation also plays a significant role for the multidimensional Riccati-type equations. As a first application of such systems let us give a construction of some integrable class of the multidimensional Riccati-type equations. Suppose now that the mappings $\lambda_{-i}$ and $\lambda_{+i}$ are such that

$$(\lambda_{-i})_{<0} = c_{-i}, \qquad (\lambda_{+i})_{>0} = c_{+i} \tag{7.6}$$


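with the mappings $c_{-i}$ and $c_{+i}$ taking values in $\mathfrak{g}_{-1}$ and $\mathfrak{g}_{+1}$, respectively, and subject to conditions (6.6) and (6.7). In this case the mapping $\gamma = \psi_0$ satisfies the multidimensional Toda equations (6.11)–(6.13). On the other hand, if we have a solution $\gamma$ of Eqs. (6.11)–(6.13), then using the results of the previous section we can find the general solution to Eqs. (6.8) and (6.9), and construct the mapping $\psi$ which satisfies the generalised WZNW equations and constraints (6.5). This mapping, via equalities (7.2), generates some mappings $\lambda_{-i}$ and $\lambda_{+i}$ certainly satisfying constraints (7.6). Actually, if we have the general solution to the multidimensional Toda equations (6.11)–(6.13), then we get in this way the general form of the mappings $\lambda_{-i}$ and $\lambda_{+i}$ which satisfy the integrability conditions (7.3) and constraints (7.6). Moreover, we have here the general solution to the multidimensional Riccati equations (7.4) and (7.5). The explicit form of the mappings $\lambda_{-i}$ and $\lambda_{+i}$ obtained with the help of the above described procedure is

$$\lambda_{-i} = \xi_+^{-1} c_{-i} \xi_+ + \xi_+^{-1} (\gamma_-^{-1} \partial_{-i} \gamma_-) \xi_+ + \xi_+^{-1} \partial_{-i} \xi_+,$$

$$\lambda_{+i} = \xi_-^{-1} \partial_{+i} \xi_- + \xi_-^{-1} (\gamma_+^{-1} \partial_{+i} \gamma_+) \xi_- + \xi_-^{-1} c_{+i} \xi_-,$$

and the corresponding solutions of Eqs. (7.4) and (7.5) are given by (6.18) and (6.19).

Consider now the Lie group $GL(n, \mathbb{C})$ and the $\mathbb{Z}$-gradation of the Lie algebra $\mathfrak{gl}(n, \mathbb{C})$ discussed in Sect. 3. Parametrise the mappings $\gamma_\mp$ as

$$\gamma_\mp = \begin{pmatrix} \beta_{\mp 1} & 0 \\ 0 & \beta_{\mp 2} \end{pmatrix}.$$

The general form of the mappings $c_{\mp i}$ is

$$c_{-i} = \begin{pmatrix} 0 & 0 \\ X_{-i} & 0 \end{pmatrix}, \qquad c_{+i} = \begin{pmatrix} 0 & X_{+i} \\ 0 & 0 \end{pmatrix},$$

where the mappings $X_{-i}$ and $X_{+i}$ are arbitrary. The integrability conditions of Eqs. (6.14) have now the form

$$\partial_{-i} (\beta_{-2} X_{-j} \beta_{-1}^{-1}) - \partial_{-j} (\beta_{-2} X_{-i} \beta_{-1}^{-1}) = 0, \tag{7.7}$$

$$\partial_{+i} (\beta_{+1} X_{+j} \beta_{+2}^{-1}) - \partial_{+j} (\beta_{+1} X_{+i} \beta_{+2}^{-1}) = 0. \tag{7.8}$$

For the mappings

$$\lambda_{\mp i} = \begin{pmatrix} A_{\mp i} & B_{\mp i} \\ C_{\mp i} & D_{\mp i} \end{pmatrix}$$

we obtain

$$A_{-i} = \beta_{-1}^{-1} \partial_{-i} \beta_{-1} - (\xi_+)_{12} X_{-i},$$

$$B_{-i} = \beta_{-1}^{-1} \partial_{-i} \beta_{-1}\, (\xi_+)_{12} - (\xi_+)_{12}\, \beta_{-2}^{-1} \partial_{-i} \beta_{-2} - (\xi_+)_{12} X_{-i} (\xi_+)_{12} + \partial_{-i} (\xi_+)_{12},$$

$$C_{-i} = X_{-i}, \qquad D_{-i} = \beta_{-2}^{-1} \partial_{-i} \beta_{-2} + X_{-i} (\xi_+)_{12},$$

$$A_{+i} = \beta_{+1}^{-1} \partial_{+i} \beta_{+1} + X_{+i} (\xi_-)_{21}, \qquad B_{+i} = X_{+i},$$

$$C_{+i} = \beta_{+2}^{-1} \partial_{+i} \beta_{+2}\, (\xi_-)_{21} - (\xi_-)_{21}\, \beta_{+1}^{-1} \partial_{+i} \beta_{+1} - (\xi_-)_{21} X_{+i} (\xi_-)_{21} + \partial_{+i} (\xi_-)_{21},$$

$$D_{+i} = \beta_{+2}^{-1} \partial_{+i} \beta_{+2} - (\xi_-)_{21} X_{+i},$$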


where $(\xi_+)_{12}$ and $(\xi_-)_{21}$ are the nontrivial blocks of the mappings $\xi_+$ and $\xi_-$. In order to solve equations (7.4) and (7.5) one considers first equations (6.14). Next, one uses the Gauss decomposition (6.15) for finding the mappings $\nu_+^{-1}$ and $\nu_-$. In the case under consideration

$$\nu_+^{-1} = \begin{pmatrix} I_{n_1} & -(I_{n_1} - (\mu_+)_{12} (\mu_-)_{21})^{-1} (\mu_+)_{12} \\ 0 & I_{n_2} \end{pmatrix}, \qquad \nu_- = \begin{pmatrix} I_{n_1} & 0 \\ (\mu_-)_{21} (I_{n_1} - (\mu_+)_{12} (\mu_-)_{21})^{-1} & I_{n_2} \end{pmatrix}.$$

Finally, using (6.19) and (6.18) one arrives at the following expressions for the nontrivial blocks $(\psi_{>0})_{12} = U_-$ and $(\psi_{<0})_{21} = U_+$ of the mappings $\psi_{>0}$ and $\psi_{<0}$:

$$U_- = (\xi_+)_{12} - \beta_{-1}^{-1} (I_{n_1} - (\mu_+)_{12} (\mu_-)_{21})^{-1} (\mu_+)_{12}\, \beta_{-2},$$

$$U_+ = (\xi_-)_{21} + \beta_{+2}^{-1} (\mu_-)_{21} (I_{n_1} - (\mu_+)_{12} (\mu_-)_{21})^{-1} \beta_{+1}.$$

It is clear that the dependence of $U_-$ and $U_+$ on $z^{+i}$ and $z^{-i}$, respectively, is parametric, and the general solution of the equations can be written as

$$U_- = (\xi_+)_{12} - \beta_{-1}^{-1} (I_{n_1} - m_- (\mu_-)_{21})^{-1} m_-\, \beta_{-2}, \tag{7.9}$$

$$U_+ = (\xi_-)_{21} + \beta_{+2}^{-1} m_+ (I_{n_1} - (\mu_+)_{12} m_+)^{-1} \beta_{+1}, \tag{7.10}$$

where $m_-$ and $m_+$ are arbitrary constant matrices of dimensions $n_1 \times n_2$ and $n_2 \times n_1$ respectively.

We have said nothing yet about solving integrability conditions (7.7) and (7.8). In the general case the solution to these equations is not known. However, they can be solved in some particular cases. For example, let $n = d + 1$, $n_1 = d$ and $n_2 = 1$, and let the mappings $X_{\mp i}$ be defined by the relations

$$(X_{-i})_{1j} = \delta_{ij}, \qquad (X_{+i})_{j1} = \delta_{ij}.$$

In this case the general solution [22] of integrability conditions (7.7) and (7.8) is

$$(\beta_{-1}^{-1})_{ij} = F_- \partial_{-i} H_{-j}, \qquad \beta_{-2}^{-1} = F_-,$$

$$(\beta_{+1})_{ij} = F_+ \partial_{+j} H_{+i}, \qquad \beta_{+2} = F_+,$$

where $F_\mp$ and $H_{\mp i}$ are arbitrary functions depending on the coordinates $z^{\mp i}$. For the blocks $(\mu_-)_{21}$ and $(\mu_+)_{12}$ one has

$$(\mu_-)_{21} = H_-, \qquad (\mu_+)_{12} = H_+,$$

where $H_-$ and $H_+$ are $1 \times d$ and $d \times 1$ matrices formed by the functions $H_{-i}$ and $H_{+i}$ respectively. Now, using the evident notations, we can write expressions (7.9) and (7.10) as

$$U_{-i} = \xi_{+i} + \partial_{-i} \log(1 - H_- m_-), \qquad U_{+i} = \xi_{-i} - \partial_{+i} \log(1 - m_+ H_+).$$
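The closed form for $U_{-i}$ can be checked numerically in the simplest case $d = 1$ (so $n_1 = n_2 = 1$ and $X_{-} = 1$), where Eq. (7.5) becomes the scalar Riccati equation $dU/dz = B - AU + UD - UCU$ in the single variable $z = z^{-1}$, with $A$, $B$, $C$, $D$ as given in the text. The sketch below makes the arbitrary choices $H_-(z) = \sin z$, $F_-(z) = 1 + z^2$, $\xi_+(z) = z^2$; these functions, the step size and all names are assumptions for illustration only.

```python
import math


def check_residual(z, m=0.3, h=1e-5):
    """Residual of dU/dz = B - A U + U D - U C U for U = xi + d/dz log(1 - H m)."""
    # Illustrative choices (assumptions): H(z) = sin z, F(z) = 1 + z^2, xi(z) = z^2.
    d_h = math.cos(z)               # H'
    d2_h = -math.sin(z)             # H''
    f, d_f = 1 + z * z, 2 * z       # F and F'
    xi, d_xi = z * z, 2 * z         # xi and xi'

    # beta_1 = 1 / (F H') and beta_2 = 1 / F, so their logarithmic derivatives are:
    a_log = -d_f / f - d2_h / d_h   # beta_1^{-1} d(beta_1)/dz
    d_log = -d_f / f                # beta_2^{-1} d(beta_2)/dz

    # Coefficients of lambda_- for d = 1 and X = 1, following the formulas in the text.
    coef_a = a_log - xi
    coef_b = a_log * xi - xi * d_log - xi * xi + d_xi
    coef_c = 1.0
    coef_d = d_log + xi

    def u_closed(t):
        # xi(t) + d/dt log(1 - H(t) m) = t^2 - m cos(t) / (1 - m sin(t))
        return t * t - m * math.cos(t) / (1 - m * math.sin(t))

    du = (u_closed(z + h) - u_closed(z - h)) / (2 * h)   # central difference
    u = u_closed(z)
    return du - (coef_b - coef_a * u + u * coef_d - u * coef_c * u)
```

The residual is zero up to finite-difference error. The function $F_-$ drops out of the combination $D - A$ that controls the solution, which is consistent with its absence from the final formula.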

Acknowledgement. The authors are indebted to A. M. Bloch and A. K. Common who acquainted us with their studies related to the matrix ordinary differential Riccati equation. One of the authors (M. V. S.) is grateful to J.-L. Gervais for useful discussions; he also wishes to acknowledge the warm hospitality of the Instituto de Física Teórica, Universidade Estadual Paulista, São Paulo, Brazil, and the financial support from FAPESP during his stay there in March–July 1998. The research program of A. V. R. and M. V. S. is supported in part by the Russian Foundation for Basic Research under grant # 98–01–00015 and by INTAS grant # 96-690; and that of L. A. F., J. F. G. and A. H. Z. is partially supported by CNPq-Brazil.


References

1. Ablowitz, M.J., Beals, R., Tenenblat, K.: On the solution of the generalized sine–Gordon equations. Stud. Appl. Math. 74, 177–203 (1986)
2. Aminov, Yu.A.: Isometric immersions of domains of n-dimensional Lobachevsky space in (2n − 1)-dimensional Euclidean space. Math. USSR Sbornik 39, 359–386 (1981)
3. Aratyn, H., Ferreira, L.A., Gomes, J.F., Zimerman, A.H.: Kac–Moody construction of Toda type field theories. Phys. Lett. B 254, 372–380 (1991)
4. Berceanu, S., Gheorghe, A.: On equations of motion on compact Hermitian symmetric spaces. J. Math. Phys. 33, 998–1007 (1992)
5. Bloch, A.M.: Steepest descent, linear programming and Hamiltonian flows. Contemp. Math. 114, 77–88 (1990)
6. Bloch, A.M., Flaschka, H., Ratiu, T.: A convexity theorem for isospectral manifolds of Jacobi matrices in a compact Lie algebra. Duke Math. J. 61:1, 41–65 (1990)
7. Common, A.K., Hafez, S.T.: Continued-fraction solutions to the Riccati equation and integrable lattice systems. J. Phys. A 23, 455–466 (1990)
8. Common, A.K., Roberts, D.E.: Solutions of the Riccati equation and their relation to the Toda lattice. J. Phys. A 19, 1889–1898 (1986)
9. Cecotti, S., Vafa, C.: Topological-anti-topological fusion. Nucl. Phys. B 367, 359–461 (1991)
10. Cecotti, S., Vafa, C.: Ising model and N = 2 supersymmetric theories. Commun. Math. Phys. 157, 139–178 (1993); hep-th/9209085
11. Deumens, E., Weiner, B., Öhrn, Y.: Time-dependent variational principle on the group SO(2r). Nucl. Phys. A 466, 85–98 (1984)
12. Drager, L.D., Foote, R.L., Martin, C.F.: Controllability of linear systems, differential geometry of curves in Grassmannians, and Riccati equations. Contemp. Math. 86, 85–98 (1987)
13. Dubrovin, B.: Geometry and integrability of topological-antitopological fusion. Commun. Math. Phys. 152, 539–564 (1993); hep-th/9206037
14. Fehér, L., O'Raifeartaigh, L., Ruelle, P., Tsutsui, I., Wipf, A.: On Hamiltonian reductions of the Wess–Zumino–Novikov–Witten theories. Phys. Rep. 222, 1–64 (1992)
15. Ferreira, L.A., Gomes, J.F., Schwimmer, A., Zimerman, A.H.: Comments on two-loop Kac–Moody algebras. Phys. Lett. B 274, 65–71 (1992); hep-th/9110032
16. Gervais, J.-L., Matsuo, Y.: Classical $A_n$-W-geometry. Commun. Math. Phys. 152, 317–368 (1993); hep-th/9201026
17. Gervais, J.-L., O'Raifeartaigh, L., Razumov, A.V., Saveliev, M.V.: Gauge conditions for the constrained WZNW–Toda reductions. Phys. Lett. B 301, 41–48 (1993); hep-th/9211088
18. Gervais, J.-L., Saveliev, M.V.: Higher grading generalisations of the Toda systems. Nucl. Phys. B 453, 449–476 (1995); hep-th/9505047
19. Golubchik, I.Z., Sokolov, V.V.: On some generalizations of factorization method. Theor. Math. Phys. 110, 267–276 (1997)
20. Harnad, J., Winternitz, P., Anderson, R.L.: Superposition principles for matrix Riccati equations. J. Math. Phys. 24, 1062–1072 (1983)
21. Razumov, A.V.: Frenet frames and Toda systems. Preprint MPI 98–29, Bonn, 1998
22. Razumov, A.V., Saveliev, M.V.: Multi-dimensional Toda-type systems. Theor. Math. Phys. 112, 999–1022 (1997); hep-th/9609031
23. Razumov, A.V., Saveliev, M.V.: Maximally non-abelian Toda systems. Nucl. Phys. B 494, 657–686 (1997); hep-th/9612081
24. Razumov, A.V., Saveliev, M.V.: Lie algebras, geometry, and Toda-type systems. Cambridge: Cambridge University Press, 1997
25. Redheffer, R.M.: On solutions of Riccati's equation as functions of the initial values. J. Rat. Mech. Anal. 5:5, 835–848 (1956)
26. Redheffer, R.M.: The Riccati equation. Initial values and inequalities. Math. Ann. 133:3, 235–250 (1957)
27. Reid, W.T.: Solution of a Riccati matrix differential equation as function of initial values. J. Math. Mech. 8:2, 221–230 (1959)
28. Schneider, C.R.: Global aspects of the matrix Riccati equation. Math. Systems Theory 7, 281–293 (1973)
29. Shayman, M.A.: Geometry of the algebraic Riccati equation. I, II. SIAM J. Control Optim. 24, 379–409 (1983)
30. Shnider, S., Winternitz, P.: Classification of systems of nonlinear ordinary differential equations with superposition principles. J. Math. Phys. 25, 3155–3165 (1984)
31. Tenenblat, K., Terng, C.-L.: Bäcklund theorem for n-dimensional submanifolds of $R^{2n-1}$. Ann. Math. 111, 477–490 (1980)
32. Terng, C.-L.: A higher dimensional generalization of the Sine–Gordon equation and its soliton theory. Ann. Math. 111, 491–510 (1980)
33. Zakhar-Itkin, M.Kh.: The matrix Riccati differential equation and the semi-group of linear fractional transformations. Russ. Math. Surv. 28:3, 89–131 (1973)

Communicated by T. Miwa

Commun. Math. Phys. 203, 667 – 706 (1999)


Sharp Entropy Dissipation Bounds and Explicit Rate of Trend to Equilibrium for the Spatially Homogeneous Boltzmann Equation

G. Toscani¹, C. Villani²

¹ Department of Mathematics, University of Pavia, via Abbiategrasso 209, 27100 Pavia, Italy.

E-mail: [email protected]

2 École Normale Supérieure, DMI, 45 Rue d’Ulm, 75230 Paris Cedex 05, France.

E-mail: [email protected]

Received: 24 June 1998 / Accepted: 23 December 1998

Abstract: We derive a new lower bound for the entropy dissipation associated with the spatially homogeneous Boltzmann equation. This bound is expressed in terms of the relative entropy with respect to the equilibrium, and thus yields a differential inequality which proves convergence towards equilibrium in relative entropy, with an explicit rate. Our result gives a considerable refinement of the analogous estimate by Carlen and Carvalho [9,10], under very few additional assumptions. Our proof takes advantage of the structure of Boltzmann's collision operator with respect to the tensor product, and its links with Fokker–Planck and Landau equations. Several variants are discussed.

Contents

1. Introduction
2. Preliminaries: Fokker–Planck and Landau Equations
3. Symmetries for Boltzmann and Fokker–Planck Equations
4. Integral Representation of a Lower Bound for D
5. Main Result
6. Extension to Other Kernels
7. The Kac Model
8. Remarks About Fisher Information and Entropy Dissipation

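1. Introduction

This paper deals with the spatially homogeneous Boltzmann equation,

$$\begin{cases} \dfrac{\partial f}{\partial t}(t, v) = Q(f, f), & t \ge 0, \ v \in \mathbb{R}^N \ (N \ge 2), \\[4pt] f(0, \cdot) = f_0, & f_0 \ge 0, \ f_0 \in L^1(\mathbb{R}^N). \end{cases} \tag{1}$$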


The unknown $f$ stands for the probability density of particles in the velocity space. $Q$ is the so-called Boltzmann collision operator,

$$Q(f, f) = \int_{\mathbb{R}^N} dv_* \int_{S^{N-1}} d\omega \, B(v - v_*, \omega) \, \bigl(f' f'_* - f f_*\bigr), \tag{2}$$

where $d\omega$ is the normalized measure on $S^{N-1}$, $f' = f(v')$, and so on, and

$$\begin{cases} v' = v - (v - v_*, \omega)\,\omega, \\ v'_* = v_* + (v - v_*, \omega)\,\omega \end{cases} \tag{3}$$

are the postcollisional velocities of two particles that collide with respective velocities $v$ and $v_*$, according to the laws of elastic collision

$$\begin{cases} v' + v'_* = v + v_*, \\ |v'|^2 + |v'_*|^2 = |v|^2 + |v_*|^2. \end{cases} \tag{4}$$

(We denote by $(a, b) = a \cdot b$ the scalar product in $\mathbb{R}^N$.) On physical grounds, it is assumed that the nonnegative kernel $B(z, \omega)$ (the "cross section") depends only upon $|z|$ and $(z/|z|, \omega)$. Typical examples are the (three-dimensional) hard spheres collision kernel

$$B_{HS}(z, \omega) = |z \cdot \omega|, \tag{5}$$

or more generally the kernels associated to the so-called hard potentials with cut-off,

$$B_{HP}(z, \omega) = |z|^{\gamma} \, b(\alpha), \tag{6}$$

where $0 < \gamma \le 1$, $\alpha \in [0, \pi]$ is the angle between $z$ and $\omega$, and $b \in L^1(0, \pi)$. For $\gamma < 0$, we speak of soft potentials with cut-off; for $\gamma = 0$, we speak of Maxwellian potential with cut-off. More generally, if $B$ depends only on $(z/|z|, \omega)$, we speak of Maxwellian potential. We refer to [42,16] for a detailed discussion of other models.

The Boltzmann equation is one of the most popular models in nonequilibrium statistical physics. Soon after its introduction by Maxwell [33], Boltzmann deduced from it the celebrated $H$-theorem, namely that the entropy

$$H(f) = \int_{\mathbb{R}^N} f \log f \tag{7}$$

of any solution $f$ to (1) is nonincreasing with time. From this fact he gave plausible arguments for these solutions to converge towards a definite equilibrium state as $t$ goes to infinity (see [7] for instance). Let us recall them briefly. Let $\varphi(v)$ be any function of the velocity variable. Multiplying (1) by $\varphi$ and integrating, we obtain

$$\int Q(f, f)\,\varphi = \int_{\mathbb{R}^{2N}} dv \, dv_* \int_{S^{N-1}} d\omega \, B \, \bigl(f' f'_* - f f_*\bigr)\,\varphi, \tag{8}$$

where for simplicity we omit the arguments of $B$. For fixed $\omega$, the transformation

$$T_\omega : (v, v_*) \longmapsto (v', v'_*) \tag{9}$$


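is involutive and has unit Jacobian. Using this change of variables, and also the transformation

$$R : (v, v_*) \longmapsto (v_*, v), \tag{10}$$

we easily obtain

$$-\int Q(f, f)\,\varphi = \frac{1}{4} \int dv \, dv_* \, d\omega \, B \, \bigl(f' f'_* - f f_*\bigr) \bigl(\varphi' + \varphi'_* - \varphi - \varphi_*\bigr). \tag{11}$$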
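As a numerical sanity check of the change of variables used above, the following sketch (not part of the original argument) implements the collision rule with the sign convention $v'_* = v_* + (v - v_*, \omega)\omega$, which is the one compatible with the conservation laws (4). It then verifies conservation of momentum and energy and the involutivity of $T_\omega$ on one randomly drawn configuration. The helper names `collide` and `random_unit_vector`, the dimension and the sampled velocities are illustrative assumptions.

```python
import math
import random


def dot(a, b):
    """Euclidean scalar product (a, b)."""
    return sum(x * y for x, y in zip(a, b))


def collide(v, v_star, omega):
    """Omega-representation: v' = v - (v - v*, w) w,  v*' = v* + (v - v*, w) w."""
    s = dot([a - b for a, b in zip(v, v_star)], omega)
    v_prime = [a - s * w for a, w in zip(v, omega)]
    v_star_prime = [b + s * w for b, w in zip(v_star, omega)]
    return v_prime, v_star_prime


def random_unit_vector(n, rng):
    """Direction on the unit sphere, drawn by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(g, g))
    return [x / norm for x in g]


rng = random.Random(0)          # fixed seed, purely for reproducibility
N = 3                           # hypothetical dimension
v = [rng.uniform(-1.0, 1.0) for _ in range(N)]
v_star = [rng.uniform(-1.0, 1.0) for _ in range(N)]
omega = random_unit_vector(N, rng)

vp, vsp = collide(v, v_star, omega)

# Conservation of momentum and kinetic energy, as in (4).
momentum_gap = max(abs(a + b - c - d) for a, b, c, d in zip(vp, vsp, v, v_star))
energy_gap = abs(dot(vp, vp) + dot(vsp, vsp) - dot(v, v) - dot(v_star, v_star))

# Involutivity: applying the map twice with the same omega returns (v, v*).
vpp, vspp = collide(vp, vsp, omega)
involution_gap = max(abs(a - b) for a, b in zip(vpp + vspp, v + v_star))
```

All three gaps should vanish up to floating-point rounding; the unit-Jacobian property is not checked here.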

Choosing $\varphi = 1$, $v_i$ ($1 \le i \le N$), $|v|^2$, by (4) we deduce that (11) vanishes, and hence that the quantities $\rho > 0$ (mass), $u \in \mathbb{R}^N$ (mean velocity), $T > 0$ (temperature) defined by

$$\rho = \int f(v) \, dv, \qquad \rho u = \int f(v)\, v \, dv, \qquad \rho\,(|u|^2 + N T) = \int f(v)\, |v|^2 \, dv \tag{12}$$

are preserved with time. Now, choosing $\varphi(v) = \log f(v)$ in (11), we get

$$-\int Q(f, f) \log f = D(f), \tag{13}$$

where $D(f)$ is the entropy dissipation,

$$D(f) = \frac{1}{4} \int dv \, dv_* \, d\omega \, B(v - v_*, \omega) \, \bigl(f' f'_* - f f_*\bigr) \log \frac{f' f'_*}{f f_*}. \tag{14}$$

Since $(x, y) \mapsto (x - y) \log(x/y)$ is a nonnegative function, so is $D(f)$. Moreover, at least if $B > 0$ a.e., $D(f)$ vanishes if and only if for almost all $v, v_*, \omega$,

$$f' f'_* = f f_*. \tag{15}$$

Boltzmann proved that if $f$ is smooth, Eq. (15) implies (with the notations of (12))

$$f(v) = \frac{\rho}{(2\pi T)^{N/2}} \, e^{-\frac{|v - u|^2}{2T}} \equiv M^f(v). \tag{16}$$

Such distributions are called Maxwellian. The assumption of smoothness was later proved to be unessential (see [17] and the references therein; see also the proof of Perthame [36] relying on Fourier transform). Lions also gave a direct proof [31] that (15) implies $f \in C^\infty(\mathbb{R}^N)$ as soon as $f \in L^1$. As a conclusion, if $f(t)$ is a solution of (1), then $H(f(t))$ is strictly decreasing with time unless $f = M^f$, and this strongly suggests that

$$f(t) \xrightarrow[t \to \infty]{} M^f \tag{17}$$

in some sense. By the way, the fact that $M^f$ is the only minimizer of $H$ in the class of functions satisfying (12) is a direct consequence of the identity

$$H(f) - H(M^f) = \int_{\mathbb{R}^N} M^f \left[ 1 - \frac{f}{M^f} + \frac{f}{M^f} \log \frac{f}{M^f} \right]$$

and the positivity of $x \mapsto 1 - x + x \log x$.


Once these formal arguments have been cast, it is very difficult to go further and to prove that (17) actually holds, say in the sense of the topology induced by the $L^1$ norm. Results have been obtained by several authors, in particular Arkeryd [3] and Wennberg [49], for hard (or Maxwellian) potentials with cut-off. More precisely, they prove

$$\|f(t) - M^f\| \le C(f_0)\, e^{-\lambda t} \tag{18}$$

for various weighted $L^p$ norms ($p \ge 1$), with a (nonexplicit) constant $C(f_0)$ depending on the initial datum. The rate $\lambda$ is obtained by a compactness argument, and therefore completely unknown. These results rely on the study of the linearized Boltzmann equation and the rate is not given by the entropy dissipation. But, as pointed out by Carlen and Carvalho [9], such quantities as this rate $\lambda$, that are not explicitly computable, may be completely irrelevant from the physical point of view (as is Poincaré's recurrence theorem for statistical physics).

Another approach is to try to transform the assertion

$$D(f) = 0 \iff f = M^f$$

into a quantitative result of the form

$$D(f) \ge d(f, M^f),$$

where $d$ is some suitable metric on a space of functions which is stable under the action of the Boltzmann equation (for instance, $L^1_2(\mathbb{R}^N)$, the space of all $L^1$ functions with finite moments of order 2). Desvillettes [19] was the first to obtain a result in this direction. His lower bound reads

$$D(f) \ge K_R \inf_{M \in \mathcal{M}} \int_{|v| \le R} dv \, |\log f(v) - \log M(v)|, \qquad \forall R > 0,$$

where $\mathcal{M}$ is the set of all Maxwellian distributions, and $f$ is assumed to be bounded below by a fixed Maxwellian distribution. His proof gives no indication on the way the constant $K_R$ (obtained by the use of the open mapping theorem) depends on $R$. This result, proven for a certain class of kernels, was extended by Wennberg [48] to cover the physically realistic cases. Variants have also been derived for other kinetic equations. Another bound below was obtained by Gabetta and Toscani [24] in the case of Kac's model, which is a one-dimensional caricature of Boltzmann's equation (see Sect. 7). Their bound reads

$$D(f) \ge \theta_f \, \bigl[ I(f) - I(M^f) \bigr],$$

where $\theta_f$ is a constant depending on $f$ in a complicated way, and

$$I(f) = \int \frac{|\nabla f|^2}{f} = 4 \int |\nabla \sqrt{f}|^2 \tag{19}$$

is the so-called Fisher information of $f$. Again, the proof does not yield any indication on the way $\theta_f$ varies with $f$, and hence the result cannot be exploited for the trend towards equilibrium.


Some years ago, Carlen and Carvalho [9] were able to derive the first inequality of the form

$$D(f) \ge \Phi\bigl(H(f|M^f)\bigr), \tag{20}$$

where

$$H(f|M^f) = H(f) - H(M^f) \tag{21}$$

is the relative entropy of $f$ with respect to $M^f$, and $\Phi$ is a nonnegative function, strictly increasing from 0 (but very slowly). $\Phi$ depends on a few qualitative properties of $f_0$ (essentially, the existence of a finite moment of order higher than 2). Though $\Phi$ is implicitly defined, its construction is entirely explicit, and in particular for all $\varepsilon > 0$ one can compute $\eta > 0$ such that $\Phi(\eta) \ge \varepsilon$. The result holds for kernels that are bounded below, in the sense $B(z, \omega) \ge \nu > 0$. In a second paper [10], Carlen and Carvalho adapted their analysis to the case of the hard spheres potential, under some $L^\infty$-type assumptions on $f$. As an immediate consequence of (13) and (21), their result implies

$$-\frac{d}{dt} H(f|M^f) \ge \Phi\bigl(H(f|M^f)\bigr) \tag{22}$$

and this is enough to conclude that

$$H(f) \xrightarrow[t \to \infty]{} H(M^f). \tag{23}$$

But, by the Csiszár–Kullback inequality [18,29],

$$\|f - M^f\|_{L^1} \le \sqrt{2 H(f|M^f)}, \tag{24}$$

and therefore (23) implies at once that $f(t)$ goes towards $M^f$ in $L^1$ norm. Moreover, given any number $\varepsilon > 0$ and an initial datum $f_0$, one can compute explicitly a time $T_\varepsilon(f_0)$ such that $\|f(t) - M^f\|_{L^1} \le \varepsilon$ for $t \ge T_\varepsilon(f_0)$.

Several applications of this result have been given: in particular a rigorous hydrodynamic limit in the large for a model equation in plasma physics [11] and a proof of trend to equilibrium in the weak sense for the Boltzmann equation for hard spheres, in the case when $H(f_0) = \infty$ [1]. Unfortunately, the function $\Phi$ given by Carlen and Carvalho is very intricate (see [10], p. 754), which makes explicit computations rather difficult, even when moments of high order are finite. Moreover, it is not a priori clear that $\Phi$ possesses positive derivatives (even of high order) near the origin, and hence the rate of return to equilibrium predicted by (22) is very slow. It is therefore natural to ask whether the inequality (22) can be found to hold for a simple function $\Phi$ that does not grow too slowly (ideally, linearly) near the origin. In fact, in an older paper, Cercignani [15] conjectured that

$$D(f) \ge \lambda(f_0)\, H(f|M^f) \tag{25}$$

for some $\lambda(f_0) > 0$ depending on the initial datum. Inequality (25) would imply an exponential decay towards equilibrium. Bobylev [5] proved that this conjecture cannot


hold for Maxwellian molecules if $\lambda(f_0)$ depends only on the mass, momentum and energy of $f_0$. Indeed, he exhibited a family of initial data with the same moments up to order 2, for which the trend to equilibrium can be as slow as desired. Wennberg [51] arrived at the same conclusion in the case of hard potentials (6) with $0 < \gamma \le 1$, by a direct study of $D(f)$. Finally, in a very recent work [6], Bobylev and Cercignani proved the inequality (25) to be false, for all realistic potentials, even for functions that have an arbitrarily high number of moments close to the equilibrium value, and are very smooth and bounded below by a given Maxwellian. They conjecture that the only reasonable spaces of functions in which (25) may hold would be $L^p$ spaces with an inverse Maxwellian weight. A good theory of existence in such spaces is very far out of reach at the moment.

In this paper, we shall derive a new bound of the form (20), with a function $\Phi$ which is at the same time much simpler and increasing faster near the origin. Namely

$$D(f) \ge C_\varepsilon(f)\, H(f|M^f)^{1+\varepsilon}, \tag{26}$$

where $C_\varepsilon$ depends only on the cross-section, the quantities (12), and

$$\|f\|_{L^1_r \log L} \equiv \int_{\mathbb{R}^N} dv \, f(v)\, |\log f(v)|\, (1 + |v|^2)^{r/2}, \tag{27}$$

$$\|f\|_{L^1_s} \equiv \int_{\mathbb{R}^N} f(v)\, (1 + |v|^2)^{s/2} \, dv \tag{28}$$

for some $r(\varepsilon) > 2$, $s(\varepsilon) > 4$ that we shall compute, and $K$, $A$ such that

$$\forall v \in \mathbb{R}^N, \qquad f(v) \ge K e^{-A|v|^2}. \tag{29}$$

We note that the result by Carlen and Carvalho is slightly more general than ours, since in its most general version it does not require bounds in $L^1_r \log L \cap L^1_s$ but only in $L \log L \cap L^1_s$. Our conclusion holds for all kernels $B$ such that

$$B(z, \omega) \ge \psi(|z|) \left( \frac{|z \cdot \omega|}{|z|} \right)^{N-2} \tag{30}$$

for some smooth function $\psi > 0$ decaying at most algebraically at infinity. Consider for instance the case when

$$\psi(|z|) = (1 + |z|)^{\gamma}, \qquad 0 < \gamma \le 1.$$

In this framework, Gustafsson [28] proved uniform (in time) boundedness of the solution to (1) in weighted $L^p$ spaces ($p > 1$), from which uniform bounds in $L^1_s \log L$ are easily extracted. In addition, it is known that one can choose fixed $K$ and $A$, depending only on the mass, energy and entropy of $f_0$, such that (29) holds for $f(t, \cdot)$ as soon as $t \ge t_0 > 0$ (see [37]). Consequently, our result implies that for this family of kernels, solutions to the Boltzmann equation decay towards the equilibrium in $L^1$ norm like $t^{-1/\varepsilon}$, for all $\varepsilon > 0$, as soon as $f_0 \in \bigcap_{s > 0} L^p_s$. In fact, it would be very likely that the norms of $f$ in $L^1_s \log L$ become finite for all positive time as soon as the initial entropy is finite, as it happens for the moments [51].
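To make the algebraic rate concrete, here is a minimal sketch (an illustration, not taken from the paper) of how a bound of the form (26) with a time-independent constant turns into decay. Combined with (13), it gives the differential inequality $-\frac{d}{dt}H \ge C H^{1+\varepsilon}$, whose extremal solution is $H(t) = (H_0^{-\varepsilon} + \varepsilon C t)^{-1/\varepsilon}$, and (24) then bounds the $L^1$ distance by $\sqrt{2H(t)}$. The numerical values of `h0`, `c` and `eps` and the function names are hypothetical; the fourth-order Runge–Kutta integrator is only there to cross-check the closed form.

```python
import math


def entropy_bound(t, h0, c, eps):
    """Closed-form solution of y' = -c * y**(1 + eps), y(0) = h0."""
    return (h0 ** (-eps) + eps * c * t) ** (-1.0 / eps)


def integrate_rk4(h0, c, eps, t_end, steps):
    """Classical RK4 for the same ODE, used only as an independent check."""
    def rhs(y):
        return -c * y ** (1.0 + eps)

    y = h0
    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2)
        k4 = rhs(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


# Hypothetical values: initial relative entropy, constant in (26), exponent.
h0, c, eps = 1.0, 0.5, 0.1
t_end = 50.0

exact = entropy_bound(t_end, h0, c, eps)
numeric = integrate_rk4(h0, c, eps, t_end, 5000)

# Csiszar-Kullback: ||f - M||_{L^1} <= sqrt(2 H(f|M)).
l1_bound = math.sqrt(2.0 * exact)
```

For large $t$ the closed form behaves like $(\varepsilon C t)^{-1/\varepsilon}$, which is the $t^{-1/\varepsilon}$ rate quoted above for the relative entropy.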


We emphasize that the bound (26) is optimal, in the sense that, under our assumptions, the result simply does not hold for $\varepsilon = 0$, as shown by the previous discussion of Cercignani's conjecture. In the case where condition (30) fails to be true, we do not recover such a strong result as (26). Yet if the set of $(z, \omega)$ such that (30) is violated is of small measure, it is easy to adapt our method and get an explicit algebraic bound of the form $D(f) \ge C H(f|M)^{\alpha}$. In particular, this is true for all hard potentials (see Sect. 6). To our knowledge, this is the first result of algebraic decay with explicit bounds available for the Boltzmann equation.

But we wish to point out that the inequality (26) is more interesting than just a statement concerning solutions of the spatially homogeneous Boltzmann equation: it is a general functional inequality, that could be applied in any context, in particular the spatially inhomogeneous Boltzmann equation, if suitable estimates on the solutions were known. In addition, our work suggests simple obstructions that prevent (25) from holding, linked to the tails of the distribution $f$, that are generally found to be the most severe obstacle in rigorous proofs of decay to equilibrium (cf. [5] for instance). Also, the comparison of our results to the ones obtained by Desvillettes and the second author in [20,21] is a clear illustration of the physical fact that the tails of distribution may be an obstacle to the trend to equilibrium in the case of the Boltzmann equation (which is a kind of jump process), but not in the case of related diffusion-type equations.

Our approach is based upon several tools that have been known for a more or less long time in the context of the Boltzmann equation. The first one is the regularization by the so-called adjoint Ornstein-Uhlenbeck semigroup, i.e. the semigroup $(S_t)$ generated by the linear Fokker–Planck equation,

$$\partial_t f = \nabla \cdot \bigl(T \nabla f + f\,(v - u)\bigr), \qquad u \in \mathbb{R}^N, \ T > 0. \tag{31}$$

The class of distribution functions satisfying (12) is invariant by this semigroup. Moreover, solutions of (31) are smooth for all positive time, and converge towards $M^f$. Therefore, this semigroup gives a very convenient interpolation between $f$ and $M^f$. It plays a central role in the proof of Carlen and Carvalho, and also in our analysis.

Our second tool is the use of the tensorial structure of the Boltzmann equation, and in particular the fact that the entropy dissipation (14) is a convex function of the tensor product $f f_*$. In fact, most of our study will take place in $\mathbb{R}^{2N}$, and it is only in the end that we shall go back to the one-variable $N$-dimensional space.

Our third tool, going back at least to Boltzmann, and systematically used by Desvillettes [19], consists in the introduction of linear operators that "kill" functions with some symmetries. Typical examples are

$$\bigl[(v - v_*) \wedge (\nabla - \nabla_*)\bigr]\, U\!\left(v + v_*, \frac{|v|^2 + |v_*|^2}{2}\right) = 0, \tag{32}$$

used by Boltzmann to prove that smooth solutions of (15) are Maxwellian (cf. [19]), or

$$\left(\frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial x^2}\right) \left[\varphi\!\left(\frac{x - y}{\sqrt{2}}\right) + \varphi\!\left(\frac{x + y}{\sqrt{2}}\right)\right] = 0, \tag{33}$$

arising in the context of rescaled convolution (see for instance the work by Carlen [8] about the cases of equality in the logarithmic Sobolev inequality).


The last main ingredient of our proof is the so-called Landau collision operator,

$$Q_L(f, f) = \frac{\partial}{\partial v_i} \int dv_* \, a_{ij}(v - v_*) \left[ f_* \frac{\partial f}{\partial v_j} - f \frac{\partial f_*}{\partial v_{*j}} \right], \tag{34}$$

where the symmetric matrix function $a_{ij}$ is defined by

$$a_{ij}(z) = \left( \delta_{ij} - \frac{z_i z_j}{|z|^2} \right) \Psi(|z|) \tag{35}$$

for some nonnegative function $\Psi$. Here we have adopted the convention of Einstein for implicit summation over repeated indices. The collision operator (34) bears much resemblance with (2), and is in fact obtained from it by a suitable asymptotic regime (see [43] and the references therein).

None of these tools is new; but the main feature of our study is the way they are combined all together. For the sake of completeness, we shall recall in the next section all the material concerning Eqs. (31) and (34) that will be needed in the sequel.

Let us end this introduction with some comments on entropy dissipation methods. These have proved to be very robust and apply to various contexts, in particular diffusion-type equations ([39,2]), where sharp rates of exponential decay towards equilibrium in relative entropy have been derived for linear and weakly nonlinear models. In the context of the Boltzmann equation, one of their essential features is monotonicity: namely, if $Q_1$ and $Q_2$ are two Boltzmann operators of the form (2), with cross sections $B_1 \ge B_2$, then, in view of (14), $D_1(f) \ge D_2(f)$. The advantage of this property was noted by Carlen and Carvalho: in view of it, all the results that are obtained in an algebraically simplified framework for a particular cross section $B_2$ automatically extend to all cross sections $B_1 \ge B_2$.

Carlen, Gabetta and Toscani [12] recently proved exponential convergence to equilibrium with an explicit rate for the Boltzmann equation with Maxwellian molecules. The rate depends on both the tails and the smoothness of the solution, and is essentially optimal. But their method relies on Fourier analysis, and hence seems very difficult to extend to other kernels. There, the proof makes use of a Lyapunov functional (distance) introduced in [26], defined in terms of the Fourier transform, which plays the role of the classical entropy. We note that the monotonicity properties of this entropy functional have been recently used to give a proof of uniqueness of the solution to (1) for true Maxwell molecules [41].

The organization of the paper is as follows. Section 2 is devoted to some preliminary material concerning the linear Fokker–Planck and the Landau equation. In Sect. 3, we study some symmetry properties enjoyed by the Boltzmann and the Fokker–Planck equation. In Sect. 4, we establish an integral representation of a lower bound for $D(f)$, based on regularization by $(S_t)$. In Sect. 5 we state and prove our main result, namely the lower bound (26). In Sect. 6, we show how to extend our results to various models, including the hard potentials with cut-off. In Sect. 7, we treat Kac's model as a variant. In Sect. 8, we make some remarks about the procedure of regularization of $D$ by the semigroup $(S_t)$, and show how it can be linked to the decay of the Fisher information along the solutions to the Boltzmann equation.


2. Preliminaries: Fokker–Planck and Landau Equations

From now on, unless otherwise stated, we shall consider only nonnegative distribution functions $f$ satisfying the normalization

$$\rho = 1, \qquad u = 0, \qquad T = 1, \tag{36}$$

where $\rho$, $u$ and $T$ are defined by (12). This class of functions is invariant by all the equations that we shall consider: Boltzmann, Fokker–Planck, Landau and Kac. Moreover, we shall denote by $M$ the corresponding Maxwellian distribution,

$$M(v) = \frac{1}{(2\pi)^{N/2}} \, e^{-\frac{|v|^2}{2}}. \tag{37}$$

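Accordingly,

$$H(f|M) = \int_{\mathbb{R}^N} \frac{f}{M} \log \frac{f}{M} \, M = H(f) - H(M), \tag{38}$$

$$I(f|M) = 4 \int_{\mathbb{R}^N} \left| \left( \nabla + \frac{v}{2} \right) \sqrt{f} \right|^2 = I(f) - I(M), \tag{39}$$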

all these expressions being well-defined in $[0, \infty]$. Another form of (39) is $\int |\nabla \log(f/M)|^2 f$, at least when some smoothness is available for $f$.

Let us recall the basic properties of $H(\cdot|M)$ and $I(\cdot|M)$. We refer to [13,14] and the references therein for complete proofs of the assertions below, and more material about the Fokker–Planck equation. Both the relative entropy and the relative Fisher information are strictly convex, weakly lower semicontinuous (for the $L^1$ topology for instance) functionals. For the relative entropy, this can be seen directly by the Legendre-type representation [22,13]

$$H(f|M) = \sup \left\{ \int f \varphi - \log \int e^{\varphi} M \right\}, \tag{40}$$

where the supremum is taken for instance over all continuous bounded functions $\varphi$.

The basic link between $H(\cdot|M)$ and $I(\cdot|M)$ is given by the action of the adjoint Ornstein-Uhlenbeck semigroup, $(S_t)_{t \ge 0}$, which can be defined as the semigroup associated to the Fokker–Planck equation. With the conventions (36), the Fokker–Planck equation simply reads

$$\partial_t f = \nabla \cdot (\nabla f + f v) \equiv L f. \tag{41}$$

The explicit solution of this equation is well-known, and thus

$$S_t f = f_{e^{-2t}} * M_{1 - e^{-2t}}, \tag{42}$$

where we use the notation

$$g_\lambda(v) = \frac{1}{\lambda^{N/2}} \, g\!\left( \frac{v}{\sqrt{\lambda}} \right). \tag{43}$$

Note that $M$ is invariant by the action of this semigroup.


It is well known that $t \mapsto S_t f$ is continuous for the strong $L^1$ topology. Moreover, assuming that $H(f) < \infty$, the function $t \mapsto H(S_t f|M)$ belongs to $C([0, \infty)) \cap C^1(0, \infty)$, and for all $t > 0$,

$$\frac{d}{dt} H(S_t f|M) = -I(S_t f|M). \tag{44}$$

In particular, $H(S_t f|M)$ is decreasing with $t$. What is less known is the fact that $I(S_t f|M)$ is also decreasing. The rate of decay has been found in [39]:

$$I(S_t f|M) \le I(f|M)\, e^{-2t}. \tag{45}$$

This inequality can also be considered as a direct consequence of the so-called Blachman–Stam inequality (cf. [13]), and follows as well from the important work of Bakry and Emery, based on the so-called $\Gamma_2$ calculus [4]. We shall use this decay property in the study of the entropy production for the Kac equation. Moreover, $S_t f \to M$ in relative entropy as $t \to \infty$. More precisely [39],

$$H(S_t f|M) \le H(f|M)\, e^{-2t}. \tag{46}$$

As a consequence,

$$H(f|M) = \int_0^\infty dt \, I(S_t f|M). \tag{47}$$
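The relations (44)–(47) can be checked in closed form on a simple family. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $N = 1$ and a centred Gaussian $f$ of variance $\sigma_0^2$ (unit mass and zero mean, but not the normalization $T = 1$ unless $\sigma_0^2 = 1$). By (42), $S_t f$ is then the centred Gaussian of variance $e^{-2t}\sigma_0^2 + 1 - e^{-2t}$, and a direct computation gives $H(f|M) = \tfrac12(\sigma^2 - 1 - \log \sigma^2)$ and $I(f|M) = (\sigma^2 - 1)^2/\sigma^2$. The function names and the value `sigma0_sq = 4.0` are hypothetical.

```python
import math


def variance(t, sigma0_sq):
    """Variance of S_t f when f is the centred Gaussian of variance sigma0_sq."""
    return math.exp(-2.0 * t) * sigma0_sq + (1.0 - math.exp(-2.0 * t))


def rel_entropy(sig_sq):
    """H(f|M) for f = N(0, sig_sq) and M = N(0, 1), in dimension one."""
    return 0.5 * (sig_sq - 1.0 - math.log(sig_sq))


def rel_fisher(sig_sq):
    """I(f|M) = int f |d/dv log(f/M)|^2 dv for the same pair."""
    return (sig_sq - 1.0) ** 2 / sig_sq


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h


sigma0_sq = 4.0                      # hypothetical initial variance
times = [0.1 * k for k in range(51)]
H_vals = [rel_entropy(variance(t, sigma0_sq)) for t in times]
I_vals = [rel_fisher(variance(t, sigma0_sq)) for t in times]

# (45) and (46): both functionals decay at least like exp(-2t).
decay_ok = all(
    h <= H_vals[0] * math.exp(-2.0 * t) + 1e-12
    and i <= I_vals[0] * math.exp(-2.0 * t) + 1e-12
    for t, h, i in zip(times, H_vals, I_vals)
)

# (44): centred finite difference of t -> H(S_t f|M) at t = 1.
dt = 1e-5
dH_dt = (rel_entropy(variance(1.0 + dt, sigma0_sq))
         - rel_entropy(variance(1.0 - dt, sigma0_sq))) / (2.0 * dt)
minus_I = -rel_fisher(variance(1.0, sigma0_sq))

# (47): time integral of the relative Fisher information, truncated at t = 20.
integral = trapezoid(lambda t: rel_fisher(variance(t, sigma0_sq)), 0.0, 20.0, 20000)
```

Here `integral` should agree with `H_vals[0]` up to quadrature error, and `dH_dt` with `minus_I` up to finite-difference error.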

We shall also use the fact that the Fokker–Planck equation propagates moments. Namely, for any $s > 0$ and any datum $f \in L^1$, for all $t > 0$,

$$\|S_t f\|_{L^1_s} \le \max(1, 2^{s-1}) \Bigl[ e^{-st} \|f\|_{L^1_s} + (1 - e^{-2t})^{s/2} \|M\|_{L^1_s} \Bigr]. \tag{48}$$

This can be easily seen by remarking that (42) is nothing more than the probability density of the random variable

$$X_t = e^{-t} X + (1 - e^{-2t})^{1/2}\, W,$$

where $X$ has density $f$, and $W$ is a normalized Gaussian variable independent of $X$. Consequently,

$$\|S_t f\|_{L^1_s} = E\bigl[ (1 + X_t^2)^{s/2} \bigr],$$

$E$ denoting mathematical expectation. On the other hand,

$$\Bigl[ 1 + \bigl( e^{-t} X + (1 - e^{-2t})^{1/2} W \bigr)^2 \Bigr]^{s/2} \le \Bigl[ e^{-t} (1 + X^2)^{1/2} + (1 - e^{-2t})^{1/2} (1 + W^2)^{1/2} \Bigr]^{s}$$

and (48) follows from the inequality $(x^2 + y^2)^s \le \max(1, 2^{s-1})\,(x^{2s} + y^{2s})$. For the moments of order 2, more can be said: all the quantities of the form

$$P_{ij}(S_t f) = \int_{\mathbb{R}^N} dv \, (S_t f)(v)\, v_i v_j$$


behave monotonically and converge exponentially fast to their equilibrium value $\delta_{ij}$ as $t \to \infty$. Finally, the semigroup $(S_t)$ has smoothing effects: for all $t > 0$, $S_t f$ is $C^\infty$ and such that

$$|\log S_t f(v)| \le C_t\, (1 + |v|^2) \tag{49}$$

for some constant $C_t$ depending on $t$. A proof of (49) can be found in [9]. Let us now prove a less known propagation property.

Proposition 1. Let $s > 0$, $\varepsilon > 0$, and let $f \ge 0$ be such that $\|f\|_{L^1_{s+\varepsilon}} < \infty$ and $\|f\|_{L^1_s \log L} < \infty$. Then there exists $C_s$ depending only on $s$, $\varepsilon$ and $\|f\|_{L^1_s \log L}$, $\|f\|_{L^1_{s+\varepsilon}}$, such that for all $t > 0$, $\|S_t f\|_{L^1_s \log L} \le C_s$.

Proof. Here as in the sequel, we will denote by $C$, $C_s$ various constants. We note first that it suffices to obtain a uniform bound on

$$L_s = \int dv \, f(v) \log f(v)\, (1 + |v|^2)^{s/2}.$$

Indeed,

$$\int f |\log f|\, (1 + |v|^2)^{s/2} \le \int f \log f\, (1 + |v|^2)^{s/2} + 2 \int_{f \le 1} f |\log f|\, (1 + |v|^2)^{s/2}. \tag{50}$$

But, owing to the inequality $x |\log x| \le y - x \log y$ that holds for all $0 \le x \le 1$, $0 < y \le 1$, we see that if $f \le 1$, for any $\varepsilon > 0$,

$$f |\log f| \le e^{-(1 + |v|^2)^{\varepsilon/2}} + (1 + |v|^2)^{\varepsilon/2} f,$$

and this implies

$$\int_{f \le 1} f |\log f|\, (1 + |v|^2)^{s/2} \le d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}}, \tag{51}$$

where we put

$$d_{s,\varepsilon} = \int_{\mathbb{R}^N} (1 + |v|^2)^{s/2}\, e^{-(1 + |v|^2)^{\varepsilon/2}} \, dv.$$

Since the moments of order $s + \varepsilon$ are uniformly propagated by $(S_t)$, it is sufficient to consider the first integral in the right hand side of (50), whence our claim. For all $t > 0$, $S_t f$ is smooth and $|\log f|$ is quadratically bounded. Moreover, the mapping $t \mapsto L_s(S_t f)$ is continuous (see related arguments in Sect. 5) and continuously differentiable for $t > 0$. Let us compute its derivative. In the sequel, we use the abridged


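notation $f = f(t)$ for $S_t f$. Integrating systematically by parts in all the integrals where only one derivative of $f$ enters, we easily obtain

$$\begin{aligned} \int \nabla \cdot (\nabla f + f v)\, \log f\, (1 + |v|^2)^{s/2} = {} & -\int \frac{|\nabla f|^2}{f} (1 + |v|^2)^{s/2} \\ & + \int f \log f \Bigl[ \Delta (1 + |v|^2)^{s/2} - v \cdot \nabla (1 + |v|^2)^{s/2} \Bigr] \\ & + \int f \Bigl\{ \nabla \cdot \bigl[ v (1 + |v|^2)^{s/2} \bigr] - \Delta (1 + |v|^2)^{s/2} \Bigr\} \end{aligned}$$

and

$$\int \nabla \cdot (\nabla f + f v)\, (1 + |v|^2)^{s/2} = -\int f\, v \cdot \nabla (1 + |v|^2)^{s/2} + \int f\, \Delta (1 + |v|^2)^{s/2}.$$

Hence

$$\begin{aligned} \frac{d}{dt} \int f \log f\, (1 + |v|^2)^{s/2} = {} & -\int \frac{|\nabla f|^2}{f} (1 + |v|^2)^{s/2} \\ & + \int f \log f\, (1 + |v|^2)^{s/2} \left[ \frac{Ns}{1 + |v|^2} + \frac{s(s-2)|v|^2}{(1 + |v|^2)^2} - \frac{s|v|^2}{1 + |v|^2} \right] + N \|f\|_{L^1_s}. \end{aligned}$$

Now, by (51),

$$\begin{aligned} & \int f \log f\, (1 + |v|^2)^{s/2} \left[ \frac{Ns}{1 + |v|^2} + \frac{s(s-2)|v|^2}{(1 + |v|^2)^2} - \frac{s|v|^2}{1 + |v|^2} \right] \\ & \quad \le (Ns + s(s-2)) \int_{f \ge 1} f \log f\, (1 + |v|^2)^{s/2} - s \int_{f \le 1} f \log f\, (1 + |v|^2)^{s/2} \\ & \quad \le (Ns + s(s-2)) \int f \log f\, (1 + |v|^2)^{s/2} + (Ns + s(s-1)) \Bigl[ d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}} \Bigr], \end{aligned}$$

and we obtain

$$\begin{aligned} \frac{d}{dt} \int f \log f\, (1 + |v|^2)^{s/2} \le {} & -4 \int |\nabla \sqrt{f}|^2 (1 + |v|^2)^{s/2} + (Ns + s(s-2)) \int f \log f\, (1 + |v|^2)^{s/2} \\ & + (Ns + s(s-1)) \Bigl[ d_{s,\varepsilon} + \|f\|_{L^1_{s+\varepsilon}} \Bigr] + N \|f\|_{L^1_s}. \end{aligned}$$

Next, since

$$\nabla \sqrt{f}\, (1 + |v|^2)^{s/4} = \nabla \Bigl[ \sqrt{f}\, (1 + |v|^2)^{s/4} \Bigr] - \frac{s}{2} \sqrt{f}\, v\, (1 + |v|^2)^{s/4 - 1},$$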

we write

$$\begin{aligned} 4 \int_{\mathbb{R}^N} |\nabla \sqrt{f}|^2 (1 + |v|^2)^{s/2} = {} & 4 \int \Bigl| \nabla \bigl[ \sqrt{f}\, (1 + |v|^2)^{s/4} \bigr] \Bigr|^2 + s^2 \int f |v|^2 (1 + |v|^2)^{s/2 - 2} \\ & - 4s \int \nabla \bigl[ \sqrt{f}\, (1 + |v|^2)^{s/4} \bigr] \cdot \sqrt{f}\, (1 + |v|^2)^{s/4}\, \frac{v}{1 + |v|^2} \\ = {} & I\bigl( f (1 + |v|^2)^{s/2} \bigr) + s^2 \int f |v|^2 (1 + |v|^2)^{s/2 - 2} \\ & + 2s \int f (1 + |v|^2)^{s/2}\, \nabla \cdot \left( \frac{v}{1 + |v|^2} \right), \end{aligned}$$

where we have used the identity $2 (\nabla \sqrt{g}) \sqrt{g} = \nabla g$ and integrated by parts in the last integral. By Gross's logarithmic Sobolev inequality [27], written with respect to the Lebesgue measure, for all functions $g \in H^1(\mathbb{R}^N)$, and for any $a > 0$,

$$\int_{\mathbb{R}^N} dv \, |g|^2 \log \bigl( |g|^2 / \|g\|^2_{L^2(\mathbb{R}^N)} \bigr) + \left[ N + \frac{N}{2} \log (2\pi a) \right] \int_{\mathbb{R}^N} dv \, |g|^2 \le 2a \int_{\mathbb{R}^N} dv \, |\nabla g|^2.$$

Hence, choosing $a = [4Ns + 4s(s-2)]^{-1}$ we obtain

$$\begin{aligned} I\bigl( f (1 + |v|^2)^{s/2} \bigr) \ge 8 (Ns + s(s-2)) \Bigl\{ & \int f \log f\, (1 + |v|^2)^{s/2} + \int f (1 + |v|^2)^{s/2} \log (1 + |v|^2)^{s/2} \\ & - \|f\|_{L^1_s} \log \|f\|_{L^1_s} + \Bigl[ N + \frac{N}{2} \log \bigl[ \pi (2Ns + 2s(s-2))^{-1} \bigr] \Bigr] \|f\|_{L^1_s} \Bigr\}. \end{aligned}$$

Grouping together all the previous inequalities, we conclude that

$$\frac{d}{dt} \int f \log f\, (1 + |v|^2)^{s/2} \le -8 (Ns + s(s-2)) \int f \log f\, (1 + |v|^2)^{s/2} + C,$$

where $C$ depends on $N$, $s$, $\|f\|_{L^1_{s+\varepsilon}}$ in an explicitly computable way. By (48), $\|f\|_{L^1_{s+\varepsilon}}$ is bounded uniformly in $t$, and this implies a uniform bound for $L_s(f(t))$. □

All these properties would suffice to ensure that $(S_t)_{t \ge 0}$ gives a very convenient way of smoothing densities in the frame of the Boltzmann equation. This becomes still clearer in view of the work by Morgenstern [35] upon the case of Maxwellian potentials with cut-off. For these potentials, under a suitable normalization of $B$, the Boltzmann equation simply reads

$$\partial_t f = Q^+(f, f) - f = \int dv_* \, d\sigma \, b(k \cdot \sigma)\, f' f'_* - f, \tag{52}$$


where $k = (v - v_*)/|v - v_*|$. First Morgenstern in dimension two, and subsequently Bobylev in any dimension of the velocity variable [5] proved that if $Q^+$ is given by (52), then for all $\delta > 0$,

$$Q^+(f * M_\delta, f * M_\delta) = Q^+(f, f) * M_\delta. \tag{53}$$

Since $Q^+$ also commutes with the rescaling (43), it follows that $Q^+(S_t f, S_t f) = S_t Q^+(f, f)$, and as a consequence that the semigroup induced by the Boltzmann equation with Maxwellian molecules commutes with the adjoint Ornstein-Uhlenbeck semigroup. This property was crucial in the analysis of [9] (see also [47]). Several other symmetry properties connecting the Boltzmann equation and the Fokker–Planck equation will be studied in the next section (one can safely assume that we overlooked many others).

The Boltzmann equation with Maxwellian molecules enjoys many remarkable properties, reminiscent of the Fokker–Planck equation. In particular, $I(f|M)$ is decreasing along its solutions. This was proven in [38] for $N = 2$, in [9] in the case when $b$ is constant, and in [47] in the general case. As we shall see in Sect. 8, this decreasing property can be related to the problem of finding a lower bound for $D(f)$. This point had already been noted (by a different argument) by Carlen and Carvalho.

These peculiar properties of Maxwellian molecules may be somewhat enlightened (or obscured!) by the study of the so-called asymptotics of grazing collisions. These are a limiting process under which the Boltzmann equation transforms into a nonlinear diffusion-type equation, called Landau (or sometimes Fokker–Planck!) equation. This limit was discovered from the formal point of view by Landau [30] in the study of plasmas, in the frame of the Coulombian potential. From the mathematical point of view, a very wide class of potentials can be considered. We refer to [43] for a detailed study, and further references on the subject. See also [45] for a rigorous variant of Landau's original argument.

It turns out that in the case of Maxwellian molecules, the corresponding Landau equation resembles very much the Fokker–Planck equation. In [44] the following representation was established:

$$\partial_t f = \sum_i (N - T_i)\, \partial_{ii} f + (N - 1)\, \nabla \cdot (f v) + \Delta_S f, \tag{54}$$

where an orthonormal basis $(e_i)$ has been chosen such that

$$\int_{\mathbb{R}^N} f(v)\, v_i v_j \, dv = \delta_{ij} T_i, \tag{55}$$

which is always possible by diagonalization of the quadratic form

$$q : e \longmapsto \int f(v)\, (v, e)^2 \, dv \qquad (v \in \mathbb{R}^N).$$

The condition (55) is preserved by Eq. (54). $T_i$ will be called the directional temperature of $f$ along the direction $e_i$. Here $\Delta_S$ denotes the Laplace-Beltrami operator,

$$\Delta_S f(v) = \sum_{ij} \bigl( |v|^2 \delta_{ij} - v_i v_j \bigr)\, \partial_{ij} f(v) - (N - 1)\, v \cdot \nabla f(v).$$


In particular, in the case of radial distributions, the Landau equation (54) coincides (up to a multiplicative factor N − 1) with the Fokker–Planck equation. The questions of existence, uniqueness, asymptotic behaviour and some qualitative properties of solutions to Eq. (54) have been addressed in [44]. The Landau equation with Maxwellian molecules shares many properties with both the Boltzmann equation and the Fokker–Planck equation (see [46] for instance). We shall be essentially interested in the associated entropy dissipation,

$$ D_L(f) = 2 \int dv\, dv_*\, |v - v_*|^2\, \Pi(v - v_*)\,(\nabla - \nabla_*)\sqrt{ff_*}\;(\nabla - \nabla_*)\sqrt{ff_*}, \tag{56} $$

where $\nabla_*$ denotes the gradient with respect to the variable $v_*$, and $\Pi(z)$ is the orthogonal projection upon the space orthogonal to z,

$$ \Pi_{ij}(z) = \delta_{ij} - \frac{z_i z_j}{|z|^2} \tag{57} $$

(we use the standard notation $Axx = A_{ij} x_i x_j$). We refer to [21] for a study of the functional $D_L$. All that we shall use here is the inequality

$$ D_L(f) \ge \min_{1 \le i \le N} (N - T_i)\, I(f|M) = (N - T_f)\, I(f|M), \tag{58} $$

where

$$ T_f = \max_{e \in S^{N-1}} \int_{\mathbb{R}^N} dv\, f(v)\,(v, e)^2. \tag{59} $$

The inequality (58), established by Desvillettes and the second author, is clearly reminiscent of formula (44). See [21] for complete proofs, and many applications to the trend towards equilibrium in the general case of the Landau equation with hard potentials. Our study will reveal an unexpected connection between the functionals D and $D_L$, and the semigroup $(S_t)$, which will allow us to derive a lower bound for D, starting from the bound (58). To establish this connection, we have to study more precisely the symmetries of the equations.

3. Symmetries for Boltzmann and Fokker–Planck Equations

In this section, we give the symmetry properties that will make it possible to regularize D by the Ornstein-Uhlenbeck semigroup. We begin with an equivalent representation of (2), obtained by the classical change of variables $(v, v_*, \omega) \to (v, v_*, \sigma)$, such that

$$ v' = \frac{v + v_*}{2} + \frac{|v - v_*|}{2}\,\sigma, \qquad v'_* = \frac{v + v_*}{2} - \frac{|v - v_*|}{2}\,\sigma. \tag{60} $$

In these variables,

$$ Q(f, f) = \int_{\mathbb{R}^N} dv_* \int_{S^{N-1}} d\sigma\, \widetilde{B}(v - v_*, \sigma)\,\bigl(f' f'_* - f f_*\bigr), \tag{61} $$

where $\widetilde{B}(v - v_*, \sigma) = (2|k\cdot\omega|)^{N-2} B(v - v_*, \omega)$, and the notation

$$ k = \frac{v - v_*}{|v - v_*|} $$

will be systematically used in the sequel. We recall that $d\omega$ and $d\sigma$ are normalized in such a way that $\int d\omega = \int d\sigma = 1$. We also set

$$ X' = (v', v'_*) = T_\omega X = U_\sigma X, \qquad X = (v, v_*), \tag{62} $$

where $U_\sigma$ is associated to the transformation (60). Note that for fixed σ, $U_\sigma : \mathbb{R}^{2N} \to \mathbb{R}^{2N}$ is not bijective. For any function G(X), we write

$$ {}^tT_\omega G = G \circ T_\omega, \qquad {}^tU_\sigma G = G \circ U_\sigma. $$

We now state a lemma which is due to Boltzmann himself.

Lemma 1. Let $f \in L^1(\mathbb{R}^N)$. Then the average

$$ G(v, v_*) = \int_{S^{N-1}} d\sigma\, f' f'_* \tag{63} $$

depends only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. More generally, this result holds for any average of the form

$$ G(X) = \int_{S^{N-1}} d\sigma\; {}^tU_\sigma F(X), \tag{64} $$

where $F \in L^1(\mathbb{R}^{2N})$.

The proof is immediate, since, in view of (60), the average (63) depends only on the sphere with center $q = (v + v_*)/2$ and radius $r = |v - v_*|/2$. The set of (N + 1) scalar variables (q, r) is clearly equivalent to (m, e). We note that $G \in L^1(\mathbb{R}^{2N})$ since for any $\varphi \in L^\infty$,

$$ \int_{\mathbb{R}^{2N}} G\varphi = \int_{\mathbb{R}^{2N} \times S^{N-1}} d\sigma\, dX\; {}^tU_\sigma F(X)\,\varphi(X) = \int_{\mathbb{R}^{2N}} dX\, F(X) \int_{S^{N-1}} d\sigma\; {}^tU_\sigma \varphi(X) \le \|F\|_{L^1} \|\varphi\|_{L^\infty}. $$
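The geometric content of Lemma 1 can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper names `collide` and `invariants` are hypothetical, and the sample velocities are arbitrary. It applies the σ-representation (60) and confirms that the post-collisional pair keeps the same momentum m and energy e, so that it stays on the sphere of center q and radius r.

```python
import math

def collide(v, v_star, sigma):
    """Post-collisional velocities from (60); sigma must be a unit vector."""
    mid = [(a + b) / 2 for a, b in zip(v, v_star)]   # q = (v + v*)/2
    half = math.dist(v, v_star) / 2                  # r = |v - v*|/2
    v_new = [m + half * s for m, s in zip(mid, sigma)]
    v_star_new = [m - half * s for m, s in zip(mid, sigma)]
    return v_new, v_star_new

def invariants(v, v_star):
    """Return m = v + v* and e = (|v|^2 + |v*|^2)/2."""
    m = [a + b for a, b in zip(v, v_star)]
    e = (sum(a * a for a in v) + sum(b * b for b in v_star)) / 2
    return m, e

# Arbitrary sample data in dimension N = 3 (illustrative only).
v, v_star = [1.0, -2.0, 0.5], [0.3, 0.7, -1.1]
norm = math.sqrt(3.0)
sigma = [1 / norm, 1 / norm, -1 / norm]
v_new, v_star_new = collide(v, v_star, sigma)
```

Calling `invariants(v, v_star)` and `invariants(v_new, v_star_new)` returns the same pair up to rounding, whatever unit vector σ is chosen.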

Now, the heart of the matter lies in the following property. We denote by T the operation of tensor product, and by A the average operation (64). Moreover, we use the same symbol $S_t$ for the action of the adjoint Ornstein-Uhlenbeck semigroup in $\mathbb{R}^N$ and in $\mathbb{R}^{2N}$. When no confusion is possible, we also use the symbol M for the Maxwellian in 2N variables: $M(X) = M(v)M(v_*)$.

Proposition 2. The diagram

$$
\begin{array}{ccccc}
f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;A\;} & G = \int d\sigma\, f' f'_* \\
\downarrow S_t & & \downarrow S_t & & \downarrow S_t \\
S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;A\;} & S_t G
\end{array}
\tag{65}
$$

is commutative.

Remark. Let $D_N$ be the set of functions in $L^1(\mathbb{R}^N)$ satisfying conditions (36). Then T maps $D_N$ into $D_{2N}$, and A maps $D_{2N}$ into itself.

Proof. We first prove that the action of $S_t$ commutes with the tensorization T. Since $S_t$ is the composition of a convolution by a Maxwellian distribution $M_{\lambda(t)}$ and a rescaling of the velocity space, it is sufficient to check the property for these two operations. Since for all µ > 0, $\mu X = (\mu v, \mu v_*)$, obviously $(ff_*)_\lambda = (f_\lambda)(f_\lambda)_*$, which proves the second part of the proposition, and shows at the same time that we only need to consider the convolution by M instead of $M_\lambda$. On the other hand,

$$ M(X) = M(v)M(v_*), \tag{66} $$

and this directly implies that for every function g, $(M * g)(M * g)_* = M * (gg_*)$.

We now prove that in $\mathbb{R}^{2N}$, $S_t$ commutes with A. Since $X \longmapsto U_\sigma(X)$ is homogeneous of degree one, the rescaling (43) commutes with A. Therefore, we just have to check that A also commutes with the convolution by M. Let us set

$$ q = \frac{w + w_*}{2}, \qquad r = |w - w_*|, \qquad \ell = \frac{w - w_*}{|w - w_*|}. $$

Thus,

$$ w = q + \frac{r}{2}\ell, \qquad w_* = q - \frac{r}{2}\ell, \qquad w' = q + \frac{r}{2}\sigma, \qquad w'_* = q - \frac{r}{2}\sigma. $$

Then, for any function $F(v, v_*)$,

$$ M * \int d\sigma\; {}^tU_\sigma F = J \int dq\, r^{N-1} dr\, d\ell\, d\sigma\; F\Bigl(q + \frac{r}{2}\sigma,\; q - \frac{r}{2}\sigma\Bigr)\, M\Bigl(v - q - \frac{r}{2}\ell,\; v_* - q + \frac{r}{2}\ell\Bigr), \tag{67} $$

where J denotes some Jacobian (remember that dσ is the normalized measure on $S^{N-1}$). On the other hand,

$$ \int d\sigma\; {}^tU_\sigma (M * F) = J \int dq\, r^{N-1} dr\, d\ell\, d\sigma\; F\Bigl(q + \frac{r}{2}\ell,\; q - \frac{r}{2}\ell\Bigr)\, M\Bigl(v' - q - \frac{r}{2}\ell,\; v'_* - q + \frac{r}{2}\ell\Bigr). \tag{68} $$

Exchanging ℓ and σ in (67), we see that we only have to prove that for all $q, r, \ell, v, v_*$,

$$ \int d\sigma\, M\Bigl(v - q - \frac{r}{2}\sigma,\; v_* - q + \frac{r}{2}\sigma\Bigr) = \int d\sigma\, M\Bigl(\frac{v + v_*}{2} + \frac{|v - v_*|}{2}\sigma - q - \frac{r}{2}\ell,\; \frac{v + v_*}{2} - \frac{|v - v_*|}{2}\sigma - q + \frac{r}{2}\ell\Bigr). \tag{69} $$

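Changing v into v − q and $v_*$ into $v_* - q$, we reduce to the case when q = 0. Using now the property

$$ M(z, z_*) = M\Bigl(\frac{z + z_*}{\sqrt{2}},\; \frac{z - z_*}{\sqrt{2}}\Bigr), \tag{70} $$

we let $M((v + v_*)/\sqrt{2})$ appear as a multiplicative factor of both sides, and we just have to prove that

$$ \int d\sigma\, M\Bigl(\frac{v - v_* - r\sigma}{\sqrt{2}}\Bigr) = \int d\sigma\, M\Bigl(\frac{|v - v_*|\sigma - r\ell}{\sqrt{2}}\Bigr). $$

Up to a constant, the left-hand side is

$$ e^{-\frac{|v - v_*|^2}{4}}\, e^{-\frac{r^2}{4}} \int d\sigma\, e^{-\frac{r}{2}(v - v_*)\cdot\sigma}, $$

while the right-hand side is

$$ e^{-\frac{|v - v_*|^2}{4}}\, e^{-\frac{r^2}{4}} \int d\sigma\, e^{-\frac{r}{2}|v - v_*|\,\ell\cdot\sigma}. $$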

Since, by rotational invariance, $\int d\sigma\, e^{-r\ell\cdot\sigma}$ does not depend on $\ell \in S^{N-1}$, the conclusion follows. □

Corollary 2.1. Let G(X) depend only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. Then, for all t > 0, $S_t G$ depends only on m and e.

In fact, a direct proof of this corollary is immediate: by density and linearity, it is sufficient to consider only the case when $G(X) = G_1((v + v_*)/2)\, G_2(|v - v_*|/2)$. In view of (70), $S_t G = (S_t G_1)(S_t G_2)$. Since $G_2$ is radial by assumption, and since M is radial, so is $S_t G_2$, which completes the proof.

In the case N = 2, more can be said. In the representation (61), one can take as a new variable the angle θ between σ and k. Thus the transformation $X \longmapsto X'$ can be seen as a rotation in $\mathbb{R}^2$, or, more precisely

$$ \Bigl(\frac{v + v_*}{2},\; v - v_*\Bigr) \longmapsto \Bigl(\frac{v + v_*}{2},\; R_\theta (v - v_*)\Bigr), \tag{71} $$

where $R_\theta$ denotes the standard rotation by angle θ in oriented $\mathbb{R}^2$. By extension, we shall denote by $R_\theta$ the application given by (71).

Proposition 3. Assume N = 2. Then for each $\theta \in \mathbb{R}/(2\pi\mathbb{Z})$, the diagram

$$
\begin{array}{ccccc}
f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;{}^tR_\theta\;} & {}^tR_\theta F = f' f'_* = F' \\
\downarrow S_t & & \downarrow S_t & & \downarrow S_t \\
S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;{}^tR_\theta\;} & (S_t F)'
\end{array}
\tag{72}
$$

is commutative. In short, $S_t(f' f'_*) = (S_t f)' (S_t f)'_*$.

Proof. It suffices to note that

$$
\begin{aligned}
M * ({}^tR_\theta F)(X) &= \int_{\mathbb{R}^{2N}} dY\; {}^tR_\theta F(Y)\, M(X - Y) \\
&= \int_{\mathbb{R}^{2N}} dY\, F(R_\theta Y)\, M(X - Y) \\
&= \int_{\mathbb{R}^{2N}} dY\, F(Y)\, M(X - R_\theta^{-1} Y) \\
&= \int_{\mathbb{R}^{2N}} dY\, F(Y)\, M(R_\theta X - Y) \\
&= (M * F)(R_\theta X),
\end{aligned}
$$

where we have used the fact that a rotation has unit Jacobian, and that M is invariant under ${}^tR_\theta$. □

The particular character of the dimension 2 was already noticed in related problems [38,47]. Unfortunately, it is difficult to see how this property could generalize to higher dimensions. For example, in the case N = 3, even if one chooses a system of spherical coordinates (r, θ, φ) with axis k, there is no canonical way to choose the coordinate φ, and it is not clear whether this can be done in such a manner that φ be well defined independently of k. The analog of Proposition 3 is however valid if one replaces $R_\theta$ by

$$ J_\theta : F \longmapsto \int_0^{2\pi} d\varphi\; {}^tU_\sigma F, $$

where in the right-hand side the coordinates of σ in the local spherical system of axis $v - v_*$ are (θ, φ). We conclude this section by noting that a similar lemma holds with the transformations $T_\omega$.

Proposition 4. For each $\omega \in S^{N-1}$, the diagram

$$
\begin{array}{ccccc}
f & \xrightarrow{\;T\;} & F = ff_* & \xrightarrow{\;{}^tT_\omega\;} & {}^tT_\omega F = f' f'_* = F' \\
\downarrow S_t & & \downarrow S_t & & \downarrow S_t \\
S_t f & \xrightarrow{\;T\;} & S_t F & \xrightarrow{\;{}^tT_\omega\;} & (S_t F)'
\end{array}
\tag{73}
$$

is commutative.

The proof is the same as for Proposition 3.

Remarks.
1. The properties of invariance under ${}^tR_\theta$ or ${}^tT_\omega$ characterize the Maxwellian distribution. Hence no other convolution regularization than Maxwellian could yield the same conclusion.
2. As pointed out to us by Desvillettes, these propositions give an immediate proof that Maxwellian distributions are the only solutions in $L^1_2$ of Eq. (15). Indeed, let f be such a solution. Then, in view of Proposition 4, so is $S_t f$ for any t > 0. Since $S_t f$ is smooth, classical proofs relying for instance on the "killing operator" (32) prove that $S_t f$ is the Maxwellian distribution $M^f$. By the continuity of $S_t$ at time 0, this is also true for f.

4. Integral Representation of a Lower Bound for D

In this section, we fix a cross section

$$ B(z, \omega) = \psi(|z|) \Bigl(2\,\frac{|z\cdot\omega|}{|z|}\Bigr)^{N-2}, \qquad \text{i.e.} \quad \widetilde{B}(z, \sigma) = \psi(|z|), \tag{74} $$

with the variables (60). The entropy dissipation for the kernel B reads

$$ D(f) = \frac{1}{4} \int dv\, dv_*\, \psi(|v - v_*|) \int d\sigma\, \bigl(f' f'_* - f f_*\bigr) \log\frac{f' f'_*}{f f_*}. \tag{75} $$

The assumptions on the nonnegative function ψ shall be made precise later on. By the joint convexity of the function $(x, y) \mapsto (x - y)\log(x/y)$, and since $\int d\sigma = 1$, we get

$$ D(f) \ge \frac{1}{4} \int dX\, \psi(|v - v_*|)\,(F - G) \log\frac{F}{G} \equiv \overline{D}(f), \tag{76} $$

where we use the notations (62), and

$$ F(v, v_*) = f f_*, \qquad G = \int d\sigma\; {}^tU_\sigma F = AF. \tag{77} $$

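Our aim here is to establish an integral representation for $\overline{D}(f)$. To this end, we shall regularize $\overline{D}$ by $(S_t)$ and compute $(d/dt)\overline{D}(S_t f)$. At first sight this seems a formidable job to do, since f appears no less than eight times in (75). But applying Proposition 2, we see that it is equivalent to apply $S_t$ to the functions F and G appearing in the right-hand side of (76). For all positive time t > 0, $S_t F$ and $S_t G$ are smooth and their logarithm is bounded by a quadratic expression. This is enough to justify all the manipulations below.

The following lemma will enable us to compute very simply the time-derivative of $\overline{D}(S_t f)$. It yields actually the commutator between derivation along $S_t$ and the function $(x, y) \longmapsto (x - y)\log(x/y)$, which is homogeneous of degree 1.

Proposition 5. Let F and G be smooth functions with logarithms quadratically bounded. Then

$$ \frac{d}{dt}\Big|_{t=0} \Bigl[(S_t F - S_t G) \log\frac{S_t F}{S_t G}\Bigr] = -(F + G) \Bigl|\frac{\nabla F}{F} - \frac{\nabla G}{G}\Bigr|^2 + \frac{d}{dt}\Big|_{t=0} S_t\Bigl[(F - G)\log\frac{F}{G}\Bigr]. \tag{78} $$

Proof. In the sequel, ∇ stands for $\nabla_X$. Let L denote the Fokker–Planck operator in $\mathbb{R}^{2N}$, and let us compute

$$ L(F - G)\,\log\frac{F}{G} + (F - G)\Bigl(\frac{LF}{F} - \frac{LG}{G}\Bigr) - L\Bigl[(F - G)\log\frac{F}{G}\Bigr]. \tag{79} $$

We shall show that this expression is equal to the first term in the right-hand side of (78). Indeed, expanding (79), we find

$$
\begin{aligned}
&\nabla\cdot\bigl[\nabla(F - G) + (F - G)X\bigr] \log\frac{F}{G} + (F - G)\Bigl[\frac{\nabla\cdot(\nabla F + FX)}{F} - \frac{\nabla\cdot(\nabla G + GX)}{G}\Bigr] \\
&\quad - \nabla\cdot\Bigl[\nabla(F - G)\log\frac{F}{G} + (F - G)\Bigl(\frac{\nabla F}{F} + X - \frac{\nabla G}{G} - X\Bigr) + (F - G)X\log\frac{F}{G}\Bigr] \\
&= \nabla\cdot\bigl[\nabla(F - G)\bigr]\log\frac{F}{G} - \nabla\cdot\Bigl[\nabla(F - G)\log\frac{F}{G}\Bigr] \\
&\quad + \nabla\cdot\bigl[(F - G)X\bigr]\log\frac{F}{G} - \nabla\cdot\Bigl[(F - G)X\log\frac{F}{G}\Bigr] \\
&\quad + \frac{F - G}{F}\,\nabla\cdot(\nabla F + FX) - \nabla\cdot\Bigl[\frac{F - G}{F}(\nabla F + FX)\Bigr] \\
&\quad + \frac{G - F}{G}\,\nabla\cdot(\nabla G + GX) - \nabla\cdot\Bigl[\frac{G - F}{G}(\nabla G + GX)\Bigr] \\
&= -\nabla(F - G)\cdot\Bigl(\frac{\nabla F}{F} - \frac{\nabla G}{G}\Bigr) - (F - G)X\cdot\Bigl(\frac{\nabla F}{F} - \frac{\nabla G}{G}\Bigr) \\
&\quad - (\nabla F + FX)\cdot\Bigl(-\frac{\nabla G}{F} + \frac{G\nabla F}{F^2}\Bigr) - (\nabla G + GX)\cdot\Bigl(-\frac{\nabla F}{G} + \frac{F\nabla G}{G^2}\Bigr).
\end{aligned}
$$

Expanding the last expression, we see that all the terms containing X cancel out. As for the other ones, we obtain in the end

$$ -\frac{|\nabla F|^2}{F} - \frac{|\nabla G|^2}{G} + 2\,\frac{\nabla F\cdot\nabla G}{F} + 2\,\frac{\nabla G\cdot\nabla F}{G} - G\,\frac{|\nabla F|^2}{F^2} - F\,\frac{|\nabla G|^2}{G^2} = -(F + G)\Bigl|\frac{\nabla F}{F} - \frac{\nabla G}{G}\Bigr|^2. $$

□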
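The cancellation behind Proposition 5 can be sanity-checked pointwise. The sketch below is only an illustration under stated assumptions: it works in one dimension, takes the Fokker–Planck operator as $Lh = h'' + (xh)'$, and uses two arbitrary test functions $F = e^{p}$, $G = e^{q}$ chosen here (they do not come from the paper). The helper names are hypothetical.

```python
import math

def identity_sides(x, p, q):
    """Both sides of the 1-D analogue of (79) = -(F+G)|F'/F - G'/G|^2 at x.

    p, q are triples (value, first, second derivative) of log F, log G at x.
    """
    (p0, p1, p2), (q0, q1, q2) = p, q
    F, G = math.exp(p0), math.exp(q0)
    dF, dG = p1 * F, q1 * G
    d2F, d2G = (p2 + p1 ** 2) * F, (q2 + q1 ** 2) * G

    def fokker_planck(h, dh, d2h):
        # L h = h'' + (x h)' = h'' + x h' + h
        return d2h + x * dh + h

    LF, LG = fokker_planck(F, dF, d2F), fokker_planck(G, dG, d2G)
    # phi = (F - G) log(F/G) with its first two derivatives
    lg, dlg, d2lg = p0 - q0, p1 - q1, p2 - q2
    phi = (F - G) * lg
    dphi = (dF - dG) * lg + (F - G) * dlg
    d2phi = (d2F - d2G) * lg + 2 * (dF - dG) * dlg + (F - G) * d2lg
    lhs = (LF - LG) * lg + (F - G) * (LF / F - LG / G) - fokker_planck(phi, dphi, d2phi)
    rhs = -(F + G) * dlg ** 2
    return lhs, rhs

def log_F(x):
    # Assumed test function: log F = -x^2/2 + 0.3 x
    return (-x * x / 2 + 0.3 * x, -x + 0.3, -1.0)

def log_G(x):
    # Assumed test function: log G = -x^4/10 - 0.2 x
    return (-x ** 4 / 10 - 0.2 * x, -0.4 * x ** 3 - 0.2, -1.2 * x * x)
```

Evaluating `identity_sides(x, log_F(x), log_G(x))` at a few points returns two numbers that agree up to rounding, which is the one-dimensional counterpart of the computation above.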

This relation may seem somewhat miraculous. In Sect. 8, we shall try to connect it with other known properties in kinetic theory. With Proposition 5 at hand, it is immediate to compute the time derivative of $\overline{D}(S_t f)$. Let us write

$$ L^* : g \longmapsto \Delta_X g - X\cdot\nabla_X g, \qquad g \in \mathcal{D}'(\mathbb{R}^{2N}), $$

for the adjoint of L. We note that

$$ \Delta_X \psi(|v - v_*|) = 2\Delta_v \psi(|v - v_*|) = 2\Bigl[\psi''(|v - v_*|) + (N - 1)\frac{\psi'(|v - v_*|)}{|v - v_*|}\Bigr], $$

$$ X\cdot\nabla_X \psi(|v - v_*|) = v\cdot\nabla_v \psi(|v - v_*|) - v_*\cdot\nabla_v \psi(|v - v_*|) = |v - v_*|\,\psi'(|v - v_*|). $$

Hence we can safely use the notation

$$ (L^*\psi)(|z|) = 2\psi''(|z|) + \Bigl[\frac{2(N - 1)}{|z|} - |z|\Bigr]\psi'(|z|), \tag{80} $$

and by (76) and (78) we get for all t > 0,

$$ \frac{d}{dt}\overline{D}(S_t f) = -\frac{1}{4}\int dX\, \psi(|v - v_*|)\,(S_t F + S_t G)\Bigl|\frac{\nabla S_t F}{S_t F} - \frac{\nabla S_t G}{S_t G}\Bigr|^2 + \frac{1}{4}\int dX\, (L^*\psi)(|v - v_*|)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}, \tag{81} $$

provided that

$$ \psi(|v - v_*|)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G} \in L^1(\mathbb{R}^{2N}), \tag{82} $$

$$ (L^*\psi)(|v - v_*|)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G} \in L^1(\mathbb{R}^{2N}). \tag{83} $$
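As a quick consistency check on the radial form (80) of $L^*\psi$, one can evaluate it on the test function $\psi(|z|) = |z|^2$, for which the text later uses $L^*(|z|^2) = 4N - 2|z|^2$. The snippet is a minimal sketch; the function name `l_star_radial` and the sample values are assumptions made for illustration.

```python
import math

def l_star_radial(dpsi, d2psi, r, n):
    """Formula (80): (L* psi)(r) = 2 psi''(r) + [2(N-1)/r - r] psi'(r), r > 0."""
    return 2 * d2psi(r) + (2 * (n - 1) / r - r) * dpsi(r)

def l_star_of_square(r, n):
    """(80) applied to psi(r) = r^2, i.e. psi' = 2r and psi'' = 2."""
    return l_star_radial(lambda s: 2 * s, lambda s: 2.0, r, n)
```

For instance, `l_star_of_square(1.7, 3)` agrees with `4 * 3 - 2 * 1.7 ** 2` up to rounding.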

We remark that, thanks to condition (49) and the result of Proposition 1, conditions (82) and (83) are propagated by $S_t$, under weak assumptions on the growth of ψ and $L^*\psi$. In more detail, we assume for simplicity that ψ is bounded below by a fixed number ν > 0, $F\psi \in L^1(\mathbb{R}^{2N})$, and

$$ L^*\psi(|v - v_*|) \le C\psi(|v - v_*|) \le C_1 (1 + |X|^2)^{\alpha/2}, \tag{84} $$

for some α > 0. Note that the condition (84) is always satisfied if ψ(|z|) behaves like a power of |z| at infinity, since then, the dominant term as |z| → ∞ of $L^*\psi(|z|)$ is $-|z|\psi'(|z|)$. Then, if

$$ \|f\|_{L^1_{2+\alpha}\log L} < \infty, \tag{85} $$

both (82) and (83) hold uniformly in time for all $t \ge t_0 > 0$ (because the Ornstein-Uhlenbeck adjoint semigroup generates pointwise Maxwellian lower bounds).

Finally, we prove that continuity at time 0 of the mapping $t \mapsto \overline{D}(S_t f)$ follows under the same conditions (84) and (85). First, by the convexity of $\overline{D}$ and the strong continuity of $(S_t)$ at time 0, $\overline{D}(f) \le \lim_{t\to 0} \overline{D}(S_t f)$. Therefore it is sufficient to check that $\overline{D}(f) \ge \lim_{t\to 0} \overline{D}(S_t f)$. Let us denote by $M_\varepsilon$ the centered Maxwellian with temperature ε, and define

$$ F_\varepsilon = \frac{(F\psi) * M_\varepsilon}{\psi}, \qquad G_\varepsilon = \frac{(G\psi) * M_\varepsilon}{\psi}. $$

Since ψ is bounded below, it is clear that $(F_\varepsilon, G_\varepsilon) \to (F, G)$ strongly in $L^1 \times L^1$ as ε → 0, and $(F_\varepsilon\psi, G_\varepsilon\psi) \to (F\psi, G\psi)$ as well. We note that $\overline{D}(f) = D(F\psi, G\psi)$, where

$$ D(F, G) = \frac{1}{4}\int dX\, (F - G)\log\frac{F}{G}. $$

Let us use for a while the notation $\overline{D}(f) = D(F)$. In view of the choice of $F_\varepsilon, G_\varepsilon$,

$$ D(F) \le \lim_{\varepsilon\to 0} D(F_\varepsilon\psi, G_\varepsilon\psi) = D\bigl((F\psi) * M_\varepsilon,\; (G\psi) * M_\varepsilon\bigr) \equiv D(F_\varepsilon). $$

But D is a translation-invariant functional, and $(F\psi) * M_\varepsilon$ is an average of translates of $F\psi$. By convexity of D, $D\bigl((F\psi) * M_\varepsilon, (G\psi) * M_\varepsilon\bigr) \le D(F\psi, G\psi)$, and hence

$$ D(F_\varepsilon) \to D(F) \qquad \text{as } \varepsilon \to 0. \tag{86} $$

By the preceding proof, since $F_\varepsilon, G_\varepsilon$ are smooth (hence $t \longmapsto D(S_t f)$ is continuous at t = 0), we see that for all t > 0,

$$ D(F_\varepsilon) \ge D(S_t F_\varepsilon) - \frac{1}{4}\int_0^t d\tau \int (L^*\psi)\,(S_\tau F_\varepsilon - S_\tau G_\varepsilon)\log\frac{S_\tau F_\varepsilon}{S_\tau G_\varepsilon}. $$

Hence, since by assumption $L^*\psi \le C\psi$ (and since $S_\tau G_\varepsilon$ is an average of $S_\tau F_\varepsilon$),

$$ D(F_\varepsilon) \ge D(S_t F_\varepsilon) - \frac{C}{4}\int_0^t d\tau\, D(S_\tau F_\varepsilon). $$

By Gronwall's lemma, $D(S_t F_\varepsilon) \le D(F_\varepsilon)\, e^{Ct/4}$. Letting ε go to 0, by the convexity of D and (86), we get $D(S_t F) \le D(F)\, e^{Ct/4}$. This is sufficient to conclude that $\lim_{t\to 0} D(S_t F) \le D(F)$. On the other hand, as t goes to infinity, $D(S_t F)$ goes to 0 (because $S_t F \to M$), at least when

$$ F \log F\, \psi\, (1 + |X|^2) \in L^1(\mathbb{R}^{2N}). \tag{87} $$

But this follows by condition (85). Thus, we conclude with the following

Theorem 6. Representation formula for $\overline{D}$. Assume that ψ is bounded below, and conditions (84), (85) are satisfied at t = 0. Then

$$ \overline{D}(f) = \frac{1}{4}\int_0^\infty dt \int dX\, \psi(|v - v_*|)\,(S_t F + S_t G)\Bigl|\frac{\nabla S_t F}{S_t F} - \frac{\nabla S_t G}{S_t G}\Bigr|^2 - \frac{1}{4}\int_0^\infty dt \int dX\, (L^*\psi)(|v - v_*|)\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}. \tag{88} $$

Remark. We are confident that this formula holds under more general assumptions, but this will be more than enough for our purposes.

5. Main Result

We are now ready to prove our main result.

Theorem 7. Let f satisfy the normalization (36). Let $\psi(|z|) \ge (1 + |z|)^\gamma$ for some real number γ ≤ 0. Assume that $f(v) \ge K e^{-A|v|^2}$, and that $\|f\|_{L^1_{4+s+\varepsilon}} < \infty$, $\|f\|_{L^1_{2+s+\varepsilon}\log L} < \infty$ for some s > 0, ε > 0. Then there exists a constant $C_s(f)$ depending only on s, ε, γ, K, A, $\|f\|_{L^1_{2+s+\varepsilon}\log L}$ and $\|f\|_{L^1_{4+s+\varepsilon}}$, such that

$$ \overline{D}(f) \ge C_s(f)\, H(f|M)^{1 + \frac{2 - \gamma}{s}}, \tag{89} $$

where $\overline{D}(f)$ is the lower bound for the entropy dissipation given by (76).

Remark. To treat potentials that are "essentially" bounded below, in the sense of Sect. 6, like hard potentials, it will be sufficient to apply this theorem with γ = 0. However, we choose a function ψ which may be decaying to show that the theorem also holds for soft potentials.

Proof. Let $\psi_1$ be a smooth convex function with $\psi_1(|z|) = 1$ for |z| ≤ 1, $\psi_1(|z|) = |z|^2$ for |z| ≥ 2, $|z|^2/2 \le \psi_1(|z|) \le 1 + |z|^2$. Since $L^*(|z|^2) = 4N - 2|z|^2$, we can impose that $|L^*\psi_1(|z|)| \le C(1 + |z|^2)$. Let us set $\psi_R(|z|) = \psi_1(|z|/R)$. Hence,

$$ |L^*\psi_R(|z|)| \le \frac{C}{R^2}\bigl(1 + |z|^2\bigr)\, 1_{|z| \ge R}. $$

Let $\overline{D}^R(f)$ be the functional $\overline{D}$ associated to $\psi_R$. Since $\psi_R(|z|) \ge |z|^2/(2R^2)$, and since

$$ |v - v_*|^2 \ge R^2 \implies |v|^2 + |v_*|^2 \ge \frac{R^2}{2}, $$

we obtain by Theorem 6,

$$ \overline{D}^R(f) \ge \frac{1}{8R^2}\int_0^\infty dt \int dX\, |v - v_*|^2\,(S_t F + S_t G)\Bigl|\frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G}\Bigr|^2 - \frac{C}{R^2}\int_0^\infty dt \int dX\, \bigl(1 + |X|^2\bigr)\, 1_{|X| \ge R/\sqrt{2}}\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}. \tag{90} $$

Next, since $\psi_R(|z|) = 1$ for |z| ≤ R, we can write

$$ (1 + |z|)^\gamma \ge (1 + R)^\gamma \psi_R(|z|) - (1 + R)^\gamma\, \frac{|z|^2}{R^2}\, 1_{|z| \ge R}. $$

Accordingly,

$$ \overline{D}(f) \ge (1 + R)^\gamma \Bigl[\overline{D}^R(f) - \frac{1}{4R^2}\int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\,(F - G)\log\frac{F}{G}\Bigr]. \tag{91} $$

Summing up, by (90) and (91) we have decomposed $\overline{D}(f)$ into three parts, two of which involve only large values of X. Now we shall estimate the principal part of $\overline{D}^R(f)$. The heart of the whole argument lies in the following:

Proposition 8. Let $F = ff_*$ and let G be a function depending only on $m = v + v_*$ and $e = (|v|^2 + |v_*|^2)/2$. Assume that f and G are smooth, with logarithms quadratically bounded, and that f satisfies the normalization (36). Then

$$ \int dX\, |v - v_*|^2\,(F + G)\Bigl|\frac{\nabla_X F}{F} - \frac{\nabla_X G}{G}\Bigr|^2 \ge \frac{1}{2}(N - T_f)\, I(f|M), \tag{92} $$

where

$$ T_f = \sup_{e \in S^{N-1}} \int dv\, f(v)\,(v, e)^2. $$

Proof. Let us write $\nabla_X = [\nabla, \nabla_*]$. Then,

$$ \frac{\nabla_X F}{F} = \Bigl[\frac{\nabla f}{f},\; \Bigl(\frac{\nabla f}{f}\Bigr)_*\Bigr]. \tag{93} $$

On the other hand,

$$ \frac{\nabla_X G}{G} = \Bigl[\frac{\nabla_m G}{G} + v\,\frac{\partial_e G}{G},\; \frac{\nabla_m G}{G} + v_*\,\frac{\partial_e G}{G}\Bigr](m, e). \tag{94} $$

Let us consider the "killing operator" $P(v, v_*)$ defined by

$$ P : [A, B] \longmapsto \Pi(v - v_*)\,(A - B), \tag{95} $$

where Π is given by (57). In view of (94), clearly,

$$ P\,\frac{\nabla_X G}{G} = 0. \tag{96} $$

For all $(v, v_*)$, $\|P\| \le 2$, where $\|\cdot\|$ denotes the norm in the sense of matrices. Here we see precisely the advantage of the Ornstein-Uhlenbeck regularization in our approach: it enables us to use the operator (95), which is pointwise bounded, instead of (32), which is definitely not. Let us set

$$ K(X) = \frac{\nabla_X F}{F} - \frac{\nabla_X G}{G}. $$

In view of (93) and (96),

$$ PK(X) = P\,\frac{\nabla_X F}{F} = \Pi(v - v_*)\Bigl[\frac{\nabla f}{f} - \Bigl(\frac{\nabla f}{f}\Bigr)_*\Bigr]. $$

Since $|PK|^2 \le 4|K|^2$, we find

$$
\begin{aligned}
\int dX\, |v - v_*|^2\,(F + G)\Bigl|\frac{\nabla_X F}{F} - \frac{\nabla_X G}{G}\Bigr|^2
&\ge \int dX\, |v - v_*|^2\, F\,|K|^2 \ge \frac{1}{4}\int dX\, |v - v_*|^2\, F\,|PK|^2 \\
&= \frac{1}{4}\int dv\, dv_*\, |v - v_*|^2\, ff_*\,\Bigl|\Pi(v - v_*)\Bigl[\frac{\nabla f}{f} - \Bigl(\frac{\nabla f}{f}\Bigr)_*\Bigr]\Bigr|^2.
\end{aligned}
$$

Apart from the factor 1/2, this is the entropy dissipation for the Landau equation with Maxwellian molecules. It suffices to apply the inequality (58) to conclude. □
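The cancellation (96) is easy to test numerically: any pair of the form (94) differs, between its two slots, by a multiple of $v - v_*$, which the projection Π(v − v∗) removes. The sketch below is illustrative only; the helper names and the sample numbers standing in for $\nabla_m G/G$ and $\partial_e G/G$ are assumptions, not quantities from the paper.

```python
import math

def project_orthogonal(z, a):
    """Apply Pi(z) from (57): subtract from a its component along z (z != 0)."""
    zz = sum(c * c for c in z)
    coeff = sum(zc * ac for zc, ac in zip(z, a)) / zz
    return [ac - coeff * zc for zc, ac in zip(z, a)]

def killing_operator(v, v_star, a_vec, b_vec):
    """P(v, v*)[A, B] = Pi(v - v*)(A - B), as in (95)."""
    z = [x - y for x, y in zip(v, v_star)]
    diff = [x - y for x, y in zip(a_vec, b_vec)]
    return project_orthogonal(z, diff)

def log_gradient_of_G(v, v_star, grad_m, d_e):
    """Pair of the form (94): [grad_m + v * d_e, grad_m + v_star * d_e]."""
    first = [g + d_e * x for g, x in zip(grad_m, v)]
    second = [g + d_e * x for g, x in zip(grad_m, v_star)]
    return first, second

# Arbitrary sample values in dimension N = 3 (illustrative only).
v, v_star = [1.0, 2.0, -0.5], [-0.3, 0.7, 1.1]
A, B = log_gradient_of_G(v, v_star, grad_m=[0.4, -1.2, 2.0], d_e=-0.8)
killed = killing_operator(v, v_star, A, B)
```

Here `killed` is the zero vector up to rounding, while a generic pair [A, B] is not annihilated.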


Now, by the properties recalled in Sect. 2, for all t > 0, $T_{(S_t f)} \le T_f$, with equality only when all the directional temperatures of f are equal to 1. Hence, setting $\lambda = N - T_f$, we can apply the preceding proposition and recover

$$ \frac{1}{R^2}\int_0^\infty dt \int dX\, |v - v_*|^2\,(S_t F + S_t G)\Bigl|\frac{\nabla_X S_t F}{S_t F} - \frac{\nabla_X S_t G}{S_t G}\Bigr|^2 \ge \frac{\lambda}{2R^2}\int_0^\infty I(S_t f|M)\, dt = \frac{\lambda}{2R^2}\, H(f|M), $$

where we have used the relation (47). It is shown in [21] that $\lambda \ge \lambda_0(f)$, where $\lambda_0$ depends only on H(f) and the normalization (36). Indeed, if $(e_i)$ is any orthonormal basis, then $\sum_i T_i = N$. Therefore, to control λ from below it suffices to control all the directional temperatures $T_i$ from below. The finiteness of H(f) and of some moments of f suffices to prevent f from concentrating on a hyperplane (v, e) = 0.

Proposition 9. Estimate of the tails. Let $R \ge \sqrt{2}$, and

$$ e(R) = \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\,(F - G)\log\frac{F}{G}, \tag{97} $$

$$ E(R) = \int_0^\infty dt \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\,(S_t F - S_t G)\log\frac{S_t F}{S_t G}. \tag{98} $$

Then for all s > 0, ε > 0,

$$ e(R) + E(R) \le \frac{C_{s+\varepsilon}}{R^s}, $$

where $C_{s+\varepsilon}$ depends only on the normalization (36), s, ε, $\|f\|_{L^1_{2+s+\varepsilon}\log L}$, $\|f\|_{L^1_{4+s+\varepsilon}}$, K and A such that $f \ge K e^{-A|v|^2}$.

Remark. We have used the inequality $(1 + |X|^2)\, 1_{|X| > R/\sqrt{2}} \le 2|X|^2\, 1_{|X| > R/\sqrt{2}}$ to get E.

Proof. We begin with e(R). Throughout the proof, C will denote various constants depending only on the aforementioned quantities. Since $f \ge K e^{-A|v|^2}$, by tensorization $F \ge K e^{-A|X|^2}$, and also $G \ge K e^{-A|X|^2}$, since Maxwellian distributions satisfy Eq. (15). Since $(x - y)\log(x/y) \le x\log x$ if x ≥ y ≥ 1, and $|\log F|, |\log G| \le C(1 + |X|^2)$ if F, G ≤ 1, we can write

$$ e(R) \le \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, F\log F + \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, G\log G + C\int dX\, |X|^2\,(F + G)(1 + |X|^2)\, 1_{|X| \ge R/\sqrt{2}}. $$

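Since $G = \int d\sigma\, F'$ and $|X'|^2 = |X|^2$, we write, using the convexity of $x \longmapsto x\log x$ and the change of variables $d\sigma\, dX = d\sigma\, dX'$,

$$ \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, G\log G \le \int d\sigma \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, F'\log F' = \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, F\log F. $$

In the same manner,

$$ \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\,(1 + |X|^2)\, G = \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\,(1 + |X|^2)\, F. $$

Hence,

$$ e(R) \le C\int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, F\log F + C\int dX\, |X|^4\, 1_{|X| \ge R/\sqrt{2}}\, F. $$

Writing $1_{|X| \ge R/\sqrt{2}} \le 1_{|v| \ge R/2} + 1_{|v_*| \ge R/2}$, $|X|^2 = |v|^2 + |v_*|^2$, $|X|^4 \le C(|v|^4 + |v_*|^4)$, $\log F = \log f + \log f_*$, we obtain

$$
\begin{aligned}
e(R) &\le C\int dv_*\, f_*\,(1 + |v_*|^2)\int dv\, |v|^2\, 1_{|v| \ge R/2}\, f|\log f| \\
&\quad + C\int dv\, |v|^2 f|\log f| \int dv_*\, f_*\,(1 + |v_*|^2)\, 1_{|v_*| \ge R/2} \\
&\quad + C\int dv_*\, f_* \int dv\, |v|^4\, 1_{|v| \ge R/2}\, f \\
&\quad + C\int dv\, |v|^4 f \int dv_*\, f_*\, 1_{|v_*| \ge R/2} \\
&\le \frac{C}{R^r}\Bigl[\|f\|_{L^1_{2+r}\log L}\,\|f\|_{L^1_2} + \|f\|_{L^1_{2+r}}\,\|f\|_{L^1_2\log L} + \|f\|_{L^1_{4+r}}\,\|f\|_{L^1} + \|f\|_{L^1_4}\,\|f\|_{L^1_r}\Bigr].
\end{aligned}
$$

Let us now turn to E(R). First,

$$ S_t\bigl(e^{-A|v|^2}\bigr) = M_{e^{-2t}/(2A)} * M_{1 - e^{-2t}} = M_{1 + [(2A)^{-1} - 1]e^{-2t}}, $$

and since $S_t$ is a linear positive transformation,

$$ S_t f \ge C e^{-A_0|v|^2}, $$

where

$$ A_0 = \sup\Bigl\{\frac{1}{1 + [(2A)^{-1} - 1]e^{-2t}},\; 0 \le t < \infty\Bigr\} = \max(1, 2A). $$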
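The value of the supremum defining $A_0$ can be confirmed on a time grid. This is a minimal sketch under the reading that the Maxwellian obtained from $e^{-A|v|^2}$ has temperature $1 + [(2A)^{-1} - 1]e^{-2t}$ at time t; the function names and the grid are assumptions chosen for illustration.

```python
import math

def ou_temperature(a, t):
    """Temperature 1 + [(2A)^{-1} - 1] e^{-2t} of the evolved Gaussian."""
    return 1.0 + (1.0 / (2.0 * a) - 1.0) * math.exp(-2.0 * t)

def sup_inverse_temperature(a, t_max=20.0, steps=2000):
    """Grid approximation of sup over 0 <= t < infinity of 1 / temperature."""
    grid = (t_max * i / steps for i in range(steps + 1))
    return max(1.0 / ou_temperature(a, t) for t in grid)
```

The inverse temperature is monotone in t, moving from 2A at t = 0 towards 1 as t grows, so the supremum is max(1, 2A) for every A > 0.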

As a consequence, we also have a fixed Maxwellian lower bound for $S_t F$ and $S_t G$. We estimate E(R), taking into account the fact that $S_t F \to M$, $S_t G \to M$. By the elementary inequality

$$ (f - g)\log\frac{f}{g} \le |f - M||\log f| + |g - M||\log g| + C|f - M|(1 + |X|^2) + C|g - M|(1 + |X|^2) $$

(which is easy to obtain distinguishing between the cases f ≥ g ≥ M, M ≥ f ≥ g, 2 ≥ f ≥ M ≥ g, f ≥ 2 ≥ M ≥ g and so on), we have

$$
\begin{aligned}
E(R) &\le \int_0^\infty dt \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, |S_t F - M||\log S_t F| && (99) \\
&\quad + \int_0^\infty dt \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, |S_t G - M||\log S_t G| && (100) \\
&\quad + C\int_0^\infty dt \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, |S_t F - M|(1 + |X|^2) && (101) \\
&\quad + C\int_0^\infty dt \int dX\, |X|^2\, 1_{|X| \ge R/\sqrt{2}}\, |S_t G - M|(1 + |X|^2). && (102)
\end{aligned}
$$

By convexity and changes of variables in the same spirit as before, we reduce to the problem of estimating only (99) and (101). We begin with (101). By the results recalled in Sect. 2, for all t > 0,

$$ \|S_t F - M\|_{L^1} \le \sqrt{2H(F|M)}\; e^{-t} $$

(of course H(F) = 2H(f) is finite), and for all r > 0,

$$ \|S_t F - M\|_{L^1_r} \le C_r\bigl(\|f\|_{L^1_r}\bigr) + \|M\|_{L^1_r}. $$

Hence, for any ε > 0, K > 0, separating between small and large |X|,

$$ \int dX\, (1 + |X|)^r\, |S_t F - M| \le CK^r \|S_t F - M\|_{L^1} + \frac{C}{K^\varepsilon}\|S_t F - M\|_{L^1_{r+\varepsilon}} \le CK^r e^{-t} + \frac{C_{r+\varepsilon}}{K^\varepsilon}, $$

where $C_{r+\varepsilon}$ depends only on $\|F\|_{L^1_{r+\varepsilon}}$ (i.e. on $\|f\|_{L^1_{r+\varepsilon}}$ and the normalization (36)). Choosing $K = C_{r+\varepsilon}^{1/(r+\varepsilon)}\, e^{t/(r+\varepsilon)}$, we get

$$ \|S_t F - M\|_{L^1_r} \le C e^{-\frac{\varepsilon}{r+\varepsilon} t}, $$

and therefore

$$ \int_0^\infty dt \int dX\, |X|^4\, 1_{|X| \ge R/\sqrt{2}}\, |S_t F - M| \le \frac{C}{R^s}\int_0^\infty dt\, \|S_t F - M\|_{L^1_{4+s}} \le \frac{C}{R^s}\,\frac{4 + s + \varepsilon}{\varepsilon}. $$

Finally, we handle the integral (99). Applying the same strategy as above and using Proposition 1, it is sufficient to prove that

$$ \int dX\, |S_t F - M||\log S_t F||X|^2 \xrightarrow[t\to\infty]{} 0 $$

with an exponential rate. To that purpose, we use the elementary inequality

$$ |x - y||\log x| \le |x - y|\Bigl|\log\frac{x}{y}\Bigr|\, 1_{x \le y} + |x - y||\log y| + C\Bigl(x\log\frac{x}{y} + y - x\Bigr). \tag{103} $$

To prove (103), it suffices to note that

$$ |x - y||\log x| \le |x - y||\log y| + |x - y|\Bigl|\log\frac{x}{y}\Bigr|, $$

and to bound the second term. By homogeneity, we just have to check that if z = (x/y) ≥ 1, then $(z - 1)\log z \le C(z\log z + 1 - z)$. This last inequality is easily obtained (note that both functions have vanishing derivatives of first order for z = 1). By Hölder's inequality, then (103) applied with $x = S_t F$ and y = M,

$$
\begin{aligned}
\int |S_t F - M||\log S_t F||X|^2
&\le C\Bigl(\int |S_t F - M||\log S_t F|\Bigr)^{\varepsilon/(1+\varepsilon)} \Bigl(\|f\|_{L^1_{2+2\varepsilon}\log L}\,\|f\|_{L^1_{4+2\varepsilon}}\Bigr)^{1/(1+\varepsilon)} \\
&\le C_\varepsilon \Bigl[\int dX\, |S_t F - M|(1 + |X|^2) + H(S_t F|M)\Bigr]^{\frac{1}{1+\varepsilon}}.
\end{aligned}
$$

The right-hand side converges exponentially fast to 0, and the desired conclusion follows by the same arguments as before. □

End of the proof of Theorem 7. By Propositions 8 and 9,

$$
\begin{aligned}
D(f) \ge \overline{D}(f) &\ge C(1 + R)^\gamma \Bigl[\frac{\lambda}{R^2}\, H(f|M) - \frac{E(R)}{R^2} - \frac{e(R)}{R^2}\Bigr] \\
&\ge C(1 + R)^{\gamma - 2}\Bigl[H(f|M) - \frac{C_{s+\varepsilon}}{R^s}\Bigr].
\end{aligned}
$$

Choosing $R = \max\bigl(2^{-1/s}\, C_{s+\varepsilon}^{1/s}\, H(f|M)^{-1/s},\; \sqrt{2}\bigr)$, we get the desired result. □

6. Extension to Other Kernels

Theorem 7 covers essentially all kernels B(z, ω) that are locally bounded below by |z · ω|/|z| (in dimension 3; $(|z\cdot\omega|/|z|)^{N-2}$ in the general case). The cross-section |z · ω|, for instance, corresponding to the hard-spheres potential, does not satisfy this assumption for |z| near 0. In order to obtain a result for such potentials, we just have to follow the strategy applied by Carlen and Carvalho [10], namely cut out (small) portions of the velocity space where B is small. In this example, we write

$$ |z\cdot\omega| \ge \frac{|z\cdot\omega|}{|z|} - \frac{|z\cdot\omega|}{|z|}\, 1_{|z| \le \varepsilon}, $$

and we estimate from above the entropy dissipation associated to $\psi(|z|) = 1_{|z| \le \varepsilon}$, i.e.

$$ \chi(\varepsilon) = \int dv\, dv_*\, 1_{|v - v_*| \le \varepsilon} \int d\sigma\, \bigl(f' f'_* - f f_*\bigr)\log\frac{f' f'_*}{f f_*}. $$

As ε goes to 0, χ(ε) → 0, and we have to estimate explicitly the corresponding rate of convergence. Here, it is not clear whether the assumptions $f \in L^1_s\log L \cap L^1_r$ and $f(v) \ge K e^{-A|v|^2}$ suffice to provide such an estimate. But as soon as, say, $L^2$ bounds are available for f, this can be done easily. Indeed, let

$$ D_\varepsilon(R) = \bigl\{(v, v_*) \in \mathbb{R}^{2N};\; |v - v_*| \le \varepsilon,\; |v|^2 + |v_*|^2 \le R\bigr\}. $$

Then, for all R > 0, s > 0,

$$ \chi(\varepsilon) \le \frac{C}{R^s}\int (|v|^2 + |v_*|^2)^{(s+1)/2}\, 1_{|v|^2 + |v_*|^2 \ge R^2}\, \bigl(f' f'_* - f f_*\bigr)\log\frac{f' f'_*}{f f_*} + \int_{D_\varepsilon(R) \times S^{N-1}} \bigl(f' f'_* - f f_*\bigr)\log\frac{f' f'_*}{f f_*}. \tag{104} $$

Using the Maxwellian lower bound, we write systematically

$$ \bigl(f' f'_* - f f_*\bigr)\log\frac{f' f'_*}{f f_*} \le f' f'_*\log(f' f'_*) + f f_*\log f f_* + C(f f_* + f' f'_*)(|v|^2 + |v_*|^2), $$

and changing primed to unprimed variables we reduce to estimating

$$ \int_{\mathbb{R}^{2N}} f f_*\log(f f_*)\,(|v|^2 + |v_*|^2)^{s/2} \tag{105} $$

and

$$ \int_{D_\varepsilon(R)} f f_*\log(f f_*). \tag{106} $$

Writing $\log f f_* = \log f + (\log f)_*$, we easily see that (105) is bounded by a constant depending on the norm of f in $L^1_s\log L$. On the other hand, if $f \in L^2$, then $f f_* \in L^2(\mathbb{R}^{2N})$, and this is enough to provide an explicit estimate of (106) in terms of the Lebesgue measure of $D_\varepsilon(R)$. In the end, it suffices to choose a convenient R, depending on ε.

We do not enter into the details of this computation, since several estimates of this type can be found in [10] for the Boltzmann equation, and in [21] for the Landau equation with hard potentials. The same technique also allows one to cover all kernels B for which

$$ \Bigl|\Bigl\{(z, \omega) \,/\, B(z, \omega) \le \varepsilon\,\psi(|z|)\Bigl(\frac{|z\cdot\omega|}{|z|}\Bigr)^{N-2},\; |z| \le R\Bigr\}\Bigr| = O(R^\beta \varepsilon^\delta) \tag{107} $$

for some positive numbers β < N, δ > 0, as R goes to infinity and ε goes to 0, that is, precisely those kernels such that the set of points for which they are not bounded below is of very small measure. As a conclusion, for all kernels B satisfying (107), our method enables one to obtain an algebraic estimate of the form

$$ D(f) \ge C_\alpha(f)\, H(f|M)^\alpha, \tag{108} $$

with α and $C_\alpha(f)$ explicitly computable, and depending on β, δ in (107), γ in (7), and various norms of f, say in weighted $L^2$.

Let us briefly comment on the conditions of Theorem 7. As proven recently by Pulvirenti and Wennberg [37], a Maxwellian lower bound for the solution to (1) in the case of hard potentials is available at any positive time, provided the initial datum has bounded energy and entropy. On the other hand, the Ornstein-Uhlenbeck semigroup produces such bounds. But, as the numerical applications in [10] show, these are terribly small, and it is not clear whether they would be useful. Essentially, for hard potentials, solutions automatically have a good decay at infinity, and with bounds that are uniform in time. In particular, weighted $L^p$-bounds on the solution propagate uniformly in time if sufficiently many moments exist at time t = 0. A fairly complete study of these uniform boundedness properties in $L^p_r$ was done by Gustafsson in [28]. This result was improved by Wennberg in [50]. In any case, provided the conditions of Gustafsson's theorems are satisfied, both weighted $L^2$-norms and the Maxwellian lower bound are available, uniformly in time. This allows us to transform the inequality (108) into a theorem of decay to equilibrium with explicit rate. On the contrary, for soft potentials it is an open problem whether the bounds can be found to hold uniformly in time. Yet a study of trend to equilibrium can be performed if these bounds are growing slowly: this will be the object of another work.

Let us also make some comments on the estimate from below for $\lambda = N - T_f$ in Theorem 7. Even though λ is estimated below in terms only of H(f) and the normalization (36), this gives very poor estimates, as the numerical applications in [21] show. Indeed, the entropy is very bad at controlling the concentration on sets of small measure. Here again, working in an $L^2$ framework enables far better estimates.

An interesting feedback effect was studied in [21]: under suitable assumptions, as time goes by, solutions of the Boltzmann equation converge towards the Maxwellian distribution, say in $L^1_2$, and hence all the directional temperatures of f(t) converge towards the equilibrium value 1. Therefore, the constant λ essentially becomes better with time, and is equal to N − 1 asymptotically. This effect is a direct consequence of the nonlinearity of the Boltzmann equation. By the way, note that in the particular case of radial solutions, one has always $T_f = 1$, hence λ = N − 1.

7. The Kac Model In this section, we show how the previous analysis can be extended to the Kac model. We recall that Kac’s collision operator reads, for a distribution f (v), v ∈ R, Z 2π dθ dv∗ f 0 f∗0 − ff∗ , (109) QK (f, f ) = 0

where we note by dθ the normalized measure on (0, 2π ), and  0  v = v cos θ − v∗ sin θ  v 0 = v sin θ + v cos θ. ∗ ∗

(110)

Hence the postcollisional velocities are simply obtained by a rotation of angle $\theta$ in the space $(v, v_*)$. There is only one collisional invariant in the Kac model, namely $e = (|v|^2 + |v_*|^2)/2$. It is clear that for any function $f$, $\int d\theta\, f' f'_*$ depends only on $e$. Proposition 3 extends trivially to the Kac model, and all the subsequent analysis can be done. But the peculiarity of the dimension 1 is that there is no corresponding Landau equation, since the orthogonal projection $\Pi$ defined by (57) is meaningless. Therefore, we have to change the proof of Proposition 8 because we cannot rely on the use of the linear operator (95). This can be linked to the following elementary observation. For any vector function $g$ on $\mathbb{R}^N$ (think of $g$ as $\nabla \log f$), the property

\[ \forall (v, v_*), \quad g(v) - g(v_*) = \lambda_{v,v_*} (v - v_*) \;\Longrightarrow\; \forall v, \quad g(v) = \lambda v + \mu \qquad (\lambda \in \mathbb{R},\ \mu \in \mathbb{R}^N), \]

holds in any dimension $N \ge 2$, but is obviously false if $N = 1$. The following proposition is thus a replacement for Proposition 3.

Proposition 10. Assume $N = 1$. Let $f$ be a smooth function with logarithm quadratically bounded, with unit mass and temperature, and let $G$ be a function of $e$. Then

\[ \int dX \, F \, |X|^2 \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2 \ge I(f|M). \tag{111} \]

Proof. This time,

\[ \frac{\nabla G}{G} = \left( v \, \frac{G'}{G}(e), \; v_* \, \frac{G'}{G}(e) \right). \]

Hence we can apply the "killing operator" $P(v, v_*) : [A, B] \longmapsto v_* A - v B$. For all $(v, v_*)$, the square norm of $P$ is bounded by $2(|v|^2 + |v_*|^2) = 2|X|^2$. Defining

\[ K(v, v_*) = \frac{\nabla F}{F} - \frac{\nabla G}{G}, \]

we see that

\[ P K(v, v_*) = v_* \frac{f'(v)}{f(v)} - v \frac{f'(v_*)}{f(v_*)}. \]

Hence,

\[
\begin{aligned}
\int dX \, F \, |X|^2 \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2
&\ge \frac{1}{2} \int dv \, dv_* \, f f_* \left| v_* \frac{f'(v)}{f(v)} - v \frac{f'(v_*)}{f(v_*)} \right|^2 \\
&= \int dv_* \, f(v_*) |v_*|^2 \int dv \, \frac{f'(v)^2}{f(v)} - \int dv \, v f'(v) \int dv_* \, v_* f'(v_*).
\end{aligned}
\]

Integrating by parts the second term, we obtain simply $I(f) - 1 = I(f|M)$. $\square$
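The last step of the proof can be spelled out as follows. This is only a sketch: it assumes that the normalization means $\int f\,dv = 1$, $\int v f\,dv = 0$ and $\int |v|^2 f\,dv = 1$ (unit mass, zero mean, unit temperature in dimension one), that $M$ is the standard Gaussian, and that boundary terms vanish at infinity.

```latex
\begin{align*}
\int dv\, v f'(v) &= -\int dv\, f(v) = -1
   && \text{(integration by parts)} \\
\int dv_*\, f(v_*)\,|v_*|^2 \int dv\, \frac{f'(v)^2}{f(v)}
   - \Bigl(\int dv\, v f'(v)\Bigr)^2
   &= 1 \cdot I(f) - (-1)^2 = I(f) - 1 , \\
I(f|M) = \int dv\, f(v) \left| \frac{f'(v)}{f(v)} + v \right|^2
   &= I(f) + 2\int dv\, v f'(v) + \int dv\, |v|^2 f(v) \\
   &= I(f) - 2 + 1 = I(f) - 1 .
\end{align*}
```

Both expressions equal $I(f) - 1$, which is the identity used to close the proof.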

At this point we can apply the ideas of the previous section, concluding in particular with the algebraic decay to equilibrium in relative entropy of the solution to the Kac equation. On the other hand, due to the particular symmetries of this one-dimensional model, Theorem 7 can be improved in a number of ways.

Theorem 11. Assume $N = 1$. Let $f \in L^1_2(\mathbb{R})$ satisfy the normalization (36). If in addition, for some $s > 0$, $\|f\|_{L^1_{2+s}} < \infty$, and

\[ I_2(f) = \int dv \, \frac{f'(v)^4}{f(v)^3} < \infty, \tag{112} \]

then for all $\varepsilon > 0$ there exists a constant $C_{s,\varepsilon}(f)$ depending only on $s$, $\varepsilon$, $\|f\|_{L^1_{2+s}}$ and $I_2(f)$, such that

\[ D(f) \ge C_{s,\varepsilon}(f) \, H(f|M)^{1 + (2+\varepsilon)/s}. \tag{113} \]

Remark. It is known that if $I_2(f_0)$ is finite, then $I_2(f_t)$ remains bounded, uniformly in time, if $f_t$ is the solution to the Kac equation with initial datum $f_0$: more precisely [25, 40],

\[ I_2(S_t f) \le \max\bigl\{ I_2(f_0), \; C \, I(f_0)^2 \bigr\}, \]

where $C$ is numerical.

Proof. Let us repeat the argument of Proposition 10. We consider a function $f$ which is smooth and whose logarithm is quadratically bounded. Let $G$ be a function of $e$. Then, for any $R > 0$,

\[
\begin{aligned}
\int dX \, F \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2
&\ge \frac{1}{2R^2} \int_{|v| \le R,\, |v_*| \le R} dX \, F \, |X|^2 \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2 \\
&\ge \frac{1}{4R^2} \int_{|v| \le R,\, |v_*| \le R} dv \, dv_* \, f f_* \left| v_* \frac{f'(v)}{f(v)} - v \frac{f'(v_*)}{f(v_*)} \right|^2 \\
&= \frac{1}{2R^2} \int_{|v_*| \le R} dv_* \, f(v_*) |v_*|^2 \int_{|v| \le R} dv \, \frac{f'(v)^2}{f(v)}
 - \frac{1}{2R^2} \int_{|v| \le R} dv \, v f'(v) \int_{|v_*| \le R} dv_* \, v_* f'(v_*) \\
&\equiv \frac{1}{2R^2} \, \Delta(f).
\end{aligned}
\tag{114}
\]

Writing

\[ \int_{|v| \le R} dv \, \frac{f'(v)^2}{f(v)} = I(f) - \int_{|v| > R} dv \, \frac{f'(v)^2}{f(v)}, \]

\[ \int_{|v| \le R} dv \, v f'(v) = -1 - \int_{|v| > R} dv \, v f'(v), \]

we see that

\[
\begin{aligned}
\Delta(f) = I(f|M) &+ \int_{|v| > R} dv \, |v|^2 f(v) \int_{|v| > R} dv \, \frac{f'(v)^2}{f(v)}
 - \int_{|v| > R} dv \, \frac{f'(v)^2}{f(v)}
 - I(f) \int_{|v| > R} dv \, |v|^2 f(v) \\
&- 2 \int_{|v| > R} dv \, v f'(v)
 - \left( \int_{|v| > R} dv \, v f'(v) \right)^2.
\end{aligned}
\tag{115}
\]

By the Cauchy–Schwarz inequality,

\[ \left( \int_{|v| > R} dv \, v f'(v) \right)^2 \le \int_{|v| > R} dv \, |v|^2 f(v) \int_{|v| > R} dv \, \frac{f'(v)^2}{f(v)}. \tag{116} \]

Thus, $\Delta(f) \ge I(f|M) - G_R(f)$, where we define

\[
\begin{aligned}
G_R(f) &= I(f) \int_{|v| > R} dv \, |v|^2 f(v) + \int_{|v| > R} dv \, \frac{f'(v)^2}{f(v)} + 2 \int_{|v| > R} dv \, v f'(v) \\
&= I(f|M) \int_{|v| > R} dv \, |v|^2 f(v) + \int_{|v| > R} dv \left[ \frac{f'(v)^2}{f(v)} + |v|^2 f(v) + 2 v f'(v) \right] \\
&= I(f|M) \int_{|v| > R} dv \, |v|^2 f(v) + \int_{|v| > R} dv \left| \frac{f'(v)}{f(v)} + v \right|^2 f(v).
\end{aligned}
\tag{117}
\]

Now, we estimate $G_R(S_t f)$ for $t > 0$. First, clearly,

\[ I(S_t f|M) \int_{|v| > R} dv \, |v|^2 (S_t f)(v) \le \frac{C_s}{R^s} \, I(S_t f|M), \]

where $C_s$ depends only on $\|f\|_{L^1_{2+s}}$.


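Next, for all $\varepsilon > 0$,

\[
\begin{aligned}
\int_{|v| > R} dv \left| \frac{(S_t f)'(v)}{(S_t f)(v)} + v \right|^2 (S_t f)(v)
&\le I(S_t f|M)^{\frac{\varepsilon}{1+\varepsilon}} \left( 2 \int_{|v| > R} dv \, \frac{(S_t f)'(v)^2}{S_t f(v)} + 2 \int_{|v| > R} dv \, |v|^2 (S_t f)(v) \right)^{\frac{1}{1+\varepsilon}} \\
&\le I(f|M)^{\frac{\varepsilon}{1+\varepsilon}} \, e^{-\frac{2\varepsilon t}{1+\varepsilon}} \left( 2 \int_{|v| > R} dv \, \frac{(S_t f)'(v)^2}{S_t f(v)} + 2 \, \frac{C_s}{R^s} \right)^{\frac{1}{1+\varepsilon}},
\end{aligned}
\]

where we have used (45). By the Cauchy–Schwarz inequality,

\[ \int_{|v| > R} dv \, \frac{(S_t f)'(v)^2}{S_t f(v)} \le I_2(S_t f)^{1/2} \left( \int_{|v| > R} dv \, S_t f(v) \right)^{1/2}. \]

Next, we use the fact that $I_2$ is bounded uniformly in time for solutions of the one-dimensional Fokker–Planck equation [40]:

\[ I_2(S_t f) \le \max\bigl\{ e^4 I_2(f), \; (1 - e^{-2})^{-2} I_2(M) \bigr\}. \]

The proof of this inequality relies on the method developed by Lions and the first author [32] for proving refined estimates of the central limit theorem. Combining this with the boundedness of $\|f\|_{L^1_{2+s}}$, we easily obtain

\[ G_R(S_t f) \le \frac{C_s}{R^s} \, I(S_t f|M)^{\frac{\varepsilon}{1+\varepsilon}} + \frac{C_{s,\varepsilon}}{R^{s/(1+\varepsilon)}} \, e^{-\frac{2\varepsilon t}{1+\varepsilon}}. \tag{118} \]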
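The Cauchy–Schwarz step used above to bring in $I_2$ can be written out explicitly. The sketch below uses the shorthand $g = S_t f$ (a notation introduced here only for brevity) and assumes $g > 0$; the last line is Chebyshev's inequality, included to indicate how a moment of order $2+s$ produces decay in $R$.

```latex
\begin{align*}
\int_{|v|>R} dv\, \frac{g'(v)^2}{g(v)}
  &= \int_{|v|>R} dv\, \frac{g'(v)^2}{g(v)^{3/2}}\; g(v)^{1/2} \\
  &\le \Bigl( \int_{|v|>R} dv\, \frac{g'(v)^4}{g(v)^3} \Bigr)^{1/2}
       \Bigl( \int_{|v|>R} dv\, g(v) \Bigr)^{1/2}
   \le I_2(g)^{1/2} \Bigl( \int_{|v|>R} dv\, g(v) \Bigr)^{1/2}, \\
\int_{|v|>R} dv\, g(v)
  &\le \frac{1}{R^{2+s}} \int dv\, |v|^{2+s} g(v).
\end{align*}
```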

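Since (by Theorem 6 with $\psi \equiv 1$)

\[ D(f) \ge \frac{1}{2R^2} \int_0^\infty dt \, \bigl[ I(S_t f|M) - G_R(S_t f) \bigr], \]

by (118) we see that if $R \ge R_s$ (depending only on $s$ and on $\|f\|_{L^1_{2+s}}$, with $R_s \ge 1$),

\[ D(f) \ge \frac{1}{2R^2} \left[ \frac{1}{2} H(f|M) - \frac{C_{s,\varepsilon}}{R^{s/(1+\varepsilon)}} \right], \tag{119} \]

where $C_{s,\varepsilon}$ depends only on $I_2(f)$, $\|f\|_{L^1_{2+s}}$, $s$ and $\varepsilon$. Note indeed that if, for some $a > 0$, we denote by $t_0$ the first time $t$ such that $I(S_t f|M) \le a$, then (this is a rough estimate!)

\[
\begin{aligned}
\int_0^{+\infty} I(S_t f|M)^{\frac{\varepsilon}{1+\varepsilon}} \, dt
&= \int_0^{t_0} I(S_t f|M)^{\frac{\varepsilon}{1+\varepsilon}} \, dt + \int_{t_0}^{+\infty} I(S_t f|M)^{\frac{\varepsilon}{1+\varepsilon}} \, dt \\
&\le a^{-\frac{1}{1+\varepsilon}} \int_0^{+\infty} I(S_t f|M) \, dt + a^{\frac{\varepsilon}{1+\varepsilon}} \int_0^{+\infty} e^{-2 \frac{\varepsilon}{1+\varepsilon} t} \, dt,
\end{aligned}
\]

and we just have to choose $C_s \, a^{-1/(1+\varepsilon)} = 1/2$ to make sure that the estimate (119) holds. Choosing in (119)

\[ R^{-s/(1+\varepsilon)} = \min\bigl\{ (4 C_{s,\varepsilon})^{-1} H(f|M), \; R_s^{-s/(1+\varepsilon)} \bigr\}, \]

we get the result (with $2\varepsilon$ in place of $\varepsilon$). $\square$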
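The final choice of $R$ can be checked by a short computation. The sketch below treats only the case in which the minimum is attained by the term involving $H(f|M)$, and writes $h = H(f|M)$ and $C = C_{s,\varepsilon}$ as shorthand (both abbreviations are introduced here, not in the original).

```latex
\begin{align*}
R^{-s/(1+\varepsilon)} = \frac{h}{4C}
  \quad &\Longrightarrow \quad
  \frac12\, h - \frac{C}{R^{s/(1+\varepsilon)}} = \frac12\, h - \frac14\, h = \frac14\, h, \\
R^{-2} = \bigl( R^{-s/(1+\varepsilon)} \bigr)^{2(1+\varepsilon)/s}
       &= \Bigl( \frac{h}{4C} \Bigr)^{2(1+\varepsilon)/s}, \\
D(f) \ \ge\ \frac{1}{2R^2} \cdot \frac{h}{4}
       &= \frac18\, (4C)^{-2(1+\varepsilon)/s}\; h^{\,1 + (2+2\varepsilon)/s}.
\end{align*}
```

The exponent $1 + (2 + 2\varepsilon)/s$ is that of (113) with $2\varepsilon$ in place of $\varepsilon$, as stated.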


Theorem 11 gives a lower bound on the entropy production for the Kac equation with a constant depending on the functional $I_2$. The boundedness of this functional substitutes for the bounds on $L^1_s \log L$, and the moment condition is also better. On the other hand, this functional has been shown to be uniformly bounded in time (cf. [23]) along the solution to the Kac equation. Hence, if the initial datum $f_0$ for the Kac equation is such that $I_2(f_0) < \infty$, the decay to equilibrium follows.

McKean [34] has studied the rate of convergence to equilibrium in the Kac model, conjecturing the existence of a lower bound of type (113), without specifying the nature of the positive term on the right side. McKean's paper contains a number of general ideas, including the validity of formula (53), as well as the introduction of Fisher information (which he called the Linnik functional), and its connection with the trend to equilibrium. For his study of the decay, McKean used the regularization of the solution $f(t)$, taking the convolution with a Maxwellian of small energy: $f^\delta = f * M_\delta$. Then $I_2(f^\delta)$ is bounded, and the conditions of the previous theorem are automatically satisfied if some moment of order higher than 2 is finite.

8. Remarks About Fisher Information and Entropy Dissipation

We begin with a trivial assertion. Let $(B^t)_{t \ge 0}$ be a semigroup commuting with the adjoint Ornstein-Uhlenbeck semigroup $(S_t)_{t \ge 0}$, and let $D$ be the associated entropy dissipation functional. Then

\[ \left. \frac{d}{dt} \right|_{t=0} D(S_t f) = \left. \frac{d}{dt} \right|_{t=0} I(B^t f). \tag{120} \]

Indeed, in view of the commuting property, both terms are equal to

\[ \left. \frac{d}{dt} \right|_{t=0} \left. \frac{d}{ds} \right|_{s=0} H(B^s S_t f). \]

In other words, the evolution of the entropy dissipation along the adjoint Ornstein-Uhlenbeck semigroup is given by the evolution of the Fisher information along the semigroup $(B^t)$. As a first application, we can recover in a straightforward way the first term in the right-hand side of (88), without the use of formula (78). More precisely, we shall show that for any two smooth functions $F$ and $G$,

\[ \left. \frac{d}{dt} \right|_{t=0} \int dX \, (S_t F - S_t G) \log \frac{S_t F}{S_t G} = - \int dX \, (F + G) \left| \frac{\nabla F}{F} - \frac{\nabla G}{G} \right|^2. \tag{121} \]

To this purpose, we introduce the linear system

\[ \begin{cases} \partial_t F = (G - F) \\ \partial_t G = (F - G). \end{cases} \tag{122} \]

It is clear that, if we set $H = H(F) + H(G)$, then the time-dissipation of $H$ associated to the system (122) is the functional

\[ D(F, G) = \int (F - G) \log \frac{F}{G}. \]
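The dissipation identity for the linear system (122) can be verified directly. The sketch below assumes the usual convention $H(F) = \int F \log F$ and enough decay to differentiate under the integral sign.

```latex
\begin{align*}
\frac{d}{dt}\bigl[ H(F) + H(G) \bigr]
  &= \int (\log F + 1)\,\partial_t F + \int (\log G + 1)\,\partial_t G \\
  &= \int (\log F + 1)(G - F) + \int (\log G + 1)(F - G) \\
  &= \int (G - F)\,(\log F - \log G)
   = - \int (F - G) \log\frac{F}{G}
   = - D(F, G).
\end{align*}
```

Since $(x - y)\log(x/y) \ge 0$ for all $x, y > 0$, this also confirms that $D(F, G)$ is nonnegative.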


But the semigroup associated to the system (122) obviously commutes with the system

\[ \begin{cases} \partial_t F = LF \\ \partial_t G = LG, \end{cases} \tag{123} \]

where $L$ stands for the linear Fokker–Planck operator, since $LF - LG = L(F - G)$. Hence, to prove (121), it is sufficient to compute the time-derivative of $I = I(F) + I(G)$ along solutions of the system (122). This computation is immediate and yields the desired result.

Let us now choose for $(B^t)$ the semigroup associated to the Boltzmann equation with Maxwellian molecules, and reason the other way around. In view of Bobylev's lemma [5], $(B^t)$ commutes with $(S_t)$. Hence, in all the cases when the entropy dissipation $D$ is directly seen to be decreasing under evolution by the adjoint Ornstein-Uhlenbeck semigroup, we recover by the remark above an alternative proof that $I$ is decreasing along solutions of the Boltzmann equation (note that it suffices to deal with smooth functions, because of the regularizing properties of the Fokker–Planck equation). That $D$ is decreasing along $(S_t)$ can be seen directly at least in three different cases, with the help of the results of Sect. 3.

The case $N = 2$. For Maxwellian molecules in two dimensions, we can write, following the notations of Proposition 3,

\[ D(f) = \int_0^\pi \zeta(\theta) \, D(F, {}^t R_\theta F) \]

for some nonnegative function $\zeta$. For each $\theta$, $D(F, {}^t R_\theta F)$ is decreasing along the adjoint Ornstein-Uhlenbeck semigroup, and hence $D$ also. As a consequence, we have a new proof of the result by Toscani [38] that $I$ is decreasing along solutions of the Boltzmann equation with Maxwellian molecules in two dimensions.

The case $B(z, \omega)$ constant. If $B$ is a constant, then we can write, following the notations of Proposition 4,

\[ D(f) = \int_{S^{N-1}} d\omega \, D(F, {}^t T_\omega F), \]

and conclude as before. This gives a new proof of the result by Carlen and Carvalho [9] that $I$ is decreasing along solutions of the Boltzmann equation with constant kernel.

In fact, in both of the previous cases, one can also adapt and simplify the argument given by McKean for Kac's model: to prove that $D(f * M_\delta) \le D(f)$ (which implies the decreasing property of $I$ by differentiation in $\delta$), write $j(x, y) = (x - y) \log(x/y)$ and note that, by Jensen's inequality, for given $\theta$ (or $\omega$, with $T_\omega$ in place of $R_\theta$),

\[ j\bigl( F * M_\delta, \; ({}^t R_\theta F) * M_\delta \bigr) \le j(F, {}^t R_\theta F) * M_\delta. \]

Integration with respect to $dv \, dv_*$, and use of the translational invariance of $D$, then yield

\[ D\bigl( F * M_\delta, \; ({}^t R_\theta F) * M_\delta \bigr) \le D(F, {}^t R_\theta F), \]

and the conclusion follows.


The case $\widetilde{B}(z, \sigma)$ constant. For simplicity we treat the case $N = 3$. If $B = |z \cdot \omega| / |z|$, then $\widetilde{B}$ is constant, and

\[ D(f) = \int_{S^{N-1}} d\omega \int dX \, |k \cdot \omega| \, (F - G_\omega) \log \frac{F}{G_\omega}, \]

with $G_\omega = {}^t T_\omega F$. Applying Proposition 5, we see that it suffices to prove that for each $\omega$,

\[ \int dX \, L^*(|k \cdot \omega|) \, (F - G_\omega) \log \frac{F}{G_\omega} \le 0. \tag{124} \]

Let us compute $L^*(|k \cdot \omega|)$. First, using the general formula

\[ \nabla_v [b(k \cdot \omega)] = \frac{1}{|v - v_*|} \, b'(k \cdot \omega) \, \Pi_{k^\perp} \omega, \tag{125} \]

we find that

\[ \nabla_X (|k \cdot \omega|) = \frac{\operatorname{sgn}(k \cdot \omega)}{|v - v_*|} \bigl( \Pi_{k^\perp} \omega, \; -\Pi_{k^\perp} \omega \bigr). \]

This term is well-defined only for $k \cdot \omega \ne 0$, but since we can restrict to functions $F$ and $G$ that are smooth, this does not matter here. As a consequence, $X \cdot \nabla_X (|k \cdot \omega|)$ is a multiple of $v \cdot \Pi_{k^\perp} \omega - v_* \cdot \Pi_{k^\perp} \omega = (v - v_*) \cdot \Pi_{k^\perp} \omega = 0$. A similar computation shows that

\[
\begin{aligned}
\Delta_X (|k \cdot \omega|) = {} & -2 \operatorname{sgn}(k \cdot \omega) \, \frac{v - v_*}{|v - v_*|^3} \cdot \Pi_{k^\perp} \omega && (126) \\
& + \frac{4}{|v - v_*|^2} \, \delta_{(k \cdot \omega) = 0} \, \Pi_{k^\perp} \omega \cdot \Pi_{k^\perp} \omega && (127) \\
& - \frac{4}{|v - v_*|^2} \, |k \cdot \omega|, && (128)
\end{aligned}
\]

where to compute the last term we have used formula (125) and the relation $\operatorname{sgn}(u) u = |u|$. The expression (126) is 0 because $v - v_*$ and $\Pi_{k^\perp} \omega$ are orthogonal. The contribution of (127) to the integral (124) is also 0 because when $(k \cdot \omega) = 0$, then $F = G_\omega$, and $F$, $G_\omega$ are smooth. Finally, the expression (128) is nonpositive. Gathering all of this, we obtain that the inequality (124) actually holds.

We do not know if by this method one can recover the general theorem that Fisher's information is decreasing along solutions of the Boltzmann equation with Maxwellian molecules in any dimension [47]. But we find this connection with the problem of finding a lower bound for the entropy dissipation rather striking.

Acknowledgement. The main part of this work was done while the second author was visiting the Mathematics Department of the University of Pavia. It is a pleasure for him to thank the whole Department for their kind hospitality. The first author acknowledges the partial support of the National Council for Researches of Italy, Gruppo Nazionale per la Fisica Matematica. Both authors acknowledge the support of the European TMR Project Kinetic Applications, contract ERB FMBX-CT97-0157.


References

1. Abrahamsson, F.: Strong $L^1$ convergence to equilibrium without entropy conditions for the spatially homogeneous Boltzmann equation. Preprint NO 1997-43, Dept. of Math., Chalmers Univ. of Tech. Göteborg, 1997
2. Arnold, A., Markowich, P., Toscani, G. and Unterreiter, A.: On logarithmic Sobolev inequalities, Csiszar-Kullback inequalities, and the rate of convergence to equilibrium for Fokker–Planck type equations. Preprint, 1997
3. Arkeryd, L.: Stability in $L^1$ for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103 (2), 151–167 (1988)
4. Bakry, D. and Emery, M.: Diffusions hypercontractives. (in French) Lect. Notes Math. 1123, Sém. XIX: 177–206, 1985
5. Bobylev, A.V.: The theory of the nonlinear, spatially uniform Boltzmann equation for Maxwellian molecules. Sov. Sci. Rev. C. Math. Phys. 7, 111–233 (1988)
6. Bobylev, A.V. and Cercignani, C.: On the rate of entropy production for the Boltzmann equation. J. Stat. Phys. 94, 603–618 (1999)
7. Boltzmann, L.: Lectures on Gas Theory. Reprinted by Dover Publications, 1995
8. Carlen, E.: Superadditivity of Fisher's information and logarithmic Sobolev inequalities. J. Funct. Anal. 101 (1), 194–211 (1991)
9. Carlen, E. and Carvalho, M.: Strict entropy production bounds and stability of the rate of convergence to equilibrium for the Boltzmann equation. J. Stat. Phys. 67 (3–4), 575–608 (1992)
10. Carlen, E. and Carvalho, M.: Entropy production estimates for Boltzmann equations with physically realistic collision kernels. J. Stat. Phys. 74 (3–4), 743–782 (1994)
11. Carlen, E., Esposito, R., Lebowitz, J.L., Marra, R. and Rokhlenko, A.: Kinetics of a model weakly ionized plasma in the presence of multiple equilibria. Preprint, 1996
12. Carlen, E., Gabetta, E. and Toscani, G.: Propagation of smoothness in velocities and strong exponential convergence for Maxwellian molecules. Commun. Math. Phys. 199, 521–546 (1999)
13. Carlen, E. and Soffer, A.: Entropy production by block variable summation and central limit theorems. Commun. Math. Phys. 140, 339–371 (1991)
14. Carrillo, J.A. and Toscani, G.: Exponential convergence toward equilibrium for homogeneous Fokker-Planck-type equations. Math. Mod. Meth. Appl. Sci. 21, 1269–1286 (1998)
15. Cercignani, C.: H-theorem and trend to equilibrium in the kinetic theory of gases. Arch. Mech. 34, 231–241 (1982)
16. Cercignani, C.: The Boltzmann equation and its applications. New York: Springer, 1988
17. Cercignani, C., Illner, R. and Pulvirenti, M.: The mathematical theory of dilute gases. New York: Springer-Verlag, 1994
18. Csiszar, I.: Information-type measures of difference of probability distributions and indirect observations. Stud. Sci. Math. Hung. 2, 299–318 (1967)
19. Desvillettes, L.: Entropy dissipation rate and convergence in kinetic equations. Commun. Math. Phys. 123 (4), 687–702 (1989)
20. Desvillettes, L. and Villani, C.: On the spatially homogeneous Landau equation for hard potentials. Part I: Existence, uniqueness and smoothness. To appear in Comm. P.D.E.
21. Desvillettes, L. and Villani, C.: On the spatially homogeneous Landau equation for hard potentials. Part II: H-theorem and applications. To appear in Comm. P.D.E.
22. Donsker, M.D. and Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time, I. Comm. Pure Appl. Math. 28, 1–47 (1975)
23. Gabetta, E.: On a conjecture of McKean with application to Kac's model. Transp. Theo. Statis. Phys. 24, 305–318 (1995)
24. Gabetta, E. and Toscani, G.: On entropy production rates for some kinetic equations. Bull. Tech. Univ. Istanbul 47, 219–230 (1994)
25. Gabetta, E. and Pareschi, L.: About the non cut-off Kac equation: uniqueness and asymptotic behaviour. Comm. Appl. Nonlinear Anal. 4, 1–20 (1997)
26. Gabetta, E., Toscani, G. and Wennberg, B.: Metrics for probability distributions and the trend to equilibrium for solutions of the Boltzmann equation. J. Stat. Phys. 81, 901–934 (1995)
27. Gross, L.: Logarithmic Sobolev inequalities. Am. J. Math. 97, 1061–1083 (1975)
28. Gustafsson, T.: Global $L^p$-properties for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103, 1–39 (1988)
29. Kullback, S.: A lower bound for discrimination information in terms of variation. IEEE Trans. Inf. The. 4, 126–127 (1967)
30. Lifchitz, E.M. and Pitaevskii, L.P.: Physical Kinetics – Course in theoretical physics. Vol. 10, Oxford: Pergamon, 1981
31. Lions, P.L.: Compactness in Boltzmann's equation via Fourier integral operators and applications. J. Math. Kyoto Univ. 34 (2), 391–427 (1994)
32. Lions, P.L. and Toscani, G.: A strengthened central limit theorem for smooth densities. J. Funct. Anal. 128, 148–167 (1995)
33. Maxwell, J.C.: On the dynamical theory of gases. Phil. Trans. R. Soc. Lond. 157, 49–88 (1867)
34. McKean, H.P.: Speed of approach to equilibrium for Kac's caricature of a Maxwellian gas. Arch. Rat. Mech. Anal. 21, 343–367 (1966)
35. Morgenstern, D.: Analytical studies related to the Maxwell-Boltzmann equation. J. Rat. Mech. Anal. 4, 533–555 (195)
36. Perthame, B.: Introduction to the collision models in Boltzmann's theory. Preprint, Univ. Pierre et Marie Curie, 1995
37. Pulvirenti, A. and Wennberg, B.: A Maxwellian lower bound for solutions to the Boltzmann equation. Commun. Math. Phys. 183, 145–160 (1997)
38. Toscani, G.: New a priori estimates for the spatially homogeneous Boltzmann equation. Cont. Mech. Thermodyn. 4, 81–93 (1992)
39. Toscani, G.: Entropy production and the rate of convergence to equilibrium for the Fokker-Planck equation. To appear in Quarterly of Appl. Math.
40. Toscani, G.: The grazing collisions asymptotics of the non cut-off Kac equation. Math. Mod. Num. An. 32, 763–772 (1998)
41. Toscani, G. and Villani, C.: Probability metrics and uniqueness of the solution to the Boltzmann equation for a Maxwell gas. J. Stat. Phys. 94, 619–637 (1999)
42. Truesdell, C. and Muncaster, R.G.: Fundamentals of Maxwell's kinetic theory of a simple monatomic gas. New York: Academic Press, 1980
43. Villani, C.: On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations. Arch. Rat. Mech. Anal. 143 (3), 241–271 (1998)
44. Villani, C.: On the spatially homogeneous Landau equation for Maxwellian molecules. Math. Mod. Meth. Appl. Sci. 8 (6), 957–983 (1998)
45. Villani, C.: Conservative forms of Boltzmann's collision operator: Landau revisited. To appear in Math. Mod. Num. An.
46. Villani, C.: Decrease of the Fisher information for the Landau equation with Maxwellian molecules. To appear in Math. Mod. Meth. Appl. Sci.
47. Villani, C.: Fisher information bounds for Boltzmann's collision operator. J. Math. Pures Appl. 77, 821–837 (1998)
48. Wennberg, B.: On an entropy dissipation inequality for the Boltzmann equation. C. R. Acad. Sci. Paris, t. 315, Série I, 1441–1446 (1992)
49. Wennberg, B.: Stability and exponential convergence for the Boltzmann equation. PhD thesis, Chalmers Univ. Tech., 1993
50. Wennberg, B.: Regularity in the Boltzmann equation and the Radon transform. Comm. P.D.E. 11 & 12 (19), 2057–2074 (1994)
51. Wennberg, B.: Entropy dissipation and moment production for the Boltzmann equation. J. Stat. Phys. 86 (5/6), 1053–1066 (1997)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 203, 707 – 712 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

Geometry of the Constraint Sets for Yang–Mills–Dirac Equations with Inhomogeneous Boundary Conditions

Jędrzej Śniatycki

Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada

Received: 29 June 1998 / Accepted: 28 December 1998

Abstract: The constraint equation for minimally coupled Yang–Mills and Dirac fields in bounded domains is studied under the inhomogeneous boundary conditions which admit unique solutions of the evolution equations. For each value of the boundary data, the constraint set is shown to be a submanifold of the extended phase space. It is a principal fibre bundle over the reduced phase space with structure group consisting of the gauge symmetries which coincide on the boundary with the identity transformation up to the first order of contact.

1. Introduction

Let $M$ be a compact domain in $\mathbb{R}^3$ with smooth boundary $\partial M$, representing the physical space accessible to the fields, and $\mathfrak{g}$ the Lie algebra of a compact Lie group $G$ presented as a subgroup of the automorphism group of a finite dimensional vector space $V$. The Cauchy data $(A, E, \Psi)$ for the equations under investigation form the extended phase space

\[ P = \{ (A, E, \Psi) \in H^2(M, \mathfrak{g} \otimes \mathbb{R}^3) \times H^1(M, \mathfrak{g} \otimes \mathbb{R}^3) \times H^2(M, \mathbb{C}^4 \otimes V) \}. \tag{1} \]

Here the Cauchy data $(A, E)$ for the Yang–Mills fields are considered to be $\mathfrak{g}$-valued vector fields on $M$. The matter fields are Dirac spinor fields $\Psi$ on $M$ with values in the vector space $V$. In the preceding paper, [1], we have established local-in-time existence and uniqueness of solutions of the Yang–Mills–Dirac evolution equations subject to the gauge condition

\[ \Delta A_0 = -\operatorname{div} E, \qquad n(\operatorname{grad} A_0) = -nE \quad \text{and} \quad \int_M A_0 \, d^3x = 0, \]

and the boundary conditions

\[ t(\operatorname{curl} A(t)) = \lambda(t) \in H^{1/2}(\Gamma(T\partial M \otimes \mathfrak{g})), \tag{2} \]

\[ \tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \Psi(t) \big|_{\partial M} = \mu(t) \in H^{3/2}(\partial M, \mathbb{C}^4 \otimes V), \tag{3} \]

\[ -\tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \gamma^0 (\gamma^k \partial_k + im) \Psi(t) \big|_{\partial M} = \nu(t) \in H^{1/2}(\partial M, \mathbb{C}^4 \otimes V), \tag{4} \]

where $A_0$ is the scalar potential, $t$ and $n$ denote the components tangential and normal to $\partial M$, respectively, and $\gamma^0$ and $\gamma^k$ are the Dirac matrices acting in $\mathbb{C}^4$. The boundary data $(\lambda(t), \mu(t), \nu(t))$ are not arbitrary, but have to satisfy the following consistency conditions:

\[ \lambda = \lambda_1 + \operatorname{grad} \chi, \quad \text{where } \lambda_1 \in H^{3/2}(\Gamma(T\partial M \otimes \mathfrak{g})) \text{ and } \chi \in H^{3/2}(\partial M, \mathfrak{g}); \tag{5} \]

\[ (\mathrm{Id} + i\gamma^k n_k) \mu = 0; \tag{6} \]

\[ (\mathrm{Id} + i\gamma^k n_k) \nu = 0. \tag{7} \]

We denote by $B$ the space of the values of the boundary data, that is $\beta = (\lambda, \mu, \nu) \in B$ if and only if $\lambda \in H^{1/2}(\Gamma(T\partial M \otimes \mathfrak{g}))$ satisfies Eq. (5), $\mu \in H^{3/2}(\partial M, \mathbb{C}^4 \otimes V)$ satisfies Eq. (6), and $\nu \in H^{1/2}(\partial M, \mathbb{C}^4 \otimes V)$ satisfies Eq. (7). The extended phase space $P$ is weakly symplectic, with the weak symplectic form $\omega = d\theta$, where

\[ \langle \theta(A, E, \Psi) \mid (a, e, \psi) \rangle = \int_M (E \cdot a + \Psi^\dagger \psi) \, d^3x. \tag{8} \]

For each $\beta = (\lambda, \mu, \nu) \in B$ we denote by $P_\beta \subset P$ the space of the Cauchy data satisfying the boundary conditions given by Eqs. (2), (3), and (4). It is a closed subspace of the Hilbert space $P$, and the pull-back of $\omega$ to $P_\beta$ is weakly symplectic. However, $P_\beta$ has no symplectic complement in $P$. The constraint equation of the Yang–Mills–Dirac theory is

\[ \operatorname{div} E + [A; E] = \Psi^\dagger (I \otimes T^a) \Psi \, T_a, \tag{9} \]

where $T^a$ is a basis in $\mathfrak{g}$. The space of solutions in $P$ of the constraint equation is called the constraint set and it is denoted by $C$. The aim of this paper is to describe the structure of $C_\beta = C \cap P_\beta$, that is, the space of solutions in $P$ of the constraint equation with the boundary data $\beta$.

Theorem 1. For each $\beta \in B$, the constraint set $C_\beta$ is a smooth submanifold of $P_\beta$.

We denote by $GS(P)_1$ the connected group of time independent gauge transformations, represented by maps $\varphi \in H^3(M, G)$ such that $\varphi|_{\partial M} = \text{identity}$ and $n \operatorname{grad} \varphi = 0$. Its action on the Cauchy data $(A, E, \Psi) \in P$ is given by

\[ A \mapsto \varphi A \varphi^{-1} + \varphi \operatorname{grad} \varphi^{-1}, \qquad E \mapsto \varphi E \varphi^{-1}, \qquad \Psi \mapsto \varphi \Psi. \tag{10} \]

Theorem 2. The action of $GS(P)_1$ in $P$ is continuous, smooth, free and proper. For every $\beta \in B$, the action of $GS(P)_1$ in $P$ preserves $P_\beta$.


Corollary 3. The space $C_\beta / GS(P)_1$ of $GS(P)_1$ orbits in $C_\beta$ is a quotient manifold of $C_\beta$, and $C_\beta$ has the structure of a principal fibre bundle over $C_\beta / GS(P)_1$ with structure group $GS(P)_1$.

Elements of the Lie algebra $\mathfrak{gs}(P)_1$ of $GS(P)_1$ are given by maps $\xi \in H^3(M, \mathfrak{g})$ such that $\xi|_{\partial M} = 0$, and $n \operatorname{grad} \xi = 0$. Their action in $P$ is given by the vector field

\[ \xi_P(A, E, \Psi) = (-D_A \xi, \; -[E, \xi], \; \Psi^\dagger \xi), \]

where $D_A \xi = d\xi + [A, \xi]$ is the covariant differential of $\xi$ with respect to the connection $A$. The action of $GS(P)_1$ in $P$ preserves the 1-form $\theta$ given by Eq. (8). Hence it preserves $\omega = d\theta$, and it is Hamiltonian with an equivariant momentum map $J_1$ such that, for every $\xi \in \mathfrak{gs}(P)_1$,

\[ \langle J_1(A, E, \Psi) \mid \xi \rangle = \langle \theta \mid \xi_P(A, E, \Psi) \rangle = \int_M (-E \cdot D_A \xi + \Psi^\dagger \xi \Psi) \, d^3x. \tag{11} \]

Using Stokes' Theorem and the vanishing of $\xi$ on $\partial M$, we obtain

\[ \langle J_1(A, E, \Psi) \mid \xi \rangle = \int_M \{ (\operatorname{div} E + [A; E]) \xi + \Psi^\dagger \xi \Psi \} \, d^3x. \]

Since the above equality is satisfied for every smooth $\xi$ which vanishes on the boundary together with its gradient, the Fundamental Theorem in the Calculus of Variations and the constraint equation (9) imply that

\[ (A, E, \Psi) \in C \iff \langle J_1(A, E, \Psi) \mid \xi \rangle = 0. \tag{12} \]
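The integration by parts behind the Stokes' Theorem step above can be sketched as follows. The notation is an assumption made here for illustration: $\langle \cdot, \cdot \rangle$ denotes an ad-invariant inner product on the Lie algebra, $E_i$, $A_i$ are the Cartesian components of $E$ and $A$, and $[A; E]$ is read as $\sum_i [A_i, E_i]$.

```latex
\begin{align*}
-E \cdot D_A \xi
  &= -\sum_i \langle E_i, \partial_i \xi \rangle - \sum_i \langle E_i, [A_i, \xi] \rangle \\
  &= \sum_i \langle \partial_i E_i, \xi \rangle - \sum_i \partial_i \langle E_i, \xi \rangle
     + \sum_i \langle [A_i, E_i], \xi \rangle
     && \text{(product rule, ad-invariance)} \\
  &= \langle \operatorname{div} E + [A; E], \, \xi \rangle - \operatorname{div} \langle E, \xi \rangle , \\
\int_M \operatorname{div} \langle E, \xi \rangle \, d^3x
  &= \int_{\partial M} \langle nE, \xi \rangle \, dS = 0
     && \text{(since } \xi|_{\partial M} = 0\text{)} .
\end{align*}
```

Only the vanishing of $\xi$ on the boundary is needed for this step; no condition on the normal component of $E$ enters.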

The pull-back $\omega_{C_\beta}$ of $\omega$ to $C_\beta$ has involutive kernel $\ker \omega_{C_\beta}$. The reduced phase space $\check{P}_\beta$ is defined as the set of equivalence classes of points in $C_\beta$ under the equivalence relation $p \simeq p'$ if and only if there is a piece-wise smooth curve in $C_\beta$ joining $p$ to $p'$ with the tangent vector contained in $\ker \omega_{C_\beta}$. If $\ker \omega_{C_\beta}$ is a distribution, it is clearly involutive and the equivalence classes coincide with integral manifolds of $\ker \omega_{C_\beta}$. We denote by $\rho_\beta : C_\beta \to \check{P}_\beta$ the canonical projection associating to each $p \in C_\beta$ its equivalence class containing $p$.

Theorem 4. For each $\beta \in B$,
1. $\check{P}_\beta = C_\beta / GS(P)_1$;
2. $\check{P}_\beta$ is endowed with a weak Riemannian metric induced by the $L^2$ scalar product in $P$, and with a 1-form $\theta_{\check{P}_\beta}$ such that $\rho_\beta^* \theta_{\check{P}_\beta} = \theta_{C_\beta}$. Moreover, $\omega_{\check{P}_\beta} = d\theta_{\check{P}_\beta}$ is weakly symplectic and $\rho_\beta^* \omega_{\check{P}_\beta} = \omega_{C_\beta}$.

The regularity results obtained here are analogous to the results valid for the Yang–Mills and Dirac fields in the Minkowski space-time, [2], and much stronger than in the bag model, [3]. The reason for this is that the boundary conditions used here are weaker than the bag boundary condition. In particular, we need not specify the normal component of the electric Yang–Mills field $E$ on the boundary.


2. Proofs

2.1. Proof of Theorem 1. The space $P_0$, corresponding to $\beta = 0 \in B$, is a closed subspace of $P$. For each $\beta \in B$, $P_\beta$ is a closed affine subspace of $P$ with the tangent space $P_0$. Let $f_\beta : P_\beta \to L^2(M, \mathfrak{g})$ be given by

\[ f_\beta(A, E, \Psi) = \operatorname{div} E + [A; E] - \Psi^\dagger (I \otimes T^a) \Psi \, T_a. \]

The constraint set $C_\beta$ is the zero level of $f_\beta$, $C_\beta = f_\beta^{-1}(0)$. The derivative of $f_\beta$ at the point $(A, E, \Psi) \in P_\beta$ in the direction $(a, e, \psi) \in P_0$ is given by

\[ Df_{\beta(A,E,\Psi)}(a, e, \psi) = \operatorname{div} e + [A; e] + [a; E] - \psi^\dagger (I \otimes T^a) \Psi \, T_a - \Psi^\dagger (I \otimes T^a) \psi \, T_a. \tag{13} \]

The differential $Df_{\beta(A,E,\Psi)}$ maps $P_0$ to $L^2(M, \mathfrak{g})$. Since our boundary conditions impose no restrictions on $E$, the operator $P_0 \to L^2(M, \mathfrak{g}) : (a, e, \psi) \mapsto \operatorname{div} e$ is submersive. Moreover, by the Rellich–Kondrachov Theorem, [4], the operator

\[ P_0 \to L^2(M, \mathfrak{g}) : (a, e, \psi) \mapsto [A; e] + [a; E] - \psi^\dagger (I \otimes T^a) \Psi \, T_a - \Psi^\dagger (I \otimes T^a) \psi \, T_a \]

is compact. Hence, $Df_{\beta(A,E,\Psi)}$ is semi-Fredholm and its range is closed, see [5]. Since $Df_{\beta(A,E,\Psi)}$ maps to a Hilbert space, its range has an orthogonal complement. Similarly, the kernel of $Df_{\beta(A,E,\Psi)}$ is closed and has an orthogonal complement in $P_0$. Hence, by the Implicit Function Theorem, $f_\beta^{-1}(0)$ is a submanifold of $P_\beta$, see [6]. This completes the proof of Theorem 1.

2.2. Proof of Theorem 2. The action of $H^3(M, G)$ is given by

\[ A \mapsto \varphi A \varphi^{-1} + \varphi \operatorname{grad} \varphi^{-1}, \qquad E \mapsto \varphi E \varphi^{-1} \quad \text{and} \quad \Psi \mapsto \varphi \Psi. \tag{14} \]

If $\varphi \in H^3(M, G)$, then $\varphi \operatorname{grad} \varphi^{-1} \in H^2(M, \mathfrak{g} \otimes \mathbb{R}^3)$. Moreover, for $k = 1, 2$, the pointwise products of elements of $H^3(M, \mathbb{R})$ and $H^k(M, \mathbb{R})$ are in $H^k(M, \mathbb{R})$. Hence, $(A, E, \Psi) \in P$ implies that $(\varphi A \varphi^{-1} + \varphi \operatorname{grad} \varphi^{-1}, \varphi E \varphi^{-1}, \varphi \Psi) \in P$. This proves that $H^3(M, G)$ acts in $P$. Continuity and smoothness of the action of $H^3(M, G)$ in $P$ follow from the continuity in $H^k(M, \mathbb{R})$ of pointwise products of elements of $H^3(M, \mathbb{R})$ and $H^k(M, \mathbb{R})$, $k = 1, 2$.

Properness of the action (14) of $H^3(M, G)$ in $H^2(M, \mathbb{R}^3 \otimes \mathfrak{g}) \times H^1(M, \mathbb{R}^3 \otimes \mathfrak{g}) \times H^2(M, \mathbb{C}^4 \otimes V)$ was proved in [3]. The boundary conditions assumed there did not affect the proof. Since $M$ is contractible it follows that

\[ GS(P)_0 = \{ \varphi \in H^3(M, G) \mid \varphi|_{\partial M} = \text{identity} \} \]

is connected. Since $GS(P)_0$ is a Banach–Lie subgroup of $H^3(M, G)$, continuity, smoothness and properness of its action in $P$ are a consequence of the same properties of the action of $H^3(M, G)$. To show that the action of $GS(P)_0$ is free, it suffices to consider its action on the Yang–Mills potentials. If $\varphi \in GS(P)_0$ preserves $A$, then $A = \varphi A \varphi^{-1} + \varphi \operatorname{grad} \varphi^{-1}$, which implies that

\[ \operatorname{grad} \varphi + [A, \varphi] = 0, \tag{15} \]

that is, $\varphi$ is covariantly constant with respect to the connection $A$ as a section of the group bundle over $M$. For any $x \in M$, let $x(s)$ be a smooth path in $M$ such that $x(1) = x$ and $x(0) \in \partial M$. Restricting Eq. (15) to the path $x(s)$ and setting $\varphi(s) = \varphi(x(s))$, we get an initial value problem

\[ \frac{d}{ds} \varphi(s) = -[A(x(s)) \cdot \dot{x}(s), \varphi(s)] \quad \text{and} \quad \varphi(0) = \text{identity}, \]

where $\dot{x}(s)$ denotes the derivative of $x(s)$ with respect to $s$. Clearly, $\varphi(s) = \text{identity}$ is a solution. Moreover, since $A(x(s)) \cdot \dot{x}(s)$ is continuous in $s$, the solution is unique, which implies that $\varphi(x) = \varphi(x(1)) = \varphi(1) = \text{identity}$. Since $GS(P)_0$ is connected, it follows that the isotropy group in $GS(P)_0$ of any Yang–Mills potential $A$ is trivial. This ensures that the action of $GS(P)_0$ in $P$ is free.

Since $GS(P)_1$ is a Banach–Lie subgroup of $GS(P)_0$, its action in $P$ is also continuous, smooth, proper and free. Moreover, each $\varphi \in GS(P)_1$ satisfies the conditions $\varphi|_{\partial M} = \text{identity}$ and $\operatorname{grad} \varphi|_{\partial M} = 0$. Hence,

\[
\begin{aligned}
t \operatorname{curl} (\varphi A \varphi^{-1} + \varphi \operatorname{grad} \varphi^{-1})
&= t \bigl( \operatorname{grad} \varphi \times A \varphi^{-1} + \varphi \operatorname{curl} A \, \varphi^{-1} + \varphi A \times \operatorname{grad} \varphi^{-1} + \operatorname{grad} \varphi \times \operatorname{grad} \varphi^{-1} \bigr) \\
&= t \operatorname{grad} \varphi \times nA \varphi^{-1} + n \operatorname{grad} \varphi \times tA \varphi^{-1} + \varphi \, t \operatorname{curl} A \, \varphi^{-1} \\
&\quad + \varphi \, tA \times n \operatorname{grad} \varphi^{-1} + \varphi \, nA \times t \operatorname{grad} \varphi^{-1} \\
&\quad + t \operatorname{grad} \varphi \times n \operatorname{grad} \varphi^{-1} + n \operatorname{grad} \varphi \times t \operatorname{grad} \varphi^{-1} \\
&= n \operatorname{grad} \varphi \times tA + t \operatorname{curl} A + tA \times n \operatorname{grad} \varphi^{-1} \\
&= t \operatorname{curl} A.
\end{aligned}
\]

Similarly,

\[ \tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \varphi \Psi \big|_{\partial M} = \tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \Psi \big|_{\partial M}, \]

and

\[ -\tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \gamma^0 (\gamma^k \partial_k + im) \varphi \Psi \big|_{\partial M} = -\tfrac{1}{2} (\mathrm{Id} - i\gamma^k n_k) \gamma^0 (\gamma^k \partial_k + im) \Psi \big|_{\partial M}. \]

Hence, the action of $\varphi$ in $P$ preserves $P_\beta$. Finally, the constraint equation is gauge invariant. Hence, the constraint set $C_\beta$ is preserved by the action of $GS(P)_1$. Moreover, $GS(P)_1$ acts in $C_\beta$ continuously, smoothly, properly and freely. This completes the proof of Theorem 2.


2.3. Proof of Corollary 3. Since $GS(P)_1$ acts in $C_\beta$ continuously, properly and freely, the space $C_\beta / GS(P)_1$ of the $GS(P)_1$-orbits in $C_\beta$ is a quotient manifold of $C_\beta$. Moreover, since the action of $GS(P)_1$ in $C_\beta$ is continuous, proper and free, $C_\beta$ has the structure of a principal fibre bundle with base space $C_\beta / GS(P)_1$ and structure group $GS(P)_1$. A proof that a continuous, smooth, proper and free action of a Lie group in a finite dimensional manifold gives rise to the structure of a principal fibre bundle is given in Ref. [7]. This proof extends without change to continuous, smooth, proper and free actions of Banach–Lie groups on Banach manifolds. This completes the proof of Corollary 3.

2.4. Theorem 4. Theorem 4 is a consequence of smoothness of the constraint set and several results which can be found in the literature, see Refs. [8,9,3,10]. Its proof is essentially identical to the proof of the corresponding result for Yang–Mills and Dirac fields in the Minkowski space-time, [2].

References

1. Schwarz, G., Śniatycki, J. and Tafel, J.: Yang–Mills and Dirac fields with inhomogeneous boundary conditions. Commun. Math. Phys. 188, 439–448 (1997)
2. Śniatycki, J.: Regularity of constraints and reduction in the Minkowski space Yang–Mills–Dirac theory. Ann. Inst. Henri Poincaré 70, 277–293 (1999)
3. Śniatycki, J., Schwarz, G. and Bates, L.: Yang–Mills and Dirac fields in a bag, constraints and reduction. Commun. Math. Phys. 176, 95–115 (1996)
4. Adams, R.A.: Sobolev Spaces. New York: Academic Press, 1975
5. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1984
6. Lang, S.: Differential and Riemannian Manifolds. New York: Springer, 1995
7. Cushman, R. and Bates, L.: Global Aspects of Classical Integrable Systems. Basel–Boston–Berlin: Birkhäuser, 1997
8. Arms, J., Marsden, J.E. and Moncrief, V.: Symmetry and bifurcation of momentum maps. Commun. Math. Phys. 90, 361–372 (1981)
9. Palais, R.: On the existence of slices for actions of non-compact Lie groups. Ann. Math. 73, 295–323 (1961)
10. Mitter, P. and Viallet, C.: On the bundle of connections and the gauge orbit manifold in Yang–Mills theory. Commun. Math. Phys. 79, 457–472 (1981)

Communicated by G. Felder

Commun. Math. Phys. 203, 713 – 728 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

Singular Monopoles and Gravitational Instantons

Sergey A. Cherkis1,? , Anton Kapustin2,??

1 California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]
2 School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA. E-mail: [email protected]

Received: 29 May 1998 / Accepted: 12 January 1999

Abstract: We model $A_k$ and $D_k$ asymptotically locally flat gravitational instantons on the moduli spaces of solutions of $U(2)$ Bogomolny equations with prescribed singularities. We study these moduli spaces using Ward correspondence and find their twistor description. This enables us to write down the Kähler potential for $A_k$ and $D_k$ gravitational instantons in a relatively explicit form.

1. Introduction

A gravitational instanton is a smooth four-dimensional manifold with a Riemannian metric satisfying Einstein equations. A particularly interesting class of gravitational instantons is that of four-dimensional hyperkähler manifolds, i.e. manifolds with holonomy group contained in $SU(2)$. A hyperkähler manifold can be alternatively characterized as a Riemannian manifold admitting three covariantly constant complex structures $I, J, K$ satisfying the quaternion relations

$$IJ = -JI = K, \ \text{etc.} \tag{1}$$

such that the metric is Hermitian with respect to $I, J, K$. Covariant constancy of $I, J, K$ implies that three 2-forms $\omega_1 = g(I\cdot,\cdot)$, $\omega_2 = g(J\cdot,\cdot)$, $\omega_3 = g(K\cdot,\cdot)$ are closed. If we pick one of the complex structures, say $I$, we may regard a hyperkähler manifold as a complex manifold equipped with Kähler metric (with Kähler form $\omega_1$) and a complex symplectic form $\omega = \omega_2 + i\omega_3$. Hyperkähler four-manifolds arise in several physical problems. For example, compactification of string and M-theory on hyperkähler four-manifolds preserves one half of supersymmetries and provides exact solutions of stringy equations of motion.

? Research supported in part by DOE grant DE-FG03-92-ER40701.

?? Research supported in part by DOE grant DE-FG02-90-ER40542.


The only compact hyperkähler four-manifolds are $T^4$ and K3, but the K3 metric is not known explicitly. In the noncompact case there are several possibilities to consider. There are no nontrivial hyperkähler metrics asymptotically approaching that of $\mathbf{R}^4$, but the situation becomes more interesting if one requires the metric to be only Asymptotically Locally Euclidean (ALE), i.e. to look asymptotically like the quotient of $\mathbf{R}^4$ by a finite group of isometries. All such metrics fit into the ADE classification of Kronheimer which we now briefly explain. Let $\Gamma$ be a finite subgroup of $SU(2)$. There is a natural correspondence between such $\Gamma$'s and ADE Dynkin diagrams: the $A_k$ diagram corresponds to the cyclic group $\mathbf{Z}_{k+1}$, the $D_k$ diagram corresponds to the binary dihedral group $D_{k-2}$ of order $4(k-2)$, and $E_k$ diagrams correspond to symmetry groups of tetrahedron, cube, and icosahedron. Since $SU(2)$ acts on $\mathbf{C}^2$ by the fundamental representation, we may consider quotients $\mathbf{C}^2/\Gamma$ (known as Kleinian singularities). Kronheimer showed that resolutions of Kleinian singularities admit ALE hyperkähler metrics, and that all such metrics arise in this way [1,2]. In the $A_k$ case the metric has been known explicitly for some time: it is the Gibbons-Hawking metric with $k+1$ centers [3]. Kronheimer provided an implicit construction of $D_k$ and $E_k$ ALE gravitational instantons as hyperkähler quotients [1].

Another interesting class of noncompact gravitational instantons is that of Asymptotically Locally Flat (ALF) manifolds. By definition, the metric of an ALF manifold has the asymptotic form

$$ds^2 = dr^2 + \sigma_1^2 + r^2(\sigma_2^2 + \sigma_3^2), \tag{2}$$

where $\sigma_j$ are left-invariant one-forms on $S^3/\Gamma$ for some finite subgroup $\Gamma$ acting on $S^3 = SU(2)$ from the right. The only known hyperkähler metric of this sort is the multi-Taub-NUT metric. As a complex manifold the $(k+1)$-center multi-Taub-NUT space is isomorphic to the resolution of $\mathbf{C}^2/\mathbf{Z}_{k+1}$, so we will call it the $A_k$-type ALF gravitational instanton. Compactification of M-theory on this manifold is equivalent to a configuration of $k+1$ parallel D6 branes in IIA string theory. Furthermore, it is expected that a configuration of an O6$^+$ orientifold and $k$ D6 branes in IIA string theory corresponds to the compactification of M-theory on a $D_k$ ALF space [4], i.e. an ALF gravitational instanton isomorphic to the resolution of $\mathbf{C}^2/D_{k-2}$. More generally, any compactification of M-theory on an ALF hyperkähler manifold should correspond to a IIA brane configuration preserving half of supersymmetries. Thus it is of interest to find all four-dimensional ALF hyperkähler metrics in as explicit a form as possible.

In our previous paper [5] we constructed $D_k$ ALF metrics from moduli spaces of certain ordinary differential equations (Nahm equations). In this paper we construct both $A_k$ and $D_k$ ALF hyperkähler four-manifolds from moduli spaces of solutions of $U(2)$ Bogomolny equations on $\mathbf{R}^3$ with prescribed singularities. Solutions of $SU(2)$ Bogomolny equations with singularities were previously considered by Kronheimer [6], and much of our discussion closely follows that in Ref. [6]. The idea is that, on one hand, the moduli space of Bogomolny equations carries natural hyperkähler structure, while on the other hand the solutions can be found by means of Ward correspondence. This approach yields directly the twistor space (in the sense of Penrose) of the moduli space of solutions. To get the metric itself one needs to find an appropriate family of sections of the twistor space.
In both Ak and Dk cases we were able to identify the correct family of sections only modulo some finite choices. The Ak metrics are simple enough so that one can explicitly see that only one choice gives nonsingular metrics. In the end of Subsect. 5.1 we argue that the Dk twistor spaces we have constructed correspond to


everywhere smooth hyperkähler metrics as well. We also show that the $D_k$ ALF metrics we obtain are identical to those found in Ref. [5].

Let us explain what we mean by solutions of Bogomolny equations with prescribed singularities. Recall that Bogomolny equations on $\mathbf{R}^3$ are equations for a connection $A$ in a vector bundle $B$ over $\mathbf{R}^3$ and a section $\Phi$ of $\mathrm{End}\,B$: $*F_A = D_A\Phi$. Let $\|a\|$ be the Ad-invariant norm on $u(2)$, $\|a\|^2 = -\frac{1}{2}\mathrm{Tr}\,a^2$. Fix $k$ distinct points $p_1, \dots, p_k \in \mathbf{R}^3$. A singular $U(2)$ monopole is a solution of $U(2)$ Bogomolny equations on $\mathbf{R}^3\setminus\{p_1, \dots, p_k\}$ satisfying the following conditions.

(i) As $r \to p_\alpha$, $2r_\alpha\Phi \to i\,\mathrm{diag}(0, -\ell_\alpha)$ up to gauge transformations, and $d(r_\alpha\|\Phi\|)$ is bounded. Here $r_\alpha = |r - p_\alpha|$, $\alpha = 1, \dots, k$.

(ii) As $r \to \infty$ one has asymptotic expansions, up to gauge transformations,

$$\Phi = i\,\mathrm{diag}\left(\mu_1 - \frac{n}{2r},\ \mu_2 + \frac{n - \sum_\alpha \ell_\alpha}{2r}\right) + O(1/r^2), \qquad \frac{\partial\|\Phi\|}{\partial\Omega} = O(1/r^2), \qquad \|D\Phi\| = O(1/r^2).$$

We will refer to $n$ as the nonabelian charge of the monopole, and to $\{\ell_\alpha\}$ as its abelian charges. We will assume that $\mu_1 > \mu_2$. We also set $n' = n - \sum_\alpha \ell_\alpha$, $\mu = \mu_1 - \mu_2$, for short. Every fiber of the complex rank two bundle $B$ splits into the eigenspaces of $\Phi$, $B = M_1 \oplus M_2$, near $r = p_\alpha$ or when $r \to \infty$. Let $M_1$ correspond to the eigenvalue of $\Phi$ diverging as $r \to p_\alpha$. It is a simple consequence of Bogomolny equations that $-\ell_\alpha$ is the degree of $M_1$ restricted to a small 2-sphere around $r = p_\alpha$. Similarly, $-n$ and $n'$ are the degrees of eigensubbundles of $B$ restricted to a large 2-sphere. Therefore $n$ and $\{\ell_\alpha\}$ are integers.

String theory considerations imply that the moduli space of the $n = 1$ monopole with $\ell_\alpha = 1$, $\alpha = 1, \dots, k$ is an $A_{k-1}$ ALF gravitational instanton, and the centered moduli space of the $n = 2$ monopole with $\ell_\alpha = 1$, $\alpha = 1, \dots, k$ is a $D_k$ ALF gravitational instanton [7]. (The centered $n = 2$ monopole moduli space is a $U(1)$ hyperkähler quotient of the $n = 2$ monopole moduli space; see Sect. 5 below.) In this paper we show that this is indeed the case.
The main tool is the Ward correspondence described in Sect. 2. In Sect. 3 we use it to derive the twistor space for arbitrary $n$. In Sects. 4 and 5 we deal with the $n = 1$ and $n = 2$ cases, respectively. We show that the moduli spaces are resolutions of $A_{k-1}$ and $D_k$ singularities, as expected, find the real holomorphic sections of the twistor spaces, and derive the Kähler potentials for the metrics using the generalized Legendre transform method of Refs. [8,9].

2. Ward Correspondence

From now on we restrict ourselves to the case $\ell_\alpha = 1$, $\alpha = 1, \dots, k$. We make some comments on the more general case of positive $\ell_\alpha$ at the end of Subsect. 5.1. To construct the moduli space of singular $U(2)$ monopoles, we will use a version of Ward correspondence due to Hitchin [10]. The set of all oriented straight lines $\mathbf{T}$ in $\mathbf{R}^3$ has a natural complex structure, as it is the tangent bundle of the projective line. $\mathbf{T}$ can be covered by two patches $V_0$ ($\zeta \neq \infty$) and $V_1$ ($\zeta \neq 0$) with coordinates $(\eta, \zeta)$ and $(\eta', \zeta') = (\eta/\zeta^2, 1/\zeta)$.


For any point $x \in \mathbf{R}^3$ the set of all oriented straight lines through $x$ sweeps out a projective line $P_x \in \mathbf{T}$; thus there is a holomorphic map $P_x : \mathbf{P}^1 \to \mathbf{T}$. The reversal of the orientation of lines in $\mathbf{R}^3$ is an antiholomorphic map $\tau : \mathbf{T} \to \mathbf{T}$ satisfying $\tau^2 = \mathrm{id}$. It is called the real structure of $\mathbf{T}$. For any $x$ it acts on $P_x$ as the antipodal map. Thus $P_x$ is a real holomorphic section of $\mathbf{T}$. For any straight line in $\mathbf{R}^3$, $\gamma = \{x \mid x = ut + v,\ u\cdot u = 1,\ u\cdot v = 0\}$, let

$$\gamma_+ = \{x \mid x = ut + v,\ t > R\}, \qquad \gamma_- = \{x \mid x = ut + v,\ t < -R\}, \tag{3}$$

where $R$ is a positive number greater than any $|p_\alpha|$. Now we define two complex rank 2 vector bundles $E^+$ and $E^-$ over $\mathbf{T}$:

$$E^+ = \{s \in \Gamma(\gamma_+, E) \mid D_\gamma s = i\Phi s\}, \qquad E^- = \{s \in \Gamma(\gamma_-, E) \mid D_\gamma s = i\Phi s\}. \tag{4}$$

From Bogomolny equations it follows, as in Ref. [10], that these bundles are holomorphic. The real structure $\tau$ on $\mathbf{T}$ can be lifted to an antilinear antiholomorphic map $\sigma : E^+ \to (E^-)^*$. Thus every solution of $U(2)$ Bogomolny equations maps to a pair of holomorphic rank two bundles on $\mathbf{T}$ interchanged by the real structure. Let $P_x$ denote the real section corresponding to $x$, and $P_\alpha$ the real section corresponding to $p_\alpha$. Let $P$ be the union of all $P_\alpha$. If $\gamma$ does not pass through any of $p_\alpha$, any solution $s$ can be continued from $\gamma_+$ to $\gamma_-$. This defines a natural identification of the fibers $E^+_\gamma$ and $E^-_\gamma$. Therefore we have an isomorphism

$$h : E^+|_{\mathbf{T}\setminus P} \to E^-|_{\mathbf{T}\setminus P}. \tag{5}$$

For nonsingular monopoles $h$ extends to an isomorphism over the whole $\mathbf{T}$, therefore the Ward correspondence maps a nonsingular monopole to a holomorphic bundle over $\mathbf{T}$. In the present case $h$ or $h^{-1}$ may have singularities at $P$, and the Ward correspondence maps a singular monopole into a triplet $(E^+, E^-, h)$. This triplet satisfies a certain triviality constraint which we now proceed to formulate.

For any $x$ distinct from all $p_\alpha$ the intersection $P_x \cap P$ consists of an even number of points. For a generic $x$ the cardinality of $Q_x = P_x \cap P$ is $2k$. For any $x$ we can arbitrarily split $Q_x$ into two sets of equal cardinality $Q_x^+$ and $Q_x^-$ and construct a vector bundle $E_x$ over $P_x$ by gluing together $E^+$ restricted to $P_x\setminus Q_x^+$ and $E^-$ restricted to $P_x\setminus Q_x^-$, with the transition function $h$. (Of course, $E_x$ depends on the splitting.) The triviality constraint is that for any $x$ there is a splitting $Q_x = Q_x^+ \cup Q_x^-$ such that $E_x$ is trivial.

Now we state the Ward correspondence between singular $U(2)$ monopoles and twistor data. There is a bijection between singular monopoles modulo gauge transformations and pairs $(E^+, E^-)$ of holomorphic rank 2 bundles over $\mathbf{T}$ equipped with an isomorphism Eq. (5) satisfying the following conditions:

(a) For any $x \neq p_\alpha$ there is a splitting $Q_x = Q_x^+ \cup Q_x^-$ such that $E_x$ is trivial.


(b) In the vicinity of each point of $P$ there exist trivializations of $E^+$ and $E^-$ such that $h$ takes the form

$$h = \begin{pmatrix} 1 & 0 \\ 0 & \prod_\alpha (\eta - P_\alpha(\zeta)) \end{pmatrix}, \tag{6}$$

so that $h$ extends across $P$ to a morphism $E^+ \to E^-$.

(c) The real structure $\tau$ on $\mathbf{T}$ lifts to an antilinear antiholomorphic map $\sigma : E^+ \to (E^-)^*$.

The injectivity of the Ward correspondence can be shown by a straightforward modification of the argument in Ref. [10]. We conjecture the surjectivity by analogy with the nonsingular case. Let us explain where (a) and (b) come from. The condition (b) arises from studying the behavior of the solutions of the equation $D_\gamma s = i\Phi s$ as $\gamma$ approaches $p_\alpha$. Details can be found in Ref. [6]. (There the $SU(2)$ case was analyzed, but the extension to $U(2)$ is straightforward.) To demonstrate (a) it is sufficient to exhibit a holomorphic trivialization of $E_x$. Take any $x \neq p_\alpha$, $\alpha = 1, \dots, k$ and recall that $P_x$ consists of all straight lines $\gamma$ passing through $x$. To obtain a holomorphic section of $E_x$ pick a vector $v_1$ in the fiber of $B$ over $x$ and take it as an initial condition for the equation $D_\gamma s = i\Phi s$ at $t = 0$. Integrating it forward and backward in $t$ and varying $\gamma$ yields sections of $E^+$ and $E^-$ related by $h$. It is easy to check that they are holomorphic and thereby combine into a holomorphic section $s_1$ of $E_x$. To get a section $s_2$ of $E_x$ linearly independent from $s_1$ just pick a vector $v_2$ linearly independent from $v_1$ and repeat the procedure. (This argument has to be modified if there is a straight line $\gamma$ passing through $x$ and $\alpha, \beta \in \{1, \dots, k\}$ such that $p_\alpha$ and $p_\beta$ lie on $\gamma$ and $x$ separates them. In this case one of the vectors $v_1, v_2$ has to be varied, $v_i \sim \zeta^{-1}$, as one varies $\gamma$.)

We now want to encode the twistor data in an algebraic curve $S \subset \mathbf{T}$, in the spirit of Ref. [10]. We denote by $O(m)$ the pullback to $\mathbf{T}$ of the unique degree $m$ line bundle on $\mathbf{P}^1$, and by $L^x(m)$ a line bundle over $\mathbf{T}$ with the transition function $\zeta^{-m}e^{-x\eta/\zeta}$ from $V_0$ to $V_1$. Let $L_1^+$ be a line subbundle of $E^+$ which consists of solutions of $D_\gamma s = i\Phi s$ bounded by $\mathrm{const}\cdot\exp(-\mu_1 t)\,t^{n}$ as $t \to +\infty$. Similarly, a line bundle $L_1^- \subset E^-$ consists of solutions bounded by $\mathrm{const}\cdot\exp(-\mu_2 t)\,t^{-n'}$ as $t \to -\infty$. The line bundles $L_2^+$ and $L_2^-$ are defined by

$$L_2^+ = E^+/L_1^+, \qquad L_2^- = E^-/L_1^-.$$

As in Ref. [10] the asymptotic conditions on the Higgs field can be used to show that $L_{1,2}^+$ and $L_{1,2}^-$ are holomorphic line bundles, and that the following isomorphisms hold:

$$L_1^+ \simeq L^{\mu_1}(-n), \qquad L_2^+ \simeq L^{\mu_2}(n'), \qquad L_1^- \simeq L^{\mu_2}(-n'), \qquad L_2^- \simeq L^{\mu_1}(n).$$

Consider a composite map

$$\psi : L_1^+ \to E^+ \to E^- \to L_2^-,$$

where the first arrow is an inclusion, the second arrow is $h$, and the third arrow is a natural projection. We may regard $\psi$ as an element of $H^0(\mathbf{T}, O(2n))$. Let us define the spectral curve $S$ to be the zero level of $\psi$. $S$ is in the linear system $O(2n)$. Arguments identical to those in Ref. [10] can be used to prove that $S$ is compact and real (i.e., $\tau(S) = S$). Consider now a map $\phi : \wedge^2 E^+ \to \wedge^2 E^-$ induced by $h$. By virtue of Eq. (6) the zero level of $\phi$ is precisely $P$. We will assume in what follows that $S$ does not contain


any of $P_\alpha$ as components. Physically this corresponds to the requirement that none of the nonabelian monopoles is located at $x = p_\alpha$. For simplicity we will also assume that $S \cap P$ consists of $2nk$ points (this is a generic situation).

The construction here bears a close resemblance to that in Ref. [11], where nonsingular monopoles for all classical groups were constructed. According to Ref. [11], the spectral data for a nonsingular $SU(3)$ monopole with magnetic charge $(k, n)$ include a pair of spectral curves $S_1, S_2$ in the linear systems $O(2n)$, $O(2k)$. Our $S$ and $P$ are analogs of $S_1$ and $S_2$. The condition that $S \cap P$ consists of $2nk$ points is analogous to the requirement in Ref. [11] that the monopoles are generic. (This resemblance is not a coincidence: if we consider an $SU(3)$ gauge theory broken down to $SU(2) \times U(1)$ by a large vev of an adjoint Higgs field, the $(k, n)$ monopoles of $SU(3)$ reduce to singular monopoles of $SU(2) \times U(1)$ with nonabelian charge $n$ and total abelian charge $k$. In this limit the spectral data of Ref. [11] must reduce to ours.)

Since $L_1^+|_S = \ker\psi|_S$, we have a well-defined holomorphic map $\rho : L_2^+|_S \to L_2^-|_S$ induced by $h$. There is also a holomorphic map $\xi : L_1^+|_S \to L_1^-|_S$ induced by $h$. Thus we have natural elements $\rho \in H^0(S, L^{\mu}(k))$ and $\xi \in H^0(S, L^{-\mu}(k))$. It also easily follows from the definition that $\rho \otimes \xi = \phi|_S$, and therefore the divisors of both $\rho$ and $\xi$ are subsets of $S \cap P$. $\rho$ and $\xi$ are interchanged by real structure, and therefore the same is true about their divisors. It follows that the divisors of $\rho$ and $\xi$ are disjoint and have equal cardinality. Thus we can define the spectral data for a generic singular monopole to consist of

(i) A spectral curve $S$, which is a real compact curve in the linear system $O(2n)$ such that $S \cap P$ consists of $2nk$ disjoint points.

(ii) A splitting $S \cap P = Q^+ \cup Q^-$ into sets of equal cardinality interchanged by $\tau$.

(iii) A section $\rho$ of $L^{\mu}(k)|_S$ with divisor $Q^+$ and a section $\xi$ of $L^{-\mu}(k)|_S$ with divisor $Q^-$. $\rho$ and $\xi$ are interchanged by real structure.

The condition (iii) is a constraint on $S$. It implies that $\rho$ and $\xi$ satisfy

$$\rho\,\xi = \prod_\alpha (\eta - P_\alpha(\zeta)). \tag{7}$$

For nonsingular monopoles it reduces to the requirement that $L^{\mu}|_S$ is trivial, as in Ref. [10]. As a consequence of (iii), $L^{2\mu}|_S[Q^- - Q^+]$ is trivial. Recall that the spectral data for nonsingular $SU(2)$ monopoles satisfy an additional constraint, the "vanishing theorem" of Ref. [12]. It says that $L^{z\mu}(n-2)$ is nontrivial for $z \in (0, 1)$. A natural guess for the analogue of this condition in our case is

(iv) $L^{z\mu}(n-2)[-Q^+]$ is nontrivial for $z \in (0, 1)$.

We already mentioned a close connection of the spectral data for singular $U(2)$ monopoles and those for nonsingular $SU(3)$ monopoles [11] with the largest Higgs vev set to $+\infty$. Consequently, one can obtain the condition (iv) from the "vanishing theorem" of Ref. [11] by taking the appropriate limit. A direct derivation of (iv) should also be possible.

Arguments very similar to those in Ref. [10] show that the spectral data determine the singular monopole uniquely. A natural question is if there is a one-to-one correspondence between singular $U(2)$ monopoles and spectral data defined by (i-iv). The answer was positive for nonsingular $SU(2)$ monopoles [12], so it is highly plausible that the same is true in the present case. Presumably a proper proof of this can be achieved by converting the spectral data into solutions of Nahm equations, and then reconstructing singular monopoles by an inverse Nahm transform [12,11].


3. Twistor Space for Singular Monopoles

Having established the correspondence between singular $U(2)$ monopoles and algebraic data on $\mathbf{T}$, we now proceed to construct the twistor space $Z_n$ for the moduli space of a singular monopole with nonabelian charge $n$. We follow the method of Ref. [13]. For fixed $\zeta = \zeta_0$ every point in $Z_n$ yields a spectral curve $S$ which intersects the fiber of $\mathbf{T}$ over $\zeta_0$ at $n$ points. Thus we have a projection $Z_n \to \oplus_{j=1}^{n} O(2j) = Y_n$. Concretely, if $S$ is given by $\eta^n + \eta_1\eta^{n-1} + \cdots + \eta_n = 0$, the corresponding point in $Y_n$ is $(\eta_1, \dots, \eta_n)$. Now consider an $n$-fold cover of $Y_n$,

$$X_n = \left\{(\eta, \eta_1, \dots, \eta_n) \in O(2) \oplus Y_n \ \middle|\ \eta^n + \eta_1\eta^{n-1} + \cdots + \eta_n = 0\right\}.$$

There are two natural projections $\pi_1 : X_n \to \mathbf{T}$ and $\pi_2 : X_n \to Y_n$. Using these projections, we get a rank $n$ bundle $V^+$ over $Y_n$ as a direct image sheaf $V^+ = \pi_{2*}\pi_1^* L^{\mu}(k)$. Similarly, we get a rank $n$ bundle $V^- = \pi_{2*}\pi_1^* L^{-\mu}(k)$. For any point in $Z_n$ we have a section $\rho$ of $L^{\mu}(k)|_S$ and a section $\xi$ of $L^{-\mu}(k)|_S$. Therefore, there is an inclusion $Z_n \subset V^+ \oplus V^-$. To describe this inclusion more concretely, we must rewrite the condition (iii) in terms of sections of $V^\pm$. The result is as follows. Let $U$ be a $(2n+1)$-dimensional subvariety in $\mathbf{C}^{3n+1}$ with coordinates $(\zeta, \eta_1, \dots, \eta_n, \rho_0, \dots, \rho_{n-1}, \xi_0, \dots, \xi_{n-1})$ defined by

$$(\rho_0 + \rho_1\eta + \cdots + \rho_{n-1}\eta^{n-1})(\xi_0 + \xi_1\eta + \cdots + \xi_{n-1}\eta^{n-1}) = \prod_\alpha (\eta - P_\alpha(\zeta)) \quad \mathrm{mod}\ \ \eta^n + \eta_1\eta^{n-1} + \cdots + \eta_n = 0. \tag{8}$$

Take two copies of $U$ and glue them together over $\zeta \neq 0, \infty$ by

$$\begin{aligned}
\tilde\zeta &= \zeta^{-1}, \qquad \tilde\eta_j = \zeta^{-2j}\eta_j, \quad j = 1, \dots, n, \\
\tilde\rho_0 + \tilde\rho_1\tilde\eta + \cdots + \tilde\rho_{n-1}\tilde\eta^{\,n-1} &= e^{-\mu\eta/\zeta}\,\zeta^{-k}\,(\rho_0 + \rho_1\eta + \cdots + \rho_{n-1}\eta^{n-1}), \\
\tilde\xi_0 + \tilde\xi_1\tilde\eta + \cdots + \tilde\xi_{n-1}\tilde\eta^{\,n-1} &= e^{\mu\eta/\zeta}\,\zeta^{-k}\,(\xi_0 + \xi_1\eta + \cdots + \xi_{n-1}\eta^{n-1}),
\end{aligned} \tag{9}$$

all modulo $\eta^n + \eta_1\eta^{n-1} + \cdots + \eta_n = 0$. The resulting $(2n+1)$-dimensional variety is $Z_n$, the twistor space of singular monopoles with nonabelian charge $n$.

To reconstruct the hyperkähler metric from the twistor space one has to find a holomorphic section of $\Lambda^2 T_F^* \otimes O(2)$, where $T_F^*$ is the cotangent bundle of the fiber of $Z_n$. Upon restriction to any fiber of $Z_n$ this section must be closed and nondegenerate. An obvious choice (the same as in Ref. [13]) is

$$\omega = 4\sum_{j=1}^{n} \frac{d\rho(\beta_j) \wedge d\beta_j}{\rho(\beta_j)}, \tag{10}$$

where $\beta_j$, $j = 1, \dots, n$ are the roots of $\eta^n + \eta_1\eta^{n-1} + \cdots + \eta_n = 0$.
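The generic count of $2nk$ points in $S \cap P$ assumed in Sect. 2 can be illustrated with a small numerical sketch. It assumes, as is standard for sections of $O(2j)$ over $\mathbf{P}^1$, that each $\eta_j$ is a polynomial of degree $2j$ in $\zeta$ and each $P_\alpha$ a quadratic polynomial; the random coefficients and all function names below are illustrative only, and the reality conditions on $S$ and $P_\alpha$ are ignored since only the degree matters for the count.

```python
import random

def poly_mul(p, q):
    """Multiply two polynomials in zeta (coefficient lists, lowest degree first)."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials in zeta, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0j] * (n - len(p))
    q = list(q) + [0j] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def restrict_curve_to_section(eta_coeffs, section):
    """Substitute eta = section(zeta) into eta^n + eta_1 eta^(n-1) + ... + eta_n.

    Uses Horner's scheme in eta; every coefficient is a polynomial in zeta.
    """
    result = [1 + 0j]
    for eta_j in eta_coeffs:
        result = poly_add(poly_mul(result, section), eta_j)
    return result

def degree(p, tol=1e-12):
    """Degree of a polynomial, ignoring numerically vanishing top coefficients."""
    d = len(p) - 1
    while d > 0 and abs(p[d]) < tol:
        d -= 1
    return d

def random_poly(deg, rng):
    """Hypothetical generic polynomial of the given degree."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(deg + 1)]

def count_intersections(n, k, seed=0):
    """Number of zeros (with multiplicity) of the curve equation on k quadratic sections."""
    rng = random.Random(seed)
    eta_coeffs = [random_poly(2 * j, rng) for j in range(1, n + 1)]
    total = 0
    for _ in range(k):
        section = random_poly(2, rng)  # stands in for P_alpha(zeta)
        total += degree(restrict_curve_to_section(eta_coeffs, section))
    return total

if __name__ == "__main__":
    print(count_intersections(n=2, k=3))
```

Each restriction is a polynomial of degree $2n$ in $\zeta$, so each $P_\alpha$ contributes $2n$ points and the total is $2nk$ for generic data.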


4. Moduli Space $M_1$ of $n = 1$ Monopole

Specializing the formulas of the previous section to $n = 1$, we get that the twistor space $Z_1$ is a hypersurface in the total space of $L^{\mu}(k) \oplus L^{-\mu}(k)$,

$$\rho_0\,\xi_0 = \prod_{\alpha=1}^{k} (\eta - P_\alpha(\zeta)), \tag{11}$$

where $\rho_0 \in L^{\mu}(k)$, $\xi_0 \in L^{-\mu}(k)$, and $\eta \in O(2)$. Obviously, for fixed $\zeta$ this is a resolution of $\mathbf{C}^2/\mathbf{Z}_k$, so the corresponding hyperkähler metric is an $A_{k-1}$ gravitational instanton. In fact, it is well known what the metric is: it is the multi-Taub-NUT metric with $k$ centers. In the remainder of this section we rederive this result using the Legendre transform method of Refs. [14,8,9]. This will serve as a warm-up for the discussion of $D_k$ ALF metrics in the next section.

First we find the real holomorphic sections of the twistor space $Z_1$. This amounts to solving Eq. (11) with $\rho_0$, $\xi_0$, and $\eta$ now regarded as holomorphic sections of the appropriate bundles. Recalling that $\eta = a\zeta^2 + 2b\zeta - \bar a$ and $P_\alpha(\zeta) = a_\alpha\zeta^2 + 2b_\alpha\zeta - \bar a_\alpha$ with $b, b_\alpha \in \mathbf{R}$, one gets in the patch $V_0$

$$\rho_0 = A\,e^{+\mu(b + a\zeta)}\prod_{\alpha=1}^{k}(\zeta - u_\alpha), \qquad \xi_0 = B\,e^{-\mu(b + a\zeta)}\prod_{\alpha=1}^{k}(\zeta - v_\alpha),$$

with $AB = \prod_\alpha (a - a_\alpha)$. Here $u_\alpha$ and $v_\alpha$ are the roots of the equation $\eta(\zeta) = P_\alpha(\zeta)$,

$$u_\alpha = \frac{-(b - b_\alpha) - \Delta_\alpha}{a - a_\alpha}, \qquad v_\alpha = \frac{-(b - b_\alpha) + \Delta_\alpha}{a - a_\alpha}, \tag{12}$$

with $\Delta_\alpha = \sqrt{(b - b_\alpha)^2 + |a - a_\alpha|^2} > 0$. (The ambiguity in the sign of $\Delta_\alpha$ is fixed by requiring that the hyperkähler metric on this family of sections be everywhere nonsingular. This is equivalent to asking that the normal bundle of every section in the family is $O(1) \oplus O(1)$.) Since the real structure must interchange $\rho_0$ and $\xi_0$, we get

$$B\bar B = \prod_\alpha (b - b_\alpha + \Delta_\alpha). \tag{13}$$

Thus we have a family of solutions to Eq. (11) parametrized by $\mathrm{Re}\,a$, $\mathrm{Im}\,a$, $b$, and $\mathrm{Arg}\,B$.

Having found the real holomorphic sections, we compute the Kähler potential. The twistor space $Z_1$ is fibered over $\mathbf{P}^1$ with an intermediate projection $Z_1 \to O(2) \to \mathbf{P}^1$. In the above $\zeta$ and $\eta$ are coordinates on the base and the fiber of $O(2)$, respectively. The holomorphic 2-form $\omega \in \Lambda^2 T^* \otimes O(2)$ is given by

$$\omega = 4\,d\eta \wedge \frac{d\rho}{\rho}. \tag{14}$$
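The root formulas in Eq. (12) are easy to check numerically. The sketch below uses arbitrary sample values for $a$, $b$, $a_\alpha$, $b_\alpha$ (these numbers are assumptions chosen for illustration, not data from the paper). It also checks that the antipodal map on the base, taken here in its standard form $\zeta \mapsto -1/\bar\zeta$ (an assumption about the explicit form of the real structure on $\mathbf{P}^1$), sends $u_\alpha$ to $v_\alpha$, which is consistent with the real structure interchanging $\rho_0$ and $\xi_0$.

```python
import math

def roots_u_v(a, b, a_alpha, b_alpha):
    """Roots u_alpha, v_alpha of eta(zeta) = P_alpha(zeta), following Eq. (12)."""
    A = a - a_alpha
    beta = b - b_alpha
    delta = math.sqrt(beta ** 2 + abs(A) ** 2)  # Delta_alpha > 0
    u = (-beta - delta) / A
    v = (-beta + delta) / A
    return u, v, delta

def eta_minus_P(zeta, a, b, a_alpha, b_alpha):
    """eta(zeta) - P_alpha(zeta) for the quadratic real sections of O(2)."""
    eta = a * zeta ** 2 + 2 * b * zeta - a.conjugate()
    P = a_alpha * zeta ** 2 + 2 * b_alpha * zeta - a_alpha.conjugate()
    return eta - P

# Illustrative sample point (assumed values).
a, b = 0.7 + 0.2j, 0.3
a_alpha, b_alpha = -0.4 + 0.9j, -1.1

u, v, delta = roots_u_v(a, b, a_alpha, b_alpha)

# Residuals of the root equation.
res_u = abs(eta_minus_P(u, a, b, a_alpha, b_alpha))
res_v = abs(eta_minus_P(v, a, b, a_alpha, b_alpha))

# The antipodal map zeta -> -1/conj(zeta) should exchange the two roots.
antipodal_gap = abs(v - (-1 / u.conjugate()))

# Factorization eta - P_alpha = (a - a_alpha)(zeta - u)(zeta - v) at a test point.
z = 0.3 - 0.8j
factor_gap = abs(eta_minus_P(z, a, b, a_alpha, b_alpha)
                 - (a - a_alpha) * (z - u) * (z - v))

if __name__ == "__main__":
    print(res_u, res_v, antipodal_gap, factor_gap)
```

The factorization check is the single-$\alpha$ version of $AB = \prod_\alpha (a - a_\alpha)$: multiplying the two products in $\rho_0\xi_0$ reproduces the right-hand side of Eq. (11).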

Fig. 1. The contour γα enclosing uα and vα

For $\zeta \neq \infty$ we can choose $\eta(\zeta)$ and $\chi = 2\log\frac{\rho}{\xi}$ as two coordinates on the moduli space $M_1$ holomorphic with respect to the complex structure defined by $\zeta$. The coordinates in the patch $\zeta \neq 0$ are related to these as

$$\eta' = \eta/\zeta^2, \qquad \chi' = \chi - 4\mu\eta/\zeta. \tag{15}$$

The second equation here follows from $\rho_0$ and $\xi_0$ being sections of $L^{\mu}(k)$ and $L^{-\mu}(k)$. In terms of these coordinates

$$\omega = d\eta \wedge d\chi = \zeta^2\,d\eta' \wedge d\chi'. \tag{16}$$
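The two statements above can be verified in a few lines. The following is a sketch that assumes $\chi$ is read as $2\log(\rho_0/\xi_0)$ and uses the transition functions $\zeta^{-k}e^{\mp\mu\eta/\zeta}$ of $L^{\pm\mu}(k)$ from $V_0$ to $V_1$ quoted in Sect. 2, together with the constraint of Eq. (11):

```latex
\begin{align*}
\rho_0' &= \zeta^{-k} e^{-\mu\eta/\zeta}\,\rho_0, \qquad
\xi_0' = \zeta^{-k} e^{+\mu\eta/\zeta}\,\xi_0, \\
\chi' &= 2\log\frac{\rho_0'}{\xi_0'}
       = 2\log\frac{\rho_0}{\xi_0} - \frac{4\mu\eta}{\zeta}
       = \chi - \frac{4\mu\eta}{\zeta}, \\
d\chi &= 2\,\frac{d\rho_0}{\rho_0} - 2\,\frac{d\xi_0}{\xi_0}
       = 4\,\frac{d\rho_0}{\rho_0} - 2\,d\log\prod_{\alpha}(\eta - P_\alpha(\zeta)), \\
d\eta \wedge d\chi &= 4\,d\eta \wedge \frac{d\rho_0}{\rho_0}
\qquad \text{(on a fiber of fixed } \zeta\text{)}.
\end{align*}
```

In the last step the product term drops out because, at fixed $\zeta$, it depends on $\eta$ only, so its differential is proportional to $d\eta$. The result agrees with Eq. (14).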

Following Ref. [9] we define an auxiliary function $\hat f$ and a contour $C$ by the equation

$$\oint_C \frac{d\zeta}{\zeta^j}\,\hat f = \oint_0 \frac{d\zeta}{\zeta^j}\,\chi + \oint_\infty \frac{d\zeta}{\zeta^j}\,\chi' = \left(\oint_0 + \oint_\infty\right)\frac{d\zeta}{\zeta^j}\,\chi - 4\mu\oint_\infty \frac{d\zeta}{\zeta^{j+1}}\,\eta \tag{17}$$

for any integer $j$. Here and in what follows the integrals $\oint_0$ and $\oint_\infty$ are taken along small positively oriented contours around respective points. This implies that in the first of these integrals the contour runs counterclockwise, while in the second one it runs clockwise. Substituting an explicit expression for $\chi$ we find

$$\oint_C \frac{d\zeta}{\zeta^j}\,\hat f = \sum_\alpha \oint_{\Gamma_\alpha} \frac{d\zeta}{\zeta^j}\,2\log\left(\eta(\zeta) - P_\alpha(\zeta)\right) + 4\mu\oint_0 \frac{d\zeta}{\zeta^{j+1}}\,\eta. \tag{18}$$

Here $\Gamma_\alpha$ is a figure-eight-shaped contour enclosing $u_\alpha$ and $v_\alpha$ (see Fig. 1). We define a function $G(\eta, \zeta)$ by $\partial G/\partial\eta = \hat f$. According to Ref. [9] the Legendre transform of the Kähler potential is given by

$$F(a, b) = \frac{1}{2\pi i}\oint_C \frac{d\zeta}{\zeta^2}\,G(\eta, \zeta). \tag{19}$$

Using Eq. (18) we find

$$F(a, b) = \frac{2\mu}{2\pi i}\oint_0 \frac{d\zeta}{\zeta^3}\,\eta^2 + \sum_{\alpha=1}^{k}\frac{1}{2\pi i}\oint_{\Gamma_\alpha} \frac{d\zeta}{\zeta^2}\,2\,(\eta - P_\alpha)\log(\eta - P_\alpha). \tag{20}$$

The Kähler potential $K$ is the Legendre transform of $F$:

$$K(a, \bar a, t, \bar t) = F - b\,(t + \bar t), \qquad \frac{\partial F}{\partial b} = t + \bar t. \tag{21}$$

It is a well known fact that the metric corresponding to Eq. (20) is the multi-Taub-NUT metric with $k$ centers [9,14]. This is in agreement with string theory predictions [7].


5. Moduli Space of Centered $n = 2$ Monopole

5.1. Twistor space $Z_2^0$ of centered $n = 2$ monopole. For $n = 2$ the moduli space $M_2$ is 8-dimensional and admits a triholomorphic $U(1)$ action. We define the centered moduli space $M_2^0$ to be the hyperkähler quotient of $M_2$ with respect to this $U(1)$ (at zero level). The $U(1)$ action on $M_2$ lifts to a $\mathbf{C}^*$ action on $Z_2$. It acts by $\rho_j \to \lambda\rho_j$, $\xi_j \to \lambda^{-1}\xi_j$. The corresponding moment map is $\eta_1$, as can be easily seen from the expression for $\omega$. Thus $Z_2^0$, the twistor space of $M_2^0$, is the $\mathbf{C}^*$ quotient of the subvariety $\eta_1 = 0 = \tilde\eta_1$ in $Z_2$. We first investigate one coordinate patch of $Z_2^0$. Let us denote $\psi_1 = \rho_0\xi_0$, $\psi_2 = \rho_1\xi_1$, $\psi_3 = \frac{1}{2}(\rho_0\xi_1 + \rho_1\xi_0)$, $\psi_4 = \frac{1}{2}(\rho_0\xi_1 - \rho_1\xi_0)$. The variables $\psi_i$ are invariant with respect to the $\mathbf{C}^*$ action and satisfy

$$\begin{aligned}
\psi_1\psi_2 &= \psi_3^2 - \psi_4^2, \\
\psi_1 - \eta_2\psi_2 + 2\sqrt{-\eta_2}\,\psi_3 &= \prod_\alpha \left(\sqrt{-\eta_2} - P_\alpha(\zeta)\right), \\
\psi_1 - \eta_2\psi_2 - 2\sqrt{-\eta_2}\,\psi_3 &= \prod_\alpha \left(-\sqrt{-\eta_2} - P_\alpha(\zeta)\right).
\end{aligned} \tag{22}$$
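The algebra behind Eq. (22) can be checked numerically. The sketch below (sample values and function names are illustrative assumptions) verifies three things: the quadratic relation among the $\psi_i$, their invariance under $\rho_j \to \lambda\rho_j$, $\xi_j \to \lambda^{-1}\xi_j$, and the fact that the left-hand sides of the last two equations in Eq. (22) are the product $(\rho_0 + \rho_1\eta)(\xi_0 + \xi_1\eta)$ from Eq. (8) evaluated at the two roots $\eta = \pm\sqrt{-\eta_2}$.

```python
import cmath

def invariants(rho0, rho1, xi0, xi1):
    """The C*-invariant combinations psi_1, ..., psi_4 defined in the text."""
    psi1 = rho0 * xi0
    psi2 = rho1 * xi1
    psi3 = 0.5 * (rho0 * xi1 + rho1 * xi0)
    psi4 = 0.5 * (rho0 * xi1 - rho1 * xi0)
    return psi1, psi2, psi3, psi4

def product_on_curve(eta, rho0, rho1, xi0, xi1):
    """(rho0 + rho1*eta)(xi0 + xi1*eta), the left side of Eq. (8) for n = 2."""
    return (rho0 + rho1 * eta) * (xi0 + xi1 * eta)

# Illustrative sample values (assumptions).
rho0, rho1 = 1.2 - 0.4j, -0.3 + 0.9j
xi0, xi1 = 0.6 + 1.1j, 0.2 - 0.5j
eta2 = 0.5 - 0.7j
lam = 1.7 + 0.6j  # an arbitrary nonzero element of C*

psi1, psi2, psi3, psi4 = invariants(rho0, rho1, xi0, xi1)
s = cmath.sqrt(-eta2)  # one root of eta^2 + eta_2 = 0

# Quadratic relation psi1*psi2 = psi3^2 - psi4^2.
quadric_gap = abs(psi1 * psi2 - (psi3 ** 2 - psi4 ** 2))

# Invariance under the C* action.
scaled = invariants(lam * rho0, lam * rho1, xi0 / lam, xi1 / lam)
invariance_gap = max(abs(p - q) for p, q in zip(scaled, (psi1, psi2, psi3, psi4)))

# Left-hand sides of Eq. (22) versus the product evaluated at eta = +s, -s.
plus_gap = abs(product_on_curve(s, rho0, rho1, xi0, xi1)
               - (psi1 - eta2 * psi2 + 2 * s * psi3))
minus_gap = abs(product_on_curve(-s, rho0, rho1, xi0, xi1)
                - (psi1 - eta2 * psi2 - 2 * s * psi3))

if __name__ == "__main__":
    print(quadric_gap, invariance_gap, plus_gap, minus_gap)
```

Note that $\psi_4$ does not enter the last two checks: it is constrained only through the quadratic relation.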

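These equations define a three-dimensional subvariety $U'$ in $\mathbf{C}^6$ with coordinates $(\zeta, \eta_2, \psi_1, \dots, \psi_4)$. Geometric invariant theory tells us that $Z_2^0$ can be obtained by gluing together two copies of $U'$ over $\zeta \neq 0, \infty$. The transition functions can be computed from Eq. (9):

$$\begin{aligned}
\tilde\zeta &= \zeta^{-1}, \qquad \tilde\eta_2 = \zeta^{-4}\eta_2, \\
\tilde\psi_1 &= \frac{\zeta^{-2k}}{2}\left[\psi_1 - \eta_2\psi_2 + \cos\gamma\,(\psi_1 + \eta_2\psi_2) - 2\psi_4\sqrt{\eta_2}\,\sin\gamma\right], \\
\tilde\psi_2 &= \frac{\zeta^{4-2k}}{2\eta_2}\left[-(\psi_1 - \eta_2\psi_2) + \cos\gamma\,(\psi_1 + \eta_2\psi_2) - 2\psi_4\sqrt{\eta_2}\,\sin\gamma\right], \\
\tilde\psi_3 &= \zeta^{2-2k}\,\psi_3, \\
\tilde\psi_4 &= \zeta^{2-2k}\left[(\psi_1 + \eta_2\psi_2)\,\frac{\sin\gamma}{2\sqrt{\eta_2}} + \psi_4\cos\gamma\right],
\end{aligned} \tag{23}$$

where $\gamma = 2\mu\sqrt{\eta_2}/\zeta$. From this explicit description of $Z_2^0$ one can see that for any $\zeta$ the fiber of $Z_2^0$ is a resolution of the $D_k$ singularity. Indeed, combining Eqs. (22) we see that the fiber of $U'$ over $\zeta$ is biholomorphic to a hypersurface in $\mathbf{C}^3$ (with coordinates $(\eta_2, \psi_2, \psi_4)$) given by

$$\psi_4^2 + \eta_2\psi_2^2 + \psi_2\,Q(\eta_2) - R(\eta_2)^2 = 0,$$

where $Q(\eta_2)$, $R(\eta_2)$ are polynomials in $\eta_2$ defined by

$$\begin{aligned}
2Q(\eta_2) &= \prod_\alpha \left(\sqrt{-\eta_2} - P_\alpha(\zeta)\right) + \prod_\alpha \left(-\sqrt{-\eta_2} - P_\alpha(\zeta)\right), \\
4\sqrt{-\eta_2}\,R(\eta_2) &= \prod_\alpha \left(\sqrt{-\eta_2} - P_\alpha(\zeta)\right) - \prod_\alpha \left(-\sqrt{-\eta_2} - P_\alpha(\zeta)\right).
\end{aligned} \tag{24}$$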
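The transition functions in Eq. (23) can be cross-checked against Eq. (9). The sketch below, with assumed sample values and hypothetical function names, specializes Eq. (9) to $n = 2$, $\eta_1 = 0$, evaluates both sides at the two roots $\eta = \pm s$ with $s^2 = -\eta_2$ (so that $\tilde\eta = \eta/\zeta^2$), solves for $\tilde\rho_j$, $\tilde\xi_j$, forms the invariants $\tilde\psi_i$, and compares them with the closed formulas of Eq. (23). The branch of the square root is a convention of the sketch; the formulas in Eq. (23) are even in $\sqrt{\eta_2}$, so the comparison does not depend on it.

```python
import cmath

def invariants(rho0, rho1, xi0, xi1):
    """psi_1, ..., psi_4 as defined before Eq. (22)."""
    return (rho0 * xi0,
            rho1 * xi1,
            0.5 * (rho0 * xi1 + rho1 * xi0),
            0.5 * (rho0 * xi1 - rho1 * xi0))

def glue_sections(zeta, eta2, rho0, rho1, xi0, xi1, mu, k):
    """Solve Eq. (9) for n = 2, eta_1 = 0 at the two roots eta = +s and eta = -s."""
    s = 1j * cmath.sqrt(eta2)      # s^2 = -eta2
    s_t = s / zeta ** 2            # eta-tilde = zeta^-2 * eta
    x = mu * s / zeta
    pref = zeta ** (-k)
    rho_p = cmath.exp(-x) * pref * (rho0 + rho1 * s)   # rho~0 + rho~1 * s_t
    rho_m = cmath.exp(+x) * pref * (rho0 - rho1 * s)   # rho~0 - rho~1 * s_t
    xi_p = cmath.exp(+x) * pref * (xi0 + xi1 * s)
    xi_m = cmath.exp(-x) * pref * (xi0 - xi1 * s)
    return ((rho_p + rho_m) / 2, (rho_p - rho_m) / (2 * s_t),
            (xi_p + xi_m) / 2, (xi_p - xi_m) / (2 * s_t))

def eq23(zeta, eta2, psi, mu, k):
    """Closed-form transition functions of Eq. (23)."""
    psi1, psi2, psi3, psi4 = psi
    r = cmath.sqrt(eta2)
    g = 2 * mu * r / zeta
    c, sn = cmath.cos(g), cmath.sin(g)
    common = c * (psi1 + eta2 * psi2) - 2 * psi4 * r * sn
    t1 = zeta ** (-2 * k) / 2 * ((psi1 - eta2 * psi2) + common)
    t2 = zeta ** (4 - 2 * k) / (2 * eta2) * (-(psi1 - eta2 * psi2) + common)
    t3 = zeta ** (2 - 2 * k) * psi3
    t4 = zeta ** (2 - 2 * k) * ((psi1 + eta2 * psi2) * sn / (2 * r) + psi4 * c)
    return t1, t2, t3, t4

def max_mismatch(zeta, eta2, rho0, rho1, xi0, xi1, mu, k):
    """Largest difference between Eq. (23) and the invariants built from Eq. (9)."""
    direct = invariants(*glue_sections(zeta, eta2, rho0, rho1, xi0, xi1, mu, k))
    closed = eq23(zeta, eta2, invariants(rho0, rho1, xi0, xi1), mu, k)
    return max(abs(p - q) for p, q in zip(direct, closed))

if __name__ == "__main__":
    # Illustrative sample point (assumed values).
    print(max_mismatch(0.8 + 0.3j, 0.5 - 0.7j,
                       1.2 - 0.4j, -0.3 + 0.9j, 0.6 + 1.1j, 0.2 - 0.5j,
                       mu=0.9, k=3))
```

The hypersurface equation follows in the same spirit: Eq. (24) gives $\psi_1 - \eta_2\psi_2 = Q$ and $\psi_3 = R$, and substituting these into $\psi_1\psi_2 = \psi_3^2 - \psi_4^2$ yields $\psi_4^2 + \eta_2\psi_2^2 + \psi_2 Q - R^2 = 0$.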


Furthermore, these formulas imply that if all points $p_1, \dots, p_k$ are distinct, the manifold $M_2^0$ is a smooth complex manifold in any of its complex structures. Since the 2-form $\omega$ is smooth as well, we conclude that $M_2^0$ is a smooth hyperkähler manifold. The smoothness of $M_2^0$ is also in agreement with string theory predictions. Indeed, as explained in Ref. [7], the space $M_2^0$ is the Coulomb branch of $N = 4$, $D = 3$ $SU(2)$ gauge theory with $k$ fundamental hypermultiplets, with $p_\alpha$ being hypermultiplet masses. When $p_\alpha$ are all distinct, the theory has no Higgs branch, and therefore the Coulomb branch is smooth everywhere. When some masses become equal, the Higgs branch emerges, and the Coulomb branch develops an orbifold singularity at the point where it meets the Higgs branch. Thus we expect that when some of $p_\alpha$ coincide, or equivalently, when some of $\ell_\alpha$ are bigger than 1, the manifold $M_2^0$ has orbifold singularities.

In Ref. [5] the same manifold $Z_2^0$ arose as the twistor space of the moduli space of a system of ordinary differential equations (so-called Nahm equations). This is of course a consequence of a general correspondence between solutions of Bogomolny equations and Nahm equations [15,12,11]. Thus Ref. [5] provides an equivalent construction of $D_k$ ALF metrics.

5.2. Real holomorphic sections of $Z_2^0$. The discussion of Sect. 2 implies that a real holomorphic section of the uncentered twistor space $Z_2$ is a triplet $(S, \rho, \xi)$, where $S$ is the spectral curve in $\mathbf{T}$ given by $\eta^2 + \eta_1\eta + \eta_2 = 0$, and $\rho$ and $\xi$ are holomorphic sections of $L^{\mu}(k)|_S$ and $L^{-\mu}(k)|_S$ satisfying the condition (iii) of Sect. 2. Then, as explained in Sect. 3, the real holomorphic sections of $Z_2^0$ are obtained by setting $\eta_1 = 0$ and modding by the $\mathbf{C}^*$ action $\rho \to \lambda\rho$, $\xi \to \lambda^{-1}\xi$. In this subsection we find the explicit form of the real holomorphic sections of $Z_2^0$. The curve $\eta^2 + \eta_2 = 0$ is either elliptic or a union of two $\mathbf{CP}^1$'s. The former case is generic, while the latter occurs at a submanifold of the moduli space. Intuitively the latter case corresponds to the situation when the two nonabelian monopoles are on top of each other. It suffices to consider the elliptic case. By an $SO(3)$ rotation

$$\zeta = \frac{a\tilde\zeta + b}{-\bar b\tilde\zeta + \bar a}, \qquad \eta = \frac{\tilde\eta}{(-\bar b\tilde\zeta + \bar a)^2}, \qquad |a|^2 + |b|^2 = 1, \tag{25}$$

we can always bring the elliptic curve $\eta^2 = -\eta_2(\zeta)$ to the form

$$\tilde\eta^2 = 4k_1^2\left(\tilde\zeta^3 - 3k_2\tilde\zeta^2 - \tilde\zeta\right), \qquad k_1 > 0, \quad k_2 \in \mathbf{R}. \tag{26}$$

It follows that the discriminant $\Delta > 0$, and therefore the lattice defined by the curve $S$ is rectangular. We denote this lattice $\Theta$ and its real and imaginary periods by $2\omega$ and $2\omega'$, respectively. We parametrized $S$ by five real parameters: the Euler angles of the $SO(3)$ rotation and a pair of real numbers $k_1$ and $k_2$. We will see in a moment that the condition (iii) imposes one real constraint on them, so we will obtain a four-parameter family of real sections, as required. To write explicitly a section of $L^{\mu}(k)|_S$, we will use the standard "flat" parameter $u$ on the elliptic curve, defined modulo $\Theta$, in terms of which $\tilde\eta = k_1\wp'(u)$, $\tilde\zeta = \wp(u) + k_2$. Here $\wp(u)$ is the Weierstrass elliptic function. In terms of $u$ the real structure acts by $u \to -u + \omega + \omega'$.
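The positivity of the discriminant can be made explicit. The sketch below assumes the parametrization $\tilde\eta = k_1\wp'(u)$, $\tilde\zeta = \wp(u) + k_2$ together with the standard normalization $\wp'^2 = 4\wp^3 - g_2\wp - g_3$ and $\Delta = g_2^3 - 27g_3^2$. Under these assumptions Eq. (26) corresponds to $g_2 = 4(3k_2^2 + 1)$ and $g_3 = 4(2k_2^3 + k_2)$, and the discriminant simplifies to $16(9k_2^2 + 4)$, which is positive for every real $k_2$. The function names are illustrative; exact rational arithmetic is used so the identities are checked without rounding.

```python
from fractions import Fraction

def weierstrass_invariants(k2):
    """g2, g3 obtained by shifting zeta~ = p + k2 in 4(zeta~^3 - 3 k2 zeta~^2 - zeta~)."""
    g2 = 4 * (3 * k2 ** 2 + 1)
    g3 = 4 * (2 * k2 ** 3 + k2)
    return g2, g3

def cubic_in_zeta(z, k2):
    """Right-hand side of Eq. (26) divided by k1^2."""
    return 4 * (z ** 3 - 3 * k2 * z ** 2 - z)

def cubic_in_p(p, k2):
    """Weierstrass normal form 4 p^3 - g2 p - g3."""
    g2, g3 = weierstrass_invariants(k2)
    return 4 * p ** 3 - g2 * p - g3

def discriminant(k2):
    """Delta = g2^3 - 27 g3^2 for the cubic above."""
    g2, g3 = weierstrass_invariants(k2)
    return g2 ** 3 - 27 * g3 ** 2

if __name__ == "__main__":
    for k2 in (Fraction(-5, 3), Fraction(0), Fraction(3, 7), Fraction(4)):
        for p in (Fraction(-2), Fraction(1, 5), Fraction(9, 4)):
            # The shift removes the quadratic term, as required for the normal form.
            assert cubic_in_zeta(p + k2, k2) == cubic_in_p(p, k2)
        assert discriminant(k2) == 16 * (9 * k2 ** 2 + 4)
        print(k2, discriminant(k2))
```

A positive discriminant means the cubic has three distinct real roots, which is the usual criterion for a rectangular period lattice.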


A section of $L^{\mu}(k)|_S$ can be thought of as a pair of functions $f_1, f_2$ on $S$ such that $f_1$ is holomorphic everywhere except $\zeta = \infty$, $f_2$ is holomorphic everywhere except $\zeta = 0$, and for $\zeta \neq 0, \infty$

$$f_2(\zeta) = \zeta^{-k}\exp(-\mu\eta/\zeta)\,f_1(\zeta).$$

The point $\zeta = \infty$ corresponds to two points $u_\infty, -u_\infty$ on $S$ defined by $\wp(u_\infty) + k_2 = \bar a/\bar b$. Furthermore, condition (iii) implies that the divisor of $f_1$ is $Q^+$. Let us recall that $Q = Q^+ \cup Q^- = \cup_\alpha Q_\alpha$, where $Q_\alpha = S \cap P_\alpha$, $\alpha = 1, \dots, k$. Thus $Q_\alpha$ consists of solutions of a system of two equations $\eta = P_\alpha(\zeta)$, $\eta^2 = -\eta_2(\zeta)$. Obviously, this defines four points on the elliptic curve $S$. Because of real structure, these four points split into two pairs whose members are interchanged by $\tau$. $Q^+$ includes one point from each pair (for all $\alpha$), $Q^-$ includes the rest. There is a $4^m$-fold ambiguity involved in the splitting $Q = Q^+ \cup Q^-$. It can be fixed, in principle, by requiring that the normal bundle of every section of the twistor space in the family that we are considering is $O(1) \oplus O(1)$. Let us denote the "flat" coordinates of points in $Q^+$ by $u_\alpha, u'_\alpha$, $\alpha = 1, \dots, k$, and those in $Q^-$ by $v_\alpha, v'_\alpha$, $\alpha = 1, \dots, k$. By definition, $v_\alpha = -u_\alpha + \omega + \omega' \ (\mathrm{mod}\ \Theta)$, $v'_\alpha = -u'_\alpha + \omega + \omega' \ (\mathrm{mod}\ \Theta)$. We fix the mod $\Theta$ ambiguity by requiring that $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$ is in the fundamental rectangle of $\Theta$. In this notation a section of $L^{\mu}(k)|_S$ is given by

$$f_1 \sim \exp\left(-\mu k_1\left(\zeta_W(u + u_\infty) + \zeta_W(u - u_\infty)\right) + Cu\right)\prod_\alpha \frac{\sigma(u - u_\alpha)\,\sigma(u - u'_\alpha)}{\sigma(u - u_\infty)\,\sigma(u + u_\infty)}. \tag{27}$$

Here $\zeta_W(u)$ and $\sigma(u)$ are Weierstrass quasielliptic functions (we denote the Weierstrass $\zeta$-function by $\zeta_W(u)$ to avoid confusion with the affine coordinate $\zeta$ on the $\mathbf{P}^1$ of complex structures), and $C$ is a constant. Similarly, a section of $L^{-\mu}(k)|_S$ with the divisor $Q^-$ is represented by a pair of functions $g_1, g_2$ related by $g_2(\zeta) = \zeta^{-k}\exp(\mu\eta/\zeta)\,g_1(\zeta)$. Explicitly $g_1$ is given by

$$g_1 \sim \exp\left(\mu k_1\left(\zeta_W(u + u_\infty) + \zeta_W(u - u_\infty)\right) + Du\right)\prod_\alpha \frac{\sigma(u - v_\alpha)\,\sigma(u - v'_\alpha)}{\sigma(u - u_\infty)\,\sigma(u + u_\infty)}, \tag{28}$$

where $D$ is another constant. In general $f_1$ and $g_1$ are quasiperiodic with periods $2\omega$ and $2\omega'$. The condition (iii) is equivalent to asking that $f_1$ and $g_1$ be doubly periodic. One can see that the latter can be achieved by adjusting $C$ and $D$ if and only if

$$2\mu k_1 + \sum_\alpha (u_\alpha + u'_\alpha) \in \Theta, \qquad 2\mu k_1 - \sum_\alpha (v_\alpha + v'_\alpha) \in \Theta. \tag{29}$$

Recalling that $k_1$ is real and positive, we conclude that there exist integers $m, m', p, p'$ and a real number $x \in (0, 2\omega]$ such that $\sum_\alpha (u_\alpha + u'_\alpha) = -x + 2m\omega + 2m'\omega'$, $\sum_\alpha (v_\alpha + v'_\alpha) = x + 2p\omega + 2p'\omega'$. Then Eqs. (29) together with the condition (iv) imply $2\mu k_1 = x$. Then for $f_1$ and $g_1$ to be doubly periodic one has to set

$$C = 2m\,\zeta_W(\omega) + 2m'\,\zeta_W(\omega'), \qquad D = 2p\,\zeta_W(\omega) + 2p'\,\zeta_W(\omega'). \tag{30}$$


Let us notice for future use that

$$\begin{aligned} \log f_1(u + \omega) - \log f_1(u) &= -2\pi i\, m', & \log f_1(u + \omega') - \log f_1(u) &= 2\pi i\, m, \\ \log g_1(u + \omega) - \log g_1(u) &= -2\pi i\, p', & \log g_1(u + \omega') - \log g_1(u) &= 2\pi i\, p. \end{aligned} \tag{31}$$

Equation (30) is a transcendental equation on $k_1$, $k_2$, and the SO(3) rotation required to bring $S$ to the standard form Eq. (26). It reduces the number of real parameters in the equation of the curve from 5 to 4. Thus we have a four-parameter family of real sections of $Z_2^0$.

5.3. The Kähler potential of the centered n = 2 moduli space. Having found a four-parameter family of real holomorphic sections of $Z_2^0$ we now would like to compute the corresponding hyperkähler metric. Since $Z_2^0$ has an intermediate holomorphic projection on $O(4)$, we can use the method of Ref. [9] to write down the Legendre transform of the Kähler potential. The existence of the projection is equivalent to saying that $\eta_2$ is a holomorphic coordinate on $Z_2^0$. The holomorphic 2-form $\omega$ in the patch $\zeta \neq \infty$ can be written as

$$\omega = d\eta_2 \wedge d\Big( \sum_{\text{branches}} \frac{1}{\eta} \log \frac{f_1}{g_1} \Big) \equiv d\eta_2 \wedge d\chi.$$

Here $f_1$ and $\eta = \sqrt{-\eta_2}$ are regarded as double-valued functions of $\zeta \in \mathbb{P}^1 \backslash \{\zeta = \infty\}$, and the sum is over the two branches of the cover $S \to \mathbb{P}^1$. Similarly, in the patch $\zeta \neq 0$ we can write $\omega' = d\eta_2' \wedge d\chi'$. On the overlap we have the relations

$$\omega' = \zeta^{-2}\omega, \qquad \eta_2' = \zeta^{-4}\eta_2, \qquad \chi' = \zeta^2\chi - 4\mu\zeta. \tag{32}$$

Following Ref. [9], we would like to find a (multi-valued) function $\hat f(\eta, \zeta)$ and a contour $C$ on the double cover $S \to \mathbb{P}^1$ such that

$$\oint_C \frac{d\zeta}{\zeta^j}\, \hat f(\eta, \zeta) = \oint_0 \frac{d\zeta}{\zeta^{j-2}}\, \chi + \oint_\infty \frac{d\zeta}{\zeta^j}\, \chi'$$

for any integer $j$. Here the contours of integration on the RHS are small positively oriented loops around $\zeta = 0$ and $\zeta = \infty$. To find $\hat f$ we substitute the explicit expressions for $\chi$ and $\chi'$ and rewrite the integral on the RHS as an integral in the $u$-plane. Then the RHS becomes

$$\oint \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}, \tag{33}$$

where the contour in the first integral consists of four small positively oriented loops around four preimages of the points $\zeta = 0$ and $\zeta = \infty$ in the fundamental rectangle of the lattice $\Theta$. We denote these points $u_0$, $u'_0 = 2(\omega + \omega') - u_0$, $u_\infty$, $u'_\infty = 2(\omega + \omega') - u_\infty$. Besides these four points the only other branch points of $\log f_1(u)/g_1(u)$ in

Fig. 2. Integration contours in Eq. (33). Only one of the contours $A_\alpha$ and one of the contours $A'_\alpha$ are shown

the fundamental rectangle are $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$, $\alpha = 1, \ldots, k$. As for $\zeta(u)$, it is elliptic. Then we can rewrite Eq. (33) as

$$\oint_{\mathrm{bdry}} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + \sum_\alpha \oint_{A_\alpha + A'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \frac{f_1(u)}{g_1(u)} + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}, \tag{34}$$

where the contour in the first integral runs along the boundary of the fundamental rectangle, while $A_\alpha$ and $A'_\alpha$ enclose the pairs of points $u_\alpha, v_\alpha$ and $u'_\alpha, v'_\alpha$, respectively (see Fig. 2). Using Eqs. (31) the integral over the boundary can be simplified to

$$-2\pi i \oint_{(m-p,\, m'-p')} \frac{du}{k_1}\, \zeta(u)^{-j+2},$$

where the contour $(m - p, m' - p')$ winds $m - p$ times around the real cycle and $m' - p'$ times around the imaginary cycle. Recalling the explicit form of $f_1(u)$ and $g_1(u)$, we can rewrite the integral over $A_\alpha + A'_\alpha$ as

$$\oint_{B_\alpha + B'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log \sigma(u - u_\alpha)\,\sigma(u - u'_\alpha)\,\sigma(u - v_\alpha)\,\sigma(u - v'_\alpha), \tag{35}$$

where the contours $B_\alpha$ and $B'_\alpha$ are figure-eight-shaped contours shown in Fig. 3. On the other hand, it can be easily seen that

$$\eta(u) - P_\alpha(\zeta(u)) \sim e^{u(C+D)}\, \frac{\sigma(u - u_\alpha)\,\sigma(u - u'_\alpha)\,\sigma(u - v_\alpha)\,\sigma(u - v'_\alpha)}{\sigma(u - u_\infty)^2\, \sigma(u + u_\infty)^2}.$$

Fig. 3. The contours $B_\alpha$ and $B'_\alpha$

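Since neither $u_\infty$ nor $u'_\infty$ is enclosed by the contour $B_\alpha + B'_\alpha$, the integral Eq. (35) is equal to

$$\oint_{B_\alpha + B'_\alpha} \frac{du}{k_1}\, \zeta(u)^{-j+2} \log\big(\eta(u) - P_\alpha(\zeta(u))\big).$$

Collecting all of this together we get

$$\oint_C \frac{d\zeta}{\zeta^j}\, \hat f(\eta, \zeta) = -2\pi i \oint_{(m-p,\, m'-p')} \frac{d\zeta}{\eta}\, \zeta^{-j+2} + \sum_\alpha \oint_{C_\alpha + C'_\alpha} \frac{d\zeta}{\eta}\, \zeta^{-j+2} \log(\eta - P_\alpha(\zeta)) + 4\mu \oint_0 \frac{d\zeta}{\zeta^{j-1}}. \tag{36}$$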

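Here all the functions are regarded as functions on the double cover of the $\zeta$-plane, and the contours $C_\alpha$, $C'_\alpha$ are the images of $B_\alpha$, $B'_\alpha$ under the map $u \mapsto \zeta$. We now define a function $G(\eta, \zeta)$ by $\partial G/\partial\eta = -2\eta\zeta^{-2}\hat f$. According to Ref. [9] the Legendre transform of the Kähler potential is given by

$$F = \frac{1}{2\pi i} \oint \frac{d\zeta}{\zeta^2}\, G(\eta, \zeta).$$

Hence we can read off $F$:

$$F = -\frac{1}{2\pi i} \oint_0 d\zeta\, \frac{4\mu\eta^2}{\zeta^3} + \oint_{(m-p,\, m'-p')} d\zeta\, \frac{2\eta}{\zeta^2} - \sum_\alpha \frac{1}{2\pi i} \oint_{C_\alpha + C'_\alpha} \frac{d\zeta}{\zeta^2}\, 2(\eta - P_\alpha(\zeta)) \log(\eta - P_\alpha(\zeta)). \tag{37}$$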

$F$ may be regarded as a function of the coefficients of $\eta_2(\zeta) = z + v\zeta + w\zeta^2 - \bar v\zeta^3 + \bar z\zeta^4$. Since $w$ is real, $F$ depends on 5 real parameters. These parameters are subject to one transcendental constraint expressed by Eq. (30). (This constraint implies $\partial F/\partial w = 0$.) Thus we may think of $w$ as an implicit function of $z$ and $v$. The Kähler potential $K(z, \bar z, u, \bar u)$ is the Legendre transform of $F$:

$$K(z, \bar z, u, \bar u) = F(z, \bar z, v, \bar v, w) - uv - \bar u \bar v, \qquad \frac{\partial F}{\partial v} = u, \quad \frac{\partial F}{\partial \bar v} = \bar u.$$

Equation (37) agrees with a conjecture by Chalmers [16]. We already saw in Sect. 5 that $M_2^0$ is a resolution of the $D_k$ singularity. Now we can check that it is ALF. To this end we take the limit $k_1 \to +\infty$. Equation (30) implies that in this limit $\omega \to \infty$, while $\omega'$ stays finite. Thus the curve $S$ degenerates: $\eta_2(\zeta) \to -(P(\zeta))^2$, where $P(\zeta)$ is a real section of $T$. It is easy to see that in this limit $F$ reduces to the Taub-NUT form (see Sect. 4):

$$F \sim \frac{1}{2\pi i} \oint_0 d\zeta \left( -\frac{4\mu P(\zeta)^2}{\zeta^3} + K\, \frac{P(\zeta) \log P(\zeta)}{\zeta^2} \right), \tag{38}$$

where $K$ is an integer depending on the limiting behavior of $u_\alpha, u'_\alpha, v_\alpha, v'_\alpha$. Therefore asymptotically the metric on $M_2^0$ has the Taub-NUT form. With some more work it should be possible to compute the integer $K$ as well. Note also that if we set $\mu = 0$, then the metric becomes ALE. Kronheimer proved [2] that the $D_k$ ALE metric is essentially unique. Thus we have obtained the Legendre transform of the Kähler potential for the $D_k$ metrics of Ref. [1]. It would be interesting to obtain a similar representation for the $E_k$ ALE metrics.

References

1. Kronheimer, P.B.: The Construction of ALE Spaces as Hyper-Kähler Quotients. J. Differ. Geom. 29, 665–683 (1989)
2. Kronheimer, P.B.: A Torelli-type theorem for gravitational instantons. J. Differ. Geom. 29, 685–697 (1989)
3. Gibbons, G.W. and Hawking, S.W.: Gravitational Multi-instantons. Phys. Lett. B 78, 430–432 (1978)
4. Sen, A.: A Note on Enhanced Gauge Symmetries in M- and String Theory. JHEP 09, 1 (1997). hep-th/9707123
5. Cherkis, S.A. and Kapustin, A.: Dk Gravitational Instantons and Nahm Equations. hep-th/9803112
6. Kronheimer, P.B.: Monopoles and Taub-NUT Metrics. M. Sc. Thesis, Oxford, 1985
7. Cherkis, S.A. and Kapustin, A.: Singular Monopoles and Supersymmetric Gauge Theories in Three Dimensions. hep-th/9711145
8. Lindström, U. and Roček, M.: Commun. Math. Phys. 115, 21 (1988)
9. Ivanov, I.T. and Roček, M.: Supersymmetric Sigma Models, Twistors, and the Atiyah–Hitchin Metric. Commun. Math. Phys. 182, 291–302 (1996)
10. Hitchin, N.J.: Monopoles and Geodesics. Commun. Math. Phys. 83, 579–602 (1982)
11. Hurtubise, J. and Murray, M.K.: On the Construction of Monopoles for the Classical Groups. Commun. Math. Phys. 122, 35–89 (1989)
12. Hitchin, N.J.: On the Construction of Monopoles. Commun. Math. Phys. 89, 145–190 (1983)
13. Atiyah, M. and Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton Univ. Press, 1988
14. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Hyperkähler Metrics and Supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
15. Nahm, W.: Self-dual monopoles and calorons. In: Lecture Notes in Physics 201, G. Denardo et al. (eds.), Berlin–Heidelberg–New York: Springer, 1984
16. Chalmers, G.: The Implicit Metric on a Deformation of the Atiyah–Hitchin Manifold. hep-th/9709082; Multi-monopole Moduli Spaces for SU(N) Gauge Group. hep-th/9605182

Communicated by G. Felder

Commun. Math. Phys. 203, 729 – 741 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Structure of Shocks in Burgers Turbulence with Stable Noise Initial Data Jean Bertoin Laboratoire de Probabilités, Université Pierre et Marie Curie, 4, Place Jussieu, F-75252 Paris Cedex 05, France. E-mail: [email protected] Received: 28 September 1998 / Accepted: 12 January 1999

Abstract: Burgers equation can be used as a simplified model for hydrodynamic turbulence. The purpose of this paper is to study the structure of the shocks for the inviscid equation in dimension 1 when the initial velocity is given by a stable Lévy noise with index α ∈ (1/2, 2]. We prove that Lagrangian regular points exist (i.e. there are fluid particles that have not participated in shocks at any time between 0 and t) if and only if α ≤ 1 and the noise is not completely asymmetric, and that otherwise the shock structure is discrete. Moreover, in the Cauchy case α = 1, we show that there are no rarefaction intervals, i.e. at time t > 0, there are fluid particles in any non-empty open interval.

1. Introduction

Burgers has introduced the equation

$$\partial_t u + \partial_x (u^2/2) = \varepsilon\, \partial^2_{xx} u$$

as a simple model of hydrodynamic turbulence for compressible fluids, where the parameter ε > 0 describes the viscosity of the fluid, and the solution is meant to represent the velocity of a fluid particle located at x at time t. Roughly, when the viscosity tends to 0, the dynamics of the system of particles corresponds to completely inelastic shocks, in the sense that if two (clumps of) particles collide at a given time, then they form a larger clump of particles in such a way that mass and momentum are preserved. Although it is known that this is not an accurate model for turbulence, Burgers equation is still widely used in physical problems such as, for instance, the study of shock wave formation in compressible fluids, or that of the formation of large clusters in the universe, or also as a simplified version of more elaborate models of turbulence (e.g. the Navier–Stokes equation). To present this work as simply as possible, it is convenient to use the fluid particles picture that has just been sketched, describe first informally results in this setting, and postpone the mathematical rigor to the next sections.
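The particle picture sketched above (completely inelastic collisions that preserve mass and momentum) can be illustrated for finitely many clumps. The following is only a minimal sketch under that finite-particle assumption; the function name `evolve_sticky` and the tuple layout `(position, velocity, mass)` are hypothetical choices made for the illustration and do not come from the paper.

```python
def evolve_sticky(clumps, t_end):
    """Evolve (position, velocity, mass) clumps on the line up to time t_end.

    Hypothetical helper: colliding clumps merge into one clump whose mass and
    momentum are the sums of those of the colliding clumps.
    """
    clumps = sorted(clumps)  # order clumps by position
    t = 0.0
    while True:
        # Earliest collision time among adjacent clumps that approach each other.
        best_dt, best_i = None, None
        for i in range(len(clumps) - 1):
            (x1, v1, _), (x2, v2, _) = clumps[i], clumps[i + 1]
            if v1 > v2:
                dt = max(0.0, (x2 - x1) / (v1 - v2))
                if best_dt is None or dt < best_dt:
                    best_dt, best_i = dt, i
        if best_dt is None or t + best_dt > t_end:
            # No further collision before t_end: free transport.
            return [(x + v * (t_end - t), v, m) for x, v, m in clumps]
        # Move every clump to the collision time, then merge the colliding pair.
        clumps = [(x + v * best_dt, v, m) for x, v, m in clumps]
        t += best_dt
        (x1, v1, m1), (_, v2, m2) = clumps[best_i], clumps[best_i + 1]
        merged = (x1, (m1 * v1 + m2 * v2) / (m1 + m2), m1 + m2)
        clumps[best_i:best_i + 2] = [merged]


# Three particles with zero total momentum end up in a single clump at rest.
final = evolve_sticky([(0.0, 2.0, 1.0), (1.0, 0.0, 1.0), (3.0, -1.0, 2.0)], 2.0)
```

The quadratic search for the next collision is chosen for readability rather than speed; it is adequate for small illustrative examples only.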


There is an abundant literature on the inviscid Burgers equation (that is the limit as the viscosity ε goes to 0) in dimension 1, with random initial data. See in particular [1, 2,4,6,7,12,15,16,18–20] and references therein. An interesting problem in this field is to obtain qualitative results on the shock structure. To that end, recall that a so-called Lagrangian regular point at time t can be viewed as the initial location of a particle that has not participated in shocks induced by the turbulence at any time between 0 and t, and that a rarefaction interval is an interval that contains no fluid particles at time t. Sinai [19] has proven that when the initial velocity is given by a Brownian motion, then the set of Lagrangian regular points has Hausdorff dimension 1/2 and that there are no rarefaction intervals. When the initial velocity is a Gaussian white noise, Avellaneda and E [1] have shown that the shock structure is discrete, in the sense that at time t > 0, there are no Lagrangian regular points and only finitely many clumps of particles are left in a given compact set. Quite recently, numerical simulations led Janicki and Woyczynski [12] to the conjecture that when the initial velocity is a stable Lévy process of index α ∈ (1, 2], the Hausdorff dimension of Lagrangian regular points is 1/α (this conjecture has been proven mathematically in [4] when the Lévy process has no positive jumps). We consider here the case when the initial velocity is given by a stable Lévy noise. Specifically, if we introduce the initial potential ψ(·, 0), which is formally defined by ∂x ψ(x, 0) = −u(x, 0), then the process ψ(·, 0) has independent and homogeneous increments and its one-dimensional distributions are stable laws with index α ∈ (1/2, 2]. This situation naturally appears as a limit in a large class of renormalized potentials, see [5]. 
It is easy to show that in this framework, Lagrangian regular points are exceptional, in the sense that for each fixed point x ∈ R, the probability that x is regular is always zero. We will prove a much more precise result. For α > 1, and also for 1/2 < α < 1 if the noise is completely asymmetric (that is if the initial potential is a monotone increasing or decreasing process), the shock structure is discrete a.s. Nonetheless, if α ∈ (1/2, 1] and if the noise is not completely asymmetric, then a.s. there are Lagrangian regular points. The Cauchy case α = 1 is especially interesting from the mathematical point of view. We will show that it is the only one for which there are no rarefaction intervals, that is every non-empty open interval contains fluid particles at time t > 0. Informally, this study suggests that for α > 1, the shocks induced by Burgers turbulence are numerous and strong enough to involve every single fluid particle at any time t > 0 and to create only finitely many clusters on any given compact interval. For α ∈ (1/2, 1], the initial data is not as rough. However in the completely asymmetric case, the monotonicity of the initial potential implies that all the particles are moving in the same direction, and this explains why again the shock structure is discrete. On the other hand, when the noise is not completely asymmetric, the monotonicity is lost and thanks to compensations that occur when clumps of particles with opposite velocity collide, some exceptional particles are not involved in the turbulence. The rest of this paper is organized as follows. The next section is devoted to the formal presentation of the notions. Our results are then stated and proven in Sect. 3. In the case α ∈ (1/2, 1], the proofs essentially rely on known sample path properties of stable Lévy processes which have been obtained in the 70’s by Fristedt, Hawkes, Monrad and Silverstein. 
The argument to establish that the shock structure is discrete when α ∈ (1, 2] is less direct; it requires some material on fluctuation theory for Lévy processes.


2. Preliminary

2.1. Some basic features on Burgers equation. In this subsection, we review some classical material on the inviscid Burgers equation that can be found for instance in [18] or [19]. From the works of Hopf [11] and Cole [8], it is known that given an initial velocity, Burgers equation with viscosity ε > 0 possesses a unique solution u_ε, and that u_ε converges as ε → 0+ to a solution u_0 = u to the inviscid equation, which is usually referred to as the Hopf–Cole (or entropic) solution. The Hopf–Cole solution has a simple expression in terms of potential functions. If we introduce ψ by u = −∂_x ψ, then the potential at time t is expressed in terms of the Legendre transform of the function a → ψ(a, 0) − a²/2t:

$$\psi(x, t) = \sup_{a \in \mathbb{R}} \left\{ \psi(a, 0) - \frac{(x - a)^2}{2t} \right\}. \tag{1}$$

Of course, we implicitly supposed that

$$\psi(a, 0) = o(a^2) \quad \text{as } |a| \to \infty, \tag{2}$$

so that the quantity in (1) is finite. Note also that the formula (1) makes sense whenever the initial velocity u(·, 0) = −∂_x ψ(·, 0) is the derivative (in the sense of Schwartz) of a function. To that end, we shall merely assume that the initial potential ψ(·, 0) has only discontinuities of the first kind, i.e. there exist left and right limits at each point; and it will then be convenient to work with the version that is right-continuous. For the sake of simplicity, we shall focus on time t = 1 in the sequel. Of course, our results are valid at any positive time; this can be easily checked by a simple scaling argument.

The structure of the shocks in Burgers turbulence is conveniently described in terms of the inverse Lagrangian function which we now introduce. We denote by a(x) the largest location a at which the supremum in (1) is reached, i.e.

$$a(x) = \sup\left\{ a \in \mathbb{R} : \sup_{b \ge a} \left( \psi(b, 0) - \frac{(x - b)^2}{2} \right) = \psi(x, 1) \right\}.$$

We will frequently make use of the fact that, as ψ(·, 0) is continuous to the right and possesses limits to the left, one has for all a ∈ R,

$$\psi(a(x), 0) - (x - a(x))^2/2 \ge \psi(a, 0) - (x - a)^2/2$$

when ψ(·, 0) is continuous or makes an upwards jump at a(x), whereas

$$\psi(a(x)-, 0) - (x - a(x))^2/2 \ge \psi(a, 0) - (x - a)^2/2$$

when ψ(·, 0) makes a downwards jump at a(x). We stress that the inverse Lagrangian function x → a(x) is right-continuous and increasing. Its right-continuous inverse a → x(a), which is given by

$$x(a) = \inf\{ y \in \mathbb{R} : a(y) > a \},$$

is called the Lagrangian function; alternatively, it can be viewed as the (right) derivative of the convex hull of the function a → −ψ(a, 0) + a²/2. From the point of view of hydrodynamic turbulence, the Lagrangian function describes the position at time 1 of the fluid particle initially located at a. We see that if a discontinuity of the inverse Lagrangian function occurs at some point x, i.e.

$$\lim_{y \to x-} a(y) := a(x-) < a(x),$$

then the Lagrangian function is constant on the interval [a(x−), a(x)), which means that at time 1, there is a clump located at x which is formed by all the particles that were initially in the interval [a(x−), a(x)). Similarly, if the inverse Lagrangian function stays constant on some interval [x, y), then the Lagrangian function never takes values in the open interval (x, y), which means that at time 1, there are no fluid particles in (x, y).

This motivates the following definition. We first introduce the closed range of the inverse Lagrangian function,

$$A = \{ y : y = a(x) \text{ or } y = a(x-) \text{ for some } x \in \mathbb{R} \}.$$

The open set R − A has a canonical decomposition into disjoint open intervals of the type (a(x−), a(x)); their closures [a(x−), a(x)] are called the shock intervals. A Lagrangian shock point is a point that belongs to some shock interval. A Lagrangian regular point is a point in A that is isolated neither to its left nor to its right in A. We thus have a natural partition of R into the set of Lagrangian regular points and the set of Lagrangian shock points. From the point of view of hydrodynamic turbulence for compressible fluids, a Lagrangian shock point (respectively, a Lagrangian regular point) represents the initial location of a particle that belongs to some clump at time 1 (respectively, that has not been involved in the shocks induced by the turbulence before time 1). One says that the shock structure is discrete if A is a discrete set. This means that there are only finitely many shock intervals in a given compact set and there exist no Lagrangian regular points. Finally one calls (x, y) a rarefaction interval if the inverse Lagrangian function stays constant on [x, y).

2.2. Stable Lévy processes. We saw in the preceding subsection that it is easier to study Burgers turbulence using potentials than velocities.
To that end, it is more convenient to discuss the initial data in terms of a stable Lévy process rather than in terms of its derivative, namely a stable Lévy noise. In this subsection, we briefly review material in this field that will be useful in the sequel, and refer to [3] and [17] for much more on this topic. Let S = (S_s, s ∈ R) denote a generic stable Lévy process indexed by the real line. This means that S has independent and homogeneous increments and fulfills the scaling property

$$S_{ks} \overset{\text{law}}{=} k^{1/\alpha} S_s, \qquad \forall k > 0,$$

where α ∈ (0, 2] is known as the index. Note that S_0 = 0 a.s. We will always consider the version of S for which the sample paths are right-continuous and have limits to the left, a.s. It is plain from the scaling property that (2) cannot hold unless α > 1/2, and conversely, it is easy to check that (2) is fulfilled if α > 1/2. This explains why we will restrict our attention to that case in the next section. On the other hand, the case when S is a constant drift is trivial and will be implicitly excluded in the sequel.

First, recall that for α < 1, S is a pure jump process with bounded variation a.s., and its derivative dS· in the sense of Stieltjes is a mixture of Dirac point masses whose


law can be described in terms of a certain Poisson measure. In particular, S is monotone increasing if and only if it has only positive jumps; one then says that S is a subordinator. One calls S completely asymmetric if either S or −S is a subordinator. If S is not completely asymmetric, then it can be expressed as the difference of two independent stable subordinators.

Second, for α ≥ 1, S has unbounded variation and its derivative in the sense of Schwartz is no longer a signed measure. In particular it is not a monotone process even when it only has positive jumps. When α = 1, S is called a Cauchy process; it can be expressed as the sum of a symmetric Cauchy process and a deterministic drift; it always possesses both positive jumps and negative jumps.

Known properties about the growth of S will play a major role in this study. The literature mostly concerns the growth at the right of points. Because the time-reversed process Ŝ_s = S_{(−s)−} is again a stable Lévy process with the same law as −S, results at the left follow immediately; and sometimes it will be convenient for us to use the two-sided version of a result that appears as one-sided in the references. One of the first results in that field concerns the upper rate of growth at a fixed point. It has been proved by Khintchine, see Theorem VIII.5 in [3] for an accessible reference.

Lemma 1 (Khintchine). Let S be a stable process with index α ∈ (0, 2]; suppose that −S is not a subordinator. For every β > 0, we have with probability one

$$\limsup_{h \to 0+} \frac{S_h}{h^\beta} = 0 \text{ or } \infty$$

according as α < 1/β or α ≥ 1/β.

Next, we present a uniform result on the lower rate of growth for stable subordinators that can be found in an even stronger form in Fristedt [9] and Hawkes [10].

Lemma 2 (Fristedt and Hawkes). Let S = (S_s, s ∈ R) be a stable subordinator with index α ∈ (0, 1). With probability one, we have for all s ∈ R,

$$\liminf_{h \to 0+} \frac{S_{s+h} - S_s}{h^{1/\alpha}} < \infty, \qquad \liminf_{h \to 0+} \frac{S_{s-} - S_{s-h}}{h^{1/\alpha}} < \infty, \tag{i}$$

and

$$\liminf_{h \to 0+} \frac{S_{s+h} - S_s}{h^{1/\alpha} |\log h|^{1 - 1/\alpha}} > 0. \tag{ii}$$

The final result concerns the behavior near local extrema; see Theorem 7.3 in Monrad and Silverstein [14].

Lemma 3 (Monrad and Silverstein). Let S be a stable process with index α ∈ (0, 2], which is not completely asymmetric if α < 1, and let f : (0, ∞) → (0, ∞) be an increasing function. With probability one, we have for any time µ at which S reaches a local maximum

$$\liminf_{h \to 0+} \frac{S_\mu - S_{\mu \pm h}}{h^{1/\alpha} f(h)} = 0 \text{ or } \infty$$

according as the integral $\int_0^1 t^{-1} f(t)\, dt$ diverges or converges.
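The notions of this section can be explored numerically. The sketch below rests on several assumptions that are not in the paper: the noise is taken symmetric (sampled with the Chambers–Mallows–Stuck transform), the potential is only simulated on a finite grid rather than on all of R, and the supremum in (1) is taken over grid points at time t = 1. The names `cms_symmetric_stable`, `stable_potential` and `inverse_lagrangian` are hypothetical helpers introduced for the illustration.

```python
import math
import random


def cms_symmetric_stable(alpha, u, w):
    """Chambers-Mallows-Stuck map for a symmetric alpha-stable variable.

    u is meant to be uniform on (-pi/2, pi/2) and w exponential with mean 1
    (w > 0 is assumed). For alpha = 1 the second factor equals 1 and the
    result is tan(u), i.e. a Cauchy variable.
    """
    a = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    b = (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    return a * b


def stable_potential(alpha, n, da, rng):
    """Grid values S_0 = 0, S_da, ..., S_{n da} of a symmetric stable process."""
    # Scaling property: an increment over a step da has the law of da^(1/alpha) S_1.
    scale = da ** (1.0 / alpha)
    path = [0.0]
    for _ in range(n):
        u = rng.uniform(-math.pi / 2, math.pi / 2)
        w = rng.expovariate(1.0)
        path.append(path[-1] + scale * cms_symmetric_stable(alpha, u, w))
    return path


def inverse_lagrangian(grid, psi0, x):
    """Largest grid point a maximising psi0(a) - (x - a)^2 / 2, cf. (1) with t = 1."""
    best_a, best_val = None, -math.inf
    for a, p in zip(grid, psi0):
        val = p - (x - a) ** 2 / 2.0
        if val >= best_val:  # '>=' keeps the largest maximiser on an ascending grid
            best_a, best_val = a, val
    return best_a


# Deterministic checks of the definitions on the grid [-3, 3] with step 0.01.
grid = [i / 100 for i in range(-300, 301)]
shock = inverse_lagrangian(grid, [abs(a) for a in grid], 0.0)         # converging velocities
rarefaction = inverse_lagrangian(grid, [-abs(a) for a in grid], 0.5)  # diverging velocities

# A simulated initial potential with index alpha = 1.5 on [0, 5].
path = stable_potential(1.5, 50, 0.1, random.Random(0))
```

With ψ(a, 0) = |a| the particles converge towards the origin and a(0) picks up the right end of the shock interval [−1, 1]; with ψ(a, 0) = −|a| the inverse Lagrangian function stays at 0 on [−1, 1), which is a rarefaction interval in the sense of Sect. 2.1. Evaluating `inverse_lagrangian` along a grid of x values for a simulated path gives a crude picture of the set A, subject to the discretization and finite-window assumptions stated above.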


3. Statements and Proofs

We suppose throughout this section that the initial potential is a stable Lévy process, i.e. ψ(a, 0) = S_a. Our purpose is to describe the shock structure depending on the value of the index α.

3.1. The completely asymmetric case when α ∈ (1/2, 1). In this subsection, we establish that the shock structure is discrete when α ∈ (1/2, 1) in the completely asymmetric case.

Theorem 1. Suppose that the initial potential ψ(·, 0) is a completely asymmetric stable Lévy process with index α ∈ (1/2, 1). Then the shock structure is discrete a.s.

For the sake of simplicity, we will focus on the case when the initial potential is monotone increasing, i.e. is a stable subordinator. The monotone decreasing case is similar and therefore omitted. The study relies heavily on the uniform result on the rate of growth stated in Lemma 2. We first point out that the set of jump points of the initial potential,

$$J = \{ y \in \mathbb{R} : \psi(y, 0) \neq \psi(y-, 0) \},$$

contains the closed range of the inverse Lagrangian function, A.

Lemma 4. With probability one, we have A ⊆ J.

Proof. Because α < 1, we see from Lemma 2(i) that a.s., for any point y that is not a jump of the initial potential ψ(·, 0) = S·, one has

$$\liminf_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} = \liminf_{h \to 0+} \frac{\psi(y, 0) - \psi(y - h, 0)}{h} = 0. \tag{3}$$

If y ∉ J can be expressed as y = a(x) or y = a(x−) for some x ∈ R, then the function a → ψ(a, 0) − (x − a)²/2 reaches its maximum at y. By (3), this can only happen if x = y. On the other hand, because α > 1/2, we see from Lemma 2(ii) that

$$\lim_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h^2} = \infty,$$

which entails that y cannot be the location of the maximum of the function a → ψ(a, 0) − (y − a)²/2. □

We are now able to establish Theorem 1.

Proof. We have to show that A is a discrete set a.s. Pick an arbitrary y ∈ A. Because ψ(·, 0) makes a positive jump at y, it is easy to check that y is isolated on its left in A. Note that the model of hydrodynamic turbulence makes this property completely obvious as at time t = 0, the particle located at y has a negative momentum, and thus it instantaneously collides with those that are immediately to its left. So all that we need is to check that y is also isolated on its right in A. To that end, recall that the set of jump points J can be expressed as the values taken by a countable family of stopping times, and by the strong Markov property, the path behavior of ψ(·, 0) after a stopping time is the same as after the origin. We know from


Lemma 1 that a stable process with index α < 1 has derivative zero at the origin, so with probability one, we have

$$\lim_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} = 0. \tag{4}$$

Suppose y is a point of accumulation in A, i.e. there is a decreasing sequence y_n = a(x_n) converging to y. In particular, we have

$$\psi(y_n, 0) - \frac{(x_n - y_n)^2}{2} \ge \psi(y, 0) - \frac{(x_n - y)^2}{2},$$

that is

$$\frac{\psi(y_n, 0) - \psi(y, 0)}{y_n - y} \ge \frac{y_n + y - 2x_n}{2},$$

and therefore

$$\limsup_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} \ge y - x,$$

where x denotes the limit of the decreasing sequence x_n. By (4), we must have y ≤ x. On the other hand, we know that y = a(x) by the right-continuity of the inverse Lagrangian function, so y must be the location of a maximum of the function a → ψ(a, 0) − (x − a)²/2. Again by (4), the right-derivative at y of this function is x − y ≥ 0, which forces x = y. So y must be the location of a maximum of the function a → ψ(a, 0) − (y − a)²/2, and we see from Lemma 2(ii) that this is impossible, except on an event of probability zero. We conclude that y is isolated on its right in A. □

3.2. The non-completely asymmetric case when α ∈ (1/2, 1]. We suppose here that α ∈ (1/2, 1] and that the noise is not completely asymmetric; we shall first prove that Lagrangian regular points are exceptional.

Theorem 2. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1/2, 1] that is not completely asymmetric. Then for every fixed x ∈ R, the probability that x is Lagrangian regular equals zero.

Next, we will establish the existence of Lagrangian regular points.

Theorem 3. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1/2, 1] that is not completely asymmetric. Then the probability that there exist Lagrangian regular points is one.

We first consider Theorem 2; by stationarity, we may suppose x = 0. The argument relies on a property that is intuitively obvious from the point of view of hydrodynamic turbulence. For α < 1, because at time t = 0 the velocity of a fluid particle is either 0 or proportional to a Dirac point mass, and as in the latter case the particle instantaneously collides with some of its neighbors, a fluid particle which has not been involved in shocks up to time t must have the same location as at the origin of time. (Of course, one has to be careful with such an informal argument: the result becomes false in the Cauchy case α = 1.)

Lemma 5. Suppose α < 1. With probability one, if r is a Lagrangian regular point, then r = a(r) (or equivalently r = x(r)).


Proof. Recall from Lemma 1 that a.s.

$$\lim_{h \to 0+} \frac{\psi(h, 0)}{h} = 0 \quad \text{and} \quad \limsup_{h \to 0+} \frac{\psi(h, 0)}{h^2} = \infty.$$

A slight variation of the proof of Theorem 1 then shows that a point r in the closure of the range of the inverse Lagrangian function, which is also a jump point of ψ(·, 0), is necessarily isolated in A, and therefore is not Lagrangian regular. Suppose r = a(x) is a Lagrangian regular point, say with x > r; and recall that the initial potential can be expressed in the form ψ(·, 0) = S^{(1)}(·) − S^{(2)}(·), where S^{(1)} and S^{(2)} are two independent stable subordinators. As ψ(·, 0) is continuous at r, we have for every h > 0,

$$\psi(r, 0) - \frac{(x - r)^2}{2} \ge \psi(r + h, 0) - \frac{(x - r - h)^2}{2},$$

that is

$$S^{(1)}(r + h) - S^{(1)}(r) \le S^{(2)}(r + h) - S^{(2)}(r) - h(2x - 2r - h)/2.$$

By Lemma 2(i), we may pick a sequence h_n → 0+ such that S^{(2)}(r + h_n) − S^{(2)}(r) = o(h_n). But then, as x > r, we would have S^{(1)}(r + h_n) − S^{(1)}(r) < 0 when n is sufficiently large, which is impossible. One proves similarly (working at the left of r) that x < r is impossible. □

We are now able to prove Theorem 2.

Proof. In the Cauchy case α = 1, we know from Lemma 1 that lim sup_{h→0+} ψ(h, 0)/h = ∞ a.s., and since ψ(·, 0) is continuous at 0, this entails that 0 cannot be Lagrangian regular with probability one. In the case α ∈ (1/2, 1), we know from Lemma 5 that if 0 is Lagrangian regular, we must have a(0) = 0. On the other hand, we know from Lemma 1 that lim sup_{h→0+} ψ(h, 0)/h² = ∞ a.s., and this entails that 0 cannot be the location of a maximum of a → ψ(a, 0) − a²/2 a.s. We conclude that the probability that 0 is Lagrangian regular is zero. □

We next turn our attention to Theorem 3, and to that end, we shall prove that local maxima of ψ(·, 0) have a positive probability of being Lagrangian regular. It is easy to deduce that Lagrangian regular points exist with probability one.

Proof. Let µ be the (a.s. unique) location of the maximum of ψ(·, 0) on [0, 1]. It is well-known that 0 < µ < 1 a.s. We deduce from Lemma 3 that

$$\lim_{h \to 0+} \frac{\psi(\mu, 0) - \psi(\mu \pm h, 0)}{h} = 0 \quad \text{a.s.}$$

It follows that if Γ : [0, 1] → R denotes the concave hull of the restriction of ψ(·, 0) to [0, 1], that is if Γ is the smallest concave function with Γ(a) ≥ ψ(a, 0) for every a ∈ [0, 1], then its derivative γ = Γ′ is continuous at µ and


γ (µ + h) < γ (µ) = 0 < γ (µ − h) for every sufficiently small h > 0. This implies that the support of the Stieltjes measure −dγ contains µ, and more precisely µ is neither isolated to the left nor to the right in Supp(−dγ ). Then pick any x ∈ Supp(−dγ ) arbitrarily closed to µ. Clearly, the graph of 0 touches that of ψ(·, 0) at x, so we must have 0(x) = ψ(x, 0) or 0(x) = ψ(x−, 0). In both cases, x is the location of a maximum of a → ψ(a, 0) − γ (x)a on [0, 1], and a fortiori it is the unique location of the maximum of a → ψ(a, 0) − (x − γ (x) − a)2 /2 on [0, 1]. Plainly, µ is also the unique location of the maximum of a → ψ(a, 0) − (µ − a)2 /2 on [0, 1] . Because ψ(µ, 0) > max (ψ(0, 0), ψ(1, 0)), there is a positive probability that the preceding two maxima are global (i.e. on R) and not only local (i.e. on [0, 1]). We conclude that with positive probability, µ ∈ A and is neither isolated on its right nor on its left, and therefore is a Lagrangian regular point. u t 3.3. The case α ∈ (1, 2]. Our aim in this subsection is to establish that the shock structure is discrete when α ∈ (1, 2]. In the Gaussian case α = 2, this has been proven first by Avellaneda and E [1]. Their approach relied crucially on the Girsanov theorem which enables one to add a parabolic drift to the standard Brownian motion. This cannot be done in the stable case α < 2, so we will use a completely different argument. Theorem 4. Suppose that the initial potential ψ(·, 0) is a stable Lévy process with index α ∈ (1, 2]. Then the shock structure is discrete a.s. Because the random set A is stationary (i.e. its law is invariant by translation), we have to prove that Card ([1, 2] ∩ A) < ∞ a.s. It is easy to verify that the probability that a(x) ∈ [1, 2] for some x with |x| > n goes to zero as n → ∞ (see [5] for the rate of decay), so it suffices in fact to establish that for each fixed n, Card {a(x) ∈ [1, 2] : |x| ≤ n} < ∞

a.s.

(5)
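As a rough numerical illustration of the quantity in (5), the following sketch treats the Gaussian case $\alpha = 2$ only. It replaces $\psi(\cdot, 0)$ by a two-sided Gaussian random walk on a grid and takes $a(x)$ to be the largest grid maximiser of $a \mapsto \psi(a, 0) - (x - a)^2/2$, the variational characterization used in the proofs of this section. The grid step, the window half-width, the seed, and all function names are assumptions made for illustration. A discretized count is trivially finite, so this only visualizes the statement and is no substitute for the proof.

```python
import math
import random


def brownian_potential(half_width, step, seed):
    """Two-sided Gaussian random walk approximating psi(., 0) for alpha = 2.

    Returns grid points a_k = k * step with |a_k| <= half_width and the walk
    values at those points, normalised so that psi(0, 0) = 0.
    """
    rng = random.Random(seed)
    m = int(round(half_width / step))
    sd = math.sqrt(step)  # Brownian increments over one step have variance `step`
    right, left = [0.0], [0.0]
    for _ in range(m):
        right.append(right[-1] + rng.gauss(0.0, sd))
        left.append(left[-1] + rng.gauss(0.0, sd))
    grid = [k * step for k in range(-m, m + 1)]
    values = left[:0:-1] + right  # reversed left branch, then 0 and the right branch
    return grid, values


def inverse_lagrangian(x, grid, values):
    """Largest grid maximiser of a -> psi(a, 0) - (x - a)^2 / 2."""
    best_a, best_val = grid[0], -math.inf
    for a, psi in zip(grid, values):
        val = psi - 0.5 * (x - a) ** 2
        if val >= best_val:  # ">=" keeps the right-most maximiser on ties
            best_a, best_val = a, val
    return best_a


STEP, N = 0.02, 3.0  # hypothetical grid step and bound |x| <= n
grid, values = brownian_potential(half_width=8.0, step=STEP, seed=1)
m_x = int(round(N / STEP))
xs = [k * STEP for k in range(-m_x, m_x + 1)]
a_of_x = [inverse_lagrangian(x, grid, values) for x in xs]

# Distinct values of a(x) falling in [1, 2]: a discrete analogue of the set in (5).
in_window = sorted({a for a in a_of_x if 1.0 <= a <= 2.0})
print(len(in_window), "distinct values of a(x) in [1, 2] for |x| <=", N)
```

The computed map $x \mapsto a(x)$ is nondecreasing, as it must be for any selection of maximisers, and in this discretization it is a step function whose jumps mark the shocks.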

We first point out that when $\alpha \ge 1$, a jump point of $\psi(\cdot, 0)$ cannot be in $A$ (we stress that the argument also applies in the Cauchy case $\alpha = 1$).

Lemma 6. Suppose that $\psi(\cdot, 0)$ is a stable Lévy process with index $\alpha \in [1, 2)$. Then with probability one, $\psi(\cdot, 0)$ is continuous at every point in $A$.

Proof. Recall from Lemma 1 that

$$\limsup_{h \to 0+} \frac{\psi(h, 0) - \psi(0, 0)}{h} = \limsup_{h \to 0+} \frac{\psi(-h, 0) - \psi(0, 0)}{h} = \infty \quad \text{a.s.}$$

By the strong Markov property (and time-reversal), we thus have with probability one

$$\limsup_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} = \limsup_{h \to 0+} \frac{\psi(y - h, 0) - \psi(y-, 0)}{h} = \infty$$

for all jump points $y \in J$. If $y = a(x)$ or $y = a(x-)$ for some $x$, and if $y \in J$ is the point of, say, a positive jump of $\psi(\cdot, 0)$, then we have for every $h > 0$,

$$\psi(y, 0) - (x - y)^2/2 \ge \psi(y + h, 0) - (x - y - h)^2/2.$$


Therefore we would have

$$\limsup_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} \le y - x,$$

which is impossible, except on an event with probability zero. The case of a negative jump is similar, working now at the left of the jump. □

We resume our analysis of the shock structure with the observation that if a point $y \in [1, 2]$ can be expressed as $y = a(x)$ for some $x \in [-n, n]$, then, as $\psi(\cdot, 0)$ is continuous at $y$, one has

$$\psi(y \pm h, 0) < \psi(y, 0) + 2nh \quad \text{for every } h \in (0, 2], \tag{6}$$

that is, one can touch from above the graph of $\psi(\cdot, 0)$ on $[y - 2, y + 2]$, using a vertical cone centered at $y$ with sides of slope $\pm 2n$. We shall estimate the probability that the preceding behavior occurs at some point $y$ in a small interval $[a, a + \varepsilon] \subseteq [1, 2]$. By stationarity, this probability does not depend on the choice of $a$, so we may focus on the case $a = 1$. Let us analyze this situation from a "dynamical" point of view, i.e. considering the process $X = (\psi(a, 0) + 2na, a \ge 0)$ and thinking of the variable $a \ge 0$ as time. Let us denote by $\tau$ the first instant after time 1 at which $X$ reaches a new maximum, and set $Y_a = X_{\tau + a} - X_\tau - 4na$ for $a \ge 0$. If (6) holds for some $y \in [1, 1 + \varepsilon]$, then we must have $\tau \in [1, 1 + \varepsilon]$; moreover $Y$ cannot reach a new maximum on the time interval $[1 + \varepsilon - \tau, 3 - \tau]$, and a fortiori not on $[\varepsilon, 1 + \varepsilon]$. Observe also from the strong Markov property that $(X_a, 0 \le a \le \tau)$ and $Y$ are independent. The key step is thus provided by the following lemma, which we prove below.

Lemma 7. There is a finite constant $K > 0$ such that

$$P(X \text{ reaches a new maximum on } [1, 1 + \varepsilon]) \times P(Y \text{ does not reach a new maximum on } [\varepsilon, 1 + \varepsilon]) \le \varepsilon K.$$

Indeed, it then follows from the preceding analysis that

$$P(\text{(6) holds for some } y \in [1, 1 + \varepsilon]) \le \varepsilon K,$$

which entails by Tonelli's theorem (assuming for simplicity that $1/\varepsilon$ is an integer)

$$E\left(\mathrm{Card}\{k = 0, \dots, 1/\varepsilon : \text{(6) holds for some } y \in [1 + k\varepsilon, 1 + (k + 1)\varepsilon]\}\right) \le 2K,$$

and a fortiori (5). We now proceed to the proof of Lemma 7.

Proof. For simplicity, write $S_a = \psi(\tau + a, 0) - \psi(\tau, 0)$, so $S = (S_a, a \ge 0)$ is a stable process with index $\alpha$ and $Y_a = S_a - 2na$. The set of times $M$ when $Y$ reaches a new maximum is known as the ascending ladder time set; it is a regenerative set (i.e. the range of a subordinator). See [3], Sect. VI.1. In this setting, $Y$ does not reach a new maximum on $[\varepsilon, 1 + \varepsilon]$ only if $M \cap (\varepsilon, 1 + \varepsilon] = \emptyset$.
On the other hand, it is well known that the probability of the latter event can be bounded from above by

$$P(M \cap (\varepsilon, 1 + \varepsilon] = \emptyset) \le \bar{\Pi}(1)\, U(\varepsilon),$$

where $\bar{\Pi}$ denotes the tail of the Lévy measure associated to the regenerative set $M$ and $U$ the renewal function. This can be seen for instance from Proposition III.2 in [3]. As $\bar{\Pi}(1)$ is just a constant, we only need an estimate of $U(\varepsilon)$ as $\varepsilon \to 0+$. We know from Proposition III.1 of [3] that $U(\varepsilon) = O(1/\Phi(1/\varepsilon))$, where $\Phi$ is the Laplace exponent of the ascending ladder time set of $Y$. Moreover, we know from fluctuation theory (cf. [3, p. 166]) that

$$\Phi(q) = \exp\left(\int_0^\infty \left(e^{-s} - e^{-qs}\right) s^{-1} P(Y_s \ge 0)\, ds\right).$$

Note that by the scaling property, $P(S_s \ge 0) = \rho$ does not depend on $s$ (this quantity is known as the positivity parameter of the stable process $S$), and thus

$$P(Y_s \ge 0) = P(S_s \ge 0) - P(0 \le S_s < 2ns) = \rho - P\left(0 \le S_1 \le 2n s^{1 - 1/\alpha}\right).$$

On the one hand, we may use the classical identity

$$\exp\left(\rho \int_0^\infty \left(e^{-s} - e^{-qs}\right) s^{-1}\, ds\right) = q^\rho.$$

On the other hand, the existence of a bounded density for the stable law entails that as $s \to 0+$,

$$P\left(0 \le S_1 \le 2n s^{1 - 1/\alpha}\right) = O\left(s^{1 - 1/\alpha}\right),$$

which in turn ensures that

$$\int_0^\infty e^{-s} s^{-1} P\left(0 \le S_1 \le 2n s^{1 - 1/\alpha}\right) ds < \infty.$$

Putting the pieces together, we get

$$P(Y \text{ does not reach a new maximum on } [\varepsilon, 1 + \varepsilon]) = O(U(\varepsilon)) = O(1/\Phi(1/\varepsilon)) = O(\varepsilon^\rho).$$

We then consider the event that $X$ reaches a new maximum on $[1, 1 + \varepsilon]$. If we introduce the time-reversed process $\hat{X}_a = X_{(1 + \varepsilon - a)-} - X_{1 + \varepsilon}$, then the foregoing event occurs if and only if $\hat{X}$ does not reach a new maximum on $[\varepsilon, 1 + \varepsilon]$. On the other hand, $\hat{X}$ has the same law as the process $(\hat{S}_a - 2na, a \ge 0)$, where $\hat{S} = -S$. Plainly, one has $\hat{\rho} := P(\hat{S}_1 \ge 0) = 1 - P(S_1 > 0) = 1 - \rho$, and we deduce from above that

$$P(X \text{ reaches a new maximum on } [1, 1 + \varepsilon]) = O(\varepsilon^{1 - \rho}).$$

This completes the proof of the lemma, and therefore that of Theorem 4. □
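The classical identity invoked in the proof of Lemma 7 is a Frullani integral: $\int_0^\infty (e^{-s} - e^{-qs}) s^{-1}\, ds = \log q$ for $q > 0$, so exponentiating $\rho$ times the integral gives $q^\rho$. The sketch below is a quick numerical sanity check of that identity. The substitution $s = e^t$ and the trapezoidal rule are choices made here, and the function name, truncation bounds, step size, and the sample value $\rho = 0.5$ are assumptions for illustration only.

```python
import math


def frullani_integral(q, t_min=-40.0, t_max=10.0, step=0.01):
    """Trapezoidal approximation of int_0^inf (e^{-s} - e^{-q s}) / s ds.

    After the substitution s = e^t the integrand becomes
    exp(-e^t) - exp(-q e^t), which decays rapidly as t -> +/- infinity,
    so truncating to [t_min, t_max] loses very little.
    """
    n = int(round((t_max - t_min) / step))

    def f(t):
        s = math.exp(t)
        return math.exp(-s) - math.exp(-q * s)

    total = 0.5 * (f(t_min) + f(t_max))
    for k in range(1, n):
        total += f(t_min + k * step)
    return total * step


rho = 0.5  # sample positivity parameter (assumed value for this check)
for q in (2.0, 10.0, 1000.0):
    lhs = math.exp(rho * frullani_integral(q))
    print(q, lhs, q ** rho)  # the last two columns should agree closely
```

For $q \ge 1$ one has $0 \le e^{-s} - e^{-qs} \le e^{-s}$, so combined with the finiteness of the last integral in the proof, the identity shows that the Laplace exponent $\Phi(q)$ is bounded below by a positive constant times $q^\rho$. This is the source of the $O(\varepsilon^\rho)$ bound.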


3.4. Absence of rarefaction intervals in the Cauchy case. Plainly, rarefaction intervals exist when the shock structure is discrete. It is also easy to see that rarefaction intervals exist for $\alpha \in (1/2, 1)$ in the non-completely asymmetric case. More precisely, denote by $y$ the first positive location of a jump greater than 1 of $\psi(\cdot, 0)$. Recall that $\psi(\cdot, 0)$ has right-derivative zero at $y$; it follows that the probability that $y$ is the unique location of the maximum of the function $a \mapsto \psi(a, 0) - (x - a)^2/2$ is positive provided that $x < y$ and $(x - y)^2 < 2$. Then the same argument as in the proof of Theorem 1 shows that $y$ is isolated in $A$, both on its left and on its right. By the right-continuity of the inverse Lagrangian function, we see that the set $\{x : a(x) = y\}$ contains a non-empty open interval.

We now consider the Cauchy case $\alpha = 1$, that is, we suppose that $\psi(a, 0) = C_a + da$, where $C$ is a symmetric Cauchy process and $d \in \mathbb{R}$ a drift coefficient. The purpose of this subsection is to prove the absence of rarefaction intervals, that is, that the locations of the fluid particles at time 1 form an everywhere dense set a.s.

Theorem 5. Suppose that the initial potential $\psi(\cdot, 0)$ is a Cauchy process. Then with probability one there are no rarefaction intervals.

Proof. Recall from Lemma 6 that jump times of $\psi(\cdot, 0)$ do not belong to $A$, a.s. Now suppose $(x, x')$ is a rarefaction interval, that is, $a(\cdot)$ stays constant on $[x, x')$; denote its value by $y$. As $y$ is not a jump time of $\psi(\cdot, 0)$, we have for all $h > 0$,

$$\psi(y, 0) - \frac{(x - y)^2}{2} \ge \psi(y - h, 0) - \frac{(x - y + h)^2}{2},$$

$$\psi(y, 0) - \frac{(x' - y)^2}{2} \ge \psi(y + h, 0) - \frac{(x' - y - h)^2}{2}.$$

We deduce that

$$\liminf_{h \to 0+} \frac{\psi(y, 0) - \psi(y - h, 0)}{h} \ge y - x \quad \text{and} \quad \limsup_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} \le y - x'.$$
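This deduction is a direct expansion of the squares. Spelled out as a routine check, with $x'$ denoting the right endpoint of the rarefaction interval, the two inequalities for $h > 0$ rearrange to

$$\psi(y, 0) - \psi(y - h, 0) \ge \frac{(x - y)^2 - (x - y + h)^2}{2} = h(y - x) - \frac{h^2}{2},$$

$$\psi(y + h, 0) - \psi(y, 0) \le \frac{(x' - y - h)^2 - (x' - y)^2}{2} = h(y - x') + \frac{h^2}{2}.$$

Dividing by $h$ and letting $h \to 0+$ removes the terms $\mp h/2$ and gives the two one-sided bounds.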

As $x < x'$, we may thus find a rational number $q \in (y - x', y - x)$. Then $y$ is the location of a local maximum of $a \mapsto \psi^{(q)}(a, 0) := \psi(a, 0) - qa$ and moreover

$$\liminf_{h \to 0+} \frac{\psi^{(q)}(y, 0) - \psi^{(q)}(y + h, 0)}{h} > 0. \tag{7}$$

On the other hand, $(\psi^{(s)}(\cdot, 0), s \in \mathbb{Q})$ is a countable family of Cauchy processes. For each of these processes, we can invoke Lemma 3 to see that with probability one, for any $s \in \mathbb{Q}$ and any location $\mu$ of a local maximum for $\psi^{(s)}(\cdot, 0)$,

$$\liminf_{h \to 0+} \frac{\psi^{(s)}(\mu, 0) - \psi^{(s)}(\mu + h, 0)}{h} = 0.$$

We conclude that (7) is impossible, except on an event of probability zero, and therefore there are no rarefaction intervals a.s. □
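For completeness, inequality (7) in the proof of Theorem 5 follows from the upper bound on the right-hand difference quotient of $\psi(\cdot, 0)$ at $y$ together with the choice of $q$. As a routine check, with $x'$ the right endpoint of the rarefaction interval:

$$\liminf_{h \to 0+} \frac{\psi^{(q)}(y, 0) - \psi^{(q)}(y + h, 0)}{h} = q - \limsup_{h \to 0+} \frac{\psi(y + h, 0) - \psi(y, 0)}{h} \ge q - (y - x') > 0,$$

since $q > y - x'$. Symmetrically, the lower bound on the left-hand difference quotient gives

$$\liminf_{h \to 0+} \frac{\psi^{(q)}(y, 0) - \psi^{(q)}(y - h, 0)}{h} \ge (y - x) - q > 0,$$

since $q < y - x$. The two strict inequalities together show that $y$ is the location of a local maximum of $\psi^{(q)}(\cdot, 0)$.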

The absence of rarefaction intervals means that the Lagrangian function a → x(a) is continuous. On the other hand, it only increases on the set of Lagrangian regular points, and it follows from Theorem 2 and Tonelli’s theorem that the latter has Lebesgue measure zero a.s. In the terminology used by Sinai [19], one says that the Lagrangian function is a complete devil staircase.


References

1. Avellaneda, M. and E, W.: Statistical properties of shocks in Burgers turbulence. Commun. Math. Phys. 172, 13–38 (1995)
2. Avellaneda, M.: Statistical properties of shocks in Burgers turbulence, II: Tail probabilities for velocities, shock-strengths and rarefaction intervals. Commun. Math. Phys. 169, 45–59 (1995)
3. Bertoin, J.: Lévy processes. Cambridge: Cambridge University Press, 1996
4. Bertoin, J.: The inviscid Burgers equation with Brownian initial velocity. Commun. Math. Phys. 193, 397–406 (1998)
5. Bertoin, J.: Large deviation estimates in Burgers turbulence with stable noise initial data. J. Stat. Phys. 91, No. 3/4, 655–667 (1998)
6. Burgers, J.M.: The nonlinear diffusion equation. Dordrecht: Reidel, 1974
7. Chorin, A.J.: Lectures on turbulence theory. Boston: Publish or Perish, 1975
8. Cole, J.D.: On a quasi linear parabolic equation occurring in aerodynamics. Quart. Appl. Math. 9, 225–236 (1951)
9. Fristedt, B.E.: Uniform local behavior of stable subordinators. Ann. Probab. 7, 1003–1013 (1979)
10. Hawkes, J.: A lower Lipschitz condition for the stable subordinator. Z. Wahrscheinlichkeitstheorie verw. Gebiete 17, 23–32 (1971)
11. Hopf, E.: The partial differential equation $u_t + u u_x = \mu u_{xx}$. Comm. Pure Appl. Math. 3, 201–230 (1950)
12. Janicki, A.W. and Woyczynski, W.A.: Hausdorff dimension of regular points in stochastic flows with Lévy α-stable initial data. J. Stat. Phys. 86, 277–299 (1997)
13. Molchanov, S.A., Surgailis, D. and Woyczynski, W.A.: Hyperbolic asymptotics in Burgers' turbulence and extremal processes. Commun. Math. Phys. 168, 209–226 (1995)
14. Monrad, D. and Silverstein, M.L.: Stable processes: Sample function growth at a local minimum. Z. Wahrscheinlichkeitstheorie verw. Gebiete 49, 177–210 (1979)
15. Ryan, R.: Large-deviation analysis of Burgers turbulence with white-noise initial data. Comm. Pure Appl. Math. 51, 47–75 (1998)
16. Ryan, R.: The statistics of Burgers turbulence initialized with fractional Brownian noise data. Commun. Math. Phys. 191, 71–86 (1998)
17. Samorodnitsky, G. and Taqqu, M.S.: Stable non-Gaussian random processes: stochastic models with infinite variance. London: Chapman and Hall, 1994
18. She, Z.S., Aurell, E. and Frisch, U.: The inviscid Burgers equation with initial data of Brownian type. Commun. Math. Phys. 148, 623–641 (1992)
19. Sinai, Ya.: Statistics of shocks in solution of inviscid Burgers equation. Commun. Math. Phys. 148, 601–621 (1992)
20. Woyczynski, W.A.: Göttingen Lectures on Burgers-KPZ turbulence. Lecture Notes in Maths, Berlin–Heidelberg–New York: Springer, to appear

Communicated by Ya. G. Sinai