Communications in Mathematical Physics - Volume 207

Author: A. Jaffe (Chief Editor)

Commun. Math. Phys. 207, 1 – 42 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

On the Discrete Spectrum of the Nonstationary Schrödinger Equation and Multipole Lumps of the Kadomtsev–Petviashvili I Equation

Javier Villarroel¹, Mark J. Ablowitz²

¹ Universidad de Salamanca, Dept. de Matematicos puras y aplicadas, 37008 Salamanca, Spain
² Department of Applied Mathematics, University of Colorado, Boulder, CO 80309-0526, USA

Received: 30 September 1998 / Accepted: 30 March 1999

Abstract: The discrete spectrum of the nonstationary Schrödinger equation and localized solutions of the Kadomtsev–Petviashvili-I (KPI) equation are studied via the inverse scattering transform. It is shown that there exist infinitely many real and rationally decaying potentials which correspond to a discrete spectrum whose related eigenfunctions have multiple poles in the spectral parameter. An index or winding number is associated with each of these solutions. The resulting localized solutions of KPI behave as a collection of individual humps with nonuniform dynamics.

0. Introduction

In this paper we study the discrete spectrum of the nonstationary Schrödinger equation (cf. Eq. (1.1) below). It is well known that this operator can be used to linearize the Kadomtsev–Petviashvili-I (KPI) equation (cf. Eq. (1.5) below) via the inverse scattering transform (IST) (see e.g. [1]). Both of these equations have wide application in physics. The Schrödinger equation is central in the study of quantum mechanics, and the KP equation is a basic nonlinear equation that models long two-dimensional waves with weak nonlinearity and slow transverse variation. The KP equation arises in water waves and plasma physics, and it also has important mathematical implications.

Both the continuous and discrete spectrum of the stationary Schrödinger operator have been extensively investigated. Far less is known in the nonstationary case. With regard to the discrete spectrum, corresponding to decaying potentials, it is known [2] that there exist complex conjugate pairs of simple eigenvalues of the nonstationary Schrödinger operator which, via IST, yield rational, slowly decaying soliton solutions of KPI (they decay as $O(1/r^2)$, where $r^2 = x^2 + y^2$), usually referred to as lumps. These lump solutions have been heavily studied since they were first found by direct methods [3].
Here we show that there exist infinitely many real, rationally decaying potentials of the nonstationary Schrödinger operator and corresponding solutions to the KPI equation. As functions of the spectral parameter, the analogue of the “energy” in the stationary case, the related eigenfunctions (or wavefunctions) have, in general, multiple poles. Thus the potentials correspond to a discrete spectrum associated with poles of the eigenfunction of multiplicity $m \ge 1$. This is in contrast to the stationary case where all real localized potentials correspond to simple poles.

We have found that to characterize the discrete spectrum of this operator it does not suffice to require that the potential vanishes at infinity with a certain decay rate (like $O(1/r^2)$). Additional restrictive conditions must be given which are expressed in terms of new integer valued quantities. We discuss the characterization of the potential in terms of the pole structure of the eigenfunction and this additional number, which we refer to as the charge and which has an implicit interpretation as an index or winding number (see Sect. 3; Result V, part iii, where using Green’s theorem we prove that they are winding numbers). We note that, as with the ordinary lump solutions, the potentials found here are not absolutely integrable, decaying as $O(1/r^2)$, where $r^2 = x^2 + y^2$. This slow decay is the underlying reason for the appearance of the charge. We also note that these results are relevant to, and many of the methods can be carried over to, other completely integrable partial differential equations in 2 + 1 dimensions such as the Davey-Stewartson equations.

The resulting class of potentials is related to analytic and rationally decaying solutions of the KPI equation when the appropriate temporal dependence is inserted. These solutions of KPI are termed multipole solutions when $m > 1$. They can be seen as a collection of individual humps that generically evolve with nonuniform dynamics and display a behaviour that is in contrast to the uniform motion of standard solitary waves; they interact in a nontrivial manner (see also ref. [6]).
Direct methods for finding rational solutions of KPI and the connection with Calogero-Moser systems have also been considered in [10], though the full manifold of physically relevant solutions (real and nonsingular) was not obtained. In a subsequent paper we shall show how the problem of classifying the various infinite families of real nonsingular localized solutions to KPI can be solved by direct methods (cf. ref. [11]). The results discussed here were announced in ref. [4] (see also ref. [9]). The purpose of the present paper is to derive and elaborate on these results. Several examples are considered.

The framework of the paper is as follows: in Sect. 1 we review the known results of IST for the case of eigenfunctions with only simple poles. New results regarding the discrete spectrum are presented in Sects. 2–3 with relevant proofs (the proofs in these two sections can be skipped by a reader wishing to only understand the basic phenomena). Section 4 employs the results of the previous sections to obtain concrete examples of reflectionless potentials. Section 5 discusses how some of these results can be obtained by coalescence of poles. Examples of corresponding decaying solutions of the KPI equation are given in Sect. 6, where we also study the trajectories and dynamics of the humps that each solution contains. Galilean interactions that correspond to the above lump trajectories are also derived.

1. Some Previous Results

The nonstationary Schrödinger equation is

$$i\psi_y + \psi_{xx} + u\psi = 0. \tag{1.1}$$

We introduce the complex spectral parameter $k$ via $\psi = \mu(x, y, k)\,e^{ikx - ik^2 y}$ and find that $\mu$ solves

$$(i\partial_y + \partial_{xx} + 2ik\,\partial_x + u)\mu = 0. \tag{1.2}$$
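The step from (1.1) to (1.2) is a direct substitution. A short worked version, using only the definitions above, is sketched here.

```latex
\psi = \mu\,e^{ikx - ik^{2}y}
\;\Longrightarrow\;
\begin{aligned}
 i\psi_y   &= \bigl(i\mu_y + k^{2}\mu\bigr)\,e^{ikx - ik^{2}y},\\
 \psi_{xx} &= \bigl(\mu_{xx} + 2ik\,\mu_x - k^{2}\mu\bigr)\,e^{ikx - ik^{2}y},
\end{aligned}
\qquad\text{so}\qquad
i\psi_y + \psi_{xx} + u\psi
 = \bigl(i\mu_y + \mu_{xx} + 2ik\,\mu_x + u\mu\bigr)\,e^{ikx - ik^{2}y}.
```

The $\pm k^{2}\mu$ terms cancel, and since the exponential never vanishes, (1.1) holds exactly when the bracket vanishes, which is (1.2).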


In this paper we restrict ourselves to the study of the class of potentials $u(x, y)$ that are real, nonsingular and decay rationally at infinity. The potential $u(x, y)$ is reconstructed from solutions of (1.2) via (see [1] for this and other facts below)

$$u(x, y) = -2i\,\frac{\partial \mu_1}{\partial x}, \tag{1.3}$$

where $\mu = 1 + \mu_1/k + O(1/k^2)$ as $k \to \infty$. The compatibility of (1.2) and

$$\bigl(\partial_t + 4\partial_{xxx} + 12ik\,\partial_{xx} - 12k^2\partial_x + 6u\,\partial_x + 3u_x + 6iku - 3i(\partial_y\partial_x^{-1}u)\bigr)\mu = 0 \tag{1.4}$$

yields the KPI equation:

$$(u_t + 6uu_x + u_{xxx})_x = 3u_{yy}. \tag{1.5}$$

The knowledge of the Schrödinger potential $u(x, y)$ and corresponding eigenfunction $\mu(x, y, k)$ allows one to obtain solutions to KPI by letting certain constants (the scattering data) evolve in time $t$ in a precise way as required through compatibility with (1.4). Similarly, setting $t$ = constant in the relevant KPI solution $u(x, y, t)$ one obtains a Schrödinger potential for which (1.1), (1.2) is solvable. If the potential is regular and has suitable decay, (1.2) can be converted into the integral equation

$$G\mu(x, y, k) = 1, \tag{1.6}$$

where the operator $G(k)$ acts as follows:

$$Gf(x, y) = f(x, y, k) - \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx'\,dy'\; G(x - x', y - y', k)\,(uf)(x', y'). \tag{1.7}$$

The Green’s function of the problem has different forms for $\mathrm{Im}\,k > 0$ and $\mathrm{Im}\,k < 0$. Indeed, $G(x, y, k)$ is given on the upper and lower half planes by

$$G_\pm(x, y, k) = \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dm\,dn\; \frac{\exp[i(mx + ny)]}{m^2 + 2mk + n} = -\frac{1}{2\pi i}\int_{-\infty}^{\infty} \exp\bigl[i(mx - (m^2 + 2km)y)\bigr]\,\bigl(\theta(y)\theta(\mp m) - \theta(-y)\theta(\pm m)\bigr)\,dm. \tag{1.8}$$

This function satisfies

$$G_+ - G_- = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \mathrm{sign}(m)\,\exp\bigl[i(mx - (m^2 + 2km)y)\bigr]\,dm, \quad \mathrm{Im}\,k = 0, \tag{1.9.a}$$

$$\frac{\partial G(x, y, k)}{\partial k} = -i(x - 2ky)\,G(x, y, k) - \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k), \quad \mathrm{Im}\,k \neq 0, \tag{1.9.b}$$

$$\bar G(x, y, k) = G(-x, -y, \bar k), \tag{1.9.c}$$

where bar denotes complex conjugate. We consider solutions $\mu$ to (1.6), corresponding to regular and decaying potentials $u(x, y)$, that tend to 1 as $|k| \to \infty$. Generically, as functions of the complex parameter $k$, $\mu$ has a discontinuity across the real axis, proportional to a scattering function $F(k)$, which in analogy to the stationary problem is called the “reflection” coefficient. The Fredholm-like nature of (1.6) implies that at some points $k_\alpha$ (the eigenvalues), eigenfunctions or solutions to its homogeneous version, $G_{k_\alpha}\Phi(x, y) = 0$, may exist. Here $G_{k_\alpha} \equiv G(k = k_\alpha)$. Equivalently these eigenfunctions correspond to solutions of (1.2) at $k = k_\alpha$:

$$(i\partial_y + \partial_{xx} + 2ik_\alpha\partial_x + u)\Phi_\alpha = 0 \tag{1.2'}$$

vanishing as $x^2 + y^2 \to \infty$. At these eigenvalues $\mu(k)$ has poles with the residue given by the eigenfunction $\Phi$. Solutions corresponding to the discrete spectrum of the nonstationary Schrödinger equation correspond, by definition, to having a vanishing reflection coefficient $F(k)$ and are taken to be purely meromorphic.

We first review the results of [2] corresponding to pure discrete spectrum. In [2] it is assumed that the discrete spectrum consists of simple poles (here we take an even number), i.e. $\mu(k)$ is written:

$$\mu(k) = 1 + \sum_{\alpha=1}^{2N} \frac{\Phi_\alpha}{k - k_\alpha}, \tag{1.10}$$

where the residues $\Phi_\alpha$ remain to be found. The potential is obtained from

$$u(x, y) = -2i\sum_{\alpha=1}^{2N} \partial_x\Phi_\alpha. \tag{1.11}$$

In [2] the following fundamental equation is found that relates the residues $\Phi_\alpha$ with the spectral function $\mu$:

$$\nu(k_\alpha) = -i(f_\alpha + \gamma_\alpha)\Phi_\alpha, \quad \alpha = 1, \dots, 2N, \tag{1.12}$$

where

$$f(k) = x - 2ky; \quad f_\alpha \equiv f(k_\alpha) = x - 2k_\alpha y, \quad \nu(k_\alpha) \equiv \lim_{k\to k_\alpha}\bigl(\mu(k) - \text{singular part of } \mu \text{ at } k_\alpha\bigr) \tag{1.13}$$

and $\gamma_\alpha$ are “normalization” constants (in KPI they are functions of time). In addition, a constraint, conveniently normalized now for the purposes described later in this paper,

$$Q(k_\alpha) = 1 \tag{1.14a}$$

was obtained, where

$$Q(k_\alpha) \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_\alpha)\iint u\,\Phi_\alpha\,dx\,dy \tag{1.14b}$$

(integration is over the plane). Later we show that $Q$ is a certain integer, i.e. an index or winding number. From (1.12) the following linear algebraic system of equations ensues:

$$1 + \sum_{\substack{\beta=1 \\ \beta\neq\alpha}}^{2N} \frac{\Phi_\beta}{k_\alpha - k_\beta} = -i(f_\alpha + \gamma_\alpha)\Phi_\alpha, \quad \alpha = 1, \dots, 2N. \tag{1.15}$$


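After solving this system, the corresponding “reflectionless” potentials for (1.1) are obtained as

$$u(x, y) = -2i\,\frac{\partial}{\partial x}\sum_{\alpha=1}^{2N}\Phi_\alpha = 2\,\frac{\partial^2}{\partial x^2}\log F, \tag{1.16}$$

where $F$ is the determinant of the matrix

$$B_{\alpha\beta} = (f_\alpha + \gamma_\alpha)\,\delta_{\alpha\beta} - i(1 - \delta_{\alpha\beta})\,\frac{1}{k_\alpha - k_\beta}, \quad \alpha, \beta = 1, 2, \dots, 2N. \tag{1.17}$$

Solutions to KPI (lumps) are given by (1.16) letting the constants in (1.12) “evolve” as

$$\gamma_\alpha(t) = \gamma_\alpha(t = 0) + 12k_\alpha^2\,t. \tag{1.18}$$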
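As a sanity check of (1.15)–(1.17), the following minimal sketch treats a single conjugate pair ($N = 1$, $k_2 = \bar k_1$, $\gamma_2 = \bar\gamma_1$) at $t = 0$. It builds the matrix $B$, solves (1.15) in the equivalent form $B\Phi = i(1, 1)^T$, and compares $\det B$ with the closed form $|f_1 + \gamma_1|^2 + 1/(4b^2)$ written in the shifted coordinates of the one-lump solution. The function names and the sample values of $a$, $b$, $\gamma_1$, $x$, $y$ are illustrative assumptions, not part of the paper.

```python
# Numerical spot-check of (1.15)-(1.17) for one conjugate pair (N = 1), t = 0.
# All names and sample values are illustrative assumptions.

def lump_matrix(x, y, k1, gamma1):
    """Matrix B of (1.17) for k = (k1, conj k1), gamma = (gamma1, conj gamma1)."""
    k = [k1, k1.conjugate()]
    g = [gamma1, gamma1.conjugate()]
    f = [x - 2 * kk * y for kk in k]        # f_alpha = x - 2 k_alpha y, Eq. (1.13)
    B = [[0j, 0j], [0j, 0j]]
    for al in range(2):
        for be in range(2):
            if al == be:
                B[al][be] = f[al] + g[al]
            else:
                B[al][be] = -1j / (k[al] - k[be])
    return B

def det2(B):
    """Determinant of a 2 x 2 matrix."""
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]

def residues(B):
    """Solve B Phi = i (1, 1)^T, i.e. (1.15) multiplied by i, by Cramer's rule."""
    d = det2(B)
    phi1 = (1j * B[1][1] - B[0][1] * 1j) / d
    phi2 = (B[0][0] * 1j - 1j * B[1][0]) / d
    return phi1, phi2

def lump_F(x, y, a, b, gamma1):
    """Closed form z^2 + 4 b^2 y'^2 + 1/(4 b^2) at t = 0 with shifts x0, y0."""
    x0 = (gamma1.imag * a - gamma1.real * b) / b
    y0 = gamma1.imag / (2 * b)
    xp, yp = x - x0, y - y0
    z = xp - 2 * a * yp
    return z**2 + 4 * b**2 * yp**2 + 1 / (4 * b**2)

a, b = 0.3, 0.7
k1, gamma1 = complex(a, b), complex(0.4, -1.1)
x, y = 1.25, -0.5
B = lump_matrix(x, y, k1, gamma1)
phi1, phi2 = residues(B)
f1 = x - 2 * k1 * y
print("|det B - F|      :", abs(det2(B) - lump_F(x, y, a, b, gamma1)))
print("defect in (1.15) :", abs(1 + phi2 / (k1 - k1.conjugate()) + 1j * (f1 + gamma1) * phi1))
```

Both printed numbers should be at the level of floating-point rounding. Cramer's rule is used only because the system is 2 × 2; for larger $N$ a general linear solver would be the natural choice.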

Real nonsingular potentials are obtained by taking $k_\alpha$, $\gamma_\alpha$ to satisfy $k_{j+N} = \bar k_j$, $\gamma_{j+N} = \bar\gamma_j$, $j = 1, \dots, N$, where the bar denotes complex conjugate. The simplest real nonsingular solution of KPI corresponds to taking $N = 1$, with $k_1 = a + ib$,

$$F = z^2 + 4b^2 y'^2 + \frac{1}{4b^2}, \tag{1.19a}$$

where

$$z = x' - 2ay', \quad x' = x - 12(a^2 + b^2)t - x_0, \quad y' = y - 12at - y_0, \quad x_0 = \frac{\gamma_I a - \gamma_R b}{b}, \quad y_0 = \frac{\gamma_I}{2b}, \tag{1.19b}$$

and $a$, $b$, $\gamma_R$, $\gamma_I$ are arbitrary real constants. This is the well known one lump solution. Later we will say that the solution has index or charge one.

The foregoing discussion summarizes what had been considered (previous to ref. [4]) regarding the discrete spectrum of the operator (1.6) and the associated lump type solutions, which correspond to simple poles $k_\alpha$ with the condition $Q(k_\alpha) = 1$, $\forall\alpha$, enforced. The KPI solutions are localized structures that move uniformly and do not suffer a phase shift upon interaction.

An important discovery was that there exist nonsingular localized decaying solutions $u(x, y, t)$ to KPI outside the above class of lump solutions. One such new configuration is $u(x, y, t) = 2\,\frac{\partial^2}{\partial x^2}\log F$, where

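$$F = \bigl(z^2 - 4b^2 y'^2 + \delta_R - 24bt\bigr)^2 + \Bigl(2y'(1 + 2bz) + \frac{\gamma_I}{b} - \delta_I\Bigr)^2 + \frac{1}{b^2}\Bigl[\Bigl(z - \frac{1}{2b}\Bigr)^2 + 4b^2 y'^2 + \frac{1}{4b^2}\Bigr], \tag{1.20}$$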

where $z$, $x'$, $y'$ are defined above in Eq. (1.19b) and $a$, $b$, $\gamma_R$, $\gamma_I$, $\delta_R$, $\delta_I$ are arbitrary real constants. This particular solution of KPI was constructed in [5], but without reference to the underlying scattering problem. It went almost unnoticed until recently (cf. [6]), where it was rederived, again purely in terms of KP solutions and related dynamics. The above solution, although sharing some of the properties of solitary waves, is not encompassed by the aforementioned class of lump solutions (1.16)–(1.18). We shall show that this solution is really part of the discrete spectrum corresponding to poles of higher order and a higher order index, in this case index two. Thus the previous theory regarding the discrete spectrum of the Schrödinger operator, based upon assuming simple poles, was not complete. Additionally there is the further interesting underlying characterization property of an index and its meaning with regard to inverse scattering.

2. Solutions and the Spectrum of the Nonstationary Schrödinger Operator

In what follows we assume that $u(x, y)$ is real (the extension to more general potentials follows along similar lines), and suitably decaying such that the relevant integral equations are of Fredholm type. We leave the justification of this to later study. All results obtained are consistent with this assumption. Our results may then be summarized as follows:

i) If $k = k_1$ is an eigenvalue, then its complex conjugate, $\bar k_1$, is an eigenvalue. The numbers of linearly independent homogeneous solutions to (1.6) at $k_1$ and $\bar k_1$ are equal (Result I below).

ii) Eigenfunctions corresponding to eigenvalues $k_1$, $k_2$ are orthogonal (with respect to an appropriate pairing) if $k_1 \neq \bar k_2$ (Result II below). Otherwise their scalar product equals the “charge” $Q$ (Result IV below, see Eq. (2.11.2)).

iii) The charge $Q$ can take on other natural values besides $Q = 1$. If $Q \neq 1$, then Eq. (1.12) is no longer valid. Instead a new equation holds; this is discussed in Sect. 3.

iv) $Q$ has a topological interpretation as $Q$ = winding number of $H$, where $H$ is related to the residue $\Phi$ of the eigenfunction $\mu$ satisfying Eq. (1.6) by

$$\Phi = i\,\frac{\partial}{\partial f}\log H + O\Bigl(\frac{1}{r^2}\Bigr), \quad r^2 \equiv x^2 + y^2 \to \infty,$$

where $f \equiv x - 2k_1 y$ and the derivative is taken along a closed contour at infinity, $\Gamma_\infty$, that surrounds the origin once. In fact we find $H = f^Q + O(1/r)$ as $r \to \infty$.

v) Solutions of the nonstationary Schrödinger operator (Eqs. (1.2), (1.6)) can have higher order poles.

vi) Characterization of the class of reflectionless potentials requires giving the position and multiplicity of the poles. Additional determining data are the multiplicity and the associated indices $Q_j$ of the eigenfunction satisfying Eq. (1.6). These meromorphic wave functions correspond to real, nonsingular and decaying potentials of the form $u(x, y) = 2\,\frac{\partial^2}{\partial x^2}\log F(x, y, \gamma, \delta, \dots)$, where $F$ is a certain polynomial and $\gamma, \delta, \dots$ are constants. They also correspond to localized, real, nonsingular, decaying solutions of the KPI equation of the form

$$u(x, y, t) = 2\,\frac{\partial^2}{\partial x^2}\log F(x, y, \gamma(t), \delta(t), \dots). \tag{2.1}$$

The dynamical evolution of the maxima of $u(x, y, t)$ is nontrivial as opposed to the situation for standard lumps or solitons. Equations (1.16)–(1.17) and (1.20) are members of this class.
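The membership of (1.20) in this class can be spot-checked with exact arithmetic. The sketch below rests on two assumptions that are not stated in the paper: (a) the substitution $u = 2\partial_x^2\log F$ turns (1.5) into the standard Hirota bilinear identity $F F_{xt} - F_x F_t + F F_{xxxx} - 4F_x F_{xxx} + 3F_{xx}^2 - 3(F F_{yy} - F_y^2) = 0$, and (b) (1.20) reads $F = A^2 + B^2 + C/b^2$ with $A = z^2 - 4b^2y'^2 + \delta_R - 24bt$, $B = 2y'(1 + 2bz) + c$, $C = (z - 1/(2b))^2 + 4b^2y'^2 + 1/(4b^2)$, where the single constant `c` stands in for $\gamma_I/b - \delta_I$. All function names are hypothetical helpers.

```python
from fractions import Fraction as Fr

# Polynomials in (x, y, t) are stored as {(i, j, k): coeff}, meaning coeff * x^i y^j t^k.

def padd(p, q, s=1):
    """Return p + s*q, dropping zero coefficients."""
    r = dict(p)
    for m, c in q.items():
        r[m] = r.get(m, 0) + s * c
    return {m: c for m, c in r.items() if c != 0}

def pmul(p, q):
    """Return the product p*q."""
    r = {}
    for (i1, j1, k1), c1 in p.items():
        for (i2, j2, k2), c2 in q.items():
            m = (i1 + i2, j1 + j2, k1 + k2)
            r[m] = r.get(m, 0) + c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def pdiff(p, v):
    """Partial derivative with respect to variable v (0: x, 1: y, 2: t)."""
    r = {}
    for m, c in p.items():
        if m[v] > 0:
            n = list(m)
            n[v] -= 1
            r[tuple(n)] = c * m[v]
    return r

def lin(*terms):
    """Linear combination of (coeff, polynomial) pairs."""
    r = {}
    for c, p in terms:
        r = padd(r, p, Fr(c))
    return r

ONE = {(0, 0, 0): Fr(1)}
X, Y, T = {(1, 0, 0): Fr(1)}, {(0, 1, 0): Fr(1)}, {(0, 0, 1): Fr(1)}

def multipole_F(a, b, x0, y0, delta_R, c):
    """F of (1.20) as read above; c is a stand-in for gamma_I/b - delta_I."""
    xp = lin((1, X), (-12 * (a * a + b * b), T), (-x0, ONE))   # x' of (1.19b)
    yp = lin((1, Y), (-12 * a, T), (-y0, ONE))                 # y' of (1.19b)
    z = lin((1, xp), (-2 * a, yp))
    A = lin((1, pmul(z, z)), (-4 * b * b, pmul(yp, yp)), (delta_R, ONE), (-24 * b, T))
    B = lin((2, yp), (4 * b, pmul(yp, z)), (c, ONE))
    zs = lin((1, z), (-1 / (2 * b), ONE))
    C = lin((1, pmul(zs, zs)), (4 * b * b, pmul(yp, yp)), (1 / (4 * b * b), ONE))
    return lin((1, pmul(A, A)), (1, pmul(B, B)), (1 / (b * b), C))

def hirota_residual(F):
    """Left-hand side of the bilinear identity in assumption (a), as a polynomial."""
    Fx, Fy, Ft = pdiff(F, 0), pdiff(F, 1), pdiff(F, 2)
    Fxx = pdiff(Fx, 0)
    Fxxx = pdiff(Fxx, 0)
    Fxxxx = pdiff(Fxxx, 0)
    Fxt, Fyy = pdiff(Fx, 2), pdiff(Fy, 1)
    return lin((1, pmul(F, Fxt)), (-1, pmul(Fx, Ft)),
               (1, pmul(F, Fxxxx)), (-4, pmul(Fx, Fxxx)), (3, pmul(Fxx, Fxx)),
               (-3, pmul(F, Fyy)), (3, pmul(Fy, Fy)))

F = multipole_F(Fr(1, 3), Fr(2, 5), x0=Fr(1), y0=Fr(-2), delta_R=Fr(3, 7), c=Fr(5, 4))
print("residual vanishes identically:", hirota_residual(F) == {})
```

Rational arithmetic is used so that a vanishing residual means every polynomial coefficient cancels exactly, rather than to within a numerical tolerance. The parameter values are arbitrary samples with $b \neq 0$.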


We now elaborate on the above statements. As previously mentioned, we denote $G_{k_\alpha} \equiv G(k = k_\alpha)$. We call homogeneous solutions of the operator $G_{k_1}$ those functions $\omega(x, y)$ which satisfy

$$G_{k_1}\omega \equiv \omega(x, y) - \int dx' \int dy'\; G(x - x', y - y', k_1)\,u(x', y')\,\omega(x', y') = 0. \tag{2.2.1}$$

The span (vector space) of all such functions is denoted $\mathrm{Ker}\,G_{k_1}$. Similarly we refer to homogeneous adjoint solutions as $\rho(x, y)$. Homogeneous solutions to the adjoint operator $G^\dagger_{k_1}$ satisfy

$$G^\dagger_{k_1}\rho \equiv \rho(x, y) - \int dx' \int dy'\; \bar G(x' - x, y' - y, k_1)\,u(x, y)\,\rho(x', y') = 0. \tag{2.2.2}$$

We reiterate that certain properties of these operators, e.g. compactness and decay of their solutions, are assumed in this discussion.

Result I. At any point $k_1$ the spaces $\mathrm{Ker}\,G^\dagger_{k_1}$, $\mathrm{Ker}\,G_{k_1}$, $\mathrm{Ker}\,G_{\bar k_1}$ satisfy

i)
$$\mathrm{Ker}\,G^\dagger_{k_1} = u(x, y)\,\mathrm{Ker}\,G_{\bar k_1}, \tag{2.3.1}$$
i.e. $\omega_{\bar 1}(x, y)$ is a homogeneous solution at $\bar k_1$ if $\rho(x, y) \equiv u\,\omega_{\bar 1}(x, y)$ is an adjoint function at $k_1$. (Note, the potential is assumed to be real.) In particular $\mathrm{Dim\,Ker}\,G^\dagger_{k_1} = \mathrm{Dim\,Ker}\,G_{\bar k_1}$.

ii)
$$\mathrm{Dim\,Ker}\,G_{\bar k_1} = \mathrm{Dim\,Ker}\,G_{k_1}. \tag{2.3.2}$$

Thus, at any point $k_1$ the number of linearly independent homogeneous solutions to $G_{k_1}\omega_1 = 0$, $G^\dagger_{k_1}\rho = 0$ and $G_{\bar k_1}\omega_{\bar 1} = 0$ is the same.

Proof. i) Using (1.9.c) we have for the adjoint equation

$$\rho = \iint dx'\,dy'\; u(x, y)\,G(x - x', y - y', \bar k_1)\,\rho(x', y').$$

Hence if a solution $\rho$ to this equation exists, letting $\rho = \zeta u$ implies that $\zeta$ satisfies $G_{\bar k_1}\zeta = 0$. Thus $\rho \in \mathrm{Ker}\,G^\dagger_{k_1} \Rightarrow \zeta \in \mathrm{Ker}\,G_{\bar k_1}$. The proof holds also in the opposite direction and i) follows.

ii) Fredholm’s first theorem gives that the number of linearly independent solutions to $G_{k_1}\Phi = 0$ and $G^\dagger_{k_1}\rho = 0$ is the same. Thus $\mathrm{Dim\,Ker}\,G_{k_1} = \mathrm{Dim\,Ker}\,G^\dagger_{k_1}$. These two facts together imply the statement. □


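Result II. The system of eigenfunctions satisfies the orthogonality relationship

$$\langle \omega_\alpha, \omega_\beta \rangle = 0, \quad \beta \neq \bar\alpha, \tag{2.4}$$

where $\omega_\alpha$ denotes an eigenfunction at $k = k_\alpha$, and where

$$\langle f, g \rangle \equiv \frac{1}{\pi}\iint \bar f_x\, g\; dx\,dy \tag{2.5}$$

is the relevant bilinear form. Note that we find it convenient to define $k_{\bar\alpha} \equiv \bar k_\alpha$. Hence $\omega_{\bar\alpha}$ denotes an eigenfunction at $\bar k_\alpha$.

Proof. (1.2') implies that

$$0 = \iint dx\,dy\; \bar\omega_\beta\,(i\partial_y + \partial_{xx} + u + 2ik_\alpha\partial_x)\,\omega_\alpha = \iint dx\,dy\; \omega_\alpha\bigl\{(-i\partial_y + \partial_{xx} + u - 2i\bar k_\beta\partial_x)\,\bar\omega_\beta\bigr\} - 2i(k_\alpha - \bar k_\beta)\iint dx\,dy\; \omega_\alpha\,\partial_x\bar\omega_\beta$$

$$\Rightarrow\quad (k_\alpha - \bar k_\beta)\iint dx\,dy\; \omega_\alpha\,\partial_x\bar\omega_\beta = 0.$$

Here the first equality is integration by parts, and the term in braces vanishes because it is the complex conjugate of (1.2') for $\omega_\beta$ with $u$ real. (Actually we can extend this result, see Appendix I.) □

We assume that eigenfunctions $\mu(k)$ corresponding to “pure” multipole lumps have the following meromorphic representation:

$$\mu(k) = 1 + \sum_{\alpha=1}^{n}\Bigl\{\frac{\Phi_\alpha}{k - k_\alpha} + \sum_{r=2}^{M_\alpha}\frac{\Psi_{r,\alpha}}{(k - k_\alpha)^r}\Bigr\}, \tag{2.6}$$

where $\Phi_\alpha$, $\Psi_{r,\alpha}$ depend only on $x$, $y$. Around any pole $k_1$ we have the Laurent expansion

$$\mu(k) = \nu(k) + \frac{\Phi_1}{k - k_1} + \sum_{r=2}^{M}\frac{\Psi_r}{(k - k_1)^r} = \tilde\nu(k) + \nu(k_1) + \frac{\Phi_1}{k - k_1} + \sum_{r=2}^{M}\frac{\Psi_r}{(k - k_1)^r},$$

where $\nu(k_\alpha) \equiv \lim_{k\to k_\alpha}\bigl(\mu(k) - \text{singular part of } \mu(k) \text{ at } k_\alpha\bigr)$, and $\tilde\nu(k)$ is zero at $k_\alpha$ and tends to 0 as $k \to \infty$. Note that

$$\nu(k_\alpha) = 1 + \sum_{\substack{\beta=1 \\ \beta\neq\alpha}}^{n}\Bigl\{\frac{\Phi_\beta}{k_\alpha - k_\beta} + \sum_{r=2}^{M_\beta}\frac{\Psi_{r,\beta}}{(k_\alpha - k_\beta)^r}\Bigr\}. \tag{2.7}$$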

The next result establishes the connection between the pole structure of µ(k) and the discrete spectrum.


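Result III. The (negative) Laurent coefficients of $\mu(k)$ around $k_\alpha$: $\{\nu(k_\alpha), \Phi_\alpha, \Psi_{r,\alpha}\}$, $r = 2, \dots, M$, $M \ge 2$, satisfy the integral equations

$$\{G\Psi_{M,\alpha}\}_{k=k_\alpha} = \Bigl\{G\Psi_{M-1,\alpha} + \frac{\partial G}{\partial k}\Psi_{M,\alpha}\Bigr\}_{k=k_\alpha} = \dots = \Bigl\{G\Phi_\alpha + \frac{\partial G}{\partial k}\Psi_{2,\alpha} + \dots + \frac{1}{(M-1)!}\frac{\partial^{M-1}G}{\partial k^{M-1}}\Psi_{M,\alpha}\Bigr\}_{k=k_\alpha} = 0,$$

$$\Bigl\{G\nu + \frac{\partial G}{\partial k}\Phi_\alpha + \frac{1}{2}\frac{\partial^2 G}{\partial k^2}\Psi_{2,\alpha} + \dots + \frac{1}{M!}\frac{\partial^M G}{\partial k^M}\Psi_{M,\alpha}\Bigr\}_{k=k_\alpha} = 1. \tag{2.8}$$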

Proof. Take the limit $k \to k_\alpha$ in Eq. (1.6) and equate the coefficients of $1/(k - k_\alpha)^j$, $j = 0, 1, \dots, M$, on both sides. □

Result III implies that the poles of $\mu(k)$ are eigenvalues, i.e. they are points of the discrete spectrum, while $\Psi_{M,\alpha}$ are eigenfunctions at the eigenvalue $k_\alpha$. By Result I we have that the discrete spectrum is an even dimensional set $\{k_j, \bar k_j\}_{j=1\dots N}$. As a consequence, pure meromorphic functions $\mu$ have the structure (2.6), where the poles occur in pairs $(k_j, k_{j+N}) \equiv (k_j, \bar k_j)$, and hence without loss of generality the positions can be arranged such that $\mathrm{Im}(k_j) > 0$, $j = 1, 2, \dots, N$. We find it convenient to define $\bar j = j + N$.

Suppose $\mu$ corresponds to the discrete spectrum of the Schrödinger operator and has $2N$ poles $\{k_j, \bar k_j\}_{j=1\dots N}$ with multiplicities $\{M_j, M_{\bar j}\}_{j=1\dots N}$, and tends to 1 as $k \to \infty$. Such an eigenfunction is given by

$$\mu(k) = 1 + \sum_{j=1}^{N}\Bigl\{\frac{\Phi_j}{k - k_j} + \frac{\Phi_{\bar j}}{k - \bar k_j} + \sum_{r=2}^{M_j}\frac{\Psi_{r,j}}{(k - k_j)^r} + \sum_{r=2}^{M_{\bar j}}\frac{\Psi_{r,\bar j}}{(k - \bar k_j)^r}\Bigr\}, \tag{2.6'}$$

where we have used $k_{j+N} = \bar k_j$ in (2.6), $j = 1, \dots, N$. Fundamental objects in our development are certain quantities that we call indices (winding numbers) or charges. At any pole $k_\alpha$, $\alpha = 1, \dots, 2N$, we define the index of the pole as

$$Q_\alpha \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_\alpha)\iint u\,\Phi_\alpha\,dx\,dy, \tag{2.9}$$

where $\Phi_\alpha$ is the residue of $\mu$ at the pole. We also introduce the secondary indices

$$Q_{r,\alpha} \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_\alpha)\iint u\,\Psi_{r,\alpha}\,dx\,dy, \quad 2 \le r \le M_\alpha. \tag{2.10}$$

Note. To stress the importance of the charges $Q_{1,\alpha}$ we drop the subscript and write $Q_\alpha \equiv Q_{1,\alpha}$.

Result IV. Let $\mu(k)$ be given by Eq. (2.6') with discrete spectrum $\{k_j, \bar k_j\}_{j=1}^{N}$. Then the indices $Q_\alpha$ satisfy the following properties:

i)
$$\sum_{j=1}^{N} Q_j = \sum_{j=1}^{N} Q_{\bar j}, \tag{2.11.1}$$

ii)
$$Q_\alpha = \mathrm{sign}(\mathrm{Im}\,k_\alpha)\,\langle \Phi_{\bar\alpha}, \Phi_\alpha \rangle, \qquad Q_{r,\alpha} = \mathrm{sign}(\mathrm{Im}\,k_\alpha)\,\langle \Phi_{\bar\alpha}, \Psi_{r,\alpha} \rangle, \tag{2.11.2}$$

iii)
$$\bar Q_\alpha = Q_{\bar\alpha}, \qquad \bar Q_{r,j} = Q_{r,\bar j}, \tag{2.11.3}$$

iv)
$$Q_{M,\alpha} = 0, \tag{2.11.4}$$

where $M \ge 2$ is the multiplicity of the pole. Proofs of these facts are given in Appendices I and II.

3. Equations Determining the Eigenfunction

In this section we discuss a method for deriving the equations which serve to fix eigenfunctions $\mu(k)$ satisfying Eq. (1.6) with a certain assumed analytic structure. Once these equations have been determined, we show in Sect. 4 how they can be used to solve for particular cases of eigenfunctions and related reflectionless multipole potentials which are real, nonsingular and decaying at infinity. A linear relationship is obtained.

Result V. Simple poles. We begin by analyzing simple pole solutions of Eq. (1.6), i.e. special cases of Eq. (2.6). Assume that the solution to Eq. (1.6) has the following structure around $k = k_1$:

$$\mu(k) = \nu(k) + \frac{\Phi}{k - k_1}, \tag{3.1}$$

where $\nu(k)$ is regular at $k_1$ and $\nu(k) \to 1$ as $k \to \infty$. Define recursively the forcings $h_n(f, y)$, $n = 0, \dots, n_0$, by

$$h_0 = 1, \quad \partial_f h_n = i\,h_{n-1}, \quad (i\partial_y + \partial_{ff})h_n = 0, \tag{3.2a}$$

where $f \equiv f(k_1) = x - 2k_1 y$ (see (1.13)). Note that the first few of them are

$$h_0 = 1, \quad h_1 = i(f + \gamma_s), \quad h_2 = \tfrac{1}{2}\bigl(h_1^2 - \delta_s - 2iy\bigr), \quad h_3 = \tfrac{1}{3!}\bigl(h_1^3 - 3h_1(\delta_s + 2yi) + \epsilon_s\bigr),$$

and $\gamma_s$, $\delta_s$, $\epsilon_s$ are arbitrary constants (here the subscript $s$ stands for simple pole). We also write $H_n(x, y) \equiv h_n(x, y, \gamma_s = 0, \delta_s = 0, \epsilon_s = 0, \dots)$. Thus

$$H_0 = 1, \quad H_1 = if, \quad H_2 = -\tfrac{1}{2}(2iy + f^2), \quad H_3 = \tfrac{1}{3!}(6yf - if^3). \tag{3.2b}$$
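The explicit forcings can be checked against the recursion (3.2a) numerically. The sketch below treats $f$ and $y$ as independent variables, as (3.2a) does, and estimates the derivatives by central differences, so the defects it reports are small but not exactly zero. The function names, the step size and the sample constants are illustrative assumptions.

```python
def h(n, f, y, gamma=0j, delta=0j, eps=0j):
    """Forcings h_n for n = 0..3, with f and y treated as independent variables."""
    h1 = 1j * (f + gamma)
    if n == 0:
        return 1 + 0j
    if n == 1:
        return h1
    if n == 2:
        return 0.5 * (h1**2 - delta - 2j * y)
    if n == 3:
        return (h1**3 - 3 * h1 * (delta + 2j * y) + eps) / 6
    raise ValueError("only n = 0..3 are written out")

def recursion_defects(n, f, y, step=1e-3, **consts):
    """Return (|d_f h_n - i h_{n-1}|, |(i d_y + d_ff) h_n|) from central differences."""
    def g(ff, yy):
        return h(n, ff, yy, **consts)
    d_f = (g(f + step, y) - g(f - step, y)) / (2 * step)
    d_ff = (g(f + step, y) - 2 * g(f, y) + g(f - step, y)) / step**2
    d_y = (g(f, y + step) - g(f, y - step)) / (2 * step)
    return abs(d_f - 1j * h(n - 1, f, y, **consts)), abs(1j * d_y + d_ff)

consts = dict(gamma=0.3 - 0.2j, delta=1.1j, eps=2.0 + 0j)
for n in (1, 2, 3):
    print(n, recursion_defects(n, 0.7 - 1.3j, 0.4, **consts))
```

Setting all three constants to zero reproduces $H_1$, $H_2$, $H_3$ of (3.2b), so the same routine checks those as well.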

Let $\rho$ be the solution of the adjoint equation at $k_1$ and $n_0$ be the greatest integer for which the integrals $\iint H_{n-1}\,\bar\rho(x, y)\,dx\,dy$, $n = 0, \dots, n_0$, exist. Suppose also that $\mathrm{Dim\,Ker}\,G_{k_1} = 1$. In what follows we define the index $Q = Q(k_1) = Q_1$. Then

i) $Q = n_0$;

ii) $\Phi$ is related with $\nu$ as follows:

$$Q = 1 \Rightarrow \{\nu + h_1\Phi\}_{k=k_1} = 0, \tag{3.3.1}$$

$$Q = 2 \Rightarrow \Bigl\{\frac{d\nu}{dk} + h_1\nu + h_2\Phi\Bigr\}_{k=k_1} = 0, \tag{3.3.2}$$

$$Q = 3 \Rightarrow \Bigl\{\frac{1}{2}\frac{d^2\nu}{dk^2} + h_1\frac{d\nu}{dk} + h_2\nu + h_3\Phi\Bigr\}_{k=k_1} = 0, \tag{3.3.3}$$

$$\vdots$$

$$Q = n \Rightarrow \Bigl\{\frac{1}{(n-1)!}\frac{d^{n-1}\nu}{dk^{n-1}} + \dots + h_{n-2}\frac{d\nu}{dk} + h_{n-1}\nu + h_n\Phi\Bigr\}_{k=k_1} = 0; \tag{3.3.n}$$

iii) if $\Phi$ is related with $\mu$ by Eq. (3.3.n) then as $r^2 \equiv x^2 + y^2 \to \infty$,

$$\Phi \approx i\,\frac{\partial}{\partial f}\log H_n \approx i\,\frac{\partial}{\partial f}\log f^n = \frac{in}{f}, \quad\text{and}\quad Q = \text{winding number of } H_n.$$

Hence $Q$ is a topological invariant for the Schrödinger operator that counts the number of zeros that $H_n(f, y)$ has as a function of $f$.

Remarks.

1. We assume that it is possible to differentiate Eq. (1.6) $(n_0 - 1)$ times under the integral sign.

2. The proof below shows also that at $k = k_1$ there exist solutions $\tilde\mu_n$, $n = 0, \dots, n_0$, to the set of equations

$$G_{k_1}\tilde\mu_n = A_n H_{n-1}, \quad n = 0, \dots, n_0, \tag{3.4}$$

where $A_n \equiv \frac{1}{n}(Q - 1)\cdots(Q - n)$, $H_{-1} \equiv 0$. This holds if the following Fredholm alternative conditions are satisfied:

$$F_n \equiv A_n\iint H_{n-1}\,\bar\rho(x, y)\,dx\,dy = 0, \quad n = 0, \dots, n_0. \tag{3.5}$$

These conditions typically require $Q = n_0$.

3. $n_0$ is assumed to be finite. Otherwise $\rho(x, y)$ would need to vanish faster than any power. This would contradict our assumption of rationally decreasing solutions.

4. The relationship between the Laurent coefficients and $\nu(k)$ depends on the value that the charge $Q$ takes, which is assumed to be a positive integer. In Sect. 4 it is shown, given the pole structure of the function $\mu$ around both $k_1$ and $\bar k_1$, and $\mathrm{Dim\,Ker}\,G_{k_1} = 1$, that the Fredholm conditions (3.5) fix the value of $Q$ to be $n_0$.

5. The case $Q = 0$ is not ruled out by the discussion below. We assume $Q \neq 0$.

6. The results of [2] correspond to the case $n_0 = 1$. (This is the case of the well known lumps associated with KPI, discussed briefly in Sect. 1 (ref. [2]).)

Proof. Proof of i) and ii). For simple poles, Eqs. (2.8) are

$$G_{k_1}\Phi = 0, \quad \Bigl\{G\nu + \frac{\partial G}{\partial k}\Phi\Bigr\}_{k_1} = 1.$$

Use of (1.9.b) yields that $\mu_1 \equiv \{\nu + H_1\Phi\}_{k=k_1}$ solves

$$G_{k_1}\Phi = 0, \quad G_{k_1}\mu_1 = 1 - Q. \tag{3.6}$$
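The passage from (1.9.b) to (3.6) can be spelled out as follows. This is a sketch under the standing assumption that all integrals converge, with $f(x, y) = x - 2ky$, $H_1 = if$, the operator (1.7), and $Q$ as in (1.14b); primes denote evaluation at $(x', y')$.

```latex
\begin{aligned}
\Bigl(\tfrac{\partial G}{\partial k}\,g\Bigr)(x,y)
 &= -\iint \partial_k G(x-x',y-y',k)\,(ug)(x',y')\,dx'\,dy' \\
 &= i f(x,y)\iint G\,(ug)'\,dx'\,dy'
    \;-\; i\iint G\,(u f g)'\,dx'\,dy'
    \;+\; \frac{\operatorname{sign}(\operatorname{Im} k)}{2\pi i}\iint (ug)'\,dx'\,dy' .
\end{aligned}
% The second line uses (1.9.b) at the shifted arguments and the linearity of f:
% f(x-x', y-y') = f(x,y) - f(x',y').
%
% For g = \Phi at k = k_1, the homogeneous equation G_{k_1}\Phi = 0 gives
% \iint G\,(u\Phi)' = \Phi, so that
\Bigl\{\tfrac{\partial G}{\partial k}\,\Phi\Bigr\}_{k_1}
 = H_1\Phi - \iint G\,(u H_1 \Phi)'\,dx'\,dy' + Q
 = G_{k_1}(H_1\Phi) + Q ,
\qquad\text{hence}\qquad
G_{k_1}\bigl(\nu + H_1\Phi\bigr) = 1 - Q .
```

The last equality follows by inserting this expression into $\{G\nu + \frac{\partial G}{\partial k}\Phi\}_{k_1} = 1$.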

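To proceed further, we need the value of the charge $Q$, which as mentioned above is not fixed unless we specify the pole structure of the function $\mu$ around both $k_1$ and $\bar k_1$. This matter is considered in the next section. Suppose we assume that $Q = 1$. With the assumption $\mathrm{Dim\,Ker}\,G_{k_1} = 1$ this implies that $\mu_1 = -i\gamma\Phi$, which is (3.3.1). When $Q \neq 1$, Eqs. (3.6) do not imply a relationship between $\nu$ and $\Phi$. The key observation that allows us to obtain equations between $\nu$ and $\Phi$ is that $\mu$ satisfies the equations obtained from formal differentiation of (1.6) under the integral sign $n_0$ times when $Q = n_0$. We thus obtain, along with (2.8), the following further equations:

$$\Bigl\{G\frac{d\nu}{dk} + \frac{\partial G}{\partial k}\nu + \frac{1}{2}\frac{\partial^2 G}{\partial k^2}\Phi\Bigr\}_{k=k_1} = 0,$$

$$\Bigl\{G\frac{d^2\nu}{dk^2} + 2\frac{\partial G}{\partial k}\frac{d\nu}{dk} + \frac{\partial^2 G}{\partial k^2}\nu + \frac{1}{3}\frac{\partial^3 G}{\partial k^3}\Phi\Bigr\}_{k=k_1} = 0,$$

$$\vdots$$

It is convenient to introduce the functions

$$\mu_0 \equiv \Phi, \quad \mu_1 \equiv \{\nu + H_1\Phi\}_{k=k_1}, \quad \mu_2 \equiv \Bigl\{\frac{d\nu}{dk} + H_1\nu + H_2\Phi\Bigr\}_{k=k_1}, \quad \mu_3 \equiv \Bigl\{\frac{1}{2}\frac{d^2\nu}{dk^2} + H_1\frac{d\nu}{dk} + H_2\nu + H_3\Phi\Bigr\}_{k=k_1}, \quad \dots$$

Elaboration of these equations gives that

$$G_{k_1}\mu_0 = 0, \quad G_{k_1}\mu_1 = 1 - Q, \quad G_{k_1}\mu_2 = H_1(1 - Q/2) + A, \quad G_{k_1}\mu_3 = 2H_2(1 - Q/3) + H_1 B + C, \quad \dots$$

where we introduce the constants $A \equiv -(Q_0 + \eta/2)$, $B \equiv (Q_0 + \eta/3)$ with

$$Q_0 \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_1)\iint u\,\nu\,dx\,dy, \quad \eta \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_1)\iint u\,H_1\Phi\,dx\,dy. \tag{3.7}$$

(The explicit value of the constant $C$ is not important for this discussion.) Hence the linear combinations $\tilde\mu_0, \tilde\mu_1, \dots, \tilde\mu_{n_0}$,

$$\tilde\mu_0 \equiv \Phi, \quad \tilde\mu_1 \equiv -\mu_1, \quad \tilde\mu_2 \equiv (1 - Q)\mu_2 + A\tilde\mu_1, \quad \tilde\mu_3 \equiv (Q - 1)\Bigl(1 - \frac{Q}{2}\Bigr)\mu_3 + B\tilde\mu_2 - \Bigl(1 - \frac{Q}{2}\Bigr)C\tilde\mu_1, \quad \dots,$$

satisfy (3.4): $G_{k_1}\tilde\mu_n = A_n H_{n-1}$, $n = 0, \dots, n_0$.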


Existence of solutions to the above equations requires that the following Fredholm conditions are satisfied:

$$F_n \equiv A_n\iint H_{n-1}\,\bar\rho(x, y)\,dx\,dy = 0, \quad n = 0, \dots, n_0.$$

This is (3.5). Suppose $Q = 2$. Then we have from the above equation $G_{k_1}\tilde\mu_0 = G_{k_1}\tilde\mu_2 = 0$. Since we are assuming that $\mathrm{Dim\,Ker}\,G_{k_1} = 1$, this implies that $\tilde\mu_2$ must be proportional to $\tilde\mu_0$. Using this fact and the definitions above yields (3.3.2). In addition, $G_{k_1}\tilde\mu_1 = 1$ requires that $F_1 = 0$. (In Sect. 4 we show that, using the solution of the homogeneous adjoint equation, this Fredholm condition implies that the secondary charge $Q_2$ vanishes (see Eq. (2.10) for the definition of $Q_2$ at $\bar k_1$).) More generally Eq. (3.4) results in the equation $G_{k_1}\tilde\mu_n = A_n H_{n-1}$, $n = 1, \dots, n_0 - 1$, which implies the Fredholm conditions $F_n = 0$, $n = 1, \dots, n_0 - 1$. In Sect. 4 we will show that these equations along with the corresponding conditions at $\bar k_1$ typically determine the value of $Q = n_0$. In that case we also have $G_{k_1}\tilde\mu_0 = G_{k_1}\tilde\mu_{n_0} = 0$, and hence $\tilde\mu_{n_0}$ is proportional to $\tilde\mu_0$, yielding (3.3.n).

iii) Integrating Eq. (1.2'), and noting that for large $r$, $\Phi = O(1/r)$, Green’s theorem yields that if $k_1 = a + ib$ then

$$\mathrm{sign}(\mathrm{Im}\,k_1)\,Q = -\frac{1}{2\pi i}\iint (i\partial_y + \partial_{xx} + 2ik_1\partial_x)\Phi\;dx\,dy = -\frac{1}{2\pi}\int_{\Gamma_\infty}\Phi\,df,$$

where $f \equiv x - 2ay - 2iby$, and $\Gamma_\infty$ is a large circle (or rectangle) in the $f$ plane. We also note that as $x^2 + y^2 \to \infty$, $\nu \to 1$ and (3.3.n) with (3.2a) yields $\Phi \approx i\,\frac{\partial}{\partial f}\log H_n$, hence

$$Q = \frac{1}{2\pi i}\int_{\Gamma_\infty}\frac{\partial_f H_n}{H_n}\,df = \text{winding number of } H_n. \qquad\square$$
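The winding-number statement can be illustrated numerically for the explicit $H_n$ of (3.2b). The sketch below follows $H_n$ once around a large circle in the $f$ plane and accumulates the change of its argument. It assumes $b = \mathrm{Im}\,k_1 \neq 0$, so that $y = -\mathrm{Im}\,f/(2b)$ on the $f$ plane, and it assumes the radius is large enough for the $f^n$ term to dominate. The function names, the radius and the step count are illustrative choices.

```python
import cmath
import math

def H(n, f, b):
    """H_n of (3.2b) on the f plane, with y recovered from Im f = -2 b y."""
    y = -f.imag / (2 * b)
    if n == 1:
        return 1j * f
    if n == 2:
        return -0.5 * (2j * y + f**2)
    if n == 3:
        return (6 * y * f - 1j * f**3) / 6
    raise ValueError("only n = 1..3 are written out")

def winding_number(n, b=1.0, radius=200.0, steps=4000):
    """Net number of turns of H_n around 0 along the circle |f| = radius."""
    total = 0.0
    prev = H(n, complex(radius, 0.0), b)
    for s in range(1, steps + 1):
        theta = 2 * math.pi * s / steps
        cur = H(n, radius * cmath.exp(1j * theta), b)
        total += cmath.phase(cur / prev)   # small increment of arg H_n
        prev = cur
    return round(total / (2 * math.pi))

for n in (1, 2, 3):
    print(n, winding_number(n))
```

Summing the phase of successive ratios avoids branch-cut jumps, since each increment stays far from $\pm\pi$ when the step count is large.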

Result VI. Double poles. Assume that the solution to (1.6) has the following structure around $k = k_1$:

$$\mu(k) = \nu(k) + \frac{\Phi}{k - k_1} + \frac{\Psi}{(k - k_1)^2}, \tag{3.8}$$

where $\nu(k)$ is regular at $k_1$ and tends to 1 as $k \to \infty$. Note that as compared to the general formula (2.6) we simply write $\Psi \equiv \Psi_{2,1}$. Suppose also that $\Psi \neq 0$ and that $\mathrm{Dim\,Ker}\,G_{k_1} = 1$. Then

i) If $u = O(1/r^2)$ then $Q_2 = 0$, where $Q_2 = \frac{\mathrm{sign}(\mathrm{Im}\,k_1)}{2\pi i}\iint u\,\Psi\,dx\,dy$. Let $Q \ge 2$.

ii) In either of the cases below one has the following relationship between $\mu$ and its Laurent coefficients:

$$\{\Phi + h_1\Psi\}_{k=k_1} = 0, \tag{3.9.1}$$

and also:

1) $Q = 2 \Rightarrow$

$$\{\nu + h_{2,d}\Psi\}_{k=k_1} = 0, \tag{3.9.2}$$

where

$$h_1 \equiv h_{1,d} \equiv i(f + \gamma_d), \quad h_{2,d} \equiv -\tfrac{1}{2}\bigl(h_1^2 + 2iy + \delta_d\bigr),$$

and $\gamma_d$, $\delta_d$ are arbitrary constants (here $d$ stands for double pole). We find it convenient to use a similar notation to that of the simple pole case.

2) $Q = 3 \Rightarrow$

$$\Bigl\{\frac{d\nu}{dk} + (h_1 + 2\sigma)\nu + \frac{1}{3}h_{3,d}\Psi\Bigr\}_{k=k_1} = 0, \tag{3.9.3}$$

where $h_{3,d} \equiv -(h_1 + \sigma)^3 + 3\sigma^2(h_1 + \sigma) - 6i\sigma y + i\beta$, and $\beta$, $\delta_d$, $\sigma$ are arbitrary constants.

iii) If $\Phi$, $\Psi$ are related with $\mu$ by the above equations then as $r^2 \equiv x^2 + y^2 \to \infty$,

$$\Phi = i\,\frac{\partial}{\partial f_1}\log h_{n,d} + O\Bigl(\frac{1}{r^2}\Bigr), \quad\text{and}\quad Q = \frac{1}{2\pi i}\int_{\Gamma_\infty}\frac{\partial_f h_{n,d}}{h_{n,d}}\,df = \text{winding number of } h_{n,d}. \tag{3.10}$$

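Proof. i) For double poles, equations (2.8) are

$$G_{k_1}\Psi = 0, \quad \Bigl\{G\Phi + \frac{\partial G}{\partial k}\Psi\Bigr\}_{k=k_1} = 0, \quad \Bigl\{G\nu + \frac{\partial G}{\partial k}\Phi + \frac{1}{2}\frac{\partial^2 G}{\partial k^2}\Psi\Bigr\}_{k=k_1} = 1.$$

The first two equations in differential form correspond to (these equations are obtained by substituting (3.8) into (1.2))

$$(i\partial_y + \partial_{xx} + 2ik_1\partial_x + u)\Psi = 0, \tag{3.11.1}$$

$$(i\partial_y + \partial_{xx} + 2ik_1\partial_x + u)\Phi + 2i\,\partial_x\Psi = 0. \tag{3.11.2}$$

Assume that for large values of $x$, $y$, $\Phi \approx 1/f^{\alpha}$, $\Psi \approx 1/f^{\beta}$. Then (3.11.2) shows that $\beta = \alpha + 1$. Since $u = -2i\,\frac{\partial}{\partial x}\sum\Phi_j$ and we are assuming that $u = O(1/r^2)$, it must be that $\alpha = 1$. Next we integrate (3.11.1) and have, in view of the rate of decay of $\Psi$,

$$\mathrm{sign}(\mathrm{Im}\,k_1)\,Q_2 \equiv \frac{1}{2\pi i}\iint u\,\Psi\,dx\,dy = -\frac{1}{2\pi i}\iint (i\partial_y + \partial_{xx} + 2ik_1\partial_x)\Psi\;dx\,dy = 0.$$

The fact that $Q \ge 2$ follows by inspection of different cases and is also discussed further in the next section.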


ii) Using the Green’s function relations from Eq. (1.9), the above equations involving derivatives of the Green’s function are explicitly found to be (see the appendix for relations involving the higher derivatives of the Green’s function)

$$G_{k_1}\mu_0 = G_{k_1}\mu_1 = 0, \quad G_{k_1}\mu_2 = 1 - Q - \frac{\eta}{2}, \tag{3.12.1}$$

where in a similar spirit to the simple pole case we define the functions

$$\mu_0 \equiv \Psi, \quad \mu_1 \equiv \{\Phi + H_1\Psi\}_{k=k_1}, \quad \mu_2 \equiv \{\nu + H_1\Phi + H_2\Psi\}_{k=k_1}$$

and the constant

$$\eta \equiv \frac{1}{2\pi i}\,\mathrm{sign}(\mathrm{Im}\,k_1)\iint u\,H_1\Psi\,dx\,dy.$$

Note first that $\mathrm{Dim\,Ker}\,G_{k_1} = 1$ implies that $\mu_1 = -i\gamma\mu_0$ with $\gamma$ an arbitrary constant, and also that $\eta = -Q$. The first relationship yields (3.9.1). To proceed further we need the value of the charges, which is unknown unless we specify the pole structure of the function $\mu$ around both $k_1$ and $\bar k_1$ (see Sect. 4). The following general statements can be made. If $Q = 2$ then $\mu_2$ also solves the homogeneous equation and hence it must also be proportional to $\mu_0$. This yields (3.9.2). When $Q \neq 2$, Eqs. (3.12) do not imply a relationship between $\nu$ and $\mu_0$. To get around this difficulty we assume that differentiation of (1.6) under the integral sign is permitted (this may be validated a posteriori). We then obtain

$$\Bigl\{G\frac{d\nu}{dk} + \frac{\partial G}{\partial k}\nu + \frac{1}{2}\frac{\partial^2 G}{\partial k^2}\Phi + \frac{1}{6}\frac{\partial^3 G}{\partial k^3}\Psi\Bigr\}_{k=k_1} = 0.$$

Using the Green’s function properties (see the appendix) we get, in addition to (3.12.1),

$$G_{k_1}\mu_3 = C + H_1(1 - Q/3), \tag{3.12.2}$$

where

$$\mu_3 \equiv \Bigl\{\frac{d\nu}{dk} + H_1\nu + H_2\Phi + H_3\Psi\Bigr\}_{k=k_1}$$

and $C$ is a constant, the explicit value of which is not important here. Assume next that $Q_1 = 3$. We have

$$G_{k_1}\mu_0 = G_{k_1}\mu_1 = 0, \quad G_{k_1}\mu_2 = -\frac{1}{2}, \quad G_{k_1}\mu_3 = C.$$

With $\tilde\mu_0 \equiv \mu_0$, $\tilde\mu_1 \equiv \mu_1$, $\tilde\mu_2 \equiv -2\mu_2$, $\tilde\mu_3 \equiv \mu_3 + 2C\mu_2$ we have that

$$G_{k_1}\tilde\mu_0 = G_{k_1}\tilde\mu_1 = G_{k_1}\tilde\mu_3 = 0, \quad G_{k_1}\tilde\mu_2 = 1,$$

which, since $\mathrm{Dim\,Ker}\,G_{k_1} = 1$, imply that $\tilde\mu_3$ is proportional to $\tilde\mu_0 = \Psi$, and this confirms Eq. (3.9.3). We also require the Fredholm condition $F_1 = 0$ to be satisfied. (As mentioned earlier, in Sect. 4 we show how such conditions can be used to fix the charges.) Higher order cases are treated similarly.

iii) As $x^2 + y^2 \to \infty$, we assume that $\nu \to 1$, and (3.9.n) yields in both the cases $n = 2, 3$

$$\Phi \approx i\,\frac{\partial}{\partial f}\log h_{n,d} \approx i\,\frac{\partial}{\partial f}\log f^n = \frac{in}{f}.$$

The continuation of the proof is similar to that of the simple pole. □


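Result VII. Triple poles. Assume that around $k = k_1$ the solution $\mu(k)$ to (1.6) has the structure

$$\mu = \nu(k) + \frac{\Phi}{k - k_1} + \frac{\Psi_2}{(k - k_1)^2} + \frac{\Psi_3}{(k - k_1)^3},$$

where $\nu(k)$ is regular at $k_1$ and tends to 1 as $k \to \infty$, and $\Psi_3 \neq 0$. Let $n_0 - 3$ be the greatest integer for which the integrals $\int H_n \bar\rho(x, y)\,dx\,dy$, $n = 0 \ldots n_0$, exist. Suppose also that Dim Ker $G_{k_1} = 1$. Then

i) If $u = O(1/r^2)$ then $Q_2 = Q_3 = 0$, where $Q_j$, $j = 2, 3$, are defined as

$$Q_j = \frac{\operatorname{sign}(\operatorname{Im} k_1)}{2\pi i}\iint u\Psi_j\,dx\,dy,$$

and $Q = n_0 \geq 3$.

ii) In the case $Q = 3$ one has

$$\{\Psi_2 + h_1\Psi_3\}_{k=k_1} = 0, \quad \{\Phi + h_{2,t}\Psi_3\}_{k=k_1} = 0, \quad \{\nu + h_{3,t}\Psi_3\}_{k=k_1} = 0, \tag{3.13}$$

where

$$h_1 \equiv h_{1,t} \equiv i(f + \gamma_t), \quad h_{2,t} \equiv -\frac{1}{2}\left(h_1^2 + 2iy + \delta_t\right), \quad h_{3,t} \equiv \frac{1}{6}\left(h_1^3 + 3h_1(\delta_t + 2iy) + \epsilon_t\right).$$

$\gamma_t$, $\delta_t$, $\epsilon_t$ are arbitrary constants (here $t$ stands for triple pole). As $r \to \infty$,

$$\Phi \approx i\frac{\partial}{\partial f_1}\log h_{n,t} \approx i\frac{\partial}{\partial f_1}\log f^3 = \frac{3i}{f},$$

and

$$Q(k_1) = \frac{1}{2\pi i}\oint_{C_\infty}\frac{\partial_f h_{n,t}}{h_{n,t}}\,df = \text{winding number of } h_{n,t}.$$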

The proof is similar to the other cases (see the appendix). Higher order poles can be considered in a similar way. We will not pursue this topic any more here.
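The identification of the charge with a winding number can be checked numerically. The sketch below is only an illustration under stated assumptions: it treats $h_{2,t}$ and $h_{3,t}$ as polynomials in a complex variable $f$ with $y$ held fixed, takes the contour to be a large circle $|f| = R$, and uses arbitrary sample values for $\gamma_t$, $\delta_t$, $\epsilon_t$ and $y$ (none of these choices come from the paper). The function names are hypothetical.

```python
import cmath
import math

# Sample constants: arbitrary illustrative values, not taken from the paper.
GAMMA, DELTA, EPS, Y = 0.3 + 0.1j, 0.5 - 0.2j, 1.0 + 0.4j, 0.7

def h2_triple(f):
    """h_{2,t} = -(h1^2 + 2iy + delta)/2 with h1 = i(f + gamma); y is held fixed (assumption)."""
    h1 = 1j * (f + GAMMA)
    return -(h1**2 + 2j * Y + DELTA) / 2

def h3_triple(f):
    """h_{3,t} = (h1^3 + 3 h1 (delta + 2iy) + eps)/6 with h1 = i(f + gamma)."""
    h1 = 1j * (f + GAMMA)
    return (h1**3 + 3 * h1 * (DELTA + 2j * Y) + EPS) / 6

def winding_number(func, radius=50.0, steps=20000):
    """Number of turns of func(f) around 0 as f runs once around |f| = radius."""
    total = 0.0
    prev = func(radius)
    for n in range(1, steps + 1):
        cur = func(radius * cmath.exp(2j * math.pi * n / steps))
        total += cmath.phase(cur / prev)  # small increment of the argument
        prev = cur
    return round(total / (2 * math.pi))

index_quadratic = winding_number(h2_triple)
index_cubic = winding_number(h3_triple)
print(index_quadratic, index_cubic)
```

Because the circle encloses all zeros of each polynomial, the count equals the degree in $f$, which is the sense in which the cubic $h_{3,t}$ carries index three.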

4. Schrödinger Potentials

In this section we construct several families of decaying potentials for the nonstationary Schrödinger operator. We shall see that the assumed pole structure of µ around both k1 and k̄1 determines the value of all the relevant charges. The results of the last section show that, under the proviso Dim Ker Gk1 = 1, the assumed pole structure of µ of the form (2.6') yields a system of equations which serves to fix the coefficients of the poles and thus µ(k). We begin by considering the simplest possibility.


1+1 pole. Assume that $\mu$ is meromorphic with simple poles at points $k_1$, $\bar k_1$:

$$\mu(k) = 1 + \hat\mu(k) + \frac{\Phi_1}{k - k_1} + \frac{\Phi_{\bar 1}}{k - \bar k_1}, \tag{4.1}$$

where $\hat\mu(k)$ is regular around both $k_1$ and $\bar k_1$ and tends to 0 as $k \to \infty$. Then the charges defined by Eq. (2.9) are given by

i) $\bar Q_{\bar 1} = Q_1 = 1$. We use the notation $Q_{\bar 1} = Q(\bar k_1)$, $Q_1 = Q(k_1)$.

ii)
$$\nu(k_\alpha) = -i(f_\alpha + \gamma_\alpha)\Phi_\alpha, \qquad \alpha = 1, \bar 1. \tag{4.2}$$

Note. The possibility $Q_1 = 0$ is not ruled out by the proof below. We assume that $Q_{\bar 1}$, $Q_1$ do not vanish.

iii) Generalizing Eqs. (4.1–4.2) to allow $\mu(k)$ to have only simple poles at $k_\alpha$, real, decaying, nonsingular "reflectionless" potentials are obtained when the constants are chosen to satisfy $\gamma_{j+N} = \bar\gamma_j$, $j = 1, \ldots N$.

Proof. From Eq. (2.11.3) the charges at $\bar k_1$ are the complex conjugates of those at $k_1$. Equations (3.6) now are:

$$\begin{cases} G_{k_1}\Phi_1 = 0, \\ G_{k_1}\mu_1 = 1 - Q_1, \end{cases} \qquad \begin{cases} G_{\bar k_1}\Phi_{\bar 1} = 0, \\ G_{\bar k_1}\mu_{\bar 1} = 1 - Q_{\bar 1}. \end{cases}$$

The homogeneous solutions at $k_1$, $\bar k_1$ are $\Phi_1$ and $\Phi_{\bar 1}$ respectively, and from the proof of Result I in Sect. 2 (see below Eq. (2.3.1)) the homogeneous adjoint solutions are $\rho_1 = u(x, y)\Phi_{\bar 1}$ and $\rho_{\bar 1} = u(x, y)\Phi_1$. Using (2.11.3), the Fredholm alternative conditions (see Eq. (3.5)) yield

$$0 = F_1 \equiv (1 - Q_1)\iint dx\,dy\,\bar\rho_1 = (1 - Q_1)\bar Q_{\bar 1} = (1 - Q_1)Q_1 \Rightarrow Q_1 = 1.$$

(Recall that we assume $Q_1 \neq 0$.)

Note. ii) follows immediately from Eq. (3.3.1).

iii) From the discussion in Sect. 1 we already know that real, nonsingular potentials imply that $n = 2N$, $k_{j+N} = \bar k_j$, $\bar Q_{\bar j} = Q_j$. However these necessary conditions are not sufficient. We also need the constants $\gamma_\alpha$ to satisfy $\gamma_{j+N} = \bar\gamma_j$, $j = 1, \ldots N$. Indeed, as we pointed out in Sect. 1, the solution to (4.2) yields the potential (1.16–1.17). This formula shows that the potential is decaying, nonsingular and real if $\det(B) > 0$. Following ref. [8], note first that in block form

$$B = \begin{pmatrix} D & E \\ -\bar E & D^\dagger \end{pmatrix},$$

where

$$D_{ij} = (f_i + \gamma_i)\delta_{i,j} - i\,\frac{1 - \delta_{i,j}}{k_i - k_j}; \qquad E_{ij} = \frac{-i}{k_i - \bar k_j}; \qquad E^\dagger = E, \quad i, j = 1, 2, \ldots N.$$


Note that the position of the poles occurring in pairs $(k_j, \bar k_j)$ can be arranged such that $\operatorname{Im}(k_j) < 0$, $j = 1, 2 \ldots N$. We see that the determinant of $B$ is real from the following. The matrix $B$ has the property $\bar B_{\beta\alpha} = B_{\alpha+N,\beta+N}$ (indices mod $2N$), hence

$$\operatorname{Det} B_{\alpha\beta} = (-1)^N \operatorname{Det} B_{\alpha+N,\beta} = (-1)^{2N}\operatorname{Det} B_{\alpha+N,\beta+N} \;\Rightarrow\; \operatorname{Det} B_{\alpha\beta} = \operatorname{Det}\bar B_{\beta\alpha} = \overline{\operatorname{Det} B_{\alpha\beta}}.$$

Thus $\operatorname{Det} B_{\alpha\beta}$ is real. Then we note that $E$ is a positive definite matrix. To show this, use the Cauchy formula for determinants:

$$\det(E) = (-i)^N \prod_{1\le i<j\le N}(k_j - k_i)(\bar k_i - \bar k_j)\prod_{i,j=1}^{N}\frac{1}{k_i - \bar k_j} = \prod_{1\le i<j\le N}\frac{|k_i - k_j|^2}{|k_i - \bar k_j|^2}\;\prod_{i=1}^{N}\frac{1}{2|\operatorname{Im}(k_i)|}.$$

Therefore $\det(E) > 0$. Replacing $N$ by $p$, $1 \le p \le N$, in the above formula, one can show that all the leading principal minors of the Hermitian matrix $E$ are also positive. Hence, by a well-known criterion, $E$ is positive definite. It follows immediately that $\bar E$ is also positive definite. Next we prove that $\det B(x, y) > 0$ for all $x, y$. Assume that at some point $(x_1, y_1)$, $\det B(x_1, y_1) = 0$. Then at this point there exists a non-null vector $e \equiv (e_1, e_2) \in \mathbb{C}^{2N}$ satisfying $Be = 0$. This implies

$$De_1 + Ee_2 = 0; \qquad -\bar E e_1 + D^\dagger e_2 = 0.$$

These equations by manipulation yield

$$e_2^\dagger E e_2 + e_1^\dagger \bar E e_1 = 0.$$

Since $E$, $\bar E$ are positive definite this yields $(e_1, e_2) = (0, 0)$ and hence that $\det B(x, y) \neq 0$ for all $x, y$. But $\det B(x, y) \to \infty$ if $y^2 + x^2 \to \infty$, which in turn implies the statement.

The simplest Schrödinger potential corresponds to taking $\hat\mu(k) = 0$ in Eq. (4.1) with two conjugate simple poles $k_2 = \bar k_1$ and $\gamma_2 = \bar\gamma_1$, $k_1 = a + ib$, $z = x' - 2ay'$, $x' = x - x_0$, $y' = y - y_0$, where $x_0$, $y_0$ are constants related with $\gamma_1$ (see Eq. (1.19)). Then we have

$$\det B = F = z^2 + 4b^2y'^2 + \frac{1}{4b^2}. \tag{4.3}$$

The well known one lump solution is obtained from $u(x, y, t) = 2\frac{\partial^2}{\partial x^2}\log F$. We say that the solution has index one. Solutions constructed from simple poles were well known [2,3]. The first nontrivial potentials beyond that class are discussed next.
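As a quick check of formula (4.3), the following sketch evaluates the one-lump potential $u = 2\partial_x^2 \log F$ in closed form, using $F_x = 2z$ and $F_{xx} = 2$. The function names and the sample values $a = 0.5$, $b = 1$, $x_0 = y_0 = 0$ are illustrative choices only.

```python
def lump_F(x, y, a, b, x0=0.0, y0=0.0):
    """F = z^2 + 4 b^2 y'^2 + 1/(4 b^2), with z = x' - 2 a y' as in (4.3)."""
    xp, yp = x - x0, y - y0
    z = xp - 2 * a * yp
    return z * z + 4 * b * b * yp * yp + 1 / (4 * b * b)

def lump_u(x, y, a, b, x0=0.0, y0=0.0):
    """u = 2 d^2/dx^2 log F = 2 (F_xx F - F_x^2) / F^2, with F_x = 2z, F_xx = 2."""
    xp, yp = x - x0, y - y0
    z = xp - 2 * a * yp
    F = lump_F(x, y, a, b, x0, y0)
    return 2 * (2 * F - (2 * z) ** 2) / F**2

a, b = 0.5, 1.0
peak = lump_u(0.0, 0.0, a, b)    # value at the center of the lump
far = lump_u(100.0, 0.0, a, b)   # value far out along the x axis
print(peak, far * 100.0**2)
```

At the center the value is $16b^2$, and far from the center $u\,r^2$ approaches a constant, which is the rational $O(1/r^2)$ decay referred to in the text.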


Simple + double pole. Assume that $\mu$ has the following structure:

$$\mu = 1 + \hat\mu(k) + \frac{\Phi_1}{k - k_1} + \frac{\Psi_2}{(k - k_1)^2} + \frac{\Phi_{\bar 1}}{k - \bar k_1}, \tag{4.4}$$

where $\hat\mu(k)$ is regular at both $k_1$, $\bar k_1$ and tends to 0 as $k \to \infty$. Suppose also that Dim Ker $G_{k_1}$ = Dim Ker $G_{\bar k_1} = 1$. Then

i) $Q_1 = 2$. Note that from Eq. (2.11.3,4) the value of the charges at $k = \bar k_1$ follows from the charges at $k = k_1$.

ii) The Laurent coefficients satisfy the following system of linear algebraic equations.

a) At $k = k_1$ (see Eq. (3.9.1) and (3.9.2))

$$\{\Phi_1 + h_1\Psi_2\}_{k=k_1} = 0, \qquad \{\nu + h_{2,d}\Psi_2\}_{k=k_1} = 0. \tag{4.5.1}$$

b) At $k = \bar k_1$ (see Eq. (3.3.2))

$$\left\{\frac{d\nu}{dk} + h_1\nu + h_2\Phi_{\bar 1}\right\}_{k=\bar k_1} = 0. \tag{4.5.2}$$

Note that here we simply write $\Psi_{2,1} \equiv \Psi_2$.

iii) The function (4.4) with $\hat\mu(k) = 0$ solves (1.2, 1.6) with the Schrödinger potential $u(x, y) = 2\partial_{xx}\log F$,

$$F = \left(z^2 - 4b^2y'^2 + \delta_R\right)^2 + \left(2y'(1 + 2bz) + \frac{\gamma_I}{b} - \delta_I\right)^2 + \frac{1}{b^2}\left\{\left(z - \frac{1}{2b}\right)^2 + 4b^2y'^2 + \frac{1}{4b^2}\right\}, \tag{4.6}$$

where we define $k_1 \equiv a + ib$, $z = x' - 2ay'$, $x' = x - x_0$, $y' = y - y_0$, and

$$x_0 = \frac{\gamma_I a - \gamma_R b}{b}, \qquad y_0 = \frac{\gamma_I}{2b};$$

$\delta_R$, $\delta_I$ are arbitrary real constants.

Proof. Equation (3.6) for a simple pole at $\bar k_1$ and Eq. (3.12.1) for a double pole at $k_1$ yield

$$G_{k_1}\Psi_2 = 0, \qquad G_{k_1}\mu_2 = 1 - \frac{Q_1}{2}$$

and

$$G_{\bar k_1}\Phi_{\bar 1} = 0, \qquad G_{\bar k_1}\mu_{\bar 1} = 1 - Q_{\bar 1}.$$

As discussed in Result I, Sect. 2 (see below Eq. (2.3.1)), with the homogeneous solutions to (1.6) at $k_1$, $\bar k_1$ being $\Psi_2$, $\Phi_{\bar 1}$ respectively, the homogeneous adjoint solutions are


[Figure: surface plot of u over the (x, y) plane]

Fig. 1a. Snapshot of solution corresponding to potential (1.20) before interaction (a = 0.5, b = 1, γ = δ = 0)

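given by $\rho_1 = u(x, y)\Phi_{\bar 1}$ and $\rho_{\bar 1} = u(x, y)\Psi_2$. Also, since $\bar Q_{\bar 1} = Q_1$, the Fredholm alternative conditions yield

a) from $\rho_{\bar 1} = u(x, y)\Psi_2$:

$$0 = (1 - Q_1)\iint dx\,dy\,\bar\rho_{\bar 1} \equiv (1 - Q_1)Q_2.$$

This is an identity since $Q_2 = 0$;

b) from $\rho_1 = u(x, y)\Phi_{\bar 1}$:

$$0 = \left(1 - \frac{Q_1}{2}\right)\iint dx\,dy\,\bar\rho_1 \equiv \left(1 - \frac{Q_1}{2}\right)Q_1 \Rightarrow Q_1 = 2$$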

(recall Q1 6 = 0). Therefore Q1 = Q1¯ = 2 and by (3.3.2) and (3.9.2) i) is obtained. The simplest case corresponds to µ(k) ˆ = 0 and in this case ii) yields with δ1¯ = −δ¯1 ≡ δR + iδI : h2,d 81 − α81¯ = 1, − h1 3 1 h¯ 2,d 1 2α¯ 2 + α¯ − 8 ¯ = 1, − α¯ 81 + 2 ¯ |h1 | h1 h1 h¯ 1 1 1 . Solving this linear system we obtain the Laurent coefficients where k1 = a+ib, α ≡ 2ib 81 , 81¯ . Then from Eq. (4.5.1) we obtain 92 . Using (1.3) (which really amounts to Eq. (1.16)) the Schrödinger potential given via (4.6) follows. It is clear that this potential is real, and nonsingular. It has 6 free parameters. Note that u = 0( r12 ) as r 2 ≡ x 2 +y 2 → ∞. Hence the potential is in L2 but not in L1 . This “slow" decay is the underlying reason why the index is two. In Fig. 1 a typical plot of this function is displayed. u t


[Figure: surface plot of u over the (x, y) plane]

Fig. 1b. Snapshot of solution corresponding to potential (1.20) during interaction; same parameters as Fig. 1a

[Figure: surface plot of u over the (x, y) plane]

Fig. 1c. Snapshot of solution corresponding to potential (1.20) after interaction; same parameters as Fig. 1a

Two double poles. Assume that $\mu$ has the following structure:

$$\mu = 1 + \hat\mu(k) + \frac{\Phi_1}{k - k_1} + \frac{\Psi_2}{(k - k_1)^2} + \frac{\Phi_{\bar 1}}{k - \bar k_1} + \frac{\Psi_{\bar 2}}{(k - \bar k_1)^2}, \tag{4.7}$$

where $\hat\mu(k)$ is regular at both $k_1$, $\bar k_1$ and tends to 0 as $k \to \infty$, and Dim Ker $G_{k_1} = 1$. Then, since we have assumed that $Q_1$ cannot vanish,

i) $Q_1 = 3$.


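ii) At $k = k_\alpha$, $\alpha = 1, \bar 1$, the Laurent coefficients satisfy the following system of linear algebraic equations:

$$\Phi_\alpha = -h_{1,\alpha}\Psi_{2.\alpha}, \tag{4.8.1}$$

$$\left[\frac{d\nu_\alpha}{dk} + (h_{1,\alpha} + 2\sigma_\alpha)\nu_\alpha + \frac{1}{3}h_{3.\alpha,d}\Psi_{2.\alpha}\right] = 0, \tag{4.8.2}$$

where we define the notation as: $2.1 \equiv 2$, $2.\bar 1 \equiv \bar 2$, $3.1 \equiv 3$, $3.\bar 1 \equiv \bar 3$.

iii) If $k_{\bar 1} = \bar k_1$, $\beta_{\bar 1} = \bar\beta_1$, $\gamma_{\bar 1} = \bar\gamma_1 + 2i\bar\sigma_1$, $\sigma_{\bar 1} = \bar\sigma_1$, the function (4.7) with $\hat\mu(k) = 0$ solves (1.6) with the Schrödinger potential $u(x, y) = 2\partial_{xx}\log F$,

$$F = |h_{3,d}|^2 + \frac{9}{4b^2}\left\{\left|ih_1 + \frac{1}{2b}\right|^2 + \frac{1}{4b^2}\right\}\left\{\left|-i(h_1 + 2\sigma) + \frac{1}{2b}\right|^2 + \frac{1}{4b^2}\right\},$$

or in physical coordinates

$$\begin{aligned} F ={}& \left(z^3 - 12b^2y'^2z + 3(\sigma_R^2 - \sigma_I^2)z + 12b\sigma_I\sigma_R y' - 6\sigma_R y' + \hat\beta_R\right)^2 \\ &+ \left(6bz^2y' - 8b^3y'^3 - 6\sigma_I\sigma_R z + 6b(\sigma_R^2 - \sigma_I^2)y' + 6\sigma_I y' - \hat\beta_I\right)^2 \\ &+ \frac{9}{4b^2}\left\{\left(z - \sigma_I - \frac{1}{2b}\right)^2 + (2by' - \sigma_R)^2 + \frac{1}{4b^2}\right\}\cdot\left\{\left(z + \sigma_I + \frac{1}{2b}\right)^2 + (2by' + \sigma_R)^2 + \frac{1}{4b^2}\right\}, \end{aligned} \tag{4.9}$$

where the coordinates $z$, $x'$, $y'$ are the same as those in Eq. (4.6) except

$$x_0 = \frac{a}{b}(\gamma_I - \sigma_R) - (\gamma_R + \sigma_I), \qquad y_0 = \frac{\gamma_I - \sigma_R}{2b},$$

and $\hat\beta$ is a minor redefinition of the original free parameter $\beta$. This solution depends on 8 parameters.

Proof. By Result VI we have that $\Phi_\alpha = -h_{1,\alpha}\Psi_{2\alpha}$; $\eta_1 = \eta_{\bar 1} = -Q_1$. Hence what remains from Eq. (3.12.1) is

$$G_{k_1}\Psi_2 = 0, \quad G_{k_1}\mu_2 = 1 - \frac{Q_1}{2}, \qquad G_{\bar k_1}\Psi_{\bar 2} = 0, \quad G_{\bar k_1}\mu_{\bar 2} = 1 - \frac{\bar Q_1}{2}.$$

It follows that the adjoint solutions are $\rho_\alpha = u(x, y)\Psi_{\bar\alpha}$, $\alpha = 2, \bar 2$. The Fredholm alternative conditions yield

$$0 = Q_2\left(1 - \frac{Q_1}{2}\right) = \bar Q_2\left(1 - \frac{Q_{\bar 1}}{2}\right),$$

which are identically satisfied since $Q_2 = \bar Q_2 = 0$. The next equation is (3.12.2):

$$G_{k_1}\mu_3 = C + H_1(1 - Q_1/3), \qquad G_{\bar k_1}\mu_{\bar 3} = C_{\bar 1} + \bar H_1(1 - Q_1/3),$$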


[Figure: surface plot of u over the (x, y) plane]

Fig. 2. A three humped structure; cf. (6.11′), (4.9) with constants (a = 0, b = 1, γ = β = σ = 0)

where $C$, $C_{\bar 1}$ are constants. Using the above adjoint solutions, the Fredholm conditions are

$$0 = (1 - Q_1/3)\,\bar\eta_{\bar 1} = -Q_1(1 - Q_1/3) \Rightarrow Q_1 = 3,$$

since $Q_1 \neq 0$. Thus, as claimed, we obtain $Q_1 = Q_{\bar 1} = 3$. Therefore by Result VI, Eqs. (4.8) follow.

iii) The proof requires us to solve (4.8) for $\Phi_\alpha$, $\alpha = 1, \bar 1$, and then obtain the potential (see (4.9)) via the second logarithmic derivative. It is a matter of straightforward but tedious linear algebra. From the potential we see that $u = O(1/r^2)$ as $r^2 \equiv x^2 + y^2 \to \infty$. Hence the potential is real, nonsingular and rationally decaying. Also, as are the others in this class, it is in $L^2$ but not in $L^1$. This "slow" decay allows the index to be three. See Fig. 2 for a typical plot of $u(x, y)$. □

Simple + triple pole. Assume that $\mu$ has the following structure:

$$\mu = 1 + \hat\mu(k) + \frac{\Phi_1}{k - k_1} + \frac{\Psi_2}{(k - k_1)^2} + \frac{\Psi_3}{(k - k_1)^3} + \frac{\Phi_{\bar 1}}{k - \bar k_1}, \tag{4.10}$$

where $\hat\mu(k)$ is regular at both $k_1$, $\bar k_1$ and tends to 0 as $k \to \infty$, Dim Ker $G_{k_1} = 1$ and $\Psi_3 \neq 0$. Then

i) $Q_1 = 3$.

ii) The following system of linear algebraic equations is obtained:


a) at $k = k_1$, from Eq. (3.13),

$$\{\Psi_2 + h_1\Psi_3\}_{k=k_1} = 0, \tag{4.11.1}$$
$$\{\Phi + h_{2,t}\Psi_3\}_{k=k_1} = 0, \tag{4.11.2}$$
$$\{\nu + h_{3,t}\Psi_3\}_{k=k_1} = 0, \tag{4.11.3}$$

b) at $k = \bar k_1$, from Eq. (3.3.3),

$$\left\{\frac{1}{2}\frac{d^2\nu}{dk^2} + h_1\frac{d\nu}{dk} + h_2\nu + h_3\Phi_{\bar 1}\right\}_{k=\bar k_1} = 0.$$

iii) If $\hat\mu(k) = 0$, $k_1 \equiv a + ib$, $\delta_s = -\bar\delta_t \equiv \delta_R + i\delta_I$, $\epsilon_s = -\bar\epsilon_t \equiv \epsilon_R + i\epsilon_I$, the corresponding reflectionless potential is $u(x, y) = 2\partial_{xx}\log F$, where

$$\begin{aligned} F ={}& \left[z^3 - 12b^2y'^2z + 3z\delta_R + 6by'(\delta_I - 2y') + \epsilon_I\right]^2 \\ &+ \left[8b^3y'^3 - 6bz^2y' + 3z(\delta_I - 2y') - 6by'\delta_R + \epsilon_R\right]^2 \\ &+ \frac{9}{4b^2}\left[\left(z^2 - 4b^2y'^2 + \delta_R - \frac{z}{b} + \frac{1}{2b^2}\right)^2 + (4y'bz - \delta_I)^2 + \frac{1}{b^2}\left(z - \frac{1}{b}\right)^2 + 4y'^2 + \frac{1}{4b^4}\right]. \end{aligned} \tag{4.12}$$

Proof. The proof goes along the same lines as the examples already considered: one needs to solve for $\Phi_\alpha$, $\alpha = 1, \bar 1$, and then obtain the potential via the second logarithmic derivative of the function $F$ given in Eq. (4.12). Again this is a matter of substantial but straightforward linear algebra. □

See Fig. 3 for a typical plot of $u(x, y)$.

Other reflectionless potentials. More complicated potentials/solutions can be constructed by considering pairs of poles $k_l$, $\bar k_l$ of higher orders and deriving the corresponding linear algebraic equations for the residues. This procedure allows one, in principle, to find infinitely many real, nonsingular and rationally decaying potentials of the nonstationary Schrödinger operator associated with its discrete spectrum. Below we consider some of the simplest cases.

Reflectionless potentials corresponding to superposition of simple and double poles with Q = 2. Let $C$, $D$, $E$, $G$ be $N \times N$ matrices defined by

$$C_{lj} = h_{2l}\delta_{lj} + h_{1l}\left(\frac{1}{k_l - k_j} - \frac{1}{h_{1l}(k_l - k_j)^2}\right)(1 - \delta_{lj}), \tag{4.13.1}$$

$$D_{lj} = h_{1l}\left(\frac{1}{k_l - \bar k_j} - \frac{1}{(k_l - \bar k_j)^2}\left(\frac{1}{h_{1l}} - \frac{1}{\bar h_{1j}}\right) - \frac{2}{\bar h_{1j}h_{1l}(k_l - \bar k_j)^3}\right), \tag{4.13.2}$$

$$E_{lj} = \frac{-\bar h_{1l}}{\bar k_l - k_j}, \tag{4.13.3}$$

$$G_{lj} = \bar h_{2l}\delta_{lj} - \bar h_{1l}\left(\frac{1}{\bar k_l - \bar k_j} + \frac{1}{\bar h_{1,j}(\bar k_l - \bar k_j)^2}\right)(1 - \delta_{lj}), \tag{4.13.4}$$


[Figure: surface plot of u over the (x, y) plane]

Fig. 3. A multipole order 3 solution corresponding to a simple plus triple pole configuration with parameters (a = 0, b = 1, γ = δ = ε = 0); cf. (6.14), (4.12)

where the functions $h_{1l}$, $h_{2l}$, $l = 1, \ldots N$ (with relevant constants $\gamma_{s,l}$, $\delta_{s,l}$) are defined in Eq. (3.2). Let $F$ stand for the determinant of the $2N \times 2N$ matrix ($l = 1, \ldots N$, $j = 1, \ldots N$)

$$B = \begin{pmatrix} C & D \\ E & G \end{pmatrix}.$$

Then

$$u(x, y) = 2\frac{\partial^2}{\partial x^2}\log F$$

is a real nonsingular reflectionless potential for the nonstationary Schrödinger operator. The wave function for this potential is given by

$$\mu = 1 + \sum_{j=1}^{N}\left(\frac{\Psi_{2,\bar j}}{(k - \bar k_j)^2} + \frac{\Phi_{\bar j}}{k - \bar k_j} + \frac{\Phi_j}{k - k_j}\right),$$

where the Laurent coefficients satisfy a linear algebraic system given below.

Proof. Assume that the eigenfunction has the above given structure and require that $Q_l = 2$, $l = 1 \ldots N$. Then according to the results (3.9.1–2), at any of the poles $\bar k_l$, $l = 1 \ldots N$, we have

$$\{\Phi_{\bar l} + h_{1\bar l}\Psi_{2,\bar l}\}_{k=\bar k_l} = 0, \qquad \{\nu + h_{2\bar l,d}\Psi_{2,\bar l}\}_{k=\bar k_l} = 0.$$

Using the first of the above equations to solve for $\Psi_{2,\bar l}$ in terms of $\Phi_{\bar l}$ and substituting this equation into the second (recalling that $\nu$ stands for the nonsingular part of $\mu$), we


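have

$$-h_{2\bar l,d}\Phi_{\bar l} + h_{1\bar l}\sum_{j=1,\,j\neq l}^{N}\left(\frac{1}{\bar k_l - \bar k_j} - \frac{1}{h_{1\bar j}(\bar k_l - \bar k_j)^2}\right)\Phi_{\bar j} + h_{1\bar l}\sum_{j=1}^{N}\frac{\Phi_j}{\bar k_l - k_j} = -h_{1\bar l};$$

furthermore, from (3.3.2), at any of the poles $k_l$, $l = 1 \ldots N$, the following equations apply:

$$\left\{\frac{d\nu}{dk} + h_{1l}\nu + h_{2l}\Phi_l\right\}_{k=k_l} = 0,$$

and as before this yields

$$h_{2l}\Phi_l + h_{1l}\sum_{j=1}^{N}\left(\frac{1}{k_l - \bar k_j} - \frac{1}{(k_l - \bar k_j)^2}\left(\frac{1}{h_{1l}} + \frac{1}{h_{1\bar j}}\right) + \frac{2}{h_{1\bar j}h_{1l}(k_l - \bar k_j)^3}\right)\Phi_{\bar j} + h_{1l}\sum_{j=1,\,j\neq l}^{N}\left(\frac{1}{k_l - k_j} - \frac{1}{h_{1l}(k_l - k_j)^2}\right)\Phi_j = -h_{1l}.$$

In the above formulae, for $l = 1 \ldots N$ the functions $h_{1\bar l}$, $h_{2\bar l,d}$ (with relevant constants $\gamma_{d,\bar l}$, $\delta_{d,\bar l}$) and $h_{1l}$, $h_{2l}$ (with relevant constants $\delta_{s,l}$, $\gamma_{s,l}$) are defined in (3.2), (3.9). Taking the parameters to satisfy $\gamma_{d,\bar l} = \bar\gamma_{s,l}$; $\delta_{d,\bar l} = -\bar\delta_{s,l}$ implies that $h_{2\bar l,d} = -\bar h_{2l}$, $h_{1\bar l} = -\bar h_{1l}$; then the above equations are:

$$h_{2l}\Phi_l + h_{1l}\sum_{j=1,\,j\neq l}^{N}\left(\frac{\Phi_j}{k_l - k_j} - \frac{\Phi_j}{h_{1l}(k_l - k_j)^2}\right) + h_{1l}\sum_{j=1}^{N}\left(\frac{1}{k_l - \bar k_j} - \frac{1}{(k_l - \bar k_j)^2}\left(\frac{1}{h_{1l}} - \frac{1}{\bar h_{1j}}\right) - \frac{2}{\bar h_{1j}h_{1l}(k_l - \bar k_j)^3}\right)\Phi_{\bar j} = -h_{1l},$$

$$\bar h_{2l}\Phi_{\bar l} - \bar h_{1l}\sum_{j=1,\,j\neq l}^{N}\left(\frac{1}{\bar k_l - \bar k_j} + \frac{1}{\bar h_{1,j}(\bar k_l - \bar k_j)^2}\right)\Phi_{\bar j} - \bar h_{1l}\sum_{j=1}^{N}\frac{\Phi_j}{\bar k_l - k_j} = \bar h_{1l}.$$

For $l = 1, \ldots N$, this system is equivalent to

$$\begin{pmatrix} C & D \\ E & G \end{pmatrix}\begin{pmatrix} \Phi_l \\ \Phi_{\bar l} \end{pmatrix} = \begin{pmatrix} -h_{1l} \\ \bar h_{1l} \end{pmatrix},$$

where $C$, $D$, $E$, $G$ are defined in (4.13). Using $u(x, y) = -2i\sum_{j=1}^{N}\frac{\partial}{\partial x}(\Phi_j + \Phi_{\bar j})$ the result follows. □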


Reflectionless potentials corresponding to superposition of simple and triple poles with Q = 3. Let C, D, E, G be N × N matrices defined by ! h1l 1 + 2 (1 − δlj ), h2l − klj klj   h¯ 1,j h1l 1 2h 3 h2l − 1l +  + 2 − kl j¯ Dlj = h2l − kl j¯ kl j¯ kl j¯ kl2j¯ h¯ 2j kl j¯   3h1l 6  1  + 2 , h2l − − kl j¯ kl j¯ h¯ 2j kl2j¯ 1 Clj = h3l δlj + klj

Elj = −

h¯ 2l , klj ¯

Glj = h¯ 3l δlj − h¯ 2l

(4.14.1)

(4.14.2)

(4.14.3) h¯ 1,j 1 1 + + 2 k¯lj h¯ 2j k¯lj h¯ 2l k¯lj3

! (1 − δlj ).

(4.14.4)

In the above formulae for l = 1 . . . N, the functions h1l , h2l , h3l (with relevant constants γs,l , δs,l , s,l ) are defined in (3.2) and we set klj ≡ kl − kj , kl j¯ ≡ kl − kj¯ , k¯lj ≡ k¯l − k¯j . Let F stand for the determinant of the 2N × 2N matrix CD B= . EG Then u(x, y) = 2

∂2 log F ∂x 2

is a real nonsingular reflectionless potential for the nonstationary Schrödinger operator. The wave function for this potential is given by µ=1+

N X j =1

! 92,j¯ 8j¯ 93,j¯ 8j + + + , (k − k¯j )3 (k − k¯j )2 (k − k¯j ) (k − kj )

where the Laurent coefficients satisfy a linear algebraic system given below. Proof. Assume that the eigenfunction has the above given structure where we require that Ql = 3, l = 1 . . . N. Then according with the result (3.3.3) at any of the poles kl , l = 1 . . . N, the following equations apply: {

dν 1 d 2ν + h1l + h2l ν + h3l 8l }k=kl = 0, 2 2 dk dk

while at k¯l , l = 1 . . . N (3.13) yields 92,l¯ + h1l¯93,l¯ = 0, 8l¯ + h2l,t ¯ 93,l¯ = 0, {ν + h3l,t ¯ 93,l¯}k=k¯l = 0.


We take the constants to satisfy ¯s,l = −t,l¯ ; δ¯s,l = −δt,l¯ ; after tedious algebra of reducing to equations with 2N unknowns 8j , 8j¯ we obtain ! h1l 1 + 2 8j h2l − −h2l = h3l 8l + klj klj j =1,j 6=l    ¯ h1,j h1l 1 h2l − 2h1l + 3  + h2l − + 2 − kl j¯ kl j¯ kl j¯ kl2j¯ h¯ 2j kl j¯    8 3h1l 6 1  j¯  + 2  h2l − − − kl j¯ kl j¯ kl j¯ h¯ 2j kl2j¯ N X

and

N X h¯ ¯ h¯ 2l¯ = − 2l 8j − h¯ 2l¯ klj ¯

j =1,j 6=l

1 klj

h¯ 1j 1 1 + + 2 k¯lj h¯ 2j k¯lj h¯ 2l k¯lj3

! 8j¯ + h¯ 3l¯8l¯.

For l = 1, . . . N this system is equivalent to −h2l CD 8l = ¯ , 8l¯ EG h2l where C, D, E, G is defined in (4.14). Using u(x, y) = −2i follows. u t

P

x (8j

+ 8j¯ ) the result

Reflectionless potentials corresponding to superposition of double poles with Q = 3 . Let C, D, E, G be N × N matrices defined by ! 3h1l gl 1 2 − + (4.15.1) gl − (1 − δlj ), Clj = −h3l,d δlj + klj h1j klj klj h1j klj2   3h1l  gl 1 2  − + Dlj = gl − , (4.15.2) kl j¯ h1j¯ kl j¯ kl j¯ h1j¯ kl2j¯   3h1l¯ g ¯ 1 2 l g¯ l − , + − Elj = 2 klj h1j klj klj h1j klj ¯ ¯ ¯ ¯ ! 3h1l¯ g ¯ 1 2 l + − (4.15.4) g¯ l − (1 − δlj ). Glj = −h¯ 3l,d δlj + h ¯ k¯lj h ¯ k¯ 2 k¯lj k¯lj 1j

1j lj

In the above formulae for l = 1 . . . N the functions h1l,d , h3l,d (with relevant constants γd,l , βl , and σl ) are defined in (3.9.2-3) and klj ≡ kl − kj , k¯lj ≡ k¯l − k¯j , kl j¯ ≡ kl − kj¯ . Besides gl ≡ h1l,d + 2σl . Let F stand for the determinant of the 2N × 2N matrix, CD B= . EG


Then

∂2 log F ∂x 2 is a real nonsingular reflectionless potential for the nonstationary Schrödinger operator. The wave function for this potential is given by ! N X 8j¯ 92,j¯ 8j 92,j + + , + µ=1+ (k − kj )2 (k − kj ) (k − k¯j )2 (k − k¯j ) u(x, y) = 2

j =1

where the Laurent coefficients satisfy a linear algebraic system given below. Proof. Assume that the eigenfunction has the above structure and require that Ql = 3, l = 1, . . . N. Then Eqs. (3.9.1–3) imply

8α = −h1α 92.α ; 1 dνα + (h1α + 2σα )να + h3α,d 92.α = 0 dk 3

at both k = kl and k = k¯l . These equations yield ! X N N X 8j¯ 92,j¯ 8j 92,j 1 + + h3l,d 92l + gl + 3 (kl − kj )2 (kl − kj ) (kl − k¯j )2 (kl − k¯j ) j =1,j 6=l j =1 ! X N N X 292,j¯ 8j¯ 8j 292,j = −gl , + + + − (kl − kj )3 (kl − kj )2 (kl − k¯j )3 (kl − k¯j )2 j =1,j 6=l

N X j =1,j 6=l¯

j =1

N X

8j¯ 92,j¯ + + 2 ¯ (kl¯ − kj ) (kl¯ − k¯j ) j =1 j =1,j 6=l¯ ! 292,j¯ 8j¯ 8j 292,j + + + = −gl¯ (kl¯ − kj )3 (kl¯ − kj )2 (kl¯ − k¯j )3 (kl¯ − k¯j )2

1 h ¯ 9 ¯ + gl¯ 3 3l,d 2l −

N X

8j 92,j + 2 (kl¯ − kj ) (kl¯ − kj )

!

for l, l¯ = 1, . . . N. Note that gl¯ = −g¯ l if the constants are chosen to satisfy βl¯ = β¯l , γl¯ = γ¯l + 2i σ¯ l , σl¯ = σ¯ l .After elimination of variables we obtain the system ! N X gl 1 2 3h1l − + gl − 8j − h3l,d 8l + k h1j klj klj h1j klj2 j =1,j 6=l lj   N X gl 1 2  3h1l  + − + gl − 8j¯ = −3gl h1l , l = 1, . . . N 2 kl j¯ h1j¯ kl j¯ kl j¯ h k ¯ 1 j ¯ lj j¯=1,j¯6=l    N X 3h1l¯ g ¯ 1 2 l  g¯ l −  8j + − − h¯ 3l,d 8l¯ + 2 klj h1j klj klj h k ¯ ¯ ¯ 1j ¯ lj j =1,j 6=l¯ ! ! 3h ¯ g¯ l 1 2 + 1l g¯ l − + − 8j¯ = −3g¯ l h1l¯. h ¯ k¯lj h ¯ k¯ 2 k¯lj k¯lj 1j

1j lj

This implies the result in the standard way. u t


More general potentials. The cases detailed above are some of the new families that follow from our results. While we shall not elaborate any further on this, we remark that one could also consider a mixture of poles of different types, as well as adding in the contribution of the continuous spectrum by taking $\hat\mu(k) \neq 0$ and following the procedure discussed in refs. [1,2].

5. Coalescence of Poles

In this section we briefly describe how some of our results can be obtained by appropriately coalescing pairs of conjugate and simple poles of the eigenfunctions of the Schrödinger operator. We show this for $N = 2$ in formulae (1.15–16). Let us assume that the discrete spectrum consists of two pairs of simple poles, i.e. $\mu(k)$ is written:

$$\mu(k) = 1 + \hat\mu(k) + \sum_{\alpha=1}^{4}\frac{\phi_\alpha}{k - k_\alpha},$$

where $k_{j+2} = \bar k_j$, $j = 1, 2$, and all the indices $Q_\alpha = 1$. In the limiting process we take $k_1 \equiv a + ib = k_2 + \varepsilon$, $k_3 = \bar k_1$, $k_4 = \bar k_2$, $|\varepsilon| \ll 1$, assume the following expansion for both the constants and residues of (1.10, 1.12),

$$\phi_\alpha = \frac{\phi_\alpha^{(-1)}}{\varepsilon} + \phi_\alpha^{(0)} + \phi_\alpha^{(1)}\varepsilon + \ldots, \qquad \gamma_{\alpha,0} = \frac{\gamma_{\alpha,0}^{(-1)}}{\varepsilon} + \gamma_{\alpha,0}^{(0)} + \gamma_{\alpha,0}^{(1)}\varepsilon + \ldots, \qquad \alpha = 1, \ldots 4,$$

and substitute these expansions into Eq. (1.15). At order $1/\varepsilon^2$ there are various possibilities. Reality requires $\gamma_{1,0}^{(-1)} = \gamma_{4,0}^{(-1)} = -\gamma_{2,0}^{(-1)} = -\gamma_{3,0}^{(-1)} = -i$. In order for $\mu$ to have a finite limit as $\varepsilon$ tends to zero, we are forced to take $\phi_1^{(-1)} + \phi_2^{(-1)} = \phi_3^{(-1)} + \phi_4^{(-1)} = 0$. Proceeding to subsequent orders is straightforward. It is convenient to define new variables: $\phi_1^{(0)} + \phi_2^{(0)} \equiv \Phi_1$, $\phi_3^{(0)} + \phi_4^{(0)} = \Phi_{\bar 1}$, $\phi_1^{(-1)} \equiv \Psi_2$. We find that $\phi_3^{(-1)} = \phi_4^{(-1)} = 0$, $\phi_1^{(-1)} = i\Phi_1/(f + \gamma)$, $\gamma \equiv \gamma_{1,0}^{(0)}$, $f \equiv f(k_1)$, $\phi_3^{(0)} = \phi_4^{(0)}$. Letting $\varepsilon \to 0$, the above equations imply that the eigenfunction $\mu$ has the following spectral structure:

$$\mu = 1 + \hat\mu(k) + \frac{\Psi_2}{(k - k_1)^2} + \frac{\Phi_1}{k - k_1} + \frac{\Phi_{\bar 1}}{k - \bar k_1}.$$

Besides this, one also obtains that the new functions $\Psi_2$, $\Phi_1$, $\Phi_{\bar 1}$ satisfy the system of Eqs. (3.9.1–2) and (3.3.2). Note next that the equality

$$1 = Q_\alpha \equiv \frac{1}{2\pi i}\iint u\,\phi_\alpha\,dx\,dy$$

with the above results implies that

$$Q(\Phi_1) = \frac{1}{2\pi i}\iint u\Phi_1\,dx\,dy = \frac{1}{2\pi i}\iint u\left(\phi_1^{(0)} + \phi_2^{(0)}\right)dx\,dy = 2$$

and

$$Q(\Psi_2) = \frac{1}{2\pi i}\iint u\Psi_2\,dx\,dy = \frac{1}{2\pi i}\iint u\phi_1^{(-1)}\,dx\,dy = 0.$$

Thus for this case we recover our results (3.8–9) and (3.3.2).
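The mechanism by which two nearby simple poles merge into a double pole can be seen in a scalar toy computation. In the sketch below the residues are ordinary complex numbers rather than functions of (x, y), and the order-one part of the residues is split equally between the two poles; both are simplifying assumptions made only for illustration, as are the function names and sample values.

```python
def two_simple_poles(k, k1, eps, Phi1, Psi2):
    """phi1/(k - k1) + phi2/(k - k2) with k2 = k1 - eps and residues
    phi1 = Psi2/eps + Phi1/2, phi2 = -Psi2/eps + Phi1/2 (illustrative split)."""
    k2 = k1 - eps
    phi1 = Psi2 / eps + Phi1 / 2
    phi2 = -Psi2 / eps + Phi1 / 2
    return phi1 / (k - k1) + phi2 / (k - k2)

def double_pole(k, k1, Phi1, Psi2):
    """Limiting structure Psi2/(k - k1)^2 + Phi1/(k - k1)."""
    return Psi2 / (k - k1) ** 2 + Phi1 / (k - k1)

k, k1 = 2.0 + 1.0j, 0.5 + 1.0j
Phi1, Psi2 = 1.0 - 2.0j, 0.3 + 0.7j
gap_coarse = abs(two_simple_poles(k, k1, 1e-2, Phi1, Psi2) - double_pole(k, k1, Phi1, Psi2))
gap_fine = abs(two_simple_poles(k, k1, 1e-5, Phi1, Psi2) - double_pole(k, k1, Phi1, Psi2))
print(gap_coarse, gap_fine)
```

The singular parts of the residues cancel in the sum, as required by $\phi_1^{(-1)} + \phi_2^{(-1)} = 0$, and their difference quotient produces the $(k - k_1)^{-2}$ term with coefficient $\phi_1^{(-1)} \equiv \Psi_2$.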


6. KPI Solutions

The above class of potentials is related to localized rationally decaying solutions of the KPI equation when the appropriate temporal dependence is inserted. Requiring $\mu(x, y, t, k)$ to solve also (1.4) implies a particular time evolution for the constants $\gamma$, $\beta$, ... entering in the potential. The procedure, though tedious, is standard and we will not dwell on the derivation. Here we present the solutions corresponding to potentials derived earlier and then we study their dynamics. Solutions to KPI are obtained when the constants of (3.2), (3.9), (3.13) "evolve" as $\dot k_\alpha(t) = 0$ and

$$\dot\gamma_\alpha(t) = 12k_\alpha^2, \quad \dot\delta_\alpha(t) = -24ik_\alpha, \quad \dot\beta_\alpha(t) = 12(1 + 6k_\alpha\sigma), \quad \dot\sigma(t) = 0, \quad \dot\epsilon_\alpha(t) = 24i, \tag{6.1}$$

$\alpha = 1 \ldots 2N$.
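The evolution law for $\gamma_\alpha$ in (6.1) already fixes the velocity of a lump. The sketch below assumes that the relations $x_0 = (\gamma_I a - \gamma_R b)/b$, $y_0 = \gamma_I/(2b)$ quoted with (4.6) also give the center of a simple lump (an assumption made here for illustration), and reads off the displacement of the center over one unit of time. The helper names and sample values are hypothetical.

```python
def lump_center(gamma, k):
    """Center (x0, y0) from gamma, using x0 = (gI*a - gR*b)/b and y0 = gI/(2b).
    Applying this relation to the simple lump is an assumption of this sketch."""
    a, b = k.real, k.imag
    return (gamma.imag * a - gamma.real * b) / b, gamma.imag / (2 * b)

def gamma_at(t, gamma0, k):
    """gamma(t) = gamma(0) + 12 k^2 t, the first law in (6.1)."""
    return gamma0 + 12 * k**2 * t

k = 0.5 + 1.0j          # k = a + ib (sample value)
gamma0 = 0.2 - 0.3j     # gamma(0) (sample value)
x_start, y_start = lump_center(gamma_at(0.0, gamma0, k), k)
x_end, y_end = lump_center(gamma_at(1.0, gamma0, k), k)
vx, vy = x_end - x_start, y_end - y_start
print(vx, vy)
```

For these sample values the displacement per unit time is $(12(a^2 + b^2), 12a)$, the same velocity that appears in the substitutions $x \to x - 12(a^2 + b^2)t$, $y \to y - 12at$ used throughout this section.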

Simple poles. The simplest case corresponds to (4.1) with $\hat\mu = 0$. Using (6.1), the KPI solution $u(x, y, t)$ is obtained by setting

$$x \to x - 12(a^2 + b^2)t, \qquad y \to y - 12at \tag{6.2}$$

in (4.3). Corresponding to $n = 2N$ simple poles, the resulting KPI solution, called the N lump solution, is obtained by setting $\gamma_\alpha \to \gamma_\alpha(t) = \gamma_\alpha(0) + 12k_\alpha^2 t$ in (1.16). It describes a configuration that asymptotically is made up of $N$ humps, all of them moving uniformly with distinct velocities $v_{jx} = 12(a_j^2 + b_j^2)$, $v_{jy} = 12a_j$ (here $k_j \equiv a_j + ib_j$). Upon interaction there is neither a phase shift nor a deflection, hence the asymptotic dynamics of these objects is trivial.

Simple + double poles. Again, the easiest case corresponds to (4.4) with $\hat\mu = 0$. The KPI solution follows from (4.6) and (6.1) by taking

$$x \to x - 12(a^2 + b^2)t, \qquad y \to y - 12at, \qquad \delta_R \to \delta_R - 24bt. \tag{6.3}$$

This gives (1.20). Study of this KP solution shows that for long times it has two maxima or humps (see Fig. 1), each of which moves with a distinct asymptotic velocity; the humps diverge from one another at a rate proportional to $|t|^{1/2}$ (this "anomalous scattering" was first discussed in [6] with a particular choice of some of the relevant parameters). In Appendix IV we show that as $t \to -\infty$ the two maxima (+ denotes the fast, − the slower hump) are located at

$$x_\pm \sim 12(a^2 + b^2)t \pm \sqrt{\frac{-24a^2t}{b}} + x_0 - \frac{1}{2b}, \qquad y_\pm \sim 12at \pm \sqrt{\frac{-6t}{b}} + y_0, \tag{6.4.1}$$

and as $t \to \infty$ the two maxima are located at

$$x_\pm = 12(a^2 + b^2)t \pm \sqrt{24bt} + O\!\left(\frac{1}{t^{1/2}}\right), \qquad y_\pm = 12at + y_0. \tag{6.4.2}$$


[Figure: trajectories of the two maxima in the plane]

Fig. 4. Trajectories of the maxima; see (1.20), (6.4.1)

Note that as t → −∞ the trajectories follow a hyperbola given by (Fig. 4) 2 (a 2 − b2 ) (x − 2ay) = 0. (6.5) (a 2 + b2 )y − ax + 2b The asymptotic motion of the humps can be viewed as a two particle dynamical system. To describe the dynamics of this motion it is convenient to consider a frame with the origin at (x0 , y0 ) which moves with velocities vx = 12(a 2 + b2 ), vy = 12a, with respect to the rest frame. With respect to this Galilean frame as t → −∞ one of the humps is located in the first quadrant while the other is the mirror image in the third quadrant. As t → ∞ both of them travel on the x axis, the first one moving left, the second right. The asymptotic amplitude is u(x± , y± ) = 16b2 for both maxima as |t| → ∞. We note that these “particles” are not distinguishable after scattering. Here we take the natural election that corresponds to the situation described above: i.e. say that the hump which drifts to the left as t → ∞ was the one located at the far right of the upper half quadrant at t → −∞. With this choice the angle the humps get deflected is given by: 1 . (6.6) = arc tan 2|a| By properly choosing the free parameter a we can obtain a solution that scatters with any desired angle. This is in contrast to the situation for standard lumps where this angle is zero (i.e. there is no phase shift). In this dynamical analogue the asymptotic motion of the humps can be thought of as being the result of an interaction between the humps. To determine the nature of this interaction define xi,j ≡ xi − xj , yi,j ≡ yi − yj , i, j = +, − r i ≡ (xi , yi ), r i,j ≡ (xi,j , yi,j ), ri,j ≡ |r i,j |

(6.7)


2 (note r+,− = (x+ − x− )2 + (y+ − y− )2 ). The above trajectories as t → −∞ of the humps are solutions of the following system: X r i,j d 2r i = f (ri,j ), (6.8) 2 dt ri,j j,j 6=i

where the attractive force between particles is f (r) ≡ −(

24a 2 2 (1 + 4a 2 )2 . ) b 4r 3

(6.9)

These humps attract each other but they do not form a bound state, since the attractive interaction is not strong enough to bind them together. They eventually split apart horizontally as they travel parallel to the y axis. Similar considerations apply as $t \to \infty$. We note that the equivalent force law only has a different constant. We remark finally that some solutions with nontrivial interaction properties of other 2+1 integrable equations have been derived using direct methods (cf. [12-14]).

Two double poles. We consider (4.7) with $\hat\mu = 0$. Solutions to KPI are obtained by taking

$$x \to x - 12(a^2 + b^2)t, \qquad y \to y - 12at, \qquad \hat\beta \to \hat\beta + 12(1 + 6ib\sigma)t, \tag{6.10}$$

$\sigma = \sigma_R + i\sigma_I$ constant, in (4.9). Hence we obtain $u(x, y, t) = 2\partial_{xx}\log F$,

$$\begin{aligned} F ={}& |h_{3,d}|^2 + \frac{9}{4b^2}\left\{\left|ih_1 + \frac{1}{2b}\right|^2 + \frac{1}{4b^2}\right\}\left\{\left|-i(h_1 + 2\sigma) + \frac{1}{2b}\right|^2 + \frac{1}{4b^2}\right\} \\ ={}& \left[z^3 - 12b^2y'^2z + 3(\sigma_R^2 - \sigma_I^2)z + 12b\sigma_I\sigma_R y' - 6\sigma_R y' + 12(1 - 6b\sigma_I)t + \hat\beta_R\right]^2 \\ &+ \left[6bz^2y' - 8b^3y'^3 - 6\sigma_I\sigma_R z + 6b(\sigma_R^2 - \sigma_I^2)y' + 6\sigma_I y' - 72tb\sigma_R - \hat\beta_I\right]^2 \\ &+ \frac{9}{4b^2}\left\{\left(z - \sigma_I - \frac{1}{2b}\right)^2 + (2by' - \sigma_R)^2 + \frac{1}{4b^2}\right\}\cdot\left\{\left(z + \sigma_I + \frac{1}{2b}\right)^2 + (2by' + \sigma_R)^2 + \frac{1}{4b^2}\right\}. \end{aligned} \tag{6.11}$$

This solution depends on 8 real parameters. Generically it describes a three humped structure with the position of the humps given asymptotically as $|t| \to \infty$ by

$$x_j \sim 12(a^2 + b^2)t + x_0 + (\zeta t)^{1/3}\left[\cos\left(\frac{\varphi + 2\pi j}{3}\right) - \frac{a}{b}\sin\left(\frac{\varphi + 2\pi j}{3}\right)\right] + o(1),$$

$$y_j \sim 12at + y_0 - \frac{(\zeta t)^{1/3}}{2b}\sin\left(\frac{\varphi + 2\pi j}{3}\right) + o(1), \qquad j = 0, 1, 2, \tag{6.12}$$

where

$$\zeta = 12\left\{(1 - 6b\sigma_I)^2 + 36b^2\sigma_R^2\right\}^{1/2}, \qquad \tan\varphi = \frac{6b\sigma_R}{1 - 6b\sigma_I}.$$

Thus the humps are located on the conic $(x' - 2ay')^2 + 4b^2y'^2 = (\zeta t)^{2/3}$


[Figure: trajectories of the three humps in the plane]

Fig. 5. Trajectories of the humps following (6.11′)–(6.12) in the rest frame with appropriate choices of constants

moving with uniform speed $v_x = 12(a^2 + b^2)$, $v_y = 12a$ and whose semiaxes are first decreasing and then increasing at a rate $|t|^{1/3}$. Moreover, in the moving frame the humps translate along three straight lines. After collision, formula (6.12) shows that the particles follow their original trajectories without any scattering or phase shift. Thus there is no scattering, although there is interaction that is manifest in the nontrivial trajectories (6.12) evolving as $O(t^{1/3})$ as $t \to \infty$. The expression for the solution simplifies if $\sigma = 0$, in which case

$$F = \left(z^3 - 12b^2y'^2z + 12t + \hat\beta_R\right)^2 + \left(8b^3y'^3 - 6bz^2y' + \hat\beta_I\right)^2 + \frac{9}{4b^2}\left\{\left(z - \frac{1}{2b}\right)^2 + 4b^2y'^2 + \frac{1}{4b^2}\right\}\left\{\left(z + \frac{1}{2b}\right)^2 + 4b^2y'^2 + \frac{1}{4b^2}\right\}. \tag{6.11′}$$

This corresponds to a three humped structure (see Fig. 2) with the position of the humps given asymptotically by (6.12), where $\zeta = 12$, $\varphi = 0$ ($a = 0$, $b = 1$). The trajectories of the above humps (see for example Fig. 5) are solutions of the following system:

$$\frac{d^2\mathbf{r}_i}{dt^2} = \sum_{j,\,j\neq i} f(r_{i,j})\,\frac{\mathbf{r}_{i,j}}{r_{i,j}}, \qquad i = 1, 2, 3.$$

A direct calculation similar to (6.8) shows that the “force between particles” is (for convenience, we take here a = 0) 1 1 9 f (r) ≡ − (3 + 2 )3 5 , 2 4b r which again corresponds to an attractive interaction.
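The asymptotic hump positions (6.12) and the conic on which they lie can be cross-checked numerically. The sketch below works in the comoving frame (offsets $x' = x - 12(a^2 + b^2)t - x_0$, $y' = y - 12at - y_0$), drops the $o(1)$ corrections, takes the real cube root of $\zeta t$ for negative times, and picks the branch of $\varphi$ with `atan2`; the last two points are conventions assumed here, since the text only specifies $\tan\varphi$. Function names and sample parameters are illustrative.

```python
import math

def real_cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def hump_offsets(t, a, b, sigma):
    """Comoving offsets (x', y') of the three humps from (6.12), o(1) terms dropped."""
    zeta = 12 * math.sqrt((1 - 6 * b * sigma.imag) ** 2 + 36 * b**2 * sigma.real**2)
    phi = math.atan2(6 * b * sigma.real, 1 - 6 * b * sigma.imag)  # branch choice: assumption
    rho = real_cbrt(zeta * t)
    points = []
    for j in range(3):
        theta = (phi + 2 * math.pi * j) / 3
        xp = rho * (math.cos(theta) - (a / b) * math.sin(theta))
        yp = -rho * math.sin(theta) / (2 * b)
        points.append((xp, yp))
    return zeta, points

def conic_residual(t, a, b, sigma):
    """Largest deviation of the three humps from (x' - 2a y')^2 + 4 b^2 y'^2 = (zeta t)^(2/3)."""
    zeta, points = hump_offsets(t, a, b, sigma)
    target = real_cbrt(zeta * t) ** 2
    return max(abs((xp - 2 * a * yp) ** 2 + 4 * b**2 * yp**2 - target) for xp, yp in points)

residual_before = conic_residual(-50.0, 0.4, 1.2, 0.1 + 0.05j)
residual_after = conic_residual(50.0, 0.4, 1.2, 0.1 + 0.05j)
print(residual_before, residual_after)
```

The residual vanishes to rounding error because $x' - 2ay' = (\zeta t)^{1/3}\cos\theta_j$ and $2by' = -(\zeta t)^{1/3}\sin\theta_j$, so the conic is just $\cos^2 + \sin^2 = 1$ in these variables.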

Discrete Spectrum of Nonstationary Schrödinger and Kadomtsev–Petviashvili Equation

Another subcase of interest is that given by σ =

i 6b ,

35

which corresponds to

$$F = \left(z^3 - 12b^2 y'^2 z - \frac{z}{12b^2} + \hat\beta_R\right)^2 + \left(8b^3 y'^3 - 6b z^2 y' - 5\,\frac{y'}{6b} + \hat\beta_I\right)^2 + \frac{9}{4b^2}\left\{\left(z + \frac{2}{3b}\right)^2 + 4b^2 y'^2 + \frac{1}{4b^2}\right\}\left\{\left(z - \frac{2}{3b}\right)^2 + 4b^2 y'^2 + \frac{1}{4b^2}\right\}. \tag{6.13}$$

We note that the temporal dependence drops out and the analysis above for the hump trajectories ceases to be valid. Equation (6.13) corresponds to a three humped KP solution that is stationary (cf. also ref. [7]), which moves with uniform velocity $v_x = 12(a^2 + b^2) > 0$, $v_y = 12a$ and hence can be regarded as a solitary wave. Thus this solution, unlike the others considered, which depend on all the variables $z$, $y'$, $t$, also solves the Boussinesq equation. Solutions that behave asymptotically as a nonlinear superposition of these multipeaked solitary waves are obtained in a straightforward way from the analysis in Sect. 4. Namely, consider the function

$$\mu = 1 + \sum_{j=1}^{N}\left(\frac{\Psi_{2,j}}{(k - k_j)^2} + \frac{\Phi_j}{(k - k_j)} + \frac{\Psi_{2,\bar j}}{(k - \bar k_j)^2} + \frac{\Phi_{\bar j}}{(k - \bar k_j)}\right),$$

with $Q_j \equiv Q(k_j) = 3$ and the constants $k_j$, $\beta_j$, $\gamma_j$, $\sigma_j$ satisfying

$$k_{\bar j} = \bar k_j, \quad \beta_{\bar j} = \bar\beta_j, \quad \gamma_{\bar j} = \bar\gamma_j + 2i\bar\sigma_j, \quad \sigma_{\bar j} = \bar\sigma_j = \frac{i}{6b_j}.$$

All the residues satisfy

1 dνα ¯ . . . N¯ . + (h1,α + 2σα )να + h3.α,d 92.α = 0, α = 1, . . . N, 1, dk 3

As $|t| \to \infty$ the ensuing solution behaves as a superposition of these multipeaked solitary waves:

$$u \sim 2\sum_{j=1}^{N} \partial^2(\log F_j)/\partial x^2,$$

where $F_j$ is given by (6.13) corresponding to parameters $k_j$, $\beta_j$, $\gamma_j$, $\sigma_j = \frac{i}{6b_j}$. The individual humps suffer no phase shift due to interaction.

Solutions coming from simple + triple poles . Consider (4.10) with µˆ = 0. Using (6.1) results in the solution to KPI given by formula (4.12) with x → x − 12(a 2 + b2 )t, y → y − 12at, δR → δR − 24bt, I → I − 24t.


Hence

i2 h F = z3 − 12b2 y 02 z + 3z(δR − 24bt) + 6by 0 (δI − 2y 0 ) + I − 24t h i2 + 8b3 y 03 − 6bz02 y 0 + 3z(δI − 2y 0 ) − 6by 0 (δR − 24bt) + R 1 9 z + 2 (z2 − 4b2 y 02 + δR − 24bt − + 2 )2 + (4y 0 bz − δI )2 4b b 2b 1 1 2 1 02 + 2 (z − ) +4y + 4 . (6.14) b b 4b

Detailed study of this multipole order-3 configuration for the KPI equation shows, in a similar way to earlier cases such as (6.4–6.5), that it is composed of three mutually interacting humps (see e.g. Fig. 3), which move with distinct asymptotic velocities. As $t \to \pm\infty$ the three maxima are located at (assuming $b > 0$) $(x'_+, y'_+)$, $(x'_0, y'_0)$, $(x'_-, y'_-)$, where $x'_0 = 12(a^2 + b^2)t + x_0$, $y'_0 = 12at + y_0$,

$$x'_\pm \sim x'_0 - \frac{4}{3b} \pm \sqrt{\frac{-72a^2 t}{b}}, \qquad y'_\pm \sim y'_0 \pm \sqrt{\frac{-18t}{b}}, \qquad t \to -\infty,$$

and

$$x'_\pm \sim x'_0 \pm \sqrt{72bt} + \frac{1}{6b}, \qquad y'_\pm \sim y'_0, \qquad t \to +\infty. \tag{6.15}$$

From a dynamical point of view it is convenient to consider a frame with the origin at $(x_0, y_0)$ which moves with velocities $v_x = 12(a^2 + b^2)$, $v_y = 12a$ with respect to a rest frame. In this comoving frame one of the humps remains at rest at the origin $(0, 0)$. The asymptotic amplitude of the hump at rest is given by $u = \frac{720}{169}b^2$. The two remaining humps follow a nonuniform motion and undergo scattering. As $t \to -\infty$ one of the humps is located in the first quadrant, while the other is the mirror image. As $t \to \infty$ both of them travel on the $x$ axis, the first one moving left, the second right. The asymptotic amplitude is $u(x_\pm, y_\pm) = 16b^2$ for both maxima as $|t| \to \infty$. As before, these two humps are not distinguishable after scattering. The angle by which the humps get deflected due to interaction is given again by $\arctan\frac{1}{2|a|}$. Thus, properly choosing the free parameter $a$, we can obtain a solution that scatters off in any desired angle. Finally, the above trajectories of the humps are solutions of the following system:

$$\frac{d^2 \mathbf{r}_i}{dt^2} = \sum_{j,\, j \neq i} f(r_{i,j})\, \frac{\mathbf{r}_{i,j}}{r_{i,j}}, \qquad i = +, 0, -,$$

where, in this case,

$$f(r) \equiv -\frac{2}{9}\,\frac{A}{r^3}, \quad \text{with } A = (18/b)^2,\ t \to -\infty, \ \text{ and } A = (72b)^2,\ t \to +\infty. \tag{6.16}$$

Therefore the total interaction can be considered as sum of two particle interactions with force between particles given by (6.16). Note that the hump at rest is subject to forces from the other two humps. The absence of motion of this hump can be traced to the fact that the above two forces are equal in magnitude but opposite in direction.
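The balance described above can be checked numerically at leading order. The following sketch assumes the $t \to +\infty$ positions of (6.15) in the comoving frame with the constant offset $1/(6b)$ dropped, so the humps sit at $x = \pm\sqrt{72bt}$ and $x = 0$ on the $x$ axis, and it assumes $\mathbf{r}_{i,j} = \mathbf{r}_i - \mathbf{r}_j$. The function names are hypothetical.

```python
import math

def pair_force(r, A):
    """Radial pair force f(r) = -(2/9) * A / r**3 of (6.16); negative values attract."""
    return -(2.0 / 9.0) * A / r**3

def second_derivative(fn, t, h=1e-2):
    """Central finite-difference estimate of fn''(t)."""
    return (fn(t + h) - 2.0 * fn(t) + fn(t - h)) / h**2

def check_outer_hump(b=1.0, t=50.0):
    """Acceleration of the hump at x = +sqrt(72 b t) versus the summed pair forces."""
    A = (72.0 * b) ** 2

    def x_plus(s):
        return math.sqrt(72.0 * b * s)

    r = x_plus(t)
    accel = second_derivative(x_plus, t)
    # The outer hump is pulled by the hump at rest (distance r)
    # and by its mirror image (distance 2r), both along -x.
    net = pair_force(r, A) + pair_force(2.0 * r, A)
    return accel, net

def net_force_on_rest_hump(b=1.0, t=50.0):
    """Sum of the two pair forces acting on the hump at the origin."""
    A = (72.0 * b) ** 2
    r = math.sqrt(72.0 * b * t)
    # Unit vectors from the two outer humps point in opposite directions.
    return pair_force(r, A) * (-1.0) + pair_force(r, A) * (+1.0)

accel, net = check_outer_hump()
print(accel, net, net_force_on_rest_hump())
```

With these assumptions the two pair forces on an outer hump add up to $-A/(4r^3)$, which is the acceleration of $x = \sqrt{72bt}$ when $A = (72b)^2$, while the forces on the central hump cancel.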


KPI solutions corresponding to superposition of simple and double poles with Q = 2. Let C, D, E, G be the N × N matrices defined in (4.13), and let F stand for the determinant of the 2N × 2N matrix, CD B= . EG Then if the constants γl (t), δl (t) evolve as required by (6.1), u(x, y, t) = 2

∂2 log F (x, y, γl (t), δl (t)) ∂x 2

is a real nonsingular KPI solution. KPI solutions corresponding to superposition of simple and triple poles with Q = 3. Let C, D, E, G be N ×N the matrices defined in (4.14) where the constants γl (t), δl (t), l (t) evolve as required by (6.1) and let F stand for the determinant of the 2N × 2N matrix, CD B= . EG Then

∂2 log F (x, y, γl (t), δl (t), l (t)) ∂x 2 is a real nonsingular KPI solution. u(x, y, t) = 2

KPI solutions corresponding to superposition of double poles with Q = 2. Let $C$, $D$, $E$, $G$ be the $N \times N$ matrices defined in (4.15), where the constants $\gamma_l(t)$, $\beta_l(t)$, $\sigma_l(t)$ evolve as required by (6.1), and let $F$ stand for the determinant of the $2N \times 2N$ matrix

$$B = \begin{pmatrix} C & D \\ E & G \end{pmatrix}.$$

Then

$$u(x, y, t) = 2\frac{\partial^2}{\partial x^2}\log F(x, y, \gamma_l(t), \beta_l(t), \sigma_l(t))$$

is a real nonsingular KPI solution. Assume furthermore that the constants are chosen satisfying

$$\beta_{\bar l} = \bar\beta_l, \quad \gamma_{\bar l} = \bar\gamma_l + 2i\bar\sigma_l, \quad \sigma_{\bar l} = \bar\sigma_l = \frac{i}{6b_l}.$$

Then as $|t| \to \infty$ the ensuing solution behaves as a superposition of triply peaked multilump waves:

$$u \sim 2\sum_{j=1}^{N} \partial^2(\log F_j)/\partial x^2,$$

where $F_j$ is given by (6.13) corresponding to parameters $k_j$, $\beta_j$, $\gamma_j$, $\sigma_j = \frac{i}{6b_j}$. In each of these subgroups $j = 1, \ldots, N$ the triply peaked lump is a stationary solution of KPI. The individual solitary waves suffer no phase shift due to interaction.

Acknowledgements. This work was partially sponsored by the Air Force Office of Scientific Research, Air Force Materials Command, USAF, under grant number F49620-97-1-0017. This work was also partially supported by NSF grant DMS-970350 and by CICYT 0028/95 and Junta de Castilla-León JADZ in Spain.


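Appendix I

For convenience in this section we define $\Psi_{1,\alpha} \equiv \Phi_\alpha$, $\Psi_{0,\alpha} \equiv \nu_\alpha$. Let the system of functions $\{\Psi_{r,\alpha}\}$, $\alpha = 1..2N$, $r = 0..M$, $M \geq 2$ satisfy

$$\{G\Psi_{M,\alpha}\}_{k=k_\alpha} = \left\{G\Psi_{M-1,\alpha} + \frac{\partial G}{\partial k}\Psi_{M,\alpha}\right\}_{k=k_\alpha} = \cdots = \left\{G\Psi_{1,\alpha} + \frac{\partial G}{\partial k}\Psi_{2,\alpha} + \cdots + \frac{1}{(M-1)!}\frac{\partial^{M-1} G}{\partial k^{M-1}}\Psi_{M,\alpha}\right\}_{k=k_\alpha} = 0,$$

$$\left\{G\Psi_{0,\alpha} + \frac{\partial G}{\partial k}\Psi_{1,\alpha} + \frac{1}{2}\frac{\partial^2 G}{\partial k^2}\Psi_{2,\alpha} + \cdots + \frac{1}{M!}\frac{\partial^M G}{\partial k^M}\Psi_{M,\alpha}\right\}_{k=k_\alpha} = 1. \tag{2.8}$$

Then the orthogonality relationships below hold:

$$\langle \Psi_{s,\beta}, \Psi_{r,\alpha} \rangle = 0, \quad \forall r, s = 0 \ldots M, \ \forall \beta \neq \bar\alpha, \tag{A1.1}$$

$$\langle \Psi_{s,\bar\alpha}, \Psi_{r,\alpha} \rangle = 0, \quad \forall r, s = 0 \ldots M \ \text{satisfying } r + s > M, \tag{A1.2}$$

$$\langle \Psi_{s+1,\bar\alpha}, \Psi_{r,\alpha} \rangle = \langle \Psi_{s,\bar\alpha}, \Psi_{r+1,\alpha} \rangle, \quad \forall r, s = 0 \ldots M, \tag{A1.3}$$

and we define $\Psi_{M+1,\beta} \equiv 0$. In particular this implies

$$\langle \Psi_{r,\bar\alpha}, \Psi_{1,\alpha} \rangle = \langle \Psi_{1,\bar\alpha}, \Psi_{r,\alpha} \rangle, \quad \forall r = 0 \ldots M. \tag{A1.4}$$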

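The above means the following: consider a net of points $(s, r)$, $r, s = 0 \ldots M$. Then $\langle \Psi_{s,\beta}, \Psi_{r,\alpha} \rangle = 0$, $\forall \beta, \alpha$, if the point $(s, r)$ is above the diagonal $r + s = M$, and $\langle \Psi_{s,\beta}, \Psi_{r,\alpha} \rangle = 0$ if $\beta \neq \bar\alpha$ and the point $(s, r)$ is on or below the diagonal. Finally, all values $\langle \Psi_{s,\bar\alpha}, \Psi_{r,\alpha} \rangle$ are equal for points $(s, r)$ that are on a common line $r + s = \text{constant}$. Then all scalar products get determined in terms of the basic ones $\langle \Psi_{1,\bar\alpha}, \Psi_{r,\alpha} \rangle$, $r = 1 \ldots M - 1$, which are given by the charges $Q_{r,\alpha}$.

Proof. In differential form the above equations yield first

$$(i\partial_y + \partial_{xx} + u + 2ik_\alpha \partial_x)\Psi_{M,\alpha} = 0,$$

$$(i\partial_y + \partial_{xx} + u + 2ik_\alpha \partial_x)\Psi_{r,\alpha} + 2i\partial_x \Psi_{r+1,\alpha} = 0, \quad r = 0 \ldots M - 1,$$

or simply, if we define $\Psi_{M+1,\beta} \equiv 0$,

$$(i\partial_y + \partial_{xx} + u + 2ik_\alpha \partial_x)\Psi_{r,\alpha} + 2i\partial_x \Psi_{r+1,\alpha} = 0, \quad r = 0 \ldots M.$$

Multiplication of these equations by $\bar\Psi_{s,\beta}$ yields through integration by parts

$$\langle \Psi_{s+1,\beta}, \Psi_{r,\alpha} \rangle + k_{\alpha,\beta}\langle \Psi_{s,\beta}, \Psi_{r,\alpha} \rangle = \langle \Psi_{s,\beta}, \Psi_{r+1,\alpha} \rangle, \quad s, r = 0 \ldots M \quad (k_{\alpha,\beta} \equiv k_\alpha - \bar k_\beta).$$

Let first $\beta = \bar\alpha$ and find ($k_{\alpha,\bar\alpha} = 0$) $\langle \Psi_{s+1,\bar\alpha}, \Psi_{r,\alpha} \rangle = \langle \Psi_{s,\bar\alpha}, \Psi_{r+1,\alpha} \rangle$, which is (A1.3). Next take $r = M$ to get

$$\langle \Psi_{s+1,\beta}, \Psi_{M,\alpha} \rangle + k_{\alpha,\beta}\langle \Psi_{s,\beta}, \Psi_{M,\alpha} \rangle = 0, \quad s = 1 \ldots M.$$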


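By induction on $s$ we then find, by letting $s$ run from $M$ first to 1,

$$\langle \Psi_{M,\beta}, \Psi_{M,\alpha} \rangle = 0, \quad \beta \neq \bar\alpha,$$

$$\langle \Psi_{M,\beta}, \Psi_{M,\alpha} \rangle + k_{\alpha,\beta}\langle \Psi_{M-1,\beta}, \Psi_{M,\alpha} \rangle = 0 \ \Rightarrow\ \langle \Psi_{M-1,\beta}, \Psi_{M,\alpha} \rangle = 0, \quad \beta \neq \bar\alpha, \quad \text{and} \quad \langle \Psi_{M,\bar\alpha}, \Psi_{M,\alpha} \rangle = 0,$$

$$\vdots$$

$$\langle \Psi_{r,\beta}, \Psi_{M,\alpha} \rangle = 0, \quad \forall \alpha, \beta, \ 1 \leq r \leq M, \qquad \langle \Psi_{0,\beta}, \Psi_{M,\alpha} \rangle = 0, \quad \forall \beta \neq \bar\alpha,$$

which are (A1.1), (A1.2) for $r = M$. The rest of the proof follows through a similar inductive argument. □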

Appendix II Result IV. Let µ(k) belong to the discrete spectrum. Then the indexes Qα satisfy the following properties: PN PN (2.11.1) i) j =1 Qj = j =1 Qj¯ , j = 1 . . . N, Qα = sign(Imkα ) < 8α¯ , 8α >,

ii)

(2.11.2)

Qr,α = sign(Imkα ) < 8α¯ , 9r,α >, ¯ α = Qα¯ ,(2.11.3a) Q ¯ r,j = Q ¯ , j = 1 . . . M Q r,j

iii)

iv) QM,α = 0, where M ≥ 2 is the multiplicity of the pole.

(2.113b) (2.11.4)

Proof. i) Recall that the position of the poles occurring in pairs (kj , k¯j ) can be arranged such that I m(kj ) > 0; j = 1, 2 . . . N. For convenience we take j = 1, 2 . . . N. P ∂ Using u(x, y) = −2i N j =1 ∂x (8j + 8j¯ ) we have that Z Z X N N X 1 u (Qj − Qj¯ ) = (8j + 8j¯ )dxdy 2πi j =1

= −1 2π This is Eq. (2.11.1).

−1 2π Z

Z Z

j =1

N ∂ X ( 8j + 8j¯ )2 dxdy ∂x j =1

N X

dy(

j =1

8j + 8j¯ )2 (x = ±∞, y) = 0.


ii) Using reality of the potential and the orthogonality condition A.1.1 for s = 1 we get πQr,j

1 = 2i =π

Z

Z dxdyu9r,j = −

dxdy

2N X

Z ∂x 8l 9r,j =

dxdy

l=1

2N X l=1

2N X

¯ l 9r,j ∂x 8

l=1

< 8l , 9r,j >= π < 8j¯ , 9r,j > .

With r = 1 we obtain that

Qj =< 8j¯ , 8j > .

The above are Eq. (2.11.2). iii) Using the orthogonality condition (A.1.1), Z πQj¯ =

dxdy

2N X l=1

Z ∂x 8l 8j¯ = −

dxdy

Z ⇒ πQj¯ = and

2N X l=1

Z ∂x 8l 8j¯ =

dxdy8j ∂x 8j¯

dxdy8j ∂x 8j¯ ≡ πQj

¯ ¯ π Qr,j =< 8j¯ , 9r,j >=< 9r,j¯ , 8j >= −< 8j , 9r,j¯ > = π Q r,j

by (A1.4) first and then using $\langle f, g \rangle = -\overline{\langle g, f \rangle}$ (this is Eq. (2.11.3)). iv) We have $Q_{M,\alpha} = \mathrm{sign}(\mathrm{Im}\, k_\alpha)\langle \Psi_{1,\bar\alpha}, \Psi_{M,\alpha} \rangle = 0$ by (A1.2) with $r = M$, $s = 1$. This is (2.11.4). □

Appendix III

We first show that the Green's function satisfies

$$\frac{\partial G(x, y, k)}{\partial k} = ifG - \frac{\mathrm{sign}\, k_I}{2\pi i}, \qquad k_I \neq 0. \tag{A.III.1}$$

This formula can be obtained by writing the Green's function of (1.8) as

$$G_\pm(x, y, k) = \frac{-1}{2\pi i}\int e^{i((l-k)x - (l^2 - k^2)y)}\big(\theta(y)\theta(\mp(l - k)) - \theta(-y)\theta(\pm(l - k))\big)\, dl.$$

$\frac{\partial G}{\partial k}$ has two contributions, one stemming from the derivative of the exponential term, yielding the first term in (A.III.1), the other arising from the derivative of the Heaviside functions. Next we note that (A.III.1) implies, assuming that the derivatives can be taken under the integral sign, that the operator $G$ defined in Eq. (1.7) satisfies, for $k_I \neq 0$,

$$\frac{\partial G}{\partial k} = -ifG + G(if) + \frac{\mathrm{sign}\, k_I}{2\pi i}\int u\, dx\, dy,$$

i.e. $\frac{\partial G}{\partial k}$ acts on any function $\mu(x, y)$ by

$$\left\{\frac{\partial G}{\partial k}\right\}\mu = -ifG\mu + G(if\mu) + \frac{\mathrm{sign}\, k_I}{2\pi i}\int u\mu(x, y)\, dx\, dy. \tag{A.III.2}$$

Similarly we obtain, with $\lambda \equiv \frac{\mathrm{sign}\, k_I}{2\pi i}$, that

$$\frac{\partial^2 G(x, y, k)}{\partial k^2} = (2iy - f^2)G + i\lambda f, \qquad k_I \neq 0,$$

which implies

$$\left\{\frac{\partial^2 G}{\partial k^2}\right\}\mu = (2iy - f^2)G\mu - G((2iy + f^2)\mu) + 2fG(f\mu) - i\lambda f\int u\mu\, dx\, dy + i\lambda\int f u\mu\, dx\, dy. \tag{A.III.3}$$

Higher order derivatives are handled similarly.

Appendix IV

Consider the solution (1.20). We are interested in finding the asymptotic locations of the humps. Generically, for $x', y' = O(1)$ we have $x, y = O(t)$, $F = O(t^4)$ and the amplitude behaves as

$$u(x, y, t) = 2\partial_{xx}\log F = 2\,\frac{F F_{xx} - F_x^2}{F^2} = O\!\left(\frac{1}{t^2}\right).$$

It is reasonable to expect that for long time the maxima of $u$ are found at those points $x = x(t)$, $y = y(t)$ that are minima of $F$ and thus at which the denominator has smallest order. One can obtain equations for the minima of $F$, $F_x = F_y = 0$, and solve those equations asymptotically for $|t| \to \infty$. An alternative and more direct way is to compute the smallest order of $F$ directly from (1.20). Inspection of Eq. (1.20) shows that the smallest terms occur when the highest polynomials vanish at leading order. Namely, let us call the asymptotically dominant terms $z \sim z_0(t)$, $y' \sim y_0(t)$. Then we wish to satisfy

$$(z_0^2(t) - 4b^2 y_0^2(t) - 24bt)^2 \sim 0, \tag{A.4.1}$$

$$2y_0(t)(1 + 2bz_0(t)) \sim 0. \tag{A.4.2}$$

There are two cases as $|t| \to \infty$:

$$\text{(a)} \quad y_0 = 0, \qquad z_0 = \pm\sqrt{24|bt|}, \qquad bt > 0, \tag{A.4.3}$$

$$\text{(b)} \quad y_0 = \pm\sqrt{6\left|\frac{t}{b}\right|}, \qquad z_0 = 0, \qquad bt < 0. \tag{A.4.4}$$

Further asymptotic corrections can be obtained by assuming fractional power expansions of the form $\sum_{n \geq 0} C_n t^{\frac{1-n}{2}}$. In both cases (a), (b) one can readily establish

$$u \sim 2\,\frac{F_{zz}}{F} = 16b^2 + O\!\left(\frac{1}{t}\right). \tag{A.4.5}$$

The asymptotics of other multipole lumps may be calculated directly using similar ideas.
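A small numerical sketch of the leading-order balance: substituting the two cases (A.4.3) and (A.4.4) into the bracket of (A.4.1) should give zero for either sign of $bt$. The function names are hypothetical, and only condition (A.4.1) is checked here.

```python
import math

def leading_order_positions(b, t):
    """Leading-order hump positions (z0, y0) from (A.4.3) and (A.4.4)."""
    if b * t > 0:
        z0 = math.sqrt(24.0 * abs(b * t))
        return [(z0, 0.0), (-z0, 0.0)]
    y0 = math.sqrt(6.0 * abs(t / b))
    return [(0.0, y0), (0.0, -y0)]

def residual(b, t, z0, y0):
    """Bracket of (A.4.1), z0^2 - 4 b^2 y0^2 - 24 b t, scaled by |t|."""
    return (z0**2 - 4.0 * b**2 * y0**2 - 24.0 * b * t) / abs(t)

for b, t in [(1.0, 100.0), (0.5, -40.0), (-2.0, -30.0), (-1.0, 25.0)]:
    for z0, y0 in leading_order_positions(b, t):
        print(b, t, residual(b, t, z0, y0))
```

The residuals vanish up to rounding, which is the sense in which the highest polynomial in $F$ cancels at leading order.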


References

1. Ablowitz, M.J. and Clarkson, P.A.: Solitons, Nonlinear Evolution Equations and Inverse Scattering. Cambridge: Cambridge University Press, 1991, p. 516
2. Fokas, A.S. and Ablowitz, M.J.: Stud. Appl. Math. 69, 211 (1983)
3. Manakov, S.V., Zakharov, V.E., Bordag, L.A., Its, A.R. and Matveev, V.B.: Phys. Lett. A 63, 205 (1977)
4. Ablowitz, M.J. and Villarroel, J.: Phys. Rev. Lett. 78, 570 (1997)
5. Johnson, R.S. and Thompson, S.: Phys. Lett. A 66, 279 (1978)
6. Gorshkov, K.A., Pelinovskii, D.E. and Stepanyants, Yu.A.: JETP 77, 237 (1993)
7. Galkin, V.M., Pelinovsky, D.E. and Stepanyants, Yu.A.: Physica D 80, 246 (1995)
8. Villarroel, J., Chakravarty, S. and Ablowitz, M.J.: Nonlinearity 9, 1113 (1996)
9. Ablowitz, M.J. and Villarroel, J.: London Mathematics Society Lecture Note Series 255, Cambridge: Cambridge University Press, 1998, p. 151
10. Pelinovsky, D.E.: J. Math. Phys. 39, 5377 (1998)
11. Ablowitz, M.J., Chakravarty, S., Trubatch, A.D., Villarroel, J.: Novel potentials of the nonstationary Schrödinger equation and solutions of the Kadomtsev–Petviashvili I equation. Department of Applied Mathematics Report #389, University of Colorado, Boulder, CO, USA
12. Mañas, M., Santini, P.: Phys. Lett. A 227, 325 (1997)
13. Ward, R.S.: Phys. Lett. A 208, 203 (1995)
14. Ioannidou, T.: J. Math. Phys. 37, 3422 (1996)

Communicated by T. Miwa

Commun. Math. Phys. 207, 43 – 66 (1999)

Deterministic Percolation

Ilan Vardi

IHES, 35 route de Chartres, 91440 Bures-sur-Yvette, France

Received: 1 November 1998 / Accepted: 1 April 1999

Abstract: This paper examines percolation questions in a deterministic setting. In particular, I consider R, the set of elements of Z2 with greatest common divisor equal to 1, where two sites are connected if they are at distance 1. The main result of the paper proves that the infinite component has an asymptotic density. An “almost everywhere” sieve of J. Friedlander is used to obtain the result. 1. Introduction One can often gain insight about a deterministic problem by comparing it to a probabilistic model. For example, Hardy and Littlewood [30] made precise conjectures about the existence of twin primes based on the assumption that the prime numbers are a generic set, i.e., have “pseudo random” properties. Similarly, in [47] the theory of percolation was used to make some precise conjectures about the existence of unbounded walks of bounded step size along Gaussian primes, a question posed by Basil Gordon, see [24] for a survey. Percolation theory deals with the same question of unbounded walks of bounded step size on a lattice, but in a probabilistic context, see [14,23,26]. For example, consider the plane integer lattice {(m, n) : m, n ∈ Z}, and a fixed 0 ≤ p ≤ 1. Say that a lattice point (or site) is open with probability p and closed with probability 1−p. If these events occur independently for each lattice point, what is the probability that there is an unbounded walk using step size k or less? The main result of percolation theory discovered by Broadbent and Hammersley in 1957 [10] is that this exhibits phase transition, in other words, there is a 0 < pc < 1 for which the probability of an unbounded walk is zero if p < pc and one if p > pc (this is called the critical point). For this reason, percolation theory has been of great interest in physics, as it is one of the simplest models to exhibit phase transition. In this paper, I will examine how questions of percolation theory can be posed in a deterministic setting. 
Thus deterministic percolation is the study of unbounded walks on a single subset of a graph, e.g., defined by number theoretic conditions. This might be of interest in physics and probability theory as it studies percolation in a deterministic setting, and in number theory where it can be interpreted as studying the disorder inherent in the natural numbers.

Instead of just providing a conjectural framework for percolation properties of these sets as was done in [47], I would like to show what unconditional results can be obtained, so I will focus on

$$R = \{(m, n) \in \mathbb{Z}^2 : \gcd(m, n) = 1\},$$

where two sites are connected when they are at Euclidean distance 1. This example is more tractable than Gaussian primes yet retains some similar features. Studying the connectivity properties of R was posed as a problem in [15, p. 109].

The analysis of R will use sieve methods. This seems quite natural, as sieve methods can be interpreted as an application of probabilistic methods to number theory; for example, the large sieve [6] shows that arithmetic progressions behave like independent random variables. Moreover, as Doron Zeilberger has shown [50], sieve methods like the ones used in this paper can be thought of as special cases of the Lace Expansion, which has been used successfully to study percolation problems [11,29]. The sieve method used here is due to Friedlander [20] and it proves very general “almost everywhere” results. Thus, if $U = U(y)$ is a function increasing to infinity, then the Rosser sieve [36] shows that for every sufficiently large $y$, the interval $[y, y + U]$ always contains a number whose smallest prime factor is $\geq U^{1/2 - \varepsilon}$ (Corollary 5.1 below). However, Friedlander’s sieve shows that for almost every $y$, the interval $[y, y + U]$ contains a number all of whose prime factors are $\geq e^{U^{1/5 - \varepsilon}}$ (Proposition 5.1 below).

One can look at other examples of deterministic percolation, for example, the generalization of R to

$$R_n = \{(a_1, \ldots, a_n) \in \mathbb{Z}^n : \gcd(a_1, \ldots, a_n) = 1\},$$

where two points are connected when they are at Euclidean distance 1. When $n \geq 3$, this is substantially simpler since all elements $(a_1, \ldots, a_n)$ for which $(a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n) \in R_{n-1}$ lie on the infinite component. Thus the case $R_2$ is harder since the reduction to $R_1$ essentially corresponds to primes. Hopefully, the techniques developed here will shed some light on the original question about Gaussian primes, which can also be thought of as an analogue of $R_1$. Sieve techniques for Gaussian primes have been developed by Coleman [12], Fouvry and Iwaniec [19], and Friedlander and Iwaniec [22]. Finally, an interesting problem might be to study deterministic percolation for Fuchsian groups, e.g., by using the recent work of Lalley [38].

2. Problems of Deterministic Percolation

Consider a graph G and a subset V of vertices. A vertex, or site, will be open if it belongs to V and closed otherwise. Percolation studies the properties shared by “almost all” subsets of G which have a given density p. One should therefore compare the properties of V with those of a “generic” set of the same density (such questions are investigated for random graphs in [39]). In order to do this one must first answer

Problem 1. Does V have a probability? Since one is trying to give analogues of percolation, one needs an analogue of a probability for a deterministic set. In this paper I will use asymptotic density δ(V) as defined in the next section. As will be seen, calculating the asymptotic density of a finite event can be nontrivial, even in the simplest cases.

Problem 2. Does V have an infinite connected component? One might expect there to be an unbounded component if $\delta(V) > p_c(G)$, where $p_c(G)$ is the critical percolation probability of G, and not otherwise.

Problem 3. How many unbounded components does V have? For site and bond percolation in $\mathbb{Z}^n$, it has been shown that, with probability one, there is a unique unbounded component for $p > p_c$ [1,26]. For other models [38], with probability one, there can be zero, one, or infinitely many infinite components, depending on p.

Problem 4. Do the unbounded components of V have densities? Percolation theory also predicts that unbounded components have, with probability one, a density, denoted by θ(p).

Problem 5. In general, let f(p) be a function defined, with probability one, in the random model. Can the corresponding quantity be defined for V? If so, how is it related to f(δ(V))? Percolation theory considers other functions related to the cluster distribution. For example, χ(p), the average size of a connected component ($\chi^f(p)$, the average size of finite components when $p > p_c$), and κ(p), the cluster size per vertex. In the Ising model, θ, χ (or $\chi^f$), and κ represent magnetization, susceptibility, and free energy, respectively.

3. Percolation on R

The subject of this section is deterministic percolation for R. Note that R can be thought of as the points of $\mathbb{Z}^2$ which are visible from the origin. One can thus restate the percolation properties of R as “Lecture Hall Percolation” (compare with [9]): Consider a classroom with a regular array of tables so students sit only where they can see the teacher. The teacher starts passing out exam booklets to the closest student and each student passes on the booklets to his closest neighbors. What percentage of the class will receive an exam?

To get a feeling for the problem, consider Figs. 1 through 4, which depict sites of $\mathbb{Z}^2$, where a disk of radius 1/2 surrounds each open site, so connected sets of disks correspond to connected components. In Figs. 1, 3, and 4, sites are considered open if they belong to R. Figure 1 gives all sites (m, n) with −50 ≤ m, n ≤ 50. Figure 2 represents a 100 × 100 square where sites are randomly open with probability $6/\pi^2$. In Fig. 3, the sites were (a, b) + (m, n), where (a, b) = (8660313549, 3102586521) and 0 ≤ m, n ≤ 100. A similar computation was done in Fig. 4, where a, b were randomly chosen 100 digit numbers

8184798876661786858004843858100520155766428318796807790541881689416654357533055575739845797590171065,

5258627195794198788220463515375759096523698556698348004291249422348890633805215110415985087652598121.


Fig. 1. Relatively prime pairs near the origin

The reader is invited to explain the obvious differences between the random and deterministic models, e.g., show that independence is false for R. An interesting feature of the experimental data for R is that local properties do not seem to change when increasing the scale. Thus, a 100 × 100 snapshot appears roughly the same whether at distance from the origin $10^{10}$ or $10^{100}$. One can explain this phenomenon as follows: Long unbroken lines correspond to numbers with no small prime factors. Very roughly, a number with no prime factors $< k$ will produce a line with breaks at average spacing $k$, which is the expected spacing between numbers with no prime factors $< e^k$. For example, numbers with no prime factors $< \log^{1+\varepsilon} X$ will most likely produce unbroken lines of length $\geq \log^{1+\varepsilon} X$, but the Prime Number Theorem suggests that, on average, consecutive primes are distance $\log X$ apart, so these intervals will most likely contain a prime number. Since lines corresponding to primes lie on the infinite component, these lines will most likely belong to the infinite component. Moreover, the number of numbers with no prime factors $< \log^{1+\varepsilon} X$ is about $X/\log\log X$, so iterating this process seems to indicate that there is a scale invariance $X \mapsto e^X$. The line segments produced by this process seem to form a regular grid and, consistent with the philosophy of de Gennes [14,23], one observes that the infinite component is a large interconnected mesh with holes. These observations are used in the proof of Theorem 3.4 below and will be the key to the proof that the infinite component has an asymptotic density.

Fig. 2. Random sites open with probability $6/\pi^2$

Having gained some feeling for the empirical evidence, one now tries to address the problems of deterministic percolation.

Problem 1. Asymptotic density. In the case of R, the answer to Problem 1 is known: The asymptotic density of R is $1/\zeta(2)$, which is the well known result that the “probability” that two random integers are relatively prime is $6/\pi^2$. In order to prove this, one needs to define density.

Definition. The asymptotic density of an event P(z) occurring in R is

$$\delta(P) = \lim_{R \to \infty} \delta(P, R), \quad \text{where} \quad \delta(P, R) = \frac{|\{z \in R \cap B(R) : P(z) \text{ holds}\}|}{|B(R)|},$$

whenever this limit exists. Here $B(R) = B_2(R) = \{z \in \mathbb{Z}^2 : \|z\| < R\}$, and $\|(m, n)\| = \max(|m|, |n|)$ gives a square summation.

Fig. 3. Relatively prime pairs near $(10^{10}, 10^{10})$

Remark. One can also use circular summation, i.e., the usual Euclidean norm $|z|$, so that $B_\circ(R)$ is a disk of radius R. The subtle differences between such choices are well known, e.g., in the theory of multiple trigonometric series [2,7], and will be discussed below.

The density of R is easily computed by the classical estimate [44],

$$\sum_{\substack{|m|, |n| \leq R \\ \gcd(m,n) = 1}} 1 = 8\sum_{d \leq R} \varphi(d) = \frac{24}{\pi^2}R^2 + E(R), \qquad E(R) = O(R\log R).$$

The error term cannot be improved much further, as Montgomery has shown that $E(R) = \Omega_\pm(R\sqrt{\log\log R})$. However, as noted above, the situation is much different if one uses circular summation: One is then estimating

$$S(R) = \sum_{\substack{m^2 + n^2 \leq R^2 \\ \gcd(m,n) = 1}} 1 = \frac{6}{\pi}R^2 + \Delta(R),$$

Fig. 4. Relatively prime pairs near $(10^{100}, 10^{100})$

and it is easily shown that $\Delta(R) = O(R)$ using a nontrivial bound on the error term in the Gauss circle problem (the prime number theorem also implies that the error is $o(R)$). The error term $\Delta(R)$ has been studied by Moroz [41] and by Huxley and Nowak [35]. The latter showed, using results of Baker [3], that, assuming the Riemann Hypothesis, $\Delta(R) = O(R^{3/4 + \varepsilon})$. It seems that the actual error is $O(R^{1/2 + \varepsilon})$ and, in analogy with the Gauss circle problem [33,5], one can ask whether $\Delta(R)/R^{1/2}$ has a limiting (logarithmic) distribution (this would require very strong assumptions as in [42]).

Continuing with the analysis of R, it is also true that any finite configuration in R has a density. In other words, let $\Omega$ be a finite subset of $\mathbb{Z}^2$ which contains the origin and define $\delta(\Omega)$ to be the asymptotic density of $\Omega$ in the sense that $\Omega(z)$ is true if and only if $z + \Omega \subset R$.
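The square-summation identity for coprime pairs, namely that the number of pairs with $|m|, |n| \leq R$ and $\gcd(m, n) = 1$ equals $8\sum_{d \leq R}\varphi(d)$, can be checked by brute force for small R. This is only an illustrative sketch; the function names `coprime_count` and `totient_sum` are hypothetical.

```python
import math

def coprime_count(R):
    """Number of (m, n) with |m|, |n| <= R and gcd(m, n) = 1."""
    return sum(1 for m in range(-R, R + 1) for n in range(-R, R + 1)
               if math.gcd(m, n) == 1)

def totient_sum(R):
    """Sum of Euler's phi(d) for 1 <= d <= R, via a simple sieve."""
    phi = list(range(R + 1))
    for p in range(2, R + 1):
        if phi[p] == p:  # p is prime: no smaller prime has modified it
            for k in range(p, R + 1, p):
                phi[k] -= phi[k] // p
    return sum(phi[1:])

R = 200
count = coprime_count(R)
print(count == 8 * totient_sum(R))
print(count / (2 * R + 1) ** 2, 6 / math.pi ** 2)
```

The second line printed compares the observed fraction of open sites in the square with $6/\pi^2$; the two agree to within the $O(R\log R)$ error divided by the area.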


Theorem 3.1. Let $\Omega$ be a finite subset of $\mathbb{Z}^2$; then $\delta(\Omega)$ exists.

This result is more subtle than expected and a direct approach to compute $\delta(\Omega)$ fails (I would like to thank G. Tenenbaum for explaining this point to me). For example, let $\Omega = \{(0, 0), (1, 1), (2, 2)\}$; then one is counting

$$S(\Omega) = \sum_{\substack{\|(m,n)\| \leq R \\ \gcd(m+j,\, n+j) = 1,\ j = 0, 1, 2}} 1.$$

The direct approach uses the identity

$$\sum_{d | n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n > 1, \end{cases} \tag{1}$$

where $\mu(d)$ is the Möbius function, and this gives rise to terms of the form

$$\sum_{d_1, d_2, d_3 \leq R} \mu(d_1)\mu(d_2)\mu(d_3)\left(\frac{R}{d_1 d_2 d_3} + O(1)\right)^2.$$

The contribution of the $O(1)$ terms will be of order $R^3$ and no known estimates will be able to reduce this to $o(R^2)$. However, one can still hope that such a bound exists and this leads to

Conjecture 1. If $\Omega$ is a finite set containing the origin, then

$$\delta(\Omega) = \prod_{p \text{ prime}} \frac{p^n - |\Omega \bmod p|}{p^n}.$$

This is an analogue of the Hardy–Littlewood conjectures for prime k-tuplets [30]. The conjecture states that the global density should be a product of the local densities for each prime. A similar result was proved by Hafner, Sarnak, and McCurley [27], in which the density of relatively prime values of polynomials is computed.

Problem 2. Existence of an infinite component. Existence is trivial since the line $\{(m, 1) : m = 1, 2, 3, \ldots\}$ lies in R. Comparing R to the random model, one sees that an unbounded component should exist since $6/\pi^2 > .5927$, the conjectured approximation to $p_c(\mathbb{Z}^2)$ [51] (the best rigorous bounds are $.556 < p_c(\mathbb{Z}^2) < .679492$ of [4] and [49], respectively).

Problem 3. Number of infinite components. This has an elementary answer consistent with the probabilistic model:

Proposition 3.1. R has a unique infinite component.

Proof. Call $C_1$ the infinite component containing the line $\{(m, 1) : 1 \leq m < \infty\}$. Then for each prime p, the vertical line $\{(p, n) : 1 \leq n \leq p - 1\}$ is in $C_1$. Thus, a large connected component in the region $\{(m, n) : m > n\}$ will eventually have to cross one of these lines, and therefore belong to $C_1$. By symmetry, the same holds true for the other 7 lines $\{\pm 1, \pm k\}$, $\{\pm k, \pm 1\}$. Finally, these lines are all joined together, either at $(\pm 1, \pm 1)$ or by passing through $(\pm 1, 0)$ and $(0, \pm 1)$, where, by definition, $\gcd(1, 0) = 1$. □
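The product of local densities in Conjecture 1 can be compared with a brute-force count for the three-point diagonal configuration discussed above. The sketch below assumes the exponent in the local factors is 2, since the configuration lives in the plane, and truncates the Euler product at a finite prime limit; all function names are hypothetical and the agreement it shows is numerical evidence only.

```python
import math

def primes_up_to(limit):
    """Primes <= limit by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def local_product(config, limit=10_000):
    """Truncated product of (p^2 - |config mod p|) / p^2 over primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        residues = {(a % p, b % p) for a, b in config}
        prod *= (p * p - len(residues)) / (p * p)
    return prod

def empirical_density(config, R):
    """Fraction of z with |m|, |n| <= R such that z + w is a coprime pair for all w."""
    hits = 0
    for m in range(-R, R + 1):
        for n in range(-R, R + 1):
            if all(math.gcd(m + a, n + b) == 1 for a, b in config):
                hits += 1
    return hits / (2 * R + 1) ** 2

config = [(0, 0), (1, 1), (2, 2)]
predicted = local_product(config)
observed = empirical_density(config, 150)
print(predicted, observed)
```

For this configuration the factor at $p = 2$ is $1/2$, because $(0,0)$ and $(2,2)$ coincide modulo 2, and every odd prime contributes $1 - 3/p^2$.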


Problem 4. The asymptotic density of the infinite component. This is the main question of this paper and I prove

Theorem 3.2. The infinite component of R has an asymptotic density.

Let $\theta = \theta(R)$ be the asymptotic density of the infinite component, i.e., $\theta = \delta(C_\infty)$, where $C_\infty(z)$ is true if and only if z lies on the infinite component, which, by abuse of notation, will also be denoted by $C_\infty$. The value of θ can be examined experimentally and preliminary computations seem to indicate that $\theta/p(R_2) \approx .96 \pm .01$, i.e., about 96% of open sites lie on the unbounded component. In fact, I will prove

Theorem 3.3. The asymptotic density of the infinite component of R is not zero.

Both these theorems are quite subtle and follow from

Theorem 3.4. Let f(R) be any function increasing to infinity; then, except for a set of zero asymptotic density, every $(m, n) \in B(R)$ is surrounded by a rectangle of perimeter $< f(R)$ all of whose edges are contained in $C_\infty$.

As noted above, this result is consistent with de Gennes’ philosophy that the infinite component consists of a mesh with small holes [14,23]. Theorem 3.4 suggests that a stronger result is true, namely if one defines rect(m, n) to be the perimeter of the smallest rectangle containing (m, n) all of whose edges are in $C_\infty$, then rect(m, n) should have a limiting distribution. An examination of the proof of Theorem 3.2 (Sect. 8) reveals that it works with f(R) in Theorem 3.4 replaced by $(\log R)^{1/2 - \varepsilon}$, so Theorem 3.2 follows from the weaker result of Lemma 7.3 below. It should be noted that one can similarly show that the infinite component of $R_n$ has an asymptotic density for $n \geq 3$, and it is trivially nonzero since it is $\geq 6/\pi^2$ in these cases.

One can easily give an upper bound $\theta < 6/\pi^2$. Thus, consider γ(z) to be true if $z \equiv (4, 15) \pmod{30}$; then z is isolated if γ(z) and $z \in R$ are both true. Since

$$\sum_{\substack{\|(m,n)\| \leq R \\ \gamma(z),\ (m,n) = 1}} 1 = \sum_{\substack{\gcd(d, 30) = 1 \\ d \leq R}} \mu(d) \sum_{\substack{m \equiv 4/d \ (\mathrm{mod}\ 30) \\ n \equiv 15/d \ (\mathrm{mod}\ 30) \\ \|(m,n)\| \leq R/d}} 1 + O(R),$$

one gets

$$\delta(\gamma) = \frac{1}{30^2}\prod_{p > 5}\left(1 - \frac{1}{p^2}\right) = \frac{6}{\pi^2}\prod_{p \leq 5}\frac{1}{p^2 - 1},$$

so

$$\theta \leq \frac{6}{\pi^2} - 4\delta(\gamma) = \left(1 - \frac{1}{144}\right)\frac{6}{\pi^2} = .9930\bar{5}\,\frac{6}{\pi^2}.$$
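The two ingredients of this bound are easy to verify mechanically: every neighbour of a site congruent to (4, 15) modulo 30 has both coordinates divisible by one of 2, 3, 5, and the product over $p \leq 5$ equals $1/576$, which gives the factor $1 - 4/576 = 143/144$. The sketch below uses hypothetical function names and exact rational arithmetic.

```python
from fractions import Fraction
import math

def neighbours(z):
    """The four lattice sites at Euclidean distance 1 from z."""
    m, n = z
    return [(m + 1, n), (m - 1, n), (m, n + 1), (m, n - 1)]

def always_closed_neighbours(residue=(4, 15), modulus=30):
    """True if each neighbour of z = residue (mod modulus) has both coordinates
    sharing a prime factor of the modulus, so it can never be a coprime pair."""
    return all(math.gcd(math.gcd(a, b), modulus) > 1
               for a, b in neighbours(residue))

# delta(gamma) / (6/pi^2) = product over p <= 5 of 1/(p^2 - 1)
ratio = Fraction(1)
for p in (2, 3, 5):
    ratio *= Fraction(1, p * p - 1)

bound_factor = 1 - 4 * ratio
print(always_closed_neighbours(), ratio, bound_factor, float(bound_factor))
```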

More generally, define an animal α to be a connected set containing the origin, and α(z) is true if $z + \omega \in R$ for every $\omega \in \alpha$, but $z + \omega \notin R$ if $\omega \in \partial\alpha$ (a site is in $\partial\alpha$ if it is connected to α but not in α). It should be noted that this definition of animal is not translation invariant; see [8] for general results on animals. The inclusion-exclusion principle implies that

$$\delta(\alpha) = \sum_{s \subset \partial\alpha} (-1)^{|s|}\,\delta(\alpha \cup s),$$

where each $\delta(\alpha \cup s)$ on the right represents an ordinary density of a configuration. It follows that the density of animals exists and moreover, Lemma 8.4 below shows that

$$\theta = \frac{6}{\pi^2} - \sum_{\alpha} \delta(\alpha).$$

One can therefore try to estimate θ by computing δ(α) in small cases. As noted above, this does not seem possible using current methods. However, if one accepts Conjecture 1, then one has

$$\alpha_1 = \zeta_1(2) - 4\zeta_2(2) + 8\zeta_3(2) - \zeta_4^\times(2) - \zeta_5(2) = .0110235396\ldots,$$

$$\alpha_2 = 4\zeta_2(2) - 32\zeta_3(2) + \frac{62}{5}\zeta_4^\times(2) + 36\zeta_5(2) - \frac{16}{3}\zeta_6(2) = .0013019993\ldots,$$

where $\alpha_1$ and $\alpha_2$ are the densities of elements in R belonging to animals of size 1 and 2 respectively, and

$$\zeta_m(s) = \prod_{p \text{ prime}}\left(1 - \frac{m}{p^s}\right), \qquad \zeta_m^\times(s) = \prod_{\substack{p \text{ prime} \\ p^s \neq m}}\left(1 - \frac{m}{p^s}\right).$$

Conjecture 1 therefore implies

$$\theta < 24\zeta_3(2) - \frac{57}{5}\zeta_4^\times(2) - 35\zeta_5(2) + \frac{16}{3}\zeta_6(2) \approx .9797253010901\,\frac{6}{\pi^2}.$$
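The quoted values of $\alpha_1$, $\alpha_2$ and the resulting bound can be reproduced by evaluating the Euler products directly. The sketch below simply truncates each product at a finite prime limit, which is a cruder method than an accelerated formula but adequate at this precision; the function names and the limit of $10^5$ are assumptions made for illustration.

```python
import math

def primes_up_to(limit):
    """Primes <= limit by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def zeta_m(m, primes, s=2, skip_equal=False):
    """Truncated product of (1 - m / p**s); skip_equal omits primes with p**s == m."""
    prod = 1.0
    for p in primes:
        if skip_equal and p ** s == m:
            continue
        prod *= 1.0 - m / p ** s
    return prod

primes = primes_up_to(100_000)
z1, z2, z3, z5, z6 = (zeta_m(m, primes) for m in (1, 2, 3, 5, 6))
z4x = zeta_m(4, primes, skip_equal=True)

alpha1 = z1 - 4 * z2 + 8 * z3 - z4x - z5
alpha2 = 4 * z2 - 32 * z3 + 62 / 5 * z4x + 36 * z5 - 16 / 3 * z6
bound = (24 * z3 - 57 / 5 * z4x - 35 * z5 + 16 / 3 * z6) / (6 / math.pi ** 2)
print(alpha1, alpha2, bound)
```

Note that the bound is algebraically $\zeta_1(2) - \alpha_1 - \alpha_2$, so the three printed numbers are not independent; the truncation error at this prime limit is well below the digits compared.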

Note that the functions ζm (s) do not have an analytic continuation if m > 1 since a result of [16] states that for a polynomial f (x), the Euler product ζ (s, f (x)) = Q Estermann −s p f (p ) has an analytic continuation to the complex plane only if f (x) is a product of cyclotomic polynomials. However, one can easily find these numerically using the formula P " # 1` d|` ad µ(`/d) ∞ Y Y Y 1 f (1/ps ) , ζ (`s) 1 − `s ζ (s, f (x)) = p p≤m p≤m `=1

where log f (x) =

P∞

k=1 ak x

k,

see [18,46].

Problem 5. Functions of percolation. The evaluation of θ indicates that other functions might also have deterministic analogues. One therefore defines

$$\kappa(R) = \lim_{R\to\infty} \frac{\text{Number of connected components in } B(R)}{|B(R)|}, \qquad \chi^{f}(R) = \lim_{R\to\infty} \frac{1}{|B(R)|} \sum_{\substack{\|z\| \le R\\ z \notin C_\infty}} |C(z)|,$$

if these limits exist, where C(z) represents the connected component containing z. Since the sum for κ(R) has better convergence properties than θ(R), the methods used to prove Theorem 3.2 also show that this exists. However, the existence of $\chi^{f}(R)$ is left as an open problem.
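As an illustration of the quantity κ, the following sketch counts connected components of relatively prime pairs inside a finite box. It assumes that B(R) is the box |m|, |n| ≤ R and that sites are joined to their four horizontal and vertical neighbours; both are assumptions made for the example, and components are clipped at the edge of the box, so the count only approximates the quantity in the definition.

```python
from collections import deque
from math import gcd

def open_sites(R):
    """Relatively prime pairs (m, n) with |m|, |n| <= R (box shape is an assumption)."""
    return {(m, n)
            for m in range(-R, R + 1)
            for n in range(-R, R + 1)
            if gcd(m, n) == 1}

def component_sizes(R):
    """Sizes of the 4-neighbour connected components of open sites in the box."""
    sites = open_sites(R)
    seen = set()
    sizes = []
    for start in sites:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over one component
            m, n = queue.popleft()
            size += 1
            for nb in ((m + 1, n), (m - 1, n), (m, n + 1), (m, n - 1)):
                if nb in sites and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return sizes

def kappa_estimate(R):
    """Number of components divided by the number of sites in the box."""
    return len(component_sizes(R)) / (2 * R + 1) ** 2

print(kappa_estimate(50))
```

Dropping the largest entry of `component_sizes(R)` and summing the rest gives a similar finite-box proxy for the second function defined above.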

A problem related to percolation is computing the asymptotics of the largest R-free square. This question is a generalization of the Cramér conjecture [13], which states that the largest prime gap ≤ R should tend to (log R)² (though this is now believed to inaccurately model the primes [25]). One can compare results for the random and deterministic models.

Proposition 3.2. Let $F_p(R)$ be the area of the largest closed square inside B(R), where sites are open with probability p, then, with probability 1,

$$\lim_{R\to\infty} \frac{F_p(R)}{\log R} = \frac{-2}{\log(1-p)}.$$

Proposition 3.3. Let F(R) be the area of the largest R-free square in B(R), then

$$\frac{1}{\log\log R} \ll \frac{F(R)}{\log R} \ll \frac{\log R}{(\log\log R)^2}.$$

4. Finite Configurations have a Density

This section is devoted to the proof of Theorem 3.1. I will prove this by showing that the asymptotic density of a configuration is the limit of its density modulo a product of initial sets of primes. The idea is that $\mathbb{Z}^2$ is approximated by $\mathbb{Z}^2 \bmod \prod_{p\le X} p$ and, using an idea from ordinary percolation [26], one can think of some events as being monotonic.

Definition. Two integers m, n are relatively prime modulo h if gcd(m, n, h) = 1. Furthermore, let

δh (, R) = |{(m, n) ∈ B(R) : h (m, n)}| / |B(R)|,

where h (m, n) means that (m, n) holds as well as (m, n, h) = 1.

Lemma 4.1. δh () = lim_{R→∞} δh (, R) exists (and will be called the density of modulo h).

Proof. This follows from the fact that this relative primality is periodic modulo h and that B(h) tiles the plane. □

The next step is to consider a product of an initial set of primes $P(X) = \prod_{p\le X} p$ and let h = P(X). Thus the pairs relatively prime modulo P(X) will be an approximation to R. The main result thus follows from

Lemma 4.2. If Y > X and X → ∞, then δP (Y ) (, P (Y )) = δP (X) (, P (X)) + O(||/X).

Lemma 4.3. If both X → ∞ and R/P(X) → ∞, then δ(, R) = δP (X) (, P (X)) + O(||/X + RP (X)/|B(R)|).
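The notion of relative primality modulo h can be made concrete for a single site. The sketch below counts the residue pairs (m, n) in the fundamental domain 0 ≤ m, n < h with gcd(m, n, h) = 1 and compares the resulting proportion with the product of 1 − 1/p² over the primes p dividing h; the function names are illustrative and not from the paper.

```python
from math import gcd

def coprime_mod_h_count(h):
    """Number of pairs 0 <= m, n < h with gcd(m, n, h) = 1."""
    return sum(1
               for m in range(h)
               for n in range(h)
               if gcd(gcd(m, n), h) == 1)

def prime_divisors(h):
    """Distinct prime divisors of h by trial division."""
    primes = []
    p = 2
    while p * p <= h:
        if h % p == 0:
            primes.append(p)
            while h % p == 0:
                h //= p
        p += 1
    if h > 1:
        primes.append(h)
    return primes

def local_density(h):
    """Product of (1 - 1/p^2) over the primes p dividing h."""
    density = 1.0
    for p in prime_divisors(h):
        density *= 1.0 - 1.0 / p ** 2
    return density

for h in (2, 6, 30, 210):
    print(h, coprime_mod_h_count(h) / h ** 2, local_density(h))
```

As h runs through the products 2, 6, 30, 210, … of initial primes, the proportion decreases towards 6/π², the density of relatively prime pairs.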


P If these results are true, then one can take an increasing sequence of Xj ’s such that 1/Xj < ∞, so that Lemma 4.2 shows that δP (Xj ) (, P (Xj )) approaches a limit. Lemma 4.3 then implies that this limit equals the limit of δ(, R) as R → ∞. This proves Theorem 3.1. Proof of Lemma 4.2. Let Y > X, then P (X)|P (Y ) and B(P (X)) exactly tiles B(P (Y )). Now consider z ∈ B(P (Y )) for which P (Y ) (z) is true. Then P (X) (z mod P (X)) is true except if the distance of z mod B(P (X)) from the boundary of B(P (X)) is less than ||. Conversely, if P (Y ) (z) is false, then P (X) (z mod B(P (X))) could be true if d|z+α, where d is divisible only by primes > X. It follows that   X z∈B(P (Y )) P (Y ) (z)

1=

|B(P (Y ))|   |B(P (X))| 

X z∈B(P (X)) P (X) (z)

 p 1 + O || |B(P (X))|  

X 1 + O |B(P (Y ))| || d2

! .

d>X

Dividing this by |B(P (Y ))| gives the result. u t Proof of Lemma 4.3. Let R > P (X), where R/P (X) is large. Let z ∈ B(R), then if (z) is true, then P (X) (z mod P (X)) is true unless z mod P (X) is close to the boundary of B(P (X)). Conversely, if (z) is false, then P (X) (z mod B(P (X))) could be true if d|z + α, where d is divisible only by primes > X. One gets a similar estimate as in the above, except that the tiling of B(R) by B(P (X)) is no longer exact and there is an extra factor of size O(R |B(P (X)))|. Thus   X X p  |B(R)|   1= 1O + || |B(P (X))|    |B(P (X))| z∈B(R) (z)

z∈B(P (X)) P (X) (z)

X 1 + O |B(R)| || d2

! + O(R P (X)).

d>X

The result follows upon dividing by |B(R)|. u t

5. Sieve Methods

The sieve methods used here are due to Friedlander [20], who used them to show that for any fixed E > 5, almost every interval $[y, y + \log^E y]$ contains a number with at most 4 prime factors. The term “almost every” means that the measure of y ≤ X for which this fails to hold is o(X). Such results were first obtained by A. Selberg [43], who showed that, assuming the Riemann Hypothesis, for any function f(n) → ∞, almost every interval $[y, y + f(y)\log^2 y]$ contains a prime. Heath-Brown [32] later showed that the further assumption of Montgomery’s pair correlation conjecture implies the similar result with log y replacing $\log^2 y$, and this is in some sense optimal. Unconditional results were proved by various authors and the result used here is due to N. Watt [48] and shows that almost every interval of length $y^{1/14+\varepsilon}$ contains a prime. An unconditional result of Heath-Brown and Iwaniec [34] states that every interval of length $y^{11/20}$ contains a prime (this has subsequently been improved, see [48]). In [21], Friedlander improved his methods and showed that almost every interval [y, y + f(y) log y] contains a number with at most 21 prime factors. This is a much more accurate result, as the optimal interval length is $f(y)\log y/(\log\log y)^{21}$. There are other techniques to prove such results and Harman [31] has shown that almost every interval $[y, y + \log^7 y]$ contains a number with exactly two prime factors. Unlike Friedlander’s papers, this does not seem to yield techniques that are directly applicable to the problems of this paper.

I will closely follow Friedlander’s paper, which starts with a modified form of Rosser’s sieve [36] (see also [28, Chapter 8]). As usual [28], consider an interval A and let S(A, Z) be the number of elements of A which are not divisible by any prime p < Z. In this case A will be the interval [y, y + U].

Theorem 5.1 (Friedlander). Let 2 ≤ Z ≤ D. There exists a function f(u) which is positive for u > 2, and a sequence $\lambda_d$ of real numbers, this sequence depending on D and Z, and satisfying $|\lambda_d| \le 1$, such that

$$S(A, Z) > \frac{U}{\log Z}\Bigl\{ f\Bigl(\frac{\log D}{\log Z}\Bigr) + O\bigl((\log D)^{-1/3}\bigr)\Bigr\} - |R_y(D)|, \tag{2}$$

where

$$R_y(D) = \sum_{d<D} \lambda_d\, r_y(d), \qquad r_y(d) = \psi\Bigl(\frac{y}{d}\Bigr) - \psi\Bigl(\frac{y+U}{d}\Bigr), \qquad \psi(t) = t - \lfloor t \rfloor - \frac{1}{2}.$$
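The role of the remainder term in Theorem 5.1 can be made explicit: with ψ(t) = t − ⌊t⌋ − 1/2, the number of multiples of d in the half-open interval (y, y + U] equals U/d plus ψ(y/d) − ψ((y + U)/d). The sketch below checks this identity in exact rational arithmetic; the half-open convention and the function names are assumptions made for the example.

```python
from fractions import Fraction
from math import floor

def psi(t):
    """Sawtooth function psi(t) = t - floor(t) - 1/2 (exact for Fractions)."""
    return t - floor(t) - Fraction(1, 2)

def r(y, U, d):
    """Remainder r_y(d) = psi(y/d) - psi((y+U)/d) for integers y, U, d."""
    return psi(Fraction(y, d)) - psi(Fraction(y + U, d))

def multiples_in_interval(y, U, d):
    """Number of integers x with y < x <= y + U that are divisible by d."""
    return floor(Fraction(y + U, d)) - floor(Fraction(y, d))

# The count of multiples is the expected value U/d plus the remainder.
y, U, d = 1000, 37, 7
print(multiples_in_interval(y, U, d), Fraction(U, d) + r(y, U, d))
```

Since |ψ| ≤ 1/2, each remainder satisfies |r_y(d)| ≤ 1, which is the source of trivial bounds on R_y(D) of the kind used in Corollary 5.1.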

The following is the best “everywhere” result available using the Rosser sieve.

Corollary 5.1. For any ε > 0, there exists a constant E such that for all sufficiently large k and X, every interval [y, y + U] contains at least E U/log U numbers all of whose prime divisors are greater than $U^{1/2-\varepsilon}$.

This result follows by letting $U = D^{1+\delta} = Z^{2+3\delta}$ in Theorem 5.1, where δ > 0 and the trivial bound $|R_y(D)| \le 2D$ is used. Theorem 5.1 formulates the error term $R_y(D)$ in analytic form which allows one to average the error as y varies over an interval [X/2, X].

Theorem 5.2 (Friedlander). Let $4 \le D^2 \le V/2 \le X/(2\log D)$, where V = y/U, then

$$\frac{1}{X}\int_{X/2}^{X} |R_y(D)|\,dy \ll U^{1/2}(\log D)^{3/2} + (\log D)^2. \tag{3}$$

Remark. One can improve this result slightly by using Friedlander’s second paper [21]. This would result in replacing the $U^{1/2}(\log D)^{3/2}$ term in (3) with $U(\log Z)^{-1/2}$, under the conditions $D^2(\log D)^7 \le V \le X$ and $Z^{11} \le D$. The exponent in Proposition 5.1 below would be improved from 1/5 to 1/3 (the true exponent seems to be 1).

The proof of Theorem 5.2 follows exactly as in the paper of Friedlander [20] except that the log D term is everywhere substituted for log X, written as L in [20]. An examination of Friedlander’s proof reveals that this substitution is valid at each step of his argument. Friedlander used this result to prove


Theorem 5.3 (Friedlander). For any A > 5, there is a constant c such that almost every interval $[y, y + (\log y)^A]$ contains at least $c\,(\log y)^{A-1}$ numbers x such that $P_-(x) > x^{1/4-\varepsilon}$, where $P_-(x)$ is the smallest prime factor of x (so x has at most 4 prime factors).

This result will be generalized to arbitrary interval lengths of small size.

Proposition 5.1. Let g(R) be an unbounded increasing function such that $g(R) < (\log R)^5$. Given any fixed ε > 0, then for all y < R, except for a set of measure $O(R\,[g(R)]^{-\varepsilon/2})$, the interval [y, y + g(R)] contains an x with $P_-(x) > \exp([g(R)]^{1/5-\varepsilon})$.

Proof. One takes a large T and has to show that almost all y ≤ T have an x ∈ [y, y + g(R)] with $P_-(x) > \exp([g(R)]^{1/5-\varepsilon})$. Using the above notation, one writes U = g(R) and, in order to get the result of Proposition 5.1, one must choose $Z = e^{U^{1/5-\varepsilon}}$. Also, in order to have f((log D)/(log Z)) > 0, one takes $D = Z^{2+\varepsilon}$. Since U → ∞, assume that y > T/U and divide the interval [T/U, T] into disjoint subintervals of the form (X, 2X]. Taking V = X/U and applying Theorem 5.1 gives a lower bound

$$S(A, Z) > U^{4/5+\varepsilon}\bigl\{ f(2+\varepsilon') + O\bigl((\log D)^{-1/3}\bigr)\bigr\} - |R_y(D)|,$$

so the main term has order $U^{4/5+\varepsilon}$. Now consider G ⊂ (X, 2X] given as the set of y’s for which $|R_y(D)| > U^{4/5-\varepsilon}$. It follows that

$$\int_X^{2X} |R_y(D)|\,dy > |G|\, U^{4/5-\varepsilon}.$$

Since $U < (\log X)^5$, one has $Z < e^{(\log X)^{1-\varepsilon}}$ and so

$$D^2 \le e^{3(\log X)^{1-\varepsilon}} \le \frac{X}{2(\log X)^5} \le \frac{X}{2U} = \frac{V}{2},$$

for all sufficiently large X. Also, since $D = \exp((2+\varepsilon)U^{1/5-\varepsilon})$, one has $U = C_1(\log D)^{5+\varepsilon'}$ for some constants $C_1$, ε′, so

$$\frac{V}{2} = \frac{X}{2U} \le C_1\frac{X}{(\log D)^5} \le \frac{X}{2\log D},$$

for all sufficiently large X. The conditions of Theorem 5.2 are therefore satisfied, and applying it gives

$$\int_X^{2X} |R_y(D)|\,dy \ll X\bigl(U^{1/2}(\log D)^{3/2} + (\log D)^2\bigr) \ll X\,U^{4/5-3\varepsilon/2}.$$

It follows that

$$|G| \ll X\,U^{-\varepsilon/2} = o(R),$$

i.e., the proportion of exceptional y’s goes to zero. Thus almost all y ∈ (X, 2X] have $|R_y(D)| < U^{4/5-\varepsilon}$ and this is dominated by the main term $U^{4/5+\varepsilon}$ of (2), so S(A, Z) is nonzero for almost all such y. Taking the union of the intervals gives the result. □
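To see what Proposition 5.1 measures, one can scan a short interval for the integer whose smallest prime factor is largest. This is a brute-force sketch using trial division, suitable only for small inputs; the function names are illustrative and the code does not implement the sieve itself.

```python
def smallest_prime_factor(x):
    """Smallest prime factor P_-(x) of an integer x >= 2, by trial division."""
    if x < 2:
        raise ValueError("x must be at least 2")
    p = 2
    while p * p <= x:
        if x % p == 0:
            return p
        p += 1
    return x  # x itself is prime

def best_in_interval(y, length):
    """Pair (x, P_-(x)) maximizing P_-(x) over the integers x in [y, y + length]."""
    candidates = ((x, smallest_prime_factor(x)) for x in range(y, y + length + 1))
    return max(candidates, key=lambda pair: pair[1])

print(best_in_interval(90, 6))
```

For the interval [90, 96] the scan returns 91 = 7 · 13, whose smallest prime factor 7 is the largest available because the interval contains no prime.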


6. R-Free Squares

Before embarking on the rather daunting task of proving Theorem 3.4, I will do a “warmup” consisting of proving Propositions 3.2 and 3.3.

Proof of Proposition 3.2. (a) $\limsup F_p(R)/\log R \le -2/\log(1-p)$: Let c > −2/log(1 − p), and at each site (m, n) in B(R), consider a square of area $c\log\|(m,n)\|$ and with lower left hand corner at (m, n). The probability that all sites in the square are closed is $(1-p)^{c\log\|(m,n)\|} < \|(m,n)\|^{-2-\varepsilon}$, where ε > 0. Summing over all sites in $\mathbb{Z}^2$ gives

$$\sum_{\substack{(m,n)\in\mathbb{Z}^2\\ \|(m,n)\|>2}} \frac{1}{\|(m,n)\|^{2+\varepsilon}},$$

which converges. The Borel–Cantelli Lemma [17] shows that the probability of infinitely many such squares being closed is zero.

(b) $\liminf F_p(R)/\log R \ge -2/\log(1-p)$: Let c < −2/log(1 − p) and for each site (m, n) with $\|(m,n)\| > 2$, consider a square of area $c\log\|(m,n)\|$ with bottom left hand corner at $\lfloor (m,n)(\log\|(m,n)\|)^2 \rfloor$. As in part (a), it follows that each square has a probability $\|(m,n)\|^{-2+\varepsilon}$ of being closed. Moreover, all these events are independent so the result follows from the Borel–Cantelli Lemma and the divergence of

$$\sum_{\substack{(m,n)\in\mathbb{Z}^2\\ \|(m,n)\|>2}} \frac{(\log\|(m,n)\|)^2}{\|(m,n)\|^{2-\varepsilon}}. \qquad\Box$$

Proof of Proposition 3.3. (a) Lower bound: Consider an integer k, and the first k² primes $q_1, \ldots, q_{k^2}$. By the Chinese Remainder Theorem, the following congruences have solutions $m, n \le \prod_{i=1}^{k^2} q_i$,

$$m \equiv -i \pmod{\prod_{j=1}^{k} q_{(i-1)k+j}}, \qquad n \equiv -j \pmod{\prod_{i=1}^{k} q_{(i-1)k+j}},$$

so gcd(m + i, n + j) > 1, for 1 ≤ i, j ≤ k. The prime number theorem gives $k^{2(1+\varepsilon')k^2} > \|(m,n)\| > k^{2(1-\varepsilon)k^2}$, which yields the estimate $k^2 > (1-\varepsilon'')\,\frac{1}{2}\log R/\log\log R$ (one can improve the 1/2 term to π²/12).

(b) Upper bound: Assume that there is an R-free square {(m + i, n + j) : 1 ≤ i, j ≤ k}. By Corollary 5.1, there are $x_1, \ldots, x_\ell \in [m+1, m+k]$ for which $P_-(x_i) > k^{1/2-\varepsilon}$, and such that ℓ ≫ k/log k. Write $x_i = \prod_{j=1}^{r_i} q_{ij}^{e_{ij}}$. One notes that if q is a prime, then the number of integers in a set S which are relatively prime to q is ≥ min{|S|(1 − 1/q), |S| − 1}. The asymptotic relation

$$\sum_{k^{1/2-\varepsilon} < q < k} \frac{1}{q} \sim -\log(1/2-\varepsilon) < 1$$

thus implies that the number of integers in [m + 1, m + k] relatively prime to $x_i$ is at least $(1 + \log(1/2-\varepsilon))\,k - |\{j \le r_i : q_{ij} \ge k\}|$, and so each $r_i > (1 + \log(1/2-2\varepsilon))\,k$.


One further notes that gcd (xi , xj ) ≤ k for i 6 = j , so that xi and xj can share at most 2 distinct prime factors. This implies that the number of distinct qij ’s is k 2 / log k. The prime number theorem gives ` Y i=1

xi

Y

q = eD k

2 −k 1/2−ε +o(1)

,

y
for some constant D, and so $\log x_i \gg k\log k$ for all i. This therefore gives log R ≫ k log k and the result follows since log k is of order log log R. □
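The lower-bound construction in part (a) of the proof of Proposition 3.3 can be sketched in code: assign one of the first k² primes to each cell (i, j) of a k × k grid and use the Chinese Remainder Theorem to make that prime divide both m + i and n + j. The cell-to-prime assignment below (row-major order) and all function names are assumptions made for the illustration; `pow(a, -1, n)` requires Python 3.8 or later.

```python
from math import gcd, prod

def first_primes(count):
    """The first `count` primes, by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def crt(residues, moduli):
    """Solve x = r_i (mod n_i) for pairwise coprime moduli n_i."""
    x, modulus = 0, 1
    for r, n in zip(residues, moduli):
        # Choose t so that x + modulus * t is congruent to r modulo n.
        t = ((r - x) * pow(modulus, -1, n)) % n
        x += modulus * t
        modulus *= n
    return x % modulus

def r_free_square(k):
    """Corner (m, n) with gcd(m + i, n + j) > 1 for all 1 <= i, j <= k."""
    q = first_primes(k * k)
    cell = lambda i, j: q[(i - 1) * k + (j - 1)]  # prime assigned to cell (i, j)
    row_moduli = [prod(cell(i, j) for j in range(1, k + 1)) for i in range(1, k + 1)]
    col_moduli = [prod(cell(i, j) for i in range(1, k + 1)) for j in range(1, k + 1)]
    m = crt([-i for i in range(1, k + 1)], row_moduli)
    n = crt([-j for j in range(1, k + 1)], col_moduli)
    return m, n

m, n = r_free_square(3)
print(m, n, [[gcd(m + i, n + j) for j in range(1, 4)] for i in range(1, 4)])
```

Both coordinates are bounded by the product of the first k² primes, which is what ties the side k of the square to the size of the box in the estimate that follows from the prime number theorem.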

7. The Structure of the Infinite Component

In this section, I prove Theorem 3.4. One begins by identifying some subsets of the infinite component.

Lemma 7.1. The following sets are in C∞:

(1) {(m, 1) : m > 0}.

(2) {(p, n) : p > n}, p prime.

(3) $\{(m, q) : q < m < (q/2)^{20/11},\ q \nmid m\}$, q prime.

Proof. The first two results were proved above. The third follows from the unconditional result of Heath-Brown and Iwaniec [34] which states that every interval of length $y^{11/20}$ contains a prime (this has subsequently been improved, see [48]). Thus, if $q \nmid m$, then since $q > 2m^{11/20}$, one of the intervals $[m, m + m^{11/20}]$ or $[m - m^{11/20}, m]$ has none of its elements divisible by q and the corresponding line segment lies completely in R. By the result of Heath-Brown and Iwaniec, this line segment will cross a line {(p, n)}, where p is prime, and by (2), this is in C∞, so (m, q) is also in C∞. □

One proceeds by proving an initial version of Theorem 3.4.

Lemma 7.2. All but O(R²/log R) pairs of B(R) are surrounded by a rectangle of perimeter $O((\log R)^7)$ all of whose edges are contained in C∞.

Proof. Consider the set

$$S_0 = \{(m,n)\in B(R) : \exists\, n',\ P_-(n') > R^{1/5},\ (\log R)^7 + n > n' > n,\ \text{minimal},\ (d,n') = 1 \text{ if } |d-m| < R^{1/13},\ \exists\, p \in [m - R^{1/13}, m + R^{1/13}]\}.$$

By Watt’s result that almost every interval of length $y^{1/14+\varepsilon}$ contains a prime,

$$E_{0,1} = \{(m,n)\in B(R) : [m - R^{1/13}, m + R^{1/13}] \text{ has no primes}\}$$

has density zero. In fact, his result shows that $|E_{0,1}| \ll R^2/(\log R)^2$, since substituting E = 3 in Theorem 1 of [48] already gives this estimate for intervals of length $R^{1/14}(\log R)^{22}$. Next, if

$$E_{0,2} = \{(m,n)\in B(R) : P_-(n') \le R^{1/5},\ \text{for all } (\log R)^7 + n > n' > n\},$$


then $|E_{0,2}| \ll R^2/\log R$. This bound follows by an argument similar to the proof of Proposition 5.1: Partition $[R^{1/2}, R]$ into intervals of the form [X, 2X] and in each of these intervals, let $U = (\log X)^7$, $D = V^{1/2}/2$, and for a fixed 1/10 > ε > 0, let $Z = D^{1/2-\varepsilon}$. The main term in Eq. (2) of Theorem 5.1 is then $\gg (\log X)^6$ while Theorem 5.2 gives

$$\frac{1}{X}\int_X^{2X} |R_y(D)|\,dy \ll (\log X)^5.$$

Therefore G defined as the set of y ∈ (X, 2X] for which $|R_y(D)| \gg (\log X)^6$ satisfies $|G| \ll X/\log X$. Finally, one considers

$$E_{0,3} = \{(m,n)\in B(R) : P_-(n') > R^{1/5},\ (\log R)^7 + n > n' > n,\ \text{minimal},\ \exists\, d\ |d-m| < R^{1/13},\ (d,n') > 1\}.$$

But if $P_-(n') > R^{1/5}$, then

$$|E_{0,3}| \ll R(\log R)^7 \sum_{\substack{P_-(n')>R^{1/5}\\ |d-m|<R^{1/13},\ (d,n')>1}} 1 \ll R^{1+1/13}(\log R)^7 \sum_{p\mid n'} \frac{R}{p} \ll R^{2+1/13-1/5}(\log R)^7 \ll \frac{R^2}{\log R}.$$

Since $B(R) - S_0 \subset E_{0,1} \cup E_{0,2} \cup E_{0,3}$, it follows that $|B(R) - S_0| \ll R^2/\log R$, and so $S_0$ represents almost all points of B(R). An examination of the definition of $S_0$ shows that for every (m, n) ∈ $S_0$ there is an unbroken horizontal line segment $L_{0,1}(m,n) \subset C_\infty$ of length $R^{1/13}$, centered at (m, n) and which passes above (m, n) within distance $O((\log R)^7)$, and which crosses a vertical line {(p, n)}, where p is prime. By Lemma 7.1 (2), this vertical line is in C∞, so it follows that $L_{0,1}(m,n)$ is also in C∞. Replacing the condition n′ > n with n′ < n yields the similar result with a line segment $L_{0,2}(m,n)$, passing below (m, n). To construct the corresponding vertical line segments $L_{0,3}(m,n)$, $L_{0,4}(m,n)$, one must include the condition that $m, n \gg R^{11/20}$ in order to apply part 3 of Lemma 7.1. The result then follows. □

The next iteration already contains most of the ideas of the general procedure and is included in order to give a self-contained proof of Theorem 3.3.

Lemma 7.3. All but $O(R^2/(\log\log R)^3)$ pairs of B(R) are surrounded by a rectangle of perimeter $O((\log\log R)^{36})$ all of whose edges are contained in C∞.

Proof. Consider the set

$$S_1 = \{(m,n)\in B(R) : \exists\, m'\ P_-(m') > (\log R)^{42},\ (\log\log R)^{36} + m > m' > m,\ \text{minimal},\ (m',d) = 1,\ |d-n| < (\log R)^{40}\}.$$

Now let

$$E_{1,1} = \{(m,n)\in B(R) : P_-(m') < (\log R)^{42},\ \text{for all } (\log\log R)^{36} + m > m' > m\}.$$


By Proposition 5.1, for almost all y, $[y, y + (\log\log R)^{36}]$ contains an integer x with $P_-(x) > \exp((\log\log R)^{36(1/5-\varepsilon)})$, and this is $> (\log R)^{42}$ for all sufficiently large R. The error term of Proposition 5.1 gives

$$|E_{1,1}| \ll \frac{R^2}{(\log\log R)^{36\varepsilon/2}} = \frac{R^2}{(\log\log R)^3},$$

by letting ε = 1/6. Next, one considers

$$E_{1,2} = \{(m,n)\in B(R) : P_-(m') > (\log R)^{42},\ (\log\log R)^{36} + m > m' > m,\ \text{minimal},\ \exists\, d\ |d-n| < (\log R)^{40},\ (m',d) > 1\}.$$

As before, one gets

$$|E_{1,2}| \ll R(\log\log R)^{36} \sum_{\substack{P_-(m')>(\log R)^{42}\\ |d-n|<(\log R)^{40},\ (m',d)>1}} 1 \ll R(\log\log R)^{36}(\log R)^{40} \sum_{p\mid m'} \frac{R}{p} \ll \frac{R^2(\log\log R)^{37}}{(\log R)^2} \ll \frac{R^2}{\log R},$$

for all sufficiently large R. This last estimate used the fact that if $P_-(m') > (\log R)^{42}$ and m′ < R, then m′ can have at most O(log R/log log R) prime factors. It follows that $S_1$ consists of almost all elements of B(R). The definition of $S_1$ then shows that for almost all (m, n) ∈ B(R), there is a line segment $L_{1,1}(m,n) \subset R$ of length at least $(\log R)^{40}$, centered at (m, n) and which passes by (m, n) at distance $O((\log\log R)^{36})$.

One now observes that $S_0 \cap S_1$ also comprises almost all elements of B(R). The definition of $S_0$ implies that for almost all (m, n), the line segment $L_{1,1}(m,n)$ intersects a rectangle of perimeter $O((\log R)^7)$ whose edges lie in C∞ (note that these extend in C∞ to length $R^{1/13}$), so $L_{1,1}(m,n)$ will also be contained in C∞. Clearly, one can similarly show that for almost all (m, n), there are line segments $L_{1,j} \subset R$, j = 2, 3, 4, of length $(\log R)^{40}$ and which pass within $(\log\log R)^{36}$ of (m, n) on all four sides, and by the same argument $L_{1,j} \subset C_\infty$, j = 2, 3, 4. □

An examination of the proof of Lemma 7.3 reveals that the obstacle in continuing this process is the term $\sum_{p\mid m'} 1/p$, which must be o(1). The trivial estimate

$$\sum_{p\mid m} \frac{1}{p} \ll \frac{\omega(m)}{P_-(m)} \tag{4}$$

was used and this will only allow one more iteration, if one first removes all m for which ω(m) > 2 log log R (as is well known, this set has zero density [44]). In fact, in this context, it is easy to improve substantially on (4), as was noted to me by G. Tenenbaum.


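Lemma 7.4. Let f(R) be any function increasing to infinity, then except for a set of m of size $O\bigl(R/\sqrt{f(R)\log f(R)}\bigr)$, one has the bound

$$\sum_{\substack{p\mid m\\ p\ge f(R)}} \frac{1}{p} \ll \frac{1}{\sqrt{f(R)\log f(R)}}.$$

Proof. One uses the simple estimate

$$\sum_{m\le R}\ \sum_{\substack{p\mid m\\ p\ge f(R)}} \frac{1}{p} = \sum_{p\ge f(R)} \frac{1}{p}\Bigl\lfloor\frac{R}{p}\Bigr\rfloor \le \sum_{p\ge f(R)} \frac{R}{p^2} \ll \frac{R}{f(R)\log f(R)}.$$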
This result reflects the fact that $\sum_{p\mid m} 1/p$ has a limiting density, as follows easily from the Erdős–Wintner Theorem [44]. □

This result allows one to iterate the above argument. Recall that rect(m, n) is the perimeter of the smallest rectangle surrounding (m, n) which has all its edges in C∞.

Lemma 7.5. Let $\log_1 z = \log z$ and $\log_{k+1} z = \log(\log_k z)$, and let $\exp_k z$ be the inverse of $\log_k z$, then for $R > \exp_{k+2}(10^{15})$,

$$|\{(m,n)\in B(R) : \mathrm{rect}(m,n) \le (\log_{k+1}R)^{36}\}| = |B(R)|\Bigl(1 - O\Bigl(\frac{1}{(\log_{k+1}R)^3}\Bigr)\Bigr), \tag{5}$$

where the O( ) term is independent of k.

Proof. One proves this by induction on k. The initial step k = 1 is exactly Lemma 7.3. Now assume that (5) holds for k, then I will show that it holds for k + 1. Thus assume that one has constructed for almost all (m, n), line segments $L_{k,j}(m,n)$, j = 1, 2, 3, 4, as above, but of length $(\log_k R)^{40}$, which pass within $(\log_{k+1}R)^{36}$, are centered at (m, n) and lie completely in C∞. In particular, one assumes that the set

$$S_k = \{(m,n)\in B(R) : \exists\, n'\ P_-(n') > (\log_k R)^{100},\ (\log_{k+1}R)^{36} + n > n' > n,\ \text{minimal},\ (d,n') = 1,\ |d-m| < (\log_k R)^{40},\ (m,n') \in C_\infty\}$$

is such that $|B(R) - S_k| \ll R^2/(\log_{k+1}R)^3$. One then considers

$$S_{k+1} = \{(m,n)\in B(R) : \exists\, m'\ P_-(m') > (\log_{k+1}R)^{100},\ (\log_{k+2}R)^{36} + m > m' > m,\ \text{minimal},\ (m',d) = 1,\ |d-n| < (\log_{k+1}R)^{40}\}.$$

Now let

$$E_{k+1,1} = \{(m,n)\in B(R) : P_-(m') < (\log_{k+1}R)^{100},\ \text{for all } (\log_{k+2}R)^{36} + m > m' > m\}.$$


By Proposition 5.1, for any ε > 0, except for $O(R/(\log_{k+2}R)^{36\varepsilon/2})$ values of y, the interval $[y, y + (\log_{k+2}R)^{36}]$ contains an integer x with $P_-(x) > \exp((\log_{k+2}R)^{36(1/5-\varepsilon)})$. Letting ε = 1/6, this says that except for $O(R/(\log_{k+2}R)^3)$ values of y, the interval $[y, y + (\log_{k+2}R)^{36}]$ contains an integer x with $P_-(x) > \exp((\log_{k+2}R)^{1+2/15})$. This last quantity is eventually $> (\log_{k+1}R)^{100}$, in particular, when $\log_{k+2}R > 100^{15/2}$, i.e., when $R > \exp_{k+2}(10^{15})$. Thus, one concludes that

$$|E_{k+1,1}| \ll \frac{R^2}{(\log_{k+2}R)^3}, \quad \text{for } R > \exp_{k+2}(10^{15}).$$

Next, one considers

$$E_{k+1,2} = \{(m,n)\in B(R) : P_-(m') > (\log_{k+1}R)^{100},\ (\log_{k+2}R)^{36} + m > m' > m,\ \text{minimal},\ \exists\, d,\ |d-n| < (\log_{k+1}R)^{40},\ (m',d) > 1\}.$$

As before, one gets

$$|E_{k+1,2}| \ll R(\log_{k+2}R)^{36} \sum_{\substack{P_-(m')>(\log_{k+1}R)^{100}\\ |d-n|<(\log_{k+1}R)^{40},\ (m',d)>1}} 1 \ll R(\log_{k+2}R)^{36}(\log_{k+1}R)^{40} \sum_{\substack{m\le R\\ P_-(m)>(\log_{k+1}R)^{100}}}\ \sum_{p\mid m} \frac{1}{p}.$$

Now let

$$E_{k+1,3} = \Bigl\{(m,n)\in B(R) : \sum_{\substack{p\mid m'\\ p>(\log_{k+1}R)^{100}}} \frac{1}{p} \ge \frac{1}{(\log_{k+1}R)^{50}\sqrt{\log_{k+2}R}}\ \text{for all } (\log_{k+2}R)^{36} + m > m' > m\Bigr\}.$$

Applying Lemma 7.4 gives

$$|E_{k+1,3}| \ll \frac{R^2(\log_{k+2}R)^{36}}{(\log_{k+1}R)^{50}\sqrt{\log_{k+2}R}},$$

so

$$|E_{k+1,2} - E_{k+1,3}| \ll \frac{R^2(\log_{k+2}R)^{36}(\log_{k+1}R)^{40}}{(\log_{k+1}R)^{50}\sqrt{\log_{k+2}R}} \ll \frac{R^2}{\log_{k+1}R},$$

whenever $(\log_{k+1}R)^9 > (\log_{k+2}R)^{36}$, for example, when $R > \exp_{k+2}(16)$. It follows that $S_{k+1}$ consists of almost all elements of B(R), where the exceptional set is $\ll R^2/(\log_{k+2}R)^3$. The definition of $S_{k+1}$ also shows that for almost all (m, n) ∈ B(R), there is a line segment $L_{k+1,1}(m,n) \subset R$ of length at least $(\log_{k+1}R)^{40}$, centered at (m, n) and which passes by (m, n) at distance $O((\log_{k+2}R)^{36})$.


One now observes that $S_k \cap S_{k+1}$ also comprises almost all elements of B(R) except for an exceptional set of size

$$\frac{R^2}{(\log_{k+1}R)^3} + \frac{R^2}{(\log_{k+2}R)^3} = (1 + A_k)\,\frac{R^2}{(\log_{k+2}R)^3},$$

for sufficiently large R, where

$$\prod_{j=1}^{k}(1 + A_j) \ll \prod_{j=1}^{k}\Bigl[1 + O\Bigl(\frac{1}{(\log_j R)^3}\Bigr)\Bigr] = O(1).$$

The definition of $S_{k+1}$ implies that for almost all (m, n), the line segment $L_{k+1,1}(m,n)$ of length $(\log_{k+1}R)^{40}$ will intersect the perimeter of the rectangle of perimeter $(\log_{k+1}R)^{36}$ constructed in the previous iteration (note that its sides extend to line segments in C∞ of length $(\log_k R)^{40}$). Since the sides of this rectangle lie in C∞, it follows that $L_{k+1,1}(m,n)$ will also be contained in C∞. Thus, the line segments defined in $S_{k+1}$ also lie in C∞. Clearly, one can similarly show that for almost all (m, n), there are line segments $L_{k+1,j} \subset R$, j = 2, 3, 4, of length $(\log_{k+1}R)^{40}$ and which pass within $(\log_{k+2}R)^{36}$ of (m, n) on all four sides, and $L_{k+1,j} \subset C_\infty$, j = 2, 3, 4. □

Proof of Theorem 3.4. Assume that there is a function f(R) increasing to infinity, and a fixed λ > 0 such that for at least $\lambda R_i^2$ pairs (m, n) one has $\mathrm{rect}(m,n) > f(R_i)$, where $R_i$ is a sequence increasing to infinity. One defines $\log^* z$ to be the minimum number of iterations of log required to be ≤ 2, i.e., $\log_{\log^* z} z \le 2$. Let $k = \log^* R - \log^* f(R) - 2$, then $\log_k R \ll \log f(R)$ for all sufficiently large R. One can now apply Lemma 7.5 since the bound

$$\exp_{k+2}(10^{15}) = \exp_{\log^* R - \log^* f(R) + 4}(10^{15}) \ll \log R = o(R)$$

holds. Furthermore, one has

$$\frac{1}{(\log_{k+1}R)^3} \ll \frac{1}{\log_4 f(R)} \to 0,$$

since $\log_k R \gg \log_3 f(R)$ and f(R) → ∞. Substituting this in (5) shows that for almost all (m, n) ∈ B(R) one has rect(m, n) ≤ f(R), which contradicts the above assumption. The result follows. □

Proof of Theorem 3.3. Assuming Theorem 3.2, then this follows directly from Theorem 3.4. For if the result were not true, then there would be a function f(R) increasing to infinity such that θ(R) < 1/f(R) for all sufficiently large R. However, Theorem 3.4 implies that for sufficiently large R, almost all points of B(R) are surrounded by a rectangle of perimeter $\sqrt{f(R)}$ all of whose edges lie in C∞. This implies that $\theta(R) \gg 1/\sqrt{f(R)}$, which is a contradiction. □
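The iterated logarithm $\log_k$ of Lemma 7.5 and the function $\log^*$ used in the proof of Theorem 3.4 are easy to compute. This is a small sketch with illustrative function names; it uses the natural logarithm and the threshold 2 from the definition above.

```python
from math import log

def iterated_log(k, z):
    """log_k z: apply the natural logarithm k times (log_1 z = log z)."""
    for _ in range(k):
        z = log(z)
    return z

def log_star(z):
    """Minimum number of iterations of log needed to bring z down to <= 2."""
    count = 0
    while z > 2:
        z = log(z)
        count += 1
    return count

for z in (3, 16, 10 ** 100):
    k = log_star(z)
    print(z, k, iterated_log(k, z))
```

Even for z = 10^100 only three iterations are needed, which illustrates how slowly the index k in the proof grows with R.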


8. The Infinite Component has a Density

As in Sect. 4, the idea is to compare the density of the infinite component in a big square

$$\theta(R) = \frac{|B(R)\cap C_\infty|}{|B(R)|}$$

with this density modulo a product of primes. Thus, let

$$\theta_h = \lim_{R\to\infty} \frac{|\{(m,n)\in B(R) : (m,n) \in \text{infinite component modulo } h\}|}{|B(R)|}.$$

The following gives a local characterization of the infinite component modulo h.

Lemma 8.1. z is in the infinite component modulo h if and only if z mod h is connected modulo h to all sides of B(h).

Proof. In fact, this holds for the fundamental domain $B^*(h) = \{0 \le m, n < h\}$ of $\mathbb{Z}^2$ mod h (so B(h) consists of 4 copies of $B^*(h)$). To see this, note that $B^*(h)$ has reflection symmetries generated by (m, n) ↦ (n, m) and (m, n) ↦ (m, −n) mod h which preserve relatively prime pairs mod h. Thus $B^*(h)$ consists of 8 triangles each of which is a reflection of its adjacent neighbor, see Fig. 5. It is clear that z is on the infinite component if and only if it is h-connected to all three sides of the triangle on which it lies and this is clearly equivalent to being h-connected to all four sides of $B^*(h)$. □

Lemma 8.2. If X < Y then $\theta_{P(X)} \ge \theta_{P(Y)}$.

Proof. If z ∈ B(P(Y)) is in the infinite component modulo P(Y) then z mod P(Y) is P(Y)-connected to three boundaries of B(P(Y)). Since reducing modulo P(X) does not remove any connections, it follows that z mod P(X) is P(X)-connected to the boundary of B(P(X)). □

Fig. 5. Reflection symmetries modulo h


One concludes that $\theta_{P(X)}$ is decreasing and therefore the limit $\theta_* = \lim_{X\to\infty} \theta_{P(X)}$ exists.

Lemma 8.3. If both X → ∞ and R/P(X) → ∞, then $\theta(R) \le \theta_{P(X)} + o(1)$.

Proof. This follows as above, by the argument in the proof of Lemma 4.3. Note that the boundary error in tiling B(R) with B(P(X)) is O(P(X)/R) = o(1) by assumption. □

One therefore has $\theta(R) \le \theta_* + o(1)$ and the main result follows from

Lemma 8.4. $\lim_{R\to\infty} \theta(R) = \theta_*$.

Proof. If $\theta_* = 0$, then Lemma 8.3 shows that θ = 0 as well and there is nothing to prove. On the other hand, if $\theta_* > 0$ consider a large R and by Lemma 8.2, find a large X such that $\theta_{P(X)}$ is very close to $\theta_*$, with P(X) < R and R/P(X) large. Since the prime number theorem says that $P(X) = e^{(1+o(1))X}$, one can choose X such that $R^{1/2}/4 < P(X) < R^{1/2}$. It follows that X is of order log R, i.e., there are two constants A, B, such that A log R < X < B log R.

Now going from R modulo P(X) to R in B(R) removes at most $|B(R)|\sum_{d>X} 1/d^2 \ll |B(R)|/X$ elements. So, by the above estimate, one is removing O(R²/log R) sites. By Lemma 7.3, given 0 < ε < 1/2, one can surround almost all sites with a C∞ rectangle of perimeter $< (\log R)^{1/2-\varepsilon}$. Thus apart from a set of zero density, each individual removal can disconnect at most $(\log R)^{1-2\varepsilon}$ sites from the connected component (note that none of the elements on the rectangles specified by Lemma 7.3 are removed, since these belong to C∞ of R). It follows that at most $O(R^2(\log R)^{1-2\varepsilon}/\log R)$ sites are disconnected, a vanishingly small percentage of the infinite connected component modulo P(X). This implies that θ(R) is asymptotically close to $\theta_{P(X)}$. □

Acknowledgement. I would like to thank Cécile Dartyge for checking the method of Sect. 7 and Gérald Tenenbaum for helpful comments and a crucial observation (Lemma 7.4).

References

1. Aizenman, M., Kesten, H., Newman, C.M.: Uniqueness of the infinite cluster and continuity of connectivity functions for short and long range percolation. Commun. Math. Phys. 111, 505–531 (1987)
2. Ash, J.M., Wang, G.: A survey of uniqueness questions in multiple trigonometric series. In: Harper, L.H. (editor), Harmonic Analysis and nonlinear differential equations. American Math. Soc. Contemp. Math. 208, 1997, pp. 35–71
3. Baker, R.C.: The square–free divisor problem. Quart. J. Math. Oxford 45, 269–277 (1994)
4. van den Berg, J., Ermakov, A.: A new lower bound for the critical probability of site percolation on the square lattice. Random Structures and Algorithms 8, 199–212 (1996)
5. Bleher, P.M., Cheng, Z., Dyson, F.J., Lebowitz, J.L.: Distribution of the error term for the number of lattice points inside a shifted circle. Commun. Math. Phys. 154, 433–469 (1993)
6. Bombieri, E.: Le grand crible dans la théorie analytique des nombres. 2nd edition, Astérisque 18, 1987
7. Bourgain, J.: Spherical summation and uniqueness of multiple trigonometric series. Int. Math. Research Notices 3, 93–107 (1996)
8. Bousquet-Mélou, M.: Percolation models and animals. European J. Combin. 17, 343–369 (1996)
9. Bousquet-Mélou, M., Eriksson, K.: Lecture hall partitions. Ramanujan J. 1, 101–111 (1997)
10. Broadbent, S.R., Hammersley, J.M.: Percolation processes. Proc. Cambridge Phil. Soc. 53, 629–641 (1957)
11. Brydges, D.C., Spencer, T.: Self–avoiding walk in 5 or more dimensions. Commun. Math. Phys. 97, 125–148 (1985)
12. Coleman, M.D.: The Rosser–Iwaniec sieve in number fields, with an application. Acta Arithmetica 66, 53–83 (1993)
13. Cramér, H.: On the order of magnitude of the difference between consecutive prime numbers. Acta Arithmetica 2, 23–46 (1937)
14. Efros, A.L.: Physics and Geometry of Disorder, Percolation Theory. Moscow: Mir, 1986
15. Erdős, P., Gruber, P.M., Hammer, J.: Lattice Points. New York: Longman, Wiley, 1989
16. Estermann, T.: On certain functions represented by Dirichlet series. Proc. London Math. Soc. 27, 435–448 (1928)
17. Feller, W.: An Introduction to Probability Theory and its Applications. Vol. I, New York: Wiley, 1964
18. Flajolet, P., Vardi, I.: Zeta expansions of classical constants. Preprint (1996)
19. Fouvry, E., Iwaniec, H.: Gaussian primes. Acta Arithmetica 79, 249–287 (1997)
20. Friedlander, J.B.: Sifting short intervals. Math. Proc. Camb. Phil. Soc. 91, 9–15 (1982)
21. Friedlander, J.B.: Sifting short intervals II. Math. Proc. Camb. Phil. Soc. 92, 381–384 (1982)
22. Friedlander, J.B., Iwaniec, H.: Using a parity-sensitive sieve to count prime values of a polynomial. Proc. Nat. Acad. Sci. U.S.A. 94, 1054–1058 (1997)
23. de Gennes, P.G.: La percolation: Un concept unificateur. La Recherche 7, 919–927 (1976)
24. Gethner, E., Wagon, S., Wick, B.: A stroll through the Gaussian primes. American Math. Monthly 105, 327–337 (1998)
25. Granville, A.: Unexpected irregularities in the distribution of prime numbers. Proceedings of the International Congress of Mathematicians, (Zurich, 1994). Basel: Birkhäuser, 1995, pp. 388–399
26. Grimmett, G.: Percolation. New York: Springer-Verlag, 1989
27. Hafner, J.L., Sarnak, P., McCurley, K.: Relatively prime values of polynomials. In: Knopp, M., Sheingorn, M. (eds), A tribute to Emil Grosswald: Number theory and related analysis. Contemp. Math. 143, Amer. Math. Soc., Providence, RI: 1993, pp. 437–443
28. Halberstam, H., Richert, H.–E.: Sieve Methods. New York: Academic Press, 1974
29. Hara, T., Slade, G.: Mean-field critical behavior for percolation in high dimensions. Commun. Math. Phys. 128, 339–391 (1990)
30. Hardy, G.H., Littlewood, J.E.: Some problems of ‘Partitio Numerorum’ III: On the expression of a number as a sum of primes. Acta Math. 44, 1–70, also In: G.H. Hardy: Coll. Papers, vol. 1, 1922, pp. 561–630
31. Harman, G.: Almost primes in short intervals. Math. Ann. 258, 107–112 (1981)
32. Heath-Brown, D.R.: Gaps between primes and the pair correlation of zeros of the zeta–function. Acta Arithmetica 41, 85–99 (1982)
33. Heath-Brown, D.R.: The distribution and moments of the error term in the Dirichlet divisor problems. Acta Arithmetica 60, 389–415 (1992)
34. Heath-Brown, D.R., Iwaniec, H.: On the difference between consecutive primes. Invent. Math. 55, 49–69 (1979)
35. Huxley, M.N., Nowak, W.G.: Primitive lattice points in convex planar domains. Acta Arithmetica 76, 271–283 (1996)
36. Iwaniec, H.: Rosser’s sieve. Acta Arithmetica 36, 171–202 (1980)
37. Iwaniec, H., Mozzochi, C.J.: On the divisor and circle problems. J. Number Theory 29, 60–93 (1988)
38. Lalley, S.: Percolation on Fuchsian groups. Ann. Inst. H. Poincaré Probab. Statist. 34, 151–177 (1998)
39. Lubotzky, A., Phillips, R., Sarnak, P.: Ramanujan graphs. Combinatorica 8, 261–277 (1988)
40. Montgomery, H.L.: Fluctuations in the mean of Euler’s phi function. Proc. Indian Acad. Sci. Math. Sci. 97, 239–245 (1987)
41. Moroz, B.Z.: On the number of primitive lattice points in plane domains. Monatschrift Math. 99, 37–43 (1985)
42. Sarnak, P., Rubinstein, M.: Chebyshev’s bias. Experimental Math. 3, 173–197 (1994)
43. Selberg, A.: On the normal density of primes in short intervals, and the difference between consecutive primes. Arch. Math. Naturvid. 47, 87–105 (1943)
44. Tenenbaum, G.: Introduction to analytic and probabilistic number theory. Cambridge: Cambridge University Press, 1995
45. Turán, P.: On a theorem of Hardy and Ramanujan. J. London Math. Soc. 9, 274–276 (1934)
46. Vardi, I.: Computational Recreations in Mathematica. Reading, MA: Addison–Wesley, 1991
47. Vardi, I.: Prime Percolation. Experimental Mathematics 7, 275–289 (1998)
48. Watt, N.: Short intervals almost all containing primes. Acta Arithmetica 72, 131–167 (1995)
49. Wierman, J.C.: Substitution method critical probability bounds for the square lattice site percolation model. Combin. Probab. Comput. 4, 181–188 (1995)
50. Zeilberger, D.: The Abstract Lace Expansion. Adv. Appl. Math. 19, 355–359 (1997)
51. Ziff, R.M., Sapoval, B.: The efficient determination of the percolation threshold by a frontier–generating walk in a gradient. J. Phys A: Math. Gen. 19, L1169–L1172 (1986)

Communicated by P. Sarnak

Commun. Math. Phys. 207, 67 – 80 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Billiards in Tubular Neighborhoods of Manifolds of Codimension 1

Roberto Peirone

Università di Roma “Tor Vergata”, Dipartimento di Matematica, via della Ricerca Scientifica, 00133, Roma, Italy. E-mail: peirone@mat.uniroma2.it

Received: 19 June 1998 / Accepted: 7 April 1999

Abstract: I prove that in (sufficiently small) tubular $\rho$-neighborhoods of a given $C^2$ manifold of codimension 1, any two points can be connected by a billiard trajectory, and that in addition there exists such a trajectory having at most $H\rho^{-1/2}$ collision points, for a suitable $H > 0$, provided the manifold is of class $C^3$.

0. Introduction

In this paper I deal with billiard trajectories, i.e., the motion, in a region having smooth boundary, of a point which, when it reaches the boundary, moves according to the rule: incidence angle equal to reflection angle. There is an extensive literature on billiards (see for example [2] and references therein). Probably the most investigated problem on billiards is that of the ergodicity of the related dynamical system (a good reference for ergodicity is [7]). Another typical problem is that of the existence of periodic trajectories (see for example [1] and [3]). Other references on billiards are [6] and [11]. In this paper I consider a problem which has not, to my knowledge, been much investigated, namely that of the existence of a trajectory connecting two given points. Note that this question has a trivial positive answer if the region is convex, but an example due to L. Penrose and R. Penrose ([9]), based on an ellipse, shows that there exist a region and two points of it which are connected by no billiard trajectory in that region (see Fig. 1). If we permit the points to lie on the boundary of the region (not necessarily in the interior), it is sufficient to consider a region between two circles $C_1$ and $C_2$ such that $C_2$ lies in the interior of $C_1$ and contains the centre $A$ of $C_1$, for, clearly, $A$ cannot be connected to many points of the region. In [10] a strengthened version of the result of [9] for regions having non-smooth boundary is discussed.
It appears to be very difficult to give a characterization of the regions for which any two points can be connected by a billiard trajectory; thus here I consider the simpler problem of giving a not too special and yet nontrivial class of regions having this property. Namely, let $M$ be a compact and connected $C^2$ manifold of codimension 1 in $\mathbb{R}^n$. Then for sufficiently small positive $\rho$ (see Prop. 1.2), the region $M_\rho$, which is


Fig. 1. The arc between P and Q is an arc of the ellipse whose foci are F1 and F2 . By the properties of an ellipse, no billiard trajectories connect points in A to points in B or in C

Fig. 2. Trajectory connecting two given points

the set of the points having distance from $M$ smaller than $\rho$, i.e., a tubular neighborhood of $M$, has the above property (see for example [5], Ch. 4 for the notion of tubular neighborhood). Note that in some sense this result generalizes the trivial corresponding result for annuli. This is the main result of this paper and is proved in Sect. 2. I prove it using a “variational approach”, i.e., I consider the polygonal of minimum length among those having the two given points as ends and touching the boundary at $N$ points, each point lying on the opposite part of the boundary with respect to the previous one. The nontrivial point is to prove that this polygonal, for $N$ sufficiently large and of suitable parity, remains in the region. This polygonal is characterized by the fact that its direction is "almost orthogonal" to the boundary (see Fig. 2). By using similar considerations it is possible to prove the existence of nontrivial periodic trajectories in the two-dimensional case (see Remark 2.5). The variational approach, in a simpler form, is usual in convex regions. In Sect. 3, by supposing that $M$ is of class $C^3$, I get a bound for the number of collision points (which are the points at which the trajectory touches the boundary), namely I show that the number of collision points can be chosen to be at most of the order of $\rho^{-1/2}$ for $\rho \to 0$ (Theorem 3.5). Note that this result is trivial in the case of an annulus. The results of Sect. 2 were stated without proof in [8], for the two-dimensional case and regions of class $C^4$. It could be interesting to try to extend the results of this paper to the case in which $M$ may have codimension greater than 1.
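For orientation, the order $\rho^{-1/2}$ can be read off the annulus case by an elementary computation. The following worked sketch is not taken from the paper's argument; the symbols $R$ (radius of the circle $M$), $\Theta$ (angle at the centre between the two given points) and $\theta_{\max}$ are illustrative assumptions.

```latex
% Assumption: M is the circle of radius R in the plane, so that M_rho is the
% annulus R - rho < |x| < R + rho.
% A segment from a point P on the inner circle to a point Q on the outer
% circle, seen from the centre under the angle theta, avoids the inner disc
% exactly when (Q - P) . P >= 0, that is, when
\[
  \cos\theta \;\ge\; \frac{R-\rho}{R+\rho},
  \qquad\text{so}\qquad
  1-\cos\theta_{\max} \;=\; \frac{2\rho}{R+\rho},
  \qquad
  \theta_{\max} \;\approx\; 2\sqrt{\rho/R}\quad(\rho\to 0).
\]
% Any segment contained in the closed annulus subtends at the centre an angle
% of at most 2*theta_max (the extremal case is a chord of the outer circle
% tangent to the inner one). A trajectory with N collision points has N + 1
% segments, hence to join two points at angular distance Theta one needs
\[
  N + 1 \;\ge\; \frac{\Theta}{2\,\theta_{\max}} \;\approx\; \frac{\Theta}{4}\sqrt{\frac{R}{\rho}} .
\]
```

So in the annulus the number of collision points cannot in general be of smaller order than $\rho^{-1/2}$, which is the order obtained in Theorem 3.5.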


1. Notation

If $a, b \in \mathbb{R}^n$ we denote by $a \cdot b$ the scalar product of $a$ and $b$. If $a \in \mathbb{R}^n$, we denote by $\hat a$ the vector of $\mathbb{R}^{n-1}$ given by $(a_1, \dots, a_{n-1})$. We denote by $\eta$ the euclidean distance in $\mathbb{R}^n$ and by $\|\cdot\|$ the euclidean norm in $\mathbb{R}^n$. As usual, we put

$$\eta(x, A) = \inf_{y \in A} \eta(x, y)$$

when $x \in \mathbb{R}^n$, $\emptyset \ne A \subseteq \mathbb{R}^n$. Also, when $\emptyset \ne A \subseteq \mathbb{R}^n$, let $A_\rho = \{x \in \mathbb{R}^n : \eta(x, A) < \rho\}$ for every $\rho > 0$. Given $a, b \in \mathbb{R}^n$, $[a, b]$ (resp. $]a, b[$) will denote the closed (resp. open) segment line with ends $a$ and $b$, i.e., the set $\{ta + (1-t)b : t \in [0, 1]\}$ (resp. $\{ta + (1-t)b : t \in\, ]0, 1[\}$). If $a_0, \dots, a_m \in \mathbb{R}^n$ ($m \ge 1$), we put

$$[a_0, \dots, a_m] = \bigcup_{i=0}^{m-1} [a_i, a_{i+1}].$$

If $f$ is a differentiable map from an open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ and $x \in U$, we denote by $(df)_x(u)$ the derivative of $f$ at $x$ calculated at $u$. In the following, manifold will always mean submanifold of $\mathbb{R}^n$ without boundary, with $n \ge 2$. If $M$ is a manifold of class $C^1$ and $x \in M$, we denote by $T_M(x)$ the tangent space to $M$ at $x$, i.e.,

$$T_M(x) = \Big\{ z \in \mathbb{R}^n : \frac{\eta(x + tz, M)}{t} \xrightarrow[t \to 0]{} 0 \Big\},$$

and put $N_M(x) = T_M(x)^\perp$, i.e., $N_M(x)$ is the normal space to $M$ at $x$. The following lemma is well known.

Lemma 1.1. If $M$ is a $C^1$-manifold, $z \in \mathbb{R}^n$, $x \in M$, and $\eta(z, x) = \eta(z, M)$, then $z - x \in N_M(x)$.

For the following $M$ will denote a compact and connected $(n-1)$-manifold of class $C^s$ with $s \ge 1$. We consider the geodesic distance $d_M$ on $M$, defined by

$$d_M(x, x') = \inf \int_0^1 \|\gamma'(t)\|\, dt,$$

where the inf is taken over all Lipschitzian curves $\gamma : [0, 1] \to M$ such that $\gamma(0) = x$, $\gamma(1) = x'$, and let

$$L = \sup_{z, z' \in M} d_M(z, z').$$

It is known that $\mathbb{R}^n \setminus M$ has exactly two connected components, one bounded, which we denote by $i(M)$, and the other unbounded, which we denote by $e(M)$, and also that there exists a continuous function $\nu_M : M \to \mathbb{R}^n$ such that $\nu_M(x)$ is the outer unit normal vector to $M$ at $x$; we write simply $\nu$ instead of $\nu_M$ when the manifold is clear from the context. I believe that the following proposition is known. However, we sketch a proof of it; (i) and part of (iii) can be found in [4], Appendix. It regards, in substance, the existence of a tubular neighborhood of $M$. For our convenience, we here consider a standard simple local representation of $M$, namely, given $x \in M$, there exist an isometry $\Lambda_x$ of $\mathbb{R}^n$, and $U_x$ open in $\mathbb{R}^n$ and $V_x$ open in $\mathbb{R}^{n-1}$ containing 0, and $\varphi_x : V_x \to \mathbb{R}$ of class $C^s$, such that $x \in U_x$, $\Lambda_x(x) = 0$, and

$$\Lambda_x(U_x \cap M) = \{x \in \mathbb{R}^n : \hat x \in V_x,\ x_n = \varphi_x(\hat x)\}, \qquad \varphi_x(0) = 0, \qquad \frac{\partial \varphi_x}{\partial x_i}(0) = 0,$$

$$\nu_{\Lambda_x(M)}(x) = \frac{\Big(-\frac{\partial \varphi_x}{\partial x_1}(\hat x), \dots, -\frac{\partial \varphi_x}{\partial x_{n-1}}(\hat x), 1\Big)}{\Big\|\Big(-\frac{\partial \varphi_x}{\partial x_1}(\hat x), \dots, -\frac{\partial \varphi_x}{\partial x_{n-1}}(\hat x), 1\Big)\Big\|} \quad \forall\, x \in \Lambda_x(U_x \cap M),$$

in particular, $\nu_{\Lambda_x(M)}(0) = e_n$ $(= (0, \dots, 0, 1))$.

Proposition 1.2. If $M$ is of class $C^s$, $s \ge 2$, then there exists $\bar\rho > 0$ such that

(i) For every $z \in M_{\bar\rho}$ there exists a unique $p_1(z) \in M$ such that $z - p_1(z) \in N_M(p_1(z))$ and $\|z - p_1(z)\| < \bar\rho$; also $p_1(z)$ is the unique point of $M$ of minimum distance from $z$.

(ii) $p_1$ has class $C^{s-1}$, and we have $(dp_1)_z(v) = v - (v \cdot \nu_M(z))\nu_M(z)$ for all $z \in M$.


(iii) If $\phi : \mathbb{R}^n \to \mathbb{R}$ is defined by

$$\phi(z) = \begin{cases} \eta(z, M) & \text{if } z \in e(M), \\ -\eta(z, M) & \text{if } z \notin e(M), \end{cases}$$

then $\phi$ is continuous on $\mathbb{R}^n$ and of class $C^s$ on $M_{\bar\rho}$, and we have $(d\phi)_z(v) = \nu_M(p_1(z)) \cdot v$ for every $z \in M_{\bar\rho}$.

(iv) If $|t| < \bar\rho$, then $\phi^{-1}(t)$ is a (nonempty) compact and connected $C^s$ manifold of dimension $n - 1$.

(v) Given $t, t'$ with $|t| < \bar\rho$, $|t'| < \bar\rho$ and $z \in \phi^{-1}(t)$, then $z' = p_1(z) + t'\nu_M(p_1(z))$ is the unique point of $\phi^{-1}(t')$ of minimum distance from $z$ and $\|z - z'\| = |t - t'|$. Also, $z - p_1(z) \in N_{\phi^{-1}(t)}(z)$.

(vi) $\|z - z'\| \ge |\phi(z) - \phi(z')|$ $\forall\, z, z' \in M_{\bar\rho}$.

(vii) If $x_1, x_2 \in M$, $y \in N_M(x_1)$, then

$$|(x_2 - x_1) \cdot y| \le \frac{\|x_2 - x_1\|^2\, \|y\|}{2\bar\rho}.$$

Proof. Given $x \in M$, consider a local representation of $M$ at $x$ as above, and for $\rho > 0$ consider $\psi_x : V_x \times\, ]-\rho, \rho[\, \to \mathbb{R}^n$ defined by

$$\psi_x(x) = (\hat x, \varphi_x(\hat x)) + x_n\, \nu_{\Lambda_x(M)}(\hat x, \varphi_x(\hat x)).$$

A simple calculation yields

$$(d\psi_x)_0 = \mathrm{Id}, \tag{1.1}$$

thus, by possibly restricting $V_x$, we can and do assume that there exists $\rho_x > 0$ such that $\psi_x : V_x \times\, ]-\rho_x, \rho_x[\, \to \mathbb{R}^n$ is a $C^{s-1}$ diffeomorphism, in particular one-to-one, onto its image. By a compactness argument, there exists $\bar\rho > 0$ such that the map $\psi : M \times\, ]-\bar\rho, \bar\rho[\, \to \mathbb{R}^n$ defined by $\psi(x, t) = x + t\nu_M(x)$ is a one-to-one map whose image is an open neighborhood of $M$, which is, in view of Lemma 1.1, exactly $M_{\bar\rho}$. Thus, (i) easily follows, in view of Lemma 1.1 again, and by (1.1) we also get (ii). Since $\eta(z, M) = \|z - p_1(z)\|$, (iii) is a simple consequence of (ii). To prove (iv), observe that $\phi^{-1}(t) = \{x + t\nu(x) : x \in M\} = C_t(M)$, where $C_t(x) = x + t\nu(x)$, so $\phi^{-1}(t)$ is compact and connected, and is also a $C^s$ manifold, because, by (ii), $d\phi \ne 0$ there. (v) is a simple consequence of Lemma 1.1, and (vi) follows from (v). For (vii), suppose $y \ne 0$. Then we have $\big\|x_1 \pm \bar\rho \frac{y}{\|y\|} - x_2\big\| \ge \bar\rho$, thus

$$\|x_1 - x_2\|^2 + \bar\rho^2 \pm 2\bar\rho\, (x_2 - x_1) \cdot \frac{y}{\|y\|} \ge \bar\rho^2,$$

and (vii) follows at once. □

We remark that we can express any $z \in M_{\bar\rho}$ as $z = p_1(z) + p_2(z)$, where $p_2(z) = z - p_1(z) \in N_M(p_1(z))$, and in fact we have $p_2(z) = \phi(z)\nu_M(p_1(z))$, in other words $z = p_1(z) + \phi(z)\nu_M(p_1(z))$ for all $z \in M_{\bar\rho}$. Note also that when $0 < \rho < \bar\rho$ we have $\overline{M_\rho} = \{z \in \mathbb{R}^n : \eta(z, M) \le \rho\}$. We now study the billiard trajectories.
If $U$ is a connected open set in $\mathbb{R}^n$ whose boundary is a $C^1$-manifold of dimension $n - 1$, and given $a, b, c \in \mathbb{R}^n$ with $a \ne b \ne c$, we say that $[a, b, c]$ is a billiard trajectory in $U$ if $b \in \partial U$, $]a, b[\, \cup\, ]b, c[\, \subseteq U$, and the half-line with origin at $b$ and passing through $a$ and the half-line with origin at $b$ and passing through $c$ are symmetric with respect to the normal to $\partial U$ at $b$, in formula

$$\frac{a - b}{\|a - b\|} \cdot z = \frac{c - b}{\|c - b\|} \cdot z \quad \forall z \in N_{\partial U}(b), \qquad \frac{a - b}{\|a - b\|} \cdot z = -\frac{c - b}{\|c - b\|} \cdot z \quad \forall z \in T_{\partial U}(b),$$

or, equivalently,

$$c - b = \frac{\|c - b\|}{\|a - b\|} \big(2\, (n \cdot (a - b))\, n - (a - b)\big),$$

where $n$ is any unit normal vector to $\partial U$ at $b$. We say that $[a_0, \dots, a_m]$, $m \ge 2$, is a billiard trajectory in $U$ if $[a_i, a_{i+1}, a_{i+2}]$ is a billiard trajectory in $U$ for $i = 0, \dots, m - 2$. In such a situation we say that $a_1, \dots, a_{m-1}$ are the collision points (or c.p.) of the trajectory, and that the trajectory connects $a_0$ and $a_m$. The following proposition gives a useful characterization of billiard trajectories, and is well known.
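The reflection rule in the definition above can be checked numerically. The following is a minimal sketch, not part of the paper: the function name `reflect` and the sample points are illustrative assumptions. It evaluates $2(n \cdot (a - b))n - (a - b)$, which by the formula above is the direction of $c - b$ up to the positive factor $\|c - b\| / \|a - b\|$.

```python
import math


def reflect(a, b, n):
    """Return the direction of the outgoing ray at the collision point b.

    a: previous point of the trajectory, b: collision point on the boundary,
    n: a (not necessarily unit) normal vector to the boundary at b.
    The result is 2 (n . (a - b)) n - (a - b) with n normalised, i.e. the
    vector a - b with its normal part kept and its tangential part reversed.
    """
    length_n = math.sqrt(sum(x * x for x in n))
    unit_n = [x / length_n for x in n]
    w = [ai - bi for ai, bi in zip(a, b)]            # w = a - b
    normal_part = sum(ni * wi for ni, wi in zip(unit_n, w))
    return [2.0 * normal_part * ni - wi for ni, wi in zip(unit_n, w)]


# Illustrative use: a ray coming from (-1, 1) hits the line y = 0 at the origin.
outgoing = reflect((-1.0, 1.0), (0.0, 0.0), (0.0, 2.0))
```

In this example the normal component of `outgoing` equals that of $a - b$ and the tangential component has the opposite sign, as required by the two displayed conditions.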

Proposition 1.3. Let $a, b, c \in \mathbb{R}^n$ with $a \ne b \ne c$, and let $U$ be a nonempty open set in $\mathbb{R}^n$ whose boundary is a $C^1$ manifold of dimension $n - 1$. Suppose $]a, b[\, \cup\, ]b, c[\, \subseteq U$, $b \in \partial U$. Then

(i) $[a, b, c]$ is a billiard trajectory in $U$ if and only if the function $S : \partial U \to \mathbb{R}$ defined by $S(x) = \|a - x\| + \|x - c\|$ is stationary at $b$, in the sense that (for example)

$$\frac{S(x) - S(b)}{\|x - b\|} \xrightarrow[x \to b]{} 0.$$

(ii) If there exists $V$ open in $\partial U$ containing $b$ such that $S(x) \ge S(b)$ for all $x \in V$, then $[a, b, c]$ is a billiard trajectory.

From now on, we assume that $M$ is of class $C^2$ (at least).

2. Trajectories Connecting Two Given Points

Fix $\rho$ with $0 < \rho < \bar\rho$. In this section we prove the main result of this paper, i.e., that any two points in $M_\rho$ can be connected by a billiard trajectory in $M_\rho$ having $N$ collision points, provided $N$ is sufficiently large and of appropriate parity, the parity depending on the two points (Theorem 2.4). We use the following notation. Let $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$. Then put

$$\mathcal{P}(\zeta^{(1)}, \zeta^{(2)}) = \begin{cases} 2\mathbb{N} & \text{if } \phi(\zeta^{(1)})\phi(\zeta^{(2)}) < 0, \\ 2\mathbb{N} + 1 & \text{if } \phi(\zeta^{(1)})\phi(\zeta^{(2)}) > 0, \\ \mathbb{N} & \text{if } \phi(\zeta^{(1)})\phi(\zeta^{(2)}) = 0, \end{cases}$$

$$P_N(\zeta^{(1)}, \zeta^{(2)}) = \{z \in (\mathbb{R}^n)^{N+2} : z_0 = \zeta^{(1)},\ z_{N+1} = \zeta^{(2)},\ \phi(z_i)\phi(z_{i+1}) \le 0\ \forall\, i = 0, \dots, N,\ |\phi(z_i)| = \rho\ \forall\, i = 1, \dots, N\}.$$

Here, and in the following, we indicate a point $z$ of $(\mathbb{R}^n)^{N+2}$ as $(z_0, \dots, z_{N+1})$. When convenient, we will identify $z$ with the polygonal $[z_0, \dots, z_{N+1}]$. Note that $P_N(\zeta^{(1)}, \zeta^{(2)}) \ne \emptyset$ provided $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$ (and this is the motivation for introducing $\mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$). The idea is now to minimize the length of a polygonal in $P_N(\zeta^{(1)}, \zeta^{(2)})$ when $N$ is a fixed, sufficiently large element of $\mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$. Thus we are led to define the length $\Lambda_N$ of $z$ as

$$\Lambda_N(z) = \sum_{i=0}^{N} \|z_{i+1} - z_i\|;$$

also put $\Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) = |\phi(\zeta^{(1)})| + |\phi(\zeta^{(2)})| + 2N\rho$. Observe that we have

$$\Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) = \sum_{i=0}^{N} |\phi(z_{i+1}) - \phi(z_i)|$$

for every $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$, and by Prop. 1.2 (vi), $\Lambda_N(z) \ge \Lambda_N^0(\zeta^{(1)}, \zeta^{(2)})$ for all $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$. On the other hand, the geometry of a tubular domain suggests that, by choosing $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$ in such a way that $\|p_1(z_{i+1}) - p_1(z_i)\|$ is small, for every $\varepsilon > 0$ the length of the polygonal can be smaller than $\Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) + \varepsilon$ (for sufficiently large $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$). Thus, for this $N$, the minimizing polygonal remains in $M_\rho$ (in the contrary case its length is bigger than $\Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) + \sigma$ for some


$\sigma > 0$, independent of $N$). To realize this idea we need three (simple) lemmas. For our convenience, we put

$$\mathcal{H} = \{(\zeta^{(1)}, \zeta^{(2)}) : \zeta^{(1)}, \zeta^{(2)} \in \overline{M_\rho},\ |\phi(\zeta^{(1)})| = \rho,\ \phi(\zeta^{(1)})\phi(\zeta^{(2)}) \le 0\}$$

and observe that if $N \ge 1$ and $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$, then for every $i = 0, \dots, N$, either $(z_i, z_{i+1})$ or $(z_{i+1}, z_i)$ is in $\mathcal{H}$.

Lemma 2.1. There exists $\sigma > 0$ such that if $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$ and $\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\| < \sigma$, then $]\zeta^{(1)}, \zeta^{(2)}[\, \subseteq M_\rho$.

Proof. In this proof we always assume $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$. By Prop. 1.2 (iii), putting $\alpha(t)\, (= \alpha_{\zeta^{(1)}, \zeta^{(2)}}(t)) = |\phi(t\zeta^{(1)} + (1 - t)\zeta^{(2)})|$, we see that, if $p_1(\zeta^{(1)}) = p_1(\zeta^{(2)})$ and $\phi(\zeta^{(2)}) \ne 0$, then $\alpha'(0) < 0$, $\alpha'(1) > 0$. So, by compactness, there exist $\beta \in\, ]0, \frac12[$ and $\sigma' > 0$ such that if $\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\| < \sigma'$, then $\alpha(t) < \rho$ for every $t \in\, ]0, \beta[\, \cup\, ]1 - \beta, 1[$ (for $t \in\, ]0, \beta[$ we see this directly if $|\phi(\zeta^{(2)})| \ge \theta\rho$ with $0 < \theta < 1$, for example $\theta = \frac12$, but this is also obvious when $|\phi(\zeta^{(2)})| < \frac{\rho}{2}$). On the other hand, if $\beta \le t \le 1 - \beta$ and $p_1(\zeta^{(1)}) = p_1(\zeta^{(2)})$, then we have $|\alpha(t)| \le (1 - \beta)\rho$, thus there exists $\sigma'' > 0$ such that if $\beta \le t \le 1 - \beta$ and $\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\| < \sigma''$, then $\alpha(t) < \rho$. We conclude by taking $\sigma = \min\{\sigma', \sigma''\}$. □

Lemma 2.2. For every $\varepsilon > 0$ there exists $\lambda(\varepsilon) > 0$ such that, if $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$ and $\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\| \ge \varepsilon$, then $\|\zeta^{(1)} - \zeta^{(2)}\| \ge |\phi(\zeta^{(1)}) - \phi(\zeta^{(2)})| + \lambda(\varepsilon)$.

Proof. This easily follows from Prop. 1.2 (vi) and a compactness argument. □

Lemma 2.3. There exists $\alpha : [0, +\infty[\, \to [0, +\infty[$ with $\alpha(0) = 0$, $\alpha$ continuous and increasing, such that if $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$, then $\|\zeta^{(1)} - \zeta^{(2)}\| \le |\phi(\zeta^{(1)}) - \phi(\zeta^{(2)})| + \|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\|\, \alpha(\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\|)$, more precisely

$$\|\zeta^{(1)} - \zeta^{(2)}\| \le |\phi(\zeta^{(1)}) - \phi(\zeta^{(2)})| + \sqrt{\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\|^2 \Big(1 + \frac{\rho}{\bar\rho}\Big) + \rho^2} - \rho.$$

Proof. Let $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$. Then

$$\|\zeta^{(1)} - \zeta^{(2)}\| - |\phi(\zeta^{(1)}) - \phi(\zeta^{(2)})| \le \|\zeta^{(1)} - p_1(\zeta^{(2)})\| + \|\zeta^{(2)} - p_1(\zeta^{(2)})\| - |\phi(\zeta^{(1)})| - |\phi(\zeta^{(2)})| = \|\zeta^{(1)} - p_1(\zeta^{(2)})\| - \rho,$$

and, by Prop. 1.2 (vii), we get

$$\|\zeta^{(1)} - p_1(\zeta^{(2)})\|^2 = \|\zeta^{(1)} - p_1(\zeta^{(1)})\|^2 + \|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\|^2 + 2(\zeta^{(1)} - p_1(\zeta^{(1)})) \cdot (p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})) \le \|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\|^2 \Big(1 + \frac{\rho}{\bar\rho}\Big) + \rho^2.$$

Now, the thesis easily follows. □
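The explicit bound of Lemma 2.3 can be tested numerically in the simplest case. The sketch below is an illustration only, under the assumption that $M$ is a circle of radius `R` in the plane; the function name `lemma_2_3_gap` and all parameter names are hypothetical. For such a circle the inequality of Prop. 1.2 (vii) holds with any constant not exceeding `R`, so `rho_bar` (the constant of Prop. 1.2) is taken as a parameter with `rho < rho_bar <= R`.

```python
import math


def lemma_2_3_gap(R, rho, rho_bar, s, theta):
    """Right-hand side minus left-hand side of the bound in Lemma 2.3.

    Assumed setting: M is the circle of radius R centred at the origin.
    zeta1 lies on the outer boundary of the tube (phi = +rho) at angle 0;
    zeta2 lies at signed distance phi = -s (0 <= s <= rho) at angle theta.
    """
    zeta1 = (R + rho, 0.0)
    zeta2 = ((R - s) * math.cos(theta), (R - s) * math.sin(theta))
    distance = math.hypot(zeta1[0] - zeta2[0], zeta1[1] - zeta2[1])
    # Chord between the projections p1(zeta1) and p1(zeta2) on the circle.
    chord = 2.0 * R * math.sin(theta / 2.0)
    phi_difference = rho + s                      # |phi(zeta1) - phi(zeta2)|
    bound = phi_difference + math.sqrt(
        chord ** 2 * (1.0 + rho / rho_bar) + rho ** 2
    ) - rho
    return bound - distance


# Illustrative scan over a few configurations; every gap should be >= 0.
gaps = [
    lemma_2_3_gap(1.0, 0.1, 0.5, s, theta)
    for s in (0.0, 0.05, 0.1)
    for theta in (0.0, 0.01, 0.1, 0.5, 1.0)
]
```

The gap vanishes when the two projections coincide and `s = rho`, which is the configuration of a segment orthogonal to the boundary.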


Theorem 2.4. Suppose $M$ is of class $C^2$. Then there exists $\bar N \ge 1$ such that, if $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$, $N \ge \bar N$, $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$, then there exists a billiard trajectory in $M_\rho$ with $N$ c.p. connecting $\zeta^{(1)}$ and $\zeta^{(2)}$.

Proof. The function $\Lambda_N$, being continuous, does take a minimum at some $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$. By Prop. 1.3 it suffices to prove that there exists $\bar N \ge 1$ such that if $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$, $N \ge \bar N$, $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$, and $i = 0, \dots, N$, then $]z_i, z_{i+1}[\, \subseteq M_\rho$. By Lemma 2.1 it suffices to prove that $\|p_1(z_i) - p_1(z_{i+1})\| < \sigma$. Suppose this is not true. Then by Lemma 2.2, we have $\Lambda_N(z) \ge \Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) + \lambda(\sigma)$. Now, given $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$ sufficiently large, we exhibit $\tilde z \in P_N(\zeta^{(1)}, \zeta^{(2)})$ with $\Lambda_N(\tilde z) < \Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) + \lambda(\sigma)$, in contrast to the definition of $z$. Choose $x_i \in M$ such that $x_0 = p_1(\zeta^{(1)})$, $x_{N+1} = p_1(\zeta^{(2)})$, and $\|x_{i+1} - x_i\| \le \frac{L}{N+1}$ for $i = 0, \dots, N$, and for $i = 1, \dots, N$ let $y_i = (-1)^{i+c}\rho\,\nu(x_i)$ with $c \in \{0, 1\}$ such that, putting $y_0 = p_2(\zeta^{(1)})$, $y_{N+1} = p_2(\zeta^{(2)})$ and $\tilde z_i = x_i + y_i$, we have $\tilde z \in P_N(\zeta^{(1)}, \zeta^{(2)})$. Note that this choice of $y_i$ is possible because $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$. By Lemma 2.3

$$\|\tilde z_{i+1} - \tilde z_i\| \le |\phi(\tilde z_{i+1}) - \phi(\tilde z_i)| + \frac{L}{N+1}\, \alpha\Big(\frac{L}{N+1}\Big).$$

Now, it suffices to take $N$ so large that $L\,\alpha\big(\frac{L}{N+1}\big) < \lambda(\sigma)$, to have that

$$\Lambda_N(\tilde z) < \Lambda_N^0(\zeta^{(1)}, \zeta^{(2)}) + \lambda(\sigma).$$

□
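The role of the comparison polygonal $\tilde z$ in the proof above can be illustrated numerically. The sketch below is not the paper's construction in general: it assumes that $M$ is a circle of radius `R` in the plane, that the footpoints $x_i$ are equally spaced in angle, and that $N$ has the parity required by the signs of `phi1` and `phi2` (with `phi1 > 0 > phi2`, $N$ must be even). The function name `excess_length` and its parameters are hypothetical. It returns $\Lambda_N(\tilde z) - \Lambda_N^0$, which should be positive and tend to 0 as $N$ grows.

```python
import math


def excess_length(N, R=1.0, rho=0.1, phi1=0.1, phi2=-0.1, total_angle=math.pi):
    """Length of the zig-zag comparison polygonal minus Lambda_N^0.

    The endpoints have signed distances phi1 and phi2 from the circle and are
    separated by total_angle at the centre. The N interior vertices lie at
    distance rho from the circle, alternately on the side opposite to the
    previous vertex; the caller must pick N of the right parity.
    """
    side = 1.0 if phi1 >= 0 else -1.0
    radii = [R + phi1]
    for _ in range(N):
        side = -side                      # switch to the opposite boundary
        radii.append(R + side * rho)
    radii.append(R + phi2)
    angles = [total_angle * i / (N + 1) for i in range(N + 2)]
    points = [(r * math.cos(t), r * math.sin(t)) for r, t in zip(radii, angles)]
    length = sum(math.dist(points[i], points[i + 1]) for i in range(N + 1))
    lower_bound = abs(phi1) + abs(phi2) + 2 * N * rho     # Lambda_N^0
    return length - lower_bound


# Illustrative values for increasing even N.
excesses = [excess_length(N) for N in (10, 100, 1000)]
```

Since the excess decreases roughly like $1/(N+1)$ in this example, it eventually drops below any fixed threshold $\lambda(\sigma)$, which is the mechanism used in the proof.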

Remark 2.5. We can obtain similar results for different problems. For example, we can prove that given $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$ and $\Delta > 0$, there exists a billiard trajectory $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$ in $M_\rho$ connecting $\zeta^{(1)}$ and $\zeta^{(2)}$ such that $\|p_1(z_{i+1}) - p_1(z_i)\| \le \min\{\Delta, \sigma\}$ for $i = 0, \dots, N$. In order to get this, it suffices to consider the minimum of $\Lambda_N$ over the set of $z \in P_N(\zeta^{(1)}, \zeta^{(2)})$ for which

$$\|p_1(z_{i+1}) - p_1(z_i)\| \le \min\{\Delta, \sigma\} \quad \text{for } i = 0, \dots, N, \tag{2.1}$$

and to prove, by proceeding in a similar way as in Theorem 2.4, that for this minimum $z$ the strict inequality holds in (2.1); thus by Prop. 1.3 we have a billiard trajectory. By choosing $\Delta$ smaller and smaller, we see that the billiard trajectory connecting $\zeta^{(1)}$ and $\zeta^{(2)}$ can be taken to be arbitrarily close to the normal to $\partial M_\rho$ at all its collision points. Also, we can consider the problem

$$\min_{z \in A} \Lambda_N(z),$$

where $A$ can be a given closed subset of $(\mathbb{R}^n)^{N+2}$, and proceed similarly. For example, take $A = \{z : p_1[[z_0, \dots, z_{N+1}]] \text{ is homotopic to } \gamma,\ z_0 = \zeta^{(1)},\ z_{N+1} = \zeta^{(2)}\}$, where $\gamma$ is a given curve on $M$ connecting $p_1(\zeta^{(1)})$ to $p_1(\zeta^{(2)})$, and $[[z_0, \dots, z_{N+1}]]$ denotes the curve which runs along the polygonal $[z_0, \dots, z_{N+1}]$ with unitary velocity. We get in this way the existence of a billiard trajectory connecting $\zeta^{(1)}$ and $\zeta^{(2)}$ in a given homotopy class. If, instead, $\gamma$ is a closed curve in $M$ and we consider $A = \{z : p_1[[z_0, \dots, z_{N+1}]] \text{ is homotopic to } \gamma,\ z_0 = z_{N+1}\}$, we get the existence of a periodic billiard trajectory in a given homotopy class. This type of result is interesting especially in case $M$ is a Jordan curve in $\mathbb{R}^2$, for in this case there exist many mutually non-homotopic curves.


3. The Case ρ Small

In Sect. 2 we have proved the existence of a billiard trajectory connecting two given points, and it is possible in fact to estimate explicitly the number of collisions. For example, by making more explicit the arguments of Sect. 2, we get that if $N \ge 1$ is such that

s s L 2 ρ ρ 2ρ ) (1 + ) + ρ 2 − ρ < 4ρ 2 + (1 − )ρ 2 − 2ρ, (N + 1) ( N +1 ρ ρ ρ+ρ

then for every $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$, if $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$, there exists a billiard trajectory in $M_\rho$ connecting $\zeta^{(1)}$ and $\zeta^{(2)}$ having $N$ collision points. The proof of this estimate is not very complicated, but a bit technical, and will be omitted. However, this estimate is far from the best, in particular for $\rho$ small. In fact, the case where $M$ is a circle in $\mathbb{R}^2$ suggests that the number of collision points has the order of $\rho^{-1/2}$. In this section we prove that this type of estimate holds, provided $M$ is of class $C^3$ (Theorem 3.5). Since we work in a compact subset of $M_{\bar\rho}$ we a priori consider only $\rho$ not bigger than some specific $\hat\rho < \bar\rho$, for example $\hat\rho = \frac{\bar\rho}{2}$. We now sketch the idea of the proof. First of all, we consider a variant of the problem considered in Sect. 2. Namely, given $\zeta^{(1)}, \zeta^{(2)} \in M_\rho$, let

$$P_{N,T}(\zeta^{(1)}, \zeta^{(2)}) = \{z \in P_N(\zeta^{(1)}, \zeta^{(2)}) : \|p_1(z_{i+1}) - p_1(z_i)\| \le T\sqrt{\rho}\ \ \forall\, i = 0, \dots, N\},$$

$$m_{N,T}(\zeta^{(1)}, \zeta^{(2)}) = \{z \in P_{N,T}(\zeta^{(1)}, \zeta^{(2)}) : \Lambda_N(z) \le \Lambda_N(z')\ \ \forall\, z' \in P_{N,T}(\zeta^{(1)}, \zeta^{(2)})\}.$$

Once $\zeta^{(1)}$ and $\zeta^{(2)}$ are fixed, we omit the dependence on $\zeta^{(1)}$ and $\zeta^{(2)}$, and write simply $P_{N,T}$, $m_{N,T}$. In the following we will use without explicit mention the simple observation that, if $z \in m_{N,T}(\zeta^{(1)}, \zeta^{(2)})$ and $i = 0, \dots, N - 1$, then $(z_i, z_{i+1}, z_{i+2}) \in m_{1,T}(z_i, z_{i+2})$. Now, it is easy to see that $m_{N,T}(\zeta^{(1)}, \zeta^{(2)})$ is nonempty provided $N$ is sufficiently large and $N \in \mathcal{P}(\zeta^{(1)}, \zeta^{(2)})$. Moreover, in view of Prop. 1.3, every $z \in m_{N,T}(\zeta^{(1)}, \zeta^{(2)})$ is a billiard trajectory in $M_\rho$, provided it lies in fact in $M_\rho$ and

$$\|p_1(z_i) - p_1(z_{i+1})\| < T\sqrt{\rho} \quad \text{for each } i = 0, \dots, N. \tag{3.1}$$

In Lemma 3.1 we see that if $\rho$ and $T$ are sufficiently small (for example, $T = \sqrt{\bar\rho}$), then the trajectory lies in $M_\rho$. On the other hand, if (3.1) does not hold then, by Lemmas 3.2 and 3.3 (for sufficiently small $\rho$ and $T$), $\|p_1(z_i) - p_1(z_{i+1})\|$ is, roughly speaking, "rather far" from 0 for all $i$, provided $N$ is not too large. Therefore, using Lemma 3.4 and a not completely obvious argument, we can find a point in $P_{N,T}$ whose length is strictly smaller than that of $z$, and this contradicts the hypothesis $z \in m_{N,T}$.

Lemma 3.1. There exists $\rho_1 \in\, ]0, \frac{\bar\rho}{2}[$ such that if $\rho \in\, ]0, \rho_1[$ and $(\zeta^{(1)}, \zeta^{(2)}) \in \mathcal{H}$ and $\|p_1(\zeta^{(1)}) - p_1(\zeta^{(2)})\| \le \sqrt{\bar\rho\rho}$, then $]\zeta^{(1)}, \zeta^{(2)}[\, \subseteq M_\rho$.

Proof. For $t \in [0, 1]$ let $z_t = \zeta^{(2)} + t(\zeta^{(1)} - \zeta^{(2)})$ (so in particular $z_1 = \zeta^{(1)}$, $z_0 = \zeta^{(2)}$), let $\beta(t) = \phi(z_t)$, and let $\alpha$ be defined as in Lemma 2.1, so that $\alpha(t) = |\beta(t)|$. By the hypothesis we have that $\alpha : [0, 1] \to \mathbb{R}$ is continuous, $\alpha(0) \le \rho$, $\alpha(1) \le \rho$, $\beta(0)\beta(1) \le 0$. We are going to prove that if for some $t \in [0, 1]$ and for some $i = 0, 1$ we have

$$\alpha(t) = \rho, \qquad \beta(t)\beta(i) \le 0, \tag{3.2}$$

then

$$\alpha'(t) \begin{cases} > 0 & \text{if } i = 0, \\ < 0 & \text{if } i = 1. \end{cases} \tag{3.3}$$

Thus, a standard argument implies $\alpha(t) < \rho$ for all $t \in\, ]0, 1[$, thus the thesis. Fix $t \in [0, 1]$ satisfying (3.2) and let $\nu = \nu(p_1(z_t))$. Observe that

$$\nu \cdot (z_t - z_i) = \nu \cdot (z_t - p_1(z_t)) + \nu \cdot (p_1(z_t) - p_1(z_i)) + \nu \cdot (p_1(z_i) - z_i) = \phi(z_t) - \phi(z_i)\, \nu \cdot \nu(p_1(z_i)) + \nu \cdot (p_1(z_t) - p_1(z_i)).$$

On the other hand, if $\rho_1$ is sufficiently small, from $\|z_t - z_i\| \le \|\zeta^{(1)} - \zeta^{(2)}\| \le \sqrt{\bar\rho\rho} + 2\rho$ we deduce $\|p_1(z_t) - p_1(z_i)\| \le \sqrt{\bar\rho\rho} + 4\rho < \sqrt{2\bar\rho\rho}$, and also $\|\nu(p_1(z_i)) - \nu\| \le 1$. Thus, $\nu \cdot \nu(p_1(z_i)) \ge 0$, and by (3.2) and Prop. 1.2 (vii), $\nu \cdot (z_t - z_i)$ has the same sign as $\beta(t)$. Now, (3.3) easily follows from Prop. 1.2 (iii). □

In the following lemma we prove that, if $(z_1, z_2, z_3) \in m_{1,T}(z_1, z_3)$, then, roughly speaking, the ratio $\frac{\|p_1(z_1) - p_1(z_2)\|}{\|p_1(z_3) - p_1(z_2)\|}$ cannot be too large. The geometrical interpretation of this statement is that if the previous ratio is very large, then the angle between $[z_1, z_2]$ and $\nu(p_1(z_2))$ is bigger than the angle between $[z_3, z_2]$ and $\nu(p_1(z_2))$, thus, moving $z_2$ "towards" $z_1$ (remaining on $\phi^{-1}(\phi(z_2))$), we can make the length of the polygonal strictly smaller, thus $(z_1, z_2, z_3) \notin m_{1,T}(z_1, z_3)$.

Lemma 3.2. Suppose $M$ is of class $C^3$. Then for every $T > 0$ there exist $K(T) \in\, ]0, 1[$ and $\rho_2(T) \in\, ]0, \rho_1]$ such that, if $\rho \in\, ]0, \rho_2(T)[$, and $(z_1, z_2, z_3) \in m_{1,T}(z_1, z_3)$ and

$$\|p_1(z_1) - p_1(z_2)\| = T\sqrt{\rho}, \tag{3.4}$$

then

$$\|p_1(z_3) - p_1(z_2)\| \ge K(T)\, T \sqrt{\rho}.$$

Proof. Given $(z_1, z_2, z_3) \in P_{1,T}(z_1, z_3)$ satisfying (3.4) with $\|p_1(z_3) - p_1(z_2)\| = AT\sqrt{\rho}$, let $\nu = \nu_M(p_1(z_2))$, and put

$$v = z_2 - z_1 - ((z_2 - z_1) \cdot \nu)\, \nu, \qquad u = z_2 - z_3 - ((z_2 - z_3) \cdot \nu)\, \nu, \tag{3.5}$$

$$\alpha(t) = \|z_2 + tv - z_1\| + \|z_2 + tv - z_3\|, \qquad \beta(t) = \|p_1(z_2 + tv) - p_1(z_1)\|.$$

It suffices to prove that if $\rho$ and $A$ are sufficiently small then

$$\alpha'(0) > 0, \tag{3.6}$$

$$\beta'(0) > 0. \tag{3.7}$$

In fact, if (3.6) and (3.7) hold, it is easy to see that, by moving the point $z_2$ with direction $-v$, remaining on $\phi^{-1}(\phi(z_2))$, we get a polygonal in $P_{1,T}(z_1, z_3)$ whose length is strictly smaller than that of $(z_1, z_2, z_3)$. Therefore, if $(z_1, z_2, z_3) \in m_{1,T}(z_1, z_3)$, then for sufficiently small $\rho$, $A$ must be greater than (or equal to) some $K(T) > 0$. We prove (3.6). We have

$$\alpha'(0) = \frac{u \cdot v}{\|z_3 - z_2\|} + \frac{\|v\|^2}{\|z_1 - z_2\|}.$$

Thus, in order to have (3.6) it is sufficient to prove

$$\frac{\|v\|^2}{\|z_1 - z_2\|^2} > \frac{\|u\|^2}{\|z_3 - z_2\|^2},$$

or equivalently, in view of (3.5),

$$\frac{|(z_1 - z_2) \cdot \nu|}{\|z_1 - z_2\|} < \frac{|(z_3 - z_2) \cdot \nu|}{\|z_3 - z_2\|}. \tag{3.8}$$

Using Prop. 1.2 (vii) and (3.4) we have

$$|(z_1 - z_2) \cdot \nu| \le 2\rho + \frac{T^2\rho}{2\bar\rho};$$

also

$$\|z_1 - z_2\| \ge \|p_1(z_1) - p_1(z_2)\| - \|p_2(z_2)\| - \|p_2(z_1)\| \ge T\sqrt{\rho} - 2\rho. \tag{3.9}$$

Thus, if $\rho$ is so small that $T\sqrt{\rho} - 2\rho > 0$, we have

$$\frac{|(z_1 - z_2) \cdot \nu|}{\|z_1 - z_2\|} \le \frac{\rho\big(2 + \frac{T^2}{2\bar\rho}\big)}{T\sqrt{\rho} - 2\rho}. \tag{3.10}$$

Similarly,

$$(z_3 - z_2) \cdot \nu = (z_3 - p_1(z_3)) \cdot \nu + (p_1(z_3) - p_1(z_2)) \cdot \nu + (p_1(z_2) - z_2) \cdot \nu.$$

Now, since $\nu(p_1(z_2)) \cdot \nu = \nu \cdot \nu = 1$, and $\|p_1(z_3) - p_1(z_2)\| \le T\sqrt{\rho}$, if we take $\rho_2(T)$ so small that

$$x_1, x_2 \in M,\ \|x_1 - x_2\| \le T\sqrt{\rho_2(T)} \ \Rightarrow\ \|\nu(x_1) - \nu(x_2)\| \le 1,$$

we have, in view of Prop. 1.2 (vii),

$$|(z_3 - z_2) \cdot \nu| \ge |(p_1(z_2) - z_2) \cdot \nu| - |(p_1(z_3) - p_1(z_2)) \cdot \nu| \ge \rho - \frac{A^2T^2\rho}{2\bar\rho};$$

also $\|z_3 - z_2\| \le AT\sqrt{\rho} + 2\rho$, hence we get

$$\frac{|(z_3 - z_2) \cdot \nu|}{\|z_3 - z_2\|} \ge \frac{\rho\big(1 - \frac{A^2T^2}{2\bar\rho}\big)}{AT\sqrt{\rho} + 2\rho}.$$

In conclusion, in view also of (3.10), if $\rho$ and $A$ are sufficiently small, we get (3.8), hence (3.6). We are now going to prove that (3.7) holds for sufficiently small $\rho$ and $A$. We have

$$\beta'(0) = \frac{(p_1(z_2) - p_1(z_1)) \cdot \gamma_1'(0)}{\|p_1(z_2) - p_1(z_1)\|},$$

where $\gamma_1(t) = p_1(z_2 + tv)$. Put $\gamma_2(t) = p_1(p_1(z_2) + tv)$. We have $\|\gamma_1'(0) - \gamma_2'(0)\| \le H\rho$, where $H$ is an appropriate constant $> 0$ related to the second derivatives of $p_1$ in $M_{\bar\rho/2}$. Also, by Prop. 1.2 (ii), $\gamma_2'(0) = v$. Thus, using (3.4) and (3.5), we have

$$\|p_1(z_2) - p_1(z_1)\|\, \beta'(0) \ge (p_1(z_2) - p_1(z_1)) \cdot v - H\rho\, \|p_1(z_2) - p_1(z_1)\| = (z_2 - z_1) \cdot v + (p_1(z_2) - z_2) \cdot v + (z_1 - p_1(z_1)) \cdot v - HT\rho\sqrt{\rho} \ge \|v\|^2 - 2\rho\|v\| - HT\rho\sqrt{\rho} = \|v\|(\|v\| - 2\rho) - HT\rho^{3/2}.$$

By (3.9) and (3.10), $\|v\| \ge r\sqrt{\rho}$ for suitable $r > 0$ and sufficiently small $\rho > 0$, and this concludes the proof. □

In the next lemma we estimate the difference between the lengths of successive segment lines in a billiard trajectory provided all the ends lie on $\partial M_\rho$. To interpret this result geometrically in dimension two, observe that when $M$ is a circle the segment lines have the same length, and in the general case $M$ is approximated by some circle (or straight line) up to the third order.


Lemma 3.3. Suppose $M$ is of class $C^3$. Then there exist $\bar T > 0$, $B > 0$, $\rho_3 \in\, ]0, \rho_1[$ such that, if $\rho \in\, ]0, \rho_3[$ and $[z_1, z_2, z_3]$ is a billiard trajectory in $M_\rho$ with $(z_1, z_2, z_3) \in P_{1,\bar T}$ and $|\phi(z_1)| = |\phi(z_3)| = \rho$, then

$$\big|\, \|z_3 - z_2\| - \|z_1 - z_2\|\, \big| \le B\, \frac{\|z_1 - z_2\|^3}{\sqrt{\rho}}. \tag{3.11}$$

Proof. Suppose given $\bar T$, $\rho_3$, $\rho$, $z_1$, $z_2$, $z_3$ satisfying the hypotheses; we prove that if $\rho_3$, $\bar T$ are appropriate, there exists $B > 0$ satisfying (3.11), with $\rho_3$, $\bar T$, $B$ independent of $z_1$, $z_2$, $z_3$. Put $\nu = \nu(p_1(z_2))$, $z_1 = z_2 + a\nu + v$ with $v \cdot \nu = 0$. Since $[z_1, z_2, z_3]$ is a billiard trajectory we have

$$\exists\, t > 0 : z_3 = z_2 + t(a\nu - v).$$

Put $z_4 = z_2 + (a\nu - v)$, $\lambda(t) = \phi(z_2 + a\nu + tv) - \phi(z_2 + a\nu - tv)$. It follows

$$\big|\, \|z_3 - z_2\| - \|z_1 - z_2\|\, \big| = \big|\, \|z_3 - z_2\| - \|z_4 - z_2\|\, \big| = \|z_3 - z_4\|. \tag{3.12}$$

We now choose $\rho_3$ so small that all the segment lines $[z_2 + a\nu, z_2 + a\nu \pm v]$, $[z_2, z_4]$, $[z_3, z_4]$ are contained in $M_{\bar\rho/2}$, and it is sufficient that

$$\bar T\sqrt{\rho_3} + 3\rho_3 \le \frac{\bar\rho}{3}. \tag{3.13}$$

Indeed, in this case, if $P$ is one of the points $z_2$, $z_2 + a\nu$, $z_2 + a\nu \pm v$, $z_3$, $z_4$, then we have $\|P - p_1(z_2)\| \le \frac{\bar\rho}{2}$, for $\|P - z_2\| \le \|z_1 - z_2\|$ unless $P = z_3$, and, clearly, $\|z_1 - z_2\| \le \bar T\sqrt{\rho_3} + 2\rho_3$, $\|z_3 - z_2\| \le \bar T\sqrt{\rho_3} + 2\rho_3$. Since by Prop. 1.2 (iii) we easily get $\lambda'(0) = \lambda''(0) = 0$, for a suitable $B_1 > 0$ related to the third derivatives of $\phi$, we have

$$|\phi(z_4) - \phi(z_3)| = |\phi(z_4) - \phi(z_1)| = |\lambda(1)| \le B_1 \|z_1 - z_2\|^3. \tag{3.14}$$

Thus,

$$2\rho - B_1\|z_1 - z_2\|^3 \le |\phi(z_4) - \phi(z_2)| = |\mathrm{grad}\,\phi(\xi_1) \cdot u|\, \|z_4 - z_2\|, \tag{3.15}$$

$$B_1\|z_1 - z_2\|^3 \ge |\phi(z_3) - \phi(z_4)| = |\mathrm{grad}\,\phi(\xi_2) \cdot u|\, \|z_3 - z_4\|, \tag{3.16}$$

where $u = \frac{z_4 - z_2}{\|z_4 - z_2\|}$, $\xi_1$ is a suitable point in $]z_2, z_4[$, and $\xi_2$ is a suitable point in $]z_3, z_4[$. Note that $\max\{\|\xi_1 - p_1(z_2)\|, \|\xi_2 - p_1(z_2)\|\} \le \frac{\bar\rho}{2}$, thus $[\xi_1, \xi_2] \subseteq M_{\bar\rho/2}$. Moreover, $\|\xi_1 - \xi_2\| \le \max\{\|z_3 - z_2\|, \|z_4 - z_2\|\} \le \bar T\sqrt{\rho} + 2\rho$. Hence if (for example) $\alpha = \sum_{i,j=1}^{n} \max \big|\frac{\partial^2\phi}{\partial x_i \partial x_j}\big|$, where the max is taken over $M_{\bar\rho/2}$, using (3.15) we have

$$|\mathrm{grad}\,\phi(\xi_2) \cdot u| \ge |\mathrm{grad}\,\phi(\xi_1) \cdot u| - \alpha\|\xi_1 - \xi_2\| \ge \frac{2\rho - B_1\|z_1 - z_2\|^3}{\|z_1 - z_2\|} - \alpha(\bar T\sqrt{\rho} + 2\rho) \ge \frac{2\rho - B_1(\bar T\sqrt{\rho} + 2\rho)^3}{\bar T\sqrt{\rho} + 2\rho} - \alpha(\bar T\sqrt{\rho} + 2\rho).$$

Hence, if $\rho_3$ and $\bar T$ satisfy, for some $\beta > 0$,

$$\frac{2\rho - B_1(\bar T\sqrt{\rho} + 2\rho)^3}{\bar T\sqrt{\rho} + 2\rho} - \alpha(\bar T\sqrt{\rho} + 2\rho) \ge \beta\sqrt{\rho} \tag{3.17}$$

for every $\rho \in\, ]0, \rho_3[$, then by (3.12) and (3.16) we get

$$\big|\, \|z_3 - z_2\| - \|z_1 - z_2\|\, \big| \le \frac{B_1}{\beta}\, \frac{\|z_1 - z_2\|^3}{\sqrt{\rho}}.$$

In conclusion, it suffices to take $\bar T$, $\rho_3$ satisfying (3.13) and (3.17), and to put $B = \frac{B_1}{\beta}$. □

We will prove Theorem 3.5 by choosing $N$ in such a way that, on one hand, it is sufficiently large (thus $P_{N,T} \ne \emptyset$), and on the other it is not too large (see (3.28)); thus if (3.1) does not hold for $z \in P_{N,T}$, then, using Lemmas 3.2 and 3.3, every segment line of $z$ is not much shorter than $K(T)T\sqrt{\rho}$, hence $z$ is too long to be in $m_{N,T}$. This argument works for sufficiently small $\rho$ and $T$. However, we need a final lemma to prove that in fact Lemmas 3.2 and 3.3 imply this consideration about the length of $z$.

Lemma 3.4. There exists $r > 1$ such that every $(N - 1)$-tuple $(l_1, \dots, l_{N-1})$ of real numbers (with $N > 1$) satisfying, for some $\alpha > 0$, $\lambda > 0$, formulas (3.18), (3.19), (3.20) given by

$$l_1, \dots, l_{N-1} \in \Big[0, \sqrt{\tfrac{1}{3\alpha}}\,\Big], \tag{3.18}$$

$$|l_{i+1} - l_i| \le \alpha(\min\{l_i, l_{i+1}\})^3 \quad \text{if } \min\{l_i, l_{i+1}\} < \lambda,\ i = 1, \dots, N - 2, \tag{3.19}$$

$$\exists\, i = 1, \dots, N - 1 \text{ such that } l_i \ge \lambda, \tag{3.20}$$

also satisfies

$$\sum_{i=1}^{N-1} l_i \ge (N - 1)\, \lambda\, r^{-(N-1)\alpha\lambda^2}.$$

Proof. Let r > 1 be such that (1 − x1 )x ≥ 1r for every x ≥ 3. Suppose i, i ± 1 ∈ {1, . . . , N − 1}. By (3.19) if li < λ we have li±1 ≥ li − αli3 ≥ li (1 − αλ2 ); if instead

Billiards in Tubular Neighborhoods of Manifolds of Codimension 1

79

li ≥ λ > li±1 we have li±1 ≥ li − αli3 ≥ λ(1 − αλ2 ) because the function l 7→ l − αl 3 h q i 1 is increasing in 0, 3α . In any case li±1 ≥ min{li , λ} (1 − αλ2 ). By (3.18) and (3.20) we deduce αλ2 ≤ 13 , and also, given j = 1, . . . , N − 1, 1

lj ≥ λ(1 − αλ2 )N−1 = λ (1 − αλ2 ) αλ2

(N−1)αλ2

≥ λr −(N−1)αλ , 2

and we immediately get the thesis. u t Theorem 3.5. Suppose M is of class C 3 . Then there exist ρ˜ ∈]0, ρ[ and H > 0 such that for every ρ ∈]0, ρ[, ˜ any two points in Mρ can be connected by a billiard trajectory in Mρ having no more than √Hρ collision points. Proof. Let T > 0 and ρ˜ > 0 be such that

p T < min{T , ρ}, ρ˜ ≤ min{ρ2 (T ), ρ3 }.

(3.21) (3.22)

Given ρ ∈]0, ρ[ ˜ and ζ (1) , ζ (2) ∈ Mρ , suppose N satisfies N≥

L √ , T ρ

N ∈ P(ζ (1) , ζ (2) ).

(3.23) (3.24)

Then we can construct $\tilde z \in P_{N,T}$ as in Theorem 2.4, thus $P_{N,T} \ne \emptyset$. Consider $z \in m_{N,T}$. Note that $z$ is in $M_\rho$ by (3.21), (3.22), and Lemma 3.1. We are going to prove that $z$ satisfies (3.1). We will prove in fact that if (3.1) does not hold, then the length of $z$ is strictly greater than that of $\tilde z$, a contradiction. Note that if we assume

$$\tilde\rho < \Big(\frac{K(T)T}{4}\Big)^2, \tag{3.25}$$

we have $\|z_{i+1} - z_i\| \le T\sqrt{\rho} + 2\rho \le \frac{3}{2}T\sqrt{\rho}$, and also $\|z_{i+1} - z_i\| \ge K(T)T\sqrt{\rho} - 2\rho \ge \frac{1}{2}K(T)T\sqrt{\rho}$ if, in addition, $\|p_1(z_{i+1}) - p_1(z_i)\| \ge K(T)T\sqrt{\rho}$. Thus, if (3.1) does not hold for some $i = 0, \dots, N$, then by Lemmas 3.2 and 3.3, putting $l_i = \|z_{i+1} - z_i\|$, $\alpha = \frac{B}{\sqrt{\rho}}$, $\lambda = \frac{1}{2}K(T)T\sqrt{\rho}$, we can apply Lemma 3.4 provided

$$\frac{27}{4}\,B\sqrt{\tilde\rho}\,T^2 \le 1, \tag{3.26}$$

and we get

$$\Lambda_N(z) \ge \sum_{i=1}^{N-1}\|z_{i+1} - z_i\| \ge \frac{N-1}{2}\,K(T)T\sqrt{\rho}\;r^{-(N-1)\frac{B}{4}K(T)^2T^2\sqrt{\rho}}.$$

We conclude that we can make $\Lambda_N(z)$ arbitrarily large; for example we have

$$\Lambda_N(z) > 2L, \tag{3.27}$$

provided $K(T)T(N-1)\sqrt{\rho} > 8L$ and $BK(T)^2T^2(N-1)\sqrt{\rho} < \frac{\ln 2}{\ln r}$, and these inequalities are satisfied if

$$\frac{8L}{K(T)T} < (N-1)\sqrt{\rho} < \frac{\ln 2}{BK(T)T^2\ln r}. \tag{3.28}$$
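The interplay between (3.23), (3.27) and (3.28) can be made concrete with a small computation. The sketch below treats $L$, $K(T)$, $T$, $B$ and $r$ as plain numbers; the sample values are hypothetical and chosen only so that the window (3.28) is non-empty, and the function names are ours. It lists the admissible range of $N$ and evaluates the lower bound for the length derived just before (3.27).

```python
import math


def n_window(L, K, T, B, r, rho):
    """Return (n_min, n_max): integers N satisfying (3.23) and (3.28).

    The window is empty when n_min > n_max.
    """
    s = math.sqrt(rho)
    lower = 8.0 * L / (K * T)                               # left side of (3.28)
    upper = math.log(2.0) / (B * K * T ** 2 * math.log(r))  # right side of (3.28)
    # (N - 1) * s > lower  and  N >= L / (T * s)
    n_min = max(math.floor(lower / s) + 2, math.ceil(L / (T * s)))
    # (N - 1) * s < upper
    n_max = math.ceil(upper / s)
    return n_min, n_max


def length_lower_bound(N, K, T, B, r, rho):
    """Lower bound for the length obtained from Lemma 3.4 before (3.27)."""
    s = math.sqrt(rho)
    exponent = (N - 1) * (B / 4.0) * K ** 2 * T ** 2 * s
    return 0.5 * (N - 1) * K * T * s * r ** (-exponent)


# Hypothetical constants, not taken from the paper.
L, K, T, B, r, rho = 1.0, 1.0, 0.05, 1.0, 27.0 / 8.0, 1e-4
n_min, n_max = n_window(L, K, T, B, r, rho)
print(n_min, n_max, length_lower_bound(n_min, K, T, B, r, rho))
```

With these sample values every $N$ in the window gives a bound above $2L$, in line with (3.27).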

Now, if T satisfies (3.21) and the inequality between the left and the right side of (3.28), and for this T , ρ˜ satisfies (3.22), (3.25), (3.26), and also p

ρ˜ <

ln2 1 8L , − 2 2 BlnrK(T )T T K(T )

then there exists N satisfying (3.23), (3.24), (3.28), and for this N z satisfies (3.27). On the other hand if we suppose these conditions on T , ρ, ˜ N are satisfied, and use Lemma 2.3 and (3.28) we get s L 2 ρ ) (1 + ) + ρ 2 + (N + 1)ρ ≤ 2L, 3N (˜z) ≤ (N + 1) ( N +1 ρ provided ρ˜ is sufficiently small. In such a case, by using also (3.27) we have 3N (˜z) < 2ln2 , see (3.28). u t 3N (z). The theorem is proved if we take for example H = BlnrK(T )T 2 Acknowledgement. I thank E. De Giorgi1 for useful discussions and suggestions on this subject.

References

[1] Benci, V., Giannoni, F.: Periodic Bounce Trajectories with a low number of Bounce Points. Ann. Inst. Henri Poincaré, Anal. non linéaire, 6 n. 1, 73–93 (1989)
[2] Cornfeld, I.P., Fomin, S.V., Sinai, Ya.G.: Ergodic Theory. New York: Springer-Verlag, 1982
[3] Degiovanni, M.: Multiplicity of Solutions for the Bounce Problem. J. Diff. Equations, 54, 414–428 (1984)
[4] Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer-Verlag, 1983
[5] Hirsch, M.: Differential Topology. New York: Springer, 1976
[6] Kozlov, V., Treshchev, D.: Billiards. A Genetic Introduction to the Dynamics of Systems with Impacts. Translations of Math. Monographs No 89, Providence, RI: Amer. Math. Soc., 1991
[7] Markarian, R.: Introduction to the Ergodic Theory of Plane Billiards. In: Dynamical Systems, Pitman Research Notes in Math. Series n. 285, pp. 327–439
[8] Peirone, R.: Bounce Trajectories in Plane Tubular Domains, Atti Accad. Naz. Linc. Rend. Cl. Sc. Fis. Mat. Natur., Ser. 8, Vol. LXXXIII, 39–42 (1989)
[9] Penrose, L., Penrose, R.: Puzzles for Christmas. The New Scientist, 25 Dec. 1958, pp. 1580–81 and 1597
[10] Rauch, J.: Illumination of Bounded Domains. Am. Math. Month. 85, 359–361 (1978)
[11] Tabachnikov, S.: Billiards. SMF Panoramas et Synthese, No 1, 1995

Communicated by Ya. G. Sinai

1 who died in 1996

Commun. Math. Phys. 207, 81 – 106 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Remarks on the Time Dependent Periodic Navier–Stokes Flows on a Two-Dimensional Torus

Zhi-Min Chen1,*, W. Geraint Price2

1 Department of Mathematics, Tianjin University, Tianjin 300072, P.R. China. E-mail: [email protected]
2 Department of Ship Science, Southampton University, Southampton SO17 1BJ, UK. E-mail: [email protected]

Received: 4 August 1998 / Accepted: 9 April 1999

Abstract: This note is a continuation of the earlier paper [Chen, Z. M., Price, W. G.: Time dependent periodic Navier–Stokes flows on a two-dimensional torus. Commun. Math. Phys. 179, 577–597 (1996)], where evidence for the occurrence of time dependent periodic Navier–Stokes flows was found by a combination of analysis and computation. That evidence is now confirmed via rigorous analysis: the existence of time dependent periodic Navier–Stokes flows on a two-dimensional torus is proved.

1. Introduction

From a general point of view, the occurrence of time dependent periodic flows is an inherent phenomenon of the viscous incompressible fluid motions governed by the Navier–Stokes equations. In particular, Hopf bifurcation, or the transition from a steady-state flow to time dependent periodic flows, may be considered the primary stage in understanding the nature of turbulence (cf. [7,10]). The existence theorem of Hopf bifurcation for general Navier–Stokes equations was established by Joseph and Sattinger [5] under the assumption of the eigenvalue simplicity and eigenvalue transversal crossing conditions for the linearized Navier–Stokes equations. Nevertheless, these conditions are difficult to verify. One may refer to [6] for the comment given in 1975. The possibility of a time periodic motion is well established if certain conditions on the linearization are satisfied. However, they have not yet been realized for a single model.

Rigorous mathematical analysis on the existence of time dependent periodic solutions to the Navier–Stokes equations is still in its early stage. For example, a critical Reynolds number can be found for parallel Navier–Stokes motions by studying the Orr–Sommerfeld equations or the linearization in the space of functions spatially periodic in the horizontal direction (cf. [8]). There still exists a large gap in understanding the eigenvalue conditions with respect to the Orr–Sommerfeld equations in solving bifurcation problems.

* Research partially supported by the National Natural Science Foundation of China.

In 1959, Kolmogorov (cf. [1]) gave a model of parallel fluid motions, which are spatially periodic and excited by the external force $k^2(\sin ky, 0)$ for an integer $k$. This force gives rise to the steady-state flow $(\sin ky, 0)$. In connection with this model, the gap is narrowed via an approach due to Meshalkin and Sinai [9] by transforming the Orr–Sommerfeld equations to algebraic equations. The spectral behavior becomes easier to examine through the algebraic equations. An important extended study on this problem is due to Iudovich [4], where the existence of a sequence of steady-state bifurcation points is obtained.

Consider Kolmogorov's model with the forcing term $4k^2(\sin 2ky, 0)$. This is formulated by the following Navier–Stokes equations on the two-dimensional torus $T^2 = R^2/(2\pi Z)$:

$$\partial_t u - \Delta u + \lambda(u\cdot\nabla)u + \lambda\nabla p = 4k^2(\sin 2ky, 0), \qquad \nabla\cdot u = 0, \qquad \int_{T^2} u\,dx\,dy = 0, \tag{1}$$

where $u = (u_1, u_2)$, $p$ and $\lambda$ represent respectively the velocity, the pressure and the Reynolds number defining the fluid motions. The present work is a continuation of the earlier investigation (cf. [3]) on the occurrence of time dependent periodic flows, which were found to exist in $[\sqrt{3}k]$ flow invariant subspaces by truncating Eq. (1), reduced to each of these subspaces, into an ordinary differential system and by numerical experiments. Here $[a]$ denotes the integer part of $a$. In this paper we prove the existence of the time dependent periodic solutions of Eq. (1) observed in [3]. To state the results more precisely, we rewrite Eq. (1) in the vorticity form:

$$\partial_t\psi = \Delta\psi - \lambda\Delta^{-1}(\partial_y\psi\,\partial_x\Delta\psi - \partial_x\psi\,\partial_y\Delta\psi) - 2k\cos 2ky, \qquad \int_{T^2}\psi\,dx\,dy = 0, \tag{2}$$

$$(-\Delta\psi = \partial_y u_1 - \partial_x u_2),$$

and we introduce the following flow invariant subspaces of Eq. (2) given in [3]:

$H^2$ = the Hilbert space $\{\psi \mid \psi, \Delta\psi \in L^2(T^2; R)\}/R$, which is a Banach algebra with respect to the function multiplication.

$$H^2_{l,k} = \Big\{\psi \in H^2 \;\Big|\; \psi = \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\eta_{m,n}\cos(2mlx - lx + 2nky + ky) + \sum_{n=1}^{\infty}\xi_n\cos 2nky + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\xi_{m,n}\cos(2mlx + 2nky)\Big\},$$

which is a Banach subalgebra of $H^2$ generated by the three modes $\cos 2ky$, $\cos(lx - ky)$ and $\cos(lx + ky)$ for an integer vector $(l, k)$.

Likewise, a symmetric subspace with respect to $H^2_{l,k}$ can be formulated as follows:

$$\tilde H^2_{l,k} = \Big\{\psi \in H^2 \;\Big|\; \psi = \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\eta_{m,n}\sin(2mlx - lx + 2nky + ky) + \sum_{n=1}^{\infty}\xi_n\cos 2nky + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\xi_{m,n}\cos(2mlx + 2nky)\Big\},$$

which is a Banach subalgebra of $H^2$ generated by the three modes $\cos 2ky$, $\sin(lx - ky)$ and $\sin(lx + ky)$. These subspaces are flow invariant of Eq. (2) in the following sense:

Lemma 1.1 ([3, Lemma 2.1]). For every $\hat\psi \in H^2_{l,k} \cup \tilde H^2_{l,k}$ with $l \ge 0$ and $k \ge 1$, Eq. (2) admits a unique solution with $\psi(0) = \hat\psi$ such that

$$\psi \in C([0,\infty); H^2_{l,k}), \quad \text{when } \hat\psi \in H^2_{l,k},$$

and

$$\psi \in C([0,\infty); \tilde H^2_{l,k}), \quad \text{when } \hat\psi \in \tilde H^2_{l,k}.$$

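We are now in the position to state the existence result on time dependent periodic flows.

Theorem 1.1. Let $k \ge 1$ and $l = 1, \dots, [\sqrt{3}k]$ be integers. There exist a critical Reynolds number $\lambda_{l,k}$ and a continuous function $\lambda_{l,k,\varepsilon}$ with respect to small $\varepsilon > 0$ such that Eq. (2) with $\lambda = \lambda_{l,k,\varepsilon}$ admits two different time dependent periodic solutions $\psi_{l,k,\varepsilon}$ and $\tilde\psi_{l,k,\varepsilon}$ satisfying

$$\psi_{l,k,\varepsilon} \in C([0,\infty); H^2_{l,k}), \qquad \tilde\psi_{l,k,\varepsilon} \in C([0,\infty); \tilde H^2_{l,k}),$$

$$\psi_{l,k,\varepsilon},\ \tilde\psi_{l,k,\varepsilon} \to \psi_0, \qquad \lambda_{l,k,\varepsilon} \to \lambda_{l,k} \quad \text{as } \varepsilon \to 0,$$

where $\psi_0 = -(1/2k)\cos 2ky$ is the basic steady-state solution of Eq. (2).

In approaching this result, we use the Hopf bifurcation theorem due to Joseph and Sattinger [5]. For the reader's convenience, we simplify this theorem concerning Eq. (2) as follows:

Theorem 1.2 (Joseph and Sattinger [5]). Assume that the spectral problem, the linearization of Eq. (2) around the steady-state solution $\psi_0$ in $H^2$,

$$\rho\psi = \Delta\psi - \lambda\Delta^{-1}(\partial_y\psi_0\,\partial_x\Delta\psi - \partial_x\psi_0\,\partial_y\Delta\psi + \partial_y\psi\,\partial_x\Delta\psi_0 - \partial_x\psi\,\partial_y\Delta\psi_0) \tag{3}$$

admits an eigenvalue $\rho = \rho(\lambda)$ subject to the eigenvalue simplicity condition

$$1 = \dim\{\psi(\lambda) \in H^2 \mid (\rho(\lambda), \psi(\lambda)) \text{ satisfies Eq. (3)}\}$$

and the eigenvalue transversal crossing condition

$$\mathrm{Im}\,\rho(\lambda_0) \ne 0, \qquad \mathrm{Re}\,\rho(\lambda_0) = 0 \qquad \text{and} \qquad \mathrm{Re}\,\frac{d\rho(\lambda_0)}{d\lambda} > 0$$

for some critical Reynolds number $\lambda_0 > 0$. Then there exist continuous functions $\lambda_\varepsilon$ and $\nu_\varepsilon$ for small $\varepsilon > 0$ such that Eq. (2) with $\lambda = \lambda_\varepsilon$ admits a solution $\psi = \psi_\varepsilon \in C([0,\infty); C^2(T^2; R))$ satisfying

$$\psi_\varepsilon(x, y, t) = \psi_\varepsilon\Big(x, y, t + \frac{2\pi}{\lambda_\varepsilon\nu_\varepsilon}\Big) \quad \text{for small } \varepsilon > 0,$$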

$\psi_\varepsilon \to \psi_0$, $\lambda_\varepsilon \to \lambda_0$ and $\nu_\varepsilon \to \mathrm{Im}\,\rho(\lambda_0)$ as $\varepsilon \to 0$. Here $2\pi/(\lambda_\varepsilon\nu_\varepsilon)$ is the time period of $\psi_\varepsilon$.

In fact, the eigenvalues of Eq. (3) always have multiplicity $2m$ for some integer $m$, and the whole space $H^2$ is not suitable for applying the Hopf bifurcation theorem. However, the simplicity condition can be verified in the above-mentioned flow invariant subspaces, and Theorem 1.2 remains true by reducing Eq. (2) to these subspaces. Consequently, taking Theorem 1.2 and Lemma 1.1 into account, we are led to the proof of the following result on the linear spectral problem presented in Eq. (3).

Theorem 1.3. Let $k \ge 1$ and $l = 1, \dots, [\sqrt{3}k]$ be integers and let $A_\lambda\psi$ be the right-hand side of Eq. (3). Then the following assertions hold true.

(i) The spectral problems

$$A_\lambda\psi = \rho\psi, \qquad \psi = \sum_{n=-\infty}^{\infty}\xi_n\cos(lx + ky + 2nky) \in H^2_{l,k} \tag{4}$$

and

$$A_\lambda\psi = \rho\psi, \qquad \psi = \sum_{n=-\infty}^{\infty}\xi_n\sin(lx + ky + 2nky) \in \tilde H^2_{l,k} \tag{5}$$

admit an eigenvalue $\rho = \rho_{l,k}(\lambda)$ such that

$$\lim_{\lambda\to0}\rho_{l,k}(\lambda) = -l^2 - k^2 \qquad \text{and} \qquad \mathrm{Im}\,\rho_{l,k}(\lambda) < 0, \quad \lambda > 0. \tag{6}$$

(ii) $\rho_{l,k}$ satisfies the eigenvalue positivity condition

$$\lim_{\lambda\to\infty}\mathrm{Re}\,\rho_{l,k}(\lambda) > 0.$$

(iii) The eigenvalue $\rho_{l,k}$ is smooth and unique with respect to the spectral problems described in Eqs. (4–5) and satisfies the monotonicity condition

$$\mathrm{Re}\,\frac{d\rho_{l,k}(\lambda)}{d\lambda} > 0, \quad \lambda > 0. \tag{7}$$

(iv) $H^2_{l,k}$ (resp. $\tilde H^2_{l,k}$) contains a flow invariant subspace $X$ (resp. $\tilde X$) of Eq. (2) such that, for $\lambda > 0$,

$$1 = \dim\{\psi \in X \mid A_\lambda\psi = \rho_{l,k}(\lambda)\psi\} = \dim\{\psi \in \tilde X \mid A_\lambda\psi = \rho_{l,k}(\lambda)\psi\}$$

and

$$2 = \dim\{\psi \in X \cup \tilde X \mid A_\lambda\psi = \rho_{l,k}(\lambda)\psi\}.$$

For convenience, we may suppose that $H^2$ is a complex space when it is concerned with the spectral problem. This result implies the existence of a critical Reynolds number $\lambda_{l,k}$ so that $\mathrm{Re}\,\rho_{l,k}(\lambda_{l,k}) = 0$. Thus it follows from Theorem 1.2 that a pair of time dependent periodic solutions branch off the bifurcation point $(\lambda_{l,k}, \psi_0)$. These two different solutions are contained respectively in the spaces $C([0,\infty); X)$ and $C([0,\infty); \tilde X)$. This gives Theorem 1.1.

To show Theorem 1.3, it is necessary to reduce the spectral problem to algebraic equations, which can be deduced essentially by using the method of [9]. Now we state this result as follows:

Theorem 1.4. Let $\lambda > 0$, $k \ge 1$, $l = 1, \dots, [\sqrt{3}k]$ and let $Z$ denote the integer set. Then Eqs. (4–5) admit an eigenvalue $\rho = \rho_{l,k}(\lambda)$ satisfying Eq. (6), and both Eq. (4) and Eq. (5) are equivalent to the following coupled set of algebraic equations

$$2\beta_n(\beta_n + \rho)\xi_n + \lambda\alpha_{n-1}\xi_{n-1} - \lambda\alpha_{n+1}\xi_{n+1} = 0, \quad n \in Z, \tag{8}$$

where $\beta_n = \beta_n(l,k) = l^2 + (2nk + k)^2$, $\alpha_n = \alpha_n(l,k) = l(\beta_n - 4k^2)$, and $\{\xi_n\}_{n\in Z}$ is defined by the following conditions:

$$\xi_0 = c, \quad \xi_{-1} = ic, \qquad (-1)^n\xi_n = c\,\gamma_1\cdots\gamma_n\frac{\alpha_0}{\alpha_n}, \quad n \ge 1, \qquad (-1)^{n-1}\xi_{-n} = i\xi_{n-1}, \quad n \ge 2, \tag{9}$$

$$\gamma_n = \cfrac{1}{\cfrac{2\beta_n(\beta_n+\rho)}{\lambda\alpha_n} + \cfrac{1}{\cfrac{2\beta_{n+1}(\beta_{n+1}+\rho)}{\lambda\alpha_{n+1}} + \ddots}},$$

and, in particular,

$$\frac{1}{\gamma_0} = \frac{2\beta_0(\beta_0+\rho)}{\lambda l(l^2 - 3k^2)} + \cfrac{1}{\cfrac{2\beta_1(\beta_1+\rho)}{\lambda\alpha_1} + \cfrac{1}{\cfrac{2\beta_2(\beta_2+\rho)}{\lambda\alpha_2} + \ddots}} = i. \tag{10}$$

Here $c$ is an arbitrary complex constant and $i = \sqrt{-1}$.
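To see how Eq. (10) pins down $\rho$ in practice, the following sketch evaluates the continued fraction $\gamma_n$ by truncation and solves $1/\gamma_0 = i$ for small $\lambda$. The fixed-point iteration used here is our own illustrative device, not the method of the paper; it relies on the limit $\rho \to -\beta_0$ as $\lambda \to 0$ from Eq. (6) and should only be expected to converge for small $\lambda$. The function names and the truncation depth are assumptions.

```python
def beta(n, l, k):
    """beta_n = l^2 + (2nk + k)^2 from Theorem 1.4."""
    return l ** 2 + (2 * n * k + k) ** 2


def alpha(n, l, k):
    """alpha_n = l (beta_n - 4 k^2) from Theorem 1.4."""
    return l * (beta(n, l, k) - 4 * k ** 2)


def gamma(n, rho, lam, l, k, depth=200):
    """Approximate gamma_n by cutting the continued fraction after `depth` levels."""
    g = 0.0 + 0.0j  # the dropped tail; |gamma_m| tends to 0 as m grows
    for m in range(n + depth - 1, n - 1, -1):
        a_m = 2 * beta(m, l, k) * (beta(m, l, k) + rho) / (lam * alpha(m, l, k))
        g = 1.0 / (a_m + g)
    return g


def residual(rho, lam, l, k):
    """Left side minus right side of Eq. (10), using 1/gamma_0 = a_0 + gamma_1."""
    b0, a0 = beta(0, l, k), alpha(0, l, k)
    return 2 * b0 * (b0 + rho) / (lam * a0) + gamma(1, rho, lam, l, k) - 1j


def solve_rho(lam, l, k, iterations=100):
    """Fixed-point iteration for Eq. (10), started at rho = -beta_0 (small lam only)."""
    b0, a0 = beta(0, l, k), alpha(0, l, k)
    rho = complex(-b0, 0.0)
    for _ in range(iterations):
        rho = -b0 + lam * a0 * (1j - gamma(1, rho, lam, l, k)) / (2 * b0)
    return rho


rho = solve_rho(0.1, 1, 1)
print(rho, abs(residual(rho, 0.1, 1, 1)))
```

For $l = k = 1$ and $\lambda = 0.1$ the computed root stays close to $-\beta_0 = -2$ and has negative imaginary part, consistent with Eq. (6).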


In connection with Eq. (4), this result follows from the proof of [3, Theorem 3.1]. It is readily seen that Eqs. (4) and (5) are equivalent. Thus this result with respect to the other spectral problem, Eq. (5), can be shown in the same way.

The purpose of this paper is to prove Theorem 1.3. Assertion (i) is implied by Theorem 1.4. Thus it remains to show Assertions (ii)–(iv). The proof of these assertions is somewhat lengthy and is completed by combining the results of the remaining sections. Section 2 contains preliminary lemmas devoted to the estimation of the algebraic equation, Eq. (10). Assertion (ii) is verified in Sect. 3 based on the study in Sect. 2. Assertion (iii) is shown in Sect. 4 via estimates on the coupled set of algebraic equations, Eq. (8). With the help of Lemma 1.1 and Assertions (i)–(iii), Assertion (iv) becomes an easy problem, which is treated in Sect. 5.

It should be noted that Assertion (ii) was stated in [3, Theorem 3.1] without detailed proof, since the authors considered it an easy problem at the time. In fact, the critical Reynolds numbers $\lambda_{l,k}$ vary sharply and are unbounded when $k$ and $-3k^2 + l^2$ increase. This phenomenon can be observed easily via numerical experiments. Hence it is difficult to show the existence of all Hopf bifurcation values $\lambda_{l,k}$ and, especially, to show Assertion (ii) as $k$ increases. The difficulty arises from the fact that $\rho_{l,k}(\lambda)$ is complex and disjoint from the real axis for all $\lambda > 0$. To overcome this difficulty, we present some technical analysis in estimating the algebraic equations, Eqs. (8–10).

Let us mention that, among other things, Theorem 1.1 has been extended to a three-dimensional Navier–Stokes model (cf. [2]) by developing the approach of this paper, and extra instabilities are found therein.

2. Preliminary Lemmas

The proof of Assertion (ii) is based on the following technical lemmas in estimating Eq. (10) when $\lambda$ is close to infinity.
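In what follows, let $\rho(\lambda) = \rho_{l,k}(\lambda)$ and $(l,k)$ be given in Theorem 1.4, and let $\mu(\lambda) > -\beta_0$ and $\nu(\lambda) < 0$ be the real functions so that $\mu(\lambda) + i\nu(\lambda) = \rho(\lambda)$. Let us begin with an elementary lemma.

Lemma 2.1. The following holds:

$$-\frac{3}{2} < \frac{\nu(\lambda)}{\lambda l} < 0 \quad \text{for } \lambda > 0.$$

Proof. Recalling from Theorem 1.4 that

$$\gamma_n(\lambda) = \cfrac{1}{\cfrac{2\beta_n(\beta_n+\rho(\lambda))}{\lambda\alpha_n} + \cfrac{1}{\cfrac{2\beta_{n+1}(\beta_{n+1}+\rho(\lambda))}{\lambda\alpha_{n+1}} + \ddots}}, \quad n \ge 0,$$

and $\nu(\lambda) < 0$, we see that

$$|\gamma_n(\lambda)| \le \frac{\lambda\alpha_n}{2\beta_n(\beta_n+\mu(\lambda))} \le \frac{\lambda\alpha_n}{2\beta_n(\beta_n-\beta_0)} \to 0 \quad \text{as } n \to \infty, \tag{11}$$

$$\mathrm{Im}\,\gamma_{n+1}(\lambda) = \frac{2\beta_n|\nu(\lambda)|}{\alpha_n\lambda} + \mathrm{Im}\,\frac{1}{\gamma_n(\lambda)}, \quad n \ge 0,$$

$$\phantom{\mathrm{Im}\,\gamma_{n+1}(\lambda)} \ge \frac{2|\nu(\lambda)|}{l\lambda} + \mathrm{Im}\,\frac{1}{\gamma_n(\lambda)}, \quad n \ge 1,$$

and, by Eq. (10),

$$\mathrm{Im}\,\gamma_1(\lambda) = 1 - \frac{2\beta_0|\nu(\lambda)|}{l(3k^2-l^2)\lambda} = 1 - \frac{2(l^2+k^2)|\nu(\lambda)|}{l\lambda(3k^2-l^2)} < 1 - \frac{2|\nu(\lambda)|}{3l\lambda}.$$

If $\nu(\lambda)/(\lambda l) \le -3/2$ for some $\lambda > 0$, we see that $\mathrm{Im}\,\gamma_1(\lambda) < 0$, and thus $\mathrm{Im}\,\gamma_n(\lambda) > 3$ for $n \ge 2$. This contradicts Eq. (11), and completes the proof. □

Without loss of generality, the limits of $\nu(\lambda)/(\lambda l)$ and $\mu(\lambda)$ may be supposed to exist as $\lambda$ tends to infinity. Otherwise, we may reduce our study in this section to a sequence $\{\lambda_n > 0 \mid n \ge 1\}$ instead of the half line $\{\lambda \mid \lambda > 0\}$. Now we adopt the notations:

$$\nu_0 = \lim_{\lambda\to\infty}\frac{2\nu(\lambda)}{\lambda l} \le 0, \qquad \mu_0 = \lim_{\lambda\to\infty}\mu(\lambda) \ge -\beta_0,$$

$$\frac{\kappa_n + i\theta_n}{\lambda} = \frac{\kappa_n(\lambda) + i\theta_n(\lambda)}{\lambda} = \cfrac{1}{\cfrac{2\beta_n(\beta_n+\rho(\lambda))}{\alpha_n\lambda} + \cfrac{1}{\cfrac{2\beta_{n+1}(\beta_{n+1}+\rho(\lambda))}{\alpha_{n+1}\lambda} + \ddots}}, \quad n \ge 0, \tag{12}$$

$$\tau_n + i\sigma_n = \tau_n(\lambda) + i\sigma_n(\lambda) = \frac{\lambda^2}{\kappa_n(\lambda) + i\theta_n(\lambda)} = \frac{2\beta_n(\beta_n+\rho(\lambda))}{\alpha_n} + \kappa_{n+1}(\lambda) + i\theta_{n+1}(\lambda), \quad n \ge 0, \tag{13}$$

where $\kappa_n$, $\theta_n$, $\tau_n$ and $\sigma_n$ are real functions. Obviously, $\kappa_n(\lambda) > 0$, $\tau_n(\lambda) > 0$ for $n \ge 1$ since $\alpha_n > 0$ and $\beta_n + \mathrm{Re}\,\rho > 0$ for $n \ge 1$, and Lemma 2.1 implies that $\nu_0$ is finite.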

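Lemma 2.2. Assume that $\mu_0 < \infty$, and there exists an integer $n \ge 1$ so that $\lim_{\lambda\to\infty}\kappa_n(\lambda) < \infty$. Then

(i)

$$\frac{\beta_n\nu_0}{\beta_n - 4k^2}\lim_{\lambda\to\infty}\frac{\sigma_{n+1}(\lambda)}{\lambda} \ne 1,$$

$$\lim_{\lambda\to\infty}\kappa_n(\lambda) = \frac{\dfrac{2\beta_n(\beta_n+\mu_0)}{\alpha_n}\Big(\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}(\lambda)}{\lambda}\Big)^2 + \dfrac{2\beta_{n+1}(\beta_{n+1}+\mu_0)}{\alpha_{n+1}} + \lim\limits_{\lambda\to\infty}\kappa_{n+2}(\lambda)}{\Big(\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}(\lambda)}{\lambda} - 1\Big)^2}$$

hold whenever

$$\lim_{\lambda\to\infty}\frac{|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|}{\lambda} < \infty,$$

and

(ii)

$$\lim_{\lambda\to\infty}\kappa_n(\lambda) = \frac{\dfrac{2\beta_n(\beta_n+\mu_0)}{\alpha_n} + \lim\limits_{\lambda\to\infty}\kappa_{n+1}(\lambda)}{\Big(\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\Big)^2}$$

holds whenever

$$\lim_{\lambda\to\infty}\frac{|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|}{\lambda} = \infty.$$

Proof. (i) Note that

$$\kappa_n(\lambda) = \mathrm{Re}\,\cfrac{\lambda}{\cfrac{2\beta_n(\beta_n+\rho(\lambda))}{\alpha_n\lambda} + \cfrac{1}{\frac{1}{\lambda}\big(\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)\big)}}$$

$$= \frac{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \dfrac{\tau_{n+1}\lambda^2}{\tau_{n+1}^2+\sigma_{n+1}^2}}{\Big(\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n\lambda} + \dfrac{\tau_{n+1}\lambda}{\tau_{n+1}^2+\sigma_{n+1}^2}\Big)^2 + \Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda} - \dfrac{\sigma_{n+1}\lambda}{\tau_{n+1}^2+\sigma_{n+1}^2}\Big)^2}$$

$$= \cfrac{1}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n\lambda^2} + \dfrac{\tau_{n+1}}{\tau_{n+1}^2+\sigma_{n+1}^2} + \dfrac{\Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda} - \dfrac{\sigma_{n+1}\lambda}{\tau_{n+1}^2+\sigma_{n+1}^2}\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \dfrac{\tau_{n+1}\lambda^2}{\tau_{n+1}^2+\sigma_{n+1}^2}}}.$$

Hence

$$\frac{1}{\kappa_n(\lambda)} - \frac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n\lambda^2} - \frac{\dfrac{\tau_{n+1}}{\tau_{n+1}^2+\sigma_{n+1}^2}\,\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n}}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \dfrac{\tau_{n+1}\lambda^2}{\tau_{n+1}^2+\sigma_{n+1}^2}}$$

$$= \frac{\dfrac{\tau_{n+1}^2\lambda^2}{(\tau_{n+1}^2+\sigma_{n+1}^2)^2} + \Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda} - \dfrac{\sigma_{n+1}\lambda}{\tau_{n+1}^2+\sigma_{n+1}^2}\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \dfrac{\tau_{n+1}\lambda^2}{\tau_{n+1}^2+\sigma_{n+1}^2}}$$

$$= \frac{\Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\Big)^2\dfrac{\tau_{n+1}^2}{\tau_{n+1}^2+\sigma_{n+1}^2} + \dfrac{1}{\tau_{n+1}^2+\sigma_{n+1}^2}\Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\sigma_{n+1} - \lambda\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \dfrac{\tau_{n+1}\lambda^2}{\tau_{n+1}^2+\sigma_{n+1}^2}}$$

$$= \frac{\Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\Big)^2\dfrac{\tau_{n+1}^2}{\lambda^2} + \Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\,\dfrac{\sigma_{n+1}}{\lambda} - 1\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n}\,\dfrac{\tau_{n+1}^2+\sigma_{n+1}^2}{\lambda^2} + \tau_{n+1}}, \tag{14}$$

which is bounded by

$$\frac{1}{\tau_{n+1}}\bigg[\Big(\frac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\Big)^2\frac{\tau_{n+1}^2}{\lambda^2} + \Big(\frac{2\beta_n\nu(\lambda)}{\alpha_n\lambda}\,\frac{\sigma_{n+1}}{\lambda} - 1\Big)^2\bigg].$$

If $\lim_{\lambda\to\infty}\tau_{n+1}(\lambda) = \infty$, this term vanishes as $\lambda \to \infty$. This implies

$$\lim_{\lambda\to\infty}\kappa_n(\lambda) = \infty,$$

which leads to a contradiction. Thus $\tau_{n+1}(\lambda)$ is bounded. Passing to the limit as $\lambda \to \infty$ in Eq. (14), therefore, yields

$$\lim_{\lambda\to\infty}\frac{1}{\kappa_n(\lambda)} = \frac{\Big(\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}(\lambda)}{\lambda} - 1\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu_0)}{\alpha_n}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}^2(\lambda)}{\lambda^2} + \dfrac{2\beta_{n+1}(\beta_{n+1}+\mu_0)}{\alpha_{n+1}} + \lim\limits_{\lambda\to\infty}\kappa_{n+2}(\lambda)},$$

and gives (i).

(ii) Noticing that

$$\lim_{\lambda\to\infty}\frac{|\kappa_{n+1}(\lambda) + i\theta_{n+1}(\lambda)|}{\lambda} = \frac{1}{\lim\limits_{\lambda\to\infty}\dfrac{|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|}{\lambda}} = 0,$$

and

$$\kappa_n(\lambda) = \mathrm{Re}\,\frac{\lambda}{\dfrac{2\beta_n(\beta_n+\rho(\lambda))}{\alpha_n\lambda} + \dfrac{\kappa_{n+1}(\lambda) + i\theta_{n+1}(\lambda)}{\lambda}} = \cfrac{1}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n\lambda^2} + \dfrac{\kappa_{n+1}(\lambda)}{\lambda^2} + \dfrac{\Big(\dfrac{2\beta_n\nu(\lambda)}{\alpha_n\lambda} + \dfrac{\theta_{n+1}(\lambda)}{\lambda}\Big)^2}{\dfrac{2\beta_n(\beta_n+\mu(\lambda))}{\alpha_n} + \kappa_{n+1}(\lambda)}},$$

we find that $\lim_{\lambda\to\infty}\kappa_n(\lambda) = \infty$ whenever $\lim_{\lambda\to\infty}\kappa_{n+1}(\lambda) = \infty$, and

$$\lim_{\lambda\to\infty}\kappa_n(\lambda) = \frac{\dfrac{2\beta_n(\beta_n+\mu_0)}{\alpha_n} + \lim\limits_{\lambda\to\infty}\kappa_{n+1}(\lambda)}{\Big(\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\Big)^2}$$

whenever $\lim_{\lambda\to\infty}\kappa_{n+1}(\lambda) < \infty$. This gives (ii) and thus completes the proof. □

With the use of Lemmas 2.1 and 2.2, we can now estimate Eq. (10) as $\lambda$ tends to infinity.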


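Lemma 2.3. Assume $\mu_0 < \infty$. Then

$$\frac{\beta_0(\beta_0+\mu_0)}{3k^2-l^2} \ge \sum_{n=1}^{\infty}a_n^2\,\frac{\beta_n(\beta_n+\mu_0)}{\beta_n-4k^2}, \tag{15}$$

where, and in what follows,

$$a_0 = 1, \qquad a_1 = 1 + \frac{(l^2+k^2)\nu_0}{3k^2-l^2}, \qquad a_n = a_n(l,k) = (-1)^n\frac{\beta_{n-1}\nu_0}{\beta_{n-1}-4k^2}\,a_{n-1} + a_{n-2} \quad \text{for } n \ge 2. \tag{16}$$

Proof. For an integer $n > 1$, we recall from Eqs. (10,12,13) that

$$\frac{\kappa_1(\lambda) + i\theta_1(\lambda)}{\lambda} = i + \frac{2\beta_0(\beta_0+\rho(\lambda))}{\lambda l(3k^2-l^2)}$$

$$= \cfrac{1}{\cfrac{2\beta_1(\beta_1+\rho(\lambda))}{\lambda\alpha_1} + \cfrac{1}{\ddots + \cfrac{1}{\cfrac{2\beta_n(\beta_n+\rho(\lambda))}{\lambda\alpha_n} + \cfrac{1}{\frac{1}{\lambda}\big(\tau_{n+1} + i\sigma_{n+1}\big)}}}}
= \cfrac{1}{\cfrac{2\beta_1(\beta_1+\rho(\lambda))}{\lambda\alpha_1} + \cfrac{1}{\ddots + \cfrac{1}{\cfrac{2\beta_n(\beta_n+\rho(\lambda))}{\lambda\alpha_n} + \cfrac{\kappa_{n+1} + i\theta_{n+1}}{\lambda}}}}.$$

Since $\mu_0 = \lim_{\lambda\to\infty}\mu(\lambda) < \infty$, passing to the limit as $\lambda \to \infty$ yields

$$\lim_{\lambda\to\infty}\frac{\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)}{\lambda} = (-1)^{n+1}i \quad \text{if } \nu_0 = 0, \tag{17}$$

$$\lim_{\lambda\to\infty}\kappa_1(\lambda) = \frac{2\beta_0(\beta_0+\mu_0)}{l(3k^2-l^2)}, \tag{18}$$

$$1 + \frac{\beta_0\nu_0}{3k^2-l^2} = \cfrac{1}{\cfrac{-\beta_1\nu_0}{\beta_1-4k^2} + \cfrac{1}{\ddots + \cfrac{1}{\cfrac{(-1)^{n-1}\beta_{n-1}\nu_0}{\beta_{n-1}-4k^2}}}} \tag{19}$$

if $\lim_{\lambda\to\infty}|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|/\lambda = 0$ and $\nu_0 < 0$,

$$1 + \frac{\beta_0\nu_0}{3k^2-l^2} = \cfrac{1}{\cfrac{-\beta_1\nu_0}{\beta_1-4k^2} + \cfrac{1}{\ddots + \cfrac{1}{\cfrac{(-1)^{n}\beta_{n}\nu_0}{\beta_{n}-4k^2}}}} \tag{20}$$

if $\lim_{\lambda\to\infty}|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|/\lambda = \infty$ and $\nu_0 < 0$, and

$$\lim_{\lambda\to\infty}\frac{\tau_{n+1}(\lambda)}{\lambda} = 0, \qquad 1 + \frac{\beta_0\nu_0}{3k^2-l^2} = \cfrac{1}{\cfrac{-\beta_1\nu_0}{\beta_1-4k^2} + \cfrac{1}{\ddots + \cfrac{1}{\cfrac{(-1)^{n}\beta_{n}\nu_0}{\beta_{n}-4k^2} + \cfrac{(-1)^{n+1}}{\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}(\lambda)}{\lambda}}}}} \tag{21}$$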

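if $0 < \lim_{\lambda\to\infty}|\tau_{n+1}(\lambda) + i\sigma_{n+1}(\lambda)|/\lambda < \infty$ and $\nu_0 < 0$. We now start to check the desired inequality based on these equations.

If $\nu_0 = 0$, Eq. (17) together with Lemma 2.2 implies

$$\lim_{\lambda\to\infty}\kappa_n(\lambda) = \frac{2\beta_n(\beta_n+\mu_0)}{\alpha_n} + \frac{2\beta_{n+1}(\beta_{n+1}+\mu_0)}{\alpha_{n+1}} + \lim_{\lambda\to\infty}\kappa_{n+2}(\lambda).$$

Combining this and Eq. (18) yields

$$\frac{2\beta_0(\beta_0+\mu_0)}{l(3k^2-l^2)} = \lim_{\lambda\to\infty}\kappa_1(\lambda) \ge \sum_{n=1}^{\infty}\frac{2\beta_n(\beta_n+\mu_0)}{\alpha_n}.$$

If $\nu_0 < 0$, it follows from Eq. (18) and Lemma 2.2 that

$$\frac{\beta_1\nu_0}{\beta_1-4k^2}\lim_{\lambda\to\infty}\frac{\sigma_2(\lambda)}{\lambda} - 1 \ne 0.$$

Additionally, Eq. (18) and Lemma 2.2 yield

$$\lim_{\lambda\to\infty}\kappa_3(\lambda) < \infty, \quad \text{when } \lim_{\lambda\to\infty}\frac{|\tau_2(\lambda) + i\sigma_2(\lambda)|}{\lambda} < \infty,$$

and

$$\lim_{\lambda\to\infty}\kappa_2(\lambda) < \infty, \quad \text{when } \lim_{\lambda\to\infty}\frac{|\tau_2(\lambda) + i\sigma_2(\lambda)|}{\lambda} = \infty.$$

Thus for $\lim_{\lambda\to\infty}|\tau_2(\lambda) + i\sigma_2(\lambda)|/\lambda < \infty$, from Eqs. (19) and (21) we find that

$$\frac{a_2}{a_0} = a_2 = 1 + \frac{\beta_1\nu_0}{\beta_1-4k^2}\,a_1 = \cfrac{1}{-\cfrac{\beta_1\nu_0}{\beta_1-4k^2}\lim\limits_{\lambda\to\infty}\cfrac{\sigma_2(\lambda)}{\lambda} + 1} \ne 0$$

and

$$\frac{a_1}{a_0} = a_1 = 1 + \frac{\beta_0\nu_0}{3k^2-l^2} = a_2\lim_{\lambda\to\infty}\frac{\sigma_2(\lambda)}{\lambda},$$

and hence, by Eq. (18) and Lemma 2.2, we have

$$\frac{2\beta_0(\beta_0+\mu_0)}{l(3k^2-l^2)} = \lim_{\lambda\to\infty}\kappa_1(\lambda) = \frac{\dfrac{2\beta_1(\beta_1+\mu_0)}{\alpha_1}\Big(\lim\limits_{\lambda\to\infty}\dfrac{\sigma_2(\lambda)}{\lambda}\Big)^2 + \dfrac{2\beta_2(\beta_2+\mu_0)}{\alpha_2} + \lim\limits_{\lambda\to\infty}\kappa_3(\lambda)}{\Big(\dfrac{\beta_1\nu_0}{\beta_1-4k^2}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_2(\lambda)}{\lambda} - 1\Big)^2}$$

$$= a_1^2\,\frac{2\beta_1(\beta_1+\mu_0)}{\alpha_1} + a_2^2\,\frac{2\beta_2(\beta_2+\mu_0)}{\alpha_2} + a_2^2\lim_{\lambda\to\infty}\kappa_3(\lambda).$$

For the other case, $\lim_{\lambda\to\infty}|\tau_2(\lambda) + i\sigma_2(\lambda)|/\lambda = \infty$, from Eq. (20) and the fact $\nu_0 < 0$ it follows that

$$\frac{a_1}{a_0} = a_1 = 1 + \frac{\beta_0\nu_0}{3k^2-l^2} = \cfrac{1}{-\cfrac{\beta_1\nu_0}{\beta_1-4k^2}} \ne 0.$$

Using Eq. (18) and Lemma 2.2 again, we have

$$\frac{2\beta_0(\beta_0+\mu_0)}{l(3k^2-l^2)} = \lim_{\lambda\to\infty}\kappa_1(\lambda) = \frac{\dfrac{2\beta_1(\beta_1+\mu_0)}{\alpha_1} + \lim\limits_{\lambda\to\infty}\kappa_2(\lambda)}{\Big(\dfrac{\beta_1\nu_0}{\beta_1-4k^2}\Big)^2} = a_1^2\,\frac{2\beta_1(\beta_1+\mu_0)}{\alpha_1} + a_1^2\lim_{\lambda\to\infty}\kappa_2(\lambda).$$

To estimate $\kappa_n$ for all $n$, we may suppose that

$$a_{n-1} \ne 0 \quad \text{and} \quad \lim_{\lambda\to\infty}\kappa_n(\lambda) < \infty$$

for an integer $n \ge 2$. Applying an elementary calculation to Eqs. (19–21), we have

$$\frac{a_n}{a_{n-1}} = \frac{\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}}{\lambda}}{(-1)^n\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}}{\lambda} + (-1)^{n+1}},$$

$$\frac{a_{n+1}^2}{a_{n-1}^2} = \frac{1}{\Big((-1)^n\dfrac{\beta_n\nu_0}{\beta_n-4k^2}\lim\limits_{\lambda\to\infty}\dfrac{\sigma_{n+1}}{\lambda} + (-1)^{n+1}\Big)^2} \ne 0,$$

whenever $\lim_{\lambda\to\infty}|\tau_{n+1} + i\sigma_{n+1}|/\lambda < \infty$, and

$$\frac{a_n}{a_{n-1}} = \cfrac{1}{(-1)^n\cfrac{\beta_n\nu_0}{\beta_n-4k^2}} \ne 0,$$

whenever $\lim_{\lambda\to\infty}|\tau_{n+1} + i\sigma_{n+1}|/\lambda = \infty$. Thus $a_n^2 + a_{n+1}^2 > 0$. By Lemma 2.2, we have

$$a_{n-1}^2\lim_{\lambda\to\infty}\kappa_n(\lambda) = a_n^2\,\frac{2\beta_n(\beta_n+\mu_0)}{\alpha_n} + a_{n+1}^2\,\frac{2\beta_{n+1}(\beta_{n+1}+\mu_0)}{\alpha_{n+1}} + a_{n+1}^2\lim_{\lambda\to\infty}\kappa_{n+2}(\lambda)$$

when $a_{n+1} \ne 0$, and

$$a_{n-1}^2\lim_{\lambda\to\infty}\kappa_n(\lambda) = a_n^2\,\frac{2\beta_n(\beta_n+\mu_0)}{\alpha_n} + a_n^2\lim_{\lambda\to\infty}\kappa_{n+1}(\lambda) \quad \text{when } a_{n+1} = 0.$$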

Hence the desired assertion follows by induction. The proof is complete. □

Assertion (ii) of Theorem 1.3 will be deduced in the next section via careful analysis of Eq. (15). Of course, this essentially depends on the estimation of $a_n$ defined by Eq. (16). For convenience, we first set

$$r = \frac{k^2}{3k^2-l^2}, \qquad b_n = \frac{\beta_n}{\beta_n-4k^2}, \quad n \ge 1, \tag{22}$$

which gives

$$b_n = \frac{(2n+1)^2r + 3r - 1}{(2n+1)^2r - r - 1} > 0, \qquad a_1 = 1 + (4r-1)\nu_0, \tag{23}$$

$$a_n = (-1)^n b_{n-1}\nu_0 a_{n-1} + a_{n-2}, \quad n \ge 2. \tag{24}$$

Furthermore, the choice of the integer $l$ with $l = 1, \dots, [\sqrt{3}k]$ is now implied in the condition $\frac{1}{3} < r \le k^2$. With these observations in mind, we can now estimate $a_n$ when $2 < r \le k^2$.

Lemma 2.4. Let $2 < r \le k^2$, $a_1 < 0$ and $n \ge 1$. Then

$$-|a_1| \le a_{2n-1} \le -|a_1| + \Big(n - \frac{2}{3}\Big)|\nu_0| + \frac{1}{2}|a_1|(n^2+n)\nu_0^2,$$

provided that $a_{2n'} \ge 0$ for $n' = 0, \dots, n-1$.

Proof. Let us begin with the proof of the following estimates:

$$-|a_1| \le a_{2n-1} \le -|a_1| + |\nu_0|\sum_{m=1}^{n-1}b_{2m} + |a_1|\nu_0^2\sum_{m=1}^{n-1}\sum_{m'=1}^{m}b_{2m'-1}b_{2m}, \tag{25}$$

$$a_{2n} \le 1 + |a_1||\nu_0|\sum_{m=1}^{n}b_{2m-1}. \tag{26}$$

We argue by induction. When $n = 1$, these estimates are obviously valid. Assume that Eqs. (25–26) hold true for any $n' < n$. By (24),

$$a_{2n-1} = a_{2n-3} + a_{2n-2}b_{2n-2}|\nu_0|$$

$$\le -|a_1| + |\nu_0|\sum_{m=1}^{n-2}b_{2m} + |a_1|\nu_0^2\sum_{m=1}^{n-2}\sum_{m'=1}^{m}b_{2m'-1}b_{2m} + |\nu_0|\Big(1 + |a_1||\nu_0|\sum_{m=1}^{n-1}b_{2m-1}\Big)b_{2n-2}$$

$$= -|a_1| + |\nu_0|\sum_{m=1}^{n-1}b_{2m} + |a_1|\nu_0^2\sum_{m=1}^{n-1}\sum_{m'=1}^{m}b_{2m'-1}b_{2m},$$

$$a_{2n-1} = a_{2n-3} + a_{2n-2}b_{2n-2}|\nu_0| > a_{2n-3} \ge -|a_1|, \quad \text{since } a_{2n-2} \ge 0,$$

$$a_{2n} = a_{2n-2} - a_{2n-1}b_{2n-1}|\nu_0| \le 1 + |a_1||\nu_0|\sum_{m=1}^{n-1}b_{2m-1} + |a_1||\nu_0|b_{2n-1} = 1 + |a_1||\nu_0|\sum_{m=1}^{n}b_{2m-1}.$$

We thus obtain Eqs. (25–26). Furthermore, we note that

$$\sum_{m=1}^{n-1}b_{2m} = \sum_{m=1}^{n-1}\frac{(4m+1)^2r + 3r - 1}{(4m+1)^2r - r - 1} \le n - 1 + \sum_{m=1}^{n-1}\frac{4}{(4m+1)^2 - 2} \le n - 1 + \sum_{m=1}^{\infty}\Big(\frac{1}{4m-1} - \frac{1}{4(m+1)-1}\Big) = n - \frac{2}{3},$$

$$\sum_{m=1}^{n-1}b_{2m}\sum_{m'=1}^{m}b_{2m'-1} = \sum_{m=1}^{n-1}b_{2m}\sum_{m'=1}^{m}\frac{(4m'-1)^2r + 3r - 1}{(4m'-1)^2r - r - 1}$$

$$\le \sum_{m=1}^{n-1}b_{2m}\sum_{m'=1}^{m}\frac{(4m'-1)^2 + 2}{(4m'-1)^2 - 2}$$

$$= \sum_{m=1}^{n-1}b_{2m}\Big(m + \frac{4}{7} + \sum_{m'=2}^{m}\frac{4}{(4m'-1)^2 - 2}\Big)$$

$$\le \sum_{m=1}^{n-1}\Big(m + \frac{4}{5}\Big)b_{2m}$$

$$\le \frac{n(n-1)}{2} + \frac{4}{5}\Big(n - \frac{2}{3}\Big) + \sum_{m=1}^{n-1}\frac{4m}{(4m+1)^2 - 2}$$

$$\le \frac{n(n-1)}{2} + \frac{4}{5}n + \sum_{m=1}^{n-1}\frac{1}{4m+1} \le \frac{1}{2}(n^2+n),$$

and thus the desired assertion follows from Eq. (25). The proof is complete. □

Lemma 2.5. Let $2 < r \le k^2$. Then

$$a_{2n} > 1 - \frac{n}{4r-1}, \quad \text{when } 0 > a_1 \ge -1 \text{ and } 1 \le n \le \frac{1}{2}(4r-1), \tag{27}$$

$$a_{2n} > 1, \quad \text{when } -1 > a_1 > -\frac{1}{2}\sqrt{4r-1} \text{ and } 1 \le n \le \frac{3}{4}\sqrt{4r-1}. \tag{28}$$
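The bounds (27) and (28) are easy to probe numerically from the recursion (23)–(24). The sketch below is only a spot check for two sample choices of $(r, a_1)$ of our own; here $\nu_0$ is treated as a free parameter recovered from $a_1 = 1 + (4r-1)\nu_0$, and the function names are hypothetical.

```python
import math


def b(n, r):
    """b_n from Eq. (23)."""
    q = (2 * n + 1) ** 2 * r
    return (q + 3 * r - 1) / (q - r - 1)


def nu0_from_a1(r, a1):
    """Invert a_1 = 1 + (4r - 1) nu_0 from Eq. (23)."""
    return (a1 - 1.0) / (4 * r - 1)


def a_sequence(r, nu0, n_max):
    """Return [a_0, ..., a_{n_max}] computed from Eqs. (23)-(24)."""
    a = [1.0, 1.0 + (4 * r - 1) * nu0]
    for n in range(2, n_max + 1):
        a.append((-1) ** n * b(n - 1, r) * nu0 * a[n - 1] + a[n - 2])
    return a


r = 3.0

# Case (27): 0 > a_1 >= -1 and 1 <= n <= (4r - 1)/2.
a = a_sequence(r, nu0_from_a1(r, -0.5), 10)
for n in range(1, int((4 * r - 1) / 2) + 1):
    print(n, a[2 * n], 1 - n / (4 * r - 1))

# Case (28): -1 > a_1 > -sqrt(4r - 1)/2 and 1 <= n <= (3/4) sqrt(4r - 1).
a_bis = a_sequence(r, nu0_from_a1(r, -1.2), 4)
for n in range(1, int(0.75 * math.sqrt(4 * r - 1)) + 1):
    print(n, a_bis[2 * n])
```

In both samples the even-indexed terms stay above the thresholds stated in Lemma 2.5, while the odd-indexed terms start negative and increase towards zero.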

Proof. First, we prove Eq. (27). In order to use Lemma 2.4, we note $a_0 = 1 > 0$ and $a_2 = 1 + b_1|a_1\nu_0| > 1$. Thus we may suppose that $n \ge 2$ is an integer satisfying $2n \le 4r - 1$ and $a_{2n'} \ge 0$ for $n' = 1, \dots, n-1$. Recalling that $a_1 = 1 + (4r-1)\nu_0$, we have for $m = 1, \dots, n$,

$$\frac{1}{2}m|a_1||\nu_0| = m\,\frac{|a_1|(|a_1|+1)}{2(4r-1)} \le \frac{m}{4r-1} \le \frac{1}{2},$$

and so, by Lemma 2.4,

$$a_{2m-1} \le -|a_1| + m|\nu_0| + \frac{1}{2}|a_1|m^2\nu_0^2. \tag{29}$$

On the other hand, we see that

$$z_0 = \frac{2|a_1| + 1}{\sqrt{1 + |a_1|(2|a_1|+1)} + 1}$$

is the positive solution to the algebraic equation

$$-|a_1| + z + \frac{1}{2}|a_1|z^2 = \frac{1}{2},$$

and the condition $2m \le 4r - 1$ becomes

$$m|\nu_0| = m\,\frac{1 + |a_1|}{4r-1} \le \frac{1 + |a_1|}{2} \le \frac{2|a_1| + 1}{\sqrt{1 + |a_1|(2|a_1|+1)} + 1} = z_0.$$

Combining this with Eq. (29) gives $a_{2m-1} \le 1/2$. This together with Eq. (24) yields, by induction,

$$a_{2m} = a_{2m-2} - a_{2m-1}|\nu_0|b_{2m-1} \ge a_{2m-2} - \frac{1}{2}|\nu_0|b_{2m-1} \ge a_2 - \frac{1}{2}|\nu_0|\sum_{m'=2}^{m}b_{2m'-1}$$

$$= 1 + |a_1\nu_0|b_1 - \frac{1}{2}|\nu_0|\sum_{m'=2}^{m}b_{2m'-1} \ge 1 - \frac{1}{2}|\nu_0|\sum_{m'=2}^{m}\frac{(4m'-1)^2 + 2}{(4m'-1)^2 - 2} \ge 1 - \frac{1}{2}m|\nu_0|$$

$$= 1 - \frac{m}{2}\,\frac{1 + |a_1|}{4r-1} > 1 - \frac{m}{4r-1} > 0$$

for $m = 1, \dots, n$. Thus applying this argument repeatedly, we obtain Eq. (27).

Next, we prove Eq. (28) when $-1 \ge a_1 \ge -3/2$ and $n \le (3/4)\sqrt{4r-1}$. Similar to the proof of Eq. (27), we may suppose that $n \ge 2$ is an integer such that $n \le (3/4)\sqrt{4r-1}$ and $a_{2n'} \ge 0$ for $n' = 1, \dots, n-1$. Since $r > 2$, we see that for $m = 1, \dots, n$,

$$\frac{1}{2}m|a_1||\nu_0| = \frac{m|a_1|(|a_1|+1)}{2(4r-1)} \le \frac{3m(|a_1|+1)}{10\sqrt{4r-1}} \le \frac{3m}{4\sqrt{4r-1}} \le \frac{2}{3}.$$

This together with Lemma 2.4 implies

$$a_{2m-1} \le -|a_1| + m|\nu_0| + \frac{1}{2}|a_1|m^2\nu_0^2. \tag{30}$$

Additionally, observing

$$-1 \ge a_1 \ge -\frac{3}{2}, \quad r > 2, \quad \text{and} \quad m \le \frac{3}{4}\sqrt{4r-1},$$

we have

$$m|\nu_0| = m\,\frac{|a_1| + 1}{4r-1} \le \frac{3(|a_1|+1)}{4\sqrt{4r-1}} \le \frac{\sqrt{1 + 2a_1^2} - 1}{|a_1|},$$

for which we denote by $z_0$ the positive solution of the algebraic equation

$$-|a_1| + z + \frac{1}{2}|a_1|z^2 = 0.$$

Consequently, from Eq. (30) it follows that $a_{2m-1} \le 0$, and so, by Eq. (24),

$$a_{2m} = a_{2m-2} + |a_{2m-1}||\nu_0|b_{2m-1} \ge a_{2m-2}.$$

By induction, this yields

$$a_{2m} \ge a_2 = 1 + |a_1||\nu_0|b_1 > 1$$

for $m = 1, \dots, n$. Thus we reach Eq. (28) with $-1 \ge a_1 \ge -3/2$ by following the above argument repeatedly.

Finally, we prove Eq. (28) with $-3/2 > a_1 > -\sqrt{4r-1}/2$. We see

$$\frac{1}{2}|a_1||\nu_0| = \frac{|a_1|(|a_1|+1)}{2(4r-1)} \le \frac{|a_1| + 1}{4\sqrt{4r-1}} \le \frac{1}{4}.$$

Combining this with Lemma 2.4 gives

$$a_{2m-1} \le -|a_1| + \frac{5}{4}m|\nu_0| + \frac{1}{2}|a_1|m^2\nu_0^2, \quad m = 1, \dots, n,$$

where the integer $n$ is supposed to satisfy the condition

$$a_{2n'} \ge 0 \text{ for } n' = 1, \dots, n-1 \quad \text{and} \quad n < \frac{3}{4}\sqrt{4r-1}.$$

Hence $a_{2m-1} \le 0$ is valid if the inequality

$$-|a_1| + \frac{5}{4}m|\nu_0| + \frac{1}{2}|a_1|m^2\nu_0^2 \le 0$$

holds or

$$m|\nu_0| \le \frac{\sqrt{(5/4)^2 + 2a_1^2} - 5/4}{|a_1|}.$$

Indeed, we see

$$m|\nu_0| = \frac{m(|a_1|+1)}{4r-1} \le \frac{3(|a_1|+1)}{4\sqrt{4r-1}} \le \frac{3(|a_1|+1)}{8|a_1|} \le \frac{\sqrt{(5/4)^2 + 2a_1^2} - 5/4}{|a_1|},$$

where use is made of the condition

$$m \le \frac{3}{4}\sqrt{4r-1} \quad \text{and} \quad -\frac{3}{2} \ge a_1 \ge -\frac{1}{2}\sqrt{4r-1}.$$

Hence, by Eq. (24) and the fact $a_{2m-1} \le 0$,

$$a_{2m} = a_{2m-2} + |a_{2m-1}||\nu_0|b_{2m-1} > a_{2m-2} > a_2 > 1, \quad m = 1, \dots, n.$$

By induction, this implies the desired assertion. The proof is complete. □

3. Proof of Theorem 1.3 (ii)

With the aid of the lemmas in the preceding section, we can now carry out the proof of Assertion (ii), that is, $\mu_0 = \lim_{\lambda\to\infty}\mathrm{Re}\,\rho(\lambda) > 0$. We argue by contradiction. If $\mu_0 \le 0$ holds, it follows from Lemma 2.3 that

$$\frac{(l^2+k^2)^2}{3k^2-l^2} = \frac{\beta_0^2}{3k^2-l^2} \ge \sum_{n=1}^{\infty}\frac{a_n^2\beta_n^2}{\beta_n-4k^2} = \sum_{n=1}^{\infty}\frac{a_n^2\,(l^2 + (2nk+k)^2)^2}{l^2 + (2nk+k)^2 - 4k^2}.$$

This becomes, by Eqs. (22–23),

$$(4r-1)^2 \ge \sum_{n=1}^{\infty}\frac{((2n+1)^2r + 3r - 1)^2}{(2n+1)^2r - r - 1}\,a_n^2 = \frac{(12r-1)^2}{8r-1}a_1^2 + \frac{(28r-1)^2}{24r-1}a_2^2 + \frac{(52r-1)^2}{48r-1}a_3^2 + \cdots, \tag{31}$$

and the condition $l = 1, \dots, [\sqrt{3}k]$ is implied in the condition $1/3 < r \le k^2$. Consequently, it suffices to give the contradiction that the right-hand side of Eq. (31) is strictly greater than $(4r-1)^2$ for $1/3 < r \le k^2$. Indeed, this will be achieved in the following by considering $r$ in the interval $(1/3, k^2]$ through four steps.
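The contradiction strategy above amounts to showing that partial sums of the right-hand side of Eq. (31) already exceed $(4r-1)^2$. A minimal sketch of that comparison is given below; it treats $\nu_0$ as a free input (in the paper it is determined by the spectral problem), and the function name and sample values are assumptions made for illustration.

```python
def rhs_31_partial(r, nu0, terms):
    """Partial sum (n = 1..terms) of the right-hand side of Eq. (31).

    The coefficients a_n follow Eqs. (23)-(24); every summand is non-negative,
    so any partial sum is a lower bound for the full series.
    """
    def b(n):
        q = (2 * n + 1) ** 2 * r
        return (q + 3 * r - 1) / (q - r - 1)

    a_prev, a_curr = 1.0, 1.0 + (4 * r - 1) * nu0   # a_0, a_1
    total = 0.0
    for n in range(1, terms + 1):
        q = (2 * n + 1) ** 2 * r
        total += (q + 3 * r - 1) ** 2 / (q - r - 1) * a_curr ** 2
        # advance the recursion: a_{n+1} = (-1)^{n+1} b_n nu0 a_n + a_{n-1}
        a_prev, a_curr = a_curr, (-1) ** (n + 1) * b(n) * nu0 * a_curr + a_prev
    return total


r = 3.0
nu0 = -1.5 / 11.0           # hypothetical value giving a_1 = -0.5
print(rhs_31_partial(r, nu0, 2), (4 * r - 1) ** 2)
```

For this sample the first two terms of the series already exceed $(4r-1)^2 = 121$, so the inequality (31) fails, which is the kind of contradiction the four steps establish for all admissible $r$ and $\nu_0$.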

Step 1. The case $1/3 < r \le 3/8$. First, when $a_2 \le 0$, we see that $a_1 > 0$ and thus

$$a_2 = 1 + (1 - (4r-1)|\nu_0|)\,\frac{12r-1}{8r-1}\,\nu_0 \ge 1 + \frac{9}{5}(1 + (4r-1)\nu_0)\nu_0.$$

This gives

$$\nu_0 > \frac{-3 - \sqrt{9 - 20(4r-1)}}{6(4r-1)} \quad \text{and} \quad a_1 = 1 + (4r-1)\nu_0 > \frac{1}{2} - \frac{\sqrt{21}}{18},$$

which together with Eq. (31) leads to the contradiction

$$(4r-1)^2 \ge \frac{3}{2}(12r-1)a_1^2 > \frac{3}{2}(12r-1)\Big(\frac{1}{2} - \frac{\sqrt{21}}{18}\Big)^2 > \frac{9}{100}(12r-1) > (4r-1)^2.$$

Next, when $a_2 \ge 0$ and $a_1 \le 0$, we see that $a_2 \ge 1$. Hence, by Eq. (31),

$$(4r-1)^2 \ge \frac{7}{6}(28r-1)a_2^2 > \frac{7}{6}(28r-1) > (4r-1)^2. \tag{32}$$

Finally, when $a_2 \ge 0$ and $a_1 > 0$, we see that $a_3^2 = (|\nu_0|b_2a_2 + a_1)^2 > a_1^2$, and, by Eq. (31),

$$(4r-1)^2 > \frac{3}{2}(12r-1)a_1^2 + \frac{13}{12}(52r-1)a_3^2 > (74r-3)a_1^2.$$

This yields

$$a_1 = 1 + (4r-1)\nu_0 \le \frac{4r-1}{\sqrt{74r-3}},$$

that is, $\nu_0 \le \nu_0'$ with

$$\nu_0' = -\frac{1}{4r-1} + \frac{1}{\sqrt{74r-3}}.$$

Consequently, we readily see that

$$a_2 = 1 + (1 + (4r-1)\nu_0)\,\frac{12r-1}{8r-1}\,\nu_0 \ge 1 + (1 + (4r-1)\nu_0')\,\frac{12r-1}{8r-1}\,\nu_0' \ge 1 - \frac{12r-1}{8r-1}\Big(\frac{1}{\sqrt{74r-3}} - \frac{4r-1}{74r-3}\Big) \ge 1 - \frac{9}{5\sqrt{74r-3}}, \tag{33}$$

and hence, by Eq. (31),

$$(4r-1)^2 > \frac{7}{6}(28r-1)a_2^2 > \Big(1 - \frac{9}{5\sqrt{74r-3}}\Big)^2(28r-1) > \frac{2}{5}(28r-1) > (4r-1)^2.$$

Thus µ0 > 0 holds true in the case 1/3 < r ≤ 3/8. Step 2. The case 3/8 < r ≤ 2. If a1 ≤ 0, we see a2 > 1. Thus Eq. (32) follows from Eq. (31). If a1 > 0, we also have Eq. (33), or 1 4r − 1 12r − 1 , when r > 3/8. − a2 > 1 − √ 8r − 1 74r − 3 74r − 3 This gives 7 3 3 , when < r < , a2 > 1 − √ 8 2 4 74r − 3 17 5 1 17 1 3 4r − 1 >1− , when ≤ r ≤ 2. − a2 > 1 − √ √ − 11 66 18 2 74r − 3 74r − 3 3 Taking this observation and Eq. (31) into account, we have (4r − 1)2 >

7 49 3 (28r − 1)a22 > (28r − 1) > (4r − 1)2 , when ≤ r < 1, 6 100 8

(4r − 1)2 >

7 7 3 (28r − 1)a22 > (28r − 1) > (4r − 1)2 , when 1 ≤ r < , 6 10 2

(4r − 1)2 >

7 99 3 (28r − 1)a22 > (28r − 1) > (4r − 1)2 , when ≤ r ≤ 2. 6 100 2

Hence µ0 > 0 holds true for 3/8 < r ≤ 2.


Step 3. The case a1 ≥ 0 and 2 < r ≤ k 2 . To begin with, we see that if there exists an integer n satisfying a2n ≥ 1 −

n 4r − 1 4r − 1 and ≥n> − 1. 4r − 1 2 2

(34)

Since r ≥ 2, Eq. (31) gives 2 ((4n + 1)2 r + 3r − 1) ≥ (4r − 1)2 ≥ a2n

1 (2(4r − 1) − 3)2 2 > (4r − 1)2 . 4

This leads to a contradiction and yields the assertion µ0 > 0. Thus it remains to prove Eq. (34). Let us note that 0 ≤ a1 = 1 − (4r − 1)|ν0 | ≤ 1 and 1 ≥ a2 = 1 − (1 + (4r − 1)ν0 )

12r − 1 |ν0 | 8r − 1

1 12r − 1 8r − 1 4(4r − 1) 2 > 0. ≥ 1− 5(4r − 1)

≥ 1−

By induction, for an integer n ≤ 4r − 1, if the estimates holds true 0 ≤ a2n−3 ≤ 1 − (4r − 1)|ν0 | +

n−2 X (4m + 1)2 r + 3r − 1 |ν0 | ≤ 1, (4m + 1)2 r − r − 1

(35)

m=1

1 ≥ a2n−2

1 ≥1− 4r − 1

n−2

2 X (4m + 3)2 r + 3r − 1 + 5 (4m + 3)2 r − r − 1

! ≥ 0,

(36)

m=1

we have (4n − 3)2 r + 3r − 1 |ν0 |a2n−2 ≥ a2n−3 ≥ 0, (4n − 3)2 r − r − 1 (4n − 3)2 r + 3r − 1 ≤ a2n−3 + |ν0 | (4n − 3)2 r − r − 1 n−1 X (4m + 1)2 r + 3r − 1 |ν0 | ≤ 1 − (4r − 1)|ν0 | + (4m + 1)2 r − r − 1

a2n−1 = a2n−3 + a2n−1

m=1

n−1 X (4m + 1)2 + 2 (4m + 1)2 − 2 m=1 ∞ X 1 1 − |ν0 | ≤ 1 − (4r − 1)|ν0 | + (n − 1)|ν0 | + 4m − 1 4(m + 1) − 1

≤ 1 − (4r − 1)|ν0 | + |ν0 |

≤ 1 − (4r − 1)|ν0 | + n|ν0 | ≤ 1,

m=1

Periodic Navier–Stokes Flows

101

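and

$$
\begin{aligned}
a_{2n} &= a_{2n-2} - \frac{(4n-1)^2 r + 3r - 1}{(4n-1)^2 r - r - 1}\,|\nu_0|\,a_{2n-1} \le a_{2n-2} \le 1,\\
a_{2n} &\ge a_{2n-2} - \frac{(4n-1)^2 r + 3r - 1}{(4n-1)^2 r - r - 1}\,|\nu_0|\\
&\ge 1 - \frac{1}{4r-1}\left(\frac{2}{5} + \sum_{m=1}^{n-2} \frac{(4m+3)^2 r + 3r - 1}{(4m+3)^2 r - r - 1}\right) - \frac{(4n-1)^2 r + 3r - 1}{(4n-1)^2 r - r - 1}\,|\nu_0|\\
&\ge 1 - \frac{1}{4r-1}\left(\frac{2}{5} + \sum_{m=1}^{n-1} \frac{(4m+3)^2 r + 3r - 1}{(4m+3)^2 r - r - 1}\right)\\
&\ge 1 - \frac{1}{4r-1}\left(n - \frac{3}{5} + 4\sum_{m=1}^{n-1} \frac{1}{(4m+3)^2 - 2}\right)\\
&\ge 1 - \frac{n}{4r-1} \ge 0.
\end{aligned}
$$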

This shows Eqs. (35–36) with n − 1 replaced by n. In particular, we obtain the bound

$$ a_{2n} > 1 - \frac{n}{4r-1}, \quad\text{whenever } n \le 4r - 1, $$

which gives Eq. (34) by choosing the integer n0 such that (4r − 1)/2 − 1 < n0 ≤ (4r − 1)/2.

Step 4. The case a1 < 0 and 2 < r ≤ k². When 0 ≥ a1 ≥ −1, we may choose an integer n0 so that 2n0 < 4r − 1 ≤ 2n0 + 2. By Lemma 2.5 and Eq. (31), we have $a_{2n_0} > 1/2$ and

$$ (4r-1)^2 \ge a_{2n_0}^2 (4n_0+1)^2 r \ge \frac{r}{4}\,\bigl(2(4r-1)-3\bigr)^2 > (4r-1)^2. $$

When $-1 \ge a_1 > -(1/2)\sqrt{4r-1}$, we may choose an integer n0 so that

$$ n_0 \le \frac{3}{4}\sqrt{4r-1} < n_0 + 1. $$

By Lemma 2.5 and Eq. (31), we have $a_{2n_0} > 1$ and

$$ (4r-1)^2 \ge a_{2n_0}^2\left((4n_0+1)^2 r + 3r - 1\right) > 9r\bigl(\sqrt{4r-1} - 1\bigr)^2 + 3r - 1 > (4r-1)^2. $$

Finally, when $a_1 \le -(1/2)\sqrt{4r-1}$, it follows from Eq. (31) that

$$ (4r-1)^2 > \frac{3}{2}\,a_1^2\,(12r-1) > \frac{3}{8}\,(4r-1)(12r-1) > (4r-1)^2. $$

Thus Eq. (31) is not valid for µ0 ≤ 0 and r ∈ (1/3, k²]. Hence we conclude that µ0 > 0 for l = 1, . . . , [√3 k]. The proof is complete.
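The elementary estimates used in Step 3 can be checked with exact rational arithmetic. The following is only a sketch under our reading of the formulas: the telescoping series bounding the sum in Eq. (35), the series bounding the sum in Eq. (36), and the closing inequality (1/4)(2(4r − 1) − 3)² · 2 > (4r − 1)² for r ≥ 2. The function names are hypothetical and not part of the original argument.

```python
from fractions import Fraction


def ratio(m, r):
    """Coefficient ((4m+1)^2 r + 3r - 1) / ((4m+1)^2 r - r - 1) from Eq. (35)."""
    r = Fraction(r)
    s = (4 * m + 1) ** 2
    return (s * r + 3 * r - 1) / (s * r - r - 1)


def partial_sum(n, r):
    """Sum of the coefficients for m = 1, ..., n-1 (empty sum for n = 1)."""
    return sum((ratio(m, r) for m in range(1, n)), Fraction(0))


def telescoping_tail(terms):
    """Partial sum of sum_m (1/(4m-1) - 1/(4(m+1)-1)); the full series equals 1/3."""
    return sum((Fraction(1, 4 * m - 1) - Fraction(1, 4 * m + 3)
                for m in range(1, terms + 1)), Fraction(0))


def four_times_tail(terms):
    """Partial sum of 4 * sum_m 1/((4m+3)^2 - 2), which must stay below 3/5."""
    return 4 * sum((Fraction(1, (4 * m + 3) ** 2 - 2)
                    for m in range(1, terms + 1)), Fraction(0))


def closing_gap(r):
    """(1/4)(2(4r-1)-3)^2 * 2 - (4r-1)^2, expected to be positive for r >= 2."""
    x = 4 * Fraction(r) - 1
    return Fraction(1, 4) * (2 * x - 3) ** 2 * 2 - x ** 2
```

For instance, `partial_sum(n, r)` stays below n − 1 + 1/3 for every n ≤ 4r − 1 when r ≥ 2, which is the step that turns the sum in Eq. (35) into the bound n|ν0|.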

4. Proof of Theorem 1.3 (iii)

Let us begin with the proof of the smoothness and uniqueness. By Theorem 1.4, we see that the coefficients $\{\xi_n\}_{n\in\mathbb{Z}}$ of the eigenfunctions from the spectral problems Eqs. (4-5) satisfy the coupled set of algebraic equations

$$ 2\beta_n(\beta_n + \rho(\lambda))\xi_n + \lambda\alpha_{n-1}\xi_{n-1} - \lambda\alpha_{n+1}\xi_{n+1} = 0, \quad n \in \mathbb{Z}. \tag{37} $$

Multiplying this nth equation by $(-1)^n\alpha_n\xi_n/2$ and summing the resultant equations, we have

$$ \sum_{n=-\infty}^{\infty} (-1)^n \beta_n(\beta_n + \rho)\alpha_n\xi_n^2 + \lambda \sum_{n=-\infty}^{\infty} (-1)^{n+1}\alpha_n\alpha_{n+1}\xi_n\xi_{n+1} = 0. \tag{38} $$

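Denote by F(ρ(λ), λ) the left-hand side of this equation. Recalling from Eq. (6) that

$$ \lim_{\lambda\to 0} \rho(\lambda) = -l^2 - k^2 = -\beta_0, $$

we see that by the implicit function theorem ρ(λ) is a smooth and unique solution of Eq. (37), if ∂F/∂ρ ≠ 0 holds true. Indeed, by Eqs. (9) and (37),

$$
\begin{aligned}
\frac{\partial F}{\partial\rho} &= \sum_{n=-\infty}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2 + \sum_{n=-\infty}^{\infty} (-1)^n 2\beta_n(\beta_n+\rho)\alpha_n\xi_n\frac{\partial\xi_n}{\partial\rho} + \lambda\sum_{n=-\infty}^{\infty} (-1)^{n+1}\alpha_n\alpha_{n+1}\left(\xi_{n+1}\frac{\partial\xi_n}{\partial\rho} + \xi_n\frac{\partial\xi_{n+1}}{\partial\rho}\right)\\
&= \sum_{n=-\infty}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2 + \sum_{n=-\infty}^{\infty} (-1)^n\alpha_n\frac{\partial\xi_n}{\partial\rho}\bigl(2\beta_n(\beta_n+\rho)\xi_n + \lambda\alpha_{n-1}\xi_{n-1} - \lambda\alpha_{n+1}\xi_{n+1}\bigr)\\
&= \sum_{n=-\infty}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2\\
&= 2\sum_{n=0}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2.
\end{aligned}
\tag{39}
$$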
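The index shift that collapses the two off-diagonal terms of Eq. (37) into the single sum of Eq. (38) can be checked on finitely supported sequences. The sketch below assumes real α_n and real λ (an assumption made here for illustration; the values are random small integers and the function name is hypothetical). It also checks the analogous shift obtained when Eq. (37) is multiplied by α_n times the complex conjugate of ξ_n, in which case the off-diagonal contribution is purely imaginary.

```python
import random


def check_shift_identity(N=6, seed=0):
    """Compare both sides of the index-shift identities on random finite data."""
    rng = random.Random(seed)
    lam = 3  # real coupling, chosen arbitrarily
    # alpha_n real, xi_n complex, both zero outside |n| <= N
    alpha = {n: rng.randint(-5, 5) for n in range(-N, N + 1)}
    xi = {n: complex(rng.randint(-5, 5), rng.randint(-5, 5))
          for n in range(-N, N + 1)}

    def a(n):
        return alpha.get(n, 0)

    def x(n):
        return xi.get(n, 0)

    idx = range(-N - 1, N + 2)
    # off-diagonal terms of (37) weighted by (-1)^n alpha_n xi_n / 2
    lhs38 = sum((-1) ** n * a(n) * x(n) / 2
                * (lam * a(n - 1) * x(n - 1) - lam * a(n + 1) * x(n + 1))
                for n in idx)
    rhs38 = lam * sum((-1) ** (n + 1) * a(n) * a(n + 1) * x(n) * x(n + 1)
                      for n in idx)
    # off-diagonal terms of (37) weighted by alpha_n conj(xi_n)
    lhs40 = sum(a(n) * x(n).conjugate()
                * (lam * a(n - 1) * x(n - 1) - lam * a(n + 1) * x(n + 1))
                for n in idx)
    rhs40 = lam * sum(a(n) * a(n + 1)
                      * (x(n) * x(n + 1).conjugate() - x(n + 1) * x(n).conjugate())
                      for n in idx)
    return lhs38, rhs38, lhs40, rhs40
```

Because the second off-diagonal sum has vanishing real part, taking the real part of the weighted equation leaves only the diagonal terms, with ρ replaced by Re ρ.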

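On the other hand, multiplying the nth equation of Eq. (37) by $\alpha_n\bar\xi_n$, and summing the resultant equations, we have

$$ \sum_{n=-\infty}^{\infty} 2\beta_n(\beta_n+\rho)\alpha_n|\xi_n|^2 + \sum_{n=-\infty}^{\infty} \lambda\alpha_n\alpha_{n+1}\bigl(\xi_n\bar\xi_{n+1} - \xi_{n+1}\bar\xi_n\bigr) = 0, $$

which yields, by Eq. (9),

$$ \sum_{n=-\infty}^{\infty} \beta_n(\beta_n + \mathrm{Re}\,\rho)\alpha_n|\xi_n|^2 = \sum_{n=0}^{\infty} 2\beta_n(\beta_n + \mathrm{Re}\,\rho)\alpha_n|\xi_n|^2 = 0. \tag{40} $$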

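Using this equation, we have

$$
\begin{aligned}
\left|\frac{\partial F}{\partial\rho}\right| &\ge 2\beta_0 l(3k^2 - l^2)|\xi_0|^2 - 2\left|\sum_{n=1}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2\right|\\
&= \frac{2}{\beta_0 + \mathrm{Re}\,\rho}\sum_{n=1}^{\infty} \beta_n(\beta_n + \mathrm{Re}\,\rho)\alpha_n|\xi_n|^2 - 2\left|\sum_{n=1}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2\right|\\
&\ge \frac{2}{\beta_0 + \mathrm{Re}\,\rho}\sum_{n=1}^{\infty} \beta_n(\beta_n + \mathrm{Re}\,\rho)\alpha_n|\xi_n|^2 - 2\sum_{n=1}^{\infty} \beta_n\alpha_n|\xi_n|^2\\
&= \frac{2}{\beta_0 + \mathrm{Re}\,\rho}\sum_{n=1}^{\infty} \beta_n(\beta_n - \beta_0)\alpha_n|\xi_n|^2, \quad\text{for } -\beta_0 < \mathrm{Re}\,\rho,\\
&\ge \frac{1}{\beta_0 + \mathrm{Re}\,\rho}\sum_{n=1}^{\infty} \beta_n(\beta_n + \mathrm{Re}\,\rho)\alpha_n|\xi_n|^2, \quad\text{for } -\beta_0 < \mathrm{Re}\,\rho < 0,\\
&= \beta_0 l(3k^2 - l^2)|\xi_0|^2.
\end{aligned}
\tag{41}
$$

Since ξ0 ≠ 0 is a constant, we thus have, for λ > 0 and Re ρ ≥ −β0,

$$ \frac{\partial F(\rho,\lambda)}{\partial\rho} \ne 0 \quad\text{and}\quad \liminf_{\mathrm{Re}\,\rho\,\searrow\,-\beta_0} \frac{\partial F(\rho,\lambda)}{\partial\rho} > 0. $$

Now we proceed to the proof of Eq. (7). By Eqs. (39) and (41), we have

$$ \sum_{n=0}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2 \ne 0. $$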

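By Eq. (9), we may choose the constant ξ0 so that

$$ \sum_{n=-\infty}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2 = 2\sum_{n=0}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2 = 2. \tag{42} $$

Moreover, by Eq. (40), we see

$$ \sum_{n=0}^{\infty} \beta_n\alpha_n|\xi_n|^2 = -\frac{1}{\beta_0 + \mathrm{Re}\,\rho(\lambda)}\sum_{n=1}^{\infty} \beta_n(\beta_n - \beta_0)\alpha_n|\xi_n|^2 < 0. $$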

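Differentiating Eq. (38) with respect to λ yields

$$
\begin{aligned}
0 &= \sum_{n=-\infty}^{\infty} (-1)^n 2\beta_n(\beta_n+\rho(\lambda))\alpha_n\xi_n\frac{d\xi_n}{d\lambda} + \sum_{n=-\infty}^{\infty} (-1)^n\beta_n\alpha_n\xi_n^2\,\frac{d\rho}{d\lambda} + \sum_{n=-\infty}^{\infty} (-1)^{n+1}\alpha_n\alpha_{n+1}\xi_n\xi_{n+1}\\
&\quad + \lambda\sum_{n=-\infty}^{\infty} (-1)^{n+1}\alpha_n\alpha_{n+1}\left(\xi_{n+1}\frac{d\xi_n}{d\lambda} + \xi_n\frac{d\xi_{n+1}}{d\lambda}\right)\\
&= 2\frac{d\rho(\lambda)}{d\lambda} + \sum_{n=-\infty}^{\infty} (-1)^{n+1}\alpha_n\alpha_{n+1}\xi_n\xi_{n+1}\\
&\quad + \sum_{n=-\infty}^{\infty} (-1)^n\alpha_n\frac{d\xi_n}{d\lambda}\bigl(2\beta_n(\beta_n+\rho(\lambda))\xi_n + \lambda\alpha_{n-1}\xi_{n-1} - \lambda\alpha_{n+1}\xi_{n+1}\bigr), \quad\text{by Eq. (42)},\\
&= 2\frac{d\rho(\lambda)}{d\lambda} - \frac{1}{\lambda}\sum_{n=-\infty}^{\infty} (-1)^n\beta_n(\beta_n+\rho(\lambda))\alpha_n\xi_n^2, \quad\text{by Eqs. (37–38)},\\
&= 2\frac{d\rho(\lambda)}{d\lambda} - \frac{2}{\lambda}\sum_{n=0}^{\infty} (-1)^n\beta_n(\beta_n+\rho(\lambda))\alpha_n\xi_n^2, \quad\text{by Eq. (9)},\\
&= 2\frac{d\rho(\lambda)}{d\lambda} - \frac{2}{\lambda}(\beta_0+\rho(\lambda)) - \frac{2}{\lambda}\sum_{n=1}^{\infty} (-1)^n\beta_n(\beta_n-\beta_0)\alpha_n\xi_n^2, \quad\text{by Eq. (42)}.
\end{aligned}
\tag{43}
$$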

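Therefore, by Eq. (43),

$$
\begin{aligned}
\lambda\,\mathrm{Re}\,\frac{d\rho}{d\lambda} &= \beta_0 + \mathrm{Re}\,\rho(\lambda) + \sum_{n=1}^{\infty} (-1)^n\beta_n(\beta_n-\beta_0)\alpha_n\,\mathrm{Re}(\xi_n^2)\\
&> \beta_0 + \mathrm{Re}\,\rho(\lambda) - \sum_{n=1}^{\infty} \beta_n(\beta_n-\beta_0)\alpha_n|\xi_n|^2\\
&= (\beta_0 + \mathrm{Re}\,\rho(\lambda))\left(1 + \sum_{n=0}^{\infty} \beta_n\alpha_n|\xi_n|^2\right).
\end{aligned}
$$

Now we show

$$ 1 + \sum_{n=0}^{\infty} \beta_n\alpha_n|\xi_n|^2 > 0. $$

Indeed, by Eq. (43), we denote by h the positive constant such that

$$ h = -\sum_{n=0}^{\infty} \beta_n\alpha_n|\xi_n|^2 = \beta_0 l(3k^2-l^2)|\xi_0|^2 - \sum_{n=1}^{\infty} \beta_n\alpha_n|\xi_n|^2. $$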

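Using Eq. (42), we have

$$ \beta_0 l(3k^2-l^2)\bigl(|\xi_0|^2 - \mathrm{Im}(\xi_0^2)\bigr) = h + \sum_{n=1}^{\infty} \beta_n\alpha_n\bigl(|\xi_n|^2 - (-1)^n\mathrm{Im}(\xi_n^2)\bigr), $$

and

$$ \beta_0 l(3k^2-l^2)\bigl(|\xi_0|^2 + \mathrm{Im}(\xi_0^2)\bigr) = h + \sum_{n=1}^{\infty} \beta_n\alpha_n\bigl(|\xi_n|^2 + (-1)^n\mathrm{Im}(\xi_n^2)\bigr). $$

Thus it follows from the Schwarz inequality that

$$
\begin{aligned}
\beta_0^2 l^2(3k^2-l^2)^2|\mathrm{Re}(\xi_0^2)|^2 &= \left(h + \sum_{n=1}^{\infty} \beta_n\alpha_n\bigl(|\xi_n|^2 + (-1)^n\mathrm{Im}(\xi_n^2)\bigr)\right)\left(h + \sum_{n=1}^{\infty} \beta_n\alpha_n\bigl(|\xi_n|^2 - (-1)^n\mathrm{Im}(\xi_n^2)\bigr)\right)\\
&\ge \left(h + \sum_{n=1}^{\infty} \beta_n\alpha_n|\mathrm{Re}(\xi_n^2)|\right)^2.
\end{aligned}
$$

This implies, by Eq. (42),

$$
\begin{aligned}
h &\le \beta_0 l(3k^2-l^2)|\mathrm{Re}(\xi_0^2)| - \sum_{n=1}^{\infty} \beta_n\alpha_n|\mathrm{Re}(\xi_n^2)|\\
&\le 1 + \left|\sum_{n=1}^{\infty} (-1)^n\beta_n\alpha_n\mathrm{Re}(\xi_n^2)\right| - \sum_{n=1}^{\infty} \beta_n\alpha_n|\mathrm{Re}(\xi_n^2)| < 1,
\end{aligned}
$$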

where we have used the fact $\xi_n^2 \ne 0$ for all n, which is due to Eq. (9) and the choice ξ0 ≠ 0. The proof is complete.

5. Proof of Theorem 1.3 (iv)

Let us note that $H^2_{l,k}$ is an orthogonal sum of the following three subspaces

$$ \left\{ \psi \in H^2 \;\middle|\; \psi = \sum_{m=0}^{\infty}\sum_{n=-\infty}^{\infty} \xi_n\cos(2mlx + 2nky) \right\}, $$

$$ \left\{ \psi \in H^2 \;\middle|\; \psi = \sum_{(2m-1)l > \sqrt{3}k}\;\sum_{n=-\infty}^{\infty} \xi_n\cos((2m-1)lx + ky + 2nky) \right\}, $$

$$ \left\{ \psi \in H^2 \;\middle|\; \psi = \sum_{(2m-1)l = l}^{[\sqrt{3}k]}\;\sum_{n=-\infty}^{\infty} \xi_n\cos((2m-1)lx + ky + 2nky) \right\}, $$

and each of these subspaces is invariant with respect to the operator $A_\lambda$. If the spectral problem $A_\lambda\psi = \rho\psi$ is reduced to the first subspace, the absence of the complex eigenvalue ρ with Im ρ ≠ 0 and Re ρ = 0 follows from [4]. For the second subspace, we readily see that Re γ0 > 0, which contradicts Eq. (10). This gives the absence of such an eigenvalue as well. Here we have used the equivalence given in Theorem 1.4 with the condition l² < 3k² replaced by the condition l² > 3k². This equivalence is shown in [3]. However, as far as the last subspace is concerned, it has been obtained in Sects. 3 and 4 that the spectral problem $A_\lambda\psi = \rho\psi$ in the space

$$ \left\{ \psi \;\middle|\; \psi = \sum_{n=-\infty}^{\infty} \xi_n\cos((2m-1)lx + ky + 2nky) \in H^2 \right\} $$

for each m with l ≤ (2m − 1)l < [√3 k] has a unique and simple eigenvalue $\rho = \rho_{(2m-1)l,k}(\lambda)$ such that

$$ \mathrm{Re}\,\frac{d\rho_{(2m-1)l,k}(\lambda)}{d\lambda} > 0, \quad \mathrm{Im}\,\rho_{(2m-1)l,k}(\lambda) < 0 \quad\text{for } \lambda > 0 $$

and

$$ \mathrm{Re}\,\rho_{(2m-1)l,k}(\lambda_{(2m-1)l,k}) = 0 \quad\text{for some } \lambda_{(2m-1)l,k} > 0. $$

Let $m_0$ be the positive integer such that

$$ m_0 = \max\{ m \mid \lambda_{(2m-1)l,k} = \lambda_{l,k},\ (2m-1)l < \sqrt{3}k \}. $$

It is readily seen that $\rho_{l,k}$ is simple in the flow invariant subspace $H^2_{(2m_0-1)l,k}$, which is contained in $H^2_{l,k}$. Likewise, we obtain that $\rho_{l,k}$ is simple in the flow invariant subspace

$$ \tilde H^2_{(2m_0-1)l,k} \subset \tilde H^2_{l,k}. $$

The corresponding eigenfunction ψ is in the following form:

$$ \psi = \sum_{n=-\infty}^{\infty} \xi_n\sin((2m_0-1)lx + ky + 2nky) \in H^2. $$

By Eq. (9), we thus obtain

$$
\begin{aligned}
1 &= \dim\{\psi \mid A_\lambda\psi = \rho_{l,k}\psi,\ \psi \in H^2_{(2m_0-1)l,k}\},\\
1 &= \dim\{\psi \mid A_\lambda\psi = \rho_{l,k}(\lambda)\psi,\ \psi \in \tilde H^2_{(2m_0-1)l,k}\},\\
2 &= \dim\{\psi \mid A_\lambda\psi = \rho_{l,k}(\lambda)\psi,\ \psi \in H^2_{(2m_0-1)l,k} \cup \tilde H^2_{(2m_0-1)l,k}\}.
\end{aligned}
$$

This shows Assertion (iv) and completes the proof of Theorem 1.3. Acknowledgement. This research was partially made when the first author was visiting the Department of Mathematics at Indiana University.

References

1. Arnold, V.I., Meshalkin, L.D.: A. N. Kolmogorov’s seminar on selected problems of analysis (1958-1959). Russ. Math. Surv. 15 (1), 247–250 (1960)
2. Chen, Z.M.: Hopf bifurcation of the three-dimensional Navier–Stokes equations. J. Math. Anal. Appl., to appear
3. Chen, Z.M., Price, W.G.: Time dependent periodic Navier–Stokes flows on a two-dimensional torus. Commun. Math. Phys. 179, 577–597 (1996)
4. Iudovich, V.I.: Example of the generation of a secondary stationary or periodic flow when there is loss of stability of the laminar flow of a viscous incompressible fluid. J. Math. Mech. 29 (3), 453–467 (1965)
5. Joseph, D.D., Sattinger, D.H.: Bifurcating time periodic solutions and their stability. Arch. Rational Mech. Anal. 45, 75–109 (1972)
6. Kirchgässner, K.: Bifurcation in nonlinear hydrodynamic stability. SIAM Rev. 17, 652–683 (1975)
7. Landau, L.: On the problem of turbulence. Comptes Rend. Acad. Sci. USSR 44, 311–316 (1944)
8. Lin, C.C.: The Theory of Hydrodynamic Stability. Cambridge: Cambridge University Press, 1955
9. Meshalkin, L.D., Sinai, Ya.G.: Investigation of the stability of a stationary solution of a system of equations for the plane movement of an incompressible viscous fluid. J. Math. Mech. 19 (9), 1700–1705 (1961)
10. Ruelle, D., Takens, F.: On the nature of turbulence. Commun. Math. Phys. 20, 167–192 (1971)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 207, 107 – 129 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Fluid Dynamical Profiles and Constants of Motion from d-Branes

R. Jackiw¹,*, A. P. Polychronakos²,³

¹ Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
² Theoretical Physics Department, Uppsala University, 75108 Uppsala, Sweden
³ Physics Department, University of Ioannina, 45110 Ioannina, Greece

Received: 3 February 1999 / Accepted: 9 April 1999

Abstract: Various fluid mechanical systems enjoy a hidden, higher-dimensional dynamical Poincaré symmetry, which arises owing to their descent from a Nambu–Goto action. Also, for the same reason, there are equivalence transformations between different models. These interconnections, summarized by the diagram below, are discussed in our paper.

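[Diagram: a parameterization-invariant Nambu–Goto d-brane in (d + 1, 1) spacetime yields, in the light-cone parameterization, the Galileo-invariant Chaplygin gas in (d, 1) spacetime and, in the Cartesian parameterization, the Poincaré-invariant Born–Infeld model in (d, 1) spacetime. The Born–Infeld model reduces to the Chaplygin gas in the nonrelativistic limit, and the two models are also related by an exact transformation.]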

* This work is supported in part by funds provided by the U.S. Department of Energy (D.O.E.) under contract #DE-FC02-94ER40818.

1. Introduction

In this paper we shall be concerned with nonlinear dynamical systems that are described by a density of “matter” ρ, flowing in time {t} with local velocity v on a d-dimensional surface coordinated by {r}. The vectorial nature of v is not unrestricted: v is a function of ∇θ, where θ is a velocity “potential”, and we shall examine several such functions. (When v is linear in ∇θ, the flow is irrotational, ∇ × v = 0.) The density is linked to the velocity by a continuity equation involving the matter current j = vρ,

$$ \dot\rho(t,\mathbf{r}) + \nabla\cdot\bigl(\mathbf{v}(t,\mathbf{r})\rho(t,\mathbf{r})\bigr) = 0, \tag{1.1} $$

while the velocity satisfies an “Euler” equation

$$ \dot{\mathbf{v}}(t,\mathbf{r}) + \mathbf{v}(t,\mathbf{r})\cdot\nabla\mathbf{v}(t,\mathbf{r}) = \mathbf{f}(\rho,\mathbf{v}). \tag{1.2} $$

[The over-dot always indicates differentiation with respect to the temporal argument, while the gradient ∇ (unless further specified) differentiates the spatial argument.] We shall examine theories with various expressions for the force (per unit mass) f, which lead to Galileo, Poincaré and additional unexpected kinematical symmetries of the equations, and which sometimes produce completely integrable systems, with an infinite number of local conserved quantities. The above equations can be formulated with an action principle. Consequently the symmetries that we find are in fact Noether symmetries, which leave the action invariant. Additionally, we shall present limiting and equivalence transformations between different models, which allow mapping solutions of one model onto solutions of another. Various topics that we discuss have already appeared in the literature.

A common feature unites the diverse models that we study: they have a common antecedent in that they can be gotten from a parameterization-invariant Nambu–Goto action for a d-brane on a d + 1-dimensional space, moving in (d + 1, 1)-dimensional space-time. [A “d-brane” is a d-dimensional extended object: d = 1 is a string, d = 2 is a membrane, etc. A d-brane inhabiting d + 1-dimensional space divides that space in two.] When a light-cone parameterization is selected for the Nambu–Goto problem, one derives the Euler and continuity equations for a d-dimensional “Chaplygin gas” [in Eq. (1.2), $\mathbf{f} \propto \frac{1}{\rho^3}\nabla\rho$]. Alternatively, a Cartesian parameterization produces the d-dimensional “Born–Infeld” model (see below).

The relation between membranes and planar fluid mechanics (the d = 2 case) was known to Goldstone [1], and was developed by his student Hoppe (with collaboration) [2]. Similar connections, yielding equations in one spatial dimension, were discussed by Nutku (and collaborators) [3].
Here we consider the general d-dimensional case, and use the common ancestry of the various fluid-mechanical models to posit unexpected transformations between them, and to identify hidden, dynamical symmetries in each model, which derive from the high degree of symmetry of the Nambu–Goto parent theory. For d = 1, the models that we study become especially simple for two reasons. First, their common antecedent is a string (1-brane) moving on a plane (2-space), and for this system the Nambu–Goto equations are integrable [4]. Second, the requirement that velocity be expressible in terms of a potential poses no restriction in one dimension, where any function can be related to the derivative of another function. In this way one may understand that the d = 1 models are completely integrable, as has been noted previously [3–5].

In Sect. 2 we consider noninteracting systems, with Galileo- and Poincaré-invariant kinetic terms. Specific interactions that preserve Galileo and Poincaré symmetry, as well as higher, dynamical symmetries are discussed in Sect. 3. The relation of these to the Nambu–Goto theory is explained in Sect. 4, where we also exhibit mappings between the Galileo-invariant and the Poincaré-invariant models. The last Sect. 5 is devoted to models in one spatial dimension.

2. Force-Free Motion

The force-free problem, f = 0, describes the free flow of dust. Equations (1.1) and (1.2) are readily solved in terms of initial data, specified (without loss of generality) at t = 0,

$$ \rho(t,\mathbf{r})\big|_{t=0} = \rho_0(\mathbf{r}), \tag{2.1} $$

$$ \mathbf{v}(t,\mathbf{r})\big|_{t=0} = \mathbf{v}_0(\mathbf{r}). \tag{2.2} $$

Upon defining the retarded position q(t, r) by the equation

$$ \mathbf{q} + t\,\mathbf{v}_0(\mathbf{q}) = \mathbf{r}, \tag{2.3} $$

one verifies that (1.1) and (1.2), with vanishing right side, are solved by

$$ \mathbf{v}(t,\mathbf{r}) = \mathbf{v}_0(\mathbf{q}), \tag{2.4} $$

$$ \rho(t,\mathbf{r}) = \rho_0(\mathbf{q})\left|\det\frac{\partial q^i}{\partial r^j}\right|. \tag{2.5} $$
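The solution by retarded position can be illustrated numerically in one spatial dimension. The sketch below is ours, not part of the original argument: the initial profiles `v0` and `rho0` are arbitrary illustrative choices, the retarded position is found by bisection (valid for |t| < 2 with this `v0`, i.e. before characteristics cross), and the one-dimensional Jacobian ∂q/∂r = 1/(1 + t v0′(q)) is used for the determinant. Finite differences then confirm that the force-free Euler and continuity equations hold.

```python
import math


def v0(q):
    return 0.5 * math.sin(q)          # assumed initial velocity profile


def dv0(q):
    return 0.5 * math.cos(q)          # its derivative


def rho0(q):
    return 1.0 + 0.3 * math.cos(q)    # assumed initial density profile


def retarded_position(t, r):
    """Solve q + t*v0(q) = r by bisection (left side is monotone for |t| < 2)."""
    lo = r - 0.5 * abs(t) - 1e-9
    hi = r + 0.5 * abs(t) + 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + t * v0(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def velocity(t, r):
    return v0(retarded_position(t, r))


def density(t, r):
    q = retarded_position(t, r)
    return rho0(q) / abs(1.0 + t * dv0(q))   # rho0(q) * |dq/dr|


def euler_residual(t, r, h=1e-4):
    """Central-difference estimate of v_t + v v_r."""
    v_t = (velocity(t + h, r) - velocity(t - h, r)) / (2 * h)
    v_r = (velocity(t, r + h) - velocity(t, r - h)) / (2 * h)
    return v_t + velocity(t, r) * v_r


def continuity_residual(t, r, h=1e-4):
    """Central-difference estimate of rho_t + (v rho)_r."""
    rho_t = (density(t + h, r) - density(t - h, r)) / (2 * h)
    flux = lambda s: velocity(t, s) * density(t, s)
    return rho_t + (flux(r + h) - flux(r - h)) / (2 * h)
```

Both residuals vanish up to discretization error at any point where the map from q to r is still invertible.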

The free equations, with v restricted to a function of ∇θ, possess a variational action formulation, which was first given by Eckart for Galileo-invariant nonrelativistic motion [6]. We reproduce and generalize his argument.

The Lagrangian for N point-particles of mass m in free nonrelativistic motion is the Galileo-invariant kinetic energy $\frac{1}{2}\sum_{n=1}^{N} m v_n^2(t)$. In a continuum description, the particle counting index n becomes the continuous variable r, and the particles are distributed with density ρ, so that $\sum_{n=1}^{N} v_n^2(t)$ becomes $\int d^d r\,\rho(t,\mathbf{r})v^2(t,\mathbf{r})$. But we also wish to link the density with the current by the continuity equation (1.1), which can be enforced with the help of a Lagrange multiplier θ. We thus arrive at the free continuum Lagrangian

$$ L^{\mathrm{Galileo}} = \int d^d r \left[\rho\,\tfrac{1}{2}mv^2 + \theta\bigl(\dot\rho + \nabla\cdot(\mathbf{v}\rho)\bigr)\right]. \tag{2.6} $$

Since L is first-order in time, and the canonical 1-form $\int d^d r\,\theta\dot\rho\,dt$ does not contain v, v may be varied, evaluated and eliminated [7]. We find

$$ \rho m\mathbf{v} - \rho\nabla\theta = 0, $$

showing that ∇θ is the local momentum p = mv, and the velocity is irrotational:

$$ \mathbf{v} = \frac{1}{m}\nabla\theta, \tag{2.7} $$

$$ \nabla\times\mathbf{v} = 0. \tag{2.8} $$

The Lagrangian (2.6) becomes

$$ L_0^{\mathrm{Galileo}} = \int d^d r \left(\theta\dot\rho - \rho\frac{(\nabla\theta)^2}{2m}\right), \tag{2.9} $$

where the subscript 0 denotes absence of interaction. Varying θ in (2.9) regains the continuity equation (1.1), while varying ρ produces the free “Bernoulli” equation for the velocity potential θ

$$ \dot\theta + \frac{1}{2m}(\nabla\theta)^2 = 0. \tag{2.10} $$

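This is also recognized as the free Hamilton-Jacobi equation. The gradient of (2.10) gives rise, in view of (2.8), to the free Euler equation (1.2) (with f = 0).

Remarkably the same equations emerge for a kinetic energy T that is an arbitrary function of v. If we generalize (2.6) to

$$ L_0 = \int d^d r \left[\rho T(\mathbf{v}) + \theta\bigl(\dot\rho + \nabla\cdot(\mathbf{v}\rho)\bigr)\right] \tag{2.11} $$

we get, instead of (2.7),

$$ \frac{\partial T(\mathbf{v})}{\partial\mathbf{v}} \equiv \mathbf{p} = \nabla\theta, \tag{2.12} $$

and (2.9) becomes

$$ L_0 = \int d^d r \left[\theta\dot\rho - \rho\left(\mathbf{v}\cdot\frac{\partial T(\mathbf{v})}{\partial\mathbf{v}} - T(\mathbf{v})\right)\right], \tag{2.13} $$

where it is understood that the Legendre transform of T is expressed in terms of ∇θ by inverting (2.12). Varying θ in (2.13) again gives the continuity equation,

$$
\begin{aligned}
0 = \frac{\delta L_0}{\delta\theta} &= \dot\rho - \int d^d r\,\rho\,\mathbf{v}\cdot\frac{\delta}{\delta\theta}\frac{\partial T(\mathbf{v})}{\partial\mathbf{v}}\\
&= \dot\rho - \int d^d r\,\rho\,\mathbf{v}\cdot\frac{\delta}{\delta\theta}\nabla\theta\\
&= \dot\rho + \nabla\cdot(\mathbf{v}\rho),
\end{aligned}
\tag{2.14}
$$

while varying ρ leaves a generalization of the free Bernoulli equation:

$$ 0 = \frac{\delta L_0}{\delta\rho} = -\dot\theta - \mathbf{v}\cdot\frac{\partial T}{\partial\mathbf{v}} + T(\mathbf{v}). \tag{2.15} $$

With the help of (2.12), this implies

$$
\begin{aligned}
\frac{\partial}{\partial r^i}\dot\theta &= -v^j\frac{\partial}{\partial r^i}\frac{\partial T}{\partial v^j} = -v^j\frac{\partial^2\theta}{\partial r^i\partial r^j}\\
&= -v^j\frac{\partial}{\partial r^j}\frac{\partial T}{\partial v^i} = -v^j\frac{\partial v^k}{\partial r^j}\frac{\partial^2 T}{\partial v^k\partial v^i}.
\end{aligned}
\tag{2.16a}
$$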

On the other hand, from (2.12) it follows that

$$ \frac{\partial}{\partial r^i}\dot\theta = \frac{\partial^2 T}{\partial v^i\partial v^k}\,\dot v^k. \tag{2.16b} $$

Equations (2.16a) and (2.16b) are consistent only if the free Euler equation (1.2) holds (provided the matrix $\frac{\partial^2 T}{\partial v^i\partial v^j}$ has an inverse).

With a general form for T(v), the local momentum (2.12) $\mathbf{p} \equiv \frac{\partial T}{\partial\mathbf{v}}$ remains irrotational while the velocity, as determined by inverting (2.12), becomes a nonlinear function of ∇θ. Evidently the solution (2.1)–(2.5) works with arbitrary kinetic energy, whose specific form enters only in fixing the relation between v and ∇θ. However, the initial data for the velocity must be consistent with the expression of the velocity in terms of ∇θ.

One may present a family of constants of motion:

$$ C = \int d^d r\,\rho(t,\mathbf{r})\,C\bigl(\mathbf{v}(t,\mathbf{r}),\,\mathbf{r} - t\mathbf{v}(t,\mathbf{r})\bigr). \tag{2.17} $$

The time independence of C is established either by differentiating with respect to t and using the free equations of motion to prove that $\frac{dC}{dt} = 0$, or more easily, by inserting the solution (2.3)–(2.5) into (2.17) and changing integration variables from r to q. (Carrying out these manipulations requires assuming that ρ0 and v0 obey appropriate regularity conditions and drop off sufficiently at large distances.)

Various constants of motion arise from invariance against time and space translation (energy E and total momentum P, respectively) as well as space rotation (angular momentum $L^{ij}$), provided T(v) carries no explicit time and coordinate dependence, and does not depend on any external vectors, i.e., T depends on v only through its magnitude, $T(\mathbf{v}) = T(|\mathbf{v}|)$. These constants are

$$ E = H = \int d^d r\,\mathcal{H}, \qquad \mathcal{H} = \rho\left(\mathbf{v}\cdot\frac{\partial T}{\partial\mathbf{v}} - T(\mathbf{v})\right), \tag{2.18} $$

$$ \mathbf{P} = \int d^d r\,\boldsymbol{\mathcal{P}}, \qquad \boldsymbol{\mathcal{P}} = \rho\nabla\theta = \rho\frac{\partial T(\mathbf{v})}{\partial\mathbf{v}} = \rho\mathbf{p}, \tag{2.19} $$

$$ L^{ij} = \int d^d r\,\bigl(r^i\mathcal{P}^j - r^j\mathcal{P}^i\bigr). \tag{2.20} $$

Also shifting θ by a constant is a symmetry, leading to conservation of

$$ N = \int d^d r\,\rho. \tag{2.21} $$

To recognize that these constants of motion involve particular forms for C(v, r − tv) in (2.17), we recall that according to (2.12) ∇θ is a function of v, and the two are colinear when $T(\mathbf{v}) = T(|\mathbf{v}|)$. The densities $\mathcal{H}$ and $\boldsymbol{\mathcal{P}}$ are components of an energy-momentum tensor, $T^{00}$ and $T^{0i}$ respectively, which satisfy continuity equations with energy flux $T^{i0}$ and momentum flux $T^{ij}$,

$$ T^{00} = \mathcal{H}, \tag{2.22a} $$

$$ T^{i0} = v^i\mathcal{H}, \tag{2.22b} $$

$$ T^{0i} = \mathcal{P}^i, \tag{2.22c} $$

$$ T^{ij} = v^i\mathcal{P}^j. \tag{2.22d} $$

[$T^{ij}$ is symmetric when $T(\mathbf{v}) = T(|\mathbf{v}|)$.] The continuity equations

$$ \dot T^{00} + \frac{\partial}{\partial r^i}T^{i0} = 0, \tag{2.23a} $$

$$ \dot T^{0j} + \frac{\partial}{\partial r^i}T^{ij} = 0 \tag{2.23b} $$

are entirely equivalent to the free dynamical equations (1.1) and (1.2), with vanishing force. The symplectic structure, which is determined by the canonical 1-form in (2.13), indicates that the only nonvanishing bracket is [7]

$$ \{\theta(t,\mathbf{r}),\,\rho(t,\mathbf{r}')\} = \delta(\mathbf{r} - \mathbf{r}') \tag{2.24} $$

or equivalently

$$ \{\mathbf{p}(t,\mathbf{r}),\,\rho(t,\mathbf{r}')\} = \nabla\delta(\mathbf{r} - \mathbf{r}'). \tag{2.25} $$

With these, one verifies that the constants of motion (2.18)–(2.21) generate the appropriate infinitesimal transformation on the variables θ and ρ.

Specific forms for T(v) support additional, kinematical symmetries and lead to further constants of motion. In the nonrelativistic case presented in Eqs. (2.6)–(2.10), we have Galileo invariance against boosts by velocity u. The transformation law for the fields reads

$$
\begin{aligned}
\rho(t,\mathbf{r}) &\to \rho_u(t,\mathbf{r}) = \rho(T,\mathbf{R}),\\
\theta(t,\mathbf{r}) &\to \theta_u(t,\mathbf{r}) = \theta(T,\mathbf{R}) + m(\mathbf{u}\cdot\mathbf{r} - u^2 t/2),
\end{aligned}
\tag{2.26}
$$

where the transformed coordinates are boosted:

$$
\begin{aligned}
t &\to T = t,\\
\mathbf{r} &\to \mathbf{R}(t,\mathbf{r}) = \mathbf{r} - t\mathbf{u}.
\end{aligned}
\tag{2.27}
$$

The inhomogeneous terms in $\theta_u$ are recognized as the well-known 1-cocycle of field theoretic realizations of the Galileo group. Also they ensure that the transformation for the velocity v = ∇θ/m,

$$ \mathbf{v}(t,\mathbf{r}) \to \mathbf{v}_u(t,\mathbf{r}) = \mathbf{v}(t,\mathbf{r} - t\mathbf{u}) + \mathbf{u}, \tag{2.28} $$

is appropriate for the co-moving velocity. The conserved quantity arising from the Galileo symmetry is

$$
\begin{aligned}
\mathbf{B} &= t\mathbf{P} - m\int d^d r\,\mathbf{r}\rho\\
&= -m\int d^d r\,(\mathbf{r} - t\mathbf{v})\rho,
\end{aligned}
\tag{2.29}
$$

where the last equality casts B in the form (2.17). With the help of the bracket (2.24), B generates the infinitesimal transformation on the fields, and its bracket with P closes on N, thereby exposing the familiar Galileo 2-cocycle, which provides an extension of the algebra:

$$ \{B^i, P^j\} = \delta^{ij}mN. \tag{2.30} $$

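The free Galileo-invariant theory possesses further symmetries, which survive even in the presence of a particular interaction. Hence we postpone discussing them until later, when interactions are included.

Subsequently, in addition to the Galileo-invariant case, we shall also be concerned with a relativistic, Poincaré-invariant model for which the point-particle kinetic energy is $-mc^2\sum_{n=1}^{N}\sqrt{1 - v_n^2(t)/c^2}$. Upon passing to a continuum description, as in the nonrelativistic case, we find

$$ T(\mathbf{v}) = -mc^2\sqrt{1 - v^2/c^2}, \tag{2.31} $$

$$ \frac{\partial T(\mathbf{v})}{\partial\mathbf{v}} \equiv \mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}} = \nabla\theta, \tag{2.32} $$

$$ \mathbf{v} = \frac{c\nabla\theta}{\sqrt{m^2c^2 + (\nabla\theta)^2}}, \tag{2.33} $$

leading to

$$ \mathcal{H} = \rho c\sqrt{m^2c^2 + (\nabla\theta)^2} = \rho c\sqrt{m^2c^2 + p^2} = \rho\frac{mc^2}{\sqrt{1 - v^2/c^2}}, \tag{2.34} $$

and Lagrangian

$$ L_0^{\mathrm{Lorentz}} = \int d^d r \left(\theta\dot\rho - \rho c\sqrt{m^2c^2 + (\nabla\theta)^2}\right). \tag{2.35} $$

In the nonrelativistic limit this becomes

$$ L_0^{\mathrm{Lorentz}} \to -mc^2N + L_0^{\mathrm{Galileo}}. \tag{2.36} $$

Under Lorentz boosts, with velocity u, the fields transform as

$$
\begin{aligned}
\rho(t,\mathbf{r}) &\to \rho_u(t,\mathbf{r}) = \rho(T,\mathbf{R})\,\frac{\dot\theta(T,\mathbf{R}) + c\sqrt{m^2c^2 + (\nabla\theta(T,\mathbf{R}))^2}}{\partial_t\theta(T,\mathbf{R}) + c\sqrt{m^2c^2 + (\nabla_{\mathbf{r}}\theta(T,\mathbf{R}))^2}},\\
\theta(t,\mathbf{r}) &\to \theta_u(t,\mathbf{r}) = \theta(T,\mathbf{R})
\end{aligned}
\tag{2.37}
$$

with Lorentz-boosted coordinates

$$
\begin{aligned}
t &\to T(t,\mathbf{r}) = t\cosh\beta + \frac{1}{c}\,\hat{\boldsymbol\beta}\cdot\mathbf{r}\,\sinh\beta,\\
\mathbf{r} &\to \mathbf{R}(t,\mathbf{r}) = \mathbf{r} + \hat{\boldsymbol\beta}\bigl(ct\sinh\beta + \hat{\boldsymbol\beta}\cdot\mathbf{r}\,(\cosh\beta - 1)\bigr),
\end{aligned}
\tag{2.38}
$$

where β = u/c. Invariance is most easily verified by writing the action corresponding to (2.35) as

$$ I_0^{\mathrm{Lorentz}} = -\int dt\,d^d r\,\rho\left\{\dot\theta + c\sqrt{m^2c^2 + (\nabla\theta)^2}\right\}. \tag{2.39} $$

The infinitesimal version of the field transformation (2.37)

$$
\begin{aligned}
\delta\rho &= \boldsymbol\beta\cdot\left(\frac{\mathbf{r}}{c}\frac{\partial}{\partial t} + ct\nabla\right)\rho - \boldsymbol\beta\cdot\frac{\mathbf{v}}{c}\,\rho,\\
\delta\theta &= \boldsymbol\beta\cdot\left(\frac{\mathbf{r}}{c}\frac{\partial}{\partial t} + ct\nabla\right)\theta
\end{aligned}
\tag{2.40}
$$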

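is generated by the Lorentz constant of motion

$$
\begin{aligned}
\mathbf{L} &= t\mathbf{P} - \int d^d r\,\mathbf{r}\mathcal{H}/c^2\\
&= \int d^d r\left(t\rho\nabla\theta - \frac{\mathbf{r}}{c}\,\rho\sqrt{m^2c^2 + (\nabla\theta)^2}\right)\\
&= -m\int d^d r\,(\mathbf{r} - t\mathbf{v})\rho\,\frac{1}{\sqrt{1 - v^2/c^2}}
\end{aligned}
\tag{2.41}
$$

with the last equality exhibiting the form (2.17). The transformation laws for ρ and θ ensure that $j^\alpha = (1, \mathbf{v}/c)\rho$ and $U^\alpha = (1, \mathbf{v}/c)\frac{1}{\sqrt{1 - v^2/c^2}}$ transform as Lorentz vectors, so that $\rho\sqrt{1 - v^2/c^2}$ and θ are scalars [8]. The equation of motion (2.15) for $\dot\theta$ together with (2.31)–(2.33) implies that the Lorentz vector $\partial_\alpha\theta$ satisfies a Lorentz-invariant equation

$$ (\partial_\alpha\theta)^2 = m^2c^2. \tag{2.42} $$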
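The relations (2.31)–(2.34) and the invariant (2.42) can be checked numerically for motion along a single direction. In this sketch the values of c and m are arbitrary illustrative choices, the function names are ours, and the time component of the gradient is taken as $\dot\theta/c$ with $\dot\theta = -(\mathbf{v}\cdot\mathbf{p} - T)$ from (2.15); these conventions are our assumptions.

```python
import math

C, M = 2.0, 1.5   # illustrative values of c and m (assumptions)


def kinetic(v):
    """T(v) of (2.31)."""
    return -M * C * C * math.sqrt(1.0 - v * v / (C * C))


def momentum(v):
    """p = dT/dv of (2.32)."""
    return M * v / math.sqrt(1.0 - v * v / (C * C))


def velocity(p):
    """v expressed through p = grad(theta), as in (2.33)."""
    return C * p / math.sqrt(M * M * C * C + p * p)


def energy_per_density(p):
    """H / rho of (2.34)."""
    return C * math.sqrt(M * M * C * C + p * p)


def invariant(p):
    """(theta_dot / c)^2 - p^2 with theta_dot = -(v p - T) = -H / rho."""
    theta_dot = -energy_per_density(p)
    return (theta_dot / C) ** 2 - p * p
```

For any |v| < c, `velocity(momentum(v))` returns v, the Legendre transform `v * momentum(v) - kinetic(v)` equals `energy_per_density(momentum(v))`, and `invariant` returns m²c² independently of p.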

Note that the Lorentz transformation law (2.37) for θ does not involve a 1-cocycle, which is a nonrelativistic effect. It is interesting to see how (2.26) arises in the nonrelativistic limit. By comparing the relativistic action (2.39) to the nonrelativistic one, we see that the relationship between $\theta_R$ and $\theta_{NR}$ – the relativistic and nonrelativistic variables – is

$$ \theta_R = \theta_{NR} - mc^2t. \tag{2.43} $$

Applying the Lorentz transformation law (2.37) to $\theta_R$ implies that $\theta_{NR}(t,\mathbf{r}) - mc^2t \to \theta_{NR}(T,\mathbf{R}) - mc^2T$ or $\theta_{NR}(t,\mathbf{r}) \to \theta_{NR}(T,\mathbf{R}) + mc^2(t - T)$. We evaluate the nonrelativistic limit of the last quantity from (2.38) and find $mc^2(t - T) \to m(\mathbf{u}\cdot\mathbf{r} - u^2t/2)$, which matches the 1-cocycle in (2.26).

Similar to the Galileo-invariant theory, this Poincaré-invariant model possesses further symmetries, which we shall discuss below, when we include an interaction that preserves them.

3. Motion in the Presence of Interactions

3.1. Nonrelativistic motion. Interactions that preserve the Galileo symmetry of the free, nonrelativistic motion can be included by adding a θ-independent potential V(ρ) to the Lagrangian (2.9):

$$ L_V^{\mathrm{Galileo}} = \int d^d r\left(\theta\dot\rho - \rho\frac{(\nabla\theta)^2}{2m} - V(\rho)\right). \tag{3.1} $$

With nonvanishing V, (2.3)–(2.5) are no longer solutions, and the quantity (2.17) with arbitrary C(v, r − tv) is no longer constant. Of course the Galileo generators (2.18)–(2.21) and (2.29), with

$$ \mathcal{H} = \rho\frac{(\nabla\theta)^2}{2m} + V(\rho) \tag{3.2} $$

remain time independent. The energy-momentum tensor retains the form (2.22), with $\mathcal{H}$ given by (3.2); in $T^{i0}$, $\mathcal{H}$ of (3.2) is diminished by $V - \rho\frac{\partial V}{\partial\rho}$; $T^{0i}$ is unchanged and $T^{ij}$ acquires the addition $-\delta^{ij}\left(V - \rho\frac{\partial V}{\partial\rho}\right)$.

The dynamics implied by (3.1) arise in diverse physical contexts. The most directly physical application is to isentropic, irrotational motion in fluid mechanics with the “force” $f(\rho) = -\frac{\partial V(\rho)}{\partial\rho}$ corresponding to the enthalpy, while $\left(\rho\frac{\partial^2}{\partial\rho^2}V(\rho)\right)^{1/2}$ is the speed of sound [9]. Alternatively one finds (3.1) (with V depending also on ∇ρ) in the hydrodynamical formulation of quantum mechanics, which emerges when the wave function is presented as [10],

$$ \psi = \rho^{1/2}e^{i\theta/\hbar}. \tag{3.3} $$

[In this context, the inhomogeneous Galileo transformation of θ (2.26) corresponds to the familiar change of phase in a wave function under Galileo boosts, while shifting θ by a constant is just the phase-invariance of quantum mechanics, which leads to probability conservation, i.e., constant N in (2.21).] But we shall be especially concerned with the case

$$ V(\rho) = \lambda/\rho \tag{3.4} $$

which arises in the study of “d-branes” – d-dimensional extended objects – moving on a (d + 1)-dimensional space, in (d + 1, 1)-dimensional space-time, and descending from a Nambu–Goto action (see Sect. 4) [1,2]. Furthermore, (3.4) arises in the nonrelativistic limit of a Poincaré-invariant model with interactions, which we shall also describe below. The equations of this theory, which follow from

$$ L_\lambda^{\mathrm{Galileo}} = \int d^d r\left(\theta\dot\rho - \rho\frac{(\nabla\theta)^2}{2m} - \frac{\lambda}{\rho}\right) \tag{3.5} $$

read in their Bernoulli form

$$ \dot\rho + \nabla\cdot\left(\frac{\nabla\theta}{m}\rho\right) = 0, \tag{3.6} $$

$$ \dot\theta + \frac{(\nabla\theta)^2}{2m} = \frac{\lambda}{\rho^2}, \tag{3.7} $$

while their Euler form is gotten by recalling that v = ∇θ/m,

$$ \dot\rho + \nabla\cdot(\mathbf{v}\rho) = 0, \tag{3.8} $$

$$ \dot{\mathbf{v}} + \mathbf{v}\cdot\nabla\mathbf{v} = \frac{-2\lambda}{m\rho^3}\nabla\rho. \tag{3.9} $$

These are the equations for a “Chaplygin gas”. For this model there exist further symmetry transformations [11]. The action $I_\lambda^{\mathrm{Galileo}} = \int dt\,L_\lambda^{\mathrm{Galileo}}$ is invariant against a rescaling of time, with parameter ω, under which the fields change according to

$$
\begin{aligned}
\rho(t,\mathbf{r}) &\to \rho_\omega(t,\mathbf{r}) = e^{-\omega}\rho(e^{\omega}t,\mathbf{r}),\\
\theta(t,\mathbf{r}) &\to \theta_\omega(t,\mathbf{r}) = e^{\omega}\theta(e^{\omega}t,\mathbf{r}),
\end{aligned}
\tag{3.10}
$$

and the time-independent generator of the infinitesimal transformation is

$$ D = tH - \int d^d r\,\rho\theta. \tag{3.11} $$

Furthermore, the action is also invariant against a field-dependent diffeomorphism, implicitly defined by

$$
\begin{aligned}
t &\to T(t,\mathbf{r}) = t + \boldsymbol\omega\cdot\mathbf{r} + \tfrac{1}{2}\omega^2\theta(T,\mathbf{R})/m,\\
\mathbf{r} &\to \mathbf{R}(t,\mathbf{r}) = \mathbf{r} + \boldsymbol\omega\,\theta(T,\mathbf{R})/m,
\end{aligned}
\tag{3.12}
$$

where the transformed fields are

$$
\begin{aligned}
\rho(t,\mathbf{r}) &\to \rho_\omega(t,\mathbf{r}) = \rho(T,\mathbf{R})\frac{1}{|J|},\\
\theta(t,\mathbf{r}) &\to \theta_\omega(t,\mathbf{r}) = \theta(T,\mathbf{R}),
\end{aligned}
\tag{3.13}
$$

and |J| is the Jacobian of the transformation,

$$ J = \det\begin{pmatrix} \dfrac{\partial T}{\partial t} & \dfrac{\partial T}{\partial r^j}\\[2ex] \dfrac{\partial R^i}{\partial t} & \dfrac{\partial R^i}{\partial r^j} \end{pmatrix} = \left(1 - \frac{1}{m}\boldsymbol\omega\cdot\nabla\theta(T,\mathbf{R}) - \frac{\omega^2}{2m}\dot\theta(T,\mathbf{R})\right)^{-1}. \tag{3.14} $$

Here ω is a (vectorial) parameter of the transformation, with dimension of (velocity)⁻¹, and the time-independent quantity

$$ \mathbf{G} = \int d^d r\,\bigl(\mathbf{r}\mathcal{H} - \theta\boldsymbol{\mathcal{P}}/m\bigr) \tag{3.15} $$

generates the infinitesimal transformation. Note that the generators D and G remain time-independent even in the absence of the interaction (3.4), hence these symmetries are also present for the free theory. The generators are not of the form (2.17): they involve θ, and cannot be written in terms of v = ∇θ/m.

Finally we note that bracketing of the additional generators with the Galileo generators of the nonrelativistic theory on a (d, 1)-dimensional space-time produces an algebra which is isomorphic to the Poincaré group in (d + 1, 1) dimensions, [2,12] under which (t, θ, r) transforms as a (d + 2)-Lorentz vector $X^\mu$ in light-cone components ($t = X^+$, $\theta = X^-$) [4].

Using (3.7), we may eliminate ρ, and describe the model solely in terms of θ, whose dynamics is governed by the Lagrangian

$$ L_\lambda = -2\sqrt{\lambda}\int d^d r\,\sqrt{\dot\theta + \frac{(\nabla\theta)^2}{2m}}. \tag{3.16} $$

It is seen that the “interaction strength” λ in fact disappears from the equations of motion for θ ; λ serves only to normalize the Lagrangian. In the free theory it is not possible to achieve this compact formulation. Furthermore, the interacting dynamical equations

can be summarized by an equation for θ, which follows from (3.6), once (3.7) is used to eliminate ρ, or alternatively the equation is derived from (3.16),

$$ \frac{\partial}{\partial t}\left(\frac{1}{\sqrt{\dot\theta + \frac{(\nabla\theta)^2}{2m}}}\right) + \nabla\cdot\left(\frac{\nabla\theta/m}{\sqrt{\dot\theta + \frac{(\nabla\theta)^2}{2m}}}\right) = 0. \tag{3.17} $$

In spite of their awkward appearance, (3.16) and (3.17) are Galileo invariant in (d, 1) space-time, and possess a hidden, nonlinearly realized Poincaré symmetry in (d + 1, 1) space-time (which is a descendant of the symmetries of the Nambu–Goto action; see Sect. 4).

Apart from the intrinsic interest in this nonlinear realization of a kinematical/dynamical Poincaré symmetry, which is provided by (3.10)–(3.15) supplementing the linearly realized Galileo symmetry, the new symmetries have the useful consequence of generating new solutions to the equations of motion (3.6)–(3.9) from old ones. For example, the time-rescaling invariant, particular solution

$$
\begin{aligned}
\theta(t,\mathbf{r}) &= -\frac{mr^2}{2(d-1)t},\\
\rho(t,\mathbf{r}) &= \sqrt{\frac{2\lambda}{md}}\,(d-1)\frac{|t|}{r}
\end{aligned}
\tag{3.18}
$$

can be transformed by (3.12)–(3.14) into new solutions $\theta_\omega$ and $\rho_\omega$, which are very different in character from (3.18) [11]. Note that in (3.18) we must have d > 1 and λ > 0.

At d = 1, we can obtain general, time-rescaling invariant solutions. With the time-rescaling–invariant Ansatz θ ∝ 1/t, (3.17) leads to a second order differential equation for the x-dependence of θ. Therefore solutions involve two arbitrary constants, one of which fixes the origin of x, and can be ignored. The other, which we call k, appears in two distinct families of solutions (which are related by an imaginary shift of x):

$$ \theta(t,x) = -\frac{m}{2k^2t}\cosh^2 kx, \tag{3.19a} $$

$$ \theta(t,x) = \frac{m}{2k^2t}\sinh^2 kx. \tag{3.19b} $$

For real θ, k must be real or imaginary. When a real ρ is computed from (3.7), we find that k must be real for λ > 0, imaginary for λ < 0, and a nonsingular density exists only with (3.19a) for λ > 0,

$$ \rho(t,x) = \sqrt{\frac{2\lambda}{m}}\,\frac{k|t|}{\cosh^2 kx}. \tag{3.20} $$

The current $j = \frac{\theta'}{m}\rho$ exhibits a kink profile (differentiation with respect to the single spatial variable is indicated by a dash)

$$ j(t,x) = \mp\sqrt{\frac{2\lambda}{m}}\tanh kx, \tag{3.21} $$

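where the sign is fixed by the sign of t. In the last section we shall further review the d = 1 case.

Another interesting solution, which is essentially one-dimensional, even though it exists in arbitrary spatial dimension, is given by

$$ \theta(t,\mathbf{r}) = \Theta(\hat{\mathbf{n}}\cdot\mathbf{r}) + m\mathbf{u}\cdot\mathbf{r} - \tfrac{1}{2}mt\bigl(u^2 - (\hat{\mathbf{n}}\cdot\mathbf{u})^2\bigr). \tag{3.22} $$

Here $\hat{\mathbf{n}}$ is a spatial unit vector, and u is an arbitrary vector with dimension of velocity, while Θ is an arbitrary function with static argument, which can be boosted by the Galileo transformation (2.26). The corresponding charge density is time-independent:

$$ \rho(t,\mathbf{r}) = \frac{\sqrt{2\lambda/m}}{\hat{\mathbf{n}}\cdot\mathbf{u} + \Theta'(\hat{\mathbf{n}}\cdot\mathbf{r})/m} \tag{3.23} $$

and the static current becomes

$$ \mathbf{j}(t,\mathbf{r}) = \sqrt{\frac{2\lambda}{m}}\left(\hat{\mathbf{n}} + \frac{\mathbf{u} - \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{u})}{\hat{\mathbf{n}}\cdot\mathbf{u} + \Theta'(\hat{\mathbf{n}}\cdot\mathbf{r})/m}\right). \tag{3.24} $$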
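The time-rescaling invariant solutions quoted in this subsection can be tested against the Bernoulli equation (3.7) and the continuity equation (3.6) by finite differences. The sketch below is only an illustration under our reading of the formulas: the radial solution is taken as θ = −mr²/(2(d − 1)t) with ρ = √(2λ/(md)) (d − 1)|t|/r, the d = 1 solution as (3.19a) with density (3.20), and the radial divergence is written as r^(1−d) ∂_r(r^(d−1) ·). The values of m and λ and all function names are assumptions chosen here.

```python
import math

M, LAM = 1.3, 0.7   # illustrative values of m and lambda (assumptions)
H = 1e-4            # finite-difference step


def diff(f, x):
    """Central difference of a one-argument function."""
    return (f(x + H) - f(x - H)) / (2 * H)


def theta_radial(t, r, d):
    return -M * r * r / (2 * (d - 1) * t)


def rho_radial(t, r, d):
    return math.sqrt(2 * LAM / (M * d)) * (d - 1) * abs(t) / r


def radial_residuals(t, r, d):
    """Residuals of (3.7) and (3.6) for the radial solution in d dimensions."""
    theta_t = diff(lambda s: theta_radial(s, r, d), t)
    theta_r = diff(lambda s: theta_radial(t, s, d), r)
    bernoulli = theta_t + theta_r ** 2 / (2 * M) - LAM / rho_radial(t, r, d) ** 2

    def flux(s):  # r^(d-1) * rho * v_r
        grad = diff(lambda q: theta_radial(t, q, d), s)
        return s ** (d - 1) * rho_radial(t, s, d) * grad / M

    rho_t = diff(lambda s: rho_radial(s, r, d), t)
    continuity = rho_t + diff(flux, r) / r ** (d - 1)
    return bernoulli, continuity


def theta_kink(t, x, k):
    return -M * math.cosh(k * x) ** 2 / (2 * k * k * t)


def rho_kink(t, x, k):
    return math.sqrt(2 * LAM / M) * k * abs(t) / math.cosh(k * x) ** 2


def kink_current(t, x, k):
    """j = theta' rho / m, to be compared with (3.21)."""
    return diff(lambda s: theta_kink(t, s, k), x) * rho_kink(t, x, k) / M


def kink_residuals(t, x, k):
    """Residuals of (3.7) and (3.6) for the d = 1 solution."""
    theta_t = diff(lambda s: theta_kink(s, x, k), t)
    theta_x = diff(lambda s: theta_kink(t, s, k), x)
    bernoulli = theta_t + theta_x ** 2 / (2 * M) - LAM / rho_kink(t, x, k) ** 2
    rho_t = diff(lambda s: rho_kink(s, x, k), t)
    continuity = rho_t + diff(lambda s: kink_current(t, s, k), x)
    return bernoulli, continuity
```

With these conventions the residuals vanish up to discretization error, and `kink_current` reproduces the tanh profile with a sign opposite to that of t.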

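3.2. Relativistic motion. We seek an interacting generalization of $L_0^{\mathrm{Lorentz}}$, which preserves Poincaré invariance. To find this, proceed as follows. Let

$$ L^{\mathrm{Lorentz}} = \int d^d r\,\{\theta\dot\rho - \mathcal{H}(\rho,\mathbf{p})\} \tag{3.25} $$

with p given by (2.32) and $\mathcal{H}$ to be determined. The symplectic structure is as in the free theory, hence the Poisson brackets retain the form (2.24), (2.25). We calculate the Poisson bracket between two Hamiltonian densities; in one the fields are evaluated at r, in the other at r′:

$$ \{\mathcal{H}(\mathbf{r}),\,\mathcal{H}(\mathbf{r}')\} = \int d\mathbf{r}''\,d\mathbf{r}'''\left(\frac{\delta\mathcal{H}(\mathbf{r})}{\delta\mathbf{p}(\mathbf{r}'')}\cdot\nabla\delta(\mathbf{r}'' - \mathbf{r}''')\frac{\delta\mathcal{H}(\mathbf{r}')}{\delta\rho(\mathbf{r}''')} - \mathbf{r}\leftrightarrow\mathbf{r}'\right). \tag{3.26a} $$

We assume that $\mathcal{H}$ is a local function of p and ρ, so that the functional derivatives lead to ordinary derivatives, and (3.26a) becomes

$$ \{\mathcal{H}(\mathbf{r}),\,\mathcal{H}(\mathbf{r}')\} = \left(\frac{\partial\mathcal{H}(\mathbf{r})}{\partial\mathbf{p}}\frac{\partial\mathcal{H}(\mathbf{r})}{\partial\rho} + \frac{\partial\mathcal{H}(\mathbf{r}')}{\partial\mathbf{p}}\frac{\partial\mathcal{H}(\mathbf{r}')}{\partial\rho}\right)\cdot\nabla\delta(\mathbf{r} - \mathbf{r}'). \tag{3.26b} $$

On the other hand, the Dirac-Schwinger condition for Lorentz invariance states that the bracket (3.26) should give rise to c² times the momentum density $\boldsymbol{\mathcal{P}}$, which in this problem is given in (2.19) as

$$ \boldsymbol{\mathcal{P}} = \rho\mathbf{p}. \tag{3.27} $$

Rotational invariance requires that the p dependence of $\mathcal{H}$ is only on the magnitude p. Thus we conclude that

$$ 4\,\frac{\partial\mathcal{H}}{\partial p^2}\,\frac{\partial\mathcal{H}}{\partial\rho^2} = c^2. \tag{3.28} $$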


While many forms for $\mathcal H$ can solve (3.28), we take a solution that is relevant to the present context, i.e., it generalizes in a simple manner the free Hamiltonian density (2.34), leads to a theory that descends from the Nambu–Goto action, and coincides in the nonrelativistic limit with the Galileo-invariant Chaplygin gas model:

$$\mathcal H = \sqrt{\rho^2c^2 + a^2}\,\sqrt{m^2c^2 + (\nabla\theta)^2} = \sqrt{\rho^2 + a^2/c^2}\;\frac{mc^2}{\sqrt{1 - v^2/c^2}}. \tag{3.29}$$
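As a hedged numerical sanity check (not part of the original derivation), the following sketch evaluates the left-hand side of (3.28) for the density (3.29) by central differences in ρ² and p². The parameter values and the function names are illustrative assumptions; `p` stands for the magnitude |∇θ|.

```python
import math

def hamiltonian_density(rho, p, m=1.0, c=3.0, a=0.5):
    """Density (3.29): sqrt(rho^2 c^2 + a^2) * sqrt(m^2 c^2 + p^2)."""
    return math.sqrt(rho**2 * c**2 + a**2) * math.sqrt(m**2 * c**2 + p**2)

def schwinger_lhs(rho, p, m=1.0, c=3.0, a=0.5, h=1e-6):
    """4 (dH/d(p^2)) (dH/d(rho^2)), estimated by central differences."""
    def h_of_squares(rho2, p2):
        # Treat H as a function of the squared variables rho^2 and p^2.
        return hamiltonian_density(math.sqrt(rho2), math.sqrt(p2), m, c, a)

    rho2, p2 = rho**2, p**2
    dh_dp2 = (h_of_squares(rho2, p2 + h) - h_of_squares(rho2, p2 - h)) / (2 * h)
    dh_drho2 = (h_of_squares(rho2 + h, p2) - h_of_squares(rho2 - h, p2)) / (2 * h)
    return 4.0 * dh_dp2 * dh_drho2

# With c = 3 the result should be close to c^2 = 9 for any rho, p.
value = schwinger_lhs(1.2, 0.7)
```

The check is independent of the point (ρ, p) chosen, which is what condition (3.28) demands.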

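Evidently the parameter a measures strength of “interaction”. Thus an interacting, Poincaré-invariant theory is described by the Lagrangian

$$L_a^{\rm Lorentz} = \int d^dr\left[\theta\dot\rho - \sqrt{\rho^2c^2 + a^2}\,\sqrt{m^2c^2 + (\nabla\theta)^2}\right]. \tag{3.30}$$

The corresponding conserved Lorentz generator takes the same form as in the first equality of (2.41), with $\mathcal H$ given by (3.29), and it generates the infinitesimal transformation

$$\delta\rho = \boldsymbol\beta\cdot\left(\frac{\mathbf r}{c}\frac{\partial}{\partial t} + ct\nabla\right)\rho - \boldsymbol\beta\cdot\frac{\mathbf v}{c}\sqrt{\rho^2 + a^2/c^2}, \qquad \delta\theta = \boldsymbol\beta\cdot\left(\frac{\mathbf r}{c}\frac{\partial}{\partial t} + ct\nabla\right)\theta. \tag{3.31}$$

The finite transformation law is gotten by iterating (3.31), and the generalization of (2.37) becomes

$$\rho(t,\mathbf r)\to\rho_{\mathbf u}(t,\mathbf r) = \rho(T,\mathbf R)\,\tfrac12(\Omega_+ + \Omega_-) + \sqrt{\rho^2(T,\mathbf R) + a^2/c^2}\;\tfrac12(\Omega_+ - \Omega_-), \qquad \theta(t,\mathbf r)\to\theta_{\mathbf u}(t,\mathbf r) = \theta(T,\mathbf R), \tag{3.32}$$

where the Lorentz transformed coordinates (T, R) are given in (2.38) and

$$\Omega_\pm \equiv \frac{\dot\theta(T,\mathbf R) \pm c\sqrt{m^2c^2 + (\nabla\theta(T,\mathbf R))^2}}{\partial_t\theta(T,\mathbf R) \pm c\sqrt{m^2c^2 + (\nabla_{\mathbf r}\theta(T,\mathbf R))^2}}. \tag{3.33}$$

It follows that $j^\alpha = \left(\rho,\ \frac{\mathbf v}{c}\sqrt{\rho^2 + a^2/c^2}\right)$ and $U^\alpha = \left(\frac{\rho}{\sqrt{\rho^2 + a^2/c^2}},\ \frac{\mathbf v}{c}\right)\frac{1}{\sqrt{1 - v^2/c^2}}$ are Lorentz vectors, while θ is a scalar. The equations of motion are

$$\dot\rho + \nabla\cdot\left(\frac{c\,\nabla\theta}{\sqrt{m^2c^2 + (\nabla\theta)^2}}\sqrt{\rho^2 + a^2/c^2}\right) = \dot\rho + \nabla\cdot\left(\mathbf v\sqrt{\rho^2 + a^2/c^2}\right) = 0, \tag{3.34a}$$

$$\dot\theta + \rho c\,\frac{\sqrt{m^2c^2 + (\nabla\theta)^2}}{\sqrt{\rho^2 + a^2/c^2}} = \dot\theta + \frac{\rho}{\sqrt{\rho^2 + a^2/c^2}}\,\frac{mc^2}{\sqrt{1 - v^2/c^2}} = 0. \tag{3.34b}$$

Using (3.34b) to express ρ in terms of θ,

$$\rho = -\frac{a}{c^2}\,\frac{\partial_0\theta}{\sqrt{m^2c^2 - (\partial_\mu\theta)^2}}, \tag{3.35}$$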


and substituting this in (3.34a) yields a second order, Lorentz covariant equation for θ. That equation may also be gotten by eliminating ρ from (3.30) and deriving a Lagrangian for θ,

$$L_\theta = -a\int d^dr\,\sqrt{m^2c^2 - (\partial_\alpha\theta)^2}. \tag{3.36}$$

This is the “Born–Infeld” Lagrangian, leading to the equation of motion

$$\partial^\alpha\left(\frac{1}{\sqrt{m^2c^2 - (\partial_\mu\theta)^2}}\,\partial_\alpha\theta\right) = 0. \tag{3.37}$$

As in the nonrelativistic theory [see (3.16)] the possibility of expressing ρ in terms of θ requires presence of the “interaction”, a ≠ 0, whose nonvanishing strength disappears from the nonlinear, interacting equations for θ. The energy-momentum tensor for the theory (3.30) is Lorentz covariant, of second rank, and symmetric. After eliminating ρ with (3.35), the resulting expression, depending solely on θ, bears the usual relation to its Lagrangian (3.36).

Since the interacting Lorentz-invariant model is a descendant of the Nambu–Goto Lagrangian (see Sect. 4), it comes as no surprise that it too possesses additional kinematic symmetries, whose generators supplement the generators of the linearly realized Poincaré group in (d, 1) dimensions to give a nonlinear realization of dynamical Poincaré algebra in (d + 1, 1) dimensions [13]. The additional symmetry transformations, which leave (3.30) or (3.36) invariant, involve a field-dependent reparameterization of time, defined implicitly by

$$t\to T(t,\mathbf r) = \frac{t}{\cosh mc^2\omega} + \frac{\theta(T,\mathbf r)}{mc^2}\tanh mc^2\omega, \tag{3.38}$$

under which the field transforms according to

$$\theta(t,\mathbf r)\to\theta_\omega(t,\mathbf r) = \frac{\theta(T,\mathbf r)}{\cosh mc^2\omega} - mc^2\,t\,\tanh mc^2\omega. \tag{3.39}$$

[We record only the action of the transformations on θ; their effect on ρ can be read off from (3.35).] The infinitesimal generator, which is time independent by virtue of the equation of motion (3.37), is

$$D = \int d^dr\left(m^2c^4\,t\rho + \theta\sqrt{\rho^2c^2 + a^2}\,\sqrt{m^2c^2 + (\nabla\theta)^2}\right) = \int d^dr\,(m^2c^4\,t\rho + \theta\mathcal H). \tag{3.40}$$

A second class of invariances involves a reparameterization of the spatial variable, implicitly defined by

$$\mathbf r\to\mathbf R(t,\mathbf r) = \mathbf r - \hat{\boldsymbol\omega}\,\theta(t,\mathbf R)\,\frac{\tan mc\omega}{mc} + \hat{\boldsymbol\omega}\,\hat{\boldsymbol\omega}\cdot\mathbf r\,\frac{1 - \cos mc\omega}{\cos mc\omega}, \tag{3.41}$$

$$\theta(t,\mathbf r)\to\theta_{\boldsymbol\omega}(t,\mathbf r) = \frac{\theta(t,\mathbf R) - mc\,\hat{\boldsymbol\omega}\cdot\mathbf r\,\sin mc\omega}{\cos mc\omega}. \tag{3.42}$$


Here $\boldsymbol\omega$ is a vectorial parameter, $\hat{\boldsymbol\omega} = \boldsymbol\omega/\omega$, $\omega = \sqrt{\boldsymbol\omega^2}$. The time-independent generator of the infinitesimal transformation reads

$$\mathbf G = \int d^dr\,(m^2c^2\,\mathbf r\rho + \theta\rho\nabla\theta) = \int d^dr\,(m^2c^2\,\mathbf r\rho + \theta\boldsymbol{\mathcal P}). \tag{3.43}$$

With the addition of D and G to the previous generators, the Poincaré algebra in (d + 1, 1) dimension is reconstructed, and the transformation laws (3.38), (3.39), (3.41), (3.42) ensure that (t, r, θ) transforms as a (d + 2)-dimensional Lorentz vector (in Cartesian components) [2]. Note that this symmetry also holds in the free, a = 0, theory. Because of the extended symmetry, one can generate new solutions from old ones since both $\theta_\omega$ and $\theta_{\boldsymbol\omega}$ in (3.39) and (3.42) solve the equation of motion if θ does.

A remarkable fact is that the nonrelativistic limit of the above relativistic and interacting model precisely corresponds to the nonrelativistic interacting model discussed previously. This is easily seen from (3.30), which in the limit gives (3.1), with the help of (2.43),

$$L_a^{\rm Lorentz}\to -\frac{d}{dt}\int d^dr\,mc^2\,t\rho + \int d^dr\left[\theta_{NR}\dot\rho - \rho\,\frac{(\nabla\theta_{NR})^2}{2m} - \frac{a^2m}{2\rho}\right] = -mc^2N + L^{\rm Galileo}_{\lambda = a^2m/2}. \tag{3.44}$$

Equivalently, when ρ is eliminated, we have from (3.36) and (2.43),

$$L_\theta\to -a\int d^dr\,\sqrt{2m\dot\theta_{NR} + (\nabla\theta_{NR})^2} = L_{\lambda = a^2m/2}. \tag{3.45}$$

Correspondingly, the equation of motion (3.37) goes over into (3.17). It is easy to exhibit solutions of the relativistic theory, which reduce to solutions of the nonrelativistic equations that were given previously. The following profiles solve (3.37):

$$\theta(t,\mathbf r) = -mc\,\sqrt{c^2t^2 + r^2/(d-1)}. \tag{3.46}$$

With (2.43), this reduces to (3.18). In one dimension we have

$$\theta(t,x) = -mc\,\sqrt{c^2t^2 + \cosh^2 kx/k^2} \tag{3.47}$$

reducing to (3.19a). The relativistic analog of the lineal solution (3.22) is

$$\theta(t,\mathbf r) = \Theta(\hat{\mathbf n}\cdot\mathbf r) + m\,\mathbf u\cdot\mathbf r - mct\,\sqrt{c^2 + u^2 - (\hat{\mathbf n}\cdot\mathbf u)^2}. \tag{3.48}$$
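The claim that the one-dimensional profile (3.47) solves the Born–Infeld equation (3.37) can be probed numerically. The sketch below is only an illustration under stated assumptions: it works in 1+1 dimensions, takes ∂₀ = ∂ₜ with $(\partial_\mu\theta)^2 = \dot\theta^2/c^2 - \theta'^2$ (the convention that appears consistent with (3.35) and (3.51)), uses nested central differences, and picks arbitrary values for m, c, k. All function names are hypothetical.

```python
import math

M, C, K = 1.0, 2.0, 1.0      # illustrative values of m, c, k
H_INNER, H_OUTER = 1e-5, 1e-3  # finite-difference steps

def theta(t, x):
    """Profile (3.47)."""
    return -M * C * math.sqrt(C**2 * t**2 + math.cosh(K * x) ** 2 / K**2)

def d_dt(f, t, x, h):
    return (f(t + h, x) - f(t - h, x)) / (2 * h)

def d_dx(f, t, x, h):
    return (f(t, x + h) - f(t, x - h)) / (2 * h)

def root(t, x):
    """sqrt(m^2 c^2 - (d_mu theta)^2) with the assumed metric convention."""
    th_t = d_dt(theta, t, x, H_INNER)
    th_x = d_dx(theta, t, x, H_INNER)
    return math.sqrt(M**2 * C**2 - th_t**2 / C**2 + th_x**2)

def flux_t(t, x):
    return d_dt(theta, t, x, H_INNER) / root(t, x)

def flux_x(t, x):
    return d_dx(theta, t, x, H_INNER) / root(t, x)

def born_infeld_residual(t, x):
    """Left-hand side of (3.37) in 1+1 dimensions; should vanish."""
    time_part = d_dt(flux_t, t, x, H_OUTER) / C**2
    space_part = d_dx(flux_x, t, x, H_OUTER)
    return time_part - space_part

residual = born_infeld_residual(0.7, 0.3)
```

The two terms entering the residual are individually of order one at the sample point, so a residual at the level of the discretization error is a meaningful (if not rigorous) confirmation.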

Note that the above profiles continue to solve (3.37), even when the sign of the square root is reversed; but then they no longer possess a nonrelativistic limit. Additionally, there exists an essentially relativistic, chiral solution describing massless propagation in one direction: θ can satisfy the wave equation

$$\Box\theta = 0 \tag{3.49a}$$

when

$$(\partial_\mu\theta)^2 = \text{constant} \tag{3.49b}$$


as, for example, with plane waves

$$\theta(t,\mathbf r) = f(\hat{\mathbf n}\cdot\mathbf r \pm ct), \tag{3.50}$$

where $(\partial_\mu\theta)^2$ vanishes. Then ρ reads from (3.35)

$$\rho = \mp\frac{a}{mc^2}\,f'. \tag{3.51}$$

4. Relation to Nambu–Goto Action

The Nambu–Goto action for a d-brane in (d + 1) spatial dimensions, moving in time on (d + 1, 1) Minkowski space is

$$I_{\rm N\text{-}G} = -\int d\phi^0\,d\phi^1\cdots d\phi^d\,\sqrt{G}, \tag{4.1}$$

where G is $(-1)^d$ times the determinant of the induced metric

$$G_{\alpha\beta} = \frac{\partial X^\mu}{\partial\phi^\alpha}\,\frac{\partial X_\mu}{\partial\phi^\beta}. \tag{4.2}$$

Here Greek letters, from the beginning of the alphabet, label the quantities $\phi^\alpha = (\phi^0, \boldsymbol\phi)$, with which the d-brane coordinate $X^\mu$ is parameterized; $\phi^0$ is the evolution parameter, and $\boldsymbol\phi = \{\phi^a,\ a = 1,\dots,d\}$ are the fixed-time, spatial parameters. These d-brane coordinates carry a Greek-letter index from the middle of the alphabet, with value 0 for the temporal coordinate $X^0$ and m for the d-brane’s d + 1 spatial coordinates, $\mathbf X = \{X^m,\ m = 1,\dots,d,d+1\}$. The action is invariant against reparameterizations of the $\phi^\alpha$, and we make the parameterization choice that the d coordinates $X^m$, m = 1, . . . , d, are given by $\phi^m$, which we rename $r^m$. For the remaining parameters we use one of two options, “light-cone” and “Cartesian”. In the light-cone option, we define

$$X^\pm = \frac{1}{\sqrt2}\,(X^0 \pm X^{d+1}) \tag{4.3}$$

and for $X^+$ choose the parameterization $X^+ = \sqrt{2\lambda m}\,\phi^0$; also we rename $X^+$ as t. The remaining coordinate $X^-$, a function of $\phi^0 = t/\sqrt{2\lambda m}$ and $\boldsymbol\phi = \mathbf r$, is renamed θ(t, r)/m. Upon evaluating the determinant G, we see that the Nambu–Goto action coincides with the action for (3.16) [14]. This identity also explains the higher symmetry noted in Eqs. (3.10) and (3.13). Our choice of parameterization does not interfere with invariance against the (d + 1, 1) Poincaré group, which acts on $X^\mu$. In the chosen parameterization, the Poincaré transformation acts nonlinearly, mixing coordinates (t, r) with the field θ. (However, the higher symmetry is also enjoyed by the noninteracting theory, λ = 0, which is not equivalent to the Nambu–Goto model.)

For the second, Cartesian option the chosen parameterization permits writing $X^0$, which is renamed ct, as $amc\,\phi^0$, while the last coordinate, $X^{d+1}$, which is a function of $\phi^0 = t/am$ and $\boldsymbol\phi = \mathbf r$, is called θ(t, r)/mc. With these choices the Nambu–Goto action coincides with that for the Born–Infeld Lagrangian, Eq. (3.36). Again the higher dynamical symmetry, described by Eqs. (3.38)–(3.43), is now understood as the covariance of


the Nambu–Goto variables $X^\mu$ against (d + 1, 1)-dimensional Poincaré transformations. (But once again, the similar invariance of the free theory, a = 0, cannot be related to properties of a Nambu–Goto action.) The Nambu–Goto action is normalized by the d-brane tension, which has been scaled to unity in (4.1). Thus the non-relativistic and relativistic free models (λ = 0 = a) may be viewed as descriptions of “tension-less” d-branes.

Since both the (d, 1)-dimensional Galileo-invariant Chaplygin gas equations and the (d, 1)-dimensional Poincaré-invariant Born–Infeld equations correspond to different parameterizations of the (d + 1, 1) Nambu–Goto action, there must be a transformation – recognized as a reparameterization – that takes solutions of one model into the other. This transformation may be formulated as follows. Given a solution $\theta_{NR}(t,\mathbf r)$ to (3.17), we solve for T(t, r) from the equation

$$\frac{1}{\sqrt2}\left[T(t,\mathbf r) + \frac{1}{mc^2}\,\theta_{NR}\bigl(T(t,\mathbf r),\mathbf r\bigr)\right] = t. \tag{4.4}$$

Then a solution $\theta_R(t,\mathbf r)$ to (3.37) is given by

$$\frac{1}{mc^2}\,\theta_R(t,\mathbf r) = \frac{1}{\sqrt2}\left[T(t,\mathbf r) - \frac{1}{mc^2}\,\theta_{NR}\bigl(T(t,\mathbf r),\mathbf r\bigr)\right] \tag{4.5a}$$

or

$$\theta_R(t,\mathbf r) = mc^2\left(\sqrt2\,T(t,\mathbf r) - t\right). \tag{4.5b}$$

Indeed this mapping produces solutions (3.46), (3.47) and (3.48) from (3.18), (3.19a) and (3.22) respectively. [In fact both signs of the square root are obtained; also (3.48) emerges from (3.22) only after a redefinition of Θ and u.] Oppositely, given a solution $\theta_R(t,\mathbf r)$ to the relativistic Born–Infeld equation (3.37), we reparameterize by solving for T(t, r) from

$$\frac{1}{\sqrt2}\left[T(t,\mathbf r) + \frac{1}{mc^2}\,\theta_R\bigl(T(t,\mathbf r),\mathbf r\bigr)\right] = t \tag{4.6}$$

and then find $\theta_{NR}(t,\mathbf r)$ from

$$\frac{1}{mc^2}\,\theta_{NR}(t,\mathbf r) = \frac{1}{\sqrt2}\left[T(t,\mathbf r) - \frac{1}{mc^2}\,\theta_R\bigl(T(t,\mathbf r),\mathbf r\bigr)\right] \tag{4.7a}$$

or

$$\theta_{NR}(t,\mathbf r) = mc^2\left(\sqrt2\,T(t,\mathbf r) - t\right). \tag{4.7b}$$

The two transformations are collected in the statement

$$\theta_{NR}(t,\mathbf r) = mc^2\,(\sqrt2\,T - t), \qquad \theta_R(T,\mathbf r) = mc^2\,(\sqrt2\,t - T) \tag{4.8}$$

with the instruction that obtaining an expression for θN R in terms of θR , or vice-versa, requires that one of T or t be eliminated in favor of the other.
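A small numerical sketch of the map (4.4), (4.5b) may help make the elimination step concrete. It assumes that the nonrelativistic profile (3.18) has the form θ_NR = −m r²/(2(d−1)t), which is what the c → ∞ limit of (3.46) suggests; that form, the parameter values, and the function names are assumptions of this illustration. For this profile (4.4) is a quadratic in T, so both roots can be written down and fed into (4.5b).

```python
import math

def theta_nr(T, r, m, d):
    """Assumed form of (3.18): -m r^2 / (2 (d-1) T)."""
    return -m * r**2 / (2.0 * (d - 1) * T)

def solve_for_T(t, r, c, d):
    """Both roots T of (4.4) for the profile above.

    (T + theta_NR(T, r)/(m c^2)) / sqrt(2) = t reduces to
    T^2 - sqrt(2) t T - r^2 / (2 (d-1) c^2) = 0; the mass m drops out.
    """
    b = math.sqrt(2.0) * t
    q = r**2 / (2.0 * (d - 1) * c**2)
    disc = math.sqrt(b * b + 4.0 * q)
    return (b - disc) / 2.0, (b + disc) / 2.0

def theta_r_from_map(T, t, m, c):
    """Eq. (4.5b): theta_R = m c^2 (sqrt(2) T - t)."""
    return m * c**2 * (math.sqrt(2.0) * T - t)

def theta_r_closed_form(t, r, m, c, d):
    """Eq. (3.46)."""
    return -m * c * math.sqrt(c**2 * t**2 + r**2 / (d - 1))

m, c, d, t, r = 1.3, 2.0, 3, 0.7, 1.1
T_minus, T_plus = solve_for_T(t, r, c, d)
mapped_minus = theta_r_from_map(T_minus, t, m, c)  # reproduces (3.46)
mapped_plus = theta_r_from_map(T_plus, t, m, c)    # the opposite sign of the root
```

The two roots of the quadratic give the two signs of the square root mentioned in the text; only one of them has a nonrelativistic limit.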


The interrelationships may be summarized by the following diagram:

Parameterization-invariant Nambu–Goto d-branes in (d + 1, 1) spacetime
  -- light-cone parameterization --> Galileo-invariant Chaplygin gas in (d, 1) spacetime
  -- Cartesian parameterization --> Poincaré-invariant Born–Infeld model in (d, 1) spacetime
Born–Infeld model -- nonrelativistic limit --> Chaplygin gas
Chaplygin gas <-- exact transformation --> Born–Infeld model

It is striking that there exists a two-fold relationship between the Chaplygin gas and the Born–Infeld model. First, there is the exact mapping, given in (4.8), of one into the other. Second, the nonrelativistic limit (3.44), (3.45) of the latter produces the former.

5. One-Dimensional Motion

In one spatial dimension, the motion of these systems simplifies and they become integrable. We shall give a self-contained demonstration of integrability and derive the integrals of motion in a compact form, stressing the connection between the relativistic and nonrelativistic case. In one dimension the requirement that the local momentum is irrotational poses no restriction. We shall use as phase space variables the local momentum p(t, x) and the particle density ρ(t, x). The equal-time Poisson bracket (2.25) becomes

$$\{p(x), \rho(x')\} = \delta'(x - x'). \tag{5.1}$$

Note that p and ρ in this relation are on an equal footing and are governed by a nonlocal canonical 1-form $\frac12\int dx\,dy\,\dot\rho(x)\,\varepsilon(x-y)\,p(y)\,dt$, where ε is the antisymmetric step function. We shall consider local integrals, that is, quantities of the form

$$F = \int dx\,\mathcal F(p(x), \rho(x)) \tag{5.2}$$

with $\mathcal F$ a local function of p and ρ. The Poisson bracket of two such integrals F and G is calculated through (5.1) as

$$\{F, G\} = -\int dx\left[(\mathcal F_{\rho\rho}\mathcal G_p + \mathcal F_{\rho p}\mathcal G_\rho)\,\partial_x\rho + (\mathcal F_{\rho p}\mathcal G_p + \mathcal F_{pp}\mathcal G_\rho)\,\partial_x p\right], \tag{5.3}$$

where we suppressed the dependence on x and subscripts indicate partial derivative. If the above integrand is a total x-derivative then (with appropriate boundary conditions)


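the integral will vanish and F and G will be in involution. For this we need the curl-free condition

$$(\mathcal F_{\rho\rho}\mathcal G_p + \mathcal F_{\rho p}\mathcal G_\rho)_p = (\mathcal F_{\rho p}\mathcal G_p + \mathcal F_{pp}\mathcal G_\rho)_\rho \tag{5.4}$$

or, finally

$$\frac{\mathcal F_{\rho\rho}}{\mathcal F_{pp}} = \frac{\mathcal G_{\rho\rho}}{\mathcal G_{pp}}. \tag{5.5}$$

Choosing one of the integrals to be the Hamiltonian $H = \int dx\,\mathcal H$, the well-known condition

$$\frac{\mathcal F_{\rho\rho}}{\mathcal F_{pp}} = \frac{\mathcal H_{\rho\rho}}{\mathcal H_{pp}} \tag{5.6}$$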

guarantees that F is a constant of the motion. If we recover a set of such integrals satisfying (5.6) then they will obviously also satisfy (5.5) among themselves. Therefore, constants of motion will automatically be in involution.

For the nonrelativistic case the Hamiltonian density (3.2), (3.4) is

$$\mathcal H = \rho\,\frac{p^2}{2m} + \frac{\lambda}{\rho}, \tag{5.7}$$

and therefore the integrals of motion are generated by functions that satisfy

$$\rho^4\,\mathcal F_{\rho\rho} = 2\lambda m\,\mathcal F_{pp}. \tag{5.8}$$

This can readily be solved by separation of variables. We prefer, however, simply to give its general solution in terms of two arbitrary functions f and g of one variable:

$$\mathcal F = \rho\,f\!\left(p + \frac{\sqrt{2\lambda m}}{\rho}\right) + \rho\,g\!\left(p - \frac{\sqrt{2\lambda m}}{\rho}\right). \tag{5.9}$$

We essentially get two infinite towers of integrals. Choosing, e.g., $f(z) = z^n$, g = 0 or $g(z) = z^n$, f = 0, we get the integrals

$$I_n^\pm = \int dx\,\rho\left(p \pm \frac{\sqrt{2\lambda m}}{\rho}\right)^n. \tag{5.10}$$

All the integrals presented in [5,16] can be identified as linear combinations of the $I_n^\pm$. As stated, the $I_n^\pm$ are all in involution and demonstrate the complete integrability of the system. The Hamiltonian, in particular, is included as $4mH = I_2^+ + I_2^-$ and the total momentum as $2P = I_1^+ + I_1^-$. The quantities

$$R_\pm = p \pm \frac{\sqrt{2\lambda m}}{\rho} \tag{5.11}$$

appearing above are known as Riemann coordinates. The equations of motion for this system (continuity and Euler) are summarized in terms of $R_\pm$:

$$\dot R_\pm = -\frac1m\,R_\mp R'_\pm. \tag{5.12}$$
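As an illustrative check (a sketch, not the authors' procedure), one can confirm numerically that the integrands of the towers in (5.10) satisfy the involution condition (5.8). The values of λ and m, the sample point, and the helper names below are assumptions chosen for the example; second derivatives are estimated by central differences.

```python
import math

LAM, M = 0.7, 1.3             # illustrative lambda and m
S = math.sqrt(2 * LAM * M)    # sqrt(2 lambda m)

def tower_density(rho, p, n, sign=+1):
    """Integrand of I_n^{+-} in (5.10): rho * (p +- sqrt(2 lambda m)/rho)^n."""
    return rho * (p + sign * S / rho) ** n

def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def involution_residual(rho, p, n, sign=+1):
    """rho^4 F_rhorho - 2 lambda m F_pp, which (5.8) requires to vanish."""
    f_rr = second_derivative(lambda r: tower_density(r, p, n, sign), rho)
    f_pp = second_derivative(lambda q: tower_density(rho, q, n, sign), p)
    return rho**4 * f_rr - 2 * LAM * M * f_pp

residuals = [involution_residual(1.1, 0.4, n, s) for n in range(1, 5) for s in (+1, -1)]
```

All entries of `residuals` should be zero up to finite-difference error, for both towers.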


Although this formulation for the equations is known, the relation to the constants of motion does not seem to appear in the literature.

Another fluid system for which the equations of motion, expressed in terms of Riemann coordinates, take a form similar to (5.12) possesses a potential cubic in ρ: $V(\rho) = \ell\rho^3/3$. This also arises in a collective, semiclassical description of nonrelativistic free fermions, where the cubic potential reproduces fermion repulsion [17]. In this case, the Riemann coordinates read

$$R_\pm = p \pm \sqrt{2\ell m}\,\rho \tag{5.13}$$

and, in contrast to (5.12), they decouple in the equations of motion:

$$\dot R_\pm = -\frac1m\,R_\pm R'_\pm. \tag{5.14}$$

Indeed it is seen that $R_\pm$ satisfy essentially the free Euler equation [compare (1.2) and identify $R_\pm$ with v]. Consequently (5.14) is solved by analogs of (2.2)–(2.4).

Both these examples are special limiting cases of a more general system, with potential

$$V(\rho) = \lambda\,\frac{\rho + \frac23\alpha}{(\rho + \alpha)^2} + A\rho + B. \tag{5.15}$$

(The terms A, B correspond to a dynamically trivial part that does not alter the equations of motion.) The Chaplygin gas corresponds to α = A = B = 0, while the cubic potential is recovered in the limit

$$\lambda = \ell\alpha^4, \qquad A = \tfrac13\ell\alpha^2, \qquad B = -\tfrac23\ell\alpha^3, \qquad \alpha\to\infty. \tag{5.16}$$

The Riemann coordinates are

$$R_\pm = p \pm \frac{\sqrt{2\lambda m}}{\rho + \alpha} \tag{5.17}$$

and the conserved integrals of this system are given by functions of $R_\pm$:

$$\mathcal F = (\rho + \alpha)\,f\!\left(p + \frac{\sqrt{2\lambda m}}{\rho + \alpha}\right) + (\rho + \alpha)\,g\!\left(p - \frac{\sqrt{2\lambda m}}{\rho + \alpha}\right). \tag{5.18}$$
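The two limiting cases of the general potential can be checked with exact rational arithmetic. The sketch below reads (5.15) as V(ρ) = λ(ρ + 2α/3)/(ρ + α)² + Aρ + B, which is an interpretation of the printed formula; function names and sample values are hypothetical. With the parameter choice of the cubic limit, the combination collapses to (ℓρ³/3)·α²/(ρ + α)², which tends to ℓρ³/3 as α grows.

```python
from fractions import Fraction

def general_potential(rho, lam, alpha, A, B):
    """V(rho) = lam (rho + 2 alpha/3)/(rho + alpha)^2 + A rho + B, as read from (5.15)."""
    return lam * (rho + Fraction(2, 3) * alpha) / (rho + alpha) ** 2 + A * rho + B

def cubic_limit_candidate(rho, ell, alpha):
    """(5.15) with lam = ell alpha^4, A = ell alpha^2 / 3, B = -2 ell alpha^3 / 3."""
    lam = ell * alpha**4
    A = Fraction(1, 3) * ell * alpha**2
    B = -Fraction(2, 3) * ell * alpha**3
    return general_potential(rho, lam, alpha, A, B)

rho, ell = Fraction(3, 2), Fraction(2)
cubic = ell * rho**3 / 3
# Deviation from the cubic potential shrinks roughly like 1/alpha.
deviations = [abs(cubic_limit_candidate(rho, ell, Fraction(10) ** k) - cubic) for k in (2, 4, 6)]
# Chaplygin case: alpha = A = B = 0 gives lam / rho exactly.
chaplygin = general_potential(rho, Fraction(5), Fraction(0), Fraction(0), Fraction(0))
```

Using `Fraction` avoids the cancellation between the large terms of order ℓα³ that would otherwise limit a floating-point check.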

The physical meaning of this general system is not clear.

We conclude the discussion on the one-dimensional nonrelativistic Chaplygin gas by presenting a set of moving solutions, which are the Galileo boosts of the static solutions (3.22), (3.23) in Sect. 3. These read

$$p = p(x - ut), \qquad \rho = \frac{\sqrt{2\lambda m}}{|p - mu|} \tag{5.19}$$

with p(x − ut) an arbitrary function of x − ut (provided it never equals mu). Clearly this is a constant-profile solution moving with a velocity u.

For the relativistic Born–Infeld system, the Hamiltonian density (3.29) is

$$\mathcal H = \sqrt{\rho^2c^2 + a^2}\,\sqrt{m^2c^2 + p^2}. \tag{5.20}$$


Relation (5.6) for the conserved integrals reads

$$m^2(\rho^2c^2 + a^2)^2\,\mathcal F_{\rho\rho} = a^2(m^2c^2 + p^2)^2\,\mathcal F_{pp}. \tag{5.21}$$

This can be solved by separation of variables. We prefer again, however, to define Riemann coordinates and give the solution in terms of arbitrary functions of one variable, just as in the nonrelativistic case. The relevant combinations here are

$$R_\pm = \phi_\rho \pm \phi_p, \qquad \phi_\rho = \arctan\frac{\rho c}{a}, \qquad \phi_p = \arctan\frac{p}{mc}, \tag{5.22}$$

and the equations of motion read, in terms of these,

$$\dot R_\pm = \mp c\,(\cos R_\mp)\,R'_\pm, \tag{5.23}$$

while the general solution to (5.21) is

$$\mathcal F = \frac{mca}{\cos\phi_\rho\cos\phi_p}\left[f(\phi_\rho + \phi_p) + g(\phi_\rho - \phi_p)\right]. \tag{5.24}$$
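The relativistic analog of the earlier check can be sketched in the same way: pick some test functions f (with g = 0), build the density of (5.24), and evaluate both sides of (5.21) by central differences. The parameter values, sample point, and test functions are assumptions made for the illustration.

```python
import math

M, C, A = 1.3, 2.0, 0.8  # illustrative m, c, a

def conserved_density(rho, p, f):
    """Density (5.24) with g = 0."""
    phi_rho = math.atan(rho * C / A)
    phi_p = math.atan(p / (M * C))
    return M * C * A * f(phi_rho + phi_p) / (math.cos(phi_rho) * math.cos(phi_p))

def second_derivative(func, x, h=1e-4):
    """Central-difference estimate of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2

def relation_521_residual(rho, p, f):
    """m^2 (rho^2 c^2 + a^2)^2 F_rhorho - a^2 (m^2 c^2 + p^2)^2 F_pp."""
    f_rr = second_derivative(lambda r: conserved_density(r, p, f), rho)
    f_pp = second_derivative(lambda q: conserved_density(rho, q, f), p)
    lhs = M**2 * (rho**2 * C**2 + A**2) ** 2 * f_rr
    rhs = A**2 * (M**2 * C**2 + p**2) ** 2 * f_pp
    return lhs - rhs

test_functions = [lambda z: 1.0, lambda z: z**3, math.sin]
residuals = [relation_521_residual(0.5, 0.9, f) for f in test_functions]
```

The constant test function corresponds to the Hamiltonian density (5.20) itself, so the first residual also confirms that (5.20) is consistent with (5.21).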

By choosing, as before, simple monomials for f and g we get the tower of conserved quantities

$$I_n^\pm = mca\int dx\,\frac{(\phi_\rho \pm \phi_p)^n}{\cos\phi_\rho\cos\phi_p}. \tag{5.25}$$

The Hamiltonian is included as $H = I_0^\pm$. The momentum, on the other hand, is an infinite series in the above integrals, requiring f and g to be exponential functions. We can give an alternative tower of complex conserved integrals involving only algebraic functions of p and ρ:

$$\tilde I_n^\pm = \int dx\left(\sqrt{\rho^2c^2 + a^2}\right)^{1-n}\left(\sqrt{m^2c^2 + p^2}\right)^{1-n}(\rho c \pm ia)^n\,(p + imc)^n. \tag{5.26}$$

Then the Hamiltonian is $H = \tilde I_0^\pm$ while the total momentum, total particle number and integral of the local momentum are contained in the real and imaginary parts of $\tilde I_1^+$ and $\tilde I_1^-$ (as well as a trivial constant). It can be checked that the integrals (5.26) go over to the nonrelativistic ones (5.10) in the limit c → ∞ upon proper rescaling.

In the relativistic model ρ need not be constrained to be positive (negative ρ could be interpreted as antiparticle density). The transformation p → −p, ρ → −ρ is a symmetry and can be interpreted as charge conjugation. Further, p and ρ appear in an equivalent way. As a result, this theory enjoys a duality transformation:

$$\rho\to\pm\frac{a}{mc^2}\,p, \qquad p\to\pm\frac{mc^2}{a}\,\rho. \tag{5.27}$$
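A minimal numerical illustration of the duality, reading (5.27) as ρ → ±(a/mc²) p, p → ±(mc²/a) ρ: the Hamiltonian density (5.20) should be unchanged, applying the map twice should return the starting point, and a self-dual configuration should be a fixed point. Parameter values and helper names are assumptions of this sketch.

```python
import math

M, C, A = 1.3, 2.0, 0.8  # illustrative m, c, a

def hamiltonian_density(rho, p):
    """Density (5.20)."""
    return math.sqrt(rho**2 * C**2 + A**2) * math.sqrt(M**2 * C**2 + p**2)

def dual(rho, p, sign=+1):
    """Duality map (5.27): returns the transformed pair (rho, p)."""
    new_rho = sign * A / (M * C**2) * p
    new_p = sign * M * C**2 / A * rho
    return new_rho, new_p

rho0, p0 = 0.6, 1.7
rho1, p1 = dual(rho0, p0)
h_before = hamiltonian_density(rho0, p0)
h_after = hamiltonian_density(rho1, p1)
rho2, p2 = dual(rho1, p1)  # applying the map twice

# A self-dual point, rho = (a / (m c^2)) p, is left unchanged by the + map.
p_sd = 0.9
rho_sd = A / (M * C**2) * p_sd
fixed = dual(rho_sd, p_sd)
```

The same invariance holds for the lower sign, since the density depends only on ρ² and p².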

Under the above, both the canonical structure and the Hamiltonian remain invariant. Solutions are mapped in general to new solutions. Note that the nonrelativistic limit is mapped to the ultra-relativistic one under duality. Self-dual solutions $\rho = \pm\frac{a}{mc^2}\,p$ satisfy

$$\dot\rho = \mp c\,\rho' \tag{5.28}$$


and are, therefore, the chiral relativistic solutions that were presented at the end of Sect. 3. In the self-dual case, when p is eliminated from the canonical 1-form and from the Hamiltonian, one arrives at an action for ρ, which coincides (apart from irrelevant constants) with the self-dual action, constructed some time ago [18]:

$$\left[\frac12\int dx\,dy\,\dot\rho(x)\,\varepsilon(x-y)\,p(y)\,dt - \int dx\,\sqrt{\rho^2c^2 + a^2}\,\sqrt{m^2c^2 + p^2}\,dt\right]_{p = \frac{mc^2}{a}\rho} = \frac{2mc^2}{a}\left\{\frac14\int dx\,dy\,\dot\rho(x)\,\varepsilon(x-y)\,\rho(y)\,dt - \frac c2\int dx\left(\rho^2(x) + \frac{a^2}{c^2}\right)dt\right\}. \tag{5.29}$$

A set of constant-profile solutions can be found by Lorentz-boosting the static solution (3.48). Their most general form is

$$p = p(x - ut), \qquad \rho = \frac ac\,\frac{up \pm mc\sqrt{c^2 - u^2}}{p\sqrt{c^2 - u^2} \mp mcu} \tag{5.30}$$

with p(x − ut), again, a general function. Note that the two choices of sign in the above formula are related by charge conjugation. In the extreme relativistic case u → c this solution goes over to the chiral relativistic solution (3.51). The set of these solutions is closed under duality transformations.

We shall conclude by presenting an explicit mapping between the relativistic and the nonrelativistic theories in one dimension, which demonstrates their kinematical equivalence. Note that the relativistic equations of motion (5.23) become, in terms of $\cos R_\pm$,

$$\partial_t\cos R_\pm = \mp c\,(\cos R_\mp)\,\partial_x\cos R_\pm. \tag{5.31}$$

These are essentially identical to the equations of motion for the nonrelativistic Riemann coordinates (5.12). In fact, putting

$$\bar R_\pm = \pm mc\cos R_\mp, \tag{5.32}$$

we see that the $\bar R_\pm$ obey the nonrelativistic equations (5.12). Expressing $\bar R_\pm$ and $R_\pm$ in terms of the corresponding nonrelativistic and relativistic variables produces a mapping between the two sets. Call $p_{NR}$, $\rho_{NR}$ and $p_R$, $\rho_R$ the local momentum and density of the nonrelativistic and the relativistic theory, respectively. Then the mapping is

$$\rho_{NR} = \frac{\sqrt{2\lambda m}}{a\,m^2c^2}\,\mathcal H_R, \qquad p_{NR} = mc^2\,\frac{\rho_R\,p_R}{\mathcal H_R}, \tag{5.33}$$

where $\mathcal H_R = \sqrt{\rho_R^2c^2 + a^2}\,\sqrt{m^2c^2 + p_R^2}$ is the relativistic Hamiltonian density. As can be checked by direct algebra, this maps the relativistic equations of motion to the nonrelativistic ones. Since the combinations of $p_R$ and $\rho_R$ that appear in (5.33) are duality-invariant, the mapping of solutions is two-to-one. Note that the constant-profile relativistic solutions (5.30) are mapped to the corresponding nonrelativistic ones (5.19).

We stress that the transformation (5.33) is not canonical, since it does not preserve the Poisson brackets. Accordingly, it does not map the relativistic Hamiltonian into the nonrelativistic one. This is a manifestation of the bi-Hamiltonian structure of these systems, since there are now two pairs of Hamiltonian and canonical structure (the standard nonrelativistic one and the one obtained through this mapping) that lead to the same nonrelativistic equations of motion.


We note that a similar mapping between the nonrelativistic system and the relativistic one in light-cone coordinates was presented previously by Verosky [19]. The mapping (5.33), then, can be considered as the analog of Verosky’s transformation for the Lorentz system, although it cannot be obtained from it in any straightforward way.

The existence of an infinite set of constants of motion for both the nonrelativistic Chaplygin gas, (5.9) and (5.10), as well as for the relativistic Born–Infeld model, (5.24)–(5.26), signals the complete integrability of these theories. The actual integration of the equations of motion can be carried out only indirectly. First of all, since both these d = 1 models descend, via alternate parameterization choices, from the Nambu–Goto string (1-brane) on the plane (2-space), the explicit integration of the latter [4] allows presenting solutions of the former two in terms of two arbitrary functions. Alternatively, the Chaplygin gas equations can be combined, after a Legendre transformation, into a linear, second-order partial differential equation [9], whose general solution, in terms of two arbitrary functions, is known explicitly [20]. The Born–Infeld solution can then be constructed with the help of transformations described in Sects. 4 and 5. Also, the Born–Infeld equations have been solved with a hodograph transformation without reference to the Chaplygin gas [21].

References

1. Goldstone, J.: Unpublished
2. Bordemann, M. and Hoppe, J.: Phys. Lett. B317, 315 (1993); B325, 359 (1994)
3. Nutku, Y.: J. Math. Phys. 28, 2579 (1987); Olver, P. and Nutku, Y.: J. Math. Phys. 29, 1610 (1988); Arik, M., Neyzi, F., Nutku, Y., Olver, P. and Verosky, J.: J. Math. Phys. 30, 1338 (1989)
4. See e.g. Bazeia, D.: Phys. Rev. D 59, 085007 (1999)
5. Brunelli, J. and Das, A.: Phys. Lett. A235, 597 (1997)
6. Eckart, C.: Phys. Rev. 54, 920 (1938); Recent work: Schakel, A.: Mod. Phys. Lett. B10, 999 (1996); Ogawa, N.: Preprint, hep-th/9801115
7. Faddeev, L.D. and Jackiw, R.: Phys. Rev. Lett. 60, 1692 (1988)
8. In conventional treatments of relativistic hydrodynamics, as in Weinberg, S.: Gravitation and Cosmology. New York: Wiley, 1972, ρ is a scalar rather than the time-component of a Lorentz vector. This distinction makes no difference in the noninteracting case, but our choice allows introducing self-interactions in a Lorentz-invariant manner.
9. Landau, L. and Lifschitz, E.: Fluid Mechanics. 2nd ed. Oxford, UK: Pergamon, 1987
10. Madelung, E.: Z. Phys. 40, 322 (1926); Recent work: Merzbacher, E.: Quantum Mechanics. 3rd ed. New York: Wiley, 1998
11. Bazeia, D. and Jackiw, R.: Ann. Phys. (NY) 270, 246 (1998); Jackiw, R. and Polychronakos, A.: Faddeev Festschrift. Preprint, hep-th/9809123
12. Jevicki, A.: Phys. Rev. D 57, 5955 (1998)
13. Bordemann and Hoppe: Second paper in Ref. [2].
14. This result was established with a hodographic transformation for d = 2 by Bordemann and Hoppe (first paper in Ref. [2]) and for arbitrary d by Jackiw and Polychronakos, Ref. [11]; see also Bazeia, Ref. [4]. The present, simple approach was shown to us by Hoppe; see also Hoppe, J.: Phys. Lett. B329, 10 (1994)
15. This result was established with a hodographic transformation for d = 2 by Bordemann and Hoppe (second paper in Ref. [2]); see also Hoppe, Ref. [14]
16. Nutku, Olver and Nutku, Ref. [3]
17. Jevicki, A. and Sakita, B.: Phys. Rev. D 22, 467 (1980) and Nucl. Phys. B165, 511 (1980); Polchinski, J.: Nucl. Phys. B362, 25 (1991); Avan, J. and Jevicki, A.: Phys. Lett. B266, 35 (1991) and B272, 17 (1991)
18. Floreanini, R. and Jackiw, R.: Phys. Rev. Lett. 59, 1873 (1987)
19. As cited in Olver and Nutku, Ref. [3]
20. Jackiw and Polychronakos, Ref. [11]
21. Barbishov, B. and Chernikov, N.: Zh. Eksp. Teor. Fiz. 51, 658 (1966) [Eng. trans.: Sov. Phys. JETP 24, 437 (1967)]

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 207, 131 – 143 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

The Principal Eigenvalue of a Conformally Invariant Differential Operator, with an Application to Semilinear Elliptic PDE

Matthew J. Gursky*

Department of Mathematics, Indiana University, Bloomington, IN 47405-5701, USA. E-mail: [email protected]

Received: 1 November 1998 / Accepted: 16 April 1999

* Supported in part by NSF Grant DMS-9801046 and an Alfred P. Sloan Research Fellowship

Abstract: In this paper we give conditions under which the Paneitz operator, a conformally invariant fourth order differential operator, is positive. We also prove a sharp estimate for a related conformal invariant. These results are combined to study the existence of a semilinear PDE with critical Sobolev growth.

Introduction

Our aim in this paper is to study an important fourth order differential operator along with a related conformal invariant from four-dimensional Riemannian geometry. The operator and associated invariant have arisen in the study of semi-linear elliptic PDEs with critical Sobolev growth ([Li,CY1,CY2,WX]), the zeta-functional determinant ([BO,CY1,BCY]), vanishing theorems for the first Betti number of a compact four-manifold of positive scalar curvature ([Gu]), and in a gauge choice for Maxwell’s equations ([ES1,Pa]).

Let us begin by defining the objects of our study. Let $(M^4, g)$ be a smooth compact four-dimensional Riemannian manifold. We let $Rm_g$ denote the Riemannian curvature tensor of g, and drop the subscript whenever there is no possibility of confusion. We also let W, Ric, and R denote the various components of the curvature tensor: Weyl, Ricci, and scalar. Now define

$$Q = Q_g = \frac{1}{12}\left(-\Delta R - 3|Ric|^2 + R^2\right), \tag{0.1}$$

where Δ denotes the Laplace–Beltrami operator with respect to g (in our case a negative operator). Our first observation is that the quantity

$$\kappa_g = \int Q_g\,d\mu_g, \tag{0.2}$$


where $d\mu_g$ is the Riemannian measure, is a conformal invariant. There are (at least) two ways of seeing this. First, by the Chern–Gauss–Bonnet formula ([Be, (6.31)]),

$$4\pi^2\chi(M^4) = \int |W|^2\,d\mu + \int Q\,d\mu, \tag{0.3}$$

where $\chi(M^4)$ is the Euler characteristic of $M^4$. Since the integral of the norm squared of the Weyl tensor is conformally invariant in four dimensions, we see that (0.2) must be also. Indeed, (0.3) indicates the analogy between the quantity Q of a four-manifold and the Gauss curvature of a surface – a theme which we will repeatedly encounter in this paper. In the latter case, the Gauss–Bonnet formula is

$$2\pi\chi(\Sigma^2) = \int K\,dA \tag{0.4}$$

so that the Euler characteristic is given by a conformally invariant integral. The corresponding formula in four dimensions (0.3) shows that the Euler characteristic of a four-manifold is also a sum of conformally invariant integrals, one of which (the Weyl term) measures whether the manifold is locally conformally flat. Since every surface is locally conformally flat this latter term is absent in (0.4).

Another way to see that (0.2) is conformally invariant is by writing down the formula for the transformation of Q under a conformal change of metric. In doing this we also introduce the aforementioned differential operator, called the Paneitz operator, which will be a focus of this paper. Let $d : \Lambda^0(M^4) = C^\infty(M^4)\to\Lambda^1(M^4)$ denote the exterior derivative, and $\delta : \Lambda^1(M^4)\to\Lambda^0(M^4)$ its $L^2$-formal adjoint. We define $P = P_g$ by

$$P = \Delta^2 + \delta\left(\tfrac23 R\,g - 2\,Ric\right)d. \tag{0.5}$$

P is clearly fourth order, and turns out to be conformally invariant ([Pa]). More precisely, if $\tilde g = e^{2w}g$ then $P_{\tilde g} = e^{-4w}P_g$. Having defined P we can now give the transformation law for Q: if $\tilde g = e^{2w}g$ then

$$P_g w + 2Q_{\tilde g}\,e^{4w} = 2Q_g. \tag{0.6}$$

From (0.6) it is again easy to see that the integral of Q is a conformal invariant: just use the divergence theorem along with the fact that $d\mu_{\tilde g} = e^{4w}d\mu_g$. Returning to our analogy with the theory of surfaces, we see that P is, from the point of view of conformal geometry, the proper generalization of the Laplace–Beltrami operator to four dimensions. For, in the case of surfaces if $\tilde g = e^{2w}g$ then $\Delta_{\tilde g} = e^{-2w}\Delta_g$; i.e., the Laplace–Beltrami operator is also conformally invariant. The two-dimensional counterpart to (0.6) is the Gauss curvature equation:

$$\Delta w + K_{\tilde g}\,e^{2w} = K_g. \tag{0.7}$$
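The divergence-theorem argument for the conformal invariance of the integral of Q can be spelled out in a few lines. The following is only a sketch of that step; it uses (0.6), the relation $d\mu_{\tilde g} = e^{4w}d\mu_g$, and the observation that, because (0.5) has the form $\Delta^2 + \delta(\cdots)d$, the integral of $P_g w$ over the compact manifold vanishes.

```latex
\begin{align*}
\kappa_{\tilde g} = \int Q_{\tilde g}\,d\mu_{\tilde g}
  &= \int Q_{\tilde g}\,e^{4w}\,d\mu_g \\
  &= \int \Bigl(Q_g - \tfrac12 P_g w\Bigr)\,d\mu_g \\
  &= \kappa_g - \tfrac12\int P_g w\,d\mu_g \;=\; \kappa_g .
\end{align*}
```

The same computation applied to (0.7) recovers the conformal invariance of the total Gauss curvature in two dimensions.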

There is by now a vast literature on (0.7), but the study of (0.6) is still in many respects quite preliminary. For example, given a Riemannian manifold (M 4 , g), under what conditions is there a conformal metric g˜ = e2w g with Qg˜ identically constant? The analogous question for (0.7) is the famous uniformization theorem, and there are many approaches one can take to resolve this. In four dimensions the problem was undertaken


by Chang and Yang ([CY1]) using the direct variational approach. That is, they studied the convergence of minimizers of the functional

$$F[w] = \int wPw\,d\mu + 4\int Qw\,d\mu - \left(\int Q\,d\mu\right)\log\fint e^{4w}\,d\mu \tag{0.8}$$

(here $\fint = \left(\int d\mu\right)^{-1}\int$) whose critical points satisfy (0.6) with $Q_{\tilde g} = \kappa_g$. In the

course of their work they discovered two things. First, if the principal eigenvalue $\lambda_1(P)$ is allowed to be negative, then a minimizing sequence for F can diverge.¹ For example, it is easy to see that some hyperbolic metrics have constant Q but are saddle points of F. Second, to establish the compactness of a minimizing sequence the value of $\kappa_g$ as defined in (0.2) is crucial. More precisely, they showed that when $\lambda_1(P) = 0$ with Ker P = {constants}, $\kappa_g < 8\pi^2$, then a minimizing sequence always converges. The condition $\kappa_g < 8\pi^2$ is completely analogous to the compactness criterion involved in the Yamabe problem ([LP]), where one requires that the conformally invariant Sobolev constant be less than its value on the sphere (and equal only in the case of the sphere). Indeed, $\kappa_{\bar g} = 8\pi^2$ when $\bar g$ is the round metric. The sharp inequality appearing in the work of Chang–Yang is the Moser–Trudinger type inequality due to Adams [Ad].

¹ Since P annihilates constants, $\lambda_1(P)$ is always non-positive.

In this context we can now state the two main results of this paper. The first gives conditions under which $\lambda_1(P) = 0$. Since P is a conformally invariant operator, our conditions should be conformally invariant. Now in four dimensions there are two natural (signed) conformally invariant quantities to consider; $\kappa_g$ and the Yamabe invariant:

$$Y(g) = \inf_{\tilde g = e^{2w}g}\ \frac{\int R_{\tilde g}\,d\mu_{\tilde g}}{\left(\int d\mu_{\tilde g}\right)^{1/2}}.$$

We then have:

Theorem A. Let $(M^4, g)$ be a smooth compact four-dimensional Riemannian manifold. If $\kappa_g \ge 0$ and $Y(g) \ge 0$ then $\lambda_1(P) = 0$ and Ker P = {constants}.

Remarks.
1. In [ES2] another criterion for $\lambda_1(P) = 0$ is given. However, this requires checking in a pointwise fashion the eigenvalues of the Ricci tensor. In particular, the condition is not conformally invariant.
2. There are hyperbolic manifolds (necessarily with Y(g) < 0) for which $\lambda_1(P) < 0$. So the assumption of non-negativity of the Yamabe invariant cannot be relaxed. On the other hand, in [ES2] examples are constructed of manifolds with Y(g) > 0, $\kappa_g < 0$, and $\lambda_1(P) = 0$.

Our next result involves estimating $\kappa_g$ when $Y(g) \ge 0$.

Theorem B. Let $(M^4, g)$ be a smooth compact four-dimensional Riemannian manifold. If $Y(g) \ge 0$ then $\kappa_g \le 8\pi^2$. Moreover, $Y(g) \ge 0$ and $\kappa_g = 8\pi^2$ if and only if $(M^4, g)$ is conformally equivalent to the round sphere.

Remark. Hyperbolic manifolds provide examples with Y(g) < 0 but $\kappa_g \gg 8\pi^2$.

From Theorems A and B along with the work of Chang and Yang we have


Corollary 0.1. Let $(M^4, g)$ be a smooth compact four-dimensional Riemannian manifold with $\kappa_g \ge 0$ and $Y(g) \ge 0$. Then there exists a metric $\tilde g = e^{2w} g$ with $Q_{\tilde g}$ constant.

In [CY1] Chang and Yang also studied extremals of the zeta-functional determinant. We will describe this work in some detail in Sect. 2. For now we only point out that the above estimate for $\kappa_g$ leads to a much simpler existence criterion for extremals than those stated in [CY1]. For example, if $L$ denotes the conformal Laplacian and $\not\nabla^2$ the spin Laplacian, then

Corollary 0.2. Let $(M^4, g)$ be a smooth compact four-dimensional Riemannian (spin) manifold with $Y(g) \ge 0$. Then an extremal for $\log \det L$ ($\log \det \not\nabla^2$) exists in the conformal class of $g$.

Since our criterion for the existence of solutions to (0.6) depends on checking the sign of two conformal invariants, it seems reasonable to ask whether there are any obstructions to both invariants being simultaneously positive. Recent work in [Gu] on the Weyl functional provides necessary conditions involving the first deRham cohomology group. Before we state the result more precisely, let us make a simple observation about the case when $Y(g) = 0$.

Lemma (See [Gu, Proposition E]). If $Y(g) = 0$ then $\kappa_g \le 0$. Furthermore, $\kappa_g = 0$ if and only if $g$ is conformal to a Ricci-flat metric.

This lemma neatly summarizes the zero case and allows us to restrict our attention to the case where $Y(g) > 0$ (see Sect. 3 for details). Let $H^1(M^4)$ denote the first deRham cohomology group. Assuming $M^4$ is oriented we let $\sigma(M^4)$ denote the signature of $M^4$.

Theorem C. Let $(M^4, g)$ be a smooth compact oriented four-dimensional Riemannian manifold with $\kappa_g \ge 0$. (i) Then $\chi(M^4) \ge \frac{3}{2} |\sigma(M^4)|$ with equality if and only if $(M^4, g)$ is either self-dual or anti-self-dual. (ii) If $Y(g) > 0$ then $H^1(M^4) = 0$ unless $(M^4, g)$ is conformally equivalent to a quotient of $S^3 \times \mathbb{R}$ endowed with the product metric, in which case $\kappa_g = 0$.

Remark. Recall that $\chi(M^4) \ge \frac{3}{2} |\sigma(M^4)|$ is a necessary condition for $M^4$ to admit an Einstein metric ([Hi, Th]). Moreover, all Einstein metrics have $\kappa_g \ge 0$.

As a concrete application of Theorem C we have:

Corollary 0.3. (i) For $m \ge 9$ the manifold $\mathbb{CP}^2 \# m(-\mathbb{CP}^2)$ does not admit a metric with $\kappa_g \ge 0$. (ii) Let $N^4$ be a smooth compact oriented four-manifold. Then for any $m \ge 1$ the manifold $M^4 = N^4 \# m(S^3 \times S^1)$ does not admit a metric with $Y(g) > 0$ and $\kappa_g > 0$.

Remark. If $3 \le m \le 8$ the manifold $\mathbb{CP}^2 \# m(-\mathbb{CP}^2)$ admits a Kähler–Einstein metric ([Ti]). These metrics obviously have constant $Q$.
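The topological side of Corollary 0.3(i) is plain arithmetic and can be checked directly. The sketch below assumes the standard values $\chi(\mathbb{CP}^2) = 3$, $\sigma(\mathbb{CP}^2) = 1$, the connected-sum rule $\chi(M \# N) = \chi(M) + \chi(N) - 2$, and additivity of the signature; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def invariants_cp2_blowup(m):
    """Euler characteristic and signature of CP^2 # m(-CP^2).

    Assumes chi(CP^2) = 3, sigma(CP^2) = 1, sigma(-CP^2) = -1, and that each
    connected sum lowers the total Euler characteristic by 2.
    """
    chi = 3 + m * (3 - 2)
    sigma = 1 - m
    return chi, sigma


def satisfies_inequality(m):
    """True when chi >= (3/2)|sigma|, the necessary condition in Theorem C(i)."""
    chi, sigma = invariants_cp2_blowup(m)
    return chi >= Fraction(3, 2) * abs(sigma)


def is_equality_case(m):
    """True when chi == (3/2)|sigma| exactly."""
    chi, sigma = invariants_cp2_blowup(m)
    return chi == Fraction(3, 2) * abs(sigma)


if __name__ == "__main__":
    for m in range(0, 12):
        chi, sigma = invariants_cp2_blowup(m)
        print(m, chi, sigma, satisfies_inequality(m), is_equality_case(m))
```

Under these assumptions the inequality holds strictly for $m \le 8$, is an equality at $m = 9$, and fails for $m \ge 10$. The borderline case $m = 9$ is therefore not excluded by the inequality alone; by Theorem C(i) it would force the metric to be self-dual or anti-self-dual, and ruling that out needs an argument not sketched here.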


We conclude the introduction with a few words about the proofs of Theorems A–C. For Theorem A our argument relies on some of the existence work for the functional determinant in [CY1], along with subsequent regularity work in [CGY] and a maximum principle result in [Gu]. Thus the argument is rather involved, and relies on a fair amount of existing technical work. The proof of Theorem B is conceptually easier. Indeed, if we are willing to cite Schoen's work on the Yamabe problem, it is practically automatic. However, we will use a sharp result of Cheng [Ch] from comparison geometry to bypass Schoen's work on the positive mass theorem. Our ability to do this reveals a curious feature of this subject which belies all of the impressive analogies with the Yamabe problem: namely, the case of positive Yamabe invariant is much easier here, while the negative case is poorly understood. Theorem C and its corollary rely on the results in [Gu]. This is explained in Sect. 4.

A final note on the organization of the paper: for technical reasons it will simplify matters if we first prove Theorem B. This is done in Sect. 1. In Sect. 2 we give some technical background to the proof of Theorem A, and give the proof proper in Sect. 3. Finally, in Sect. 4 we give the proof of Theorem C and Corollary 0.3.

1. The Proof of Theorem B

Let us begin by recalling some basic facts about the Yamabe problem; details may be found in the survey article of Lee and Parker [LP] or in the monograph of Schoen [Sc1]. Since we are only interested in four dimensions we will specialize our account to this case. For $p \in [2, 4]$ define the functional

$$Y_p[u] = \frac{\int (6|\nabla u|^2 + R u^2)\, d\mu}{\left( \int |u|^p\, d\mu \right)^{2/p}}.$$

It follows from the Rellich compactness theorem that for each $p \in [2, 4)$ there exists a smooth minimizer $u_p > 0$ with

$$Y_p[u_p] = \mu_p \equiv \inf_{u \in W^{1,2}} Y_p[u].$$

If we normalize so that $\|u_p\|_p = 1$ then $u_p$ satisfies the Euler equation

$$6\Delta u_p + \mu_p u_p^{p-1} = R u_p.$$

Since $\mu_p$ is continuous from the left and non-decreasing [Au], we may choose a sequence $p_k \nearrow 4$ such that $\mu_k = \mu_{p_k} \nearrow Y(g)$. Let $u_k = u_{p_k}$ and $g_k = u_k^2 g$. The scalar curvature $R_k$ of $g_k$ is then given by

$$R_k = u_k^{-3} \left[ -6\Delta u_k + R u_k \right] = \mu_k u_k^{p_k - 4}. \tag{1.1}$$

If we let $E_k$ denote the trace-free Ricci tensor of $g_k$, then as $\kappa_g$ is a conformal invariant we have

$$\kappa_g = \int \left( -\frac{1}{4} |E_k|^2 + \frac{1}{48} R_k^2 \right) d\mu_{g_k}. \tag{1.2}$$


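Using (1.1) along with the fact that $d\mu_{g_k} = u_k^4\, d\mu$ we have

$$\begin{aligned}
\kappa_g &= -\frac{1}{4} \int |E_k|^2\, d\mu_{g_k} + \frac{1}{48} \int R_k^2\, d\mu_{g_k} \\
&\le \frac{1}{48}\, \mu_k^2 \int u_k^{2p_k - 4}\, d\mu \\
&\le \frac{1}{48}\, \mu_k^2 \left( \int u_k^{p_k}\, d\mu \right)^{\frac{2(p_k - 2)}{p_k}} \left( \int d\mu \right)^{\frac{4 - p_k}{p_k}}.
\end{aligned} \tag{1.3}$$

Taking the limit as $k \to \infty$ in (1.3) we get

$$\kappa_g \le \frac{1}{48} (Y(g))^2. \tag{1.4}$$

By the energy estimate of Aubin [Au], $Y(g) \le Y(S^4) = 8\sqrt{6}\,\pi$. Thus if $Y(g) \ge 0$ from (1.4) we conclude $\kappa_g \le 8\pi^2$. Now suppose $\kappa_g = 8\pi^2$. If we do not omit the term involving the trace-free Ricci tensor in (1.3) then we get

$$8\pi^2 + \frac{1}{4} \int |E_k|^2\, d\mu_{g_k} \le \frac{1}{48}\, \mu_k^2 \left( \int u_k^{p_k}\, d\mu \right)^{\frac{2(p_k - 2)}{p_k}} \left( \int d\mu \right)^{\frac{4 - p_k}{p_k}}. \tag{1.5}$$

Taking the limit as $k \to \infty$ in (1.5) and using Aubin's estimate once again we conclude

$$\lim_{k \to \infty} \int |E_k|^2\, d\mu_{g_k} = 0. \tag{1.6}$$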
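The passage from (1.5) to (1.6) can be spelled out as follows. This is a sketch of the limiting step, using only the normalization $\|u_k\|_{p_k} = 1$, the convergence $\mu_k \to Y(g)$ with $p_k \to 4$, and Aubin's bound together with $Y(g) \ge 0$:

```latex
\begin{align*}
8\pi^2 + \frac14 \limsup_{k\to\infty} \int |E_k|^2\, d\mu_{g_k}
 &\le \lim_{k\to\infty} \frac{1}{48}\,\mu_k^2 \Big(\int d\mu\Big)^{(4-p_k)/p_k}
   && \text{by (1.5) and } \|u_k\|_{p_k} = 1 \\
 &= \frac{1}{48}\, Y(g)^2
   && \text{since } \mu_k \to Y(g),\ p_k \to 4 \\
 &\le \frac{1}{48}\,\big(8\sqrt{6}\,\pi\big)^2 = 8\pi^2
   && \text{since } 0 \le Y(g) \le Y(S^4).
\end{align*}
```

Subtracting $8\pi^2$ from both ends leaves a non-negative quantity bounded above by zero, which is (1.6).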

There are now two possibilities to consider, depending on whether the sequence $\{u_k\}$ is precompact or not. First let us assume that

$$\lim_{k \to \infty} \int u_k^2\, d\mu = 0.$$

In this case, we can find a subsequence (still denoted $\{u_k\}$) and points $\{P_1, \dots, P_m\} \subset M^4$ such that $u_k \to 0$ in $C^\infty$ on compact sets $K \subset M^4 \setminus \{P_1, \dots, P_m\}$ (see [Sc1] for details). Choose another point $x_0 \notin \{P_1, \dots, P_m\}$ and let $v_k = u_k / u_k(x_0)$. By the Harnack inequality $v_k \to v > 0$ in $C^\infty$ on compact $K \subset M^4 \setminus \{P_1, \dots, P_m\}$, and $v$ satisfies

$$-6\Delta v + Rv = 0. \tag{1.7}$$

Since we may assume that $R > 0$ it follows that $v$ is singular at some of the $P_i$'s, say $P_1, \dots, P_{m_0}$; i.e., $v$ is a sum of Green's functions for the conformal Laplacian $L = -6\Delta + R$. If we consider the sequence of metrics $h_k = v_k^2 g$, then by the scale-invariance of the $L^2$-norm of the Ricci tensor we have

$$\int |E_{h_k}|^2\, d\mu_{h_k} = \int |E_k|^2\, d\mu_{g_k}.$$

Thus by (1.6) we see that $E_{h_k} \to 0$ on compact $K \subset M^4 \setminus \{P_1, \dots, P_{m_0}\}$, and conclude that $h = v^2 g$ is Einstein. By (1.7) $h$ is also scalar-flat (because $R_h = v^{-3}(-6\Delta v + Rv) = 0$), so in fact $h$ is Ricci-flat. It then follows from a result of Schoen ([Sc2]) that $(M^4, g)$ is conformally equivalent to the round sphere.

On the other hand, suppose

$$\lim_{k \to \infty} \int u_k^2\, d\mu > 0.$$

It is then standard that $\{u_k\}$ contains a subsequence which converges to a non-zero smooth solution of the Yamabe problem; call it $v$. Just as before we can argue that the metric $h = v^2 g$ is in fact Einstein. Moreover,

$$8\pi^2 = \frac{1}{48} \int R_h^2\, d\mu_h. \tag{1.8}$$

Let us normalize $h$ so that $\operatorname{Ric}_h = 3h$; then $R_h = 12$ and by (1.8) $\operatorname{vol}(M^4, h) = \frac{8}{3}\pi^2$. Let $D$ denote the diameter of $(M^4, h)$. By Myers' theorem, $D \le \pi$. Also, by the Bishop comparison theorem,

$$\frac{8}{3}\pi^2 = \operatorname{vol}(M^4, h) \le \operatorname{vol}(S^3) \int_0^D \sin^3 r\, dr = 2\pi^2 \left( \frac{1}{3} \cos^3 D - \cos D + \frac{2}{3} \right),$$

so that the diameter satisfies

$$\frac{2}{3} \le \frac{1}{3} \cos^3 D - \cos D.$$

It is easy to verify that the function $f(x) = \frac{1}{3} \cos^3 x - \cos x$ attains its maximum value of $2/3$ on the interval $[0, \pi]$ at $x = \pi$; thus $D = \pi$. It then follows from a result of Cheng [Ch] that $(M^4, h)$ is isometric to the round sphere, so $(M^4, g)$ is conformally equivalent to it.

2. Technical Preliminaries to Theorem A

In this section we provide some necessary background material for the proof of Theorem A, beginning with the work of [CY1] on the functional determinant. Let $(M^4, g_0)$ be a smooth compact four-dimensional Riemannian manifold. We then define the functionals

$$I[w] = 4 \int |W_0|^2 w\, d\mu_0 - \left( \int |W_0|^2\, d\mu_0 \right) \log \int e^{4w}\, d\mu_0, \tag{2.1}$$

$$II[w] = \int w P_0 w\, d\mu_0 + 4 \int Q_0 w\, d\mu_0 - \left( \int Q_0\, d\mu_0 \right) \log \int e^{4w}\, d\mu_0, \tag{2.2}$$

$$III[w] = 12 \int \left( \Delta_0 w + |\nabla_0 w|^2 \right)^2 d\mu_0 - 4 \int R_0 |\nabla_0 w|^2\, d\mu_0 - 4 \int \Delta_0 R_0\, w\, d\mu_0, \tag{2.3}$$

and linear combinations of the above:

$$\Phi[w] = \gamma_1 I[w] + \gamma_2 II[w] + \gamma_3 III[w]. \tag{2.4}$$


In (2.1)–(2.4) terms have the subscript 0 to indicate that they are computed with respect to $g_0$. Also, notice that the functional $II$ is just the functional $F$ we defined in (0.8). The "sub-functionals" $I$–$III$ arise in the calculation of the functional determinant of a conformally covariant operator $A$ due to Branson and Ørsted [BO]. For example, if $A = L = -6\Delta + R$, the conformal Laplacian, then

$$\log \frac{\det L_g}{\det L_0} = I[w] - 4\, II[w] - \frac{2}{3}\, III[w],$$

where $g = e^{2w} g_0$. If $M^4$ is spin and $A = \not\nabla^2$, then

$$\log \frac{\det \not\nabla_g^2}{\det \not\nabla_0^2} = 7\, I[w] - 88\, II[w] - \frac{14}{3}\, III[w].$$

In general, any operator $A$ which satisfies certain "conformal" and "naturality" assumptions defined in [BO] has $\log(\det A_g / \det A_0) = \gamma_1 I[w] + \gamma_2 II[w] + \gamma_3 III[w]$ for some choice of $\gamma_i = \gamma_i(A)$. The main existence result of [CY1] for extremals of the functional determinant can be stated as follows:

Theorem 2.1 ([CY1, Theorem 1.1]). Let $\Phi$ be defined as in (2.4). If $\gamma_2 < 0$, $\gamma_3 < 0$, and

$$\kappa_{g_0} < -\frac{\gamma_1}{\gamma_2} \int |W_0|^2\, d\mu_0 + 8\pi^2,$$

then $\sup_{w \in W^{2,2}} \Phi[w]$ is attained by some $w \in W^{2,2}$. Moreover, the extremal metric $g = e^{2w} g_0$ satisfies the Euler equation

$$\gamma_1 |W|^2 + \gamma_2 Q - \gamma_3 \Delta R = \text{constant}. \tag{2.5}$$

Remarks. 1. It follows from [CGY] that extremal metrics are smooth. 2. In (2.5) the curvature terms and Laplace–Beltrami operator are with respect to the extremal metric. 3. Using Theorem B we can now give the proof of Corollary 0.2. As we observed above, when $A$ is the conformal Laplacian then $(\gamma_1, \gamma_2, \gamma_3) = (1, -4, -\frac{2}{3})$. Thus existence will follow if we can show that

$$\kappa_{g_0} < \frac{1}{4} \int |W_0|^2\, d\mu_0 + 8\pi^2. \tag{2.6}$$

By Theorem B, (2.6) holds unless $(M^4, g_0)$ is conformally equivalent to the round sphere. However, this case was treated separately in [BCY]. If $M^4$ is spin and $A = \not\nabla^2$ then $(\gamma_1, \gamma_2, \gamma_3) = (7, -88, -\frac{14}{3})$, so existence follows if we can show

$$\kappa_{g_0} < \frac{7}{88} \int |W_0|^2\, d\mu_0 + 8\pi^2. \tag{2.7}$$

Again, (2.7) is satisfied except in the case of the round sphere, and this was treated in [BCY] as well.


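For our purposes it will be helpful to rewrite the Euler equation (2.5). Let $E$ denote the trace-free Ricci tensor; then

$$\Delta R = \lambda - \alpha |W|^2 - \beta |E|^2 + \frac{1}{12} \beta R^2 \tag{2.8}$$

with

$$\alpha = \gamma_1 \left( \gamma_3 + \frac{1}{12} \gamma_2 \right)^{-1}, \tag{2.9}$$

$$\beta = \frac{1}{4} \gamma_2 \left( \gamma_3 + \frac{1}{12} \gamma_2 \right)^{-1}, \tag{2.10}$$

$$\lambda = \left( \int d\mu \right)^{-1} \left\{ \alpha \int |W_0|^2\, d\mu_0 - 4\beta \kappa_{g_0} \right\}. \tag{2.11}$$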
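As a sanity check on (2.9) and (2.10), the coefficients $\alpha$ and $\beta$ can be evaluated in exact rational arithmetic for the two coefficient triples quoted above. The sketch below is illustrative only; the function name `alpha_beta` and the dictionary layout are our own choices, not notation from the paper.

```python
from fractions import Fraction


def alpha_beta(gamma1, gamma2, gamma3):
    """Evaluate alpha and beta from (2.9) and (2.10) for given gamma_i."""
    denom = gamma3 + gamma2 / 12          # the common factor gamma_3 + gamma_2/12
    alpha = gamma1 / denom
    beta = (gamma2 / 4) / denom
    return alpha, beta


# Coefficient triples (gamma_1, gamma_2, gamma_3) quoted in the text.
TRIPLES = {
    "conformal Laplacian": (Fraction(1), Fraction(-4), Fraction(-2, 3)),
    "spin Laplacian": (Fraction(7), Fraction(-88), Fraction(-14, 3)),
}

if __name__ == "__main__":
    for name, (g1, g2, g3) in TRIPLES.items():
        a, b = alpha_beta(g1, g2, g3)
        # -gamma_1/gamma_2 is the coefficient of the Weyl term in Theorem 2.1.
        print(name, "alpha =", a, "beta =", b, "threshold coefficient =", -g1 / g2)
```

For both triples $\gamma_3 + \frac{1}{12}\gamma_2$ is negative, so $\alpha \le 0$ and $\beta > 0$, and the threshold coefficients come out as $\frac14$ and $\frac{7}{88}$, matching (2.6) and (2.7).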

Notice that if $\gamma_2 < 0$ and $\gamma_3 < 0$ as required by Theorem 2.1, then $\beta > 0$. Furthermore if $\gamma_1 \ge 0$ and $\kappa_{g_0} \ge 0$ then $\alpha \le 0$ and $\lambda \le 0$. This is important, for the following reason:

Proposition 2.2 (See [Gu, Lemma 1.2]). Suppose $Y(g_0) > 0$. If $\alpha \ge 0$, $0 \le \beta \le 2$, and $\lambda \le 0$, then $R > 0$.

We will see that the pointwise positivity of the scalar curvature when $Y(g_0) > 0$ is crucial to our argument. The following proposition assembles all of the material thus reviewed into one concise – and for our purposes useful – statement.

Proposition 2.3. Let $(M^4, g_0)$ be a smooth compact four-dimensional Riemannian manifold with $Y(g_0) > 0$ and $\kappa_{g_0} \ge 0$. Then there is a conformal metric $g = e^{2w} g_0$ which satisfies

$$\Delta R = \lambda - |E|^2 + \frac{1}{12} R^2 \tag{2.12}$$

with $\lambda \le 0$ and $R > 0$.

Proof. Just take $\gamma_1 = 0$, $\gamma_2 = 6$, and $\gamma_3 = 1$. By Theorem B, $\kappa_{g_0} < 8\pi^2$ unless $(M^4, g_0)$ is conformally equivalent to the sphere. In the latter case, the round metric satisfies (2.12). In the former case, we can quote the existence result of [CY1] as stated in Theorem 2.1, along with the maximum principle result Proposition 2.2 to get the positivity of $R$. □

3. The Proof of Theorem A

Let $(M^4, g_0)$ satisfy $\kappa_{g_0} \ge 0$ and $Y(g_0) \ge 0$. The case where $Y(g_0) = 0$ was addressed in the introduction; recall that in this case $\kappa_{g_0} \le 0$ with equality if and only if $g_0$ is conformal to a Ricci-flat metric $g = e^{2w} g_0$. Thus if $Y(g_0) = 0$ and $\kappa_{g_0} = 0$ then with respect to $g$ one has $P = \Delta_g^2$. Clearly Theorem A follows. So from now on assume $Y(g_0) > 0$ and $\kappa_{g_0} \ge 0$. By Proposition 2.3 there is a metric $g = e^{2w} g_0$ satisfying

$$\Delta R = \lambda - |E|^2 + \frac{1}{12} R^2 \tag{3.1}$$


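with $\lambda \le 0$ and $R > 0$. Let $\phi \in C^\infty(M^4)$ be any function; we wish to estimate

$$\langle P\phi, \phi \rangle_{L^2} = \int \phi\, P_g \phi\, d\mu_g.$$

Using the definition of $P$ and integrating by parts we have (see [CY1, (1.3)])

$$\langle P\phi, \phi \rangle_{L^2} = \int \left[ (\Delta\phi)^2 + \frac{1}{6} R |\nabla\phi|^2 - 2E(\nabla\phi, \nabla\phi) \right] \tag{3.2}$$

(from now on we will omit the subscript $g$ and the volume form $d\mu$ in order to simplify notation). Theorem A will be an easy consequence of the following lemma:

Lemma 3.1. For any $\phi \in C^\infty(M)$,

$$\int -2E(\nabla\phi, \nabla\phi) \ge \int -(\Delta\phi)^2 - \frac{1}{48} R |\nabla\phi|^2. \tag{3.3}$$

Proof. Since $E$ is trace-free, we have the inequality $-E(\nabla\phi, \nabla\phi) \ge -\frac{\sqrt{3}}{2} |E| |\nabla\phi|^2$. Thus,

$$\int -2E(\nabla\phi, \nabla\phi) \ge -2 \int \frac{\sqrt{3}}{2} |E| |\nabla\phi|^2 \ge \int -2 \frac{|E|^2}{R} |\nabla\phi|^2 - \frac{3}{8} R |\nabla\phi|^2,$$

where we have used the arithmetic–geometric mean inequality $-2 \frac{\sqrt{3}}{2} xy \ge -2x^2 - \frac{3}{8} y^2$.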
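For completeness, the arithmetic–geometric mean inequality quoted here follows by completing the square. The particular choice of $x$ and $y$ below is our reading of the step (it uses $R > 0$) and reproduces the bound just displayed:

```latex
\begin{align*}
2x^2 - \sqrt{3}\,xy + \tfrac{3}{8}\,y^2
  &= 2\Big(x - \tfrac{\sqrt{3}}{4}\,y\Big)^2 \;\ge\; 0, \\
x = \frac{|E|\,|\nabla\phi|}{\sqrt{R}}, \quad y = \sqrt{R}\,|\nabla\phi|
  \quad&\Longrightarrow\quad
-\sqrt{3}\,|E|\,|\nabla\phi|^2 \;\ge\; -2\,\frac{|E|^2}{R}\,|\nabla\phi|^2 - \tfrac{3}{8}\,R\,|\nabla\phi|^2 .
\end{align*}
```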

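Now using (3.1) we have

$$\begin{aligned}
\int -2E(\nabla\phi, \nabla\phi) &\ge \int 2 \frac{|\nabla\phi|^2}{R} \left( -|E|^2 + \frac{1}{12} R^2 \right) - \frac{13}{24} R |\nabla\phi|^2 \\
&= \int 2 \frac{|\nabla\phi|^2}{R} (\Delta R - \lambda) - \frac{13}{24} R |\nabla\phi|^2 \\
&\ge \int 2 \frac{|\nabla\phi|^2}{R} \Delta R - \frac{13}{24} R |\nabla\phi|^2,
\end{aligned} \tag{3.4}$$

the last line following from the fact that $\lambda \le 0$. Considering the first term on the right hand side of (3.4), we integrate by parts to get

$$\begin{aligned}
\int 2 |\nabla\phi|^2 R^{-1} \Delta R &= \int -2 |\nabla\phi|^2 \left\langle \nabla(R^{-1}), \nabla R \right\rangle - 2 R^{-1} \left\langle \nabla |\nabla\phi|^2, \nabla R \right\rangle \\
&= \int 2 \frac{|\nabla R|^2}{R^2} |\nabla\phi|^2 - 2 \left\langle \frac{\nabla R}{R}, \nabla |\nabla\phi|^2 \right\rangle.
\end{aligned}$$

Using the simple inequality $\left| \left\langle \frac{\nabla R}{R}, \nabla |\nabla\phi|^2 \right\rangle \right| \le 2 \frac{|\nabla R|}{R} |\nabla\phi| |\nabla^2 \phi|$, we have

$$\begin{aligned}
\int 2 |\nabla\phi|^2 R^{-1} \Delta R &\ge \int 2 \frac{|\nabla R|^2}{R^2} |\nabla\phi|^2 - 4 \frac{|\nabla R|}{R} |\nabla\phi| |\nabla^2 \phi| \\
&\ge \int 2 \frac{|\nabla R|^2}{R^2} |\nabla\phi|^2 - 2 \frac{|\nabla R|^2}{R^2} |\nabla\phi|^2 - 2 |\nabla^2 \phi|^2 \\
&= \int -2 |\nabla^2 \phi|^2.
\end{aligned} \tag{3.5}$$

Substituting (3.5) into (3.4) we get

$$\int -2E(\nabla\phi, \nabla\phi) \ge \int -2 |\nabla^2 \phi|^2 - \frac{13}{24} R |\nabla\phi|^2. \tag{3.6}$$

We now use the integrated Bochner formula:

$$\int |\nabla^2 \phi|^2 = \int (\Delta\phi)^2 - \operatorname{Ric}(\nabla\phi, \nabla\phi) = \int (\Delta\phi)^2 - E(\nabla\phi, \nabla\phi) - \frac{1}{4} R |\nabla\phi|^2.$$

Substituting this into (3.6) gives

$$\int -2E(\nabla\phi, \nabla\phi) \ge \int -2(\Delta\phi)^2 + 2E(\nabla\phi, \nabla\phi) - \frac{1}{24} R |\nabla\phi|^2$$

$$\Rightarrow \quad \int -2E(\nabla\phi, \nabla\phi) \ge \int -(\Delta\phi)^2 - \frac{1}{48} R |\nabla\phi|^2. \qquad \square$$

To complete the proof of Theorem A, combine (3.2) and (3.3) to get

$$\langle P\phi, \phi \rangle_{L^2} \ge \frac{7}{48} \int R |\nabla\phi|^2.$$

Since $R > 0$, the right-hand side is non-negative and vanishes only when $\phi$ is constant. Thus $\lambda_1(P) = 0$ and $\operatorname{Ker} P = \{\text{constants}\}$. □

4. The Proof of Theorem C

The basis for Theorem C is the Chern–Gauss–Bonnet formula along with sharp $L^2$ estimates for the self-dual part of the Weyl tensor in [Gu]. Recall that if $(M^4, g)$ is oriented, then the bundle of two-forms splits $\Lambda^2 = \Lambda^2_+ \oplus \Lambda^2_-$ into its self-dual and anti-self-dual sub-bundles. This in turn induces a splitting of the Weyl curvature $W = W^+ \oplus W^-$, viewed as a bundle endomorphism $W : \Lambda^2 \to \Lambda^2$. To prove (i) of Theorem C, we combine the Hirzebruch signature formula

$$12\pi^2 \sigma(M^4) = \int \left( |W^+|^2 - |W^-|^2 \right) d\mu$$

with the Chern–Gauss–Bonnet formula (0.3) to get

$$2\pi^2 \left( 2\chi(M^4) \pm 3\sigma(M^4) \right) = \int |W^\pm|^2\, d\mu + \int Q\, d\mu. \tag{4.1}$$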


Thus, if $\kappa_g \ge 0$ we conclude $2\chi(M^4) + 3\sigma(M^4) \ge 0$, with equality if and only if either $W^+ \equiv 0$ or $W^- \equiv 0$. Part (i) follows. To prove part (ii), we appeal to the following result in [Gu]:

Theorem ([Gu, Theorem 2]). Let $M^4$ be a closed, compact, oriented four-dimensional manifold with $H^1(M^4) \ne 0$. Then for any metric $g$ on $M^4$ satisfying $Y(g) > 0$,

$$\int |W^+|^2\, d\mu \ge 2\pi^2 \left( 2\chi(M^4) + 3\sigma(M^4) \right). \tag{4.2}$$

Furthermore, equality is attained in (4.2) by some metric $g_0$ with $Y(g_0) > 0$ if and only if $(M^4, g_0)$ is conformal to a quotient of $S^3 \times \mathbb{R}$ equipped with the product metric.

Assuming $Y(g_0) > 0$ and $H^1(M^4) \ne 0$, combining (4.1) and (4.2) we get

$$2\pi^2 \left( 2\chi(M^4) + 3\sigma(M^4) \right) = \int |W^+|^2\, d\mu + \int Q\, d\mu \ge 2\pi^2 \left( 2\chi(M^4) + 3\sigma(M^4) \right) + \kappa_g$$

so that $\kappa_g \le 0$. Furthermore, if $\kappa_g = 0$ then equality is attained in (4.2), and $(M^4, g_0)$ is conformal to a quotient of the product $S^3 \times \mathbb{R}$. The proof of Corollary 0.3 is immediate. □

Acknowledgements. It is a pleasure to thank S.Y.A. Chang and P. Yang for bringing some of the problems addressed in this article to our attention, and for their encouragement during the preparation of the manuscript.

References

[Ad] Adams, D.: A sharp inequality of J. Moser for higher order derivatives. Ann. of Math. 128, 385–398 (1988)
[Au] Aubin, T.: Équations différentielles non linéaires et problème de Yamabe concernant la courbure scalaire. J. Math. Pures Appl. 55, 269–296 (1976)
[BCY] Branson, T., Chang, S.Y.A. and Yang, P.: Estimates and extremals for zeta function determinants on four-manifolds. Commun. Math. Phys. 149, 241–262 (1992)
[Be] Besse, A.: Einstein manifolds. Berlin: Springer-Verlag, 1987
[BO] Branson, T. and Ørsted, B.: Explicit functional determinants in four dimensions. Proc. Am. Math. Soc. 113, 669–682 (1991)
[CGY] Chang, S.Y.A., Gursky, M.J. and Yang, P.: Regularity of a fourth order non-linear PDE with critical exponent. To appear in Am. J. Math.
[Ch] Cheng, S.Y.: Eigenvalue comparison theorems and its geometric application. Math. Z. 143, 289–297 (1975)
[CY1] Chang, S.Y.A. and Yang, P.: Extremal metrics of zeta function determinants on 4-manifolds. Ann. of Math. 142, 171–212 (1995)
[CY2] Chang, S.Y.A. and Yang, P.: On uniqueness of solutions of a n-th order differential equation in conformal geometry. Math. Res. Letters 4, 91–102 (1997)
[ES1] Eastwood, M.G. and Singer, M.A.: A conformally invariant Maxwell gauge. Phys. Lett. A 107, 73–74 (1985)
[ES2] Eastwood, M.G. and Singer, M.A.: The Frölicher spectral sequence on a twistor space. J. Diff. Geom. 38, 653–669 (1993)
[Gu] Gursky, M.J.: The Weyl functional, deRham cohomology, and Kähler–Einstein metrics. Ann. of Math. 148, 315–337 (1998)
[Hi] Hitchin, N.: On compact four-dimensional Einstein manifolds. J. Diff. Geom. 9, 435–442 (1974)
[Li] Lin, C.S.: A classification of solutions of a conformally invariant fourth order equation in $\mathbb{R}^n$. Comment. Math. Helv. 73, 206–231 (1998)
[LP] Lee, J.M. and Parker, T.H.: The Yamabe problem. Bull. Am. Math. Soc. 17, 37–91 (1987)
[Pa] Paneitz, S.: A quartic conformally covariant differential operator for arbitrary pseudo-Riemannian manifolds. Preprint, 1983
[Sc1] Schoen, R.: Variational theory for the total scalar curvature functional for Riemannian metrics and related topics. In: Topics in Calculus of Variations, Seminar 1987 (ed. M. Giaquinta), LNM 1365, New York: Springer-Verlag, 1989, pp. 120–154
[Sc2] Schoen, R.: Conformal deformation of a Riemannian metric to constant scalar curvature. J. Diff. Geom. 20, 479–495 (1984)
[Th] Thorpe, J.: Some remarks on the Gauss–Bonnet integral. J. Math. Mech. 18, 779–786 (1969)
[Ti] Tian, G.: On Calabi's conjecture for complex surfaces with positive first Chern class. Invent. Math. 101, 101–172 (1990)
[WX] Wei, J. and Xu, X.: On conformal deformations of metrics on $S^n$. J. Funct. Anal. 157, 292–325 (1998)

Communicated by A. Connes

Commun. Math. Phys. 207, 145 – 171 (1999)


Multifractal Analysis of Lyapunov Exponent for Continued Fraction and Manneville–Pomeau Transformations and Applications to Diophantine Approximation

Mark Pollicott$^1$, Howard Weiss$^{2,*}$

$^1$ Department of Mathematics, The University of Manchester, Oxford Road, M13 9PL, Manchester, UK.
$^2$ Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA.

Received: 13 October 1998 / Accepted: 19 April 1999

Abstract: We extend some of the theory of multifractal analysis for conformal expanding systems to two new cases: the non-uniformly hyperbolic example of the Manneville–Pomeau equation and the continued fraction transformation. A common point in the analysis is the use of thermodynamic formalism for transformations with infinitely many branches. We effect a complete multifractal analysis of the Lyapunov exponent for the continued fraction transformation and as a consequence obtain some new results on the precise exponential speed of convergence of the continued fraction algorithm. This analysis also provides new quantitative information about cuspital excursions on the modular surface.

1. Introduction

In this paper we extend some aspects of the multifractal analysis which are useful in studying problems in Diophantine approximation, in studying the behavior of geodesics on the modular surface, and in studying an important non-uniformly hyperbolic dynamical system. In particular, we first study the continued fraction (Gauss) transformation $T_1 : [0, 1] \to [0, 1]$ defined by

$$T_1 x \equiv \frac{1}{x} - \left[ \frac{1}{x} \right] = \left\{ \frac{1}{x} \right\},$$

for $x \ne 0$ and $T_1(0) \equiv 0$. Here $[1/x]$ denotes the integer part of $1/x$. This map (see Fig. 1) is uniformly hyperbolic, but being naturally coded by an infinite alphabet and having infinite topological entropy, the usual theory of multifractal analysis for conformal maps [PW1, PW2] does not directly apply.

$^*$ The work of the second author was partially supported by a National Science Foundation grant DMS-9704913. The manuscript was completed during the second author's sabbatical visit at IPST, University of Maryland, and he wishes to thank IPST for their gracious hospitality.

Fig. 1. Graph of the map $T_1$
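The definition of $T_1$ translates directly into code. The following is a minimal sketch in exact rational arithmetic (so no rounding enters); the function names `gauss_map` and `cf_digits` are our own, and the digit extraction simply records the integer parts $[1/x]$ along the orbit.

```python
from fractions import Fraction
from math import floor


def gauss_map(x):
    """Continued fraction (Gauss) map: T1(x) = 1/x - [1/x], with T1(0) = 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / Fraction(x)
    return inv - floor(inv)


def cf_digits(x, n):
    """First n continued fraction digits of x in (0, 1), read off the T1-orbit.

    Stops early if the orbit reaches 0, which happens exactly for rational x.
    """
    x = Fraction(x)
    digits = []
    for _ in range(n):
        if x == 0:
            break
        digits.append(floor(1 / x))
        x = gauss_map(x)
    return digits


if __name__ == "__main__":
    print(gauss_map(Fraction(3, 7)))      # orbit step of 3/7
    print(cf_digits(Fraction(3, 7), 10))  # digits of 3/7 = 1/(2 + 1/3)
```

The infinite alphabet mentioned in the text is visible here: the digit $[1/x]$ can be any positive integer, one for each branch of $T_1$.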

For the continued fraction transformation, our multifractal analysis of the Lyapunov exponent will yield new detailed information about the precise rates of Diophantine approximation to irrational numbers. This has immediate implications for cuspital excursions of geodesics on the modular surface. We also study the Manneville–Pomeau transformation [MP1] defined by

$$T_2 : [0, 1] \to [0, 1], \qquad T_2 x = x + x^{1+\alpha} \bmod 1,$$

where $\alpha$ is a non-negative constant. This important model (see Fig. 2a) is a non-uniformly hyperbolic transformation having the most benign type of non-hyperbolicity: an indifferent fixed point at 0, i.e., $T_2(0) = 0$ and $T_2'(0) = 1$, and exhibits intermittent behavior [MP1]. In general one could not expect the full force of multifractal analysis to apply for general non-uniformly hyperbolic systems. However, we show that part of the theory carries over to this, and similar, transformations. A key aspect of our analysis is a reduction, via inducing, of this system to a countable state uniformly hyperbolic system (i.e., using the Schweiger jump transformation). In particular, we show that the Lyapunov exponent attains an interval of values, realized on dense sets with positive Hausdorff dimension. This quantifies the size of the set of points whose orbits spend a disproportionate amount of time near the indifferent fixed point. A major tool in the multifractal analysis is the use of symbolic dynamics and thermodynamic formalism, i.e., pressure (and its derivatives), equilibrium states, etc. In the present context of infinite state subshifts of finite type, we are fortunate in having at our disposal a theory worked out by Walters [Wa1].
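The intermittent behavior near the indifferent fixed point can be seen in a few lines of floating-point code. This is an illustrative sketch, not part of the paper's argument: `branch_point` locates by bisection the point $a_0$ with $a_0 + a_0^{1+\alpha} = 1$ where the first branch of $T_2$ ends (the symbol $a_0$ matches the notation of Sect. 2), and `escape_time` counts how many iterates an orbit needs to leave $[0, a_0)$. All function names are our own.

```python
def manneville_pomeau(x, alpha):
    """Manneville-Pomeau map T2(x) = x + x^(1+alpha) mod 1."""
    return (x + x ** (1.0 + alpha)) % 1.0


def branch_point(alpha, tol=1e-14):
    """Solve a + a^(1+alpha) = 1 on [0, 1] by bisection (left side is increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + mid ** (1.0 + alpha) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def escape_time(x, alpha, a0):
    """Number of iterates of T2 needed for the orbit of x > 0 to leave [0, a0)."""
    steps = 0
    while x < a0:
        x = manneville_pomeau(x, alpha)
        steps += 1
    return steps


if __name__ == "__main__":
    alpha = 0.5
    a0 = branch_point(alpha)
    for x0 in (0.3, 1e-2, 1e-4):
        print(x0, escape_time(x0, alpha, a0))
```

Starting points closer to 0 take markedly longer to escape, which is the slow drift away from the indifferent fixed point that the induced (jump) transformation is designed to skip over. For $\alpha = 0$ the same function reduces to the doubling map.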

Fig. 2a. Graph of the map $T_2$

Fig. 2b. Graph of the map $\hat{T}_2$, with the points $a_1, a_2, a_3, \dots$ and the intervals $I_0, I_1, I_2, I_3, \dots$ marked

Finally, while we state our main results for these two model transformations, our analysis works in greater generality. Our results are valid for interval maps which have induced maps (coded by an infinite alphabet) which satisfy the EMR (expanding-Markov-Rényi) property (see Sect. 2) and potentials (for equilibrium states) which satisfy properties (W1) and (W2) (see Sect. 2). In particular, our analysis works for the family of functions, which Prellberg calls $C_s$ [Pr], containing certain piecewise monotone transformations of the interval with an indifferent fixed point at the boundary. For instance, the Farey map

$$F(x) = \begin{cases} x/(1 - x), & 0 \le x < 1/2, \\ (1 - x)/x, & 1/2 \le x \le 1 \end{cases}$$

belongs to the class $C_1$. There are other examples to which a similar sort of analysis extends, particularly in the realm of number theoretic analysis. Our analysis applies to the family of maps $T_s(x) = \{1/(s(1 - x))\}$ for $0 < s < 4$ and $s \ne 4\cos^2(\pi/q)$, $q = 3, 4, \dots$, and there is a theory of $s$-backward continued fractions based on this family of maps [GH]. Our analysis of the Manneville–Pomeau map easily extends to the complex continued fraction algorithm [Sc, §23.6], where the underlying map is conformal and has an indifferent fixed point. The underlying map for the Jacobi–Perron algorithm [Sc, §23.1] for continued fractions in several variables is hyperbolic with infinitely many branches, but is not conformal. In this case, a multifractal analysis based on local entropy rather than pointwise dimension should routinely follow.

Shortly after the completion of this manuscript, the authors were given a recent preprint [N] which contains an argument corresponding to Proposition 3(4).

2. Markov Maps and Inducing

Let $I$ denote an interval of real numbers. For the maps $T_1$ and $T_2$ the interval $I$ will be $[0, 1]$. The study of our two transformations can be reduced to studying transformations of the following general form:

Definition. We say that a transformation $T : I \to I$ is an EMR Transformation if we may write $I = \bigcup_{n=0}^{\infty} I_n$ as a countable union of closed intervals (with disjoint interiors $\mathring{I}_n$), which we call basic subintervals, such that

(1) The map $T$ is $C^2$ on $\bigcup_{n=1}^{\infty} \mathring{I}_n$.
(2) Some power of $T$ is uniformly expanding, i.e., there exists a positive integer $r$ and $\alpha > 0$ such that $|(T^r)'(x)| \ge \alpha > 1$ for all $x \in \bigcup_{n=1}^{\infty} \mathring{I}_n$.
(3) The map $T$ is Markov.
(4) The map $T$ satisfies Rényi's condition, i.e., there exists a positive number $K$ such that
$$\sup_n \sup_{x, y, z \in I_n} \frac{|T''(x)|}{|T'(y)|\,|T'(z)|} \le K < \infty.$$

It is easy to verify that $|(T_1^2)'(x)| \ge 4$ for all $x \in \bigcup_{n=1}^{\infty} \mathring{I}_n$.

Proposition 1 ([Sc], [Wa1, p. 148]). The continued fraction transformation $T_1$ satisfies EMR.
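The expansion bound for the second iterate of $T_1$ can be probed numerically. Since $T_1'(x) = -1/x^2$ on each branch, the chain rule gives $|(T_1^2)'(x)| = 1/(x^2\, T_1(x)^2)$, and on the branch $x \in (1/(n+1), 1/n)$ one has $x\,T_1(x) = 1 - nx < 1/(n+1) \le 1/2$. The sketch below samples random points in floating point; it is a sanity check under these assumptions, not a proof, and the function names are illustrative.

```python
import math
import random


def gauss_map(x):
    """Continued fraction map T1(x) = 1/x - [1/x] for x in (0, 1]."""
    inv = 1.0 / x
    return inv - math.floor(inv)


def second_iterate_derivative(x):
    """|(T1^2)'(x)| via the chain rule, using T1'(x) = -1/x^2 on each branch."""
    y = gauss_map(x)
    return 1.0 / (x * x * y * y)


def min_second_iterate_derivative(samples=100_000, seed=0):
    """Smallest sampled value of |(T1^2)'| over random points of (0, 1)."""
    rng = random.Random(seed)
    smallest = math.inf
    for _ in range(samples):
        x = rng.uniform(1e-6, 1.0)
        if gauss_map(x) == 0.0:
            continue  # skip points that land on a branch endpoint
        smallest = min(smallest, second_iterate_derivative(x))
    return smallest


if __name__ == "__main__":
    print(min_second_iterate_derivative())
```

The sampled minimum stays just above 4 and is approached only for $x$ slightly larger than $1/2$, where $x\,T_1(x) = 1 - x$ is close to $1/2$.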


For $\alpha = 0$ the Manneville–Pomeau transformation is the usual doubling map of the circle. For $0 < \alpha < 1$, Thaler [T] constructed a finite absolutely continuous invariant measure. For $\alpha > 1$, Thaler also constructed a sigma-finite but not finite absolutely continuous invariant measure. However, we shall not consider this range. The Manneville–Pomeau map has two branches, on $[0, a_0]$ and $[a_0, 1]$, where $1 = a_0 + a_0^{1+\alpha}$. There is a natural topological conjugacy of this map to the standard doubling map, so it has topological entropy equal to $\log 2$. Although this map is not hyperbolic since the derivative $T_2'(0) = 1$, the induced map on $[a_0, 1]$ is hyperbolic [Sc]. More precisely, we choose the monotone decreasing sequence $a_n \to 0$ such that $T_2(a_{n+1}) = a_n$ and define $I_n = [a_n, a_{n-1}]$, for $n \ge 1$, and $I_0 = [a_0, 1]$. For $a_{n+1} < x < a_n$ we define $\hat{T}_2(x) = T_2^n(x)$ (see Fig. 2b). The map $\hat{T}_2$ is piecewise analytic on countably many intervals $\{I_n\}$ and is uniformly hyperbolic. Each point which is not an end point of an interval $I_n$ has two pre-images under $T_2$.

Proposition 2 ([T, pp. 312–313], [I, §2]). For $0 < \alpha < 1$, the transformation $\hat{T}_2$ satisfies EMR.

We will first effect a multifractal analysis for the map $\hat{T}_2$ and then transfer the multifractal analysis to the map $T_2$.

For an EMR transformation $T$ we define the set $O = \bigcup_{k=0}^{\infty} T^{-k} \left( \bigcup_{j=0}^{\infty} \partial I_j \right)$, where $\partial I_j$ denotes the two endpoints of the interval $I_j$. Clearly $O$ is a countable subset of $I$. We denote by $I_n(x)$ the element of the refinement $\bigvee_{i=0}^{n-1} T^{-i} \{I_m\}_{m=1}^{\infty}$ of the original partition containing $x$. Every $x \in I \setminus O$ has a unique symbolic coding since for every $k \in \mathbb{N}$ there is a unique basic subinterval that contains $T^k(x)$. A fundamental property of an EMR transformation is that it satisfies what is usually called the Jacobian Estimate [CFS, p. 171], i.e., there exists positive $K$ such that for all $x \in I \setminus O$ one has

$$0 < \frac{1}{K} \le \sup_{n \ge 0}\ \sup_{y \in I_n(x)} \left| \frac{(T^n)'(x)}{(T^n)'(y)} \right| \le K < \infty. \tag{J}$$

This property will be exploited many times in the proof of Proposition 3 and is also an essential component in the proof of Proposition 2.

For the reasons mentioned in the introduction, it is natural that for infinitely coded maps like $T_1$ and $T_2$, the class of potentials which admit unique equilibrium states is more restrictive than in the usual case. We will now discuss a class of potentials, which we denote $W = W(T)$ (for Walters), for which the usual results in thermodynamic formalism are valid. Let $T : I \to I$ be an EMR transformation and $\phi : I \to \mathbb{R}$ a function such that $\exp \phi$ is continuous and satisfies the following two properties:

(W1) There exists a constant $C > 0$ such that the sum $\sum_{Ty = x} \exp \phi(y) \le C$, for all $x \in I$.
(W2) The function
$$C_\phi(x, x') = \sup_{n \ge 1}\ \sup_{y \in T^{-n} x} \left| \sum_{i=0}^{n-1} \phi(T^i y) - \phi(T^i y') \right|$$
is bounded by a constant $C_\phi$ and $C_\phi(x, x')$ tends to zero as $|x - x'| \to 0$.


Such potentials will be said to belong to the class W . The thermodynamic Pressure P (φ) can be defined for a continuous function φ via the variational principle as Z P (φ) = sup hµ (T ) + φdµ , µ T −inv

I

where hµ (T ) denotes the measure theoretic entropy of T , and the supremum is taken over all T −invariant Borel probability measures µ [Wa2]. Example 1. It is an important feature of allowable potentials for the continued fraction transformation T1 that φ(x) → −∞ as x → 0, in contrast to the usual boundedness of potentials for EMR maps having only finitely many intervals. Observe also that for T1 , the non-zero constant function never satisfies (W1). For T1 with the piecewise analytic potential φ = −t log |T10 | we have that X

exp φ(y) ≤

∞ X 1 < ∞, n2t n=1

T1 y=x

and condition (W1) holds for t > 21 . Condition (W2) can similarly be seen to hold on the same range [Wa1]. Let x, x 0 ∈ X, y ∈ T1−n x and let y 0 be the corresponding point of T1−n x 0 . Two applications of the Mean Value Theorem yield that T100 (zi ) i i 0 |φ(T y) − φ(T y )| = t 0 |x − x 0 |, T (zi )(T n−i )0 (wi ) 1

1

where wi and zi lie between T1i y and T1i y 0 . Using properties (2) and (4) of EMR we obtain that |φ(T i y) − φ(T i y 0 )| ≤ Kt|x − x 0 |/α n−i−1 and thus n−1 X φ(T i y) − φ(T i y 0 ) ≤ Kt|x − x 0 |/(α − 1). i=0

Condition (W2) immediately follows. On the range t > 21 , the function t 7 → P (−t log |T10 |) is analytic since the function exp(P (−t log |T10 |)) is an isolated eigenvalue for the associated transfer operator, about which we shall say more in [M1, Sect. VI]. It is also strictly convex (see Fig. 3). Moreover, as in the usual theory, one can use perturbation theory to compute the second derivative of exp(P (−t log |T10 |)) and deduce this is strictly convex. It follows from the standard Rohlin equality that P (− log |T10 |) = 0. Mayer [M1] has also shown that the function t 7 → P (−t log |T10 )| has a logarithmic singularity at t = 21 . This will explain the range of values in the statement of Theorem 1. Example 2. For the induced Manneville–Pomeau transformation Tb2 with the piecewise analytic potential φ = −t log |Tb20 |, we have that X Tb2 y=x

exp φ(y) =

X Tb2 y=x

∞

X 1 = |G0n (x)|t , 0 t |Tb2 (y)| n=1

Multifractal Analysis of Lyapunov Exponent for Continued Fraction

1/2

151

1

Fig. 3. Graph of t 7 → P (−t log |T10 |)

where $G_n = F_1 F_0^{n-1}$, and $F_0$, $F_1$ denote the two branches of the inverse of $T_2$. Properties (W1) and (W2) for $\widehat{T}_2$ are established in [T, p. 312] and [I, Lemma 2.2] for $t < 1$. Prellberg [Pr,PS,V] showed that $P(-t \log |T_2'|) = 0$ for $t > 1$, and that on the range $1/\alpha < t < 1$, the function $t \mapsto P(-t \log |T_2'|)$ is analytic and strictly convex (see Fig. 4). It follows from the Rohlin equality that $P(-\log |T_2'|) = 0$. Using the variational principle, it is easy to see that $P(-t \log |T_2'|) \ge 0$ for all values of $t$: if we take $\mu$ to be the Dirac measure supported at 0, then $h_\mu(T_2) = 0$ and $\int_I \log |T_2'| \, d\mu = 0$. Lopes [L] has studied the precise nature of the singularity of the map $t \mapsto P(-t \log |T_2'|)$ at $t = 1$. This explains the range of values in the statement of Theorem 2.

Lyapunov exponents. Lyapunov exponents measure the exponential rate of divergence of infinitesimally close orbits of a smooth dynamical system. These exponents are intimately related with the global stochastic behavior of the system and are fundamental invariants of a smooth dynamical system. For a transformation $T : I \to I$ we define the Lyapunov exponent $\lambda(x)$ of $T$ by
\[ \lambda(x) \equiv \lim_{n \to \infty} \frac{1}{n} \log |(T^n)'(x)| = \lim_{n \to \infty} \frac{1}{n} \log \prod_{i=0}^{n-1} |T'(T^i x)|, \tag{3} \]

when the limit exists. The function $\lambda(x)$ is clearly $T$-invariant. There is a natural decomposition of the interval $I$ by level sets of the Lyapunov exponent $L_\beta = \{x \in I : \lambda(x) = \beta\}$,
\[ I = \bigcup_{-\infty < \beta < \infty} L_\beta \cup \{x \in I \mid \lambda(x) \text{ does not exist}\}. \]
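As an illustrative aside that is not part of the original argument, the finite-$n$ average in (3) can be evaluated numerically for the continued fraction map $T_1(x) = 1/x - [1/x]$, for which $|T_1'(x)| = x^{-2}$. The following is a minimal Python sketch under stated assumptions: the helper names `gauss_map`, `finite_cf` and `lyapunov_estimate` are hypothetical, and the point $x$ is replaced by a rational number with a long finite expansion so that exact arithmetic avoids floating-point drift along the orbit.

```python
import math
from fractions import Fraction


def gauss_map(x: Fraction) -> Fraction:
    """Continued fraction map T1(x) = 1/x - floor(1/x), for 0 < x <= 1."""
    inv = 1 / x
    return inv - (inv.numerator // inv.denominator)


def finite_cf(digits):
    """Exact value of the finite continued fraction [a1, ..., aN]."""
    value = Fraction(0)
    for a in reversed(digits):
        value = 1 / (a + value)
    return value


def lyapunov_estimate(x: Fraction, n: int) -> float:
    """Finite-n average (1/n) * sum_{i<n} log|T1'(T1^i x)|, with |T1'(y)| = y**-2."""
    total = 0.0
    for _ in range(n):
        total += -2.0 * math.log(float(x))  # log|T1'(x)| = -2 log x
        x = gauss_map(x)
    return total / n


# Assumption: a rational stand-in with 200 digits equal to 1.
# Only the first 100 iterates are used, so the orbit never reaches 0.
x = finite_cf([1] * 200)
estimate = lyapunov_estimate(x, 100)
```

For this all-ones expansion the estimate agrees with $2 \log((1+\sqrt{5})/2)$ up to floating-point accuracy. Other digit sequences can be passed to `finite_cf` in the same way, provided the number of iterates stays below the number of digits.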


M. Pollicott, H. Weiss

Fig. 4. Graph of $t \mapsto P(-t \log |T_2'|)$

To study this complicated decomposition we introduce the Lyapunov spectrum by considering the level sets of the Lyapunov exponent and by defining $g(\beta) = \dim_H(L_\beta)$, where $\dim_H(L_\beta)$ denotes the Hausdorff dimension of $L_\beta$. In the next section we relate the Lyapunov spectrum to a related spectrum for local dimension and prove several remarkable properties about it.

3. The Multifractal Analysis of Equilibrium States

The general concept of a multifractal analysis for a dynamical system concerns a detailed study of the exceptional behavior of asymptotically defined dynamical quantities such as pointwise dimension, Lyapunov exponent, local entropy, Birkhoff average, etc. In many examples with hyperbolic structure these quantities are constant almost everywhere, with respect to an appropriate ergodic measure. We consider two notions of local or pointwise dimension with respect to an invariant measure. The pointwise dimension of a Borel probability measure $\mu$ on $I$ is defined by
\[ d_\mu(x) = \lim_{r \to 0} \frac{\log \mu(B(x, r))}{\log r}, \tag{1} \]

when the limit exists. Here $B(x, r)$ denotes the ball of radius $r$ centered at the point $x$. The Markov pointwise dimension of a $T$-invariant Borel probability measure $\mu$ on $I$ is defined by
\[ \delta_\mu(x) \equiv \lim_{n \to \infty} \frac{\log \mu(I_n(x))}{-\log \ell(I_n(x))}, \]


when the limit exists. Here the intervals $\{I_n\}$ are those in the definition of an EMR transformation and $\ell(I_n(x))$ denotes the length of the interval $I_n(x)$. In [PW1,PW2] the authors establish deep relationships between these two notions of local dimension for equilibrium states for conformal expanding maps. By definition conformal expanding maps are local homeomorphisms, which the maps we consider in this paper are not. Not surprisingly, some of the relationships that are valid at every point in certain sets are no longer true, and this is a major obstacle in generalizing the usual multifractal theory to these more general transformations. There are natural decompositions of the interval $I$ by the level sets $K_\alpha = \{x : d_\mu(x) = \alpha\}$,
\[ I = \bigcup_{-\infty < \alpha < \infty} K_\alpha \cup \{x \in I \mid d_\mu(x) \text{ does not exist}\}, \]

and by the level sets $K_\alpha^M = \{x : \delta_\mu(x) = \alpha\}$,
\[ I = \bigcup_{-\infty < \alpha < \infty} K_\alpha^M \cup \{x \in I \mid \delta_\mu(x) \text{ does not exist}\}. \]

Since we are mostly interested in effecting a multifractal analysis of the Lyapunov exponent, we only analyze the latter decomposition. To study this decomposition we define the Markov dimension spectrum $f_\mu(\alpha) = \dim_H(K_\alpha^M)$, where $\dim_H(K_\alpha^M)$ denotes the Hausdorff dimension of the set $K_\alpha^M$. This is similar to the Lyapunov spectrum we defined in Sect. 2. The parts of the multifractal analysis which we establish are that under suitable hypotheses, the function $f_\mu(\alpha)$ is real analytic and strictly convex (on a suitable interval) and is given in terms of thermodynamic formalism. More precisely, let $T : I \to I$ be an EMR transformation, and let $\phi \in W$. The equilibrium state $\mu$ is a $T$-invariant probability measure for which there exists a constant $C > 0$ such that
\[ \frac{1}{C} \le \frac{\mu(I_n(x))}{\exp\bigl(-nP(\phi) + \sum_{j=0}^{n-1} \phi(T^j y)\bigr)} \le C, \tag{2} \]

for all $x \in I$ and $y \in I_n(x)$. We assume that $\phi$ is not cohomologous to $\log |T'|$ [PW1]. Let $\psi$ be the positive function such that $\log \psi = \phi - P(\phi)$. Clearly $\psi \in W$, the pressure $P(\log \psi) = 0$, and $\mu$ is also the equilibrium state for $\log \psi$. For these potentials it follows from the Jacobian estimate (J) that the Markov pointwise dimension satisfies
\[ \delta_\mu(x) \equiv \lim_{n \to \infty} \frac{\log \mu(I_n(x))}{-\log \ell(I_n(x))} = \lim_{n \to \infty} \frac{\log \prod_{i=0}^{n-1} \psi(T^i x)}{-\log |(T^n)'(x)|}, \]
when the limits exist. The following proposition establishes relationships between the two notions of local dimension for equilibrium states and the Lyapunov exponent.

Proposition 3. Let $T : I \to I$ be an EMR transformation and let $\mu$ be the equilibrium state corresponding to the potential $\phi \in W$. Let $\overline{\phi}(x)$ denote the Birkhoff average $\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \phi(T^i(x))$ at $x$ and consider $x \in I \setminus O$.


(1) Suppose that $\delta_\mu(x)$ exists. Then $\overline{d}_\mu(x) \le \delta_\mu(x)$. If $\overline{\phi}(x)$ also exists, then $\underline{d}_\mu(x) \ge \delta_\mu(x)$. In this case
\[ d_\mu(x) = \delta_\mu(x) = \frac{-\overline{\log \psi}(x)}{\lambda(x)} = \frac{P(\phi) - \overline{\phi}(x)}{\lambda(x)}, \]
where $\lambda(x)$ denotes the Lyapunov exponent at $x$.

(2) Suppose that $d_\mu(x)$ exists. Then $\underline{\delta}_\mu(x) \ge d_\mu(x)$. If $\overline{\phi}(x)$ also exists, then $\overline{\delta}_\mu(x) \le d_\mu(x)$. In this case $\delta_\mu(x) = d_\mu(x)$.

(3) If $\exp \phi$ is uniformly bounded away from zero, then the pointwise dimension $d_\mu(x) = \gamma$ if and only if the Markov pointwise dimension $\delta_\mu(x) = \gamma$.

(4) If $T = T_1$ or $T = T_2$ and $\log \psi = \phi - P(\phi)$ is bounded away from zero, then the Markov pointwise dimension $\delta_\mu(x) = \gamma$ implies that the pointwise dimension $d_\mu(x) = \gamma$.

Remark. We caution that the hypothesis in (3) does not hold for the important family of potentials $\phi = \phi_s = -s \log |T_1'|$ since $\exp \phi_s = x^{2s}$. As noted in Example 1 of Sect. 2, every potential $\phi \in W$ for the continued fraction map $T_1$ must approach 0 as $x \to 0$ and thus (3) will not hold for any such potential for this map. However, Property (4) is satisfied (at least for $s$ close to 1). This is an important difference from the multifractal analysis of the classical conformal expanding maps where we consider Hölder continuous potentials and where this condition is always satisfied.

Proof of Proposition 3. The Jacobian estimate (J) allows us to estimate the lengths ($\ell$) of the intervals $I_n(x)$ using the derivative of $T$ [PW1], i.e., there exist positive constants $C_1$ and $C_2$ such that for all $x \in I \setminus O$ and all $n \in \mathbb{N}$, we have
\[ C_1 \le \frac{\ell(I_n(x))}{|(T^n)'(x)|^{-1}} \le C_2. \tag{4} \]

Suppose that the Markov pointwise dimension $\delta_\mu(x)$ exists at a point $x \in I \setminus O$. Given $r > 0$ there exists a unique $n = n(r)$ such that $C_1 |(T^n)'(x)|^{-1} < r \le C_1 |(T^{n-1})'(x)|^{-1}$. It immediately follows from (4) that
\[ \frac{\log \mu(B(x, r))}{\log r} \ge \frac{\log \mu(B(x, C_1 |(T^{n-1})'(x)|^{-1}))}{\log r} \ge \frac{\log \mu(I_{n-1}(x))}{\log r} \tag{5} \]
\[ \ge \frac{\log \mu(I_{n-1}(x))}{\log(C_1 |(T^n)'(x)|^{-1})} = \frac{\log \mu(I_{n-1}(x))}{\log \mu(I_n(x))} \cdot \frac{\log \mu(I_n(x))}{\log(C_1 |(T^n)'(x)|^{-1})}. \tag{6} \]
It follows from the definition of equilibrium state that
\[ \frac{\mu(I_{n-1}(x))}{\mu(I_n(x))} \asymp \frac{\exp\bigl(-(n-1)P(\phi) + \sum_{j=0}^{n-2} \phi(T^j x)\bigr)}{\exp\bigl(-nP(\phi) + \sum_{j=0}^{n-1} \phi(T^j x)\bigr)}, \]
and thus if $\overline{\phi}(x)$ exists, then
\[ \lim_{n \to \infty} \frac{\log \mu(I_{n-1}(x))}{\log \mu(I_n(x))} = 1. \]
We obtain that if $\delta_\mu(x)$ exists, then $\underline{d}_\mu(x) \ge \delta_\mu(x)$. This proves the first part of (1).


We note that by replacing the potential $\phi$ by $\log \psi$, we obtain the following expression:
\[ \frac{\mu(I_{n-1}(x))}{\mu(I_n(x))} \asymp \frac{\prod_{j=0}^{n-2} \psi(T^j x)}{\prod_{j=0}^{n-1} \psi(T^j x)} = \frac{1}{\psi(T^{n-1} x)}, \]
and if we assume that $\psi$ (or $\exp \phi$) is uniformly bounded away from zero, then clearly the quotient $\mu(I_{n-1}(x))/\mu(I_n(x))$ will be uniformly bounded for all $x$ and all $n$, and thus for all $x$ we have
\[ \lim_{n \to \infty} \frac{\log \mu(I_{n-1}(x))}{\log \mu(I_n(x))} = 1. \]
Under this uniform boundedness away from zero assumption, we also obtain that $\underline{d}_\mu(x) \ge \delta_\mu(x)$, contributing to the proof of (3).

Next, given $r > 0$ there exists a unique $n = n(r)$ such that $C_2 |(T^n)'(x)|^{-1} < r \le C_2 |(T^{n-1})'(x)|^{-1}$. It immediately follows from (4) that
\[ \frac{\log \mu(B(x, r))}{\log r} \le \frac{\log \mu(B(x, C_2 |(T^n)'(x)|^{-1}))}{\log r} \le \frac{\log \mu(I_n(x))}{\log r} \le \frac{\log \mu(I_n(x))}{\log(C_2 |(T^n)'(x)|^{-1})}. \]
We obtain that $\overline{d}_\mu(x) \le \delta_\mu(x)$ and hence $d_\mu(x) = \delta_\mu(x)$.

Now suppose that the pointwise dimension $d_\mu(x)$ exists at a point $x \in I \setminus O$. Equation (4) immediately implies that
\[ \frac{\log \mu(I_n(x))}{\log |(T^n)'(x)|^{-1}} \ge \frac{\log \mu(B(x, C_2 |(T^n)'(x)|^{-1}))}{\log(|(T^n)'(x)|^{-1})}, \]
and we obtain that $\underline{\delta}_\mu(x) \ge d_\mu(x)$. Finally, choose an increasing sequence of positive integers $\{n_k\}$ such that
\[ \lim_{n_k \to \infty} \frac{\log \mu(I_{n_k}(x))}{\log(C_1 |(T^{n_k})'(x)|^{-1})} = \overline{\delta}_\mu(x), \]
and for each $n_k$ choose some $r_k > 0$ such that $C_1 |(T^{n_k})'(x)|^{-1} < r_k \le C_1 |(T^{n_k - 1})'(x)|^{-1}$. From (5) and (6) we have that
\[ \frac{\log \mu(B(x, r_k))}{\log r_k} \ge \frac{\log \mu(I_{n_k - 1}(x))}{\log \mu(I_{n_k}(x))} \cdot \frac{\log \mu(I_{n_k}(x))}{\log(C_1 |(T^{n_k})'(x)|^{-1})}. \]
Again, under the assumption that $\overline{\phi}(x)$ exists we showed that
\[ \lim_{n \to \infty} \frac{\log \mu(I_{n_k - 1}(x))}{\log \mu(I_{n_k}(x))} = 1, \]
and thus we obtain that $d_\mu(x) \ge \overline{\delta}_\mu(x)$. We conclude that $\delta_\mu(x) = d_\mu(x)$, completing the proof of (2). Finally, as above, we obtain the same estimate if we assume that $\psi$ is uniformly bounded away from zero on $I$. Part (3) now easily follows from parts (1) and (2).


For the final part, let $\delta_\mu(x) = \gamma$. It suffices to show that for $T = T_1$ or $T = T_2$,
\[ \frac{\log |T'(T^n x)|}{\sum_{i=0}^{n-1} \log \psi(T^i x)} \to 0 \quad \text{as } n \to \infty, \]
which, using $\delta_\mu(x) = \gamma$, easily implies that
\[ \frac{\log \psi(T^n x)}{\sum_{i=0}^{n-1} \log \psi(T^i x)} \to 0 \quad \text{as } n \to \infty, \]
from which the conclusion easily follows. We shall concentrate on the case of the continued fraction transformation $T_1$; the case of $T_2$ is similar. Fix $\epsilon > 0$. If $\inf_{x \in I} |\log \psi(x)| = \delta > 0$, then choose $n_0 \in \mathbb{N}$ such that $\log(n_0 + 1)^2/(\delta n_0) \le \epsilon$. If $1/(k+1) \le T_1^n(x) < 1/k$ (where $k \in \mathbb{N}$), then $k^2 \le |T_1'(T_1^n x)| < (k+1)^2$. If $k \ge n_0$, then provided $n \ge n_0$, we can estimate
\[ \frac{\log |T'(T^n x)|}{\sum_{i=0}^{n-1} \log \psi(T^i x)} \le \frac{\log(n_0 + 1)^2}{\delta n_0} \le \epsilon. \]
If $k < n_0$ then we can still bound
\[ \frac{\log |T'(T^n x)|}{\sum_{i=0}^{n-1} \log \psi(T^i x)} \le \frac{\log(n_0 + 1)^2}{\delta n} \le \epsilon, \]
provided $n$ is sufficiently large. $\square$

The essential feature of the proof of (4) above is the need for polynomial bounds on the derivative of the transformation. The polynomial bounds on the derivative for $T_2$ may be found in [I,Pr]. We define the two parameter family of functions $\phi_{q,t} = -t \log |T'| + q \log \psi$ in $W$. Define the function $t(q)$ by requiring that $P(\phi_{q,t(q)}) = 0$ and let $\mu_q$ be the equilibrium state for $\phi_{q,t(q)}$.

Definition. We say that a triple $(T, \phi, \mu)$ satisfies a multifractal analysis$^1$ if

(1) The Markov pointwise dimension $\delta_\mu(x)$ exists for $\mu$-almost every $x \in I$. Moreover, $\delta_\mu(x) = \delta_\mu \equiv h_\mu(T)/\int_I \log |T'| \, d\mu$ for $\mu$-almost every $x \in I$.

(2) The function $t(q)$ is the Legendre transform of the dimension spectrum, i.e., we have that $f_\mu(\alpha(q)) = t(q) + q\alpha(q)$, where
\[ \alpha(q) = -t'(q) = -\int_I \log \psi \, d\mu_q \Big/ \int_I \log |T'| \, d\mu_q. \]
In particular, $t(q)$ is smooth and strictly convex on some interval $(q_{\min}, q_{\max})$.

$^1$ This multifractal analysis should properly be called a Markov multifractal analysis although this is not standard terminology. The term multifractal analysis should refer to an analysis effected using the pointwise dimension. However, in order not to introduce non-standard terminology, we will refer to this analysis as a multifractal analysis.

Fig. 5. Multifractal analysis for the continued fraction transformation (panels: $T(q)$ against $q$, and $f(\alpha)$ against $\alpha$)

An immediate consequence is that if $(T, \phi, \mu)$ satisfies a multifractal analysis, then the dimension spectrum $f_\mu(\alpha)$ is smooth and strictly convex on an interval, and hence the Markov pointwise dimension $\delta_\mu(x)$ attains the interval of values $(\alpha(q_{\max}), \alpha(q_{\min}))$, where each value is attained on an uncountable dense set which supports an equilibrium state. We note that this is only a partial multifractal analysis in the sense of [PW1,PW2], in that a complete multifractal analysis also establishes analogous results for the Rényi spectrum of dimensions for $\mu$ and then establishes a Legendre transform relation between the dimension spectrum and the Rényi spectrum. Here we only extend those aspects of the theory which we use in our applications. One technical complication in extending the entire theory is that the equilibrium state $\mu$ may not be included in the family of equilibrium states $\mu_q$, where in the usual theory $\mu_1 = \mu$. In other words, the interval $(\alpha(q_{\max}), \alpha(q_{\min}))$ may not contain $d_\mu$. This happens for the continued fraction transformation $T_1$ with potential $\phi = -s \log |T_1'|$, $s > \frac{1}{2}$ (see Corollary 5). Henceforth for $T_1$ we shall only consider potentials $\phi$ which are elements of $W(T_1)$, and for $T_2$ we shall only consider potentials $\phi$ which are elements of $W(\widehat{T}_2)$. For these classes of potentials we establish a multifractal analysis for the pointwise dimension of the associated equilibrium state. For applications to Diophantine approximation, we only require a multifractal analysis for the very special class of potentials of the form $-s \log |T'|$.

Theorem 1. A multifractal analysis holds for the continued fraction transformation $T_1$ in the range of $q$ such that $t(q) > \frac{1}{2}$ (see Fig. 5).

Theorem 2. A multifractal analysis holds for the Manneville–Pomeau transformation $\widehat{T}_2$ for $1/\alpha < q < 1$ (see Fig. 6).

Fig. 6. Multifractal analysis for the Manneville–Pomeau transformation (panels: $T(q)$ against $q$, and $f(\alpha)$ against $\alpha$)

If the pointwise dimension (or Lyapunov exponent) exists at a point $x$ for an EMR transformation $T$, then the same limit exists for the induced map $\widehat{T}$. In particular, the estimates for $\widehat{T}$ provide a lower bound on the dimensions of the set of values with the same limit for $T$. However, the converse need not necessarily be true. It is plausible that there exist uncountably many points (comprising a set of positive Hausdorff dimension) for which the limit defining the pointwise dimension or Lyapunov exponent for $T$ does not exist, but the subsequential limit, which corresponds to the pointwise dimension or Lyapunov exponent for $\widehat{T}$, does exist. The following is an immediate consequence of Proposition 3 and Theorem 1, and allows us to relate the Lyapunov spectrum to the Markov dimension spectrum for a special class of equilibrium states (see [We]).

Corollary 1. Let $T : I \to I$ be an EMR transformation. If $\phi(x) = -s \log |T'|$, then $\lambda(x) = P(-s \log |T'|)/(\delta_{\mu_s}(x) - s)$, where $\mu_s$ is the equilibrium state for $-s \log |T'|$. In the case $s = 0$ we obtain that except on a countable set, $\lambda(x) = h_{TOP}(T)/\delta_{\mu_{MAX}}(x)$, where $h_{TOP}(T)$ denotes the topological entropy of the map $T$ and $\mu_{MAX}$ denotes the measure of maximal entropy$^2$. Since countable sets have zero Hausdorff dimension, we have that
\[ f_{\mu_s}(\alpha) = g\left(\frac{P(-s \log |T'|)}{\alpha - s}\right). \]
The Lyapunov spectrum $g(\beta)$ is smooth and strictly convex on an interval.

$^2$ We remind the reader that $h_{TOP}(T_1) = \infty$ and thus care must be taken in the allowable range of $s$.

We will see in Sect. 4 that the Lyapunov exponent for the continued fraction transformation measures the precise exponential rate of rational approximation for the continued fraction algorithm.

4. Continued Fractions, Diophantine Approximation, and Cuspital Excursions on the Modular Surface

For a wealth of classical results about continued fractions we recommend the superb books [C], [HW] and [K]. The books [B,CFS] contain an excellent introduction to the dynamics of the continued fraction transformation and the connection with Diophantine approximation. Every irrational number $0 < x < 1$ has a continued fraction expansion of the form
\[ x = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}} = [a_1, a_2, a_3, \dots], \]

where $a_1, a_2, \dots$ are positive integers. For every positive integer $n$ define the $n$-th approximant $p_n/q_n$ to be the rational number
\[ \frac{p_n}{q_n} = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}. \]

There is a simple recursive relationship between $p_n$, $q_n$ and $a_n$:
\[ \begin{aligned} p_{-1} &= 1, & p_0 &= 0, & p_n &= a_n p_{n-1} + p_{n-2}, & n &= 1, 2, \dots, \\ q_{-1} &= 0, & q_0 &= 1, & q_n &= a_n q_{n-1} + q_{n-2}, & n &= 1, 2, \dots. \end{aligned} \tag{7} \]
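The recursion (7) translates directly into a short program. The following is a minimal Python sketch, offered only as an illustration: the function names `convergents` and `finite_cf` are hypothetical, and the digit list `[3, 7, 16]` is an arbitrary example chosen here rather than taken from the text. It computes the pairs $(p_n, q_n)$ from (7) and compares them with the value of the truncated continued fraction $[a_1, \dots, a_n]$ evaluated directly.

```python
from fractions import Fraction


def convergents(digits):
    """Return [(p_1, q_1), ..., (p_N, q_N)] computed with recursion (7)."""
    p_prev, q_prev = 1, 0  # p_{-1}, q_{-1}
    p, q = 0, 1            # p_0, q_0
    out = []
    for a in digits:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out


def finite_cf(digits):
    """Exact value of [a1, ..., aN], evaluated from the innermost term outwards."""
    value = Fraction(0)
    for a in reversed(digits):
        value = 1 / (a + value)
    return value


digits = [3, 7, 16]  # arbitrary illustrative digits
pairs = convergents(digits)

# Each pair (p_n, q_n) should reproduce the truncation [a1, ..., an].
matches = [
    Fraction(p, q) == finite_cf(digits[: k + 1])
    for k, (p, q) in enumerate(pairs)
]
```

For these digits the recursion produces $1/3$, $7/22$ and $113/355$, and every entry of `matches` is true.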

The continued fraction transformation can be considered as a simple algorithm for associating to irrational numbers $0 < x < 1$ a sequence of rational numbers $p_n/q_n$. It is well known that the approximants $p_n/q_n$ satisfy
\[ \frac{1}{2 q_{n+1}^2} \le \frac{1}{q_n (q_n + q_{n+1})} < \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n (a_{n+1} q_n + q_{n-1})} = \frac{1}{q_n q_{n+1}} \le \frac{1}{q_n^2}. \tag{8} \]
There is an intimate connection between the numbers $a_1, a_2, \dots$ and the continued fraction transformation $T_1$. Given $0 < x < 1$ we can write
\[ x = \cfrac{1}{\frac{1}{x}} = \cfrac{1}{\left[\frac{1}{x}\right] + \left\{\frac{1}{x}\right\}} = \cfrac{1}{a_1 + T_1 x} = \cfrac{1}{a_1 + \cfrac{1}{\frac{1}{T_1 x}}} = \cfrac{1}{a_1 + \cfrac{1}{\left[\frac{1}{T_1 x}\right] + \left\{\frac{1}{T_1 x}\right\}}} = \cfrac{1}{a_1 + \cfrac{1}{a_2 + T_1^2 x}} = \cdots. \]
Thus $a_1 = [1/x]$, $a_2 = [1/T_1 x]$, $\dots$, $a_k = [1/T_1^{k-1} x]$. Alternatively, if $x = [a_1, a_2, \dots]$, then $T_1^n(x) = [a_{n+1}, a_{n+2}, a_{n+3}, \dots]$. From this relation one immediately sees a close connection between the distribution of the values of $a_k$ and the ergodic properties of the map $T_1$. It also easily follows from the recursion that $p_n(x) = q_{n-1}(T_1 x)$. To see this, note that
\[ \frac{p_n(x)}{q_n(x)} = [a_1, \dots, a_n] = \frac{1}{a_1 + [a_2, \dots, a_n]} = \frac{1}{a_1 + p_{n-1}(T_1 x)/q_{n-1}(T_1 x)} = \frac{q_{n-1}(T_1 x)}{a_1 q_{n-1}(T_1 x) + p_{n-1}(T_1 x)}. \tag{9} \]


Since all fractions are irreducible, the result immediately follows. There is an absolutely continuous $T_1$-invariant probability measure $\mu_G$ on $[0, 1]$, usually called the Gauss measure, defined by
\[ \mu_G(B) = \frac{1}{\log 2} \int_B \frac{1}{1 + x} \, dx, \]
where $B$ is a Borel subset of $I$. Clearly
\[ \frac{1}{2 \log 2} \, \ell(B) \le \mu_G(B) \le \frac{1}{\log 2} \, \ell(B), \]
where $\ell$ denotes Lebesgue measure. The map $T_1$ is ergodic with respect to $\mu_G$ [K]. For $x \in I \setminus O$ the set $I_n(x)$ consists of all points $0 \le y \le 1$ whose $n$th approximant is the same as for $x$. Dynamically this is equivalent to $T^k x, T^k y \in [1/(a_k(x) + 1), 1/a_k(x)]$ for $1 \le k \le n$. An easy calculation shows that $\ell(I_n(x)) = 1/(q_n(x)(q_n(x) + q_{n-1}(x)))$. Applying (4) and (8) we obtain
\[ \lambda(x) = -\lim_{n \to \infty} \frac{1}{n} \log \ell(I_n(x)) = 2 \lim_{n \to \infty} \frac{1}{n} \log q_n(x) \equiv 2 q(x), \tag{10} \]

when the limits exist, where $q(x)$ denotes the exponential growth rate of the sequence $\{q_n(x)\}$. Moreover, when one or the other limit exists, the other limit must also exist. An immediate consequence of (8) is that
\[ \lambda(x) = -\lim_{n \to \infty} \frac{1}{n} \log \left| x - \frac{p_n(x)}{q_n(x)} \right| \tag{11} \]
when one or the other limit exists. It also follows from the simple estimate on the density of the Gauss measure $\mu_G$ that
\[ \lambda(x) = -\lim_{n \to \infty} \frac{1}{n} \log \mu_G(I_n(x)) = 2 q(x) = -\lim_{n \to \infty} \frac{1}{n} \log \left| x - \frac{p_n(x)}{q_n(x)} \right| \]
when the limit exists. Again, when one or the other limit exists the other limit must also exist. Since $T_1$ is ergodic with respect to $\mu_G$, it follows from the Birkhoff ergodic theorem applied to the function $\log |T_1'(x)|$ and a simple calculation [B] that for $\mu_G$-almost all $x \in I$,
\[ \lambda(x) = \lambda_0 \equiv \frac{\pi^2}{6 \log 2} = 2.37314\cdots \quad \text{and} \quad q(x) = q_0 \equiv \frac{\pi^2}{12 \log 2} = 1.18657\cdots. \tag{12} \]
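The constants in (12) can be checked numerically. The sketch below is an illustration only, not the calculation of [B]: it evaluates the closed forms and, separately, approximates the space average $\int_I \log |T_1'| \, d\mu_G$ with a plain midpoint rule, using $|T_1'(x)| = x^{-2}$ and the Gauss density $1/((1+x)\log 2)$. The function name `gauss_average_log_derivative` and the number of cells are assumptions made for this example.

```python
import math

# Closed forms from (12).
lambda_0 = math.pi ** 2 / (6 * math.log(2))
q_0 = lambda_0 / 2


def gauss_average_log_derivative(cells: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of log|T1'| against the Gauss measure."""
    h = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * h
        log_derivative = -2.0 * math.log(x)          # log|T1'(x)| = -2 log x
        density = 1.0 / ((1.0 + x) * math.log(2))    # Gauss density
        total += log_derivative * density
    return total * h


numeric_lambda_0 = gauss_average_log_derivative()
```

The midpoint rule tolerates the integrable logarithmic singularity at $0$, so `numeric_lambda_0` should agree with `lambda_0` to several decimal places. A finer grid or an adaptive quadrature routine would be the natural refinement if more digits were needed.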

Thus the Lyapunov exponent of the continued fraction transformation measures both the precise exponential speed of approximation of a number by its approximants and the exponential growth rate of the sequence $\{q_n(x)\}$. For $\mu_G$-almost all $x \in I$, $|x - p_n(x)/q_n(x)| \asymp \exp(-n\pi^2/6 \log 2)$ and $q_n(x) \asymp \exp(n\pi^2/12 \log 2)$. Our aim now is to understand all the possible values which the Lyapunov exponent attains on the exceptional set of zero measure, as well as the distribution and structure of the sets of points where the exceptional values are realized. By studying periodic points of $T_1$ one can easily find points $x$ such that $\lambda(x) \ne \lambda_0$. The fixed points of $T_1$ correspond to numbers of the form $x = [a, a, a, \dots]$. The Lyapunov exponent at a fixed point $x$ is $-2 \log x$, and since fixed points exist arbitrarily close to 0 ($[2a, 2a, 2a, \dots] = \sqrt{a^2 + 1} - a$), it follows that $\lambda(x)$ attains arbitrarily large values. On the other hand, it is easy to see that the minimum value that $\lambda(x)$ can attain is $2 \log \gamma = 0.962424\cdots$, where $\gamma = (1 + \sqrt{5})/2$ is the Golden Mean. This follows from the fundamental recursion (7) since $q_n = a_n q_{n-1} + q_{n-2} \ge q_{n-1} + q_{n-2}$, and thus $q_n(x) \ge c\gamma^n$ for all $x$ and all $n$, where $c$ is a constant which is determined by initial conditions. This value of $\lambda(x)$ is attained, for example, at the fixed point $x = \gamma - 1 = [1, 1, 1, \dots]$. This value of $\lambda(x)$ is also attained at any number $x$ whose continued fraction expansion consists of all 1's from some point on, i.e., $x = [a_1, \dots, a_n, 1, 1, 1, \dots]$. Such numbers are sometimes called noble numbers. More generally, this value will be realized by precisely those numbers whose continued fraction expansions have a proportion of 1's which increases to 100 percent. The set of such numbers is dense and uncountable. We know that $\lambda(x)$ can realize the values $2 \log \gamma$ and $\lambda_0$. In the case of $2 \log \gamma$ we can find a periodic orbit $x_\infty$ such that $\lambda(x_\infty) = 2 \log \gamma$. We claim that the Lyapunov exponent for $T_1$ possesses an intermediate value property: any intermediate value can also be realized as a Lyapunov exponent.

Lemma 1. Given any value $2 \log \gamma < \xi < \lambda_0$, there exists a point $x \in I$ such that $\lambda(x) = \xi$.

Proof. Fix any value $2 \log \gamma < \xi < \lambda_0$. We first note that since the $T_1$-invariant measures are convex, then
\[ [2 \log \gamma, \lambda_0] \cap \left\{ \int \log |T_1'| \, d\mu : \mu \text{ is } T_1\text{-invariant} \right\} = [2 \log \gamma, \lambda_0]. \]
Moreover, since the periodic point measures are weak star dense in the $T_1$-invariant measures, we can choose a sequence of periodic orbits $T_1^{N_n} x_n = x_n$, $n \ge 1$, such that the associated Lyapunov exponent $\lambda(x_n)$ satisfies $|\lambda(x_n) - \xi| < 1/n$. We can write each periodic orbit in terms of its continued fraction expansion, i.e. $x_n = [a_0(x_n), a_1(x_n), \dots, a_{N_n - 1}(x_n)]$. We can choose an increasing sequence $n_k$ inductively such that $(1/n_k) \sum_{i=1}^{k-1} n_i N_i \to 0$. Finally, we define the point $x \in I$ having the continued fraction expansion
\[ x = [\underbrace{a_0(x_1), \dots, a_{N_1 - 1}(x_1)}_{n_1 \text{ times}}, \underbrace{a_0(x_2), \dots, a_{N_2 - 1}(x_2)}_{n_2 \text{ times}}, \dots], \]
i.e., we concatenate the repeated block in the continued fraction expansion of $x_1$ ($n_1$ times), followed by the repeated block in the continued fraction expansion of $x_2$ ($n_2$ times), etc. By construction we have that $\lambda(x) = \xi$. $\square$

To study the distribution of values of $\lambda(x)$ more precisely, we define the ($T_1$-invariant) level sets of the Lyapunov exponent $\Lambda_\alpha = \{x \in I : \lambda(x) = \alpha\}$. These sets (along with the set on which $\lambda(x)$ does not exist) provide a decomposition of the interval. From (10) and (11) we see that for $x \in \Lambda_\alpha$, $|x - p_n(x)/q_n(x)| \asymp \exp(-n\alpha)$ and $q_n(x) \asymp \exp(n\alpha/2)$. The following proposition on the distribution of values of $\lambda$ and the precise Hausdorff dimension of the level sets are easy consequences of Lemma 1, Theorem 1, and Proposition 3 applied to the function $\phi = -s \log |T_1'|$ for $s > 1/2$:


Proposition 4. Let $T_1 : I \to I$ be the continued fraction transformation.

(1) The Lyapunov exponent $\lambda(x)$ attains the interval of values $[2 \log \gamma, \infty) = [2 \log \gamma, \lambda_0) \cup [\lambda_0, \infty)$.

(2) For $\alpha \in [\lambda_0, \infty)$ the value $\alpha = \int_I \log |T_1'| \, d\mu_s$ is attained by the Lyapunov exponent on a set of (positive) Hausdorff dimension $h_{\mu_s}(T_1)/\int_I \log |T_1'| \, d\mu_s = h_{\mu_s}(T_1)/\alpha$. This level set is uncountable and also dense in $I$.

Proof. Lemma 1 implies that $\lambda(x)$ attains the interval of values $[2 \log \gamma, \lambda_0]$. Consider the family of potentials $\phi_s = -s \log |T_1'|$, $s > 1/2$, and let $\mu_s$ be the corresponding family of equilibrium states. To prove Proposition 4, we shall only require (1) in our definition of multifractal analysis, that
\[ \delta_{\mu_s}(x) = \delta_s \equiv \frac{h_{\mu_s}(T_1)}{\int_I \log |T_1'| \, d\mu_s} = s + \frac{P(-s \log |T_1'|)}{\int_I \log |T_1'| \, d\mu_s} > 0 \]
for $\mu_s$-almost all $x \in I$. Let $K_{\delta_s}^M = \{x \in I : \delta_{\mu_s}(x) = \delta_s\}$. From Proposition 3 we see that $\overline{d}_{\mu_s}(x) \le \delta_s$ for all $x \in K_{\delta_s}^M$ and $\underline{d}_{\mu_s}(x) \ge \delta_s$ for $\mu_s$-almost all $x \in K_{\delta_s}^M$. By standard arguments in dimension theory [PW1, pp. 253–254] it follows that $\dim_H K_{\delta_s}^M = \delta_s$. Proposition 1 and the Variational Principle immediately imply that if $\delta_{\mu_s}(x) = \delta_s$ then $\lambda(x) = \int_I \log |T_1'| \, d\mu_s$, and thus $\lambda(x) = \int_I \log |T_1'| \, d\mu_s$ for $\mu_s$-almost all $x \in I$. Thus
\[ \dim_H \Lambda_{\int_I \log |T_1'| \, d\mu_s} = \delta_s > 0. \]
Since $\mu_s(K_{\delta_s}^M) = 1$ and $\mu_s(\Lambda_{\int_I \log |T_1'| \, d\mu_s}) = 1$, and equilibrium states are positive on open sets, we have that each of the sets $K_{\delta_s}^M$ and $\Lambda_{\int_I \log |T_1'| \, d\mu_s}$ is dense in $I$. Recall that $\mu_1 = \mu_G$ and thus $\int_I \log |T_1'| \, d\mu_1 = \lambda_0$. It immediately follows from the first derivative formula for pressure [R] that
\[ \int_I \log |T_1'| \, d\mu_s = -\frac{d}{ds} P(-s \log |T_1'|). \]
Since $\lim_{s \searrow 1/2} P(-s \log |T_1'|) = \infty$ and $P(-s \log |T_1'|)$ is smooth and strictly convex on $(1/2, \infty)$, it follows that $\lim_{s \searrow 1/2} (d/ds) P(-s \log |T_1'|) = -\infty$. Thus
\[ \lim_{s \searrow 1/2} \int_I \log |T_1'| \, d\mu_s = \infty. \]
The map $s \mapsto \int_I \log |T_1'| \, d\mu_s$ is smooth on $(1/2, \infty)$, by analytic perturbation theory since $\mu_s$ corresponds to an isolated maximal eigenvalue for a transfer operator [M1], and this implies that the Lyapunov exponent $\lambda$ attains all values between $\lambda_0$ and $\infty$, each on an uncountable dense set of positive Hausdorff dimension. $\square$

Proposition 4 immediately implies the following two number theoretic corollaries.

Corollary 2. The asymptotic quantity $\lim_{n \to \infty} (1/n) \log |x - p_n(x)/q_n(x)|$ attains the interval of values $[\lambda_0, \infty)$ and each value in this interval is attained on an uncountable dense set of positive Hausdorff dimension. There is an explicit formula for the Hausdorff dimension of each level set in $[\lambda_0, \infty)$: for $\alpha = \int_I \log |T_1'| \, d\mu_s$,
\[ \dim_H \left\{ x \in [0, 1] : \lim_{n \to \infty} \frac{1}{n} \log \left| x - \frac{p_n(x)}{q_n(x)} \right| = \alpha \right\} = h_{\mu_s}(T_1)/\alpha. \]
It easily follows from this formula that the Hausdorff dimension of the level sets varies smoothly.

Corollary 3. The asymptotic quantity $\lim_{n \to \infty} (1/n) \log q_n(x)$ attains the interval of values $[\frac{1}{2}\lambda_0, \infty)$ and each value in this interval is attained on an uncountable dense set of positive Hausdorff dimension. There is an explicit formula for the Hausdorff dimension of each level set, which in particular shows that the Hausdorff dimension of the level sets varies smoothly in $\alpha$.

We have seen that $p_n(x) = q_{n-1}(T_1 x)$ and this easily implies that
\[ p(x) \equiv \lim_{n \to \infty} \frac{1}{n} \log p_n(x) = \lim_{n \to \infty} \frac{1}{n} \log q_{n-1}(T_1 x) = q(T_1 x). \]

It follows from (12) that $p(x) = q_0 = \pi^2/12 \log 2$ for $\mu_G$-almost all $x \in I$. The next corollary, which analyzes the exceptional set, follows immediately.

Corollary 4. The asymptotic quantity $\lim_{n \to \infty} (1/n) \log p_n(x)$ attains the interval of values $[\frac{1}{2}\lambda_0, \infty)$ and each value in this interval is attained on an uncountable dense set of positive Hausdorff dimension. There is an explicit formula for the Hausdorff dimension of each level set, which in particular shows that the Hausdorff dimension of the level sets varies smoothly.

Remark. Consider the function
\[ \tau(x) = -\lim_{n \to \infty} \frac{\log \left| x - \frac{p_n(x)}{q_n(x)} \right|}{\log q_n(x)} = -\lim_{n \to \infty} \frac{\frac{1}{n} \log \left| x - \frac{p_n(x)}{q_n(x)} \right|}{\frac{1}{n} \log q_n(x)}, \]
when the limits exist. It immediately follows from (10) and (11) that for $\mu_G$-almost all $x$ the function $\tau(x) = \tau_0 \equiv \lambda_0/q_0 = 2$. More precisely, if $q(x) = \lim_{n \to \infty} (1/n) \log q_n(x)$ exists at a point $x \in I \setminus O$ then $\tau(x) = 2$.

We now consider the set of points for which $\lambda(x)$ does not exist. An example is the Liouville number $x = \sum_{k=1}^{\infty} 10^{-k!}$. It is easy to see that $|x - p_n(x)/q_n(x)| = \sum_{k=n+1}^{\infty} 10^{-k!}$ and thus $10^{-(n+1)!} < |x - p_n(x)/q_n(x)| < 2 \times 10^{-(n+1)!}$. It follows that $\lambda(x) = \infty$. It is also easy to construct numbers for which the limit defining $\lambda(x)$ does not exist and is not infinite. The construction uses the trick in Lemma 2. Consider the number $x$ with continued fraction expansion
\[ x = [\underbrace{1, \dots, 1}_{n_1 \text{ times}}, \underbrace{2, \dots, 2}_{m_1 \text{ times}}, \underbrace{1, \dots, 1}_{n_2 \text{ times}}, \underbrace{2, \dots, 2}_{m_2 \text{ times}}, \dots], \]
with each $n_i$ and $m_i$ being much larger than the sum of all the preceding choices. A routine argument shows that for suitable choices of $n_i$ and $m_i$, the Lyapunov exponent $\lambda(x)$ does not exist. A straightforward extension of an argument by Shereshevsky [Sh] gives that the set of points for which $\lambda(x)$ does not exist has positive Hausdorff dimension. A natural problem is to compute the precise Hausdorff dimension. In [BaS] the authors show that for a conformal expanding map (they assume that their map is a local homeomorphism), the Hausdorff dimension of the set of points where the Lyapunov exponent does not exist is maximal, i.e., the same dimension as the limit set (repeller). The essential hypothesis for their result is a smooth map which possesses a sequence $\mu_k$ of ergodic invariant measures such that $\lim_{k \to \infty} \dim_H(\mu_k) = \dim_H(J)$, where $J$ is the limit set and $\dim_H(\mu) \equiv \inf\{\dim_H(U) : \mu(U) = 1\}$ [Pe, p. 42] is the Hausdorff dimension of the measure $\mu$. As will be noted after the proof of Theorems 1 and 2, the existence of such sequences of measures for the maps $T_1$ and $T_2$ follows immediately from the proof of Theorems 1 and 2. Thus a straightforward extension of the proof of Barreira and Schmeling proves the following result.

Theorem 3. The set of points $x \in I$ for which $\lambda(x)$ does not exist has Hausdorff dimension equal to 1.

There are classical results on Hausdorff dimension and Diophantine approximation, due to Jarnik, to which our results can be viewed as complementary. Recall that the continued fraction approximants $p_n/q_n$ of a number $x$ all satisfy the Diophantine condition
\[ \left| x - \frac{p}{q} \right| \le \frac{1}{q^2}. \]
Let us now consider the set of numbers which admit a faster approximation by rational numbers. For $\tau > 2$ let $F_\tau$ denote the set of $\tau$-well approximable numbers, i.e., those that satisfy
\[ F_\tau = \left\{ x \in I : \left| x - \frac{p}{q} \right| \le \frac{1}{q^\tau} \text{ infinitely often} \right\}. \]
Here infinitely often means that there are infinitely many distinct rational $p/q$ which satisfy the relation.
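The Diophantine condition $|x - p/q| \le 1/q^2$ for approximants can be checked on a concrete example. The following Python sketch is illustrative only and rests on stated assumptions: the irrational $x$ is replaced by a rational stand-in with 30 arbitrarily chosen digits, so the final approximant equals $x$ exactly and is excluded from the exponent computation, and the helper names `convergents` and `finite_cf` are hypothetical. For each approximant it also records the exponent $\tau_n$ defined by $|x - p_n/q_n| = q_n^{-\tau_n}$, which is the quantity compared with $\tau$ in the definition of $F_\tau$.

```python
import math
from fractions import Fraction


def convergents(digits):
    """Return [(p_1, q_1), ..., (p_N, q_N)] using p_n = a_n p_{n-1} + p_{n-2}, and likewise for q_n."""
    p_prev, q_prev = 1, 0  # p_{-1}, q_{-1}
    p, q = 0, 1            # p_0, q_0
    out = []
    for a in digits:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out


def finite_cf(digits):
    """Exact value of the finite continued fraction [a1, ..., aN]."""
    value = Fraction(0)
    for a in reversed(digits):
        value = 1 / (a + value)
    return value


digits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] * 3  # arbitrary illustrative digits
x = finite_cf(digits)
pairs = convergents(digits)

# |x - p/q| <= 1/q^2 for every approximant, checked in exact arithmetic.
satisfies_condition = [
    abs(x - Fraction(p, q)) <= Fraction(1, q * q) for p, q in pairs
]

# Exponent tau_n with |x - p_n/q_n| = q_n ** (-tau_n); skip q_n = 1 and the exact last approximant.
exponents = [
    -math.log(float(abs(x - Fraction(p, q)))) / math.log(q)
    for p, q in pairs[:-1]
    if q > 1
]
```

Every entry of `satisfies_condition` is true and every recorded exponent exceeds 2, which is consistent with the condition above. Membership in $F_\tau$ for $\tau > 2$ asks for the stronger bound $\tau_n \ge \tau$ along infinitely many approximants, which a finite stand-in can only suggest and not decide.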
Legendre showed that if $p/q$ is any rational approximation to an irrational number $x$ satisfying $|x - p/q| \le 1/2q^2$, then $p/q$ must be an approximant for $x$. Thus the rationals $p/q$ in $F_\tau$ are all approximants. It is easy to show that this set has zero measure for each $\tau > 2$. Jarnik [J] computed the Hausdorff dimension of $F_\tau$ and showed that $\dim_H(F_\tau) = 2/\tau$. While Jarnik explicitly computes the Hausdorff dimension of the set of numbers $x$ such that the approximation error $|x - p/q|$ is bounded above by $q^{-\tau}$ for infinitely many approximants $p_n/q_n$, Corollary 2 is a statement about the Hausdorff dimension of sets of numbers $x$ such that the error $|x - p_n/q_n|$ admits precise asymptotic limiting behavior. Corollary 2 also quantifies the precise speed of convergence of the continued fraction algorithm on exceptional sets. Although our results are related, they do not seem to be obtainable from each other.

Application to Geodesics on the Modular Surface. The map $T_1$ is also closely related to the symbolic description of the geodesic flow $\phi_t : PSL(2, \mathbb{R})/PSL(2, \mathbb{Z}) \to PSL(2, \mathbb{R})/PSL(2, \mathbb{Z})$ defined on the unit tangent bundle of the modular surface $M = \mathbb{H}^2/PSL(2, \mathbb{Z})$ by $\phi_t(g) PSL(2, \mathbb{Z}) = g g_t PSL(2, \mathbb{Z})$ with $g_t = \mathrm{diag}(\exp(t), \exp(-t))$ and $\mathbb{H}^2 = \{x + iy : y > 0\}$. There is an interesting connection between Corollary 1 and such geodesic flows. The Lyapunov exponent quantifies the proportion of time that a geodesic spends on cuspital excursions [St,Su]. Given a geodesic $\gamma$ on the modular surface we can consider a lift $\hat{\gamma}$ to the Poincaré upper half-plane $\mathbb{H}^2$. Such geodesics on $\mathbb{H}^2$ correspond to Euclidean semi-circles which meet the real line perpendicularly. In particular, we can choose our lift so that the endpoints satisfy $1 < x \equiv \hat{\gamma}(\infty) < \infty$ and $-1 < y \equiv \hat{\gamma}(-\infty) < 0$. If we consider the continued fraction expansion $x = [a_0(x), a_1(x), a_2(x), \dots]$, then we see that the Euclidean height of the arc is equal to $(x - y)/2$ and lies between $a_0/2$ and $a_0/2 + 1$. In particular, the hyperbolic distance of the cuspital excursion can be estimated by $\log a_0(x)$ (see Fig. 7).

Fig. 7. Continued fractions and geodesic excursions

Fix $p \in M$. Given a unit tangent vector $v \in T_1 M = PSL(2, \mathbb{R})/PSL(2, \mathbb{Z})$ at $p$, we let $\gamma : \mathbb{R} \to M$ be the unique unit speed geodesic with $\dot{\gamma}(0) = v$. We denote by $s_n$, $n \ge 1$, the times $s > 0$ at which $\gamma_s(v)$ maximizes $d(\gamma_s(v), p)$ on each successive excursion into the cusp (i.e., $s_1 < s_2 < s_3 < \dots$ are local maxima for $d(\gamma_s(v), p)$), where $d$ denotes the hyperbolic distance on $M$. By the above observations we see that these heights can be estimated with the function $-\log a_n(x)$. Unfortunately, this function does not satisfy condition (W1), so that we need to consider instead functions of the form $\phi(x) \equiv -\beta \log a_n(x)$, for $\beta > 1$. Moreover, $s_n$ can be estimated by $\log |(T_1^n)'(x)|$. We can therefore interpret the quantity
\[ A_n(v) \equiv \frac{\sum_{i=0}^{n-1} \log a_i(x)}{\log |(T_1^n)'(x)|} \sim \frac{\sum_{i=0}^{n-1} d(\gamma_{s_i}(v), p)}{s_n} \]
as an estimate on the average height of the first $n$ geodesic excursions, compared with the time required for the excursions. For $\beta > 1$ let us denote
\[ \Lambda_\alpha \equiv \left\{ v : \lim_{n \to \infty} \frac{1}{s_n} \sum_{i=0}^{n-1} d(\gamma_{s_i}(v), p) = \frac{\alpha}{\beta} \right\} = \left\{ v : \lim_{n \to \infty} A_n(v) = \frac{\alpha}{\beta} \right\}. \]

Theorem 1 implies the following result.


Theorem 4. There exists an interval of values $(\alpha_{\min}, \alpha_{\max})$ such that for $\alpha$ in this interval the set $\Lambda_\alpha$ is an uncountable dense set of positive Hausdorff dimension, and the Hausdorff dimension $\dim_H(\Lambda_\alpha)$ varies smoothly (analytically).

This can be compared with the results of Sullivan [Su], Melián-Pestana [MP2], and Stratmann [St], which are of a somewhat complementary nature. In our notation these authors' results relate to the subsequence $t_m = s_{n_m}$ of successive farthest geodesic excursions (i.e., $d(\gamma_{t_1}(v), p) \le d(\gamma_{t_2}(v), p) \le \dots$). Sullivan [Su] shows that a typical geodesic extends a distance at most log t into the cusp at time t. More precisely, Sullivan shows that for almost every unit tangent vector v at p,

$$\lim_{m\to\infty} \frac{d(\gamma_{t_m}(v), p)}{\log t_m} = 1.$$

Stratmann and Melián-Pestana compute the Hausdorff dimension of the sets

$$\Pi_\alpha = \Big\{ v : \lim_{m\to\infty} d(\gamma_{t_m}(v), p)/t_m = \alpha \Big\}.$$

Remark. As one would imagine, a similar study can be made of the behaviour of geodesics on other surfaces with cusps, generalizing those for the modular surface. In this case, one needs to use the general analysis described in [BoS]. There is also an analogous notion of Diophantine approximation for more general Fuchsian groups [Pa], to which our results would naturally apply.

Remark. We do not yet have a multifractal analysis of Birkhoff sums for $T_1$. With such machinery one could make similar statements about $\sum_{i=0}^{n-1} d(\gamma_{s_i}(v), p)/n$ as we can for $\sum_{i=0}^{n-1} d(\gamma_{s_i}(v), p)/s_n$.

5. Thermodynamic Formalism for Infinite State Systems

The following proposition contains useful formulas for the derivatives of pressure.

Proposition 5. Let $T : I \to I$ be an EMR transformation. Let f, g and h be functions on I such that for sufficiently small $\epsilon_1, \epsilon_2$ the family of functions $f + \epsilon_1 g + \epsilon_2 h$ satisfy (W1), (W2), and $P(f + \epsilon_1 g + \epsilon_2 h) > -\infty$. Then the function $(\epsilon_1, \epsilon_2) \mapsto P(f + \epsilon_1 g + \epsilon_2 h)$ is analytic, convex (in each variable), strictly convex if f is not cohomologous to a constant, and satisfies the following derivative formulas:

$$\frac{d}{d\epsilon_1} P(f + \epsilon_1 g)\Big|_{\epsilon_1 = 0} = \int_I g \, d\mu_f$$

and

$$\frac{\partial^2 P(f + \epsilon_1 g + \epsilon_2 h)}{\partial\epsilon_1 \partial\epsilon_2}\Big|_{\epsilon_1 = \epsilon_2 = 0} \equiv Q_f(g, h),$$

where $Q_f$ is the bilinear form on $C^\alpha(I, \mathbb{R})$ defined by

$$Q_f(g, h) = \sum_{k=0}^{\infty} \Big( \int_I g \cdot (h \circ T^k) \, d\mu_f - \int_I g \, d\mu_f \int_I h \, d\mu_f \Big),$$

and $\mu_f$ is the equilibrium state for f. Also $Q_f(g, g) \ge 0$ for all g and $Q_f(g, g) > 0$ if and only if f is not cohomologous to a constant function.


Proof. The proof is very similar to the proof in the usual setting where the interval map is piecewise smooth and expanding on finitely many intervals [R]. There is an additional potential complication which needs to be addressed in that there may be an infinite sum, rather than just a finite sum, for the transfer operator. This occurs for $T_1$ and $\hat T_2$. This complication is handled with conditions (W1) and (W2), which ensure that the infinite sum is well defined and can be treated in the usual ways. The key to the proof is the study of the transfer operator $L_{\phi_{q,t}} : C^0(I) \to C^0(I)$ given by

$$L_{\phi_{q,t}} k(x) = \sum_{Ty = x} \exp(\phi_{q,t}(y)) \, k(y),$$

whose maximal eigenvalue is $\exp(P(\phi_{q,t}))$. Moreover, to obtain an isolated eigenvalue in the appropriate range, we study $L_{\phi_{q,t}} : BV(I) \to BV(I)$ acting on the space BV of functions of bounded variation. Prellberg [Pr] has shown that for $\hat T_2$, the spectrum of this operator consists of the closed unit ball, plus at most a countable number of isolated eigenvalues of modulus strictly greater than 1. Thus when the quantity $P(\phi_{q,t}) > -\infty$, its exponential is the maximal positive isolated eigenvalue for $L_{\phi_{q,t}}$. The result follows by analytic perturbation theory. The convexity is a direct consequence of the second derivative formula. □

The next proposition is an immediate consequence of Proposition 5 and the implicit function theorem [PW1].

Proposition 6. Let $T : I \to I$ be an EMR transformation. Assume that for a range of values (q, t) the family of functions $\phi_{q,t}$ satisfy (W1), (W2), and $P(\phi_{q,t}) > -\infty$. Then the function $(q, t) \mapsto P(\phi_{q,t})$ is analytic and convex (in each variable). Furthermore, for the one parameter family of potentials $\phi_q = \phi_{q,t(q)}$, where $t = t(q)$ is defined by requiring that $P(\phi_{q,t(q)}) = 0$, we have that
(1) The function t(q) is real analytic and convex. It is strictly convex if $\log\psi$ is not cohomologous to $-\log|T'|$.
(2) The derivative

$$t'(q) = -\frac{\int_I \log\psi \, d\mu_q}{\int_I \log|T'| \, d\mu_q},$$

where $\mu_q$ is the equilibrium state for $\phi_q$.
(3) The second derivative satisfies

$$t''(q) = \frac{t'(q)^2 \, \dfrac{\partial^2 P(\phi_{q,r})}{\partial r^2} - 2 t'(q) \, \dfrac{\partial^2 P(\phi_{q,r})}{\partial q \, \partial r} + \dfrac{\partial^2 P(\phi_{q,r})}{\partial q^2}}{\dfrac{\partial P(\phi_{q,r})}{\partial r}},$$

evaluated at $(q, r) = (q, -t(q))$.

The following useful corollary follows easily from Proposition 2. Here we collect many results which we will require in Sect. 7.

Corollary 5. In the special case $\phi = -s\log|T'|$ we have that the function $\phi_q = -(t(q) + qs)\log|T'| - qP(-s\log|T'|)$ and that
(1) The function t(q) is defined implicitly by $P(-(t(q) + qs)\log|T'|) = qP(-s\log|T'|)$.


(2) The derivative

$$t'(q) = -\frac{h_{\nu_q}(T)}{\int_I \log|T'| \, d\nu_q},$$

where $\nu_q$ is the equilibrium state for $\phi_q$.
(3) We have the special values $t(0) = 1$, $t(1) = 0$, and $t'(0) = -1$.
(4) For the continued fraction transformation $T_1$ the expressions in (1)-(3) are well defined (satisfy W1, W2, and $P \ne -\infty$) provided that $t(q) > 1/2$ and $s > 1/2$.
(5) For the induced Manneville–Pomeau transformation $\hat T_2$ the expressions in (1)-(3) are well defined (satisfy W1, W2, and $P \ne -\infty$) provided that $t(q) < 1$ [Pr].

Example 1. The transfer operator for the continued fraction transformation $T_1$ for the family $\phi_{q,t}$ can be written explicitly as

$$L_{\phi_{q,t}} k(x) = \sum_{n=1}^{\infty} \psi\Big(\frac{1}{x+n}\Big)^{q} \Big(\frac{1}{x+n}\Big)^{2t} k\Big(\frac{1}{x+n}\Big).$$
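The explicit series for the continued fraction map can be checked numerically. The following is only a sketch, assuming a plain truncation of the infinite sum; the names `gauss_transfer` and `gauss_density` and the truncation length `n_terms` are hypothetical choices. For q = 0 and t = 1 the operator reduces to the classical Perron–Frobenius operator of the Gauss map, which fixes the Gauss density 1/(1 + x), and the truncated sum telescopes to 1/(x + 1) - 1/(x + N + 1).

```python
def gauss_transfer(k, x, t=1.0, q=0.0, psi=lambda y: 1.0, n_terms=100000):
    """Truncated version of the series
    sum_{n>=1} psi(1/(x+n))**q * (1/(x+n))**(2t) * k(1/(x+n))."""
    total = 0.0
    for n in range(1, n_terms + 1):
        y = 1.0 / (x + n)  # the n-th inverse branch of the Gauss map at x
        total += psi(y) ** q * y ** (2.0 * t) * k(y)
    return total


def gauss_density(x):
    """Unnormalized Gauss density 1/(1+x)."""
    return 1.0 / (1.0 + x)
```

Evaluating `gauss_transfer(gauss_density, x)` at a few points of [0, 1] should reproduce `gauss_density(x)` up to the truncation error, which is of order `1/n_terms`.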

For the potentials $\phi_{q,t}$ (for $t > 1/2$), conditions (W1) and (W2) apply. In the special case that $\phi$ is real analytic, the operator $L_{\phi_{q,t}}$ preserves the smaller space of analytic functions and is compact. In particular, we can waive the assumption that $P(\phi_{q,t}) > -\infty$.

Example 2. The transfer operator for the induced Manneville–Pomeau transformation $\hat T_2$ for the family $\phi_{q,t}$ can be written explicitly as

$$L_{\phi_{q,t}} k(x) = \sum_{\hat T_2 y = x} \frac{\psi^q(y)}{|\hat T_2'(y)|^t} \, k(y) = \sum_{n=1}^{\infty} \psi(G_n(x))^q \, |G_n'(x)|^t \, k(G_n(x)),$$

where $G_n = F_1 F_0^{n-1}$, and $F_0$, $F_1$ denote the two branches of the inverse of $\hat T_2$. In the case of the Manneville–Pomeau transformation, the induced transformation $\hat T_2 : I \to I$ is a smooth map on each of the intervals $[a_n, a_{n+1}]$ [Pr]. For the potentials $\phi_{q,t}$ (for $t > 1$), conditions (W1) and (W2) apply.

The next result describes the construction of equilibrium states using transfer operators.

Proposition 7. Let $T : I \to I$ be an EMR transformation. Assume that $\phi$ satisfies (W1) and (W2). Then there exists a unique equilibrium state.

Sketch of Proof. The associated transfer operator satisfies the following [Wa1, p. 128]:
(1) There exists $\lambda > 0$ and $h \in C^0(I)$ with $L_{\psi_q} h = \lambda h$;
(2) There exists a T-invariant probability measure $\mu$ such that for $f \in C^0(I)$ we have that $\lambda^{-n} L^n_{\psi_q} f(x) \to h \int f \, d\mu$.
If we denote $g = \exp(\psi) h/(\lambda \, h \circ T)$ we have that $\lambda^{-n} L^n_{\log g} f(x) \to \int f \, d\mu$ [Wa1, p. 128]. Moreover, $dT^*\mu/d\mu = 1/g$ [Wa1, p. 124], where $T^*\mu$ denotes the pull-back of the measure $\mu$ by T. The proof that $\mu$ is the required equilibrium state comes from (2). A useful property of g is that there exists $C_0 > 0$ such that for all $x, y \in I$,

$$\exp(-C_0 d(x, y)) \le \prod_{i=0}^{n-1} \frac{|g(T^i x)|}{|g(T^i y)|} \le \exp(C_0 d(x, y))$$

[Wa1, p. 130]. In particular, this implies that there exists $C > 0$ such that for all $x \in I$,

$$\frac{1}{C} \le \frac{\mu(I_n(x))}{\prod_{i=0}^{n-1} g(T^i x)} \le C.$$

Uniqueness comes by the ergodicity of $\mu$ and the fact that any two solutions are absolutely continuous. □

6. Proofs of Theorems 1 and 2

Consider the transformation $T = T_1$ or $\hat T_2$. Statement (1) in our definition of multifractal analysis easily follows from the Birkhoff ergodic theorem. Given $x \in I \setminus O$, it follows from Proposition 3 that $d_\mu(x) = -\overline{\log\psi}(x)/\lambda(x)$. Since the equilibrium state $\mu$ is an ergodic measure we have that $\lambda(x) = \int_I \log|T'| \, d\mu$ and $\overline{\log\psi}(x) = \int_I \log\psi \, d\mu$ for $\mu$-almost every $x \in I$. Since $P(\log\psi) = 0$, it follows from the Variational Principle that $h_\mu(T) = -\int_I \log\psi \, d\mu$. It follows that $d_\mu(x) = h_\mu(T)/\int_I \log|T'| \, d\mu$ for $\mu$-almost every $x \in I$.

Recall that t(q) is the unique solution to $P(\phi_{q,t(q)}) = P(-t(q)\log|T'| + q\log\psi) = 0$ and $\mu_q$ is the equilibrium state for $\phi_q = \phi_{q,t(q)}$. For each map T and each potential $\phi$, the function t(q) is defined for a certain range $(q_{\min}, q_{\max})$ of q which we have discussed in Sect. 6. By Proposition 6 the function t(q) is smooth and strictly convex on this interval. Using the derivative formula in Proposition 6, we define for each q the function

$$\alpha(q) \equiv -t'(q) = \frac{\int_I \log\psi \, d\mu_q}{-\int_I \log|T'| \, d\mu_q},$$

and consider the level sets $K^M_{\alpha(q)} = \{x : \delta_\mu(x) = \alpha(q)\}$. We remind the reader that

$$K^M_{\alpha(q)} = \Big\{ x \in I \setminus O : \lim_{n\to\infty} \frac{\log\mu(I_n(x))}{\log\ell(I_n(x))} = \lim_{n\to\infty} \frac{\sum_{i=0}^{n-1} \log\psi(T^i x)}{-\sum_{i=0}^{n-1} \log|T'(T^i(x))|} = \frac{\int_I \log\psi \, d\mu_q}{-\int_I \log|T'| \, d\mu_q} \Big\}.$$

An immediate consequence of the Birkhoff ergodic theorem is that $\mu_q(K^M_{\alpha(q)}) = 1$. Proposition 3 immediately implies that for all $x \in K^M_{\alpha(q)}$ the Markov pointwise dimension satisfies

$$\delta_{\mu_q}(x) = -\frac{\overline{\phi_q}(x)}{\lambda(x)} = \frac{t(q)\,\overline{\log|T'|}(x) - q\,\overline{\log\psi}(x)}{\lambda(x)} = t(q) + q\alpha(q).$$

It follows from Proposition 3(2) that the upper pointwise dimension $\overline{d}_{\mu_q}(x) \le \delta_{\mu_q}(x) = t(q) + q\alpha(q)$ for all $x \in K^M_{\alpha(q)}$. Since the Birkhoff average $\overline{\log\psi}(x)$ exists for $\mu_q$-almost every $x \in K^M_{\alpha(q)}$ (and equals $\int_I \log\psi \, d\mu_q$), it follows from Proposition 3(3) that $\underline{d}_{\mu_q}(x) \ge \delta_{\mu_q}(x) = t(q) + q\alpha(q)$ for $\mu_q$-almost every $x \in K^M_{\alpha(q)}$. By standard arguments in dimension theory [PW1, pp. 253–254], this a.e. inequality implies that $\dim_H(K^M_{\alpha(q)}) \ge t(q) + q\alpha(q)$, and these two estimates imply that $\dim_H(K^M_{\alpha(q)}) = t(q) + q\alpha(q)$. The smoothness and convexity properties of t(q) follow from Proposition 6. □

Remark. As mentioned before the statement of Theorem 3, the above proof of Theorems 1 and 2 provides a sequence of ergodic invariant measures for $T_1$ or $\hat T_2$ such that

$$\lim_{k\to\infty} \dim_H(\mu_k) = \dim_H(I) = 1.$$

Since $\delta_{\mu_q}(x) = t(q) + q\alpha(q)$ for $\mu_q$-almost all $x \in K^M_{\alpha(q)}$ and $\mu_q(K^M_{\alpha(q)}) = 1$, a weaker conclusion than we obtained in the theorem above is that for the (ergodic) equilibrium state $\mu_q$ we have that $\dim_H(\mu_q) = t(q) + q\alpha(q)$ [Pe, p. 42]. Since $t(0) = 1$ we have that $\lim_{q\to 0} \dim_H(\mu_q) = 1$. Thus any allowable potential $\phi \in W$ gives rise to such a one parameter family (and hence such a sequence) of measures.

References

[B] Billingsley, P.: Ergodic Theory and Information. Krieger, 1978
[BaS] Barreira, L. and Schmeling, J.: Sets of “non-typical” points have full topological entropy and full Hausdorff dimension. Preprint
[BoS] Bowen, R. and Series, C.: Markov maps associated with Fuchsian groups. Publ. Math. (IHES) 50, 153–170 (1979)
[C] Cassels, J.: An Introduction to Diophantine Approximation. CUP, 1957
[CFS] Cornfeld, I., Fomin, S., Sinai, Ya.: Ergodic theory. Berlin–Heidelberg–New York: Springer-Verlag, 1982
[GH] Gröchenig, K. and Haas, A.: Backwards Continued Fractions, Hecke Groups and Invariant Measures for Transformations of the Interval. Ergodic Theory Dynam. Systems 16, 1241–1274 (1996)
[HW] Hardy, G. and Wright, E.: An Introduction to the Theory of Numbers. Fifth edition, Oxford: Oxford University Press, 1979
[I] Isola, S.: Dynamical Zeta Functions and Correlation Functions for Non-uniformly Hyperbolic Systems. Preprint
[J] Jarnik, V.: Über die simultanen diophantischen Approximationen. Math. Zeit. 33, 505–543 (1931)
[K] Khinchin, A.: Continued fractions. Chicago: University of Chicago Press, 1964
[L] Lopes, A.: The Zeta Function, Nondifferentiability of Pressure, and the Critical Exponent of Transition. Adv. Math. 101 (2), 133–165 (1993)
[M1] Mayer, D.: On the Thermodynamics Formalism for the Gauss Map. Commun. Math. Phys. 130, 311–333 (1990)
[MP1] Pomeau, Y. and Manneville, P.: Intermittent Transition to Turbulence in Dissipative Dynamical Systems. Commun. Math. Phys. 74, 189–197 (1980)
[MP2] Melián, M. and Pestana, D.: Geodesic Excursions into Cusps in Finite-Volume Hyperbolic Manifolds. Michigan Math. J. 40, 77–93 (1993)
[N] Nakaishi, K.: Multifractal Formalism For Some Parabolic Maps. Preprint
[Pa] Patterson, S.: Diophantine approximation in Fuchsian groups. Philos. Trans. Roy. Soc. London, Ser. A 282, 527–563 (1976)
[Pe] Pesin, Y.: Dimension Theory in Dynamical Systems. CUP, 1997
[Pr] Prellberg, T.: Maps of Intervals with Indifferent Fixed Points: Thermodynamic Formalism and Phase Transitions. Va. Polytechnique Institute Theses, 1991
[PS] Prellberg, T. and Slawny, J.: Maps of Intervals with Indifferent Fixed Points: Thermodynamic Formalism and Phase Transitions. J. Stat. Phys. 66, 503–514 (1992)
[PW1] Pesin, Y. and Weiss, H.: Multifractal Analysis of Equilibrium Measures for Conformal Expanding Maps and Moran-like Geometric Construction. J. of Stat. Phys. 86, 233–275 (1997)
[PW2] Pesin, Y. and Weiss, H.: The Multifractal Analysis of Gibbs Measures: Motivation, Mathematical Foundation and Examples. Chaos 7, 89–106 (1997)
[PW3] Pesin, Y. and Weiss, H.: On the Dimension of Deterministic and Random Cantor-like sets, Symbolic Dynamics, and the Eckmann–Ruelle Conjecture. Commun. Math. Phys. 182, 105–153 (1996)
[R] Ruelle, D.: Thermodynamic Formalism. Reading, MA: Addison-Wesley, 1978
[Sc] Schweiger, P.: Ergodic Theory of Fibred Systems and Metric Number Theory. Oxford: Oxford University Press, 1995
[Sh] Shereshevsky, M.: A Complement to Young’s Theorem on Measure Dimension: The Difference Between Lower and Upper Pointwise Dimension. Nonlinearity 4, 15–25 (1991)
[St] Stratmann, B.: Fractal Dimensions for Jarnik Limit Sets of Geometrically Finite Kleinian Groups; The Semi-Classical Approach. Ark. Mat. 33, 385–403 (1995)
[Su] Sullivan, D.: Disjoint Spheres, Approximation by Imaginary Quadratic Numbers and the Logarithmic Law for Geodesics. Acta Math. 149, 215–237 (1982)
[T] Thaler, M.: Estimates on the invariant densities of endomorphisms with indifferent fixed points. Israel J. Math. 37, 303–314 (1980)
[V] Verbitski, E.: Personal communication
[Wa1] Walters, P.: Invariant Measures and Equilibrium States for Some Mappings Which Expand Distances. Transactions of the AMS 236, 121–153 (1978)
[Wa2] Walters, P.: Introduction to Ergodic Theory. Berlin–Heidelberg–New York: Springer Verlag, 1982
[We] Weiss, H.: The Lyapunov Spectrum of Equilibrium Measures for Conformal Expanding Maps and Axiom-A Surface Diffeomorphisms. J. Stat. Physics 95, (1999)

Communicated by P. Sarnak

Commun. Math. Phys. 207, 173 – 195 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Normal Forms and Quantization Formulae

Dario Bambusi1, Sandro Graffi2, Thierry Paul3

1 Dipartimento di Matematica, Università di Milano, 20133 Milano, Italy
2 Dipartimento di Matematica, Università di Bologna, 40127 Bologna, Italy
3 Ceremade, Université de Paris-IX, 75776 Paris, France

Received: 9 January 1998 / Accepted: 21 April 1999

Abstract: We consider the Schrödinger operator $Q = -\hbar^2\Delta + V$ in $\mathbb{R}^n$, where $V(x) \to +\infty$ as $|x| \to +\infty$, is Gevrey of order $\ell$ and has a unique non-degenerate minimum. A quantization formula up to an error of order $e^{-c|\ln\hbar|^{-a}}$ is obtained for all eigenvalues of Q lying in any interval $[0, |\ln\hbar|^{-b}]$, with $a > 1$ and $0 < b < 1$ explicitly determined and $c > 0$. For eigenvalues in $[0, \hbar^\delta]$, $0 < \delta < 1$, the error is of order $e^{-c/\hbar^{1/\ell}}$. The proof is based upon uniform Nekhoroshev estimates on the quantum normal form constructed quantizing the Lie transformation.

1. Introduction and Statement of the Results

Consider the Schrödinger operator $Q(\hbar) = -\hbar^2\Delta/2 + V(x)$, $V \ge 0$. Assume:
(A1) $V \in C^\infty(\mathbb{R}^n)$, $n \ge 1$; $V \to +\infty$, $|x| \to \infty$; V has a unique non-degenerate minimum. (Without loss we can take $V(0) = \nabla V(0) = 0$, $\mathrm{Hess}\,V(0) = \mathrm{diag}[\omega_1^2, \dots, \omega_n^2]$.)
By (A1) Q defined on $H^2(\mathbb{R}^n) \cap D(V)$ is self-adjoint in $L^2(\mathbb{R}^n)$ with discrete spectrum (see e.g. [RS], §XIII.14). Finding an approximate quantization formula for its eigenvalues when the symbol $q(x, \xi) := \xi^2/2 + V(x)$ is a non-integrable Hamiltonian is an old problem. It has long been known that the eigenvalues of Q in $[0, C\hbar]$, $C > 0$, are approximated by those of the harmonic oscillator up to order $\hbar^2$; the mathematical justification of a quantization formula valid for all such eigenvalues "near the bottom of the well" is however much more recent (see e.g. [CDS,HS,Si]). On the other hand, classically these systems admit a normal form. Write $q = p_0(x, \xi) + V_3(x)$, where

$$p_0(x, \xi; \omega) := \frac{1}{2}\sum_{k=1}^{n} \big(\xi_k^2 + \omega_k^2 x_k^2\big), \qquad V_3(x) := V(x) - \frac{1}{2}\sum_{k=1}^{n} \omega_k^2 x_k^2, \tag{1.1}$$


$p_0$ is an integrable system (the harmonic oscillator), and $V_3(x) = O(|x|^3)$ as $|x| \to 0$. If $(x, \xi) \in \Omega_\epsilon := \{(x, \xi) \mid |(x, \xi)| \le \epsilon\}$ the “size” $\|V\|_\epsilon := \sup_{|x| \le \epsilon} V_3(x)$ of the perturbation $V_3$ can be made small: one then constructs on $\Omega_\epsilon$ a canonical transformation close to the identity, such that the image of q has an expansion in powers of $\epsilon$ whose coefficients depend only on the actions $I_k = (\xi_k^2 + \omega_k^2 x_k^2)/2\omega_k$, $k = 1, \dots, n$. This procedure has been implemented in quantum mechanics via microlocal analysis by Sjöstrand [Sj], yielding a quantization formula mod $\hbar^\infty$ for the “semi-excited levels”; i.e. he proved that there is a smooth function $F(t_1, \dots, t_n; \hbar)$ admitting a full asymptotic expansion in $\hbar$ such that, $\forall \delta > 0$, all eigenvalues of Q in $[0, \hbar^\delta]$ are given by

$$F((k_1 + 1/2)\hbar, \dots, (k_n + 1/2)\hbar; \hbar) + O(\hbar^\infty), \quad k_i = 1, \dots, \quad i = 1, \dots, n. \tag{1.2}$$

The 0th order term in $\hbar$ is the normal form of q truncated at a suitable $\delta$-dependent order. The purpose of this paper is twofold:
(i) To find a quantization formula valid further away from the bottom of the well, and hence closer to the chaotic regime. Actually our formula, which holds for eigenvalues in the interval $[0, |\ln\hbar|^{-b}]$, does not have the standard form of an expansion in powers of $\hbar$ with coefficients depending on $(k + 1/2)\hbar$;
(ii) To turn the $O(\hbar^\infty)$ error estimate valid in $[0, \hbar^\delta]$ into an $O(e^{-c/\hbar^\beta})$ one, $0 < \beta < 1$.

To prove these results we proceed as follows. First we perform the unitary rescaling $U_\epsilon : L^2(\mathbb{R}^n) \ni \psi(x) \mapsto (U_\epsilon\psi)(x) := \epsilon^{n/2}\psi(\epsilon x) \in L^2(\mathbb{R}^n)$. The unitary image $S := U_\epsilon Q U_\epsilon^{-1}$ of the operator $Q(\hbar)$ under $U_\epsilon$ has the form

$$S = \epsilon^2 P(\hbar', \epsilon), \qquad P(\hbar', \epsilon) := P_0(\hbar', \omega) + \epsilon V_\epsilon(x), \qquad \hbar' = \hbar/\epsilon^2, \tag{1.3}$$

$$V_\epsilon(x) := \epsilon^{-3} V_3(\epsilon x) = \sum_{h+k+l=3} (\partial^3_{x_h x_k x_l} V)(0)\, x_h x_k x_l + \dots. \tag{1.4}$$

Hence $\mathrm{Spec}(Q(\hbar)) = \epsilon^2\,\mathrm{Spec}(P(\hbar', \epsilon))$. To study $\mathrm{Spec}(P(\hbar', \epsilon))$ we replace $\epsilon V_\epsilon$ by $a_{\epsilon,R}(x, \xi) := \epsilon V_\epsilon(x)\,t(|x|)\,t(|\xi|)$, where t is a Gevrey function with $t(u) \equiv 1$ if $u \in [0, R]$, $t(u) \equiv 0$ if $u > 2R$, $R > 0$ fixed, and prove that this replacement modifies any finite part of the spectrum only by terms of order $e^{-c/\hbar^{1/(\ell+1)}}$. Finally, using the quantum Lie transform (see e.g. [DGH,Ba1]), we construct (and estimate) the normal form for the semiclassical pseudodifferential operator of Weyl symbol $p_0(x, \xi) + a_{\epsilon,R}(x, \xi)$. An exponentially small remainder is obtained by implementing in this context the Nekhoroshev [Ne] technique: the normal form is truncated at an order chosen as a function of $\epsilon$ so as to minimize the remainder. The error estimate is uniform in $\hbar$ (as in [Ba1,BV]). We further remark that here the Fourier integral operator construction of [Sj] is not required, and the results of [BV] are obtained as a by-product (see Proposition 3.1). Our first result is:

Theorem 1.1. In addition to (A1) above assume
(A2) There exist $\nu > 0$, $C_j > 0$, $\alpha > 0$, $K(U) > 0$ and $1 < \ell < \infty$ such that, $\forall\, j = (j_1, \dots, j_n)$, $|j| := |j_1| + \dots + |j_n|$, and for any bounded $U \subset \mathbb{R}^n$:

$$\sup_{x \in \mathbb{R}^n} (1 + |x|^2)^{|j|/2}\,\big|\partial^{|j|}_{x^j} V(x)\big| \le C_j (1 + |x|^2)^{\nu}, \qquad \sup_{x \in U} \big|\partial^{|j|}_{x^j} V(x)\big| \le K(U)\,\alpha^{-|j|} (|j|!)^{\ell}. \tag{1.5}$$

(A3) There exist $\tau > n - 1$, $\gamma > 0$ such that

$$|\langle\omega, k\rangle| \ge \gamma |k|^{-\tau}, \quad \forall k \in \mathbb{Z}^n \setminus \{0\}, \quad |k| := |k_1| + \dots + |k_n|, \quad \omega := (\omega_1, \dots, \omega_n). \tag{1.6}$$

Let furthermore $b := [(\tau + 2)(2\ell + 3)]^{-1}$. Then:
1. There exist $h^* > 0$, $\epsilon^* > 0$, $A > 0$, $B > 0$ (independent of $\hbar$ and $\epsilon$) and a smooth function $Z(t_1, \dots, t_n; \hbar, \epsilon) : \mathbb{R}^n \times [0, h^*] \times [0, \epsilon^*] \to \mathbb{R}$ with asymptotic expansion in $\hbar$:

$$Z(t_1, \dots, t_n; \hbar, \epsilon) \sim Z_0(t_1, \dots, t_n; \epsilon) + \hbar Z_1(t_1, \dots, t_n; \epsilon) + \dots \tag{1.7}$$

such that, $\forall \epsilon \le \epsilon^*$, $\hbar \le h^*\epsilon^2$, the eigenvalues of $Q(\hbar)$ in $[0, \epsilon^2]$ have the representation

$$\langle (k + 1/2), \omega\rangle\hbar + \epsilon^3 Z\big((k + 1/2)\hbar/\epsilon^2, \hbar/\epsilon^2, \epsilon\big) + \epsilon^2\, O\big(e^{-(B\epsilon^2/\hbar)^{1/(\ell+1)}}\big) + O\big(\epsilon^3 e^{-A/\epsilon^{b}}\big). \tag{1.8}$$

2. There exists $\mathcal{F}(t_1, \dots, t_n; \hbar)$ such that, for any fixed integer $M < N(\epsilon) := (\epsilon^*/\epsilon)^b$:

$$\langle (k + 1/2), \omega\rangle\hbar + \epsilon^3 Z\big((k + 1/2)\hbar/\epsilon^2, \hbar/\epsilon^2, \epsilon\big) = F((k + 1/2)\hbar, \hbar) + O_M(\epsilon^M) + O((\hbar/\epsilon^2)^\infty). \tag{1.9}$$

Here $F := \langle (k + 1/2), \omega\rangle\hbar + \mathcal{F}((k + 1/2)\hbar, \hbar)$ is as in Theorem 0.1 of [Sj].

Remarks. 1. $Z_0$ is the classical Birkhoff normal form computed up to order $N(\epsilon) = (\epsilon^*/\epsilon)^b$: $I_k = (\xi_k^2 + \omega_k^2 x_k^2)/2\omega_k$ are the action variables of the harmonic oscillator, and

$$Z_0(I_1, \dots, I_n, \epsilon) = \sum_{s=0}^{N(\epsilon)} \zeta_s(I_1, \dots, I_n; \epsilon)\, \epsilon^s. \tag{1.10}$$

The coefficients $\zeta_s$ are given by the Birkhoff transformation. The quantum corrections $Z_j$, $j \ge 1$, are likewise explicitly computable recursively.
2. The normal form (1.8) depends on $\epsilon$ because the scale invariance of the perturbation is broken by the modifications of the potential at infinity. Thus (1.8) is different from that of [Sj], Theorem 0.1. Assertion 2 states that the difference is $O((\hbar/\epsilon^2)^\infty)$.

Choosing $\epsilon^2 = \varphi(\hbar)$, within the requirement $\hbar/\varphi(\hbar)^{b/2-1} \to 0$, $\hbar \to 0$, we immediately recover Theorem 0.1 of [Sj], namely:

Corollary 1.1. Let $\varphi : ]0, 1[ \to \mathbb{R}_+$ be positive and increasing. Then
1. If $\lim_{x \to 0} \varphi(x)^b \ln x = 0$ all eigenvalues of $Q(\hbar)$ in $[0, \varphi(\hbar)]$ admit the representation

$$\langle (k + 1/2), \omega\rangle\hbar + \varphi(\hbar)^{3/2}\, Z\Big(\frac{(k_1 + 1/2)\hbar}{\varphi(\hbar)}, \dots, \frac{(k_n + 1/2)\hbar}{\varphi(\hbar)}, \frac{\hbar}{\varphi(\hbar)}, \sqrt{\varphi(\hbar)}\Big) + O\big(e^{-A/\varphi(\hbar)^{b/2}}\big). \tag{1.11}$$

2. If $\varphi(\hbar) = \hbar^{\delta/2}$, $0 < \delta < 1$, (1.11) becomes, for all eigenvalues in $[0, \hbar^{\delta/2}]$:

$$\langle (k + 1/2), \omega\rangle\hbar + \hbar^{3\delta/2}\, Z\big((k_1 + 1/2)\hbar^{1-\delta}, \dots, (k_n + 1/2)\hbar^{1-\delta}, \hbar^{1-\delta}, \hbar^{\delta/2}\big) + O\big(e^{-c/\hbar^{\delta b/2}}\big). \tag{1.12}$$

3. If an error $O(\hbar^\infty)$ is allowed the fractional dependence of Z on $\hbar$ disappears and (1.12) reduces to Theorem 0.1 of [Sj], i.e. the eigenvalues admit the representation

$$\langle (k + 1/2), \omega\rangle\hbar + F((k_1 + 1/2)\hbar, \dots, (k_n + 1/2)\hbar, \hbar) + O(\hbar^\infty). \tag{1.13}$$

Remarks. 1. If $\varphi(x) = |\ln^{2/b+\eta}(x)|^{-1}$, $\eta > 0$, (1.11) yields a quantization formula more general than Theorem 0.1 of [Sj], because $[0, \hbar^\delta] \subset \big[0, |\ln^{2/b+\eta}(\hbar)|^{-1}\big]$ $\forall\, \delta > 0$, $\eta > 0$. Moreover, the error estimate $O(\exp(-A/|\ln\hbar|^{1+\eta b/2}))$ is sharper than $O(\hbar^\infty)$.
2. Formula (1.13) follows from (1.9): for $\epsilon$ so small that $1/\delta^2 < (\epsilon^*/\epsilon)^b$, and any $M > 1/\delta^2$, one has $O(\epsilon^M) = O(\sqrt{\hbar^{\delta M}})$.
3. A more general result, i.e. a quantization formula up to order $e^{-c/\hbar^\delta}$, $\delta < 1$, for all eigenvalues in any interval $[0, M]$, $M > 0$, has been recently stated by Popov [Po], by a quantization of KAM theory.
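The interval inclusion used in Remark 1 holds only once $\hbar$ is small enough, and the threshold can be estimated numerically. The following is only an illustrative sketch: writing $L = |\ln\hbar|$, the inclusion $\hbar^\delta \le |\ln\hbar|^{-(2/b+\eta)}$ is equivalent to $\delta L \ge (2/b + \eta)\ln L$, and the code solves this by bisection. The function names and the sample parameter values are hypothetical and are not taken from the paper.

```python
from math import log


def exponent_b(tau, ell):
    """The exponent b = [(tau + 2)(2*ell + 3)]^(-1) of Theorem 1.1."""
    return 1.0 / ((tau + 2.0) * (2.0 * ell + 3.0))


def log_threshold(delta, tau, ell, eta):
    """Smallest L = |ln hbar| beyond which
    hbar**delta <= |ln hbar|**(-(2/b + eta)),
    i.e. the larger root of delta*L = (2/b + eta)*ln(L)."""
    c = 2.0 / exponent_b(tau, ell) + eta

    def f(L):
        return delta * L - c * log(L)

    lo = c / delta          # minimum of f; f(lo) < 0 because c/delta > e
    hi = 2.0 * lo
    while f(hi) <= 0.0:     # bracket the root from above
        hi *= 2.0
    for _ in range(200):    # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return hi
```

With the sample values `delta = 0.5`, `tau = 1.5`, `ell = 2.0`, `eta = 1.0` the threshold on $|\ln\hbar|$ is of the order of several hundred, which suggests that the regime covered by the remark is genuinely asymptotic.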

2. Reduction to a Perturbation of Compact Support

In this section we estimate the variation undergone by the eigenvalues of $P(\hbar', \epsilon)$ defined by (1.3, 1.4) when $\epsilon V_\epsilon$ is approximated by an operator with compact support symbol (and therefore trace-class). Denote by $I_B$ the open interval $]0, B[$, $0 < B < \infty$, and $\bar I_B$ any compact subinterval of $I_B$. Let $H_1(\kappa)$, $H_2(\kappa)$ be norm-resolvent continuous, self-adjoint operator families in a Hilbert space H ([Ka], §VII.1) for $\kappa \in I_B$, with compact resolvents. Let $J \subset \mathbb{R}$ be a fixed bounded interval, and set $\mathrm{Spec}_J(H_{1,2}) := \mathrm{Spec}(H_{1,2}) \cap J$.

Definition 2.1. Let $0 < \delta, c < +\infty$. We say that

$$\mathrm{Spec}_J(H_1) = \mathrm{Spec}_J(H_2) \pmod{e^{-c/\kappa^\delta}}$$

as $\kappa \to 0$ in I if $H_1(\kappa)\phi_1(\kappa) = \lambda_1(\kappa)\phi_1(\kappa)$, $\phi_1 \ne 0$, $\lambda_1 \in J$ imply $\phi_1(\kappa) \in D(H_2)$ and $\|[H_2(\kappa) - \lambda_1(\kappa)]\phi_1(\kappa)\| = O(e^{-c/\kappa^\delta})$ and, conversely, $H_2(\kappa)\phi_2(\kappa) = \lambda_2(\kappa)\phi_2(\kappa)$, $\phi_2 \ne 0$, $\lambda_2 \in J$ imply $\phi_2(\kappa) \in D(H_1)$ and $\|[H_1(\kappa) - \lambda_2(\kappa)]\phi_2(\kappa)\| = O(e^{-c/\kappa^\delta})$.

Remark. If $\mathrm{Spec}_J(H_1) = \mathrm{Spec}_J(H_2) \pmod{e^{-c/\kappa^\delta}}$, then for any $\lambda_1 \in \mathrm{Spec}_J(H_1)$ there exists $\lambda_2 \in \mathrm{Spec}_J(H_2)$ such that $\lambda_1(\kappa) = \lambda_2(\kappa) + O(e^{-c/\kappa^\delta})$ and conversely.

Let $P(\hbar', \epsilon)$ be the operator in $L^2(\mathbb{R}^n)$ defined by $P_0(\hbar') + \epsilon V_\epsilon$ on $D(P_0) \cap D(V)$. $P_0(\hbar', \omega)$, the harmonic oscillator, is defined on $D(P_0) := H^2(\mathbb{R}^n) \cap L^2_2(\mathbb{R}^n)$.
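The Remark after Definition 2.1 rests on a standard fact for self-adjoint operators: if a normalized vector $\phi$ satisfies $\|(H - \lambda)\phi\| \le r$, then H has spectrum within distance r of $\lambda$. The following is only a finite-dimensional sketch of that fact for a real symmetric 2×2 matrix; the helper names are hypothetical and the matrix is an arbitrary illustration, not an operator from the paper.

```python
from math import hypot


def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def residual(a, b, d, lam, phi):
    """||(H - lam) phi|| / ||phi|| for H = [[a, b], [b, d]]."""
    r0 = (a - lam) * phi[0] + b * phi[1]
    r1 = b * phi[0] + (d - lam) * phi[1]
    return hypot(r0, r1) / hypot(phi[0], phi[1])


def distance_to_spectrum(a, b, d, lam):
    """Distance from lam to the nearest eigenvalue of [[a, b], [b, d]]."""
    return min(abs(lam - mu) for mu in eigenvalues_sym2(a, b, d))
```

For instance, with the matrix entries `a = 1.0`, `b = 0.01`, `d = 3.0`, the trial pair `lam = 1.0`, `phi = (1.0, 0.0)` has a small residual, and `distance_to_spectrum` is bounded by it, as the general fact predicts.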


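Let $R > 0$ be fixed, $u \mapsto t(u) \in G_R^{\ell,\alpha}(\mathbb{R}_+)$ be a Gevrey function such that $t(u) \equiv 1$ for $|u| \le R$, $t(u) \equiv 0$ for $|u| \ge 2R$, and set:

$$p_\epsilon(x, \xi) := p_0(x, \xi) + \epsilon V_\epsilon(x), \quad a_{\epsilon,R}(x, \xi) := \epsilon V_\epsilon(x)\,t(|x|)\,t(|\xi|), \quad p_{\epsilon,R}(x, \xi) := p_0(x, \xi) + a_{\epsilon,R}(x, \xi). \tag{2.1}$$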
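A cutoff with the stated plateau and support can be written down explicitly. The following is only a sketch of one common construction, assumed here for illustration and not taken from the paper: for $\ell > 1$ the function $\exp(-x^{-1/(\ell-1)})$ (extended by 0 for $x \le 0$) is, to the best of our understanding, of Gevrey order $\ell$, and a quotient of two shifted copies gives a function equal to 1 on $[0, R]$ and 0 beyond $2R$. The names `flat` and `cutoff` are hypothetical.

```python
from math import exp


def flat(x, ell=2.0):
    """exp(-x**(-1/(ell-1))) for x > 0 and 0 otherwise (requires ell > 1)."""
    if x <= 0.0:
        return 0.0
    try:
        return exp(-x ** (-1.0 / (ell - 1.0)))
    except OverflowError:
        # x is so small that the exponent overflows; the value is 0 to
        # machine precision.
        return 0.0


def cutoff(u, R=1.0, ell=2.0):
    """Smooth cutoff: 1 for |u| <= R, 0 for |u| >= 2R, decreasing between."""
    u = abs(u)
    num = flat(2.0 * R - u, ell)
    den = num + flat(u - R, ell)  # never 0, since 2R - u > 0 or u - R > 0
    return num / den
```

By symmetry of the construction the value at the midpoint `u = 1.5 * R` is exactly 1/2, which gives a quick sanity check of an implementation.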

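Denote furthermore $Q_R := \{(x, \xi) \mid |x| \le R, |\xi| \le R\}$.

Remarks. 1. $a_{\epsilon,R}(x, \xi) \in G^{\ell,\alpha}(\mathbb{R}^{2n})$ uniformly in $\epsilon \in \bar I_B$; namely, for some K, $\alpha$ independent of $\epsilon$:

$$\sup_{z \in \mathbb{R}^{2n}} \big|\partial^{|j|}_{z^j} a_{\epsilon,R,\hbar'}(z)\big| \le K (|j|!)^{\ell}/\alpha^{|j|}, \quad z := (x, \xi). \tag{2.2}$$

Moreover $a_{\epsilon,R}(x, \xi) \equiv \epsilon V_\epsilon(x)$, $(x, \xi) \in Q_R$; $a_{\epsilon,R}(x, \xi) \equiv 0$, $(x, \xi) \notin Q_{2R}$.
2. Let $A := \mathrm{Op}^W(a)$ denote the operator in $L^2(\mathbb{R}^n)$ defined by the Weyl quantization of the symbol (equivalently, classical observable) $a_{\epsilon,R}(x, \xi)$:

$$(\mathrm{Op}^W(a)\psi)(x) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n \times \mathbb{R}^n} a_{\epsilon,R}((x + y)/2, \xi)\, e^{i\langle x - y, \xi\rangle/\hbar}\, \psi(y)\, dy\, d\xi. \tag{2.3}$$

In this case ([Ro], Th.III.44) $A = \mathrm{Op}^W(a_{\epsilon,R})$ extends from $S(\mathbb{R}^n)$ to a trace-class self-adjoint operator in $L^2(\mathbb{R}^n)$, realized as an $\hbar$-pseudodifferential operator ([Ro], II.4).
3. By standard results of perturbation theory ([Ka], §VII.1) the operator family $P_R(\hbar', \epsilon) := \mathrm{Op}^W(p_{\epsilon,R}) = P_0 + A_{\epsilon,R}$, $A_{\epsilon,R} := \mathrm{Op}^W(a_{\epsilon,R})$, defined on $D(P_0)$ is positive self-adjoint in $L^2(\mathbb{R}^n)$, and norm-resolvent continuous with compact resolvents for $(\hbar', \epsilon)$ in a neighbourhood $\bar I_B \times \bar I_B$ of zero because $A_{\epsilon,R}$ is a fortiori continuous.

We now compare $\mathrm{Spec}_J P$ and $\mathrm{Spec}_J P_{\epsilon,R}$ for $|J| < +\infty$:

Proposition 2.1. Let $J \subset \mathbb{R}_+$ be bounded and fixed. Then there exist $R > 0$ and $\epsilon^* > 0$ independent of $\epsilon$, $\hbar'$, such that, provided $0 \le \epsilon \le \epsilon^*$ one has

$$\mathrm{Spec}_J(P) = \mathrm{Spec}_J(P_{\epsilon,R}) \pmod{e^{-c/\hbar'^{1/\ell}}}.$$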

The proof of the proposition requires a decay property of the eigenfunctions of the Weyl operator corresponding to a Gevrey symbol, well known in the analytic case (see e.g. [HS,Ma]). We prove this property in three steps (in what follows C and c denote positive constants which may sometimes assume different values). Given a Gevrey function $a \in G_R^{\ell,\alpha}$, $\mathrm{supp}\, a \subset Q_R$, denote $\|a\|_{\ell,\alpha}$ the smallest constant K such that the following inequality holds:

$$\sup_{z \in \mathbb{R}^{2n}} \big|\partial^{|j|}_{z^j} a(z)\big| \le K (|j|!)^{\ell}/\alpha^{|j|}, \quad z := (x, \xi). \tag{2.4}$$

Lemma 2.1. Let $a(x, \xi) \in G_R^{\ell,\alpha}$ be a Gevrey function, $\mathrm{supp}\, a \subset Q_R$. Let $\eta \in C^\infty(\mathbb{R}^n; \mathbb{R}_+)$ fulfill $0 \le \eta(x) < (\sigma|x|)^\beta$ with $\beta := 1/\ell$ and $0 < \sigma < \alpha$. Assume also that $|\eta(x) - \eta(y)| \le (\sigma|x - y|)^\beta$. Define the operator

$$[\tilde A\psi](x) := e^{\eta(x)/\hbar^\beta}\, \mathrm{Op}^W(a)\, e^{-\eta(x)/\hbar^\beta}\psi(x). \tag{2.5}$$

Then there is a positive $C_A = C_A(\alpha, \ell, \|a\|_{\ell,\alpha}, R, \sigma) < +\infty$ such that $\|\tilde A\| \le C_A$.


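Proof. Denote

$$\hat a_2(x, k) := \frac{1}{(2\pi)^{n/2}} \int a(x, \xi)\, e^{i\langle k, \xi\rangle}\, d\xi.$$

By the Gevrey property there is a compactly supported smooth positive function A(x) such that $|\hat a_2(x, k)| \le A(x)\, e^{-(\alpha|k|)^\beta}$. Given $\psi \in S$, by definition of Weyl quantization (see (2.3)) we have

$$[\tilde A\psi](x) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n \times \mathbb{R}^n} a((x + y)/2, \xi)\, e^{i\langle x - y, \xi\rangle/\hbar}\, e^{\frac{\eta(x) - \eta(y)}{\hbar^\beta}}\, \psi(y)\, dy\, d\xi = (2\pi\hbar^2)^{-n/2} \int_{\mathbb{R}^n} \hat a_2((x + y)/2, (x - y)/\hbar)\, e^{\frac{\eta(x) - \eta(y)}{\hbar^\beta}}\, \psi(y)\, dy,$$

and therefore

$$\big|[\tilde A\psi](x)\big| \le (2\pi\hbar^2)^{-n/2} \int_{\mathbb{R}^n} A\Big(\frac{x + y}{2}\Big)\, e^{-\left(\alpha\frac{|x - y|}{\hbar}\right)^\beta}\, |\psi(y)|\, e^{\left(\sigma\frac{|x - y|}{\hbar}\right)^\beta}\, dy. \tag{2.6}$$

Define

$$f(\xi) := \frac{1}{(2\pi)^{n/2}} \int e^{-(\alpha - \sigma)|z|^\beta}\, e^{-i\langle\xi, z\rangle}\, dz,$$

then the r.h.s. of (2.6) is equal to

$$(2\pi\hbar)^{-n} \int_{\mathbb{R}^n \times \mathbb{R}^n} A((x + y)/2)\, f(\xi)\, e^{i\langle x - y, \xi\rangle/\hbar}\, |\psi(y)|\, dy\, d\xi = \big[\mathrm{Op}^W(A(x) f(\xi))\, |\psi|\big](x).$$

Since $A(x) f(\xi)$ has an $L^1$ Fourier transform, $\mathrm{Op}^W(A(x) f(\xi))$ is bounded uniformly in $\hbar$ ([Ro], Cor.II.19). This proves the assertion. □

Given a positive constant E define the classically allowed domain of $|\xi|^2 + V(x)$ by $\Omega_E := \{x \in \mathbb{R}^n : V(x) \le E\}$. $\Omega_E$ is compact. Denote $\Omega_E^c$ its complement.

Proposition 2.2. Let $b(x, \xi) := |\xi|^2 + V(x) + \epsilon a(x, \xi)$, $B = \mathrm{Op}^W(b)$, where V(x) fulfills (1.5) and $a(x, \xi) \in G_R^{\ell,\alpha}$, $\mathrm{supp}\, a \subset Q_R$. Denote $\beta := 1/\ell$; let the interval J and $0 < \sigma < \alpha$ be fixed. Then there exist constants $\tilde\eta = \tilde\eta(\beta, n, \sigma) > 0$, $\varsigma = \varsigma(\beta, n, \sigma, V) > 0$ and $\epsilon^* = \epsilon^*(\beta, n, \sigma, R, \alpha, \ell, \|a\|_{\ell,\alpha}) > 0$ such that, provided $\epsilon < \epsilon^*$, any eigenfunction $\psi$ of B corresponding to an eigenvalue $\lambda \in J$ satisfies the estimate

$$\big\|\psi \exp\big((\sigma|x|/\hbar)^{1/\ell}\big)\big\|_{L^2(\Omega_E^c)} \le C_E, \tag{2.7}$$

where

$$E := \max\{\tilde\eta + 2\sup_{\lambda \in J}\lambda,\ \varsigma\}. \tag{2.8}$$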


Proof. For E fixed, denote E − δ the set of the points x contained in E together with the ball Bδ (x) of radius δ and center x. Likewise, denote E + δ the set of the set ∪x∈E Bδ (x). Fix E and δ such that E − 2δ 6 = ∅. Let ηN (x) be a sequence of positive, smooth, compactly supported function such that (1) ηN (x) = 0, x ∈ E − δ; ηN (x) = (σ |x|)β , x ∈ E + N, |ηN (x) − ηN (y)| ≤ (σ |x − y|)β . (2) ∇ηN is bounded uniformly with respect to x ∈ Rn and N ∈ N. (1) and (2) can be fulfilled choosing E and δ large and small enough, respectively. Let furthermore χ(x) ∈ C0∞ (Rn ) be such that χ(x) = 1 for x ∈ E − 2δ, χ(x) = 0 for x 6 ∈ E − δ. We omit the dependence on N in the forthcoming argument which β β holds for N fixed. For λ ∈ J , denote T := −h¯ 2 4 +V , S(λ) := eη/h¯ T e−η/h¯ − λ, β β A0 := eη/h¯ OpW (a)e−η/h¯ , so that β

β

R := eη/h¯ OpW (b + a)e−η/h¯ − λ = S(λ) + A0 , with kA0 kL2 →L2 < C(R) by Lemma 2.1 (we drop the dependence on the Gevrey constants). The Weyl symbol of S(λ) is clearly s(x, ξ ; λ) = (ξ − i∇η)2 + V (x) − λ, whence D(S) = D(T ) by the boundedness of ∇η. Let u ∈ D(S) = D(T ). Then, since kRuk ≥ kS(λ)uk − C(R)kuk: kRuk2 ≥ (1 − C(R))kS(λ)uk2 − C(R)(1 + C(R))kuk2 ≥ (1 − C(R))k(1 − χ)S(λ)uk2 − C(R)(1 + C(R))kuk2 , whence kRuk2 ≥ (1 − C(R))hu, S(λ)∗ (1 − χ)2 S(λ)ui − C(R)(1 + C(R))kuk2 . (2.9) We estimate from below hu, S(λ)∗ (1−χ)2 S(λ)ui by the Garding inequality ([Ro, Th.III4] or [He, §2.4]). Remark that the principal symbol of S(λ)∗ (1 − χ)2 S(λ) is (1 − χ )2 |s(x, ξ ; λ)|2 . Define M(E, δ, λ) :=

inf

x∈E −2δ

[V (x) − λ]2 ,

and remark that, by assumption (1.5) on the potential and the Lagrange mean value theorem, there exists ς 0 which tends to zero as E → ∞ such that M(E, δ, λ) ≥ [E(1 − ς 0 δ) − λ]2 . Choose now the constant ς in such a way that ς 0 δ < 21 , so that M > (e/2 − λ)2 . Then one has (1 − χ )2 |s(x, ξ ; λ)|2 > M(E, δ, λ)(1 − χ )2 − η, η := sup |∇η(x)|2 . x∈Rn

(2.10)

¯ Then Th.III-4 of [Ro] yields Let C1 satisfy C1 > η. 1 hu, S(λ)∗ (1 − χ )2 S(λ)ui ≥ M k(1 − χ)uk2 − C1 kuk2 . 2

(2.11)


Substituting in (2.9) we get kRuk2 ≥ M(1 − C(R))k(1 − χ)uk2 − C2 kuk2 , C2 (R, η, ) := C1 (η) + C(R)(1 + C(R)). Define now ∗ according to

1 C1 min 1, , ∗ := C(R) 2

so that C2 < 2C1 . Now kuk2 ≤ 2kχuk2 + 2k(1 − χ)uk2 , whence kRuk2 ≥ (M − 4C1 )k(1 − χ)2 uk − 4C1 kχuk2 .

(2.12)

Now take u := uN = eηN ψ, where ψ is an eigenfunction of B corresponding to the eigenvalue λ. Since ηN ≡ 0 on supp χ we have kχuN k ≤ kψk. On the other hand, RuN = 0 for all N. This yields β

(M − 4C1 )k(1 − χ )eηN /h¯ ψk ≤ 4C1 kψk

(2.13)

which holds provided M > 4C1 which is true provided the constant η˜ in (2.8) is chosen large enough. This estimate is independent of N , which proves the proposition. u t Corollary 2.1. Let ψ be an eigenfunction of B corresponding to the eigenvalue λ ∈ J . Fix a positive σ < α; then ∃C < ∞ such that |ψ(x)| ≤ C exp −(σ |x|/h¯ 0 )1/` , x ∈ cE (2.14) with E given by (2.8). Proof. ψ fulfills (2.7). Op W (aτ ) is bounded. Hence the eigenvalue equation −h¯ 0 4 2 ψτ + V ψτ + OpW (aτ )ψτ = λτ ψτ yields h¯ 0 k4ψτ k ≤ C kψτ k + kV ψτ k. Moreover: 2 β β Z Z − hx0 hx0 2 2 |V (x)ψτ (x)| dx + kV ψτ k = V (x)e ¯ e ¯ ψ(x) dx x∈E x ∈ / E 2

≤ sup |V (x)|2 kψk2 + sup |V (x)e x∈E

β − hx0 2 ¯

x ∈ / E

κ|x|

| kψτ e 2h¯ 0 k2 ≤ C.

Iterating the argument we get the existence of a constant C such that k1N ψτ k ≤ C N , ∀N ≥ 1, whence the thesis by (2.7) on account of the Sobolev embedding theorem. u t We come now to the heart of the proof. We have, defining p1 (x, ξ ) := p0 (x, ξ ) + V (x)t (|x|) , P1 := OpW (p1 ).


Proposition 2.3. There exists R∗ < ∞ and a function ∗ (R) such that, provided R > R∗ and < ∗ (R) one has 0β

Spec(P1 ) = Spec(P ) mod(e−c/h¯ ). Proof. Since p(x, ξ ) − p1 (x, ξ ) = [1 − t (|x|)]V (x), it is enough to prove that, if ψ is an eigenfunction of P or P1 , then 0β

k[1 − t (|x|)]V (x)ψk = O(e−c/h¯ ). Both p and p1 fulfill the assumptions of Corollary 2.1, and [1 − t (|x|)]V (x) is supported outside the domain B2R := |x| ≤ 2R. Hence the inequality holds provided B2R ⊃ E . This can be ensured by taking R large enough, which is possible since the constants η˜ and ς in Proposition 2.2 do not depend on R. On the contrary the constant ∗ will depend on R, and this yields the assertion. u t Proposition 2.4. Let ψ be an eigenfunction of P1 or of P,R . Then, there exist R∗ , two functions ∗ (R) and h(R) and suitable constants ς, C, c such that, provided R > R∗ , < ∗ (R), h¯ 0 < h(R) one has:

−1/`

(P − P,R )ψ ≤ C h¯ −ς e−ch¯ 0 . Proof. Instead of comparing P1 and P1 we compare theirZunitary images under the h¯ 0 de-

pendent Fourier transform Fh¯ 0 : (Fh¯ 0 )(ξ ) = (2π h¯ 0 )−n/2

0

Rn

u(x)e−ihx,ξ i/h¯ dx. Given a

symbol g(x, ξ ), it is well known (see e.g. [BS], §5.4) that the symbol of Fh¯ 0 OpW (g)Fh¯−1 0 is simply g(−ξ, x). Hence we compare the spectrum of the operators of Weyl symbols p(−ξ, x) ≡ p0 (ξ, x) + V (ξ )t (|ξ |)t (|x|) and p1 (ξ, x) ≡ p0 (ξ, x) + V (ξ )t (|ξ |). Both these operators fulfill the assumptions of Corollary 2.1; therefore their eigenfunctions fulfill (2.14). We denote W (ξ ) := V (ξ )t (|ξ |). Let ψ be an eigenfunction of either one of these operators; fix R1 such that BR1 ⊂ E . We now estimate −1 [(Fh¯ 0 P1,τ Fh¯−1 0 − Fh ¯ 0 Pτ Fh¯ 0 ψ)](x) Z i hx−y,ξ i = (2π h¯ 0 )−n W (ξ )[1 − t (|x + y|/2)]e h¯ 0 ψ(y) dydξ

Rnξ ×Rnx

−n/2 0 −n

Z

Wˆ ((x − y)/h¯ 0 )[1 − t (|x + y|/2)] ψ(y)dy Z Z −n + Wˆ ((x − y)/h¯ 0 )[1 − t (|x + y|/2)] ψ(y)dy. = (2π)−n/2 h¯ 0

= (2π)

h¯

Rn

|y|≤R1

|y|≥R1

Hence, by the decay properties of ψ and of the Fourier transform of W the second integral is estimated by β Z − σ (|x−y|β +|y|β ) −ς e h¯ 0 |1 − t (|x + y|/2)| dy. (2.15) C h¯ 0 Rn

Here ς is a suitable constant. Choose R = 3R1 , and consider first the case |x| > R/2; by the estimate |x − y|β + |y|β ≥ |x|β /2 + |y|β /2 our integral is majorized by Z 0 β 0 β −ς 0 −ς −(σ |x|/h¯ 0 )β e e−(σ |y|/h¯ ) dy < C h¯ 0 e−(σ |x|/h¯ ) . C h¯ |y|>R1


For $|x| < R/2$ we need a stronger estimate. To obtain it remark that for $|y| < 3R/2$ one has $|x+y| < R$, and therefore, by the occurrence of $1-t$, the argument of the second integral in (2.15) vanishes unless $|y| > 3R/2$, which implies $|x-y| > R$. From this it follows that the second integral of (2.15) is also estimated by $C\hbar'^{-\varsigma}e^{-(\sigma R/\hbar')^{\beta}}$. We now estimate the first integral of (2.15). To this end we remark that it vanishes unless $|(x+y)/2| \ge R$ and $|y| \le R_1$. This gives

$$|x-y| \ge |x|-|y| \ge |x+y|-2|y| = \frac{|x+y|}{2}+\frac{|x+y|}{2}-2|y| \ge \frac{|x|}{2}-\frac{R_1}{2}+R-2R_1 \ge \frac{|x|}{2}+R-\frac{5}{2}R_1.$$

Since $R = 3R_1$, proceeding as above we get the result. □

3. Perturbation Theory Estimates

Consider the pseudodifferential operator $P_R(\hbar',\varepsilon) = P_0 + A_{\varepsilon,R}$ with Weyl symbol $p_{\varepsilon,R}(x,\xi) = p_0(x,\xi;\omega) + a_{\varepsilon,R,\hbar'}(x,\xi)$ as in Proposition 2.1. We now construct the quantum normal form for $P_R(\hbar',\varepsilon)$ with uniform bounds for the remainder. The result is (dropping the dependence on $R$ fixed):

Theorem 3.1. There exist positive constants (independent of $\hbar'$, $\varepsilon$) $\varepsilon_*, C_1,\dots,C_6$ and, for $\varepsilon < \varepsilon_*$, a unitary transformation $U(\varepsilon,\hbar') : L^2 \to L^2$ such that

1. The unitary image $S(\hbar',\varepsilon) := U P_R(\hbar',\varepsilon) U^{-1}$ of $P_R(\hbar',\varepsilon)$ has a Weyl symbol $\sigma(\hbar',\varepsilon)$ of the form

$$\sigma(\hbar',\varepsilon) = p_0 + Z(\hbar',\varepsilon) + R(\hbar',\varepsilon). \tag{3.1}$$

2. $Z(\hbar',\varepsilon)$ admits the representation

$$Z(\hbar',\varepsilon) = \sum_{s=1}^{N(\varepsilon)} \zeta_s(x,\xi;\varepsilon,\hbar')\,\varepsilon^s. \tag{3.2}$$

Here $N(\varepsilon) := (\varepsilon_*/\varepsilon)^b$, $b := [(\tau+2)(2\ell+3)]^{-1}$; $\zeta_s$, $s = 1,\dots,N(\varepsilon)$, is a smooth, bounded semiclassical symbol depending on $(x,\xi)$ only through the actions $I_k$, $k = 1,\dots,n$.

3. The Weyl quantization of the symbols $Z(\hbar',\varepsilon)$, $R(\hbar',\varepsilon)$ fulfills the estimates:

$$\|\mathrm{Op}^W(Z(\hbar',\varepsilon))\| \le C_2, \qquad \|\mathrm{Op}^W(R(\hbar',\varepsilon))\| \le C_4\exp\left[-(C_5/\varepsilon)^b\right]. \tag{3.3}$$

4. For every fixed integer $k < N(\varepsilon)$ let

$$Z^{(k)}(I(x,\xi);\varepsilon,\hbar') := \sum_{s=1}^{k}\zeta_s(I(x,\xi);\varepsilon,\hbar')\,\varepsilon^s. \tag{3.4}$$

Then 3 Z (k) is independent of mod (h¯ 0 )∞ , i.e. there is Z˜ (k) (I (x, ξ ); h¯ ) such that 3 Z (k) (I (x, ξ ); , h¯ / 2 ) = Z˜ (k) (I (x, ξ ); h¯ ) + Ok ((h¯ 0 )∞ ).

(3.5)


Remark. Denoting $Z = \mathrm{Op}^W(Z)$, $R = \mathrm{Op}^W(R)$ we have $S = P_0 + Z + R$, where $Z$ commutes with $P_0$. Hence by (3.2), (3.3) the eigenvalues of $P_R(\hbar',\varepsilon)$ have the expression

$$\langle\omega,(l+\tfrac12)\hbar'\rangle + \sum_{s=1}^{N(\varepsilon)}\zeta_s\big((l_1+\tfrac12)\hbar'\omega_1,\dots,(l_n+\tfrac12)\hbar'\omega_n;\varepsilon,\hbar'\big)\,\varepsilon^s + O\big(e^{-(C_5/\varepsilon)^b}\big), \tag{3.6}$$

where $l_i \ge 0$, $i = 1,\dots,n$.

We prove this result first for a particular class of analytic perturbations $f$ of $p_0$, and then we recover the conditions of the theorem through an approximation procedure. To specify our assumptions on the perturbation $f$ we need some preliminary notions. Define an analytic action $\Psi$ of $\mathbb{T}^n$ into $\mathbb{R}^{2n}$ through the flow of $p_0$:

$$\Psi : \mathbb{T}^n\times\mathbb{R}^{2n}\to\mathbb{R}^{2n}, \qquad (\phi,(x,\xi))\mapsto(x',\xi') = \Psi_\phi(x,\xi), \tag{3.7}$$

$$x'_k := \frac{\xi_k}{\omega_k}\sin\phi_k + x_k\cos\phi_k, \qquad \xi'_k := \xi_k\cos\phi_k - \omega_k x_k\sin\phi_k. \tag{3.8}$$

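If $z := (x,\xi)$, the flow of initial datum $z_0$ is indeed $z(t) = \Psi_{\omega t}(z_0)$, $\omega t := (\omega_1 t,\dots,\omega_n t)$. If $g \in L^1_{loc}(\mathbb{R}^{2n})$, its angular Fourier coefficient of order $k$ is defined by

$$\tilde g_k(z) := \frac{1}{(2\pi)^n}\int_{\mathbb{T}^n} g(\Psi_\phi(z))\,e^{-i\langle k,\phi\rangle}\,d\phi, \qquad k\in\mathbb{Z}^n.$$

If $g \in C^1$ one has, as is well known,

$$g(\Psi_\phi(z)) = \sum_{k\in\mathbb{Z}^n}\tilde g_k(z)\,e^{i\langle k,\phi\rangle} \;\Longrightarrow\; g(z) = \sum_{k\in\mathbb{Z}^n}\tilde g_k(z).$$

Remark furthermore that $g \equiv \tilde g_k$ for some fixed $k$ if and only if

$$g(\Psi_\phi(z)) = e^{i\langle k,\phi\rangle}g(z). \tag{3.9}$$

Let now $g \in L^1(\mathbb{R}^{2n})$, and consider its space Fourier transform

$$\hat g(s) := \frac{1}{(2\pi)^{2n}}\int_{\mathbb{R}^{2n}} g(z)\,e^{-i\langle s,z\rangle}\,dz. \tag{3.10}$$

Given $\rho > 0$, $\sigma > 0$, denoting $\hat{\tilde g}_k(s)$ the space Fourier transform of $\tilde g_k$, define the norm

$$\|g\|_{\rho,\sigma} := \sum_{k\in\mathbb{Z}^n} e^{\rho|k|}\int_{\mathbb{R}^{2n}}|\hat{\tilde g}_k(s)|\,e^{\sigma|s|}\,ds. \tag{3.11}$$

Definition 3.1. Let $\rho > 0$, $\sigma > 0$. Then $\mathcal{A}_{\rho,\sigma} := \{g : \mathbb{R}^{2n}\to\mathbb{C} \mid \|g\|_{\rho,\sigma} < +\infty\}$.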
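The action (3.7)-(3.8) and the angular Fourier coefficients can be checked numerically in the simplest case $n = 1$. The following is only an illustrative sketch: the frequency, the phase-space point and the test symbol $g(x,\xi) = x$ are hypothetical choices, not taken from the paper. It verifies that $\Psi$ is a group action, that $p_0$ is invariant under it, and that $g = x$ has only the coefficients $k = \pm 1$, which sum to $g$.

```python
import cmath
import math

def psi(phi, x, xi, omega):
    """Action (3.7)-(3.8) for n = 1: move (x, xi) along the flow of p0 by the angle phi."""
    x_new = (xi / omega) * math.sin(phi) + x * math.cos(phi)
    xi_new = xi * math.cos(phi) - omega * x * math.sin(phi)
    return x_new, xi_new

def p0(x, xi, omega):
    """Harmonic oscillator Hamiltonian whose flow generates psi."""
    return 0.5 * (xi ** 2 + (omega * x) ** 2)

def angular_coefficient(g, k, x, xi, omega, samples=64):
    """k-th angular Fourier coefficient of g at (x, xi), by uniform quadrature in phi."""
    total = 0j
    for j in range(samples):
        phi = 2.0 * math.pi * j / samples
        total += g(*psi(phi, x, xi, omega)) * cmath.exp(-1j * k * phi)
    return total / samples

# Hypothetical test data (assumptions made for illustration only).
omega, x, xi = 1.7, 0.4, -1.1
g = lambda a, b: a  # test symbol g(x, xi) = x

# Group property: Psi_{0.3} composed with Psi_{0.5} equals Psi_{0.8}.
two_steps = psi(0.3, *psi(0.5, x, xi, omega), omega)
one_step = psi(0.8, x, xi, omega)
group_error = max(abs(u - v) for u, v in zip(two_steps, one_step))

# p0 is constant along the action.
energy_error = abs(p0(*one_step, omega) - p0(x, xi, omega))

# g = x carries only the harmonics k = +1 and k = -1, and they sum to g.
g_plus = angular_coefficient(g, 1, x, xi, omega)
g_minus = angular_coefficient(g, -1, x, xi, omega)
g_zero = angular_coefficient(g, 0, x, xi, omega)
```

The uniform quadrature is exact here because $g\circ\Psi_\phi$ is a trigonometric polynomial in $\phi$ of degree far below the number of samples.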


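Remarks. 1. If $f \in \mathcal{A}_{\rho,\sigma}$ then $f$ is analytic on $\mathbb{R}^{2n}$, and extends to a complex analytic function on a region of the form $|\mathrm{Im}\,z_i| \le a_i|\mathrm{Re}\,z_i|$, with suitable $a_i$.

2. As recalled in the former section, $F := \mathrm{Op}^W(f)$ is a trace-class, self-adjoint $\hbar'$-pseudodifferential operator in $L^2(\mathbb{R}^n)$ if $f \in \mathcal{A}_{\rho,\sigma}$. Let $\hat f(s)$ be the Fourier transform of $f$. Since $\|\hat f\|_{L^1} \le \|f\|_{\rho,\sigma}$, we have

$$\|F\|_{L^2\to L^2} \le \int_{\mathbb{R}^{2n}}|\hat f(s)|\,ds \equiv \|\hat f\|_{L^1}, \qquad \|F\|_{L^2\to L^2} \le \|f\|_{\rho,\sigma}. \tag{3.12}$$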

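Proposition 3.1. Let $\varepsilon \mapsto f_\varepsilon$ map $[0,\varepsilon_*]$ into $\mathcal{A}_{\rho,\sigma}$ for some $\rho > 0$, $\sigma > 0$ and some $\varepsilon_* > 0$. Consider the operator $P_f := P_0(\hbar') + F_\varepsilon$, $F_\varepsilon := \mathrm{Op}^W(f_\varepsilon)$. Let (1.6) hold. Set

$$E := \sup_{\varepsilon\in[0,\varepsilon_*]}\|f_\varepsilon\|_{\rho,\sigma}$$

and assume

$$\varepsilon < \varepsilon_0\,\rho^\tau\sigma^2, \qquad \varepsilon_0 := \frac{\gamma}{2^{\tau+5}e^2E\,\tau^\tau}. \tag{3.13}$$

Then the operator family

$$P_f := P_0(\hbar') + F_\varepsilon = \mathrm{Op}^W\big(p_0(x,\xi;\omega) + f_\varepsilon(x,\xi)\big), \qquad D(P_f) = D(P_0) \tag{3.14}$$

is norm-resolvent continuous, self-adjoint with compact resolvents in $L^2(\mathbb{R}^n)$, and there exists a unitary transformation $T : L^2(\mathbb{R}^n)\to L^2(\mathbb{R}^n)$ such that $S_f(\hbar') := T P_f(\hbar') T^{-1}$ is an $\hbar'$-pseudodifferential operator family with symbol

$$\sigma_f(x,\xi,\hbar',\varepsilon) = p_0(x,\xi,\omega) + Z(I(x,\xi);\hbar',\varepsilon) + R(x,\xi,\hbar',\varepsilon), \tag{3.15}$$

where $Z \in \mathcal{A}_{\rho/2,\sigma/2}$ and fulfills the estimate

$$\|Z\|_{\rho/2,\sigma/2} \le 2E, \tag{3.16}$$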

the principal symbol of $Z$ is the normal form of the Hamiltonian $p_0(x,\xi) + f(x,\xi)$ computed up to order $(\varepsilon_0\rho^\tau\sigma^2/\varepsilon)^{1/(\tau+2)}$, and $R \in \mathcal{A}_{\rho/2,\sigma/2}$ is exponentially small, namely

$$\|R\|_{\rho/2,\sigma/2} \le eE\exp\left[-(\tau+2)\,(\varepsilon_0\rho^\tau\sigma^2/\varepsilon)^{\frac{1}{2+\tau}}\right]. \tag{3.17}$$

To prove the result we estimate the quantum normal form, recursively constructed through the quantum Lie transform [DGH,Ba1], and then choose the order of normalization in order to minimize the remainder. The semiclassical pseudodifferential calculus essentially reduces both steps to the classical method. We recall now the procedure of quantum Lie transform.

Let $W$ be a bounded self-adjoint operator in $L^2$ and $e^{tW/i\hbar'}$ the corresponding unitary group. If $G$ is any bounded operator in $L^2$ its formal image under $e^{tW/i\hbar'}$ is

$$e^{-tW/i\hbar'}\,G\,e^{tW/i\hbar'} = \sum_{l=0}^{\infty}t^l G_l, \tag{3.18}$$

where $G_l$, $l = 0,1,\dots$, is recursively defined as follows: $G_0 := G$, and

$$G_l = \frac{1}{l}\,\frac{1}{i\hbar'}\,[W,G_{l-1}], \qquad l \ge 1. \tag{3.19}$$

In analogy with the classical notion we give the following


Definition 3.2. Let the bounded self-adjoint operator $W$ admit a smooth Weyl symbol $w$. Then the quantum Lie transform generated by $w$ is the unitary operator $\exp(W/i\hbar')$.

Recall that if a bounded operator $G$ also admits a smooth Weyl symbol $g$, then the Weyl symbol of $-i[W,G]/\hbar'$ is by definition the Moyal bracket $\{w,g\}_M$. Hence by (3.19) the (formal) Weyl symbol of $e^{-W/i\hbar'}Ge^{W/i\hbar'}$ is $\sum_{l=0}^{\infty}g_l$, where

$$g_0 := g, \qquad g_l := \{w,g_{l-1}\}_M/l, \quad l \ge 1. \tag{3.20}$$
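The recursion (3.19) can be tested on finite matrices, where every series converges. The sketch below is an illustration under stated assumptions: the $2\times 2$ matrices `W` and `G`, the value of `hbar` and the time `t` are hypothetical. With $A := W/(i\hbar')$ it checks that $\sum_l t^l G_l$, $G_l = [A,G_{l-1}]/l$, reproduces the conjugation $e^{tA}Ge^{-tA}$. This is the ordering for which the recursion holds term by term; the opposite ordering, as written in (3.18), corresponds to replacing $t$ by $-t$.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, scale=1.0):
    """Return a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled(a, c):
    """Return c * a."""
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return add(matmul(a, b), matmul(b, a), scale=-1.0)

def expm(a, terms=60):
    """Matrix exponential by its Taylor series (adequate for the small norms used here)."""
    n = len(a)
    result = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = result
    for m in range(1, terms):
        term = scaled(matmul(term, a), 1.0 / m)
        result = add(result, term)
    return result

# Hypothetical data (assumptions made for illustration only).
hbar = 0.5
W = [[1.0 + 0j, 0.3 - 0.2j], [0.3 + 0.2j, -0.4 + 0j]]  # Hermitian generator
G = [[0.2 + 0j, 1.0 + 0.5j], [-0.7 + 0j, 0.1 - 0.3j]]  # arbitrary bounded operator
A = scaled(W, 1.0 / (1j * hbar))                        # A = W / (i hbar)
t = 0.3

# Recursion (3.19): G_0 = G, G_l = [A, G_{l-1}] / l, summed with weights t^l.
series, G_l = G, G
for l in range(1, 40):
    G_l = scaled(commutator(A, G_l), 1.0 / l)
    series = add(series, G_l, scale=t ** l)

# Direct conjugation e^{tA} G e^{-tA}.
conjugated = matmul(matmul(expm(scaled(A, t)), G), expm(scaled(A, -t)))
lie_error = max(abs(u - v) for ru, rv in zip(series, conjugated) for u, v in zip(ru, rv))
```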

Consider now the operator $P_f$ of symbol $p_0 + f$, and its image under the quantum Lie transform generated by a symbol $w$ of order $\varepsilon$. Formally the symbol of the image is $p_0 + \{w,p_0\}_M + f + O(\varepsilon^2)$. For $w$ such that $\{w,p_0\}_M + f$ depends on $(x,\xi)$ only through the actions $I_l$ the transformed operator commutes with $P_0$ up to order $\varepsilon^2$. A similar $w$ can be determined and estimated by a purely classical method [Ba2] because, $p_0$ being a quadratic polynomial, the Moyal and Poisson brackets coincide. We recall that in Fourier transform representation, used throughout the paper, the Moyal bracket is (see e.g. [Fo], §3.4):

$$\big(\{g,g'\}_M\big)^{\wedge}(s) = \frac{2}{\hbar'}\int_{\mathbb{R}^{2n}}\hat g(s^1)\,\hat g'(s-s^1)\,\sin\left[\hbar'(s-s^1)\wedge s^1/2\right]ds^1, \tag{3.21}$$

where, given two vectors $s = (v,w)$ and $s^1 = (v^1,w^1)$, $s\wedge s^1 := \langle w,v^1\rangle - \langle v,w^1\rangle$. The main step in proving Proposition 3.1 is:

Proposition 3.2 (Iterative Lemma). Let the function

$$\mathcal{H}^{(k)} := p_0 + Z^{(k)} + f^{(k)} \tag{3.22}$$

be such that $Z^{(k)}, f^{(k)} \in \mathcal{A}_{\rho-kd,\sigma-k\delta}$ with $d < \rho/(k+1)$ and $\delta < \sigma/(k+1)$, and let $Z^{(k)}$ be a function of $(I_1,\dots,I_n)$ only. Assume that

$$\|Z^{(k)}\|_{\rho-kd,\sigma-k\delta} \le \begin{cases} 0 & \text{if } k = 0,\\[2pt] E\displaystyle\sum_{l=0}^{k-1}\mu_s^l & \text{if } k \ge 1, \end{cases} \tag{3.23}$$

$$\|f^{(k)}\|_{\rho-kd,\sigma-k\delta} \le E\mu_s^k \tag{3.24}$$

with

$$\mu_s := \frac{8c_\Psi E}{\delta^2 d^\tau}, \qquad c_\Psi := (\tau/e)^\tau\gamma^{-1}. \tag{3.25}$$

For $\hbar' > 0$ fixed, let $H^{(k)}$ be the Weyl quantization of $\mathcal{H}^{(k)}$. If $\mu_s < 1/2$ there exists a unitary transformation $T_k$ such that the Weyl symbol of the transformed operator $T_k H^{(k)} T_k^{-1} := H^{(k+1)}$ is given by (3.22) with $k+1$ in place of $k$ (i.e., $H^{(k+1)} = \mathrm{Op}^W(\mathcal{H}^{(k+1)})$), and satisfies (3.23), (3.24) with $k+1$ in place of $k$.


Remarks. 1. In the notation of Theorem 3.1,

$$\mathrm{Op}^W(Z^{(k)}) = \sum_{s=0}^{k}\zeta_s(P_0,\hbar',\varepsilon)\,\varepsilon^s + O(\varepsilon^{k+1}).$$

The eigenvalue $\zeta_s((l_1+1/2)\hbar'\omega_1,\dots,(l_n+1/2)\hbar'\omega_n,\hbar',\varepsilon)$, $l_i = 0,1,\dots$, $i = 1,\dots,n$, is the $k$-order coefficient of the quantum perturbation expansion of the eigenvalue of $P_R(\hbar')$ near the eigenvalue $(l_1+1/2)\hbar'\omega_1 + \dots + (l_n+1/2)\hbar'\omega_n$ of $P_0$.

2. The extra dependence on $\varepsilon$ of the coefficients $\zeta_s$ is due to the $\varepsilon$ dependence of the perturbation. The multiplying $\varepsilon$ is the expansion parameter. The extra dependence disappears when the perturbation is a homogeneous polynomial.

The proof of this proposition consists in performing the necessary estimates on the formal algorithm recalled above, to be described in the following lemmas. Let us introduce also the space $\mathcal{A}_\sigma$ of all functions $g : \mathbb{R}^{2n}\to\mathbb{C}$ such that

$$\|g\|_\sigma := \int_{\mathbb{R}^{2n}}|\hat g(s)|\,e^{\sigma|s|}\,ds < +\infty,$$

and let us estimate the $\mathcal{A}_\sigma$ norm of the Moyal bracket as in Lemma 4.1 of [Ba1].

Lemma 3.1. Let $g \in \mathcal{A}_\sigma$, $g' \in \mathcal{A}_{\sigma-\delta}$ be smooth. Then, $\forall\, 0 < \delta' < \sigma-\delta$:

$$\|\{g,g'\}_M\|_{\sigma-\delta-\delta'} \le \frac{1}{e^2\delta'(\delta+\delta')}\,\|g\|_\sigma\,\|g'\|_{\sigma-\delta}. \tag{3.26}$$

Proof. Since $|s\wedge s^1| \le |s|\,|s^1|$, by definition of the $\mathcal{A}_\sigma$-norm and (3.21), we get

$$\begin{aligned}
\|\{g,g'\}_M\|_{\sigma-\delta-\delta'} &\le 2\int_{\mathbb{R}^{2n}}e^{(\sigma-\delta-\delta')|s|}\,ds\int_{\mathbb{R}^{2n}}|\hat g(s^1)\,\hat g'(s-s^1)|\,\frac{|\sin(\hbar'(s-s^1)\wedge s^1/2)|}{\hbar'}\,ds^1\\
&\le 2\int_{\mathbb{R}^{2n}}\int_{\mathbb{R}^{2n}}e^{(\sigma-\delta-\delta')(|s|+|s^1|)}\,|\hat g(s)|\,|\hat g'(s^1)|\,\frac{|\sin(\hbar' s\wedge s^1/2)|}{\hbar'}\,ds\,ds^1\\
&\le \int_{\mathbb{R}^{2n}}\int_{\mathbb{R}^{2n}}e^{(\sigma-\delta-\delta')(|s|+|s^1|)}\,|\hat g(s)|\,|\hat g'(s^1)|\,|s\wedge s^1|\,ds\,ds^1\\
&\le \int_{\mathbb{R}^{2n}}e^{(\sigma-\delta-\delta')|s|}\,|\hat g(s)|\,|s|\,ds\cdot\int_{\mathbb{R}^{2n}}e^{(\sigma-\delta-\delta')|s^1|}\,|\hat g'(s^1)|\,|s^1|\,ds^1,
\end{aligned}$$

whence the assertion because $e^{-\delta x}x \le 1/(e\delta)$, $\forall x > 0$, $\forall\delta > 0$. □

To estimate the $\mathcal{A}_{\rho,\sigma}$ norm of the Moyal brackets we need to identify their angular Fourier coefficients.

Lemma 3.2. For any $\phi\in\mathbb{T}^n$, and for any smooth symbol $g$, there exists a unitary operator $X_\phi : L^2\to L^2$ with the property

$$\mathrm{Op}^W(g\circ\Psi_\phi) = X_\phi\,\mathrm{Op}^W(g)\,X_\phi^{-1}. \tag{3.27}$$

Proof. For any fixed $\phi$ the map $\Psi_\phi$ is a linear canonical transformation of $\mathbb{R}^{2n}$. Then (3.27) is a characteristic property of Weyl quantization (see e.g. [BS], §5.3). □


This lemma entails that the $k$-th spatial Fourier coefficient of an $\hbar$-pseudodifferential symbol is also an $\hbar$-pseudodifferential symbol.

Lemma 3.3. Let $g$ and $g'$ be smooth functions such that $g \equiv \tilde g_k$, $g' \equiv \tilde g'_l$ for some fixed $k,l\in\mathbb{Z}^n$. Then, denoting $h := \{g,g'\}_M$, one has $h \equiv \tilde h_{k+l}$.

Proof. By the former lemma we can write

$$\mathrm{Op}^W(h\circ\Psi_\phi) = X_\phi\,\mathrm{Op}^W(h)\,X_\phi^{-1} = \frac{1}{i\hbar'}\left[X_\phi\,\mathrm{Op}^W(g)X_\phi^{-1},\,X_\phi\,\mathrm{Op}^W(g')X_\phi^{-1}\right] = \mathrm{Op}^W(\{g\circ\Psi_\phi,\,g'\circ\Psi_\phi\}_M).$$

Moreover, by the bilinearity of the Moyal bracket and (3.9):

$$h\circ\Psi_\phi = \{e^{i\langle k,\phi\rangle}g,\,e^{i\langle l,\phi\rangle}g'\}_M = e^{i\langle k+l,\phi\rangle}\{g,g'\}_M = e^{i\langle k+l,\phi\rangle}h.$$

□

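We can now estimate the Moyal brackets and the individual terms of the quantum Lie transform expansion.

Lemma 3.4. Let $g \in \mathcal{A}_{\rho,\sigma}$ and $g' \in \mathcal{A}_{\rho,\sigma-\delta}$. Then, for any positive $\delta' < \sigma-\delta$:

$$\|\{g,g'\}_M\|_{\rho,\sigma-\delta-\delta'} \le \frac{1}{e^2\delta'(\delta+\delta')}\,\|g\|_{\rho,\sigma}\,\|g'\|_{\rho,\sigma-\delta}.$$

Proof. Denoting once more $h = \{g,g'\}_M$, by Lemma 3.3 and (3.26) we can write

$$\begin{aligned}
\|h\|_{\rho,\sigma-\delta-\delta'} &= \sum_{k\in\mathbb{Z}^n}\|\tilde h_k\|_{\sigma-\delta-\delta'}\,e^{\rho|k|} \le \sum_{k,l\in\mathbb{Z}^n}e^{\rho|k+l|}\,\|\{\tilde g_k,\tilde g'_l\}_M\|_{\sigma-\delta-\delta'}\\
&\le \sum_{k,l\in\mathbb{Z}^n}e^{\rho|k|+\rho|l|}\,\frac{\|\tilde g_k\|_\sigma\,\|\tilde g'_l\|_{\sigma-\delta}}{e^2\delta'(\delta+\delta')} = \frac{\|g\|_{\rho,\sigma}\,\|g'\|_{\rho,\sigma-\delta}}{e^2\delta'(\delta+\delta')}.
\end{aligned}$$

□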

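Lemma 3.5. Let $g \in \mathcal{A}_{\rho,\sigma}$ and $w \in \mathcal{A}_{\rho,\sigma}$ be two smooth symbols, and define

$$g_r := \frac{1}{r}\{w,g_{r-1}\}_M, \quad r \ge 1; \qquad g_0 := g.$$

Then $g_r \in \mathcal{A}_{\rho,\sigma-\delta}$ for any $0 < \delta < \sigma$, and the following estimate holds:

$$\|g_r\|_{\rho,\sigma-\delta} \le \left(\delta^{-2}\|w\|_{\rho,\sigma}\right)^r\|g\|_{\rho,\sigma}.$$

Proof. Fix $r$, denote $\tilde\delta := \delta/r$, and look for a numerical sequence $C_l^{(r)}$ such that $\|g_l\|_{\rho,\sigma-\tilde\delta l} \le C_l^{(r)}$. By (3.26) a similar sequence can be recursively defined as follows:

$$C_0^{(r)} = \|g\|_{\rho,\sigma}, \qquad C_l^{(r)} = \frac{r^2}{l^2\delta^2e^2}\,C_{l-1}^{(r)}\,\|w\|_{\rho,\sigma}. \tag{3.28}$$

Therefore we get

$$C_r^{(r)} = \frac{1}{(r!)^2}\left(\frac{r^2\|w\|_{\rho,\sigma}}{\delta^2e^2}\right)^r\|g\|_{\rho,\sigma}.$$

Denoting $C_r := C_r^{(r)}$, clearly one has $\|g_r\|_{\rho,\sigma-\delta} \le C_r$. Moreover, $C_1 = e^{-2}\delta^{-2}\|w\|_{\rho,\sigma}\|g\|_{\rho,\sigma}$ and, for $r \ge 2$,

$$C_r = \left(\frac{r}{r-1}\right)^{2(r-1)}e^{-2}\,\frac{\|w\|_{\rho,\sigma}}{\delta^2}\,C_{r-1} \le \frac{\|w\|_{\rho,\sigma}}{\delta^2}\,C_{r-1},$$

so that the sequence $(\delta^{-2}\|w\|_{\rho,\sigma})^r\|g\|_{\rho,\sigma}$ majorizes $C_r$, and this proves the lemma. □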
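The combinatorial step in the proof of Lemma 3.5 can be checked numerically. The sketch below is illustrative only: the values of $\delta$, $\|w\|_{\rho,\sigma}$ and $\|g\|_{\rho,\sigma}$ are hypothetical. It iterates the recursion $C_l^{(r)} = r^2\|w\|\,C_{l-1}^{(r)}/(l^2\delta^2e^2)$ and compares $C_r^{(r)}$ with the claimed bound $(\delta^{-2}\|w\|)^r\|g\|$. The ratio of the two equals $r^{2r}/((r!)^2e^{2r})$, which stays below one.

```python
import math

def c_recursive(r, delta, w_norm, g_norm):
    """C_r^{(r)} obtained by iterating the recursion of the proof of Lemma 3.5."""
    c = g_norm
    for l in range(1, r + 1):
        c *= r * r * w_norm / (l * l * delta * delta * math.e ** 2)
    return c

def bound(r, delta, w_norm, g_norm):
    """Right-hand side of the estimate of Lemma 3.5."""
    return (w_norm / delta ** 2) ** r * g_norm

def ratio(r):
    """C_r^{(r)} / bound = r^(2r) / ((r!)^2 e^(2r)), computed through logarithms."""
    return math.exp(2 * r * math.log(r) - 2 * math.lgamma(r + 1) - 2 * r)

# Hypothetical norms and width (assumptions made for illustration only).
delta, w_norm, g_norm = 0.1, 2.0, 3.0
checks = [c_recursive(r, delta, w_norm, g_norm) <= bound(r, delta, w_norm, g_norm)
          for r in range(1, 31)]
ratios = [ratio(r) for r in range(1, 200)]
```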

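We now construct and estimate the solution of the “quantum” homological equation.

Lemma 3.6. Let $g \in \mathcal{A}_{\rho,\sigma}$. Then the homological equation

$$\{p_0,w\}_M + Z = g \tag{3.29}$$

admits analytic solutions $w$, $Z$ such that $Z\circ\Psi_\phi = Z$; moreover, for any $d < \rho$:

$$\|Z\|_{\rho,\sigma} \le \|g\|_{\rho,\sigma}; \qquad \|w\|_{\rho-d,\sigma} \le \frac{c_\Psi}{d^\tau}\,\|g\|_{\rho,\sigma}; \qquad c_\Psi := \left(\frac{\tau}{e}\right)^\tau\frac{1}{\gamma}.$$

Proof. Define

$$Z := \tilde g_0; \qquad w := \sum_{k\ne 0}\frac{\tilde g_k}{i\langle\omega,k\rangle}. \tag{3.30}$$

Equation (3.30) solves (3.29). Since $p_0$ is quadratic in $z = (x,\xi)$ one has indeed:

$$\{p_0,w\}_M(z) = \{p_0,w\}(z) = \frac{d}{dt}\Big|_{t=0}w(\Psi_{\omega t}(z)) = \frac{d}{dt}\Big|_{t=0}\sum_{k\ne 0}\frac{\tilde g_k(\Psi_{\omega t}(z))}{i\langle\omega,k\rangle} = \frac{d}{dt}\Big|_{t=0}\sum_{k\ne 0}\frac{\tilde g_k(z)\,e^{i\langle\omega t,k\rangle}}{i\langle\omega,k\rangle} = \sum_{k\ne 0}\tilde g_k(z).$$

The estimate of $Z$ is trivial, while that of $w$ is obtained as follows:

$$\|w\|_{\rho-d,\sigma} = \sum_{k\ne 0}\frac{\|\tilde g_k\|_\sigma\,e^{\rho|k|-d|k|}}{|\langle\omega,k\rangle|} \le \|g\|_{\rho,\sigma}\sup_{k\ne 0}\frac{e^{-d|k|}}{|\langle\omega,k\rangle|} \le \|g\|_{\rho,\sigma}\sup_{m\ne 0}\frac{e^{-dm}m^\tau}{\gamma} = \frac{\|g\|_{\rho,\sigma}}{\gamma}\left(\frac{\tau}{ed}\right)^\tau,$$

and this proves the lemma. □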
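The construction (3.30) can be tried on a concrete symbol in the case $n = 1$, where $\langle\omega,k\rangle = \omega k$ and no small divisors occur. The following sketch rests on hypothetical choices (the frequency `OMEGA`, the quadratic test symbol `g` and the base point `z0` are not from the paper). It computes $Z = \tilde g_0$ and $w = \sum_{k\ne0}\tilde g_k/(i\omega k)$ by quadrature and checks numerically that $\frac{d}{dt}w(\Psi_{\omega t}(z))|_{t=0} + Z = g$, the derivative along the flow being the expression used for $\{p_0,w\}_M$ in the proof.

```python
import cmath
import math

OMEGA = 1.3  # hypothetical frequency; n = 1, so <omega, k> = OMEGA * k

def psi(phi, z):
    """Action (3.7)-(3.8) for n = 1."""
    x, xi = z
    return ((xi / OMEGA) * math.sin(phi) + x * math.cos(phi),
            xi * math.cos(phi) - OMEGA * x * math.sin(phi))

def g(z):
    """Hypothetical test symbol, quadratic in (x, xi): harmonics k = -2, 0, 2 only."""
    x, xi = z
    return x * x + 0.5 * x * xi

def coefficient(k, z, samples=64):
    """Angular Fourier coefficient of g of order k at the point z."""
    return sum(g(psi(2 * math.pi * j / samples, z))
               * cmath.exp(-2j * math.pi * k * j / samples)
               for j in range(samples)) / samples

def w(z, kmax=4):
    """Solution (3.30): sum over k != 0 of g~_k / (i <omega, k>)."""
    return sum(coefficient(k, z) / (1j * OMEGA * k)
               for k in range(-kmax, kmax + 1) if k != 0)

z0 = (0.7, -0.2)
Z = coefficient(0, z0)  # Z := g~_0, the average of g along the flow

# d/dt w(Psi_{omega t}(z0)) at t = 0, approximated by a central difference.
h = 1e-5
dw_dt = (w(psi(OMEGA * h, z0)) - w(psi(-OMEGA * h, z0))) / (2 * h)

residual = abs(dw_dt + Z - g(z0))
```

For $n \ge 2$ the same formula holds, but the denominators $\langle\omega,k\rangle$ can become small, which is where the Diophantine condition (1.6) and the loss $d$ in the analyticity width enter the estimate of $w$.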

Our final lemma deals with $\mathcal{A}_{\rho,\sigma}$ estimates of the individual terms in the expansion of the quantum Lie transform of $p_0$ (remark that $p_0 \notin \mathcal{A}_{\rho,\sigma}$).

Lemma 3.7. Let $g \in \mathcal{A}_{\rho,\sigma}$ be a smooth symbol, and $w$ be the solution of the homological equation (3.29). Define the sequence $p_{r0}$, $r = 0,1,\dots$, as follows:

$$p_{00} := p_0; \qquad p_{r0} := \frac{1}{r}\{w,p_{r-1,0}\}_M, \quad r \ge 1.$$

Then, for any $0 < d < \rho$, $0 < \delta < \sigma$, $p_{r0} \in \mathcal{A}_{\rho-d,\sigma-\delta}$ and fulfills the following estimate:

$$\|p_{r0}\|_{\rho-d,\sigma-\delta} \le 2\left(\delta^{-2}\|w\|_{\rho-d,\sigma}\right)^{r-1}\|g\|_{\rho-d,\sigma}, \qquad r \ge 1.$$

Proof. It is enough to notice that, by (3.29), $\{w,p_0\}_M = Z - g$, whence $\|p_{10}\|_\sigma \le 2\|g\|_\sigma$. Hence the argument of Lemma 3.5 can be taken over to the present case. □

Proof of Proposition 3.2. Consider the homological equation

$$\{p_0,w\}_M + Z_k = f^{(k)}, \tag{3.31}$$

where $Z_k$ depends on $z = (x,\xi)$ only through $I_1,\dots,I_n$. By Lemma 3.6 and (3.24) $w$ and $Z_k$ exist and fulfill the estimate

$$\|w\|_{\rho-(k+1)d,\sigma-k\delta} \le \frac{c_\Psi}{d^\tau}E\mu_s^k; \qquad \|Z_k\|_{\rho-kd,\sigma-k\delta} \le E\mu_s^k.$$

Define $Z^{(k+1)} := Z^{(k)} + Z_k$ and

$$f^{(k+1)} := \sum_{l\ge 1}Z_l^{(k)} + \sum_{l\ge 1}f_l^{(k)} + \sum_{l\ge 2}p_{l0}; \qquad Z_0^{(k)} := Z^{(k)}; \quad Z_l^{(k)} = \frac{1}{l}\{w,Z_{l-1}^{(k)}\}_M, \quad l \ge 1,$$

and analogous definitions for $f_l^{(k)}$, $p_{l0}$. Then the symbol of the transformed operator has the form (3.22) with $k+1$ in place of $k$. To get the estimate, let $k \ge 1$. Set

$$\eta := \delta^{-2}\|w\|_{\rho-(k+1)d,\sigma-k\delta} \le \frac{c_\Psi}{d^\tau\delta^2}E\mu_s^k < 1/2.$$

By Lemmas 3.6, 3.7, 3.5 we can write

$$\begin{aligned}
\sum_{l\ge 1}\|f_l^{(k)}\|_{\rho-(k+1)d,\sigma-(k+1)\delta} &\le E\mu_s^k\sum_{l\ge 1}\eta^l = \frac{E\mu_s^k\,\eta}{1-\eta} \le 2\eta E\mu_s^k,\\
\sum_{l\ge 1}\|Z_l^{(k)}\|_{\rho-(k+1)d,\sigma-(k+1)\delta} &\le \|Z^{(k)}\|_{\rho-kd,\sigma-k\delta}\,\frac{\eta}{1-\eta} \le 4E\eta,\\
\sum_{l\ge 2}\|p_{l0}\|_{\rho-(k+1)d,\sigma-(k+1)\delta} &\le 4\mu_s^k\eta E,
\end{aligned}$$

whence the assertion in a straightforward way. This proves Proposition 3.2. □

Proof of Proposition 3.1. Apply $k$ times Proposition 3.2 to the operator $P_f(\hbar')$ of symbol $p_0 + f$ with $d := \rho/(2k)$, $\delta := \sigma/(2k)$. This yields a unitary map $T$ such that the symbol of $T P_f(\hbar') T^{-1}$ has the form (3.22), $Z^{(k)}$ and $f^{(k)}$ fulfilling the estimates

$$\|Z^{(k)}\|_{\rho/2,\sigma/2} \le 2E, \qquad \|f^{(k)}\|_{\rho/2,\sigma/2} \le E\left(\frac{2^{\tau+5}k^{\tau+2}Ec_\Psi}{\rho^\tau\sigma^2}\right)^k. \tag{3.32}$$


The choice $k := (\varepsilon_0\rho^\tau\sigma^2/\varepsilon)^{1/(\tau+2)}$ yields the estimate (3.17); $k > 1$ by (3.13). This proves Proposition 3.1. □

Let us finally prove Theorem 3.1 approximating $a_\varepsilon$ by a function fulfilling the conditions of Proposition 3.1. First we show the existence of a constant $\mu > 0$ such that

$$\langle\|a_\varepsilon\|\rangle := \sum_{k\in\mathbb{Z}^n}e^{\mu|k|^\beta}\int_{\mathbb{R}^{2n}}|\hat{\tilde a}_{\varepsilon,k}(s)|\,e^{\mu|s|^\beta}\,ds < +\infty, \tag{3.33}$$

where $\beta := 1/(2(\ell+1))$. The proof of this estimate is divided in two lemmas.

Lemma 3.8. There exist constants $C^1$ and $C$ such that:

$$\sup_{z\in\mathbb{R}^{2n},\,\phi\in\mathbb{T}^n}\left|\partial^{|j|}_{\zeta^j}a_\varepsilon(\Psi_\phi(z))\right| \le C^1C^{-|j|}(|j|!)^{\ell+1}.$$

Here $\ell$ is the Gevrey constant of $a_\varepsilon$ and $\zeta_i$ denotes a phase variable $z_i$ or an angle $\phi_i$.

Proof. Denote $h_\varepsilon(z,\phi) := a_\varepsilon(\Psi_\phi(z))$, $N = |j|$ and consider $\partial_\zeta^N h_\varepsilon$, where $\zeta$ is a fixed variable; we can write

$$\partial_\zeta^N h_\varepsilon = \sum_{k=2}^{N+2}C_k^N O_k, \tag{3.34}$$

where $C_k^N$ are positive integers, and $C_k^N O_k$ denotes a sum of $C_k^N$ terms $O_k$ of the form

$$O_k = (d^{k-1}a_\varepsilon)(\Psi_\phi(z))\,G_1(z,\phi)\cdots G_{k-1}(z,\phi), \qquad k \le N.$$

Here $(d^k a_\varepsilon)(\Psi_\phi(z))$ denotes the $k$-th differential of $a_\varepsilon$ evaluated at $\Psi_\phi(z)$, and $G_i$, $i = 1,\dots,k-1$, is a partial derivative of order at most $N$ of $\Psi_\phi(z)$. By (3.7), (3.8) one clearly has $\sup_{|z|\le 2R,\,\phi\in\mathbb{T}^n}\|G_i(z,\phi)\| \le R_2$ for some $R_2 > 0$, whence, by (2.2),

$$\sup_{z\in\mathbb{R}^{2n},\,\phi\in\mathbb{T}^n}|O_k| \le \sum_{|j|=k}\,\sup_{|z|\le 2R}\left|\partial^{k}_{z^j}a_\varepsilon(z)\right|R_2^{k-1} \le K(R)\,\alpha^{-k}k^{2n}R_2^{k-1}(k!)^\ell \le K(R)\,\alpha^{-k}N^{2n}R_2^{k-1}(k!)^\ell.$$

Hence to estimate (3.34) we just have to estimate the coefficients $C_k^N$. To this end, remark that we can symbolically write $\partial_\zeta O_k = O_{k+1} + (k-1)O_k$. The l.h.s. is majorized by the r.h.s. via triangular inequality, because in the estimate of $O_k$ each $G_j$ and all its derivatives clearly fulfill the same bound. This easily yields

$$C^N_{N+1} = C^N_2 = 1, \qquad C^N_k = C^{N-1}_k(k-1) + C^{N-1}_{k-1}, \quad 3 \le k \le N.$$

Hence, defining $S_N := \sum_{k=2}^{N+1}C_k^N$ one has $S_N = \sum_{k=2}^{N}kC_k^{N-1} < NS_{N-1}$, whence $S_N < N!$. This proves the result when all derivatives are performed with respect to the same variable, but the same argument can be applied in general. This proves the lemma. □
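The counting argument at the end of the proof of Lemma 3.8 can be reproduced by tabulating the coefficients. The sketch below is an illustration: the table layout and the cutoff `n_max` are arbitrary choices. It builds $C^N_k$ from the stated recursion and compares $S_N = \sum_k C^N_k$ with $N!$. In this tabulation the bound holds with equality for $N = 1, 2$ and strictly afterwards, so the check is written with $\le$.

```python
import math

def coefficients(n_max):
    """Rows C[N][k], k = 2..N+1, from C^N_{N+1} = C^N_2 = 1 and
    C^N_k = (k - 1) * C^{N-1}_k + C^{N-1}_{k-1} for 3 <= k <= N."""
    table = {1: {2: 1}}
    for n in range(2, n_max + 1):
        prev = table[n - 1]
        row = {2: 1, n + 1: 1}
        for k in range(3, n + 1):
            row[k] = (k - 1) * prev[k] + prev[k - 1]
        table[n] = row
    return table

table = coefficients(12)

# S_N := sum over k of C^N_k, to be compared with N!.
sums = {n: sum(row.values()) for n, row in table.items()}

# Identity used in the proof: S_N equals the sum over k of k * C^{N-1}_k.
shifted = {n: sum(k * c for k, c in table[n - 1].items()) for n in range(2, 13)}
```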


Lemma 3.9. Under the assumption of Theorem 3.1 there exist $\tilde\alpha$ and $C_7$ such that

$$\left|\hat{\tilde a}_{\varepsilon,k}(s)\right| \le C_7\exp\left\{-\left[(\tilde\alpha|k|)^{1/2(\ell+1)} + (\tilde\alpha|s|)^{1/2(\ell+1)}\right]\right\}. \tag{3.35}$$

Proof. Performing $l \ge 1$ partial integrations, the above lemma and the uniform Gevrey property (2.2) of $a_\varepsilon\circ\Psi$ yield, for some $C_8 > 0$, $C_9 > 0$:

$$\left|\hat{\tilde a}_{\varepsilon,k}(s)\right| \le C_8\,(C_9|(k,s)|)^{-l}\,(l!)^{\ell+1},$$

where $|(k,s)| := \sum_{p=1}^{n}|k_p| + \sum_{p=1}^{2n}|s_p|$. Set $l = l(|(k,s)|) := (C_9|(k,s)|/e)^{1/(\ell+1)}$. We get

$$\left|\hat{\tilde a}_{\varepsilon,k}(s)\right| \le C_{10}\exp\left\{-\left[C_9|(k,s)|/e\right]^{1/(\ell+1)}\right\}.$$

Now remark that $|(k,s)| = |k| + |s|$; moreover, $x^{2l} + y^{2l} > (x+y)^l$ $\forall\, l\in\mathbb{N}$ whenever $x > 2$, $y > 2$. Therefore, for $\min(|k|,|s|) > 2^{2(\ell+1)}$:

$$(|k|+|s|)^{\frac{1}{\ell+1}} > |k|^{\frac{1}{2(\ell+1)}} + |s|^{\frac{1}{2(\ell+1)}}. \tag{3.36}$$

This yields the lemma for $\min(|k|,|s|) > 2^{2(\ell+1)}$; its validity when (3.36) does not hold is ensured by suitably choosing the constant $C_7$. This proves the lemma. □

Taking $\mu$ suitably small, it follows immediately from this lemma:

Corollary 3.1. There is $C_{11} > 0$ independent of $\varepsilon$ such that

$$\langle\|a_\varepsilon\|\rangle \le C_{11}. \tag{3.37}$$

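Proof of Theorem 3.1, Assertions 1, 2. For $\Lambda > 0$ and $K > 0$ fixed, approximate $a_\varepsilon$ by

$$a_\varepsilon^{\Lambda,K}(z) := \sum_{|k|\le K}\int_{|s|<\Lambda}\hat{\tilde a}_{\varepsilon,k}(s)\,e^{i\langle s,z\rangle}\,ds. \tag{3.38}$$

We have:

$$a_\varepsilon^{\Lambda,\infty}(z) - a_\varepsilon^{\Lambda,K}(z) = \sum_{|k|>K}\int_{|s|<\Lambda}\hat{\tilde a}_{\varepsilon,k}(s)\,e^{i\langle s,z\rangle}\,ds.$$

Therefore, denoting $A_\varepsilon^{\Lambda,K} = \mathrm{Op}^W(a_\varepsilon^{\Lambda,K})$, $A_\varepsilon^{\Lambda,\infty} = \mathrm{Op}^W(a_\varepsilon^{\Lambda,\infty})$, by (3.12) we can write

$$\|A_\varepsilon^{\Lambda,K} - A_\varepsilon^{\Lambda,\infty}\|_{L^2\to L^2} \le \sum_{|k|>K}\int_{|s|<\Lambda}|\hat{\tilde a}_{\varepsilon,k}(s)|\,ds \le C_{11}e^{-\mu K^\beta}, \tag{3.39}$$

$$\|A_\varepsilon^{\Lambda,\infty} - A_\varepsilon\|_{L^2\to L^2} \le C_{11}e^{-\mu\Lambda^\beta}. \tag{3.40}$$

Now choose $\rho = \sigma = \varepsilon^\alpha$, $K = \Lambda = \varepsilon^{-\alpha}$ with a still undetermined $\alpha$. On account of (3.37), (3.39), (3.40) this yields:

$$\|A_\varepsilon^{\Lambda,K} - A_\varepsilon\|_{L^2\to L^2} \le 2C_{11}\exp\left[-\mu/\varepsilon^{\alpha\beta}\right].$$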


Now apply Proposition 3.1 to the operator P0 + A3,K of symbol p0 + a3,K . Then: E := ka 3,K kρ,σ ≤ eρK+σ 3 hkaki = e2 hkaki, and hence, by (3.37): E ≤ eC11 ≤ 0 ρ τ σ 2 = 0 α(τ +2)

(3.41)

holds provided $\alpha < 1/(\tau+2)$ and $\varepsilon$ is suitably small. Hence, considering the symbol $p_0 + Z^{\Lambda,K} + R^{\Lambda,K}$ of the transformed operator, by (3.16) and (3.12) we have

$$\|\mathrm{Op}^W(Z^{\Lambda,K})\|_{L^2\to L^2} \le 2e^2C_{11},$$

while by Assertion 1 of Proposition 3.1 the principal symbol of $Z^{\Lambda,K}$ coincides with the classical normal form computed up to order

$$N(\varepsilon) := \varepsilon_0^{1/(\tau+2)}\big/\varepsilon^{\frac{1}{\tau+2}-\alpha}. \tag{3.42}$$

Moreover, the difference between the cut-off operator $P_0 + A_\varepsilon^{\Lambda,K}$ and $P_0 + A_{\varepsilon,R}$ being exponentially small, we can put in the remainder the exponentially small difference between the normal form of $P_0 + A_{\varepsilon,R}$ and $Z^{\Lambda,K}$. On the other hand, by (3.17):

$$\|\mathrm{Op}^W(R^{\Lambda,K})\|_{L^2\to L^2} \le e^3C_{11}\exp\left[-(\tau+2)\,\varepsilon_0^{1/(\tau+2)}\,\varepsilon^{\alpha-\frac{1}{\tau+2}}\right].$$

We choose $\alpha$ so as to optimize the remainder, i.e. in such a way that

$$\alpha\beta = \frac{1}{\tau+2} - \alpha \;\Longrightarrow\; b := \alpha\beta = \frac{1}{(\tau+2)(2\ell+3)}.$$

To conclude the proof of Assertions 1 and 2 of Theorem 3.1 it is enough to set:

$$\varepsilon_* := \varepsilon_0^{1/(\tau+2)}, \qquad C_3 = 2e^2C_{11}, \qquad C_4 = (4+e^3)C_{11}, \qquad C_5 = \min(\mu,(\tau+2)\varepsilon_*).$$

□
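The choice of $\alpha$ above is a one-line computation: with $\beta = 1/(2(\ell+1))$, the balance condition $\alpha\beta = 1/(\tau+2) - \alpha$ gives $\alpha = 1/((\tau+2)(1+\beta))$ and hence $b = \alpha\beta = 1/((\tau+2)(2\ell+3))$. The short sketch below checks this in exact rational arithmetic; the sample values of $\tau$ and $\ell$ are hypothetical.

```python
from fractions import Fraction

def optimal_exponents(tau, ell):
    """Solve alpha * beta = 1/(tau + 2) - alpha with beta = 1/(2(ell + 1)).

    Returns (alpha, b) with b = alpha * beta."""
    beta = Fraction(1, 2 * (ell + 1))
    alpha = (Fraction(1) / (Fraction(tau) + 2)) / (1 + beta)
    return alpha, alpha * beta

def closed_form_b(tau, ell):
    """Exponent b = 1 / ((tau + 2)(2 ell + 3)) as stated in Theorem 3.1."""
    return Fraction(1) / ((Fraction(tau) + 2) * (2 * ell + 3))

# Hypothetical sample values (assumptions made for illustration only).
alpha, b = optimal_exponents(tau=3, ell=2)   # alpha == 6/35, b == 1/35
agreement = all(optimal_exponents(t, l)[1] == closed_form_b(t, l)
                for t in range(1, 8) for l in range(1, 8))
```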

Assertion 3 will be a consequence of the property 3 a (z/) = V (z), ∀|z| ≤ R/2. We actually show that this property holds (mod ((h¯ 0 )∞ ), and on a sphere of radius smaller than R/2) for the symbols recursively defined by the normal form algorithm. Let w , g ∈ 6 1 (R2n ) (notations of [Ro]) be two smooth bounded semiclassical symbols with the property that, for some w, ¯ g¯ (independent of ) and some κ > 0: ¯ h¯ ), 3 g (z/, h¯ / 2 ) = g(z, ¯ h¯ ), |z| ≤ R/2. 3 w (z/, h¯ / 2 ) = w(z, Let M > 0 be fixed. Then the following properties are easily verified: 1. Denote d := {g , w }M . Then for all positive δ < R one has, as h¯ 0 → 0: ¯ h¯ ) + O((h¯ 0 )M ) , ∀|z| < (R − δ)/2. 3 d (z/; h¯ / 2 ) = d(z;


2. Let g˜ ,k be the angular Fourier coefficients of g . Then: 3 g˜ ,k (z/; h¯ / 2 ) = g¯˜ k (z; h¯ ), ∀|z| < R/2. 3. Let ϕt be the flow generated by the Hamiltonian w . Then (as → 0) 3 ϕt (z/; h¯ / 2 ) = ϕ¯t (z; h¯ ) + O((h¯ 0 )M ) , ∀|z| < (R − δ)/2. 0

0

4. Denote W := Op W (w ) and gw the symbol of eiW /h¯ OpW (g )e−iW /h¯ . By the semiclassical Egorov theorem ([Ro], IV.10) one has (as h¯ 0 → 0 and → 0) 3 gw (z/; h¯ / 2 ) = g¯ w (z; h¯ ) + O((h¯ 0 )M ) , ∀|z| < (R − δ). Then Assertion 3 is an immediate consequence of the uniqueness of the normal form of nonresonant systems and of the following lemma. Lemma 3.10. Let M > 0 be fixed and suitably large, and let the function (k)

(k)

H(k) := p0 + Z (k) + k+1 S1 + S2

(3.43)

be such that Z (k) depends on (I1 , . . . , In ) only. Assume that (k)

(k)

3 Z (k) (z/, h¯ / 2 ) = Z¯ (k) (z, h¯ ) , 3 S1 (z/, h¯ / 2 ) = S¯1 (z, h¯ ) (k)

(3.44) (k)

for |z| ≤ (R − δk)/2. Assume furthermore Z (k) , S1 ∈ 6 1 (R2n ) and S2 = Ok ((h¯ 0 )M ). Then there exists a unitary transformation Tk such that the transformed Schrödinger operator Tk H (k) Tk−1 := H (k+1) has a Weyl symbol given by (3.43,3.44) with k + 1 in place of k: H (k+1) = OpW (H(k+1) ). Proof. We construct the generator w of the transformation as in the proof of Lemma 3.2, (k) (k) with k+1 S1 in place of f (k) . Remark that, since S1 ∈ C ∞ (R2n ), its angular Fourier (k) coefficient S˜ are O(|l|−∞ ). Therefore by (1.6) w exists in C ∞ (R2n ), uniformly with 1,l

respect to . Moreover, by Property 2, 3 w(z/, h¯ / 2 ) = w(z, ¯ h¯ ). Then in the notation of Remark 4 above we have (k),w

S1

0 (k),w = S¯1 + O(h¯ ,M ) , Z (k),w = Z¯ (k),w + O(h¯ M ), |z| ≤ (R − (k + 1)δ)/2. 0

Now by the homological equation we easily get: p0w = p¯ 0w + O(h¯ M ). Hence let us set: (k+1)

S1

(k),w (k) := S¯1 − S1 + p¯ 0w − p0 − k+1 {w, p0 }M + Z¯ (k),w − Z (k) , (3.45) (k)

Z (k+1) := Z (k) + k+1 S˜1,0 .

(3.46)

(k),w and the other transformed It is now easy to see, using the explicit formula for S¯1 symbols (see once more [Ro], Theorem IV.10) that the l.h.s. of (3.45) is actually of order . Then the assertion is a direct consequence of the above listed properties. This proves the lemma. u t


Proof of Theorem 1.1. By Theorem 2.1 and Theorem 3.1 the eigenvalues λk (h¯ 0 , ), k ∈ (N ∪ {0})n of P (h¯ 0 , ) admit the approximate quantization formula 1 λs (h¯ 0 , ) = ω · (k + )h¯ 0 Z((k1 + 1/2)h¯ 0 , . . . , (kn + 1/2)h¯ 0 , h¯ 0 , ) 2 ˜ + R(, h¯ 0 ), ˜ R(, h¯ ) = O(C4 e 0

−(C5 /)b

) + O(e

−c/h¯ 0 1/`

),

(3.47) (3.48)

where Z() = Z N() is given by (3.2) and the constants have been determined above. On the other hand, Spec[P (h¯ 0 )] = 2 Spec[P (h¯ 0 , )]. Hence, λ ∈ Spec[P (h¯ )] ∩ [0, 2 ] ⇐⇒ λ/ 2 ∈ Spec[P (h¯ 0 , )] ∩ [0, 1]. This yields (1.8), i.e. Assertion (1). To see (1.9), for any fixed M < (∗ /)b write: Z((k + 1/2)h¯ / 2 , h¯ / 2 , ) =

M X

ζs ((k + 1/2)h¯ / 2 , h¯ / 2 ) s

s=1

+

N() X

ζs ((k + 1/2)h¯ / 2 , h¯ / 2 ) s .

s=M

The second addend is O( M ) by construction. By Assertion 3 of 3.1, 3

M X

ζs ((k + 1/2)h¯ / 2 , h¯ / 2 ) s

s=1 ˜M

=Z

((k1 + 1/2)h¯ , . . . , (kn + 1/2)h¯ , h¯ ) + OM ((h¯ / 2 )∞ ).

˜ note that by [Sj], Theorem 1.4 the quantum normal form is constructed To identify Z, as a h pseudodifferential operator commuting with P0 by quantization of the classical Lie transform as in Sect. 3. Hence by Remark 2 after Proposition 3.2 and Assertion 3, formula (3.5), F˜ = F. This concludes the proof of Theorem 1.1. u t Proof of Corollary 1.1. Consider now the scaling parameter , which can be chosen as an arbitrary function (h¯ ) vanishing as h¯ → 0 subject to the only requirement lim h¯ (h¯ )−2 → 0.

h¯ →0

(3.49)

Let

2 = ϕ(h¯ ) :]0, 1[: R+ i be such that ϕ(h¯ ) exp −(C5 /ϕ(h¯ ))b = O(h¯ ∞ ). This condition is satisfied by any h

positive, increasing function ϕ(h¯ ) on ]0, 1] with lim ϕ(h¯ )b ln h¯ → 0. The requirement h¯ →0 h¯ /ϕ(h¯ )b/2+1 → 0 ensures that exp − 2 /h¯ exp − A/ϕ(h¯ )b/2 . This choice of fulfills (3.49) in both cases, and hence (1.11) is proved. u t Acknowledgement. We thank A. Martinez for suggesting the proof of Proposition 2.2.


References

[Ba1] Bambusi, D.: Uniform Nekhoroshev estimates on quantum normal forms. Nonlinearity 8, 93–105 (1995)
[Ba2] Bambusi, D.: Exponential Stability of Breathers in Hamiltonian Networks of Weakly Coupled Oscillators. Nonlinearity 9, 433–457 (1996)
[BS] Berezin, F.A. and Shubin, M.S.: The Schrödinger Equation. Amsterdam: Kluwer, 1991
[BV] Bellissard, J., Vittot, M.: Heisenberg Picture and non Commutative Geometry of Classical Limit in Quantum Mechanics. Ann. Inst. H. Poincaré 52, 175–235 (1990)
[CDS] Combes, J.M., Duclos, P., Seiler, R.: The Born-Oppenheimer Approximation. In: NATO ASI Series B 72, R.: Plenum, New York 1981, 363–391
[DGH] Degli Esposti, M., Graffi, S., Herczynski, J.: Exact quantization of the classical Lie algorithm. Ann. Phys. (N.Y.) 208, 363–391 (1991)
[Fo] Folland, G.: Harmonic analysis in phase space. Princeton, NJ: Princeton University Press, 1988
[Gi] Giorgilli, A.: Rigorous results on the power expansions for the integrals of a Hamiltonian system. Ann. Inst. H. Poincaré 48, 423–439 (1988)
[He] Helffer, B.: h-pseudodifferential operators and applications: An introduction. Lectures at IMA: Quasiclassical methods, 1995
[HS] Helffer, B., Sjöstrand, J.: Multiple wells in the semiclassical limit. Commun. Part. Differ. Eqs. 9, 337–408 (1984)
[Ka] Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1966–1976
[Ma] Martinez, A.: Microlocal Exponential Estimates and Applications to Tunneling. In: NATO ASI Series C 490, Amsterdam: Kluwer, 1997, pp. 349–378
[Ne] Nekhoroshev, N.N.: Exponential estimate of the stability time of near integrable Hamiltonian systems. Russ. Math. Surveys 32 (6), 1–65 (1977)
[Po] Popov, V.: Invariant tori effective stability and quasimodes with exponentially small terms. Preprint 1997
[Ro] Robert, D.: Autour de l’approximation semiclassique. Basel: Birkhäuser, 1987
[RS] Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. IV. New York: Academic Press, 1978
[Si] Simon, B.: Semiclassical analysis of low-lying eigenvalues, I. Ann. Inst. H. Poincaré 38B, 295–307 (1983)
[Sj] Sjöstrand, J.: Semi-excited levels in non-degenerate potential wells. Asymptotic Analysis 6, 29–43 (1992)

Communicated by B. Simon

Commun. Math. Phys. 207, 197 – 229 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

The Wulff Construction in Three and More Dimensions

T. Bodineau

CNRS – UMR 7599, Universités Paris 6 and 7, Laboratoire de Probabilités et Modèles Aléatoires, Département de Mathématiques, Case 7012, 2 place Jussieu, 75251 Paris, France

Received: 5 January 1999 / Accepted: 27 April 1999

Abstract: In this paper we prove the Wulff construction in three and more dimensions for an Ising model with nearest neighbor interaction.

1. Introduction

The problem of phase separation for the two dimensional Ising model and the study of the equilibrium shape of crystals (Wulff shape) were initiated by Dobrushin, Kotecky and Shlosman [DKS]. Among other things, they proved that if at very low temperatures we decrease the averaged magnetization in the + pure phase, we observe the creation of a macroscopic droplet of the − phase which has a deterministic shape on the macroscopic scale. The proof was first simplified by Pfister [Pf] and then extended to the whole of the phase transition region by Ioffe [I1,I2] (see also [SS] and [PV]). Recently, Ioffe and Schonmann [IS] have completed the DKS theory up to the critical temperature and greatly simplified the original proofs. Moderate deviations in the exact canonical ensemble are also studied in [IS]. In two dimensions, the proofs have been based on duality arguments and on a coarse graining procedure (skeleton). These arguments do not seem to apply in higher dimensions.

For more than two dimensions, an alternative procedure has been proposed by Alberti, Bellettini, Cassandro and Presutti [ABCP,BCP] for Ising systems with Kac potentials. They rephrase the whole problem in terms of $L^1$ theory and prove large deviations for the appearance of a droplet of the minority phase in a scaling limit when the size of the domain diverges not much faster than the range of the Kac potentials. This amounts to a weak large deviation principle which is obtained by proving Γ-convergence of a functional associated to the spin system [ABCP]. A large deviation principle has then been proved via a tightness property [BCP].


Their approach has been generalized by Benois, Bodineau, Butta and Presutti [BBBP, BBP] by taking first the thermodynamic limit and then letting the range of interaction go to infinity. The first paper [BBBP] was devoted to the proof of a weak large deviation principle for the macroscopic magnetization, which is equivalent to the computation of surface tension. The main idea was to introduce a coarse graining in order to use the $L^1$ setting. Namely, events in $L^1$ were related to mesoscopic quantities by an argument which we will refer to later as the minimal section argument. An exact expression of surface tension was difficult to recover from coarse grained estimates and surface tension was only derived in the Kac limit, i.e. when the range of interactions tends to infinity. The second step [BBP] consisted of proving a tightness property by using the compactness in $L^1$ of the set of functions of bounded variation with finite perimeter.

The Wulff construction for three dimensional independent percolation has been proven by Cerf [Ce] using a procedure similar to the one of [BBBP] and a novel definition of surface tension. In this case, the dependence on boundary conditions is weaker and the minimal section argument makes it possible to prove directly a weak large deviation principle by using this appropriate definition of surface tension. As percolation occurs in an infinite volume, there is an extra difficulty and different compactness arguments have been required.

In this paper we proceed as in [BBBP]. One of the main difficulties is to recover surface tension from a constraint on the averaged magnetization. The surface tension is defined as $\log\frac{Z^{+}}{Z^{+,-}}$, where the partition functions are computed with + boundary conditions and with mixed boundary conditions (+ at the top and − at the bottom), see for instance the paper of Messager, Miracle-Solé and Ruiz [MMR].
To use directly this definition, one would have to find in the bulk surfaces of + spins or of − spins which in fact may not exist. A way to circumvent this problem is to prove that surface tension can be produced by averaging the boundary conditions, choosing the spins with respect to the + pure phase and to the − pure phase. For the Ising model with nearest neighbor interaction, the coarse graining developed by Pisztora [Pi1] will play an analogous role to the one used for the Kac model. Pisztora’s coarse graining is one of the most profound and powerful techniques for the study of the Ising (Potts) model, it provides an accurate description of the Ising model in a non-perturbative regime up to a temperature Tˆc which is conjectured to agree with the critical temperature. In the following, we will mention which of our results hold up to Tˆc . As Pisztora’s coarse graining is defined via the FK representation, several quantities need to be rewritten in terms of the FK representation. In particular, our approach to the surface tension (Sect. 4) is built upon the FK representation and, is motivated by the corresponding construction in [Ce]. This is a key to obtain precise surface order estimates on the logarithmic scale. This is also the only point at which we refer to [Ce], the core philosophy of our proof is based on the renormalization ideas of [BBBP] and [BBP], including the appropriate setup of the geometric measure theory. The coarse graining schemes of the latter works, however, depend on specific properties of Kac potentials and, one of our main technical tasks here is to develop a relevant modification of these renormalization procedures in the nearest neighbor context. A step further in the understanding of the surface tension will be to prove a phase separation theorem for the Kac model with finite range interactions by using a coarse graining defined only in terms of the Gibbs measure [Bo]. After introducing the main notation, we state in Sect. 
2 the results and an overview of the paper (see Subsect. 2.3).

Wulff Construction in Three and More Dimensions


2. Notation and Results

The strategy of the proof can be applied in any dimension greater than or equal to 2. Nevertheless, as the proof depends on Pisztora's coarse graining [Pi1], our results are valid only in dimension greater than or equal to 3. In fact, a similar coarse graining should also hold in dimension 2. For simplicity, notation and results are stated in three dimensions.

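2.1. Notation. We introduce the following norms on $\mathbb{Z}^3$:

\[ \forall x \in \mathbb{Z}^3, \qquad \|x\|_1 = \sum_{i=1}^{3} |x_i| \quad\text{and}\quad \|x\|_2 = \sqrt{\sum_{i=1}^{3} |x_i|^2}. \]

Two vertices $x, y$ in $\mathbb{Z}^3$ are nearest neighbors if $\|x - y\|_1 \le 1$ and we denote it by $x \sim y$. For any finite subset $\Lambda$ of $\mathbb{Z}^3$, we define its boundary by $\partial\Lambda = \{x \in \Lambda^c \mid \exists\, y \in \Lambda,\ x \sim y\}$, and denote its cardinality by $|\Lambda|$. We consider the Ising model on $\mathbb{Z}^3$ with nearest neighbor interaction. Each spin $\sigma_i$, attached at the lattice site $i$ in $\mathbb{Z}^3$, can take values $\pm 1$. For any integer $N$, we set $\Delta_N = \{1, \dots, N\}^3$ and denote the space of configurations in $\Delta_N$ by $\Sigma_{\Delta_N} = \{\pm 1\}^{\Delta_N}$. Let $\sigma_{\Delta_N}$ be the spin configuration restricted to $\Delta_N$. We introduce the Hamiltonian associated to $\sigma_{\Delta_N}$ with boundary conditions $\sigma_{\partial\Delta_N}$,

\[ H(\sigma_{\Delta_N} \mid \sigma_{\partial\Delta_N}) = -\frac{1}{2} \sum_{\substack{i \sim j \\ i,j \in \Delta_N}} \sigma_i \sigma_j \;-\; \sum_{\substack{i \sim j \\ i \in \Delta_N,\ j \in \partial\Delta_N}} \sigma_i \sigma_j . \]

The Gibbs measure on $\Sigma_{\Delta_N}$ at inverse temperature $\beta > 0$ is

\[ \mu_{\beta,\Delta_N}(\sigma_{\Delta_N} \mid \sigma_{\partial\Delta_N}) = \frac{1}{Z_\beta(\sigma_{\partial\Delta_N})} \exp\bigl(-\beta H(\sigma_{\Delta_N} \mid \sigma_{\partial\Delta_N})\bigr), \]

where the partition function $Z_\beta(\sigma_{\partial\Delta_N})$ is the normalizing factor. When the boundary conditions $\sigma_{\partial\Delta_N}$ are identically equal to 1, we simply write $\mu^+_{\beta,\Delta_N}$. There is a critical value $\beta_c > 0$ and for all $\beta$ larger than $\beta_c$, there exists $m_\beta > 0$ such that

\[ \lim_{N\to\infty} \mu^+_{\beta,\Delta_N}(\sigma_0) = m_\beta > 0. \]

The Gibbs measure $\mu^+_\beta$ on $\Sigma_{\mathbb{Z}^3}$ obtained by taking the thermodynamic limit of $\mu^+_{\beta,\Delta_N}$ is called the + pure phase and $m_\beta$ is the equilibrium value of the magnetization.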
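To make the bookkeeping in the Hamiltonian concrete, here is a minimal Python sketch that evaluates the energy on the cube {1,...,n}^3 with a constant boundary spin. It assumes the factor 1/2 compensates for visiting each interior pair from both endpoints; the names `hamiltonian` and `gibbs_weight` are illustrative and do not come from the paper.

```python
import itertools
import math


def hamiltonian(sigma, n, boundary_spin=1):
    """Energy of sigma on {1,...,n}^3 with a constant spin on the boundary.

    sigma maps each site (i, j, k) of the cube to +1 or -1.
    """
    energy = 0.0
    for site in itertools.product(range(1, n + 1), repeat=3):
        for axis in range(3):
            for step in (-1, 1):
                neighbor = list(site)
                neighbor[axis] += step
                neighbor = tuple(neighbor)
                if neighbor in sigma:
                    # Interior pair: visited from both endpoints, hence the 1/2.
                    energy -= 0.5 * sigma[site] * sigma[neighbor]
                else:
                    # Pair (site in the cube, neighbor on the boundary): counted once.
                    energy -= sigma[site] * boundary_spin
    return energy


def gibbs_weight(sigma, n, beta, boundary_spin=1):
    """Unnormalized Gibbs weight exp(-beta * H)."""
    return math.exp(-beta * hamiltonian(sigma, n, boundary_spin))


n = 2
sites = list(itertools.product(range(1, n + 1), repeat=3))
all_plus = {s: 1 for s in sites}
all_minus = {s: -1 for s in sites}
print(hamiltonian(all_plus, n), hamiltonian(all_minus, n))
```

For n = 2 there are 12 interior edges and 24 edges to the boundary, so under + boundary conditions the all-plus configuration has energy −36, while the all-minus configuration gains 12 on the interior edges but loses 24 on the boundary edges.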


2.2. Surface tension. Let us recall the definition of surface tension and related results which can be found in [MMR]. The set of unit vectors in $\mathbb{R}^3$ is denoted by $S^2$. We fix a vector $n$ in $S^2$ and two vectors $e_1, e_2$ orthogonal to $n$. Let $h$ be a positive constant and $N \to f(N)$ a positive function which diverges as $N$ goes to infinity. For any integer $N$, we denote by $\bar\Lambda(hN, hN, f(N))$ the parallelepiped of $\mathbb{R}^3$ centered at 0 with faces parallel to the axes $(e_1, e_2, n)$ such that the lengths of the sides parallel to $(e_1, e_2)$ are $hN, hN$ and the length of the sides parallel to $n$ is $f(N)$. We introduce $\Lambda_N$, the set of vertices $\bar\Lambda(hN, hN, f(N)) \cap \mathbb{Z}^3$. The boundary $\partial\Lambda_N$ is split into 2 sets

\[ \partial^+\Lambda_N = \{ i \in \partial\Lambda_N \mid i \cdot n \ge 0 \}, \qquad \partial^-\Lambda_N = \{ i \in \partial\Lambda_N \mid i \cdot n < 0 \}. \]

We call $\partial^+\Lambda_N$ the upper and $\partial^-\Lambda_N$ the lower part of $\partial\Lambda_N$. We fix the boundary conditions outside $\Lambda_N$ to be equal to 1 on $\partial^+\Lambda_N$ and to $-1$ on $\partial^-\Lambda_N$. The corresponding partition function on $\Lambda_N$ is denoted by $Z^{+,-}_{\Lambda_N}$.

Definition 2.1. The surface tension in the direction $n \in S^2$ is defined by

\[ \tau(n) = \lim_{N\to\infty} -\frac{1}{h^2 N^2} \log \frac{Z^{+,-}_{\Lambda_N}}{Z^{+}_{\Lambda_N}} . \]

The surface tension defined above coincides with the one defined in [MMR] (see Appendix 8.1). It depends neither on $h$ nor on $f$, as proven in [MMR] (Theorem 2). Let us extend $\tau$ by homogeneity,

\[ \forall v \in \mathbb{R}^3 - \{0\}, \quad \tilde\tau(v) = \|v\|_2\, \tau\!\left(\frac{v}{\|v\|_2}\right) \quad\text{and}\quad \tilde\tau(0) = 0. \tag{2.1} \]

The pyramidal inequality proven in Theorem 3 of [MMR] ensures that $\tilde\tau$ is convex. As $\tilde\tau$ is locally bounded and convex, it is continuous. It was proven by Lebowitz and Pfister [LP] that for all $\beta$ larger than $\beta_c$, the surface tension $\tau(n_0)$ in the direction $n_0 = (1, 0, 0)$ is positive. From the symmetries and the convexity of $\tau$, we check that $\tau$ is uniformly positive on $S^2$.

The spin configuration $\sigma$ should be seen as a microscopic representation of the system. The macroscopic state of the system is instead determined by the value of an order parameter (the averaged magnetization) which specifies the phase of the system. As $\beta$ is fixed, it is convenient to replace the order parameter by a parameter $u$ with values $\pm 1$. We suppose that the macroscopic region of $\mathbb{R}^3$ where our system is confined is $T = [0, 1]^3$ and that outside this region the system is surrounded by the + pure phase. We denote by $BV(T, \{+1, -1\})$ the set of functions of bounded variation in $]-1, 2[^3$ which are uniformly equal to 1 outside $T$ and which take only values $\pm 1$ in $T$ (see [EG] for a review). These functions are simply signed indicators of macroscopic sets of finite perimeter. The fact that $u_r = 1$ for some $r$ in $T$ means that locally at $r$ the system is in equilibrium in the phase $+m_\beta$. The precise correspondence between $\sigma$ and functions on $T$ is described in Sect. 3, where we approximate $\sigma$ by a coarse graining procedure, introducing a mesoscopic scale. For any function $u$ in $BV(T, \{+1, -1\})$, a generalized notion of the boundary of the set $\{u = -1\}$ can be defined. It is called the reduced boundary and denoted by $\partial^* u$. If the set $\{u = -1\}$ is regular then $\partial^* u$ coincides with the usual definition of the boundary.


For each $x$ in $\partial^* u$, there exists a unit vector $n_x$ called the measure theoretic unit normal to $\{u = -1\}$ at $x$, which coincides with the outward normal if the set is regular. Let us introduce the functional $\mathcal{F}$ on $L^1(T, [-\frac{1}{m_\beta}, \frac{1}{m_\beta}])$,

\[ \mathcal{F}(u) = \begin{cases} \displaystyle\int_{\partial^* u} \tau(n_x)\, d\mathcal{H}_x, & \text{if } u \in BV(T, \{+1, -1\}), \\[1ex] \infty, & \text{otherwise}, \end{cases} \tag{2.2} \]

where $d\mathcal{H}$ is the 2 dimensional Hausdorff measure in $\mathbb{R}^3$. To any subset $A$ of $T$, we associate the function $\mathbb{1}_A = 1_{A^c} - 1_A$ and simply write $\mathcal{F}(A) = \mathcal{F}(\mathbb{1}_A)$. An important property is the lower semi-continuity of $\mathcal{F}$ with respect to $L^1$ convergence. As $\tilde\tau$ is convex (see (2.1)), the lower semi-continuity is a consequence of a result by Ambrosio and Braides (see [AmBr] Theorem 2.1 and Example 2.8). The equilibrium crystal shape $W_m$, called Wulff shape, is a solution of the following isoperimetric problem:

\[ \min \Bigl\{ \mathcal{F}(u) \;\Big|\; u \in BV(T, \{+1, -1\}),\ m_\beta \int_T u_r\, dr \le m \Bigr\}, \tag{2.3} \]

where $m$ belongs to $]m^*, m_\beta[$. We will restrict the parameter $m$ so that, for $m$ in $]m^*, m_\beta[$, the minimizers of the variational problems in $T$ and $\mathbb{R}^3$ are the same. This enables us to avoid boundary problems. The shape $W_m$ can be explicitly constructed (the Wulff construction) by dilating the set

\[ W = \bigcap_{n \in S^2} \bigl\{ x \in \mathbb{R}^3 ;\ x \cdot n \le \tau(n) \bigr\} \]

in order to satisfy the volume constraint $m_\beta \int_T \mathbb{1}_{W_m}(r)\, dr = m$. As $W_m = \lambda_m W$, one has $\mathcal{F}(W_m) = \lambda_m \mathcal{F}(W)$. Thus $\mathcal{F}(W_m)$ is continuous with respect to $m$. Taylor [Ta] proved that $W_m$ is a closed convex surface and that all other minimizers of (2.3) are deduced from $W_m$ by shifts (see also [F,FM]). In the following, we suppose that $W_m$ is centered in $(\frac12, \frac12, \frac12)$.

2.3. Heuristics and Results. The total magnetization $\frac{1}{N^3} \sum_{i \in \Delta_N} \sigma_i$ will be denoted by $M_{\Delta_N}$. A shift of the magnetization from its equilibrium value leads to large deviations controlled by a surface order.

Theorem 2.1. There is $\beta_0$ positive such that for any $\beta$ larger than $\beta_0$ and $m$ in $]m^*, m_\beta[$,

\[ \lim_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\bigl( M_{\Delta_N} \le m \bigr) = -\mathcal{F}(W_m), \]

where $m^*$ and $W_m$ were defined in (2.3).

More precisely, a phase separation occurs on the macroscopic level. In order to describe it, we introduce an intermediate scale called mesoscopic: the magnetization is locally averaged on boxes of size $N^\alpha$ with $\alpha \in\, ]0, 1[$ ($1 \ll N^\alpha \ll N$). For any integer $L$, we define the sub-lattice

\[ \mathbb{L}_L = \Bigl\{ \bigl( L x_1 + \tfrac{L}{2},\ L x_2 + \tfrac{L}{2},\ L x_3 + \tfrac{L}{2} \bigr) \;\Big|\; x = (x_1, x_2, x_3) \in \mathbb{Z}^3 \Bigr\}. \tag{2.4} \]
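The Wulff construction introduced before Theorem 2.1 intersects the half-spaces {x : x·n ≤ τ(n)} over all unit vectors n. The following Python sketch tests membership in that intersection over a finite family of directions, so it only gives an outer approximation of W. The surface tensions used here (an ℓ¹-type function and a constant) are hypothetical stand-ins chosen because their Wulff shapes are known (a cube and a ball); they are not the Ising surface tension.

```python
import itertools
import math


def directions(k=3):
    """Unit vectors obtained by normalizing the nonzero integer vectors of {-k,...,k}^3."""
    dirs = []
    for v in itertools.product(range(-k, k + 1), repeat=3):
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0:
            dirs.append(tuple(c / norm for c in v))
    return dirs


def in_wulff_shape(x, tau, dirs, tol=1e-12):
    """True if x . n <= tau(n) for every sampled direction n."""
    return all(sum(a * b for a, b in zip(x, n)) <= tau(n) + tol for n in dirs)


def tau_l1(n):
    """Hypothetical surface tension: l1 norm of the unit normal (Wulff shape = cube)."""
    return sum(abs(c) for c in n)


def tau_iso(n):
    """Hypothetical isotropic surface tension (Wulff shape = unit ball)."""
    return 1.0


dirs = directions()
print(in_wulff_shape((0.9, 0.9, 0.9), tau_l1, dirs))   # inside the cube [-1, 1]^3
print(in_wulff_shape((1.2, 0.0, 0.0), tau_l1, dirs))   # cut off by the direction (1, 0, 0)
```

Because only finitely many directions are sampled, a point accepted by `in_wulff_shape` may still lie slightly outside the true W; refining the direction set tightens the approximation.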


We introduce also $B(x, L)$, the box of length $L$ centered in $x$ in $\mathbb{L}_L$,

\[ B(x, L) = \Bigl\{ y \in \mathbb{Z}^3 \;\Big|\; \forall i \in \{1, 2, 3\},\ -\frac{L}{2} < y_i - x_i \le \frac{L}{2} \Bigr\}. \tag{2.5} \]

The parameter $\alpha$ will be chosen slightly bigger than $\frac13$ (see Subsect. 3.2). This restriction on $\alpha$ is only used because we want to provide a straightforward proof of Theorem 2.2: by tuning properly the parameter $\alpha$, we avoid many entropic computations, in particular in Sect. 7. Nevertheless, notice that an extension of Theorem 2.2 for appropriate finite mesoscopic scales is derived in [BIV]. To simplify the notation, we can choose α in Q −1 and suppose from now that N is of the form 2α k , where k and αk are integers. The set $\Delta_N$ is partitioned into boxes $B(x, N^\alpha)$ of side length $N^\alpha$ centered in $x$ in $\mathbb{L}_{N^\alpha}$. For general values of $N$, one would need to partition $\Delta_N$ with boxes which may have different sizes. This is a standard technique and we refer the reader to Pisztora [Pi1]. The local magnetization $M$ is a piecewise constant function on $T$,

\[ \forall r \in T, \quad M_r = \frac{1}{N^{3\alpha}} \sum_{j \in B(x, N^\alpha)} \sigma_j \quad \text{if } \forall i,\ -\frac{N^\alpha}{2} < N r_i - x_i \le \frac{N^\alpha}{2}. \]
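As explained in the introduction, it is convenient to formulate the problem of phase separation in terms of $L^1$ theory. For any function $u$ in $L^1(T, [-\frac{1}{m_\beta}, \frac{1}{m_\beta}])$, we denote by $\mathcal{V}(u, \delta)$ the $\delta$-neighborhood of $u$,

\[ \mathcal{V}(u, \delta) = \Bigl\{ v \in L^1\bigl(T, [-\tfrac{1}{m_\beta}, \tfrac{1}{m_\beta}]\bigr) \;\Big|\; \int_T |u_r - v_r|\, dr \le \delta \Bigr\}. \]

We can now state a theorem on phase separation which says that for $\beta$ large enough, with $\mu^+_{\beta,\Delta_N}\bigl(\,\cdot \mid M_{\Delta_N} \le m\bigr)$-probability converging to 1, the function $M$ is close to some translate of the Wulff shape $m_\beta \mathbb{1}_{W_m}$.

Theorem 2.2. There is $\beta_0$ positive such that for any $\beta$ larger than $\beta_0$ and $m$ in $]m^*, m_\beta[$,

\[ \forall \delta > 0, \qquad \lim_{N\to\infty} \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \bigcup_{r \in T'} \mathcal{V}(\mathbb{1}_{W_m + r}, \delta) \;\Big|\; M_{\Delta_N} \le m \Bigr) = 1, \]

where $m^*$, $W_m$ were defined in (2.3) and $T' = \{ r \in T \mid W_m + r \subset T \}$.

This result is far less sharp than those obtained in the 2 dimensional case (see [IS]). We will follow the scheme of [BBBP,BBP] and deduce Theorems 2.1 and 2.2 from the following statements.

Proposition 2.1. Let $\beta$ be large enough. Then for all $u$ in $BV(T, \{+1, -1\})$ such that $\mathcal{F}(u)$ is finite,

\[ \lim_{\delta\to 0} \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(u, \delta) \Bigr) \le -\mathcal{F}(u). \]

Proposition 2.2. For $\beta$ large enough and $m$ in $]m^*, m_\beta[$,

\[ \lim_{\delta\to 0} \liminf_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(\mathbb{1}_{W_m}, \delta) \Bigr) \ge -\mathcal{F}(W_m), \]

where $m^*$ and $W_m$ were defined in (2.3).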


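Since Proposition 2.1 is only a weak large deviation principle, we need to strengthen it by proving an exponential tightness property which is similar to the one in [BBP]. For any $a$ positive, the set

\[ K_a = \bigl\{ u \in BV(T, \{+1, -1\}) \mid \mathcal{F}(u) \le a \bigr\} \]

is compact with respect to convergence in measure: as $\mathcal{F}$ is lower semi-continuous, $K_a$ is closed, and as the surface tension $\tau$ is larger than a positive constant $\tau_0$, the set $K_a$ is included in the compact set of functions of bounded variation in $T$ with perimeter smaller than $\frac{a}{\tau_0}$ (see [EG] Sect. 5.2.3).

Proposition 2.3. We fix $\beta$ large enough. Then there exists a constant $C_\beta$ such that for all $a$ and $\delta$ positive,

\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(K_a, \delta)^c \Bigr) \le -C_\beta\, a, \]

where $\mathcal{V}(K_a, \delta)$ is the $\delta$-neighborhood of $K_a$ in $L^1(T, [-\frac{1}{m_\beta}, \frac{1}{m_\beta}])$.

The proofs of Theorems 2.1 and 2.2 are based on well known large deviations arguments (see [DS]). For completeness we prove Theorem 2.2; the proof of Theorem 2.1 is similar.

Proof of Theorem 2.2. Throughout this proof, the parameter $\delta$ is fixed. We first start by computing a logarithmic upper bound of $\mu^+_{\beta,\Delta_N}\bigl( \frac{M}{m_\beta} \in F \bigr)$, where $F$ is the closed set

\[ F = \Bigl\{ u \in L^1\bigl(T, [-\tfrac{1}{m_\beta}, \tfrac{1}{m_\beta}]\bigr) \;\Big|\; u \notin \bigcup_{r \in T'} \mathcal{V}(\mathbb{1}_{W_m + r}, \delta) \ \text{and}\ \int_T u_r\, dr \le \frac{m}{m_\beta} \Bigr\}. \]

Let $a$ and $\delta'$ be positive constants. We split $F$ into 2 sets,

\[ \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in F \Bigr) \le \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in F \cap \mathcal{V}(K_a, \delta') \Bigr) + \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(K_a, \delta')^c \Bigr). \]

We choose $a$ such that $C_\beta a$ is much larger than $\mathcal{F}(W_m)$; then Proposition 2.3 enables us to bound the last term in the RHS. Let us fix $\varepsilon$ positive. Since $K_a$ is compact, we cover it with a finite number $\ell$ of neighborhoods $\mathcal{V}(u_i, \varepsilon_i)$, where each $\varepsilon_i$ belongs to $]0, \varepsilon]$ and is chosen such that Proposition 2.1 implies

\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(u_i, \varepsilon_i) \Bigr) \le -\mathcal{F}(u_i) + \varepsilon. \]

For $\delta'$ small enough, we cover $F \cap \mathcal{V}(K_a, \delta')$ with the above neighborhoods which intersect $F$. Since $u_i$ belongs to $\mathcal{V}(F, \varepsilon)$, we get from Lemma 2.1.2 of [DS],

\[ \limsup_{\delta'\to 0} \lim_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in F \cap \mathcal{V}(K_a, \delta') \Bigr) \le -\lim_{\varepsilon\to 0} \inf_{u \in \mathcal{V}(F, \varepsilon)} \mathcal{F}(u) \le -\inf_{u \in F} \mathcal{F}(u). \]

Furthermore, results by [F,FM] on the variational problem (2.3) imply that

\[ \inf_{u \in F} \mathcal{F}(u) > \mathcal{F}(W_m). \]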


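Thus the upper bound is completed. To complete the proof it remains to derive a lower bound for the probability of the event $\{M_{\Delta_N} \le m\}$. We fix $\varepsilon$ positive and check that $\{ \frac{M}{m_\beta} \in \mathcal{V}(\mathbb{1}_{W_{m-\varepsilon}}, \delta') \}$ is included in $\{M_{\Delta_N} \le m\}$ for $\delta'$ small enough. Proposition 2.2 implies

\[ \liminf_{\delta'\to 0} \lim_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \frac{M}{m_\beta} \in \mathcal{V}(\mathbb{1}_{W_{m-\varepsilon}}, \delta') \Bigr) \ge -\mathcal{F}(W_{m-\varepsilon}). \]

Letting $\varepsilon$ go to 0, we obtain

\[ \liminf_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\bigl( \{M_{\Delta_N} \le m\} \bigr) \ge -\mathcal{F}(W_m). \]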

Combining the upper and the lower bound, we derive Theorem 2.2. $\square$

Let us comment on Propositions 2.1, 2.2 and 2.3. A shift of the averaged magnetization can be realized by 2 competing effects. The first one, which consists of producing a large droplet of − inside the bulk, is controlled by surface tension (Propositions 2.1 and 2.2). The second one consists of increasing homogeneously the number of small − contours. This requires a lot of energy, but may be favored by entropy. This effect is ruled out by Proposition 2.3, which is a combination of an estimate in the phase of small contours with a Peierls type estimate for large contours. In fact, the underlying phenomena are more subtle and it was shown by [IS] in the case of dimension 2 that on the level of moderate deviations the second effect may be the most important.

We start by defining a coarse graining on the mesoscopic scale which keeps more details of the microscopic structure than $M$. This is done, in Sect. 3, via the FK representation by using Pisztora's results [Pi1]. This coarse graining procedure imposes that $\beta$ is larger than a critical value $\tilde\beta_c$ related to the slab percolation threshold (see [Pi1]) and to condition (3.3). It is conjectured that $\tilde\beta_c$ equals the critical value $\beta_c$. In Sect. 4, motivated by [Ce], we use an alternative definition of surface tension in terms of the FK representation. We prove the equivalence of several expressions for surface tension which will enable us to compare different boundary conditions.

In Sect. 5, Proposition 2.1 is proven along the lines of the argument developed in [BBBP]. It states that the most likely configurations in $\{ \frac{M}{m_\beta} \in \mathcal{V}(u, \delta) \}$ are those for which the + and − phases coexist along the boundary $\partial u$; this coexistence induces deviations proportional to a surface order. The $L^1$ constraint $\frac{M}{m_\beta} \in \mathcal{V}(u, \delta)$ imposed on the magnetization is not strong enough to localize the interface close to $\partial u$: there might be mesoscopic fingers of one phase percolating into the other. Following [BBBP], we prove by the minimal section argument that one can chop off these fingers without changing too much the probability of the event. The renormalization is an essential feature in the previous procedure. Once the interface is localized on the mesoscopic level, the main problem is to identify surface tension. Notice that in the case of percolation [Ce], the minimal section argument makes it possible to cut the microscopic fingers which connect the domains separated by $\partial u$. Therefore, one can immediately identify the surface tension factor, because for independent percolation it is defined as the probability that no cluster connects one domain to the other. In the case of spin systems, one would need to find a microscopic surface of + spins on one side of $\partial u$ and another one of − spins on the other side in order to use directly the definition of surface tension. This would seem difficult to achieve because mesoscopic contours only give control of the averaged magnetization and do not ensure the existence of such microscopic surfaces. We proceed differently


and use an alternative definition of surface tension in terms of the FK measure. This requires β to be large. Proposition 2.2 is proven in Sect. 6 under the condition that β is larger than β˜c . The coarse graining is only useful to get the lower bound up to β˜c and it could be avoided if one considers only β large, in which case, a direct proof without using the FK representation is possible. Section 7 is devoted to the proof of Proposition 2.3. Besides its probabilistic interpretation, i.e. the proof of an exponential tightness property, Proposition 2.3 deals with a physical phenomenon of a different nature than the surface tension: it states that the occurrence of many small contours is unlikely. The production of surface tension supposes a balance between energy and entropy. It is a general feature of DKS theory that energy is the dominant factor which rules out the occurrence of small contours. The techniques developed in [I2,SS] and [IS] to control the phase of small contours, for the two dimensional Ising model, are robust enough to be extended to higher dimensions provided Peierls estimate holds. This observation was used in [BBP]. For β large enough, one could have proceeded as in [BBP] and worked only with the Gibbs measure. We use an alternative approach borrowed to [I2] and deduce directly estimates on the phase of small contours by combining estimates of [I2] and Pisztora’s results [Pi1]. Thus Proposition 2.3 holds as soon as β is larger than β˜c . As noticed in the papers on Ising model with Kac potentials, the strategy described above can be applied in any dimension larger or equal to two. As a final remark, we stress the fact that the above results could be easily extended to prove a large deviation principle for the measures µ+ β,1N with action functional F. This setting was developed in [BBP] and also used for percolation [Ce]. This requires a modification of Proposition 2.2 which is described in Remark 8.3 at the end of Subsect. 8.3. 3. 
Coarse Graining and Mesoscopic Scale

3.1. The FK representation. We describe now the FK representation of the Ising model. For a review of FK measures, we refer the reader to [Pi1,Gri] and [ACCN]. The set of edges is $E = \bigl\{ \{x, y\} \mid x \sim y \bigr\}$. For bond percolation, the configurations $\omega$ belong to $\Omega = \{0, 1\}^E$. An edge $b$ in $E$ is open if $\omega_b = 1$ and closed otherwise. To any subset $\Lambda$ of $\mathbb{Z}^3$ and $\pi$ included in $\partial\Lambda$, we associate a set of edges

\[ [\Lambda]^\pi_e = \bigl\{ \{x, y\} \mid x \sim y,\ x \in \Lambda,\ y \in \Lambda \cup \pi \bigr\}, \]

and the space of configurations in $\Lambda$ is $\Omega^\pi_\Lambda = \{0, 1\}^{[\Lambda]^\pi_e}$. Let $\omega$ be a configuration in $\Omega$; an open path $(x_1, \dots, x_n)$ is a finite sequence of distinct nearest neighbors $x_1, \dots, x_n$ such that on each edge $\omega_{\{x_i, x_{i+1}\}} = 1$. We write $\{A \leftrightarrow B\}$ for the event such that there exists an open path joining a site of $A$ to one of $B$. A $*$-connected path $(x_1, \dots, x_n)$ is a finite sequence of distinct vertices such that $\|x_k - x_{k+1}\|_2$ is smaller than $\sqrt{3}$ for all $k$. The connected components of the set of open edges of $\omega$ are called $\omega$-clusters. The $\omega$-cluster associated to the site $i$ is denoted by $C_i(\omega)$.

Let us now describe the FK representation of the Ising model (see Edwards and Sokal [ES]). Let $\Lambda$ be a finite subset in $\mathbb{Z}^3$ and $\pi$ a subset of $\partial\Lambda$. The first step is to introduce a measure on $\Omega^\pi_\Lambda$. A vertex $x$ of $\Lambda$ is called $\pi$-wired if it is connected by an open path to $\pi$. We call $\pi$-clusters the clusters defined with respect to the boundary condition $\pi$: a $\pi$-cluster is a connected set of open edges in $\Omega^\pi_\Lambda$ and we identify to be the same cluster


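all the clusters which are $\pi$-wired, i.e. connected to $\pi$. For a given $p$ in $[0, 1]$, we define the FK measure on $\Omega^\pi_\Lambda$ with boundary conditions $\pi$ by

\[ \Phi^{\pi,p}_\Lambda(\omega) = \frac{1}{Z^{\pi,p}_\Lambda} \Bigl( \prod_{b \in [\Lambda]^\pi_e} (1 - p)^{1 - \omega_b}\, p^{\omega_b} \Bigr)\, 2^{c^\pi(\omega)}, \]

where $Z^{\pi,p}_\Lambda$ is a normalization factor and $c^\pi(\omega)$ is the number of clusters which are not $\pi$-wired. If $\pi = \partial\Lambda$ then the boundary conditions are said to be wired and the corresponding FK measure on $\Omega^w_\Lambda$ is denoted by $\Phi^{w,p}_\Lambda$. If $\pi = \emptyset$, we write $\Phi^{f,p}_\Lambda$ for the measure on $\Omega^f_\Lambda$. For any subset $\Delta$ of $\Lambda$, we denote by $\mathcal{F}^\Delta_\Lambda$ the $\sigma$-field generated by finite dimensional cylinders associated with configurations in $\Omega^w_\Lambda \setminus \Omega^f_\Delta$. Then the strong FKG property (see [Pi1]) implies that for every increasing function $g$ supported by $\Omega^f_\Delta$,

\[ \Phi^{w,p}_\Lambda\text{-a.s.}, \qquad \Phi^{f,p}_\Delta(g) \le \Phi^{w,p}_\Lambda\bigl(g \mid \mathcal{F}^\Delta_\Lambda\bigr) \le \Phi^{w,p}_\Delta(g). \tag{3.1} \]

In particular, one has

\[ \Phi^{f,p}_\Delta(g) \le \Phi^{f,p}_\Lambda(g) \le \Phi^{w,p}_\Lambda(g) \le \Phi^{w,p}_\Delta(g). \tag{3.2} \]

In order to recover the Gibbs measure $\mu^+_{\beta,\Lambda}$, we fix the percolation parameter $p_\beta = 1 - \exp(-2\beta)$ and generate the edge configuration $\omega$ in $\Omega^w_\Lambda$ according to the measure $\Phi^{w,p_\beta}_\Lambda$. Given $\omega$, we associate to the wired cluster the sign $+1$ and equip randomly each $\omega$-cluster with a color $\pm 1$ with probability $\frac12$ independently from the others. This amounts to introducing the measure $P^\omega_\Lambda$ on $\{-1, 1\}^\Lambda$ such that the spin $\sigma_i = 1$ if $C_i(\omega)$ is $\pi$-wired and $\sigma_i$ is the chosen color of $C_i(\omega)$ otherwise. The Gibbs measure $\mu^+_{\beta,\Lambda}$ can be viewed as the first marginal of the coupled measure $P^\omega_\Lambda(\sigma)\, \Phi^{w,p_\beta}_\Lambda(\omega)$,

\[ \forall \sigma_\Lambda \in \Sigma_\Lambda, \qquad \mu^+_{\beta,\Lambda}(\sigma_\Lambda) = \int_{\Omega^w_\Lambda} P^\omega_\Lambda(\sigma)\, \Phi^{w,p_\beta}_\Lambda(d\omega). \]

By abuse of notation, the joint measure will be also denoted by $\mu^+_{\beta,\Lambda}$. As a consequence of this representation one has

\[ m_\beta = \lim_{\Delta \to \mathbb{Z}^d} \mu^+_{\beta,\Delta}(\sigma_0) = \lim_{\Delta \to \mathbb{Z}^d} \Phi^{w,p_\beta}_\Delta\bigl( \{0 \leftrightarrow \partial\Delta\} \bigr) = \Theta. \]

In the following, we use $m_\beta$ or $\Theta$ depending on the context. In Theorems 2.1 and 2.2, we consider only the case $\beta$ large. The first reason to do so is to satisfy the hypothesis of Theorem 5.3 of [Gri] which implies that for $\beta$ large enough,

\[ \lim_{\Delta \to \mathbb{Z}^d} \Phi^{f,p_\beta}_\Delta\bigl( \{0 \leftrightarrow \partial\Delta\} \bigr) = \lim_{\Delta \to \mathbb{Z}^d} \Phi^{w,p_\beta}_\Delta\bigl( \{0 \leftrightarrow \partial\Delta\} \bigr) = \Theta. \tag{3.3} \]

Throughout the paper we suppose that (3.3) holds. The assumption $\beta$ large will also be useful for technical reasons in the proof of Lemma 4.3.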
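The coloring step of the FK representation described above (wired cluster set to +1, every other cluster given an independent fair sign) can be sketched in a few lines of Python. The sketch works on an arbitrary finite graph; the union-find helper, the function names and the toy graph are illustrative assumptions, and sampling the edge configuration from the FK measure itself is not shown.

```python
import math
import random


def p_beta(beta):
    """Percolation parameter 1 - exp(-2 beta) used in the FK representation."""
    return 1.0 - math.exp(-2.0 * beta)


def color_clusters(vertices, boundary, open_edges, rng):
    """Assign spins given an edge configuration with wired boundary.

    Vertices connected to the boundary by open edges get +1; every other
    cluster gets an independent uniform sign. `boundary` must be non-empty.
    """
    parent = {v: v for v in list(vertices) + list(boundary)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Wired boundary conditions: all boundary vertices form a single cluster.
    for b in boundary[1:]:
        parent[find(b)] = find(boundary[0])
    for x, y in open_edges:
        parent[find(x)] = find(y)

    wired_root = find(boundary[0])
    cluster_color = {}
    spins = {}
    for v in vertices:
        root = find(v)
        if root == wired_root:
            spins[v] = 1
        else:
            if root not in cluster_color:
                cluster_color[root] = rng.choice((-1, 1))
            spins[v] = cluster_color[root]
    return spins


rng = random.Random(0)
spins = color_clusters([0, 1, 2, 3], ["w"], [("w", 0), (0, 1), (2, 3)], rng)
print(spins)
```

In the toy graph, vertices 0 and 1 are wired and therefore carry +1, while vertices 2 and 3 form one free cluster and always share the same (random) sign.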


3.2. Coarse graining. We recall the renormalization procedure introduced by Pisztora [Pi1,DP] for the FK measure. For our purposes, it is preferable to use an alternative construction of the coarse graining [Pi2]. The results of this section hold for $\beta$ larger than $\hat\beta_c$, where $\hat\beta_c$ was defined in [Pi1] in terms of the slab percolation threshold. Let $\tilde\beta_c$ be the smallest value such that (3.3) is satisfied and $\tilde\beta_c \ge \hat\beta_c$. It is conjectured that $\tilde\beta_c$ coincides with the critical value $\beta_c$.

Let $\gamma$ be in $\mathbb{Q}\, \cap\, ]0, \frac{1}{10}[$ and $\alpha = \frac13 + \frac{\gamma}{9}$. As in Subsect. 2.3, we partition the domain $\Delta_N = \{1, \dots, N\}^3$ into disjoint boxes $B(x, N^\alpha)$ of length $N^\alpha$ centered in $x$ in $\mathbb{L}_{N^\alpha}$ (see (2.4) and (2.5)). For each $x$ in $\mathbb{L}_{N^\alpha}$, we consider also the bigger box $B(x, \frac54 N^\alpha)$ containing $B(x, N^\alpha)$. Note that if $x$ and $y$ are $*$-neighbors in $\mathbb{L}_{N^\alpha}$ the boxes $B(x, \frac54 N^\alpha)$ and $B(y, \frac54 N^\alpha)$ overlap. Following [Pi1], we introduce events which occur on the box $B(x, \frac54 N^\alpha)$ for each $x$ in $\mathbb{L}_{N^\alpha}$,

\[ U_x = \bigl\{ \omega \in \Omega^w_{\Delta_N} \;\big|\; \text{there is a unique crossing cluster } C^* \text{ in } B(x, \tfrac54 N^\alpha) \bigr\}. \]

A crossing cluster is a cluster which intersects all the faces of the box.

\[ R_x = U_x \cap \bigl\{ \omega \in \Omega^w_{\Delta_N} \;\big|\; \text{every open path in } B(x, \tfrac54 N^\alpha) \text{ with diameter larger than } N^\gamma \text{ is contained in } C^* \bigr\}, \]

where the diameter of a subset $A$ of $\mathbb{Z}^3$ is $\sup_{x,y \in A} \|x - y\|_1$. Notice that the additional parameter $\gamma$ is needed to control the fluctuations of the magnetization over small clusters (see (3.6)). We also define

\[ O_x = R_x \cap \bigl\{ \omega \in \Omega^w_{\Delta_N} \;\big|\; C^* \text{ crosses every sub-box of side length } N^\gamma \text{ contained in } B(x, \tfrac54 N^\alpha) \bigr\}. \]

Finally, we consider an event which imposes that the density of the crossing cluster is close to $\Theta$ (see (3.3)) in $B(x, N^\alpha)$ with accuracy $\zeta > 0$,

\[ V^\zeta_x = U_x \cap \bigl\{ \omega \in \Omega^w_{\Delta_N} \;\big|\; |C^*| \in [\Theta - \zeta, \Theta + \zeta]\, |B(x, N^\alpha)| \bigr\}. \]

In the following, parameters $\alpha, \gamma$ will be fixed, therefore we omit the dependence on these parameters in notation. We will only consider different coarse grainings for different values of $\zeta$. Each box $B(x, N^\alpha)$ is labeled by the random variable $Y^\zeta_x(\omega)$ depending only on the configuration $\omega$ in $\Omega^w_{\Delta_N}$,

\[ Y^\zeta_x(\omega) = 1 \ \text{ if } \omega \in O_x \cap V^\zeta_x, \qquad Y^\zeta_x(\omega) = 0 \ \text{ otherwise}. \]

Let $\{x_1, \dots, x_\ell\}$ be vertices in $\mathbb{L}_{N^\alpha}$ not $*$-neighbors of $x$, then [Pi1] implies that there is an integer $N_\zeta$ such that

\[ \forall N \ge N_\zeta, \qquad \Phi^{w,p_\beta}_{\Delta_N}\bigl( Y^\zeta_x = 0 \mid Y^\zeta_{x_1}, \dots, Y^\zeta_{x_\ell} \bigr) \le \exp(-c N^\gamma) + \exp(-c'_\zeta N^\alpha), \]

208

T. Bodineau

where $c'_\zeta$ depends only on $\zeta$ and $c$ is a constant. From [LSS] (Theorem 1.3), we deduce that for $N$ large enough, the random variables $\{Y^\zeta_x\}$ are dominated by a Bernoulli product measure $\pi_{\rho_N}$,

\[ \pi_{\rho_N}(X = 0) = \rho_N \le \exp(-c_\zeta N^\gamma), \tag{3.4} \]

where $c_\zeta$ is a positive constant. A similar result was already stated in [Pi1]. The random variables $Y^\zeta$ are only related to $\omega$, therefore the next step is to define a family of random variables which depend on $(\sigma, \omega)$. We denote by $M_x$ the averaged magnetization in the box $B(x, N^\alpha)$,

\[ M_x = M_{x/N} = \frac{1}{N^{3\alpha}} \sum_{i \in B(x, N^\alpha)} \sigma_i . \tag{3.5} \]

Pisztora's results [Pi1] give a control of the deviation of the averaged magnetization from its equilibrium values $\pm m_\beta$ in the boxes $B(x, N^\alpha)$. If $Y^\zeta_x = 1$, this deviation comes from the random coloring of the small clusters (those of diameter less than $N^\gamma$) included in $B(x, N^\alpha)$: this random coloring is independent of the boxes around $B(x, N^\alpha)$. Let $\zeta$ be positive and define the new random variables $\{Z^\zeta_x\}$ which depend on the joint law of $(\sigma, \omega)$,

\[ Z^\zeta_x(\sigma, \omega) = \begin{cases} \operatorname{sign}(C^*), & \text{if } Y^\zeta_x(\omega) = 1 \text{ and } |M_x - \operatorname{sign}(C^*)\, m_\beta| < 2\zeta, \\ 0, & \text{otherwise}. \end{cases} \]

Combining results of [Pi1] and [LSS], we check that there is $N_\zeta$ such that for all $N$ larger than $N_\zeta$ the random variables $\{|Z^\zeta_x|\}$, taking values in $\{0, 1\}$, are dominated by a Bernoulli product measure $\pi_{\rho'_N}$,

\[ \pi_{\rho'_N}(X = 0) = \rho'_N \le \exp(-c_\zeta N^\gamma), \tag{3.6} \]

where $c_\zeta$ is a positive constant depending only on $\zeta$. Since the setting is different from [Pi1], we sketch the proof in Appendix 8.2.

3.3. Mesoscopic scale. In Subsect. 2.3, we already used a homogenization procedure on the mesoscopic scale $N^\alpha$. We introduce now a different mesoscopic representation which takes into account more details of the microscopic structure. For a given $\zeta$ positive, we associate to any configuration $(\sigma, \omega)$ in $\Sigma_{\Delta_N} \times \Omega^w_{\Delta_N}$ the piecewise constant function $T^\zeta$ on $T$,

\[ \forall r \in T, \quad T^\zeta_r(\sigma, \omega) = Z^\zeta_x(\sigma, \omega) \quad \text{if } \forall i,\ -\frac{N^\alpha}{2} < N r_i - x_i \le \frac{N^\alpha}{2}. \tag{3.7} \]

If $(\sigma, \omega)$ is close to an equilibrium phase on a mesoscopic scale then $T^\zeta$ has the sign of this phase. The 2 pure phases are represented by functions $T^\zeta$ constantly equal to 1 or $-1$. From (3.6), one knows that for $\beta$ larger than $\tilde\beta_c$,

\[ \lim_{N\to\infty} \mu^+_{\beta,\Delta_N}\bigl( \{ T^\zeta_r = 1,\ \forall r \in T \} \bigr) = 1. \]

The next lemma proves that a knowledge of the asymptotics of $T^\zeta$ is sufficient to control the local magnetization $M$. Therefore to prove Propositions 2.1, 2.2 and 2.3, it will be enough to replace $M$ by $T^\zeta$. The accuracy of the approximation depends on the parameter $\zeta$ which controls the coarse graining.
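The labelling rule for the variables Z from Subsect. 3.2, which determines the mesoscopic function T, can be written as a small Python function. This is only a sketch of the case distinction: the inputs (the label Y of the box, the sign of the crossing cluster, the box magnetization) are assumed to have been computed elsewhere, and the function name and numerical values are illustrative.

```python
def z_label(y_label, cluster_sign, box_magnetization, m_beta, zeta):
    """Mesoscopic label of a box.

    Returns the sign of the crossing cluster when the box is good (Y = 1) and
    its magnetization is within 2 * zeta of sign * m_beta; returns 0 otherwise.
    """
    if y_label == 1 and abs(box_magnetization - cluster_sign * m_beta) < 2 * zeta:
        return cluster_sign
    return 0


# Hypothetical values: m_beta = 0.95 and accuracy zeta = 0.05.
print(z_label(1, 1, 0.93, 0.95, 0.05))    # good box close to the + phase
print(z_label(1, -1, -0.50, 0.95, 0.05))  # magnetization too far from -m_beta
print(z_label(0, 1, 0.95, 0.95, 0.05))    # bad box, whatever its magnetization
```

On a box with a non-zero label the quantity |m_β Z − M| is smaller than 2ζ by construction, which is the property the next lemma exploits.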

Wulff Construction in Three and More Dimensions

209

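Lemma 3.1. For any $\delta$ positive, we set $\zeta = \frac14 \delta$, then

\[ \lim_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Bigl( \int_T \bigl| m_\beta T^\zeta_r - M_r \bigr|\, dr \ge \delta \Bigr) = -\infty. \]

Proof. One has

\[ \int_T \bigl| m_\beta T^\zeta_r - M_r \bigr|\, dr \le \Bigl( \frac{N^\alpha}{N} \Bigr)^3 \sum_{B(x, N^\alpha)} \bigl| m_\beta Z^\zeta_x - M_x \bigr|, \]

this implies

\[ \int_T \bigl| m_\beta T^\zeta_r - M_r \bigr|\, dr \le \Bigl( \frac{N^\alpha}{N} \Bigr)^3 \sum_{B(x, N^\alpha)} 1_{Z^\zeta_x = 0} + 2\zeta. \]

Since $\zeta$ is small enough,

\[ \mu^+_{\beta,\Delta_N}\Bigl( \int_T \bigl| m_\beta T^\zeta_r - M_r \bigr|\, dr \ge \delta \Bigr) \le \mu^+_{\beta,\Delta_N}\Bigl( \#\{Z^\zeta_x = 0\} \ge \frac{\delta}{2} N^{3(1-\alpha)} \Bigr), \]

where $\#\{Z^\zeta_x = 0\}$ is the number of boxes with label 0. Therefore the lemma above will be a consequence of

Lemma 3.2. For any $\delta$ and $\zeta$ positive,

\[ \lim_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\bigl( \#\{Z^\zeta_x = 0\} \ge \delta N^{3(1-\alpha)} \bigr) = -\infty. \]

Proof. One has

\[ \mu^+_{\beta,\Delta_N}\bigl( \#\{Z^\zeta_x = 0\} \ge \delta N^{3(1-\alpha)} \bigr) \le \sum_{k = \delta N^{3(1-\alpha)}}^{N^{3(1-\alpha)}} \mu^+_{\beta,\Delta_N}\bigl( \#\{Z^\zeta_x = 0\} = k \bigr). \]

The random variables $|Z^\zeta_x|$ are dominated by independent variables (3.6), thus for $N$ large enough,

\[ \mu^+_{\beta,\Delta_N}\bigl( \#\{Z^\zeta_x = 0\} \ge \delta N^{3(1-\alpha)} \bigr) \le 2^{N^{3(1-\alpha)}} \exp\bigl( -c_\zeta\, \delta\, N^{3(1-\alpha) + \gamma} \bigr). \]

This implies

\[ \mu^+_{\beta,\Delta_N}\bigl( \#\{Z^\zeta_x = 0\} \ge \delta N^{3(1-\alpha)} \bigr) \le \exp\bigl( \ln 2\; N^{3(1-\alpha)} - c_\zeta\, \delta\, N^{2 + \frac23 \gamma} \bigr). \]

As $3(1 - \alpha) < 2$, the entropic factor is negligible and the lemma follows. $\square$

4. Surface Tension

As explained in the introduction, one of the main problems in deriving the Wulff construction is to recover surface tension from general boundary conditions. In this section we rewrite the surface tension in terms of the FK measure and prove that this new expression depends weakly on boundary conditions. This expression is reminiscent of the one introduced by Cerf [Ce] in the context of percolation. We keep the notation of Subsect. 2.2. Throughout this section, we fix the direction $n$ and, without loss of generality, we set $h = 1$. We also suppose that $\frac{f(N)}{\log(N)}$ diverges to infinity as $N$ goes to infinity.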

210

T. Bodineau

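4.1. First step. The next lemma will be useful to prove Proposition 2.2.

Lemma 4.1. Let $\{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\}$ be the event such that there is no open path inside $[\Lambda_N]^w_e$ joining $\partial^+\Lambda_N$ to $\partial^-\Lambda_N$. Then

\[ \tau(n) = \lim_{N\to\infty} -\frac{1}{N^2} \log \Phi^{w,p_\beta}_{\Lambda_N}\bigl( \{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\} \bigr). \tag{4.1} \]

Note that the event $\{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\}$ takes only into account the paths inside $\Lambda_N$ and not the identification produced by wired boundary conditions.

Proof. We rewrite the quantities in terms of the FK representation. A well known argument implies that for $p_\beta = 1 - \exp(-2\beta)$,

\[ Z^+_{\Lambda_N} = \sum_{\sigma \in \Sigma_{\Lambda_N}} \prod_{\langle x,y \rangle \in [\Lambda_N]^w_e} \exp\bigl( 2\beta(\delta_{\sigma_x, \sigma_y} - 1) \bigr) = \sum_{\omega \in \Omega^w_{\Lambda_N}} \prod_{b \in [\Lambda_N]^w_e} (1 - p_\beta)^{1 - \omega_b}\, p_\beta^{\omega_b}\; 2^{c^w(\omega)}, \]

where $c^w(\omega)$ is the number of clusters which are not wired. We prove now an equivalent formula for

\[ Z^{+,-}_{\Lambda_N} = \sum_{\sigma \in \Sigma_{\Lambda_N}} \prod_{\langle x,y \rangle \in [\Lambda_N]^w_e} \exp\bigl( 2\beta(\delta_{\sigma_x, \sigma_y} - 1) \bigr), \]

where boundary conditions are equal to 1 on $\partial^+\Lambda_N$ and to $-1$ on $\partial^-\Lambda_N$. We get

\[ Z^{+,-}_{\Lambda_N} = \sum_{\sigma \in \Sigma_{\Lambda_N}} \prod_{\langle x,y \rangle \in [\Lambda_N]^w_e} \bigl( 1 - p_\beta + p_\beta\, \delta_{\sigma_x, \sigma_y} \bigr), \]

this gives

\[ Z^{+,-}_{\Lambda_N} = \sum_{\sigma \in \Sigma_{\Lambda_N}} \sum_{\omega \in \Omega^w_{\Lambda_N}} \prod_{b} (1 - p_\beta)^{1 - \omega_b}\, p_\beta^{\omega_b} \prod_{\substack{b = \langle x,y \rangle \\ \omega_b = 1}} \delta_{\sigma_x, \sigma_y}. \]

Therefore

\[ Z^{+,-}_{\Lambda_N} = \sum_{\omega \in \Omega^w_{\Lambda_N}} \sum_{\sigma \in \Sigma_{\Lambda_N}} \prod_{b} (1 - p_\beta)^{1 - \omega_b}\, p_\beta^{\omega_b} \prod_{\substack{b = \langle x,y \rangle \\ \omega_b = 1}} \delta_{\sigma_x, \sigma_y}. \]

The boundary conditions imply that configurations $\omega$ containing a path joining $\partial^+\Lambda_N$ to $\partial^-\Lambda_N$ are not taken into account. We keep the definition of wired boundary conditions identifying all the clusters which touch the boundary $\partial\Lambda_N$,

\[ Z^{+,-}_{\Lambda_N} = \sum_{\omega \in \Omega^w_{\Lambda_N}} 1_{\{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\}}(\omega) \prod_{b} (1 - p_\beta)^{1 - \omega_b}\, p_\beta^{\omega_b}\; 2^{c^w(\omega)}. \]

Taking the ratio $\frac{Z^{+,-}_{\Lambda_N}}{Z^{+}_{\Lambda_N}}$, we recover $\Phi^{w,p_\beta}_{\Lambda_N}\bigl( \{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\} \bigr)$. $\square$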
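The identity between the spin and FK expressions of the partition function with + boundary conditions, used at the start of the proof above, can be checked by brute force on a very small graph. The Python sketch below does this under two stated assumptions: each edge carries the weight exp(2β(δ − 1)), which is the normalization matching p = 1 − exp(−2β), and the graph is an arbitrary toy example rather than a box of the lattice. All function names are illustrative.

```python
import itertools
import math


def spin_partition_function(interior, boundary, edges, beta):
    """Sum over spins of prod_edges exp(2 beta (delta - 1)), boundary spins fixed to +1."""
    total = 0.0
    for values in itertools.product((-1, 1), repeat=len(interior)):
        sigma = dict(zip(interior, values))
        sigma.update((b, 1) for b in boundary)
        weight = 1.0
        for x, y in edges:
            if sigma[x] != sigma[y]:
                weight *= math.exp(-2.0 * beta)
        total += weight
    return total


def count_free_clusters(interior, boundary, open_edges):
    """Number of clusters of interior vertices not connected to the (wired) boundary."""
    parent = {v: v for v in list(interior) + list(boundary)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for b in boundary[1:]:
        parent[find(b)] = find(boundary[0])
    for x, y in open_edges:
        parent[find(x)] = find(y)
    wired_root = find(boundary[0])
    return len({find(v) for v in interior if find(v) != wired_root})


def fk_partition_function(interior, boundary, edges, beta):
    """Sum over edge configurations of p^open (1-p)^closed 2^(free clusters)."""
    p = 1.0 - math.exp(-2.0 * beta)
    total = 0.0
    for omega in itertools.product((0, 1), repeat=len(edges)):
        open_edges = [e for e, w in zip(edges, omega) if w == 1]
        weight = p ** len(open_edges) * (1.0 - p) ** (len(edges) - len(open_edges))
        total += weight * 2 ** count_free_clusters(interior, boundary, open_edges)
    return total


interior = [0, 1, 2]
boundary = ["a", "b"]
edges = [("a", 0), (0, 1), (1, 2), (2, "b"), ("a", 1)]
z_spin = spin_partition_function(interior, boundary, edges, beta=0.7)
z_fk = fk_partition_function(interior, boundary, edges, beta=0.7)
print(z_spin, z_fk)
```

The two sums agree up to rounding. The same enumeration restricted to edge configurations without an open path between two boundary parts would give the numerator appearing in (4.1), although exhaustive enumeration is only feasible for a handful of edges.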

Wulff Construction in Three and More Dimensions

211

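4.2. Second step. In the following, we denote by $\Lambda'_N = \bar\Lambda\big(N, N, \tfrac{1}{2} f(N)\big) \cap \mathbb{Z}^3$ the parallelepiped included in $\Lambda_N$.

Lemma 4.2. One has
\[ \tau(n) = \lim_{N\to\infty} -\frac{1}{N^2} \log \Phi^{w,p_\beta}_{\Lambda_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big). \]

Proof. By definition $\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}$ is included in $\{\partial^+\Lambda_N \not\leftrightarrow \partial^-\Lambda_N\}$. Therefore, Lemma 4.1 implies
\[ \tau(n) \le \liminf_{N\to\infty} -\frac{1}{N^2} \log \Phi^{w,p_\beta}_{\Lambda_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big). \]
Let us prove the reverse inequality. The event $\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}$ is decreasing and supported by $[\Lambda'_N]^w_e$. Thus (3.2) gives
\[ \Phi^{w,p_\beta}_{\Lambda_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big) \ge \Phi^{w,p_\beta}_{\Lambda'_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big). \qquad (4.2) \]
Since surface tension does not depend on the function $f$, Lemma 4.1 implies
\[ \tau(n) = \lim_{N\to\infty} -\frac{1}{N^2} \log \Phi^{w,p_\beta}_{\Lambda'_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big). \]
Thus using (4.2), the lemma is proven. □

4.3. Third step. Let $\Phi^{f,w}_{\Lambda_N}$ be the FK measure with wired boundary conditions on the sides of $\Lambda_N$ parallel to $n$ and free on the sides orthogonal to $n$.

Lemma 4.3. There is a constant $\beta_0$ independent of $n$ and $f$ such that for any $\beta$ larger than $\beta_0$,
\[ \tau(n) = \lim_{N\to\infty} -\frac{1}{N^2} \log \Phi^{f,w}_{\Lambda_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big). \]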

Proof. The event $\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}$ is denoted by $S_N$. Since $S_N$ is decreasing, applying (3.1) one observes that $\Phi^{f,w}_{\Lambda_N}(S_N) \ge \Phi^{w,p_\beta}_{\Lambda_N}(S_N)$, so that Lemma 4.2 implies
\[ \tau(n) \ge \limsup_{N\to\infty} -\frac{1}{N^2} \log \Phi^{f,w}_{\Lambda_N}(S_N). \qquad (4.3) \]
To prove the reverse inequality, we introduce the slabs $Sl^+_N$ and $Sl^-_N$ in $\Lambda_N$,
\[ Sl^+_N = \Big(\bar\Lambda\big(N, N, \tfrac{1}{10} f(N)\big) + \tfrac{3}{8} f(N)\, n\Big) \cap \mathbb{Z}^3, \qquad Sl^-_N = \Big(\bar\Lambda\big(N, N, \tfrac{1}{10} f(N)\big) - \tfrac{3}{8} f(N)\, n\Big) \cap \mathbb{Z}^3. \]
For any $\omega$ in $\Omega^w_{\Lambda_N}$, we call a vertex $x$ white if $\omega_b = 1$ for every edge $b$ incident with $x$, and black otherwise. Let $A^+_N$ (resp. $A^-_N$) be the event that there is a surface of white vertices which crosses the slab $Sl^+_N$ (resp. $Sl^-_N$) and separates the two sides of the slab orthogonal to $n$. Equivalently, one can define $(A^+_N)^c$ as the set of configurations


T. Bodineau

$\omega_{\Lambda_N}$ which contain a ∗-connected path of black vertices intersecting the 2 sides of $Sl^+_N$ orthogonal to $n$. One has
\[ \Phi^{f,w}_{\Lambda_N}(S_N) = \Phi^{f,w}_{\Lambda_N}\big(S_N \cap A^+_N \cap A^-_N\big) + \Phi^{f,w}_{\Lambda_N}\big(S_N \cap (A^+_N \cap A^-_N)^c\big). \qquad (4.4) \]
First we estimate the last term in the RHS. It is enough to prove an upper bound for $\Phi^{f,w}_{\Lambda_N}\big(S_N \cap (A^+_N)^c\big)$. The events $S_N$ and $(A^+_N)^c$ have distinct supports, so that we can take the conditional expectation with respect to $\partial\omega$, the configuration outside $Sl^+_N$,
\[ \Phi^{f,w}_{\Lambda_N}\big(S_N \cap (A^+_N)^c\big) = \Phi^{f,w}_{\Lambda_N}\Big(1_{S_N}\, \Phi^{\partial\omega}_{Sl^+_N}\big((A^+_N)^c\big)\Big). \]
Since $(A^+_N)^c$ is decreasing, (3.1) implies
\[ \Phi^{\partial\omega}_{Sl^+_N}\big((A^+_N)^c\big) \le \Phi^{f,p_\beta}_{Sl^+_N}\big((A^+_N)^c\big), \]
where the free boundary conditions are outside the domain $[Sl^+_N]^w_e$. In order to control this term, we use a Peierls argument (see [Gri], p. 1486). By the comparison result of Aizenman, Chayes, Chayes and Newman [ACCN], the above probability is bounded by the percolation (product) measure $\bar\Phi^{p'_\beta}_{Sl^+_N}$ with $p'_\beta = \dfrac{p_\beta}{p_\beta + 2(1-p_\beta)}$,
\[ \Phi^{f,p_\beta}_{Sl^+_N}\big((A^+_N)^c\big) \le \bar\Phi^{p'_\beta}_{Sl^+_N}\big((A^+_N)^c\big). \]
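As a small numerical aside, the comparison density can be tabulated to see how quickly it approaches 1. This sketch assumes $p_\beta = 1 - e^{-\beta}$ (consistent with the expansion in the proof of Lemma 4.1) and the standard form $p' = p/(p + q(1-p))$ of the comparison with $q = 2$; both are assumptions of the illustration, and the helper names are hypothetical.

```python
import math


def fk_edge_probability(beta):
    """Assumed FK edge parameter p_beta = 1 - exp(-beta)."""
    return 1.0 - math.exp(-beta)


def comparison_density(p, q=2.0):
    """Density p' = p / (p + q (1 - p)) of the dominated product measure."""
    return p / (p + q * (1.0 - p))


def density_table(betas):
    """Return (beta, p_beta, p'_beta) triples for the given inverse temperatures."""
    rows = []
    for beta in betas:
        p = fk_edge_probability(beta)
        rows.append((beta, p, comparison_density(p)))
    return rows
```

Calling `density_table([1.0, 2.0, 5.0, 10.0])` shows $p'_\beta$ increasing towards 1, which is the regime in which the Peierls estimate used next is applied.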

We choose $\beta$ large enough so that $p'_\beta$ is close to 1. Then the Peierls estimate holds and there is a constant $c > 0$ such that the probability that a ∗-connected black path joins 2 vertices $x$ and $y$ on both sides of $Sl^+_N$ is less than $\exp\big(-\tfrac{c}{10} f(N)\big)$. This comes from the fact that the length of such a path is at least $\tfrac{1}{10} f(N)$,
\[ \bar\Phi^{p'_\beta}_{Sl^+_N}\big((A^+_N)^c\big) \le N^2 \exp\Big(-\frac{c}{10} f(N)\Big). \]
One finally obtains
\[ \Phi^{f,w}_{\Lambda_N}\big(S_N \cap (A^+_N)^c\big) \le N^2 \exp\Big(-\frac{c}{10} f(N)\Big)\, \Phi^{f,w}_{\Lambda_N}(S_N). \qquad (4.5) \]
We turn now to the estimate of $\Phi^{f,w}_{\Lambda_N}\big(S_N \cap A^+_N \cap A^-_N\big)$. For a given $\omega$ in $A^+_N$, we are going to define the surface $\mathcal{S}^+(\omega)$ of white vertices which is the closest to the "upper" side of $Sl^+_N$. First we construct the black set $\mathcal{B}^+(\omega)$ as follows: $\mathcal{B}^+(\omega)$ contains the vertices in the "upper" side of $\partial Sl^+_N$, i.e. the vertices at distance less than 2 of the hyperplane parallel to $(e_1, e_2)$ and centered in $\tfrac{17}{40} f(N)\, n$. Furthermore $\mathcal{B}^+(\omega)$ contains all the black vertices linked by a ∗-connected path of black vertices to the boundary of the "upper" side of $Sl^+_N$. A vertex $x$ is in $\mathcal{S}^+(\omega)$ if it belongs to the boundary of $\mathcal{B}^+(\omega)$ and if there is a path of vertices joining $x$ to 0 without crossing $\mathcal{B}^+(\omega)$. By construction the vertices in $\mathcal{S}^+(\omega)$ are white. In the same way we define $\mathcal{S}^-(\omega)$ as the surface of white vertices which is the closest to the "lower" side of $Sl^-_N$, i.e. to the set of the vertices at distance less than 2 of the hyperplane parallel to $(e_1, e_2)$ and centered in $-\tfrac{17}{40} f(N)\, n$.


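The region between the surfaces $\mathcal{S}^+$ and $\mathcal{S}^-$ is denoted by $\mathcal{S}_N$ and by construction $\Lambda'_N$ is included in $\mathcal{S}_N$. Therefore we can consider the conditional expectation of $S_N$ with respect to the configurations outside $\mathcal{S}_N$ (the measurability is discussed in [Gri, p. 1487]). Since $\mathcal{S}^+$ and $\mathcal{S}^-$ contain only white vertices, one gets
\[ \Phi^{f,w}_{\Lambda_N}\big(S_N \cap A^+_N \cap A^-_N\big) = \Phi^{f,w}_{\Lambda_N}\Big(1_{A^+_N \cap A^-_N}\, \Phi^{w,p_\beta}_{\mathcal{S}_N}(S_N)\Big). \]
The event $S_N$ is decreasing, thus strong FKG property (3.2) implies
\[ \Phi^{f,w}_{\Lambda_N}\big(S_N \cap A^+_N \cap A^-_N\big) \le \Phi^{f,w}_{\Lambda_N}\big(A^+_N \cap A^-_N\big)\, \Phi^{w,p_\beta}_{\Lambda_N}(S_N) \le \Phi^{w,p_\beta}_{\Lambda_N}(S_N). \qquad (4.6) \]
Combining (4.4), (4.5) and (4.6), we obtain
\[ \Phi^{f,w}_{\Lambda_N}(S_N) \le \Phi^{w,p_\beta}_{\Lambda_N}(S_N) + N^2 \exp\Big(-\frac{c}{10} f(N)\Big)\, \Phi^{f,w}_{\Lambda_N}(S_N). \]
Applying Lemma 4.2, we get
\[ \liminf_{N\to\infty} -\frac{1}{N^2} \log \Phi^{f,w}_{\Lambda_N}(S_N) \ge \tau(n). \]
The lemma is completed. □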

4.4. Fourth step. Now, we will modify the boundary conditions and prove that the surface tension remains unchanged. The following lemma will be important in the proof of Proposition 2.1. It requires the assumption $\beta$ large. Notice that the observation used in the proof of Lemma 4.4 that the bonds on the vertical sides can be closed up to a small error has been already made in [Ce].

We denote by $\partial^{top}\Lambda'_N$ (resp. $\partial^{bot}\Lambda'_N$) the face of $\partial^+\Lambda'_N$ (resp. $\partial^-\Lambda'_N$) orthogonal to $n$. Let $\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}$ be the event such that there is no open path inside $[\Lambda'_N]^w_e$ connecting $\partial^{top}\Lambda'_N$ to $\partial^{bot}\Lambda'_N$. Finally, we set
\[ \delta = \limsup_{N\to\infty} \frac{f(N)}{N}, \]
and suppose that $\delta$ is finite.

Lemma 4.4. There is a constant $\beta_0$ independent of $n$ and $f$ such that for any $\beta$ larger than $\beta_0$,
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \sup_{\pi} \Phi^{\pi,p_\beta}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big) \le -\tau(n) + c_\beta\, \delta, \]
where the constant $c_\beta$ depends only on $\beta$. The above inequality holds uniformly over the boundary conditions $\pi$ outside $[\Lambda_N]^w_e$. Notice that in the lemma, $\pi$ is not a subset of $\partial\Lambda_N$ but a bond configuration in $\Lambda_N^c$.


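Proof. As $\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}$ is decreasing, strong FKG property (3.1) implies
\[ \sup_{\pi} \Phi^{\pi,p_\beta}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big) \le \Phi^{f,p_\beta}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big), \]
where the free boundary conditions are outside $[\Lambda_N]^w_e$. Note also that
\[ \Phi^{f,p_\beta}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big) \le 2^{4 f(N) N}\, \Phi^{f,w}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big). \qquad (4.7) \]
We fix a configuration $\omega$ in $\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}$. The inner boundary of $\Lambda_N$ is defined by
\[ \partial^*\Lambda_N = \{x \in \Lambda_N \mid \exists\, y \notin \Lambda_N,\ y \sim x\}. \]
For any vertex $x$ on the sides of $\partial^*\Lambda_N$ parallel to $n$, we modify the edges of $\omega$ incident with $x$ into closed edges and denote by $\bar\omega$ the new configuration. By construction $\bar\omega$ belongs to $\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}$. Noticing that the number of edges which have been modified is smaller than $50 f(N) N$ and using (4.7), one has
\[ \Phi^{f,p_\beta}_{\Lambda_N}\big(\{\partial^{top}\Lambda'_N \not\leftrightarrow \partial^{bot}\Lambda'_N\}\big) \le \exp\big(c_\beta f(N) N\big)\, \Phi^{f,w}_{\Lambda_N}\big(\{\partial^+\Lambda'_N \not\leftrightarrow \partial^-\Lambda'_N\}\big), \]
where $c_\beta$ depends only on $\beta$. Using Lemma 4.3, we complete the proof. □

5. Upper Bound: Proposition 2.1

Throughout this section, we fix $u$ in $BV(T, \{+1, -1\})$. We recall that $\mathcal{F}(u)$ is finite. We split the proof into 3 steps.

5.1. Approximation. First we suppose that $\partial^* u$ is included in the interior of $T$. The general case will be treated in Subsect. 5.3. We approximate the reduced boundary of $u$ with a finite number of parallelepipeds. Similar theorems were already stated in [ABCP] and [Ce]. The following result is proved in Appendix 8.3.

Theorem 5.1. For any $\delta$ positive, there exists $h$ positive such that there are $\ell$ disjoint parallelepipeds $R^1, \dots, R^\ell$ included in $T$ with square basis $B^1, \dots, B^\ell$ of size $h$ and height $\delta h$. The basis $B^i$ divides $R^i$ in 2 parallelepipeds $R^{i,+}$ and $R^{i,-}$ and we denote by $n_i$ the normal to $B^i$. Furthermore, the parallelepipeds satisfy the following properties:
\[ \int_{R^i} |\chi_{R^i}(r) - u(r)|\, dr \le \delta\, \mathrm{vol}(R^i) \quad\text{and}\quad \Big| \sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, d\mathcal{H}_x - \mathcal{F}(u) \Big| \le \delta, \]
where $\chi_{R^i} = 1_{R^{i,+}} - 1_{R^{i,-}}$ and the volume of $R^i$ is $\mathrm{vol}(R^i) = \delta h^3$. The area of $B^i$ is $\int_{B^i} d\mathcal{H}_x = h^2$.

We fix $\delta$ positive. The approximation procedure implies
\[ \lim_{\delta' \to 0} \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(\frac{M}{m_\beta} \in \mathcal{V}(u, \delta')\Big) \le \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(\frac{M}{m_\beta} \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 2\delta\, \mathrm{vol}(R^i)\big)\Big), \]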


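where the $\varepsilon$-neighborhood of $R^i$ is
\[ \mathcal{V}(R^i, \varepsilon) = \Big\{ v \in L^1\Big(T, \Big[-\frac{1}{m_\beta}, \frac{1}{m_\beta}\Big]\Big) \;\Big|\; \int_{R^i} |v(r) - \chi_{R^i}(r)|\, dr \le \varepsilon \Big\}. \]
According to Lemma 3.1, there is $\zeta$ small enough, depending on $\delta$ and $h$, such that
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(\frac{M}{m_\beta} \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 2\delta\, \mathrm{vol}(R^i)\big)\Big) \le \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 3\delta\, \mathrm{vol}(R^i)\big)\Big). \]
Therefore to prove Proposition 2.1, it is enough to show that
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 3\delta\, \mathrm{vol}(R^i)\big)\Big) \le -\mathcal{F}(u) + C_{\beta,u}\, \delta, \]
where the constant $C_{\beta,u}$ depends only on $\beta$ and $u$. Each box can be labeled by 3 values $0, \pm 1$, thus the number of configurations $T^\zeta$ is less than $3^{N^{3(1-\alpha)}}$. As $3(1-\alpha) < 2$, this term has no entropic effect. Thus it remains to check
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \Big( \sup_{T^\zeta \in U^\delta} \mu^+_{\beta,\Delta_N}\big(\{T^\zeta\}\big) \Big) \le -\mathcal{F}(u) + C_{\beta,u}\, \delta, \qquad (5.1) \]
where $U^\delta$ denotes $\bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 3\delta\, \mathrm{vol}(R^i)\big)$ and $\{T^\zeta\}$ is the set of configurations $(\sigma, \omega)$ which realize $T^\zeta$.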
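The claim that the counting term "has no entropic effect" can be made concrete: the logarithm of $3^{N^{3(1-\alpha)}}$, divided by the surface order $N^2$, tends to 0 as soon as $3(1-\alpha) < 2$. The short sketch below only illustrates this arithmetic; the function name and the sample values of $\alpha$ and $N$ are hypothetical choices.

```python
import math


def entropy_per_surface(N, alpha):
    """(1 / N^2) * log(3 ** (N ** (3 (1 - alpha)))), computed without overflow."""
    return N ** (3.0 * (1.0 - alpha) - 2.0) * math.log(3.0)


def entropy_profile(alpha, exponents=(2, 4, 6, 8)):
    """Values of the entropy term at N = 10^k for the given exponents k."""
    return [entropy_per_surface(10.0 ** k, alpha) for k in exponents]
```

For example `entropy_profile(0.5)` is decreasing in $N$, whereas for $\alpha < 1/3$ the same quantity would grow, which is why the condition $3(1-\alpha) < 2$ matters.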

5.2. Minimal section argument. The microscopic domain associated to $R^i$ is $R^i_N = N R^i \cap \Delta_N$. We also set $R^{i,+}_N = N R^{i,+} \cap \Delta_N$ and $R^{i,-}_N = R^i_N \setminus R^{i,+}_N$. Let $\mathcal{L}^i_N$ be the subset of boxes $B(x, N^\alpha)$ intersecting $R^i_N$. The number of boxes intersecting $R^{i,+}_N$ is
\[ \mathcal{N}^{i,+}_N = N^{3(1-\alpha)}\, \frac{\mathrm{vol}(R^i)}{2} + O\big(N^{2(1-\alpha)}\big). \qquad (5.2) \]
This error is due to the fact that the partition may not be exact on the sides of $R^{i,+}_N$. A similar estimate holds for $\mathcal{N}^{i,-}_N$, the number of boxes intersecting $R^{i,-}_N$.

To any $T^\zeta$ in $\bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, 3\delta\, \mathrm{vol}(R^i)\big)$, we associate the set of bad boxes, which are the boxes in $\mathcal{L}^i_N$ labeled by $Z^\zeta_x = 0$ and the ones intersecting $R^{i,+}_N$ (resp. $R^{i,-}_N$) labeled by $Z^\zeta_x = -1$ (resp. $Z^\zeta_x = 1$).

We will now use the $L^1$-constraint to derive bounds on the number of bad boxes. The number of boxes in $\mathcal{L}^i_N$ not included in $R^i_N$ is smaller than $50 h^2 N^{2(1-\alpha)}$, therefore
\[ \Big| N^{3(1-\alpha)} \int_{R^{i,+}} |T^\zeta_r - 1|\, dr - \sum_{B(x,N^\alpha) \cap R^{i,+}_N \neq \emptyset} |Z^\zeta_x - 1| \Big| \le 100\, h^2 N^{2(1-\alpha)}. \]


Since $T^\zeta$ belongs to $\mathcal{V}\big(R^i, 3\delta\, \mathrm{vol}(R^i)\big)$, one gets from (5.2) that for $N$ large enough the number of bad boxes in $R^{i,+}_N$ is smaller than $10\delta\, \mathcal{N}^{i,+}_N$. In the same way, we check that the number of bad boxes in $R^{i,-}_N$ is smaller than $10\delta\, \mathcal{N}^{i,-}_N$.

Let $R'^{\,i}$ be the parallelepiped included in $R^i$ with basis $B^i$ and height $\tfrac{\delta}{2} h$. Its microscopic counterpart is $R'^{\,i}_N$. We will apply the minimal section argument introduced in [BBBP] and relate the expectation of $\{T^\zeta\}$ to the one of $\bigcap_{i=1}^{\ell} \{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\}$.

For any integer $k$, we set $B^{i,k} = B^i + \dfrac{10 k}{N^{1-\alpha}}\, n_i$. Let $B^{i,k}_N$ be the microscopic subset of $R'^{\,i}_N$ associated to $B^{i,k}$,
\[ B^{i,k}_N = \big\{ j \in R'^{\,i}_N \mid \exists\, r \in B^{i,k},\ \|j - N r\|_1 \le 10 \big\}. \]
We define $\mathcal{B}^k_i$ as the smallest connected set of mesoscopic boxes containing
\[ \big\{ B(y, N^\alpha) \in \mathcal{L}^i_N \mid B(y, N^\alpha) \cap B^{i,k}_N \neq \emptyset \big\}. \]
By construction the $\mathcal{B}^k_i$ are disjoint surfaces of boxes. For $k$ positive, let $n^+_i(k)$ be the number of bad boxes in $\mathcal{B}^k_i$ and define
\[ n^+_i = \min\Big\{ n^+_i(k);\ 0 < k < \frac{\delta h}{30} N^{1-\alpha} \Big\}. \]
Call $k^+$ the smallest location where the minimum is achieved and define the minimal section as $\mathcal{B}^{k^+}_i$. For $k$ non-positive, we denote by $\mathcal{B}^{k^-}_i$ the minimal section in $R^{i,-}_N$ and by $n^-_i$ the number of bad boxes in $\mathcal{B}^{k^-}_i$.

For any $\{T^\zeta\}$ in $U^\delta$, we will check that the total number of bad boxes is bounded by
\[ \sum_{i=1}^{\ell} \big(n^+_i + n^-_i\big) \le C_u\, \delta\, N^{2(1-\alpha)}, \qquad (5.3) \]
where $C_u$ is a constant depending only on $u$. By definition, one has
\[ \frac{\delta h}{30} N^{1-\alpha}\, n^+_i \le \sum_{B(x,N^\alpha) \cap R^{i,+}_N \neq \emptyset} 1_{Z^\zeta_x \neq 1} \le 10\delta\, \mathcal{N}^{i,+}_N. \]
For $N$ large enough, (5.2) implies that $n^+_i \le 10^3\, \delta h^2 N^{2(1-\alpha)}$. Note that $h^2$ is in fact the area of $B^i$, therefore the approximation procedure implies that $\ell h^2$ is bounded by a constant depending on the perimeter of $\{u = -1\}$. Thus (5.3) holds.

We are now going to use all the previous estimates. We define
\[ A = \big\{ \omega \in \Omega^w_{\Delta_N} \mid \exists\, \sigma \text{ such that } (\sigma, \omega) \in \{T^\zeta\} \big\}. \]
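The minimal section step above is an averaging (pigeonhole) argument: among disjoint sections, the one with the fewest bad boxes carries at most the average number. A minimal sketch of that selection is given here; the list of per-section counts is a hypothetical input standing in for the values $n^+_i(k)$, and the function names are ours.

```python
def minimal_section(bad_counts):
    """Return (k, n): the smallest index k attaining the minimum n of bad_counts."""
    if not bad_counts:
        raise ValueError("need at least one section")
    n_min = min(bad_counts)
    # list.index returns the first (smallest) position of the minimum,
    # matching the choice of the smallest location where it is achieved.
    return bad_counts.index(n_min), n_min


def average_bad_boxes(bad_counts):
    """Average number of bad boxes per section; the minimum never exceeds it."""
    return sum(bad_counts) / len(bad_counts)
```

With counts `[5, 3, 7, 3]` the selected section is the one at index 1 with 3 bad boxes, and 3 is below the average 4.5; this is the same comparison that turns the volume bound on bad boxes into the surface-order bound (5.3).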

Any configuration $\omega$ in $A$ will be mapped into $\bar\omega$ by the following procedure. For any bad box $B(x, N^\alpha)$ in the minimal sections, we change the open edges of $\omega$ located on the sides of the box $B(x, \tfrac{5}{4} N^\alpha)$ into closed edges. The new configuration $\bar\omega$ belongs to $\{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\}$, because any open path of $\omega$ which joins $\partial^{bot} R'^{\,i}_N$ to $\partial^{top} R'^{\,i}_N$ intersects at least one of the minimal sections on a bad block and therefore is cut by the above procedure. Let $\mathcal{C}$ be an open path of $\omega$ joining $\partial^{top} R'^{\,i}_N$ to $\partial^{bot} R'^{\,i}_N$ and suppose


that $\mathcal{C}$ crosses the minimal sections without intersecting a bad box. Then $\mathcal{C}$ intersects the boxes $B(x^+, N^\alpha)$ and $B(x^-, N^\alpha)$ in $\mathcal{B}^{k^+}_i$ and $\mathcal{B}^{k^-}_i$ with labels $Y^\zeta_{x^+} = Y^\zeta_{x^-} = 1$. This would imply that the crossing clusters of $B(x^+, N^\alpha)$ and $B(x^-, N^\alpha)$ are connected to $\mathcal{C}$, so that $Z^\zeta_{x^+} = Z^\zeta_{x^-}$. Therefore one of these boxes has to be a bad box.

Around the bad boxes, we change at most $20 (n^+_i + n^-_i) N^{2\alpha}$ edges. From (5.3) the total number of edges involved in the previous procedure is bounded by $100\, C_u\, \delta N^2$. Therefore we get
\[ \mu^+_{\beta,\Delta_N}\big(\{T^\zeta\}\big) \le \Phi^{w,p_\beta}_{\Delta_N}(A) \le \exp\big(C_{\beta,u}\, \delta N^2\big)\, \Phi^{w,p_\beta}_{\Delta_N}\Big( \bigcap_{i=1}^{\ell} \{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\} \Big), \]
where the constant $C_{\beta,u}$ depends only on $\beta$ and $u$.

Conditioning outside each domain $R^i_N$ and using Lemma 4.4, we derive
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \Big( \sup_{T^\zeta \in U^\delta} \Phi^{w,p_\beta}_{\Delta_N}\big(\{T^\zeta\}\big) \Big) \le -\sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, d\mathcal{H}_x + C_{\beta,u}\, \delta + c_\beta\, \ell h^2 \delta, \]
where $c_\beta$ was defined in Lemma 4.4. Noticing that $\ell h^2$ is bounded in terms of the perimeter of $u$ and using Theorem 5.1, we derive (5.1).

5.3. Boundary conditions. Let $U$ be the intersection of the reduced boundary $\partial^* u$ and of $\partial T$. Suppose that $U$ has a positive 2 dimensional Hausdorff measure. In this case we cannot approximate the surface $U$ as in Theorem 5.1 with parallelepipeds included in $T$. We state a variant of Theorem 5.1 proven in Appendix 8.3.

Theorem 5.2. For any $\delta$ positive, there exist $h$ positive and $\ell$ disjoint squares $B^1, \dots, B^\ell$ in $\partial T$ of size $h$ and normal $n_i$ such that
\[ \Big| \sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, d\mathcal{H}_x - \int_U \tau(n_x)\, d\mathcal{H}_x \Big| \le \delta. \]
Furthermore, there are $\ell$ disjoint parallelepipeds $R^1, \dots, R^\ell$ included in $T$ such that one of the faces of $R^i$ is $B^i$ and the height of $R^i$ is $\delta h$. The parallelepipeds also satisfy
\[ \forall i \le \ell, \qquad \int_{R^i} |1 + u_r|\, dr \le \delta\, \mathrm{vol}(R^i). \]
The proof of the upper bound is based on local estimates in each parallelepiped, thus we will simply explain how to adapt the previous proof to obtain
\[ \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, \delta\, \mathrm{vol}(R^i)\big)\Big) \le -\int_U \tau(n_x)\, d\mathcal{H}_x + C_{\beta,u}\, \delta, \qquad (5.4) \]
where $C_{\beta,u}$ is a constant and $\mathcal{V}\big(R^i, \delta\, \mathrm{vol}(R^i)\big)$ is
\[ \mathcal{V}\big(R^i, \delta\, \mathrm{vol}(R^i)\big) = \Big\{ v \in L^1\Big(T, \Big[-\frac{1}{m_\beta}, \frac{1}{m_\beta}\Big]\Big) \;\Big|\; \int_{R^i} |v_r + 1|\, dr \le \delta\, \mathrm{vol}(R^i) \Big\}. \]


Combining estimates (5.1) and (5.4), one derives Proposition 2.1 for any function of bounded variation $u$ in $T$.

Let $R^i_N$ be the microscopic set associated to $R^i$. The set of bad boxes is the set of boxes $B(x, N^\alpha)$ intersecting $R^i_N$ and labeled by 0 or 1. Using the $L^1$ constraint, we see that the number of bad boxes is smaller than $10\, \delta^2 h^3 N^{3(1-\alpha)}$. Let $R'^{\,i}$ be the parallelepiped included in $R^i$ with height $\tfrac{\delta h}{2}$ and such that one of its faces is $B^i$. The previous argument implies that the minimal section contains less than $100\, \delta h^2 N^{2(1-\alpha)}$ bad boxes. Therefore we can cut the wired open paths which cross the minimal section and obtain for $N$ large enough,
\[ \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \bigcap_{i=1}^{\ell} \mathcal{V}\big(R^i, \delta\, \mathrm{vol}(R^i)\big)\Big) \le \exp\big(C_{\beta,u}\, \delta N^2\big)\, \Phi^{w}_{\Delta_N}\Big( \bigcap_{i=1}^{\ell} \{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\} \Big). \]
Notice that $\partial^{top} R'^{\,i}_N$ is nothing but a subset of $\partial\Delta_N$.

Let $\bar R^i$ be the union of $R^i$ and of the parallelepiped in $T^c$ with height $\tfrac{\delta h}{2}$ and one of whose faces is equal to $B^i$. Denote by $\bar R^i_N$ the corresponding microscopic domain, then by inequality (3.1) one has
\[ \tilde\Phi^{\pi,w}_{R^i_N}\big(\{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\}\big) \le \Phi^{f,p_\beta}_{\bar R^i_N}\big(\{\partial^{top} R'^{\,i}_N \not\leftrightarrow \partial^{bot} R'^{\,i}_N\}\big), \]
where $\tilde\Phi^{\pi,w}_{R^i_N}$ is the FK measure with wired boundary conditions on the face of $R^i_N$ which coincides with $\partial\Delta_N$ and $\pi$ otherwise. This enables us to apply Lemma 4.4 and to recover (5.4).

6. Lower Bound: Proposition 2.2

The proof rests only on Lemma 4.1, therefore Proposition 2.2 holds as soon as $\beta$ is larger than $\tilde\beta_c$ (see Subsect. 3.2). The proof is divided into 2 steps.

6.1. Approximation procedure. We first state an approximation theorem which will be proven in Appendix 8.3. We call a polyhedral set a set whose boundary is included in the union of a finite number of hyperplanes.

Theorem 6.1. For any $\delta$ positive, there exists a polyhedral set $W$ such that
\[ 1_W \in \mathcal{V}\Big(1_{W_m}, \frac{\delta}{3}\Big) \quad\text{and}\quad \mathcal{F}(W) - \mathcal{F}(W_m) \le \frac{\delta}{2}. \]
For any $h$ small enough there are $\ell$ disjoint cubes $R^1, \dots, R^\ell$ of size $h$ with basis $B^1, \dots, B^\ell$ included in $\partial W$. Furthermore, the squares $B^1, \dots, B^\ell$ cover $\partial W$ up to a set of measure less than $\delta$ denoted by $U^\delta = \partial W \setminus \bigcup_{i=1}^{\ell} B^i$ and they satisfy
\[ \sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, d\mathcal{H}_x - \mathcal{F}(W_m) \le \delta, \]
where the normal to $B^i$ is denoted by $n_i$.


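We fix $\delta$ positive and choose a set $W$ approximating $W_m$, then
\[ \Big\{ T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big) \Big\} \cap \Big\{ \frac{M}{m_\beta} \in \mathcal{V}\Big(T^\zeta, \frac{\delta}{3}\Big) \Big\} \subset \Big\{ \frac{M}{m_\beta} \in \mathcal{V}\big(1_{W_m}, \delta\big) \Big\}. \]
Lemma 3.1 implies that there exists $\zeta$ such that the event $\big\{ \tfrac{M}{m_\beta} \notin \mathcal{V}\big(T^\zeta, \tfrac{\delta}{3}\big) \big\}$ has a probability which vanishes exponentially fast, therefore
\[ \liminf_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(\frac{M}{m_\beta} \in \mathcal{V}\big(1_{W_m}, \delta\big)\Big) \ge \liminf_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big). \]
It remains to find a lower bound for the term in the RHS. For any $\varepsilon$ positive, we construct a shell around $\partial W$ which splits $T$ into 2 domains,
\[ S_\varepsilon = \{ r \in T \mid \mathrm{dist}(r, \partial W) \le \varepsilon \}. \]
We set $W^+_\varepsilon = \{ r \in W^c \mid \mathrm{dist}(r, \partial W) \ge \varepsilon \}$ and $W^-_\varepsilon = \{ r \in W \mid \mathrm{dist}(r, \partial W) \ge \varepsilon \}$. Let $W^\pm_{\varepsilon,N}$ be the set of mesoscopic boxes included in $N W^\pm_\varepsilon \cap \Delta_N$. We fix $\varepsilon$ such that the volume of $S_\varepsilon$ is smaller than $\tfrac{\delta}{10}$ and choose $h$ smaller than $\tfrac{\varepsilon}{2}$.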

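6.2. Exponential bound. The microscopic domain associated to the cube $R^i$ is denoted by $R^i_N = N R^i \cap \Delta_N$. We set $A_N = \bigcap_{i=1}^{\ell} \{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\}$. The microscopic domain
\[ U^\delta_N = \big\{ x \in \Delta_N \mid \exists\, r \in U^\delta,\ \|x - N r\|_1 \le 10 \big\} \]
is an enlargement of the surface $U^\delta$ defined in the approximation procedure. We introduce $B_N$, the set of configurations with closed edges in $U^\delta_N$. Hypotheses on $U^\delta$ imply that $B_N$ is supported by at most $10\, \delta N^2$ edges. We decompose $A_N \cap B_N$ into 2 disjoint sets
\[ \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N\big) = \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N \cap \{\forall B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ |Z^\zeta_x| = 1\}\big) + \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N \cap \{\exists B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ Z^\zeta_x = 0\}\big). \]
We first estimate the last term in the RHS. By definition the events $A_N \cap B_N$ and $\{(\sigma, \omega) \mid \exists B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ Z^\zeta_x = 0\}$ have disjoint supports. Taking the conditional expectation with respect to $A_N \cap B_N$ and using the stochastic domination (3.6) (see also Remark 8.1), we get
\[ \mu^+_{\beta,\Delta_N}\big(\{\exists B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ Z^\zeta_x = 0\} \,\big|\, A_N \cap B_N\big) \le N^{3(1-\alpha)} \exp(-c_\zeta N^\gamma). \]
Therefore, for $N$ large enough
\[ \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N\big) \le 2\, \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N \cap \{\forall B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ |Z^\zeta_x| = 1\}\big). \]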
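The factor 2 above only needs the union bound $N^{3(1-\alpha)} \exp(-c_\zeta N^\gamma)$ to drop below $1/2$ for large $N$. A quick way to see how large $N$ must be is to evaluate the logarithm of that bound; the sketch below does this in log-space to avoid overflow. The parameter values used in the example are purely illustrative assumptions, not values from the paper.

```python
import math


def log_union_bound(N, alpha, gamma, c_zeta):
    """log of N^{3(1-alpha)} * exp(-c_zeta * N^gamma)."""
    return 3.0 * (1.0 - alpha) * math.log(N) - c_zeta * N ** gamma


def bound_is_below_half(N, alpha, gamma, c_zeta):
    """True when the union bound is at most 1/2 at this particular N."""
    return log_union_bound(N, alpha, gamma, c_zeta) <= math.log(0.5)
```

With the illustrative choice $\alpha = 0.5$, $\gamma = 0.1$, $c_\zeta = 1$, the bound is still larger than 1 at $N = 10$ but far below $1/2$ at $N = 10^{30}$: the stretched exponential eventually wins, but only for very large $N$ when $\gamma$ is small.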


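By construction, no configuration $\omega$ of $A_N \cap B_N$ contains an open path joining the 2 connected components of $\Delta_N \setminus \big(\bigcup_{i=1}^{\ell} R^i_N \cup U^\delta_N\big)$. Therefore any configuration in $A_N \cap B_N \cap \{\forall B(x, N^\alpha) \subset W^+_{\varepsilon,N} \cup W^-_{\varepsilon,N};\ |Z^\zeta_x| = 1\}$ contains 2 disconnected microscopic crossing clusters. The cluster connected to $\partial\Delta_N$ is denoted by $\mathcal{C}^+$ and the other one by $\mathcal{C}^-$. The wired constraint imposes the sign 1 to $\mathcal{C}^+$. With probability $\tfrac{1}{2}$ we choose the sign of $\mathcal{C}^-$ to be $-1$. We define the event
\[ C_N = \Big\{ (\sigma, \omega) \in A_N \cap B_N \cap \{\forall B(x, N^\alpha) \subset W^+_{\varepsilon,N};\ Z^\zeta_x = 1\} \cap \{\forall B(x, N^\alpha) \subset W^-_{\varepsilon,N};\ Z^\zeta_x = -1\} \Big\}, \]
then $\mu^+_{\beta,\Delta_N}(A_N \cap B_N) \le 4\, \mu^+_{\beta,\Delta_N}(C_N)$. Thus for any configuration $(\sigma, \omega)$ in $C_N$, the function $T^\zeta(\sigma, \omega)$ is equal to 1 on $W^+_\varepsilon$ and to $-1$ on $W^-_\varepsilon$ (see (3.7)). Since the volume of $S_\varepsilon$ is less than $\tfrac{\delta}{10}$, we have
\[ C_N \subset \Big\{ (\sigma, \omega) \;\Big|\; T^\zeta(\sigma, \omega) \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big) \Big\}, \]
this leads to
\[ \mu^+_{\beta,\Delta_N}\Big(T^\zeta(\sigma, \omega) \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big) \ge \frac{1}{4}\, \mu^+_{\beta,\Delta_N}\big(A_N \cap B_N\big). \]
As $A_N \cap B_N$ depends only on the variable $\omega$, we replace the coupled measure by the FK measure $\Phi^{w,p_\beta}_{\Delta_N}$. The support of $B_N$ contains less than $10\, \delta N^2$ edges, so that
\[ \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big) \ge \frac{1}{4} \exp(-c_\beta\, \delta N^2)\, \Phi^{w,p_\beta}_{\Delta_N}\Big( \bigcap_{i=1}^{\ell} \{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\} \Big), \]
where $c_\beta$ is a constant depending on $\beta$. The events $\{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\}$ occur on disjoint supports; taking the conditional expectation with respect to the configuration $\partial\omega_i$ outside $R^i_N$, we have
\[ \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big) \ge \frac{\exp(-c_\beta\, \delta N^2)}{4}\, \Phi^{w,p_\beta}_{\Delta_N}\Big( \prod_{i=1}^{\ell} \Phi^{\partial\omega_i}_{R^i_N}\big(\{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\}\big) \Big). \]
Since the events $\{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\}$ are non increasing, (3.1) implies
\[ \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big) \ge \frac{\exp(-c_\beta\, \delta N^2)}{4} \prod_{i=1}^{\ell} \Phi^{w,p_\beta}_{R^i_N}\big(\{\partial^+ R^i_N \not\leftrightarrow \partial^- R^i_N\}\big). \]
Taking the limit as $N$ goes to infinity and using Lemma 4.1, we obtain
\[ \liminf_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\Big(T^\zeta \in \mathcal{V}\Big(1_W, \frac{\delta}{3}\Big)\Big) \ge -\sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, d\mathcal{H}_x - c_\beta\, \delta. \]
The proof is completed by letting $\delta$ go to 0. □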


7. Exponential Tightness: Proposition 2.3

In view of Lemma 3.1, Proposition 2.3 will be a consequence of

Lemma 7.1. Let $\beta$ be larger than $\tilde\beta_c$ (see Subsect. 3.2), then there is $C_\beta$ such that for any $a$ positive,
\[ \forall \zeta, \delta > 0, \qquad \limsup_{N\to\infty} \frac{1}{N^2} \log \mu^+_{\beta,\Delta_N}\big(T^\zeta \in \mathcal{V}(K_a, \delta)^c\big) \le -C_\beta\, a. \]

A high density of small contours can be interpreted on the mesoscopic scale as a high density of random variables $Z^\zeta = 0$. Such events are ruled out by Lemma 3.2. If $T^\zeta$ belongs to $\mathcal{V}(K_a, \delta)^c$ and does not contain many boxes labeled by $Z^\zeta = 0$, then there are mainly mesoscopic sets of constant sign. By definition of the variables $Z^\zeta$, two sets of different signs are disconnected on the microscopic level and therefore are separated by a layer of boxes with label $Z^\zeta = 0$. The problem is that the probability of the event $\{Z^\zeta_x = 0\}$ may only be of the order of $\exp(-N^\gamma)$, which is much higher than the expected surface order $\exp(-N^{2\alpha})$. Thus we introduce another coarse graining in order to recover the surface order. The scheme of the following proof was suggested by D. Ioffe.

Proof. As noticed in Subsect. 5.2, the number of configurations $T^\zeta$ is less than $3^{N^{3(1-\alpha)}}$. Thus it is enough to fix $T^\zeta$ in $\mathcal{V}(K_a, \delta)^c$ and to estimate $\mu^+_{\beta,\Delta_N}(\{T^\zeta\})$. Lemma 3.2 enables us to consider only configurations $T^\zeta$ with a number of boxes labelled by 0 smaller than $\delta N^{3(1-\alpha)}$. This amounts to saying that
\[ \int_T 1_{T^\zeta_r = 0}\, dr \le \delta. \qquad (7.1) \]
Each realization of $T^\zeta$ splits $T$ into $T = T_+ \cup T_- \cup T_0$, where $T^\zeta$ is constantly equal to $\pm 1$ on $T_\pm$ and to 0 on $T_0$. From (7.1), the measure of $T_0$ is smaller than $\delta$. The microscopic counterparts of $T_+$ and $T_-$ will be denoted by $\Delta_{N,+}$ and $\Delta_{N,-}$. Moreover as $T^\zeta$ belongs to $\mathcal{V}(K_a, \delta)^c$, for any regular set $A$ of $T$ such that $T_- \subset A \subset T \setminus T_+$,
\[ \|\tau\|_\infty \int_{\partial A} d\mathcal{H}_x \ge \int_{\partial A} \tau(n_x)\, d\mathcal{H}_x \ge a, \qquad (7.2) \]
where $\int_{\partial A} d\mathcal{H}_x$ is the perimeter of $\partial A$. Note that for each configuration in $\mathcal{V}(K_a, \delta)^c$ the set $T_-$ is not empty.

Let $L$ be an integer large enough which divides $N^\alpha$ and is independent of $N$. We partition $\Delta_N$ into boxes $B(i, L)$, where $i$ is in $\mathcal{L}_L$. We also introduce the boxes $B(i, \tfrac{5}{4} L)$, and following Subsect. 3.2, define a coarse graining on the scale $L$. Let $\{y_i\}$ be the family of random variables equal to 1 if the event $O_i$ (for the box $B(i, \tfrac{5}{4} L)$) is satisfied and 0 otherwise. We define the microscopic set $\tilde A_N$ as the union of $\Delta_{N,-}$ and of the boxes $B(i, L)$ labeled by 1 such that there is a ∗-connected path of boxes $B(j, L)$ labeled by 1 joining $B(i, L)$ to $\Delta_{N,-}$. For any configuration $(\sigma, \omega)$ in $\{T^\zeta\}$ there is no microscopic path connecting $\Delta_{N,+} \cup \partial\Delta_N$ to $\Delta_{N,-}$. Therefore there is no ∗-connected path of boxes $B(j, L)$ labeled by 1 connecting $\Delta_{N,+} \cup \partial\Delta_N$ to $\Delta_{N,-}$. This implies that $\Delta_{N,-} \subset \tilde A_N \subset \Delta_N \setminus \Delta_{N,+}$.


We define also $G_N = \Delta_N \setminus \tilde A_N$. Let $G_N = \bigcup_{i=1}^{\ell} G_{N,i}$ be the decomposition of $G_N$ into maximal connected components composed of boxes $B(j, L)$. The components $G_{N,i}$ such that $|G_{N,i}| < N^{3\alpha}$ cannot intersect $\Delta_{N,+}$. We set
\[ A_N = \tilde A_N \cup \bigcup_{|G_{N,i}| < N^{3\alpha}} G_{N,i}, \]
which satisfies $\Delta_{N,-} \subset A_N \subset \Delta_N \setminus \Delta_{N,+}$. To any $G_{N,i}$ with cardinality larger than $N^{3\alpha}$, we associate the contour
\[ \partial^L G_{N,i} = \big\{ B(j, L) \mid B(j, L) \subset G_{N,i} \text{ and } B(j, 2L) \cap A_N \neq \emptyset \big\}. \]
By construction $\partial^L G_{N,i}$ is a connected set of boxes $B(j, L)$ such that $y_j = 0$. Furthermore, whenever $|G_{N,i}| \ge N^{3\alpha}$, the number of boxes in the contour $\partial^L G_{N,i}$ (denoted by $\#\partial^L G_{N,i}$) satisfies
\[ \#\partial^L G_{N,i} \ge c \Big(\frac{N^\alpha}{L}\Big)^2, \qquad (7.3) \]
where $c$ is a universal constant. Finally, from inequality (7.2), we see that the cardinality of the boundary $\partial A_N$ is larger than $c\, a N^2 / \|\tau\|_\infty$, thus
\[ \sum_{|G_{N,i}| \ge N^{3\alpha}} \#\partial^L G_{N,i} \ge \frac{c\, a}{\|\tau\|_\infty} \Big(\frac{N}{L}\Big)^2. \qquad (7.4) \]
Stochastic domination (3.4) implies that for $L$ large enough, a Peierls estimate holds: there is a positive constant $C_{L,\beta}$ such that the probability that a connected surface $S$ of boxes $B(i, L)$ labeled by 0 contains more than $n$ boxes is bounded by
\[ \Phi^{w,p_\beta}_{\Delta_N}(\#S \ge n) \le N^3 \exp(-C_{L,\beta}\, n), \]
where $\#S$ is the number of boxes in $S$. By definition the contours $\partial^L G_{N,i}$ are connected surfaces of boxes $B(j, L)$ labeled by 0, so that using (7.3) and (7.4), the proof is concluded by a standard Peierls argument. □

8. Appendix

8.1. Surface tension. In [MMR], the surface tension $\tau'$ was defined by tilting the boundary conditions outside the domain $\Delta_M = \{-M, M\}^3$, where $M$ is an integer. For a given $n$ in $S^2$, we denote by $D_M$ the intersection of $[-M, M]^3$ with the hyperplane centered in 0 and orthogonal to $n$. The surface tension $\tau'$ is
\[ \tau'(n) = \lim_{M\to\infty} -\frac{1}{S_M} \log \frac{Z^{+,-}_{\Delta_M}}{Z^{+}_{\Delta_M}}, \]
where $S_M$ is the area of $D_M$ and the mixed boundary conditions in $\partial\Delta_M$ are $\sigma_i = 1$ if $i \cdot n > 0$ and $\sigma_i = -1$ otherwise. The boundary conditions above $D_M$ are equal to 1 and to $-1$ below.


This definition coincides with Definition 2.1. This is straightforward from the argument developed by [MMR] (Theorem 2). We recall this argument for completeness. First, we show that
\[ \tau'(n) \le \liminf_{N\to\infty} -\frac{1}{h^2 N^2} \log \frac{Z^{+,-}_{\Lambda_N}}{Z^{+}_{\Lambda_N}}. \qquad (8.1) \]
Let $M, N$ be 2 integers such that $N \ll M$. We shall write
\[ F'_M = -\log \frac{Z^{+,-}_{\Delta_M}}{Z^{+}_{\Delta_M}} \quad\text{and}\quad F(\Lambda_N) = -\log \frac{Z^{+,-}_{\Lambda_N}}{Z^{+}_{\Lambda_N}}, \]
where the parallelepiped $\Lambda_N$ was introduced in Definition 2.1. We tile $D_M$ with $k_M$ squares $B^i$ of side length $hN$ and at distance 10 from each other. These squares are chosen such that the area of $D_M$ not covered by $\bigcup_{i=1}^{k_M} B^i$ is smaller than $C\big(M f(N) + k_M hN\big)$, where $C$ is some positive constant independent of $M$ and $N$. If the center of $B^i$ does not coincide with a site on the lattice, we translate $B^i$ at a distance smaller than 1 such that the center of its translate $B'^{\,i}$ belongs to $\mathbb{Z}^3$. Let $\Lambda^i_N$ be the parallelepiped of basis $B'^{\,i}$ deduced from $\Lambda_N$ by translation. There is a choice of the squares $B^i$ such that all the parallelepipeds $\Lambda^i_N$ are included in $\Delta_M$. Note that $F(\Lambda^i_N) = F(\Lambda_N)$. Inequality (C.2) of [MMR] implies
\[ F'_M \le \sum_{i=1}^{k_M} F(\Lambda^i_N) + K C \big(M f(N) + k_M hN\big), \]
where $K$ is some positive constant. Since $|S_M - k_M h^2 N^2| \le C\big(M f(N) + k_M hN\big)$, one gets as $M$ goes to infinity,
\[ \tau'(n) \le \frac{1}{h^2 N^2} F(\Lambda_N) + \frac{K C}{hN}. \]
Letting $N$ go to infinity, we obtain (8.1). The reverse inequality
\[ \tau'(n) \ge \limsup_{N\to\infty} -\frac{1}{h^2 N^2} \log \frac{Z^{+,-}_{\Lambda_N}}{Z^{+}_{\Lambda_N}} \qquad (8.2) \]
is derived in the same way by choosing $M \ll N$ and by partitioning the basis of $\Lambda_N$ with translates of $D_M$. Combining (8.1) and (8.2), we see that $\tau(n) = \tau'(n)$.

8.2. Stochastic domination. The following result is a consequence of [I2] and of the proof of Theorem 1.1 of [Pi1]; we sketch the proof for completeness.

Theorem 8.1. For any $\zeta$ positive, there is $N_\zeta$ such that for all $N$ larger than $N_\zeta$, the family of random variables $|Z^\zeta_x|$ is dominated by a product Bernoulli measure $\pi_{\rho'_N}$ (see (3.6)).


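Proof. According to [LSS], it is enough to check that
\[ \mu^+_{\beta,\Delta_N}\big(|Z^\zeta_x| = 0 \,\big|\, |Z^\zeta_{x_1}| = \varepsilon_1, \dots, |Z^\zeta_{x_\ell}| = \varepsilon_\ell\big) \le \exp(-c_\zeta N^\gamma), \]
where the vertices $\{x_1, \dots, x_\ell\}$ are not ∗-neighbors of $x$ in $\mathcal{L}_{N^\alpha}$ and each $\varepsilon_i$ belongs to $\{0, 1\}$. Using notation from Subsect. 3.1, one has
\[ \mu^+_{\beta,\Delta_N}\big(|Z^\zeta_x| = 0, |Z^\zeta_{x_1}| = \varepsilon_1, \dots, |Z^\zeta_{x_\ell}| = \varepsilon_\ell\big) \le \Phi^{w,p_\beta}_{\Delta_N}\Big( Y^\zeta_x = 1;\ P^\omega_{\Delta_N}\big(|M_x - \mathrm{sign}(C^*)\, m_\beta| > 2\zeta,\ |Z^\zeta_{x_1}| = \varepsilon_1, \dots, |Z^\zeta_{x_\ell}| = \varepsilon_\ell\big) \Big) + \Phi^{w,p_\beta}_{\Delta_N}\Big( Y^\zeta_x = 0,\ Y^\zeta_{x_1} = \varepsilon_1, \dots, Y^\zeta_{x_\ell} = \varepsilon_\ell;\ P^\omega_{\Delta_N}\big(|Z^\zeta_{x_1}| = \varepsilon_1, \dots, |Z^\zeta_{x_\ell}| = \varepsilon_\ell\big) \Big). \]
Because of stochastic domination (3.4), the last term in the RHS is bounded by
\[ \exp(-c_\zeta N^\gamma)\, \mu^+_{\beta,\Delta_N}\big(|Z^\zeta_{x_1}| = \varepsilon_1, \dots, |Z^\zeta_{x_\ell}| = \varepsilon_\ell\big). \]
In order to control the other term, we have to take into account the deviations occurring from the random coloring of the small clusters, i.e. those of diameter less than $N^\gamma$. We enumerate the small clusters $C_1, \dots, C_{k_0}$ included in $B(x, N^\alpha)$. Their cardinals are denoted $c_1, \dots, c_{k_0}$ and their signs $s_1, \dots, s_{k_0}$. The random variables $s_1, \dots, s_{k_0}$ are iid Bernoulli. For $N$ large enough, one has
\[ \Big| N^{3\alpha} M_x - \mathrm{sign}(C^*)\, |C^*| - \sum_{i=1}^{k_0} s_i c_i \Big| < \frac{\zeta}{2} N^{3\alpha}. \]
This comes from the fact that all the clusters intersecting the boundary of $B(x, N^\alpha)$ and distinct from $C^*$ have length smaller than $N^\gamma$ when $Y^\zeta_x = 1$. Thus the total magnetization produced by these clusters is less than $6 N^{2\alpha+\gamma}$ and does not contribute. By definition of the event $V^\zeta_x$, the unique crossing cluster $C^*$ in $B(x, N^\alpha)$ satisfies $\big|\, m_\beta N^{3\alpha} - |C^*| \,\big| \le \zeta N^{3\alpha}$. Therefore, we just need to prove large deviations for $P^\omega_{\Delta_N}\big(|\sum_{i=1}^{k_0} s_i c_i| > \tfrac{\zeta}{2} N^{3\alpha}\big)$, for configurations $\omega$ which satisfy $Y^\zeta_x(\omega) = 1$. By symmetry, it is enough to bound $P^\omega_{\Delta_N}\big(\sum_{i=1}^{k_0} s_i c_i > \tfrac{\zeta}{2} k_0\big)$, with $k_0 \ge \tfrac{\zeta}{2} N^{3(\alpha-\gamma)}$ (note that $k_0$ is always smaller than $N^{3\alpha}$). We follow the argument of [I2] (p. 325). For all $t$ positive, Chebyshev's inequality implies
\[ P^\omega_{\Delta_N}\Big( \sum_{i=1}^{k_0} s_i c_i > \frac{\zeta}{2} k_0 \Big) \le \exp\Big(-t\, \frac{\zeta}{2} k_0\Big) \prod_{i=1}^{k_0} P^\omega_{\Delta_N}\big(\exp(t c_i s_i)\big). \]
As each $c_i$ is smaller than $N^{3\gamma}$, one has
\[ \frac{1}{k_0} \log P^\omega_{\Delta_N}\Big( \sum_{i=1}^{k_0} s_i c_i > \frac{\zeta}{2} k_0 \Big) \le -t\, \frac{\zeta}{2} + \frac{1}{k_0} \sum_{i=1}^{k_0} \log\cosh(t c_i) \le -t\, \frac{\zeta}{2} + \log\cosh\big(t N^{3\gamma}\big). \]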
Wulff Construction in Three and More Dimensions

225

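Let $\Lambda^*$ be the Legendre-Laplace transform of the Bernoulli measure $\tfrac{1}{2}(\delta_1 + \delta_{-1})$. There exists $c_\zeta$ positive such that for $N$ large enough,
\[ \frac{1}{k_0} \log P^\omega_{\Delta_N}\Big( \sum_{i=1}^{k_0} s_i c_i > \frac{\zeta}{2} k_0 \Big) \le -\Lambda^*\Big(\frac{\zeta}{2 N^{3\gamma}}\Big) \le -\frac{c_\zeta}{N^{6\gamma}}. \]
Since $k_0 \ge \tfrac{\zeta}{2} N^{3(\alpha-\gamma)}$, this leads to
\[ P^\omega_{\Delta_N}\Big( \sum_{i=1}^{k_0} s_i c_i > \frac{\zeta}{2} k_0 \Big) \le \exp\Big(-\frac{\zeta}{2}\, c_\zeta\, N^{3\alpha - 9\gamma}\Big). \]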
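The rate function entering this bound can be evaluated explicitly. For the symmetric measure $\tfrac{1}{2}(\delta_1 + \delta_{-1})$ the log-Laplace transform is $\log\cosh t$, and its Legendre transform has the closed form $\tfrac{1}{2}\big[(1+x)\log(1+x) + (1-x)\log(1-x)\big]$ for $|x| < 1$, which is at least $x^2/2$. The sketch below checks these facts numerically; the function names are ours, and the quadratic lower bound is what yields a constant of the type $c_\zeta / N^{6\gamma}$ when $x = \zeta / (2N^{3\gamma})$.

```python
import math


def rate(x):
    """Legendre transform of t -> log cosh(t) at x, closed form for |x| < 1."""
    if not -1.0 < x < 1.0:
        raise ValueError("closed form used here requires |x| < 1")
    return 0.5 * ((1.0 + x) * math.log1p(x) + (1.0 - x) * math.log1p(-x))


def rate_via_optimizer(x):
    """Same quantity as t*x - log cosh(t) at the optimizer t = atanh(x)."""
    t = math.atanh(x)
    return t * x - math.log(math.cosh(t))


def log_chernoff_bound(k0, x):
    """log of the bound exp(-k0 * rate(x)) for k0 independent +/-1 signs."""
    return -k0 * rate(x)
```

For instance `rate(0.1)` is slightly above `0.1 ** 2 / 2`, and both ways of computing the rate agree to machine precision.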

As $\gamma$ belongs to $]0, \tfrac{1}{10}]$, we obtain the expected upper bound. □

Remark 8.1. Let $A$ be a subset of $\Omega^w_{\Delta_N}$ with support disjoint from the box $B(x, \tfrac{5}{4} N^\alpha)$. Then the following holds for $N$ large enough:
\[ \mu^+_{\beta,\Delta_N}\big(|Z^\zeta_x| = 0 \,\big|\, A\big) \le \exp(-c_\zeta N^\gamma). \]
This is straightforward from the previous arguments. From [Pi1], we know that $\Phi^{\omega,p_\beta}_{\Delta_N}(Y^\zeta_x = 0)$ vanishes exponentially fast for arbitrary boundary conditions $\omega$ outside the box $B(x, \tfrac{5}{4} N^\alpha)$. Furthermore, if the magnetization differs from its equilibrium values $\pm m_\beta$, the deviation occurs from the random coloring of small clusters independent of $A$.

8.3. Approximation. Before proving Theorems 5.1 and 5.2, let us recall some basic notions of geometric measure theory. Throughout this section, we fix $u$ in $BV(T, \{+1, -1\})$ such that $\mathcal{F}(u) < \infty$ and $\delta$ in $]0, 1]$. We recall that the perimeter of $\{u = -1\}$, which is $\int_{\partial^* u} d\mathcal{H}_x$, is also finite. The ball of radius $r$ centered in $y$ will be denoted by $B(y, r)$. For each $y$ in $\partial^* u$, we introduce the half-spaces
\[ H^+(y) = \{ z \in \mathbb{R}^3 \mid n_y \cdot (z - y) \ge 0 \}, \qquad H^-(y) = \{ z \in \mathbb{R}^3 \mid n_y \cdot (z - y) \le 0 \}, \]
where $n_y$ is the normal to $\partial^* u$ in $y$. Let $H(y)$ be the hyperplane $H^+(y) \cap H^-(y)$.

We fix $\zeta$ positive. According to Theorem 2 (p. 205) of [EG], the reduced boundary $\partial^* u$ equals $\bigcup_{i=1}^{n} K_i \cup N$, where the 2 dimensional Hausdorff measure of $N$ is less than $\zeta$ and each $K_i$ is a compact subset of a $C^1$-hypersurface $S_i$. For all $x$ in $K_i$, the normal $n_x$ is also normal to $S_i$ and there is $r_0 > 0$ such that uniformly on $K_i$,
\[ \forall i,\ \forall r \le r_0,\ \forall y \in K_i, \qquad \mathrm{vol}\big(B(y, r) \cap \{u = -1\} \cap H^+(y)\big) < \zeta r^3, \quad \mathrm{vol}\big(B(y, r) \cap \{u = +1\} \cap H^-(y)\big) < \zeta r^3. \qquad (8.3) \]
In the decomposition of $\partial^* u$, one can choose each set $K_i$ such that it is either included in $\partial T$ or at a positive distance from $\partial T$.

Proof of Theorem 5.1. We first approximate the compact sets $K_i$ which do not touch the boundary. The following construction is the same for each $S_i$ so it is enough to present it for one $S_i$, that we shall denote by $S$ (with corresponding $K \subset S$).


As $S$ is $C^1$, we can find $M$ pairwise disjoint closed subsets $\Sigma_1, \dots, \Sigma_M$ of $S$ which cover $S$ up to a set of measure less than $\zeta$ and such that each $\Sigma_i$ is congruent to the graph of a real function $f_i : U_i \to \mathbb{R}$ of class $C^1$, where $U_i$ is a bounded closed set of $\mathbb{R}^2$ and $f_i$ satisfies the bound $|\nabla f_i| \le \zeta$. To any $x$ in $U_i$, we associate the point $g_i(x) = (x, f_i(x))$ of $S$. Let $K_i$ be the compact subset of $U_i$ such that $g_i(K_i) = K \cap \Sigma_i$. We choose $h$ in $]0, \frac{r_0}{10}[$ (see (8.3)) arbitrarily small and cover $U_i$ with pairwise disjoint cubes $C^j \subset U_i$ of side $h$ up to a set of measure less than $\frac{\zeta}{M}$. For each cube $C^j$ centered at $x_j$ and intersecting $K_i$, we denote by $B^j$ the translate of $C^j$ centered at $g_i(x_j)$. The parallelepiped $R^j$ is defined as $R^{j,+} \cup R^{j,-}$, where both parallelepipeds $R^{j,+}$ and $R^{j,-}$ have a common face $B^j$ and height $2\delta h$ (one above and the other below $B^j$). Let $y$ be in $C^j \cap K_i$. The parallelepiped $R^j$ is included in the ball $B(g_i(y), 10h)$. As $|\nabla f_i| \le \zeta$, the intersection of the hyperplane $H(g_i(y))$ and $R^j$ is contained in $\{z \in \mathbb{R}^3 \mid \mathrm{dist}(z, B^j) \le 2\zeta h\}$. Therefore (8.3) implies that
\[ \int_{R^j} |\chi_{R^j}(r) - u(r)|\, dr \le 2\zeta h^3 + 10^3 \zeta h^3 \le 10^4 \frac{\zeta}{\delta}\, \mathrm{vol}(R^j). \tag{8.4} \]
The upper bound of $\tau$ is denoted by $\|\tau\|_\infty$. It remains to check that
\[ \left| \sum_{i=1}^{k} \int_{B^i} \tau(n_i)\, dH_x - \int_{K} \tau(n_x)\, dH_x \right| \le \|\tau\|_\infty C_K \zeta, \tag{8.5} \]

where $n_i$ is the normal to $B^i$ and $C_K$ depends on the Hausdorff measure of $K$. Let $C^i$ be the union of cubes $C^j$ which intersect $K_i$; then for $h$ small enough the measure of $C^i \Delta K_i$ (the symmetric difference) is smaller than $\frac{\zeta}{M}$ and one has
\[ \left| \int_{g_i(C^i)} \tau(n'_x)\, dH_x - \int_{g_i(K_i)} \tau(n_x)\, dH_x \right| \le \frac{\zeta}{M} \|\tau\|_\infty, \]
where $n'$ is the normal vector to the surface $S$ which coincides with $n$ on $K$. The normal $n'$ is uniformly continuous on any compact. Therefore for $h$ small enough, the following holds for any cube $C^j$ in $U_i$:
\[ \forall x, y \in C^j, \quad \left| \tau(n'_{g_i(x)}) - \tau(n'_{g_i(y)}) \right| \le \zeta. \]
Using the fact that $|\nabla f_i| \le \zeta$ on each $U_i$, we derive (8.5).

Let us go back to the previous notation and denote by $B^1, \dots, B^\ell$ the collection of sets which approximate the union of sets $K_i$ which are not in $\partial^* u$. We set also $U = \partial^* u \cap \partial T$. As $\tau$ is bounded,
\[ \int_{N} \tau(n_x)\, dH_x \le \|\tau\|_\infty \zeta. \]
We deduce from (8.5) that
\[ \left| \sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, dH_x - \int_{\partial^* u \setminus U} \tau(n_x)\, dH_x \right| \le C_u \zeta, \tag{8.6} \]
where $C_u$ depends only on the perimeter of $u$. From (8.4) and (8.6), we derive Theorem 5.1 for $\zeta$ small enough. □


Proof of Theorem 5.2. We are now going to approximate the compact sets $K_i$ included in $U = \partial T \cap \partial^* u$. One can also suppose that each $K_i$ is included in one face of $\partial T$. Notice that $H$-almost surely for $y$ in $K_i$, the normal $n_y$ is orthogonal to $\partial T$. For a given $\zeta$ positive and $h$ in $]0, \frac{r_0}{10}[$ small enough, there is a covering of $K_i$ with pairwise disjoint squares $B^j \subset \partial T$ of size $h$ up to a set of measure less than $\zeta$. We denote by $R^j$ the parallelepiped in $T$ with one face equal to $B^j$ and height $\delta h$. Let $y$ be in $B^j \cap K_i$. The parallelepiped $R^j$ is included in the ball $B(g_i(y), 10h)$ and (8.3) implies
\[ \int_{R^j} |1 + u(r)|\, dr \le 10^3 \zeta h^3 \le 10^3 \frac{\zeta}{\delta}\, \mathrm{vol}(R^j). \tag{8.7} \]
Furthermore, for $h$ small enough
\[ \left| \sum_{j=1}^{k} \int_{B^j} \tau(n_j)\, dH_x - \int_{U} \tau(n_x)\, dH_x \right| \le \|\tau\|_\infty \zeta, \tag{8.8} \]

where $n_j$ is the normal to $B^j$. Combining (8.7) and (8.8), we conclude the proof. □

One could have also modified the proof of Lemma 6.4 [Ce] and replaced the approximation in terms of balls by cubes.

Proof of Theorem 6.1. Theorem 6.1 can be viewed as a consequence of a general approximation procedure developed by Alberti and Bellettini [AlBe]. We briefly recall their proof and refer the reader to [AlBe] for details. Let $u$ be a function of bounded variation; then general results of measure theory imply the existence of a sequence $\{u_n\}$ of polyhedral functions converging to $u$ in $L^1(T)$ and such that the vector measures of the first partial derivatives $Du_n$ converge weakly to $Du$ and the perimeters of $\partial u_n$ converge to that of $\partial u$. Since $\tau$ is continuous, a theorem of Reshetnyak (see [LM]) implies that $F(u_n)$ converges to $F(u)$. Therefore for any $\delta$ positive, there exists a polyhedral set $W$ such that
\[ 1_W \in V\Big(1_{W_m}, \frac{\delta}{3}\Big) \quad \text{and} \quad |F(W) - F(W_m)| \le \frac{\delta}{2}. \]
For any $h$ small enough, we approximate the polyhedral set $W$ with disjoint cubes $R^1, \dots, R^\ell$ of size $h$ and basis $B^1, \dots, B^\ell$. The set $\partial W \setminus \bigcup_{i=1}^{\ell} B^i$ has arbitrarily small area and is denoted by $U^\delta$. As $\tau$ is bounded, one has
\[ \left| \sum_{i=1}^{\ell} \int_{B^i} \tau(n_i)\, dH_x - F(W) \right| \le \|\tau\|_\infty \int_{U^\delta} dH_x \le \|\tau\|_\infty \delta. \]
This concludes the theorem. □

Remark 8.2. Since we are only interested in approximating the Wulff shape, one could have also used Aleksandrov's Theorem (see [EG]) which ensures that the boundary of a convex function has almost surely a second derivative.

Remark 8.3. As explained previously, Theorem 6.1 holds for any function $u$ of bounded variation. Thus, following the arguments developed in Sect. 6, one can prove the lower bound (Proposition 2.2) for any function $u$ of bounded variation. This implies that a large deviation principle for the measures $\mu^{+}_{\beta,\Delta_N}$ holds.


Acknowledgements. I am deeply indebted to both D. Ioffe and E. Presutti who have been involved at every stage of this work. This paper would not have been possible without their invaluable suggestions, advice and support. I am also very grateful to A. Chambolle for explaining many results of geometric measure theory and providing advice and references. My gratitude also goes to G. Bellettini and F. Comets for our very useful discussions. I thank S. Shlosman and Y. Velenik for useful comments. Finally, I acknowledge the kind hospitality of the Dipartimento di Mathematica di Roma Tor Vergata, where part of this work was completed.

References

[AlBe] Alberti, G., Bellettini, G.: Asymptotic behavior of a non local anisotropic model for phase transition. J. Math. Ann. 310 No. 3, 527–560 (1998)
[ABCP] Alberti, G., Bellettini, G., Cassandro, M., Presutti, E.: Surface tension in Ising system with Kac potentials. J. Stat. Phys. 82, No. 3–4, 743–796 (1996)
[AmBr] Ambrosio, L., Braides, A.: Functionals defined on partitions in sets of finite perimeter II: Semicontinuity, relaxation and homogenization. J. Math. pures et appl. 69, 307–333 (1990)
[ACCN] Aizenman, M., Chayes, J., Chayes, L., Newman, C.: Discontinuity of the magnetization in the one-dimensional $1/|x-y|^2$ Ising and Potts model. J. Stat. Phys. 50, 1–40 (1988)
[BCP] Bellettini, G., Cassandro, M., Presutti, E.: Constrained minima of non local free energy functionals. J. Stat. Phys. 84, 1337–1349 (1996)
[BBBP] Benois, O., Bodineau, T., Butta, P., Presutti, E.: On the validity of van der Waals theory of surface tension. Mark. Proc. and Rel. Fields 3, No. 2, 175–198 (1997)
[BBP] Benois, O., Bodineau, T., Presutti, E.: Large deviations in the van der Waals limit. Stoch. Proc. and Appl. 75, 89–104 (1998)
[Bo] Bodineau, T.: Wulff construction for Ising model with finite range Kac potentials. In preparation
[BIV] Bodineau, T., Ioffe, D., Velenik, Y.: Rigorous probabilistic analysis of equilibrium crystal shapes. Preprint, April (1999)
[Ce] Cerf, R.: Large deviations for three dimensional supercritical percolation. Preprint, October (1998)
[DKS] Dobrushin, R.L., Kotecky, R., Shlosman, S.: Wulff construction: a global shape from a local interaction. Providence, RI: Am. Math. Soc., 1992
[DP] Deuschel, J. D., Pisztora, A.: Surface order large deviations for high-density percolation. Prob. Theo. Rel. Fields 104, 467–482 (1996)
[DS] Deuschel, J. D., Stroock, D.: Large deviations. London–New York: Academic Press, 1989
[EG] Evans, L., Gariepy, R.: Measure Theory and Fine Properties of Functions. London: CRC Press, 1992
[ES] Edwards, R., Sokal, A.: Generalization of the Fortuin–Kasteleyn–Swendsen–Wang representation and Monte Carlo algorithm. Phys. Rev. D 38 No. 6, 2009–2012 (1988)
[F] Fonseca, I.: The Wulff Theorem revisited. Proc. Roy. Soc. London; Ser. A, 432, No 1884, 125–145 (1991)
[FM] Fonseca, I., Mueller, S.: A uniqueness proof of the Wulff Theorem. Proc. Roy. Soc. Edinburgh; Sect A, 119, 125–136 (1991)
[Gri] Grimmett, G.: The stochastic random cluster process and the uniqueness of random cluster measures. Ann. Prob. 23 No. 4, 1461–1510 (1995)
[I1] Ioffe, D.: Large deviations for the 2D Ising model: A lower bound without cluster expansions. J. Stat. Phys. 74, 411–432 (1994)
[I2] Ioffe, D.: Exact deviation bounds up to Tc for the Ising model in two dimensions. Prob. Theo. Rel. Fields 102, 313–330 (1995)
[IS] Ioffe, D., Schonmann, R.: Dobrushin-Kotecky-Shlosman Theorem up to the critical temperature. Commun. Math. Phys. 199, 117–149 (1998)
[LM] Luckhaus, S., Modica, L.: The Gibbs-Thompson relation within the gradient theory of phase transitions. Arch. Rat. Mech. Anal. 107, 71–83 (1989)
[LP] Lebowitz, J., Pfister, C.E.: Surface tension and phase coexistence. Phys. Rev. Lett. 46 No. 15, 1031–1033 (1981)
[LSS] Liggett, T., Schonmann, R., Stacey, A.: Domination by product measures. Ann. Prob. 25 No. 1, 71–95 (1997)
[MMR] Messager, A., Miracle-Solé, S., Ruiz, J.: Convexity property of the surface tension and equilibrium crystals. J. Stat. Phys. 67 No. 3/4, 449–470 (1992)
[Pi1] Pisztora, A.: Surface order large deviations of Ising, Potts and percolation models. Prob. Theo. Rel. Fields 104, 427–466 (1996)
[Pi2] Pisztora, A.: Lectures in IHP. Unpublished, Paris, June (1998)
[Pf] Pfister, C.E.: Large deviations and phase separation in the two dimensional Ising model. Helvetica Physica Acta 64, 953–1054 (1991)
[PV] Pfister, C.E., Velenik, Y.: Large deviations and continuum limit in the 2D Ising model. Prob. Theo. Rel. Fields 109 No. 4, 435–506 (1997)
[Ta] Taylor, J.: Crystalline variational problems. Bull. Am. Math. Soc. 84 No. 4, 568–588 (1978)
[SS] Schonmann, R., Shlosman, S.: Complete analyticity for the 2D Ising model completed. Commun. Math. Phys. 170, 453–482 (1996)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 207, 231 – 247 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

An Inverse Scattering Problem with Part of the Fixed-Energy Phase Shifts

A. G. Ramm

Mathematics Department, Kansas State University, Manhattan, KS 66506, USA. E-mail: [email protected]

Received: 3 March 1999 / Accepted: 27 April 1999

Abstract: Assume that $q(r)$ is a real-valued, compactly supported potential, $q(r) = 0$ for $r := |x| > a$, $q(r) \in L^2(B_a)$, $n = 3$, $B_a := \{x : x \in \mathbb{R}^n, |x| < a\}$. Let $\mathcal{L}$ be an arbitrary fixed subset of non-negative integers such that $\sum_{\ell \neq 0,\, \ell \in \mathcal{L}} \frac{1}{\ell} = \infty$, and $\{\delta_\ell\}$ be the fixed-energy phase shifts corresponding to $q(r)$. The main result is: Theorem. The data $\{\delta_\ell\}_{\forall \ell \in \mathcal{L}}$ determine $q(r)$ uniquely.

1. Introduction

In this paper a uniqueness theorem is established for inverse scattering with fixed-energy data. This theorem says that a real-valued compactly supported spherically symmetric potential $q(r)$ is uniquely determined by a subset of the phase shifts $\delta_\ell$ at an arbitrary fixed positive energy. The subset $\{\delta_\ell\}_{\ell \in \mathcal{L}}$ is defined by an arbitrary subset $\mathcal{L}$ of the integers $\{0, 1, 2, \dots\}$ with the property
\[ \sum_{\ell \in \mathcal{L},\, \ell \neq 0} \frac{1}{\ell} = \infty. \tag{1.1} \]
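Condition (1.1) can be illustrated numerically. The following minimal sketch (the two index sets and the helper names are illustrative assumptions, not taken from the paper) compares partial sums of $1/\ell$ over the odd integers, which grow without bound, with partial sums over the powers of two, which stay below 2. Finite partial sums cannot prove divergence; they only suggest which sets are candidates for (1.1).

```python
def partial_sum(indices):
    """Partial sum of 1/l over the given indices, skipping l = 0 as in (1.1)."""
    return sum(1.0 / l for l in indices if l != 0)


def odd_numbers(limit):
    """Hypothetical index set that satisfies (1.1): the odd integers below limit."""
    return range(1, limit, 2)


def powers_of_two(limit):
    """Hypothetical index set that violates (1.1): the powers of two below limit."""
    result = []
    l = 1
    while l < limit:
        result.append(l)
        l *= 2
    return result


for limit in (10**2, 10**4, 10**6):
    # The first column keeps growing (roughly like 0.5 * ln(limit));
    # the second column stays bounded by 2.
    print(limit, partial_sum(odd_numbers(limit)), partial_sum(powers_of_two(limit)))
```

Under Theorem 2.1 of the paper, phase shifts indexed by the first kind of set would suffice for uniqueness, while the theorem says nothing about the second kind.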

No results of this type have been known or conjectured earlier. Condition (1.1) appears in Müntz's theorem: it is a necessary and sufficient condition for the completeness of the set $\{x^{\ell}\}_{\ell \in \mathcal{L}}$ in $L^1(0, a)$, for an arbitrary fixed $a > 0$. Such a result gives a much deeper understanding of the quantum-mechanical inverse scattering problem with fixed-energy data. It may also be of some practical significance, because in some physical experiments the phase shifts cannot be measured for all $\ell$, and it is important to know what part of the fixed-energy phase shifts is still sufficient for the unique identification of the potential. We now describe the basic ideas of the proof. Our proof is based on two fundamental results.


The first result is the uniqueness theorem from [R1] which says that the fixed-energy scattering data $A(\alpha', \alpha)$, $\forall \alpha', \alpha \in S^2$, determine $q(x) \in Q_a$ uniquely. Here $A(\alpha', \alpha)$ is the scattering amplitude corresponding to the potential $q(x) \in Q_a := \{q : q = \bar{q},\ q(x) = 0 \text{ for } |x| > a,\ q(x) \in L^2(B_a)\}$, where $B_a := \{x : |x| \le a,\ x \in \mathbb{R}^3\}$ and the bar stands for complex conjugate. Under this assumption one can check, using Hölder's inequality, that $\int_0^a r|q(r)|\,dr < \infty$. The proof of the above uniqueness theorem is valid in $\mathbb{R}^n$, $n \ge 3$, and is not valid for $n = 2$ (see [R2] for an explanation). An algorithm for inversion of noisy fixed-energy scattering data is developed in [R5] (see also [R2]), where a stability estimate for this algorithm is obtained. In [A] a discussion of the Newton-Sabatier method for inversion of fixed-energy phase shifts is given and an example is constructed of two quite different compactly supported piecewise-constant potentials which produce practically the same (within the accuracy $10^{-5}$) fixed-energy phase shifts for all values of the angular momenta. This example illustrates the stability estimate from [R5] and shows that the above estimate is sharp. Actually, property C for a pair of Schrödinger operators is used for a proof of the uniqueness theorem for the inverse scattering problem with fixed-energy data. This theorem follows from property C. The notion of property C has been introduced and applied to many inverse problems in a series of papers by the author (see [R2] and references therein, [R3]-[R4]). Let us formulate this notion for a pair $\{L_1, L_2\}$ of the Schrödinger operators $L_j = -\nabla^2 + q_j - k^2$, $j = 1, 2$, $q_j \in Q_a$, where $k \ge 0$ is a constant and $k^2$ is the energy of the particle. Let $D \subset \mathbb{R}^n$ be a bounded domain and $N_j := N_D(L_j) := \{w : L_j w = 0 \text{ in } D,\ w \in H^2(D)\}$. Let $f \in L^2(D)$.

Definition 1.1. If
\[ \int_D f(x) w_1(x) w_2(x)\,dx = 0 \quad \forall w_j \in N_j \quad \Rightarrow \quad f = 0, \tag{1.2} \]

then we say that the pair $\{L_1, L_2\}$ has property C. Let $n = 3$ and $S^2$ be the unit sphere in $\mathbb{R}^3$. Define the scattering solution corresponding to the operator $L = -\nabla^2 + q - k^2$, and a fixed $k > 0$, which we take, without loss of generality, to be $k = 1$ in this paper, as the solution to the equation:
\[ [\nabla^2 + 1 - q(x)]\psi(x, \alpha) = 0 \ \text{ in } \mathbb{R}^3, \tag{1.3} \]
\[ \psi(x, \alpha) = \exp(i x \cdot \alpha) + A(\alpha', \alpha)\frac{e^{ir}}{r} + o\Big(\frac{1}{r}\Big), \quad r := |x| \to \infty, \quad \alpha' := \frac{x}{r}. \tag{1.4} \]
The unit vector $\alpha$ is given, the coefficient $A(\alpha', \alpha)$ is called the scattering amplitude. It is well known that $q(x) \in Q_a$ determines $A(\alpha', \alpha)$ uniquely. It is proved in [R2] that
\[ \int_D f(x)\psi_1(x, \alpha)\psi_2(x, \beta)\,dx = 0 \quad \forall \alpha, \beta \in S^2 \quad \Rightarrow \quad f = 0, \tag{1.5} \]
where $\psi_j(x, \alpha)$, $j = 1, 2$, is the scattering solution corresponding to the operator $L_j = -\nabla^2 + q_j - k^2$, $j = 1, 2$, and a fixed $k = 1$.


The inverse scattering problem (ISP) with fixed-energy data consists in finding $q(x) \in Q_a$ given $A(\alpha', \alpha)$ $\forall \alpha', \alpha \in S^2$. Denote by $\tilde{S}^2$ an arbitrary small open subset of $S^2$.

Theorem 1.1 ([R1]). The data $A(\alpha', \alpha)$ $\forall \alpha' \in \tilde{S}_1^2$, $\forall \alpha \in \tilde{S}_2^2$, determine $q(x) \in Q_a$ uniquely.

Theorem 1.2 ([R2]). If $q_j \in Q_a$, $j = 1, 2$, then (1.2) holds.

Theorem 1.2 implies that the pair $\{L_1, L_2\}$ of Schrödinger operators with potentials $q_j \in Q_a$ does have property C. The second result we will use is the uniqueness theorem for analytic functions. Let us assume that $h(\ell)$ is a holomorphic function in $\Pi_+ := \{\ell : \mathrm{Re}\,\ell > 0\}$, $\ell = \sigma + i\tau$, $\sigma \ge 0$, and $\tau$ are real numbers, $h(\ell) \in N$ (Nevanlinna class in $\Pi_+$), that is
\[ \sup_{0 < r < 1} \int_{-\pi}^{\pi} \ln^+ \left| h\Big(\frac{1 - re^{i\varphi}}{1 + re^{i\varphi}}\Big) \right| d\varphi < \infty, \tag{1.6} \]
where $\ln^+ x = \ln x$ if $\ln x > 0$, and $\ln^+ x = 0$ if $\ln x \le 0$.

Theorem 1.3. If $h(\ell) \in N$ then
\[ h(\ell) = 0 \quad \forall \ell \in \mathcal{L}, \tag{1.7} \]
implies
\[ h(\ell) \equiv 0 \ \text{ in } \Pi_+, \tag{1.8} \]
in particular
\[ h(\ell) = 0 \quad \forall \ell = 0, 1, 2, \dots. \tag{1.9} \]
Theorem 1.3 is a consequence of Theorem 1.4 which is formulated below. This theorem in turn is an immediate corollary to Theorem 15.23 in [Ru, p. 334].

Theorem 1.4. Assume that the following conditions hold: (i) $f(z)$ is holomorphic in the unit disc $D_1$, (ii) $f(z) \in N$ in $D_1$, that is
\[ \sup_{0 < r < 1} \int_{-\pi}^{\pi} \ln^+ |f(re^{i\varphi})|\, d\varphi < \infty, \tag{1.10} \]
(iii) $f(z_n) = 0$, $n = 1, 2, 3, \dots$, and
\[ \sum_{n=1}^{\infty} (1 - |z_n|) = \infty. \tag{1.11} \]
Then
\[ f(z) \equiv 0 \ \text{ in } D_1. \tag{1.12} \]
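The way condition (1.11) is used in this paper can be probed numerically. The sketch below (function names are illustrative assumptions) implements the maps (1.13) and (1.14) between the unit disc and the right half-plane, and checks that a positive integer $\ell$ is sent to a point $w_\ell$ with $1 - |w_\ell| = 2/(\ell + 1)$, which is the identity behind (1.15).

```python
def disc_point(l):
    """Map (1.14): a point l of the right half-plane goes to w = (1 - l)/(1 + l)."""
    return (1 - l) / (1 + l)


def half_plane_point(w):
    """Map (1.13): a point w of the unit disc goes to l = (1 - w)/(1 + w)."""
    return (1 - w) / (1 + w)


def blaschke_partial_sum(indices):
    """Partial sum of 1 - |w_l| over the given indices, skipping l = 0."""
    return sum(1 - abs(disc_point(l)) for l in indices if l != 0)


# A point with positive real part lands strictly inside the unit disc,
# and the two maps invert each other.
sample = complex(0.7, -2.3)
print(abs(disc_point(sample)), half_plane_point(disc_point(sample)))

# For positive integers the distance to the unit circle is exactly 2/(l + 1).
for l in (1, 2, 5, 50):
    print(l, 1 - abs(disc_point(l)), 2 / (l + 1))
```

Since $2/(\ell+1)$ is comparable to $1/\ell$, the sum in (1.11) over the points $w_\ell$ diverges exactly when (1.1) holds.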


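Let us explain why Theorem 1.4 implies Theorem 1.3. Note that the function
\[ \ell = \frac{1 - w}{1 + w} \tag{1.13} \]
maps conformally $D_1$ onto $\Pi_+$, while
\[ w = \frac{1 - \ell}{1 + \ell} \tag{1.14} \]
is the inverse map $\Pi_+ \to D_1$. The function $h(\ell) = h\big(\frac{1 - w}{1 + w}\big) := f(w)$ is analytic in $D_1$, $f(w) \in N$ in $D_1$, and $f(w_\ell) = 0$, where $w_\ell = \frac{1 - \ell}{1 + \ell}$ and $h(\ell) = 0$. If $\ell \in \mathcal{L}$, then
\[ \sum_{\ell \in \mathcal{L},\, \ell \neq 0} (1 - |w_\ell|) = \sum_{\ell \in \mathcal{L},\, \ell \neq 0} \left( 1 - \left| \frac{1 - \ell}{1 + \ell} \right| \right) = \sum_{\ell \in \mathcal{L},\, \ell \neq 0} \frac{2}{\ell + 1} = \infty, \tag{1.15} \]
because of assumption (1.1). Thus, $f(w) = 0$ in $D_1$ by Theorem 1.4, and therefore $h(\ell) = 0$ in $\Pi_+$. Thus, Theorem 1.3 follows from Theorem 1.4.

In order to describe the ideas of our proof, we need to state some known facts from scattering theory. If $q(x) = q(r)$, $r = |x|$, that is, the potential is spherically symmetric, and if $k = 1$, then the scattering solution is:
\[ \psi(x, \alpha) = \sum_{\ell=0}^{\infty} 4\pi i^{\ell}\, \frac{\psi_\ell(r)}{r}\, Y_\ell(x') Y_\ell(\alpha), \quad x' := \frac{x}{r}, \tag{1.16} \]
where $\psi_\ell(r)$ satisfies Eqs. (1.17)–(1.19):
\[ \psi_\ell'' + \psi_\ell - \frac{\ell(\ell + 1)}{r^2}\psi_\ell - q(r)\psi_\ell = 0, \quad r > 0, \tag{1.17} \]
\[ \psi_\ell(r) = O(r^{\ell+1}) \ \text{ as } r \to 0, \tag{1.18} \]
\[ \psi_\ell(r) = e^{i\delta_\ell} \sin\Big(r - \frac{\ell\pi}{2} + \delta_\ell\Big) + o(1) \ \text{ as } r \to \infty. \tag{1.19} \]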

Here $\delta_\ell$ is called the fixed-energy phase shift corresponding to the angular momentum $\ell$. The functions $Y_\ell(\alpha) = Y_{\ell m}(\alpha)$, $-\ell \le m \le \ell$, in (1.16) are the spherical harmonics orthonormalized in $L^2(S^2)$. The summation in (1.16) and in (1.20) below includes the summation with respect to $m$, $-\ell \le m \le \ell$, which is not shown for brevity. The corresponding scattering amplitude for the spherically symmetric potential is of the form:
\[ A(\alpha', \alpha) = A(\alpha' \cdot \alpha) = \sum_{\ell=0}^{\infty} A_\ell Y_\ell(\alpha') Y_\ell(\alpha), \tag{1.20} \]
where
\[ A_\ell = 2\pi i (1 - e^{2i\delta_\ell}) = 4\pi e^{i\delta_\ell} \sin \delta_\ell. \tag{1.21} \]
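The two expressions for $A_\ell$ in (1.21) agree because $1 - e^{2i\delta} = -2i\,e^{i\delta}\sin\delta$. A minimal numerical check of this identity, with illustrative function names that are not part of the paper, might look as follows.

```python
import cmath
import math


def amplitude_exponential_form(delta):
    """First expression in (1.21): 2*pi*i*(1 - exp(2*i*delta))."""
    return 2j * math.pi * (1 - cmath.exp(2j * delta))


def amplitude_sine_form(delta):
    """Second expression in (1.21): 4*pi*exp(i*delta)*sin(delta)."""
    return 4 * math.pi * cmath.exp(1j * delta) * math.sin(delta)


for delta in (-3.0, -0.4, 0.0, 0.25, 1.3, 3.1):
    difference = amplitude_exponential_form(delta) - amplitude_sine_form(delta)
    print(delta, abs(difference))  # differences are at the level of rounding error
```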


Recall that we assume $k = 1$ throughout. Therefore, in the case of spherically symmetric potentials which we consider in this paper, there is a one-to-one correspondence between the scattering amplitude $A(\alpha', \alpha)$ and the set of numbers $\{A_\ell\}_{\ell = 0, 1, 2, \dots}$. The set $\{\delta_\ell\}_{\ell = 0, 1, 2, \dots}$, $-\pi \le \delta_\ell < \pi$, determines the set $\{A_\ell\}_{\ell = 0, 1, 2, \dots}$ uniquely, so there is a one-to-one correspondence between the scattering amplitude at a fixed energy and the set of all phase shifts with this energy. Using (1.16) and the orthonormality of the spherical harmonics, one obtains:
\[ 0 = \int_{B_a} p(r)\psi_1(x, \alpha)\psi_2(x, \beta)\,dx \iff 0 = \sum_{\ell=0}^{\infty} \int_0^a p(r)\psi_{1\ell}(r)\psi_{2\ell}(r)\,dr\; Y_\ell(\alpha) Y_\ell(\beta), \quad \forall \alpha, \beta \in S^2, \tag{1.22} \]
where $p(r) \in Q_a$ is an arbitrary function. Multiplying the second integral in (1.22) by $Y_\ell(\alpha)$, integrating over $S^2$ and using the orthonormality of the spherical harmonics, one gets that the first equality in (1.22) is equivalent to
\[ 0 = \int_0^a dr\, p(r)\psi_{1\ell}(r)\psi_{2\ell}(r) \quad \forall \ell = 0, 1, 2, \dots. \tag{1.23} \]

The regular solution $\varphi_\ell(r)$ to Eq. (1.17) is defined uniquely by its behavior near the origin:
\[ \varphi_\ell(r) = \frac{r^{\ell+1}}{(2\ell + 1)!!} + o(r^{\ell+1}), \quad r \to 0. \tag{1.24} \]
This solution is a real-valued function for $\ell = 0, 1, 2, \dots$ and $r > 0$. Its behavior at infinity is known:
\[ \varphi_\ell = |F_\ell| \sin\Big(r - \frac{\ell\pi}{2} + \delta_\ell\Big) + o(1), \quad r \to +\infty, \tag{1.25} \]
where $\delta_\ell$ is the same as in (1.19) and $|F_\ell|$ is a certain positive constant (the value of the Jost function $F_\ell(k)$ at $k = 1$). Since $\psi_\ell(r)$ solves (1.17) and satisfies (1.18), it follows that
\[ \psi_\ell = c_\ell \varphi_\ell(r), \quad c_\ell = \text{const}. \tag{1.26} \]
Therefore condition (1.23) is equivalent to
\[ 0 = \int_0^a dr\, p(r)\varphi_{1\ell}(r)\varphi_{2\ell}(r) := h(\ell), \quad \ell = 0, 1, 2, \dots, \tag{1.27} \]
where we have used the real-valuedness of $\varphi_\ell(r)$. Let us now describe the idea of our proof.

Step 1. Assuming that the data $\{\delta_\ell\}_{\ell \in \mathcal{L}}$ correspond to two different potentials $q_1(r)$ and $q_2(r)$, $q_j(r) \in Q_a$, we derive the following orthogonality relation:
\[ h(\ell) = 0 \quad \forall \ell \in \mathcal{L}, \qquad p(r) := q_1(r) - q_2(r). \tag{1.28} \]


Step 2. We prove that the function $h_1(\ell)$, defined below, in formula (3.65'), is holomorphic in $\Pi_+ := \{\ell : \ell \in \mathbb{C},\ \mathrm{Re}\,\ell > 0\}$ and belongs to class $N$ defined in (1.6). Condition (1.28) implies $h_1(\ell) = 0$ $\forall \ell \in \mathcal{L}$. This implies by Theorem 1.3 that $h_1(\ell) = 0$ $\forall \ell \in \Pi_+$. Therefore, condition (1.28) implies $h(\ell) = 0$, $\ell = 0, 1, 2, \dots$, that is, (1.27) holds. This implies, as we have proved above, that
\[ 0 = \int_{B_a} p(r)\psi_1(x, \alpha)\psi_2(x, \beta)\,dx \quad \forall \alpha, \beta \in S^2. \tag{1.29} \]
Equation (1.29) and property C for the pair $\{L_1, L_2\}$ of the Schrödinger operators $L_j = -\nabla^2 + q_j(x) - 1$, $q_j(x) \in Q_a$, imply $p(r) = 0$, that is, $q_1(r) = q_2(r)$. □

An essential ingredient of our proof is the implication $\{h_1(\ell) = 0\ \forall \ell \in \mathcal{L}\} \Rightarrow \{h_1(\ell) = 0\ \forall \ell = 0, 1, 2, 3, \dots\}$. The proof of this implication is based on the existence of the transformation operators whose kernel does not depend on $\ell$. Existence and uniqueness of such operators as well as the estimate (3.53) (see Sect. 3 below), which we use in the proof of the above implication, are established in Sect. 3. These results, although new and of independent interest, play an auxiliary role in our proof. They are presented in Sects. 3.3 and 3.4 as a part of the proof. The description of the idea of our proof is complete.

In Sect. 3 we derive the orthogonality relation (1.28). In the same section we study the analytic properties of $h(\ell)$ as a function of complex $\ell$. In this study there are two basic steps. First, we study the function
\[ h_0(\ell) := \int_0^a dr\, p(r) u_\ell^2(r), \qquad u_\ell(r) := \sqrt{\frac{\pi r}{2}}\, J_{\ell + \frac{1}{2}}(r), \tag{1.30} \]
where $J_{\ell + \frac{1}{2}}(r)$ is the standard Bessel function. Note that $\varphi_\ell(r) = u_\ell(r)$ if $q(r) = 0$. Define
\[ H(\ell) := h_0(\ell) \left[ \sqrt{\frac{2}{\pi}}\, \Gamma\Big(\frac{1}{2}\Big) 2^{\ell + \frac{1}{2}}\, \Gamma(\ell + 1) \right]^2, \tag{1.31} \]
where $\Gamma(z)$ is the Gamma-function. We prove:

Lemma 1.1. The function $H(\ell)$ is holomorphic in $\Pi_+$ and $H(\ell) \in N$.

Secondly, we prove the existence of the transformation operator, which sends $u_\ell(r)$ into $\varphi_\ell(r)$:
\[ \varphi_\ell(r) = u_\ell(r) + \int_0^r K(r, \rho) u_\ell(\rho) \rho^{-2}\, d\rho, \qquad K(r, 0) = 0. \tag{1.32} \]
It is crucial for our argument that $K(r, \rho)$ does not depend on $\ell$. Therefore, the analytic properties of $h_1(\ell) := h(\ell) \left[ \Gamma\big(\frac{1}{2}\big) \Gamma(\ell + 1) 2^{\ell + \frac{1}{2}} \sqrt{\frac{2}{\pi}} \right]^2$ and $H(\ell)$, as functions of $\ell$, are essentially the same: these two functions are both holomorphic in $\Pi_+$ and belong to class $N$. In Sect. 3 we prove some technical estimates for the kernel $K(r, \rho)$ of the transformation operator. Although the transformation operators of the type (1.32) appeared formally earlier in the physical literature [CS, p.185], their existence was not proved. In the literature there exists a construction of the transformation operators whose kernels depend on $\ell$, see [V, Le, M]. The difficulty of the existence proof for the transformation operator, whose kernel $K(r, \rho)$ does not depend on $\ell$, comes from the fact that the Goursat-type problem which one can derive for $K(r, \rho)$ involves differential operators with variable coefficients which degenerate at the origin. We overcome this difficulty by introducing new variables and reducing the problem to an equivalent Volterra-type integral equation. Existence and uniqueness of the solution to this equation are established in Sect. 3, where some estimates of the solution are given. This concludes the introduction. In Sect. 2 we state the basic uniqueness result. In Sect. 3 proofs are given.

2. Statement of the Basic Result

Let us assume that
\[ q(x) \in Q := \{q : q \in Q_a,\ q(x) = q(r),\ r := |x|\}, \tag{2.1} \]
and let $\delta_\ell$ denote the fixed-energy phase shifts. Note that if $q \in Q$, then $\int_0^a r|q(r)|\,dr < c < \infty$. Here and below $c > 0$ stands for various estimation constants. The inverse scattering problem we are interested in can now be formulated:

ISP. Given the data $\{\delta_\ell\}_{\forall \ell \in \mathcal{L}}$, where $\mathcal{L}$ satisfies condition (1.1), can one recover $q(r) \in Q$ uniquely?

Our basic result is:

Theorem 2.1. Let $\mathcal{L}$ be an arbitrary fixed subset of non-negative integers which satisfies condition (1.1). Then the data $\{\delta_\ell\}_{\forall \ell \in \mathcal{L}}$, corresponding to a $q(r) \in Q$, determine $q(r)$ uniquely.

This result implies, in particular, that there is no $q(r) \in Q$, $q(r) \not\equiv 0$, such that $\delta_{2\ell} = 0$ $\forall \ell = 0, 1, 2, \dots$. It also implies that there is no $q(r) \in Q$ such that $\delta_0 \neq 0$, $\delta_\ell = 0$, $\ell = 1, 2, 3, \dots$, which means that there are no potentials in $Q$ producing the scattering amplitude $A(\alpha, \alpha')$ which is constant for all $\alpha, \alpha' \in S^2$ at a fixed positive energy; see also [R6], where this has been proved by a different argument for the first time.

3. Proofs

3.1. Proof of the orthogonality relation (1.28). Suppose $q_1(r)$ and $q_2(r)$ generate the same data $\{\delta_\ell\}_{\forall \ell \in \mathcal{L}}$. Subtract from Eq. (1.17) with $q = q_1$ and $\psi_\ell = \psi_{1\ell}$ the similar equation with $q = q_2$ and $\psi_\ell = \psi_{2\ell}$ to get:
\[ \psi_\ell'' + \psi_\ell - \frac{\ell(\ell + 1)}{r^2}\psi_\ell - q_1 \psi_\ell = p(r)\psi_{2\ell}, \tag{3.1} \]


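where
\[ \psi_\ell := \psi_{1\ell} - \psi_{2\ell}, \qquad p(r) := q_1(r) - q_2(r). \tag{3.2} \]
Multiply (3.1) by $\psi_{1\ell}(r)$, integrate over $(0, \infty)$ and then by parts using (1.18) and (1.19) and the assumption that $\delta_\ell$ is the same for $\psi_{1\ell}$ and $\psi_{2\ell}$ for $\ell \in \mathcal{L}$. The result is
\[ 0 = \int_0^a p(r)\psi_{1\ell}(r)\psi_{2\ell}(r)\,dr \quad \forall \ell \in \mathcal{L}. \tag{3.3} \]
This is equivalent to the desired relation (1.28) because of (1.26). □

3.2. Analytic properties of the function $H(\ell)$. Proof of Lemma 1.1. Recall the well-known formula [GR, 8.411.8]:
\[ \sqrt{\frac{2}{\pi}}\, \Gamma\Big(\frac{1}{2}\Big) 2^{\ell + \frac{1}{2}}\, \Gamma(\ell + 1)\, u_\ell(r) = r^{\ell+1} \int_{-1}^{1} (1 - t^2)^{\ell} e^{irt}\, dt. \tag{3.4} \]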
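Formula (3.4) can be checked numerically for small $\ell$. The sketch below assumes the standard closed forms $u_0(r) = \sin r$ and $u_1(r) = \frac{\sin r}{r} - \cos r$ for $u_\ell(r) = \sqrt{\pi r/2}\,J_{\ell+1/2}(r)$, and evaluates the right-hand side with a composite Simpson rule; the function names and the quadrature choice are illustrative assumptions.

```python
import cmath
import math


def closed_form_u(l, r):
    """u_l(r) = sqrt(pi*r/2) * J_{l+1/2}(r), using closed forms for l = 0 and l = 1 only."""
    if l == 0:
        return math.sin(r)
    if l == 1:
        return math.sin(r) / r - math.cos(r)
    raise ValueError("closed form provided only for l = 0 and l = 1")


def left_side(l, r):
    """Left-hand side of (3.4)."""
    constant = math.sqrt(2 / math.pi) * math.gamma(0.5) * 2 ** (l + 0.5) * math.gamma(l + 1)
    return constant * closed_form_u(l, r)


def right_side(l, r, panels=4000):
    """Right-hand side of (3.4) via composite Simpson quadrature on [-1, 1]."""
    step = 2.0 / panels
    total = 0j
    for k in range(panels + 1):
        t = -1.0 + k * step
        if k == 0 or k == panels:
            weight = 1
        elif k % 2 == 1:
            weight = 4
        else:
            weight = 2
        total += weight * (1 - t * t) ** l * cmath.exp(1j * r * t)
    return r ** (l + 1) * total * step / 3


for l in (0, 1):
    for r in (0.5, 2.0, 7.0):
        print(l, r, left_side(l, r), right_side(l, r).real)
```

The imaginary part of the quadrature vanishes up to rounding, because the integrand's odd part integrates to zero over the symmetric interval.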

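From (1.30), (1.31), and (3.4) one gets:
\[ H(\ell) = \int_0^a dr\, p(r)\, r^{2\ell+2} \left( \int_{-1}^{1} (1 - t^2)^{\ell} e^{irt}\, dt \right)^2. \tag{3.5} \]
Let $\ell = \sigma + i\tau$, $\sigma \ge 0$. Then $H(\ell)$ is a holomorphic function of $\ell$ for $\sigma > 0$ and
\[ |H(\ell)| \le \int_0^a dr\, |p(r)|\, r\, a^{2\sigma+1} \le c\, a^{2\sigma}. \tag{3.6} \]
One can always assume $a > 1$ since $\sigma > 0$. Let us check that (3.6) implies that $H(\ell) \in N$. One has
\[ \ln^+(ab) \le \ln^+ a + \ln^+ b \quad \text{for } a, b > 0. \]
Therefore, using (3.6), one obtains:
\[
\begin{aligned}
\int_{-\pi}^{\pi} \ln^+ \left| H\Big(\frac{1 - re^{i\varphi}}{1 + re^{i\varphi}}\Big) \right| d\varphi
&\le \int_{-\pi}^{\pi} \ln^+ \left| c\, a^{2\,\mathrm{Re}\,\frac{1 - re^{i\varphi}}{1 + re^{i\varphi}}} \right| d\varphi \\
&\le c_1 + 2 \ln a \int_{-\pi}^{\pi} \mathrm{Re}\, \frac{1 - re^{i\varphi}}{1 + re^{i\varphi}}\, d\varphi \\
&\le c_1 + 2 \ln a \int_{-\pi}^{\pi} \frac{1 - r^2}{1 + r^2 + 2r\cos\varphi}\, d\varphi \\
&= c_1 + 4\pi \ln a < \infty, \quad a > 1.
\end{aligned}
\tag{3.7}
\]
Here we have used the known formula
\[ \int_{-\pi}^{\pi} \frac{d\varphi}{1 + r^2 + 2r\cos\varphi} = \frac{2\pi}{1 - r^2}, \quad 0 < r < 1, \tag{3.8} \]
which is easy to check. Estimate (3.7) proves that $H(\ell) \in N$. Lemma 1.1 is proved. □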
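Formula (3.8) and the identity $\mathrm{Re}\,\frac{1 - re^{i\varphi}}{1 + re^{i\varphi}} = \frac{1 - r^2}{1 + r^2 + 2r\cos\varphi}$ used in (3.7) can both be confirmed numerically. The following sketch uses a midpoint rule over one full period; the function names and the choice of quadrature are assumptions made for illustration.

```python
import math


def real_part_of_map(r, phi):
    """Real part of (1 - z)/(1 + z) at z = r*exp(i*phi)."""
    z = complex(r * math.cos(phi), r * math.sin(phi))
    return ((1 - z) / (1 + z)).real


def poisson_type_kernel(r, phi):
    """The integrand appearing in the third line of (3.7)."""
    return (1 - r * r) / (1 + r * r + 2 * r * math.cos(phi))


def integral_in_3_8(r, panels=20000):
    """Midpoint-rule approximation of the integral in (3.8) over [-pi, pi]."""
    step = 2 * math.pi / panels
    return sum(
        step / (1 + r * r + 2 * r * math.cos(-math.pi + (k + 0.5) * step))
        for k in range(panels)
    )


for r in (0.1, 0.5, 0.9):
    print(r, integral_in_3_8(r), 2 * math.pi / (1 - r * r))
```

Multiplying (3.8) by $1 - r^2$ gives $2\pi$ for every $0 < r < 1$, which is why the bound in (3.7) does not depend on $r$.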


3.3. Transformation operators. Define
\[ L_r \varphi := \left[ r^2 \frac{\partial^2}{\partial r^2} + r^2 - r^2 q(r) \right] \varphi := L_{0r}\varphi - r^2 q(r)\varphi. \tag{3.9} \]
For the regular solution to (1.17) one has the following differential equation:
\[ L_r \varphi_\ell(r) = \ell(\ell + 1)\varphi_\ell(r), \tag{3.10} \]
and for the function $u_\ell(r) = \sqrt{\frac{\pi r}{2}}\, J_{\ell + \frac{1}{2}}(r)$ the equation
\[ L_{0r} u_\ell(r) = \ell(\ell + 1) u_\ell(r). \tag{3.11} \]
Let us look for the kernel $K(r, \rho)$ such that formula (1.32) gives the regular solution to Eq. (1.17). Substitute (1.32) into (1.17), drop the index $\ell$ for convenience, use (3.10) and (3.11), and get
\[
\begin{aligned}
0 = {} & -r^2 q(r) u + (r^2 - r^2 q(r)) \int_0^r K(r, \rho) u \rho^{-2}\, d\rho \\
& - \int_0^r K(r, \rho) \rho^{-2} L_{0\rho} u\, d\rho + r^2 \partial_r^2 \int_0^r K(r, \rho) u \rho^{-2}\, d\rho.
\end{aligned}
\tag{3.12}
\]
We assume first that $K(r, \rho)$ is twice continuously differentiable with respect to its variables in the region $0 < r < \infty$, $0 < \rho \le r$. This assumption requires extra smoothness of $q(r)$, $q(r) \in C^1(0, a)$. If $q(r)$ satisfies condition (2.1), then Eq. (3.18) below has to be understood in the sense of distributions. Eventually we will work with an integral Eq. (3.45) (see below) for which assumption (2.1) suffices. Note that
\[ \int_0^r K(r, \rho) \rho^{-2} L_{0\rho} u\, d\rho = \int_0^r L_{0\rho} K(r, \rho)\, u \rho^{-2}\, d\rho + K(r, r) u_r - K_\rho(r, r) u, \tag{3.13} \]
provided that
\[ K(r, 0) = 0. \tag{3.14} \]
We assume (3.14) to be valid. Denote
\[ \dot{K} := \frac{dK(r, r)}{dr}. \tag{3.15} \]
Then
\[ r^2 \partial_r^2 \int_0^r K(r, \rho) u \rho^{-2}\, d\rho = \dot{K} u + K(r, r) u_r - \frac{2}{r} K(r, r) u + K_r(r, r) u + r^2 \int_0^r K_{rr}(r, \rho) u \rho^{-2}\, d\rho. \tag{3.16} \]
Combining (3.12)–(3.16) and writing again $u_\ell$ in place of $u$, one gets
\[
\begin{aligned}
0 = {} & \int_0^r [L_r K(r, \rho) - L_{0\rho} K(r, \rho)]\, u_\ell(\rho) \rho^{-2}\, d\rho \\
& + u_\ell(r) \left[ -r^2 q(r) + \dot{K} - \frac{2K(r, r)}{r} + K_r(r, r) + K_\rho(r, r) \right], \quad \forall r > 0, \ \ell = 0, 1, 2, \dots.
\end{aligned}
\tag{3.17}
\]


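Let us prove that (3.17) implies:
\[ L_r K(r, \rho) = L_{0\rho} K(r, \rho), \quad 0 < \rho \le r, \tag{3.18} \]
\[ q(r) = \frac{2\dot{K}}{r^2} - \frac{2K(r, r)}{r^3} = \frac{2}{r} \frac{d}{dr} \frac{K(r, r)}{r}. \tag{3.19} \]
This proof requires a lemma.

Lemma 3.1. Assume that $\rho f(\rho) \in L^1(0, r)$ and $\rho A(\rho) \in L^1(0, r)$. If
\[ 0 = \int_0^r f(\rho) u_\ell(\rho)\, d\rho + u_\ell(r) A(r) \quad \forall \ell = 0, 1, 2, \dots, \tag{3.20} \]
then
\[ f(\rho) \equiv 0 \ \text{ and } \ A(r) = 0. \tag{3.21} \]
Proof. Equations (3.20) and (3.4) imply:
\[ 0 = \int_{-1}^{1} dt\, (1 - t^2)^{\ell} \Big(\frac{d}{i\,dt}\Big)^{\ell} \int_0^r d\rho\, \rho f(\rho) e^{i\rho t} + r A(r) \int_{-1}^{1} (1 - t^2)^{\ell} \Big(\frac{d}{i\,dt}\Big)^{\ell} e^{irt}\, dt. \]
Therefore
\[ 0 = \int_{-1}^{1} dt\, \frac{d^{\ell} (t^2 - 1)^{\ell}}{dt^{\ell}} \left[ \int_0^r d\rho\, \rho f(\rho) e^{i\rho t} + r A(r) e^{irt} \right], \quad \ell = 0, 1, 2, \dots. \tag{3.22} \]
Recall that the Legendre polynomials are defined by the formula
\[ P_\ell(t) = \frac{1}{2^{\ell} \ell!} \frac{d^{\ell}}{dt^{\ell}} (t^2 - 1)^{\ell} \tag{3.23} \]
and they form a complete system in $L^2(-1, 1)$. Therefore (3.22) implies
\[ \int_0^r d\rho\, \rho f(\rho) e^{i\rho t} + r A(r) e^{irt} = 0 \quad \forall t \in [-1, 1]. \tag{3.24} \]
Equation (3.24) implies
\[ \int_0^r d\rho\, \rho f(\rho) e^{i\rho t} = 0, \quad \forall t \in [-1, 1], \tag{3.25} \]
and
\[ r A(r) = 0. \tag{3.26} \]
Therefore $A(r) = 0$. Also $f(\rho) = 0$ because the left-hand side of (3.25) is an entire function of $t$, which vanishes on the interval $[-1, 1]$ and, consequently, it vanishes identically, so that $\rho f(\rho) = 0$ and therefore $f(\rho) \equiv 0$. Lemma 3.1 is proved. □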
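The Rodrigues formula (3.23) used in the proof can be verified with exact rational arithmetic. The sketch below represents a polynomial by its list of coefficients (index equals power of $t$) and compares the Rodrigues expression with the polynomials produced by Bonnet's recurrence $(n+1)P_{n+1} = (2n+1)tP_n - nP_{n-1}$; the helper names and the choice of the recurrence as a cross-check are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def poly_multiply(a, b):
    """Product of two polynomials given as coefficient lists (index = power)."""
    product = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            product[i + j] += x * y
    return product


def poly_derivative(a):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre_by_rodrigues(l):
    """P_l(t) computed from formula (3.23)."""
    poly = [Fraction(1)]
    for _ in range(l):
        poly = poly_multiply(poly, [Fraction(-1), Fraction(0), Fraction(1)])  # times (t^2 - 1)
    for _ in range(l):
        poly = poly_derivative(poly)
    scale = Fraction(1, 2 ** l * factorial(l))
    return [scale * c for c in poly]


def legendre_by_recurrence(l):
    """P_l(t) computed from Bonnet's recurrence, used here as an independent check."""
    previous, current = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return previous
    for n in range(1, l):
        shifted = [Fraction(0)] + current  # coefficients of t * P_n
        nxt = [(2 * n + 1) * c for c in shifted]
        for k, c in enumerate(previous):
            nxt[k] -= n * c
        previous, current = current, [c / (n + 1) for c in nxt]
    return current


for l in range(5):
    print(l, legendre_by_rodrigues(l))
```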


3.4. Existence and uniqueness of the transformation operators. Let us prove that the problem (3.18), (3.19), (3.14), which is a Goursat-type problem, has a solution and this solution is unique in the class of functions $K(r, \rho)$, which are twice continuously differentiable with respect to $\rho$ and $r$, $0 < r < \infty$, $0 < \rho \le r$. In this section we assume that $q(r) \in C^1(0, a)$. This assumption implies that $K(r, \rho)$ is twice continuously differentiable. If $q(r) \in Q$, see (2.1), the arguments in this section which deal with the integral equation (3.45) remain valid. Specifically, existence and uniqueness of the solution to Eq. (3.45) is proved under the only assumption $\int_0^a r|q(r)|\,dr < \infty$ as far as the smoothness of $q(r)$ is concerned. By a limiting argument one can reduce the smoothness requirements on $q$ to the condition (2.1), but in this case Eq. (3.18) has to be understood in the distributional sense. Let us rewrite the problem we want to study:
\[ r^2 K_{rr} - \rho^2 K_{\rho\rho} + [r^2 - r^2 q(r) - \rho^2] K(r, \rho) = 0, \quad 0 < \rho \le r, \tag{3.27} \]
\[ K(r, r) = \frac{r}{2} \int_0^r s q(s)\, ds := g(r), \tag{3.28} \]
\[ K(r, 0) = 0. \tag{3.29} \]
The difficulty in the study of this Goursat-type problem comes from the fact that the coefficients in front of the second derivatives of the kernel $K(r, \rho)$ are variable. Let us reduce problem (3.27)–(3.29) to the one with constant coefficients. To do this, introduce the new variables:
\[ \xi = \ln r + \ln \rho, \qquad \eta = \ln r - \ln \rho. \tag{3.30} \]
Note that
\[ r = e^{\frac{\xi + \eta}{2}}, \qquad \rho = e^{\frac{\xi - \eta}{2}}, \tag{3.31} \]
\[ \eta \ge 0, \qquad -\infty < \xi < \infty, \tag{3.32} \]
and
\[ \partial_r = \frac{1}{r}(\partial_\xi + \partial_\eta), \qquad \partial_\rho = \frac{1}{\rho}(\partial_\xi - \partial_\eta). \tag{3.33} \]
Let $K(r, \rho) := B(\xi, \eta)$.
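The change of variables (3.30)–(3.31) and the derivative rules (3.33) can be sanity-checked with finite differences. The sketch below uses an arbitrary smooth test kernel, which is a hypothetical stand-in and not the kernel of the transformation operator, and compares $\partial_r K$ and $\partial_\rho K$ with the right-hand sides of (3.33).

```python
import math


def to_log_coordinates(r, rho):
    """Variables (3.30): xi = ln r + ln rho, eta = ln r - ln rho."""
    return math.log(r) + math.log(rho), math.log(r) - math.log(rho)


def from_log_coordinates(xi, eta):
    """Inverse change of variables (3.31)."""
    return math.exp((xi + eta) / 2), math.exp((xi - eta) / 2)


def sample_kernel(r, rho):
    """Hypothetical smooth test function standing in for K(r, rho)."""
    return r * r * math.sin(rho) + rho ** 3 / r


def transformed_kernel(xi, eta):
    """B(xi, eta) := K(r, rho) with (r, rho) given by (3.31)."""
    r, rho = from_log_coordinates(xi, eta)
    return sample_kernel(r, rho)


def central_difference(func, x, y, axis, step=1e-5):
    """Central finite difference of func(x, y) in the chosen argument."""
    if axis == 0:
        return (func(x + step, y) - func(x - step, y)) / (2 * step)
    return (func(x, y + step) - func(x, y - step)) / (2 * step)


def chain_rule_residuals(r, rho):
    """Differences between both sides of the two identities in (3.33)."""
    xi, eta = to_log_coordinates(r, rho)
    b_xi = central_difference(transformed_kernel, xi, eta, 0)
    b_eta = central_difference(transformed_kernel, xi, eta, 1)
    k_r = central_difference(sample_kernel, r, rho, 0)
    k_rho = central_difference(sample_kernel, r, rho, 1)
    return k_r - (b_xi + b_eta) / r, k_rho - (b_xi - b_eta) / rho


print(chain_rule_residuals(1.7, 0.6))  # both residuals are at the level of discretization error
```

Note that $\rho = r$ corresponds to $\eta = 0$ and $\rho \to 0$ with $\eta$ fixed corresponds to $\xi \to -\infty$, which is how the boundary conditions are transported below.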

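A routine calculation transforms Eqs. (3.27)–(3.29) to the following ones:
\[ B_{\xi\eta}(\xi, \eta) - \frac{1}{2} B_\eta(\xi, \eta) + Q(\xi, \eta) B = 0, \quad \eta \ge 0, \ -\infty < \xi < \infty, \tag{3.34} \]
\[ B(\xi, 0) = g\big(e^{\frac{\xi}{2}}\big) := G(\xi), \quad -\infty < \xi < \infty, \tag{3.35} \]
\[ B(-\infty, \eta) = 0, \quad \eta \ge 0, \tag{3.36} \]
where $g(r)$ is defined in (3.28). Here we have defined
\[ Q(\xi, \eta) := \frac{1}{4} \left[ e^{\xi + \eta} - e^{\xi + \eta} q\big(e^{\frac{\xi + \eta}{2}}\big) - e^{\xi - \eta} \right], \tag{3.37} \]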


and took into account that $\rho = r$ implies $\eta = 0$, while $\rho = 0$ implies, for any fixed $\eta \ge 0$, that $\xi = -\infty$. Note that
\[ \sup_{-\infty < \xi < \infty} e^{-\frac{\xi}{2}} G(\xi) < c, \tag{3.38} \]
\[ \sup_{0 \le \eta \le B} \int_{-\infty}^{A} |Q(s, \eta)|\, ds \le c(A, B), \tag{3.39} \]
for any $A \in \mathbb{R}$ and $B > 0$, where $c(A, B) > 0$ is a constant. To get rid of the second term on the left-hand side of (3.34), let us introduce the new kernel $L(\xi, \eta)$ by the formula:
\[ L(\xi, \eta) := B(\xi, \eta) e^{-\frac{\xi}{2}}. \tag{3.40} \]
Then (3.34)–(3.36) can be written as:
\[ L_{\eta\xi}(\xi, \eta) + Q(\xi, \eta) L(\xi, \eta) = 0, \quad \eta \ge 0, \ -\infty < \xi < \infty, \tag{3.41} \]
\[ L(\xi, 0) = e^{-\frac{\xi}{2}} G(\xi) := b(\xi) := \frac{1}{2} \int_0^{e^{\xi/2}} s q(s)\, ds, \quad -\infty < \xi < \infty, \tag{3.42} \]
\[ L(-\infty, \eta) = 0, \quad \eta \ge 0. \tag{3.43} \]
We want to prove existence and uniqueness of the solution to (3.41)–(3.43). In order to choose a convenient Banach space in which to work, let us transform problem (3.41)–(3.43) to an equivalent Volterra-type integral equation. Integrate (3.41) with respect to $\eta$ from $0$ to $\eta$ and use (3.42) to get
\[ L_\xi(\xi, \eta) - b'(\xi) + \int_0^{\eta} Q(\xi, t) L(\xi, t)\, dt = 0. \tag{3.44} \]
Integrate (3.44) with respect to $\xi$ from $-\infty$ to $\xi$ and use (3.43) to get
\[ L(\xi, \eta) = -\int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, Q(s, t) L(s, t) + b(\xi) := VL + b, \tag{3.45} \]
where
\[ VL := -\int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, Q(s, t) L(s, t). \tag{3.46} \]
Consider the space $X$ of continuous functions $L(\xi, \eta)$, defined in the half-plane $\eta \ge 0$, $-\infty < \xi < \infty$, such that for any $B > 0$ and any $-\infty < A < \infty$ one has
\[ \|L\| := \|L\|_{AB} := \sup_{-\infty < s \le A,\ 0 \le t \le B} e^{-\gamma t} |L(s, t)| < \infty, \tag{3.47} \]


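where $\gamma > 0$ is a number which will be chosen later so that the operator $V$ in (3.45) will be a contraction mapping on the Banach space of functions with norm (3.47) for a fixed pair $A, B$. To choose $\gamma > 0$, let us estimate the norm of $V$. One has:
\[
\begin{aligned}
\|VL\| &\le \sup_{-\infty < \xi \le A,\ 0 \le \eta \le B} \int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, |Q(s, t)|\, e^{-\gamma(\eta - t)} e^{-\gamma t} |L(s, t)| \\
&\le \|L\| \sup_{-\infty < \xi \le A,\ 0 \le \eta \le B} \int_{-\infty}^{\xi} ds \int_0^{\eta} dt \left( 2e^{s+t} + e^{s+t} \big| q\big(e^{\frac{s+t}{2}}\big) \big| \right) e^{-\gamma(\eta - t)} \le \frac{c}{\gamma} \|L\|,
\end{aligned}
\tag{3.48}
\]
where $c > 0$ is a constant depending on $A$, $B$ and $\int_0^a r|q(r)|\,dr$. Indeed, one has:
\[ 2 \int_{-\infty}^{A} ds \int_0^{\eta} dt\, e^{s + t - \gamma(\eta - t)} = 2e^{A} \int_0^{\eta} dt\, e^{t - \gamma(\eta - t)} \le 2e^{A+B} \int_0^{\eta} dt\, e^{-\gamma(\eta - t)} \le 2e^{A+B}\, \frac{1 - e^{-\gamma B}}{\gamma} = \frac{c_1}{\gamma}, \tag{3.49'} \]
and, using the substitution $\sigma = e^{\frac{s+t}{2}}$, one gets:
\[
\begin{aligned}
\int_{-\infty}^{A} ds \int_0^{\eta} dt\, e^{s+t} \big| q\big(e^{\frac{s+t}{2}}\big) \big| e^{-\gamma(\eta - t)}
&= \int_0^{\eta} dt\, e^{-\gamma(\eta - t)} \int_{-\infty}^{A} ds\, e^{s+t} \big| q\big(e^{\frac{s+t}{2}}\big) \big| \\
&= 2 \int_0^{\eta} dt\, e^{-\gamma(\eta - t)} \int_0^{e^{\frac{A+t}{2}}} d\sigma\, \sigma |q(\sigma)| \\
&\le \frac{2(1 - e^{-\gamma B})}{\gamma} \int_0^{a} d\sigma\, \sigma |q(\sigma)| := \frac{c_2}{\gamma}.
\end{aligned}
\tag{3.49''}
\]
From these estimates inequality (3.48) follows. It follows from (3.48) that $V$ is a contraction mapping in the space $X_{AB}$ of continuous functions in the region $-\infty < \xi \le A$, $0 \le \eta \le B$, with the norm (3.47) provided that
\[ \gamma > c. \tag{3.50} \]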
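The role of the exponential weight in the norm (3.47) can be seen in a one-variable toy model. The sketch below is an assumption-laden illustration, not the operator $V$ of (3.46): it replaces $|Q|$ by a constant $c$, drops the $\xi$ variable, and considers $(V_{\mathrm{toy}} f)(\eta) = c \int_0^{\eta} f(t)\,dt$ on $[0, B]$ with the norm $\sup_{0 \le \eta \le B} e^{-\gamma\eta}|f(\eta)|$. The same computation as in (3.49') gives the bound $c(1 - e^{-\gamma B})/\gamma < c/\gamma$, so the toy operator is a contraction once $\gamma > c$.

```python
import math


def weighted_norm(values, grid, gamma):
    """Discrete analogue of the norm (3.47) in one variable."""
    return max(math.exp(-gamma * t) * abs(v) for t, v in zip(grid, values))


def apply_toy_operator(values, grid, c):
    """Cumulative trapezoid approximation of c * integral_0^eta f(t) dt."""
    result = [0.0]
    for k in range(1, len(grid)):
        step = grid[k] - grid[k - 1]
        result.append(result[-1] + c * 0.5 * step * (values[k] + values[k - 1]))
    return result


def contraction_ratio(f, c, gamma, upper, points=20000):
    """Ratio ||V_toy f|| / ||f|| in the weighted norm on [0, upper]."""
    grid = [upper * k / points for k in range(points + 1)]
    values = [f(t) for t in grid]
    image = apply_toy_operator(values, grid, c)
    return weighted_norm(image, grid, gamma) / weighted_norm(values, grid, gamma)


c, gamma, upper = 3.0, 5.0, 2.0
# f(t) = exp(gamma*t) has weighted norm 1 and nearly saturates the bound c/gamma.
print(contraction_ratio(lambda t: math.exp(gamma * t), c, gamma, upper), c / gamma)
# A constant function gives a much smaller ratio, c/(gamma*e).
print(contraction_ratio(lambda t: 1.0, c, gamma, upper))
```

In the unweighted sup norm ($\gamma = 0$) the same toy operator has norm $cB$, which need not be less than 1; the weight is what makes the argument work on an interval of arbitrary length $B$.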

Therefore Eq. (3.45) has a unique solution L(ξ, η) in the region −∞ < ξ < A,

0≤η≤B

(3.51)

for any real A and B > 0 if (3.50) holds. This means that the above solution is defined for any ξ ∈ R and any η ≥ 0. Equation (3.45) is equivalent to problem (3.41)-(3.43) and, by (3.40), one has: ξ

B(ξ, η) = L(ξ, η)e 2 .

(3.52)

Therefore we have proved the existence and uniqueness of B(ξ, η), that is, of the kernel K(r, ρ) = B(ξ, η) of the transformation operator (1.32). Recall that r and ρ are related to ξ and η by formulas (3.31).
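The contraction argument is constructive: the solution of (3.45) is the limit of the Picard iterates $L_{k+1} = VL_k + b$. The following Python sketch illustrates this on a grid. It is a sketch under stated assumptions only: the kernel $Q(s,t) = e^{s+t}$, the data $b(\xi) = e^{\xi}$, the truncation of the ξ-axis at ξ = −8, the left-rectangle quadrature and all function names are hypothetical choices made for the example and are not taken from the paper.

```python
import math

def apply_V(Q, L, h, k):
    """Discrete analogue of (3.46): -(h*k) * sum of Q*L over grid nodes
    with s-index < i and t-index < j (left-rectangle rule)."""
    n, m = len(L), len(L[0])
    # prefix[i][j] = sum of Q*L over all nodes (i', j') with i' < i, j' < j
    prefix = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            prefix[i + 1][j + 1] = (Q[i][j] * L[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    return [[-h * k * prefix[i][j] for j in range(m)] for i in range(n)]

def picard(Q, b, h, k, iters=40):
    """Iterate L <- b + V L, starting from L = b."""
    n, m = len(Q), len(Q[0])
    L = [[b[i]] * m for i in range(n)]
    for _ in range(iters):
        VL = apply_V(Q, L, h, k)
        L = [[b[i] + VL[i][j] for j in range(m)] for i in range(n)]
    return L

def residual(Q, b, L, h, k):
    """Max-norm of L - (b + V L); zero for an exact discrete solution."""
    VL = apply_V(Q, L, h, k)
    return max(abs(L[i][j] - b[i] - VL[i][j])
               for i in range(len(L)) for j in range(len(L[0])))

# Hypothetical model data on -8 <= xi <= 0, 0 <= eta <= 1.
n, m = 81, 21
h, k = 8.0 / (n - 1), 1.0 / (m - 1)
xi = [-8.0 + i * h for i in range(n)]
eta = [j * k for j in range(m)]
Q = [[math.exp(s + t) for t in eta] for s in xi]
b = [math.exp(s) for s in xi]
L = picard(Q, b, h, k)
```

With the left-rectangle rule the discrete operator only couples a node to nodes strictly below it in both indices, so the iteration settles after finitely many steps and `residual(Q, b, L, h, k)` is at the level of rounding error. Other quadrature rules would work as well; this one was chosen because it keeps the Volterra structure explicit.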


Let us formulate the result:

Lemma 3.2. The kernel of the transformation operator (1.32) solves problem (3.27)–(3.29). The solution to this problem exists and is unique for any potential $q(r) \in C^1(0,a)$ in the class of twice continuously differentiable functions. If $q(r) \in L^\infty(0,a)$, then K(r, ρ) has first derivatives which are bounded, and Eq. (3.27) has to be understood in the sense of distributions. The following estimate holds for any r > 0:

$$\int_0^{r} |K(r,\rho)|\,\rho^{-1}\,d\rho < \infty. \tag{3.53}$$

Proof of Lemma 3.2. We have already proved all the assertions of Lemma 3.2 except estimate (3.53). Let us prove this estimate. Note that

$$\int_0^{r} |K(r,\rho)|\,\rho^{-1}\,d\rho = r\int_0^{\infty} |L(2\ln r - \eta,\eta)|\,e^{-\frac{\eta}{2}}\,d\eta < \infty. \tag{3.54}$$

Indeed, if r > 0 is fixed, then, by (3.31), ξ + η = 2 ln r = const. Therefore dξ = −dη, and $\rho^{-1}d\rho = \frac{1}{2}(d\xi - d\eta) = -d\eta$, ξ = 2 ln r − η. Thus:

$$\int_0^{r} |K(r,\rho)|\,\rho^{-1}\,d\rho = \int_0^{\infty} |L(2\ln r - \eta,\eta)|\,e^{\frac{2\ln r-\eta}{2}}\,d\eta = r\int_0^{\infty} |L(2\ln r - \eta,\eta)|\,e^{-\frac{\eta}{2}}\,d\eta. \tag{3.55}$$

The following estimate holds:

$$|L(\xi,\eta)| \le c\,e^{(2+\epsilon_1)[\eta\mu_1(\xi+\eta)]^{\frac{1}{2}+\epsilon_2}}, \tag{3.56}$$

where $\epsilon_j > 0$, j = 1, 2, are arbitrarily small numbers and $\mu_1$ is defined in formula (3.60) below; see also formula (3.58) for the definition of µ. Estimate (3.56) is proved below, in Lemma 3.3. From (3.55) and (3.56) estimate (3.53) follows. Lemma 3.2 is proved. □

Lemma 3.3. Estimate (3.56) holds.

Proof of Lemma 3.3. From (3.45) one gets:

$$m(\xi,\eta) \le c_0 + (Wm)(\xi,\eta), \qquad m(\xi,\eta) := |L(\xi,\eta)|, \tag{3.57}$$

where $c_0 = \sup_{-\infty<\xi<\infty}|b(\xi)| \le \frac{1}{2}\int_0^{a} s|q(s)|\,ds$ (see (3.42)), and

$$Wm := \int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, \mu(s+t)\,m(s,t), \qquad \mu(s) := \frac{1}{2}\,e^{s}\left(1 + \left|q\!\left(e^{\frac{s}{2}}\right)\right|\right). \tag{3.58}$$

It is sufficient to consider inequality (3.57) with c0 = 1: if c0 = 1 and the solution m0 (ξ, η) to (3.57) satisfies (3.56) with c = c1 , then the solution m(ξ, η) of (3.57) with any c0 > 0 satisfies (3.56) with c = c0 c1 . Therefore, assume that c0 = 1, then (3.57) reduces to: m(ξ, η) ≤ 1 + (W m)(ξ, η).

(3.59)


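Inequality (3.56) follows from (3.59) by iterations. Let us give the details. Note that

$$W1 = \int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, \mu(s+t) = \int_0^{\eta} dt \int_{-\infty}^{\xi} ds\, \mu(s+t) = \int_0^{\eta} dt\, \mu_1(\xi+t) \le \eta\,\mu_1(\xi+\eta).$$

Here we have used the notation

$$\mu_1(\xi) = \int_{-\infty}^{\xi} \mu(s)\,ds, \tag{3.60}$$

and the fact that $\mu_1(s)$ is a monotonically increasing function, since µ(s) > 0. Note also that $\mu_1(s) < \infty$ for any s, −∞ < s < ∞. Furthermore,

$$W^2 1 \le \int_{-\infty}^{\xi} ds \int_0^{\eta} dt\, \mu(s+t)\,t\,\mu_1(s+t) = \int_0^{\eta} dt\, t \int_{-\infty}^{\xi} ds\, \mu(s+t)\,\mu_1(s+t) = \int_0^{\eta} dt\, t\,\frac{\mu_1^2(\xi+t)}{2} \le \frac{\eta^2}{2!}\,\frac{\mu_1^2(\xi+\eta)}{2!}. \tag{3.61}$$

Let us prove by induction that

$$W^n 1 \le \frac{\eta^n}{n!}\,\frac{\mu_1^n(\xi+\eta)}{n!}. \tag{3.62}$$

For n = 1 and n = 2 we have checked (3.62). Suppose (3.62) holds for some n; then

$$W^{n+1} 1 \le W\!\left(\frac{\eta^n}{n!}\,\frac{\mu_1^n(\xi+\eta)}{n!}\right) = \int_0^{\eta} dt\, \frac{t^n}{n!} \int_{-\infty}^{\xi} ds\, \mu(s+t)\,\frac{\mu_1^n(s+t)}{n!} \le \frac{\eta^{n+1}}{(n+1)!}\,\frac{\mu_1^{n+1}(\xi+\eta)}{(n+1)!}. \tag{3.63}$$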
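As a worked illustration of the induction, the iterates can be computed in closed form for a model weight. The choice $\mu(s) = e^{s}$ below is an assumption made only for this example; it does not correspond to a particular potential from the text.

```latex
% Illustrative model (assumption): \mu(s) = e^{s}, hence \mu_1(\xi) = e^{\xi}.
\begin{align*}
W1 &= \int_0^{\eta} \mu_1(\xi+t)\,dt = e^{\xi}\,(e^{\eta}-1) =: x(\xi,\eta), \\
W^{n+1}1 &= W\!\left(\frac{x^{n}}{(n!)^{2}}\right)
  = \frac{1}{(n!)^{2}}
    \int_{-\infty}^{\xi} e^{(n+1)s}\,ds
    \int_0^{\eta} e^{t}\,(e^{t}-1)^{n}\,dt
  = \frac{1}{(n!)^{2}}\cdot\frac{e^{(n+1)\xi}}{n+1}\cdot\frac{(e^{\eta}-1)^{n+1}}{n+1}
  = \frac{x^{n+1}}{((n+1)!)^{2}}, \\
x &= e^{\xi}\,(e^{\eta}-1) \le \eta\,e^{\xi+\eta} = \eta\,\mu_1(\xi+\eta).
\end{align*}
```

In this model $W^n 1 = x^n/(n!)^2$ exactly, so the bound (3.62) holds because $x \le \eta\mu_1(\xi+\eta)$, and the Neumann series $\sum_n W^n 1$ is the entire function $\sum_n x^n/(n!)^2$ evaluated at x.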

By induction, estimate (3.62) is proved for all n = 1, 2, 3, . . . . Therefore (3.59) implies

$$m(\xi,\eta) \le 1 + \sum_{n=1}^{\infty} \frac{\eta^n}{n!}\,\frac{\mu_1^n(\xi+\eta)}{n!} \le c\,e^{(2+\epsilon_1)[\eta\mu_1(\eta+\xi)]^{\frac{1}{2}+\epsilon_2}}, \tag{3.64}$$

where we have used Theorem 2 from [L, Sect. 1.2], namely that the order of the entire function $F(z) := 1 + \sum_{n=1}^{\infty} \frac{z^n}{(n!)^2}$ is $\frac{1}{2}$ and its type is 2. The constant c > 0 in (3.56) depends on $\epsilon_j$, j = 1, 2. Recall that the order of an entire function F(z) is the number

$$\rho := \limsup_{r\to\infty} \frac{\ln\ln M_F(r)}{\ln r},$$

where $M_F(r) := \max_{|z|=r}|F(z)|$. The type of F(z) is the number

$$\sigma := \limsup_{r\to\infty} \frac{\ln M_F(r)}{r^{\rho}}.$$


It is known [L] that if $F(z) = \sum_{n=0}^{\infty} c_n z^n$ is an entire function, then its order ρ and type σ can be calculated by the formulas:

$$\rho = \limsup_{n\to\infty} \frac{n\ln n}{\ln\frac{1}{|c_n|}}, \qquad \sigma = \frac{\limsup_{n\to\infty}\left(n\,|c_n|^{\frac{\rho}{n}}\right)}{e\rho}.$$

If $c_n = \frac{1}{(n!)^2}$, then the above formulas yield $\rho = \frac{1}{2}$ and σ = 2. Lemma 3.3 is proved. □
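The values ρ = 1/2 and σ = 2 can be made plausible numerically. The Python sketch below is illustrative only, and its function names are assumptions of this example. It evaluates the coefficient formulas at a finite n for $c_n = 1/(n!)^2$ and, separately, measures the growth of $\ln F(r)$ against $\sqrt{r}$. The order formula converges slowly (roughly like $1/\ln n$), so only rough agreement should be expected at any finite n.

```python
import math

def order_estimate(n):
    """n*ln(n) / ln(1/|c_n|) for c_n = 1/(n!)^2, via lgamma to avoid overflow."""
    return n * math.log(n) / (2.0 * math.lgamma(n + 1))

def type_estimate(n, rho=0.5):
    """n*|c_n|^(rho/n) / (e*rho) for c_n = 1/(n!)^2, computed in log-space."""
    log_term = math.log(n) - (rho / n) * 2.0 * math.lgamma(n + 1)
    return math.exp(log_term) / (math.e * rho)

def log_F(r, terms=2000):
    """ln F(r) for F(r) = sum_{n>=0} r^n/(n!)^2 with r > 0, by log-sum-exp."""
    logs = [n * math.log(r) - 2.0 * math.lgamma(n + 1) for n in range(terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))
```

For example, `order_estimate(10**6)` is close to, but still slightly above, 1/2, `type_estimate(10**6)` is very close to 2, and `log_F(1e4) / math.sqrt(1e4)` is slightly below 2, consistent with growth of the form $e^{2\sqrt{r}}$ up to a power of r.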

3.5. Proof of Theorem 2.1. Suppose that there are two potentials which generate the same data $\{\delta_\ell\}_{\forall \ell\in\mathcal{L}}$. In Sect. 3.1 we have proved that this implies (3.3). From (3.3) and (1.26) it follows that (3.3) is equivalent to (1.28). From Lemma 3.2, formula (1.32), Lemma 1.1, and the definition

$$h(\ell) = \int_0^{a} dr\, p(r)\,\varphi_{1\ell}(r)\,\varphi_{2\ell}(r), \tag{3.65}$$

it follows that the function h1 (`) :=

r

2 0 π

1 2 1 0(` + 1)2`+ 2 h(`) ∈ N. 2

(3.65’)

This is checked as in the proof of Lemma 1.1 in Sect. 3.2. There are four terms which one gets from multiplication of $\varphi_{1\ell}(r)$ by $\varphi_{2\ell}(r)$, where $\varphi_{j\ell}(r)$, j = 1, 2, are expressed by formula (1.32) with $K(r,\rho) = K_j(r,\rho)$, j = 1, 2. The first term contains $u_\ell^2(r)$ and is identical with (1.30), the second and third terms contain products of the type $u_\ell(r)u_\ell(\rho)$, while the fourth term contains the product $u_\ell(\rho_1)u_\ell(\rho_2)$. These terms are treated as in the proof of Lemma 1.1, and estimate (3.53) is used. From (3.65'), Theorem 1.3, and assumption (1.28) it follows that h(ℓ) = 0 for ℓ = 0, 1, 2, 3, . . . . From this and (1.26) it follows that (1.23) holds for ℓ = 0, 1, 2, 3, . . . . From (1.23) and (1.22) it follows that

$$\int_{B_a} dx\, p(r)\,\psi_1(x,\alpha)\,\psi_2(x,\beta) = 0 \qquad \forall \alpha,\beta \in S^2. \tag{3.66}$$

From (3.66) and Theorem 1.2 one concludes that p(r) = 0. Theorem 2.1 is proved. □

3.6. Heuristic motivation of the basic result. Here we give a heuristic motivation of the basic result, namely of Theorem 2.1. It is well known that

$$u_\ell(r) := \sqrt{\frac{\pi r}{2}}\,J_{\ell+\frac{1}{2}}(r) = \frac{1}{\sqrt{2}}\,\sqrt{\frac{r}{2\ell+1}}\left(\frac{er}{2\ell+1}\right)^{\frac{2\ell+1}{2}}[1+o(1)] \quad \text{as } \ell\to\infty. \tag{3.67}$$

One can prove that $\varphi_\ell(r)$ has the same asymptotics:

$$\varphi_\ell(r) = \frac{1}{\sqrt{2}}\,\sqrt{\frac{r}{2\ell+1}}\left(\frac{er}{2\ell+1}\right)^{\frac{2\ell+1}{2}}[1+o(1)] \quad \text{as } \ell\to\infty. \tag{3.68}$$

If one substitutes (3.68) into (1.28), one gets

$$0 = \int_0^{a} dr\, r^2 p(r)\,r^{2\ell}\,[1+o(1)] \qquad \forall \ell\in\mathcal{L}. \tag{3.69}$$

If one neglects the term o(1), then one gets

$$0 = \int_0^{a} dr\, r^2 p(r)\,r^{2\ell} \qquad \forall \ell\in\mathcal{L}. \tag{3.70}$$
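The large-ℓ asymptotics (3.67) can be checked numerically for fixed r. The Python sketch below is an illustration under stated assumptions: it uses the standard power series of $J_\nu$, which is adequate when r is small compared with ℓ, and works with logarithms to avoid underflow. The function names are hypothetical.

```python
import math

def log_u(ell, r, terms=60):
    """ln u_ell(r), u_ell(r) = sqrt(pi*r/2) * J_{ell+1/2}(r), from the power
    series of J_nu; intended for 0 < r small compared with ell."""
    nu = ell + 0.5
    # S = sum_k (-1)^k (r^2/4)^k / (k! * (nu+1)(nu+2)...(nu+k))
    s, term = 1.0, 1.0
    for k in range(1, terms):
        term *= -(r * r / 4.0) / (k * (nu + k))
        s += term
    log_J = nu * math.log(r / 2.0) - math.lgamma(nu + 1.0) + math.log(s)
    return 0.5 * math.log(math.pi * r / 2.0) + log_J

def log_asymptotic(ell, r):
    """ln of the leading term on the right-hand side of (3.67)."""
    n = 2 * ell + 1
    return (-0.5 * math.log(2.0) + 0.5 * math.log(r / n)
            + 0.5 * n * (1.0 + math.log(r) - math.log(n)))

def ratio(ell, r):
    """u_ell(r) divided by its asymptotic form; should tend to 1 as ell grows."""
    return math.exp(log_u(ell, r) - log_asymptotic(ell, r))
```

For r = 1 the ratio is already within one percent of 1 at ℓ = 50 and moves closer to 1 at ℓ = 400, in line with the factor [1 + o(1)].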

From (3.70) and the well-known Müntz theorem [Ru, p. 336], it follows that p(r) = 0, which yields the conclusion of Theorem 2.1. This heuristic argument is not a proof, because the passage from (3.69) to (3.70) is not justified, and it is not clear whether such a justification can be given directly. Our proof of Theorem 2.1 can be considered as an indirect justification of this heuristic argument. It is known that condition (1.1) is necessary and sufficient for completeness of the set $\{r^{\ell}\}_{\ell\in\mathcal{L}}$ in $L^1(0,a)$ for any fixed a > 0. Therefore one can raise the following question: Is it true that condition (1.1) is necessary for the conclusion of Theorem 2.1 to be valid? This interesting question is open.

References

[A] Airapetyan, R., Ramm, A.G., Smirnova, A.B.: Example of two different potentials which have practically the same fixed-energy phase shifts. Phys. Lett. A 254(3–4), 141–148 (1999)
[CS] Chadan, K., Sabatier, P.: Inverse problems in quantum scattering theory. New York: Springer-Verlag, 1989
[GR] Gradshteyn, I., Ryzhik, I.: Table of integrals, series and products. Boston: Acad. Press, 1994
[L] Levin, B.: Distribution of zeros of entire functions. AMS Transl. vol. 5. Providence, RI: 1980
[Le] Levitan, B.: Inverse Sturm–Liouville problems. Utrecht: VNU Press, 1987
[M] Marchenko, V.: Sturm–Liouville operators and applications. Boston: Birkhäuser Verlag, 1986
[R1] Ramm, A.G.: Recovery of the potential from fixed energy scattering data. Inverse Problems 4, 877–886 (1988)
[R2] Ramm, A.G.: Multidimensional inverse scattering problems. New York: Longman/Wiley, 1992, pp. 1–385; Multidimensional inverse scattering problems. Moscow: Mir Publishers, 1994, pp. 1–496 (Russian translation of the expanded monograph)
[R3] Ramm, A.G.: Completeness of the products of solutions of PDE and inverse problems. Inverse Problems 6, 643–664 (1990)
[R4] Ramm, A.G.: Necessary and sufficient condition for a PDE to have property C. J. Math. Anal. Appl. 156, 505–509 (1991)
[R5] Ramm, A.G.: Stability estimates in inverse scattering. Acta Appl. Math. 28(1), 1–42 (1992)
[R6] Ramm, A.G.: Can a constant be a scattering amplitude? Phys. Lett. A 154A, 35–37 (1991)
[Ru] Rudin, W.: Real and complex analysis. New York: McGraw Hill, 1974
[V] Volk, V.: On inverse formulas for a differential equation with a singularity at x = 0. Uspekhi Math. Nauk 8(4), 141–151 (1953) (in Russian)

Communicated by B. Simon

Commun. Math. Phys. 207, 249 – 290 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Spectral Analysis for Systems of Atoms and Molecules Coupled to the Quantized Radiation Field

Volker Bach1,⋆,⋆⋆, Jürg Fröhlich2, Israel Michael Sigal3,⋆⋆⋆

1 FB Mathematik MA 7-2, TU Berlin, Str. d. 17. Juni 136, 10623 Berlin, Germany.

E-mail: [email protected]

2 Institut für Theoretische Physik, ETH Hönggerberg, 8093 Zürich, Switzerland.

E-mail: [email protected]

3 Department of Mathematics, University of Toronto, Toronto, M5S 3G3, Canada.

E-mail: [email protected]

Received: 3 September 1998 / Accepted: 17 March 1999

Dedicated to Klaus Hepp and Walter Hunziker

Abstract: We consider systems of static nuclei and electrons – atoms and molecules – coupled to the quantized radiation field. The interactions between electrons and the soft modes of the quantized electromagnetic field are described by minimal coupling, p → p − eA(x), where A(x) is the electromagnetic vector potential with an ultraviolet cutoff. If the interactions between the electrons and the quantized radiation field are turned off, the atom or molecule is assumed to have at least one bound state. We prove that, for sufficiently small values of the fine structure constant α, the interacting system has a ground state corresponding to the bottom of its energy spectrum. For an atom, we prove that its excited states above the ground state turn into metastable states whose lifetimes we estimate. Furthermore, the energy spectrum is absolutely continuous, except, perhaps, in a small interval above the ground state energy and around the threshold energies of the atom or molecule.

1. Introduction and Survey of Results

1.1. The quantum theory of photons and nonrelativistic, bound electrons. In this paper we continue our mathematical analysis of the standard model of nonrelativistic, quantum-mechanical matter in the presence of static nuclei and interacting with the quantized radiation field initiated in [5–7]. The purpose of this paper is to study ground states of atoms and molecules and, for atoms, lifetimes of their excited states, as described by the standard model, applying and extending the techniques developed in [5,6]. In contrast to [5,6], we no longer require the assumption of confinement of the interaction.

⋆ Present address: FB Mathematik, Universität Mainz, 55099 Mainz, Germany.
⋆⋆ Heisenberg Fellow of the DFG, supported by SFB 288 of the DFG, the TMR-Network on “PDE and QM”.
⋆⋆⋆ Supported by NSERC Grant NA 7901


The physical system we are analyzing consists of a finite number of nuclei, treated as static sources, and a finite number of nonrelativistic electrons, e.g., atoms, ions, or molecules in a clamped nuclei approximation, interacting with the soft modes of the quantized electromagnetic field, which is cut off in the ultraviolet. The Hilbert space of pure state vectors of the system is given by

$$\mathcal{H} := \mathcal{H}_{el} \otimes \mathcal{F}, \tag{1.1}$$

where $\mathcal{H}_{el}$ is the Hilbert space of some finite number, N, of electrons, and $\mathcal{F}$ is the photon Fock space. Thus, in the Schrödinger (configuration-space) representation, $\mathcal{H}_{el}$ is given by the subspace of totally antisymmetric wave functions in $L^2[(\mathbb{R}^3\times\mathbb{Z}_2)^N]$, where $\mathbb{R}^3$ is the configuration space of a single electron, and $\mathbb{Z}_2$ describes its spin, i.e.,

$$\mathcal{H}_{el} := \mathcal{A}_N\, L^2\!\left[(\mathbb{R}^3\times\mathbb{Z}_2)^N\right], \tag{1.2}$$

with $\mathcal{A}_N$ being the orthogonal projection onto the subspace of totally antisymmetric wave functions, as required by the Pauli principle. The one-photon Hilbert space is given by $L^2[\mathbb{R}^3\times\mathbb{Z}_2]$, where $\mathbb{R}^3$ is the photon momentum space and $\mathbb{Z}_2$ describes the two independent transversal polarizations of a photon. (Here and above, the integration measure on $\mathbb{R}^3$ is Lebesgue measure.) The photon Fock space is then defined by

$$\mathcal{F} := \bigoplus_{n=0}^{\infty} \mathcal{S}_n\, L^2\!\left[(\mathbb{R}^3\times\mathbb{Z}_2)^n\right], \tag{1.3}$$

where $\mathcal{S}_n$ is the orthogonal projection onto the subspace of totally symmetric n-photon wave functions, in accordance with the fact that photons are bosons. It is convenient to represent the Hilbert space $\mathcal{H}$ as the space of antisymmetric, square-integrable wave functions on the N-electron configuration space with values in the photon Fock space $\mathcal{F}$, i.e.,

$$\mathcal{H} \cong \mathcal{A}_N\, L^2\!\left[(\mathbb{R}^3\times\mathbb{Z}_2)^N; \mathcal{F}\right]. \tag{1.4}$$

The dynamics of the system is generated by the Hamiltonian

$$H_\alpha^0 := \sum_{j=1}^{N} \left[\sigma_j\cdot\left(-i\nabla_{x_j} - 2\pi^{1/2}\alpha^{3/2}A_\kappa(\alpha x_j)\right)\right]^2 + V_c(x) + H_f, \tag{1.5}$$

where we use units in which ħ = 1 and the electron mass equals 1/2. In (1.5), $\sigma_j = (\sigma_j^x, \sigma_j^y, \sigma_j^z)$ denotes the three Pauli matrices associated with the jth electron, $x_j$ is its position (in suitable units of length described below), α is the fine structure constant, and $A_\kappa(x)$ denotes the quantized vector potential of the transverse modes of the electromagnetic field in the Coulomb gauge, i.e.,

$$A_\kappa(x) := \sum_{\lambda=1,2} \int \frac{d^3k\, \kappa(|k|/K)}{\pi\sqrt{2\omega(k)}} \left\{ \varepsilon_\lambda(k)\,e^{-ik\cdot x}\,a_\lambda^*(k) + \varepsilon_\lambda(k)^*\,e^{ik\cdot x}\,a_\lambda(k) \right\}, \tag{1.6}$$

where κ is an entire function of rapid decrease on the real line, e.g., $\kappa(r) := \exp(-r^4)$, cutting off the vector potential in the ultraviolet domain, ω(k) := |k| is the frequency


of a photon with wave vector k, and $\varepsilon_\lambda(k)$, λ = 1, 2, are photon polarization vectors satisfying

$$\varepsilon_\lambda(k)^*\cdot\varepsilon_\mu(k) = \delta_{\lambda\mu}, \qquad k\cdot\varepsilon_\lambda(k) = 0, \qquad \text{for } \lambda,\mu = 1,2. \tag{1.7}$$

Moreover, $a_\lambda^*(k)$, $a_\lambda(k)$ are standard creation and annihilation operators (see, e.g., [28]) on $\mathcal{F}$ obeying the canonical commutation relations

$$[a_\lambda^{\#}(k_1), a_\mu^{\#}(k_2)] = 0, \qquad [a_\lambda(k_1), a_\mu^*(k_2)] = \delta_{\lambda\mu}\,\delta(k_1-k_2), \tag{1.8}$$

where $a^{\#} = a$ or $a^*$. These objects are densely defined, operator-valued tempered distributions on Fock space $\mathcal{F}$. Fock space $\mathcal{F}$ contains a vector Ω, the vacuum vector, uniquely determined, up to a phase, by the properties that ‖Ω‖ = 1 and $a_\lambda(k)\Omega = 0$, for all λ and k. A dense set of vectors in $\mathcal{F}$ is obtained by applying polynomials in $a_\lambda^*$ (λ = 1, 2), smeared out with test functions, to the vacuum vector Ω. The term $V_c$ on the right side of Eq. (1.5) defining the Hamiltonian is the Coulomb potential describing the electrostatic interactions between electrons and nuclei. In our units, it is given by

$$V_c(x) := \sum_{j=1}^{N}\sum_{m=1}^{M} \frac{-Z_m}{|x_j - R_m|} + \sum_{1\le i<j\le N} \frac{1}{|x_i - x_j|}, \tag{1.9}$$

where $x \equiv (x_1,\dots,x_N) \in \mathbb{R}^{3N}$, and $R_1,\dots,R_M \in \mathbb{R}^3$ are the positions of M static nuclei with atomic numbers $Z_1,\dots,Z_M$. Finally, the Hamiltonian of the free quantized electromagnetic field, $H_f$, is given by

$$H_f := \sum_{\lambda=1,2} \int d^3k\; a_\lambda^*(k)\,\omega(k)\,a_\lambda(k). \tag{1.10}$$
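A finite-dimensional caricature may help to fix ideas about (1.8) and (1.10). The Python sketch below is an illustration under strong assumptions: it keeps a single photon mode of frequency `omega` and truncates Fock space to at most n − 1 photons, whereas the operators in the text are operator-valued distributions on an infinite-dimensional space. All names are hypothetical. In the truncation the relation $[a, a^*] = 1$ holds on every level except the highest one, which is an artifact of the cutoff.

```python
import math

def annihilation(n):
    """n x n matrix of a on span{|0>, ..., |n-1>}: a|j> = sqrt(j)|j-1>."""
    return [[math.sqrt(c) if c == r + 1 else 0.0 for c in range(n)]
            for r in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

n, omega = 6, 1.5          # truncation size and mode frequency (illustrative)
a = annihilation(n)
a_star = transpose(a)      # creation operator (real matrix, so adjoint = transpose)
comm = commutator(a, a_star)
# Single-mode analogue of (1.10): H_f = omega * a^* a, diagonal with entries omega*j
H_f = [[omega * v for v in row] for row in matmul(a_star, a)]
```

The first column of `a` vanishes, which expresses $a\Omega = 0$ for the vacuum vector, and `H_f` is diagonal with entries 0, ω, 2ω, ..., so the vacuum is the eigenvector with eigenvalue 0.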

The r.h.s. of Eq. (1.10) defines a densely defined, positive, self-adjoint operator on $\mathcal{F}$ with absolutely continuous spectrum on the positive half-axis and a simple eigenvalue at 0 corresponding to the eigenvector Ω.

Next, we comment on the units chosen in (1.5). Length (and hence the positions, $x_j$, of the electrons and, $R_m$, of the nuclei) is measured in units of half a Bohr radius, $\frac{1}{2}r_{Bohr} = \frac{\hbar^2}{2m_{el}e^2}$, expressed in Gaussian units, where $m_{el}$ is the mass of an electron and −e its electric charge. Photon wave lengths are measured in units of α times half a Bohr radius, $\frac{\alpha}{2}r_{Bohr}$, i.e., the unit for photon wave vectors is $2\alpha^{-1}r_{Bohr}^{-1}$. The energy unit is chosen to be 4 Rydberg, with $4\,\mathrm{Ry} = \frac{2e^2}{r_{Bohr}}$. The ultraviolet cutoff of the radiation field imposed by the function κ(|k|/K) used in the definition of the vector potential $A_\kappa(x)$ turns off interactions between electrons and photons with energy large compared to K · 4 Ry. The physical value of the fine structure constant α is ≈ 1/137. In this paper, α plays the role of a small, dimensionless number. Our results hold for sufficiently small values of α. We shall not verify that the radii of convergence in α of our analytical constructions cover the physical point α = α(phys) ≈ 1/137 (such a verification would presumably require numerical work on a computer).

Our main concern in this paper is to analyze properties of the energy spectrum of the Hamiltonian $H_\alpha^0$ and to study resonances for the dynamics generated by $H_\alpha^0$ and estimate their lifetimes. We shall show that, for sufficiently small values of α > 0, $H_\alpha^0$ has a normalizable ground state corresponding to the minimum of its spectrum and that “most” of its spectrum is absolutely continuous. Furthermore, we shall show that the excited bound states of the atom or ion (for a purely technical reason, we exclude molecules here) with electrons decoupled from the radiation field, i.e., the bound states of $H_0$, turn into metastable states of finite lifetime when that coupling is turned on. We show that Fermi’s golden rule [27] yields an accurate estimate of the lifetimes of metastable states, and we provide a rigorous justification of Bethe’s formula [11,10] for the Lamb shift. Notice that in our proof of a similar statement in [5,6], we required that A(x) be replaced by a confined vector potential $\kappa(|x|/R)A_\kappa(x)$,

$$A_\kappa(x) \longrightarrow \kappa(|x|/R)\,A_\kappa(x), \tag{1.11}$$

where 1 ≪ R < ∞ is large on the atomic length scale, but finite. We no longer make this assumption in the present paper. The standard model of nonrelativistic electrons interacting with the quantized radiation field, cut off in the ultraviolet, involves passing to the limit R → ∞ in (1.11). In this limit, certain terms in the Hamiltonian $H_\alpha^0$ that describe the coupling of electrons to the quantized radiation field turn out to be what in renormalization jargon is called a marginal perturbation. This makes the problem analyzed in the present paper more subtle and complicated, analytically, than the ones considered in [6,7], where (1.11) was assumed. Because of these analytical complications, we shall not carry out a full-fledged renormalization group analysis of ground states and resonances in the model introduced in (1.5)–(1.10). Instead, we use softer and less quantitative methods to establish existence of ground states, and we do not analyze the exact nature of the atom. The price to be paid is that we lose control over the precise nature of the spectrum of $H_\alpha^0$ in an interval of length $O(\alpha^3)$ above the ground state energy, $E_0(\alpha)$, and we do not present a convergent algorithm to calculate $E_0(\alpha)$ and the exact resonance energies to arbitrary accuracy, but merely derive expressions for these quantities which are asymptotic to order $O(\alpha^3)$. In this sense, the results of this paper are less precise than the ones in [6,7]. But, on the positive side, they do not require the full power of the renormalization group methods [7]. In order to describe our results and methods more precisely, we start with a discussion of the spectral properties of the unperturbed Hamiltonian (electrons decoupled from the radiation field),

$$H_0 \equiv H^0_{\alpha=0} = H_{el} + H_f, \tag{1.12}$$

where $H_{el}$ is the usual atomic (or molecular) Hamiltonian defined by

$$H_{el} := \sum_{j=1}^{N} -\Delta_{x_j} + V_c(x). \tag{1.13}$$

We recall some key properties of $H_{el}$, viewed as an operator on $\mathcal{H}_{el}$. For details and proofs, see [27,14] and references given there. We first note that the potential $V_c$ is a perturbation of the kinetic energy operator $\Delta_x := \sum_{j=1}^{N} \Delta_{x_j}$ with zero relative bound. Thus $H_{el}$ is a semibounded, self-adjoint operator on the domain $D(H_{el}) = D(\Delta_x) = \mathcal{H}_{el} \cap H^2[(\mathbb{R}^3\times\mathbb{Z}_2)^N]$, where $H^2$ is the usual Sobolev space. The essential spectrum of $H_{el}$ is given by $\sigma_{ess}(H_{el}) = [\Sigma, \infty)$, where Σ is the infimum of the spectrum of $H_{el}$, with N replaced by N − 1, as follows from the HVZ theorem (see, e.g., [14]). Thus, Σ is the ionization threshold.

[Figure: energy axis with the ground state energy $E_0$, the excited state energies $E_1, E_2, \dots$, and the threshold Σ, above which the (absolutely) continuous spectrum begins]

Fig. 1. The Spectrum of $H_{el}$

In what follows, we shall make the following assumption about the atom, ion, or molecule under consideration, represented by the parameters N, $R_1,\dots,R_M$, and $Z_1,\dots,Z_M$. For the proof of existence of a ground state of $H_\alpha^0$ in Sect. 2, we assume that $H_{el}$ has at least one eigenvalue $E_0$ below the ionization threshold Σ, i.e., we require that

$$E_0 := \inf\sigma(H_{el}) < \Sigma := \inf\sigma_{ess}(H_{el}). \tag{1.14}$$

The contents of Sect. 3 on the lifetimes of metastable states are non-trivial only if, besides $E_0$, $H_{el}$ has at least one further eigenvalue $E_1$, with $E_0 < E_1 < \Sigma$, i.e., in Sect. 3 we additionally require that the spectrum

$$\sigma(H_{el}) = \{E_0, E_1, \dots\} \cup [\Sigma, \infty) \tag{1.15}$$

of $H_{el}$ consists of eigenvalues $E_0 < E_1 < E_2 < \cdots \le \Sigma$ of finite multiplicity below Σ, possibly with an accumulation point at Σ, and (absolutely) continuous spectrum in [Σ, ∞); see Fig. 1. In Sect. 3 we shall also require an assumption saying, roughly speaking, that there are no accidental selection rules in the system described by $H_{el}$, which could prevent excited eigenstates of $H_{el}$ from decaying radiatively (in 2nd order in the relevant coupling constant). For positive ions and (neutral) atoms or molecules our assumption (1.15) is justified, as it is known that if $N \le \sum_{m=1}^{M} Z_m$ then $H_{el}$ has infinitely many eigenvalues of finite multiplicity below Σ [36]. In the case of negatively ionized atoms or molecules, i.e., if $N > \sum_{m=1}^{M} Z_m$, the question whether $H_{el}$ has isolated eigenvalues below the ionization threshold is more subtle. Indeed, if $N \ge \sum_{m=1}^{M} (2Z_m + 1)$ then $H_{el}$ has no eigenvalues at all [26] (see [30,31] for earlier results). Assuming that the atomic numbers of the nuclei are such that $E_0 < \Sigma$, i.e., that $H_{el}$ has isolated eigenvalues below the ionization threshold, it is, in general, an open question whether the ground state corresponding to the energy $E_0$ is unique or not, except when N = 1 or N = 2, in which case a standard Perron–Frobenius argument proves uniqueness. (Non-uniqueness for N ≥ 3 may arise as a consequence of the Pauli principle.) The spectrum of the photon Hamiltonian $H_f$ consists of a simple eigenvalue at 0, corresponding to the vacuum vector $\Omega \in \mathcal{F}$, and absolutely continuous spectrum (of infinite multiplicity) covering the half-axis [0, ∞). Consequently, by separation of variables, the unperturbed Hamiltonian $H_0 = H_{el} + H_f$ on $\mathcal{H}$ has spectrum

$$\sigma(H_0) = \sigma(H_{el}) + \sigma(H_f). \tag{1.16}$$

The point spectrum of $H_0$ is the same as the point spectrum of $H_{el}$, i.e., it consists of the eigenvalues $\{E_j\}_{j=0,1,2,\dots}$ (corresponding to the eigenvectors $\varphi_{j,\ell}\otimes\Omega$, where $\{\varphi_{j,\ell}\}_{\ell=1,2,\dots,n_j}$ is an orthonormal basis of eigenvectors of $H_{el}$ corresponding to the eigenvalue $E_j$ of multiplicity $n_j$). The continuous spectrum of $H_0$ covers the half-axis $[E_0, \infty)$ and consists of a union of branches $[E_j, \infty)$ starting at the eigenvalues $E_j$ and the branch [Σ, ∞), as indicated in Fig. 2.

[Figure: energy axis with $E_0, E_1, E_2, \dots$ and Σ, each the starting point of a branch of continuous spectrum]

Fig. 2. The Spectrum of $H_0 = H_{el}\otimes 1 + 1\otimes H_f$

We note that the ground state energy $E_0 = \inf\sigma(H_{el})$ of the atom or molecule in the absence of the quantized radiation field coincides with the ground state energy $E_0 \equiv E_0(\alpha = 0) = \inf\sigma(H_0)$ of the system of an atom or molecule in the presence of photons, but decoupled from them. But, while $E_0$ is an isolated eigenvalue of $H_{el}$, it lies at the tip of a branch of continuous spectrum of the Hamiltonian $H_0$. Similarly, the energies $E_1, E_2, \dots$ are isolated eigenvalues of $H_{el}$; but they are eigenvalues of $H_0$ imbedded in continuous spectrum of $H_0$, and each $E_j$ is the threshold of a branch of continuous spectrum of $H_0$. These spectral properties of $H_0$ make it a difficult problem to analyze, mathematically rigorously, the fate of the eigenvalues $E_j$ of $H_0$, and the nature of the energy spectrum of the interacting system described by the Hamiltonian $H_\alpha^0$ introduced in (1.5), for α > 0. Although the perturbation, $W_\alpha^0 := H_\alpha^0 - H_0$, is a small perturbation of $H_0$, general analytical methods to deal with this type of problem in perturbation theory do not appear to be available. In [6,7], we have started to develop such methods, tailor-made to analyze a class of Hamiltonians describing interactions between nonrelativistic quantum-mechanical matter and the radiation field. In this paper we extend those methods to the Hamiltonian $H_\alpha^0$ $(= H_0 + W_\alpha^0)$ of Eq. (1.5), which describes much of the physics of light atoms or molecules interacting with the quantized electromagnetic field (within the Born–Oppenheimer approximation). For background material concerning the physics described by $H_\alpha^0$, see e.g., [12,13] and references given there.

1.2. Survey of main results. In the next section, 1.3, we consider the structure and properties of the perturbation $W_\alpha^0$ in the Hamiltonian $H_\alpha^0 = H_0 + W_\alpha^0$ of (1.5).
The strength of the perturbation $W_\alpha^0$ relative to $H_0$ is measured by the parameter

$$g := (\alpha K)^{3/2}, \tag{1.17}$$

where K is the ultraviolet cutoff in the electromagnetic vector potential A, as described in Eq. (1.6). The parameter K is a “dimensionless energy scale” given by the photon energy above which interactions between electrons and the radiation field are cut off, divided by 4 Ry. In Sect. 1.3 we prove that if g is sufficiently small then the interaction $W_g$ is bounded by $H_0$, in the sense of Kato [25,28], with relative bound strictly smaller than 1. This proves that, for small α, $H_\alpha^0$ is bounded from below and self-adjoint on the domain of $H_0$; see Corollary 1.2. (Under somewhat weaker assumptions one can prove that $H_\alpha^0$ defines a semibounded quadratic form on an appropriate core. It is not easy, however, to characterize the domain of the corresponding self-adjoint operator.) Since $g := (\alpha K)^{3/2}$ is the relevant coupling parameter, we henceforth write

$$H_g := H^0_{\alpha = g^{2/3}/K}, \qquad W_g := W^0_{\alpha = g^{2/3}/K}. \tag{1.18}$$
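The change of parameter between α and g in (1.17) and (1.18) is elementary, and the following Python sketch spells it out. The function names and the value K = 1 are assumptions made for illustration only; the text does not fix a numerical value of the cutoff K.

```python
def coupling_g(alpha, K):
    """Effective coupling g = (alpha*K)**(3/2) of (1.17)."""
    return (alpha * K) ** 1.5

def alpha_from_g(g, K):
    """Inverse relation alpha = g**(2/3) / K used in (1.18)."""
    return g ** (2.0 / 3.0) / K

alpha_phys = 1.0 / 137.0   # approximate physical value quoted in the text
K = 1.0                    # hypothetical cutoff, chosen only for illustration
g = coupling_g(alpha_phys, K)
```

With these illustrative values g is of order $10^{-4}$ to $10^{-3}$, and `alpha_from_g(g, K)` returns `alpha_phys` up to rounding. Whether such a value lies inside the range of g covered by the theorems is exactly the question the authors leave open.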


We should emphasize that our bounds on $g = (\alpha K)^{3/2}$ become poor as the number of electrons, N, becomes large. This does not mean that the Hamiltonian $H_g$ is ill-defined or unbounded from below, for large values of α, as long as the ultraviolet cutoff K is kept finite. In fact, using path-space methods one can rigorously construct the semi-group $\exp(-tH_g)$, for t ≥ 0, and prove that it is self-adjoint, strongly continuous in t > 0 with $\exp(-tH_g) \to 1$, as t → 0, for arbitrary values of α and N, as long as K < ∞ (see [18,33] for various ingredients of the proof). However, for the purposes of the analysis presented in Sect. 3, Kato- (or quadratic-form-) boundedness of $W_g$ relative to $H_0$ is an important property. In Sect. 2 we address the question whether $H_g$ has a ground state corresponding to an eigenvalue $E_0(g) := \inf\sigma(H_g)$. Our main result is Theorem 2.1, which answers this question in the affirmative, as long as the effective coupling constant g is sufficiently small. We remark that the magnitude of the integrand in the definition (1.6) of A(x) is $\sim |k|^{-1/2}$, as k → 0. Hence we are considering a critical case (in the infrared) for which the magnitude of the integrand divided by ω is not square-integrable at k = 0. In case of such a critical behaviour of the coupling function and under the replacement of the photons and the atom by a scalar massless field and a matrix of finite rank, respectively, it is known that ground states (with finite photon number expectation) may not exist, in general [34,1]. This indicates that we are dealing with a borderline case and that the transversality of the photons is crucial for the existence of ground states shown in Theorem 2.1. Indeed, our proof makes explicit use of this property. The method used in Sect. 2 is “non-perturbative” but non-constructive. For this reason, it does not enable us to estimate the multiplicity of the ground state energy $E_0(g)$.
However, if the number N of electrons is 1 or 2, and if the Zeeman terms in the Hamiltonian $H_g$ are set to 0, then we can construct an $L^2$-space representation of the photon Fock space $\mathcal{F}$ (“electric field” representation) with the property that $\exp(-tH_g)$ is positivity-preserving on $\mathcal{H}_{el}\otimes\mathcal{F}$, where $\mathcal{H}_{el}$ is taken in the usual Schrödinger configuration-space representation, and, for N = 2, only spin-singlet states are considered. The ergodicity (positivity-improvement) of $\exp(-tH_g)$ (see [27]), which is established in [19], and a Perron–Frobenius argument then establish uniqueness of the ground state. It is worthwhile to point out some (fairly standard, but) fundamental consequences of the existence of a ground state of $H_g$ for scattering theory: Using straightforward variants of methods developed in [20–22,3,4], one can construct Møller (wave) operators, $\Omega_\pm$, such that the range of $\Omega_+$ ($\Omega_-$) describes asymptotic states of the system consisting of an atom (or molecule) in a ground state accompanied by an outgoing cloud of freely moving photons. The obvious conjecture is that $\mathrm{Ran}\{\Omega_+\} = \mathrm{Ran}\{\Omega_-\} =: \mathcal{H}_{asy}$, where $\mathcal{H}_{asy}$ is isomorphic to $\mathcal{H}_g^{(0)}\otimes\mathcal{F}$, and where $\mathcal{H}_g^{(0)}$ is the space of ground states of $H_g$. This conjecture, called “asymptotic completeness”, would imply the unitarity of the scattering matrix for the scattering of photons off an atom or molecule below the ionization threshold. We are miles away from proving this conjecture! But in the simplified models of massive photons and confined electrons, it has recently been proven in [17,15], and for massless photons coupled to confined electrons in the dipole approximation, asymptotic completeness has recently been established in [35]. In Sect. 3, we prove that, outside small neighborhoods of $E_0$ and Σ and below Σ, the spectrum of $H_g$ is purely absolutely continuous.
(Outside small neighborhoods of the thresholds of $H_{el}$ and above Σ, the spectrum of $H_g$ can be shown to be purely absolutely continuous by using the methods in [6,8].) In particular, $H_g$ does not have any eigenvalues in the vicinity of the imbedded eigenvalues $E_1 < E_2 < \cdots < E_n < \Sigma - \delta$ of $H_0$, where δ is a small positive constant depending on g and N. We will, however, make the idea rather precise that imbedded eigenvalues of $H_0$ give rise to metastable states of $H_g$, and we shall estimate the lifetime, $\propto g^2$, of these metastable states up to an error term $O(g^{2+\varepsilon})$, for some ε > 0.

[Figure: energy plane with $E_0, E_1, E_2, \dots$ and Σ on the real axis and shaded regions rotated by the angle Im(θ)]

Fig. 3. A projection of the Riemann surface of $z \mapsto F^0_{\psi,\varphi}(\theta, z)$ onto the energy plane

Our notion of resonance is based on dilatation analyticity. In order to state our ideas simply, we assume that there is only one atomic nucleus (M = 1 in Eq. (1.9)) of atomic number $Z = Z_1 \le 2N$ located at $R_1 = 0$. (The general case of an arbitrary, finite number M of static nuclei can be studied, too, by borrowing ideas developed in [23]. It would be interesting to extend our results to systems of electrons and dynamical nuclei, all coupled to the quantized radiation field. This would require a considerable amount of additional work and will not be done in the present paper.) To describe dilatation analyticity, we start by scaling the positions, $x_j$, of the electrons and the momenta, k, of the photons by

$$x_j \mapsto e^{\theta}x_j, \qquad k \mapsto e^{-\theta}k. \tag{1.19}$$

When θ is real, the transformations (1.19) determine a unitary transformation $U_\theta$ on the Hilbert space $\mathcal{H}$ defined in Eqs. (1.1)–(1.4). It is easy to see that the subspace, $\mathcal{D} \subseteq \mathcal{H}$, of vectors, ψ, with the property that $\psi(\theta) := U_\theta\psi$ is analytic in θ, for |Im θ| < π/2, is dense in $\mathcal{H}$. Furthermore, one easily checks that

$$U_\theta H_f U_\theta^* = e^{-\theta}H_f. \tag{1.20}$$

These facts, combined with well-known results [2,9] on dilatation analyticity for Schrödinger operators, show that, for arbitrary ψ, ϕ ∈ $\mathcal{D}$, the function

$$F^0_{\psi,\varphi}(\theta, z) := \left\langle \psi(\bar\theta)\,\middle|\,\left(z - H_0(\theta)\right)^{-1}\varphi(\theta)\right\rangle, \tag{1.21}$$

where $H_0(\theta) = U_\theta H_0 U_\theta^{-1}$, is independent of θ, for |Im θ| < π/2, and, for Im θ =: ϑ fixed, $F^0_{\psi,\varphi}(\theta, z)$ is analytic in z in the complement of the shaded region depicted in Fig. 3. In Sect. 3, we construct the function

$$F^g_{\psi,\varphi}(\theta, z) := \left\langle \psi(\bar\theta)\,\middle|\,\left(z - H_g(\theta)\right)^{-1}\varphi(\theta)\right\rangle, \tag{1.22}$$

where $H_g(\theta) = U_\theta H_g U_\theta^{-1}$, with $H_g$ as in (1.5), and we show that, for our choice of an ultraviolet cutoff κ (see (1.6) and below), and for arbitrary ψ and ϕ in $\mathcal{D}$, $F^g_{\psi,\varphi}(\theta, z)$ is independent of θ, for |Im θ| < π/4 small enough, and, for Im θ =: ϑ fixed, $F^g_{\psi,\varphi}(\theta, z)$ is analytic in z in the complement of the shaded region depicted in Fig. 4. It thus provides an analytic continuation of the matrix element $F^g_{\psi,\varphi}(0, z)$ of the resolvent of $H_g - z$ in z from the upper half plane into the lower half plane outside the spectrum of $H_g(\theta)$.

Fig. 4. A projection of the Riemann surface of $z \mapsto F^{g}_{\psi,\varphi}(\theta, z)$ onto the energy plane. (Points marked in the figure: $E_0, E_1, \ldots, E_j$ and $E_0(g), E_1(g), E_{j,\ell}(g)$.)

This implies the absolute continuity of the spectrum of $H_g$ for those energies which are contained in the resolvent set of $H_g(\theta)$ (see Corollary 3.3). In Sect. 3, Eqs. (3.8)–(3.14), we introduce a notion of resonance energy $E_j(g)$ corresponding to the energy $E_j$ of the $j$th excited state of the atom or molecule. We show that the Lamb shift, $\mathrm{Re}\{E_j(g) - E_j\}$, is given by Bethe's formula and $\mathrm{Im}\{E_j(g)\}$ is given by Fermi's Golden Rule, to order $g^2$, with error terms that we prove to be $O(g^{2+\varepsilon})$, for any $0 < \varepsilon < 1/3$. If the decay rate for the $j$th excited state of the atom is non-zero in second order perturbation theory then the lifetime of an initial state close to the $j$th excited state of the uncoupled atom is shown to be finite. Its inverse is shown to be given by
\[ \mathrm{Im}\{E_j(g)\} = \Gamma_j g^2 + O(g^{2+\varepsilon}), \tag{1.23} \]
with $\Gamma_j$ strictly positive. We remark again that, in contrast to [6, Sect. IV], where similar results have been obtained, we no longer require the assumption of confinement, see Eq. (1.11), in the derivation of (1.23). In [7], by assuming confinement and using a more elaborate renormalization group analysis, we have given a convergent algorithm to compute $\mathrm{Im}\{E_j(g)\}$, for small values of $g > 0$. It then follows by standard reasoning that, for $g > 0$ sufficiently small, the spectrum of $H_g$ is purely absolutely continuous in a neighborhood of every eigenvalue $E_j$, $j \ge 1$, of $H_{el}$ for which $\Gamma_j > 0$.

What we are really after, from a physics point of view, is a precise understanding of the decay of the excited states of the atom under the time evolution $\exp[-itH_g]$, as $t$ becomes large. Let $\delta := \mathrm{dist}\bigl(E_j, \sigma(H_{el}) \setminus \{E_j\}\bigr) > 0$, set $I_j(\delta/2) := (E_j - \delta/2, E_j + \delta/2)$, and notice that $I_j(\delta/2)$ is an open interval containing $E_j$ and such that $\mathrm{dist}\bigl(I_j, \sigma(H_{el}) \setminus \{E_j\}\bigr) = \delta/2$. Let $F_j$ denote a smooth characteristic function of $I_j(\delta/2)$ (see Subsection 3.1). We shall identify a "$j$th excited state" of the atom with a vector of the form
\[ \Psi_j := F_j(H_g)^{1/2} \exp\bigl[-g^{-2} H_f\bigr]\, \varphi_{j,\ell} \otimes \eta , \tag{1.24} \]
where $\varphi_{j,\ell}$ is an effective excited atomic state to order $g^2$ described in (3.26) below, and the operator $\exp[-g^{-2} H_f]$ essentially eliminates high-energy photons of energy larger than $g^2$ in the state $\eta \in \mathcal{F}$, which is assumed to be dilatation analytic. An example for such a state is given by $\eta = \Omega$. We then show that, for any $0 < \varepsilon < 1/3$, there exists a constant $C \ge 0$ and, for any $L \in \mathbb{N}$, a constant $C_L \ge 0$ such that
\[ \bigl| \bigl\langle \Psi_j \,\big|\, \exp[-itH_g]\, \Psi_j \bigr\rangle \bigr| \le B_\eta^2 \Bigl( C \exp\bigl[-t\,(g^2 \Gamma_j - C g^{2+\varepsilon})\bigr] + C_L\, t^{-L} g^4 \Bigr), \tag{1.25} \]


where $B_\eta := \sup_{|\theta| \le \theta_0} \|\eta(\theta)\|$. This estimate implies that, given $\varepsilon > 0$, there is a finite constant $C_\varepsilon > 0$ such that, for $t > C_\varepsilon \Gamma_j^{-1} g^{-2}$,
\[ \bigl| \bigl\langle \Psi_j \,\big|\, e^{-itH_g}\, \Psi_j \bigr\rangle \bigr| \le \varepsilon . \tag{1.26} \]
We remark that it is known from other methods (see, e.g., [16]) that, given $\varepsilon > 0$, there is a constant $D_\varepsilon > 0$ such that, for $0 \le t < D_\varepsilon g^{-2}$,
\[ \bigl| \bigl\langle \Psi_j \,\big|\, e^{-itH_g}\, \Psi_j \bigr\rangle \bigr| \ge \varepsilon . \tag{1.27} \]
Estimates (1.26) and (1.27) show that the state $\Psi_j$ decays, with a lifetime bounded above and below by const $\cdot\, g^{-2}$. This is a typical example of the kind of estimates we are able to prove with the help of the methods developed in Sect. 3. We point out that, while the general idea of our method is borrowed from [24], the details of our analysis are more complicated because resonances do not appear as isolated poles of the meromorphic continuation of $F^{g}_{\psi,\varphi}(\theta, z)$ in (1.22) onto the second sheet.

1.3. Relative bounds, self-adjointness, and dilatation analyticity. We return to Eq. (1.5), which we write as
\[ H_g = H_0 + W_g , \tag{1.28} \]

where $H_0$ is defined in (1.12), and we obtain
\[ W_g = \sum_{j=1}^{N} \Bigl\{ -4\pi^{1/2} \alpha^{3/2} A_\kappa(\alpha x_j) \cdot (-i\nabla_{x_j}) + 4\pi \alpha^{3} A_\kappa^2(\alpha x_j) + 2\pi^{1/2} \alpha^{5/2}\, \sigma_j \cdot \nabla \times A_\kappa(\alpha x_j) \Bigr\} \Big|_{\alpha = g^{2/3}/K} \tag{1.29} \]
from expanding the square in (1.5). Our first goal in this subsection is to prove in Corollary 1.2 that $W_g$ is defined on $\mathcal{D}(H_0)$ and obeys the bound
\[ \bigl\| W_g\, |H_0 + iC(N)|^{-1} \bigr\| \le C'(N)\, (\alpha K)^{3/2}, \tag{1.30} \]
for some constants $C(N), C'(N) \ge 0$. This establishes the semiboundedness and self-adjointness of $H_g$ on $\mathcal{D}(H_0)$ for $g := (\alpha K)^{3/2} < C'(N)^{-1}$. Our second goal is to establish the dilatation analyticity of $W_g(\theta) := U_\theta W_g U_\theta^{-1}$ (see Corollary 1.3), where $U_\theta$ is the dilatation operator defined in (1.19), i.e., we prove that $\theta \mapsto W_g(\theta)$ is an analytic function on $D(0, \theta_0) := \{ z \in \mathbb{C} \mid |z| < \theta_0 \}$ with values in $\mathcal{B}(\mathcal{D}(H_0); \mathcal{H})$, the bounded operators from $\mathcal{D}(H_0)$ to $\mathcal{H}$, for some $\theta_0 > 0$. We establish this property by observing that the coupling functions in $W_g(\theta)$ are analytic in $\theta$, pointwise in the other parameters, and by verifying a bound similar to (1.30), namely,
\[ \bigl\| W_g(\theta)\, |H_0 + iC(N, \theta)|^{-1} \bigr\| \le C'(N, \theta)\, g, \tag{1.31} \]
for some constants $C(N, \theta), C'(N, \theta) \ge 0$. In fact, Eq. (1.30) is just the special case $\theta = 0$ in Eq. (1.31). Finally, we establish the dilatation analyticity of $H_g(\theta) := U_\theta H_g U_\theta^{-1}$ in Corollary 1.4, assuming that $H_{el} = -\Delta_x + V_c(x)$ is the Hamiltonian of an atom, i.e., $M = 1$. This simplifying assumption could be avoided by using exterior dilatations


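[23,32], at the expense of having to deal with more involved estimates; we do not carry out this analysis here. We characterize the coupling functions in $W_g(\theta)$ in terms of the following functions,
\[ G_x(k, \lambda; \theta) := \frac{\sqrt{2}\, e^{-\theta/2}\, \kappa(e^{-\theta}|k|/K)}{\pi \sqrt{K^3\, \omega(k)}}\, \exp[-i\alpha k \cdot x]\; \varepsilon_\lambda(k) \tag{1.32} \]
and
\[ B_x(k, \lambda; \theta) := \frac{\alpha \sqrt{2}\, e^{-3\theta/2}\, \kappa(e^{-\theta}|k|/K)}{i\, \pi \sqrt{K^3\, \omega(k)}}\, \exp[-i\alpha k \cdot x]\; k \times \varepsilon_\lambda(k) , \tag{1.33} \]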

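where we introduce a dilatation parameter $\theta \in D(0, \theta_0) \subseteq \mathbb{C}$, for some $\theta_0 > 0$ sufficiently small. Note that, pointwise for every $x, k \in \mathbb{R}^3$ and $\lambda \in \mathbb{Z}_2$, the maps $\theta \mapsto G_x(k, \lambda; \theta)$ and $\theta \mapsto B_x(k, \lambda; \theta)$ are analytic in $D(0, \theta_0)$. Furthermore, we notice that $B_x(k, \lambda; 0) = \nabla \times G_x(k, \lambda; 0)$. By means of $G_x(k, \lambda; \theta)$ and $B_x(k, \lambda; \theta)$ we define the following functions on $\mathbb{R}^3 \times \mathbb{Z}_2$ with values in the operators on $\mathcal{H}_{el}$,
\[ w_{1,0}(k, \lambda; \theta) := w_{0,1}(k, \lambda; \bar\theta)^{*} := \sum_{j=1}^{N} \Bigl\{ -2\, G_{x_j}(k, \lambda; \theta) \cdot p_j + \sigma_j \cdot B_{x_j}(k, \lambda; \theta) \Bigr\} , \tag{1.34} \]
\[ w_{2,0}(k_1, \lambda_1; k_2, \lambda_2; \theta) := w_{0,2}(k_1, \lambda_1; k_2, \lambda_2; \bar\theta)^{*} := \sum_{j=1}^{N} \Bigl\{ G_{x_j}(k_1, \lambda_1; \theta) \cdot G_{x_j}(k_2, \lambda_2; \theta) \Bigr\} , \tag{1.35} \]
\[ w_{1,1}(k_1, \lambda_1; k_2, \lambda_2; \theta) := \sum_{j=1}^{N} \Bigl\{ G_{x_j}(k_1, \lambda_1; \bar\theta)^{*} \cdot G_{x_j}(k_2, \lambda_2; \theta) + G_{x_j}(k_1, \lambda_1; \theta) \cdot G_{x_j}(k_2, \lambda_2; \bar\theta)^{*} \Bigr\} , \tag{1.36} \]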

and these, in turn, serve as coupling functions for the operators defined by
\[ W_{m,n}(\theta) := \sum_{\lambda_1, \ldots, \lambda_{m+n} = 1,2} \int d^3k_1 \cdots d^3k_{m+n}\; w_{m,n}(k_1, \lambda_1; \ldots; k_{m+n}, \lambda_{m+n}; \theta)\; a^{*}_{\lambda_1}(k_1) \cdots a^{*}_{\lambda_m}(k_m)\; a_{\lambda_{m+1}}(k_{m+1}) \cdots a_{\lambda_{m+n}}(k_{m+n}) . \tag{1.37} \]
Then we observe that, after normal ordering, the (dilated) interaction $W_g(\theta)$ reads
\[ W_g(\theta) = \sum_{m+n \le 2} g^{m+n}\, W_{m,n}(\theta) + g^2 C_{no} , \tag{1.38} \]


where $C_{no}$ is defined by $C_{no} := 2N \int |G_x(k, 1; 0)|^2\, d^3k$, which is independent of $x$, and $g := (\alpha K)^{3/2}$. Henceforth and consistent with our previous definitions, we omit $\theta$ in our notation in the undilated case, $\theta = 0$, writing
\[ B_x(k, \lambda) := B_x(k, \lambda; 0), \qquad G_x(k, \lambda) := G_x(k, \lambda; 0) , \tag{1.39} \]
\[ w_{m,n}(k_1, \lambda_1; \ldots; k_{m+n}, \lambda_{m+n}) := w_{m,n}(k_1, \lambda_1; \ldots; k_{m+n}, \lambda_{m+n}; 0), \tag{1.40} \]
\[ W_{m,n} := W_{m,n}(0). \tag{1.41} \]

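Next, we define $J : \mathbb{R}^3 \to \mathbb{R}_+$ to be the smallest function such that
\[ \bigl\| w_{1,0}(k, \lambda; \theta)\, (-\Delta_x + 1)^{-1/2} \bigr\| , \;\; \bigl\| w_{0,1}(k, \lambda; \theta)\, (-\Delta_x + 1)^{-1/2} \bigr\| \le J(k) \tag{1.42} \]
holds, for all $|\theta| \le \theta_0$ and $(k, \lambda) \in \mathbb{R}^3 \times \mathbb{Z}_2$, and such that
\[ \bigl\| w_{2,0}(k_1, \lambda_1; k_2, \lambda_2; \theta) \bigr\| , \;\; \bigl\| w_{1,1}(k_1, \lambda_1; k_2, \lambda_2; \theta) \bigr\| , \;\; \bigl\| w_{0,2}(k_1, \lambda_1; k_2, \lambda_2; \theta) \bigr\| \le J(k_1)\, J(k_2) \tag{1.43} \]
holds, for all $|\theta| \le \theta_0$ and $(k_1, \lambda_1), (k_2, \lambda_2) \in \mathbb{R}^3 \times \mathbb{Z}_2$. Note that, due to $\pm i\nabla_{x_j} \le (-\Delta_x + 1)^{1/2}$, we have that
\[ J(k) \le C(\theta_0)\, N\, K^{-3/2}\, |k|^{-1/2} \Bigl( \kappa(e^{-\theta}|k|/K) + \alpha |k|\, \kappa(e^{-\theta}|k|/K) \Bigr), \tag{1.44} \]
for some constant $C(\theta_0) > 0$. The rapid decay of $\kappa$ implies that
\[ \Lambda_\beta := \Bigl( \int J(k)^2\, \omega(k)^{\beta}\, d^3k \Bigr)^{1/2} < \infty, \tag{1.45} \]
for any $\beta > -2$. In particular, $\Lambda_\beta$ is uniform in $K \ge 1$, for any $1 \ge \beta > -2$. This uniformity in $K \ge 1$ is actually the basic requirement that determines $p = 3/2$ in the coupling parameter $g = \alpha^{3/2} K^{p}$. The main relative bound that we use is described in the following lemma.

Lemma 1.1. For all $m, n \in \mathbb{N}_0$ with $1 \le m + n \le 2$ and all $\theta \in D(0, \theta_0)$, the operators $W_{m,n}(\theta)$ are defined on $\mathcal{D}(H_0)$ and obey the bound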

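\[ \bigl\| W_{m,n}(\theta)\, |H_0 + iC(N, \theta_0)|^{-1} \bigr\| \le 4\, \bigl( 1 + \Lambda_0^2 + \Lambda_{-1}^2 \bigr)^{(m+n)/2} , \tag{1.46} \]
for some constant $C(N, \theta_0) \in \mathbb{R}_+$.

Proof. We first note that the canonical commutation relations (1.8) allow us to convert estimates on $W_{1,0}(\theta)$ into those for $W_{0,1}(\bar\theta)$. Indeed, for any $\psi \in \mathcal{D}(H_0)$,
\[ \| W_{1,0}(\theta)\, \psi \|^2 = \| W_{0,1}(\bar\theta)\, \psi \|^2 + \sum_{\lambda = 1,2} \int d^3k\, \| w_{1,0}(k, \lambda; \theta)\, \psi \|^2 \le \| W_{0,1}(\bar\theta)\, \psi \|^2 + 4 \Lambda_0^2\, \bigl\| (-\Delta_x + 1)^{1/2} \psi \bigr\|^2 . \tag{1.47} \]
Furthermore,
\[ \| W_{0,1}(\theta)\, \psi \| \le \sum_{\lambda = 1,2} \int d^3k\; J(k)\, \bigl\| (-\Delta_x + 1)^{1/2} a_\lambda(k)\, \psi \bigr\| \le 2 \Lambda_{-1}\, \bigl\| (-\Delta_x + 1)^{1/2} H_f^{1/2} \psi \bigr\| , \tag{1.48} \]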

and hence
\[ \| W_{1,0}(\theta)\, \psi \| , \;\; \| W_{0,1}(\theta)\, \psi \| \le 2\, (\Lambda_0 + \Lambda_{-1})\, \bigl\| (-\Delta_x + H_f + 1)\, \psi \bigr\| . \tag{1.49} \]
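The second inequality in (1.48) is a Cauchy–Schwarz estimate in the photon momentum. As a brief sketch of that step, write $\chi := (-\Delta_x + 1)^{1/2}\psi$ (a shorthand used only here), assume that $\Lambda_{-1}^2 = \int J(k)^2\, \omega(k)^{-1}\, d^3k$ as in (1.45), and use the standard identity $\sum_\lambda \int \omega(k)\, \|a_\lambda(k)\chi\|^2\, d^3k = \|H_f^{1/2}\chi\|^2$ together with the fact that $(-\Delta_x + 1)^{1/2}$ commutes with $a_\lambda(k)$:
\begin{align*}
\sum_{\lambda = 1,2} \int d^3k\; J(k)\, \| a_\lambda(k)\, \chi \|
 &= \sum_{\lambda = 1,2} \int d^3k\; \frac{J(k)}{\omega(k)^{1/2}}\; \omega(k)^{1/2}\, \| a_\lambda(k)\, \chi \| \\
 &\le \Bigl( \sum_{\lambda = 1,2} \int \frac{J(k)^2}{\omega(k)}\, d^3k \Bigr)^{1/2}
      \Bigl( \sum_{\lambda = 1,2} \int \omega(k)\, \| a_\lambda(k)\, \chi \|^2\, d^3k \Bigr)^{1/2} \\
 &= \sqrt{2}\, \Lambda_{-1}\, \| H_f^{1/2} \chi \| \;\le\; 2\, \Lambda_{-1}\, \| H_f^{1/2} \chi \| .
\end{align*}
The factor $\sqrt{2}$ comes from the sum over the two polarizations; the constant $2$ in (1.48) is a convenient upper bound.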

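Similarly, we convert estimates on $W_{2,0}$ and $W_{1,1}$ into those for $W_{0,2}$. For $W_{2,0}$, for instance, the canonical commutation relations (1.8) imply that
\begin{align*}
W_{2,0}(\theta)^{*}\, W_{2,0}(\theta) = {} & W_{0,2}(\bar\theta)^{*}\, W_{0,2}(\bar\theta) + 2 \int d\xi_1\, d\xi_2\; w_{2,0}(\xi_1; \xi_2; \theta)^{*}\, w_{2,0}(\xi_1; \xi_2; \theta) \\
 & + 4 \int d\xi_1\, d\xi_2\, d\xi_3\; w_{2,0}(\xi_1; \xi_3; \theta)^{*}\, w_{2,0}(\xi_2; \xi_3; \theta)\; a^{*}(\xi_1)\, a(\xi_2), \tag{1.50}
\end{align*}
where we denoted $\xi := (k, \lambda)$, $\int d\xi := \sum_{\lambda = 1,2} \int d^3k$, and $a^{\#}(\xi) := a^{\#}_\lambda(k)$. This yields
\[ \| W_{2,0}(\theta)\, \psi \| \le 3\, \| W_{0,2}(\bar\theta)\, \psi \| + 4 \Lambda_0^2\, \| \psi \| , \tag{1.51} \]
and, as in (1.48), we obtain that
\[ \| W_{0,2}(\theta)\, \psi \| \le \Lambda_{-1}^2\, \| H_f \psi \| \le \Lambda_{-1}^2\, \bigl\| (-\Delta_x + H_f + 1)\, \psi \bigr\| . \tag{1.52} \]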

The bound (1.46) follows from (1.49), (1.51), (1.52), and the additional use of the fact that $-\Delta_x$ is relatively $H_{el}$-bounded with relative bound arbitrarily close to 1. □

Now self-adjointness of $H_g$ on $\mathcal{D}(H_0)$ and dilatation analyticity of $W_g(\theta)$ are just two immediate implications of Lemma 1.1.

Corollary 1.2. If $0 < g = (\alpha K)^{3/2} < (1 + \Lambda_0^2 + \Lambda_{-1}^2)^{-1/2}/10$, then $H_g$ is self-adjoint and semibounded on $\mathcal{D}(H_0)$.

Corollary 1.3. The map $W_g : D(0, \theta_0) \to \mathcal{B}(\mathcal{D}(H_0); \mathcal{H})$, $\theta \mapsto W_g(\theta)$ is analytic.

Finally, we establish the dilatation analyticity of $H_g(\theta) := U_\theta H_g U_\theta^{-1}$, assuming that the potential $V_c(x)$ is dilatation analytic, i.e., $D(0, \theta_0) \ni \theta \mapsto V_c(e^{\theta} x) \in \mathcal{B}(\mathcal{D}(H_0); \mathcal{H})$ is an analytic function. This property holds in case that $H_{el} = -\Delta_x + V_c(x)$ is the Hamiltonian of an atom, i.e., $M = 1$, for arbitrary $\theta$. Indeed, in the atomic case we may choose without loss of generality the position of the nucleus to be the origin of the one-electron configuration space, and then we obtain
\[ H_{el}(\theta) := U_\theta H_{el} U_\theta^{-1} = e^{-2\theta} (-\Delta_x) + e^{-\theta}\, V_c(x). \tag{1.53} \]
Therefore,
\[ H_0(\theta) := U_\theta H_0 U_\theta^{-1} = H_{el}(\theta) + e^{-\theta} H_f \tag{1.54} \]
is an analytic family of type A and, by Corollary 1.3, so is $H_g(\theta) = H_0(\theta) + W_g(\theta)$. We summarize this discussion and a simple consequence of (1.54) in the following corollary.

Corollary 1.4. The family $\{ H_g(\theta) \mid \theta \in D(0, \theta_0) \}$ is dilatation analytic, i.e., the map $H_g : D(0, \theta_0) \to \mathcal{B}(\mathcal{D}(H_0); \mathcal{H})$, $\theta \mapsto H_g(\theta)$ is analytic. Moreover, there exists a constant $b \in \mathbb{R}_+$ such that
\[ \bigl\| \Delta H_{el}(\theta)\, (H_{el} \pm i)^{-1} \bigr\| \le b\, |\theta| , \tag{1.55} \]
where $\Delta H_{el}(\theta) := H_{el}(\theta) - H_{el}$, for all $\theta \in D(0, \theta_0)$.
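The smallness condition in Corollary 1.2 can be sanity-checked numerically. The sketch below is an illustration, not part of the original argument: it sums the bound (1.46) over the five pairs $(m, n)$ with $1 \le m + n \le 2$, weighted by $g^{m+n}$ as in (1.38) without the constant term, and confirms that at the threshold $g = (1 + \Lambda_0^2 + \Lambda_{-1}^2)^{-1/2}/10$ the resulting relative bound is below 1. The function names and the sample values of $\Lambda_0$, $\Lambda_{-1}$ are hypothetical.

```python
import math


def relative_bound(g, lam0, lam_m1):
    """Sum of g^(m+n) * 4 * (1 + lam0^2 + lam_m1^2)^((m+n)/2) over 1 <= m+n <= 2.

    This is the upper bound on the relative norm of W_g with respect to H_0
    that follows from adding up the individual bounds (1.46).
    """
    s = math.sqrt(1.0 + lam0 ** 2 + lam_m1 ** 2)
    total = 0.0
    for m in range(3):
        for n in range(3):
            if 1 <= m + n <= 2:
                total += (g ** (m + n)) * 4.0 * (s ** (m + n))
    return total


def corollary_threshold(lam0, lam_m1):
    """Upper limit on g stated in Corollary 1.2."""
    return 1.0 / (10.0 * math.sqrt(1.0 + lam0 ** 2 + lam_m1 ** 2))


if __name__ == "__main__":
    lam0, lam_m1 = 1.0, 2.0  # illustrative values only
    g_star = corollary_threshold(lam0, lam_m1)
    print(relative_bound(g_star, lam0, lam_m1))
```

At the threshold the two terms with $m + n = 1$ contribute $2 \cdot 4 \cdot 0.1$ and the three terms with $m + n = 2$ contribute $3 \cdot 4 \cdot 0.01$, so the sum stays strictly below 1, which is what a Kato–Rellich type argument requires.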


2. Soft Photon Bound and Existence of a Ground State

In this section we derive a new soft photon bound; see Inequalities (2.5)–(2.6) and Theorem 2.3 below. It is tailored for the minimal coupling model, and we use this bound to prove the existence of a ground state.

Theorem 2.1. There exists a constant $C(N, \Sigma - E_0) \ge 0$ such that, for all $0 < g = (\alpha K)^{3/2} \le C(N, \Sigma - E_0)$, the Hamiltonian $H_g$ has a ground state, i.e., $E_0(\alpha) := \inf \sigma(H_g)$ is an eigenvalue.

Proof. We introduce the notation $a^{\#}(\mathbf{F}) := \bigl( a^{\#}(F_1), a^{\#}(F_2), a^{\#}(F_3) \bigr)^{T}$, $a^{*}(F) := \sum_{\lambda = 1,2} \int d^3k\; F(k, \lambda)\, a^{*}_\lambda(k)$, and $a(F) := \sum_{\lambda = 1,2} \int d^3k\; F(k, \lambda)^{*}\, a_\lambda(k)$. Here, $F$ is a function on $\mathbb{R}^3 \times \mathbb{Z}_2$ with values in the operators on $\mathcal{H}_{el}$ such that
\[ \sum_{\lambda = 1,2} \int d^3k\; \bigl\| F(k, \lambda)\, (-\Delta_x + 1)^{-1/2} \bigr\|^2 < \infty . \]
We further denote $\Phi(F) := a^{*}(F) + a(F)$, $\Phi(\mathbf{F}) := a^{*}(\mathbf{F}) + a(\mathbf{F})$, and $p_j := -i\nabla_{x_j}$. In this notation, the interaction $W_g$ reads
\[ W_g = \sum_{j=1}^{N} \Bigl\{ -2g\, \Phi(G_{x_j}) \cdot p_j + g^2\, \Phi(G_{x_j})^2 + \sigma_j \cdot g\, \Phi(B_{x_j}) \Bigr\} , \tag{2.1} \]

where $G$ and $B$ are defined in Eqs. (1.32), (1.33), and (1.39). Next, we introduce an infrared regularization by switching off the interaction for photons of small momenta. Specifically, we pick a "photon mass", $m > 0$, that is, we replace $G_x(k, \lambda)$, $B_x(k, \lambda)$, and $W_g$ in (1.32), (1.33), and (2.1) by
\[ G^{(m)}_x(k, \lambda) := \chi[\omega(k) \ge m]\; G_x(k, \lambda), \tag{2.2} \]
\[ B^{(m)}_x(k, \lambda) := \chi[\omega(k) \ge m]\; B_x(k, \lambda), \tag{2.3} \]
\[ W^{(m)}_g := \sum_{j=1}^{N} \Bigl\{ -2g\, \Phi(G^{(m)}_{x_j}) \cdot p_j + g^2\, \Phi(G^{(m)}_{x_j})^2 + g\, \sigma_j \cdot \Phi(B^{(m)}_{x_j}) \Bigr\} , \tag{2.4} \]
and we denote $H^{(m)}_g := H_0 + W^{(m)}_g$ and $E^{(m)}_0(g) := \inf \sigma(H^{(m)}_g)$. We remark that $H^{(m)}_g \to H^{(0)}_g = H_g$ in the norm resolvent sense, as $m \to 0$. This easily follows from an estimate similar to Lemma 1.1 (see [6]). In Theorem 2.2 below we show that, for $g = (\alpha K)^{3/2}$ sufficiently small, $H^{(m)}_g$ has a ground state, $\phi_m$, i.e., there exists a normalized solution of $H^{(m)}_g \phi_m = E^{(m)}_0(g)\, \phi_m$, for all $m > 0$. Since $\|\phi_m\| = 1$, the family $\{\phi_m\}_{m > 0}$ contains a weakly convergent subsequence, $\{\phi_{m(n)}\}_{n \in \mathbb{N}}$, where $\lim_{n \to \infty} m(n) = 0$. We put $\phi_0 := \text{w-}\lim_{n \to \infty} \phi_{m(n)}$. Then one easily shows [6] that $\phi_0 \in \mathcal{D}(H_g)$ and that $H_g \phi_0 = E_0(\alpha)\, \phi_0$. To conclude, it remains to show that $\phi_0 \ne 0$.

To show that $\phi_0 \ne 0$, we employ a soft photon bound, as in [6]. There is an important difference, though. The soft photon bound in [6] estimated the photon number expectation $\langle \phi_m | N_f \phi_m \rangle$, where $N_f := \sum_{\lambda = 1,2} \int a^{*}_\lambda(k)\, a_\lambda(k)\, d^3k$, in terms of $\sup_x \| \omega^{-1} G_x \|^2$. It was derived from a virial type argument, using the commutator of $a_\lambda(k)$ and $H_g$. This bound does not directly apply to the present problem because $\| \omega^{-1} G_x \|^2 = \infty$, for all $x$. Modifying the argument slightly by using the commutator of $a_\lambda(k) - iF_x(k, \lambda)$ and


$H_g$, for a suitably chosen $F_x$, we avoid the appearance of $\sup_x \| \omega^{-1} G_x \|^2$ on the right side of the estimate, which we trade for a factor of $\| |x| \phi_m \|$. More precisely, in Theorem 2.3 below we show that
\[ \langle \phi_m | N_f \phi_m \rangle \le C_1(N)\, g^2\, \bigl\| (1 + |x|)\, \phi_m \bigr\|^2 , \tag{2.5} \]
for some constant $C_1(N) \ge 0$. In [6] we showed that $\phi_m$ is exponentially localized in the electron variables. More generally, there exists an $\varepsilon > 0$ such that
\[ \bigl\| e^{\varepsilon |x|}\, \chi_\Delta(H^{(m)}_g) \bigr\| \le C_2 < \infty , \tag{2.6} \]
where $\chi_\Delta(H^{(m)}_g)$ is the spectral projection onto $\Delta := (-\infty, (\Sigma + E_0)/2)$, provided that $g^2 (1 + \Sigma - E_0)$ is sufficiently small (with $C_2$ and $\varepsilon$ independent of $g$). Since $\phi_m \in \mathrm{Ran}\{\chi_\Delta(H^{(m)}_g)\}$, this implies the boundedness of $\| |x| \phi_m \|$. Thus there exists a constant, $C_3 \equiv C_3(N, \Sigma - E_0)$, such that
\[ \langle \phi_m | N_f \phi_m \rangle \le C_3\, g^2 , \tag{2.7} \]
for all $m > 0$. Next, we introduce the projection $P_{el}$ onto all bound states of $H_{el}$ below $\tfrac{1}{2}(\Sigma + E_0) < \Sigma$. Note that $P_{el}$ has finite rank and that $(\Sigma - E_0)\, P_{el}^{\perp} \le 2 (H_0 - E_0)$. The latter implies that
\[ \langle \phi_m | P_{el}^{\perp} \phi_m \rangle \le 2 (\Sigma - E_0)^{-1}\, \bigl\langle \phi_m \big| \bigl( E^{(m)}_0(g) - E_0 - W^{(m)}_g \bigr) \phi_m \bigr\rangle \le C_3\, g^2 . \tag{2.8} \]
From (2.7)–(2.8) and $P_{el} P_\Omega \ge 1 - P_{el}^{\perp} - N_f$ we draw the important consequence that
\[ \langle \phi_m | P_{el} P_\Omega\, \phi_m \rangle \ge 1 - 2 C_3\, g^2 , \tag{2.9} \]
where $P_\Omega := |\Omega\rangle\langle\Omega|$ denotes the rank-one projection onto the photon vacuum vector $\Omega$. Thus, if $g = (\alpha K)^{3/2}$ is sufficiently small then $\langle \phi_m | P_{el} P_\Omega\, \phi_m \rangle \ge 1/2$, for all $m > 0$. Since $P_{el} P_\Omega$ has finite rank, it follows that $\phi_0 \ne 0$. □

In the following theorem, we review the proof in [6] of the existence of a ground state of $H^{(m)}_g$, for $m > 0$.

Theorem 2.2. There exists a constant $C \equiv C(N, \Sigma - E_0) > 0$ such that, for all $0 < g = (\alpha K)^{3/2} \le C$, the Hamiltonian $H^{(m)}_g$ has a ground state, i.e., $E^{(m)}_0(g) := \inf \sigma(H^{(m)}_g)$ is an eigenvalue, for any $m > 0$.

Proof. We only sketch the argument, see [6, Sect. II.2] for details. Alternatively, one may proceed as in [17]. The assertion is proven if we can find some $\tilde m > 0$ such that the sum of the negative eigenvalues of $H^{(m)}_g - E^{(m)}_0(g) - \tilde m$ is finite, i.e., $\mathrm{Tr}\bigl\{ [H^{(m)}_g - E^{(m)}_0(g) - \tilde m]_- \bigr\} > -\infty$, where the negative part of a real number $\lambda$ is defined as $[\lambda]_- := \min\{\lambda, 0\}$. To this end, we first employ a discretization. Given $\varepsilon > 0$ and a locally integrable function $F$, we define its $\varepsilon$-average by
\[ \langle F(k) \rangle_\varepsilon := \varepsilon^{-3} \int_{n(k) + Q_\varepsilon} F(k', \lambda)\, d^3k' , \tag{2.10} \]


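where $Q_\varepsilon = [-\varepsilon/2, \varepsilon/2)^3$ and $n(k) \in (\varepsilon\mathbb{Z})^3$ is the integer part of $k$, i.e., $k - n(k) \in Q_\varepsilon$. We define the corresponding interaction, $W^{(m,\varepsilon)}_g$, by replacing $G^{(m)}_x(k, \lambda)$ and $B^{(m)}_x(k, \lambda)$ in (2.2)–(2.4) by $\langle G^{(m)}_x(k, \lambda) \rangle_\varepsilon$ and $\langle B^{(m)}_x(k, \lambda) \rangle_\varepsilon$, respectively. Then, by the bound (1.46) it follows that
\[ \pm \bigl( W^{(m,\varepsilon)}_g - W^{(m)}_g \bigr) \le o(\varepsilon^0)\, \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2} \bigl( 1 + |x| \bigr) \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2} , \tag{2.11} \]
where $o(\varepsilon^0)$ denotes a function which possibly depends on $g$, $N$, $\kappa$, $m$ and tends to zero as $\varepsilon \to 0$. Here, our original manuscript contained a small mistake in that the factor $1 + |x|$ was missing, as was kindly pointed out to us by F. Hiroshima. Next, we define $H^{(m,\varepsilon)}_f$ by replacing $\omega(k)$ in $H^{(m)}_f$ by $\langle \omega(k) \rangle_\varepsilon$. Since $|\omega(k) - \langle \omega(k) \rangle_\varepsilon| = O(\varepsilon)$ and $\omega(k) \ge m$, we obtain that
\[ \bigl( 1 - o(\varepsilon^0) \bigr)\, H^{(m)}_f \le H^{(m,\varepsilon)}_f \le \bigl( 1 + o(\varepsilon^0) \bigr)\, H^{(m)}_f . \tag{2.12} \]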
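The $\varepsilon$-average (2.10) and the estimate $|\omega(k) - \langle \omega(k) \rangle_\varepsilon| = O(\varepsilon)$ can be illustrated with a small numerical sketch. Everything below is illustrative rather than taken from the paper: the names `cell_center`, `eps_average` and `omega` are hypothetical, the dispersion is assumed to be $\omega(k) = |k|$, and the cube integral is approximated by a midpoint rule.

```python
import math


def cell_center(k, eps):
    """Return n(k) in (eps*Z)^3 such that k - n(k) lies in [-eps/2, eps/2)^3."""
    return tuple(eps * math.floor(c / eps + 0.5) for c in k)


def eps_average(f, k, eps, samples=8):
    """Approximate eps^-3 times the integral of f over the cube n(k) + Q_eps.

    A midpoint rule with samples^3 points is used; this is a numerical
    stand-in for the exact average in (2.10).
    """
    n = cell_center(k, eps)
    h = eps / samples
    offsets = [-eps / 2 + (i + 0.5) * h for i in range(samples)]
    total = 0.0
    for dx in offsets:
        for dy in offsets:
            for dz in offsets:
                total += f((n[0] + dx, n[1] + dy, n[2] + dz))
    return total / samples ** 3


def omega(k):
    """Assumed photon dispersion omega(k) = |k|."""
    return math.sqrt(sum(c * c for c in k))


if __name__ == "__main__":
    k = (1.0, 0.3, -0.7)
    for eps in (0.2, 0.1, 0.05):
        print(eps, abs(eps_average(omega, k, eps) - omega(k)))
```

Because $\omega$ is Lipschitz with constant 1 and every point of the cube $n(k) + Q_\varepsilon$ lies within $\sqrt{3}\,\varepsilon$ of $k$, the printed differences are bounded by $\sqrt{3}\,\varepsilon$, in line with the $O(\varepsilon)$ claim.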

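Denoting $H^{(m,\varepsilon)}_g := H_{el} + H^{(m,\varepsilon)}_f + W^{(m,\varepsilon)}_g$, we hence obtain that
\[ H^{(m)}_g \ge \bigl( 1 - o(\varepsilon^0) \bigr)\, H^{(m,\varepsilon)}_g - o(\varepsilon^0)\, \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2} \bigl( 1 + |x| \bigr) \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2} . \tag{2.13} \]
Next, we introduce the interval $\tilde\Delta := (-\infty, E^{(m)}_0(g) + \tilde m)$ and observe that
\[ \bigl[ H^{(m)}_g - E^{(m)}_0(g) - \tilde m \bigr]_- = \chi_{\tilde\Delta}(H^{(m)}_g)\, \bigl( H^{(m)}_g - E^{(m)}_0(g) - \tilde m \bigr)\, \chi_{\tilde\Delta}(H^{(m)}_g). \tag{2.14} \]
Furthermore, we note that, thanks to (2.6) and $\chi_{\tilde\Delta} = \chi_{\tilde\Delta}\, \chi_\Delta$, for $\tilde m > 0$ sufficiently small, we have
\[ \Bigl\| \chi_{\tilde\Delta}(H^{(m)}_g)\, \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2} \bigl( 1 + |x| \bigr) \bigl( H^{(m)}_g - E^{(m)}_0(g) + 1 \bigr)^{1/2}\, \chi_{\tilde\Delta}(H^{(m)}_g) \Bigr\| < \infty . \tag{2.15} \]
Thus,
\begin{align*}
\mathrm{Tr}\bigl\{ [H^{(m)}_g - E^{(m)}_0(g) - \tilde m]_- \bigr\}
 &\ge \mathrm{Tr}\bigl\{ \chi_{\tilde\Delta}(H^{(m)}_g)\, \bigl( H^{(m,\varepsilon)}_g - E^{(m)}_0(g) - \tilde m - o(\varepsilon^0) \bigr)\, \chi_{\tilde\Delta}(H^{(m)}_g) \bigr\} \\
 &\ge \mathrm{Tr}\bigl\{ \chi_{\tilde\Delta}(H^{(m)}_g)\, \bigl[ H^{(m,\varepsilon)}_g - E^{(m)}_0(g) - \tilde m - o(\varepsilon^0) \bigr]_-\, \chi_{\tilde\Delta}(H^{(m)}_g) \bigr\} \\
 &\ge \mathrm{Tr}\bigl\{ \bigl[ H^{(m,\varepsilon)}_g - E^{(m,\varepsilon)}_0(g) - 3\tilde m/2 \bigr]_- \bigr\} , \tag{2.16}
\end{align*}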

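for $\varepsilon > 0$ sufficiently small, and the finiteness of the right side in (2.16) (and hence the claim) follows if we can show that, for any $\varepsilon > 0$, the discretized Hamiltonian $H^{(m,\varepsilon)}_g$ has only finitely many eigenvalues below $E^{(m,\varepsilon)}_0(g) + 3\tilde m/2$.

The key point of the discretization by means of the $\varepsilon$-average is the tensor product representation $\mathcal{F} \cong \mathcal{F}[\mathcal{H}_{disc}] \otimes \mathcal{F}[\mathcal{H}_{disc}^{\perp}]$, where $\mathcal{H}_{disc}$ is spanned by $\chi_{Q_\varepsilon + n}$, $n \in (\varepsilon\mathbb{Z})^3$ (see [6] for details). Note that, with respect to this representation, we have $H^{(m,\varepsilon)}_f \cong H^{(m,\varepsilon)}_f \otimes 1 + 1 \otimes H^{(m,\varepsilon)}_f$, $W^{(m,\varepsilon)}_g \cong W^{(m,\varepsilon)}_g \otimes 1$, and hence
\[ H^{(m,\varepsilon)}_g \cong H^{(m,\varepsilon)}_g \otimes 1 + 1 \otimes H^{(m,\varepsilon)}_f \ge \bigl( E^{(m,\varepsilon)}_0(g) + m \bigr) \otimes P_\Omega^{\perp} + H^{(m,\varepsilon)}_g \otimes P_\Omega , \tag{2.17} \]
denoting the projection onto the vacuum in $\mathcal{F}[\mathcal{H}_{disc}^{\perp}]$ by $P_\Omega$. Hence, for $3\tilde m \le m$,
\[ H^{(m,\varepsilon)}_g - E^{(m,\varepsilon)}_0(g) - 3\tilde m/2 \ge \bigl( H^{(m,\varepsilon)}_g - E^{(m,\varepsilon)}_0(g) - 3\tilde m/2 \bigr) \otimes P_\Omega . \tag{2.18} \]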

Next, for sufficiently small $\varepsilon > 0$, an estimate similar to Lemma 1.1 together with an interpolation argument implies that, on $\mathcal{H}_{el} \otimes \mathcal{F}[\mathcal{H}_{disc}]$, we have
\[ W^{(m,\varepsilon)}_g \ge -C_1'\, g\, \bigl( H_{el} + H^{(m,\varepsilon)}_f + C(N) \bigr) , \tag{2.19} \]
for some constants $C_1', C(N) \ge 0$. Thus, we obtain that
\begin{align*}
H^{(m,\varepsilon)}_g - E^{(m,\varepsilon)}_0(g)
 &\ge (1 - C_1' g)\, H_{el} - (1 + C_1' g)\, E_0 + (1 - C_1' g)\, H^{(m,\varepsilon)}_f - 2 C_1' C(N) \\
 &\ge \Bigl[ \tfrac{1}{2} (\Sigma - E_0) - 2 C_1' g\, (1 + 2|E_0|) - o(\varepsilon^0) \Bigr] P_{el}^{\perp}
   + \Bigl[ (1 - C_1' g)\, H^{(m,\varepsilon)}_f - 2 C_1' g\, (1 + 2|E_0|) - o(\varepsilon^0) \Bigr] P_{el} , \tag{2.20}
\end{align*}
where $P_{el}$ is the finite dimensional projection onto the bound states of $H_{el}$ of energy $< \tfrac{1}{2} (\Sigma + E_0)$. Now, if $g$ is sufficiently small such that
\[ \hat m := \tfrac{1}{2} (\Sigma - E_0) - 2 C_1' g\, (1 + 2|E_0|) > 0 \tag{2.21} \]
then, for any $0 < 3\tilde m < \min\{m, \hat m\}$, we obtain from inserting (2.20) into (2.18) that
\[ H^{(m)}_g - E^{(m)}_0(g) - 3\tilde m/2 \ge \Bigl[ \bigl( 1 - o(\varepsilon^0) - 2 C_1' g \bigr)\, H^{(m,\varepsilon)}_f - 2 C_1' g\, (1 + 2|E_0|) - o(\varepsilon^0) - 3\tilde m/2 \Bigr] P_{el} \otimes P_\Omega , \tag{2.22} \]
for $\varepsilon > 0$ sufficiently small. The right side, however, has clearly only finitely many negative eigenvalues, for any $\varepsilon > 0$, which proves that, for any $\varepsilon > 0$, the discretized Hamiltonian $H^{(m,\varepsilon)}_g$ has only finitely many eigenvalues below $E^{(m,\varepsilon)}_0(g) + 3\tilde m/2$. □

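Theorem 2.3. Let $\phi_m$ be a ground state of $H^{(m)}_g$, for $m > 0$, and denote by $N_f := \sum_{\lambda = 1,2} \int a^{*}_\lambda(k)\, a_\lambda(k)\, d^3k$ the photon number operator. Assume that $\alpha (K + 1) \le 1$. Then there exist constants $C(N), C_1(N) \ge 0$ such that
\begin{align*}
\langle \phi_m | N_f \phi_m \rangle
 &\le C(N)\, g^2 \sup_{x, \lambda} \Bigl\{ \int \mathrm{Tr}_{3 \times 3} |\nabla_x G_x(k, \lambda)|^2 \Bigl( 1 + \frac{1}{\omega(k)^2} \Bigr) d^3k + \int |G_x(k, \lambda)|^2\, d^3k \\
 &\qquad\qquad + \int \bigl( |B_x(k, \lambda)|^2 + |\Delta_x G_x(k, \lambda)|^2 \bigr) \frac{d^3k}{\omega(k)^2} \Bigr\}\, \bigl\| (1 + |x|)\, \phi_m \bigr\|^2 \\
 &\le C_1(N)\, g^2\, \bigl\| (1 + |x|)\, \phi_m \bigr\|^2 . \tag{2.23}
\end{align*}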


Before giving the proof of Theorem 2.3, we remark that using the definitions (1.32), (1.33), and (1.39) of $G$ and $B$, one easily checks the integrals on the right side of Eq. (2.23) to be bounded by a constant, uniformly in $K \ge 1$.

Proof of Theorem 2.3. Throughout the proof we omit the superscript "(m)". To prove the asserted bound, we first observe the following commutation relations,
\[ a_\lambda(k)\, H_g = \bigl( H_g + \omega(k) \bigr)\, a_\lambda(k) + \sum_{j=1}^{N} \Bigl\{ \sigma_j \cdot g B_{x_j}(k, \lambda) - 2g\, G_{x_j}(k, \lambda) \cdot \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) \Bigr\} , \tag{2.24} \]
\[ F_{x_j}(k, \lambda)\, H_g = H_g\, F_{x_j}(k, \lambda) + \bigl( \Delta_{x_j} F_{x_j}(k, \lambda) \bigr) + 2i\, \bigl( \nabla_{x_j} F_{x_j}(k, \lambda) \bigr) \cdot \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) , \tag{2.25} \]

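on $C_0(\mathbb{R}^{3N}; \mathcal{F}) \cap \mathcal{D}(H_0)$, for any $F \in C^2(\mathbb{R}^3; L^2(\mathbb{R}^3 \times \mathbb{Z}_2))$. We therefore have
\begin{align*}
\Bigl( a_\lambda(k) - i \sum_{j=1}^{N} F_{x_j}(k, \lambda) \Bigr) \bigl( H_g - E_0(\alpha) \bigr)
 = {} & \bigl( H_g - E_0(\alpha) + \omega(k) \bigr)\, a_\lambda(k) - i\, \bigl( H_g - E_0(\alpha) \bigr) \sum_{j=1}^{N} F_{x_j}(k, \lambda) \\
 & + \sum_{j=1}^{N} \Bigl\{ \sigma_j \cdot g B_{x_j}(k, \lambda) + i\, \Delta_{x_j} F_{x_j}(k, \lambda) \\
 & \qquad - 2\, \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) \cdot \bigl( g G_{x_j}(k, \lambda) - \nabla_{x_j} F_{x_j}(k, \lambda) \bigr) \Bigr\} . \tag{2.26}
\end{align*}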
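The commutation relation (2.25) can be verified in a few lines. As a sketch, write $\pi_j := p_j - g\,\Phi(G_{x_j})$ and $F := F_{x_j}(k, \lambda)$ (shorthands used only here), and use that $F$ acts as multiplication by a function of $x_j$, so that it commutes with $\Phi(G_{x_j})$, with $H_f$, with the potential and with the spin term, while $[F, p_j] = i\, \nabla_{x_j} F$. Only the $j$th kinetic term $\pi_j^2$ of $H_g$ then contributes to the commutator:
\begin{align*}
[F, \pi_j^2]
 &= [F, \pi_j] \cdot \pi_j + \pi_j \cdot [F, \pi_j]
  = i\, (\nabla_{x_j} F) \cdot \pi_j + \pi_j \cdot i\, (\nabla_{x_j} F) \\
 &= 2i\, (\nabla_{x_j} F) \cdot \pi_j + \bigl( -i \nabla_{x_j} \bigr) \cdot \bigl( i \nabla_{x_j} F \bigr)
  = 2i\, (\nabla_{x_j} F) \cdot \pi_j + \Delta_{x_j} F ,
\end{align*}
so that $F H_g = H_g F + \Delta_{x_j} F + 2i\, (\nabla_{x_j} F) \cdot \pi_j$, which is (2.25). Moving $\pi_j$ to the left of a vector-valued function $V(x_j)$ costs the extra term $-i\, (\nabla_{x_j} \cdot V)$, which is why the ordering of the factors matters when (2.24) and (2.25) are combined.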

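We apply (2.26) to $\chi(|x|/R)\, \phi_m$, where $\chi \in C_0^{\infty}(\mathbb{R}; [0, 1])$ and $\chi \equiv 1$ on $[-1, 1]$. Letting $R \to \infty$ and using $(H_g - E_0(\alpha))\, \phi_m = 0$, we derive that
\begin{align*}
a_\lambda(k)\, \phi_m = {} & i\, R\bigl( \omega(k) \bigr) \bigl( H_g - E_0(\alpha) \bigr) \sum_{j=1}^{N} F_{x_j}(k, \lambda)\, \phi_m \\
 & + R\bigl( \omega(k) \bigr) \sum_{j=1}^{N} \Bigl\{ \sigma_j \cdot g B_{x_j}(k, \lambda) + i\, \Delta_{x_j} F_{x_j}(k, \lambda) \\
 & \qquad - 2\, \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) \cdot \bigl( g G_{x_j}(k, \lambda) - \nabla_{x_j} F_{x_j}(k, \lambda) \bigr) \Bigr\}\, \phi_m , \tag{2.27}
\end{align*}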

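where $R(\omega) := \bigl( H_g - E_0(\alpha) + \omega \bigr)^{-1}$. We choose $F_x(k, \lambda) := g\, x \cdot G_x(k, \lambda)$. Then (2.27) reads
\begin{align*}
a_\lambda(k)\, \phi_m = {} & i g\, R\bigl( \omega(k) \bigr) \bigl( H_g - E_0(\alpha) \bigr) \sum_{j=1}^{N} x_j \cdot G_{x_j}(k, \lambda)\, \phi_m \\
 & + R\bigl( \omega(k) \bigr) \sum_{j=1}^{N} \Bigl\{ \sigma_j \cdot g B_{x_j}(k, \lambda) + i g\, x_j \cdot \Delta_{x_j} G_{x_j}(k, \lambda) \\
 & \qquad + 2g\, \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) \cdot \bigl[ \nabla_{x_j} G_{x_j}(k, \lambda) \cdot x_j \bigr] \Bigr\}\, \phi_m . \tag{2.28}
\end{align*}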

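We observe that $\bigl\| R(\omega(k)) \bigl( H_g - E_0(\alpha) \bigr) \bigr\| \le 1$ and that
\[ \Bigl| \sum_{j=1}^{N} x_j \cdot G_{x_j} \Bigr| \le |x|\, \Bigl( \sum_{j=1}^{N} |G_{x_j}|^2 \Bigr)^{1/2} , \]
denoting $|x|^2 := \sum_{j=1}^{N} (x_j)^2$. Hence, for $\lambda \in \{1, 2\}$,
\[ \Bigl( \int d^3k\; \Bigl\| R\bigl( \omega(k) \bigr) \bigl( H_g - E_0(\alpha) \bigr) \sum_{j=1}^{N} x_j \cdot G_{x_j}(k, \lambda)\, \phi_m \Bigr\|^2 \Bigr)^{1/2} \le N \sup_x \Bigl( \int |G_x(k, \lambda)|^2\, d^3k \Bigr)^{1/2} \bigl\| |x|\, \phi_m \bigr\| . \tag{2.29} \]
Additionally using $\| R(\omega(k)) \| \le \omega(k)^{-1}$, we similarly obtain
\[ \Bigl( \int d^3k\; \Bigl\| R\bigl( \omega(k) \bigr) \sum_{j=1}^{N} \Bigl\{ \sigma_j \cdot B_{x_j}(k, \lambda) + i\, x_j \cdot \Delta_{x_j} G_{x_j}(k, \lambda) \Bigr\}\, \phi_m \Bigr\|^2 \Bigr)^{1/2} \le N \sup_x \Bigl( \int \bigl( |B_x(k, \lambda)|^2 + |\Delta_x G_x(k, \lambda)|^2 \bigr) \frac{d^3k}{\omega(k)^2} \Bigr)^{1/2} \bigl\| (1 + |x|)\, \phi_m \bigr\| . \tag{2.30} \]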

Furthermore, we note that

R ω(k) (Hg − E0 (α) + 1) ≤ 1 + ω(k)−1 , N

2 X

pj − g8(Gx j ) ≤ C(N ),

R ω(k)

(2.31) (2.32)

j =1

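for some constant $C(N) \ge 0$ which only depends on $N$. These bounds yield the following estimate,
\[ \Bigl( \int d^3k\; \Bigl\| R\bigl( \omega(k) \bigr) \sum_{j=1}^{N} \bigl( p_j - g\, \Phi(G_{x_j}) \bigr) \cdot \bigl[ \nabla_{x_j} G_{x_j}(k, \lambda) \cdot x_j \bigr]\, \phi_m \Bigr\|^2 \Bigr)^{1/2} \le N \sup_x \Bigl( \int \mathrm{Tr}_{3 \times 3} |\nabla_x G_x(k, \lambda)|^2\, \bigl( 1 + \omega(k)^{-2} \bigr)\, d^3k \Bigr)^{1/2} \bigl\| |x|\, \phi_m \bigr\| . \tag{2.33} \]
Thus we arrive at the assertion. □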


3. Resonances and Time-Decay Estimates

In the present section, we study spectral properties of the Hamiltonian $H_g$. We also study the propagator $\exp[-itH_g]$ applied to states whose spectral support is localized about the excited atomic energy level $E_j$, $j \ge 1$. As we describe in the introduction, our main tool for this analysis is the complex dilatation, $H_g(\theta) = U_\theta H_g U_\theta^{-1}$, of the Hamiltonian $H_g$, where $U_\theta$ is the dilatation defined in (1.19). We prove below that, for $\theta = i\vartheta$, $\vartheta > 0$, and $\vartheta \gg g > 0$, a complex neighborhood of an interval $I_j$ about $E_j$ does not contain any spectrum of $H_g(\theta)$. By the dilatation analyticity of $H_g(\theta)$ in $\theta$, this implies that the spectrum of $H_g$ in $I_j$ is purely absolutely continuous, and it allows for an estimate of the time decay rate of certain states in $\mathrm{Ran}\{\chi_{I_j}(H_g)\}$. To state this estimate more precisely, we recall from (1.32)–(1.38) that
\[ H_g(\theta) = H_0(\theta) + W_g(\theta), \tag{3.1} \]
\[ H_0(\theta) = H_{el}(\theta) + e^{-\theta} H_f , \tag{3.2} \]
\[ W_g(\theta) = \sum_{m+n=1}^{2} g^{m+n}\, W_{m,n}(\theta) + g^2 C_{no} , \tag{3.3} \]

where $g := (\alpha K)^{3/2}$, and $C_{no} := 2N \int |G_x(k, 1)|^2\, d^3k$ is an energy shift resulting from normal-ordering $W_g$. We absorb this constant by redefining
\[ W_g(\theta) \to W_g(\theta) + g^2 C_{no} , \qquad H_{el}(\theta) \to H_{el}(\theta) - g^2 C_{no} , \tag{3.4} \]
and since it only shifts all energies by $C_{no}$, we henceforth ignore the constant $C_{no}$ by setting it equal to zero. Thus we obtain
\[ H_g(\theta) := U_\theta H_g U_\theta^{-1} = H_0(\theta) + W_g(\theta) = H_0(\theta) + \sum_{m+n \le 2} g^{m+n}\, W_{m,n}(\theta). \tag{3.5} \]
Recall that we assumed in Eq. (1.15) the $j$th atomic energy level to be of finite degeneracy $n_j$ and isolated from the rest of the spectrum of $H_{el}$ by a positive distance
\[ \delta := \mathrm{dist}\bigl( E_j, \sigma(H_{el}) \setminus \{E_j\} \bigr) > 0. \tag{3.6} \]
As in [6, Eqs. (IV.84), (IV.85)], we define three families of $n_j \times n_j$ matrices by
\[ Z_j(\theta) := Z_j^{d}(\theta) - Z_j^{od}(\theta), \qquad Z_j^{d}(\theta) := U_\theta Z_j^{d} U_\theta^{-1} , \qquad Z_j^{od}(\theta) := U_\theta Z_j^{od} U_\theta^{-1} , \tag{3.7} \]
where
\[ Z_j^{od} := \sum_{\lambda = 1,2} \int P_{el,j}\, w_{0,1}(k, \lambda)\, P_{el,j}^{\perp}\, \bigl[ H_{el} - E_j + \omega(k) - i0 \bigr]^{-1} P_{el,j}^{\perp}\, w_{1,0}(k, \lambda)\, P_{el,j}\; d^3k , \tag{3.8} \]
\[ Z_j^{d} := \sum_{\lambda = 1,2} \int P_{el,j}\, w_{0,1}(k, \lambda)\, P_{el,j}\, w_{1,0}(k, \lambda)\, P_{el,j}\; \frac{d^3k}{\omega(k)} , \tag{3.9} \]


and $P_{el,j} := \sum_{\ell = 1}^{n_j} |\varphi_{j,\ell}\rangle\langle\varphi_{j,\ell}|$ is the projection onto the eigenspace of $H_{el}$ corresponding to the eigenvalue $E_j$. Note that the matrices $Z_j(\theta)$, $Z_j^{od}(\theta)$, and $Z_j^{d}(\theta)$ are similar to $Z_j$, $Z_j^{od}$, and $Z_j^{d}$, respectively, for all $\theta \in D(0, \theta_0)$. Furthermore, we remark that while $Z_j^{d} = (Z_j^{d})^{*}$, the matrices $Z_j^{od}(0)$ and $Z_j$ are not even normal, in general. Since $Z_j^{d}$ is self-adjoint, we have $-\mathrm{Im}\{Z_j\} := -\frac{1}{2i} \bigl( Z_j - Z_j^{*} \bigr) = \mathrm{Im}\{Z_j^{od}\}$, and we require that
\[ 0 < \Gamma_j := \min \sigma\bigl( \mathrm{Im}\{Z_j^{od}\} \bigr) \ \text{is a simple eigenvalue of}\ \mathrm{Im}\{Z_j^{od}\}. \tag{3.10} \]
Denoting by $\varphi_{j,0}$ the normalized eigenvector of $\mathrm{Im}\{Z_j^{od}\}$ corresponding to $\Gamma_j$, we introduce the energy shift
\[ \Delta E_j^{(2)} := \mathrm{Re}\, \langle \varphi_{j,0} | Z_j \varphi_{j,0} \rangle . \tag{3.11} \]
Note that if $E_j$ is nondegenerate ($n_j = 1$) then
\[ Z_j = \Delta E_j^{(2)} - i \Gamma_j , \tag{3.12} \]
since $Z_j \in \mathbb{C}$ is a single complex number in this case. From the nondegeneracy assumption (3.10) and an elementary argument, we obtain the following lemma (see Fig. 5).

Lemma 3.1. Assuming (3.10), there exists an angle $0 < \nu < \pi/2$ such that
\[ \Delta E_j^{(2)} - i \Gamma_j \in \mathrm{NumRan}(Z_j) \subseteq \Delta E_j^{(2)} - i \Gamma_j - i Q_j , \tag{3.13} \]

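where $Q_j := \{ z \in \mathbb{C} \mid -\nu \le \arg(z) \le \nu \}$, and $\mathrm{NumRan}(A) := \{ \langle \varphi | A \varphi \rangle \mid \|\varphi\| = 1 \}$ denotes the numerical range of a matrix $A$.

Given a small constant $\varepsilon > 0$, and a large constant $C > 0$, we use $Q_j$ to define the following "comet-shaped" set $R_j \equiv R_j(\varepsilon, C) \subseteq \mathbb{C}^{-}$ in the lower half-plane (see Fig. 5),
\[ R_j := E_j + g^2 \bigl( \Delta E_j^{(2)} - i \Gamma_j \bigr) + Q_j + e^{-\theta}\, \mathbb{R}_+ + D(0, C g^{2+\varepsilon}). \tag{3.14} \]
Furthermore, for any $0 < \rho < \delta$, we define the interval $I_j(\rho) := (E_j - \rho, E_j + \rho)$, and, for $\varepsilon > 0$, the following subset of $\mathbb{C}$,
\[ A_j(\delta, \varepsilon) := I_j(\delta/2) + i\, [-g^{2-\varepsilon}, \infty) . \tag{3.15} \]
We remark that $\mathrm{dist}\{ I_j(\delta/2), \sigma(H_{el}) \setminus \{E_j\} \} \ge \delta/2$. Now we are ready to formulate the first main spectral result of this section.

Theorem 3.2. Let $0 < \varepsilon < 1/3$. There exists a constant $C \in \mathbb{R}_+$ such that, for $\theta = i\vartheta$ and $\vartheta \gg g > 0$ sufficiently small,
\[ A_j(\delta, \varepsilon) \setminus R_j(\varepsilon, C) \subseteq \rho\bigl( H_g(\theta) \bigr) , \tag{3.16} \]
where $\rho(H_g(\theta))$ is the resolvent set of $H_g(\theta)$, and for any $z \in A_j$, we have that
\[ \bigl\| \bigl( H_g(\theta) - z \bigr)^{-1} \bigr\| \le C\, \mathrm{dist}\bigl( z, R_j \bigr)^{-1} . \tag{3.17} \]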


We note the following consequence of Theorem 3.2 and the analyticity of $H_g$ in $\theta$.

Corollary 3.3. For $g > 0$ sufficiently small, the spectrum of $H_g$ in $I_j(\delta/2)$ is absolutely continuous, $I_j(\delta/2) \subseteq \sigma_{ac}(H_g)$.

Proof of Theorem 3.2. We construct the inverse of $H_g(\theta) - z$ by means of the Feshbach map discussed in detail in [6,7]. For this construction, we specify a partition of unity given by the (non-orthogonal) projections
\[ P(\theta) := P_{el,j}(\theta) \otimes \chi_{H_f < \rho_0} \qquad \text{and} \qquad \bar P(\theta) := 1 - P(\theta), \tag{3.18} \]
where $P_{el,j}(\theta) := U_\theta P_{el,j} U_\theta^{-1}$ and $P_{el,j}$ is the (orthogonal) projection onto the eigenspace of $H_{el}$ corresponding to the eigenvalue $E_j$. Moreover,
\[ \rho_0 := g^{2 - 2\varepsilon} , \tag{3.19} \]
and $0 < \varepsilon < 1/3$ is arbitrary but fixed. Note that, given any $\delta, \vartheta, c > 0$, we have
\[ \rho_0 \le \delta \sin(\vartheta/2) \qquad \text{and} \qquad \rho_0 \ge c\, g^{2 - \varepsilon} , \tag{3.20} \]
provided $g \ge 0$ is sufficiently small. In Lemma 3.14 below we prove that, for any $z \in A_j(\delta, \varepsilon)$,
\[ H_g(\theta)_{\bar P(\theta)} - z \ \text{is invertible on}\ \mathrm{Ran}\{\bar P(\theta)\}, \tag{3.21} \]

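where we denote $A_P := PAP$. This property and some further relative bounds of more technical nature stated in Lemma 3.15 below ensure the existence of the Feshbach operator defined by
\[ F_{P(\theta)} := F_{P(\theta)}\bigl( H_g(\theta) - z \bigr) := H_g(\theta)_{P(\theta)} - z\, P(\theta) - P(\theta)\, W_g\, \bar P(\theta)\, \bigl( H_g(\theta)_{\bar P(\theta)} - z \bigr)^{-1} \bar P(\theta)\, W_g\, P(\theta). \tag{3.22} \]
The importance of $F_{P(\theta)}$ lies in the following identity,
\begin{align*}
\bigl( H_g(\theta) - z \bigr)^{-1} = {} & \Bigl[ P(\theta) - \bar P(\theta)\, \bigl( H_g(\theta)_{\bar P(\theta)} - z \bigr)^{-1} \bar P(\theta)\, W_g\, P(\theta) \Bigr]\; F_{P(\theta)}^{-1}\; \Bigl[ P(\theta) - P(\theta)\, W_g\, \bar P(\theta)\, \bigl( H_g(\theta)_{\bar P(\theta)} - z \bigr)^{-1} \bar P(\theta) \Bigr] \\
 & + \bar P(\theta)\, \bigl( H_g(\theta)_{\bar P(\theta)} - z \bigr)^{-1} \bar P(\theta). \tag{3.23}
\end{align*}
Thus using bounds collected in Lemma 3.15 below, we obtain that, for $z \in A_j(\delta, \varepsilon)$,
\[ \bigl\| \bigl( H_g(\theta) - z \bigr)^{-1} \bigr\| = \bigl( 1 + O(g \rho_0^{-1} \vartheta^{-1}) \bigr)\, \bigl\| F_{P(\theta)}^{-1} \bigr\| + O(1), \tag{3.24} \]
where both sides are infinite iff $z \in \sigma[H_g(\theta)]$. Next, a careful analysis of the Feshbach operator in Lemma 3.16, choosing $\beta := 2\varepsilon (1 - \varepsilon)^{-1} \in (0, 1)$, for any $0 < \varepsilon < 1/3$, yields Theorem 3.4 below. In turn, Theorem 3.4 and the fact that the numerical range of $E_j - z + g^2 Z_j(\theta) + e^{-\theta} H_f$ is contained in $R_j$ implies (3.17). □

Theorem 3.4. For any $0 < \varepsilon < 1/3$, there exists a constant $C \in \mathbb{R}_+$ such that, for any $z \in A_j(\delta, \varepsilon)$, the following estimate holds true,
\[ \bigl\| F_{P(\theta)} - \bigl( E_j - z + g^2 Z_j(\theta) + e^{-\theta} H_f \bigr) P(\theta) \bigr\| \le C\, g^{2+\varepsilon} . \tag{3.25} \]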
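The algebra behind the Feshbach identity (3.23) is the same as that of a Schur complement, and it can be checked on the smallest possible example. The sketch below is a toy illustration only, not the operators of the paper: it takes a complex $2 \times 2$ matrix, lets the first coordinate play the role of $\mathrm{Ran}\,P$ and the second that of $\mathrm{Ran}\,\bar P$, and compares the inverse assembled from the Feshbach operator with the inverse computed directly. The function names and the sample numbers are hypothetical; the matrix is deliberately not self-adjoint, as is the case for $H_g(\theta)$.

```python
def feshbach_inverse(a, b, c, d, z):
    """Invert [[a, b], [c, d]] - z using the block formula of (3.23).

    a is the 'P' block, d the complementary block, and b, c stand in for
    the couplings P W Pbar and Pbar W P (all scalars in this toy model).
    """
    r_bar = 1.0 / (d - z)            # resolvent on the complementary block
    f = (a - z) - b * r_bar * c      # Feshbach operator, cf. (3.22)
    f_inv = 1.0 / f
    return [
        [f_inv, -f_inv * b * r_bar],
        [-r_bar * c * f_inv, r_bar + r_bar * c * f_inv * b * r_bar],
    ]


def direct_inverse(a, b, c, d, z):
    """Invert [[a, b], [c, d]] - z with the usual 2x2 adjugate formula."""
    det = (a - z) * (d - z) - b * c
    return [[(d - z) / det, -b / det], [-c / det, (a - z) / det]]


if __name__ == "__main__":
    args = (1.0, 0.1 + 0.2j, 0.3 - 0.1j, 2.0 - 0.5j, 1.2 + 0.3j)
    print(feshbach_inverse(*args))
    print(direct_inverse(*args))
```

In this toy model the full resolvent is singular exactly when the scalar `f` vanishes, which mirrors the statement after (3.24) that both sides are infinite if and only if $z$ lies in the spectrum.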

Fig. 5. The contour deformation and $R_j$. The black dots represent $E_{j,\ell}(g)$, the spectrum of $E_j + g^2 Z_j$. (Labels in the figure: $E_j$, $E_j \pm \omega/2$, $I_j(\omega)$, $I_j(\omega) - iS$, the vertical segments $+i[-S, 0]$, $\mathrm{NumRan}(Z_j)$, and $R_j$.)

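3.1. Time-decay estimates. To formulate the second main result of this section, another consequence of Theorem 3.2, we pick a smooth function $F \in C_0^{\infty}\bigl( [0, 1/2); [0, 1] \bigr)$ such that $F \equiv 1$ on $[0, 1/4]$, and we define $F_j \in C_0^{\infty}\bigl( I_j(\delta/2); [0, 1] \bigr)$, with $F_j \equiv 1$ on $I_j(\delta/4)$, by $F_j(\lambda) := F(\delta^{-1} |\lambda - E_j|)$. Furthermore, for an eigenvector $\varphi_{j,\ell}$ of $Z_j$ and a dilatation analytic vector $\eta \in \mathcal{F}$, we set
\[ \Psi_j := F_j(H_g)^{1/2}\, \psi_j \qquad \text{and} \qquad \psi_j := \exp\bigl[ -g^{-2} H_f \bigr]\, \varphi_{j,\ell} \otimes \eta . \tag{3.26} \]

Theorem 3.5. Let $\varphi_{j,\ell}$ be a normalized eigenvector of $Z_j$ and $\eta \in \mathcal{F}$ be normalized, dilatation analytic in $D(0, \theta_0)$. Denote $\eta(\theta) := U_\theta \eta$, and assume that $B_\eta := \sup_{|\theta| \le \theta_0} \|\eta(\theta)\| < \infty$. Moreover, assume that $g > 0$ is sufficiently small, $t > 1$, and $0 < \varepsilon < 1/3$. Then there exist a constant $C \ge 0$ and, for any $L \in \mathbb{N}$, a constant $C_L \ge 0$ such that
\[ \bigl| \bigl\langle \Psi_j \,\big|\, \exp[-itH_g]\, \Psi_j \bigr\rangle \bigr| \le B_\eta^2 \Bigl( C \exp\bigl[ -t\, (g^2 \Gamma_j - C g^{2+\varepsilon}) \bigr] + C_L\, t^{-L} g^4 \Bigr) . \tag{3.27} \]

Proof of Theorem 3.5, given Theorems 3.2 and 3.4. We first use the fact that
\[ F_j(\lambda) = -\delta^{-1} \int_{\delta/4}^{\delta/2} F'(\omega/\delta)\, \chi_{I_j(\omega)}(\lambda)\, d\omega \tag{3.28} \]
to rewrite the matrix element on the left side of (3.27) as
\begin{align*}
\bigl\langle \psi_j \,\big|\, \exp[-itH_g]\, F_j(H_g)\, \psi_j \bigr\rangle
 &= -\delta^{-1} \int_{\delta/4}^{\delta/2} F'(\omega/\delta)\, \bigl\langle \psi_j \,\big|\, \exp[-itH_g]\, \chi_{I_j(\omega)}(H_g)\, \psi_j \bigr\rangle\, d\omega \\
 &= \int_{\delta/4}^{\delta/2} \frac{-F'(\omega/\delta)}{\pi \delta} \int_{I_j(\omega)} e^{-it\lambda}\; \mathrm{Im} \Bigl\{ \bigl\langle \psi_j(\bar\theta) \,\big|\, \bigl( H_g(\theta) - \lambda \bigr)^{-1} \psi_j(\theta) \bigr\rangle \Bigr\}\, d\lambda\, d\omega , \tag{3.29}
\end{align*}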


where we use Stone’s formula (see, e.g., [29]) together with Theorem 3.2, which implies that the limit limε&0 (Hg (θ)−λ−iε)−1 = (Hg (θ )−λ)−1 exists and is bounded. Indeed, Theorem 3.2 even implies that λ − is 7 → (Hg (θ ) − λ + is)−1 is bounded analytic, provided s ∈ R is not too large. We exploit this fact by deforming the integration contour Ij (ω) ⊆ C into the lower half-plane. To this end, we define a number S := g 2 0j − 2CS g 2+ε ,

(3.30)

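where $C_S > 0$ is later chosen sufficiently large, and we assume that $\lambda \in I_j(\delta/2)$ and $s \in (-\infty, S]$. Then, Lemma 3.1 and the definition (3.12) of $R_j$ imply that

$$\big\|\big(E_j-\lambda+is+g^2Z_j(\theta)+e^{-\theta}H_f\big)^{-1}\big\| \le C'\,\mathrm{dist}\big(\lambda-is,\,R_j\big)^{-1} \le \frac{C''}{\min\{\vartheta,\nu\}}\Big(\big|\lambda-E_j-g^2\Delta E_j^{(2)}\big| + \Gamma_j g^2 - Cg^{2+\varepsilon} - s\Big)^{-1}, \tag{3.31}$$

and in particular

$$\big\|\big(E_j-\lambda+iS+g^2Z_j(\theta)+e^{-\theta}H_f\big)^{-1}\big\| \le \frac{C''}{\min\{\vartheta,\nu\}}\Big(\big|\lambda-E_j-g^2\Delta E_j^{(2)}\big| + C_S\,g^{2+\varepsilon}\Big)^{-1}, \tag{3.32}$$

for some $C', C'' \in \mathbb{R}^+$, provided that $C_S \ge C$. Theorem 3.2 and Theorem 3.4 yield similar estimates, which we may summarize by

$$\big\|\big(H_g(\theta)-\lambda+is\big)^{-1}\big\|,\ \big\|F_{P(\theta)}\big(H_g(\theta)-\lambda+is\big)^{-1}\big\| \le \frac{C''}{\min\{\vartheta,\nu\}}\Big(\big|\lambda-E_j-g^2\Delta E_j^{(2)}\big| + \Gamma_j g^2 - Cg^{2+\varepsilon} - s\Big)^{-1}, \tag{3.33}$$

$$\big\|\big(H_g(\theta)-\lambda+iS\big)^{-1}\big\|,\ \big\|F_{P(\theta)}\big(H_g(\theta)-\lambda+iS\big)^{-1}\big\| \le \frac{C''}{\min\{\vartheta,\nu\}}\Big(\big|\lambda-E_j-g^2\Delta E_j^{(2)}\big| + C_S\,g^{2+\varepsilon}\Big)^{-1}, \tag{3.34}$$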

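for some $C'' \in \mathbb{R}^+$. Therefore, $z \mapsto (H_g(\theta)-z)^{-1}$ is analytic in the rectangular domain $I_j(\delta/2) + i[-S,\infty)$, and by Cauchy’s integral formula (see Fig. 5), Eq. (3.29) can be written as

$$\big\langle\psi_j\,\big|\,\exp[-itH_g]\,F_j(H_g)\,\psi_j\big\rangle = A_- - A_+ + A_\parallel(\theta) - A_\parallel(\bar\theta), \tag{3.35}$$

where

$$A_\pm := \int_{\delta/4}^{\delta/2}\frac{-F'(\omega/\delta)}{2\pi i\,\delta}\int_0^S \exp\big[-it(E_j\pm\omega-is)\big]\Big\{\big\langle\psi_j(\bar\theta)\,\big|\,\big(H_g(\theta)-E_j\mp\omega+is\big)^{-1}\psi_j(\theta)\big\rangle - \big\langle\psi_j(\theta)\,\big|\,\big(H_g(\bar\theta)-E_j\mp\omega+is\big)^{-1}\psi_j(\bar\theta)\big\rangle\Big\}\,ds\,d\omega, \tag{3.36}$$

$$A_\parallel(\theta) := \int_{\delta/4}^{\delta/2}\frac{-F'(\omega/\delta)\,d\omega}{2\pi i\,\delta}\int_{I_j(\omega)} d\lambda\,\exp\big[-it(\lambda-iS)\big]\,\big\langle\psi_j(\bar\theta)\,\big|\,\big(H_g(\theta)-\lambda+iS\big)^{-1}\psi_j(\theta)\big\rangle. \tag{3.37}$$

Now, Theorem 3.5 directly follows from Lemma 3.6 and Lemma 3.7 below. □

Lemma 3.6. For any $L \in \mathbb{N}$, there exists a constant $C_L \in \mathbb{R}^+$ such that

$$|A_+| + |A_-| \le C_L\,B_\eta^2\,t^{-L}g^4. \tag{3.38}$$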

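Proof. We only derive the estimate on $A_+$ and omit the similar estimate on $A_-$. We use the fact that, for any $L \in \mathbb{N}$ and $t > 0$,

$$\exp\big[-it(E_j+\omega-is)\big] = (-it)^{-L}\,\frac{d^L}{d\omega^L}\exp\big[-it(E_j+\omega-is)\big].$$

Thus, an integration by parts yields

$$A_+ = \frac{i^{-L}}{2\pi i\,\delta\,t^L}\int_0^S\!\!\int_{\delta/4}^{\delta/2} e^{-it(E_j+\omega-is)}\,\frac{d^L}{d\omega^L}\Big[-F'(\omega/\delta)\Big\{\big\langle\psi_j(\bar\theta)\,\big|\,\big(H_g(\theta)-E_j-\omega+is\big)^{-1}\psi_j(\theta)\big\rangle - \big\langle\psi_j(\theta)\,\big|\,\big(H_g(\bar\theta)-E_j-\omega+is\big)^{-1}\psi_j(\bar\theta)\big\rangle\Big\}\Big]\,ds\,d\omega \tag{3.39}$$

$$= \frac{i^{-L}}{2\pi i\,\delta\,t^L}\int_0^S\!\!\int_{\delta/4}^{\delta/2} e^{-it(E_j+\omega-is)}\sum_{k=0}^{L}\frac{L!}{(L-k)!}\,\delta^{-L+k}\,\big(-F^{(L-k+1)}(\omega/\delta)\big)\Big\{\big\langle\psi_j(\bar\theta)\,\big|\,\big(H_g(\theta)-E_j-\omega+is\big)^{-k-1}\psi_j(\theta)\big\rangle - \big\langle\psi_j(\theta)\,\big|\,\big(H_g(\bar\theta)-E_j-\omega+is\big)^{-k-1}\psi_j(\bar\theta)\big\rangle\Big\}\,ds\,d\omega. \tag{3.40}$$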

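Since all derivatives of $F$ are bounded and $S \le g^2\Gamma_j$, there exists a constant $C_L \ge 0$ such that

$$|A_+| \le \frac{C_L\,g^2}{t^L}\,\sup\Big\{\Big|\big\langle\psi_j(\bar\theta)\,\big|\,\big(H_g(\theta)-E_j-\omega+is\big)^{-k-1}\psi_j(\theta)\big\rangle - \big\langle\psi_j(\theta)\,\big|\,\big(H_g(\bar\theta)-E_j-\omega+is\big)^{-k-1}\psi_j(\bar\theta)\big\rangle\Big| \;:\; 0\le k\le L,\ \tfrac{\delta}{4}\le\omega\le\tfrac{\delta}{2},\ 0\le s\le S\Big\}. \tag{3.41}$$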


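Thus, Lemma 3.6 follows if we can find a constant $C > 0$ such that

$$\Big|\big\langle\psi_j(\bar\theta)\,\big|\,R_g(\theta)^{k+1}\psi_j(\theta)\big\rangle - \big\langle\psi_j(\theta)\,\big|\,R_g(\bar\theta)^{k+1}\psi_j(\bar\theta)\big\rangle\Big| \le C\,B_\eta^2\,g^2, \tag{3.42}$$

for all $k \in \{0,1,\dots,L\}$, $\omega \in [\delta/4,\delta/2]$, and $s \in [0,S]$, where we denote

$$R_g(\theta) := \big(H_g(\theta)-E_j-\omega+is\big)^{-1}. \tag{3.43}$$

To this end we introduce an unperturbed resolvent,

$$Q_0(\theta) := \big(H_0(\theta)-E_j-\omega-ig^2\big)^{-1}, \tag{3.44}$$

and we observe that, for $g/|\theta|$ sufficiently small,

$$\|R_g(\theta)\|,\ \|Q_0(\theta)\| \le C\,|\theta|^{-1}, \tag{3.45}$$

$$\|W_g(\theta)R_g(\theta)\|,\ \|W_g(\theta)Q_0(\theta)\| \le C\,g\,|\theta|^{-1}, \tag{3.46}$$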

and some constant $C > 0$, which is uniform in $\omega \in [\delta/4,\delta/2]$ and $s \in [0,S]$. Using the second resolvent equation, we obtain that

$$R_g(\theta) = Q_0(\theta) - Q_0(\theta)\big(W_g(\theta)+is+ig^2\big)Q_0(\theta) + Q_0(\theta)\big(W_g(\theta)+is+ig^2\big)R_g(\theta)\big(W_g(\theta)+is+ig^2\big)Q_0(\theta). \tag{3.47}$$

We expand $R_g(\theta)^{k+1}$ by means of (3.47),

$$R_g(\theta)^{k+1} = Q_0(\theta)^{k+1} - \sum_{\nu=1}^{k+1} Q_0(\theta)^{\nu}\,W_g(\theta)\,Q_0(\theta)^{k+2-\nu} + \mathrm{Rem}, \tag{3.48}$$

and (3.45)–(3.46) show that there is a constant $C \ge 0$, depending on $k$ and $\theta$, such that

$$\|\mathrm{Rem}\| \le C\,g^2. \tag{3.49}$$

Similarly, we find that

$$R_g(\bar\theta)^{k+1} = Q_0^*(\theta)^{k+1} - \sum_{\nu=1}^{k+1} Q_0^*(\theta)^{\nu}\,W_g(\bar\theta)\,Q_0^*(\theta)^{k+2-\nu} + \mathrm{Rem}', \tag{3.50}$$

$$\|\mathrm{Rem}'\| \le C\,g^2 \tag{3.51}$$

(note that $Q_0^*(\theta) \ne Q_0(\bar\theta)$). Inserting the two identities (3.48) and (3.50) into (3.42) and using (3.49) and (3.51), we observe that it suffices to prove that there is a constant $C \ge 0$ such that

$$\Big|\mathrm{Im}\Big\{\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{k+1}\psi_j(\theta)\big\rangle\Big\}\Big| \le C\,B_\eta^2\,g^2, \tag{3.52}$$

$$\Big|\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{\nu}\,W_g(\theta)\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\big\rangle\Big| \le C\,B_\eta^2\,g^2, \tag{3.53}$$

for all $k \in \{0,1,\dots,L\}$, $\nu \in \{1,2,\dots,k+1\}$, and $\omega \in [\delta/4,\delta/2]$. We remark that (3.52) and (3.53) hold trivially for $\psi_j = \varphi_j\otimes\Omega$, where $\varphi_j$ is a normalized eigenvector of $H_{el}$ corresponding to the eigenvalue $E_j$.

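To prove (3.53), we observe that, for any two vectors $\psi \in D(H_f^{m/2})$, $\varphi \in D(H_f^{n/2})$, we have

$$\big|\big\langle\psi\,\big|\,P_{el,j}(\theta)\,W_g(\theta)\,P_{el,j}(\theta)\,\varphi\big\rangle\big| \le \sum_{m+n=1}^{2} g^{m+n}\int \big\|P_{el,j}(\theta)\,w_{m,n}(\xi^{(m)},\tilde\xi^{(n)};\theta)\,P_{el,j}(\theta)\big\|\;\big\|a(\xi^{(m)})\psi\big\|\;\big\|a(\tilde\xi^{(n)})\varphi\big\|\;d\xi^{(m)}\,d\tilde\xi^{(n)}$$

$$\le C\sum_{m+n=1}^{2} g^{m+n}\Big(\int\frac{J(k)^2\,d^3k}{\omega(k)}\Big)^{(m+n)/2}\big\langle\psi\,\big|\,H_f^{m}\psi\big\rangle^{1/2}\big\langle\varphi\,\big|\,H_f^{n}\varphi\big\rangle^{1/2} \le C'\sum_{m+n=1}^{2}\big(\Lambda_{-1}\,g\big)^{m+n}\,\big\|H_f^{m/2}\psi\big\|\,\big\|H_f^{n/2}\varphi\big\|, \tag{3.54}$$

by Schwarz’ inequality. Here we abbreviate the summation $\sum_{\lambda_1=1}^{2}\cdots\sum_{\lambda_m=1}^{2}\int d^3k_1\cdots d^3k_m$ by $\int d\xi^{(m)}$, and $\xi^{(m)} := (k_1,\lambda_1,\dots,k_m,\lambda_m)$.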

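Thus we have

$$\Big|\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{\nu}\,W_g(\theta)\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\big\rangle\Big| = \Big|\big\langle\big(Q_0(\theta)^*\big)^{\nu}\psi_j(\bar\theta)\,\big|\,P_{el,j}(\theta)\,W_g(\theta)\,P_{el,j}(\theta)\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\big\rangle\Big|$$

$$\le C\max_{m+n=1,2}\Big\{g^{m+n}\,\big\|H_f^{m/2}\big(Q_0(\theta)^*\big)^{\nu}\psi_j(\bar\theta)\big\|\;\big\|H_f^{n/2}\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\big\|\Big\}. \tag{3.55}$$

Next, we observe that

$$\big(Q_0(\theta)^*\big)^{\nu}\psi_j(\bar\theta) = \big(e^{-\bar\theta}H_f-\omega+ig^2\big)^{-\nu}\exp\big[-g^{-2}e^{-\bar\theta}H_f\big]\,\varphi_{el,j,i}(\bar\theta)\otimes\eta(\bar\theta), \tag{3.56}$$

and hence, for suitable constants $C, C', C'' \ge 0$,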

m/2 ν

H ¯ Q0 (θ)∗ ψj (θ) f −ν ≤ C Bη sup r m/2 eiϑ r − ω + ig 2 exp −g −2 eiϑ r

(3.57)

r≥0

n o m/2 ≤ C 00 Bη ϑ −1 g 2 eϑ . ≤ C Bη ϑ −1 sup r m/2 exp −g −2 r 0

r≥0

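Inserting this and a similar estimate for $\|H_f^{n/2}\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\|$ into (3.55), we obtain that

$$\Big|\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{\nu}\,W_g(\theta)\,Q_0(\theta)^{k+2-\nu}\psi_j(\theta)\big\rangle\Big| \le |\theta|^{-2}\,B_\eta^2\max_{m+n=1,2}\big(Cg^2\big)^{m+n} = C\,|\theta|^{-2}\,B_\eta^2\,g^2, \tag{3.58}$$

for some constant $C \ge 0$ and $g > 0$ sufficiently small. This proves (3.53).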
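The power counting behind (3.58) rests on an elementary supremum: for $a, c > 0$, $\sup_{r\ge0} r^{a}e^{-cr} = (a/(ec))^{a}$, attained at $r = a/c$. The sketch below checks this numerically and shows that with $a = m/2$ and a decay rate $c = g^{-2}\cos\vartheta$ the supremum scales like $g^{m}$. The decay rate is an assumption here (it is the modulus of $\exp[-g^{-2}e^{i\vartheta}r]$), and the function names and the value of `vartheta` are hypothetical; this is an illustration, not the authors' computation.

```python
import math

def sup_closed_form(a, c):
    """Supremum over r >= 0 of r**a * exp(-c*r), attained at r = a/c (a > 0, c > 0)."""
    return (a / (math.e * c)) ** a

def sup_on_grid(a, c, points=20_000):
    """Brute-force maximum of r**a * exp(-c*r) on a uniform grid over [0, 20*a/c]."""
    r_max = 20.0 * a / c
    best = 0.0
    for i in range(points + 1):
        r = r_max * i / points
        best = max(best, r ** a * math.exp(-c * r))
    return best

if __name__ == "__main__":
    vartheta = 0.1  # hypothetical dilation angle, not taken from the paper
    for g in (0.1, 0.05, 0.025):
        for m in (1, 2):
            a, c = m / 2, math.cos(vartheta) / g ** 2
            closed = sup_closed_form(a, c)
            # closed / g**m does not depend on g: the supremum scales like g**m
            print(g, m, closed, sup_on_grid(a, c), closed / g ** m)
```

Under this assumption each factor $\|H_f^{m/2}\cdots\|$ contributes a power $g^{m}$, so that $g^{m+n}\cdot g^{m}\cdot g^{n} = (g^2)^{m+n}$, in line with the right side of (3.58).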


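Finally, we establish (3.52) by using the fact that we may analytically continue in $\theta$, since the spectral parameter $E_j+\omega+ig^2$ in $Q_0(\theta)$ is in the upper half-plane. Thus

$$\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{k+1}\psi_j(\theta)\big\rangle = \big\langle\psi_j(\bar\theta)\,\big|\,\big(e^{-\theta}H_f-\omega-ig^2\big)^{-k-1}\psi_j(\theta)\big\rangle = \big\langle\psi_j(0)\,\big|\,\big(H_f-\omega-ig^2\big)^{-k-1}\psi_j(0)\big\rangle = \big\langle P_{el,j}\,\eta\,\big|\,e^{-2H_f/g^2}\big(H_f-\omega-ig^2\big)^{-k-1}P_{el,j}\,\eta\big\rangle. \tag{3.59}$$

Therefore,

$$\Big|\mathrm{Im}\Big\{\big\langle\psi_j(\bar\theta)\,\big|\,Q_0(\theta)^{k+1}\psi_j(\theta)\big\rangle\Big\}\Big| \le \sup_{r\ge0}\Big|e^{-2r/g^2}\,\mathrm{Im}\big\{(r-\omega-ig^2)^{-k-1}\big\}\Big|. \tag{3.60}$$

Now, we use that $\omega \ge \delta/4$. If $r \le \omega/2$ then $\omega - r \ge \delta/8$ and thus

$$\big|\arg(r-\omega-ig^2)\big| \le \frac{|\mathrm{Im}(r-\omega-ig^2)|}{|\mathrm{Re}(r-\omega-ig^2)|} \le \frac{8g^2}{\delta}. \tag{3.61}$$

Hence, for $r \le \omega/2$ and $g > 0$ sufficiently small,

$$\Big|\exp[-2g^{-2}r]\;\mathrm{Im}\big\{(r-\omega-ig^2)^{-k-1}\big\}\Big| \le 2(k+1)\,\big(8/\delta\big)^{k+2}\,g^2. \tag{3.62}$$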

We point out that only for the derivation of (3.62) we need to estimate the imaginary part of a matrix element rather than its magnitude. It remains to consider the case $r \ge \omega/2 \ge \delta/8$. We estimate as follows,

$$\Big|\exp[-2g^{-2}r]\;\mathrm{Im}\big\{(r-\omega-ig^2)^{-k-1}\big\}\Big| \le g^{-2k-2}\exp\big[-\delta g^{-2}/8\big] \le C\,g^2, \tag{3.63}$$

for some constant $C \ge 0$. Inserting (3.62), (3.63) into (3.60), we obtain (3.52) which, together with (3.53), finishes the estimate on $A_+$. □

Lemma 3.7. For some constant $C \in \mathbb{R}^+$, we have

$$|A_\parallel(\theta)| + |A_\parallel(\bar\theta)| \le C\,B_\eta^2\exp\big[-t\,(g^2\Gamma_j - Cg^{2+\varepsilon})\big]. \tag{3.64}$$

Proof. We first recall that $A_\parallel(\theta)$ and $A_\parallel(\bar\theta)$ are defined in Eqs. (3.35) and (3.37). Furthermore, we note that the right side in (3.64) equals $C\,B_\eta^2\,e^{-tS}$. We start with the estimate on $|A_\parallel(\bar\theta)|$. We note that the spectrum of $H_g(\bar\theta) = H_g(\theta)^*$ lies in the upper half plane and that Theorem 3.2 implies the existence of $C < \infty$ such that

$$\big\|\big(H_g(\bar\theta)-\lambda+is\big)^{-1}\big\| \le \frac{C}{s}, \tag{3.65}$$

for all $s > 0$. Thus, $(H_g(\bar\theta)-z)^{-1}$ is analytic in $z \in I_j(\delta/2) - i[S,\infty)$, and a contour deformation which replaces $I_j(\omega) - iS$ by $E_j-\omega-i[S,\infty)$, $E_j+\omega-i[S,\infty)$, and $I_j(\omega)-i\infty$ yields that

$$|A_\parallel(\bar\theta)| \le C\,B_\eta^2\int_S^\infty\frac{e^{-ts}\,ds}{\delta+s} \le C'\,B_\eta^2\,e^{-tS}, \tag{3.66}$$

additionally using that $t \ge 1$.
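The last inequality in (3.66) follows from $1/(\delta+s) \le 1/\delta$, which gives $\int_S^\infty e^{-ts}(\delta+s)^{-1}\,ds \le e^{-tS}/(t\delta) \le e^{-tS}/\delta$ for $t \ge 1$. As a minimal numerical sketch of that step (the parameter values and function names are hypothetical, and the integral is truncated at a finite upper limit where the integrand is negligible):

```python
import math

def tail_integral(t, S, delta, length=40.0, steps=100_000):
    """Midpoint-rule approximation of the integral of exp(-t*s)/(delta+s) over [S, S+length]."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        s = S + (i + 0.5) * h
        total += math.exp(-t * s) / (delta + s)
    return total * h

def elementary_bound(t, S, delta):
    """Bound obtained from 1/(delta+s) <= 1/delta, namely exp(-t*S)/(t*delta)."""
    return math.exp(-t * S) / (t * delta)

if __name__ == "__main__":
    S, delta = 0.5, 1.0  # hypothetical values chosen only for illustration
    for t in (1.0, 2.0, 5.0):
        print(t, tail_integral(t, S, delta), elementary_bound(t, S, delta))
```

The requirement $t \ge 1$ is what removes the factor $1/t$ and leaves a constant multiple of $e^{-tS}$.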


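To estimate $A_\parallel(\theta)$, we introduce

$$\tilde A(\varphi,\psi) := \int_{\delta/4}^{\delta/2}\frac{-F'(\omega/\delta)\,d\omega}{2i\pi\,\delta}\int_{I_j(\omega)} d\lambda\;e^{-it(\lambda-iS)}\,\big\langle\varphi\,\big|\,\big(H_g(\theta)-\lambda+iS\big)^{-1}\psi\big\rangle \tag{3.67}$$

and note that $A_\parallel(\theta) = \tilde A(\psi_j(\bar\theta),\psi_j(\theta))$. An application of (3.34) yields

$$|\tilde A(\varphi,\psi)| \le C\,\|\varphi\|\,\|\psi\|\,e^{-tS}\int_{I_j(\delta/2)} d\lambda\,\Big(\big|E_j+g^2\Delta E_j^{(2)}-\lambda\big| + C_S\,g^{2+\varepsilon}\Big)^{-1} \le C'\,\|\varphi\|\,\|\psi\|\,\ln(1/g)\,\exp\big[-t\,(g^2\Gamma_j - C_S\,g^{2+\varepsilon})\big], \tag{3.68}$$

for some constants $C, C' \ge 0$ which depend on $\vartheta$ and $\nu$. To apply this estimate, we write

$$\psi_j^{(1)}(\theta) := \chi_{H_f<g^{2-\varepsilon/2}}\,\psi_j(\theta), \qquad \psi_j^{(2)}(\theta) := \chi_{H_f\ge g^{2-\varepsilon/2}}\,\psi_j(\theta), \tag{3.69}$$

and we observe that

$$\big\|\chi_{H_f\ge g^{2-\varepsilon/2}}\exp\big[-g^{-2}e^{-\theta}H_f\big]\big\| \le \exp\big[-C\,g^{-\varepsilon/2}\big], \tag{3.70}$$

which implies that $\|\psi_j^{(2)}(\theta)\| \le B_\eta\exp(-C\,g^{-\varepsilon/2})$, for some constant $C \in \mathbb{R}^+$. Hence, an application of (3.68) yields

$$\Big|\tilde A\big(\psi_j(\bar\theta),\psi_j(\theta)\big) - \tilde A\big(\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta)\big)\Big| \le C\,B_\eta^2\exp\big[-g^{-\varepsilon/4}-tS\big], \tag{3.71}$$

for some constant $C \in \mathbb{R}^+$, provided $g > 0$ is sufficiently small.

To analyze $\tilde A(\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta))$ we observe that $\psi_j^{(1)}(\bar\theta) = P(\bar\theta)\,\psi_j^{(1)}(\bar\theta)$, and hence we have

$$\tilde A\big(\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta)\big) = \int_{\delta/4}^{\delta/2}\frac{-F'(\omega/\delta)\,d\omega}{2i\pi\,\delta}\int_{I_j(\omega)} d\lambda\;e^{-it(\lambda-iS)}\,\big\langle\psi_j^{(1)}(\bar\theta)\,\big|\,F_{P(\theta)}\big(H_g(\theta)-\lambda+iS\big)^{-1}\psi_j^{(1)}(\theta)\big\rangle. \tag{3.72}$$

Now we use that $\varphi_{j,\ell}$ is an eigenvector of $Z_j$ with corresponding eigenvalue $\mu_{j,\ell}$, say. This, the second resolvent equation, $\mu_{j,i} \in R_j$, and Theorem 3.4 give

$$\Big|\Big\langle\psi_j^{(1)}(\bar\theta)\,\Big|\,\Big\{F_{P(\theta)}\big(H_g(\theta)-\lambda+iS\big)^{-1} - \big(E_j-\lambda+iS+g^2\mu_{j,i}+e^{-\theta}H_f\big)^{-1}\Big\}\,\psi_j^{(1)}(\theta)\Big\rangle\Big| \le \frac{C\,g^{2+\varepsilon}}{\big(\lambda-E_j-g^2\Delta E_j^{(2)}\big)^2 + \big(C_S\,g^{2+\varepsilon}\big)^2}. \tag{3.73}$$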


We insert this estimate into the second integral in (3.72) and obtain

$$\tilde A\big(\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta)\big) = C\,B_\eta^2\,e^{-tS} + \int_{\delta/4}^{\delta/2}\frac{-F'(\omega/\delta)\,d\omega}{2i\pi\,\delta}\cdot\int_{I_j(\omega)} d\lambda\;e^{-it(\lambda-iS)}\,\big\langle\psi_j^{(1)}(\bar\theta)\,\big|\,\big(E_j-\lambda+iS+g^2\mu_{j,i}+e^{-\theta}H_f\big)^{-1}\psi_j^{(1)}(\theta)\big\rangle. \tag{3.74}$$

Finally, we observe that the last integral in (3.74) equals

$$\int_0^{g^{2-\varepsilon/2}}\!\!\int_{I_j(\omega)}\frac{e^{-it(\lambda-iS)}\,d\lambda}{E_j-\lambda+iS+g^2\mu_{j,i}+e^{-\theta}r}\;d\mu_{\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta)}(r), \tag{3.75}$$

where $d\mu_{\psi_j^{(1)}(\bar\theta),\psi_j^{(1)}(\theta)}(r)$ denotes the spectral measure of $H_f$ corresponding to $\psi_j^{(1)}(\bar\theta)$ and $\psi_j^{(1)}(\theta)$. Using the fact that $E_j+g^2\mu_{j,i}+e^{-\theta}r \in R_j$ and Cauchy’s integral formula, this integral is easily seen to be bounded by $C\,B_\eta^2\,e^{-tS}$. This finishes the proof of Lemma 3.7. □

3.2. Resolvent norm estimates and the Proof of Theorem 3.4. The first purpose of this subsection is to establish in Lemmata 3.14 and 3.15 the existence of the Feshbach operator $F_{P(\theta)} \equiv F_{P(\theta)}(H_g - z)$ defined in (3.22), for any $z \in A_j(\delta,\varepsilon)$ ($A_j$ is defined in (3.16)). Having established the existence of $F_{P(\theta)}$, we give a proof of Theorem 3.4, i.e., we show, for any $0 < \varepsilon < 1/3$, the existence of a constant $C \in \mathbb{R}^+$ such that the following estimate holds true (see Eq. (3.25)),

$$\big\|F_{P(\theta)} - \big(E_j-z+g^2Z_j(\theta)+e^{-\theta}H_f\big)P(\theta)\big\| \le C\,g^{2+\varepsilon}, \tag{3.76}$$

for any $z \in A_j$.

3.2.1. Estimates on the dilated atomic Hamiltonian. In this subsection, we derive some norm bounds on the resolvent of the complex dilated atomic Hamiltonian $H_{el}(\theta)$ in the vicinity of an excited bound state at energy $E_j$, where $j \ge 1$ is kept fixed henceforth. To begin with, we recall some definitions and notation. The pure point spectrum of $H_{el}$ is given by the set $\{E_0, E_1, \dots, E_j, \dots\}$ contained in $(-\infty,\Sigma)$, and its essential spectrum is contained in $[\Sigma,\infty)$. We assume that $\Sigma \le 0$, we denote $R_i := \mathrm{dist}\big(E_i,\ \sigma(H_{el})\setminus\{E_i\}\big) > 0$, for $i = 0,1,2,\dots$, and we single out $\delta := R_j$. So denoting $H_{el}(\theta) := U_\theta H_{el}U_\theta^{-1}$, we can construct the projection $P_{el,i}(\theta)$ onto the eigenspace of $H_{el}(\theta)$ corresponding to the eigenvalue $E_i$ by using the Dunford integral,

$$P_{el,i}(\theta) = \frac{i}{2\pi}\oint_{|z-E_i|=R_i/2}\frac{dz}{H_{el}(\theta)-z}. \tag{3.77}$$

Next, we define a finite-rank projection $P_{disc}(\theta)$ by

$$P_{disc}(\theta) := \sum_{i:\,E_i\le\Sigma-\mu} P_{el,i}(\theta), \tag{3.78}$$

where $0 < \mu \le \Sigma - E_{j+1}$ is some fixed, strictly positive number. Note that

$$\|P_{disc}(\theta) - P_{disc}(0)\| \le C\,b\,|\theta|, \tag{3.79}$$


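thanks to the relative bound (1.55),

$$\big\|\Delta H_{el}(\theta)\,\big(H_{el}+i\big)^{-1}\big\| \le b\,|\theta|, \tag{3.80}$$

where $\Delta H_{el}(\theta) := H_{el}(\theta) - H_{el}$. Our first result is the following bound.

Lemma 3.8. Let $z \in \mathbb{C}$ with $\mathrm{Re}\{z\} < \Sigma-\mu$. Then, for $|\theta|\,\big(1+(\Sigma-\mu-\mathrm{Re}\{z\})^{-1}\big)$ sufficiently small, $H_{el}(\theta)-z$ is invertible on $\mathrm{Ran}\{\overline{P}_{disc}(\theta)\}$ and

$$\big\|\big(H_{el}(\theta)-z\big)^{-1}\,\overline{P}_{disc}(\theta)\big\| \le \frac{2}{\Sigma-\mu-\mathrm{Re}\{z\}}. \tag{3.81}$$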

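Proof. We first observe that $Q := \overline{P}_{disc}(0)H_{el}(0) - z$ is globally invertible on $\mathcal{H}_{el}$, and since $\Sigma \le 0$ we have

$$\|Q^{-1}\| \le \max\big\{|\Sigma-\mu-z|^{-1},\ |{-z}|^{-1}\big\} \le \big(\Sigma-\mu-\mathrm{Re}\{z\}\big)^{-1}. \tag{3.82}$$

Similarly, we obtain

$$\big\|(H_{el}(0)+i)\,Q^{-1}\big\| = \max\Big\{\sup_{r\ge0}\Big|\frac{r+\Sigma-\mu+i}{r+\Sigma-\mu-z}\Big|,\ \sup_{E_0\le r\le\Sigma-\mu}\Big|\frac{r+i}{z}\Big|\Big\} \le C_1\Big(1+\big(\Sigma-\mu-\mathrm{Re}\{z\}\big)^{-1}\Big), \tag{3.83}$$

for some constant $C_1 \ge 0$. Inserting this and (3.79)–(3.80), we obtain

$$\Big\|\Big(\overline{P}_{disc}(0)H_{el}(0) - \overline{P}_{disc}(\theta)H_{el}(\theta)\Big)Q^{-1}\Big\| \le C_2\,|\theta|\,\Big(1+\big(\Sigma-\mu-\mathrm{Re}\{z\}\big)^{-1}\Big), \tag{3.84}$$

for some constant $C_2 \ge 0$. Thus a Neumann series expansion yields

$$\big\|\big(H_{el}(\theta)-z\big)^{-1}\overline{P}_{disc}(\theta)\big\| = \big\|\big(\overline{P}_{disc}(\theta)H_{el}(\theta)-z\big)^{-1}\overline{P}_{disc}(\theta)\big\| \tag{3.85}$$

$$\le \Big\|Q^{-1}\sum_{n=0}^{\infty}\Big[\Big(\overline{P}_{disc}(0)H_{el}(0) - \overline{P}_{disc}(\theta)H_{el}(\theta)\Big)Q^{-1}\Big]^{n}\,\overline{P}_{disc}(\theta)\Big\|$$

$$\le \frac{1}{\Sigma-\mu-\mathrm{Re}\{z\}}\sum_{n=0}^{\infty}\Big(C_2|\theta| + \frac{C_2|\theta|}{\Sigma-\mu-\mathrm{Re}\{z\}}\Big)^{n} \le \frac{2}{\Sigma-\mu-\mathrm{Re}\{z\}}, \tag{3.86}$$

for $|\theta|\,\big(1+(\Sigma-\mu-\mathrm{Re}\{z\})^{-1}\big) \le (2C_2)^{-1}$. □

Next, we extend Lemma 3.8 to a global bound for the resolvent of $H_{el}$.

Lemma 3.9. Let $\rho > 0$. Then, for $\theta = \pm i\vartheta$ and $\vartheta > 0$ sufficiently small, there exists a constant $C \ge 0$ such that $H_{el}(\theta)-E_j+e^{-\theta}\rho$ is invertible on $\mathcal{H}_{el}$ and

$$\big\|\big(H_{el}(\theta)-E_j+e^{-\theta}\rho\big)^{-1}\big\| \le C\,(\vartheta\rho)^{-1}. \tag{3.87}$$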


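Proof. We observe that $H_{el}(\theta)$ commutes with $P_{disc}(\theta) = \sum_{i:\,E_i\le\Sigma-\mu}P_{el,i}(\theta)$ and that

$$\big(H_{el}(\theta)-E_j+e^{-\theta}\rho\big)^{-1}P_{disc}(\theta) = \sum_{i:\,E_i\le\Sigma-\mu}\big(E_i-E_j+e^{-\theta}\rho\big)^{-1}P_{el,i}(\theta). \tag{3.88}$$

Thus, for some constant $C' \ge 0$,

$$\big\|\big(H_{el}(\theta)-E_j+e^{-\theta}\rho\big)^{-1}P_{disc}(\theta)\big\| \le \frac{C'}{\rho\,\vartheta}\,\max_{i:\,E_i\le\Sigma-\mu}\big\{\|P_{el,i}(\theta)\|\big\}\cdot\#\big\{E_i\in\sigma(H_{el})\ \big|\ E_i\le\Sigma-\mu\big\}. \tag{3.89}$$

Using the integral representation $P_{el,i}(\theta) = (2\pi i)^{-1}\oint_{|z-E_i|=R_i/2}\big(z-H_{el}(\theta)\big)^{-1}dz$ together with the relative bound (3.80), we obtain that $\|P_{el,i}(\theta)\| \le 1+O(|\theta|)$. Conversely, on $\mathrm{Ran}\{\overline{P}_{disc}(\theta)\}$ we apply Lemma 3.8 and obtain

$$\big\|\big(H_{el}(\theta)-E_j+e^{-\theta}\rho\big)^{-1}\overline{P}_{disc}(\theta)\big\| \le 2\,\big(\Sigma-\mu-E_j+\rho\cdot\mathrm{Re}\{e^{-\theta}\}\big)^{-1} \le 2\,\big(\cos(\vartheta)\,\rho\big)^{-1}. \tag{3.90}$$

□

Lemma 3.10. Let $\theta = i\vartheta$, $0 < \vartheta < \theta_0$, $0 \le g\rho_0^{-1/2} \le 1/3$, and $0 < \rho_0 \le (\delta/3)\sin\vartheta$. Then there exists a constant $C \ge 0$ such that, for all $z \in A_j(\delta,\varepsilon)$ and all $r \ge 0$,

$$\Big\|\big(H_{el}(\theta)-E_j+e^{-\theta}(\rho_0+r)\big)\,\frac{\overline{P}_{el,j}(\theta)}{H_{el}(\theta)-z+e^{-\theta}r}\Big\| \le \frac{C}{\vartheta}. \tag{3.91}$$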

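Proof. Using $1 = P_{disc}(\theta) + \overline{P}_{disc}(\theta)$ as in the proof of Lemma 3.9, we obtain that

$$\big\|X\,\overline{P}_{el,j}(\theta)\big\| \le \big\|X\,\overline{P}_{disc}(\theta)\big\| + \sum_{i:\,E_i\le\Sigma-\mu,\ i\ne j}\Big|\frac{E_i-E_j+e^{-\theta}(\rho_0+r)}{E_i-z+e^{-\theta}r}\Big|\,\big\|P_{el,i}(\theta)\big\| \tag{3.92}$$

$$\le \big\|X\,\overline{P}_{disc}(\theta)\big\| + C\,\sup\Big\{\Big|\frac{t+e^{-\theta}(\rho_0+r)}{t-a-ib+e^{-\theta}r}\Big| \;:\; |t|\ge\delta,\ |a|\le\delta/2,\ b\ge-g^{2-\varepsilon}\Big\},$$

where $t, a, b \in \mathbb{R}$ and we denote

$$X := \big(H_{el}(\theta)-E_j+e^{-\theta}(\rho_0+r)\big)\big(H_{el}(\theta)-z+e^{-\theta}r\big)^{-1}. \tag{3.93}$$

We observe that with $\theta = i\vartheta$, $\vartheta > 0$, and $b \ge -g^{2-\varepsilon}$, we have

$$\big|t-a-ib+e^{-\theta}r\big|^2 = (t-a)^2+r^2+b^2+2r\big((t-a)\cos\vartheta+b\sin\vartheta\big) \ge \big((t-a)^2+r^2\big)\Big(1-\cos\vartheta-\frac{g^{2-\varepsilon}\sin\vartheta}{|t-a|}\Big). \tag{3.94}$$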


Choosing g > 0 sufficiently small, we have g ε ≤ 1/3 and g 2−2ε = ρ0 ≤ (δ/3) sin ϑ, which, together with |t − a| ≥ δ/2, implies that g 2−ε |t − a|−1 sin ϑ ≤ ϑ/4 and thus 1/2 q t − a − ib + e−θ r ≥ ϑ t2 + r2 . (3.95) 4 On the other hand, |t + e−θ (ρ0 + r)| ≤ 3(t 2 + r 2 ), since ρ0 ≤ (δ/3) sin ϑ ≤ |t|/3. Inserting this and (3.95) into (3.92), we arrive at

C

X P el,j (θ) ≤ X P disc (θ ) + 48 √ . ϑ

(3.96)

Next, we write

$$X\,\overline{P}_{disc}(\theta) = \overline{P}_{disc}(\theta) + \big(e^{-\theta}\rho_0+z-E_j\big)\big(H_{el}(\theta)-z+e^{-\theta}r\big)^{-1}\overline{P}_{disc}(\theta), \tag{3.97}$$

and we obtain from Lemma 3.8 that

$$\big\|X\,\overline{P}_{disc}(\theta)\big\| \le \big\|\overline{P}_{disc}(\theta)\big\|\,\Big[1+3\rho_0\big(\Sigma-\mu-\mathrm{Re}\{z\}\big)^{-1}\Big] \le C, \tag{3.98}$$

and hence we arrive at the claim. □

3.2.2. Relative Bounds on the Interaction. In this subsection we use the estimates on the dilated electron Hamiltonian derived in the previous subsection to obtain suitable relative bounds on the interaction. To this end, we recall Eqs. (1.37)–(1.45), and we introduce the operator

$$B_\theta(\rho) := H_0(\theta)-E_j+e^{-\theta}\rho = H_{el}(\theta)-E_j+e^{-\theta}(H_f+\rho). \tag{3.99}$$
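To build some intuition for $B_\theta(\rho)$, one can replace the operators by numbers: for $\theta = i\vartheta$, a real level difference $t$ standing in for $H_{el}(\theta)-E_j$ on a discrete eigenvalue, and $r \ge 0$ standing in for a spectral value of $H_f$, the scalar $t+e^{-i\vartheta}(r+\rho)$ has imaginary part $-(r+\rho)\sin\vartheta$, so its modulus is at least $(r+\rho)\sin\vartheta$. The sketch below checks the resulting weighted bound $(r+\omega)/|t+e^{-i\vartheta}(r+\rho)| \le \max\{1,\omega/\rho\}/\sin\vartheta$ on a grid. This scalar caricature is an assumption made for illustration only (it ignores the dilated essential spectrum of $H_{el}(\theta)$), and all names and parameter values are hypothetical.

```python
import cmath
import math

def symbol(t, r, rho, vartheta):
    """Scalar stand-in for B_theta(rho) with theta = i*vartheta: t + exp(-i*vartheta)*(r + rho)."""
    return t + cmath.exp(-1j * vartheta) * (r + rho)

def weighted_inverse(t, r, rho, omega, vartheta):
    """(r + omega) / |symbol|, the scalar analogue of |B_theta(rho)|^{-1} (H_f + omega)."""
    return (r + omega) / abs(symbol(t, r, rho, vartheta))

def claimed_bound(rho, omega, vartheta):
    """max(1, omega/rho) / sin(vartheta), which uses only |Im symbol| = (r + rho)*sin(vartheta)."""
    return max(1.0, omega / rho) / math.sin(vartheta)

if __name__ == "__main__":
    vartheta, rho = 0.2, 0.05  # hypothetical values chosen only for illustration
    for omega in (0.0, 0.01, 0.5):
        worst = max(weighted_inverse(t / 10, r / 50, rho, omega, vartheta)
                    for t in range(-30, 31) for r in range(0, 501))
        print(omega, worst, claimed_bound(rho, omega, vartheta))
```

Since $\sin\vartheta$ is comparable to $\vartheta$ for small angles, the scalar bound has the same form, a constant over $\vartheta$ times $\max\{1,\omega/\rho\}$, as an operator bound on $|B_\theta(\rho)|^{-1}(H_f+\omega)$ would have.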

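We shall generally assume that $0 < \vartheta \le \theta_0$ and that $0 < \rho_0 \le (\delta/3)\sin\vartheta$. As before, we denote $H_g(\theta)_{\overline{P}(\theta)} := \overline{P}(\theta)H_g(\theta)\overline{P}(\theta)$ and $H_0(\theta)_{\overline{P}(\theta)} := H_0(\theta)\overline{P}(\theta)$. Finally, for a closed operator $A$ we denote $|A| := \sqrt{A^*A}$. We start with two preparatory lemmata.

Lemma 3.11. Let $\theta = i\vartheta$, $0 < \vartheta < \theta_0$, $0 \le g\rho_0^{-1/2} \le 1/3$, and $0 < \rho_0 \le (\delta/3)\sin\vartheta$. Then there exists a constant $C \ge 0$ such that, for all $z \in A_j(\delta,\varepsilon)$,

$$\Big\|B_\theta(\rho_0)\,\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big\| \le \frac{C}{\vartheta}. \tag{3.100}$$

Proof. Writing $\overline{P}(\theta) = \overline{P}_{el,j}(\theta) + P_{el,j}(\theta)\,\chi_{H_f\ge\rho_0}$ and applying Lemma 3.10, we derive that

$$\big\|B_\theta(\rho_0)\,\overline{P}(\theta)\,\big(H_0(\theta)-z\big)^{-1}\big\| \le \frac{C}{\vartheta} + \big\|B_\theta(\rho_0)\,P_{el,j}(\theta)\,\chi_{H_f\ge\rho_0}\big(H_0(\theta)-z\big)^{-1}\big\| \le \frac{C}{\vartheta} + C'\sup_{r\ge\rho_0}\Big|\frac{r+\rho_0}{E_j-z+e^{-\theta}r}\Big| \le \frac{C''}{\vartheta}, \tag{3.101}$$

for some $C, C', C'' \in \mathbb{R}^+$, since $\mathrm{Im}\{z-E_j\} \ge -g^{2-\varepsilon} = -\rho_0\,g^{\varepsilon} \ge -(r/2)\sin\vartheta$, by assumption. □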


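Lemma 3.12. For $\theta = \pm i\vartheta$ and $\vartheta \in (0,\theta_0)$ sufficiently small there exists a constant $C \ge 0$ such that

$$\big\||B_\theta(\rho)|^{-1}(H_f+\omega)\big\| \le \frac{C}{\vartheta}\Big(1+\frac{\omega}{\rho}\Big), \tag{3.102}$$

$$\big\||B_\theta(\rho)|^{-1}(H_{el}(0)\pm i)\big\| \le \frac{C}{\vartheta}\Big(1+\frac{1}{\rho}\Big), \tag{3.103}$$

for all $\rho > 0$ and $\omega \ge 0$.

Proof. By the functional calculus and Lemma 3.9 we have

$$\big\||B_\theta(\rho)|^{-1}(H_f+\omega)\big\| = \sup_{r\ge0}\Big\{\big\|\big(H_{el}(\theta)-E_j+e^{-\theta}(r+\rho)\big)^{-1}\big\|\,(r+\omega)\Big\} \le C\sup_{r>0}\Big\{\frac{r+\omega}{\vartheta\,(r+\rho)}\Big\} \le \frac{C}{\vartheta}\max\Big\{1,\ \frac{\omega}{\rho}\Big\}, \tag{3.104}$$

which implies (3.102). To establish (3.103), we start with a similar observation, namely,

$$\big\||B_\theta(\rho)|^{-1}(H_{el}(0)\pm i)\big\| = \sup_{r\ge0}\|Y_\pm\|_{\mathcal{H}_{el}}, \tag{3.105}$$

where

$$Y_\pm := \big(H_{el}(\theta)-E_j+e^{-\theta}(r+\rho)\big)^{-1}(H_{el}(0)\pm i). \tag{3.106}$$

We observe the following identity,

$$Y_\pm = 1 - Y_\pm\,(H_{el}(0)\pm i)^{-1}\,\Delta H_{el}(\theta) + \big(\pm i+E_j-e^{-\theta}(r+\rho)\big)\big(H_{el}(\theta)-E_j+e^{-\theta}(r+\rho)\big)^{-1}. \tag{3.107}$$

Solving for $Y_\pm$ and applying Lemma 3.9, we obtain

$$\sup_{r\ge0}\|Y_\pm\| \le \frac{1}{1-b|\theta|}\Big(1+C\,\big|i+E_j-e^{-\theta}\rho\big|\Big)\Big(1+\frac{1}{\vartheta\rho}\Big) \le C\Big(1+\frac{1}{\vartheta\rho}\Big). \tag{3.108}$$

□

Now, we come to the main relative bound used in this section.

Lemma 3.13. For $\theta, \theta_1, \theta_2 \in \{\pm i\vartheta\}$, $0 < \beta < 1$, and $\vartheta \in (0,\theta_0)$ sufficiently small, there exists a constant $C_\beta \ge 0$ such that

$$\big\||B_{\theta_1}(\rho)|^{-1/2}\,W_{m,n}(\theta)\,|B_{\theta_2}(\rho)|^{-1/2}\big\| \le \frac{C_\beta}{\vartheta}\big(1+\rho^{-1/2}\big), \tag{3.109}$$

$$\big\||B_{\theta_1}(\rho)|^{-1/2}\,W_{m,n}(\theta)\,P(\theta)\big\| \le \frac{C_\beta}{\vartheta}\big(1+\rho^{-1/2}\big)(\rho+\rho_0)^{1/2}\,\rho_0^{n\beta/2}, \tag{3.110}$$

$$\big\|P(\theta)\,W_{m,n}(\theta)\,|B_{\theta_2}(\rho)|^{-1/2}\big\| \le \frac{C_\beta}{\vartheta}\big(1+\rho^{-1/2}\big)(\rho+\rho_0)^{1/2}\,\rho_0^{m\beta/2}, \tag{3.111}$$

$$\big\|P(\theta)\,W_{m,n}(\theta)\,P(\theta)\big\| \le \frac{C_\beta}{\vartheta}\big(1+\rho^{-1/2}\big)(\rho+\rho_0)\,\rho_0^{(m+n)\beta/2}, \tag{3.112}$$

for all $\rho > 0$, where $P(\theta)$ is defined in (3.18).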


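Proof. We first observe that $a_\lambda(k)\,B_\theta(\rho) = B_\theta(\rho+\omega(k))\,a_\lambda(k)$ and thus

$$a_\lambda(k)\,|B_\theta(\rho)|^2 = a_\lambda(k)\,B_{\bar\theta}(\rho)\,B_\theta(\rho) = B_{\bar\theta}(\rho+\omega(k))\,B_\theta(\rho+\omega(k))\,a_\lambda(k) = |B_\theta(\rho+\omega(k))|^2\,a_\lambda(k). \tag{3.113}$$

Thus, functional calculus implies the Pull-Through Formulae

$$a_\lambda(k)\,|B_\theta(\rho)|^{-1/2} = |B_\theta(\rho+\omega(k))|^{-1/2}\,a_\lambda(k), \tag{3.114}$$

$$|B_\theta(\rho)|^{-1/2}\,a_\lambda^*(k) = a_\lambda^*(k)\,|B_\theta(\rho+\omega(k))|^{-1/2}. \tag{3.115}$$

Using (3.114)–(3.115), we observe that, for any $\psi \in \mathcal{H}$,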

|Bθ (ρ)|−1/2 W0,1 (θ) |Bθ (ρ)|−1/2 ψ 1 2

X

Z

2

3 −1/2 −1/2

d k |Bθ1 (ρ)| = w0,1 (k, λ; θ) |Bθ2 (ρ + ω)| aλ (k)ψ

(3.114) (3.115)

(3.116)

λ=1

=

Z 3 1/2 1/2

2 d k

|Bθ1 (ρ)|−1/2 w0,1 (k, λ; θ) |Bθ2 (ρ + ω)|−1/2 Hf + ω ω λ=1,2 X Z 2 1/2

2

−1/2

× aλ (k)ψ ω d 3 k ,

Hf + ω sup

λ=1

where here and henceforth we denote ω := ω(k) and ω0 := ω(k 0 ). Note that there is an additional constraint ω(k) ≤ ρ0 in the integrals on the right side of (3.116) if we require that ψ ∈ RanχHf <ρ0 . The last factor in (3.116) equals 1/2 X 2 Z D E ⊥ −1/2 ∗ −1/2 ⊥ 3 aλ (k)aλ (k) Hf P ψ ω d k ≤ kψk. ψ P Hf

(3.117)

λ=1

Since furthermore Hf + ω commutes with Bθ1 (ρ) and w0,1 (k, λ; θ ), we may use (1.42) and Lemma 3.12 to estimate

|Bθ (ρ)|−1/2 w0,1 (k, λ; θ) |Bθ (ρ + ω)|−1/2 (Hf + ω)1/2 (3.118) 1 2

−1/2 1/2 −1/2

(Hf + ω) · w0,1 (k, λ; θ ) |Hel (0) + i| ≤ |Bθ1 (ρ)|

1/2 −1/2

· |Hel (0) + i| |Bθ2 (ρ + ω)| 1/2 ω 1/2 1 C 1+ J (k) 1+ ≤ ϑ ρ ρ+ω 1 1/2 C 1+ (1 + ω)1/2 J (k), ≤ ϑ ρ for some constant C ≥ 0. Inserting (3.118) and (3.117) into (3.116), we obtain that

|Bθ (ρ)|−1/2 W0,1 (θ) |Bθ (ρ)|−1/2 2 (3.119) 1 2 Z 2 2 C (30 + 3−1 ) C 1 + ω(k)−1 J (k)2 d 3 k ≤ 1 + ρ −1 , ≤ 2 1 + ρ −1 2 ϑ ϑ


for some constant C ≥ 0. Similarly, by additionally requiring that ψ ∈ RanχHf <ρ0 , we obtain that

|Bθ (ρ)|−1/2 W0,1 (θ) |Bθ (ρ)|−1/2 χH <ρ 2 1 2 0 f Z C ≤ 2 1 + ρ −1 1 + ω(k)−1 J (k)2 d 3 k ϑ ω(k)≤ρ0 Z β C ρ0 −1 1 + ρ ω(k)−β + ω(k)−1−β J (k)2 d 3 k ≤ 2 ϑ β

≤

C (32−β + 32−1−β ) ρ0 ϑ2

1 + ρ −1 .

(3.120)

The estimate for W1,0 (θ) is similar. Next, we derive (3.109) in the case of W0,2 (θ ). Picking ψ ∈ H, we observe that

|Bθ (ρ)|−1/2 W0,2 (θ) |Bθ (ρ)|−1/2 ψ (3.121) 1 2 Z

≤ 2 sup

|Bθ1 (ρ)|−1/2 w0,2 (k, k 0 , λ, λ0 ; θ ) λ,λ0 =1,2

3 3 0 1/2

2 d k d k |Bθ2 (ρ + ω + ω0 )|−1/2 Hf + ω + ω0 ω ω0 X 1/2 2 Z

2

0 −1 0 0 0 3 3 0 × + ω + ω a (k )a (k)ψ ω ω d k d k . H

f λ λ λ,λ0 =1

Again, we have the additional constraint ω(k) ≤ ρ0 in the integrals on the right side of (3.121) if we require that ψ ∈ RanχHf <ρ0 . The last factor is bounded by kψk. Thus Eq. (1.43) and Lemma 3.12 imply that, for some constants C ≥ 0,

|Bθ (ρ)|−1/2 W0,2 (θ) |Bθ (ρ)|−1/2 2 (3.122) 1 2 Z

1/2 2

≤

|Bθ1 (ρ)|−1/2 Hf + ω + ω0

2 0 2 3 3 0 1/2

2 J (k) J (k ) d k d k · |Bθ2 (ρ + ω + ω0 )|−1/2 Hf + ω + ω0

ω ω0 Z 0 0 2 0 2 3 ω+ω C J (k) J (k ) d k d 3 k 0 ω+ω 1+ ≤ 2 1+ ϑ ρ ρ + ω + ω0 ω ω0 2 2 2 C (30 + 3−1 ) ≤ 1 + ρ −1 , 2 ϑ and

|Bθ (ρ)|−1/2 W0,2 (θ) |Bθ (ρ)|−1/2 χH <ρ 2 1 2 0 f ≤

C (32−β

2β + 32−1−β ) ρ0 ϑ2

(3.123) 1 + ρ −1 .

Estimates similar to (3.121)–(3.122) establish (3.109) in the remaining cases, i.e., for $W_{1,1}(\theta)$ and $W_{2,0}(\theta)$. Finally, $P(\theta) = \chi_{H_f<\rho_0}\,P(\theta)$,

$$\big\|B_\theta(\rho)\,P(\theta)\big\| = \big\|(H_f+\rho)\,P(\theta)\big\| = \rho_0+\rho, \tag{3.124}$$

(3.120) and (3.123) yield (3.110)–(3.112). □


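3.2.3. Domain of the Feshbach map. In the following subsection we apply the relative bounds from Lemma 3.13 to prove that, for $z$ sufficiently close to $E_j$, the Feshbach map with projection $P(\theta)$ is applicable to $H_g(\theta)-z$.

Lemma 3.14. Let $\rho_0 < (\delta/3)\sin\vartheta$, and assume that $\vartheta \in (0,\theta_0)$. Then, for $0 < g\,\rho_0^{-1/2}\,\vartheta^{-2}$ sufficiently small and for all $z \in A_j(\delta,\varepsilon)$, the operator $H_g(\theta)_{\overline{P}(\theta)}-z$ is invertible on $\mathrm{Ran}\{\overline{P}(\theta)\}$, and

$$\Big\|\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\,\overline{P}(\theta)\Big\| \le \frac{C}{\vartheta\,\rho_0}, \tag{3.125}$$

for some constant $C \ge 0$.

Proof. We construct $\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\overline{P}(\theta)$ by a norm-convergent Neumann series,

$$\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\,\overline{P}(\theta) = \sum_{n=0}^{\infty}\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big[-W_g(\theta)\,\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big]^{n}. \tag{3.126}$$

We estimate the norm of the term in $n$th order by means of Lemmata 3.11 and 3.13,

$$\Big\|\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big[-W_g(\theta)\,\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big]^{n}\Big\| \le \Big\|B_\theta(\rho_0)\,\frac{\overline{P}(\theta)}{H_0(\theta)-z}\Big\|^{n+1}\,\big\||B_\theta(\rho_0)|^{-1}\big\|\cdot\big\||B_\theta(\rho_0)|^{-1/2}\,W_g(\theta)\,|B_\theta(\rho_0)|^{-1/2}\big\|^{n} \le \frac{C}{\vartheta\,\rho_0}\Big(C\,g\,\rho_0^{-1/2}\,\vartheta^{-2}\Big)^{n}. \tag{3.127}$$

This proves the convergence of the Neumann series (3.126) in norm. □

Lemma 3.14 is the main ingredient used to prove the existence of the Feshbach operator defined in (3.22)–(3.24).

Lemma 3.15. Let $\rho_0 < (\delta/3)\sin\vartheta$, and assume that $\vartheta \in (0,\theta_0)$ and $g > 0$ are sufficiently small. Then, for all $z \in A_j(\delta,\varepsilon)$, the Feshbach operator defined in (3.22) exists and obeys Eq. (3.23). Moreover, for some constant $C \ge 0$, we have

$$\Big\|\overline{P}(\theta)\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\overline{P}(\theta)\,W_g(\theta)\,P(\theta)\Big\|,\ \ \Big\|P(\theta)\,W_g(\theta)\,\overline{P}(\theta)\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\overline{P}(\theta)\Big\| \le \frac{C\,g}{\vartheta\,\rho_0^{1/2}}, \tag{3.128}$$

$$\big\|P(\theta)\,W_g(\theta)\,P(\theta)\big\| \le \frac{C\,g}{\vartheta}\,\rho_0^{1/2}. \tag{3.129}$$

Proof. The proof of (3.128) is similar to the one for Lemma 3.14. Then Lemma 3.14 and (3.126) imply the existence of the Feshbach operator defined in (3.22) and that it obeys Eq. (3.23) (see, e.g., [6,7]). □

We finally have a lemma which yields Theorem 3.4 upon choosing $\beta := 2\varepsilon(1-\varepsilon)^{-1} \in (0,1)$.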
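The choice $\beta = 2\varepsilon(1-\varepsilon)^{-1}$ can be checked with a little exact arithmetic: the four powers of $g$ on the right side of (3.130), namely $2+\varepsilon$, $2+2\beta(1-\varepsilon)$, $1+(1+\beta)(1-\varepsilon)$ and $4-2\varepsilon$, are then all at least $2+\varepsilon$, and $\beta < 1$ is equivalent to $\varepsilon < 1/3$. The following sketch (function name hypothetical) verifies this with rational numbers for a few sample values of $\varepsilon$.

```python
from fractions import Fraction

def error_exponents(eps):
    """Return beta = 2*eps/(1-eps) and the four exponents of g on the right side of (3.130)."""
    beta = 2 * eps / (1 - eps)
    exponents = [
        2 + eps,                      # bound on Rem_0 and Rem_1
        2 + 2 * beta * (1 - eps),     # bound on Rem_2 and Rem_5
        1 + (1 + beta) * (1 - eps),   # bound on Rem_3
        4 - 2 * eps,                  # bound on Rem_4
    ]
    return beta, exponents

if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 4), Fraction(3, 10)):
        beta, exponents = error_exponents(eps)
        # the smallest exponent decides the overall order in g
        print(eps, beta, exponents, min(exponents) == 2 + eps)
```

With this $\beta$ the third exponent simplifies to $1+(1-\varepsilon)+2\varepsilon = 2+\varepsilon$ and the second to $2+4\varepsilon$, so for small $g$ the total error is of order $g^{2+\varepsilon}$, which is the order claimed in (3.76).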


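Lemma 3.16. Let $0 < \varepsilon < 1/3$, $\rho_0 := g^{2-2\varepsilon}$, $0 < \beta < 1$, and assume that $\vartheta \in (0,\theta_0)$ and $g > 0$ are sufficiently small and such that $\rho_0 < (\delta/3)\sin\vartheta$. Then, for all $z \in A_j(\delta,\varepsilon)$,

$$\Big\|F_{P(\theta)} - \big(E_j-z+g^2Z_j^{d}(\theta)+g^2Z_j^{od}(\theta)+e^{-\theta}H_f\big)P(\theta)\Big\| \le C\Big(g^{2+\varepsilon}+g^{2+2\beta(1-\varepsilon)}+g^{1+(1+\beta)(1-\varepsilon)}+g^{4-2\varepsilon}\Big), \tag{3.130}$$

for some constant $C \ge 0$.

Proof. Recall from (3.22) and (3.8)–(3.9) that

$$F_{P(\theta)} := F_{P(\theta)}\big(H_g(\theta)-z\big) := \big(H_g(\theta)_{P(\theta)}-z\big)P(\theta) - P(\theta)\,W_g\,\overline{P}(\theta)\big(H_g(\theta)_{\overline{P}(\theta)}-z\big)^{-1}\overline{P}(\theta)\,W_g\,P(\theta) \tag{3.131}$$

and

$$Z_j^{od}(\theta) := \sum_{\lambda=1,2}\int U_\theta\,P_{el,j}\,w_{0,1}(k,\lambda)\,P_{el,j}^{\perp}\big(H_{el}-E_j+\omega(k)-i0\big)^{-1}P_{el,j}^{\perp}\,w_{1,0}(k,\lambda)\,P_{el,j}\,U_\theta^{-1}\,d^3k, \tag{3.132}$$

$$Z_j^{d}(\theta) := \sum_{\lambda=1,2}\int U_\theta\,P_{el,j}\,w_{0,1}(k,\lambda)\,P_{el,j}\,w_{1,0}(k,\lambda)\,P_{el,j}\,U_\theta^{-1}\,\frac{d^3k}{\omega(k)}, \tag{3.133}$$

where $P_{el,j} = P_{el,j}(\theta=0) = \sum_{\ell=1}^{n_j}|\varphi_{j,\ell}\rangle\langle\varphi_{j,\ell}|$ is the orthogonal projection onto the eigenspace of $H_{el}$ corresponding to the eigenvalue $E_j$. As in [6], we write the difference to be estimated as a sum of six error terms,

$$F_{P(\theta)} - \big(E_j-z+g^2Z_j^{d}(\theta)+g^2Z_j^{od}(\theta)+e^{-\theta}H_f\big)P(\theta) = \sum_{\mu=0}^{5}\mathrm{Rem}_\mu, \tag{3.134}$$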

where (compare to [6, (IV.58), (IV.60), (IV.68), (IV.77), (IV.101), (IV.86), and (IV.87)]) −1 Rem0 := P (θ)Wg (θ)P (θ) P (θ)Hg (θ)P (θ ) − z − P (θ)H0 (θ )P (θ ) − z

Rem1 := P (θ)Wg (θ)P (θ) P (θ)Hg P (θ) − z − g 2 P (θ) W0,1 (θ) + W1,0 (θ)

−1

P (θ ) H0 − z

!

(3.135) −1

P (θ )Wg (θ )P (θ ),

P (θ )Wg (θ )P (θ )

(3.136)

W0,1 (θ ) + W1,0 (θ ) P (θ ),


Rem2 := g P (θ) W0,1 (θ) + W1,0 (θ) 2

− g2

2 Z X λ=1

P (θ ) H0 − z

!

W0,1 (θ ) + W1,0 (θ ) P (θ )

# P (θ ; ω(k)) w1,0 (k, λ)P (θ ) d 3 k, P (θ)w0,1 (k, λ) H0 + e−iϑ ω(k) − z "

(3.137) where P (θ, ω) := Pel,j (θ) χHf +ω<ρ0 , Rem3 := P (θ) Wg (θ ) P (θ ),

Rem4 := g 2 "

2 Z X

(3.138)

dk Pel,j w0,1 (k, λ)

(3.139)

λ=1

# e−iϑ Hf + Ej − z P el,j χHf <ρ0 w1,0 (k, λ) Pel,j χHf <ρ0 , Bθ (ω(k)) + Ej − z Hel − Ej + e−iϑ ω(k)

Rem5 := g 2

nZ X

Pel,j w0,1 (k, λ) Pel,j w1,0 (k, λ)Pel,j

(3.140)

λ=1,2

i−1 o χHf +ω(k)≥ρ0 dk − Zjd (θ ) χHf <ρ0 . e−iϑ Hf + ω(k) + Ej − z

h

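We first rewrite $\mathrm{Rem}_0$ using the second resolvent equation which yields

$$\mathrm{Rem}_0 = P(\theta)\,W_g(\theta)\,\overline{P}(\theta)\big(\overline{P}(\theta)H_g(\theta)\overline{P}(\theta)-z\big)^{-1}\overline{P}(\theta)\,W_g(\theta)\,\overline{P}(\theta)\big(\overline{P}(\theta)H_0(\theta)\overline{P}(\theta)-z\big)^{-1}\overline{P}(\theta)\,W_g(\theta)\,P(\theta). \tag{3.141}$$

Then an application of Lemma 3.13 gives

$$\|\mathrm{Rem}_0\| \le \frac{C\,g^3}{\vartheta^2\,\rho_0^{1/2}} = O\big(g^{2+\varepsilon}\big). \tag{3.142}$$

Second, a similar estimate yields

$$\|\mathrm{Rem}_1\| \le \frac{C\,g^3}{\vartheta^2\,\rho_0^{1/2}} = O\big(g^{2+\varepsilon}\big). \tag{3.143}$$

The derivation of these two estimates, (3.142) and (3.143), is similar to [6, (IV.58)–(IV.62)].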


Third, we observe that

Rem2 = "

2 Z X

dk dk 0 P (θ) w0,1 (k, λ; θ ) a ∗ (k 0 )

λ,λ0 =1

P (θ, ω(k) + ω(k 0 )) H0 + e−iϑ (ω(k) + ω(k 0 )) − z "

(3.144)

# w1,0 (k 0 , λ0 ; θ ) a(k) P (θ )

# P (ω(k 0 )) w1,0 (k 0 , λ0 ; θ ) P (θ ) + P (θ ) w1,0 (k, λ; θ) a (k)a (k ) H0 + e−iϑ ω(k 0 ) − z # " P (θ, ω(k)) w0,1 (k 0 , λ0 ; θ ) a(k)a(k 0 ) P (θ ) + P (θ ) w0,1 (k, λ; θ) H0 + e−iϑ ω(k) − z " # P (θ) ∗ 0 0 0 w0,1 (k , λ ; θ ) a(k ) P (θ ) + P (θ ) w1,0 (k, λ; θ) a (k) H0 − z ∗

∗

0

(compare to [6, (IV.66)]), which is of the form $P(\theta)\big(\widetilde W_{0,2}(\theta)+\widetilde W_{1,1}(\theta)+\widetilde W_{2,0}(\theta)\big)P(\theta)$. A somewhat lengthy estimate analogous to [6, Lemma IV.9] yields, after using (3.110)–(3.111),

$$\|\mathrm{Rem}_2\| = O\big(g^{2+2\beta(1-\varepsilon)}\big). \tag{3.145}$$

Fourth, we apply (3.112) and directly obtain

$$\|\mathrm{Rem}_3\| = O\big(g\,\rho_0^{(1+\beta)/2}\big) = O\big(g^{1+(1+\beta)(1-\varepsilon)}\big). \tag{3.146}$$

In order to estimate $\mathrm{Rem}_4$, we observe that when restricted to $\mathrm{Ran}\,\overline{P}_{el,j}$, the resolvents of $B_\theta(\omega(k))+E_j-z$ and $H_{el}-E_j+e^{-i\vartheta}\omega(k)$ are bounded by a constant $C \ge 0$. Since, furthermore, $z \in D(E_j,\rho_0/2)$ and $\|H_f\,\chi_{H_f<\rho_0}\| = \rho_0$, the fraction in the integrand on the right side of (3.139) is bounded in norm by $2C\rho_0$, and we thus obtain

$$\|\mathrm{Rem}_4\| \le O\big(g^2\rho_0\big) \le O\big(g^{4-2\varepsilon}\big). \tag{3.147}$$

Finally, a similar argument, which is along the lines of [6, Lemma IV.12], yields

$$\|\mathrm{Rem}_5\| \le O\big(g^2\rho_0^{\beta}\big) \le O\big(g^{2+2\beta(1-\varepsilon)}\big). \tag{3.148}$$

Adding up all error terms, taking into account that $0 < \beta < 1$, we arrive at (3.130). □

Acknowledgements. We thank T. Chen and A. Soffer for numerous very helpful discussions on the material presented in this paper and M. Mück and H. Zenk for careful proofreading. We are also grateful to D. Buchholz, F. Hiroshima, F. Klopp, Y. Last, and H. Spohn for valuable comments.

References

1. Hiroshima, F., Arai, A., Hirokawa, M.: On the absence of eigenvectors of hamiltonians in a class of massless quantum field models without infrared cutoff. Preprint, 1999
2. Aguilar, J. and Combes, J.M.: A class of analytic perturbations for one-body Schrödinger Hamiltonians. Commun. Math. Phys. 22, 269–279 (1971)
3. Albeverio, S.: Scattering theory in a model of quantum fields. I. J. Math. Phys. 14 (2), 1800–1816 (1972)
4. Albeverio, S.: Scattering theory in a model of quantum fields. II. Helv. Phys. Acta 45, 303–321 (1972)
5. Bach, V., Fröhlich, J. and Sigal, I.M.: Mathematical theory of non-relativistic matter and radiation. Lett. Math. Phys. 34, 183–201 (1995)
6. Bach, V., Fröhlich, J. and Sigal, I.M.: Quantum electrodynamics of confined non-relativistic particles. Adv. in Math. 137, 299–395 (1998)
7. Bach, V., Fröhlich, J. and Sigal, I.M.: Renormalization group analysis of spectral problems in quantum field theory. Adv. in Math. 137, 205–298 (1998)
8. Bach, V., Fröhlich, J., Sigal, I.M. and Soffer, A.: Positive commutators and spectrum of nonrelativistic QED. Preprint, 1997
9. Balslev, E. and Combes, J.M.: Spectral properties of Schrödinger operators with dilatation analytic potentials. Commun. Math. Phys. 22, 280–294 (1971)
10. Bethe, H. and Salpeter, E.: Quantum mechanics of one- and two-electron atoms. In: S. Flügge, editor, Handbuch der Physik, XXXV, Berlin: Springer, 1957, pp. 88–436
11. Bethe, H.A.: The electromagnetic shift of energy levels. Phys. Rev. 72, 339 (1947)
12. Cohen-Tannoudji, C., Dupont-Roc, J. and Grynberg, G.: Photons and Atoms – Introduction to Quantum Electrodynamics. New York: John Wiley, 1991
13. Cohen-Tannoudji, C., Dupont-Roc, J. and Grynberg, G.: Atom-Photon Interaction. New York: John Wiley, 1992
14. Cycon, H., Froese, R., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer, 1. edition, 1987
15. Derezinski, J. and Gérard, C.: Asymptotic completeness in quantum field theory. Massive Pauli-Fierz Hamiltonians. Preprint, June 1997
16. Fröhlich, J. and Pfeifer, P.: Generalized time-energy uncertainty relations and bounds on lifetimes of resonances. Rev. Mod. Phys. 67, 795 (1995)
17. Gérard, Ch.: Asymptotic completeness for the spin-boson model with a particle number cutoff. Rev. Math. Phys. 8, 549–589 (1996)
18. Hiroshima, F.: Functional integral representation of a model in QED. Rev. Math. Phys. 9 (4), 489–530 (1997)
19. Hiroshima, F.: Uniqueness of the ground state of a model in quantum electrodynamics: A functional integral approach. Hokkaido U. Prepr. Series in Math. 429 (1998)
20. Hoegh-Krohn, R.: Asymptotic fields in some models of quantum field theory. III. J. Math. Phys. 11 (1), 185–189 (1969)
21. Hoegh-Krohn, R.: Boson fields under a general class of cut-off interactions with bounded interaction densities. Commun. Math. Phys. 12, 216–225 (1969)
22. Hoegh-Krohn, R.: Boson fields with bounded interaction densities. Commun. Math. Phys. 17, 179–193 (1970)
23. Hunziker, W.: Distortion analyticity and molecular resonance curves. Ann. Inst. H. Poincaré 45, 339–358 (1986)
24. Hunziker, W.: Resonances, metastable states and exponential decay laws in perturbation theory. Commun. Math. Phys. 132, 177–188 (1990)
25. Kato, T.: Perturbation Theory of Linear Operators. Volume 132 of Grundlehren der mathematischen Wissenschaften. Berlin–Heidelberg–New York: Springer-Verlag, 2nd edition, 1976
26. Lieb, E.H.: Bound on the maximum negative ionization of atoms and molecules. Phys. Rev. A 29, 3018–3028 (1984)
27. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics: Analysis of Operators. Volume 4. San Diego: Academic Press, 1st edition, 1978
28. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics: Fourier Analysis and Self-Adjointness. Volume 2. San Diego: Academic Press, 2nd edition, 1980
29. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics: Functional Analysis. Volume 1. San Diego: Academic Press, 2nd edition, 1980
30. Ruskai, Mary Beth: Absence of discrete spectrum in highly negative ions II. Extension to Fermions. Commun. Math. Phys. 85, 325–327 (1982)
31. Sigal, I.M.: Geometric methods in the quantum many-body problem. Nonexistence of very negative ions. Commun. Math. Phys. 85, 309–324 (1982)
32. Simon, B.: The definition of molecular resonance curves by the method of exterior complex scaling. Phys. Lett. A 71, 211–214 (1979)
33. Simon, B.: Functional Integration and Quantum Physics. Pure and applied mathematics. New York: Academic Press, 1979
34. Spohn, H.: Ground state(s) of the spin-boson Hamiltonian. Commun. Math. Phys. 123, 277–304 (1989)
35. Spohn, H.: Asymptotic completeness for Rayleigh scattering. J. Math. Phys. 38, 2281–2296 (1997)
36. Zhislin, G.M.: Discussion of the spectrum of the Schrödinger operator for systems of many particles. Tr. Mosk. Mat. O.-va 9, 81–120 (1960)

Communicated by B. Simon

Commun. Math. Phys. 207, 291 – 306 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Extensions of the Neveu–Schwarz Lie Superalgebra

Patrick Marcel

C.N.R.S., C.P.T., Luminy-Case 907, 13288 Marseille Cedex 9, France

Received: 26 May 1998 / Accepted: 8 April 1999

Abstract: We consider the generalizations of the Virasoro Lie algebra constructed recently in [16] and we classify their superanalogues. The main result of this article is a series of Lie Superalgebras generalizing the well-known Neveu–Schwarz Superalgebra.

1. Introduction

The Virasoro algebra plays an important role in mathematical physics and in differential geometry. Interest in this Lie algebra and its superanalogues has increased since the works of B. L. Feigin and D. A. Leites [5] and of V. G. Kac and J. W. Van De Leur [10]. The classification of Lie superalgebras extending the Virasoro algebra and arising in string theory is now complete and is due to P. Grozman, D. Leites and I. Shchepochkina [9]; see also V. G. Kac [11]. In this paper we study another interesting class of Lie superalgebras generalizing the Virasoro algebra. We consider extensions of the Virasoro algebra by the modules of tensor densities on $S^1$ studied recently by V. Ovsienko and C. Roger in [16]. We show that most of these Lie algebras have interesting superanalogues and classify the Lie superalgebras of this type. A result close to ours is obtained by Cheng, Kac and Wakimoto [3]; see also Cheng and Kac [4].

1.1. The Virasoro algebra. Let $\mathrm{Vect}(S^1)$ be the Lie algebra of smooth vector fields on $S^1$: $f = f(x)\frac{d}{dx}$, where $f(x + 2\pi) = f(x)$, with the commutator:

\[ \left[ f(x)\frac{d}{dx},\; g(x)\frac{d}{dx} \right] = \bigl( f(x)g'(x) - f'(x)g(x) \bigr)\frac{d}{dx}. \]

The Virasoro algebra is the unique (up to isomorphism) non-trivial central extension of $\mathrm{Vect}(S^1)$. It is given by the Gelfand–Fuchs cocycle:

\[ c\left( f(x)\frac{d}{dx},\; g(x)\frac{d}{dx} \right) = \int_{S^1} f'(x)\,g''(x)\,dx. \tag{1} \]
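The cocycle property of (1) can be illustrated numerically. The following is a minimal sketch, not part of the original article: it assumes the Fourier basis $e_n = e^{inx}\frac{d}{dx}$ of the complexified algebra, drops the overall factor $2\pi$ coming from the integral, and uses hypothetical function names. In this basis $[e_n, e_m] = i(m-n)\,e_{n+m}$ and $c(e_n, e_m) = -2\pi i\, n m^2\, \delta_{n+m,0}$.

```python
# Fourier-mode check of the 2-cocycle condition for c(f, g) = int f' g'' dx.
# Assumption of this sketch: vector fields are spanned by e_n = exp(i n x) d/dx.

def bracket(n, m):
    """[e_n, e_m] = i (m - n) e_{n+m}; returns (coefficient, mode index)."""
    return 1j * (m - n), n + m

def cocycle(n, m):
    """c(e_n, e_m) divided by 2*pi: integral of (e^{inx})' (e^{imx})'' over S^1."""
    return -1j * n * m**2 if n + m == 0 else 0

def cocycle_defect(a, b, c):
    """c([e_a,e_b],e_c) + c([e_b,e_c],e_a) + c([e_c,e_a],e_b), divided by 2*pi."""
    total = 0
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        coeff, k = bracket(x, y)
        total += coeff * cocycle(k, z)
    return total

def antisymmetry_defect(n, m):
    """c(e_n, e_m) + c(e_m, e_n); vanishes for an antisymmetric cocycle."""
    return cocycle(n, m) + cocycle(m, n)
```

Both defects vanish for integer modes, which is the statement that (1) is an antisymmetric 2-cocycle on these basis elements; non-triviality of the class is a separate matter that this sketch does not address.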

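1.2. Neveu–Schwarz Superalgebra. There exists a Lie superalgebra, the Neveu–Schwarz superalgebra, which contains the Virasoro algebra as its even part. Precisely, the Neveu–Schwarz superalgebra is an algebra on the space $\mathrm{Vect}(S^1) \oplus \mathbb{R} \oplus C^\infty(S^1)$ with the bilinear operation:

\[ \left[ \begin{pmatrix} f\frac{d}{dx} \\ \lambda \\ \phi(x) \end{pmatrix}, \begin{pmatrix} g\frac{d}{dx} \\ \mu \\ \psi(x) \end{pmatrix} \right] = \begin{pmatrix} (fg' - f'g)\frac{d}{dx} + \psi\phi\frac{d}{dx} \\ \int_{S^1} f'(x)g''(x)\,dx + 2\int_{S^1} \phi'(x)\psi'(x)\,dx \\ (f\psi' - \frac{1}{2}f'\psi) - (g\phi' - \frac{1}{2}g'\phi) \end{pmatrix}. \tag{2} \]

The bilinear mapping $[\ ,\ ]_+^{NS} : C^\infty(S^1) \otimes C^\infty(S^1) \to \mathrm{Vect}(S^1) \oplus \mathbb{R}$ given by:

\[ [\phi, \psi]_+^{NS} = \begin{pmatrix} \psi\phi\frac{d}{dx} \\ 2\int_{S^1} \phi'(x)\psi'(x)\,dx \end{pmatrix} \tag{3} \]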

is usually called the anticommutator. The Gelfand–Fuchs cocycle (1) has a natural generalization:

\[ c_{NS}\left( (f\tfrac{d}{dx}, \phi(x)),\; (g\tfrac{d}{dx}, \psi(x)) \right) = \int_{S^1} f'(x)g''(x)\,dx + 2\int_{S^1} \phi'(x)\psi'(x)\,dx. \]

This Lie superalgebra is a particular case of a series of so-called stringy Lie superalgebras (see [9]). All the Lie superalgebras considered in this paper are given by extensions of the Neveu–Schwarz superalgebra by modules of tensor densities on $S^1$.

1.3. Modules of tensor densities. Let $F_\lambda$ be the space of all tensor densities on $S^1$ of degree $\lambda$: $a = a(x)(dx)^\lambda$. The Lie algebra $\mathrm{Vect}(S^1)$ acts on $F_\lambda$ by the Lie derivative:

\[ L^{(\lambda)}_{f(x)\frac{d}{dx}}\, a = \bigl( f(x)a'(x) + \lambda f'(x)a(x) \bigr)(dx)^\lambda. \tag{4} \]

It follows from (2) and (4) that the odd part of the Neveu–Schwarz superalgebra can be naturally considered as the space $F_{-\frac{1}{2}}$ of tensor densities of degree $-\frac{1}{2}$.
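That formula (4) defines an action, i.e. that $L^{(\lambda)}_f L^{(\lambda)}_g - L^{(\lambda)}_g L^{(\lambda)}_f = L^{(\lambda)}_{[f,g]}$, can be checked on test data. The sketch below is an illustration under stated assumptions, not part of the original article: polynomials in $x$ (coefficient lists) stand in for smooth functions, periodicity is ignored since only the local algebra matters, and all helper names are hypothetical.

```python
from fractions import Fraction

# Polynomials in x are coefficient lists [c0, c1, c2, ...]; they stand in for
# smooth functions purely as algebraic test data (an assumption of this sketch).

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([k * a for k, a in enumerate(p)][1:])

def bracket(f, g):
    """[f d/dx, g d/dx] = (f g' - f' g) d/dx."""
    return add(mul(f, diff(g)), scale(-1, mul(diff(f), g)))

def lie(lam, f, a):
    """Formula (4): L_f a = f a' + lam f' a on a density of degree lam."""
    return add(mul(f, diff(a)), scale(lam, mul(diff(f), a)))

def representation_defect(lam, f, g, a):
    """L_f L_g a - L_g L_f a - L_[f,g] a; the empty list means zero."""
    lhs = add(lie(lam, f, lie(lam, g, a)), scale(-1, lie(lam, g, lie(lam, f, a))))
    return add(lhs, scale(-1, lie(lam, bracket(f, g), a)))
```

For instance, `representation_defect(Fraction(-1, 2), [1, 2, 0, 3], [0, 1, -1], [2, 0, 1, 1])` evaluates to the empty list, in line with the degree $-\frac{1}{2}$ case used for the odd part of the Neveu–Schwarz superalgebra.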


2. Extensions of the Virasoro Algebra

2.1. Central extensions of $\mathrm{Vect}(S^1) \ltimes F_\lambda$. Let us first recall the classification of the central extensions of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_\lambda$. Such Lie algebras have already been considered in the mathematical and physical literature; their classification is also given in [16] and [17]. These extensions exist if and only if $\lambda = 0$ or $1$. We have:

\[ H^2(\mathrm{Vect}(S^1) \ltimes F_\lambda; \mathbb{R}) = \begin{cases} \mathbb{R}^3 & \text{for } \lambda = 0, 1, \\ 0 & \text{for } \lambda \neq 0, 1. \end{cases} \]

Each of the algebras $\mathrm{Vect}(S^1) \ltimes F_0$ and $\mathrm{Vect}(S^1) \ltimes F_1$ has a 3-dimensional non-trivial central extension given by the following 2-cocycles:

– For both algebras: the continuation of the Gelfand–Fuchs cocycle (1):

\[ c\bigl( (f, a), (g, b) \bigr) = \int_{S^1} f'(x)g''(x)\,dx. \tag{5} \]

Notice that $c$ does not depend on $a$ and $b$.

– Two more non-trivial 2-cocycles in each case:

(a) For $\lambda = 0$:
\[ c_0'\bigl( (f\tfrac{d}{dx}, a(x)), (g\tfrac{d}{dx}, b(x)) \bigr) = \int_{S^1} \bigl( f''(x)b(x) - g''(x)a(x) \bigr)dx, \]
\[ c_0''\bigl( (f\tfrac{d}{dx}, a(x)), (g\tfrac{d}{dx}, b(x)) \bigr) = \int_{S^1} \bigl( a(x)b'(x) - b(x)a'(x) \bigr)dx. \]

(b) For $\lambda = 1$:
\[ c_1'\bigl( (f\tfrac{d}{dx}, a\,dx), (g\tfrac{d}{dx}, b\,dx) \bigr) = \int_{S^1} \bigl( f'(x)b(x) - g'(x)a(x) \bigr)dx, \]
\[ c_1''\bigl( (f\tfrac{d}{dx}, a\,dx), (g\tfrac{d}{dx}, b\,dx) \bigr) = \int_{S^1} \bigl( f(x)b(x) - g(x)a(x) \bigr)dx. \]

Denote respectively by $\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ and $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$ the 3-dimensional universal central extensions given by these cocycles.

2.2. The Ovsienko–Roger algebras. In this section we recall the definitions of the Lie algebras from [16,17] generalizing the Virasoro algebra.

2.2.1. Extensions of $\mathrm{Vect}(S^1)$ by $F_\lambda$. Consider the Lie algebras given by the extensions of $\mathrm{Vect}(S^1)$ with coefficients in $\mathrm{Vect}(S^1)$-modules of tensor densities. The classification of these non-trivial extensions is given by the following result, see [6]:

\[ H^2(\mathrm{Vect}(S^1); F_\lambda) = \begin{cases} \mathbb{R}^2 & \text{for } \lambda = 0, 1, 2, \\ \mathbb{R} & \text{for } \lambda = 5, 7, \\ 0 & \text{otherwise.} \end{cases} \]

The corresponding non-trivial cocycles were given in [16].

(1) For $\lambda = 0$, $\lambda = 1$, $\lambda = 2$ there exist two non-isomorphic non-trivial extensions. Let us give the 2-cocycles on $\mathrm{Vect}(S^1)$ with values in $F_\lambda$ representing the non-trivial cohomology classes:

– for $\lambda = 0$: a) $\gamma_0(f, g) = \int_{S^1} f'(x)g''(x)\,dx$,$^1$ b) $\gamma_1(f, g) = (fg' - f'g)(x)$.
– for $\lambda = 1$: a) $\gamma_2(f, g) = (f'g'' - f''g')\,dx$, b) $\gamma_3(f, g) = (fg'' - f''g)\,dx$.
– for $\lambda = 2$: a) $\gamma_4(f, g) = (f'g''' - f'''g')\,dx^2$, b) $\gamma_5(f, g) = (fg''' - f'''g)\,dx^2$.

(2) For $\lambda = 5$, $\lambda = 7$ there is a unique non-trivial extension; the 2-cocycles are:

– for $\lambda = 5$: $\gamma_6(f, g) = \begin{vmatrix} f''' & g''' \\ f^{(IV)} & g^{(IV)} \end{vmatrix} dx^5$,
– for $\lambda = 7$: $\gamma_7(f, g) = \left( 2\begin{vmatrix} f''' & g''' \\ f^{(VI)} & g^{(VI)} \end{vmatrix} - 9\begin{vmatrix} f^{(IV)} & g^{(IV)} \\ f^{(V)} & g^{(V)} \end{vmatrix} \right)(dx)^7$.

Let us denote by $A_1, \ldots, A_7$ the Lie algebras given by the non-trivial cocycles $\gamma_1, \ldots, \gamma_7$. Thus each algebra $A_i$ appears as a deformation of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_{\lambda_i}$ with:

\[ \lambda_1 = 0,\ \lambda_2 = 1,\ \lambda_3 = 1,\ \lambda_4 = 2,\ \lambda_5 = 2,\ \lambda_6 = 5,\ \lambda_7 = 7. \tag{6} \]

2.2.2. Central extension of the Lie algebras $A_i$. Let us now describe the central extensions of the Lie algebras $A_i$ (see [16,17]). Each Lie algebra $A_1$ and $A_2$ has two non-isomorphic non-trivial central extensions, and each one of the Lie algebras $A_3, \ldots, A_7$ has a unique non-trivial central extension. Precisely:

a) Each algebra $A_i$ has a non-trivial central extension given by the 2-cocycle (5).

b) There exist two more non-trivial central extensions:

– a central extension of the Lie algebra $A_2$ given by the 2-cocycle:
\[ c_2\bigl( (f\tfrac{d}{dx}, a\,dx), (g\tfrac{d}{dx}, b\,dx) \bigr) = \int_{S^1} \bigl( f'(x)b(x) - g'(x)a(x) \bigr)dx; \]

– a central extension of the Lie algebra $A_3$ given by the 2-cocycle:
\[ c_3\bigl( (f\tfrac{d}{dx}, a\,dx), (g\tfrac{d}{dx}, b\,dx) \bigr) = \int_{S^1} \bigl( f(x)b(x) - g(x)a(x) \bigr)dx. \]

Notation. Let us denote by $\widehat{A}_1$ and $\widehat{A}_2$ the two-dimensional central extensions of the algebras $A_1$ and $A_2$ respectively, and by $\widehat{A}_3, \ldots, \widehat{A}_7$ the one-dimensional central extensions of $A_3, \ldots, A_7$ respectively.

The algebras $\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$, $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$, $\widehat{A}_1, \ldots, \widehat{A}_7$ give a series of nine interesting Lie algebras generalizing the Virasoro algebra. We will introduce superanalogues of these Lie algebras in Sects. 6 and 7. We hope that the superalgebras presented are interesting generalizations of the Neveu–Schwarz superalgebra and that their representations, coadjoint orbits, etc. can lead to interesting studies. Let us consider first the simplest case: superanalogues of the Lie algebras $\mathrm{Vect}(S^1) \ltimes F_\lambda$ and superanalogues of the Lie algebras $A_i$.

$^1$ Case of the Virasoro algebra; here $\gamma_0(f, g)$ is a constant function on $S^1$.

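3. Two Families of Modules over $\mathrm{Vect}(S^1) \ltimes F_\lambda$

We will show in this section that there exist modules of tensor densities over each algebra $\mathrm{Vect}(S^1) \ltimes F_\lambda$, playing the same role as the module $F_{-\frac{1}{2}}$ (the odd part of the Neveu–Schwarz algebra) over the Virasoro algebra. We consider the action of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_\lambda$ on the space $F_\mu \oplus F_\rho$. It turns out that there exist two natural families of $\mathrm{Vect}(S^1) \ltimes F_\lambda$-modules corresponding to $\rho = \mu + \lambda$ and $\rho = \mu + \lambda + 1$.

Definition 3.1. (a) $\mathrm{Vect}(S^1) \ltimes F_\lambda$ acts on the space $F_\mu \oplus F_{\mu+\lambda}$ as follows:

\[ T_\lambda\begin{pmatrix} f\frac{d}{dx} \\ a(dx)^\lambda \end{pmatrix} \begin{pmatrix} \psi(dx)^\mu \\ \beta(dx)^{\mu+\lambda} \end{pmatrix} = \begin{pmatrix} L^{(\mu)}_{f\frac{d}{dx}}\psi \\ L^{(\mu+\lambda)}_{f\frac{d}{dx}}\beta + \psi a \end{pmatrix}. \tag{7} \]

(b) $\mathrm{Vect}(S^1) \ltimes F_\lambda$ acts on the space $F_\mu \oplus F_{\mu+\lambda+1}$ by:

\[ \widetilde{T}_\lambda\begin{pmatrix} f\frac{d}{dx} \\ a(dx)^\lambda \end{pmatrix} \begin{pmatrix} \psi(dx)^\mu \\ \beta(dx)^{\mu+\lambda+1} \end{pmatrix} = \begin{pmatrix} L^{(\mu)}_{f\frac{d}{dx}}\psi \\ L^{(\mu+\lambda+1)}_{f\frac{d}{dx}}\beta + (\mu\psi a' - \lambda\psi' a) \end{pmatrix}. \tag{8} \]

It is easy to see that the formulae (7) and (8) indeed define a $\mathrm{Vect}(S^1) \ltimes F_\lambda$-action.

Remark 1. The term $(\mu\psi a' - \lambda\psi' a)$ is the Poisson (or Schouten) bracket, denoted $J_1(\psi\,dx^\mu, a\,dx^\lambda)$; see e.g. [8].

In this paper we are interested in generalizations of the Neveu–Schwarz algebra, therefore we take $\mu = -\frac{1}{2}$ in (7) and (8). Denote by

\[ M_\lambda = F_{-\frac{1}{2}} \oplus F_{\lambda - \frac{1}{2}} \quad \text{and} \quad \widetilde{M}_\lambda = F_{-\frac{1}{2}} \oplus F_{\lambda + \frac{1}{2}} \tag{9} \]

the $\mathrm{Vect}(S^1) \ltimes F_\lambda$-modules (7) and (8) respectively.

4. Superanalogues of $\mathrm{Vect}(S^1) \ltimes F_\lambda$

It is a remarkable fact that there exist two natural superanalogues of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_\lambda$. Consider the spaces $\mathrm{Vect}(S^1) \ltimes F_\lambda \oplus M_\lambda$ and $\mathrm{Vect}(S^1) \ltimes F_\lambda \oplus \widetilde{M}_\lambda$ and define a superalgebra structure on these two spaces. In each case one needs a symmetric bilinear mapping (anticommutator):

\[ [\ ,\ ]_+ : M_\lambda \otimes M_\lambda \to \mathrm{Vect}(S^1) \ltimes F_\lambda \quad \text{and} \quad [\ ,\ ]_+ : \widetilde{M}_\lambda \otimes \widetilde{M}_\lambda \to \mathrm{Vect}(S^1) \ltimes F_\lambda. \]

We are looking for the anticommutators which extend the Neveu–Schwarz anticommutator (3) to $M_\lambda$ and $\widetilde{M}_\lambda$ respectively. Namely:

\[ [\,(\phi, \alpha), (\psi, \beta)\,]_+ = \Bigl( \phi\psi\frac{d}{dx},\; \pi\bigl( (\phi, \alpha), (\psi, \beta) \bigr) \Bigr), \tag{10} \]

where $\pi$ is a symmetric bilinear map:

\[ \bar\pi : M_\lambda \otimes M_\lambda \to F_\lambda \quad \text{and} \quad \tilde\pi : \widetilde{M}_\lambda \otimes \widetilde{M}_\lambda \to F_\lambda. \]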

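Proposition 4.1. There exist unique bilinear maps $\bar\pi$ and $\tilde\pi$ such that the anticommutator (10) defines a Lie superalgebra structure on $\mathrm{Vect}(S^1) \ltimes F_\lambda \oplus M_\lambda$ and $\mathrm{Vect}(S^1) \ltimes F_\lambda \oplus \widetilde{M}_\lambda$ respectively:

\[ \bar\pi = J_1(\phi\,dx^{-\frac{1}{2}}, \beta\,dx^{\lambda - \frac{1}{2}}) + J_1(\psi\,dx^{-\frac{1}{2}}, \alpha\,dx^{\lambda - \frac{1}{2}}), \tag{11} \]

\[ \tilde\pi = (\psi\alpha + \phi\beta)\,dx^\lambda. \tag{12} \]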

Proof. Straightforward calculation. □

Let us denote by $S_\lambda$ and $\widetilde{S}_\lambda$ the Lie superalgebras defined by the anticommutators (11) and (12) respectively. These Lie superalgebras can be realized in terms of contact vector fields and tensor densities on the supercircle $S^{1|1}$ (cf. [9]). However, we will not use this interpretation in this paper.

5. Superanalogues of the Lie Algebras $A_i$

Consider now the Lie algebras $A_i$ defined in Sect. 2.2.1. Recall that each Lie algebra $A_i$ is a deformation of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_{\lambda_i}$ (see (6) for $\lambda_i$) given by the cocycle $\gamma_i$. We are looking therefore for superanalogues of the algebras $A_i$ which are deformations of the superalgebras $S_{\lambda_i}$ and $\widetilde{S}_{\lambda_i}$ containing $A_i$ as the even part. It turns out that for each algebra $A_i$, except $A_7$, only one of the superalgebras $S_{\lambda_i}$ or $\widetilde{S}_{\lambda_i}$ admits such a deformation and therefore leads to a superanalogue of the algebra $A_i$. In the case of $A_7$ there is no such deformation; that is, the Lie algebra $A_7$ has no superanalogue.

Theorem 5.1. (i) For the Lie algebra $A_1$ there exists a unique superanalogue, which is a deformation of $\widetilde{S}_0$.
(ii) For each of the Lie algebras $A_2$, $A_3$ there exists a unique superanalogue, which is a deformation of $S_1$.
(iii) For each one of the Lie algebras $A_4$, $A_5$ there exists a unique superanalogue, which is a deformation of $S_2$.
(iv) For the Lie algebra $A_6$ there exists a unique superanalogue, which is a deformation of $\widetilde{S}_5$.
(v) In the case of $A_7$ there is no superanalogue.

Let us begin the proof by defining the module structure of the odd part of each superanalogue.

5.1. Modules over the Lie algebras $A_i$. We are looking for an action of the Lie algebras $A_i$ on the spaces $M_{\lambda_i}$ and $\widetilde{M}_{\lambda_i}$ (cf. (6) for $\lambda_i$ and (9)) defined as deformations of the actions (7) and (8) respectively. Such a deformation has to be of the following form:

(a) First type: $A_i$-action on $M_{\lambda_i} = F_{-\frac{1}{2}} \oplus F_{\lambda_i - \frac{1}{2}}$,

\[ T_i\begin{pmatrix} f \\ a \end{pmatrix} \begin{pmatrix} \psi \\ \beta \end{pmatrix} = T_{\lambda_i}\begin{pmatrix} f \\ a \end{pmatrix} \begin{pmatrix} \psi \\ \beta \end{pmatrix} + \begin{pmatrix} 0 \\ \bar{s}_i(f, \psi) \end{pmatrix}. \tag{7'} \]

(b) Second type: $A_i$-action on $\widetilde{M}_{\lambda_i} = F_{-\frac{1}{2}} \oplus F_{\lambda_i + \frac{1}{2}}$,

\[ \widetilde{T}_i\begin{pmatrix} f \\ a \end{pmatrix} \begin{pmatrix} \psi \\ \beta \end{pmatrix} = \widetilde{T}_{\lambda_i}\begin{pmatrix} f \\ a \end{pmatrix} \begin{pmatrix} \psi \\ \beta \end{pmatrix} + \begin{pmatrix} 0 \\ \tilde{s}_i(f, \psi) \end{pmatrix}, \tag{8'} \]

where $\bar{s} : \mathrm{Vect}(S^1) \oplus F_{-\frac{1}{2}} \to F_{\lambda_i - \frac{1}{2}}$ and $\tilde{s} : \mathrm{Vect}(S^1) \oplus F_{-\frac{1}{2}} \to F_{\lambda_i + \frac{1}{2}}$ are some bilinear mappings. Let us give here the list of mappings (7') and (8') which define an $A_i$-action on the odd part of the corresponding Lie superalgebra.

Proposition–Definition 5.2. There exist the following actions of the Lie algebras $A_i$ on spaces of tensor densities:

– Action of the first type:

(a) $A_2$ acts on $M_1 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ with: $\bar{s}_2(f, \psi) = -2f'\psi'\,dx^{\frac{1}{2}}$.
(b) $A_3$ acts on $M_1' = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ with: $\bar{s}_3(f, \psi) = -2f\psi'\,dx^{\frac{1}{2}}$.
(c) $A_4$ acts on $M_2 = F_{-\frac{1}{2}} \oplus F_{\frac{3}{2}}$ with: $\bar{s}_4(f, \psi) = -2f'\psi''\,dx^{\frac{3}{2}}$.
(d) $A_5$ acts on $M_2' = F_{-\frac{1}{2}} \oplus F_{\frac{3}{2}}$ with: $\bar{s}_5(f, \psi) = -2f\psi''\,dx^{\frac{3}{2}}$.
(e) $A_6$ acts on $M_5 = F_{-\frac{1}{2}} \oplus F_{\frac{9}{2}}$ with: $\bar{s}_6(f, \psi) = \bigl( -\frac{8}{7}f'''\psi''' + \frac{6}{7}f^{(IV)}\psi'' \bigr)dx^{\frac{9}{2}}$.

– Action of the second type:

(f) $A_1$ acts on $\widetilde{M}_0 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ with: $\tilde{s}_1(f, \psi) = f\psi'\,dx^{\frac{1}{2}}$.
(g) $A_6$ acts on $\widetilde{M}_5 = F_{-\frac{1}{2}} \oplus F_{\frac{11}{2}}$ with: $\tilde{s}_6(f, \psi) = \bigl( \frac{5}{2}f'''\psi^{(IV)} - 5f^{(IV)}\psi''' + \frac{3}{2}f^{(V)}\psi'' \bigr)dx^{\frac{11}{2}}$.

Proof. The relation

\[ T_{\left[ \binom{f\frac{d}{dx}}{a},\, \binom{g\frac{d}{dx}}{b} \right]} = T_{\binom{f\frac{d}{dx}}{a}} \circ T_{\binom{g\frac{d}{dx}}{b}} - T_{\binom{g\frac{d}{dx}}{b}} \circ T_{\binom{f\frac{d}{dx}}{a}} \tag{13} \]

leads to the following condition on the map $\bar{s}_i$ or $\tilde{s}_i$:

\[ \{\gamma_i(f, g), \psi\} + s([f, g], \psi) = L_{f\frac{d}{dx}}\bigl( s(g, \psi) \bigr) + s\bigl( f, L^{(-\frac{1}{2})}_{g\frac{d}{dx}}\psi \bigr) - L_{g\frac{d}{dx}}\bigl( s(f, \psi) \bigr) - s\bigl( g, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi \bigr), \tag{14} \]

where $s = \bar{s}_i$ or $\tilde{s}_i$, and $\{\gamma_i(f, g), \psi\} = \gamma_i(f, g)\psi$ in the case of $\bar{s}_i$ and $J_1(\psi, \gamma_i(f, g))$ in the case of $\tilde{s}_i$. The expression of $\bar{s}_i$ and $\tilde{s}_i$ then follows by a straightforward calculation. □

Remark 2. For each one of the algebras $A_1, A_2, A_3, A_4, A_5$ only one of the actions (7') or (8') leads to a module structure. This explains the fact that for each algebra $A_1, A_2, \ldots, A_5$ there exists only one superanalogue in the class of deformations of the superalgebras $S_{\lambda_i}$ and $\widetilde{S}_{\lambda_i}$.

In the case of the Lie algebra $A_6$ the two forms of action (7') and (8') give an $A_6$-module structure on $F_{-\frac{1}{2}} \oplus F_{\frac{9}{2}}$ and $F_{-\frac{1}{2}} \oplus F_{\frac{11}{2}}$ respectively, but there exists a natural structure of Lie superalgebra with only one of these modules. For the Lie algebra $A_7$ neither the action (7') nor the action (8') gives an $A_7$-module structure. This will be detailed in Sect. 8.

5.2. Relation to $\mathrm{Vect}(S^1)$-cohomology. We give here a cohomological interpretation of the bilinear mappings $\bar{s}_i$ and $\tilde{s}_i$ introduced in (7') and (8'). Let us consider the following mappings: $S : \mathrm{Vect}(S^1) \to \mathrm{Hom}(F_{-\frac{1}{2}}, F_\rho)$ ($\rho = \lambda_i - \frac{1}{2}$ or $\lambda_i + \frac{1}{2}$) defined by:

\[ S(f)(\psi) := s(f, \psi), \quad s = \bar{s}_i \text{ or } \tilde{s}_i, \]

and the 2-cocycle $\sigma : \mathrm{Vect}(S^1) \to \mathrm{Hom}(F_{-\frac{1}{2}}, F_\rho)$, defined by:

\[ \sigma(f, g)(\psi) := \{\gamma_i(f, g), \psi\}, \]

see (14). Note that $\sigma$ is automatically a 2-cocycle since $\gamma_i$ is one. It turns out that the cohomology class of $\gamma$ is the unique obstruction for the existence of an $A_i$-action. One obtains immediately:

Proposition 5.3. The formulas (7') and (8') define an action of the Lie algebra $A_i$ if and only if

\[ dS = -\sigma. \tag{15} \]

Proof. This relation follows directly from (14). □

Remark 3. Note that if $\bar{s}$ (resp. $\tilde{s}$) is a solution of (15) and $s_0$ is a 1-cocycle on $\mathrm{Vect}(S^1)$ with values in $\mathrm{Hom}(F_{-\frac{1}{2}}, F_{\lambda_i - \frac{1}{2}})$ (resp. $\mathrm{Hom}(F_{-\frac{1}{2}}, F_{\lambda_i + \frac{1}{2}})$), then $s + s_0$ is also a solution of (15). The mappings $\bar{s}$, $\tilde{s}$ given in Proposition–Definition 5.2 are the simplest solutions of (15).
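Condition (14) lends itself to a mechanical check. The sketch below is an illustration, not part of the original article: it tests (14) for the first-type maps $\bar{s}_2 = -2f'\psi'$ (with $\gamma_2$) and $\bar{s}_3 = -2f\psi'$ (with $\gamma_3$), both with target $F_{\frac{1}{2}}$, so that the outer Lie derivatives in (14) are taken in degree $\frac{1}{2}$ (an assumption read off from the target space). Polynomials stand in for smooth functions, and all helper names are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists [c0, c1, ...]; they stand in for smooth
# functions purely as algebraic test data (an assumption of this sketch).

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(*ps):
    n = max((len(p) for p in ps), default=0)
    return trim([sum(p[k] for p in ps if k < len(p)) for k in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p, times=1):
    for _ in range(times):
        p = [k * a for k, a in enumerate(p)][1:]
    return trim(p)

def bracket(f, g):
    """[f d/dx, g d/dx] = (f g' - f' g) d/dx."""
    return add(mul(f, diff(g)), scale(-1, mul(diff(f), g)))

def lie(w, f, a):
    """Lie derivative (4) on densities of degree w."""
    return add(mul(f, diff(a)), scale(w, mul(diff(f), a)))

HALF = Fraction(1, 2)

def gamma2(f, g):  # (f' g'' - f'' g') dx
    return add(mul(diff(f), diff(g, 2)), scale(-1, mul(diff(f, 2), diff(g))))

def gamma3(f, g):  # (f g'' - f'' g) dx
    return add(mul(f, diff(g, 2)), scale(-1, mul(diff(f, 2), g)))

def s2(f, psi):  # -2 f' psi'
    return scale(-2, mul(diff(f), diff(psi)))

def s3(f, psi):  # -2 f psi'
    return scale(-2, mul(f, diff(psi)))

def defect14(gamma, s, f, g, psi):
    """Left side minus right side of (14) for a first-type action into F_{1/2}."""
    lhs = add(mul(gamma(f, g), psi), s(bracket(f, g), psi))
    rhs = add(
        lie(HALF, f, s(g, psi)),
        s(f, lie(-HALF, g, psi)),
        scale(-1, lie(HALF, g, s(f, psi))),
        scale(-1, s(g, lie(-HALF, f, psi))),
    )
    return add(lhs, scale(-1, rhs))
```

An empty list returned by `defect14(gamma2, s2, f, g, psi)` or `defect14(gamma3, s3, f, g, psi)` means that (14) holds on the chosen test polynomials; the same pattern could be adapted to the other entries of Proposition–Definition 5.2 by changing the target degree.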

5.3. Definitions of the superalgebras. We will give here bilinear mappings defining a Lie superalgebra structure on each superspace $A_i \oplus M_{\lambda_i}$ (with $i = 2, 3, 4, 5$) and $A_i \oplus \widetilde{M}_{\lambda_i}$ (with $i = 1, 6$) introduced in Proposition–Definition 5.2. We are looking for anticommutators which are given by deformations of the corresponding anticommutators (10) on $S_{\lambda_i}$ and $\widetilde{S}_{\lambda_i}$ (cf. (6) and Proposition 4.1). Namely,

for $i = 2, 3, 4, 5$:
\[ [\,(\phi\,dx^{-\frac{1}{2}}, \alpha\,dx^{\lambda_i - \frac{1}{2}}), (\psi\,dx^{-\frac{1}{2}}, \beta\,dx^{\lambda_i - \frac{1}{2}})\,]_+ = \Bigl( \phi\psi\frac{d}{dx},\; J_1(\psi, \alpha) + J_1(\phi, \beta) + \bar\gamma_i \Bigr), \tag{16} \]

for $i = 1, 6$:
\[ [\,(\phi\,dx^{-\frac{1}{2}}, \alpha\,dx^{\lambda_i + \frac{1}{2}}), (\psi\,dx^{-\frac{1}{2}}, \beta\,dx^{\lambda_i + \frac{1}{2}})\,]_+ = \Bigl( \phi\psi\frac{d}{dx},\; (\psi\alpha + \phi\beta)\,dx^{\lambda_i} + \tilde\gamma_i \Bigr), \tag{17} \]

where $\tilde\gamma_i$, $\bar\gamma_i$ are the prolongations of the cocycles $\gamma_1, \gamma_2, \ldots, \gamma_6$ given in Sect. 2.2.1.

Proposition 5.4. There exists a unique way to extend the cocycles $\gamma_i$ to:

$A_2 \oplus M_1$: $\bar\gamma_2 = 4\phi'\psi'\,dx$,
$A_3 \oplus M_1'$: $\bar\gamma_3 = (\phi\psi)'\,dx$,
$A_4 \oplus M_2$: $\bar\gamma_4 = 2(\psi'\phi')'\,dx^2$,
$A_5 \oplus M_2'$: $\bar\gamma_5 = (\psi\phi'' + \psi''\phi)\,dx^2$,
$A_1 \oplus \widetilde{M}_0$: $\tilde\gamma_1 = \phi\psi(x)$,
$A_6 \oplus \widetilde{M}_5$: $\tilde\gamma_6 = \bigl[\, 3(\psi^{(IV)}\phi'' + \phi^{(IV)}\psi'') - 8\phi'''\psi''' \,\bigr]dx^5$

to obtain a Lie superalgebra structure on $A_i \oplus M_{\lambda_i}$ ($i = 2, 3, 4, 5$) and $A_i \oplus \widetilde{M}_{\lambda_i}$ ($i = 1, 6$).

Proof. The proposition can be proven by a direct verification of the graded Jacobi identity:

\[ (-1)^{|X||Z|}[X, [Y, Z]] + (-1)^{|X||Y|}[Y, [Z, X]] + (-1)^{|Y||Z|}[Z, [X, Y]] = 0, \tag{18} \]

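where $|X|$ is the degree of $X$ ($|X| = 0$ for $X \in A_i$ and $|X| = 1$ for $X \in M_{\lambda_i}$). Let us give this verification in the case of $A_1 \oplus \widetilde{M}_0$. The proof of (18) in the case of the other algebras is analogous. For elements of the even part, (18) is the Jacobi identity of the Lie algebra; if two of the elements $X, Y, Z$ are even and the third is odd, (18) is a direct consequence of the module structure of the odd part. Therefore it remains to check (18) in two cases.

1) First case. For $X = \binom{\phi}{\alpha}$, $Y = \binom{\psi}{\beta}$, $Z = \binom{\tau}{\gamma}$ elements of $\widetilde{M}_0 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ we have:

\[ \left[ \binom{\phi}{\alpha}, \left[ \binom{\psi}{\beta}, \binom{\tau}{\gamma} \right]_+ \right] = -\widetilde{T}_1\!\left( \left[ \binom{\psi}{\beta}, \binom{\tau}{\gamma} \right]_+ \right) \binom{\phi}{\alpha}. \]

By definition of the anticommutator $[\ ,\ ]_+$ (cf. formula (17)) and of $\tilde\gamma_1$:

\[ \left[ \binom{\phi}{\alpha}, \left[ \binom{\psi}{\beta}, \binom{\tau}{\gamma} \right]_+ \right] = -\widetilde{T}_1\begin{pmatrix} \psi\tau\frac{d}{dx} \\ (\psi\gamma + \tau\beta)(x) + \psi\tau(x) \end{pmatrix} \binom{\phi}{\alpha} \]

(recall that $\lambda_1 = 0$), cf. (8'). According to the definition of the action $\widetilde{T}_1 = \widetilde{T}_0 + \tilde{s}_1$ and Proposition–Definition 5.2 one obtains:

\[ \widetilde{T}_1\begin{pmatrix} \psi\tau\frac{d}{dx} \\ (\psi\gamma + \tau\beta)(x) + \psi\tau(x) \end{pmatrix} \binom{\phi}{\alpha} = \begin{pmatrix} L^{(-\frac{1}{2})}_{\psi\tau\frac{d}{dx}}\phi \\ L^{(\frac{1}{2})}_{\psi\tau}(\alpha) - \frac{1}{2}\phi\,[(\psi\gamma + \tau\beta) + \psi\tau]' + (\psi\tau)\phi' \end{pmatrix}. \]

Finally

\[ \left[ \binom{\phi}{\alpha}, \left[ \binom{\psi}{\beta}, \binom{\tau}{\gamma} \right]_+ \right] = -\begin{pmatrix} \psi\tau\phi' - \frac{1}{2}(\psi'\tau + \psi\tau')\phi \\ L^{(\frac{1}{2})}_{\psi\tau}(\alpha) - \frac{1}{2}\phi\,[(\psi\gamma + \tau\beta) + \psi\tau]' + (\psi\tau)\phi' \end{pmatrix}. \]

In the same way one has

\[ \left[ \binom{\psi}{\beta}, \left[ \binom{\tau}{\gamma}, \binom{\phi}{\alpha} \right]_+ \right] = -\begin{pmatrix} \phi\tau\psi' - \frac{1}{2}(\phi'\tau + \phi\tau')\psi \\ L^{(\frac{1}{2})}_{\phi\tau}(\beta) - \frac{1}{2}\psi\,[(\tau\alpha + \phi\gamma) + \phi\tau]' + (\phi\tau)\psi' \end{pmatrix}, \]

\[ \left[ \binom{\tau}{\gamma}, \left[ \binom{\phi}{\alpha}, \binom{\psi}{\beta} \right]_+ \right] = -\begin{pmatrix} \phi\psi\tau' - \frac{1}{2}(\phi'\psi + \phi\psi')\tau \\ L^{(\frac{1}{2})}_{\phi\psi}(\gamma) - \frac{1}{2}\tau\,[(\phi\beta + \psi\alpha) + \psi\phi]' + (\psi\phi)\tau' \end{pmatrix}. \]

Taking the sum of these terms one readily obtains zero.

2) Second case. For $X = \binom{f\frac{d}{dx}}{a(x)} \in A_1$ and $Y = \binom{\psi}{\beta}$, $Z = \binom{\phi}{\alpha}$ elements of $F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ we have:

a) First term:

\[ \left[ \binom{f\frac{d}{dx}}{a(x)}, \left[ \binom{\psi}{\beta}, \binom{\phi}{\alpha} \right]_+ \right] = \left[ \binom{f\frac{d}{dx}}{a(x)}, \begin{pmatrix} \phi\psi\frac{d}{dx} \\ (\phi\beta + \psi\alpha)(x) + (\phi\psi)(x) \end{pmatrix} \right]. \]

By definition of the commutator on $A_1$ one obtains:

\[ \begin{pmatrix} \bigl( f(\phi\psi)' - f'(\phi\psi) \bigr)\frac{d}{dx} \\ L^{(0)}_f(\phi\beta + \psi\alpha + \phi\psi) - L^{(0)}_{\phi\psi}(a) + f(\phi\psi)' - f'(\phi\psi) \end{pmatrix}. \tag{19} \]

b) Second term:

\[ \left[ \binom{\psi}{\beta}, \left[ \binom{\phi}{\alpha}, \binom{f\frac{d}{dx}}{a(x)} \right] \right] = \left[ \binom{\psi}{\beta},\; -\widetilde{T}_1\binom{f\frac{d}{dx}}{a(x)} \binom{\phi}{\alpha} \right]_+. \]

According to the definition of the action $\widetilde{T}_1$ one has:

\[ \left[ \binom{\psi}{\beta}, \left[ \binom{\phi}{\alpha}, \binom{f\frac{d}{dx}}{a(x)} \right] \right] = \left[ \binom{\psi}{\beta},\; -\begin{pmatrix} L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \\ L^{(\frac{1}{2})}_{f\frac{d}{dx}}\alpha - \frac{1}{2}\phi a' + f\phi' \end{pmatrix} \right]_+ \]

and by definition of the anticommutator $[\ ,\ ]_+$ one obtains finally:

\[ \begin{pmatrix} -\psi\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi\;\frac{d}{dx} \\ -\psi\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\alpha - \frac{1}{2}\phi a' + f\phi' \bigr) - \beta\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi - \psi\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \end{pmatrix}. \tag{20} \]

c) Third term:

\[ -\left[ \binom{\phi}{\alpha}, \left[ \binom{f\frac{d}{dx}}{a(x)}, \binom{\psi}{\beta} \right] \right] = \begin{pmatrix} -\phi\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi\;\frac{d}{dx} \\ -\phi\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\beta - \frac{1}{2}\psi a' + f\psi' \bigr) - \alpha\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi - \phi\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi \end{pmatrix}. \tag{21} \]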

We verify that the sum of the expressions (19), (20), (21) is zero. The first line follows directly from the Neveu–Schwarz case. The second line is a direct calculation. Proposition 5.4 is proven. □

Theorem 5.1 follows now from Proposition 5.4.

6. Central Extensions

We will study superanalogues of $\widehat{A}_i$ as central extensions of the $A_i$-superanalogues. Therefore we are looking for central extensions of the Lie superalgebras $A_i \oplus M_{\lambda_i}$ ($i = 2, 3, 4, 5$) and $A_i \oplus \widetilde{M}_{\lambda_i}$ ($i = 1, 6$) given in Proposition 5.4. In each case one needs to extend the anticommutators (16) or (17) to the Lie algebras $\widehat{A}_i$, that is, to give the prolongation of the cocycles $c$, $c_2$ and $c_3$ of Sect. 2.2.2. The results are as follows:

Proposition 6.1. (i) For each of the superalgebras $A_1 \oplus \widetilde{M}_0$, $A_4 \oplus M_2$, $A_5 \oplus M_2'$, $A_6 \oplus \widetilde{M}_5$ there exists a one-dimensional non-trivial central extension with the corresponding anticommutator (16) or (17) extended by:

\[ \hat{c}\bigl( (\phi\,dx^{-\frac{1}{2}}, \alpha), (\psi\,dx^{-\frac{1}{2}}, \beta) \bigr) = 2\int_{S^1} \phi'(x)\psi'(x)\,dx. \]

(ii) For the superalgebra $A_2 \oplus M_1$, there exists a two-dimensional non-trivial central extension. The anticommutator $[\ ,\ ]_+$ is extended by

\[ \hat{c} = 2\int_{S^1} \phi'(x)\psi'(x)\,dx \quad \text{and} \quad \hat{c}_2\bigl( (\phi\,dx^{-\frac{1}{2}}, \alpha), (\psi\,dx^{-\frac{1}{2}}, \beta) \bigr) = -\int_{S^1} (\psi'\alpha + \phi'\beta)\,dx. \]

(iii) For the superalgebra $A_3 \oplus M_1'$, there exists a two-dimensional non-trivial central extension. The anticommutator $[\ ,\ ]_+$ is extended by:

\[ \hat{c} = 2\int_{S^1} \phi'(x)\psi'(x)\,dx \quad \text{and} \quad \hat{c}_3\bigl( (\phi\,dx^{-\frac{1}{2}}, \alpha), (\psi\,dx^{-\frac{1}{2}}, \beta) \bigr) = -\frac{1}{2}\int_{S^1} (\psi\alpha + \phi\beta)\,dx. \]

Proof. Case (i) follows directly from the Neveu–Schwarz case. Cases (ii) and (iii) are proven by direct verification of the graded Jacobi identity. □

7. Superanalogues of the Lie Algebras $\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ and $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$

For the sake of completeness, let us also study superanalogues of the Lie algebras constructed in Sect. 2.1 as 3-dimensional central extensions of the semi-direct products: $\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ and $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$. Their superanalogues will be obtained as central extensions of the corresponding superalgebras $S_\lambda$, $\widetilde{S}_\lambda$ ($\lambda = 0, 1$), see Proposition 4.1. It turns out, in each case, that only one of the corresponding superalgebras has a natural 3-dimensional central extension generalizing $\mathrm{Vect}(S^1) \ltimes F_0$ and $\mathrm{Vect}(S^1) \ltimes F_1$. One needs therefore to extend the anticommutators (10) to $\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ and $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$ by giving the continuation of the cocycles $c, c_0', c_0''$ and $c, c_1', c_1''$ of Sect. 2.1.

The results are as follows:

Proposition 7.1. (i) For the Lie superalgebra $\widetilde{S}_0$, there exists a 3-dimensional non-trivial central extension. The anticommutator is defined by

\[ [\,(\phi, \alpha), (\psi, \beta)\,]_+ = \bigl( \phi\psi,\; (\psi\alpha + \phi\beta),\; \hat{c},\; \hat{c}_0',\; \hat{c}_0'' \bigr) \]

with

\[ \hat{c} = 2\int_{S^1} \phi'(x)\psi'(x)\,dx, \quad \hat{c}_0' = -2\int_{S^1} (\phi'\beta + \alpha\psi')\,dx, \quad \hat{c}_0'' = 4\int_{S^1} \alpha\beta\,dx. \]

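(ii) For the Lie superalgebra $S_1$, there exists a 3-dimensional non-trivial central extension. The anticommutator is defined by

\[ [\,(\phi, \alpha), (\psi, \beta)\,]_+ = \bigl( \phi\psi,\; -\tfrac{1}{2}(\psi\alpha + \phi\beta)',\; \hat{c},\; \hat{c}_1',\; \hat{c}_1'' \bigr) \]

with

\[ \hat{c} = 2\int_{S^1} \phi'(x)\psi'(x)\,dx, \quad \hat{c}_1' = -\int_{S^1} (\psi'\alpha + \phi'\beta)\,dx, \quad \hat{c}_1'' = -\frac{1}{2}\int_{S^1} (\psi\alpha + \phi\beta)\,dx. \]

(iii) The Lie superalgebras $S_0$ and $\widetilde{S}_1$ have no 3-dimensional non-trivial central extensions.

Proof. Let us prove (ii). One must verify the Jacobi identity (18) in two cases (cf. proof of Proposition 5.4). The case where $X, Y, Z \in F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ is identical to the semi-direct product case and does not contain calculations with constants. It remains to prove (18) for

\[ X = \begin{pmatrix} f\frac{d}{dx} \\ a\,dx \\ r \end{pmatrix} \in \widehat{\mathrm{Vect}(S^1) \ltimes F_1}, \quad r = \begin{pmatrix} r_1 \\ r_2 \\ r_3 \end{pmatrix}, \]

and $Y = \binom{\psi}{\beta}$, $Z = \binom{\phi}{\alpha}$ elements of $F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$.

a) First term:

\[ \left[ \begin{pmatrix} f\frac{d}{dx} \\ a\,dx \\ r \end{pmatrix}, \left[ \binom{\psi}{\beta}, \binom{\phi}{\alpha} \right]_+ \right] = \left[ \begin{pmatrix} f\frac{d}{dx} \\ a\,dx \\ r \end{pmatrix}, \begin{pmatrix} \phi\psi\frac{d}{dx} \\ -\frac{1}{2}(\phi\beta + \psi\alpha)' \\ s \end{pmatrix} \right], \tag{22} \]

where $s$ denotes the central component of the anticommutator. We consider only the terms of (22) belonging to the center of $\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$; one obtains:

\[ \begin{pmatrix} \int_{S^1} f'(x)(\phi\psi)''(x)\,dx \\ \int_{S^1} \bigl( f'(-\frac{1}{2}(\phi\beta + \psi\alpha)') - (\phi\psi)'a \bigr)(x)\,dx \\ \int_{S^1} \bigl( f(-\frac{1}{2}(\phi\beta + \psi\alpha)') - (\phi\psi)a \bigr)(x)\,dx \end{pmatrix}. \tag{23} \]

b) Second term:

\[ \left[ \binom{\psi}{\beta}, \left[ \binom{\phi}{\alpha}, \begin{pmatrix} f\frac{d}{dx} \\ a\,dx \\ r \end{pmatrix} \right] \right] = \left[ \binom{\psi}{\beta},\; -T_1\binom{f\frac{d}{dx}}{a\,dx} \binom{\phi}{\alpha} \right]_+ = \left[ \binom{\psi}{\beta},\; -\begin{pmatrix} L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \\ L^{(\frac{1}{2})}_{f\frac{d}{dx}}\alpha + \phi a \end{pmatrix} \right]_+. \]

According to the definition of $\hat{c}, \hat{c}_1', \hat{c}_1''$ the terms of the center are:

\[ \begin{pmatrix} -2\int_{S^1} \psi'(x)\bigl( L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \bigr)'(x)\,dx \\ \int_{S^1} \Bigl( \psi'\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\alpha + \phi a \bigr) + \beta\bigl( L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \bigr)' \Bigr)(x)\,dx \\ \frac{1}{2}\int_{S^1} \Bigl( \psi\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\alpha + \phi a \bigr) + \beta\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\phi \Bigr)(x)\,dx \end{pmatrix}. \tag{24} \]

c) Third term: In the same way one has:

\[ \begin{pmatrix} -2\int_{S^1} \phi'(x)\bigl( L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi \bigr)'(x)\,dx \\ \int_{S^1} \Bigl( \phi'\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\beta + \psi a \bigr) + \alpha\bigl( L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi \bigr)' \Bigr)(x)\,dx \\ \frac{1}{2}\int_{S^1} \Bigl( \phi\bigl( L^{(\frac{1}{2})}_{f\frac{d}{dx}}\beta + \psi a \bigr) + \alpha\, L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi \Bigr)(x)\,dx \end{pmatrix}. \tag{25} \]

One checks easily that the sum of the expressions (23), (24), (25) is zero. Proposition 7.1 is proven in case (ii). □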

8. Case of Algebra $A_7$

We said in the introduction of Sect. 5 that there does not exist a generalization of the Neveu–Schwarz superalgebra in the case of the algebra $A_7$. This is a consequence of the following result.

Proposition 8.1. There is no $A_7$-module structure extending (7) and (8) either on the space $F_{-\frac{1}{2}} \oplus F_{\frac{13}{2}}$ or on $F_{-\frac{1}{2}} \oplus F_{\frac{15}{2}}$ respectively.

Proof. With the notations of Sect. 5.1, we will show that there is no deformation $T_7$ or $\widetilde{T}_7$ of the action of the semi-direct product $\mathrm{Vect}(S^1) \ltimes F_7$ on the space $F_{-\frac{1}{2}} \oplus F_{\frac{13}{2}}$ or $F_{-\frac{1}{2}} \oplus F_{\frac{15}{2}}$. Let us give here the details only in the case of $T_7$ and the space $F_{-\frac{1}{2}} \oplus F_{\frac{13}{2}}$. The proof uses the notion of so-called transvectants. Let us first recall the main definitions. Consider the bilinear mappings on tensor densities $J_k : F_\lambda \oplus F_\mu \to F_{\lambda+\mu+k}$ (with $k$ an integer) defined by

\[ J_k(a\,dx^\lambda, b\,dx^\mu) = \sum_{i+j=k} (-1)^i\, k!\, \binom{2\lambda + k - 1}{i} \binom{2\mu + k - 1}{j}\, a^{(i)} b^{(j)} \tag{26} \]

(where $a^{(i)} = d^i a/dx^i$). The operations (26) are Gordan's transvectants, see [7] (they have been rediscovered by Rankin [18] and Cohen [1] in the theory of modular functions; see also [2] and [15] for the interesting properties of these operations).

Consider now the commutator on $A_7$. This commutator can be written using transvectants: $\gamma_7(f, g) = cJ_9(f, g)$, where $c$ is a constant,

\[ \left[ f(x)\frac{d}{dx}, g(x)\frac{d}{dx} \right] = J_1(f\,dx^{-1}, g\,dx^{-1}), \qquad L_f b = J_1(f\,dx^{-1}, b\,dx^7). \]

We have the same result for the action $T_7$ given in (7). Namely:

\[ L^{(-\frac{1}{2})}_{f\frac{d}{dx}}\psi = J_1(f\,dx^{-1}, \psi\,dx^{-\frac{1}{2}}), \qquad L^{(\frac{13}{2})}_{f\frac{d}{dx}}\beta = J_1(f\,dx^{-1}, \beta\,dx^{\frac{13}{2}}), \qquad \psi a\,dx^{\frac{13}{2}} = J_0(\psi\,dx^{-\frac{1}{2}}, a\,dx^7). \]

Consider the subalgebra $sl_2(\mathbb{R}) \subset \mathrm{Vect}(S^1)$ generated by the vector fields:

\[ \frac{d}{dx}, \quad x\frac{d}{dx}, \quad x^2\frac{d}{dx}, \]

where $x$ is the affine parameter on $\mathbb{RP}^1 \cong S^1$. For each $k$ (cf. [7]), $J_k$ is the unique $sl_2$-equivariant differential operator of order $k$ on the tensor densities.

Thus the commutator in $A_7$ and the action $T_7$ are given by $sl_2$-equivariant maps. The property of $sl_2$-equivariance implies that the only possible deformation of $T_7$ (see (7')) is given by a complementary term $\bar{s}(f, \psi)$ proportional to $J_8(f\,dx^{-1}, \psi\,dx^{-\frac{1}{2}})$. A simple verification shows that the resulting map $T_7$ does not define an action of the Lie algebra $A_7$ on the space $F_{-\frac{1}{2}} \oplus F_{\frac{13}{2}}$. We have the same result for the map $\widetilde{T}_7$ of type (8') given by $\tilde{s}(f, \psi)$ proportional to $J_9(f\,dx^{-1}, \psi\,dx^{-\frac{1}{2}})$. Proposition 8.1 follows. □

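Appendix 1: The List of the Ovsienko–Roger Algebras

Lie algebra | Vector space | Commutator
$\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ | $\mathrm{Vect}(S^1) \oplus F_0 \oplus \mathbb{R}^3$ | $\bigl( [f, g],\ (fb' - ga')(x),\ \int_{S^1} f'g''(x)\,dx,\ c_0',\ c_0'' \bigr)$
$\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$ | $\mathrm{Vect}(S^1) \oplus F_1 \oplus \mathbb{R}^3$ | $\bigl( [f, g],\ ((fb)' - (ga)')\,dx,\ \int_{S^1} f'g''(x)\,dx,\ c_1',\ c_1'' \bigr)$
$\widehat{A}_1$ | $\mathrm{Vect}(S^1) \oplus F_0 \oplus \mathbb{R}$ | $\bigl( [f, g],\ (fb' - ga')(x) + \gamma_1,\ \int_{S^1} f'g''(x)\,dx \bigr)$
$\widehat{A}_2$ | $\mathrm{Vect}(S^1) \oplus F_1 \oplus \mathbb{R}^2$ | $\bigl( [f, g],\ ((fb)' - (ga)')\,dx + \gamma_2,\ \int_{S^1} f'g''(x)\,dx,\ c_2 \bigr)$
$\widehat{A}_3$ | $\mathrm{Vect}(S^1) \oplus F_1 \oplus \mathbb{R}^2$ | $\bigl( [f, g],\ ((fb)' - (ga)')\,dx + \gamma_3,\ \int_{S^1} f'g''(x)\,dx,\ c_3 \bigr)$
$\widehat{A}_4$ | $\mathrm{Vect}(S^1) \oplus F_2 \oplus \mathbb{R}$ | $\bigl( [f, g],\ L^{(2)}_f b - L^{(2)}_g a + \gamma_4,\ \int_{S^1} f'g''(x)\,dx \bigr)$
$\widehat{A}_5$ | $\mathrm{Vect}(S^1) \oplus F_2 \oplus \mathbb{R}$ | $\bigl( [f, g],\ L^{(2)}_f b - L^{(2)}_g a + \gamma_5,\ \int_{S^1} f'g''(x)\,dx \bigr)$
$\widehat{A}_6$ | $\mathrm{Vect}(S^1) \oplus F_5 \oplus \mathbb{R}$ | $\bigl( [f, g],\ L^{(5)}_f b - L^{(5)}_g a + \gamma_6,\ \int_{S^1} f'g''(x)\,dx \bigr)$
$\widehat{A}_7$ | $\mathrm{Vect}(S^1) \oplus F_7 \oplus \mathbb{R}$ | $\bigl( [f, g],\ L^{(7)}_f b - L^{(7)}_g a + \gamma_7,\ \int_{S^1} f'g''(x)\,dx \bigr)$

$c_0', c_0'', c_1', c_1''$ are given in Sect. 2.1; the 2-cocycles $\gamma_i$ are given in Sect. 2.2.1.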

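Appendix 2: The Superanalogues

Lie algebra | Odd part | Anticommutator
$\widehat{\mathrm{Vect}(S^1) \ltimes F_0}$ | $\widetilde{M}_0 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ | $\bigl( \phi\psi,\ (\psi\alpha + \phi\beta),\ \hat{c},\ \hat{c}_0',\ \hat{c}_0'' \bigr)$
$\widehat{\mathrm{Vect}(S^1) \ltimes F_1}$ | $M_1 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ | $\bigl( \phi\psi,\ -\frac{1}{2}(\psi\alpha + \phi\beta)',\ \hat{c},\ \hat{c}_1',\ \hat{c}_1'' \bigr)$
$\widehat{A}_1$ | $\widetilde{M}_0 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ | $\bigl( \phi\psi,\ (\psi\alpha + \phi\beta) + \tilde\gamma_1,\ \hat{c} \bigr)$
$\widehat{A}_2$ | $M_1 = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ | $\bigl( \phi\psi,\ -\frac{1}{2}(\psi\alpha + \phi\beta)' + \bar\gamma_2,\ \hat{c},\ \hat{c}_2 \bigr)$
$\widehat{A}_3$ | $M_1' = F_{-\frac{1}{2}} \oplus F_{\frac{1}{2}}$ | $\bigl( \phi\psi,\ -\frac{1}{2}(\psi\alpha + \phi\beta)' + \bar\gamma_3,\ \hat{c},\ \hat{c}_3 \bigr)$
$\widehat{A}_4$ | $M_2 = F_{-\frac{1}{2}} \oplus F_{\frac{3}{2}}$ | $\bigl( \phi\psi,\ -\frac{1}{2}(\psi\alpha + \phi\beta)' - (\psi'\alpha + \phi'\beta) + \bar\gamma_4,\ \hat{c} \bigr)$
$\widehat{A}_5$ | $M_2' = F_{-\frac{1}{2}} \oplus F_{\frac{3}{2}}$ | $\bigl( \phi\psi,\ -\frac{1}{2}(\psi\alpha + \phi\beta)' - (\psi'\alpha + \phi'\beta) + \bar\gamma_5,\ \hat{c} \bigr)$
$\widehat{A}_6$ | $\widetilde{M}_5 = F_{-\frac{1}{2}} \oplus F_{\frac{11}{2}}$ | $\bigl( \phi\psi,\ (\psi\alpha + \phi\beta) + \tilde\gamma_6,\ \hat{c} \bigr)$

$\hat{c}, \hat{c}_0', \hat{c}_0'', \hat{c}_1', \hat{c}_1''$ are given in Proposition 7.1; $\hat{c}_2, \hat{c}_3$ are given in Proposition 6.1; $\bar\gamma_i$, $\tilde\gamma_i$ are given in Proposition 5.4.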

Acknowledgements. I am grateful to V. Ovsienko for his constant help and statement of the problem, to C. Roger and C. Duval for their interest in this work and stimulating discussions.

References

1. Cohen, H.: Sums involving the values at negative integers of L functions of quadratic characters. Math. Ann. 217, 181–194 (1975)
2. Cohen, P., Manin, Yu. & Zagier, D.: Automorphic pseudodifferential operators, Algebraic aspects of integrable systems. Prog. Nonlinear Differential Equations Appl., 26, Boston, MA: Birkhäuser Boston, 1997, pp. 17–47
3. Shun-Jen, Ch., Kac, V.G., Minoru, W.: Extensions of conformal modules. In: Topological field theory, primitive forms and related topics (Kyoto, 1996). Progr. Math., 160, Boston, MA: Birkhäuser Boston, 1998, pp. 79–129
4. Shun-Jen, Ch., Kac, V.G.: Conformal modules. Asian J. Math. 2 no. 1, 153–156 (1998)
5. Feigin, B.L. and Leites, D.A.: New Lie Superalgebras of String Theory, Group Theoretical Methods in Physics. v. 1, Moscow 198, pp. 269–278
6. Fuks, D.B.: Cohomology of Infinite Dimensional Lie Algebras. Translated from Russian. New York: Consultants Bureau, 1986
7. Gordan, P.: Invariantentheorie. Leipzig: Teubner, 1887
8. Grozman, P.Ya.: Classification of bilinear invariant operators over tensor fields. Func. Anal. Appl. 14, 58–59 (1980)
9. Grozman, P., Leites, D. and Shchepochkina, I.: Lie superalgebras of string theories. hep-th 9702120
10. Kac, V.G. and Van De Leur, J.W.: On Classification of Superconformal Algebras. In: Strings-88, Singapore: World Scientific, 1989, pp. 77–106
11. Kac, V.G.: Superconformal algebras and transitive group actions on quadrics. Commun. Math. Phys. 186, 233–252 (1997)
12. Kirillov, A.A.: Infinite dimensional Lie groups: Their orbits, invariants and representations. The geometry of moments. Lect. Notes in Math. 970, Berlin–Heidelberg–New York: Springer-Verlag, 1982, pp. 101–123
13. Kirillov, A.A.: Orbits of the group of diffeomorphisms of a circle and local superalgebras. Funct. Anal. Appl. 15 2, 135–137 (1980)
14. Marcel, P., Roger, C., Ovsienko, V.Yu.: Extension of the Virasoro and Neveu–Schwarz algebras and generalized Sturm-Liouville operators. Lett. Math. Phys. 40, 31–39 (1997)
15. Ovsienko, V.Yu.: Exotic deformation quantization. J. Differ. Geom. 45 no. 2, 390–406 (1997)
16. Ovsienko, V.Yu., Roger, C.: Extension of Virasoro group and Virasoro algebra by modules of tensor densities on $S^1$. Funct. Anal. Appl. 31, 4 (1996)
17. Ovsienko, V.Yu., Roger, C.: Generalizations of the Virasoro group and Virasoro algebra through extensions by modules of tensor densities on $S^1$. Indag. Mathem., N.S. 8, 3 (1998)
18. Rankin, R.A.: The construction of automorphic forms from the derivatives of a given form. J. Indian Math. Soc. 20, 103–116 (1956)

Communicated by T. Miwa

Commun. Math. Phys. 207, 307 – 339 (1999)

Communications in Mathematical Physics © Springer-Verlag 1999

Positive Energy Representations of the Loop Groups of Non-Simply Connected Lie Groups

Valerio Toledano Laredo

Institut de Mathématiques de Jussieu, UMR 7586, Case 191, Université Pierre et Marie Curie, 4, Place Jussieu, 75252 Paris Cedex 05, France. E-mail: [email protected]

Received: 7 April 1998 / Accepted: 26 April 1999

Abstract: We classify and construct all irreducible positive energy representations of the loop group of a compact, connected and simple Lie group and show that they admit an intertwining action of $\mathrm{Diff}(S^1)$.

1. Introduction

Let $K$ be a compact, connected and simple Lie group and $LK = C^\infty(S^1, K)$ its loop group. We shall be concerned with the study of positive energy representations of $LK$, i.e., projective unitary representations

\[ \pi : LK \longrightarrow PU(H) = U(H)/\mathbb{T} \tag{1.1} \]

on a Hilbert space $H$ extending to the semi-direct product $LK \rtimes \mathrm{Rot}(S^1)$ in such a way that $\mathrm{Rot}(S^1)$ acts by non-negative characters only and with finite-dimensional eigenspaces. In other words,

\[ H = \bigoplus_{n \ge 0} H(n), \tag{1.2} \]

where $H(n) = \{\xi \in H \mid \pi(R_\theta)\xi = e^{in\theta}\xi\}$, the subspace of energy $n$, supports a finite-dimensional projective representation of $K$. Positive energy representations are completely reducible and, when $K$ is simply connected, have been classified by several authors [PS,Wa]. The irreducible ones are then uniquely determined by their level $\ell \in \mathbb{N}$ and lowest energy subspace $H(0)$. The former classifies the corresponding central extension of $LK$ or, equivalently, the cocycle associated to the infinitesimal representation of its Lie algebra. The latter is an irreducible $K$-module, the highest weight $\lambda$ of which is bound by the requirement that

\[ \langle\lambda, \theta\rangle \le \ell, \tag{1.3} \]


where $\theta$ is the highest root of $K$ and $\langle\cdot,\cdot\rangle$ is the basic inner product of $K$, i.e., the multiple of the Killing form such that $\langle\theta,\theta\rangle = 2$.

The aim of the present paper is to extend the above classification to the case of groups which are not simply connected. Write $K = G/Z$, where $G$ is the universal covering group of $K$ and $\pi_1(K) \cong Z \subseteq Z(G)$. It will be more convenient to consider positive energy representations of the group of discontinuous loops

\[ L_Z G = \{\zeta \in C^\infty(\mathbb{R}, G) \mid \zeta(x + 2\pi)\zeta(x)^{-1} \in Z\} \tag{1.4} \]

deferring to later the determination of those which factor through $LK$. Since $L_Z G/LG \cong Z$, these may be studied with the Mackey machine [Ma1,Ma2], paying however due care to the fact that the representations in question are projective and form a strict subclass of those of $L_Z G$. With these provisos, the analysis carries over essentially unchanged and is dealt with in the following sections. We summarise them below.

In Sect. 3, we classify the central extensions of $L_Z G$ and $L(G/Z)$ by $\mathbb{T}$. We show in particular the existence of an obstruction to the extension of the level $\ell$ central extension of $LG$ to $L_Z G$. This appears only at odd $\ell$ and for some $G$, the complete list of which is given in an appendix. It comprises $SU_n$ with $n$ even. Thus, surprisingly perhaps, $L_{\mathbb{Z}_2} SU_2$, and a fortiori $L\,SO_3$, do not have any odd level positive energy representations. A further obstruction on the level appears when demanding that a given central extension of $L_Z G$ descend to $L(G/Z)$. The level $\ell$ must then be a multiple of a given non-negative integer $\ell_b$, the basic level of $G/Z$, of which we compute the value for all simple groups. It is $n$ for $SU_n/\mathbb{Z}_n$.

In Sect. 4, we show that the category $P_\ell$ of positive energy representations of $LG$ at a given level $\ell$ is closed under conjugation by elements of $L_Z G$ and therefore that $Z \cong L_Z G/LG$ acts on the positive energy dual of $LG$. We also compute the geometric counterpart of this action on the alcove of $G$ which parametrises the irreducibles in $P_\ell$. Aside from rendering the action of $Z$ more explicit, this shows that $Z$ operates by automorphisms of the extended Dynkin diagram of $G$.

In Sect. 5, we compute the Mackey obstruction for a subgroup $Y \subseteq Z$ stabilising a given positive energy representation $H$ of $LG$. This vanishes for most groups since $Y$ is cyclic unless $G/Y = PSO_{4n}$ but, somewhat surprisingly, does not in the latter case.

Section 6 contains our main results.
We construct all irreducible positive energy representations of $L_Z G$ and show that they are classified by the central extension of $L_Z G$ they induce and their isomorphism class as $LG$-modules. Moreover, we prove that they admit an intertwining action of $\mathrm{Diff}(S^1)$ which coincides with the Segal–Sugawara representation obtained by regarding them as positive energy $LG$-modules. Finally, in Sect. 7 we determine those representations which factor through $L(G/Z)$, thereby obtaining all positive energy representations of the latter group. They are exactly those the level of which is a multiple of the basic level of $G/Z$.

Remark. In physical terms, we classify in this paper all inequivalent quantisations of the chiral Wess–Zumino–Witten model with target group $G/Z$. Related results have been obtained in the non-chiral case by Gepner–Witten [GW], Felder–Gawędzki–Kupiainen [FGK1,FGK2] and Gaberdiel [Ga].

2. The Coroot and Coweight Lattices of G

We begin by gathering some well-known properties of the lattices canonically associated to $G$. The present discussion follows [GO]. Throughout this paper, $G$ denotes a compact, connected and simply connected simple Lie group with Lie algebra $\mathfrak{g}$. Let $T \subset G$ be a maximal torus with Lie algebra $\mathfrak{t} \subset \mathfrak{g}$. By the roots of $G$ we shall always mean its infinitesimal roots, namely the set $R$ of linear forms $\alpha \in i\mathfrak{t}^* = \mathrm{Hom}(\mathfrak{t}, i\mathbb{R})$ such that the subspace $\mathfrak{g}_\alpha = \{x \in \mathfrak{g}_{\mathbb{C}} \mid [h, x] = \alpha(h)x \ \forall h \in \mathfrak{t}_{\mathbb{C}}\}$ is non-zero. Let $\Delta = \{\alpha_1, \ldots, \alpha_n\}$ be a basis of $R$ and $\theta$ the corresponding highest root. The basic inner product $\langle\cdot,\cdot\rangle$, i.e. the unique multiple of the Killing form such that $\langle\theta,\theta\rangle = 2$, is positive definite on $i\mathfrak{t}$ and gives an identification $i\mathfrak{t}^* \cong i\mathfrak{t}$ of which we shall make implicit use.

The coroots of $G$ are the elements of $i\mathfrak{t}$ given by $\alpha^\vee = \frac{2\alpha}{\langle\alpha,\alpha\rangle}$. They form the dual root system $R^\vee$. The root and coroot lattices $\Lambda_R \subset i\mathfrak{t}^*$, $\Lambda_R^\vee \subset i\mathfrak{t}$ are the lattices spanned by $R$ and $R^\vee$ respectively. They have $\mathbb{Z}$-basis given by $\Delta$ and $\Delta^\vee = \{\alpha_1^\vee, \ldots, \alpha_n^\vee\}$. Since $\theta$ is a long root and there are at most two root lengths in $R$ with the ratio of the squared length of a long root by that of a short one equal to 2 or 3, rewriting $\alpha^\vee = \frac{\langle\theta,\theta\rangle}{\langle\alpha,\alpha\rangle}\alpha$, we see that $\Lambda_R^\vee \subset \Lambda_R$. Notice that $\langle\alpha^\vee, \alpha^\vee\rangle = \frac{4}{\langle\alpha,\alpha\rangle} = 2\frac{\langle\theta,\theta\rangle}{\langle\alpha,\alpha\rangle}$ so that $\Lambda_R^\vee$ is an even, and therefore integral lattice.

The weight and coweight lattices $\Lambda_W \subset i\mathfrak{t}^*$, $\Lambda_W^\vee \subset i\mathfrak{t}$ are the lattices dual to $\Lambda_R^\vee$ and $\Lambda_R$ respectively. They have $\mathbb{Z}$-basis given by the fundamental (co)weights $\lambda_i$, $\lambda_i^\vee$ defined by

\[ \langle\lambda_i, \alpha_j^\vee\rangle = \langle\lambda_i^\vee, \alpha_j\rangle = \delta_{ij}. \tag{2.1} \]
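The evenness of the coroot lattice noted above rests on a one-line computation that can be checked by hand or, as in the following minimal sketch, with exact rational arithmetic. The function name and the list of root lengths are illustrative choices, not taken from the paper; the three values are the squared root lengths allowed by the normalisation $\langle\theta,\theta\rangle = 2$ and a length ratio of 2 or 3.

```python
from fractions import Fraction

def coroot_norm_squared(root_norm_squared: Fraction) -> Fraction:
    """<a^vee, a^vee> for a^vee = 2a/<a,a>, which equals 4/<a,a>."""
    return Fraction(4) / root_norm_squared

# Squared lengths of roots when <theta, theta> = 2:
# long roots have 2, short roots have 2/2 = 1 or 2/3.
ROOT_NORMS = [Fraction(2), Fraction(1), Fraction(2, 3)]

for norm in ROOT_NORMS:
    value = coroot_norm_squared(norm)
    # Each coroot has even integral squared length.
    print(f"<a,a> = {norm}: <a^vee,a^vee> = {value}")
```

In all three cases the squared length of the coroot is an even integer, which is the statement used in the text.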

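Clearly, $\Lambda_W^\vee \subset \Lambda_W$. Moreover, since $\langle\alpha, \beta^\vee\rangle \in \mathbb{Z}$ for any root $\alpha$ and coroot $\beta^\vee$, we have $\Lambda_R \subset \Lambda_W$ and, dually, $\Lambda_R^\vee \subset \Lambda_W^\vee$. Graphically,

\[
\begin{array}{ccccc}
\Lambda_R & \subset & \Lambda_W & \subset & i\mathfrak{t}^* \\
\cup & & \cup & & \\
\Lambda_R^\vee & \subset & \Lambda_W^\vee & \subset & i\mathfrak{t}.
\end{array}
\tag{2.2}
\]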

Let $Z(G)$ be the centre of $G$ and $\widehat{Z(G)} = \mathrm{Hom}(Z(G), \mathbb{T})$ its Pontryagin dual. The following is well-known.

Lemma 2.1.
(i) The map $e(h) = \exp_T(-2\pi i h)$ induces an isomorphism $\Lambda_W^\vee/\Lambda_R^\vee \cong Z(G)$.
(ii) The pairing $\mu(\exp_T(h)) = e^{\langle\mu,h\rangle}$ induces an isomorphism $\Lambda_W/\Lambda_R \cong \widehat{Z(G)}$.

Remark. When $G$ is simply-laced, i.e. with all roots of equal length, the basic inner product identifies roots and coroots and the vertical inclusions in (2.2) are equalities. Moreover, Lemma 2.1 yields a canonical isomorphism $\widehat{Z(G)} \cong Z(G)$.

The Weyl group $W$ of $G$ is the finite group generated in $\mathrm{End}(i\mathfrak{t}^*)$ by the orthogonal reflections $\sigma_\alpha$ corresponding to the roots $\alpha \in R$. Since

\[ \sigma_\alpha(\mu) = \mu - 2\frac{\langle\mu,\alpha\rangle}{\langle\alpha,\alpha\rangle}\alpha = \mu - \langle\mu,\alpha^\vee\rangle\alpha = \mu - \langle\mu,\alpha\rangle\alpha^\vee, \tag{2.3} \]

the action of $W$ preserves $\Lambda_R^\vee$-cosets in $\Lambda_W^\vee$ and $\Lambda_R$-cosets in $\Lambda_W$. Call $\mu \in \Lambda_W$ (resp. $\mu \in \Lambda_W^\vee$) minimal if it is of minimal length in its $\Lambda_R$ (resp. $\Lambda_R^\vee$)-coset. The following gives a characterisation of minimal (co)weights.


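Proposition 2.1. There is, in each $\Lambda_W^\vee/\Lambda_R^\vee$-coset (resp. $\Lambda_W/\Lambda_R$-coset) a unique $W$-orbit of elements of minimal length. These may equivalently be characterised as those $\lambda$ such that

\[ \langle\lambda,\alpha\rangle \in \{0, \pm 1\} \quad (\text{resp. } \langle\lambda,\alpha^\vee\rangle \in \{0, \pm 1\}) \tag{2.4} \]

for any root $\alpha$ (resp. coroot $\alpha^\vee$).

Proof. It is sufficient to consider the case of $\Lambda_W^\vee/\Lambda_R^\vee$ since $\Lambda_R$, $\Lambda_W$ are the coroot and coweight lattices of the dual root system $R^\vee$. Let $\mu \in \Lambda_W^\vee$ be of minimal length in its $\Lambda_R^\vee$-coset. Then, for any root $\beta$ and corresponding coroot $\beta^\vee = \frac{2\beta}{\langle\beta,\beta\rangle}$, we have $\|\mu \pm \beta^\vee\|^2 \ge \|\mu\|^2$ and, expanding, $|\langle\mu,\beta\rangle| \le 1$. Conversely, assume that $\lambda \in \Lambda_W^\vee$ satisfies (2.4) and $\nu = \lambda \bmod \Lambda_R^\vee$ is of minimal length in its coset. We claim that $w\lambda = \nu$ for an appropriate $w \in W$. To see this, write $\nu = \lambda + \beta_1^\vee + \cdots + \beta_r^\vee$, where the $\beta_i^\vee$ are (possibly repeated) coroots. Clearly, one cannot have $\langle\lambda,\beta_i\rangle \ge 0$ for all $i$, otherwise

\[ \langle\nu,\nu\rangle = \langle\lambda,\lambda\rangle + \Big\langle \sum_i \beta_i^\vee, \sum_i \beta_i^\vee \Big\rangle + 2\sum_i \langle\lambda,\beta_i^\vee\rangle > \langle\lambda,\lambda\rangle \tag{2.5} \]

in contradiction with the minimality of $\nu$. Thus, by (2.4) there exists an $i \in \{1, \ldots, r\}$ such that $\langle\lambda,\beta_i\rangle = -1$ and therefore $\lambda_1 := \sigma_{\beta_i}\lambda = \lambda + \beta_i^\vee$. Moreover, $\lambda_1$ satisfies (2.4) since $W$ permutes the roots and preserves $\langle\cdot,\cdot\rangle$. We may therefore iterate the above step to find a permutation $\tau$ of $\{1, \ldots, r\}$ such that for any $i = 1 \ldots r$,

\[ \lambda_i := \lambda + \beta_{\tau(1)}^\vee + \cdots + \beta_{\tau(i)}^\vee = \sigma_{\beta_{\tau(i)}} \cdots \sigma_{\beta_{\tau(1)}}\lambda. \tag{2.6} \]

In particular, $\nu = \lambda_r \in W\lambda$ whence $\|\lambda\| = \|\nu\|$. □

Recall that a weight $\mu \in \Lambda_W$ is dominant if it lies in the cone

\[ \Lambda_W^+ = \{\nu \in \Lambda_W \mid \langle\nu,\alpha_i^\vee\rangle \ge 0 \ \forall \alpha_i^\vee \in \Delta^\vee\} = \bigoplus_{i=1}^n \lambda_i \cdot \mathbb{N}. \tag{2.7} \]

Since $\Lambda_W^+$ is a fundamental domain for the action of $W$ on $\Lambda_W$, Lemma 2.1 and Proposition 2.1 establish a bijective correspondence between elements in $\widehat{Z(G)}$ and minimal dominant weights. Dually, the elements of $Z(G)$ correspond to the minimal dominant coweights, i.e. those $\mu \in (\Lambda_W^\vee)^+ = \bigoplus_i \lambda_i^\vee \cdot \mathbb{N}$ of minimal length in their $\Lambda_R^\vee$-coset. The following gives another characterisation of minimal dominant coweights.

Lemma 2.2. The non-zero minimal dominant coweights are exactly the fundamental coweights corresponding to special roots, i.e. those $\alpha_i \in \Delta$ bearing the coefficient 1 in the expansion

\[ \theta = \sum_i m_i \alpha_i. \tag{2.8} \]

Proof. By Proposition 2.1, $\mu \in (\Lambda_W^\vee)^+$ is minimal iff $\langle\mu,\theta\rangle \le 1$. Indeed, for any positive root $\alpha$, we get $0 \le \langle\mu,\alpha\rangle \le \langle\mu,\theta\rangle - \langle\mu,\theta - \alpha\rangle \le \langle\mu,\theta\rangle$. Since $\langle\mu,\theta\rangle = 0$ implies $\mu = 0$, the non-zero minimal dominant coweights are those $\mu \in (\Lambda_W^\vee)^+$ such that $\langle\mu,\theta\rangle = 1$. Writing $\mu = \sum_i k_i\lambda_i^\vee$, $k_i \ge 0$ and using (2.8), we find $\langle\mu,\theta\rangle = \sum_i k_i m_i$. Since $\theta - \alpha_i$ is a sum of positive roots, $m_i \ge 1$ for any $i$ and the result follows. □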
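Lemma 2.2, combined with the correspondence between $Z(G)$ and minimal dominant coweights, predicts that the number of special roots is $|Z(G)| - 1$. The sketch below checks this on a few types. The table of highest root coefficients (in Bourbaki's labelling) and of orders of the centre is standard data supplied here for illustration; it is not part of the paper, and the names are hypothetical.

```python
# Coefficients m_i of theta = sum m_i alpha_i (Bourbaki labelling) and the
# order of the centre Z(G) of the simply connected group of that type.
HIGHEST_ROOT = {
    "A4": ([1, 1, 1, 1], 5),
    "B4": ([1, 2, 2, 2], 2),
    "C4": ([2, 2, 2, 1], 2),
    "D5": ([1, 2, 2, 1, 1], 4),
    "E6": ([1, 2, 2, 3, 2, 1], 3),
    "E7": ([2, 2, 3, 4, 3, 2, 1], 2),
    "E8": ([2, 3, 4, 6, 5, 4, 3, 2], 1),
    "F4": ([2, 3, 4, 2], 1),
    "G2": ([3, 2], 1),
}

def special_nodes(coefficients):
    """1-based indices i with m_i = 1, i.e. the special simple roots."""
    return [i + 1 for i, m in enumerate(coefficients) if m == 1]

for name, (coefficients, centre_order) in HIGHEST_ROOT.items():
    nodes = special_nodes(coefficients)
    # Non-zero minimal dominant coweights <-> non-trivial central elements.
    print(name, "special nodes:", nodes, "|Z(G)| - 1 =", centre_order - 1)
```

For each type the number of special nodes agrees with the number of non-trivial elements of the centre.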


3. Central Extensions of $L_Z G$

This section is devoted to the study of the central extensions by $\mathbb{T}$ of the group of discontinuous loops

\[ L_Z G = \{\zeta \in C^\infty(\mathbb{R}, G) \mid \zeta(x + 2\pi)\zeta(x)^{-1} \in Z\} \tag{3.1} \]

corresponding to a subgroup $Z \subseteq Z(G)$. These are uniquely determined by their restrictions to $LG = C^\infty(S^1, G)$ and to $\mathrm{Hom}(\mathbb{T}, T/Z)$, the integral lattice of $G/Z$. The former are classified by their level $\ell \in \mathbb{Z}$ [PS] and the latter by their commutator map, a $\mathbb{T}$-valued, skew-symmetric bilinear form $\omega$ on $\mathrm{Hom}(\mathbb{T}, T/Z)$. We shall prove below that $\ell$ and $\omega$ are bound by the requirement that

\[ \omega(\lambda,\mu) = (-1)^{\ell\langle\lambda,\mu\rangle} \tag{3.2} \]

whenever $\lambda$ lies in the coroot lattice $\mathrm{Hom}(\mathbb{T}, T)$. Thus, central extensions of $LG$ do not necessarily extend to $L_Z G$ since a suitable $\omega$ satisfying (3.2) for a given $\ell$ need not exist. In particular, $L_{\mathbb{Z}_2} SU_2$, and more generally $L_{\mathbb{Z}_{2n}} SU_{2n}$, do not possess central extensions of odd level. For compatible $\ell$ and $\omega$, we construct the corresponding central extension of $L_Z G$ and show that the action of $\mathrm{Diff}_+(S^1)$ on $L_Z G$ lifts uniquely to it. The classification of central extensions of $L(G/Z)$ follows easily from this and is described at the end of this section.

3.1. Central extensions of LG. We begin by reviewing the construction of central extensions of $LG$, and more generally of a connected and simply connected (possibly infinite-dimensional) Lie group $\mathcal{G}$, following chapter 4 of [PS]. All central extensions considered in this section are understood to be smooth and have $\mathbb{T}$ as their extending group. Let $\mathcal{L}$ be the Lie algebra of $\mathcal{G}$ and $\beta$ a two-cocycle on $\mathcal{L}$, i.e., a continuous, skew-symmetric, bilinear map $\beta : \mathcal{L} \times \mathcal{L} \to \mathbb{R}$ satisfying

\[ \beta([X,Y],Z) + \beta([Y,Z],X) + \beta([Z,X],Y) = 0. \tag{3.3} \]

$\beta$ may be regarded as a right-invariant, closed two-form on $\mathcal{G}$ and we assume that $(2\pi)^{-1}\beta$ is integral, i.e., such that its integral over any two-cycle in $\mathcal{G}$ is an integer. Then, there exists a unique central extension

\[ 1 \to \mathbb{T} \to \tilde{\mathcal{G}} \xrightarrow{\ \pi\ } \mathcal{G} \to 1, \tag{3.4} \]

the Lie algebra of which is $\tilde{\mathcal{L}} = \mathcal{L} \oplus i\mathbb{R}$ with bracket

\[ [X \oplus it, Y \oplus is] = [X,Y] \oplus i\beta(X,Y). \tag{3.5} \]

$\tilde{\mathcal{G}}$ may be constructed using the following path group description. Assume $\tilde{\mathcal{G}}$ exists and regard it as a principal $\mathbb{T}$-bundle over $\mathcal{G}$ with connection given by the splitting $\tilde{\mathcal{L}} = \mathcal{L} \oplus i\mathbb{R}$. In other words, the horizontal subspace at $\tilde{g} \in \tilde{\mathcal{G}}$ is $\mathcal{L}\tilde{g}$. The pull-back of $\tilde{\mathcal{G}}$ to the space

\[ P\mathcal{G} = \{p : I \to \mathcal{G} \mid p(0) = 1\} \tag{3.6} \]

of piece-wise smooth paths via the end-point fibration $e : P\mathcal{G} \to \mathcal{G}$ is topologically trivial, the identification of the fibre at the constant path 1 with that at $p$ being simply given by parallel transport along $p$. Explicitly, if $X = \dot{p}p^{-1} : I \to \mathcal{L}$ is the right logarithmic derivative of $p$, the identification maps $z \in \mathbb{T} = \pi^{-1}(1)$ to the end point of the path $\tilde{p}$ in $\tilde{\mathcal{G}}$ obtained by solving $\dot{\tilde{p}} = X\tilde{p}$, $\tilde{p}(0) = z$. If $p$ is closed, and therefore contractible in $\mathcal{G}$, the corresponding identification is simply multiplication by the holonomy $e^{i\int_\sigma \beta}$, where $\sigma$ is any two-cycle in $\mathcal{G}$ with boundary $p$. The concatenation of pointed paths defined by

\[ p \vee q(t) = \begin{cases} q(2t) & \text{if } 0 \le t \le \tfrac{1}{2}, \\ p(2t-1)q(1) & \text{if } \tfrac{1}{2} \le t \le 1 \end{cases} \tag{3.7} \]

induces a monoidal structure on $P\mathcal{G}$ which, combined with the group law on $\tilde{\mathcal{G}}$, makes $e^*\tilde{\mathcal{G}}$ a monoid. The crucial feature of the corresponding multiplication law is that it becomes the canonical one when transported to $P\mathcal{G} \times \mathbb{T} \cong e^*\tilde{\mathcal{G}}$, a direct consequence of the $\tilde{\mathcal{G}}$-invariance of the connection on $\tilde{\mathcal{G}}$. It follows that, as a group, $\tilde{\mathcal{G}}$ may be described, or indeed defined, as the quotient of $P\mathcal{G} \times \mathbb{T}$ with law $(p,z) \star (q,w) = (p \vee q, zw)$ by the equivalence relation

\[ (p,z) \sim (q,w) \iff p(1) = q(1) \ \text{and}\ e^{i\int_\sigma \beta} = w\bar{z}, \tag{3.8} \]

where $\sigma$ is a two-cycle with boundary $p \vee \check{q}$ and $\check{q}(t) = q(1-t)q(1)^{-1}$.

Lemma 3.1. An automorphism $A$ of $\mathcal{G}$ lifts to $\tilde{\mathcal{G}}$ if, and only if it leaves the cohomology class of $\beta$ invariant, i.e. iff there exists a linear map $F : \mathcal{L} \to \mathbb{R}$ such that for any $X, Y \in \mathcal{L}$,

\[ \beta(AX, AY) = \beta(X,Y) + F([X,Y]). \tag{3.9} \]

The lift is then unique up to multiplication by a character of $\mathcal{G}$ and is given infinitesimally by

\[ \tilde{A}(X \oplus it) = AX \oplus i(F(X) + t) \tag{3.10} \]

and in the path group description of $\tilde{\mathcal{G}}$ by

\[ \tilde{A}(p,z) = (Ap, ze^{i\int_p F}), \tag{3.11} \]

where $F$ is regarded as a right-invariant one-form, so that $\int_p F = \int_0^1 F(\dot{p}p^{-1})$.

Proof. The necessity of (3.9) is straightforward. Indeed, a lift $\tilde{A}$ acts on $\tilde{\mathcal{L}}$ by $\tilde{A}(X \oplus it) = AX \oplus i(G(X) + t)$ for some linear map $G : \mathcal{L} \to \mathbb{R}$. Requiring that $\tilde{A}[X \oplus i, Y \oplus i] = [\tilde{A}(X \oplus i), \tilde{A}(Y \oplus i)]$ and expanding both members yields

\[ A[X,Y] \oplus i(G([X,Y]) + \beta(X,Y)) = [AX, AY] \oplus i\beta(AX, AY) \tag{3.12} \]

and therefore (3.9). Conversely, (3.11) is a well-defined lift of $A$. Indeed, when regarded as an identity between right-invariant forms in $\mathcal{G}$, (3.9) reads $A^*\beta = \beta - dF$. It follows that if $(p,z) \sim (q,w)$ and $\sigma$ is a two-cycle in $\mathcal{G}$ with $\partial\sigma = p \vee \check{q}$, then $\partial A\sigma = Ap \vee (Aq)^{\vee}$ and

\[ e^{i\int_{A\sigma}\beta} = e^{i\int_\sigma A^*\beta} = e^{i\int_\sigma \beta}\, e^{-i\int_p F}\, e^{i\int_q F} = we^{i\int_q F}\,\overline{ze^{i\int_p F}} \tag{3.13} \]

so that $(Ap, ze^{i\int_p F}) \sim (Aq, we^{i\int_q F})$. The uniqueness of $\tilde{A}$ is clear for if $\tilde{A}_i$, $i = 1, 2$ are two lifts of $A$, then $\tilde{A}_2\tilde{A}_1^{-1}$ is a lift of the identity and fixes $\mathbb{T}$ so that it is given by $\chi \circ \pi$ for some $\chi \in \mathrm{Hom}(\mathcal{G}, \mathbb{T})$. □
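The concatenation rule (3.7) and the reversed path $\check{q}$ used in the path group description can be made concrete in a small sketch. For simplicity the group is taken to be the circle group of unit complex numbers, which is an assumption made for illustration only (the construction in the text concerns a general, typically non-abelian, group); the function names are hypothetical.

```python
import cmath
from typing import Callable

Path = Callable[[float], complex]  # a path t -> p(t) in the circle group, p(0) = 1

def concat(p: Path, q: Path) -> Path:
    """Concatenation (3.7): traverse q first, then p translated by q(1)."""
    def pq(t: float) -> complex:
        return q(2 * t) if t <= 0.5 else p(2 * t - 1) * q(1.0)
    return pq

def reverse(q: Path) -> Path:
    """The path t -> q(1 - t) q(1)^{-1}, which again starts at 1."""
    def q_check(t: float) -> complex:
        return q(1.0 - t) / q(1.0)
    return q_check

# Two paths with the same end point but different winding.
p: Path = lambda t: cmath.exp(1j * 0.7 * t)
q: Path = lambda t: cmath.exp(1j * (0.7 + 2 * cmath.pi) * t)

# When p(1) = q(1), the concatenation of p with the reverse of q is a closed loop.
loop = concat(p, reverse(q))
print(abs(loop(0.0) - 1), abs(loop(1.0) - 1))
```

The end point of the concatenation is $p(1)q(1)$, and the loop built from $p$ and $\check{q}$ closes up exactly when the two paths share their end point, which is the situation in which the holonomy in (3.8) is evaluated.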


Remark. The phase factor in (3.11) may be derived from (3.10) as follows. For any $p \in P\mathcal{G}$, denote by $\tilde{p}$ its unique horizontal lift through $1 \in \tilde{\mathcal{G}}$ so that $\dot{\tilde{p}} = \dot{p}p^{-1}\tilde{p}$ and $\tilde{p}(0) = 1$. Then $Q(t) = \tilde{A}\tilde{p}(t)$ satisfies

\[ \dot{Q} = \tilde{A}(\dot{p}p^{-1})Q = (Ap)^{\cdot}(Ap)^{-1}Q + iF(\dot{p}p^{-1})Q. \tag{3.14} \]

Set $Q(t) = \phi(t)\widetilde{Ap}(t)$, where $\phi(t) \in \mathbb{T}$, then (3.14) reduces to $\dot\phi = iF(\dot{p}p^{-1})\phi$ and therefore $\phi(t) = e^{i\int_0^t F(\dot{p}p^{-1})d\tau}$. Conversely, (3.10) may be obtained from (3.11) by taking $p$ as the path $s \to \exp_{\mathcal{G}}(stX)$ and differentiating at $t = 0$.

Let now $\mathcal{G} = LG = C^\infty(S^1, G)$ with Lie algebra $L\mathfrak{g} = C^\infty(S^1, \mathfrak{g})$. The basic inner product $\langle\cdot,\cdot\rangle$ on $\mathfrak{g}$ determines a right-invariant, closed two-form on $LG$ given by

\[ B(X,Y) = \int_0^{2\pi} \langle X, \dot{Y}\rangle \frac{d\theta}{2\pi} \tag{3.15} \]

and such that $(2\pi)^{-1}B$ is integral [PS, Thm. 4.4.1]. For any $\ell \in \mathbb{Z}$, denote by $\widetilde{LG}^\ell$ and $\widetilde{L\mathfrak{g}}^\ell$ the central extension of $LG$ corresponding to $\ell B$ and its Lie algebra. Then, any central extension of $LG$ is isomorphic to some $\widetilde{LG}^\ell$ for a uniquely determined $\ell \in \mathbb{Z}$ called its level [PS, Thm. 4.4.1]. Since $B$ is invariant under the action of the group $\mathrm{Diff}_+(S^1)$ of orientation-preserving diffeomorphisms of $S^1$ given by $\phi\gamma = \gamma \circ \phi^{-1}$, and $\mathrm{Hom}(LG, \mathbb{T}) = \{1\}$ [PS, Prop. 3.4.1], this action lifts uniquely to any central extension of $LG$. Similarly, the action by conjugation of $L_Z G$ on $LG$ lifts to any $\widetilde{LG}^\ell$. Indeed, for $\zeta \in L_Z G$ we have

\[
\begin{aligned}
B(\zeta X\zeta^{-1}, \zeta Y\zeta^{-1}) &= \int_0^{2\pi} \langle\zeta X\zeta^{-1}, \zeta\dot{Y}\zeta^{-1}\rangle \frac{d\theta}{2\pi} + \int_0^{2\pi} \langle\zeta X\zeta^{-1}, \zeta[\zeta^{-1}\dot\zeta, Y]\zeta^{-1}\rangle \frac{d\theta}{2\pi} \\
&= B(X,Y) - \int_0^{2\pi} \langle\zeta^{-1}\dot\zeta, [X,Y]\rangle \frac{d\theta}{2\pi},
\end{aligned}
\tag{3.16}
\]

where we used the Ad-invariance of $\langle\cdot,\cdot\rangle$ and the fact that $(\zeta^{-1})^{\cdot}\zeta + \zeta^{-1}\dot\zeta = (\zeta^{-1}\zeta)^{\cdot} = 0$. Therefore, by Lemma 3.1, on $\widetilde{L\mathfrak{g}}^\ell$,

\[ \widetilde{\mathrm{Ad}(\zeta)}\,(X \oplus it) = \zeta X\zeta^{-1} \oplus i\Big(t - \ell\int_0^{2\pi} \langle\zeta^{-1}\dot\zeta, X\rangle \frac{d\theta}{2\pi}\Big). \tag{3.17} \]

In particular, the adjoint action of $\widetilde{LG}^\ell$ factors through $LG \subset L_Z G$ and, by the uniqueness of lifts, is given by (3.17).

3.2. The compatibility requirement. By Lemma 2.1, the subgroup $Z \subseteq Z(G)$ is isomorphic to $\Lambda_Z^\vee/\Lambda_R^\vee$, where $\Lambda_R^\vee \subset \Lambda_Z^\vee \subset \Lambda_W^\vee$ is the integral lattice of $G/Z$, i.e. $\Lambda_Z^\vee \cong \mathrm{Hom}(\mathbb{T}, T/Z)$. We will regard $\Lambda_Z^\vee$ as a subgroup of $L_Z G$ by associating to $\mu \in \Lambda_Z^\vee$ the discontinuous loop $\zeta_\mu(\theta) = \exp_T(-i\theta\mu)$. Since any character of $\Lambda_R^\vee$ extends to $\Lambda_Z^\vee$, the connecting homomorphism in the five term sequence

\[ \mathrm{Hom}(\Lambda_Z^\vee, \mathbb{T}) \to \mathrm{Hom}(\Lambda_R^\vee, \mathbb{T}) \to H^2(Z, \mathbb{T}) \to H^2(\Lambda_Z^\vee, \mathbb{T}) \tag{3.18} \]

is the zero map. Using the five term sequence for the inclusion $LG \subset L_Z G$ and the fact that $\mathrm{Hom}(LG, \mathbb{T}) = \{1\}$ [PS, Prop. 3.4.1], we therefore obtain the following commutative diagram with exact row:

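\[
\begin{array}{ccccccc}
 & & 0 & & & & \\
 & & \downarrow & & & & \\
0 & \to & H^2(Z, \mathbb{T}) & \to & H^2(L_Z G, \mathbb{T}) & \to & H^2(LG, \mathbb{T}), \\
 & & \downarrow & \swarrow & & & \\
 & & H^2(\Lambda_Z^\vee, \mathbb{T}) & & & &
\end{array}
\tag{3.19}
\]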

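which shows that a central extension of $L_Z G$ by $\mathbb{T}$ is entirely determined by its restrictions to $LG$ and to $\Lambda_Z^\vee$. The former is classified by its level $\ell$ and the latter by its commutator map $\omega$ defined by

\[ \omega(\lambda,\mu) = \tilde\zeta_\lambda\tilde\zeta_\mu\tilde\zeta_\lambda^{-1}\tilde\zeta_\mu^{-1}, \tag{3.20} \]

where $\tilde\zeta_\lambda, \tilde\zeta_\mu \in \widetilde{L_Z G}$ are arbitrary lifts of $\zeta_\lambda$, $\zeta_\mu$. $\omega$ is a skew-symmetric, $\mathbb{T}$-valued, $\mathbb{Z}$-bilinear form on $\Lambda_Z^\vee$. Since the identification $\Lambda_Z^\vee \cong \mathrm{Hom}(\mathbb{T}, T/Z)$ maps $\Lambda_R^\vee$ to $\mathrm{Hom}(\mathbb{T}, T) \subset LG$, $\omega$ is bound by the requirement that $\omega(\alpha,\beta) = (-1)^{\ell\langle\alpha,\beta\rangle}$ whenever $\alpha, \beta \in \Lambda_R^\vee$ [PS, Prop. 4.8.1]. We shall presently establish that $\omega$ is constrained by a more stringent identity, the proof of which gives an alternative derivation of Proposition 4.8.1 of [PS].

Theorem 3.1. Let $\widetilde{L_Z G}$ be a central extension of $L_Z G$ by $\mathbb{T}$, the restrictions to $LG$ and $\Lambda_Z^\vee$ of which have level $\ell$ and commutator map $\omega$ respectively. Then, for any $\lambda \in \Lambda_R^\vee$ and $\mu \in \Lambda_Z^\vee$,

\[ \omega(\mu,\lambda) = (-1)^{\ell\langle\mu,\lambda\rangle}. \tag{3.21} \]

Theorem 3.1 is an immediate corollary of the following

Proposition 3.1. For any $\mu \in \Lambda_W^\vee$, denote by $\tilde{A}_\mu$ the unique lift of the conjugation action of $\zeta_\mu$ on $LG$ to $\widetilde{LG}^\ell$. Then, for any $\lambda \in \Lambda_R^\vee$ and lift $\tilde\zeta_\lambda \in \widetilde{LG}^\ell$ of $\zeta_\lambda$,

\[ \tilde{A}_\mu(\tilde\zeta_\lambda) = (-1)^{\ell\langle\mu,\lambda\rangle}\tilde\zeta_\lambda. \tag{3.22} \]

Proof of Theorem 3.1. Let $\tilde\zeta_\lambda, \tilde\zeta_\mu \in \widetilde{L_Z G}$ be lifts of $\zeta_\lambda$ and $\zeta_\mu$ respectively. By Lemma 3.1, $\mathrm{Ad}(\tilde\zeta_\mu) = \tilde{A}_\mu$ as automorphisms of $\widetilde{LG}^\ell \cong \widetilde{L_Z G}\big|_{LG}$ since both are lifts of $\mathrm{Ad}(\zeta_\mu)$ and $\mathrm{Hom}(LG, \mathbb{T}) = \{1\}$. Thus, by (3.20) and (3.22)

\[ \omega(\mu,\lambda) = \tilde{A}_\mu(\tilde\zeta_\lambda)\tilde\zeta_\lambda^{-1} = (-1)^{\ell\langle\mu,\lambda\rangle}. \tag{3.23} \]

□

Proof of Proposition 3.1. Since $\zeta_\mu\zeta_\lambda\zeta_\mu^{-1} = \zeta_\lambda$ in $LG$, the left-hand side of (3.22) is a lift of $\zeta_\lambda$ and is equal to $\omega(\mu,\lambda)\tilde\zeta_\lambda$, where $\omega(\mu,\lambda) \in \mathbb{T}$ is independent of the choice of the lift $\tilde\zeta_\lambda$ and $\omega$ is bilinear in $\mu$, $\lambda$. Moreover, if $\mu \in \Lambda_R^\vee$, Lemma 3.1 implies that $\tilde{A}_\mu = \mathrm{Ad}(\tilde\zeta_\mu)$ and $\omega$ is therefore skew-symmetric when restricted to $\Lambda_R^\vee \times \Lambda_R^\vee$. We begin by establishing that

\[ \omega(\mu,\lambda) = (-1)^{\ell\langle\mu,\lambda\rangle} \tag{3.24} \]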


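when $\lambda = \alpha^\vee$ is the coroot corresponding to a positive root $\alpha$ and $\mu \in \Lambda_W^\vee$ is such that $\langle\alpha,\mu\rangle \in \{0, 1\}$. The loop $\zeta_{\alpha^\vee}(\theta) = \exp_T(-i\theta\alpha^\vee)$ may then be written as a product of two exponentials in $LG$ [PS, 4.8.1], namely

\[ \zeta_{\alpha^\vee} = \exp_{LG}\Big(-\frac{\pi}{2}(e_\alpha(0) - f_\alpha(0))\Big)\exp_{LG}\Big(\frac{\pi}{2}(e_\alpha(1) - f_\alpha(-1))\Big). \tag{3.25} \]

Here, using standard notation, $e_\alpha$, $f_\alpha$ and $h_\alpha = \alpha^\vee$ span the $\mathfrak{sl}_2(\mathbb{C})$-subalgebra of $\mathfrak{g}_{\mathbb{C}}$ corresponding to $\alpha$ and, for any $x \in \mathfrak{g}_{\mathbb{C}}$ and $n \in \mathbb{N}$, $x(n) = x \otimes e^{in\theta} \in L\mathfrak{g}_{\mathbb{C}}$. To see that (3.25) holds, consider the homomorphism $\sigma_\alpha : SU_2 \to G$ mapping the standard basis of $\mathfrak{sl}_2(\mathbb{C})$ given by

\[ e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{3.26} \]

to $\{e_\alpha, f_\alpha, h_\alpha\}$. This induces a homomorphism $LSU_2 \to LG$ sending

\[ \theta \longrightarrow \begin{pmatrix} e^{-i\theta} & 0 \\ 0 & e^{i\theta} \end{pmatrix} \tag{3.27} \]

to $\zeta_{\alpha^\vee}$ and (3.25) reduces to a simple matrix check. If $h \in \mathfrak{t}$, then $[h, e_\alpha] = \langle h,\alpha\rangle e_\alpha$ whence $\mathrm{Ad}(\exp_T(h))e_\alpha = \exp(\mathrm{ad}(h))e_\alpha = e^{\langle h,\alpha\rangle}e_\alpha$. Therefore, since $\zeta_\mu(\theta) = \exp_T(-i\theta\mu)$, we have

\[ \zeta_\mu e_\alpha(n)\zeta_\mu^{-1}(\theta) = e_\alpha \otimes e^{i\theta(n - \langle\alpha,\mu\rangle)} = e_\alpha(n - \langle\alpha,\mu\rangle)(\theta) \tag{3.28} \]

so that, in $L\mathfrak{g}$,

\[ \zeta_\mu e_\alpha(n)\zeta_\mu^{-1} = e_\alpha(n - \langle\alpha,\mu\rangle), \tag{3.29} \]

\[ \zeta_\mu f_\alpha(n)\zeta_\mu^{-1} = f_\alpha(n + \langle\alpha,\mu\rangle). \tag{3.30} \]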
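The "simple matrix check" behind (3.25) can be carried out numerically in $SU_2$. The sketch below evaluates the right-hand side of (3.25) pointwise in $\theta$, using the basis (3.26) and a truncated power series for the matrix exponential, and compares it with the diagonal loop (3.27). The helper names and the truncation order are illustrative assumptions, not part of the paper.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=40):
    """Matrix exponential by its power series (ample for these 2x2 inputs)."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def e_minus_f(n, m, theta):
    """The loop e(n) - f(m) evaluated at theta, in the basis (3.26)."""
    return [[0, cmath.exp(1j * n * theta)], [-cmath.exp(1j * m * theta), 0]]

def zeta(theta):
    """Right-hand side of (3.25), evaluated pointwise in SU_2."""
    first = expm(scale(-math.pi / 2, e_minus_f(0, 0, theta)))
    second = expm(scale(math.pi / 2, e_minus_f(1, -1, theta)))
    return matmul(first, second)

theta = 0.9
product = zeta(theta)
# Compare with the diagonal loop (3.27): diag(e^{-i theta}, e^{i theta}).
print(abs(product[0][0] - cmath.exp(-1j * theta)), abs(product[0][1]))
```

Both printed quantities vanish up to rounding, in agreement with (3.27).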

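Since $\zeta_\mu^{-1}\dot\zeta_\mu = -i\mu \in \mathfrak{t}$ and this subspace is orthogonal to $\mathbb{C}e_\alpha \oplus \mathbb{C}f_\alpha$ with respect to the Killing form, no correction term arises from (3.17) and the same holds in $\widetilde{L\mathfrak{g}}^\ell$. It follows that (3.24) holds if $\langle\alpha,\mu\rangle = 0$. If, on the other hand, $\langle\alpha,\mu\rangle = 1$, then

\[
\begin{aligned}
\tilde{A}_\mu(\tilde\zeta_{\alpha^\vee})\,\tilde\zeta_{\alpha^\vee}^{-1}
&= \exp_{\widetilde{LG}^\ell}\Big(-\frac{\pi}{2}(e_\alpha(-1) - f_\alpha(1))\Big)\exp_{\widetilde{LG}^\ell}\Big(\frac{\pi}{2}(e_\alpha(0) - f_\alpha(0))\Big) \\
&\quad \cdot \exp_{\widetilde{LG}^\ell}\Big(-\frac{\pi}{2}(e_\alpha(1) - f_\alpha(-1))\Big)\exp_{\widetilde{LG}^\ell}\Big(\frac{\pi}{2}(e_\alpha(0) - f_\alpha(0))\Big) \\
&= \exp_{\widetilde{LG}^\ell}\Big(-\frac{\pi}{2}(e_\alpha(-1) - f_\alpha(1))\Big) \\
&\quad \cdot \exp_{\widetilde{LG}^\ell}\Big(-\frac{\pi}{2}\,\mathrm{Ad}\Big(\exp_{LG}\Big(\frac{\pi}{2}(e_\alpha(0) - f_\alpha(0))\Big)\Big)(e_\alpha(1) - f_\alpha(-1))\Big) \\
&\quad \cdot \exp_{\widetilde{LG}^\ell}\big(\pi(e_\alpha(0) - f_\alpha(0))\big).
\end{aligned}
\tag{3.31}
\]

As is readily checked using $\sigma_\alpha$, we have

\[ \mathrm{Ad}\Big(\exp_{LG}\Big(\frac{\pi}{2}(e_\alpha(0) - f_\alpha(0))\Big)\Big)(e_\alpha(1) - f_\alpha(-1)) = e_\alpha(-1) - f_\alpha(1). \tag{3.32} \]

Moreover, since we are conjugating by a constant loop, no correction term arises from (3.17) and (3.31) is therefore equal to

\[ \exp_{\widetilde{LG}^\ell}\big(-\pi(e_\alpha(-1) - f_\alpha(1))\big)\exp_{\widetilde{LG}^\ell}\big(\pi(e_\alpha(0) - f_\alpha(0))\big). \tag{3.33} \]

To proceed, we seek to diagonalise the above elements. This may be done in $LSU_2$ using the identity

\[ e_\alpha(-m) - f_\alpha(m) = V(m)\,ih_\alpha(0)\,V(m)^*, \tag{3.34} \]

where $V(m) \in LSU_2$ is given by $\theta \longrightarrow \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & ie^{-im\theta} \\ ie^{im\theta} & 1 \end{pmatrix}$. Since

\[ V^{-1}(m)\dot{V}(m) = \frac{m}{2}\big(ih_\alpha(0) + e_\alpha(-m) - f_\alpha(m)\big), \tag{3.35} \]

we have

\[ \int_0^{2\pi} \langle V^{-1}(m)\dot{V}(m), ih_\alpha(0)\rangle \frac{d\theta}{2\pi} = -\frac{m}{2}\|h_\alpha\|^2 = -m\frac{2}{\langle\alpha,\alpha\rangle}, \tag{3.36} \]

and therefore, using (3.17), (3.33) is equal to

\[ e^{-i\pi\ell\frac{2}{\langle\alpha,\alpha\rangle}}\;\widetilde{V(1)}\exp_{\widetilde{LG}^\ell}(-i\pi h_\alpha(0))\widetilde{V(1)}^{-1}\;\widetilde{V(0)}\exp_{\widetilde{LG}^\ell}(i\pi h_\alpha(0))\widetilde{V(0)}^{-1}, \tag{3.37} \]

where $\widetilde{V(0)}$, $\widetilde{V(1)}$ are arbitrary lifts of $V(0)$, $V(1)$ in $\widetilde{LG}^\ell$. Since $\exp_{SU_2}(-i\pi h_\alpha(0)) = -1$ lies in the centre of any central extension of $LSU_2$, the above is equal to $(-1)^{\ell\frac{2}{\langle\alpha,\alpha\rangle}} = (-1)^{\ell\langle\alpha^\vee,\mu\rangle}$ and (3.24) holds if $\langle\alpha,\mu\rangle = 1$.

Let now $\lambda = \alpha^\vee$ and $\mu = \beta^\vee$ be coroots. Then, either $|\langle\alpha,\beta^\vee\rangle| \le 1$ or $|\langle\alpha^\vee,\beta\rangle| \le 1$ [Hu, Table 1, p. 45]. Using the bilinearity and skew-symmetry of both sides of (3.24), we may assume, up to a permutation and a sign change, that $\alpha$ is positive and that $\langle\alpha,\beta^\vee\rangle \in \{0, 1\}$ so that, by the computation above, (3.24) holds whenever $\lambda$ and $\mu$ lie in the coroot lattice. To complete the proof, it is sufficient to check (3.24) when $\lambda = \alpha^\vee$ is a positive coroot and $\mu$ varies in a set of representatives of $\Lambda_R^\vee$-cosets in $\Lambda_W^\vee$. A convenient choice is given by the minimal dominant coweights. If $\mu$ is one such then, by Proposition 2.1, $\langle\mu,\alpha\rangle \in \{0, 1\}$ and therefore (3.24) holds by our previous computation. □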
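The two $SU_2$ identities used in the proof, (3.32) and (3.34), are pointwise matrix statements and can be checked numerically. The sketch below assumes the basis (3.26) and uses the elementary fact that $\exp\big(\tfrac{\pi}{2}(e - f)\big) = e - f$, since $(e - f)^2 = -1$; all helper names are illustrative.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def e_minus_f(n, m, theta):
    """The loop e(n) - f(m) evaluated at theta, in the basis (3.26)."""
    return [[0, cmath.exp(1j * n * theta)], [-cmath.exp(1j * m * theta), 0]]

def V(m, theta):
    """The loop V(m) of (3.34) evaluated at theta."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s * cmath.exp(-1j * m * theta)],
            [1j * s * cmath.exp(1j * m * theta), s]]

I_H = [[1j, 0], [0, -1j]]          # i h
J = [[0, 1], [-1, 0]]              # exp((pi/2)(e - f)) = e - f
J_INV = [[0, -1], [1, 0]]

theta = 1.3
# (3.34): V(m) (i h) V(m)^* = e(-m) - f(m)
lhs_334 = matmul(matmul(V(2, theta), I_H), dagger(V(2, theta)))
# (3.32): Ad(exp((pi/2)(e(0) - f(0)))) (e(1) - f(-1)) = e(-1) - f(1)
lhs_332 = matmul(matmul(J, e_minus_f(1, -1, theta)), J_INV)
print(close(lhs_334, e_minus_f(-2, 2, theta)), close(lhs_332, e_minus_f(-1, 1, theta)))
```

Both comparisons succeed for any $\theta$ and any integer $m$, as the identities require.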

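3.3. Construction of central extensions of $L_Z G$. Define the level $\ell$ and commutator map $\omega$ of a central extension $\widetilde{L_Z G}$ of $L_Z G$ by restriction to $LG$ and $\Lambda_Z^\vee$ respectively. By Theorem 3.1, no such $\widetilde{L_Z G}$ exists unless $\ell$ and $\omega$ are compatible, i.e., satisfy (3.21). In particular, $L_{\mathbb{Z}_2}SU_2$, and a fortiori $LSO_3$, do not possess any central extensions of odd level. Indeed, in this case $\Lambda_R^\vee = \alpha\mathbb{Z}$ and $\Lambda_W^\vee = \frac{\alpha}{2}\mathbb{Z}$ with $\langle\alpha,\alpha\rangle = 2$, so that for odd $\ell$, (3.21) requires $\omega(\alpha, \frac{\alpha}{2}) = -1$, in contradiction with the skew-symmetry of $\omega$, which forces $\omega(\alpha, \frac{\alpha}{2}) = \omega(\frac{\alpha}{2}, \frac{\alpha}{2})^2 = 1$. Let now $\ell \in \mathbb{Z}$ and $\omega$ be a skew-symmetric, $\mathbb{T}$-valued bilinear form on $\Lambda_Z^\vee$. Then,

Proposition 3.2. There exists a (necessarily unique) central extension $\widetilde{L_Z G}$ of $L_Z G$ of level $\ell$ and commutator map $\omega$ if, and only if

\[ \omega(\mu,\lambda) = (-1)^{\ell\langle\mu,\lambda\rangle} \quad \text{whenever } \lambda \in \Lambda_R^\vee. \tag{3.38} \]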


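Proof. The necessity of (3.38) is the contents of Theorem 3.1 and the uniqueness of $\widetilde{L_Z G}$ that of (3.19). Let $\widetilde{LG}$ and $\widetilde{\Lambda_Z^\vee}$ be the central extensions of $LG$ and $\Lambda_Z^\vee$ with level $\ell$ and commutator map $\omega$ respectively. Following [PS, Prop. 4.6.9], we shall construct $\widetilde{L_Z G}$ as a quotient of $\widetilde{LG} \rtimes \widetilde{\Lambda_Z^\vee}$.¹ Lift the conjugation action of $\Lambda_Z^\vee \subset L_Z G$ on $LG$ to $\widetilde{LG}$ by using Lemma 3.1 and denote the corresponding automorphisms of $\widetilde{LG}$ by $\tilde{A}_\mu$, $\mu \in \Lambda_Z^\vee$. Form the semi-direct product $\widetilde{LG} \rtimes \widetilde{\Lambda_Z^\vee}$, where the action of $\widetilde{\Lambda_Z^\vee}$ factors through $\Lambda_Z^\vee$. By Theorem 3.1 and the compatibility of $\ell$ and $\omega$, $\widetilde{LG}$ and $\widetilde{\Lambda_Z^\vee}$ restrict to isomorphic central extensions $\widetilde{\Lambda_R^\vee}$ of $\Lambda_R^\vee$. We may therefore consider the subgroup

\[ N = \{(\tilde\zeta_\alpha, \tilde\zeta_\alpha^{-1})\} \subset \widetilde{LG} \rtimes \widetilde{\Lambda_Z^\vee}, \tag{3.39} \]

where $\tilde\zeta_\alpha$ varies in $\widetilde{\Lambda_R^\vee}$. We claim that $N$ is normal. By Lemma 3.1, for any $\alpha \in \Lambda_R^\vee$, $\tilde{A}_\alpha = \mathrm{Ad}(\tilde\zeta_\alpha)$ since both are lifts of $\mathrm{Ad}(\zeta_\alpha)$. Therefore, for any $\tilde\gamma \in \widetilde{LG}$,

\[ (\tilde\gamma, 1)(\tilde\zeta_\alpha, \tilde\zeta_\alpha^{-1})(\tilde\gamma^{-1}, 1) = (\tilde\gamma\,\tilde\zeta_\alpha\,\tilde{A}_{-\alpha}(\tilde\gamma^{-1}), \tilde\zeta_\alpha^{-1}) = (\tilde\zeta_\alpha, \tilde\zeta_\alpha^{-1}). \tag{3.40} \]

Moreover, by Proposition 3.1 and (3.38)

\[
\begin{aligned}
(1, \tilde\zeta_\mu)(\tilde\zeta_\alpha, \tilde\zeta_\alpha^{-1})(1, \tilde\zeta_\mu^{-1}) &= (\tilde{A}_\mu(\tilde\zeta_\alpha), [\tilde\zeta_\mu, \tilde\zeta_\alpha^{-1}]\tilde\zeta_\alpha^{-1}) \\
&= ((-1)^{\ell\langle\mu,\alpha\rangle}\tilde\zeta_\alpha, (-1)^{-\ell\langle\mu,\alpha\rangle}\tilde\zeta_\alpha^{-1}).
\end{aligned}
\tag{3.41}
\]

Thus, $N$ is normal, and the quotient $\widetilde{LG} \rtimes \widetilde{\Lambda_Z^\vee}/N$ is the required central extension of $L_Z G$. □

Remark. Theorem 3.1 and Proposition 3.2 prove the exactness of

\[ 1 \to H^2(Z, \mathbb{T}) \to H^2(L_Z G, \mathbb{T}) \to H^2(LG, \mathbb{T}) \to \mathbb{Z}/\ell_f\mathbb{Z} \to 0, \tag{3.42} \]

where $\ell_f$ is 1 if $\Lambda_Z^\vee$ possesses a commutator map satisfying (3.21) with $\ell = 1$ and $\ell_f = 2$ otherwise. We call $\ell_f$ the fundamental level of $G/Z$. The fundamental levels of all compact simple groups are given in § 3.6.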
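For $G/Z = SU_n/\mathbb{Z}_n$ the fundamental level can be read off from the parity criterion of Lemma 3.2 in § 3.6: when $Z$ is cyclic of order $k$ with generator $\lambda$, the fundamental level is 1 exactly when $k\langle\lambda,\lambda\rangle$ is even. The following minimal sketch evaluates this criterion. It assumes the standard value $\langle\lambda,\lambda\rangle = (n-1)/n$ for the first fundamental coweight of $SU_n$, which is not stated in the paper, and the function name is hypothetical.

```python
from fractions import Fraction

def fundamental_level_su(n: int) -> int:
    """Fundamental level of SU_n / Z_n via the parity of k<lambda, lambda>."""
    k = n                          # order of the cyclic group Z = Z_n
    norm = Fraction(n - 1, n)      # assumed value of <lambda, lambda>
    pairing = k * norm             # equals n - 1, an integer
    assert pairing.denominator == 1
    return 1 if pairing.numerator % 2 == 0 else 2

for n in range(2, 8):
    print(f"SU_{n}/Z_{n}: fundamental level {fundamental_level_su(n)}")
```

The obstruction at odd level thus appears exactly for even $n$, in line with the statement in the Introduction that the list of obstructed groups comprises $SU_n$ with $n$ even.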

¹ The proof of Proposition 4.6.9 of [PS] is slightly incorrect in that it does not assume that $\omega$ is compatible with $\ell$. When that is not the case, the group $N$ defined by (3.39) is not normal as can be seen from Eq. (3.41).

3.4. Automorphic action of $\mathrm{Diff}_+(S^1)$ on $\widetilde{L_Z G}$. Let $\mathrm{Diff}_+(S^1)$ be the group of orientation-preserving diffeomorphisms of $S^1$ and $D$ its universal covering group. $D$ may be realised as the subgroup of diffeomorphisms $\phi$ of $\mathbb{R}$ such that $\phi(x + 2\pi) = \phi(x) + 2\pi$ and $\mathrm{Diff}_+(S^1) \cong D/(T_{2\pi})$, where $T_y$ is translation by $y$. $D$ acts automorphically on $L_Z G$ by

\[ \phi\zeta = \zeta_{\phi^{-1}} = \zeta \circ \phi^{-1}, \tag{3.43} \]

and this action factors to one of $D/(T_{2\pi k})$, where $k$ is the order of the largest cyclic subgroup of $Z$. The Lie algebra of $D$ is the Lie algebra $\mathrm{Vect}(S^1)$ of all smooth vector fields on $S^1$ with bracket²

\[ \Big[f\frac{d}{d\theta}, g\frac{d}{d\theta}\Big] = (\dot{f}g - f\dot{g})\frac{d}{d\theta}. \tag{3.44} \]

If $\xi = f\frac{d}{d\theta} \in \mathrm{Vect}(S^1)$, the action of $\xi$ on $L\mathfrak{g}$ corresponding to (3.43) is simply

\[ \xi X = -f\dot{X} \tag{3.45} \]

so that the Lie algebra of $D \ltimes L_Z G$ is $\mathrm{Vect}(S^1) \ltimes L\mathfrak{g}$ with bracket

\[ \Big[X \oplus f\frac{d}{d\theta}, Y \oplus g\frac{d}{d\theta}\Big] = ([X,Y] - f\dot{Y} + g\dot{X}) \oplus (\dot{f}g - f\dot{g})\frac{d}{d\theta}. \tag{3.46} \]
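On the Fourier modes $d_n = e^{in\theta}\frac{d}{d\theta}$ the bracket (3.44) reads $[d_m, d_n] = i(m - n)d_{m+n}$, since $\dot{f}g - f\dot{g} = i(m-n)e^{i(m+n)\theta}$ for $f = e^{im\theta}$, $g = e^{in\theta}$. The following sketch, with hypothetical helper names, encodes finite linear combinations of modes as dictionaries and checks the Jacobi identity on a range of modes.

```python
def bracket(x, y):
    """Bracket (3.44) on combinations of d_n = e^{in theta} d/d theta.

    Elements are dicts {mode: coefficient}; [d_m, d_n] = i(m - n) d_{m+n}.
    """
    out = {}
    for m, a in x.items():
        for n, b in y.items():
            out[m + n] = out.get(m + n, 0) + 1j * (m - n) * a * b
    return {k: v for k, v in out.items() if v != 0}

def add(*elements):
    """Sum of several elements, dropping zero coefficients."""
    out = {}
    for element in elements:
        for k, v in element.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def jacobi(x, y, z):
    """[[x, y], z] + [[y, z], x] + [[z, x], y], which should vanish."""
    return add(bracket(bracket(x, y), z),
               bracket(bracket(y, z), x),
               bracket(bracket(z, x), y))

def mode(n):
    return {n: 1}

print(bracket(mode(1), mode(-1)))          # the rotation generator d_0 appears
print(jacobi(mode(2), mode(-1), mode(3)))  # empty dict: Jacobi identity holds
```

The check only concerns the vector field part of (3.46); the mixed terms involving $L\mathfrak{g}$ are not exercised here.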

Proposition 3.3. Let $k$ be the order of the largest cyclic subgroup of $Z$. Then, the action of the universal $k$-covering of $\mathrm{Diff}_+(S^1)$ on $L_Z G$ lifts uniquely to any central extension $\widetilde{L_Z G}$.

Proof. The uniqueness is easily settled, for two lifts necessarily differ by some

\[ \chi \in \mathrm{Hom}(D, \mathrm{Hom}(L_Z G, \mathbb{T})) \cong \mathrm{Hom}(D, \widehat{Z}) = 1, \tag{3.47} \]

where the first isomorphism follows from $\mathrm{Hom}(LG, \mathbb{T}) = 1$ and the second by connectedness of $D$. Let $\widetilde{LG}$, $\widetilde{\Lambda_Z^\vee}$ be the restrictions of $\widetilde{L_Z G}$ to $LG$ and $\Lambda_Z^\vee$ respectively and $\ell$, $\omega$ the corresponding level and commutator map. We shall describe, as in the proof of Proposition 3.2, $\widetilde{L_Z G}$ as a quotient of $\widetilde{\Lambda_Z^\vee} \ltimes \widetilde{LG}$. Since $L_Z G$ is not connected, the action of $D$ on it cannot be lifted to $\widetilde{L_Z G}$ by using Lemma 3.1. Following the proof of [PS, Prop. 4.7.1], we shall regard it instead as one of $L_Z G$ on $\widetilde{LG} \rtimes D$ in the following way. Consider first the connected component of the identity of $\widetilde{L_Z G}$, i.e. $\widetilde{LG}$, and form the semi-direct product $\widetilde{LG} \rtimes D$, where $D$ acts on $\widetilde{LG}$ as in § 3.1. Since $D$ is contractible, $\widetilde{LG} \rtimes D$ may equivalently be described as the central extension of $LG \rtimes D$ corresponding to the Lie algebra cocycle

\[ \beta\Big(X \oplus f\frac{d}{d\theta}, Y \oplus g\frac{d}{d\theta}\Big) = \ell B(X,Y) = \ell\int_0^{2\pi} \langle X, \dot{Y}\rangle \frac{d\theta}{2\pi}. \tag{3.48} \]

We claim that $L_Z G$ acts on $LG \rtimes D$ and $\widetilde{LG} \rtimes D$. The first action simply stems from the fact that $LG \rtimes D$ is a normal subgroup of $L_Z G \rtimes D$ and is given explicitly by

\[ \zeta(\gamma, \phi) = (\zeta\gamma\zeta^{-1}\,\zeta\zeta_{\phi^{-1}}^{-1}, \phi) \tag{3.49} \]

and infinitesimally by

\[ \zeta\Big(X \oplus f\frac{d}{d\theta}\Big) = (\zeta X\zeta^{-1} + f\dot\zeta\zeta^{-1}) \oplus f\frac{d}{d\theta}. \tag{3.50} \]

² The bracket (3.44) is the Lie-theoretic bracket on $\mathrm{Vect}(S^1)$ satisfying $[X,Y] = \frac{d}{dt}\big|_{t=0} \exp(tX)\,Y\exp(-tX)$ and is the opposite of the differential geometric one defined by the action of $\mathrm{Vect}(S^1)$ on $C^\infty(S^1)$.

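To see that this action lifts to $\widetilde{LG} \rtimes D$, we compute

\[ \beta\Big(\zeta\Big(X \oplus f\frac{d}{d\theta}\Big), \zeta\Big(Y \oplus g\frac{d}{d\theta}\Big)\Big) = \ell B(\zeta X\zeta^{-1} + f\dot\zeta\zeta^{-1}, \zeta Y\zeta^{-1} + g\dot\zeta\zeta^{-1}). \tag{3.51} \]

By (3.16),

\[ B(\zeta X\zeta^{-1}, \zeta Y\zeta^{-1}) = B(X,Y) - \int_0^{2\pi} \langle[X,Y], \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi}. \tag{3.52} \]

On the other hand,

\[
\begin{aligned}
B(f\dot\zeta\zeta^{-1}, \zeta Y\zeta^{-1}) + B(\zeta X\zeta^{-1}, g\dot\zeta\zeta^{-1})
&= \int_0^{2\pi} \Big( f\langle\dot\zeta\zeta^{-1}, \zeta\dot{Y}\zeta^{-1}\rangle + f\langle\dot\zeta\zeta^{-1}, \zeta[\zeta^{-1}\dot\zeta, Y]\zeta^{-1}\rangle \Big) \frac{d\theta}{2\pi} \\
&\quad - \int_0^{2\pi} \Big( g\langle\dot\zeta\zeta^{-1}, \zeta\dot{X}\zeta^{-1}\rangle + g\langle\dot\zeta\zeta^{-1}, \zeta[\zeta^{-1}\dot\zeta, X]\zeta^{-1}\rangle \Big) \frac{d\theta}{2\pi} \\
&= \int_0^{2\pi} \langle\zeta^{-1}\dot\zeta, f\dot{Y} - g\dot{X}\rangle \frac{d\theta}{2\pi}.
\end{aligned}
\tag{3.53}
\]

Finally, anti-symmetrising, we find

\[ B(f\dot\zeta\zeta^{-1}, g\dot\zeta\zeta^{-1}) = \frac{1}{2}\int_0^{2\pi} (f\dot{g} - \dot{f}g)\langle\dot\zeta\zeta^{-1}, \dot\zeta\zeta^{-1}\rangle \frac{d\theta}{2\pi}. \tag{3.54} \]

Thus

\[ \beta\Big(\zeta\Big(X \oplus f\frac{d}{d\theta}\Big), \zeta\Big(Y \oplus g\frac{d}{d\theta}\Big)\Big) = \beta\Big(X \oplus f\frac{d}{d\theta}, Y \oplus g\frac{d}{d\theta}\Big) - \ell F\Big(\Big[X \oplus f\frac{d}{d\theta}, Y \oplus g\frac{d}{d\theta}\Big]\Big), \tag{3.55} \]

where $F : L\mathfrak{g} \rtimes \mathrm{Vect}(S^1) \to \mathbb{R}$ is given by

\[ F\Big(X \oplus f\frac{d}{d\theta}\Big) = \int_0^{2\pi} \langle X, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi} + \frac{1}{2}\int_0^{2\pi} f\langle\zeta^{-1}\dot\zeta, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi}. \tag{3.56} \]

Since $D$ is perfect [Ep], it follows from Lemma 3.1 that the action of $L_Z G$ on $LG \rtimes D$ lifts uniquely to $\widetilde{LG} \rtimes D$ and is given by

\[ \zeta\Big(X \oplus f\frac{d}{d\theta} \oplus it\Big) = (\zeta X\zeta^{-1} + f\dot\zeta\zeta^{-1}) \oplus f\frac{d}{d\theta} \oplus i\Big(t - \ell\int_0^{2\pi} \langle X, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi} - \frac{\ell}{2}\int_0^{2\pi} f\langle\zeta^{-1}\dot\zeta, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi}\Big). \tag{3.57} \]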

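Consider now the semi-direct product $\widetilde{\Lambda_Z^\vee} \ltimes (\widetilde{LG} \rtimes D)$, where the action of $\widetilde{\Lambda_Z^\vee}$ factors through $\Lambda_Z^\vee$, and the subgroup

\[ N = \{(\tilde\zeta_\alpha, \tilde\zeta_\alpha^{-1}, 1)\} \subset \widetilde{\Lambda_Z^\vee} \ltimes (\widetilde{LG} \rtimes D), \tag{3.58} \]

where $\tilde\zeta_\alpha$ varies in $\widetilde{\Lambda_R^\vee} = \widetilde{L_Z G}\big|_{\Lambda_R^\vee}$. $N$ lies in the centraliser of $\widetilde{LG} \rtimes D$ since, by uniqueness, $\mathrm{Ad}((\tilde\zeta_\alpha, 1, 1)) = \mathrm{Ad}((1, \tilde\zeta_\alpha, 1))$ on $\widetilde{LG} \rtimes D$ as both automorphisms are lifts of $\mathrm{Ad}(\zeta_\alpha)$. It follows that the quotient $\widetilde{\Lambda_Z^\vee} \ltimes \widetilde{LG}/N \cong \widetilde{L_Z G}$ is acted upon by $D$. To conclude, we need only show that translations by multiples of $2\pi k$ act trivially on $\widetilde{L_Z G}$, where $k$ is the order of the largest cyclic subgroup of $Z$. It is sufficient to check this on a representative of each connected component of $\widetilde{L_Z G}$ since, by uniqueness, $T_{2\pi}\tilde\gamma = \tilde\gamma$ for any $\tilde\gamma \in \widetilde{LG}$. Let $\tilde\zeta_\lambda$ be a lift of the discontinuous loop $\zeta_\lambda(\theta) = \exp_T(-i\lambda\theta)$, $\lambda \in \Lambda_Z^\vee$. By (3.57),

\[ \tilde\zeta_\lambda\,\frac{d}{d\theta}\,\tilde\zeta_\lambda^{-1} = -i\lambda \oplus \frac{d}{d\theta} \oplus i\frac{\ell}{2}\langle\lambda,\lambda\rangle, \tag{3.59} \]

and therefore, since $T_y = \exp_D(y\frac{d}{d\theta})$,

\[ \tilde\zeta_\lambda T_{2\pi k}\tilde\zeta_\lambda^{-1} = e^{\pi ik\ell\langle\lambda,\lambda\rangle}\exp_T(-2\pi ik\lambda)T_{2\pi k} = (-1)^{\ell\langle k\lambda,\lambda\rangle}T_{2\pi k}. \tag{3.60} \]

Notice that $k\lambda \in \Lambda_R^\vee$ since its image in $Z$ is 1. Thus, by the skew-symmetry of $\omega$ and its compatibility with $\ell$, we have

\[ 1 = \omega(k\lambda, \lambda) = (-1)^{k\ell\langle\lambda,\lambda\rangle} \tag{3.61} \]

whence

\[ \tilde\zeta_\lambda T_{2\pi k}\tilde\zeta_\lambda^{-1} = T_{2\pi k} \tag{3.62} \]

as claimed. □

Let us record the following by-product of the proof of Proposition 3.3 since it extends formula 4.9.4 of [PS].

Corollary 3.1. The action of $L_Z G$ on the Lie algebra of $\widetilde{LG} \rtimes \mathrm{Diff}_+(S^1)$, where $\widetilde{LG}$ is the central extension of $LG$ of level $\ell$, is given by

\[ \zeta\Big(X \oplus f\frac{d}{d\theta} \oplus it\Big) = (\zeta X\zeta^{-1} + f\dot\zeta\zeta^{-1}) \oplus f\frac{d}{d\theta} \oplus i\Big(t - \ell\int_0^{2\pi} \langle X, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi} - \frac{\ell}{2}\int_0^{2\pi} f\langle\zeta^{-1}\dot\zeta, \zeta^{-1}\dot\zeta\rangle \frac{d\theta}{2\pi}\Big). \tag{3.63} \]

3.5. Central extensions of $L(G/Z)$. We now classify the central extensions of $L(G/Z)$. The five term sequence corresponding to

\[ 1 \to Z \to L_Z G \xrightarrow{\ \pi\ } L(G/Z) \to 1 \tag{3.64} \]

and the fact that $\mathrm{Hom}(LG, \mathbb{T}) = 1$ yield the exactness of

\[ 1 \to \mathrm{Hom}(Z, \mathbb{T}) \to H^2(L(G/Z), \mathbb{T}) \xrightarrow{\ \pi^*\ } H^2(L_Z G, \mathbb{T}). \tag{3.65} \]

The image of $\pi^*$ is easily described. Let the basic level $\ell_b$ of $G/Z$ be the smallest integer $\ell$ such that the restriction of $\ell\langle\cdot,\cdot\rangle$ to $\Lambda_Z^\vee$ is integral, i.e., such that

\[ \ell\langle\lambda_i^\vee, \lambda_j^\vee\rangle \in \mathbb{Z} \tag{3.66} \]

for all fundamental coweights $\lambda_i^\vee$, $\lambda_j^\vee$ lying in $\Lambda_Z^\vee$. Then,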


Proposition 3.4. A central extension of $L_Z G$ is the pull-back of one of $L(G/Z)$ only if its level $\ell$ is a multiple of the basic level of $G/Z$. Conversely, if $\ell_b \mid \ell$, the subgroup $Z \subset \widetilde{L_Z G}$ corresponding to the canonical embedding $G \hookrightarrow \widetilde{L_Z G}$ is central and

\[ \widetilde{L_Z G} \cong \pi^*(\widetilde{L_Z G}/Z). \tag{3.67} \]

Proof.³ As readily verified, a central extension $\widetilde{L_Z G}$ of $L_Z G$ is the pull-back of one of $L(G/Z)$ only if its restriction to $Z$ lies in its centre. Conversely, since $G$ is simple and simply-connected, the restriction of $\widetilde{L_Z G}$ to $G$, and therefore to $Z$, is canonically split. If $s : Z \to \tilde{Z} = \widetilde{L_Z G}\big|_Z$ is the corresponding section and $\tilde{Z}$ is central, then $\widetilde{L_Z G}/s(Z)$ is a central extension of $L(G/Z)$ which pulls back to $\widetilde{L_Z G}$. We therefore need to determine those $\widetilde{L_Z G}$ for which $\tilde{Z}$ is a central subgroup. Notice first that $\tilde{Z}$ lies in the centre of $\widetilde{LG} = \widetilde{L_Z G}\big|_{LG}$. Indeed, for any $z \in Z$, $\gamma \in LG$ and lifts $\tilde{z}, \tilde\gamma \in \widetilde{L_Z G}$,

\[ \tilde\gamma\tilde{z}\tilde\gamma^{-1}\tilde{z}^{-1} = \chi(\gamma, z), \tag{3.68} \]

where $\chi(\gamma, z) \in \mathbb{T}$ is independent of the lifts and multiplicative in each variable. Since $\mathrm{Hom}(LG, \mathbb{T}) = \{1\}$ however, $\chi \equiv 1$. Thus, we need only check that $\tilde{Z}$ commutes with the lifts of the discontinuous loops $\zeta_\lambda(\theta) = \exp_T(-i\lambda\theta)$, $\lambda \in \Lambda_Z^\vee$. For any $h \in \mathfrak{t}$, we have by (3.17),

\[ \tilde\zeta_\lambda h\tilde\zeta_\lambda^{-1} = h + \ell\langle h,\lambda\rangle \tag{3.69} \]

whence

\[ \tilde\zeta_\lambda\tilde{z}\tilde\zeta_\lambda^{-1} = \tilde{z}e^{-2\pi i\ell\langle\mu,\lambda\rangle}, \tag{3.70} \]

where $\mu \in \Lambda_Z^\vee$ is such that $\exp_T(-2\pi i\mu) = z$, and it follows that $\tilde{Z}$ is central iff $\ell$ is a multiple of $\ell_b$. □

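Proposition 3.4 shows that
\[
H^2(L(G/Z), \mathbb{T}) \cong \mathrm{Ker}\,\ell \oplus \mathrm{Hom}(Z, \mathbb{T}) \subseteq H^2(L_Z G, \mathbb{T}) \oplus \mathrm{Hom}(Z, \mathbb{T}), \tag{3.71}
\]
where $\ell$ is the map giving the residue mod $\ell_b$ of the level of a central extension of $L_Z G$. The canonical isomorphism is simply given by associating to a central extension $\widetilde{L_Z G}$ of level $\ell \in \ell_b\mathbb{Z}$ and $\chi \in \widehat{Z}$ the central extension
\[
\widetilde{L_Z G}/(z \cdot \chi(z))_{z \in Z}. \tag{3.72}
\]
The list of basic levels for all compact, connected and simple Lie groups is given in § 3.6.

Remark. If the central extension $\widetilde{L_Z G}$ has level $\ell \in \ell_b\mathbb{Z}$, the action of $D$ on $\widetilde{L_Z G}$ clearly descends to $\widetilde{L(G/Z)} = \widetilde{L_Z G}/Z$. Surprisingly perhaps, it does not then necessarily factor to one of $\mathrm{Diff}_+(S^1)$. Indeed, (3.59) yields
\[
T_{2\pi}\,\tilde\zeta_\lambda\, T_{2\pi}^{-1} = \tilde\zeta_\lambda \exp_T(2\pi i\lambda)\, e^{2\pi i\ell\frac{\langle\lambda,\lambda\rangle}{2}} \tag{3.73}
\]
which equals 1 in $\widetilde{L_Z G}/Z$ if, and only if $\ell\langle\lambda,\lambda\rangle \in 2\mathbb{Z}$ for any $\lambda$, i.e., iff $\Lambda_Z^\vee$, endowed with $\langle\cdot,\cdot\rangle$, is an even lattice.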

3 The “only if” implication of Proposition 3.4 is essentially the contents of Lemma 4.6.3 of [PS].


V. Toledano Laredo

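Remark. The basic level of $G/Z$ is a multiple of the fundamental one for if $\ell_b \mid \ell$, the form
\[
\omega(\lambda, \mu) = (-1)^{\ell\langle\lambda,\mu\rangle + \ell^2\langle\lambda,\lambda\rangle\langle\mu,\mu\rangle} \tag{3.74}
\]
is a commutator map on $\Lambda_Z^\vee$ satisfying the hypothesis of Proposition 3.2. In particular, $L_Z G$ possesses a canonical central extension at level $\ell_b$.

3.6. Appendix: Fundamental and basic levels of simple Lie groups. Let $Z \subseteq Z(G)$ and $\Lambda_R^\vee \subseteq \Lambda_Z^\vee \subseteq \Lambda_W^\vee$ be the fundamental group and integral lattice of $G/Z$.

Lemma 3.2. If $Z \cong \Lambda_Z^\vee/\Lambda_R^\vee$ is cyclic of order $k$, then $G/Z$ has fundamental level 1 if and only if $k\langle\lambda,\lambda\rangle \in 2\mathbb{Z}$, where $\lambda \in \Lambda_Z^\vee$ is a generator.

Proof. If $\omega$ is a commutator map on $\Lambda_Z^\vee$ satisfying
\[
\omega(\alpha, \mu) = (-1)^{\langle\alpha,\mu\rangle} \tag{3.75}
\]
whenever $\alpha \in \Lambda_R^\vee$, then, by skew-symmetry, $1 = \omega(k\lambda, \lambda) = (-1)^{k\langle\lambda,\lambda\rangle}$ since $k\lambda \in \Lambda_R^\vee$. Conversely, if $k\langle\lambda,\lambda\rangle \in 2\mathbb{Z}$, the form $\tilde\omega(\alpha \oplus a\lambda, \beta \oplus b\lambda) = (-1)^{\langle\alpha,\beta\rangle + \langle b\alpha + a\beta, \lambda\rangle}$ on $\Lambda_R^\vee \oplus \mathbb{Z}\lambda$ descends to one on $\Lambda_R^\vee \oplus \mathbb{Z}\lambda/(-k\lambda \oplus k\lambda) \cong \Lambda_Z^\vee$ satisfying (3.75). $\square$

Proposition 3.5. The following is the list of fundamental and basic levels $\ell_f$, $\ell_b$ for all compact, connected and simple Lie groups with universal cover $G$ and fundamental group $Z \neq \{1\}$.

| $G$ | | $Z(G)$ | $Z$ | $G/Z$ | $\ell_f$ | $\ell_b$ |
| $SU_n$ | $n \geq 2$ | $\mathbb{Z}_n$ | $\mathbb{Z}_k$ | | 1 for $n$ odd or $n/k$ even, 2 otherwise | smallest $\ell$ with $\ell\,\frac{n(n-1)}{k^2} \in \mathbb{Z}$ |
| $\mathrm{Spin}_{2n+1}$ | $n \geq 2$ | $\mathbb{Z}_2$ | $\mathbb{Z}_2$ | $SO_{2n+1}$ | 1 | 1 |
| $Sp_n$ | $n \geq 1$ | $\mathbb{Z}_2$ | $\mathbb{Z}_2$ | | 1 for $n$ even, 2 for $n$ odd | 1 for $n$ even, 2 for $n$ odd |
| $\mathrm{Spin}_{4m}$ | $m \geq 2$ | $\mathbb{Z}_2 \times \mathbb{Z}_2$ | $\mathbb{Z}_2^0$ | $SO_{4m}$ | 1 | 1 |
| | | | $\mathbb{Z}_2^\pm$ | | 1 for $m$ even, 2 for $m$ odd | 1 for $m$ even, 2 for $m$ odd |
| | | | $\mathbb{Z}_2 \times \mathbb{Z}_2$ | $PSO_{4m}$ | 1 for $m$ even, 2 for $m$ odd | 2 |
| $\mathrm{Spin}_{4m+2}$ | $m \geq 1$ | $\mathbb{Z}_4$ | $\mathbb{Z}_2$ | $SO_{4m+2}$ | 1 | 1 |
| | | | $\mathbb{Z}_4$ | $PSO_{4m+2}$ | 2 | 4 |
| $E_6$ | | $\mathbb{Z}_3$ | $\mathbb{Z}_3$ | | 1 | 3 |
| $E_7$ | | $\mathbb{Z}_2$ | $\mathbb{Z}_2$ | | 2 | 2 |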

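Proof. We proceed by enumeration according to the Lie type of $G$, using the tables [Bou, planches I–IX] and Lemmas 3.2 and 2.2. For $G$ simply-laced, we identify the coroot and coweight lattices with the root and weight lattices respectively. In what follows, $\theta_i$, $i = 1 \ldots n$ and $\langle\cdot,\cdot\rangle$ are the standard basis and inner product in $\mathbb{R}^n$. Unless otherwise indicated, the basic inner product is the standard one.

$SU_n$, $n \geq 2$. $SU_n$ is simply-laced and the quotient $\Lambda_W/\Lambda_R \cong \mathbb{Z}_n$ is generated by $\lambda_1^\vee = \theta_1 - \frac{1}{n}\bigl(\sum_{i=1}^n \theta_i\bigr)$ corresponding to the special root $\alpha_1 = \theta_1 - \theta_2$. For any $k \mid n$, the subgroup of $Z(SU_n)$ isomorphic to $\mathbb{Z}_k$ is generated by $\frac{n}{k}\lambda_1^\vee$ and $\langle\frac{n}{k}\lambda_1^\vee, \frac{n}{k}\lambda_1^\vee\rangle = \frac{n(n-1)}{k^2}$.

$\mathrm{Spin}_{2n+1}$, $n \geq 2$. $Z(\mathrm{Spin}_{2n+1}) \cong \mathbb{Z}_2$ is generated by the coweight $\lambda_1^\vee = \theta_1$ corresponding to the unique special root $\alpha_1 = \theta_1 - \theta_2$. Since $\langle\lambda_1^\vee, \lambda_1^\vee\rangle = 1$, $\ell_b = \ell_f = 1$.

$Sp_n$, $n \geq 1$. $Sp_1$ is the group of unit quaternions and is therefore isomorphic to $SU_2$. For $n \geq 2$, $Z(Sp_n) \cong \mathbb{Z}_2$ is generated by $\lambda_n^\vee = \theta_1 + \cdots + \theta_n$ corresponding to the unique special root $\alpha_n = 2\theta_n$. Since the basic inner product is half the standard one on $\mathbb{R}^n$, we have $\langle\lambda_n^\vee, \lambda_n^\vee\rangle = \frac{n}{2}$ whence $\ell_b = \ell_f = 1$ for $n$ even and 2 for $n$ odd. This is consistent with the isomorphism $Sp_2 \cong \mathrm{Spin}_5$.

$\mathrm{Spin}_{2n}$, $n \geq 3$. $\mathrm{Spin}_{2n}$ is simply-laced with minimal dominant coweights $\lambda_1^\vee = \theta_1$, $\lambda_{n-1}^\vee = \frac{1}{2}(\theta_1 + \cdots + \theta_{n-1} - \theta_n)$ and $\lambda_n^\vee = \frac{1}{2}(\theta_1 + \cdots + \theta_n)$ corresponding to the special roots $\alpha_1 = \theta_1 - \theta_2$, $\alpha_{n-1} = \theta_{n-1} - \theta_n$ and $\alpha_n = \theta_{n-1} + \theta_n$. $2\lambda_1^\vee = 0 \bmod \Lambda_R$ and $\langle\lambda_1^\vee, \lambda_1^\vee\rangle = 1$ so that the corresponding quotient $\mathrm{Spin}_{2n}/\mathbb{Z}_2 \cong SO_{2n}$ has $\ell_b = \ell_f = 1$. We must now distinguish two cases:

$n$ odd. Then $2\lambda_{n-1}^\vee = 2\lambda_n^\vee = \lambda_1^\vee \bmod \Lambda_R$ and $Z(\mathrm{Spin}_{2n}) \cong \mathbb{Z}_4$ with $\lambda_{n-1}^\vee, \lambda_n^\vee$ of order 4. Since $\langle\lambda_n^\vee, \lambda_n^\vee\rangle = \frac{n}{4}$, we get $\ell_b = 4$ and $\ell_f = 2$ for $\mathrm{Spin}_{2n}/\mathbb{Z}_4$. This is in agreement with the isomorphism $\mathrm{Spin}_6 \cong SU_4$.

$n$ even. Then $2\lambda_{n-1}^\vee = 2\lambda_n^\vee = 0 \bmod \Lambda_R$ and $Z(\mathrm{Spin}_{2n}) \cong \mathbb{Z}_2 \times \mathbb{Z}_2$. Since $\langle\lambda_{n-1}^\vee, \lambda_{n-1}^\vee\rangle = \langle\lambda_n^\vee, \lambda_n^\vee\rangle = \frac{n}{4}$ we get $\ell_b = \ell_f = 1$ or $\ell_b = \ell_f = 2$ for the quotients $\mathrm{Spin}_{2n}/\mathbb{Z}_2^\pm$ corresponding to $\lambda_{n-1}^\vee$ and $\lambda_n^\vee$ according to whether $n$ is divisible by 4 or not. For $Z = \mathbb{Z}_2 \times \mathbb{Z}_2$, we get $\langle\lambda_1^\vee, \lambda_n^\vee\rangle = \frac{1}{2}$ so that $\ell_b = 2$. To determine $\ell_f$, notice that if there exists a commutator map $\omega$ on $\Lambda_Z^\vee = \Lambda_W$ satisfying (3.21) with $\ell = 1$, the fundamental level of $\mathrm{Spin}_{2n}/\mathbb{Z}_2^\pm$ is one and therefore $n$ is divisible by 4. Conversely, if $4 \mid n$, the form
\[
\tilde\omega(\alpha \oplus p\lambda_1^\vee \oplus q\lambda_n^\vee, \beta \oplus p'\lambda_1^\vee \oplus q'\lambda_n^\vee) = i^{pq' - qp'}(-1)^{\langle\alpha,\beta\rangle + \langle\alpha, p'\lambda_1^\vee + q'\lambda_n^\vee\rangle + \langle p\lambda_1^\vee + q\lambda_n^\vee, \beta\rangle}
\]
defined on $\Lambda_R \oplus \lambda_1^\vee\mathbb{Z} \oplus \lambda_n^\vee\mathbb{Z}$ descends to a suitable form on $\Lambda_W = \Lambda_R \oplus \lambda_1^\vee\mathbb{Z} \oplus \lambda_n^\vee\mathbb{Z}/(-2\lambda_1^\vee \oplus 2\lambda_1^\vee \oplus 0)\mathbb{Z} + (-2\lambda_n^\vee \oplus 0 \oplus 2\lambda_n^\vee)\mathbb{Z}$.

$E_6$. $E_6$ is simply-laced and $\Lambda_W/\Lambda_R \cong \mathbb{Z}_3$ is generated by any of its non-zero elements and therefore by $\lambda_6^\vee = \theta_5 - \frac{1}{3}(\theta_6 + \theta_7 - \theta_8)$ corresponding to the special root $\alpha_6 = -\theta_4 + \theta_5$. Since $\langle\lambda_6^\vee, \lambda_6^\vee\rangle = \frac{4}{3}$, $\ell_b = 3$ and $\ell_f = 1$.

$E_7$. $E_7$ is simply-laced and $\Lambda_W/\Lambda_R \cong \mathbb{Z}_2$ is generated by $\lambda_7^\vee = \theta_6 - \frac{1}{2}(\theta_7 - \theta_8)$ corresponding to the unique special root $\alpha_7 = -\theta_5 + \theta_6$. Since $\langle\lambda_7^\vee, \lambda_7^\vee\rangle = \frac{3}{2}$, $\ell_b = \ell_f = 2$. $\square$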
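The $SU_n/\mathbb{Z}_k$ row of Proposition 3.5 can be tabulated mechanically. The following is a minimal Python sketch, assuming only the two criteria quoted above (Lemma 3.2 for the fundamental level and the integrality of $\ell\,n(n-1)/k^2$ for the basic level); the function names `basic_level_su` and `fundamental_level_su` are hypothetical and introduced here purely for illustration.

```python
from fractions import Fraction


def _check(n: int, k: int) -> None:
    # The subgroup Z_k of Z(SU_n) = Z_n only exists when k divides n.
    if n < 2 or k < 1 or n % k != 0:
        raise ValueError("need n >= 2 and k dividing n")


def basic_level_su(n: int, k: int) -> int:
    """Smallest positive l such that l * n(n-1)/k^2 is an integer."""
    _check(n, k)
    # Squared norm of the generator (n/k) * lambda_1 of Z_k.
    norm = Fraction(n * (n - 1), k * k)
    # For a reduced fraction p/q the smallest such l is q.
    return norm.denominator


def fundamental_level_su(n: int, k: int) -> int:
    """1 if k * <lambda, lambda> = n(n-1)/k is even (Lemma 3.2), else 2."""
    _check(n, k)
    return 1 if (n * (n - 1) // k) % 2 == 0 else 2


if __name__ == "__main__":
    for n in range(2, 9):
        for k in range(2, n + 1):
            if n % k == 0:
                print(n, k, fundamental_level_su(n, k), basic_level_su(n, k))
```

For instance, $(n, k) = (4, 4)$ gives levels $(2, 4)$, in agreement with the $PSO_{4m+2}$ row for $m = 1$ via $\mathrm{Spin}_6 \cong SU_4$.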

4. The Action of $L_Z G$ on the Positive Energy Dual of $LG$

We show in this section that the category $\mathcal{P}_\ell$ of positive energy representations of $LG$ at a given level $\ell$ is closed under conjugation by elements of $L_Z G$. We also identify the corresponding abstract action of $Z \cong L_Z G/LG$ on the alcove of $G$ parametrising the irreducibles in $\mathcal{P}_\ell$ with the geometric one obtained by realising $Z$ as a distinguished subgroup of the automorphisms of the extended Dynkin diagram of $G$. We begin by studying the latter.


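4.1. Geometric action of $Z(G)$ on the level $\ell$ alcove. This subsection is essentially an expanded version of [Bou, Ch. VI, §2.3]. The notation follows that of Sect. 2. Denote $-\theta$ by $\alpha_0$, then

Lemma 4.1. For any special root $\alpha_i$, the set $\Delta_i = \Delta\setminus\{\alpha_i\} \cup \{\alpha_0\}$ is a basis of $R$ with highest root $-\alpha_i$ and dual basis
\[
\lambda_0^{\vee\prime} = -\lambda_i^\vee, \tag{4.1}
\]
\[
\lambda_j^{\vee\prime} = \lambda_j^\vee - \langle\theta, \lambda_j^\vee\rangle\lambda_i^\vee. \tag{4.2}
\]

Proof. Let $x \in \mathfrak{t}_{\mathbb{C}}$, then, by (2.8)
\[
x = \sum_j \langle x, \lambda_j^\vee\rangle\alpha_j = \langle x, \lambda_i^\vee\rangle\theta + \sum_{j \neq i}\bigl(\langle x, \lambda_j^\vee\rangle - \langle x, \lambda_i^\vee\rangle\langle\theta, \lambda_j^\vee\rangle\bigr)\alpha_j \tag{4.3}
\]
so that $\Delta_i$ is a vector space basis of $\mathfrak{t}_{\mathbb{C}}$ with dual basis given by (4.1)–(4.2). If $\beta$ is a positive root, then either $\langle\beta, \lambda_i^\vee\rangle = 0$, in which case $\langle\beta, \lambda_0^{\vee\prime}\rangle$ and $\langle\beta, \lambda_j^{\vee\prime}\rangle$ are all non-negative, or $\langle\beta, \lambda_i^\vee\rangle = 1$ since $\lambda_i^\vee$ is a minimal dominant coweight. In the latter case $\langle\beta, \lambda_0^{\vee\prime}\rangle = -1$ and $\langle\beta, \lambda_j^{\vee\prime}\rangle = \langle\beta - \theta, \lambda_j^\vee\rangle \leq 0$. Thus, $\Delta_i$ is a basis of $R$. Next, for any $\beta \in R$, $\langle -\alpha_i - \beta, \lambda_0^{\vee\prime}\rangle = 1 + \langle\beta, \lambda_i^\vee\rangle \geq 0$ since $\lambda_i^\vee$ is minimal. Moreover, for $j \neq i$,
\[
\langle -\alpha_i - \beta, \lambda_j^{\vee\prime}\rangle = \langle\theta, \lambda_j^\vee\rangle - \langle\beta, \lambda_j^\vee\rangle + \langle\theta, \lambda_j^\vee\rangle\langle\beta, \lambda_i^\vee\rangle = \langle\theta - \beta, \lambda_j^\vee\rangle + \langle\theta, \lambda_j^\vee\rangle\langle\beta, \lambda_i^\vee\rangle. \tag{4.4}
\]
The above is clearly non-negative if $\langle\beta, \lambda_i^\vee\rangle \geq 0$. If, on the other hand, $\langle\beta, \lambda_i^\vee\rangle = -1$ then $\beta$ is negative and (4.4) is equal to $-\langle\beta, \lambda_j^\vee\rangle \geq 0$. Thus, $-\alpha_i$ is the highest root relative to $\Delta_i$. $\square$

Proposition 4.1. Let $\overline{\Delta} = \Delta \cup \{\alpha_0\}$. Then, for any special root $\alpha_i$, there exists a unique
\[
w_i \in W_0 = \{w \in W \mid w\overline{\Delta} = \overline{\Delta}\} \tag{4.5}
\]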

such that $w_i\alpha_0 = \alpha_i$. The resulting map $\imath : Z(G) \to W_0$ obtained by identifying $Z(G)\setminus\{1\}$ with the set of special roots is a group isomorphism.

Proof. The existence of $w_i$ follows from the previous lemma since $W$ acts transitively on the set of bases of $R$ and maps highest roots to highest roots. $w_i$ is unique because an element $w \in W_0$ is determined by $w\alpha_0$. Indeed, if $w_1\alpha_0 = \alpha_j = w_2\alpha_0$, then $w_2^{-1}w_1$ is a permutation of $\Delta$ and is therefore the identity since $W$ acts simply on bases. $\imath$ is injective because $w_i\alpha_0 = \alpha_i$. Let now $w \in W_0$. We claim that $\alpha_i = w\alpha_0$ is a special root. It then follows by uniqueness that $w = w_i$ and therefore that $\imath$ is surjective. To see this, we apply $w$ to (2.8) and get $-\alpha_i = \sum_j m_j w\alpha_j$ while at the same time $-\alpha_i = m_i^{-1}\bigl(\alpha_0 + \sum_{j \neq i} m_j\alpha_j\bigr)$. Equating the coefficients of $\alpha_0$, we get $m_i = 1$. To prove that $\imath$ is a homomorphism, let $\alpha_i$ and $\alpha_j$ be special roots. Then either $w_iw_j = 1$ or $w_iw_j = w_k$, where $\alpha_k$ is another special root. In the former case, $w_i\alpha_j = \alpha_0$ and therefore, by (4.1),
\[
w_i\lambda_j^\vee = \lambda_0^{\vee\prime} = -\lambda_i^\vee \tag{4.6}
\]
so that $\lambda_j^\vee = -\lambda_i^\vee \bmod \Lambda_R^\vee$ since $W$ leaves $\Lambda_R^\vee$-cosets invariant. In the latter, $w_i\alpha_j = \alpha_k$ and therefore, using (4.2)
\[
w_i\lambda_j^\vee = \lambda_k^{\vee\prime} = \lambda_k^\vee - \langle\theta, \lambda_k^\vee\rangle\lambda_i^\vee = \lambda_k^\vee - \lambda_i^\vee \tag{4.7}
\]
whence $\lambda_i^\vee + \lambda_j^\vee = \lambda_k^\vee \bmod \Lambda_R^\vee$. $\square$

The following is well-known and often rediscovered [OT,Ga]:

Corollary 4.1. $Z(G)$ is canonically isomorphic to the group of automorphisms of the extended Dynkin diagram of $G$ induced by Weyl group elements.

For any $\ell \in \mathbb{N}$, recall that the level $\ell$ alcove is the set defined by
\[
A_\ell = \{\lambda \in \Lambda_W \mid \langle\lambda, \alpha_i\rangle \geq 0,\ \langle\lambda, \theta\rangle \leq \ell\}. \tag{4.8}
\]

Proposition 4.2. For any $\ell \in \mathbb{N}$, there is a canonical action of $Z(G)$ on the level $\ell$ alcove $A_\ell$ given by
\[
z \longrightarrow A_i = \tau(\ell\lambda_i^\vee)w_i, \tag{4.9}
\]
where $i$ is the label of the special root corresponding to $z$ via Lemma 2.2, $\tau$ denotes translation and $w_i = \imath(z)$ corresponds to $z$ via Proposition 4.1.

Proof. If $\lambda \in A_\ell$ then for $j \neq i$, $\langle A_i\lambda, \alpha_j\rangle = \langle\lambda, w_i^{-1}\alpha_j\rangle \geq 0$ since $w_i^{-1}\alpha_j \neq \alpha_0$. On the other hand, $\langle A_i\lambda, \alpha_i\rangle = \ell + \langle\lambda, \alpha_0\rangle \geq 0$. Finally, $\langle A_i\lambda, \theta\rangle = \ell + \langle\lambda, w_i^{-1}\theta\rangle \leq \ell$, since $w_i^{-1}\theta$ is the negative of a simple root, so that the $A_i$ leave $A_\ell$ invariant. Next, $A_iA_j = \tau(\ell(\lambda_i^\vee + w_i\lambda_j^\vee))w_iw_j$. If $w_iw_j = 1$, we get by (4.6) and the previous proposition $A_iA_j = 1$. If, on the other hand $w_iw_j = w_k$, (4.7) yields $A_iA_j = A_k$. $\square$

Remark. The explicit action of $Z(G)$ on $A_\ell$ for all classical groups is given in § 4.4.

4.2. Positive energy representations of $LG$. We outline the classification of positive energy representations of $LG$ following [Wa]. Let $\pi$ be a projective unitary representation of $LG \rtimes \mathrm{Rot}(S^1)$ on a complex Hilbert space $H$, i.e., a strongly continuous homomorphism
\[
\pi : LG \rtimes \mathrm{Rot}(S^1) \longrightarrow PU(H) = U(H)/\mathbb{T}. \tag{4.10}
\]

Over $\mathrm{Rot}(S^1)$, $\pi$ lifts to a unitary representation which we denote by the same symbol. By definition, $H$ is of positive energy if
\[
H = \bigoplus_{n \geq n_0} H(n), \tag{4.11}
\]
where each $H(n) = \{\xi \in H \mid \pi(R_\theta)\xi = e^{in\theta}\xi\}$ is finite-dimensional. The lift is unique up to multiplication by a character of $\mathrm{Rot}(S^1)$ and we normalise it by choosing $n_0 = 0$ and $H(0) \neq \{0\}$. The classification of positive energy representations is obtained via the associated infinitesimal action of the Lie algebra of $\mathfrak{g}$-valued trigonometric polynomials $L^{\mathrm{pol}}\mathfrak{g} \subset L\mathfrak{g}$ in the following way. Consider the subspace $H^{\mathrm{fin}} \subset H$ of finite energy vectors for $\mathrm{Rot}(S^1)$, that is the algebraic direct sum of the $H(n)$. The latter is a core for the normalised self-adjoint generator of rotations which we denote by $d$. Thus
\[
d|_{H(n)} = n \quad\text{and}\quad \pi(R_\theta) = e^{i\theta d}. \tag{4.12}
\]

For any $X \in L^{\mathrm{pol}}\mathfrak{g}$, the one-parameter projective group $\pi(\exp_{LG}(tX))$ possesses a continuous lift to $U(H)$, unique up to multiplication by a character of $\mathbb{R}$. It is therefore given, via Stone's theorem by $e^{t\pi(X)}$, where $\pi(X)$ is a skew-adjoint operator determined up to an additive constant.

Theorem 4.1 (Wassermann). The subspace $H^{\mathrm{fin}}$ of finite energy vectors is an invariant core for the operators $\pi(X)$, $X \in L^{\mathrm{pol}}\mathfrak{g}$. The operators $\pi(X)$ may be chosen uniquely so as to satisfy $[d, \pi(X)] = i\pi(\dot X)$ on $H^{\mathrm{fin}}$ and then $X \to \pi(X)$ gives a projective representation of $L^{\mathrm{pol}}\mathfrak{g}$ on $H^{\mathrm{fin}}$ such that
\[
[\pi(X), \pi(Y)] = \pi([X, Y]) + i\ell B(X, Y), \tag{4.13}
\]
where $B(X, Y) = \int_0^{2\pi}\langle X, \dot Y\rangle\frac{d\theta}{2\pi}$ and $\ell$ is a non-negative integer called the level of $H$.

We denote the restriction of the operators $\pi(X)$, $X \in L^{\mathrm{pol}}\mathfrak{g}$ to $H^{\mathrm{fin}}$ by the same symbol and extend $\pi$ to a projective representation of $L^{\mathrm{pol}}\mathfrak{g}_{\mathbb{C}}$ on $H^{\mathrm{fin}}$ satisfying (4.13) as well as the formal adjunction property $\pi(X)^* = -\pi(\overline{X})$. The operators $\pi(X)$ and $d$ then give rise to a unitarisable representation of the Kac–Moody algebra $\widehat{\mathfrak{g}}_{\mathbb{C}}$ at level $\ell$ such that $d$ is diagonal with finite-dimensional eigenspaces and spectrum in $\mathbb{N}$. Such representations split into a direct sum of irreducibles, each of which is an integrable highest weight representation, that is a module generated over the enveloping algebra $U(\mathfrak{g}_\leq)$ by a vector $v$ uniquely determined by the requirement that it be annihilated by $\mathfrak{g}_\geq$ and diagonalises the action of $T \rtimes \mathrm{Rot}(S^1)$. Here $\mathfrak{g}_\leq$ (resp. $\mathfrak{g}_\geq$) is the nilpotent Lie algebra spanned by the $x(n)$ with $n < 0$ (resp. $n > 0$) and $x \in \mathfrak{g}_{\mathbb{C}}$ or $n = 0$ and $x$ lying in a negative (resp. positive) root space of $\mathfrak{g}_{\mathbb{C}}$. Thus, for any $h \in \mathfrak{t}_{\mathbb{C}}$,
\[
dv = nv, \tag{4.14}
\]
\[
\pi(h)v = \lambda(h)v \tag{4.15}
\]
for some $n \in \mathbb{N}$ and dominant integral weight $\lambda$ of $G$ which satisfies $\langle\lambda, \theta\rangle \leq \ell$. The pair $(\ell, \lambda)$ classifies the integrable representation uniquely [Ka]. The finite collection $A_\ell$ of dominant integral weights of $G$ satisfying $\langle\lambda, \theta\rangle \leq \ell$ is called the level $\ell$ alcove of $G$. The classification of a positive energy representation $(\pi, H)$ as an $LG$-module is equivalent to that of $H^{\mathrm{fin}}$ as an $L^{\mathrm{pol}}\mathfrak{g}$-module. In particular, $H$ is topologically irreducible under $LG$ iff $H^{\mathrm{fin}}$ is irreducible under $L^{\mathrm{pol}}\mathfrak{g}$ and is then uniquely determined by its level $\ell$ and highest weight $\lambda$. Moreover, for any $\ell \in \mathbb{N}$ and $\lambda \in A_\ell$, there exists a unique unitarisable highest weight $\widehat{\mathfrak{g}}_{\mathbb{C}}$-module with level $\ell$ and highest weight $\lambda$ [Ka]. The corresponding action of $L^{\mathrm{pol}}\mathfrak{g}$ may then be exponentiated to yield a positive energy representation of $LG$ on its Hilbert space completion [GoWa,TL2].

4.3. Action of $L_Z G$ on positive energy representations of $LG$. The group of discontinuous loops $L_Z G$ acts on $LG$ by conjugation and therefore on the irreducible projective unitary representations of $LG$ by $\zeta_*\pi(\gamma) = \pi(\zeta^{-1}\gamma\zeta)$, $\zeta \in L_Z G$. The following result shows that positive energy ones are stable under this action.


Proposition 4.3. If $(\pi, H)$ is an irreducible positive energy representation of $LG$ and $\zeta \in L_Z G$, the conjugated representation $\zeta_*\pi$ is of positive energy.

Proof. Any intertwining action of $\mathrm{Rot}(S^1)$ for $\zeta_*\pi$, whether of positive energy or not is necessarily given by
\[
\theta \longrightarrow V_\theta = \pi(\zeta^{-1}\zeta_\theta)U_\theta, \tag{4.16}
\]
where $\zeta_\theta(x) = \zeta(x - \theta)$ and $U_\theta$ is the positive energy intertwining action of $\mathrm{Rot}(S^1)$ for $\pi$. Indeed, $\zeta^{-1}\zeta_\theta \in LG$ and $V_\theta$ yields a projective action of $\mathrm{Rot}(S^1)$ satisfying $V_\theta\,\zeta_*\pi(\gamma)V_\theta^* = \zeta_*\pi(\gamma_\theta)$. Moreover, if $V_\theta^i$, $i = 1, 2$ are two actions of $\mathrm{Rot}(S^1)$ intertwining $\zeta_*\pi$, then $W_\theta = (V_\theta^1)^*V_\theta^2$ commutes projectively with $LG$ and the following holds in $U(H)$:
\[
W_\theta\pi(\gamma)W_\theta^*\pi(\gamma)^* = \chi(\gamma, \theta), \tag{4.17}
\]
where $\chi(\gamma, \theta) \in \mathbb{T}$ depends multiplicatively on either variable. Since $LG$ is perfect [PS, Prop. 3.4.1], $\chi \equiv 1$ and by Schur's lemma $W_\theta = 1$ in $PU(H)$. Thus, $\zeta_*\pi$ is of positive energy iff $V_\theta$ is a positive energy representation of $\mathrm{Rot}(S^1)$. It is sufficient to check this for a given set of representatives of $LG$-cosets in $L_Z G$ and therefore for the discontinuous loops $\zeta_\mu(\theta) = \exp_T(-i\theta\mu)$, $\mu \in \Lambda_Z^\vee$. If $\mu \in \Lambda_R^\vee$, then $\zeta_\mu \in LG$ and the action of $\mathrm{Rot}(S^1)$ given by (4.16) may be rewritten as $\pi(\zeta_\mu^{-1}(\zeta_\mu)_\theta)U_\theta = \pi(\zeta_\mu)^*U_\theta\pi(\zeta_\mu)$ which is of positive energy. The general case $\mu \in \Lambda_Z^\vee$ is settled by the following observation. Notice that $\zeta_\mu^{-1}(\zeta_\mu)_\theta = \zeta_\mu(-\theta) = \exp_T(i\theta\mu)$ since $\zeta_\mu$ is a homomorphism and write $\mu$ as a convex combination of elements in the coroot lattice, $\mu = \sum_{i=1}^m t_i\mu_i$, $t_i \in (0, 1]$, $\sum_i t_i = 1$, $\mu_i \in \Lambda_R^\vee$. Since $\pi$ lifts to a unitary representation $\tilde\pi$ over $G \times \mathrm{Rot}(S^1) \supset T \times \mathrm{Rot}(S^1)$,
\[
\tilde\pi(\exp_T(i\theta\mu))\tilde\pi(R_\theta) = \prod_j \tilde\pi(\exp_T(i\theta t_j\mu_j))\tilde\pi(R_{t_j\theta}) \tag{4.18}
\]
is a lift of $\pi(\zeta_\mu^{-1}(\zeta_\mu)_\theta)U_\theta$ and the product of $m$ commuting representations of $\mathbb{R}$ which by our previous argument are of positive energy. It follows that $\pi(\zeta_\mu^{-1}(\zeta_\mu)_\theta)U_\theta$ is of positive energy. $\square$

Proposition 4.4. Let $(\pi, H)$ be a positive energy representation of $LG$ of level $\ell$ and $\zeta \in L_Z G$. Then,
(i) $\zeta_*\pi$ is of level $\ell$.
(ii) If $\zeta(\phi) = \exp_T(-i\phi\mu)$ is the discontinuous loop corresponding to $\mu \in \Lambda_Z^\vee$, the subspaces of finite energy vectors of $\pi$ and $\zeta_*\pi$ coincide.
(iii) If $\mathrm{Ad}(\zeta)L^{\mathrm{pol}}\mathfrak{g} = L^{\mathrm{pol}}\mathfrak{g}$ and the finite energy subspaces of $\pi$ and $\zeta_*\pi$ coincide, the conjugated action of $L^{\mathrm{pol}}\mathfrak{g}$ on $H^{\mathrm{fin}}$ is given by
\[
\zeta_*\pi(X) = \pi(\zeta^{-1}X\zeta) + i\ell\int_0^{2\pi}\langle\dot\zeta\zeta^{-1}, X\rangle\frac{d\theta}{2\pi}. \tag{4.19}
\]

Proof. It is sufficient to check (i) for the discontinuous loops $\zeta_\mu$, $\mu \in \Lambda_Z^\vee$. This will be done in the course of the proof of (iii).


(ii) It was remarked in the proof of the previous proposition that the conjugated action of rotations (4.16) corresponding to $\zeta_\mu$ is given by
\[
\pi(\zeta_\mu^{-1}(\zeta_\mu)_\theta)U_\theta = \pi(\exp_T(i\theta\mu))U_\theta \tag{4.20}
\]
which commutes with the original action of $\mathrm{Rot}(S^1)$ given by $U_\theta$. Since both are of positive energy, their finite energy subspaces coincide.

(iii) Let $h \in \mathbb{N}$ be the level of $\zeta_*\pi$ and denote by $\pi$ and $\zeta_*\pi$ the projective representations of $L^{\mathrm{pol}}\mathfrak{g}$ on $H^{\mathrm{fin}}$ given by Theorem 4.1, so that
\[
[\pi(X), \pi(Y)] = \pi([X, Y]) + i\ell B(X, Y), \tag{4.21}
\]
\[
[\zeta_*\pi(X), \zeta_*\pi(Y)] = \zeta_*\pi([X, Y]) + ihB(X, Y). \tag{4.22}
\]
Evidently, $\zeta_*\pi(X) = \pi(\zeta^{-1}X\zeta) + iF(X)$ for some $F(X) \in \mathbb{R}$ since $\zeta_*\pi(\exp_{LG}(X)) = \pi(\exp_{LG}(\zeta^{-1}X\zeta)) = e^{\pi(\zeta^{-1}X\zeta)}$ in $PU(H)$. It follows, by (3.16) that
\[
\begin{aligned}
[\zeta_*\pi(X), \zeta_*\pi(Y)] &= [\pi(\zeta^{-1}X\zeta), \pi(\zeta^{-1}Y\zeta)] \\
&= \pi(\zeta^{-1}[X, Y]\zeta) + i\ell B(\zeta^{-1}X\zeta, \zeta^{-1}Y\zeta) \\
&= \zeta_*\pi([X, Y]) - iF([X, Y]) + i\ell B(X, Y) + i\ell\int_0^{2\pi}\langle\dot\zeta\zeta^{-1}, [X, Y]\rangle\frac{d\theta}{2\pi}.
\end{aligned} \tag{4.23}
\]
Since $hB$ and $\ell B$ lie in the same cohomology class iff $h = \ell$, we find by equating the above with (4.22) that the level of $\zeta_*\pi$ is $\ell$. Moreover, (4.19) holds since $[L^{\mathrm{pol}}\mathfrak{g}, L^{\mathrm{pol}}\mathfrak{g}] = L^{\mathrm{pol}}\mathfrak{g}$. $\square$

Theorem 4.2. Let $(\pi, H)$ be an irreducible positive energy representation of $LG$ of level $\ell$ and highest weight $\lambda$ and $\zeta \in L_Z G$. Then, the conjugated representation $\zeta_*\pi(\gamma) = \pi(\zeta^{-1}\gamma\zeta)$ on $H$ is of positive energy, level $\ell$ and highest weight $\zeta\lambda$, where the notation refers to the geometric action of $Z(G) \cong L_{Z(G)}G/LG$ on the level $\ell$ alcove defined by Proposition 4.2.

Proof. It suffices to prove the result for a given choice of representatives of $LG$-cosets in $L_Z G$. Let $z \in Z\setminus\{1\}$ correspond to the special root $\alpha_j$ by Lemma 2.2 and consider the discontinuous loop $\zeta = \zeta_{\lambda_j^\vee}w_j$, where $\lambda_j^\vee$ is the associated fundamental coweight and $w_j \in G$ a representative of the Weyl group element corresponding to $z$ by Proposition 4.1. Since $G$ commutes with $\mathrm{Rot}(S^1)$, the subspace of finite energy vectors of $\pi$ coincides with that of $(w_j)_*\pi$ and, by the previous proposition, with that of $\zeta_*\pi$. We may therefore compare the infinitesimal actions of $L^{\mathrm{pol}}\mathfrak{g}_{\mathbb{C}}$ corresponding to $\pi$ and $\zeta_*\pi$ on $H^{\mathrm{fin}}$. If $\alpha$ is a root and $x_\alpha \in \mathfrak{g}_\alpha$, then $[\lambda_j^\vee, x_\alpha] = \langle\lambda_j^\vee, \alpha\rangle x_\alpha$. Since $\zeta_{\lambda_j^\vee}(\theta) = \exp_T(-i\lambda_j^\vee\theta)$, this gives
\[
\zeta_{\lambda_j^\vee}^{-1}\,x_\alpha(n)\,\zeta_{\lambda_j^\vee}(\theta) = x_\alpha \otimes e^{i\theta(n + \langle\lambda_j^\vee, \alpha\rangle)} = x_\alpha(n + \langle\lambda_j^\vee, \alpha\rangle)(\theta). \tag{4.24}
\]
Therefore, up to a non-zero multiplicative constant
\[
\zeta_*\pi(e_\alpha(n)) = \pi(e_{w_j^{-1}\alpha}(n + \langle\lambda_j^\vee, \alpha\rangle)) \tag{4.25}
\]
since no additional term arises from (4.19) because $\dot\zeta\zeta^{-1} = -i\lambda_j^\vee$ lies in $\mathfrak{t}_{\mathbb{C}}$ which is orthogonal to $\mathfrak{g}_\alpha$. If, on the other hand $h \in \mathfrak{t}_{\mathbb{C}}$, then $\zeta^{-1}h(n)\zeta = w_j^{-1}h(n)$ and (4.19) reads
\[
\zeta_*\pi(h(n)) = \pi(w_j^{-1}h(n)) + \ell\delta_{n,0}\langle h, \lambda_j^\vee\rangle. \tag{4.26}
\]

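Let $\Omega \in H^{\mathrm{fin}}$ be the highest weight vector for $\zeta_*\pi$. We claim that, up to a scalar factor, $\Omega = \Upsilon$, the highest weight vector for $\pi$. To see this, recall that $\Omega$ is the unique element of $H^{\mathrm{fin}}$ annihilated by the subalgebra $\mathfrak{g}_\geq$ spanned by the $x(n)$, $x \in \mathfrak{g}_{\mathbb{C}}$ and $n > 0$ and the $x_\alpha(0)$ with $\alpha > 0$. $\mathfrak{g}_\geq$ is generated by the elements corresponding to the simple affine roots, namely $e_{\alpha_i}(0)$ and $e_{\alpha_0}(1)$, where $\alpha_0 = -\theta$. Recalling from Proposition 4.1 that $w_j^{-1}$ acts as a permutation of $\overline{\Delta} = \{\alpha_0, \ldots, \alpha_n\}$ and maps $\alpha_j$ to $\alpha_0$, we get, using (4.25),
\[
\zeta_*\pi(e_{\alpha_k}(0)) = \begin{cases} \pi(e_{w_j^{-1}\alpha_k}(0)) & \text{if } k \neq j \\ \pi(e_{\alpha_0}(1)) & \text{if } k = j \end{cases}, \tag{4.27}
\]
\[
\zeta_*\pi(e_{\alpha_0}(1)) = \pi(e_{w_j^{-1}\alpha_0}(0)) \tag{4.28}
\]
whence $\Omega = \Upsilon$. To find the weight of $\Omega$ and therefore the highest weight of $\zeta_*\pi$, we use (4.26) and the fact that $\pi(h(0))\Upsilon = \langle\lambda, h\rangle\Upsilon$ whenever $h \in \mathfrak{t}_{\mathbb{C}}$ so that
\[
\zeta_*\pi(h(0))\Omega = \langle h, w_j\lambda + \ell\lambda_j^\vee\rangle\Omega = \langle h, \zeta\lambda\rangle\Omega. \tag{4.29}
\]
$\square$

The proof of the above theorem has the following useful

Corollary 4.2. Let $z \in Z(G)\setminus\{1\}$ and $\lambda_j^\vee \in \Lambda_W^\vee$ and $w_j \in G$ the fundamental coweight and representative of the Weyl group element corresponding to $z$ by Lemma 2.2 and Proposition 4.1 respectively. Then, conjugation by $\zeta = \zeta_{\lambda_j^\vee}w_j$ induces an automorphism of $L^{\mathrm{pol}}\mathfrak{g}_{\mathbb{C}}$ preserving its triangular decomposition.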

4.4. Appendix: Explicit action of $Z(G)$ on the level $\ell$ alcove. We describe explicitly the action of $Z(G)$ given by Proposition 4.2 for all classical groups, using the tables [Bou, Tables I–IV]. For each special root $\alpha_i$, we denote the corresponding element of $W_0 \subset W$ by $w_i$ and note that the fundamental coweight $\lambda_i^\vee$ and weight $\lambda_i$ coincide since $\alpha_i$ is long. We let moreover $\theta_i$, $i = 1 \ldots n$ be the standard basis in $\mathbb{R}^n$ and $I$ the lattice $\bigoplus_i \theta_i \cdot \mathbb{Z}$. Unless otherwise stated, the basic inner product is the standard one on $\mathbb{R}^n$.

$SU_n$, $n \geq 2$.
simple roots: $\alpha_i = \theta_i - \theta_{i+1}$, $i = 1 \ldots n-1$.
highest root: $\theta = \theta_1 - \theta_n = \alpha_1 + \cdots + \alpha_{n-1}$.
minimal dominant coweights: $\lambda_i^\vee = \theta_1 + \cdots + \theta_i - \frac{i}{n}\sum_j \theta_j$, $i = 1 \ldots n-1$.
Weyl group: $S_n$ acting by permutation of the $\theta_i$.
$W_0$: $w_k$ is the cyclic permutation $(\theta_1 \cdots \theta_n)^k = (\alpha_0 \cdots \alpha_{n-1})^k$.
level $\ell$ alcove: $A_\ell = \{\mu \in I \mid \mu_1 \geq \cdots \geq \mu_n,\ \mu_1 - \mu_n \leq \ell\}/(\sum_j \theta_j)$.
action of the centre: $A_k(\mu_1, \ldots, \mu_n) = (\ell + \mu_{n+1-k}, \ldots, \ell + \mu_n, \mu_1, \ldots, \mu_{n-k})$.
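The $SU_n$ formula above lends itself to a direct numerical check. The following Python sketch is an illustration only: it assumes that classes modulo $\sum_j \theta_j$ are represented by the tuple with last entry 0, and the helper names (`normalise`, `in_alcove`, `centre_action`, `alcove`) are hypothetical. It enumerates the level $\ell$ alcove and applies $A_k$ as stated.

```python
from itertools import combinations_with_replacement


def normalise(mu):
    """Representative of mu modulo multiples of (1, ..., 1): last entry 0."""
    return tuple(m - mu[-1] for m in mu)


def in_alcove(mu, level):
    """mu_1 >= ... >= mu_n and mu_1 - mu_n <= level."""
    decreasing = all(a >= b for a, b in zip(mu, mu[1:]))
    return decreasing and mu[0] - mu[-1] <= level


def centre_action(k, mu, level):
    """A_k(mu) = (l + mu_{n+1-k}, ..., l + mu_n, mu_1, ..., mu_{n-k})."""
    n = len(mu)
    k %= n
    if k == 0:
        return normalise(mu)
    shifted = tuple(level + m for m in mu[n - k:]) + tuple(mu[:n - k])
    return normalise(shifted)


def alcove(n, level):
    """All normalised weights level >= mu_1 >= ... >= mu_{n-1} >= mu_n = 0."""
    return [tuple(reversed(c)) + (0,)
            for c in combinations_with_replacement(range(level + 1), n - 1)]


if __name__ == "__main__":
    n, level = 3, 2
    for mu in alcove(n, level):
        print(mu, "->", centre_action(1, mu, level))
```

In this model $A_jA_k = A_{j+k}$ with indices read modulo $n$, which is the group law of $\mathbb{Z}_n$ expected from Proposition 4.2.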


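$\mathrm{Spin}_{2n+1}$, $n \geq 2$.
simple roots: $\alpha_i = \theta_i - \theta_{i+1}$, $i = 1 \ldots n-1$ and $\alpha_n = \theta_n$.
highest root: $\theta = \theta_1 + \theta_2 = \alpha_1 + 2(\alpha_2 + \cdots + \alpha_n)$.
minimal dominant coweight: $\lambda_1^\vee = \theta_1$.
Weyl group: $S_n \ltimes \mathbb{Z}_2^n$ acts by permutations and sign changes of the $\theta_i$.
$W_0$: $w_1$ is the sign change $\theta_1 \to -\theta_1$ permuting $\alpha_0$ and $\alpha_1$.
level $\ell$ alcove: $A_\ell = \{\mu \in I + \frac{1}{2}(\theta_1 + \ldots + \theta_n)\mathbb{Z} \mid \mu_1 \geq \cdots \geq \mu_n \geq 0,\ \mu_1 + \mu_2 \leq \ell\}$.
action of the centre: $A_1(\mu_1, \mu_2, \ldots, \mu_n) = (\ell - \mu_1, \mu_2, \ldots, \mu_n)$.

$Sp_n$, $n \geq 2$.
simple roots: $\alpha_i = \theta_i - \theta_{i+1}$, $i = 1 \ldots n-1$ and $\alpha_n = 2\theta_n$.
highest root: $\theta = 2\theta_1 = 2(\alpha_1 + \cdots + \alpha_{n-1}) + \alpha_n$.
basic inner product: half the standard one on $\mathbb{R}^n$.
minimal dominant coweight: $\lambda_n^\vee = \theta_1 + \cdots + \theta_n$.
Weyl group: $S_n \ltimes \mathbb{Z}_2^n$ acts by permutations and sign changes of the $\theta_i$.
$W_0$: $w_n$ is the transformation $\theta_i \to -\theta_{n+1-i}$.
level $\ell$ alcove: $A_\ell = \{\mu \in I \mid \ell \geq \mu_1 \geq \cdots \geq \mu_n \geq 0\}$.
action of the centre: $A_n(\mu_1, \ldots, \mu_n) = (\ell - \mu_n, \ldots, \ell - \mu_1)$.

$\mathrm{Spin}_{2n}$, $n \geq 3$.
simple roots: $\alpha_i = \theta_i - \theta_{i+1}$, $i = 1 \ldots n-1$ and $\alpha_n = \theta_{n-1} + \theta_n$.
highest root: $\theta = \theta_1 + \theta_2 = \alpha_1 + 2(\alpha_2 + \cdots + \alpha_{n-2}) + \alpha_{n-1} + \alpha_n$.
minimal dominant coweights: $\lambda_1 = \theta_1$, $\lambda_{n-1} = \frac{1}{2}(\theta_1 + \cdots + \theta_{n-1} - \theta_n)$, $\lambda_n = \frac{1}{2}(\theta_1 + \cdots + \theta_n)$.
Weyl group: $S_n \ltimes \mathbb{Z}_2^{n-1}$ acts by permutations and even numbers of sign changes of the $\theta_i$.
$W_0$: $w_1$ is the sign change $\theta_1 \to -\theta_1$, $\theta_n \to -\theta_n$ and permutes $\{\alpha_0, \alpha_1\}$ and $\{\alpha_{n-1}, \alpha_n\}$. For $n$ even, $w_{n-1}$ is given by $\theta_i \to -\theta_{n+1-i}$, $2 \leq i \leq n-1$ and $\theta_1 \leftrightarrow \theta_n$ and permutes $\{\alpha_0, \alpha_{n-1}\}$ and $\{\alpha_1, \alpha_n\}$ while $w_n$ is given by $\theta_i \to -\theta_{n+1-i}$ and permutes $\{\alpha_k, \alpha_{n-k}\}$. For $n$ odd, $w_{n-1}$ is given by $\theta_1 \to \theta_n$ and $\theta_i \to -\theta_{n+1-i}$, $i = 2 \ldots n$ and acts as the cyclic permutation $(\alpha_1\,\alpha_n\,\alpha_0\,\alpha_{n-1})$ while $w_n$ is given by $\theta_i \to -\theta_{n+1-i}$, $i = 1 \ldots n-1$ and $\theta_n \to \theta_1$ and acts as $(\alpha_1\,\alpha_n\,\alpha_0\,\alpha_{n-1})^{-1}$.
level $\ell$ alcove: $A_\ell = \{\mu \in I + \frac{1}{2}(\theta_1 + \ldots + \theta_n)\mathbb{Z} \mid \mu_1 \geq \cdots \geq \mu_{n-1} \geq |\mu_n|,\ \mu_1 + \mu_2 \leq \ell\}$.
action of the centre:
\[
A_1(\mu_1, \ldots, \mu_n) = (\ell - \mu_1, \mu_2, \ldots, \mu_{n-1}, -\mu_n). \tag{4.30}
\]
\[
A_{n-1}(\mu_1, \ldots, \mu_n) = \begin{cases} (\frac{\ell}{2} + \mu_n, \frac{\ell}{2} - \mu_{n-1}, \ldots, \frac{\ell}{2} - \mu_2, -\frac{\ell}{2} + \mu_1) & n \text{ even} \\ (\frac{\ell}{2} - \mu_n, \ldots, \frac{\ell}{2} - \mu_2, -\frac{\ell}{2} + \mu_1) & n \text{ odd} \end{cases}. \tag{4.31}
\]
\[
A_n(\mu_1, \ldots, \mu_n) = \begin{cases} (\frac{\ell}{2} - \mu_n, \ldots, \frac{\ell}{2} - \mu_1) & n \text{ even} \\ (\frac{\ell}{2} + \mu_n, \frac{\ell}{2} - \mu_{n-1}, \ldots, \frac{\ell}{2} - \mu_1) & n \text{ odd} \end{cases}. \tag{4.32}
\]

5. The Mackey Obstruction Corresponding to $LG \subset L_Z G$

We determine below the central extensions of $L_Z G$ corresponding to positive energy representations which remain irreducible when restricted to $LG$. More precisely, let $H$ be an irreducible, level $\ell$ positive energy representation of $LG$ the isomorphism class of which is invariant under $L_Z G$. $H$ gives rise to a projective action $\pi$ of $L_Z G$ extending that of $LG$ and therefore by pull-back of
\[
1 \to \mathbb{T} \to U(H) \to PU(H) \to 1 \tag{5.1}
\]
to a central extension $\widetilde{L_Z G}$ of $L_Z G$. Since the restriction of $\widetilde{L_Z G}$ to $LG$ is smooth and of level $\ell$ [TL1, Chap. II, §2.4], [TL2, Prop. 5.3.1], $\widetilde{L_Z G}$ is smooth and therefore falls within the classification of Sect. 3. In particular, it is uniquely determined by its commutator map
\[
\omega(\lambda, \mu) = \pi(\zeta_\lambda)\pi(\zeta_\mu)\pi(\zeta_\lambda)^*\pi(\zeta_\mu)^* \tag{5.2}
\]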

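which satisfies
\[
\omega(\alpha, \mu) = (-1)^{\ell\langle\alpha,\mu\rangle} \tag{5.3}
\]
whenever $\alpha \in \Lambda_R^\vee \subset \Lambda_Z^\vee$. This binds $\omega$ uniquely if $Z$ is cyclic, for if $\lambda \in \Lambda_Z^\vee$ is a generator of $Z \cong \Lambda_Z^\vee/\Lambda_R^\vee$, then $\omega(\alpha + a\lambda, \beta + b\lambda) = (-1)^{\ell(\langle\alpha,\beta\rangle + \langle\lambda, a\beta + b\alpha\rangle)}$ for any $\alpha, \beta \in \Lambda_R^\vee$. We therefore only need to investigate the case of $G/Z = \mathrm{Spin}_{4n}/(\mathbb{Z}_2 \times \mathbb{Z}_2)$, $n \geq 2$. The main result of this section is that in this case $\ell$ is even and $\omega \equiv 1$. We begin by computing a number of commutators related to (5.2). Recall that the roots of $\mathrm{Spin}_{4n}$ are the vectors $\pm\theta_i \pm \theta_j$, $1 \leq i \neq j \leq 2n$, where $\{\theta_i\}$ is the canonical basis of $\mathbb{R}^{2n}$. The simple roots are $\alpha_i = \theta_i - \theta_{i+1}$, $i = 1 \ldots 2n-1$ and $\alpha_{2n} = \theta_{2n-1} + \theta_{2n}$ and the highest root is $\theta = \theta_1 + \theta_2 = \alpha_1 + 2(\alpha_2 + \cdots + \alpha_{2n-2}) + \alpha_{2n-1} + \alpha_{2n}$ so that the minimal dominant coweights are $\lambda_1^\vee = \theta_1$, $\lambda_{2n-1}^\vee = \frac{1}{2}(\theta_1 + \cdots + \theta_{2n-1} - \theta_{2n})$ and $\lambda_{2n}^\vee = \frac{1}{2}(\theta_1 + \cdots + \theta_{2n})$. Fix, for any positive root $\alpha$, a basis $e_\alpha, f_\alpha, h_\alpha = \alpha$ of the corresponding $\mathfrak{sl}_2(\mathbb{C})$-subalgebra of $\mathfrak{so}_{4n,\mathbb{C}}$ such that $e_\alpha^* = f_\alpha$, where $*$ is the canonical anti-linear anti-involution acting as $-1$ on $\mathfrak{so}_{4n}$. Then,

Lemma 5.1. Let $w_j$ be the Weyl group elements corresponding to the minimal dominant coweights $\lambda_j^\vee$, $j \in \{1, 2n-1, 2n\}$ by Proposition 4.1. Then, the following elements may be taken as representatives of $w_j$ in $\mathrm{Spin}_{4n}$:
\[
w_1 = \exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_1+\theta_{2n}} - f_{\theta_1+\theta_{2n}})\Bigr)\exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_1-\theta_{2n}} - f_{\theta_1-\theta_{2n}})\Bigr), \tag{5.4}
\]
\[
w_{2n-1} = \prod_{i=2}^n \exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_i+\theta_{2n-i+1}} - f_{\theta_i+\theta_{2n-i+1}})\Bigr)\exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_1-\theta_{2n}} - f_{\theta_1-\theta_{2n}})\Bigr), \tag{5.5}
\]
\[
w_{2n} = \prod_{i=1}^n \exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_i+\theta_{2n-i+1}} - f_{\theta_i+\theta_{2n-i+1}})\Bigr). \tag{5.6}
\]
Moreover, the group commutators $[w_j, w_k] = w_jw_kw_j^{-1}w_k^{-1}$ are all equal to one.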

Proof. It is readily seen that the $w_j$ may be expressed as
\[
w_1 = \sigma_{\theta_1+\theta_{2n}}\sigma_{\theta_1-\theta_{2n}}, \qquad w_{2n-1} = \prod_{i=2}^n \sigma_{\theta_i+\theta_{2n-i+1}}\,\sigma_{\theta_1-\theta_{2n}}, \qquad w_{2n} = \prod_{i=1}^n \sigma_{\theta_i+\theta_{2n-i+1}}, \tag{5.7}
\]
where $\sigma_\alpha$ is the orthogonal reflection corresponding to the root $\alpha$ and acts on $\mathfrak{t}_{\mathbb{C}}$ as
\[
\sigma_\alpha(h) = h - \langle h, \alpha\rangle\alpha^\vee. \tag{5.8}
\]
Each $\sigma_\alpha$ may be lifted in $\mathrm{Spin}_{4n}$ to $s_\alpha = \exp_{\mathrm{Spin}_{4n}}\bigl(\frac{\pi}{2}(e_\alpha - f_\alpha)\bigr)$. Indeed, a power series expansion shows that $\mathrm{Ad}(s_\alpha)$ leaves $\mathfrak{t}_{\mathbb{C}}$ invariant and coincides with the right-hand side of (5.8). Thus, the right-hand sides of (5.4)–(5.6) do give lifts of $w_1, w_{2n-1}, w_{2n}$ respectively. The last claim is a consequence of the fact that the lifts of $w_j$ and $w_k$ only involve roots $\alpha_{j_p}, \beta_{k_q}$ such that $\alpha_{j_p} \pm \beta_{k_q}$ is either zero or not a root. Thus, $[e_{\alpha_{j_p}} - f_{\alpha_{j_p}}, e_{\beta_{k_q}} - f_{\beta_{k_q}}] = 0$ and $s_{\alpha_{j_p}}$ and $s_{\beta_{k_q}}$ commute. $\square$

Lemma 5.2. Let $(\pi, H)$ be an irreducible positive energy representation of $L\,\mathrm{Spin}_{4n}$ the isomorphism class of which is invariant under $L_{\mathbb{Z}_2 \times \mathbb{Z}_2}\mathrm{Spin}_{4n}$ and denote by the same symbol its unique extension to the latter group. Then, if $w_j$ are as in Lemma 5.1 and $\zeta_j(\theta) = \exp_T(-i\theta\lambda_j^\vee)$ are the discontinuous loops corresponding to the minimal dominant coweights $\lambda_j^\vee$,
\[
\pi(\zeta_jw_j)\pi(\zeta_kw_k)\pi(\zeta_jw_j)^*\pi(\zeta_kw_k)^* = 1. \tag{5.9}
\]

Proof. Let $\lambda_i^\vee$ be the minimal dominant coweight with corresponding Weyl group element $w_i$ such that $w_jw_k = w_i$. By (4.7), $w_j\lambda_k^\vee = \lambda_i^\vee - \lambda_j^\vee$ so that
\[
w_j\zeta_k(\theta)w_j^{-1} = \exp_T(-i\theta w_j\lambda_k^\vee) = \zeta_i\zeta_j^{-1}(\theta), \tag{5.10}
\]
and similarly
\[
w_k\zeta_jw_k^{-1} = \zeta_i\zeta_k^{-1}. \tag{5.11}
\]
Thus,
\[
[\zeta_jw_j, \zeta_kw_k] = \zeta_j\,w_j\zeta_kw_j^{-1}\,[w_j, w_k]\,w_k\zeta_j^{-1}w_k^{-1}\,\zeta_k^{-1} = \zeta_i[w_j, w_k]\zeta_i^{-1} \tag{5.12}
\]
which, by Lemma 5.1, equals 1. It follows that the group commutator $[\pi(\zeta_jw_j), \pi(\zeta_kw_k)]$ acts as a scalar $\chi$ on $H$. To evaluate $\chi$, recall that by Corollary 4.2, conjugation by $\zeta_jw_j$ and $\zeta_kw_k$ induces an automorphism of $L^{\mathrm{pol}}\mathfrak{so}_{4n}$ preserving its triangular decomposition so that the unitaries $\pi(\zeta_jw_j)$, $\pi(\zeta_kw_k)$ leave the highest weight vector in $H$ invariant and $\chi = 1$. $\square$

Theorem 5.1. Let $\pi$ be an irreducible positive energy representation of $L\,\mathrm{Spin}_{4n}$ the isomorphism class of which is invariant under $L_{\mathbb{Z}_2 \times \mathbb{Z}_2}\mathrm{Spin}_{4n}$ and extend it uniquely to a projective representation of the latter group. Then, the corresponding central extension of $L_{\mathbb{Z}_2 \times \mathbb{Z}_2}\mathrm{Spin}_{4n}$ has even level and trivial commutator map.

Proof. As readily checked from Eq. (4.32), the geometric action of $Z(\mathrm{Spin}_{4n})$ on the level $\ell$ alcove has fixed points only if $\ell$ is even. Thus from (5.3), we get $\omega(\alpha, \mu) = 1$ whenever $\alpha \in \Lambda_R^\vee$ and $\omega$ is the pull-back of one of the two skew-symmetric forms on $\Lambda_W^\vee/\Lambda_R^\vee \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ so that $\omega \equiv 1$ iff $\omega(\lambda_j^\vee, \lambda_k^\vee) = 1$ for a pair of distinct minimal dominant coweights $\lambda_j^\vee, \lambda_k^\vee$.


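By Lemma 5.1 and the fact that $\pi$ lifts to a unitary representation of $\mathrm{Spin}_{4n}$,
\[
\begin{aligned}
[\pi(\zeta_jw_j), \pi(\zeta_kw_k)] &= \pi(\zeta_j)\pi(\zeta_k)[\pi(\zeta_k)^*, \pi(w_j)][\pi(w_j), \pi(w_k)][\pi(w_k), \pi(\zeta_j)^*]\pi(\zeta_j)^*\pi(\zeta_k)^* \\
&= \pi(\zeta_j)\pi(\zeta_k)[\pi(\zeta_k)^*, \pi(w_j)][\pi(w_k), \pi(\zeta_j)^*]\pi(\zeta_j)^*\pi(\zeta_k)^*.
\end{aligned} \tag{5.13}
\]
By (5.10), $[\zeta_k^{-1}, w_j] = \zeta_k^{-1}\zeta_i\zeta_j^{-1}$ so that $[\pi(\zeta_k)^*, \pi(w_j)]$ is proportional to $\pi(\zeta_k)^*\pi(\zeta_i)\pi(\zeta_j)^*$. Similarly, by (5.11), $[\pi(w_k), \pi(\zeta_j)^*]$ is proportional to $\pi(\zeta_k)\pi(\zeta_i)^*\pi(\zeta_j)$ so that, by (5.2), (5.13) is equal to
\[
\omega(\lambda_j^\vee, \lambda_k^\vee)[\pi(\zeta_k)^*, \pi(w_j)][\pi(w_k), \pi(\zeta_j)^*], \tag{5.14}
\]
and, by Lemma 5.2,
\[
\omega(\lambda_j^\vee, \lambda_k^\vee) = [\pi(\zeta_j)^*, \pi(w_k)][\pi(w_j), \pi(\zeta_k)^*]. \tag{5.15}
\]
We shall now show that the right-hand side of (5.15) is equal to one. Set $j = 1$ and $k = 2n$ so that the corresponding coweights are $\lambda_1^\vee = \theta_1$ and $\lambda_{2n}^\vee = \frac{1}{2}(\theta_1 + \cdots + \theta_{2n})$. As previously noted, if $x_\beta \in \mathfrak{so}_{4n,\mathbb{C}}$ is a root vector corresponding to $\beta$ and $\zeta_\lambda(\theta) = \exp_T(-i\theta\lambda)$, then
\[
\zeta_\lambda^{-1}x_\beta(n)\zeta_\lambda = x_\beta(n + \langle\lambda, \beta\rangle), \tag{5.16}
\]
and therefore
\[
\zeta_1^{-1}w_{2n}\zeta_1 = \exp_{L\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_1+\theta_{2n}}(1) - f_{\theta_1+\theta_{2n}}(-1))\Bigr)\prod_{i=2}^n \exp_{\mathrm{Spin}_{4n}}\Bigl(\frac{\pi}{2}(e_{\theta_i+\theta_{2n-i+1}} - f_{\theta_i+\theta_{2n-i+1}})\Bigr), \tag{5.17}
\]
\[
\zeta_{2n}^{-1}w_1^{-1}\zeta_{2n} = \exp_{\mathrm{Spin}_{4n}}\Bigl(-\frac{\pi}{2}(e_{\theta_1-\theta_{2n}} - f_{\theta_1-\theta_{2n}})\Bigr)\exp_{L\mathrm{Spin}_{4n}}\Bigl(-\frac{\pi}{2}(e_{\theta_1+\theta_{2n}}(1) - f_{\theta_1+\theta_{2n}}(-1))\Bigr). \tag{5.18}
\]
Denoting by $\rho$ the infinitesimal action of $L^{\mathrm{pol}}\mathfrak{so}_{4n,\mathbb{C}}$ on $H^{\mathrm{fin}}$ corresponding to $\pi$ via Theorem 4.1, we therefore get by Proposition 4.4,
\[
[\pi(\zeta_1)^*, \pi(w_{2n})] = \exp\Bigl(\frac{\pi}{2}\rho(e_{\theta_1+\theta_{2n}}(1) - f_{\theta_1+\theta_{2n}}(-1))\Bigr)\exp\Bigl(-\frac{\pi}{2}\rho(e_{\theta_1+\theta_{2n}} - f_{\theta_1+\theta_{2n}})\Bigr), \tag{5.19}
\]
\[
[\pi(w_1), \pi(\zeta_{2n})^*] = \exp\Bigl(\frac{\pi}{2}\rho(e_{\theta_1+\theta_{2n}} - f_{\theta_1+\theta_{2n}})\Bigr)\exp\Bigl(-\frac{\pi}{2}\rho(e_{\theta_1+\theta_{2n}}(1) - f_{\theta_1+\theta_{2n}}(-1))\Bigr), \tag{5.20}
\]
where the exponentials are given by the spectral theorem. Thus,
\[
[\pi(\zeta_1)^*, \pi(w_{2n})][\pi(w_1), \pi(\zeta_{2n})^*] = 1 \tag{5.21}
\]
as claimed. $\square$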
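The first step of the proof of Theorem 5.1, namely that the centre of $\mathrm{Spin}_{4n}$ has fixed points on the level $\ell$ alcove only if $\ell$ is even, can be checked by brute force in the smallest case $\mathrm{Spin}_8$. The Python sketch below is an illustration under stated assumptions: it takes formulas (4.30)–(4.32) for $\mathrm{Spin}_{2n}$ with $n = 4$ even, works in doubled coordinates $m_i = 2\mu_i$ so that half-integral weights become integers, and uses hypothetical helper names (`alcove`, `a1`, `a3`, `a4`, `fixed_points`).

```python
from itertools import product

RANK = 4  # Spin_8 is Spin_{2n} with n = 4 (n even)


def alcove(level):
    """Level-l alcove of Spin_8 in doubled coordinates m_i = 2 * mu_i.

    Entries are all even (integral weights) or all odd (half-integral ones),
    with m_1 >= m_2 >= m_3 >= |m_4| and m_1 + m_2 <= 2 * level.
    """
    points = []
    values = range(-2 * level, 2 * level + 1)
    for m in product(values, repeat=RANK):
        same_parity = len({x % 2 for x in m}) == 1
        ordered = m[0] >= m[1] >= m[2] >= abs(m[3])
        if same_parity and ordered and m[0] + m[1] <= 2 * level:
            points.append(m)
    return points


def a1(m, level):
    """(4.30): (l - mu_1, mu_2, mu_3, -mu_4), doubled."""
    return (2 * level - m[0], m[1], m[2], -m[3])


def a3(m, level):
    """(4.31), n even: (l/2 + mu_4, l/2 - mu_3, l/2 - mu_2, -l/2 + mu_1), doubled."""
    return (level + m[3], level - m[2], level - m[1], -level + m[0])


def a4(m, level):
    """(4.32), n even: (l/2 - mu_4, ..., l/2 - mu_1), doubled."""
    return (level - m[3], level - m[2], level - m[1], level - m[0])


def fixed_points(level):
    """Alcove points fixed by all of Z(Spin_8) = Z_2 x Z_2."""
    return [m for m in alcove(level)
            if a1(m, level) == m and a3(m, level) == m and a4(m, level) == m]


if __name__ == "__main__":
    for level in range(1, 5):
        print(level, fixed_points(level))
```

In this small range no alcove point is fixed when $\ell$ is odd, while for even $\ell$ the fixed points have the form $\mu = (\ell/2, \mu_2, \ell/2 - \mu_2, 0)$.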


6. Positive Energy Representations of $L_Z G$

Let $k$ be the order of the largest cyclic subgroup of $Z$ and consider the action of the universal $k$-coverings of $\mathrm{Diff}_+(S^1)$ and $\mathrm{Rot}(S^1)$ on $L_Z G$ by reparametrisation, as in § 3.4. We denote these coverings by $\mathrm{Diff}_+^k(S^1)$ and $\mathrm{Rot}^k(S^1)$ respectively. Define a positive energy representation of $L_Z G$ to be a strongly continuous homomorphism
\[
\pi : L_Z G \longrightarrow PU(H) \tag{6.1}
\]
extending to $L_Z G \rtimes \mathrm{Rot}^k(S^1)$ in such a way that $\mathrm{Rot}^k(S^1)$ acts by non-negative characters only and with finite-dimensional eigenspaces.

Theorem 6.1. An irreducible positive energy representation $(\pi, H)$ of $L_Z G$ yields by restriction
(i) A positive energy representation of $LG$ of level
\[
\ell \in \ell_f \cdot \mathbb{N}, \tag{6.2}
\]
where $\ell_f \in \{1, 2\}$ is the fundamental level of $G/Z$.
(ii) A projective representation of $\Lambda_Z^\vee \cong \mathrm{Hom}(\mathbb{T}, T/Z)$, the commutator map of which, defined by $\omega(\lambda, \mu) = \pi(\zeta_\lambda)\pi(\zeta_\mu)\pi(\zeta_\lambda)^*\pi(\zeta_\mu)^*$, satisfies
\[
\omega(\alpha, \mu) = (-1)^{\ell\langle\alpha,\mu\rangle} \tag{6.3}
\]
whenever $\alpha$ lies in the coroot lattice $\Lambda_R^\vee \cong \mathrm{Hom}(\mathbb{T}, T)$.
As an $LG$-module,
\[
H = \bigoplus_{\mu \in Z\lambda} H_\mu \otimes \mathbb{C}^{m_\lambda}, \tag{6.4}
\]
where $\lambda$ lies in the level $\ell$ alcove of $G$, $Z\lambda$ is its orbit under the action of $Z \subseteq Z(G)$ defined by Proposition 4.2 and $H_\mu$ is the irreducible level $\ell$ positive energy representation of $LG$ with highest weight $\mu$. Moreover, $m_\lambda = 1$ unless $G/Z = PSO_{4n}$, $Z\lambda = \{\lambda\}$, $\ell$ is even and $\omega$ is the pull-back of the non-trivial, skew-symmetric form on $Z \cong \mathbb{Z}_2 \times \mathbb{Z}_2$, in which case $m_\lambda = 2$. The triple $(\ell, \omega, Z\lambda)$ classifies $H$ uniquely, and for any $(\ell, \omega, Z\lambda)$ satisfying (6.2) and (6.3), there exists an irreducible positive energy representation of $L_Z G$ realising it. Lastly, the action of $\mathrm{Rot}^k(S^1)$ on $H$ extends uniquely to a projective unitary representation $\rho$ of $\mathrm{Diff}_+^k(S^1)$ satisfying
\[
\rho(\phi)\pi(\zeta)\rho(\phi)^* = \pi(\zeta \circ \phi^{-1}) \tag{6.5}
\]

which coincides with the Segal–Sugawara representation obtained by regarding $\mathcal{H}$ as a positive energy representation of $LG$.

Proof. We shall repeatedly, and without further mention, use the following fact. Let $Y \subseteq Z$ be a subgroup and $\rho$ a positive energy representation of $L_Y G$ on a Hilbert space $\mathcal{K}$. $\rho$ lifts to a unitary representation of the continuous central extension $\rho^* U(\mathcal{K})$ of $L_Y G$ obtained by pulling back

\[ 1 \to \mathbb{T} \to U(\mathcal{K}) \xrightarrow{\;p\;} PU(\mathcal{K}) \to 1 \tag{6.6} \]

Positive Energy Representations of Loop Groups


to $L_Z G$. Explicitly,

\[ \rho^* U(\mathcal{K}) = \{(\zeta, V) \in L_Y G \times U(\mathcal{K}) \mid \rho(\zeta) = p(V)\} \tag{6.7} \]

acts on $\mathcal{K}$ by $(\zeta, V)\xi = V\xi$. Let $h$ be the level of $\mathcal{K}$ as a positive energy representation of $LG$. Then, the restriction of $\rho^* U(\mathcal{K})$ to $LG$ is smooth and of level $h$ [TL1, §II.2.4], [TL2, Prop. 5.3.1] so that $\rho^* U(\mathcal{K})$ is a smooth central extension of $L_Y G$ and therefore falls within the classification of Sect. 3. Set now $Y = Z$, $\rho = \pi$ and $\mathcal{K} = \mathcal{H}$. Then, the level $\ell$ and commutator map $\omega$ of $\pi^* U(\mathcal{H})$, which is readily seen to be

\[ \omega(\lambda, \mu) = \pi(\zeta_\lambda)\pi(\zeta_\mu)\pi(\zeta_\lambda)^*\pi(\zeta_\mu)^*, \tag{6.8} \]

are bound by Theorem 3.1 and therefore satisfy (6.2)–(6.3). As an $LG \rtimes \mathrm{Rot}^k(S^1)$-module, $\mathcal{H}$ decomposes as

\[ \mathcal{H} = \bigoplus_{\mu} \mathcal{H}(\mu), \tag{6.9} \]

where $\mu$ spans the level $\ell$ alcove $A_\ell$ of $G$ and $\mathcal{H}(\mu)$ is the isotypical summand of $\mathcal{H}$ corresponding to the irreducible, level $\ell$ positive energy representation $\mathcal{H}_\mu$ of $LG$ with highest weight $\mu$. Evidently, for any $\zeta \in L_Z G$, $\pi(\zeta)\mathcal{H}(\mu) = \mathcal{H}(\zeta\mu)$, where the notation refers to the abstract action of $Z \cong L_Z G/LG$ on $A_\ell$ given by Propositions 4.3 and 4.4 which, by Theorem 4.2, coincides with the geometric one given by Proposition 4.2. Since $\mathcal{H}$ is irreducible, (6.9) reduces to

\[ \mathcal{H} = \bigoplus_{\mu \in Z\lambda} \mathcal{H}(\mu) \tag{6.10} \]

for some $\lambda \in A_\ell$ and the triple $(\ell, \omega, Z\lambda)$ is an invariant of $\mathcal{H}$.

To proceed, it will be more convenient to consider unitary representations rather than projective ones. For any subgroup $Y \subseteq Z$ and central extension $\widetilde{L_Y G}$ of $L_Y G$ with level $h$ and commutator map $\kappa$, there is a bijective correspondence between positive energy representations of $L_Y G$ corresponding to the pair $(h, \kappa)$ and unitary representations of $\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1)$ such that $\mathrm{Rot}^k(S^1)$ acts by non-negative characters only and with finite-dimensional eigenspaces and the central subgroup $\mathbb{T} \subset \widetilde{L_Y G}$ acts as multiplication by the character $z \to z$, provided that representations differing by a character of $\widetilde{L_Y G}$ are identified (footnote 4). We call these positive energy representations of $\widetilde{L_Y G}$ and will work with them from now on.

Fix now $\ell$, $\omega$ satisfying (6.2)–(6.3) and denote by $\widetilde{L_Z G}$ the central extension of $L_Z G$ with level $\ell$ and commutator map $\omega$, the existence of which is guaranteed by Proposition 3.2. Recall moreover that by Proposition 3.3, the action of $\mathrm{Diff}^k_+(S^1)$ on $L_Z G$ lifts to $\widetilde{L_Z G}$. For a given orbit $Z\lambda \subseteq A_\ell$ with isotropy subgroup $Y \subseteq Z$, denote by $\widetilde{L_Y G}$ and $\widetilde{LG}$ the restrictions of $\widetilde{L_Z G}$ to $L_Y G$ and $LG$ respectively. Then, by Mackey theory [Ma2, Thm 3.11], and in view of the fact that positive energy representations of $LG$ are invariant under conjugation by $L_Z G$, the map

\[ i : \mathcal{K} \longrightarrow \mathrm{ind}_{\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1)}^{\widetilde{L_Z G} \rtimes \mathrm{Rot}^k(S^1)} \mathcal{K} \tag{6.11} \]

Footnote 4. Since $LG$ is perfect [PS, Prop. 3.4.1], such characters factor through the group of components $Y$ of $L_Y G$.


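gives a bijection between the irreducible positive energy representations $(\rho, \mathcal{K})$ of $\widetilde{L_Y G}$ the restriction to $\widetilde{LG}$ of which is isotypical of type $\mathcal{H}_\lambda$, and the irreducible positive energy representations of $\widetilde{L_Z G}$ with highest weight orbit $Z\lambda$ (footnote 5). Moreover, since any character $\chi$ of $Y$ extends to $Z$ and $\mathrm{ind}(\mathcal{K} \otimes \chi) = \mathrm{ind}(\mathcal{K}) \otimes \chi$, $\mathcal{K}$ and $\mathcal{K}'$ differ by a character iff $i(\mathcal{K})$ and $i(\mathcal{K}')$ do. Notice also that $i(\mathcal{K})$ admits an intertwining action of $\mathrm{Diff}^k_+(S^1)$ if $\mathcal{K}$ does. Indeed, let $R$ be a projective representation of $\mathrm{Diff}^k_+(S^1)$ on $\mathcal{K}$ satisfying

\[ R(\phi)\rho(\tilde\zeta)R(\phi)^* = \rho(\tilde\zeta \circ \phi^{-1}) \tag{6.12} \]

for any $\tilde\zeta \in \widetilde{L_Y G}$, and lift it to a unitary representation of the corresponding central extension $\widetilde{\mathrm{Diff}^k_+(S^1)} = R^* U(\mathcal{K})$ of $\mathrm{Diff}^k_+(S^1)$. Then, by induction-restriction,

\[ i(\mathcal{K}) = \mathrm{ind}_{\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1)}^{\widetilde{L_Z G} \rtimes \mathrm{Rot}^k(S^1)} \mathcal{K} \cong \mathrm{ind}_{\widetilde{L_Y G} \rtimes \widetilde{\mathrm{Diff}^k_+(S^1)}}^{\widetilde{L_Z G} \rtimes \widetilde{\mathrm{Diff}^k_+(S^1)}} \mathcal{K} \tag{6.13} \]

as $\widetilde{L_Z G} \rtimes \mathrm{Rot}^k(S^1)$-modules, and $i(\mathcal{K})$ admits an intertwining action of $\mathrm{Diff}^k_+(S^1)$.

The relevant representations of $\widetilde{L_Y G}$ are obtained in the following way [Ma2, §3.10]. Extend the unitary action $\pi_\lambda$ of $\widetilde{LG} \rtimes \mathrm{Rot}^k(S^1)$ on $\mathcal{H}_\lambda$ to a projective one of $\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1)$ satisfying, and uniquely determined by

\[ \pi_\lambda(\tilde\zeta)\pi_\lambda(\tilde\gamma)\pi_\lambda(\tilde\zeta)^* = \pi_\lambda(\tilde\zeta\tilde\gamma\tilde\zeta^{-1}) \tag{6.14} \]

for any $\tilde\zeta \in \widetilde{L_Y G}$ and $\tilde\gamma \in \widetilde{LG}$. The corresponding central extension $\pi_\lambda^* U(\mathcal{H}_\lambda)$ of $\widetilde{L_Y G}$ determines one of $Y$ by

\[ \tilde Y = \pi_\lambda^* U(\mathcal{H}_\lambda) / \{(\tilde\gamma, \pi_\lambda(\tilde\gamma))\}_{\tilde\gamma \in \widetilde{LG}}. \tag{6.15} \]

Any irreducible unitary representation $\rho$ of $\tilde Y$ such that its central subgroup $\mathbb{T}$ acts by $z \to z^{-1}$ yields one of $\pi_\lambda^* U(\mathcal{H}_\lambda)$, namely $\pi_\lambda \otimes \rho$, where $\mathbb{T}$ acts trivially, and therefore one of $\widetilde{L_Z G} \cong \pi_\lambda^*(U(\mathcal{H}_\lambda))/\mathbb{T}$ and the representations of $\widetilde{L_Y G}$ in question are exactly of this form. Moreover, they admit an intertwining action of $\mathrm{Diff}^k_+(S^1)$. Indeed, if $R$ is the Segal–Sugawara representation of $\mathrm{Diff}_+(S^1)$ on $\mathcal{H}_\lambda$ intertwining $\widetilde{LG}$ [PS, Prop. 13.4.2], then, for any $\tilde\zeta \in \widetilde{L_Y G}$ and $\phi \in \mathrm{Diff}^k_+(S^1)$,

\[ R(\phi)\pi_\lambda(\tilde\zeta)R(\phi)^* = \pi_\lambda(\tilde\zeta \circ \phi^{-1}) \tag{6.16} \]

projectively, since both sides have the same commutation relations with $\widetilde{LG}$. Thus,

\[ (R(\phi) \otimes 1)\,\pi_\lambda \otimes \rho(\tilde\zeta)\,(R(\phi) \otimes 1)^* = \kappa(\phi, \tilde\zeta)\,\pi_\lambda \otimes \rho(\tilde\zeta \circ \phi^{-1}) \tag{6.17} \]

for some $\kappa(\phi, \tilde\zeta) \in \mathbb{T}$ which is multiplicative in each variable. Since $\mathrm{Diff}^k_+(S^1)$ is perfect [He,Ep], $\kappa \equiv 1$ and $\pi_\lambda \otimes \rho$ admits an intertwining action of $\mathrm{Diff}^k_+(S^1)$.

Footnote 5. The induction functor is well-defined in the present context since $\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1) \subseteq \widetilde{L_Z G} \rtimes \mathrm{Rot}^k(S^1)$ is of finite index, and satisfies the usual properties of its finite-dimensional counterpart which are necessary to prove Mackey's theorem. Moreover, an elementary application of the induction-restriction theorem shows that $\mathbb{T} \subset \widetilde{L_Z G}$ acts by the required character on $i(\mathcal{K})$.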


To determine $\tilde Y$, notice that the projective unitaries $\pi_\lambda(\tilde\zeta)$, $\tilde\zeta \in \widetilde{L_Y G}$ defined by (6.14) only depend on the image of $\tilde\zeta$ in $L_Y G$ and therefore give rise to a central extension of $L_Y G$. Since this extension has level $\ell$, it differs from $\widetilde{L_Y G}$ by the pull-back of an extension of $Y$ which is readily seen to be $\tilde Y$. We must now distinguish two cases. If $\tilde Y$ splits, which is so if $Y$ is cyclic or, by Theorem 5.1, if $Y = Z = Z(\mathrm{Spin}_{4n})$ and $\omega \equiv 1$, the relevant representations of $\tilde Y$ correspond to the characters $\chi$ of $Y$ and the irreducible, positive energy representations of $\widetilde{L_Z G}$ with highest weight orbit $Z\lambda$ are of the form

\[ \mathrm{ind}_{\widetilde{L_Y G} \rtimes \mathrm{Rot}^k(S^1)}^{\widetilde{L_Z G} \rtimes \mathrm{Rot}^k(S^1)} (\pi_\lambda \otimes \chi) \tag{6.18} \]

so that their restriction to $LG$ only involves isotypical summands of multiplicity one. If, on the other hand, $\tilde Y$ doesn't split, then $Y = Z = Z(\mathrm{Spin}_{4n}) \cong \mathbb{Z}_2 \times \mathbb{Z}_2$, $\ell$ is even and $\omega$ is the pull-back of the non-trivial, skew-symmetric form on $Y$. In this case, the relevant representation is the Heisenberg representation of $\tilde Y$ on $\mathcal{K} \cong \mathbb{C}^2$ and there exists a unique irreducible, positive energy representation of $L_Z G$ with highest weight orbit $Z\lambda = \{\lambda\}$ the restriction to $LG$ of which is isomorphic to $\mathcal{H}_\lambda \otimes \mathbb{C}^2$. This completes the classification of irreducible positive energy representations of $L_Z G$ and shows that any such $(\pi, \mathcal{H})$ admits an intertwining action $\rho$ of $\mathrm{Diff}^k_+(S^1)$.

Let now $\rho_0$ be the Segal–Sugawara representation of $\mathrm{Diff}^m_+(S^1)$ on $\mathcal{H}$, where $m$ is some positive integer which we may take to be a multiple of $k$ (footnote 6). To see that $\rho$ coincides with $\rho_0$, notice that for any $\phi \in \mathrm{Diff}^m_+(S^1)$, the operator $\rho(\phi)\rho_0(\phi)^*$ commutes with $LG$. Thus, if

\[ \mathcal{H} = \bigoplus_{\mu \in Z\lambda} \mathcal{H}_\mu \otimes \mathbb{C}^{m_\lambda} \tag{6.19} \]

is the decomposition of $\mathcal{H}$ as an $LG$-module, then $\rho_0(\phi) = \bigoplus_\mu \rho_0^\mu(\phi) \otimes 1$, where $\rho_0^\mu(\phi)$ gives the Segal–Sugawara representation of $\mathrm{Diff}_+(S^1)$ on $\mathcal{H}_\mu$ and therefore

\[ \rho(\phi) = \bigoplus_\mu \rho_0^\mu(\phi) \otimes T_\mu(\phi) \tag{6.20} \]

for some unitary operators $T_\mu(\phi)$. Since both $\rho$ and the $\rho_0^\mu$ are projective representations of $\mathrm{Diff}^m_+(S^1)$, the same is true of the $T_\mu$ which are therefore trivial since $\mathrm{Diff}^m_+(S^1)$ admits no non-trivial finite-dimensional projective representations [PS, Prop. 3.3.2]. □

7. Positive Energy Representations of $L(G/Z)$

We now determine those positive energy representations of $L_Z G$ which factor through $L(G/Z) \cong L_Z G/Z$.

Lemma 7.1. Let $(\pi, \mathcal{H})$ be an irreducible positive energy representation of $LG$ with highest weight $\lambda$ and consider its unique lift to a unitary representation of $G$. Then, $Z(G)$ acts on $\mathcal{H}$ as multiplication by the character $\chi(\exp_T(h)) = e^{\langle\lambda, h\rangle}$.

Footnote 6. The Segal–Sugawara representation factors through $\mathrm{Diff}_+(S^1)$ only on irreducible positive energy representations of $LG$ and through a finite order covering of $\mathrm{Diff}_+(S^1)$ in general.


Proof. For any $z \in Z(G)$ and $\gamma \in LG$ we have $\pi(\gamma)\pi(z)\pi(\gamma)^*\pi(z)^* = \kappa(\gamma, z)$, where $\kappa(\cdot, \cdot) \in \mathbb{T}$ is independent of the particular choice of lifts. $\kappa$ is continuous and multiplicative in both variables and therefore defines a continuous map $LG \to \widehat{Z(G)}$ which, by connectedness of $LG$, is trivial. Thus, by Schur's lemma, $Z(G)$ acts as multiplication by a character which is easily computed since if $\Omega \in \mathcal{H}$ is the highest weight vector in $\mathcal{H}$, then $\pi(\exp_T(h))\Omega = e^{\langle\lambda, h\rangle}\Omega$. □

Let now $Z \cong \Lambda^\vee_Z/\Lambda^\vee_R$ be a subgroup of $Z(G)$. By using the basic inner product $\langle\cdot,\cdot\rangle$ on it, the dual group $\hat Z$ may be identified with $\Lambda_W/(\Lambda^\vee_Z)^*$, where $\Lambda_R \subseteq (\Lambda^\vee_Z)^* \subseteq \Lambda_W$ is the dual lattice of $\Lambda^\vee_Z$.

Lemma 7.2. Let $\mathcal{H}$ be an irreducible positive energy representation of $L_Z G$ of level $\ell$ and highest weight orbit $Z\lambda$. Then, the characters of $Z \subset L_Z G$ corresponding to $\mathcal{H}$ are the classes of $\lambda + \ell\lambda^\vee_i \bmod (\Lambda^\vee_Z)^*$, where $\lambda^\vee_i$ are the minimal dominant coweights corresponding to $Z$.

Proof. When restricted to $LG$, $\mathcal{H}$ decomposes as a direct sum of positive energy representations of $LG$ the highest weights of which lie on the orbit $Z\lambda$. By Lemma 7.1 and Proposition 4.2, these give rise to the characters $\ell\lambda^\vee_i + w_i\lambda \bmod (\Lambda^\vee_Z)^*$ of $Z$, where $\lambda^\vee_i$ are the minimal dominant coweights corresponding to $Z$. Since $W$ preserves $\Lambda_R$, and a fortiori $(\Lambda^\vee_Z)^*$-cosets in $\Lambda_W$ however, we get $\ell\lambda^\vee_i + w_i\lambda = \ell\lambda^\vee_i + \lambda \bmod (\Lambda^\vee_Z)^*$. □

Corollary 7.1. An irreducible positive energy representation $\pi$ of $L_Z G$ factors through $L(G/Z)$ if, and only if, its level is a multiple of the basic level $\ell_b$ of $G/Z$.

Proof. $\pi$ factors through $L(G/Z)$ iff $Z$ acts by the same character on each of its irreducible $LG$-submodules. By definition, $\ell_b$ is the smallest integer $\ell$ such that $\ell\langle\cdot,\cdot\rangle$ is integral on $\Lambda^\vee_Z$, and therefore such that $\ell\lambda^\vee_i \in (\Lambda^\vee_Z)^*$ for any fundamental coweight $\lambda^\vee_i$ corresponding to $Z$. □

Acknowledgements. This work was supported by a TMR fellowship, contract no. FMBICT950083 and by a Non-Commutative Geometry Network grant, contract no. ERB FMRXCT960073, both from the European Commission. I am grateful to A. Wassermann for suggesting the use of the Mackey machine in the present context and for many useful conversations. This paper was begun at the Department of Pure Mathematics and Mathematical Statistics of the University of Cambridge and completed at the Institut de Mathématiques de Jussieu of the Université Pierre et Marie Curie, within the Algèbres d'Opérateurs et Représentations group. I wish to thank both institutions for their kind hospitality and pleasant working atmospheres.

References

[Bou] Bourbaki, N.: Groupes et Algèbres de Lie, Chapitres 4, 5 et 6. Paris: Masson, 1981
[Ep] Epstein, D.B.A.: Commutators of C∞-Diffeomorphisms. Appendix to A Curious Remark Concerning the Geometric Transfer Map by John Mather. Comment. Math. Helv. 59, 111–122 (1984)
[FGK1] Felder, G., Gawędzki, K., Kupiainen, A.: The Spectrum of Wess–Zumino–Witten Models. Nucl. Phys. B299, 355–66 (1988)
[FGK2] Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess–Zumino–Witten Models with Arbitrary Simple Groups. Commun. Math. Phys. 117, 127–58 (1988)
[Ga] Gaberdiel, M.R.: WZW Models of General Simple Groups. Nucl. Phys. B460, 181–202 (1996)
[GW] Gepner, D., Witten, E.: String Theory on Group Manifolds. Nucl. Phys. B278, 493–549 (1986)
[GO] Goddard, P., Olive, D.: The Magnetic Charges of Stable Self-Dual Monopoles. Nucl. Phys. B191, 528–548 (1981)
[GoWa] Goodman, R., Wallach, N.R.: Structure and Unitary Cocycle Representations of Loop Groups and the Group of Diffeomorphisms of the Circle. J. Reine Angew. Math. 347, 69–133 (1984)
[He] Herman, M.-R.: Simplicité du Groupe des Difféomorphismes de Classe C∞, Isotopes à l'Identité, du Tore de Dimension n. C. R. Acad. Sci. Paris, Ser. A, 273, 232–4 (1971)
[Hu] Humphreys, J.E.: Introduction to Lie Algebras and Representation Theory. Graduate Texts in Mathematics, Berlin–Heidelberg–New York: Springer-Verlag, 1972
[Ka] Kac, V.: Infinite Dimensional Lie Algebras. 3rd edition, Cambridge: Cambridge University Press, 1994
[Ma1] Mackey, G.W.: Unitary Representations of Group Extensions I. Acta Math. 99, 265–311 (1958)
[Ma2] Mackey, G.W.: The Theory of Unitary Group Representations. Chicago: University of Chicago Press, 1976
[OT] Olive, D., Turok, N.: The Symmetries of Dynkin Diagrams and the Reduction of Toda Field Equations. Nucl. Phys. B215, 470–494 (1983)
[PS] Pressley, A., Segal, G.B.: Loop Groups. Oxford: Oxford University Press, 1986
[Wa] Wassermann, A.J.: Conformal Field Theory and Operator Algebras II: Fusion for Von Neumann Algebras and Loop Groups. In preparation
[TL1] Toledano Laredo, V.: Fusion of Positive Energy Representations of LSpin2n. Ph.D. dissertation, University of Cambridge, 1997
[TL2] Toledano Laredo, V.: Integrating Unitary Representations of Infinite-Dimensional Lie Groups. J. Funct. Anal. 161, 478–508 (1999)

Communicated by A. Connes

Commun. Math. Phys. 207, 341 – 383 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Stokes Matrices and Monodromy of the Quantum Cohomology of Projective Spaces

Davide Guzzetti

International School for Advanced Studies (SISSA - ISAS), Via Beirut, 2–4, 34014 Trieste, Italy. E-mail: [email protected]

Received: 24 October 1998 / Accepted: 27 April 1999

Abstract: In this paper we compute Stokes matrices and monodromy of the quantum cohomology of projective spaces. This problem can be formulated in a “classical” framework, as the problem of computation of Stokes matrices and monodromy of differential equations with regular and irregular singularities. We prove that the Stokes’ matrix of the quantum cohomology coincides with the Gram matrix in the theory of derived categories of coherent sheaves. We also study the monodromy group of the quantum cohomology and we show that it is related to hyperbolic triangular groups.

1. Introduction In this paper we compute Stokes’ matrices and monodromy group of the Frobenius manifold given by the quantum cohomology of the projective space CPd (denoted by QH ∗ (CPd )) . Our main motivation is to study the links between quantum cohomology and the theory of coherent sheaves. Stokes matrices first appeared in the theory of WDVV equations of associativity in the paper [8] by B. Dubrovin. WDVV equations were formulated in a geometrical setting: the theory of Frobenius manifolds. From then on, the notion of Frobenius manifold has been largely studied in many papers, of which we cite [10–12]. The Stokes’ matrix is a part of the monodromy data for a semisimple Frobenius manifold. Monodromy data can serve as natural moduli of semisimple Frobenius manifolds. More precisely, any local chart of the atlas covering the manifold is reconstructed from the monodromy data. The glueing of the local charts is described by the action of the braid group on the data, particularly, on the central connection matrix and on the Stokes’ matrix [10,11]. One well-known example of Frobenius manifolds is the quantum cohomology of smooth projective varieties [23,22,7,25,24]. In particular, we refer the reader to [22] and [7] for the important implications of Quantum Cohomology in enumerative geometry.


It was conjectured [12] that the Stokes matrix for the quantum cohomology of a good Fano variety $X$ is equal to the Gram matrix of the bilinear form $\chi(E, F) := \sum_k (-1)^k \dim \mathrm{Ext}^k(E, F)$ computed on a full collection of exceptional objects in the derived category $\mathrm{Der}^b(\mathrm{Coh}(X))$ of coherent sheaves on $X$. More precisely, let $\mathrm{Der}^b(\mathrm{Coh}(X))$ be the derived category of coherent sheaves on a smooth projective variety $X$ of dimension $d$. An object $E$ of $\mathrm{Der}^b(\mathrm{Coh}(X))$ is called exceptional if $\mathrm{Ext}^i(E, E) = 0$ for $0 < i < d$, $\mathrm{Ext}^0(E, E) = \mathbb{C}$ and $\mathrm{Ext}^d(E, E)$ is of the smallest dimension (if $X$ is a projective space, then $\mathrm{Ext}^d(E, E) = 0$). A collection $\{E_1, \ldots, E_s\}$ of exceptional objects is an exceptional collection if for any $1 \le m < n \le s$ we have $\mathrm{Ext}^i(E_n, E_m) = 0$ for any $i \ge 0$, and $\mathrm{Ext}^i(E_m, E_n) = 0$ for any $i \ge 0$ except possibly for one value of $i$. A full exceptional collection is an exceptional collection which generates $\mathrm{Der}^b(\mathrm{Coh}(X))$ as a triangulated category. This theory is developed in [26,27,5]. We say that a Fano variety is good if it has a full exceptional collection. It is known that $X = \mathbb{CP}^d$ is good, the collection of sheaves $\{\mathcal{O}(n)\}_{n \in \mathbb{Z}}$ on $\mathbb{CP}^d$ is exceptional, and $\{E_1, E_2, \ldots, E_{d+1}\} := \{\mathcal{O}, \mathcal{O}(1), \ldots, \mathcal{O}(d)\}$ is a full exceptional collection [3,19]. In this case, $s_{ij} = \chi(\mathcal{O}(i-1), \mathcal{O}(j-1))$, $i, j = 1, 2, \ldots, d+1$, has the "canonical form":

\[ s_{ij} = \binom{d+j-i}{j-i}, \quad i < j, \qquad s_{ii} = 1, \qquad s_{ij} = 0, \quad i > j. \]

The inverse to this matrix has entries $a_{ij}$:

\[ a_{ij} = (-1)^{j-i}\binom{d+1}{j-i}, \quad i < j, \qquad a_{ii} = 1, \qquad a_{ij} = 0, \quad i > j. \]

This matrix is equivalent to the one above with respect to the action of the braid group. We will also call it "canonical". The mentioned conjecture claims that the Stokes matrix of $QH^*(\mathbb{CP}^d)$ is equal to the above Gram matrix (modulo the action of the braid group: remarkably, this action on the Stokes matrix for the Frobenius manifold coincides with the natural action of the braid group on the collections of exceptional objects [26,28]).
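The two "canonical" matrices above can be checked mechanically. The following is a minimal sketch, not part of the original argument: it builds $s_{ij}$ and $a_{ij}$ from the binomial formulas, verifies that they are mutually inverse, and compares $s_{ij}$ with the Gram matrix computed from the standard formula $\chi(\mathcal{O}(m)) = \binom{m+d}{d}$ on $\mathbb{CP}^d$ (read as a polynomial in $m$). All function names are hypothetical and chosen for illustration only.

```python
from math import comb, factorial

def canonical_stokes(d):
    """Matrix s_ij = C(d + j - i, j - i) for i <= j, zero below the diagonal."""
    n = d + 1
    return [[comb(d + j - i, j - i) if j >= i else 0 for j in range(n)] for i in range(n)]

def canonical_inverse(d):
    """Matrix a_ij = (-1)^(j - i) * C(d + 1, j - i) for i <= j, zero below the diagonal."""
    n = d + 1
    return [[(-1) ** (j - i) * comb(d + 1, j - i) if j >= i else 0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Product of two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def euler_char_line_bundle(d, m):
    """chi(O(m)) on CP^d as the polynomial (m+1)(m+2)...(m+d) / d!, used for -d <= m."""
    numerator = 1
    for t in range(1, d + 1):
        numerator *= m + t
    return numerator // factorial(d)

def gram_matrix(d):
    """Gram matrix chi(O(i-1), O(j-1)) = chi(O(j-i)) of the collection O, O(1), ..., O(d)."""
    n = d + 1
    return [[euler_char_line_bundle(d, j - i) for j in range(n)] for i in range(n)]
```

For $d = 2$ this gives the upper triangular matrix with rows (1, 3, 6), (0, 1, 3), (0, 0, 1), and the vanishing of the entries below the diagonal comes out of the polynomial formula rather than being imposed by hand.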
This conjecture has its origin in the paper by Cecotti and Vafa [6], where another Stokes matrix introduced in [9] for the $tt^*$ equations was found in the case of the $\mathbb{CP}^2$ topological σ model. It was suggested, on physical arguments, that the entries of the Stokes' matrix

\[ S = \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} \]

are integers. They must satisfy a Diophantine equation $x^2 + y^2 + z^2 - xyz = 0$ whose integer solutions $(x, y, z)$ are all equivalent to $(3, 3, 3)$ modulo the action of the braid group. The authors of [6] also suggested that their matrix must coincide with the Stokes matrix defined in the theory of WDVV equations of associativity, that is, in the geometrical theory of Frobenius manifolds for 2D topological field theories [8,10]. Later, in [28], the links between N = 2 supersymmetric field theories and the theory of derived categories were further investigated and the coincidence of $\chi(E_i, E_j)$ with the Stokes matrix of $tt^*$ for $\mathbb{CP}^d$ was conjectured. The conjecture may probably be derived from more general conjectures by Kontsevich in the framework of categorical mirror symmetry. To my knowledge, the subject was discussed in [21] (I thank B. Dubrovin for this reference).
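As a small sanity check of the Diophantine equation, the sketch below evaluates $x^2 + y^2 + z^2 - xyz$ on the triple $(3, 3, 3)$ and on the triple $(3, 6, 3)$ read off from the canonical form for $d = 2$ (that is, $x = s_{12}$, $y = s_{13}$, $z = s_{23}$). It also implements the elementary "Vieta flip", which replaces one entry by the product of the other two minus itself and preserves the equation. The flip is used here only as an elementary symmetry; it is an assumption of this illustration that it captures the relevant part of the braid group action, which is not reproduced in full. The function names are hypothetical.

```python
def markov_defect(x, y, z):
    """Left-hand side x^2 + y^2 + z^2 - x*y*z of the Diophantine equation."""
    return x * x + y * y + z * z - x * y * z

def vieta_flip(triple, index):
    """Replace the entry at `index` by (product of the other two entries) minus itself."""
    entries = list(triple)
    others = [entries[i] for i in range(3) if i != index]
    entries[index] = others[0] * others[1] - entries[index]
    return tuple(entries)
```

Flipping the middle entry of $(3, 6, 3)$ gives $(3, 3 \cdot 3 - 6, 3) = (3, 3, 3)$, so the canonical triple and the Cecotti–Vafa triple are related by one such flip.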


The main result of this paper is the proof (Theorem 2) that the conjecture about coincidence of the Stokes matrix of $QH^*(\mathbb{CP}^d)$ and the Gram matrix $\chi(E_i, E_j)$ of a full exceptional collection in $\mathrm{Der}^b(\mathrm{Coh}(\mathbb{CP}^d))$ is true. In this way, we generalize to any $d$ the result obtained in [11] for $d = 2$. We remark that it has not yet been proved that the Stokes' matrix for $tt^*$ equations and the Stokes' matrix for the corresponding Frobenius manifold coincide. This point deserves further investigation.

We also study the structure of the monodromy group of $QH^*(\mathbb{CP}^d)$. The notion of monodromy group of a Frobenius manifold was introduced in [10]. We prove (Theorem 3) that for $d = 3$ the group is isomorphic to the subgroup of orientation preserving transformations in the hyperbolic triangular group $[2, 4, \infty]$. In [11] it was proved that for $d = 2$ the monodromy group is isomorphic to the direct product of the subgroup of orientation preserving transformations in $[2, 3, \infty]$ and the cyclic group of order 2, $C_2 = \{\pm\}$. Our numerical calculations also suggest that for any $d$ even the monodromy group may be isomorphic to the orientation preserving transformations in $[2, d+1, \infty]$, and for any $d$ odd to the direct product of the orientation preserving transformations in $[2, d+1, \infty]$ by $C_2$.

2. The System Corresponding to $QH^*(\mathbb{CP}^{k-1})$

We introduce here the linear system of differential equations whose Stokes matrices are the Stokes matrices of $QH^*(\mathbb{CP}^{k-1})$ (we use the more convenient choice $k = d + 1$). In $QH^*(\mathbb{CP}^{k-1})$ we choose flat coordinates $t := (t^1, \ldots, t^k)$. We consider the system of differential equations determining deformed flat coordinates $\tilde t(t, w)$, where $w \in \mathbb{C}$ is the parameter of deformation (see [10,11]). Let $y_\alpha = \partial_\alpha \tilde t(t, w)$, where $\partial_\alpha = \partial/\partial t^\alpha$. At the semisimple point $(0, t^2, 0, \ldots, 0)$ the system is

\[ \partial_w y = \Big[\hat{\mathcal{U}}(t^2) + \frac{1}{w}\hat\mu\Big] y, \tag{1} \]

\[ \partial_2 y = w\,\hat C_2(t^2)\, y, \tag{2} \]

where $\hat C_2(t^2)_{i,i+1} = 1$ for $i = 1, \ldots, k-1$, $\hat C_2(t^2)_{k,1} = e^{t^2}$, and $\hat C_2(t^2)_{i,j} = 0$ for any other $i, j$. Moreover, $\hat{\mathcal{U}}(t^2) = k\,\hat C_2(t^2)$ and

\[ \hat\mu := \mathrm{diag}\Big(\frac{k-1}{2}, \frac{k-3}{2}, \frac{k-5}{2}, \ldots, -\frac{k-3}{2}, -\frac{k-1}{2}\Big). \]

The monodromy data of the system are, by definition, the monodromy data of $QH^*(\mathbb{CP}^{k-1})$ in a chart containing $(0, t^2, 0, \ldots, 0)$.

Lemma 1. Let $y(t^2, w) = (y_1(t^2, w), \ldots, y_k(t^2, w))^T$ be a column vector solution of the above system (1), (2). With the following substitution:

\[ y_\alpha(t^2, w) = \frac{1}{k^{\alpha-1}}\, w^{\frac{k-1}{2}-\alpha+1}\, (w\partial_w)^{\alpha-1} \varphi(t^2, w) \tag{3} \]

\[ \equiv w^{\frac{k-1}{2}-\alpha+1}\, \partial_2^{\alpha-1} \varphi(t^2, w), \qquad \alpha = 1, 2, \ldots, k, \tag{4} \]

the above system is equivalent to the equations

\[ (w\partial_w)^k \varphi = (kw)^k e^{t^2} \varphi, \tag{5} \]

\[ \partial_2^k \varphi = w^k e^{t^2} \varphi. \tag{6} \]


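The proof is a simple calculation we leave to the reader. The substitution of the lemma implies $\partial_2 \varphi = \frac{1}{k} w\partial_w \varphi$. Then

\[ \varphi(t^2, w) \equiv \varphi(w e^{t^2/k}). \]

Namely, $\varphi$ (at $(0, t^2, \ldots, 0)$) depends on one argument $z = w e^{t^2/k}$ and satisfies the generalized hypergeometric equation

\[ \Big(z \frac{d}{dz}\Big)^k \varphi(z) = (kz)^k \varphi(z). \tag{7} \]

The equation is equivalent to the system

\[ \frac{dY}{dz} = \Big[\hat{\mathcal{U}} + \frac{\hat\mu}{z}\Big] Y, \tag{8} \]

where $\hat{\mathcal{U}} := \hat{\mathcal{U}}(0)$ and the column vector $Y$ has components

\[ Y_n(z) = \frac{1}{k^{n-1}}\, z^{\frac{k-1}{2}-n+1}\, (z\partial_z)^{n-1} \varphi(z), \qquad n = 1, 2, \ldots, k. \tag{9} \]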
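To make equation (7) concrete, here is a minimal sketch that checks one explicit power series against it. The series $\varphi(z) = \sum_{n \ge 0} z^{kn}/(n!)^k$ is an illustrative choice made for this example and is not one of the bases $\varphi_{L/R}$ used in the paper. Since $z\,d/dz$ multiplies $z^p$ by $p$, equation (7) reduces to the coefficient recursion $(kn)^k c_n = k^k c_{n-1}$, which the code verifies in exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def series_coefficients(k, terms):
    """Coefficients c_n of z^(k*n) in the trial series, c_n = 1 / (n!)^k."""
    return [Fraction(1, factorial(n) ** k) for n in range(terms)]

def residual_coefficients(k, terms):
    """Coefficient of z^(k*n) in (z d/dz)^k phi - (k z)^k phi, for n = 0 .. terms-1."""
    c = series_coefficients(k, terms)
    residuals = []
    for n in range(terms):
        lhs = Fraction((k * n) ** k) * c[n]              # (z d/dz)^k acts on z^(k n) as (k n)^k
        rhs = Fraction(k ** k) * c[n - 1] if n >= 1 else Fraction(0)
        residuals.append(lhs - rhs)
    return residuals
```

Every residual vanishes identically, so the trial series solves (7) term by term; for $k = 1$ it is simply $e^z$.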

We will show later (Sect. 4) that the monodromy data of the system (8) coincide with those of the system (1). Thus, we describe here the system (8), which is independent of $t$. The point $z = 0$ is a fuchsian singularity, and $z = \infty$ is a singularity of the second kind. Equation (8) has a fundamental matrix solution $Y_0(z)$ whose behaviour at $z = 0$ is

\[ Y_0(z) = (I + O(z))\, z^{\hat\mu} z^{R}, \qquad R_{i,i+1} = k \ \text{for}\ i = 1, \ldots, k-1, \quad R_{ij} = 0 \ \text{otherwise}, \]

and the monodromy for a counterclockwise loop around the origin is $e^{2\pi i(\hat\mu + R)}$.

The matrix $\hat{\mathcal{U}}$ has $k$ distinct eigenvalues $u_n = k\, e^{\frac{2\pi i(n-1)}{k}}$, with corresponding eigenvectors $x_n$, $n = 1, \ldots, k$. The matrix

\[ X = \frac{1}{\sqrt{k}}[x_1 | x_2 | \ldots | x_k] = \frac{1}{\sqrt{k}}(x^j_n), \qquad x^j_n = e^{(2j-1)i\pi\frac{n-1}{k}}, \quad j, n = 1, 2, \ldots, k, \]

puts $\hat{\mathcal{U}}$ in diagonal form:

\[ U = X^{-1}\hat{\mathcal{U}}X = \mathrm{diag}(u_1, u_2, \ldots, u_n, \ldots, u_k). \]

We stress that $u_i \ne u_j$ for $i \ne j$. The system (8) is transformed by the gauge $X$ in an equivalent form

\[ \frac{d\tilde Y}{dz} = \Big[U + \frac{V}{z}\Big]\tilde Y, \tag{10} \]

\[ \tilde Y = X^{-1}Y, \qquad U = X^{-1}\hat{\mathcal{U}}X, \qquad V = X^{-1}\hat\mu X. \]

Observe that

\[ \eta\hat\mu + \hat\mu\eta = 0, \qquad XX^T = \eta^{-1}, \]

where $\eta = (\eta_{\alpha\beta})$, $\eta_{\alpha\beta} = \delta_{\alpha+\beta,k+1}$. This implies that $V$ is skew-symmetric.
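The linear algebra in this paragraph can be verified numerically. The sketch below is an illustration only: it assumes $\hat{\mathcal{U}} = k\,\hat C_2(0)$ (entries $k$ on the superdiagonal and in position $(k, 1)$) and takes $\hat\mu$ with diagonal entries $\frac{k+1}{2} - \alpha$, which is the reading of the garbled display that is consistent with the components (9). It then measures how far $\hat{\mathcal{U}}X = X\,\mathrm{diag}(u_n)$, $XX^T = \eta$ and $V + V^T = 0$ are from holding. It uses $X^{-1} = X^T\eta$, which follows from $XX^T = \eta$ and $\eta^2 = I$. The function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of equal size."""
    return max(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))

def check_gauge(k):
    """Return the numerical defects of the three identities for a given k >= 2."""
    idx = range(1, k + 1)
    # U_hat = k * C_hat_2(0): k on the superdiagonal and in position (k, 1).
    u_hat = [[k if (j == i + 1 or (i == k and j == 1)) else 0 for j in idx] for i in idx]
    # Eigenvalues u_n and the matrix X with entries exp(i*pi*(2j-1)(n-1)/k) / sqrt(k).
    u = [k * cmath.exp(2 * math.pi * 1j * (n - 1) / k) for n in idx]
    x = [[cmath.exp(1j * math.pi * (2 * j - 1) * (n - 1) / k) / math.sqrt(k) for n in idx]
         for j in idx]
    # eta is the anti-diagonal matrix; mu_hat has diagonal (k+1)/2 - alpha (assumed reading).
    eta = [[1 if a + b == k + 1 else 0 for b in idx] for a in idx]
    mu = [[(k + 1) / 2 - a if a == b else 0 for b in idx] for a in idx]
    diag_u = [[u[n - 1] if n == m else 0 for m in idx] for n in idx]
    xt = transpose(x)
    x_inv = matmul(xt, eta)              # X^{-1} = X^T eta, since X X^T = eta and eta^2 = I
    v = matmul(matmul(x_inv, mu), x)     # V = X^{-1} mu_hat X
    zero = [[0] * k for _ in idx]
    v_plus_vt = [[v[i][j] + v[j][i] for j in range(k)] for i in range(k)]
    return {
        "eigen": max_abs_diff(matmul(u_hat, x), matmul(x, diag_u)),
        "xxt": max_abs_diff(matmul(x, xt), eta),
        "skew": max_abs_diff(v_plus_vt, zero),
    }
```

All three defects are at the level of floating point rounding for small $k$, which is consistent with the statement that the gauge $X$ diagonalises $\hat{\mathcal{U}}$ and makes $V$ skew-symmetric.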


3. Asymptotic Behaviour and Stokes' Phenomenon

Our aim is to explicitly compute a Stokes' matrix for the above system (2), or for the system (8). The system (10) has formal matrix solution

\[ \tilde Y_F = \Big[I + \frac{F_1}{z} + \frac{F_2}{z^2} + \ldots\Big] e^{zU}, \]

where the $F_j$'s are $k \times k$ matrices. It is a well known result that fundamental matrix solutions exist which have $\tilde Y_F$ as asymptotic expansion for $z \to \infty$ in some "admissible" sectors of the complex plane of angular width greater than $\pi$. In order to find such sectors we need the so called Stokes' rays, defined by

\[ \Re\big((u_r - u_s)z\big) = 0, \quad r \ne s \quad \Rightarrow \quad \arg z = -\arg(u_r - u_s) + \frac{\pi}{2} + m\pi, \quad m = 0, 1. \]

There exists a unique solution of the system asymptotic to $\tilde Y_F$ in a sector greater than $\pi$ and bounded by the first two Stokes' rays we meet extending over $\pi$ the angular width of the sector. The general theory of Stokes' phenomenon is found in the classical paper by W. Balser, W. B. Jurkat, D. A. Lutz [1]. A possible choice for the labelling of the rays is the following: we call $R_{rs}$ the Stokes' ray

\[ R_{rs} = \{z = -i\rho(\bar u_r - \bar u_s),\ \rho > 0\}, \quad r \ne s. \]

A simple computation shows that:

Lemma 2. For $r < s$ the Stokes' rays of the system (10) are

\[ R_{rs} = \Big\{z \in \mathbb{C} \ \Big|\ z = \rho \exp\Big(i\Big[\frac{2\pi}{k} - (r+s)\frac{\pi}{k}\Big]\Big),\ \rho > 0\Big\}, \qquad R_{sr} = -R_{rs}. \]

Remark 1. $R_{rs} = R_{pq}$ for $r + s = p + q$. $R_{12}$ is at $\arg z = -\frac{\pi}{k}$, $R_{13}$ is at $\arg z = -\frac{2\pi}{k}$, and so on. For $r + s = k + 2$ the corresponding $R_{rs}$'s are at $\arg z = -\pi$ and the $R_{sr}$'s are at $\arg z = 0$. $R_{k-1,k}$ is at the angle $-2\pi + \frac{3\pi}{k}$ or, equivalently, at $\frac{3\pi}{k}$. See Fig. 1.

We choose two "admissible" overlapping sectors in a canonical way. Let $l$ be an "admissible" line through the origin, namely a line not containing Stokes' rays. For our purposes we take

\[ l = \Big\{z \ \Big|\ z = \rho e^{i\epsilon},\ \rho \in \mathbb{R},\ 0 < \epsilon < \frac{\pi}{k}\Big\}. \]

$l$ has a natural orientation inherited from $\mathbb{R}$. We call $\Pi_R$ and $\Pi_L$ the half planes to the right/left of $l$ w.r.t. its orientation,

\[ \Pi_R = \{-\pi + \epsilon < \arg z < \epsilon\}, \qquad \Pi_L = \{\epsilon < \arg z < \pi + \epsilon\}. \]

We then define two different "admissible" sectors $S_L$, $S_R$ which contain $l$,

\[ S_L = \Big\{z \in \mathbb{C} \ \Big|\ 0 < \arg z < \pi + \frac{\pi}{k}\Big\} \supset \Pi_L, \qquad S_R = \Big\{z \in \mathbb{C} \ \Big|\ -\pi < \arg z < \frac{\pi}{k}\Big\} \supset \Pi_R. \]

We call $\tilde Y_L(z)$ and $\tilde Y_R(z)$ the fundamental matrix solutions which have $\tilde Y_F(z)$ as asymptotic expansion in $S_L$ and $S_R$ respectively.
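The "simple computation" behind Lemma 2 can be reproduced numerically. The sketch below is illustrative only: it takes the eigenvalues $u_n = k e^{2\pi i(n-1)/k}$, forms the direction of the ray $R_{rs} = \{-i\rho(\bar u_r - \bar u_s)\}$ from its definition, and compares it with the direction $\exp\big(i[\frac{2\pi}{k} - (r+s)\frac{\pi}{k}]\big)$ of Lemma 2, extended to $r > s$ by $R_{sr} = -R_{rs}$. It also checks the defining property $\Re((u_r - u_s)z) = 0$ on the ray. The function names are hypothetical.

```python
import cmath
import math

def u(n, k):
    """Eigenvalue u_n = k * exp(2*pi*i*(n-1)/k)."""
    return k * cmath.exp(2 * math.pi * 1j * (n - 1) / k)

def ray_direction(r, s, k):
    """Unit vector along R_rs = { -i*rho*(conj(u_r) - conj(u_s)), rho > 0 }."""
    z = -1j * (u(r, k).conjugate() - u(s, k).conjugate())
    return z / abs(z)

def lemma2_direction(r, s, k):
    """Unit vector predicted by Lemma 2 for r < s, extended by R_sr = -R_rs."""
    if r > s:
        return -lemma2_direction(s, r, k)
    return cmath.exp(1j * (2 * math.pi / k - (r + s) * math.pi / k))

def max_deviation(k):
    """Largest distance between the two directions over all pairs r != s."""
    return max(abs(ray_direction(r, s, k) - lemma2_direction(r, s, k))
               for r in range(1, k + 1) for s in range(1, k + 1) if r != s)

def max_real_part_on_rays(k):
    """Largest |Re((u_r - u_s) z)| with z the unit vector on R_rs; should vanish."""
    return max(abs(((u(r, k) - u(s, k)) * ray_direction(r, s, k)).real)
               for r in range(1, k + 1) for s in range(1, k + 1) if r != s)
```

For instance, with $k = 3$ the ray $R_{12}$ comes out at $\arg z = -\pi/3$, in agreement with Remark 1.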

[Fig. 1. Stokes' rays, shown in two panels (k odd and k even); consecutive rays are separated by the angle π/k.]

Definition. The Stokes' matrix of the system (10) with respect to the admissible line $l$ is the connection matrix $S$ such that

\[ \tilde Y_L(z) = \tilde Y_R(z) S, \qquad 0 < \arg z < \frac{\pi}{k}. \]

On the opposite overlapping region one can prove (as a consequence of the skew-symmetry of $V$, see [10]) that

\[ \tilde Y_L(z) = \tilde Y_R(z e^{-2\pi i}) S^T, \qquad \pi < \arg z < \pi + \frac{\pi}{k}. \]

In [1] $S$ is called a "Stokes' multiplier". The terminology in this field changes from one author to the other.

Definition. We call central connection matrix the connection matrix $C$ such that

\[ \tilde Y_0(z) = \tilde Y_R(z) C, \qquad z \in \Pi_R. \]

It is clear that the system (8) has solutions $Y_0(z) = X\tilde Y_0(z)$, and $Y_L(z) = X\tilde Y_L(z)$, $Y_R(z) = X\tilde Y_R(z)$ asymptotic to $X\tilde Y_F(z)$ as $z \to \infty$ in $S_L$ and $S_R$ respectively, which are connected by the same $S$ and $C$. In order to compute the entries of $S$ explicitly, we use the reduction of (8) to the generalized hypergeometric equation (7). If $\varphi^{(1)}(z), \ldots, \varphi^{(k)}(z)$ is a basis of $k$ linearly independent solutions of (7), then the matrix $Y(z)$ of entries $(n, j)$ defined by

\[ Y^{(j)}_n(z) := \frac{1}{k^{n-1}}\, z^{\frac{k-1}{2}-n+1}\, (z\partial_z)^{n-1} \varphi^{(j)}(z) \tag{11} \]

is a fundamental matrix for (8).

Lemma 3. The generalized hypergeometric equation (7) has two bases of linearly independent solutions $\varphi^{(1)}_L(z), \ldots, \varphi^{(k)}_L(z)$ and $\varphi^{(1)}_R(z), \ldots, \varphi^{(k)}_R(z)$ having asymptotic behaviour

\[ \varphi^{(n)}_{L/R} = \frac{1}{\sqrt{k}}\, \frac{e^{i\frac{\pi}{k}(n-1)}}{z^{\frac{k-1}{2}}}\, \exp\Big[k\, e^{i\frac{2\pi}{k}(n-1)} z\Big] \Big(1 + O\Big(\frac{1}{z}\Big)\Big), \qquad z \to \infty, \]


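in $S_L$ and $S_R$ respectively. Let $\Phi(z)$ denote the row vector $[\varphi^{(1)}(z), \ldots, \varphi^{(k)}(z)]$. The fundamental matrices $Y_L(z)$, $Y_R(z)$ of (8) are expressed through formula (11) in terms of $\Phi_L(z)$ and $\Phi_R(z)$ and

\[ Y_L(z) = Y_R(z) S, \qquad 0 < \arg z < \frac{\pi}{k}, \]

if and only if

\[ \Phi_L(z) = \Phi_R(z) S, \qquad 0 < \arg z < \frac{\pi}{k}. \]

Proof. Simply observe that for a fundamental solution in $S_L$ or $S_R$ (we omit subscripts $L$, $R$)

\[ Y(z) = \begin{pmatrix} z^{\frac{k-1}{2}}\varphi^{(1)} & \ldots & z^{\frac{k-1}{2}}\varphi^{(k)} \\ \vdots & \ldots & \vdots \end{pmatrix} = X\tilde Y, \]

which is asymptotic, for $z \to \infty$, to

\[ \frac{1}{\sqrt{k}} \begin{pmatrix} 1 & e^{i\frac{\pi}{k}} & e^{i\frac{2\pi}{k}} & e^{i\frac{3\pi}{k}} & \ldots & e^{i\frac{(k-1)\pi}{k}} \\ \vdots & \vdots & \vdots & \vdots & \ldots & \vdots \end{pmatrix} \mathrm{diag}\Big(\exp(kz),\ \exp(k e^{\frac{2\pi i}{k}} z),\ \ldots,\ \exp(k e^{\frac{2\pi i(k-1)}{k}} z)\Big). \]

Now, the first row of $Y(z)$ is $z^{\frac{k-1}{2}}\Phi(z)$. □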

The Stokes' matrix $S$ has entries

\[ s_{ii} = 1, \qquad s_{ij} = 0 \ \text{if}\ R_{ij} \subset \Pi_R. \]

This follows from the fact that on the overlapping region $0 < \arg z < \frac{\pi}{k}$ there are no Stokes' rays and

\[ e^{zU} S e^{-zU} \sim I, \qquad z \to \infty, \]

so that $e^{z(u_i - u_j)} s_{ij} \to \delta_{ij}$. Moreover, $\Re\, z(u_i - u_j) > 0$ to the left of the ray $R_{ij}$, while $\Re\, z(u_i - u_j) < 0$ to the right (the natural orientation on $R_{ij}$, from $z = 0$ to $\infty$, is understood). This implies

\[ |e^{zu_i}| > |e^{zu_j}| \quad \text{and} \quad e^{z(u_i - u_j)} \to \infty \quad \text{as}\ z \to \infty \]

on the left, while on the right

\[ |e^{zu_i}| < |e^{zu_j}| \quad \text{and} \quad e^{z(u_i - u_j)} \to 0 \quad \text{as}\ z \to \infty. \]

With these observations in mind, we prove the following

Lemma 4. $S$ has a column whose entries are all zero but one. More precisely: For $k$ even,

\[ s_{i,\frac{k}{2}+1} = 0 \quad \forall i \ne \frac{k}{2}+1, \qquad s_{\frac{k}{2}+1,\frac{k}{2}+1} = 1. \]

For $k$ odd,

\[ s_{i,\frac{k+1}{2}} = 0 \quad \forall i \ne \frac{k+1}{2}, \qquad s_{\frac{k+1}{2},\frac{k+1}{2}} = 1. \]


Proof. Let us determine $n$ such that $s_{in} = 0$ for any $i \ne n$ and $s_{nn} = 1$. We need to find all rays in $\Pi_R$. We start with $R_{rs}$ with $r < s$. We know that for $r + s = k + 2$ the ray is the negative real half-line (at angle $-\pi$). Then $R_{rs} \subset \Pi_R$ for $r + s \le k + 1$ ($r < s$). Then, in $\Pi_R$ we have

\[ \begin{array}{llll} R_{12} & R_{13} & \ldots & R_{1k} \\ R_{23} & R_{24} & \ldots & R_{2,k-1} \\ R_{34} & R_{35} & \ldots & R_{3,k-2} \\ \vdots & & & \\ R_{ab} & & & \end{array} \]

where $R_{ab} = R_{\frac{k}{2},\frac{k}{2}+1}$ for $k$ even, and $R_{\frac{k-1}{2},\frac{k+3}{2}}$ for $k$ odd. In $\Pi_R$ we have also $R_{rs}$ with $r + s \ge k + 2$ and $r > s$. For fixed $n$ we require $R_{in} \subset \Pi_R$ for any $i$. Namely,

\[ \forall i < n: \quad i + n \le k + 1, \qquad \forall i > n: \quad i + n \ge k + 2. \]

This yields $n = \frac{k}{2} + 1$ for $k$ even, $n = \frac{k+1}{2}$ for $k$ odd. □
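The counting argument in the proof can be cross-checked by brute force. The following sketch is an illustration under stated assumptions: it uses the ray angles of Lemma 2, namely $\arg R_{rs} = -\frac{\pi}{k}(r + s - 2)$ for $r < s$ and the opposite direction for $r > s$, fixes the admissible angle $\epsilon = \frac{\pi}{2k}$ (any value in $(0, \frac{\pi}{k})$ would do), and lists every column index $n$ for which all rays $R_{in}$, $i \ne n$, lie in $\Pi_R = \{-\pi + \epsilon < \arg z < \epsilon\}$. The function names are hypothetical.

```python
import math

def in_right_half_plane(r, s, k, eps):
    """True if the Stokes ray R_rs lies in Pi_R = { -pi + eps < arg z < eps }."""
    angle = -math.pi * (r + s - 2) / k + (math.pi if r > s else 0.0)
    # Normalise the angle to the window (-pi + eps, pi + eps].
    while angle <= -math.pi + eps:
        angle += 2 * math.pi
    while angle > math.pi + eps:
        angle -= 2 * math.pi
    return angle < eps

def special_columns(k):
    """All n such that R_in lies in Pi_R for every i != n."""
    eps = math.pi / (2 * k)   # admissible line strictly between the rays at 0 and pi/k
    return [n for n in range(1, k + 1)
            if all(in_right_half_plane(i, n, k, eps) for i in range(1, k + 1) if i != n)]
```

For each $k$ tested the search returns exactly one index, $\frac{k}{2} + 1$ for $k$ even and $\frac{k+1}{2}$ for $k$ odd, as stated in Lemma 4.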

Let $n(k)$ be $\frac{k}{2} + 1$, or $\frac{k+1}{2}$. Lemma 4 implies that the $n(k)$th columns of $Y_L$ and $Y_R$ coincide. In particular, their asymptotic representation holds for $-\pi < \arg z < \pi + \frac{\pi}{k}$. Actually, this domain can be further enlarged, up to

\[ -\frac{\pi}{k} - \pi < \arg z < \pi + \frac{\pi}{k}, \quad k \ \text{even}, \qquad -\pi < \arg z < \pi + \frac{2\pi}{k}, \quad k \ \text{odd}. \]

To see this recall that $|e^{zu_i}| < |e^{zu_j}|$ on the right of $R_{ij}$, and conversely on the left. Then it is easy to see that for $k$ even $\exp(z u_{\frac{k}{2}+1})$ dominates all exponentials in the sector $-\frac{\pi}{k} - \pi < \arg z < \frac{\pi}{k} - \pi$, while for $k$ odd $\exp(z u_{\frac{k+1}{2}})$ dominates all exponentials in the sector $\pi < \arg z < \pi + \frac{2\pi}{k}$.

The first entry of the $n(k)$th column is $\varphi^{(n(k))}_L(z) \equiv \varphi^{(n(k))}_R(z)$ times $z^{\frac{k-1}{2}}$. Then $\varphi^{(n(k))}$ has the established asymptotic behaviour on the enlarged domains above.

We now introduce an integral representation for a solution $\varphi(z)$ of the generalized hypergeometric equation which will allow us to compute the entries of $S$.

Lemma 5. The function

\[ g^{(n)}(z) = \frac{1}{(2\pi)^{\frac{k+1}{2}}}\, e^{i\pi(\frac{k}{2}-n-1)} \int_{-c-i\infty}^{-c+i\infty} ds\ \Gamma^k(-s)\, e^{-i\pi ks}\, e^{i2(n-1)\pi s}\, z^{ks}, \]

defined for $\frac{\pi}{2} - 2(n-1)\frac{\pi}{k} < \arg z < \frac{3\pi}{2} - 2(n-1)\frac{\pi}{k}$, $z \ne 0$ and for any positive number $c > 0$, is a solution of the generalized hypergeometric equation (7) (the path of integration is a vertical line through $-c$). It has asymptotic behaviour

\[ g^{(n)}(z) \sim \frac{1}{\sqrt{k}}\, \frac{e^{i\frac{\pi}{k}(n-1)}}{z^{\frac{k-1}{2}}}\, \exp\Big(k\, e^{i\frac{2\pi}{k}(n-1)} z\Big), \qquad z \to \infty. \]
Stokes Matrices of QH ∗ (CPd )

349

In particular, for n(k) = 2k +1 (k even), or n(k) = k+1 2 (k odd), the analytic continuation of g (n(k)) (z) has the above asymptotic behaviour in the domains π π < arg z < π + , k even , k k 2π , k odd , −π < arg z < π + k

−π −

(n(k))

(n(k))

≡ ϕR appearing in the first rows of the and coincides with the solution ϕL fundamental matrices YL and YR of the system (8). The following identity holds: k X

m−k

(−1)

m=0

2π k g (n) (z ei k m ) = 0. m

(12)
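The dominance statement behind the enlarged sectors quoted in the lemma can be probed numerically. The following is a minimal sketch, not part of the original argument: it assumes $u_n = e^{2\pi i (n-1)/k}$ (the eigenvalues at $t^2 = 0$ used in this paper) and introduces a hypothetical helper `dominant_index` that returns the index $n$ for which $|e^{z u_n}|$ is largest at a given $\arg z$.

```python
import math

def dominant_index(k, arg_z):
    """Return the n in 1..k that maximizes |exp(z u_n)|, u_n = exp(2*pi*i*(n-1)/k).

    |exp(z u_n)| = exp(Re(z u_n)); for |z| = 1 this real part is
    cos(arg z + 2*pi*(n-1)/k), so comparing cosines is enough.
    """
    values = [math.cos(arg_z + 2 * math.pi * (n - 1) / k) for n in range(1, k + 1)]
    return 1 + max(range(k), key=lambda idx: values[idx])

# k even: a direction inside -pi - pi/k < arg z < -pi + pi/k
k_even = 6
print(dominant_index(k_even, -math.pi + 0.1), k_even // 2 + 1)

# k odd: a direction inside pi < arg z < pi + 2*pi/k
k_odd = 7
print(dominant_index(k_odd, math.pi + 0.2), (k_odd + 1) // 2)
```

In both printed pairs the first number is the numerically dominant index and the second is the index $n(k)$ named in the text, so the two can be compared directly for other sample directions as well.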

We omit the proof, which is a standard application of the steepest descent method. The reader may find it in the preprint [15].

Remark 2. Observe that for basic solutions of the hypergeometric equation $\Phi_{L/R} = [\varphi^{(1)}_{L/R}, \dots, \varphi^{(k)}_{L/R}]$,
\[ \varphi^{(n)}(z) \sim \frac{e^{i\frac{\pi}{k}(n-1)}}{\sqrt{k}\; z^{\frac{k-1}{2}}} \exp\bigl(k\, e^{i\frac{2\pi}{k}(n-1)} z\bigr) \]
on some sector, and
\[ \varphi^{(n)}\bigl(z e^{\frac{2\pi i}{k}}\bigr) \sim (-1)\, \frac{e^{i\frac{\pi}{k}([n+1]-1)}}{\sqrt{k}\; z^{\frac{k-1}{2}}} \exp\bigl(k\, e^{i\frac{2\pi}{k}([n+1]-1)} z\bigr), \]
like $-\varphi^{(n+1)}$, on the sector rotated by $-\frac{2\pi}{k}$. Note however that
\[ \varphi^{(k)}\bigl(z e^{\frac{2\pi i}{k}}\bigr) \sim \bigl(\sqrt{k}\; z^{\frac{k-1}{2}}\bigr)^{-1} e^{kz}, \]
like $\varphi^{(1)}(z)$. Also, note that $g^{(n)}(z e^{\frac{2\pi i}{k}}) = -g^{(n+1)}(z)$ (we mean analytic continuations), with asymptotic behaviour on the rotated domain.

4. Monodromy Data of $QH^*(CP^{k-1})$

We return to the system (1), which has a fundamental matrix
\[ y_0(t^2, w) = \bigl(I + A_1(t^2)\, w + A_2(t^2)\, w^2 + \dots \bigr)\, w^{\hat\mu}\, w^{R}, \qquad w \to 0, \]
where $R$ is the same for the system (8). The series appearing in the solutions converges near $w = 0$. The matrix $\hat U(t^2)$ has eigenvalues
\[ u_n(t^2) = e^{i\frac{2\pi}{k}(n-1)}\, e^{\frac{t^2}{k}} \equiv u_n\, e^{\frac{t^2}{k}}, \qquad n = 1, \dots, k, \]
and eigenvectors $x_n(t^2)$ of entries
\[ x^j_n(t^2) = e^{i(2j-1)(n-1)\frac{\pi}{k}}\; e^{\frac{2j-1-k}{2k} t^2} \equiv x^j_n\, e^{\frac{2j-1-k}{2k} t^2}. \]

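Let $X(t^2) = (x^j_n(t^2))$. With the gauge $y = X(t^2)\, \tilde y(t^2, w)$ we obtain the equivalent system
\[ \partial_z \tilde y = \Bigl[ U(t^2) + \frac{V(t^2)}{z} \Bigr] \tilde y, \tag{13} \]
\[ U(t^2) = X^{-1}(t^2)\, \hat U(t^2)\, X(t^2) = \mathrm{diag}(u_1(t^2), \dots, u_k(t^2)), \qquad V(t^2) = X^{-1}(t^2)\, \hat\mu\, X(t^2), \qquad V(t^2)^T + V(t^2) = 0. \]
Let us fix an initial point $t_0 = (0, t_0^2, 0, \dots, 0)$. The system (13) has fundamental matrices $y_R(t_0^2, z)$, $y_L(t_0^2, z)$, which are asymptotic to the formal solution
\[ \tilde y_F(t_0^2, w) = \Bigl[ I + \frac{F_1(t_0^2)}{w} + \frac{F_2(t_0^2)}{w^2} + \dots \Bigr] e^{w\, U(t_0^2)} \]
in the sectors
\[ S_L(t_0) = \Bigl\{ w \in \mathbf{C} \;\Big|\; 0 < \arg\Bigl[ w \exp\Bigl(\frac{t_0^2}{k}\Bigr) \Bigr] < \pi + \frac{\pi}{k} \Bigr\}, \]
\[ S_R(t_0) = \Bigl\{ w \in \mathbf{C} \;\Big|\; -\pi < \arg\Bigl[ w \exp\Bigl(\frac{t_0^2}{k}\Bigr) \Bigr] < \frac{\pi}{k} \Bigr\}, \]
and
\[ \tilde y_L(t_0^2, w) = y_R(t_0^2, w)\, S, \qquad 0 < \arg\Bigl[ w \exp\Bigl(\frac{t_0^2}{k}\Bigr) \Bigr] < \frac{\pi}{k}, \]
with respect to the admissible line
\[ l_{t_0} := \Bigl\{ w \in \mathbf{C} \;\Big|\; w = \rho \exp\Bigl( i\Bigl[\epsilon - \frac{\Im m\, t_0^2}{k}\Bigr] \Bigr),\ \rho > 0 \Bigr\}. \]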

The Stokes' matrix is precisely the matrix $S$ of system (8) with respect to the admissible line $l_{t_0}$. Also the central connection matrix defined by
\[ y_0(t_0^2, w) = y_R(t_0^2, w)\, C, \qquad -\pi < \arg\bigl( w\, e^{\frac{t_0^2}{k}} \bigr) < \frac{\pi}{k}, \]
is the same for the system (8).

Definition. $C$ and $S$, together with $\hat\mu$, $R$, and $e = \frac{\partial}{\partial t^1}$ are the monodromy data of the quantum cohomology of $CP^{k-1}$ in a local chart containing $t_0$.

Recall that we fixed a point $t_0 = (0, t_0^2, \dots, 0)$. When we consider a point $t = (t^1, \dots, t^k)$ away from $t_0$, the system (1) acquires a general form
\[ \partial_w y = \Bigl[ \hat U(t) + \frac{\hat\mu}{w} \Bigr] y, \tag{14} \]
where $\hat U(t)$ is related to the multiplication by the Euler vector field in the manifold ([10,11]) and $y^{(j)}_\alpha(t; w) = \partial_\alpha \tilde t^{\,j}(t, w)$. The admissible line $l_{t_0}$ must be considered fixed once and for all. Instead, the Stokes' rays change. This is because they are functions of the eigenvalues $u_1(t), \dots, u_k(t)$ of the matrix $\hat U(t^1, \dots, t^k)$. For example, if just $t^2$ varies, while $t^1 = t^3 = \dots = t^k = 0$, the system (14) has Stokes' rays
\[ R_{rs}(t^2) = \Bigl\{ w \;\Big|\; w = \rho \exp\Bigl( i\frac{2\pi}{k} - i\,(r+s)\frac{\pi}{k} - i\,\frac{\Im m\, t^2}{k} \Bigr),\ \rho > 0 \Bigr\}. \]
The dependence of the coefficients of the system (14) on $t$ is isomonodromic [10,11] (the Stokes' matrix is a natural monodromy datum in the theory of isomonodromic deformations [20,16–18,13]). Then $\hat\mu$ and $R$ are the same for any $t$. $S$ and $C$ do not change if we move in a sufficiently small neighbourhood of $t_0$. Problems arise when some Stokes' rays cross $l_{t_0}$: $S$ and $C$ must then be modified by an action of the braid group. We will return to this point later.

5. Computation of S

To compute $S$, we factorize it in "Stokes' factors". Our fundamental matrix $Y_L$ has the required asymptotic form on the sector between $R_{k2}$ ($\arg z = 0$) and $R_{1k}$ ($\arg z = \pi + \frac{\pi}{k}$). $Y_R$ has the same behaviour between $R_{2k}$ ($\arg z = -\pi$) and $R_{k1}$ ($\arg z = \frac{\pi}{k}$). Of course, we can consider fundamental matrices with the same asymptotic behaviour on other sectors of angular width less than $\pi + \frac{\pi}{k}$ and bounded by two Stokes' rays. We introduce the following notation: consider a fundamental matrix of (8) having the required asymptotic behaviour on such a sector. If we go all over the sector clockwise we meet Stokes' rays belonging to the sector at each displacement of $\frac{\pi}{k}$. Let $R_{ij}$ be the last ray we meet before reaching the boundary (the boundary is still a Stokes' ray not belonging to the sector). Then we will call the fundamental matrix $Y_{ij}$. For example, $Y_L = Y_{k1}$ and $Y_R = Y_{1k}$. See Fig. 2.

[Figure 2: the domain of $Y_{k2}$, drawn with the Stokes' rays $R_{k1}$, $R_{2k}$, $R_{k2}$, $R_{1k}$, $R_{k3}$, $R_{1,k-1}$, $R_{k,k-1}$.]

Fig. 2. Sector where $Y_{k2}$ has the asymptotic behaviour $X e^{zU}$ for $z \to \infty$

Sometimes, a different labelling is used in the literature. The rays must be enumerated as in Fig. 3. The numeration refers to the line $l$: the rays are labelled in counterclockwise order starting from the first one in $\Pi_R$ (which will be $R_0$; then $R_0, R_1, \dots, R_{k-1}$ are in $\Pi_R$, and $R_k, \dots, R_{2k-1}$ are in $\Pi_L$). For our particular choice of $l$, $R_0 \equiv R_{1k}$ (at $\arg z = -\pi + \frac{\pi}{k}$); $R_1 \equiv R_{1,k-1}$ follows counterclockwise, and so on. We proceed until we reach $R_{k-1} \equiv R_{k2}$ before crossing $l$, and then continue in the same way. The fundamental matrices are labelled as we prescribed above, namely $Y_j$ if its sector contains $R_j$ as the last ray met going all over the sector clockwise before the boundary. The sector itself is denoted by $S_j$. See Fig. 3.

[Figure 3: the sector $S_k$ with the line $l$ and the rays $R_0$, $R_1$, $R_2$, ..., $R_{k-2}$, $R_{k-1}$, $R_k$, ..., $R_{2k-1}$.]

Fig. 3. Sector $S_k$ and labelling of Stokes' rays

Definition. The Stokes' factors are the connection matrices $K_j$ such that
\[ Y_{j+1}(z) = Y_j(z)\, K_j \]
on the overlapping region $S_j \cap S_{j+1}$ of width $\pi$.

We warn the reader that also the Stokes' factors will be labelled with both conventions above, according to our convenience (for example $K_0 \equiv K_{1k}$). As a consequence of the above definitions, we can factorize $S$ as follows:
\[ Y_L = Y_{k1} = Y_{k2} K_{k2} = Y_{k3} K_{k3} K_{k2} = \dots = Y_{1k}\, K_{1k} K_{1,k-1} K_{k,k-1} K_{k,k-2} \cdots K_{k3} K_{k2} \equiv Y_R\, S. \]
Then
\[ S = K_{1k} K_{1,k-1} K_{k,k-1} K_{k,k-2} \cdots K_{k3} K_{k2}. \tag{15} \]
We observe that, since the first row of $Y(z)$ is equal to $z^{\frac{k-1}{2}} \Phi(z)$, the following holds:
\[ \Phi_{j+1}(z) = \Phi_j\, K_j. \]

Remark 3. The Stokes' factors of the system (8) and of the gauge-equivalent system (10) are the same. From the skew-symmetry of $V$ it follows that
\[ K_{ji} = (K_{ij}^{-1})^T. \]

Before computing the Stokes' factors explicitly, we show that just two of them are enough to compute all the others. Let
\[ F(z) := \Bigl( \frac{1}{\sqrt{k}\; z^{\frac{k-1}{2}}} \exp(kz),\ \ \frac{e^{\frac{i\pi}{k}}}{\sqrt{k}\; z^{\frac{k-1}{2}}} \exp\bigl(k e^{\frac{2\pi i}{k}} z\bigr),\ \ \dots,\ \ \frac{e^{\frac{i\pi}{k}(k-1)}}{\sqrt{k}\; z^{\frac{k-1}{2}}} \exp\bigl(k e^{\frac{2\pi i}{k}(k-1)} z\bigr) \Bigr) = \bigl(F^{(1)}(z), F^{(2)}(z), \dots, F^{(k)}(z)\bigr) \]
be the row vector whose entries are the first terms of the asymptotic expansions of an actual solution $\Phi(z)$ of the generalized hypergeometric equation. By a straightforward computation we see that
\[ F\bigl(z e^{\frac{2\pi i}{k}}\bigr) = F(z)\, T_F, \]
where $(T_F)_{i+1,i} = -1$ for $i = 1, \dots, k-1$; $(T_F)_{1,k} = 1$; otherwise $(T_F)_{i,j} = 0$. We use now the convention of enumeration of Stokes' rays $R_0, R_1, \dots$, starting from $l$ (see above). Let $\Phi_m(z)$ be an actual solution of the hypergeometric equation having asymptotic behaviour $F(z)$ on $S_m$,
\[ \Phi_m(z) \sim F(z), \qquad z \to \infty,\ z \in S_m. \]
Then
\[ \Phi_{m+2}\bigl(z e^{\frac{2\pi i}{k}}\bigr) \sim F\bigl(z e^{\frac{2\pi i}{k}}\bigr) = F(z)\, T_F, \qquad z \in S_m. \]
Namely,
\[ \Phi_{m+2}\bigl(z e^{\frac{2\pi i}{k}}\bigr)\, T_F^{-1} \sim F(z), \qquad z \in S_m; \]
then, by uniqueness of actual solutions having asymptotic behaviour $F(z)$ in a sector wider than $\pi$, we have
\[ \Phi_{m+2}\bigl(z e^{\frac{2\pi i}{k}}\bigr) = \Phi_m(z)\, T_F, \qquad z \in S_m. \]
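The "straightforward computation" $F(z e^{2\pi i/k}) = F(z)\,T_F$ can also be checked numerically. The sketch below is only an illustration under stated assumptions: $z$ is kept in polar form $(r, \theta)$ so that $z^{(k-1)/2}$ is continued analytically rather than cut at the principal branch, and the names `F`, `T_F` and `row_times_matrix` are hypothetical helpers introduced here.

```python
import cmath
import math

def F(k, r, theta):
    """Leading asymptotic row vector F(z) at z = r*exp(i*theta), as a list."""
    z = r * cmath.exp(1j * theta)
    # z**((k-1)/2) continued analytically in theta
    z_pow = r ** ((k - 1) / 2) * cmath.exp(1j * theta * (k - 1) / 2)
    return [
        cmath.exp(1j * math.pi * (n - 1) / k)
        * cmath.exp(k * cmath.exp(2j * math.pi * (n - 1) / k) * z)
        / (math.sqrt(k) * z_pow)
        for n in range(1, k + 1)
    ]

def T_F(k):
    """Entries (i+1, i) = -1 for i = 1..k-1 and (1, k) = 1 (1-based), zeros elsewhere."""
    T = [[0] * k for _ in range(k)]
    for i in range(k - 1):
        T[i + 1][i] = -1
    T[0][k - 1] = 1
    return T

def row_times_matrix(row, M):
    k = len(row)
    return [sum(row[i] * M[i][j] for i in range(k)) for j in range(k)]

def max_difference(k, r, theta):
    """Largest entrywise gap between F(z e^{2 pi i/k}) and F(z) T_F."""
    lhs = F(k, r, theta + 2 * math.pi / k)
    rhs = row_times_matrix(F(k, r, theta), T_F(k))
    return max(abs(a - b) for a, b in zip(lhs, rhs))

print(max_difference(5, 0.3, 0.4))
```

The printed gap should be at rounding-error level; the sample point ($k = 5$, $r = 0.3$, $\theta = 0.4$) is arbitrary.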

Lemma 6. For any $p \in \mathbf{Z}$,
\[ K_{m+2p} = T_F^{-p}\, K_m\, T_F^{p}. \]

Proof. For $z \in S_m \cap S_{m+1}$,
\[ \Phi_{m+1}(z) = \Phi_m(z)\, K_m = \Phi_{m+2}\bigl(z e^{\frac{2\pi i}{k}}\bigr)\, T_F^{-1} K_m = \Phi_{m+3}\bigl(z e^{\frac{2\pi i}{k}}\bigr)\, K_{m+2}^{-1}\, T_F^{-1} K_m = \Phi_{m+1}(z)\, T_F\, K_{m+2}^{-1}\, T_F^{-1} K_m. \]
Then $K_{m+2} = T_F^{-1} K_m T_F$. By induction we prove the lemma. □

From the lemma it follows that just two Stokes' factors are enough to compute all the others. We are ready to give a concise formula for $S$:

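Theorem 1. Let $l$ be an admissible line (i.e. not containing Stokes' rays), and let us enumerate the rays in counterclockwise order starting from the first one in $\Pi_R$ (which will be $R_0$; then $R_0, R_1, \dots, R_{k-1}$ are in $\Pi_R$, and $R_k, \dots, R_{2k-1}$ are in $\Pi_L$). Then the Stokes' matrix for (8), (10), (7) and $k > 3$ is
\[ S = \begin{cases} \bigl(K_0 K_1 T_F^{-1}\bigr)^{\frac{k}{2}}\, T_F^{\frac{k}{2}} \equiv T_F^{\frac{k}{2}} \bigl(T_F^{-1} K_{k-2} K_{k-1}\bigr)^{\frac{k}{2}}, & k \text{ even}, \\[4pt] \bigl(K_0 K_1 T_F^{-1}\bigr)^{\frac{k-1}{2}}\, K_0\, T_F^{\frac{k-1}{2}} \equiv T_F^{\frac{k-1}{2}}\, K_{k-1} \bigl(T_F^{-1} K_{k-2} K_{k-1}\bigr)^{\frac{k-1}{2}}, & k \text{ odd}. \end{cases} \tag{16} \]

Proof. $S = K_0 K_1 K_2 \cdots K_{k-1}$, with $K_{2p} = T_F^{-p} K_0 T_F^{p}$ and $K_{2p+1} = T_F^{-p} K_1 T_F^{p}$. Then
\begin{align*}
S &= K_0 K_1\, (T_F^{-1} K_0 T_F)\, (T_F^{-1} K_1 T_F)\, K_4 K_5 \cdots K_{k-1} \\
  &= (K_0 K_1 T_F^{-1})\, K_0 K_1 T_F\, K_4 K_5 \cdots K_{k-1} \\
  &= (K_0 K_1 T_F^{-1})\, K_0 K_1 T_F\, (T_F^{-2} K_0 T_F^{2})\, (T_F^{-2} K_1 T_F^{2})\, K_6 \cdots K_{k-1} \\
  &= (K_0 K_1 T_F^{-1})(K_0 K_1 T_F^{-1})\, K_0 K_1 T_F^{2}\, K_6 \cdots K_{k-1}.
\end{align*}
Now observe that $K_{k-1} = T_F^{-(\frac{k}{2}-1)} K_1 T_F^{\frac{k}{2}-1}$ for $k$ even, while $K_{k-1} = T_F^{-\frac{k-1}{2}} K_0 T_F^{\frac{k-1}{2}}$ for $k$ odd. Then, for $k$ even,
\begin{align*}
S &= (K_0 K_1 T_F^{-1})^{\frac{k}{2}-2}\, K_0 K_1\, T_F^{\frac{k}{2}-2}\, K_{k-2} K_{k-1} \\
  &= (K_0 K_1 T_F^{-1})^{\frac{k}{2}-2}\, K_0 K_1\, T_F^{\frac{k}{2}-2}\; T_F^{-(\frac{k}{2}-1)} K_0 T_F^{\frac{k}{2}-1}\; T_F^{-(\frac{k}{2}-1)} K_1 T_F^{\frac{k}{2}-1} \\
  &= \bigl(K_0 K_1 T_F^{-1}\bigr)^{\frac{k}{2}}\, T_F^{\frac{k}{2}}.
\end{align*}
For $k$ odd:
\begin{align*}
S &= (K_0 K_1 T_F^{-1})^{\frac{k-3}{2}}\, K_0 K_1\, T_F^{\frac{k-3}{2}}\, K_{k-1} \\
  &= (K_0 K_1 T_F^{-1})^{\frac{k-3}{2}}\, K_0 K_1\, T_F^{\frac{k-3}{2}}\; T_F^{-\frac{k-1}{2}} K_0 T_F^{\frac{k-1}{2}} \\
  &= \bigl(K_0 K_1 T_F^{-1}\bigr)^{\frac{k-1}{2}}\, K_0\, T_F^{\frac{k-1}{2}}.
\end{align*}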

If instead we write the Stokes' factors in terms of $K_{k-2}$ and $K_{k-1}$ we obtain the other two formulas in the same way. □

Remark 4. For our particular choice of $l$, $K_0 \equiv K_{1k}$, $K_1 \equiv K_{1,k-1}$, $K_{k-2} \equiv K_{k3}$ and $K_{k-1} \equiv K_{k2}$. For $k = 3$, $S = T_F K_{32} (T_F^{-1} K_{12} K_{32}) = K_{13} K_{12} K_{32}$.

It is now worth deriving some properties of the monodromy of $Y(z)$ (for (8)) and $\Phi(z)$ (for (7)), which will be useful later. Consider $\Phi_m(z)$ with asymptotic behaviour $F(z)$ on $S_m$. Then
\[ \Phi_m(z) = \Phi_{m-2}(z)\, K_{m-2} K_{m-1} \equiv \Phi_m\bigl(z e^{\frac{2\pi i}{k}}\bigr)\, T_F^{-1} K_{m-2} K_{m-1}. \]
On the other hand
\[ \Phi_m(z) = \Phi_{m+2}(z)\, K_{m+1}^{-1} K_m^{-1} \equiv \Phi_m\bigl(z e^{-\frac{2\pi i}{k}}\bigr)\, T_F\, K_{m+1}^{-1} K_m^{-1}. \]
This proves the following

Lemma 7. The basic solution $\Phi_m(z)$ of the generalized hypergeometric equation (7) with asymptotic behaviour $F(z)$ on $S_m$ satisfies the identity
\[ \Phi_m\bigl(z e^{\frac{2\pi i}{k}}\bigr) = \Phi_m(z)\, T_m, \]
where
\[ T_m := K_{m-1}^{-1} K_{m-2}^{-1}\, T_F = T_F\, K_{m+1}^{-1} K_m^{-1}. \]
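The algebra behind Lemma 6 and the first form of (16) does not depend on the particular entries of $K_0$ and $K_1$, so it can be tested on arbitrary matrices. The following is a minimal sketch under that assumption: random integer matrices stand in for $K_0$, $K_1$ (they are not actual Stokes' factors), and the helper names are hypothetical. It uses the fact that $T_F$ is a signed permutation matrix, so $T_F^{-1} = T_F^T$.

```python
import numpy as np

def T_F(k):
    """Entries (i+1, i) = -1 for i = 1..k-1 and (1, k) = 1 (1-based), zeros elsewhere."""
    T = np.zeros((k, k), dtype=np.int64)
    for i in range(k - 1):
        T[i + 1, i] = -1
    T[0, k - 1] = 1
    return T

def product_of_factors(K0, K1, k):
    """K_0 K_1 ... K_{k-1} with K_{2p} = T^-p K_0 T^p and K_{2p+1} = T^-p K_1 T^p."""
    T = T_F(k)
    T_inv = T.T  # inverse of a signed permutation matrix
    S = np.eye(k, dtype=np.int64)
    for j in range(k):
        p, parity = divmod(j, 2)
        K = K0 if parity == 0 else K1
        S = S @ np.linalg.matrix_power(T_inv, p) @ K @ np.linalg.matrix_power(T, p)
    return S

def closed_formula(K0, K1, k):
    """First expression in (16), for k even or k odd."""
    T = T_F(k)
    A = K0 @ K1 @ T.T  # K_0 K_1 T_F^{-1}
    if k % 2 == 0:
        return np.linalg.matrix_power(A, k // 2) @ np.linalg.matrix_power(T, k // 2)
    half = (k - 1) // 2
    return np.linalg.matrix_power(A, half) @ K0 @ np.linalg.matrix_power(T, half)

rng = np.random.default_rng(0)
for k in (4, 5, 6, 7):
    K0 = rng.integers(-2, 3, size=(k, k))
    K1 = rng.integers(-2, 3, size=(k, k))
    print(k, np.array_equal(product_of_factors(K0, K1, k), closed_formula(K0, K1, k)))
```

Integer arithmetic is used throughout so that the comparison is exact rather than up to rounding.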

Corollary 1. The monodromy (at $z = 0$) of $\Phi_m(z)$ is
\[ \Phi_m(z e^{2\pi i}) = \Phi_m(z)\, (T_m)^k. \]

Now, for our particular choice of the line $l$ and for $m = k$, $\Phi_m(z) = \Phi_L(z)$. For the solution $Y_L(z)$ of (8), the relations $Y_R(z) = Y_L(z)\, S^{-1}$ ($0 < \arg z < \frac{\pi}{k}$) and $Y_L(z) = Y_R(z e^{-2\pi i})\, S^T$ ($\pi < \arg z < \pi + \frac{\pi}{k}$) immediately imply
\[ Y_L(z e^{2\pi i}) = Y_L(z)\, S^{-1} S^T. \]
Recall that the $(n, j)$th entry of $Y(z)$ is
\[ Y_{n,j}(z) \equiv Y_n^{(j)}(z) = \frac{1}{k^{n-1}}\, z^{\frac{k-1}{2} - n + 1}\, (z \partial_z)^{(n-1)} \varphi^{(j)}(z), \]
and observe that $(z e^{2\pi i})^{\frac{k-1}{2}} = (-1)^{k-1} z^{\frac{k-1}{2}}$. Then, from Corollary 1 we get the following:

Corollary 2. Let $T$ be the $k$-monodromy matrix of $\Phi_L$ (namely, $T \equiv T_k$ for our choice of $l$). Then
\[ T^k = (-1)^{k-1} S^{-1} S^T. \]

Our formula (16) allows us to easily compute $S$. The recipe is simply to take $K_{k3}$, $K_{k2}$ (which we are going to compute explicitly) and substitute them into
\[ S = T_F^{\frac{k}{2}} \bigl(T_F^{-1} K_{k3} K_{k2}\bigr)^{\frac{k}{2}} = T_F^{\frac{k}{2}}\, T^{-\frac{k}{2}} \tag{17} \]
for $k$ even, or into
\[ S = T_F^{\frac{k-1}{2}}\, K_{k2} \bigl(T_F^{-1} K_{k3} K_{k2}\bigr)^{\frac{k-1}{2}} = T_F^{\frac{k-1}{2}}\, K_{k2}\, T^{-\frac{k-1}{2}} \tag{18} \]
for $k$ odd.
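The recipe (17)/(18) can be turned into a short computation. The sketch below is illustrative only: it takes the entries of $K_{k2}$ and $K_{k3}$ from the explicit list given at the end of this section (binomial coefficients $\binom{k}{2(j-1)}$ and $\binom{k}{2j-3}$, with $(K_{k3})_{2,1} = -\binom{k}{1}$), and the function names are hypothetical. It relies on two observations that are assumptions of this sketch rather than statements of the paper: $T_F^{-1} = T_F^T$ (signed permutation), and $(K - I)^2 = 0$ for both factors, so that $K^{-1} = 2I - K$.

```python
import numpy as np
from math import comb

def T_F(k):
    """Entries (i+1, i) = -1 for i = 1..k-1 and (1, k) = 1 (1-based), zeros elsewhere."""
    T = np.zeros((k, k), dtype=np.int64)
    for i in range(k - 1):
        T[i + 1, i] = -1
    T[0, k - 1] = 1
    return T

def K_k2(k):
    """Unit diagonal; (j, k-j+2) entry C(k, 2(j-1)) for j = 2..floor((k+1)/2) (1-based)."""
    K = np.eye(k, dtype=np.int64)
    for j in range(2, (k + 1) // 2 + 1):
        K[j - 1, k - j + 1] = comb(k, 2 * (j - 1))
    return K

def K_k3(k):
    """Unit diagonal; (2, 1) entry -C(k, 1); (j, k-j+3) entry C(k, 2j-3), j = 3..floor(k/2)+1."""
    K = np.eye(k, dtype=np.int64)
    K[1, 0] = -comb(k, 1)
    for j in range(3, k // 2 + 2):
        K[j - 1, k - j + 2] = comb(k, 2 * j - 3)
    return K

def monodromy_T(k):
    """T = K_k2^{-1} K_k3^{-1} T_F (Lemma 7 with m = k)."""
    I = np.eye(k, dtype=np.int64)
    # (K - I)^2 = 0 for these two factors, hence K^{-1} = 2I - K exactly.
    return (2 * I - K_k2(k)) @ (2 * I - K_k3(k)) @ T_F(k)

def stokes_matrix(k):
    """S from formula (17) for k even, (18) for k odd."""
    TF = T_F(k)
    T_inv = TF.T @ K_k3(k) @ K_k2(k)  # T^{-1} = T_F^{-1} K_k3 K_k2
    power = np.linalg.matrix_power
    if k % 2 == 0:
        return power(TF, k // 2) @ power(T_inv, k // 2)
    half = (k - 1) // 2
    return power(TF, half) @ K_k2(k) @ power(T_inv, half)

k = 4
S = stokes_matrix(k)
T = monodromy_T(k)
print(S)
# Corollary 2 written without inverses: S T^k = (-1)^(k-1) S^T
print(np.array_equal(S @ np.linalg.matrix_power(T, k), (-1) ** (k - 1) * S.T))
```

For $k = 4$ the matrix printed first is not upper triangular, in line with the remark made in Sect. 6 that $S$ only reaches a triangular form after a suitable permutation.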

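Computation of Stokes' factors. We need to distinguish between $k$ odd and even. In the following $g(z)$ will mean $g^{(n(k))}(z)$ ($n(k) = \frac{k}{2} + 1$ or $\frac{k+1}{2}$ for $k$ even or odd respectively).

$k$ odd.
\[ g(z) = \varphi_L^{(\frac{k+1}{2})}(z) \equiv \varphi_R^{(\frac{k+1}{2})}(z), \]
with asymptotic behaviour on
\[ -\pi < \arg z < \pi + \frac{2\pi}{k}. \]
If we iterate the map $z \mapsto z e^{\frac{2\pi i}{k}}$ for $m = 1, 2, \dots, \frac{k+1}{2}$ times, the domain of $g(z e^{\frac{2\pi i}{k} m})$ for each $m$ covers $S_R$. When we reach $m = \frac{k+1}{2}$ a new iteration (i.e. a new rotation of the domain by $-\frac{2\pi}{k}$) will leave the sector $-\frac{\pi}{k} < \arg z < \frac{\pi}{k}$ of $S_R$ uncovered. In the same way, if we apply $z \mapsto z e^{-\frac{2\pi i}{k}}$, the sector $-\pi < \arg z < -\pi + \frac{2\pi}{k}$ of $S_R$ remains uncovered. Then, by Remark 2:
\[ \Phi_{1k}(z) \equiv \Phi_R(z) = \Bigl( (-1)^{\frac{k-1}{2}} g\bigl(z e^{i\frac{2\pi}{k}\frac{k+1}{2}}\bigr),\ \dots \tfrac{k-3}{2} \text{ unknown terms} \dots,\ g(z),\ -g\bigl(z e^{\frac{2\pi i}{k}}\bigr),\ g\bigl(z e^{\frac{4\pi i}{k}}\bigr),\ \dots,\ (-1)^{\frac{k-1}{2}} g\bigl(z e^{i\frac{2\pi}{k}\frac{k-1}{2}}\bigr) \Bigr). \]
In the same way we see that
\[ \Phi_{k1}(z) \equiv \Phi_L(z) = \Bigl( (-1)^{\frac{k-1}{2}} g\bigl(z e^{-i\frac{2\pi}{k}\frac{k-1}{2}}\bigr),\ (-1)^{\frac{k-3}{2}} g\bigl(z e^{-i\frac{2\pi}{k}\frac{k-3}{2}}\bigr),\ \dots,\ -g\bigl(z e^{-\frac{2\pi i}{k}}\bigr),\ g(z),\ \dots \tfrac{k-1}{2} \text{ unknown terms} \dots \Bigr), \]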

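and similar expressions for $\Phi_{1,k-1}$, $\Phi_{k,k-1}$, $\Phi_{k,k-2}$, ..., $\Phi_{k3}$, $\Phi_{k2}$. The unknown terms are computed using the identity (12) and simple considerations on the dominant exponentials $|e^{z u_i}|$ on the sectors which remain uncovered in the iterations of $z \mapsto z e^{\pm i\frac{2\pi}{k}}$.

Example. A simple example will clarify this procedure. Let $k = 7$; then $g = \varphi^{(4)}$,
\[ \Phi_{17} = \Phi_R = \bigl( -g(z e^{\frac{8\pi i}{7}}),\ ?,\ ?,\ g(z),\ -g(z e^{\frac{2\pi i}{7}}),\ g(z e^{\frac{4\pi i}{7}}),\ -g(z e^{\frac{6\pi i}{7}}) \bigr). \]
We look for $\varphi_R^{(3)}$. If we take $(-1)\, g(z e^{-\frac{2\pi i}{7}})$ we leave uncovered $-\pi < \arg z < -\pi + \frac{2\pi}{7}$ in $S_R$. On $-\pi < \arg z < -\pi + \frac{\pi}{7}$ (between two nearby Stokes' rays) we have the relations $|e^{z u_4}| > |e^{z u_5}| > |e^{z u_3}|$. On $-\pi + \frac{\pi}{7} < \arg z < -\pi + \frac{2\pi}{7}$, $|e^{z u_4}| > |e^{z u_3}|$ (later on we will simply write $4 > 3$). Then
\[ \varphi_R^{(3)}(z) = (-1)\, g(z e^{-\frac{2\pi i}{7}}) + c_4\, \varphi_R^{(4)}(z) + c_5\, \varphi_R^{(5)}(z). \]
To find $c_4$, $c_5$ we need another representation for $\varphi_R^{(3)}$. We consider $(-1)\, g(z e^{\frac{12\pi i}{7}})$, which has the correct asymptotic behaviour, but on a domain which leaves uncovered $-\frac{3\pi}{7} < \arg z < \frac{\pi}{7}$. The relations are: on $0 < \arg z < \frac{\pi}{7}$, $1 > 7 > 2 > 6 > 3$; on $-\frac{\pi}{7} < \arg z < 0$, $1 > 2 > 7 > 3$; on $-\frac{2\pi}{7} < \arg z < -\frac{\pi}{7}$, $2 > 1 > 3$; on $-\frac{3\pi}{7} < \arg z < -\frac{2\pi}{7}$, $2 > 3$. Then
\[ \varphi_R^{(3)}(z) = (-1)\, g(z e^{\frac{12\pi i}{7}}) + d_1\, \varphi_R^{(1)}(z) + d_2\, \varphi_R^{(2)}(z) + d_6\, \varphi_R^{(6)}(z) + d_7\, \varphi_R^{(7)}(z). \]
In the same way one finds
\[ \varphi_R^{(2)}(z) = \begin{cases} g(z e^{-\frac{4\pi i}{7}}) + a_3\, \varphi_R^{(3)}(z) + a_4\, \varphi_R^{(4)}(z) + a_5\, \varphi_R^{(5)}(z) + a_6\, \varphi_R^{(6)}(z), \\[4pt] g(z e^{\frac{10\pi i}{7}}) + b_1\, \varphi_R^{(1)}(z) + b_2\, \varphi_R^{(4)}(z). \end{cases} \]
The $\varphi^{(i)}$'s are known for $i = 1, 4, 5, 6, 7$. Using the identity (12) we compute $a$, $b$, $c$, $d$. We get
\[ \varphi_R^{(2)}(z) = g(z e^{-\frac{4\pi i}{7}}) - \binom{7}{1} g(z e^{-\frac{2\pi i}{7}}) + \binom{7}{2} g(z) - \binom{7}{3} g(z e^{\frac{2\pi i}{7}}) + \binom{7}{4} g(z e^{\frac{4\pi i}{7}}), \]
\[ \varphi_R^{(3)}(z) = -g(z e^{-\frac{2\pi i}{7}}) + \binom{7}{1} g(z) - \binom{7}{2} g(z e^{\frac{2\pi i}{7}}). \]
A similar computation gives $\Phi_{71} = \Phi_L$:
\[ \Phi_{71}^T = \begin{pmatrix} -g(z e^{-\frac{6\pi i}{7}}) \\[2pt] g(z e^{-\frac{4\pi i}{7}}) \\[2pt] -g(z e^{-\frac{2\pi i}{7}}) \\[2pt] g(z) \\[2pt] -g(z e^{\frac{2\pi i}{7}}) + \binom{7}{6} g(z) \\[2pt] g(z e^{\frac{4\pi i}{7}}) - \binom{7}{6} g(z e^{\frac{2\pi i}{7}}) + \binom{7}{5} g(z) - \binom{7}{4} g(z e^{-\frac{2\pi i}{7}}) \\[2pt] -g(z e^{\frac{6\pi i}{7}}) + \binom{7}{6} g(z e^{\frac{4\pi i}{7}}) - \binom{7}{5} g(z e^{\frac{2\pi i}{7}}) + \binom{7}{4} g(z) - \binom{7}{3} g(z e^{-\frac{2\pi i}{7}}) + \binom{7}{2} g(z e^{-\frac{4\pi i}{7}}) \end{pmatrix} \]
and
\[ \Phi_{72}^T = \begin{pmatrix} -g(z e^{-\frac{6\pi i}{7}}) \\[2pt] g(z e^{-\frac{4\pi i}{7}}) \\[2pt] -g(z e^{-\frac{2\pi i}{7}}) \\[2pt] g(z) \\[2pt] -g(z e^{\frac{2\pi i}{7}}) \\[2pt] g(z e^{\frac{4\pi i}{7}}) - \binom{7}{6} g(z e^{\frac{2\pi i}{7}}) + \binom{7}{5} g(z) \\[2pt] -g(z e^{\frac{6\pi i}{7}}) + \binom{7}{6} g(z e^{\frac{4\pi i}{7}}) - \binom{7}{5} g(z e^{\frac{2\pi i}{7}}) + \binom{7}{4} g(z) - \binom{7}{3} g(z e^{-\frac{2\pi i}{7}}) \end{pmatrix}. \]
Notice that in each of the last three entries of $\Phi_{72}$ there is a term missing w.r.t. the corresponding entries of $\Phi_{71}$. This immediately implies $(K_{72})_{i,i} = 1$ for $i = 1, \dots, k$; $(K_{72})_{2,7} = \binom{7}{2}$, $(K_{72})_{3,6} = \binom{7}{4}$, $(K_{72})_{4,5} = \binom{7}{6}$. All the other entries are zero. The next step is the computation of $\Phi_{73}$ and $K_{73}$, through $\Phi_{72} = \Phi_{73} K_{73}$. It is done in the same way.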
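The bookkeeping in this example can be checked mechanically by storing, for each entry of $\Phi_{71}$ and $\Phi_{72}$, the coefficients of the functions $g(z e^{2\pi i m/7})$, $m = -3, \dots, 3$. The following is a small sketch of that idea; the encoding as coefficient matrices and the helper `coeff_matrix` are assumptions made here for illustration, and the functions $g$ themselves are never evaluated.

```python
import numpy as np
from math import comb

def coeff_matrix(entries):
    """Column n-1 describes the n-th entry of Phi; row m+3 is the coefficient of g(z e^{2 pi i m/7})."""
    P = np.zeros((7, 7), dtype=np.int64)
    for n, terms in enumerate(entries):
        for m, c in terms.items():
            P[m + 3, n] = c
    return P

# The first four entries are common to Phi_71 and Phi_72.
common = [{-3: -1}, {-2: 1}, {-1: -1}, {0: 1}]

phi_72 = coeff_matrix(common + [
    {1: -1},
    {2: 1, 1: -comb(7, 6), 0: comb(7, 5)},
    {3: -1, 2: comb(7, 6), 1: -comb(7, 5), 0: comb(7, 4), -1: -comb(7, 3)},
])
phi_71 = coeff_matrix(common + [
    {1: -1, 0: comb(7, 6)},
    {2: 1, 1: -comb(7, 6), 0: comb(7, 5), -1: -comb(7, 4)},
    {3: -1, 2: comb(7, 6), 1: -comb(7, 5), 0: comb(7, 4), -1: -comb(7, 3), -2: comb(7, 2)},
])

# K_72: unit diagonal plus the three binomial entries read off in the text (1-based positions).
K_72 = np.eye(7, dtype=np.int64)
K_72[1, 6] = comb(7, 2)   # (2, 7)
K_72[2, 5] = comb(7, 4)   # (3, 6)
K_72[3, 4] = comb(7, 6)   # (4, 5)

# Phi_71 = Phi_72 K_72 acts on the entry index, i.e. on the columns of the coefficient matrix.
print(np.array_equal(phi_71, phi_72 @ K_72))
```

Because the relation $\Phi_{71} = \Phi_{72} K_{72}$ is linear in the entries, it is enough to compare coefficients of each $g(z e^{2\pi i m/7})$ separately, which is what the matrix product does.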

The above procedure is extended to the general case. The factors of interest are:

$k$ odd. $(K_{k2})_{j,\,k-j+2} = \binom{k}{2(j-1)}$ for $j = 2, \dots, \frac{k+1}{2}$. $(K_{k2})_{j,j} = 1$ for $j = 1, \dots, k$. All the other entries are zero. $(K_{k3})_{2,1} = -\binom{k}{1}$; $(K_{k3})_{j,\,k-j+3} = \binom{k}{2j-3}$ for $j = 3, \dots, \frac{k+1}{2}$. $(K_{k3})_{j,j} = 1$ for $j = 1, \dots, k$. All the other entries are zero.

$k$ even. $(K_{k2})_{j,\,k-j+2} = \binom{k}{2(j-1)}$ for $j = 2, \dots, \frac{k}{2}$. $(K_{k2})_{j,j} = 1$ for $j = 1, \dots, k$. All the other entries are zero. $(K_{k3})_{2,1} = -\binom{k}{1}$; $(K_{k3})_{j,\,k-j+3} = \binom{k}{2j-3}$ for $j = 3, \dots, \frac{k}{2} + 1$. $(K_{k3})_{j,j} = 1$ for $j = 1, \dots, k$. All the other entries are zero.

Our equation is a particular case of generalized confluent hypergeometric equations, whose Stokes' factors were computed by A. Duval and C. Mitschi in [14].

6. Reduction of S to "Canonical" Form

Using the Stokes' factors above the reader may compute the Stokes' matrices; many numerical examples are reported in the preprint [15]. One may observe that $S$ is not in a nice upper triangular form (see also Lemma 4) and that quite strange numbers (complicated combinations of sums and products of binomial coefficients) appear. Some natural operations are allowed on the Stokes' matrices of a Frobenius manifold:

a) Permutations. Let us consider the system
\[ \frac{d\tilde Y}{dz} = \Bigl[ U + \frac{1}{z} V \Bigr] \tilde Y, \]
where $U = \mathrm{diag}(u_1, u_2, \dots, u_k)$. Let $\sigma : (1, 2, \dots, k) \mapsto (\sigma(1), \sigma(2), \dots, \sigma(k))$ be a permutation. It is represented by an invertible matrix $P$ such that $P\, U\, P^{-1} = \mathrm{diag}(u_{\sigma(1)}, u_{\sigma(2)}, \dots, u_{\sigma(k)})$. $S$ and $C$ are then transformed into $P S P^{-1}$ and $P C$. For a suitable $P$, $P S P^{-1}$ is upper triangular. As a general result [1], the good permutation is the one which puts $u_{\sigma(1)}, \dots, u_{\sigma(k)}$ in lexicographical ordering w.r.t. the oriented line $l$. $P$ corresponds to a change of coordinates in the given chart, consisting in the permutation $\sigma$ of the coordinates.

b) Sign changes of the entries. The construction at point a) is repeated, but now a diagonal matrix $I$ with 1's or $-1$'s on the diagonal takes the place of $P$. In $I S I^{-1}$ some entries change sign. Note that $I\, U\, I^{-1} \equiv U$. Moreover, the formulae [10,11] which define a local chart of the manifold in terms of monodromy data are not affected by $S \mapsto I S I$, $C \mapsto I C$.

c) Action of the braid group. We first recall that the braid group is generated by $k - 1$ elementary braids $\beta_{12}, \beta_{23}, \dots, \beta_{k-1,k}$, with relations:
\[ \beta_{i,i+1}\, \beta_{j,j+1} = \beta_{j,j+1}\, \beta_{i,i+1} \quad \text{for } i + 1 \neq j,\ j + 1 \neq i, \]
\[ \beta_{i,i+1}\, \beta_{i+1,i+2}\, \beta_{i,i+1} = \beta_{i+1,i+2}\, \beta_{i,i+1}\, \beta_{i+1,i+2}. \]

This abstract group is realized as the fundamental group of $(\mathbf{C}^k \setminus \text{diagonals})/S_k := \{(u_1, \dots, u_k) \mid u_i \neq u_j \text{ for } i \neq j\}/S_k$, where $S_k$ is the symmetric group of order $k$. In Sect. 4 we proved that the Stokes' rays of the system (13), and more generally the rays of the system (14), depend on the point $t$ of the manifold. This is equivalent to the fact that the eigenvalues $u_1(t), \dots, u_k(t)$ of $\hat U(t)$ depend on $t$. Let us start from the point $t_0 = (0, t_0^2, \dots, 0)$. If we move sufficiently far away from $t_0$ (we move to another chart), some Stokes' rays cross the fixed admissible line $l_{t_0}$. Then, we must change "Left" and "Right" solutions of (14), and so $S$ and $C$ change. The motions of the points $u_1(t), \dots, u_k(t)$ as $t$ changes represent transformations of the braid group. Actually, a braid $\beta_{i,i+1}$ can be represented as an "elementary" deformation consisting of a permutation of $u_i$, $u_{i+1}$ moving counterclockwise (clockwise or counterclockwise is a matter of convention).

Suppose $u_1, \dots, u_k$ are already in lexicographical order w.r.t. $l$, so that $S$ is upper triangular (recall that this configuration can be reached by a suitable permutation $P$). The effect on $S$ of the deformation of $u_i$, $u_{i+1}$ representing $\beta_{i,i+1}$ is the following:
\[ S \mapsto S^{\beta_{i,i+1}} := A^{\beta_{i,i+1}}(S)\; S\; A^{\beta_{i,i+1}}(S), \]
where
\[ \bigl(A^{\beta_{i,i+1}}(S)\bigr)_{nn} = 1, \quad n = 1, \dots, k,\ n \neq i, i+1, \]
\[ \bigl(A^{\beta_{i,i+1}}(S)\bigr)_{i+1,i+1} = -s_{i,i+1}, \qquad \bigl(A^{\beta_{i,i+1}}(S)\bigr)_{i,i+1} = \bigl(A^{\beta_{i,i+1}}(S)\bigr)_{i+1,i} = 1, \]
and all the other entries are zero. For the inverse braid $\beta_{i,i+1}^{-1}$ ($u_i$ and $u_{i+1}$ move clockwise) the representation is
\[ \bigl(A^{\beta_{i,i+1}^{-1}}(S)\bigr)_{nn} = 1, \quad n = 1, \dots, k,\ n \neq i, i+1, \]
\[ \bigl(A^{\beta_{i,i+1}^{-1}}(S)\bigr)_{i,i} = -s_{i,i+1}, \qquad \bigl(A^{\beta_{i,i+1}^{-1}}(S)\bigr)_{i,i+1} = \bigl(A^{\beta_{i,i+1}^{-1}}(S)\bigr)_{i+1,i} = 1, \]
and all the other entries are zero. We remark that $S^\beta$ is still upper triangular.

In Fig. 4 we have drawn some lines $L_j = \{\lambda = u_j + \rho\, e^{i(\frac{\pi}{2} - \epsilon)},\ \rho > 0\}$ ($0 < \epsilon < \frac{\pi}{k}$ is the angle of $l$), which help us to visualize the topological effect of the braids' action (they are the branch cuts for the fuchsian system which will be introduced in Sect. 8). We are going to prove that the braid whose effect is to set the deformed points in cyclic order and the cuts in the configuration of Fig. 4 (namely, the last two cuts remain unchanged, the others are alternately "inverted"), brings $S$ to a canonical form:
\[ s_{ii} = 1, \qquad s_{ij} = \binom{k}{j-i}, \qquad s_{ji} = 0, \quad i < j. \]
To be precise, we find that the last column is negative. The "canonical form" is reached after the conjugation $S \mapsto I S I$, where $I := \mathrm{diag}(1, 1, \dots, -1)$.
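The elementary braid action $S \mapsto A^{\beta_{i,i+1}}(S)\, S\, A^{\beta_{i,i+1}}(S)$ described above is easy to experiment with. The following is a minimal sketch on an arbitrary unit upper triangular integer matrix (a stand-in, not the Stokes' matrix of the paper); `braid_matrix` and `act` are hypothetical helper names, and indices are converted from the 1-based convention of the text.

```python
import numpy as np

def braid_matrix(S, i, inverse=False):
    """A^{beta_{i,i+1}}(S), or the matrix of the inverse braid, for 1-based i."""
    k = S.shape[0]
    A = np.eye(k, dtype=S.dtype)
    a, b = i - 1, i                 # 0-based positions of u_i and u_{i+1}
    A[a, a] = A[b, b] = 0
    A[a, b] = A[b, a] = 1
    if inverse:
        A[a, a] = -S[a, b]          # entry (i, i) = -s_{i,i+1}
    else:
        A[b, b] = -S[a, b]          # entry (i+1, i+1) = -s_{i,i+1}
    return A

def act(S, i, inverse=False):
    """Apply the elementary braid beta_{i,i+1} (or its inverse) to S."""
    A = braid_matrix(S, i, inverse)
    return A @ S @ A

rng = np.random.default_rng(1)
k = 5
S = np.eye(k, dtype=np.int64) + np.triu(rng.integers(-3, 4, size=(k, k)), 1)

S1 = act(S, 2)
print(np.array_equal(S1, np.triu(S1)))               # S^beta is still upper triangular
print(np.array_equal(act(S1, 2, inverse=True), S))   # the inverse braid undoes the braid
```

The same two helpers can be composed to follow a whole sequence of elementary braids, applying them one after the other to the successively transformed matrix.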

[Figure 4: two panels, "k even (example k = 8)" and "k odd (example k = 7)". Each panel shows the points $u_1, \dots, u_k$ with the lines $L_j$ and the line $l$ at angle $\epsilon$, before and after the braid $\beta$.]

Fig. 4. Effect of the braid which brings $S$ to the canonical form on the lines $L_j$

Lemma 8. Let the points $u_j$ ($j = 1, \dots, k$) be in lexicographical order w.r.t. the admissible line $l$. Then the braid
\[ \beta := (\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12})\, (\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23})\, (\beta_{k-7,k-6} \cdots \beta_{34}) \cdots \beta_{\frac{k}{2}-2,\frac{k}{2}-1}\; (\beta_{k-3,k-2}\, \beta_{k-4,k-3} \cdots \beta_{12}) \]
for $k$ even, or
\[ \beta := (\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12})\, (\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23})\, (\beta_{k-7,k-6} \cdots \beta_{34}) \cdots (\beta_{\frac{k-3}{2},\frac{k-1}{2}}\, \beta_{\frac{k-5}{2},\frac{k-3}{2}})\; (\beta_{k-3,k-2}\, \beta_{k-4,k-3} \cdots \beta_{12}) \]
for $k$ odd, brings the points in cyclic counterclockwise order, $u_1$ being the first point in $\Pi_L$ (Fig. 4, right side, or Fig. 5). Note that we have collected the braids in $\frac{k}{2} - 1$ ($k$ even), or $\frac{k-3}{2}$ ($k$ odd) sequences $(\dots)$.

Proof. Let $k$ be even. The first braid $\beta_{k-5,k-4}$ interchanges $u_{k-4}$ and $u_{k-5}$. The second braid interchanges $u_{k-5}$ and $u_{k-6}$. One easily sees that the effect of the first sequence of braids $(\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12})$ is to bring $u_1$ in the (old) position of $u_{k-4}$, $u_{k-4}$ in the position of $u_{k-5}$, $u_{k-5}$ in the position of $u_{k-6}$, ..., $u_4$ in the position of $u_3$ and $u_2$ in the position of $u_1$ (Fig. 5). $u_k$, $u_{k-1}$, $u_{k-2}$, $u_{k-3}$ are not moved. The second sequence of braids $(\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23})$ acts in a similar way, bringing $u_2$ in $u_{k-5}$, $u_{k-5}$ in $u_{k-6}$, ..., $u_3$ in $u_2$. $u_k$, $u_{k-1}$, $u_{k-2}$, $u_{k-3}$, $u_{k-4}$ are not moved. We go on in this way. After the action of
\[ (\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12})\, (\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23})\, (\beta_{k-7,k-6} \cdots \beta_{34}) \cdots \beta_{\frac{k}{2}-2,\frac{k}{2}-1} \]
the points are as in Fig. 5: $u_k$ is on the positive real axis, $u_{k-2}$ is the first point met in counterclockwise order, $u_1$ is the second, $u_2$ is the third; the points are in cyclic order up to $u_{k-3}$; finally, $u_{k-1}$ is the last point before reaching again the positive real axis from below. Then, $(\beta_{k-3,k-2}\, \beta_{k-4,k-3} \cdots \beta_{12})$ brings $u_1$ in $u_{k-2}$, $u_{k-2}$ in $u_{k-3}$, $u_{k-3}$ in $u_{k-4}$, and so on. The cyclic order is reached. For $k$ odd the proof is similar. □

[Figure 5: four panels showing the positions of the points $u_1, \dots, u_k$ after the first sequence of braids, after the second sequence, in the situation before the last sequence, and after the last sequence of braids.]

Fig. 5. Effect of the sequences of braids which brings $S$ to the canonical form (the figure refers to $k$ even)

A careful consideration of the effect of the braid $\beta$ on the lines $L_j$ (which we leave as an exercise for the reader) shows that they are alternately inverted as in Fig. 4. To reconstruct uniquely this configuration we just need to know the oriented line $l$, namely, its angle $\epsilon$ w.r.t. the positive real axis. The points $u_{k-1}$, $u_k$ and the lines $L_{k-1}$ and $L_k$ are unchanged (angle $\frac{\pi}{2} - \epsilon$). The line at $u_1$ starts in the opposite direction, it goes around $u_2, \dots, u_{k-2}$ without intersecting other cuts, and then goes to $\infty$ with the original asymptotic direction $\frac{\pi}{2} - \epsilon$. Moving in the direction opposite to that of $l$ we meet $u_{k-2}$. Its line has the original direction $\frac{\pi}{2} - \epsilon$. Then we meet $u_2$, and the corresponding line starts with opposite direction, goes around $u_3, \dots, u_{k-3}$ and then goes to $\infty$ with asymptotic direction $\frac{\pi}{2} - \epsilon$. And so on. Now we find the matrix representation for the braid $\beta$.

Proposition 1. The braid $\beta$ of Lemma 8 has the following matrix representation $A^\beta(S)$, given separately for $k$ even and for $k$ odd.

[Matrix displays of $A^\beta(S)$ for $k$ even and for $k$ odd: $k \times k$ arrays whose entries are $0$, $1$, the binomial coefficients $\binom{k}{1}, \binom{k}{2}, \binom{k}{3}, \dots, \binom{k}{k-3}$, and entries marked $*$.]

The "$*$" means $\binom{k}{j}$, and $j$ increases by one when we move downwards row by row.

Proof. We need some steps.

1) For an upper triangular Stokes' matrix $S$ (with entries $s_{ij}$) the entries of the matrix $A^\beta$ which are different from zero are the following:

$k$ even.

\[ A^\beta_{k-1,k-1} = A^\beta_{k,k} = 1. \]
For $j$ even, $2 + 2i \le j \le k - 2$, and $i = 1, 2, \dots, \frac{k}{2} - 1$,
\[ A^\beta_{\frac{k}{2}-i,\,2i} = 1, \qquad A^\beta_{\frac{k}{2}-i,\,j} = -s_{2i,j} + \sum_{n=1}^{\frac{j}{2}-i-1} (-1)^{n+1} F_{n,i,j}, \]
\[ F_{n,i,j} = \sum_{\alpha_1=1}^{\frac{j}{2}-i-n}\ \sum_{\alpha_2=1}^{\frac{j}{2}-i-\alpha_1-n+1}\ \sum_{\alpha_3=1}^{\frac{j}{2}-i-\alpha_1-\alpha_2-n+2} \cdots \sum_{\alpha_p=1}^{\frac{j}{2}-i-\sum_{r=1}^{p-1}\alpha_r-n+p-1} \cdots \sum_{\alpha_n=1}^{\frac{j}{2}-i-\sum_{r=1}^{n-1}\alpha_r-1} f(i,j,\alpha), \]
\[ f(i,j,\alpha) = s_{2i,\,2i+2\alpha_1}\; s_{2i+2\alpha_1,\,2i+2\alpha_1+2\alpha_2}\; s_{2i+2\alpha_1+2\alpha_2,\,2i+2\alpha_1+2\alpha_2+2\alpha_3} \cdots s_{2i+2\alpha_1+\dots+2\alpha_n,\,j}. \]
For $j$ even, $2i + 2 \le j \le k - 2$ and $i = 0, 1, \dots, \frac{k}{2} - 2$,
\[ A^\beta_{\frac{k}{2}+i,\,2i+1} = 1, \qquad A^\beta_{\frac{k}{2}+i,\,j} = -s_{2i+1,j} + \sum_{n=1}^{\frac{j}{2}-i-1} (-1)^{n+1} G_{n,i,j}, \]
\[ G_{n,i,j} = \sum_{\alpha_1=1}^{\frac{j}{2}-i-n}\ \sum_{\alpha_2=1}^{\frac{j}{2}-i-\alpha_1-n+1}\ \sum_{\alpha_3=1}^{\frac{j}{2}-i-\alpha_1-\alpha_2-n+2} \cdots \sum_{\alpha_p=1}^{\frac{j}{2}-i-\sum_{r=1}^{p-1}\alpha_r-n+p-1} \cdots \sum_{\alpha_n=1}^{\frac{j}{2}-i-\sum_{r=1}^{n-1}\alpha_r-1} g(i,j,\alpha), \]
\[ g(i,j,\alpha) = s_{2i+1,\,2i+2\alpha_1}\; s_{2i+2\alpha_1,\,2i+2\alpha_1+2\alpha_2}\; s_{2i+2\alpha_1+2\alpha_2,\,2i+2\alpha_1+2\alpha_2+2\alpha_3} \cdots s_{2i+2\alpha_1+\dots+2\alpha_n,\,j}. \]

$k$ odd.
\[ A^\beta_{k-1,k-1} = A^\beta_{k,k} = 1. \]
For $j$ odd, $2i + 1 \le j \le k - 2$, $i = 1, \dots, \frac{k-3}{2}$:
\[ A^\beta_{\frac{k-1}{2}+i,\,2i} = 1, \qquad A^\beta_{\frac{k-1}{2}+i,\,j} = -s_{2i,j} + \sum_{n=1}^{\frac{j-1}{2}-i} (-1)^{n+1} H(n,i,j), \]
\[ H(n,i,j) = \sum_{\alpha_1=0}^{\frac{j-1}{2}-i-n}\ \sum_{\alpha_2=1}^{\frac{j-1}{2}-i-\alpha_1-n+1} \cdots \sum_{\alpha_p=1}^{\frac{j-1}{2}-i-\sum_{r=1}^{p-1}\alpha_r-n+p-1} \cdots \sum_{\alpha_n=1}^{\frac{j-1}{2}-i-\sum_{r=1}^{n-1}\alpha_r-1} h(i,j,\alpha), \]
\[ h(i,j,\alpha) = s_{2i,\,2i+1+2\alpha_1}\; s_{2i+1+2\alpha_1,\,2i+1+2\alpha_1+2\alpha_2} \cdots s_{2i+1+2\alpha_1+\dots+2\alpha_n,\,j}. \]
For $j$ odd, $2i + 3 \le j \le k - 2$, $i = 0, 1, \dots, \frac{k-3}{2}$:
\[ A^\beta_{\frac{k-1}{2}-i,\,2i+1} = 1, \qquad A^\beta_{\frac{k-1}{2}-i,\,j} = -s_{2i+1,j} + \sum_{n=0}^{\frac{j-1}{2}-i} (-1)^{n+1} V(n,i,j), \]
\[ V(n,i,j) = \sum_{\alpha_1=1}^{\frac{j-1}{2}-i-n}\ \sum_{\alpha_2=1}^{\frac{j-1}{2}-i-\alpha_1-n+1} \cdots \sum_{\alpha_p=1}^{\frac{j-1}{2}-i-\sum_{r=1}^{p-1}\alpha_r-n+p-1} \cdots \sum_{\alpha_n=1}^{\frac{j-1}{2}-i-\sum_{r=1}^{n-1}\alpha_r-1} v(i,j,\alpha), \]
\[ v(i,j,\alpha) = s_{2i+1,\,2i+1+2\alpha_1}\; s_{2i+1+2\alpha_1,\,2i+1+2\alpha_1+2\alpha_2} \cdots s_{2i+1+2\alpha_1+\dots+2\alpha_n,\,j}. \]

In order to prove the above expressions, we have to find each matrix $A^{\beta_{i,i+1}}$ corresponding to the elementary braid $\beta_{i,i+1}$ appearing in $\beta$. This means computing its entry $(i+1, i+1)$. This is a rather complicated problem, since the entry $(i+1, i+1)$ of a given $A^{\beta_{i,i+1}}$ is minus the entry $(i, i+1)$ of the Stokes' matrix resulting from the action of the elementary braids acting before $\beta_{i,i+1}$, which in general is a sum of products of the elements $s_{ij}$ of the initial Stokes' matrix $S$. First we recall that $S \mapsto A^{\beta_{i,i+1}}\, S\, A^{\beta_{i,i+1}}$ has the following effect on the entries of $S$:
\[ s_{n,i} \mapsto s_{n,i+1}, \qquad s_{n,i+1} \mapsto s_{n,i} - s_{i,i+1}\, s_{n,i+1}, \qquad n = 1, 2, \dots, i-1, \]
\[ s_{i,i+1} \mapsto -s_{i,i+1}, \]
\[ s_{i,n} \mapsto s_{i+1,n}, \qquad s_{i+1,n} \mapsto s_{i,n} - s_{i,i+1}\, s_{i+1,n}, \qquad n = i+2, \dots, k, \]
while all the other entries of $S$ remain unchanged. We start from $A^{\beta_{k-5,k-4}}$, whose non-trivial entry is simply $-s_{k-5,k-4}$. Its action on $S$ brings $s_{k-6,k-5}$ to $s_{k-6,k-4}$ (the reader may compute all the elements of $S^{\beta_{k-5,k-4}}$). Then, the entry of $A^{\beta_{k-6,k-5}}$ is $-s_{k-6,k-4}$. Proceeding in this way, the reader may check that for the first sequence of braids $(\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12})$ the entries $(i, i+1)$ of the matrices are: $-s_{i,k-4}$ for $A^{\beta_{i,i+1}}$.

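Now, observe that
\[ \begin{pmatrix} 0 & 1 & 0 \\ 1 & x_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & x_1 \\ 0 & 1 & x_2 \end{pmatrix}, \]
and recall that $A^{\beta_1 \beta_2} = A^{\beta_2} A^{\beta_1}$. This implies for the first sequence of braids:
\[ m_1 := A^{\beta_{k-5,k-4}\, \beta_{k-6,k-5} \cdots \beta_{12}} = \begin{pmatrix}
0 &   &        &   & 1      &   &   &   &   \\
1 & 0 &        &   & \ast   &   &   &   &   \\
  & 1 & \ddots &   & \vdots &   &   &   &   \\
  &   & \ddots & 0 & \ast   &   &   &   &   \\
  &   &        & 1 & \ast   &   &   &   &   \\
  &   &        &   &        & 1 &   &   &   \\
  &   &        &   &        &   & 1 &   &   \\
  &   &        &   &        &   &   & 1 &   \\
  &   &        &   &        &   &   &   & 1
\end{pmatrix}; \]
the entries $\ast$ are exactly those of the $A_{i,i+1}$'s, namely $-s_{1,k-4}, -s_{2,k-4}, \dots, -s_{k-5,k-4}$ from the top to the bottom of the $(k-4)$th column. For the second sequence of braids $(\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23})$, the entries are $-s_{i-1,k-6}$ for $A^{\beta_{i,i+1}}$, and, as above:
\[ m_2 := A^{\beta_{k-6,k-5}\, \beta_{k-7,k-6} \cdots \beta_{23}} = \begin{pmatrix}
1 &   &   &        &   &        &   &        &   \\
  & 0 &   &        &   & 1      &   &        &   \\
  & 1 & 0 &        &   & \ast   &   &        &   \\
  &   & 1 & \ddots &   & \vdots &   &        &   \\
  &   &   & \ddots & 0 & \ast   &   &        &   \\
  &   &   &        & 1 & \ast   &   &        &   \\
  &   &   &        &   &        & 1 &        &   \\
  &   &   &        &   &        &   & \ddots &   \\
  &   &   &        &   &        &   &        & 1
\end{pmatrix}, \]
where the lower right block is the $5 \times 5$ identity and the entries $\ast$ are those of the $A_{i,i+1}$'s (namely, $-s_{1,k-6}, \dots, -s_{k-7,k-6}$ from the top to the bottom). For the third sequence $(\beta_{k-7,k-6} \cdots \beta_{34})$, they are $-s_{i-2,k-8}$ for $A^{\beta_{i,i+1}}$, and:
\[ m_3 := A^{\beta_{k-7,k-6} \cdots \beta_{34}} = \begin{pmatrix}
1 &   &   &   &        &   &        &   &        &   \\
  & 1 &   &   &        &   &        &   &        &   \\
  &   & 0 &   &        &   & 1      &   &        &   \\
  &   & 1 & 0 &        &   & \ast   &   &        &   \\
  &   &   & 1 & \ddots &   & \vdots &   &        &   \\
  &   &   &   & \ddots & 0 & \ast   &   &        &   \\
  &   &   &   &        & 1 & \ast   &   &        &   \\
  &   &   &   &        &   &        & 1 &        &   \\
  &   &   &   &        &   &        &   & \ddots &   \\
  &   &   &   &        &   &        &   &        & 1
\end{pmatrix}, \]
where the lower right block is the $6 \times 6$ identity.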

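And so on. We reach the last but one "sequence", namely $\beta_{\frac{k}{2}-2,\frac{k}{2}-1}$ for $k$ even, or $(\beta_{\frac{k-3}{2},\frac{k-1}{2}}\, \beta_{\frac{k-5}{2},\frac{k-3}{2}})$ for $k$ odd. The entries are $-s_{12}$, or $-s_{23}$, $-s_{13}$ respectively. Then
\[ m_{\frac{k}{2}-2} := A^{\beta_{\frac{k}{2}-2,\frac{k}{2}-1}} = \begin{pmatrix}
1 &        &   &   &         &   &        &   \\
  & \ddots &   &   &         &   &        &   \\
  &        & 1 &   &         &   &        &   \\
  &        &   & 0 & 1       &   &        &   \\
  &        &   & 1 & -s_{12} &   &        &   \\
  &        &   &   &         & 1 &        &   \\
  &        &   &   &         &   & \ddots &   \\
  &        &   &   &         &   &        & 1
\end{pmatrix} \]
or
\[ m_{\frac{k-5}{2}} := A^{\beta_{\frac{k-3}{2},\frac{k-1}{2}}\, \beta_{\frac{k-5}{2},\frac{k-3}{2}}} = \begin{pmatrix}
1 &        &   &   &   &         &   &        &   \\
  & \ddots &   &   &   &         &   &        &   \\
  &        & 1 &   &   &         &   &        &   \\
  &        &   & 0 & 0 & 1       &   &        &   \\
  &        &   & 1 & 0 & -s_{13} &   &        &   \\
  &        &   &   & 1 & -s_{23} &   &        &   \\
  &        &   &   &   &         & 1 &        &   \\
  &        &   &   &   &         &   & \ddots &   \\
  &        &   &   &   &         &   &        & 1
\end{pmatrix}. \]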

The entries for the last sequence of braids are more complicated, because the entries on the first upper sub-diagonal of the Stokes' matrix have been shuffled by the preceding braids. We give the result ($A_{i,i+1}$ stands for $A^{\beta_{i,i+1}}$). For $k$ even:
\begin{align*}
A_{k-3,k-2} &: \ -s_{k-3,k-2}, \\
A_{k-4,k-3} &: \ -s_{k-5,k-2} + s_{k-5,k-4}\, s_{k-4,k-2}, \\
A_{k-5,k-4} &: \ -s_{k-7,k-2} + (s_{k-7,k-4}\, s_{k-4,k-2} + s_{k-7,k-6}\, s_{k-6,k-2}) - s_{k-7,k-6}\, s_{k-6,k-4}\, s_{k-4,k-2}, \\
&\ \ \vdots \\
A_{\frac{k}{2}-1,\frac{k}{2}} &: \ -s_{1,k-2} + (s_{12}\, s_{2,k-2} + s_{14}\, s_{4,k-2} + \dots + s_{1,k-4}\, s_{k-4,k-2}) + \dots + (-1)^{\frac{k}{2}-1} s_{12}\, s_{24}\, s_{46} \cdots s_{k-4,k-2}, \\
A_{\frac{k}{2}-2,\frac{k}{2}-1} &: \ -s_{2,k-2} + (s_{24}\, s_{4,k-2} + s_{26}\, s_{6,k-2} + \dots + s_{2,k-4}\, s_{k-4,k-2}) + \dots + (-1)^{\frac{k}{2}-2} s_{24}\, s_{46} \cdots s_{k-4,k-2}, \\
&\ \ \vdots \\
A_{34} &: \ -s_{k-8,k-2} + (s_{k-8,k-4}\, s_{k-4,k-2} + s_{k-8,k-6}\, s_{k-6,k-2}) - s_{k-8,k-6}\, s_{k-6,k-4}\, s_{k-4,k-2}, \\
A_{23} &: \ -s_{k-6,k-2} + s_{k-6,k-4}\, s_{k-4,k-2}, \\
A_{12} &: \ -s_{k-4,k-2}.
\end{align*}
For $k$ odd:
\begin{align*}
A_{k-3,k-2} &: \ -s_{k-3,k-2}, \\
A_{k-4,k-3} &: \ -s_{k-5,k-2} + s_{k-5,k-4}\, s_{k-4,k-2}, \\
A_{k-5,k-4} &: \ -s_{k-7,k-2} + (s_{k-7,k-4}\, s_{k-4,k-2} + s_{k-7,k-6}\, s_{k-6,k-2}) - s_{k-7,k-6}\, s_{k-6,k-4}\, s_{k-4,k-2}, \\
&\ \ \vdots \\
A_{\frac{k-1}{2},\frac{k+1}{2}} &: \ -s_{2,k-2} + (s_{23}\, s_{3,k-2} + s_{25}\, s_{5,k-2} + \dots + s_{2,k-4}\, s_{k-4,k-2}) + \dots + (-1)^{\frac{k-3}{2}} s_{23}\, s_{35}\, s_{57} \cdots s_{k-4,k-2}, \\
A_{\frac{k-3}{2},\frac{k-1}{2}} &: \ -s_{1,k-2} + (s_{13}\, s_{3,k-2} + s_{15}\, s_{5,k-2} + \dots + s_{1,k-4}\, s_{k-4,k-2}) + \dots + (-1)^{\frac{k-3}{2}} s_{13}\, s_{35}\, s_{57} \cdots s_{k-4,k-2}, \\
&\ \ \vdots \\
A_{34} &: \ -s_{k-8,k-2} + (s_{k-8,k-4}\, s_{k-4,k-2} + s_{k-8,k-6}\, s_{k-6,k-2}) - s_{k-8,k-6}\, s_{k-6,k-4}\, s_{k-4,k-2}, \\
A_{23} &: \ -s_{k-6,k-2} + s_{k-6,k-4}\, s_{k-4,k-2}, \\
A_{12} &: \ -s_{k-4,k-2},
\end{align*}
and:
\[ A^{\beta_{k-3,k-2}\, \beta_{k-4,k-3} \cdots \beta_{12}} = \begin{pmatrix}
0 &   &        &   & 1      &   &   \\
1 & 0 &        &   & \ast   &   &   \\
  & 1 & \ddots &   & \vdots &   &   \\
  &   & \ddots & 0 & \ast   &   &   \\
  &   &        & 1 & \ast   &   &   \\
  &   &        &   &        & 1 & 0 \\
  &   &        &   &        & 0 & 1
\end{pmatrix}, \]
which we call $m_{\frac{k}{2}-1}$ for $k$ even, $m_{\frac{k-3}{2}}$ for $k$ odd. Then, for $k$ even,
\[ A^\beta = m_{\frac{k}{2}-1}\, m_{\frac{k}{2}-2} \cdots m_3\, m_2\, m_1, \]
and for $k$ odd,
\[ A^\beta = m_{\frac{k-3}{2}}\, m_{\frac{k-5}{2}} \cdots m_3\, m_2\, m_1. \]

Doing a careful computation, we obtain exactly the expression for $A^\beta$ proposed at the beginning of the proof.

2) The second step consists of expressing $A^\beta$ in terms of the entries of the Stokes' factors of $S$, which are simply binomial coefficients. First we prove the following

Lemma 9. Given an upper triangular $k \times k$ matrix $S$, with entries $s_{ii} = 1$, we can uniquely determine numbers $a_{ij}$ such that, for $k$, $i$, $j$ all even or all odd:

\begin{align*}
s_{ij} = a_{ij} &+ (a_{i,i+2}a_{i+2,j} + a_{i,i+4}a_{i+4,j} + \ldots + a_{i,j-2}a_{j-2,j})\\
&+ (a_{i,i+2}a_{i+2,i+4}a_{i+4,j} + a_{i,i+2}a_{i+2,i+6}a_{i+6,j} + \ldots + a_{i,j-4}a_{j-4,j-2}a_{j-2,j})\\
&+ (a_{i,i+2}a_{i+2,i+4}a_{i+4,i+6}a_{i+6,j} + \ldots + a_{i,j-6}a_{j-6,j-4}a_{j-4,j-2}a_{j-2,j}) + \ldots\\
&\ldots + a_{i,i+2}a_{i+2,i+4}a_{i+4,i+6}\cdots a_{j-2,j}.
\end{align*}

If $k$ is even, but $i$ is odd, just replace in the formula $i+2$ with $i+1$, $i+4$ with $i+3$, etc. If $k$ is even, but $j$ is odd, just replace $j-2$ with $j-1$, $j-4$ with $j-3$, etc. If $k$ is odd, but $i$ is even, or $j$ is even, just do the same replacements as above.

Proof. We have to solve a non-linear system $F_{ij}(a) = s_{ij}$. The sum of the differences between the indices of the factors $a$ in $s_{ij}$ is equal to the difference of the indices of $s_{ij}$, namely $j - i$. From this it follows that the terms non-linear in the $a$'s occur on sub-diagonals which lie above all the sub-diagonals containing the factors of the non-linear terms. Then, the system $F_{ij}(a) = s_{ij}$ is uniquely solvable, starting from the first sub-diagonal and successively determining all the $a_{rs}$ going up diagonal by diagonal. □

Corollary 3. With the above factorization, the matrix $A^\beta$ becomes:

. . . . . .  . .  . 0  0 1 β A (S) =  1 −a12  0 0  . .  . . . . . .

. . . . . . . 0 0 1 0 −a24 0 −a14 1 −a34 0 0 . . . . . .

. . . 0 0 0 0 0 1 0 . .

. . 0 1 −a46 −a26 −a16 −a36 −a56 0 . .

. . 0 0 0 0 0 0 0 1 0 .

. 0 1 −a68 −a48 −a28 −a18 −a38 −a58 −a78 0 .

 ... . . . . . .  . . .  . . .  . . .  . . .  . . .  . . .  . . . . . . ...

(19)

370

D. Guzzetti

for k even; and 

. . .  .  0  1 Aβ (S) =  0  .  .  . .

. . . . . . . 0 0 1 0 −a13 1 −a23 0 0 . . . . . . . . .

. . . 0 0 0 0 1 0 . . .

. . 0 1 −a35 −a15 −a25 −a45 0 . . .

. . 0 0 0 0 0 0 1 0 . .

. 0 1 −a57 −a37 −a17 −a27 −a47 −a67 0 . .

 ... . . . . . .  . . .  . . .  . . .  . . .  . . .  . . .  . . . . . .

(20)

...

for k odd. In other words, all the non-linear terms in the formulae of Aβ drop. Proof. Just substitute the factorization of S in the formulae for Aβ given at the beginning of the proof. u t In Sect. 5 we computed the Stokes’ factors for S. If we sum all the factors appearing in formula (15) we get a matrix of the form: 

M := K1k + K1,k−1 + Kk,k−1 + · · · + Kk3 + Kk2

k ∗ k ∗ ∗  . .  =  .. ..   ∗ ∗ ∗ ∗ ∗0

0 k ∗ .. . . · · .. . . · . · . . . ... ∗·· 0 k

 0 ∗ ∗  ..   . . ..   . ∗ k

The $*$ are the binomial coefficients appearing in the factors. If we know $M$, we can determine all the entries of the single Stokes' factors, because if the entry $(i,j)$ is not zero for one factor, then it is zero for all the other factors. Now, we rename the entries of the factors in such a way that

\[
P M P^{-1} =
\begin{pmatrix}
k & a_{12} & a_{13} & a_{14} & \cdots\\
 & k & a_{23} & a_{24} & \cdots\\
 & & k & a_{34} & \cdots\\
 & & & \ddots & \vdots\\
 & & & & k
\end{pmatrix},
\]

where the matrix P of permutation is        P = . . .   0 1

0 .. .. .. . . . 1 0 0

for k even (the 1 on the first row is on the        P = . . . 0  0 1

0 .. .. .. . . . 1 0 0

0 1 0 .. .

k 2

0 0 1 .. .

1 0 0 0 .. .

1 0 0 0 0 .. .

0 0 1 0 0 .. .

 0 0 1 .. .

0 .. .. .. . . . 0 0 0

       ..   .   1 0

+ 1th column) ; and  01  001   1000  0001   00000  .. .. .. .. .. .. .. ..   . . . . . . . . 0   0 1 00 th

for $k$ odd (the 1 on the first row is on the $\frac{k+1}{2}$-th column). With this choice of the labelling, the product $S_{\rm upper} := P\, K_{1k} K_{1,k-1} K_{k,k-1} \cdots K_{k3} K_{k2}\, P^{-1}$ is precisely factorized as in Lemma 9. Then we can write the entries of $A^\beta$ (formulae (19), (20)) from the entries of the Stokes factors (which are binomial coefficients). The final result is precisely the claim of the proposition. □

We are ready to prove the main result:

Theorem 2. Consider the Stokes matrix $S = T_F^{\frac{k}{2}}\, T^{-\frac{k}{2}}$ ($k$ even) or $S = T_F^{\frac{k-1}{2}} K_{k2}\, T^{-\frac{k-1}{2}}$ ($k$ odd) of $QH^*(CP^{k-1})$ and set it in the upper triangular form $S_{\rm upper} = P S P^{-1}$ by the permutation $P$. Then, there exists a braid $\beta$ (Lemma 8), represented by a matrix $A^\beta$ (Proposition), which sets $S_{\rm upper}$ in the canonical form (after conjugation by ${\rm diag}(1, 1, \ldots, -1)$):

\[ s_{ij} = \binom{k}{j-i}, \qquad i < j. \]

Another conjugation by ${\rm diag}(-1, 1, -1, 1, -1, \ldots)$ brings the matrix in the equivalent canonical form

\[ s_{ij} = (-1)^{j-i}\binom{k}{j-i}, \qquad i < j. \]

Finally, by the action of the braid group, the last matrix can be put in the form

\[ s_{ij} = \binom{k-1+j-i}{j-i}, \qquad i < j. \]

In all the above matrices $s_{ii} = 1$, $s_{ij} = 0$ for $i > j$.
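As a quick sanity check (not part of the original argument), the relation between the last two forms can be tested numerically. The sketch below assumes only the entries stated in the theorem; the helper names `toeplitz_upper`, `anti_transpose` and `forms` are illustrative. It checks that the matrix with entries $\binom{k-1+j-i}{j-i}$ is the inverse of the matrix with entries $(-1)^{j-i}\binom{k}{j-i}$, and that the former is unchanged by transposition followed by conjugation with the order-reversing permutation.

```python
from math import comb


def toeplitz_upper(k, entry):
    """k x k upper unitriangular matrix with s_ij = entry(j - i) for i < j."""
    return [[1 if i == j else entry(j - i) if i < j else 0
             for j in range(k)] for i in range(k)]


def matmul(a, b):
    """Plain integer matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def identity(k):
    return [[int(i == j) for j in range(k)] for i in range(k)]


def anti_transpose(s):
    """P S^T P for the permutation P with ones on the anti-diagonal."""
    k = len(s)
    return [[s[k - 1 - j][k - 1 - i] for j in range(k)] for i in range(k)]


def forms(k):
    """Return the second and third canonical forms of the theorem."""
    signed = toeplitz_upper(k, lambda d: (-1) ** d * comb(k, d))
    last = toeplitz_upper(k, lambda d: comb(k - 1 + d, d))
    return signed, last


if __name__ == "__main__":
    for k in range(2, 9):
        signed, last = forms(k)
        print(k, matmul(signed, last) == identity(k), anti_transpose(last) == last)
```

Both checks rest on the fact that upper triangular Toeplitz matrices multiply like truncated power series, here $(1-x)^{k}$ and $(1-x)^{-k}$.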

Proof. First, we want to explain which is the braid which brings the upper triangular matrix with entries $s_{ij} = (-1)^{j-i}\binom{k}{j-i}$ into the matrix $s_{ij} = \binom{k-1+j-i}{j-i}$. We make use of the following known result [28]: Consider the upper triangular Stokes' matrix $S$, the braid $\beta = \beta_{12}\,(\beta_{23}\beta_{12})\,(\beta_{34}\beta_{23}\beta_{12}) \cdots (\beta_{n-1,n}\beta_{n-2,n-1}\cdots\beta_{23}\beta_{12})$ and the permutation $\hat P$ of entries $\hat P_{i,k-i+1} = 1$ for $i = 1, \ldots, k$, and zero otherwise. Then, the relation

\[ S^{-1} = \left[\hat P\, S^T \hat P\right]^{\beta} \tag{21} \]

holds. Observe that for the matrix $S$, whose upper triangular part has entries

\[ s_{ij} = \binom{k-1+j-i}{j-i}, \]

we have $\hat P S^T \hat P \equiv S$. Moreover, $S^{-1}$ is upper triangular with entries

\[ s_{ij} = (-1)^{j-i}\binom{k}{j-i}. \]

This proves that $S$ and $S^{-1}$ are equivalent w.r.t. the action of the braid group.

Let us now prove the theorem starting from $k$ even. We have to prove that

\[ A^\beta\, P\, T_F^{\frac{k}{2}}\, T^{-\frac{k}{2}}\, P^{-1}\, [A^\beta]^T \]

is in "canonical form". The proof "reduces" to the computation of products of matrices explicitly given. We do the products in a shrewd way. First we rewrite

\[ S^\beta = A^\beta \left(P\, T_F\, P^{-1}\right)^{\frac{k}{2}} \left(P\, T^{-1} P^{-1}\right)^{\frac{k}{2}} [A^\beta]^T \]

and we quite easily compute $(P\, T_F\, P^{-1})^{\frac{k}{2}}$ (here $P$ is the permutation introduced at the

2 = ±1, with + for i end of the proof of Proposition 1). The result is (P TF P −1 )i,k−i+1 even and − for i odd. The other entries are zero. Then, using the expression of Aβ from the proposition: 0 0  1 k   0 0 1   1     k k   0   2 1     k k   0 0 1   3 2     . . .     . . .   . . .   . . . 1     k   ∗ ∗ ∗ ... 1 0 1     k   β −1 k k 2   F := A P TF P = ∗ ∗ ∗ ... 2 0 −1  1     k   ∗ ∗ ∗ . . . 3 −1 0         . . .   . . .   . . .     k k   0   k−4 k−5     k   −1 0   k−3      0 −1  0   −1 0

(the $-1$ on the last column is on the $\frac{k}{2}$-th row). Using the explicit expressions for $K_{k3}$ and $K_{k2}$ of Sect. 5 we compute $T^{-1} = T_F^{-1} K_{k3} K_{k2}$ and then

0

 −1   0   0       −1 −1 P T P =              

0 0 0 −1 0

−1 k − k−1 0 0 −1 k k − 0 − 0 k−2 k−3 0 0 0 0 −1 k k −1 − 0− k−4 k−5

               .  ..   .   0 0 0 −1 0   k k −1 − 0 − 0   4 3    0 0 0 1 k k  0 −1 − 2 1

After this we computed $F\,(P T^{-1} P^{-1})$, $F\,(P T^{-1} P^{-1})^2$, $F\,(P T^{-1} P^{-1})^3$, $\ldots$, $F\,(P T^{-1} P^{-1})^{\frac{k}{2}}$. We omit the intermediate computations and we give the final result: $F_1 := F\,(P T^{-1} P^{-1})^{\frac{k}{2}}$

                       k  1   k − 2  k  · 1 0 k−1    1                      

... k ... k − 6 k ... k − 5 k ... k − 4 k ... k − 3 k ... k − 2 k ... k−1 1 ...

k 4 k 5 .. .

k k − 4 k 0 k − 3 k 0 k − 2 k 0 k−1 1

1

1 0 0 0 0 0

1 0 0 .. .

...

∗

0

...

∗

0

...

∗

0

...

∗

0

...

∗

0

...

∗

0

...

∗ . . . . . . . .. k k−1

0 .. . 0 1

k 2 k 3 k 4 .. .

k 1 k − 2 k − 3 .. . −



               ∗ ∗     ∗ ∗     ∗ ∗    . ∗ ∗     ∗ ∗     ∗ ∗    ∗ ∗   .. ..  . .    k k  − k − 2 k − 4    k k  − k−1 k − 2    k −  k−1 1

Now, multiplying $F_1 [A^\beta]^T$, we obtain precisely the "canonical form".

For $k$ odd, we did a similar computation. We omit the details, but we indicate the order of multiplications which yielded the simplest expressions to multiply step by step. Our aim is to compute

\[ A^\beta\, P\, T_F^{\frac{k-1}{2}} K_{k2} \left(T_F^{-1} K_{k3} K_{k2}\right)^{\frac{k-1}{2}} P^{-1}\, [A^\beta]^T. \]

First, we computed $P\, T_F\, P^{-1}$, then $(P\, T_F\, P^{-1})^{\frac{k-1}{2}}$, then $F := A^\beta (P\, T_F\, P^{-1})^{\frac{k-1}{2}}$. After this, we computed $P K_{k2} P^{-1}$, and $F_1 := F\, P K_{k2} P^{-1}$. Finally, we calculated $m := P\, T_F^{-1} K_{k3} K_{k2}\, P^{-1}$ and $F_1 m$, $F_1 m\, m$, $\ldots$, $F_2 := F_1 m^{\frac{k-1}{2}}$. The matrix $F_2 [A^\beta]^T$ proved to be in "canonical form". □

7. Canonical Form of $S^{-1}$

The matrix $S^{-1}$ such that $Y_R(z) = Y_L(z)\, S^{-1}$ can be put in the same canonical form of $S$, as a consequence of the relation (21). The only remarks we want to add concern the braid which brings $S^{-1}$ to the canonical form, because it arranges the lines $L_j$ in a "beautiful" shape.

Lemma 8′. Let the points $u_j$ be in lexicographical order w.r.t. the admissible line $l$. Let us denote $\sigma_{i,i+1} := \beta_{i,i+1}^{-1}$. Then, the following braid arranges the points in cyclic clockwise order, $u_1$ being the first point in $\Pi_L$ for $k$ even, or the last in $\Pi_R$ for $k$ odd (w.r.t. the clockwise order) (see Fig. 6):

\begin{align*}
\beta' := {} & \Big[ (\sigma_{34}\sigma_{56}\sigma_{78}\cdots\sigma_{k-3,k-2}\sigma_{k-1,k})\,(\sigma_{45}\sigma_{67}\sigma_{89}\cdots\sigma_{k-4,k-3}\sigma_{k-2,k-1}) \cdots\\
& \quad \cdots (\sigma_{\frac{k}{2},\frac{k}{2}+1}\sigma_{\frac{k}{2}+2,\frac{k}{2}+3})\,\sigma_{\frac{k}{2}+1,\frac{k}{2}+2} \Big]\\
& \Big[ (\sigma_{\frac{k}{2}+2,\frac{k}{2}+3}\sigma_{\frac{k}{2}+3,\frac{k}{2}+4}\cdots\sigma_{k-2,k-1}\sigma_{k-1,k})\,(\sigma_{\frac{k}{2}+2,\frac{k}{2}+3}\sigma_{\frac{k}{2}+3,\frac{k}{2}+4}\cdots\sigma_{k-2,k-1}) \cdots\\
& \quad \cdots (\sigma_{\frac{k}{2}+2,\frac{k}{2}+3}\sigma_{\frac{k}{2}+3,\frac{k}{2}+4})\,\sigma_{\frac{k}{2}+2,\frac{k}{2}+3} \Big]
\end{align*}

for $k$ even, and

\begin{align*}
\beta' := {} & \beta_{12}\,\Big[ (\sigma_{45}\sigma_{67}\sigma_{89}\cdots\sigma_{k-3,k-2}\sigma_{k-1,k})\,(\sigma_{56}\sigma_{78}\cdots\sigma_{k-4,k-3}\sigma_{k-2,k-1}) \cdots\\
& \quad \cdots (\sigma_{\frac{k+1}{2},\frac{k+3}{2}}\sigma_{\frac{k+5}{2},\frac{k+7}{2}})\,\sigma_{\frac{k+3}{2},\frac{k+5}{2}} \Big]\\
& \Big[ (\sigma_{\frac{k+5}{2},\frac{k+7}{2}}\sigma_{\frac{k+7}{2},\frac{k+9}{2}}\cdots\sigma_{k-2,k-1}\sigma_{k-1,k})\,(\sigma_{\frac{k+5}{2},\frac{k+7}{2}}\sigma_{\frac{k+7}{2},\frac{k+9}{2}}\cdots\sigma_{k-2,k-1}) \cdots\\
& \quad \cdots (\sigma_{\frac{k+5}{2},\frac{k+7}{2}}\sigma_{\frac{k+7}{2},\frac{k+9}{2}})\,\sigma_{\frac{k+5}{2},\frac{k+7}{2}} \Big]
\end{align*}

for $k$ odd.

A careful consideration of the topological effect of the braid on the lines $L_j$ shows that they are arranged as in Fig. 6. To reconstruct the configuration it is enough to know the admissible line $l$ (at angle $\epsilon$ w.r.t. the positive real axis). In fact, $u_1$ is the first point in $\Pi_L$ (in clockwise order) for $k$ even, or the last in $\Pi_R$ for $k$ odd. The lines come out of the points in centrifugal directions. They go to infinity, without intersections (so preserving their lexicographical order w.r.t. $l$) with the original asymptotic direction $\frac{\pi}{2} - \epsilon$.

$A^{\beta'}$ can be computed as in Proposition 1, and the analogous part of Theorem 2 can be proved using the braid $\beta'$.
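The elementary step behind such computations is the action of a single generator $\beta_{i,i+1}$ on an upper unitriangular matrix. The following is a minimal sketch, assuming the convention $S \mapsto A\,S\,A^T$ with $A$ equal to the identity except for the $2\times 2$ block $\begin{pmatrix} 0 & 1\\ 1 & -s_{i,i+1}\end{pmatrix}$ in rows and columns $i, i+1$, which is the shape of the matrices $m$ used in the proof of the Proposition. The function names are illustrative, and the $3\times 3$ test matrix is the canonical form $s_{ij} = \binom{3}{j-i}$ of Theorem 2.

```python
def matmul(a, b):
    """Plain integer matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def braid_matrix(s, i):
    """Matrix A for the generator beta_{i,i+1} (1-based i), assumed convention."""
    n = len(s)
    a = [[int(r == c) for c in range(n)] for r in range(n)]
    r = i - 1
    a[r][r], a[r][r + 1] = 0, 1
    a[r + 1][r], a[r + 1][r + 1] = 1, -s[r][r + 1]
    return a


def act(s, i):
    """Image of S under beta_{i,i+1}: A S A^T."""
    a = braid_matrix(s, i)
    return matmul(matmul(a, s), transpose(a))


def is_upper_unitriangular(s):
    n = len(s)
    return all(s[i][i] == 1 for i in range(n)) and all(
        s[i][j] == 0 for i in range(n) for j in range(i))


def markov(s):
    """a^2 + b^2 + c^2 - abc for a 3 x 3 upper unitriangular matrix."""
    a, b, c = s[0][1], s[0][2], s[1][2]
    return a * a + b * b + c * c - a * b * c


if __name__ == "__main__":
    s = [[1, 3, 3], [0, 1, 3], [0, 0, 1]]
    print(act(s, 1))  # [[1, -3, 3], [0, 1, -6], [0, 0, 1]]
    print(markov(s), markov(act(s, 1)), markov(act(s, 2)))
```

For $3\times 3$ matrices the combination $a^2+b^2+c^2-abc$ of the off-diagonal entries is unchanged by each generator, which gives a cheap consistency check on any hand computation of this kind.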

Fig. 6. Lines $L_j$ (branch cuts) after the braid which brings $S^{-1}$ to canonical form (two panels: $k$ even and $k$ odd; each shows the admissible line $l$ at angle $\epsilon$ and the points $1, \ldots, k$ in cyclic order)

8. Relation Between Irregular and Fuchsian Systems

Let us consider the fuchsian system

\[ (U - \lambda)\frac{d\phi}{d\lambda} = \left(\frac{1}{2} + V\right)\phi \tag{22} \]

which can also be written

\[ \frac{d\phi}{d\lambda} = \sum_{j=1}^{k} \frac{A_j}{\lambda - u_j}\,\phi, \qquad A_j = -E_j\left(\frac{1}{2} + V\right), \qquad (E_j)_{jj} = 1, \text{ otherwise } (E_j)_{nm} = 0. \]

Around the point $u_j$ a fundamental matrix has the form

\[ \left[B_0 + O(\lambda - u_j)\right](\lambda - u_j)^{M}, \]

where $M = {\rm diag}(-\frac{1}{2}, 0, \ldots, 0)$ and the columns of $B_0$ are the eigenvectors of $A_j$; in particular, the first column is $(0, \ldots, 0, 1, 0, \ldots, 0)^T$, and 1 occurs at the $j$th position. Then, the system has $k$ independent vector solutions, of which $k-1$ are regular near $u_j$ and the last is

\[ \phi^{(j)}(\lambda) = \frac{1}{\sqrt{\lambda - u_j}}\,(0, \ldots, 0, 1, 0, \ldots, 0)^T + O\!\left(\sqrt{\lambda - u_j}\right), \qquad \lambda \to u_j, \]

where 1 occurs at the $j$th row. For any $u_j$ we can construct such a basis of solutions. The branch of $\sqrt{\lambda - u_j}$ is chosen as follows: let us consider an angle $\eta$ with a range of $2\pi$, for example $-\frac{\pi}{2} \le \eta < \frac{3\pi}{2}$, such that $\eta \neq \arg(u_i - u_j)$, $\forall i \neq j$. Then consider the cuts $L_j = \{\lambda = u_j + \rho e^{i\eta},\ \rho > 0\}$. Actually, the cuts have two sides, $L_j^+ = \{\lambda = u_j + \rho e^{i\eta},\ \rho > 0\}$ and $L_j^- = \{\lambda = u_j + \rho e^{i(\eta - 2\pi)},\ \rho > 0\}$. The branch is determined by the choice $\log(\lambda - u_j) = \log|\lambda - u_j| + i\eta$ on $L_j^+$ and $\log(\lambda - u_j) = \log|\lambda - u_j| + i(\eta - 2\pi)$ on $L_j^-$. On $\mathbf{C} \setminus \bigcup_j L_j$, $\sqrt{\lambda - u_1}, \ldots, \sqrt{\lambda - u_k}$ are single valued.

For any two (column) vector solutions $\phi(\lambda)$, $\psi(\lambda)$ we define the symmetric bilinear form:

\[ (\phi, \psi) := \phi(\lambda)^T (\lambda - U)\,\psi(\lambda), \]

which is independent of $\lambda$ and $u_1, \ldots, u_k$. Let $G$ be the matrix whose entries are $G_{ij} = (\phi^{(i)}, \phi^{(j)})$. In particular, $G_{ii} = 1$. Then, it can be proved (see [2] and also [11]) that near $u_j$, $\phi^{(i)}(\lambda) = G_{ij}\,\phi^{(j)}(\lambda) + r_{ij}(\lambda)$, where $r_{ij}(\lambda)$ is regular near $u_j$. For a counterclockwise loop around $u_j$ the monodromy of $\phi^{(i)}$ is

\[ \phi^{(i)} \mapsto R_j \phi^{(i)} := \phi^{(i)} - 2\,\frac{(\phi^{(i)}, \phi^{(j)})}{(\phi^{(j)}, \phi^{(j)})}\,\phi^{(j)} \equiv \phi^{(i)} - 2 G_{ij}\,\phi^{(j)}. \]

Then, the monodromy group of (22) acts on $\phi^{(1)}, \ldots, \phi^{(k)}$ as a reflection group whose Gram matrix is $2G$. In particular, $\phi^{(1)}, \ldots, \phi^{(k)}$ are linearly independent (and then a basis) if and only if $\det G \neq 0$.

Now consider an oriented line $l$ of argument $\theta = \frac{\pi}{2} - \eta$, and for any $j$ define the following vector

\[ \tilde Y^{(j)} = -\frac{\sqrt{z}}{2\sqrt{\pi}} \int_{\gamma_j} d\lambda\ \phi^{(j)}(\lambda)\, e^{\lambda z}, \tag{23} \]

which is a Laplace transform of $\phi^{(j)}$. The path $\gamma_j$ comes from infinity near $L_j^+$, encircles $u_j$ and returns to infinity along $L_j^-$. We can define $\Pi_L = \{\theta < \arg z < \theta + \pi\}$ and $\Pi_R = \{\theta - \pi < \arg z < \theta\}$. $\lambda = \infty$ is a regular singularity for (22), then the integrals exist for $z \in \Pi_L$, and the non-singular matrix $\tilde Y(z) := \left[\tilde Y^{(1)} | \ldots | \tilde Y^{(k)}\right]$ has the asymptotic behaviour

\[ \tilde Y(z) \sim \left(I + O\!\left(\frac{1}{z}\right)\right) e^{zU}, \qquad z \to \infty,\ z \in \Pi_L \]

and satisfies the system (10). Then it is a fundamental matrix $\tilde Y_L$. Note that $l$ is admissible, since it does not contain Stokes' rays. It is a fundamental result [11] that the Stokes' matrix of (10) satisfies $S + S^T = 2G$.

9. Monodromy Group of $QH^*(CP^{k-1})$

A system like (22) comes about in the theory of Frobenius manifolds (replace $U \mapsto U(t)$, $V \mapsto V(t)$). It determines flat coordinates $x^1(t,\lambda), \ldots, x^k(t,\lambda)$ for a linear pencil of metrics $(\ ,\ ) - \lambda \langle\ ,\ \rangle^*$ ($(\ ,\ )$ is the intersection form [10,11]). We write a gauge equivalent form (gauge $X(t^2)$) at the semisimple point $(0, t^2, 0, \ldots, 0)$

\[ \left(\hat U(t^2) - \lambda\right)\frac{d\psi}{d\lambda} = \left(\frac{1}{2} + \hat\mu\right)\psi. \tag{24} \]

A fundamental matrix $\psi(t,\lambda)$ has entries $\psi_\alpha^{(j)}(t,\lambda) = \partial_\alpha x^{(j)}(t,\lambda)$. Moreover, by (23)

\[ \partial_\alpha \tilde t^{\,j}(t,w) = -\frac{\sqrt{w}}{2\sqrt{\pi}} \int d\lambda\ \partial_\alpha x^{j}(t,\lambda)\, e^{\lambda w}. \tag{25} \]

The monodromy group of the Frobenius manifold $M$ is the group of the transformations which $(x^1(t,\lambda), \ldots, x^k(t,\lambda))$ undergo when $t$ moves in $M \setminus \Sigma_\lambda$, where $\Sigma_\lambda = \{t \in M \mid \det\left((\ ,\ ) - \lambda\langle\ ,\ \rangle^*\right) = 0\}$ is the discriminant of the linear pencil. Due to formula (25), for $CP^{k-1}$ this group is generated by the monodromy of the solutions of (24) when $\lambda$ describes loops around $u_1(t), \ldots, u_k(t)$ (see [10,11]). To these loops, we must add the effect of the displacement $t^2 \mapsto t^2 + 2\pi i$. In fact, in this case

\[ \left[\varphi^{(1)}(w e^{t^2/k}), \ldots, \varphi^{(k)}(w e^{t^2/k})\right] \mapsto \left[\varphi^{(1)}(w e^{t^2/k}), \ldots, \varphi^{(k)}(w e^{t^2/k})\right] T, \]

and the same holds for $\tilde t(w, (0, t^2, \ldots, 0))$. Then, the monodromy group of $QH^*(CP^{k-1})$ is generated by the transformations $R_1, R_2, \ldots, R_k, T$ introduced in the preceding sections.

We are going to study the structure of the monodromy group of $QH^*(CP^{k-1})$ for any $k \ge 3$. Recall that the matrix $S$ for (10) is not upper triangular, because in $U$ the order of $u_1, \ldots, u_k$ is not lexicographical w.r.t. the line $l$. Then, the Coxeter identity is $-S^{-1}S^T = {}$ product of the $R_j$'s in the order referred to $l$. For example, for $k = 3$, $S^{-1}S^T = -R_2 R_3 R_1$, since the lexicographical ordering would be $u_2, u_3, u_1$. From the identity $S^{-1}S^T = (-1)^{k-1} T^k$ a first general relation follows in the group:

\[ T^k = (-1)^k\ \text{product of } R_j\text{'s in suitable order}. \]

Two cases must now be distinguished.

$k$ odd. As a general result [2], $\det G = 0$ if and only if $V + \frac{1}{2}$ has an integer eigenvalue. The eigenvalues of $V$ are $\frac{k-1}{2}, \frac{k-3}{2}, \ldots, -\frac{k-1}{2}$. Then, for $k$ odd, $\det G \neq 0$, and $\phi^{(1)}, \ldots, \phi^{(k)}$ are a basis. The matrices $R_j$ have entries $(R_j)_{i,i} = 1$ for $i \neq j$, $(R_j)_{j,j} = -1$, $(R_j)_{j,l} = -2G_{j,l}$ ($l \neq j$). The other entries are zero. In concrete examples, we have "empirically" found other relations like $R_2 = p_2(T, R_1)$, $R_3 = p_3(T, R_1)$, $\ldots$, $R_k = p_k(T, R_1)$, where $p_j(T, R_1)$ means a product of the

elements $T$ and $R_1$. We have also found the relation $(T R_1)^k = -I$. We investigated the following cases:

$CP^2$ ($k = 3$).
\[ R_2 = T R_1 T^{-1}, \quad R_3 = T (R_1 R_2 R_1) T^{-1}, \qquad (T R_1)^3 = -I, \quad T^3 = -R_2 R_3 R_1. \]

$CP^4$ ($k = 5$).
\[ R_2 = T R_1 T^{-1}, \quad R_3 = T R_2 T^{-1}, \quad R_4 = T (R_2 R_3 R_2) T^{-1}, \quad R_5 = T^{-1} (R_2 R_1 R_2) T, \]
\[ (T R_1)^5 = -I, \quad T^5 = -R_3 R_4 R_2 R_5 R_1. \]

$CP^6$ ($k = 7$).
\[ R_2 = T R_1 T^{-1}, \quad R_3 = T R_2 T^{-1}, \quad R_4 = T R_3 T^{-1}, \quad R_5 = T (R_3 R_4 R_3) T^{-1}, \]
\[ R_6 = T (T R_1)^3 R_2\, [T (T R_1)^3]^{-1}, \quad R_7 = T^{-2} (R_3 R_2 R_3) T^{2}, \]
\[ (T R_1)^7 = -I, \quad T^7 = -R_4 R_5 R_3 R_6 R_2 R_7 R_1. \]

Note that just $R_1$, $T$, $-I$ are enough to generate the monodromy group in each of the examples. They satisfy (in the examples) the relations:

\[ R_1^2 = (-I\, T R_1)^k = (-I)^2 = I, \qquad R_1 (-I)\,((-I) R_1)^{-1} = I, \qquad T (-I)\,((-I) T)^{-1} = I. \]

The last two relations mean simply the commutativity of $-I$ with $R_1$ and $T$. The relations are not only satisfied, but also "fulfilled" (namely, $(-I\, T R_1)^n \neq I$ for $n < k$). Now call

\[ X := R_1, \qquad Y := -I\, T R_1, \qquad Z = -I. \]

These elements generate the monodromy group of $QH^*(CP^{k-1})$ with at least the relations

\[ X^2 = Y^k = Z^2 = 1, \qquad (ZX)(XZ)^{-1} = 1, \qquad (ZY)(YZ)^{-1} = 1. \]

Note that $Z$ generates the cyclic group $C_2$ of order 2. If there were no other relations (which we did not find "empirically"), we would conclude that the monodromy group of $QH^*(CP^{k-1})$ (in the examples) is isomorphic to the direct product

\[ \langle X, Y \mid X^2 = Y^k = 1 \rangle \times C_2, \]

where $\langle X, Y \mid X^2 = Y^k = 1 \rangle$ means the group generated by $X$, $Y$ with relations $X^2 = Y^k = 1$.

$k$ even. Now $\det G = 0$, since $V + \frac{1}{2}$ has integer eigenvalues. $G$ has rank $k-1$ and the eigenspace of its eigenvalue 0 has dimension 1. Let $(z_1, \ldots, z_k)^T$ be an eigenvector of eigenvalue 0. The vector $v := \sum_{j=1}^{k} z_j \phi^{(j)}$ is zero, because

\[ \left(v, \phi^{(i)}\right) = \sum_{j=1}^{k} z_j G_{ji} = 0 \quad \forall i, \]

then

\[ z_1 \phi^{(1)} + z_2 \phi^{(2)} + \cdots + z_k \phi^{(k)} = 0, \]

and $k-1$ of the $\phi^{(j)}$'s are linearly independent. The fuchsian system (22) has a regular (vector) solution $\phi_0(\lambda) = \phi_0$, a constant vector. $\phi^{(1)}, \phi^{(2)}, \ldots, \phi^{(k-1)}, \phi_0$ is then a possible choice for a basis of solutions. Observe that in the gauge equivalent form $\psi = X\phi$, $\psi_0$ is the eigenvector of $\frac{1}{2} + \hat\mu$ with eigenvalue zero. Then

\[ \psi_0 = (0, \ldots, 0, 1, 0, \ldots, 0)^T \equiv (\partial_1 x, \partial_2 x, \ldots, \partial_k x)^T, \]

where all the entries are zero but the one at position $\frac{k}{2} + 1$. $x$ is the flat coordinate for $(\ ,\ ) - \lambda\langle\ ,\ \rangle^*$ corresponding to $\psi_0$. Then, we can choose the following flat coordinates:

\[ x^1(\lambda, t),\ x^2(\lambda, t),\ \ldots,\ x^{k-1}(\lambda, t),\ t^{\frac{k}{2}+1}. \]

The monodromy group then acts on a $k-1$ dimensional space. Let us determine the reduction of $R_1, R_2, \ldots, R_k, T$ to the $k-1$ dimensional space. The entries of $T$ on the vectors $\phi^{(j)}$ are: $T\phi^{(i)} = \sum_{j=1}^{k} T_{ji}\,\phi^{(j)}$, $i = 1, \ldots, k$. On the new basis $\phi^{(1)}, \ldots, \phi^{(k-1)}, \phi_0$ the matrices are rewritten

\begin{align*}
R_j \phi^{(i)} &= \phi^{(i)} - 2G_{ij}\,\phi^{(j)}, \qquad i = 1, \ldots, k-1,\ j \neq k,\\
R_j \phi_0 &= \phi_0, \qquad j \neq k,\\
R_k \phi^{(i)} &= \phi^{(i)} - 2G_{ik}\left(-\frac{1}{z_k}\sum_{j=1}^{k-1} z_j \phi^{(j)}\right), \qquad i \neq k,\\
T \phi^{(i)} &= \sum_{j=1}^{k-1} T_{ji}\,\phi^{(j)} + T_{ki}\left(-\frac{1}{z_k}\sum_{j=1}^{k-1} z_j \phi^{(j)}\right).
\end{align*}

Then the matrices assume a reduced form
\[ \begin{pmatrix} * & * & 0\\ * & * & 0\\ 0 & 0 & 1 \end{pmatrix}. \]

We studied two examples; besides the Coxeter identity $T^k = {}$ product of $R_j$'s, we found relations similar to the case $k$ odd: $R_2 = p_2(T, R_1)$, $R_3 = p_3(T, R_1)$, $\ldots$, $R_k = p_k(T, R_1)$ and $(T R_1)^k = I$. Namely:

$CP^3$ ($k = 4$).
\[ R_2 = T R_1 T^{-1}, \quad R_3 = T R_2 T^{-1}, \quad R_4 = T^{-1} (R_2 R_1 R_2) T, \qquad (T R_1)^4 = I, \quad T^4 = R_3 R_2 R_4 R_1. \]

$CP^5$ ($k = 6$).
\[ R_2 = T R_1 T^{-1}, \quad R_3 = T R_2 T^{-1}, \quad R_4 = T R_3 T^{-1}, \quad R_5 = T (R_2 R_3 R_4 R_3 R_2) T^{-1}, \quad R_6 = T^{-1} (R_2 R_1 R_2) T, \]
\[ (T R_1)^6 = I, \quad T^6 = R_4 R_3 R_5 R_2 R_6 R_1. \]

The same remarks of $k$ odd hold here. Call

\[ X := R_1, \qquad Y := R_1 T, \]

then, if there were no other hidden relations, the monodromy group of $QH^*(CP^{k-1})$ (in the examples) would be isomorphic to $\langle X, Y \mid X^2 = Y^k = 1 \rangle$. Note that $\langle X, Y \mid X^2 = Y^k = 1 \rangle$ is (isomorphic to) the subgroup of orientation preserving transformations of the hyperbolic triangular group $[2, k, \infty]$.

Lemma 10. The subgroup of the orientation preserving transformations of the hyperbolic triangular group $[2, k, \infty]$ is isomorphic to the subgroup of $PSL(2, \mathbf{R})$ generated by

\[ \tau \mapsto -\frac{1}{\tau}, \qquad \tau \mapsto \frac{1}{2\cos\frac{\pi}{k} - \tau}, \qquad \tau \in H := \{z \in \mathbf{C} \mid \Im z > 0\}. \]

Proof. Consider three integers $m_1$, $m_2$, $m_3$ such that

\[ \frac{1}{m_1} + \frac{1}{m_2} + \frac{1}{m_3} < 1. \]

In the Bolyai–Lobatchewsky plane $H$, the triangular group $[m_1, m_2, m_3]$ of hyperbolic reflections in the sides of hyperbolic triangles of angles $\frac{\pi}{m_1}$, $\frac{\pi}{m_2}$, $\frac{\pi}{m_3}$ is generated by three reflections $r_1$, $r_2$, $r_3$ satisfying the relations

\[ r_1^2 = r_2^2 = r_3^2 = (r_2 r_3)^{m_1} = (r_3 r_1)^{m_2} = (r_2 r_1)^{m_3} = 1, \]

and the subgroup of orientation preserving transformations is generated by $X = r_2 r_3$, $Y = r_3 r_1$. Then $X^{m_1} = Y^{m_2} = (XY)^{m_3} = 1$. For $m_1 = 2$, $m_2 = k$, $m_3 = \infty$, a fundamental triangular region is $\{0 < \Re\tau < \cos\frac{\pi}{k},\ |\tau| > 1\}$. Then

\[ r_1(\tau) = -\bar\tau, \qquad r_2(\tau) = \frac{1}{\bar\tau}, \qquad r_3(\tau) = 2\cos\frac{\pi}{k} - \bar\tau. \]

The bar means complex conjugation. Then

\[ X(\tau) = -\frac{1}{\tau}, \qquad Y(\tau) = \frac{1}{2\cos\frac{\pi}{k} - \tau}. \]
□
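The two generators of Lemma 10 can be checked numerically as elements of $PSL(2,\mathbf{R})$. The sketch below is only an illustration: it represents $\tau \mapsto (a\tau + b)/(c\tau + d)$ by the matrix $\begin{pmatrix} a & b\\ c & d\end{pmatrix}$, treats a matrix as the identity transformation when it equals $\pm I$ up to a floating-point tolerance, and uses illustrative function names.

```python
import math


def generators(k):
    """Matrices of tau -> -1/tau and tau -> 1/(2 cos(pi/k) - tau)."""
    c = 2.0 * math.cos(math.pi / k)
    x = [[0.0, -1.0], [1.0, 0.0]]
    y = [[0.0, 1.0], [-1.0, c]]
    return x, y


def matmul2(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(m, n):
    """n-th power of a 2 x 2 matrix for n >= 0 by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul2(result, m)
    return result


def is_projective_identity(m, tol=1e-9):
    """True if m equals +I or -I within tol, i.e. acts trivially on H."""
    for sign in (1.0, -1.0):
        if all(abs(m[i][j] - sign * (i == j)) < tol
               for i in range(2) for j in range(2)):
            return True
    return False


def orders_ok(k):
    """Check X^2 = 1, Y^k = 1 and Y^n != 1 for 0 < n < k in PSL(2, R)."""
    x, y = generators(k)
    return (is_projective_identity(mat_pow(x, 2))
            and is_projective_identity(mat_pow(y, k))
            and not any(is_projective_identity(mat_pow(y, n))
                        for n in range(1, k)))


if __name__ == "__main__":
    print([orders_ok(k) for k in range(3, 8)])
```

The second generator has trace $2\cos\frac{\pi}{k}$ and determinant 1, so its eigenvalues are $e^{\pm i\pi/k}$ and its $k$th power is $-I$, which is the identity in $PSL(2,\mathbf{R})$.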

Remark. The orientation preserving transformations of $[2, 3, \infty]$ are the modular group $PSL(2, \mathbf{Z})$.

Theorem 3. The monodromy group of the quantum cohomology of $CP^2$ is isomorphic to

\[ \langle X, Y \mid X^2 = Y^3 = 1 \rangle \times C_2 \cong PSL(2, \mathbf{Z}) \times C_2. \tag{26} \]

The monodromy group of the quantum cohomology of $CP^3$ is isomorphic to

\[ \langle X, Y \mid X^2 = Y^4 = 1 \rangle \cong \text{orient. preserv. transf. of } [2, 4, \infty]. \tag{27} \]
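The relations behind (26) can be verified by direct integer arithmetic. The sketch below takes the $3\times 3$ matrices $R_1$ and $T$ for $CP^2$ as they are read from part a) of the proof (the row layout is our reading of the printed matrices), uses a hand-computed inverse `T_INV` (an assumption that the code itself checks), and then tests the $k = 3$ relations of Sect. 9 together with $X^2 = Y^3 = 1$ for $X = R_1$, $Y = -I\,T R_1$.

```python
R1 = [[-1, 3, 3], [0, 1, 0], [0, 0, 1]]
T = [[0, 0, 1], [-1, 3, 3], [0, -1, 0]]
T_INV = [[3, -1, -3], [0, 0, -1], [1, 0, 0]]  # hand-computed, checked below
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def matmul(a, b):
    """Plain integer matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def prod(*ms):
    """Left-to-right product of several matrices."""
    result = I3
    for m in ms:
        result = matmul(result, m)
    return result


def neg(a):
    return [[-v for v in row] for row in a]


def cp2_relations():
    """Return R2, R3 and a dict of boolean checks for the k = 3 relations."""
    r2 = prod(T, R1, T_INV)
    r3 = prod(T, R1, r2, R1, T_INV)
    y = neg(prod(T, R1))
    checks = {
        "T_INV is the inverse of T": prod(T, T_INV) == I3,
        "(T R1)^3 = -I": prod(T, R1, T, R1, T, R1) == neg(I3),
        "T^3 = -R2 R3 R1": prod(T, T, T) == neg(prod(r2, r3, R1)),
        "X^2 = I": prod(R1, R1) == I3,
        "Y^3 = I": prod(y, y, y) == I3,
    }
    return r2, r3, checks


if __name__ == "__main__":
    r2, r3, checks = cp2_relations()
    print(r2)
    print(r3)
    for name, ok in checks.items():
        print(name, ok)
```

The computed $R_2$ and $R_3$ come out as reflections of the expected shape: the identity except for the second, respectively third, row.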

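The theorem for the case of $CP^2$ is already proved in [11].

Proof. a) $CP^2$:

\[
R_1 = \begin{pmatrix} -1 & 3 & 3\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}, \qquad
T = \begin{pmatrix} 0 & 0 & 1\\ -1 & 3 & 3\\ 0 & -1 & 0 \end{pmatrix},
\]

and $X = R_1$, $Y = -I\, T R_1$ and $Z = -I$ satisfy the relations of (26). They act on the column vector $\mathbf{x} = (x, y, z)^T$. The quadratic form $q(x, y, z) = \mathbf{x}^T G\, \mathbf{x}$ is $R_1$ and $T$ invariant. Then $T$, $R_1$ act on two dimensional invariant subspaces $q(x, y, z) = \text{constant}$. On each of these subspaces we introduce new coordinates $\chi \in \mathbf{R}$ and $\varphi \in [0, 2\pi)$. Let $\tau = e^{\chi} e^{i\varphi}$ and

\begin{align*}
x &= \frac{a}{2}\left(\tau\bar\tau - \frac{3}{2}(\tau + \bar\tau) + 1\right)\frac{i}{\tau - \bar\tau},\\
y &= \frac{a}{2}\left(\tau\bar\tau - \frac{1}{2}(\tau + \bar\tau) - 1\right)\frac{i}{\tau - \bar\tau},\\
z &= \frac{a}{2}\left(-\tau\bar\tau - \frac{1}{2}(\tau + \bar\tau) + 1\right)\frac{i}{\tau - \bar\tau},
\end{align*}

$a \in \mathbf{R}$, $a \neq 0$. Note that $q(x, y, z) = a^2 > 0$. Then, it is easily verified that

\[ \mathbf{x}\!\left(-\frac{1}{\tau}\right) = -X\, \mathbf{x}(\tau), \qquad \mathbf{x}\!\left(\frac{1}{1 - \tau}\right) = Y\, \mathbf{x}(\tau), \qquad \mathbf{x}(\tau, -a) = Z\, \mathbf{x}(\tau, a). \]

This implies the 1 to 1 correspondence between the generators of the modular group and $X$ and $Y$.

b) Case of $CP^3$.

\[
R_1 = \begin{pmatrix} -1 & 4 & -10 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
T = \begin{pmatrix} 0 & 0 & 1 & 0\\ -1 & 0 & 3 & 0\\ 0 & -1 & 3 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]

The matrices are already written on $\phi^{(1)}, \phi^{(2)}, \phi^{(3)}, \phi_0$. Recall that the monodromy acts only on $x^1, x^2, x^3$, because the last flat coordinate is $t^3$. This action is given by the following three dimensional matrices, acting on a three dimensional space of vectors $\mathbf{x} = (x, y, z)^T$:

\[
r_1 = \begin{pmatrix} -1 & 4 & -10\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}, \qquad
t := \begin{pmatrix} 0 & 0 & 1\\ -1 & 0 & 3\\ 0 & -1 & 3 \end{pmatrix}.
\]

We redefine $X = r_1$ and $Y = t\, r_1$, which satisfy the relations (27). We proceed as above, defining

\begin{align*}
x &= a\left(\tau\bar\tau - \frac{1}{\sqrt{2}}(\tau + \bar\tau) + \frac{1}{3}\right)\frac{i}{\tau - \bar\tau},\\
y &= a\left(-\frac{2}{3}\tau\bar\tau - \frac{2\sqrt{2}}{3}(\tau + \bar\tau) + \frac{2}{3}\right)\frac{i}{\tau - \bar\tau},\\
z &= a\left(-\frac{1}{3}\tau\bar\tau - \frac{\sqrt{2}}{6}(\tau + \bar\tau) + \frac{1}{3}\right)\frac{i}{\tau - \bar\tau},
\end{align*}

$a \neq 0$. Note that $\mathbf{x}^T g\, \mathbf{x} = (8/9)a^2$, where $g$ is the $3 \times 3$ reduction of $G$. It is easily verified that

\[ \mathbf{x}\!\left(-\frac{1}{\tau}\right) = -X\, \mathbf{x}(\tau), \qquad \mathbf{x}\!\left(\frac{1}{\sqrt{2} - \tau}\right) = -Y\, \mathbf{x}(\tau), \]

which proves the theorem. □

Acknowledgements. I am indebted to Prof. B. Dubrovin who introduced me to the problem and constantly gave me suggestions and advice. I also thank M. Bertola and M. Mazzocco for useful discussions.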

References

1. Balser, W., Jurkat, W.B., Lutz, D.A.: Birkhoff Invariants and Stokes' Multipliers for Meromorphic Linear Differential Equations. J. Math. Anal. and Appl. 71, 48–94 (1979)
2. Balser, W., Jurkat, W.B., Lutz, D.A.: On the reduction of connection problems for differential equations with an irregular singular point to ones with only regular singularities. SIAM J. Math. Anal. 12, 691–721 (1981)
3. Beilinson, A.A.: Coherent Sheaves on P^n and Problems of Linear Algebra. Funct. Anal. Appl. 12 (1978)
4. Birman, J.S.: Braids, Links, and Mapping Class Groups. Annals of Mathematics Studies 82, Princeton, NJ: Princeton Univ. Press, 1975
5. Bondal, A.I., Polishchuk, A.E.: Homological Properties of Associative Algebras: the Method of Helices. Russ. Acad. Sci. Izv. Math. 42, 219–260 (1994)
6. Cecotti, S., Vafa, C.: On Classification of N = 2 supersymmetric theories. Commun. Math. Phys. 158, 569–644 (1993)
7. Di Francesco, P., Itzykson, C.: Quantum Intersection Rings. In: The Moduli Space of Curves, edited by R. Dijkgraaf, C. Faber, G.B.M. van der Geer, 1995
8. Dubrovin, B.: Integrable Systems in Topological Field Theory. Nucl. Phys. B 379, 627–689 (1992)
9. Dubrovin, B.: Geometry and Integrability of Topological-Antitopological Fusion. Commun. Math. Phys. 152, 539–564 (1993)
10. Dubrovin, B.: Geometry of 2D topological field theories. Lecture Notes in Math. 1620, Berlin–Heidelberg–New York: Springer-Verlag, 1996, pp. 120–348
11. Dubrovin, B.: Painlevé transcendents in two-dimensional topological field theory. SISSA Preprint 24/98/FM (1998)
12. Dubrovin, B.: Geometry and Analytic Theory of Frobenius Manifolds. math.AG/9807034 (1998)
13. Dubrovin, B., Mazzocco, M.: Monodromy of Certain Painlevé transcendents and Reflection Groups. SISSA Preprint 149/97/FM, math/9806056 (1997); to appear in Inventiones Mathematicae
14. Duval, A., Mitschi, C.: Matrices de Stokes et Groupe de Galois des Equations Hypergeometriques Confluentes Generalisees. Pacific Journal of Mathematics 138, no. 1, 1989
15. Guzzetti, D.: Stokes matrices and Monodromy for the Quantum Cohomology of projective Spaces. Preprint SISSA 85/98/FM, math.AG/9904099 (1998)
16. Jimbo, M., Miwa, T., Ueno, K.: Monodromy Preserving Deformations of Linear Ordinary Differential Equations with Rational Coefficients (I). Physica D 2, 306–352 (1981)
17. Jimbo, M., Miwa, T.: Monodromy Preserving Deformations of Linear Ordinary Differential Equations with Rational Coefficients (II). Physica D 2, 407–448 (1981)
18. Jimbo, M., Miwa, T.: Monodromy Preserving Deformations of Linear Ordinary Differential Equations with Rational Coefficients (III). Physica D 4, 26–46 (1981)
19. Gorodentsev, A.L., Rudakov, A.N.: Duke Math. J. 54, 115 (1987)
20. Its, A.R., Novokshenov, V.Y.: The isomonodromic deformation method in the theory of Painlevé equations. Lecture Notes in Math. 1191, Berlin–Heidelberg–New York: Springer-Verlag, 1986
21. Kontsevich, M.: Talk at the Scuola Normale Superiore. Pisa, April 1998
22. Kontsevich, M., Manin, Y.I.: Gromov–Witten classes, Quantum Cohomology and Enumerative Geometry. Commun. Math. Phys. 164, 525–562 (1994)
23. Manin, Y.I.: Frobenius Manifolds, Quantum Cohomology and Moduli Spaces. Max Planck Institut für Mathematik, Bonn, Germany, 1998
24. McDuff, D., Salamon, D.: J-holomorphic Curves and Quantum Cohomology. Providence, RI: American Mathematical Society, 1994
25. Ruan, Y., Tian, G.: A Mathematical Theory of Quantum Cohomology. Math. Res. Lett. 1, 269–278 (1994)
26. Rudakov, A.: Integer Valued Bilinear Forms and Vector Bundles. Math. USSR Sbornik 66, 187–194 (1990)
27. Rudakov, A.: Helices and Vector Bundles. Seminaire Rudakov. London Math. Society Lecture Notes Series 148, Cambridge: C.U.P., 1990
28. Zaslow, E.: Solitons and Helices: The Search for a Math-Phys Bridge. Commun. Math. Phys. 175, 337–375 (1996)

Communicated by T. Miwa

Commun. Math. Phys. 207, 385 – 409 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Existence of a Stationary Wave for the Discrete Boltzmann Equation in the Half Space

Shuichi Kawashima^1, Shinya Nishibata^2

1 Graduate School of Mathematics, Kyushu University, Fukuoka 812-8581, Japan
2 Department of Mathematics, Fukuoka Institute of Technology, Fukuoka 811-0295, Japan

Received: 7 December 1998 / Accepted: 27 April 1999

Dedicated to Professor Kiyoshi Mochizuki on his 60th birthday

Abstract: We study the existence and the uniqueness of stationary solutions for discrete velocity models of the Boltzmann equation in the first half space. We obtain a sufficient condition that guarantees the existence and the uniqueness of solutions connecting the given boundary data and the Maxwellian state at a spatially asymptotic point. First, a sufficient condition is obtained for the linearized system. Then this result as well as the contraction mapping principle is applied to prove the existence theorem for the nonlinear equation. Also, we show that the stationary wave approaches the Maxwellian state exponentially at a spatially asymptotic point. We also discuss some concrete models of Boltzmann type as an application of our general theory. Here, it turns out that our sufficient condition is general enough to cover many concrete models.

1. Introduction 1.1. Problem. The discrete Boltzmann equation appears in the discrete kinetic theory of rarefied gases. This system of equations describes the evolution of a system of gas particles with a finite number of velocities. It has the mathematical structure of a hyperbolic system of semilinear partial differential equations. Although the discrete Boltzmann equation is established well as mathematical models, it seems too general as it contains various unrealistic models from the physical point of view. Therefore, it is necessary to develop mathematical theories, which reasonably reflect the physical phenomena. The discrete Boltzmann equation in bounded space, as well as unbounded space, has been studied by many authors. Among them, here we refer to papers which treat the problem with the boundary, as [4,7,8] and [12]. Paper [7] is concerned with the nonstationary initial boundary value problem. In [8], the existence and the stability of the stationary solutions are shown. In that paper, a stability condition is proposed and plays the essential role to obtain the stability theorem. Also, the same condition is

386

S. Kawashima, S. Nishibata

necessary in our own analysis. Therefore, this stability condition seems indispensable for making the discrete Boltzmann equation physically admissible. In [4] and [12], the existence of stationary solutions in the half space is studied under the assumption that no particle speed is zero. The excellent work [12] constructs a stationary solution to the discrete Boltzmann equation. Some crucial assumptions in [12], however, fail to be satisfied by well-known concrete models, because that paper avoided the difficulties of degeneracy arising from zero particle speed. This is the problem that we also study in the present paper. The goal for this problem in the half space is to obtain necessary and sufficient conditions that guarantee the unique existence of solutions. Here, we give a general sufficient condition; obtaining reasonable necessary and sufficient conditions is still an open problem. Thus it seems necessary to develop a mathematical theory, applicable to physical models of rarefied gases, for the analysis of the stationary problem for the discrete Boltzmann equation. It will be shown in this paper that the discrete Boltzmann equation admits a unique stationary solution under general assumptions. This theory is constructed by following the context and idea of [12] and carefully decomposing the state space to which the solution should belong. Then, as applications of the general theory, we treat some concrete models. They are shown to satisfy the assumptions and to have a stationary solution.

The aim of the present paper is to study the existence of a unique stationary solution to the one-dimensional discrete Boltzmann equation in the half space $\{x > 0\}$. A general form of the discrete Boltzmann equation in one space dimension is written as

\[ \nu_i(\partial_t F_i + v_i \partial_x F_i) = Q_i(F), \quad i \in \Lambda, \tag{1.1} \]

with $F = (F_i)_{i\in\Lambda}$, where $\Lambda$ is a finite set $\{1, 2, \ldots, m\}$, each $\nu_i$ is a positive integer and $F_i = F_i(x,t)$ represents the mass density of gas particles with the $i$th velocity at time $t$ and position $x$; each $v_i$, $i \in \Lambda$, is a real constant and denotes the $x$-component of the $i$th velocity, and hence the $v_i$ are not necessarily distinct from each other nor non-zero. The reader is referred to [6] for the derivation of (1.1) from the original discrete Boltzmann equation in several space dimensions. The collision term $Q_i(F)$ on the right-hand side of (1.1) is given by the formula

\[ Q_i(F) = \sum_{p=2}^{r} Q_i^{(p)}(F), \tag{1.2} \]

where each $Q_i^{(p)}(F)$ is called the $p$th order collision term and is a homogeneous polynomial of degree $p$ with respect to $F = (F_i)_{i\in\Lambda}$. Actually, each $Q_i^{(p)}(F)$ is formulated so that the vector $Q(F) = (Q_i(F))_{i\in\Lambda}$, also called the collision term, reflects the indistinguishability of gas particles and the reversibility of collisions in rarefied gas dynamics. For the explicit formula of $Q_i^{(p)}(F)$, readers are referred to [5,9]. A vector $\phi$ which is orthogonal to the collision term $Q(F)$ for each $F \in \mathbb{R}^m$ is called a collision invariant. The set of collision invariants is denoted by $\mathcal{M}$:

\[ \mathcal{M} = \{\phi \in \mathbb{R}^m;\ \langle \phi, Q(F)\rangle = 0 \ \text{for all } F \in \mathbb{R}^m\}. \tag{1.3} \]

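It is obvious that $\mathcal{M}$ is neither an empty set nor the total space $\mathbb{R}^m$. Thus, let $d$ ($1 \le d \le m-1$) denote the dimension of $\mathcal{M}$, $\{\phi_i\}_{i=1,\ldots,d}$ a basis of the subspace $\mathcal{M}$ and $\{\phi_i\}_{i=d+1,\ldots,m}$ a basis of the orthogonal complement $\mathcal{M}^{\perp}$ of $\mathcal{M}$:

\[ \mathcal{M} = \operatorname{span}\{\phi_1, \phi_2, \ldots, \phi_d\}, \quad \mathcal{M}^{\perp} = \operatorname{span}\{\phi_{d+1}, \phi_{d+2}, \ldots, \phi_m\}. \tag{1.4} \]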

Stationary Waves for Discrete Boltzmann Equation


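We divide the set $\Lambda = \{1, \ldots, m\}$ into three subsets according to the sign of the corresponding speed $v_i$ as $\Lambda_{\pm} = \{i \in \Lambda;\ v_i \gtrless 0\}$ and $\Lambda_0 = \{i \in \Lambda;\ v_i = 0\}$. Let $\mathcal{R}$ denote the subspace of $\mathbb{R}^m$ corresponding to $\Lambda_- \cup \Lambda_+$ and $\mathcal{N}$ the subspace corresponding to $\Lambda_0$. Thus we have the orthogonal decomposition

\[ \mathbb{R}^m = \mathcal{N} \oplus \mathcal{R}, \quad \mathcal{N} = N(V), \quad \mathcal{R} = R(V), \quad \text{where } V = \operatorname{diag}(\nu_i v_i)_{i\in\Lambda}, \tag{1.5} \]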

and $\oplus$ means the orthogonal direct sum. Moreover, $N(V)$ and $R(V)$ denote the null space and the range space of the operator $V$, respectively. Let $P_0$ and $P_1$ denote the orthogonal projections onto $\mathcal{N}$ and $\mathcal{R}$, respectively:

\[ P_0\mathbb{R}^m = \mathcal{N}, \quad P_1\mathbb{R}^m = \mathcal{R}, \quad P_0^2 = P_0, \quad P_1^2 = P_1, \quad P_0P_1 = P_1P_0 = O. \]

Our main concern is the existence of a unique stationary solution $F = F(x) = (F_i(x))_{i\in\Lambda}$ to (1.1), which satisfies

\[ V\frac{dF}{dx} = Q(F), \quad x > 0. \tag{1.6} \]

We often call the solution $F(x)$ a stationary wave. The Dirichlet boundary condition on $x = 0$ is posed: $F_i(0) = B_i$ for $i \in \Lambda_+$, where $B_i$ is a constant. This condition is rewritten as

\[ R^{+}F(0) = B = (B_i)_{i\in\Lambda_+}, \tag{1.7} \]

where $R^{\pm}$ means the restriction to the subspace corresponding to $\Lambda_{\pm}$, respectively: $R^{\pm}f = R^{\pm}(f_i)_{i\in\Lambda} = (f_i)_{i\in\Lambda_{\pm}}$.

This type of boundary condition represents the pure diffuse reflection in rarefied gas dynamics. Also, the spatially asymptotic condition is posed:

\[ F(x) \to M = (M_i)_{i\in\Lambda} \quad \text{as } x \to \infty. \tag{1.8} \]

Here, it is natural to assume that the asymptotic state $M = (M_i)_{i\in\Lambda}$ is a Maxwellian, which is an equilibrium state of (1.1) with positive entries:

\[ Q(M) = 0, \quad M_i > 0 \ \text{for } i \in \Lambda. \tag{1.9} \]

If the function $F(x)$ in $\mathcal{B}^0[0,\infty)$ is a solution to the problem (1.6), (1.7) and (1.8), then the boundary data $B$ should satisfy the consistency condition:

\[ B - R^{+}M \in R^{+}(V\mathcal{M})^{\perp}. \tag{1.10} \]

Proof of (1.10). For arbitrary $\psi \in V\mathcal{M}$, there exists a $\phi \in \mathcal{M}$ such that $\psi = V\phi$. Taking the inner product of $\psi$ with $F_x(x)$ yields that

\[ \langle \psi, F\rangle_x = \langle \psi, F_x\rangle = \langle V\phi, F_x\rangle = \langle \phi, VF_x\rangle = \langle \phi, Q(F)\rangle = 0. \]

Here, we have used (1.6) as well as the facts that $V$ is symmetric and $Q(F) \in \mathcal{M}^{\perp}$. Integrating the above equality over $[x, \infty)$ and using (1.8), we obtain the condition

\[ \langle \psi, F(x) - M\rangle = 0 \quad \text{for } x > 0. \tag{1.11} \]

Since $\psi$ is an arbitrary vector in $V\mathcal{M}$, (1.11) is equivalently rewritten as

\[ F(x) - M \in (V\mathcal{M})^{\perp} \quad \text{for } x > 0. \tag{1.12} \]

In particular, letting $x$ tend to zero gives $F(0) - M \in (V\mathcal{M})^{\perp}$. Applying the restriction $R^{+}$ to this and using (1.7), we have (1.10). □


1.2. Reformulation of the problem. Because of (1.8), it is convenient to look for the solution in the form

\[ F_i = M_i(1 + f_i) \quad \text{for } i \in \Lambda. \tag{1.13} \]

With $F = (F_i)_{i\in\Lambda}$ and $f = (f_i)_{i\in\Lambda}$, this is rewritten in vector form as

\[ F = M + I_M f, \tag{1.14} \]

where $I_M$ is the diagonal matrix whose $i$th diagonal entry is $M_i$. Substituting (1.14) in (1.6) yields the system of equations to be satisfied by $f$,

\[ V_M f_x + L_M f = \Gamma_M(f), \tag{1.15} \]

where the operators are given by

\[ V_M = \operatorname{diag}(\nu_i v_i M_i)_{i\in\Lambda}, \tag{1.16} \]
\[ L_M = -D_F Q(M) I_M, \tag{1.17} \]
\[ \Gamma_M(f) = Q(M + I_M f) - Q(M) - D_F Q(M) I_M f. \tag{1.18} \]

The operator $L_M$ is called the linearized collision operator. The following known facts about the operators in (1.15) are proved in [5,9]. These proofs are based on the indistinguishability of gas particles and the reversibility of collisions in rarefied gas dynamics.

\[ L_M \ \text{is real symmetric and nonnegative}, \tag{1.19} \]
\[ N(L_M) = \mathcal{M}, \tag{1.20} \]
\[ R(L_M) = \mathcal{M}^{\perp}, \tag{1.21} \]
\[ \Gamma_M(f) \in \mathcal{M}^{\perp} \quad \text{for all } f \in \mathbb{R}^m. \tag{1.22} \]

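The boundary conditions for $f$, derived from (1.7) and (1.8), are written as

\[ R^{+}f(0) = b, \tag{1.23} \]
\[ f(x) \to 0 \quad \text{as } x \to \infty, \tag{1.24} \]

where the boundary value $b = (b_i)_{i\in\Lambda_+}$ is given as

\[ b_i = \frac{B_i - M_i}{M_i} \quad \text{for } i \in \Lambda_+. \tag{1.25} \]

Since $M_i > 0$ ($i \in \Lambda$), we have

\[ \mathcal{R} = R(V_M), \quad \mathcal{N} = N(V_M). \tag{1.26} \]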

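Moreover, we introduce the following subspaces:

\[ \mathcal{N}_M = V_M\mathcal{M} = \operatorname{span}\{V_M\phi_1, \ldots, V_M\phi_d\}, \tag{1.27} \]
\[ \mathcal{N}_M^{\perp} = V_M^{-1}(\mathcal{M}^{\perp}), \tag{1.28} \]

where $V_M^{-1}(\cdot)$ means the inverse image by $V_M$. Note that, as we allow $v_i = 0$, $V_M$ is possibly singular. Since $\mathcal{N} \subset \mathcal{N}_M^{\perp}$, we have the orthogonal decompositions

\[ \mathcal{N}_M^{\perp} = \mathcal{N} \oplus (\mathcal{N}_M^{\perp} \cap \mathcal{R}), \tag{1.29} \]
\[ \mathcal{R} = \mathcal{N}_M \oplus (\mathcal{N}_M^{\perp} \cap \mathcal{R}). \tag{1.30} \]

Owing to (1.25), the consistency condition (1.10) is expressed in the above terminology as the condition that the boundary data $b$ should satisfy

\[ b \in R^{+}\mathcal{N}_M^{\perp}. \tag{1.31} \]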


1.3. Assumptions and results. Here, we state the two assumptions which play the essential role in decomposing the system (1.15). The first one is the stability condition proposed in [8]:

\[ \mathcal{M} \cap \mathcal{N} = \{0\}. \tag*{[S]} \]

The stability condition [S] is equivalent to either one of the following:

• If $V_M\phi = 0$ and $L_M\phi = 0$ for $\phi \in \mathbb{R}^m$, then $\phi = 0$. (1.32)
• $P_0 L_M P_0$ is positive definite on $\mathcal{N}$. (1.33)

The equivalence among the three conditions above is easily shown under (1.19), (1.20), (1.21) and (1.26) (see [8]). It follows from the stability condition [S] that $\{V_M\phi_i\}_{i=1,\ldots,d}$ is a basis of $\mathcal{N}_M$. Equivalently,

\[ \dim \mathcal{N}_M = \dim \mathcal{M} = d. \tag{1.34} \]

In addition to [S], we need to assume that

\[ \dim R^{+}\mathcal{N}_M^{\perp} = \#\{\mu < 0;\ \det(\mu V_M + L_M) = 0\}, \tag*{[A]} \]

where we count the multiplicity of generalized eigenvalues $\mu$. Condition [A] will be used in proving Proposition 2.6. Here, we are in a position to state the main theorem of this paper. To this end, we introduce three additional notations. The first one is the function space $L^1_{\alpha}[0,\infty)$, which is defined as the space of measurable functions $f(x)$ on $[0,\infty)$ satisfying $(1+x)^{\alpha}f(x) \in L^1[0,\infty)$ for a certain constant $\alpha$, which is defined in (3.5). The second one is the supremum norm weighted by $e^{\sigma x}$:

\[ \|g\|_{\sigma} = \sup_{x \ge 0} e^{\sigma x}|g(x)|, \tag{1.35} \]

where $\sigma$ is an arbitrary positive constant satisfying

\[ \sigma < \bar{\sigma} := \min\{|\mu|;\ \mu < 0,\ \det(\mu V_M + L_M) = 0\}. \tag{1.36} \]

The last one is the distance in $R^{+}\mathbb{R}^m$ between the Maxwellian state $M$ and the boundary data $B$:

\[ \beta = \sum_{i\in\Lambda_+} |B_i - M_i|. \tag{1.37} \]

Theorem 1.1. (i) Suppose that the stationary problem (1.6), (1.7) and (1.8) admits a solution. Then the boundary data $B = (B_i)_{i\in\Lambda_+}$ should satisfy the consistency condition (1.10). (ii) Suppose that the stability condition [S] holds. Let the boundary data $B = (B_i)_{i\in\Lambda_+}$ satisfy the consistency condition (1.10). Moreover, suppose that the Maxwellian state $M$ satisfies Condition [A]. Then there exists a positive constant $\beta_0$ such that if $\beta \le \beta_0$, the stationary problem (1.6), (1.7) and (1.8) has a unique solution $F = (F_i)_{i\in\Lambda}$ in the class of functions such that $F(x) - M \in (\mathcal{B}^0 \cap L^1_{\alpha})[0,\infty)$ if $\alpha$ is nonnegative, or $F(x) - M \in \mathcal{B}^0[0,\infty)$ if $\alpha$ is negative, where the constant $\alpha$ is defined in (3.5). Furthermore, the solution $F = (F_i)_{i\in\Lambda}$ is located in a small neighborhood of the Maxwellian state $M$ with respect to the norm $\|\cdot\|_{\sigma}$ defined by (1.35). Also, this solution $F(x)$ belongs to $C^{\infty}[0,\infty)$ and verifies the estimate

\[ |\partial_x^k(F(x) - M)| \le C_k\,\beta\, e^{-\sigma x} \quad (x > 0), \tag{1.38} \]

for each integer $k \ge 0$, where $C_k$ is a positive constant depending on $k$ and $\sigma$ is an arbitrary positive constant satisfying (1.36).

Proof. The first statement has been proved in deriving (1.10). The second assertion follows immediately from Theorem 4.2 and Remark 4.1. □

The plan of this paper is as follows. First, in Sect. 2 we study decompositions of the algebraic equations arising from (1.15). The related eigenvalue problems are also discussed. It turns out that the stability condition [S] and Condition [A] together are sufficient to decompose the state space to which solutions belong and to obtain information on the operators restricted to the subspaces arising from the decomposition. The algebraic results obtained there become indispensable in studying the stationary problem for the discrete Boltzmann equation (1.1) in the later sections. Section 3 is devoted to showing the existence of a solution to the linearized problem derived from the Boltzmann equation (1.1). Owing to the algebraic decomposition studied in Sect. 2, the Boltzmann equation is decomposed into two systems of equations: one consists of linear algebraic equations and the other of linear differential equations. By solving the systems thus decomposed, we obtain the solution formula for the linear Boltzmann equation. Then we derive some estimates from the solution formula. In Sect. 4, these estimates are applied to prove the existence of the stationary wave for the (nonlinear) Boltzmann equation. First, we define a function space to which solutions of the linear Boltzmann equation belong. Then a mapping on the function space is constructed through the solution formula given in Sect. 3. We show, with the aid of the estimates in Sect. 3, that the mapping is a contraction if the given boundary data is sufficiently close to the spatially asymptotic Maxwellian state. Applications of the existence theorem to concrete Boltzmann-type equations are studied in Sect. 5.
Here, we treat two kinds of models: Cabannes' 14-velocity model and the 6-velocity model with multiple collisions. Both are shown to have stationary waves, provided that they satisfy certain conditions derived from the sufficient conditions [S] and [A] of the general theory.

Notation. $\operatorname{span}\{\phi_i\}$ denotes the subspace spanned by the vectors $\{\phi_i\}_{i=1,\ldots,n}$. $\dim V$ denotes the dimension of the subspace $V \subset \mathbb{R}^m$. $V_1 \oplus V_2$ denotes the orthogonal direct sum and $V_1 \dot{+} V_2$ denotes the direct sum of the subspaces $V_i$ ($i = 1, 2$). $\det(A)$ denotes the determinant of the matrix $A$. $\#S$ denotes the number of elements of the set $S$. $\mathcal{B}^0(I)$ denotes the space of continuous and bounded functions on the interval $I$. $C^k(I)$ denotes the space of $k$-times continuously differentiable functions on the interval $I$.

2. Algebraic Preparation

This section is devoted to proving some lemmas which play essential roles in proving the existence of a stationary wave in later sections.


2.1. Consequences of the stability condition [S]. First, we consider the generalized eigenvalue problem for (1.15),

\[ (\mu V_M + L_M)\varphi = 0 \quad \text{for } \mu \in \mathbb{C},\ \varphi \in \mathbb{C}^m,\ \varphi \ne 0. \tag{2.1} \]

We call a value $\mu$ satisfying (2.1) with a certain $\varphi$ an admissible value. Also, let $\Sigma$ denote the set of admissible values, i.e.,

\[ \Sigma = \{\mu \in \mathbb{C};\ \det(\mu V_M + L_M) = 0\}. \tag{2.2} \]

The following lemma shows that every admissible value is real.

Lemma 2.1. The stability condition [S] implies that

\[ \Sigma \subset \mathbb{R}. \tag{2.3} \]

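Proof. Let $\mu \in \Sigma$. Then there exists a non-zero vector $\varphi \in \mathbb{C}^m$ satisfying (2.1). Taking the inner product of (2.1) with $\varphi$, we have $\mu\langle V_M\varphi, \varphi\rangle + \langle L_M\varphi, \varphi\rangle = 0$. Note that $\langle L_M\varphi, \varphi\rangle \ge 0$ by (1.19). Since $V_M$ is real symmetric, $\langle V_M\varphi, \varphi\rangle$ is real; hence, if $\langle V_M\varphi, \varphi\rangle \ne 0$, then $\mu \in \mathbb{R}$. Therefore, we assume that $\langle V_M\varphi, \varphi\rangle = 0$. Then $\langle L_M\varphi, \varphi\rangle = 0$. From (1.19), $L_M\varphi = 0$. Thus we find, by (2.1) again, that $\mu V_M\varphi = 0$. On the other hand, the stability condition in the form (1.32) means that $V_M\varphi \ne 0$. Thus $\mu = 0 \in \mathbb{R}$. □

Next, we consider the following algebraic equations derived from (1.15):

\[ V_M\psi + L_M\phi = \eta \tag{2.4} \]

for $\psi, \phi, \eta \in \mathbb{R}^m$. We decompose the system (2.4) by applying the decomposition $\mathbb{R}^m = \mathcal{N} \oplus \mathcal{R}$. First, by using this decomposition, we write

\[ \psi = \psi^0 + \psi^1, \quad \phi = \phi^0 + \phi^1, \tag{2.5} \]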

where $\psi^0 = P_0\psi$, $\psi^1 = P_1\psi$, $\phi^0 = P_0\phi$ and $\phi^1 = P_1\phi$. Applying the projections $P_0$ and $P_1$ to (2.4), respectively, we obtain that

\[ (P_0 L_M P_0)\phi^0 + (P_0 L_M P_1)\phi^1 = P_0\eta, \tag{2.6a} \]
\[ (P_1 V_M P_1)\psi^1 + (P_1 L_M P_0)\phi^0 + (P_1 L_M P_1)\phi^1 = P_1\eta. \tag{2.6b} \]

Since $P_0 L_M P_0$ is nonsingular on $\mathcal{N}$ owing to (1.33), which is equivalent to the stability condition [S], we can solve (2.6a) with respect to $\phi^0$ as

\[ \phi^0 = -(P_0 L_M P_0)^{-1}(P_0 L_M P_1)\phi^1 + (P_0 L_M P_0)^{-1}P_0\eta. \tag{2.7} \]

Substituting (2.7) in (2.6b) and using the fact that $P_1 V_M P_1$ is nonsingular on $\mathcal{R}$, we can express $\psi^1$ in terms of $\phi^1$ and $\eta$. Thus we arrive at the following system of equations:

\[ \phi^0 = A\phi^1 + J\eta, \tag{2.8a} \]
\[ \psi^1 = B\phi^1 + G\eta, \tag{2.8b} \]

where

\[ A = -(P_0 L_M P_0)^{-1}(P_0 L_M P_1), \tag{2.9a} \]
\[ J = (P_0 L_M P_0)^{-1}P_0, \tag{2.9b} \]
\[ B = -(P_1 V_M P_1)^{-1}\bigl[(P_1 L_M P_1) - (P_1 L_M P_0)(P_0 L_M P_0)^{-1}(P_0 L_M P_1)\bigr], \tag{2.9c} \]
\[ G = (P_1 V_M P_1)^{-1}\bigl[P_1 - (P_1 L_M P_0)(P_0 L_M P_0)^{-1}P_0\bigr]. \tag{2.9d} \]
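The elimination leading to (2.9a) and (2.9c) can be tried out on a small example. The following is a minimal sketch, not taken from the paper: the matrices are hypothetical toy data with $m = 3$, $V_M = \operatorname{diag}(1, -2, 0)$ and $L_M = 3I - \mathbf{1}\mathbf{1}^{\top}$, chosen only so that $L_M$ is symmetric and nonnegative with null space spanned by $(1,1,1)^{\top}$ and [S] holds. The names `matmul`, `matsub`, `neg`, `A01`, `B11` and `n_M` are illustrative. Blocks are written in coordinates adapted to $\mathbb{R}^m = \mathcal{N} \oplus \mathcal{R}$ (index 0 for $\mathcal{N}$, index 1 for $\mathcal{R}$).

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    """Entrywise difference X - Y."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neg(X):
    """Entrywise negation."""
    return [[-x for x in row] for row in X]

# Hypothetical toy data (an assumption, not a model from the paper).
V11 = [[Fr(1), Fr(0)], [Fr(0), Fr(-2)]]   # P1 V_M P1 on R
L00 = [[Fr(2)]]                           # P0 L_M P0 on N
L01 = [[Fr(-1), Fr(-1)]]                  # P0 L_M P1
L10 = [[Fr(-1)], [Fr(-1)]]                # P1 L_M P0
L11 = [[Fr(2), Fr(-1)], [Fr(-1), Fr(2)]]  # P1 L_M P1

L00_inv = [[1 / L00[0][0]]]                                # 1x1 inverse
V11_inv = [[1 / V11[0][0], Fr(0)], [Fr(0), 1 / V11[1][1]]]  # diagonal inverse

# (2.9a): the nontrivial block of A, mapping R into N.
A01 = neg(matmul(L00_inv, L01))

# (2.9c): the nontrivial block of B, built from the Schur complement of L00.
schur = matsub(L11, matmul(matmul(L10, L00_inv), L01))
B11 = neg(matmul(V11_inv, schur))

trace = B11[0][0] + B11[1][1]
det = B11[0][0] * B11[1][1] - B11[0][1] * B11[1][0]

# V_M (1,1,1)^T restricted to the R-coordinates spans N_M in this toy case.
n_M = [Fr(1), Fr(-2)]
# Inner product of each column of B11 with n_M.
orth = [B11[0][j] * n_M[0] + B11[1][j] * n_M[1] for j in range(2)]

print("A01 =", A01)
print("B11 =", B11)
print("trace, det =", trace, det)
print("columns of B11 against n_M:", orth)
```

In this toy case the vanishing entries of `orth` say that the range of the $\mathcal{R}$-block of $B$ is orthogonal to $\mathcal{N}_M$, and the trace and determinant give the eigenvalues of that block as the roots of $X^2 - (\text{trace})X + \det$.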

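Here, the above operators are regarded as ones over $\mathbb{R}^m$. Conversely, we assume that the vectors $\phi^0$, $\phi^1$, $\psi^1$ and $\eta$ satisfy (2.8). Thus $\phi^0 \in \mathcal{N}$ and $\psi^1 \in \mathcal{R}$. Then put

\[ \phi = \phi^0 + \phi^1, \quad \psi = \psi^0 + \psi^1, \quad \text{with arbitrarily fixed } \psi^0 \in \mathcal{N}. \tag{2.10} \]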

By straightforward calculations, we find that $\phi$, $\psi$ and $\eta$ satisfy (2.4). We summarize the above observations in the following lemma.

Lemma 2.2. Suppose that the stability condition [S] holds. If $\phi$, $\psi$ and $\eta \in \mathbb{R}^m$ satisfy (2.4), then $\phi^0$, $\phi^1$, $\psi^0$ and $\psi^1$, defined in (2.5), together with $\eta$ satisfy (2.8). Conversely, if $\phi^0$, $\phi^1$, $\psi^1$ and $\eta \in \mathbb{R}^m$ satisfy (2.8), then $\phi$ and $\psi$, defined in (2.10), together with $\eta$ satisfy (2.4).

By applying Lemma 2.2, we show the following lemma.

Lemma 2.3. Suppose that the stability condition [S] holds. Then the operators $B$ and $G$, defined by (2.9c) and (2.9d), satisfy

\[ B : \mathbb{R}^m \to \mathcal{N}_M^{\perp} \cap \mathcal{R}, \tag{2.11} \]
\[ G : \mathcal{M}^{\perp} \to \mathcal{N}_M^{\perp} \cap \mathcal{R}. \tag{2.12} \]

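Proof. For arbitrary $\phi^1$ in $\mathbb{R}^m$, we put $\psi^1 = B\phi^1$. By (2.9c), it is apparent that $\psi^1 \in \mathcal{R}$. We define $\phi^0$ by (2.8a) with $\eta = 0$. Then $\phi^0$, $\phi^1$ and $\psi^1$ satisfy (2.8) with $\eta = 0$. By applying Lemma 2.2, $\psi$ and $\phi$, defined by (2.10) with $\psi^0 = 0$, satisfy (2.4) with $\eta = 0$, i.e.,

\[ V_M\psi^1 + L_M\phi = 0. \]

Thus $V_M\psi^1 = -L_M\phi \in \mathcal{M}^{\perp}$. This implies $\psi^1 \in \mathcal{N}_M^{\perp}$. Consequently, we have $\psi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Next, let $\psi^1 = G\eta$ for arbitrary $\eta \in \mathcal{M}^{\perp}$. Then $\psi^1 \in \mathcal{R}$ by (2.9d). Define $\phi^0$ by (2.8a) together with $\phi^1 = 0$. Then $\phi^0$, $\phi^1 = 0$, $\psi^1$ and $\eta$ satisfy (2.8). Thus by Lemma 2.2, $\psi$ and $\phi$, defined by (2.10) with $\psi^0 = 0$, satisfy (2.4), i.e.,

\[ V_M\psi^1 + L_M\phi^0 = \eta. \]

Since (1.21) holds, $V_M\psi^1 = \eta - L_M\phi^0 \in \mathcal{M}^{\perp}$, that is, $\psi^1 \in V_M^{-1}(\mathcal{M}^{\perp}) = \mathcal{N}_M^{\perp}$. Thus we have proved $\psi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$.

Let $B_1$ be the restriction of $B$ to $\mathcal{N}_M^{\perp} \cap \mathcal{R}$. By (2.11), apparently we have

\[ B_1 = B\big|_{\mathcal{N}_M^{\perp} \cap \mathcal{R}} : \mathcal{N}_M^{\perp} \cap \mathcal{R} \to \mathcal{N}_M^{\perp} \cap \mathcal{R}. \tag{2.13} \]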

Let $\sigma(B_1)$ denote the spectrum of $B_1$:

\[ \sigma(B_1) = \{\mu \in \mathbb{C};\ B\varphi = \mu\varphi \ \text{for some } \varphi \in \mathcal{N}_M^{\perp} \cap \mathcal{R},\ \varphi \ne 0\}. \tag{2.14} \]

The characteristic polynomial of $B_1$ is characterized by that of (2.1) in the following lemma.

Lemma 2.4. Suppose that the stability condition [S] holds. Then the characteristic polynomial of $B_1$ in an indeterminate $X$ satisfies

\[ \det(X - B_1) = c_0^{-1} X^{-d} \det(XV_M + L_M), \tag{2.15} \]

where $c_0$ is a non-zero constant (given in (2.20)) and $d = \dim \mathcal{M}$.

Proof. We decompose $V_M$ and $L_M$ corresponding to $\mathbb{R}^m = \mathcal{N} \oplus \mathcal{R}$. The matrix representations of $V_M$ and $L_M$ in a suitably chosen coordinate system are given by

\[ V_M = \begin{pmatrix} O & O \\ O & V_{11} \end{pmatrix}, \quad L_M = \begin{pmatrix} L_{00} & L_{01} \\ L_{10} & L_{11} \end{pmatrix}, \tag{2.16} \]

where $V_{11} = \operatorname{diag}(\nu_i v_i M_i)$ ($i \in \Lambda_+ \cup \Lambda_-$) and $L_{kl}$ ($k, l = 0, 1$) are constant matrices such that $L_{00}$ is real symmetric and positive definite by (1.33), $L_{11}$ is real symmetric and $L_{01} = L_{10}^{\top}$. The operator $B$ defined in (2.9c) is expressed in these coordinates as

\[ B = \begin{pmatrix} O & O \\ O & B_{11} \end{pmatrix}, \quad \text{where } B_{11} = -V_{11}^{-1}(L_{11} - L_{10}L_{00}^{-1}L_{01}). \tag{2.17} \]

By (2.16), we have

\[ XV_M + L_M = \begin{pmatrix} L_{00} & L_{01} \\ L_{10} & XV_{11} + L_{11} \end{pmatrix}. \tag{2.18} \]

Multiplying (2.18) from the right by

\[ \begin{pmatrix} L_{00}^{-1} & -L_{00}^{-1}L_{01} \\ O & I \end{pmatrix}, \]

where $I$ is the unit matrix, and taking the determinant of the resulting equality, we obtain that

\[ \det(XV_M + L_M) = c_0 \det(X - B_{11}), \tag{2.19} \]

where $c_0$ is the non-zero constant given by

\[ c_0 = \det(L_{00})\det(V_{11}). \tag{2.20} \]

Let us note that $B_{11}$ is the restriction of $B$ to $\mathcal{R}$ and $B_1$ is the restriction of $B$ to $\mathcal{N}_M^{\perp} \cap \mathcal{R}$. Also, note that $\mathcal{N}_M^{\perp} \cap \mathcal{R}$ is an invariant subspace of $B$ owing to (2.11). Therefore, we have the matrix representation of $B_{11}$ in a suitably chosen coordinate system corresponding to the decomposition (1.30):

\[ B_{11} = \begin{pmatrix} O & O \\ * & B_1 \end{pmatrix}. \tag{2.21} \]

The matrix representation (2.21) together with (1.30) and (1.34) yields that

\[ \det(X - B_{11}) = X^d \det(X - B_1). \tag{2.22} \]

By combining (2.19) with (2.22), we obtain (2.15). □
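As a sanity check, the identities (2.19), (2.22) and (2.15) can be verified by hand on a small example. The matrices below are hypothetical toy data (an assumption made only for illustration, not one of the models treated in this paper), chosen to have the structural properties (1.19) to (1.21) and [S] with $m = 3$ and $d = 1$.

```latex
% Toy data (assumption): \Lambda_+ = \{1\}, \Lambda_- = \{2\}, \Lambda_0 = \{3\}.
\[
V_M = \operatorname{diag}(1,-2,0), \qquad
L_M = \begin{pmatrix} 2 & -1 & -1\\ -1 & 2 & -1\\ -1 & -1 & 2 \end{pmatrix}, \qquad
\mathcal{M} = N(L_M) = \operatorname{span}\{(1,1,1)^{\top}\}.
\]
% Generalized characteristic polynomial, expanded along the first row:
\[
\det(XV_M + L_M)
= \det\begin{pmatrix} 2+X & -1 & -1\\ -1 & 2-2X & -1\\ -1 & -1 & 2 \end{pmatrix}
= (2+X)(3-4X) - 3 - (3-2X)
= -X(4X+3).
\]
% Blocks with respect to N = span{e_3}, R = span{e_1, e_2}:
% L_{00} = 2, V_{11} = diag(1,-2), hence c_0 = det(L_{00}) det(V_{11}) = -4, and
\[
B_{11} = -V_{11}^{-1}\bigl(L_{11} - L_{10}L_{00}^{-1}L_{01}\bigr)
= \begin{pmatrix} -\tfrac32 & \tfrac32\\[2pt] -\tfrac34 & \tfrac34 \end{pmatrix},
\qquad
\det(X - B_{11}) = X\bigl(X + \tfrac34\bigr).
\]
% (2.19): c_0 det(X - B_{11}) = -4X(X + 3/4) = -X(4X+3).
% (2.22) with d = 1 and (2.15):
\[
\det(X - B_1) = X + \tfrac34 = c_0^{-1} X^{-1} \det(XV_M + L_M).
\]
```

In this toy case the admissible values are $0$ and $-\tfrac34$, and the single eigenvalue of $B_1$ is $-\tfrac34$.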


Applying the above lemma, we have the following.

Lemma 2.5. Suppose that the stability condition [S] holds. Then we have

\[ \sigma(B_1) \setminus \{0\} = \Sigma \setminus \{0\}, \tag{2.23} \]

with the multiplicity of non-zero (generalized) eigenvalues. In particular,

\[ \sigma(B_1) \subset \mathbb{R}. \tag{2.24} \]

Proof. The first statement (2.23) follows immediately from Lemma 2.4. The second assertion (2.24) follows immediately from (2.3) and (2.23). □

By Lemma 2.5, every eigenvalue of the linear operator $B_1$ is real. Therefore we can decompose the subspace $\mathcal{N}_M^{\perp} \cap \mathcal{R}$ as follows:

\[ \mathcal{N}_M^{\perp} \cap \mathcal{R} = E^{-} \dot{+} E^{+}, \tag{2.25} \]

where $E^{-}$ and $E^{+}$ are the eigen subspaces corresponding to the negative eigenvalues and the nonnegative eigenvalues of $B_1$, respectively. The dimension of $E^{-}$ is characterized by the following proposition, which follows from Lemmas 2.4 and 2.5.

Proposition 2.6. Suppose that the stability condition [S] holds. Then we have

\[ \dim E^{-} = \#\{\mu < 0;\ \det(\mu V_M + L_M) = 0\}, \tag{2.26} \]

counting the multiplicity of generalized eigenvalues on the right-hand side.

2.2. A sufficient condition for [S]. The following condition concerning the Maxwellian state $M$ is proposed in [12]:

\[ \mathcal{M} \cap \mathcal{N}_M^{\perp} = \{0\}. \tag{2.27} \]

Although our theory does not need to assume (2.27), it gives information on the operator $B_1$ defined by (2.13). Therefore, we summarize some consequences of (2.27). First, let us note that (2.27) implies [S], since $\mathcal{N} \subset \mathcal{N}_M^{\perp}$. Also, (2.27) is equivalent to the condition that

\[ \mathbb{R}^m = \mathcal{M} \dot{+} \mathcal{N}_M^{\perp}, \tag{2.28} \]

where $\dot{+}$ means a direct sum. Actually, (2.28) is shown by computing the dimensions of the subspaces $\mathcal{M}$ and $\mathcal{N}_M^{\perp}$ using (1.34) and (2.27). The following lemma, which gives a characterization of Condition (2.27), is proved in [12].

Lemma 2.7. Condition (2.27) is equivalent to

\[ \det(H) \ne 0, \quad \text{where } H = \bigl(\langle V_M\phi_i, \phi_j\rangle\bigr)_{i,j=1,\ldots,d}. \tag{2.29} \]

We have proved that the eigenvalues of $B_1$ are characterized by (2.15) as a consequence of the stability condition [S]. That condition, however, does not necessarily mean that $B_1$ is invertible. In fact, invertibility requires assuming (2.27).

Lemma 2.8. Condition (2.27) holds if and only if $B_1$ is nonsingular.


Proof. We suppose that (2.27) holds. Let $B_1\phi^1 = 0$ for a certain $\phi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Then put $\phi^0 = A\phi^1$, $\psi^1 = 0$ and $\eta = 0$. It is easy to see that these $\phi^0$, $\phi^1$, $\psi^1$ and $\eta$ satisfy (2.8). Then by Lemma 2.2, $\phi = \phi^0 + \phi^1$ and $\psi = \psi^1 = 0$, $\eta = 0$ satisfy (2.4), i.e., $L_M\phi = 0$. Thus $\phi = \phi^0 + \phi^1 \in N(L_M) = \mathcal{M}$. On the other hand, since $\phi^0 \in \mathcal{N}$ owing to (2.8a) and $\phi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$, we have $\phi = \phi^0 + \phi^1 \in \mathcal{N}_M^{\perp}$ by (1.29). By Assumption (2.27), it turns out that $\phi = \phi^0 + \phi^1 = 0$. By (1.29) again, we get that $\phi^0 = 0$ and $\phi^1 = 0$.

Next, let (2.27) not hold. Then there exists a non-zero vector $\phi \in \mathcal{M} \cap \mathcal{N}_M^{\perp}$. This $\phi$ satisfies $L_M\phi = 0$. Putting $\psi = 0$ and $\eta = 0$, we see that these $\phi$, $\psi$ and $\eta$ satisfy (2.4). By Lemma 2.2, $\phi^0$ and $\phi^1$ defined by (2.5) satisfy (2.8) with $\psi^1 = 0$ and $\eta = 0$, i.e.,

\[ \phi^0 = A\phi^1, \tag{2.30a} \]
\[ 0 = B\phi^1. \tag{2.30b} \]

Since $\phi^0 \in \mathcal{N}$ and $\phi \in \mathcal{N}_M^{\perp}$, $V_M\phi^1 = V_M\phi \in \mathcal{M}^{\perp}$. This implies $\phi^1 \in \mathcal{N}_M^{\perp}$. As $\phi^1 \in \mathcal{R}$, we obtain $\phi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Now, suppose that $\phi^1 = 0$. Then from (2.30a), we get that $\phi^0 = 0$. Therefore, $\phi = \phi^0 + \phi^1 = 0$. This contradicts the definition of $\phi$. Thus we have obtained that $\phi^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$ and $\phi^1 \ne 0$. Consequently, $B_1\phi^1 = B\phi^1 = 0$ by (2.30b). Apparently, this means that $B_1$ is singular. □

Immediately from Lemmas 2.4 and 2.8, we have the following proposition.

Proposition 2.9. Suppose that Condition (2.27) holds. Then we have

\[ \sigma(B_1) = \Sigma \setminus \{0\} \tag{2.31} \]

with the multiplicity of (generalized) eigenvalues.

3. Linearized Problem

Here and hereafter, the stability condition [S] is always assumed. The decompositions derived as a consequence of [S] in the previous section become essential tools in solving the stationary problem.

3.1. Decomposition of the system. In this section, we show the existence of a solution to the linearized system

\[ V_M f_x + L_M f = h, \quad x > 0, \tag{3.1} \]

with boundary conditions

\[ R^{+}f(0) = b, \tag{3.2} \]
\[ f(x) \to 0 \quad \text{as } x \to \infty, \tag{3.3} \]

which are the same as (1.23), (1.24). In (3.1), the function $h(x)$ is assumed to satisfy

\[ h(x) \in \mathcal{M}^{\perp} \ \text{for } x > 0, \quad h(x) \in \mathcal{B}^0[0,\infty), \quad h(x) \to 0 \ \text{as } x \to \infty. \tag{3.4} \]


Moreover, if $B_1$ is singular, it is necessary to assume that

\[ h(x) \in L^1_{\alpha}[0,\infty) \quad \text{for } \alpha = m_0 - 1, \tag{3.5} \]

where $m_0$ is the multiplicity of zero as a root of the minimal polynomial of the matrix $B_1$. If $B_1$ is nonsingular, we formally define $m_0 = 0$ and $\alpha = -1$, though we do not need (3.5). Let $f(x)$ satisfy the linear problem (3.1), (3.2) and (3.3). Then by the same computation as the one in deriving (1.12), together with (1.16), (1.20), (3.4) and (3.5), we find that $f(x)$ also satisfies

\[ f(x) \in \mathcal{N}_M^{\perp} \quad \text{for } x > 0. \tag{3.6} \]

Thus it is also required to assume the consistency condition

\[ b \in R^{+}\mathcal{N}_M^{\perp}. \tag{3.7} \]

Since $f_x$, $f$ and $h$ satisfy the algebraic equations (2.4) due to (3.1), we apply Lemma 2.2 and obtain that

\[ f^0 = Af^1 + Jh, \tag{3.8a} \]
\[ f^1_x = Bf^1 + Gh, \tag{3.8b} \]

where the matrices $A$, $B$, $J$ and $G$ are given in (2.9). Moreover,

\[ f = f^0 + f^1, \quad f^0 = P_0 f \in \mathcal{N}, \quad f^1 = P_1 f \in \mathcal{N}_M^{\perp} \cap \mathcal{R}. \tag{3.9} \]

Here, the operators $P_0$ and $P_1$ are the projections from $\mathbb{R}^m$ onto $\mathcal{N}$ and $\mathcal{R}$, respectively. Once we find the function $f^1$ satisfying (3.8b), $f^0$ is expressed by (3.8a), and so is $f$:

\[ f = (I_m + A)f^1 + Jh, \tag{3.10} \]

where $I_m$ is the $m$th unit matrix. Apparently, from (3.8), $P_1 f(x)$ should be a $C^1$-function, whereas $P_0 f(x)$ is only a $C^0$-function. Therefore, we define a solution to problem (3.1), (3.2) and (3.3) as follows.

Definition 3.1. A function $f(x) \in \mathcal{B}^0[0,\infty)$ is called a solution to the linear stationary problem (3.1), (3.2) and (3.3) if $f(x)$ satisfies (3.1), (3.2) and (3.3) as well as $P_1 f(x) \in \mathcal{B}^1[0,\infty)$.

3.2. Solution formula. First, we consider the initial value problem for (3.1), i.e., (3.8), with the data

\[ f(0) = f_0 \in \mathcal{N}_M^{\perp}, \tag{3.11} \]


It is enough to seek a function $f^1$ satisfying (3.8b), owing to (3.10). So, for clarity, let us state the problem for $f^1$ before solving the initial value problem. Since $f^1 \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$, we can rewrite (3.8b) as

\[ f^1_x = B_1 f^1 + Gh. \tag{3.12} \]

Also, the boundary conditions (3.2) and (3.3) are rewritten as

\[ R^{+}f^1(0) = b, \tag{3.13} \]
\[ f^1(x) \to 0 \quad \text{as } x \to \infty. \tag{3.14} \]

The boundary condition (3.13) follows from (3.2) because $R^{+}f^1 = R^{+}(f - f^0) = R^{+}f$. Also, $h(x)$ is required to satisfy (3.4) and (3.5). Then we treat the initial value problem (3.12) with the data $f^1(0) = P_1 f_0$. Let $P^{\pm}$ be the projections onto $E^{\pm}$ and apply $P^{\pm}$ to (3.12), respectively. The result is

\[ f^{\pm}_x = B^{\pm}f^{\pm} + P^{\pm}Gh, \tag{3.15} \]

where $B^{\pm} = B_1\big|_{E^{\pm}}$ and $f^{\pm} = P^{\pm}f^1$.

By solving (3.15), we have

\[ f^{+}(x) = e^{B^{+}x}f_0^{+} + \int_0^x e^{B^{+}(x-y)}P^{+}Gh(y)\,dy, \tag{3.16a} \]
\[ f^{-}(x) = e^{B^{-}x}f_0^{-} + \int_0^x e^{B^{-}(x-y)}P^{-}Gh(y)\,dy, \tag{3.16b} \]

where $f_0^{\pm} = P^{\pm}P_1 f_0$. Next, we look for the condition under which (3.13) and (3.14) hold. From (3.16a), we have

\[ f^{+}(x) = e^{B^{+}x}\,\underbrace{\Bigl(f_0^{+} + \int_0^x e^{-B^{+}y}P^{+}Gh(y)\,dy\Bigr)}_{T}. \tag{3.17} \]

Since $f^{+}(x)$ satisfies (3.14), it is necessary that $T \to 0$ as $x \to \infty$. This determines $f_0^{+}$. Substituting this in (3.16a) yields

\[ f^{+}(x) = -\int_x^{\infty} e^{B^{+}(x-y)}P^{+}Gh(y)\,dy. \tag{3.18} \]
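To see in the simplest setting why the formula (3.18) singles out the decaying solution, consider a scalar sketch. It is assumed here, purely for illustration, that $E^{+}$ is one-dimensional with $B^{+} = \mu > 0$ and that the forcing is $P^{+}Gh(y) = e^{-\kappa y}$ with a constant $\kappa > 0$; neither $\mu$ nor $\kappa$ is notation from the paper.

```latex
% Scalar instance of (3.18) under the assumptions B^+ = \mu > 0, P^+ G h(y) = e^{-\kappa y}:
\[
f^{+}(x) = -\int_x^{\infty} e^{\mu(x-y)}\,e^{-\kappa y}\,dy
         = -e^{\mu x}\,\frac{e^{-(\mu+\kappa)x}}{\mu+\kappa}
         = -\frac{e^{-\kappa x}}{\mu+\kappa}.
\]
% It solves the scalar version of (3.15) and decays:
\[
f^{+}_x = \frac{\kappa\,e^{-\kappa x}}{\mu+\kappa}
        = \mu f^{+}(x) + e^{-\kappa x},
\qquad
f^{+}(x) \to 0 \quad (x \to \infty).
\]
% Every other solution of f_x = \mu f + e^{-\kappa x} differs from this one by
% c\,e^{\mu x} with c \neq 0, which does not tend to zero as x \to \infty.
```

In this scalar case the initial value is forced to be $f_0^{+} = -1/(\mu+\kappa)$, which is the scalar analogue of the statement that the requirement $T \to 0$ determines $f_0^{+}$.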

By using (3.16b) and (3.18), we obtain

\[ f^1(x) = (K_1 f_0^{-})(x) + (K_2 h)(x), \tag{3.19} \]

where

\[ (K_1 f_0^{-})(x) = P^{-}e^{B^{-}x}P^{-}f_0^{-}, \tag{3.20} \]
\[ (K_2 h)(x) = \int_0^x P^{-}e^{B^{-}(x-y)}P^{-}Gh(y)\,dy - \int_x^{\infty} P^{+}e^{B^{+}(x-y)}P^{+}Gh(y)\,dy. \tag{3.21} \]


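Apparently, $(K_1 f_0^{-})(x) \in E^{-}$ for $f_0^{-} \in E^{-}$ and $(K_2 h)(x) \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$ for $h(x) \in \mathcal{M}^{\perp}$, since $Gh(x) \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Now we introduce a linear operator $\pi$ as

\[ \pi f_0^{-} = R^{+}(K_1 f_0^{-})(0) = R^{+}P^{-}f_0^{-} \quad \text{for } f_0^{-} \in \mathcal{N}_M^{\perp} \cap \mathcal{R}. \tag{3.22} \]

Lemma 3.2. Suppose the stability condition [S] holds. Then the operator $\pi : E^{-} \to R^{+}\mathcal{N}_M^{\perp}$ is an injection.

Proof. Let $f_0^{-} \in E^{-}$. Then $(K_1 f_0^{-})(x) \in E^{-} \subset \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Thus,

\[ R^{+}(K_1 f_0^{-})(x) \in R^{+}(\mathcal{N}_M^{\perp} \cap \mathcal{R}) = R^{+}\mathcal{N}_M^{\perp} \quad \text{for all } x \ge 0. \]

Putting $x = 0$, we have $\pi f_0^{-} \in R^{+}\mathcal{N}_M^{\perp}$. Next, we show that $\pi$ is an injection. Let $\pi f_0^{-} = 0$ for a certain $f_0^{-} \in E^{-}$. Then we put

\[ f^1(x) = (K_1 f_0^{-})(x) = P^{-}e^{B^{-}x}P^{-}f_0^{-}. \tag{3.23} \]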

We have, by direct calculations using (3.22) and (3.23), that

\[ f^1_x = B^{-}f^1, \tag{3.24} \]
\[ f^1(x) \to 0 \quad \text{as } x \to \infty, \tag{3.25} \]
\[ R^{+}f^1(0) = R^{+}(K_1 f_0^{-})(0) = \pi f_0^{-} = 0. \tag{3.26} \]

Since $f^1 \in E^{-} \subset \mathcal{N}_M^{\perp} \cap \mathcal{R}$, (3.24) means that

\[ f^1_x = Bf^1. \tag{3.27} \]

Then we define $f^0$ by

\[ f^0 = Af^1. \tag{3.28} \]

Equations (3.27) and (3.28) imply that $f^0$, $f^1$ and $f^1_x$ satisfy the algebraic equations (2.8) with zero in place of $\eta$. Therefore, applying Lemma 2.2, we obtain that, for $f = f^0 + f^1$,

\[ V_M f_x + L_M f = 0. \tag{3.29} \]

Also, (3.25), (3.28) and (3.26) mean that

\[ f(x) = f^0(x) + f^1(x) \to 0 \quad \text{as } x \to \infty, \tag{3.30} \]
\[ R^{+}f(0) = R^{+}f^1(0) = 0. \tag{3.31} \]

Taking the inner product of (3.29) with $f$ and integrating the resulting equation over $x \in [0,\infty)$ using the asymptotic condition (3.30), we obtain that

\[ \frac12\sum_{i\in\Lambda_-}\nu_i|v_i|M_i f_i(0)^2 + \int_0^{\infty}\langle L_M f, f\rangle\,dx = \frac12\sum_{i\in\Lambda_+}\nu_i|v_i|M_i f_i(0)^2, \tag{3.32} \]


where each $f_i(0)$ is an entry of $f(0)$. The right-hand side of (3.32) is zero by (3.31). Also, the integrand on the left-hand side is nonnegative because of (1.19). Therefore, we get that $R^{-}f(0) = 0$. Combining this result with (3.31) yields that $f(0) = 0$ on $\mathcal{R}$. Thus $f^1(0) = 0$ on $\mathcal{R}$, since $f^1 = f$ there. Consequently, $f_0^{-} = 0$ because of (3.23). □

If Condition [A] holds, then Proposition 2.6 implies that

\[ \dim R^{+}\mathcal{N}_M^{\perp} = \dim E^{-}. \tag{3.33} \]

Thus we get the following proposition from Lemma 3.2.

Proposition 3.3. Suppose that the stability condition [S] and Condition [A] hold. Then the operator $\pi : E^{-} \to R^{+}\mathcal{N}_M^{\perp}$ is an isomorphism.

Put $x = 0$ in (3.19), apply the restriction operator $R^{+}$ to the resulting equality and use (3.13), to obtain that

\[ \pi f_0^{-} = b - R^{+}(K_2 h)(0). \tag{3.34} \]

Thus for arbitrary boundary data $b \in R^{+}\mathcal{N}_M^{\perp}$, (3.34) is solvable because $\pi$ has an inverse operator $\pi^{-1}$ owing to Proposition 3.3:

\[ f_0^{-} = \pi^{-1}\bigl(b - R^{+}(K_2 h)(0)\bigr). \tag{3.35} \]

Substituting (3.35) in (3.19) yields that

\[ f^1(x) = (K_3 b)(x) + (K_4 h)(x), \tag{3.36} \]

where the operators $K_3$ and $K_4$ are defined as

\[ (K_3 b)(x) = P^{-}e^{B^{-}x}P^{-}\pi^{-1}b, \tag{3.37} \]
\[ (K_4 h)(x) = (K_2 h)(x) - \bigl(K_3 R^{+}(K_2 h)(0)\bigr)(x). \tag{3.38} \]

Thus, using (3.10), we have the solution formula for the linearized problem (3.1), (3.2), (3.3):

\[ f(x) = (I_m + A)(K_3 b)(x) + \bigl[(I_m + A)(K_4 h)(x) + Jh(x)\bigr], \tag{3.39} \]

where the constant matrices $A$, $J$ are given in (2.9) and $I_m$ is the unit matrix on $\mathbb{R}^m$. Summarizing the previous considerations in this subsection, we have reached the following theorem.

Theorem 3.4. Suppose that Conditions [S] and [A] hold. Let the function $h(x)$ satisfy conditions (3.4) and (3.5). Then the linear stationary problem (3.1), (3.2) and (3.3) has a unique solution $f(x)$ in the sense of Definition 3.1. Moreover, the solution $f(x)$ is given by the explicit formula (3.39).


3.3. Estimates of solution formula. From (3.39), we derive estimates for the solution to the linearized problem in preparation for constructing a solution to the nonlinear problem in Sect. 4. To this end, we employ the supremum norm defined by (1.35). Here, we define $\tilde{\sigma}$ as follows:

\[ \tilde{\sigma} = \min\{\mu \ge 0;\ \det(\mu V_M + L_M) = 0\}. \]

By (2.15), it turns out that $\bar{\sigma}$, defined in (1.36), is the minimum eigenvalue of $-B^{-}$ and $\tilde{\sigma}$ is the minimum eigenvalue of $B^{+}$. Because $B_1$ is not necessarily nonsingular, $\tilde{\sigma}$ is possibly zero. The reason that we employ the weighted supremum norm (1.35) is to obtain the exponential decay of the stationary solution as $x \to \infty$. This fact is essential to prove the stability theory in [10] and [11]. First, we prove the following estimates.

Lemma 3.5.

\[ |(K_3 b)(x)| \le C|b|e^{-\sigma x}, \tag{3.40} \]
\[ |(K_4 h)(x)| \le C\|h\|_{2\sigma}e^{-\sigma x}. \tag{3.41} \]

Proof. Apparently (3.40) follows from (3.37). By estimating (3.21), we have that for arbitrarily small $\varepsilon > 0$,

\[ |(K_2 h)(x)| \le C\|h\|_{2\sigma}\Bigl(\underbrace{\int_0^x e^{-\sigma(x-y)}e^{-2\sigma y}\,dy}_{T_1} + \underbrace{\int_x^{\infty} e^{(\tilde{\sigma}-\varepsilon)(x-y)}e^{-2\sigma y}\,dy}_{T_2}\Bigr). \tag{3.42} \]

The first term $T_1$ on the right-hand side of (3.42) is estimated as follows:

\[ T_1 \le e^{-\sigma x}\int_0^x e^{-\sigma y}\,dy \le Ce^{-\sigma x}. \tag{3.43} \]

Also, the second term $T_2$ is estimated as

\[ T_2 \le e^{(\tilde{\sigma}-\varepsilon)x}\int_x^{\infty} e^{-(2\sigma+\tilde{\sigma}-\varepsilon)y}\,dy \le Ce^{-2\sigma x}. \tag{3.44} \]
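For the two integrals as displayed in (3.42), the constants in (3.43) and (3.44) can be written out explicitly. The following worked computation is a sketch under the assumption that $\varepsilon > 0$ is small enough that $2\sigma + \tilde{\sigma} - \varepsilon > 0$, which is possible since $\sigma > 0$ and $\tilde{\sigma} \ge 0$.

```latex
% T_1: combine the exponentials, then integrate.
\[
T_1 = \int_0^x e^{-\sigma(x-y)}e^{-2\sigma y}\,dy
    = e^{-\sigma x}\int_0^x e^{-\sigma y}\,dy
    = \frac{e^{-\sigma x}\bigl(1 - e^{-\sigma x}\bigr)}{\sigma}
    \le \frac{1}{\sigma}\,e^{-\sigma x}.
\]
% T_2: the integrand decays in y because 2\sigma + \tilde\sigma - \varepsilon > 0.
\[
T_2 = e^{(\tilde{\sigma}-\varepsilon)x}\int_x^{\infty} e^{-(2\sigma+\tilde{\sigma}-\varepsilon)y}\,dy
    = \frac{e^{-2\sigma x}}{2\sigma+\tilde{\sigma}-\varepsilon}
    \le \frac{1}{2\sigma+\tilde{\sigma}-\varepsilon}\,e^{-\sigma x}.
\]
```

Both terms are therefore bounded by a constant multiple of $e^{-\sigma x}$, with admissible constants $1/\sigma$ and $1/(2\sigma+\tilde{\sigma}-\varepsilon)$ for these two integrals.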

Now, combining (3.42) with (3.43) and (3.44), we have that

\[ |(K_2 h)(x)| \le C\|h\|_{2\sigma}e^{-\sigma x}. \tag{3.45} \]

Substituting $x = 0$ in (3.45) yields that

\[ |(K_2 h)(0)| \le C\|h\|_{2\sigma}. \tag{3.46} \]

Estimating (3.38) by applying (3.40), (3.45) and (3.46) gives (3.41). □


4. Nonlinear Problem

We solve the nonlinear problem (1.15), (1.23), (1.24). If the function $f$ satisfies (1.15), then $f$, $f_x$ and $\Gamma_M(f)$ satisfy (2.4). Thus by Lemma 2.2, we can decompose the differential equation (1.15) as

\[ f^0 = Af^1 + J\Gamma_M(f), \tag{4.1a} \]
\[ f^1_x = Bf^1 + G\Gamma_M(f), \tag{4.1b} \]

where $f(x) = f^0(x) + f^1(x)$, $f^0(x) = P_0 f(x) \in \mathcal{N}$, $f^1(x) = P_1 f(x) \in \mathcal{N}_M^{\perp} \cap \mathcal{R}$. Also, $A$, $J$, $B$ and $G$ are given in (2.9). Then $f(x)$ is expressed by the formula

\[ f(x) = \Phi(f)(x) := (I_m + A)(K_3 b)(x) + \bigl[(I_m + A)(K_4\Gamma_M(f))(x) + J\Gamma_M(f)(x)\bigr], \tag{4.2} \]

owing to (3.39). We show the existence of the solution to (1.15), (1.23) and (1.24) by applying the contraction mapping principle to $\Phi(f)$. To this end, we introduce a Banach space $X$ and its closed convex subset $S_R$:

\[ X = \{f \in \mathcal{B}^0[0,\infty);\ \|f\|_{\sigma} < \infty\}, \tag{4.3} \]
\[ S_R = \{f \in X;\ \|f\|_{\sigma} \le R|b|\}, \tag{4.4} \]

where the norm $\|\cdot\|_{\sigma}$ is defined by (1.35) and $R$ is a positive constant to be determined later. In fact, we show that for a suitably chosen $R > 0$, $\Phi$ is a contraction mapping of $S_R$ into itself, provided that $|b|$ is small enough. Now, we show the following estimates.

Lemma 4.1. Let $f$ and $g$ belong to $X$. Then

\[ \|\Phi(0)\|_{\sigma} = \|(I_m + A)(K_3 b)\|_{\sigma} \le \bar{C}_1|b|, \tag{4.5} \]
\[ \|\Phi(f) - \Phi(g)\|_{\sigma} \le \bar{C}_2(1 + \|f\|_0 + \|g\|_0)^{r-2}(\|f\|_{\sigma} + \|g\|_{\sigma})\|f - g\|_{\sigma}, \tag{4.6} \]

where $\|\cdot\|_0$ denotes the standard supremum norm over $[0,\infty)$.

Proof. Equation (4.5) follows from (3.40) and (4.2). By the definition of $\Phi$ in (4.2), we have that

\[ \Phi(f) - \Phi(g) = (I_m + A)K_4\bigl(\Gamma_M(f) - \Gamma_M(g)\bigr) + J\bigl(\Gamma_M(f) - \Gamma_M(g)\bigr). \tag{4.7} \]

Therefore, evaluating the above equality with the norm $\|\cdot\|_{\sigma}$ yields that

\[ \|\Phi(f) - \Phi(g)\|_{\sigma} \le C\|\Gamma_M(f) - \Gamma_M(g)\|_{2\sigma}, \tag{4.8} \]

where we have used (3.41). Next we estimate the right-hand side of (4.8). By the definition of $\Gamma_M$,

\[ |\Gamma_M(f) - \Gamma_M(g)| \le C|f - g|\sum_{j=1}^{r-1}(|f| + |g|)^j \le C|f - g|(|f| + |g|)(1 + |f| + |g|)^{r-2}, \tag{4.9} \]


where $r \ge 2$ is the multiplicity of the collisions. Then, by evaluating (4.9) in the norm $\|\cdot\|_{2\sigma}$,

$$\|\Gamma_M(f) - \Gamma_M(g)\|_{2\sigma} \le C(1 + \|f\|_0 + \|g\|_0)^{r-2} (\|f\|_\sigma + \|g\|_\sigma)\,\|f - g\|_\sigma. \tag{4.10}$$

Thus the estimate (4.8) together with (4.10) yields (4.6). □

Applying the estimates (4.5) and (4.6), we show the existence theorem for the stationary problem.

Theorem 4.2. Suppose that Conditions [S] and [A] hold. Let the boundary data $b = (b_i)_{i \in \Lambda_+}$ satisfy the consistency condition (1.31). Then there exists a positive constant $\bar\beta_0$ such that if $\bar\beta \le \bar\beta_0$, where

$$\bar\beta = \sum_{i \in \Lambda_+} |b_i|,$$

the stationary problem (1.15), (1.23) and (1.24) has a unique solution $f = (f_i)_{i \in \Lambda}$ in $S_R$ with a suitably chosen $R > 0$. Furthermore, this solution $f(x)$ belongs to $C^\infty[0,\infty)$ and verifies the estimate

$$|\partial_x^k f(x)| \le C_k\,\bar\beta\,e^{-\sigma x} \quad \text{for } x \ge 0, \tag{4.11}$$

for each integer $k \ge 0$, where each $C_k$ is a positive constant depending on $k$, and $\sigma$ is the positive constant defined in (1.36).

Proof. We fix $R = 2\bar{C}_1$ and let $f \in S_R$. We estimate (4.2) by using (4.5) as well as (4.6) with $g = 0$. The result is

$$\|\Phi(f)\|_\sigma \le \bar{C}_1 |b| + \bar{C}_2 (1 + R|b|)^{r-2} R^2 |b|^2. \tag{4.12}$$

Now, choosing $|b|$ so small that $2\bar{C}_2 (1 + R|b|)^{r-2} R|b| \le 1$, we have

$$\|\Phi(f)\|_\sigma \le R|b|. \tag{4.13}$$

Thus $\Phi(f) \in S_R$. Let $f, g \in S_R$. If $|b|$ is so small that $4\bar{C}_2 (1 + R|b|)^{r-2} R|b| \le 1$, then we get from (4.6) that

$$\|\Phi(f) - \Phi(g)\|_\sigma \le \frac{1}{2}\,\|f - g\|_\sigma. \tag{4.14}$$

Thus $\Phi$ is a contraction mapping on $S_R$ with respect to the norm $\|\cdot\|_\sigma$. Therefore, there exists a fixed point $f \in S_R$ of $\Phi$. Evidently, the fixed point $f$ is the desired unique solution to the stationary problem (1.15), (1.23) and (1.24). As $f \in S_R$, $f$ satisfies (4.11) with $k = 0$:

$$|f(x)| \le C_0\,\bar\beta\,e^{-\sigma x}. \tag{4.15}$$
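The two smallness conditions used in the proof can be made concrete numerically. The following is a minimal sketch, not part of the original argument: the values passed for $\bar{C}_1$, $\bar{C}_2$ and $r$ are hypothetical, and the function names are illustrative only. It finds the largest $|b|$ allowed by the condition preceding (4.14) and evaluates the right-hand side of (4.12) against the radius $R|b|$ of (4.13).

```python
def contraction_threshold(c1_bar, c2_bar, r, iterations=200):
    """Largest |b| with 4*C2*(1 + R|b|)**(r-2) * R|b| <= 1, where R = 2*C1."""
    R = 2.0 * c1_bar

    def g(b):
        # Left-hand side of the smallness condition stated before (4.14);
        # it is increasing in b >= 0 because r >= 2.
        return 4.0 * c2_bar * (1.0 + R * b) ** (r - 2) * R * b

    lo, hi = 0.0, 1.0
    while g(hi) <= 1.0:           # enlarge the bracket until the condition fails
        hi *= 2.0
    for _ in range(iterations):   # bisection on the monotone function g
        mid = 0.5 * (lo + hi)
        if g(mid) <= 1.0:
            lo = mid
        else:
            hi = mid
    return lo


def self_map_bound(c1_bar, c2_bar, r, b):
    """Return (right-hand side of (4.12), radius R|b| of (4.13))."""
    R = 2.0 * c1_bar
    rhs_412 = c1_bar * b + c2_bar * (1.0 + R * b) ** (r - 2) * R**2 * b**2
    return rhs_412, R * b


# Hypothetical constants, chosen only for illustration.
b_max = contraction_threshold(1.0, 1.0, r=3)
bound, radius = self_map_bound(1.0, 1.0, 3, b_max)
```

With $R = 2\bar{C}_1$ the first term of (4.12) already equals $R|b|/2$, so the stronger condition used for (4.14) leaves room in (4.13).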

Next, we show the regularity $f \in C^\infty[0,\infty)$. Since $f$ satisfies (1.15), $f^1 = P_1 f$ satisfies (4.1b), i.e., $f^1_x = Bf^1 + G\,\Gamma_M(f) \in C^0[0,\infty)$. Thus $f^1 \in C^1[0,\infty)$. Also, (4.1a) gives the equivalent equation

$$\Psi(f^0, f^1) := f^0 - Af^1 - J\,\Gamma_M(f^0 + f^1) = 0, \tag{4.16}$$


where $f^0 = P_0 f$. We see easily that $\Psi(0,0) = 0$ and $D_0\Psi(0,0) = I$, where $D_0$ denotes the Fréchet derivative with respect to $f^0 \in \mathcal{N}$. Hence, by the implicit function theorem applied to (4.16), $f^0$ is expressed uniquely in terms of $f^1$. This expression combined with the fact that $f^1 \in C^1[0,\infty)$ implies that $f^0 \in C^1[0,\infty)$. Thus $f \in C^1[0,\infty)$. Repeating this argument, we obtain $f \in C^\infty[0,\infty)$. The above procedure combined with (4.15) also gives (4.11) for $k = 1, 2, \dots$. □

Remark 4.1. The uniqueness of the solution $f(x)$ in the above theorem can be proved more generally in a small neighborhood of zero in $\mathcal{B}^0[0,\infty) \cap L^1_\alpha[0,\infty)$. Moreover, if $B_1$ is nonsingular, i.e., if the condition (2.27) holds, the uniqueness is valid in a neighborhood of zero in $\mathcal{B}^0[0,\infty)$. These are proved by using the solution formula (4.2).

5. Applications

We study two concrete models of the Boltzmann type as an application of the theory developed in previous sections.

5.1. Cabannes’ 14-velocity model. In this subsection, we treat Cabannes’ 14-velocity model ([1,8]), which is written as the system of equations

$$\partial_t F_1 + \partial_x F_1 = \sigma_1 q_1(F) + \sigma_2 q_2(F), \tag{5.1a}$$

$$4(\partial_t F_2 + \partial_x F_2) = -\sigma_2 q_2(F), \tag{5.1b}$$

$$\partial_t F_3 - \partial_x F_3 = \sigma_1 q_1(F) - \sigma_2 q_2(F), \tag{5.1c}$$

$$4(\partial_t F_4 - \partial_x F_4) = \sigma_2 q_2(F), \tag{5.1d}$$

$$4\,\partial_t F_5 = -2\sigma_1 q_1(F), \tag{5.1e}$$

where $\sigma_1$ and $\sigma_2$ are positive constants and

$$q_1(F) = F_5^2 - F_1 F_3, \qquad q_2(F) = F_2 F_3 - F_1 F_4. \tag{5.2}$$

Thus, the Maxwellian is a vector $M = (M_i)_{i=1,\dots,5}$ satisfying

$$M_5^2 = M_1 M_3, \qquad M_2 M_3 = M_1 M_4, \qquad M_i > 0 \ \text{ for } i = 1, \dots, 5. \tag{5.3}$$

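It is easy to see that the space $\mathcal{M}$ of the collision invariants and its orthogonal complement $\mathcal{M}^\perp$ are given by

$$\mathcal{M} = \mathrm{span}\{\phi_1, \phi_2, \phi_3\}, \qquad \mathcal{M}^\perp = \mathrm{span}\{\phi_4, \phi_5\}, \tag{5.4}$$

where

$$\phi_1 = (2, 2, 0, 0, 1)^\top,\quad \phi_2 = (0, 0, 2, 2, 1)^\top,\quad \phi_3 = (0, 1, 0, 1, 0)^\top,\quad \phi_4 = (1, 0, 1, 0, -2)^\top,\quad \phi_5 = (1, -1, -1, 1, 0)^\top.$$

It is known that physically admissible collision invariants are given by

$$\varphi_1 = \tfrac{1}{2}(\phi_1 + \phi_2), \qquad \varphi_2 = \tfrac{1}{2}(\phi_1 - \phi_2), \qquad \varphi_3 = \tfrac{1}{2}(\phi_1 + \phi_2 + 4\phi_3), \tag{5.5}$$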


where $\varphi_1$, $\varphi_2$ and $\varphi_3$ correspond to the conservation of mass, momentum (in the $x$-direction) and energy, respectively. A stationary solution $F = (F_i)_{i=1,\dots,5}$ to the model (5.1) satisfies

$$\partial_x F_1 = \sigma_1 q_1(F) + \sigma_2 q_2(F), \tag{5.6a}$$

$$4\,\partial_x F_2 = -\sigma_2 q_2(F), \tag{5.6b}$$

$$-\partial_x F_3 = \sigma_1 q_1(F) - \sigma_2 q_2(F), \tag{5.6c}$$

$$-4\,\partial_x F_4 = \sigma_2 q_2(F), \tag{5.6d}$$

$$0 = -2\sigma_1 q_1(F). \tag{5.6e}$$

We write the boundary data as

$$F_1(0) = B_1, \qquad F_2(0) = B_2, \tag{5.7}$$

$$F(x) \to M \quad \text{as } x \to \infty. \tag{5.8}$$

Then putting

$$F_i = M_i(1 + f_i) \quad \text{for } i = 1, \dots, 5, \tag{5.9}$$

and substituting (5.9) in (5.6), we have the system

$$V_M f_x + L_M f = \Gamma_M(f). \tag{5.10}$$

From the definitions (1.16), (1.17) and (1.18), the operators $V_M$, $L_M$ and $\Gamma_M(\cdot)$ in (5.10) are given by

$$V_M = \mathrm{diag}(M_1, 4M_2, -M_3, -4M_4, 0),$$

$$L_M = \sigma_1 M_1 M_3\, \phi_4 \phi_4^\top + \sigma_2 M_1 M_4\, \phi_5 \phi_5^\top,$$

$$\Gamma_M(f) = \sigma_1 M_1 M_3\, q_1(f)\,\phi_4 + \sigma_2 M_1 M_4\, q_2(f)\,\phi_5. \tag{5.11}$$
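The structure of $L_M$ in (5.11) can be checked directly: since $\phi_4$ and $\phi_5$ are orthogonal to $\phi_1, \phi_2, \phi_3$, the collision invariants lie in the null space of $L_M$. The sketch below is illustrative only; the sample Maxwellian $(1, 3, 4, 12, 2)$ and the values of $\sigma_1, \sigma_2$ are assumptions chosen so that (5.3) holds with integer entries.

```python
# Collision vectors of the 14-velocity model, as listed after (5.4).
PHI = {
    1: (2, 2, 0, 0, 1),
    2: (0, 0, 2, 2, 1),
    3: (0, 1, 0, 1, 0),
    4: (1, 0, 1, 0, -2),
    5: (1, -1, -1, 1, 0),
}


def dot(u, v):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def apply_LM(M, sigma1, sigma2, f):
    """L_M f with L_M = s1*M1*M3*phi4 phi4^T + s2*M1*M4*phi5 phi5^T, cf. (5.11)."""
    M1, M2, M3, M4, M5 = M
    c4 = sigma1 * M1 * M3 * dot(PHI[4], f)
    c5 = sigma2 * M1 * M4 * dot(PHI[5], f)
    return tuple(c4 * p4 + c5 * p5 for p4, p5 in zip(PHI[4], PHI[5]))


# Hypothetical Maxwellian: M5^2 = M1*M3 and M2*M3 = M1*M4, cf. (5.3).
M_SAMPLE = (1, 3, 4, 12, 2)

# L_M annihilates phi1, phi2, phi3 but not phi4.
images = {i: apply_LM(M_SAMPLE, 2, 3, PHI[i]) for i in PHI}
```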

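The boundary conditions for $f(x)$, derived from (5.7) and (5.8), are as follows:

$$f_1(0) = (B_1 - M_1)/M_1, \qquad f_2(0) = (B_2 - M_2)/M_2, \tag{5.12}$$

$$f(x) \to 0 \quad \text{as } x \to \infty. \tag{5.13}$$

From (5.11), we obtain the range $\mathcal{R}(V_M)$ and the null space $\mathcal{N}(V_M)$ of the operator $V_M$ as

$$\mathcal{R} = \mathcal{R}(V_M) = \mathrm{span}\{e_1, e_2, e_3, e_4\}, \qquad \mathcal{N} = \mathcal{N}(V_M) = \mathrm{span}\{e_5\}, \tag{5.14}$$

where $\{e_i\}_{i=1,\dots,5}$ is the canonical basis of $\mathbb{R}^5$. Thus, the stability condition [S] holds owing to (5.4) and (5.14). Also, direct calculations yield that

$$\mathcal{N}_M = V_M \mathcal{M} = \mathrm{span}\{\psi_M^1, \psi_M^2, \psi_M^3\}, \qquad \mathcal{N}_M^\perp = V_M^{-1}(\mathcal{M}^\perp) = \mathrm{span}\{\psi_M^4, e_5\}, \tag{5.15}$$

$$\mathcal{N}_M^\perp \cap \mathcal{R} = \mathrm{span}\{\psi_M^4\}, \tag{5.16}$$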


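where

$$\psi_M^1 = (M_1, 4M_2, 0, 0, 0)^\top = \tfrac{1}{2} V_M \phi_1, \qquad \psi_M^2 = (0, 0, M_3, 4M_4, 0)^\top = -\tfrac{1}{2} V_M \phi_2,$$

$$\psi_M^3 = (0, M_2, 0, -M_4, 0)^\top = \tfrac{1}{4} V_M \phi_3, \qquad \psi_M^4 = \Big(\frac{1}{M_1}, -\frac{1}{4M_2}, \frac{1}{M_3}, -\frac{1}{4M_4}, 0\Big)^\top. \tag{5.17}$$

Note that from (5.17) and (5.5), we also have

$$\phi_5 = V_M \psi_M^4. \tag{5.18}$$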
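The identities in (5.17) and (5.18), together with the orthogonality of $\psi_M^4$ to $\psi_M^1, \psi_M^2, \psi_M^3$, can be confirmed in exact rational arithmetic. This is a sketch under the assumption of one sample Maxwellian, $(1, 3, 4, 12, 2)$, which satisfies (5.3); the helper names are illustrative.

```python
from fractions import Fraction as Fr

# Hypothetical Maxwellian satisfying (5.3).
M1, M2, M3, M4, M5 = Fr(1), Fr(3), Fr(4), Fr(12), Fr(2)
V_DIAG = (M1, 4 * M2, -M3, -4 * M4, Fr(0))   # V_M from (5.11)

phi1 = (2, 2, 0, 0, 1)
phi2 = (0, 0, 2, 2, 1)
phi3 = (0, 1, 0, 1, 0)
phi5 = (1, -1, -1, 1, 0)

psi1 = (M1, 4 * M2, 0, 0, 0)
psi2 = (0, 0, M3, 4 * M4, 0)
psi3 = (0, M2, 0, -M4, 0)
psi4 = (1 / M1, -1 / (4 * M2), 1 / M3, -1 / (4 * M4), 0)


def apply_V(f):
    """Multiply a vector by the diagonal matrix V_M."""
    return tuple(v * x for v, x in zip(V_DIAG, f))


def scale(c, f):
    return tuple(c * x for x in f)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


checks = {
    "psi1 = (1/2) V phi1": scale(Fr(1, 2), apply_V(phi1)) == psi1,
    "psi2 = -(1/2) V phi2": scale(Fr(-1, 2), apply_V(phi2)) == psi2,
    "psi3 = (1/4) V phi3": scale(Fr(1, 4), apply_V(phi3)) == psi3,
    "phi5 = V psi4": apply_V(psi4) == phi5,
    "psi4 orthogonal to psi1, psi2, psi3": all(
        dot(psi4, p) == 0 for p in (psi1, psi2, psi3)
    ),
}
```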

Immediately from (5.16) and (5.17), we have

$$\dim R^+ \mathcal{N}_M^\perp = 1. \tag{5.19}$$

Since $f = (f_i) \in \mathcal{N}_M^\perp$, it is expressed by

$$f(x) = \alpha_1(x)\,\psi_M^4 + \alpha_2(x)\,e_5 \quad \text{for } \alpha_1, \alpha_2 \in \mathbb{R}. \tag{5.20}$$

By substituting $x = 0$ in (5.20), we find that the consistency condition (1.31) holds if and only if

$$B_1 + 4B_2 = M_1 + 4M_2. \tag{5.21}$$

Condition [A] holds if and only if the operator $B_1$, defined by (2.13), is negative, since $B_1$ is a scalar owing to (5.19). Thus, we look for a condition equivalent to $\mu < 0$, where $B_1 = (\mu)$. Here $B_1$ and $\mu$ satisfy the eigenvalue problem

$$B_1 \phi^1 = \mu \phi^1 \quad \text{for } \phi^1 \in \mathcal{N}_M^\perp \cap \mathcal{R}. \tag{5.22}$$

By putting $\phi^0 = A\phi^1$, we obtain from Lemma 2.2 that

$$\mu V_M \phi + L_M \phi = 0, \quad \text{where } \phi = \phi^0 + \phi^1 \in \mathcal{N}_M^\perp. \tag{5.23}$$

Meanwhile, from (5.15), $\phi$ is expressed as

$$\phi = \bar\alpha_1 \psi_M^4 + \bar\alpha_2 e_5 \quad \text{for } \bar\alpha_1, \bar\alpha_2 \in \mathbb{R}. \tag{5.24}$$

By multiplying (5.24) by $V_M$ and $L_M$ respectively, we have from (5.11) that

$$V_M \phi = \bar\alpha_1 \phi_5, \qquad L_M \phi = -\tfrac{1}{4}\sigma_2 w\,\bar\alpha_1 \phi_5 + \sigma_1 \{(M_1 + M_3)\bar\alpha_1 - 2M_1 M_3 \bar\alpha_2\}\phi_4, \tag{5.25}$$

where $w$ is the constant given by

$$w = M_1 + 4M_2 - M_3 - 4M_4. \tag{5.26}$$

The constant $w$ is called the fluid dynamical momentum at the Maxwellian $M$ in rarefied gas dynamics. Substituting (5.25) in (5.23) yields

$$\big(\mu - \tfrac{1}{4}\sigma_2 w\big)\bar\alpha_1 \phi_5 + \sigma_1 \{(M_1 + M_3)\bar\alpha_1 - 2M_1 M_3 \bar\alpha_2\}\phi_4 = 0. \tag{5.27}$$


Thus we obtain

$$B_1 = (\mu) = \big(\tfrac{1}{4}\sigma_2 w\big). \tag{5.28}$$
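The eigenvalue relation (5.23) with $\mu = \sigma_2 w/4$ can be verified exactly for a sample Maxwellian. The sketch below assumes the hypothetical data $M = (1, 3, 4, 12, 2)$, $\sigma_1 = 2$, $\sigma_2 = 3$, and chooses $\bar\alpha_2 = (M_1 + M_3)\bar\alpha_1 / (2M_1M_3)$ so that the $\phi_4$-component in (5.27) vanishes; it then evaluates the residual $\mu V_M\phi + L_M\phi$.

```python
from fractions import Fraction as Fr


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def eigen_residual(M, sigma1, sigma2, alpha1=Fr(1)):
    """Return (mu, mu*V_M*phi + L_M*phi) for phi = alpha1*psi4 + alpha2*e5."""
    M1, M2, M3, M4, M5 = M
    phi4 = (1, 0, 1, 0, -2)
    phi5 = (1, -1, -1, 1, 0)
    V = (M1, 4 * M2, -M3, -4 * M4, 0)           # V_M from (5.11)
    w = M1 + 4 * M2 - M3 - 4 * M4                # (5.26)
    mu = sigma2 * w / 4                          # (5.28)
    alpha2 = (M1 + M3) * alpha1 / (2 * M1 * M3)  # kills the phi4 term in (5.27)
    psi4 = (1 / M1, -1 / (4 * M2), 1 / M3, -1 / (4 * M4), 0)
    e5 = (0, 0, 0, 0, 1)
    phi = tuple(alpha1 * p + alpha2 * e for p, e in zip(psi4, e5))
    c4 = sigma1 * M1 * M3 * dot(phi4, phi)       # coefficient of phi4 in L_M phi
    c5 = sigma2 * M1 * M4 * dot(phi5, phi)       # coefficient of phi5 in L_M phi
    residual = tuple(
        mu * v * x + c4 * p4 + c5 * p5
        for v, x, p4, p5 in zip(V, phi, phi4, phi5)
    )
    return mu, residual


# Hypothetical Maxwellian satisfying (5.3) and hypothetical collision constants.
M_SAMPLE = tuple(Fr(x) for x in (1, 3, 4, 12, 2))
mu, residual = eigen_residual(M_SAMPLE, Fr(2), Fr(3))
```

For this sample $w = -39$, so $\mu$ is negative, which is the case covered by Condition [A].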

Thus Condition [A] holds if and only if $w < 0$. Taking Remark 4.1 into consideration, these observations for the Cabannes model (5.1) are summarized as follows.

Theorem 5.1. Suppose that the boundary data $B = (B_i)_{i=1,2}$ and the Maxwellian state $M = (M_i)_{i=1,\dots,5}$ satisfy (5.21) and $w < 0$, where $w$ is given by (5.26). Then there exists a positive constant $\beta_1$ such that if $|B - R^+ M| \le \beta_1$, the stationary problem (5.6), (5.7) and (5.8) has a unique solution $F(x)$ in $\mathcal{B}^0[0,\infty)$. Moreover, this solution $F(x)$ belongs to $C^\infty[0,\infty)$ and verifies the estimate (1.38).

Remark 5.1. It is not necessary to assume that $|B - R^+ M|$ is sufficiently small in the above theorem. Actually, the solution is given by the explicit formula

$$F_1(x) = M_1 + (B_1 - M_1)e^{-\bar\sigma x}, \qquad F_2(x) = M_2 + (B_2 - M_2)e^{-\bar\sigma x},$$

$$F_3(x) = M_3 + (B_1 - M_1)e^{-\bar\sigma x}, \qquad F_4(x) = M_4 + (B_2 - M_2)e^{-\bar\sigma x}, \qquad F_5(x) = (F_1(x)F_3(x))^{1/2},$$

where $\bar\sigma = -\sigma_2 w/4 > 0$.

5.2. 6-velocity model with multiple collisions. In this subsection, we treat the 6-velocity model with multiple collisions, which is introduced in [9]. It is written as the system

$$\partial_t F_1 + \partial_x F_1 = \sigma_1 q_1(F) + \sigma_2 q_2(F), \tag{5.29a}$$

$$2\big(\partial_t F_2 + \tfrac{1}{2}\partial_x F_2\big) = -\sigma_1 q_1(F) - 2\sigma_2 q_2(F), \tag{5.29b}$$

$$2\big(\partial_t F_3 - \tfrac{1}{2}\partial_x F_3\big) = -\sigma_1 q_1(F) + 2\sigma_2 q_2(F), \tag{5.29c}$$

$$\partial_t F_4 - \partial_x F_4 = \sigma_1 q_1(F) - \sigma_2 q_2(F), \tag{5.29d}$$

where $\sigma_1$ and $\sigma_2$ are positive constants and

$$q_1(F) = F_2 F_3 - F_1 F_4, \qquad q_2(F) = F_2^2 F_4 - F_1 F_3^2. \tag{5.30}$$

The Maxwellian is a vector $M = (M_i)_{i=1,\dots,4}$ satisfying

$$M_2 M_3 = M_1 M_4, \qquad M_2^2 M_4 = M_1 M_3^2, \qquad M_i > 0 \ \text{ for } i = 1, \dots, 4. \tag{5.31}$$

By putting $a = M_2/M_1$, it is parameterized as

$$M = M_1 (1, a, a^3, a^4)^\top, \quad \text{where } M_1, a > 0. \tag{5.32}$$
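The parameterization (5.32) can be checked against the Maxwellian conditions (5.31) by substituting it into the collision terms (5.30). A minimal sketch follows; the function names are illustrative and the sample values of $M_1$ and $a$ are arbitrary positive rationals.

```python
from fractions import Fraction as Fr


def q1(F):
    """Binary collision term of (5.30)."""
    F1, F2, F3, F4 = F
    return F2 * F3 - F1 * F4


def q2(F):
    """Triple collision term of (5.30)."""
    F1, F2, F3, F4 = F
    return F2**2 * F4 - F1 * F3**2


def maxwellian(M1, a):
    """Maxwellian of the 6-velocity model in the form (5.32)."""
    return (M1, M1 * a, M1 * a**3, M1 * a**4)


def momentum(M):
    """M1 + M2 - M3 - M4, the combination appearing in Remark 5.2."""
    M1, M2, M3, M4 = M
    return M1 + M2 - M3 - M4


# Both collision terms vanish on every vector of the form (5.32).
samples = [maxwellian(Fr(2), a) for a in (Fr(1, 2), Fr(3), Fr(7, 5))]
vanishing = [(q1(M), q2(M)) for M in samples]
```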

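A straightforward calculation shows that the space $\mathcal{M}$ of the collision invariants and its orthogonal complement $\mathcal{M}^\perp$ for the model (5.29) are given by

$$\mathcal{M} = \mathrm{span}\{\phi_1, \phi_2\}, \qquad \mathcal{M}^\perp = \mathrm{span}\{\phi_3, \phi_4\}, \tag{5.33}$$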


where

$$\phi_1 = (1, 1, 1, 1)^\top, \quad \phi_2 = (1, 1/2, -1/2, -1)^\top, \quad \phi_3 = (1, -1, -1, 1)^\top, \quad \phi_4 = (1, -2, 2, -1)^\top. \tag{5.34}$$

Note that $\phi_1$ and $\phi_2$ are also physically admissible collision invariants corresponding to the conservation of mass and momentum (in the $x$-direction), respectively. The stationary solution $F = (F_i)_{i=1,\dots,4}$ to the model (5.29) satisfies

$$\partial_x F_1 = \sigma_1 q_1(F) + \sigma_2 q_2(F), \tag{5.35a}$$

$$\partial_x F_2 = -\sigma_1 q_1(F) - 2\sigma_2 q_2(F), \tag{5.35b}$$

$$-\partial_x F_3 = -\sigma_1 q_1(F) + 2\sigma_2 q_2(F), \tag{5.35c}$$

$$-\partial_x F_4 = \sigma_1 q_1(F) - \sigma_2 q_2(F). \tag{5.35d}$$

We write the boundary conditions as

$$F_1(0) = B_1, \qquad F_2(0) = B_2, \tag{5.36}$$

$$F(x) \to M \quad \text{as } x \to \infty. \tag{5.37}$$

Then putting

$$F_i = M_i(1 + f_i) \quad \text{for } i = 1, \dots, 4, \tag{5.38}$$

in (5.35) and dividing by $M_1$, we obtain the system

$$V_M f_x + L_M f = \Gamma_M(f). \tag{5.39}$$

In this model, the operators $V_M$ and $L_M$ and the function $\Gamma_M(f)$ are given, from the definitions (1.16), (1.17) and (1.18) as well as (5.32), by

$$V_M = \mathrm{diag}(1, a, -a^3, -a^4), \tag{5.40}$$

$$L_M = \sigma_1 a^4 M_1\, \phi_3 \phi_3^\top + \sigma_2 a^6 M_1^2\, \phi_4 \phi_4^\top, \tag{5.41}$$

$$\Gamma_M(f) = \sigma_1 a^4 M_1\, q_1(f)\,\phi_3 + \sigma_2 a^6 M_1^2 \{\tilde q_2(f) + q_2(f)\}\phi_4, \tag{5.42}$$

where we have abbreviated

$$\tilde q_2(f) = f_2(f_2 + 2f_4) - (2f_1 + f_3)f_3. \tag{5.43}$$

The boundary conditions for $f$ are written as

$$f_1(0) = (B_1 - M_1)/M_1, \qquad f_2(0) = (B_2 - aM_1)/(aM_1), \tag{5.44}$$

$$f(x) \to 0 \quad \text{as } x \to \infty. \tag{5.45}$$

It is easy to see from (5.40) that

$$\mathcal{R} = \mathcal{R}(V_M) = \mathbb{R}^4, \qquad \mathcal{N} = \mathcal{N}(V_M) = \{0\}. \tag{5.46}$$

Thus the stability condition [S] holds in this model. Also, direct calculations yield that

$$\mathcal{N}(L_M) = \mathcal{M} = \mathrm{span}\{\phi_1, \phi_2\}, \qquad \mathcal{R}(L_M) = \mathcal{M}^\perp = \mathrm{span}\{\phi_3, \phi_4\}, \tag{5.47}$$

$$\mathcal{N}_M = V_M \mathcal{M} = \mathrm{span}\{\psi_M^1, \psi_M^2\}, \qquad \mathcal{N}_M^\perp = V_M^{-1}(\mathcal{M}^\perp) = \mathrm{span}\{\psi_M^3, \psi_M^4\}, \tag{5.48}$$


where

$$\psi_M^i = V_M \phi_i \quad \text{for } i = 1, 2, \tag{5.49}$$

$$\psi_M^3 = (1, -1/a, 1/a^3, -1/a^4)^\top, \qquad \psi_M^4 = (1, -2/a, -2/a^3, 1/a^4)^\top. \tag{5.50}$$

We have from (5.50) that

$$R^+ \mathcal{N}_M^\perp \cong \mathcal{N}_M^\perp \cong \mathbb{R}^2, \qquad \dim R^+ \mathcal{N}_M^\perp = \dim \mathcal{N}_M^\perp = 2. \tag{5.51}$$

Equation (5.51) implies that the consistency condition (1.31) always holds. Next, we look for a sufficient condition for $B_1$ to have two negative eigenvalues. This is equivalent to Condition [A] owing to (5.51) and (2.15). If $\varphi \in \mathcal{N}_M^\perp$, it is expressed, with $\alpha_3, \alpha_4 \in \mathbb{R}$, as

$$\varphi = \alpha_3 \psi_M^3 + \alpha_4 \psi_M^4. \tag{5.52}$$

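Multiplying (5.52) by $V_M$ and $L_M$ respectively, we have

$$V_M \varphi = \alpha_3 \phi_3 + \alpha_4 \phi_4, \tag{5.53}$$

$$L_M \varphi = \beta_1 a^4 \big(\alpha_3 \langle \phi_3, \psi_M^3 \rangle + \alpha_4 \langle \phi_3, \psi_M^4 \rangle\big)\phi_3 + \beta_2 a^4 \big(\alpha_3 \langle \phi_4, \psi_M^3 \rangle + \alpha_4 \langle \phi_4, \psi_M^4 \rangle\big)\phi_4, \tag{5.54}$$

where

$$\beta_1 = \sigma_1 M_1 > 0, \qquad \beta_2 = \sigma_2 a^2 M_1^2 > 0. \tag{5.55}$$

Thus the representation matrix of $B_1$, which is the restriction of the operator $B = -V_M^{-1} L_M$ to $\mathcal{N}_M^\perp$, i.e.,

$$B_1 = B|_{\mathcal{N}_M^\perp} : \mathcal{N}_M^\perp = \mathrm{span}\{\psi_M^3, \psi_M^4\} \to \mathcal{N}_M^\perp = \mathrm{span}\{\psi_M^3, \psi_M^4\},$$

is given by

$$B_1 = -\begin{pmatrix} \beta_1 (a^4 + a^3 - a - 1) & \beta_1 (a^4 + 2a^3 + 2a + 1) \\ \beta_2 (a^4 + 2a^3 + 2a + 1) & \beta_2 (a^4 + 4a^3 - 4a - 1) \end{pmatrix}. \tag{5.56}$$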
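The entries of (5.56) can be recomputed from inner products. The sketch below rests on two assumptions that are not spelled out at this point of the text: $V_M = \mathrm{diag}(1, a, -a^3, -a^4)$, the sign pattern read off from the stationary system (5.35), and $B = -V_M^{-1}L_M$, the sign under which (5.56) carries its leading minus. With $\psi_M^j = V_M^{-1}\phi_j$ for $j = 3, 4$, each entry is $-\beta_i a^4 \langle \phi_i, \psi_M^j \rangle$.

```python
from fractions import Fraction as Fr

PHI3 = (1, -1, -1, 1)
PHI4 = (1, -2, 2, -1)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def b1_from_inner_products(a, beta1, beta2):
    """Matrix of B_1 in the basis {psi3, psi4}, built from <phi_i, psi_j>."""
    a = Fr(a)
    V = (Fr(1), a, -a**3, -a**4)                 # assumed sign pattern of V_M
    psi3 = tuple(p / v for p, v in zip(PHI3, V))  # V_M^{-1} phi3
    psi4 = tuple(p / v for p, v in zip(PHI4, V))  # V_M^{-1} phi4
    a4 = a**4
    return (
        (-beta1 * a4 * dot(PHI3, psi3), -beta1 * a4 * dot(PHI3, psi4)),
        (-beta2 * a4 * dot(PHI4, psi3), -beta2 * a4 * dot(PHI4, psi4)),
    )


def b1_closed_form(a, beta1, beta2):
    """Matrix (5.56) written out as polynomials in a."""
    a = Fr(a)
    off = a**4 + 2 * a**3 + 2 * a + 1
    return (
        (-beta1 * (a**4 + a**3 - a - 1), -beta1 * off),
        (-beta2 * off, -beta2 * (a**4 + 4 * a**3 - 4 * a - 1)),
    )
```

Both functions return exact rationals, so they can be compared with `==` for any positive rational `a`.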

Also, by direct calculation with (5.56), we have the trace and the determinant of $B_1$, respectively:

$$\mathrm{trace}(B_1) = -\{(\beta_1 + \beta_2)(a^2 + 1) + (\beta_1 + 4\beta_2)a\}(a + 1)(a - 1), \tag{5.57}$$

$$\det(B_1) = \beta_1 \beta_2\, a\,(a^3 - 3a^2 - 3a - 1)(a^3 + 3a^2 + 3a - 1). \tag{5.58}$$

Owing to (2.24), $B_1$ has two negative eigenvalues if and only if the trace of $B_1$ is negative and the determinant of $B_1$ is positive. From (5.57), the trace of $B_1$ is negative if and only if $a > 1$. Solving the algebraic equation $\det(B_1) = 0$ by using (5.58), we find that $\det(B_1) > 0$ holds if and only if

$$a < -1 + \sqrt[3]{2}, \quad \text{or} \quad a > 1 + \sqrt[3]{2} + \sqrt[3]{4} \ (=: c_1). \tag{5.59}$$
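The factorizations (5.57) and (5.58) and the threshold $c_1$ in (5.59) can be checked against the matrix (5.56). This is a minimal sketch; the function names are illustrative and the default values $\beta_1 = \beta_2 = 1$ are assumptions made only to evaluate signs.

```python
from fractions import Fraction as Fr


def trace_det(a, b1, b2):
    """Trace and determinant computed entrywise from the matrix (5.56)."""
    m11 = -b1 * (a**4 + a**3 - a - 1)
    m12 = -b1 * (a**4 + 2 * a**3 + 2 * a + 1)
    m21 = -b2 * (a**4 + 2 * a**3 + 2 * a + 1)
    m22 = -b2 * (a**4 + 4 * a**3 - 4 * a - 1)
    return m11 + m22, m11 * m22 - m12 * m21


def trace_closed(a, b1, b2):
    """Right-hand side of (5.57)."""
    return -((b1 + b2) * (a * a + 1) + (b1 + 4 * b2) * a) * (a + 1) * (a - 1)


def det_closed(a, b1, b2):
    """Right-hand side of (5.58)."""
    return b1 * b2 * a * (a**3 - 3 * a * a - 3 * a - 1) * (a**3 + 3 * a * a + 3 * a - 1)


# Threshold of (5.59): the positive root of a^3 - 3a^2 - 3a - 1.
C1 = 1 + 2 ** (1 / 3) + 2 ** (2 / 3)


def has_two_negative_eigenvalues(a, b1=1, b2=1):
    """Criterion used in the text: trace(B_1) < 0 and det(B_1) > 0."""
    tr, det = trace_det(a, b1, b2)
    return tr < 0 and det > 0
```

For example, `has_two_negative_eigenvalues(4)` holds while `has_two_negative_eigenvalues(3)` does not, consistent with $c_1$ lying between 3 and 4.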

Taking Remark 4.1 into consideration, the above observations on (5.29) are summarized as follows:


Theorem 5.2. Let the constant $M = (M_i)_{i=1,\dots,4}$ be a Maxwellian state. (Then it is expressed by (5.32) with positive constants $M_1$ and $a = M_2/M_1$.) Also, let the boundary data $B = (B_i)_{i=1,2}$ be arbitrarily given in $\mathbb{R}^2$. Suppose that $a > c_1$. Then there exists a positive constant $\beta_2$ such that if $|B - R^+ M| \le \beta_2$, the stationary problem (5.35), (5.36) and (5.37) has a unique solution $F(x)$ in $\mathcal{B}^0[0,\infty)$. Moreover, the solution $F(x)$ belongs to $C^\infty[0,\infty)$ and verifies the estimate (1.38).

Remark 5.2. The condition $a > 1$ implies that the fluid dynamical momentum $w$ at the Maxwellian $M$ is negative, i.e.,

$$w := M_1 + M_2 - M_3 - M_4 = -M_1 (a + 1)(a - 1)(a^2 + a + 1) < 0.$$

This is the case in the above theorem.

References

1. Cabannes, H.: Étude de la propagation des ondes dans un gaz à 14 vitesses. J. Mécan. 14, 705–744 (1975)
2. Cabannes, H.: Solution globale du problème de Cauchy en théorie cinétique discrète. J. Mécan. 17, 1–22 (1978)
3. Cabannes, H.: The discrete Boltzmann equation (Theory and applications). Lecture Notes, Univ. of California, Berkeley, 1980
4. Cercignani, C., Illner, R., Pulvirenti, M. and Shinbrot, M.: On nonlinear stationary half-space problems in discrete kinetic theory. J. Statist. Phys. 54, 885–896 (1988)
5. Gatignol, R.: Théorie Cinétique de Gaz à Répartition Discrète de Vitesses. Lecture Notes in Phys. 36, New York: Springer-Verlag, 1975
6. Kawashima, S.: Large-time behavior of solutions of the discrete Boltzmann equation. Commun. Math. Phys. 109, 563–589 (1987)
7. Kawashima, S.: Global solutions to the initial-boundary value problems for the discrete Boltzmann equation. Nonlinear Anal. 17, 577–597 (1991)
8. Kawashima, S.: Existence and Stability of Stationary Solutions to the Discrete Boltzmann Equation. Japan J. Indust. Appl. Math. 8, 389–429 (1991)
9. Kawashima, S. and Bellomo, N.: The Discrete Boltzmann equation with multiple collisions: Global existence and stability for the initial value problem. J. Math. Phys. 31, 245–253 (1990)
10. Kawashima, S. and Nikkuni, Y.: To appear
11. Nikkuni, Y.: Stability of stationary solutions to some discrete velocity model of the Boltzmann equation in the half-space. To appear
12. Ukai, S.: On the Half-Space Problem for Discrete Velocity Model of the Boltzmann Equation. In: Advances in Nonlinear Partial Differential Equations and Stochastics, S. Kawashima and T. Yanagisawa (eds.), Series on Advances in Mathematics for Applied Sciences, 48, 1998, pp. 160–174

Communicated by H. Araki

Commun. Math. Phys. 207, 411 – 438 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Resonances Near the Real Axis for Transparent Obstacles

Georgi Popov*, Georgi Vodev

Département de Mathématiques, UMR 6629, Université de Nantes-CNRS, 2, rue de la Houssinière, B.P. 92208, 44322 Nantes-Cedex 03, France

* Author partially supported by grant MM-706/97 with MES, Bulgaria

Received: 27 January 1999 / Accepted: 27 April 1999

Abstract: The purpose of this paper is to study the resonances for the transmission problem for a strictly convex obstacle in $\mathbb{R}^n$, $n \ge 2$. If the speed of propagation in the interior of the body is strictly less than that in the exterior, we obtain an infinite sequence of resonances tending rapidly to the real axis. These resonances are associated with a quasimode for the transmission problem. The main ingredient of the paper is the construction of a quasimode the frequency support of which coincides with the corresponding gliding manifold $\mathcal{K}$. To do this we first find a global symplectic normal form for pairs of glancing hypersurfaces in a neighborhood of $\mathcal{K}$ and then we separate the variables microlocally near the whole glancing manifold $\mathcal{K}$.

1. Introduction and Statement of Results

Let $\mathcal{O} \subset \mathbb{R}^n$, $n \ge 2$, be a bounded strictly convex domain with $C^\infty$ boundary $\Gamma$ and let $\Omega = \mathbb{R}^n \setminus \mathcal{O}$. Fix constants $0 < c < 1$ and $\alpha > 0$. The complex number $\lambda \in \mathbb{C}$ will be said to be a resonance for the transmission problem associated to $\mathcal{O}$ (see [9]), if the following problem has a nontrivial solution:

$$\begin{cases} (c^2 \Delta + \lambda^2) u_1 = 0 & \text{in } \mathcal{O}, \\ (\Delta + \lambda^2) u_2 = 0 & \text{in } \Omega, \\ u_1 - u_2 = 0 & \text{on } \Gamma, \\ \partial_{\nu'} u_1 + \alpha\, \partial_\nu u_2 = 0 & \text{on } \Gamma, \\ u_2 \ \text{is } \lambda\text{-outgoing}, \end{cases} \tag{1.1}$$

where $\Delta = \sum_{j=1}^n \partial_{x_j}^2$, $\nu(x)$ is the exterior unit normal to $\Gamma$ at $x \in \Gamma$ and $\nu'(x) = -\nu(x)$ is the interior one. Recall that the function $v$ is said to be $\lambda$-outgoing if for some $\rho_0 \gg 1$ we have

$$v|_{|x| \ge \rho_0} = R_0(\lambda) g|_{|x| \ge \rho_0},$$


where $g \in L^2_{\mathrm{comp}}(\Omega)$, with a compact support independent of $\lambda$, and $R_0(\lambda)$ is the free outgoing resolvent of $\Delta$ in $\mathbb{R}^n$. Here “outgoing” means that $R_0(\lambda) \in \mathcal{L}(L^2(\mathbb{R}^n), L^2(\mathbb{R}^n))$ for $\mathrm{Im}\,\lambda < 0$. Then all the resonances are in the upper half-plane $\mathrm{Im}\,\lambda > 0$ (see Sect. 2). It is known from [6] that all the resonances associated to the Laplacian in the exterior of a strictly convex obstacle with Dirichlet boundary conditions lie in the region $\mathrm{Im}\,\lambda \ge C_1 |\lambda|^{1/3} - C_2$ for some positive constants $C_1$ and $C_2$ depending on the obstacle. In the case of the transmission problem (1.1) with $0 < c < 1$, however, the situation is completely different. We have the following

Theorem 1.1. There exists an infinite sequence $\{\lambda_j\}$ of different resonances of (1.1) such that

$$0 < \mathrm{Im}\,\lambda_j \le C_N |\lambda_j|^{-N}, \quad \forall N \gg 1.$$

To prove this theorem we use an argument close to that in [22], where a similar result is obtained for the Neumann problem in linear elasticity in the exterior of an arbitrary bounded obstacle in $\mathbb{R}^3$ with a smooth boundary. Namely, it follows from the analysis in [21–23] that to obtain a sequence of resonances tending rapidly to the real axis, it suffices to find an infinite sequence of quasi-resonances $\{k_j\}$ with nontrivial uniformly compactly supported quasi-resonant functions, i.e. a suitable quasimode (see Proposition 2.1 in the present paper). Moreover, one can localize resonances near $\{k_j\}$. In our case, $k_j$ are the eigenvalues of a suitable pseudodifferential operator of order one on $\Gamma$ with a nonhomogeneous principal symbol, given by (5.15). The main ingredient in this paper is the construction of a quasimode the frequency support of which is concentrated at the glancing manifold $\mathcal{K} \subset T^*\mathbb{R}^n$ of the interior problem, which can be identified with $\{(x, \xi) \in T^*\Gamma : c\|\xi\| = 1\}$. In [22] the existence of quasi-resonances and quasi-resonant functions supported near the boundary is due to the so-called Rayleigh surface waves. The reason for the existence of quasimodes in our case is quite different.
To explain it we take a look at the geometry of the generalised billiard flow corresponding to our problem. Any ray coming from infinity splits into two when interacting with the boundary. One of them reflects by the usual law of geometric optics and goes again to infinity but the other one enters the obstacle. The latter is called the refracted ray. This is true even for the diffractive rays, that is, rays which are tangent to the boundary.

Consider now a ray $\gamma$ traveling in the interior of the obstacle. If $\gamma$ hits the boundary far away from the glancing manifold $\mathcal{K}$ the same phenomenon occurs. It splits into two rays, one of them remaining in the obstacle while the other one leaves the obstacle and goes to infinity. On the other hand, if $\gamma$ hits the boundary sufficiently close to $\mathcal{K}$ then it remains in the interior after the reflection without giving rise to a refracted ray in the exterior. In other words, there is a total reflection near the gliding region. The gliding rays do not give rise to refracted rays either, hence, they are trapped by the obstacle. Moreover, there is a kind of “effective” stability of the billiard flow in $\mathcal{O}$ near the glancing manifold. Namely, for each $N > 0$ there exists $C_N > 0$ such that if a broken ray $\gamma(t)$ is $\varepsilon$-close to $\mathcal{K}$ for $t = 0$ (in a suitable metric) then it remains $2\varepsilon$-close to $\mathcal{K}$ for $|t| \le C_N \varepsilon^{-N}$. In particular, those rays conserve a considerable amount of energy in the obstacle for a long period of time. Then it is natural to expect that the Lax–Phillips conjecture is true in this case,


i.e. that there exist infinitely many resonances approaching the real axis. If the boundary is strictly convex and analytic, the “effective” stability is valid in an exponentially large time interval $|t| \le Ce^{b/\varepsilon}$ (see [4]) with some $C, b > 0$ and we hope that there exists a sequence of resonances which tends exponentially fast to the real axis in this case.

The “effective” stability of the billiard flow near $\mathcal{K}$ is a consequence of the existence of a smooth approximate interpolating Hamiltonian $\zeta$ for the billiard ball map of $\mathcal{O}$,

$$B : \Sigma \to \Sigma, \quad \text{where} \quad \Sigma = \{(x, \xi) \in T^*\Gamma : c\|\xi\| \le 1\}$$

is the co-ball bundle. Note that $B$ is a smooth symplectic map in the interior of $\Sigma$ and continuous at the boundary $\partial\Sigma \simeq \mathcal{K}$. A function $\zeta \in C^\infty(T^*\Gamma)$ is called an approximate interpolating Hamiltonian of $B$ if $\zeta$ defines $\partial\Sigma$ ($\zeta = 0$ and $d\zeta \ne 0$ on $\partial\Sigma$), $\zeta > 0$ inside $\Sigma$, and for any $\varphi \in C^\infty(T^*\Gamma)$ the function

$$R_\varphi(\varrho) = \varphi(B(\varrho)) - \varphi\big(\exp(-2\sqrt{\zeta(\varrho)}\,H_\zeta)(\varrho)\big), \quad \varrho \in T^*\Gamma, \tag{1.2}$$

is flat at $\partial\Sigma$. In other words, all the derivatives of $R_\varphi$ vanish at $\partial\Sigma$. Here $H_\zeta$ is the Hamiltonian vector field of $\zeta$ and $\mathbb{R} \ni t \to \exp(tH_\zeta)$ is the Hamiltonian flow. The approximate interpolating Hamiltonian $\zeta$ has been introduced by Marvizi and Melrose [11] (see also [10], [4] and [8]). Formula (1.2) is a consequence of the local symplectic normal form near $\partial\Sigma$ obtained by Melrose [12]. More generally, Melrose found a local normal form for the corresponding pair of glancing hypersurfaces $F$ and $G$ in a neighborhood of the glancing manifold $\mathcal{K}$. In our case,

$$F := T^*\mathbb{R}^n|_\Gamma = \{(x, \xi) \in T^*\mathbb{R}^n : f(x) = 0\}, \qquad G := S^*\mathbb{R}^n = \{(x, \xi) \in T^*\mathbb{R}^n : c|\xi| - 1 = 0\},$$

where the smooth function $f$ defines the boundary $\Gamma$ in $\mathbb{R}^n$, i.e. $f(x) = 0$, $df(x) \ne 0$ on $\Gamma$ and $f(x) > 0$ inside $\mathcal{O}$ (see Sect. 3). Notice that $G$ is the characteristic manifold of the Helmholtz operator $-c^2\Delta - \lambda^2$ considered as an operator with a large parameter $\lambda$ while $F$ is the restriction of the cotangent bundle to the boundary.

To prove the existence of quasi-resonances for the transmission problem (1.1) with the desired properties, we are going to find a microlocal normal form of (1.1) in a whole neighborhood of the glancing manifold $\mathcal{K}$ following some arguments from [15–17] (quoted only for the source of methods but not for any specific results). First we obtain a “global” symplectic normal form of the corresponding classical system, i.e. of the pair of glancing hypersurfaces $F$, $G$, in a neighborhood of $\mathcal{K}$. Namely, we find in Theorem 3.1 an exact symplectic transformation in a whole neighborhood of $\mathcal{K}$, mapping $F$ and $G$ to a pair of glancing hypersurfaces $F_0$ and $G_0$ in $T^*D$, $D = \Gamma \times (-\delta, \delta)$, $\delta > 0$, such that

$$F_0 = \{(x', x_n, \xi) \in T^*D : x_n = 0\}, \qquad G_0 = \{(x', x_n, \xi', \xi_n) \in T^*D : \xi_n^2 + x_n - \zeta(x', \xi') = R(x, \xi)\},$$

where $x'$ are local coordinates on $\Gamma$, $x_n \in (-\delta, \delta)$, and $R(x, \xi)$ vanishes to infinite order at

$$\mathcal{K}_0 = \{(x', x_n, \xi', \xi_n) \in T^*D : x_n = \xi_n = \zeta(x', \xi') = 0\}.$$


Then we transform (microlocally near $\mathcal{K}$) the corresponding boundary value problem to an elliptic boundary value problem with a large parameter

$$\big(D_n^2 + x_n - L(x', D', \lambda) + R(x, D, \lambda)\big) v = O(|\lambda|^{-\infty}), \tag{1.3}$$

with suitable boundary conditions, where $D_n = -i\lambda^{-1}\partial_{x_n}$, $D' = -i\lambda^{-1}\partial_{x'}$, $L$ is a selfadjoint $\lambda$-pseudodifferential operator acting on half-densities in $\Gamma$, with a principal symbol $\zeta(x', \xi')$ and subprincipal symbol zero, while $R(x, D, \lambda)$ is a $\lambda$-pseudodifferential operator the full symbol of which vanishes up to infinite order at $\mathcal{K}$ (see Theorem 4.2 and (4.10)). This enables us to “separate” the variables microlocally near $\mathcal{K}$ in Sect. 5. Namely, we are looking for an asymptotic solution of $(D_n^2 + x_n - L)v = O(|\lambda|^{-\infty})$ of the form

$$v = v(x, \lambda) \sim \sum_{k=0}^{\infty} \lambda^{-k/3}\, \mathrm{Ai}^{(k)}\big(\tau + \lambda^{2/3} x_n\big)\, X_k\, g(x', \lambda), \quad |\lambda| \to \infty,$$

where $\mathrm{Ai}^{(k)}$ is the derivative of order $k \ge 0$ of the Airy function $\mathrm{Ai}(s)$, $\tau < 0$ is a zero of $\mathrm{Ai}(s)$, $X_k$ are suitable $\lambda$-pseudodifferential operators on $\Gamma$, $X_0$ is the identity, and $g(\cdot, \lambda) \in C^\infty(\Gamma)$ has a frequency set in $\partial\Sigma = \{\zeta = 0\}$. Then we obtain $R(x, D, \lambda)v = O(|\lambda|^{-\infty})$, hence, $v$ turns out to be also an asymptotic solution of (1.3). Using the boundary condition (4.10), we determine the operators $X_k$, $k \ge 1$, and then we obtain $\lambda = k_j$ and $g = g_j$ as solutions of the spectral problem

$$Q(x', D')g = \lambda g,$$

where $Q$ is an elliptic pseudodifferential operator on $\Gamma$ of order one, given by (5.15). The principal symbol of $Q$ has the form

$$\sigma(Q)(x', \xi') = c\,\|\xi'\|_{x'} - 2^{-1/3}\,\tau\, c\,\big(II(x', \xi')\big)^{2/3}\,\|\xi'\|_{x'}^{-1},$$

where $\|\xi'\|_{x'}^2$ is the principal symbol of the Laplace-Beltrami operator on $\Gamma$, and $II(x', \xi')$ is the second fundamental form of $\Gamma$. Notice that $\sigma(Q)(x', \xi')$ is real valued but nonhomogeneous.

2. Preliminaries

Introduce the Hilbert space $H = L^2(\mathcal{O}; \alpha^{-1}c^{-2}dx) \oplus L^2(\Omega; dx)$ and consider the operator

$$Pu = (-c^2\Delta u_1, -\Delta u_2), \quad u = (u_1, u_2) \in D(P),$$

with domain of definition

$$D(P) := \{(u_1, u_2) \in H : u_1 \in H^2(\mathcal{O}),\ u_2 \in H^2(\Omega),\ u_1|_\Gamma = u_2|_\Gamma,\ \partial_{\nu'} u_1|_\Gamma = -\alpha\,\partial_\nu u_2|_\Gamma\}.$$

It is easy to see that $P$ is selfadjoint, positive and elliptic. Define the resolvent

$$R(\lambda) = (P - \lambda^2)^{-1} : H \to H \quad \text{for } \mathrm{Im}\,\lambda < 0,$$

and let $\chi \in C_0^\infty(\mathbb{R}^n)$ be such that $\chi = 1$ in a neighbourhood of $\mathcal{O}$. In the same way as in [24], we obtain that the cutoff resolvent

$$R_\chi(\lambda) = \chi R(\lambda)\chi : H \to H$$


extends to a meromorphic function on the complex plane $\mathbb{C}$ if the dimension $n$ is odd, and on the Riemann logarithmic surface if $n$ is even. Moreover, as in the case of the exterior Dirichlet (Neumann) problems (e.g. see [9]) one can see that the poles of $R_\chi(\lambda)$ coincide with the resonances as defined in the introduction. One can use also the framework of the “black box scattering” as developed in [20].

A quasimode for the transmission problem is defined as an infinite sequence

$$\mathcal{Q} = \{(k_j, (u_1^{(j)}, u_2^{(j)})) : j \in \mathbb{N}\},$$

where $k_j \in \mathbb{C}$, $|k_j| \to \infty$, $\mathrm{Re}\,k_j \ge 1$, and $u_1^{(j)} \in C^\infty(\mathcal{O})$, $u_2^{(j)} \in C^\infty(\Omega)$ have supports in a fixed compact neighbourhood of $\Gamma$, $\|u_1^{(j)}|_\Gamma\|_{L^2(\Gamma)} = 1$, and

$$\begin{cases} \|(c^2\Delta + k_j^2)u_1^{(j)}\|_{L^2(\mathcal{O})} = O(|k_j|^{-\infty}), \\ \|(\Delta + k_j^2)u_2^{(j)}\|_{L^2(\Omega)} = O(|k_j|^{-\infty}), \\ \|u_1^{(j)}|_\Gamma - u_2^{(j)}|_\Gamma\|_{H^2(\Gamma)} = O(|k_j|^{-\infty}), \\ \|(\partial_{\nu'} u_1^{(j)} + \alpha\,\partial_\nu u_2^{(j)})|_\Gamma\|_{H^2(\Gamma)} = O(|k_j|^{-\infty}). \end{cases} \tag{2.1}$$

Given an infinite sequence $\{z_j\}$ of (complex) numbers, we say that $z_j = O(|k_j|^{-\infty})$ if for each $N > 0$ there exists $C_N > 0$ such that $|z_j| \le C_N |k_j|^{-N}$ for all $j$. Using a result of Tang and Zworski [23] we shall localize resonances of the transmission problem near the sequence of “quasi-resonances” $\{k_j\}$. More precisely, we have:

Proposition 2.1. Let $\mathcal{Q}$ be a quasimode for the transmission problem. Then $0 < \mathrm{Im}\,k_j = O(|k_j|^{-\infty})$, and there exists an infinite sequence $\Lambda = \{\lambda_j\}$ of resonances of $P$ such that $\mathrm{dist}(k_j, \Lambda) = O(|k_j|^{-\infty})$. In particular, $0 < \mathrm{Im}\,\lambda_j = O(|\lambda_j|^{-\infty})$.

Proof. Set $h_j = u_1^{(j)}|_\Gamma - u_2^{(j)}|_\Gamma$ and $g_j = \partial_{\nu'} u_1^{(j)}|_\Gamma + \alpha\,\partial_\nu u_2^{(j)}|_\Gamma$. We are going to construct compactly supported functions $w_1^{(j)} \in C^\infty(\mathcal{O})$, $w_2^{(j)} \in C^\infty(\Omega)$ such that

$$w_1^{(j)}|_\Gamma - w_2^{(j)}|_\Gamma = h_j, \tag{2.2}$$

$$\partial_{\nu'} w_1^{(j)}|_\Gamma + \alpha\,\partial_\nu w_2^{(j)}|_\Gamma = g_j, \tag{2.3}$$

and

$$\|w_1^{(j)}\|_{H^2(\mathcal{O})} + \|w_2^{(j)}\|_{H^2(\Omega)} \le C\big(\|h_j\|_{H^2(\Gamma)} + \|g_j\|_{H^2(\Gamma)}\big). \tag{2.4}$$

Let $\{\rho_k\}_{k=1,\dots,M} \subset C_0^\infty(\Gamma)$ be a partition of unity on $\Gamma$ such that each $\mathrm{supp}\,\rho_k$ is small enough. Let $U_k$ be small open neighbourhoods of $\mathrm{supp}\,\rho_k$ and let $V_k \subset \mathbb{R}^n$ be open domains such that $V_k \cap \Gamma = U_k$. Take in $V_k$ local coordinates $(x', x_n)$, $x' \in U_k$, where $x_n$ is the normal coordinate chosen so that $V_k \cap \mathcal{O}$ is given by $x_n < 0$ and $V_k \cap \Omega$ is given by $x_n > 0$. Choose a function $\varphi \in C_0^\infty(\mathbb{R})$, $\varphi(t) = 1$ for $|t| \le \delta$, $\varphi(t) = 0$ for $|t| \ge 2\delta$, where $0 < \delta \ll 1$. Define the functions $\psi_{1,k}$ and $\psi_{2,k}$ by

$$\psi_{1,k} = -(\rho_k g_j)(x')\,x_n\,\varphi(x_n), \quad x_n < 0, \qquad \psi_{2,k} = -(\rho_k h_j)(x')\,\varphi(x_n), \quad x_n > 0.$$
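The boundary traces of these two functions can be computed directly. The following short derivation is a sketch under the assumption that, in the chart $V_k$, the unit normals on $\{x_n = 0\}$ are $\nu = \partial_{x_n}$ and $\nu' = -\partial_{x_n}$; it uses only $\varphi(0) = 1$ and $\varphi'(0) = 0$, which hold because $\varphi = 1$ near the origin.

```latex
% Assumption: \nu = \partial_{x_n} and \nu' = -\partial_{x_n} on x_n = 0.
\begin{align*}
\psi_{1,k}\big|_{x_n=0} &= 0, \\
\psi_{2,k}\big|_{x_n=0} &= -(\rho_k h_j)(x')\,\varphi(0) = -(\rho_k h_j)(x'), \\
\partial_{\nu'}\psi_{1,k}\big|_{x_n=0}
  &= (\rho_k g_j)(x')\,\big[\varphi(x_n) + x_n\varphi'(x_n)\big]_{x_n=0}
   = (\rho_k g_j)(x'), \\
\partial_{\nu}\psi_{2,k}\big|_{x_n=0} &= -(\rho_k h_j)(x')\,\varphi'(0) = 0 .
\end{align*}
% Summing over k and using \sum_k \rho_k = 1 on \Gamma:
\begin{align*}
\sum_k \big(\psi_{1,k} - \psi_{2,k}\big)\big|_{x_n=0} &= h_j, &
\sum_k \big(\partial_{\nu'}\psi_{1,k} + \alpha\,\partial_\nu\psi_{2,k}\big)\big|_{x_n=0} &= g_j .
\end{align*}
```

These are the relations (2.2) and (2.3) for the sums over $k$.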


Clearly, the functions $w_i^{(j)} = \sum_k \psi_{i,k}$, $i = 1, 2$, satisfy (2.2)-(2.4). Set $v_i^{(j)} = u_i^{(j)} - w_i^{(j)}$, $i = 1, 2$. In view of (2.2)-(2.4) we have for any $N > 1$,

$$\begin{cases} \|(c^2\Delta + k_j^2)v_1^{(j)}\|_{L^2(\mathcal{O})} = O_N(|k_j|^{-N}), \\ \|(\Delta + k_j^2)v_2^{(j)}\|_{L^2(\Omega)} = O_N(|k_j|^{-N}), \\ v_1^{(j)}|_\Gamma - v_2^{(j)}|_\Gamma = 0, \\ \partial_{\nu'} v_1^{(j)}|_\Gamma + \alpha\,\partial_\nu v_2^{(j)}|_\Gamma = 0. \end{cases}$$

Hence $v^{(j)} = (v_1^{(j)}, v_2^{(j)}) \in D(P)$ and

$$\|(P - k_j^2)v^{(j)}\|_H = O_N(|k_j|^{-N}). \tag{2.5}$$

Moreover, we have

$$1 = \|v_1^{(j)}|_\Gamma\|_{L^2(\Gamma)} \le C_1 \|v_1^{(j)}\|_{H^1(\mathcal{O})} \le C_2\big(\|Pv^{(j)}\|_H + \|v^{(j)}\|_H\big) \le C_2(1 + |k_j|^2)\|v^{(j)}\|_H + O_N(|k_j|^{-N}).$$

Hence

$$\|v^{(j)}\|_H \ge C|k_j|^{-2}, \tag{2.6}$$

provided $|k_j|$ is large enough. Since $P$ is selfadjoint, by (2.5) and (2.6) we get

$$|\mathrm{Im}\,k_j| \le C_N |k_j|^{-N}. \tag{2.7}$$
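The step from (2.5) and (2.6) to (2.7) can be spelled out as follows. This is a sketch of the standard argument for selfadjoint operators, written with $z = k_j^2$ and $v = v^{(j)}$; the constants $C_{N+2}$ and $C$ stand for the constants implicit in (2.5) and (2.6), and the inequality $\mathrm{Re}\,k_j \ge 1$ is taken from the definition of a quasimode.

```latex
% Since P is selfadjoint, <Pv, v>_H is real, so for any z in C:
\begin{align*}
|\operatorname{Im} z|\,\|v\|_H^2
  &= \big|\operatorname{Im}\,\langle (P - z)v, v\rangle_H\big|
   \le \|(P - z)v\|_H\,\|v\|_H .
\end{align*}
% With z = k_j^2 one has Im z = 2 (Re k_j)(Im k_j) and Re k_j >= 1, hence
\begin{align*}
|\operatorname{Im} k_j|
  \le \frac{\|(P - k_j^2)v\|_H}{2\,\operatorname{Re} k_j\,\|v\|_H}
  \le \frac{C_{N+2}\,|k_j|^{-N-2}}{2\,C\,|k_j|^{-2}}
  = C_N\,|k_j|^{-N} .
\end{align*}
```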

Therefore, we can suppose that all $k_j$ are real and $0 < k_1 < k_2 < \cdots$. For any $0 < h \le h_0$, $h_0 > 0$, we set $P(h) = h^2 P$, and define $E(h) = (hk_j)^2$ and $u(h) = u_j$ for $1/k_j \le h < 1/k_{j-1}$. Applying the main theorem in [23] and the remark after it we find resonances $\lambda(h)$, $0 < h \le h_0$, such that $\lambda(h)^2 - E(h) = O(h^\infty)$. Then $\lambda_j = k_j\,\lambda(1/k_j)$ form the desired sequence of resonances. Notice that the $\lambda_j$ may not all be different from each other. □

Remark 2.2. In Sect. 5.3 we associate to each zero of the Airy function a suitable elliptic pseudodifferential operator $Q$ of order one on $\Gamma$ given by (5.15). The quasi-resonances $k_j$ are just the eigenvalues of $Q$. Probably, using a recent result of Stefanov [21], one can get at least a lower bound $N(r) \ge Cr^{n-1}$ of the number of resonances $\lambda_j$ such that $0 < \mathrm{Im}\,\lambda_j = O_N(|\lambda_j|^{-N})$ and $|\lambda_j| \le r$. To do this it suffices to show that $Q$ is similar to a self-adjoint operator (we prove in Sect. 5 that $Q - Q^*$ is of order at most $-1$).

The problem of existence of quasi-resonances with the desired properties will be considered in the next sections. We are going to use a class of pseudo-differential operators with a large parameter $\lambda$ on a compact manifold $X$ which will be called $\lambda$-ΨDOs. To have an invariantly defined subprincipal symbol we consider $\lambda$-ΨDOs $A : C^\infty(X, \Omega^{1/2}) \to C^\infty(X, \Omega^{1/2})$ acting on sections of the half-density bundle $\Omega^{1/2}$ of $X$. Locally, any such $A$ is defined by

$$\big(A(x, D, \lambda)u\big)(x, \lambda) = \Big(\frac{\lambda}{2\pi}\Big)^n \int e^{i\lambda\langle x - y, \xi\rangle}\, a(x, \xi, \lambda)\, u(y)\, dy\, d\xi, \quad u \in C_0^\infty(X),$$


where $D = (D_1, \dots, D_n)$, $D_j = \lambda^{-1}D_{x_j} = (i\lambda)^{-1}\partial/\partial x_j$ and the amplitude $a$ is given asymptotically by

$$a(x, \xi, \lambda) \sim \sum_{j=0}^{\infty} \lambda^{-j} a_j(x, \xi). \tag{2.8}$$

The support of the symbol $a$ is defined as the closure of the union of the supports of $a_j$, $j \ge 0$. It is invariantly defined as a subset of $T^*X$, although the “complete” symbol is invariantly defined only modulo $O(\lambda^{-2})$ ([7], Theorem 18.1.33). Hereafter we consider the class of $\lambda$-ΨDOs acting on $C^\infty(X, \Omega^{1/2})$ with compactly supported symbols, which will be denoted by $\mathrm{OP}^0(X)$. More generally, given $k \in \mathbb{R}$, we say that $A \in \mathrm{OP}^k(X)$ if $\lambda^{-k}A \in \mathrm{OP}^0(X)$ for $|\lambda| \gg 1$. We shall also allow complex values for the parameter $\lambda$ in

$$\mathcal{L} = \{\lambda \in \mathbb{C} : |\mathrm{Im}\,\lambda| \le C_1,\ \mathrm{Re}\,\lambda \ge 1\}, \quad C_1 > 0. \tag{2.9}$$

Notice that the frequency set $\widetilde{WF}(A)$ of any $A \in \mathrm{OP}^0(X)$ is contained in the support of its symbol. As usual, given an unbounded set $\mathcal{M} \subset \mathcal{L}$ and a family of smooth functions $u(\cdot, \lambda) \in C_0^\infty(X)$, $\lambda \in \mathcal{M}$, we denote by $\widetilde{WF}(u)$ the frequency set of $u$. Recall that $(x_0, \xi_0) \notin \widetilde{WF}(u)$ if there exist neighborhoods $U_1$ and $U_2$ of $x_0$ and $\xi_0$ respectively such that for any $\varphi \in C_0^\infty(U_1)$ the Fourier transform $\widehat{\varphi u}(\lambda\xi, \lambda)$, $\lambda \in \mathcal{M}$, is rapidly decreasing as $\mathrm{Re}\,\lambda \to +\infty$ uniformly with respect to $\xi \in U_2$ (see [1,5], and [3] in case of distributions). More generally, given $k, m \in \mathbb{R}$ we denote by $\mathrm{OP}^{k,m}(X)$ the class of $\lambda$-ΨDOs $A(\lambda)$, $\lambda \in \mathbb{R}$, having in any local coordinates symbols $\lambda^k a(x, \xi, \lambda)$, where $a$ is given asymptotically by (2.8) and

$$|\partial_x^\alpha \partial_\xi^\beta a_j(x, \xi)| \le C_{\alpha,\beta,j}(1 + |\xi|)^{m - j - |\beta|}.$$

We keep the same notation $A(\lambda)$ for its holomorphic extension given by Proposition A.1 in [19] in any domain $\{\lambda \in \mathbb{C} : |\mathrm{Im}\,\lambda| \le |\lambda|^{1-\delta}\}$ for $\delta > 0$ fixed. Moreover, $A(\lambda)$ can be represented by a $\mathrm{Re}\,\lambda$-ΨDO $B(\mathrm{Re}\,\lambda)$ with a large parameter $\mathrm{Re}\,\lambda$ or by a $|\lambda|$-ΨDO $C(|\lambda|)$ with a large parameter $|\lambda|$ in that domain, and $\varrho \notin \widetilde{WF}(A)$ implies $\varrho \notin \widetilde{WF}(B)$ and $\varrho \notin \widetilde{WF}(C)$ (for more details see [19], Appendix A).

We are going to reduce the problem (2.1) to a suitable interior problem. Set $D^2 = D_1^2 + \cdots + D_n^2$. Since $0 < c < 1$, the set $\partial\Sigma = \{(y, \eta) \in T^*\Gamma : c\|\eta\| = 1\}$ belongs to the elliptic region for the exterior Dirichlet problem for the $\lambda$-differential operator $D^2 - 1$. Given a family $f(\cdot, \lambda) \in C^\infty(\Gamma)$, $\lambda \in \mathcal{L}$, with $\|f\|_{L^2} = 1$ and $\widetilde{WF}(f)$ contained in $\partial\Sigma$, and hence in the elliptic region, there exist $u_2(\cdot, \lambda) \in C^\infty(\Omega)$, $\lambda \in \mathcal{L}$, supported in a compact neighborhood of $\Gamma$ such that

$$\begin{cases} (D^2 - 1)u_2 = O(|\lambda|^{-\infty})f & \text{in } \Omega, \\ u_2 = f + O(|\lambda|^{-\infty})f & \text{on } \Gamma, \end{cases} \tag{2.10}$$

where $u_2(\cdot, \lambda) = R(\lambda)f(\cdot, \lambda)$ and $R(\lambda)$ is a $\lambda$-Fourier integral operator with a complex phase function (see the appendix in [3]). Moreover, the restriction of $\partial_\nu R(\lambda)$ to $\Gamma$ is a $\lambda$-ΨDO on $\Gamma$ and we obtain

$$\lambda^{-1}N(\partial_\nu u_2)|_\Gamma = u_2|_\Gamma + O(|\lambda|^{-\infty})u_2|_\Gamma,$$

where $N$ is a $\lambda$-ΨDO in $\mathrm{OP}^0(\Gamma)$ with a real-valued principal symbol $\sigma(N)$ which is elliptic on $\partial\Sigma$. Hereafter $O(|\lambda|^{-\infty}) : H^{-s} \to H^s$, $s \ge 0$, stands for operators which

418

G. Popov, G. Vodev

are bounded in the corresponding Sobolev spaces with norms < Cs,N |λ|−N for each N > 1. Moreover, σ (N)(y, η) = (q0 (y, η) − 1)−1/2

(2.11)

in a neighborhood of ∂6, where q0 (y, η) = ||η||2y , (y, η) ∈ T ∗ 0, is the principal symbol of the Laplace-Beltrami operator on 0. In what follows, we are going to investigate the “interior” problem  (c2 D2 − 1)u = O(|λ|−∞ )f in O,  u=f on 0, (2.12)  −1 −1 α λ N(∂ν 0 u)|0 + u|0 = O(|λ|−∞ )u|0 . We will show that this problem has a solution for infinitely many discrete values {kj }∞ j =1 ⊂ L of λ and a sequence of functions {fj }∞ such that |k | −→ +∞, kf k = 1 and 2 j j L j =1 g j }) ⊂ ∂6. Then using (2.10) we obtain the desired quasimode for the transmission WF({f problem satisfying (2.1). To obtain kj and fj we construct a “normal form” of the boundary value problem (2.12) near the glancing manifold K. First we are going to find a symplectic normal form for the corresponding classical problem. 3. Global Symplectic Normal Forms of Glancing Hypersurfaces We denote by g(x, ξ ) = |ξ |2 − c−2 the principal symbol of the operator P = D2 − c−2 and choose f ∈ C ∞ (Rn ) such that {f = 0} and ∇f (x) = ν 0 (x) on 0, and f > 0 in O, f < 0 in . Set f (x, ξ ) = f (x) for (x, ξ ) ∈ T ∗ Rn and consider the pair of hypersurfaces F = {(x, ξ ) ∈ T ∗ Rn : f (x, ξ ) = 0}, G = {(x, ξ ) ∈ T ∗ Rn : g(x, ξ ) = 0} in a neighborhood of K = {f = g = {f, g} = 0}. Since 0 is strictly convex, F and G form a pair of transversal glancing hypersurfaces (see [7,12] and [13]), i.e. F and G intersect transversally at J = F ∩ G, and for any % ∈ J , either the integral curve of the Hamiltonian vector field of Hf (resp. Hg ) passing through % intersects G (resp. F ) transversally or it is simply tangent to G (resp. F ). In other words, the equality {f, g}(%) = 0, % ∈ J, implies {f, {f, g}}(%) 6 = 0, {g, {g, f }}(%) 6= 0.

(3.1)

Here $\{\cdot,\cdot\}$ stands for the Poisson brackets in $T^*\mathbb{R}^n$ corresponding to the standard symplectic two-form $\omega = d\xi_1 \wedge dx_1 + \dots + d\xi_n \wedge dx_n$. In our case, as $\Gamma$ is strictly convex, we have $\{f,g\}(x,\xi) = 2\langle \nu(x),\xi\rangle$, $(x,\xi) \in F$, and

$$\{f,\{f,g\}\}(x,\xi) = 2, \quad \{g,\{g,f\}\}(x,\xi) = 4\langle \nabla^2 f(x)\xi,\xi\rangle = -4\,II(x,\xi) < 0$$

on $K$. Hereafter $II(x,t)$, $t \in T_x\Gamma$, stands for the second fundamental form on $\Gamma$ associated with the exterior normal field $\nu(x)$, $x \in \Gamma$, i.e. $II(x,t) = -\langle d\nu(x)t,t\rangle$, and we identify covectors $\xi \in T_x^*\Gamma$ with vectors $t \in T_x\Gamma$ via the Euclidean metric.

We are going to find a global symplectic normal form for the pair $F$, $G$ in a neighborhood of the glancing manifold $K$, and then to transform simultaneously the operator $P$ and the boundary conditions in (2.12) to a suitable "normal form". Here we make use of some arguments from [15,16] and [17].

First we recall some basic facts about glancing hypersurfaces which can be found in [7,12] and [13]. Because of (3.1), the glancing set $K = \{f = g = \{f,g\} = 0\}$ is a submanifold of $J$. Indeed, the differentials $df$, $dg$, and $d\{f,g\}$ are linearly independent on $K$, since $df$ and $dg$ are linearly independent over $J$ and

$$df(H_f) = dg(H_f) = 0, \quad d\{f,g\}(H_f) \ne 0$$

at $K$ according to (3.1). There are two smooth involutions $J_F : J \to J$, $J_G : J \to J$, defined as follows: for any $\varrho$ in $J$ outside $K$, $J_F(\varrho)$ (resp. $J_G(\varrho)$) is the second point of intersection with $J$ of the integral curve of $H_f$ (resp. $H_g$) passing through $\varrho$, and $J_F$ (resp. $J_G$) coincides with the identity mapping on the glancing manifold $K$. Moreover, their differentials are linearly independent at any point of $K$. Notice that $J_F(\varrho)$ (resp. $J_G(\varrho)$) does not depend on the particular choice of the functions $f$ and $g$ defining $F$ and $G$ but only on the Hamiltonian foliations of $F$ and $G$.

We denote by $\Sigma$ the quotient $\Sigma = J/J_F$, and by $\pi : J \to \Sigma$ the natural projection. Obviously, $\Sigma$ is the coball bundle $\{(x,\xi) \in T^*\Gamma : c\|\xi\| \le 1\}$ of $\Gamma$ in our case. Because of (3.1), $\pi$ is a double foliation with a fold singularity on $K$. Hence $\pi$ has two right inverse continuous maps $\pi^\pm$, $\pi \circ \pi^\pm = \mathrm{Id}$, where $\pi^+(x,\xi) = (x,\tilde\xi)$ and the covector $\tilde\xi$ is positive on the unit inward normal $\nu'(x)$ to $\mathcal{O}$ at $x$. The billiard ball map is defined as the boundary map

$$B = \pi \circ J_G \circ \pi^+. \tag{3.2}$$

Fix $\delta > 0$ sufficiently small and set $D = \Gamma \times (-\delta,\delta)$. Introduce normal geodesic coordinates $y = (y',y_n) : W \to D$ in a small neighborhood $W$ of $\Gamma$ in $\mathbb{R}^n$ as follows: for any $z \in W$ there exists a unique $y(z) = (y'(z),y_n(z)) \in D$ such that

$$z = y_n(z)\,\nu'(y'(z)) + y'(z), \quad z \in W,$$

in the Cartesian coordinates in $\mathbb{R}^n$. The map $y : W \to D$ is smooth, and we have

$$f(y,\eta) = y_n + O(y_n^2), \quad g(y,\eta) = \eta_n^2 + q(y,\eta'), \quad (y,\eta) \in T^*D, \tag{3.3}$$

where

$$\|\eta'\|_{y'}^2 = q_0(y',\eta') = q(y',0,\eta') + c^{-2}, \quad (y',\eta') \in T^*\Gamma,$$

is the Hamiltonian associated to the metric on $\Gamma$ induced by the Euclidean one. Clearly,

$$\Sigma = \{(y',\eta') \in T^*\Gamma : c^2 q_0(y',\eta') \le 1\}$$

and

$$K = \{(y',0,\eta',0) \in T^*D : (y',\eta') \in \partial\Sigma\},$$

where $\partial\Sigma$ is the boundary of $\Sigma$. Note also that $y_n \ge 0$ in $W \cap \mathcal{O}$. Since $\mathcal{O}$ is strictly convex, we have

$$q_1(y',\eta') := \frac{\partial q}{\partial y_n}(y',0,\eta') = -\frac{1}{2}\{g,\{g,f\}\}(y',0,\eta',0) = 2\,II(y',\eta') > 0$$

for any $(y',\eta') \in \partial\Sigma$. We recall from [11] that the approximate interpolating Hamiltonian $\zeta$ of $B$ defines $\partial\Sigma$ ($\zeta = 0$ and $d\zeta \ne 0$ on $\partial\Sigma$) and $\zeta > 0$ inside $\Sigma$. Hence, there exists a positive function $b \in C^\infty(T^*\Gamma)$ such that

$$\zeta(y',\eta') = b(y',\eta')\,(c^{-2} - q_0(y',\eta')). \tag{3.4}$$

We are going to find a symplectic normal form for the pair of functions $f$ and $g$ in (3.3) in a neighborhood of $K$. As a consequence we obtain a normal form for the pair of glancing hypersurfaces $F$ and $G$. The main result in this section is:

Theorem 3.1. There exists an exact symplectic transformation $\chi : T^*D \to T^*D$ such that $\tilde f = f \circ \chi$ and $\tilde g = g \circ \chi$ have the form

$$\tilde f(x,\xi) = a(x',\xi')^{-1}\bigl(x_n + O(x_n^2)\bigr),$$
$$\tilde g(x,\xi) = h(x,\xi')\bigl(\xi_n^2 + x_n - \zeta(x',\xi')\bigr) + O(x_n^\infty) + O(\xi_n^\infty) + O(\zeta^\infty)$$

in a neighborhood of $K = \{x_n = \xi_n = \zeta = 0\}$ in $T^*D$, where $\zeta$ is the approximate interpolating Hamiltonian of the billiard ball map $B$ and $h \in C^\infty(T^*\Gamma \times [-\delta,\delta])$, $h > 0$ in a neighborhood of $\partial\Sigma \times \{0\}$. Moreover, $\chi(x,\xi) = (y(x,\xi),\eta(x,\xi))$ has the form

$$y_n = x_n\,a(x',\xi')^{-1} + O(x_n^2), \quad (y',\eta',\eta_n) = (x',\xi',\xi_n a(x',\xi')) + O(x_n), \tag{3.5}$$

and

$$a(x',\xi') = b(x',\xi')^{-1/2} = (2\,II(x',\xi'))^{1/3} + O(\zeta(x',\xi')), \quad h(x,\xi') = a(x',\xi')^2\,(1 + O(x_n)). \tag{3.6}$$

First we obtain:

Lemma 3.2. There exists an exact symplectic transformation $\kappa : T^*D \to T^*D$, $\kappa(x,\xi) = (y(x,\xi),\eta(x,\xi))$, such that $f_0 = f \circ \kappa$ and $g_0 = g \circ \kappa$ become

$$\begin{aligned} f_0(x,\xi) &= a(x',\xi')^{-1}\bigl(x_n + O(x_n^2)\bigr),\\ g_0(x,\xi) &= a(x',\xi')^2 \times \bigl(\xi_n^2 + x_n - \zeta(x',\xi') + x_n R_0(x',\xi') + O(x_n\xi_n) + O(x_n^2)\bigr) \end{aligned} \tag{3.7}$$

in a neighborhood of $K = \{x_n = \xi_n = \zeta = 0\}$ in $T^*D$, where $a(x',\xi') = b(x',\xi')^{-1/2}$. Moreover, (3.5) holds and

$$R_0(x',\xi') = -1 + b(x',\xi')^{3/2}\,q_1(x',\xi').$$


Proof. We are looking for the inverse $\kappa^{-1}$ of $\kappa$, which will be of the form $\kappa^{-1}(y,\eta) = (x(y,\eta),\xi(y,\eta))$. First we define the functions $x_n$ and $\xi_n$ in $T^*D$ by

$$x_n = y_n\,a(\varrho), \quad \xi_n = \eta_n\,a(\varrho)^{-1}, \quad (\varrho,y_n,\eta_n) \in T^*\Gamma \times T^*(-\delta,\delta). \tag{3.8}$$

Then the Poisson bracket $\{\xi_n,x_n\} = 1$ in $T^*D$. Taking any local coordinates $y'$ in $\Gamma$ we denote by $(y',\eta')$ the corresponding symplectic local coordinates in $T^*\Gamma$ and consider $(y',0,\eta',\eta_n)$ as local coordinates in $F$. Then we determine $(x',\xi') : T^*D \to T^*\Gamma$ by

$$H_{\xi_n}x' = 0, \quad H_{\xi_n}\xi' = 0, \quad x'|_F = y'|_F, \quad \xi'|_F = \eta'|_F,$$

where $H_{\xi_n}$ is the Hamiltonian vector field of $\xi_n$, which is transversal to $F$. Set $\kappa^{-1}(y,\eta) = (x(y,\eta),\xi(y,\eta))$. It is easy to see that the symplectic transformation $\kappa^{-1}$ does not depend on the choice of the local coordinates $y'$ in $\Gamma$. Set $\kappa(x,\xi) = (y(x,\xi),\eta(x,\xi))$. Then we have

$$\frac{\partial y'}{\partial x_n}(x,\xi) = \eta_n\{a^{-1},y'\} = O(\xi_n) \quad\text{and}\quad \frac{\partial \eta'}{\partial x_n}(x,\xi) = \eta_n\{a^{-1},\eta'\} = O(\xi_n).$$

These estimates lead to

$$\frac{\partial}{\partial x_n}\,q_0(y'(x,\xi),\eta'(x,\xi)) = \sum_{j=1}^{n-1}\left(\frac{\partial q_0}{\partial y_j}\frac{\partial y_j}{\partial x_n} + \frac{\partial q_0}{\partial \eta_j}\frac{\partial \eta_j}{\partial x_n}\right) = O(\xi_n).$$

Hence,

$$q_0(y'(x,\xi),\eta'(x,\xi)) = q_0(x',\xi') + O(x_n\xi_n).$$

In the same way we get $a(y'(x,\xi),\eta'(x,\xi)) = a(x',\xi') + O(x_n\xi_n)$. Taking into account the last two equalities, (3.3) and (3.4), we obtain

$$g \circ \kappa(x,\xi) = a(x',\xi')^2\bigl(\xi_n^2 + x_n\,a(x',\xi')^{-3}q_1(x',\xi') - \zeta(x',\xi')\bigr) + O(x_n^2) + O(x_n\xi_n),$$

since $a(x',\xi') = b(x',\xi')^{-1/2}$. It is easy to see that $\kappa$ is exact symplectic and that (3.5) holds. $\square$

Lemma 3.3. The function $R_0$ in (3.7) has the form $R_0(x',\xi') = O(\zeta(x',\xi'))$. In particular,

$$b(x',\xi') = (2\,II(x',\xi'))^{-2/3} + O(\zeta(x',\xi')).$$
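The second claim of Lemma 3.3 follows from the first in a few lines. The sketch below uses $R_0 = -1 + b^{3/2}q_1$ from Lemma 3.2 and assumes, as an extra step not spelled out in the text, that $q_1 = 2\,II + O(\zeta)$ near $\partial\Sigma$ (the two functions agree on $\partial\Sigma = \{\zeta = 0\}$, where $d\zeta \ne 0$).

```latex
\begin{aligned}
R_0 = O(\zeta)
 &\;\Longrightarrow\; b^{3/2}\,q_1 = 1 + R_0 = 1 + O(\zeta) \\
 &\;\Longrightarrow\; b = q_1^{-2/3}\,\bigl(1 + O(\zeta)\bigr)^{2/3}
   = q_1^{-2/3} + O(\zeta) \\
 &\;\Longrightarrow\; b = (2\,II)^{-2/3} + O(\zeta), \qquad
   a = b^{-1/2} = (2\,II)^{1/3} + O(\zeta).
\end{aligned}
```

The last identity is the expression for $a$ recorded in (3.6).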


Proof. Consider the glancing hypersurfaces $F_0 = \kappa^{-1}(F)$ and $G_0 = \kappa^{-1}(G)$ in a neighborhood of $K$ in $T^*D$. We have

$$F_0 = \{x_n = 0\}, \quad G_0 = \{\tilde g_0 = 0\},$$

where

$$\tilde g_0(x,\xi) = a(x',\xi')^{-2}g_0(x,\xi) = \xi_n^2 + x_n - \zeta(x',\xi') + x_n R_0(x',\xi') + O(x_n\xi_n) + O(x_n^2).$$

Given $(y',\eta') \in \Sigma$ we set $\eta_n = \sqrt{\zeta(y',\eta')}$ and consider the integral curve

$$t \mapsto (x(t),\xi(t)), \quad (x(0),\xi(0)) = (y',0,\eta',\eta_n) \in J = F_0 \cap G_0,$$

of the Hamiltonian vector field in $T^*D$ with Hamiltonian $\tilde g_0(x,\xi)$. Then

$$\begin{aligned} \dot x_n(t) &= 2\xi_n(t) + O(x_n(t)),\\ \dot\xi_n(t) &= -1 - R_0(x'(t),\xi'(t)) + O(x_n(t)) + O(\xi_n(t)), \end{aligned}$$

which implies

$$\begin{aligned} x_n(t) &= 2\eta_n t - (1 + R_0(y',\eta'))\,t^2 + O(\eta_n t^2) + O(t^3),\\ \xi_n(t) &= \eta_n - (1 + R_0(y',\eta'))\,t + O(\eta_n t) + O(t^2) \end{aligned}$$

for small $|t|$. Given $(y',\eta') \in \Sigma \setminus \partial\Sigma$ in a neighborhood of $\partial\Sigma$, we denote by $T(y',\eta') > 0$ the smallest positive solution of $x_n(T) = 0$. Obviously,

$$T(y',\eta') = 2\eta_n\,(1 + R_0(y',\eta'))^{-1} + O(\eta_n^2). \tag{3.9}$$
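For the reader who wants the step behind (3.9) written out, here is a short sketch. It only uses the expansion of $x_n(t)$ above and the fact that $1 + R_0 = b^{3/2}q_1$ is positive near $\partial\Sigma$; the intermediate bound $T = O(\eta_n)$ is an assumption made explicit here, justified by $T$ being the smallest positive root.

```latex
\begin{aligned}
0 = x_n(T) &= T\,\bigl(2\eta_n - (1+R_0)\,T + O(\eta_n T) + O(T^2)\bigr),
  \qquad T > 0,\\
(1+R_0)\,T &= 2\eta_n + O(\eta_n T) + O(T^2)
  \;\Longrightarrow\; T = O(\eta_n),\\
T &= \frac{2\eta_n}{1+R_0} + O(\eta_n^2).
\end{aligned}
```

Comparing this with an independent expansion of the return time in powers of $\eta_n = \zeta^{1/2}$ is what pins down the size of $R_0$ in the remainder of the proof.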

On the other hand, the involution $J_{G_0} : J \to J$ has the form

$$J_{G_0}(y',0,\eta',\eta_n) = (x'(T),0,\xi'(T),\xi_n(T)), \tag{3.10}$$

where $\xi_n(T) = -\sqrt{\zeta(x'(T),\xi'(T))}$. Recall that the involutions $J_{F_0}$ and $J_{G_0}$ do not depend on the choice of the functions defining $F_0$ and $G_0$ but only on the Hamiltonian foliations of $F_0$ and $G_0$. Moreover, they are symplectic invariants and so is the billiard ball map $B$ given by (3.2). On the other hand, $\pi(x',0,\xi',\xi_n) = (x',\xi')$ and $\pi^+(x',\xi') = (x',0,\xi',\zeta(x',\xi')^{1/2})$ on $\Sigma = \{\zeta \ge 0\}$. Then (3.10) implies

$$(x'(T),\xi'(T)) = B(y',\eta').$$

Since $\zeta$ is an approximate interpolating Hamiltonian of $B$ defined by (1.2), we obtain in any local coordinates

$$\begin{aligned} (x'(T),\xi'(T)) = B(y',\eta') &= \exp\bigl(-2\zeta(y',\eta')^{1/2}H_\zeta\bigr)(y',\eta') + O(\zeta(y',\eta')^\infty)\\ &= (x'(0),\xi'(0)) - 2\zeta(y',\eta')^{1/2}X_\zeta(y',\eta') + O(\zeta(y',\eta')), \end{aligned}$$

where $X_\zeta$ stands for the symplectic gradient $(\partial\zeta/\partial\eta', -\partial\zeta/\partial y')$ of $\zeta$. On the other hand, writing down the Hamiltonian system for $\tilde g_0$ we get

$$(\dot x'(t),\dot\xi'(t)) = -X_\zeta(x'(t),\xi'(t)) + O(x_n(t))$$

and we obtain

$$(x'(T),\xi'(T)) = (x'(0),\xi'(0)) - T(y',\eta')\,X_\zeta(y',\eta') + O(T^2) + O(\eta_n T).$$

Hence,

$$T(y',\eta') = 2\zeta(y',\eta')^{1/2} + O(\zeta(y',\eta')).$$

Taking into account (3.9) we obtain

$$R_0(y',\eta') = O(\zeta(y',\eta')^{1/2}) = O(\zeta(y',\eta')),$$

where the last equality holds since $R_0$ is smooth and vanishes on $\partial\Sigma = \{\zeta = 0\}$, where $d\zeta \ne 0$. $\square$

Now we set $R_0(x',\xi') = \zeta(x',\xi')\,W_0(x',\xi')$ and write

$$g_0(x,\xi) = a(x',\xi')^2 \times \bigl(\xi_n^2 + x_n - \zeta(x',\xi') + x_n\zeta(x',\xi')W_0(x',\xi') + x_n\xi_n V_0(x',\xi) + O(x_n^2)\bigr)$$

in a neighborhood of $K = \{x_n = \xi_n = \zeta = 0\}$ in $T^*D$. To get rid of $W_0$ we write

$$g_0(x,\xi) = h_1(x,\xi')\bigl(\xi_n^2 + x_n - \zeta(x',\xi') + x_n\xi_n\tilde V_0(x',\xi) + O(x_n^2)\bigr),$$

where

$$h_1(x,\xi') = a(x',\xi')^2\,(1 - x_n W_0(x',\xi')).$$
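To see why this choice of $h_1$ removes $W_0$, it may help to expand the product explicitly. The following is only a bookkeeping sketch; the identification of $\tilde V_0$ in the last line is one admissible reading, not a formula taken from the text.

```latex
\begin{aligned}
h_1\,\bigl(\xi_n^2 + x_n - \zeta\bigr)
 &= a^2\,(1 - x_n W_0)\,\bigl(\xi_n^2 + x_n - \zeta\bigr) \\
 &= a^2\,\bigl(\xi_n^2 + x_n - \zeta + x_n\zeta W_0
      - x_n\xi_n\,(\xi_n W_0) - x_n^2 W_0\bigr), \\
h_1\,x_n\xi_n\tilde V_0 &= a^2\,x_n\xi_n\tilde V_0 + O(x_n^2),
 \qquad\text{so one may take}\quad \tilde V_0 = V_0 + \xi_n W_0 .
\end{aligned}
```

The term $x_n\zeta W_0$ is reproduced exactly, the term $x_n\xi_n^2 W_0$ is of the form $x_n\xi_n(\cdot)$ and is absorbed into $\tilde V_0$, and $x_n^2 W_0$ goes into $O(x_n^2)$.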

It remains to annihilate $\tilde V_0$. To do this we perform a symplectic transformation $\chi_0$ in $T^*D$, given by the time-one flow $\exp(\Xi)$ of the Hamiltonian vector field $\Xi$ with Hamiltonian $\psi_0(x,\xi) = Q_0(x',\xi)\,x_n^2$. Applying Taylor's formula to $g_0(\exp(t\Xi)(x,\xi))$ for $t \in [0,1]$ it is easy to see that

$$g_0(\exp(\Xi)(x,\xi)) = g_0(x,\xi) + \{\psi_0,g_0\}(x,\xi) + \frac{1}{2}\int_0^1 \{\psi_0,\{\psi_0,g_0\}\}(\exp(s\Xi)(x,\xi))\,ds. \tag{3.11}$$

On the other hand,

$$\{\psi_0,g_0\}(x,\xi) = -4a(x',\xi')^2\,x_n\xi_n Q_0 + O(x_n^2).$$

Moreover,

$$\exp(s\Xi)(x,\xi) = \bigl(x + O(x_n^2),\ \xi_n + O(x_n),\ \xi' + O(x_n^2)\bigr), \quad 0 \le s \le 1,$$

hence, the last term in the right-hand side of (3.11) is $O(x_n^2)$. We obtain

$$g_0 \circ \exp(\Xi)(x,\xi) = g_0(x,\xi) - 4a(x',\xi')^2\,x_n\xi_n Q_0(x',\xi) + O(x_n^2).$$

Choosing $Q_0 = \tilde V_0/4$ we annihilate the coefficient $\tilde V_0$.

Fix an integer $k \ge 1$ and suppose that there exists a symplectic transformation $\chi_{k-1}$ given by the time-one flow $\exp(\Xi_{k-1})$ of a Hamiltonian vector field $\Xi_{k-1}$ with Hamiltonian

$$\psi_{k-1}(x,\xi) = \sum_{j=0}^{k-1} Q_j(x',\xi)\,x_n^{j+2} \tag{3.12}$$

such that $f_k = f_0 \circ \chi_{k-1}$ and $g_k = g_0 \circ \chi_{k-1}$ have the form

$$\begin{aligned} f_k(x,\xi) &= a(x',\xi')^{-1}\bigl(x_n + O(x_n^2)\bigr),\\ g_k(x,\xi) &= h_k(x,\xi')\bigl(\xi_n^2 + x_n - \zeta(x',\xi') + x_n^{k+1}R_k(x',\xi')\bigr) + O(x_n^{k+1}\xi_n) + O(x_n^{k+2}), \end{aligned}$$

where

$$h_k(x,\xi') = a(x',\xi')^2\Bigl(1 - \sum_{j=0}^{k-1} W_j(x',\xi')\,x_n^{j+1}\Bigr), \tag{3.13}$$

and $R_k$ and $W_j$, $0 \le j \le k-1$, are smooth functions in $T^*\Gamma$. We have already found $Q_0$ and $W_0$.

Lemma 3.4. We have

$$R_k(x',\xi') = \zeta(x',\xi')\,W_k(x',\xi'),$$

where $W_k \in C^\infty(T^*\Gamma)$.

Proof. As in Lemma 3.3 we consider the integral curve $(x(t),\xi(t))$ of the Hamiltonian vector field with Hamiltonian $\tilde g_k = h_k^{-1}g_k$ issuing from $(y',0,\eta',\eta_n) \in J$, where $(y',\eta') \in \Sigma$, $\eta_n = \sqrt{\zeta(y',\eta')}$. Then

$$\begin{cases} \dot x_n(t) = 2\xi_n(t) + O(x_n(t)^{k+1}),\\ \dot\xi_n(t) = -1 - (k+1)\,x_n^k R_k(x'(t),\xi'(t)) + O(x_n^k\xi_n) + O(x_n^k\zeta) + O(x_n^{k+1}),\\ (\dot x'(t),\dot\xi'(t)) = -X_\zeta(x'(t),\xi'(t)) + O(x_n^{k+1}) \end{cases}$$

for $|t| \le 4\eta_n$. Hence

$$\begin{aligned} x_n(t) &= 2\eta_n t - t^2 - 2(k+1)R_k(y',\eta')\int_0^t (2\eta_n s - s^2)^k\,ds + O(\eta_n^{2k+2}),\\ \xi_n(t) &= \eta_n - t - (k+1)R_k(y',\eta')\,(2\eta_n t - t^2)^k + O(\eta_n^{2k+1}) \end{aligned}$$

for $|t| \le 4\eta_n$, and we obtain

$$T(y',\eta') = 2\eta_n + b_k R_k(y',\eta')\,\eta_n^{2k} + O(\eta_n^{2k+1}),$$

where

$$b_k = (k+1)\int_0^2 (2s - s^2)^k\,ds > 0.$$

Moreover,

$$(\dot x'(t),\dot\xi'(t)) = -X_\zeta(x'(t),\xi'(t)) + O(\eta_n^{2k+2})$$

for $|t| \le 4\eta_n$. Then

$$(x'(t),\xi'(t)) = \exp(-tH_\zeta)(y',\eta') + O(\eta_n^{2k+2})$$

for $|t| \le 4\eta_n$. Let $\varphi \in C_0^\infty(T^*\Gamma)$. Applying Taylor's formula for the function $t \mapsto \varphi(\exp(-tH_\zeta)(y',\eta'))$ at $T = 2\eta_n + (T - 2\eta_n)$, we obtain

$$\begin{aligned} \varphi(x'(T),\xi'(T)) &= \varphi(\exp(-TH_\zeta)(y',\eta')) + O(\eta_n^{2k+2})\\ &= \varphi(\exp(-2\eta_n H_\zeta)(y',\eta')) - b_k R_k(y',\eta')\,H_\zeta\varphi(y',\eta')\,\eta_n^{2k} + O(\eta_n^{2k+1}). \end{aligned}$$

On the other hand,

$$\varphi(x'(T),\xi'(T)) = \varphi(B(y',\eta')) = \varphi(\exp(-2\eta_n H_\zeta)(y',\eta')) + O(\eta_n^\infty).$$

Therefore,

$$R_k(y',\eta') = O(\eta_n) = O(\zeta(y',\eta')^{1/2}) = O(\zeta(y',\eta'))$$

since $R_k$ is smooth. $\square$
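The constant $b_k$ in the proof above can be evaluated in closed form, which makes its positivity explicit. The computation below is a side remark, not part of the original argument; it uses the substitution $s = 1 + u$ and the standard value of $\int_{-1}^{1}(1-u^2)^k\,du$.

```latex
\begin{aligned}
b_k &= (k+1)\int_0^2 (2s - s^2)^k\,ds
     = (k+1)\int_{-1}^{1} (1 - u^2)^k\,du
     = (k+1)\,\frac{2^{2k+1}\,(k!)^2}{(2k+1)!}, \\
b_1 &= 2\cdot\frac{4}{3} = \frac{8}{3}, \qquad
b_2 = 3\cdot\frac{16}{15} = \frac{16}{5}.
\end{aligned}
```

Only the fact that $b_k \ne 0$ is used in the proof: it allows one to read off the size of $R_k$ from the coefficient of $\eta_n^{2k}$.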


Now we can write

$$g_k(x,\xi) = h_{k+1}(x,\xi') \times \bigl(\xi_n^2 + x_n - \zeta(x',\xi') + x_n^{k+1}\xi_n V_k(x',\xi) + O(x_n^{k+2})\bigr), \tag{3.14}$$

where $V_k(x',\xi)$ is a smooth function, $h_{k+1}$ is defined by (3.13) and $W_k(x',\xi')$ is given by Lemma 3.4. It remains to annihilate $V_k$. To do this we perform a symplectic transformation $\chi_k$ in $T^*D$, given by the time-one flow $\chi_k = \exp(\Xi_k)$ of a Hamiltonian vector field $\Xi_k$ with Hamiltonian

$$\psi_k(x,\xi) = \psi_{k-1}(x,\xi) + Q_k(x',\xi)\,x_n^{k+2},$$

where $Q_k$ is to be determined. Denote by $\Xi$ the Hamiltonian vector field of $Q_k(x',\xi)\,x_n^{k+2}$. Since $\psi_j(x,\xi) = O(x_n^2)$ for any $0 \le j \le k$, it is easy to see by recurrence that $\Xi_j^p g_0(x,\xi) = O(x_n^p)$ for each $p$. Moreover, for $p \ge 2$ we have

$$\Xi_k^p g_0(x,\xi) = \Xi_{k-1}^p g_0(x,\xi) + O(x_n^{k+2}).$$

Then by Taylor's formula we get

$$\begin{aligned} g_{k+1}(x,\xi) = g_0(\exp\Xi_k(x,\xi)) &= \sum_{p=0}^{k+1}\frac{1}{p!}\,\Xi_k^p g_0(x,\xi) + O(x_n^{k+2})\\ &= \sum_{p=0}^{k+1}\frac{1}{p!}\,\Xi_{k-1}^p g_0(x,\xi) + \Xi g_0(x,\xi) + O(x_n^{k+2})\\ &= g_k(x,\xi) + \Xi g_k(x,\xi) + O(x_n^{k+2}). \end{aligned}$$

Taking into account (3.14) we obtain

$$g_{k+1}(x,\xi) = g_k(x,\xi) - 2(k+2)\,h_{k+1}(x,\xi')\,x_n^{k+1}\xi_n Q_k + O(x_n^{k+2}).$$

Choosing $Q_k = (2k+4)^{-1}V_k$ we get rid of $V_k$ in (3.14). Finally we obtain

$$g_{k+1}(x,\xi) = h_{k+1}(x,\xi')\bigl(\xi_n^2 + x_n - \zeta(x',\xi')\bigr) + O(x_n^{k+2}).$$

Using Borel's theorem we obtain smooth functions $\psi$ and $h$ whose Taylor polynomials with respect to $x_n$ of degree $k$ and $k+1$ respectively are $\psi_k$ and $h_k$ for each $k \ge 1$. Then we define $\chi = \exp(H_\psi) \circ \kappa$. Obviously $\chi$ is exact symplectic by construction and it satisfies (3.5). The proof of Theorem 3.1 is complete. $\square$

4. Transformation of the Boundary Value Problem

4.1. FIOs associated to $\chi$. The symplectic transformation $(y,\eta) = \chi(x,\xi)$ in Theorem 3.1 maps a neighborhood $\mathcal{U}$ of the glancing manifold $K$ in $T^*D$ into itself, where $D = \Gamma \times (-\delta,\delta)$. Let us denote by $\Lambda$ the canonical relation

$$\Lambda = \{(y,\eta,x,\xi) \in T^*D \times T^*D : (y,\eta) = \chi(x,\xi),\ (x,\xi) \in \mathcal{U}\}.$$

Set $\Lambda' = \{(y,x,\eta,\xi) \in T^*(D \times D) : (y,\eta,x,-\xi) \in \Lambda\}$. Notice that $\Lambda$ is an exact Lagrangian manifold in view of Theorem 3.1, i.e. the restriction of the fundamental one-form $\eta\,dy - \xi\,dx$ of $T^*D \times T^*D$ to $\Lambda$ is exact. Hence, the Liouville class of $\Lambda$ is trivial, and we can consider the class $I^0(D,D,\Lambda;\lambda)$ of Fourier integral operators (FIOs) $U : C^\infty(D,\Omega^{1/2}) \to C^\infty(D,\Omega^{1/2})$ with a large parameter $\lambda \in \mathcal{L}$, associated with the canonical relation $\Lambda$ and acting on half-densities (see [2,14]), where $\mathcal{L}$ is defined by (2.9). The distribution kernel of any operator $U$ in this class can be written microlocally as an oscillatory integral of the form

$$\left(\frac{\lambda}{2\pi}\right)^n \int_{\mathbb{R}^n} e^{i\lambda\varphi(y,x,\xi)}\,J(y,\xi)\,b(y,\xi,\lambda)\,d\xi. \tag{4.1}$$

Here $\varphi(y,x,\xi) = \langle y - x,\xi\rangle + S(y,\xi)$ is a nondegenerate phase function which defines locally the Lagrangian manifold $\Lambda'$. In other words,

$$\Lambda' = \{(y,x,d_y\varphi,d_x\varphi) : d_\xi\varphi(y,x,\xi) = 0\},$$

which is equivalent to the relation

$$\Lambda = \{(y,\ \xi + \partial S/\partial y,\ y + \partial S/\partial\xi,\ \xi),\ (y,\xi) \in \mathcal{U}\}. \tag{4.2}$$

Moreover, we have

$$J(y,\xi) = \left|\det\frac{\partial^2\varphi}{\partial y\,\partial\xi}(y,\xi)\right|^{1/2} \ne 0$$

(see (4.4) below). The amplitude $b$ is a classical symbol with respect to $\lambda \in \mathcal{L}$ with a compact support. More precisely, we suppose that $b$ is given asymptotically as $|\lambda| \to \infty$ by

$$b(y,\xi,\lambda) \sim \sum_{j=0}^{\infty} \lambda^{-j}\,b_j(y,\xi), \tag{4.3}$$

where $b(\cdot,\cdot,\lambda)$ and $b_j$, $j \ge 0$, are smooth functions supported in a fixed compact neighborhood of $K$ in $\mathcal{U}$. Note that the calculus of FIOs with a large parameter $\lambda \gg 1$ developed in [2] can be extended also for $\lambda \in \mathcal{L}$, since the support of the amplitude $b$ is compact with respect to $(y,\xi)$ and $|\operatorname{Im}\lambda|$ is bounded by a constant. Notice that (3.5) and (4.2) imply

$$y_n + \frac{\partial S}{\partial\xi_n}(y,\xi) = x_n = y_n\,a(y',\xi') + O(y_n^2),$$

which yields

$$\varphi(y,x,\xi) = \langle y' - x',\xi'\rangle - x_n\xi_n + y_n\xi_n\,a(y',\xi') + O(y_n^2). \tag{4.4}$$

In particular, the Maslov class of $\Lambda$ is trivial. Hence, the principal symbol of any operator $U \in I^0(D,D,\Lambda;\lambda)$ can be identified with a smooth function in $T^*D$. More precisely, parameterizing locally $\Lambda$ by $(y,\xi)$ as above, the principal symbol of the operator $U$ with distribution kernel (4.1) can be identified with $b_0(y,\xi)$.

Consider an operator $U_1 \in I^0(D,D,\Lambda;\lambda)$ which is "microlocally unitary" in a neighborhood of $K$ for $\lambda \in \mathbb{R}$. In other words we suppose that

$$U_1(\lambda)\,U_1^*(\lambda) = \mathrm{Id} + C(\lambda), \quad \widetilde{WF}(C) \cap K = \emptyset, \quad \lambda \in \mathbb{R},$$

where $C \in OP^{0,0}(D)$ (for the definition see Sect. 2). Using a finite number of local charts in a neighborhood of $\partial\Sigma$ in $T^*\Gamma$ we extend $U_1^*(\lambda)$, $\lambda \in \mathbb{R}$, to an operator $U_2 \in I^0(D,D,\Lambda^{-1};\lambda)$, $\lambda \in \mathcal{L}$, so that

$$U_1(\lambda)\,U_2(\lambda) = \mathrm{Id} + C(\lambda), \quad \widetilde{WF}(C) \cap K = \emptyset, \quad \lambda \in \mathcal{L}. \tag{4.5}$$


Moreover, we suppose that the principal symbols of $U_1(\lambda)$ and $U_2(\lambda)$ are equal to 1 in a neighborhood of $K$.

Denote by $\mathcal{R}$ the ideal of all $R \in OP^{0,0}(D)$ having in any local chart complete symbols of the form $\sum_{j=0}^{\infty}\lambda^{-j}R_j(x,\xi)$, where

$$R_j(x,\xi) = O(x_n^\infty) + O(\xi_n^\infty) + O(\zeta^\infty), \quad j \ge 0,$$

which means that $D_x^\alpha D_\xi^\beta R_j(x,\xi) = 0$ on $K = \{x_n = \xi_n = \zeta = 0\}$ for each $\alpha$ and $\beta$ in $\mathbb{N}^n$. Notice that any operator $C \in OP^{0,0}(D)$ with frequency set $\widetilde{WF}(C) \cap K = \emptyset$ belongs to $\mathcal{R}$. In particular the operator $C$ in (4.5) belongs to $\mathcal{R}$. Given two operators $A, B \in OP^{0,0}(D)$, we say that $A = B \pmod{\mathcal{R}}$ iff $A - B \in \mathcal{R}$.

Using the calculus of FIOs with a large parameter (see [2,14]) and Theorem 3.1, we obtain

$$\tilde P_1 = U_2 P U_1 \in OP^0(D),$$

where the principal symbol of $\tilde P_1$ has the form

$$\sigma(\tilde P_1)(x,\xi) = h(x,\xi')\bigl(\xi_n^2 + x_n - \zeta(x',\xi')\bigr) + R_1(x,\xi), \tag{4.6}$$

and $R_1(x,\xi) = O(x_n^\infty) + O(\xi_n^\infty) + O(\zeta^\infty)$. Using (4.5) we get

$$U_1(\lambda)\,\tilde P_1(\lambda) = P(\lambda)\,U_1(\lambda) + C(\lambda)\,P(\lambda)\,U_1(\lambda), \quad \widetilde{WF}(C) \cap K = \emptyset.$$

Since the subprincipal symbol of $P$ is zero ($P$ is regarded as a pseudodifferential operator acting on half-densities) and the principal symbol of $U_1$ is 1, we obtain as in [1] that the subprincipal symbol of $\tilde P_1$ is also zero in a neighborhood of $K$.

Let $H(x_n,\lambda) \in OP^0(\Gamma)$, $x_n \in [-\delta,\delta]$, be a smooth family of selfadjoint $\lambda$-$\Psi$DOs with principal symbols $h(x,\xi')^{-1/2}$ and subprincipal symbols 0. Set $\tilde U_1 = U_1 H$ and $\tilde U_2 = H U_2$ and consider the operator

$$\tilde P = H \tilde P_1 H = \tilde U_2 P \tilde U_1.$$

The principal symbol of $\tilde P$ is

$$\sigma(\tilde P)(x,\xi) = \xi_n^2 + x_n - \zeta(x',\xi') + O(x_n^\infty) + O(\xi_n^\infty) + O(\zeta^\infty)$$

in view of (4.6). Moreover, $\tilde P(\lambda)$ is selfadjoint for $\lambda \in \mathbb{R}$ since $\tilde U_2(\lambda) = (U_1(\lambda)H)^*$ for $\lambda \in \mathbb{R}$. Then its subprincipal symbol is real valued. On the other hand, since the subprincipal symbols of $H$ and $\tilde P_1$ are both 0 in a neighborhood of $K$, we obtain that the subprincipal symbol of $\tilde P$ is pure imaginary, hence, it is equal to zero in a neighborhood of $K$. Adding to $\tilde P$ an operator $R \in \mathcal{R}$, we can suppose that the principal symbol of $\tilde P + R$ equals $\xi_n^2 + x_n - \zeta(x',\xi')$ and its subprincipal symbol is zero. In this way we get

$$\tilde P = P_0 + \sum_{j=2}^{\infty}\lambda^{-j}K_j \pmod{\mathcal{R}}, \tag{4.7}$$

where

$$P_0(x,D,\lambda) = D_n^2 + x_n - L(x',D',\lambda),$$

$L \in OP^0(\Gamma)$ is selfadjoint and modulo $O(|\lambda|^{-2})$ the complete symbol of $L$ is given in any local coordinates by

$$\zeta(x',\xi') + \frac{i}{2\lambda}\sum_{j=1}^{n-1}\frac{\partial^2\zeta}{\partial x_j\,\partial\xi_j}(x',\xi').$$


4.2. Boundary values at $\Gamma$. Consider now the boundary values of $\tilde U_1(\lambda)$ on $\Gamma$. Denote by $\imath : \Gamma \to D$ the inclusion map $\imath(y') = (y',0)$ and by $\imath^* : C^\infty(D) \to C^\infty(\Gamma)$ the pull-back map given by $\imath^*(w)(y') = w(y',0)$, $w \in C^\infty(D)$. Taking into account (4.1) and (4.4) we obtain

$$\imath^*\tilde U_1(\lambda) = H(0,\lambda)\,\imath^* V_1(\lambda),$$

where $V_1 \in OP^0(D)$ has a distribution kernel

$$\left(\frac{\lambda}{2\pi}\right)^n \int_{\mathbb{R}^n} e^{i\lambda\langle y - x,\xi\rangle}\,v(y,\xi,\lambda)\,d\xi,$$

and

$$v(y,\xi,\lambda) \sim \sum_{j=0}^{\infty}\lambda^{-j}\,v_j(y,\xi).$$

Since $J(y',0,\xi) = a(y',\xi')^{1/2}$, we get

$$v_0(y,\xi) = a(y',\xi')^{1/2}\,b_0(y,\xi) + O(y_n)$$

in any local coordinates, where $b_0$ is the principal symbol of $U_1(\lambda)$. In the same way we get

$$\imath^* D_n \tilde U_1(\lambda) = H(0,\lambda)\,\imath^* V_2(\lambda)D_n + \lambda^{-1}\imath^* V_3(\lambda),$$

where $V_2, V_3 \in OP^0(D)$ and the principal symbol of $V_2$ is $a(y',\xi')^{3/2}\,b_0(y,\xi) + O(y_n)$. Recall that the principal symbol of $H(0,\lambda)$ is $a(y',\xi')^{-1}$ and the principal symbol of $U_1(\lambda)$ is 1 in a neighborhood of $K$. On the other hand, any smooth function $v_j(y,\xi)$ can be written as follows:

$$v_j(y,\xi) = v_{j0}(y,\xi') + v_{j1}(y,\xi')\,\xi_n + \bigl(\xi_n^2 + y_n - \zeta(y',\xi')\bigr)\,v_{j2}(y,\xi).$$

Using that $\xi_n^2 + y_n - \zeta(y',\xi')$ is the principal symbol of the operator $\tilde P + R$, $R \in \mathcal{R}$, we easily obtain by recurrence (with respect to $j$)

$$\begin{aligned} \imath^*\tilde U_1(\lambda) &= \beta(\lambda)\,\imath^* + \lambda^{-1}\gamma(\lambda)\,\imath^* D_n + \imath^* V(\lambda)\tilde P(\lambda) + \imath^* C(\lambda),\\ \imath^* D_n\tilde U_1(\lambda) &= \beta'(\lambda)\,\imath^* D_n + \lambda^{-1}\gamma'(\lambda)\,\imath^* + \imath^* V'(\lambda)\tilde P(\lambda) + \imath^* C'(\lambda). \end{aligned} \tag{4.8}$$

Here $\beta$, $\beta'$, $\gamma$ and $\gamma'$ belong to $OP^0(\Gamma)$, the principal symbols of $\beta$ and $\beta'$ are $a(x',\xi')^{-1/2}$ and $a(x',\xi')^{1/2}$ respectively, while $C$ and $C'$ belong to $\mathcal{R}$ and $V, V' \in OP^0(D)$.

4.3. Transformation of $\tilde P$ to $P_0$.

Lemma 4.1. There exist operators $A$ and $B$ in $OP^0(D)$ such that

$$\tilde P A = (A + \lambda^{-2}B)\,P_0 \pmod{\mathcal{R}},$$

where $A$ is elliptic on $K$, and $\widetilde{WF}(A)$ and $\widetilde{WF}(B)$ are contained in a compact neighborhood of $K$ in $T^*D$. Moreover, there exist operators $A_j^0(x,D',\lambda)$ and $A_j^1(x,D',\lambda)$ in $OP^0(\Gamma)$ such that the principal symbol of $A_0^0$ is equal to one in a neighborhood of $\partial\Sigma$ and asymptotically

$$A(x,D,\lambda) = A_0^0(x,D',\lambda) + \sum_{j=1}^{\infty}\lambda^{-j}\bigl(A_j^0(x,D',\lambda) + A_j^1(x,D',\lambda)D_n\bigr) \pmod{\mathcal{R}}.$$


Proof. Fix a sufficiently small neighborhood $\mathcal{U}$ of $K$. We are looking for pseudodifferential operators $A$ and $B$ in $OP^0(D)$ given asymptotically by $A = \sum_{j=0}^{\infty}\lambda^{-j}A_j$ and $B = \sum_{j=0}^{\infty}\lambda^{-j}B_j$, where $A_j, B_j \in OP^0(D)$. We choose $A_0 \in OP^0(D)$ with a principal symbol $\sigma(A_0)$ equal to 1 in a neighborhood of $K$. Then the principal symbol $(i\lambda)^{-1}\{\xi_n^2 + x_n - \zeta, \sigma(A_0)\}$ of the commutator $[P_0,A_0]$ vanishes in a neighborhood of $K$, and in view of (4.7) we have

$$\tilde P A_0 - A_0 P_0 = \lambda^{-2}R_2 \pmod{\mathcal{R}},$$

where $R_2 \in OP^0(D)$. Then we obtain

$$\tilde P(A_0 + \lambda^{-1}A_1) - (A_0 + \lambda^{-1}A_1 + \lambda^{-2}B_0)\,P_0 = \lambda^{-2}R_3 \pmod{\mathcal{R}},$$

where the principal symbol of $R_3$ is

$$\sigma(R_3) = -i\{\xi_n^2 + x_n - \zeta, \sigma(A_1)\} - (\xi_n^2 + x_n - \zeta)\,\sigma(B_0) + \sigma(R_2).$$

We shall find $\sigma(A_1)$ and $\sigma(B_0)$ so that $\sigma(R_3) = 0$. First we solve the problem

$$\begin{cases} i\{\xi_n^2 + x_n - \zeta, u_1\}(x,\xi) - \sigma(R_2)(x,\xi) = 0,\\ u_1(x,\xi',0) = 0 \end{cases}$$

for $(x,\xi)$ in a neighborhood of $K$ and then we write

$$u_1(x,\xi) = \sigma(A_1)(x,\xi) + \bigl(\xi_n^2 + x_n - \zeta(x',\xi')\bigr)\,v_1(x,\xi),$$

where $\sigma(A_1)(x,\xi) = a_1^0(x,\xi') + a_1^1(x,\xi')\,\xi_n$ has the desired form in a neighborhood of $K$ in $\mathcal{U}$. Next we set

$$\sigma(B_0) = i\{\xi_n^2 + x_n - \zeta, v_1\},$$

and then we extend $\sigma(A_1)$ and $\sigma(B_0)$ as smooth functions with compact supports in $\mathcal{U}$. We choose pseudodifferential operators $A_1$ and $B_0$ with principal symbols prescribed as above. In the same way we obtain by recurrence $A_j$ and $B_{j-1}$ for any $j \ge 1$. $\square$

4.4. Transformation of the boundary value problem. Set $T = \tilde U_1 A$ and $T' = \tilde U_1(A + \lambda^{-2}B)$. Then using (4.8) we obtain:

Theorem 4.2. We have

$$PT = T'P_0 + T'C, \tag{4.9}$$

where $C \in \mathcal{R}$ and $T$ and $T'$ are elliptic operators in a neighborhood of $K$. Moreover,

$$\begin{aligned} \imath^* T &= Q_1\imath^* + \lambda^{-1}Q_2\,\imath^* D_n + \imath^* V P_0 + \imath^* C_1,\\ \imath^* D_n T &= Q_1'\,\imath^* D_n + \lambda^{-1}Q_2'\,\imath^* + \imath^* V' P_0 + \imath^* C_1', \end{aligned}$$

where $Q_j$ and $Q_j'$, $j = 1, 2$, are in $OP^0(\Gamma)$, the principal symbols of $Q_1$ and $Q_1'$ are $a(x',\xi')^{-1/2}$ and $a(x',\xi')^{1/2}$ respectively, $C_1, C_1' \in \mathcal{R}$ and $V, V' \in OP^0(D)$.


Set $u = Tv$ and suppose that $\widetilde{WF}(v) \subset K$. Then $Pu = O(|\lambda|^{-\infty})u$ is equivalent to $P_0 v + Cv = O(|\lambda|^{-\infty})v$, where $C \in \mathcal{R}$. If this holds, the left-hand side of the boundary condition in (2.12) becomes

$$\sqrt{-1}\,\alpha^{-1}N\,\imath^*(D_n Tv) + \imath^*(Tv) = \sqrt{-1}\,\tilde N\,\imath^* D_n v + \tilde N'\,\imath^* v + \imath^*\tilde C v + O(|\lambda|^{-\infty})v,$$

where $\tilde C \in \mathcal{R}$ and $\tilde N$ and $\tilde N'$ belong to $OP^0(\Gamma)$, and the principal symbols of $\tilde N$ and $\tilde N'$ are $\alpha^{-1}a^{1/2}\sigma(N)$ and $a^{-1/2}$ respectively. In this way we transform (2.12) to the following problem for $v$:

$$\begin{cases} P_0 v + Cv = O(|\lambda|^{-\infty})v, \quad \widetilde{WF}(v) \subset K,\\ \sqrt{-1}\,W\,\imath^* D_n v + \imath^* v + \imath^*\tilde C v = O(|\lambda|^{-\infty})v, \end{cases} \tag{4.10}$$

where $C, \tilde C \in \mathcal{R}$ and $W \in OP^0(\Gamma)$. The principal symbol $\sigma(W)$ of $W$ is real valued and taking into account (2.11) we get

$$\sigma(W)(x',\xi') = \frac{a(x',\xi')}{\alpha\sqrt{\|\xi'\|_{x'}^2 - 1}} \tag{4.11}$$

in a neighborhood of $\partial\Sigma = \{q_0 = c^{-2}\}$.

5. Construction of Quasimodes

We are going to find a sequence $\{k_j\}_{j=1}^{\infty} \subset \mathcal{L}$ and functions $v_j = v(\cdot,k_j) \in C^\infty(\mathbb{R}^n)$ satisfying (4.10) and such that $\|v_j|_\Gamma\| \ge C > 0$. First we separate the variables in (4.10) using the Airy functions. In this way we reduce (4.10) to a single pseudodifferential equation on $\Gamma$ given by (5.6). Next we transform that equation to a spectral problem for a classical elliptic pseudodifferential operator with a polyhomogeneous symbol on $\Gamma$.

5.1. Separation of the variables near $K$. The form of the operator $P_0$ above suggests that we should look for asymptotic solutions of $P_0 v + Cv = O(|\lambda|^{-\infty})v$ in terms of the Airy function, where the meaning of the right-hand side has been explained in Sect. 2. Recall that the Airy function $\mathrm{Ai}(s)$ is defined by

$$\mathrm{Ai}(s) = (2\pi)^{-1}\int_{\operatorname{Im} z = \delta} e^{iz^3/3 + izs}\,dz$$

for any $\delta > 0$. The zeros of $\mathrm{Ai}(s)$ are all negative, while for $\operatorname{Re} s \gg 1$, $|\arg s| \le \theta_0$, $\theta_0 > 0$ small enough, the function $\mathrm{Ai}(s)$ is rapidly decreasing at infinity. Moreover, $\mathrm{Ai}(s)$ satisfies the equation $(D_s^2 + s)\,\mathrm{Ai}(s) = 0$. Set $\mathrm{Ai}^{(k)}(s) = d^k\mathrm{Ai}(s)/ds^k$ and let $\tau < 0$ be a zero of $\mathrm{Ai}(s)$. It is easy to deduce from the equality above that

$$(D_n^2 + x_n)\,\mathrm{Ai}^{(k)}\bigl(\tau + \lambda^{2/3}x_n\bigr) = -k\lambda^{-2/3}\,\mathrm{Ai}^{(k-1)}\bigl(\tau + \lambda^{2/3}x_n\bigr) - \tau\lambda^{-2/3}\,\mathrm{Ai}^{(k)}\bigl(\tau + \lambda^{2/3}x_n\bigr). \tag{5.1}$$
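The identity (5.1) can be checked directly. The sketch below assumes the semiclassical convention $D_n = (i\lambda)^{-1}\partial_{x_n}$ and $D_s = -i\,d/ds$ (so that the Airy equation reads $\mathrm{Ai}'' = s\,\mathrm{Ai}$); this convention is suggested by the factor $(i\lambda)^{-1}$ in the commutator symbols of Sect. 4 but is not restated in this section. All derivatives of $\mathrm{Ai}$ are evaluated at $s = \tau + \lambda^{2/3}x_n$.

```latex
\begin{aligned}
&\mathrm{Ai}''(s) = s\,\mathrm{Ai}(s)
 \;\Longrightarrow\;
 \mathrm{Ai}^{(k+2)}(s) = s\,\mathrm{Ai}^{(k)}(s) + k\,\mathrm{Ai}^{(k-1)}(s)
 \quad\text{(Leibniz rule)},\\
&D_n^2\,\mathrm{Ai}^{(k)}(s)
 = -\lambda^{-2}\,\lambda^{4/3}\,\mathrm{Ai}^{(k+2)}(s)
 = -\lambda^{-2/3}\bigl(s\,\mathrm{Ai}^{(k)}(s) + k\,\mathrm{Ai}^{(k-1)}(s)\bigr),\\
&\lambda^{-2/3}s = \lambda^{-2/3}\tau + x_n
 \;\Longrightarrow\;
 (D_n^2 + x_n)\,\mathrm{Ai}^{(k)}(s)
 = -k\lambda^{-2/3}\,\mathrm{Ai}^{(k-1)}(s) - \tau\lambda^{-2/3}\,\mathrm{Ai}^{(k)}(s).
\end{aligned}
```

For $k = 0$ the term with $\mathrm{Ai}^{(k-1)}$ is absent because of the factor $k$. Note that the computation does not use that $\tau$ is a zero of $\mathrm{Ai}$; that property enters only through the boundary condition.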


We are ready to separate the variables for the boundary value problem (4.10) near $K$. If we had the equation $P_0 v = O(|\lambda|^{-\infty})v$ with "Dirichlet" boundary conditions $\imath^* v + \imath^*\tilde C v = O(|\lambda|^{-\infty})v$ instead of those in (4.10), we could take

$$v(x,\lambda) = \mathrm{Ai}\bigl(\tau + \lambda^{2/3}x_n\bigr)\,g(x',\lambda),$$

where $g$ satisfies the equation

$$L(x',D',\lambda)\,g + \tau\lambda^{-2/3}g = O(|\lambda|^{-\infty})g.$$

The boundary conditions in (4.10), however, make the construction more complicated. Set

$$X_0 = \mathrm{Id}, \quad X_k = \sum_{j=0}^{\infty}\lambda^{-j/3}X_{k,j}, \quad k \ge 1,$$

where $X_{k,j}$ are auxiliary operators in $OP^0(\Gamma)$ to be determined later with $\widetilde{WF}(X_{k,j})$ contained in a small neighborhood of $\partial\Sigma$ independent of $k$ and $j$. We are looking for an asymptotic solution of $P_0 v + Cv = O(|\lambda|^{-\infty})v$ of the form

$$v = v(x,\lambda) = \sum_{k=0}^{\infty}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}\bigl(\tau + \lambda^{2/3}x_n\bigr)\,X_k g, \tag{5.2}$$

with $g \in C^\infty(\Gamma)$ to be specified later on and so that

$$Lg = O(|\lambda|^{-2/3})g, \quad \widetilde{WF}(g) \subset \partial\Sigma. \tag{5.3}$$

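By (5.1) we have

$$x_n^p D_n^\ell\,\mathrm{Ai}^{(k)}\bigl(\tau + \lambda^{2/3}x_n\bigr) = O\bigl(|\lambda|^{-(2p+\ell)/3}\bigr)$$

for any indices $k$, $\ell$ and $p$, and since the principal symbol of $L$ is $\zeta$, using (5.3), we obtain for any $C \in \mathcal{R}$,

$$Cv = O\bigl(|\lambda|^{-\infty}\bigr)g = O\bigl(|\lambda|^{-\infty}\bigr)v. \tag{5.4}$$

Then $P_0 v + Cv = P_0 v + O(|\lambda|^{-\infty})g$. Moreover,

$$P_0 v = -\sum_{k=0}^{\infty}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}\bigl(\tau + \lambda^{2/3}x_n\bigr)\,Y_k g + O\bigl(|\lambda|^{-\infty}\bigr)g,$$

where

$$Y_k = LX_k + \lambda^{-2/3}\tau X_k + (k+1)\lambda^{-1}X_{k+1} = X_k\bigl(L + \lambda^{-2/3}\tau\bigr) + [L,X_k] + (k+1)\lambda^{-1}X_{k+1}.$$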
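The formula for $Y_k$ comes from applying $P_0 = D_n^2 + x_n - L$ to the series (5.2) term by term and reindexing. The following formal sketch (remainders of size $O(|\lambda|^{-\infty})$ are ignored) uses (5.1) and the fact that $L = L(x',D',\lambda)$ acts only in the tangential variables, so it commutes with multiplication by $\mathrm{Ai}^{(k)}(\tau + \lambda^{2/3}x_n)$; the argument of the Airy function is suppressed.

```latex
\begin{aligned}
P_0 v
 &= \sum_{k\ge 0}\lambda^{-k/3}
   \Bigl[\bigl(-k\lambda^{-2/3}\,\mathrm{Ai}^{(k-1)}
         - \tau\lambda^{-2/3}\,\mathrm{Ai}^{(k)}\bigr)X_k g
         - \mathrm{Ai}^{(k)}\,L X_k g\Bigr] \\
 &= -\sum_{k\ge 0}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}
   \Bigl[L X_k + \lambda^{-2/3}\tau X_k + (k+1)\lambda^{-1}X_{k+1}\Bigr] g ,
\end{aligned}
```

In the second line the sum containing $\mathrm{Ai}^{(k-1)}$ has been shifted by $k \mapsto k+1$, which produces the factor $\lambda^{-(k+1)/3}\lambda^{-2/3} = \lambda^{-k/3}\lambda^{-1}$. Writing $LX_k = X_k L + [L,X_k]$ gives the second expression for $Y_k$.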

It seems at first glance that we have to solve an infinite number of equations $Y_k g = 0$, $k \ge 0$. To overcome this difficulty we introduce an auxiliary operator

$$Z = \sum_{j=0}^{\infty}\lambda^{-j/3}Z_j,$$

where $Z_j \in OP^0(\Gamma)$ will be determined from the boundary conditions, and write

$$Y_k = X_k\bigl(L + \lambda^{-2/3}\tau + \lambda^{-1}Z\bigr) + \lambda^{-1}\bigl((k+1)X_{k+1} + \lambda[L,X_k] - X_k Z\bigr).$$

For any $k \ge 0$ we are going to determine $X_{k+1}$ in terms of $Z$ so that

$$(k+1)X_{k+1} + \lambda[L,X_k] - X_k Z = O\bigl(|\lambda|^{-\infty}\bigr). \tag{5.5}$$

Then the system of equations $Y_k g = O(|\lambda|^{-\infty})g$, $k \ge 0$, reduces to a single equation

$$\lambda Lg + \lambda^{1/3}\tau g + Zg = O\bigl(|\lambda|^{-\infty}\bigr)g, \quad \widetilde{WF}(g) \subset \partial\Sigma. \tag{5.6}$$

Obviously, (5.6) implies (5.3). We are going to solve (5.5) by recurrence. For $k = 0$ we get $X_1 = Z$ since $X_0 = \mathrm{Id}$. Then for $k = 1$ we obtain $2X_2 = -\lambda[L,Z] + Z^2$, which implies

$$X_{2,j} = \frac{1}{2}\sum_{p+q=j}Z_p Z_q - \frac{1}{2}\lambda[L,Z_j].$$

In the same way, for any $k \ge 2$ we obtain from (5.5),

$$X_{k+1,j} = (k+1)^{-1}\sum_{p+q=j}X_{k,p}Z_q - (k+1)^{-1}\lambda[L,X_{k,j}], \quad j = 0, 1, \ldots,$$

and we prove that each $X_{k,j}$, $k \ge 1$, $j \ge 0$, depends only on $Z_p$ for $p \le j$. More precisely, we obtain by recurrence that

$$X_{k,j} = P_{k,j}(Z_0,\ldots,Z_j), \tag{5.7}$$

where $P_{k,j}$ is a polynomial of the operators $Z_0,\ldots,Z_j$ and of commutators

$$\lambda^\ell\,[L,[\cdots,[L,Z_{p_1}]\cdots]Z_{p_\ell}], \quad \ell < k,$$

where $p_s \le j$ for any $s \le \ell$. Since $\tau$ is a zero of the Airy function we have $\mathrm{Ai}(\tau) = \mathrm{Ai}''(\tau) = 0$. Then (5.2) and the boundary condition in (4.10) imply

$$\mathrm{Ai}'(\tau)\,Wg + \mathrm{Ai}'(\tau)\sum_{\ell=0}^{\infty}\lambda^{-\ell/3}Z_\ell g + \sum_{k=3}^{\infty}\sum_{j=0}^{\infty}\lambda^{-(k+j-1)/3}\,\mathrm{Ai}^{(k)}(\tau)\bigl(WX_{k-1,j} + X_{k,j}\bigr)g = O\bigl(|\lambda|^{-\infty}\bigr)g. \tag{5.8}$$

Taking into account (5.7) we shall determine $Z$ so that (5.8) holds for each $g$. We set

$$Z_0 = -W, \quad Z_1 = 0,$$

and comparing the coefficients of $\lambda^{-\ell/3}$ in (5.8) for $\ell \ge 2$ we obtain

$$Z_\ell = -\sum_{k,j}\mathrm{Ai}^{(k)}(\tau)\,\mathrm{Ai}'(\tau)^{-1}\bigl(WX_{k-1,j} + X_{k,j}\bigr), \tag{5.9}$$

where the sum is taken over all nonnegative integers $(k,j)$ such that $k + j = \ell + 1$ and $k \ge 3$. In particular, $j < \ell$ and using (5.7) and (5.9) we determine the operators $Z_\ell$ and $X_{k,\ell}$, $k \ge 1$, by recurrence with respect to $\ell \ge 2$.

We have reduced the problem (4.10) to the pseudodifferential equation (5.6). In what follows, we will transform (5.6) to a spectral problem for a classical elliptic pseudodifferential operator with a polyhomogeneous symbol on $\Gamma$.
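The boundary identity (5.8) can be recovered by restricting (5.2) to $x_n = 0$. The sketch below again assumes $D_n = (i\lambda)^{-1}\partial_{x_n}$, so that $\sqrt{-1}\,D_n = \lambda^{-1}\partial_{x_n}$, uses that the operators $X_k$ do not depend on $x_n$, and drops the term $\imath^*\tilde C v$, which is $O(|\lambda|^{-\infty})g$ by (5.4).

```latex
\begin{aligned}
\imath^* v &= \sum_{k\ge 0}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}(\tau)\,X_k g, \qquad
\sqrt{-1}\,\imath^* D_n v
  = \lambda^{-1/3}\sum_{k\ge 0}\lambda^{-k/3}\,\mathrm{Ai}^{(k+1)}(\tau)\,X_k g,\\
\sqrt{-1}\,W\,\imath^* D_n v + \imath^* v
 &= \sum_{k\ge 1}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}(\tau)\,\bigl(W X_{k-1} + X_k\bigr) g
 \qquad (\mathrm{Ai}(\tau) = 0)\\
 &= \lambda^{-1/3}\,\mathrm{Ai}'(\tau)\,(W + Z)\,g
   + \sum_{k\ge 3}\lambda^{-k/3}\,\mathrm{Ai}^{(k)}(\tau)\,\bigl(W X_{k-1} + X_k\bigr) g .
\end{aligned}
```

The last line uses $X_0 = \mathrm{Id}$, $X_1 = Z$ and $\mathrm{Ai}''(\tau) = \tau\,\mathrm{Ai}(\tau) = 0$. Multiplying by $\lambda^{1/3}$ and expanding $X_k = \sum_j \lambda^{-j/3}X_{k,j}$ gives (5.8). The coefficient of $\lambda^0$ is $\mathrm{Ai}'(\tau)(W + Z_0)$, the coefficient of $\lambda^{-1/3}$ is $\mathrm{Ai}'(\tau)Z_1$ (no pair with $k \ge 3$ has $k + j = 2$), and for $\ell \ge 2$ one gets (5.9); dividing by $\mathrm{Ai}'(\tau)$ is legitimate because the zeros of $\mathrm{Ai}$ are simple.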


5.2. Reduction to a $\lambda$-$\Psi$DO on $\Gamma$ with a polyhomogeneous symbol. Let $E(\lambda) \in OP^0(\Gamma)$ be selfadjoint (for $\lambda$ real) with a principal symbol $c\,b(x',\xi')^{-1/2}(1 + c\|\xi'\|)^{-1/2}$ in a neighborhood of $\partial\Sigma$ and subprincipal symbol 0, where $b$ is given by (3.4), $\|\xi'\| = \|\xi'\|_{x'} = \sqrt{q_0(x',\xi')}$, and we shall sometimes drop $x'$. Setting $g = E(\lambda)g_0$ in (5.6) we obtain

$$(G(\lambda) - \lambda)\,g_0 = O(|\lambda|^{-\infty})g_0, \quad \widetilde{WF}(g_0) \subset \partial\Sigma, \tag{5.10}$$

where $G(\lambda) = -E(\lambda)\bigl(\lambda L + \lambda^{1/3}\tau + Z\bigr)E(\lambda) + \lambda$ is given asymptotically by

$$G(\lambda) = \lambda G_0(x',D',\lambda) + \sum_{j=2}^{\infty}\lambda^{1-j/3}\,G_j(x',D',\lambda), \quad G_j \in OP^0(\Gamma). \tag{5.11}$$
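The leading symbols in (5.11) can be read off from the principal symbols of $E$ and $L$. The sketch below assumes the sign convention $G(\lambda) = -E(\lambda)(\lambda L + \lambda^{1/3}\tau + Z)E(\lambda) + \lambda$, which is the one compatible with the principal symbols quoted for $G_0$, $G_2$ and $G_3$ in this section, and uses $\sigma(L) = \zeta = b\,(c^{-2} - \|\xi'\|^2)$ from (3.4).

```latex
\begin{aligned}
\sigma(E)^2 &= \frac{c^2}{b\,(1 + c\|\xi'\|)}, \qquad
\sigma(E)^2\,\zeta
  = \frac{c^2\,(c^{-2} - \|\xi'\|^2)}{1 + c\|\xi'\|}
  = \frac{1 - c^2\|\xi'\|^2}{1 + c\|\xi'\|}
  = 1 - c\|\xi'\|,\\
\sigma(G_0) &= 1 - \sigma(E)^2\,\zeta = c\|\xi'\|, \qquad
\sigma(G_2) = -\tau\,\sigma(E)^2
  = -\frac{\tau c^2}{b\,(1 + c\|\xi'\|)}, \qquad
\sigma(G_3) = -\sigma(E)^2\,\sigma(Z_0).
\end{aligned}
```

In particular $\sigma(G_0) - 1 = c\|\xi'\| - 1$ vanishes exactly on $\partial\Sigma = \{c\|\xi'\| = 1\}$, consistent with (5.10) being localized there.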

We can suppose that the operators $G_0$ and $G_2$ are selfadjoint for $\lambda$ real. Moreover, their principal symbols are

$$\sigma(G_0)(x',\xi') = c\|\xi'\|, \quad \sigma(G_2)(x',\xi') = -\tau c^2\,b(x',\xi')^{-1}(1 + c\|\xi'\|)^{-1},$$

in view of (3.6), and their subprincipal symbols are both 0 in a neighborhood of $\partial\Sigma$. The operator $G_3$ is selfadjoint too for $\lambda$ real, its principal symbol is

$$\sigma(G_3) = -\frac{c^2\,\sigma(Z_0)(x',\xi')}{b(x',\xi')(1 + c\|\xi'\|)} = \frac{c^2\,\sigma(W)(x',\xi')}{b(x',\xi')(1 + c\|\xi'\|)} = \frac{c^2\,a(x',\xi')^3}{\alpha\sqrt{\|\xi'\|^2 - 1}\,(1 + c\|\xi'\|)}$$

in view of (3.6) and (4.11), and its subprincipal symbol is 0 in a neighborhood of $\partial\Sigma$. Moreover, $G_4 = 0$ since $Z_1 = 0$. It is not hard to calculate also $Z_2$ and then $G_5$ and to prove that its principal symbol is also real valued.

Conjugating $G(\lambda) - \lambda$ with a suitable elliptic $\lambda$-$\Psi$DO, we can suppose that there exists a neighborhood $\mathcal{U}$ of $\partial\Sigma$ such that the full symbol of each operator $G_j(\lambda)$, $j \ge 0$, in (5.11) has the form

$$h^{(j)}(x',\xi',\lambda) \sim \sum_{p=0}^{\infty} h_p^{(j)}(x',\xi')\,\lambda^{-j/3-p}, \quad (x',\xi') \in \mathcal{U}, \tag{5.12}$$

in any local coordinates, where $h_p^{(j)}$ is smooth in $T^*\Gamma$, homogeneous of order $1 - j/3 - p$ with respect to $\xi'$ outside a small neighborhood $\mathcal{U}_0$ of the zero section such that $\mathcal{U}_0 \cap \mathcal{U} = \emptyset$, and $h^{(j)} = 0$ in a fixed neighborhood of the zero section. We can assume that this holds for $G_0$. Indeed, its principal symbol is homogeneous in a neighborhood $\mathcal{U}$ of $\partial\Sigma$, the subprincipal symbol of $G_0$ is 0, and modifying if necessary the lower order terms, we replace $G_0$ with an operator of the desired form. This would change $G_j$, $j \ge 6$, eventually. Fix $k \ge 2$ and suppose that each $G_j$, $j < k$, has the desired form. Then we write the principal symbol of $G_k$ as follows:

$$\sigma(G_k)(x',\xi') = h_0^{(k)}(x',\xi') + s_k(x',\xi')\,(c\|\xi'\| - 1)$$


in U, where h0 (x 0 , ξ 0 ) is homogeneous of order 1 − k/3 in T ∗ 0 outside a small neighborhood U0 of the zero section. In particular, (k) / U0 . (5.13) h0 (x 0 , ξ 0 ) = σ (Gk ) x 0 , ξ 0 /(c||ξ 0 ||) (c||ξ 0 ||)1−k/3 , (x 0 , ξ 0 ) ∈ (k)

ek an operator of the desired form with a principal symbol h . Let Sk have Denote by G 0 a principal symbol −sk /2. Then we get 1 + λ−k/3 Sk (λ) (G(λ) − λ) 1 + λ−k/3 Sk (λ) =

k−1 X

ek (λ) + λ1−(k+1)/3 G ek+1 (λ) + λ1−k/3 Rk+1 (λ), λ1−j/3 Gj (λ) + λ1−k/3 G

j =0

ek+1 ∈ OP0 (0) and Rk+1 ∈ OP0,0 (0), WF(R g k+1 ) ∩ U = ∅. In this way we where G obtain by recurrence ∞ X λ−j/3 Sj +2 (λ) ∈ OP0 (0) S(λ) ∼ j =0

e and G(λ) of the desired form such that e − λ + λ R(λ), (1 + λ−2/3 S(λ)) (G(λ) − λ) (1 + λ−2/3 S(λ)) = G(λ) where

g F(R) ∩ ∂6 = ∅. R ∈ OP0,0 (0), W e ∩ ∂6 = ∅, we e g F(R) Finally, adding to G(λ) an operator R(λ) ∈ OP0,0 (0) with W can assume that each operator Gj (λ) belongs to OP0,0 (0) and that its full symbol is polyhomogeneous outside U0 and has the form (5.12) in any local coordinates. Fix r so that c−1 > r > 0, and set U = {(x 0 , ξ 0 ) ∈ T ∗ 0 : ||ξ ||x > r}.

(5.14)

Then ∂6 ⊂ U and we suppose that U ∩ U0 = ∅. Then in any local coordinates the full symbol of Gj (λ) has the form (5.12), it vanishes in a neighborhood of the zero section (j ) in U0 , and hp (x 0 , ξ 0 ), p ≥ 0, are homogeneous of order 1 − j/3 − p in U. 5.3. Reduction to a classical pseudodifferential operator on 0. Consider the class of “classical” pseudodifferential operators 9 k (0), k ∈ R, acting on half densities in 0 with symbols in the Fréchet space S k (T ∗ 0). Given k ∈ R and 0 < h ≤ 1, such that 1/ h ∈ Z, k,h we denote by 9phg (0) the class of pseudodifferential operators with polyhomogeneous k,h k,1 k (T ∗ 0) of order k and step h (see [7, Chap. 18.1]). Set Sphg = Sphg symbols in Sphg

k = 9 k,1 . In what follows we shall replace each operator λ1−j/3 G (x 0 , D 0 , λ), and 9phg j phg

$j\ge 0$, in (5.10) by a classical pseudodifferential operator $Q_j(x',D')\in\Psi^{1-j/3}_{phg}(\Gamma)$. More precisely, we define the distribution kernel of $Q_j(x',D')$, $j\ge 0$, microlocally by the oscillatory integral

\[
Q_j(x',y')=(2\pi)^{1-n}\int_{\mathbb{R}^{n-1}}e^{i\langle x'-y',\xi'\rangle}\,h^{(j)}(x',\xi',1)\,d\xi',
\]

where $h^{(j)}$ is given by (5.12). We can suppose that $Q_0(x',D')=c\sqrt{-\Delta_\Gamma}$. Consider the operator

\[
Q(x',D')=c\sqrt{-\Delta_\Gamma}+\sum_{j=2}^{\infty}Q_j(x',D')\in\Psi^{1,1/3}_{phg}(\Gamma), \tag{5.15}
\]

which is well defined modulo a smoothing operator. 1/3 0 (0) are selfadjoint, The operators Q2 (x 0 , D 0 ) ∈ 9phg (0) and Q3 (x 0 , D 0 ) ∈ 9phg their principal symbols are σ (G2 )(x 0 , ξ 0 ) = −2−1√τ c(2 II (x 0 , ξ 0 ))2/3 ||ξ 0 ||−1 , σ (G3 )(x 0 , ξ 0 ) = c3 (α 1 − c2 )−1 II (x 0 , ξ 0 )||ξ 0 ||−2

(5.16)

in view of (5.13) and (3.6), and the subprincipal symbols are both 0. Moreover, Q4 = 0. We say that Sλ belongs to 9 k (0) uniformly with respect to λ ∈ L if the family of the “complete” symbols of Sλ is bounded in the Fréchet space S k in any local coordinates. Then the operators Sλ : H s (0) −→ H s−k (0), λ ∈ L, are uniformly bounded for any s ∈ R. It will be said that Sλ ∈ 9 −∞ (0) uniformly with respect to λ ∈ L if Sλ : H s1 (0) −→ H s2 (0), λ ∈ L, are uniformly bounded for any s1 and s2 . Proposition 5.1. We have G(x 0 , D, λ) − λ = (1 + λ−1 Pλ (x 0 , D 0 ))(Q(x 0 , D 0 ) − λ) + C(λ) + Rλ (x 0 , D 0 ), where Pλ ∈ 9 1 (0) and Rλ ∈ 9 −∞ (0) uniformly with respect to λ ∈ L and C(λ) ∈ OP0,0 (0), ] WF(C) ∩ ∂6 = ∅. Proof. Fix r0 and r1 in R so that c−1 > r0 > r1 > r > 0, where r is introduced in (5.14), and choose ϕ ∈ C ∞ (R), ϕ(t) = 1 for t ≥ r0 and ϕ(t) = 0 for t ≤ r1 . Choose a finite atlas of local (conic) charts in T ∗ 0 \ 0 and consider G(λ) in a given local chart g F C ∩ ∂6 = ∅, to with local coordinates (x 0 , ξ 0 ). Adding a λ-PDO C(λ) ∈ OP0 (0), W G(λ), we suppose that in that local coordinates the complete symbol of Gj (λ), j ≥ 1, has the form ϕ ||ξ 0 ||x 0 h(j ) (x 0 , ξ 0 , λ), where h(j ) is given by (5.14). Denote by e ϕ (z), z ∈ C, an almost analytic extension of ϕ such that e ϕ (z) = 1 in ϕ (z) = 0 in {|z| ≤ r1 }, and set {|z| ≥ r0 } and e ψ(z) = 1 − e ϕ (z). Then supp ψ ⊂ {|z| ≤ r0 },


and the support of ∂z e ϕ (z) = O(|Im z|∞ ) is contained in {r1 ≤ |z| ≤ r0 }. For each j ≥ 0 and λ > 1 we write microlocally the distribution kernel of Gj (λ) in the given local coordinates as an oscillatory integral λ1−j/3 Gj (x 0 , y 0 , λ) = (2π)1−n

Z Rn−1

0 0 0 eihx −y ,ξ i ϕ ||ξ 0 ||x 0 /λ h(j ) (x 0 , ξ 0 , 1) dξ 0 . (5.17)

Set

e ϕ ||ξ 0 ||x 0 /λ h(j ) (x 0 , ξ 0 , 1) h(j ) (x 0 , ξ 0 , λ) = e

ej (λ), λ ∈ L, the operator with distribution kernel of the form (5.17) and denote by G e and amplitude h(j ) (x 0 , ξ 0 , λ). Notice that ϕ ||ξ 0 ||x 0 /λ ∈ S −∞,−∞ , z = 1/λ, ∂z e since r1 |λ| ≤ ||ξ 0 ||x 0 ≤ r0 |λ| on its support, which implies Im (||ξ 0 ||x 0 /λ) ≤ r0 |Im λ||λ|−1 ≤ C|λ|−1 ≤ C 0 (|λ| + ||ξ 0 ||x 0 )−1 , λ ∈ L. ej (λ) coincides with the holomorphic extension of Gj (λ) in L given by PropoHence, G sition A.1, [19], modulo an operator in OP−∞,−∞ (0). On the other hand, ej (λ) = Qj (x 0 , D 0 ) + Rjλ (x 0 , D 0 ), G where the full symbol of Rjλ (x 0 , D 0 ) is given in the local coordinates by ψ ||ξ 0 ||x 0 /λ h(j ) (x 0 , ξ 0 , 1). Set ψ1 (z) = ψ(z)(cz − 1)−1 . Then we obtain ejλ (x 0 , D 0 ), Rjλ (x 0 , D 0 ) = λ−1 Sjλ (x 0 , D 0 )(Q(x 0 , D 0 ) − λ) + λ−1 R eλ ∈ 9 −j/3 (0) uniformly with respect to λ ∈ L and the where Sjλ ∈ 9 1−j/3 (0) and R j (j )

principal symbol of Sjλ is ψ1 (||ξ 0 ||/λ)h0 (x 0 , ξ 0 ). On the other hand, the full symbol of eλ (x 0 , D 0 ) vanishes outside the ball {||ξ 0 || ≤ r0 } and we can repeat the same argument. R j In this way we obtain by recurrence ejλ (x 0 , D 0 ), Rjλ (x 0 , D 0 ) = λ−1 Pjλ (x 0 , D 0 )(Q(x 0 , D 0 ) − λ) + R

λ ∈ 9 1−j/3 (0) and R eλ ∈ 9 −∞ (0) uniformly with respect to λ ∈ L. This where Pj,1 j,1 completes the proof of the claim. u t


5.4. Construction of quasimodes. The principal symbol of the pseudodifferential oper1,1/3 ator Q ∈ 9phg (0) satisfies σ (Q)(x 0 , ξ 0 ) ≥ A||ξ 0 ||x 0 − B, (x, ξ ) ∈ T ∗ 0, for some positive constants A and B. Then there exists C > 0 such that the unbounded operator Q + C in L2 (0) with a domain of definition H 1 (0) is invertible and (Q + C)−1 is compact. Using the Riesz-Schauder theorem for (Q + C)−1 (see e.g. [18]), we obtain that the spectrum of Q is a discrete set of different eigenvalues {kj }∞ j =1 ⊂ C of finite algebraic multiplicity such that |kj | → ∞. Moreover, choosing the constant C1 in (2.9) large enough, we obtain kj ∈ L, j ≥ j0 , for some j0 ≥ 1, since Q∗ − Q is of order ≤ 0. Let gj 6 = 0 be eigenfunctions corresponding to kj . Obviously gj are smooth g functions on 0. On the other hand, W F({(kj , gj )}) ⊂ ∂6 by construction, and we −∞ g ∩ ∂6 = ∅. Using obtain C(kj )gj = O(kj )gj for any C(λ) ∈ OP0,0 (0) with WF(C) Proposition 5.1 and (5.10), we obtain that g(kj ) = E(kj )gj satisfy (5.6) for λ = kj . Denote by vj (x) = v(x, kj ) the function (5.2) choosing λ = kj and g = E(kj )gj . Then C(kj )vj = O(kj−∞ )gj for any C(λ) ∈ R, and v(x, λ) satisfies (4.10) for λ ∈ {kj }∞ j =1 . Hence, (j ) (j ) u1 = T (kj )vj |O , fj = u1 |0 (j )

satisfy (2.12) with $\lambda=k_j$. Denote by $u_2^{(j)}$ the solution of (2.10) with $\lambda=k_j$, $f=f_j$. Then the sequence $\{(k_j,u_1^{(j)},u_2^{(j)})\}_{j=1}^{\infty}$ satisfies (2.1). Using Theorem 4.2 we obtain

\[
f_j=\imath^{*}Tv_j=\sum_{p=0}^{\infty}k_j^{-(1+p)/3}\,\mathrm{Ai}^{(p+1)}(\tau)\,R_p(k_j)\,g_j,
\]

where $R_0(\lambda)$ is elliptic on $\partial\Sigma$ with a principal symbol equal to $a(x',\xi')^{-1/2}$ in a neighborhood of $\partial\Sigma$. Then we have $\|g_j\|_{L^2(\Gamma)}\le C|k_j|^{1/3}\|f_j\|_{L^2(\Gamma)}$ with a constant $C>0$ independent of $j$. So we can normalize $f_j$ so that $\|f_j\|_{L^2(\Gamma)}=1$, and we obtain the quasimode $\mathcal{Q}$, which completes the proof of Theorem 1.1.

Acknowledgements. We would like to thank Plamen Stefanov for helpful discussions.

References

1. Colin de Verdière, Y.: Quasimodes sur les variétés Riemanniennes. Invent. Math. 43, 15–52 (1977)
2. Duistermaat, J.: Oscillatory integrals, Lagrange immersions and unfolding of singularities. Commun. Pure Appl. Math. 27, 207–281 (1974)
3. Gérard, C.: Asymptotique des poles de la matrice de scattering pour deux obstacles strictement convex. Bull. Soc. Math. France, Mémoire n. 31, 116, 1988
4. Gramchev, T. and Popov, G.: Nekhoroshev type estimates for billiard ball maps. Ann. Inst. Fourier 45, 859–895 (1995)
5. Guillemin, V. and Sternberg, S.: Geometric asymptotics. Am. Math. Soc. Surveys 14, 1997
6. Hargé, T. and Lebeau, G.: Diffraction par un convexe. Invent. Math. 118, 161–196 (1984)
7. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. III, IV. Berlin–Heidelberg–New York: Springer, 1985
8. Kovachev, V. and Popov, G.: Invariant tori for the billiard ball map. Trans. Am. Math. Soc. 317, 45–81 (1990)
9. Lax, P. and Phillips, R.: Scattering Theory. New York: Academic Press, 1967
10. Magnuson, A.: Symplectic singularities, periodic orbits of the billiard ball map, and the obstacle problem. Thesis, Cambridge, MA: M.I.T., 1984
11. Marvizi, Sh. and Melrose, R.: Spectral invariants of convex planar regions. J. Diff. Geom. 17, 475–502 (1982)
12. Melrose, R.: Equivalence of glancing hypersurfaces. Invent. Math. 37, 165–191 (1976)
13. Melrose, R. and Taylor, M.: Boundary problems for wave equations with grazing and gliding rays. Unpublished manuscript
14. Petkov, V. and Popov, G.: Semi-classical trace formula and clustering of eigenvalues for Schrödinger operators. Ann. Inst. H. Poincaré (Physique Théorique) 68, 17–83 (1997)
15. Popov, G.: Quasi-modes for the Laplace operator. Unpublished manuscript, 1989
16. Popov, G.: Glancing hypersurfaces and length spectrum invariants. Unpublished manuscript, 1990
17. Popov, G.: Quasi-modes for the Laplace operator and glancing hypersurfaces. In: M. Beals, R. Melrose, J. Rauch (eds.): Proceeding of Conference on Microlocal Analysis and Nonlinear Waves, Minnesota 1989, Berlin–Heidelberg–New York: Springer, 1991
18. Reed, M. and Simon, B.: Methods of modern mathematical physics. Vol. I. New York–San Francisco–London: Academic Press, 1972
19. Sjöstrand, J. and Vodev, G.: Asymptotics of the number of Rayleigh resonances. Math. Ann. 309, 287–306 (1997)
20. Sjöstrand, J. and Zworski, M.: Complex scaling and distribution of scattering poles. J. Am. Math. Soc. 4, 729–769 (1991)
21. Stefanov, P.: Quasimodes and resonances: Sharp lower bounds. Duke Math. J. 99, 75–92 (1999)
22. Stefanov, P. and Vodev, G.: Neumann resonances in linear elasticity for an arbitrary body. Commun. Math. Phys. 176, 645–659 (1996)
23. Tang, S.-H. and Zworski, M.: From quasimodes to resonances. Math. Res. Lett. 5, 261–272 (1998)
24. Vodev, G.: Sharp bounds on the number of scattering poles for perturbations of the Laplacian. Commun. Math. Phys. 146, 205–216 (1992)

Communicated by B. Simon

Commun. Math. Phys. 207, 439 – 465 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Semiclassical Dynamics with Exponentially Small Error Estimates

George A. Hagedorn$^{1,\star}$, Alain Joye$^{2,\star\star}$

1 Department of Mathematics and Center for Statistical Mechanics and Mathematical Physics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0123, USA
2 Institut Fourier, Unité Mixte de Recherche CNRS-UJF 5582, Université de Grenoble I, BP 74, 38402 Saint Martin d'Hères Cedex, France

Received: 7 January 1999 / Accepted: 30 April 1999

Abstract: We construct approximate solutions to the time-dependent Schrödinger equation

\[
i\hbar\,\frac{\partial\psi}{\partial t}=-\frac{\hbar^{2}}{2}\,\Delta\psi+V\psi
\]

for small values of $\hbar$. If $V$ satisfies appropriate analyticity and growth hypotheses and $|t|\le T$, these solutions agree with exact solutions up to errors whose norms are bounded by $C\exp\{-\gamma/\hbar\}$, for some $C$ and $\gamma>0$. Under more restrictive hypotheses, we prove that for sufficiently small $T'$, $|t|\le T'|\log(\hbar)|$ implies the norms of the errors are bounded by $C'\exp\{-\gamma'/\hbar^{\sigma}\}$, for some $C'$, $\gamma'>0$, and $\sigma>0$.

1. Introduction

In this paper, we construct exponentially accurate semiclassical approximations $\psi(x,t,\hbar)$ to certain normalized exact solutions $\Psi(x,t,\hbar)$ of the $d$-dimensional time-dependent Schrödinger equation

\[
i\hbar\,\frac{\partial\Psi}{\partial t}=-\frac{\hbar^{2}}{2}\,\Delta\Psi+V\Psi. \tag{1.1}
\]

$\star$ Partially Supported by National Science Foundation Grant DMS–9703751.
$\star\star$ Partially Supported by Fonds National Suisse de la Recherche Scientifique, Grant 8220-037200.


More precisely, our main result is that for $|t|\le T$ and small values of $\hbar$, these approximations satisfy error estimates of the form

\[
\|\psi(x,t,\hbar)-\Psi(x,t,\hbar)\|_{L^{2}(\mathbb{R}^{d})}\le C\exp\{-\gamma/\hbar\}, \tag{1.2}
\]

where $\gamma>0$. Our construction of $\psi(x,t,\hbar)$ is technically complicated, but quite explicit. It uses a particular collection of semiclassical wave packets $\{\varphi_j(A,B,\hbar,a,\eta,\cdot)\}$ that are defined in [6,7], and the next section. Here $A$ and $B$ are $d\times d$ complex matrices that satisfy certain conditions. The quantities $a$ and $\eta$ are elements of $\mathbb{R}^{d}$. For fixed $A$, $B$, $\hbar$, $a$, and $\eta$, $\{\varphi_j(A,B,\hbar,a,\eta,\cdot)\}$ is an orthonormal basis of $L^{2}(\mathbb{R}^{d})$ as $j$ ranges over all $d$-dimensional multi-indices. The function $\varphi_j(A,B,\hbar,a,\eta,\cdot)$ is concentrated near position $a$, and its Fourier transform is concentrated near momentum $\eta$. Its position and momentum uncertainties are proportional to $\sqrt{\hbar}$. The position uncertainty is determined by the matrix $|A|=\sqrt{AA^{*}}$, and the momentum uncertainty is determined by $|B|=\sqrt{BB^{*}}$.

We construct $\psi(x,t,\hbar)$ by applying the idea of "optimal truncation" of an asymptotic expansion. For initial conditions of the form

\[
\Psi(x,0,\hbar)=\sum_{|j|\le J}c_j\,\varphi_j(A(0),B(0),\hbar,a(0),\eta(0),x), \tag{1.3}
\]

with $\sum_{|j|\le J}|c_j|^{2}=1$, there exist ([5,6]) approximate solutions

\[
\psi_l(x,t,\hbar)=e^{iS(t)/\hbar}\sum_{|j|\le\tilde J(l)}c_j(l,t,\hbar)\,\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x), \tag{1.4}
\]

that satisfy

\[
\sup_{t\in[-T,T]}\|\psi_l(x,t,\hbar)-\Psi(x,t,\hbar)\|_{L^{2}(\mathbb{R}^{d})}\le C(l)\,\hbar^{l/2} \tag{1.5}
\]

for some constant $C(l)$. Here $\tilde J(l)=J+3l-3$, and $A(t)$, $B(t)$, $a(t)$, $\eta(t)$, and $S(t)$ are solutions to the classical equations of motion

\[
\begin{aligned}
\dot a(t)&=\eta(t),\\
\dot\eta(t)&=-\nabla V(a(t)),\\
\dot A(t)&=iB(t),\\
\dot B(t)&=iV^{(2)}(a(t))A(t),\\
\dot S(t)&=\frac{\eta(t)^{2}}{2}-V(a(t)),
\end{aligned} \tag{1.6}
\]

where $V^{(2)}$ denotes the Hessian matrix for $V$, and the initial conditions $A(0)$, $B(0)$, $a(0)$, $\eta(0)$, and $S(0)=0$ satisfy

\[
A^{t}(0)B(0)-B^{t}(0)A(0)=0, \tag{1.7}
\]
\[
A^{*}(0)B(0)+B^{*}(0)A(0)=2I. \tag{1.8}
\]

The $c_j(l,t,\hbar)$ satisfy a linear system of ordinary differential equations that we describe in the next section.
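The classical system (1.6) is easy to explore numerically. The following is a minimal sketch, not the authors' method: it integrates (1.6) in one dimension with a hand-written fourth-order Runge-Kutta step, for a harmonic potential $V(x)=x^2/2$ chosen purely for illustration. The function names `rhs`, `rk4_step` and `flow` are hypothetical. In one dimension condition (1.7) is automatic, so the sketch monitors only the defect in (1.8).

```python
import math

def rhs(y, grad_v, hess_v, pot):
    """Right-hand side of (1.6) for d = 1; y = (a, eta, A, B, S)."""
    a, eta, A, B, S = y
    return (eta,                      # da/dt   = eta
            -grad_v(a),               # deta/dt = -V'(a)
            1j * B,                   # dA/dt   = i B
            1j * hess_v(a) * A,       # dB/dt   = i V''(a) A
            eta * eta / 2 - pot(a))   # dS/dt   = eta^2/2 - V(a)

def rk4_step(y, h, *f):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(yi + c * ki for yi, ki in zip(y, k))
    k1 = rhs(y, *f)
    k2 = rhs(shifted(k1, h / 2), *f)
    k3 = rhs(shifted(k2, h / 2), *f)
    k4 = rhs(shifted(k3, h), *f)
    return tuple(yi + h * (a1 + 2 * a2 + 2 * a3 + a4) / 6
                 for yi, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4))

def flow(y0, t_final, steps, grad_v, hess_v, pot):
    """Integrate (1.6) from t = 0 to t = t_final."""
    y, h = tuple(y0), t_final / steps
    for _ in range(steps):
        y = rk4_step(y, h, grad_v, hess_v, pot)
    return y

# Illustrative choice (an assumption, not from the paper): V(x) = x^2 / 2,
# a(0) = 1, eta(0) = 0, A(0) = B(0) = 1, S(0) = 0.
a, eta, A, B, S = flow((1.0, 0.0, 1 + 0j, 1 + 0j, 0.0), 1.0, 1000,
                       lambda x: x, lambda x: 1.0, lambda x: x * x / 2)
defect = abs(A.conjugate() * B + B.conjugate() * A - 2)  # (1.8) in d = 1
print(a, eta, defect)
```

For this potential the exact solution is $a(t)=\cos t$, $A(t)=B(t)=e^{it}$, so the printed defect in (1.8) should stay at the level of the integration error.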

Semiclassical Dynamics

441

We carefully estimate the $l$-dependence of $C(l)$ in (1.5). Then, for each $\hbar$, we choose $l(\hbar)$ to minimize the error $C(l(\hbar))\,\hbar^{l(\hbar)/2}$ over all choices of $l$. It turns out that $l(\hbar)$ behaves like a constant times $1/\hbar$. We define $\psi(x,t,\hbar)=\psi_{l(\hbar)}(x,t,\hbar)$ and prove that (1.2) is satisfied. For $t$ in a fixed compact interval, the precise statement of our results is the following:

Theorem 1.1. Suppose $V$ is a real-valued function on $\mathbb{R}^{d}$ that is bounded below and has an analytic continuation to the set $D=\{z\in\mathbb{C}^{d}:|\mathrm{Im}\,z_j|<\delta,\ j=1,2,\dots,d\}$. Suppose further that there exist $M>0$ and $\tau>0$, such that

\[
|V(z)|\le M\exp(\tau|z|^{2}),\qquad\text{for all } z\in D,
\]

where $|z|^{2}=\sum_{j=1}^{d}|z_j|^{2}$. Suppose initial conditions $A(0)$, $B(0)$, $a(0)$, $\eta(0)$, $S(0)=0$, and $c_j$ for $|j|\le J$ are specified that satisfy (1.7) and $\sum_{|j|\le J}|c_j|^{2}=1$. Then for any $T>0$, there exist $C$ and $\gamma>0$, such that the difference between the semiclassical approximation $\psi(x,t,\hbar)$ and the exact solution $\Psi(x,t,\hbar)$ to the Schrödinger equation (1.1) with initial condition (1.3) satisfies

\[
\|\psi(x,t,\hbar)-\Psi(x,t,\hbar)\|_{L^{2}(\mathbb{R}^{d})}\le C\exp\{-\gamma/\hbar\},
\]

whenever $|t|\le T$.

Remarks.
1. Theorem 1.1 can be generalized to allow time-dependent potentials. For example, suppose a potential $V(x,t)$ depends smoothly on $t$, is bounded below, and satisfies

\[
\frac{1}{m!}\,|D^{m}V(x,t)|\le\frac{M\exp(\tau|x|^{2})}{\delta^{|m|}},
\]

for $|t|\le T$ for all multi-indices $m$. Suppose further that a classical solution $(a(t),\eta(t))$ to Newton's equations with potential $V(x,t)$ is bounded for $|t|\le T$. Then the conclusion to Theorem 1.1 holds.
2. Our results can also be extended to obtain weaker error estimates of the form $C\exp\{-\gamma/\hbar^{\sigma}\}$ for some $\sigma\in(0,1)$, when the potential belongs to a Gevrey class.
3. Theorem 1.1 is optimal in the sense that the conclusion fails if the hypotheses are relaxed slightly. For example, consider the one-dimensional potential $V(x)=\exp(-1/x^{u})$ for $x>0$, and $V(x)=0$ for $x\le 0$, where $u>0$. It is shown in [8] that this potential belongs to the Gevrey class of order $1+1/u$. For initial conditions $a(0)=0$, $\eta(0)=0$, $A(0)=B(0)=1$, $S(0)=0$, and $c_j(0)=\delta_{j,0}$, our approximation yields $\varphi_0(1+it,1,\hbar,0,0)$, for all times. This function is very simple, and we can write the error term explicitly. By steepest descent analysis, we can show that there exist $\delta>0$ and $\Sigma_1>\Sigma_2>0$, such that $t\in(0,\hbar^{\delta})$ implies

\[
\exp\bigl(-\Sigma_1/\hbar^{u/(1+u)}\bigr)\le\bigl\|e^{-itH(\hbar)/\hbar}\varphi_0(1,1,\hbar,0,0,\cdot)-\varphi_0(1+it,1,\hbar,0,0,\cdot)\bigr\|\le\exp\bigl(-\Sigma_2/\hbar^{u/(1+u)}\bigr).
\]

Note also that if we choose $a(0)=-a<0$ and $\eta(0)=\eta>0$, it is easy to check that the error term is $O(\exp(-\gamma/\hbar))$, for each $t<a/\eta$.


4. For all practical purposes, we can replace the $c_j$'s by the corresponding Dyson expansion up to order $l(\hbar)$ (see (2.26)–(2.28)) without spoiling our exponential estimate. The normalization of the approximation, however, will be lost.
5. One cannot expect to get better agreement with the exact solution when constructing an approximation using wave packets associated with a single classical path. Such wave packets do not describe tunneling, which is an exponentially small effect in $\hbar$.

Theorem 1.1 can also be generalized to allow time intervals that grow like $|\log(\hbar)|$ as $\hbar$ tends to zero. However, we obtain a somewhat weaker conclusion. Our precise results are summarized by the following theorem.

Theorem 1.2. Suppose $V$ is bounded below and analytic in $D=\{z\in\mathbb{C}^{d}:|\mathrm{Im}\,z_j|<\delta,\ j=1,2,\dots,d\}$. Suppose further that there exist $M>0$ and $\tau>0$, such that

\[
|V(z)|\le M\exp(\tau|z|),\qquad\text{for all } z\in D.
\]

Suppose initial conditions $A(0)$, $B(0)$, $a(0)$, $\eta(0)$, $S(0)=0$, and $c_j$ for $|j|\le J$ are specified that satisfy (1.7) and $\sum_{|j|\le J}|c_j|^{2}=1$, and further assume there exist $N>0$ and $\lambda>0$, such that

\[
\|A(t)\|\le N\exp(\lambda|t|). \tag{1.9}
\]

Then for sufficiently small $T'>0$, there exist $C'$, $\gamma'>0$, and $\sigma>0$, such that the difference between the semiclassical approximation $\psi(x,t,\hbar)$ and the exact solution $\Psi(x,t,\hbar)$ to the Schrödinger equation (1.1) with initial condition (1.3) satisfies

\[
\|\psi(x,t,\hbar)-\Psi(x,t,\hbar)\|_{L^{2}(\mathbb{R}^{d})}\le C'\exp\{-\gamma'/\hbar^{\sigma}\},
\]

whenever $|t|\le T'|\log(\hbar)|$.

Remark. Standard existence and uniqueness theorems for systems of ODEs show that condition (1.9) is satisfied if the norm of the Hessian $V^{(2)}(a(t))$ is uniformly bounded. That is the case if $V$ is the sum of a quadratic polynomial plus an analytic function bounded on $D$. It is also the case if $E$ denotes the energy of the considered trajectory and $\|V^{(2)}(x)\|$ is bounded on the connected component of the classically allowed region $D_E=\{x\in\mathbb{R}^{d}:V(x)\le E\}$ that contains $a(t)$. This is satisfied for all confining potentials. It is easily deduced from the proof that if $\tau$ can be chosen arbitrarily small, then we can take $T'=\bigl(\tfrac16-\epsilon\bigr)\tfrac1\lambda$ with $\epsilon$ arbitrarily small. This yields exponential control over the same time intervals as in [3].

The propagation of coherent states is also considered by Combescure and Robert in [3], see also [9], using an approximation given by a linear combination of squeezed states. (The squeezed states coincide with our semiclassical wave packets, although the notation is quite different.) Their emphasis is on the long time behavior of this approximation. The bound on the error term is of the form $C_l(t)\,\hbar^{l/2}$, with explicit control of the time-dependence of $C_l(t)$ in terms of classical quantities. The $l$ behavior is however not investigated.

Results of a flavor similar to ours can be found in the work of Yajima [12]. They are obtained by means of the pseudo-differential techniques developed in the analytic


context by Sjöstrand in [10]. These results concern the propagation of wave packets of the form $\varphi(x)=e^{iS(x)/\hbar}f(x)$, where $S$ is analytic and $f$ belongs to the set of compactly supported Gevrey functions of order $s>1$. Assuming the potential $V$ is analytic, Yajima constructs approximations to the actual evolution of such wave packets that are valid up to an error whose $L^{2}(\mathbb{R}^{d})$ norm is of order $e^{-\gamma/\hbar^{1/(2s-1)}}$, with $\gamma>0$ (see Theorems 1.1, 1.2 and Lemma 2.5 in [12]). However, it should be possible to make use of the theory [10] to recover our results.

Similar issues have been dealt with by Bambusi, Graffi and Paul in [2]. They focus on the validity for large times of the semiclassical approximation of the Heisenberg evolution of a smooth observable, under analyticity assumptions on the Hamiltonian. They prove that the semiclassical approximation remains useful for times up to order $|\ln(\hbar)|$, the Ehrenfest time scale. However, the Hamiltonians they can accommodate consist more or less of analytic perturbations of the harmonic oscillator that decay as $x$ and $p$ tend to infinity.

The paper is organized as follows: In Sect. 2, we prove Theorem 1.1 under the assumption that two types of error terms satisfy certain bounds. We prove the two required bounds in Sects. 3 and 4. In Sect. 5 we describe the proof of Theorem 1.2.

2. Proof of Theorem 1.1

We begin this section by presenting the definition of the semiclassical wave packets $\varphi_j(A,B,\hbar,a,\eta,x)$ that is given in [7]. A more explicit, but more complicated definition is given in [6]. Since [7] provides a detailed discussion of these wave packets, we do not prove all their properties here.

We adopt the standard multi-index notation. A multi-index $j=(j_1,j_2,\dots,j_d)$ is a $d$-tuple of non-negative integers. We define $|j|=\sum_{k=1}^{d}j_k$, $j!=(j_1!)(j_2!)\cdots(j_d!)$, $x^{j}=x_1^{j_1}x_2^{j_2}\cdots x_d^{j_d}$, and $D^{j}=\dfrac{\partial^{|j|}}{(\partial x_1)^{j_1}(\partial x_2)^{j_2}\cdots(\partial x_d)^{j_d}}$.

Throughout the paper we assume $a\in\mathbb{R}^{d}$, $\eta\in\mathbb{R}^{d}$ and $\hbar>0$. We also assume that $A$ and $B$ are $d\times d$ complex invertible matrices that satisfy

\[
A^{t}B-B^{t}A=0, \tag{2.1}
\]
\[
A^{*}B+B^{*}A=2I. \tag{2.2}
\]

These conditions guarantee that both the real and imaginary parts of $BA^{-1}$ are symmetric. Furthermore, $\mathrm{Re}\,BA^{-1}$ is strictly positive definite, and $\bigl(\mathrm{Re}\,BA^{-1}\bigr)^{-1}=AA^{*}$. We note that conditions (2.1) and (2.2) are preserved under the dynamics generated by (1.6).

Our definition of $\varphi_j(A,B,\hbar,a,\eta,x)$ is based on the following raising operators that are defined for $m=1,2,\dots,d$:

\[
\mathcal{A}_m(A,B,\hbar,a,\eta)^{*}=\frac{1}{\sqrt{2\hbar}}\left[\sum_{n=1}^{d}\overline{B}_{nm}\,(x_n-a_n)-i\sum_{n=1}^{d}\overline{A}_{nm}\Bigl(-i\hbar\frac{\partial}{\partial x_n}-\eta_n\Bigr)\right].
\]

Definition. For the multi-index $j=0$, we define the normalized complex Gaussian wave packet (modulo the sign of a square root) by

\[
\varphi_0(A,B,\hbar,a,\eta,x)=\pi^{-d/4}\hbar^{-d/4}(\det(A))^{-1/2}
\exp\bigl\{-\langle(x-a),BA^{-1}(x-a)\rangle/(2\hbar)+i\langle\eta,(x-a)\rangle/\hbar\bigr\}. \tag{2.3}
\]


Then, for any non-zero multi-index $j$, we define

\[
\varphi_j(A,B,\hbar,a,\eta,\cdot)=\frac{1}{\sqrt{j!}}\bigl(\mathcal{A}_1(A,B,\hbar,a,\eta)^{*}\bigr)^{j_1}\bigl(\mathcal{A}_2(A,B,\hbar,a,\eta)^{*}\bigr)^{j_2}\cdots\bigl(\mathcal{A}_d(A,B,\hbar,a,\eta)^{*}\bigr)^{j_d}\varphi_0(A,B,\hbar,a,\eta,\cdot).
\]

Remarks.
1. For $A=B=I$, $\hbar=1$, and $a=\eta=0$, the $\varphi_j(A,B,\hbar,a,\eta,\cdot)$ are just the standard harmonic oscillator eigenstates with energies $|j|+d/2$.
2. For each $A$, $B$, $\hbar$, $a$, and $\eta$, the set $\{\varphi_j(A,B,\hbar,a,\eta,\cdot)\}$ is an orthonormal basis for $L^{2}(\mathbb{R}^{d})$.
3. The raising operators can also be given by another formula that was omitted from [7] in the multi-dimensional case. If we set

\[
g(A,B,\hbar,a,x)=\exp\bigl\{-\langle(x-a),(BA^{-1})^{*}(x-a)\rangle/(2\hbar)-i\langle\eta,(x-a)\rangle/\hbar\bigr\},
\]

then we have

\[
\bigl(\mathcal{A}_m(A,B,\hbar,a,\eta)^{*}\psi\bigr)(x)=-\sqrt{\frac{\hbar}{2}}\;\frac{1}{g(A,B,\hbar,a,x)}\sum_{n=1}^{d}\overline{A}_{nm}\,\frac{\partial}{\partial x_n}\bigl(g(A,B,\hbar,a,x)\psi(x)\bigr).
\]

4. In [6], the state $\varphi_j(A,B,\hbar,a,\eta,x)$ is defined as a normalization factor times $H_j(A;\hbar^{-1/2}|A|^{-1}(x-a))\,\varphi_0(A,B,\hbar,a,\eta,x)$. Here $H_j(A;y)$ is a $|j|$th order polynomial in $y$ that depends on $A$ only through $U_A$, where $A=|A|U_A$ is the polar decomposition of $A$.
5. By the argument on p. 370 of [6] or by scaling out the $|A|$ and $\hbar$ dependence and using Remark 3 above, one can show that $H_j(A;y)e^{-y^{2}/2}$ is an (unnormalized) eigenstate of the usual harmonic oscillator with energy $|j|+d/2$.
6. When the dimension $d$ is 1, the position and momentum uncertainties of the $\varphi_j(A,B,\hbar,a,\eta,\cdot)$ are $\sqrt{(j+1/2)\hbar}\,|A|$ and $\sqrt{(j+1/2)\hbar}\,|B|$, respectively. In higher dimensions, they are bounded by $\sqrt{(|j|+d/2)\hbar}\,\|A\|$ and $\sqrt{(|j|+d/2)\hbar}\,\|B\|$, respectively.
7. When we approximately solve the Schrödinger equation, the choice of the sign of the square root in the definition of $\varphi_0(A,B,\hbar,a,\eta,\cdot)$ is determined by continuity in $t$ after an arbitrary initial choice.

The proof of the theorem depends on the following abstract lemma, whose proof is an easy application of Duhamel's formula (see e.g. [7]).

Lemma 2.1. Suppose $H(\hbar)$ is a family of self-adjoint operators for $\hbar>0$. Suppose $\psi(t,\hbar)$ belongs to the domain of $H(\hbar)$, is continuously differentiable in $t$, and approximately solves the Schrödinger equation in the sense that

\[
i\hbar\,\frac{\partial\psi}{\partial t}(t,\hbar)=H(\hbar)\psi(t,\hbar)+\xi(t,\hbar), \tag{2.4}
\]


where $\xi(t,\hbar)$ satisfies

\[
\|\xi(t,\hbar)\|\le\mu(t,\hbar). \tag{2.5}
\]

Then, for $t>0$,

\[
\bigl\|e^{-itH(\hbar)/\hbar}\psi(0,\hbar)-\psi(t,\hbar)\bigr\|\le\hbar^{-1}\int_0^{t}\mu(s,\hbar)\,ds. \tag{2.6}
\]

The analogous statement holds for $t<0$.

Because $V$ is smooth and bounded below, there exist global solutions to the first two equations of the system (1.6) for any initial condition. It then follows immediately that the remaining three equations of the system (1.6) have global solutions. Furthermore, it is not difficult ([5,6]) to prove that (2.1) and (2.2) are preserved by the flow.

As mentioned in the introduction, it is proved in [5] and [6] that initial conditions of the form (1.3) give rise to approximate solutions of the form

\[
\psi_l(x,t,\hbar)=e^{iS(t)/\hbar}\sum_{|j|\le\tilde J(l)}c_j(l,t,\hbar)\,\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),
\]

with errors whose norms are of order $\hbar^{l/2}$. Here $\tilde J(l)=J+3l-3$, and $A(t)$, $B(t)$, $a(t)$, $\eta(t)$, and $S(t)$ satisfy (1.6). The coefficients $c_j(l,t,\hbar)$ satisfy the linear system

\[
i\hbar\,\dot c_k(l,t,\hbar)=\sum_{|j|\le\tilde J(l)}K_{kj}(l,t,\hbar)\,c_j(l,t,\hbar),\qquad |k|=0,1,\dots,\tilde J(l), \tag{2.7}
\]

with initial conditions $c_j(l,0,\hbar)=c_j$ for $|j|\le J$, in accordance with (1.3), and $c_j(l,0,\hbar)=0$ for $|j|>J$.

To specify the $(J+3l-2)\times(J+3l-2)$ matrix $K(l,t,\hbar)$ that appears in (2.7), we first decompose the potential as

\[
V(x)=W_a(x)+Z_a(x)\equiv W_a(x)+(V(x)-W_a(x)), \tag{2.8}
\]

where $W_a(x)$ denotes the second order Taylor approximation (with the obvious abuse of notation)

\[
W_a(x)\equiv V(a)+V^{(1)}(a)(x-a)+V^{(2)}(a)(x-a)^{2}/2. \tag{2.9}
\]

Next (reverting to multi-index notation), we approximate $Z_a(x)$ by its Taylor approximation of order $l+1$,

\[
Z_a^{[l]}(x)=\sum_{3\le|m|\le l+1}\frac{(D^{m}V)(a)}{m!}\,(x-a)^{m}, \tag{2.10}
\]

and define the infinite matrix

\[
\tilde K_{kj}(l,t,\hbar)=\bigl\langle\varphi_k(A(t),B(t),\hbar,a(t),\eta(t),x),\;Z_{a(t)}^{[l]}(x)\,\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x)\bigr\rangle. \tag{2.11}
\]

Then, we obtain the matrix $K(l,t,\hbar)$ from $\tilde K(l,t,\hbar)$ by restricting the indices to $|j|\le\tilde J(l)$ and $|k|\le\tilde J(l)$.
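To make the construction of $K$ concrete, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the construction used in the paper: it takes $d=1$ and $A=B=1$, so that $x-a=\sqrt{\hbar/2}\,(\mathcal{A}+\mathcal{A}^{*})$ is tridiagonal in the basis $\varphi_0,\varphi_1,\dots$, and it takes the Taylor coefficients of $V$ at $a$ as given numbers. The names `position_matrix`, `matmul` and `taylor_matrix` are hypothetical. The position matrix is padded before powers are taken, so the retained block agrees with the corresponding block of the infinite matrix (2.11).

```python
import math

def position_matrix(size, hbar):
    """Matrix of x - a in the basis phi_0..phi_{size-1} (d = 1, A = B = 1)."""
    x = [[0.0] * size for _ in range(size)]
    for j in range(size - 1):
        # <phi_{j+1}, (x - a) phi_j> = sqrt(hbar * (j + 1) / 2)
        x[j][j + 1] = x[j + 1][j] = math.sqrt(hbar * (j + 1) / 2)
    return x

def matmul(p, q):
    """Plain dense product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][r] * q[r][c] for r in range(n)) for c in range(n)]
            for i in range(n)]

def taylor_matrix(derivs, cutoff, hbar):
    """Block |j|, |k| <= cutoff of the matrix (2.11).

    derivs maps m to the m-th derivative of V at a, for the orders
    3 <= m <= l + 1 kept in (2.10).
    """
    top = max(derivs)
    size = cutoff + 1 + top          # padding keeps the retained block exact
    x = position_matrix(size, hbar)
    power = [[float(i == c) for c in range(size)] for i in range(size)]
    total = [[0.0] * size for _ in range(size)]
    for m in range(1, top + 1):
        power = matmul(power, x)     # power now holds (x - a)^m
        if m in derivs:
            coef = derivs[m] / math.factorial(m)
            for i in range(size):
                for c in range(size):
                    total[i][c] += coef * power[i][c]
    return [row[:cutoff + 1] for row in total[:cutoff + 1]]

# Illustrative input: a pure cubic term with V'''(a) = 6, so Z^{[l]} = (x - a)^3.
K = taylor_matrix({3: 6.0}, cutoff=5, hbar=0.1)
print(K[0][3], K[0][4])
```

The entries vanish whenever the two indices differ by more than the highest Taylor order retained, which is the band structure that Lemma 2.3 exploits, and each surviving entry carries a factor $\hbar^{m/2}$ from the term of order $m$.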


The general strategy [5–7] to show that $\psi_l(x,t,\hbar)$ is an approximation to the actual solution $\Psi(x,t,\hbar)$ of (1.1) and (1.3) up to order $\hbar^{l/2}$ is as follows: From [7], we know that for all multi-indices $j$,

\[
i\hbar\,\frac{\partial}{\partial t}\Bigl[e^{iS(t)/\hbar}\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x)\Bigr]
=\Bigl(-\frac{\hbar^{2}}{2}\Delta+W_{a(t)}(x)\Bigr)\Bigl[e^{iS(t)/\hbar}\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x)\Bigr]. \tag{2.12}
\]

Thus, the $\varphi_j$ take into account the kinetic energy and $W_{a(t)}(x)$ parts of the Hamiltonian. Next, we expand the exact solution as

\[
\Psi(x,t,\hbar)=\sum_{j}b_j(\hbar,t)\,e^{iS(t)/\hbar}\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),
\]

where the $b_j(\hbar,t)$ satisfy an infinite linear system of ordinary differential equations whose matrix is obtained from the $Z_{a(t)}(x)$ term in the Hamiltonian. In that system, we make a first approximation by replacing the function $Z_{a(t)}(x)$ by its Taylor approximation $Z^{[l]}_{a(t)}(x)$. This yields an infinite linear system whose matrix is $\tilde K(l,t,\hbar)$. Its entries are time dependent polynomials in $\hbar^{1/2}$ of order $l-1$. We make a second approximation by truncating the infinite system to obtain (2.7) that is satisfied by the $c_j(l,t,\hbar)$. The result (1.5) is proved by using Lemma 2.1 to show that the errors generated by the Taylor approximation and the truncation approximation are of order $\hbar^{l/2}$.

As described in the introduction, we construct the exponentially accurate approximate solution $\psi(x,t,\hbar)=\psi_{l(\hbar)}(x,t,\hbar)$, by keeping track of the $l$-dependence of $C(l)$ in (1.5) and then choosing $l(\hbar)$ in order to minimize the error.

In the remainder of this section, we prove Theorem 1.1 under the assumption that the following three technical lemmas are true. We prove the first lemma in Sect. 3. It estimates errors that arise from our replacement of $Z_{a(t)}(x)$ by $Z^{[l]}_{a(t)}(x)$. The second and third lemmas are proved in Sect. 4. They bound certain matrix elements and combinatorial quantities that arise from the truncation approximation discussed above.

Lemma 2.2. Suppose $V$ satisfies the hypotheses of Theorem 1.1, $|m|=l+2$, $\tilde J(l)=J+3l-3$, and

\[
\psi_l(x,t,\hbar)=e^{iS(t)/\hbar}\sum_{|j|\le\tilde J(l)}c_j(l,t,\hbar)\,\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),
\]

with $\sum_{|j|\le\tilde J(l)}|c_j(l,t,\hbar)|^{2}=1$. Let $\zeta(x,a(t))=(a(t)+\theta_{x,a(t)}(x-a(t)))\in\mathbb{R}^{d}$, with $\theta_{x,a(t)}\in(0,1)$. There exist constants $g_0$ and $g_1$, that depend on $d$ and $J$ only, such that for sufficiently small $\hbar$,

\[
\left\|\frac{D^{m}V(\zeta(x,a(t)))}{m!}\,(x-a(t))^{m}\,\psi_l(\cdot,t,\hbar)\right\|
\le g_0\,M\exp\bigl(4\tau(\delta^{2}d+a(t)^{2})\bigr)\Bigl(g_1\sqrt{\hbar(l+2)}\;\|A(t)\|/\delta\Bigr)^{l+2}. \tag{2.13}
\]


Lemma 2.3. We define the infinite matrix

\[
\tilde X^{m}_{jk}(t,\hbar)=\bigl\langle\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),\cdot),\;(x-a(t))^{m}\,\varphi_k(A(t),B(t),\hbar,a(t),\eta(t),\cdot)\bigr\rangle, \tag{2.14}
\]

and then define the finite matrix $X^{m}(l,t,\hbar)$ from $\tilde X^{m}(t,\hbar)$ by restricting its indices to $|j|\le\tilde J(l)$ and $|k|\le\tilde J(l)$. Then, $X^{m}_{jk}(l,s,\hbar)=0$ and $\tilde X^{m}_{jk}(s,\hbar)=0$ if $\bigl||j|-|k|\bigr|>|m|$ and, for each $N>0$, there exists $D(N)<\infty$, such that

\[
\sup_{\substack{|k|\le\tilde J(l)\\ \tilde J(l)+1\le|j|\le\tilde J(l)+l+1}}
\Bigl|\bigl(\tilde X^{m_0}(t_0,\hbar)X^{m_1}(l,t_1,\hbar)X^{m_2}(l,t_2,\hbar)\cdots X^{m_q}(l,t_q,\hbar)\bigr)_{jk}\Bigr|
\le\Bigl(D(N)\sqrt{\hbar\,\tilde J(l)}\sup_{t\in\{t_0,t_1,t_2,\dots,t_q\}}\|A(t)\|\Bigr)^{|m_0|+|m_1|+|m_2|+\cdots+|m_q|}, \tag{2.15}
\]

for any collection $m_0,m_1,\dots,m_q$ of multi-indices that satisfy $|m_j|/\tilde J(l)\le N$.

Lemma 2.4. We define

\[
F_p(n,q)=\sum_{\substack{1\le|m_1|,|m_2|,\dots,|m_q|\le p\\ |m_1|+|m_2|+\cdots+|m_q|=n}}1 \tag{2.16}
\]

to be the number of distinct sets $\{m_1,m_2,\dots,m_q\}$, where each $m_j$ is a $d$-dimensional multi-index with $1\le|m_j|\le p$ and $|m_1|+|m_2|+\cdots+|m_q|=n$. We note that $F_p(n,q)$ is zero unless $q\le n\le qp$. Suppose that a function $L_q(l)$ satisfies

\[
L_q(l)\le\frac{C_1^{q}}{\hbar^{q}\,q!}\sum_{n=3l-2}^{q(l+1)}(C_2\hbar l)^{n/2}\,F_{l+1}(n,q), \tag{2.17}
\]

where $C_1$ and $C_2$ are constants. Let $[[\alpha]]$ denote the greatest integer less than or equal to $\alpha$. Then there exists $g^{*}>0$, such that for any $g\in(0,g^{*})$, there exist positive constants $C_3$, $\gamma_1$, and $\hbar^{*}$, that depend only on $g$, $C_1$, and $C_2$, such that

\[
0<\hbar<\hbar^{*},\qquad l(\hbar)=[[g/\hbar]],\qquad\text{and}\qquad 2\le q\le l(\hbar)+2 \tag{2.18}
\]

imply

\[
L_q(l(\hbar))\le C_3\exp\{-\gamma_1/\hbar\}. \tag{2.19}
\]

Proof of Theorem 1.1. We define $\psi_l(x,t,\hbar)$ by (1.4), where the $c_j(l,t,\hbar)$ are determined by the system (2.7) and initial conditions described above. To apply Lemma 2.1 we define

\[
\xi_l(x,t,\hbar)=i\hbar\,\frac{\partial}{\partial t}\psi_l(x,t,\hbar)-\Bigl(-\frac{\hbar^{2}}{2}\Delta+V(x)\Bigr)\psi_l(x,t,\hbar). \tag{2.20}
\]

By using (2.12), we see that this can be decomposed as a sum of two terms,

\[
\xi_l^{(1)}(x,t,\hbar)=\bigl(Z^{[l]}_{a(t)}(x)-Z_{a(t)}(x)\bigr)\psi_l(x,t,\hbar) \tag{2.21}
\]


and
$$\xi_l^{(2)}(x,t,\hbar)=-P_{\{|j|\ge\tilde J(l)\}}\,Z^{[l]}_{a(t)}(x)\,\psi_l(x,t,\hbar), \tag{2.22}$$
where $P_{\{|j|\ge l+2\}}$ is the orthogonal projection onto the span of the set $\{\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),\cdot):|j|\ge l+2\}$. By the standard Taylor series error formula,
$$Z_{a(t)}(x)-Z^{[l]}_{a(t)}(x)=\sum_{|m|=l+2}\frac{D^mV(\zeta(x,a(t)))}{m!}\,(x-a(t))^m,$$
for some $\zeta(x,a)=(a+\theta_{x,a}(x-a))$, with $\theta_{x,a}\in(0,1)$. Thus, by the crude estimate
$$\sum_{|m|=n}1\le\sum_{|m|\le n}1\le\prod_{j=1}^{d}\sum_{m_j\le n}1=(n+1)^d=e^{d\ln(n+1)}\le e^{dn}, \tag{2.23}$$
and Lemma 2.2, we obtain
$$\left\|\xi_l^{(1)}(\cdot,t,\hbar)\right\|\le\sum_{|m|=l+2}\left\|\frac{D^mV(\zeta(x,a(t)))}{m!}(x-a(t))^m\psi_l(\cdot,t,\hbar)\right\|\le g_0\,M\exp\bigl(4\tau(\delta^2d+a(t)^2)\bigr)\,e^{d(l+2)}\left(g_1\sqrt{\hbar(l+2)}\,\|A(t)\|/\delta\right)^{l+2}.$$
Thus, there exist constants $C_4$ and $C_5$, such that $|t|\le T$ implies
$$\left\|\xi_l^{(1)}(\cdot,t,\hbar)\right\|\le C_4\left(C_5\sqrt{\hbar(l+2)}\right)^{l+2}. \tag{2.24}$$
The quantity $\xi_l^{(2)}(\cdot,t,\hbar)$ satisfies
$$\langle\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),\,\xi_l^{(2)}(x,t,\hbar)\rangle=0,\qquad\text{if }0\le|j|\le\tilde J(l)\text{ or }|j|>\tilde J(l)+l+1,$$
and
$$\langle\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),\,\xi_l^{(2)}(x,t,\hbar)\rangle=\sum_{|k|\le\tilde J(l)}\langle\varphi_j(A(t),B(t),\hbar,a(t),\eta(t),x),\,Z^{[l]}_{a(t)}(x)\,\varphi_k(A(t),B(t),\hbar,a(t),\eta(t),x)\rangle\,c_k(t)=(\tilde K(t)c(t))_j,\qquad\text{if }\tilde J(l)<|j|\le\tilde J(l)+l+1, \tag{2.25}$$


where we denote the $c_j(l,t,\hbar)$ collectively by the vector $c(l,t,\hbar)$. We easily verify these facts by using (2.7), (2.11), (2.12) and Lemma 2.3.

To estimate the norm of $\xi_l^{(2)}(\cdot,t,\hbar)$, we use the Dyson expansion with remainder to decompose
$$c(l,t,\hbar)=\sum_{q=0}^{l}c^{q}(l,t,\hbar)+r(l,t,\hbar), \tag{2.26}$$
where (dropping some arguments)
$$c^{q}(t)=\int_0^t ds_1\int_0^{s_1}ds_2\cdots\int_0^{s_{q-1}}ds_q\,(i\hbar)^{-q}K(s_1)K(s_2)\cdots K(s_q)\,c(0), \tag{2.27}$$
and
$$r(t)=\int_0^t ds_1\int_0^{s_1}ds_2\cdots\int_0^{s_l}ds_{l+1}\,(i\hbar)^{-(l+1)}K(s_1)K(s_2)\cdots K(s_{l+1})\,c(s_{l+1}). \tag{2.28}$$
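The Dyson expansion with remainder in (2.26)–(2.28) can be illustrated numerically. The following is a minimal sketch under a simplifying assumption that is not made in the paper: the matrix $K$ is taken to be time independent, so the nested integrals in (2.27) collapse to $t^q/q!$ and $c^q=(tK/(i\hbar))^q c(0)/q!$. The matrix, its size, and all function names are hypothetical illustration choices.

```python
import math
import numpy as np

def dyson_terms(K, c0, t, hbar, order):
    """Terms c^0, ..., c^order of the Dyson series for a constant matrix K.

    For constant K the time-ordered integral over the simplex gives
    c^q = (t K / (i hbar))^q c0 / q!, computed here by recurrence.
    """
    terms = [c0.astype(complex)]
    for q in range(1, order + 1):
        terms.append((t / (1j * hbar * q)) * (K @ terms[-1]))
    return terms

def exact_solution(K, c0, t, hbar):
    """Solve i hbar dc/dt = K c exactly via the eigendecomposition of Hermitian K."""
    evals, evecs = np.linalg.eigh(K)
    return evecs @ (np.exp(-1j * evals * t / hbar) * (evecs.conj().T @ c0))

# Hypothetical test data: a small random Hermitian matrix and a unit initial vector.
rng = np.random.default_rng(0)
M = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
K = 0.05 * (M + M.conj().T)
c0 = np.zeros(6, dtype=complex)
c0[0] = 1.0
hbar, t, order = 1.0, 1.0, 8

partial = sum(dyson_terms(K, c0, t, hbar, order))
exact = exact_solution(K, c0, t, hbar)
remainder_norm = np.linalg.norm(exact - partial)

# Taylor-type bound on the remainder, valid because the exact flow is unitary.
bound = (t * np.linalg.norm(K, 2) / hbar) ** (order + 1) / math.factorial(order + 1)
print(remainder_norm, bound)
```

Because $K$ is self-adjoint the exact solution keeps norm 1, which is the property used in the proof to bound the entries of $c(t)$ inside the remainder term.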

Using (2.11), (2.10), and (2.14), we see that each $c^q(l,t,\hbar)$ is of order $\hbar^{q/2}$ and that $r(l,t,\hbar)$ is of order $\hbar^{(l+1)/2}$.

To estimate the norm of $\xi_l^{(2)}(\cdot,t,\hbar)$ we study the $j$th component of $\tilde K(t)c(l,t,\hbar)$, with $|j|>\tilde J(l)$. Because of (2.26), this coefficient is a sum of two types of terms: those that arise from $c^q(l,t,\hbar)$ and those that arise from $r(l,t,\hbar)$. Using (2.10), we expand $K(l,t,\hbar)$ in (2.27) and (2.28) and $\tilde K(l,t,\hbar)$ in terms of $X^m(l,t,\hbar)$ and $\tilde X^m(t,\hbar)$, to obtain
$$\tilde K(t)c^{q}(t)=(i\hbar)^{-q}\int_0^t ds_1\int_0^{s_1}ds_2\cdots\int_0^{s_{q-1}}ds_q\sum_{n=3(q+1)}^{(l+1)(q+1)}\;\sum_{\substack{m_0,m_1,\dots,m_q\\ |m_0|+|m_1|+\cdots+|m_q|=n\\ 3\le|m_j|\le l+1}}\frac{D^{m_0}V(a(t))\,D^{m_1}V(a(s_1))\cdots D^{m_q}V(a(s_q))}{m_0!\,m_1!\,m_2!\cdots m_q!}\times\tilde X^{m_0}(t)X^{m_1}(s_1)X^{m_2}(s_2)\cdots X^{m_q}(s_q)\,c(0), \tag{2.29}$$
and
$$\tilde K(t)r(t)=(i\hbar)^{-(l+1)}\int_0^t ds_1\int_0^{s_1}ds_2\cdots\int_0^{s_l}ds_{l+1}\sum_{n=3(l+2)}^{(l+1)(l+2)}\;\sum_{\substack{m_0,m_1,\dots,m_{l+1}\\ |m_0|+|m_1|+\cdots+|m_{l+1}|=n\\ 3\le|m_j|\le l+1}}\frac{D^{m_0}V(a(t))\,D^{m_1}V(a(s_1))\cdots D^{m_{l+1}}V(a(s_{l+1}))}{m_0!\,m_1!\,m_2!\cdots m_{l+1}!}\times\tilde X^{m_0}(t)X^{m_1}(s_1)X^{m_2}(s_2)\cdots X^{m_{l+1}}(s_{l+1})\,c(s_{l+1}). \tag{2.30}$$
The values of $m_j$ that occur in both (2.29) and (2.30) satisfy
$$|m_j|/\tilde J(l)\le(l+1)/(3l-3)\le1, \tag{2.31}$$


as long as $l\ge3$. So, we can apply Lemma 2.3 with $N=1$.

Recall that $X^m_{jk}(l,s,\hbar)=0$ and $\tilde X^m_{jk}(s,\hbar)=0$ if $\bigl||j|-|k|\bigr|>|m|$. Since $c_k(0)$ is non-zero only for $|k|\le J$, and we need only consider the $j$th coefficient of $\tilde K(t)c(t)$ for $|j|>\tilde J(l)$, the only relevant values of $n$ in (2.29) must satisfy
$$n\ge3l-2. \tag{2.32}$$
This condition is also satisfied for all values of $n$ in (2.30), since the sum begins with $n=3l+6$.

To use the analyticity assumptions to get estimates on the derivatives of $V$, we define $C_\delta(x)=\{z\in\mathbb{C}^d:z_j=x_j+\delta e^{i\theta_j},\ \theta_j\in[0,2\pi),\ j=1,2,\dots,d\}$. If $z\in C_\delta(a(t))$, then, for all $j=1,2,\dots,d$, $|z_j|\le\delta+|a_j(t)|$. Hence, writing $\frac{1}{m!}D^mV(a(t))$ as a $d$-dimensional Cauchy integral, we get the bound
$$\frac{1}{m!}\,|D^mV(a(t))|\le\frac{\sup_{z\in C_\delta(a(t))}|V(z)|}{\delta^{|m|}}\equiv\frac{v(t)}{\delta^{|m|}}, \tag{2.33}$$
where $v(t)\le M\exp(4\tau(\delta^2d+a^2(t)))$. Furthermore, since $\|A(t)\|$ depends continuously on $t$, there exists $w(T)$, such that
$$\sup_{t\in[-T,T]}\|A(t)\|\le w(T).$$

The norm of the vector $c(t)$ is 1 since $K(l,t,\hbar)$ is self-adjoint. Thus, the non-zero entries of $c(0)$ are each bounded by 1, and by a crude estimate, there are at most $(J+1)^d$ of them. Similarly, $c(t)$ has at most $(\tilde J(l)+1)^d$ non-zero entries, each of which is bounded by 1. Thus, for $l\ge3$, (2.11), (2.10), (2.14), (2.29), (2.30), (2.32), (2.31), and Lemma 2.3 imply the following two estimates when $j$ satisfies $\tilde J(l)+1\le|j|\le\tilde J(l)+l+1$:
$$\begin{aligned}\left|(\tilde K(t)c^{q}(t))_j\right|&\le v(t)\,\frac{\left(\int_0^Tv(s)\,ds\right)^q}{\hbar^q\,q!}\sum_{n=3l-2}^{(l+1)(q+1)}(D(1)w(T)/\delta)^n\,\hbar^{n/2}\,\tilde J(l)^{n/2}\,F_{l+1}(n,q+1)\,(J+1)^d\\&=\frac{v(t)\,\hbar\,(q+1)(J+1)^d}{\int_0^Tv(s)\,ds}\cdot\frac{\left(\int_0^Tv(s)\,ds\right)^{q+1}}{\hbar^{q+1}\,(q+1)!}\sum_{n=3l-2}^{(l+1)(q+1)}\left(D(1)w(T)\sqrt{\hbar\tilde J(l)}/\delta\right)^nF_{l+1}(n,q+1)\end{aligned} \tag{2.34}$$
and
$$\left|(\tilde K(t)r(t))_j\right|\le\frac{v(t)\,\hbar\,(l+2)(\tilde J(l)+1)^d}{\int_0^Tv(s)\,ds}\cdot\frac{\left(\int_0^Tv(s)\,ds\right)^{l+2}}{\hbar^{l+2}\,(l+2)!}\sum_{n=3l-2}^{(l+1)(l+2)}\left(D(1)w(T)\sqrt{\hbar\tilde J(l)}/\delta\right)^nF_{l+1}(n,l+2), \tag{2.35}$$
where $F_p(n,q)$ is defined by (2.16).

By Lemma 2.4, (2.34), and (2.35), both $|(\tilde K(t)c^q(t))_j|$ and $|(\tilde K(t)r(t))_j|$ are bounded by $C_3\exp\{-\gamma_1/\hbar\}$ for an appropriate choice of $l(\hbar)=[[g/\hbar]]$ and sufficiently small $\hbar$. Each of the $l+2$ terms in (2.26) contributes a term of this type to $\xi_l^{(2)}(x,t,\hbar)$, so
$$\hbar^{-1}\left\|\xi_l^{(2)}(x,t,\hbar)\right\|\le C_3\,\hbar^{-1}(l(\hbar)+2)\exp\{-\gamma_1/\hbar\}\le C_3\exp\{-\gamma_2/\hbar\},$$
for any $\gamma_2<\gamma_1$ when $\hbar$ is sufficiently small.

We shrink $g$ if necessary to make
$$C_5^2\,g<1 \tag{2.36}$$
and set $l=l(\hbar)$ in (2.24). This yields a similar estimate
$$\hbar^{-1}\left\|\xi_l^{(1)}(\cdot,t,\hbar)\right\|\le C_6\exp\{-\gamma_3/\hbar\},$$
for some $\gamma_3>0$. We combine these two estimates and apply Lemma 2.1 to obtain (1.2) with $\gamma=\min\{\gamma_2,\gamma_3\}$. This proves the theorem. □

3. Proof of Lemma 2.2

For simplicity, we drop the $t$ dependence in the notation throughout this section. To prove Lemma 2.2, we use Hölder's inequality to see that $\sum_{|j|\le\tilde J(l)}|c_j(l,\hbar)|^2=1$ implies
$$\sum_{|j|\le\tilde J(l)}|c_j(l,\hbar)|\le\Bigl(\sum_{|j|\le\tilde J(l)}|c_j(l,\hbar)|^2\Bigr)^{1/2}\Bigl(\sum_{|j|\le\tilde J(l)}1\Bigr)^{1/2}\le\bigl(\tilde J(l)+1\bigr)^{d/2}.$$
Thus, it is sufficient to prove
$$\left\|\frac{D^mV(\zeta(x,a))}{m!}\,(x-a)^m\,\varphi_j(A,B,\hbar,a,\eta,x)\right\|\le g_3\,M\exp\bigl(4\tau(\delta^2d+a^2)\bigr)\left(g_4\sqrt{\hbar(l+2)}\,\|A\|/\delta\right)^{l+2}, \tag{3.1}$$
for some $g_3$ and $g_4$, whenever $|j|\le\tilde J(l)$.


We mimic the proof of (2.33) to obtain a bound on $D^mV(\zeta(x,a))/m!$. If $z\in C_\delta(\zeta(x,a))$, then, for all $j=1,2,\dots,d$, $|z_j|\le\delta+|\zeta_j(x,a)|\le\delta+|a_j|+|x_j-a_j|$. Using this and applying $(b+c)^2\le2(b^2+c^2)$ several times, we see that $z\in C_\delta(\zeta(x,a))$ implies $|V(z)|\le M\exp(2\tau(x-a)^2)\exp(4\tau(\delta^2d+a^2))$. Hence, writing $\frac{1}{m!}D^mV(\zeta(x,a))$ as a $d$-dimensional Cauchy integral, we easily obtain the bound
$$\frac{1}{m!}\,|D^mV(\zeta(x,a))|\le M\,\frac{\exp(4\tau(\delta^2d+a^2))}{\delta^{|m|}}\exp(2\tau(x-a)^2). \tag{3.2}$$
Thus, estimate (3.1) follows from the corresponding estimate on the integral
$$I=\int_{\mathbb{R}^d}(x-a)^{2m}\exp(4\tau(x-a)^2)\,|\varphi_j(A,B,\hbar,a,\eta,x)|^2\,dx. \tag{3.3}$$
Performing the change of variables $x\mapsto y=|A|^{-1}(x-a)/\hbar^{1/2}$, and using the explicit formula for $\varphi_j$, we see that
$$I=\frac{\hbar^{|m|}}{2^{|j|}\,j!\,\pi^{d/2}}\int_{\mathbb{R}^d}(|A|y)^{2m}\exp\bigl(-y^2+4\tau\hbar(|A|y)^2\bigr)\,|H_j(A;y)|^2\,dy, \tag{3.4}$$
where $H_j(A;y)$ is the polynomial described in Remarks 4 and 5 that immediately follow the definition of $\varphi_j(A,B,\hbar,a,\eta,x)$ in Sect. 2. We assume henceforth that $\hbar$ is sufficiently small that
$$4\tau\hbar\|A\|^2\le1/2. \tag{3.5}$$

The estimate $(|A|y)^2\le\|A\|^2y^2$ implies $(|A|y)_k^2\le\|A\|^2y^2$, for $k=1,2,\dots,d$. From this, we conclude
$$I\le\frac{(\hbar\|A\|^2)^{|m|}}{2^{|j|}\,j!\,\pi^{d/2}}\int y^{2|m|}\exp(-y^2/2)\,|H_j(A;y)|^2\,dy. \tag{3.6}$$
We next need an estimate on $|H_j(A;y)|$. By Remark 5 after the definition in Sect. 2,
$$\left(-\Delta+y^2\right)H_j(A;y)\,e^{-y^2/2}=(2|j|+d)\,H_j(A;y)\,e^{-y^2/2}. \tag{3.7}$$
This equation states that $H_j(A;y)e^{-y^2/2}$ is an eigenfunction of $-\Delta+y^2$ corresponding to the eigenvalue $2|j|+d$. Introducing normalization factors, we conclude that
$$2^{-|j|/2}(j!)^{-1/2}\pi^{-d/4}H_j(A;y)\,e^{-y^2/2}=\sum_{|k|=|j|}b_k\,\varphi_{k_1}(y_1)\varphi_{k_2}(y_2)\cdots\varphi_{k_d}(y_d), \tag{3.8}$$
where $\varphi_k(y)=2^{-k/2}(k!)^{-1/2}\pi^{-1/4}H_k(y)e^{-y^2/2}$ is the normalized eigenfunction of $-\frac{\partial^2}{\partial y^2}+y^2$ corresponding to the eigenvalue $2k+1$, and the coefficients $b_k$ satisfy $\sum_{|k|=|j|}|b_k|^2=1$. We can thus deduce an estimate of $H_j(A;y)$ from an estimate for usual Hermite polynomials $H_k(y)$:
$$|H_j(A;y)|^2=\Bigl|\sum_{|k|=|j|}b_k\,H_{k_1}(y_1)H_{k_2}(y_2)\cdots H_{k_d}(y_d)\Bigr|^2\le\sum_{|k|=|j|}|H_{k_1}(y_1)H_{k_2}(y_2)\cdots H_{k_d}(y_d)|^2. \tag{3.9}$$

The Hermite polynomials, in turn, satisfy the following bounds.

Lemma 3.1. If $|y|>\sqrt{2k+1}$, then
$$|H_k(y)|\le2^k|y|^k, \tag{3.10}$$
and there exists a numerical constant $\kappa>0$, such that for all $y\in\mathbb{R}$,
$$|H_k(y)|\le\kappa\,2^{k/2}\sqrt{k!}\;e^{y^2/2}. \tag{3.11}$$
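The two bounds of Lemma 3.1 can be spot-checked numerically on a grid. This is a minimal sketch, not a proof: it assumes the value $\kappa\approx1.086435$ quoted alongside formula 22.14.17 of [1], uses NumPy's physicists' Hermite polynomials, and the grid size and maximal degree are arbitrary illustration choices.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

KAPPA = 1.086435  # assumed numerical constant for (3.11), as quoted in [1]

def hermite(k, y):
    """Physicists' Hermite polynomial H_k evaluated at the points y."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return hermval(y, coeffs)

def check_lemma_3_1(k_max=12, y_max=8.0, samples=4001):
    """Return True if (3.10) and (3.11) hold at every grid point for k = 0..k_max."""
    y = np.linspace(-y_max, y_max, samples)
    for k in range(k_max + 1):
        h = np.abs(hermite(k, y))
        # (3.10) is only claimed in the classically forbidden region.
        forbidden = np.abs(y) > math.sqrt(2 * k + 1)
        bound_310 = (2.0 * np.abs(y[forbidden])) ** k
        bound_311 = KAPPA * 2 ** (k / 2) * math.sqrt(math.factorial(k)) * np.exp(y ** 2 / 2)
        if not np.all(h[forbidden] <= bound_310 * (1 + 1e-12)):
            return False
        if not np.all(h <= bound_311):
            return False
    return True

print(check_lemma_3_1())
```

The small relative tolerance in the first comparison only absorbs rounding in the cases $k=0$ and $k=1$, where (3.10) holds with equality.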

Proof. The second statement is well known. See [1], formula 22.14.17 or [4], formula 8.954.2. By symmetry, it is sufficient to prove (3.10) for $y>0$. It is well known that the $k$th eigenfunction of the harmonic oscillator is non-zero in the classically forbidden region $|y|>\sqrt{2k+1}$. It is also well known that $H_k(y)=2^ky^k+P_k(y)$, where $P_k(y)$ is a polynomial of degree $k-2$. Thus, we have $H_k(y)/(2^ky^k)=1+B_k(y)$, where $B_k$ satisfies
$$B_k(y)=O(1/y^2)\quad\text{for large }y,\qquad\text{and}\qquad B_k(y)>-1\quad\text{if }y>\sqrt{2k+1}.$$
Using standard properties of the Hermite polynomials, we see that
$$\begin{aligned}B_k'(y)&=(yH_k'(y)-kH_k(y))/(2^ky^{k+1})\\&=(2kyH_{k-1}(y)-kH_k(y))/(2^ky^{k+1})\\&=k\bigl(2yH_{k-1}(y)-(2yH_{k-1}(y)-2(k-1)H_{k-2}(y))\bigr)/(2^ky^{k+1})\\&=k(k-1)H_{k-2}(y)/(2^{k-1}y^{k+1}).\end{aligned}$$
So, if $y>\sqrt{2k+1}$, we see that $B_k'(y)>0$. Thus, $y>\sqrt{2k+1}$ implies $-1<B_k(y)<0$, which implies (3.10). □

In (3.6), we use the multinomial expansion
$$(y^2)^{|m|}=\sum_{|n|=|m|}\binom{|m|}{n_1\cdots n_d}\,y_1^{2n_1}y_2^{2n_2}\cdots y_d^{2n_d}$$
to obtain
$$I\le\frac{(\hbar\|A\|^2)^{|m|}}{2^{|j|}\,j!\,\pi^{d/2}}\sum_{|n|=|m|}\sum_{|k|=|j|}\binom{|m|}{n_1\cdots n_d}\prod_{r=1}^{d}\int y_r^{2n_r}\,|H_{k_r}(y_r)|^2\exp(-y_r^2/2)\,dy_r.$$

Using Lemma 3.1, the one dimensional integrals in the final factor here can then be estimated as follows (without indices):
$$\begin{aligned}\int y^{2n}|H_k(y)|^2\exp(-y^2/2)\,dy&\le\int_{y^2\le2k+1}\kappa^2\,2^k\,k!\,y^{2n}\exp(y^2/2)\,dy+\int4^k\,y^{2(n+k)}\exp(-y^2/2)\,dy\\&\le\kappa^2\,2^k\,k!\,\frac{2}{2n+1}(2k+1)^{n+1/2}\exp(k+1/2)+4^k\,2^{(k+n+1/2)}\int z^{2(k+n)}e^{-z^2}\,dz\\&=\frac{2^{k+1}}{2n+1}\,\kappa^2\,e^{k+1/2}\,k!\,(2k+1)^{n+1/2}+4^k\,2^{(k+n+1/2)}\,\pi^{1/2}\,\frac{(2(k+n)-1)!}{2^{2(n+k)-1}(k+n-1)!}\\&=\frac{2^{k+1}}{2n+1}\,\kappa^2\,e^{k+1/2}\,k!\,(2k+1)^{n+1/2}+2^{k-n+3/2}\,\pi^{1/2}\,\frac{(2(k+n)-1)!}{(k+n-1)!}.\end{aligned} \tag{3.12}$$
Here we have used
$$\int z^{2(k+n)}e^{-z^2}\,dz=\pi^{1/2}\,\frac{1\cdot3\cdot5\cdots(2(k+n)-1)}{2^{k+n}}.$$
Now, Stirling's formula guarantees the existence of $a>0$, such that
$$a^n\,n^n\le n!\le n^n, \tag{3.13}$$
for all integers $n\ge1$. Using this in (3.12), we obtain
$$\int y^{2n}|H_k(y)|^2\exp(-y^2/2)\,dy\le f_0\,f_1^{k+n}\,(k+n)!,$$
for some constants $f_0$ and $f_1$. From this we conclude (restoring the indices)
$$I\le\frac{(\hbar\|A\|^2)^{|m|}}{2^{|j|}\,j!\,\pi^{d/2}}\sum_{|n|=|m|}\sum_{|k|=|j|}f_0^d\,f_1^{|m|+|j|}\binom{|m|}{n_1\cdots n_d}(k+n)! \tag{3.14}$$

In this expression, we have
$$(k+n)!=(|k|+|n|)!\binom{|k|+|n|}{k_1+n_1\,\cdots\,k_d+n_d}^{-1}=(|m|+|j|)!\binom{|m|+|j|}{k_1+n_1\,\cdots\,k_d+n_d}^{-1}\le(|m|+|j|)!. \tag{3.15}$$
We further use $\sum_{|n|=|m|}\binom{|m|}{n_1\cdots n_d}=d^{|m|}$ and (2.23) in (3.14) to obtain
$$\begin{aligned}I&\le(\hbar\|A\|^2)^{|m|}\left(\frac{f_0}{\sqrt\pi}\right)^d(f_1d)^{|m|}\left(\frac{e^df_1}{2}\right)^{|j|}\frac{(|m|+|j|)!}{j!}\\&=(\hbar\|A\|^2)^{|m|}\left(\frac{f_0}{\sqrt\pi}\right)^d(f_1d)^{|m|}\left(\frac{e^df_1}{2}\right)^{|j|}\binom{|m|+|j|}{|j|}\binom{|j|}{j_1\cdots j_d}|m|!\\&\le(\hbar\|A\|^2)^{|m|}\left(\frac{f_0}{\sqrt\pi}\right)^d(2f_1d)^{|m|}(e^df_1)^{|j|}\binom{|j|}{j_1\cdots j_d}|m|!\\&\equiv(\hbar\|A\|^2d)^{|m|}\,f_2^d\,f_3^{|m|}\,f_4^{|j|}\binom{|j|}{j_1\cdots j_d}|m|!,\end{aligned} \tag{3.16}$$
where $f_2$, $f_3$, $f_4$ are numerical constants. Estimates (3.2) and (3.16) imply
$$\left\|\frac{D^mV(\zeta(x,a))}{m!}\,(x-a)^m\,\varphi_j(A,B,\hbar,a,\eta,x)\right\|\le M\exp\bigl(4\tau(\delta^2d+a^2)\bigr)\bigl(\hbar\|A\|^2(d/\delta^2)f_3\bigr)^{|m|/2}f_2^{d/2}f_4^{|j|/2}\sqrt{|m|!}\,\sqrt{\binom{|j|}{j_1\cdots j_d}}.$$
So, by the Schwarz inequality, (1.4), (3.9), and (3.18),
$$\begin{aligned}\left\|\frac{D^mV(\zeta(x,a))}{m!}\,(x-a)^m\,\psi_l(x,t,\hbar)\right\|^2&\le\sum_{n=0}^{\tilde J(l)}\sum_{|j|=n}\left\|\frac{D^mV(\zeta(x,a))}{m!}\,(x-a)^m\,\varphi_j(A,B,\hbar,a,\eta,x)\right\|^2\\&\le M^2\exp\bigl(8\tau(\delta^2d+a^2)\bigr)\bigl(\hbar\|A\|^2(d/\delta^2)f_3\bigr)^{|m|}f_2^d\,|m|!\sum_{n=0}^{\tilde J(l)}\sum_{|j|=n}f_4^{|j|}\binom{|j|}{j_1\cdots j_d}\\&=M^2\exp\bigl(8\tau(\delta^2d+a^2)\bigr)\bigl(\hbar\|A\|^2(d/\delta^2)f_3\bigr)^{|m|}f_2^d\,|m|!\sum_{n=0}^{\tilde J(l)}f_4^n\,d^n\\&=M^2\exp\bigl(8\tau(\delta^2d+a^2)\bigr)\bigl(\hbar\|A\|^2(d/\delta^2)f_3\bigr)^{|m|}f_2^d\,|m|!\left(\frac{(f_4d)^{\tilde J(l)+1}-1}{f_4d-1}\right)\\&\le M^2\exp\bigl(8\tau(\delta^2d+a^2)\bigr)\bigl(\hbar\|A\|^2(d/\delta^2)f_3\bigr)^{|m|}f_2^d\,|m|!\;2\,(f_4d)^{\tilde J(l)}.\end{aligned} \tag{3.17}$$

The last step depends on the following fact: We can assume without loss that $q=f_4d\ge2$, so that for any positive integer $p$, we have $-1\le q^{p+1}-2q^p$, and hence, $q^{p+1}-1\le2q^{p+1}-2q^p$. Thus,
$$(q^{p+1}-1)/(q-1)\le2q^p. \tag{3.18}$$
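The elementary inequality (3.18) is easy to confirm with exact rational arithmetic. The sketch below is only an illustration; the function names are hypothetical, and the sample values of $q$ and $p$ are arbitrary. It also shows that the assumption $q\ge2$ matters, since the inequality fails for $q=3/2$, $p=3$.

```python
from fractions import Fraction

def geometric_sum(q, p):
    """Exact value of 1 + q + q^2 + ... + q^p."""
    q = Fraction(q)
    return sum(q ** n for n in range(p + 1))

def satisfies_3_18(q, p):
    """Check (q^(p+1) - 1)/(q - 1) <= 2 q^p exactly, for q != 1."""
    q = Fraction(q)
    return (q ** (p + 1) - 1) / (q - 1) <= 2 * q ** p

# The left side of (3.18) is the geometric sum appearing in (3.17).
print(geometric_sum(3, 4), (Fraction(3) ** 5 - 1) / 2)
print(all(satisfies_3_18(q, p) for q in (2, Fraction(5, 2), 3, 10) for p in range(1, 21)))
print(satisfies_3_18(Fraction(3, 2), 3))  # q < 2: the inequality is not guaranteed
```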


We use the hypothesis $|m|=l+2$ and note that $|m|!\le(l+2)^{l+2}$. We then conclude that there exist constants $g_0$ and $g_1$, that depend only on $d$ and $J$, such that (2.13) holds. This implies the lemma. □

4. Proofs of Lemmas 2.3 and 2.4

The proof of Lemma 2.3 relies on two preliminary lemmas.

Lemma 4.1. The matrix elements of $(x-a)^m$ satisfy
$$\left|\langle\varphi_j(A,B,\hbar,a,\eta,x),\,(x-a)^m\,\varphi_k(A,B,\hbar,a,\eta,x)\rangle\right|\le\hbar^{|m|/2}\,(\sqrt2\,d)^{|m|}\,\|A\|^{|m|}\sqrt{(|k|+1)(|k|+2)\cdots(|k|+|m|)}, \tag{4.1}$$
and
$$\langle\varphi_j(A,B,\hbar,a,\eta,x),\,(x-a)^m\,\varphi_k(A,B,\hbar,a,\eta,x)\rangle=0,\qquad\text{if }\bigl||j|-|k|\bigr|>|m|.$$
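Lemma 4.1 can be checked numerically in the simplest setting. The sketch below assumes $d=1$ and $A=1$ (so $\|A\|=1$), in which case $x-a$ acts on the basis as $\sqrt{\hbar/2}$ times the sum of the raising and lowering operators; this special case, the truncation size, and the function names are illustration choices and not part of the source. The truncated matrix power is exact in column $k$ as long as $k+m$ stays inside the truncation, which the size below guarantees.

```python
import math
import numpy as np

def position_matrix(size, hbar):
    """Matrix of (x - a) in the basis phi_0..phi_{size-1}, for d = 1 and A = 1.

    Entry (k+1, k) and its transpose equal sqrt(hbar/2) * sqrt(k+1).
    """
    X = np.zeros((size, size))
    for k in range(size - 1):
        X[k + 1, k] = X[k, k + 1] = math.sqrt(hbar / 2) * math.sqrt(k + 1)
    return X

def check_lemma_4_1(m, hbar=0.1, k_max=20):
    """Verify the bound (4.1) and the band structure for columns k = 0..k_max."""
    size = k_max + m + 1  # large enough that columns 0..k_max are untruncated
    Xm = np.linalg.matrix_power(position_matrix(size, hbar), m)
    for k in range(k_max + 1):
        product = math.prod(range(k + 1, k + m + 1))  # (k+1)(k+2)...(k+m)
        bound = hbar ** (m / 2) * math.sqrt(2) ** m * math.sqrt(product)
        for j in range(size):
            if abs(j - k) > m and abs(Xm[j, k]) > 1e-12:
                return False
            if abs(Xm[j, k]) > bound * (1 + 1e-12):
                return False
    return True

print(all(check_lemma_4_1(m) for m in range(0, 7)))
```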

Proof. For $i=1,2,\dots,d$, we can use Eq. (3.28) of [7] to express $(x_i-a_i)$ in terms of raising and lowering operators. Doing so, we obtain
$$(x_i-a_i)\,\varphi_k(A,B,\hbar,a,\eta,x)=\sqrt{\hbar/2}\sum_{p=1}^{d}A_{ip}\sqrt{k_p+1}\;\varphi_{k'(p)}(A,B,\hbar,a,\eta,x)+\sqrt{\hbar/2}\sum_{p=1}^{d}\bar A_{ip}\sqrt{k_p}\;\varphi_{k''(p)}(A,B,\hbar,a,\eta,x),$$
where $k'(p)$ and $k''(p)$ are constructed from $k$ by replacing the component $k_p$ by $k_p+1$ and $k_p-1$, respectively. Thus, $(x_i-a_i)\varphi_k(A,B,\hbar,a,\eta,x)$ may be written as a sum of $2d$ terms, each of which has the form $b_q\varphi_q(A,B,\hbar,a,\eta,x)$, where $|q|\le|k|+1$ and $|b_q|\le\sqrt{\hbar/2}\,\|A\|\sqrt{|k|+1}$. From this and a simple induction, $(x-a)^m\varphi_k(A,B,\hbar,a,\eta,x)$ may be written as a sum of $(2d)^{|m|}$ terms, each of which has the form $b_q\varphi_q(A,B,\hbar,a,\eta,x)$, where $|k|-|m|\le|q|\le|k|+|m|$ and $|b_q|\le(\sqrt{\hbar/2})^{|m|}\|A\|^{|m|}\sqrt{(|k|+1)(|k|+2)\cdots(|k|+|m|)}$. This implies the lemma. □

In the next lemma we use the shorthand $X^m(t)$ to denote $X^m(l,t,\hbar)$ of Lemma 2.3.

Lemma 4.2. For each $N>0$, there exists $D(N)<\infty$, such that
$$\sup_{\{j,k:\;|j|,|k|\le\tilde J(l)\}}\left|\left(X^{m_1}(t_1)X^{m_2}(t_2)\cdots X^{m_q}(t_q)\right)_{jk}\right|\le\left(D(N)\sqrt\hbar\sup_{t\in\{t_1,t_2,\dots,t_q\}}\|A(t)\|\right)^{|m_1|+|m_2|+\cdots+|m_q|}\tilde J(l)^{(|m_1|+|m_2|+\cdots+|m_q|)/2}, \tag{4.2}$$
for any collection $m_1,m_2,\dots,m_q$ of multi-indices that satisfy $|m_j|/\tilde J(l)\le N$.


Proof. The matrix $X^m$ is obtained from $\tilde X^m$ by restricting its indices to $|j|,|k|\le\tilde J(l)$. Using Lemma 4.1, we see that for such values of the indices,
$$\left|X^m_{jk}(t)\right|\le\left(\sqrt{2\hbar}\,d\,\|A(t)\|\right)^{|m|}\tilde J(l)^{|m|/2}\sqrt{(1+1/\tilde J(l))(1+2/\tilde J(l))\cdots(1+|m|/\tilde J(l))}.$$
Thus, $|m|/\tilde J(l)\le N$ implies
$$\left|X^m_{jk}(t)\right|\le\left(\sqrt{2\hbar}\,d\,\|A(t)\|\right)^{|m|}(1+N)^{|m|/2}\,\tilde J(l)^{|m|/2}.$$
We estimate the absolute value of the $j,k$ matrix element of the product
$$\left|\left(X^{m_1}(t_1)X^{m_2}(t_2)\cdots X^{m_q}(t_q)\right)_{jk}\right|\le\sum_{\substack{|p_i|\le\tilde J(l),\ i=1,\dots,q-1\\ |p_1-j|\le|m_1|,\ |p_2-p_1|\le|m_2|,\ \dots,\ |k-p_{q-1}|\le|m_q|}}\left|X^{m_1}_{jp_1}(t_1)\,X^{m_2}_{p_1p_2}(t_2)\cdots X^{m_q}_{p_{q-1}k}(t_q)\right| \tag{4.3}$$
by bounding the absolute values of each of the matrix elements $X^{m_i}_{p_{i-1}p_i}(t_i)$ by
$$\left|X^{m_i}_{p_{i-1}p_i}(t_i)\right|\le\left(\sqrt{2\hbar}\,d\,\|A(t_i)\|\right)^{|m_i|}(1+N)^{|m_i|/2}\,\tilde J(l)^{|m_i|/2}, \tag{4.4}$$
and multiplying by a bound on the number of terms. To estimate the number of terms, we first note that for any multi-index $r$, the number of multi-indices $p$ that satisfy $\bigl||p|-|r|\bigr|\le|m|$ is equal to the number of vectors $v$ with integer components, such that $\bigl|\sum_{i=1}^dv_i\bigr|\le|m|$. This is bounded by the number of vectors with integer components, such that $|v_i|\le|m|$ for $i=1,2,\dots,d$, which is $(2|m|+1)^d$. Thus, the number of terms is bounded by
$$\sum_{\substack{|p_i|\le\tilde J(l),\ i=1,\dots,q-1\\ |p_1-j|\le|m_1|,\ |p_2-p_1|\le|m_2|,\ \dots,\ |k-p_{q-1}|\le|m_q|}}1\le(2|m_1|+1)^d(2|m_2|+1)^d\cdots(2|m_q|+1)^d\le(e^2)^{d(|m_1|+|m_2|+\cdots+|m_q|)}. \tag{4.5}$$
The final inequality follows because $1+2|m_j|\le e^{2|m_j|}$. We obtain the lemma by using (4.4) and (4.5) to bound (4.3). □


Proof of Lemma 2.3. We mimic the proof of Lemma 4.2. Let $\tilde m=|m_0|+|m_1|+|m_2|+\cdots+|m_q|$. By Lemmas 4.1 and 4.2,
$$\begin{aligned}&\left(\sqrt\hbar\sup_{t\in\{t_0,t_1,\dots,t_q\}}\|A(t)\|\right)^{-\tilde m}\left|\left(\tilde X^{m_0}(t_0)X^{m_1}(t_1)X^{m_2}(t_2)\cdots X^{m_q}(t_q)\right)_{jk}\right|\\&\quad=\left(\sqrt\hbar\sup_{t\in\{t_0,t_1,\dots,t_q\}}\|A(t)\|\right)^{-\tilde m}\Bigl|\sum_{|r|\le\tilde J(l)}\tilde X^{m_0}_{jr}(t_0)\left(X^{m_1}(t_1)X^{m_2}(t_2)\cdots X^{m_q}(t_q)\right)_{rk}\Bigr|\\&\quad\le\sum_{\substack{|r|\le\tilde J(l)\\ |j-r|\le|m_0|}}2^{|m_0|/2}\sqrt{(|r|+1)(|r|+2)\cdots(|r|+|m_0|)}\times D(N)^{|m_1|+|m_2|+\cdots+|m_q|}\,\tilde J(l)^{(|m_1|+|m_2|+\cdots+|m_q|)/2}\\&\quad\le\sum_{\substack{|r|\le\tilde J(l)\\ |j-r|\le|m_0|}}2^{|m_0|/2}\,\tilde J(l)^{|m_0|/2}\sqrt{(1+1/\tilde J(l))(1+2/\tilde J(l))\cdots(1+|m_0|/\tilde J(l))}\times D(N)^{|m_1|+|m_2|+\cdots+|m_q|}\,\tilde J(l)^{(|m_1|+|m_2|+\cdots+|m_q|)/2}\\&\quad\le\sum_{\substack{|r|\le\tilde J(l)\\ |j-r|\le|m_0|}}2^{|m_0|/2}(1+N)^{|m_0|/2}D(N)^{|m_1|+|m_2|+\cdots+|m_q|}\,\tilde J(l)^{(|m_1|+|m_2|+\cdots+|m_q|)/2}\\&\quad\le2^{|m_0|/2}(1+N)^{|m_0|/2}D(N)^{|m_1|+|m_2|+\cdots+|m_q|}\,\tilde J(l)^{(|m_1|+|m_2|+\cdots+|m_q|)/2}\sum_{|j-r|\le|m_0|}1.\end{aligned}$$
Since the sum in the last line is $(1+2|m_0|)^d\le e^{2d|m_0|}$, we obtain the desired estimate. □

We now turn to the proof of Lemma 2.4. Our first step is to study the combinatorial factor $F_p(n,q)$ in the one dimensional case.

Lemma 4.3. Let $m_j$ denote positive integers, and let
$$G_p(n,q)=\sum_{\substack{1\le m_1,m_2,\dots,m_q\le p\\ m_1+m_2+\cdots+m_q=n}}1. \tag{4.6}$$
This quantity is zero if $n<q$ or $n>qp$. Otherwise, it satisfies
$$G_p(n,q)\le\begin{cases}\dbinom{n-1}{q-1}&\text{if }q\le n\le[q(p+1)/2],\\[2ex]\dbinom{q(p+1)-n-1}{q-1}&\text{if }[q(p+1)/2]+1\le n\le qp.\end{cases} \tag{4.7}$$


Proof. If $n<q$ or $n>qp$, the sum in (4.6) contains no terms, so $G_p(n,q)=0$. The quantity $G_p(n,q)$ is the number of ways that the number $n$ can be decomposed as $n=m_1+m_2+\cdots+m_q$ with each $m_j$ satisfying $1\le m_j\le p$. We uniquely associate to each such decomposition of $n$ a corresponding decomposition of $n'=q(p+1)-n=m_1'+m_2'+\cdots+m_q'$ by setting $m_j'=p+1-m_j$. This association is a one-to-one correspondence, so we see that $G_p(n,q)$ satisfies the symmetry relation
$$G_p(n,q)=G_p(q(p+1)-n,\,q). \tag{4.8}$$
As a consequence, the second inequality in (4.7) (with $[q(p+1)/2]+1\le n\le qp$) follows from the first (with $q\le n\le[q(p+1)/2]$).

To prove the first inequality in (4.7) we drop the condition $m_j\le p$ and exactly calculate the resulting function. Dropping the upper bound on the $m_j$, we clearly have $G_p(n,q)\le G(n,q)$, where
$$G(n,q)=\sum_{\substack{1\le m_1,m_2,\dots,m_q\\ m_1+m_2+\cdots+m_q=n}}1.$$
So, the lemma will be proved once we establish
$$G(n,q)=\binom{n-1}{q-1}. \tag{4.9}$$
To prove this, we use induction on $q$. Formula (4.9) is trivial to verify for all $n\ge1$ when $q=1$ and for all $n\ge2$ when $q=2$. Assume (4.9) has been verified for $q-1$ whenever $n\ge q-1$, and suppose $n\ge q$. We have
$$G(n,q)=\sum_{m_1=1}^{n-q+1}G(n-m_1,\,q-1)=\sum_{m_1=1}^{n-q+1}\binom{n-m_1-1}{q-2},$$
so the induction step is complete, since it is known (see [11], p. 36) that
$$\sum_{m_1=1}^{n-q+1}\binom{n-m_1-1}{q-2}=\binom{n-1}{q-1}. \tag{4.10}$$
This ends the proof of the lemma. □

Our next step is to obtain an estimate for $G_p(n,q)$ that depends only on $n$.

Lemma 4.4. Let $G_p(n,q)$ be defined by (4.6). There exists a constant $C_7$, such that $p\ge1$, $q\ge1$, and $n\ge1$ imply
$$G_p(n,q)\le(C_7)^n. \tag{4.11}$$
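The counting statements of Lemma 4.3 and the shape of the bound in Lemma 4.4 can be checked by direct enumeration for small parameters. The sketch below is illustrative only: the dynamic-programming helper and its name are hypothetical, and the final check uses the elementary observation $\binom{n-1}{q-1}\le2^{n-1}$, which suggests (without relying on the argument of the paper) that the value 2 is admissible for $C_7$ in the tested range.

```python
from math import comb

def G_p(n, q, p):
    """Number of q-tuples (m_1..m_q) of integers with 1 <= m_j <= p and sum n."""
    ways = {0: 1}  # ways[s] = number of partial tuples with sum s
    for _ in range(q):
        new_ways = {}
        for s, count in ways.items():
            for m in range(1, p + 1):
                if s + m <= n:
                    new_ways[s + m] = new_ways.get(s + m, 0) + count
        ways = new_ways
    return ways.get(n, 0)

def G(n, q):
    """Closed form (4.9) for the count without an upper bound on the m_j."""
    return comb(n - 1, q - 1) if n >= q else 0

def check_lemma_4_3(p_max=5, q_max=5):
    for p in range(1, p_max + 1):
        for q in range(1, q_max + 1):
            for n in range(0, q * p + 3):
                g = G_p(n, q, p)
                if (n < q or n > q * p) and g != 0:
                    return False
                if q <= n <= q * p:
                    if g != G_p(q * (p + 1) - n, q, p):      # symmetry (4.8)
                        return False
                    if n <= (q * (p + 1)) // 2:              # first case of (4.7)
                        if g > comb(n - 1, q - 1):
                            return False
                    elif g > comb(q * (p + 1) - n - 1, q - 1):  # second case
                        return False
                if n >= 1 and g > 2 ** n:                    # (4.11) with C_7 = 2
                    return False
    return True

print(check_lemma_4_3())
print(G_p(7, 3, 10), G(7, 3))  # with p large the bound on m_j is inactive
```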

Proof. We note that $G_p(n,q)$ is zero unless $q\le n\le qp$. Furthermore, $G_p(q,q)=G_p(qp,q)=G_2(2,1)=1$ and $G_1(n,q)=\delta_{nq}$. So, we need only prove existence of $C_7$, such that $p\ge2$, $q\ge1$, $(p,q)\ne(2,1)$ imply
$$\sup_{q<n<qp}(G_p(n,q))^{1/n}\le C_7. \tag{4.12}$$
To prove this, we first study the case where $q<n\le[q(p+1)/2]$. Since $q\le n$, we have $q^{1/n}=\exp(\ln(q)/n)\le\exp(\ln(n)/n)\le e^{1/e}$. Thus, by (4.7), the condition $q<n\le[q(p+1)/2]$, and (3.13), we have
$$(G_p(n,q))^{1/n}\le\left(\frac{(n-1)!}{(q-1)!\,(n-q)!}\right)^{1/n}\le q^{1/n}\left(\frac{(n-1)!}{q!\,(n-q)!}\right)^{1/n}\le e^{1/e}\,\frac{n}{a\,q^{q/n}\,(n-q)^{(n-q)/n}}. \tag{4.13}$$
An explicit computation shows that
$$\frac{\partial}{\partial n}\left(\frac{n}{q^{q/n}\,(n-q)^{(n-q)/n}}\right)=\frac{q}{n^2}\,\frac{n}{q^{q/n}\,(n-q)^{(n-q)/n}}\,\bigl(\ln(q)-\ln(n-q)\bigr).$$
From this, we deduce that the right-hand side of (4.13) attains its maximum at $n=2q$. Evaluating that maximum, we obtain
$$\sup_{q<n\le[q(p+1)/2]}(G_p(n,q))^{1/n}\le\frac{2}{a}\,e^{1/e}. \tag{4.14}$$
For $[q(p+1)/2]<n<pq$, the number $n'=q(p+1)-n$ satisfies $q<n'\le[q(p+1)/2]$ and $n'/n\le1$. Thus, by (4.8) and (4.14), we obtain
$$(G_p(n,q))^{1/n}=(G_p(n',q))^{1/n}\le\left(\frac{2}{a}\,e^{1/e}\right)^{n'/n}\le\frac{2}{a}\,e^{1/e}.$$
Thus,
$$\sup_{[q(p+1)/2]<n<pq}(G_p(n,q))^{1/n}\le\frac{2}{a}\,e^{1/e}. \tag{4.15}$$

Inequalities (4.14) and (4.15) imply the existence of $C_7$ for which (4.12) holds. This implies (4.11), and the lemma is proved. □

We now generalize Lemma 4.4 to the multi-dimensional case.

Lemma 4.5. Let $F_p(n,q)$ be defined by (2.16). For all $p\ge1$, $q\ge1$, and $n\ge1$, we have
$$F_p(n,q)\le C_7^{(n+qd)}, \tag{4.16}$$
where $C_7$ is the constant of Lemma 4.4.


Proof. With $m_j$ temporarily denoting numbers instead of multi-indices, we define
$$\Gamma_p(n,q)=\sum_{\substack{0\le m_1,m_2,\dots,m_q\le p\\ m_1+m_2+\cdots+m_q=n}}1,$$
which is zero unless $0\le n\le pq$. By defining $m_j'=m_j+1$, we see that every decomposition of $n$ as $n=m_1+m_2+\cdots+m_q$, with $0\le m_j\le p$, corresponds uniquely to a decomposition of $n+q$ as $n+q=m_1'+m_2'+\cdots+m_q'$, with $1\le m_j'\le p+1$. Therefore, we have the identity
$$\Gamma_p(n,q)=G_{p+1}(n+q,\,q). \tag{4.17}$$
We now let the $m_j$ denote multi-indices and let $m_j(k)$ denote the $k$th component of $m_j$. Then, using (4.17) and Lemma 4.4, we easily obtain
$$F_p(n,q)\le\sum_{\substack{0\le m_j(k)\le p\\ \sum_{j=1}^{q}\sum_{k=1}^{d}m_j(k)=n}}1=\Gamma_p(n,qd)\le C_7^{(qd+n)}.\qquad\square$$
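The shift identity (4.17) and the comparison $F_p(n,q)\le\Gamma_p(n,qd)$ used in this proof can be verified by enumeration for small parameters. This is a sketch under stated assumptions: $F_p$ is computed as the number of ordered $q$-tuples of $d$-dimensional multi-indices, using the standard count $\binom{s+d-1}{d-1}$ of multi-indices of order $s$; the helper names are hypothetical, and the last comparison again uses 2 as a trial value for $C_7$.

```python
from math import comb

def count_bounded(n, q, lo, hi):
    """Number of integer q-tuples with lo <= m_j <= hi and m_1 + ... + m_q = n."""
    ways = {0: 1}
    for _ in range(q):
        new_ways = {}
        for s, count in ways.items():
            for m in range(lo, hi + 1):
                if s + m <= n:
                    new_ways[s + m] = new_ways.get(s + m, 0) + count
        ways = new_ways
    return ways.get(n, 0)

def F_p(n, q, p, d):
    """Ordered q-tuples of d-dimensional multi-indices, 1 <= |m_j| <= p, total order n."""
    ways = {0: 1}
    for _ in range(q):
        new_ways = {}
        for s, count in ways.items():
            for order in range(1, p + 1):
                if s + order <= n:
                    # comb(order + d - 1, d - 1) multi-indices have this order
                    weight = comb(order + d - 1, d - 1)
                    new_ways[s + order] = new_ways.get(s + order, 0) + count * weight
        ways = new_ways
    return ways.get(n, 0)

def check_lemma_4_5(p_max=3, q_max=3, d_max=3):
    for p in range(1, p_max + 1):
        for q in range(1, q_max + 1):
            for n in range(0, q * p + 2):
                # identity (4.17): Gamma_p(n, q) = G_{p+1}(n + q, q)
                if count_bounded(n, q, 0, p) != count_bounded(n + q, q, 1, p + 1):
                    return False
                for d in range(1, d_max + 1):
                    gamma = count_bounded(n, q * d, 0, p)
                    if not F_p(n, q, p, d) <= gamma <= 2 ** (n + q * d):
                        return False
    return True

print(check_lemma_4_5())
```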

Proof of Lemma 2.4. Suppose $2\le q\le l+2$, and let $L_q(l)$ satisfy (2.17), i.e.,
$$L_q(l)\le\frac{C_1^q}{\hbar^q\,q!}\sum_{n=3l-2}^{q(l+1)}(C_2\hbar l)^{n/2}F_{l+1}(n,q).$$
By Lemma 4.5 and (3.13), there exist $C_8$ and $C_9$, such that
$$L_q(l)\le\frac{(C_1C_7^d)^q}{(a\hbar q)^q}\sum_{n=3l-2}^{q(l+1)}(C_2\hbar l)^{n/2}C_7^n\le\frac{C_9^q}{(\hbar q)^q}\sum_{n=3l-2}^{q(l+1)}\left(C_8\hbar^{1/2}l^{1/2}\right)^n. \tag{4.18}$$
Note that we can take
$$C_8=\max\{1,\sqrt{C_2}\,C_7\}, \tag{4.19}$$
$$C_9=\max\{1,C_1C_7^d\}, \tag{4.20}$$
because we can assume $C_8\ge1$ and $C_9\ge1$ without loss of generality. We arbitrarily choose two positive numbers $\alpha<1$ and $\beta<1$. We then define
$$g=\frac{1}{C_8^2}\min\left\{\alpha^2,\ \frac{\beta^2}{C_9^2\,C_8^4\,e^{2/e}}\right\}. \tag{4.21}$$
We henceforth assume $l=l(\hbar)$ is chosen to satisfy (2.18) with this value of $g$. With this choice, we have
$$C_8\,\hbar^{1/2}l^{1/2}\le\alpha. \tag{4.22}$$


By summing a geometric series in (4.18), we see that
$$L_q(l)\le\frac{C_9^q}{(1-\alpha)(\hbar q)^q}\left(C_8\hbar^{1/2}l^{1/2}\right)^{3l-2}=\frac{C_9^q\,C_8^{2q}}{(1-\alpha)(\hbar qC_8^2)^q}\left(C_8\hbar^{1/2}l^{1/2}\right)^{3l-2}. \tag{4.23}$$
As a function of $q$,
$$\left(\frac{l}{q}\right)^q=\exp\bigl(q\ln(l/q)\bigr)$$
is maximized at $q=l/e$, where it has the value $e^{l/e}$. Thus, $(l/q)^q\le e^{l/e}$. This, $q\le l+2$, and $\hbar lC_8^2\le\alpha^2<1$ imply
$$\frac{1}{(\hbar qC_8^2)^q}=\frac{1}{(\hbar lC_8^2)^q}\left(\frac{l}{q}\right)^q\le\frac{e^{l/e}}{(\hbar lC_8^2)^{l+2}}.$$
From this, $C_8\ge1$, and $C_9\ge1$, we conclude
$$L_q(l)\le\frac{C_9^{l+2}\,C_8^{2(l+2)}\,e^{l/e}}{(1-\alpha)}\left(C_8\hbar^{1/2}l^{1/2}\right)^{l-6}=\left(e^{1/e}C_9C_8^3\,\hbar^{1/2}l^{1/2}\right)^l\frac{C_9^2}{(1-\alpha)(\hbar l)^3C_8^2}. \tag{4.24}$$
By (4.21), we have $e^{1/e}C_9C_8^3\hbar^{1/2}l^{1/2}\le\beta<1$. Since
$$\frac{g}{\hbar}-1\le l\le\frac{g}{\hbar},$$
we see from (4.24) that if $\hbar\le g/2$, then we have
$$L_q(l)\le e^{-|\ln(\beta)|\,l}\,\frac{C_9^2}{(1-\alpha)(g-\hbar)^3C_8^2}\le\frac{2^3\,e\,C_9^2}{(1-\alpha)\,g^3\,C_8^2}\;e^{-|\ln(\beta)|\,g/\hbar}. \tag{4.25}$$
Note that the factor $2^3$ can be replaced by $(1+\epsilon)$, with $\epsilon$ arbitrarily small, by taking $\hbar$ sufficiently small. □
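Two elementary steps of this proof lend themselves to a quick numerical sanity check: the bound $(l/q)^q\le e^{l/e}$ and the way the choice (4.21) of $g$ enforces both (4.22) and $e^{1/e}C_9C_8^3\hbar^{1/2}l^{1/2}\le\beta$. The sketch below uses hypothetical sample values for $\alpha$, $\beta$, $C_8$, $C_9$ and $\hbar$; none of them come from the paper.

```python
import math

def power_ratio(l, q):
    """The quantity (l/q)^q that appears after (4.23)."""
    return (l / q) ** q

def choose_g(alpha, beta, C8, C9):
    """The value of g defined in (4.21)."""
    return min(alpha ** 2, beta ** 2 / (C9 ** 2 * C8 ** 4 * math.exp(2 / math.e))) / C8 ** 2

def l_of_hbar(g, hbar):
    """l(hbar) = [[g / hbar]], the greatest integer not exceeding g / hbar."""
    return math.floor(g / hbar)

# (l/q)^q <= e^{l/e} for every integer 1 <= q <= l + 2.
ratio_ok = all(
    power_ratio(l, q) <= math.exp(l / math.e) * (1 + 1e-12)
    for l in range(1, 61)
    for q in range(1, l + 3)
)

# Hypothetical constants, used only to exercise (4.21) and (4.22).
alpha, beta, C8, C9 = 0.5, 0.5, 1.5, 2.0
g = choose_g(alpha, beta, C8, C9)
constraints_ok = True
for hbar in (g / 2, g / 10, g / 137.5, g / 1000):
    l = l_of_hbar(g, hbar)
    root = math.sqrt(hbar * l)
    constraints_ok &= C8 * root <= alpha + 1e-12
    constraints_ok &= math.exp(1 / math.e) * C9 * C8 ** 3 * root <= beta + 1e-12

print(ratio_ok, constraints_ok)
```

The two inequalities hold because $\hbar l\le g$ for $l=[[g/\hbar]]$, which is the only property of $l(\hbar)$ used at this point of the argument.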


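Remark. We can choose $\alpha$ and $\beta$ to satisfy
$$\frac{\beta}{e^{1/e}}\le\alpha<\frac{1}{e^{1/e}}.$$
Then, since
$$\frac{\beta^2}{C_9^2\,C_8^4\,e^{2/e}}\le\frac{\beta^2}{e^{2/e}}\le\alpha^2,$$
we obtain the conclusion of Lemma 2.4 with
$$g=\frac{\beta^2}{C_9^2\,C_8^6\,e^{2/e}}, \tag{4.26}$$
$$\gamma_1=\frac{|\ln(\beta)|\,\beta^2}{C_9^2\,C_8^6\,e^{2/e}}, \tag{4.27}$$
$$C_3=\frac{2^3\,e^{1+6/e}\,C_9^8\,C_8^{10}}{(1-e^{-1/e})\,\beta^6}, \tag{4.28}$$
where $C_8$ and $C_9$ are related to $C_1$, $C_2$, and $C_7$ by (4.19) and (4.20). Note that $C_7$ is purely combinatorial and has no time dependence.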

5. Proof of Theorem 1.2

To prove Theorem 1.2, we revisit the proof of Theorem 1.1 and make the $T$ dependence of all constants explicit. We then allow $T$ to grow with $\hbar$ with the restriction that our approximation remain close to the actual solution as $\hbar\to0$.

Lemmas 4.3, 4.4, and 4.5 have no time dependence, and the time dependence of Lemmas 2.1, 2.2, 2.3, 3.1, 4.1, and 4.2 has been made explicit in their conclusions.

Since $V$ is bounded below and energy is conserved, $|a(t)|$ grows at most linearly with time. Thus, there exist $v_1>0$ and $v_0>0$, such that $v(t)$ defined by (2.33) satisfies
$$v(t)\le v_0\exp(v_1t). \tag{5.1}$$
From assumption (1.9), it follows that
$$w(T)=\sup_{0\le|t|\le T}\|A(t)\|\le N\exp(\lambda T). \tag{5.2}$$
In the sequel, we denote all inessential constants that do not depend on $T$ by the same symbol $c$.

Consider Lemma 2.2. For $z\in C_\delta(\zeta(x,a(t)))$, we can prove the existence of $\rho>0$ such that
$$\left\|\frac{D^mV(\zeta(x,a(t)))}{m!}\,(x-a(t))^m\,\psi_l(x,t,\hbar)\right\|\le c\exp(\rho|t|)\left(c\sqrt{\hbar(l+2)}\exp(\lambda|t|)\right)^{l+2}. \tag{5.3}$$
Indeed, with our bound on $|V(z)|$ we have
$$\left|\frac{D^mV(\zeta(x,a(t)))}{m!}\right|\le c\exp(\rho|t|)\exp(c|x-a(t)|)$$
instead of (3.2). With this and the estimate $\exp(cy)\le\exp(cy^2)+\exp(c)$ for all $c>0$ and $y>0$, we see that the estimates in the proof of Lemma 2.2 are still valid. This yields (5.3). Consequently, the constants $C_4$ and $C_5$ appearing in (2.24) satisfy
$$C_4(T)=c\exp(\rho T),\qquad C_5(T)=c\exp(\lambda T),$$
where $\rho>0$. By using estimates (5.1) and (5.2) in (2.34) and (2.35), we see that when we apply Lemma 2.4, the constants $C_1$ and $C_2$ satisfy
$$C_1(T)=c\exp(\rho T),\qquad C_2(T)=c\exp(2\lambda T).$$
We still must determine the $T$ dependence of $C$ and $\gamma$ of Theorem 1.1. We do this by determining the $T$ dependence of $g$, $\gamma_1$ and $C_3$ in Lemma 2.4. Equations (4.19) and (4.20) yield $C_8$ and $C_9$, in terms of which the above constants are determined. We find that
$$C_8(T)=c\exp(\lambda T),\qquad C_9(T)=c\exp(\rho T).$$
Consequently, from (4.26), (4.27), (4.28), we have
$$g(T)=c\exp(-\nu T),\qquad\gamma_1(T)=c\exp(-\nu T),\qquad\nu=2\rho+6\lambda,$$
$$C_3(T)=c\exp(\mu T),\qquad\mu=8\rho+10\lambda.$$

To arrive at these conclusions, we imposed various conditions. Those conditions were (3.5), that now requires $\hbar\le c\exp(-2\lambda T)$, (2.36), that now requires $c\exp(2\lambda T)\exp(-\nu T)<1$, and $\hbar\le g(T)/2$ used in (4.25), that now requires $\hbar\le c\exp(-\nu T)$. These are all satisfied, provided we take $T(\hbar)\le T'|\ln(\hbar)|$, with $T'>0$ sufficiently small. Using this in our estimate for the error term $\xi(t,\hbar)$, we see that for sufficiently small $T'$, $|t|\le T'|\ln(\hbar)|$ implies
$$\|\psi(x,t,\hbar)-\Psi(x,t,\hbar)\|_{L^2(\mathbb{R}^d)}\le C'\exp(-\gamma'/\hbar^\sigma),$$
for some $\gamma'>0$, $\sigma>0$, and $C'>0$. □

References

1. Abramowitz, M. and Stegun, I. A.: Handbook of Mathematical Functions. New York: Dover, 1968
2. Bambusi, D., Graffi, S., and Paul, T.: Long Time Semiclassical Approximation of Quantum Flows: A Proof of Ehrenfest Time. 1998 preprint
3. Combescure, M. and Robert, D.: Semiclassical Spreading of Quantum Wave Packets and Applications near Unstable Fixed Points of the Classical Flow. Asymptotic Anal. 14, 377–404 (1997)
4. Gradshteyn, I. S. and Ryzhik, I. M.: Table of Integrals, Series, and Products, Fifth Ed. New York: Academic Press, 1994
5. Hagedorn, G. A.: Semiclassical Quantum Mechanics III: The Large Order Asymptotics and More General States. Ann. Phys. 135, 58–70 (1981)
6. Hagedorn, G. A.: Semiclassical Quantum Mechanics IV: Large Order Asymptotics and More General States in More than One Dimension. Ann. Inst. H. Poincaré Sect. A. 42, 363–374 (1985)
7. Hagedorn, G. A.: Raising and Lowering Operators for Semiclassical Wave Packets. Ann. Phys. 269, 77–104 (1998)
8. Jung, K.: Adiabatic Invariance and the Regularity of Perturbations. Nonlinearity 8, 891–900 (1995)
9. Robert, D.: Semi-Classical Approximation in Quantum Mechanics. A Survey of Old and Recent Mathematical Results. Preprint 97/03-6, Université de Nantes, 1997
10. Sjöstrand, J.: Singularités analytiques microlocales. Astérisque 95, 1982
11. Vilenkin, N. Y.: Combinatorics. London–New York: Academic Press, 1971
12. Yajima, K.: Gevrey Frequency Set and Semi-classical Behavior of Wave Packets. In: Schrödinger Operators, The Quantum Mechanical Many Body Problem, Lecture Notes in Physics 403, ed. by E. Balslev. Berlin–Heidelberg–New York: Springer-Verlag, 1992, pp. 248–264

Communicated by B. Simon

Commun. Math. Phys. 207, 467 – 479 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Positivity of General Superenergy Tensors

Göran Bergqvist

Department of Mathematics, Mälardalen University, 721 23 Västerås, Sweden. E-mail: [email protected]

Received: 21 December 1998 / Accepted: 5 May 1999

Abstract: Several of the most important results in general relativity require or assume positivity properties of certain tensors. The positive energy theorem and the singularity theorems make assumptions about the energy-momentum tensor and Ricci tensor respectively. Positivity of the Bel–Robinson tensor is needed in the proof of the global stability of Minkowski spacetime. Senovilla has recently presented a procedure for constructing a superenergy tensor from any tensor. For a Maxwell field or a scalar field the procedure yields the usual energy-momentum tensor; for the Weyl tensor and the Riemann tensor one obtains the Bel–Robinson tensor and Bel tensor respectively. In general, by considering any tensor as an $r$-fold $(n_1,\dots,n_r)$-form, one constructs a rank $2r$ superenergy tensor from it. By using spinor methods, we prove that the contraction of any such superenergy tensor with $2r$ future-pointing vectors is non-negative. We refer to this as the dominant superenergy property. It generalizes several previous positivity results obtained for certain tensors and provides a unified way of treating them. Some more examples are given and applications discussed.

1. Introduction

That a quantity is positive is often a required property or a needed assumption in the demonstrations of many important results. The dominant energy condition in general relativity [PR1] forces the energy-momentum tensor $T_{ab}$ to satisfy
$$T_{ab}\,u^av^b\ge0 \tag{1}$$
for any future-pointing vectors $u^a$ and $v^a$. This assumption is essential in the positive energy theorem, which says that the total energy of an asymptotically flat spacetime is non-negative [PR2]. Sometimes one needs another positivity condition, the strong energy condition, which (via Einstein's equation) is
$$(-R_{ab})\,u^au^b\ge0 \tag{2}$$


for any future-pointing vector $u^a$. Here $R_{ab}$ is the Ricci tensor and the minus sign follows the conventions of [PR1]. A well known situation in which (2) is assumed is in some versions of the singularity theorems of Penrose and Hawking [HE], which are based on the Raychaudhuri equation which contains the term $R_{ab}u^au^b$.

If $C_{abcd}$ denotes the Weyl conformal curvature tensor, then the Bel–Robinson tensor $T_{abcd}$ [Bel1] is defined by
$$T_{abcd}=C_a{}^e{}_c{}^f\,C_{bedf}+{}^*C_a{}^e{}_c{}^f\;{}^*C_{bedf}, \tag{3}$$
where ${}^*C_{abcd}=\frac12e_{ab}{}^{ef}C_{efcd}$ is the dual of $C_{abcd}$ with respect to the first two indices. The Bel–Robinson tensor satisfies the positivity property [B1],
$$T_{abcd}\,u^av^bw^cz^d\ge0 \tag{4}$$
for all future-pointing vectors $u^a$, $v^a$, $w^a$ and $z^a$. This property, or rather special cases of it [CK1, CK2], is why it was useful to Christodoulou and Klainerman in their proof of the global stability of Minkowski spacetime. Other applications of (4) to energy expressions are given in [B2].

The definition (3) can be generalized to the Riemann curvature tensor $R_{abcd}$. We then define the Bel tensor [Bel2] $B_{abcd}$ by
$$2B_{abcd}=R_a{}^e{}_c{}^f\,R_{bedf}+{}^*R_a{}^e{}_c{}^f\;{}^*R_{bedf}+R^*{}_a{}^e{}_c{}^f\,R^*{}_{bedf}+{}^*R^*{}_a{}^e{}_c{}^f\;{}^*R^*{}_{bedf}, \tag{5}$$
where ${}^*R_{abcd}$ is the dual with respect to the first two indices, $R^*{}_{abcd}$ the dual w.r.t. the last two indices, and ${}^*R^*{}_{abcd}$ the (double) dual w.r.t. both pairs of indices. In fact, the Bel tensor also satisfies [Bo],
$$B_{abcd}\,u^av^bw^cz^d\ge0.$$
Now, (3) can be rewritten to look more like (5). By [PR1] we have ${}^*C_{abcd}=C^*{}_{abcd}$ and ${}^*C^*{}_{abcd}={}^{**}C_{abcd}=-C_{abcd}$. Hence
$$2T_{abcd}=C_a{}^e{}_c{}^f\,C_{bedf}+{}^*C_a{}^e{}_c{}^f\;{}^*C_{bedf}+C^*{}_a{}^e{}_c{}^f\,C^*{}_{bedf}+{}^*C^*{}_a{}^e{}_c{}^f\;{}^*C^*{}_{bedf}. \tag{6}$$
The Bel–Robinson and Bel tensors are of course mathematical quantities while the energy-momentum tensor $T_{ab}$ depends on the physical model. It can however in several cases be written in a way similar to (5) or (6). If $F_{ab}$ is a Maxwell field (i.e. antisymmetric), then [PR1]
$$T_{ab}=k\left(\tfrac14\,g_{ab}F_{cd}F^{cd}-F_a{}^cF_{bc}\right)$$
($k$ a positive constant depending on units) which can be written
$$T_{ab}=-\frac{k}{2}\left(F_a{}^cF_{bc}+{}^*F_a{}^c\;{}^*F_{bc}\right), \tag{7}$$
where ${}^*F_{ab}$ is the dual of $F_{ab}$. An energy-momentum tensor defined by (7) will automatically satisfy the dominant energy condition (1) [PR1]. Another example is a scalar field $\Phi$. Let $J_a=\nabla_a\Phi$ be a 1-form and ${}^*J_{abc}=e_{abc}{}^dJ_d$ its dual 3-form. The energy-momentum tensor of the field is
$$T_{ab}=k\left(J_aJ_b-\tfrac12\,g_{ab}J_cJ^c\right), \tag{8}$$

Positivity of General Superenergy Tensors

which may be expressed as

$$T_{ab} = \frac{k}{2}\left(J_a J_b + \tfrac12\,{}^*J_a{}^{cd}\;{}^*J_{bcd}\right). \tag{9}$$

Again, any $T_{ab}$ defined from a 1-form as in (9) can be shown to satisfy the dominant energy condition (1).

The similarities between (5), (6), (7) and (9) have led Senovilla [S] to propose a way of constructing superenergy tensors from any tensor. The Weyl and Riemann tensors are anti-symmetric in the first and the second pairs of indices so one can form duals with respect to each pair, and we will call these tensors double 2-forms. A general tensor will be considered as a multiple form, and by adding all possible duals together, the above equations will be generalized to give a general superenergy tensor.

We shall work entirely with general relativistic spacetimes, i.e. four-dimensional manifolds with a Lorentz metric of signature $(+,-,-,-)$. Our notation is that of Penrose and Rindler [PR1], which means that our indices are abstract labels rather than referring to any coordinate system.

In Sect. 2 we present the definition given by Senovilla and list some of its simpler properties. Then, in Sect. 3, we introduce spinors and describe how tensor duals are expressed in terms of spinors. For the definition (3) of the Bel–Robinson tensor we follow [S] and the general form of superenergy tensors, and exclude a factor of 1/4 used in [PR1] and [B1]. We see how the spinor form implies that the superenergy tensors of simple 1-forms, 2-forms and 3-forms, and double 1-forms and 2-forms satisfy positivity properties like (1) and (4). We refer to this as the dominant superenergy property. In Sect. 4 we present the spinor form of the general superenergy tensor. From this we arrive at the main result of this paper: a proof of the dominant superenergy property in the most general case. By its generality, it unifies all results mentioned above and also generalizes them. It provides a method of constructing positive quantities from any tensor. The proof, which uses the spinor form, is much simpler than proofs of simple special cases by tensor methods, such as the results on the Bel–Robinson tensor by Christodoulou and Klainerman. In Sect. 5 we then explicitly study the superenergy tensors constructed from some other tensors.

2. Definition of General Superenergy Tensors

2.1. Arbitrary tensors as multiple forms. Let $A_{c_1\dots c_m}$ be an arbitrary real rank $m$ (covariant) tensor. Let $[n_1]$ denote the set of indices containing $c_1$ and all other indices $c_j$ such that $A_{c_1\dots c_m}$ is anti-symmetric in $c_1 c_j$. The number $n_1$ is the number of indices in $[n_1]$. Then $[n_2]$ is the next set formed from anti-symmetries with $c_2$ (or $c_3$ if $c_2$ is already in $[n_1]$, and so on). In this way $c_1, \dots, c_m$ are divided into $r$ blocks $[n_1], \dots, [n_r]$ with $n_1 + \dots + n_r = m$. We can therefore consider $A_{c_1\dots c_m}$ as an $r$-fold $(n_1, \dots, n_r)$-form and we write $A_{c_1\dots c_m} = A_{[n_1]\dots[n_r]}$.

It is possible to form $2^r$ different (multiple) duals of $A_{[n_1]\dots[n_r]}$. The dual with respect to the block $[n_1]$ is denoted $A_{\overset{*}{[4-n_1]}[n_2]\dots[n_r]}$, the dual with respect to $[n_1]$ and $[n_2]$ by $A_{\overset{*}{[4-n_1]}\overset{*}{[4-n_2]}\dots[n_r]}$, and so on. Note that different duals may be tensors of different rank but all duals are $r$-fold forms. The notation $(A_P)_{[\,]\dots[\,]}$, $P = 1, 2, \dots, 2^r$, will be used to label all the possible duals. Here $P = 1 + s_1 + 2s_2 + \dots + 2^{r-1}s_r$ with $s_j = 0$ if there is no dual with respect to the block $[n_j]$ and $s_j = 1$ if there is a dual with respect to this block (so $A_1 = A$ and $A_{2^r}$ is the form where the dual has been taken with respect to all blocks).
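The labelling of duals in Sect. 2.1 can be illustrated with a short sketch. The function name `enumerate_duals` and the tuple encoding of the blocks are illustrative choices, not notation from the paper; the code only enumerates the patterns $(s_1, \dots, s_r)$, computes $P = 1 + s_1 + 2s_2 + \dots + 2^{r-1}s_r$, and records how each block size $n_j$ becomes $4 - n_j$ under a dual.

```python
from itertools import product


def enumerate_duals(blocks):
    """Map each label P to (s, dual block sizes, rank) for an r-fold form.

    blocks: tuple (n_1, ..., n_r) of block sizes, each 1, 2 or 3.
    s = (s_1, ..., s_r), with s_j = 1 when the dual is taken on block j.
    """
    duals = {}
    for s in product((0, 1), repeat=len(blocks)):
        # P = 1 + s_1 + 2 s_2 + ... + 2^(r-1) s_r
        label = 1 + sum(s_j * 2 ** j for j, s_j in enumerate(s))
        # A dual on block j replaces an n_j-block by a (4 - n_j)-block.
        sizes = tuple(4 - n if s_j else n for n, s_j in zip(blocks, s))
        duals[label] = (s, sizes, sum(sizes))
    return duals


# A (2,1)-form such as L_{[ab]c}: four duals, of rank 3 or 5.
for label, (s, sizes, rank) in sorted(enumerate_duals((2, 1)).items()):
    print(label, s, sizes, rank)
```

For a double 2-form such as the Weyl or Riemann tensor, `enumerate_duals((2, 2))` gives four duals that all have rank 4, in line with the four terms of (5) and (6).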

2.2. The superenergy tensor. In order to define the superenergy tensor of $A_{c_1\dots c_m}$ we need to define a product $\times$ of an $r$-fold form by itself resulting in a $2r$-tensor. First, let $\tilde A_{c_1\dots c_m}$ be the tensor obtained by permuting the indices in $A_{c_1\dots c_m}$ so that the first $n_1$ indices in $\tilde A_{c_1\dots c_m}$ are precisely the indices in the block $[n_1]$, the following $n_2$ indices are the ones in $[n_2]$, and so on. We now define the product as

$$(A\times A)_{a_1b_1\dots a_rb_r} = \frac{(-1)^{n_1-1}}{(n_1-1)!}\cdots\frac{(-1)^{n_r-1}}{(n_r-1)!}\;\tilde A_{a_1d_{12}\dots d_{1n_1}\,a_2d_{22}\dots d_{2n_2}\,\dots\,a_rd_{r2}\dots d_{rn_r}}\;\tilde A_{b_1}{}^{d_{12}\dots d_{1n_1}}{}_{b_2}{}^{d_{22}\dots d_{2n_2}}{}_{\dots\,b_r}{}^{d_{r2}\dots d_{rn_r}}. \tag{10}$$

From each block in $A_{[n_1]\dots[n_r]}$ two indices are obtained in $(A\times A)_{a_1b_1\dots a_rb_r}$. We can form $(A_P\times A_P)_{a_1b_1\dots a_rb_r}$ for any $P$ and we make the following important definition.

Definition. The superenergy of $A_{c_1\dots c_m}$ is defined to be

$$T_{a_1b_1\dots a_rb_r}\{A\} = \frac12\sum_{P=1}^{2^r}(A_P\times A_P)_{a_1b_1\dots a_rb_r}.$$

Observe that any dual $A_P$ of the original tensor $A = A_1$ generates the same superenergy tensor. For this reason it would be sufficient to study properties of superenergy tensors of tensors with 1-blocks and 2-blocks only (all 3-blocks may be changed to 1-blocks by taking the corresponding dual). We also remark that $A_{[n_1]\dots[n_r]}$ could contain 4-blocks, for whose dual 0-blocks the expression (10) has no meaning. However, a 4-form in four dimensions is trivial and contributes to the superenergy as a scalar factor. This factor is precisely its dual 0-form (squared). From now on, we assume that we have no such trivial 4-blocks.

We note that $T_{a_1b_1\dots a_rb_r}\{A\} = T_{(a_1b_1)\dots(a_rb_r)}\{A\}$ and if $A_{[n_1]\dots[n_r]}$ is symmetric with respect to two blocks $[n_i]$ and $[n_j]$, then $T_{a_1b_1\dots a_rb_r}$ is symmetric with respect to the pairs $a_ib_i$ and $a_jb_j$.

The $(-1)$-factors in (10), which are not used in the paper by Senovilla [S], are due to different sign conventions. We need a minus sign on the superenergy of tensors with an odd number of 2-blocks (all terms $(A_P\times A_P)_{a_1b_1\dots a_rb_r}$ have the same number of 2-blocks), as we have in (7).

3. Spinor Presentation in Some Simple Cases

3.1. Spinors and tensor duals. We now introduce spinors and show how they simplify the description of the superenergy tensors. We begin by recalling some basic facts from [PR1] about the spinor presentation of $n$-forms and their duals, and then give the spinor form of the superenergy in some basic cases. The completely general case is treated in Sect. 4.

We use the notation and sign conventions of Penrose and Rindler [PR1], so $A, B, \dots, A', B', \dots$ are spinor indices and pairs of them are identified with tensor indices according to $AA' = a$ in the standard way. The anti-symmetric symbol $\epsilon_{AB}$ is used to raise ($\epsilon^{AB}\beta_B = \beta^A$) and lower ($\epsilon_{AB}\alpha^A = \alpha_B$) spinor indices, and it is related to the metric by $g_{ab} = \epsilon_{AB}\bar\epsilon_{A'B'}$.

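Let $\mathcal A$ be any set of indices (covariant or contravariant) and assume that $F_{ab\mathcal A} = -F_{ba\mathcal A}$ is real. It has the spinor representation

$$F_{ab\mathcal A} = \tfrac12\left(F_{AC'B}{}^{C'}{}_{\mathcal A}\,\bar\epsilon_{A'B'} + \bar F_{CA'}{}^{C}{}_{B'\mathcal A}\,\epsilon_{AB}\right),$$

where $F_{AC'B}{}^{C'} = F_{(AB)C'}{}^{C'}$, and its dual with respect to $ab$ is ${}^*F_{ab} = \frac12 e_{ab}{}^{cd}F_{cd}$, where $e_{abcd}$ is the completely anti-symmetric tensor satisfying $e_{abcd}e^{abcd} = -24$. The dual has the spinor form

$${}^*F_{ab\mathcal A} = -\frac{i}{2}\left(F_{AC'B}{}^{C'}{}_{\mathcal A}\,\bar\epsilon_{A'B'} - \bar F_{CA'}{}^{C}{}_{B'\mathcal A}\,\epsilon_{AB}\right). \tag{11}$$

Similarly, for a tensor $J_{a\mathcal A}$ with no anti-symmetry in $a$, the dual with respect to $a$ has the spinor form

$${}^*J_{abc\mathcal A} = e_{abc}{}^{d}J_{d\mathcal A} = i\left(\epsilon_{AC}\bar\epsilon_{B'C'}J_{BA'\mathcal A} - \epsilon_{BC}\bar\epsilon_{A'C'}J_{AB'\mathcal A}\right), \tag{12}$$

and for a tensor $K_{abc\mathcal A} = K_{[abc]\mathcal A}$ we have

$${}^*K_{a\mathcal A} = \tfrac16 e_a{}^{bcd}K_{bcd\mathcal A} = \frac{i}{3}K_{AC'CA'}{}^{CC'}{}_{\mathcal A}.$$

If $\mathcal A$ is empty, $J_{a\mathcal A} = J_a$, $F_{ab\mathcal A} = F_{ab}$ and $K_{abc\mathcal A} = K_{abc}$ are just normal 1-forms, 2-forms and 3-forms respectively, and ${}^*J_{abc}$, ${}^*F_{ab}$ and ${}^*K_a$ their duals.

3.2. Simple and double 2-forms. We now study the spinor expressions of the superenergies of a 2-form $F_{ab} = -F_{ba}$ and a tensor $L_{abcd} = L_{[ab][cd]}$, which we call a double 2-form. The definitions of the superenergies $T_{ab}\{F\}$ and $T_{abcd}\{L\}$ are

$$2T_{ab}\{F\} = -\left(F_a{}^cF_{bc} + {}^*F_a{}^c\;{}^*F_{bc}\right),$$

$$2T_{abcd}\{L\} = L_a{}^e{}_c{}^f\,L_{bedf} + {}^*L_a{}^e{}_c{}^f\;{}^*L_{bedf} + L^*{}_a{}^e{}_c{}^f\,L^*{}_{bedf} + {}^*L^*{}_a{}^e{}_c{}^f\;{}^*L^*{}_{bedf}.$$

We begin by studying an expression $H_a{}^c{}_{\mathcal A}H_{bc\mathcal B} + {}^*H_a{}^c{}_{\mathcal A}\,{}^*H_{bc\mathcal B}$, where $\mathcal A$ and $\mathcal B$ are some sets of indices and $H_{ab\mathcal A} = -H_{ba\mathcal A}$. Substituting the duals by (11) we get

$$\begin{aligned}
-H_a{}^c{}_{\mathcal A}H_{bc\mathcal B} - {}^*H_a{}^c{}_{\mathcal A}\,{}^*H_{bc\mathcal B}
&= -\tfrac14\left(H_{AD'}{}^{CD'}{}_{\mathcal A}\,\bar\epsilon_{A'}{}^{C'} + \bar H_{DA'}{}^{DC'}{}_{\mathcal A}\,\epsilon_A{}^C\right)\left(H_{BD'C}{}^{D'}{}_{\mathcal B}\,\bar\epsilon_{B'C'} + \bar H_{DB'}{}^{D}{}_{C'\mathcal B}\,\epsilon_{BC}\right)\\
&\quad + \tfrac14\left(H_{AD'}{}^{CD'}{}_{\mathcal A}\,\bar\epsilon_{A'}{}^{C'} - \bar H_{DA'}{}^{DC'}{}_{\mathcal A}\,\epsilon_A{}^C\right)\left(H_{BD'C}{}^{D'}{}_{\mathcal B}\,\bar\epsilon_{B'C'} - \bar H_{DB'}{}^{D}{}_{C'\mathcal B}\,\epsilon_{BC}\right)\\
&= -\tfrac12\left(H_{AD'}{}^{CD'}{}_{\mathcal A}\,\bar H_{DB'}{}^{D}{}_{C'\mathcal B}\,\bar\epsilon_{A'}{}^{C'}\epsilon_{BC} + \bar H_{DA'}{}^{DC'}{}_{\mathcal A}\,H_{BD'C}{}^{D'}{}_{\mathcal B}\,\epsilon_A{}^C\bar\epsilon_{B'C'}\right)\\
&= \tfrac12\left(H_{AC'B}{}^{C'}{}_{\mathcal A}\,H_{CA'}{}^{C}{}_{B'\mathcal B} + H_{AC'B}{}^{C'}{}_{\mathcal B}\,H_{CA'}{}^{C}{}_{B'\mathcal A}\right).
\end{aligned} \tag{13}$$

For a simple 2-form $F_{ab}$ the two last terms in (13) are equal as $\mathcal A$ and $\mathcal B$ are empty. By (13) we get

$$2T_{ab}\{F\} = F_{AC'B}{}^{C'}\,F_{CA'}{}^{C}{}_{B'} = 4\phi_{AB}\bar\phi_{A'B'}, \tag{14}$$

where $\phi_{AB} = \phi_{(AB)} = \frac12F_{AC'B}{}^{C'}$. Hence $T_{ab}\{F\}$ is a symmetric 2-spinor multiplied by its complex conjugate.

For $L_{abcd}$, with $\mathcal A = {}_c{}^f$, $\mathcal B = {}_{df}$, $\mathcal C = {}_{AE'B}{}^{E'}$ and $\mathcal D = {}_{EA'}{}^{E}{}_{B'}$, using (13) we find, by firstly adding the first and second terms and the third and fourth terms, and secondly the first and third terms and the second and fourth terms,

$$\begin{aligned}
4T_{abcd}\{L\} &= L_{AE'B}{}^{E'}{}_{\mathcal A}\,L_{EA'}{}^{E}{}_{B'\mathcal B} + L_{AE'B}{}^{E'}{}_{\mathcal B}\,L_{EA'}{}^{E}{}_{B'\mathcal A} + L^*{}_{AE'B}{}^{E'}{}_{\mathcal A}\,L^*{}_{EA'}{}^{E}{}_{B'\mathcal B} + L^*{}_{AE'B}{}^{E'}{}_{\mathcal B}\,L^*{}_{EA'}{}^{E}{}_{B'\mathcal A}\\
&= L_{\mathcal C c}{}^{f}L_{\mathcal D df} + L_{\mathcal D c}{}^{f}L_{\mathcal C df} + L^*{}_{\mathcal C c}{}^{f}L^*{}_{\mathcal D df} + L^*{}_{\mathcal D c}{}^{f}L^*{}_{\mathcal C df}\\
&= \tfrac12\left(L_{\mathcal C\,CF'D}{}^{F'}L_{\mathcal D\,FC'}{}^{F}{}_{D'} + L_{\mathcal D\,CF'D}{}^{F'}L_{\mathcal C\,FC'}{}^{F}{}_{D'} + L_{\mathcal D\,CF'D}{}^{F'}L_{\mathcal C\,FC'}{}^{F}{}_{D'} + L_{\mathcal C\,CF'D}{}^{F'}L_{\mathcal D\,FC'}{}^{F}{}_{D'}\right)\\
&= L_{AE'B}{}^{E'}{}_{CF'D}{}^{F'}\,L_{EA'}{}^{E}{}_{B'FC'}{}^{F}{}_{D'} + L_{AE'B}{}^{E'}{}_{FC'}{}^{F}{}_{D'}\,L_{EA'}{}^{E}{}_{B'CF'D}{}^{F'}\\
&= 16\left(\chi_{ABCD}\bar\chi_{A'B'C'D'} + \phi_{ABC'D'}\bar\phi_{A'B'CD}\right),
\end{aligned}$$

where

$$\chi_{ABCD} = \tfrac14L_{AE'B}{}^{E'}{}_{CF'D}{}^{F'},\qquad \phi_{ABC'D'} = \tfrac14L_{AE'B}{}^{E'}{}_{FC'}{}^{F}{}_{D'}.$$

If $u^a = U^A\bar U^{A'}$, $v^a = V^A\bar V^{A'}$, $w^a = W^A\bar W^{A'}$ and $z^a = Z^A\bar Z^{A'}$ are future-pointing null vectors expressed in terms of the spinors $U^A$, $V^A$, $W^A$ and $Z^A$, then by (14)

$$T_{ab}\{F\}u^av^b = 2|\phi_{AB}U^AV^B|^2 \ge 0$$

and

$$T_{abcd}\{L\}u^av^bw^cz^d = 4\left(|\chi_{ABCD}U^AV^BW^CZ^D|^2 + |\phi_{ABC'D'}U^AV^B\bar W^{C'}\bar Z^{D'}|^2\right) \ge 0.$$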
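As a numerical illustration of the 2-form case, the following sketch works in an orthonormal frame with signature $(+,-,-,-)$ and checks two things for randomly generated data: that the superenergy $-\frac12(F_a{}^cF_{bc} + {}^*F_a{}^c\,{}^*F_{bc})$ agrees with the first form of the Maxwell tensor given before (7) with $k = 1$, and that its contraction with future-pointing vectors is non-negative. All function names are illustrative, the overall sign chosen for the Levi-Civita symbol is an assumption that drops out because the dual enters quadratically, and a random test is of course no substitute for the spinor proof.

```python
import random

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal metric, signature (+,-,-,-)


def levi_civita(a, b, c, d):
    """Sign of the permutation (a, b, c, d) of (0, 1, 2, 3); 0 if an index repeats."""
    idx = (a, b, c, d)
    if len(set(idx)) < 4:
        return 0
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if idx[i] > idx[j]:
                sign = -sign
    return sign


def dual(F):
    """*F_ab = (1/2) e_ab^{cd} F_cd for covariant components F[a][b]."""
    F_up = [[ETA[c] * ETA[d] * F[c][d] for d in range(4)] for c in range(4)]
    return [[0.5 * sum(levi_civita(a, b, c, d) * F_up[c][d]
                       for c in range(4) for d in range(4))
             for b in range(4)] for a in range(4)]


def contract(X, Y, a, b):
    """X_a^c Y_bc, with the index c raised by the diagonal metric."""
    return sum(ETA[c] * X[a][c] * Y[b][c] for c in range(4))


def superenergy_2form(F):
    """T_ab{F} = -(1/2)(F_a^c F_bc + *F_a^c *F_bc)."""
    D = dual(F)
    return [[-0.5 * (contract(F, F, a, b) + contract(D, D, a, b))
             for b in range(4)] for a in range(4)]


def maxwell_direct(F):
    """(1/4) g_ab F_cd F^cd - F_a^c F_bc, i.e. the Maxwell tensor with k = 1."""
    invariant = sum(ETA[c] * ETA[d] * F[c][d] ** 2
                    for c in range(4) for d in range(4))
    return [[0.25 * (ETA[a] if a == b else 0.0) * invariant - contract(F, F, a, b)
             for b in range(4)] for a in range(4)]


def evaluate(T, u, v):
    """T_ab u^a v^b for contravariant components u, v."""
    return sum(T[a][b] * u[a] * v[b] for a in range(4) for b in range(4))


def random_two_form(rng):
    F = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a + 1, 4):
            F[a][b] = rng.uniform(-1.0, 1.0)
            F[b][a] = -F[a][b]
    return F


def random_future_pointing(rng, null=False):
    x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
    r = (x * x + y * y + z * z) ** 0.5
    t = r if null else r + rng.uniform(0.0, 1.0)
    return (t, x, y, z)


rng = random.Random(1)
F = random_two_form(rng)
u = random_future_pointing(rng, null=True)
v = random_future_pointing(rng)
print(evaluate(superenergy_2form(F), u, v) >= -1e-12)
```

Comparing `superenergy_2form` with `maxwell_direct` is a direct check of the step from the first Maxwell expression to (7).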

If $u^a$ or $v^a$ are future-pointing and timelike, they can be written as a sum of two future-pointing null vectors. If, for instance, both are timelike we write $u^a = u_1^a + u_2^a$ and $v^a = v_1^a + v_2^a$, where $u_j^a = U_j^A\bar U_j^{A'}$ and $v_j^a = V_j^A\bar V_j^{A'}$ ($j = 1, 2$) are future-pointing null vectors. One gets

$$T_{ab}\{F\}u^av^b = 2\left(|\phi_{AB}U_1^AV_1^B|^2 + |\phi_{AB}U_1^AV_2^B|^2 + |\phi_{AB}U_2^AV_1^B|^2 + |\phi_{AB}U_2^AV_2^B|^2\right) \ge 0.$$

If $u^a$ is null and $v^a$ timelike one obtains two positive terms. Hence, for all superenergy tensors $T_{ab}\{F\}$ of 2-forms, we have that $T_{ab}\{F\}u^av^b \ge 0$ for all future-pointing vectors $u^a$ and $v^a$. In the same way we get

$$T_{abcd}\{L\}u^av^bw^cz^d \ge 0$$

for all future-pointing vectors $u^a$, $v^a$, $w^a$ and $z^a$. Thus, the dominant superenergy property holds for both superenergies.

Special cases of double 2-forms are the Bel–Robinson tensor $T_{abcd}\{C\}$ and the Bel tensor $B_{abcd} = T_{abcd}\{R\}$. The dominant superenergy property for the Bel–Robinson tensor was shown in [B1] and for the Bel tensor in [Bo]. For the Riemann tensor, the spinor $\phi_{ABC'D'}$ is simply the Ricci spinor, while $\chi_{ABCD} = \Psi_{ABCD} + \Lambda(\epsilon_{AC}\epsilon_{BD} + \epsilon_{AD}\epsilon_{BC})$, where $\Psi_{ABCD} = \Psi_{(ABCD)}$ is the Weyl spinor and $\Lambda = R/24$, $R$ being the scalar curvature. For the Weyl tensor $\phi_{ABC'D'} = 0$, $\Lambda = 0$, so the Bel–Robinson tensor is

$$T_{abcd}\{C\} = 4\Psi_{ABCD}\bar\Psi_{A'B'C'D'}.$$

3.3. Simple and double 1-forms. Let $J_a$ be a 1-form and $A_{ab}$ a double 1-form ($A_{ab} \ne -A_{ba}$). The definitions of the superenergies are

$$\begin{aligned}
2T_{ab}\{J\} &= J_aJ_b + \tfrac12\,{}^*J_a{}^{cd}\;{}^*J_{bcd},\\
2T_{abcd}\{A\} &= A_{ac}A_{bd} + \tfrac12\,{}^*A_{aefc}\;{}^*A_b{}^{ef}{}_d + \tfrac12\,A^*{}_{acef}\,A^*{}_{bd}{}^{ef} + \tfrac14\,{}^*A^*{}_{aefcgh}\;{}^*A^*{}_b{}^{ef}{}_d{}^{gh},
\end{aligned} \tag{15}$$

where $*$ to the left or right means dual with respect to $a$ or $b$ respectively. In the same way as we did with 2-forms, we begin by studying a more general expression for a quantity $I_{a\mathcal A}$, where $\mathcal A$ is some set of indices. Using (12) for the duals, we have

$$\begin{aligned}
I_{a\mathcal A}I_{b\mathcal B} + \tfrac12\,{}^*I_a{}^{cd}{}_{\mathcal A}\;{}^*I_{bcd\mathcal B}
&= I_{a\mathcal A}I_{b\mathcal B} - \tfrac12\left(\epsilon_A{}^D\bar\epsilon^{C'D'}I^C{}_{A'\mathcal A} - \epsilon^{CD}\bar\epsilon_{A'}{}^{D'}I_A{}^{C'}{}_{\mathcal A}\right)\cdot\left(\epsilon_{BD}\bar\epsilon_{C'D'}I_{CB'\mathcal B} - \epsilon_{CD}\bar\epsilon_{B'D'}I_{BC'\mathcal B}\right)\\
&= I_{a\mathcal A}I_{b\mathcal B} - \left(\epsilon_{BA}I^C{}_{A'\mathcal A}I_{CB'\mathcal B} - I_{a\mathcal A}I_{b\mathcal B} + \bar\epsilon_{B'A'}I_A{}^{C'}{}_{\mathcal A}I_{BC'\mathcal B}\right)\\
&= I_{AB'\mathcal A}I_{BA'\mathcal B} + I_{BA'\mathcal A}I_{AB'\mathcal B}.
\end{aligned} \tag{16}$$

The last step follows from the identity $\epsilon_{AB}\epsilon_{CD} + \epsilon_{BC}\epsilon_{AD} + \epsilon_{CA}\epsilon_{BD} = 0$ [PR1], which implies

$$\epsilon_{BA}I^C{}_{A'\mathcal A}I_{CB'\mathcal B} = \epsilon_{AB}\epsilon^{CD}I_{CA'\mathcal A}I_{DB'\mathcal B} = -\left(\epsilon_B{}^C\epsilon_A{}^D + \epsilon^C{}_A\epsilon_B{}^D\right)I_{CA'\mathcal A}I_{DB'\mathcal B} = -I_{BA'\mathcal A}I_{AB'\mathcal B} + I_{a\mathcal A}I_{b\mathcal B}.$$

Hence, for a simple 1-form $J_a$ with empty $\mathcal A$ and $\mathcal B$, we have by (15) and (16) that the superenergy has the simple spinor form (compare with [PR2])

$$T_{ab}\{J\} = J_{AB'}J_{BA'}. \tag{17}$$
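The spinor form (17) can be checked numerically against the tensor form $T_{ab}\{J\} = J_aJ_b - \frac12 g_{ab}J_cJ^c$ that follows from (15). The sketch below assumes one common dictionary between vectors and $2\times2$ Hermitian matrices, $u^{AA'} = \frac{1}{\sqrt2}(u^0 I + u^i\sigma_i)$ with the Pauli matrices $\sigma_i$; this may differ from the conventions of [PR1] by the orientation of an axis, which does not affect the identity being tested. The function names are illustrative. For null $u^a = U^A\bar U^{A'}$ and $v^a = V^A\bar V^{A'}$ it compares $T_{ab}\{J\}u^av^b$ with $|J_{AB'}U^A\bar V^{B'}|^2$.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def null_vector_from_spinor(U):
    """Contravariant components u^a of the null vector u^{AA'} = U^A conj(U^{A'})."""
    U0, U1 = U
    cross = U0 * U1.conjugate()  # equals (u^1 - i u^2) / sqrt(2)
    u0 = (abs(U0) ** 2 + abs(U1) ** 2) / SQRT2
    u3 = (abs(U0) ** 2 - abs(U1) ** 2) / SQRT2
    return (u0, SQRT2 * cross.real, -SQRT2 * cross.imag, u3)


def covector_to_spinor(J):
    """Matrix J_{AA'} with J_{AA'} u^{AA'} = J_a u^a, from covariant components J_a."""
    J0, J1, J2, J3 = J
    return [[(J0 + J3) / SQRT2, complex(J1, J2) / SQRT2],
            [complex(J1, -J2) / SQRT2, (J0 - J3) / SQRT2]]


def pair(J, u):
    """J_a u^a (covariant times contravariant components)."""
    return sum(Ja * ua for Ja, ua in zip(J, u))


def dot(u, v):
    """g_ab u^a v^b with signature (+,-,-,-)."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]


def tensor_side(J, u, v):
    """(J_a J_b - (1/2) g_ab J_c J^c) u^a v^b."""
    J_norm = J[0] ** 2 - J[1] ** 2 - J[2] ** 2 - J[3] ** 2
    return pair(J, u) * pair(J, v) - 0.5 * dot(u, v) * J_norm


def spinor_side(J, U, V):
    """|J_{AB'} U^A conj(V^{B'})|^2, the contraction of (17) with null u and v."""
    Js = covector_to_spinor(J)
    s = sum(Js[A][B] * U[A] * V[B].conjugate()
            for A in range(2) for B in range(2))
    return abs(s) ** 2


def random_spinor(rng):
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))


rng = random.Random(0)
J = tuple(rng.uniform(-1, 1) for _ in range(4))
U, V = random_spinor(rng), random_spinor(rng)
u, v = null_vector_from_spinor(U), null_vector_from_spinor(V)
print(abs(tensor_side(J, u, v) - spinor_side(J, U, V)) < 1e-12)
```

Because the right-hand side is a squared modulus, agreement of the two sides makes the non-negativity of $T_{ab}\{J\}u^av^b$ for null vectors manifest, whatever the causal character of $J_a$.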

For $A_{ab}$, using (16) twice, we find in a similar way as for double 2-forms that the superenergy is

$$\begin{aligned}
T_{abcd}\{A\} &= \tfrac12\left(A_{AB'c}A_{BA'd} + A_{BA'c}A_{AB'd}\right) + \tfrac14\left(A^*{}_{AB'cef}A^*{}_{BA'd}{}^{ef} + A^*{}_{BA'cef}A^*{}_{AB'd}{}^{ef}\right)\\
&= \tfrac12\left(A_{AB'CD'}A_{BA'DC'} + A_{AB'DC'}A_{BA'CD'} + A_{BA'CD'}A_{AB'DC'} + A_{BA'DC'}A_{AB'CD'}\right)\\
&= A_{AB'CD'}A_{BA'DC'} + A_{AB'DC'}A_{BA'CD'}.
\end{aligned} \tag{18}$$

Again, if we contract with $u^a = U^A\bar U^{A'}$, $v^a = V^A\bar V^{A'}$, $w^a = W^A\bar W^{A'}$ and $z^a = Z^A\bar Z^{A'}$, we find that

$$T_{ab}\{J\}u^av^b = |J_{AB'}U^A\bar V^{B'}|^2 \ge 0$$

and

$$T_{abcd}\{A\}u^av^bw^cz^d = |A_{AB'CD'}U^A\bar V^{B'}W^C\bar Z^{D'}|^2 + |A_{AB'DC'}U^A\bar V^{B'}\bar W^{C'}Z^D|^2 \ge 0$$

and, as in Sect. 3.2, these properties are therefore also true for any future-pointing vectors $u^a$, $v^a$, $w^a$ and $z^a$.

3.4. Simple 3-forms. If $K_{abc}$ is a 3-form, the superenergy is equal to that of its dual 1-form ${}^*K_a = {}^*K_{AA'} = \frac{i}{3}K_{AC'CA'}{}^{CC'}$. Therefore, by (17), we get

$$T_{ab}\{K\} = {}^*K_{AB'}\,{}^*K_{BA'} = -\tfrac19K_{AD'CB'}{}^{CD'}K_{BC'DA'}{}^{DC'} = \tfrac19K_{AD'CB'}{}^{CD'}K_{DA'BC'}{}^{DC'},$$

from which we obtain

$$T_{ab}\{K\}u^av^b = \tfrac19|K_{AD'CB'}{}^{CD'}U^A\bar V^{B'}|^2 \ge 0,$$

so the dominant superenergy property holds again.

4. General Spinor Form and Proof of the Dominant Superenergy Property

4.1. Spinor form of the general superenergy tensor. We consider now a general (covariant) tensor $A_{c_1\dots c_m}$ as in Subsect. 2.1 and write it as an $r$-fold form $A_{c_1\dots c_m} = A_{[n_1]\dots[n_r]}$. To simplify the notation we may assume that the first $s$ blocks $[n_1], \dots, [n_s]$ are 2-blocks, the following $t$ blocks are 1-blocks and the remaining $r-s-t$ blocks are 3-blocks. As remarked in Subsect. 2.2, we could, by taking the dual with respect to any 3-block, consider 1-blocks and 2-blocks only. However, we want to obtain the spinor form of the superenergy in a direct manner for any given tensor without using tensor duals. Therefore we keep 3-blocks, but assume that any trivial 4-blocks already have been removed.

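It is clear from the examples in Sect. 3 how one must proceed to find the spinor form of the superenergy tensor $T_{a_1b_1\dots a_rb_r}\{A\}$ in the general case. The $2^r$ terms in the definition of $T_{a_1b_1\dots a_rb_r}\{A\}$ can be separated into two groups, one with terms containing $[n_k]$ (for some $k$), and the other $\overset{*}{[4-n_k]}$. For a given term in one group, there is another term in the other group equal to the first one except that one contains $[n_k]$ and the other $\overset{*}{[4-n_k]}$. Adding these two together, the sum can be rewritten with spinor indices as a sum of two other terms. For $n_k = 2$ this new sum is

$$\tfrac12\left(\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots} + \tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots}\right).$$

For $n_k = 1$ it is

$$\tilde A_{\dots A_kB'_k\dots}\,\tilde A_{\dots B_kA'_k\dots} + \tilde A_{\dots B_kA'_k\dots}\,\tilde A_{\dots A_kB'_k\dots},$$

and for $n_k = 3$,

$$\tfrac19\left(\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots}\,\tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots} + \tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots}\,\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots}\right).$$

Do this with every pair of terms to obtain $2^{r-1}$ expressions, all with such $k$-indices. Then continue with some other $[n_l]$ and $\overset{*}{[4-n_l]}$. We now get $2^{r-2}$ expressions, each with four terms, which if $n_k = n_l = 2$ are of the form

$$\begin{aligned}
\frac{1}{2^2}\Big(&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lC'_lB_l}{}^{C'_l}{}_{\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots C_lA'_l}{}^{C_l}{}_{B'_l\dots}\\
+\;&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots C_lA'_l}{}^{C_l}{}_{B'_l\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lC'_lB_l}{}^{C'_l}{}_{\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lC'_lB_l}{}^{C'_l}{}_{\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots C_lA'_l}{}^{C_l}{}_{B'_l\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots C_lA'_l}{}^{C_l}{}_{B'_l\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lC'_lB_l}{}^{C'_l}{}_{\dots}\Big).
\end{aligned}$$

For $n_k = 2$ and $n_l = 1$ (and similarly for $n_k = 1$ and $n_l = 2$) we get

$$\begin{aligned}
\frac12\Big(&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lB'_l\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots B_lA'_l\dots}\\
+\;&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots B_lA'_l\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lB'_l\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lB'_l\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots B_lA'_l\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots B_lA'_l\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lB'_l\dots}\Big),
\end{aligned}$$

while $n_k = 2$ and $n_l = 3$ (similarly for $n_k = 3$ and $n_l = 2$) gives

$$\begin{aligned}
\frac{1}{2\cdot9}\Big(&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots C_kA'_k}{}^{C_k}{}_{B'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots A_kC'_kB_k}{}^{C'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\Big).
\end{aligned}$$

When $n_k = n_l = 1$ we get

$$\begin{aligned}
&\tilde A_{\dots A_kB'_k\dots A_lB'_l\dots}\,\tilde A_{\dots B_kA'_k\dots B_lA'_l\dots} + \tilde A_{\dots A_kB'_k\dots B_lA'_l\dots}\,\tilde A_{\dots B_kA'_k\dots A_lB'_l\dots}\\
+\;&\tilde A_{\dots B_kA'_k\dots A_lB'_l\dots}\,\tilde A_{\dots A_kB'_k\dots B_lA'_l\dots} + \tilde A_{\dots B_kA'_k\dots B_lA'_l\dots}\,\tilde A_{\dots A_kB'_k\dots A_lB'_l\dots},
\end{aligned}$$

and for $n_k = 1$ and $n_l = 3$ (again similarly for $n_k = 3$ and $n_l = 1$)

$$\begin{aligned}
\frac19\Big(&\tilde A_{\dots A_kB'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots B_kA'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots A_kB'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots B_kA'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\\
+\;&\tilde A_{\dots B_kA'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots A_kB'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots B_kA'_k\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots A_kB'_k\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\Big).
\end{aligned}$$

Finally $n_k = n_l = 3$ gives the expression

$$\begin{aligned}
\frac{1}{9^2}\Big(&\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\\
+\;&\tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\,\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\\
+\;&\tilde A_{\dots D_kA'_kB_kC'_k}{}^{D_kC'_k}{}_{\dots D_lA'_lB_lC'_l}{}^{D_lC'_l}{}_{\dots}\,\tilde A_{\dots A_kD'_kC_kB'_k}{}^{C_kD'_k}{}_{\dots A_lD'_lC_lB'_l}{}^{C_lD'_l}{}_{\dots}\Big).
\end{aligned}$$

The total number of terms is $2^r$. Continuing the process with $[n_m]$ gives $2^{r-3}$ expressions of 8 terms, so the total number of terms will in each step remain at $2^r$. We write

$$\begin{aligned}
\tilde T_1 = \frac{1}{2^s9^{r-s-t}}\;&\tilde A_{A_1C'_1B_1}{}^{C'_1}{}_{\dots A_sC'_sB_s}{}^{C'_s}{}_{A_{s+1}B'_{s+1}\dots A_{s+t}B'_{s+t}\,A_{s+t+1}D'_{s+t+1}C_{s+t+1}B'_{s+t+1}}{}^{C_{s+t+1}D'_{s+t+1}}{}_{\dots A_rD'_rC_rB'_r}{}^{C_rD'_r}\\
\cdot\;&\tilde A_{C_1A'_1}{}^{C_1}{}_{B'_1\dots C_sA'_s}{}^{C_s}{}_{B'_s\,B_{s+1}A'_{s+1}\dots B_{s+t}A'_{s+t}\,D_{s+t+1}A'_{s+t+1}B_{s+t+1}C'_{s+t+1}}{}^{D_{s+t+1}C'_{s+t+1}}{}_{\dots D_rA'_rB_rC'_r}{}^{D_rC'_r},
\end{aligned}$$

and define $\tilde T_Q$ with $Q = 1 + z_12^{r-1} + z_22^{r-2} + \dots + z_r$ from $\tilde T_1$, where $z_j = 1$ if the $j$-indices in $\tilde T_Q$ have been interchanged between the two $\tilde A$:s in $\tilde T_1$ according to

$$\begin{aligned}
{}_{A_jC'_jB_j}{}^{C'_j} &\leftrightarrow {}_{C_jA'_j}{}^{C_j}{}_{B'_j}, & 1 \le j \le s,\\
{}_{A_jB'_j} &\leftrightarrow {}_{B_jA'_j}, & s+1 \le j \le s+t,\\
{}_{A_jD'_jC_jB'_j}{}^{C_jD'_j} &\leftrightarrow {}_{D_jA'_jB_jC'_j}{}^{D_jC'_j}, & s+t+1 \le j \le r,
\end{aligned}$$

and $z_j = 0$ otherwise. Continuing the above procedure through all blocks, we have proved: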

Theorem 1. The spinor form of the superenergy tensor is

$$T_{a_1b_1\dots a_rb_r}\{A\} = \frac12\sum_{Q=1}^{2^r}(\tilde T_Q)_{a_1b_1\dots a_rb_r}.$$

Remark. $\tilde T_Q = \tilde T_{2^r+1-Q}$, so the sum can be written

$$T_{a_1b_1\dots a_rb_r}\{A\} = \sum_{Q=1}^{2^{r-1}}(\tilde T_Q)_{a_1b_1\dots a_rb_r}.$$

Thus, $T_{a_1b_1\dots a_rb_r}\{A\}$ can always be written as a sum of $2^{r-1}$ terms, giving one term for simple forms and two terms for double forms as we showed in Sect. 3 above. Another important feature of the spinor form is that, as opposed to the tensor form, there are no contractions between the two $\tilde A$:s, only traces within each $\tilde A$. We also note that $T_{a_1b_1\dots a_rb_r}\{A\}$ is trace-free in any pair $a_jb_j$ obtained from a 2-block.

4.2. The dominant superenergy theorem. From the spinor form of the superenergy tensor it is easy to prove the dominant superenergy property:

Theorem 2. Let $T_{a_1b_1\dots a_rb_r}\{A\}$ be a superenergy tensor and $u_1^a, \dots, u_{2r}^a$ future-pointing vectors. Then

$$T_{a_1b_1\dots a_rb_r}\{A\}\,u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r} \ge 0.$$

Proof. Suppose first that all $u_1^a, \dots, u_{2r}^a$ are null vectors. We write each of them in terms of spinors as $u_k^a = U_k^A\bar U_k^{A'}$. For a given term $(\tilde T_Q)_{a_1b_1\dots a_rb_r}$ in Theorem 1, each $\tilde A$ in $\tilde T_Q$ is contracted with those $2r$ spinors of $U_1^A, \dots, U_{2r}^A, \bar U_1^{A'}, \dots, \bar U_{2r}^{A'}$ that have the same indices as $\tilde A$ has. The second $\tilde A$ is contracted with the remaining $2r$ spinors (the complex conjugates of the $2r$ first). We find that each $(\tilde T_Q)_{a_1b_1\dots a_rb_r}u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r}$ is a complex number multiplied by its complex conjugate, so

$$(\tilde T_Q)_{a_1b_1\dots a_rb_r}\,u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r} = \frac{1}{2^s9^{r-s-t}}\,|\tilde A_{A_1\dots}U^{A_1}\cdots|^2 \ge 0$$

and $T_{a_1b_1\dots a_rb_r}\{A\}u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r}$ is a sum of $2^{r-1}$ such non-negative terms.

If some of the vectors are timelike, we write such vectors $u_k^a = v_k^a + w_k^a$, where $v_k^a$ and $w_k^a$ are future-pointing null vectors. If $K$ is the number of vectors that are timelike, then $T_{a_1b_1\dots a_rb_r}\{A\}u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r}$ will be a sum of $2^K$ terms of the same kind, but in each term all contractions are with null vectors. Above we have already shown that each such term is non-negative so the sum must also be non-negative. This proves the theorem. □

Remark 1. In total, $T_{a_1b_1\dots a_rb_r}\{A\}u_1^{a_1}u_2^{b_1}\cdots u_{2r-1}^{a_r}u_{2r}^{b_r}$ will be a sum of $2^{K+r-1}$ squares of absolute values of complex numbers.
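The step in the proof that writes a timelike vector as a sum of two null vectors can be made concrete. The sketch below uses one explicit choice of decomposition (along the spatial direction of the vector, in an orthonormal frame); the proof only needs that some such decomposition exists, so this particular choice and the function names are illustrative assumptions. It also evaluates the count $2^{K+r-1}$ from Remark 1.

```python
import math


def minkowski(u, v):
    """g_ab u^a v^b with signature (+,-,-,-)."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]


def split_timelike(u):
    """Write a future-pointing timelike u as v + w with v, w future-pointing and null."""
    t, x, y, z = u
    r = math.sqrt(x * x + y * y + z * z)
    if not t > r:
        raise ValueError("u must be future-pointing and timelike")
    # Unit spatial direction; any direction works when the spatial part vanishes.
    n = (x / r, y / r, z / r) if r > 0 else (1.0, 0.0, 0.0)
    a = 0.5 * (t + r)  # positive
    b = 0.5 * (t - r)  # positive because t > r
    v = (a, a * n[0], a * n[1], a * n[2])
    w = (b, -b * n[0], -b * n[1], -b * n[2])
    return v, w


def number_of_squares(r, K):
    """2^(K + r - 1), the number of squared moduli mentioned in Remark 1."""
    return 2 ** (K + r - 1)


v, w = split_timelike((3.0, 1.0, 2.0, 0.0))
print(v, w, minkowski(v, v), minkowski(w, w))
print(number_of_squares(2, 4))  # double 2-form contracted with four timelike vectors
```

For the Bel–Robinson tensor ($r = 2$) contracted with four timelike vectors ($K = 4$) this count is $2^5 = 32$.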

Remark 2. The case $T_{a_1b_1\dots a_rb_r}\{A\}u^{a_1}u^{b_1}\cdots u^{a_r}u^{b_r} \ge 0$ ($2r$ contractions with the same timelike vector $u^a$), with strict inequality if $A_{c_1\dots c_m} \ne 0$, was proven with tensor methods by Senovilla [S].

5. Further Examples and Discussion

We begin by giving a few more examples of superenergy tensors. Let $u_a$ be a null vector field. Writing it as $u_a = U_A\bar U_{A'}$ and using (17) we get

$$T_{ab}\{u\} = u_{AB'}u_{BA'} = U_A\bar U_{B'}U_B\bar U_{A'} = u_au_b.$$

Similarly, if $A_{ab} = u_au_b$ where $u_a$ is null, (18) implies $T_{abcd}\{A\} = u_au_bu_cu_d$. This can be continued in the obvious manner.

What is the superenergy of the metric $g_{ab}$? The metric is a symmetric (1,1)-form so with $g_{ab} = g_{AA'BB'} = \epsilon_{AB}\bar\epsilon_{A'B'}$, (18) now implies

$$T_{abcd}\{g\} = g_{AB'CD'}g_{BA'DC'} + g_{AB'DC'}g_{BA'CD'} = \epsilon_{AC}\bar\epsilon_{B'D'}\epsilon_{BD}\bar\epsilon_{A'C'} + \epsilon_{AD}\bar\epsilon_{B'C'}\epsilon_{BC}\bar\epsilon_{A'D'} = g_{ac}g_{bd} + g_{ad}g_{bc},$$

which obviously satisfies the dominant superenergy property as $g_{ab}$ itself does.

Another symmetric (1,1)-form is the Einstein tensor $G_{ab}$. Writing [PR1] $G_{ab} = -6\Lambda g_{ab} - 2\Phi_{ab} = -6\Lambda\epsilon_{AB}\bar\epsilon_{A'B'} - 2\Phi_{AA'BB'}$ with $\Lambda$ and $\Phi_{ab}$ as in Subsect. 3.2, and again using (18), we find

$$T_{abcd}\{G\} = 36\Lambda^2T_{abcd}\{g\} + 4T_{abcd}\{\Phi\} + 12\Lambda\left(\Phi_{ACB'D'}\bar\epsilon_{A'C'}\epsilon_{BD} + \Phi_{BDA'C'}\epsilon_{AC}\bar\epsilon_{B'D'} + \Phi_{ADB'C'}\bar\epsilon_{A'D'}\epsilon_{BC} + \Phi_{BCA'D'}\epsilon_{AD}\bar\epsilon_{B'C'}\right).$$

Finally, we consider a (2,1)-form $L_{abc} = L_{[ab]c}$ which we can write $L_{abc} = \Lambda_{ABCC'}\bar\epsilon_{A'B'} + \bar\Lambda_{A'B'C'C}\epsilon_{AB}$, where $\Lambda_{ABCC'} = \Lambda_{(AB)CC'} = \frac12L_{AE'B}{}^{E'}{}_{CC'}$. The superenergy of $L_{abc}$ is

$$T_{abcd}\{L\} = 2\left(\Lambda_{ABCD'}\bar\Lambda_{A'B'C'D} + \Lambda_{ABDC'}\bar\Lambda_{A'B'D'C}\right).$$

An important example of such a (2,1)-form is the Lanczos potential of the Weyl tensor [EH]. In terms of spinors it satisfies $2\Psi_{ABCD} = \nabla_{(A}{}^{A'}\Lambda_{B)CDA'}$, where $\Psi_{ABCD}$ is the Weyl spinor. Another (2,1)-form is $\nabla_cF_{ab}$, where $F_{ab}$ is a Maxwell field. Properties of its superenergy tensor $T_{abcd}\{\nabla F\}$ are discussed in [S].

We note that not every pairwise symmetric tensor which satisfies the dominant superenergy property is the superenergy tensor of some other tensor. The metric $g_{ab}$ is such an example. As it is not trace-free it is not the superenergy of a 2-form. Neither can it be the superenergy of a 1-form $J_a$ (or a 3-form) as by (8) we would then have $g_{ab}$ proportional to $J_aJ_b$.
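The statement that $T_{abcd}\{g\} = g_{ac}g_{bd} + g_{ad}g_{bc}$ obviously has the dominant superenergy property can be spelled out: contracting with $u^a v^b w^c z^d$ gives $(u\cdot w)(v\cdot z) + (u\cdot z)(v\cdot w)$, and the inner product of two future-pointing vectors is non-negative. A minimal sketch in an orthonormal frame, with illustrative function names:

```python
import random


def dot(u, v):
    """g_ab u^a v^b with signature (+,-,-,-)."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]


def superenergy_metric(u, v, w, z):
    """T_abcd{g} u^a v^b w^c z^d = (u.w)(v.z) + (u.z)(v.w)."""
    return dot(u, w) * dot(v, z) + dot(u, z) * dot(v, w)


def random_future_pointing(rng):
    """Random future-pointing (timelike or null) vector."""
    x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
    r = (x * x + y * y + z * z) ** 0.5
    return (r + rng.uniform(0.0, 1.0), x, y, z)


e0 = (1.0, 0.0, 0.0, 0.0)
print(superenergy_metric(e0, e0, e0, e0))

rng = random.Random(3)
vectors = [random_future_pointing(rng) for _ in range(4)]
print(superenergy_metric(*vectors) >= 0.0)
```

With all four vectors equal to the unit timelike vector `e0` the contraction is $1\cdot1 + 1\cdot1 = 2$.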
As we stated in the introduction, positive quantities are useful in mathematical proofs. They can be used to get estimates in existence theorems. Christodoulou and Klainerman [CK2] used the Bel–Robinson tensor in their work on the Einstein vacuum equation. We have seen that the Bel–Robinson tensor is the superenergy of the Weyl tensor, and that it is just the Weyl spinor squared. We also see that it is $|\nabla\Lambda|^2$, where $\Lambda_{ABCC'}$ is the Lanczos spinor potential. Existence theorems for Lanczos potentials may therefore be of importance to existence results for Einstein's vacuum equation.

In [CK2] only vacuum spacetimes are studied, the non-vacuum case of course being much more complicated. Observe however that the Bel tensor, the superenergy of the Riemann tensor, also satisfies the dominant superenergy property. It is therefore a natural candidate to replace the Bel–Robinson tensor when studying the non-vacuum stability of the Minkowski spacetime.

Acknowledgements. It is a pleasure to thank José Senovilla for many comments, proposals and fruitful discussions. This work was supported financially by The Swedish Natural Science Research Council (NFR).

References

[Bel1] Bel, L.: Compt. Rend. Acad. Sci. Paris 247, 1094 (1958)
[Bel2] Bel, L.: Compt. Rend. Acad. Sci. Paris 248, 1297 (1959)
[B1] Bergqvist, G.: J. Math. Phys. 39, 2141 (1998)
[B2] Bergqvist, G.: Class. Quantum Grav. 15, 1535 (1998)
[Bo] Bonilla, M.Á.G.: Class. Quantum Grav. 15, 2001 (1998)
[BoS1] Bonilla, M.Á.G. and Senovilla, J.M.M.: Phys. Rev. Lett. 78, 783 (1997)
[BoS2] Bonilla, M.Á.G. and Senovilla, J.M.M.: Gen. Rel. Grav. 29, 91 (1997)
[CK1] Christodoulou, D. and Klainerman, S.: Comm. Pure Appl. Math. 18, 137 (1990)
[CK2] Christodoulou, D. and Klainerman, S.: The global non-linear stability of Minkowski space. Princeton, NJ: Princeton University Press, 1993
[EH] Edgar, S.B. and Höglund, A.: Proc. R. Soc. London, Ser. A 453, 835 (1997)
[HE] Hawking, S.W. and Ellis, G.F.R.: The large-scale structure of spacetime. Cambridge: Cambridge University Press, 1973
[PR1] Penrose, R. and Rindler, W.: Spinors and spacetime. Vol. 1. Cambridge: Cambridge University Press, 1984
[PR2] Penrose, R. and Rindler, W.: Spinors and spacetime. Vol. 2. Cambridge: Cambridge University Press, 1986
[S] Senovilla, J.M.M.: To appear in Gravitation and Relativity in General, eds A. Molina, J. Martín, E. Ruiz and F. Atrio, Singapore: World Scientific, 1999

Communicated by H. Nicolai

Commun. Math. Phys. 207, 481 – 493 (1999)

On Lieb's Conjecture for the Wehrl Entropy of Bloch Coherent States*

Peter Schupp**

Department of Physics, Princeton University, Princeton, NJ 08544-0708, USA

Received: 2 March 1999 / Accepted: 7 May 1999

Abstract: Lieb's conjecture for the Wehrl entropy of Bloch coherent states is proved for spin 1 and spin 3/2. Using a geometric representation we solve the entropy integrals for states of arbitrary spin and evaluate them explicitly in the cases of spin 1, 3/2, and 2. We also give a group theoretic proof for all spin of a related inequality.

1. Introduction

Wehrl proposed [16] a hybrid between quantum mechanical and classical entropy that enjoys monotonicity, strong subadditivity and positivity: physically desirable properties, some of which both kinds of entropy lack [7,8]. This new entropy is the ordinary Shannon entropy of the probability density provided by the lower symbol of the density matrix. For a quantum mechanical system with density matrix $\rho$, Hilbert space $\mathcal H$, and a family of normalized coherent states $|z\rangle$, parametrized symbolically by $z$ and satisfying $\int dz\,|z\rangle\langle z| = 1$ (resolution of identity), the Wehrl entropy is

$$S_W(\rho) = -\int dz\,\langle z|\rho|z\rangle\ln\langle z|\rho|z\rangle. \tag{1}$$

Like quantum mechanical entropy, $S_Q = -\mathrm{tr}\,\rho\ln\rho$, Wehrl entropy is always non-negative, in fact $S_W > S_Q \ge 0$. In view of this inequality it is interesting to ask for the minimum of $S_W$ and the corresponding minimizing density matrix. It follows from concavity of $-x\ln x$ that a minimizing density matrix must be a pure state, i.e., $\rho = |\psi\rangle\langle\psi|$ for a normalized vector $|\psi\rangle \in \mathcal H$ [6]. (Note that $S_W(|\psi\rangle\langle\psi|)$ depends on $|\psi\rangle$ and is non-zero, unlike the quantum entropy which is of course zero for pure states.)

* © 1999 by the author. Reproduction of this article, in its entirety, by any means is permitted for noncommercial purposes.
** Present address: Sektion Physik, Universität München, Theresienstr. 37, 80333 München, Germany. E-mail: [email protected]

For Glauber coherent states Wehrl conjectured [16] and Lieb proved [6] that the minimizing state $|\psi\rangle$ is again a coherent state. It turns out that all Glauber coherent states have Wehrl entropy one, so Wehrl's conjecture can be written as follows:

Theorem 1.1 (Lieb). The minimum of $S_W(\rho)$ for states in $\mathcal H = L^2(\mathbb R)$ is one,

$$S_W(|\psi\rangle\langle\psi|) = -\int dz\,|\langle\psi|z\rangle|^2\ln|\langle\psi|z\rangle|^2 \ge 1, \tag{2}$$

with equality if and only if $|\psi\rangle$ is a coherent state.

To prove this, Lieb used a clever combination of the sharp Hausdorff-Young inequality [2,9,11] and the sharp Young inequality [3,2,9,11] to show that

$$s\int dz\,|\langle z|\psi\rangle|^{2s} \le 1,\qquad s \ge 1, \tag{3}$$

again with equality if and only if $|\psi\rangle$ is a coherent state. Wehrl's conjecture follows from this in the limit $s \to 1$ essentially because (2) is the derivative of (3) with respect to $s$ at $s = 1$. All this easily generalizes to $L^2(\mathbb R^n)$ [6,10].

The lower bound on the Wehrl entropy is related to Heisenberg's uncertainty principle [1,4] and it has been speculated that $S_W$ can be used to measure uncertainty due to both quantum and thermal fluctuations [4].

It is very surprising that "heavy artillery" like the sharp constants in the mentioned inequalities are needed in Lieb's proof. To elucidate this situation, Lieb suggested [6] studying the analog of Wehrl's conjecture for Bloch coherent states $|\Omega\rangle$, where one should expect significant simplification since they live on finite dimensional Hilbert spaces. However, no progress has been made, not even for a single spin, even though many attempts have been made [7]. Attempts to proceed along the lines of Lieb's original proof have failed to provide a sharp inequality and the direct computation of the entropy and related integrals, even numerically, was unsuccessful [14].

The key to the recent progress is a geometric representation of a state of spin $j$ as $2j$ points on a sphere. In this representation the expression $|\langle\Omega|\psi\rangle|^2$ factorizes into a product of $2j$ functions $f_i$ on the sphere, which measure the square chordal distance from the antipode of the point parametrized by $\Omega$ to each of the $2j$ points on the sphere. Lieb's conjecture, in a generalized form analogous to (3), then looks like the quotient of two Hölder inequalities

$$\frac{\|f_1\cdots f_{2j}\|_s}{\|f_1\cdots f_{2j}\|_1} \le \frac{\prod_{i=1}^{2j}\|f_i\|_{2js}}{\prod_{i=1}^{2j}\|f_i\|_{2j}}, \tag{4}$$

with the one with the higher power winning against the other one. We shall give a group theoretic proof of this inequality for the special cases $s \in \mathbb N$ in Theorem 4.3.

In the geometric representation the Wehrl entropy of spin states finds a direct physical interpretation: It is the classical entropy of a single particle on a sphere interacting via Coulomb potential with $2j$ fixed sources; $s$ plays the role of inverse temperature.

The entropy integral (1) can now be done because $|\langle\Omega|\psi\rangle|^2$ factorizes and one finds a formula for the Wehrl entropy of any state. When we evaluate the entropy explicitly for states of spin 1, 3/2, and 2 we find surprisingly simple expressions solely in terms of the square chordal distances between the points on the sphere that define the given state.
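For orientation, the entropy integral (1) can be evaluated numerically in the simplest case, where $|\psi\rangle$ is itself a Bloch coherent state, so that all $2j$ points coincide. The sketch below rests on assumptions that are standard but not spelled out in this excerpt: the overlap of two spin-$j$ coherent states is $\cos^{4j}(\theta/2)$, with $\theta$ the angle between them, and the resolution of identity fixes the measure as $dz = \frac{2j+1}{4\pi}\,d\Omega$. The function name and the midpoint rule are illustrative choices.

```python
import math


def wehrl_entropy_coherent(two_j, steps=20000):
    """Midpoint-rule estimate of S_W for a spin-j Bloch coherent state (two_j = 2j)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        p = math.cos(theta / 2.0) ** 2  # spin-1/2 overlap |<Omega|psi>|^2
        density = p ** two_j            # product of 2j equal factors
        if density > 0.0:               # guard against underflow near theta = pi
            total += math.sin(theta) * density * math.log(density)
    # Measure (2j+1)/(4 pi) dOmega; the azimuthal integral contributes 2 pi.
    return -(two_j + 1) / 2.0 * total * h


for two_j in (1, 2, 3, 4):
    print(two_j / 2.0, wehrl_entropy_coherent(two_j), two_j / (two_j + 1))
```

Under these assumptions the substitution $p = \cos^2(\theta/2)$ reduces the integral to $-(2j+1)\,2j\int_0^1 p^{2j}\ln p\,dp = 2j/(2j+1)$, which the numerical values reproduce; this is the coherent-state value only and says nothing about other states.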


A different, more group theoretic approach seems to point to a connection between Lieb’s conjecture and the norm of certain spin $js$ states with $1 \le s \in \mathbb{R}$ [5]. So far, however, this has only been useful for proving the analog of inequality (3) for $s \in \mathbb{N}$. We find that a proof of Lieb’s conjecture for low spins can be reduced to some beautiful spherical geometry, but the unreasonable difficulty of a complete proof is still a great puzzle; its resolution may very well lead to interesting mathematics and perhaps physics.

2. Bloch Coherent Spin States

Glauber coherent states $|z\rangle = \pi^{-1/4} e^{-(x-q)^2/2} e^{ipx}$, parametrized by $z = (q + ip)/\sqrt{2}$ and with measure $dz = dp\,dq/2\pi$, are usually introduced as eigenvectors of the annihilation operator $a = (\hat{x} + i\hat{p})/\sqrt{2}$, $a|z\rangle = z|z\rangle$, but the same states can also be obtained by the action of the Heisenberg-Weyl group $H_4 = \{a^\dagger a, a^\dagger, a, I\}$ on the extremal state $|0\rangle = \pi^{-1/4} e^{-x^2/2}$. Glauber coherent states are thus elements of the coset space of the Heisenberg-Weyl group modulo the stability subgroup $U(1) \otimes U(1)$ that leaves the extremal state invariant. (See e.g. [17] and references therein.) This construction easily generalizes to other groups, in particular to SU(2), where it gives the Bloch coherent spin states [13] that we are interested in: Here the Hilbert space can be any one of the finite dimensional spin-$j$ representations $[j] \equiv \mathbb{C}^{2j+1}$ of SU(2), $j = \frac12, 1, \frac32, \dots$, and the extremal state for each $[j]$ is the highest weight vector $|j,j\rangle$. The stability subgroup is U(1) and the coherent states are thus elements of the sphere $S^2 = SU(2)/U(1)$; they can be labeled by $\Omega = (\theta,\phi)$ and are obtained from $|j,j\rangle$ by rotation:

$$ |\Omega\rangle_j = R_j(\Omega)\,|j,j\rangle. \tag{5} $$

For spin $j = \frac12$ we find

$$ |\omega\rangle = p^{1/2} e^{-i\phi/2}\,|\uparrow\rangle + (1-p)^{1/2} e^{i\phi/2}\,|\downarrow\rangle, \tag{6} $$

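with $p \equiv \cos^2\frac{\theta}{2}$. (Here and in the following $|\omega\rangle$ is short for the spin-$\frac12$ coherent state $|\Omega\rangle_{1/2}$; $\omega = \Omega = (\theta,\phi)$. $|\uparrow\rangle \equiv |\frac12,\frac12\rangle$ and $|\downarrow\rangle \equiv |\frac12,-\frac12\rangle$.) An important observation for what follows is that the product of two coherent states for the same $\Omega$ is again a coherent state:

$$ |\Omega\rangle_j \otimes |\Omega\rangle_{j'} = (R_j \otimes R_{j'})\,(|j,j\rangle \otimes |j',j'\rangle) = R_{j+j'}\,|j+j',j+j'\rangle = |\Omega\rangle_{j+j'}. \tag{7} $$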

Coherent states are in fact the only states for which the product of a spin-$j$ state with a spin-$j'$ state is a spin-$(j+j')$ state and not a more general element of $[j+j'] \oplus \dots \oplus [\,|j-j'|\,]$. From this key property an explicit representation for Bloch coherent states of higher spin can be easily derived:

$$ |\Omega\rangle_j = (|\omega\rangle)^{\otimes 2j} = \left( p^{1/2} e^{-i\phi/2}\,|\uparrow\rangle + (1-p)^{1/2} e^{i\phi/2}\,|\downarrow\rangle \right)^{\otimes 2j} = \sum_{m=-j}^{j} \binom{2j}{j+m}^{1/2} p^{\frac{j+m}{2}} (1-p)^{\frac{j-m}{2}} e^{-im\phi}\,|j,m\rangle. \tag{8} $$


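(The same expression can also be obtained directly from (5), see e.g. [12, chapter 4].) The coherent states as given are normalized, $\langle\Omega|\Omega\rangle_j = 1$, and satisfy

$$ (2j+1) \int \frac{d\Omega}{4\pi}\, |\Omega\rangle_j \langle\Omega|_j = P_j \qquad \text{(resolution of identity)}, \tag{9} $$

where $P_j = \sum |j,m\rangle\langle j,m|$ is the projector onto $[j]$. It is not hard to compute the Wehrl entropy for a coherent state $|\Omega'\rangle$: Since the integral over the sphere is invariant under rotations it is enough to consider the coherent state $|j,j\rangle$; then use $|\langle j,j|\Omega\rangle|^2 = |\langle\uparrow|\omega\rangle|^{2\cdot 2j} = p^{2j}$ and $d\Omega/4\pi = -dp\,d\phi/2\pi$, where $p = \cos^2\frac{\theta}{2}$ as above, to obtain

$$ S_W(|\Omega'\rangle\langle\Omega'|) = -(2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\Omega'\rangle|^2 \ln |\langle\Omega|\Omega'\rangle|^2 = -(2j+1) \int_0^1 dp\, p^{2j}\, 2j \ln p = \frac{2j}{2j+1}. \tag{10} $$

Similarly, for later use,

$$ (2js+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega'|\Omega\rangle|^{2s} = (2js+1) \int_0^1 dp\, p^{2js} = 1. \tag{11} $$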
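The two one-dimensional integrals in (10) and (11) are easy to spot-check numerically. The following is only a minimal sketch: the helper names (`midpoint_integral`, `coherent_state_entropy`, `coherent_state_moment`) and the choice of a plain midpoint rule are illustrative assumptions, not part of the paper.

```python
import math

def midpoint_integral(f, n=20000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def coherent_state_entropy(two_j):
    """p-integral in (10): -(2j+1) * int_0^1 p^(2j) * 2j * ln(p) dp."""
    integral = midpoint_integral(lambda p: p ** two_j * two_j * math.log(p))
    return -(two_j + 1) * integral

def coherent_state_moment(two_j, s):
    """p-integral in (11): (2js+1) * int_0^1 p^(2js) dp."""
    power = two_j * s
    return (power + 1) * midpoint_integral(lambda p: p ** power)

if __name__ == "__main__":
    for two_j in (1, 2, 3, 4):
        # Compare the quadrature value with the closed form 2j/(2j+1).
        print(two_j / 2, coherent_state_entropy(two_j), two_j / (two_j + 1),
              coherent_state_moment(two_j, 1.5))
```

The spin is passed as the integer `two_j` = 2j so that half-integer spins need no special handling.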

As before the density matrix that minimizes $S_W$ must be a pure state $|\psi\rangle\langle\psi|$. The analog of Theorem 1.1 for spin states is:

Conjecture 2.1 (Lieb). The minimum of $S_W$ for states in $\mathcal{H} = \mathbb{C}^{2j+1}$ is $2j/(2j+1)$,

$$ S_W(|\psi\rangle\langle\psi|) = -(2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^2 \ln |\langle\Omega|\psi\rangle|^2 \ge \frac{2j}{2j+1}, \tag{12} $$

with equality if and only if $|\psi\rangle$ is a coherent state.

Remark. For spin 1/2 this is an identity because all spin 1/2 states are coherent states. The first non-trivial case is spin $j = 1$.

3. Proof of Lieb’s Conjecture for Low Spin

In this section we shall geometrize the description of spin states, use this to solve the entropy integrals for all spin and prove Lieb’s conjecture for low spin by actual computation of the entropy.

Lemma 3.1. States of spin $j$ are in one to one correspondence to $2j$ points on a sphere: With $2j$ points, parametrized by $\omega_k = (\theta_k,\phi_k)$, $k = 1,\dots,2j$, we can associate a state

$$ |\psi\rangle = c^{1/2}\, P_j\,(|\omega_1\rangle \otimes \dots \otimes |\omega_{2j}\rangle) \in [j], \tag{13} $$

and every state $|\psi\rangle \in [j]$ is of that form. (The spin-$\frac12$ states $|\omega_k\rangle$ are given by (6), $c^{1/2} \ne 0$ fixes the normalization of $|\psi\rangle$, and $P_j$ is the projector onto spin $j$.)

Remark. Some or all of the points may coincide. Coherent states are exactly those states for which all points on the sphere coincide. $c^{1/2} \in \mathbb{C}$ may contain an (unimportant) phase that we can safely ignore in the following. This representation is unique up to permutation of the $|\omega_k\rangle$. The $\omega_k$ may be found by looking at $\langle\Omega|\psi\rangle$ as a function of $\Omega = (\theta,\phi)$: they are the antipodal points to the zeroes of this function.


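Proof. Rewrite (8) in complex coordinates for $\theta \ne 0$,

$$ z = \left(\frac{p}{1-p}\right)^{1/2} e^{i\phi} = \cot\frac{\theta}{2}\, e^{i\phi} \tag{14} $$

(stereographic projection) and contract it with $|\psi\rangle$ to find

$$ \langle\Omega|\psi\rangle = \frac{e^{-ij\phi}}{(1+z\bar{z})^{j}} \sum_{m=-j}^{j_{\max}} \binom{2j}{j+m}^{1/2} z^{j+m}\, \psi_m, \tag{15} $$

where $j_{\max}$ is the largest value of $m$ for which $\psi_m$ in the expansion $|\psi\rangle = \sum \psi_m |m\rangle$ is nonzero. The sum is a polynomial of degree $j + j_{\max}$ in $z \in \mathbb{C}$ and can thus be factorized:

$$ \langle\Omega|\psi\rangle = \frac{e^{-ij\phi}\, \psi_{j_{\max}}}{(1+z\bar{z})^{j}} \prod_{k=1}^{j+j_{\max}} (z - z_k). \tag{16} $$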

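Consider now the spin $\frac12$ states $|\omega_k\rangle = (1 + z_k\bar{z}_k)^{-1/2}\,(|\uparrow\rangle - z_k|\downarrow\rangle)$ for $1 \le k \le j + j_{\max}$ and $|\omega_m\rangle = |\downarrow\rangle$ for $j + j_{\max} < m \le 2j$. According to (15):

$$ \langle\omega|\omega_k\rangle = \frac{e^{-i\phi/2}}{(1+z\bar{z})^{1/2}\,(1+z_k\bar{z}_k)^{1/2}}\,(z - z_k), \qquad \langle\omega|\omega_m\rangle = \frac{e^{-i\phi/2}}{(1+z\bar{z})^{1/2}}, \tag{17} $$

so by comparison with (16) and with an appropriate constant $c$,

$$ \langle\Omega|\psi\rangle = c^{1/2}\, \langle\omega|\omega_1\rangle \cdots \langle\omega|\omega_{2j}\rangle = c^{1/2}\, \langle\Omega|\omega_1 \otimes \dots \otimes \omega_{2j}\rangle. \tag{18} $$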

By inspection we see that this expression is still valid when $\theta = 0$ and with the help of (9) we can complete the proof of the lemma. □

We see that the geometric representation of spin states leads to a factorization of $|\langle\Omega|\psi\rangle|^2$. In this representation we can now do the entropy integrals, essentially because the logarithm becomes a simple sum.

Theorem 3.2. Consider any state $|\psi\rangle$ of spin $j$. According to Lemma 3.1, it can be written as $|\psi\rangle = c^{1/2} P_j(|\omega_1\rangle \otimes \dots \otimes |\omega_{2j}\rangle)$. Let $R_i$ be the rotation that turns $\omega_i$ to the “north pole”, $R_i|\omega_i\rangle = |\uparrow\rangle$, let $|\psi^{(i)}\rangle = R_i|\psi\rangle$, and let $\psi_m^{(i)}$ be the coefficient of $|j,m\rangle$ in the expansion of $|\psi^{(i)}\rangle$, then the Wehrl entropy is:

$$ S_W(|\psi\rangle\langle\psi|) = \sum_{i=1}^{2j} \sum_{m=-j}^{j} \left( \sum_{n=0}^{j-m} \frac{1}{2j+1-n} \right) |\psi_m^{(i)}|^2 - \ln c. \tag{19} $$

Remark. This formula reduces the computation of the Wehrl entropy of any spin state to its factorization in the sense of Lemma 3.1, which in general requires the solution of a $2j$-th order algebraic equation. This may explain why previous attempts to do the entropy integrals have failed. The $n = 0$ terms in the expression for the entropy sum up to $2j/(2j+1)$, the entropy of a coherent state, and Lieb’s conjecture can thus be written

$$ \ln c \le \sum_{i=1}^{2j} \sum_{m=-j+1}^{j-1} \left( \sum_{n=1}^{j-m} \frac{1}{2j+1-n} \right) |\psi_m^{(i)}|^2. \tag{20} $$

Note that $\psi_{-j}^{(i)} = 0$ by construction of $|\psi^{(i)}\rangle$: $\psi_{-j}^{(i)}$ contains a factor $\langle\downarrow|\uparrow\rangle$. A similar calculation gives

$$ \ln c = 2j + \int \frac{d\Omega}{4\pi}\, \ln |\langle\Omega|\psi\rangle|^2. \tag{21} $$

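Proof. Using Lemma 3.1, (9), the rotational invariance of the measure and the inverse Fourier transform in $\phi$ we find

$$ \begin{aligned} S_W(|\psi\rangle\langle\psi|) &= -(2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^2 \sum_{i=1}^{2j} \ln |\langle\omega|\omega_i\rangle|^2 - \ln c \\ &= -(2j+1) \sum_{i=1}^{2j} \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi^{(i)}\rangle|^2 \ln |\langle\omega|\uparrow\rangle|^2 - \ln c \\ &= -(2j+1) \sum_{i=1}^{2j} \sum_{m=-j}^{j} |\psi_m^{(i)}|^2 \binom{2j}{j+m} \int_0^1 dp\, p^{j+m} (1-p)^{j-m} \ln p - \ln c. \end{aligned} \tag{22} $$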

It is now easy to do the remaining $p$-integral by partial integration to prove the theorem. □

Lieb’s conjecture for low spin can be proved with the help of formula (19). For spin 1/2 there is nothing to prove, since all states of spin 1/2 are coherent states. The first nontrivial case is spin 1:

Corollary 3.3 (spin 1). Consider an arbitrary state of spin 1. Let $\mu$ be the square of the chordal distance between the two points on a sphere of radius $\frac12$ that represent this state. Its Wehrl entropy is given by

$$ S_W(\mu) = \frac{2}{3} + c \cdot \left( \frac{\mu}{2} + \frac{1}{c} \ln \frac{1}{c} \right), \tag{23} $$

with

$$ \frac{1}{c} = 1 - \frac{\mu}{2}. \tag{24} $$

Lieb’s conjecture holds for all states of spin 1: $S_W(\mu) \ge 2/3 = 2j/(2j+1)$ with equality for $\mu = 0$, i.e. for coherent states.
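Formula (19) is a finite weighted sum, so the spin 1 result can be checked mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, spins and magnetic quantum numbers are passed doubled (`two_j`, `two_m`) to keep them integral, and the squared components $c(1-\mu)$ and $c\mu/2$ are those read off in the proof of Corollary 3.3.

```python
import math
from fractions import Fraction

def harmonic_weight(two_j, two_m):
    """Inner sum of (19): sum_{n=0}^{j-m} 1/(2j+1-n), as an exact fraction."""
    steps = (two_j - two_m) // 2          # j - m
    return sum(Fraction(1, two_j + 1 - n) for n in range(steps + 1))

def entropy_from_components(two_j, components, c):
    """Formula (19). components[i] maps 2m -> |psi_m^(i)|^2 for the i-th rotated state."""
    total = 0.0
    for comp in components:
        for two_m, weight in comp.items():
            total += float(harmonic_weight(two_j, two_m)) * weight
    return total - math.log(c)

def spin_one_entropy_via_19(mu):
    """Spin 1: both rotated states have |psi_1|^2 = c(1-mu) and |psi_0|^2 = c*mu/2."""
    c = 1.0 / (1.0 - mu / 2.0)
    comp = {2: c * (1.0 - mu), 0: c * mu / 2.0}
    return entropy_from_components(2, [comp, comp], c)

def spin_one_entropy_closed_form(mu):
    """Closed form (23) with c from (24)."""
    c = 1.0 / (1.0 - mu / 2.0)
    return 2.0 / 3.0 + c * (mu / 2.0 + (1.0 / c) * math.log(1.0 / c))

if __name__ == "__main__":
    for mu in (0.0, 0.3, 0.7, 1.0):
        print(mu, spin_one_entropy_via_19(mu), spin_one_entropy_closed_form(mu))
```

Both functions should agree for every $0 \le \mu \le 1$, and both reduce to 2/3 at $\mu = 0$.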


Proof. Because of rotational invariance we can assume without loss of generality that the first point is at the “north pole” of the sphere and that the second point is parametrized as $\omega_2 = (\tilde\theta, \tilde\phi = 0)$, so that $\mu = \sin^2\frac{\tilde\theta}{2}$ is the square of the chordal distance between the two points. Up to normalization (and an irrelevant phase)

$$ |\tilde\psi\rangle = P_{j=1}\, |\uparrow \otimes\, \tilde\omega\rangle \tag{25} $$

is the state of interest. But from (6),

$$ |\uparrow \otimes\, \tilde\omega\rangle = (1-\mu)^{1/2}\, |\uparrow\uparrow\rangle + \mu^{1/2}\, |\uparrow\downarrow\rangle. \tag{26} $$

Projecting onto spin 1 and inserting the normalization constant $c^{1/2}$ we find

$$ |\psi\rangle = c^{1/2} \left( (1-\mu)^{1/2}\, |1,1\rangle + \mu^{1/2}\, \frac{1}{\sqrt{2}}\, |1,0\rangle \right). \tag{27} $$

This gives (ignoring a possible phase)

$$ 1 = \langle\psi|\psi\rangle = c \left( 1 - \mu + \frac{\mu}{2} \right) = c \left( 1 - \frac{\mu}{2} \right) \tag{28} $$

and so $1/c = 1 - \mu/2$. Now we need to compute the components of $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$. Note that $|\psi^{(1)}\rangle = |\psi\rangle$ because $\omega_1$ is already pointing to the “north pole”. To obtain $|\psi^{(2)}\rangle$ we need to rotate point 2 to the “north pole”. We can use the remaining rotational freedom to effectively exchange the two points, thereby recovering the original state $|\psi\rangle$. The components of both $|\psi^{(1)}\rangle$ and $|\psi^{(2)}\rangle$ can thus be read off (27):

$$ \psi_1^{(1)} = \psi_1^{(2)} = c^{1/2} (1-\mu)^{1/2}, \qquad \psi_0^{(1)} = \psi_0^{(2)} = c^{1/2} \mu^{1/2} / \sqrt{2}. \tag{29} $$

Inserting now $c$, $|\psi_1^{(1)}|^2 = |\psi_1^{(2)}|^2 = c(1-\mu)$, and $|\psi_0^{(1)}|^2 = |\psi_0^{(2)}|^2 = c\mu/2$ into (19) gives the stated entropy. To prove Lieb’s conjecture for states of spin 1 we use (28) to show that the second term in (23) is always non-negative and zero only for $\mu = 0$, i.e. for a coherent state. This follows from

$$ \frac{c\mu}{2} - \ln c \ge \frac{c\mu}{2} + 1 - c = 0 \tag{30} $$

with equality for $c = 1$ which is equivalent to $\mu = 0$. □

Corollary 3.4 (spin 3/2). Consider an arbitrary state of spin 3/2. Let $\epsilon$, $\mu$, $\nu$ be the squares of the chordal distances between the three points on a sphere of radius $\frac12$ that represent this state (see Fig. 1). Its Wehrl entropy is given by

$$ S_W(\epsilon,\mu,\nu) = \frac{3}{4} + c \cdot \left( \frac{\epsilon+\mu+\nu}{3} - \frac{\epsilon\mu + \epsilon\nu + \mu\nu}{6} + \frac{1}{c} \ln \frac{1}{c} \right) \tag{31} $$

with

$$ \frac{1}{c} = 1 - \frac{\epsilon+\mu+\nu}{3}. \tag{32} $$

Lieb’s conjecture holds for all states of spin 3/2: $S_W(\epsilon,\mu,\nu) \ge 3/4 = 2j/(2j+1)$ with equality for $\epsilon = \mu = \nu = 0$, i.e. for coherent states.
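The closed form (31) can be compared against a brute-force evaluation of the entropy integral. The sketch below is an illustration only: it assumes that the factorization (18) can be written with unit vectors as $|\langle\Omega|\psi\rangle|^2 = c \prod_k (1 + \mathbf{n}\cdot\mathbf{n}_k)/2$, it uses a plain midpoint grid in $(\theta,\phi)$, and all function names and the three sample points are hypothetical choices.

```python
import math

def bloch_vector(theta, phi):
    """Unit vector for the point (theta, phi) on the sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def squared_chord(a, b):
    """Squared chordal distance between directions a and b on a sphere of radius 1/2."""
    return 0.5 * (1.0 - sum(x * y for x, y in zip(a, b)))

def entropy_by_quadrature(points, n_theta=200, n_phi=200):
    """Wehrl entropy and normalization c of the spin state defined by 2j points."""
    dim = len(points) + 1                  # 2j + 1
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    i0 = 0.0                               # integral of f over dOmega/4pi
    i1 = 0.0                               # integral of f ln f over dOmega/4pi
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        weight = math.sin(theta) * d_theta * d_phi / (4.0 * math.pi)
        for b in range(n_phi):
            n = bloch_vector(theta, (b + 0.5) * d_phi)
            f = 1.0
            for pt in points:
                f *= 1.0 - squared_chord(n, pt)   # equals (1 + n.n_k)/2
            i0 += weight * f
            if f > 0.0:
                i1 += weight * f * math.log(f)
    c = 1.0 / (dim * i0)                   # normalization: (2j+1) * c * i0 = 1
    entropy = -math.log(c) - dim * c * i1  # -(2j+1) * integral of (c f) ln(c f)
    return entropy, c

def entropy_spin_three_halves(eps, mu, nu):
    """Closed form (31) with c from (32)."""
    s1 = eps + mu + nu
    s2 = eps * mu + eps * nu + mu * nu
    c = 1.0 / (1.0 - s1 / 3.0)
    return 0.75 + c * (s1 / 3.0 - s2 / 6.0) - math.log(c), c

if __name__ == "__main__":
    p1, p2, p3 = bloch_vector(0.9, 0.3), bloch_vector(2.1, 1.7), bloch_vector(1.4, 4.0)
    eps, mu, nu = squared_chord(p2, p3), squared_chord(p1, p3), squared_chord(p1, p2)
    print(entropy_by_quadrature([p1, p2, p3]))
    print(entropy_spin_three_halves(eps, mu, nu))
```

The same quadrature routine accepts any number of points, so it can also be pointed at the spin 1 and spin 2 cases.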


Fig. 1. Spin 3/2 (points 1, 2, 3 on the sphere, with squared chordal distances $\epsilon$, $\mu$, $\nu$ between them)

Proof. The proof is similar to the spin 1 case, but the geometry and algebra are more involved. Consider a sphere of radius $\frac12$, with points 1, 2, 3 on its surface, and two planes through its center; the first plane containing points 1 and 3, the second plane containing points 2 and 3. The intersection angle $\phi$ of these two planes satisfies

$$ 2 \cos\phi\, \sqrt{\epsilon\mu(1-\epsilon)(1-\mu)} = \epsilon + \mu - \nu - 2\epsilon\mu. \tag{33} $$

$\phi$ is the azimuthal angle of point 2, if point 3 is at the “north pole” of the sphere and point 1 is assigned zero azimuthal angle.

The states $|\psi^{(1)}\rangle$, $|\psi^{(2)}\rangle$, and $|\psi^{(3)}\rangle$ all have one point at the north pole of the sphere. It is enough to compute the values of $|\psi_m^{(i)}|^2$ for one $i$; the other values can be found by appropriate permutation of $\epsilon$, $\mu$, $\nu$. (Note that we make no restriction on the parameters $0 \le \epsilon, \mu, \nu \le 1$ other than that they are square chordal distances between three points on a sphere of radius $\frac12$.) We shall start with $i = 3$: Without loss of generality the three points can be parametrized as $\omega_1^{(3)} = (\tilde\theta, 0)$, $\omega_2^{(3)} = (\theta, \phi)$, and $\omega_3^{(3)} = (0, 0)$ with $\mu = \sin^2\frac{\tilde\theta}{2}$ and $\epsilon = \sin^2\frac{\theta}{2}$. Corresponding spin-$\frac12$ states are

$$ |\omega_1^{(3)}\rangle = (1-\mu)^{1/2}\, |\uparrow\rangle + \mu^{1/2}\, |\downarrow\rangle, \tag{34} $$

$$ |\omega_2^{(3)}\rangle = (1-\epsilon)^{1/2} e^{-i\phi/2}\, |\uparrow\rangle + \epsilon^{1/2} e^{i\phi/2}\, |\downarrow\rangle, \tag{35} $$

$$ |\omega_3^{(3)}\rangle = |\uparrow\rangle, \tag{36} $$

and up to normalization, the state of interest is

$$ \begin{aligned} |\tilde\psi^{(3)}\rangle &= P_{j=3/2}\, |\omega_1^{(3)} \otimes \omega_2^{(3)} \otimes \omega_3^{(3)}\rangle \\ &= (1-\epsilon)^{1/2} (1-\mu)^{1/2} e^{-i\phi/2}\, |\tfrac32,\tfrac32\rangle \\ &\quad + \left( (1-\mu)^{1/2} \epsilon^{1/2} e^{i\phi/2} + \mu^{1/2} (1-\epsilon)^{1/2} e^{-i\phi/2} \right) \frac{1}{\sqrt{3}}\, |\tfrac32,\tfrac12\rangle \\ &\quad + \mu^{1/2} \epsilon^{1/2} e^{i\phi/2}\, \frac{1}{\sqrt{3}}\, |\tfrac32,-\tfrac12\rangle. \end{aligned} \tag{37} $$


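This gives

$$ |\tilde\psi_{3/2}^{(3)}|^2 = (1-\epsilon)(1-\mu), \tag{38} $$

$$ |\tilde\psi_{1/2}^{(3)}|^2 = \frac{1}{3} \left( \epsilon(1-\mu) + \mu(1-\epsilon) + 2\sqrt{\epsilon\mu(1-\mu)(1-\epsilon)}\, \cos\phi \right) = \frac{2}{3}\,\epsilon(1-\mu) + \frac{2}{3}\,\mu(1-\epsilon) - \frac{\nu}{3}, \tag{39} $$

$$ |\tilde\psi_{-1/2}^{(3)}|^2 = \frac{\epsilon\mu}{3}, \tag{40} $$

and $|\tilde\psi_{-3/2}^{(3)}|^2 = 0$. The sum of these expressions is

$$ \frac{1}{c} = \langle\tilde\psi|\tilde\psi\rangle = 1 - \frac{\epsilon+\mu+\nu}{3}, \tag{41} $$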

with $0 < 1/c \le 1$. The case $i = 1$ is found by exchanging $\epsilon \leftrightarrow \nu$ (and also $3 \leftrightarrow 1$, $\phi \leftrightarrow -\phi$). The case $i = 2$ is found by permuting $\epsilon \to \mu \to \nu \to \epsilon$ (and also $1 \to 3 \to 2 \to 1$). Using (19) then gives the stated entropy.

To complete the proof of Lieb’s conjecture for all states of spin 3/2 we need to show that the second term in (31) is always non-negative and zero only for $\epsilon = \mu = \nu = 0$. From the inequality $(1-x)\ln(1-x) \ge -x + x^2/2$ for $0 \le x < 1$, we find

$$ \frac{1}{c} \ln \frac{1}{c} \ge -\frac{\epsilon+\mu+\nu}{3} + \frac{1}{2} \left( \frac{\epsilon+\mu+\nu}{3} \right)^2, \tag{42} $$

with equality for $c = 1$. Using the inequality between algebraic and geometric mean it is not hard to see that

$$ \left( \frac{\epsilon+\mu+\nu}{3} \right)^2 \ge \frac{\epsilon\mu + \epsilon\nu + \mu\nu}{3} \tag{43} $$

with equality for $\epsilon = \mu = \nu$. Putting everything together and inserting it into (31) we have, as desired, $S_W \ge 3/4$ with equality for $\epsilon = \mu = \nu = 0$, i.e. for coherent states. □

Corollary 3.5 (spin 2). Consider an arbitrary state of spin 2. Let $\epsilon$, $\mu$, $\nu$, $\alpha$, $\beta$, $\gamma$ be the squares of the chordal distances between the four points on a sphere of radius $\frac12$ that represent this state (see Fig. 2). Its Wehrl entropy is given by

$$ S_W(\epsilon,\mu,\nu,\alpha,\beta,\gamma) = \frac{4}{5} + c \cdot \left( \sigma + \frac{1}{c} \ln \frac{1}{c} \right), \tag{44} $$

where

and

1X j 1 X j 1 + =1− c 4 12

(45)

X 1X j 5X j X j 1 j − − +3 − σ = 12 2 3

(46)


with

X X

X

j≡ αµν + βν + µγ + αβγ ,

j≡ α + βµ + γ ν,

X

j≡ α + β + γ + µ + ν + ,

(47) (48)

j≡ αµ + αν + µν + β + βν + ν + γ + µ + µγ + αβ + αγ + βγ . (49) 3

Fig. 2. Spin 2 (points 1, 2, 3, 4 on the sphere, with squared chordal distances $\epsilon$, $\mu$, $\nu$, $\alpha$, $\beta$, $\gamma$ between them)

Remark. The fact that the four points lie on the surface of a sphere imposes a complicated constraint on the parameters $\epsilon$, $\mu$, $\nu$, $\alpha$, $\beta$, $\gamma$. Although we have convincing numerical evidence for Lieb’s conjecture for spin 2, so far a rigorous proof has been limited to certain symmetric configurations like equilateral triangles with centered fourth point ($\epsilon = \mu = \nu$ and $\alpha = \beta = \gamma$), and squares ($\alpha = \beta = \epsilon = \mu$ and $\gamma = \nu$). It is not hard to find values of the parameters that give values of $S_W$ below the entropy for coherent states, but they do not correspond to any configuration of points on the sphere, so in contrast to spin 1 and spin 3/2 the constraint is now important. $S_W$ is concave in each of the parameters $\epsilon$, $\mu$, $\nu$, $\alpha$, $\beta$, $\gamma$.

Proof. The proof is analogous to the spin 1 and spin 3/2 cases but the geometry and algebra are considerably more complicated, so we will just give a sketch. Pick four points on the sphere, without loss of generality parametrized as $\omega_1^{(3)} = (\tilde\theta, 0)$, $\omega_2^{(3)} = (\theta, \phi)$, $\omega_3^{(3)} = (0, 0)$, and $\omega_4^{(3)} = (\bar\theta, \bar\phi)$. Corresponding spin $\frac12$ states are $|\omega_1^{(3)}\rangle$, $|\omega_2^{(3)}\rangle$, $|\omega_3^{(3)}\rangle$, as given in (34), (35), (36), and

$$ |\omega_4^{(3)}\rangle = (1-\gamma)^{1/2} e^{-i\bar\phi/2}\, |\uparrow\rangle + \gamma^{1/2} e^{i\bar\phi/2}\, |\downarrow\rangle. \tag{50} $$

Up to normalization, the state of interest is

$$ |\tilde\psi^{(3)}\rangle = P_{j=2}\, |\omega_1^{(3)} \otimes \omega_2^{(3)} \otimes \omega_3^{(3)} \otimes \omega_4^{(3)}\rangle. \tag{51} $$

In the computation of $|\tilde\psi_m^{(3)}|^2$ we encounter again the angle $\phi$, compare (33), and two new angles $\bar\phi$ and $\bar\phi - \phi$. Luckily both can again be expressed as angles between planes that intersect the circle’s center and we have

$$ 2 \cos\bar\phi\, \sqrt{\mu\gamma(1-\mu)(1-\gamma)} = \mu + \gamma - \alpha - 2\mu\gamma, \tag{52} $$

$$ 2 \cos(\bar\phi - \phi)\, \sqrt{\epsilon\gamma(1-\epsilon)(1-\gamma)} = \gamma + \epsilon - \beta - 2\epsilon\gamma, \tag{53} $$

and find $1/c = \sum_m |\tilde\psi_m^{(3)}|^2$ as given in (45). By permuting the parameters $\epsilon$, $\mu$, $\nu$, $\alpha$, $\beta$, $\gamma$ appropriately we can derive expressions for the remaining $|\tilde\psi_m^{(i)}|^2$’s and then compute $S_W$ (44) with the help of (19). □

4. Higher Spin

The construction outlined in the proof of Corollary 3.5 can in principle also be applied to states of higher spin, but the expressions pretty quickly become quite unwieldy. It is, however, possible to use Theorem 3.2 to show that the entropy is extremal for coherent states:

Corollary 4.1 (spin $j$). Consider the state of spin $j$ characterized by $2j - 1$ coinciding points on the sphere and a $2j$-th point, a small (chordal) distance $\epsilon^{1/2}$ away from them. The Wehrl entropy of this small deviation from a coherent state, up to third order in $\epsilon$, is

$$ S_W(\epsilon) = \frac{2j}{2j+1} + \frac{c}{8j^2}\, \epsilon^2 + O[\epsilon^4], \tag{54} $$

with

$$ \frac{1}{c} = 1 - \frac{2j-1}{2j}\, \epsilon \qquad \text{(exact)}. \tag{55} $$

A generalized version of Lieb’s conjecture, analogous to (3), is [6]

Conjecture 4.2. Let $|\psi\rangle$ be a normalized state of spin $j$, then

$$ (2js+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^{2s} \le 1, \qquad s > 1, \tag{56} $$

with equality if and only if $|\psi\rangle$ is a coherent state.

Remark. This conjecture is equivalent to the “quotient of two Hölder inequalities” (4). The original Conjecture 2.1 follows from it in the limit $s \to 1$. For $s = 1$ we simply get the norm of the spin $j$ state $|\psi\rangle$,

$$ (2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle_j|^2 = |P_j|\psi\rangle|^2, \tag{57} $$

where $P_j$ is the projector onto spin $j$. We have numerical evidence for low spin that an analog of Conjecture 4.2 holds in fact for a much larger class of convex functions than $x^s$ or $x \ln x$. For $s \in \mathbb{N}$ there is a surprisingly simple group theoretic argument based on (57):

Theorem 4.3. Conjecture 4.2 holds for all $s \in \mathbb{N}$.

Remark. For spin 1 and spin 3/2 (at $s = 2$) this was first shown by Wolfgang Spitzer by direct computation of the integral.


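Proof. Let us consider $s = 2$, $|\psi\rangle \in [j]$ with $\|\,|\psi\rangle\|^2 = 1$, rewrite (56) as follows and use (57):

$$ (2j \cdot 2 + 1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^{2\cdot 2} = (2(2j) + 1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega \otimes \Omega|\psi \otimes \psi\rangle|^2 = |P_{2j}|\psi \otimes \psi\rangle|^2. \tag{58} $$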

But $|\psi\rangle \otimes |\psi\rangle \in [j] \otimes [j] = [2j] \oplus [2j-1] \oplus \dots \oplus [0]$, so $|P_{2j}|\psi \otimes \psi\rangle|^2 < \|\,|\psi \otimes \psi\rangle\|^2 = 1$ unless $|\psi\rangle$ is a coherent state, in which case $|\psi\rangle \otimes |\psi\rangle \in [2j]$ and we have equality. The proof for all other $s \in \mathbb{N}$ is completely analogous. □

It seems that there should also be a similar group theoretic proof for all real, positive $s$ related to (infinite dimensional) spin $js$ representations of su(2) (more precisely: sl(2)). There has been some progress and it is now clear that there will not be an argument as simple as the one given above [5]. Coherent states of the form discussed in [15] (for the hydrogen atom) could be of importance here, since they easily generalize to non-integer “spin”.

Theorem 4.3 provides a quick, crude, lower limit on the entropy:

Corollary 4.4. For states of spin $j$,

$$ S_W(|\psi\rangle\langle\psi|) \ge \ln \frac{4j+1}{2j+1} > 0. \tag{59} $$
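The $s = 2$ case of Theorem 4.3, which drives the bound (59), can be probed numerically from the expansion (8). The sketch below is an assumption-laden illustration: the names `coherent_overlap` and `moment` are hypothetical, a state is passed as a list `psi` with `psi[k]` the coefficient of $|j, m = k - j\rangle$, and the integral over the sphere is approximated by a midpoint grid in $(p, \phi)$ using $d\Omega/4\pi = dp\,d\phi/2\pi$.

```python
import cmath
import math

def coherent_overlap(psi, p, phi):
    """<Omega|psi> from the expansion (8); psi[k] is the coefficient of |j, m = k - j>."""
    two_j = len(psi) - 1
    j = two_j / 2.0
    total = 0.0 + 0.0j
    for k, coeff in enumerate(psi):
        m = k - j
        amplitude = (math.sqrt(math.comb(two_j, k))
                     * p ** (k / 2.0) * (1.0 - p) ** ((two_j - k) / 2.0))
        # The bra <Omega| carries the complex conjugate phase e^{+i m phi}.
        total += amplitude * cmath.exp(1j * m * phi) * coeff
    return total

def moment(psi, s, n_p=400, n_phi=64):
    """Left-hand side of (56): (2js+1) * integral dOmega/4pi |<Omega|psi>|^(2s)."""
    two_j = len(psi) - 1
    total = 0.0
    for a in range(n_p):
        p = (a + 0.5) / n_p
        for b in range(n_phi):
            phi = (b + 0.5) * 2.0 * math.pi / n_phi
            total += abs(coherent_overlap(psi, p, phi)) ** (2 * s)
    return (two_j * s + 1) * total / (n_p * n_phi)

if __name__ == "__main__":
    print(moment([0.0, 0.0, 1.0], 2))   # coherent state |1,1>: equality in (56)
    print(moment([0.0, 1.0, 0.0], 2))   # |1,0>: strictly below 1
    print(moment([0.6, 0.0, 0.8], 2))   # a generic normalized spin 1 state
```

For the spin 1 state $|1,0\rangle$ the $s = 2$ moment is $5\int_0^1 4p^2(1-p)^2\,dp = 2/3$, which is also the squared overlap of $|1,0\rangle \otimes |1,0\rangle$ with $|2,0\rangle$, in line with (58).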

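Proof. This follows from Jensen’s inequality and concavity of $\ln x$:

$$ \begin{aligned} S_W(|\psi\rangle\langle\psi|) &= -(2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^2 \ln |\langle\Omega|\psi\rangle|^2 \\ &\ge -\ln \left( (2j+1) \int \frac{d\Omega}{4\pi}\, |\langle\Omega|\psi\rangle|^{2\cdot 2} \right) \\ &\ge -\ln \frac{2j+1}{4j+1}. \end{aligned} \tag{60} $$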

In the last step we have used Theorem 4.3. □

We hope to have provided enough evidence to convince the reader that it is reasonable to expect that Lieb’s conjecture is indeed true for all spin. All cases listed in Lieb’s original article, 1/2, 1, 3/2, are now settled – it would be nice if someone could take care of the remaining “dot, dot, dot” . . .

Acknowledgements. I would like to thank Elliott Lieb for many discussions, constant support, encouragement, and for reading the manuscript. Much of the early work was done in collaboration with Wolfgang Spitzer. Theorem 4.3 for spin 1 and spin 3/2 at $s = 2$ is due to him and his input was crucial in eliminating many other plausible approaches. I would like to thank him for many discussions and excellent team work. I would like to thank Branislav Jurčo for joint work on the group theoretic aspects of the problem and stimulating discussions about coherent states. It is a pleasure to thank Rafael Benguria, Almut Burchard, Dirk Hundertmark, Larry Thomas, and Pavel Winternitz for many valuable discussions. Financial support by the Max Kade Foundation is gratefully acknowledged.


References

1. Anderson, A., Halliwell, J.J.: Information-theoretic measure of uncertainty due to quantum and thermal fluctuations. Phys. Rev. D 48, 2753–2765 (1993)
2. Beckner, W.: Inequalities in Fourier analysis. Ann. of Math. 102, 159–182 (1975)
3. Brascamp, H.J., Lieb, E.H.: Best constants in Young’s inequality, its converse, and its generalization. Adv. in Math. 20, 151–173 (1976)
4. Grabowski, M.: Wehrl-Lieb’s inequality for entropy and the uncertainty relation. Rep. Math. Phys. 20, 153–155 (1984)
5. Joint work with B. Jurčo
6. Lieb, E.H.: Proof of an entropy conjecture of Wehrl. Commun. Math. Phys. 62, 35–41 (1978)
7. Lieb, E.H.: Coherent states as a tool for obtaining rigorous bounds. In: Feng, D.H., Klauder, J. (eds.) Coherent states: Past, present, and future. Proceedings, Oak Ridge, Singapore: World Scientific, 1994, pp. 267–278
8. Lieb, E.H.: Some convexity and subadditivity properties of entropy. Bull. Am. Math. Soc. 81, 1–13 (1975)
9. Lieb, E.H.: Gaussian kernels have only Gaussian maximizers. Invent. Math. 102, 179–208 (1990)
10. Lieb, E.H.: Integral bounds for radar ambiguity functions and Wigner distributions. J. Math. Phys. 31, 594–599 (1990)
11. Lieb, E.H., Loss, M.: Analysis. Graduate Studies in Mathematics, 14, Providence, R.I.: American Mathematical Society, 1997
12. Peřina, J., Hradil, Z., Jurčo, B.: Quantum optics and fundamentals of physics. Fundamental Theories of Physics, 63, Dordrecht: Kluwer Academic Publishers Group, 1994
13. Radcliffe, J.M.: Some properties of coherent spin states. J. Phys. A 4, 313–323 (1971)
14. Joint work with W. Spitzer
15. Thomas, L.E., Villegas-Blas, C.: Asymptotic of Rydberg states for the hydrogen atom. Commun. Math. Phys. 187, 623–645 (1997)
16. Wehrl, A.: On the relation between classical and quantum mechanical entropy. Rept. Math. Phys. 16, 353–358 (1979)
17. Zhang, W.-M., Feng, D.H., Gilmore, R.: Coherent states: Theory and some applications. Rev. Mod. Phys. 62, 867–927 (1990)

Communicated by D. Brydges

Commun. Math. Phys. 207, 495 – 497 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Erratum

Enriques Surfaces, Analytic Discriminants, and Borcherds’s Φ Function

Jay Jorgenson1, Andrey Todorov2

1 Department of Mathematics, City College of New York/CUNY, Convent Avenue at 138th, New York, NY 10031, USA. E-mail: [email protected]
2 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected]

Received: 13 May 1999 / Accepted: 9 June 1999

Commun. Math. Phys. 191, 249–264 (1998)

It was pointed out to us by Yoshikawa that we need to change the definition of the degree 2 Enriques discriminant $f_{Enr,2}$ in “Enriques Surfaces, Analytic Discriminants, and Borcherds’s Φ Function” so that Theorem 8 holds. Here we will correct the definition of the degree 2 Enriques discriminant $f_{Enr,2}$. We will recall the basic definitions that we used in [3].

Definition 1. We will define an Enriques surface $Y$ to be $X/\rho$, where $X$ is a K3 surface and $\rho$ is an involution acting on $X$ without fixed points.

Definition 2. We will define $\Lambda_{K3} \cong H^3 \oplus (-E_8)^2$ with $H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

Definition 3. We will define the Enriques involution $\rho(z_1 \oplus z_2 \oplus z_3 \oplus x \oplus y) = (-z_1 \oplus z_3 \oplus z_2 \oplus y \oplus x)$.

Notation 4. Let $\Lambda^+_{K3}$ and $\Lambda^-_{K3}$ be the $\rho$-invariant and $\rho$-anti-invariant sublattices.

Remark 5. The unimodular lattice $\frac12 \Lambda^+_{K3}$ is isometric to the Enriques lattice $\Lambda_{Enr}$.

Definition 6. We define the space $\Omega_{Enr} = \mathbb{P}(\Lambda^-_{K3} \otimes \mathbb{C}) \cap \Omega_{K3}$, where $\Omega_{K3}$ is the period domain for marked K3 surfaces, i.e. $\Omega_{K3} := SO_0(3,19)/SO(2) \times SO(1,19)$. It is easy to see that $\Omega_{Enr} := SO_0(2,10)/SO(2) \times SO(10)$.


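Definition 7. We define $\Gamma_{Enr} = \mathrm{rest}_{\Lambda^-_{K3}} \{ g \in \mathrm{Aut}(\Lambda_{K3}) \mid g \circ \rho = \rho \circ g \}$.

Definition 8. We defined $\tilde\Gamma_{Enr}$ as a subgroup of finite index in $\Gamma_{Enr}$ which preserves the so called H-marking of the Enriques surfaces.

Let $\Delta^+ := \{ l \in \Lambda^-_{K3} \mid \langle l, l\rangle = -2 \}$ and $\Delta^- := \{ \delta \in \Lambda_{K3} \mid \langle\delta,\delta\rangle = -2 \text{ and } \delta^\rho \ne \delta \}$. It is shown on p. 283 of [1] that no point of the hyperplane $H_l = \{ p \in \Omega_{Enr} \mid \langle p, l\rangle = 0 \text{ and } l \in \Delta^+ \}$ can be the period of a marked Enriques surface. Namikawa showed in [5] the following result. Let $\delta \in \Delta^-$, then the points of $H_\delta = \{ p \in \Omega_{Enr} \mid \langle p, \delta - \delta^\rho\rangle = 0 \}$ correspond to Enriques surfaces with double rational points. We defined in [3] the moduli space $\mathcal{M}^2_{Enr,H}$ as $\tilde\Gamma_{Enr} \backslash \Omega_{Enr}$.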

Definition 9. We will define two divisors D+ := e 0Enr \ ∪l Hl (for all l ∈ 1− ) and 0Enr \ ∪δ Hδ in M2Enr,H , (for all δ ∈ 1− ). D− := e Remark 10. The results of Borcherds in [2], imply that both divisors D+ and D− are irreducible in M2Enr,H . We proved [4] that if the Baily–Borel compactification of the moduli space of K3 surfaces with Picard lattice which is either equal or contains a fix primitive sublattice S ⊂ 3K3 , contains a unique cusp of dimension zero, then the regularized determinant det S (10 ) of the of the Laplacian of the Calabi–Yau metric corresponding to the polarization class e, can be represented on the symmetric space h2,20−rkS = SO(2, 20 − rkS)/SO(2) × SO(20 − rkS) as follows:

det e (10 ) = kωnor k2 |ηS |2 , where ηSN is an automorphic form with respect to the arithmetic group 0S := {g ∈ Aut(3K3 )|g(e) = e, g(S) = S and g preserves the spinor norm}, and ωnor is normalized as having period one on certain invariant vanishing cycle. See [4]. Moreover the zeroes of the automorphic form ηSN are exactly supported on ∪l Hδ =: {τ ∈ h2,19 | hτ, δi = 0&δ ∈ S ⊥ } for all δ ∈ 3K3 such that hδ, Si = 0 and hδ, δi = −2. We will apply the above result from [4] in the case of moduli of algebraic K3 surfaces + whose Picard group is equals to 3+ K3 or it contains the lattice 3K3 . In [3] we defined the degree 2 Enriques discriminant fEnr,2 as the restriction of the automorphic form ηSN on Enr for a polarization class e ∈ 3K3 such that he, ei = 4. So it is easy to see that the above stated theorem from [3] implies that fEnr,2 vanishes on + − fEnr,2 , D+ and on D− so fEnr,2 = fEnr,2 + − vanishes on D+ and fEnr,2 vanishes on D− . where fEnr,2 + the degree 2 Enriques discriminant. Definition 11. We will call fEnr,2 + . This change In Theorem 8 of our paper [3] we need to replace fEnr,2 with fEnr,2 does not affect the proof of Theorem 8. + − and fEnr,2 . This It is not difficult to see that the mirror symmetry interchange fEnr,2 will be done in a future paper.

Acknowledgement. We want to thank Yoshikawa for his comments.


References

1. Barth, W., Peters, C. and van de Ven: Compact Complex Surfaces. Ergebnisse der Math. 4, New York: Springer Verlag, 1984
2. Borcherds, R.: The Moduli of Enriques Surfaces and the Fake Monster Lie Superalgebra. Topology 35, 699–710 (1996)
3. Jorgenson, J. and Todorov, A.: Enriques Surfaces, Analytic Discriminants, and Borcherds’s Φ Function. Commun. Math. Phys. 191, 249–264 (1998)
4. Jorgenson, J. and Todorov, A.: An Analytic Discriminant for Polarized K3 Surfaces. AMS/IP Studies in Advanced Mathematics, Vol. 10, 1998, p. 211–261
5. Namikawa, Y.: Periods of Enriques Surfaces. Math. Ann. 270, 201–222 (1985)

Communicated by A. Jaffe

Commun. Math. Phys. 207, 499 – 555 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Harmonic Analysis on the Quantum Lorentz Group

E. Buffenoir1, Ph. Roche2,*

1 Laboratoire de Physique Mathématique et Théorique (Laboratoire du CNRS ESA 5032), Université Montpellier 2, Place Eugène Bataillon, 34000 Montpellier, France. E-mail: [email protected]
2 TH Division CERN, 1211 Geneva 23, Switzerland. E-mail: [email protected]

Received: 11 November 1997 / Accepted: 7 February 1999

Abstract: Using a continuation of 6j symbols of $SU_q(2)$ with complex spins, we give a new description of the unitary representations of $SL_q(2,\mathbb{C})_{\mathbb{R}}$ and find explicit expressions for the characters of unitary representations of $SL_q(2,\mathbb{C})_{\mathbb{R}}$. We prove a Plancherel theorem for the Quantum Lorentz Group.

1. Introduction

Our interest in the subject of harmonic analysis on $SL_q(2,\mathbb{C})_{\mathbb{R}}$ stems from our desire to apply the program of combinatorial quantization of Chern–Simons theory [17,1,9] to the case where the quantum group is a quantization of a non-compact group. By $SL_q(2,\mathbb{C})_{\mathbb{R}}$ we mean a quantization, for $q$ real, of the group $SL(2,\mathbb{C})$ considered as a real Lie group. We have kept in mind that there is a close relationship between quantum gravity in 2 + 1 dimensions and Chern–Simons theory with a non-compact group of the type $SL(2,\mathbb{C})_{\mathbb{R}}$, $SL(2,\mathbb{R}) \times SL(2,\mathbb{R})$, $ISO(2,1)$ (depending on the sign of the cosmological constant) as first shown in [42] (which is a 2 + 1 analogue of Ashtekar variables [4]). Any result in the project of quantization of Chern–Simons theory for these groups has spin-offs on the program of canonical quantization of 2 + 1 quantum gravity (for a review on 2 + 1 quantum gravity, see [11]). Moreover the Chern–Simons topological invariants associated to $SL(2,\mathbb{C})_{\mathbb{R}}$ appear to be essential in the description of states in Ashtekar’s program of canonical quantization of 3 + 1 dimensional gravity. It will be clear in this paper that the “non-compactness” of $SL(2,\mathbb{C})_{\mathbb{R}}$ is at the origin of many properties which cannot be so easily translated from the compact case.

There are various ways to investigate the problem of quantization of Chern–Simons theory with a non-compact group, two main contributions being the work of J. E. Nelson and T. Regge on the direct quantization of the algebra of observables [32] and the work of E. Witten on the geometric quantization of the space of flat connections for a complex group [43] (see also [25] in the case of $SL(2,\mathbb{R})$). If we could apply the program of combinatorial quantization, and particularly the results of [2], in the case where the group is non-compact, we would be able to go much further than the approach of Nelson and Regge and obtain a Hilbert space of states, and a unitary representation of the algebra of observables on the space of states for any punctured surface using only representation theory of the associated quantum group. Unfortunately for the simplicity and fortunately for the richness of the theory, there are numerous problems to handle in order to use the combinatorial quantization program in these cases. All the difficulties are connected to the fact that harmonic analysis on quantization of non-compact groups is completely different from the compact case (as in the classical case) and in the quantum case it is still not very much developed.

Harmonic analysis on compact quantum groups has been developed in the context of the compact matrix pseudo group by Woronowicz [44]. Its culminating result is the existence of a left and right invariant integral and an analogue of the Peter-Weyl theorem. We can say that in this case, functional analysis is restricted to very little and the theory is almost completely algebraic. The case of $SU_q(1,1)$ is now completely understood: classification of unitary representations of $SU_q(1,1)$ [29], proof of Plancherel Theorem [24,27]. Harmonic analysis on quantization of non-compact groups has made a lot of progress after the work of Woronowicz who recognized the central role of the theory of multipliers and affiliated elements of a non unital $C^*$-algebra [46].

* On leave from CPT Ecole Polytechnique (Laboratoire Propre du CNRS UPR 14), 91128 Palaiseau Cedex, France
In their very important work [31], Podles and Woronowicz gave three main results: the construction of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ as a quantum double of $U_q(su(2))$, an explicit Iwasawa decomposition that allows the construction of different classes of functions on $SL_q(2,\mathbb{C})_{\mathbb{R}}$ and the proof of the existence of a unimodular Haar measure. Their study is not restricted to the case of $SL_q(2,\mathbb{C})_{\mathbb{R}}$ and concerns all quantum doubles of any quantization of compact groups.

The aim of our work is twofold: the first goal is to obtain a complete description of the characters of $SL_q(2,\mathbb{C})_{\mathbb{R}}$ and prove a Plancherel formula. The second goal is to mix different points of view on quantum group theory: the $C^*$-algebra approach developed by Woronowicz and his collaborators, and the R-matrix approach developed by the Russian school. We have tried to think of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ as being the double of $U_q(su(2))$ and to forget about commutation relations; this has the advantage that all the results can be recast in terms of generalized 6j symbols and that the proofs are mainly graphical. We not only gain in clarity but obtain results like the description of characters and the Plancherel theorem, which would have been hard to obtain outside this formulation. It will be clear to the reader that our constructions and proofs can be generalized, with some work, to the case of the quantization of any complex group.

We have only analysed the case where $q$ is real. Although the study of $SL_q(2,\mathbb{C})_{\mathbb{R}}$, when $q$ is any complex number, is of great physical relevance (for quantum Liouville theory and for 2 + 1 quantum gravity with a gravitational Chern–Simons term [43]), it appears that the general case requires quasi-Hopf algebras. Note that in our work we have abusively called $SL_q(2,\mathbb{C})_{\mathbb{R}}$ the quantum Lorentz group although classically $SL(2,\mathbb{C})_{\mathbb{R}}$ is the simply connected covering of $SO(3,1)$. In [31] a quantization of $SO(3,1)$ is proposed, which is the quantum analogue of the relation $SO(3,1) = SL(2,\mathbb{C})_{\mathbb{R}}/\mathbb{Z}_2$.

Harmonic Analysis on the Quantum Lorentz Group


2. The Quantum Lorentz Group

2.1. Quantum deformation of the enveloping algebra of the Lorentz group. We begin by recalling basic notions on Hopf algebras and real forms of Hopf algebras. Let $A, B$ be two Hopf algebras, and $\langle\cdot,\cdot\rangle : A\otimes B\to\mathbb{C}$ a pairing of Hopf algebras on $A, B$. We define an action (resp. an antiaction) of $A$ on $B$, called $\overset{R}{\rhd}$ (resp. $\overset{L}{\rhd}$), by:

$$a\overset{R}{\rhd}b=\sum_{(b)}b_{(1)}\,\langle a,b_{(2)}\rangle,\qquad a\overset{L}{\rhd}b=\sum_{(b)}\langle a,b_{(1)}\rangle\,b_{(2)},\qquad\forall a\in A,\ b\in B.\tag{1}$$

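It is easy to see that these operations satisfy:

$$(aa')\overset{R}{\rhd}b=a\overset{R}{\rhd}(a'\overset{R}{\rhd}b),\qquad(aa')\overset{L}{\rhd}b=a'\overset{L}{\rhd}(a\overset{L}{\rhd}b),\qquad\forall a,a'\in A,\ \forall b\in B,\tag{2}$$

$$a\overset{R}{\rhd}(bc)=\sum_{(a)}(a_1\overset{R}{\rhd}b)\,(a_2\overset{R}{\rhd}c),\qquad a\overset{L}{\rhd}(bc)=\sum_{(a)}(a_1\overset{L}{\rhd}b)\,(a_2\overset{L}{\rhd}c),\qquad\forall a\in A,\ \forall b,c\in B.\tag{3}$$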

An action (or antiaction) satisfying the last property will be called a derivation. We can define two other actions, called adjoint actions, of $A$ on $B$ as follows:

$$a\overset{ad^{\pm}}{\rhd}b=\sum_{(a),(b)}\langle S^{\pm1}(a_1),b_1\rangle\,b_2\,\langle a_2,b_3\rangle,\qquad\forall a\in A,\ \forall b\in B.\tag{4}$$

Note that, unless the coproduct on $A$ is cocommutative, these adjoint actions are not derivations. We will make constant use of the following property: let $b$ be an element of $B$,

$$\bigl(a\overset{ad^{+}}{\rhd}b=\epsilon(a)\,b,\ \forall a\in A\bigr)\iff\bigl(\forall a\in A,\ a\overset{R}{\rhd}b=S^{2}(a)\overset{L}{\rhd}b\bigr),\tag{5}$$

which is straightforward to verify. In this case, $b$ is called $ad^{+}$ invariant. Let $A$ and $B$ be two Hopf algebras and $F$ an element of $B\otimes A$, satisfying

$$(\Delta_B\otimes id)(F)=F_{23}F_{13},\qquad(id\otimes\Delta_A)(F)=F_{12}F_{13}.\tag{6}$$

The twisted Hopf algebra denoted $A\otimes_F B$ is the Hopf algebra equal to $A\otimes B$ as an algebra and with coproduct defined by:

$$\Delta(a\otimes b)=F_{23}\,\Delta^{A}_{13}(a)\,\Delta^{B}_{24}(b)\,F_{23}^{-1}.\tag{7}$$

It then follows that $F$ satisfies

$$(id\otimes S_A)(F)=(S_B^{-1}\otimes id)(F)=F^{-1},\tag{8}$$

the antipode on $A\otimes_F B$ being given by the formula:

$$S^{-1}(a\otimes b)=F_{21}^{-1}\,\bigl(S_A^{-1}(a)\otimes S_B^{-1}(b)\bigr)\,F_{21}.\tag{9}$$

The quantum double of a Hopf algebra is a particular case of this construction. Let A be a Hopf algebra of finite dimension, let ei be any choice of basis of A and let us denote by ei the dual basis. We define C to be the Hopf algebra C = A F Aop , P P where F 1 = i ei ei . It is easy to show that F = i ei SA1 ei satisfies (6). The quantum double is by definition the quasitriangular Hopf algebra D(A) = C with R


E. Buffenoir, Ph. Roche

P

matrix R = i ei 1 1 ei . As a coalgebra it is given by D(A) = algebra law being given by:

(a ):(b ) =

X (b);( )

ab2 2 1 ; S 1 b1 h3 ; b3 i ;

A Aop , the (10)

^ Aop to remind the where a; b 2 A ; ; 2 A . We will use the notation D(A) = A

reader that inside D(A), (A 1) and (1 Aop ) do not commute but satisfy the braided equalities (10). If (A; R) is a quasitriangular Hopf algebra we will denote R(+) = R; R( ) = R211 . If A is a ribbon Hopf algebra, we will denote, as usual, by v the ribbon element and by = uv 1 the associated group-like element. If is a finite dimensional representation of a ribbon Hopf algebra A, acting on V , the matrix elements of are elements of A and the linear forms trV ((:)) (resp. trV (( 1 )(:)) ) are respectively invariant elements of A under the action . (resp. . ). Recall that if A is a quasitriangular Hopf algebra, it is called factorizable if the linear map: ad+

ad

A ! A 7! ( id)(RR0 ) is an isomorphism of vector space. When A is a factorizable Hopf algebra, then the theorem of factorization of [35] applies and we have an isomorphism of Hopf algebra between the quantum double D(A) and A R 1 A:

: D(A) ! A R 1 A; X a 7! a1 R^ (+) (1 ) a2 R^ ( ) (2 );

(11)

(a);( )

^ () : Aop ! A are the algebra homomorphisms defined where a; 2 A Aop and R ( ) by R ( ) = ( id)(R ). It is then easy to compute the image of the universal R D matrix of D(A) under this isomorphism: ()

( )(R D ) = R14

( )

() () (+) R24 R13 R23 :

(12)

Let $A$ be a Hopf algebra over the complex field. A real form of $A$ is equivalent to the definition of a star structure on $A$, i.e., an involutive antilinear map $\star : A\to A$ satisfying the two conditions:

$$(ab)^\star=b^\star a^\star,\qquad\forall a,b\in A,\tag{13}$$

$$(\star\otimes\star)\Delta(a)=\Delta(a^\star),\qquad\forall a\in A.\tag{14}$$

In this case it is easy to show that:

$$S\circ\star=\star\circ S^{-1},\tag{15}$$

where S denotes the antipode of A. If A is a star algebra, Matn (C ) A will always be endowed with the star structure defined by (M y )ij = (Mij )? , where M 2 Matn (C ) A.


Let $W$ be a hermitian vector space; a representation $\pi : A\to\mathrm{End}(W)$ is unitary if and only if $\pi(a^\star)=\pi(a)^\dagger$, $\forall a\in A$. In this section we review the construction of the quantum deformation of the Lie algebra of the Lorentz group and make our conventions precise. We first define $A=U_q(su(2))$ for $q\in\,]0,1[$, as being the star Hopf algebra defined by the generators $J_\pm$, $q^{\pm J_z}$, and the relations:

$$q^{\pm J_z}q^{\mp J_z}=1,\qquad q^{J_z}J_\pm q^{-J_z}=q^{\pm1}J_\pm,\qquad[J_+,J_-]=\frac{q^{2J_z}-q^{-2J_z}}{q-q^{-1}}.\tag{16}$$

The coproduct is defined by

$$\Delta(q^{\pm J_z})=q^{\pm J_z}\otimes q^{\pm J_z},\qquad\Delta(J_\pm)=q^{-J_z}\otimes J_\pm+J_\pm\otimes q^{J_z},\tag{17}$$

and the star structure is given by:

$$(q^{J_z})^\star=q^{J_z},\qquad J_\pm^\star=q^{\mp1}J_\mp.\tag{18}$$
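As an illustration of the relations (16), the following sketch builds matrices on a $(2K+1)$-dimensional module and measures how far they are from satisfying the relations numerically. The explicit matrix elements $J_\pm e_m=\sqrt{[K\mp m][K\pm m+1]}\,e_{m\pm1}$, the basis ordering and all function names are assumptions made for this example; they are a standard choice, not taken from the text or its Appendix, and the sketch checks only the algebra relations, not the star structure (18).

```python
import math

def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)

def spin_rep(two_k, q):
    """Matrices of q^{Jz}, q^{-Jz}, J+, J- on the (2K+1)-dimensional module.

    Basis e_m is ordered m = -K, ..., K (index i corresponds to m = i - K).
    Assumed convention: J+- e_m = sqrt([K -+ m][K +- m + 1]) e_{m +- 1}.
    """
    k = two_k / 2.0
    dim = two_k + 1
    zero = lambda: [[0.0] * dim for _ in range(dim)]
    qjz, qjz_inv, jp, jm = zero(), zero(), zero(), zero()
    for i in range(dim):
        m = i - k
        qjz[i][i] = q**m
        qjz_inv[i][i] = q**(-m)
        if i + 1 < dim:
            c = math.sqrt(qnum(k - m, q) * qnum(k + m + 1, q))
            jp[i + 1][i] = c  # J+ sends e_m to c * e_{m+1}
            jm[i][i + 1] = c  # J- sends e_{m+1} to c * e_m
    return qjz, qjz_inv, jp, jm

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def relation_defects(two_k, q):
    """Largest entrywise violation of each of the three relations in (16)."""
    qjz, qjz_inv, jp, jm = spin_rep(two_k, q)
    dim = two_k + 1
    # q^{Jz} J+- q^{-Jz} = q^{+-1} J+-
    d_plus = max_abs_diff(matmul(matmul(qjz, jp), qjz_inv),
                          [[q * x for x in row] for row in jp])
    d_minus = max_abs_diff(matmul(matmul(qjz, jm), qjz_inv),
                           [[x / q for x in row] for row in jm])
    # [J+, J-] = (q^{2Jz} - q^{-2Jz}) / (q - q^{-1})
    pm, mp = matmul(jp, jm), matmul(jm, jp)
    comm = [[pm[i][j] - mp[i][j] for j in range(dim)] for i in range(dim)]
    q2, q2i = matmul(qjz, qjz), matmul(qjz_inv, qjz_inv)
    rhs = [[(q2[i][j] - q2i[i][j]) / (q - 1.0 / q) for j in range(dim)]
           for i in range(dim)]
    return d_plus, d_minus, max_abs_diff(comm, rhs)
```

Calling `relation_defects(2, 0.7)` returns three numbers that should all be at the level of floating-point rounding if the assumed matrices do satisfy (16) for spin 1.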

This Hopf algebra is a ribbon quasi-triangular Hopf algebra. A description of the universal R matrix, of the element ; v is given in the Appendix. Finite dimensional representations of Uq (su(2)) are completely reducible, and irreducible representations are completely classified by a couple (!; K ) 2 f1; 1; i; ig 1 + Jz 2 Z , the dimension of the representation is 2K + 1 and the spectrum of q is included 1 n in f!q ; n 2 2 Zg. The appearance of ! 2 f 1; i; ig comes from the existence of non trivial outer automorphisms ! of Uq (su(2)), defined by ! (q Jz ) = !q Jz ; ! (J+ ) = !2 J+ ; ! (J ) = J . These automorphisms have no classical counterpart, and are a source of annoying features. In all the quantization schemes with specialization of the parameter to a complex number these automorphisms appear. It is easy to show that an irreducible representation is unitarizable if and only if ! 2 f+1; 1g. We will denote in the rest of this article Irr(A) the set of all equivalency classes of finite dimensional irreducible and unitary representations with ! = 1. The tensor product of elements of Irr(A) is completely reducible in elements of Irr(A). K Each of these classes is labelled by its spin K (we will use a capital letter to K denote it), and let us define V as being the vector space associated to the representation of spin K . The representation of spin 0 is associated to the counit. The representation K of dimension 2K + 1 and associated to !; (! 4 = 1) is Æ ! . K K Let M be an element of End(V ), we will define trq (M ) = trK ( 1 M ). In particuV K lar the q-dimension of the representation is by definition [dK ] = trq (1) = [2K + 1]. The tensor product of elements of Irr(A) behaves as in the classical case:

$$\overset{I}{\pi}\otimes\overset{J}{\pi}=\bigoplus_{K=|I-J|}^{I+J}\overset{K}{\pi}.\tag{19}$$
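A quick numerical consistency check of the decomposition (19) is that the q-dimensions $[d_K]=[2K+1]$ add up: $[d_I][d_J]=\sum_K[d_K]$ over the spins $K$ appearing in the sum. The sketch below verifies this identity for sample values; spins are stored as the integers $2K$, and the function names are chosen for this example only.

```python
def qnum(n, q):
    """q-number [n] = (q^n - q^-n) / (q - q^-1), for q > 0, q != 1."""
    return (q**n - q**(-n)) / (q - 1.0 / q)

def spins_in_product(two_i, two_j):
    """Values 2K for the summands in (19): K = |I-J|, |I-J|+1, ..., I+J."""
    return list(range(abs(two_i - two_j), two_i + two_j + 1, 2))

def qdim_defect(two_i, two_j, q):
    """[d_I][d_J] - sum_K [d_K], with [d_K] = [2K+1]; should vanish."""
    lhs = qnum(two_i + 1, q) * qnum(two_j + 1, q)
    rhs = sum(qnum(two_k + 1, q) for two_k in spins_in_product(two_i, two_j))
    return lhs - rhs

print(spins_in_product(2, 1))  # [1, 3], i.e. spins 1/2 and 3/2
print(abs(qdim_defect(3, 2, 0.5)) < 1e-9)
```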

It will be convenient to introduce at this point the following notation: if $I, J, K$ are elements of $\frac12\mathbb{Z}^+$, we define $Y(I,J,K)$ as follows:

$$Y(I,J,K)=\begin{cases}1&\text{if }I+J-K,\ J+K-I,\ K+I-J\in\mathbb{Z}^+,\\0&\text{otherwise.}\end{cases}$$


We also define

$$\tilde Y(I,J)=\begin{cases}1&\text{if }I+J,\ I-J\in\mathbb{Z}^+,\\0&\text{otherwise.}\end{cases}$$
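The two indicator functions can be sketched with exact rational arithmetic as follows. This assumes that $\mathbb{Z}^+$ denotes the non-negative integers (so that spin 0 is allowed); the helper name `_in_Zplus` is introduced for this example only.

```python
from fractions import Fraction

def _in_Zplus(x):
    """True if the rational x is a non-negative integer (assumed meaning of Z+)."""
    return x.denominator == 1 and x >= 0

def Y(i, j, k):
    """Triangle indicator Y(I, J, K) for half-integer spins."""
    i, j, k = Fraction(i), Fraction(j), Fraction(k)
    ok = all(_in_Zplus(x) for x in (i + j - k, j + k - i, k + i - j))
    return 1 if ok else 0

def Y_tilde(i, j):
    """Indicator of I + J and I - J both lying in Z+."""
    i, j = Fraction(i), Fraction(j)
    return 1 if _in_Zplus(i + j) and _in_Zplus(i - j) else 0
```

For example, scanning $K\in\{0,\frac12,\dots,3\}$ with $I=1$, $J=\frac12$ selects exactly the spins $\frac12$ and $\frac32$ that appear in (19).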

K K K Let us denote by ( e i ji = 1 : : : dim V ) a particular unitary basis of V which diagoK K _ nalizes the action of q Jz (see the Appendix), and ( e i ji = 1 : : : dim V ) its dual basis. I J K K For any representations ; ; of Irr(A) we define the Clebsch–Gordan maps IJ I J K K I J (resp. IJ K ) as elements of HomA (V V; V) (resp. HomA (V; V V)). These Clebsch– Gordan maps will be chosen to satisfy the following properties: K = 0; IJ = 0, when Y (I; J; K ) = 0 we have IJ K K IJ when Y (I; J; K ) = 1 IJ ; K are non-zero and defined by Clebsch–Gordan coefficients: K

IJ K ( e c) =

X

a;b

a b K I J I J c ea eb ;

X c I J J K I IJ (ea eb ) = K ab c

K

e c: (20)

These Clebsch–Gordan maps are at this point defined up to phases. We will impose first K = (IJ )y . These maps are still defined up to a phase which is fixed up to a that IJ K sign by the requirement that the Clebsch–Gordan coefficients are real (this is possible because q is real). These last signs are completely determined by the convention of Wigner:

I J K I J I J

2 R+ :

With these conventions, the Clebsch–Gordan coefficients can be explicitly computed and satisfy numerous properties, some of them being stated in the Appendix. To a representation of Uq (su(2)), we can naturally associate three representations, the conjugate representation , defined by (x) = (S 1 (x? )) and two contragredient ^ (resp. ) defined by ^ (x) = t (S (x)) (resp. = t (S 1 (x)) ) for representations any x 2 Uq (su(2)). I In the special case of Uq (su(2)) and if we take = then these three repreI I I I sentations are equivalent to , in particular we have (x) = W (x)W 1 with W I matrices whose components are easily shown to be defined up to a scalar: W m m0 = cI Æmm0 qm ( 1)I m . D I I I I E We will define the linear forms uji = ej j (:)jei and Pol(SUq (2)) the star Hopf I algebra generated as a vector space by (uji )I 2 21 Z+;i;j =1:::2I +1 . By a direct application of the definitions, we have

I J

u1 u2 =

X

M

M

M IJ M u IJ ;

(21)


I

(uab ) =


XI

I

uac ucb:

c

(22)

The first equation implies the exchange relations

$$\overset{IJ}{R}_{12}\,\overset{I}{u}_1\,\overset{J}{u}_2=\overset{J}{u}_2\,\overset{I}{u}_1\,\overset{IJ}{R}_{12},\tag{23}$$

IJ I J where R = ( )(R). Due to the unitarity of the elements of Irr(A), we have: I I (uab )? = S (uba )

II I I uuy = uy u = 1:

i.e.

I I I I I The elements u satisfy also the relation: u = W uW then unitarity of the representation implies that: I

S (uji ) =

1.

(24)

I I If we define w im = W im

XI

m;n

I I 1 nj wim um : nw

(25)

From this point on we will often use the Einstein convention of summation over the indices labelling the vectors of a representation (the magnetic moments, denoted by small letters) but not over the indices labelling the representations (the spins, denoted by capital letters). $\mathrm{Pol}(SU_q(2))$ can equivalently be viewed as the star Hopf algebra generated by the matrix elements of $\overset{1/2}{u}$ with the defining relation (23) (with $I,J=\frac12$) and the relation (24) (with $I=\frac12$). Explicit relations are recalled in the Appendix. $U_q(su(2))$ being a factorizable Hopf algebra, it is possible to give a different but equivalent form of the defining relations of $U_q(su(2))$. Let us introduce, for each $I\in\frac12\mathbb{Z}^+$, the elements $\overset{I}{L}{}^{(\pm)}\in\mathrm{End}(\overset{I}{V})\otimes A$ defined by $\overset{I}{L}{}^{(\pm)}=(\overset{I}{\pi}\otimes id)(R^{(\pm)})$. The duality bracket is given by

I () J

IJ L1 ; u2 = R (12) :

(26)

These matrices satisfy the relations: X I J K () K L(1) L(2) = IJ K L IJ ;

IJ

I

J

K

J

I

(27)

IJ

( ) ( ) (+) R 12 L(+) R 12 ; 1 L2 = L2 L1 XI I () a I (L ) = L() c L() a ;

b

I (L() ab )? = S

b

c

1

I (L() ba );

(28)

c

i.e.

(29)

I

I

I

I

L() y L() = L() L() y = 1:

(30)

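The first equation implies the exchange relations

$$\overset{IJ}{R}_{12}\,\overset{I}{L}{}^{(\pm)}_1\,\overset{J}{L}{}^{(\pm)}_2=\overset{J}{L}{}^{(\pm)}_2\,\overset{I}{L}{}^{(\pm)}_1\,\overset{IJ}{R}_{12}.\tag{31}$$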


Explicit relations between generators of $U_q(su(2))$ and matrix elements of $\overset{I}{L}{}^{(\pm)}$ are recalled in the Appendix. The quantum double of $U_q(su(2))$ is the Hopf algebra

$$D=U_q(su(2))\,\hat\otimes\,\mathrm{Pol}(SU_q(2))^{op}.$$

I I We will denote by gab the elements uab embedded in Pol(SUq (2))op D. This is a star I I algebra with the following definition of star: (g ab )? = S 1 (g ba ) and also (30). I I The algebra law satisfies the relations (21) (with u replaced by g) and also the exchange relations: IJ () I () J

J I IJ R 12 L1 g2 = g2 L(1) R (12) :

(32)

The coproduct on D satisfies

I

D (L() ab ) =

XI

c

I

I

L() cb L() ac ; D (gab ) =

XI

c

I

gcb gac :

(33)

$D$ is a quantum deformation of the enveloping algebra of $sl(2,\mathbb{C})_{\mathbb{R}}$ [31,33,15]. In order to understand this point it is important to recall the principle of quantum duality, which first appeared in [16] and was developed in [36]. Let $\star$ be the involution on $sl(2,\mathbb{C})$ selecting the compact form $su(2)$. $U_q(su(2))$ being a star Hopf algebra, $(sl(2,\mathbb{C}),\star)$ inherits a star Lie bialgebra structure. By duality it gives on $sl(2,\mathbb{C})^*$ a structure of star Lie bialgebra whose real form is isomorphic as an $\mathbb{R}$ Lie algebra to the Lie algebra $an(2)$, the real Lie algebra of $2\times2$ lower triangular complex matrices with real diagonal of zero trace. At the quantum level, an easy application of the quantum duality principle shows that $\mathrm{Pol}(SU_q(2))=U_q(su(2))^*=U_q(su(2)^*)=U_q(an(2))$ (at this point we are rather imprecise about the exact meaning of the dual; this will be corrected later on). As a result $D=U_q(su(2))\otimes U_q(an(2))$ as a vector space, containing $U_q(su(2))$ and $U_q(an(2))$ as subalgebras. It can be shown that the classical limit of $D$ is the Lie algebra $sl(2,\mathbb{C})_{\mathbb{R}}$, and the classical limit of the previous decomposition is the Iwasawa decomposition. In the appendix the reader will find the complete defining set of commutation relations of $D$. We will use the notation $D=U_q(sl(2,\mathbb{C})_{\mathbb{R}})$. Another natural decomposition, in the classical case, of $U(sl(2,\mathbb{C})_{\mathbb{R}})\otimes\mathbb{C}$ as a complex algebra is the decomposition in two commuting copies $U(sl(2,\mathbb{C}))\otimes U(sl(2,\mathbb{C}))$. In order to obtain such a description in the quantum case, let us now describe the factorization theorem applied to $D$. Using the explicit form of the isomorphism and the expressions of the coproduct (33), we can write:

I

I

(L() ) = M (r) M (l) ; I I I (g ) = M (r ) M (l+) ;

(34) (35)

I I where the components of M (l) (resp. M (r) ) are located in the subalgebra A 1 (resp. 1 A) of A A. From a direct application of the factorization theorem, we have the following relations: IJ I (i) J (i) J I IJ (36) R 12 M 1 M 2 = M (2i) M (1i) R 12 ;


IJ () I (i) J (i)

(37)

J (i) I (i) IJ () = M 2 M 1 R 12 ; I J 0 J 0 I M (1l) M (2r ) = M (2r ) M (1l) ;

R 12 M 1 M 2

(38)

with i 2 fr; lg; ; 0 2 f+; g. This factorization theorem gives an immediate description of the finite dimensional representations of D. Indeed from the isomorphism of a complex algebra between D and Uq (sl(2; C )) Uq (sl(2; C )) we obtain that each finite dimensional representation of D is completely reducible and that the finite dimensional irreducible representations of D are labelled by a pair (; 0 ) of irreducible finite dimensional representations of Uq (sl(2; 0C )). If ; 0 are two irreducible finite dimensional representations acting on 0 V and V , is an irreducible representation of Uq (sl(2; C )) Uq (sl(2; C )), from which we deduce the associated representation (; 0 ) of D acting on V V 0 by the expressions:

0

0

I I I 0 I () I () I() (; 0 )(g1 ) = R (13 ) R (+) 12 ; (; )(L1 ) = R 13 R 12 :

(39)

At this point the reader is invited to look at the appendix where the commutation relations of generators of D are explicitly computed. From these relations it is easy to show that there exists a non trivial outer automorphism of D defined by s (a) = a; s (b) = b; s (c) = c; s (d) = d; s (qJz ) = qJz ; s (J ) = J . This implies that s = Æ s is a one dimensional representation, which can be written as s = 0

0

(i ; i ) and has therefore no classical analogue. This representation is called strange

[31] (is not a smooth representation in the sense of these authors) and will reappear when we will study infinite dimensional representations of D. It has been shown in [38] that finite dimensional representations of Uq (sl(2; C )R ) are completely reducible and that the only finite dimensional irreducible representations, up to equivalence, of D are I J I J (; ) and (; ) Æ s : The center of the algebra D is a polynomial algebra with two generators defined by: 1 2

= trq (L()

1

12

g ):

(40)

An easy computation shows that 1 2

( + ) = trq (M (l 1 2

)

( ) = trq (M (r+)

1

1 2

M (l+) );

1

1 2

M (r ) );

(41) (42)

which are the quadratic Casimirs of each copy in the intrinsic form found by [18]. The action of the star involution is easy to compute and we have ( + )? = . The action of s on is given by: s ( ) = . In the appendix we have worked out the precise connection between our conventions and those of [31,30].


2.2. Algebras of functions on $SL_q(2,\mathbb{C})$. In our work we will make extensive use of the notion of multiplier Hopf algebra. A very good reference on this subject is [39,40] in the purely algebraic case. We will also use the notions of $C^\star$ multiplier Hopf algebra and of affiliated elements, which are well explained in [46,28] and also in [31] in the case of $SL_q(2,\mathbb{C})$. The aim of the theory of multipliers, applied to the quantum group case, is to develop an appropriate mathematical framework in which the quantization of non-compact groups fits naturally. As we will see, the formalism is not completely obvious, and it is the aim of our work to show that it is possible to develop a complete harmonic analysis of $SL_q(2,\mathbb{C})$ in this framework. Let us review briefly the basic ideas. $\mathrm{Pol}(SU_q(2))$ is a star Hopf algebra and we can define on $\mathrm{Pol}(SU_q(2))$ a norm by:

$$\|a\|=\sup_{\pi}\|\pi(a)\|,\qquad\forall a\in\mathrm{Pol}(SU_q(2)),\tag{43}$$

where the supremum is taken over all unitary representations $\pi$ of $\mathrm{Pol}(SU_q(2))$ acting on a Hilbert space (note that $\pi(a)$ is a bounded operator and that the supremum always exists and is finite [41]). After completion of $\mathrm{Pol}(SU_q(2))$ with respect to this norm we obtain a $C^\star$ algebra denoted $(\mathrm{Fun}(SU_q(2)),\|\cdot\|)$, which can be thought of as being the quantum analog of the space of continuous functions on $SU(2)$. The coproduct $\Delta:\mathrm{Pol}(SU_q(2))\to\mathrm{Pol}(SU_q(2))\otimes\mathrm{Pol}(SU_q(2))$ extends by density to a morphism of $C^\star$ algebras $\Delta:\mathrm{Fun}(SU_q(2))\to\mathrm{Fun}(SU_q(2))\otimes\mathrm{Fun}(SU_q(2))$. Note that there is no ambiguity in the definition of the tensor product $\mathrm{Fun}(SU_q(2))\otimes\mathrm{Fun}(SU_q(2))$ because $\mathrm{Fun}(SU_q(2))$ is nuclear [45]. From the quantum duality principle, it is natural to view the linear forms on $\mathrm{Pol}(SU_q(2))$ as quantum analogues of classes of functions on $AN$. More precisely Podles and Woronowicz were led to introduce the following construction: let $(\overset{I}{E}{}^{i}_{j})$ be the elements of $(\mathrm{Pol}(SU_q(2)))^{*op}$ defined by the duality bracket $\langle\overset{I}{E}{}^{i}_{j},\overset{J}{g}{}^{m}_{n}\rangle=\delta_{I,J}\,\delta^{i}_{n}\,\delta^{m}_{j}$. It is easy to see that these elements are independent and that they are naturally endowed with the following structure of star algebra, dual to the coproduct on the elements $\overset{I}{g}{}^{i}_{j}$:

I J

I

I

I

E kl E rs = ÆI;J E ks Ælr ; (E ij )? = E ji :

(44)

As a result the structure of the algebra generated by $\overset{I}{E}{}^{k}_{l}$ is isomorphic to the star algebra $\bigoplus_{I\in\frac12\mathbb{Z}^+}\mathrm{Mat}_{2I+1}(\mathbb{C})$. We define $\mathrm{Fun}_c(AN_q(2))$ (the quantum analog of compactly supported functions) to be the star algebra $\bigoplus_I\mathrm{Mat}_{2I+1}(\mathbb{C})$ whose basis is $\overset{I}{E}{}^{k}_{l}$. Note that this algebra has no unit element. This algebra is endowed with a structure of normed algebra defined as follows: $\|\oplus_I a_I\|=\sup_I\|a_I\|$, where $\|a_I\|$ is the usual sup norm of the finite dimensional matrix. $\mathrm{Fun}_0(AN_q(2))$ (the quantum analog of functions vanishing at infinity) is the $C^\star$ algebra $\mathrm{Fun}_0(AN_q(2))=\overline{\bigoplus_I\mathrm{Mat}_{2I+1}(\mathbb{C})}$, the completion of $\mathrm{Fun}_c(AN_q(2))$ with respect to this norm.


By duality $\mathrm{Fun}_c(AN_q(2))$ is endowed with a structure of multiplier Hopf algebra, called the discrete quantum group [40]. The coproduct maps $\mathrm{Fun}_c(AN_q(2))$ to $M(\mathrm{Fun}_c(AN_q(2))\otimes\mathrm{Fun}_c(AN_q(2)))$, where $M(B)$ denotes the algebra of multipliers of $B$. In this particular case,

$$M(\mathrm{Fun}_c(AN_q(2))\otimes\mathrm{Fun}_c(AN_q(2)))=\prod_{I,J\in\frac12\mathbb{Z}^+}\mathrm{Mat}_{2I+1}(\mathbb{C})\otimes\mathrm{Mat}_{2J+1}(\mathbb{C}).$$

It is easy to compute the coproduct by duality:

I

(E ij ) =

X

J;K

n s I J K j

Kr i J K EJ m E n s; I mr

(45)

where this infinite sum is to be understood as the multiplier

(

n s I J K j

Kr i J K EJ m E n s )J;K 2 12 Z+: I mr

(46)

In order not to obscure formulas we will stick to the notation with the sum. It is therefore an easy task to check the axioms of a star multiplier Hopf algebra on $\mathrm{Fun}_c(AN_q(2))$. At this point we can endow $\mathrm{Pol}(SU_q(2))$ with a different star structure; let us denote by $\mathrm{Pol}(SU'_q(2))$ this star algebra and denote by $\overset{I}{k}{}^{m}_{n}$ the element $\overset{I}{u}{}^{m}_{n}$ when considered as an element of $\mathrm{Pol}(SU'_q(2))$: the involution is then defined by $(\overset{I}{k}{}^{m}_{n})^\star=S^{-1}(\overset{I}{k}{}^{n}_{m})$. As a result it is natural to define the normed star algebra

Funcc (SLq (2; C )R ) = Pol(SUq0 (2)) Func (ANq (2));

the normed star algebra Func (SLq (2; C )R ) = Fun(SUq0 (2)) Func (ANq (2)), where the tensor product always denotes the algebraic tensor product. Naturally we can define the C ? algebra

Fun0 (SLq (2; C )R ) = Fun(SUq0 (2)) Fun0 (ANq (2))

and denote by $\mathrm{Fun}(SL_q(2,\mathbb{C})_{\mathbb{R}})$ the set of affiliated elements to $\mathrm{Fun}_0(SL_q(2,\mathbb{C})_{\mathbb{R}})$. We have $\mathrm{Fun}(SL_q(2,\mathbb{C})_{\mathbb{R}})=\prod_{I\in\frac12\mathbb{Z}^+}\mathrm{Fun}(SU'_q(2))\otimes\mathrm{Mat}_{2I+1}(\mathbb{C})$. Note that this last algebra is not endowed with a norm. We can show that $\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})$ is endowed with a structure of multiplier Hopf algebra whereas $\mathrm{Fun}_0(SL_q(2,\mathbb{C})_{\mathbb{R}})$ is endowed with a structure of multiplier $C^\star$ Hopf algebra. The coalgebra structure on these last two spaces has to be twisted, according to the structure of the dual of a quantum double. More precisely we have

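$$\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})=\mathrm{Pol}(SU'_q(2))\otimes_F\mathrm{Fun}_c(AN_q(2))$$

and

$$\mathrm{Fun}_0(SL_q(2,\mathbb{C})_{\mathbb{R}})=\mathrm{Fun}(SU'_q(2))\otimes_F\mathrm{Fun}_0(AN_q(2))$$

as multiplier coalgebras, with $F$ the invertible multiplier of

$$M(\mathrm{Fun}_c(AN_q(2))\otimes\mathrm{Pol}(SU_q(2))),$$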


defined by:

$$F=\sum_{J,x,y}\overset{J}{E}{}^{x}_{y}\otimes\overset{J}{k}{}^{y}_{x},\tag{47}$$

(this element satisfies Eqs. (6) and is the bicharacter of [31]). All this is described in [31]. Let us now describe explicitly the full structure of the multiplier Hopf algebra of Funcc (SLq (2; C )R ). The explicit structure of multiplier algebra of Funcc (SLq (2; C )R ) is defined by the following equations:

Ai B k C m D p X i m kj E l kn E q = AC F X A B q s (kji E kl ) = F231 ( CD C;D;m; p;q;r;s Ai B k (kj E l ) = Æji ÆB;0; A B A B (kji E kl )? = S 1 (k ji ) E lk ; XJ 1J with F121 = E xy S (kyx ): J;x;y

F r B l

Bk s A C Æp Æ Fk r E B;D s q; l F jn

C p Am D r k C D Ak i E m q kj E s ) F23 ; B pr

(48)

It is easy to check the axioms of a multiplier Hopf algebra. It is also easy to check that, although the coproduct maps Funcc (SLq (2; C )R ) to

M (Funcc(SLq (2; C )R ) Funcc(SLq (2; C )R ));

R L one can still define actions . (resp. .) of Uq (sl(2; C )R ) on Funcc (SLq (2; C )R ) by the formulas (1). As an algebra we can define Pol(SLq (2; C )R ) = Pol(SUq0 (2)) Uq (su(2))op , which elements are affiliated elements to Fun0 (SLq (2; C )R ). 2.3. Iwasawa decomposition of SLq (2; C )R . Let

(I;J )

I J

= (; ) be an irreducible repre-

IJ sentation of Uq (sl(2; C )R ), its matrix elements G are affiliated elements of Fun0 (SLq (2; C )R ). We have the decomposition Pol(SLq (2; C )R ) = Pol(SUq0 (2))

I Uq (su(2))op which is in duality with Uq (sl(2; C )R ). Let us denote by T () 2 I Uq (su(2))op the elements L() of Uq (su(2)) when considered as elements of Pol(SLq (2; C )R ). The duality bracket is defined by:

Uq (su(2)) ^ Pol(SUq (2))op Pol(SUq0 (2)) Uq (su(2))op ! C

I ()

L1

J

B

g 10 ; k2 T (20 ) A

=

IJ () AB0 () R 12 R 10 20 :

(49)


From these definitions it is easy to show that the algebra law of Pol(SLq (2; C )R ) on I I the elements k and T () satisfies:

I J

k1 k2 = I

J

X

K

K K IJ K k IJ ;

T () ab T () cd =

(50)

X c

a K J I r

K () r s J I T s K db ;

K IJ ax J ( ) y I (+) b I (+) a I ( ) x IJ yb R by T z T c = T b T y R cz ; I J J I k1 T (2) = T (2) k1

(51) (52) (53)

I (the reader should notice that the formulas for the product of T () are inverted with I I respect to those of L() , this is because T () 2 Uq (su(2))op ). The action of the untwisted coproduct is given by: I

0 (kab ) =

XI

c

I

I

kac kcb 0 (T ()ab ) =

XI

c

I

T () cb T () ac :

(54)

From the duality bracket, it is easy to show that

) 1 (+) 1 K 2 k3 T 3 ; g1

I I

) 1 (+) 1 2 k3 T 3 ; L1

k2 T (

J J

I I

k2 T (

J J

K

=

=

KJ ( ) KI (+)

R 13 R 12

KJ () KI ()

(55)

R 13 R 12 ;

I I J J IJ comparing these expressions with (39) we deduce that G 12 = k 1 T ( ) 1 1 k2 T (+) 2 1 : If we take (I; J ) = ( 12 ; 0) this is exactly the quantum analog of the Iwasawa decomposition, first found, by direct computation in [31]. The fact that the last expressions I I contain T () 1 comes from the fact that the coproduct on T () is inverted with respect I I to the usual coproduct on k . With our conventions the matrix T ( ) is lower trianguI lar whereas T (+) is upper triangular, and their expressions as affiliated elements of Fun0 (ANq (2)) is the following: I T () 1 ab =

Jy IJ R () 1 ax by E x :

X

J2 Z 1 2

(56)

+

2.4. Characters of finite dimensional representations of Uq (sl(2; C )R ). We will be particularly interested in the trace of the representation

(A;B )

which can be written as


AB

(A;B )

= tr110 G 110 : it is an affiliated element of Fun0 (SLq (2; C )R ) which can there-

fore be represented as an element of Y

I 2 21 Z+

Fun(SUq (2)) Mat2I +1 (C ):

We will now use the generalized Iwasawa decomposition proved above to explicitly find the expression of this affiliated element. (A;B )

Theorem 1 (Expression of the characters of finite dimensional representations). is an affiliated element of Fun0 (SLq (2; C )R ) which can be explicitly computed in term of 6j symbols of SUq (2) : (A;B )

=

X

K

X f KN (A; B ) n z N M MK u

M;N

u KM M m Ky N y m kn E z ;

(57)

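where $f^{KN}_{M}(A,B)\in\mathbb{C}$ is given by

$$f^{KN}_{M}(A,B)=\sum_{D}\begin{Bmatrix}A&B&M\\K&N&D\end{Bmatrix}^{2}\frac{v_D\,v_M^{1/2}}{v_B\,v_N^{1/2}\,v_K^{1/2}}=\sum_{D}\begin{Bmatrix}B&A&M\\K&N&D\end{Bmatrix}^{2}\frac{v_A\,v_N^{1/2}\,v_K^{1/2}}{v_D\,v_M^{1/2}},\tag{58}$$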

(the index f stands for finite dimensional). Note that in Eq. (58) the sum on D is finite, and that if K is fixed then the sum in (57) involves just a finite number of M . Proof. We have:

AA

(A;B )

= trA (k T (

=

X X

K

M

) 1

BB )trB ( k T (+)

a c M AK ( AB m R

)

X AK A A K L R ( ) 1 bx BL (+) 1 dz y t ) = k ab k cd

ay R ct E x E z K;L BK n AB M 1 bx (+) 1 dz m Ky R ay cx M b d k n E z

1

2

6 X6 6 = 6 KM 4

3

7 7 Md 7 k 7 e 5

K

E cb :

V

The reader is invited to look at the appendix where this graphical representation is explained. This last expression can be written, using twice the unitarity on Clebsch– Gordan coefficients as: X

K;M;N

n z N KN f M (A; B ) M K u

Ky u K M Mk m E n z; N y m


where f KN M (A; B ) is defined by: 2 6 6 6 6 4

Æ A; B : Æ AB M ÆÆ KN D AB M v ÆÆ KN D v v v AB M v Æ KN D v v v Æ 3

7 7 7 = f KN ( M 7 5

) id N V

(59)

V

The left-hand side is equal to: 2

3

6 X6 6 6 D 4

7 7 7 7 5

V

2

3

6 X6 6 = 6 D 4

7 7 7 7 5

2

7 7 7 7 5

1=2 vM AB M = 1=2 1=2 K N D v B vN vK D

X vD

2

D B K

V

3

6 X6 6 = 6 D 4

1= 2 D M 1=2 1=2 B N K

V

:

From the definition of the 6j symbols it is easy to show that if

Y (A; B; M ): Y (K; N; D): Y (K; B; D): Y (A; B; D) = 0

AB M then KN D

= 0. As a result these constraints impose that if K is fixed then the

sum in (57) involves just a finite number of M . This can be written as a picture in terms of coefficients of the R matrix in IRF representation. Indeed, it is trivial to show using the formula of f KN M (A; B ) and the formulas for the R matrix in IRF picture that:

ÆÆ

ÆÆ

2 X6 KN 4 f M (A; B ) =

D

3 7 5

vN1=2 : 1=2 1=2 vM vK

(60)

IRF

We can also express f KN M (A; B ) with the following formula: 2

KN f M (A; B ) =

X6 4

D

3 7 5

1=2 1=2 vM vK : 1=2 vN

IRF

(61)


In order to show this last equation we have to modify a little bit the previous proof. From Eq. (59), we have:

Æ Æ Æ : Æ

2

3

6

7 7 7 7 5

6 KN f M (A; B ) id N = 6 6 V 4

2

3

6 6 =6 6 4

7 7 7 7 5

V

V

From this diagram we can apply the succession of moves of the previous proof, and we finally obtain a new expression for f KN M (A; B ) :

KN f M (A; B ) =

X B

A M KN D

D

2 X6 = 4

D

2

1=2 vA vN1=2 vK 1=2 vD vM

ÆÆ

3 7 5

1=2 1=2 vM vK : 1=2 vN

ut

IRF

Let us now study the case of the strange representations Æs . It is easy to show from the definition of s that the character of the one dimensional representation s is J P the affiliated element J 2 1 Z+( 1)2J E xx (this is the way it appears in [31]). As a result 2 (A;B )

from the fact that Æ s = (A;B ) Æ s can be written as: (A;B )

X

K

s , we obtain that the character of the representation

(A;B )

X ( 1)2K f KN (A; B ) n z N M MK u

M;N

Ky u K M Mk m E n z: N y m

(62)

2.5. Left and right integral on SLq (2; C )R and its classical limit. One of the main results of [31] is the proof that there exists a right and left integral on Func (SLq (2; C )R ). Let hANr and hANl be the linear forms defined on Func (ANq (2)) by the formula

I I hANr (E rs ) = [dI ] 1 rs ;

I

I

hANl (E rs ) = [dI ]rs ;

(63)

then these linear forms are respectively right and left integral of the discrete quantum group Func (ANq (2)). A complete study of existence and uniqueness of right and left integral has been done in the case of the discrete quantum group in [40].


Let hK be the unique normalized right and left integral on Fun(SUq0 (2)) . It is I defined on Func (SUq0 (2)) by hK (k rs ) = ÆI;0 . Then the linear form h = hK hANr , satisfies:

I J J h(kba E rs ) = ÆI;0 [dJ ] 1 rs :

(64)

This linear form is a left and right integral on $\mathrm{Fun}_c(SL_q(2,\mathbb{C})_{\mathbb{R}})$. Using this integral we can define on $\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})$ a hermitian form by $(a,b)=h(a^\star b)$, $\forall a,b\in\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})$. As a result $\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})$ is equipped with a norm $\|\cdot\|_{L^2}$ defined by $\|a\|_{L^2}=(a,a)^{1/2}$. We will denote by $L^2(SL_q(2,\mathbb{C})_{\mathbb{R}})$ the completion of $\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}})$ with respect to this norm. One puzzling aspect of the construction of $\mathrm{Fun}_c(SL_q(2,\mathbb{C})_{\mathbb{R}})$ is that we know from general arguments that this multiplier Hopf algebra should correspond to the quantum analog of the compactly supported functions on $SL(2,\mathbb{C})_{\mathbb{R}}$, but the precise limit $q\to1$ is not so easy to understand. Although we know precisely what the classical limit of the elements $\overset{I}{k}$ is, it is not at all obvious what the limit, if any, of the elements $\overset{I}{E}$ is. In particular we would like to understand in which sense the formula of $h$ gives, in the classical limit, the well known left and right invariant Haar measure of $SL(2,\mathbb{C})_{\mathbb{R}}$. In the rest of this part we will explain how to understand the classical limit of the expressions (63). An element $g\in AN(2)$ can be written as

$$g=\begin{pmatrix}a&0\\n&a^{-1}\end{pmatrix}$$

with $a\in\mathbb{R}^+$, $n\in\mathbb{C}$. As usual $a$ and $n$ can be thought of as being functions of $g$. The expressions of the right and left Haar measures $dm_r$ and $dm_l$ on $\mathrm{Fun}_c(AN(2))$ are given by

$$\int\varphi(g)\,dm_l(g)=\int\varphi(g)\,a\,da\,d^2n,\qquad\int\varphi(g)\,dm_r(g)=\int\varphi(g)\,a^{-3}\,da\,d^2n.\tag{65}$$
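The invariance of the two measures in (65) can be checked numerically. The sketch below assumes the parametrization of $AN(2)$ by lower triangular matrices with rows $(a,0)$ and $(n,a^{-1})$, uses the real coordinates $(a,\mathrm{Re}\,n,\mathrm{Im}\,n)$, and estimates the Jacobian of a translation by central finite differences. A density $\rho$ defines an invariant measure $\rho\,da\,d^2n$ when $\rho(g')\,|\det J|=\rho(g)$ for the translated point $g'$. All function names are chosen for this example.

```python
def left_mult(g0, g):
    """Coordinates (a, n) of the product g0 * g."""
    a0, n0 = g0
    a, n = g
    return (a0 * a, n0 * a + n / a0)

def right_mult(g, g0):
    """Coordinates (a, n) of the product g * g0."""
    a, n = g
    a0, n0 = g0
    return (a * a0, n * a0 + n0 / a)

def jacobian_det(f, g, eps=1e-6):
    """Determinant of d f / d(a, Re n, Im n) at g, by central differences."""
    a, n = g
    x = [a, n.real, n.imag]
    rows = []
    for i in range(3):
        xp, xm = x[:], x[:]
        xp[i] += eps
        xm[i] -= eps
        ap, np_ = f((xp[0], complex(xp[1], xp[2])))
        am, nm = f((xm[0], complex(xm[1], xm[2])))
        rows.append([(ap - am) / (2 * eps),
                     (np_.real - nm.real) / (2 * eps),
                     (np_.imag - nm.imag) / (2 * eps)])
    m = rows  # transpose of the Jacobian; the determinant is unchanged
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def left_defect(g0, g):
    """Violation of left invariance for the density a of dm_l."""
    det = jacobian_det(lambda h: left_mult(g0, h), g)
    return abs(left_mult(g0, g)[0] * abs(det) - g[0])

def right_defect(g0, g):
    """Violation of right invariance for the density a^{-3} of dm_r."""
    det = jacobian_det(lambda h: right_mult(h, g0), g)
    return abs(right_mult(g, g0)[0] ** -3 * abs(det) - g[0] ** -3)
```

For sample points such as `g0 = (1.7, 0.3 - 0.8j)` and `g = (0.6, -1.2 + 0.4j)`, both defects should be at the level of the finite-difference error.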

The aim of what follows is to prove that the expressions (63) are certain types of the Jackson integral discretizing the previous classical Haar measures. A straightforward computation, shows that 1 2

T(

) 1

=

(q

2

a^

+1 12 ^ 2 ) n

a^

0

1

with the commutation relations:

a^a^? = a^? a^; a^n^ = q 1 n^ a^; a^n^ ? = qn^ ? a^; n^ n^ ?

n^ ? n^ =

2

q+q

1 (q

1

q)(^a2

a^ 2 ):

(66)

^; n ^; n ^ ? is isomorphic to Uq (su(2)) when q 6= 1 by the The algebra generated by a quantum duality principle. ^; n ^; n ^ ? commute as they should. The element In the limit where q ! 1 the elements a 1 ? ? 2 2 C^ = 2 (^nn^ + n^ n^ ) + (^a + a^ ) is in the center of this algebra. The elements C^ and a^ are both self-adjoint positive affiliated elements and commute. The projections of


the affiliated elements a ^; n ^; n ^ ? on each component Mat2I +1 (C ) given by the following expression:

I

a^I = (qJz ); n^ I = q 2 (q2 1

n^ ?I = q

1 2

(q 2

1)(

2 q+q

1)(

2

q+q

1

Func (AN (2)) are

1 I ) 2 (J );

(67)

1 I 2 1 ) (J+ ):

Let $\phi:\mathbb{R}^+\times\mathbb{R}^+\to\mathbb{R}$ be continuous with compact support; we can define $\phi(\hat a,\hat C)$ to be the affiliated element whose components are $(\phi(\hat a,\hat C))_I=\phi(\hat a_I,\hat C_I)$. It will also be convenient to define $\phi(a,C)$ to be the function on $AN(2)$ defined by $\phi(a,C)(g)=\phi(a(g),C(g))$, where $C(g)=n(g)\overline{n(g)}+a^{2}(g)+a^{-2}(g)$. We have the following theorem:

Theorem 2. The classical left and right Haar measures of $AN(2)$ are related to the quantum left and right Haar measures by the following formulas:
$$\lim_{q\to1}\,(q-q^{-1})^{3}\,h^{r}_{AN}(\phi(\hat a,\hat C))=\int\phi(a,C)\,dm_r(g),\tag{68}$$
$$\lim_{q\to1}\,(q-q^{-1})^{3}\,h^{l}_{AN}(\phi(\hat a,\hat C))=\int\phi(a,C)\,dm_l(g),\tag{69}$$
where $\phi$ is any continuous function $\mathbb{R}^+\times\mathbb{R}^+\to\mathbb{R}$ with compact support.

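Proof. The right-hand side of the first equality is equal to
$$2\pi\int_0^{+\infty}\!\!\int_0^{+\infty}\phi(a,\rho^{2}+a^{2}+a^{-2})\,a^{2\epsilon_{r,l}-1}\,\rho\,d\rho\,da,$$
where $\epsilon_r=-1$, $\epsilon_l=1$. After the change of variables $\rho^{2}+a^{2}+a^{-2}=2\cosh(x)$, $a=\exp(y)$ this integral is rewritten:
$$4\pi\int_0^{+\infty}dx\int_{-x/2}^{x/2}\phi(e^{y},2\cosh x)\,e^{2\epsilon_{r,l}y}\sinh x\,dy.$$
This integral is transformed into a Riemann sum by discretizing $x=\hbar(2J+1)$, $y=m\hbar$, where $q=e^{\hbar}$, $J\in\frac12\mathbb{Z}^+$ and $m\in\frac12\mathbb{Z}$. As a result we have:
$$\int\phi(a,C)\,dm_{r,l}(g)=\lim_{\hbar\to0}\,8\pi\hbar^{2}\sum_{J\in\frac12\mathbb{Z}^+}\ \sum_{m=-J}^{J}\phi(q^{m},q^{2J+1}+q^{-2J-1})\,q^{2\epsilon_{r,l}m}\,(q^{2J+1}-q^{-2J-1}).$$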

This last expression can also be written as:
$$\lim_{\hbar\to0}\,8\pi\hbar^{2}\sum_{J\in\frac12\mathbb{Z}^+}\ \sum_{m=-J}^{J}\phi\Big(q^{m},\frac{2}{q+q^{-1}}\,(q^{2J+1}+q^{-2J-1})\Big)\,q^{2\epsilon_{r,l}m}\,(q^{2J+1}-q^{-2J-1})\tag{70}$$
because $\frac{2}{q+q^{-1}}=1+O(\hbar)$. An easy computation, using (67), shows that $\hat C_I=\frac{2}{q+q^{-1}}\,(q^{2I+1}+q^{-2I-1})$; as a result we obtain that (70) is exactly
$$\lim_{q\to1}\Big\{(q-q^{-1})^{3}\sum_{J\in\frac12\mathbb{Z}^+}\mathrm{tr}\big(\phi(\hat a_J,\hat C_J)\,\overset{J}{\mu}{}^{-1}\big)\,[d_J]\Big\}.\tag{71}$$
$\Box$
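The discretization step of the proof can be illustrated numerically. The sketch below is a hedged example, not the authors' computation: overall constants are dropped, the test function $\phi(a,C)=\max(0,3-C)$ is a hypothetical choice that makes the double integral $\int_0^{\infty}dx\int_{-x/2}^{x/2}\phi(e^{y},2\cosh x)\,e^{2\epsilon y}\sinh x\,dy$ computable in closed form, and the sum is taken over $x=\hbar(2J+1)$, $y=m\hbar$ with $m=-J,\dots,J$.

```python
import math

def phi(a, c):
    """Test function with compact support in c; chosen independent of a for simplicity."""
    return max(0.0, 3.0 - c)

def jackson_sum(hbar, eps):
    """hbar^2 * sum_J sum_m phi(q^m, 2cosh x) q^(2 eps m) sinh x, with x = hbar (2J+1)."""
    q = math.exp(hbar)
    total = 0.0
    two_j = 0
    while True:
        c = q ** (two_j + 1) + q ** (-(two_j + 1))       # equals 2 cosh(x)
        if c >= 3.0:                                      # phi vanishes from here on
            break
        inner = 0.0
        for s in range(two_j + 1):
            m = -two_j / 2 + s                            # m = -J, -J+1, ..., J
            inner += phi(q ** m, c) * q ** (2 * eps * m)
        sinh_x = 0.5 * (q ** (two_j + 1) - q ** (-(two_j + 1)))
        total += hbar * hbar * inner * sinh_x
        two_j += 1
    return total

def exact_integral():
    """Closed form of the double integral for this phi and eps = +1 or -1."""
    big_x = math.acosh(1.5)
    return 3 * (math.sinh(2 * big_x) / 4 - big_x / 2) - (2 / 3) * math.sinh(big_x) ** 3
```

For $\epsilon=\pm1$ the inner integral of $e^{2\epsilon y}$ over $[-x/2,x/2]$ equals $\sinh x$, which gives the closed form used in `exact_integral`; the sum approaches it as $\hbar\to0$.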

It would be very interesting to extend this result to a larger class of functions. The following heuristic argument shows that this is certainly possible but will require a deeper analysis. Assume that $\phi$ is now a function of three variables; we would like to define $\phi(\hat a,\hat n,\hat n^{\star})$. This is not possible in general because $\hat n,\hat n^{\star}$ do not commute, but the difference between two choices of ordering is of order $\hbar$, so that when one computes the left-hand side of (68), (69) in the limit where $\hbar$ goes to zero the choice of ordering is irrelevant. It is easy to see from the structure of the representation $\overset{J}{\pi}$ that the only monomials $\hat n^{j}\hat n^{\star k}$ such that $\mathrm{tr}_J(\hat n^{j}\hat n^{\star k})$ is non-zero are those for which $j=k$. From the previous argument we can always choose the ordering $(\frac12\hat n\hat n^{\star}+\frac12\hat n^{\star}\hat n)^{j}$ and we are back to the case of a function $\phi(a,C)$. In particular this argument explains the presence of the trace in the formula for $h^{r,l}_{AN}$: taking the trace corresponds in the classical limit to taking the mean over the angular variable $\theta$ defined by $n=\rho e^{i\theta}$, $\rho\in\mathbb{R}^+$.

3. Representation Theory and Characters

3.1. Irreducible unitary representations. Irreducible representations of $SL(2,\mathbb{C})_{\mathbb{R}}$ were first classified by I. M. Gelfand and M. A. Naimark [20]; see also the very good monographs [19,34] devoted to $SL(2,\mathbb{C})_{\mathbb{R}}$. Irreducible unitary representations of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ have been classified in [30]. W. Pusz shows that if $(V,\Pi)$ is an irreducible unitary Harish–Chandra representation of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ with domain $V$, then $\Pi(\Omega_\pm)=\omega_\pm\,\mathrm{id}_V$, where $\omega_\pm$ are complex numbers such that $\overline{\omega_+}=\omega_-$. Moreover the irreducible unitary representations are entirely classified, up to equivalence, by the value of $\omega_+$. More precisely $\omega_+,\omega_-$ can take only the following allowed values
$$\omega_+=\frac{[2(2\aleph_0+1)]}{[2\aleph_0+1]},\qquad \omega_-=\frac{[2(2\aleph_1+1)]}{[2\aleph_1+1]},$$
with the following constraints on $\aleph_0,\aleph_1$:
1. $\aleph_0=\aleph_1=0$. In this case $\Pi$ is the trivial one-dimensional representation $\Pi=\epsilon$.
2. $\aleph_0=\aleph_1=\frac{i\pi}{2\hbar}$. In this case $\Pi$ is the one-dimensional representation $\Pi=\epsilon_s$.
3. $2\aleph_0+1=\frac12(m+i\rho)$, $2\aleph_1+1=\frac12(-m+i\rho)$, $m\in\mathbb{N}$, $\rho\in[0,\frac{4\pi}{\hbar}[$ or $m=0$, $\rho\in[0,\frac{2\pi}{\hbar}]$. $\Pi$ is an infinite dimensional representation belonging to the so-called principal series. We will denote by $\hat D_p$ the set of representations belonging to the principal series.
4. $2\aleph_0+1=2\aleph_1+1=\rho\in\,]0,1[$. $\Pi$ is an infinite dimensional representation belonging to the so-called complementary series.
5. $2\aleph_0+1=2\aleph_1+1=\rho+\frac{i\pi}{\hbar}$, $\rho\in\,]0,1[$. $\Pi$ is an infinite dimensional representation belonging to the so-called strange complementary series.

One can map one complementary series to the other by tensoring it with the representation $\epsilon_s$. The principal series and the standard complementary series, after proper rescaling of the generators, tend, when $q$ goes to one, to the principal and complementary series of $SL(2,\mathbb{C})_{\mathbb{R}}$, whereas the limit of the strange complementary representations, when $q\to1$, is singular: these representations disappear in this limit.
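The constraint $\overline{\omega_+}=\omega_-$ can be checked directly on the allowed parameters. The following sketch assumes $q=e^{\hbar}$ real, writes $z_0=2\aleph_0+1$ and $z_1=2\aleph_1+1$, and uses $[z]=\sinh(\hbar z)/\sinh(\hbar)$; the function names are hypothetical and the parameter values are arbitrary test points (the degenerate point $z_0=0$, where $[z_0]=0$, is avoided).

```python
import cmath
import math

def qnum(z, hbar):
    """q-number [z] = (q^z - q^-z)/(q - q^-1) for complex z, with q = exp(hbar)."""
    return cmath.sinh(hbar * z) / math.sinh(hbar)

def casimir_values(z0, z1, hbar):
    """omega_+ = [2 z0]/[z0] and omega_- = [2 z1]/[z1]."""
    return qnum(2 * z0, hbar) / qnum(z0, hbar), qnum(2 * z1, hbar) / qnum(z1, hbar)

def principal(m, rho):
    return 0.5 * (m + 1j * rho), 0.5 * (-m + 1j * rho)

def complementary(rho):
    return complex(rho), complex(rho)

def strange_complementary(rho, hbar):
    z = rho + 1j * math.pi / hbar
    return z, z

hbar = 0.3
wp, wm = casimir_values(*principal(2, 0.7), hbar)
conjugation_defect = abs(wp.conjugate() - wm)      # vanishes for the principal series
wc, _ = casimir_values(*complementary(0.4), hbar)
ws, _ = casimir_values(*strange_complementary(0.4, hbar), hbar)
sign_flip_defect = abs(ws + wc)                    # strange series: omega changes sign
```

In this parametrization the strange complementary series has $\omega_\pm$ equal to minus the values of the complementary series with the same $\rho$, which is compatible with the two series being exchanged by tensoring with a one-dimensional representation.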

The proof follows the following path: when $V$ is infinite dimensional it is shown that $V=\bigoplus_{I=\frac m2}^{+\infty}\overset{I}{V}$, with $\Pi(a)=\overset{I}{\pi}(a)$ on $\overset{I}{V}$ for $a\in\{q^{J_z},J_+,J_-\}$ (note that the multiplicities are always 1), and by a Wigner–Eckart theorem we have that $\Pi(\overset{\frac12}{g}{}^{i}_{j})$ maps $\overset{K}{V}$ to $\overset{K-1}{V}\oplus\overset{K}{V}\oplus\overset{K+1}{V}$. Explicit formulas for the action of $\overset{\frac12}{g}{}^{i}_{j}$ are computed. We will show in this section that one can find expressions for $\Pi(\overset{I}{g}{}^{i}_{j})$ for any $I\in\frac12\mathbb{Z}^+$ and that this is nicely expressed in terms of continuation of 6j symbols. At this point the reader is invited to read the appendix, where we give an introduction to complex continuation of 6j symbols and their graphical representations, and to read the articles [5,6,13,22] where this theory is developed.

Theorem 3 (Unitary irreducible representations of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$). Let $\Pi$ be an infinite dimensional unitary irreducible Harish–Chandra representation of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$, and assume that $\aleph_0,\aleph_1$ take one of the values described above and associated to the principal series or to one of the two complementary series. $\Pi$ is equivalent to the following representation of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$: the representation space is given by $V=\bigoplus_{A\in\frac12\mathbb{Z}^+,\;A-\frac m2\in\mathbb{N}}\overset{A}{V}$ and the action of the generators in an orthonormal basis $\{\overset{C}{e}_r,\ C\in\frac12\mathbb{Z}^+,\ r=-C,\dots,C\}$ is given by

L() ij B

g ij

B

e r = Ce n R () in jr ; X D x B C p i C E BD er = ep E B x D j r EC (@0 ; @1);

BC

C

DE where the complex numbers BD EC (@0 ; @1 ) are defined by

BC AD (@0 ; @1 ) = =

X

@2 jY (B;@2 ;@1 =1 y(c;@2 ;@0 )=1 X

B C A

@0 @1 @2

@2 jY (B;@2 ;@1 )=1 y(c;@2 ;@0 )=

B C A

B C D

@0 @1 @2

@1 @0 @2

B C D

X

D

E C AC BD X F A C BU K B U D F E EP =

E (

vA1=2 v11=2 2

1=2

vD

K

)1

@1 @0 @2

These coefficients satisfy the following relations: X A

(73)

v@2 vA1=4 vD1=4 ; v@1 vB1=2 vC1=2

(74)

v@0 vB1=2 vC1=2 : 1=4 v@2 vA1=4 vD

(75)

A B K KU ; P U D FP

[2D + 1] 12 D = !Y (A; @0 ; @1 ): [2A + 1] AA

(72)

(76)

(77)

Proof. From Pusz's classification, we know that $V=\bigoplus_{A\in\frac12\mathbb{Z}^+,\;A-\frac m2\in\mathbb{N}}\overset{A}{V}$. The elements of $U_q(su(2))$ act on each $\overset{I}{V}$ by $\overset{I}{\pi}$ and we can always assume that the elements of $U_q(an(2))$ act as follows:

X E g jl Pe z = EC uj lp e u : CP E

C


B C The relation (72) is the action of the representation expressed in terms of L() . We first impose on the constraint obtained from the commutation relations between Uq (su(2)) and Uq (an(2)). We have BC

(R

() ij B () k

XE

l P m gn) ez =

kl L

C

E

BC

eu( R

() ij BE () ku

kl R

EC pl mp CP nz );

XE BC BP BE C B P ( g jl L() ik R () kl e u ( EC uj R () ip R () kl mn ) e z = mn ) lp kz CP E

Therefore there exist complex numbers CD EP such that

X u j D EC uj = lp EC x CP

D

We now study the constraint on the numbers Uq (an(2)). We first have: P (g ij g kl ) e z =

X

A B

F EDC

v i C F A y

i k =( AB v m = F K

K Kg m m n R p p R

y AE C jp

CD EP coming from the algebra law on

x C P CD D l p EP :

p k D EB x

x B P AC BD F D l z F E EP e v

n AB P K j l ) ez KP i k K n A B KR Fe : FP v nz AB m K j l

We can derive a more compact expression for these constraints. Indeed we have, for the first part of the equality 2 6 6 BD 6 AC F E EP 6 6 6 DCE 4 X

=

X

DCEU

BD A E C AC F E EP B U D

3 7 7 7 7 7 7 5 V

2

3

6 6 6 6 6 6 4

7 7 7 7 = 7 7 5 V

:


Using a similar treatment for the other part, we obtain easily

2 6 6 X 6 KT F P 6 6 6 KT 4

=

X

KU FP

KUDC

FA C BU K

AB K PU D

3 7 7 7 7 = 7 7 5 V

2

3

6 6 6 6 6 6 4

7 7 7 7 7 7 5 V

which gives the first announced constraint (76) on the coefficients $\Lambda^{CD}_{EP}$. The second constraint is derived from the action of the Casimir elements. Let us write this action explicitly. Using the expressions (72), (73) and the expression of the Casimir elements we obtain

ej =

B

=

=

=

X A 21

R

D X

(

D

X

D

(

X

D

(

1

vA1=2 v11=2 2

1=2 vD

2

1=2

vD

vA1=2 v11=2 2

1=2

vD

X

D

(

1a

k 12 B 12 D Ae = AB i D aj

1 b i D 2 )1 1 ab 1 2 A k

6 6 4

2

D [2D + 1] 6 )1 AB 4 [2A + 1] 1 2

vA1=2 v11=2 2

1=2

vD

3 7 7 7 7 5

1D 1 6 )1 2 e2i 2 6

AB

k 12 B 12 D A D a j AB e i =

2

vA1=2 v11=2

= Æji ÆA;B

l c D lc b A 1 k 2

() ib 2

ei =

A

V

3 7 5

ei =

A

V

1 D [2D + 1] A 2 )1 AB ei; [2A + 1]

which are Eqs. (77). As a first conclusion, we obtain that unitary irreducible representations of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ give solutions of the relations (76), (77) on the set of complex numbers $\Lambda^{BC}_{AD}$, and conversely solutions of these equations give a representation of $U_q(sl(2,\mathbb{C})_{\mathbb{R}})$ (the irreducibility and the unitarity have to be shown).


Let BC AD be given by the expression (74) which can be represented as: 2

BC AD =

6

6 vC1=2 6 1=2 1=4 1=4 6 v v v @2 B A D 4

X

3 7 7 7 7 5

: IRF

Let us first show the equality (75). The squares of the expressions (74) and (75) are, from the expression (145), rational functions in $q^{i\rho/2}$. In order to show that these expressions are equal it is therefore sufficient to show that they are equal for an infinite set of values of $z=i\rho/2\in\frac12\mathbb{Z}^+$. But this is the case because we have (58), and the complex continuation $\begin{Bmatrix}B&C&D\\ \aleph_0&\aleph_1&\aleph_2\end{Bmatrix}$ is equal to the ordinary 6j coefficient when $z$ is a sufficiently large integer. We now want to prove that the relation (76) is satisfied. This relation is also equivalent to the following form, using unitarity relations on 6j coefficients:

[2U +1][2G +1] ei(G+U C D) [2C +1][2D +1] KU

X

BD AC F G GH =

BC U AD G

FA C BU K

12

A B K KU : H U D FH

(78)

This can also be written as:

[2C + 1] 2 [2D + 1] 2 AC BD ei(2G C D) F G GH = [2G + 1] (79) X F A C A B K i (G U ) [2U + 1] 12 B C U KU = e ( ) AD G BU K H U D F H : [2G + 1] 1

1

KU

Using the pictorial representation, it is easy to show that this relation is true. Indeed, the 1=2 1=2 1=2 1=2 1=2 1=2 right-hand side of the relation (79) multiplied by the factor vA vB vC vD vG vF can be represented as

2

3

6 6 6 6 X 6 6 6 6 KMU @2 6 6 6 4

7 7 7 7 7 7 7 7 7 7 7 5 IRF

which can be transformed, using successively the graphical rules associated to relations (156), (149), two times (155) combined with trivial symmetries, and an orthogonality relation, as follows:

@ @ @ @ @ @ @ @@ 2

3

6 6 6 X 6 6 6 6 KMU 6 2 3 4 6 4

7 7 7 7 7 7 7 7 7 5

=

2

3

IRF

6 6 6 X6 6 = 6 6 MU 6 2 3 6 4

7 7 7 7 7 7 7 7 7 5

=

3

6 6 6 6 X6 6 = 6 6 MU 6 2 3 6 6 4

7 7 7 7 7 7 7 7 7 7 7 5

IRF

2

=

IRF

2

6 6 X6 6 = 6 6 2 3 4

3 7 7 7 7 7 7 5

IRF

which is explicitly equal to X

@ 2 @3

[2C + 1] 2 [2D + 1] 2 ei(2G C D) [2G + 1]

A@0 C@1 G @3

1

A C H

@0 @1 @3

1

B D F

@0 @1 @2

B D G

@0 @1 @2

v@2 v@3 vF1=2 vH1=2 vG ( ); v@2 1 vC vD

which concludes the proof of the relation (79). In order to prove the irreducibility and the unitarity of this representation we need an explicit formula for the action of $\overset{\frac12}{g}{}^{i}_{j}$, i.e. we need to compute $\Lambda^{\frac12 C}_{AD}$. This is trivial to do using the explicit formula for the 6j symbols when one of the spins is fixed to $\frac12$, which are given in (158). We obtain the following formulas:

C2+ 1CC + 1 (@0 ; @1 ) = 2 2 q@1 +@0 +1 [C + @1 1

@0 + 12 ] + q

=

C 1CC 2 1 2

1 2

(@0 ; @1 ) =

q@1 +@0 +1 [C + @0

=

@1 @0 1 [C + @0

[2C + 1]

@1 + 12 ] + q

@1 @0 1 [C + @1

@1 + 21 ] ; @0 + 12 ] ;

[2C + 1]

C 1CC + 1 (@0 ; @1 ) 2 2 1 qC @0 @1 2 = i (1 q 4C +2 ) q q 1 q2(C +@1 @0 + 12 ) 1 q2(C +@0 @1 + 12 ) 1 2

q

q

q2(C +@1 +@0 + 32 ) 1 q2(@1 +@0 C + 21 )

1

1 C = C2 + 1 C 2

1 2

(@0 ; @1 ):

The irreducibility of the representation can be easily proved. Indeed, since the action of $U_q(su(2))$ on each vector space generated by $\{\overset{A}{e}_i,\ i=1,\dots,\dim\overset{A}{V}\}$ is irreducible, any operator commuting with the action of the whole algebra must be scalar in each of these vector spaces. Let us denote by $\lambda_A$ its value in the vector space labelled by $A$. Then by analysing the action of $\overset{\frac12}{g}{}^{i}_{j}$ we can easily see that, if this operator commutes with the generators $\overset{\frac12}{g}{}^{i}_{j}$, then we have $(\lambda_A-\lambda_D)\,\Lambda^{\frac12 C}_{AD}=0$. From the computation of $\Lambda^{\frac12 C}_{AD}$, we deduce that $\lambda_A=\lambda_{A+1}$, which allows us to conclude that this representation is irreducible. It remains to show that there exists a Hermitian form such that $\Pi$ is unitary. Because $\overset{A}{e}_i$ is an orthonormal basis for the representation, it is sufficient to show that elements of $U_q(an(2))$ are represented by unitary actions. We have
$$(\overset{B}{e}_j\,|\,\Pi(\overset{C}{g}{}^{k}_{l})^{\star}\,|\,\overset{A}{e}_i)=(\overset{B}{e}_j\,|\,\Pi(S^{-1}(\overset{C}{g}{}^{l}_{k}))\,|\,\overset{A}{e}_i)$$

= = =

D

jm D C 1 nl wkm w BC p

X j

D

p CA CD D ni AB (@0 ; @1 )

DC ei(B D) [2D + 1] 2 l p A ei(A D) [2D + 1] 2 CD (@ ; @ ) 1 1 AB 0 1 B pk CD i [2B + 1] 2 [2A + 1] 2

X l

DM

XC

j M CB q

1

q AC ei(B+A 2D) [2D + 1] CD B CD (@ ; @ ): M i k [2B + 1] 12 [2A + 1] 21 CM A AB 0 1

But we also have, X ei(B +A

D

1

2D ) [2D

+ 1]

[2B + 1] 2 [2A + 1] 2 1

1

C D B CD (@ ; @ ) = C M A AB 0 1


v v @ v v v v @ @ v v v v v @ 2

3

6 6 6 6 1 = 4 1 = 4 6 X 6 A B = 1=2 1=2 6 6 D 2 C M 6 6 6 4

7 7 7 7 7 7 7 7 7 7 7 5

2

3

6 6 6 X6 6 = 6 6 2 3 6 4

7 7 7 7 7 7 7 7 5

=

2

3

6 X6 6 6 4 2

7 7 7 7 5

1=4 A 1=2 C

=

IRF

1=4 B = 1=2 M

IRF

1=4 1=4 1=2 A B vC = CM (@ ; @ ); AB 1 0 1=2 M

IRF

using successively graphical rules associated to (155) (152) (148) (75). As a first conclusion, we then have A ( e j j ( g kl )? j e i ) = B

C

X l

j M CB q

M

q A C CM M i k AB (@1 ; @0 ):

On another hand we trivially get: B ( e i j g kl j e j ) = A

C

X

M

l j M CB q

q A C CM M i k AB (@0 ; @1 )

We have thus shown that unitarity of the representation is equivalent to the relation:
$$\Lambda^{BC}_{AD}(\aleph_1,\aleph_0)=\overline{\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)}.$$
Because the $\overset{\frac12}{g}{}^{k}_{l}$ generate $U_q(an(2))$, we only have to show that this last relation is satisfied for $B=\frac12$. This is indeed true from the explicit form of $\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)$ with $B=\frac12$ computed above. In particular the value of $\Lambda^{\frac12 C}_{C-\frac12\,C+\frac12}(\aleph_0,\aleph_1)$ uses a square root and implies the constraint $\rho\in\,]0,1[$ for the two complementary series. $\Box$

From the definition of the complex continuation of the 6j symbols we would expect that $\Lambda^{BC}_{AA}$ is a rational function in $x$, where $x=q^{2\aleph_1+1}$. We will show that in fact $\Lambda^{BC}_{AA}$ as well as $(\Lambda^{BC}_{AD})^{2}$ are Laurent polynomials in the variable $x$ (this is clear for $B=\frac12$ from the explicit expression). The reason for such a property is explained in the following section.


3.2. Connections between the coefficients $\Lambda^{BC}_{AD}$ and the universal shifted cocycle $F(x)$. In the article [7], O. Babelon introduces the universal matrix $F(x)\in U_q(sl(2))^{\otimes2}$ in the context of quantization of Liouville theory on a lattice:
$$F_{12}(x)=\sum_{k=0}^{+\infty}\frac{(q-q^{-1})^{k}}{[k]!}\,\frac{(-1)^{k}x^{k}}{\prod_{\nu=k}^{2k-1}(xq^{\nu}q^{H_2}-x^{-1}q^{-\nu}q^{-H_2})}\,q^{k(H_1+H_2)/2}\,J_+^{k}\otimes J_-^{k},\tag{80}$$
where we have denoted $H=2J_z$. It is shown that
$$F_{12}(x)^{-1}=\sum_{k=0}^{+\infty}\frac{(q-q^{-1})^{k}}{[k]!}\,\frac{x^{k}}{\prod_{\nu=1}^{k}(xq^{\nu}q^{H_2}-x^{-1}q^{-\nu}q^{-H_2})}\,q^{k(H_1+H_2)/2}\,J_+^{k}\otimes J_-^{k}\tag{81}$$
and that it satisfies the shifted cocycle identity:
$$(\mathrm{id}\otimes\Delta)(F(x))\,F_{23}(x)=(\Delta\otimes\mathrm{id})(F(x))\,F_{12}(xq^{H_3}).\tag{82}$$
In [8] a quasi-Hopf algebra interpretation of this construction is given, as well as a connection between complex continuation of 6j coefficients and $F(x)$, which is encoded by the following formula:

1 2

F (x)10 20 =

X

C

N (C ) (x; 1 + 2 )Æ + ;0 +0 N (A) (xq20 ; 10 )N (B) (x; 20 ) 1

2

1

2

2

1 2 C A B 1 + 2

A

B

C

@(x) @(x) + 10 + 20 @(x) + 20

(83)

;

with x = q 2@(x)+1 and N (A) (x; 1 ) are normalisation factors, the exact expression for them are derived in [8] and is given by (see the appendix for notations on continued 6j s):

iA +1 (2@ + A + + 2) +1 (2@ A + + 1) (2@ + 2)+1 1 (2@ + 2 + 1)

N (A) (q2@+1 ; ) = e

From this result we easily obtain that: AB

(F (x)

1

(A) 2 0 0 (B) 0 10 20 X N (xq 2 ; 1 )N (x; 2 )Æ1 +2 ;10 +20 )1 2 = N (C ) (x; 1 + 2 ) C A B C 1 + 2 A B C 1 2 @(x) @(x) + 10 + 20 @(x) + 20 : (84)

We first have the following simple result which relates $\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)$ to $F(x)$:


Lemma 4. The coefficients $\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)$ satisfy the relation:

BC AD (@0 ; @1 ) = =

m

vA1=4 N (D) (x; m2 ) 1 2 10 20 D ; ( ( x )) m 10 20 B C 1=4 N (A) (x; m ) 2 vD 2 @1 = m2 , and x = q2@1 +1 , and we have denoted

C B A 2 1 2

where as usual @0

(85)

(x) = RF (x)G(x)F (x) 1 ; with G(x)

2 Uq (sl(2)) 2 the element G(x) = q

Proof. Using the relations connecting rewrite Eq. (75) as:

(H1 +H2 )2 2

2

H x H1 q 22 .

F (x) to the complex 6j coefficients, we can

vA1=4 N (D) (x; m2 ) BC AD (@0 ; @1 ) = 1=4 N (A) (x; m ) vD 2 m 1 = 2 1 = 2 00 00 X m BC BC 1 2 0 0 vB 1v=2C A2 B1 C2 F (x)10 20 vv@1 + 20 (F (x) 1 )11002200 B1 C2 Dm : @1 +2 2 vA

(86)

In the last sum we necessarily have $\sigma_1+\sigma_2=\frac m2$; it is then natural to introduce the matrix $G^{\sigma_1,\sigma_2}_{\sigma_1',\sigma_2'}=\frac{v_{\aleph_1+\sigma_1+\sigma_2}}{v_{\aleph_1+\sigma_2'}}\,\delta_{\sigma_1,\sigma_1'}\delta_{\sigma_2,\sigma_2'}$, which is the representation in the tensor product $B\otimes C$ of the element $G(x)$ introduced in the lemma. Using the twist relation on the Clebsch–Gordan coefficients we finally obtain the relation of the lemma. $\Box$

Let us denote by $B(x)$ the element $B(x)=q^{\frac{H_2^{2}}{2}}x^{H_2}$; we have the following result:

Lemma 5. For each $p\in\mathbb{N}$,
$$F(x)\,B(x)^{p}\,F(x)^{-1}=B(x)^{p}\sum_{n=0}^{+\infty}\frac{(q-q^{-1})^{n}}{[n]!}\,q^{\frac n2(H_1+H_2)}\,x^{n}\,g_{p,n}(xq^{H_2})\,J_+^{n}\otimes J_-^{n},\tag{87}$$
where $g_{p,n}(x)\in\mathbb{C}[x]$, and in particular $g_{0,n}(x)=\delta_{0,n}$ and $g_{1,n}(x)=(-1)^{n}q^{\frac{n(n+1)}{2}}x^{n}$. From this result we obtain the following identity on $F(x)$:
$$F_{12}(x)\,q^{\frac{H_2^{2}}{2}}x^{H_2}\,F_{12}(x)^{-1}=R^{-1}\,q^{\frac{H\otimes H}{2}}\,q^{\frac{H_2^{2}}{2}}x^{H_2}.\tag{88}$$

Proof. By applying the proof to the case where $p=0$ we will obtain a simpler proof of the fact that $F(x)^{-1}$ is given by Eq. (81). Let us introduce the rational functions $f_k(x),\tilde f_k(x)$ defined by:
$$f_k(x)=\frac{1}{\prod_{\nu=k}^{2k-1}(xq^{\nu}-x^{-1}q^{-\nu})},\qquad \tilde f_k(x)=\frac{1}{\prod_{\nu=1}^{k}(xq^{\nu}-x^{-1}q^{-\nu})};$$
from the relation $J_-^{k}\,\phi(H)=\phi(H+2k)\,J_-^{k}$ we obtain that relation (87) holds true with
$$g_p=\sum_{k=0}^{n}(-1)^{k}\begin{bmatrix}n\\ k\end{bmatrix}_q f_k(x)\,\tilde f_{n-k}(xq^{2k})\,q^{2pk^{2}}x^{2pk}$$

(we will omit the variable $n$ in the sequel). Let us prove that $g_p(x)$ is a polynomial in $x$. Let $R_{k,p}(x)=f_k(x)\tilde f_{n-k}(xq^{2k})x^{2pk}$; this rational function is equal to
$$R_{k,p}(x)=\frac{(x^{2}q^{2k}-q^{-2k})\,x^{n+2pk}}{\prod_{\nu=k}^{k+n}(x^{2}q^{\nu}-q^{-\nu})},$$

Rk;p (x) = Pk;p (x) +

n X l=0

Al;k;p x + Bl;k;p x2 qk+l q k l

with Pk;p (x) polynomial of degree less than or equal to 2pk n. Because Rk;p (x) has the same parity as n, we necessarely have Al;k;p = 0 if n is even,and Bl;k;p = 0 if n is odd. When n is odd we get Al;k;p = ( 1)l (q q 1 )1 n [n]! nl q [k l]q 2pk(k+l) , therefore n n X X ( 1)k+l (q q 1 )1 n n n gp (x) = Pk;p (x) + q 2pkl [k l]: k + l 1 k l xq x q k l q q k=0 k;l=0 The second term of the right-hand side is zero due to the antisymmetry in the exchange of k and l. As a result gp (x) is a polynomial of degree less than or equal to (2p 1)n. From Rk;p (x) = x2pk+n Qk;p (x), with Qk;p (x) rational function not having 0 as pole, we get by differentiating that Rk;p (x)(m) (0) = 0, for 0 m < n, and R0;p (x)(n) = n(n+1) n!( 1)n q 2 . The proof is similar if n is even. As a result we obtain that g0;n (x) = n(n+1) Æ0;n and g1;n (x) = ( 1)n q 2 xn . An immediate application of this result shows that

F (x)B (x)F (x) 1 = +1 X n(n+1) n (q q 1 )n 2n2 n ( 1)n q 2 (H1 H2 ) q 2 =f J+ J n gB (x) = [ n ]! n=0 =R

1

q

H H 2

B (x):

ut
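The two special values $g_{0,n}=\delta_{0,n}$ and $g_{1,n}(x)=(-1)^{n}q^{n(n+1)/2}x^{n}$ can be spot-checked numerically. The sketch below is an illustration only; it assumes the reading $f_k(x)=1/\prod_{\nu=k}^{2k-1}(xq^{\nu}-x^{-1}q^{-\nu})$, $\tilde f_k(x)=1/\prod_{\nu=1}^{k}(xq^{\nu}-x^{-1}q^{-\nu})$ and $g_{p,n}(x)=\sum_{k}(-1)^{k}\begin{bmatrix}n\\k\end{bmatrix}_q f_k(x)\tilde f_{n-k}(xq^{2k})q^{2pk^{2}}x^{2pk}$ with symmetric $q$-numbers, and the function names are hypothetical.

```python
def qnum(n, q):
    """Symmetric q-number [n] = (q^n - q^-n)/(q - q^-1)."""
    return (q ** n - q ** (-n)) / (q - 1 / q)

def qfactorial(n, q):
    result = 1.0
    for j in range(1, n + 1):
        result *= qnum(j, q)
    return result

def qbinomial(n, k, q):
    return qfactorial(n, q) / (qfactorial(k, q) * qfactorial(n - k, q))

def factor(nu, x, q):
    """The elementary factor x q^nu - x^-1 q^-nu."""
    return x * q ** nu - 1.0 / (x * q ** nu)

def f(k, x, q):
    result = 1.0
    for nu in range(k, 2 * k):          # nu = k, ..., 2k-1 (empty product for k = 0)
        result /= factor(nu, x, q)
    return result

def f_tilde(k, x, q):
    result = 1.0
    for nu in range(1, k + 1):          # nu = 1, ..., k
        result /= factor(nu, x, q)
    return result

def g(p, n, x, q):
    """g_{p,n}(x) as a finite alternating sum over k."""
    return sum((-1) ** k * qbinomial(n, k, q) * f(k, x, q)
               * f_tilde(n - k, x * q ** (2 * k), q)
               * q ** (2 * p * k * k) * x ** (2 * p * k)
               for k in range(n + 1))
```

For $n=2$, $p=0$ the cancellation reduces to the elementary identity $a_1+a_3=[2]\,a_2$ with $a_\nu=xq^{\nu}-x^{-1}q^{-\nu}$.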

We summarize the previous results in the following proposition, which gives a new expression for the coefficients $\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)$. Proposition 1.

vA1=4 N (D) (x; m2 ) m2 C B BC ( @ ; @ ) = 0 1 AD 1=4 N (A) (x; m ) A 2 1 vD 2

q 21 z B1 C2 D m ; 2

(89)

where again $2\aleph_0+1=\frac m2+z$, $2\aleph_1+1=-\frac m2+z$. As a result $\Lambda^{BC}_{AA}$ as well as $(\Lambda^{BC}_{AD})^{2}$ are Laurent polynomials in the variable $x$.


Proof. The universal matrix (x) can be simplified using the previous lemma, and we have:

(x) = q

H H 2

q

2 H1 2

x H1 :

Using this simple expression, it is then trivial to obtain the announced formula.

ut

Remark. The linear relation $F_{12}(x)\,q^{\frac{H_2^{2}}{2}}x^{H_2}=R^{-1}q^{\frac{H\otimes H}{2}}q^{\frac{H_2^{2}}{2}}x^{H_2}\,F_{12}(x)$ is a very important relation. We have computed [3], see also [23], using a generalisation of this identity, the exact form of the shifted cocycles $F:\mathcal{H}^{\star}\to U_q(\mathcal{G})^{\otimes2}$, where $\mathcal{H}$ is a Cartan subalgebra of $\mathcal{G}$.

3.3. Characters. Funcc (SLq (2; C )R ) = Pol(SUq0 (2)) Func (ANq (2)) is a multiplier ~ = Pol(SUq (2)) Pol(SUq (2)), Hopf algebra, let us define on the vector space D where Pol(SUq (2)) is the restricted dual of Pol(SUq (2)) and is the algebraic tensor product, a structure of multiplier Hopf algebra in duality withFuncc (SLq (2; C )R ). Let A B Ai A X j 2 Pol(SUq (2)) be the dual basis of g ij , i.e. X ij ; g kl = ÆAB Æli Æjk , the struc-

ture of algebra on Pol(SUq (2)) Pol(SUq (2)), is defined by the quantum double, i.e. X I J M M g 1 g2 = (90) IJ M g IJ ; M A B A X ij X kl = ÆAB Æli X kj (91) A B and a braided relation between X ij and g kl that can be computed from (10). By duality A the coproduct on X ij is defined by X m r A Bn Cs Ai i B C (X j ) = (92) B C j A n s Xm Xr: BC A A ~ and we have the trivial relation L() ij = The element L() 2 D are multipliers of D P AB ik B l B R jl X k . From the expression of an irreducible unitary representation of D = Uq (sl(2; C )R ), ~ as follows: ~ of D associated to @0 ; @1 we can construct a representation

i Ce ; ~ (X ij )Ce m = ÆBC Æm j XE D x B C p i Bi C BD ~ (g j ) e m = ep E B x D j m EC (@0 ; @1 ): B

For each defined by:

2

DE

Funcc (SLq (2; C )R ), let us define () to be the element of End(V )

() =

X

AB

~(X ij Bg kl )h((k ji E lk )); A

A

B

(93)


this sum involves just a finite number of non-zero terms. From the structure of ~ we A immediately see that the matrix of () in the Hilbert basis f e m ; A 2 12 Z+; m = A; : : : ; Ag has only a finite number of non-zero matrix elements. As a result we can define () = trV ( ( 1 ) ()), which defines the character associated to . We have a natural inclusion Fun(SLq (2; C )R ),!(Funcc (SLq (2; C )R )) given as usual by: if f 2 Fun(SLq (2; C )R ) we have (f )(a) = h(fa); 8a 2 Funcc (SLq (2; C )R ). 2 (Funcc(SLq (2; C )R )) , where the denotes the algebraic dual, we can Let endow this vector space with a structure of topological space with the weak * topology. I J IJ P IJ ns I m J r = Let (knm E rs ) = ns mr , we always have IJ mr (kn E s ), where the convergence always holds. As a result we have

=

X j

A;B;C

l C AB n

We will use the notation

AB jl im =

X

C

n B A C mt

j l C AB n

A

B

A 1t i m BC AA i (kj E l ):

n B A C mt

A 1t BC AA i

BC and BC A = AA . R L We have already defined . and . as right and left action of Uq (sl(2; C )R ) on Funcc (SLq (2; C )R ), let us define a right and left action of Uq (sl(2; C )R ) on (Funcc (SLq (2; C )R )) : let 2 (Funcc (SLq (2; C )R )) and x 2 Uq (sl(2; C )R ). We R L denote x . , (resp. x . ) to be the elements of (Funcc (SLq (2; C )R )) defined as follows: L L R R x . (a) = (S 1 (x) . a); x . (a) = (S (x) . a); (94) 8a 2 Funcc(SLq (2; C )R ): R;L R;L This definition is chosen in order that if f 2 Fun(SLq (2; C )R ), x . (f ) = (x . f ). From these definitions we obtain the following proposition: Proposition 2.

8x 2 Uq (sl(2; C )R ; L

R

(x . ) = (S (x))(); (x . ) = ()(S 1 (x)); 8 2 Funcc (SLq (2; C )R ); R

L

x . = S 2 (x) . ; (ad+ invariance); L

. = ! ; where ! are the eigenvalues of the Casimir operator. Proof (Easy application of the definitions).

ut

We can define on Funcc (SLq (2; C )R ) a convolution product as follows: let Funcc (SLq (2; C )R ), we define

=

X ()

(1) h(S ((2) ) ):

It is easy to show that the following proposition is verified:

;

2


Proposition 3.

8; 2 Funcc (SLq (2; C )R );

( ? ) = ()( ); (S (? )) = ()y :

(95) (96)

~. ~ is a unitary representation of D Proof. This is a consequence of the fact that

ut

being an ad+ invariant element and an eigenvector of the action of , this implies linear relations satisfied by the elements BC AA . We will need this linear system in the proof of the Plancherel formula.

Proposition 4 (Invariance under Uq (an(2))). The element of (Funcc (SLq (2; C )R )) is invariant under the adjoint action of Uq (an(2)), which implies that AB CC satisfy the linear equations: X

AB

X

AB

MC B BD AA A D R A C R BK AA M K B

C A R MK T CM B AK T

AM T = CD B MA T C D R

(97)

for all K; T; C; M; D; R 2 12 Z+.

Proof. The right action of Uq (an(2)) on is given by the formula:

Ai B m Cp R X g n . ( AB jl im (kj E l )) = A;B =

X

ABMC

u d B AB jl im C M l

m C M (S 1 (Ak )v Ck i Ak p M b n j u E d ); B vb

(98)

whereas the left action is

Ai B m Cp L X g n . ( AB jl im (kj E l )) = A;B =

X

ABMC

c p B AB jl im M C l

m M C Ai M a B a n (kj E c ):

The element is therefore ad+ invariant if the following relation holds: X j

ABD

l D AB x

=

X

ABD

n i C A u c C M

x B A D mi

c p B MC l

m MC B a n

R f C A BD e R t j AA = B m CM i p R f AC l B t a AC e R ju

j l D AB x

x B A BD D m i AA :

(99)


The left-hand side of this equation is nicely described by the following picture 2 6 6 6 6 X BD A 6 6 6 ABD 6 6 4

3 7 7 7 7 7 7 7 7 7 5

;

V

which can be written using the definitions of 6j symbols and the unitarity relations on Clebsch–Gordan coefficients as X

BD A AB DKT

MC B A D R

C A R MK T

AM T CD B

2

3

6 6 6 6 6 6 4

7 7 7 7 7 7 5

V

The right-hand side is represented by the picture: 2 6 6 6 6 X BD A 6 6 6 ABD 6 6 4

which is equal after similar manipulations to X

ABDKT

BK A

A C R MK B

CM B AK T

2

3

6 6 6 6 6 6 4

7 7 7 7 7 7 5

As a result we obtain the announced Eq. (97).

ut

MA T C D R

V

3 7 7 7 7 7 7 7 7 7 5 V

;


We will need the explicit form of this system of equations for $C=\frac12$. Using the formulas for the 6j symbols with one spin fixed to $\frac12$ (143), the system is written as: $\forall(R,M,T)\in\frac12\mathbb{Z}^+\times\frac12\mathbb{Z}^+\times\frac12\mathbb{Z}^+$,

[R + M + T + 52 ] M + 21 T + 12 [R + T M + 32 ] M 12 T + 12 R+ 1 + R+ 1 2 2 [2R + 2] [2R + 2] 1 3 1 1 1 1 [T + M R 2 ] M 2 T 2 [M + R T + 2 ] M + 2 T 2 R+ 1 R+ 1 + = 2 2 [2R + 2] [2R + 2] [R + M T 21 ] M 12 T + 12 [T + M R + 32 ] M + 21 T + 12 + + = Z (M; T; R) R 1 R 1 2 2 [2R] [2R]

Z (M; T; R)

[R + T M + [2R]

1 1 2 ] M + 2 T

R

1 2

1 2

[R + M + T + 21 ] M + R [2R]

1 2 1 2

T

1 2

; (100)

where

Z (M; R; T ) = 1 [R + T + M + 23 ] 2 [R + T =

M + 12 ] 2 [M + R T + 21 ] 2 [M + T 1 [2T ] 2 1

1

R + 21 ] 2 1

:

The coefficients $\Lambda^{BC}_{AD}$ satisfy additional linear relations, which will be used in the proof of the Plancherel Theorem. Lemma 6 (A useful identity).

SB;@0

6 6 X6 6 = 6 C;@2 6 4

3 7 7 7 7 7 7 5

= ( 1)2B

[(2B + 1)(2@0 + 1)] Y (A; @0 ; @1 ); [2@0 + 1]

IRF

and the same formula with the undercrossing exchanged with the overcrossing. Proof. The proof is obtained by an easy induction on $B$. It is clear that this equation is true for $B=0$. Furthermore, using the definition (145) it is easy to verify that it is also true for $B=\frac12$. Now, we can establish a recursion relation on the half-integer $B$. Indeed, we have, using the graphical rules associated to the continued 6j symbols:

SB;@0 SC;@0

6 6 X6 6 = 6 DE 6 @2 @3 4

3 7 7 7 7 7 7 5 IRF


@@ Y B; C; F @

2

3

2

3

6 6 6 X6 6 = 6 6 DE 6 @2 @36 4

7 7 7 7 7 7 7 7 7 5

6 6 6 X 6 6 = 6 6 DEF 6 2 3 6 4

7 7 7 7 7 7 7 7 7 5

2

3

6 6 X6 6 = 6 F C 26 4

7 7 7 7 7 7 5

IRF

(

)=

X

F

=

IRF

Y (B; C; F )SF @0 :

IRF

In particular by taking $C=\frac12$, we have $-S_{B,\aleph_0}\,(q^{(2\aleph_0+1)}+q^{-(2\aleph_0+1)})=S_{B+\frac12,\aleph_0}+S_{B-\frac12,\aleph_0}$. As a result, because $(-1)^{2B}\frac{[(2B+1)(2\aleph_0+1)]}{[2\aleph_0+1]}\,Y(A,\aleph_0,\aleph_1)$ satisfies the same functional equation, it is equal to $S_{B,\aleph_0}$. $\Box$

Proposition 5 (Action of Casimir elements). The coefficients $\Lambda^{BC}_{AD}(\aleph_0,\aleph_1)$ satisfy the following linear relations:

MN

MN AD

BI M NA C

for all A; B; C; D; I

1=2 1=2 B I M [d ]( vM 1 = !(I )[dC ]BC ( vB )1 ) N AD ND C v1=2 v1=2

N

C

(101)

$\in\frac12\mathbb{Z}^+$, with the definition
$$\omega_+(I)=\frac{[(2I+1)(2\aleph_0+1)]}{[2\aleph_0+1]},\qquad \omega_-(I)=\frac{[(2I+1)(2\aleph_1+1)]}{[2\aleph_1+1]}.$$
This system of linear equations, when $I=\frac12$, is equivalent to the fact that $\chi_\Pi$ is an eigenvector of the right action of $\Omega_\pm$ with eigenvalues $\omega_\pm$. Proof. From the relations satisfied by the generalized 6j symbols, we have the following chain of equations:

Æ

3 7 7 7 7 7 7 7 7 7 5 IRF


Æ Æ I @ @

2

3

6 6 X 6 6 = 6 MN 6 @2 @3 @4 4

7 7 7 7 7 7 5

2

3

6 6 X 6 6 = 6 N @2 @3 6 4

7 7 7 7 7 7 5

2

= ( 1)(2I )

IRF

IRF

[(2 + 1)(2 0 + 1)] 6 4 [(2 0 + 1)]

3 7 5

; IRF

where we have used the previous lemma to obtain the last equation. If we replace the graphics by their explicit expressions in terms of 6j symbols and complex continuation of 6j symbols, the left-hand side of this chain of equations is the opposite of the left-hand side of (101), and similarly for the right-hand side. The equation for $\omega_-(I)$ is obtained similarly, using the other equivalent definition (75) of $\Lambda^{BC}_{AD}$. It is an easy computation to show that indeed $\chi_\Pi$ is an eigenvector of the right action of $\Omega_\pm$ with eigenvalues $\omega_\pm$ if and only if these linear relations are satisfied with $A=D$ and $I=\frac12$. We will need the explicit form of these linear equations when $I=\frac12$; the system is equivalent to:

A + 1][A + B + K + 2]q(K B)BA+ 2 K + 2 B + 1 K 12 [A + K B ][A + B K + 1]q (K +B +1) 2 1

[B + K

1

A

B 1 K 21 + [A + K + B + 1][B + K A]q (K B ) A 2 B 1 K + 12 [A + B K ][A + K B + 1]q (K +B +1) 2 = ! [2K + 1][2B + 1]BK A :

(102)

A

Let A 2 12 Z+, we will define A = f(B; C ) 2 ( 21 Z+)2 ; Y (A; B; C ) = 1g, and we will denote by @ Ad+ ; @ Ad ; @ Aa the “boundaries” of A (d is for diagonal, and a is for antidiagonal) defined by @ Ad = f(B; C ) 2 A ; B C = Ag, @ Aa = f(B; C ) 2 A ; B + C = Ag. An inspection of the system of linear equations (102) shows that if we fix A 2 12 Z+, and 0AA the solution BC A for (B; C ) 2 A of (102), if it exists, is unique. The idea of the proof is very simple, it uses the fact that we can always eliminate one of the B + K + 2 in the system (102) by taking the linear combination of the indeterminate A 2 equation containing !+ and the equation containing ! . a As a result we can compute BC A on the antidiagonal @ A by eliminating the point B + 21 C + 12 B 21 C 21 A . The point A does not contribute because its coefficient vanishes


(we also assumed that BC A = 0 for Y (A; B; C ) = 0 which is then the case). As a result we obtain a linear recurrent equation which is written as:

[A + C

B ][A + B

C + 1][2C + 1]BA+ 2 C 1

1 2

C ][A + C B + 1][2B + 1]BA 2 C + 2 = ([2C + 1][2B + 1](!+ q C +B ! q C B )BC A ; [A + B

1

1

which then defines uniquely the coefficients $\Lambda^{BC}_{A}$ on $\partial\Delta^{a}_{A}$ in terms of $\omega_+,\omega_-$ and $\Lambda^{0A}_{A}$, because the coefficient of $\Lambda^{-\frac12\,A+\frac12}_{A}$ vanishes. If we eliminate $\Lambda^{B-\frac12\,C-\frac12}_{A}$, we obtain a linear system which is triangular with respect to the gradation defined by $B+C$. Because the $\Lambda^{BC}_{A}$ are known on the boundary $\partial\Delta^{a}_{A}$, this triangular system defines uniquely the $\Lambda^{BC}_{A}$. This triangular system is an efficient tool to compute $\Lambda^{BC}_{A}$.

4. Plancherel Theorem

In this section we give a proof of the Plancherel formula for $SL_q(2,\mathbb{C})_{\mathbb{R}}$. Let us denote by $\Pi(m,\rho)$ the element of $\hat D_p$ associated to $m\in\mathbb{N}$, $\rho\in[-\frac{2\pi}{\hbar},\frac{2\pi}{\hbar}]=I_\hbar$; in this case we have $2\aleph_0+1=\frac12(m+i\rho)$ and $2\aleph_1+1=\frac12(-m+i\rho)$. Note that we have doubled the interval for $m=0$, and that in this case we have $\Pi(0,\rho)=\Pi(0,-\rho)$ for $\rho\in I_\hbar$. Here we will often denote abusively by $\Lambda^{BC}_{A}(m,\rho)$ the coefficients defining this principal series of representations, i.e. $\Lambda^{BC}_{A}(m,\rho)=\Lambda^{BC}_{A}(\aleph_0,\aleph_1)$ and $\chi_{(m,\rho)}=\chi_{\Pi(m,\rho)}$. Let us define the Plancherel density

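$$P(m,\rho)=\frac{\hbar}{2\pi}\Big(1-\frac12\delta_{m,0}\Big)\big(\cosh(m\hbar)-\cos(\rho\hbar)\big)=-\frac{\hbar}{4\pi}\,(q-q^{-1})^{2}\Big(1-\frac12\delta_{m,0}\Big)[2\aleph_0+1][2\aleph_1+1],\qquad\forall m\in\mathbb{N},\ \rho\in I_\hbar.$$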

^ p , denoted We can equivalently see this Plancherel density as a measure on the space D dP (). We will prove the following theorem: Theorem 7 (1. Plancherel Formula for SLq (2; C )R ). Let be any element of Funcc (SLq (2; C )R ), we have: +1 Z X

~

m=0 I

(m; )()P (m; )d = ():

(103)

Note that this series contains only a finite number of non-zero terms. The proof of this theorem consists in three lemmas which are interesting in themselves. Lemma 8. Assume that (f (m; :))m2N is a family of C 1 even functions on I~ such that the Plancherel formula holds, i.e. +1 Z X

m=0 I

~

(m; )()f (m; )d = ()

then f (m; ) = P (m; ).

8 2 Funcc(SLq (2; C )R );

(104)


E. Buffenoir, Ph. Roche

Proof. We first compute h( (m; ) S

CDF

Br n ) E s ); this is equal to:

X C

h(

A

1 (k t

A D B k uj S 1 (k tn ) E pl E rs Cj Dl F x A

x D C C 1 i DF F p i u C (m; ))

ÆAC un t B x D C C 1 i DF Æj ÆBD 1 ps [dB ]Ærl Cj Dl F x F p i u C (m; ) [ d ] A CDF [dB ] X t r F x B A B 1 p BF (m; ): = s A F pn AB x [dA ] F =

X

Assume that the Plancherel formula holds with the densities f (m; :), the previous computation implies that the Plancherel formula is equivalent to: +1 Z X [dB ] X t r AB [dA ] m=0 I F

F F 1 x y B A BF (m; )f (m; )d = A 1 t Æ y F s n A n B;0 x

~

AB Therefore if we multiply both sides of this equation by P12 R and if we take the trace A B on V V we obtain that: +1 Z X [dB ][dF ] X

F

[dA ]

m=0

vA vB 1 BF A (m; )f (m; )( v ) 2 d = [dA ]ÆB;0 : F I~

(105)

Using the relation X [dF ] vA vB 21 ~ BF ) = Y (A; A (m; )(

F [dA ]

vF

m [(2B + 1) (m2+i) ] ) 2 [ (m+i) ] 2

proved in the previous lemma, the system of equations (105) is equivalent to:

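\[
\sum_{m \,\mid\, A-\frac{m}{2}\in\mathbb{N}} \int_{I_\hbar} \frac{\left[(2B+1)\frac{(m+i\rho)}{2}\right]}{\left[\frac{(m+i\rho)}{2}\right]}\, f(m,\rho)\, d\rho = [d_A]\,\delta_{B,0}. \qquad (106)
\]

We can develop $f(m,\cdot)$ in Fourier series and write $f(m,\rho)=\sum_{p\in\mathbb{Z}} a^{m}_{p}\, q^{\frac{ip\rho}{2}}$. Using the relation:

\[
\frac{\left[(2B+1)\frac{(m+i\rho)}{2}\right]}{\left[\frac{(m+i\rho)}{2}\right]} = \sum_{k=0}^{2B} q^{(k-B)(m+i\rho)},
\]

the system (106) is equivalent to a linear system on $a^{m}_{p}$ which is trivial to solve and whose unique non-zero solution, under the assumption that $f(0,\rho)$ is an even function, gives $f(m,\rho)=P(m,\rho)$. In particular, only the Fourier modes $(0,1,-1)$ of $f(m,\cdot)$ are non-zero. $\square$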
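The relation between the ratio of q-numbers and the finite sum is a geometric series, and it can be checked numerically. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, $q$ is taken real and positive, and `rho` stands for the continuous label of the principal series.

```python
def qnum(x, q):
    """Symmetric q-number [x] = (q^x - q^-x) / (q - q^-1); x may be complex."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def ratio_side(two_b, m, rho, q):
    """[(2B+1)(m + i rho)/2] / [(m + i rho)/2], with two_b = 2B a non-negative integer."""
    x = (m + 1j * rho) / 2
    return qnum((two_b + 1) * x, q) / qnum(x, q)


def sum_side(two_b, m, rho, q):
    """Sum over k = 0..2B of q^((k - B)(m + i rho))."""
    b = two_b / 2
    return sum(q ** ((k - b) * (m + 1j * rho)) for k in range(two_b + 1))


def max_mismatch(q=0.8):
    """Largest |ratio_side - sum_side| over a small grid of sample values."""
    worst = 0.0
    for two_b in range(0, 6):
        for m in range(1, 4):          # m = 0, rho = 0 is avoided: [0] = 0 there
            for rho in (0.3, 1.1, 2.5):
                diff = abs(ratio_side(two_b, m, rho, q) - sum_side(two_b, m, rho, q))
                worst = max(worst, diff)
    return worst
```

Calling `max_mismatch()` returns a value at the level of floating-point rounding, which is what one expects if both sides agree.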


It remains to show that the system of equations +1 Z X [dB ] X t r A B [dA ] m=0 I F

F F 1 x y B A BF (m; )P (m; )d = A 1 t Æ y F s n A n B;0 x

~

(107)

is satisfied. This is trivially the case for B = 0, because if we put B = 0 in the system (105) , we get: +1 Z X

m=0

~

I

0AA (m; )f (m; )d = [dA ]

which is trivially equivalent to (107) with B = 0. Therefore, the proof of the Plancherel formula is reduced to the proof of (107) when B 6= 0, which is equivalent to: +1 Z X

m=0

~

I

BA C (m; )f (m; )d = 0; B 6= 0:

(108)

Lemma 9. In order to show that Eqs. (108) are true, it is sufficient to show that they hold for any (B; C ) 2 @ Ad ; @ Aa , where A is any half-integer and B 6= 0. Proof. The idea is to use the linear system coming from the invariance under Uq (an(2)). BC = P+1 R BC (m; )f (m; )d, and assume that Let us define A m=0 I A

~

ABC = 0;

8A 2 12 Z+;

(B; C ) 2 @ Ad ; @ Aa ;

B 6= 0:

BC . It is triangular with respect By linearity, the system (100) is trivially satisfied by A BC = 0 for all (B; C ) 2 to the gradation A + B + C , therefore if we assume that A d 1 + a @ A ; @ A , B 6= 0 we deduce, by an easy induction, that ABC +1 = 0; 8B; C 2 2 Z , B 6= 0 except for the special case B = 0; C = A + 1. Indeed in this case the lin1A+1 BC ear system expresses A +1 in terms of linear combination of A ; B + C < A + 0A+1 BC 0 A 3; A+1 ; B + C < A + 2 containing A+1 and A which are non-zero. The linear system (100) applied to this case gives, using the induction assumption, [2A + 4] 1A+1 [2A + 2] 0A [2A + 2] 0A+1 = [2A + 3] A+1 [2A + 1] A [2A + 3] A+1 [2A + 2] [2A + 2] = [2A + 1] [2A + 3] = 0: [2A + 1] [2A + 3] This concludes the proof of the lemma.

ut

(109)


Lemma 10. The explicit expressions for BC A on the boundaries of A are: B m ( 1)2B X A + m2 A m2 BC ik ~ A (m; ) = Y (A; ) 2A q ; (@ a ); B + k B k 2 q q 2B q k= B m ~ Y (A; 2 ) BC A (m; ) = 2C +1 2B q B X A m2 + B + k A + m2 + B k q ik ; (@ d ); B k B + k q q k= B m 2 A i m ( 1) q 2 BC A (m; ) = Y~ (A; 2 ) 2B +1 2C q C X A m2 + C k A + m2 + C + k ik q ; (@ d+): C k C +k q q k= C (110) Equations (107) are satisfied on @ Ad ; @ Aa . Proof. In order to show that BC A are given by the expressions (110) we have to integrate the linear equations (102). This is trivial, because we know that these systems admit a unique solution and it is then sufficient to show that the rigth-hand side of (110) satisfy these linear equations, which is easy to show. We will first show that Eqs. (107) are satisfied on @ Aa : a From the expression of BC odd integer then BC A1 on @ A , we see that if 2BR is anBC A has just Fourier modes in Z + 2 . As a result the integration I A (m; )P (m; )d = 0, and the Plancherel formula is trivially satisfied. The only non-trivial check is therefore for 2B even. From the explicit expression of BC A (m; ) we obtain that

~

Z

2A BC (m; )P (m; )d = 2B q I A m A + m2 A m2 = Y~ (A; )( (q m + q m ) 2 B q B q A m A+ m A A+ m

( 1)2B

~

2

B+1 q B

2

1 q

B

2

1 q

m 2

B+1 q

):

We have to consider two cases. The first one is when $A$ is an integer: the constraint $\tilde{Y}(A,\frac{m}{2})=1$ imposes that we put $m=2s$, and the Plancherel formula (108) is then equivalent to the following identity:

\[
\sum_{s=0}^{A}\left\{ \binom{A+s}{B}_q\binom{A-s}{B}_q\,(q^{2s}+q^{-2s}) - \left( \binom{A+s}{B+1}_q\binom{A-s}{B-1}_q + \binom{A+s}{B-1}_q\binom{A-s}{B+1}_q \right)\right\}
= \binom{A}{B}_q\binom{A}{B}_q - \binom{A}{B+1}_q\binom{A}{B-1}_q. \qquad (111)
\]


The second case is when $A=l+\frac12$, with $l\in\mathbb{Z}$: the constraint $\tilde{Y}(A,\frac{m}{2})=1$ imposes that we put $m=2s+1$ with $0\le s\le l$, and the Plancherel formula (108) is then equivalent to the following identity:

\[
\sum_{s=0}^{l}\left\{ \binom{l+s+1}{B}_q\binom{l-s}{B}_q\,(q^{2s+1}+q^{-2s-1}) - \left( \binom{l+s+1}{B+1}_q\binom{l-s}{B-1}_q + \binom{l+s+1}{B-1}_q\binom{l-s}{B+1}_q \right)\right\} = 0.
\]

We will now show that these identities on q-binomials hold, and will only give an explicit proof of the first identity; the second case uses the same method of proof. We will show the two identities, for $B\neq 0$,

\[
\sum_{s=0}^{A}\left( \binom{A+s}{B}_q\binom{A-s}{B}_q\, q^{-2s} - \binom{A+s}{B+1}_q\binom{A-s}{B-1}_q \right) = -\,q^{A-B+1}\binom{A}{B+1}_q\binom{A+1}{B}_q, \qquad (112)
\]

\[
\sum_{s=0}^{A}\left( \binom{A+s}{B}_q\binom{A-s}{B}_q\, q^{2s} - \binom{A-s}{B+1}_q\binom{A+s}{B-1}_q \right) = q^{A-B}\binom{A}{B}_q\binom{A+1}{B+1}_q. \qquad (113)
\]
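These q-binomial identities can be tested with exact rational arithmetic before reading the proof. The sketch below is an illustration only: the helper names are hypothetical, the signs of the exponents are the ones read in the docstrings (an assumption of this sketch), and $q$ is set to a rational number so that every comparison is exact.

```python
from fractions import Fraction


def qnum(n, q):
    """Symmetric q-number [n] = (q^n - q^-n) / (q - q^-1) for an integer n."""
    return (q**n - q**(-n)) / (q - 1 / q)


def qfactorial(n, q):
    """[n]! = [1][2]...[n], with [0]! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= qnum(k, q)
    return result


def qbinom(n, p, q):
    """q-binomial [n]!/([p]![n-p]!), set to 0 outside 0 <= p <= n."""
    if p < 0 or p > n:
        return Fraction(0)
    return qfactorial(n, q) / (qfactorial(p, q) * qfactorial(n - p, q))


def integer_case_holds(A, B, q):
    """Check the two partial sums and their total for integer A and B != 0.

    first:  sum_s ( [A+s;B][A-s;B] q^(-2s) - [A+s;B+1][A-s;B-1] ) = -q^(A-B+1) [A;B+1][A+1;B]
    second: sum_s ( [A+s;B][A-s;B] q^(+2s) - [A-s;B+1][A+s;B-1] ) =  q^(A-B)   [A;B][A+1;B+1]
    total:  first + second = [A;B]^2 - [A;B+1][A;B-1]
    """
    b = lambda n, p: qbinom(n, p, q)
    first = sum(b(A + s, B) * b(A - s, B) * q**(-2 * s)
                - b(A + s, B + 1) * b(A - s, B - 1) for s in range(A + 1))
    second = sum(b(A + s, B) * b(A - s, B) * q**(2 * s)
                 - b(A - s, B + 1) * b(A + s, B - 1) for s in range(A + 1))
    return (first == -q**(A - B + 1) * b(A, B + 1) * b(A + 1, B)
            and second == q**(A - B) * b(A, B) * b(A + 1, B + 1)
            and first + second == b(A, B)**2 - b(A, B + 1) * b(A, B - 1))


def half_integer_case_holds(l, B, q):
    """Check that the sum for A = l + 1/2, m = 2s + 1 vanishes."""
    b = lambda n, p: qbinom(n, p, q)
    total = sum(b(l + s + 1, B) * b(l - s, B) * (q**(2 * s + 1) + q**(-2 * s - 1))
                - b(l + s + 1, B + 1) * b(l - s, B - 1)
                - b(l + s + 1, B - 1) * b(l - s, B + 1) for s in range(l + 1))
    return total == 0
```

For instance, `integer_case_holds(3, 1, Fraction(3, 2))` compares both partial sums and their total for $A=3$, $B=1$. Such a check does not replace the algebraic proof, but it is a convenient guard against sign errors in the exponents.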

If we take the sum of these two equations, the left-hand side is the left-hand side of (111) and the right-hand side is, after the use of two q-analogues of Pascal relations, equal to the right-hand side of (111). Let us first recall that if $x, y$ are two elements satisfying the relation $yx = q^{2}xy$, then we have the identity:

\[
(x+y)^{n} = \sum_{p=0}^{n} \binom{n}{p}_q\, q^{p(n-p)}\, x^{p} y^{n-p}. \qquad (114)
\]

Let us consider the $\mathbb{C}$-algebra generated by $x, y, a, b$ and satisfying the relations $yx = q^{2}xy$, $ba = q^{2}ab$, $ax = xa$, $ay = ya$, $bx = xb$, $by = yb$. A basis of this algebra is $(x^{m}y^{n}a^{r}b^{s})_{m,n,r,s\in\mathbb{N}}$. If $P=\sum_{m,n,r,s\in\mathbb{N}} P_{mnrs}\, x^{m}y^{n}a^{r}b^{s}$, we will denote $C_{m,r}(P)=\sum_{n,s\in\mathbb{N}} P_{mnrs}$. From the defining relations, we have:

\[
\binom{A+s}{B}_q\binom{A-s}{B}_q = q^{-2(A-B)B}\, C_{B,B}\!\left((x+y)^{A+s}(a+b)^{A-s}\right),
\]

from which we deduce that:

\[
\sum_{s=0}^{A}\binom{A+s}{B}_q\binom{A-s}{B}_q\, q^{-2s}
= q^{-2(A-B)B}\, C_{B,B}\!\left((x+y)^{A}(a+b)^{A}\sum_{s=0}^{A}\left(\frac{x+y}{a+b}\,q^{-2}\right)^{s}\right)
= q^{-2(A-B)B}\, C_{B,B}\!\left((x+y)^{A}\,\frac{(a+b)^{A+1}-(q^{-2}(x+y))^{A+1}}{a+b-q^{-2}(x+y)}\right),
\]

where the reader is urged to check that, because $a+b$ commutes with $x+y$, these expressions are well defined. A similar reasoning implies that


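\[
\sum_{s=0}^{A}\binom{A+s}{B+1}_q\binom{A-s}{B-1}_q
= q^{-2(A-B)B+2}\, C_{B,B}\!\left(x^{-1}a\,(x+y)^{A}\,\frac{(a+b)^{A+1}-(q^{-2}(x+y))^{A+1}}{a+b-q^{-2}(x+y)}\right).
\]

(Note that we have localized the algebra in $x$, with a trivial extension of the definition of $C_{m,r}$.) Therefore we have

\[
\sum_{s=0}^{A}\left\{ q^{-2s}\binom{A+s}{B}_q\binom{A-s}{B}_q - \binom{A+s}{B+1}_q\binom{A-s}{B-1}_q\right\}
= q^{-2(A-B)B}\, C_{B,B}\!\left(-x^{-1}q^{2}\,(a-q^{-2}x)\,(x+y)^{A}\,\frac{(a+b)^{A+1}-(q^{-2}(x+y))^{A+1}}{a+b-q^{-2}(x+y)}\right).
\]

We now use the following trick:

\[
C_{B,B}\!\left(x^{-1}(b-q^{-2}y)\,x^{m}y^{n}a^{r}b^{s}\right)=0,
\]

which is trivial to show, and obtain

\[
\begin{aligned}
\sum_{s=0}^{A}\left\{ q^{-2s}\binom{A+s}{B}_q\binom{A-s}{B}_q - \binom{A+s}{B+1}_q\binom{A-s}{B-1}_q\right\}
&= q^{-2(A-B)B}\, C_{B,B}\!\left(-x^{-1}q^{2}\,(a-q^{-2}x+b-q^{-2}y)\,(x+y)^{A}\,\frac{(a+b)^{A+1}-(q^{-2}(x+y))^{A+1}}{a+b-q^{-2}(x+y)}\right)\\
&= q^{-2(A-B)B}\, C_{B,B}\!\left(-x^{-1}q^{2}\,(x+y)^{A}\left((a+b)^{A+1}-(q^{-2}(x+y))^{A+1}\right)\right)\\
&= -\,q^{A-B+1}\binom{A}{B+1}_q\binom{A+1}{B}_q,
\end{aligned}
\]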

which ends the proof of the first relation (112). The second relation (113) is proved with exactly the same method. The proofs of (107) on @ Ad are similar and we leave them t to the reader. u Let 2 Funcc (SLq (2; C )R ). It is trivial to show that ( S (? )) = h(? ). As a result we obtain using ( S (? )) = () ()y and the Plancherel formula that: Theorem 11 (2. Plancherel Formula). Let we have:

jjjj2L

2

Z

=

D^p

tr( (

1

be any element of Funcc(SLq (2; C )R ), ) () ()y )dP ( ):

(115)

V admits a Hilbert basis E = fAe m ; A 2 12 Z+; m = A; : : : ; Ag, and denote by P LqHS (V ) the Hilbert space LqHSP (V ) = f 2 C EE ; x;y2E xx1 j (x; y )j2 < +1g, with Hermitian product: h; i = x;y2E xx1 (x; y ) (x; y ); 8; 2 LqHS (V ).


When q = 1, = 1 and LqHS (V ) is the Hilbert space of Hilbert–Schmidt bounded operator of V . If 2 Funcc (SLq (2; C )R ), we can associate to () an element () 2 LqHS defined by ()(x; y) = hxj()(y)i ; x; y 2 E . We have

h (); ()i = tr((

1

) () ()y ):

From this last proposition we have defined an isometric mapping T from Funcc (SLq (2; C )R ) with the norm jj:jjL2 to the Hilbert space H defined as the direct integral

H=

Z

D^p

LqHS (V )dP ();

with the following definition: T () = (). As a result, T being uniformly continuous we can uniquely extend T to an isometric R map T # from L2 (SLq (2; C )R ) to D^ p LqHS (V )dP ( ). Theorem 12 (Plancherel Theorem). The Fourier transform T # : L2 (SLq (2; C )R ) ! R D^p LqHS (V )dP () is a unitary operator.

Proof. Because $T^{\sharp}$ is an isometry, it is an injection. The only non-trivial point is to show that $T^{\sharp}$ is a surjection. Because $T^{\sharp}$ is an extension of $T$, it is sufficient to show that $T(\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}}))$ is dense in $H$, which is equivalent to showing that

\[
T(\mathrm{Fun}_{cc}(SL_q(2,\mathbb{C})_{\mathbb{R}}))^{\perp}=\{0\}.
\]

Let

2 H and assume that

R

D^p h ; ()i dP () = 0 for any 2 Funcc(SLq (2; C )R )):

A B From the expression of ~ (X ij g kl ) in the orthonormal basis E , this orthogonality condition is equivalent to the condition

8A; B; C; D 2

1 + Z ; 2

Z

D^p

(Ae i ; De j )BC AD ( )dP ( ) = 0:

(116)

2

L2 (I~ ), with

1 + Z ; 2

(117)

As a result it will be sufficient to show that if (fm )minf (A;D) (Y~ )(A; m ) = (Y~ )(D; m ) = 1, is such that 2

XZ

m

2

~

I

fm ()BC AD (m; )dP (m; ) = 0; 8B; C 2

then fm = 0; 8m. Let us prove this result. If we assume that XZ

m

I

~

fm ()BC AD (m; )dP (m; ) = 0; 8B; C;

we obtain using the linear equations (101) that XZ

m

~

I

!+ (I )! (J )fm ()BC AD (m; )dP (m; ) = 0; 8B; C; I; J:


This can be written as XZ

r

s + m + i)][ (+m + i)]fm ()BC AD (m; )d = 0; 8r; s 2 Z :

[ ( I 2

m

~

2

(118)

R in We will first show that this implies that I fm ()BC AD (m; )q 2 d = 0; 8n 2 Z. This is done by an induction on m0 the largest integer appearing in the finite sum (117). Indeed from Eq. (118), we obtain the equality for every integer p; r; s, Z

~

i fm0 ()BC AD (m0 ; )q 2 d = n

IZ =

~

~

I

fm0 ()BC AD (m0 ; )(q

pm0 2

(119)

r

s n m0 + i)][ (+m0 + i)] + qi 2 )d +

[ ( 2

2

(120)

X Z

~

m<m0 I

fm()BC AD (m; )q

pm0 2

r

[ ( 2

s m + i)][ (+m + i)]d:

(121)

2

As a result we get: Z

j

~

I

i fm0 ()BC AD (m0 ; )q 2 dj n

jjBC AD (m0 ; :)fm jjL jjgr;s jjL 0

2

2

+

X

m<m0

jjBC AD (m; :)fm jjL jjhr;s;m jjL ; 2

2

(122)

pm0 n where gr;s ; hr;s;m are the functions: gr;s () = q 2 [ r2 ( m0 +i)][ 2s (+m0 +i)]+q i 2 pm0 and hr;s;m () = q 2 [ r2 ( m + i)][ 2s (+m + i)]. Let us define a sequence rp ; sp by rp sp = n; rp + sp = p, (with p odd or even depending on the parity of n), then it is easily shown that grp ;sp ; hrp ;sp ;m are uniformally convergent toward 0 when p goes to infinity. As Ra result, by fixing r = rp ; s = sp in (122) and taking the limit p ! +1, we in BC obtain that I fm0 ()BC AD (m0 ; )q 2 d = 0. This implies that fm0 AD (m0 ; :) = 0 as a L2 function. There always exist B; C such that BC AD (m0 ; ) is a non constant 2 it has only a finite function, and because BC AD (m0 ; ) is a trigonometric polynomial, number of zeros. As a result we obtain that fm0 = 0 as a L2 function. A trivial induction on m0 shows that fm =R 0 for all m 6= 0. The induction ends up when m = 0, in this case (118) implies that I [i 2r ][i 2s ]f0 ()BC AD (0; )d = 0, from which we deduce t that f0 () = 0 because we assumed f0 () = f0 ( ). This concludes the proof. u

~

~

5. Conclusion The main theme of this article was the recognition of the central role played by generalized 6j symbols in the harmonic analysis of $SL_q(2,\mathbb{C})_{\mathbb{R}}$. There are many ways in which our work can be pursued. From a purely mathematical point of view, it would be nice to generalize our work to the quantization of other complex Lie algebras. This should be quite a hard and technical problem, involving the precise study of generalized 6j symbols of compact quantum groups, but the main structures developed in this article should require few modifications, apart from the multiplicities, because many of the proofs were graphical.


It would be interesting to study the precise image of different classes of functions on $SL_q(2,\mathbb{C})_{\mathbb{R}}$ by $T^{\sharp}$. In particular there should exist a whole class of theorems, q-analogues of Paley–Wiener theorems. We also do not know precisely how to recover the standard characters of the unitary series of $SL(2,\mathbb{C})_{\mathbb{R}}$ from the classical limit of the characters of $SL_q(2,\mathbb{C})_{\mathbb{R}}$. Our work certainly has implications for the physics of low dimensional field theory, where non-compact groups enter in an essential way. This is the case for Chern–Simons theory with the group $SL(2,\mathbb{C})_{\mathbb{R}}$, which is related to 2+1 and 3+1 quantum gravity with positive cosmological constant in the canonical quantization program. The program of combinatorial quantization can certainly be extended to this case, and representations of the moduli algebra [2] obtained by acting on wave packets of unitary representations. This will be a difficult task because it will necessarily amount to studying the tensor product of unitary representations of $SL_q(2,\mathbb{C})_{\mathbb{R}}$ and its decomposition in terms of integrals of unitary representations, which will be studied in a second paper [10]. Because complex continuation of 6j symbols played an important role in the study of special points in the strong coupling regime of Liouville theory, it is natural to suspect that harmonic analysis of $SL_q(2,\mathbb{C})_{\mathbb{R}}$ is hidden in this problem and that the study of the tensor product of unitary representations may be an important tool in the study of the algebra of fields of Liouville theory [14].

Acknowledgements. We warmly thank O. Babelon, J. L. Gervais, E. Ragoucy, J. Schnittger, J. Teschner for discussions on this work and on related topics.

6. Appendix

6.1. Conventions and commutation relations. Let $x$ be a complex number. We will denote by $[x]_q=[x]$ the q-number associated to $x$, defined by $[x]=\frac{q^{x}-q^{-x}}{q-q^{-1}}$. The q-factorial is defined by $[n]!=\prod_{k=1}^{n}[k]$, $n\in\mathbb{Z}^{+}$, with the convention that $[0]!=1$. The q-binomial coefficients are defined by:

\[
\binom{n}{p}_q = \frac{[n]!}{[p]!\,[n-p]!}, \qquad n\ge 0,\ 0\le p\le n. \qquad (123)
\]

They satisfy the two relations:

\[
\binom{n+1}{p}_q = q^{p}\binom{n}{p}_q + q^{-n+p-1}\binom{n}{p-1}_q, \qquad (124)
\]

\[
\binom{n+1}{p}_q = q^{n+1-p}\binom{n}{p-1}_q + q^{-p}\binom{n}{p}_q. \qquad (125)
\]
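The conventions above translate directly into a few lines of code, which is a convenient way to test the two Pascal-type relations. This is a minimal sketch, not part of the original text: the function names are hypothetical, $q$ is a rational number so that the arithmetic is exact, and the relations are taken in the form $\binom{n+1}{p}_q=q^{p}\binom{n}{p}_q+q^{-n+p-1}\binom{n}{p-1}_q$ and $\binom{n+1}{p}_q=q^{n+1-p}\binom{n}{p-1}_q+q^{-p}\binom{n}{p}_q$.

```python
from fractions import Fraction


def qnum(n, q):
    """Symmetric q-number [n] = (q^n - q^-n) / (q - q^-1) for an integer n."""
    return (q**n - q**(-n)) / (q - 1 / q)


def qfactorial(n, q):
    """[n]! = product of [k] for k = 1..n, with [0]! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= qnum(k, q)
    return result


def qbinomial(n, p, q):
    """[n]! / ([p]! [n-p]!), extended by 0 outside 0 <= p <= n."""
    if p < 0 or p > n:
        return Fraction(0)
    return qfactorial(n, q) / (qfactorial(p, q) * qfactorial(n - p, q))


def pascal_relations_hold(n, p, q):
    """Compare [n+1; p] with both Pascal-type right-hand sides."""
    lhs = qbinomial(n + 1, p, q)
    rhs_first = q**p * qbinomial(n, p, q) + q**(-n + p - 1) * qbinomial(n, p - 1, q)
    rhs_second = q**(n + 1 - p) * qbinomial(n, p - 1, q) + q**(-p) * qbinomial(n, p, q)
    return lhs == rhs_first and lhs == rhs_second
```

Because these q-numbers are symmetric under $q\to q^{-1}$, each relation is the image of the other under this exchange, which the check makes easy to observe.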

Uq (su(2)) is the star Hopf algebra defined by the relations (16,17,18). This Hopf algebra is quasitriangular and the expression of the R matrix is given by: R = q2Jz Jz e(qq 1 q

1

) (q Jz J+ J

q

Jz )

with ez =

+1 X

k=0

k(k 1) 2

zk : [k ] !

(126)


The action of the generators of A on an orthonormal basis for an irreducible representation of spin J of Irr(A) is given by the following expressions:

I

I

qJz em = qm em ; 1p I I J em = q 2 [I m + 1][I m] em1

(127) (128)

(the unusual presence of q 2 in the left-hand side comes from our definition of the ? on J ). 2 = q 4K (K +1) , and The element is given by = q 2Jz . It is easy to compute vK we will choose K (K +1) 1=4 2 : vK = exp(i K )q 2 With this choice of signs the relations of the Clebsch Gordan coefficients written in 5.2 are satisfied. The reader is advised that there exists a large number of conventions largely unexplained in the literature whose consequences are that there are lots of misprints and phase ambiguities in certain papers on this subject. Pol(SUq (2)) is the star Hopf algebra generated by the elements of 1

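\[
\overset{\frac12}{u} = \begin{pmatrix} a & b\\ c & d \end{pmatrix}
\]

satisfying the relations:

\[
\begin{aligned}
& qab = ba,\quad qac = ca,\quad bc = cb,\quad qbd = db,\quad qcd = dc,\\
& ad - da = (q^{-1}-q)\,bc,\quad ad - q^{-1}bc = 1,\\
& \Delta(a) = a\otimes a + b\otimes c,\quad \Delta(b) = b\otimes d + a\otimes b,\\
& \Delta(c) = c\otimes a + d\otimes c,\quad \Delta(d) = d\otimes d + c\otimes b,\\
& a^{\star} = d,\quad d^{\star} = a,\quad b^{\star} = -q^{-1}c,\quad c^{\star} = -qb.
\end{aligned}
\qquad (129)
\]

The relations between generators of $U_q(su(2))$ and matrix elements of $\overset{\frac12}{L}{}^{(\pm)}$ are: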

Jz 2 q 2 )J L(+) = q0 (1 J z q 1

1 2

; L(

)

=

q Jz (1

q2 )J+

0 qJz

:

(130)

The mixed relations of the quantum double are

qJz c = qcqJz ; qJz b = q 1 bqJz ; qJz ; a = 0; qJz ; d = 0; [J+ ; c] = 0; [J ; b] = 0; [J+ ; b] = q 1 (q Jz a q Jz d); [J ; c] = q (q Jz d q Jz a); J a = q 1 aJ + bqJz ; aJ+ = qJ+ a + q Jz c; dJ = q 1 J d + q Jz b; J+ d = qdJ+ + cqJz :

(131)

The application of the factorization theorem gives an explicit isomorphism of complex algebras between D and Uq (sl(2; C )) Uq (sl(2; C )). If we denote 1 2

M

(i+)

"

=

qJz

(i)

0

(1

q

#

q 2 )J (i) ; M2 (i (i) J z

1

"

)

=

(i) q Jz 0 (i) (i) 2 (1 q )J+ q Jz

#

;

(132)


then (34,35) gives the explicit change of generators: (r) (l) (r ) (l) Jz = Jz(l) + Jz(r); J = qJz J (l) + J (r)q Jz ; J+ = qJz J+(l) + J+(r) q Jz ; (l ) (r ) (r) (l) a = qJz Jz ; b = (1 q 2 )q Jz J (l) ; c = (1 q2 )qJz J+(r) ; (l) (r ) d = q Jz +Jz (q q 1 )2 J+(r) J (l) : (133)

The expression of the Casimir elements is given by:

+ = q(q q 1 )J+ b + q 1 qJz a + qq Jz d;

= q 1 (q q 1 )J c + q 1 q Jz a + qqJz d:

(134) (135)

From these commutation relations it is easy to show that the precise relation between the generators fJ ; q Jz ; a; b; c; dg and the generators fA; N; N ? ; ; ? ; ; ? g used in [31,30] is:

A = qJz ; N = J + ; N ? = q 1 J ; = b; = d; = q 1 c; ? = a:

(136)

+ As a result the relation p between the Casimir element and the Casimir element X of + 2 [30] is q = 1 + q X .

6.2. Clebsch–Gordan coefficients, and vertex pictures definitions. In the sequel we shall frequently use the following notations:

\[
[P]_k = \prod_{i=1}^{k}[P+i-1].
\]

The R matrix elements and the Clebsch–Gordan coefficients can be computed and are given by (see [21] for a precise derivation)

(q R nn11 +nn2 n2 n

AB

s

[A [A

q 1 )n 2n1 n2 +n(n1 n2 q [n]!

n+1 2

)

n1 ]![A + n1 + n]![B + n2 ]![B n2 + n]! ; n1 n]![A + n1 ]![B + n2 n]![B n2 ]!

m n K m(p+1)+ 21 (J (J +1) I (I +1) K (K +1)) I J p = Æm+n;p q s

[2K + 1][I + J K ]![I m]![J n]![K p]![K + p]! [K + J I ]![I + K J ]![I + J + K + 1]![I + m]![J + n]! KXp V (K +p+1) i(V +I m) e Y (I; J; K )[I + m + V ]![J + K m q [ V ]![ K p V ]![I m V ]![J K + m + V ]! V =0

V ]!

:


Pictorial representation. [Figures: vertex (V) pictures representing the Clebsch–Gordan coefficients, the matrix elements of $R$ and $R'$, and the elements $v_A^{1/2}$ and $w^{A}_{mn}$.]

Relations. [Figures: vertex (V) pictures of the orthogonality, twist, contragredient and normalization relations.]

One has to add to these relations those coming from elementary topological moves on ribbon graphs, which come from the Yang–Baxter equation and the quasitriangularity equations (see [26] for a complete set of moves). 6.3. 6j coefficients and IRF pictures. In our constructions we make extensive use of the usual quantum 6j symbols entering the representation theory of $U_q(su(2))$. In order to distinguish them from their complex continuations described in the next subsection, we will denote them by 6j(0). Details about these 6j symbols are given in [26]. Definitions of 6j symbols, relations and formulas.

2 6 6 6 6 4

3

7 7 7 = A 7 C 5

B E F D

2

ÆF;H 6 4

V

3 7 7 7 = C 7 A 5

B D F E

2

ÆF;H 6 4

V

A B E e i(C+D+2E A B) DC F E F A; B; E A;C; F C;E;D D; B; F X eiU U A B C D U A D E F U B C E F U U A C F U A B E U B D F U D C E ) 2 Z

([2

+1][2

(

[

+

[

U

A+C+F A+B+E B+D+F D+C+E

)

+1]!([

U

(

)

+

+

(

)

]![

]![

+

+

(

7 5 V

3 7 5 V

(137)

=

1 +1]) 2

3

)

+

]![

]![

+

+

+

]![

]!

]!)

1

;

A+B+C+D A+D+E+F B+C+E+F

(138) with

I;J; K (

) =

Y I;J; K (

s )

J K I J I K I K J 8I;J;K 2 Z+: I J K

[

+

]![

[ +

+

+

]![ +

+ 1]!

]!

1 2

(139)


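The 6j symbols satisfy the symmetry relations:

\[
\begin{Bmatrix} A & B & E\\ C & D & F \end{Bmatrix}
= \begin{Bmatrix} B & A & E\\ D & C & F \end{Bmatrix}
= \begin{Bmatrix} C & D & E\\ A & B & F \end{Bmatrix}
= \begin{Bmatrix} A & D & F\\ C & B & E \end{Bmatrix}. \qquad (140)
\]

They also satisfy the following relation, known as the Racah–Wigner relation:

\[
\begin{Bmatrix} A & C & F\\ B & E & D \end{Bmatrix}
= e^{i\pi(C-F+E-D)} \begin{Bmatrix} A & F & C\\ B & D & E \end{Bmatrix}
\left(\frac{[d_F][d_D]}{[d_C][d_E]}\right)^{\frac12},
\]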

which is nicely pictured in the next subsection. We give the value of 6j symbols for one spin fixed to the value 0 or

B B A C C = Y (A; B; C ); 0

1 2

(141)

: s

[dC ] A A 0 = Y (A; B; C )ei(A+B C ) ; BB C [dA ][dB ] (142)

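\[
\begin{aligned}
\begin{Bmatrix} A & B & E\\ \frac12 & E+\frac12 & B+\frac12 \end{Bmatrix} &= \left(\frac{[B+E-A+1][A+B+E+2]}{[2E+2][2B+1]}\right)^{\frac12} Y(A,B,E),\\
\begin{Bmatrix} A & B & E\\ \frac12 & E+\frac12 & B-\frac12 \end{Bmatrix} &= \left(\frac{[A+B-E][A+E-B+1]}{[2E+2][2B+1]}\right)^{\frac12} Y(A,B,E),\\
\begin{Bmatrix} A & B & E\\ \frac12 & E-\frac12 & B+\frac12 \end{Bmatrix} &= \left(\frac{[A+E-B][A+B-E+1]}{[2E][2B+1]}\right)^{\frac12} Y(A,B,E),\\
\begin{Bmatrix} A & B & E\\ \frac12 & E-\frac12 & B-\frac12 \end{Bmatrix} &= \left(\frac{[A+B+E+1][E+B-A]}{[2E][2B+1]}\right)^{\frac12} Y(A,B,E).
\end{aligned}
\qquad (143)
\]

6.4. 6j coefficients and IRF pictures.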

Pictorial representation. [Figures: IRF pictures representing the 6j symbols, and their expression in terms of vertex (V) pictures.]

Relations. [Figures: IRF pictures of the orthogonality relations, the Racah relation, the Racah–Wigner symmetry and the normalization relations.]


One has to add to these relations the elementary topological moves associated to the pentagonal relation and the hexagonal one (see [26] for a complete set of moves). It is important to remark that the ranges of summation in these relations are simply given by the Y factors in the definitions of the 6j coefficients entering the formulas associated to the pictorial representation. 6.5. One variable complex continuations of quantum 6j symbols. The constructions described in this paper make extensive use of the one-variable complex continuation of quantum 6j symbols, denoted by 6j(1). The pioneering work on this subject is due to R. Askey and J. Wilson [5], and it has been analyzed in great detail in the works of E. Cremmer, J. L. Gervais, J. F. Roussel [13,22]. In this appendix, we just want to recall the basic ideas of their construction. Ordinary 6j(0) coefficients can be rewritten using a terminating ${}_4\phi_3$ basic hypergeometric function (see [22]), called a q-Racah polynomial,

D E C = ( w(x) ) 21 p ((x); a; b; c; d); n AB F hn

where the notations are those of (Eq. 4.1-4.5) of [5] and with n = A + C B; a = q 4A 2 ;

c = q2(D+B E A) ;

d = q2(D+B+E+A+2) ;

(144)

b = q4B+2 ; x=A+E F

so it could seem natural to study the continuation from this expression. At first sight, however, the proof of the polynomial equations usually satisfied by 6j(0) coefficients is not obvious in this formulation. In the works [13,22] another method is developed. The continuation is done in different steps, depending on the number of combinations of spins we want to let be half-integers or, conversely, on the number of independent complex variables entering the continuation. For our purpose, the first step will be sufficient, and we will see in our next paper [10] that our program, methods and results differ radically from those of J. L. Gervais et al. for the other steps. The central idea is that the definition of usual 6j coefficients (139) requires, as a basic condition, that some combinations of spins be integers, but does not require the spins to be half-integers themselves. It is then possible to define one-variable continuations in such a way that the basic hypergeometric part of the formula is still terminating and the normalization part is still the square root of a rational function of the complex variable. Thus the polynomial equations will be proved by only using continuation arguments from the 6j(0) case, where the polynomial equations are simply derived from features of the representation theory of the associated quantum group [26]. In order to write the explicit formula for these 6j(1) symbols it is necessary to introduce some basic definitions. First, the Y function will be continued by:

Y (A; @ + N; @ + P ) =

8@ 2 C

(

1 0

if A (N elsewhere

P) 2 N

8A 2 12 Z+, 8N; P 2 12 Z. And we will introduce the following basic functions: 8; ; ; Æ; ; ; 2 C , 8n 2 N 1 Z+, 2

()+1 := (q 2 ; q 2 )+1 =

+ 1 Y

i=0

(1

q2+2i ); ()n :=

()+1 ; ( + n)+1

+ 1p Y

+1 () ; +1 ( + n) i=0 ( + + + 2)+1 ( + + 1)+1 ( !(; ; ) := +1 +1 ( + + + 1) + 1 X ()k ( )k ( )k (Æ )k q 2k Æ : := 43 (1)k ()k ( )k ()k k=0 +1 () :=

1

q2+2i ; n () :=

+ + 1)

;

We can therefore define the two types of 6j (1) by

B C A

=

@1 @2 @3

ei(B+@2 @3 ) q(B+@2 @3 )(C B+@2 @1 +1)+(A+@2 @1 )(B+A C ) 1 (2@3 + 1) B ; @2 ; @3 )!(B ; A; C ) 1 (2A + 1) (C(B B C+ +@1@ +@2@+ +1)+1)1 !((2@2B; A;+ @1)1 )!((1) 1 2 +1 +1 +1 ! (C ; @1 ; @3 ) Y (A; B; C )Y (A; @1 ; @2 ) Y (B; @3 ; @2 )Y (C; @1 ; @3 ) 4 3 C A B C + A 2BB + 1 @@2 3 @1@2 BB+ C C +@@3 1 @2@2 BB +1 1 (145)

A @2 @3 B @1 @4 = ei(A+@1 @4 ) q (A+@1 @4 )(@2 A+@1 B +1)+(@3 +@1 B )(A+@3 @2 ) 1 (2@4 + 1) 1 !(@1 ; @3 ; B )!(A; @1 ; @4 )!(A; @3 ; @2 ) 1 (2@3 + 1) (@2 (A A +@ B+ B@+1 +@ 1)++1) 2 1 +1 (2A + 1)+1 (1)+1 ! (@2 ; B; @4 ) Y (A; @3 ; @2 )Y (A; @4 ; @1 ) Y (B; @3 ; @1 )Y (B; @2 ; @4 ) @ A @ @ @ 2 @3 2 + @3 A +1 4 @1 A 4 @1 A 1 4 3 2A @1 B A + @2 @2 + B @1 A +1 ; (146)

1 Z+, and 8i; j , @ where A; B; C 2 12 Z+, 8i, @i 2 C i @j 2 Z. 2 These definitions needs some remarks. Firstly, using elementary transformations (see [5]) we can show that these definitions are equal to the formula (144) when we consider the limit where the complex spins @i degenerate into halfintegers Xi , except for the continued Y which have supports larger than the usual one for generic Xi and equal to the usual one only for sufficiently large Xi . In this sense, we can speak of “continuation” to complex spins. Secondly, the global sign of the expression is intimately related to the choice of square root we have made for the normalization factors. In the paper the choice of square root will always be the same, i.e.

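\[
\forall x\in\mathbb{C},\quad x = |x|\,e^{i\,\mathrm{Arg}(x)},\ \text{with } \mathrm{Arg}(x)\in\,]-\pi,\pi];\ \text{then } \sqrt{x} := \sqrt{|x|}\;e^{\frac{i}{2}\mathrm{Arg}(x)}. \qquad (147)
\]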
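The choice of square root fixed here is the usual principal branch, and a short numerical illustration may help when checking signs of normalization factors. The sketch below is an assumption-laden example: the function name `principal_sqrt` is hypothetical, and it relies on Python's `cmath.phase`, which returns the argument in $[-\pi,\pi]$ and gives $+\pi$ on the negative real axis when the imaginary part is $+0.0$.

```python
import cmath
import math


def principal_sqrt(x):
    """Square root sqrt(|x|) * exp(i Arg(x) / 2), with Arg(x) taken in ]-pi, pi]."""
    z = complex(x)
    modulus = abs(z)
    argument = cmath.phase(z)  # +pi for negative reals given as plain numbers
    return math.sqrt(modulus) * cmath.exp(0.5j * argument)
```

With this convention the square root of a negative real number lies on the positive imaginary axis, and the result always has a non-negative real part, in agreement with `cmath.sqrt` on the sample points used in the check.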


At this point, we have just given definitions and we can check the following basic properties whose pictorial representations can easily be deduced using the same conventions as in the 6j (0) case. Symmetries.

A B C

@1 @2 @3

B A C = @ @ @ 2

1

=

3

@3 +A C )q(@1 @3 +A C ) 1 (2C +1)1(2@3 +1) C B A 1(2A +1)1(2@1 +1) @3 @2 @1 A @2 @3 = A @1 @4 = B @1 @3 = B @1 @4 B @2 @3 A @2 @4 i (@1 @3 +@2 @4 ) (@1 @3 +@2 @4 ) 1 (2@4 +1)1 (2@3 +1) A @3 @2 =e q 1 (2@2 +1)1 (2@1 +1) B @4 @1 = ei(@1

(148)

Orthogonality: X A

C

X

B C

@1 @2 @3 A B C

A B C

@1 @2 @4 A B D

= Æ@3 ;@4 Y (A; @2 ; @3 )Y (B; @1 ; @3 )

(149)

@1 @2 @3 @1 @2 @3 = ÆC;D Y (A; B; C )Y (C; @1 ; @2 ) X A @ @ A @2 @3 = Æ 3 2 @ ;@ Y (A; @1 ; @4 )Y (B; @2 ; @4 ) B @1 @4 B @1 @5 @

(150)

@3

4

(151)

5

3

Racah relation. X

C

A B C

@2 @1 @3

A B C

@1 @2 @4

v@1=42 v@1=32 vC1=2 1 A @1 ) = 1=2 1=2 1=2 1=2 B @2 v@1 v@2 vA vB

(

@3 @4

(152)

Pentagonal relations. X D

A

@3 @1 @4

X C

@6

F A

@1 @4 @2

D @6

DF A EB C

C @2 A @5

@6 @3

E A B

@1 @2 @3

A D B

@4 @5 @6

=

E F C

@4 @2 @3

=

C @1 B @5

@4 @3

D C B

@2 @1 @4

(153)

A D B

@1 @3 @2 (154)

The range of the sums in these equations are fixed by the Y s entering in the definition of the 6j s. The symetries are checked using elementary properties of basic hypergeometric functions (see [5]). The orthogonality properties can be easily expressed as the q Racah orthogonality formula [5]. The other properties are checked by continuation arguments. Indeed, each of these properties asserts that a certain finite sum of rational functions in the variable q 2@+1 is zero in the entire complex plane. In fact, we can easily show it is zero for @ being sufficiently large half-integer, because of the corresponding


relations for 6j(0) (it is important to remark that neither the range of summation nor the degree of the rational functions depends on this complex variable for sufficiently large values). The identities then hold on the entire complex plane, because a rational function vanishing at infinitely many points is necessarily identically zero. The list of relations we have given is obviously not complete, and many other relations can be deduced by combining them. In particular, we want to mention the hexagonal relation deduced from relations (152), (153) and (154): X A

M =

P M

@4 @3 @2

X B

@5

P C

AP M BN C

@4 @5 @2

A @2 B @1

1=2 1=2 B M N ( vP v@3 )1 = @4 @1 @3 vM1=2 v@1=2 2 1=2 1=2 @3 A C N ( vC v@1 )1 : @5 @4 @1 @5 vN1=2 v@1=2 5

(155)

and another pentagonal relation useful in some computations of this paper, deduced from (153)(150): X

@5

B A P

@2 @1 @5

A Q M

@4 @5 @2

B M N

@4 @1 @5

BA P = QN M

P Q N

@4 @1 @2

:

(156)

Using formula (145), it is easy to compute the 6j symbols when one of the spins is equal to 0 or $\frac12$. Depending on the type of 6j(1) we consider, the explicit values are given by:

1

C + 12 AB 1 1 2 C+2 AB 1 1 2 C 2 AB 1 1 2 C 2 AB 2

C B + 21 = C B 12 = C B + 21 = C B 21 =

with C 2 12 Z+, A = the second case.

C C A B B = Y (A; B; C ) 0

(157)

1 (B + C A + 1)1 (A + B + C + 2) Y (A; B; C ) 1 (2C + 2)1 (2B + 1) 1(A + B C )1 (A + C B +1) C +B A+1 q Y (A; B; C ) 1 (2C + 2)1 (2B + 1) 1 (A + C B )1 (A + B C + 1) C +B A q Y (A; B; C ) 1 (2C )1 (2B + 1) 1 (A + B + C + 1)1 (C + B A) Y (A; B; C ) (158) 1 (2C )1 (2B + 1)

@1 , B = @2 in the first case, and A 2 21 Z+, C = @1 , B = @2 in

References

1. Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial Quantization of the Hamiltonian Chern–Simons Theory I. Commun. Math. Phys. 172, 317–358 (1995); Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial Quantization of the Hamiltonian Chern–Simons Theory II. Commun. Math. Phys. 174, 561–604 (1995)
2. Alekseev, A.Y., Schomerus, V.: Representation theory of Chern–Simons Observables. q-alg/9503016, Duke Math. J. 85, No. 2, 447 (1996)


3. Arnaudon, D., Buffenoir, E., Ragoucy, E., Roche, Ph.: Universal solutions of Quantum Dynamical YangBaxter Equations. To be published in Lett. Math. Phys and q-alg/9712037 4. Ashtekar, A.: Lectures on Nonperturbative Quantum Gravity. Singapore: World Scientific, 1991 5. Askey, R., Wilson, J.: A set of orthogonal polynomials that generalize the Racah coefficients or 6 symbols. SIAM J. Math. Anal. 10, 1008 (1979) 6. Askey, R., Wilson, J.: Some basic hypergeometric orthogonal polynomials that generalize Jacobi polynomials. Mem. Am. Math. Soc. Vol. 54 n. 319 (1985) 7. Babelon, O.: Universal Exchange Algebra for Bloch Waves and Liouville Theory. Commun. Math. Phys. 139, 619–643 (1991) 8. Babelon, O., Bernard, D., Billey, E.: A quasi-Hopf algebra interpretation of quantum 3 and 6 symbols and difference equation. Phys. Lett. B 375, 89 (1996) 9. Buffenoir, E., Roche, Ph.: Two dimensional lattice gauge theory based on a quantum group. Commun. Math. Phys. 170 (1995); Buffenoir, E., Roche, Ph.: Link Invariants and Combinatorial Quantization of Hamiltonian Chern Simons Theory. Commun. Math. Phys, 181, 331–365 (1996) 10. Buffenoir, E., Roche, Ph.: Tensor product of unitary representations of the quantum lorentz group and Askey–Wilson polynomials. In preparation 11. Carlip, S.: Lectures on 2+1 dimensional gravity. UCD-95-6, Feb 1995, and gr-qc/9503024 12. Chari, V., Pressley, A.: A guide to Quantum Groups. Cambridge: Cambridge University Press 13. Cremmer, E., Gervais, J.L., Roussel, J.F.: The Quantum Group Structure of 2D gravity and Minimal Models (II): The Genus Zero Chiral Bootstrap. Commun. Math. Phys. 161, 597 (1994) 14. Cremmer, E., Gervais, J.L., Schnittger, J.: Hidden q ( (2)) q ( (2)) quantum group symmetry in two dimensional gravity. Commun. Math. Phys. 183, 609 (1997) 15. Carow-Watamura, U., Schlieker, M., Scholl, M., Watamura, S.: A quantum Lorentz Group. Int. J. Modern. Phys. A 6, 3081 -3108 (1991) 16. Drinfeld, V.G.: Quantum Groups. 
Proceedings of the International Congress of Mathematicians, Berkeley 1986, A.M.Gleason (ed), Providence, RI: Am. Math. Soc, pp. 798–820 17. Fock, V.V., Rosly, A.A.: Poisson structures on moduli space of flat connections on Riemann surfaces. Preprint ITEP 72-92,(1992) 18. Faddeev, L.D., Reshetikhin, N., Takhtajan, L.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–226 (1990) 19. Gelfand, I.M., Minlos, R.A., Shapiro, Z.Ya.: Representations of the rotation and Lorentz groups and their applications. London: Pergammon Press, 1963 20. Gelfand, I.M., Naimark, M.A.: Unitary representations of the Lorentz group. (in Russian) Izvestia Akad. Nauk. SSSR (Ser.Mat.) 11, 411 (1947) 21. Gervais, J.L.: The Quantum Group Structure of 2D gravity and Minimal Models. Commun. Math. Phys. 130, 257 (1990) 22. Gervais, J.L., Roussel, J.F.: Solving the strongly coupled 2D gravity (II). Fractional spin operators and topological three point functions. Nucl.Phys. B 246, 140 (1994) 23. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. qalg/9712029 24. Kakehi,, T.: Eigenfunction expansion associated with the Casimir operator on the quantum group q (1 1) Duke Math. J. 80 No.2, (1995) 25. Killingback, T.P.: Quantization of (2 R) Chern–Simons theory. Commun. Math. Phys. 145, 1–16 (1992) 26. Kirillov, A.N., Reshetikhin, N.Yu.: Representations of the algebra Uq ( (2)), q-orthogonal polynomials and invariants of Links Infinite Dimensional Lie algebras and Groups, V.G.Kac(ed), Singapore: World Scientific, pp. 285–339 27. Korogodsky, L., Vaksman, L.: Spherical functions on the quantum analogue of q (1 1) and a Fok–Mehler formula. Funk. Anal. i Prilozhen 25, 60–62, (1991) ? 28. Kustermans, J., Van Daele, A.: algebraic quantum groups arising from algebraic quantum groups. To appear in International J. Math. q-alg/9611023 29. 
Masuda, T., Mimachi, K., Nakagami, Y., Noumi, M., Saburi, Y., Ueno, K.: Unitary Representations of Quantum group q (1 1) I: Structure of the dual space of Uq ( (2)) Lett. Math. Phys 19, 187–194 (1990); II: Matrix elements of unitary representations and the Hypergeometric functions. Lett. Math. Phys 19, 195–204 (1990) 30. Pusz, W.: Irreducible unitary representations of quantum Lorentz group. Comm. Math. Phys. 152, 591– 626 (1993) 31. Podles, P., Woronowicz, S.L.: Quantum deformation of the Lorentz Group. Comm. Math. Phys, 130, 381–431, (1990) 32. Nelson, J.E., Regge, T.: Homotopy groups and 2+1 dimensional quantum gravity. Nucl. Phys. B328, 190–202 (1989) 33. Ogievetsky, O., Schmidke, W.B., Wess, J., Zumino, B.: Six generators q-deformed Lorentz algebra. Lett. Math. Phys 23, 233–240 (1991)


34. W.R¨uhl, The Lorentz group and Harmonic Analysis. New York: W.A. Benjamin, Inc 1970 35. Reshetikhin, N.Yu., Semenov, M.A.: Quantum R-matrices and factorization problems. J. Geom. Phys, 5, 533–550 36. Semenov-Tyan-Shanskii, M.A.: Poisson Lie groups and Quantum Duality Principle and the Quantum Double. Theor. Math. Phys. 93 1292–1307 (1992), hep-th/9304042 37. Soibelman, Ya.S.: The algebra of functions on a compact quantum group, and its representations. Leningrad. Math. J. 2, 161–178 and 256 (1991) 38. Takeuchi, M.: Finite Dimensional Representations of the Quantum Lorentz Group. Commun. Math. Phys 144, 557–580 (1992) 39. Van Daele, A.: Multiplier Hopf Algebras. Trans. Am. Math. Soc., 342 vol 2, 917–932 (1994) 40. Van Daele, A.: Discrete Quantum Groups. J. Algebra, 180, 445–458 (1996) 41. Vaksman, L.L., Soibelman, Ya.S.: Algebra of functions on the quantum group (2). Funct. Anal. Appl.22, 170–81 42. Witten, E.: 2 + 1 dimensional gravity as an exactly soluble system. Nuclear Physics, 311, 46–78 (1988) 43. Witten, E.: Quantization of Chern–Simons Gauge theory with Complex Gauge Group. Commun. Math. Phys. 137, 29–66, (1991) 44. Wonorowicz, S.L.: Compact matrix pseudogroups. Comm. Math. Phys, 111, 613–665, (1987) 45. Wonorowicz, S.L.: Twisted (2) Group. An example of a Non-Commutative Differential Calculus. Publ.RIMS, Kyoto Univ.28, 117–181 (1987) 46. Wonorowicz, S.L.: Unbounded Elements Affiliated with ? -Algebras and Non-Compact Quantum Groups. Comm. Math. Phys. 136, 399–432, (1991)


Communicated by T. Miwa


Commun. Math. Phys. 207, 557 – 587 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Positive Commutators and the Spectrum of Pauli–Fierz Hamiltonian of Atoms and Molecules

Volker Bach1, Jürg Fröhlich2, Israel Michael Sigal3, Avy Soffer4

1 FB Mathematik MA 7-2, TU Berlin, Str. d. 17 Juni 136, 10623 Berlin, Germany. E-mail: [email protected]
2 Institut für Theoretische Physik, ETH Hönggerberg, 8093 Zürich, Switzerland. E-mail: [email protected]
3 Department of Mathematics, University of Toronto, Toronto, M5S 3G3, Canada. E-mail: [email protected]
4 Department of Mathematics, Rutgers University, New Brunswick, NJ 08903, USA. E-mail: [email protected]

Received: 10 April 1998 / Accepted: 12 April 1999

Abstract: In this paper we study the energy spectrum of the Pauli–Fierz Hamiltonian generating the dynamics of nonrelativistic electrons bound to static nuclei and interacting with the quantized radiation field. We show that, for sufficiently small values of the elementary electric charge, and under weaker conditions than those required in [3], the spectrum of this Hamiltonian is absolutely continuous, except possibly in small neighbourhoods of the ground state energy and the ionization thresholds. In particular, it is shown that (for a large range of energies) there are no stable excited eigenstates. The method used to prove these results relies on the positivity of the commutator between the Hamiltonian and a suitably modified dilatation generator on photon Fock space.

1. Introduction

In this paper we extend the method of positive commutators to a family of Hamiltonians related to the Pauli–Fierz Hamiltonian describing nonrelativistic electrons bound to static nuclei and interacting with the quantized electromagnetic field, subject to an ultraviolet cut-off. This is a standard Hamiltonian of quantum electrodynamics of nonrelativistic particles. Let $e$ and $m$ be the electron charge and mass and $\alpha := \frac{e^2}{\hbar c}$ the fine-structure constant. The physical value of $\alpha$ is approximately $\frac{1}{137}$; in this paper, however, it is considered as a small dimensionless parameter. In dimensionless units in which the energy, photon wave vector, particle coordinate, particle charge and particle mass are measured in units of $mc^2\alpha^2$, $\alpha\frac{me^2}{\hbar^2}$, $\frac{\hbar^2}{me^2}$, $e(\alpha^{3/2}K)^{-1}$ and $m$, respectively (here $K$ is an ultraviolet cut-off defined below), the Pauli–Fierz Hamiltonian for a system of $N$ charged particles (typically electrons) is given by

$$H(e) = \sum_{j=1}^{N} \frac{1}{2m_j}\big(p_j - e_j A(x_j)\big)^2 + V(x)\otimes 1_f + 1_{part}\otimes H_f, \qquad (1.1)$$


where $e := (e_1, \ldots, e_N)$, $e_j$ is the electric charge, $m_j$ the mass, $x_j$ the position (operator), and $p_j := -i\nabla_j$ the momentum operator of the $j$th particle, for $j = 1, \ldots, N$; moreover $x := (x_1, \ldots, x_N)$. The operator $K^{1/2}A(y)$ is the quantized electromagnetic vector potential, cut off at large wave vectors, at the point $y$ in physical space $\mathbb{R}^3$. It is assumed to satisfy the Coulomb gauge condition, $\nabla\cdot A(y) = 0$. The operator $V(x)$ originates in a properly rescaled electrostatic (scalar) potential of the charged particles (electrons) in the Coulomb field of static charges (nuclei) (see [2]). Finally, $H_f$ is the usual Hamiltonian of the noninteracting, quantized electromagnetic field. The operators $A(y)$, $y \in \mathbb{R}^3$, and $H_f$ are densely defined, self-adjoint operators on the usual Fock space, $\mathcal{H}_f$, of the quantized electromagnetic field (the photon Fock space), and $V(x)$ is a multiplication operator on the particle Hilbert space, $\mathcal{H}_{part}$, which is given by (a subspace of prescribed symmetry character of) $L^2(\mathbb{R}^{3N})$, with $\mathbb{R}^{3N}$ the configuration space of the charged particles. The Hilbert space of the entire system consisting of the charged particles and an arbitrary number of photons is given by the tensor product space $\mathcal{H}_{part}\otimes\mathcal{H}_f$. One can prove without much difficulty (see, e.g., [8,9]) that $H(e)$ is a densely defined, self-adjoint operator on $\mathcal{H}_{part}\otimes\mathcal{H}_f$, whose energy spectrum is bounded below by a finite constant (depending on the positions of the nuclei and their electric charges). A proof can be based either on diamagnetic-type inequalities or on constructing the semigroup $\exp(-tH(e))$, for $t \ge 0$, with the help of path integrals. It should be noted that, for simplicity, we have set the magnetic moments of the charged particles to zero. (Otherwise, the Hamiltonian $H(e)$ would contain an additional term describing the Zeeman energies of magnetic moments in the ultraviolet cut-off, quantized electromagnetic field. This term would complicate our analysis slightly.)
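As a quick consistency check on the dimensionless units introduced before (1.1), the energy and wave-vector units can be related to familiar atomic quantities. The sketch below assumes Gaussian units, in which $\alpha = e^2/\hbar c$; the symbol $a_B$ for the Bohr radius is introduced here only for illustration and does not appear in the original text.

```latex
% Sketch, assuming Gaussian units with \alpha = e^2/(\hbar c).
% a_B (Bohr radius) is notation introduced here for illustration only.
\begin{align*}
  m c^2 \alpha^2 &= m c^2\,\frac{e^4}{\hbar^2 c^2} = \frac{m e^4}{\hbar^2}
      && \text{(twice the Rydberg energy)},\\
  a_B &= \frac{\hbar^2}{m e^2}
      && \text{(unit of the particle coordinate)},\\
  \alpha\,\frac{m e^2}{\hbar^2} &= \frac{\alpha}{a_B}
      = \frac{m c^2 \alpha^2}{\hbar c}
      && \text{(unit of the photon wave vector)}.
\end{align*}
```

In these units a photon with wave vector of order one carries an energy $\hbar c|k|$ of the order of the atomic energy unit, which is why $\omega(k) = |k|$ can be compared directly with the level spacings of the particle Hamiltonian.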
For $|e| := \sum_{j=1}^{N} |e_j|$ sufficiently small, we shall construct a suitable modification of the (second-quantized) generator of dilatations on the photon Fock space, with the property that its commutator with the Hamiltonian $H(e)$ is positive, provided that we restrict the energy to small neighbourhoods of the eigenvalues of the particle Hamiltonian,

$$H_{part} = \sum_{j=1}^{N} \frac{1}{2m_j}\,p_j^2 + V(x), \qquad (1.2)$$
corresponding to excited states of the atom or molecule. This result has the following implications: In the vicinity of the eigenvalues of Hpart corresponding to excited eigenstates, (i) H (e) has no eigenvalues; (ii) the spectrum of H (e) is purely absolutely continuous; (iii) H (e) satisfies the limiting absorption principle. Implication (i) is derived from the basic positive-commutator estimate via a virial theorem, while (ii) and (iii) follow from that estimate with the help of a slight extension of Kato–Mourre theory presented in this paper. The limiting absorption principle represents a first step towards analyzing properties of the time evolution of a quantum mechanical system. The results announced in the abstract follow from (i) and (ii) above, together with similar (but simpler) results in Sect. IV of [3]. Results similar to (i) and (ii) above (but of somewhat more detailed nature), were first obtained, under stronger hypotheses, in [2– 4]; (see remarks after Theorem 3.1). If the quantized electromagnetic field is not only cut off in the ultraviolet, but also in the infrared (at small wave vectors), e.g., by introducing


a small photon mass, results similar to ours have previously been established in [22,10,12,11]. Furthermore, in [12], commutator estimates were derived that inspired, in part, our findings. Parallel results for sufficiently high temperatures (here the temperature leads to an effective infrared cut-off) were obtained in [15,16]. Commutator methods were introduced in [24,18], further developed in [19] and turned into a deep theory in [20]. In [20,21,23,26] they were shown to yield a powerful tool in analyzing spectral properties of Hamiltonians of quantum-mechanical systems and in studying their time evolution. The present paper is inspired by these earlier discoveries and should be viewed as a step towards understanding the time evolution of systems of photons interacting with nonrelativistic, quantum-mechanical matter.

2. The Hamiltonian of Nonrelativistic QED

As announced, we study systems of nonrelativistic, quantum-mechanical, charged particles interacting with the quantized electromagnetic field. The dynamics of such systems is described by the Hamiltonian $H(e)$ introduced in (1.1). The potential energy $V(x)$ is assumed to satisfy standard Kato-type conditions specified below. The Hamiltonian $H_f$ of the noninteracting, quantized electromagnetic field can be expressed in terms of standard photon creation and annihilation operators, $a^*(k)$ and $a(k)$, as follows:

$$H_f = \int \omega(k)\, a^*(k)\cdot a(k)\, d^3k, \qquad (2.1)$$

where $\omega = \omega(k) = |k|$ is the energy of a photon with wave vector $k$. The creation and annihilation operators $a^*(k)$ and $a(k)$ are transverse, vector-valued, operator-valued distributions on $\mathcal{H}_f$ satisfying $k\cdot a^*(k) = k\cdot a(k) = 0$ and $a(k)\Omega = 0$, for all $k \in \mathbb{R}^3$, where $\Omega$ is the vacuum (zero-photon) vector in $\mathcal{H}_f$. Furthermore, these operators satisfy the canonical commutation relations

$$\big[a_i^{\#}(k),\, a_j^{\#}(k')\big] = 0, \qquad \big[a_i(k),\, a_j^{*}(k')\big] = \Big(\delta_{ij} - \frac{k_i k_j'}{|k|^2}\Big)\,\delta(k - k'), \qquad (2.2)$$

where $a_i^{\#}$ is the $i$th component of $a^{\#}$ (in the plane perpendicular to $k$), and $a^{\#} = a$ or $a^*$. The cut-off electromagnetic vector potential $A(y)$, $y \in \mathbb{R}^3$, is the densely defined self-adjoint operator on $\mathcal{H}_f$ given by

$$A(y) = \int \Big(e^{-iky}\otimes a^*(k) + e^{iky}\otimes a(k)\Big)\,\frac{\kappa(k)}{\sqrt{\omega(k)}}\, d^3k, \qquad (2.3)$$

where $\kappa$ is a real function on $\mathbb{R}^3$ of rapid decrease, as $|k| \to \infty$. It describes the ultraviolet cut-off and is necessary for $A(y)$ to be densely defined and self-adjoint, for every $y \in \mathbb{R}^3$. We assume it lives on a scale $K$, i.e., it is of the form $\kappa(k) = K^{-1/2}\kappa_0(k/K)$, where $\kappa_0$ is a fixed function. The particular form of $\kappa_0$ is irrelevant for our analysis. All that is required are certain bounds on $\kappa_0$ and its derivatives. It is convenient to forget the origin of the vector potential $A(y)$ and consider a slightly generalized form of it given by

$$A(y) = \int \Big(G_y(k)\otimes a^*(k) + \overline{G_y(k)}\otimes a(k)\Big)\, d^3k, \qquad (2.4)$$


where the function $G_x(k)$ is assumed to satisfy a variety of conditions (depending on the problem we study), the most important one being

$$\sup_x \int \frac{1}{\omega(k)}\,|G_x(k)|^2\, d^3k < \infty. \qquad (2.5)$$

This condition guarantees that, for $|e|$ small enough, the operator $H(e)$ is bounded below and self-adjoint on the domain of $H(e = 0)$ (see [5]). We recall that we neglect the Zeeman term,

$$-\sum_i \mu_i\, S_i\cdot B(x_i), \qquad (2.6)$$

describing the interaction energy of the magnetic moments µi Si , where Si is the spin operator of the i th particle, with the magnetic field B(y) = curl A(y). In order to simplify notation and exposition, we demonstrate our approach on the model of a particle system interacting with a massless scalar field, instead of the vector potential. The Hamiltonian for such a model is given by H = Hpart ⊗ 1f + 1part ⊗ Hf + gI,

(2.7)

acting on $\mathcal{H}_{part}\otimes\mathcal{F}$, where the Hilbert space $\mathcal{H}_{part}$ is the same as before, $\mathcal{F}$ is the Fock space of scalar fields generated by $L^2(\mathbb{R}^3)$, $H_{part}$ is given in (1.2) and is a particle (atomic) Hamiltonian, acting on $\mathcal{H}_{part}$, and $H_f$ is a scalar field Hamiltonian on $\mathcal{F}$ given, similarly to (2.1), by

$$H_f = \int \omega(k)\, a^*(k)\, a(k)\, d^3k, \qquad (2.8)$$

with $\omega = \omega(k) = |k|$, as above. Finally, the interaction term $I$ is defined by

$$I := \int \Big(G_x(k)\otimes a^*(k) + \overline{G_x(k)}\otimes a(k)\Big)\, d^3k = a^*(G_x) + a(G_x), \qquad (2.9)$$

where $x = (x_1, \ldots, x_N) \in \mathbb{R}^{3N}$, and where $G_x(k)$ is required to satisfy (2.5) (we use the same notation for coupling functions as in the vector case). The operators $a^*(k)$ and $a(k)$ are creation and annihilation operators of a scalar quantum field acting on $\mathcal{F}$. They obey the canonical commutation relations, $[a^{\#}(k), a^{\#}(k')] = 0$, $[a(k), a^*(k')] = \delta(k - k')$, and $a(k)\Omega = 0$, for all $k, k' \in \mathbb{R}^3$, where $\Omega$ is the vacuum vector in $\mathcal{F}$. (For brevity we continue to refer to the scalar field as the photon field.) Note that for a scalar field the coupling to matter cannot be "minimal", i.e., it cannot be described by replacing the momentum operator by a covariant derivative. The simplified model contains all the difficulties of the vector model, but the infrared problem becomes visible in its pure form, unencumbered by vector notation and other inessential particulars. In (2.9), it is straightforward to also include terms quadratic in $a$ and $a^*$. We do not pursue this in order not to obscure the key ideas underlying our methods.


Throughout the paper, we assume that $H_{part} = H_{part}^*$ on the domain of $\sum_{j=1}^{N}\frac{1}{2m_j}p_j^2$ and has several isolated eigenvalues, $E_0, E_1, \ldots$, of finite multiplicity below the bottom, $\Sigma$, of its essential spectrum: $E_0 < E_1 < \cdots < \Sigma$. This assumption is satisfied for a large class of potentials including many-body Coulomb potentials (see e.g. [25]). The Hamiltonian $H(e)$ defined in (1.1) is self-adjoint under the above assumption on the potential $V(x)$ and under assumption (2.5) on the coupling function $G_x$. This is proven by using diamagnetic-type inequalities or by considering the semigroup $e^{-H(e)t}$. It was shown in [3] that for $|e| = \sum |e_j|$ sufficiently small, it is self-adjoint on the domain $D(H(e)) = D(H(0))$. The self-adjointness of the Hamiltonian $H$, defined in (2.7)–(2.9), on the domain $D(H) = D(H_0)$, for $g$ sufficiently small, follows from a result of [3] (see Eq. (4.10) of Sect. 4).

In what follows, $E_\Delta(H)$ stands for the spectral projection of a self-adjoint operator $H$ associated with an interval $\Delta$, while $\chi_{\lambda\in\Omega}$ stands for the characteristic function of a set $\Omega$ (thus $E_\Delta(H) = \chi_{H\in\Delta}$). Below, we make use of the following exponential decay estimate proven in [3]: If $\phi \in C_0^\infty$, with $\operatorname{supp}\phi \subset \big(-\infty,\ \Sigma - g^2\sup_x\int\frac{|G_x|^2}{\omega}\big)$, then

$$\|e^{\alpha|x|}\phi(H)\| \le C_\alpha, \qquad (2.10)$$

for $\alpha$ sufficiently small, $\alpha < \Sigma - \sup\operatorname{supp}\phi - g^2\sup_x\int\frac{|G_x|^2}{\omega}$. Since the operators $H_f$ and $[H, x]$ are $H$-bounded, Eq. (2.10) implies that

$$\|\langle x\rangle^{M}\otimes(H_f + 1)\,\phi(H)\| < \infty \quad\text{for any } M \ge 0. \qquad (2.11)$$

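3. Results

First we formulate the restrictions on the coupling functions $G_x = G_x(k)$ used in this paper:

$$\sup_x\left\{\int\frac{|G_x(k)|^2}{\omega(k)}\,d^3k + \langle x\rangle^{-M}\int\frac{|(k\cdot\nabla_k)G_x(k)|^2}{\omega(k)}\,d^3k\right\} < \infty, \qquad (3.1)$$

and

$$\sup_x\,\langle x\rangle^{-M}\sum_{n=1}^{2}\int\big[1 + \omega(k)^{-1}\big]\,\big|(k\cdot\nabla)^n G_x(k)\big|^2\,d^3k < \infty \qquad (3.2)$$

for some $M \ge 0$. In order to simplify somewhat the technical part of the paper we assume that

$$\sup_{x,k}\,\langle k\rangle^2\langle x\rangle^{-M}\,\big|(\hat k\cdot\nabla_k)^n G_x(k)\big| < \infty, \qquad (3.3)$$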


where $\hat k = k\cdot|k|^{-1}$, for some $M \ge 0$ and for $n = 0, 1$, and that

$$g(\rho) := \sup_x\left(\int_{\omega\le\rho}\frac{|G_x|^2}{\omega}\right)^{1/2} \le C\rho^{1/2}. \qquad (3.4)$$

Let $E^i$ and $\psi_{part}^{is}$, $s = 1, \ldots, m_i$, be the eigenvalues and corresponding eigenfunctions of $H_{part}$, where $i = 0, 1, \ldots$, and $E^0 < E^1 < \ldots$. For $i, j \ge 0$, we assume that $\int_{|k|=\omega}(A_{ij})^*A_{ij}\,dS_\omega$ is continuous in $\omega$ and vanishes at $\omega = 0$. Here $A_{ij}$ are the $m_i\times m_j$ matrices with the entries $g\langle\psi_{part}^{i\ell}, G_x\,\psi_{part}^{jr}\rangle$, in the case of the Hamiltonian $H$, and $\big\langle\psi_{part}^{i\ell}, \sum_{a=1}^{N}\frac{e_a}{m_a}\,p_a^{\perp}G_{x_a}\psi_{part}^{jr}\big\rangle$, $p^{\perp} = p - (p\cdot\hat k)\hat k$, with $\hat k = \frac{k}{|k|}$ (the projection of $p$ onto the plane, $k^{\perp}$, perpendicular to $k$), in the case of the Hamiltonian $H(e)$. Here $\ell = 1, \ldots, m_i$ and $r = 1, \ldots, m_j$, and $dS_\omega$ is the area element on the sphere $\{k \in \mathbb{R}^3 \mid |k| = \omega\}$. For $j \ge 1$ (i.e., for excited states $\psi_{part}^{js}$), we define the self-adjoint matrix $\Gamma_j$ by

$$\Gamma_j = \sum_{i:\,E^i<E^j}\int (A_{ij})^*A_{ij}\,\delta(\omega - E^{ji})\,d^3k, \qquad (3.5)$$

where $E^{ji} = E^j - E^i$. The eigenvalues of this matrix are the resonance widths to second order in the coupling constant, associated with the eigenvalue $E^j$; this is what is known in quantum mechanics as Fermi's Golden Rule. We assume that

$$\delta_j = \liminf_{|\lambda|\to 0}\,(\lambda^{-2}\Gamma_j) > 0, \qquad (3.6)$$

where $\lambda = g$ in the case of the Hamiltonian $H$ and $|e| = \max_i|e_i|$, in the case of the Hamiltonian $H(e)$. The main result of this paper is the following theorem.

Theorem 3.1. Assume (3.1)–(3.4) and (3.6). Let $j \ge 1$. Then for $|e|$ sufficiently small, the spectrum of $H(e)$ in any interval containing $E^j$, but not containing any other part of the spectrum of $H_{part}$, and whose distance to $\operatorname{spec}H_{part}\cap(-\infty, E^j)$ is $\gg |e|$, is purely absolutely continuous. Moreover, in such an interval, $H(e)$ has the local decay property (formulated below). A similar statement, but with $|e|$ replaced by $g$, holds for $H$.

The first statement of the theorem was proved in [2–4] under additional assumptions of analyticity of $G_x$ and $\sup_x\int\frac{|G_x|^2}{\omega^{1+\beta}} < \infty$ for some $\beta > 0$, which is a stronger condition in the infrared region, $k \to 0$, than the one we require in this paper. Next, we formulate the local decay property mentioned in Theorem 3.1. To this end, we introduce the anti-self-adjoint operator

$$-A = 1_{part}\otimes\frac{1}{2}\int a^*(k)\big(k\cdot\nabla_k + \nabla_k\cdot k\big)a(k)\,d^3k. \qquad (3.7)$$

This operator is a second quantization of the generator of dilatations in the one-photon momentum space, i.e., of $\frac{1}{2}(k\cdot\nabla_k + \nabla_k\cdot k)$. In what follows, whenever no danger of confusion arises, we omit the trivial factors $1_{part}\otimes$ and $\otimes 1_f$. We say that $H$ has the local


decay property in a spectral interval $\Delta$ (with respect to an operator $A$), if the following estimate holds

$$\int_{-\infty}^{\infty}\big\|\langle A\rangle^{-\alpha}e^{-iHt}\psi\big\|^2\,dt \le C_\alpha\|\psi\|^2, \qquad (3.8)$$

for any $\alpha > 1/2$ and any $\psi \in \operatorname{Ran}\chi_{H\in\Delta}$. (In fact, a slightly stronger property, the limiting absorption principle with Hölder constant $\theta < \alpha - \frac{1}{2}$, holds in our case.) Theorem 3.1 follows from a positive commutator estimate derived below (Theorem 5.2) and from the Kato–Mourre theory mentioned in the introduction and expounded upon in Sect. 5. We prove only the part of Theorem 3.1 concerning the operator $H$. The corresponding part for the operator $H(e)$, given in (1.1), is proven in exactly the same way, using some simple additional estimates related to the quadratic part $\sum_j\frac{e_j^2}{2m_j}A(x_j)^2$ of the perturbation $H(e) - H(0)$. We note that absolute continuity of the spectrum and the local decay property outside of $O(g^2)$- (resp. $O(|e|^2)$-) neighbourhoods of the eigenvalues and thresholds of $H_{part}$ has been proven in [3].

Remark 3.2. The requirement that $g$ is small is not completely satisfactory, since if we, remembering the origin of $G_x$ in (2.3), take $G_x(k) = \frac{\kappa(k)}{\sqrt{\omega(k)}}e^{-ik\cdot x}$ and $\kappa(k) = K^{-1/2}\kappa_0(k/K)$, then

$$\langle x\rangle^{-2}\int\frac{|k\cdot\nabla_k G_x(k)|^2}{\omega(k)}\,d^3k = O(K^2) \qquad (3.9)$$

for large $K$. However, the operator $\langle x\rangle^{-M/2}\,k\cdot\nabla_k$ in conditions (3.1)–(3.2) on the coupling function $G_x(k)$ can be replaced by the operator $k\cdot\nabla_k - x\cdot\nabla_x$. This is done by replacing in our analysis the key operator $A$, given in (3.7), by the operator

$$-A' = 1_{part}\otimes\frac{1}{2}\int a^*(k)\big(k\cdot\nabla_k + \nabla_k\cdot k\big)a(k)\,d^3k - \Big[\frac{1}{2}\big(x\cdot\nabla_x + \nabla_x\cdot x\big)\Big]\otimes 1_f. \qquad (3.10)$$

Given standard additional conditions on $V(x)$ (see e.g. [6, 13]), most of the analysis given below goes through without change. The advantage of the modified conditions on $G_x$ is that they do not require the ultraviolet cut-off $K$ to be small in the case of interest: $G_x(k) = \frac{\kappa(k)}{\sqrt{\omega(k)}}e^{-ik\cdot x}$ with $\kappa(k) = K^{-1/2}\kappa_0(k/K)$. Indeed, in this case, e.g.

$$\sup_x\int\frac{\big|(k\cdot\nabla_k - x\cdot\nabla_x)G_x(k)\big|^2}{\omega(k)}\,d^3k = O(1) \qquad (3.11)$$

instead of (3.9). Moreover, if, abstracting properties of $G_x(k) = \frac{\kappa(k)}{\sqrt{\omega(k)}}e^{-ik\cdot x}$, we assume that $G_x$ satisfies

$$\sup_x\sum_{n=0}^{1}\int\frac{\big|(k\cdot\nabla_k - x\cdot\nabla_x)^n G_x(k)\big|^2}{\omega(k)}\,d^3k < \infty, \qquad (3.12)$$

instead of (3.1), and a corresponding relation replacing (3.2), then the analysis presented in Sect. 5 below simplifies considerably (see also Remark 5.7). In what follows we absorb the parameter $g$ into the coupling function $G_x(k)$.
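The cancellation behind (3.11) can be made explicit. The following sketch is written for a single particle coordinate $x \in \mathbb{R}^3$, with the shorthand $h(k) := \kappa(k)/\sqrt{\omega(k)}$ and the rescaled variable $q := k/K$ (both introduced here for illustration). It assumes that $\kappa_0$ is smooth and decays fast enough for the integrals to converge.

```latex
% Sketch under the stated assumptions; h, h_0 and q are notation introduced here.
\begin{align*}
 k\cdot\nabla_k\, e^{-ik\cdot x} &= -i\,(k\cdot x)\,e^{-ik\cdot x},
 \qquad
 x\cdot\nabla_x\, e^{-ik\cdot x} = -i\,(k\cdot x)\,e^{-ik\cdot x},\\
 (k\cdot\nabla_k - x\cdot\nabla_x)\big(h(k)\,e^{-ik\cdot x}\big)
   &= (k\cdot\nabla_k h)(k)\;e^{-ik\cdot x}.
\end{align*}
% Power counting with \kappa(k) = K^{-1/2}\kappa_0(k/K), \omega(k) = |k|, k = Kq:
\begin{align*}
 h(k) &= K^{-1}\,\kappa_0(q)\,|q|^{-1/2},
 \qquad
 (k\cdot\nabla_k h)(k) = K^{-1}\,h_0(q),
 \quad h_0 := (q\cdot\nabla_q)\big[\kappa_0(q)|q|^{-1/2}\big],\\
 \int\frac{|(k\cdot\nabla_k h)(k)|^2}{\omega(k)}\,d^3k
   &= K^{-2}\cdot K^{-1}\cdot K^{3}\int\frac{|h_0(q)|^2}{|q|}\,d^3q = O(1),\\
 \langle x\rangle^{-2}\int\frac{|k\cdot x|^2\,|h(k)|^2}{\omega(k)}\,d^3k
   &\le \int |k|\,|h(k)|^2\,d^3k
   = K^{2}\int|\kappa_0(q)|^2\,d^3q = O(K^2).
\end{align*}
```

The last line is the contribution of the phase $e^{-ik\cdot x}$ to (3.9). It is exactly the term that drops out when $k\cdot\nabla_k$ is replaced by $k\cdot\nabla_k - x\cdot\nabla_x$.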


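4. Relative Bounds on the Interaction

In this section we collect some elementary bounds needed for the proof of Theorem 3.1. In what follows, by $H_f^{-1/2}$ we always understand $H_f^{-1/2}P$, where $P$ is the projection onto the orthogonal complement of the vacuum state in Fock space.

Lemma 4.1 (Relative bounds).

$$\|a(f)\psi\|_{Fock} \le \left(\int\frac{|f|^2}{\omega}\right)^{1/2}\big\|(H_f)^{1/2}\psi\big\|_{Fock} \qquad (4.1)$$

and

$$\|a^*(f)\psi\|^2_{Fock} \le \left(\int\frac{|f|^2}{\omega}\right)\big\|(H_f)^{1/2}\psi\big\|^2_{Fock} + \left(\int|f|^2\right)\|\psi\|^2_{Fock}. \qquad (4.2)$$

Proof. We drop the subindex "Fock" in the proof. By Schwarz' inequality we have

$$\|a(f)\psi\| \le \int|f(k)|\,\|a(k)\psi\| \le \left(\int\frac{|f|^2}{\omega}\right)^{1/2}\left(\int\omega(k)\,\|a(k)\psi\|^2\right)^{1/2}. \qquad (4.3)$$

Thanks to

$$\int\omega(k)\,\|a(k)\psi\|^2 = \langle\psi, H_f\psi\rangle, \qquad (4.4)$$

this implies (4.1). Inequality (4.2) follows from

$$a(f)a^*(f) = a^*(f)a(f) + \langle f, f\rangle 1, \qquad (4.5)$$

$\langle\psi, a^*(f)a(f)\psi\rangle = \|a(f)\psi\|^2$ and (4.1). □

We rewrite bound (4.1) as

$$\big\|a(f)\,H_f^{-1/2}\big\|_{Fock} \le \left(\int\frac{|f|^2}{\omega}\right)^{1/2}, \qquad (4.6)$$

$$\big\|H_f^{-1/2}\,a^*(f)\big\|_{Fock} \le \left(\int\frac{|f|^2}{\omega}\right)^{1/2}. \qquad (4.7)$$

These two bounds are equivalent, since the expressions under the norm signs are adjoint to each other. Moreover, (4.1) implies that

$$\pm\langle a^*(f) + a(f)\rangle_\psi \le 2\left(\int\frac{|f|^2}{\omega}\right)^{1/2}\big\|H_f^{1/2}\psi\big\|\cdot\|\psi\|, \qquad (4.8)$$

which yields

$$\pm\big(a^*(f) + a(f)\big) \le \alpha H_f + \frac{1}{\alpha}\int\frac{|f|^2}{\omega}, \qquad (4.9)$$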


for any $\alpha > 0$. Furthermore, inequalities (4.1) and (4.2) imply

$$\big\|\big(a^*(f) + a(f)\big)\psi\big\| \le \left(\int|f|^2\right)^{1/2}\|\psi\| + 2\left(\int\frac{|f|^2}{\omega}\right)^{1/2}\big\|H_f^{1/2}\psi\big\|. \qquad (4.10)$$

Equation (4.9) implies that $I$ is $H_f$ form-bounded with relative bound zero, provided (2.5) holds, while Eq. (4.10) implies that $I$ is $H_f^{1/2}$-bounded with relative bound

$$2\sup_x\left(\int\frac{|G_x|^2}{\omega}\right)^{1/2},$$

provided (2.5) holds. The latter of these two statements implies that, if (2.5) is satisfied, then $H$ is self-adjoint on the domain of $H_f$. To develop more refined bounds we need the pull-through formulae (see [2,3])

$$a(k)\,g(H_f) = g(H_f + \omega(k))\,a(k) \qquad (4.11)$$

and

$$g(H_f)\,a^*(k) = a^*(k)\,g(H_f + \omega(k)), \qquad (4.12)$$

valid for any piecewise continuous and bounded function $g$. (These formulae follow from the commutation relation

$$a(k)H_f = (H_f + \omega(k))\,a(k) \qquad (4.13)$$

and its adjoint.) Now if $\psi = \chi_{H_f\le\rho}\psi$, then

$$\|a(k)\psi\|_{Fock} = \|\chi_{H_f+\omega(k)\le\rho}\,a(k)\psi\|_{Fock} \le \chi_{\omega(k)\le\rho}\,\|a(k)\psi\|_{Fock}. \qquad (4.14)$$

Using this in (4.3) we obtain instead of (4.1) (or (4.6))

$$\int|f(k)|\,\|a(k)\chi_{H_f\le\rho}\|_{Fock} \le \left(\int_{\omega\le\rho}\frac{|f|^2}{\omega}\right)^{1/2}\cdot\rho^{1/2}. \qquad (4.15)$$

These estimates can be extended to products of several annihilation or creation operators. Namely, relation (4.11) and a property of characteristic functions imply that

$$\left(\prod_{1}^{m}a(k_j)\right)\chi_{H_f\le\rho} = \prod_{1}^{m}\big(a(k_j)\chi_{H_f\le\rho}\big). \qquad (4.16)$$

Applying estimate (4.15) to each factor on the r.h.s., we find

$$\int\prod_{1}^{m}|f_j|\,\Big\|\prod_{1}^{m}a(k_j)\,\chi_{H_f\le\rho}\Big\| \le \prod_{1}^{m}\left(\int_{\omega\le\rho}\frac{|f_j|^2}{\omega}\right)^{1/2}\rho^{m/2}, \qquad (4.17)$$

and similarly for certain operators:

$$\int\prod_{1}^{m}|f_j|\,\Big\|\chi_{H_f\le\rho}\prod_{1}^{m}a(k_j)\Big\| \le \prod_{1}^{m}\left(\int_{\omega\le\rho}\frac{|f_j|^2}{\omega}\right)^{1/2}\rho^{m/2}. \qquad (4.18)$$
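The passage from (4.8) to (4.9), and from (4.1)–(4.2) to (4.10), uses only two elementary inequalities, $2ab \le \alpha a^2 + \alpha^{-1}b^2$ and $\sqrt{s+t} \le \sqrt{s} + \sqrt{t}$. A brief sketch follows, with the abbreviations $c_f := \int|f|^2/\omega$ and $n_f := \int|f|^2$ introduced here for illustration.

```latex
% Sketch; c_f and n_f are abbreviations introduced here, not in the original.
\begin{align*}
 \pm\langle\psi,(a^*(f)+a(f))\psi\rangle
   &\le 2\,c_f^{1/2}\,\|H_f^{1/2}\psi\|\,\|\psi\|
      && \text{by (4.8)}\\
   &\le \alpha\,\|H_f^{1/2}\psi\|^2 + \alpha^{-1}c_f\,\|\psi\|^2
      && \big(a = \|H_f^{1/2}\psi\|,\ b = c_f^{1/2}\|\psi\|\big)\\
   &= \big\langle\psi,\big(\alpha H_f + \alpha^{-1}c_f\big)\psi\big\rangle,\\[4pt]
 \|(a^*(f)+a(f))\psi\|
   &\le \|a^*(f)\psi\| + \|a(f)\psi\|\\
   &\le \big(c_f\|H_f^{1/2}\psi\|^2 + n_f\|\psi\|^2\big)^{1/2} + c_f^{1/2}\|H_f^{1/2}\psi\|
      && \text{by (4.2), (4.1)}\\
   &\le n_f^{1/2}\|\psi\| + 2\,c_f^{1/2}\|H_f^{1/2}\psi\|.
\end{align*}
```

Since $\alpha > 0$ is arbitrary in the first chain, the coefficient of $H_f$ can be made as small as desired, which is the meaning of "relative bound zero" in the form sense.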


5. Positive Commutators

In this section we formulate our key technical result. In the following, when we speak of a commutator of two, in general unbounded, operators, $H$ and $A$, we understand that $D(H)\cap D(A)$ is dense, and $[H, A]$ is defined first as a form on $D(H)\cap D(A)$ and then extended to a bounded or unbounded operator. We fix $j \ge 1$ once and for all. Let $P_{part} = P_{part}^{j}$ be the orthogonal projection onto the eigenspace of $H_{part}$ corresponding to the eigenvalue $E^j$. For a fixed energy scale $\rho$, we define the projection operator

$$P = P_{part}\otimes\chi_{H_f\le\rho} \qquad (5.1)$$

and $\bar P = 1 - P$. We define a family of operators

$$A_V = A + \bar P\,V P - P\,V^*\bar P, \qquad (5.2)$$

where $A$ is the second quantized dilatation generator defined in (3.7), and

$$V = \theta\,\bar R_\varepsilon^{\,2}\,I, \qquad \bar R_\varepsilon = R_\varepsilon\bar P, \qquad (5.3)$$

for positive constants $\theta$ and $\varepsilon$ to be chosen below, where

$$R_\varepsilon = \big[(H_0 - E^j)^2 + \varepsilon^2\big]^{-1/2}. \qquad (5.4)$$

Note that $\varepsilon R_\varepsilon^2 \to \delta(H_0 - E^j)$, as $\varepsilon \to 0$. We note also that $A_V$ depends on four parameters, $g$, $\varepsilon$, $\theta$ and $\rho$.

Lemma 5.1. The commutator $[H, A_V]$ can be defined as a quadratic form on the dense set $D(H_0)\cap D(A)$ and can be extended from there to a $(\langle x\rangle^M\otimes H_f)$-bounded operator. Moreover, for any $\phi \in C_0^\infty$ with $\operatorname{supp}\phi \subset \big(-\infty,\ \Sigma - \sup_x\int\frac{|G_x|^2}{\omega}\big)$ the operator

$$\phi(H)[H, A_V] \text{ is bounded.} \qquad (5.5)$$

Proof. The first statement of the lemma follows from the relations $D(H) = D(H_0)$ and $D(A_V) = D(A)$. The second of these two relations is due to the fact that the operator $A_V - A$ is bounded. To prove the second statement we observe that, by a direct computation, $e^{\theta A}$, $\theta \in \mathbb{R}$, maps $D(H) = D(H_0)$ into itself and therefore, in the sense of quadratic forms,

$$[H, A] = \frac{\partial}{\partial\theta}H_\theta\Big|_{\theta=0}, \qquad (5.6)$$

where $H_\theta = e^{-\theta A}He^{\theta A}$. A direct computation (see Eq. (5.18) below) and Lemma 4.1 show that the r.h.s. of this equality is a $(\langle x\rangle^M\otimes H_f)$-bounded operator. Hence $[H, A]$ extends to a $(\langle x\rangle^M\otimes H_f)$-bounded operator. Furthermore, due to definition (5.1)–(5.4), $A_V - A$ is a bounded operator mapping $\mathcal{H} = \mathcal{H}_{part}\otimes\mathcal{F}$ into $D(H)$, so $[H, A_V - A]$ is well defined. As can be easily shown, it is a bounded operator. Hence $[H, A_V]$ extends to a $(\langle x\rangle^M\otimes H_f)$-bounded operator. Finally, the third statement follows from the second one and estimate (2.11). □


Observe that it is not hard to show that the operator $[H, A_V]$ is self-adjoint. Hence, taking adjoints in (5.5), one concludes that also the operator

$$[H, A_V]\phi(H) \text{ is bounded.} \qquad (5.7)$$

Let $\Delta$ be an energy interval containing $E^j$ but no other parts of the spectrum of $H_{part}$, and let

$$\theta_\Delta = \inf\Delta - \sup\big\{\sigma(H_{part})\cap(-\infty, \inf\Delta)\big\} > 0, \qquad (5.8)$$

i.e., the distance, $\theta_\Delta$, of $\inf\Delta$ to the part of the spectrum of $H_{part}$ below $\Delta$ is assumed to be positive. The key technical result of this paper is

Theorem 5.2. Assume that Conditions (3.1)–(3.4) and (3.6) hold, and let, for simplicity, the parameters $\varepsilon$, $\theta$, and $\rho$ in (5.1)–(5.4) satisfy the inequalities $\varepsilon \le \rho \le \theta_\Delta$ and $\varepsilon \le \theta$. If $\gamma_j$ is the smallest eigenvalue of $\Gamma_j$ and $\alpha = O\big(\varepsilon\theta^{-1} + \theta\varepsilon\rho^{-2} + \theta_\Delta^{-1}\theta\rho^2\varepsilon^{-2} + \theta g^2\varepsilon^{-2}\rho^{-1} + g\big) + o_\varepsilon(1)$, then

$$E_\Delta(H)\,[H, A_V]\,E_\Delta(H) \ge \frac{\theta}{\varepsilon}\,(2 - \alpha)\,\gamma_j\,E_\Delta(H)^2. \qquad (5.9)$$

(Here $o_\varepsilon(1) \to 0$, as $\varepsilon \to 0$, and $\Gamma_j$ is the matrix introduced in (3.5).)

This theorem is proven in Sect. 7. Since $g \ll \theta_\Delta$, we can pick the parameters $\varepsilon$, $\theta$ and $\rho$ in (5.1)–(5.4) satisfying the inequalities

$$\sqrt{\frac{\theta}{\theta_\Delta}}\,\rho \ll \varepsilon \ll \theta, \qquad \theta\varepsilon \ll \rho^2 \le \theta_\Delta^2, \qquad (5.10)$$

and

$$g \ll \theta^{-1/2}\varepsilon\rho^{1/2}. \qquad (5.11)$$

Then the parameter $\alpha$ in (5.9) is much smaller than 1 and therefore the r.h.s. is strictly positive on $\operatorname{Ran}E_\Delta(H)$. In what follows we assume that conditions (5.10)–(5.11) are satisfied. Before proceeding any further we derive the most important consequence of this theorem: the instability of the eigenvalue $E^j$.

Theorem 5.3 (Virial theorem). Let the conditions of Theorem 5.2 be satisfied. If $\psi$ is an eigenfunction of the operator $H$ with an eigenvalue $E < \Sigma - \sup_x\int\frac{|G_x|^2}{\omega}$, then $\psi$ is in the domain of $[H, A_V]$ and

$$\langle\psi, [H, A_V]\psi\rangle = 0. \qquad (5.12)$$

Consequently, in view of Theorem 5.2, $H$ has no eigenvalues in any interval $\Delta$ containing only one eigenvalue of $H_{part}$ and satisfying $\theta_\Delta \gg g^2$ with $\theta_\Delta$ defined in (5.8).
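It may help to see how the smallness conditions (5.10)–(5.11) match the individual error terms in the parameter $\alpha$ of Theorem 5.2. The bookkeeping below is a sketch that reads the relations in (5.10)–(5.11) as "much smaller than" ($\ll$), with $\theta_\Delta$ the gap defined in (5.8). It is not part of the original argument.

```latex
% Sketch: each error term in \alpha is small iff one relation in (5.10)-(5.11) holds.
\begin{align*}
 \varepsilon\theta^{-1} \ll 1
   &\iff \varepsilon \ll \theta,\\
 \theta\varepsilon\rho^{-2} \ll 1
   &\iff \theta\varepsilon \ll \rho^{2},\\
 \theta_\Delta^{-1}\theta\rho^{2}\varepsilon^{-2} \ll 1
   &\iff \sqrt{\theta/\theta_\Delta}\;\rho \ll \varepsilon,\\
 \theta g^{2}\varepsilon^{-2}\rho^{-1} \ll 1
   &\iff g \ll \theta^{-1/2}\,\varepsilon\,\rho^{1/2}.
\end{align*}
```

Together with $g \ll 1$ and $\varepsilon$ small, these give $\alpha \ll 1$, so that $2 - \alpha > 1$ and the right-hand side of (5.9) is bounded below by $\frac{\theta}{\varepsilon}\gamma_j E_\Delta(H)^2$. Combined with the vanishing expectation value (5.12), this excludes eigenfunctions in $\operatorname{Ran}E_\Delta(H)$ whenever $\gamma_j > 0$.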


Proof. Let $g_1 \in C_0^\infty(\mathbb{R})$ be real, be supported in $\big(-\infty,\ \Sigma - \sup_x\int\frac{|G_x|^2}{\omega}\big)$ and satisfy $g_1(E) = 1$. Then $g_1(H)\psi = \psi$, so, as shown at the end of this proof, (5.12) is equivalent to the relation

$$\langle\psi, [g(H), A_V]\psi\rangle = 0, \qquad (5.13)$$

where $g(\lambda) := (\lambda - E)g_1(\lambda)$. Note $g(H)\psi = 0$. Since we do not know whether $\psi \in D(A_V)$, we must understand the commutator on the l.h.s. of (5.13) as an operator resulting once the commutation is performed. Now we claim that

$$[A, g(H)] \text{ is bounded.} \qquad (5.14)$$

Indeed, let $\bar g \in C_0^\infty$ be s.t. $g\bar g = g$ and $\operatorname{supp}\bar g \subset \big(-\infty,\ \Sigma - \sup_x\int\frac{|G_x|^2}{\omega}\big)$. The proof of (5.14) will follow from the following formula

$$[A, g(H)] = \int d\tilde g(z)\,(z - H)^{-1}[A, H]\,\bar g(H)\,(z - H)^{-1} + \int d\tilde{\bar g}(z)\,(z - H)^{-1}g(H)\,[A, H]\,(z - H)^{-1}, \qquad (5.15)$$

understood in the sense of quadratic forms on $D(A)$. Here we use the notation and definitions of Appendix B of [14]. Indeed, the l.h.s. is defined as a quadratic form on $D(A)$ by $\langle\psi, [A, g(H)]\psi\rangle = 2\operatorname{Re}\langle g(H)\psi, A\psi\rangle$, while the r.h.s. represents a bounded operator by virtue of (5.5) and (5.7) with $V = 0$ and estimates (B.6) of [14] on $\tilde g$ and $\tilde{\bar g}$. Thus it suffices to prove the representation above. To this end we use the formula

$$\partial_\theta|_{\theta=0}\langle\psi, e^{-\theta A}g(H)e^{\theta A}\psi\rangle = \partial_\theta|_{\theta=0}\langle\psi, e^{-\theta A}g(H)e^{\theta A}\bar g(H)\psi\rangle + \partial_\theta|_{\theta=0}\langle\psi, g(H)e^{-\theta A}\bar g(H)e^{\theta A}\psi\rangle. \qquad (5.16)$$

It suffices to consider one of the terms on the r.h.s., say the first one. We use the Helffer–Sjöstrand formula

$$g(H) = \int d\tilde g(z)\,(z - H)^{-1}$$

(see [14]) to obtain

$$\partial_\theta|_{\theta=0}\langle\psi, e^{-\theta A}g(H)e^{\theta A}\bar g(H)\psi\rangle = \partial_\theta|_{\theta=0}\int d\tilde g(z)\,\langle\psi, (z - H_\theta)^{-1}\bar g(H)\psi\rangle, \qquad (5.17)$$

where, recall, $H_\theta = e^{-\theta A}He^{\theta A}$ and is given by the explicit formula

$$H_\theta = H_{part}\otimes 1_f + 1_{part}\otimes e^{-\theta}H_f + I_\theta$$

with $I_\theta = a^*(G_{x,\theta}) + a(G_{x,\theta})$, $G_{x,\theta}(k) = e^{-\frac{3\theta}{2}}G_x(e^{-\theta}k)$.

Positive Commutators and Spectrum of Pauli–Fierz Hamiltonian


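It is not difficult to see that the operator function $(z-H_\theta)^{-1}\bar g(H)$ is differentiable in $\theta$ at $\theta = 0$: due to (5.7) with $V = 0$,

$$\frac{1}{\theta}\Big[(z-H_\theta)^{-1} - (z-H)^{-1}\Big]\bar g(H) = (z-H_\theta)^{-1}\,\frac{1}{\theta}(H_\theta - H)\,\bar g(H)(z-H)^{-1} \to (z-H)^{-1}[H,A]\,\bar g(H)(z-H)^{-1}$$

as $\theta \to 0$, in the operator norm. Taking this into account and taking the $\theta$-derivative under the sign of integral in (5.17) we arrive at

$$\partial_\theta|_{\theta=0}\langle\psi, e^{-\theta A}g(H)e^{\theta A}\bar g(H)\psi\rangle = \int d\tilde g(z)\,\langle\psi, (z-H)^{-1}[A,H]\,\bar g(H)(z-H)^{-1}\psi\rangle.$$

The last equation together with a similar equation for the second term in (5.16) yields (5.15). As was already mentioned, Eq. (5.15) together with Eqs. (5.5) and (5.7) for $V = 0$ yields (5.14). Equation (5.14) implies that $[A_V, g(H)]$ is also bounded. In order to write the l.h.s. of (5.13) as a quadratic form, which is what we ultimately need for the proof, we proceed in a standard way by approximating it as follows:

$$\langle\psi, [g(H), A_V]\psi\rangle = \lim_{\lambda\uparrow\infty}\langle\psi_\lambda, [g(H), A_V]\psi_\lambda\rangle,$$

where $\psi_\lambda = R_\lambda\psi$, $R_\lambda = \lambda(\lambda + A)^{-1}$. (Note that $\psi_\lambda \to \psi$ as $\lambda \to \infty$.) Since $\psi_\lambda \in D(A) = D(A_V)$ we can write

$$\langle\psi_\lambda, [g(H), A_V]\psi_\lambda\rangle = 2\,\mathrm{Re}\,\langle g(H)\psi_\lambda, A_V\psi_\lambda\rangle.$$

Since $g(H)\psi = 0$ and $[g(H), R_\lambda] = \lambda(\lambda+A)^{-1}[A, g(H)](\lambda+A)^{-1}$ (in a sense of quadratic forms), we have $g(H)\psi_\lambda = R_\lambda[A, g(H)](\lambda+A)^{-1}\psi$. Hence, due to (5.14), $\|R_\lambda\| \le 1$ and $\|(\lambda+A)^{-1}\psi\| \le \frac{1}{\lambda}\|\psi\|$, we have

$$\|g(H)\psi_\lambda\| \le \frac{1}{\lambda}\,\big\|[A, g(H)]\big\|\,\|\psi\|.$$

Consequently,

$$\langle\psi_\lambda, [g(H), A_V]\psi_\lambda\rangle \to 0$$

as $\lambda \to \infty$, so (5.13) follows. Finally, we show the equivalence of (5.12) and (5.13). Define the family of functions $\psi_\lambda := g_1(H)\lambda(\lambda+A)^{-1}\psi$. Then, as above, $\psi_\lambda \to \psi$ as $\lambda \to \infty$. Moreover, since, due to (5.14), $g_1(H)$ maps $D(A)$ into itself, we conclude that $\psi_\lambda \in D(A)$. Hence

$$\langle\psi, [H,A]\psi\rangle = \lim_{\lambda\to\infty}\langle\psi_\lambda, [H,A]\psi_\lambda\rangle = 2\lim_{\lambda\to\infty}\mathrm{Im}\,\langle H\psi_\lambda, A\psi_\lambda\rangle = 2\lim_{\lambda\to\infty}\mathrm{Im}\,\langle g(H)\psi_\lambda, A\psi_\lambda\rangle = \lim_{\lambda\to\infty}\langle\psi_\lambda, [g(H), A]\psi_\lambda\rangle = \langle\psi, [g(H), A]\psi\rangle.$$

Thus $\langle\psi, [H,A]\psi\rangle = \langle\psi, [g(H), A]\psi\rangle$ and therefore (5.12) holds. □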
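The algebraic core of the virial identity (5.12) can be illustrated in finite dimensions, where all domain questions disappear. The following is only a sketch under that assumption: the matrices `H` and `A` are random stand-ins (not the Pauli–Fierz operators), and all names are hypothetical. It checks that the expectation of $[H, A]$ vanishes in every eigenvector of $H$, and that the regularizer $R_\lambda = \lambda(\lambda + A)^{-1}$ used in the proof tends to the identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Random Hermitian stand-in for H and an arbitrary matrix standing in for A.
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (X + X.conj().T) / 2
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))


def commutator(P, Q):
    """Return [P, Q] = PQ - QP."""
    return P @ Q - Q @ P


energies, vecs = np.linalg.eigh(H)
comm = commutator(H, A)

# <psi, [H, A] psi> for each normalized eigenvector psi of H.
virial = np.array([vecs[:, k].conj() @ comm @ vecs[:, k] for k in range(n)])
max_virial = np.max(np.abs(virial))


def R(lam):
    """Regularizer R_lam = lam (lam + A)^{-1}; tends to the identity as lam grows."""
    return lam * np.linalg.inv(lam * np.eye(n) + A)


# Operator-norm distance of R_lam from the identity for increasing lam.
reg_error = [np.linalg.norm(R(lam) - np.eye(n), 2) for lam in (1e2, 1e4, 1e6)]

print("max |<psi,[H,A]psi>| over eigenvectors:", max_virial)
print("||R_lam - 1|| for lam = 1e2, 1e4, 1e6:", reg_error)
```

Combined with a strict lower bound such as (5.9), the vanishing expectation rules out eigenvectors in the corresponding spectral subspace, which is how Theorem 5.3 is used in the text.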


To deduce the statements of Theorem 3.1, about absolute continuity and local decay, from Theorem 5.2, we use an abstract Kato–Mourre theory. A standard variant of this theory (see, e.g., [1, 6, 13, 20, 23]) requires $H$-boundedness of the commutators $[A_V, H]$ and $\big[A_V, [A_V, H]\big]$. In our case, these commutators are not $H$-bounded for two reasons. First, under Condition (3.1), $[A, H]$ and $\big[A, [A, H]\big]$ are $H$-bounded only for $M = 0$, where $M$ is the exponent appearing in (3.1). This follows from the straightforward computation (justified in the proof of Lemma 5.5 below)

$$\mathrm{ad}_A^n(H) = H_f + a^*\big((k\cdot\nabla_k + \tfrac{3}{2})^n G_x\big) + a\big((k\cdot\nabla_k + \tfrac{3}{2})^n G_x\big), \tag{5.18}$$

where we used the standard notation adA (H ) = [H, A] (see, however, Remarks 3.2 and 5.7). The second reason is that the second part of the operator AV (see Eqs. (5.2)–(5.4)) contains the projection χHf ≤ρ , entering in the definition of P , and this operator, not being differentiable in Hf , has a very singular commutator with the dilatation generator A (or any other operator not commuting with Hf ). To remedy the first problem, we weaken the conditions used in Mourre theory (see Lemmata 5.5 and 5.6 below). We go around the second problem by replacing AV by a smooth version, as follows. In definition (5.2)–(5.4) of the operator AV , we replace the projection P by the projection Ps , where Ps = Ppart ⊗ χHf ≤sρ .

(5.19)

Thus, we just vary the photon energy scale a little. Denote the resulting operator by $A_{V,s}$. Let $\mu$ be a non-negative function supported in the interval $[1, 2]$ and satisfying $\int\mu = 1$. Define

$$A_V^{(\mathrm{av})} := \int \mu(s)\,A_{V,s}\,ds. \tag{5.20}$$

The next two lemmas establish the desired properties of $A_V^{(\mathrm{av})}$.

Lemma 5.4. Theorem 5.2 holds if we replace $A_V$ by $A_V^{(\mathrm{av})}$.

Proof. Inequalities (5.10)–(5.11) still hold true if we replace $\rho$ by $s\rho$ with $1 \le s \le 2$. Hence (5.9) holds after $A_V$ is replaced by $A_{V,s}$, for $1 \le s \le 2$. Since $\mu \ge 0$ and $\int\mu = 1$, this implies (5.9) with $A_V$ replaced by $A_V^{(\mathrm{av})}$. □

Lemma 5.5. Let $\varphi \in C_0^\infty$ and $\mathrm{supp}\,\varphi \subset \big(-\infty,\ \Sigma - \sup_x \int \frac{|G_x|^2}{\omega}\big)$, where $\Sigma = \inf\sigma_{\mathrm{cont}}(H_{\mathrm{part}})$. Then the operators $[A_V^{(\mathrm{av})}, H]\varphi(H)$ and $\big[A_V^{(\mathrm{av})}, [A_V^{(\mathrm{av})}, H]\big]\varphi(H)$ are bounded.

Proof. The boundedness of the first commutator follows from Lemma 5.1 (see also the sentence after Eq. (5.23)). To show the boundedness of the second commutator we write $A_V^{(\mathrm{av})} = A + Q$, where

$$Q := \int\big(\bar P_s V P_s - P_s V^* \bar P_s\big)\mu(s)\,ds. \tag{5.21}$$


We consider first the operator $A$ and make sense of the formal computation (5.18). The case $n = 1$ was justified in the proof of Lemma 5.1. So we consider the case $n = 2$. Due to (5.14),

$$\varphi(H) : D(A) \to D(A). \tag{5.22}$$

Hence, due to (5.7), the commutator $\big[[H, A], A\big]$ is defined as a quadratic form on $\varphi(H)D(A)$. The fact that $e^{\theta A}$, $\theta \in \mathbb{R}$, preserves $D(H) = D(H_0)$ and a simple computation shows that

$$\frac{d^2}{d\theta^2}H_\theta\Big|_{\theta=0} = \big[[H, A], A\big],$$

where, recall, $H_\theta := e^{-\theta A}He^{\theta A}$, in a sense of quadratic forms. The l.h.s. of this equality can be evaluated explicitly: it is exactly the r.h.s. of (5.18). Applying Eq. (4.10), with $f := \langle x\rangle^{-M}(k\cdot\nabla_k + \tfrac{3}{2})^n G_x$, to (5.18) and observing that Condition (3.1) guarantees that $\sup_x\big(\|f\| + \|\omega^{-1/2}f\|\big)$ is finite, we conclude that the operators $\mathrm{ad}_A^n(H)\langle x\rangle^{-M}$ are $H_f$-bounded for $n = 1, 2$. Hence, due to Eq. (2.10),

$$\mathrm{ad}_A^n(H)\varphi(H) \ \text{are bounded for } n = 1, 2. \tag{5.23}$$

(Again, $Q$ is a bounded operator, and Eqs. (5.2)–(5.4) show that so are the operators $H\cdot Q$ and $Q\cdot H$. Hence $[A_V^{(\mathrm{av})}, H]\varphi(H)$ is bounded, as was also shown above.) Now we write

$$\big[[H, A_V^{(\mathrm{av})}], A_V^{(\mathrm{av})}\big] = \big[[H,A],A\big] + \big[[H,A],Q\big] + \big[[H,Q],A\big] + \big[[H,Q],Q\big]. \tag{5.24}$$

By Eq. (5.23) and since $Q$ and $[H, Q]$ are bounded, the first two terms and the last term on the r.h.s. of (5.24), multiplied by $\varphi(H)$ on both sides, are bounded. It remains to show that $\big[[H,Q],A\big]$, the third term on the r.h.s. of (5.24), times $\varphi(H)$, is bounded. To this end, we want to use the Jacobi identity and rewrite this term as

$$\big[[Q,H],A\big] = -\big[[A,Q],H\big] - \big[Q,[A,H]\big]. \tag{5.25}$$

(Only the boundedness of the two terms on the r.h.s. matters below, not their signs.) To demonstrate this identity we prove it first for $A$ replaced by the bounded operator $A_\lambda := A\cdot i\lambda(i\lambda + A)^{-1}$ and then take the limit $\lambda \to \infty$ for the quadratic forms. Now we demonstrate that $[A, Q]$ and $H\cdot[A, Q]$ are bounded. We write $[A, Q] = \theta(S + S^*)$, where

$$S = \Big[A, \int\mu(s)\,\bar P_s R_\varepsilon^2 I P_s\,ds\Big].$$

We present $S$ in the form

$$S = \frac{d}{d\theta}\Big|_{\theta=0}\int\mu(s)\,e^{\theta A}\bar P_s R_\varepsilon^2 I P_s e^{-\theta A}\,ds = \frac{d}{d\theta}\Big|_{\theta=0}\int\mu(s)\,\bar P_{se^{-\theta}}R_{\varepsilon,\theta}^2 I_\theta P_{se^{-\theta}}\,ds,$$


where $R_{\varepsilon,\theta} = e^{\theta A}R_\varepsilon e^{-\theta A}$, etc., and where we have used that $e^{\theta A}H_f e^{-\theta A} = e^{\theta}H_f$. Therefore $e^{\theta A}P_s e^{-\theta A} = P_{se^{-\theta}}$. Next, using Leibniz' rule, we rewrite this relation as

$$S = \int\mu(s)\,\bar P_s[A, R_\varepsilon^2]\bar P_s I P_s\,ds + \int\mu(s)\,\bar P_s R_\varepsilon^2[A, I]P_s\,ds + \int\mu(s)\,\frac{d}{d\theta}\Big|_{\theta=0}\big(\bar P_{se^{-\theta}}R_\varepsilon^2 I P_{se^{-\theta}}\big)\,ds.$$

Since $[A, H_0]$ and $[A, I]\langle x\rangle^{-M}$ are $H_f$-bounded, the first two terms on the r.h.s. are bounded. The last term on the r.h.s. can be rewritten as

$$-\int s\,\mu(s)\,\frac{d}{ds}\big(\bar P_s R_\varepsilon^2 I P_s\big)\,ds = \int\frac{d(s\,\mu(s))}{ds}\,\bar P_s R_\varepsilon^2 I P_s\,ds,$$

which shows that it is bounded, as well. Thus we proved that $[A, Q]$ is bounded. Using the above analysis and Eqs. (4.1) and (4.2), one shows that $H\cdot[A, Q]$ is bounded as well. Consequently, $\big[[A,Q],H\big]$ is bounded. Next, since $[A, H]$ is $(\langle x\rangle^{M}\otimes H_f)$-bounded and $Q(H_f + i)$ is bounded, remembering Eq. (2.10) and commuting $\langle x\rangle^{-M}$ through $Q$, if necessary, we conclude that $\big[Q,[A,H]\big]$ is bounded. Thus by identity (5.25), the boundedness of $\big[[Q,H],A\big]$ follows, which completes the proof of the lemma. □

In the next lemma, we slightly weaken the hypotheses of Mourre theory (see, e.g., [1, 6, 13, 20, 23]), in order to accommodate our situation (see Lemma 5.5).

Lemma 5.6. Let $H$ and $iA$ be two self-adjoint operators, defined on the same Hilbert space, and let $\Delta \Subset \Delta' \subset\subset \mathbb{R}$ be intervals such that for any real $\varphi \in C_0^\infty(\Delta')$, the operators $[H, A]$ and $\big[[H,A],A\big]$, defined originally as quadratic forms on the domains $D(H)\cap D(A)$ and $\varphi(H)D(A)$, extend to unbounded operators satisfying

$$[H,A]\varphi(H) \ \text{and} \ \varphi(H)\big[[H,A],A\big]\varphi(H) \ \text{are bounded,} \tag{5.26}$$

$$\varphi(H)[H,A]\varphi(H) \ge \theta\,\varphi(H)^2, \quad \text{for some } \theta > 0. \tag{5.27}$$

Then the spectrum of $H$ in $\Delta$ is absolutely continuous and $H$ has the local decay property in $\Delta$ with respect to the operator $A$.

The proof of this lemma follows the by now standard arguments of [20, 23, 6]. For the reader's convenience it is given in Appendix A. (For a different proof see [15].)

Proof of Theorem 3.1. By Lemma 5.4, we have a positive commutator estimate as in (5.9), but with $A_V^{(\mathrm{av})}$ replacing $A_V$,

$$E_\Delta(H)\,[H, A_V^{(\mathrm{av})}]\,E_\Delta(H) \ge \frac{\theta}{\varepsilon}(2-\alpha)\,\gamma_j\,E_\Delta(H)^2, \tag{5.28}$$

and by Lemma 5.5, we know that, for any $\varphi \in C_0^\infty$ with $\mathrm{supp}\,\varphi \subset \big(-\infty,\ \Sigma - \sup_x\int\frac{|G_x|^2}{\omega}\big)$, the operators $[A_V^{(\mathrm{av})}, H]\varphi(H)$ and $\big[A_V^{(\mathrm{av})}, [A_V^{(\mathrm{av})}, H]\big]\varphi(H)$ are bounded.


Thus Lemma 5.6 implies that the spectrum of $H$ in $\Delta$ is absolutely continuous and that the local decay property holds w.r.t. $A_V^{(\mathrm{av})}$. To pass to the local decay property w.r.t. the operator $A$, it suffices to observe that, due to (4.10), $Q$ is a bounded operator and therefore $\langle A\rangle^{-\alpha}\cdot\langle A_V^{(\mathrm{av})}\rangle^{\alpha} \le \mathrm{const}$, for $\alpha > 0$. □

Remark 5.7. The arguments presented above can be simplified if we use, from the beginning, the operator (3.10) instead of (3.7). Indeed, under assumptions which generalize the case of interest, $G_x(k) = g\,\frac{\kappa(k)}{\sqrt{\omega(k)}}\,e^{-ik\cdot x}$ (see Remark 3.2), the coupling functions arising in the commutators $[H, A_0]$ and $\big[[H, A_0], A_0\big]$ do not grow in $x$ and therefore do not require $\varphi(H)$ for bounding them.

6. Positivity of the Truncated Commutator

Before tackling the proof of Theorem 5.2 head on, we go part of the way by proving the positivity of a simpler commutator. Recall that we are considering Hamiltonian (2.7) but with the parameter $g$ absorbed into the coupling function $G_x$ (see (2.9)). Now, let

$$\bar B_{0,\Delta} = \bar P_\Delta B_0 \bar P_\Delta, \tag{6.1}$$

where

$$B_0 = [H, A] \tag{6.2}$$

and

$$\bar P_\Delta = \bar P\,E_\Delta(H_0). \tag{6.3}$$

Recall that $\Delta$ is an energy interval containing $E_j$ but disjoint from the rest of the spectrum of $H_{\mathrm{part}}$. The main result of this section is the following lemma.

Lemma 6.1. Assume $g^2 \ll \rho \le \theta_\Delta$. Then

$$\bar B_{0,\Delta} \ge \tfrac{1}{2}H_f\bar P_\Delta \ge \tfrac{1}{2}\rho\,\bar P_\Delta, \tag{6.4}$$

and, if in addition $3|z| \le \rho$, then

$$\big\|\,|\bar B_{0,\Delta} - z|^{-1/2}\bar P_\Delta\psi\big\| \le 2\,\big\|H_f^{-1/2}\bar P_\Delta\psi\big\|, \tag{6.5}$$

where $|A| := \sqrt{A^*A}$, for a closed operator $A$.

Proof. We begin with a computation. For $B_0$ as in (6.2), we have by (5.18) with $n = 1$,

$$B_0 = H_f + \tilde I, \tag{6.6}$$

where $\tilde I = a^*(\tilde G_x) + a(\tilde G_x)$, and

$$\tilde G_x(k) := k\cdot\nabla_k G_x(k) + \tfrac{3}{2}G_x(k). \tag{6.7}$$


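Inequality (4.9) with $\alpha = 1/4$ and $f = \tilde G_x$ yields

$$\pm\tilde I \le \tfrac{1}{4}H_f + 4\int\frac{|\tilde G_x|^2}{\omega}, \tag{6.8}$$

which implies

$$B_0 \ge \tfrac{3}{4}H_f - 4\int\frac{|\tilde G_x|^2}{\omega}. \tag{6.9}$$

Since $H_f \ge 0$, inequality (2.10) implies that

$$\big\|\langle x\rangle^{M}E_\Delta(H_0)\big\| \le C_M, \tag{6.10}$$

for any $M < \infty$, provided $\sup\Delta < \inf\,\mathrm{cont\ spec}\,H_{\mathrm{part}} - \sup_x\int\frac{|G_x|^2}{\omega}$. The last two inequalities imply that

$$\bar B_{0,\Delta} \ge \big(\tfrac{3}{4}H_f - Cg^2\big)\bar P_\Delta. \tag{6.11}$$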

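Next, definition (5.1) yields that

$$\bar P = \bar P_{\mathrm{part}}\otimes 1 + P_{\mathrm{part}}\otimes\chi_{H_f\ge\rho}. \tag{6.12}$$

Since, by energy conservation,

$$\bar P_{\mathrm{part}}E_\Delta(H_0) = \sum_{i:\,E_i<E_j}P^{i}_{\mathrm{part}}E_\Delta(H_f + E_i), \tag{6.13}$$

we have that

$$H_f\,\bar P_{\mathrm{part}}E_\Delta(H_0) \ge \theta_\Delta\,\bar P_{\mathrm{part}}E_\Delta(H_0), \tag{6.14}$$

where $\theta_\Delta$ is given in (5.8). This yields

$$H_f\bar P_\Delta \ge \min(\theta_\Delta, \rho)\,\bar P_\Delta = \rho\,\bar P_\Delta, \tag{6.15}$$

which, together with (6.11) and the condition $g^2 \ll \rho$, implies (6.4). The proof of (6.5) is based on the following identity:

$$\big(\bar B_{0,\Delta} - z\big)^{-1} = H_f^{-1/2}\,(1 + K)^{-1}\,H_f^{-1/2}, \tag{6.16}$$

where

$$K = H_f^{-1/2}\,\bar P_\Delta(\tilde I - z)\bar P_\Delta\,H_f^{-1/2}. \tag{6.17}$$

It suffices to prove that for $g^2 \ll \rho$,

$$\|K\| \le \tfrac{1}{2}, \tag{6.18}$$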


which would imply (6.5). To prove the latter inequality, we write $K = K_0 + K_1$, where

$$K_0 = -z\,H_f^{-1}\bar P_\Delta^{2} \tag{6.19}$$

and

$$K_1 = H_f^{-1/2}\,\bar P_\Delta\tilde I\bar P_\Delta\,H_f^{-1/2}. \tag{6.20}$$

Since $|z| \le \rho/3$, we have that $\|K_0\| \le 1/3$, due to (6.4). Next, using (6.7), inequality (4.6) with $f = \tilde G_x$, and inequality (6.10) again, we arrive at

$$K_1 = H_f^{-1/2}\bar P_\Delta\,O(g) + O(g)\,\bar P_\Delta H_f^{-1/2}, \tag{6.21}$$

which, together with (6.15), yields that $\|K_1\| = O(\rho^{-1/2}g)$. Since $g^2 \ll \rho$ and since $\|K_0\| \le 1/3$, this implies (6.18), which in turn yields (6.5). □

Now we boost this proposition to a more complicated result. Let

$$B_V := [H, A_V] \quad\text{and}\quad \bar B_{V,\Delta} := \bar P_\Delta B_V\bar P_\Delta. \tag{6.22}$$

Lemma 6.2. Assume $g^2 \ll \rho \le \theta_\Delta$ and $g \ll \varepsilon^{3/4}\rho^{1/2}$. Then

$$\bar B_{V,\Delta} \ge \tfrac{1}{2}H_f\bar P_\Delta \ge \tfrac{1}{2}\rho\,\bar P_\Delta, \tag{6.23}$$

and, if in addition $3|z| \le \rho$, then

$$\big\|\,|\bar B_{V,\Delta} - z|^{-1/2}\bar P_\Delta\psi\big\| \le 2\,\big\|H_f^{-1/2}\bar P_\Delta\psi\big\|. \tag{6.24}$$

Proof. By the definition of $A_V$, we have

$$\bar B_{V,\Delta} = \bar B_{0,\Delta} - E, \tag{6.25}$$

where, since $\bar P_\Delta P = P\bar P_\Delta = 0$,

$$E = -\bar P_\Delta\big[I,\ \bar P V P - P V^*\bar P\big]\bar P_\Delta = \bar P_\Delta I P V^*\bar P_\Delta + \text{h.c.} = \theta\,\bar P_\Delta I P I\bar R_\varepsilon^{2}\bar P_\Delta + \text{h.c.} \tag{6.26}$$

We claim that

$$\|E\| \le C\theta g^2\varepsilon^{-3/2}. \tag{6.27}$$

Indeed, since $I = a^*(G_x) + a(G_x)$, estimates (4.6) and (4.7) imply that

$$\|\bar P_\Delta I P\| \le Cg. \tag{6.28}$$

It remains to estimate the operator $P I\bar R_\varepsilon^{2}$. It is shown in Lemma 6.4 below that $\|P I\bar R_\varepsilon\| \le cg\varepsilon^{-1/2}$. The last two estimates and the inequality $\|\bar R_\varepsilon\| \le \varepsilon^{-1}$ imply (6.27). The latter estimate together with (6.25) and (6.4) implies (6.23). Equation (6.24) is proven similarly to (6.5). □


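Remark 6.3. It suffices to prove an appropriate $H_f$-form bound on $E$, rather than the norm bound, Eq. (6.27). The former bound would improve our final estimates.

Lemma 6.4. We have

$$\|P I\bar R_\varepsilon\| \le Cg\varepsilon^{-1/2}. \tag{6.29}$$

Proof. We write

$$(\bar R_\varepsilon I P)^*(\bar R_\varepsilon I P) = P I\bar R_\varepsilon^{2} I P. \tag{6.30}$$

Next, we analyze the operator $I\bar R_\varepsilon^{2}I$ restricted to $\mathrm{Ran}\,\chi_{H_f\le\rho}$. To this end we need the pull-through formulae (see (4.11)-(4.12))

$$a(k)\,\bar R_\varepsilon = \bar R_{\varepsilon,\omega(k)}\,a(k), \tag{6.31}$$

$$\bar R_\varepsilon\,a^*(k) = a^*(k)\,\bar R_{\varepsilon,\omega(k)}, \tag{6.32}$$

where

$$\bar R_{\varepsilon,\omega} = \bar R_\varepsilon\big|_{H_f\to H_f+\omega}. \tag{6.33}$$

Recalling (2.9) and pulling, in

$$I\bar R_\varepsilon^{2}I = \big(a^*(G_x) + a(G_x)\big)\,\bar R_\varepsilon^{2}\,\big(a^*(G_x) + a(G_x)\big),$$

the $a$'s to the right and the $a^*$'s to the left with the help of the pull-through formulae (6.31) and (6.32), we obtain

$$I\bar R_\varepsilon^{2}I = M + L, \tag{6.34}$$

where

$$M = \int\overline{G_x(k)}\,\bar R_{\varepsilon,\omega(k)}^{2}\,G_x(k)\,d^3k \tag{6.35}$$

and, with $\omega_i = \omega(k_i)$,

$$L = a^*(G_x)\,\bar R_\varepsilon^{2}\,a(G_x) + \iint\overline{G_x(k_1)}\,a^*(k_2)\,\bar R_{\varepsilon,\omega_1+\omega_2}^{2}\,a(k_1)\,G_x(k_2)\,d^3k_1\,d^3k_2 + \Big[\int\overline{G_x(k)}\,\bar R_{\varepsilon,\omega(k)}^{2}\,a(k)\,a(G_x)\,d^3k + \text{adjoint}\Big]. \tag{6.36}$$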
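The pull-through formulae (6.31)–(6.32) can be checked numerically in a toy setting. The sketch below assumes a single photon mode on a truncated Fock space and models the squared resolvent by a Lorentzian function of $H_f$; the truncation `n_max`, the frequency `omega`, and the values of `eps` and `E` are hypothetical choices made only for illustration.

```python
import numpy as np

n_max = 12           # Fock-space truncation (illustrative assumption)
omega = 0.7          # frequency of the single mode (illustrative assumption)
eps, E = 0.3, 1.1    # regularization and reference energy (illustrative)

# Annihilation operator on span{|0>, ..., |n_max-1>}: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, n_max)), k=1)
a_dag = a.T
Hf = omega * np.diag(np.arange(n_max, dtype=float))


def lorentzian(shift=0.0):
    """[(H_f + shift - E)^2 + eps^2]^{-1}, a stand-in for the squared resolvent."""
    d = np.diag(Hf) + shift - E
    return np.diag(1.0 / (d**2 + eps**2))


# a f(H_f) = f(H_f + omega) a
lhs = a @ lorentzian(0.0)
rhs = lorentzian(omega) @ a

# f(H_f) a* = a* f(H_f + omega)
lhs_adj = lorentzian(0.0) @ a_dag
rhs_adj = a_dag @ lorentzian(omega)

print(np.allclose(lhs, rhs), np.allclose(lhs_adj, rhs_adj))
```

Both identities hold exactly on the truncated space because $a$ lowers the photon number by one, so any function of $H_f$ is simply evaluated at an energy shifted by `omega`.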

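Using that $\|\bar R_{\varepsilon,\omega}\| \le \varepsilon^{-1}$, we estimate the latter operator by

$$\big\|\chi_{H_f\le\rho}\,L\,\chi_{H_f\le\rho}\big\| \le 2\varepsilon^{-2}\iint|G_x(k_1)G_x(k_2)|\,\|a(k_1)a(k_2)\chi_{H_f\le\rho}\| + 2\Big(\varepsilon^{-1}\int|G_x(k)|\,\|a(k)\chi_{H_f\le\rho}\|\Big)^{2}. \tag{6.37}$$

Applying inequalities (4.17) and (4.18) to the r.h.s., we arrive at

$$\big\|\chi_{H_f\le\rho}\,L\,\chi_{H_f\le\rho}\big\| \le 4\varepsilon^{-2}\rho\,g(\rho)^{2}, \tag{6.38}$$

where, recall, $g(\rho) := \sup_x\Big(\int_{\omega\le\rho}\frac{|G_x|^2}{\omega}\Big)^{1/2}$. Since by our restrictions $g(\rho) \le Cg\sqrt{\rho}$, this in turn yields that, on $\mathrm{Ran}\,\chi_{H_f\le\rho}$,

$$I\bar R_\varepsilon^{2}I = M + O(\varepsilon^{-2}g^{2}\rho^{2}). \tag{6.39}$$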

Now it is not hard to convince oneself that

$$\|M\| \le Cg^{2}\varepsilon^{-1}. \tag{6.40}$$

Indeed, remembering expression (6.12) for $\bar P$, one can represent $M$ as a sum of terms of the form

$$\int_0^\infty\frac{f_i(\omega)\,d\omega}{(\omega + H_f - E_{ji})^{2} + \varepsilon^{2}},$$

where $f_i(\omega)$ are continuous functions bounded by $Cg^2$ (in fact, decaying at $\infty$). Performing the change of variable $\omega \to \lambda = \varepsilon^{-1}(\omega + H_f - E_{ji})$, one shows easily that each integral is bounded by $Cg^{2}\varepsilon^{-1}$. Estimates (6.30), (6.39) and (6.40) and the condition $\rho^{2} \le \varepsilon$ imply (6.29). □

7. Proof of Theorem 5.2

First we estimate from below the following operator

$$B_{V,\Delta} := E_\Delta(H_0)\,[H, A_V]\,E_\Delta(H_0). \tag{7.1}$$

Using the definition of $A_V$ (see Eq. (5.2)), we write $B_{V,\Delta}$ as

$$B_{V,\Delta} = \bar B_{V,\Delta} + P_\Delta C^*\bar P_\Delta + \bar P_\Delta C P_\Delta + P_\Delta F P_\Delta, \tag{7.2}$$

where, in accordance with (5.2), (6.2), (6.3) and (6.22),

$$P_\Delta = P\,E_\Delta(H_0), \tag{7.3}$$

$$C = [H - E_j, V] + B_0, \tag{7.4}$$

$$F = B_0 + V^*\bar P I + I\bar P V. \tag{7.5}$$

Here we used that, by virtue of the definition of $V$, we may identify $V \equiv \bar P V P$. The key to the proof is the following inequality which follows from an application of the Feshbach projection method (a derivation is given in Appendix B):

$$\lambda_0 \ge \inf\,\mathrm{spec}\,\big\{E\big|_{\mathrm{Ran}\,P_\Delta}\big\}, \tag{7.6}$$

where

$$\lambda_0 = \inf\,\mathrm{spec}\,\big\{B_{V,\Delta}\big|_{\mathrm{Ran}\,E_\Delta(H_0)}\big\} \tag{7.7}$$

and

$$E = F - C^*\big(\bar B_{V,\Delta} - \lambda_0\big)^{-1}C. \tag{7.8}$$


We may assume here that 3λ0 ≤ ρ; otherwise Theorem 5.2 follows readily from conditions (5.10)–(5.11) on the parameters. With this assumption, Lemma 6.2 is applicable −1 is bounded on Ran P 1 . Hence (7.8) is well-defined. and yields that B V ,1 − λ0 Our task is to estimate E on Ran P1 from below. The first term on the r.h.s. of (7.8) can be easily analyzed. Due to (5.3), V ∗ P I + I P V = 2θ I R ε I. 2

(7.9)

Next, Eqs. (6.6)–(6.10) imply that 3 2 Hf − Cg P1 ≥ −Cg 2 P1 . P1 B0 P1 ≥ 4

(7.10)

Hence, on Ran P1 , 2

F ≥ 2θ I R ε I − Cg 2 .

(7.11)

Next, we estimate from above the operator −1 C, G := C ∗ B V ,1 − λ0

(7.12)

on Ran P1 . A large part of the remainder of this section is devoted to this estimate. As mentioned after Eq. (7.8), Lemma 6.2 is applicable to (7.12). It yields

2

−1/2

P1 C ψ . (7.13) | h G iψ | ≤ 2 Hf From now on, we assume that ψ ∈ Ran P , which implies that P 1 ψ = 0. This relation, the definition of C (Eq. (7.4)), and Eq. (6.6) imply that

−1/2

−1/2

−1/2 P 1 C ψ ≤ Hf P 1 Ieψ + Hf P 1 [H − E j , V ] ψ ,

Hf (7.14) 2

where V = P V P = θ R ε I P . To estimate the first term on the r.h.s. of (7.14), we use ex ) + a(G ex ) (see Eq. (6.7)) and Eqs. (6.3) and (6.4), to obtain that Ie = a ∗ (G

−1/2 −1/2 ∗ e P 1 Ieψ ≤ hxiM E1 (H0 ) · hxi−M Hf a (Gx ) · kψk (7.15)

Hf

1/2 ex ) H −1/2 + ρ −1/2 hxi−M a(G

Hf ψ . f

1/2

Since hxiM E1 (H0 ) ≤ CM and Hf ψ is bounded by ρ 1/2 kψk, we obtain from (4.6)-(4.7) that

−1/2 P 1 Ieψ ≤ C gkψk. (7.16)

Hf It remains to estimate the second term on the r.h.s. of (7.14). Using the properties of P , P 1 , and H − E j = (H0 − E j ) + I , we obtain that −1/2

Hf

P 1 [H − E j , V ] P = θ

4 X 1

Ai ,

(7.17)


where −1/2

A1 := Hf

P 1 (H0 − E j )R ε I P , 2

−1/2

P 1 R ε I P Hf ,

−1/2

P 1Rε I P I P

A2 := −Hf A3 := −Hf and

−1/2

A4 := Hf

(7.18)

2

(7.19)

2

(7.20)

2

P 1I Rε I P .

(7.21)

To estimate the first of these terms we use the expression P 1 (Eqs. (6.3) and (6.12)) and the estimate k(H0 − Ej )Rε k ≤ 1 to obtain −1/2

kA1 k ≤ kHf

E1 (H0 )P part k · kR ε I P k −1/2

+ kPpart ⊗ χHf ≥ρ Rε k · kHf

I P k.

Now we use that, due to (4.6)–(4.7), kP I P k ≤ 2g, kP Hf k ≤ ρ,

−1/2 ∗

−1/2 χHf ≥ρ I P ≤ Hf a (Gx ) + ρ −1/2 a(Gx ) P ≤ 2 g,

Hf

(7.22)

use inequality (6.29) and use the fact that Hf ≥ θ1 on Ran(E1 (H0 )P part ) to obtain −1/2 −1/2

kA1 k ≤ Cθ1

g + 2ρ −1 g.

(7.23)

gρ + 2ρ −1 g.

(7.24)

ε

Similarly we have −1/2 −3/2

kA2 k ≤ Cθ1 −1/2

Next using the estimates kHf √ and kP I P k ≤ g ρ, we find

ε

P 1 k ≤ 2ρ −1/2 (see (6.15)), kRε k ≤ ε−1 , (6.29)

kA3 k ≤ ε−3/2 g 2 . −1/2

Now taking into account expressions (6.3) and (6.12) for P 1 , we estimate kP 1 Hf Cρ −1/2 . Next, using (4.6) and (4.7) we find −1/2

kP 1 Hf

k≤

−1/2 ∗

I R ε k ≤ kHf +

a (Gx )kkR ε k

−1/2 kP 1 Hf kka(Gx )R ε k −1 −1/2 −1

≤ C(g · ε

+ρ

gε

).

Finally using (6.29), we obtain kA4 k ≤ Cρ −1/2 ε−3/2 g 2 .

(7.25)

Collecting the estimates above and remembering (7.17) and remembering that ε ≤ ρ, we find −1/2 −1/2 kHf P 1 [H − E j , V ]P k ≤ Cθg ρ −1 + θ1 ρε−3/2 + ρ −1/2 ε−3/2 g . (7.26)


This together with (7.13), (7.14) and (7.16) gives G ≥ −Cg 2 (1 + θ 2 ρ −2 + θ1−1 θ 2 ρ 2 ε−3 + θ 2 ρ −1 ε−3 g 2 ).

(7.27)

Finally, combining the last inequalities with (7.8), (7.11) and (7.24) yields on RanP1 , E ≥ 2θI R ε I − Cg 2 (1 + θ 2 ρ −2 + θ1−1 θ 2 ρ 2 ε−3 + θ 2 g 2 ε−3 ρ −1 ). 2

(7.28)

This estimate together with (6.39) implies that on Ran P , E ≥ 2θM − Cg 2 (1 + θ 2 ρ −2 + θ 2 ρ 2 ε−3 + θ 2 g 2 ε−3 ρ −1 + θ ε−2 ρ −2 ). Now we analyze the operator P MP . Introducing X (≤j ) (>j ) (≤j ) i Ppart and Ppart := 1part − Ppart , Ppart :=

(7.29)

(7.30)

i: E i ≤E j (>j )

(>j )

and noting that (Hpart − E j ) Ppart ≥ δPpart , for some δ > 0, we estimate

Z

(≤j ) 2 3

P MP − P Gx (k) Ppart R ε,ω(k) Gx (k) d k P

≤ Cg 2 .

This relation can be rewritten as h i−1 X Z dωP + O(g 2 ), fij (ω) (Hf + ω − E j i )2 + ε2 P MP =

(7.31)

(7.32)

i: E i ≤E j

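where $E_{ji} := E_j - E_i$ and $f_{ij}(\omega) = \int_{|k|=\omega}(A_{ij})^*A_{ij}\,dS_\omega$ with the matrices $A_{ij}$ defined in the paragraph preceding Eq. (3.5). Now using the change of variables formula and the mean value theorem we find

$$\int f_{ij}(\omega)\big[(H_f + \omega - E_{ji})^{2} + \varepsilon^{2}\big]^{-1}d\omega = \int f_{ij}(\alpha - H_f)\big[(\alpha - E_{ji})^{2} + \varepsilon^{2}\big]^{-1}d\alpha = \int f_{ij}(\alpha)\big[(\alpha - E_{ji})^{2} + \varepsilon^{2}\big]^{-1}d\alpha + R,$$

where

$$R = -\int_0^1\!\!\int f_{ij}'(\alpha - sH_f)\big[(\alpha - E_{ji})^{2} + \varepsilon^{2}\big]^{-1}d\alpha\,ds\;H_f.$$

Since the functions $f_{ij}$ have, by the assumptions on $G_x(k)$, bounded derivatives, we obtain that

$$RP = O\Big(\frac{g^{2}\rho}{\varepsilon}\Big). \tag{7.33}$$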


Using this together with Eq. (7.32) and with the fact that fij (ω) vanish at ω = 0 and remembering the definition of 0i (see Eq. (3.5)) yields on RanP , i 0j h · 1 + oε (1) + O(ρ) + O(g 2 ), M = (7.34) ε where oε (1) stands for a function of ε vanishing as ε → 0. Equation (7.34) inserted into (7.29) yields E≥

θ0j 2 − oε (1) − O(ρ) ε − O(g 2 ) 1 + θ 2 ρ −2 + θ1−1 θ 2 ρ 2 ε−3 + θ 2 g 2 ε−3 ρ −1 + θ ε−2 ρ 2 .

(7.35)

Since 0j ≥ δj g 2 with δj positive and independent of g and since ρ ≥ ε, we may write (7.35) on Ran P1 as E ≥ where

θ 0j (2 − α1 ) , ε

ε θρ 2 θg 2 ρ2 θε + 2+ 2 + 2+ α1 = O θ ρ ε θ1 ρε ε

(7.36) + oε (1) < 2.

(7.37)

This together with (7.6)–(7.8) (see also the paragraph after Eq. (7.8)) implies BV ,1 ≥

θ γj (2 − α1 ) E1 (H0 )2 , ε

(7.38)

where, we recall, γj is the smallest eigenvalue of 0j . Now we derive (5.9) from (7.38). Let 1 ⊂⊂ 10 and pick a smooth function h supported in 10 and equal to 1 on 1. Moreover, we denote E 1 (λ) = 1 − E1 (λ). We use the estimate

(7.39)

h(H ) − h(H0 ) (H0 + i)1/2 ≤ C g/|1|, which can be easily derived using operator calculus (see, e.g., [14]) and (4.6)–(4.7). Recalling that BV ,1 = E1 (H0 ) BV E1 (H0 ) and BV = [H, AV ], we may write E1 (H ) BV E1 (H ) = E1 (H )BV ,10 E1 (H ) + S + T ,

(7.40)

where S = E1 (H )E 10 (H0 ) BV E10 (H0 )E1 (H ) + adjoint, and T = E1 (H )E 10 (H0 ) BV E 10 (H0 )E1 (H ). Writing E1 (H )E 10 (H0 ) as E1 (H ) h(H ) − h(H0 ) · E 10 (H0 ) and using Eq. (7.39) we obtain (∗) E1 (H )E 10 (H0 ) = E1 (H )O(g), and similarly for the adjoint operator. The latter estimate implies that T = E1 (H )O(g 2 )E1 (H ).

(7.41)


Next, we write

$$B_V = [H_0, A] + U,$$

where $U := [H_0,\ \bar P V P - P V^*\bar P] + [I, A_V]$, and use that $[H_0, A] = H_f$ and therefore commutes with $E_{\Delta'}(H_0)$, so that

$$S = E_\Delta(H)\,\bar E_{\Delta'}(H_0)\,U\,E_{\Delta'}(H_0)\,E_\Delta(H) + \text{h.c.}$$

Again Eqs. (6.29) and (∗), together with elementary estimates similar to those performed above, imply that

$$S = E_\Delta(H)\,O(\theta\varepsilon^{-3/2}g^{2}\rho)\,E_\Delta(H). \tag{7.42}$$

Combining estimates (7.40)–(7.42), we obtain

$$E_\Delta(H)B_VE_\Delta(H) \ge E_\Delta(H)\big(B_{V,\Delta} - C\theta\varepsilon^{-3/2}g^{2}\rho\big)E_\Delta(H).$$

Now using inequalities (7.38) and (7.39) we arrive at

$$E_\Delta(H)B_VE_\Delta(H) \ge \Big[\frac{\theta\gamma_j(2-\alpha_1)}{\varepsilon}\big(1 - O(g)\big) - C\theta\varepsilon^{-3/2}g^{2}\rho\Big]E_\Delta(H)^{2}.$$

It is not hard now to identify this inequality with (5.9).

A. Proof of Lemma 5.6

Both statements of Lemma 5.6 follow in a standard fashion (see [25], Theorems XIII.23 and XIII.25) from the following result (cf. Theorem 4.9 of [6] and Theorem 7.1 of [23]).

Theorem A.1. Under the assumptions of Lemma 5.6,

$$\big\|\langle A\rangle^{-\alpha}(H - z)^{-1}\langle A\rangle^{-\alpha}\big\| \le C \tag{A.1}$$

uniformly in $z \in \mathbb{C}_+$ with $\mathrm{Re}\,z \in \Delta$, provided $\alpha > \tfrac{1}{2}$.

Proof. Here we prove this theorem for $\alpha = 1$. Its extension to the case of $\alpha > \tfrac{1}{2}$ is done by repeating the proof of Theorem 7.8 of [23]. Our proof follows closely the proofs of Theorem 4.9 of [6] and Theorem 7.1 of [23]. Let $\Delta \subset\subset \Delta_1 \subset\subset \Delta_2 \subset\subset \Delta'$ and $f \in C_0^\infty(\Delta_2)$, with $f \equiv 1$ on $\Delta_1$ and $f \ge 0$. We use the following notation,

$$M := f(H)[A, H]f(H). \tag{A.2}$$

Note that due to (5.27), $M \ge \theta f(H)^{2}$ and $M^* = M$. Since $\|(H - i\varepsilon M - z)u\| \ge \mathrm{Im}\,\langle(-H + i\varepsilon M + z)u, u\rangle/\|u\| \ge \mathrm{Im}\,z\,\|u\|$ and similarly for the adjoint operator, we have that (see Lemma 4.4(a) of [6] or Lemma 7.3(a) of [23]): for $\varepsilon \ge 0$ and $\mathrm{Im}\,z > 0$,

$$H - i\varepsilon M - z \ \text{is invertible.} \tag{A.3}$$

Denote $G_\varepsilon(z) = (H - i\varepsilon M - z)^{-1}$. Moreover, we introduce also $F_\varepsilon(z) := DG_\varepsilon(z)D$ with $D = \langle A\rangle^{-1}$. In what follows the argument $z$ is assumed to satisfy $\mathrm{Re}\,z \in \Delta$ and $\mathrm{Im}\,z > 0$; it is fixed and often omitted from the notation. We begin with a series of simple lemmata.


Lemma A.2. For z ∈ C+ with Re z ∈ 1, and ε ≥ 0, kGε (z)k ≤ C/ε.

(A.4)

Proof. Let f = f (H ). The relations kf Gε ϕk2 = hG∗ε f 2 Gε iϕ , M ≥ θf 2 and Im z ≥ 0 imply 1 hG∗ 2εMGε iϕ 2εθ ε 1 hG∗ (2εM + 2 Im z)Gε iϕ , ≤ 2εθ ε where we used the notation hBiϕ = hϕ, Bϕi. Now, an application of the second resolvent 1 hiG∗ε − iGε iϕ , which in turn implies equation yields kf Gε ϕk2 ≤ 2εθ kf Gε ϕk2 ≤

1/2 1 kf Gε ϕk ≤ √ hGε iϕ , εθ

(A.5)

1 kf Gε k ≤ √ kGε k1/2 . εθ

(A.6)

and therefore

Next, applying the second resolvent equation to Gε and G0 and using that kf¯G0 k < ∞, thanks to dist(z, R\11 ) > 0, we find kf¯Gε k ≤ C(1 + εkGε k),

(A.7)

where f¯ = 1 − f . This inequality together with (A.6) implies (A.4). u t This lemma and its proof have two consequences important for us: kf¯(H )Gε (z)k ≤ C,

(A.8)

uniformly in ε > 0, where f¯(H ) = 1 − f (H ), due to Eqs. (A.4) and (A.7); and kf (H )Gε (z)Dk ≤ Cε−1/2 kFε (z)k1/2 ,

(A.9)

due to Eq. (A.5) with ϕ = Du. The last two equations imply in turn that kGε (z)Dk ≤ C 1 + ε−1/2 kFε (z)k1/2 .

(A.10)

In what follows we assume that φ is a cut-off function satisfying p φ ∈ C0∞ (10 ) and φ ≡ 1 on 12 . φ ≥ 0,

(A.11)

Next, we introduce the symmetric operator Aφ := φ(H )Aφ(H ),

(A.12)

which is well defined in D(A), due to (5.19). Now define [H, Aφ ] as a quadratic form on D(A) ∩ D(H ). Then [H, Aφ ] = φ(H )[H, A]φ(H )

(A.13)

in a sense of quadratic forms. This relation implies that the operator Bφ := [H, Aφ ] is bounded.

(A.14)


Lemma A.3. Let B := [H, A]. For any ψ ∈ C0∞ (12 ), the operator ψ(H )[B, Aφ ]ψ(H ) is bounded.

(A.15)

Here the operator in (A.15) is initially defined in a sense of quadratic forms. Proof. In the proof below we omit the argument H in φ(H ) and ψ(H ). Using that ψ · φ = ψ, we compute as quadratic forms ψ[B, Aφ ]ψ = ψ[B, A]ψ + ψB[φ, A]ψ + ψ[φ, A]Bψ.

(A.16)

Since, by (5.5), (5.14) and (5.27), ψB, Bψ, [φ, A] and ψ[B, A]ψ are bounded we conclude that the r.h.s. of (A.16) is bounded, so (A.15) follows. u t Our last preparatory step is the following Lemma A.4. The operator [M, Aφ ] defined initially in a sense of quadratic forms is bounded. Proof. Using that f · φ = f and omitting again the argument H , we compute in a sense of quadratic forms [M, Aφ ] = f B[f, Aφ ] + f [B, Aφ ]f + [f, Aφ ]Bf. Since f B, Bf , [f, Aφ ] = φ[f, A]φ and f [B, Aφ ]f are bounded by virtue of (5.26), (5.14) and (A.15), the statement follows. u t Now we are ready for a core estimate of this proof. Lemma A.5. We have the following estimate:

dF (z)

ε

≤ C kFε (z)k + ε−1/2 kFε (z)k1/2 + 1 .

dε

(A.17)

Proof. Using the definitions of Gε (z) and Fε (z), we compute −

dFε = DGε MGε D. dε

Since f · φ = f , we have that M = f Bφ f . Now we decompose dFε = Q1 + Q2 + Q3 , dε

(A.18)

where Q1 = DGε f¯Bφ f¯Gε D, Q2 = DGε f¯Bφ f Gε D + DGε f Bφ f¯Gε D, Q3 = −DGε Bφ Gε D. We bound now the Qj ’s. Equations (A.14) and Eq. (A.8) imply kQ1 k ≤ kDGε f¯k2 kBφ k ≤ C.

(A.19)


Next, (A.8), (A.9) and (A.14) yield kQ2 k ≤ 2kDGε f¯kkBφkkf Gε Dk ≤ √Cε kFε k1/2 .

(A.20)

The term Q3 is more complicated. We decompose it as −Q3 = Q4 + Q5 ,

(A.21)

where Q4 = DGε [H − iεM − z, Aφ ]Gε D

(∗)

and Q5 = iεDGε [M, Aφ ]Gε D. Expanding the commutator in (∗), we find Q4 = DAφ Gε D − DGε Aφ D. Hence, due to kDAφ k ≤ C and (A.10), kQ4 k ≤ 2kDAφ kkGε Dk ≤ C(1 + ε −1/2 kFε k1/2 ).

(A.22)

Finally, we have due to (A.10) and Lemma A.4, kQ5 k ≤ εkDGε k2 k[M, Aφ ]k ≤ C(ε + kFε k).

(A.23)

Now, Eqs. (A.18)–(A.23) imply (A.17). □

To complete the proof of Theorem A.1 we iterate the rough estimate

$$\|F_\varepsilon(z)\| \le \frac{C}{\varepsilon}, \tag{A.24}$$

which follows from (A.4), with the help of the differential inequality (A.17). On the first step, plugging (A.24) into the r.h.s. of (A.17), we obtain $\big\|\frac{dF_\varepsilon(z)}{d\varepsilon}\big\| \le \frac{C}{\varepsilon}$. Integrating the latter inequality from $\varepsilon$ to 1 and using that, due to (A.24), $\|F_1(z)\| \le C$, we find $\|F_\varepsilon(z)\| \le C\log\frac{1}{\varepsilon}$. Plugging the latter estimate into the r.h.s. of (A.17) yields now

$$\Big\|\frac{dF_\varepsilon(z)}{d\varepsilon}\Big\| \le C\sqrt{\frac{\log\frac{1}{\varepsilon}}{\varepsilon}},$$

which upon integration from 0 to 1 gives $\|F_0(z)\| \le C$, uniformly in $z$, $\mathrm{Im}\,z > 0$ and $\mathrm{Re}\,z \in \Delta$, which, by virtue of the definition of $F_\varepsilon(z)$, is equivalent to the statement of Theorem A.1. □
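The two-step bootstrap above can be followed numerically on scalar majorants. The sketch below is an illustration only: it assumes $C = 1$ in both (A.17) and (A.24), replaces the operator norms by scalar bounds, and uses hypothetical helper names. Starting from `b0` (the rough bound $C/\varepsilon$), one integration of the right-hand side of (A.17) gives a logarithmically growing bound `b1`, and a second integration gives a bound `b2` that stays finite as $\varepsilon \to 0$.

```python
import numpy as np

C = 1.0  # constant in (A.17) and (A.24); set to 1 for illustration


def rhs(F, eps):
    """Right-hand side of (A.17) evaluated on a scalar majorant F of ||F_eps||."""
    return C * (F + np.sqrt(F / eps) + 1.0)


def integrate_down(bound, eps, start=C, n=200001):
    """start + integral_eps^1 rhs(bound(s), s) ds, computed in t = log(1/s)."""
    t = np.linspace(0.0, np.log(1.0 / eps), n)
    s = np.exp(-t)
    y = rhs(bound(s), s) * s                      # ds = -s dt
    return start + np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))


def b0(s):
    """Rough bound (A.24)."""
    return C / s


def b1(s):
    """First iterate in closed form for C = 1: 1 + int_s^1 (2/u + 1) du."""
    return 2.0 + 2.0 * np.log(1.0 / s) - s


def b2(eps):
    """Second iterate, obtained by integrating (A.17) with b1 inserted."""
    return integrate_down(b1, eps)


for eps in (1e-2, 1e-4, 1e-6, 1e-8):
    print(f"eps={eps:.0e}  b0={b0(eps):.3e}  b1={b1(eps):.3f}  b2={b2(eps):.3f}")
```

The printed values show `b0` blowing up like $1/\varepsilon$, `b1` growing only logarithmically, and `b2` converging, which mirrors the uniform bound on $\|F_0(z)\|$.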


B. Feshbach Projection Method

Lemma B.1. Let $B$ be a self-adjoint operator on a Hilbert space $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$ and let (in the obvious notation) $B_{22} \ge \theta\,\mathrm{id}_{\mathcal{H}_2}$, $\theta > 0$. Then $\lambda_0 := \inf\,\mathrm{spec}\,B$ is either $\ge \theta$ or it satisfies the relation

$$\lambda_0 = \inf\,\mathrm{spec}\,\big(B_{11} - B_{12}(B_{22} - \lambda_0)^{-1}B_{21}\big). \tag{B.1}$$

Proof. Let $\lambda_0 < \theta$. The Feshbach projection method implies that

$$\lambda \in \sigma(B) \ \text{iff} \ \lambda \in \sigma\big(B_{11} - B_{12}(B_{22} - \lambda)^{-1}B_{21}\big), \tag{B.2}$$

provided $\lambda < \theta$, which implies (B.1). □

Acknowledgements. V. Bach thanks C. Gerard, M. Hübner, and H. Spohn for useful discussions and A. Friedman and R. Gulliver, for hospitality at the IMA, University of Minnesota. I.M. Sigal is grateful to W. Hunziker and J. Fröhlich for hospitality at ETH Zürich and to A. Friedman and R. Gulliver for hospitality at the IMA, University of Minnesota. The authors are grateful to the referee and to M. Merkli for many useful remarks. The research of V. Bach is supported by the Sonderforschungsbereich 288 of the DFG and by the TMR network on “PDE and QM” of the EU, the research of I.M. Sigal is supported by NSERC Grant NA 7901 and research of A. Soffer is supported in part by U.S. NSF grant DMS-9401777 and by Grant Award of FAS-Rutgers.

References

1. Amrein, W.O., Boutet de Monvel, A. and Georgescu, V.: C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians. Birkhäuser–Basel–Boston–Berlin: 1996
2. Bach, V., Fröhlich, J. and Sigal, I.M.: Mathematical theory of non-relativistic matter and radiation. Lett. Math. Phys. 34, 183–201 (1995)
3. Bach, V., Fröhlich, J. and Sigal, I.M.: Quantum electrodynamics of confined nonrelativistic particles. Adv. Math. 137, 205–298 (1998)
4. Bach, V., Fröhlich, J. and Sigal, I.M.: Renormalization group analysis of spectral problems in quantum field theory. Adv. Math. 137, 299–395 (1998)
5. Bach, V., Fröhlich, J. and Sigal, I.M.: Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field. Commun. Math. Phys. to appear 1999
6. Cycon, H., Froese, R., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer, 1st, 1987
7. Froese, R. and Herbst, I.: A new proof of the Mourre estimate. Duke Math. J. 49, 1075–1085 (1982)
8. Fröhlich, J.: Existence of dressed one-electron states in a class of persistent models. Fortschr. Phys. 22, 159–198 (1974)
9. Hiroshima, F.: Functional integral representation of a model in QED. 1998, to appear
10. Hübner, M. and Spohn, H.: Atom interacting with photons: An N-body Schrödinger problem. Preprint, 1994
11. Hübner, M. and Spohn, H.: Radiative decay: Nonperturbative approaches. Rev. Math. Phys. 7, 363–387 (1995)
12. Hübner, M. and Spohn, H.: Spectral properties of the spin-boson Hamiltonian. Ann. Inst. H. Poincaré 62, 289–323 (1995)
13. Hunziker, W. and Sigal, I.M.: The general theory of N-body quantum systems. CRM Proceedings and Lecture Notes 8, 35–72 (1995)
14. Hunziker, W. and Sigal, I.M.: Time dependent scattering theory for many-particle systems. Rev. Math. Phys., to appear 1999
15. Hunziker, W., Sigal, I.M. and Soffer, A.: Minimal velocity estimates. Preprint, Zürich, 1997
16. Jaksic, V. and Pillet, C.A.: On a model for quantum friction I: Fermi’s golden rule and dynamics at positive temperature. Commun. Math. Phys. 176, 619–644 (1996)
17. Jaksic, V. and Pillet, C.A.: On a model for quantum friction II: Ergodic properties of spin-boson system. Commun. Math. Phys. 178, 627–651 (1996)
18. Kato, T.: Smooth operators and commutators. Stud. Math. Appl. 31, 535–546 (1968)
19. Lavine, R.: Absolute continuity of Hamiltonian operators with repulsive potentials. Proc. Am. Math. Soc. 22, 55–60 (1969)
20. Mourre, E.: Absence of singular continuous spectrum for certain self-adjoint operators. Commun. Math. Phys. 78, 391–408 (1981)
21. Mourre, E.: Opérateurs conjugués et propriétés de propagation. Commun. Math. Phys. 91, 279–300 (1983)
22. Okamoto, T. and Yajima, K.: Complex scaling technique in non-relativistic qed. Ann. Inst. H. Poincaré 42, 311–327 (1985)
23. Perry, P., Sigal, I.M. and Simon, B.: Spectral analysis of n-body Schrödinger operators. Annals Math. 114, 519–567 (1981)
24. Putnam, C.R.: Commutation Properties of Hilbert Space Operators and Related Topics. Berlin–Heidelberg–New York: Springer-Verlag, 1967
25. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, IV. New York: Academic Press
26. Sigal, I.M. and Soffer, A.: The n-particle scattering problem: Asymptotic completeness for short-range systems. Ann. Math. 126, 35–108 (1987)

Communicated by A. Kupiainen

Commun. Math. Phys. 207, 589 – 620 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Generalized Orthogonal Polynomials, Discrete KP and Riemann–Hilbert Problems

M. Adler^{1,*}, P. van Moerbeke^{1,2,**}

1 Department of Mathematics, Brandeis University, Waltham, MA 02154, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematics, Université de Louvain, 1348 Louvain-la-Neuve, Belgium. E-mail: [email protected]

Received: 8 September 1998 / Accepted: 27 April 1999

Dedicated to Jürgen Moser, on the occasion of his 70th birthday

Abstract: Classically, a single weight on an interval of the real line leads to moments, orthogonal polynomials and tridiagonal matrices. Appropriately deforming this weight with times $t = (t_1, t_2, \dots)$ leads to the standard Toda lattice and τ-functions, expressed as Hermitian matrix integrals. This paper is concerned with a sequence of t-perturbed weights, rather than one single weight. This sequence leads to moments, polynomials and a (fuller) matrix evolving according to the discrete KP-hierarchy. The associated τ-functions have integral, as well as vertex operator representations. Among the examples considered, we mention: nested Calogero–Moser systems, concatenated solitons and m-periodic sequences of weights. The latter lead to 2m + 1-band matrices and generalized orthogonal polynomials, also arising in the context of a Riemann–Hilbert problem. We show the Riemann–Hilbert factorization is tantamount to the factorization of the moment matrix into the product of a lower- times upper-triangular matrix.

Contents
0. Introduction
1. Vertex Operator Solutions to the Discrete KP Hierarchy
2. Moment Matrix Factorization and Solutions to Discrete KP and 2d-Toda
3. Weights, Flags and Dual Flags
4. Toda Lattice, Matrix Integrals and Riemann–Hilbert for Orthogonal Polynomials
5. Periodic Sequences of Weights, 2m + 1-Band Matrices and Riemann–Hilbert Problems
6. Soliton Formula
7. Calogero–Moser System
8. Discrete KdV-Solutions, with Upper-Triangular $L^2$

* The support of a National Science Foundation grant # DMS-98-4-50790 is gratefully acknowledged.
** The support of a National Science Foundation grant # DMS-98-4-50790, a Nato, a FNRS and a Francqui Foundation grant is gratefully acknowledged. Some of the present work was done at the Centre Emile Borel, Paris (fall 96).

0. Introduction

The starting point in the standard theory of orthogonal polynomials is a single weight $\rho(z)dz$ on an interval of the real line. The latter leads to moments $\mu_{ij} = \langle z^i, z^j\rho\rangle$, depending on $i + j$ only; in turn, moments lead to polynomials $p_n(z)$, defined by the determinant (0.2) below, and the spectral relation $zp_n = (Lp)_n$ defines tridiagonal semi-infinite matrices $L$. An important recent development in this ancient theory is that the perturbed weight $e^{\sum_1^\infty t_i z^i}\rho(z)dz$ leads to t-dependent tridiagonal matrices $L(t)$ satisfying the standard Toda lattice equations; the determinants of the principal minors of the moment matrix are τ-functions for the Toda lattice and are representable as integrals over Hermitian matrices, as developed extensively in [1].

This paper is designed to show the reader how the introduction of an infinite family of weights $\rho_j(z)dz$, rather than a family $z^j\rho(z)dz$ generated by one weight $\rho(z)dz$, leads to a theory having many features in common with the classic situation above. The weights lead to "moments" $\mu_{ij}$, to a semi-infinite moment matrix $m_\infty$, to polynomials $p_n(z)$, as in (0.2), and to semi-infinite matrices $L$ of type (0.4) below, defined by $zp_n(z) = (Lp(z))_n$. We mainly deal with:

(i) t-deformations ($t = (t_1, t_2, \dots)$)
$$\rho_j(t; z) = e^{\sum_1^\infty t_i z^i}\rho_j(z), \qquad t = (t_1, t_2, \dots) \in \mathbb{C}^\infty,\ z \in \mathbb{R},\ j = 0, 1, 2, \dots$$
of the weights; they imply for the matrix $L$ the so-called "discrete KP-hierarchy" in $t$; this hierarchy is fully described in [2], and a large class of solutions is explained in Sect. 1. Occasionally, we shall deal with

(ii) (t, s)-deformations^1
$$\rho_j(t, s; z) = e^{\sum_1^\infty t_i z^i}\sum_{\ell=0}^{\infty} F_\ell(-s)\rho_{j+\ell}(z), \qquad t, s \in \mathbb{C}^\infty,\ z \in \mathbb{R},\ j = 0, 1, 2, \dots$$
of the weights $\rho_j$; they imply for $L$ the 2d-Toda hierarchy, as described in [16,3] and summarized in Sect. 2.

To be specific, given a family of weights $\rho_0(z)dz, \rho_1(z)dz, \dots$ on $\mathbb{R}$, and their t-deformations,
$$\rho_j^t(z)dz := \rho_j(t; z)dz = e^{\sum_1^\infty t_k z^k}\rho_j(z)dz,$$
define the "moments", with regard to the usual integration in $\mathbb{R}$: $\mu_{ij} := \langle z^i, \rho_j(z)\rangle$ and $\mu_{ij}(t) := \langle z^i, \rho_j^t(z)\rangle$, and the moment matrix
$$m_n(t) := \big(\mu_{ij}(t)\big)_{0\le i,j\le n-1}. \tag{0.1}$$

^1 where the $F_i(t)$ are the elementary Schur polynomials, $e^{\sum_1^\infty t_i z^i} = \sum_{i=0}^{\infty} F_i(t)z^i$.
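The generating function in footnote 1 determines the elementary Schur polynomials recursively. The following is a minimal sketch, not taken from the paper: differentiating $e^{\sum t_k z^k} = \sum F_n(t) z^n$ in $z$ gives $nF_n = \sum_{k=1}^{n} k\,t_k F_{n-k}$ with $F_0 = 1$. The function name `schur_polynomials` and the truncation parameter `n_max` are illustrative assumptions.

```python
from fractions import Fraction


def schur_polynomials(t, n_max):
    """Return [F_0(t), ..., F_{n_max}(t)] for t = (t_1, t_2, ...).

    Hypothetical helper: uses n*F_n = sum_{k=1}^{n} k*t_k*F_{n-k}, F_0 = 1,
    obtained by differentiating exp(sum t_k z^k) = sum F_n z^n in z.
    Times t_k beyond the end of the list are treated as zero.
    """
    t = [Fraction(x) for x in t]
    F = [Fraction(1)]
    for n in range(1, n_max + 1):
        total = Fraction(0)
        for k in range(1, n + 1):
            t_k = t[k - 1] if k <= len(t) else Fraction(0)
            total += k * t_k * F[n - k]
        F.append(total / n)
    return F
```

For instance, the first few outputs reproduce $F_1 = t_1$, $F_2 = t_1^2/2 + t_2$ and $F_3 = t_1^3/6 + t_1t_2 + t_3$.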


Then the semi-infinite moment matrix $m_\infty$ satisfies the linear differential equations
$$\frac{\partial m_\infty}{\partial t_k} = \Lambda^k m_\infty,$$
where $\Lambda$ denotes the standard shift matrix. They form an infinite set of commuting vector fields. Generically the semi-infinite moment matrix $m_\infty$ admits a (unique) factorization into lower- and upper-triangular matrices $S_1$ and $S_2$ respectively, with $S_1$ having 1's on the diagonal: $m_\infty = S_1^{-1}S_2$. Consider the vector $p(t, z) := (p_n(t, z))_{n\ge 0}$ of monic polynomials in $z$, depending^2 on $t = (t_1, t_2, \dots) \in \mathbb{C}^\infty$,
$$p_n(t, z) := (S_1\chi(z))_n := \frac{1}{\det m_n(t)}\det\begin{pmatrix}\mu_{00}(t) & \dots & \mu_{0,n-1}(t) & 1\\ \vdots & & \vdots & \vdots\\ \mu_{n-1,0}(t) & \dots & \mu_{n-1,n-1}(t) & z^{n-1}\\ \mu_{n0}(t) & \dots & \mu_{n,n-1}(t) & z^n\end{pmatrix}. \tag{0.2}$$
The eigenvalue problem
$$zp(t, z) = L(t)p(t, z) \tag{0.3}$$
or, alternatively, the $S_1$-matrix in the factorization above, gives rise to the semi-infinite matrix
$$L = S_1\Lambda S_1^{-1} = \Lambda + a\Lambda^0 + \Lambda^{\top}b + \Lambda^{\top 2}c + \cdots = \begin{pmatrix} a_0 & 1 & 0 & 0 & \\ b_0 & a_1 & 1 & 0 & \\ c_0 & b_1 & a_2 & 1 & \\ \vdots & \ddots & \ddots & \ddots & \ddots\end{pmatrix}. \tag{0.4}$$
The polynomials $p_n(t, z)$ also give rise to a Grassmannian flag of nested infinite-dimensional planes $\cdots \supset W_n^t \supset W_{n+1}^t \supset \dots$, given by^3
$$H_+ \supset W_n^t := W_n e^{-\sum_1^\infty t_i z^i} := \operatorname{span}\{p_n(t, z), p_{n+1}(t, z), \dots\}. \tag{0.5}$$
We shall also need the associated "Vandermonde" determinants^4,
$$\Delta_n^{(\rho)}(z) = \det\big(\rho_{\ell-1}(z_k)\big)_{1\le \ell,k\le n}, \qquad \Delta_n(z) = \det\big(z_k^{\ell-1}\big)_{1\le \ell,k\le n}, \tag{0.6}$$
and the simple vertex operator,
$$X(t, z) := e^{\sum_1^\infty t_i z^i}\, e^{-\sum_1^\infty \frac{z^{-i}}{i}\frac{\partial}{\partial t_i}}; \tag{0.7}$$
this is, in disguise, a Darboux transform acting on KP τ-functions. We now state:

^2 $\chi(z) := (1, z, z^2, \dots)$ and $\chi^*(z) := \chi(z^{-1})$
^3 $H_+ := \operatorname{span}\{1, z, z^2, \dots\}$
^4 $\Delta_n(z) = \prod_{1\le j<i\le n}(z_i - z_j)$

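Theorem 0.1. Given the moments (0.1) and the construction above, the semi-infinite matrix $L$ in (0.4) satisfies the discrete KP hierarchy
$$\frac{\partial L}{\partial t_n} = [(L^n)_+, L], \qquad n = 1, 2, \dots, \tag{0.8}$$
and has the following τ-function representation^5:
$$L = \sum_{\ell=0}^{\infty}\operatorname{diag}\left(\frac{F_\ell(\tilde\partial)\,\tau_{n+2-\ell}\circ\tau_n}{\tau_{n+2-\ell}\,\tau_n}\right)_{n\ge 0}\Lambda^{1-\ell} \tag{0.9}$$
in terms of a sequence of τ-functions $(\tau_0 = 1, \tau_1, \tau_2, \dots)$, which enjoys many different representations:
$$\tau_n(t) = \det\big(\mu_{\ell,k}(t)\big)_{0\le \ell,k\le n-1} \quad\text{(moment representation)} \tag{0.10}$$
$$= \frac{1}{n!}\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n(z)\,\Delta_n^{(\rho)}(z)\prod_{k=1}^{n}e^{\sum_1^\infty t_i z_k^i}dz_k \quad\text{(integral representation)} \tag{0.11}$$
$$= \det\operatorname{Proj}:\ e^{-\sum_1^\infty t_i z^i}z^{-n}W_n \to H_+ \quad\text{(flag representation)} \tag{0.12}$$
$$= \det\operatorname{Proj}:\ e^{\sum_1^\infty t_i z^i}z^{n}W_n^* \to H_+ \quad\text{(dual flag representation)} \tag{0.13}$$
$$= \int_{\mathbb{R}}dz\, z^{n-1}\rho_{n-1}(z)X(t, z)\,\tau_{n-1}(t) \quad\text{(vertex representation)}, \tag{0.14}$$
where
$$W_n e^{-\sum_1^\infty t_i z^i} = \operatorname{span}\{p_n(t, z), p_{n+1}(t, z), \dots\} \subset H_+,$$
$$W_n^* e^{\sum_1^\infty t_i z^i} = \operatorname{span}\left\{\int_{\mathbb{R}}\frac{\rho_j^t(u)\,du}{z-u},\ j = 0, \dots, n-1\right\}\oplus H_+ \supset H_+. \tag{0.15}$$
The polynomials (0.2) have the following representations:
$$p_n(t, z) = \frac{\det\big(z\mu_{ij}(t) - \mu_{i+1,j}(t)\big)_{0\le i,j\le n-1}}{\det\big(\mu_{ij}(t)\big)_{0\le i,j\le n-1}} \tag{0.16}$$
$$= \frac{1}{n!\,\tau_n(t)}\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n(z)\,\Delta_n^{(\rho)}(z)\prod_{k=1}^{n}e^{\sum t_i z_k^i}(z - z_k)\,dz_k, \tag{0.17}$$
and satisfy the eigenvalue problem $Lp = zp$.

^5 where $F_\ell(\tilde\partial) = F_\ell\big(\frac{\partial}{\partial t_1}, \frac12\frac{\partial}{\partial t_2}, \frac13\frac{\partial}{\partial t_3}, \dots\big)$, for the elementary Schur polynomials $F_\ell$. The symbol $F_\ell(\tilde\partial)f\circ g$ is the customary Hirota symbol.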
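The determinantal formulas (0.2) and (0.16) can be checked on a small example. The sketch below is illustrative only: it assumes the weights $\rho_j(z) = z^j$ on $[0, 1]$ at $t = 0$ (so $\mu_{ij} = 1/(i+j+1)$), and the helper names `det`, `moments`, `p_coeffs` and `p_via_016` are hypothetical. It expands the determinant in (0.2) along its last column, then evaluates (0.16) at a rational point.

```python
from fractions import Fraction


def det(a):
    """Determinant by exact Gaussian elimination over Fractions."""
    a = [row[:] for row in a]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def moments(size):
    """Assumed example: rho_j(z) = z^j on [0, 1], so mu_ij = 1/(i+j+1)."""
    return [[Fraction(1, i + j + 1) for j in range(size)] for i in range(size)]


def p_coeffs(mu, n):
    """Coefficients [c_0, ..., c_n] of p_n from (0.2), by cofactor expansion
    along the last column (1, z, ..., z^n)."""
    tau_n = det([row[:n] for row in mu[:n]])
    coeffs = []
    for i in range(n + 1):
        minor = [mu[r][:n] for r in range(n + 1) if r != i]
        coeffs.append((-1) ** (i + n) * det(minor) / tau_n)
    return coeffs


def p_via_016(mu, n, z):
    """Value p_n(z) computed from formula (0.16)."""
    tau_n = det([row[:n] for row in mu[:n]])
    shifted = [[z * mu[i][j] - mu[i + 1][j] for j in range(n)] for i in range(n)]
    return det(shifted) / tau_n
```

With these assumed moments, `p_coeffs(moments(4), 2)` gives the coefficients of $z^2 - z + 1/6$, the monic polynomial orthogonal to $1$ and $z$ on $[0, 1]$, and `p_via_016` agrees with it when evaluated at a point.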

Notice that formulae (0.16) and (0.17) go in parallel with (0.10) and (0.11). Formula (0.17) is a generalization of a formula for classical orthogonal polynomials already appearing last century in the work of Heine [11]. We shall apply this theorem to a variety of examples, corresponding to Sects. 4 to 8 ($\delta(x)$ is the customary delta-function, i.e., $\int_{\mathbb{R}}\delta(x)f(x)dx = f(0)$):

$\rho_j(z) := z^j\rho(z)$: tridiagonal matrix $L$,
$\rho_{j+km}(z) := z^{km}\rho_j(z)$: 2m + 1-band matrix $L^m$,
$\rho_k(z) = \delta(z - p_{k+1}) - \lambda_{k+1}^2\,\delta(z - q_{k+1})$: concatenated solitons,
$\rho_k(z) = \delta'(z - p_{k+1}) + \lambda_{k+1}\,\delta(z - p_{k+1})$: nested Calogero–Moser systems,
$\rho_k(z) = (-1)^k\delta^{(k)}(z - p) - \delta^{(k)}(z + p)$: upper-triangular $L^2$.

The first example leads to the standard Toda lattice and the classic theory of orthogonal polynomials. Since the work of Fokas, Its and Kitaev [9], the Riemann–Hilbert method is a device to obtain asymptotics for orthogonal polynomials; for semi-classical asymptotics, see Bleher and Its [7]. We show
$$\text{Riemann–Hilbert factorization} \iff \text{factorization } m_\infty = S_1^{-1}S_2. \tag{0.18}$$
To be precise, we show the Riemann–Hilbert matrices $Y_n$ take on the following form ($\chi(z)$ and $\chi^*(z)$ are as in footnote 2 and $h_{n-1} := \tau_n/\tau_{n-1}$):
$$Y_n(z) = \begin{pmatrix}(S_1\chi(z))_n & \frac1z(S_2\chi^*(z))_n\\[2pt] h_{n-1}^{-1}(S_1\chi(z))_{n-1} & h_{n-1}^{-1}\frac1z(S_2\chi^*(z))_{n-1}\end{pmatrix} = \begin{pmatrix}\dfrac{\tau_n(t-[z^{-1}])}{\tau_n(t)}z^{n} & \dfrac{\tau_{n+1}(t+[z^{-1}])}{\tau_n(t)}z^{-n-1}\\[8pt] \dfrac{\tau_{n-1}(t-[z^{-1}])}{\tau_n(t)}z^{n-1} & \dfrac{\tau_n(t+[z^{-1}])}{\tau_n(t)}z^{-n}\end{pmatrix}. \tag{0.19}$$
The second example, which is novel and which is developed in Sect. 5, involves a finite set of weights $\rho_0(z)dz, \dots, \rho_{m-1}(z)dz$ for $m \ge 2$, which we extend into an infinite "m-periodic" sequence
$$\rho_0(z)dz, \dots, \rho_{m-1}(z)dz,\ z^m\rho_0(z)dz, \dots, z^m\rho_{m-1}(z)dz,\ z^{2m}\rho_0(z)dz, \dots, z^{2m}\rho_{m-1}(z)dz, \dots.$$
This sequence leads naturally to generalized orthogonal polynomials $p_n(z)$ by the recipe (0.2), which enjoy the following properties:

(i) The polynomials $p_n(z)$ satisfy the orthogonality relations $\langle p_i(z), \rho_j(z)\rangle = 0$ for $i \ge j + 1$.
(ii) Applying $z^m$ to the vector $p(z) := (p_0(z), p_1(z), \dots)$ leads to a 2m + 1-band matrix $L^m$.
(iii) The t-evolution $e^{\sum_1^\infty t_i z^i}\rho_k(z)$ implies $L$ evolves according to the discrete KP hierarchy.

The discrete KP-hierarchy on 2m + 1-band matrices has been studied in [17]; see also [10]. We also formulate here a Riemann–Hilbert problem, which should characterize the generalized orthogonal polynomials. Another interesting set of examples is provided by picking as weights various combinations of (standard) δ-functions, which lead to concatenated soliton solutions, Calogero–Moser systems, etc.

1. Vertex Operator Solutions to the Discrete KP Hierarchy

In [2], we discussed the discrete KP hierarchy and found a general method for generating its solutions, in both the bi- and semi-infinite situations; this paper mainly deals with the semi-infinite case. In [2] and [4], we gave an application of the bi-infinite discrete KP to the q-KP equation. In general, the main features are summarized in the following statement, whose proof can be found in [2]:

Theorem 1.1. From an arbitrary KP τ-function and a sequence of real functions $(\dots, \nu_{-1}(\lambda), \nu_0(\lambda), \nu_1(\lambda), \dots)$, defined on $\mathbb{R}$, one constructs the infinite sequence of τ-functions:
$$\begin{aligned}\tau_0(t) &= \tau(t),\\ \tau_n(t) &= \left(\int X(t, \lambda)\nu_{n-1}(\lambda)d\lambda\ \cdots \int X(t, \lambda)\nu_0(\lambda)d\lambda\right)\tau(t), && n > 0,\\ \tau_{-n}(t) &= \left(\int X(-t, \lambda)\nu_{-n}(\lambda)d\lambda\ \cdots \int X(-t, \lambda)\nu_{-1}(\lambda)d\lambda\right)\tau(t), && n > 0.\end{aligned} \tag{1.1}$$
Then the bi-infinite vector
$$\Psi(t, z) = \left(\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)}\,e^{\sum_1^\infty t_i z^i}z^n\right)_{n\in\mathbb{Z}} \tag{1.2}$$
and bi-infinite matrix
$$L = \sum_{\ell=0}^{\infty}\operatorname{diag}\left(\frac{F_\ell(\tilde\partial)\,\tau_{n+2-\ell}\circ\tau_n}{\tau_{n+2-\ell}\,\tau_n}\right)_{n\in\mathbb{Z}}\Lambda^{1-\ell} \tag{1.3}$$
satisfy the discrete KP-hierarchy equations for $k = 1, 2, \dots$:
$$\frac{\partial\Psi}{\partial t_k} = (L^k)_+\Psi \quad\text{and}\quad \frac{\partial L}{\partial t_k} = [(L^k)_+, L], \quad\text{with } L\Psi(t, z) = z\Psi(t, z). \tag{1.4}$$
Then $\tau_n(t)$ is given by the following projection
$$\tau_n(t) = \det\operatorname{Proj}:\ e^{-\sum_1^\infty t_i z^i}z^{-n}W_n \longrightarrow H_+, \tag{1.5}$$
where the Grassmannian flag $\cdots \supset W_n \supset W_{n+1} \supset \dots$ is given by
$$W_n := \operatorname{span}_{\mathbb{C}}\{\Psi_n(t, z), \Psi_{n+1}(t, z), \dots\}. \tag{1.6}$$
Conversely, a Grassmannian flag $\cdots \supset W_n \supset W_{n+1} \supset \dots$, given by (1.6), with functions $\Psi_n(t, z)$ satisfying the asymptotics $\Psi_n(t, z) = e^{\sum t_i z^i}z^n(1 + O(1/z))$, leads to the discrete KP-hierarchy.

Remark. A semi-infinite discrete KP-hierarchy with $\tau_0(t) = 1$ is equivalent to a bi-infinite discrete KP-hierarchy with $\tau_{-n}(t) = \tau_n(-t)$ and $\tau_0(t) = 1$; in the above theorem, this amounts to setting $\tau_0(t) = 1$ and $\nu_{-n}(\lambda) := \nu_{n-1}(\lambda)$, $n = 1, 2, \dots$. We extend the semi-infinite flag $W_0 = H_+ \supset \cdots \supset W_n \supset W_{n+1} \supset \dots$ by setting $W_{-n} = W_n^*$, for $n \ge 0$.

2. Moment Matrix Factorization and Solutions to Discrete KP and 2d-Toda

In (0.1), we considered t-deformations of the sequence of weights, with $t \in \mathbb{C}^\infty$,
$$(\rho_0^t(z), \rho_1^t(z), \dots), \quad t \in \mathbb{C}^\infty, \quad\text{with } \rho_j^t(z) = e^{\sum_1^\infty t_k z^k}\rho_j(z).$$
As announced in the introduction, we consider further deformations of the sequence of weights, in the (t, s)-direction,
$$\rho_j(t, s; z) = e^{\sum_1^\infty t_i z^i}\sum_{\ell=0}^{\infty}F_\ell(-s)\rho_{j+\ell}(z), \qquad t, s \in \mathbb{C}^\infty,\ z \in \mathbb{R},\ j = 0, 1, 2, \dots, \tag{2.1}$$
and the corresponding moment matrix
$$m_n(t, s) = \big(\mu_{ij}(t, s)\big)_{0\le i,j<n}. \tag{2.2}$$
We now state the following proposition (e.g., see [3,5]):

Proposition 2.1. The matrix $m_\infty(t, s)$ satisfies the differential equations
$$\frac{\partial m_\infty}{\partial t_k} = \Lambda^k m_\infty, \qquad \frac{\partial m_\infty}{\partial s_k} = -m_\infty\Lambda^{\top k}. \tag{2.3}$$
Factorizing the matrix $m_\infty(t, s)$ into the product of lower- and upper-triangular matrices $S_1$ and $S_2$, with $S_1$ having 1's along the diagonal:
$$m_\infty(t, s) = S_1^{-1}(t, s)S_2(t, s), \tag{2.4}$$
the sequence of wave functions^6, derived from $S_1$ and $S_2$,
$$\Psi_i(t, s; z) = e^{\xi_i(z)}S_i\chi(z), \qquad \Psi_i^*(t, s; z) = e^{-\xi_i(z)}(S_i^{\top})^{-1}\chi^*(z), \tag{2.5}$$
can be expressed in terms of τ-functions $\tau_n(t, s) = \det m_n$, as follows:
$$\begin{aligned}\Psi_1(t, s; z) &= \left(\frac{\tau_n(t-[z^{-1}], s)}{\tau_n(t, s)}\,e^{\sum_1^\infty t_i z^i}z^n\right)_{n\in\mathbb{Z}}, & \Psi_2(t, s; z) &= \left(\frac{\tau_{n+1}(t, s-[z])}{\tau_n(t, s)}\,e^{\sum_1^\infty s_i z^{-i}}z^n\right)_{n\in\mathbb{Z}},\\ \Psi_1^*(t, s; z) &= \left(\frac{\tau_{n+1}(t+[z^{-1}], s)}{\tau_{n+1}(t, s)}\,e^{-\sum_1^\infty t_i z^i}z^{-n}\right)_{n\in\mathbb{Z}}, & \Psi_2^*(t, s; z) &= \left(\frac{\tau_n(t, s+[z])}{\tau_{n+1}(t, s)}\,e^{-\sum_1^\infty s_i z^{-i}}z^{-n}\right)_{n\in\mathbb{Z}},\end{aligned} \tag{2.6}$$
with $\Psi_i(t, s)$ satisfying the following differential equations^7:
$$\frac{\partial\Psi_i}{\partial t_n} = (L_1^n)_+\Psi_i, \qquad \frac{\partial\Psi_i}{\partial s_n} = (L_2^n)_-\Psi_i, \qquad\text{with } L_1 = S_1\Lambda S_1^{-1},\ L_2 = S_2\Lambda^{-1}S_2^{-1}.$$
The τ-functions satisfy bilinear identities, for all $n, m \ge 0$,
$$\oint_{z=\infty}\tau_n(t-[z^{-1}], s)\,\tau_{m+1}(t'+[z^{-1}], s')\,e^{\sum_1^\infty(t_i-t_i')z^i}z^{n-m-1}dz = \oint_{z=0}\tau_{n+1}(t, s-[z])\,\tau_m(t', s'+[z])\,e^{\sum_1^\infty(s_i-s_i')z^{-i}}z^{n-m-1}dz, \tag{2.7}$$
and therefore the KP hierarchy in each of the variables $t$ and $s$.

The following corollary can be found in [5]:

Corollary 2.2. 2d-Toda τ-functions satisfy the following (Fay-like) identities for arbitrary $z, u, v \in \mathbb{C}$:
$$\tau_n(t-[z^{-1}], s+[v]-[u])\,\tau_n(t, s) - \tau_n(t, s+[v]-[u])\,\tau_n(t-[z^{-1}], s) = \frac{v-u}{z}\,\tau_{n+1}(t, s-[u])\,\tau_{n-1}(t-[z^{-1}], s+[v]). \tag{2.8}$$
Introduce now the residue pairing about $z = \infty$, between $f = \sum_{i\ge 0}a_iz^i \in H_+$ and $g = \sum_{j\in\mathbb{Z}}b_jz^{-j-1} \in H$:
$$\langle f, g\rangle_\infty = \oint_{z=\infty}f(z)g(z)\frac{dz}{2\pi i} = \sum_{i\ge 0}a_ib_i, \tag{2.9}$$
where the integral is taken over a small circle about $z = \infty$. But setting $s = s'$ and $m \le n - 1$, the right-hand integrand of (2.7) is holomorphic and so the right-hand side of (2.7) vanishes. Of course, freezing $s = s'$ yields the discrete KP-hierarchy; see [2]. Therefore
$$\oint_{z=\infty}\tau_n(t-[z^{-1}], s)\,\tau_{m+1}(t'+[z^{-1}], s)\,e^{\sum_1^\infty(t_i-t_i')z^i}z^{n-m-1}dz = 0 \quad\text{for } n \ge m+1, \tag{2.10}$$
and so for $n \ge m + 1$,
$$\oint_{z=\infty}e^{\sum_1^\infty t_i z^i}z^n\frac{\tau_n(t-[z^{-1}], s)}{\tau_n(t, s)}\;e^{-\sum_1^\infty t_i'z^i}z^{-m-1}\frac{\tau_{m+1}(t'+[z^{-1}], s)}{\tau_m(t, s)}\,dz = 0. \tag{2.11}$$

^6 $\xi_1(z) := \sum t_iz^i$ and $\xi_2(z) := \sum s_iz^{-i}$; also $\chi(z) := (1, z, z^2, \dots)$ and $\chi^*(z) := \chi(z^{-1})$
^7 $A_+$ and $A_-$ denote the upper-triangular and strictly lower-triangular part of the matrix $A$, respectively

Defining the linear space $W_n^*$ as the space of functions perpendicular to $W_n$ for the residue pairing (2.9), we thus have for fixed $s$, by virtue of (1.6), (2.6) and then (2.11),
$$W_n^t = \operatorname{span}\left\{z^j\frac{\tau_j(t-[z^{-1}], s)}{\tau_j(t, s)},\ j \ge n\right\} = e^{-\sum t_iz^i}W_n, \qquad W_n^{*t} = \operatorname{span}\left\{z^{-j}\frac{\tau_j(t+[z^{-1}], s)}{\tau_{j-1}(t, s)},\ j \le n\right\} = e^{\sum t_iz^i}W_n^*. \tag{2.12}$$
It also shows that $\tau_n(t, s)$ can be obtained from those spaces in two different ways (for fixed $s$):
$$\tau_n(t, s) = \det\operatorname{Proj}:\ e^{-\sum t_iz^i}z^{-n}W_n \longrightarrow H_+ = \det\operatorname{Proj}:\ e^{\sum t_iz^i}z^{n}W_n^* \longrightarrow H_+, \tag{2.13}$$
where the multiplication by $z^{-n}$ and $z^n$ makes the corresponding linear spaces have "genus zero", in accordance with the terminology of Segal–Wilson [14].

As a special case (Hankel matrices), consider the sequence of weights
$$\rho_j(z)dz = z^j\rho(z)dz. \tag{2.14}$$
Then the (t, s)-deformations take on the following form:
$$\rho_j(t, s; z) = e^{\sum t_iz^i}\sum_{\ell\ge 0}F_\ell(-s)z^{\ell+j}\rho(z) = e^{\sum(t_i-s_i)z^i}z^j\rho(z),$$
thus depending on $t - s$ only. Therefore $\mu_{ij}(t, s)$ depends only on $t - s$ and $i + j$ ($m_\infty$ is a Hankel matrix) and so $\tau_n(t, s)$ depends only on $t - s$. Therefore, in this case we may replace $t - s$ by $t$. In this case, the matrix $m_\infty$ is symmetric, which simplifies the factorization (2.4) above. Indeed:
$$m_\infty(t) = S_1^{-1}S_2 = S_1^{-1}hS_1^{-1\top} = S^{-1}(t)S^{\top-1}(t), \tag{2.15}$$
upon setting
$$S = h^{-1/2}S_1 = h^{1/2}S_2^{-1\top}. \tag{2.16}$$
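The factorization $m_\infty = S_1^{-1}S_2$ and the matrix $L = S_1\Lambda S_1^{-1}$ of (0.4) can be explored on a finite truncation. The sketch below is an illustration under stated assumptions, not the authors' procedure: it takes an $N \times N$ block with nonzero leading principal minors, and the helper names `factor`, `inv_unit_lower`, `matmul` and `band_matrix` are hypothetical. Because $S_1$ is lower-triangular, rows $0, \dots, N-2$ of the truncated $L$ coincide with those of the semi-infinite one; the last row is a truncation artifact.

```python
from fractions import Fraction


def factor(m):
    """Return (S1, S2) with S1 unit lower-triangular, S2 upper-triangular
    and S1 * m = S2, i.e. m = S1^{-1} S2 (Gaussian elimination without
    pivoting; assumes nonzero leading principal minors)."""
    n = len(m)
    s1 = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    s2 = [[Fraction(x) for x in row] for row in m]
    for c in range(n):
        for r in range(c + 1, n):
            f = s2[r][c] / s2[c][c]
            for k in range(n):
                s2[r][k] -= f * s2[c][k]
                s1[r][k] -= f * s1[c][k]
    return s1, s2


def inv_unit_lower(s):
    """Inverse of a unit lower-triangular matrix by forward substitution."""
    n = len(s)
    inv = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            inv[i][j] = -sum(s[i][k] * inv[k][j] for k in range(j, i))
    return inv


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def band_matrix(m):
    """Return (S1, S2, L) with L = S1 * shift * S1^{-1} on the truncation."""
    n = len(m)
    s1, s2 = factor(m)
    shift = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n)]
    return s1, s2, matmul(matmul(s1, shift), inv_unit_lower(s1))
```

For the assumed Hankel example $\mu_{ij} = 1/(i+j+1)$ (weight $1$ on $[0, 1]$), the exact rows of $L$ come out tridiagonal with 1's above the diagonal, and the diagonal of $S_2$ reproduces the ratios $h_n = \tau_{n+1}/\tau_n$.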

3. Weights, Flags and Dual Flags

The purpose of this section is to prove Theorem 0.1. The point is to derive the τ-functions from the Grassmannian flag (1.6). Unfortunately, the matrix associated with the projection (1.5) is infinite; therefore taking its determinant would be non-trivial, although possible. However, it turns out to be far easier to consider the dual flag, which leads to a finite projection matrix, whose determinant is the same τ-function. To carry out this program, we equip the space $H := \operatorname{span}\{z^i, i \in \mathbb{Z}\}$ with two inner products: the usual one
$$\langle f, g\rangle = \int_{\mathbb{R}}f(z)g(z)\,dz, \tag{3.1}$$
and remember the residue pairing about $z = \infty$, between $f = \sum_{i\ge 0}a_iz^i \in H_+$ and $g = \sum_{j\in\mathbb{Z}}b_jz^{-j-1} \in H$:
$$\langle f, g\rangle_\infty = \oint_{z=\infty}f(z)g(z)\frac{dz}{2\pi i} = \sum_{i\ge 0}a_ib_i, \tag{3.2}$$
where the integral is taken over a small circle about $z = \infty$. The two pairings, which will be instrumental in linking the flag to the dual flag, are related as follows:

Lemma 3.1.
$$\langle f, g\rangle = \left\langle f, \int_{\mathbb{R}}\frac{g(u)}{z-u}\,du\right\rangle_\infty. \tag{3.3}$$

Proof. Expanding the integral above into an asymptotic series, which we take as its definition,
$$\int_{\mathbb{R}}\frac{g(u)}{z-u}\,du = \frac1z\int_{\mathbb{R}}\sum_{j\ge 0}\left(\frac uz\right)^jg(u)\,du = \frac1z\sum_{j\ge 0}z^{-j}\int_{\mathbb{R}}g(u)u^j\,du, \tag{3.4}$$
we check that for holomorphic functions $f$ in $\mathbb{C}$,
$$\left\langle f, \int_{\mathbb{R}}\frac{g(u)}{z-u}\,du\right\rangle_\infty = \left\langle\sum_{i\ge 0}a_iz^i,\ \frac1z\sum_{j\ge 0}z^{-j}\int_{\mathbb{R}}g(u)u^j\,du\right\rangle_\infty = \sum_{i\ge 0}a_i\int_{\mathbb{R}}g(u)u^i\,du = \int_{\mathbb{R}}g(u)\sum_{i\ge 0}a_iu^i\,du = \langle f, g\rangle. \tag{3.5}$$
□

Remark. The series (3.4) only converges outside the support of $g(u)$. So, in general, the series (3.4) diverges, even for large $z$. In specific examples, this integral will have a precise meaning; see Sects. 4 and 5.

To the family of functions $\rho_0(z), \rho_1(z), \dots$ on $\mathbb{R}$, and $\rho_j^t(z) := e^{\sum t_kz^k}\rho_j(z)$, we associate the flag of spaces $W_0 = H_+ \supset \cdots \supset W_n \supset W_{n+1} \supset \dots$, defined by
$$W_n := \operatorname{span}\{\rho_0, \rho_1, \dots, \rho_{n-1}\}^{\perp} = \{f \in H_+ \text{ such that } \langle f, \rho_i\rangle = 0,\ 0 \le i \le n-1\} \tag{3.6}$$
with respect to the inner product (3.1). So, throughout we shall be playing with the following two representations of the moments:
$$\mu_{ij}(t) = \langle z^i, \rho_j^t(z)\rangle = \left\langle z^i, \int_{\mathbb{R}}\frac{\rho_j^t(u)\,du}{z-u}\right\rangle_\infty. \tag{3.7}$$

With the moments $\mu_{ij}(t) := \langle z^i, \rho_j^t\rangle$ we associate the monic polynomials $p_k(t, z)$ in $z$ of degree $k$, introduced in (0.2). As usual, set
$$W_n^t = e^{-\sum t_iz^i}W_n \quad\text{and its dual}\quad W_n^{*t} = e^{\sum t_iz^i}W_n^*.$$
As we showed in (2.12), for the residue pairing we have:
$$\langle W_n^t, W_n^{*t}\rangle_\infty = \langle W_n, W_n^*\rangle_\infty = 0.$$
The integral representation (3.9) below of the dual flag already appears in the work of Mulase [13] for the case $\rho_j(z) = z^j\rho(z)$.

Proposition 3.2. The flag $H_+ \supset W_1 \supset W_2 \supset \dots$, defined by (3.6) at $t = 0$, evolves into
$$W_n^t = \big(\operatorname{span}\{\rho_0^t, \rho_1^t, \dots, \rho_{n-1}^t\}\big)^{\perp} = \operatorname{span}\{p_n(t, z), p_{n+1}(t, z), \dots\} \subset H_+, \tag{3.8}$$
and the dual flag $H_+ \subset W_1^* \subset W_2^* \subset \dots$ evolves into
$$W_n^{*t} = \operatorname{span}\left\{\int\frac{\rho_j^t(u)\,du}{z-u},\ j = 0, \dots, n-1\right\}\oplus H_+. \tag{3.9}$$

Proof. Indeed to show (3.8), it suffices to check the following, for $k \ge j + 1$ and the polynomials $p_k(t, z) = \frac{1}{a_{kk}(t)}\sum_{i=0}^{k}a_{ki}(t)z^i$, defined in (0.2):
$$\langle p_k(t, z), \rho_j^t\rangle = \frac{1}{a_{kk}(t)}\sum_{i=0}^{k}a_{ki}(t)\langle z^i, \rho_j^t\rangle = \frac{1}{a_{kk}(t)}\sum_{i=0}^{k}a_{ki}\,\mu_{ij}(t) = \frac{1}{a_{kk}(t)}\det\begin{pmatrix}\mu_{00}(t) & \dots & \mu_{0,k-1}(t) & \mu_{0j}(t)\\ \vdots & & \vdots & \vdots\\ \mu_{k0}(t) & \dots & \mu_{k,k-1}(t) & \mu_{kj}(t)\end{pmatrix} = 0. \tag{3.10}$$
To prove the dual statement (3.9), one checks for $k \ge j + 1$,
$$\left\langle p_k(t, z), \int_{\mathbb{R}}\frac{\rho_j^t(u)\,du}{z-u}\right\rangle_\infty = \langle p_k(t, z), \rho_j^t(z)\rangle = 0,$$
using Lemma 3.1, and, of course,
$$\langle p_k(t, z), z^\ell\rangle_\infty = 0, \quad\text{for all } k, \ell \ge 0.$$
□

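Remember from (2.13), the τ-functions $\tau_n(t)$ can be computed in two different ways:
$$\tau_n(t) = \det\operatorname{Proj}:\ e^{-\sum t_iz^i}z^{-n}W_n \longrightarrow H_+ = \det\operatorname{Proj}:\ e^{\sum t_iz^i}z^{n}W_n^* \longrightarrow H_+. \tag{3.11}$$
We shall need the following lemma concerning Vandermonde-like determinants, extending a lemma mentioned in [13]:

Lemma 3.3.
$$\sum_{\sigma\in\Pi_n}\det\big(u_{\ell,\sigma(k)}\,v_{k,\sigma(k)}\big)_{1\le \ell,k\le n} = \det\big(u_{\ell,k}\big)_{1\le \ell,k\le n}\,\det\big(v_{\ell,k}\big)_{1\le \ell,k\le n}. \tag{3.12}$$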
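Lemma 3.3 lends itself to a direct numerical check for small $n$. The following sketch is only an illustration, with hypothetical helper names `det` and `lemma_lhs`; it sums the left-hand determinants over all permutations of the column indices (0-based in the code) and can be compared with the product of the two determinants.

```python
from itertools import permutations


def det(a):
    """Leibniz-formula determinant; adequate for the small n used here."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= a[i][perm[i]]
        total += sign * prod
    return total


def lemma_lhs(u, v):
    """Left-hand side of (3.12): sum over permutations s of
    det( u[l][s(k)] * v[k][s(k)] ), with row index l and column index k."""
    n = len(u)
    return sum(
        det([[u[l][s[k]] * v[k][s[k]] for k in range(n)] for l in range(n)])
        for s in permutations(range(n))
    )
```

The identity holds because, for fixed $\sigma$, the factor $v_{k,\sigma(k)}$ depends only on the column $k$ and can be pulled out of the determinant, leaving $(-1)^\sigma\det(u)\prod_k v_{k,\sigma(k)}$.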

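Proof of Theorem 0.1. Since $z^nW_n^* \supset z^nH_+$, the matrix of the projection (3.9) onto $H_+$, involving $W_n^*$, reduces to a finite matrix, whereas the projection involving $W_n$ would involve an infinite matrix! This is the point of using $W_n^*$ rather than $W_n$. Therefore the matrix of the projection
$$\operatorname{Proj}:\ e^{\sum t_kz^k}z^nW_n^{*} \longrightarrow H_+$$
is obtained by putting all coefficients of
$$e^{\sum t_kz^k}z^n\int\frac{\rho_j(u)\,du}{z-u}\ \text{ for } (0 \le j \le n-1) \quad\text{and}\quad e^{\sum t_kz^k}z^{n+j}\ \text{ for } (0 \le j < \infty)$$
in the $j$th and $(n+j)$th columns respectively, starting on top with $z^0, z^1, \dots$. Since for any power series,
$$z^j\text{-coef of } f = \oint_{z=\infty}z^{-j-1}f(z)\frac{dz}{2\pi i} = \langle z^{-j-1}, f(z)\rangle_\infty,$$
we have
$$\tau_n(t) = \det\begin{pmatrix}A & 0\\ B & C\end{pmatrix} = \det A\,\det C = \det A, \tag{3.13}$$
where
$$C = \Big(\operatorname{coef}_{z^{n+i}}\ z^{n+j}e^{\sum t_kz^k}\Big)_{0\le i,j<\infty} = \begin{pmatrix}1 & 0 & 0 & \dots\\ F_1 & 1 & 0 & \dots\\ F_2 & F_1 & 1 & \dots\\ \vdots & \vdots & \vdots & \ddots\end{pmatrix},$$
and
$$A = \left(\operatorname{coef}_{z^i}\ z^ne^{\sum t_kz^k}\int_{\mathbb{R}}\frac{\rho_j(u)\,du}{z-u}\right)_{0\le i,j\le n-1} = \left(\left\langle z^{n-i-1}e^{\sum t_kz^k}, \int_{\mathbb{R}}\frac{\rho_j(u)\,du}{z-u}\right\rangle_\infty\right)_{0\le i,j\le n-1} = \Big(\big\langle u^{n-i-1}, e^{\sum t_ku^k}\rho_j(u)\big\rangle\Big)_{0\le i,j\le n-1} = \big(\mu_{n-i-1,j}(t)\big)_{0\le i,j\le n-1}, \tag{3.14}$$
which provides the $A$-matrix in (3.13), thus establishing (0.10). Hence,
$$\begin{aligned}\tau_n(t) &= \det\big(\mu_{\ell k}(t)\big)_{0\le \ell,k\le n-1} = \det\left(\int_{\mathbb{R}}z^\ell\rho_k(t, z)\,dz\right)_{0\le \ell,k\le n-1}\\ &= \det\left(\int_{\mathbb{R}}z_{\sigma(k)}^{\ell-1}\rho_{k-1}(t, z_{\sigma(k)})\,dz_{\sigma(k)}\right)_{1\le \ell,k\le n} \quad\text{for a fixed permutation } \sigma\\ &= \int\!\cdots\!\int_{\mathbb{R}^n}\det\Big(z_{\sigma(k)}^{\ell-1}\rho_{k-1}(t, z_{\sigma(k)})\Big)_{1\le \ell,k\le n}dz_1\dots dz_n\\ &= \frac{1}{n!}\sum_\sigma\int\!\cdots\!\int_{\mathbb{R}^n}\det\Big(z_{\sigma(k)}^{\ell-1}\rho_{k-1}(t, z_{\sigma(k)})\Big)_{1\le \ell,k\le n}dz_1\dots dz_n\\ &= \frac{1}{n!}\int\!\cdots\!\int_{\mathbb{R}^n}\det\big(z_k^{\ell-1}\big)_{1\le k,\ell\le n}\det\big(\rho_{\ell-1}(z_k)\big)_{1\le k,\ell\le n}\prod_{k=1}^{n}e^{\sum t_iz_k^i}dz_k,\end{aligned}$$
using Lemma 3.3; this establishes (0.11). Furthermore, we have, continuing the identities above, that^8
$$\begin{aligned}\tau_n(t) &= \frac{1}{n!}\int_{\mathbb{R}^n}\Delta_n(z)\sum_\sigma(-1)^\sigma\prod_{\ell=1}^{n}\rho_{\ell-1}(z_{\sigma(\ell)})\prod_{k=1}^{n}e^{\sum t_iz_k^i}dz_k\\ &= \frac{1}{n!}\sum_\sigma\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n(\sigma^{-1}z)(-1)^\sigma\prod_{\ell=1}^{n}\rho_{\ell-1}(z_\ell)\prod_{k=1}^{n}e^{\sum t_iz_k^i}dz_k\\ &= \frac{1}{n!}\int\!\cdots\!\int_{\mathbb{R}^n}n!\,\Delta_n(z)\prod_{\ell=1}^{n}\rho_{\ell-1}(z_\ell)e^{\sum t_iz_\ell^i}dz_\ell\\ &= \int_{\mathbb{R}}dz\,z^{n-1}e^{\sum t_iz^i}\rho_{n-1}(z)\int_{\mathbb{R}^{n-1}}\Delta_{n-1}(z_1, \dots, z_{n-1})\prod_{i=1}^{n-1}\left(1-\frac{z_i}{z}\right)\prod_{\ell=1}^{n-1}\rho_{\ell-1}(z_\ell)e^{\sum t_iz_\ell^i}dz_\ell\\ &= \int_{\mathbb{R}}dz\,z^{n-1}\rho_{n-1}(z)e^{\sum t_iz^i}e^{-\sum\frac{1}{iz^i}\frac{\partial}{\partial t_i}}\int\!\cdots\!\int_{\mathbb{R}^{n-1}}\Delta_{n-1}(z_1, \dots, z_{n-1})\prod_{\ell=1}^{n-1}\rho_{\ell-1}(z_\ell)e^{\sum t_iz_\ell^i}dz_\ell\\ &= \int_{\mathbb{R}}dz\,z^{n-1}\rho_{n-1}(z)X(t, z)\,\tau_{n-1}(t),\end{aligned}$$

^8 using in the fifth identity $e^{-\sum_1^\infty a^i/i} = 1 - a$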

proving (0.14). Therefore, the sequence $\tau_n(t)$ satisfies (1.1) in Theorem 1.1 with $\nu_n(z) = z^n\rho_n(z)$ and $\tau_0(t) = 1$. The τ-functions lead to the expression (1.3) for $L$ and to the expression (1.2) for $\Psi$, which both satisfy the discrete KP hierarchy, according to Theorem 1.1. Notice that, from (0.15), (1.5) and (1.2), we have
$$p_n(t, z) = e^{-\sum_1^\infty t_iz^i}\Psi_n(t, z); \tag{3.15}$$
therefore $L$ defined by τ-functions (1.3) agrees with the semi-infinite $L$, defined by the semi-infinite polynomial relations $zp(t, z) = Lp(t, z)$, yielding (0.8) and (0.9) for this $L$. Finally, using (0.10) and (0.11), the wave function (1.2) equals
$$\begin{aligned}\Psi_n(t, z) &= z^ne^{\sum_1^\infty t_iz^i}\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)}\\ &= e^{\sum_1^\infty t_iz^i}\frac{\det\big(z\mu_{ij}(t)-\mu_{i+1,j}(t)\big)_{0\le i,j\le n-1}}{\det\big(\mu_{ij}(t)\big)_{0\le i,j\le n-1}}, \quad\text{using (0.10), (0.1) and footnote 8},\\ &= \frac{z^ne^{\sum_1^\infty t_iz^i}}{n!\det(\mu_{\ell k}(t))}\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n(z)\,\Delta_n^{(\rho)}(z)\prod_{k=1}^{n}e^{\sum t_iz_k^i}\left(1-\frac{z_k}{z}\right)dz_k,\end{aligned}$$
establishing (0.16) and (0.17). □

In the subsequent sections, it is shown that many integrable solutions, when linked together, are nothing but special instances of the situation described in Sect. 2; we mention matrix integrals, 2m + 1-band matrices, soliton formulas, the Calogero–Moser system and others in subsequent sections.

4. Toda Lattice, Matrix Integrals and Riemann–Hilbert for Orthogonal Polynomials

Setting
$$\rho_j(u)du := u^j\rho(u)du, \qquad \rho^t(u) := \rho(u)e^{\sum_1^\infty t_ku^k}, \tag{4.1}$$
define the moment matrix
$$m_\infty(t) := \big(\mu_{ij}(t)\big)_{0\le i,j<\infty} \quad\text{with}\quad \mu_{ij}(t) = \int z^{i+j}e^{\sum_1^\infty t_kz^k}\rho(z)\,dz, \tag{4.2}$$
and the corresponding t-dependent monic orthogonal polynomials $p_n(t, z)$ in $z$. Note that $m_\infty$ is a Hankel matrix and is therefore symmetric. From the form of the moments, the matrix $m_\infty(t)$ satisfies the following differential equations:
$$\frac{\partial m_\infty}{\partial t_k} = \Lambda^km_\infty. \tag{4.3}$$
Referring to the special case of Hankel matrices, discussed at the end of Sect. 2, we consider the factorization of the symmetric matrix $m_\infty(t)$ into the product of a lower- and upper-triangular matrix $S_1$ and $S_2$, with 1's along the diagonal of $S_1$ and $h$'s along the diagonal of $S_2$:
$$m_\infty(t) = S_1^{-1}S_2 = S_1^{-1}hS_1^{-1\top} = S^{-1}(t)S^{\top-1}(t), \quad\text{with } S = h^{-1/2}S_1 = h^{1/2}S_2^{-1\top}. \tag{4.4}$$

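Theorem 4.1. Then $S(t)$ and the tridiagonal matrix $L(t) = S(t)\Lambda S^{-1}(t)$ satisfy the standard Toda Lattice equations^9:
$$\frac{\partial S}{\partial t_k} = -\frac12(L^k)_bS \quad\text{and}\quad \frac{\partial L}{\partial t_k} = -\frac12[(L^k)_b, L]. \tag{4.5}$$
The flag and dual flag of (0.15) take on the following form
$$\begin{aligned}W_n^t &= \operatorname{span}\{p_n(t, z), p_{n+1}(t, z), \dots\} = \operatorname{span}\{(S(t)\chi(z))_n, (S(t)\chi(z))_{n+1}, \dots\}\\ &= \operatorname{span}\left\{z^n\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)},\ 0 \le n < \infty\right\},\\ W_n^{*t} &= \operatorname{span}\left\{\int\frac{p_j(t, u)\rho^t(u)\,du}{z-u},\ j = 0, \dots, n-1\right\}\oplus H_+\\ &= \operatorname{span}\left\{z^{-1}\big((Sm_\infty(t))\chi^*(z)\big)_j,\ j = 0, \dots, n-1\right\}\oplus H_+\\ &= \operatorname{span}\left\{z^{-j-1}\frac{\tau_{j+1}(t+[z^{-1}])}{\tau_j(t)},\ j = 0, \dots, n-1\right\}\oplus H_+,\end{aligned} \tag{4.6}$$
with the τ-functions having the following representation, derived from (0.10) up to (0.13):
$$\begin{aligned}\tau_n(t) &= \det\left(\int z^{i+j}\rho^t(z)\,dz\right)_{0\le i,j\le n-1}\\ &= \frac{1}{n!}\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n^2(z)\prod_{\ell=1}^{n}e^{\sum t_iz_\ell^i}\rho(z_\ell)\,dz_\ell\\ &= \int_{\mathcal{H}_n}e^{\operatorname{Tr}(V(M)+\sum_1^\infty t_iM^i)}dM, \quad\text{setting } \rho(z) = e^{V(z)},\\ &= \det\operatorname{Proj}:\ e^{-\sum_1^\infty t_iz^i}z^{-n}W_n \to H_+\\ &= \int_{\mathbb{R}}dz\,\rho(z)z^{2(n-1)}X(t, z)\,\tau_{n-1}(t),\end{aligned} \tag{4.7}$$
and the orthogonal polynomials having the form
$$p_n(t, z) = z^n\frac{\tau_n(t-[z^{-1}])}{\tau_n(t)} = \frac{\int_{\mathcal{H}_n}\det(zI-M)\,e^{\operatorname{Tr}(V(M)+\sum_1^\infty t_iM^i)}dM}{\int_{\mathcal{H}_n}e^{\operatorname{Tr}(V(M)+\sum_1^\infty t_iM^i)}dM},$$
where $dM = \Delta_n^2(z)\,dz_1\dots dz_n\,dU$ is Haar measure on the set of Hermitian matrices $\mathcal{H}_n$.

^9 with regard to the splitting of $A \in gl_\infty$ into a lower-triangular $A_b$ and skew-symmetric matrices $A_{sk}$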

Before stating the corollary, some explanation is needed. The integral in the matrix below is taken over the R with a small upper semi-circle about z, when =z > 0 and over R, with a small lower semi-circle about z, when =z < 0. Moreover Yn± (z) = lim z0 →z Yn (z0 ). ±=z0 >0

Corollary 4.2. In view of the factorization m∞ (t) = S1−1 S2 of the moment matrix m∞ (t) and setting hn = τn+1 (t)/τn (t), we have the following identity of matrices: ! R pn (t,u) t pn (t, z) ρ (u)du R z−u Yn (z) = −1 R pn−1 (t,u) t h−1 n−1 pn−1 (t, z) hn−1 R z−u ρ (u)du   1 ∗ (S1 χ(z))n z (S2 χ (z))n  = −1 −1 1 ∗ hn−1 (S1 χ(z))n−1 hn−1 z (S2 χ (z))n−1 ! τn+1 (t+[z−1 ]) −n−1 τn (t−[z−1 ]) n z z τn (t) τn (t) . (4.8) = τ (t−[z −1 ]) τn (t+[z−1 ]) −n n−1 n−1 z z τn (t) τn (t) The matrix Yn satisfies the Riemann–Hilbert problem of Fokas, Its & Kitaev [9]: 10 and 1. Y (z) holomorphic on C+ C− . 1 2πiρ t (z) . 2. Y− (z) = Y+ (z) 0 1 n z 0 , when z → ∞. 3. Y (z) = I + O(z−1 0 z−n

Note the first column of Y (z) relates to the Grassmannian Wn and the lower-triangular matrix S1 , whereas the second column to the dual Wn∗ and the upper-triangular matrix S2 in the decomposition of m∞ = S1−1 S2 . Proof of Theorem 4.1. The vertex representation (4.7) of τn (t) shows that the τ -vector τ (t) = (τn (t))n≥0 is a solution of the discrete KP equation (1.4). But more is true: L = S3S −1 is tridiagonal; so, S and L satisfy the standard Toda lattice (4.5). Some of the arguments are contained in [1]. Notice that the Borel decomposition (4.4) is tantamount to finding the orthogonal polynomials pn (t, z) with respect to the inner-product hzi , zj i = µij , to be precise: −1/2

m∞ = S −1 S >−1 ⇐⇒ Sm∞ S > = I ⇐⇒ hhi

−1/2

pi , hj

pj i = δij .

−1/2

(4.9)

pi are given by the It follows that the coefficients of the orthonormal polynomials hi i th row of the matrix S(t) and so X pnj (t)zj . (4.10) S1 (t) = h1/2 S(t) = (pij (t))0≤i,j ≤∞ , where pn (t) = 0≤j ≤n

(i) So, the monic polynomials pn (t, z) of (0.17) have the following form: 1/2

hn (S(t)χ(z))n = (S1 (t)χ(z))n 10 C and C denote the Siegel upper- and lower half plane + −

Orthogonal Polynomials, Discrete KP and Riemann–Hilbert Problems

605

= pn (t, z) τn (t − [z−1 ]) = zn τn (t) Z Z n P i Y 1 ··· 12 (u) (z − uk )e ti uk ρ(uk )duk , = n!τn (t) Rn k=1

(4.11) the standard monic leading to the formula in the statement of Theorem 4.1. The pn ’s are P i orthogonal polynomials with regard to the weight ρ t (u) = ρ(u)e ti u . (ii) But, we now prove 1/2 hn S >−1 (t)χ ∗ (z) = S2 (t)χ ∗ (z) n n Z pn (t, u)ρ t (u) τn+1 (t + [z−1 ]) du = z−n . (4.12) =z z−u τn (t) Indeed, we compute, on the one hand, X X 1/2 hn (Sm∞ )nj z−j = (S1 m∞ )nj z−j j ≥0

j ≥0

=

X

z−j

j ≥0

=

X

z−j

Z X R `≥0

Z

=z

pn` (t)µ`j , using (4.11)

`≥0

j ≥0

=

X

Z pn` (t)

`≥0

pn` (t)u`

pn R

X

R

u`+j ρ t (u)du

X u j

j ≥0 t (t, u)ρ (u)

z

ρ t (u)du

du.

z−u

On the other hand, as we have seen in the special case following (2.14), the 2d-Toda τ -function τ (t 0 , s 0 ) depends on t = t 0 − s 0 only, enabling us to write (here ψ stands for 9 without the exponential), X 1/2 z−j = h1/2 S >−1 (t)χ (z−1 ) S >−1 (t) hn j ≥0

nj

n

= S2 (t 0 , s 0 )χ (z−1 )

n

, using (4.4),

= ψ2,n (t 0 , s 0 ; z−1 ) , using (2.5), τn+1 (t 0 , s 0 − [z−1 ]) −n z , using (2.6), τn (t 0 , s 0 ) τn+1 (t + [z−1 ]) −n z , = τn (t) =

from which (4.12) follows, upon using Sm∞ = S >−1 (see (4.9)).

Theorem 4.1 is established by remembering Proposition 3.2 and using (3.8) and (3.9); i.e.,

$$\begin{aligned}
W_n^t &= \operatorname{span}\{p_n(t,z),\, p_{n+1}(t,z),\, \dots\},\\
W_n^{*t} &= \operatorname{span}\Big\{\int \frac{u^j\rho^t(u)\,du}{z-u},\ j=0,\dots,n-1\Big\}\oplus H_+
= \operatorname{span}\Big\{\int \frac{p_j(t,u)\,\rho^t(u)\,du}{z-u},\ j=0,\dots,n-1\Big\}\oplus H_+,
\end{aligned}$$

together with (4.12). $\square$

Proof of Corollary 4.2. Following the arguments of Bleher and Its [7], the first matrix in (4.8) has the desired properties, taking into account the following integrals:

$$\frac{1}{2\pi i}\lim_{\substack{z'\to z\\ \Im z'<0}}\int_{\mathbb{R}}\frac{p_n(t,u)}{z'-u}\,\rho^t(u)\,du = p_n(t,z)\,\rho^t(z) + \frac{1}{2\pi i}\lim_{\substack{z'\to z\\ \Im z'>0}}\int_{\mathbb{R}}\frac{p_n(t,u)}{z'-u}\,\rho^t(u)\,du.$$

The formulas (4.11) and (4.12) lead to the desired result. $\square$

Remark. From the fact that $\det Y_{n-} = \det Y_{n+}$, it follows that $\det Y(z)$ is holomorphic in $\mathbb{C}$, and since $\det Y(z) = 1 + O(z^{-1})$, it follows from Liouville's theorem that $\det Y(z) = 1$, i.e.,

$$\begin{aligned}
\det Y_n &= h_{n-1}^{-1}\Big(p_n(t,z)\int_{\mathbb{R}}\frac{p_{n-1}(t,u)}{z-u}\,\rho^t(u)\,du - p_{n-1}(t,z)\int_{\mathbb{R}}\frac{p_n(t,u)}{z-u}\,\rho^t(u)\,du\Big)\\
&= \frac{1}{\tau_n^2(t)}\Big(\tau_n(t-[z^{-1}])\,\tau_n(t+[z^{-1}]) - z^{-2}\,\tau_{n-1}(t-[z^{-1}])\,\tau_{n+1}(t+[z^{-1}])\Big)\\
&= 1.
\end{aligned} \tag{4.13}$$

This is not surprising, in view of the fact that the first expression for $\det Y_n$ is nothing but the Wronskian of the two fundamental solutions of the second order difference equation; see Akhiezer [6]. The second expression, involving $\tau$-functions, follows also from Corollary 2.2, by setting $u = z^{-1}$ and $v \to 0$ and by using the fact that, for the standard Toda lattice, we have $\tau(t,s) = \tau(t-s)$.

5. Periodic Sequences of Weights, 2m + 1-Band Matrices and Riemann–Hilbert Problems

The results of Sect. 4 about tridiagonal matrices will be extended in this section to $2m+1$-band matrices. As usual, we set

$$\mu_{ij}(t) = \big\langle z^i, \rho_j^t(z)\big\rangle, \quad\text{with } \rho_j^t(z) = e^{\sum t_k z^k}\rho_j(z). \tag{5.1}$$

In proving and stating the results below, we shall also consider the $s$-deformations, as in (2.1). Here we consider $m$-periodic sequences of weights $\rho_0, \rho_1, \dots$, defined by

$$\rho_{j+km}(z) = z^{km}\rho_j(z), \quad\text{for all } j = 0, 1, 2, \dots. \tag{5.2}$$
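As a sanity check on the moment construction (5.1), the following Python sketch builds monic polynomials from a bordered moment determinant, which is the form used in (5.3) of Theorem 5.1, and verifies the generalized orthogonality $\langle p_n, \rho_j\rangle = 0$ for $j \le n-1$. It is only an illustration under stated assumptions: the weights are hypothetical finitely supported measures (lists of (point, mass) pairs) taken at $t = 0$, they are not required to be $m$-periodic, and the names `det`, `moment`, `monic_poly`, `pairing` and `WEIGHTS` are ours, not the paper's.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def moment(i, weight):
    """mu_i = <z^i, rho> for a discrete weight given as (point, mass) pairs."""
    return sum(Fraction(mass) * Fraction(point) ** i for point, mass in weight)


def monic_poly(n, weights):
    """Coefficients [c_0, ..., c_n] of p_n, from the bordered determinant
    with rows (mu_{i0}, ..., mu_{i,n-1}, z^i), i = 0..n, divided by tau_n."""
    mu = [[moment(i, weights[j]) for j in range(n)] for i in range(n + 1)]
    tau_n = det(mu[:n])
    coeffs = []
    for i in range(n + 1):
        # Cofactor of the entry z^i in the last column (column index n).
        minor = [mu[r] for r in range(n + 1) if r != i]
        coeffs.append((-1) ** (i + n) * det(minor) / tau_n)
    return coeffs


def pairing(coeffs, weight):
    """<p, rho> for a polynomial given by its coefficient list."""
    return sum(c * moment(i, weight) for i, c in enumerate(coeffs))


# Hypothetical weights: rho_0 = delta_0 + 2 delta_1, rho_1 = delta_1 + delta_2,
# rho_2 = delta_{-1} + delta_3.
WEIGHTS = [[(0, 1), (1, 2)], [(1, 1), (2, 1)], [(-1, 1), (3, 1)]]
```

Replacing the last column by the moments of $\rho_j$ with $j \le n-1$ repeats an existing column, which is why each pairing vanishes; the check below exercises exactly that mechanism.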

Theorem 5.1. For the weights (5.2), the polynomials

$$\begin{aligned}
p_n(t,z) &= \frac{1}{\det\big(\mu_{\ell,k}(t)\big)_{0\le \ell,k\le n-1}}\,\det\begin{pmatrix}
\mu_{00}(t) & \dots & \mu_{0,n-1}(t) & 1\\
\vdots & & \vdots & \vdots\\
\mu_{n-1,0}(t) & \dots & \mu_{n-1,n-1}(t) & z^{n-1}\\
\mu_{n0}(t) & \dots & \mu_{n,n-1}(t) & z^{n}
\end{pmatrix}\\
&= z^n\,\frac{\tau_n(t-[z^{-1}],0)}{\tau_n(t,0)}, \quad\text{where } \tau_n(t,0) = \det m_n(t)\\
&= \frac{\det\big(z\mu_{ij}(t) - \mu_{i+1,j}(t)\big)_{0\le i,j\le n-1}}{\det\big(\mu_{ij}(t)\big)_{0\le i,j\le n-1}}\\
&= \frac{1}{n!\,\tau_n(t)}\int\!\cdots\!\int_{\mathbb{R}^n}\Delta_n(z)\,\Delta_n^{(\rho)}(z)\prod_{k=1}^n e^{\sum_i t_i z_k^i}(z-z_k)\,dz_k,
\end{aligned} \tag{5.3}$$

lead to matrices $L$, defined by $zp = Lp$,
(i) which evolve according to the discrete KP hierarchy,
(ii) such that $L^m$ is a $2m+1$-band matrix,
(iii) the polynomials $p_n(z)$ satisfy the generalized orthogonality relations $\langle p_i(z), \rho_j(z)\rangle = 0$ for $i \ge j+1$.

Remark. It is interesting to point out that the condition (5.2) is equivalent to a seemingly weaker one:

$$z^m\rho_j \in \operatorname{span}\{\rho_0, \dots, \rho_{m+j}\}, \quad\text{for all } j = 0, 1, 2, \dots, \tag{5.4}$$

where $\rho_{m+j}$ must appear in the span. Indeed, the $p_n$'s only depend on the moments $\mu_{ij}$ by means of the determinantal formulae (5.3), which allow for column operations.

Corollary 5.2. The following $2\times 2$ matrices are all equal:

$$\begin{aligned}
Y_n(z) &= \begin{pmatrix}
p_n(t,z) & \displaystyle\int_{\mathbb{R}}\frac{p_n(t,u)}{z^m-u^m}\Big(\sum_{k=1}^m z^{m-k}\rho^t_{k-1}(u)\Big)du\\[2ex]
h_{n-1}^{-1}\,p_{n-1}(t,z) & \displaystyle h_{n-1}^{-1}\int_{\mathbb{R}}\frac{p_{n-1}(t,u)}{z^m-u^m}\Big(\sum_{k=1}^m z^{m-k}\rho^t_{k-1}(u)\Big)du
\end{pmatrix}\\
&= \begin{pmatrix}
(S_1\chi(z))_n & \frac{1}{z}(S_2\chi^*(z))_n\\[1ex]
h_{n-1}^{-1}(S_1\chi(z))_{n-1} & h_{n-1}^{-1}\frac{1}{z}(S_2\chi^*(z))_{n-1}
\end{pmatrix}\\
&= \begin{pmatrix}
\dfrac{\tau_n(t-[z^{-1}],0)}{\tau_n(t,0)}\,z^n & \dfrac{\tau_{n+1}(t,-[z^{-1}])}{\tau_n(t,0)}\,z^{-n-1}\\[2ex]
\dfrac{\tau_{n-1}(t-[z^{-1}],0)}{\tau_n(t,0)}\,z^{n-1} & \dfrac{\tau_n(t,-[z^{-1}])}{\tau_n(t,0)}\,z^{-n}
\end{pmatrix};
\end{aligned} \tag{5.5}$$

they solve the following Riemann–Hilbert problem:

1. $Y_n(z)$ is holomorphic on $\mathbb{C}_+$ and $\mathbb{C}_-$.
2. $Y_{n-}(z) = Y_{n+}(z)\begin{pmatrix}1 & \dfrac{2\pi i}{m}\,e^{\sum t_k z^k}\displaystyle\sum_{j=0}^{m-1} z^{-j}\rho_j(z)\\ 0 & 1\end{pmatrix}$.
3. $Y_n(z)\begin{pmatrix}z^{-n} & 0\\ 0 & z^{n}\end{pmatrix} \longrightarrow I$, as $z \to \infty$.
4. $Y_n(z)\begin{pmatrix}1 & 0\\ 0 & z^{m-1}\end{pmatrix}$ is finite, as $z \to 0$.

The polynomials $p_n$ are such that $z^m p_n$ satisfies $2m+1$-step relations. Note

$$\det Y_n = \frac{\tau_n(t-[z^{-1}], -[z^{-1}])}{\tau_n(t,0)}$$

and the first column of $Y_n$ is related to the Grassmannian plane $W_n$ and the second column to a plane$^{11}$ related to $\Psi_2(t,s,z)$.

Remark. In the matrix (5.5), $\tau_n(t-[z^{-1}],0)$ is given in formula (5.3), whereas by (0.10) and (5.11)

$$\tau_n(t,-[z^{-1}]) = \det\Big(\Big\langle u^i,\; e^{\sum_1^\infty t_i u^i}\sum_{k=0}^{m-1}\frac{z^{m-k}\rho_{k+j}(u)}{z^m-u^m}\Big\rangle\Big)_{0\le i,j\le n-1}. \tag{5.6}$$

Also note the right-hand column of (5.5) behaves as $1/z^{n+1}$; this follows from the $\tau$-function representation, but also from the "generalized orthogonality", mentioned in (iii) (Theorem 5.1).

Proving Theorem 5.1 requires the following lemma:

Lemma 5.3. Fix $m \ge 1$; the polynomials $p_n(t,z)$, defined in (0.2), satisfy $2m+1$-step recursion relations, i.e., $z^m p(t,z) = L^m p(t,z)$ with $L^m$ a $2m+1$-band matrix, if and only if every $\rho_j$, $j = 0, 1, \dots$ satisfies the following requirement: for every $\ell = 0, 1, \dots$, $j = 0, 1, \dots$ there exist constants $c_r$, $r = 0, \dots, m+j+\ell$, depending on $j$ and $\ell$, such that

$$\Big\langle u^m\rho_j - \sum_{0}^{m+j+\ell} c_r\rho_r,\; u^i\Big\rangle = 0 \quad\text{for } 0 \le i \le m+j+\ell+1.$$

Proof. Note the following equivalences:

$$\begin{aligned}
& z^m p_n(t,z) = \sum_{n-m\le r\le n+m} A(t)_{nr}\,p_r(t,z) \quad\text{for some matrix } A(t),\\
\iff{}& z^m p_n(t,z) \in W^t_{\max(n-m,0)} \quad\text{for all } n \ge 0,\\
\iff{}& z^m W_n^t \subset W^t_{\max(n-m,0)} \quad\text{for all } n \ge 0, \text{ because of the inclusion } \dots \supset W_n \supset W_{n+1} \supset \dots,\\
\iff{}& z^m W_n \subset W_{\max(n-m,0)}.
\end{aligned}$$

$^{11}$ Unlike the orthogonal polynomial case, the second column does not contain elements of the dual Grassmannian $W^*$.

Since

$$W_n = (\operatorname{span}\{\rho_0, \rho_1, \dots, \rho_{n-1}\})^{\perp} = \operatorname{span}\{p_n(z), p_{n+1}(z), \dots\},$$

the latter is equivalent to

$$\begin{aligned}
0 &= \langle u^m p_n(u), \rho_j(u)\rangle \quad\text{for all } 0 \le j \le n-m-1,\ n \ge 0\\
&= \langle p_n(u), u^m\rho_j(u)\rangle\\
&= \frac{1}{a_{nn}(t)}\sum_{i=0}^n a_{ni}\,\langle u^i, u^m\rho_j(u)\rangle\\
&= \frac{1}{a_{nn}(t)}\det\begin{pmatrix}
\mu_{00} & \dots & \mu_{0,n-1} & \mu_{mj}\\
\vdots & & \vdots & \vdots\\
\mu_{n0} & \dots & \mu_{n,n-1} & \mu_{m+n,j}
\end{pmatrix},
\end{aligned}$$

where we have used the fact that $p_n(t,z) = \frac{1}{a_{nn}(t)}\sum a_{ni}(t)\,z^i$ is represented by (0.2). The vanishing of the determinant above is equivalent to the statement that the last column depends on prior columns; namely, there exist $c_0, \dots, c_{n-1}$ depending on $m, n, j$ such that

$$\begin{aligned}
0 &= \mu_{m+i,j} - \sum_{r=0}^{n-1} c_r\mu_{ir} \quad\text{for } 0 \le i \le n,\ 0 \le j \le n-m-1\\
&= \langle u^{i+m}, \rho_j\rangle - \sum_{r=0}^{n-1} c_r\langle u^i, \rho_r\rangle\\
&= \Big\langle u^i,\; u^m\rho_j - \sum_{r=0}^{n-1} c_r\rho_r\Big\rangle\\
&= \Big\langle u^i,\; u^m\rho_j - \sum_{r=0}^{j+m+\ell} c_r\rho_r\Big\rangle \quad\text{for } 0 \le i \le m+j+\ell+1,
\end{aligned}$$

where $\ell$ was defined such that $j + \ell = n-m-1$. $\square$

Proof of Theorem 5.1. The fact that $L^m$ is a $2m+1$-band matrix follows at once from Lemma 5.3. That the matrix $L$ evolves according to the discrete KP hierarchy follows straightforwardly from the general statement in Theorem 0.1. $\square$

Proof of Corollary 5.2. For the sake of this proof, we shall be using the $(t,s)$-deformations of the weights $\rho_j$ and the corresponding matrix $m_\infty(t,s)$ of $(t,s)$-dependent moments,

$$\mu_{ij}(t,s) = \big\langle z^i, \rho_j(t,s;z)\big\rangle, \quad\text{with } \rho_j(t,s;z) = e^{\sum_1^\infty t_i z^i}\sum_{\ell=0}^{\infty} F_\ell(-s)\,\rho_{j+\ell}(z). \tag{5.7}$$

In Sect. 2, it was mentioned that $m_\infty$ satisfies the differential equations

$$\frac{\partial m_\infty}{\partial t_k} = \Lambda^k m_\infty, \qquad \frac{\partial m_\infty}{\partial s_k} = -m_\infty\Lambda^{\top k}. \tag{5.8}$$

Factorizing the matrix

$$m_\infty(t,s) = S_1^{-1}(t,s)\,S_2(t,s) \tag{5.9}$$

into the product of lower- and upper-triangular matrices $S_1$ and $S_2$ then leads to the 2d Toda lattice. For later use, we also compute $\rho_k(t,-[z^{-1}];u)$, making specific use of the periodicity of the sequence of weights $\rho_{j+m}(u) = u^m\rho_j(u)$ and the identity$^{12}$

$$F_n([z^{-1}]) = z^{-n}. \tag{5.10}$$
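Identity (5.10) can be checked mechanically. The sketch below is an illustration, not part of the paper: it assumes the usual conventions $e^{\sum_k t_k y^k} = \sum_n F_n(t)\,y^n$ for the elementary Schur polynomials and $[\alpha] = (\alpha, \alpha^2/2, \alpha^3/3, \dots)$, and the function names are hypothetical. It computes the $F_n$ exactly from the recurrence $nF_n = \sum_{k=1}^{n} k\,t_k F_{n-k}$, obtained by differentiating the generating function in $y$.

```python
from fractions import Fraction


def elementary_schur(t, n_max):
    """Return [F_0, ..., F_{n_max}] where exp(sum_k t[k-1] y^k) = sum_n F_n y^n."""
    F = [Fraction(1)]
    for n in range(1, n_max + 1):
        # n F_n = sum_{k=1..n} k t_k F_{n-k}
        total = sum(k * t[k - 1] * F[n - k] for k in range(1, n + 1))
        F.append(total / n)
    return F


def bracket(alpha, n_max):
    """The shift [alpha] = (alpha, alpha^2/2, alpha^3/3, ...), truncated."""
    return [alpha ** k / k for k in range(1, n_max + 1)]
```

With `alpha` playing the role of $z^{-1}$, `elementary_schur(bracket(alpha, N), N)` should return the powers $1, \alpha, \dots, \alpha^N$, which is (5.10).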

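We find:

$$\begin{aligned}
\rho_k(t,-[z^{-1}];u) &= e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{\infty} F_j([z^{-1}])\,\rho_{j+k}(u)\\
&= e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{\infty}\frac{\rho_{j+k}(u)}{z^j}\\
&= e^{\sum_1^\infty t_i u^i}\Big(\sum_{j=0}^{m-1}\frac{\rho_{j+k}(u)}{z^j} + \sum_{j=m}^{2m-1}\frac{\rho_{j+k}(u)}{z^j} + \dots\Big)\\
&= e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{m-1}\frac{\rho_{j+k}(u)}{z^j}\Big(1 + \Big(\frac{u}{z}\Big)^{m} + \Big(\frac{u}{z}\Big)^{2m} + \dots\Big)\\
&= e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{m-1}\frac{\rho_{j+k}(u)}{z^j\big(1-(\frac{u}{z})^m\big)}\\
&= e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{m-1}\frac{z^{m-j}\rho_{j+k}(u)}{z^m-u^m}.
\end{aligned} \tag{5.11}$$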
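The resummation in (5.11) uses only the periodicity $\rho_{j+m}(u) = u^m\rho_j(u)$ and a geometric series. A minimal exact check, under the assumption that we work at a single fixed point $u$ and represent the weights by their values $\rho_0(u), \dots, \rho_{m-1}(u)$ (the list `base` below, with made-up numbers), is that the partial sum over $N$ full periods equals the closed form times $1-(u/z)^{Nm}$. The common factor $e^{\sum t_i u^i}$ is dropped, and all names are illustrative.

```python
from fractions import Fraction


def rho(j, u, base):
    """Value at u of the m-periodic weight rho_j, given base = [rho_0(u), ..., rho_{m-1}(u)]."""
    m = len(base)
    q, r = divmod(j, m)
    return u ** (q * m) * base[r]


def partial_sum(k, u, z, base, blocks):
    """sum_{j=0}^{blocks*m - 1} rho_{j+k}(u) / z^j."""
    m = len(base)
    return sum(rho(j + k, u, base) / z ** j for j in range(blocks * m))


def closed_form(k, u, z, base):
    """sum_{j=0}^{m-1} z^{m-j} rho_{j+k}(u) / (z^m - u^m), the last line of (5.11)."""
    m = len(base)
    return sum(z ** (m - j) * rho(j + k, u, base) for j in range(m)) / (z ** m - u ** m)
```

Since $|u/z| < 1$ in the usage below, the correction factor tends to 1 as the number of blocks grows, which recovers (5.11).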

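From (5.9), we have $S_1 m_\infty = S_2$ and hitting $\chi^*(z)$ with this matrix and using (4.10), we compute, on the one hand,

$$\begin{aligned}
\sum_{j\ge0}(S_1 m_\infty)_{nj}\,z^{-j}\Big|_{s=0} &= \sum_{j\ge0} z^{-j}\sum_{\ell\ge0} p_{n\ell}(t,s)\,\mu_{\ell j}\Big|_{s=0}\\
&= \sum_{j\ge0} z^{-j}\sum_{\ell\ge0} p_{n\ell}(t,s)\int_{\mathbb{R}} u^{\ell}\rho_j(t,s;u)\,du\,\Big|_{s=0}\\
&= \sum_{j\ge0} z^{-j}\int_{\mathbb{R}} p_n(t,s;u)\,e^{\sum_1^\infty t_i u^i}\sum_{\ell\ge0} F_\ell(-s)\,\rho_{j+\ell}(u)\,du\,\Big|_{s=0}\\
&= \int_{\mathbb{R}} p_n(t,0;u)\,e^{\sum_1^\infty t_i u^i}\sum_{j\ge0} z^{-j}\rho_j(u)\,du\\
&= \int_{\mathbb{R}} p_n(t,0;u)\,e^{\sum_1^\infty t_i u^i}\sum_{j\ge0} F_j([z^{-1}])\,\rho_j(u)\,du\\
&= \int_{\mathbb{R}} p_n(t,0;u)\,\rho_0(t,-[z^{-1}];u)\,du\\
&= \int_{\mathbb{R}} p_n(t,0;u)\,e^{\sum_1^\infty t_i u^i}\sum_{j=0}^{m-1}\frac{z^{m-j}\rho_j(u)}{z^m-u^m}\,du,
\end{aligned} \tag{5.12}$$

using (5.11) in the last identity. On the other hand, we have by (2.6),

$$\sum_{j\ge0}\big(S_2(t,s)\big)_{nj}\,z^{-j}\Big|_{s=0} = \big(S_2(t,s)\chi(z^{-1})\big)_n\Big|_{s=0} = \psi_{2,n}(t,0;z^{-1}) = \frac{\tau_{n+1}(t,-[z^{-1}])}{\tau_n(t,0)}\,z^{-n}. \tag{5.13}$$

$^{12}$ obtained by expanding the following expression in elementary Schur polynomials, by setting $t = 0$ and by comparing the powers of $y$:

$$\sum_{n\ge0} y^n F_n(t+[z^{-1}]) = e^{\sum_1^\infty\big(t_i + \frac{z^{-i}}{i}\big)y^i} = e^{\sum t_i y^i}\Big(1-\frac{y}{z}\Big)^{-1} = \sum_{n\ge0} y^n F_n(t)\sum_{k=0}^{\infty}\Big(\frac{y}{z}\Big)^{k}.$$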

The right-hand sides of (5.12) and (5.13) coincide (using $S_1 m_\infty = S_2$); so, we find the following identity, together with the desired asymptotics for $z \to \infty$:

$$\int_{\mathbb{R}}\frac{p_n(t,0;u)}{z^m-u^m}\Big(\sum_{j=0}^{m-1} z^{m-j}\rho_j^t(u)\Big)du = z^{-n}\,\frac{\tau_{n+1}(t,-[z^{-1}])}{\tau_n(t,0)} = z^{-n}\big(h_n + O(z^{-1})\big), \tag{5.14}$$

yielding (5.5) and leading to Condition 3. The jump Condition 2 follows from the following:

$$\frac{1}{2\pi i}\lim_{\substack{z'\to z\\ \Im z'<0}}\int_{\mathbb{R}}\frac{p_n(t,u)}{z'^m-u^m}\Big(\sum_{j=0}^{m-1} z'^{\,m-j}\rho_j^t(u)\Big)du
= p_n(t,z)\,\frac{1}{m}\sum_{j=0}^{m-1} z^{1-j}\rho_j^t(z) + \frac{1}{2\pi i}\lim_{\substack{z'\to z\\ \Im z'>0}}\int_{\mathbb{R}}\frac{p_n(t,u)}{z'^m-u^m}\Big(\sum_{j=0}^{m-1} z'^{\,m-j}\rho_j^t(u)\Big)du. \tag{5.15}$$

Condition 4 follows from the first expression for $Y_n(z)$ in (5.5) and the usual contour deformation argument. The formula concerning $\det Y_n$ follows from setting $s = -[z^{-1}]$, $u = 0$ and $v = z^{-1}$ in identity (2.8). $\square$

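6. Soliton Formula

For future use, we define the vertex operator:

$$X(t;\lambda,\mu) = \frac{1}{\lambda-\mu}\,e^{\sum_1^\infty(\lambda^k-\mu^k)t_k}\,e^{\sum_1^\infty(\mu^{-k}-\lambda^{-k})\frac{1}{k}\frac{\partial}{\partial t_k}}. \tag{6.1}$$

Theorem 6.1. Given points $p_k$, $q_k$ and $\lambda_k$, $k = 1, \dots$, and the weights

$$\rho_k = \delta(z-p_{k+1}) - \lambda_{k+1}^2\,\delta(z-q_{k+1}), \quad k = 0, 1, \dots,$$

the $\tau$-functions

$$\begin{aligned}
\tau_n(t) &= \det\Big(p_i^{j-1}e^{\sum_{k=1}^\infty t_k p_i^k} - \lambda_i^2\,q_i^{j-1}e^{\sum_{k=1}^\infty t_k q_i^k}\Big)_{1\le i,j\le n}\\
&= \big(p_n^{n-1}X(t,p_n) - \lambda_n^2\,q_n^{n-1}X(t,q_n)\big)\,\tau_{n-1}(t)\\
&= c_n\prod_{k=1}^n e^{\sum t_i p_k^i}\,\det\Big(\delta_{ij} - \frac{a_i}{q_i-p_j}\,e^{\sum_{k=1}^\infty t_k(q_i^k-p_j^k)}\Big)_{1\le i,j\le n}\\
&= c_n\prod_{k=1}^n e^{\sum t_i p_k^i}\,\prod_{1}^{n} e^{-a_i X(t;q_i,p_i)}\,1,
\end{aligned}$$

form a $\tau$-vector of the discrete KP hierarchy, for appropriately chosen functions $a_i$ and $c_n$ of $p, q, \lambda$. The matrix $L$, constructed by (0.9) from the $\tau$'s above, satisfies $L(t)p(t,z) = zp(t,z)$, with polynomial eigenvectors (in $z$):

$$p_n(z) = \frac{\det\Big(\delta_{ij}(z-p_i) - \dfrac{a_i(z-q_i)}{q_i-p_j}\,e^{\sum_{k=1}^\infty t_k(q_i^k-p_j^k)}\Big)_{1\le i,j\le n}}{\det\Big(\delta_{ij} - \dfrac{a_i}{q_i-p_j}\,e^{\sum_{k=1}^\infty t_k(q_i^k-p_j^k)}\Big)_{1\le i,j\le n}}.$$

Then

$$W_n = (\operatorname{span}\{\rho_0, \dots, \rho_{n-1}\})^{\perp} = \{f \in H_+ \text{ such that } f(p_i) = \lambda_i^2 f(q_i),\ i = 1, \dots, n\} \tag{6.2}$$

and

$$W_n^* = \operatorname{span}\Big\{\frac{1}{z-p_i} - \frac{\lambda_i^2}{z-q_i},\ i = 1, \dots, n\Big\}\oplus H_+. \tag{6.3}$$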
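The equality of the first and third expressions for $\tau_n$ in Theorem 6.1 is a finite-dimensional determinant identity, so it can be spot-checked exactly. The sketch below is illustrative only and rests on labeled assumptions: $a_i = \lambda_i^2 P(q_i)/P'(p_i)$ and $c_n(t) = \prod_k P'(p_k)\,g(p_k)/\Delta_n(p)$ with $P(z) = \prod_i (z-p_i)$, as in the proof that follows; the Vandermonde sign convention is taken to be $\Delta_n(p) = \prod_{i<j}(p_i-p_j)$, which is our choice because it makes the two sides agree; and the exponentials $g(p_i)$, $g(q_i)$ are replaced by arbitrary rational stand-ins `gp[i]`, `gq[i]`, since the identity is algebraic in them. All function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def tau_first(p, q, lam2, gp, gq):
    """det(p_i^{j-1} g(p_i) - lam_i^2 q_i^{j-1} g(q_i)); the loop index j is j-1 of the text."""
    n = len(p)
    return det([[p[i] ** j * gp[i] - lam2[i] * q[i] ** j * gq[i]
                 for j in range(n)] for i in range(n)])


def tau_third(p, q, lam2, gp, gq):
    """c_n(t) det(delta_ij - a_i/(q_i - p_j) g(q_i)/g(p_j)), under the assumptions above."""
    n = len(p)
    P = lambda z: prod(z - pk for pk in p)
    dP = [prod(p[k] - p[j] for j in range(n) if j != k) for k in range(n)]  # P'(p_k)
    vandermonde = prod(p[i] - p[j] for i in range(n) for j in range(i + 1, n))
    a = [lam2[i] * P(q[i]) / dP[i] for i in range(n)]
    c = prod(dP[k] * gp[k] for k in range(n)) / vandermonde
    matrix = [[(1 if i == j else 0) - a[i] / (q[i] - p[j]) * gq[i] / gp[j]
               for j in range(n)] for i in range(n)]
    return c * det(matrix)
```

For $n = 2$ with $p = (1,2)$, $q = (3,5)$, $\lambda^2 = (1/2, 1/3)$ and all $g$-values equal to 1, both functions give $1/2$.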

Proof. Consider the space

$$H_+/z^nH_+ = \operatorname{span}\{1, \dots, z^{n-1}\} = \operatorname{span}\{v_1, \dots, v_n\}, \tag{6.4}$$

where the polynomials

$$v_k(z) = \prod_{\substack{j\ne k\\ 1\le j\le n}}(z-p_j)$$

of degree $n-1$ form an alternative basis; the determinant of the transformation between the two bases being the Vandermonde determinant $\Delta_n(p)$. Set

$$P(z) = \prod_{1}^{n}(z-p_i) \tag{6.5}$$

and

$$g(z) = e^{\sum_1^\infty t_i z^i}, \qquad a_k = \lambda_k^2\,\frac{P(q_k)}{v_k(p_k)}, \qquad c_n(t) = \frac{\prod_1^n P'(p_k)\,e^{\sum_{i=1}^\infty t_i p_k^i}}{\Delta_n(p)} = c_n\prod_{k=1}^n e^{\sum_{i=1}^\infty t_i p_k^i}.$$

With this notation

$$v_k(p_i) = \delta_{ik}\,v_k(p_k) = \delta_{ik}\,P'(p_k), \qquad v_k(q_i) = \frac{P(q_i)}{q_i-p_k}. \tag{6.6}$$

Using the second line of (3.14) and the formula for $\rho_i$, one computes

$$\begin{aligned}
\tau_n(t) &= \det(\mu_{ji})_{0\le i,j\le n-1}\\
&= \det\Big(\oint_{z=\infty} z^{-j-1}g(z)\,z^n\Big(\frac{1}{z-p_i} - \frac{\lambda_i^2}{z-q_i}\Big)\frac{dz}{2\pi i}\Big)_{\substack{1\le i\le n\\ 0\le j\le n-1}}\\
&= \frac{1}{\Delta(p)}\det\Big(\oint_{z=\infty} v_j(z)\,g(z)\Big(\frac{1}{z-p_i} - \frac{\lambda_i^2}{z-q_i}\Big)\frac{dz}{2\pi i}\Big)_{1\le i,j\le n},
\end{aligned}$$

using (6.4). The second identity above leads to the first formula about $\tau$ in Theorem 6.1, whereas (0.14) is responsible for the second formula. The last formula on the right-hand side, just above, leads to

$$\begin{aligned}
\tau_n(t) &= \frac{1}{\Delta(p)}\det\big(v_j(p_i)g(p_i) - v_j(q_i)g(q_i)\lambda_i^2\big)_{1\le i,j\le n}\\
&= \frac{1}{\Delta(p)}\det\Big[\operatorname{diag}\big(g(p_i)\big)_{1\le i\le n}\,\big(\delta_{ij}v_i(p_i)g(p_i) - v_j(q_i)g(q_i)\lambda_i^2\big)_{1\le i,j\le n}\,\operatorname{diag}\big(g(p_i)^{-1}\big)_{1\le i\le n}\Big]\\
&= \frac{1}{\Delta(p)}\det\Big(\delta_{ij}v_i(p_i)g(p_i) - \lambda_i^2\,v_j(q_i)g(q_i)\,\frac{g(p_i)}{g(p_j)}\Big)_{1\le i,j\le n}, \quad\text{using (6.4)}\\
&= c_n(t)\det\Big(\delta_{ij} - \lambda_i^2\,\frac{v_j(q_i)}{v_i(p_i)}\,\frac{g(q_i)}{g(p_j)}\Big)_{1\le i,j\le n}\\
&= c_n(t)\det\Big(\delta_{ij} - \frac{a_i}{q_i-p_j}\,\frac{g(q_i)}{g(p_j)}\Big)_{1\le i,j\le n}\\
&= c_n(t)\det\big(\delta_{ij} - a_i X(t,q_i,p_j)\,1\big)_{1\le i,j\le n}\\
&= c_n(t)\prod_{i=1}^n e^{-a_i X(t,q_i,p_i)}\,1,
\end{aligned}$$

using in the last equality the vanishing of the square of the vertex operator. The formula for $p_n(t,z)$ is derived from the third expression for $\tau(t)$, using the standard representation (2.6) for the wave vector $\Psi(t,z)$. $\square$

Remark. When $q_i = -p_i$, the formulae in Theorem 6.1 for the KdV $\tau$-function read:

$$\begin{aligned}
\tau_n(t) &= \Big(\prod_1^n p_i\Big)^{n}\prod_1^n \lambda_i\; e^{\sum_{k,i} t_{2k}p_i^{2k}}\,\det\Big(p_i^{-j}\big(\lambda_i^{-1}e^{\sum_{\text{odd}} t_k p_i^k} - (-1)^{n-j}\lambda_i\,e^{-\sum_{\text{odd}} t_k p_i^k}\big)\Big)_{1\le i,j\le n}\\
&= c_n(t)\det\Big(\delta_{ij} + \frac{a_i}{p_i+p_j}\,e^{-\sum_{\text{odd}} t_k(p_i^k+p_j^k)}\Big)_{1\le i,j\le n}.
\end{aligned}$$

Note that Segal and Wilson have used, in [14], the infinite matrix representation of the projection of (6.2), rather than (6.3), in order to compute KdV solitons.

7. Calogero–Moser System

Theorem 7.1. Given points $p_k$, $\lambda_k$ ($k = 1, 2, \dots$), the weights

$$\rho_k = \delta'(z-p_{k+1}) + \lambda_{k+1}\,\delta(z-p_{k+1}), \quad k = 0, 1, \dots, n-1$$

determine a sequence of $\tau$-functions for the discrete KP equation$^{13}$,

$$\tau_n(t) = \frac{1}{n!}\int\!\cdots\!\int_{\mathbb{R}^n}\prod_{k=1}^n e^{\sum t_i z_k^i}\,\Delta_n(z)\,\Delta_n^{(\rho)}(z)\,dz_1\dots dz_n = e^{\operatorname{tr}\sum_1^\infty t_i Y^i}\det\Big(-X + \sum_1^\infty k\,\bar t_k Y^{k-1}\Big), \tag{7.1}$$

with appropriate matrices $X$ and $Y$, functions of $p_k$ and $\lambda_k$'s, satisfying the commutation relation$^{14}$ $[X,Y] = I_e$, and having the form

$$X = \operatorname{diag}(x_1, \dots, x_n) \quad\text{and}\quad Y = \Big(\frac{1-\delta_{ij}}{x_i-x_j}\Big)_{ij} + \operatorname{diag}(\xi_1, \dots, \xi_n). \tag{7.2}$$

The matrix $L$, constructed by (0.9) from the $\tau$'s above, satisfies $L(t)p(t,z) = zp(t,z)$, with eigenvectors, polynomial in $z$,

$$p_n(\bar t, z) = \det\Big(zI - Y - \Big(xI + \sum_1^\infty k\,t_k Y^{k-1} - X\Big)^{-1}\Big). \tag{7.3}$$

$^{13}$ $\bar t = (x+t_1, t_2, t_3, \dots)$
$^{14}$ where $I_e = (1-\delta_{ij})_{1\le i,j\le n}$

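The Grassmannian flag corresponding to this construction is given by

$$\begin{aligned}
W_n &= \{f \in H_+ \text{ such that } f'(p_i) = \lambda_i f(p_i),\ 1 \le i \le n\},\\
W_n^* &= \operatorname{span}\Big\{\frac{1}{(z-p_i)^2} - \frac{\lambda_i}{z-p_i},\ i = 1, \dots, n\Big\}\oplus H_+.
\end{aligned} \tag{7.4}$$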

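Proof. As before, we introduce the basis $v_k(z)$ of $H_+/z^nH_+$; note

$$\frac{\partial}{\partial z}v_k = \sum_{\substack{i=1\\ i\ne k}}^{n}\ \prod_{j\ne k,i}(z-p_j) = \sum_{\substack{i=1\\ i\ne k}}^{n}\frac{v_k(z)-v_i(z)}{p_k-p_i}, \tag{7.5}$$

and the matrices

$$\tilde X = -\operatorname{diag}\Big(\sum_{\substack{1\le\alpha\le n\\ \alpha\ne i}}\frac{1}{p_i-p_\alpha} - \lambda_i\Big)_{1\le i\le n} - \Big(\frac{1-\delta_{ij}}{p_i-p_j}\Big)_{1\le i,j\le n}, \qquad \tilde Y = \operatorname{diag}(p_1, \dots, p_n) \tag{7.6}$$

with commutation relation

$$[\tilde X, \tilde Y] = I_e, \qquad I_e = (1-\delta_{ij})_{1\le i,j\le n}.$$

Then, by (3.7) and the choice$^{15}$ of $\rho_i$,

$$\begin{aligned}
\tau_n(t) &= \det(\mu_{ij})_{0\le i,j\le n-1}\\
&= \det\Big(\oint_{z=\infty} z^{j-1}g(z)\Big(\frac{1}{(z-p_i)^2} - \frac{\lambda_i}{z-p_i}\Big)\frac{dz}{2\pi i}\Big)_{1\le i,j\le n}\\
&= \frac{1}{\Delta_n(p)}\det\Big(\oint_{z=\infty} v_j(z)\,g(z)\Big(\frac{1}{(z-p_i)^2} - \frac{\lambda_i}{z-p_i}\Big)\frac{dz}{2\pi i}\Big)_{1\le i,j\le n}\\
&= \frac{1}{\Delta_n(p)}\det\Big((v_jg)'\big|_{z=p_i} - \lambda_i\,v_j(p_i)g(p_i)\Big)_{1\le i,j\le n}\\
&= \frac{1}{\Delta_n(p)}\det\Big(g(p_i)\sum_{\substack{\alpha=1\\ \alpha\ne j}}^{n}\frac{v_j(p_i)-v_\alpha(p_i)}{p_j-p_\alpha} + v_j(p_i)g(p_i)\Big(\sum_1^\infty k\,t_k p_i^{k-1} - \lambda_i\Big)\Big)_{1\le i,j\le n}\\
&= c_n\prod_{k=1}^n e^{\sum_1^\infty t_i p_k^i}\,\det\Big(\frac{1-\delta_{ij}}{p_i-p_j} + \delta_{ij}\Big(\sum_{\substack{\alpha=1\\ \alpha\ne i}}^{n}\frac{1}{p_i-p_\alpha} + \sum_1^\infty k\,t_k p_i^{k-1} - \lambda_i\Big)\Big)_{1\le i,j\le n}\\
&= c_n\,e^{\operatorname{tr}\sum_1^\infty t_i\tilde Y^i}\det\Big(-\tilde X + \sum_1^\infty k\,t_k\tilde Y^{k-1}\Big),
\end{aligned} \tag{7.7}$$

$^{15}$ $c_n := \dfrac{\prod_1^n v_i(p_i)}{\Delta_n(p)}$ in the expressions below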
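The commutation relation $[\tilde X, \tilde Y] = I_e$ attached to (7.6) is a short computation: off the diagonal, $[\tilde X,\tilde Y]_{ij} = \tilde X_{ij}(p_j - p_i)$, and the diagonal part of $\tilde X$ commutes with $\tilde Y$. The Python sketch below, with made-up values for $p_i$ and $\lambda_i$ and hypothetical function names, confirms it in exact arithmetic.

```python
from fractions import Fraction


def calogero_moser_pair(p, lam):
    """Build the matrices X~ and Y~ of (7.6) for distinct points p and parameters lam."""
    n = len(p)
    X = [[Fraction(0)] * n for _ in range(n)]
    Y = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        Y[i][i] = Fraction(p[i])
        diagonal = sum(Fraction(1) / (p[i] - p[a]) for a in range(n) if a != i) - lam[i]
        X[i][i] = -diagonal
        for j in range(n):
            if i != j:
                X[i][j] = -Fraction(1) / (p[i] - p[j])
    return X, Y


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba for square lists of lists."""
    n = len(a)
    ab = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]
```

The result does not depend on the $\lambda_i$, since they enter only the diagonal of $\tilde X$.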

yielding the formula for $\tau_n(t)$. According to Theorem 0.1, the $p_n(t,z)$ are polynomials, which we now compute:

$$\begin{aligned}
p_n(\bar t,z) &= z^n\,\frac{\tau_n(\bar t-[z^{-1}])}{\tau_n(\bar t)}\\
&= z^n\prod_1^n\Big(1-\frac{p_k}{z}\Big)\,\frac{\det\Big(-\tilde X + \sum_1^\infty k\,\bar t_k\tilde Y^{k-1} - z^{-1}\sum_1^\infty\big(\frac{\tilde Y}{z}\big)^{k-1}\Big)}{\det\Big(-\tilde X + \sum_1^\infty k\,\bar t_k\tilde Y^{k-1}\Big)}\\
&= \det(zI-\tilde Y)\,\frac{\det\Big(-\tilde X + \sum_1^\infty k\,\bar t_k\tilde Y^{k-1} - z^{-1}\big(1-z^{-1}\tilde Y\big)^{-1}\Big)}{\det\Big(-\tilde X + \sum_1^\infty k\,\bar t_k\tilde Y^{k-1}\Big)}\\
&= \det(zI-\tilde Y)\,\det\Big(I - \Big(xI + \sum_1^\infty k\,t_k\tilde Y^{k-1} - \tilde X\Big)^{-1}(z-\tilde Y)^{-1}\Big),
\end{aligned} \tag{7.8}$$

yielding (7.3), for the matrices $\tilde X$ and $\tilde Y$. It also provides an expression for the wave functions $\Psi_n(t,z)$, upon multiplying by an exponential. The formulae for $W_n$ and $W_n^*$ follow from (3.8) and (3.9) and the choice of $\rho_k$.

In order to connect with the form of the matrices $X$ and $Y$ announced in (7.2), consider the hyperplane $V$ perpendicular to $e = (1, \dots, 1)^{\top} \in \mathbb{C}^n$,

$$\mathbb{C}^n \supset V = \{\langle z, e\rangle = 0\},$$

and the isotropy subgroup $G_e \subset U(n)$ of $I_e$, i.e., the $U$'s such that $Ue = e$, thus preserving $V$. That $I_e\big|_V = -I$ follows at once, from

$$I_e z = \Big(\sum_{k\ne i} z_k\Big)_{1\le i\le n} = -z.$$

Since $G_e$ stabilizes $I_e$, there exists a unitary matrix $U \in G_e$, diagonalizing $\tilde X$, having the property

$$[U\tilde XU^{-1}, U\tilde YU^{-1}] = [X,Y] = UI_eU^{-1} = I_e,$$

with $X = \operatorname{diag}(x_1, \dots, x_n)$; i.e.,

$$(x_i-x_j)\,y_{ij} = 1-\delta_{ij}, \quad i \ne j,$$

implying $Y$ must have the form announced in (7.2). Introducing these new matrices into the expressions (7.7) and (7.8) for $\tau_n$ and $p_n(t,z)$ yields

$$\begin{aligned}
\tau_n(t) &= e^{\operatorname{tr}\sum_1^\infty t_iY^i}\det\Big(-X + \sum_1^\infty k\,t_kY^{k-1}\Big)\\
&= \det e^{\sum_1^\infty t_iY^i}\,\det(-X + t_1I + 2t_2Y + \dots)\\
&= \det e^{\sum_1^\infty t_iY^i}\,\prod_{i=1}^n\big(t_1 + x_i(t_2, t_3, \dots)\big),
\end{aligned}$$

and $p_n(t,z)$, as announced in (7.1) and (7.3). Note that the $n$ roots $x_i(t_2, t_3, \dots)$ of the characteristic equation in $t_1$ are solutions in $(t_2, t_3, \dots)$ of the $n$-particle Calogero–Moser system with initial configuration coordinates $(x_1, \dots, x_n, \xi_1, \dots, \xi_n)$; see T. Shiota's paper [15]. Thus, a solution of the discrete KP system corresponds to a flag of Calogero–Moser systems generated by a sequence of $p_i$ and $\lambda_i$, with $p_i$ playing the role of constant of motion. $\square$

Remark. Observe that for $t = 0$, the parameters $x$ and $z$ in

$$p_n(\bar t,z)\big|_{t=0} = \det(zI-Y)\,\det\big(I - (xI-X)^{-1}(zI-Y)^{-1}\big)$$

are interchangeable (except for the trivial factor $\det(zI-Y)$). This must be compared to the results in [15] and [14].

8. Discrete KdV-Solutions, with Upper-Triangular $L^2$

Letting all points $p_i$ in the soliton example converge to $p$, all points $q_i$ converge to $-p$, and all $\lambda_i$ converge to 1, the weights $\rho_k(z)$ take on the form (8.2) below. For future use, define the functions:

$$f_\ell = \begin{cases} p^{\ell}\sinh\displaystyle\sum_{\text{odd}} t_ip^i, & \ell \text{ even},\\[1.5ex] p^{\ell}\cosh\displaystyle\sum_{\text{odd}} t_ip^i, & \ell \text{ odd},\end{cases}$$

and

$$g_\ell = \begin{cases} p^{\ell}\Big(z\sinh\displaystyle\sum_{\text{odd}} t_ip^i - p\cosh\sum_{\text{odd}} t_ip^i\Big), & \ell \text{ even},\\[1.5ex] p^{\ell}\Big(z\cosh\displaystyle\sum_{\text{odd}} t_ip^i - p\sinh\sum_{\text{odd}} t_ip^i\Big), & \ell \text{ odd}.\end{cases} \tag{8.1}$$

Theorem 8.1. The family of weights,

$$\rho_k(z) = (-1)^k\delta^{(k)}(z-p) - \delta^{(k)}(z+p), \quad\text{for } 0 \le k \le n-1, \tag{8.2}$$

leads to discrete KdV solutions, with KdV $\tau$-functions$^{16}$

$$\tau_n(t) = 2^n e^{\,n\sum_{\text{even}} t_ip^i}\,W[f_0, f_1, \dots, f_{n-1}]. \tag{8.3}$$

The matrix $L$ has the property that $L^2$ is upper-triangular, with polynomial eigenvectors $L(t)p(t,z) = zp(t,z)$, given by

$$p_n(t,z) = \frac{W[g_0, g_1, \dots, g_{n-1}]}{W[f_0, f_1, \dots, f_{n-1}]} \tag{8.4}$$

in terms of (8.1); i.e., the polynomials $p_n(t,z)$ satisfy 3-step relations of the following nature:

$$z^2p_n(t,z) = \alpha_np_n + \beta_np_{n+1} + p_{n+2}.$$

Then

$$\begin{aligned}
W_n &= \Big\{f = \sum_0^\infty a_iz^i \text{ such that } f^{(k)}(p) - (-1)^kf^{(k)}(-p) = 0,\ 0 \le k \le n-1\Big\}\\
&= \{1, z^2, z^4, \dots\}\oplus z(z^2-p^2)^n\{1, z^2, z^4, \dots\}
\end{aligned}$$

and

$$\begin{aligned}
W_n^* &= \operatorname{span}\Big\{\int\frac{\rho_k(u)\,du}{z-u} = \Big(\frac{\partial}{\partial p}\Big)^{k}\frac{2p}{z^2-p^2},\ k = 0, \dots, n-1\Big\}\oplus H_+\\
&= \operatorname{span}\Big\{\Big(\frac{\partial}{\partial p}\Big)^{k}\frac{1}{z^2-p^2},\ k = 0, \dots, n-1\Big\}\oplus H_+.
\end{aligned}$$

Proof. Indeed, the form of the flag $W_n^*$ follows from the general formulae (3.9). Also observe that $1$ and $z(z^2-p^2)^n \perp W_n^*$ with respect to the $\langle\ ,\ \rangle_\infty$ pairing, and so they are in $W_n$; while from its definition, $z^2W_n \subset W_n$, and so $W_n$ has the claimed form. Therefore

$$\begin{aligned}
\tau_n(t) &= \det(\mu_{ij})_{0\le i,j\le n-1}\\
&= \det\Big(\oint_{z=\infty} z^{n-j-1}g(z)\Big(\frac{\partial}{\partial p}\Big)^{k}\frac{2p}{z^2-p^2}\,dz\Big)_{0\le k,j\le n-1}, \quad\text{using (3.8)}\\
&= \det\Big(\Big(\frac{\partial}{\partial p}\Big)^{k}\oint z^{n-j-1}g(z)\Big(\frac{1}{z-p} - \frac{1}{z+p}\Big)dz\Big)_{0\le k,j\le n-1}\\
&= \det\Big(\Big(\frac{\partial}{\partial p}\Big)^{k}\Big(p^{n-j-1}e^{\sum t_ip^i} - (-p)^{n-j-1}e^{\sum t_i(-p)^i}\Big)\Big)_{0\le k,j\le n-1}\\
&= \det\Big(\Big(\frac{\partial}{\partial p}\Big)^{k}p^{n-j-1}e^{\sum_{\text{even}} t_ip^i}\Big(e^{\sum_{\text{odd}} t_ip^i} + (-1)^{n-j}e^{-\sum_{\text{odd}} t_ip^i}\Big)\Big)_{0\le k,j\le n-1}\\
&= 2^n e^{\,n\sum_{\text{even}} t_ip^i}\,W[f_0, f_1, \dots, f_{n-1}],
\end{aligned}$$

which is formula (8.3). In order to express the wave vector, one needs to compute

$$\begin{aligned}
z^n\tau_n(t-[z^{-1}]) &= z^n\det\Big(\Big(\frac{\partial}{\partial p}\Big)^{k}\Big(p^{n-j-1}e^{\sum t_ip^i}\Big(1-\frac{p}{z}\Big) - (-p)^{n-j-1}e^{\sum t_i(-p)^i}\Big(1+\frac{p}{z}\Big)\Big)\Big)_{0\le k,j\le n-1}\\
&= \det\Big(\Big(\frac{\partial}{\partial p}\Big)^{k}p^{n-j-1}e^{\sum_{\text{even}} t_ip^i}\Big(e^{\sum_{\text{odd}} t_ip^i}(z-p) + (-1)^{n-j}e^{-\sum_{\text{odd}} t_ip^i}(z+p)\Big)\Big)_{0\le k,j\le n-1}\\
&= 2^n e^{\,n\sum_{\text{even}} t_ip^i}\,W[g_0, g_1, \dots, g_{n-1}],
\end{aligned}$$

from which formula (8.4) follows. Also notice that from the form of $W_n$, we have $z^2W_n \subset W_n$ and thus $z^2W_n^t \subset W_n^t$. It shows that the $\tau_n(t)$'s are KdV $\tau$-functions. This fact, combined with

$$W_n^t = (\operatorname{span}\{\rho_0^t, \rho_1^t, \dots, \rho_{n-1}^t\})^{\perp} = \operatorname{span}\{p_n(t,z), p_{n+1}(t,z), \dots\} \subset H_+,$$

leads to the 3-step relation:

$$z^2p_n(t,z) = \alpha_np_n + \beta_np_{n+1} + p_{n+2},$$

establishing the upper-triangular nature of $L^2$. $\square$

$^{16}$ $W[\dots]$ denotes a Wronskian with respect to the parameter $p$.

Remark. Letting $p \to 0$ in $p^{-n(n+1)/2}\tau_n(t)$ leads to the rational KdV solutions, i.e., the Schur polynomials with Young diagrams of type $\nu = (n, n-1, \dots, 1)$.

Acknowledgements. We wish to thank Leonid Dickey for insightful comments and criticism, which led us to recognize the importance of the integral in (0.15) beyond its formal aspects. L. Dickey has shown in a very interesting recent paper [8] that the discrete KP hierarchy is the most natural generalization of the modified KP. We also thank Alexander Its and Pavel Bleher for having explained to us the Riemann–Hilbert problem, and for having posed the problem of finding the connection with the matrix factorization of the moment matrix. We also thank Taka Shiota for a number of very interesting conversations.

References

1. Adler, M. and van Moerbeke, P.: Matrix integrals, Toda symmetries, Virasoro constraints and orthogonal polynomials. Duke Math. J. 80 (3), 863–911 (1995)
2. Adler, M. and van Moerbeke, P.: Vertex operator solutions to the discrete KP-hierarchy. Commun. Math. Phys. 203, 185–210 (1999)
3. Adler, M. and van Moerbeke, P.: String orthogonal Polynomials, String Equations and two-Toda Symmetries. Comm. Pure and Appl. Math. L, 241–290 (1997)
4. Adler, M., Horozov, E. and van Moerbeke, P.: The solution to the q-KdV equation. Phys. Letters A 242, 139–151 (1998)
5. Adler, M. and van Moerbeke, P.: The spectrum of coupled random matrices. Ann. of Math. 149, 921–976 (1999)
6. Akhiezer, N.I.: The classical moment problem. University Mathematical monographs, New York: Hafner, 1965
7. Bleher, P., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann–Hilbert problem and universality in the matrix model. Ann. of Math. 150, 1–81 (1999)
8. Dickey, L.: Modified KP and Discrete KP. Lett. Math. Phys. 48, 277–289 (1999)
9. Fokas, A.S., Its, A.R., Kitaev, A.V.: The isomonodromy approach to matrix models in 2d quantum gravity. Commun. Math. Phys. 147, 395–430 (1992)
10. Gieseker, D.: The Toda hierarchy and the KdV hierarchy. Preprint, alg-geom/9509006
11. Heine, E.: Handbuch der Kugelfunktionen, Vols I and II. 2nd edition, Berlin: 1878, 1881
12. Kazhdan, D., Kostant, B. and Sternberg, S.: Hamiltonian group actions and dynamical systems of Calogero–Moser type. Comm. Pure Appl. Math. 31, 481–508 (1978)
13. Mulase, M.: Algebraic theory of the KP equations, Perspectives in Math. Phys., Ed. R. Penner and S.T. Yau, International press company, 157–223 (1994)
14. Segal, G., Wilson, G.: Loop groups and equations of KdV type. Publ. Math. IHES 61, 5–65 (1985)
15. Shiota, T.: Calogero–Moser hierarchy and KP hierarchy. J. Math. Phys. 35, 5844–5849 (1994)
16. Ueno, K., Takasaki, K.: Toda Lattice Hierarchy. Adv. Studies in Pure Math. 4, 1–95 (1984)
17. van Moerbeke, P., Mumford, D.: The spectrum of difference operators. Acta Math. 143, 93–154 (1979)
18. Wilson, G.: Collisions of Calogero–Moser particles and an adelic Grassmannian. Inv. Math. 133, 1–41 (1998)

Communicated by T. Miwa

Commun. Math. Phys. 207, 621 – 640 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Persistence of the Feigenbaum Attractor in One-Parameter Families

Eleonora Catsigeras, Heber Enrich

Fac. Ingenieria, IMERL, C.C. 30 Montevideo, Uruguay. E-mail: [email protected]

Received: 4 August 1998 / Accepted: 11 May 1999

Abstract: Consider a map $\psi_0$ of class $C^r$, for large $r$, of a manifold of dimension $n$ greater than or equal to 2 having a Feigenbaum attractor. We prove that any such $\psi_0$ is a point of a local codimension-one manifold of $C^r$ transformations also exhibiting Feigenbaum attractors. In particular, the attractor persists when perturbing a one-parameter family transversal to that manifold at $\psi_0$. We also construct such a transversal family for any given $\psi_0$, and apply this construction to prove a conjecture by J. Palis stating that a map exhibiting a Feigenbaum attractor can be perturbed to obtain homoclinic tangencies.

1. Introduction

The existence of a local codimension-one manifold of transformations exhibiting Feigenbaum attractors is well known from the Feigenbaum–Coullet–Tresser theory [CT 1978, Fe 1978, Fe 1979] (for proofs see [La 1982, Su 1991, Ly 1999]) for unimodal real analytic maps in the interval that are in a neighborhood of the fixed map of the doubling renormalization. It is a consequence of the hyperbolicity of this fixed point, as proved by Lanford III [La 1982]. It is also true for $n$-dimensional transformations [1981]. The existence of such a local codimension-one manifold was recently proved to be valid not only for real analytic transformations but also in the $C^{2+\varepsilon}$ topology, as shown in [Da 1996] for maps in the interval (see also [Su 1991]), and in the $C^r$ topology, for sufficiently large $r$, for maps in $n \ge 2$ dimensions [CE 1998]. For $n \ge 2$, all these results are proved only in a small neighborhood of the fixed map (for global results in dimension 1, see [Su 1991, Ly 1999]). As the renormalization is neither invertible nor differentiable, we cannot deduce that there is a global codimension-one manifold, in the space of $n$-dimensional transformations with $n \ge 2$, containing Feigenbaum attractors.
We address here the question of whether there can exist, far away from the fixed map, a whole open set of transformations exhibiting Feigenbaum attractors, or if, on the contrary, these attractors can always be destroyed by arbitrarily small perturbations. In dimension one this is a difficult question whose solution requires elaborate complex analytic methods (see [Su 1991, Ly 1999, Mc 1994]). But in dimension two or greater, where the problem was open so far, we will show here, using explicit constructions, that the Feigenbaum attractor is an unstable phenomenon in the space of $C^r$ maps. We describe its instability: although unstable in the space of $C^r$ transformations, the Feigenbaum attractor is persistent in the space of one-parameter families of such transformations. In other words, it is a codimension-one phenomenon.

The persistence of the Feigenbaum attractor in one-parameter families, when combined with some previously known results, has some immediate consequences. For instance, a one-parameter family of $n$-dimensional transformations, near the quadratic family, will always have a Feigenbaum attractor. These families appear when unfolding generically a homoclinic tangency [PT 1993]. As a consequence, a theorem of Colli [Co 1996] can be applied to prove that, near such an unfolding, one can find transformations exhibiting infinitely many coexisting Feigenbaum attractors.

To describe the instability of the Feigenbaum attractor we construct a one-parameter family passing through the given map, showing on one side a sequence of period doubling bifurcations, and on the other side a sequence of homoclinic bifurcations. The construction of such a family exploits the existence of a good spatial direction, along which the perturbation can be made. This direction is not available when working with unimodal maps of the interval, due to the existence of critical points. That is the reason why the arguments work only in dimension greater than or equal to 2. The one-parameter families constructed in this work are of the kind described in [GST 1989] to provide examples of Feigenbaum attractors for smooth embeddings of the 2-disk. We show that any Feigenbaum attractor is a point of such a family.
The construction of a good one-parameter family provides a proof of a conjecture of J. Palis: the map having a Feigenbaum attractor can be perturbed to exhibit a homoclinic tangency. Thus, this attractor is near the chaotic phenomena that appear when unfolding a homoclinic tangency (such as strange attractors, for instance).

Statement of the main theorems and their corollaries. To state the theorems in the $C^r$ topology, we fix $r$ sufficiently large ($r \ge 8$ is sufficient to prove the theorems; we do not seek an optimal value of $r$). The space of transformations $C^r(D)$ is the open set of the Banach space of $C^r$ maps from an $n$-dimensional closed ball $D$ of $\mathbb{R}^n$ to $\mathbb{R}^n$ ($n \ge 2$), such that the image of $D$ is contained in the interior of $D$.

In Sect. 2 we define the doubling renormalization $R$ of certain maps in $C^r(D)$. We then define what we call a Feigenbaum attractor in dimension $n \ge 2$. We call it so because of its geometric similarity to the Feigenbaum attractors of unimodal maps in the interval. But it should be remarked that contributions to the theory in dimension greater than one, and also in dimension one, are nowadays far from the origin of the name of such attractors. In dimension two or greater most known results are based on the renormalization theory of unimodal maps of the interval. For details of one-dimensional renormalization theory see [MS 1993]. Roughly speaking, a Feigenbaum attractor is a Cantor set attractor of an infinitely doubling renormalizable map, whose dynamics microscopically converges, in the $C^r$ topology, to that of the real analytic map that is a fixed point of the renormalization operator.

In Sect. 3 we define an appropriate topology in the space $\mathcal{F}$ of one-parameter families of maps in $C^r(D)$, and we define the persistence of the Feigenbaum attractor in one-parameter families: each $X$ in $\mathcal{F}$, near a given family, passes through a map exhibiting such an attractor.


In Sect. 4 we prove the following

Theorem 1.1. If $\Psi = \{\psi_a\}_{a\in I} \in \mathcal{F}$ is such that $\psi_0$ has a Feigenbaum attractor $K$ and after some finite number $N$ of doubling renormalizations the family $\tilde\Psi = \{R^N\psi_a\}_{a\in(-\varepsilon,\varepsilon)}$, for some small $\varepsilon > 0$, intersects transversally the local stable manifold of the fixed map of the renormalization, then $K$ is persistent in one-parameter families near $\Psi$ and there exists a local codimension-one differential manifold $\mathcal{M}$ in the space $C^r(D)$, transversally intersecting $\Psi$ at $\psi_0$, formed of maps exhibiting Feigenbaum attractors.

One may be particularly interested in taking $\Psi$ as a family of $C^r$ diffeomorphisms, but the last theorem works also for non-injective $n$-dimensional maps. As shown in [La 1982], the quadratic family, after being renormalized a finite number of times, intersects transversally the stable manifold of the fixed map of the renormalization. This is true for unimodal maps in the interval. It can be easily generalized to maps in $n$ dimensions, where the quadratic family acquires the form:

$$\psi_a(x_1, x_2, \dots, x_{n-1}, x_n) = (x_n, 0, \dots, 0, 1 - a x_n^2).$$

Thus we obtain the following:

Corollary 1.1. The Feigenbaum attractor exhibited for $a = 1.401155\ldots$ in the quadratic family of dimension $n \ge 2$ is persistent in one-parameter families. Moreover, there exists in the space of $C^r$ transformations a local codimension-one manifold $\mathcal{M}$, containing $\psi_{1.401155\ldots}$, of maps exhibiting Feigenbaum attractors, and $\mathcal{M}$ is transversal in $C^r(D)$ to the quadratic family.

Some properties of the dynamics near homoclinic tangencies are deduced from the last corollary and from the theorem of Colli [Co 1996]:

Corollary 1.2 (Colli). Let $h_0 \in \mathrm{Diff}^\infty(M)$, where $M$ is a 2-dimensional manifold, be such that $h_0$ has a homoclinic tangency between the stable and unstable manifolds of a dissipative hyperbolic saddle. Then there exists an open set $\mathcal{V} \subset \mathrm{Diff}^\infty(M)$ such that
• $h_0 \in \overline{\mathcal{V}}$,
• there exists a dense subset $S \subset \mathcal{V}$ such that for all $h \in S$, $h$ exhibits infinitely many coexisting Feigenbaum attractors.

In [YA 1983] it is proved that a one-parameter family of transformations forming a horseshoe must pass through a sequence of period doubling bifurcations. This implies the appearance of cascades of period doubling bifurcations when unfolding a generic homoclinic tangency. The map where these bifurcations accumulate has a Cantor set attractor. This raises two questions:
– Is this Cantor set a Feigenbaum attractor (i.e., does it microscopically look like the Cantor set attractor of the fixed map of the renormalization)?
– Is the sequence of period doubling bifurcations pure, in the sense that, for all sufficiently large $N$, the period $2^N$ orbit that is born at each bifurcation does not undergo other bifurcations before the next period doubling bifurcation appears?

As shown in [PT 1993], a family generically unfolding a quadratic homoclinic tangency can be properly renormalized so that it is arbitrarily close to the quadratic family. Therefore, from Corollary 1.1 we can deduce the following:

E. Catsigeras, H. Enrich

Corollary 1.3. Let the family $\{h_\mu\}_\mu$, $h_\mu \in \mathrm{Diff}^\infty(M)$, where $M$ is a 2-dimensional manifold, generically unfold a quadratic homoclinic tangency for $\mu = 0$ between the stable and unstable manifolds of a dissipative hyperbolic saddle. Then, for $\mu$ sufficiently close to 0, this family passes through a pure sequence of period doubling bifurcations, and the Cantor set attractor, exhibited for the parameter value where this sequence accumulates, is a Feigenbaum attractor.

Theorem 1.1 asserts that the Feigenbaum attractor persists for families near a given family $\Psi$ which satisfies, after being renormalized, a certain transversality condition. For any transformation $\psi_0$ having a Feigenbaum attractor, it is natural to raise the general question of whether $\psi_0$ always belongs to a local codimension-one manifold of transformations with Feigenbaum attractors. In order to apply Theorem 1.1, one needs to prove the existence of a good family $\Psi$ passing through $\psi_0$. However, since the renormalization is not surjective, the construction of such a family is not trivial. This question is treated in Sect. 5, where we prove:

Theorem 1.2. If $\psi_0 \in C^r(D)$ has a Feigenbaum attractor, then there exists a one-parameter family $\Psi \in \mathcal{F}$, passing through $\psi_0$, verifying the hypothesis of Theorem 1.1, and there exists a local codimension-one manifold $\mathcal{M}$ in the space of $C^r$ maps, passing through $\psi_0$, transversal to the family $\Psi$, such that every $\chi \in \mathcal{M}$ has a Feigenbaum attractor.

As a consequence of Theorem 1.2 and of a theorem in [CE 1998] we obtain the following:

Corollary 1.4. If $\psi_0 \in C^r(D)$ has a Feigenbaum attractor, then there exists a one-parameter family $\Psi = \{\psi_t\}_{t\in(-\epsilon,\epsilon)}$ which, for $t < 0$, passes through a pure sequence of period doubling bifurcations accumulating at $\psi_0$ and such that, for a sequence of positive parameter values $t_m$ converging to 0, $\psi_{t_m}$ exhibits a homoclinic tangency.
This last corollary generalizes previous theorems on the approximation of cascades of period doubling bifurcations by homoclinic tangencies [Ca 1996, CE 1998], and is a contribution to a conjecture of J. Palis according to which global unstable phenomena can be perturbed to obtain homoclinic bifurcating maps [PT 1993].

2. The Feigenbaum Attractor in n Dimensions

Let $f_0 : [-1,1] \to [-1,1]$ be the fixed map in the interval, i.e., the unique real analytic unimodal map in the interval such that $f_0(0) = 1$, $f_0''(0) < 0$ and
$$f_0(1)^{-1}\cdot f_0\circ f_0(f_0(1)\cdot x) = f_0(x) \quad \forall x \in [-1,1].$$
The existence and uniqueness of $f_0$ was the central conjecture of the Feigenbaum–Coullet–Tresser theory [CT 1978, Fe 1978, Fe 1979] and was proved in [La 1982], [Su 1991]. We denote by $\lambda$ the number $-f_0(1) = 0.3995\ldots$. The map $f_0$ has a single fixed point in $[-1,1]$, which is larger than $\lambda$. The analytic map $f_0$ is symmetric: $f_0(x) = g_0(x^2)$, where $g_0$ is an analytic diffeomorphism from $[0,1]$ to $[-\lambda,1]$. It can be uniquely extended to an open interval. There exists one single periodic orbit of $f_0$ of period $2^N$ for each natural $N \ge 0$, and this orbit is a hyperbolic repellor. The orbits of countably many points eventually fall on one of these repellors. All the other orbits of $f_0$ are attracted to a Cantor set $K$ in the interval, which we call the (standard) Feigenbaum attractor in the interval.
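The fixed-point equation for $f_0$ can be checked numerically on a truncated power series. The sketch below is only an illustration: the coefficients in `F0_COEFFS` are rounded values of the commonly tabulated expansion of $f_0$ in powers of $x^2$ and are an assumption of this example, not data taken from this paper; the names `f0`, `renormalize` and `residual` are likewise hypothetical.

```python
# Approximate coefficients of f0 for the powers x^0, x^2, ..., x^12.
# ASSUMPTION: rounded values of the standard tabulated expansion; they are
# not quoted from this paper and serve only as a numerical illustration.
F0_COEFFS = [1.0, -1.5276330, 0.1048152, 0.0267057,
             -0.0035274, 0.0000816, 0.0000253]


def f0(x):
    """Evaluate the truncated series sum_k c_k x^(2k) by Horner's rule in x^2."""
    s = x * x
    acc = 0.0
    for c in reversed(F0_COEFFS):
        acc = acc * s + c
    return acc


def renormalize(f):
    """Return the doubling renormalization x -> f(1)^(-1) * f(f(f(1) * x))."""
    c = f(1.0)
    return lambda x: f(f(c * x)) / c


lam = -f0(1.0)                       # lambda = -f0(1)
Rf0 = renormalize(f0)
grid = [-1.0 + 2.0 * k / 200 for k in range(201)]
residual = max(abs(Rf0(x) - f0(x)) for x in grid)
print(f"lambda ~ {lam:.4f}, max |R f0 - f0| on [-1, 1] ~ {residual:.1e}")
```

With these coefficients the residual should be small compared with the size of $f_0$ itself, and `lam` should agree with $\lambda = 0.3995\ldots$ to the digits quoted in the text. The normalization $f(0) = 1$ is preserved exactly by `renormalize`, since $f(f(0))/f(1) = 1$.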

Feigenbaum Attractor in One-Parameter Families

Definition 2.1. Let the dimension $n$ be a natural number greater than or equal to 2. Let
$$\phi_0(x_1, x_2, \dots, x_{n-1}, x_n) = (x_n, 0, \dots, 0, f_0(x_n)) = (x_n, 0, \dots, 0, g_0(x_n^2))$$
be defined from a small compact neighborhood $D$ in $\mathbb{R}^n$ of $[-\lambda,1]\times\{0\}^{n-2}\times[-\lambda,1]$ to its interior. It will be called the fixed map in $n$ dimensions. It inherits the Cantor set attractor of the map $f_0$, which we call the (standard) Feigenbaum attractor in $n$ dimensions.

Remark 2.2. Observe that the fixed map in $n$ dimensions has a one-dimensional character: it is an endomorphism of $D$ mapping it onto a one-dimensional image, contained in its interior, and following the graph of $f_0$. The repellors of $f_0$ are transformed into periodic hyperbolic saddles of $\phi_0$ with infinite contraction along their stable manifolds. There exists such a periodic orbit with period $2^N$ for each natural $N \ge 0$. The unstable manifolds of the saddles have dimension one and are contained in $\phi_0(D)$. The stable manifold of each saddle is the union of its preimages by $\phi_0$, formed by horizontal $(n-1)$-dimensional hyperplanes intersected with $D$. All the orbits of $\phi_0$, except those in the stable manifolds of the saddles, are attracted to the Feigenbaum attractor. We note that the fixed point of $f_0$ has no other preimages in the interval $[-\lambda,1]$. As a consequence, the stable manifold of the fixed point $x_0$ of $\phi_0$ is $\phi_0^{-1}(x_0)$, which does not intersect $\phi_0(D)$ except at $x_0$.

We are interested in studying some Cantor set attractors for other $n$-dimensional maps, particularly for diffeomorphisms that might be far away from the fixed map. First, we generalize the definition of a Feigenbaum attractor: roughly speaking, we want to include any Cantor set that attracts all the orbits in a neighborhood except those in the stable manifolds of a countable set of periodic hyperbolic orbits and that, up to a bounded deformation that microscopically approaches the identity, is the standard Feigenbaum attractor.
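The one-dimensional character described in Remark 2.2 can be illustrated with a small sketch. It lifts an interval map $f$ to the $n$-dimensional map $(x_1, \dots, x_n) \mapsto (x_n, 0, \dots, 0, f(x_n))$; the quadratic map $1 - a x^2$ is used here only as a stand-in for $f_0$, and the helper names `lift_to_n_dims` and `quadratic` are hypothetical.

```python
def lift_to_n_dims(f, n):
    """Return the n-dimensional map (x_1, ..., x_n) -> (x_n, 0, ..., 0, f(x_n))."""
    if n < 2:
        raise ValueError("the construction requires n >= 2")

    def phi(x):
        xn = x[-1]
        return (xn,) + (0.0,) * (n - 2) + (f(xn),)

    return phi


def quadratic(a):
    """Interval map x -> 1 - a x^2 (a stand-in for f0 in this sketch)."""
    return lambda x: 1.0 - a * x * x


f = quadratic(1.0)          # for a = 1 the critical point 0 has period 2
phi = lift_to_n_dims(f, 3)

# Points with the same last coordinate have the same image:
# the image of phi is a curve following the graph of f.
same_image = phi((0.3, -0.2, 0.5)) == phi((-0.7, 0.1, 0.5))

# The last coordinate of an n-dimensional orbit follows the interval map.
x, t = (0.0, 0.0, 0.0), 0.0
orbit = []
for _ in range(4):
    x, t = phi(x), f(t)
    orbit.append(x)
    assert x[-1] == t
print(same_image, orbit)
```

For $a = 1$ the orbit of the origin alternates between $(0, 0, 1)$ and $(1, 0, 0)$, the lift of the period-two orbit $0 \mapsto 1 \mapsto 0$ of the interval map.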
To formalize the idea we need the following:

Definition 2.3. An $n$-disk $D$ (or simply a disk) is the image by a $C^r$ diffeomorphism of the closed unit ball of $\mathbb{R}^n$ ($n \ge 2$).

In particular, the domain $D$ of $\phi_0$ can be chosen to be an $n$-disk. To state our theorems we will work with a fixed $r$ that is greater than or equal to a minimal specific value needed for each theorem. We did not look for an optimal value of $r$ but, as will be explained later, $r \ge 8$ is good enough for both Theorems 1.1 and 1.2.

Definition 2.4. Given an $n$-disk $D$, the space $C^r(D)$ is the open set of all the maps of class $C^r$ from $D$ to its interior, with the topology given by the $C^r$ norm $\|\cdot\|_r$.

In some parts of this paper we will need to work with the whole Banach space of $C^r$ maps from $D$ to $\mathbb{R}^n$, although their images are not contained in the interior of $D$. We will still denote it by $C^r(D)$ if there is no risk of confusion.

Definition 2.5. A map $\psi \in C^r(D)$ is doubling renormalizable if there exists a disk $D_1 \subset \mathrm{int}\, D$ such that
$$\psi(D_1) \cap D_1 = \emptyset, \qquad \psi^2(D_1) \subset \mathrm{int}(D_1).$$

If $\psi$ is doubling renormalizable and $\xi : D \to D_1$ is a $C^r$ diffeomorphism (called a change of variables), the map $\mathcal{R}\psi$ defined as
$$\mathcal{R}\psi = \xi^{-1}\circ\psi\circ\psi\circ\xi$$
is a renormalized map of $\psi$. Note that doubling renormalizability is an open condition in $C^r(D)$. Also note that $\mathcal{R}\psi$ is not uniquely defined: small perturbations of the change of variables $\xi$ give other renormalized maps of $\psi$. When referring to the properties of $\mathcal{R}\psi$ we understand that there exists some renormalized map of $\psi$ having these properties. By induction we define:

Definition 2.6. A map $\psi \in C^r(D)$ is $m$-times (doubling) renormalizable if it is $(m-1)$-times (doubling) renormalizable and its $(m-1)$-renormalized map $\mathcal{R}^{m-1}\psi$ is doubling renormalizable. An $m$-renormalized map of $\psi$ is defined as $\mathcal{R}^m\psi = \mathcal{R}\mathcal{R}^{m-1}\psi$.

Definition 2.7. A map $\psi \in C^r(D)$ is infinitely (doubling) renormalizable if it is $m$-times (doubling) renormalizable for all natural $m$.

Remark 2.8. The main example of an infinitely doubling renormalizable map is the fixed map $\phi_0$ defined in 2.1. In fact, $\phi_0$ is fixed by the renormalization
$$\mathcal{R}\phi_0 = \Lambda^{-1}\circ\phi_0\circ\phi_0\circ\Lambda = \phi_0,$$
where
$$\Lambda(x_1, x_2, \dots, x_n) = (-\lambda x_1, \lambda x_2, \dots, \lambda x_{n-1}, g_0(\lambda^2 g_0^{-1}(x_n))),$$
with $\lambda = 0.3995\ldots$. To verify the identity $\mathcal{R}\phi_0 = \phi_0$ we used that $-\lambda f_0(x) = f_0\circ f_0(\lambda x)$, which implies $-\lambda g_0(u) = g_0([g_0(\lambda^2 u)]^2)$ and so
$$\phi_0\circ\Lambda(x_1, x_2, \dots, x_n) = (g_0(\lambda^2 g_0^{-1}(x_n)), 0, \dots, 0, -\lambda x_n).$$

Proposition 2.9. The change of variables $\Lambda$ and $\phi_0\circ\Lambda$ are contractions.

Proof. We must show that the derivatives $D\Lambda(x)$ and $D(\phi_0\circ\Lambda)(x)$ have norm smaller than 1 for all $x \in D$. Looking at the expressions of $\Lambda$ and $\phi_0\circ\Lambda$, made explicit in Remark 2.8, and as $0 < \lambda < 1$, it is enough to show that
$$\left|\frac{\lambda^2 g_0'(\lambda^2 g_0^{-1}(x))}{g_0'(g_0^{-1}(x))}\right| < 1$$
for all $x \in [-\lambda, 1]$. The above inequality is trivial if $g_0^{-1}(x) = 0$. Now, if $g_0^{-1}(x) = u^2 \ne 0$, using that $2u\, g_0'(u^2) = f_0'(u)$, we obtain
$$\left|\frac{\lambda^2 g_0'(\lambda^2 g_0^{-1}(x))}{g_0'(g_0^{-1}(x))}\right| = \frac{\lambda f_0'(\lambda u)}{f_0'(u)} < \lambda < 1$$
because $f_0'' < 0$ and $0 < \lambda u < u$. □

Observe that we have indeed proved that $\|D\Lambda(x)\| = \|D(\phi_0\circ\Lambda)(x)\| = \lambda = 0.3995\ldots < 1$. With $\phi_0$ fixed by the renormalization, any map $\chi$ in a neighborhood $\mathcal{U}$ of $\phi_0$ in $C^r(D)$ is doubling renormalizable.
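The key ratio in the proof of Proposition 2.9, $\lambda f_0'(\lambda u)/f_0'(u)$, can be evaluated on a grid as a sanity check. This is a hedged sketch: the coefficients in `F0_COEFFS` are rounded values of the commonly tabulated series of $f_0$ in powers of $x^2$ (an assumption of this example, not data from this paper), and the function names are hypothetical.

```python
# ASSUMPTION: rounded coefficients of f0 for x^0, x^2, ..., x^12; they are
# not taken from this paper and only illustrate the inequality numerically.
F0_COEFFS = [1.0, -1.5276330, 0.1048152, 0.0267057,
             -0.0035274, 0.0000816, 0.0000253]

# f0(1) is the sum of the coefficients, and lambda = -f0(1).
lam = -sum(F0_COEFFS)


def f0_prime(x):
    """Derivative of the truncated series: sum_k 2k c_k x^(2k-1)."""
    return sum(2 * k * c * x ** (2 * k - 1)
               for k, c in enumerate(F0_COEFFS) if k > 0)


def last_coordinate_rate(u):
    """lambda * f0'(lambda u) / f0'(u), for u != 0."""
    return lam * f0_prime(lam * u) / f0_prime(u)


# Sample u in (0, 1]; the proof predicts values strictly between 0 and lambda.
rates = [last_coordinate_rate(k / 100) for k in range(1, 101)]
print(min(rates), max(rates), lam)
```

Near $u = 0$ the ratio approaches $\lambda^2$, which matches the trivial case $g_0^{-1}(x) = 0$ in the proof.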

From the Feigenbaum–Coullet–Tresser theory we have the hyperbolicity of $\phi_0$. In [CE 1998] it is shown that the change of variables $\xi(\chi)$ can be chosen depending continuously on $\chi \in \mathcal{U}$, such that $\xi(\phi_0) = \Lambda$ and such that the renormalization $\mathcal{R}$, now uniquely defined as $\mathcal{R}\chi = \xi(\chi)^{-1}\circ\chi\circ\chi\circ\xi(\chi)$, has a hyperbolic behavior in $\mathcal{U}$. The renormalization $\mathcal{R}$ is not Fréchet differentiable in $C^r(D)$, so we cannot expect $\phi_0$ to be a hyperbolic fixed point in the differentiable sense. But we can define the local stable set of $\phi_0$ as
$$W^s(\phi_0) = \{\chi \in C^r(D) : \chi \text{ is infinitely doubling renormalizable and } \mathcal{R}^m\chi \in \mathcal{U} \text{ for all } m \ge 0\},$$
and the local unstable set of $\phi_0$ as
$$W^u(\phi_0) = \{\chi \in C^r(D) : \text{for all } m \ge 0 \text{ there exists } \chi_m \in \mathcal{U} \text{ such that } \mathcal{R}^m\chi_m = \chi\}.$$
Reducing, if necessary, the neighborhood $\mathcal{U}$ of $\phi_0$ in $C^r(D)$, we have the following result:

Theorem 2.10. Let $r \ge 6$. In the functional space $C^r(D)$ of $C^r$ maps of the $n$-dimensional disk $D$, let $\phi_0$ be the fixed map of the renormalization. Then $W^s(\phi_0)$ is a local codimension-one $C^1$ manifold, $W^u(\phi_0)$ is a one-parameter differentiable family $\{\phi_t\}_t$, and both intersect transversally at $\phi_0$. Moreover, the renormalized maps of any map in $W^s(\phi_0)$ converge to $\phi_0$.

Note on the proof. The last theorem is proved in [CE 1998] (Theorem 3.6) for $r$ large enough. It can be shown that $r \ge 6$ is sufficient, using some adjustments to the numerical computation of the derivatives of the fixed map $f_0$ in the interval: in Lemma 2.4 of [CE 1998] one can find an inequality that fixes an integer number $h$,
$$\lambda^h (a^h \lambda^{-2} + b) < 1,$$
where $\lambda = 0.3995\ldots$, $b = 1/\lambda^2$ and $a = |f_0'(\lambda)|$. To prove Theorem 3.6 of [CE 1998] one uses $r \ge h + 2$. Replacing the constant $a$ by 1.19236, which is an upper estimate of its value, one obtains $h \ge 4$ and thus $r \ge 6$. The estimate of $a$ was computed using the first ten coefficients of $f_0(x)$ as a series in powers of $x^2$. These coefficients are explicitly determined in [La 1982].

Remark 2.11. As the last theorem is local in a neighborhood of the Feigenbaum map $\phi_0$ in $n$ dimensions, it is true also for any other $n$-dimensional map $\bar\phi_0$ conjugated to $\phi_0$ by a $C^{r+1}$ conjugation $\xi$ between $n$-disks: in fact, the conjugation $\xi$ defines a bijection of conjugated maps, that is, a diffeomorphism from a neighborhood of $\phi_0$ in $C^r(D)$ to a neighborhood of $\bar\phi_0$ in $C^r(\xi(D))$, preserving the doubling renormalization. To prove Theorem 2.10 ([CE 1998], Theorem 3.6) we used a fixed map $\varphi_0$ in $n$ dimensions that does not coincide with the fixed map $\phi_0$ defined in this work, but both are equivalent up to a conjugation.
Indeed, in [CE 1998], as in the paper of Collet, Eckmann and Koch [1981], the unimodal symmetric map of the interval $x \mapsto f_0(x) = g_0(x^2)$, with $g_0$ a real analytic diffeomorphism, gives rise to the $n$-dimensional map $\varphi_0$ in an $n$-disk defined as
$$\varphi_0(x_1, x_2, \dots, x_n) = (g_0(x_1^2 - \alpha x_n), 0, \dots, 0),$$
where $\alpha \ne 0$ is constant. In this work we associate to the map $f_0$ in the interval the $n$-dimensional map $\phi_0$ in a disk defined as
$$\phi_0(x_1, x_2, \dots, x_n) = (x_n, 0, \dots, 0, f_0(x_n)).$$
Both transformations $\varphi_0$ and $\phi_0$, defined in appropriate disks of $\mathbb{R}^n$, are equivalent up to a smooth conjugation. In fact, it can be verified that $\xi\circ\varphi_0\circ\xi^{-1} = \phi_0$, where
$$\xi(x_1, x_2, \dots, x_{n-1}, x_n) = (x_1, x_2, \dots, x_{n-1}, g_0(x_1^2 - \alpha x_n)).$$
Here we have preferred the transformation $\phi_0$ to $\varphi_0$ because the corollaries of the theorems become immediate.

The following theorem allows us to define the Feigenbaum attractors for maps in $C^r(D)$ far away from the fixed map:

Theorem 2.12. Let $r \ge 1$. If $\psi \in C^r(D)$ is infinitely doubling renormalizable and, for some $m \ge 0$, $\mathcal{R}^m\psi \in W^s(\phi_0)$, then there exist a minimal Cantor set $K$ such that $\psi(K) = K$, a neighborhood $U$ of $K$ and, for each sufficiently large $N$, a single periodic orbit of period $2^N$ in $U$, which is hyperbolic of saddle type. Moreover, $K$ attracts all the orbits in $U$ except those in the stable manifolds of these periodic orbits, and there are no other periodic orbits in $U$. All the orbits of $K$ are quasi-periodic.

Proof. $\phi_0$ is infinitely renormalizable and $\mathcal{R}\phi_0 = \phi_0 = \Lambda^{-1}\circ\phi_0\circ\phi_0\circ\Lambda$, where the change of variables $\Lambda$ is defined in Remark 2.8. By Proposition 2.9, $\Lambda$ and $\phi_0\circ\Lambda$ are contractions. Any map $\chi$ in a neighborhood $\mathcal{U}$ of $\phi_0$ is doubling renormalizable and $\mathcal{R}\chi = \xi^{-1}\circ\chi\circ\chi\circ\xi$, where $\xi = \xi(\chi)$, depending continuously on $\chi$, is near $\Lambda$ and $\chi\circ\xi$ is near $\phi_0\circ\Lambda$. Therefore both $\xi$ and $\chi\circ\xi$ are contractions with contraction rate bounded above by a constant $\beta < 1$, which is uniform for all $\chi \in \mathcal{U}$. If, in addition, $\chi$ belongs to $W^s(\phi_0)$, then it is infinitely renormalizable in $\mathcal{U}$, and its $m$-renormalized map $\mathcal{R}^m\chi$ is $\xi_m^{-1}\circ\chi^{2^m}\circ\xi_m$, where
$$\xi_m = \xi(\chi)\circ\xi(\mathcal{R}\chi)\circ\dots\circ\xi(\mathcal{R}^{m-1}\chi).$$
As $\chi^j\circ\xi_m$, for $j = 0, 1, \dots, 2^m - 1$, can be written as the composition of $m$ maps, each being either $\xi(\mathcal{R}^i\chi)$ or $\mathcal{R}^i\chi\circ\xi(\mathcal{R}^i\chi)$ for $i = 0, 1, \dots, m-1$, the map $\chi^j\circ\xi_m$ is a contraction with contraction rate bounded above by $\beta^m$.

Let us define $A_{j,m} = \chi^j\circ\xi_m(D)$ for $m \ge 1$ and $0 \le j \le 2^m - 1$. It is a compact set with diameter smaller than $\beta^m\,\mathrm{diam}\,D$. We assert that, for fixed $m$, the $2^m$ sets $A_{j,m}$ are pairwise disjoint in $j$, in spite of $\chi$ being not necessarily injective. In fact, $A_{0,1}$ and $A_{1,1}$ are disjoint because $\chi$ is doubling renormalizable. For the same reason, but using $\mathcal{R}\chi$ instead of $\chi$, the sets $A_{0,2}$ and $A_{2,2}$ are disjoint and contained in $A_{0,1}$. On the other hand, $A_{1,2}$ and $A_{3,2}$ are both contained in $A_{1,1}$ and are transformed by $\chi$ into disjoint sets, so they are disjoint. By induction the assertion is proved. Let
$$K = \bigcap_{m=1}^{\infty}\ \bigcup_{j=0}^{2^m-1} A_{j,m}.$$

The diameters of the disjoint compact sets $A_{j,m}$, for $j = 0, 1, \dots, 2^m - 1$, converge to 0 as $m \to \infty$, and so $K$ is a Cantor set. It is easy to show that $\chi(K) \subset K$. In fact, $\chi(A_{j,m}) = A_{j+1,m}$ for $0 \le j \le 2^m - 2$ and $\chi(A_{2^m-1,m}) \subset A_{0,m}$. As the diameter of $A_{j,m}$ converges to 0 with $m$, each point of $K$ is quasi-periodic, but never periodic. So $K$ is minimal and $\chi(K) = K$.

Now let us prove that $K$ is an attractor for $\chi$. To do this we will use the properties of the fixed map $\phi_0$ pointed out in Remark 2.2: $\phi_0$ has a single fixed point $x_0$ that is hyperbolic of saddle type. Its stable manifold $W^s(x_0)$ is the compact set $\phi_0^{-1}(x_0)$. All the forward orbits of $\phi_0$, except those in $W^s(x_0)$, eventually enter the disk $\Lambda(D)$. As $\Lambda^{-1}\circ\phi_0\circ\phi_0\circ\Lambda = \phi_0$ and $\phi_0(D) \subset \mathrm{int}(D)$, we can choose an open set $W$ such that
$$\phi_0^2\circ\Lambda(D) \subset W \subset \overline{W} \subset \mathrm{int}\,\Lambda(D).$$
So, once an orbit enters $\Lambda(D)$, its following iterates never escape the open set $\phi_0^{-1}(W)\cup W$. Let us choose a small neighborhood $H$ of $x_0$, disjoint from $\phi_0^{-1}(W)\cup W$, such that for $\phi_0$, and also for all maps $\chi$ near $\phi_0$ in $C^r(D)$, a forward orbit is in the stable manifold of the fixed point if it is totally contained in $H$. As $\phi_0^{-1}(x_0) = W^s(x_0)$ we have that $\phi_0^{-1}(H) \supset W^s(x_0)$. Let us choose an open set $V$ such that
$$\phi_0^{-1}(H) \supset \overline{V} \supset V \supset W^s(x_0).$$
As $D - V$ is compact, there exists a natural number $N$ such that, for all $y \in D - V$, either $\phi_0^N(y)$ or $\phi_0^{N+1}(y)$ belongs to $W$. In fact, all the forward orbits, except those in $W^s(x_0)$, eventually remain in $W\cup\phi_0^{-1}(W)$. Now let us perturb $\phi_0$, taking a map $\chi \in C^r(D)$ sufficiently near $\phi_0$ so that:
• $\chi$ has a fixed hyperbolic saddle point $\bar x_0 \in H$.
• $\chi^j(y) \in H$ for all $j \ge 0$ implies $y \in W^s(\bar x_0)$.
• $\chi(V) \subset H$.
• $\chi^2\circ\xi(\chi)(D) \subset W \subset \overline{W} \subset \xi(\chi)(D)$.
• If $y \in D - V$, then either $\chi^N(y)$ or $\chi^{N+1}(y)$ belongs to $W$.

We assert that all the forward orbits by $\chi$, except those in $W^s(\bar x_0)$, eventually enter $W$, and so the disk $\xi(\chi)(D)$. In fact, if $y \in D - V$, either $\chi^N(y)$ or $\chi^{N+1}(y)$ belongs to $W$. If $\chi^j(y) \in V$ for all $j \ge 0$, then $\chi^{j+1}(y) \in H$ for all $j \ge 0$ and so $y \in W^s(\bar x_0)$, proving our assertion.

Now we are ready to prove the theorem: if $\mathcal{R}^m\psi \to \phi_0$, the $m$-renormalized maps of $\psi$, for $m \ge M$, are all in $\mathcal{U}$ and sufficiently near $\phi_0$ to verify the conditions described above. Thus for $\chi = \mathcal{R}^M(\psi)$ the invariant Cantor set $K$ attracts all the orbits except those in the stable manifolds of the hyperbolic saddles of period $2^N$ of $\chi$, for $N \ge 0$, that is, of period $2^{N+M}$ of $\psi$. As $\mathcal{R}^M(\psi) = \xi_M^{-1}\circ\psi^{2^M}\circ\xi_M$, where $\xi_M$ is a diffeomorphism between $D$ and a sub-disk $D_M$, we obtain the thesis taking
$$U = \bigcup_{j=0}^{2^M-1} \psi^{-j}\,\mathrm{int}(D_M).$$ □
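The nested atoms $A_{j,m}$ from the proof of Theorem 2.12 can be visualized in the simplest setting: the interval map $f_0$ itself, with the linear change of variables $x \mapsto -\lambda x$, so that $A_{j,m} = f_0^j([-\lambda^m, \lambda^m])$. The sketch below is a one-dimensional analogue under stated assumptions, not the $n$-dimensional construction of the proof: the coefficients in `F0_COEFFS` are rounded values of the commonly tabulated series of $f_0$ (not data from this paper), each atom is approximated by the hull of sampled points, and the names `atoms` and `pairwise_disjoint` are hypothetical.

```python
# ASSUMPTION: rounded coefficients of f0 for x^0, x^2, ..., x^12.
F0_COEFFS = [1.0, -1.5276330, 0.1048152, 0.0267057,
             -0.0035274, 0.0000816, 0.0000253]


def f0(x):
    """Evaluate the truncated series of f0 by Horner's rule in x^2."""
    s = x * x
    acc = 0.0
    for c in reversed(F0_COEFFS):
        acc = acc * s + c
    return acc


lam = -f0(1.0)


def atoms(m, samples=2001):
    """Approximate A_{j,m} = f0^j([-lam^m, lam^m]), j = 0..2^m - 1, as (lo, hi)."""
    radius = lam ** m
    pts = [-radius + 2.0 * radius * k / (samples - 1) for k in range(samples)]
    out = []
    for _ in range(2 ** m):
        out.append((min(pts), max(pts)))   # hull of the sampled image
        pts = [f0(p) for p in pts]         # push the samples forward once
    return out


def pairwise_disjoint(intervals):
    """True if the closed intervals are pairwise disjoint."""
    ordered = sorted(intervals)
    return all(a[1] < b[0] for a, b in zip(ordered, ordered[1:]))


levels = {m: atoms(m) for m in (1, 2, 3)}
max_diameter = {m: max(hi - lo for lo, hi in A) for m, A in levels.items()}
for m, A in levels.items():
    print(m, len(A), pairwise_disjoint(A), round(max_diameter[m], 4))
```

Only small values of $m$ are used, since errors in the truncated series are amplified along the orbit. For these levels the $2^m$ intervals should be pairwise disjoint with decreasing maximal diameter, mirroring the properties used to show that $K$ is a Cantor set.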

Note. From the proof of Theorem 2.12 observe that the Cantor set attractor $K$ has bounded geometry, in the sense that the diameters of the atoms $A_{j,m}$ decrease at a rate bounded above by a constant smaller than 1. Moreover, when looking microscopically, the decreasing rate tends to the number $\lambda = 0.3995\ldots$, which is a spatial universal constant defined for the fixed map $\phi_0$. In fact, $\lambda$ is the contraction rate of the change of variables in the proof of Proposition 2.9. In [GT 1992] an example is given of an $n$-dimensional infinitely renormalizable map whose renormalized maps do not converge to the fixed map $\phi_0$. In spite of that, this example has a Cantor set attractor that verifies the thesis of Theorem 2.12. Its geometry is also bounded, but the bounds are different from $\lambda$.

Based on Theorem 2.12, and with the aim of extending the Feigenbaum–Coullet–Tresser theory far away from the fixed map, we define:

Definition 2.13. If the map $\psi \in C^r(D)$ is infinitely doubling renormalizable and some of its renormalized maps belongs to the local stable manifold $W^s(\phi_0)$ of the fixed map $\phi_0$, then its invariant Cantor set $K$, described in Theorem 2.12, is called a Feigenbaum attractor.

3. Persistence in One-Parameter Families

In this section we define a topology in the space of differentiable one-parameter families of maps in $C^r(D)$ and define persistence of the Feigenbaum attractor in one-parameter families near a given family. Let $r \ge 1$. Let $I$ be a closed interval such that $0 \in \mathrm{int}\, I$.

Definition 3.1. $\mathcal{F}_I$ is the set of all one-parameter families $\Psi = \{\psi_t\}_{t\in I}$ such that
• for each fixed $t \in I$ the map $\psi_t$ belongs to the space $C^r(D)$,
• the transformation that associates to each $t \in I$ the map $\psi_t \in C^r(D)$ is of class $C^1$.

The derivative in $C^r(D)$ with respect to the parameter $t$ is a vector in the Banach space of $C^r$ maps from $D$ to $\mathbb{R}^n$. It is denoted by $\frac{\partial}{\partial t}\psi_t$. The topology in $\mathcal{F}_I$ is given by the norm
$$\|\Psi\|_{1,r} = \max_{t\in I}\left\{\|\psi_t\|_r,\ \left\|\frac{\partial}{\partial t}\psi_t\right\|_r\right\}.$$
Given $\varepsilon > 0$, we say that the family $X \in \mathcal{F}_I$ is $\varepsilon$-close to $\Psi$ if $\|X - \Psi\|_{1,r} < \varepsilon$, and we denote it as $X \in B_\varepsilon(\Psi)$. Let us fix a family $\Psi \in \mathcal{F}_I$ such that $\psi_0$ has a Feigenbaum attractor $K$ and $\frac{\partial}{\partial t}\psi_t\big|_{t=0} \ne 0$. We have the following classical theorem, which uses the inverse function theorem in Banach spaces [La 1972].

Theorem 3.2 (Local immersion form). If $\frac{\partial}{\partial t}\psi_t\big|_{t=0} \ne 0$, there exist a real number $\delta > 0$, a neighborhood $\mathcal{U}$ of $\psi_0$ in $C^r(D)$, a codimension-one subspace $S$ of the Banach space of $C^r$ maps from $D$ to $\mathbb{R}^n$, and a $C^1$ diffeomorphism $\wedge : \mathcal{U} \to \hat{\mathcal{U}} \subset \mathbb{R}\times S$, written $\chi \mapsto \hat\chi$, such that $\hat\psi_t = (t, 0)$ for all $t \in (-\delta, \delta)$.

Remark 3.3 (Notation). Let us take a closed interval $J$ contained in $(-\delta, \delta)$, with $\delta$ as in the theorem above, and such that $0 \in \mathrm{int}\, J$. Note that if $|t_0|$ is small enough then $\hat\psi_{t+t_0} = \hat\psi_t + \hat\psi_{t_0}$ for all $t \in J$. Throughout this section the Banach space $\mathcal{F}_J$ will be denoted simply by $\mathcal{F}$ and the restricted family $\{\psi_t\}_{t\in J}$ will be denoted by $\Psi$. To avoid confusion we will not use in this section the whole given family $\{\psi_t\}_{t\in I}$.

It is easy to verify that, given $\varepsilon > 0$, there exists $\rho(\varepsilon) > 0$ such that, if $|t_0| < \rho(\varepsilon)$, then the reparametrized family $(+t_0)^*\Psi = \{\psi_{t+t_0}\}_{t\in J}$ is $\varepsilon$-close to $\Psi$ in $\mathcal{F}$.

Definition 3.4. The Feigenbaum attractor is persistent in one-parameter families near $\Psi$ if there exist $\varepsilon > 0$ and a $C^1$ real function $a : B_\varepsilon(\Psi) \to J$ such that, if $X = \{\chi_t\}_{t\in J} \in B_\varepsilon(\Psi) \subset \mathcal{F}$, then $\chi_{a(X)}$ exhibits a Feigenbaum attractor, $a(\Psi) = 0$ and $a((+t_0)^*\Psi) = -t_0$.

Theorem 3.5. The Feigenbaum attractor is persistent in one-parameter families near $\Psi$ if and only if there exists in $C^r(D)$ a $C^1$ codimension-one local manifold $\mathcal{M}$ intersecting transversally the family $\Psi$ at the map $\psi_0$ and such that any map $\chi$ in $\mathcal{M}$ has a Feigenbaum attractor.

Proof. It is easy to prove that the existence of $\mathcal{M}$ is sufficient for the persistence of the Feigenbaum attractor. In fact, the transversal intersection of $\Psi$ and $\mathcal{M}$ at $\psi_0$ in the space $C^r(D)$ persists for any other family $X$ sufficiently near $\Psi$ in the space $\mathcal{F}$.

Let us prove the converse. Let $\varepsilon$ be the positive real number given by Definition 3.4 of persistence. Let $0 < \rho < \rho(\varepsilon)$ as in Remark 3.3. Let $\wedge : \mathcal{U} \to \hat{\mathcal{U}}$ be the $C^1$ diffeomorphism of the local immersion form in Theorem 3.2. As $\{\hat\psi_t : t \in J\}$ is a compact segment in $\hat{\mathcal{U}}$, there exists $\gamma$ such that $\|\chi_0 - \psi_0\|_r < \gamma$ implies $\hat\chi_0 - \hat\psi_0 + \hat\psi_t \in \hat{\mathcal{U}}$ for all $t \in J$. Let us denote
$$B_\gamma(\psi_0) = \{\chi_0 \in C^r(D) : \|\chi_0 - \psi_0\|_r < \gamma\}.$$
Given $\chi_0 \in B_\gamma(\psi_0)$, define the family $X = \{\chi_t\}_{t\in J} \in \mathcal{F}$ by translating $\hat\psi_t$ by the vector $\hat\chi_0$ in $\hat{\mathcal{U}}$. Precisely, $\chi_t \in \mathcal{U}$ is the preimage by the diffeomorphism $\wedge : \mathcal{U} \to \hat{\mathcal{U}}$ of $\hat\chi_t$, defined as $\hat\chi_t = \hat\psi_t + \hat\chi_0$ for all $t \in J$. As it is a translation up to the $C^1$ diffeomorphism $\wedge$, the application $A$ transforming $\chi_0 \in B_\gamma(\psi_0) \subset \mathcal{U} \subset C^r(D)$ into the family $X = \{\chi_t\}_{t\in J} \in \mathcal{F}$ is of class $C^1$. Moreover, $A(\psi_0) = \Psi$ and $A(\psi_{t_0}) = (+t_0)^*\Psi$ if $|t_0| < \rho(\varepsilon)$. Thus, for small $\gamma$, the family $X$ is $\varepsilon$-near $\Psi$.

Using Definition 3.4, take the real function $a$ and construct the $C^1$ real function $b : B_\gamma(\psi_0) \to \mathbb{R}$ defined as $b(\chi_0) = a(A(\chi_0))$. By Definition 3.4, $\chi_{b(\chi_0)}$ has a Feigenbaum attractor. Moreover, $b(\psi_0) = a(\Psi) = 0$ and
$$b(\psi_{t_0}) = a((+t_0)^*\Psi) = -t_0 \quad \text{if } |t_0| < \rho(\varepsilon).$$
Differentiating the last identity with respect to $t_0$ at $t_0 = 0$ we obtain
$$Db|_{\psi_0}\cdot\frac{\partial\psi_t}{\partial t}\bigg|_{t=0} = -1.$$
Thus $Db|_{\psi_0} \ne 0$ and, for small $\gamma$, the equality
$$\mathcal{M} = \{\chi_0 \in B_\gamma(\psi_0) : b(\chi_0) = 0\}$$
defines a local immersed submanifold in $C^r(D)$ passing through $\psi_0$ and transversal to $\Psi$. As $b(\chi_0) = 0$ implies $\chi_{b(\chi_0)} = \chi_0$, and $\chi_{b(\chi_0)}$ has a Feigenbaum attractor, all $\chi_0$ in $\mathcal{M}$ have such attractors. □

Theorem 3.5 allows us to say that the existence of a Feigenbaum attractor is locally a codimension-one phenomenon in the space of $C^r$ maps near $\psi_0$ when it is persistent in one-parameter families near $\Psi$. Observe that Theorem 3.5 does not assert that all the maps near $\psi_0$ that exhibit a Feigenbaum attractor form a codimension-one local manifold. We just constructed a local manifold $\mathcal{M}$ whose points are some of the maps having such attractors.

4. Proof of Theorem 1.1

To prove Theorem 1.1 we fix $r \ge 8$. Let $\mathcal{F} = \mathcal{F}_I$ be the Banach space of one-parameter families defined in 3.1. We need to establish some new notation and to prove a lemma:

Remark 4.1 (and Notation). Let $\Psi = \{\psi_a\}_{a\in I}$ be a family in $\mathcal{F}$ such that $\psi_0$ is $N$ times renormalizable. Thus $\mathcal{R}^N\psi_0 = \xi_N^{-1}\circ\psi_0^{2^N}\circ\xi_N$ for some $C^r$ diffeomorphism $\xi_N$. As the assumption of being $N$ times renormalizable is an open condition in the space $C^r(D)$, there exist $\epsilon > 0$ and a neighborhood $\mathcal{V}$ of $\Psi$ in $\mathcal{F}$ such that for all $X = \{\chi_a\}_{a\in I} \in \mathcal{V}$, $\chi_a$ is $N$ times renormalizable if $a \in [-\epsilon, \epsilon]$, and $\mathcal{R}^N\chi_a = \xi_N^{-1}\circ\chi_a^{2^N}\circ\xi_N$ with $\xi_N$ fixed. Denote $\tilde X = \{\mathcal{R}^N\chi_a\}_{-\epsilon\le a\le\epsilon}$. It is a one-parameter family of maps in $C^r(D)$, but it is not necessarily a $C^1$ curve of maps, because the renormalization is not differentiable in $C^r(D)$. This pathology of the renormalization comes from the fact that the composition transforming $\chi \in C^r(D)$ into $\chi\circ\chi \in C^r(D)$ is continuous but not differentiable. In fact, the linear part of the increment $(\chi + h)\circ(\chi + h) - \chi\circ\chi$ can be shown to be $h\circ\chi + (D\chi\circ\chi)\cdot h$, working in $C^j(D)$ for $j \le r - 1$.
This says that the composition is differentiable from $C^r(D)$ to $C^j(D)$ if $j \le r - 1$, but not in $C^r(D)$, because the difference $h\circ(\chi + h) - h\circ\chi$ does not decrease in $C^r(D)$ faster than $h$ does. Moreover, we need more than the differentiability of $\mathcal{R}^N$ in the space of maps: we need the application from a family $X$ in $\mathcal{F}$ to the renormalized family $\tilde X$ to be differentiable. To obtain these good properties of the renormalization we shall work with the $C^{r-2}$ topology in the set of the renormalized families.

Definition 4.2. The space $\tilde{\mathcal{F}}$ is the set of one-parameter families $\tilde X = \{\tilde\chi_a\}_{-\epsilon\le a\le\epsilon}$ such that
• for each $a \in [-\epsilon, \epsilon]$ the map $\tilde\chi_a$ is in $C^{r-2}(D)$,
• the transformation associating to each $a$ the map $\tilde\chi_a \in C^{r-2}(D)$ is of class $C^1$.
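The linear part of the increment of the composition stated in Remark 4.1, $h\circ\chi + (D\chi\circ\chi)\cdot h$, can be checked pointwise in one dimension. The sketch below is only an illustration under stated assumptions: the map `chi`, the perturbation direction `k` and the sup norm on a grid are choices made for this example, and a pointwise check says nothing about the loss of derivatives in the $C^r$ norms discussed in the text.

```python
import math


def chi(x):
    """Example map chi(x) = 1 - 1.4 x^2 (an arbitrary choice for this sketch)."""
    return 1.0 - 1.4 * x * x


def dchi(x):
    """Derivative of chi."""
    return -2.8 * x


def k(x):
    """Direction of the perturbation, so that h = eps * k."""
    return math.sin(3.0 * x)


def remainder(eps, x):
    """(chi+h)o(chi+h) - chi o chi - [h o chi + (Dchi o chi) * h], evaluated at x."""
    h = lambda y: eps * k(y)
    perturbed = lambda y: chi(y) + h(y)
    increment = perturbed(perturbed(x)) - chi(chi(x))
    linear_part = h(chi(x)) + dchi(chi(x)) * h(x)
    return increment - linear_part


GRID = [-1.0 + 2.0 * i / 400 for i in range(401)]


def sup_remainder(eps):
    """Sup norm of the remainder over the grid."""
    return max(abs(remainder(eps, x)) for x in GRID)


for eps in (1e-2, 1e-3):
    print(eps, sup_remainder(eps) / eps)
```

The printed quotient, remainder divided by the size of the perturbation, shrinks roughly in proportion to `eps`, as expected when the candidate is indeed the linear part of the increment.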

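In $\tilde{\mathcal{F}}$ consider the topology given by the norm
$$\|\tilde X\|_{1,r-2} = \max_{-\epsilon\le a\le\epsilon}\left\{\|\tilde\chi_a\|_{r-2},\ \left\|\frac{\partial}{\partial a}\tilde\chi_a\right\|_{r-2}\right\}.$$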

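The following lemma gives the reason to work with $r - 2$ (instead of $r$) after renormalizing. It is also the reason why we work with $r \ge 8$ instead of $r \ge 6$, increasing the differentiability by two orders from the value obtained in Theorem 2.10.

Lemma 4.3. Let $\Psi$, $\mathcal{V}$, $\epsilon$, $X$ and $\tilde X$ be as in Remark 4.1. Let $T$ be defined as $T(X) = \tilde X$ for $X \in \mathcal{V}$. Then $T : \mathcal{V} \subset \mathcal{F} \to \tilde{\mathcal{F}}$ is of class $C^1$.

Proof. $\tilde X = \{\mathcal{R}^N\chi_a\}_{-\epsilon\le a\le\epsilon}$ is a one-parameter family of maps in $C^r(D) \subset C^{r-2}(D)$ and
$$\frac{\partial}{\partial a}\mathcal{R}^N\chi_a = \frac{\partial}{\partial a}\left(\xi_N^{-1}\circ\chi_a^{2^N}\circ\xi_N\right) = D\xi_N^{-1}\cdot\frac{\partial}{\partial a}\left(\chi_a^{2^N}\circ\xi_N\right)$$
$$= D\xi_N^{-1}\cdot\left[\sum_{j=0}^{2^N-1} D\chi_a^{j}\cdot\frac{\partial\chi_a}{\partial a}\circ\chi_a^{2^N-j-1}\circ\xi_N\right] \in C^{r-1}(D) \subset C^{r-2}(D).$$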

In the equality above $D\chi_a^0$ and $\chi_a^0$ are the identity. Thus $T(X)$ is a family in $\tilde{\mathcal{F}}$. To see that $T$ is of class $C^1$ let us compute the increment $T(X + U) - T(X)$, where $U = \{u_a\}_{a\in I}$ is near 0 in $\mathcal{F}$ so that $X + U \in \mathcal{V}$:
$$T(X + U) - T(X) = \left\{\xi_N^{-1}\circ(\chi_a + u_a)^{2^N}\circ\xi_N - \xi_N^{-1}\circ\chi_a^{2^N}\circ\xi_N\right\}_{-\epsilon\le a\le\epsilon}.$$
For each fixed $a$, the map
$$\xi_N^{-1}\circ(\chi_a + u_a)^{2^N}\circ\xi_N - \xi_N^{-1}\circ\chi_a^{2^N}\circ\xi_N$$
can be written as
$$\sum_{j=0}^{2^N-1} D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N)\cdot u_a(\chi_a^{2^N-j-1}\circ\xi_N) + \theta_a,$$
where $\theta_a$ is the appropriate difference between the increment and a linear part. This linear part is the candidate to be the differential of $T$. It depends continuously on the given family $X \in \mathcal{F}$, because composition and multiplication are continuous in the space of families of finitely differentiable maps. Observe that $\theta_a \in C^{r-1}(D) \subset C^{r-2}(D)$. To prove that $T$ is of class $C^1$ it is enough to prove that the family $\Theta = \{\theta_a\}_{-\epsilon\le a\le\epsilon}$ is in $\tilde{\mathcal{F}}$ and that
$$\frac{\|\Theta\|_{1,r-2}}{\|U\|_{1,r}} \to 0 \quad \text{when } \|U\|_{1,r} \to 0.$$

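We have that $\theta_a = \sum_{j=0}^{2^N-1}(A_j - L_j)$, where
$$A_j = \xi_N^{-1}\circ\chi_a^{j}\circ(\chi_a + u_a)^{2^N-j}\circ\xi_N - \xi_N^{-1}\circ\chi_a^{j+1}\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N,$$
$$L_j = M_j\cdot u_a(\chi_a^{2^N-j-1}\circ\xi_N),$$
$$M_j = D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N).$$
Denoting
$$F_j(t) = \xi_N^{-1}\circ\chi_a^{j}\circ(\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N$$
for $t \in [0, 1]$, we can write
$$A_j = F_j(1) - F_j(0) = \int_0^1 \frac{\partial}{\partial t}F_j(t)\, dt. \qquad (1)$$
The function inside the integral is continuous from $[0, 1]$ to $C^{r-1}(D)$. In fact, the derivative with respect to $t$ in $C^{r-1}(D)$, with fixed $a$, is
$$\frac{\partial}{\partial t}F_j(t) = D\xi_N^{-1}\left(\chi_a^{j}\circ(\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right)\cdot v_j,$$
where
$$v_j = D\chi_a^{j}\left((\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right)\cdot u_a\left((\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right).$$
The equality (1) is also valid in $C^{r-2}(D)$. Therefore
$$\|A_j - L_j\|_{r-2} \le \int_0^1 \left\|\frac{\partial}{\partial t}F_j(t) - L_j\right\|_{r-2} dt.$$
Comparing $\frac{\partial}{\partial t}F_j(t)$ with $L_j$ we can decompose their difference in three terms, as follows:
$$\frac{\partial}{\partial t}F_j(t) - L_j = \Delta(D\xi_N^{-1})\, v_j + D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot\Delta(D\chi_a^{j})\cdot u_a\left((\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right) + M_j\,\Delta u_a, \qquad (2)$$
where
$$\Delta(D\xi_N^{-1}) = D\xi_N^{-1}\left(\chi_a^{j}\circ(\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right) - D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N),$$
$$\Delta(D\chi_a^{j}) = D\chi_a^{j}\left((\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right) - D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N),$$
$$\Delta u_a = u_a\left((\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right) - u_a(\chi_a^{2^N-j-1}\circ\xi_N).$$
As the composition is continuous in $C^{r-2}(D)$, given $\delta > 0$ we obtain
$$\|\Delta(D\xi_N^{-1})\|_{r-2} \le \delta, \qquad \|\Delta(D\chi_a^{j})\|_{r-2} \le \delta$$
if $\|u_a\|_{r-2}$ is small enough. Now, to obtain
$$\|\Delta u_a\|_{r-2} \le \delta\|u_a\|_{r-2}$$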

it is sufficient that $\|u_a\|_{r-1}$ be small enough, because the first $r - 2$ derivatives of $u_a$ are Lipschitz with constant $\|u_a\|_{r-1}$. Therefore, given $\delta > 0$,
$$\|A_j - L_j\|_{r-2} \le \delta\|u_a\|_{r-2} \quad \text{if } \|u_a\|_{r-1} \text{ is small enough},$$
and so, given $\delta > 0$,
$$\|\theta_a\|_{r-2} \le \delta\|u_a\|_{r-2} \quad \text{if } \|u_a\|_{r-1} \text{ is small enough}. \qquad (3)$$
So far we have proved that the renormalization $\mathcal{R}^N$ is differentiable from $C^{r-1}(D)$ to $C^{r-2}(D)$. We could have done the same from $C^r(D)$ to $C^{r-1}(D)$. We have lost only one order in the differentiability of the maps. But we are going to lose another order when moving the parameter $a$ of the family of $C^r$ maps and computing how the derivative with respect to $a$ behaves when renormalizing the family. Let us compute the derivatives with respect to $a$:
$$\frac{\partial\theta_a}{\partial a} = \sum_{j=0}^{2^N-1}\frac{\partial}{\partial a}(A_j - L_j).$$
We can decompose
$$\frac{\partial L_j}{\partial a} = B_j + C_j + E_j,$$
where
$$B_j = D^2\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot\left(D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N)\cdot u_a(\chi_a^{2^N-j-1}\circ\xi_N),\ \frac{\partial}{\partial a}\chi_a^{2^N}\circ\xi_N\right),$$
$$C_j = D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot\frac{\partial}{\partial a}\left(D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N)\right)\cdot u_a(\chi_a^{2^N-j-1}\circ\xi_N),$$
$$E_j = D\xi_N^{-1}(\chi_a^{2^N}\circ\xi_N)\cdot D\chi_a^{j}(\chi_a^{2^N-j}\circ\xi_N)\cdot\frac{\partial}{\partial a}\left(u_a(\chi_a^{2^N-j-1}\circ\xi_N)\right).$$
Observe that the maps above belong to $C^{r-2}(D)$. Now let us compute the derivative of $A_j$:
$$\frac{\partial A_j}{\partial a} = \frac{\partial}{\partial a}\left(\xi_N^{-1}\circ\chi_a^{j}\circ(\chi_a + u_a)^{2^N-j}\circ\xi_N - \xi_N^{-1}\circ\chi_a^{j+1}\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right).$$
Calling
$$G_j(t) = D\xi_N^{-1}\left(\chi_a^{j}\circ(\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right)\cdot\frac{\partial}{\partial a}\left(\chi_a^{j}\circ(\chi_a + t u_a)\circ(\chi_a + u_a)^{2^N-j-1}\circ\xi_N\right),$$
we can write
$$\frac{\partial A_j}{\partial a} = G_j(1) - G_j(0) = \int_0^1 \frac{\partial G_j(t)}{\partial t}\, dt,$$
where the integral is computed in the Banach space $C^{r-2}(D)$, for each fixed $a$. We can decompose $\frac{\partial}{\partial t}G_j(t)$ in three terms to be compared with $B_j$, $C_j$ and $E_j$. In fact, denoting
$$r_a = (\chi_a + u_a)^{2^N-j-1}\circ\xi_N,$$

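we obtain
$$\frac{\partial G_j(t)}{\partial t} = \hat B_j + \hat C_j + \hat E_j,$$
where
$$\hat B_j = D^2\xi_N^{-1}(\xi_N\circ F_j(t))\cdot\left(D\chi_a^{j}((\chi_a + t u_a)\circ r_a)\cdot u_a(r_a),\ \frac{\partial}{\partial a}\left(\chi_a^{j}\circ(\chi_a + t u_a)\circ r_a\right)\right),$$
$$\hat C_j = D\xi_N^{-1}(\xi_N\circ F_j(t))\cdot\frac{\partial}{\partial a}\left(D\chi_a^{j}((\chi_a + t u_a)\circ r_a)\right)\cdot u_a(r_a),$$
$$\hat E_j = D\xi_N^{-1}(\xi_N\circ F_j(t))\cdot D\chi_a^{j}((\chi_a + t u_a)\circ r_a)\cdot\frac{\partial}{\partial a}\left(u_a(r_a)\right).$$
To compute $\hat C_j$ and $\hat E_j$ we have differentiated first with respect to $t$ and then with respect to $a$. Putting everything together, we have that
$$\frac{\partial}{\partial a}(A_j - L_j) = \int_0^1 \left(\hat B_j - B_j + \hat C_j - C_j + \hat E_j - E_j\right) dt.$$
Comparing $\hat B_j$ with $B_j$, both have $u_a$ as a factor, evaluated at different points. With the same arguments used to analyze the equality (2), we conclude that, given $\delta > 0$,
$$\|\hat B_j - B_j\|_{r-2} \le \delta\|u_a\|_{r-2} \quad \text{if } \|u_a\|_{r-1} \text{ is small enough}.$$
Analogously,
$$\|\hat C_j - C_j\|_{r-2} \le \delta\|u_a\|_{r-2} \quad \text{if } \|u_a\|_{r-1} \text{ is small enough}.$$
Comparing $\hat E_j$ with $E_j$, both have $\frac{\partial u_a}{\partial a}$ as a factor, also evaluated at different points. Therefore
$$\|\hat E_j - E_j\|_{r-2} \le \delta\max\left\{\|u_a\|_{r-2},\ \left\|\frac{\partial}{\partial a}u_a\right\|_{r-2}\right\}$$
if $\|u_a\|_{r-1}$ and $\left\|\frac{\partial}{\partial a}u_a\right\|_{r-1}$ are small enough. We conclude that, given $\delta > 0$,
$$\left\|\frac{\partial\theta_a}{\partial a}\right\|_{r-2} \le \delta\max\left\{\|u_a\|_{r-2},\ \left\|\frac{\partial}{\partial a}u_a\right\|_{r-2}\right\} \qquad (4)$$
if $\|u_a\|_{r-1}$ and $\left\|\frac{\partial}{\partial a}u_a\right\|_{r-1}$ are small enough. To end the proof, from the inequalities (3) and (4) we conclude that, given $\delta > 0$,
$$\|\Theta\|_{1,r-2} \le \delta\|U\|_{1,r-1} \le \delta\|U\|_{1,r}$$
if $\|U\|_{1,r-1} \le \|U\|_{1,r}$ are small enough, as wanted. □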
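The formula for the parameter derivative of an iterate, used at the start of the proof of Lemma 4.3, can be compared with a finite difference in one dimension. In this sketch the quadratic family $\chi_a(x) = 1 - a x^2$ plays the role of the family, the change of variables $\xi_N$ is taken to be the identity, and `k` stands for $2^N$; these are assumptions of the illustration and all function names are hypothetical.

```python
def chi(a, x):
    """Quadratic family chi_a(x) = 1 - a x^2."""
    return 1.0 - a * x * x


def dchi_dx(a, x):
    """Derivative of chi_a with respect to x."""
    return -2.0 * a * x


def dchi_da(a, x):
    """Derivative of chi_a with respect to the parameter a."""
    return -x * x


def iterate(a, x, k):
    """k-fold composition chi_a^k evaluated at x."""
    for _ in range(k):
        x = chi(a, x)
    return x


def d_iterate_dx(a, x, j):
    """Derivative of chi_a^j at x, by the chain rule along the orbit."""
    d = 1.0
    for _ in range(j):
        d *= dchi_dx(a, x)
        x = chi(a, x)
    return d


def d_iterate_da(a, x, k):
    """sum_{j=0}^{k-1} D chi_a^j(chi_a^{k-j}(x)) * (d chi_a / da)(chi_a^{k-j-1}(x))."""
    total = 0.0
    for j in range(k):
        total += (d_iterate_dx(a, iterate(a, x, k - j), j)
                  * dchi_da(a, iterate(a, x, k - j - 1)))
    return total


a, x, k, step = 1.3, 0.2, 4, 1e-6
formula_value = d_iterate_da(a, x, k)
finite_difference = (iterate(a + step, x, k) - iterate(a - step, x, k)) / (2 * step)
print(formula_value, finite_difference)
```

The two printed numbers should agree up to the accuracy of the central difference, which supports the index bookkeeping ($j$, $2^N - j$, $2^N - j - 1$) in the displayed sum.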

Proof of Theorem 1.1. Let $r \ge 8$. Thus $r - 2 \ge 6$ and Theorem 2.10 is valid in $C^{r-2}$. If $\psi_0$ has a Feigenbaum attractor $K$, it is infinitely renormalizable and there exists a natural $N$ such that $\mathcal{R}^N\psi_0$ belongs to the local stable manifold $W^s(\phi_0)$ of the fixed point $\phi_0$. As in Remark 4.1, consider $\mathcal{V}$ and $\epsilon > 0$, and denote by $\tilde\Psi$ the family $T(\Psi) \in \tilde{\mathcal{F}}$, where $T$ is the transformation of Lemma 4.3. By hypothesis $\tilde\Psi$ intersects transversally $W^s(\phi_0)$ at $\mathcal{R}^N\psi_0$. Therefore, by Theorem 3.5, the Feigenbaum attractor is persistent in one-parameter families of $\tilde{\mathcal{F}}$ near $\tilde\Psi$. There exists a neighborhood $\tilde{\mathcal{U}}$ of $\tilde\Psi$ in $\tilde{\mathcal{F}}$ such that any family $\tilde X$ in $\tilde{\mathcal{U}}$ intersects $W^s(\phi_0)$. Let $b(\tilde X)$ be the parameter value in the family $\tilde X$ where this intersection occurs. The real function $b$ is of class $C^1$, defined in $\tilde{\mathcal{U}}$. Denote
$$\mathcal{U} = \{X \in \mathcal{V} \text{ such that } T(X) \in \tilde{\mathcal{U}}\}.$$
Define $a : \mathcal{U} \to I$ by $a(X) = b(T(X))$ for all $X \in \mathcal{U}$. The real function $a$ is of class $C^1$, because $b$ and $T$ are. For all $X = \{\chi_t\}_{t\in I}$ in the neighborhood $\mathcal{U}$ of $\Psi$ there exists $a(X)$ such that $\mathcal{R}^N\chi_{a(X)} \in W^s(\phi_0)$. Thus $\chi_{a(X)}$ has a Feigenbaum attractor. Therefore, by Definition 3.4, the Feigenbaum attractor $K$ is persistent in one-parameter families near $\Psi$. Theorem 3.5 implies the existence of the $C^1$ codimension-one manifold in $C^r(D)$ passing through $\psi_0$. □

5. Construction of a Transversal Family

We proved in Theorem 1.1 the existence of a local codimension-one manifold of transformations having Feigenbaum attractors. To do so we used a given one-parameter family $\Psi$ verifying, by hypothesis, a transversality condition after renormalization. In order to prove Theorem 1.2 it is enough to construct a good family $\Psi$ passing through the given transformation $\psi_0$. As the renormalization is not surjective, the existence of $\Psi$ is not obvious. It will be constructed using the following

Lemma 5.1. Let $\phi_0$ be the fixed map defined in 2.1, for $n \ge 2$. There exists a real analytic map $w : D \to \mathbb{R}^n$ such that $u : D \to \mathbb{R}^n$, defined as $u(x) = D\phi_0(x)\cdot w(x)$ for all $x \in D$, is transversal to $W^s(\phi_0)$ in $\phi_0$.

Proof. Let $f_0$ be the unimodal real analytic map in the interval $[-1, 1]$, fixed by the doubling renormalization. It is symmetric, and can be written as $f_0(x) = g_0(x^2)$ for $g_0$ a real analytic diffeomorphism from $[0, 1]$ to $[-\lambda, 1]$. The Feigenbaum–Coullet–Tresser theory in the interval states that $f_0$ is a hyperbolic fixed point of the doubling renormalization, when working in the space of real analytic maps. Its local unstable manifold is a differentiable curve $\{f_t\}_t$ of symmetric real analytic unimodal maps of the interval $[-1, 1]$: $f_t(x) = g_t(x^2)$ for $\{g_t\}_t$ a differentiable family of real analytic diffeomorphisms from $[0, 1]$ to its image [EW 1987].

In [CEK 1981] the Feigenbaum–Coullet–Tresser theory is extended to $n$ dimensions in the analytic topology. In that work, the map fixed by the doubling renormalization is

$$\varphi_0(x_1, x_2, \ldots, x_n) = (g_0(x_1^2 - \alpha x_n), 0, \ldots, 0)$$


for some fixed number $\alpha \ne 0$, as observed in Remark 2.11. The local unstable manifold is formed by the maps $\varphi_t$:

$$\varphi_t(x_1, x_2, \ldots, x_n) = (g_t(x_1^2 - \alpha x_n), 0, \ldots, 0).$$

We showed in [CE 1998] (with the same conventions as in the article [CEK 1981]) that the unstable manifold in the space of real analytic maps is still valid in the space $C^r(D)$, for $r$ large enough. Now, according to Definition 2.1, we call the following a fixed map in $n$ dimensions:

$$\phi_0(x_1, x_2, \ldots, x_n) = (x_n, 0, \ldots, 0, g_0(x_n^2)).$$

Obviously the $n$-dimensional disks where $\phi_0$ and $\varphi_0$ are defined are different, but according to Remark 2.11, there exists a diffeomorphism $\xi$ between them, conjugating $\phi_0$ and $\varphi_0$. In fact $\phi_0 = \xi \circ \varphi_0 \circ \xi^{-1}$, where

$$\xi(x_1, x_2, \ldots, x_n) = (x_1, \ldots, x_{n-1}, g_0(x_1^2 - \alpha x_n)).$$

So, in our context, the local unstable manifold $W^u(\phi_0)$ of $\phi_0$ by the doubling renormalization $R$ in $C^r(D)$ is the differentiable family $\{\phi_t\}_t$, where $\phi_t = \xi \circ \varphi_t \circ \xi^{-1}$. Computing explicitly:

$$\phi_t(x_1, x_2, \ldots, x_n) = \left( g_t(g_0^{-1}(x_n)), 0, \ldots, 0, g_0\!\left([g_t(g_0^{-1}(x_n))]^2\right) \right).$$

Call $u = \left.\dfrac{\partial \phi_t}{\partial t}\right|_{t=0}$. It is transversal to $W^s(\phi_0)$, because it is the tangent vector to $W^u(\phi_0)$ at $\phi_0$. Let us compute explicitly $u$:

$$u(x_1, x_2, \ldots, x_n) = \left( \left.\frac{\partial g_t}{\partial t}\right|_{t=0}(g_0^{-1}(x_n)),\ 0, \ldots, 0,\ g_0'(x_n^2)\cdot 2x_n \cdot \left.\frac{\partial g_t}{\partial t}\right|_{t=0}(g_0^{-1}(x_n)) \right).$$

Now let us compute $D\phi_0$:

$$D\phi_0(x_1, \ldots, x_n) = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 0 & 0 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & 0 \\ 0 & \cdots & 0 & 2x_n g_0'(x_n^2) \end{pmatrix}.$$

Taking

$$w(x_1, \ldots, x_n) = \left( 0, 0, \ldots, \left.\frac{\partial g_t}{\partial t}\right|_{t=0}(g_0^{-1}(x_n)) \right),$$

$u = D\phi_0 \cdot w$ is obtained as wanted. □

Remark 5.2. The proof of the former lemma does not work in dimension one. In fact, observe that the matrix $D\phi_0$ is never null. For maps in the interval, it should be substituted by the derivative of the fixed map, $2x g_0'(x^2)$, which vanishes for $x = 0$. In dimension two or larger, the derivative of the fixed map gives a spatial direction along which good perturbations can be constructed. In dimension one this spatial direction does not exist.
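The identity $u = D\phi_0 \cdot w$ at the end of the proof of Lemma 5.1 can be checked by multiplying out. The sketch below only restates the formulas of the proof; the shorthand $\dot g$ for the $t$-derivative of $g_t$ at $t = 0$ is introduced here for brevity and is not notation from the paper.

```latex
% Shorthand (an assumption of this sketch, not from the source):
%   \dot g(s) := \partial g_t / \partial t |_{t=0} (s)
\[
D\phi_0(x)\cdot w(x)
= \begin{pmatrix}
0 & \cdots & 0 & 1 \\
0 & \cdots & 0 & 0 \\
\vdots & & \vdots & \vdots \\
0 & \cdots & 0 & 2x_n\, g_0'(x_n^2)
\end{pmatrix}
\begin{pmatrix} 0 \\ \vdots \\ 0 \\ \dot g\bigl(g_0^{-1}(x_n)\bigr) \end{pmatrix}
= \begin{pmatrix}
\dot g\bigl(g_0^{-1}(x_n)\bigr) \\ 0 \\ \vdots \\ 0 \\
2x_n\, g_0'(x_n^2)\,\dot g\bigl(g_0^{-1}(x_n)\bigr)
\end{pmatrix}
= u(x).
\]
```

Only the last column of $D\phi_0$ is nonzero, so only the last component of $w$ matters. This is the spatial direction that Remark 5.2 says is unavailable in dimension one.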


Now let us show how the former lemma allows us to construct a good family $\Psi$ of $C^r$ maps passing through the given map $\psi_0$:

Lemma 5.3. If $\psi_0 \in C^r(D)$ has a Feigenbaum attractor, then there exists a one-parameter family $\Psi = \{\psi_t\}_{t\in I} \in F$ such that the renormalized family $\tilde\Psi$, defined as $\{R^N\psi_t\}_{-\epsilon \le t \le \epsilon}$, for some $\epsilon$ small enough and some $N$ large enough, intersects transversally $W^s(\phi_0)$ at $R^N\psi_0$ in the space $C^{r-2}(D)$.

Proof. As $\psi_0$ has a Feigenbaum attractor, $R^N\psi_0 = \xi_N^{-1} \circ \psi_0^{2^N} \circ \xi_N$ is as near as wanted to $\phi_0$ in $C^r(D)$, for all $N$ large enough. Let $w : D \to \mathbb{R}^n$ of class $C^r$ be as in Lemma 5.1. If $N$ is large enough, the vector $u : D \to \mathbb{R}^n$, defined as $u(x) = D(R^N\psi_0)(x)\cdot w(x)$, is transversal to $W^s(\phi_0)$ in $R^N\psi_0$ in the space $C^{r-2}(D)$. By the chain rule:

$$D(R^N\psi_0)(x)\cdot w(x) = D\xi_N^{-1}(\psi_0^{2^N}\circ\xi_N(x)) \cdot D\psi_0^{2^N-1}(\psi_0\circ\xi_N(x)) \cdot D\psi_0(\xi_N(x)) \cdot D\xi_N(x)\cdot w(x).$$

Using the density of $C^r$ maps in $C^{r-2}(D)$, let us choose $v : D \to \mathbb{R}^n$ of class $C^r$ such that, in the $C^{r-2}$ topology, it is sufficiently near $(D\psi_0\circ\xi_N)\cdot D\xi_N\cdot w$, so that

$$D\xi_N^{-1}(\psi_0^{2^N}\circ\xi_N) \cdot D\psi_0^{2^N-1}(\psi_0\circ\xi_N)\cdot v$$

is still a transversal vector to $W^s(\phi_0)$ in $R^N\psi_0$ in the space $C^{r-2}(D)$. Construct a $C^r$ map $v_0$ from $D$ to $\mathbb{R}^n$ such that:

• $v_0(x) = v(\xi_N^{-1}(x))$ if $x \in \xi_N(D)$, and

• $v_0(x) = 0$ if $x$ belongs to some small neighborhood $U$ of $\bigcup_{j=1}^{2^N-1} \psi_0^j\circ\xi_N(D)$,

and take $\psi_t = \psi_0 + t v_0$ for $-\epsilon \le t \le \epsilon$, for $\epsilon > 0$ small enough so that $\psi_t$ is $N$ times renormalizable.

Let us differentiate $R^N\psi_t = \xi_N^{-1}\circ\psi_t^{2^N}\circ\xi_N$ with respect to $t$ at $t = 0$. Observe that, if $x \in U$, then $\psi_t(x) = \psi_0(x)$. So we have:

$$\left.\frac{\partial}{\partial t} R^N\psi_t\right|_{t=0}(x) = D\xi_N^{-1}(\psi_0^{2^N}\circ\xi_N(x)) \cdot D\psi_0^{2^N-1}(\psi_0\circ\xi_N(x))\cdot v_0(\xi_N(x)).$$

As $v_0(\xi_N(x)) = v(x)$ for all $x \in D$, we obtain that $\left.\frac{\partial}{\partial t}R^N\psi_t\right|_{t=0}$ is transversal, by construction, to $W^s(\phi_0)$ at $R^N\psi_0$, proving the lemma. □

Proof of Theorem 1.2. It is a straightforward conclusion of Lemma 5.3 and Theorem 1.1. □

Acknowledgements. We thank Jacob Palis, Welington de Melo, Charles Tresser, Eduardo Colli, Raúl Ures and Miguel Paternain for their suggestions. We are very thankful to the referee for his corrections.


References

[Ca 1996] Catsigeras, E.: Cascades of period doubling bifurcations in n dimensions. Nonlinearity 9, 1061–1070 (1996)
[CE 1998] Catsigeras, E., Enrich, H.: Homoclinic tangencies near cascades of period doubling bifurcations. Ann. de l'IHP. An. non lin. vol. 15, No. 3, 255–299 (1998)
[CEK 1981] Collet, P., Eckmann, J.P., Koch, H.: Period doubling bifurcations for families of maps on $\mathbb{R}^n$. J. Stat. Phys. 25, N.1, 1–14 (1981)
[Co 1996] Colli, E.: Infinitely many coexisting strange attractors. Tese de Doutorado do IMPA, Rio de Janeiro. To appear in Ann. de l'IHP. 1996
[CT 1978] Coullet, P., Tresser, C.: Itérations d'endomorphismes et groupe de renormalisation. C.R. Acad. Sci. Paris 287, 577–588 (1978)
[Da 1996] Davie, A.M.: Period doubling for $C^{2+\epsilon}$ mappings. Commun. Math. Phys. 176, No. 2, 261–272 (1996)
[EW 1987] Eckmann, J.P., Wittwer, P.: A complete proof of the Feigenbaum conjectures. J. Stat. Phys. 46, N.3/4, 455–475 (1987)
[Fe 1978] Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. J. Stat. Phys. 19, 25–52 (1978)
[Fe 1979] Feigenbaum, M.J.: The universal metric properties of non-linear transformations. J. Stat. Phys. 21, 669–706 (1979)
[GST 1989] Gambaudo, J.M., van Strien, S., Tresser, C.: Hénon-like maps with strange attractors: there exist $C^\infty$ Kupka-Smale diffeomorphisms on $S^2$ with neither sinks nor sources. Nonlinearity 2, 287–304 (1989)
[GT 1992] Gambaudo, J.M., Tresser, C.: Self similar constructions in smooth dynamics: Rigidity, smoothness and dimension. Commun. Math. Phys. 150, 45–58 (1992)
[La 1972] Lang, S.: Differentiable Manifolds. Reading, MA: Addison–Wesley, 1972
[La 1982] Lanford, O. III: A computer assisted proof of the Feigenbaum's conjectures. Bulletin of the A.M.S. 6, 427–434 (1982)
[Ly 1999] Lyubich, M.: Feigenbaum–Coullet–Tresser universality and Milnor's hairiness conjecture. Ann. of Math. To appear
[Mc 1994] McMullen, C.: Complex dynamics and renormalization. Ann. of Math. Stud. Vol. 135, Princeton, NJ: Princeton Univ. Press, 1994
[MS 1993] de Melo, W., van Strien, S.: One-dimensional dynamics. Berlin: Springer-Verlag, 1993
[PT 1993] Palis, J., Takens, F.: Hyperbolicity and sensitive chaotic dynamics at homoclinic bifurcations. Cambridge: Cambridge University Press, 1993
[Su 1991] Sullivan, D.: Bounds, quadratic differentials and renormalization conjectures. Mathematics into the Twenty First Century Vol. 2. Providence, RI: Am. Math. Soc., 1991
[YA 1983] Yorke, J.A., Alligood, K.T.: Cascades of period doubling bifurcations: A prerequisite for horseshoes. Bull. A.M.S. 9, 319–322 (1983)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 207, 641 – 663 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

New Examples of Integrable Conservative Systems on $S^2$ and the Case of Goryachev–Chaplygin

Elena N. Selivanova 1,2,*

1 Department of Geometry, Nizhny Novgorod State Pedagogical University, ul. Ulyanova 1, Nizhny Novgorod, 603000 Russia
2 Mathematisches Institut, Universität Tübingen, Auf der Morgenstelle 10, 72076 Tübingen, Germany. E-mail: [email protected]

Received: 6 February 1998 / Accepted: 12 May 1999

Abstract: In this paper we propose new examples of conservative systems on $S^2$ possessing cubic integrals in momenta. We obtain a classification of a large class of such systems. We show that this class contains new families of integrable systems as well as the classical case of Goryachev–Chaplygin.

1. Introduction

Let $M$ be an $n$-dimensional Riemannian manifold, and $U : M \to \mathbb{R}$ be a smooth function on $M$. We consider the Lagrangian $L : TM \to \mathbb{R}$ of the following form:

$$L(\eta) = \frac{|\eta|^2}{2} + (U\circ\pi)(\eta),$$

where $\pi : TM \to M$ is the canonical projection. In local coordinates $q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n$ on $TM$ we have

$$L(\eta) = \frac{1}{2}\sum g_{ij}\,\dot q^i \dot q^j + U(q).$$

Identifying $TM$ and $T^*M$ by means of the Riemannian metric, we get the Hamiltonian system with the Hamiltonian $H : T^*M \to \mathbb{R}$ which in local coordinates $q_1, \ldots, q_n, p_1, \ldots, p_n$ on $T^*M$ has the form

$$H = \frac{1}{2}\sum g^{ij} p_i p_j + U(q) = K + U.$$

We will call these Hamiltonian systems conservative systems on $M$.

* Partially supported by DAAD.


A Hamiltonian system is called integrable in Liouville sense if there exist $n$ globally defined constants of motion, which are generically independent and pairwise involutive (one of them is of course the Hamiltonian itself). If the $n$ constants of motion in involution are of class $C^1$ and if some other conditions hold, then the Arnold–Liouville theorem affirms the global existence of tori on which the flow generated by the Hamiltonian is linear, see [1].

A smooth function $F : T^*M \to \mathbb{R}$ which is an integral of the Hamiltonian system with the Hamiltonian $H$ and which is generically independent of $H$ will be called an integral of this system of degree $m$ in momenta if in local coordinates $F$ has the form

$$F = \sum_{k_1+\cdots+k_n \le m} a_{k_1\ldots k_n}(q)\, p_1^{k_1}\cdots p_n^{k_n} = \sum_{k=0}^{m} F_k,$$

where $F_k$ is a homogeneous polynomial of degree $k$. Note that $F_{ev} = \sum_{k=2l} F_k$ and $F_{od} = \sum_{k=2l+1} F_k$ are also integrals of the system with the Hamiltonian $H$ and that $F_m$ is an integral of the system with the Hamiltonian $K$.

We will say that two Hamiltonians $H_1 = K_1 + U_1$ and $H_2 = K_2 + U_2$ are equivalent if there exists a diffeomorphism $\phi$ of $M$ and a diffeomorphism $\Phi$ of $T^*M$ such that the diagram

$$\begin{array}{ccc} T^*M & \xrightarrow{\ \Phi\ } & T^*M \\ \downarrow{\scriptstyle \pi'} & & \downarrow{\scriptstyle \pi'} \\ M & \xrightarrow{\ \phi\ } & M \end{array}$$

is commutative. Here, $\Phi$ is linear for any $p \in M$ and there are some nonzero constants $\kappa_1, \kappa_2$ such that $\Phi^*(K_1) = \kappa_1 K_2$, $\phi^*(U_1) = \kappa_2 U_2$. Clearly, if the Hamiltonians $H_1$, $H_2$ are equivalent and one of the corresponding systems possesses an integral of degree $m$ in momenta, then the other system has the same property.

In this paper we consider only the case $n = 2$. The polynomial integrability in Liouville sense for geodesic flows of Riemannian metrics has been discussed in [4,7,9–11,3]. It has been proved by Kolokol'tsov [11] and Kiyohara [9] that on 2-dimensional orientable compact manifolds of genus $g > 1$ there is no integrable geodesic flow with an integral which is polynomial in momenta. There is a conjecture that all integrable geodesic flows on $T^2$ possess an integral quadratic in momenta (for more details see [3]).

The classical result by Darboux states that the geodesic flow of a metric possesses a nontrivial integral quadratic in momenta if and only if there is a local coordinate system $(x, y)$ where the metric has the form $ds^2 = (f(x) + h(y))(dx^2 + dy^2)$ for some functions $f$ and $h$, the so-called Liouville metric. This integral is linear if $f$ or $h$ are constant. All Liouville metrics on global 2-manifolds have been described by Kolokol'tsov [11] and Kiyohara [9]. Some results concerning quadratic integrals for manifolds of higher dimensions have been obtained by Kiyohara in [10].

The case of polynomial integrals of degree $> 2$ is more complicated. There is no analogue of the local result of Darboux, i.e. there is no simple local description of metrics whose geodesic flows admit integrals of degree more than 2. Local criteria of the integrability in these cases have been obtained by Hall in [7] in terms of systems of nonlinear PDEs. The cases of degree 3 and 4 are of particular interest because in these cases these systems of PDEs can be reduced to a single PDE.

It was proved in [7] that the existence of a cubic integral is equivalent to the existence of coordinates $(x, y)$ and


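a function $f$ such that the metric has the form $ds^2 = (f_{xx} + f_{yy})(dx^2 + dy^2)$ and $f$ satisfies

$$\frac{\partial^2 f}{\partial x^2}\left(\frac{\partial^3 f}{\partial x^3} - \frac{\partial^3 f}{\partial x\,\partial y^2}\right) - 2\,\frac{\partial^2 f}{\partial y^2}\,\frac{\partial^3 f}{\partial x\,\partial y^2} - \frac{\partial^2 f}{\partial x\,\partial y}\left(\frac{\partial^3 f}{\partial x^2\,\partial y} + \frac{\partial^3 f}{\partial y^3}\right) = 0. \tag{1}$$

In the case of degree 4 the equation for $f$ has the following form

$$\frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\,\frac{\partial^3 f}{\partial x^3} + 2\,\frac{\partial^2 f}{\partial x^2}\,\frac{\partial^3 f}{\partial x^2\,\partial y}\right) - \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x\,\partial y}\,\frac{\partial^3 f}{\partial y^3} + 2\,\frac{\partial^2 f}{\partial y^2}\,\frac{\partial^3 f}{\partial y^2\,\partial x}\right) = 0. \tag{2}$$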

Until now there is no example of a metric on $S^2$ whose geodesic flow possesses an integral of degree more than 4. There have been known only some classical examples of the integrable geodesic flows on the sphere with integrals of degree 3 and 4 due to Chaplygin, Goryachev [5], Kovalevskaya [13], for more details see also [3]. More precisely, only one example of an integrable conservative system on $S^2$, possessing an integral cubic in momenta, has been known, namely the case of Goryachev–Chaplygin in the dynamics of a rigid body. The total energy (the Hamiltonian) of this system has the following form:

$$H = \frac{du_1^2 + du_2^2 + 4\,du_3^2}{4u_1^2 + 4u_2^2 + u_3^2} - u_1, \tag{3}$$

where $S^2$ is given by $u_1^2 + u_2^2 + u_3^2 = 1$ (see [3]). In polar coordinates the Hamiltonian of this problem can be written

$$H = \lambda(r^2)(r^2 d\varphi^2 + dr^2) + \rho(r^2)\, r\cos\varphi, \tag{4}$$

where $\lambda, \rho : \mathbb{R}_+ \to \mathbb{R}$ are functions on $S^2$.

In this paper we propose new cubic integrals of conservative systems on $S^2$. One of the aims of our paper is to give a complete classification up to equivalence (see above) of the Hamiltonians of the form (4) where $\lambda, \rho$ are arbitrary smooth functions on $S^2$, possessing an integral

$$F = p_\varphi^3 + \kappa\, p_\varphi K + A, \tag{5}$$

where $A$ is a linear polynomial in momenta and $\kappa$ is an arbitrary constant. Our classification consists of three types of Hamiltonians which include the case of Goryachev–Chaplygin and two new cases of one- and two-parameter families of integrable systems (Theorem 1.1).

In order to produce new integrable cases on $S^2$ we look for solutions of (1) with the following ansatz: $f(x, y) = \xi(y) + \psi(y)\cos x + a(x^2 - y^2)$, where $a$ is a constant and $\xi, \psi$ are unknown functions. Proposition 3.1 explains the choice of this ansatz. Then $\psi$ has to solve the following initial value problem:

$$\psi'\psi''' = \psi\psi'' - 2\psi''^2 + \psi'^2 + \psi^2, \qquad \psi(0) = 0,\ \psi'(0) = 1,\ \psi''(0) = \tau. \tag{6}$$

The main result of our paper is the following:


Theorem 1.1. Consider a conservative system on $S^2$ with the Hamiltonian of the form (4). Then there is a constant $T > 0$ such that the following holds: the system (4) has an integral of the form (5) and no integral quadratic or linear in momenta if and only if the Hamiltonian is equivalent to one of the following:

(i)
$$H = \frac{d\varphi^2 + dy^2}{\psi'^2(y)} - (\psi''(y) - \psi(y))\,\psi'^2(y)\cos\varphi, \tag{7}$$
where $\psi$ is a solution of (6) for $\tau \in (0, T)$.

(ii)
$$H_b = \frac{\psi'^2(y) - \psi^2(y) + b}{\psi'^2(y)}\,(d\varphi^2 + dy^2) - \frac{\psi'^2(y)\,(\psi''(y) - \psi(y))}{\psi'^2(y) - \psi^2(y) + b}\cos\varphi, \tag{8}$$
where $\psi$ is a solution of (6) for $\tau \in (0, T)$ and
$$b > b^*(\tau) = \max_{y\in\mathbb{R}}\,(\psi^2 - \psi'^2) \quad\text{or}\quad b < b_*(\tau) = \min_{y\in\mathbb{R}}\,(\psi^2 - \psi'^2),$$
where $b^*(\tau) < +\infty$ and $b_*(\tau) > -\infty$ for $\tau \in (0, T)$.

(iii) The Hamiltonian of the form (8), where $\psi$ is a solution of (6) for $\tau = T$ and $b = \psi^2(y_0)$, $\psi'(y_0) = 0$, which is in fact the case of Goryachev–Chaplygin in the dynamics of a rigid body.

Moreover, we observe that the standard metric on $S^2$ can be smoothly deformed in the class of metrics on $S^2$ whose geodesic flows possess nontrivial cubic integrals (Proposition 4.6). From the above classification we derive also new examples of metrics smoothly conformally equivalent to the metric on the Lobachevski plane and whose geodesic flows possess cubic integrals (Corollary 4.5).

It is interesting to note that the basic idea behind this construction works also in the case of integrals of degree 4, which has been considered in [14,6], where we found new families of integrable cases on $S^2$ including the case of Kovalevskaya. It turns out that the famous integral of Kovalevskaya can also be found by the ansatz (8). Some modifications of this construction lead to other integrable cases on the plane.

The paper is organized as follows. In Sect. 2 we obtain a criterion for integrability for geodesic flows of Riemannian metrics as a PDE. In Sect. 3 we consider a class of solutions of this equation and reduce the problem of classification of conservative systems on $S^2$ to the ODE (6). In Sect. 4 we prove the main theorem of classification and show that our classification contains new integrable cases on $S^2$. In Sect. 5 we prove the main technical lemma concerning solutions of the nonlinear ODE (6). In Sect. 6 we give some modifications of our construction which lead to other integrable cases on the plane.

2. A Criterion for Integrability

We consider a metric $ds^2 = \Theta(u, v)(du^2 + dv^2)$ in conformal coordinates $u, v$. It can also be written as

$$ds^2 = \theta(w, \bar w)\,dw\,d\bar w, \tag{9}$$


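where $w = u + iv$. The geodesic flow of $ds^2$ is a Hamiltonian system with Hamiltonian

$$H = \frac{p_w p_{\bar w}}{4\theta(w, \bar w)}. \tag{10}$$

A polynomial $F$ in momenta $p_u, p_v$ can be written as

$$F = \sum_{k=0}^{n} b_k(w, \bar w)\, p_w^k\, p_{\bar w}^{n-k},$$

where $b_k = \bar b_{n-k}$, $k = 0, \ldots, n$.

If the polynomial $F$ is an additional integral of the geodesic flow with Hamiltonian (10), then $\{F, H\} = 0$ and the following holds:

$$\theta\,\frac{\partial b_{k-1}}{\partial w} + (n - (k-1))\,b_{k-1}\,\frac{\partial\theta}{\partial w} + \theta\,\frac{\partial b_k}{\partial \bar w} + k\,b_k\,\frac{\partial\theta}{\partial \bar w} = 0, \tag{11}$$

where $k = 0, \ldots, n+1$ and $b_{-1} = b_{n+1} = 0$. Setting $k = 0$ and $k = n+1$, respectively, in (11) we get

$$\frac{\partial b_0}{\partial \bar w} \equiv 0 \quad\text{and}\quad \frac{\partial b_n}{\partial w} \equiv 0.$$

One may show that if the integral $F$ of the geodesic flow of (9) is independent of the Hamiltonian and some integral of smaller degree, then there is a conformal coordinate system $z = z(w)$ of this metric such that the coefficients of $F$ for $p_z^n$ and $p_{\bar z}^n$ are equal to 1 identically.

Proposition 2.1. Let $ds^2 = \lambda(z, \bar z)\,dz\,d\bar z$ ($z = \varphi + iy$) be a metric such that there exists a function $f : \mathbb{R}^2 \to \mathbb{R}$ satisfying the following conditions:

$$\lambda = \frac{1}{4}\left(\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial y^2}\right),$$

and

$$\frac{\partial}{\partial\varphi}\left[\left(\frac{\partial^2 f}{\partial\varphi^2} - \frac{\partial^2 f}{\partial y^2}\right)\left(\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial y^2}\right)\right] = 2\,\frac{\partial}{\partial y}\left[\frac{\partial^2 f}{\partial\varphi\,\partial y}\left(\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial y^2}\right)\right]. \tag{12}$$

Then the geodesic flow of $ds^2$ possesses an integral cubic in momenta. Conversely, if the geodesic flow of a metric $ds^2$ possesses an integral which is cubic in momenta and which does not depend on the Hamiltonian and an integral of smaller degree, then there exist conformal coordinates $\varphi, y$ and a function $f : \mathbb{R}^2 \to \mathbb{R}$ such that $ds^2 = \lambda(z, \bar z)\,dz\,d\bar z$, where $z = \varphi + iy$,

$$\lambda = \frac{1}{4}\left(\frac{\partial^2 f}{\partial\varphi^2} + \frac{\partial^2 f}{\partial y^2}\right),$$

and (12) holds.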


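Proof. We will consider integrals of the form

$$F = \sum_{k=0}^{n} a_k(z, \bar z)\, p_z^k\, p_{\bar z}^{n-k}$$

of the geodesic flow of $ds^2$, where $a_0 = a_n \equiv 1$. Then the system (11) has the following form:

$$3\,\frac{\partial\lambda}{\partial z} + \frac{\partial(a_1\lambda)}{\partial\bar z} = 0, \tag{13}$$

$$\frac{\partial(a_1\lambda^2)}{\partial z} + \frac{\partial(a_2\lambda^2)}{\partial\bar z} = 0, \tag{14}$$

where $a_1 = \bar a_2$. Equation (13) holds if and only if there exists a function $h$ such that

$$\lambda = \frac{\partial h}{\partial\bar z} \quad\text{and}\quad a_1\lambda = -3\,\frac{\partial h}{\partial z}.$$

Equation (14) can be also rewritten in the following form:

$$\frac{\partial(a_1\lambda^2)}{\partial z} = -\frac{\partial(\bar a_1\lambda^2)}{\partial\bar z} = -\overline{\frac{\partial(a_1\lambda^2)}{\partial z}}.$$

Therefore, the system (13), (14) is equivalent to the following condition:

$$\operatorname{Re}\frac{\partial(a_1\lambda^2)}{\partial z} = -3\,\operatorname{Re}\frac{\partial}{\partial z}\left(\frac{\partial h}{\partial z}\,\frac{\partial h}{\partial\bar z}\right) = 0,$$

where

$$\operatorname{Im}\frac{\partial h}{\partial\bar z} = \operatorname{Im}\lambda(z, \bar z) = 0,$$

i.e. there is a real function $f$ such that $\dfrac{\partial f}{\partial z} = h$ and (12) holds. □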
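The last step of the proof can be written out in the real coordinates $\varphi, y$. The lines below are a sketch of that computation. It uses only $z = \varphi + iy$, $h = \partial f/\partial z$ with $f$ real, and $\lambda = \partial h/\partial\bar z$, and writes partial derivatives of $f$ as subscripts.

```latex
\begin{align*}
\frac{\partial}{\partial z} &= \tfrac12\Bigl(\frac{\partial}{\partial\varphi} - i\,\frac{\partial}{\partial y}\Bigr),
\qquad
\frac{\partial h}{\partial\bar z} = \frac{\partial^2 f}{\partial z\,\partial\bar z}
  = \tfrac14\,(f_{\varphi\varphi} + f_{yy}) = \lambda, \\[4pt]
\frac{\partial h}{\partial z} &= \frac{\partial^2 f}{\partial z^2}
  = \tfrac14\,(f_{\varphi\varphi} - f_{yy} - 2i\,f_{\varphi y}), \\[4pt]
\operatorname{Re}\frac{\partial}{\partial z}\Bigl(\frac{\partial h}{\partial z}\,\frac{\partial h}{\partial\bar z}\Bigr)
 &= \tfrac1{32}\Bigl[\frac{\partial}{\partial\varphi}\bigl((f_{\varphi\varphi} - f_{yy})(f_{\varphi\varphi} + f_{yy})\bigr)
   - 2\,\frac{\partial}{\partial y}\bigl(f_{\varphi y}\,(f_{\varphi\varphi} + f_{yy})\bigr)\Bigr].
\end{align*}
```

Setting the last expression to zero gives (12). The second line also recovers the coefficient $a_1 = -3\,(\partial h/\partial z)/\lambda$ in terms of the second derivatives of $f$.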

Equation (12) has been obtained first in [7], see (1) in the Introduction. Here, we gave another proof because further we will need the following corollaries which can be easily derived from the above proof of Proposition 2.1.

Corollary 2.2. If in a conformal coordinate system $\varphi, y$ of $ds^2 = \lambda(\varphi, y)(d\varphi^2 + dy^2)$ there is an integral of the geodesic flow of $ds^2$ which has the form

$$F = \alpha_3\,(p_z^3 + a_1 p_z^2 p_{\bar z} + \bar a_1 p_z p_{\bar z}^2 + p_{\bar z}^3), \quad z = \varphi + iy,\ \alpha_3 = \text{const} \ne 0, \tag{15}$$

then there is a function $f$ such that (12) holds.

Corollary 2.3. If in a conformal coordinate system $\varphi, y$ of $ds^2 = \lambda(\varphi, y)(d\varphi^2 + dy^2)$ there is a function $f$ such that (12) holds, then there is an integral $F$ which has the form $F = p_z^3 + a_1 p_z^2 p_{\bar z} + \bar a_1 p_z p_{\bar z}^2 + p_{\bar z}^3$, $z = \varphi + iy$.


3. Reduction to ODE

Proposition 3.1. Let

$$ds^2 = (\Psi_1(r^2)\, r\cos\varphi + \Psi_2(r^2))(r^2 d\varphi^2 + dr^2) \tag{16}$$

be a metric on $S^2$ in polar coordinates. Then the geodesic flow of $ds^2$ possesses an integral of the form

$$F = i w^3 p_w^3 + b_1 p_w^2 p_{\bar w} + \bar b_1 p_w p_{\bar w}^2 - i \bar w^3 p_{\bar w}^3, \quad w = r\cos\varphi + i r\sin\varphi, \tag{17}$$

if and only if

$$\Psi_1(r^2) = ((\psi'' - \psi)(\log r))\, r^{-3}$$

and

$$\Psi_2(r^2) = c\, r^{-2}\, \psi'^{-2}(\log r) \quad\text{or}\quad \Psi_2(r^2) = a\, r^{-2}\, \frac{\psi'^2 - \psi^2 + b}{\psi'^2}(\log r),$$

where $\psi$ is a solution of the ODE from (6) and $a, c, b$ are constants.

Proof. Assume that the geodesic flow of $ds^2$ possesses an integral of the form (17). Let us consider $z = \varphi + iy = i\ln w$. Then (16) can be rewritten in the following form

$$ds^2 = (\Psi_1(\exp(2y))(\exp y)\cos\varphi + \Psi_2(\exp(2y)))(\exp(2y))(d\varphi^2 + dy^2).$$

Thus,

$$ds^2 = (\Psi_3(y)\cos\varphi + \Psi_4(y))(d\varphi^2 + dy^2) = \lambda(\varphi, y)(d\varphi^2 + dy^2) \tag{18}$$

for some functions $\Psi_3, \Psi_4$. An integral of the form (17) has then the form (15). Now we can apply Corollary 2.2. So, there is a function $f$ such that (12) holds.

Let us consider a function $h(\varphi, y) = \psi(y)\cos\varphi + \xi(y)$, where $\psi'' - \psi = \Psi_3$ and $\xi'' = \Psi_4$. Then $f = h + \delta$, where $\delta_{\varphi\varphi} + \delta_{yy} = 0$, and for the function $a_1$ from (15) we get

$$a_1 = -3\,\frac{f_{\varphi\varphi} - f_{yy}}{f_{\varphi\varphi} + f_{yy}} + 6i\,\frac{f_{\varphi y}}{f_{\varphi\varphi} + f_{yy}} = -3\,\frac{h_{\varphi\varphi} - h_{yy} + \delta_{\varphi\varphi} - \delta_{yy}}{\lambda(\varphi, y)} + 6i\,\frac{h_{\varphi y} + \delta_{\varphi y}}{\lambda(\varphi, y)}. \tag{19}$$

Thus, the functions $\delta_{\varphi y}$ and $\delta_{\varphi\varphi} - \delta_{yy}$ are periodic in $\varphi$. Therefore,

$$\delta(\varphi, y) = P_1(\varphi, y) + a(\varphi^2 - y^2) + B_1\varphi y + \sum_{k\in\mathbb{N}} (C_k\exp(-ky) + D_k\exp(ky))\sin k\varphi + \sum_{k\in\mathbb{N}} (E_k\exp(-ky) + G_k\exp(ky))\cos k\varphi, \tag{20}$$

where $a, B_1, C_k, D_k, E_k, G_k$ are constants and $P_1(\varphi, y)$ is linear in $\varphi, y$.


For the function $b_1$ from (17) we have $b_1(w, \bar w) = a_1 |w|^2 w$ and $|b_1|$ is bounded if $|w| = 0$, since $ds^2$ is a metric on $S^2$. Then from (19) and (20) we obtain that $C_k, D_k, E_k, G_k$ are equal to zero when $k \ge 2$. Thus, we have to consider functions $f$ of the form

$$f(\varphi, y) = \psi(y)\cos\varphi + \xi(y) + a(\varphi^2 - y^2) + B_1\varphi y + (C_1\exp(-y) + D_1\exp y)\sin\varphi$$

satisfying (12). So, we obtain the following conditions for $\lambda$: $\lambda = (\psi'' - \psi)\cos\varphi + \xi''$, where

$$\xi''\psi'' + (\xi''\psi')' = 2a(\psi'' - \psi) \tag{21}$$

and

$$\psi'\psi''' = \psi\psi'' - 2\psi''^2 + \psi'^2 + \psi^2. \tag{22}$$

Moreover we see that (21) and (22) hold if and only if $f(\varphi, y) = \psi(y)\cos\varphi + \xi(y) + a(\varphi^2 - y^2)$ is a solution of (12). Then we multiply (21) by $\psi'$,

$$2\xi''\psi'\psi'' + \xi'''\psi'^2 = 2a(\psi'' - \psi)\psi',$$

and integrate. We get

$$\xi''(y) = a\,\frac{\psi'^2(y) - \psi^2(y) + b}{\psi'^2(y)}, \quad b = \text{const}, \quad\text{if } a \ne 0,$$

or

$$\xi''(y) = c\,\psi'^{-2}(y), \quad c = \text{const}, \quad\text{if } a = 0.$$

Conversely, the integrability of the corresponding geodesic flows follows immediately from Corollary 2.3 and the above calculations. □

Due to Maupertuis's principle we obtain the following corollary from Proposition 3.1:

Corollary 3.2. Hamiltonian systems with Hamiltonians of the form (7) and (8), where $\psi$ is a solution of the ODE (6), possess an integral cubic in momenta of the form (5).
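Since the whole construction rests on solutions of the ODE (6), it can help to integrate it numerically. The following is a minimal sketch, not the method of the paper: it rewrites $\psi'\psi''' = \psi\psi'' - 2\psi''^2 + \psi'^2 + \psi^2$ as a first-order system and applies a classical fourth-order Runge–Kutta step. The function names `psi_rhs`, `rk4_step` and `solve_psi` and the default step count are illustrative assumptions.

```python
import math


def psi_rhs(state):
    """Right-hand side of ODE (6) as a first-order system.

    state = (psi, psi', psi''). The equation is solved for psi''' and is
    singular where psi' = 0, so this sketch is only valid while psi' != 0.
    """
    p, dp, ddp = state
    d3p = (p * ddp - 2.0 * ddp ** 2 + dp ** 2 + p ** 2) / dp
    return (dp, ddp, d3p)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h (h may be negative)."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    k1 = psi_rhs(state)
    k2 = psi_rhs(shifted(state, k1, h / 2.0))
    k3 = psi_rhs(shifted(state, k2, h / 2.0))
    k4 = psi_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def solve_psi(tau, y_end, steps=2000):
    """Integrate (6) from y = 0 with psi(0)=0, psi'(0)=1, psi''(0)=tau.

    Returns (psi, psi', psi'') at y = y_end.
    """
    state = (0.0, 1.0, tau)
    h = y_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


if __name__ == "__main__":
    # For tau = 0 the exact solution is psi(y) = sinh(y), which gives a
    # convenient sanity check of the integrator.
    psi, dpsi, ddpsi = solve_psi(0.0, 1.0)
    print(psi - math.sinh(1.0), dpsi - math.cosh(1.0))
```

For $\tau > 0$ one could call `solve_psi(tau, y_end)` with positive and negative `y_end` and monitor the sign of the returned $\psi'$. The integrator breaks down where $\psi'$ vanishes, so it is not suited to the borderline case $\tau = T$.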


4. Main Theorem

Before we prove the main theorem we have to consider some properties of the geodesic flows of metrics

$$ds^2 = \lambda(r^2)(r^2 d\varphi^2 + dr^2) \tag{23}$$

on $S^2$. These properties follow also from [12], but we will give here a completely different proof.

Proposition 4.1. The geodesic flow of a Riemannian metric (23) on $S^2$ does not possess a nontrivial integral quadratic in momenta which does not depend on $H$ and on linear integrals.

Proof. Let us introduce the conformal coordinate system $w = r\cos\varphi + ir\sin\varphi$ of (23) and consider $\hat w = w^{-1}$. Let

$$F = a_0(w, \bar w)\,p_w^2 + a_1(w, \bar w)\,p_w p_{\bar w} + a_2(w, \bar w)\,p_{\bar w}^2$$

be an integral of the geodesic flow of (23). It is well known that $a_0 = a_0(w)$, $a_2 = a_2(\bar w) = \bar a_0(w)$ and $\operatorname{Im} a_1(w, \bar w) = 0$, see [4]. Since (23) is a Riemannian metric on $S^2$, it follows that $a_0(w)$ is a polynomial of degree $n$, where $n \le 4$, see [11]. If $F$ does not depend on the Hamiltonian and on linear integrals, then $a_0(w)$ does not equal zero identically. From the condition $\{F, H\} = 0$ we obtain

$$\lambda\,\frac{\partial a_0}{\partial w} + 2a_0\,\frac{\partial\lambda}{\partial w} + \lambda\,\frac{\partial a_1}{\partial\bar w} + a_1\,\frac{\partial\lambda}{\partial\bar w} = 0.$$

This is equivalent to the following condition:

$$\operatorname{Im}\frac{\partial}{\partial w}\left(\lambda\,\frac{\partial a_0}{\partial w} + 2a_0\,\frac{\partial\lambda}{\partial w}\right) = 0,$$

where $a_0(w) = A_0 w^4 + A_1 w^3 + A_2 w^2 + A_3 w + A_4$ and $A_0, A_1, A_2, A_3, A_4$ are some constants such that at least one of them does not equal zero. Denote $w\bar w$ as $t$. So, we get the following conditions for $\lambda(t)$:

$$\operatorname{Im}\bigl(w^2(12A_0\lambda + 12A_0\lambda' t + 2A_0\lambda'' t^2) + \bar w^2(2A_4\lambda'')\bigr) + \operatorname{Im}\bigl(w(6A_1\lambda + 9A_1\lambda' t + 2A_1\lambda'' t^2) + \bar w(3A_3\lambda' + 2A_3\lambda'' t)\bigr) + \operatorname{Im}\bigl(A_2(2\lambda + 6\lambda' t + 2\lambda'' t^2)\bigr) \equiv 0.$$

Thus, we have to consider the following linear differential equations:

$$\epsilon_1(6\lambda + 6\lambda' t + \lambda'' t^2) = \epsilon_2\,\lambda'', \tag{24}$$

$$\epsilon_1(6\lambda + 9\lambda' t + 2\lambda'' t^2) = \epsilon_2(3\lambda' + 2\lambda'' t), \tag{25}$$

$$\lambda + 3\lambda' t + \lambda'' t^2 = 0, \tag{26}$$

where $\epsilon_1, \epsilon_2$ are equal to $-1$, $0$ or $1$. One may see that for any smooth solution $\lambda$ of (26) it holds $\lambda(0) = 0$ and then $ds^2$ is not a Riemannian metric on $S^2$. We get $\operatorname{Im} A_2 = 0$.


We note also that any solution of Eqs. (24), (25) with $\epsilon_1 = 0$ or $\epsilon_2 = 0$ does not define by (23) a Riemannian metric on $S^2$. Then the solutions of (24) with $\epsilon_1 = 1$ and $\epsilon_2 = 1$ can be obtained from the solutions of (24) with $\epsilon_1 = 1$ and $\epsilon_2 = -1$ by the complex transformation $t \to it$. So, the solutions of (24) with $\epsilon_1 = 1$, $\epsilon_2 = 1$ and the solutions of (24) with $\epsilon_1 = 1$, $\epsilon_2 = -1$ have the form

$$\lambda(t) = C_1\,\frac{1 + t^2}{(-1 + t^2)^2} + C_2\,\frac{t}{(-1 + t^2)^2}, \quad C_1, C_2 = \text{const},$$

and

$$\lambda(t) = C_1\,\frac{1 - t^2}{(1 + t^2)^2} + C_2\,\frac{t}{(1 + t^2)^2}, \quad C_1, C_2 = \text{const},$$

respectively. If $\lambda(0) = 1$, then there is only one smooth solution of (25) with $\epsilon_1 = 1$ or $\epsilon_2 = \pm 1$. These solutions correspond to metrics of curvature $-1$ if $\epsilon_2 = 1$ and to metrics of curvature $1$ if $\epsilon_2 = -1$. So, if $\epsilon_1 = 1$ and $\epsilon_2 = 1$, then any solution of (25) which is smooth in zero has the following form:

$$\lambda(t) = \frac{C_1}{(-1 + t)^2}, \quad C_1 = \text{const},$$

and if $\epsilon_1 = 1$, $\epsilon_2 = -1$, any solution of (25) which is smooth in zero has the following form:

$$\lambda(t) = \frac{C_1}{(1 + t)^2}, \quad C_1 = \text{const}.$$

Thus, we see that either $\operatorname{Im} A_2 = 0$ and $A_i = 0$, $i \ne 2$, or

$$\lambda(t) = \frac{C_1}{(1 + Dt)^2}, \quad C_1, D = \text{const},$$

and therefore,

$$ds^2 = \frac{C_1}{(1 + Dr^2)^2}\,(r^2 d\varphi^2 + dr^2), \tag{27}$$

where $C_1, D$ are constants, i.e. $ds^2$ is a metric of constant positive curvature.

Recall that the geodesic flow of a metric (23) possesses the linear integral $F_1 = iwp_w - i\bar w p_{\bar w}$. If $\operatorname{Im} A_2 = 0$ and $A_i = 0$, $i \ne 2$, we will consider the integral $F_2 = F + A_2 F_1^2$, which is obviously equal to $\hat C H$, where $\hat C$ is a constant. Therefore, $F$ depends on $H$ and $F_1$.

So, if the geodesic flow of a metric (23) on $S^2$ possesses a nontrivial quadratic integral, it can only be a metric of constant positive curvature. But it is well known that the geodesic flows of metrics of constant curvature possess two independent linear integrals. Indeed, the geodesic flow with the Hamiltonian $H = l'^2(y)(p_\varphi^2 + p_y^2)$, where $l''(y) = l(y)$, possesses the integrals $F_1 = p_\varphi$ and $\hat F_1 = l(y)\cos\varphi\, p_\varphi - l'(y)\sin\varphi\, p_y$. So, in this case any quadratic integral is also trivial, i.e. it depends on the Hamiltonian and on linear integrals. □

Proposition 4.2. The geodesic flow of a metric of type (23) on $S^2$ possesses two independent linear integrals if and only if it has the form (27), i.e. if it is a metric of constant positive curvature.


Proof. As mentioned above, the geodesic flow of a metric (23) on $S^2$ possesses the linear integral $F_1 = p_\varphi$. In the coordinates $w, \bar w, p_w, p_{\bar w}$, where $w = r\cos\varphi + ir\sin\varphi$, one can write $F_1 = iwp_w - i\bar w p_{\bar w}$. Indeed, in the coordinates $z = \varphi + iy$, $y = \log r$, we have $F_1 = p_z + p_{\bar z}$ and, clearly, $w = \exp iz$.

Assume that a metric $ds^2$ of the form (23) on $S^2$ possesses another linear integral $F_2 = b_0(w)p_w + \bar b_0(w)p_{\bar w}$ which is independent of $F_1$ and, therefore, the function $b_0(w)$ does not equal $iwC_3$, where $C_3$ is a real constant.

Let us consider the function $F_2^2 = b_0^2(w)\,p_w^2 + 2|b_0|^2 p_w p_{\bar w} + \bar b_0^2(w)\,p_{\bar w}^2$, which is also an integral of the geodesic flow of $ds^2$. Compare now $b_0^2(w)$ with $a_0(w)$ from Proposition 4.1. Since $b_0(w) \ne iwC_3$ in our case, then at least one $A_i$, $i \ne 2$ (see Proposition 4.1) does not equal zero. In the same way as in Proposition 4.1 we may show that $ds^2$ has the form (27), i.e. $ds^2$ is a metric of constant positive curvature. □

Recall that conformal coordinates $x, y$ of a metric $ds^2$ are Liouville if $ds^2 = (f(x) + h(y))(dx^2 + dy^2)$ for some functions $f$, $h$. Due to Darboux [4] Liouville coordinates exist if and only if the geodesic flow of $ds^2$ possesses an integral quadratic in momenta, and moreover if one of the functions $f$ or $h$ is constant, then the geodesic flow possesses an integral linear in momenta. So, we may formulate the following

Corollary 4.3. Liouville coordinates $\varphi, y = \log r$, related to polar coordinates $\varphi, r$ of a metric (23) on $S^2$, are unique up to shifts and the transformation $y \to -y$.

Proof. Assume that a metric $ds^2$ on $S^2$ can be written in two different ways: $ds^2 = \lambda_1(r^2)(r^2 d\varphi^2 + dr^2)$ and $ds^2 = \lambda_2(\tilde r^2)(\tilde r^2 d\tilde\varphi^2 + d\tilde r^2)$, where $\tilde r \ne Dr^{\pm 1}$ for a constant $D$. This means that the geodesic flow of $ds^2$ possesses two independent linear integrals. So, from Proposition 4.2 it follows that $ds^2$ has the form (27) in polar coordinates, and therefore, $\tilde r = Dr^{\pm 1}$ for a constant $D$. Thus, Liouville coordinates $\varphi, y = \log r$ related to polar coordinates $\varphi, r$ of $ds^2$ on $S^2$ are unique up to shifts and the transformation $y \to -y$. □

Now we can prove the main theorem.

Proof of Theorem 1.1. From Proposition 3.1 and Maupertuis's principle we conclude that if a Hamiltonian system with the Hamiltonian of the form (4) defines a conservative system on $S^2$, possessing an integral of the form (5), then (4) is equivalent to (7) or (8), where $\psi$ is a solution of the ODE (6). So, we have only to describe such solutions of this ODE that the Hamiltonians of the form (7), (8) define conservative systems on $S^2$. Since $\Psi_2(0) \ne 0$, see Proposition 3.1, we get the following restrictions for a solution $\psi$ of the ODE (6):

$$\lim_{y\to\pm\infty} \psi'\exp(\mp y) = \text{const} \ne 0 \tag{28}$$

or

$$\lim_{y\to\pm\infty} \exp(\pm 2y)\,\frac{\psi'^2 - \psi^2 + b}{\psi'^2} = \text{const} \ne 0. \tag{29}$$

Let us formulate now the main technical lemma which will be proved in the next section.


Lemma 4.4. There is a positive constant $T$ such that any smooth solution of the ODE (6) which satisfies (28) or (29) can be obtained from $\psi(y) = \cosh y$ or from the solutions of the initial value problem (6), where $\tau \in [0, T]$, by a scaling $\psi \to \alpha_1\psi$ with a constant $\alpha_1$ or a linear translation of time $y \to \pm y + \alpha_0$, for a constant $\alpha_0$. For any solution $\psi_\tau$ of (6) with $\tau \in [0, T]$ there are smooth functions $\eta_\tau, \zeta_\tau, \mu_\tau, \nu_\tau$ such that

\[ \psi_\tau'(y) = (\exp(-y))\,\eta_\tau(\exp(2y)) = (\exp y)\,\zeta_\tau(\exp(-2y)), \]
\[ (\psi_\tau'' - \psi_\tau)\,\psi_\tau'^2(y) = (\exp y)\,\mu_\tau(\exp(2y)) = (\exp(-y))\,\nu_\tau(\exp(-2y)), \]

where $\psi_\tau'(y) > 0$ everywhere if and only if $\tau \in [0, T)$, and there is $y = y_0$ such that $\psi_T'(y) > 0$ if $y > y_0$, $\psi_T'(y) < 0$ if $y < y_0$ and $\psi''(y_0) \ne 0$.

From Lemma 4.4 it follows that we must consider solutions of (6) for $\tau \in [0, T]$ and $\psi(y) = \cosh y$. We note that if $\psi$ is a solution of (6) for $\tau = 0$ (then $\psi(y) = \sinh y$) or $\psi(y) = \cosh y$, then the Hamiltonians (7), (8) define metrics of constant curvature where there are two independent linear integrals. We consider solutions $\psi$ of (6) when $\tau \in (0, T]$. Let us write then (7), (8) in polar coordinates $\varphi$, $r = \exp y$ and $\tilde\varphi = -\varphi$, $\tilde r = \exp(-y)$. Using Lemma 4.4 we compute

\[ H = \frac{1}{\eta_\tau^2(r^2)}(r^2d\varphi^2 + dr^2) - \mu_\tau(r^2)\,r\cos\varphi = \frac{1}{\zeta_\tau^2(\tilde r^2)}(\tilde r^2d\tilde\varphi^2 + d\tilde r^2) - \nu_\tau(\tilde r^2)\,\tilde r\cos\tilde\varphi \]

and

\[ H_b = \frac{\Phi_\tau(r^2) + b + 1}{\eta_\tau^2(r^2)}(r^2d\varphi^2 + dr^2) - \frac{\mu_\tau(r^2)\,r\cos\varphi}{\Phi_\tau(r^2) + b + 1} = \frac{-\tilde\Phi_\tau(\tilde r^2) + b + 1}{\zeta_\tau^2(\tilde r^2)}(\tilde r^2d\tilde\varphi^2 + d\tilde r^2) - \frac{\nu_\tau(\tilde r^2)\,\tilde r\cos\tilde\varphi}{-\tilde\Phi_\tau(\tilde r^2) + b + 1}, \]

where

\[ \Phi_\tau(t) = \int_1^t \mu_\tau(s)\,\eta_\tau^{-1}(s)\,ds, \qquad \tilde\Phi_\tau(t) = \int_1^t \nu_\tau(s)\,\zeta_\tau^{-1}(s)\,ds. \]
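The appearance of $\Phi_\tau + b + 1$ in $H_b$ can be traced back to the combination $\psi'^2 - \psi^2 + b$ in (29). The following short derivation is a sketch only: it assumes the normalisation $\psi(0) = 0$, $\psi'(0) = 1$ of the initial value problem (6), which is used later in the proof but is not restated here, and it uses the two relations of Lemma 4.4.

```latex
\begin{align*}
\frac{d}{dy}\bigl(\psi'^2 - \psi^2\bigr)
  &= 2\psi'(\psi'' - \psi)
   = \frac{2\,e^{y}\mu_\tau(e^{2y})}{\psi'}
   = \frac{2\,e^{2y}\mu_\tau(e^{2y})}{\eta_\tau(e^{2y})},\\
\psi'^2(y) - \psi^2(y)
  &= 1 + \int_0^{y} \frac{2\,e^{2u}\mu_\tau(e^{2u})}{\eta_\tau(e^{2u})}\,du
   = 1 + \int_1^{e^{2y}} \mu_\tau(s)\,\eta_\tau^{-1}(s)\,ds
   = 1 + \Phi_\tau(r^2), \qquad r = e^{y}.
\end{align*}
```

The same computation with $s = e^{-2y}$, for which $ds = -2e^{-2y}\,dy$, gives $\psi'^2 - \psi^2 = 1 - \tilde\Phi_\tau(\tilde r^2)$, which accounts for the minus sign in front of $\tilde\Phi_\tau$.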

So, for $0 < \tau < T$ the Hamiltonian (7) defines a family of conservative systems on $S^2$. Consider $\tau = T$. From Lemma 4.4 we know that for the corresponding solution $\psi$ of (6) there is $y = y_0$ such that $\psi'(y_0) = 0$, and therefore there is $r = r_0 > 0$ such that $\xi_T(r_0^2) = 0$. Thus, in the case $\tau = T$ the Hamiltonian (7) does not define a system on $S^2$, and the Hamiltonian (8) defines a system on $S^2$ if and only if $b = \psi^2(y_0)$. Let us find the admissible values for the parameter $b$ in (8). From the above expressions for $H_b$ in polar coordinates we have

\[ b + 1 > \max_{[0,1]}(-\Phi_\tau) = M_1(\tau) \quad\text{and}\quad b + 1 > \max_{[0,1]}\tilde\Phi_\tau = \tilde M_1(\tau) \]

or

\[ b + 1 < \min_{[0,1]}(-\Phi_\tau) = M_2(\tau) \quad\text{and}\quad b + 1 < \min_{[0,1]}\tilde\Phi_\tau = \tilde M_2(\tau). \]

So, $b > b^*(\tau) = \max\{M_1(\tau), \tilde M_1(\tau)\} - 1$ or $b < b_*(\tau) = \min\{M_2(\tau), \tilde M_2(\tau)\} - 1$, where $b^*(\tau) < +\infty$ and $b_*(\tau) > -\infty$ for $\tau \in (0, T)$.

We prove that these systems do not possess integrals linear or quadratic in momenta. Let us assume that a system of this family has an integral which is independent of the energy and which is a polynomial of second degree in momenta (clearly, this assumption includes the case of linear integrals). So, there is an integral $\tilde F_\tau$ which is quadratic in momenta. Thus, $\tilde F_\tau = A_\tau(p_\varphi, p_y, \varphi, y) + B_\tau(\varphi, y)$, where $A_\tau(p_\varphi, p_y, \varphi, y)$ is a polynomial of second degree in momenta $p_\varphi, p_y$. We may write $\{\tilde F_\tau, H_\tau\} = \{A_\tau(p_\varphi, p_y, \varphi, y) + B_\tau(\varphi, y), K_\tau + V_\tau\} \equiv 0$ and, therefore, $\{A_\tau(p_\varphi, p_y, \varphi, y), K_\tau\} \equiv 0$. Thus, the geodesic flow of the metric with the Hamiltonian $K_\tau$ has an integral which is quadratic in momenta. Since $F_\tau$ does not depend on $H_\tau$, from Propositions 4.1, 4.2 it follows that w.l.o.g. we may put $A_\tau(p_\varphi, p_y, \varphi, y) = p_\varphi^2$. We write then

\[ \{A_\tau, V_\tau\} + \{B_\tau, K_\tau\} = \{p_\varphi^2, V_\tau\} + \{B_\tau, K_\tau\} \equiv 0. \]

By computation we obtain

\[ \frac{\partial B_\tau}{\partial y} \equiv 0 \quad\text{and}\quad \psi_\tau'^2(y)\,\frac{\partial B_\tau}{\partial\varphi} = \frac{\partial V_\tau}{\partial\varphi}. \]

So, we get

\[ V_\tau = (B_\tau(\varphi) + \alpha(y))\,\psi_\tau'^2(y) \]

for a smooth function $\alpha(y)$. Therefore $V_\tau = C\psi_\tau'^2\cos\varphi$, where $C$ is a constant. Using the asymptotic for $\psi'$ (Lemma 4.4), we get $C = 0$, i.e. $\psi_\tau''(y) - \psi_\tau(y) \equiv 0$, and therefore $\tau = 0$. So, there is no nontrivial quadratic or linear integral of these conservative systems on $S^2$.

Since the solutions of (6) for $\tau \in (0, T)$ cannot be obtained one from another by a scaling or a linear translation of time, from Corollary 4.3 it follows that the Hamiltonians given by (7) where $\tau \in (0, T)$ are not equivalent, see the Introduction.

We prove that no Hamiltonian in the family (7) is equivalent to the Hamiltonian (3) of the case of Goryachev–Chaplygin. Let us write the integral $F_\tau$ in this case:

\[ F_\tau = p_\varphi^3 + \frac{3}{2}\bigl(\psi_\tau(y)\cos\varphi\,p_\varphi - \psi_\tau'(y)\sin\varphi\,p_y\bigr). \]

In the case of Goryachev–Chaplygin there is also a coordinate system $\varphi, y$, where the energy has the following form: $H = \gamma(y)(p_\varphi^2 + p_y^2) + \beta(y)\cos\varphi$ for suitable functions $\gamma$ and $\beta$, see (3), (4). Since $K = \gamma(y)(p_\varphi^2 + p_y^2)$ is the Hamiltonian of the geodesic flow of a smooth metric (23) on $S^2$, it follows from Corollary 4.3 that this coordinate system is unique up to linear transformations $y \mapsto \pm y + \mathrm{const}$, $\varphi \mapsto \pm\varphi + \mathrm{const}$. On the other hand, there is an integral $F = p_\varphi^3 + \kappa p_\varphi K + G(p_\varphi, p_y, \varphi, y)$, where $\kappa$ is a non-zero constant and $G(p_\varphi, p_y, \varphi, y)$ is a linear polynomial in momenta¹, see [3]. If a Hamiltonian in family (7) is equivalent to the case of Goryachev–Chaplygin, then there are two independent first integrals in the case of Goryachev–Chaplygin; but it is well known that in the case of Goryachev–Chaplygin there is only one additional integral independent of the Hamiltonian, see [3].

We show that the Hamiltonians (8) where $\psi$ is a solution of (6) for $\tau \in (0, T)$ are not equivalent for different values of the parameters $\tau$ and $b$. Assume that the Hamiltonians of the form (8) for $\tau_1, b_1$ and $\tau_2, b_2$ are equivalent. Then from Corollary 4.3 it follows that there are some constants $C_0 \ne 0$, $C_3 \ne 0$, $y_1$ such that

\[ \psi_1''(y) - \psi_1(y) = C_0\bigl(\psi_2''(y + y_1) - \psi_2(y + y_1)\bigr), \tag{30} \]

\[ \frac{\psi_1'^2(y) - \psi_1^2(y) + b_1}{\psi_1'^2(y)} = C_3\,\frac{\psi_2'^2(y + y_1) - \psi_2^2(y + y_1) + b_2}{\psi_2'^2(y + y_1)}, \tag{31} \]

where $\psi_1, \psi_2$ are solutions of (6) for $\tau = \tau_1$ and $\tau = \tau_2$ correspondingly. So, we get from (30), $\psi_1(y) = C_0\psi_2(y + y_1) + C_1\exp y + C_2\exp(-y)$ for some constants $C_1, C_2$. Let us write for a solution $\psi$ of the ODE (6): $\psi(y) = (\exp y)\,g(\exp(-2y))$. In the proof of Lemma 4.4 we will show that $g$ solves the differential equation (41). So, from our assumption it follows that there is a solution $g$ of the differential equation from (41) such that also $g_1(s) = g(s) + C_1 + C_2s$ is a solution of this ODE and $C_1, C_2$ are constants. Then by substituting $g_1$ in the differential equation (41) we obtain $3C_2(g - 2g's) = (3g' + sg'')(C_1 - C_2s)$. Then from (41) we get

\[ g'''(C_1 - C_2s) = 3C_2\,g''. \]

On the other hand from the asymptotic for $\psi'(y) = (\exp y)(g - 2sg')$ it follows that

\[ \lim_{s\to+\infty} \frac{g - 2sg'}{s} = \mathrm{const} \ne 0. \]

Thus, $C_1 = C_2 = 0$ and therefore $\psi_1(y) = C_0\psi_2(y + y_1)$. From the initial conditions of (6) we get $0 = \psi_1(0) = C_0\psi_2(y_1)$ and therefore $\psi_2(y_1) = 0$; but $\psi_2(0) = 0$ and $\psi_2'(y)$ is positive everywhere. So, $y_1 = 0$ and from $\psi_1'(0) = \psi_2'(0) = 1$ we obtain $C_0 = 1$. Thus, $\tau_1 = \tau_2$. We prove now that $b_1 = b_2$. We have from (31),

\[ \psi_1'^2(y) - \psi_1^2(y) + b_1 = C_3\bigl(\psi_1'^2(y) - \psi_1^2(y) + b_2\bigr), \]

and therefore either $C_3 = 1$ and $b_1 = b_2$ or $\psi_1'^2(y) - \psi_1^2(y) \equiv \mathrm{const}$, which is not true for $\tau \ne 0$. So we get $b_1 = b_2$.

¹ This remark is due to A.V. Bolsinov, personal communication.


Assume that a Hamiltonian from (i) is equivalent to a Hamiltonian from (ii). Then we obtain in the same way as above that for a solution $\psi_\tau$ of (6) it holds $\psi_\tau'^2(y) - \psi_\tau^2(y) \equiv \mathrm{const}$ and therefore $\tau = 0$. Thus, no Hamiltonian from (i) is equivalent to a Hamiltonian from (ii).

Let us show that no Hamiltonian in family (ii) is equivalent to the case of Goryachev–Chaplygin. There are polar coordinates $r, \varphi$ such that (3) can be rewritten as

\[ H = \gamma_1(r^2)(r^2d\varphi^2 + dr^2) - \gamma_2(r^2)\,r\cos\varphi = \gamma_1(\tilde r^2)(\tilde r^2d\varphi^2 + d\tilde r^2) - \gamma_2(\tilde r^2)\,\tilde r\cos\varphi, \tag{32} \]

where $\tilde r = r^{-1}$. Assume that the Hamiltonian (8) where $\psi$ is a solution of (6) for $\tau = \tau_0 \in (0, T)$ and $b = b_1$ is equivalent to the Hamiltonian of the case of Goryachev–Chaplygin. This means also from the symmetry of (32) that (30) and (31) hold for $b_1 = b_2$ and $\tau_1 = \tau_0$, $\tau_2 = -\tau_0$. In the same way as above we obtain $\tau_0 = 0$, but then the Hamiltonian of the case of Goryachev–Chaplygin would be equivalent to the Hamiltonian of a metric of constant positive curvature.

So, we have described all conservative systems on $S^2$ with the Hamiltonians of the form (4) possessing an integral of the form (5) and not possessing an integral quadratic or linear in momenta. We also proved that no Hamiltonian from (i) or (ii) is equivalent to the Hamiltonian of the case of Goryachev–Chaplygin. On the other hand the case of Goryachev–Chaplygin belongs to the class of conservative systems on $S^2$ with the Hamiltonians of the form (4) possessing an integral of the form (5) and not possessing an integral quadratic or linear in momenta. So, the Hamiltonian of the case of Goryachev–Chaplygin is equivalent to the Hamiltonian (8), where $\psi$ is a solution of (6) for $\tau = T$ and $b = \psi^2(y_0)$, when $\psi'(y_0) = 0$. □

Corollary 4.5. There are one- and two-parameter families of metrics which are conformally equivalent to the metric on the Lobachevski plane and whose geodesic flows admit cubic integrals.

Proof. Let us consider $\tau = T$, $y \ge y_0$ in (33) and (34) and introduce $w = e^{iz}e^{y_0}$, where $z = \varphi + iy$ and $y_0$ has been defined in Lemma 4.4. So, for (33) we get

\[ ds^2 = \bigl(\nu_T(|w|^2\exp(-2y_0))\exp(-y_0)\,|w|\cos\varphi + c\bigr)\cdot\frac{\exp(-2y_0)(1 - |w|^2)^2}{\zeta_T^2(|w|^2\exp(-2y_0))}\,\frac{dw\,d\bar w}{(1 - |w|^2)^2} = \rho_c(w, \bar w)\,ds^2_{L^2} \]

in $|w| \le 1$. As above $\rho_c(w, \bar w)$ is smooth in $|w| < 1$. On the other hand we know from Lemma 4.4 that $\psi'(y_0) = 0$ and $\psi''(y_0) \ne 0$. Thus, there is

\[ \lim_{|w|\to1}\rho_c(w, \bar w) = \mathrm{const} \ne 0. \]

In the same way one may show for (34) that for any $a$ there is $b_0 = b_0(a)$ such that for $b > b_0$ we get metrics which are conformally equivalent to the metric on the Lobachevski plane and whose geodesic flows admit cubic integrals. □


Proposition 4.6. The metric of constant curvature 1 on $S^2$ can be perturbed in a class of smooth metrics on $S^2$ whose geodesic flows admit nontrivial cubic integrals.

Proof. In the last theorem we proved the integrability of the geodesic flow of the following metric:

\[ ds^2 = \left((\psi'' - \psi)\cos\varphi + \frac{c}{\psi'^2}\right)(d\varphi^2 + dy^2) \tag{33} \]

or

\[ ds^2 = \left((\psi'' - \psi)\cos\varphi + a\,\frac{\psi'^2 - \psi^2 + b}{\psi'^2}\right)(d\varphi^2 + dy^2), \tag{34} \]

where $\psi$ is a solution of the ODE (6). These metrics are Riemannian metrics on $S^2$ if $\psi$ is a solution of (6), $\tau \in (0, T)$ and if $a, b, c$ are sufficiently large constants. As mentioned above, for $\tau = 0$ we get from (33) or (34) a metric of constant curvature on $S^2$. Fix $c = 4$ and consider the family of metrics (33) $ds_\varepsilon^2$, where $\tau \in (0, \varepsilon)$. We get $ds_\varepsilon^2 = \rho_\varepsilon\,ds_{st}^2$, where $\rho_0 \equiv 1$ and $\rho_\varepsilon$ depends on $\varepsilon$ smoothly. We show that for $\tau > 0$ the corresponding cubic integrals of the geodesic flows of $ds_\varepsilon^2$ are nontrivial, i.e. no integral is a composition of the Hamiltonian and of integrals of degree 1 or 2. We rewrite (33) in the following form:

\[ ds^2 = \left(\Psi_1(w\bar w)\,\frac{w + \bar w}{2} + \Psi_2(w\bar w)\right)dw\,d\bar w, \]

see Proposition 3.1, where $w, \bar w$ are homogeneous coordinates. Recall that our integral has the following form:

\[ F = iw^3p_w^3 - i\bar w^3p_{\bar w}^3 + b_1(w, \bar w)\,p_w^2p_{\bar w} + \bar b_1(w, \bar w)\,p_wp_{\bar w}^2. \]

If $F$ depends on the Hamiltonian and on integrals of degree 1 or 2, then there exists a linear integral $F_1$ of this flow and the corresponding metric has the form $ds^2 = \lambda_1(z\bar z)\,dz\,d\bar z$ in some homogeneous coordinates $z, \bar z$, see [11,9]. So, from Propositions 4.1 and 4.2 it follows that we have to consider two cases: if there is only one independent linear integral $F_1$ and if there are two independent linear integrals ($ds^2$ is isometric to a metric of constant curvature). In the first case, we get immediately $F_1 = c(iwp_w - i\bar wp_{\bar w})$ and therefore $ds^2 = \lambda_2(w\bar w)\,dw\,d\bar w$. So, we obtain $\Psi_1 \equiv 0$, i.e. $\tau = 0$ and, as mentioned above, $ds^2$ is a metric of constant positive curvature. From Proposition 4.1 we know that the geodesic flow of the standard metric on $S^2$,

\[ ds_{st}^2 = \frac{dz\,d\bar z}{(1 + |z|^2)^2}, \]

has two linear integrals

\[ F_1 = izp_z - i\bar zp_{\bar z} = a_1(w)p_w + \bar a_1(w)p_{\bar w} \]

and

\[ \hat F_1 = i(1 - z^2)p_z - i(1 - \bar z^2)p_{\bar z} = \hat a_1(w)p_w + \bar{\hat a}_1(w)p_{\bar w}, \]

where $a_1, \hat a_1$ are polynomials of degree $\le 2$. Since $F$ depends on $F_1, \hat F_1, H$, then either $a_1$ or $\hat a_1$ is linear. In this case $ds^2$ is isometric to $ds_{st}^2$, so $w, z$ are both homogeneous coordinates of $ds^2$, and therefore they can be obtained one from another by a Möbius transformation. Then it is easy to see that $\hat a_1$ cannot be linear. If $a_1$ is linear, then $w + C_1 = C_2z^{-1}$, where $C_1, C_2$ are constants. Indeed, let us consider

\[ w = \frac{1}{z - C} - C_1, \qquad C, C_1 \in \mathbb{C} \]

(w.l.o.g. we may put $C_2 = 1$). Then

\[ a_1(w) = i\left(\frac{1}{w + C_1} + C\right)(w + C_1)^2. \]

From the symmetry of the standard metric we may put $w + C_1 = C_2z$. So, there are such constants $k \in \mathbb{R}$, $C_1 \in \mathbb{C}$ that

\[ \Psi_1(|w|^2)\,\frac{w + \bar w}{2} + \Psi_2(|w|^2) \equiv k(1 + |w + C_1|^2)^{-2}. \]

We get $\Psi_1 \equiv 0$, i.e. $\tau = 0$. □

5. The Main Technical Lemma

Proof of Lemma 4.4. Let us consider solutions of (6). We prove now the existence of the constant $T$ and the asymptotic of the solutions. Then (28) and (29) will follow immediately. The initial value problem (6) has a unique solution $\psi(y) = \psi_\tau(y)$ which is positive on $(0, \varepsilon)$ and negative on $(-\varepsilon, 0)$ for sufficiently small $\varepsilon$. Let us consider the case $y > 0$. The ODE (6) can be replaced by the following differential equation of the second order:

\[ \ddot q = \frac{1}{q}\left(1 + 2q^2 - 3q^4 + \dot q - 7q^2\dot q - 2\dot q^2\right) \]

with $q(y) = R'(y)$, where $R(y) = \log\psi(y)$. We may rewrite this equation as a system of differential equations of first order:

\[ \dot q = p, \qquad \dot p = \frac{1}{q}\left(1 + 2q^2 - 3q^4 + p - 7q^2p - 2p^2\right). \tag{35} \]

Since system (35) is symmetric with respect to $q \mapsto -q$, $y \mapsto -y$, it suffices to consider the case $q > 0$. In order to obtain the phase portrait of (35) we may consider the following smooth system:

\[ \dot q = qp, \qquad \dot p = 1 + 2q^2 - 3q^4 + p - 7q^2p - 2p^2. \tag{36} \]

The solutions of (35) are obtained from the solutions of (36) by a reparametrization. The system (36) has four singular points: two saddle points $p = 1$, $q = 0$ and $p = -\frac12$, $q = 0$, and two nodes $p = 0$, $q = \pm1$. It is easy to see that for (35) $q \to 0+$ in a finite time interval. We denote by $(*)$ the orbit of (35), where $p = p^*(q)$ and $p^*(q) \to -\frac12$ as $q \to 0+$.
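The classification of the four singular points of (36) can be checked by hand, and the following Python sketch does the same bookkeeping numerically. It is an illustration only, not part of the proof: the function names are ours, the Jacobian is written out by differentiating the right-hand side of (36), and the last check confirms that the parabola $p = -q^2 + 1$ (the orbit attributed in the text to $\psi = \sinh y$) is invariant under (35).

```python
import math

def f(q, p):
    """Polynomial on the right-hand side of the p-equation in (36)."""
    return 1 + 2 * q**2 - 3 * q**4 + p - 7 * q**2 * p - 2 * p**2

def field36(q, p):
    """Vector field of the smooth system (36)."""
    return (q * p, f(q, p))

def jacobian36(q, p):
    """Jacobian of (36), obtained by differentiating q*p and f by hand."""
    return ((p, q),
            (4 * q - 12 * q**3 - 14 * q * p, 1 - 7 * q**2 - 4 * p))

def eigenvalues(m):
    """Real eigenvalues (ascending) of a 2x2 matrix; raises if they are complex."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)

def classify(m):
    """'saddle' if the eigenvalues have opposite signs, otherwise 'node'."""
    lo, hi = eigenvalues(m)
    return "saddle" if lo * hi < 0 else "node"

# Singular points named in the text, written as (q, p).
SINGULAR_POINTS = [(0.0, 1.0), (0.0, -0.5), (1.0, 0.0), (-1.0, 0.0)]

def parabola_defect(q):
    """Zero iff p = 1 - q^2 is invariant for (35): f(q, p) = q * dp/dy = -2 q^2 p."""
    p = 1 - q**2
    return f(q, p) + 2 * q**2 * p

if __name__ == "__main__":
    for q, p in SINGULAR_POINTS:
        print((q, p), field36(q, p), classify(jacobian36(q, p)),
              eigenvalues(jacobian36(q, p)))
    print(parabola_defect(0.3), parabola_defect(2.0))
```

At $(q, p) = (1, 0)$ the Jacobians of (35) and (36) coincide because $q = 1$ there, so the eigenvalues computed here are the ones used later in the asymptotic analysis.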


The aim of our further investigations is to show that the orbits of (35), which correspond to the solutions of (6) if $\tau$ belongs to $(-T, T)$ and

\[ T = -\lim_{q\to+\infty}\frac{p^*(q) + q^2}{q} < \infty, \tag{37} \]

converge to the singular point $q = 1$, $p = 0$. For any solution $\psi(y) = \psi_\tau(y)$, $y \in (0, \varepsilon)$, of (6) there is an orbit $\{\Gamma_\tau : p = p_\tau(q)\}$ in the phase space of (35). For any solution of (6) it holds $R(y) \to -\infty$ as $y \to 0+$ and, therefore, $q(y) = R'(y) \to +\infty$, $p(y) = q'(y) \to -\infty$ as $y \to 0+$. Thus $p_\tau(q) \to -\infty$ as $q \to +\infty$. We may now consider only the orbits of (35), where $p \to -\infty$ as $q \to +\infty$. We show that if $\tau_1 > \tau_2$, then $p_{\tau_1}(q) > p_{\tau_2}(q)$. Indeed, for a solution $\psi(y) = \psi_{\tau_1}(y)$ of the ODE (6) we have

\[ \tau_1 = \lim_{y\to0+}\frac{\psi_{\tau_1}''(y)}{\psi_{\tau_1}'(y)} = \lim_{q\to+\infty}\frac{q^2 + p_{\tau_1}(q)}{q}. \tag{38} \]

Note that the function $\psi(y) = \sinh y$ satisfies (6) if $\tau = 0$. The related orbit of (35) has then the form $\{\Gamma_0 : p = p_0(q) = -q^2 + 1\}$. So, the orbits of (35), corresponding to the solutions of (6), for $\tau \ge 0$ converge to the singular point $q = 1$, $p = 0$. We will prove below that orbit $(*)$, where $p \to -\frac12$ as $q \to 0+$, corresponds to the solution of (6) when $\tau$ is equal to a negative constant $-T$ and all orbits of (35), lying between $(*)$ and $\Gamma_0$ correspond to the solutions of (6) for $\tau \in (-T, 0)$.

Assume first that there exists a constant $\tau_0 < 0$ such that orbit $\Gamma_{\tau_0}$ does not converge to the point $q = 1$, $p = 0$. Consider the set $W$ of orbits of (35), lying between $\Gamma_{\tau_0}$ and $\Gamma_0$. Show that for any orbit in $W$ the value $q$ becomes infinite in a finite time interval. In fact, for any solution of (35) we have

\[ \int^{q(y)}\frac{dq}{p} = y + \mathrm{const}. \tag{39} \]

For orbits in $W$ we have $p < -q^2 + 1$. Hence, the left-hand side of (39) is bounded for any orbit in $W$ as $q \to +\infty$. Without loss of generality we assume that for these solutions of (35) $y$ vanishes as $q$ becomes infinite. We conclude that for any orbit from $W$ it holds by (38), $(\tau_0 + o(1))q < p + q^2 < 1$, and for a corresponding solution of (6) we get for $y \to 0+$,

\[ \tau_0 + o(1) < \frac{\psi''(y)}{\psi'(y)} < 1. \]

Thus, for a corresponding solution of (6) the function $(\log\psi'(y))'$ is bounded in an interval containing $0+$. Hence, $\lim_{y\to0+}\psi'(y)$ is bounded for a solution $\psi(y)$ of (6), corresponding to an orbit in $W$ and, therefore, $\psi(y) = \frac{\psi'(y)}{q(y)} \to 0$ as $y \to 0+$. Let us consider the orbit $(*)$, where $p = p^*(q)$. If $(*)$ is the same as $\Gamma_{\tau_0}$ then $T = -\tau_0$.


If $(*)$ is not the same as $\Gamma_{\tau_0}$ then it belongs to $W$ and, hence, corresponds to the solution of (6) for $\tau = -T$, where $T$ is equal to a positive constant and, moreover, $T < -\tau_0$. Thus we have shown that for any $\tau \in (-T, T)$ where $T$ is a positive constant the system (6) has a solution for all $y \ge 0$.

Let us assume now that there is no such orbit $\Gamma_{\tau_0}$ and, so, all orbits of (35), corresponding to the solutions of the ODE (6), converge to the singular point $q = 1$, $p = 0$. Clearly, the orbit $(*)$ corresponds to a solution $\psi(y)$ of the differential equation in the ODE (6). As mentioned above, the function $q(y) = \frac{\psi'(y)}{\psi(y)}$ becomes infinite in a finite time interval of $y$, say, as $y \to 0+$. Since $p^*(q) < -q^2 - kq$ for any $k$ and for sufficiently large $q$, for $\psi(y)$ it holds

\[ \lim_{y\to0+}\frac{\psi''(y)}{\psi'(y)} = -\infty. \tag{40} \]

Note that $\psi(0)$ is finite because $R(0) - R(y) = \int_y^0 q(t)\,dt < 0$ for $y > 0$ and $\psi(y) = \exp R(y)$. So, there are two cases: $\psi(0) = 0$ or $\psi(0) \ne 0$.

Assume that $\psi(0) \ne 0$. It follows that $\psi'(0+) = +\infty$ and from (40) we get $\psi''(0+) = -\infty$. Rewrite now the differential equation in (6) in the following form:

\[ \frac{\psi'''}{\psi} = -3\,\frac{\psi'' - \psi}{\psi'} - 2\,\frac{(\psi'' - \psi)^2}{\psi\psi'} + \frac{\psi'}{\psi}. \]

Therefore, it holds

\[ \frac{\psi'''}{\psi} = -3\,\frac{p^*(q) + q^2 - 1}{q} - 2\,\frac{(p^*(q) + q^2 - 1)^2}{q} + q. \]

We obtain $\psi'''(0+) = -\infty$. This yields the contradiction $\psi''(0+) = -\infty$.

Assume that $\psi(0) = 0$ and $\psi'(0) \ne 0$. From (40) we get $\psi''(0+) = -\infty$. Rewriting the differential equation (6) in the following form:

\[ \psi''' = \psi' - 3\,\frac{(\psi'' - \psi)\psi}{\psi'} - 2\,\frac{(\psi'' - \psi)^2}{\psi'}, \]

we obtain again $\psi'''(0+) = -\infty$. Therefore, we only have to consider one case $\psi(0) = 0$ and $\psi'(0) = 0$. Taking into account that $\psi'(y) = \psi(y)q(y)$ and $q(y) > 0$ as $y \to 0+$, we conclude that $\psi'(y) \to 0+$ as $y \to 0+$, but on the other hand from (40) it follows that $\psi''(y) < 0$.

These contradictions finally show that there is an orbit $\Gamma_{\tau_0}$ which does not converge to the point $q = 1$, $p = 0$ and, therefore, for any $\tau \in (-T, T)$, where $T$ is given by (37), solutions of the ODE (6) exist on $[0, +\infty)$. Since a solution $\psi(y) = \psi_\tau(y)$ of (6) equals $-\psi_{-\tau}(-y)$ if $y \le 0$, solutions of (6) exist on $(-\infty, +\infty)$.

Let us consider now the solution of (6) when $\tau = T$ which corresponds to the orbit $(*)$. As mentioned above, see (35), (36), $q \to 0+$ in a finite time interval, say $y \to y_0$, i.e. $\psi_T'(y_0) = 0$. Assume now that $\psi_T''(y_0) = 0$. Then from the ODE we obtain $\psi_T(y_0) = 0$. On the other hand, as proved above, $\psi_T'(y) > 0$ if $y > y_0$ and from $\psi_T(0) = 0$ we get $\psi_T(y_0) < 0$. So, $\psi_T''(y_0) \ne 0$.
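For readers who want to experiment with the solutions $\psi_\tau$ discussed above, the following Python sketch integrates the third-order form of the ODE quoted in this proof with a classical Runge–Kutta step. Two things here are assumptions rather than statements of the paper: the initial data $\psi(0) = 0$, $\psi'(0) = 1$, $\psi''(0) = \tau$ are inferred from the way the initial value problem (6) is used in the text, and the step count and function names are arbitrary illustrative choices. The integration is only meaningful while $\psi' \ne 0$.

```python
import math

def rhs(state):
    """psi''' = psi' - 3(psi'' - psi)psi/psi' - 2(psi'' - psi)^2/psi' as a first-order system."""
    psi, d1, d2 = state
    excess = d2 - psi  # the combination psi'' - psi
    d3 = d1 - (3.0 * excess * psi + 2.0 * excess * excess) / d1
    return (d1, d2, d3)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * h * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def solve_psi(tau, y_end, steps=2000):
    """Return (psi, psi', psi'') at y_end for the assumed data psi(0)=0, psi'(0)=1, psi''(0)=tau."""
    state = (0.0, 1.0, tau)
    h = y_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    psi, d1, d2 = solve_psi(0.0, 1.0)
    print("tau = 0: deviation from sinh, cosh:", psi - math.sinh(1.0), d1 - math.cosh(1.0))
    psi, d1, d2 = solve_psi(0.1, 1.0)
    print("tau = 0.1: psi'' - psi at y = 1:", d2 - psi)
```

For $\tau = 0$ the sketch reproduces $\psi = \sinh y$, in line with the remark in the text; for small positive $\tau$ the quantity $\psi'' - \psi$ stays positive and decreases, since it satisfies a linear-type equation with negative coefficient as long as $\psi, \psi' > 0$.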


Now we examine the asymptotic behaviour of the solutions of (6) with $\tau \in [0, T]$ where $T$ is defined by (37). Put $s = \exp(-2y)$ and

\[ g(s) = \sqrt{s}\,\psi\!\left(-\tfrac12\log s\right). \]

Then the initial value problem (6) can be rewritten as

\[ g''' = 3\,\frac{g''g'}{g - 2g's} + s\,\frac{g''^2}{g - 2g's}, \qquad g(1) = 0, \quad g'(1) = -\frac12, \quad g''(1) = \tau. \tag{41} \]

We will consider $\tau \in (-T, T]$. Compute now for a solution of the ODE (6):

\[ \psi'(y) = (\exp y)(g - 2g's), \tag{42} \]

\[ \psi''(y) - \psi(y) = 4\exp(-3y)\,g''(s). \tag{43} \]
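The identities (42) and (43) follow from the chain rule alone. The short computation below is included as a reading aid; it uses only $\psi(y) = e^{y}g(s)$ with $s = e^{-2y}$, as defined above, and no property of the ODE.

```latex
\begin{align*}
\psi(y) &= e^{y}g(s), \qquad s = e^{-2y}, \qquad \frac{ds}{dy} = -2s,\\
\psi'(y) &= e^{y}g + e^{y}g'\cdot(-2s) = e^{y}\,(g - 2sg'),\\
\psi''(y) &= e^{y}(g - 2sg') + e^{y}\,\frac{d}{ds}\bigl(g - 2sg'\bigr)\cdot(-2s)\\
          &= e^{y}(g - 2sg') + e^{y}\,(2sg' + 4s^{2}g'') = e^{y}\,(g + 4s^{2}g''),\\
\psi''(y) - \psi(y) &= 4e^{y}s^{2}g''(s) = 4e^{-3y}g''(s).
\end{align*}
```

As a consistency check, $\psi = \sinh y$ corresponds to $g(s) = (1 - s)/2$, for which $g - 2sg' = (1 + s)/2$ gives $\psi' = \cosh y$ and $g'' = 0$ gives $\psi'' - \psi = 0$.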

Let us consider the system (35). Since the eigenvalues of the Jacobian of (35) at $(q, p) = (1, 0)$ are equal to $-2, -4$, there exist functions $P$ and $Q$ of class $C^1$ such that $q = Q(s) = 1 + Cs + o_1(s)$, $p = P(s) = -2Cs + o_2(s)$, where $C$ is a constant, see [8]. We may write

\[ R(y) = \int^{y} q = y + \xi_1(s). \]

Now we show that $\xi_1$ is of class $C^1$. By differentiation we obtain

\[ \frac{d\xi_1(s)}{ds} = -\frac{q - 1}{2s} = -\frac{Cs + o_1(s)}{2s} \in C^0. \]

Write now for a solution of the ODE (6), $\psi(y) = \exp R(y) = (\exp y)\exp\xi_1(s) = (\exp y)\,g(s)$, where we have used that $g = \exp(-y)\psi(y)$, and, therefore, $g(0) = \exp\xi_1(0) \ne 0$. For a solution of the ODE (6) which can be extended to infinity it holds

\[ \psi'(y) = \psi(y)q(y) = (\exp y)(\exp\xi_1(s))(1 + Cs + o_1(s)) = (\exp y)\,g(s)(1 + Cs + o_1(s)), \qquad s = \exp(-2y). \]

On the other hand, from (42) we obtain $g(s)(1 + Cs + o_1(s)) = g(s) - 2sg'(s)$. Thus, for any solution of (41) where $\tau \in (-T, T]$ it holds $g'(0) < \infty$.

As mentioned above, the solution of (6) for $\tau = 0$ has the form $\psi(y) = \sinh y$. Let us show that for any solution of (41), where $\tau \in (-T, T]$ and $\tau \ne 0$ it holds $0 < |g''(0)| < \infty$. We compute

\[ \psi''(y) - \psi(y) = \psi(y)(p + q^2 - 1) = (\exp y)(\exp\xi_1(s))(P(s) + Q^2(s) - 1) = (\exp y)(\exp\xi_1(s))\bigl(-2Cs + o_2(s) + (1 + Cs + o_1(s))^2 - 1\bigr). \]

Thus,

\[ \psi''(y) - \psi(y) = (\exp y)\,\xi_2(s), \]

where $\xi_2 \in C^1$, $\xi_2(s) = o_3(s)$. Rewrite the ODE (6) in the following form:

\[ (\psi''' - \psi')\psi' = -3(\psi'' - \psi)\psi - 2(\psi'' - \psi)^2. \]

We obtain either $\psi''(y) - \psi(y) \equiv 0$ (then $\tau = 0$) or

\[ (\log|\psi''(y) - \psi(y)|)' = -3\,\frac1q - 2|\psi''(y) - \psi(y)|\bigl(\exp y\,\exp\xi_1(s)(1 + Cs + o_1(s))\bigr)^{-1} = -3\,\frac1q - 2|\xi_2(s)|\bigl(\exp\xi_1(s)(1 + Cs + o_1(s))\bigr)^{-1}. \]

So

\[ (\log|\psi''(y) - \psi(y)|)' = -3(1 - Cs) + o_4(s). \]

Then it follows

\[ (\log((\exp y)|\xi_2(s)|))' = -3(1 - Cs) + o_4(s). \]

Thus,

\[ (y + \log|\xi_2(s)|)' = -3(1 - Cs) + o_4(s). \]

By integrating we get

\[ \log|\xi_2(s)| = -4y - \frac{3C}{2}s + \xi_3(s), \]

where $\xi_3 \in C^1$. Thus,

\[ |\xi_2(s)| = (\exp(-4y))\exp\left(-\frac{3C}{2}s + \xi_3(s)\right) = s^2\xi_4(s), \]

since $s = \exp(-2y)$. So, $\xi_4 \in C^1$ and $\xi_4(0) = \mathrm{const} \ne 0$. Taking into account (43), we obtain $|g''(0)| = \frac14\xi_4(0)$ and therefore $0 < |g''(0)| < \infty$. So, we conclude that the solutions of (41) for $\tau \in (-T, T]$ are of class $C^\infty$ in zero. Now the functions $\zeta_\tau, \nu_\tau$ can be calculated in terms of $g$. From (42) $\zeta_\tau(0) > 0$ in view of $g(0) > 0$. Therefore $\zeta_\tau > 0$ everywhere on $[0, +\infty)$.

Let us consider a smooth solution of the ODE (6) which satisfies (28) or (29). This solution corresponds to some orbits of (35). Now we describe these orbits. As mentioned above, $q \to 0$ in a finite time interval, say $y \to y_0$, and we know that $\psi(y_0) = \mathrm{const} \ne 0$. Then $q(y) = \frac{\psi'(y)}{\psi(y)}$ must be smooth in $y = y_0$, and therefore the orbits where $q \to 0$, $p \to \pm\infty$ may not correspond to our solutions. We consider the case $q \to 0$, $p \to 1$. Then the corresponding orbit has the form $p = -q^2 + 1$, $|q| < 1$ and the corresponding solutions can be obtained from $\psi(y) = \cosh y$ by a scaling or a linear translation of time. If $q \to 0$, $p \to -\frac12$, then we get the orbit $(*)$, which corresponds to the solution of (6) for $\tau = T$ and $y_0 < y < 0$. So, we only have to consider the orbits which converge to $q = 1$, $p = 0$ as $y \to +\infty$ and $q \to +\infty$ as $y \to y_1+$. Assume $y_1 = -\infty$. We get $\psi(y) > 0$, $\psi'(y) > 0$ on $(-\infty, +\infty)$. On the other hand from (28) we have

\[ \lim_{y\to-\infty}\psi'(y)\exp y = \mathrm{const} \ne 0, \]

and from (29),

\[ \lim_{y\to-\infty}\left(1 - \frac{1}{q^2(y)} + \frac{b}{\psi'^2}\right) = \mathrm{const} \ne 0. \]

In both cases there is a constant $C > 0$ such that $\psi'(y) > C$ as $y \to -\infty$ and therefore there is $y = y_2 > -\infty$ such that $\psi(y_2) < 0$. Thus, $y_1$ is bounded. W.l.o.g. say $y_1 = 0$, i.e. $q \to +\infty$ as $y \to 0+$. Then $\psi(0) = 0$. Indeed, if $\psi(0) \ne 0$, then $\psi'(y) = q\psi(y) \to +\infty$ and the solution is not smooth. W.l.o.g. put $\psi(0+) = 0+$. We prove now that $\psi'(0) \ne 0$. If $\psi'(0) = 0$, then using $\psi(0) = 0$, we get $\psi''(0) = 0$. On the other hand $\frac{\psi''}{\psi'} \to +\infty$ as $y \to 0+$. (Otherwise there is a constant $C'$ such that $\psi'' < C'\psi'$ but $\psi'(0+) = 0+$.) Then from the ODE we derive $\frac{\psi'''}{\psi'} \to -\infty$ as $y \to 0+$, and therefore $\psi''' < \psi' < \psi''$ as $y \to 0+$ but $\psi''(0+) = 0+$. Thus, $\psi'(0) \ne 0$. W.l.o.g. we say $\psi'(0) = 1$ and then, as proved above, $\psi''(0) \le T$. □

6. Other Examples

In this section we will briefly discuss some modifications of the proposed construction. Let us give some other applications of our ansatz. Hall in [7] found the Toda-potential with the ansatz $f = \exp(y)\alpha(x) + \xi(y)$. By simple computation, in the same way as above, using Maupertuis's principle, we show that this ansatz leads to the following family of integrable conservative systems on $\mathbb{R}^2$:

Proposition 6.1. Any system with the Hamiltonian $H = \exp(2y)(p_x^2 + p_y^2) - (\exp(3y))(\alpha'' + \alpha)$, where $\alpha$ is a solution of the initial value problem

\[ \alpha''^2 - 2\alpha'^2 - 3\alpha^2 = -2, \qquad \alpha(0) = \alpha_0 > \sqrt{\tfrac23}, \quad \alpha'(0) = 0, \tag{44} \]

is a conservative system on $\mathbb{R}^2$ with multivalued potentials, which admits nontrivial cubic integrals in momenta.

Proof. We show that for $\alpha_0 > \sqrt{\tfrac23}$ any solution of this ODE with the initial conditions $\alpha(0) = \alpha_0$, $\alpha'(0) = 0$ exists everywhere. Then we get

\[ \dot\alpha = p, \qquad \dot p = \sqrt{2p^2 + 3\alpha^2 - 2}. \]

We consider the orbits of this system where $\alpha > \sqrt{\tfrac23}$ when $p = 0$ and show that the corresponding solutions $\alpha(x)$ exist everywhere. In view of the symmetry one only has to consider the case $x > 0$. We know the following explicit solutions of this ODE:

\[ \alpha(x) = \frac{\sinh\sqrt3x}{\sqrt3}, \tag{45} \]

\[ \alpha(x) = \sin x. \tag{46} \]

So, the orbits $p = f(\alpha)$, which we consider, satisfy the following inequality: $\sqrt{1 + 3\alpha^2} > |f(\alpha)|$, and therefore,

\[ x = \int_{\alpha_0}^{\alpha}\frac{ds}{f(s)} > \int_{\alpha_0}^{\alpha}\frac{ds}{\sqrt{1 + 3s^2}}. \]

Thus, the corresponding solutions exist everywhere. We note that (45) corresponds to the Toda-potential and (46) to the metric of constant null curvature. Since the kinetic energy in (44) is isometric to the "flat" metric on the plane, we know that all Liouville coordinates (up to linear transformations) of this metric are $w = u + iv = a\exp(y - ix)$, where $a \in \mathbb{C}$, [4]. Then by the same arguments as in Theorem 1.1 we may show that a system with the Hamiltonian (44) does not possess integrals which are linear or quadratic in momenta. So, we get a family of integrable systems on the plane with multivalued potentials, possessing an integral cubic in momenta. □
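The two explicit solutions (45) and (46) quoted in the proof can be substituted into the differential equation of (44) directly. The Python sketch below does this numerically at a few sample points; it is a sanity check under our own naming (the helper names are not from the paper), with the derivatives of each solution written out by hand.

```python
import math

def residual(alpha, d_alpha, dd_alpha):
    """Left side minus right side of alpha''^2 - 2 alpha'^2 - 3 alpha^2 = -2."""
    return dd_alpha**2 - 2.0 * d_alpha**2 - 3.0 * alpha**2 + 2.0

def toda_solution(x):
    """(alpha, alpha', alpha'') for alpha(x) = sinh(sqrt(3) x) / sqrt(3), i.e. (45)."""
    r3 = math.sqrt(3.0)
    return (math.sinh(r3 * x) / r3, math.cosh(r3 * x), r3 * math.sinh(r3 * x))

def flat_solution(x):
    """(alpha, alpha', alpha'') for alpha(x) = sin x, i.e. (46)."""
    return (math.sin(x), math.cos(x), -math.sin(x))

SAMPLE_POINTS = [0.0, 0.25, 0.5, 1.0, 1.5]

if __name__ == "__main__":
    for x in SAMPLE_POINTS:
        print(x, residual(*toda_solution(x)), residual(*flat_solution(x)))
```

Both residuals vanish up to rounding, which reflects the identities $\cosh^2 - \sinh^2 = 1$ and $\sin^2 + \cos^2 = 1$. Along (45) one also has $\alpha' = \sqrt{1 + 3\alpha^2}$, which is the bounding curve used in the inequality of the proof.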

Acknowledgement. The author would like to thank Professor Gerhard Huisken and the Arbeitsbereich “Analysis” of the University of Tübingen for their hospitality.

References

1. Arnold, V.I.: Mathematical methods in Classical Mechanics. Berlin–Heidelberg: Springer-Verlag, 1978
2. Arnold, V.I., Kozlov, V.V., Neishtadt, A.I.: Mathematical Aspects of Classical and Celestial Mechanics. Berlin–Heidelberg: Springer-Verlag, 1988
3. Bolsinov, A.V., Kozlov, V.V., Fomenko, A.T.: The Maupertuis's principle and geodesic flows on S² arising from integrable cases in the dynamics of a rigid body. Russ. Math. Surv. 50, 473–501 (1995)
4. Darboux, G.: Leçons sur la théorie générale des surfaces et les applications géométriques du calcul infinitésimal. Paris: Gauthier-Villars, 1896
5. Goryachev, D.N.: New cases of integrability of Euler's dynamical equations (Russian). Varshavskie Universitet'skie Izvestiya, 3, 3–15 (1916)
6. Hadeler, K.P., Selivanova, E.N.: On the case of Kovalevskaya and new examples of integrable conservative systems on S². Preprint http://xxx.lanl.gov/ math.DG/9801063 (1998), submitted
7. Hall, L.S.: A theory of exact and approximate configurational invariants. Physica 8D, 90–116 (1983)
8. Hartman, P.: Ordinary differential equations. New York–London–Sydney: John Wiley and Sons, 1964
9. Kiyohara, K.: Compact Liouville surfaces. J. Math. Soc. Japan 43, 555–591 (1991)
10. Kiyohara, K.: Two classes of Riemannian manifolds whose geodesic flows are integrable. Memoirs of the AMS. 619, (1997)
11. Kolokol'tsov, V.N.: Geodesic flows on two-dimensional manifolds with an additional first integral that is polynomial in the velocities. Math. USSR Izvestiya 21, 291–306 (1983)
12. Kolokol'tsov, V.N.: Polynomial integrals of the geodesic flows on the two-dimensional manifolds. PhD thesis, Moscow State University, in Russian, 1984
13. Kovalevskaya, S.V.: Sur le problème de la rotation d'un corps solide autour d'un point fixe. Acta Mathematica 12, 177–232 (1889)
14. Selivanova, E.N.: New families of conservative systems on S² possessing an integral of fourth degree in momenta. Annals of Global Analysis and Geometry, to appear

Communicated by T. Miwa

Commun. Math. Phys. 207, 665 – 685 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Random Unitary Matrices, Permutations and Painlevé

Craig A. Tracy¹, Harold Widom²

¹ Department of Mathematics and Institute of Theoretical Dynamics, University of California, Davis, CA 95616, USA. E-mail: [email protected]
² Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected]

Received: 2 December 1998 / Accepted: 12 May 1999

Abstract: This paper is concerned with certain connections between the ensemble of n × n unitary matrices – specifically the characteristic function of the random variable tr(U) – and combinatorics – specifically Ulam's problem concerning the distribution of the length of the longest increasing subsequence in permutation groups – and the appearance of Painlevé functions in the answers to apparently unrelated questions. Among the results is a representation in terms of a Painlevé V function for the characteristic function of tr(U) and (using recent results of Baik, Deift and Johansson) an expression in terms of a Painlevé II function for the limiting distribution of the length of the longest increasing subsequence in the hyperoctahedral groups.

1. Introduction

The characteristic function of the random variable $\operatorname{tr} U$, where $U$ belongs to the ensemble $U(n)$ of $n \times n$ unitary matrices with Haar measure, is the expected value

\[ E_n\!\left(e^{\,r\operatorname{tr}U + s\,\overline{\operatorname{tr}U}}\right). \tag{1.1} \]

In $U(n)$ we have for any function $g$ with Fourier coefficients $g_k$,

\[ E_n\!\left(\prod_{j=1}^{n} g(e^{i\theta_j})\right) = \det T_n(g), \tag{1.2} \]

where $T_n(g)$ is the associated $n \times n$ Toeplitz matrix defined by

\[ T_n(g) = (g_{j-k}), \qquad (j, k = 0, \cdots, n - 1). \]

It follows that the distribution function (1.1) equals the determinant of the Toeplitz matrix associated with the function $e^{rz + sz^{-1}}$. The determinant, which we denote by $D_n$, is a function of the product $rs$ (see Sect. 2) and so it is completely determined by its values when $r = s = t$. This function $D_n(t)$ has connections with both integrable systems and combinatorial theory. To state our results, and these connections, we introduce some notation. We set

\[ f(z) = e^{t(z + z^{-1})}, \]

so that $D_n(t) = \det T_n(f)$. Notice that $T_n(f)$ is symmetric since $f(z^{-1}) = f(z)$. We introduce the $n$-vectors

\[ \delta^+ = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \delta^- = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}, \quad f^+ = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}, \quad f^- = \begin{pmatrix} f_n \\ f_{n-1} \\ \vdots \\ f_1 \end{pmatrix}, \]

and define

\[ U_n = \left(T_n(f)^{-1}f^+, \delta^-\right) = \left(T_n(f)^{-1}f^-, \delta^+\right). \]

If we set $\Phi_n = 1 - U_n^2$, then $\Phi_n$ as a function of $t$ satisfies the equation

\[ \Phi_n'' = \frac12\left(\frac{1}{\Phi_n - 1} + \frac{1}{\Phi_n}\right)(\Phi_n')^2 - \frac1t\Phi_n' - 8\Phi_n(\Phi_n - 1) + 2\,\frac{n^2}{t^2}\,\frac{\Phi_n - 1}{\Phi_n}, \tag{1.3} \]

which is a variant of the Painlevé V equation¹, and in terms of it $D_n$ has the representation

\[ D_n(t) = \exp\left(4\int_0^t \log(t/\tau)\,\tau\,\Phi_n(\tau)\,d\tau\right). \tag{1.4} \]

This is reminiscent of the many representations now in the literature for Fredholm determinants in terms of Painlevé functions. We shall also show that $W_n = U_n/U_{n-1}$ satisfies

\[ W_n'' = \frac{1}{W_n}(W_n')^2 - \frac1tW_n' + 4\,\frac{n - 1}{t}\,W_n^2 - \frac{4n}{t} + 4W_n^3 - \frac{4}{W_n}, \tag{1.5} \]

which is a special case of the Painlevé III equation. An important ingredient in the proofs is the following recurrence relation satisfied by the $U_n$:

\[ \frac{n}{t}\,U_n + (1 - U_n^2)(U_{n-1} + U_{n+1}) = 0. \tag{1.6} \]

¹ The substitution $t^2 = z$ and $\Phi_n = w/(w - 1)$ transforms (1.3) to the standard form of $P_V$ with parameters $\alpha = 0$, $\beta = -n^2/2$, $\gamma = 2$ and $\delta = 0$.
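The recurrence (1.6) lends itself to a quick numerical experiment. The Python sketch below is illustrative only: it assumes that the Fourier coefficients of $f(z) = e^{t(z + z^{-1})}$ are the modified Bessel values $f_k = I_k(2t) = \sum_{m \ge 0} t^{2m+|k|}/(m!\,(m+|k|)!)$, reads $U_n$ as the last component of $T_n(f)^{-1}f^+$ with $f^+ = (f_1, \dots, f_n)^T$, and uses a hand-written Gaussian elimination. The helper names and the truncation of the series are our choices.

```python
import math

def fourier_coeff(k, t, terms=30):
    """k-th Fourier coefficient of exp(t(z + 1/z)), summed from its power series."""
    k = abs(k)
    return sum(t ** (2 * m + k) / (math.factorial(m) * math.factorial(m + k))
               for m in range(terms))

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def u(n, t):
    """U_n = (T_n(f)^{-1} f^+, delta^-): last entry of the solution of T_n x = f^+."""
    toeplitz = [[fourier_coeff(j - k, t) for k in range(n)] for j in range(n)]
    f_plus = [fourier_coeff(j, t) for j in range(1, n + 1)]
    return solve(toeplitz, f_plus)[-1]

def recurrence_defect(n, t):
    """Left-hand side of (1.6); needs n >= 2 here so that U_{n-1} is defined by u()."""
    un = u(n, t)
    return n / t * un + (1.0 - un * un) * (u(n - 1, t) + u(n + 1, t))

if __name__ == "__main__":
    print(u(1, 0.5), u(2, 0.5), u(3, 0.5))
    print(recurrence_defect(2, 0.5), recurrence_defect(3, 0.7))
```

The defect is zero to rounding accuracy for the sampled values. The case $n = 1$ is left out on purpose, because it would require a convention for $U_0$ that is not stated in this part of the text.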


We shall see that this recurrence formula, sometimes known as the discrete Painlevé II equation (see, e.g. [6]), is equivalent to one first shown to hold for U(n) by V. Periwal and D. Shevitz [9]. It was rediscovered by M. Hisakado [4], who also derived an equation equivalent to (1.3) and observed that this was one of the Painlevé V equations which, by results of K. Okamoto [8], is reducible to Painlevé III. Carrying through the Okamoto procedure is what led to our $W_n$, although the proof we give is direct. Our derivations of (1.3) and (1.6) are different from those in [4] and perhaps more down-to-earth since we use only the simplest properties of Toeplitz matrices and some linear algebra. (They cannot be entirely unrelated, though, since the orthogonal polynomials which are central to the argument of [4] can be defined in terms of Toeplitz matrices.)

A remarkable connection between U(n) and combinatorics was discovered by Gessel [3]. Place the uniform measure on the symmetric group $S_N$, denote by $\ell_N(\sigma)$ the length of the longest increasing subsequence in $\sigma$, and define $f_{Nn}$ by
$$ \mathrm{Prob}\,(\ell_N(\sigma)\le n)=\frac{f_{Nn}}{N!}. $$
Then $D_n(t)$ is a generating function for the $f_{Nn}$.² In fact
$$ D_n(t)=\sum_{N\ge 0} f_{Nn}\,\frac{t^{2N}}{(N!)^2}. \tag{1.7} $$
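As a sanity check on (1.7), the following minimal Python sketch compares the Toeplitz determinant with a brute-force count of permutations. It assumes the symbol $f(z)=e^{t(z+z^{-1})}$, whose Fourier coefficients are $f_k=I_k(2t)$; the helper names (`bessel_coeff`, `toeplitz_det`, `count_f`, `truncated_series`) are illustrative and not taken from the paper, and the series is truncated at $N=7$, so agreement is only expected up to the small tail.

```python
from bisect import bisect_left
from itertools import permutations
from math import factorial


def bessel_coeff(k, t, terms=40):
    """f_k = I_k(2t) = sum_m t^(2m+|k|) / (m! (m+|k|)!)."""
    k = abs(k)
    return sum(t ** (2 * m + k) / (factorial(m) * factorial(m + k)) for m in range(terms))


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def toeplitz_det(n, t):
    """D_n(t) = det T_n(f), where T_n(f) has j,k entry f_{j-k}."""
    return det([[bessel_coeff(j - k, t) for k in range(n)] for j in range(n)])


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def count_f(N, n):
    """f_{Nn}: permutations of N letters with no increasing subsequence longer than n."""
    return sum(1 for p in permutations(range(N)) if lis_length(p) <= n)


def truncated_series(n, t, n_max=7):
    """Partial sum of the right side of (1.7) up to N = n_max."""
    return sum(count_f(N, n) * t ** (2 * N) / factorial(N) ** 2 for N in range(n_max + 1))


if __name__ == "__main__":
    t = 0.5
    for n in (1, 2, 3):
        print(n, toeplitz_det(n, t), truncated_series(n, t))
```

For $n=1$ every $f_{N1}$ equals 1 (only the decreasing permutation qualifies), so both sides reduce to $I_0(2t)$.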

Recently, E. Rains [10] gave an elegant proof that
$$ f_{Nn}=E_n\bigl(|\mathrm{tr}(U)|^{2N}\bigr), \tag{1.8} $$
which can be shown to be equivalent to (1.7) by a simple argument. Using the relationship (1.7) a sharp asymptotic result for the distribution function of the random variable $\ell_N(\sigma)$ was recently obtained by J. Baik, P. Deift and K. Johansson [1]. And at the same time they discovered yet another connection with Painlevé. Their main result, which was quite difficult, was an asymptotic formula for $D_n(t)$ which we now describe. Introduce another parameter $s$ and suppose that $n$ and $t$ are related by $n = 2t + s\,t^{1/3}$. Then as $t\to\infty$ with fixed $s$ one has
$$ \lim_{t\to\infty} e^{-t^2}\,D_{2t+s\,t^{1/3}}(t)=F(s). \tag{1.9} $$
Here $F$ is the distribution function defined by
$$ F(s)=\exp\left(-\int_s^\infty (x-s)\,q(x)^2\,dx\right), \tag{1.10} $$
where $q$ is the solution of the Painlevé II equation
$$ q''=s\,q+2\,q^3 \tag{1.11} $$

2 Gessel in [3] does not write down the symbol of the Toeplitz matrix, nor does he mention random matrices. But in light of the well-known formula (1.2) and the subsequent work of Odlyzko et al. [7] and Rains [10], we believe it is fair to say that the connection with random matrix theory begins with the discovery of (1.7).


satisfying $q(s)\sim \mathrm{Ai}(s)$ as $s\to\infty$. (For a proof that such a solution exists, see, e.g. [2].) Using a “de-Poissonization” lemma due to Johansson [5] these asymptotics for the generating function $D_n(t)$ led to the asymptotic formula
$$ \lim_{N\to\infty}\mathrm{Prob}\left(\frac{\ell_N(\sigma)-2\sqrt{N}}{N^{1/6}}\le s\right)=F(s) \tag{1.12} $$
for the distribution function of the normalized random variable $(\ell_N(\sigma)-2\sqrt{N})/N^{1/6}$. It is a remarkable fact that this same distribution function $F$ was first encountered in random matrix theory where it gives the limiting distribution for the normalized largest eigenvalue in the Gaussian Unitary Ensemble of Hermitian matrices. More precisely, we have for this ensemble [11],
$$ \lim_{N\to\infty}\mathrm{Prob}\left(\bigl(\lambda_{\max}(N)-\sqrt{2N}\bigr)\,\sqrt{2}\,N^{1/6}\le s\right)=F(s). $$

In connection with these results just described, we shall do two things. We show, first, how one might have guessed the asymptotics (1.12). More precisely, we present a simple argument that if there is any limit theorem of this type, with $F(s)$ some distribution function and with some power $N^{\alpha}$ replacing $N^{1/6}$, then necessarily $\alpha = 1/6$ and $F$ is given by (1.10) with $q$ a solution to (1.11). (The boundary condition on $q$, however, cannot be anticipated.) This conclusion is arrived at by considering the implications of (1.9) with $t^{1/3}$ replaced by $t^{2\alpha}$ for the recurrence formula (1.6).

Secondly, we derive analogues of (1.8) and (1.7) for the subgroup $O_N$ of “odd” permutations of $S_N$.³ These are described as follows: if $N = 2k$ think of $S_N$ as acting on the integers from $-k$ to $k$ excluding 0, and if $N = 2k + 1$ think of $S_N$ as acting on the integers from $-k$ to $k$ including 0. In both cases $\sigma\in S_N$ is called odd if $\sigma(-n) = -\sigma(n)$ for all $n$. The number of elements in the subgroup $O_N$ of odd permutations equals $2^k\,k!$ in both cases. Therefore if $b_{Nn}$ equals the number of permutations in $O_N$ having no increasing subsequence of length greater than $n$,
$$ \mathrm{Prob}\,(\ell_N(\sigma)\le n)=\frac{b_{Nn}}{2^k\,k!}, \tag{1.13} $$
where the uniform measure is placed on $O_N$. Rains [10] proved identities analogous to (1.8) for these probabilities. Using these we are able to find representations for the two generating functions
$$ G_n(t)=\sum_{k\ge 0} b_{2k\,n}\,\frac{t^{2k}}{(k!)^2},\qquad H_n(t)=\sum_{k\ge 0} b_{2k+1\,n}\,\frac{t^{2k}}{(k!)^2}, \tag{1.14} $$
analogous to the representation (1.7). (See Theorem 1 below.) The same determinants $D_n(t)$ arise as before but in the representation for $H_n(t)$, whose derivation uses the machinery developed in earlier sections, the quantities $U_n$ also appear. Once the representations are established we can use (1.9) and Johansson’s lemma to deduce the asymptotics of (1.13). We show that as $N\to\infty$ we have for fixed $s$,
$$ \mathrm{Prob}\left(\frac{\ell_N(\sigma)-2\sqrt{N}}{2^{2/3}\,N^{1/6}}\le s\right)\to F(s)^2, \tag{1.15} $$
where $F(s)$ is as in (1.12). In Table 1 we give some statistics of the distribution functions $F$ and $F_O := F^2$. In Fig. 1 we graph their densities.

³ Our terminology for $O_N$ is not standard. For $N = 2k$ one usually denotes $O_N$ by $B_k$, the hyperoctahedral group of order $2^k\,k!$ which is the centralizer of the reversal permutation in $S_N$. Elements of $B_k$ are commonly called signed permutations. Similar remarks hold for $N = 2k + 1$.

Table 1. The mean (µ), standard deviation (σ), skewness (S) and kurtosis (K) of $F$ and $F_O := F^2$

Distr    µ          σ        S       K
F        −1.77109   0.9018   0.224   0.093
F_O      −1.26332   0.7789   0.329   0.225

Fig. 1. The probability densities $f = dF/ds$ and $f_O = dF_O/ds$

2. The Integral Representation for $D_n$

We write $\Lambda = T_n(z^{-1})$, $\Lambda' = T_n(z)$. Thus $\Lambda$ is the backward shift and $\Lambda'$ is the forward shift. It is easy to see that
$$ T_n(z^{-1}f)=T_n(f)\,\Lambda+f^+\otimes\delta^+=\Lambda\,T_n(f)+\delta^-\otimes f^-, \tag{2.1} $$
$$ T_n(z\,f)=T_n(f)\,\Lambda'+f^-\otimes\delta^-=\Lambda'\,T_n(f)+\delta^+\otimes f^+, \tag{2.2} $$
where $\delta^\pm$ and $f^\pm$ were defined above and $a\otimes b$ denotes the matrix with $j,k$ entry $a_j b_k$. Relation (2.1) holds for any $f$ but (2.2) uses the fact that $f_{-k} = f_k$.

To derive (1.4) we temporarily reintroduce variables $r$ and $s$ and set $f(z) = e^{rz+sz^{-1}}$, so $D_n$ and $T_n$ are functions of $r$ and $s$. Of course we are interested in $D_n(t,t)$. We shall compute
$$ \partial^2_{r,s}\log D_n(r,s)\Big|_{r=s=t} $$
in two different ways. Using the fact that
$$ \partial_s\log D_n(r,s)=\mathrm{tr}\,T_n(f)^{-1}T_n(\partial_s f)=\mathrm{tr}\,T_n(f)^{-1}T_n(z^{-1}f), $$
then differentiating with respect to $r$, we find that
$$ \partial^2_{r,s}\log D_n(r,s)=\mathrm{tr}\,\bigl[T_n(f)^{-1}T_n(f)-T_n(f)^{-1}T_n(z f)\,T_n(f)^{-1}T_n(z^{-1}f)\bigr]=\mathrm{tr}\,\bigl[I-T_n(f)^{-1}T_n(z f)\,T_n(f)^{-1}T_n(z^{-1}f)\bigr]. $$
We now set $r = s = t$. Since $f$ is now as it was, we can use (2.1) and (2.2). If we multiply their first equalities on the left by $T_n(f)^{-1}$ and use the notation $u^\pm = T_n(f)^{-1}f^\pm$ we obtain
$$ T_n(f)^{-1}T_n(z^{-1}f)=\Lambda+u^+\otimes\delta^+,\qquad T_n(f)^{-1}T_n(z f)=\Lambda'+u^-\otimes\delta^-. $$
Hence the last trace equals that of
$$ I-(\Lambda'+u^-\otimes\delta^-)(\Lambda+u^+\otimes\delta^+)=I-\Lambda'\Lambda-u^-\otimes\Lambda'\delta^--\Lambda'u^+\otimes\delta^+-(\delta^-,u^+)\,u^-\otimes\delta^+. $$
The trace of $I-\Lambda'\Lambda$ equals 1, and $\Lambda'\delta^- = 0$, $\mathrm{tr}\,\Lambda'u^+\otimes\delta^+ = (\Lambda'u^+,\delta^+) = (u^+,\Lambda\delta^+) = 0$, so we have
$$ \partial^2_{r,s}\log D_n(r,s)\Big|_{r=s=t}=1-(\delta^-,u^+)\,(u^-,\delta^+). $$
But $(u^-,\delta^+) = (\delta^-,u^+) = (\delta^-,T_n(f)^{-1}f^+) = U_n$. Therefore
$$ \partial^2_{r,s}\log D_n(r,s)\Big|_{r=s=t}=1-U_n^2=\Phi_n. \tag{2.3} $$

Now let us go back to general $r$ and $s$. For any $p > 0$ Cauchy’s theorem tells us that the $j,k$ entry of $T_n(f)$ equals
$$ \frac{1}{2\pi i}\int_{|z|=p} e^{rz+sz^{-1}}\,z^{-(j-k+1)}\,dz=p^{-j}\left(\frac{1}{2\pi i}\int_{|z|=1} e^{prz+p^{-1}sz^{-1}}\,z^{-(j-k+1)}\,dz\right)p^{k}. $$
It follows that $D_n(r,s) = D_n(pr, s/p)$, and by analytic continuation this holds for any (complex) $p$. Setting $p = \sqrt{s/r}$ we see that
$$ D_n(r,s)=D_n(\sqrt{rs},\sqrt{rs}). $$
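The scaling invariance $D_n(r,s)=D_n(pr,s/p)$ is easy to test numerically. The sketch below is only an illustration: it builds the (non-symmetric) Toeplitz matrix from the Laurent coefficients of $e^{rz+sz^{-1}}$, obtained by multiplying the two exponential series, and the function names are hypothetical.

```python
from math import factorial


def fourier_coeff(k, r, s, terms=40):
    """Coefficient of z^k in exp(r z + s / z), from the product of the two series."""
    if k >= 0:
        return sum(r ** (m + k) * s ** m / (factorial(m) * factorial(m + k)) for m in range(terms))
    return sum(r ** m * s ** (m - k) / (factorial(m) * factorial(m - k)) for m in range(terms))


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n):
                a[row][c] -= factor * a[col][c]
    return result


def toeplitz_det(n, r, s):
    """D_n(r, s) = det of the n x n matrix with j,k entry f_{j-k}."""
    return det([[fourier_coeff(j - k, r, s) for k in range(n)] for j in range(n)])


if __name__ == "__main__":
    # Three parameter pairs with the same product r*s = 0.36.
    for r, s in ((0.6, 0.6), (0.3, 1.2), (1.2, 0.3)):
        print(r, s, toeplitz_det(3, r, s))
```

The three printed determinants should coincide up to rounding, since all three pairs share the product $rs$.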


It follows that $D_n(r,s)$ is a function of the product $rs$, as stated in the Introduction, and that
$$ 4\,\partial^2_{r,s}\log D_n(r,s)\Big|_{r=s=t}=\frac{d^2}{dt^2}\log D_n(t,t)+\frac{1}{t}\,\frac{d}{dt}\log D_n(t,t). $$
Comparing this with (2.3) we see that we have shown
$$ \frac{d^2}{dt^2}\log D_n(t,t)+\frac{1}{t}\,\frac{d}{dt}\log D_n(t,t)=4\,\Phi_n. $$
This gives the representation (1.4). Of course it remains to show that this $\Phi_n$ satisfies (1.3). We do this by first finding a formula for $dU_n/dt$ and then finding relations among the various quantities which occur for different values of $n$.

3. Differentiation

In addition to $u^\pm = T_n(f)^{-1}f^\pm$, we introduce $v^\pm = T_n(f)^{-1}\delta^\pm$ and we compute some derivatives with respect to $t$. First,
$$ \frac{d}{dt}\,T_n(f)=T_n((z+z^{-1})f)=T_n(f)\,(\Lambda+\Lambda')+f^+\otimes\delta^++f^-\otimes\delta^- $$
by the first equalities of (2.1) and (2.2), so
$$ \frac{d}{dt}\,T_n(f)^{-1}=-T_n(f)^{-1}\,\frac{dT_n(f)}{dt}\,T_n(f)^{-1}=-(\Lambda+\Lambda')\,T_n(f)^{-1}-u^+\otimes v^+-u^-\otimes v^-. \tag{3.1} $$

Next,
$$ \frac{df^+}{dt}=\begin{pmatrix} f_0\\ f_1\\ \vdots\\ f_{n-2}\\ f_{n-1}\end{pmatrix}+\begin{pmatrix} f_2\\ f_3\\ \vdots\\ f_n\\ f_{n+1}\end{pmatrix}=T_n(f)\,\delta^++\Lambda f^++f_{n+1}\,\delta^-. $$
Hence
$$ T_n(f)^{-1}\,\frac{df^+}{dt}=\delta^++T_n(f)^{-1}\Lambda f^++f_{n+1}\,v^-. \tag{3.2} $$

Multiplying the second equality of (2.1) left and right by $T_n(f)^{-1}$ gives
$$ T_n(f)^{-1}\Lambda=\Lambda\,T_n(f)^{-1}+u^+\otimes v^+-v^-\otimes u^-. $$
Therefore
$$ T_n(f)^{-1}\Lambda f^+=\Lambda u^++u^+\,(v^+,f^+)-v^-\,(u^-,f^+), \tag{3.3} $$
and substituting this into (3.2) gives
$$ T_n(f)^{-1}\,\frac{df^+}{dt}=\delta^++\Lambda u^++u^+\,(v^+,f^+)+v^-\,\bigl(f_{n+1}-(u^-,f^+)\bigr). $$
Adding this to (3.1) applied to $f^+$ gives
$$ \frac{du^+}{dt}=\delta^+-\Lambda'u^++v^-\,\bigl(f_{n+1}-(u^-,f^+)\bigr)-u^-\,(v^-,f^+). $$
Taking inner products with $\delta^-$ in the last displayed formula we obtain (recall the definition of $U_n$)
$$ \frac{dU_n}{dt}=-(\Lambda'u^+,\delta^-)+(v^+,\delta^+)\,\bigl(f_{n+1}-(u^-,f^+)\bigr)-(u^+,\delta^+)\,(v^+,f^-). $$
We used the fact, which follows from the symmetry of $T_n(f)$, that all our inner products whose entries have signs as superscripts are unchanged if both signs are reversed. To find $(\Lambda'u^+,\delta^-)$, which is the same as $(\Lambda u^-,\delta^+)$, we observe that
$$ \Lambda f^-=\begin{pmatrix} f_{n-1}\\ f_{n-2}\\ \vdots\\ f_1\\ 0\end{pmatrix}=T_n(f)\,\delta^--f_0\,\delta^-. $$
Applying (3.3) to $f^-$ therefore gives
$$ \delta^--f_0\,v^-=\Lambda u^-+u^+\,(v^+,f^-)-v^-\,(u^-,f^-). $$
Hence
$$ -(\Lambda'u^+,\delta^-)=-(\Lambda u^-,\delta^+)=(u^+,\delta^+)\,(v^+,f^-)+(v^-,\delta^+)\,\bigl(f_0-(u^+,f^+)\bigr). $$
Thus we have established the differentiation formula
$$ \frac{dU_n}{dt}=(v^-,\delta^+)\,\bigl(f_0-(u^+,f^+)\bigr)+(v^+,\delta^+)\,\bigl(f_{n+1}-(u^-,f^+)\bigr). \tag{3.4} $$

4. Relations

New quantities appearing in the differentiation formula are $V_n^\pm = (v^\pm,\delta^+)$. There are others but we shall see that they may be expressed in terms of these (with different values of $n$), as indeed so will $U_n$. We obtain our relations through several applications of the following formula for the inverse of a 2 × 2 block matrix:
$$ \begin{pmatrix} A & B\\ C & D\end{pmatrix}^{-1}=\begin{pmatrix} (A-BD^{-1}C)^{-1} & \times\\ \times & \times\end{pmatrix}. \tag{4.1} $$
Here we assume $A$ and $D$ are square and the various inverses exist. Only one block of the inverse is displayed and the formula shows that $A - BD^{-1}C$ equals the inverse of this block of the inverse matrix. At first all that will be used about $f$ is that $T_n(f)$ is symmetric. (There are modifications which hold in general.)

We apply (4.1) first to the $(n+1)\times(n+1)$ matrix
$$ \begin{pmatrix} 0 & 0 & \cdots & 1\\ f_1 & f_0 & \cdots & f_{n-1}\\ \vdots & \vdots & \cdots & \vdots\\ f_n & f_{n-1} & \cdots & f_0\end{pmatrix}, $$
with $A = (0)$, $D = T_n(f)$, $B = (0\ \cdots\ 0\ 1)$, $C = f^+$. In this case
$$ A-BD^{-1}C=-(T_n(f)^{-1}f^+,\delta^-)=-U_n. $$
This equals the reciprocal of the upper-left entry of the inverse matrix, which in turn equals $(-1)^n$ times the lower-left $n\times n$ subdeterminant divided by $D_n$. Replacing the first row by $(f_0\ f_1\ \cdots\ f_n)$ gives the matrix
$$ \begin{pmatrix} f_0 & f_1 & \cdots & f_n\\ f_1 & f_0 & \cdots & f_{n-1}\\ \vdots & \vdots & \cdots & \vdots\\ f_n & f_{n-1} & \cdots & f_0\end{pmatrix}=T_{n+1}(f). $$
The upper-right entry of its inverse equals on the one hand $V^-_{n+1}$ and on the other hand $(-1)^n$ times the same subdeterminant as arose above divided by $D_{n+1}$. This gives the identity
$$ -U_n=V^-_{n+1}\,\frac{D_{n+1}}{D_n}=\frac{V^-_{n+1}}{V^+_{n+1}}. \tag{4.2} $$
(If we consider the polynomials on the circle which are orthonormal with respect to the weight function $f$ then the right side above is equal to the constant term divided by the highest coefficient in the polynomial of degree $n$. Therefore our $-U_n$ equals the $S_{n-1}$ of [4].)

If we now take $A$ to be the upper-left corner of $T_{n+1}(f)$ and $D$ the complementary $T_n(f)$, then $C = f^+$ and $B$ is its transpose, and we deduce that
$$ f_0-(u^+,f^+)=\frac{1}{V^+_{n+1}}. \tag{4.3} $$
To evaluate $f_{n+1}-(u^-,f^+)$, the other odd coefficient appearing in (3.4), we consider
$$ \begin{pmatrix} f_0 & f_1 & \cdots & f_n & f_{n+1}\\ f_1 & f_0 & \cdots & f_{n-1} & f_n\\ \vdots & \vdots & \cdots & \vdots & \vdots\\ f_n & f_{n-1} & \cdots & f_0 & f_1\\ f_{n+1} & f_n & \cdots & f_1 & f_0\end{pmatrix}=T_{n+2}(f). $$

We apply to this an obvious modification of (4.1), where $A$ is the 2 × 2 matrix consisting of the four corners of the large matrix, $D$ is the central $T_n(f)$, $C$ consists of the two columns $f^+$ and $f^-$ and $B$ consists of the rows which are their transposes. Then
$$ A-BD^{-1}C=\begin{pmatrix} f_0-(u^+,f^+) & f_{n+1}-(u^-,f^+)\\ f_{n+1}-(u^-,f^+) & f_0-(u^+,f^+)\end{pmatrix}, $$
and our formula tells us that this is the inverse of
$$ \begin{pmatrix} V^+_{n+2} & V^-_{n+2}\\ V^-_{n+2} & V^+_{n+2}\end{pmatrix}. $$
This gives the two formulas
$$ f_0-(u^+,f^+)=\frac{V^+_{n+2}}{(V^+_{n+2})^2-(V^-_{n+2})^2},\qquad f_{n+1}-(u^-,f^+)=\frac{-V^-_{n+2}}{(V^+_{n+2})^2-(V^-_{n+2})^2}. $$
Comparing the first with (4.3) we see that
$$ (V^+_{n+2})^2-(V^-_{n+2})^2=V^+_{n+1}\,V^+_{n+2}, \tag{4.4} $$
and therefore that the preceding relations can be written
$$ f_0-(u^+,f^+)=\frac{1}{V^+_{n+1}},\qquad f_{n+1}-(u^-,f^+)=-\frac{1}{V^+_{n+1}}\,\frac{V^-_{n+2}}{V^+_{n+2}}. \tag{4.5} $$
Notice that (4.2) and (4.4) give
$$ 1-U_n^2=\frac{V^+_n}{V^+_{n+1}}. \tag{4.6} $$

This is our $\Phi_n$. The relations we obtained so far in this section are completely general. The recurrence (1.6), however, depends on our specific function $f$. Integration by parts gives
$$ k\,f_k=\frac{t}{2\pi i}\int (z-z^{-1})\,e^{t(z+z^{-1})}\,z^{-k-1}\,dz=t\,(f_{k-1}-f_{k+1}). \tag{4.7} $$
Hence if $M = \mathrm{diag}\,(1\ 2\ \cdots\ n)$ we have
$$ M\,T_n(f)-T_n(f)\,M=t\,T_n((z-z^{-1})\,f), $$
and by the first identities of (2.1) and (2.2) this equals
$$ t\,\bigl[T_n(f)\,(\Lambda'-\Lambda)+f^-\otimes\delta^--f^+\otimes\delta^+\bigr], $$
so
$$ T_n(f)^{-1}M-M\,T_n(f)^{-1}=t\,\bigl[(\Lambda'-\Lambda)\,T_n(f)^{-1}+u^-\otimes v^--u^+\otimes v^+\bigr]. $$
Applying this to $\delta^-$ gives
$$ n\,v^--M\,v^-=t\,\bigl[(\Lambda'-\Lambda)\,v^-+u^-\,(v^-,\delta^-)-u^+\,(v^+,\delta^-)\bigr]. \tag{4.8} $$
Now (4.7) says
$$ M f^+=t\begin{pmatrix} f_0-f_2\\ f_1-f_3\\ \vdots\\ f_{n-2}-f_n\\ f_{n-1}-f_{n+1}\end{pmatrix} $$
whereas (this is relevant since the transpose of $\Lambda'-\Lambda$ is $\Lambda-\Lambda'$)
$$ (\Lambda-\Lambda')\,f^+=\begin{pmatrix} f_2\\ f_3-f_1\\ \vdots\\ f_n-f_{n-2}\\ -f_{n-1}\end{pmatrix}. $$
Therefore
$$ M f^++t\,(\Lambda-\Lambda')\,f^+=t\,(f_0\,\delta^+-f_{n+1}\,\delta^-), $$
and so taking inner products with $f^+$ in (4.8) gives
$$ \frac{n}{t}\,(v^-,f^+)=f_0\,(v^-,\delta^+)-f_{n+1}\,(v^-,\delta^-)+(u^-,f^+)\,(v^-,\delta^-)-(u^+,f^+)\,(v^+,\delta^-), $$
or equivalently, since $(v^-,f^+) = U_n$,
$$ \frac{n}{t}\,U_n=\bigl(f_0-(u^+,f^+)\bigr)\,V_n^--\bigl(f_{n+1}-(u^-,f^+)\bigr)\,V_n^+. $$
Using (4.5) we rewrite this as
$$ \frac{n}{t}\,U_n=\frac{V_n^-}{V^+_{n+1}}+\frac{V_n^+}{V^+_{n+1}}\,\frac{V^-_{n+2}}{V^+_{n+2}}=\frac{V_n^+}{V^+_{n+1}}\left(\frac{V_n^-}{V_n^+}+\frac{V^-_{n+2}}{V^+_{n+2}}\right). $$
Using this, (4.6) and (4.2) we arrive at (1.6).
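The recurrence (1.6) can be checked numerically straight from the definition $U_n=(\delta^-,T_n(f)^{-1}f^+)$. The sketch below is an illustration only: it assumes $f_k=I_k(2t)$ and $f^+=(f_1,\dots,f_n)$, uses a small hand-written linear solver, and all function names are hypothetical.

```python
from math import factorial


def bessel_coeff(k, t, terms=40):
    """f_k = I_k(2t), the k-th Fourier coefficient of exp(t (z + 1/z))."""
    k = abs(k)
    return sum(t ** (2 * m + k) / (factorial(m) * factorial(m + k)) for m in range(terms))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def U(n, t):
    """U_n = last component of T_n(f)^{-1} f^+, with f^+ = (f_1, ..., f_n)."""
    toeplitz = [[bessel_coeff(j - k, t) for k in range(n)] for j in range(n)]
    f_plus = [bessel_coeff(k, t) for k in range(1, n + 1)]
    return solve(toeplitz, f_plus)[-1]


def recurrence_residual(n, t):
    """Left side of (1.6); it should vanish up to rounding for n >= 2."""
    u = U(n, t)
    return n / t * u + (1.0 - u * u) * (U(n - 1, t) + U(n + 1, t))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, U(n, 0.7), recurrence_residual(n, 0.7))
```

The check starts at $n=2$ so that only $U_1$, $U_2$, ... are needed and no convention for $U_0$ has to be fixed.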

5. Painlevé V and Painlevé III

We first show that $\Phi_n$ satisfies (1.3). Our formula (3.4) for $dU_n/dt$ can now be written
$$ \frac{dU_n}{dt}=\frac{V_n^-}{V^+_{n+1}}-\frac{V_n^+}{V^+_{n+1}}\,\frac{V^-_{n+2}}{V^+_{n+2}}=\frac{V_n^+}{V^+_{n+1}}\left(\frac{V_n^-}{V_n^+}-\frac{V^-_{n+2}}{V^+_{n+2}}\right)=-(1-U_n^2)\,(U_{n-1}-U_{n+1}), \tag{5.1} $$
by (4.5), (4.6) and (4.2). Adding and subtracting (1.6) gives us the two formulas
$$ \frac{dU_n}{dt}=\frac{n}{t}\,U_n+2\,U_{n+1}\,(1-U_n^2), \tag{5.2} $$
$$ \frac{dU_n}{dt}=-\frac{n}{t}\,U_n-2\,U_{n-1}\,(1-U_n^2). \tag{5.3} $$
These are Eqs. (4.5) and (4.6) of [4]. As was done there, we solve (5.2) for $U_{n+1}$ in terms of $U_n$ and $dU_n/dt$ and substitute this into (5.3) with $n$ replaced by $n + 1$. We get a second-order differential equation for $U_n$ which is equivalent to Eq. (1.3) for $\Phi_n = 1 - U_n^2$.

Next we show that $W_n = U_n/U_{n-1}$ satisfies (1.5). In computing the derivative of $W_n$ we use (5.3) to compute the derivative of $U_n$ and (5.2) with $n$ replaced by $n - 1$ to compute the derivative of $U_{n-1}$. We get
$$ W_n'=-\frac{2n-1}{t}\,W_n-2+4\,U_n^2-2\,W_n^2. \tag{5.4} $$
Using (5.3) once again we compute
$$ (U_n^2)'=2\,U_n\left(-\frac{n}{t}\,U_n-2\,U_{n-1}\,(1-U_n^2)\right)=-\frac{2n}{t}\,U_n^2-4\,\frac{U_n^2\,(1-U_n^2)}{W_n}. $$
Differentiating (5.4) and using this expression for $(U_n^2)'$ we obtain a formula for $W_n''$ in terms of $W_n$, $W_n'$ and $U_n^2$. Then we solve (5.4) for $U_n^2$ in terms of $W_n$ and $W_n'$. Substituting this into the formula for $W_n''$ gives (1.5).

In order to specify the solutions of the Eqs. (1.3) and (1.5) we must determine the initial conditions at $t = 0$. Clearly $\Phi_n(0) = 1$, but this does not determine $\Phi_n$ uniquely. One can see that $\Phi_n^{(k)}(0) = 0$ for $0 < k < 2n$ and that what determines $\Phi_n$ uniquely is $\Phi_n^{(2n)}(0)$. We shall show that
$$ \Phi_n^{(2n)}(0)=-\frac{(2n)!}{n!^2}. \tag{5.5} $$
By (4.2), $U_n = -V^-_{n+1}/V^+_{n+1}$. Now $V^+_{n+1}$ is the upper-left corner of $T_{n+1}(f)^{-1}$ and so tends to 1 as $t\to 0$. So let us see how $V^-_{n+1}$, which is the upper-right corner of $T_{n+1}(f)^{-1}$, behaves. More exactly, let us find the term in its expansion with the lowest power of $t$.

We have
$$ e^{2t\cos\theta}=\sum_k\frac{t^k}{k!}\,(e^{i\theta}+e^{-i\theta})^k=\sum_k\frac{t^k}{k!}\sum_{0\le j\le k}C(k,j)\,e^{-i(2j-k)\theta}=\sum_k\frac{t^k}{k!}\sum_{|j|\le k}C(k,(j+k)/2)\,e^{-ij\theta}. $$
This gives
$$ T_n(f)=\sum_k\frac{t^k}{k!}\sum_{|j|\le k}C(k,(j+k)/2)\,\Lambda^j=I+\sum_{k>0}\frac{t^k}{k!}\sum_{|j|\le k}C(k,(j+k)/2)\,\Lambda^j. \tag{5.6} $$
(Here $\Lambda^j$ denotes the usual power when $j\ge 0$, but when $j < 0$ it denotes $\Lambda'^{\,|j|}$.) We use the Neumann expansion
$$ T_n(f)^{-1}=I+\sum_{l\ge 1}(-1)^l\left(\sum_{k>0}\frac{t^k}{k!}\sum_{|j|\le k}C(k,(j+k)/2)\,\Lambda^j\right)^{l}. $$
If we expand this out we get a sum of terms of the form coefficient times $t^{k_1+\cdots+k_l}\,\Lambda^{j_1}\cdots\Lambda^{j_l}$. Now the product $\Lambda^{j_1}\cdots\Lambda^{j_l}$ can only have a nonzero upper-right entry when $j_1+\cdots+j_l\ge n$. Since each $|j_i|\le k_i$, the power of $t$ must be at least $n$, and this power occurs only when each $j_i = k_i$. That means that we get the same lowest power of $t$ term for the upper-right entry if in (5.6) we only take the terms with $j = k$, in other words if we replace $T_{n+1}(f)$ by
$$ \sum_{k\ge 0}\frac{t^k}{k!}\,\Lambda^k=e^{t\Lambda}. $$
The inverse of this operator is $e^{-t\Lambda}$ and the upper-right corner of this matrix is exactly $(-1)^n\,t^n/n!$. This shows that
$$ U_n=-V^-_{n+1}/V^+_{n+1}=(-1)^{n+1}\,t^n/n!+O(t^{n+1}), $$
and so
$$ \Phi_n=1-U_n^2=1-\frac{t^{2n}}{(n!)^2}+O(t^{2n+1}), $$
which gives (5.5). We also see that $W_n = U_n/U_{n-1}$ satisfies the initial condition
$$ W_n(t)=-\frac{t}{n}+O(t^2). $$
Using the differential equation (1.5) together with this initial condition we find
$$ W_n(t)=-\frac{t}{n}-\frac{t^3}{n^2(n+1)}-\frac{2\,t^5}{n^3(n+1)(n+2)}-\frac{(5n+6)\,t^7}{n^4(n+1)^2(n+2)(n+3)}+O(t^9). $$
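The small-$t$ expansion of $W_n$ can be compared with a direct numerical evaluation of $U_n/U_{n-1}$. The following sketch is only a sanity check under stated assumptions: $U_n$ is computed as the last component of $T_n(f)^{-1}f^+$ with $f_k=I_k(2t)$, the comparison is made for the single illustrative choice $n=4$ and $t=0.2$, and the helper names are hypothetical.

```python
from math import factorial


def bessel_coeff(k, t, terms=40):
    """f_k = I_k(2t), the k-th Fourier coefficient of exp(t (z + 1/z))."""
    k = abs(k)
    return sum(t ** (2 * m + k) / (factorial(m) * factorial(m + k)) for m in range(terms))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def U(n, t):
    """U_n = last component of T_n(f)^{-1} f^+, with f^+ = (f_1, ..., f_n)."""
    toeplitz = [[bessel_coeff(j - k, t) for k in range(n)] for j in range(n)]
    f_plus = [bessel_coeff(k, t) for k in range(1, n + 1)]
    return solve(toeplitz, f_plus)[-1]


def W_numeric(n, t):
    """W_n = U_n / U_{n-1}, evaluated directly (requires n >= 2)."""
    return U(n, t) / U(n - 1, t)


def W_series(n, t):
    """The expansion of W_n through the t^7 term, as displayed in the text."""
    return (-t / n
            - t ** 3 / (n ** 2 * (n + 1))
            - 2 * t ** 5 / (n ** 3 * (n + 1) * (n + 2))
            - (5 * n + 6) * t ** 7 / (n ** 4 * (n + 1) ** 2 * (n + 2) * (n + 3)))


if __name__ == "__main__":
    n, t = 4, 0.2
    print(W_numeric(n, t), W_series(n, t))
```

The two printed numbers should differ only by the neglected $O(t^9)$ remainder.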

6. Painlevé II

We present a heuristic argument that if there is any limit theorem of the type (1.12), with some distribution function $F(s)$ and some power $N^{\alpha}$ replacing $N^{1/6}$, then necessarily $\alpha = 1/6$ and $F$ is given by (1.10). First we note that Johansson’s lemma (which we shall state in the next section) leads from (1.12) to (1.9) with the power $t^{1/3}$ replaced by $t^{2\alpha}$. We assume that $F$ is smooth and that the limit in (1.9) commutes with $d/ds$, so that taking the second logarithmic derivative gives
$$ \lim_{t\to\infty}\frac{d^2}{ds^2}\log D_{2t+s\,t^{2\alpha}}=-q(s)^2, $$
where $q^2$ is now defined by $-q^2 = (\log F)''$ and $q$ is defined to be the positive square root of $q^2$ (for large $s$). Since changing $n = 2t + s\,t^{2\alpha}$ by 1 is the same as changing $s$ by $t^{-2\alpha}$, we have the large $t$ asymptotics
$$ \log D_{n+1}+\log D_{n-1}-2\log D_n\sim t^{-4\alpha}\,\frac{d^2}{ds^2}\log D_{2t+s\,t^{2\alpha}}\sim -t^{-4\alpha}\,q(s)^2. $$
On the other hand $V_n^+ = D_{n-1}/D_n$ and so
$$ \frac{D_{n+1}\,D_{n-1}}{D_n^2}=\frac{V_n^+}{V^+_{n+1}}=1-U_n^2, \tag{6.1} $$
by (4.6). We deduce
$$ \log\,(1-U_n^2)\sim -t^{-4\alpha}\,q(s)^2,\qquad U_n^2\sim t^{-4\alpha}\,q(s)^2. \tag{6.2} $$
Now the $U_n$ are of variable sign, as is clear from (1.6). Let us consider those $n$ going to infinity such that
$$ U_{n-1}\ge 0,\quad U_n\le 0,\quad U_{n+1}\ge 0, \tag{6.3} $$
and write (1.6) as
$$ t\,(U_{n+1}+U_{n-1}+2\,U_n)\,(1-U_n^2)=-(n-2t)\,U_n-2t\,U_n^3. \tag{6.4} $$
Because of (6.3), (6.2) and the fact that changing $n = 2t + s\,t^{2\alpha}$ by 1 is the same as changing $s$ by $t^{-2\alpha}$, we have when $t$ is large, $U_{n+1}+U_{n-1}+2\,U_n\sim t^{-6\alpha}\,q''(s)$. Since also $n-2t\sim s\,t^{2\alpha}$, $U_n\sim -t^{-2\alpha}\,q(s)$, (6.4) becomes the approximation
$$ t^{1-6\alpha}\,q''(s)\approx s\,q(s)+2\,t^{1-6\alpha}\,q(s)^3. \tag{6.5} $$

Let us show that $\alpha = 1/6$. If $\alpha > 1/6$ then letting $t\to\infty$ in (6.5) gives $q(s) = 0$ and so $F$ is the exponential of a linear function and therefore not a distribution function. If $\alpha < 1/6$ then dividing by $t^{1-6\alpha}$ and letting $t\to\infty$ in (6.5) gives $q''(s) = 2\,q^3$. Solving this gives two sets of solutions
$$ s=\pm\int_q^{q_0}\frac{dq}{\sqrt{q^4+c_1}}+c_2, $$
where $q_0$, $c_1$ and $c_2$ can be arbitrary. Now $q$ is small when $s$ is large and positive, so $s$ is large and positive when $q$ is small. Therefore we have to have the + sign and we must have $c_1 = 0$. Then $F(s)$ is of the form $|s-c|^{-1}$ times the exponential of a linear function and therefore is not a distribution function. The only remaining case is $\alpha = 1/6$, and then (6.5) becomes (1.11). It follows that $F(s)$ must be given by (1.10) times the exponential of a linear function. This extra factor must be 1 since (1.10) is already a distribution function.

Now to derive this we assumed that the $n$ under consideration were such that (6.3) held. We would have reached the same conclusion if all the inequalities were reversed. If $n\to\infty$ in such a way that, say, $U_{n-1}$ and $U_n$ have one sign and $U_{n+1}$ the other, then (the reader can check this) we would have reached the conclusion $q = 0$. Thus the only possibility for $q$ to give a distribution function occurs when $\alpha = 1/6$ and $q$ satisfies (1.11).

7. Odd Permutations

Recall that $b_{Nn}$ equals the number of permutations in $O_N$ having no increasing subsequence of length greater than $n$. The representations of Rains [10] for these quantities are
$$ b_{2k\,n}=E_n\bigl(|\mathrm{tr}\,(U^2)^k|^2\bigr), \tag{7.1} $$
$$ b_{2k+1\,n}=E_n\bigl(|\mathrm{tr}\,(U^2)^k\,\mathrm{tr}\,(U)|^2\bigr). \tag{7.2} $$

Theorem 1. Let $G_n(t)$ and $H_n(t)$ be the generating functions defined in (1.14). Then
$$ G_n(t)=\begin{cases} D_{n/2}(t)^2, & n \text{ even},\\ D_{(n-1)/2}(t)\,D_{(n+1)/2}(t), & n \text{ odd},\end{cases} \tag{7.3} $$
$$ H_n(t)=\begin{cases} D_{n/2-1}(t)\,D_{n/2+1}(t), & n \text{ even},\\ D_{(n-1)/2}(t)\,D_{(n+1)/2}(t), & n \text{ odd}.\end{cases} \tag{7.4} $$
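For small $N$ the numbers $b_{Nn}$ can be found by brute force, which gives a quick consistency check on Theorem 1. The sketch below is illustrative only: it enumerates odd (signed) permutations directly from the definition $\sigma(-m)=-\sigma(m)$, and the function names are hypothetical. For $n=2$ the theorem gives $G_2=D_1^2=I_0(2t)^2$, whose coefficients correspond to $b_{2k\,2}=\binom{2k}{k}$; the same brute force also illustrates the inequality $b_{N+2\,n}\le(2k+2)\,b_{Nn}$ used later in this section.

```python
from bisect import bisect_left
from itertools import permutations, product
from math import comb


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def odd_permutations(N):
    """Yield each odd permutation of {-k, ..., k} (0 included iff N is odd),
    written as the list of values sigma(-k), ..., sigma(k)."""
    k = N // 2
    for images in permutations(range(1, k + 1)):
        for signs in product((1, -1), repeat=k):
            positive = [s * v for s, v in zip(signs, images)]  # sigma(1), ..., sigma(k)
            negative = [-v for v in reversed(positive)]        # sigma(-k), ..., sigma(-1)
            middle = [0] if N % 2 else []                      # sigma(0) = 0 when N is odd
            yield negative + middle + positive


def count_b(N, n):
    """b_{Nn}: odd permutations with no increasing subsequence longer than n."""
    return sum(1 for seq in odd_permutations(N) if lis_length(seq) <= n)


if __name__ == "__main__":
    for k in range(4):
        print(2 * k, count_b(2 * k, 2), comb(2 * k, k))
```

Each printed row lists $N=2k$, the brute-force count $b_{2k\,2}$ and the binomial coefficient $\binom{2k}{k}$ for comparison.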

We prove a lemma which gives a preliminary representation for the generating functions in terms of other Toeplitz determinants. Let
$$ g(z,t_1,t_2)=g(z)=e^{t_1(z+z^{-1})+t_2(z^2+z^{-2})} $$
and define
$$ \hat D_n(t_1,t_2)=\hat D_n=\det T_n(g). $$

Lemma. We have
$$ G_n(t)=\hat D_n(0,t), \tag{7.5} $$
$$ H_n(t)=\frac{1}{4}\,\frac{\partial^2\hat D_n}{\partial t_1^2}(0,t)+\frac{1}{4}\,\frac{\partial^2\hat D_n}{\partial t_1^2}(0,-t). \tag{7.6} $$

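The proof of (7.5) is essentially the same as the proof that (1.7) and (1.8) are equivalent. First observe that
$$ E_n\Bigl(\bigl(\mathrm{tr}(U^2)+\overline{\mathrm{tr}(U^2)}\bigr)^{2k}\Bigr)=\sum_{m=0}^{2k}\binom{2k}{m}\,E_n\Bigl(\mathrm{tr}(U^2)^m\,\overline{\mathrm{tr}(U^2)}^{\,2k-m}\Bigr). $$
Each summand with $m\ne k$ vanishes since by the invariance of the Haar measure replacing each $U$ by $\zeta U$, with $\zeta$ a complex number of absolute value 1, does not change the summand but at the same time multiplies it by $\zeta^{4m-4k}$. Thus,
$$ E_n\Bigl(\bigl(\mathrm{tr}(U^2)+\overline{\mathrm{tr}(U^2)}\bigr)^{2k}\Bigr)=\binom{2k}{k}\,E_n\Bigl(\mathrm{tr}(U^2)^k\,\overline{\mathrm{tr}(U^2)}^{\,k}\Bigr). $$
Hence (7.1) is equivalent to
$$ b_{2k\,n}=\frac{(k!)^2}{(2k)!}\,E_n\Bigl(\bigl(\mathrm{tr}(U^2)+\overline{\mathrm{tr}(U^2)}\bigr)^{2k}\Bigr). $$
Therefore if the eigenvalues of $U$ are $e^{i\theta_1},\cdots,e^{i\theta_n}$, we have
$$ G_n(t)=\sum_{k\ge 0}b_{2k\,n}\,\frac{t^{2k}}{(k!)^2}=E_n\Bigl(\sum_{k\ge 0}\frac{(2t)^{2k}}{(2k)!}\Bigl(\sum_j\cos 2\theta_j\Bigr)^{2k}\Bigr)=E_n\Bigl(\prod_{j=1}^n e^{2t\cos 2\theta_j}\Bigr)=\hat D_n(0,t). $$
The last step follows from (1.2). This gives (7.5). To prove (7.6) we use (7.2):
$$ b_{2k+1\,n}=E_n\Bigl(\mathrm{tr}(U)\,\mathrm{tr}(U^2)^k\,\overline{\mathrm{tr}(U)\,\mathrm{tr}(U^2)^k}\Bigr)=\frac{1}{2}\,\frac{(k!)^2}{(2k)!}\,E_n\Bigl(\bigl(\mathrm{tr}(U)+\overline{\mathrm{tr}(U)}\bigr)^2\,\bigl(\mathrm{tr}(U^2)+\overline{\mathrm{tr}(U^2)}\bigr)^{2k}\Bigr), $$
by expanding the right side as before. Hence
$$ H_n(t)=\frac{1}{2}\sum_{k\ge 0}\frac{t^{2k}}{(2k)!}\,E_n\Bigl(\bigl(\mathrm{tr}(U)+\overline{\mathrm{tr}(U)}\bigr)^2\,\bigl(\mathrm{tr}(U^2)+\overline{\mathrm{tr}(U^2)}\bigr)^{2k}\Bigr)=2\,E_n\Bigl(\Bigl(\sum_j\cos\theta_j\Bigr)^2\cosh\Bigl(2t\sum_j\cos 2\theta_j\Bigr)\Bigr) $$
$$ =E_n\Bigl(\Bigl(\sum_j\cos\theta_j\Bigr)^2\prod_{j=1}^n e^{2t\cos 2\theta_j}\Bigr)+E_n\Bigl(\Bigl(\sum_j\cos\theta_j\Bigr)^2\prod_{j=1}^n e^{-2t\cos 2\theta_j}\Bigr) $$
$$ =\frac{1}{4}\,\frac{\partial^2}{\partial t_1^2}\,E_n\Bigl(\prod_{j=1}^n g(e^{i\theta_j},t_1,t_2)\Bigr)(0,t)+\frac{1}{4}\,\frac{\partial^2}{\partial t_1^2}\,E_n\Bigl(\prod_{j=1}^n g(e^{i\theta_j},t_1,t_2)\Bigr)(0,-t). $$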


From the last equality (7.6) follows.

To prove the theorem we consider (7.3) first. Observe that $\hat D_n(0,t)$ is the determinant of $T_n(h)$, where $h(z) = f(z^2)$. It has Fourier coefficients $h_{2k} = f_k$, $h_{2k+1} = 0$. Let us rearrange the basis vectors $e_0, e_1,\cdots,e_{n-1}$ of our underlying $n$-dimensional space as
$$ e_0,\ e_2,\ \cdots;\quad e_1,\ e_3,\ \cdots. \tag{7.7} $$
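The effect of the reordering (7.7) can be seen numerically: since $h_{2k}=f_k$ and $h_{2k+1}=0$, the even-indexed and odd-indexed basis vectors never mix, so $\det T_n(h)$ should factor into two Toeplitz determinants for $f$. The sketch below assumes $f_k=I_k(2t)$ and uses hypothetical helper names; it is a check of this factorization, not part of the proof.

```python
from math import factorial


def bessel_coeff(k, t, terms=40):
    """f_k = I_k(2t), the k-th Fourier coefficient of exp(t (z + 1/z))."""
    k = abs(k)
    return sum(t ** (2 * m + k) / (factorial(m) * factorial(m + k)) for m in range(terms))


def h_coeff(m, t):
    """Fourier coefficients of h(z) = f(z^2): h_{2k} = f_k and h_{2k+1} = 0."""
    m = abs(m)
    return bessel_coeff(m // 2, t) if m % 2 == 0 else 0.0


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def D(n, t):
    """D_n(t) = det T_n(f)."""
    return det([[bessel_coeff(j - k, t) for k in range(n)] for j in range(n)])


def D_hat(n, t):
    """det T_n(h), i.e. the determinant for the symbol f(z^2)."""
    return det([[h_coeff(j - k, t) for k in range(n)] for j in range(n)])


if __name__ == "__main__":
    t = 0.6
    for n in (4, 5):
        n1, n2 = (n + 1) // 2, n // 2  # sizes of the two groups in (7.7)
        print(n, D_hat(n, t), D(n1, t) * D(n2, t))
```

For each $n$ the two printed values should agree up to rounding.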

Then we see from the above that $T_n(h)$ becomes the direct sum of two Toeplitz matrices associated with $f$, the orders of these matrices being the sizes of the two groups of basis vectors in (7.7). If $n$ is even both groups have size $n/2$ whereas if $n$ is odd the sizes are $(n\pm 1)/2$. Since $\hat D_n(0,t) = \det T_n(h)$ is the product of the corresponding Toeplitz determinants associated with $f$, we have (7.3).

The proof of (7.4) is not so simple. We have
$$ \frac{1}{\hat D_n}\,\frac{\partial^2\hat D_n}{\partial t_1^2}=\partial^2_{t_1}\log\hat D_n+\bigl(\partial_{t_1}\log\hat D_n\bigr)^2. $$
Now, as in the computation leading to (2.3),
$$ \partial^2_{t_1}\log\hat D_n(t_1,t_2)=\mathrm{tr}\,\bigl[T_n(g)^{-1}T_n(\partial^2_{t_1}g)-T_n(g)^{-1}T_n(\partial_{t_1}g)\,T_n(g)^{-1}T_n(\partial_{t_1}g)\bigr] $$
$$ =\mathrm{tr}\,\bigl[T_n(g)^{-1}T_n((z+z^{-1})^2 g)-T_n(g)^{-1}T_n((z+z^{-1})g)\,T_n(g)^{-1}T_n((z+z^{-1})g)\bigr]. $$
This is to be evaluated first at $t_1 = 0$, $t_2 = t$. Since $g(z,0,t) = f(z^2) = h(z)$ we must compute
$$ \mathrm{tr}\,\bigl[T_n(h)^{-1}T_n((z+z^{-1})^2 h)-T_n(h)^{-1}T_n((z+z^{-1})h)\,T_n(h)^{-1}T_n((z+z^{-1})h)\bigr]. $$
We write $w^\pm = T_n(h)^{-1}h^\pm$, so that the $w^\pm$ are associated with $h$ just as $u^\pm$ are associated with $f$. Using (2.1) and (2.2) we find that
$$ T_n(h)^{-1}T_n((z+z^{-1})h)\,T_n(h)^{-1}T_n((z+z^{-1})h)=(\Lambda'+w^-\otimes\delta^-+\Lambda+w^+\otimes\delta^+)^2, $$
and from this that
$$ \mathrm{tr}\,T_n(h)^{-1}T_n((z+z^{-1})h)\,T_n(h)^{-1}T_n((z+z^{-1})h)=2n-2+2\,(w^-,\Lambda\delta^-)+2\,(w^+,\Lambda'\delta^+)+2\,(\delta^-,w^+)\,(w^-,\delta^+)+(\delta^-,w^-)^2+(\delta^+,w^+)^2. $$
Consequently
$$ \mathrm{tr}\,\bigl[T_n(h)^{-1}T_n((z+z^{-1})^2 h)-T_n(h)^{-1}T_n((z+z^{-1})h)\,T_n(h)^{-1}T_n((z+z^{-1})h)\bigr] $$
$$ =\mathrm{tr}\,T_n(h)^{-1}T_n((z^2+z^{-2})h)+2-2\,(w^-,\Lambda\delta^-)-2\,(w^+,\Lambda'\delta^+)-2\,(\delta^-,w^+)\,(w^-,\delta^+)-(\delta^-,w^-)^2-(\delta^+,w^+)^2. $$
This is $\partial^2_{t_1}\log\hat D_n(0,t)$. Similarly, we find
$$ \partial_{t_1}\log\hat D_n(0,t)=(w^+,\delta^+)+(w^-,\delta^-), $$
so that when $t_1 = 0$, $t_2 = t$,
$$ \frac{1}{\hat D_n}\,\frac{\partial^2\hat D_n}{\partial t_1^2}=\partial^2_{t_1}\log\hat D_n+\bigl(\partial_{t_1}\log\hat D_n\bigr)^2=\mathrm{tr}\,T_n(h)^{-1}T_n((z^2+z^{-2})h)+2-2\,(w^-,\Lambda\delta^-)-2\,(w^+,\Lambda'\delta^+)-2\,(w^+,\delta^-)\,(w^-,\delta^+)+2\,(w^+,\delta^+)\,(w^-,\delta^-). $$
Since $T_n(h)$ is symmetric all superscripts in the symbols in the inner product may be reversed as long as we interchange $\Lambda$ and $\Lambda'$. We therefore have shown
$$ \frac{1}{\hat D_n(0,t)}\,\frac{\partial^2\hat D_n(0,t)}{\partial t_1^2}=\mathrm{tr}\,T_n(h)^{-1}T_n((z^2+z^{-2})h)+2-4\,(w^+,\Lambda'\delta^+)-2\,(w^+,\delta^-)^2+2\,(w^+,\delta^+)^2. $$

Let us rearrange our basis elements as in (7.7) and suppose the first group has $n_1$ vectors and the second group has $n_2$. Then $T_n(h)^{-1}$ becomes the matrix direct sum
$$ \begin{pmatrix} T_{n_1}(f)^{-1} & 0\\ 0 & T_{n_2}(f)^{-1}\end{pmatrix}, $$
and $T_n(h)^{-1}T_n((z^2+z^{-2})h)$ becomes
$$ \begin{pmatrix} T_{n_1}(f)^{-1}T_{n_1}((z+z^{-1})f) & 0\\ 0 & T_{n_2}(f)^{-1}T_{n_2}((z+z^{-1})f)\end{pmatrix}. $$
By a now familiar computation we find from this that
$$ \mathrm{tr}\,T_n(h)^{-1}T_n((z^2+z^{-2})h)=2\,(u^+_{n_1},\delta^+_{n_1})+2\,(u^+_{n_2},\delta^+_{n_2}), $$
where $u^+_m$ and $\delta^+_m$ denote the quantities $u^+$ and $\delta^+$ associated with the index $m$. We use similar notation below. To continue, after rearranging our basis we have the replacements
$$ h^+\to\begin{pmatrix} 0\\ f^+_{n_2}\end{pmatrix},\quad w^+\to\begin{pmatrix} 0\\ u^+_{n_2}\end{pmatrix},\quad \delta^+\to\begin{pmatrix} \delta^+_{n_1}\\ 0\end{pmatrix},\quad \Lambda'\delta^+\to\begin{pmatrix} 0\\ \delta^+_{n_2}\end{pmatrix} $$
and
$$ \delta^-\to\begin{pmatrix} 0\\ \delta^-_{n_2}\end{pmatrix}\ \text{if } n \text{ is even},\qquad \delta^-\to\begin{pmatrix} \delta^-_{n_1}\\ 0\end{pmatrix}\ \text{if } n \text{ is odd}. $$
It follows from these that
$$ (w^+,\delta^+)=0,\qquad (w^+,\Lambda'\delta^+)=(u^+_{n_2},\delta^+_{n_2}), $$
and that $(w^+,\delta^-) = (u^+_{n_2},\delta^-_{n_2})$ if $n$ is even and $(\delta^-,w^+) = 0$ if $n$ is odd. If we modify our notation by writing
$$ U_n^\pm=(u_n^+,\delta_n^\pm), $$
so that $U_n^-$ is what we have been denoting by $U_n$, the above gives
$$ \frac{1}{2}\,\frac{1}{\hat D_n(0,t)}\,\frac{\partial^2\hat D_n(0,t)}{\partial t_1^2}=1+U^+_{n_1}-U^+_{n_2}-(w^+,\delta^-)^2=\begin{cases} 1-U^-_{n/2}(t)^2, & n \text{ even},\\ 1+U^+_{(n+1)/2}(t)-U^+_{(n-1)/2}(t), & n \text{ odd}.\end{cases} $$

To evaluate our quantities at $(0,-t)$ we observe that if $C$ is the diagonal matrix with diagonal entries $1, -1, 1,\cdots,(-1)^n$, and we replace $t$ by $-t$, then we have the replacements $T_n(f)\to C\,T_n(f)\,C$ and $f^+\to -Cf^+$ and therefore $u^+\to -Cu^+$. Therefore also $U_n^+ = (u^+,\delta^+)\to -U_n^+$ and $U_n^- = (u^+,\delta^-)\to (-1)^{n+1}\,U_n^-$. Hence in the last displayed formula $(U^-_{n/2})^2$ is an even function of $t$ whereas $U^+_{(n\pm 1)/2}$ are odd functions of $t$. Also, $\hat D_n(0,t)$ is an even function of $t$. Thus
$$ H_n(t)=G_n(t)\times\begin{cases} 1-U^-_{n/2}(t)^2, & n \text{ even},\\ 1, & n \text{ odd}.\end{cases} $$
Recalling (4.6) and the general fact $V_m^+(t) = D_{m-1}(t)/D_m(t)$ we obtain (7.4).

Theorem 2. Let $\ell_N(\sigma)$ denote the length of the longest increasing subsequence of $\sigma$ in the subgroup $O_N$ of $S_N$. Then
$$ \lim_{N\to\infty}\mathrm{Prob}\left(\frac{\ell_N(\sigma)-2\sqrt{N}}{2^{2/3}\,N^{1/6}}\le s\right)=F(s)^2, $$
where $F(s)$ is as in (1.12).

For the proof we shall apply Johansson’s lemma [5], which we now state:

Lemma. Let $\{P_k(n)\}_{k\ge 0}$ be a family of distribution functions defined on the nonnegative integers $n$ and $\varphi_n(\lambda)$ the generating function
$$ \varphi_n(\lambda)=e^{-\lambda}\sum_{k\ge 0}P_k(n)\,\frac{\lambda^k}{k!}. $$
(Set $P_0(n) = 1$.) Suppose that for all $n, k\ge 1$,
$$ P_{k+1}(n)\le P_k(n). \tag{7.8} $$
If we define $\mu_k = k + 4\sqrt{k\log k}$ and $\nu_k = k - 4\sqrt{k\log k}$, then there is a constant $C$ such that
$$ \varphi_n(\mu_k)-\frac{C}{k^2}\le P_k(n)\le\varphi_n(\nu_k)+\frac{C}{k^2} $$
for all sufficiently large $k$, $0\le n\le k$.


This allows one to deduce that $P_k(n)\sim\varphi_n(k)$ as $n, k\to\infty$ under suitable conditions, and is how one obtains the equivalence of (1.9) and (1.12). We shall apply the lemma to the distribution functions
$$ \varphi_n^e(\lambda)=e^{-\lambda}\,G_n\bigl(\sqrt{\lambda/2}\bigr)=e^{-\lambda}\sum_{k\ge 0}F_{2k}(n)\,\frac{\lambda^k}{k!}, $$
$$ \varphi_n^o(\lambda)=e^{-\lambda}\,H_n\bigl(\sqrt{\lambda/2}\bigr)=e^{-\lambda}\sum_{k\ge 0}F_{2k+1}(n)\,\frac{\lambda^k}{k!}, $$
where
$$ F_N(n)=\mathrm{Prob}\,(\ell_N(\sigma)\le n)=\frac{b_{Nn}}{2^k\,k!} \tag{7.9} $$
when $N = 2k$ or $2k + 1$. To apply this lemma we must prove another

Lemma. We have $F_{N+2}\le F_N$ for all $N$.

We show this simultaneously for $N = 2k$ and $N = 2k + 1$. Take a $\sigma\in O_{N+2}$ and remove the two-point set $\{-1, 1\}$ from its domain. Then $\sigma$ maps the remaining $N$-point set one-one onto another $N$-point set. If we identify both of these sets with the integers from $-k$ to $k$ (including or excluding 0 depending on the parity of $N$) by order-preserving maps, then under this identification the restriction of $\sigma$ becomes an element of $S_N$. In fact it becomes an odd permutation because the two identification maps are odd. Thus we have described a mapping $\sigma\to\mathcal{F}(\sigma)$ from $O_{N+2}$ to $O_N$. The mapping is $2k + 2$ to 1 and is clearly onto. It is also clear that $\ell_N(\mathcal{F}(\sigma))\le\ell_{N+2}(\sigma)$, from which it follows that $b_{N+2\,n}\le (2k+2)\,b_{Nn}$ for all $n$. The assertion of the lemma follows upon using (7.9).

Theorem 1 tells us that
$$ e^{2t^2}\,\varphi_n^e(2t^2)=\begin{cases} D_{n/2}(t)^2, & n \text{ even},\\ D_{(n-1)/2}(t)\,D_{(n+1)/2}(t), & n \text{ odd},\end{cases} $$
$$ e^{2t^2}\,\varphi_n^o(2t^2)=\begin{cases} D_{n/2-1}(t)\,D_{n/2+1}(t), & n \text{ even},\\ D_{(n-1)/2}(t)\,D_{(n+1)/2}(t), & n \text{ odd}.\end{cases} $$
From the asymptotics (1.9) we find that if we set
$$ \frac{n}{2}=2t+s\,t^{1/3}, $$
then
$$ \lim_{t\to\infty}\varphi_n^e(2t^2)=F(s)^2. $$
Setting $t = \sqrt{2k}/2$ gives
$$ \lim_{k\to\infty}\varphi_n^e(k)=F(s)^2. $$


Johansson’s lemma tells us that when $N$ runs through the even integers $2k$,
$$ \lim_{N\to\infty}F_N\bigl(2\sqrt{N}+2^{2/3}\,s\,N^{1/6}\bigr)=\lim_{N\to\infty}F_N(n)=F(s)^2. $$
For $N$ running through the odd integers we obtain the same relation. Thus the proof is complete.

Acknowledgements. This work was supported in part by the National Science Foundation through grants DMS-9802122 (first author) and DMS-9732687 (second author). The authors thank the administration of the Mathematisches Forschungsinstitut Oberwolfach for their hospitality during the authors’ visit under their Research in Pairs program, when the first results of the paper were obtained, and the Volkswagen-Stiftung for its support of the program.

References

1. Baik, J., Deift, P. and Johansson, K.: On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12, 1119–1178 (1999)
2. Deift, P.A. and Zhou, X.: Asymptotics for the Painlevé II equation. Comm. Pure Appl. Math. 48, 277–337 (1995)
3. Gessel, I.M.: Symmetric functions and P-recursiveness. J. Combin. Th. Ser. A 53, 257–285 (1990)
4. Hisakado, M.: Unitary matrix models and Painlevé III. Mod. Phys. Letts A11, 3001–3010 (1996)
5. Johansson, K.: The longest increasing subsequence in a random permutation and a unitary random matrix model. Math. Res. Lett. 5, 63–82 (1998)
6. Nijhoff, F.W. and Papageorgiou, V.G.: Similarity reductions of integrable lattices and discrete analogues of the Painlevé II equation. Phys. Lett. 153A, 337–344 (1991)
7. Odlyzko, A.M., Poonen, B., Widom, H. and Wilf, H.S.: On the distribution of longest increasing subsequences in random permutations. Unpublished notes
8. Okamoto, K.: Polynomial Hamiltonians associated with Painlevé equations. Proc. Japan Acad. A56, 367–371 (1980)
9. Periwal, V. and Shevitz, D.: Unitary-matrix models as exactly solvable string theories. Phys. Rev. Lett. 64, 1326–1329 (1990)
10. Rains, E.M.: Increasing subsequences and the classical groups. Elect. J. of Combinatorics 5, #R12 (1998)
11. Tracy, C.A. and Widom, H.: Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)

Communicated by T. Miwa

Commun. Math. Phys. 207, 687 – 696 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Uniform Spectral Properties of One-Dimensional Quasicrystals, I. Absence of Eigenvalues

David Damanik¹,², Daniel Lenz²

¹ Department of Mathematics, California Institute of Technology, Pasadena, CA 91125, USA
² Fachbereich Mathematik, Johann Wolfgang Goethe-Universität, 60054 Frankfurt, Germany

Received: 7 January 1999 / Accepted: 12 May 1999

Abstract: We consider discrete one-dimensional Schrödinger operators with Sturmian potentials. For a full-measure set of rotation numbers including the Fibonacci case, we prove absence of eigenvalues for all elements in the hull.

1. Introduction

In this paper we consider discrete one-dimensional Schrödinger operators in $\ell^2(\mathbb{Z})$ with Sturmian potentials, namely,

\[ (H_{\lambda,\alpha,\theta} u)(n) = u(n+1) + u(n-1) + \lambda v_{\alpha,\theta}(n) u(n), \tag{1} \]

where

\[ v_{\alpha,\theta}(n) = \chi_{[1-\alpha,1)}(n\alpha + \theta \bmod 1), \tag{2} \]

$\lambda \neq 0$, $\alpha \in (0,1)$ irrational and $\theta \in [0,1)$, along with the corresponding difference equation

\[ H_{\lambda,\alpha,\theta} u = E u. \tag{3} \]
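As an illustration of definitions (1) and (2), the following Python sketch evaluates the potential and applies the operator to a function on the integers. The function names, the use of floating-point arithmetic, and the choice of parameters are assumptions made for this example only; they are not part of the original text.

```python
import math

def sturmian_potential(alpha, theta, n):
    """v_{alpha,theta}(n): indicator of [1 - alpha, 1) at (n*alpha + theta) mod 1."""
    frac = (n * alpha + theta) % 1.0
    return 1 if frac >= 1.0 - alpha else 0

def apply_hamiltonian(u, lam, alpha, theta, n):
    """(H u)(n) = u(n+1) + u(n-1) + lam * v(n) * u(n) for a function u on the integers."""
    return u(n + 1) + u(n - 1) + lam * sturmian_potential(alpha, theta, n) * u(n)

# Fibonacci case: alpha = (sqrt(5) - 1) / 2, phase theta = 0 (illustrative choice).
golden = (math.sqrt(5) - 1) / 2
first_sites = [sturmian_potential(golden, 0.0, n) for n in range(1, 9)]
print(first_sites)
```

For the Fibonacci rotation number and $\theta = 0$, the values at the sites $1, \ldots, 8$ come out as 1, 0, 1, 1, 0, 1, 0, 1. Floating-point evaluation is adequate here because none of these sites falls near the boundary of the interval $[1-\alpha, 1)$; for large $n$ an exact combinatorial construction would be safer.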

The operator family (1) describes a standard one-dimensional quasicrystal model [31,34] and has been studied in many papers, for example, [3,4,8,22,24,33]. Moreover, the operators $H_{\lambda,\alpha,\theta}$ have attracted attention since they exhibit spectral properties such as zero-measure spectrum and purely singular continuous spectral measures that seemingly hold for the entire family, in contrast to the almost Mathieu operator where similar properties were shown to hold for a strict subclass of parameter values [23,27].

To put our present study into perspective, let us consider the class of discrete one-dimensional Schrödinger operators with strictly ergodic (i.e., minimal and uniquely ergodic), aperiodic potentials taking finitely many values. Among these, potentials generated by primitive substitutions and circle maps have received particular interest [1,2,6,7,9,10,14,15,19–21,24,35,36]. Sturmian potentials (2) are circle map potentials sharing some crucial properties with potentials generated by primitive substitutions. The general belief is that, as a rule, the spectrum has zero Lebesgue measure and the spectral measures are purely singular continuous. By results of Kotani [26] and Last–Simon [28], absence of absolutely continuous spectrum holds in full generality, that is, for every such family of operators and for every element of the family. Zero-measure spectrum was proven for all Sturmian potentials by Bellissard et al. [3], extending an approach used by Sütő in the Fibonacci case (i.e., $\alpha = \frac{\sqrt{5}-1}{2}$) [36] which is based on Kotani [26], and for a large class of primitive substitutions by Bovier–Ghez [6,7]. Again, the zero-measure property holds for all elements of the family since, by minimality, the spectrum is constant over the hull.

On the other hand, absence of point spectrum has not yet been shown to hold in similar generality. Generic absence of eigenvalues for certain models was proven in [1,3,6–8,15,20,35], while the works [9,10,14,24] contain almost sure results. However, no uniform result, that is, absence of eigenvalues for an entire such family, was known yet. For the particular case of Sturmian potentials, generic absence of eigenvalues is essentially due to Bellissard et al. [3] (the paper does not state the result, see [8,20] for the result and proofs), whereas almost sure absence of eigenvalues was shown by Kaminaga [24], extending an argument of Delyon–Petritis [14] who had already obtained a partial result. Our purpose here is to prove the following theorem.

Theorem 1.1. Suppose $\limsup a_n \neq 2$, where the $a_n$ are the coefficients in the continued fraction expansion of $\alpha$. Then, for every $\lambda$ and every $\theta$, the operator $H_{\lambda,\alpha,\theta}$ has empty point spectrum.

Remarks.
1. The set of $\alpha$'s obeying the assumption of Theorem 1.1 has full Lebesgue measure [25].
2. In particular, the Fibonacci case $\alpha = \frac{\sqrt{5}-1}{2}$ is included as all the $a_n$ are equal to 1 in this case.
3. Some works associate a slightly different family to the parameters $\lambda, \alpha$ [19,20] which is larger than the family parametrized by $\theta \in [0,1)$. The proof also works for the additional elements in that larger hull.
4. Our approach is based upon the two-block method [17,35] and yields additional information about stability properties that are discussed within a more general context in [12].
5. Another key ingredient in our proof is an analog to the hierarchical structures and the concept of (de-)composition in self-similar tilings [18]. This is further discussed and exploited in [11,29].

Thus, combining Theorem 1.1 with the results of Bellissard et al. [3], the general picture proves to be correct for most parameter values.

Corollary 1.2. Suppose $\alpha$ obeys the assumption of Theorem 1.1. Then, for every $\lambda$ and every $\theta$, the operator $H_{\lambda,\alpha,\theta}$ has purely singular continuous spectrum supported on a Cantor set of zero Lebesgue measure.

The organization of the paper is as follows. Section 2 recalls basic properties of Sturmian potentials as well as the two-block method. Hierarchical structures in Sturmian sequences are studied in Sect. 3, while a proof of Theorem 1.1 is provided in Sect. 4. Our proof cannot treat certain phases $\theta$ that arise in the case $\limsup a_n = 2$. While this paper was under consideration, we have found a way to treat these exceptional cases,


thus being able to extend the result contained in Theorem 1.1 to all $\alpha$. This proof will be given in [13].

2. Basic Properties of Sturmian Potentials

Since our argument will be based upon the two-block method, our strategy will be to exhibit appropriate squares adjacent to a certain site $i \in \mathbb{Z}$. This approach is entirely independent of the actual numerical values the potential takes, and we can, without loss of generality, restrict our attention to the particular case $\lambda = 1$. That is, we shall study the combinatorial properties of the sequences $v_{\alpha,\theta}$. Consider such a sequence as a two-sided infinite word over the alphabet $A \equiv \{0,1\}$. All these words share the property that their subword complexity is minimal among the aperiodic sequences. Infinite words with this property are called Sturmian. We shall recall some basic facts about these words that will be used in Sects. 3 and 4. The material is taken from [3,5]. Moreover, as a general reference on Sturmian words we want to mention the forthcoming monograph [30].

Fix $\alpha$ and consider its continued fraction expansion (for general information on continued fractions, see, e.g., [25]),

\[ \alpha = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}} \equiv [a_1, a_2, a_3, \ldots] \tag{4} \]

with uniquely determined $a_n \in \mathbb{N}$. The associated rational approximants $\frac{p_n}{q_n}$ obey

\[ p_0 = 0, \quad p_1 = 1, \quad p_n = a_n p_{n-1} + p_{n-2}, \]
\[ q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2}. \]

Define the words $s_n$ over the alphabet $A$ by

\[ s_{-1} \equiv 1, \quad s_0 \equiv 0, \quad s_1 \equiv s_0^{a_1 - 1} s_{-1}, \quad s_n \equiv s_{n-1}^{a_n} s_{n-2}, \quad n \geq 2. \tag{5} \]

In particular, the length of the word $s_n$ is equal to $q_n$, $n \geq 0$.

Proposition 2.1. There exist palindromes $\pi_n$, $n \geq 2$, such that

\[ s_{2n} = \pi_{2n} 10, \tag{6} \]

\[ s_{2n+1} = \pi_{2n+1} 01. \tag{7} \]

Proof. See [5], where also the recursive relations obeyed by the $\pi_n$ can be found. □

By definition, for $n \geq 2$, $s_{n-1}$ is a prefix of $s_n$. Therefore, the following ("right"-) limit exists in an obvious sense,

\[ c_\alpha \equiv \lim_{n \to \infty} s_n. \tag{8} \]

Similarly, from (5) we infer that, for $n \geq 1$, $s_{2n-2}$ is a suffix of $s_{2n}$, and hence, the following ("left"-) limit exists,

\[ d_\alpha \equiv \lim_{n \to \infty} s_{2n}. \tag{9} \]
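The recursions for $q_n$ and $s_n$ can be checked directly on small examples. The sketch below is an illustration only: the helper names and the convention of storing $a_1, a_2, \ldots$ in a zero-indexed Python list are assumptions of this example. It builds the words from (5), compares their lengths with the denominators $q_n$, and tests the palindrome structure (6), (7) for the word indices at least 4 covered by Proposition 2.1.

```python
def denominators(a, n_max):
    """q_0 = 1, q_1 = a_1, q_n = a_n q_{n-1} + q_{n-2}; a_n is stored as a[n - 1]."""
    q = [1, a[0]]
    for n in range(2, n_max + 1):
        q.append(a[n - 1] * q[n - 1] + q[n - 2])
    return q

def sturmian_words(a, n_max):
    """Return {n: s_n} for n = -1, ..., n_max, following recursion (5)."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def has_palindrome_form(word, n):
    """Check s_n = pi_n 10 (n even) or s_n = pi_n 01 (n odd) with pi_n a palindrome."""
    tail = "10" if n % 2 == 0 else "01"
    core = word[:-2]
    return word.endswith(tail) and core == core[::-1]

fib = [1] * 10                      # Fibonacci case: all a_n equal to 1
q = denominators(fib, 6)
s = sturmian_words(fib, 6)
print(q, s[5])
```

In the Fibonacci case this gives $q_0, \ldots, q_6 = 1, 1, 2, 3, 5, 8, 13$ and $s_5 = 10110101$, whose length is $q_5 = 8$.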

The exceptional role played by the sequence $v_{\alpha,0}$ is demonstrated by the following proposition.


Proposition 2.2. $v_{\alpha,0}$ restricted to $\{1, 2, 3, \ldots\}$ coincides with $c_\alpha$, $v_{\alpha,0}$ restricted to $\{\ldots, -2, -1, 0\}$ coincides with $d_\alpha$.

Proof. The first claim was shown by Bellissard et al. in [3]. The second claim follows from the first combined with Proposition 2.1 and the symmetry $v_{\alpha,0}(-k) = v_{\alpha,0}(k-1)$, $k \geq 2$, also shown in [3]. □

For later use, we also want to note the following elementary formula.

Proposition 2.3. For each $n \geq 2$, $s_n s_{n+1} = s_{n+1} s_{n-1}^{a_n - 1} s_{n-2} s_{n-1}$.

Proof.

\[ s_n s_{n+1} = s_n s_n^{a_{n+1}} s_{n-1} = s_n^{a_{n+1}} s_n s_{n-1} = s_n^{a_{n+1}} s_{n-1}^{a_n} s_{n-2} s_{n-1} = s_{n+1} s_{n-1}^{a_n - 1} s_{n-2} s_{n-1}. \]

□
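The identity of Proposition 2.3 is a finite statement about words and can be tested mechanically. The following sketch is illustrative: the helper names and the sample continued fraction coefficients are assumptions of this example. It rebuilds the words $s_n$ from recursion (5) and compares both sides of the identity for a range of $n$.

```python
def sturmian_words(a, n_max):
    """Return {n: s_n} for n = -1, ..., n_max; a_n is stored as a[n - 1]."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def check_proposition_2_3(a, n_max):
    """Check s_n s_{n+1} = s_{n+1} s_{n-1}^{a_n - 1} s_{n-2} s_{n-1} for 2 <= n <= n_max."""
    s = sturmian_words(a, n_max + 1)     # needs a_1, ..., a_{n_max + 1}
    for n in range(2, n_max + 1):
        lhs = s[n] + s[n + 1]
        rhs = s[n + 1] + s[n - 1] * (a[n - 1] - 1) + s[n - 2] + s[n - 1]
        if lhs != rhs:
            return False
    return True

sample = [2, 3, 1, 2, 1, 4]              # hypothetical coefficients a_1, ..., a_6
print(check_proposition_2_3(sample, 5))
```

For the sample coefficients and $n = 2$, both sides equal the 16-letter word 0101010010101001.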

The point is that the word $s_n s_{n+1}$ has $s_{n+1}$ as a prefix. Note that the dependence of $a_n, p_n, q_n, s_n, \pi_n$ on $\alpha$ is left implicit. This, however, should not lead to any real confusion as $\alpha$ will always be fixed within a local context.

We now turn to the study of generalized eigenfunctions, that is, solutions of (3). Recall the standard reformulation of (3) in terms of transfer matrices, namely,

\[ U(n+1) = T_{\lambda,\alpha,\theta}(n,E)\, U(n), \]

where $u$ is a solution of (3),

\[ U(n) \equiv \begin{pmatrix} u(n) \\ u(n-1) \end{pmatrix} \quad \text{and} \quad T_{\lambda,\alpha,\theta}(n,E) \equiv \begin{pmatrix} E - \lambda v_{\alpha,\theta}(n) & -1 \\ 1 & 0 \end{pmatrix}. \]

Thus, the sequence $(u(n))_{n \in \mathbb{Z}}$ is determined by two consecutive values, say $u(0)$ and $u(1)$, and all other values can be determined by applying a matrix product of the form $M_{\lambda,\alpha,\theta}(n,E) \equiv T_{\lambda,\alpha,\theta}(n,E) \times \cdots \times T_{\lambda,\alpha,\theta}(1,E)$ to the vector $U(1)$ (for $n \geq 1$, the case $n \leq 0$ is similar). Note that $\det(M_{\lambda,\alpha,\theta}(n,E)) = 1$. Hence, the characteristic equation of $M_{\lambda,\alpha,\theta}(n,E)$ takes the form

\[ M_{\lambda,\alpha,\theta}(n,E)^2 - \operatorname{tr}(M_{\lambda,\alpha,\theta}(n,E))\, M_{\lambda,\alpha,\theta}(n,E) + I = 0. \tag{10} \]
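The transfer matrix formalism can be made concrete with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the function names, the sample energy $E = 1/3$, the coupling $\lambda = 2$ and the finite potential segment are all chosen for this example. It forms the product $M(n,E)$, and checks that its determinant is 1, that it satisfies (10), and that it reproduces the solution obtained by iterating (3) directly.

```python
from fractions import Fraction

def transfer_matrix(E, lam, v_n):
    """T(n, E) for potential value v_n at site n."""
    return ((E - lam * v_n, -1), (1, 0))

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def monodromy(E, lam, potential):
    """M(n, E) = T(n, E) ... T(1, E) for potential = [v(1), ..., v(n)]."""
    M = ((1, 0), (0, 1))
    for v_n in potential:
        M = matmul(transfer_matrix(E, lam, v_n), M)
    return M

def cayley_hamilton_residual(M):
    """M^2 - tr(M) M + I, which vanishes whenever det(M) = 1."""
    tr = M[0][0] + M[1][1]
    M2 = matmul(M, M)
    identity = ((1, 0), (0, 1))
    return tuple(
        tuple(M2[i][j] - tr * M[i][j] + identity[i][j] for j in range(2))
        for i in range(2)
    )

E, lam = Fraction(1, 3), 2
potential = [1, 0, 1, 1, 0, 1, 0, 1]     # sample values v(1), ..., v(8)
M = monodromy(E, lam, potential)
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(det, cayley_hamilton_residual(M))
```

Using `Fraction` keeps every entry exact, so the determinant and the residual of (10) can be compared with 1 and 0 without a tolerance.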

Now, the spectrum of $H_{\lambda,\alpha,\theta}$ is independent of $\theta$ [3] and can thus be denoted by $\Sigma_{\lambda,\alpha}$.

Proposition 2.4. For every $\lambda \neq 0$, there exists $C_\lambda$ such that, for every irrational $\alpha$, every $E \in \Sigma_{\lambda,\alpha}$ and every $n \in \mathbb{N}$, we have $|\operatorname{tr}(M_{\lambda,\alpha,0}(q_n, E))| < C_\lambda$.

Proof. See [3]. □

Before we formulate our basic criterion for absence of eigenvalues, we introduce the following two concepts. Let $v$ be a two-sided sequence over $A$ (think of $v = v_{\alpha,\theta}$). A subword $x = x_1 \ldots x_l$ of $v$ is called adjacent to $i \in \mathbb{Z}$ if $v_i \ldots v_{i+l-1} = x$ or $v_{i-l+1} \ldots v_i = x$. Two finite words $x, y$ having the same length are called conjugate if $x$ is a subword of $yy$. This notion is easily seen to induce an equivalence relation on $A^l$, for any fixed $l$. Intuitively, the equivalence class of a word $x = x_1 \ldots x_l$ is given by the collection of all cyclic permutations $x_{j+1} \ldots x_l x_1 \ldots x_j$ of $x$.


Lemma 2.5. Suppose $\alpha, \theta$ are such that $v_{\alpha,\theta}$ has infinitely many squares $u_k u_k$ adjacent to some $i \in \mathbb{Z}$, where $u_k$ is conjugate to some $s_{n_k}$. Then, for every $\lambda$, $\sigma_{pp}(H_{\lambda,\alpha,\theta}) = \emptyset$.

Proof. In case $i = 1$, the assertion follows by standard arguments [17,35] from (10) together with Proposition 2.4. The general case can be reduced to this case by a suitably chosen shift of the sequence $v_{\alpha,\theta}$, which, of course, leaves the spectral type of the associated operators $H_{\lambda,\alpha,\theta}$ invariant. □

We close this section by introducing the shift operator $T$ on functions on $\mathbb{Z}$, that is, $(Tf)(n) \equiv f(n+1)$ for arbitrary functions $f$ on $\mathbb{Z}$.

3. The Partition Lemma

Consider for a fixed $\alpha$ the family (in $\theta$) of all the sequences of the form $(v_{\alpha,\theta}(k))_{k \in \mathbb{Z}}$. It is well known that in each sequence of the family the same words occur. Moreover, any of these words occurs with a fixed frequency greater than zero which is independent of the sequence (cf. Appendix of [19]). Thus, from a measure theoretical point of view, the family $\{(v_{\alpha,\theta}(k))_{k \in \mathbb{Z}} \mid \theta \in [0,1)\}$ behaves very uniformly. However, to prove the absence of eigenvalues for all $\theta$ we need some kind of uniform topological structure. In this section we provide this structure by showing that each sequence $(v_{\alpha,\theta}(k))_{k \in \mathbb{Z}}$ can be decomposed into blocks of the form $s_n$ and $s_{n-1}$ for all $n \in \mathbb{N}_0$. Here we denote by $\mathbb{N}_0$ the set of all positive integers together with 0, that is, $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$.

Definition 3.1. Let $n \in \mathbb{N}_0$ be given. An $(n,\alpha)$-partition of a sequence $(f_k)_{k \in \mathbb{Z}}$ with $f_k \in \{0,1\}$ is a sequence $(I_j, z_j)_{j \in \mathbb{Z}}$ of pairs $I_j = \{d_j, d_j + 1, \ldots, d_{j+1} - 1\} \subset \mathbb{Z}$ and $z_j \in \{s_n, s_{n-1}\}$ with $0 \in I_0$ such that $f_{d_j} f_{d_j + 1} \ldots f_{d_{j+1} - 1} = z_j$ for all $j \in \mathbb{Z}$. An $(n,\alpha)$-partition of a function $f : \mathbb{Z} \to \{0,1\}$ is an $(n,\alpha)$-partition of the sequence $(f(k))_{k \in \mathbb{Z}}$. The $z_j$ are referred to as blocks in the $n$-partition or more specifically as blocks of the form $s_n$ if $z_j = s_n$, and as blocks of the form $s_{n-1}$ if $z_j = s_{n-1}$. The $I_j$ are referred to as positions of the blocks $z_j$.

In the sequel we will sometimes suppress the dependence on $\alpha$ if it is clear from the context to which $\alpha$ we refer. In particular, we will write $n$-partition instead of $(n,\alpha)$-partition.

Remarks.
1. One can think of an $n$-partition as a tiling of $\mathbb{Z}$ by $s_n$ and $s_{n-1}$ generating $f$. So $f$ is composed out of the blocks $z_j$ at the positions $I_j$.
2. Our notion of $n$-partition is analogous to the notion of $n$th composition used in the study of self-similar tilings [32]. This notion has been used by Hof in [19]. There he studies the Lyapunov exponent and the integrated density of states of discrete Schrödinger operators with a potential generated by a primitive substitution.

As $s_0 = 0$ and $s_{-1} = 1$, every $f : \mathbb{Z} \to \{0,1\}$ admits a 0-partition. But for a general $f$ there does not necessarily exist an $n$-partition for $n > 0$. However, it is crucial to our analysis of the eigenvalue problem for $H_{\lambda,\alpha,\theta}$ that for the sequences $v_{\alpha,\theta}$ there do exist unique $n$-partitions for all $n \in \mathbb{N}_0$. This is the content of the next lemma.
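To make the notion of an $n$-partition concrete, the following sketch tiles a finite word $s_m$ (a prefix of the one-sided sequence $c_\alpha$) by blocks $s_n$ and $s_{n-1}$, simply by unfolding recursion (5) down to level $n$. This is an illustration only: it treats a finite one-sided word whose sites are numbered from 1, not a two-sided sequence with $0 \in I_0$, and the helper names and the restriction $m \geq n \geq 1$ are assumptions of this example.

```python
def sturmian_words(a, n_max):
    """Return {n: s_n} for n = -1, ..., n_max; a_n is stored as a[n - 1]."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def block_labels(a, m, n):
    """Labels (n or n - 1) of the blocks tiling s_m, assuming m >= n >= 1."""
    if m == n or m == n - 1:
        return [m]
    # s_m = s_{m-1}^{a_m} s_{m-2}: unfold one level and recurse.
    return block_labels(a, m - 1, n) * a[m - 1] + block_labels(a, m - 2, n)

def one_sided_partition(a, m, n):
    """Return [(d_j, label_j)], where d_j is the first site of block j (sites start at 1)."""
    s = sturmian_words(a, m)
    partition, site = [], 1
    for label in block_labels(a, m, n):
        partition.append((site, label))
        site += len(s[label])
    return partition

fib = [1] * 8
print(one_sided_partition(fib, 6, 3))
```

In the Fibonacci case with $m = 6$ and $n = 3$, the word $s_6 = 1011010110110$ is tiled as $s_3 s_2 s_3 s_3 s_2$, with blocks starting at sites 1, 4, 6, 9 and 12.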


Lemma 3.2. (i) For every $n \in \mathbb{N}_0$, there exists a unique $n$-partition of $v_{\alpha,0}$. (ii) For every $n \in \mathbb{N}_0$ and every $\theta \in [0,1)$, there exists a unique $n$-partition of $v_{\alpha,\theta}$.

Proof. Let $\alpha = [a_1, a_2, \ldots]$ be the continued fraction expansion of $\alpha$ (cf. Eq. (4)).

(i) Existence: Set $v \equiv v_{\alpha,0}$. We show that there are $n$-partitions of $(v(k))_{k \geq 1}$ and of $(v(k))_{k \leq 0}$. Here an $n$-partition of a one-sided sequence is defined in the obvious way. By (5), (8) and Proposition 2.2, it is clear that there exists an $n$-partition of $(v(k))_{k \geq 1}$ for all $n$. The existence of an $n$-partition for $(v(k))_{k \leq 0}$ follows similarly by (5), (9) and Proposition 2.2.

Uniqueness: This follows by induction. As $s_0 = 0$, $s_{-1} = 1$, uniqueness is clear for $n = 0$. By (5), every $(n+1)$-partition gives rise to an $n$-partition and the positions of the $s_{n+1}$ in the $(n+1)$-partition are determined by the positions of $s_{n-1}$ in the $n$-partition. Thus, uniqueness of the $n$-partition implies uniqueness of the $(n+1)$-partition.

(ii) Fix $\theta \in [0,1)$. As $\alpha$ is irrational there exists a sequence $(n_k)_{k \in \mathbb{N}}$, $n_k \in \mathbb{N}$, such that the sequence $(T^{n_k} v_{\alpha,0})_{k \in \mathbb{N}}$ converges to $v_{\alpha,\theta}$ in the product topology on $\{0,1\}^{\mathbb{Z}}$ for $k \to \infty$. By (i), it is clear that the $T^{n_k} v_{\alpha,0}$ admit unique $n$-partitions for all $n \in \mathbb{N}_0$. To use this to conclude (ii), we introduce the following notion of convergence: Let $f_k$, $k \in \mathbb{N}$, and $f$ be functions for which there exist unique $n$-partitions denoted by $(I_j^k, z_j^k)$ and $(I_j, z_j)$, respectively. We say that the $f_k$ converge to $f$ in the $n$-sense for $k \to \infty$ if for all $C > 0$, there exists a $k_0$ such that for $k \geq k_0$, $(I_j^k, z_j^k) = (I_j, z_j)$ for all $I_j \subset (-C, C)$. Clearly, (ii) follows if we prove the following claim.

Claim. For each $n$ there exists a unique $n$-partition of $v_{\alpha,\theta}$ and the sequence $(T^{n_k} v_{\alpha,0})_{k \in \mathbb{N}}$ converges to $v_{\alpha,\theta}$ in the $n$-sense for $k \to \infty$.

Proof of the claim. This will be done by induction. We will consider two cases.

Case 1. $a_1 = 1$. As $s_{-1} = s_1 = 1$ and $s_0 = 0$, the cases $n = 0$ and $n = 1$ are clear. So suppose the statement is true for $n \geq 1$ fixed. Let $(I_j, z_j)$ be the $n$-partition of $v_{\alpha,\theta}$. By $s_{n+1} = s_n^{a_{n+1}} s_{n-1}$ (cf. Eq. (5)), the existence of an $(n+1)$-partition of $v_{\alpha,\theta}$ will follow if we show that to the left of each block of the form $s_{n-1}$ in the $n$-partition of $v_{\alpha,\theta}$, there are at least $a_{n+1}$ blocks $s_n$. That is, we have to show that $z_j = s_{n-1}$ for $j \in \mathbb{Z}$ implies $z_k = s_n$ for $k = j - a_{n+1}, \ldots, j - 1$. As $T^{n_k} v_{\alpha,0}$ admits a unique $n$-partition for each $n \in \mathbb{N}_0$, there are at least $a_{n+1}$ blocks $s_n$ to the left of each block of the form $s_{n-1}$ in the $n$-partition of $T^{n_k} v_{\alpha,0}$. As the $T^{n_k} v_{\alpha,0}$ converge to $v_{\alpha,\theta}$ in the $n$-sense, the corresponding statement is true for $v_{\alpha,\theta}$. This gives the existence of an $(n+1)$-partition of $v_{\alpha,\theta}$. The uniqueness of the $(n+1)$-partition follows from the uniqueness of the $n$-partition as in part (i). As the blocks $s_{n+1}$ in the $(n+1)$-partition of $v_{\alpha,\theta}$ arise from blocks $s_n^{a_{n+1}} s_{n-1}$ in the $n$-partition of $v_{\alpha,\theta}$, it is clear that the convergence of $T^{n_k} v_{\alpha,0}$ to $v_{\alpha,\theta}$ in the $n$-sense implies the convergence in the $(n+1)$-sense. This proves the claim in Case 1.

Case 2. $a_1 > 1$. As $s_{-1} = 1$ and $s_0 = 0$, the case $n = 0$ is clear. So fix $n \geq 0$. If $n > 0$ we can continue exactly as in Case 1. If $n = 0$ we can continue as in Case 1 after replacing $a_{n+1} = a_1$ by $a_1 - 1 \geq 1$. This proves the claim in Case 2. □

The proof of the lemma is therefore finished. □

Corollary 3.3. For $n \in \mathbb{N}$, let $(I_j^n, z_j^n)$ be the $n$-partition of $v_{\alpha,\theta}$. If $z_j^n = s_{n-1}$, then $z_{j-1}^n = z_{j+1}^n = s_n$. If $z_j^n = s_n$, then there is an interval $I = \{d, d+1, \ldots, d+l-1\} \subset \mathbb{Z}$ of length $l \in \{a_{n+1}, a_{n+1} + 1\}$ with $j \in I$ and $z_i^n = s_n$ for all $i \in I$ and $z_{d-1}^n = z_{d+l}^n = s_{n-1}$.

Proof. By the existence part of the partition lemma, there exists an $(n+1)$-partition of $v_{\alpha,\theta}$. By the uniqueness part of the partition lemma and the formula $s_{n+1} = s_n^{a_{n+1}} s_{n-1}$, all the blocks of the form $s_{n-1}$ in the $n$-partition of $v_{\alpha,\theta}$ arise from blocks of the form $s_{n+1}$ in the $(n+1)$-partition. This shows that there is no $j \in \mathbb{Z}$ with $z_j^n = z_{j+1}^n = s_{n-1}$ and that there are at least $a_{n+1}$ blocks of the form $s_n$ between two blocks of the form $s_{n-1}$. That there are at most $a_{n+1} + 1$ such blocks follows, as there are not two adjacent blocks of the form $s_n$ in the $(n+1)$-partition. This proves the corollary. □

Remarks.
1. Define $\Omega(\alpha)$ to be the set of accumulation points of translates of $v_{\alpha,0}$ with respect to pointwise convergence, that is, $\Omega(\alpha) \equiv \{\omega \in A^{\mathbb{Z}} \mid \omega = \lim T^{n_i} v_{\alpha,0},\ n_i \to \infty\}$. Then the method of the previous lemma can be easily adapted to prove the existence of a unique $n$-partition for all $\omega \in \Omega(\alpha)$. We refer the reader to [12] for a discussion of the relationship between $\Omega(\alpha)$ and the set $\{v_{\alpha,\theta} \mid \theta \in [0,1)\}$.
2. The above lemma and a careful analysis of $v_{\alpha,0}$ allow one to show that the blocks $s_n$ and $s_{n-1}$ in the $n$-partition of an $\omega \in \Omega(\alpha)$ occur with fixed frequency, see [29] for details. This can be used together with Theorem 1 of [16] to show that, indeed, every word that occurs in some $\omega_0 \in \Omega(\alpha)$ occurs in every $\omega \in \Omega(\alpha)$ with a fixed frequency greater than zero. Thus, the above partition lemma is indeed a stronger result than the results about the frequencies of words mentioned at the beginning of this section.

4. Absence of Eigenvalues

Let $\alpha = [a_1, a_2, \ldots]$ be the continued fraction expansion of $\alpha$ (cf. Eq. (4)). The proof of Theorem 1.1 will be split into two parts.

Proposition 4.1. If $\limsup_{n \to \infty} a_n \geq 3$, then for every $\theta$ and every $\lambda$, the operator $H_{\lambda,\alpha,\theta}$ has no eigenvalues.

Proposition 4.2. If $\limsup_{n \to \infty} a_n = 1$, then for every $\theta$ and every $\lambda$, the operator $H_{\lambda,\alpha,\theta}$ has no eigenvalues.

Remark. In fact the proofs yield the absence of eigenvalues for all potentials in the respective hulls $\Omega(\lambda,\alpha) := \{\lambda\omega : \omega \in \Omega(\alpha)\}$.

Proof of Proposition 4.1. Fix $\theta \in [0,1)$. We will exclude eigenvalues of $H_{\lambda,\alpha,\theta}$ using Lemma 2.5, that is, by exhibiting many appropriate squares in $v_{\alpha,\theta}$ at zero. By Lemma 3.2, there is for each $n \in \mathbb{N}_0$ an $n$-partition $((I_j^n, z_j^n))_{j \in \mathbb{Z}}$, $I_j^n = \{d_j^n, \ldots, d_{j+1}^n - 1\}$, $z_j^n \in \{s_n, s_{n-1}\}$ of $v_{\alpha,\theta}$. We will consider two cases.

Case 1. There are infinitely many $n \in \mathbb{N}_0$ with $z_0^n = s_{n-1}$. Consider such an $n$ with


$n \geq 4$. Corollary 3.3 yields $z_1^n = s_n$. As $s_{n-1}$ is a prefix of $s_n$ and $z_2^n \in \{s_n, s_{n-1}\}$, we have

\[ z_0^n z_1^n z_2^n = s_{n-1} s_n s_{n-1} w \]

with a suitable word $w$. Using $s_n = s_{n-1}^{a_n} s_{n-2}$ (cf. Eq. (5)), we get

\[ z_0^n z_1^n z_2^n = s_{n-1} s_{n-1}^{a_n} s_{n-2} s_{n-1} w. \]

By $s_{n-2} s_{n-1} = s_{n-1} v$ with a suitable $v$ (cf. Proposition 2.3), we finally arrive at

\[ z_0^n z_1^n z_2^n = s_{n-1} s_{n-1}^{a_n} s_{n-1} v w. \]

This implies

\[ v_{\alpha,\theta}(d_0^n)\, v_{\alpha,\theta}(d_0^n + 1) \ldots v_{\alpha,\theta}(d_0^n + 3|s_{n-1}| - 1) = s_{n-1} s_{n-1} s_{n-1}, \]

where $0 \in \{d_0^n, \ldots, d_0^n + |s_{n-1}| - 1\}$. Thus, there exists a square $xx$, with $x$ being a cyclic permutation of $s_{n-1}$ to the right of zero. Since this is true for infinitely many $n$, we can use Lemma 2.5 to exclude eigenvalues.

Case 2. There is an $n_0 \in \mathbb{N}_0$ such that $z_0^n = s_n$ for all $n \geq n_0$. By $\limsup_{n \to \infty} a_n \geq 3$, there are infinitely many $n \geq n_0$ such that $a_n \geq 3$. Fix such an $n$. As $a_n \geq 3$ and $z_0^n = s_n$, there are three cases by Corollary 3.3.

Subcase 1. $z_0^n = z_1^n = z_2^n = s_n$. In this case we have $z_0^n z_1^n z_2^n = s_n s_n s_n$.

Subcase 2. $z_0^n = z_{-1}^n = z_{-2}^n = s_n$. In this case we have $z_{-2}^n z_{-1}^n z_0^n = s_n s_n s_n$.

Subcase 3. $z_{-1}^n = z_0^n = z_1^n = s_n$, $z_{-2}^n = z_2^n = s_{n-1}$. Calculating as in Case 1 we get $z_0^n z_1^n z_2^n z_3^n = s_n s_n s_n w$ with a suitable word $w$.

Thus, in all subcases we can conclude as in Case 1 that there exists a square $xx$, with $x$ being a cyclic permutation of $s_n$ either to the left or to the right of zero. Since this applies to infinitely many $n$, we can use Lemma 2.5 to exclude eigenvalues in Case 2 as well. This proves the proposition. □

Proof of Proposition 4.2. The proof is similar to the proof of Proposition 4.1. So fix $\theta \in [0,1)$. By Lemma 3.2, there exists a unique $n$-partition $((I_j^n, z_j^n))_{j \in \mathbb{Z}}$ of $v_{\alpha,\theta}$. Again we will consider two cases.

Case 1. There are infinitely many $n \in \mathbb{N}_0$ with $z_0^n = s_{n-1}$. This case can be treated as in the proof of Proposition 4.1.

Case 2. There is an $n_0 \in \mathbb{N}_0$ such that $z_0^n = s_n$ for all $n \geq n_0$. By $\limsup_{n \to \infty} a_n = 1$, there is an $n_1 \in \mathbb{N}_0$ such that $a_n = 1$ for all $n \geq n_1$. Let $c := \max\{n_0, n_1\}$. As $z_0^n = s_n$ and $s_{n+1} = s_n s_{n-1}$ for all $n \geq c$, it follows easily by induction that $d_0^n = d_0^c$ for all $n \geq c$. Moreover, $z_0^n = s_n$ and $s_{n+1} = s_n s_{n-1}$ imply $z_1^n = s_{n-1}$ for all $n \geq c$ and this in turn implies $z_2^n = s_n$ for all $n \geq c$. We therefore have

\[ z_0^n z_1^n z_2^n = s_n s_{n-1} s_n = s_n s_n w, \]

where we used $s_{n-1} s_n = s_n w$ with a suitable word $w$ (cf. Proposition 2.3). Putting all this together, we see

\[ v_{\alpha,\theta}(d_0^c) \ldots v_{\alpha,\theta}(d_0^c + 2|s_n| - 1) = s_n s_n \]

for all $n \geq c$. Again, an application of Lemma 2.5 yields the desired absence of eigenvalues, concluding the proof. □
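The word combinatorics used in the two proofs above can be checked on explicit examples. The sketch below is illustrative only; the helper names and the sample coefficients are assumptions of this example. It tests the two facts that produce the repetitions: that $s_{n-1} s_n s_{n-1}$ begins with the cube $s_{n-1}^3$ for $n \geq 4$ (Case 1 of Proposition 4.1), and that $s_n s_{n-1} s_n$ begins with the square $s_n s_n$ (Case 2 of Proposition 4.2). It also implements the conjugacy relation from Sect. 2.

```python
def sturmian_words(a, n_max):
    """Return {n: s_n} for n = -1, ..., n_max; a_n is stored as a[n - 1]."""
    s = {-1: "1", 0: "0"}
    s[1] = s[0] * (a[0] - 1) + s[-1]
    for n in range(2, n_max + 1):
        s[n] = s[n - 1] * a[n - 1] + s[n - 2]
    return s

def is_conjugate(x, y):
    """x and y are conjugate if they have equal length and x is a subword of yy."""
    return len(x) == len(y) and x in y + y

def cube_after_short_block(s, n):
    """s_{n-1} s_n s_{n-1} starts with s_{n-1}^3 (used for n >= 4)."""
    return (s[n - 1] + s[n] + s[n - 1]).startswith(s[n - 1] * 3)

def square_after_long_block(s, n):
    """s_n s_{n-1} s_n starts with s_n s_n (used for n >= 3)."""
    return (s[n] + s[n - 1] + s[n]).startswith(s[n] * 2)

fib_words = sturmian_words([1] * 9, 9)
print(all(cube_after_short_block(fib_words, n) for n in range(4, 10)))
print(all(square_after_long_block(fib_words, n) for n in range(3, 10)))
```

Both checks rely on Proposition 2.3 through the fact that $s_{n-1} s_n$ has $s_n$ as a prefix, which is why the ranges of $n$ start at 4 and 3, respectively. A check on finitely many words illustrates the mechanism but does not replace the proofs.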


Acknowledgements. D. D. was partially supported by the German Academic Exchange Service through Hochschulsonderprogramm III (Postdoktoranden) and D. L. received financial support from Studienstiftung des Deutschen Volkes (Doktorandenstipendium), both of which are gratefully acknowledged.

References

1. Bellissard, J., Bovier, A., Ghez, J.-M.: Spectral properties of a tight binding Hamiltonian with period doubling potential. Commun. Math. Phys. 135, 379–399 (1991)
2. Bellissard, J., Bovier, A., Ghez, J.-M.: Gap labelling theorems for one-dimensional discrete Schrödinger operators. Rev. Math. Phys. 4, 1–37 (1992)
3. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasi-crystals. Commun. Math. Phys. 125, 527–543 (1989)
4. Bellissard, J., Iochum, B., Testard, D.: Continuity properties of the electronic spectrum of 1D quasicrystals. Commun. Math. Phys. 141, 353–380 (1991)
5. Berstel, J.: Recent results in Sturmian words. In: Dassow, J. and Salomaa, A. (Eds.), Developments in Language Theory, Singapore: World Scientific, 1996, pp. 13–24
6. Bovier, A., Ghez, J.-M.: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 158, 45–66 (1993)
7. Bovier, A., Ghez, J.-M.: Erratum. Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 166, 431–432 (1994)
8. Damanik, D.: α-continuity properties of one-dimensional quasicrystals. Commun. Math. Phys. 192, 169–182 (1998)
9. Damanik, D.: Singular continuous spectrum for the period doubling Hamiltonian on a set of full measure. Commun. Math. Phys. 196, 477–483 (1998)
10. Damanik, D.: Singular continuous spectrum for a class of substitution Hamiltonians. Lett. Math. Phys. 46, 303–311 (1998)
11. Damanik, D., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, II. The Lyapunov exponent. Preprint
12. Damanik, D., Lenz, D.: Half-line eigenfunction estimates and singular continuous spectrum of zero Lebesgue measure. Preprint
13. Damanik, D., Lenz, D.: In preparation
14. Delyon, F., Petritis, D.: Absence of localization in a class of Schrödinger operators with quasiperiodic potential. Commun. Math. Phys. 103, 441–444 (1986)
15. Delyon, F., Peyrière, J.: Recurrence of the eigenstates of a Schrödinger operator with automatic potential. J. Stat. Phys. 64, 363–368 (1991)
16. Geerse, C., Hof, A.: Lattice gas models on self-similar aperiodic tilings. Rev. Math. Phys. 3, 163–221 (1991)
17. Gordon, A.: On the point spectrum of the one-dimensional Schrödinger operator. Usp. Math. Nauk 31, 257–258 (1976)
18. Grünbaum, B., Shephard, G. C.: Tilings and Patterns. New York: Freeman and Company, 1987
19. Hof, A.: Some remarks on discrete aperiodic Schrödinger operators. J. Stat. Phys. 72, 1353–1374 (1993)
20. Hof, A., Knill, O., Simon, B.: Singular continuous spectrum for palindromic Schrödinger operators. Commun. Math. Phys. 174, 149–159 (1995)
21. Iochum, B., Testard, D.: Power law growth for the resistance in the Fibonacci model. J. Stat. Phys. 65, 715–723 (1991)
22. Iochum, B., Raymond, L., Testard, D.: Resistance of one-dimensional quasicrystals. Physica A 187, 353–368 (1992)
23. Jitomirskaya, S.: Almost everything about the almost Mathieu operator, II. In: Proceedings of XI International Congress of Mathematical Physics, Paris 1994, Cambridge, MA: Intl. Press, 1995, pp. 373–382
24. Kaminaga, M.: Absence of point spectrum for a class of discrete Schrödinger operators with quasiperiodic potential. Forum Math. 8, 63–69 (1996)
25. Khintchine, A.: Continued Fractions. Groningen: Noordhoff, 1963
26. Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1989)
27. Last, Y.: Almost everything about the almost Mathieu operator, I. In: Proceedings of XI International Congress of Mathematical Physics, Paris 1994, Cambridge, MA: Intl. Press, 1995, pp. 366–372
28. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Invent. Math. 135, 329–367 (1999)
29. Lenz, D.: Hierarchical structures in Sturmian dynamical systems. Preprint
30. Lothaire, M.: Algebraic Combinatorics on Words. In preparation
31. Luck, J., Petritis, D.: Phonon spectra in one-dimensional quasicrystals. J. Stat. Phys. 42, 289–310 (1986)
32. Lunnon, W., Pleasants, P.: Quasicristallographic tilings. J. Math. Pures et Appl. 66, 217–263 (1987)
33. Raymond, L.: A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain. Preprint
34. Senechal, M.: Quasicrystals and Geometry. Cambridge: Cambridge University Press, 1995
35. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987)
36. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989)

Communicated by B. Simon

Commun. Math. Phys. 207, 697 – 733 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Universality at the Edge of the Spectrum in Wigner Random Matrices

Alexander Soshnikov

California Institute of Technology, Department of Mathematics, Sloan 253-37, Pasadena, CA 91125, USA. E-mail: [email protected]

Received: 15 May 1999 / Accepted: 18 May 1999

Abstract: We prove universality at the edge for rescaled correlation functions of Wigner random matrices in the limit $n \to +\infty$. As a corollary, we show that, after proper rescaling, the 1st, 2nd, 3rd, etc. eigenvalues of a Wigner random hermitian (resp. real symmetric) matrix weakly converge to the distributions established by Tracy and Widom in G.U.E. (G.O.E.) cases.

1. Introduction and Formulation of the Results

We study the classical ensembles of random matrices introduced by Eugene Wigner about forty years ago. The two models under consideration are Wigner hermitian matrices and Wigner real symmetric matrices. We start with the hermitian case.

1.1. Wigner hermitian matrices. The ensemble consists of $n$-dimensional random hermitian matrices $A_n = \|a_{ij}\|$, where $\operatorname{Re} a_{ij} = \operatorname{Re} a_{ji} = \xi_{ij}/\sqrt{n}$, $\operatorname{Im} a_{ij} = -\operatorname{Im} a_{ji} = \eta_{ij}/\sqrt{n}$, $1 \leq i < j \leq n$, $a_{ii} = \xi_{ii}/\sqrt{n}$, $1 \leq i \leq n$, and $\{\xi_{ij}, \eta_{ij}, 1 \leq i < j \leq n;\ \xi_{ii}, 1 \leq i \leq n\}$ are independent real random variables such that the following conditions hold:

(i) The random variables $\xi_{ij}$, $i \leq j$, $\eta_{ij}$, $i < j$, have symmetric laws of distribution,

(ii) All moments of these random variables are finite; in particular (i) implies that all odd moments vanish,

(iii)

\[ E(\xi_{ij})^2 = \frac{1}{8} = E(\eta_{ij})^2, \quad 1 \leq i < j \leq n, \tag{1.1} \]

\[ E(\xi_{ii})^2 \leq \mathrm{const}, \quad 1 \leq i \leq n, \tag{1.2} \]

with some const > 0. We shall denote by const various positive real numbers that do not depend on $n, i, j$.


(iv) The distributions of $\xi_{ij}, \eta_{ij}$ decay at infinity at least as fast as a Gaussian distribution, namely

\[ E(\xi_{ij})^{2m},\ E(\eta_{ij})^{2m} \leq (\mathrm{const} \cdot m)^m. \tag{1.3} \]

The case of Wigner real symmetric matrices is very similar to the hermitian case, except that we now consider $n$-dimensional real symmetric matrices $A_n = \|a_{ij}\|$.

1.2. Wigner real symmetric matrices. We assume that $a_{ij} = a_{ji} = \xi_{ij}/\sqrt{n}$, $1 \leq i \leq j \leq n$, and $\{\xi_{ij}\}_{i \leq j}$ are real independent random variables, such that:

(i) The laws of distributions of $\{\xi_{ij}\}_{i \leq j}$ are symmetric,

(ii) All moments are finite; in particular, all odd moments vanish,

(iii)

\[ E(\xi_{ij})^2 = \frac{1}{4}, \quad 1 \leq i < j \leq n, \tag{1.1'} \]

\[ E(\xi_{ii})^2 \leq \mathrm{const}, \tag{1.2'} \]

(iv)

\[ E(\xi_{ij})^{2m} \leq (\mathrm{const} \cdot m)^m, \quad m = 1, 2, \ldots. \tag{1.3'} \]
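As a concrete instance of conditions (i)–(iv) in the real symmetric case, one may take all $\xi_{ij}$ to be symmetric Bernoulli variables with values $\pm 1/2$. The sketch below is an illustration only: the function names, the Bernoulli choice (including on the diagonal) and the seed are assumptions of this example, not a construction from the paper. It samples such a matrix and evaluates the normalized trace $\frac{1}{n}\operatorname{Tr} A_n^2$.

```python
import random

def wigner_real_symmetric(n, rng):
    """Sample A_n with a_ij = a_ji = xi_ij / sqrt(n), xi_ij = +-1/2 with equal probability."""
    scale = 0.5 / n ** 0.5
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            A[i][j] = A[j][i] = rng.choice((-scale, scale))
    return A

def normalized_trace_of_square(A):
    """(1/n) Tr(A^2) = (1/n) * sum over i, j of a_ij * a_ji."""
    n = len(A)
    return sum(A[i][j] * A[j][i] for i in range(n) for j in range(n)) / n

A = wigner_real_symmetric(50, random.Random(0))
print(normalized_trace_of_square(A))
```

For this particular choice every entry satisfies $a_{ij}^2 = 1/(4n)$, so $\frac{1}{n}\operatorname{Tr} A_n^2$ equals $1/4$ up to rounding, regardless of the random signs. This agrees with the second moment of the density $\frac{2}{\pi}\sqrt{1-u^2}$ on $[-1,1]$, which is consistent with the normalization $E(\xi_{ij})^2 = \frac{1}{4}$ in (1.1').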

References [1–25, 34, 35, 41, 42] contain an extensive collection of works on the subject from the fifties to the present. A famous Wigner semicircle law (see [1, 2, 5–12]) (n) (n) (n) can be formulated as follows. Let λi ≥ λ2 ≥ · · · ≥ λn be the eigenvalues of An . The matrix is either real symmetric or hermitian, therefore all eigenvalues are real. At this point, it is not important which of the two cases we consider. The eigenvalues of random matrices can be considered as random variables. One of the fundamental questions is to study their empirical distribution function Nn (λ) =

1 (n) #{λk ≤ λ, k = 1, . . . , n}. n

The Wigner Semicircle Law claims that Nn (λ) converges to a nonrandom limit Rλ N(λ) = −∞ ρ(u) du as n → +∞, where ( √ 2 1 − u2 , |u| ≤ 1 ρ(u) = π 0, |u| > 1. The convergence is understood to be with probability 1 if entries of all matrices An , n = 1, 2 · · · , are defined on the same probability space. The archetypical examples of Wigner random matrices are the Gaussian Unitary Ensemble (G.U.E.) and Gaussian Orthogonal Ensemble (G.O.E.). The G.U.E. is the 1 ), 1 ≤ ensemble of random hermitian matrices such that Re aij , Im aij ∼ N (0, 8n 1 i < j ≤ n, aii ∼ N(0, 4n ), 1 ≤ i ≤ n, are independent Gaussian random variables. Correspondingly, the G.O.E. is the ensemble of random real symmetric matrices such

Universality at the Edge of the Spectrum in Wigner Random Matrices

699

1+δ

that aij ∼ N(0, 4nij ), 1 ≤ i ≤ j ≤ n are independent Gaussian random variables. Then the joint distribution of matrix elements can be written as P (dAn ) = constn,β · e−β·n·Trace(An ) dAn , 2

(1.4)

where β = 2 corresponds to the G.U.E., β = 1 corresponds to the G.O.E., and dAn is Lebesgue measure on the n2 -dimensional ( n·(n+1) -dimensional) space of matrix ele2 ments. Equation (1.4) implies a nice formula for the induced distribution of the eigenvalues of An in the G.U.E. and G.O.E. cases ([17]): dP (λ1 , . . . , λn ) = Pn,β (λ1 , . . . , λn ) dλ1 . . . dλn , with Pn,β (λ1 , . . . , λn ) = const0n,β · u1≤i<j ≤n |λi − λj |β · e−βn·(λ1 +···+λn ) , 2

2

(1.5)

which in turn allows one to calculate $k$-point correlation functions of the eigenvalues. We recall that $k$-point correlation functions are defined as
$$\rho_{n,\beta,k}(\lambda_1, \dots, \lambda_k) = \frac{n!}{(n-k)!} \int_{\mathbb{R}^{n-k}} P_{n,\beta}(\lambda_1, \dots, \lambda_n)\, d\lambda_{k+1} \cdots d\lambda_n, \tag{1.6}$$
where we integrate out the last $n-k$ variables. For the precise formulas of $\rho_{n,\beta,k}$ we refer to [17], Ch. 5, 6 and [28, 29]. $k$-point correlation functions are particularly useful in calculating the moments of the number of eigenvalues in an interval $I \subset \mathbb{R}^1$. Let $\nu_{n,I}$ be the number of eigenvalues in $I$: $\nu_{n,I} = \#\{\lambda_i^{(n)} : \lambda_i^{(n)} \in I,\ i = 1, 2, \dots, n\}$. Then the mathematical expectation of $\nu_{n,I}$ is given by the formula
$$E\, \nu_{n,I} = \int_I \rho_{n,\beta,1}(x)\, dx \tag{1.7}$$
and, in general,
$$E\, \nu_{n,I} \cdot (\nu_{n,I} - 1) \cdots (\nu_{n,I} - k + 1) = \int_{I^k} \rho_{n,\beta,k}(x_1, x_2, \dots, x_k)\, dx_1 \cdots dx_k. \tag{1.8}$$

The integration in (1.8) is over the $k$-dimensional cube $I^k = I \times \cdots \times I$. To study the most challenging problem in the theory of Wigner matrices, the problem of local distribution of eigenvalues, one has to consider rescaled $k$-point correlation functions. Let $x \in \mathbb{R}^1$ and consider an interval $I_n(x)$ around $x$ such that the average number of eigenvalues in $I_n(x)$ is of order of constant. Since
$$E\, \nu_{n,I_n} = \int_{I_n} \rho_{n,\beta,1}(\lambda)\, d\lambda = O(1) \tag{1.9}$$
should imply $\mathrm{diam}(I_n(x)) \sim \rho_{n,\beta,1}^{-1}(x)$, we see that for $x$ from the bulk of the spectrum, the intervals $I_n(x)$ will shrink to the point $x$. Let us define $I_n(x)$ to be
$$\bigl(x + c_1/\rho_{n,\beta,1}(x),\ x + c_2/\rho_{n,\beta,1}(x)\bigr),$$
where $c_1$ and $c_2$ are some fixed constants. We see that the problem of local distribution of eigenvalues in the neighborhood of $x$ can be reduced to the problem of studying the


A. Soshnikov

distribution of the number of particles in $I_n(x)$ as $n \to +\infty$. To calculate the moments of $\nu_{n,I_n} = \#\bigl\{\lambda_i \in \bigl(x + \frac{c_1}{\rho_{n,\beta,1}(x)},\ x + \frac{c_2}{\rho_{n,\beta,1}(x)}\bigr)\bigr\}$, we consider a rescaling
$$\lambda_i = x + \rho_{n,\beta,1}^{-1}(x) \cdot y_i, \quad i = 1, \dots, k, \tag{1.10}$$
and
$$R_{n,\beta,k}(y_1, \dots, y_k) = \rho_{n,\beta,1}^{-k}(x) \cdot \rho_{n,\beta,k}(\lambda_1, \dots, \lambda_k). \tag{1.11}$$
The functions at the l.h.s. of (1.11) are called rescaled $k$-point correlation functions. It follows immediately from (1.10) that if $\lambda_i \in I_n(x)$, $i = 1, \dots, k$, then the variables $y_1, \dots, y_k$ are of order of constant, namely $y_i \in (c_1, c_2)$, $i = 1, \dots, k$. Moreover, the factorial moments of $\#\bigl(x + c_1/\rho_{n,\beta,1}(x),\ x + c_2/\rho_{n,\beta,1}(x)\bigr)$ are equal to
$$E\, \nu_{n,I_n} \cdot (\nu_{n,I_n} - 1) \cdots (\nu_{n,I_n} - k + 1) = \int_{[c_1, c_2]^k} R_{n,\beta,k}(y_1, \dots, y_k)\, dy_1 \cdots dy_k. \tag{1.12}$$

To show that $\nu_{n,I_n}$ converges to a limit in distribution as $n \to \infty$, one needs to show that the rescaled $k$-point correlation functions have a limit too. This result has been established in the Gaussian cases (see [17]). Consider first the hermitian case $\beta = 2$ (G.U.E.). Then uniformly on the compact subsets of $\mathbb{R}^k$,
$$\lim_{n \to \infty} R_{n,2,k}(y_1, \dots, y_k) = R_{2,k}(y_1, \dots, y_k) = \det\bigl(K(y_i, y_j)\bigr)_{i,j=1}^{k}, \tag{1.13}$$
where $K(y, z)$ is an example of a so-called integrable kernel,
$$K(y, z) = \frac{A(y) \cdot A'(z) - A'(y) \cdot A(z)}{y - z}. \tag{1.14}$$

The amazing fact is that $K$ does not depend on $x$, provided $x$ lies in the bulk of the spectrum, $-1 < x < +1$. Then one can show that $A(y) = \frac{\sin \pi y}{\pi}$. If $x = \pm 1$ the kernel still has the form (1.14), but then $A(y) = \mathrm{Ai}(\pm y)$. Here $\mathrm{Ai}(y)$ stands for the Airy function, which is defined as the solution of the differential equation $f''(y) = y \cdot f(y)$ with the asymptotics
$$f(y) \sim \frac{1}{2\sqrt{\pi} \cdot y^{1/4}}\, e^{-\frac{2}{3} y^{3/2}}$$
as $y \to +\infty$. Looking back at (1.10) one may ask about the order of the spectral density $\rho_{n,2,1}(x)$. The answer is that $\rho_{n,2,1}(x) = n \cdot \frac{2}{\pi}\sqrt{1 - x^2} \cdot (1 + \bar{o}(1))$ uniformly on compact subsets of $(-1, 1)$, and $\rho_{n,2,1}(\pm 1) = O(n^{2/3})$.

In the case of G.O.E., the limiting formulas are slightly more complicated. It appears that
$$R_{1,k}(y_1, \dots, y_k) := \lim_{n \to \infty} R_{n,1,k}(y_1, \dots, y_k)$$
can be expressed as a square root of the determinant of a $2k$-dimensional matrix consisting of $2 \times 2$ blocks $\xi_1(y_i, y_j)$, $1 \le i, j \le k$. To define $\xi_1(y, z)$ we start with the real-valued kernel (1.14) and introduce
$$DK(y, z) = -\frac{d}{dz} K(y, z), \qquad JK(y, z) = -\int_y^{\infty} K(t, z)\, dt - \tfrac{1}{2}\,\mathrm{sgn}(y - z).$$
Then
$$\xi_1(y, z) = \begin{pmatrix} K(y, z) & DK(y, z) \\ JK(y, z) & K(y, z) \end{pmatrix} \tag{1.15}$$
and
$$R_{1,k}(y_1, \dots, y_k) = \sqrt{\det\bigl(\xi_1(y_i, y_j)\bigr)_{i,j=1}^{k}}. \tag{1.16}$$
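As a small numerical illustration of (1.14) in the bulk, the sketch below checks that the choice $A(y) = \sin(\pi y)/\pi$ turns the integrable kernel into the familiar sine kernel $\sin \pi(y - z) / (\pi (y - z))$. The function names are hypothetical and chosen only for this example; the check uses nothing beyond the formula (1.14) itself.

```python
import math


def a_bulk(y):
    """A(y) = sin(pi*y)/pi, the bulk choice of A in (1.14)."""
    return math.sin(math.pi * y) / math.pi


def da_bulk(y):
    """Derivative A'(y) = cos(pi*y)."""
    return math.cos(math.pi * y)


def integrable_kernel(y, z):
    """K(y, z) = (A(y)A'(z) - A'(y)A(z)) / (y - z); requires y != z."""
    return (a_bulk(y) * da_bulk(z) - da_bulk(y) * a_bulk(z)) / (y - z)


def sine_kernel(y, z):
    """sin(pi(y - z)) / (pi(y - z)); requires y != z."""
    return math.sin(math.pi * (y - z)) / (math.pi * (y - z))


# The two expressions agree up to rounding at a few sample points.
for y, z in [(0.3, -1.2), (2.5, 0.7), (-0.4, 0.1)]:
    assert abs(integrable_kernel(y, z) - sine_kernel(y, z)) < 1e-12
```

The agreement is just the subtraction formula for the sine; the sketch does not treat the diagonal $y = z$, where the kernel is defined by continuity.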

In [26, 27, 33] Tracy–Widom and Forrester studied distributions of the first few largest eigenvalues in G.U.E. and G.O.E. (It may be noted that in [27] a Gaussian ensemble of self-dual quaternion matrices, which corresponds to $\beta = 4$ in formula (1.5), was also studied.) It was shown that for any positive integer $k$, the joint distribution function of the first $k$ rescaled eigenvalues $(\lambda_1^{(n)} - 1) \cdot 2n^{2/3}, (\lambda_2^{(n)} - 1) \cdot 2n^{2/3}, \dots, (\lambda_k^{(n)} - 1) \cdot 2n^{2/3}$ has a limit as $n \to \infty$:
$$F_{\beta,k}(s_1, \dots, s_k) = \lim_{n \to \infty} P_{\beta}\Bigl(\lambda_i^{(n)} \le 1 + \frac{s_i}{2n^{2/3}},\ i = 1, \dots, k\Bigr). \tag{1.17}$$
The limiting $k$-dimensional distribution function (which is different in the G.U.E. and G.O.E. cases) can be expressed in terms of the solutions of completely integrable P.D.E. The formulas are the simplest when one considers the maximal eigenvalue $\lambda_1^{(n)}$. Let $q(s)$ be the solution of the Painlevé II differential equation $q''(s) = s q(s) + 2 q(s)^3$ determined by the asymptotics $q(s) \sim \mathrm{Ai}(s)$ at $s = +\infty$. Then for the G.U.E. ($\beta = 2$):
$$F_{2,1}(s) = \lim_{n \to \infty} P\Bigl(\lambda_1^{(n)} \le 1 + \frac{s}{2n^{2/3}}\Bigr) = \exp\Bigl(-\int_s^{+\infty} (x - s) \cdot q^2(x)\, dx\Bigr) \tag{1.18}$$
and for the G.O.E. ($\beta = 1$):
$$F_{1,1}(s) = \lim_{n \to \infty} P\Bigl(\lambda_1^{(n)} \le 1 + \frac{s}{2n^{2/3}}\Bigr) = \exp\Bigl(-\frac{1}{2}\int_s^{+\infty} \bigl(q(x) + (x - s) q^2(x)\bigr)\, dx\Bigr). \tag{1.19}$$
The great attention paid to the local distribution of the eigenvalues has its origin in a general belief that local statistics in random matrices mimic those of the energy levels of highly excited states of heavy nuclei (see e.g. [4] and [17]). The computer simulations of random matrices ([3, 17, 41]) show that local fluctuations are always the same in the limit $n \to \infty$ and determined only by the overall symmetries of the ensemble. This allowed Mehta to formulate the following conjecture that can be found in the introduction of his book on random matrices ([17], p. 9):

Conjecture (Universality in Wigner Matrices). Let $A$ be $n \times n$ real symmetric (hermitian) Wigner random matrices. Then in the limit of large $n$ statistical properties of $k$ eigenvalues of $A$ become independent of the probability distribution of $a_{ij}$. In other words, rescaled $k$-point correlation functions tend for every $k$ to the $k$-point limiting correlation functions of the Gaussian Orthogonal (Unitary) Ensemble given in (1.15–1.16) and (1.13–1.14).

The purpose of our paper is to establish a universality conjecture both for hermitian and real symmetric ensembles of Wigner matrices (1.1–1.3), (1.1′–1.3′) at the edge of the spectrum. We shall prove that in the limit $n \to \infty$ the distribution of the first few rescaled eigenvalues is independent of the marginal distributions of matrix elements.


Main Result. Let us consider a Wigner ensemble of $n \times n$ hermitian or real symmetric matrices ((1.1–1.3) or (1.1′–1.3′)). Fix some arbitrary positive integer $k$ and consider the first $k$ largest eigenvalues of a Wigner matrix: $\lambda_1^{(n)} \ge \lambda_2^{(n)} \ge \cdots \ge \lambda_k^{(n)}$. Then the joint distribution function of the $k$-dimensional random vector with the components $(\lambda_1^{(n)} - 1) \cdot 2n^{2/3}, \dots, (\lambda_k^{(n)} - 1) \cdot 2n^{2/3}$ has a weak limit as $n \to \infty$, which coincides with that in the G.U.E. (G.O.E.) case.

Remarks. 1. This is the first rigorous result about universality at the edge of the spectrum. Recently several groups of mathematicians (Pastur, Shcherbina [44], Deift, Kriecherbauer, McLaughlin, Venakides, Zhou [47], Bleher, Its [46]; see also [45] and [39]) established universality in the bulk of the spectrum for certain classes of unitary invariant ensembles of hermitian random matrices, when
$$P(dA_n) = \mathrm{const}_n \cdot e^{-n \cdot \mathrm{Trace}\, V(A_n)}\, dA_n, \tag{1.20}$$

and $V$, for example, is a polynomial of even degree with a positive leading coefficient. With the exception of a quadratic polynomial $V$, which corresponds to G.U.E., matrix elements of $A$ are strongly correlated.

2. There is a recent paper by Johansson [36] that claims the universality of rescaled two-point correlation functions in the bulk of the spectrum for quite a general class of hermitian Wigner matrices.

3. The main result also holds for the smallest eigenvalues of Wigner matrices.

4. It should be noted that the limiting distribution (1.18) appeared recently in the paper by Baik, Deift and Johansson [48] as a limit of the rescaled distribution of the length of the longest increasing subsequence of a random permutation from $S_n$ (see also [31]). In a very recent development, Okounkov ([50]), Borodin, Okounkov, Olshanski ([53]) and Johansson ([54]) independently generalized the Baik–Deift–Johansson result for an arbitrary number of the rows of partitions of $n$. Among other interesting papers on the subject are [51, 52]. We mention that (1.18) also appeared in another preprint by Johansson ([37]) where shape fluctuations in certain random growth models in two dimensions are studied.

The idea of the proof is to study linear statistics of the form
$$\sum_{j = 1, 2, \dots\, :\, \lambda_j \ge 0} e^{t \cdot \theta_j}, \tag{1.21}$$
where $t > 0$ and $\theta_j$ are obtained from the positive eigenvalues $\lambda_j > 0$ by rescaling
$$\lambda_j = 1 + \frac{\theta_j}{2n^{2/3}}.$$
It follows from the semicircle law that
$$\#\{\lambda_j > 1 - \varepsilon\} = n \cdot \frac{2}{\pi} \int_{1-\varepsilon}^{1} \sqrt{1 - x^2}\, dx \cdot (1 + \bar{o}(1)) \quad \text{(a.e.)}$$
for $0 \le \varepsilon \le 1$. In particular,
$$\#\{\lambda_j > 0\} = \frac{n}{2} + \bar{o}(n).$$


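The main contribution to the linear statistics $\sum_{j : \lambda_j \ge 0} e^{t\theta_j}$ is due to the eigenvalues at the right edge of the spectrum. Indeed, the subsum of (1.21) over $0 \le \lambda_j \le 1 - \varepsilon$ is negligible:
$$\sum_{j\, :\, 0 \le \lambda_j \le 1 - \varepsilon} e^{t \cdot \theta_j} \le \#\{0 \le \lambda_j \le 1 - \varepsilon\} \cdot e^{-2t \cdot \varepsilon \cdot n^{2/3}} \le n \cdot e^{-2t \cdot \varepsilon \cdot n^{2/3}} = \bar{o}(1).$$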

To prove the universality of $k$-point correlation functions, it is quite convenient to study their Laplace transform:
$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(t_1 \cdot y_1 + \cdots + t_k \cdot y_k) \cdot R_{n,\beta,k}(y_1, \dots, y_k)\, dy_1 \cdots dy_k = E \sum_{j_1 \ne \cdots \ne j_k} e^{t_1 \cdot \theta_{j_1}} \cdot e^{t_2 \cdot \theta_{j_2}} \cdots e^{t_k \cdot \theta_{j_k}}, \tag{1.22}$$
where $t_1, t_2, \dots, t_k > 0$. We now use the notation (1.11) for an arbitrary Wigner matrix with $\beta = 2$ corresponding to the hermitian case, and $\beta = 1$ to the real symmetric case. Strictly speaking, $R_{n,\beta,k}$ are functions in a usual sense only if the marginal distributions of matrix elements are absolutely continuous with respect to the Lebesgue measure. Generally, $R_{n,\beta,k}$ are understood to be distributions. The universality then should imply convergence of the r.h.s. of (1.22) to the limit
$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(t_1 y_1 + \cdots + t_k y_k) \cdot R_{\beta,k}(y_1, \dots, y_k)\, dy_1 \cdots dy_k. \tag{1.23}$$

Actually we are going to prove a slightly weaker form of the last property. The reason for the modification comes from the fact that for large positive $\theta_j$'s the corresponding terms in the r.h.s. of (1.22) are going to be exponentially large and therefore even the question of boundedness of the mathematical expectations in the limit $n \to \infty$ is not easy (essentially, we have to control the probabilities of large deviations). To avoid this problem, we consider truncated sums corresponding to $\theta_j$ not greater than $n^{1/6}$. We define
$$S_{n,k}(t_1, \dots, t_k) = {\sum_{j_1 \ne \cdots \ne j_k}}^{\!\!*}\ e^{t_1 \theta_{j_1}} \cdots e^{t_k \theta_{j_k}}, \tag{1.24}$$
where we throw away terms corresponding to
$$\lambda_j > 1 + \frac{1}{2\sqrt{n}}. \tag{1.25}$$
The probability of finding an eigenvalue in $(1 + \frac{1}{2\sqrt{n}}, +\infty)$ is tiny. As we shall show in Corollary 2,
$$P\Bigl(\#\Bigl\{\lambda_j > 1 + \frac{1}{2\sqrt{n}}\Bigr\} > 0\Bigr) \le c_1 \cdot \exp(-c_2 \cdot n^{1/6}), \tag{1.26}$$
where $c_1, c_2$ are some positive constants. The estimate (1.26) implies that for the purpose of calculating the limiting $k$-point correlation functions, it is enough to consider $S_{n,k}(t_1, \dots, t_k)$, rather than the whole sum from the r.h.s. of (1.22). The following two results will be proved in Sect. 5.
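To make the rescaling and the truncation (1.24)–(1.25) concrete, here is a minimal sketch for the case $k = 1$. Given a list of eigenvalues, it forms $\theta_j = 2n^{2/3}(\lambda_j - 1)$ for the positive eigenvalues and sums $e^{t\theta_j}$ over those with $\theta_j \le n^{1/6}$, which is the same condition as $\lambda_j \le 1 + \frac{1}{2\sqrt{n}}$. The function names and the sample eigenvalues are hypothetical and serve only as an illustration.

```python
import math


def edge_variables(eigenvalues, n):
    """theta_j = 2 n^(2/3) (lambda_j - 1), taken over positive eigenvalues only."""
    scale = 2.0 * n ** (2.0 / 3.0)
    return [scale * (lam - 1.0) for lam in eigenvalues if lam > 0]


def truncated_statistic(eigenvalues, n, t):
    """Sum of exp(t * theta_j) over theta_j <= n^(1/6), i.e. S_{n,1}(t)."""
    cutoff = n ** (1.0 / 6.0)
    return sum(
        math.exp(t * theta)
        for theta in edge_variables(eigenvalues, n)
        if theta <= cutoff
    )


# Illustrative input with n = 64: the eigenvalue 1.5 exceeds 1 + 1/(2*sqrt(n))
# and is discarded, while -0.3 is not positive and is skipped.
sample = [1.0, 0.5, -0.3, 1.5]
value = truncated_statistic(sample, n=64, t=1.0)
```

Eigenvalues deep in the bulk contribute exponentially little, so in this example the statistic is dominated by the eigenvalue sitting at the edge.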


Lemma 9. The mathematical expectation of $S_{n,k}(t_1, \dots, t_k)$ converges to the limit (1.23) as $n \to \infty$:
$$E\, S_{n,k}(t_1, \dots, t_k) \xrightarrow[n \to \infty]{} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(t_1 \cdot y_1 + \cdots + t_k y_k) \cdot R_{\beta,k}(y_1, \dots, y_k)\, dy_1 \cdots dy_k$$
for any $t_1, t_2, \dots, t_k > 0$, where $R_{\beta,k}$, $k = 1, 2, \dots$, is the limiting $k$-point correlation function of the G.U.E. ($\beta = 2$) and G.O.E. ($\beta = 1$).

Theorem A. Let $A_n$ be a Wigner random hermitian ((1.1)–(1.3)) or real symmetric ((1.1′)–(1.3′)) matrix. Then rescaled correlation functions at the edge weakly converge to the universal limits (G.U.E. or G.O.E., correspondingly) as $n \to \infty$.

In [22, 23] a special combinatorial technique has been developed to treat statistics (1.24). The main idea is to study traces of high powers of $A_n$. Let us consider
$$\mathrm{Trace}\, A_n^{2[t \cdot n^{2/3}]} = \sum_{j=1}^{n} \lambda_j^{2 s_n},$$
where $s_n = [t \cdot n^{2/3}]$ and $t > 0$. Define rescaling at the edge of the spectrum by $\lambda_j = 1 + \frac{\theta_j}{2n^{2/3}}$ for positive eigenvalues $\lambda_j > 0$, and $\lambda_j = -1 - \frac{\tau_j}{2n^{2/3}}$ for $\lambda_j \le 0$. Then
$$\sum_{j=1}^{n} \lambda_j^{2[t \cdot n^{2/3}]} = {\sum_j}' \Bigl(1 + \frac{\theta_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]} + {\sum_j}'' \Bigl(-1 - \frac{\tau_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$

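where summation in ${\sum_j}'$ is over $\lambda_j > 0$ and in ${\sum_j}''$ is over $\lambda_j \le 0$. To proceed we break both $\sum'$ and $\sum''$ into three subsums: $\sum' = I_1' + I_2' + I_3'$, where
$$I_1' = \sum_{j\, :\, 0 < \lambda_j < 1 - \frac{1}{2n^{1/2}}} \Bigl(1 + \frac{\theta_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$
$$I_2' = \sum_{j\, :\, 1 - \frac{1}{2n^{1/2}} \le \lambda_j \le 1 + \frac{1}{2n^{1/2}}} \Bigl(1 + \frac{\theta_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$
$$I_3' = \sum_{j\, :\, \lambda_j > 1 + \frac{1}{2n^{1/2}}} \Bigl(1 + \frac{\theta_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$
and similarly $\sum'' = I_1'' + I_2'' + I_3''$, where
$$I_1'' = \sum_{j\, :\, -1 + \frac{1}{2n^{1/2}} < \lambda_j \le 0} \Bigl(-1 - \frac{\tau_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$
$$I_2'' = \sum_{j\, :\, -1 - \frac{1}{2n^{1/2}} \le \lambda_j \le -1 + \frac{1}{2n^{1/2}}} \Bigl(-1 - \frac{\tau_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]},$$
$$I_3'' = \sum_{j\, :\, \lambda_j < -1 - \frac{1}{2n^{1/2}}} \Bigl(-1 - \frac{\tau_j}{2n^{2/3}}\Bigr)^{2[t \cdot n^{2/3}]}.$$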

By definition, the sum in $I_2', I_2''$ is over $|\theta_j| \le n^{1/6}$, $|\tau_j| \le n^{1/6}$, while in $I_1', I_1''$ ($I_3', I_3''$) the sum is over $-2n^{2/3} < \theta_j < -n^{1/6}$, $-2n^{2/3} \le \tau_j < -n^{1/6}$ ($\theta_j > n^{1/6}$, $\tau_j > n^{1/6}$). One can see that $I_1', I_1''$ are going to zero as $n \to \infty$:
$$0 \le I_1', I_1'' \le n \cdot \Bigl(1 - \frac{1}{2n^{1/2}}\Bigr)^{2[t n^{2/3}]} \le n \cdot \exp\Bigl(-\frac{t}{2} n^{1/6}\Bigr). \tag{1.27}$$

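If we look at $I_2', I_2''$ then uniformly in $t$ from compact subsets of $(0, +\infty)$ we have
$$I_2' = \sum_{|\theta_j| \le n^{1/6}} e^{t \cdot \theta_j} \cdot \bigl(1 + O(n^{-1/3})\bigr), \qquad I_2'' = \sum_{|\tau_j| \le n^{1/6}} e^{t \cdot \tau_j} \cdot \bigl(1 + O(n^{-1/3})\bigr). \tag{1.28}$$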
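The two elementary estimates (1.27) and (1.28) can be checked numerically for concrete values of $n$ and $t$. The sketch below is only a sanity check, not part of the argument: it evaluates both sides of (1.27), and for (1.28) it returns the ratio of $\bigl(1 + \frac{\theta}{2n^{2/3}}\bigr)^{2[t n^{2/3}]}$ to $e^{t\theta}$, which should differ from 1 by a quantity of order $n^{-1/3}$ when $|\theta| \le n^{1/6}$. Function names are hypothetical, and the check of (1.27) assumes $t \cdot n^{2/3} \ge 1$ so that the integer part is not zero.

```python
import math


def bulk_bound_holds(n, t):
    """Check n(1 - 1/(2 sqrt n))^(2[t n^(2/3)]) <= n exp(-(t/2) n^(1/6)), cf. (1.27)."""
    s = math.floor(t * n ** (2.0 / 3.0))
    lhs = n * (1.0 - 1.0 / (2.0 * math.sqrt(n))) ** (2 * s)
    rhs = n * math.exp(-0.5 * t * n ** (1.0 / 6.0))
    return lhs <= rhs


def edge_ratio(n, t, theta):
    """Ratio of (1 + theta/(2 n^(2/3)))^(2[t n^(2/3)]) to exp(t * theta), cf. (1.28)."""
    big_n = n ** (2.0 / 3.0)
    s = math.floor(t * big_n)
    log_power = 2 * s * math.log1p(theta / (2.0 * big_n))
    return math.exp(log_power - t * theta)


n, t = 10 ** 6, 1.0
cutoff = n ** (1.0 / 6.0)  # n^(1/6), equal to 10 up to rounding
assert bulk_bound_holds(n, t)
for theta in (-cutoff, 0.0, cutoff):
    assert abs(edge_ratio(n, t, theta) - 1.0) < n ** (-1.0 / 3.0)
```

Working with `log1p` keeps the computation stable for large $n$, where the base of the power is very close to 1.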

Finally, let us restrict our attention to $I_3', I_3''$. Such subsums correspond to the large deviations of maximal (minimal) eigenvalues and with probability greater than $1 - c_1 \cdot \exp(-c_2 \cdot n^{1/6})$ will be shown to contain no terms at all. The above arguments combined with some technical considerations in Sect. 2 and 4 will imply
$$\mathrm{Trace}\, A^{2[t n^{2/3}]} - \Bigl(\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} + \sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}\Bigr) \xrightarrow[n \to \infty]{} 0 \quad \text{(a.e.)}. \tag{1.29}$$
We shall also be able to establish that all the moments of the l.h.s. of (1.29) converge to zero. Considering $\mathrm{Trace}\, A^{2[t n^{2/3}] + 1}$, similar arguments will provide
$$\mathrm{Trace}\, A^{2[t n^{2/3}] + 1} - \Bigl(\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} - \sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}\Bigr) \xrightarrow[n \to \infty]{} 0, \tag{1.30}$$
where the convergence is again with probability 1 as well as in $L^p$, $p \ge 1$. It follows from (1.29), (1.30) that
$$\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} - \frac{1}{2} \cdot \bigl(\mathrm{Trace}\, A^{2[t n^{2/3}]} + \mathrm{Trace}\, A^{2[t n^{2/3}] + 1}\bigr) \xrightarrow[n \to \infty]{} 0 \quad \text{(a.e.)}.$$
Linear statistics $\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j}$, $\sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}$ are identically distributed (this follows from the symmetry $A \to -A$ of the model). Once we prove in Sects. 3, 4 that traces of consecutive high powers are asymptotically independent, this will imply asymptotical


independence of the linear statistics. An easy way to explain this is to say that distributions of eigenvalues in far apart regions ($\{\lambda : |\lambda - 1| < \frac{1}{2n^{1/2}}\}$ and $\{\lambda : |\lambda + 1| < \frac{1}{2n^{1/2}}\}$ in our case) are independent in the limit $n \to \infty$. Of course the actual proof requires some work.

In [22, 23] we studied the traces of high powers $A_n^{p_n}$, $p_n = 2[t_n \cdot n^{2/3}],\ 2[t_n \cdot n^{2/3}] + 1$, under the assumption $t_n \xrightarrow[n \to \infty]{} 0$. In particular, we established

Theorem (Sinai, Soshnikov). Let $p_n \to +\infty$ be such that $p_n / n^{2/3} \to 0$ as $n \to +\infty$. Then
$$E(\mathrm{Trace}\, A_n^{p_n}) = \begin{cases} \sqrt{\dfrac{8}{\pi}} \cdot \dfrac{n}{p_n^{3/2}} \cdot (1 + \bar{o}(1)) & \text{if } p_n \text{ even}, \\ 0 & \text{if } p_n \text{ odd}, \end{cases}$$
and the moments of the centralized trace $\mathrm{Trace}\, A_n^{p_n} - E\, \mathrm{Trace}\, A_n^{p_n}$ converge to the moments of $N(0, \frac{1}{\pi})$. Similar results have been established for the joint $k$-dimensional distribution of $\mathrm{Trace}\, A_n^{p_n^{(1)}}, \dots, \mathrm{Trace}\, A_n^{p_n^{(k)}}$ provided that $p_n^{(1)}, \dots, p_n^{(k)}$ are of the same order.
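The constant $\sqrt{8/\pi}$ in the theorem can be cross-checked against the even moments of the semicircle law, $\frac{(2s)!}{s!(s+1)!} 4^{-s}$. The sketch below rests on a heuristic assumption made only for this illustration: that the leading term of $E\, \mathrm{Trace}\, A_n^{2s}$ is $n$ times the $2s$-th semicircle moment. Under that assumption the ratio to $\sqrt{8/\pi} \cdot n / p^{3/2}$ with $p = 2s$ should approach 1 as $s$ grows. Function names are hypothetical.

```python
import math


def semicircle_moment(s):
    """(2s)! / (s! (s+1)!) * 4^(-s), computed via log-gamma to avoid overflow."""
    log_catalan = math.lgamma(2 * s + 1) - math.lgamma(s + 1) - math.lgamma(s + 2)
    return math.exp(log_catalan - s * math.log(4.0))


def leading_trace_term(n, p):
    """sqrt(8/pi) * n / p^(3/2), the even-p expression from the theorem."""
    return math.sqrt(8.0 / math.pi) * n / p ** 1.5


def ratio(n, s):
    """Heuristic leading term n * (2s-th semicircle moment) over the theorem's expression."""
    return n * semicircle_moment(s) / leading_trace_term(n, 2 * s)


# The ratio does not depend on n and tends to 1 from below as s grows.
assert abs(ratio(10 ** 6, 1000) - 1.0) < 0.01
```

This is Stirling's formula in disguise: the Catalan numbers satisfy $\frac{(2s)!}{s!(s+1)!} \sim \frac{4^s}{\sqrt{\pi}\, s^{3/2}}$, and substituting $s = p/2$ gives the factor $\sqrt{8/\pi}\, p^{-3/2}$.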

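The theorem itself says nothing about the moments of $\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ for fixed $t > 0$. However, as an easy corollary we have

Theorem 1. For any $\varepsilon > 0$, $K > 0$ there exists some $\delta(\varepsilon, K) > 0$ such that if $0 < t < \delta$, $1 \le k \le K$, then the $k$th moment of $\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ stays bounded as $n \to \infty$ and
$$(1 - \varepsilon) \cdot \pi^{-\frac{k}{2}} \cdot t^{-\frac{3k}{2}} \le \liminf_{n \to \infty} E\bigl(\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}\bigr)^k \le \limsup_{n \to \infty} E\bigl(\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}\bigr)^k \le (1 + \varepsilon) \cdot \pi^{-\frac{k}{2}} \cdot t^{-\frac{3k}{2}}. \tag{1.31}$$

Proof. Suppose (1.31) is false. Then there exists a sequence $t_m \searrow 0$ such that
$$\limsup_{n \to \infty} E\bigl(\mathrm{Trace}\, A^{2[t_m \cdot n^{2/3}]}\bigr)^k > (1 + \varepsilon) \cdot \pi^{-\frac{k}{2}} \cdot t_m^{-\frac{3k}{2}}$$
(we shall concentrate here only on the lim sup part, since the lim inf part is similar). Then for each $m$ there exists sufficiently large $n_m$ such that
$$E\bigl(\mathrm{Trace}\, A_{n_m}^{2[t_m \cdot n_m^{2/3}]}\bigr)^k > (1 + \varepsilon) \cdot \pi^{-\frac{k}{2}} \cdot t_m^{-\frac{3k}{2}}. \tag{1.32}$$
We can choose $n_m$ so that $t_m \cdot n_m^{2/3} \to +\infty$. Now taking $p_{n_m} = 2[t_m \cdot n_m^{2/3}]$ one concludes that (1.32) contradicts the convergence of $E\bigl(\mathrm{Trace}\, A_{n_m}^{2[t_m \cdot n_m^{2/3}]} - \frac{1}{\sqrt{\pi}} \cdot t_m^{-\frac{3}{2}}\bigr)^k$ to the $k$th moment of $N(0, \frac{1}{\pi})$. □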

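The analogue of the theorem for the joint distribution of $\mathrm{Trace}\, A^{2[t_n \cdot n^{2/3}]}$, $\mathrm{Trace}\, A^{2[t_n \cdot n^{2/3}] + 1}$ says that
$$\bigl(\mathrm{Trace}\, A^{2[t_n \cdot n^{2/3}]} - E\, \mathrm{Trace}\, A^{2[t_n \cdot n^{2/3}]},\ \mathrm{Trace}\, A^{2[t_n \cdot n^{2/3}] + 1}\bigr) \xrightarrow{w} N\bigl(0, \tfrac{1}{\pi} \cdot \mathrm{Id}\bigr)$$
(see [22, 23]) provided $t_n \to 0$ and $t_n \cdot n^{2/3} \to +\infty$.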


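Since
$$\sum_{\theta_j < n^{1/6}} e^{t\theta_j} = \frac{1}{2}\bigl(\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} + \mathrm{Trace}\, A^{2[t \cdot n^{2/3}] + 1}\bigr) \cdot \bigl(1 + O(n^{-1/3})\bigr) + \bar{o}(1), \tag{1.33}$$
$$\sum_{\tau_j < n^{1/6}} e^{t\tau_j} = \frac{1}{2}\bigl(\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} - \mathrm{Trace}\, A^{2[t \cdot n^{2/3}] + 1}\bigr) \cdot \bigl(1 + O(n^{-1/3})\bigr) + \bar{o}(1) \quad \text{(a.e.)}, \tag{1.34}$$
one may hope to establish results similar to Theorem 1 for the linear statistics. This is indeed the case:

Corollary 1. For any $\varepsilon > 0$, $K > 0$ there exists $\delta(\varepsilon, K) > 0$ such that if $0 < t < \delta$, $1 \le k \le K$, then the $k$th moments of $\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j}$, $\sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}$ stay bounded as $n \to \infty$ and
$$(1 - \varepsilon) \cdot \Bigl(\frac{1}{2\sqrt{\pi}}\Bigr)^k \cdot t^{-\frac{3k}{2}} \le \liminf_{n \to \infty} E\Bigl(\sum_{\theta_j < n^{1/6}} e^{t\theta_j}\Bigr)^k \le \limsup_{n \to \infty} E\Bigl(\sum_{\theta_j < n^{1/6}} e^{t\theta_j}\Bigr)^k \le (1 + \varepsilon) \cdot \Bigl(\frac{1}{2\sqrt{\pi}}\Bigr)^k \cdot t^{-\frac{3k}{2}}, \tag{1.35}$$
$$(1 - \varepsilon) \cdot \Bigl(\frac{1}{2\sqrt{\pi}}\Bigr)^k \cdot t^{-\frac{3k}{2}} \le \liminf_{n \to \infty} E\Bigl(\sum_{\tau_j < n^{1/6}} e^{t\tau_j}\Bigr)^k \le \limsup_{n \to \infty} E\Bigl(\sum_{\tau_j < n^{1/6}} e^{t\tau_j}\Bigr)^k \le (1 + \varepsilon) \cdot \Bigl(\frac{1}{2\sqrt{\pi}}\Bigr)^k \cdot t^{-\frac{3k}{2}}. \tag{1.36}$$

Proof. It follows from the arguments around the formulas (1.27), (1.28) that
$$\Bigl|\sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} - \frac{1}{2} \cdot \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} - \frac{1}{2} \cdot \mathrm{Trace}\, A^{2[t \cdot n^{2/3}] + 1}\Bigr| \le r_1 + r_2 + r_3,$$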


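where
$$r_1 = \mathrm{const} \cdot n^{-\frac{1}{3}} \cdot \bigl(\mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} + |\mathrm{Trace}\, A^{2[t \cdot n^{2/3}] + 1}|\bigr),$$
$$r_2 = \sum_{\lambda_j > 1 + \frac{1}{2n^{1/2}}} \lambda_j^{2[t \cdot n^{2/3}] + 1}, \qquad r_3 = \sum_{\lambda_j < -1 - \frac{1}{2n^{1/2}}} -\lambda_j^{2[t \cdot n^{2/3}] + 1}.$$
Then it is enough to show that for any given $K > 0$, the first $K$ moments of $r_1, r_2, r_3$ vanish as $n \to \infty$, provided $t$ is small enough. The statement for $r_1$ immediately follows from Theorem 1. To consider $r_2$, we write
$$r_2 \cdot \Bigl(1 + \frac{1}{2n^{1/2}}\Bigr)^{2[t \cdot n^{2/3}] - 1} \le \mathrm{Trace}\, A^{2[2t \cdot n^{2/3}]}.$$
Therefore
$$r_2 \le \Bigl(1 + \frac{1}{2n^{1/2}}\Bigr)^{1 - 2[t \cdot n^{2/3}]} \cdot \mathrm{Trace}\, A^{2[2t \cdot n^{2/3}]} \le e^{-\frac{t}{2} n^{1/6}} \cdot \mathrm{Trace}\, A^{2[2t \cdot n^{2/3}]}$$
for sufficiently large $n$. Now taking $2t < \delta(\varepsilon, K)$ and applying Theorem 1, one can show that the first $K$ moments of $r_2$ vanish as $n \to \infty$. The case of $r_3$ can be done in a similar fashion. □

As another corollary of Theorem 1, we prove the estimate (1.26):

Corollary 2.
$$P\Bigl(\#\Bigl\{\lambda_j > 1 + \frac{1}{2n^{1/2}}\Bigr\} > 0\Bigr) \le c_1 \cdot \exp(-c_2 \cdot n^{1/6}),$$
where $c_1, c_2$ are some positive constants.

Proof.
$$P\Bigl(\#\Bigl\{\lambda_j > 1 + \frac{1}{2n^{1/2}}\Bigr\} > 0\Bigr) \le E\Bigl(\sum_{\lambda_j > 1 + \frac{1}{2n^{1/2}}} \lambda_j^{2[\frac{\delta}{2} n^{2/3}]}\Bigr) \cdot \Bigl(1 + \frac{1}{2n^{1/2}}\Bigr)^{-2[\frac{\delta}{2} n^{2/3}]}$$
with $\delta = \delta(1, 1)$ from Theorem 1. Then the last inequality implies
$$P\Bigl(\#\Bigl\{\lambda_j > 1 + \frac{1}{2n^{1/2}}\Bigr\} > 0\Bigr) \le 2 \cdot \pi^{-\frac{1}{2}} \cdot \Bigl(\frac{\delta}{2}\Bigr)^{-\frac{3}{2}} \cdot e^{-\frac{\delta}{2} \cdot n^{1/6}}. \qquad \square$$

Remark 5. It will follow from our results in the Sects. 2–5 that
$$E\Bigl(\sum_{\theta_j < n^{1/6}} e^{t\theta_j}\Bigr)^k, \qquad E\Bigl(\sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}\Bigr)^k$$
have limits for any $t > 0$ and $k \in \mathbb{Z}^1_+$.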


The rest of the paper is organized as follows. In Sect. 2 we formulate Theorems 2 and 3 together with some corollaries and revisit a combinatorial problem associated with calculation of moments of $\mathrm{Trace}\, A^{p_n}$. This problem deals with the number of closed paths of length $p_n$ on a complete nonoriented graph with $n$ vertices, where the paths are such that each edge appears an even number of times. In [22, 23] we focused on the properties of typical paths when $p_n / n^{2/3} \to 0$. The case when $p_n$ is proportional to $n^{2/3}$, which is of utmost importance to the statistics at the edge, will be treated in this paper. For a warm-up we shall consider a simpler problem in Sect. 3: the dependence of the behavior of typical closed paths on $p_n$ when there is no additional condition that every edge appears in the path an even number of times. Theorems 2, 3 are proven in Sect. 4. In Sect. 5 we deduce Theorem A from Theorems 2, 3 and prove the main result.

2. Traces of High Powers of Wigner Matrices

We start with a calculation of the mathematical expectation of $\mathrm{Trace}\, A_n^{p_n}$, where $p_n = 2s_n$ or $2s_n + 1$; $s_n = [t \cdot n^{2/3}]$. Clearly,
$$E\, \mathrm{Trace}\, A_n^{p_n} = \sum_{P} E\, a_{i_0, i_1} a_{i_1, i_2} \cdots a_{i_{p_n - 1}, i_0}. \tag{2.1}$$
The sum in (2.1) is taken over all closed paths $P = \{i_0, i_1, \dots, i_{p_n - 1}, i_0\}$, with a distinguished origin, in the set $\{1, 2, \dots, n\}$. We consider the set of vertices $\{1, 2, \dots, n\}$ as a nonoriented graph in which any two vertices are joined by an unordered edge. Since the distributions of the random variables $a_{ij}$ are symmetric, we conclude that the only paths giving nonzero contribution to (2.1) are those for which the number of occurrences of each edge is even. Indeed, due to the independence of $\{a_{ij}\}_{i \le j}$, the mathematical expectation of the product factorizes as a product of mathematical expectations of random variables corresponding to different edges of the path. Therefore if some edge appears in $P$ an odd number of times, at least one factor in the product will be zero. In particular, if the length of $P$ is odd ($p_n = 2[t \cdot n^{2/3}] + 1$), then
$$E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}] + 1} = 0.$$
For the even powers of $A$, we established in Theorem 1 that $E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ is uniformly bounded in $n$ for sufficiently small $t$. We shall generalize this result in the next theorem:

Theorem 2. Let $A_n$ be either a hermitian ((1.1)–(1.3)) or real symmetric ((1.1′)–(1.3′)) Wigner random matrix. Then the following is true:

a) There are some constants $\gamma_1, \gamma_2 > 0$ such that for any $t > 0$,
$$E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} \le \frac{\gamma_1}{t^{3/2}}\, e^{\gamma_2 t^3} \tag{2.2}$$
uniformly in $n \in \mathbb{Z}^1_+$.

b) A subsum of (2.1) that corresponds to the paths, where either at least one edge appears more than twice or there are loops (edges $\{j, j\}$, $j = 1, \dots, n$), goes to zero as $n \to +\infty$.


We will prove Theorem 2 in Sect. 4. A remarkable corollary of it consists of the fact that the limit of $E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ exists for an arbitrary Wigner matrix and is the same as in the special case of G.U.E. (G.O.E.). We start with

Lemma 1. For the Gaussian Unitary Ensemble, $\lim_{n \to \infty} E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ exists for all $t > 0$ and is equal to $2 \cdot \int_{-\infty}^{\infty} e^{t\theta} R_{2,1}(\theta)\, d\theta$.

Lemma 2. For the Gaussian Orthogonal Ensemble, $\lim_{n \to \infty} E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]}$ exists for all $t > 0$ and is equal to $2 \cdot \int_{-\infty}^{\infty} e^{t\theta} R_{1,1}(\theta)\, d\theta$.

Remark 6. The limits for the hermitian and real symmetric case are different, which is not a surprise since local statistics at the edge (e.g. correlation functions) are different for G.U.E. and G.O.E.

We shall prove here Lemma 1 only. The proof for Lemma 2 is quite similar.

Proof of Lemma 1. The main ingredient of the proof is the claim that mathematical expectations of linear statistics $\sum_j e^{t\theta_j}$, $\sum_{j : \theta_j < n^{1/6}} e^{t\theta_j}$, $\sum_j e^{t \cdot \tau_j}$, $\sum_{j : \tau_j < n^{1/6}} e^{t \cdot \tau_j}$ have the same limit as $n \to \infty$. To calculate mathematical expectations of linear statistics, we need to know the exact formula for the spectral density (one-point correlation function), which in the case of G.U.E. is equal to
$$\rho_{n,2,1}(x) = \sqrt{2n} \cdot \sum_{\ell = 0}^{n-1} \psi_\ell^2(\sqrt{2n}\, x) \tag{2.3}$$
(see [17]), where
$$\psi_\ell(x) = \frac{(-1)^\ell}{\pi^{1/4} \cdot (2^\ell \cdot \ell!)^{1/2}} \exp\Bigl(\frac{x^2}{2}\Bigr) \frac{d^\ell}{dx^\ell} \exp(-x^2) \tag{2.4}$$
are known as Weber–Hermite functions or normalized eigenfunctions of the harmonic oscillator:
$$-\frac{1}{2} \frac{d^2 \psi_\ell}{dx^2} + \frac{x^2}{2} \psi_\ell = \Bigl(\ell + \frac{1}{2}\Bigr) \psi_\ell, \quad \ell = 0, 1, \dots. \tag{2.5}$$
For use later, we also write here formulas for $k$-point correlation functions:
$$\rho_{n,2,k}(x_1, \dots, x_k) = \det\bigl(K_n(x_i, x_j)\bigr)_{i,j=1}^{k}, \tag{2.6}$$
where
$$K_n(x, y) = \sqrt{2n} \cdot \sum_{\ell = 0}^{n-1} \psi_\ell(\sqrt{2n}\, x) \cdot \psi_\ell(\sqrt{2n}\, y). \tag{2.7}$$
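The density formula (2.3) is easy to evaluate numerically for small $n$. The sketch below computes the Weber–Hermite functions through the standard three-term recurrence $\psi_{\ell+1}(x) = \sqrt{\tfrac{2}{\ell+1}}\, x\, \psi_\ell(x) - \sqrt{\tfrac{\ell}{\ell+1}}\, \psi_{\ell-1}(x)$, which is equivalent to (2.4) but is not written in the text, and then checks two consequences of orthonormality and of (2.5): the density integrates to $n$, and its second moment equals $n/4$. The function names, the integration window and the step count are choices made for this illustration only.

```python
import math


def hermite_functions(n, x):
    """Return [psi_0(x), ..., psi_{n-1}(x)] using the three-term recurrence."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for l in range(1, n - 1):
        psi.append(
            math.sqrt(2.0 / (l + 1)) * x * psi[l]
            - math.sqrt(l / (l + 1)) * psi[l - 1]
        )
    return psi[:n]


def gue_density(n, x):
    """rho_{n,2,1}(x) = sqrt(2n) * sum_{l<n} psi_l(sqrt(2n) x)^2, as in (2.3)."""
    u = math.sqrt(2.0 * n) * x
    return math.sqrt(2.0 * n) * sum(p * p for p in hermite_functions(n, u))


def integrate(f, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


n = 5
mass = integrate(lambda x: gue_density(n, x), -3.0, 3.0, 6000)
second_moment = integrate(lambda x: x * x * gue_density(n, x), -3.0, 3.0, 6000)
assert abs(mass - n) < 1e-6
assert abs(second_moment - n / 4.0) < 1e-6
```

The value $n/4$ for the second moment agrees with the normalization of the ensemble, in which the expected value of $\mathrm{Trace}\, A_n^2$ is $n/4$.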

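It follows from the asymptotics of Hermite polynomials ([49]) that for $x = 1 + \frac{\theta}{2n^{2/3}}$ we have
$$\lim_{n \to \infty} n^{\frac{1}{12}}\, \psi_n(\sqrt{2n}\, x) = 2^{\frac{1}{4}}\, \mathrm{Ai}(\theta) \tag{2.8}$$
uniformly in $\theta$ bounded from below, and for large positive $\theta$ and arbitrary positive $t$ uniformly in $n$,
$$n^{\frac{1}{12}}\, \psi_n(\sqrt{2n}\, x) = \bar{o}(e^{-t\theta}). \tag{2.9}$$
Equations (2.8), (2.9) imply that for $\theta$ bounded from below we have uniform convergence
$$\lim_{n \to \infty} \frac{1}{2n^{2/3}}\, \rho_{n,2,1}\Bigl(1 + \frac{\theta}{2n^{2/3}}\Bigr) = \int_0^{\infty} \mathrm{Ai}(\theta + t)^2\, dt, \tag{2.10}$$
and similarly for higher correlation functions
$$\lim_{n \to \infty} \Bigl(\frac{1}{2 \cdot n^{2/3}}\Bigr)^{k} \cdot \rho_{n,2,k}\Bigl(1 + \frac{\theta_1}{2n^{2/3}}, \dots, 1 + \frac{\theta_k}{2n^{2/3}}\Bigr) = \det\bigl(K(\theta_i, \theta_j)\bigr)_{i,j=1}^{k} \tag{2.11}$$
uniformly in $\theta_1, \dots, \theta_k$ bounded from below, where
$$K(x, y) = \int_0^{\infty} \mathrm{Ai}(x + t)\, \mathrm{Ai}(y + t)\, dt = \frac{\mathrm{Ai}(x) \cdot \mathrm{Ai}'(y) - \mathrm{Ai}'(x) \cdot \mathrm{Ai}(y)}{x - y}. \tag{2.12}$$
We denote the limit in (2.11) by $R_{2,k}(\theta_1, \dots, \theta_k)$. As an immediate consequence of (2.9), we have
$$E \sum_{\theta_j \ge n^{1/6}} e^{t \cdot \theta_j} = \int_{n^{1/6}}^{+\infty} e^{t \cdot \theta} \cdot R_{n,2,1}(\theta)\, d\theta \xrightarrow[n \to \infty]{} 0. \tag{2.13}$$
For arbitrary positive $T$, formulas (2.8), (2.9) imply
$$E \sum_{\theta_j \ge -T} e^{t \cdot \theta_j} = \int_{-T}^{+\infty} e^{t\theta} \cdot R_{n,2,1}(\theta)\, d\theta \xrightarrow[n \to \infty]{} \int_{-T}^{+\infty} e^{t \cdot \theta} R_{2,1}(\theta)\, d\theta. \tag{2.14}$$
To prove the convergence
$$E \sum_j e^{t \cdot \theta_j} \xrightarrow[n \to \infty]{} \int_{-\infty}^{\infty} e^{t\theta} \cdot R_{2,1}(\theta)\, d\theta, \tag{2.15}$$
we have to justify taking the limit $T \to \infty$ in (2.14). One way to do this is by using Plancherel–Rotach asymptotics of Hermite functions near the turning point for large negative $\theta$. Or one can do it as follows: Let $\delta = \delta(1, 1)$ be as from Corollary 1 (Sect. 1). Take $\kappa = \min(\frac{t}{2}, \frac{\delta}{2})$. Then
$$E \sum_{\theta_j < -T} e^{t \cdot \theta_j} \le E \sum_{\theta_j < -T} e^{\kappa \cdot \theta_j} \cdot e^{-\kappa \cdot T} \le E \sum_{\theta_j < n^{1/6}} e^{\kappa \cdot \theta_j} \cdot e^{-\kappa \cdot T} \le \frac{1}{\sqrt{\pi}}\, \kappa^{-\frac{3}{2}} \cdot e^{-\kappa \cdot T} \xrightarrow[T \to \infty]{} 0 \tag{2.16}$$
uniformly in $n$.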


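Combining (2.13), (2.14) and (2.16), we obtain
$$\lim_{n \to \infty} E \sum_j e^{t\theta_j} = \lim_{n \to \infty} E \sum_{\theta_j < n^{1/6}} e^{t\theta_j} = \int_{-\infty}^{\infty} e^{t\theta} R_{2,1}(\theta)\, d\theta.$$
Because of the symmetry $A \to -A$ of the model, we also conclude that the last equation holds for $E \sum_j e^{t \cdot \tau_j}$ and $E \sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j}$.

To conclude the proof of Lemma 1 we note that formulas (1.27), (1.28) and discussions around them claim
$$\Bigl| E\, \mathrm{Trace}\, A^{2[t \cdot n^{2/3}]} - E \sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} - E \sum_{\tau_j < n^{1/6}} e^{t \cdot \tau_j} \Bigr| \le \dots + E \sum_{\theta_j \ge n^{1/6}} e^{t \cdot \theta_j} + E \sum_{\tau_j > n^{1/6}} e^{t \cdot \tau_j} + 2n \cdot \exp\Bigl(-\frac{t}{2} n^{1/6}\Bigr).$$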

To finish the proof, we note that the r.h.s. in the last inequality goes to zero. □

A similar strategy works for the G.O.E. too and allows us to establish Lemma 2. For the formulas for $k$-point correlation functions in the G.O.E., the reader is referred to [17, 27]. After we formulated Theorem 2 and proved Lemmas 1 and 2, the picture starts to emerge. Assuming we can prove Theorem 2 (we shall do it in Sect. 4), we arrive at the following corollaries.

Corollary 3. For an arbitrary Wigner hermitian matrix (1.1)–(1.3) the limit of $E\, \mathrm{Trace}\, A_n^{2[t \cdot n^{2/3}]}$ exists for all $t > 0$ and coincides with the limit for the G.U.E. from Lemma 1.

Corollary 4. For an arbitrary Wigner real symmetric matrix (1.1′)–(1.3′) the limit of $E\, \mathrm{Trace}\, A_n^{2[t \cdot n^{2/3}]}$ exists for all $t > 0$ and coincides with the limit for G.O.E. from Lemma 2.

The proof of the corollaries is elementary. Subsums of (2.1) over the paths such that each edge appears in the path twice or does not appear at all and there are no loops (i.e., edges of the form $(i, i)$) depend only on the second moments of $\{a_{ij}\}_{1 \le i < j \le n}$ that are the same for all Wigner matrices (within its symmetry class). Since the rest of the sum (2.1) goes to zero according to part b) of Theorem 2, Corollaries 3, 4 are proven. □

To study the higher moments of $\mathrm{Trace}\, A^{p_n}$, $p_n = 2[t \cdot n^{2/3}],\ 2[t \cdot n^{2/3}] + 1$, we write similarly to (2.1),
$$E \prod_{j=1}^{k} \mathrm{Trace}\, A_n^{p_n^{(j)}} = \sum_{P_1, \dots, P_k} E \prod_{j=1}^{k} \prod_{\ell = 0}^{p_n^{(j)} - 1} a_{i_\ell^{(j)}\, i_{\ell+1}^{(j)}}, \tag{2.17}$$


where the sum is taken over all closed paths $P_j = \{i_0^{(j)}, i_1^{(j)}, \dots, i^{(j)}_{p_n^{(j)} - 1}, i_0^{(j)}\}$, $j = 1, \dots, k$, in the set of $n$ vertices. Since each path $P_j$ is closed, its distinguished origin $i_0^{(j)}$ coincides with the end vertex $i^{(j)}_{p_n^{(j)}}$. The next theorem will be proven in Sect. 4. It establishes an analogue of Theorem 2 for higher moments.

Theorem 3. Let $A_n$ be either a hermitian ((1.1)–(1.3)) or real symmetric ((1.1′)–(1.3′)) Wigner random matrix. Then there are some constants $\gamma_1, \gamma_2 > 0$ such that for any $t_1, t_2, \dots, t_k > 0$ and
$$p_n^{(1)} \in \{2[t_1 \cdot n^{2/3}],\ 2[t_1 \cdot n^{2/3}] + 1\}, \dots, p_n^{(k)} \in \{2[t_k \cdot n^{2/3}],\ 2[t_k \cdot n^{2/3}] + 1\},$$
the following holds:

a)
$$E \prod_{i=1}^{k} \mathrm{Trace}\, A_n^{p_n^{(i)}} \le \frac{\gamma_1^k}{\prod_{i=1}^{k} t_i^{3/2}} \cdot \exp\Bigl(\gamma_2 \cdot \sum_{i=1}^{k} t_i^3\Bigr). \tag{2.18}$$

b) A subsum of (2.18) over $k$-tuples of paths $(P_1, \dots, P_k)$ for which at least one nonoriented edge appears more than twice in their union or at least one path has a loop, vanishes in the limit $n \to \infty$.

Let us consider first the Gaussian Ensembles.

Lemma 3. For the Gaussian Unitary Ensemble, the mathematical expectation at the l.h.s. of (2.18) has a limit as $n \to \infty$.

Similarly, for the G.O.E.:

Lemma 4. For the Gaussian Orthogonal Ensemble the mathematical expectation at the l.h.s. of (2.18) has a limit as $n \to \infty$.

The proof of Lemmas 3 and 4 essentially follows from the existence of limiting $k$-point correlation functions in G.U.E. and G.O.E. All we have to do is to consider linear statistics $\sum_j \exp(t_i \cdot \theta_j)$, $\sum_{j : \theta_j < n^{1/6}} \exp(t_i \cdot \theta_j)$, $\sum_j \exp(t_i \cdot \tau_j)$, $\sum_{j : \tau_j < n^{1/6}} \exp(t_i \cdot \tau_j)$, $i = 1, \dots, k$, and note that

(i) their moments have limits as $n \to \infty$;
(ii) $\sum_j \exp(t_i \cdot \theta_j)$, $\sum_j \exp(t_m \cdot \tau_j)$ are asymptotically independent, and
(iii) the moments of $\sum_{j : \theta_j \ge n^{1/6}} \exp(t_i \cdot \theta_j)$, $\sum_{j : \tau_j \ge n^{1/6}} \exp(t_i \cdot \tau_j)$ go to zero.

Assuming that Theorem 3 is proven, we derive an important corollary which we formulate separately for the hermitian and real symmetric cases.

Corollary 5. For an arbitrary Wigner hermitian matrix (1.1)–(1.3), the limit
$$\lim_{n \to \infty} E \prod_{i=1}^{k} \mathrm{Trace}\, A_n^{2[t_i \cdot n^{2/3}] + \varepsilon_i}, \tag{2.19}$$
where $\varepsilon_i = 0, 1$; $i = 1, \dots, k$, exists for all positive $t_1, \dots, t_k$ and coincides with the limit in the G.U.E. case.


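Corollary 6. For an arbitrary Wigner real symmetric matrix (1.1′)–(1.3′), the limit
$$\lim_{n \to \infty} E \prod_{i=1}^{k} \mathrm{Trace}\, A_n^{2[t_i \cdot n^{2/3}] + \varepsilon_i}, \tag{2.20}$$
where $\varepsilon_i = 0, 1$; $i = 1, \dots, k$, exists for all positive $t_1, \dots, t_k$ and coincides with the limit in the G.O.E. case.

Remark 7. Exact formulas for the limits in (2.19), (2.20) can be derived from
$$\lim_{n \to \infty} E \sum_{\theta_{j_1} \ne \theta_{j_2} \ne \cdots \ne \theta_{j_k}} e^{t_1 \cdot \theta_{j_1}} \cdots e^{t_k \cdot \theta_{j_k}} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t_1 \cdot x_1 + \cdots + t_k \cdot x_k} \cdot R_{\beta,k}(x_1, \dots, x_k)\, dx_1 \cdots dx_k. \tag{2.21}$$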

Remark 8. A large class of linear statistics near the edge has been studied in the Gaussian case in [32]. Another work of related interest is [31], where the author studies general $\beta$ ensembles. We also would like to draw attention to [38, 40].

The proof of Theorems 2, 3 has a strong combinatorial flavor. Let us discuss it in more detail. We have shown at the beginning of the section that the calculation of $E\, \mathrm{Trace}\, A_n^{p_n}$ can be reduced to the counting of closed paths of length $p_n$ on a nonoriented complete graph with $n$ vertices under the additional condition that each edge will appear an even number of times. We call the paths satisfying this condition even. In the counting process, we assign to each path a statistical weight $E\, a_{i_0 i_1} \cdot a_{i_1 i_2} \cdots a_{i_{p_n - 1} i_0}$. We assume $p_n$ even since there are no even paths of odd length. In the rest of the paper we shall deal with the real symmetric case; the considerations in the hermitian case are very similar. An interesting observation can be made when $a_{ij}$ are Bernoulli random variables taking values $\pm \frac{1}{2\sqrt{n}}$ with probability $\frac{1}{2}$. Then
$$E\, a_{i_0 i_1} \cdot a_{i_1 i_2} \cdots a_{i_{p_n - 1} i_0} = (2\sqrt{n})^{-p_n} \tag{2.22}$$
for any even path, implying that $E\, \mathrm{Trace}\, A^{p_n} = (2\sqrt{n})^{-p_n} \cdot \#\{$closed even paths of length $p_n$, with distinguished origin, on a complete nonoriented graph with $n$ vertices$\}$. In the case of an arbitrary Wigner matrix, formula (2.22) still holds if the path has no loops (edges $\{i, i\}$) and each edge of the path appears exactly twice. In [1, 2] Wigner proved the celebrated semicircle law by studying even paths of fixed length $p$. He showed that the paths without loops that visit each edge twice are typical, meaning that the ratio of the number of such paths to the number of all even paths goes to 1 as $n \to +\infty$. To study the case when length $p_n$ is growing we need some definitions from [22, 23]:

Definition 1. An instant $\ell = 1, 2, \dots, p_n - 1$ is said to be marked for the closed even path $P = \{i_0, i_1, \dots, i_{p_n - 1}, i_{p_n} = i_0\}$ if the nonoriented edge $\{i_{\ell - 1}, i_\ell\}$ occurs an odd number of times up to the instant $\ell$ (inclusive). The other instants are said to be unmarked.

It follows immediately from the definition that the number of marked instants for a closed even path is equal to the number of unmarked instants.
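For very small $n$ and $p$ the even closed paths can simply be enumerated. The sketch below counts them by brute force and, for Bernoulli entries $\pm \frac{1}{2\sqrt{n}}$, converts the count into the trace moment, using the fact that each even path then has weight $(2\sqrt{n})^{-p}$. It assumes, for this illustration only, that the diagonal entries are Bernoulli as well, so that loops $\{i, i\}$ are treated like any other edge. All function names are hypothetical, and the enumeration costs $n^p$ steps, so it is meant only for tiny cases.

```python
from collections import Counter
from itertools import product


def is_even_closed_path(vertices):
    """vertices = (i_0, ..., i_{p-1}); the path closes by returning to i_0.

    The path is even if every nonoriented edge is traversed an even number of times.
    """
    p = len(vertices)
    edge_counts = Counter(
        frozenset((vertices[k], vertices[(k + 1) % p])) for k in range(p)
    )
    return all(count % 2 == 0 for count in edge_counts.values())


def count_even_closed_paths(n, p):
    """Number of even closed paths of length p, with distinguished origin, on n vertices."""
    return sum(
        1 for path in product(range(n), repeat=p) if is_even_closed_path(path)
    )


def bernoulli_trace_moment(n, p):
    """E Trace A^p for Bernoulli entries +-1/(2 sqrt n): count times (2 sqrt n)^(-p)."""
    return count_even_closed_paths(n, p) / (2.0 * n ** 0.5) ** p


# Every closed path of length 2 is even, and no path of odd length is.
assert count_even_closed_paths(3, 2) == 9
assert count_even_closed_paths(3, 3) == 0
# For n = 2 and p = 4 there are 12 even closed paths.
assert count_even_closed_paths(2, 4) == 12
```

For $p = 2$ the count is $n^2$, which gives $E\, \mathrm{Trace}\, A^2 = n/4$, in agreement with the variance $\frac{1}{4n}$ of the entries.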

Fig. 1. (Figure omitted: a graph on the vertices 1, 2, 3, 4, 5.)

Definition 2. A closed even path $P$ is called a path without self-intersections if, for any two distinct marked instants $\ell'$ and $\ell''$, one has $i_{\ell'} \neq i_{\ell''}$.

For purposes of Definition 2, we also assume instant 0 to be marked. Paths without self-intersections have the following structure. First, there is a series of marked instants when we pass through a number of distinct vertices (the number of vertices that we "discover" during this series is equal to the length of the series). Then there is a series of unmarked instants when we pass, in the reverse order, vertices visited before. At some moment we stop the second series (we do not necessarily "sink" all the way down to the origin of the path) and launch a new series of marked steps, discovering at each of the instants a new vertex. Then we again have a series of unmarked instants when we make a few "steps back", and so on. One can see that for such paths, each edge appears exactly twice and there are no loops. It is important to note that paths without self-intersections are uniquely determined by their values at marked instants.

In [1] Wigner showed that the number of closed even paths of length $p = 2s$ without self-intersections is equal to $\frac{n!}{(n-s)!} \cdot \frac{(2s)!}{s!\,(s+1)!}$, and that such paths are typical in the limit $n \to \infty$. It follows from this argument that the $p$th moment of the limiting spectral density of the eigenvalues is equal to

$$\frac{(2s)!}{s!\,(s+1)!} \cdot \left(\frac{1}{4}\right)^{s} = \frac{2}{\pi} \int_{-1}^{1} x^{2s} \sqrt{1 - x^2}\, dx \tag{2.23}$$

for even $p = 2s$ and 0 for odd $p$. In [13] Füredi and Komlós showed that paths without self-intersections are typical if $p_n$ is growing not faster than $n^{1/6}$. In [22] we proved that this is still true if $p_n/\sqrt{n} \to 0$ as $n \to \infty$. The values $p_n$ of order $\sqrt{n}$ are critical in the sense that, starting with this regime, typical paths have self-intersections [23].

Definition 3. A marked instant $m$ is called an instant of self-intersection if there is a marked instant $m' < m$ such that $i_{m'} = i_m$.

It is important that the instants $m'$, $m$ in Definition 3 are required to be marked. In Fig. 1 we give an example of a path without self-intersections: P = {1 → 5 → 3 → 5 → 2 → 4 → 2 → 5 → 1}. An example of a path with a self-intersection is given in Fig. 2: P = {1 → 5 → 3 → 2 → 5 → 4 → 5 → 3 → 2 → 5 → 1}.
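Definitions 1–3 can be checked mechanically on the two example paths. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, a path is assumed to be given as the list of its vertices $[i_0, i_1, \dots, i_p]$ with $i_p = i_0$, and instant 0 is treated as marked (the convention stated for Definition 2, assumed here for Definition 3 as well).

```python
def marked_instants(path):
    """Marked instants of a closed path [i_0, ..., i_p] with i_p == i_0.

    An instant t is marked if the nonoriented edge {i_{t-1}, i_t} has
    occurred an odd number of times up to t inclusive (Definition 1).
    The last instant p is scanned too; for an even path it is unmarked.
    """
    occurrences = {}  # nonoriented edge -> number of occurrences so far
    marked = []
    for t in range(1, len(path)):
        edge = frozenset((path[t - 1], path[t]))
        occurrences[edge] = occurrences.get(edge, 0) + 1
        if occurrences[edge] % 2 == 1:
            marked.append(t)
    return marked


def self_intersection_instants(path):
    """Marked instants that revisit a vertex already seen at a marked instant.

    Assumption: instant 0 counts as marked, so the origin is 'seen'.
    """
    seen = {path[0]}
    hits = []
    for t in marked_instants(path):
        if path[t] in seen:
            hits.append(t)
        seen.add(path[t])
    return hits


fig1 = [1, 5, 3, 5, 2, 4, 2, 5, 1]          # path of Fig. 1
fig2 = [1, 5, 3, 2, 5, 4, 5, 3, 2, 5, 1]    # path of Fig. 2
print(marked_instants(fig1), self_intersection_instants(fig1))
print(marked_instants(fig2), self_intersection_instants(fig2))
```

For the path of Fig. 1 no marked instant revisits a vertex, while for the path of Fig. 2 the vertex 5 is reached at two marked instants, which is the self-intersection.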

Fig. 2. (Figure omitted: a graph on the vertices 1, 2, 3, 4, 5.)

Definition 4. A vertex $i$ is called a vertex of simple (triple, quadruple, etc.) self-intersection if there are exactly two (three, four, etc.) marked instants $m$ such that $i_m = i$.

It was proven in [23] that if we consider a uniform distribution on the discrete space of all closed even paths of length $p_n = 2s_n$ and assume $s_n/\sqrt{n} \to c > 0$ as $n \to \infty$, then the probability for nonsimple (i.e., triple, quadruple, etc.) self-intersections to occur goes to zero as $n \to \infty$, and the number of simple (or, in the light of the previous line, the number of all) self-intersections converges in distribution to the Poisson law with mean $c^2/2$. If $\sqrt{n} \ll p_n \ll n^{2/3}$ (the notation $a \ll b$ means that the ratio $b/a$ goes to $+\infty$), then the probability to have only simple self-intersections still converges to 1, and the number of simple self-intersections is

$$\frac{s_n^2}{2n}\left(1 + O\!\left(\frac{\sqrt{n}}{s_n}\right)\right)$$

in the sense that

$$\frac{\#\text{ of simple self-intersections} - \frac{s_n^2}{2n}}{s_n/\sqrt{n}} \xrightarrow[n\to\infty]{w} N(0, 1). \tag{2.24}$$

It is quite straightforward to derive (2.24) from results proved in [23], even though this central limit theorem was not explicitly stated there. Among other important results, it was established that for $p_n = o(n^{2/3})$ the probability to have some edge passed more than twice goes to zero.

To study the higher moments of $\mathrm{Trace}\,A_n^{p_n}$, we offered an approach that allows us to count $k$-tuples of closed paths of length $p_n$ satisfying the additional condition that each edge appears in the union of paths an even number of times. In the present paper we extend these techniques to the case $p_n \sim n^{2/3}$. As a warm-up, let us start with a simpler combinatorial problem.

3. Toy Model

As before, assume that we have a nonoriented complete graph on $\{1, 2, \dots, n\}$ (i.e., every vertex is connected with any other vertex by a nonoriented edge). In this section we shall study the ensemble of all closed paths (with distinguished origin) of length $p_n$ on the graph. Therefore, throughout this section, the condition that each edge appears an even number of times no longer holds. The number of all such paths is $n^{p_n}$, and we define a uniform distribution on the space of such paths by assigning to each path the probability $n^{-p_n}$. First, we formulate and prove a few propositions.


Proposition 1. Suppose that $p_n/n \to 0$ as $n \to \infty$. Then the probability of having at least one edge passed more than once goes to zero as $n \to \infty$.

Proof. By a simple counting argument, the number of such paths is not greater than $\frac{p_n (p_n - 1)}{2} \cdot n^2 \cdot n^{p_n - 4}$, and therefore the ratio of the number of such paths to the number of all paths is not greater than $\frac{p_n (p_n - 1)}{2 n^2} \to 0$ as $n \to \infty$. □

Definitions 3′, 4′ are the analogues of Definitions 3, 4 in our situation.

Definition 3′. An instant $m < p_n$ is called an instant of self-intersection if there exists an instant $m' < m$ such that $i_{m'} = i_m$.

Definition 4′. A vertex $i$ is called a vertex of simple (triple, quadruple, etc.) self-intersection if there are exactly two (three, four, etc.) instants $m < p_n$ such that $i_m = i$.

Proposition 2. Let $p_n/\sqrt{n} \to 0$ as $n \to \infty$. Then the probability of having a self-intersection goes to zero.

Proof. Indeed, the number of paths without self-intersections is $n (n-1) \cdots (n - p_n + 1)$. Therefore, the probability in question is $1 - \prod_{k=0}^{p_n-1} \left(1 - \frac{k}{n}\right)$. Since

$$\prod_{k=0}^{p_n-1}\left(1 - \frac{k}{n}\right) = \exp\left(\sum_{k=0}^{p_n-1} \ln\left(1 - \frac{k}{n}\right)\right) = \exp\left(-\sum_{k=0}^{p_n-1}\frac{k}{n} - \sum_{k=0}^{p_n-1}\frac{k^2}{2n^2} - \sum_{k=0}^{p_n-1}\frac{k^3}{3n^3} - \dots\right) = \exp\left(-\frac{p_n(p_n-1)}{2n} + O\!\left(\frac{p_n^3}{n^2}\right)\right),$$

we observe that under the condition of Proposition 2, $1 - \prod_{k=0}^{p_n-1}\left(1 - \frac{k}{n}\right) \to 0$ as $n \to \infty$. □
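The estimate used in the proof of Proposition 2 is easy to inspect numerically. The sketch below is illustrative only (the function names are hypothetical): it compares the exact product $\prod_{k=0}^{p-1}(1 - k/n)$ with the leading term $\exp(-p(p-1)/(2n))$ of its expansion.

```python
import math


def prob_no_self_intersection(n, p):
    """Exact probability that a uniform path of length p on n vertices
    has no self-intersection: prod_{k=0}^{p-1} (1 - k/n)."""
    prob = 1.0
    for k in range(p):
        prob *= 1.0 - k / n
    return prob


def leading_term(n, p):
    """Leading term exp(-p(p-1)/(2n)) of the expansion in the proof."""
    return math.exp(-p * (p - 1) / (2 * n))


for n, p in [(10**6, 100), (10**6, 1000), (10**6, 5000)]:
    print(n, p, prob_no_self_intersection(n, p), leading_term(n, p))
```

With $n = 10^6$, the case $p = 100$ (well below $\sqrt{n}$) gives a probability close to 1, while for $p$ of order $\sqrt{n}$ and larger the probability moves away from 1, in line with the critical scale $\sqrt{n}$.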

Proposition 3. Let $p_n/\sqrt{n} \to c$ as $n \to \infty$. Then the probability of having a nonsimple self-intersection goes to zero. One can also show that the number of simple self-intersections converges in distribution to the Poisson law with mean $c^2/2$.

Remark 9. As a trivial corollary of Proposition 3, we have that the number of all self-intersections also converges to the same Poisson distribution.

Proof of Proposition 3. We shall show that the number of paths with exactly $m$ simple self-intersections is equal to

$$(c^2/2)^m \cdot \frac{1}{m!}\, e^{-c^2/2} \cdot n^{p_n} \cdot (1 + o(1)). \tag{3.1}$$

Let us denote the instants of self-intersections by $0 < j_1 < j_2 < \dots < j_m < p_n$. We are also going to use the notation $j_0 = 0$. Then the number of paths that have their self-intersections (all simple) at these moments is

$$\prod_{k=1}^{m} \Big( (n - j_{k-1} + k - 1) \cdot (n - j_{k-1} + k - 2) \cdots (n - j_k + k + 2) \cdot (n - j_k + k + 1) \cdot (j_k - k + 1) \Big) \tag{3.2}$$


with the agreement that if $j_k = j_{k-1} + 1$, then the $k$th factor in (3.2) is $(j_k - k)$. It is not difficult to see that the product (3.2) is equal to

$$\prod_{r=0}^{p_n - 1 - m} (n - r) \cdot \prod_{k=1}^{m} (j_k - k + 1) \cdot (1 + o(1)) = n^{p_n} \cdot e^{-c^2/2} \cdot \prod_{k=1}^{m} \left( \frac{j_k - k + 1}{n} \right) \cdot (1 + o(1)). \tag{3.3}$$

After taking the summation over $0 < j_1 < j_2 < \dots < j_m < p_n \sim c\sqrt{n}$, we arrive at (3.1). Formula (3.1) implies that the probability of having $m$ (all simple) self-intersections tends to $(c^2/2)^m \cdot \frac{1}{m!}\, e^{-c^2/2}$. Since the limiting probabilities trivially add up to 1, the first part of Proposition 3 follows as well. □

One may conjecture then that triple self-intersections do not occur in typical paths until $p_n$ is of order $n^{2/3}$, quadruple self-intersections do not occur until $p_n$ is of order $n^{3/4}$, etc. This turns out to be true. For simplicity we consider in the next two propositions the case $p_n = O(n^{2/3})$.

Proposition 4. Let $p_n$ go to infinity in such a way that $p_n/\sqrt{n} \to +\infty$ but $p_n/n^{2/3} \to 0$ as $n \to \infty$. Then:

a) If $\eta_n$ is the number of simple self-intersections, then

$$\frac{\eta_n - \frac{p_n^2}{2n}}{p_n/\sqrt{2n}} \xrightarrow[n\to\infty]{w} N(0, 1).$$

b) The probability of having a nonsimple self-intersection goes to zero.

Proof. The calculations are very similar to those in Proposition 3. Let us denote by $Z(m)$ the number of paths with $m$ self-intersections (all simple). Then the arguments above show that

$$Z(m) = \left(\frac{p_n^2}{2n}\right)^{m} \cdot \frac{1}{m!}\, e^{-p_n^2/2n} \cdot n^{p_n} \cdot (1 + o(1)) \tag{3.4}$$

uniformly in $0 \le m \le 10\,\frac{p_n^2}{n}$ (we can replace 10 here by any other constant). Then b) can be proved as before and a) follows from the Central Limit Theorem for a sum of independent identically distributed Poisson random variables. □

Proposition 5. Let $p_n/n^{2/3} \to c$ as $n \to \infty$. Then

a) The probability of having a self-intersection of any kind other than simple or triple goes to zero as $n \to \infty$.

b) The number of triple self-intersections converges in distribution to the Poisson law with mean $c^3/6$.


Proof. This is the first time in this section when some technicalities may appear. Trying to imitate the proofs of Propositions 3, 4 we denote by

$$0 < j_1^{(1)} < j_2^{(1)} < \dots < j_m^{(1)} < p_n \tag{3.5}$$

the instants of $m$ simple self-intersections, and by

$$(j_{1,1}^{(2)}, j_{1,2}^{(2)}),\ (j_{2,1}^{(2)}, j_{2,2}^{(2)}),\ \dots,\ (j_{\ell,1}^{(2)}, j_{\ell,2}^{(2)}), \qquad 0 < j_{1,1}^{(2)} < j_{2,1}^{(2)} < j_{3,1}^{(2)} < \dots < j_{\ell,1}^{(2)} < p_n, \qquad 0 < j_{k,1}^{(2)} < j_{k,2}^{(2)} < p_n,\ k = 1, 2, \dots, \ell \tag{3.6}$$

the pairs of instants corresponding to $\ell$ triple self-intersections. The notations mean that we revisit the site of the $r$th simple self-intersection, $r = 1, \dots, m$, at the instant $j_r^{(1)}$, and the site of the $k$th triple self-intersection, $k = 1, \dots, \ell$, at the instants $j_{k,1}^{(2)} < j_{k,2}^{(2)}$. Let us denote by $T(\ell)$ the number of paths that have exactly $\ell$ triple self-intersections and no self-intersections of higher order. Then employing a counting argument, one may hope to end up with something like this:

$$T(\ell) \sim \sum_{m=0}^{(p_n - 3\ell)/2} \sum_{j} \left( \prod_{t=0}^{p_n - 1 - m - 2\ell} (n - t) \right) \cdot \prod_{r=1}^{m} \left(j_r^{(1)} - r + 1\right) \cdot \prod_{k=1}^{\ell} \left(j_{k,1}^{(2)} - k + 1\right) \cdot (1 + o(1)), \tag{3.7}$$

where the sum $\sum_j$ is taken over indices (3.5), (3.6). However, one promptly realizes that arguments similar to those from Proposition 3 would just prove that the r.h.s. of (3.7) is an upper bound of $T(\ell)$. Therefore slightly different arguments are needed to finish the proof. Actually the proof we offer below is easier than the outlined approach. Yet we spend some time discussing it on purpose, since some reincarnation of these arguments will be used in Sect. 4 to derive estimates from above for $E\,\mathrm{Trace}\,A^{2\cdot[t\cdot n^{2/3}]}$ and higher moments.

To proceed with the proof of the proposition, we observe that the number of closed paths with $m$ simple self-intersections, $\ell$ triple self-intersections and zero higher-order self-intersections is equal to

$$\frac{p_n!}{(p_n - 2m - 3\ell)!\,(2m)!\,(3\ell)!} \cdot \frac{(2m)!}{m! \cdot (2!)^m} \cdot \frac{(3\ell)!}{\ell! \cdot (3!)^{\ell}} \cdot \frac{n!}{(n - p_n + m + 2\ell)!}. \tag{3.8}$$

Let us assume for a minute that

$$\left| m - \frac{p_n^2}{2n} \right| < n^{1/4}, \qquad \ell < \log n. \tag{3.9}$$

By Stirling's formula, (3.8) is equal to

$$\frac{p_n^{p_n} \sqrt{2\pi p_n}}{e^{p_n}} \cdot \frac{e^{p_n - 2m - 3\ell}}{(p_n - 2m - 3\ell)^{p_n - 2m - 3\ell} \sqrt{2\pi (p_n - 2m - 3\ell)}} \cdot \frac{1}{2^m \cdot m!} \cdot \frac{1}{6^{\ell} \cdot \ell!} \cdot \prod_{t=0}^{p_n - 1 - m - 2\ell} (n - t). \tag{3.10}$$


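Writing

$$\prod_{t=0}^{p_n - 1 - m - 2\ell} (n - t) = n^{p_n - m - 2\ell} \cdot \exp\left( -\frac{(p_n - 1 - m - 2\ell)(p_n - m - 2\ell)}{2n} - \frac{1}{6n^2} (p_n - 1 - m - 2\ell)^3 \right) \cdot (1 + o(1)) = n^{p_n - m - 2\ell} \cdot \exp\left( -\frac{p_n^2}{2n} + \frac{p_n^3}{2n^2} - \frac{p_n^3}{6n^2} \right) \cdot (1 + o(1))$$

and

$$\left( \frac{p_n}{p_n - 2m - 3\ell} \right)^{p_n - 2m - 3\ell} = \left( 1 + \frac{2m + 3\ell}{p_n - 2m - 3\ell} \right)^{p_n - 2m - 3\ell} = \exp\left( 2m + 3\ell - \frac{(2m + 3\ell)^2}{2p_n - 4m - 6\ell} \right) \cdot (1 + o(1)) = \exp\left( 2m + 3\ell - \frac{p_n^3}{2n^2} \right) \cdot (1 + o(1)),$$

we conclude that the probability of having exactly $m + \ell$ self-intersections, $m$ of which are simple and $\ell$ are triple, is

$$\left(\frac{p_n^2}{2n}\right)^{m} \cdot \frac{1}{m!}\, e^{-\frac{p_n^2}{2n}} \cdot \left(\frac{p_n^3}{6n^2}\right)^{\ell} \cdot \frac{1}{\ell!}\, e^{-\frac{p_n^3}{6n^2}} \cdot (1 + o(1)) \tag{3.11}$$

uniformly in $\left|m - \frac{p_n^2}{2n}\right| < n^{1/4}$, $\ell < \log n$. One readily recognizes (3.11) as a two-dimensional Poisson distribution. Then

$$\sum_{m:\ \left|m - \frac{p_n^2}{2n}\right| < n^{1/4}} \left(\frac{p_n^2}{2n}\right)^{m} \cdot \frac{1}{m!}\, e^{-\frac{p_n^2}{2n}} = 1 - o(1),$$

which implies that the probability of having $\ell$ triple self-intersections (together with some number of simple self-intersections) is equal to $\left(\frac{c^3}{6}\right)^{\ell} \cdot \frac{1}{\ell!}\, e^{-\frac{c^3}{6}} \cdot (1 + o(1))$. □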
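The exact count (3.8) can be compared with a brute-force enumeration on very small graphs. The sketch below is only an illustration under stated assumptions: a path of the toy model is represented as a sequence $(i_0, \dots, i_{p-1})$ of vertices, "simple" and "triple" are read as in Definition 4′ (a vertex visited exactly two, respectively three, times), and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import factorial


def count_formula(n, p, m, l):
    """Formula (3.8) after cancelling (2m)! and (3l)!: number of paths with
    exactly m doubly visited and l triply visited vertices, none higher."""
    singles = p - 2 * m - 3 * l      # vertices visited exactly once
    distinct = p - m - 2 * l         # number of distinct vertices on the path
    if singles < 0 or distinct > n:
        return 0
    # ways to split the p instants into m pairs, l triples and singletons
    patterns = factorial(p) // (
        factorial(singles) * factorial(m) * 2**m * factorial(l) * 6**l
    )
    # ways to assign distinct vertices to the groups: n! / (n - p + m + 2l)!
    labelings = factorial(n) // factorial(n - distinct)
    return patterns * labelings


def count_brute_force(n, p):
    """Tally all n**p paths by (number of double, number of triple) vertices,
    keeping only paths with no vertex visited more than three times."""
    tally = Counter()
    for path in product(range(n), repeat=p):
        multiplicities = Counter(Counter(path).values())
        if all(k <= 3 for k in multiplicities):
            tally[(multiplicities[2], multiplicities[3])] += 1
    return tally


brute = count_brute_force(4, 5)
for (m, l), count in sorted(brute.items()):
    print(m, l, count, count_formula(4, 5, m, l))
```

The enumeration is exponential in $p$, so it is only meant as a sanity check of the combinatorial factors for tiny $n$ and $p$.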

A generalization of Propositions 4, 5 is quite straightforward.

Proposition 6. a) Let $p_n = o(n^{\frac{k-1}{k}})$, $k = 2, 3, \dots$. Then the probability of having a self-intersection of order $k$ or higher goes to zero as $n \to \infty$.

b) Let the limit of the ratio $p_n / n^{\frac{k-1}{k}}$ exist and be equal to $c$. Then the number of self-intersections of order $k$ is distributed in the limit according to the Poisson law with mean $\frac{c^k}{k!}$.

In [22, 23] we showed that the analogues of Propositions 1–4 hold when we impose the additional condition for closed paths to be even. One can view Theorem 2 in this paper as a partial result toward establishing Proposition 5 for even closed paths. These results suggest that the following conjecture may be true. Consider an ensemble of closed paths of length $p_n = m \cdot s_n$ on the complete nonoriented graph with $n$ vertices, with the additional condition that each edge appears in the path a number of times divisible by $m$ (if $m = 2$ such paths are exactly even paths). Definitions 2–4 then need trivial modifications. We shall formulate our conjecture as an open problem.

Open Problem. Prove the analogues of Propositions 1–6.
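The Poisson mean $c^k/k!$ in Proposition 6 b) for the toy model can be compared with an exact expectation. This is a hedged illustration, not the proof: for a uniform path of length $p$, each vertex is visited a Binomial$(p, 1/n)$ number of times, so the expected number of vertices visited exactly $k$ times is $n \binom{p}{k} n^{-k} (1 - 1/n)^{p-k}$. The function names are hypothetical.

```python
from math import comb, factorial


def expected_k_fold_vertices(n, p, k):
    """Exact mean number of vertices visited exactly k times by a
    uniformly chosen path of length p on n vertices."""
    return n * comb(p, k) * (1 / n) ** k * (1 - 1 / n) ** (p - k)


def poisson_mean(c, k):
    """Limiting mean c^k / k! from Proposition 6 b)."""
    return c**k / factorial(k)


n = 10**6
for k, c in [(2, 2.0), (3, 1.5)]:
    p = round(c * n ** ((k - 1) / k))   # p_n ~ c * n^{(k-1)/k}
    print(k, p, expected_k_fold_vertices(n, p, k), poisson_mean(c, k))
```

Only the means are compared here; the convergence in distribution asserted in the proposition is not checked by this sketch.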


4. Proof of Theorems 2 and 3

The considerations in this section are very close to those in Sects. 4, 5 of [23]. We shall start with Theorem 2. Since for odd $p_n$, $E\,\mathrm{Trace}\,A_n^{p_n} = 0$, we assume $p_n$ even, $p_n = 2s_n$. According to Definition 4 from Sect. 2 all the vertices split into $s_n + 1$ disjoint subsets: $\{1, \dots, n\} = \bigsqcup_{k=0}^{s_n} N_k$ with respect to the path $P$, where $N_k$ is the subset of vertices of $k$-fold self-intersections. All vertices of $N_0$, with the possible exception of the initial point of the path $i_0$, do not belong to $P$. Denoting $n_k = \#(N_k)$ we see that

$$\sum_{k=0}^{s_n} n_k = n, \qquad \sum_{k=0}^{s_n} k \cdot n_k = s_n. \tag{4.1}$$

We say that $P$ is a path of type $(n_0, n_1, \dots, n_{s_n})$. It is easy to see that every path without self-intersections is a path of type $(n - s_n, s_n, 0, 0, \dots, 0)$. In the condition of Theorem 2, we assume that $s_n = [t \cdot n^{2/3}]$, $t > 0$. It will then follow from our proof that the probability of having $n_j = 0$ for all $j > 3$ goes to 1. (If Proposition 6 from the previous section also holds for even closed paths, it will imply that under the condition $s_n = o(n^{\frac{k-1}{k}})$, for typical paths, $n_j = 0$ for $j \ge k$, and if $s_n \sim n^{\frac{k-1}{k}}$, then for typical paths $n_k$ is of order of a constant.)

In [23] we introduced the notions of closed and nonclosed vertices of simple self-intersections and proved that for $s_n = o(n^{2/3})$, all vertices of simple self-intersections for typical paths are closed. Below we construct examples of closed and nonclosed vertices of simple self-intersections.

Example 1.

Fig. 3. (Figure omitted: a graph on the vertices i, j, k, ℓ, m, n.)

P = {i → j → k → ℓ → k → j → m → k → n → k → m → j → i}. For this path $i \in N_0$; $j, \ell, m, n \in N_1$; $k \in N_2$, and $k$ is a closed vertex of simple self-intersection. Note that $j$ belongs to $N_1$, not to $N_2$, since we arrive at $j$ at a marked instant only once (from $i$), while during the three other arrivals at $j$ we pass through the corresponding edges for a second time. The vertex $k$ in this example is closed because if, after the moment of self-intersection, we wanted to leave $k$ along an already appeared edge, we had only one such possibility (along the edge $\{k, m\}$).
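The type $(n_0, n_1, \dots)$ of the path in Example 1 can be recomputed directly from the definitions. The sketch below is illustrative (the function name is hypothetical, and the vertex ℓ is written as the letter `l`): it counts, for every vertex, the number of arrivals at marked instants and then tallies how many vertices have each count.

```python
from collections import Counter


def path_type(path, vertices):
    """Return a Counter mapping k -> n_k, where n_k is the number of
    vertices reached at exactly k marked instants (the class N_k)."""
    edge_count = Counter()   # nonoriented edge -> occurrences so far
    arrivals = Counter()     # vertex -> number of marked arrivals
    for t in range(1, len(path)):
        edge = frozenset((path[t - 1], path[t]))
        edge_count[edge] += 1
        if edge_count[edge] % 2 == 1:   # instant t is marked
            arrivals[path[t]] += 1
    return Counter(arrivals[v] for v in vertices)


example1 = list("ijklkjmknkmji")   # i -> j -> k -> l -> k -> j -> m -> k -> n -> k -> m -> j -> i
type1 = path_type(example1, "ijklmn")
print(dict(type1))
print(sum(k * n_k for k, n_k in type1.items()))   # should equal s_n, cf. (4.1)
```

Here only the six vertices of the example are passed in, so $n_0$ counts the origin alone; on the full graph $n_0$ would also include every vertex not on the path.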

Example 2.

Fig. 4. (Figure omitted: a graph on the vertices i, j, k, m, q.)

P = {i → j → k → m → j → q → j → k → m → j → i}. For this path $i \in N_0$; $k, m, q \in N_1$; $j \in N_2$, and $j$ is a nonclosed vertex of simple self-intersection, because if we wanted to leave $j$ after the moment of self-intersection along an already appeared edge, we had more than one such opportunity ($\{j, k\}$, $\{j, m\}$, $\{j, i\}$; in the case of $P$ we choose $\{j, k\}$).

As we already explained before, even closed paths without self-intersections possess a remarkable property: the trajectory of the path is determined uniquely by the initial point and the restriction of the trajectory to the marked instants. This property greatly simplifies the problem of counting such paths. For a path with self-intersections, the choice of continuations of the trajectory during unmarked instants (the choice of the "backward trajectory") may not be unique (see Example 2). For example, for the "first return" from a vertex of simple self-intersection, one of the following three edges can be chosen:

(a) the edge we used to arrive at the vertex for the first time,
(b) the edge we used to leave the vertex right after the first arrival,
(c) the edge we used to arrive at the vertex for the second time.

One can see that in Example 2 we have chosen possibility (b). Such considerations prompted us to call a vertex of simple self-intersection closed if there is a unique way of continuing the trajectory at an unmarked step when we "return" from the vertex. Otherwise, we call a vertex nonclosed. A crucial observation made in [23] is that the probability for a vertex of simple self-intersection to be nonclosed is of order $\frac{1}{\sqrt{s_n}}$. Since the number of all self-intersections is $O\!\left(\frac{s_n^2}{n}\right)$, we see then that for typical paths there are no nonclosed vertices provided $\frac{s_n^{3/2}}{n} \to 0$ as $n \to \infty$. We will show below that if $s_n \sim n^{2/3}$, then for typical paths the number of nonclosed vertices of simple self-intersections is of order of a constant.

Let us start with the formula (2.1):

$$E\,\mathrm{Trace}\,A_n^{p_n} = \sum_{i_0, i_1, \dots, i_{p_n - 1} = 1}^{n} E\,a_{i_0, i_1} a_{i_1, i_2} \cdots a_{i_{p_n - 1}, i_0}.$$


We have shown in [23] that a subsum of (2.1) over the paths of type $(n_0, n_1, \dots, n_{s_n})$ is bounded from above by

$$n^{-s_n} \cdot \frac{n!}{n_0!\, n_1! \cdots n_{s_n}!} \cdot \frac{s_n!}{\prod_{k=1}^{s_n} (k!)^{n_k}} \cdot n \cdot \frac{(2s_n)!}{s_n! \cdot (s_n + 1)!} \cdot 4^{-s_n} \cdot \prod_{k=2}^{s_n} (\mathrm{const}_1 \cdot k)^{2k \cdot n_k}. \tag{4.2}$$

The last estimate followed from

$$\max_{P \text{ of type } (n_0, n_1, \dots, n_{s_n})} E \prod_{\ell=0}^{2s_n - 1} \xi_{i_\ell i_{\ell+1}} \cdot W_n(P \mid \text{marked instants}) \le (4n)^{-s_n} \cdot 3^{r} \cdot \prod_{k=3}^{s_n} (\mathrm{const} \cdot k)^{k n_k}, \quad \mathrm{const} > 0, \tag{4.3}$$

where $W_n$ is the number of ways the trajectory can be chosen at unmarked instants provided that the vertices at the marked instants have already been chosen, and $r$ is the number of nonclosed vertices. One can show then the existence of another positive constant $\mathrm{const}_2 > 0$ such that the subsum of (2.1) over the paths for which $\sum_{k=2}^{s_n} k \cdot n_k \ge \mathrm{const}_2 \cdot \frac{s_n^2}{n}$ tends to zero as $n \to \infty$. The actual value of $\mathrm{const}_2$ is not important. One can show, for example, that $\mathrm{const}_2 = 10$ is enough. Therefore we restrict our attention to the paths for which

$$\sum_{k=2}^{s_n} k \cdot n_k < 10\, \frac{s_n^2}{n}. \tag{4.4}$$

Our counting strategy will be the following. First we associate to every path $P$ a trajectory $X = \{x(t),\ 0 \le t \le 2s_n\}$ of a simple walk on the nonnegative half-lattice. The trajectory starts and ends at zero, $x(0) = x(2s_n) = 0$, and if $0 < t_1 < t_2 < \dots < t_{s_n} < 2s_n$ are the marked instants, then $x(t) - x(t-1) = 1$ if $t$ is marked, and $x(t) - x(t-1) = -1$ if $t$ is unmarked. Let $P$ have $M$ instants of self-intersection, that is, $M = \sum_{k=2}^{s_n} (k - 1) \cdot n_k$. We denote by

$$t_{j_1^{(1)}} < t_{j_2^{(1)}} < \dots < t_{j_{n_2}^{(1)}}, \qquad 1 \le j_1^{(1)} < j_2^{(1)} < \dots < j_{n_2}^{(1)} \le s_n \tag{4.5}$$

the instants of simple self-intersections. Some of these instants may correspond to nonclosed vertices. Assume that the number of nonclosed vertices is $r$, and the instants of simple self-intersections corresponding to the nonclosed vertices are

$$t_{u_1} < t_{u_2} < \dots < t_{u_r}, \qquad 1 \le u_1 < u_2 < \dots < u_r \le s_n. \tag{4.6}$$

We denote the pairs of indices of marked instants corresponding to triple self-intersections by $(j_{1,1}^{(2)}, j_{1,2}^{(2)}), (j_{2,1}^{(2)}, j_{2,2}^{(2)}), \dots, (j_{n_3,1}^{(2)}, j_{n_3,2}^{(2)})$ and order them:

$$1 \le j_{1,1}^{(2)} < j_{2,1}^{(2)} < \dots < j_{n_3,1}^{(2)} \le s_n, \qquad j_{\ell,1}^{(2)} < j_{\ell,2}^{(2)}, \quad \ell = 1, \dots, n_3. \tag{4.7}$$


The notations in (4.7) mean that the $\ell$th vertex of triple self-intersection is visited for a second time at the marked instant $t_{j_{\ell,1}^{(2)}}$, and for a third time at the marked instant $t_{j_{\ell,2}^{(2)}}$. Similarly, we denote by $(j_{1,1}^{(k)}, \dots, j_{1,k}^{(k)}), \dots, (j_{n_{k+1},1}^{(k)}, \dots, j_{n_{k+1},k}^{(k)})$ the $k$-tuples of indices of $(k+1)$-fold self-intersections. We order them in such a way that

$$1 \le j_{1,1}^{(k)} < j_{2,1}^{(k)} < \dots < j_{n_{k+1},1}^{(k)} \le s_n, \qquad j_{\ell,1}^{(k)} < j_{\ell,2}^{(k)} < \dots < j_{\ell,k}^{(k)}, \quad \ell = 1, \dots, n_{k+1}. \tag{4.8}$$

The notations imply that the $\ell$th vertex of $(k+1)$-fold self-intersection is visited at the marked instants $t_{j_{\ell,1}^{(k)}}, \dots, t_{j_{\ell,k}^{(k)}}$ after the first visit has occurred.

To refine our classification of simple self-intersections, we shall do the following. If we look at the two edges along which we arrived at marked instants at a vertex of simple self-intersection, then there are two possibilities depending on whether these two edges coincide or not. If they coincide, then the edge appears in the path four times (twice at marked instants and twice at unmarked instants). Let us denote the number of vertices from the class $N_2$ which we visit both times at marked instants along the same edge by $q$, and the corresponding instants by

$$t_{v_1} < t_{v_2} < \dots < t_{v_q}, \qquad 1 \le v_1 < v_2 < \dots < v_q \le s_n. \tag{4.9}$$

Finally we introduce the following characteristic of $P$: the maximum of all numbers of vertices that can be visited at marked instants from a vertex of the path. We denote this maximum by $\nu_n(P)$. By definition, each vertex of the path can be the left end of at most $\nu_n(P)$ marked edges. The actual order of $\nu_n(P)$ for typical even paths is not known (it is probably $\log s_n$). We shall show below that the subsum of (2.1) over paths with $\nu_n(P) > s_n^{\frac{1}{2} - \varepsilon}$ is $o(1)$ (actually we can replace $\frac{1}{2} - \varepsilon$ in the exponent by any $\gamma > 0$).

Let us denote by $Z(n_1, n_2, \dots, n_{s_n}; r, q)$ a subsum of (2.1) over the paths with fixed $n_k$, $k = 1, \dots, s_n$, and $r$, $q$, that also satisfy the condition $\sum_{k=2}^{s_n} k \cdot n_k < 10\,\frac{s_n^2}{n}$. We shall split it into two sums:

$$Z(n_1, n_2, \dots, n_{s_n}; r, q) = Z'(n_1, n_2, \dots, n_{s_n}; r, q) + Z''(n_1, n_2, \dots, n_{s_n}; r, q),$$

where the first subsum is over the paths for which $\nu_n(P) \le s_n^{\frac{1}{2} - \varepsilon}$, $\varepsilon$ fixed, and the second is over the rest. We define $Z'$ as the sum of all $Z'(n_1, \dots, n_{s_n}; r, q)$, and similarly for $Z''$. Our next goal is to obtain a nice (Poisson) upper bound for $Z'(n_1, n_2, \dots, n_{s_n}; r, q)$.

Let $\Omega$ be the collection of all $X = \{x(t),\ t = 0, 1, \dots, 2s_n\}$ for which $x(0) = x(2s_n) = 0$; $x(t) \ge 0$; $x(t+1) - x(t) = \pm 1$ for all $t$. The number of elements in $\Omega$ is equal to $\frac{(2s_n)!}{s_n! \cdot (s_n + 1)!}$. The mathematical expectation with respect to the uniform probability distribution on $\Omega$ will be denoted by $E_{n,X}$.
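The number of nonnegative walks of length $2s$ that start and end at zero can be recomputed by dynamic programming and compared with the closed form $\frac{(2s)!}{s!\,(s+1)!}$. This is a minimal sketch for small $s$; the function names are hypothetical.

```python
from math import factorial


def count_walks(s):
    """Number of walks x(0), ..., x(2s) with steps +1 or -1, x(t) >= 0
    for all t, and x(0) = x(2s) = 0, computed by dynamic programming."""
    counts = {0: 1}   # height -> number of admissible walks ending there
    for _ in range(2 * s):
        nxt = {}
        for height, ways in counts.items():
            for step in (1, -1):
                h = height + step
                if h >= 0:
                    nxt[h] = nxt.get(h, 0) + ways
        counts = nxt
    return counts.get(0, 0)


def closed_form(s):
    """(2s)! / (s! (s+1)!)."""
    return factorial(2 * s) // (factorial(s) * factorial(s + 1))


for s in range(1, 9):
    print(s, count_walks(s), closed_form(s))
```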


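Lemma 5. There are positive constants $A$, $B$, $C$, $D$ such that

$$Z'(n_1, \dots, n_{s_n}, r, q) \le n \cdot \frac{(2s_n)!}{s_n! \cdot (s_n + 1)!} \cdot 4^{-s_n} \cdot \frac{1}{(n_2 - r - q)!} \left( \frac{s_n^2}{2n} \right)^{n_2 - r - q} \cdot e^{-\frac{s_n^2}{2n}} \cdot e^{A \cdot \frac{s_n^3}{n^2}} \cdot \frac{1}{r!} \left( \frac{B \cdot s_n^{3/2}}{n} \right)^{r} \cdot E_{n,X} \left( \max_{1 \le t \le 2s_n} \frac{x(t)}{\sqrt{s_n}} \right)^{r} \cdot \frac{1}{q!} \left( \frac{C \cdot s_n^{\frac{3}{2} - \varepsilon}}{n} \right)^{q} \cdot \frac{1}{n_3!} \left( \frac{(D \cdot s_n)^3}{n^2} \right)^{n_3} \cdot \prod_{k=4}^{s_n} \frac{1}{n_k!} \left( \frac{(D \cdot s_n)^k}{n^{k-1}} \right)^{n_k}. \tag{4.10}$$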

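Proof. Define

$$\{j_1, j_2, \dots, j_{n_2 - r - q}\} = \{j_1^{(1)}, \dots, j_{n_2}^{(1)}\} \setminus \big( \{u_1, \dots, u_r\} \cup \{v_1, \dots, v_q\} \big), \qquad 1 \le j_1 < j_2 < \dots < j_{n_2 - r - q} \le s_n. \tag{4.11}$$

By a simple counting argument we have

$$Z'(n_1, n_2, \dots, n_{s_n}, r, q) \le \sum_{X \in \Omega}\ \sum_{\substack{\text{over indices} \\ \text{in (4.11)}}}\ \sum_{\substack{\text{over indices} \\ \text{in (4.6)}}}\ \sum_{\substack{\text{over indices} \\ \text{in (4.9)}}}\ \sum_{\substack{\text{over indices} \\ \text{in (4.7)}}}\ \sum_{k=3}^{s_n} \sum_{\substack{\text{over indices} \\ \text{in } (4.8_k)}} \prod_{y_1 = 0}^{n_1 + \dots + n_{s_n}} (n - y_1) \cdot \prod_{y_2 = 1}^{n_2 - r - q} j_{y_2} \cdot \prod_{d=1}^{r} x(t_{u_d}) \cdot \prod_{\ell=1}^{q} s_n^{\frac{1}{2} - \varepsilon} \cdot \prod_{y_3 = 1}^{n_3} j_{y_3,1}^{(3)} \cdot \prod_{k=4}^{s_n} \prod_{y_k = 1}^{n_k} j_{y_k,1}^{(k)} \cdot \frac{1}{n^{s_n}} \cdot \frac{1}{4^{s_n}} \cdot 3^{r} \cdot (\mathrm{const} \cdot 2)^{2q} \cdot \prod_{k=3}^{s_n} (\mathrm{const} \cdot k)^{k \cdot n_k} \tag{4.12}$$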

(if $n_k = 0$ for some $k$, we assume the corresponding factor is one). The array of sums and products in the formula is actually quite self-explanatory:

(a) Each trajectory of a simple walk $X$ leaves us with a choice of marked instants.

(b) The product $\prod_{y_1 = 0}^{n_1 + \dots + n_{s_n}} (n - y_1)$ gives us the number of possibilities for choosing all the vertices that will appear in the path in the order of their appearance.

(c) Since some vertices may appear more than once, the choice of indices in (4.11), (4.6), (4.9), (4.7), (4.8) lets us set up the moments of self-intersections.

(d) The product $\prod_{y_2 = 1}^{n_2 - r - q} j_{y_2} \cdot \prod_{d=1}^{r} x(t_{u_d}) \cdot \prod_{\ell=1}^{q} \left( s_n^{\frac{1}{2} - \varepsilon} \right)$ gives us an estimate from above for choosing the vertices of simple self-intersections. Indeed, at any moment $t_{j_\alpha^{(1)}}$ there are no more than $j_\alpha^{(1)}$ possibilities for choosing a vertex (that will be a vertex of simple self-intersection) among the vertices that have appeared by this moment. If $j_\alpha^{(1)}$ is from (4.6) then we have to pick a nonclosed vertex. Therefore the number of possibilities is even smaller. One can see that this may be done in no more than $x(t_{j_\alpha^{(1)}})$ ways. Finally, if $j_\alpha^{(1)}$ is from (4.9), then we have to take the preceding vertex in the path and choose from among all vertices connected to that one by an edge from the path. Then we have no more than $\nu_n(P) \le s_n^{\frac{1}{2} - \varepsilon}$ possibilities.

(e) Similar arguments apply to the products over $y_k$, $k = 3, \dots, s_n$, when we are choosing the vertices of self-intersections belonging to $N_k$.

The choices made in (a)–(e) let us uniquely determine the restriction of $P$ to the set of marked instants and the initial point. Therefore we are left with the problem of estimating $E \prod_{d=0}^{2s_n - 1} \xi_{i_d i_{d+1}}$ and $W_n(P \mid \text{marked instants})$, the number of possible choices for continuing the path at unmarked instants provided the vertices at marked instants are known. It immediately follows from (1.3), (1.3′) that

$$E \prod_{d=0}^{2s_n - 1} \xi_{i_d i_{d+1}} \le n^{-s_n} \cdot \prod_{k=1}^{s_n} (\mathrm{const} \cdot k)^{k \cdot n_k}. \tag{4.13}$$

As for $W_n$ one can write

$$W_n \le \prod_{k=1}^{s_n} (2k)^{k \cdot n_k}, \tag{4.14}$$

arguing that at each unmarked instant of "return" from a vertex of $k$-fold self-intersection, we have at most $2k$ possibilities to choose the next vertex. The last two inequalities give

$$E \left( \prod_{d=0}^{2s_n - 1} \xi_{i_d i_{d+1}} \right) \cdot W_n \le n^{-s_n} \cdot \prod_{k=1}^{s_n} (2\, \mathrm{const} \cdot k^2)^{k \cdot n_k}.$$

It appears that one can prove a better estimate:

$$E \left( \prod_{d=0}^{2s_n - 1} \xi_{i_d i_{d+1}} \right) \cdot W_n \le \frac{1}{n^{s_n}} \cdot \frac{1}{4^{s_n}} \cdot 3^{r} \cdot (\mathrm{const} \cdot 2)^{2 \cdot q} \cdot \prod_{k=3}^{s_n} (\mathrm{const}_1 \cdot k)^{k \cdot n_k}. \tag{4.15}$$

The idea behind (4.15) is that the two factors in the l.h.s. of (4.15) cannot be simultaneously too big: if some edge appears in $P$ a large number of times, increasing the first factor, then we will use this edge for "return" many times, thus decreasing the number of possible continuations of the trajectory at the unmarked instants. Formula (4.15) was proven in Lemma 1 of [23]. (We proved it there for $q = 0$, and the argument can be trivially generalized to any $q$. Looking at the original proof in [23], one can also notice a typographical error: the factor $\frac{1}{n^s}$ is missing there.)

Once (4.12) is proven, the result of Lemma 5 can be proven by taking the summation there. The considerations follow closely those from [23], Sect. 4, and actually are not very difficult. As a corollary of (4.10), we have


Lemma 6. Let $s_n$ grow to infinity such that $s_n = O(n^{2/3})$. Then

$$Z' \le \frac{1}{\pi} \cdot \frac{n}{s_n^{3/2}} \cdot \exp\left( A \cdot \frac{s_n^3}{n^2} \right) \cdot E_{n,X} \exp\left( B \cdot \frac{s_n^{3/2}}{n} \cdot \max_{[0, 2s_n]} \frac{x(t)}{\sqrt{s_n}} \right) \le \frac{1}{\pi} \cdot \frac{n}{s_n^{3/2}} \cdot \exp\left( \gamma \cdot \frac{s_n^3}{n^2} \right) \tag{4.16}$$

with some positive constant $\gamma$.

Proof. The first inequality follows from (4.10) by summation. The second one follows from the fact that the tail of the distribution of the normalized maximum decays as fast as a Gaussian uniformly in $n$ (which is a nice exercise).

Our next step is to show that if $s_n$ is proportional to $n^{2/3}$, the second subsum of (2.1), $Z''$, vanishes in the limit $n \to \infty$. Let us denote by $Z''(n_1, n_2, \dots, n_{s_n}, r, \nu_n)$ a subsum of (2.1) with fixed $n_1, \dots, n_{s_n}$, $r$, $\nu_n$, and assume $\nu_n > s_n^{\frac{1}{2} - \varepsilon}$. By definition, the sum over all such subsums is $Z''$. To formulate the analogue of Lemma 5, we need more notations. Let $N = r + \sum_{k=3}^{s_n} k \cdot n_k$, and let

$$0 \le t_1 < t_2 < \dots < t_N \le 2s_n \tag{4.17}$$

be some integers. We denote by $\Gamma_{t_1, \dots, t_N}$ an event from $\Omega$ such that $\Gamma_{t_1, \dots, t_N}$ consists of the trajectories of simple walks $X$ for which the following holds: there is an interval among $[t_i, t_{i+1}]$, $i = 1, \dots, N$, such that the trajectory of $X$ restricted to a subinterval of the interval descends to some level at least $\left[\frac{\nu_n}{N}\right]$ times but never crosses it (see Fig. 5).

Fig. 5. (Figure omitted: a trajectory $x(t)$ on an interval $[t_i, t_{i+1}]$.)


Lemma 7. There are positive constants $A$, $B$, $C$, $D$ such that

$$Z''(n_1, \dots, n_{s_n}, r, q, \nu_n) \le n \cdot \frac{(2s_n)!}{s_n! \cdot (s_n + 1)!} \cdot 4^{-s_n} \cdot \frac{1}{(n_2 - r - q)!} \left( \frac{s_n^2}{2n} \right)^{n_2 - r - q} \cdot e^{-\frac{s_n^2}{2n}} \cdot e^{A \cdot \frac{s_n^3}{n^2}} \cdot \frac{1}{r!} \left( \frac{B\, s_n^{3/2}}{n} \right)^{r} \cdot \frac{1}{q!} \left( \frac{C \cdot s_n \cdot \nu_n}{n} \right)^{q} \cdot \frac{1}{n_3!} \left( \frac{(D \cdot s_n)^3}{n^2} \right)^{n_3} \cdot \prod_{k=4}^{s_n} \frac{1}{n_k!} \left( \frac{(D \cdot s_n)^k}{n^{k-1}} \right)^{n_k} \cdot \max_{t_1 < \dots < t_N} E_{n,X} \left[ \left( \max_{t} \frac{x(t)}{\sqrt{s_n}} \right)^{r} \cdot \chi_{\Gamma_{t_1, \dots, t_N}} \right], \tag{4.18}$$

where $\chi_{\Gamma_{t_1, \dots, t_N}}$ is the indicator of the set $\Gamma_{t_1, \dots, t_N}$.

Proof. The proof becomes very similar to that of Lemma 5 once we realize that fixing the value of $\nu_n$ translates into $X \in \Gamma_{t_1, \dots, t_N}$ with the $t$'s being the instants of $k$-fold self-intersections, $k \ge 3$, and nonclosed simple self-intersections. Indeed, if a vertex is the left end of $\nu_n$ marked edges of a path with simple self-intersections all of which correspond to closed vertices, then for the corresponding random walk $X = \{x(t),\ 0 \le t \le 2s_n\}$ there exists a time interval $[T_1, T_2]$ on which the trajectory descends $\nu_n$ times to the level $x(T_1)$ but never crosses it (i.e., never descends to the level $x(T_1) - 1$). This is exactly because in order to make each new step from the vertex $i_{T_1}$ we must first return to it along the path. Generally at least one of the intervals $[t_1, t_2], \dots, [t_{N-1}, t_N]$ will have $\left[\frac{\nu_n}{N}\right]$ of such returns, which completes the argument. □

Let us denote now by $Z''(N, \nu_n)$ the sum of $Z''(n_1, \dots, n_{s_n}, r, q, \nu_n)$ over $r$, $q$ and $n_1, \dots, n_{s_n}$ such that $r + \sum_{k=3}^{s_n} n_k = N$ is fixed. As a corollary of (4.18), we have

$$Z''(N, \nu_n) \le n \cdot \frac{(2s_n)!}{s_n! \cdot (s_n + 1)!} \cdot 4^{-s_n} \cdot e^{A \cdot \frac{s_n^3}{n^2}} \cdot e^{C \cdot \frac{s_n \cdot \nu_n}{n}} \cdot \frac{1}{N!} \left( \frac{\mathrm{const} \cdot s_n^3}{n^2} \right)^{N} \cdot \max_{t_1 < \dots < t_N} E_{n,X} \left[ \exp\left( B \cdot \frac{s_n^{3/2}}{n} \cdot \max_{t} \frac{x(t)}{\sqrt{s_n}} \right) \chi_{\Gamma_{t_1, \dots, t_N}} \right]. \tag{4.19}$$

It is an exercise to show that the probability of $\Gamma_{t_1, \dots, t_N}$ is exponentially small in $\nu_n / N$:

$$P(\Gamma_{t_1, \dots, t_N}) \le (2s_n)^2\, e^{-\mathrm{const} \cdot \frac{\nu_n}{N}}, \tag{4.20}$$

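which is intuitively clear since at each of the $\left[\frac{\nu_n}{N}\right]$ times, the next step is predetermined (we have to go north). Equations (4.19) and (4.20) imply

$$Z'' = \sum_{N=0}^{s_n}\ \sum_{\nu_n = s_n^{\frac{1}{2} - \varepsilon} + 1}^{s_n} Z''(N, \nu_n) \le \sum_{N=0}^{s_n}\ \sum_{\nu_n = s_n^{\frac{1}{2} - \varepsilon} + 1}^{s_n} \frac{1}{\pi} \cdot \frac{n}{s_n^{3/2}} \cdot e^{\mathrm{const} \cdot s_n^3 / n^2} \cdot \frac{1}{N!} \left( \frac{\mathrm{const} \cdot s_n^3}{n^2} \right)^{N} \cdot e^{(C \cdot s_n \cdot \nu_n)/n} \cdot (2s_n)^2 \cdot e^{-\mathrm{const} \cdot \nu_n / N} \tag{4.21}$$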


(we recall that the notation const is used for different constants throughout the paper). Finally one can show that $s_n \sim n^{2/3}$ and (4.20) imply

$$Z'' = o(1). \tag{4.22}$$

This finishes the proof of part a) of Theorem 2. Formulas (4.10), (4.21) also imply part b), since the sum over $q + \sum_{k=4}^{s_n} n_k > 0$ is clearly $o(1)$, and one can proceed in a similar way to show that for typical paths there are no loops and, among the finite number of vertices from the class $N_3$, none has a corresponding edge that appears more than twice. □

In the previous two papers we explained an idea that allows us to deal with higher moments in a similar manner. We start with an obvious formula

$$E \prod_{m=1}^{k} \left( \mathrm{Trace}\,A_n^{p_n^{(m)}} - E\,\mathrm{Trace}\,A_n^{p_n^{(m)}} \right) = n^{-(p_n^{(1)} + \dots + p_n^{(k)})/2}\ E \prod_{m=1}^{k} \left( \sum_{i_0^{(m)}, i_1^{(m)}, \dots, i_{p_n^{(m)} - 1}^{(m)} = 1}^{n} \left( \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} - E \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} \right) \right). \tag{4.23}$$

In what follows, we assume that the $p_n^{(m)}$ are either $p_n$ or $p_n + 1$, $m = 1, \dots, k$, and $p_n$ is proportional to $n^{2/3}$. Let us consider a set of $k$ closed paths

$$P_m = \{ i_0^{(m)} \to i_1^{(m)} \to \dots \to i_{p_n^{(m)}}^{(m)} = i_0^{(m)} \}, \qquad m = 1, \dots, k.$$

We recall two definitions from [22, 23]:

Definition 5. We say that paths $P_{m'}$, $P_{m''}$ intersect by an edge if $P_{m'}$, $P_{m''}$ have a common (nonoriented) edge.

Definition 6. A subset $P_{m_{\ell_1}}, P_{m_{\ell_2}}, \dots, P_{m_{\ell_k}}$ of the set of paths is called a cluster of intersecting paths if the following conditions hold:

(a) For each pair $P_{m_i}$, $P_{m_j}$ from the subset, there exists a chain of paths also from the subset such that $P_{m_i}$ is the first path in the chain, $P_{m_j}$ is the last path in the chain, and any two neighboring paths intersect each other by an edge.

(b) Property (a) is violated if we add any other path from the set to this subset.

By definition, the sets of edges corresponding to different clusters are disjoint. Therefore

$$E \prod_{m=1}^{k} \Bigg( \sum_{i_0^{(m)}, i_1^{(m)}, \dots, i_{p_n^{(m)}-1}^{(m)} = 1}^{n} \Bigg( \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} - E \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} \Bigg) \Bigg)$$

can be represented as a product of mathematical expectations over disjoint clusters. The following lemma is crucial.

Lemma 8. Let $p_n^{(1)}, \dots, p_n^{(k)} \in \{ 2[t \cdot n^{2/3}],\ 2[t \cdot n^{2/3}] + 1 \}$. Then

$$E\, n^{-(p_n^{(1)} + \dots + p_n^{(k)})/2} \cdot \prod_{m=1}^{k} \Bigg( {\sum_{i_0^{(m)}, \dots, i_{p_n^{(m)}-1}^{(m)} = 1}^{n}}^{\!\!*} \Bigg( \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} - E \prod_{r=1}^{p_n^{(m)}} \xi_{i_{r-1}^{(m)} i_r^{(m)}} \Bigg) \Bigg), \tag{4.24}$$


A. Soshnikov

where the sum $\sum^{*}$ is over paths that form a cluster, is bounded by $e^{\mathrm{const}_k \cdot t^3}$ uniformly in $n$. A subsum of (4.24) over clusters in which some edges appear more than twice tends to zero.

To estimate the sum in (4.24) we introduced a correspondence between the set of clusters of $k$ paths and a set of even paths of length approximately $k$ times larger. Loosely speaking, we glue $k$ paths together along common edges and then erase these edges. As a result, we get an even path of length not greater than $k \cdot p_n$ and not smaller than $k p_n - 2k$. The details of this correspondence are discussed in [22]. It appears that the number of preimages of an even path under the mapping can be estimated in terms of the trajectory $X$ of a simple random walk associated to the path. Let $K_n(X)$ be the number of instants $\tau_i$ of a simple random walk of length $k \cdot p_n - q$, with $0 \le q \le 2k$, such that $0 \le \tau_i \le (k-1) p_n - q$ and $x(\tau) \ge x(\tau_i)$ for $\tau_i \le \tau \le \tau_i + p_n$. Then the number of preimages of $X$ is bounded by

$$\mathrm{const}_k \cdot p_n^{k-1} \cdot K_n(X)^{k-1}. \tag{4.25}$$
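To make the definition of $K_n(X)$ concrete, here is a minimal Python sketch that counts the qualifying instants for one explicit trajectory. The helper names `walk_from_steps` and `count_instants`, and the argument names `p` and `tau_max`, are hypothetical and chosen only for this illustration; the sketch does not reproduce the correspondence from [22], it only evaluates the counting condition stated above.

```python
from typing import List, Sequence


def walk_from_steps(steps: Sequence[int]) -> List[int]:
    """Build the trajectory x(0), x(1), ... of a walk from its +1/-1 steps."""
    x = [0]
    for step in steps:
        x.append(x[-1] + step)
    return x


def count_instants(x: Sequence[int], p: int, tau_max: int) -> int:
    """Count instants tau in [0, tau_max] such that
    x(t) >= x(tau) for every t with tau <= t <= tau + p.

    In the notation of the text: p plays the role of p_n and
    tau_max the role of (k - 1) * p_n - q (names are illustrative).
    """
    count = 0
    for tau in range(tau_max + 1):
        window = x[tau : tau + p + 1]
        # Require the full window of p steps to lie inside the trajectory.
        if len(window) == p + 1 and min(window) >= x[tau]:
            count += 1
    return count


# Example with k = 3, p_n = 2, q = 0: walk of length 6, tau_max = 4.
trajectory = walk_from_steps([1, 1, -1, -1, 1, -1])  # [0, 1, 2, 1, 0, 1, 0]
print(count_instants(trajectory, p=2, tau_max=4))  # 3
```

In this example the instants 0, 1 and 4 qualify, while at instants 2 and 3 the walk drops below its starting level within the next two steps.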

The estimate in (4.25) also gives the right order. We proved in [22] that

$$E_X K_n = 2 \cdot \sqrt{\frac{p_n}{\pi}} \cdot (1 + \bar{o}(1)). \tag{4.26}$$

Repeating the lines of the proof one also has

$$E_X K_n^{k-1} \le \mathrm{const}_k \cdot p_n^{\frac{k-1}{2}}. \tag{4.27}$$

We see that the problem is again reduced to counting even paths of length proportional to $n^{2/3}$. We now have a new factor $\mathrm{const}_k \cdot p_n^{k-1} \cdot K_n^{k-1}(X)$ in the statistical weight. Inequality (4.27) gives us the desired control of this factor, and from this point the arguments are the same as in the proof of Theorem 2. □

It remains to note that Lemma 8 immediately implies Theorem 3.
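Definitions 5 and 6 describe clusters as the connected components of the relation "intersect by an edge" on the set of paths. The following Python sketch illustrates this reading with a small union-find. The function names `edge_set` and `clusters`, and the convention that a closed path is given as a vertex list without repeating the starting vertex, are assumptions made for this example only.

```python
from typing import Dict, FrozenSet, List, Sequence, Set


def edge_set(path: Sequence[int]) -> Set[FrozenSet[int]]:
    """Nonoriented edges of the closed path i_0 -> i_1 -> ... -> i_0.

    The path is given as [i_0, i_1, ..., i_{p-1}]; the closing edge
    back to i_0 is added automatically.
    """
    p = len(path)
    return {frozenset((path[r], path[(r + 1) % p])) for r in range(p)}


def clusters(paths: Sequence[Sequence[int]]) -> List[List[int]]:
    """Group path indices into clusters of intersecting paths.

    Two paths are linked when they share a nonoriented edge (Definition 5);
    a cluster is a maximal chain-connected subset (Definition 6).
    """
    edges = [edge_set(path) for path in paths]
    parent = list(range(len(paths)))

    def find(a: int) -> int:
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(len(paths)):
        for b in range(a + 1, len(paths)):
            if edges[a] & edges[b]:
                parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for a in range(len(paths)):
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())


paths = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [4, 2, 8]]
print(clusters(paths))  # [[0, 1, 3], [2]]
```

Paths 0 and 3 share no edge directly, but both intersect path 1, so all three fall into one cluster. Paths that share only a vertex, and no edge, stay in different clusters.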

5. Proof of the Universality of Local Correlations at the Edge

In Sect. 1 we introduced local statistics

$$S_{n,k}(t_1, \dots, t_k) = \sum_{\substack{j_1 \ne \dots \ne j_k; \\ 0 \le \lambda_{j_1}, \dots, \lambda_{j_k} < 1 + \frac{1}{2\sqrt{n}}}} e^{t_1 \cdot \theta_{j_1}} \cdots e^{t_k \cdot \theta_{j_k}}, \tag{5.1}$$

where the $\theta$'s are defined by the rescaling $\lambda_j = 1 + \frac{\theta_j}{2 n^{2/3}}$ for positive eigenvalues.

By definition, it follows that $E(S_{n,k})$ is a Laplace transform of the rescaled $k$-point correlation function restricted to the region $\theta_1, \dots, \theta_k < n^{1/6}$. Lemma 9 formulated in Sect. 1 claims that the mathematical expectation $E S_{n,k}(t_1, \dots, t_k)$; $t_1, \dots, t_k > 0$, has a universal limit as $n \to \infty$.
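As a small illustration of the rescaling and of the statistic (5.1), the Python sketch below evaluates $S_{n,k}$ for an explicit list of eigenvalues. The function names `rescaled_thetas` and `local_statistic` are hypothetical, and the index condition $j_1 \ne \dots \ne j_k$ is read here as "pairwise distinct", which is an assumption of this sketch. The brute-force sum over ordered tuples is meant only for tiny inputs.

```python
import math
from itertools import permutations
from typing import List, Sequence


def rescaled_thetas(eigenvalues: Sequence[float], n: int) -> List[float]:
    """Return theta_j = 2 n^(2/3) (lambda_j - 1) for the eigenvalues
    in the window 0 <= lambda_j < 1 + 1/(2 sqrt(n)) used in (5.1)."""
    cutoff = 1.0 + 1.0 / (2.0 * math.sqrt(n))
    scale = 2.0 * n ** (2.0 / 3.0)
    return [scale * (lam - 1.0) for lam in eigenvalues if 0.0 <= lam < cutoff]


def local_statistic(eigenvalues: Sequence[float], n: int, ts: Sequence[float]) -> float:
    """Evaluate S_{n,k}(t_1, ..., t_k) by summing over ordered k-tuples
    of pairwise distinct indices (assumed reading of j_1 != ... != j_k)."""
    thetas = rescaled_thetas(eigenvalues, n)
    total = 0.0
    for indices in permutations(range(len(thetas)), len(ts)):
        total += math.exp(sum(t * thetas[j] for t, j in zip(ts, indices)))
    return total


# Toy input: n = 8 gives scale 2 * 8^(2/3) = 8 and cutoff about 1.177,
# so the eigenvalues 0.5, 1.0, 1.125 are kept with theta = -4, 0, 1.
eigs = [-0.5, 0.5, 1.0, 1.125, 1.5]
print(rescaled_thetas(eigs, 8))
print(local_statistic(eigs, 8, [1.0]))
```

The eigenvalue 1.5 lies above the cutoff $1 + \frac{1}{2\sqrt{n}}$ and the eigenvalue $-0.5$ is negative, so neither contributes to the sum.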


Proof of Lemma 9. Since $S_{n,k}(t_1, \dots, t_k)$ is a polynomial in terms of $S_{n,1}(t_1), \dots, S_{n,1}(k t_1), S_{n,1}(t_2), \dots, S_{n,1}(k t_2), \dots, S_{n,1}(k t_k)$, it is enough to show that the joint moments of $S_{n,1}(t)$ have universal limits. But this can be recognized as an easy consequence of Theorem 3. Indeed, as we explained in Sect. 1 (see (1.28), (1.29)),

$$S_{n,1}(t) = \sum_{\theta_j < n^{1/6}} e^{t \cdot \theta_j} = \frac{1}{2}\, \mathrm{Trace} \Big[ \Big( A_n^{2[t \cdot n^{2/3}]} + A_n^{2[t \cdot n^{2/3}] + 1} \Big) \cdot \chi_{I_n}(A) \Big] \cdot \big( 1 + O(n^{-1/3}) \big), \tag{5.2}$$

where $\chi_{I_n}$ is an indicator of a segment $I_n = \big( -1 + \frac{1}{2\sqrt{n}},\ 1 - \frac{1}{2\sqrt{n}} \big)$. An estimate of Chebyshev type implies that all moments of $\mathrm{Trace} \big[ \big( A_n^{2[t \cdot n^{2/3}]} + A_n^{2[t \cdot n^{2/3}] + 1} \big) \cdot \chi_{\mathbb{R}^1 \setminus I_n}(A) \big]$ are of order $O(e^{-\mathrm{const} \cdot n^{1/6}})$ and therefore negligible. Now Corollaries 5, 6 from Sect. 2 finish the proof. □
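The heuristic behind (5.2) is the elementary limit $(1 + \theta / (2 n^{2/3}))^{2 t n^{2/3}} \to e^{t \theta}$. The Python sketch below is a numerical sanity check of that limit for a single eigenvalue, not a substitute for the proof. The function name `power_weight` and the flag `positive` are hypothetical, and the choice $n = 10^6$ is arbitrary.

```python
import math


def power_weight(theta: float, t: float, n: int, positive: bool = True) -> float:
    """Contribution (lam^p + lam^(p+1)) / 2 of one eigenvalue, where
    p = 2 [t n^(2/3)] and lam = +/- (1 + theta / (2 n^(2/3)))."""
    scale = 2.0 * n ** (2.0 / 3.0)
    lam = 1.0 + theta / scale
    if not positive:
        lam = -lam  # mirror image at the left edge of the spectrum
    p = 2 * math.floor(t * n ** (2.0 / 3.0))
    return 0.5 * (lam ** p + lam ** (p + 1))


n = 10 ** 6
for theta in (1.0, -2.0):
    approx = power_weight(theta, t=1.0, n=n)
    exact = math.exp(1.0 * theta)
    print(theta, approx, exact)

# A negative eigenvalue of the same modulus nearly cancels between the
# even and odd powers, so it contributes almost nothing.
print(power_weight(1.0, t=1.0, n=n, positive=False))
```

For a positive eigenvalue near 1 the even and odd powers are almost equal and their average is close to $e^{t \theta}$. For a negative eigenvalue near $-1$ they have opposite signs, which is why the combination of two consecutive powers singles out the right edge.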

To deduce Theorem A from Lemma 9, we note that if $\rho_{n,\beta,k}(\theta_1, \dots, \theta_k)$ are rescaled $k$-point correlation functions at the edge for an arbitrary Wigner matrix, then our calculations imply

$$\int_{n^{1/6}}^{+\infty} \dots \int_{n^{1/6}}^{+\infty} \rho_{n,\beta,k}(\theta_1, \dots, \theta_k)\, d\theta_1 \dots d\theta_k \le \frac{n!}{(n-k)!}\, e^{-\mathrm{const}_k \cdot n^{1/6}} = \bar{o}(1).$$

Therefore it is enough to prove weak convergence for

$$\rho_{n,\beta,k}(\theta_1, \dots, \theta_k) \cdot \chi_{(-\infty,\, n^{1/6})^k}(\theta_1, \dots, \theta_k)\, d\theta_1 \dots d\theta_k. \tag{5.3}$$

Multiplying (5.3) by the factor $\exp(\theta_1 + \dots + \theta_k)$ we get a finite measure. Since convergence of Laplace transforms of finite measures implies weak convergence (this can be proved by Helly's theorem), we are done.

We are now ready to prove the main result. Let $\lambda_1^{(n)} \ge \lambda_2^{(n)} \ge \dots \ge \lambda_k^{(n)}$ be the first $k$ largest eigenvalues and $s_1 \ge \dots \ge s_k$ an arbitrary ordered set of real numbers. We want to show that

$$P_n \Big\{ \lambda_1^{(n)} \le 1 + \frac{s_1}{2 n^{2/3}},\ \dots,\ \lambda_k^{(n)} \le 1 + \frac{s_k}{2 n^{2/3}} \Big\}$$

has a universal limit (Tracy–Widom distribution) as $n \to \infty$. In terms of $\theta$'s, the event can be written as $\{ \theta_1^{(n)} \le s_1, \dots, \theta_k^{(n)} \le s_k \}$, and its probability can be written as a finite linear combination of probabilities

$$P_n \big\{ \# \{ \theta_j^{(n)} \in (s_i, s_{i-1}],\ j = 1, 2, \dots \} = m_i \big\}, \quad i = 1, \dots, k, \tag{5.4}$$

with $\sum_{i=1}^{k} m_i \le k$. Let us introduce $\eta_i^{(n)} = \# \{ \theta_j^{(n)} \in (s_i, s_{i-1}],\ j = 1, 2, \dots \}$. By definition of correlation functions, the factorial moments can be written as

$$E \prod_{i=1}^{k} \prod_{\ell=0}^{\ell_i - 1} \big( \eta_i^{(n)} - \ell \big) = \int \rho_{n,\beta,L}(\theta_1, \dots, \theta_L)\, d\theta_1 \dots d\theta_L, \tag{5.5}$$

where $L = \sum_{i=1}^{k} \ell_i$, and integration is over $\prod_{i=1}^{k} (s_i, s_{i-1}]^{\ell_i}$. Since correlation functions weakly converge, we deduce from (5.5) that the joint moments of $\eta_i^{(n)}$, $i = 1, \dots, k$,


have universal limits as $n \to \infty$. It is well known that if the limiting moments grow no faster than factorials, they uniquely determine the limiting distribution (5.4) (see e.g. [55]). The determinantal (Pfaffian) form of the limiting correlation functions asserts that this is exactly the case in our situation. The main result is proven.

Acknowledgement. It is a great pleasure to thank Ya. Sinai for his inspiration and interest in this work. The author also would like to thank P. Forrester for the warm hospitality at the University of Melbourne in July 1997 and valuable discussions. The research was partially supported by the National Science Foundation through Grant No. DMS9304580.

References

[1] Wigner, E.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. of Math. 62, 548–564 (1955)
[2] Wigner, E.: On the distribution of the roots of certain symmetric matrices. Ann. of Math. 67, 325–328 (1958)
[3] Porter, C.E., Rosenzweig, N.: Statistical properties of atomic and nuclear spectra. Ann. Acad. Sci. Fennicae, Serie A, VI Physica 44, 1–66 (1960)
[4] Porter, C.E. (ed): Statistical Theories of Spectra: Fluctuations. New York: Academic Press, 1965
[5] Marchenko, V.A., Pastur, L.A.: The distribution of the eigenvalues in some ensembles of random matrices. Mat. Sb. 72, 507–536 (1967)
[6] Pastur, L.A.: On the spectrum of random matrices. Teor. Mat. Fiz. 10, 102–112 (1972)
[7] Arnold, L.: On the asymptotic distribution of the eigenvalues of random matrices. J. Math. Anal. Appl. 20, 262–268 (1967)
[8] Arnold, L.: On Wigner's semicircle law for the eigenvalues of random matrices. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 19, 191–198 (1971)
[9] Berezin, F.A.: Several remarks on Wigner distribution. Teor. Mat. Fiz. 17, No. 3, 305–318 (1973)
[10] Girko, V.L.: Spectral theory of random matrices (Russian). Moscow: Nauka, 1988
[11] Girko, V.L.: On normalized spectra functions of random matrices. Theor. Prob. Math. Statistics 22, 31–34 (1980)
[12] Wachter, K.W.: The strong limits of random matrix spectra for sample matrices of independent elements. Ann. Probab. 6, No. 1, 1–18 (1978)
[13] Füredi, Z., Komlós, J.: The eigenvalues of random symmetric matrices. Combinatorica 1, No. 3, 233–241 (1981)
[14] Silverstein, J.: The smallest eigenvalue of a large dimensional Wishart matrix. Ann. Probab. 13, 1364–1368 (1985)
[15] Bai, Z.D., Yin, Y.Q.: Necessary and sufficient conditions for the almost sure convergence of the largest eigenvalue of Wigner matrices. Ann. Probab. 16, 1729–1741 (1988)
[16] Bai, Z.D.: Convergence rate of expected spectral distributions of large random matrices. I. Wigner matrices. Ann. Probab. 21, 625–648 (1993)
[17] Mehta, M.L.: Random matrices. New York: Academic Press, 1991
[18] Molchanov, S.A., Pastur, L.A., Khorunzhy, A.M.: Limiting eigenvalue distribution for band random matrices. Teor. Mat. Fiz. 90, 108–118 (1992)
[19] Voiculescu, D.: Limit laws for random matrices and free products. Invent. Math. 104, 201–220 (1991)
[20] Khorunzhy, A.M., Khoruzhenko, B.A., Pastur, L.A.: Asymptotic properties of large random matrices with independent entries. J. Math. Phys. 37, 5033–5059 (1996)
[21] Ben Arous, G., Guionnet, A.: Large deviations for Wigner's law and Voiculescu's non-commutative entropy. Probab. Theory Rel. Fields 108, No. 4, 517–542 (1997)
[22] Sinai, Ya., Soshnikov, A.: Central limit theorem for traces of large random symmetric matrices. Bol. Soc. Brasil. Mat. 29, No. 1, 1–24 (1998)
[23] Sinai, Ya., Soshnikov, A.: A refinement of Wigner's semicircle law in a neighborhood of the spectrum edge for random symmetric matrices. Functional Anal. Appl. 32, No. 2 (1998)
[24] Boutet de Monvel, A., Khorunzhy, A.: Asymptotic distribution of smoothed eigenvalue density. I. Gaussian random matrices. To appear in Random Oper. Stoch. Equ. (1999)
[25] Boutet de Monvel, A., Khorunzhy, A.: Asymptotic distribution of smoothed eigenvalue density. II. Wigner random matrices. To appear in Random Oper. Stoch. Equ. (1999)
[26] Tracy, C., Widom, H.: Level-spacing distribution and Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)


[27] Tracy, C., Widom, H.: On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 177, 727–754 (1996)
[28] Tracy, C., Widom, H.: Correlation functions, cluster functions, and spacing distribution for random matrices. J. Stat. Phys. 92, No. 5/6 (1998)
[29] Widom, H.: On the relation between orthogonal, symplectic and unitary matrix ensembles. To appear in J. Stat. Phys. (1999)
[30] Tracy, C., Widom, H.: Random unitary matrices, permutations and Painlevé. Preprint (1998)
[31] Forrester, P.J.: In preparation (1999)
[32] Basor, E., Widom, H.: Determinants of Airy operators and applications to random matrices. Preprint (1998)
[33] Forrester, P.J.: The spectral edge of random matrix ensembles. Nucl. Phys. B 402, 709–728 (1993)
[34] Baker, T.H., Forrester, P.J.: Finite-N fluctuation formulas for random matrices. J. Stat. Phys. 88, No. 5/6 (1997)
[35] Johansson, K.: On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91, 151–204 (1998)
[36] Johansson, K.: Universality of local eigenvalue correlations in certain Hermitian Wigner matrices. Preprint (1998)
[37] Johansson, K.: Shape fluctuations and random matrices. Preprint (1999)
[38] Bowick, M.J., Brézin, E.: Universal scaling of the tail of the density of eigenvalues in random matrix models. Phys. Lett. B 268, 21–28 (1991)
[39] Brézin, E., Zee, A.: Universality of the correlations between eigenvalues of large random matrices. Nucl. Phys. B 402, 613–627 (1993)
[40] Brézin, E., Hikami, S.: Universal singularity at the closure of a gap in random matrix theory. Preprint (1997)
[41] Brody, T.A., Flores, J., French, J.B., Mello, P.A., Pandey, A., Wong, S.S.M.: Random matrix physics: Spectrum and strength fluctuations. Rev. Mod. Phys. 53, 385–479 (1981)
[42] Kiessling, M., Spohn, H.: A note on the eigenvalue density of random matrices. To appear in Commun. Math. Phys. (1999)
[43] Soshnikov, A.: Level spacings distributions for large random matrices: Gaussian fluctuations. Ann. of Math. 148, 573–617 (1998)
[44] Pastur, L.A., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
[45] Deift, P., Its, A., Zhou, X.: A Riemann–Hilbert approach to asymptotic problems arising in the theory of random matrix models, and also in the theory of integrable statistical mechanics. Ann. of Math. 146, 149–235 (1997)
[46] Bleher, P., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann–Hilbert problem, and universality in the matrix model. Preprint (1996)
[47] Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Preprint (1998)
[48] Baik, J., Deift, P., Johansson, K.: On the variance of the length of the longest increasing subsequence of random permutations. To appear in J. Am. Math. Soc.
[49] Erdélyi, A. et al.: Higher Transcendental Functions. V. 2. Bateman manuscript project. New York: McGraw-Hill, 1953
[50] Okounkov, A.: Random matrices and random permutations. Preprint (1999)
[51] Borodin, A.: Longest increasing subsequences of random colored permutations. Electronic J. Combinatorics 6 (1999)
[52] Tracy, C., Widom, H.: On the distribution of the lengths of the longest monotone subsequences in random words. Preprint (1999)
[53] Borodin, A., Okounkov, A., Olshanski, G.: Asymptotics of Plancherel measures for symmetric groups. Preprint (1999)
[54] Johansson, K.: Discrete orthogonal polynomials and the Plancherel measure. Preprint (1999)
[55] Simon, B.: The classical moment problem as a self-adjoint finite difference operator. To appear in Advances in Mathematics

Communicated by Ya. G. Sinai

Commun. Math. Phys. 207, 735 (1999)

Communications in

Mathematical Physics

© Springer-Verlag 1999

Erratum

Asymptotically Euclidean Manifolds and Twistor Spinors

Wolfgang Kühnel1 , Hans-Bert Rademacher2

1 Mathematisches Institut B, Universität Stuttgart, 70550 Stuttgart, Germany. E-mail: [email protected]
2 Universität Leipzig, Mathematisches Institut, Augustusplatz 10/11, 04109 Leipzig, Germany. E-mail: [email protected]

Received: 2 August 1999 / Accepted: 2 August 1999

Commun. Math. Phys. 196, 67–76 (1998)

We are grateful to Helga Baum for pointing out that for the conclusion on the dimension N of the space of twistor spinors on (M, g) in Theorem 1.3 one has to add the assumption that the manifold M is simply connected. Without this additional assumption one can conclude only N ≤ 2 in case a) and N ≤ m+1 in case b). For the proof of Theorem 1.3 we use a result of M. Wang [Wa; Prop.] which holds only for simply connected manifolds. For a discussion of the existence of parallel spinors on non-simply connected manifolds we refer to the preprint A. Moroianu, U. Semmelmann: Parallel spinors and holonomy groups. Preprint math.DG/9903062, to appear in J. Math. Phys.

Communicated by H. Nicolai