Communications in Mathematical Physics - Volume 236

Commun. Math. Phys. 236, 1–54 (2003). Digital Object Identifier (DOI) 10.1007/s00220-003-0799-3

Author: M. Aizenman (Chief Editor)



Glauber Dynamics of the Random Energy Model II. Aging Below the Critical Temperature∗

Gérard Ben Arous¹, Anton Bovier², Véronique Gayrard¹,∗∗

¹ Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland. E-mail: [email protected]
² Weierstrass-Institut für Angewandte Analysis und Stochastik, Mohrenstrasse 39, 10117 Berlin, Germany. E-mail: [email protected]

Received: 9 October 2001 / Accepted: 17 October 2002
Published online: 21 March 2003 – © Springer-Verlag 2003

Abstract: We investigate the long-time behavior of the Glauber dynamics for the random energy model below the critical temperature. We establish that for a suitably chosen timescale that diverges with the size of the system, one can prove that a natural autocorrelation function exhibits aging. Moreover, we show that the long-time asymptotics of this function coincide with those of the so-called “REM-like trap model” proposed by Bouchaud and Dean. Our results rely on very precise estimates on the distribution of transition times of the process between different states of extremely low energy.

1. Introduction and Background

1.1. Introduction. In this paper we continue the analysis of the Glauber dynamics of the random energy model that was started in [BBG1]. We refer the reader to the introduction of that paper for the general background of the problem. We recall that we consider the following version of the REM. A spin configuration $\sigma$ is a vertex of the hypercube $S_N \equiv \{-1,1\}^N$. On an abstract probability space $(\Omega, \mathcal{F}, P)$ we define the family of i.i.d. standard normal random variables $\{X_\sigma\}_{\sigma\in S_N}$. We set $E_\sigma \equiv [X_\sigma]_+ \equiv (X_\sigma \vee 0)$. We define a random (Gibbs) probability measure on $S_N$, $\mu_{\beta,N}$, by setting
\[ \mu_{\beta,N}(\sigma) \equiv \frac{e^{\beta\sqrt{N}E_\sigma}}{Z_{\beta,N}}, \tag{1.1} \]
where $Z_{\beta,N}$ is the normalizing partition function¹. It is well-known [D1, D2] that this model exhibits a phase transition at $\beta_c = \sqrt{2\ln 2}$. For $\beta \le \beta_c$, the Gibbs measure is supported, asymptotically as $N\uparrow\infty$, on the set of states $\sigma$ for which $E_\sigma \sim \sqrt{N}\beta$, and no single configuration has positive mass. For $\beta > \beta_c$, on the other hand, the Gibbs measure gives positive mass to the extreme elements of the order statistics of the family $E_\sigma$.

∗ Work partially supported by the Swiss National Science Foundation under contract 21-65267.01
∗∗ On leave from CPT-CNRS, Luminy, Case 907, 13288 Marseille Cedex 9, France. E-mail: [email protected]
¹ The standard model has $X_\sigma$ instead of $E_\sigma$. This modification has no effect on the equilibrium properties of the model, and will be helpful for setting up the dynamics.

The dynamics we will consider is a discrete time Glauber dynamics. That is, we construct a Markov chain $\sigma(t)$ with state space $S_N$ and discrete time $t\in\mathbb{N}$ by prescribing transition probabilities $p_N(\sigma,\eta) = P[\sigma(t+1)=\eta \mid \sigma(t)=\sigma]$ by
\[ p_N(\sigma,\eta) = \begin{cases} \frac{1}{N}\, e^{-\beta\sqrt{N}E_\sigma}, & \text{if } \|\sigma-\eta\|_2 = 2,\\ 1 - e^{-\beta\sqrt{N}E_\sigma}, & \text{if } \sigma=\eta,\\ 0, & \text{otherwise.} \end{cases} \tag{1.2} \]
Note that the dynamics is also random, i.e. the law of the Markov chain is a measure valued random variable on $\Omega$ that takes values in the space of Markov measures on the path space $S_N^{\mathbb{N}}$. We will mostly take a pointwise point of view, i.e. we consider the dynamics for a given fixed realization of the disorder parameter $\omega\in\Omega$ (we persistently suppress the dependence on $\omega$ in the notation). It is easy to see that this dynamics is reversible with respect to the Gibbs measure $\mu_{\beta,N}$. One also sees that it represents a nearest neighbor random walk on the hypercube with traps of random depths (i.e. the probability to make a zero step is rather large when $E_\sigma$ is large)².

1.2. Bouchaud’s trap model. In this sub-section we will explain the heuristics of the dynamics of the REM that was developed in several papers by Bouchaud and others [B, BD, BM, BCKM]. We will actually give a slightly varied form of this model that will fit better with the rigorous analysis we will present later. Understanding the trap model will provide a crucial guideline for the analysis of the full model later on. The basic idea of Bouchaud can be explained as follows. As was explained in [BBG1], the Gibbs measure of the REM for $\beta > \sqrt{2\ln 2}$ is concentrated, asymptotically, on a countable set of states.
Therefore we know that the Glauber dynamics for these temperatures will spend almost all of its time in these same states. This suggests, as we will do in the main part of the paper, considering the dynamics on these states at appropriate time scales. Instead of doing this, Bouchaud proposes to define directly a new dynamics on these countably many states in the infinite volume limit³ that he expects to behave in the same way as the real model. Thus we start with the random measure $\tilde\mu_\beta$ defined in Eq. (1.12) of [BBG1]. We want to introduce a stochastic process on the support of this measure that leaves $\tilde\mu_\beta$ invariant. Obviously we can identify the support of this measure with the atoms of the Poisson point process $\mathcal{P}$ (defined in Sect. 1.2 of [BBG1]). The question is what the transition probabilities should be. Bouchaud proposes the following: Starting at a state $i$ with energy $E_i$, the process waits an exponential time of mean $\tau_0\exp(\alpha E_i)$ (where $\alpha$ has the physical meaning of $\alpha=\beta/\beta_c$), and then jumps at random to any of the other states $j$ with equal probability.

² We have chosen this particular dynamics for technical reasons. To study e.g. the Metropolis algorithm would require some extra work, but we expect essentially the same results to hold.
³ This is completely analogous to the procedure of Ruelle to define a model based on the Poisson process as the infinite volume version of Derrida’s REM rather than proving the convergence of Derrida’s model to this limit.

Aging in the REM. Part 2


Here $\tau_0$ denotes a time-scale that will have to be chosen appropriately later. The problem is that while we would want the process to reach each state with equal probability, this makes no sense given that there are infinitely many states. Thus we have to introduce some cut-off procedure. Bouchaud proposes to allow jumps only to the $M$ states of largest mass, and to take the limit as $M\uparrow\infty$ in the end. We find it more instructive to restrict our process to states whose energy is larger than $E$, where $E$ is a parameter that will be taken to $-\infty$ later⁴. This is very convenient, since it amounts to replacing the Poisson process $\mathcal{P}$ (from Sect. 1.2 of [BBG1]) by its restriction $\mathcal{P}_E$ to the half line $[E,\infty)$. Since our new Poisson process has a finite intensity measure, it has a very useful representation: Consider a random variable $n_E\in\mathbb{N}$ that is Poisson distributed with parameter $\int_E^\infty e^{-x}dx = e^{-E}$. Let $E_i$, $i\in\mathbb{N}$, be a family of i.i.d. real valued r.v., independent of $n_E$, whose common distribution has density $e^{E}e^{-x}\,\mathbb{I}_{x\ge E}$ with respect to Lebesgue measure. Then $\mathcal{P}_E$ is equal in distribution to
\[ \sum_{i=1}^{n_E} \delta_{E_i}. \tag{1.3} \]

Given a realization of $\mathcal{P}_E$, we can now define a Markov process on the random set $S_E \equiv \{1,\dots,n_E\}$. Let $Y_E(n)$, $n\in\mathbb{N}$, be a discrete time Markov chain with state space $S_E$. We will actually only consider the case where the $Y_E(n)$ are i.i.d. random variables with some distribution $q$. Next we introduce, for each $i\in\mathbb{N}$, a family $T_n(i)$, $n\in\mathbb{N}$, of i.i.d. random variables taking values in $\mathbb{R}_+$ and having an exponential distribution with mean $\tau_i \equiv \tau_0\exp(\alpha E_i)$, i.e.
\[ P[T_n(i)\le t] \equiv F_i(t) = 1 - e^{-t/\tau_i}. \tag{1.4} \]
Now we set
\[ R_n \equiv \sum_{k=1}^{n} T_k(Y_E(k)) \tag{1.5} \]
and
\[ r(t) = n \quad\text{if}\quad R_n \le t < R_{n+1}. \tag{1.6} \]
Finally, the Markov jump process is defined as
\[ X_E(t) \equiv Y_E(r(t)), \qquad t\ge 0. \tag{1.7} \]
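The construction (1.3)–(1.7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`sample_poisson`, `sample_restricted_process`, `simulate_trap_path`) are hypothetical, the jump distribution $q$ is taken to be uniform, and the parameter values in the usage note are arbitrary.

```python
import math
import random

def sample_poisson(lam, rng):
    """Poisson(lam) sample: count unit-rate exponential arrivals in [0, lam]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_restricted_process(E, rng):
    """Atoms of P_E as in (1.3): n_E ~ Poisson(e^{-E}) i.i.d. points
    with density e^{E} e^{-x} on [E, infinity), i.e. E plus an Exp(1) variable."""
    n_E = sample_poisson(math.exp(-E), rng)
    return [E + rng.expovariate(1.0) for _ in range(n_E)]

def simulate_trap_path(energies, alpha, tau0, t_max, rng):
    """List of (arrival_time, state_index) pairs up to time t_max.
    Assumption: states are drawn i.i.d. uniformly (q uniform); the holding
    time in state i is exponential with mean tau_i = tau0 * exp(alpha * E_i)."""
    if not energies:
        return []
    path, t = [], 0.0
    while t < t_max:
        i = rng.randrange(len(energies))
        tau_i = tau0 * math.exp(alpha * energies[i])
        path.append((t, i))
        t += rng.expovariate(1.0 / tau_i)  # expovariate takes the rate 1/mean
    return path
```

For instance, with `E = -3.0`, `alpha = 2.0` and `tau0 = math.exp(-alpha * E)`, every mean holding time $\tau_i$ is at least 1, so time is measured on the scale of the fastest states.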

Observe that the random variables $\tau_i$ are the atoms of a Poisson point process $\mathcal{N}^*$ obtained from $\mathcal{P}$ by transformation with the map $\tau: E \to \tau_0 e^{\alpha E}$. A simple computation shows that $\mathcal{N}^*$ is a Poisson process with intensity measure $\nu^*(dx) = \alpha^{-1}\tau_0^{1/\alpha}x^{-(1+\alpha)/\alpha}dx$ (see [Ru]). We will also denote by $\mathcal{N}^*_E$ the transform of the restricted process $\mathcal{P}_E$, which is of course just the restriction of $\mathcal{N}^*$ to the half-line $[\tau_0 e^{\alpha E},\infty)$. Let us note that in the case where $Y_E(n)$, $n\in\mathbb{N}$, are i.i.d., the random variables $T_k(Y_E(k))$, $k\in\mathbb{N}$, are also i.i.d., and therefore $r(t)$ is a renewal process. Moreover, in the case when the distribution, $q$, of $Y_E(k)$ is of the form $q(Y_E(k)=i) = p(\tau_i)$, for some non-negative function $p$ satisfying $\sum_{i=1}^{n_E}p(\tau_i)=1$, the law of the renewal variable $T_k(Y_E(k))$ can be expressed in terms of the process $\mathcal{N}^*_E$ as
\[ P[T_k(Y_E(k))>t] \equiv 1-F_E(t) = \sum_{i=1}^{n_E} q_i\,(1-F_i(t)) = \int \mathcal{N}^*_E(ds)\,p(s)\,e^{-t/s}. \tag{1.8} \]

⁴ This has the advantage that via the parameter $E$ we control explicitly the time-scale we consider, whereas otherwise this would be some non-trivial random variable.

The two point function that is used to characterize the “aging” phenomenon is the probability that during a time-interval $[t,t+s]$ the process does not jump, i.e.
\[ \Pi_E(s,t) \equiv P\left[\forall_{u\in[t,t+s]}\; X_E(u^-) = X_E(u)\right] \tag{1.9} \]
(we set $f(u^-)\equiv\lim_{v\uparrow u}f(v)$). Here we assume that the initial distribution of the chain coincides with the jump distribution, i.e., $P(X_E(0)=i)=p(\tau_i)$. The following proposition paraphrases the results on the asymptotic behaviour for this correlation function as found by Bouchaud and Dean [BD]:

Proposition 1.1. Define
\[ H_0(w) \equiv \frac{1}{\pi\,\mathrm{cosec}(\pi/\alpha)}\int_w^\infty dx\,\frac{1}{(1+x)x^{1/\alpha}}. \tag{1.10} \]
Then, for $\alpha>0$,
\[ \lim_{t,s\uparrow\infty}\;\lim_{E\downarrow-\infty}\frac{\Pi_E(s,t)}{H_0(s/t)} = 1, \qquad P\text{-a.s.} \tag{1.11} \]
Moreover, the asymptotic behavior of $H_0(s/t)$ when $s/t$ tends to zero or $\infty$, respectively, is readily evaluated:

(i) If $(s/t)\downarrow 0$,
\[ H_0(s/t) = 1 - \frac{1}{\pi\,\mathrm{cosec}(\pi/\alpha)}\int_0^{s/t}dx\,\frac{1}{(1+x)x^{1/\alpha}} \sim 1 - \frac{(s/t)^{1-1/\alpha}}{(1-1/\alpha)\,\pi\,\mathrm{cosec}(\pi/\alpha)}. \tag{1.12} \]

(ii) If $(s/t)\uparrow\infty$,
\[ H_0(s/t) \sim \frac{1}{\pi\,\mathrm{cosec}(\pi/\alpha)}\int_{s/t}^\infty dx\,\frac{1}{x^{1+1/\alpha}} = \frac{(t/s)^{1/\alpha}}{(1/\alpha)\,\pi\,\mathrm{cosec}(\pi/\alpha)}. \tag{1.13} \]
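As a numerical illustration of (1.10), (1.12) and (1.13), the sketch below evaluates $H_0(w)$ by Simpson's rule after the substitution $x=e^y$ and compares it with the two asymptotic forms. The function names and the step count are illustrative choices, and the code assumes $\alpha>1$ so that the integral over $(0,\infty)$ converges.

```python
import math

def h0(w, alpha, steps=20000):
    """H_0(w) from (1.10), computed with Simpson's rule in the variable y = ln x."""
    if alpha <= 1.0 or w <= 0.0:
        raise ValueError("this sketch assumes alpha > 1 and w > 0")

    def integrand(y):
        # e^{y(1 - 1/alpha)} / (1 + e^y), arranged to avoid overflow for large |y|
        if y > 0.0:
            return math.exp(-y / alpha) / (1.0 + math.exp(-y))
        return math.exp(y * (1.0 - 1.0 / alpha)) / (1.0 + math.exp(y))

    lo = math.log(w)
    hi = max(lo, 0.0) + 40.0 * alpha      # neglected tail is of relative order e^{-40}
    h = (hi - lo) / steps                 # steps must be even for Simpson's rule
    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(lo + k * h)
    return (total * h / 3.0) * math.sin(math.pi / alpha) / math.pi

def h0_small_ratio(w, alpha):
    """Right-hand side of (1.12), valid as w = s/t tends to 0."""
    c = math.pi / math.sin(math.pi / alpha)
    return 1.0 - w ** (1.0 - 1.0 / alpha) / ((1.0 - 1.0 / alpha) * c)

def h0_large_ratio(w, alpha):
    """Right-hand side of (1.13), valid as w = s/t tends to infinity."""
    c = math.pi / math.sin(math.pi / alpha)
    return w ** (-1.0 / alpha) / ((1.0 / alpha) * c)
```

For $\alpha=2$ the integral is elementary, $H_0(w) = (2/\pi)\arctan(w^{-1/2})$, which gives a convenient check of the quadrature.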

In the remainder of this subsection we outline the proof of this proposition.

Lemma 1.2. The function $\Pi_E(s,t)$ defined in (1.9) satisfies the equations
\[ \Pi_E(s,t) = 1 - F_E(s+t) + \int_0^t \Pi_E(s,t-u)\,dF_E(u). \tag{1.14} \]

Proof. The proof of this lemma is elementary since $\Pi_E(s,t)$ is a function of the renewal process $r(t)$ alone.
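To see how a renewal equation of the type (1.14) is solved step by step, it may help to look at a discrete-time analogue, in which the integral becomes a finite sum. The sketch below is not the paper's method; `solve_discrete_renewal`, `tail` and `pmf` are hypothetical names for the solver, the tail probability $P[T>n]$ and the probability $P[T=k]$ of an integer-valued waiting time $T$.

```python
def solve_discrete_renewal(tail, pmf, m, n_max):
    """Return [Pi(m, n) for n = 0..n_max], where Pi solves the discrete analogue
    Pi(m, n) = tail(m + n) + sum_{k=1}^{n} pmf(k) * Pi(m, n - k)
    of the renewal equation; values are filled in by forward recursion in n."""
    pi = []
    for n in range(n_max + 1):
        value = tail(m + n)
        for k in range(1, n + 1):
            value += pmf(k) * pi[n - k]
        pi.append(value)
    return pi
```

For a geometric waiting time with success probability $p$, the process is memoryless and the recursion returns $(1-p)^m$ for every $n$, so no aging occurs; aging requires waiting-time laws with heavy tails such as the one appearing in (1.15).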


Remember that we study the solution of this equation in the limit when $E\downarrow-\infty$. For this it is important to make a choice of the time-scale $\tau_0$. The choice $\tau_0 = e^{-\alpha E}$ is natural since in this way we will measure time at the scale of the fastest states⁵. Our first step will be to replace $F_E$ by its limit⁶
\[ F_\infty(t) \equiv 1 - \alpha^{-1}\int_1^\infty dx\, e^{-t/x}\,x^{-(1+\alpha)/\alpha}, \tag{1.15} \]
which is no longer random. From now on we will only consider the case when $q$ is the uniform measure, $q_i = \frac{1}{n_E}$. Let $\Pi_\infty(s,t)$ denote the unique solution of the equation
\[ \Pi_\infty(s,t) = 1 - F_\infty(s+t) + \int_0^t \Pi_\infty(s,t-u)\,dF_\infty(u). \tag{1.16} \]

Lemma 1.4. For all $s,t\ge 0$,
\[ \lim_{E\downarrow-\infty}\Pi_E(s,t) = \Pi_\infty(s,t), \qquad P\text{-a.s.} \tag{1.17} \]
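The limiting renewal process behind (1.15) and (1.16) is easy to simulate, which gives a rough sanity check of the aging function. The sketch rests on one observation that is not spelled out in the text: substituting $z=x^{-1/\alpha}$ in (1.15) gives $1-F_\infty(t)=\int_0^1 e^{-tz^\alpha}dz$, so a waiting time can be sampled by drawing $\tau=U^{-\alpha}$ with $U$ uniform on $(0,1]$ and then an exponential variable of mean $\tau$. The function name and all parameter values are illustrative.

```python
import random

def no_jump_probability(s, t, alpha, samples, rng):
    """Monte Carlo estimate of the probability that the renewal process with
    waiting-time law F_infinity has no renewal epoch in the window [t, t + s]."""
    hits = 0
    for _ in range(samples):
        clock = 0.0
        while clock <= t:
            tau = (1.0 - rng.random()) ** (-alpha)   # P[tau > x] = x^{-1/alpha} for x >= 1
            clock += rng.expovariate(1.0 / tau)      # exponential holding time with mean tau
        # clock is now the first renewal epoch after t
        if clock > t + s:
            hits += 1
    return hits / samples
```

For $\alpha=2$ and $s=t$ the limiting value from (1.10) is $H_0(1)=1/2$; an estimate at large but finite $t$ should be close to this, up to statistical error and finite-time corrections.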

The limiting equation (1.16) is solved following standard procedures (see e.g. [Fe]). One defines the renewal function $M(t)$ that solves the equation
\[ M(t) = F_\infty(t) + \int_0^t M(t-u)\,dF_\infty(u). \tag{1.18} \]
In terms of this function, the solution of (1.16) is then given as
\[ \Pi_\infty(s,t) = 1 - F_\infty(s+t) + \int_0^t \bigl(1-F_\infty(s+t-u)\bigr)\,dM(u). \tag{1.19} \]
Setting $f_\infty(t)\equiv F_\infty'(t)$, we have
\[ f_\infty(t) = \alpha^{-1}\int_1^\infty e^{-t/x}\,x^{-(2\alpha+1)/\alpha}\,dx. \tag{1.20} \]

Denote by $g^*$ the Laplace transform of a function $g$, i.e. $g^*(u) = \int_0^\infty e^{-ut}g(t)\,dt$. Then
\[ F^*_\infty(u) = u^{-1} - \alpha^{-1}\int_1^\infty \frac{dx}{(ux+1)x^{1/\alpha}} = u^{-1} - \alpha^{-1}u^{(1-\alpha)/\alpha}\int_u^{u\infty}\frac{dx}{(1+x)x^{1/\alpha}}. \tag{1.21} \]
In the last expression, the integration is understood to be along a transformed path in the complex plane if $u$ is complex. Note that⁷
\[ \int_0^\infty\frac{dx}{(1+x)x^{1/\alpha}} = \Gamma(\alpha^{-1})\,\Gamma(1-\alpha^{-1}) = \frac{\pi}{\sin(\pi/\alpha)} = \pi\,\mathrm{cosec}(\pi/\alpha). \tag{1.22} \]

⁵ Other choices may lead to completely different behaviors.
⁶ In this introduction we will not justify the various passages to limits (which is also never done in the physics literature). Note however that these issues are treated in Sect. 4, and the results proven there can easily be used to justify everything that we will do in the present section.
⁷ Performing the change of variable $x = y^{-1}-1$, $\int_0^\infty\frac{dx}{(1+x)x^{1/\alpha}} = \int_0^1\frac{dy}{(1-y)^{1/\alpha}\,y^{1-1/\alpha}}$, where one recognizes the Beta integral $\int_0^1 (1-y)^{\mu-1}y^{\nu-1}\,dy = \frac{\Gamma(\mu)\Gamma(\nu)}{\Gamma(\mu+\nu)}$.
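The constant in (1.22) can be checked numerically with the standard library's Gamma function. This is only a quick consistency check of the reflection identity $\Gamma(z)\Gamma(1-z)=\pi/\sin(\pi z)$ at $z=1/\alpha$; the function name `reflection_gap` is a hypothetical helper.

```python
import math

def reflection_gap(alpha):
    """Return |Gamma(1/alpha) * Gamma(1 - 1/alpha) - pi / sin(pi/alpha)| for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the integral in (1.22) converges only for alpha > 1")
    lhs = math.gamma(1.0 / alpha) * math.gamma(1.0 - 1.0 / alpha)
    rhs = math.pi / math.sin(math.pi / alpha)
    return abs(lhs - rhs)
```

For $\alpha=2$ both sides equal $\pi$, since $\Gamma(1/2)^2=\pi$.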


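Thus, when $u\to 0$, the integral in (1.21) converges to the constant $\pi\,\mathrm{cosec}(\pi/\alpha)$. Similarly, we have that
\[ f^*_\infty(u) = \alpha^{-1}\int_1^\infty \frac{1}{1+ux}\,x^{-(1+\alpha)/\alpha}\,dx. \tag{1.23} \]
In particular, $f^*_\infty(0)=1$, and
\[ 1 - f^*_\infty(u) = \alpha^{-1}\int_1^\infty\left(1-\frac{1}{1+ux}\right)x^{-(1+\alpha)/\alpha}\,dx = \alpha^{-1}u^{1/\alpha}\int_u^{u\infty}\frac{dx}{(x+1)x^{1/\alpha}}. \tag{1.24} \]
Taking the Laplace transform of (1.18) this implies that
\[ M^*(u) = \frac{F^*_\infty(u)}{1-f^*_\infty(u)} = \frac{1}{\alpha^{-1}u^{(1+\alpha)/\alpha}\int_u^{u\infty}\frac{dx}{(1+x)x^{1/\alpha}}} - u^{-1} \tag{1.25} \]
and, by classical results on the asymptotics of the inverse Laplace transform (see [Doe], Vol. 2, Sect. 7), this in turn implies that for $t\uparrow+\infty$,
\[ M(t) \sim \frac{t^{1/\alpha}}{\pi\alpha^{-1}\Gamma(\alpha^{-1})\,\mathrm{cosec}(\pi/\alpha)} - 1. \tag{1.26} \]
Finally, we can compute the asymptotics of the solution of Eq. (1.16). Here we will directly make use of the fact that the Laplace transform of $\Pi_\infty(s,t)$ is given explicitly as
\[ \Pi^*_\infty(s,u) = \frac{\alpha^{-1}\int_1^\infty e^{-s/x}\frac{dx}{(ux+1)x^{1/\alpha}}}{1-f^*_\infty(u)}. \tag{1.27} \]
We have already established the asymptotics of $1-f^*_\infty(u)$ near $u=0$. We still need to treat the numerator. It will be convenient to write
\[ \begin{aligned} \alpha^{-1}\int_1^\infty e^{-s/x}\frac{dx}{(ux+1)x^{1/\alpha}} &= \alpha^{-1}\int_1^\infty dx\int_{s/x}^\infty dv\,e^{-v}\frac{1}{(ux+1)x^{1/\alpha}} \\ &= \alpha^{-1}\int_0^\infty dv\,e^{-v}\int_{s/v\,\vee\,1}^\infty dx\,\frac{1}{(ux+1)x^{1/\alpha}} \\ &= \alpha^{-1}\int_0^\infty dv\,e^{-v}\int_{s/v}^\infty dx\,\frac{1}{(ux+1)x^{1/\alpha}} - \alpha^{-1}\int_s^\infty dv\,e^{-v}\int_{s/v}^1 dx\,\frac{1}{(ux+1)x^{1/\alpha}}. \end{aligned} \tag{1.28} \]
Now the first term can be conveniently represented as $u^{1/\alpha}$ times an explicit Laplace transform:
\[ \alpha^{-1}\int_0^\infty dv\,e^{-v}\int_{s/v}^\infty dx\,\frac{1}{(ux+1)x^{1/\alpha}} = \alpha^{-1}u^{1/\alpha}\int_0^{\infty/u}dv\,e^{-uv}\int_{s/v}^{u\infty}dx\,\frac{1}{(x+1)x^{1/\alpha}}. \tag{1.29} \]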


Note that since all integrands vanish at infinity in the right-half plane, $0/u$ and $u\infty$ can be replaced with $0$ and $\infty$, resp., i.e. the integration contours can be deformed to integrations along the real line. We will show that this term is the dominant one. In fact, combining (1.24) with (1.28) we get from (1.27) that
\[ \Pi^*_\infty(s,u) = \frac{\int_0^{\infty/u}dv\,e^{-uv}\int_{s/v}^{u\infty}\frac{dx}{(1+x)x^{1/\alpha}}}{\int_u^{u\infty}\frac{dx}{(1+x)x^{1/\alpha}}} - \frac{\int_s^\infty dv\,e^{-v}\int_{s/v}^1\frac{dx}{(ux+1)x^{1/\alpha}}}{u^{1/\alpha}\int_u^{u\infty}\frac{dx}{(1+x)x^{1/\alpha}}}. \tag{1.30} \]
Now the integral in the denominator equals
\[ \int_u^\infty\frac{dx}{(1+x)x^{1/\alpha}} = \int_0^\infty\frac{dx}{(1+x)x^{1/\alpha}} - \int_0^u\frac{dx}{(1+x)x^{1/\alpha}} = \pi\,\mathrm{cosec}(\pi/\alpha) - u^{1-1/\alpha}\sum_{n=0}^\infty(-1)^n\frac{u^n}{n+1-1/\alpha}, \tag{1.31} \]
where the last sum is convergent for $|u|<1$. Thus the leading singular (at $u=0$) term from the first term in (1.30) is given by
\[ \frac{\int_0^\infty dv\,e^{-uv}\int_{s/v}^\infty\frac{dx}{(1+x)x^{1/\alpha}}}{\pi\,\mathrm{cosec}(\pi/\alpha)}, \tag{1.32} \]
which obviously is the Laplace transform of the function $H_0(s/t)$. It remains to consider the second term in (1.30). Here the numerator converges to a constant as $u$ tends to zero; in fact, at $u=0$ it equals
\[ \int_s^\infty dv\,e^{-v}\int_{s/v}^1 dx\,\frac{1}{x^{1/\alpha}} = \frac{1}{1-1/\alpha}\int_s^\infty dy\,e^{-y}\left(1-(s/y)^{1-1/\alpha}\right) \le \mathrm{const.}\,e^{-s}. \tag{1.33} \]
Therefore the leading asymptotic of the second term is given by
\[ \mathrm{Const.}\,u^{-1/\alpha}e^{-s}. \tag{1.34} \]
The inverse Laplace transform of the second term has therefore the leading asymptotic behavior
\[ H_1(s,t) \sim \mathrm{Const.}\,t^{1/\alpha-1}e^{-s}. \tag{1.35} \]
Note that while the asymptotics in $t$ looks the same as that of the second term of $H_0(s/t)$ in the case $s/t\downarrow 0$, due to the exponential decay in $s$, this term can be neglected if $s$ is large. Thus we have now established the “aging” asymptotics found in Bouchaud.

1.3. The renewal equations. Statement of the main results. Guided by Bouchaud’s trap model, we can now construct the setup for the analysis of aging in the full REM dynamics. First of all the natural subset of states in $S_N$ to play the rôle of the state space in the trap model is the set
\[ T_N(E) \equiv \left\{\sigma\in S_N \,\middle|\, E_\sigma \ge u_N(E)\right\}, \tag{1.36} \]
where (recall Sect. 1.1 of [BBG1])
\[ u_N(x) \equiv \sqrt{2N\ln 2} + \frac{x}{\sqrt{2N\ln 2}} - \frac{1}{2}\,\frac{\ln(N\ln 2)+\ln 4\pi}{\sqrt{2N\ln 2}}. \tag{1.37} \]
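The role of the scaling function (1.37) can be illustrated numerically: with this choice, the expected number of configurations in the top, $2^N P[X\ge u_N(E)]$ for a standard normal $X$, is close to $e^{-E}$ for large $N$, matching the Poisson parameter of $n_E$ in the trap model. The sketch below is a rough check under that reading; the function names are hypothetical, and $N$ is kept moderate so that $2^N$ and the Gaussian tail stay within floating-point range.

```python
import math

def u_N(x, N):
    """The scaling function u_N(x) of (1.37)."""
    a = math.sqrt(2.0 * N * math.log(2.0))
    return a + x / a - 0.5 * (math.log(N * math.log(2.0)) + math.log(4.0 * math.pi)) / a

def expected_top_size(E, N):
    """2^N * P[X >= u_N(E)] for a standard normal X, via the complementary error function."""
    tail = 0.5 * math.erfc(u_N(E, N) / math.sqrt(2.0))
    return (2.0 ** N) * tail
```

At $N=200$ the ratio `expected_top_size(E, N) / math.exp(-E)` is already within a few percent of 1 for moderate $E$; the remaining gap reflects corrections that vanish as $N\uparrow\infty$.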

We will call the set $T_N(E)$ “the top”, and frequently suppress indices, writing $T_N(E) = T(E) = T$ whenever no confusion is likely (the single letter $T$ will only be used within proofs and the change in the notation will always be clearly signalled). Moreover, we will use the convention that $M\equiv|T_N(E)|$, and $d\equiv 2^M$. The idea is clearly to observe the process only at its visits to $T_N(E)$. The natural generalization of Bouchaud’s correlation function $\Pi_E(s,t)$ is therefore the probability that the process does not jump from a state in the top to another state in the top during a time interval of the form $[n,n+m]$. There is some ambiguity in how this should be defined precisely, but the following definition appears most convenient. To formulate it, let us introduce the following random times. For any $k\in\mathbb{N}$, let $k^-$ denote the last time before $k$ at which the process has visited the top, i.e.
\[ k^- \equiv \sup\{l<k \mid \sigma(l)\in T_N(E)\}. \tag{1.38} \]
Now set
\[ \Pi(m,n,N,E) \equiv P\left[\forall_{k\in[n+1,n+m]}\;\sigma(k)\notin T_N(E)\setminus\sigma(k^-)\right]. \tag{1.39} \]

Of course we still have to specify the initial distribution. To be as close as possible to Bouchaud, the natural choice is the uniform distribution on $T_N(E)$ that we will denote by $\pi_E$. However, we will also need to introduce the respective functions with starting point in an arbitrary state $\sigma$. Thus we set
\[ \Pi_\sigma(m,n,N,E) \equiv P\left[\forall_{k\in[n+1,n+m]}\;\sigma(k)\notin T_N(E)\setminus\sigma(k^-) \,\middle|\, \sigma(0)=\sigma\right] \tag{1.40} \]
and
\[ \Pi(m,n,N,E) \equiv \frac{1}{|T_N(E)|}\sum_{\sigma\in T_N(E)}\Pi_\sigma(m,n,N,E). \tag{1.41} \]
We will also use vector notation and write $\Pi(m,n,N,E)$ for the $M$ dimensional vector with components $\Pi_\sigma(m,n,N,E)$, $\sigma\in T_N(E)$. We are now ready to state the main theorem of this paper.

Theorem 1. Let $\beta>\sqrt{2\ln 2}$. Then there is a sequence $c_N\sim\exp(\beta\sqrt{N}u_N(E))$ such that for any $\varepsilon>0$,
\[ \lim_{t,s\uparrow\infty}\;\lim_{E\downarrow-\infty}\;\lim_{N\uparrow\infty} P\left[\left|\frac{\Pi([c_N s],[c_N t],N,E)}{\Pi_\infty(s,t)}-1\right|>\varepsilon\right] = 0, \tag{1.42} \]
where $\Pi_\infty(s,t)$ is the limiting correlation function of the trap model, defined in (1.17).

Before closing the introduction, let us say a few words about the heuristics of this theorem and the difficulties we will have to expect. Let us recall from [BBG1] the notation, for $\sigma\in S_N$, $I\subset S_N$,
\[ \tau_I^\sigma \equiv \inf\{n>0 \mid \sigma(n)\in I,\ \sigma(0)=\sigma\} \tag{1.43} \]
for the first positive time the process starting in $\sigma$ reaches the set $I$. Note that it is easy to derive a renewal equation for the quantities (1.40). Namely, the event in the probability in (1.40) occurs either


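(i) if $\sigma(k)\notin T_N(E)\setminus\sigma$, for all $k\in[0,n+m]$, or
(ii) if there is $0<l\le n$, s.t. $l=\inf\{k\le n \mid \sigma(k)\in T_N(E)\setminus\sigma\}$, and $\forall_{k\in[n+1,n+m]}\ \sigma(k)\notin T_N(E)\setminus\sigma(k^-)$.

Since this decomposition is disjoint, it implies immediately the following system of renewal equations (writing $T(E)=T_N(E)$):
\[ \begin{aligned} \Pi_\sigma(m,n,E) &= P[\tau^\sigma_{T(E)\setminus\sigma}>m+n] + \sum_{k=1}^n\sum_{\sigma'\in T(E)\setminus\sigma} P_\sigma\left[\tau^\sigma_{T(E)\setminus\sigma}=k,\ X_k=\sigma',\ X_l\notin T(E)\setminus X_{l^-},\ \forall\, n\le l\le m+n\right] \\ &= P[\tau^\sigma_{T(E)\setminus\sigma}>m+n] + \sum_{k=1}^n\sum_{\sigma'\in T(E)\setminus\sigma} P[\tau^\sigma_{\sigma'}=\tau^\sigma_{T(E)\setminus\sigma}=k]\;\Pi_{\sigma'}(m,n-k,E). \end{aligned} \tag{1.44} \]
The extra difficulty stems from the fact that the kernels $P[\tau^\sigma_{\sigma'}=\tau^\sigma_{T(E)\setminus\sigma}=k]$ depend on both $\sigma$ and $\sigma'$, while in the trap model it is assumed that this quantity is independent of $\sigma'$ for any value of $k$. Indeed, if we had the relation
\[ P[\tau^\sigma_{\sigma'}=\tau^\sigma_{T(E)\setminus\sigma}=k] = \frac{\pi_E(\sigma')}{1-\pi_E(\sigma)}\,P[\tau^\sigma_{T(E)\setminus\sigma}=k], \tag{1.45} \]
averaging (1.44) over $\sigma$ would yield
\[ \begin{aligned} \Pi(m,n,E) &= \sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}>m+n] + \sum_{k=1}^n\sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}=k] \times \sum_{\sigma'\in T(E)\setminus\sigma}\frac{\pi_E(\sigma')}{1-\pi_E(\sigma)}\,\Pi_{\sigma'}(m,n-k,E) \\ &= \sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}>m+n] + \sum_{k=1}^n\sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}=k]\;\Pi(m,n-k,E) \\ &\quad + \sum_{k=1}^n\sum_{\sigma\in T(E)}\frac{\pi_E(\sigma)}{1-\pi_E(\sigma)}\,P[\tau^\sigma_{T(E)\setminus\sigma}=k]\,\pi_E(\sigma) \times \left[\Pi(m,n-k,E)-\Pi_\sigma(m,n-k,E)\right]. \end{aligned} \tag{1.46} \]
The last term is bounded by $|T(E)|^{-1}$, which tends to zero uniformly as $E\downarrow-\infty$ and would be treated as an error term. If we ignore this term for a moment, (1.46) takes the desired form: Setting
\[ F_{N,E}(n) \equiv \sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}>n] \tag{1.47} \]
and
\[ f_{N,E}(n) \equiv \sum_{\sigma\in T(E)}\pi_E(\sigma)P[\tau^\sigma_{T(E)\setminus\sigma}=n], \tag{1.48} \]
Eq. (1.46) then becomes
\[ \Pi(m,n,E) = F_{N,E}(m+n) + \sum_{k=1}^n f_{N,E}(k)\,\Pi(m,n-k,E), \tag{1.49} \]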

which has the form of the equation in the trap model. Unfortunately, even though we have shown in [BBG1] that (1.45) is true (up to a negligible error) when summed over $k$, we have not been able to find an argument that would show that (1.45) was true pointwise. Thus the only way out appears to be to study the solution of the full system (1.44). This will require some substantial preparations and will be undertaken only in Sect. 4. The remainder of this paper is devoted to proving Theorem 1. In the next section we recall some important results from [BBG1]. In Sect. 3 we prove the necessary refined estimates on the probability distributions appearing as kernels or inhomogeneous terms in the renewal system (1.44). Armed with these estimates, we will return to the analysis of the solution of this system in Sect. 4 where we prove Theorem 1.

2. Basic Estimates

We will briefly recall a number of estimates that were proven in [BBG1] and that we will use heavily in our analysis. The first concerns various hitting probabilities.

Proposition 2.1. Set $M=|T(E)|$, $d=2^M$ and $\delta(N)\equiv\left(\frac{d}{N}\right)^{1/2}\log N$. There exists a subset $\Omega_E\subset\Omega$ with $P(\Omega_E)=1$, such that for all $\omega\in\Omega_E$, for all $N$ large enough, the following holds: For $\varepsilon>0$ a constant, define the sets
\[ B_{\sqrt{\varepsilon N}}(\sigma) = \{\sigma'\in S_N \mid \|\sigma-\sigma'\|_2\le\sqrt{\varepsilon N}\}, \qquad \sigma\in S_N, \tag{2.1} \]
and
\[ W_\varepsilon(I) \equiv \bigcap_{\sigma\in I} B^c_{\sqrt{\varepsilon N}}(\sigma), \qquad I\subseteq S_N. \tag{2.2} \]
Then,

i) For all $\varepsilon>0$ there exists a constant $c>0$ such that, for all $\eta\in T(E)$ and all $\sigma\in W_\varepsilon(T(E))$,
\[ \left|P\left[\tau^\sigma_\eta<\tau^\sigma_{T(E)\setminus\eta}\right]-\frac{1}{M}\right| \le \frac{d}{NM}\,(1+c\delta(N)). \tag{2.3} \]

ii) There exists a constant $c>0$ such that, for all $\eta\in T(E)$ and $\bar\eta\in T(E)$ with $\eta\ne\bar\eta$,
\[ \left|e^{\beta\sqrt{N}E_{\bar\eta}}\,P\left[\tau^{\bar\eta}_\eta<\tau^{\bar\eta}_{T(E)\setminus\eta}\right]-\frac{1}{M}\right| \le \frac{d}{NM}\,(1+c\delta(N)). \tag{2.4} \]

iii) There exists a constant $c>0$ such that, for all $\eta\in T(E)$ and $\bar\eta\in T(E)$ with $\eta\ne\bar\eta$,
\[ \left|P\left[\tau^{\bar\eta}_\eta<\tau^{\bar\eta}_{T(E)\setminus\{\eta,\bar\eta\}}\right]-\frac{1}{M-1}\right| \le \frac{d}{N(M-1)}\,(1+c\delta(N)). \tag{2.5} \]

iv) There exists a constant $c>0$ such that, for all $\eta\in T(E)$,
\[ \left|e^{\beta\sqrt{N}E_\eta}\,P\left[\tau^\eta_{T(E)\setminus\eta}<\tau^\eta_\eta\right]-\left(1-\frac{1}{M}\right)\right| \le \left(1-\frac{1}{M}\right)\frac{d}{N}\,(1+c\delta(N)). \tag{2.6} \]

v) There exists a constant $c>0$ such that, for all $\sigma\notin T(E)$,
\[ \left(1-\frac{1}{M}\right)\left(1-\frac{d}{N}(1+c\delta(N))\right) \le e^{\beta\sqrt{N}E_\sigma}\,P\left[\tau^\sigma_{T(E)}<\tau^\sigma_\sigma\right] \le 1. \tag{2.7} \]

vi) For all $\varepsilon>0$ there exists a constant $c>0$ such that, for all $\sigma\notin T(E)$ and all $\bar\sigma\in W_\varepsilon(T(E)\cup\sigma)$,
\[ P\left[\tau^{\bar\sigma}_\sigma\le\tau^{\bar\sigma}_{T(E)}\right] \le \frac{1}{M}+\frac{d}{NM}\,(1+c\delta(N)). \tag{2.8} \]

The next statement (Theorem 1.4 of [BBG1]) gives sharp estimates on mean transition times.

Theorem 2.2. Assume that $\alpha\equiv\beta/\sqrt{2\ln 2}>1$. Then there exists a subset $\Omega_E\subset\Omega$ with $P(\Omega_E)=1$, such that for all $\omega\in\Omega_E$, for all $N$ large enough, the following holds:

i) For all $\eta\in T(E)$,

\[ E\left(\tau^\eta_{T(E)\setminus\eta}\right) = \frac{1}{1-\frac{1}{M}}\left[e^{\beta\sqrt{N}E_\eta} + W_{\beta,N,T(E)}\,(1+O(1/N))\right]. \tag{2.9} \]

ii) For all σ ∈ / T (E), E(τTσ(E) ) ≤ E(τTσ(E) ) ≥

1 1 1− M 1

1−

1 M

√ eβ NEσ + Wβ,N,T (E) (1 + O(1/N )), √ 1 − eE (α − 1) eβ NEσ + Wβ,N,T (E) (1+O(1/N )). (2.10) 1 + 1/M

iii) For all η, η¯ ∈ T (E), η = η, ¯ η η η η E(τη¯ | τη¯ ≤ τT (E)\η ) − E(τT (E)\η ) ≤

1 1−

1 M

Wβ,N,T (E) O(1/N ),

(2.11)

where √

Wβ,N,T (E)

e(α−1)E+β NuN (0) ≡ M(α − 1)

α−1 1 + VN,E eE/2 √ 2α − 1

(2.12)

and VN,E is a random variable of mean zero and variance one. We will also make use of the following simple corollary to this proposition: Corollary 2.3. Under the assumptions and with the notation of Theorem 2.2 we have:


i) For all η, η¯ ∈ T (E), η = η: ¯ 1 η η η η η η E(τη¯ | τη¯ ≤ τT (E)\η ) − E(τη¯ | τη¯ ≤ τT (E)\η ) |T (E) \ η| η∈T ¯ (E)\η

≤

1 1−

1 M

Wβ,N,T (E) O(1/N ).

(2.13)

ii) For all η ∈ T (E), 0 < E(τT (E)\η ) − P−1 (τT (E)\η < τηη ) ≤ η

1

η

1−

1 M

Wβ,N,T (E) (1 + O(1/N )). (2.14)

Proof of Corollary 2.3. The first assertion is an obvious consequence of the last assertion of Theorem 2.2. The second assertion simply follows from Eq. (3.8) of [BBG1] and is proven just as the first assertion of Theorem 2.2. Equipped with this information we proceed in the next section to analyse the Laplace transforms of the distribution functions of such transition times. 3. Estimates on Laplace Transforms We will use the method of Laplace tranforms to solve the system of renewal equations (1.44). Doing so this will require precise control on the Laplace transforms of the distribution functions of the probability distributions appearing in these equations. In this section we derive the basic estimates on these Laplace transforms. As in [BEGK1], Sect. 3, the first crucial step is an estimate of the maximal mean time to reach the set T (E). Lemma 3.1. Define (E) ≡ max EτTσ(E) σ ∈SN

and (E) ≡ (1 −

√ −1 β NuN (0)+αE 1 |T (E)| ) e

(3.1)

e−E E/2 α − 1 1 + Ve 1+ √ |T (E)|(α − 1) 2α − 1

× (1 + O(1/N )) ,

(3.2)

where V is a random variable of mean zero and variance 1. Then, under the assumptions of Theorem 2.2, (E). (E) ≤

(3.3)

(E) follows immediately from the esProof. For σ ∈ T (E), the bound EτTσ(E) ≤ timate from Theorem 2.2, i). If σ ∈ T (E), the forward Kolmogorov equation shows that pN (σ, σ ) + pN (σ, σ ) 1 + EτTσ(E) . (3.4) EτTσ(E) = σ ∈T (E)

σ ∈T (E)

Using the previous result in (3.4) one sees that the same estimate holds in this case.


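We define, for $\sigma\in S_N$, $I,J\subset S_N$, and $u\in D\subset\mathbb{C}$,
\[ G^\sigma_{I,J}(u) \equiv E\,e^{u\tau^\sigma_I}\,\mathbb{I}_{\{\tau^\sigma_I\le\tau^\sigma_J\}} \equiv \sum_{n=1}^\infty P[\tau^\sigma_I=n\le\tau^\sigma_J]\,e^{nu}, \tag{3.5} \]
where $D$ is chosen such that the right-hand side of (3.5) exists. Note that this is always the case for $u$ s.t. $\Re(u)\le 0$, but in fact, for $\sigma, I, J$ given, there will be some $u_0\equiv u_0(\sigma,I,J)>0$, s.t. $G^\sigma_{I,J}(u)$ exists for all $u$ with $\Re(u)\le u_0$. Similarly we define
\[ G^\sigma_I(u) \equiv E\,e^{u\tau^\sigma_I}. \tag{3.6} \]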

Theorem 3.2. For any σ ∈ T (E), the Laplace transform GσT (E)\σ (u) can be written as GσT (E)\σ (u) =

aσ −u 1 − (1 − e )EτTσ(E)\σ bσ

+ Rσ (u),

(3.7)

where (E)/EτTσ(E)\σ , aσ = 1 + O (E)/EτTσ(E)\σ , bσ = 1 + O

(3.8) (3.9)

(E), periodic with period 2π in the and Rσ (u) is analytic in the half-plane (u) < 1/ imaginary direction, and satisfies (E), (i) for all |u| ≤ a/ 2 √ (E) |Rσ (u)| ≤ C(a) e−β NEσ

(3.10)

and √ (E) and |1 − e−u | ≥ 2ε−1 e−β NEσ (ii) for all u with (u) < (1 − ε) √

e−β NEσ . |Rσ (u)| ≤ 2 (E)) |1 − e−u |(1 − (u)

(3.11)

aσ + Rσ (0) = 1.

(3.12)

Moreover,

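This proposition allows in fact to prove very good estimates on the distribution function of $\tau^\sigma_{T(E)\setminus\sigma}$. Note first that if
\[ L(u) \equiv \sum_{n=0}^\infty e^{un}\,P[\tau^\sigma_{T(E)\setminus\sigma}>n], \tag{3.13} \]
then
\[ L(u) = \frac{G^\sigma_{T(E)\setminus\sigma}(u)-1}{e^u-1}. \tag{3.14} \]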


Corollary 3.3. With the notation of Theorem 3.2, for any ε > 0 and for any positive integer n ∈ N, aσ −n/EτTσ (E)\σ bσ P[τTσ(E)\σ = n] = e σ EτT (E)\σ bσ √ (E)ε , +O e−n(1−ε)/(E) e−β NEσ ε −1 ln (3.15) and (for n > 0) P[τTσ(E)\σ > n] = aσ e

−n/EτTσ (E)\σ bσ

√ (E)ε −1 . + O e−n(1−ε)/(E) e−β NEσ (3.16)

Proof of Theorem 3.2. Our analysis of the Laplace transforms will follow closely the strategy employed in [BEGK1], but some simplifications will occur due to the particular properties of the model at hand. 3.1. A priori estimates on Laplace transforms. As in [BEGK1], Lemma 3.1 implies immediate control on the Laplace transforms gσσ (u) ≡ Gσσ,T (E) (u): (E), for all σ, σ ∈ SN , Lemma 3.4. For all ε > 0, and for all real u ≤ (1 − ε)/

gσσ (u) ≤

1 ≤ ε −1 . (E) 1 − u

(3.17)

Proof. The proof is identical to the proof of Lemma 3.2 of [BEGK1]. Just note that if we set  σ  gσ (u), for σ ∈ T (E) ∪ σ vu (σ ) ≡ 1, (3.18) for σ = σ  0, for σ ∈ T (E)\σ then vu is the unique solution of the Dirichlet problem (1 − eu PN )vu (σ ) = 0, vu (σ ) = 1, vu (σ ) = 0

if

σ ∈ T (E) ∪ σ,

if

σ ∈ T (E)\σ.

(3.19)

Setting wu (σ ) ≡ vu (σ ) − v0 (σ ), we see that wu solves (1 − PN )wu (σ ) = (1 − e−u )vu (σ ), if σ ∈ T (E) ∪ σ, wu (σ ) = 0 if σ ∈ T (E) ∪ σ.

(3.20)

The solution of (3.20) can be represented as

τTσ (E)∪σ −1

wu (σ ) = E

(1 − e−u )vu (Xt )

(3.21)

t=1

implying that

σ

σ

vu (σ ) = P[τσ = τT (E)∪σ ] + (1 − e−u )E

τTσ (E)∪σ −1

t=1

vu (Xt ).

(3.22)


Setting S(u) ≡ maxσ ∈T (E)∪σ vu (σ ), (3.22) implies S(u) ≤ 1 + (1 − e−u )

max

σ ∈T (E)∪σ

EτTσ(E)∪σ S(u)

(E)S(u), ≤ 1 + u

(3.23)

and hence S(u) ≤ which proves the lemma.

1 (E) 1 − u

(3.24)

This basic estimate can be improved in certain cases: Lemma 3.5. Let σ ∈ T (E). Then, for u as in Lemma 3.4, (i) GσT (E)\σ,σ (u) ≤ e−β (ii)

√ NEσ

√ Gσσ,T (E) (u) ≤ eu 1 + e−β NEσ

eu ≤ 2ε −1 P[τTσ(E) < τσσ ], (E) 1 − u 1 (E) 1 − u

(3.25)

≤ 1 + 2ε −1 P[τTσ(E) < τσσ ]. (3.26)

Proof. Let us first prove (i). This goes essentially along the same lines as the proof of Lemma 3.4. Define  σ  GT (E)\σ,σ (u), for σ ∈ T (E) ∪ σ (3.27) ψu (σ ) ≡ 1, for σ ∈ T (E)\σ  0, for σ = σ and φu (σ ) ≡ ψu (σ ) − ψ0 (σ ). Then φu solves (1 − PN )φu (σ ) = (1 − e−u )ψu (σ ), if φu (σ ) = 0 if σ ∈ T (E).

σ ∈ T (E), (3.28)

Just as in the previous proof, we get first the uniform bound ψu (σ ) ≤ Now GσT (E)\σ,σ (u) =

1 . (E) 1 − u

pN (σ, σ )eu GσT (E)\σ,σ (u) ≤

σ =σ

(3.29)

√ 1 eu . e−β NEσ (E) N 1 − u

σ =σ

(3.30) Since P[τTσ(E) < τσσ ] ∼ In the same way,

√ e−β NEσ ,

(i) is proven.

Gσσ,T (E) (u) = eu pN (σ, σ ) + eu ≤ 1+e and this proves (ii).

√ −β NEσ

pN (σ, σ )Gσσ,T (E) (u)

σ =σ

1 eu (E) 1 − u

(3.31)


Finally we turn to the Laplace transform of hitting times without extra exclusion sets. Proposition 3.6. Let σ ∈ T (E). Then, for u = ρ/EτTσ(E)\σ , if |ρ| ≤ (1 − γ ), γ > 0, GσT (E)\σ ρ/EτTσ(E)\σ =

1

(E)/Eτ σ 1 − ρ 1 + O(e ) + ρO ( )2 T (E)\σ (E)/EτTσ(E)\σ . × 1 + ρO (3.32)

−αu−1 N (Eσ )+E

Proof. As in the analogous analysis in [BEGK1], the starting point of our analysis is the renewal equation
\[ G^\sigma_{T(E)\setminus\sigma}(u) = \frac{G^\sigma_{T(E)\setminus\sigma,\sigma}(u)}{1-G^\sigma_{\sigma,T(E)}(u)}. \tag{3.33} \]
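The mechanism behind a renewal equation of the type (3.33) is a decomposition of the path into excursions that return to the starting point before the target is reached. The sketch below checks this on a hypothetical three-state toy chain, not on the REM: from the trap $\sigma$ the chain stays put with probability $1-p$ or moves to a bulk state with probability $p$; from the bulk state it returns to $\sigma$ with probability $r$ or hits the target set with probability $1-r$. All names and parameters are assumptions made for illustration.

```python
import math

def hitting_transform_direct(p, r, u, n_max=2000):
    """Truncated sum over n <= n_max of e^{un} P[tau_target = n], obtained by
    propagating the occupation probabilities of the toy chain step by step."""
    at_sigma, at_bulk, total = 1.0, 0.0, 0.0
    for n in range(1, n_max + 1):
        hit = at_bulk * (1.0 - r)          # mass absorbed in the target at step n
        at_sigma, at_bulk = at_sigma * (1.0 - p) + at_bulk * r, at_sigma * p
        total += math.exp(u * n) * hit
    return total

def hitting_transform_renewal(p, r, u):
    """Same transform via the renewal decomposition: (escape term) / (1 - return term)."""
    g_return = (1.0 - p) * math.exp(u) + p * r * math.exp(2.0 * u)   # back at sigma first
    g_escape = p * (1.0 - r) * math.exp(2.0 * u)                     # target reached first
    return g_escape / (1.0 - g_return)
```

Both functions agree as long as the return term is smaller than 1, which is the toy counterpart of the restriction on $u$ in the text; the value of $u$ at which the return term equals 1 is the pole that governs the exponential tail of the hitting time.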

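It is reasonable to rewrite this as
\[ G^\sigma_{T(E)\setminus\sigma}(u) = \frac{P[\tau^\sigma_{T(E)\setminus\sigma}<\tau^\sigma_\sigma]}{1-G^\sigma_{\sigma,T(E)}(u)} + \frac{G^\sigma_{T(E)\setminus\sigma,\sigma}(u)-G^\sigma_{T(E)\setminus\sigma,\sigma}(0)}{1-G^\sigma_{\sigma,T(E)}(u)} \equiv (I)+(II). \tag{3.34} \]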

Using the Taylor-Lagrange formula with remainder to second order, we have (I ) =

P[τTσ(E)\σ < τσσ ] 2

d σ P[τTσ(E)\σ < τσσ ] − uEτσσ I{τσσ ≤τTσ(E) } − u2 /2 du ˜ 2 Gσ,T (E) (u) −1  2 d σ Eτσσ I{τσσ ≤τTσ(E) } G ( u) ˜ 1 2 σ,T (E) du  . = 1 − ρ − ρ2 P[τTσ(E)\σ < τσσ ]EτTσ(E)\σ 2 P[τTσ(E)\σ < τσσ ](EτTσ(E)\σ )2

(3.35) We want to show that the coefficient of ρ in the denominator is essentially equal to one, while the coefficient of ρ 2 tends to zero. Differentiating the renewal equation (3.33) and evaluating at u = 0 gives

E τTσ(E)\σ |τTσ(E)\σ = τTσ(E) = E τTσ(E)\σ −

Eτσσ I{τσσ ≤τTσ(E) }

, 1 − P τσσ ≤ τTσ(E)

(3.36)

which implies immediately that Eτσσ I{τσσ ≤τTσ(E) } P[τTσ(E)\σ < τσσ ]EτTσ(E)\σ

≤ 1.

(3.37)

Moreover, Eτσσ I{τσσ ≤τTσ(E) } ≥ P[τσσ = 1] = 1 − eβ

√ NEσ

,

(3.38)

while by (2.9) of Theorem 2.2 and (2.6) of Proposition 2.1, the denominator in (3.37) is bounded from above by −1

1 + e−αuN

(Eσ )

α−1 Ve−E/2 √ . 2α − 1

(3.39)


Thus Eτσσ I{τσσ ≤τTσ(E) } P[τTσ(E)\σ < τσσ ]EτTσ(E)\σ

≥

1 − e−β −1

1 + e−αuN

(Eσ )

√ NEσ

VeE/2 √α−1 2α−1

.

(3.40)

Next we turn to the coefficient of ρ 2 . By (3.31) we can write Gσσ,T (E) (u) = eu pN (σ, σ ) + f (u),

(3.41)

(E) and satisfies where f (u) is analytic in the half-plane (u) < 1/ f (u) ≤ e−β

√ NEσ

eu . (E) 1 − u

(3.42)

(E), By Cauchy’s integral formula, this implies that for (u) < (1 − γ )/ √ f (u) ≤ e−β NEσ

Cγ −1 (E)−1 − (u))2 (

(3.43)

(E), with some universal numerical constant C. Thus for u = λ/EτTσ(E)\σ ≤ (1 − γ )/ γ > 0, we get √ (E)2 . f λ/EτTσ(E)\σ ≤ e−β NEσ Cγ −3

(3.44)

Therefore, under the same condition, √ d2 σ −β NEσ Cγ −2 (E)2 (E)2 G ( u) ˜ e 2 σ,T (E) du −3 ≤ ≤ 2Cγ , σ σ σ P[τ σ σ 2 σ 2 (EτTσ(E)\σ )2 T (E)\σ < τσ ](EτT (E)\σ ) P[τT (E)\σ < τσ ] (EτT (E)\σ ) (3.45) which is small if u−1 N (Eσ ) E. Finally we turn to the term (II). While the denominator is the same as in (I), the numerator can now be written as GσT (E)\σ,σ (u) − GσT (E)\σ,σ (0) = u

d σ G (u). ˜ du T (E)\σ,σ

(3.46)

This can be bounded in the same way as before, using the Cauchy estimates under the same assumptions on u (with a different constant C), by √ d σ −β NEσ G (E). ( u) ˜ Cγ −2 du T (E)\σ,σ ≤ e

(3.47)

This shows that (II) can be estimated as a small fraction of (I). This concludes the proof of the proposition.


3.2. Analyticity properties. Let us note first that all Laplace transforms that we are considering can be identified with meromorphic functions that are given as the solutions of Dirichlet problems of the same type as (3.19). Note also that trivially all these functions are periodic with period 2π in the imaginary direction. Equation (3.33) allows to derive more precise estimates on our Laplace transform than we have obtained so far. Note that both Laplace transform on the left hand side of (3.33) are analytic in the half . This implies that the only singularities of Gσ plane (u) < 1/ T (E)\σ (u) in that half-plane are poles at those values of u for which the denominator vanishes, i.e. 1 = Gσσ,T (E) (u).

(3.48)

By inspection of the proof of Proposition 3.6, there is only one solution of this equation in the strip −π ≤ (u) ≤ π, uσ = ρ/EτTσ(E)\σ , where ρ satisfies −1 (E)/EτTσ(E)\σ )2 . 1 − ρ = ρO e−αuN (Eσ )+αE + ρ 2 O (

(3.49)

2 (E)/Eτ σ This implies the existence of a solution ρ0 = 1 + O ( T (E)\σ ) . This implies that the function GσT (E)\σ (u) has simple poles at uσ (mod + i2π ), and (E)−1 , (u) = 0 or (u) = π . Moreover, Proposition all other poles satisfy (u) ≥ 3.6 implies that the residue at uσ equals

res uσ =

GσT (E)\σ (uσ ) d σ du Gσ,T (E) (uσ )

=

1 EτTσ(E)\σ

(E)/EτTσ(E)\σ )2 . 1 + O (

(3.50)

This allows in particular to extend the validity of the renewal equation (3.33) to the entire domain of analyticity of this function. This will prove very helpful in obtaining good bounds. As a first observation, we note that the domain of validity of (3.32) can be (E). immediately extended to the set ρ < EτTσ(E)\σ / We will now estimate the difference between GσT (E)\σ (u) and the contribution from the pole at uσ . We set

Rσ (u) = GσT (E)\σ (u) +

GσT (E)\σ,T (E) (uσ ) d (u − uσ ) du Gσσ,T (E) (uσ )

.

(3.51)

(E), a < 1. We first give a uniform estimate of the modulus of Rσ on the disk |u| ≤ a/ Note that a straightforward computation and the use of Taylor expansion to first order shows that


Rσ (u) d Gσ σ σ σ GσT (E)\σ,T (E) (u)(u − uσ ) du σ,T (E) (uσ ) − GT (E)\σ,T (E) (uσ )(Gσ,T (E) (u) − Gσ,T (E) (uσ )) = d σ σ (1 − Gσ,T (E) (u))(u − uσ ) du Gσ,T (E) (uσ ) =

2 d d σ σ ˜ − 21 GσT (E)\σ,T (E) (uσ ) d 2 Gσσ,T (E) (u) ˆ du Gσ,T (E) (uσ ) du GT (E)\σ,T (E) (u) du , d Gσ d σ du σ,T (E) (uσ ) du Gσ,T (E) (u )

(3.52)

where u, ˜ u, ˆ u are somewhere on the ray between uσ and u. From (3.31) and the Cauchy bounds used as in (3.43) we get that √ −β NEσ (E) d σ e u G , (3.53) du σ,T (E) (u) − pN (σ, σ )e ≤ C 1 − (u) (E) √ −β NEσ (E) d σ e G , du T (E)\σ,T (E) (u) ≤ C 1 − (u) (E)

(3.54)

√ 2 (E))2 d e−β NEσ ( σ (u) G (u) ≤ e + C du2 σ,T (E) (E))2 (1 − (u)

and by Lemma 3.5,

(3.55)

√

σ GT (E)\σ,T (E) (u) ≤ C

e−β NEσ . (E) 1 − (u) (E), Combining these estimates, we see that indeed on |u| ≤ a/ 2 √ (E) |Rσ (u)| ≤ C(a) e−β NEσ

(3.56)

(3.57)

as desired. It remains to estimate GσT (E)\σ (u) for (E). (1/EτTσ(E)\σ ) < (u) < 1/ To do so, we rely on (3.33). We will use (3.25) to bound the numerator uniformly in the imaginary part of u, while the denominator will provide extra decay in the imaginary direction. Note that by (3.31), √ σ |Gσσ,T (E) (u)| Gσ,T (E) (u) − pN (σ, σ )eu ≤ e−β NEσ |eu | max σ ∼σ

√

e−β NEσ |eu | . ≤ (E) 1 − (u) Therefore |Gσσ,T (E) (u) − 1|

≥ |e ||1 − e u

−u

|−e

√ −β NEσ

(3.58)

1 . 1− (E) 1 − (u)

(3.59)

Combining this estimate with (i) of Lemma 3.5, we arrive at the bound, valid for (u) < √ (E) and |1 − e−u | ≥ 2ε−1 e−β NEσ , (1 − ε)/ √

|GσT (E)\σ (u)|

e−β NEσ ≤2 . (E))|1 − e−u | (1 − u

Combining these observations we arrive at the assertion of Theorem 3.2.

(3.60)


[Figure: three panels in the complex u-plane (axes Re u and Im u, with the levels Im u = ±π marked), showing the integration contour and two deformed integration contours.]

Finally we prove Corollary 3.3. Proof of Corollary 3.3. We give only the proof of (3.15), the proof of (3.16) being completely analogous. Note that by the Laplace inversion formula [Doe], iπ 1 σ e−un GσT (E)\σ (u)du, (3.61) P[τT (E)\σ = n] = 2πi −iπ where the integration is along the imaginary axis. Inserting (3.7) into (3.61), in the first two terms the integration contour can be modified to any circle enclosing the point 1/EτTσ(E)\σ bσ , and the integral yields, by Cauchy’s theorem, the residue of e−un 1−(1−e−ua)σEτ σ at this point. In the integral over the remainder term Rσ (u), we b T (E)\σ σ

(E) along the positive real axis and use the uniform bound shift the contour by (1 − ε)/ (3.11) along the integration contour. This gives the claimed estimate.
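The inversion formula (3.61) can be tried out numerically on a toy example. The sketch below is not taken from the paper: it assumes a geometric random time with a hypothetical success probability `p`, for which the transform $\mathbb{E}e^{u\tau}$ is known in closed form, and approximates the integral along the imaginary axis by a Riemann sum. The function names are illustrative only.

```python
import cmath
import math


def laplace_transform_geometric(u, p):
    """E[exp(u * tau)] for tau geometric on {1, 2, ...} with success probability p."""
    return p * cmath.exp(u) / (1.0 - (1.0 - p) * cmath.exp(u))


def invert_on_imaginary_axis(G, n, num_points=4096):
    """Riemann-sum approximation of (1 / (2 pi i)) * int_{-i pi}^{i pi} exp(-u n) G(u) du."""
    total = 0.0 + 0.0j
    for k in range(num_points):
        v = -math.pi + 2.0 * math.pi * k / num_points
        u = 1j * v
        total += cmath.exp(-u * n) * G(u)
    # With u = i v we have du = i dv, so the prefactor reduces to 1 / num_points.
    return (total / num_points).real


p = 0.3  # hypothetical parameter of the toy example
recovered = [
    invert_on_imaginary_axis(lambda u: laplace_transform_geometric(u, p), n)
    for n in range(1, 6)
]
exact = [p * (1.0 - p) ** (n - 1) for n in range(1, 6)]
```

In this toy case the list `recovered` should agree with `exact` up to rounding, since the Riemann sum only aliases the probabilities of times that differ by a multiple of `num_points`.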

4. The Renewal Equations

4.1. Introduction. We now have all the ingredients needed to study the system of renewal equations (1.44) established in Sect. 1.4. As usual, to solve (1.44) we pass to Laplace transforms, solve the ensuing linear system, and then transform back. We set

∗σ (m, u, E)

≡

∞

enu σ (m, n, E)

(4.1)

n=0

for $u \in \mathbb{C}$ whenever this sum converges. Let us define

$$
F^*_\sigma(m, u) \equiv \sum_{n=0}^{\infty} e^{nu}\,\mathbb{P}[\tau^\sigma_{T(E)\setminus\sigma} > m + n]. \tag{4.2}
$$

Then it follows from (1.44) that for any σ ∈ T (E),

∗σ (m, u, E) = Fσ∗ (m, u) + Gσσ ,T (E)\σ (u) ∗σ (m, u, E). σ ∈T (E)\σ

(4.3)


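Let us denote by $K_E^*(u)$ the $|T(E)| \times |T(E)|$ matrix with elements⁸

$$
(K_E^*(u))_{\sigma,\sigma'} \equiv \begin{cases} G^\sigma_{\sigma',T(E)\setminus\sigma}(u), & \text{if } \sigma \ne \sigma', \\ 0, & \text{if } \sigma = \sigma'. \end{cases} \tag{4.4}
$$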

Then clearly the solution of Eq. (4.3) can be written as9

−1 ∗

∗ (m, u, E) = I − KE∗ (u) KE (u) + I F ∗ (m, u),

(4.5)

where ∗ and F ∗ denote the vectors with components ∗σ , and Fσ∗ . The matrix

$$
M_E^*(u) \equiv \left(I - K_E^*(u)\right)^{-1} K_E^*(u) \tag{4.6}
$$

is known as the Laplace transform of the resolvent of the system of renewal equations. Our task is to compute the inverse Laplace transform of the right hand side of (4.5). This requires estimates in the complex u-plane. We will separate this analysis in two steps. First, we establish a priori bounds on the norm of ME∗ in a suitable domain. Next we will perform a suitable perturbation analysis that is valid in a small neighborhood of u = 0 only. Then we show that the dominant part of the contribution from the Laplace-inversion formula comes from this region and is thus explicitly computed, while the remainder is controlled by our a priori bounds.

4.2. Bounds on the resolvent. In the sequel we will always work with the matrix norm

$$
\|K\| \equiv \max_{\sigma \in T(E)} \sum_{\sigma' \in T(E)} |K_{\sigma,\sigma'}|. \tag{4.7}
$$

Note that $\|\cdot\|$ is an operator norm in $L^\infty(\mathbb{C}^M)$ equipped with the supremum norm, i.e. $\|KF\|_\infty \le \|K\|\,\|F\|_\infty$. This norm serves our purposes, and moreover will turn out to be particularly well suited to the matrices that we need to deal with. We will begin by deriving estimates on the matrices $K_E^*(u)$. It follows from the results of Sect. 3 that

Lemma 4.1. Considered as a function $\mathbb{C} \to L(\mathbb{C}^M, \mathbb{C}^M)$, $K_E^*(u)$ is
(i) Periodic with period 2π in the imaginary direction.
(ii) Meromorphic in $\mathbb{C}$ with poles only on the positive real axis and its 2π translates.
(iii) For $\sigma \ne \sigma' \in T(E)$,

$$
K^*_{\sigma,\sigma'}(u) = \frac{G^\sigma_{\sigma',T(E)}(u)}{1 - G^\sigma_{\sigma,T(E)}(u)}. \tag{4.8}
$$

8 We will often write $K^*_{\sigma,\sigma'}(u)$ instead of $(K_E^*(u))_{\sigma,\sigma'}$ whenever no confusion is possible.

9 The reason for separating the $I$ in this representation is that the operator $(I - K_E^*(u))^{-1} K_E^*(u)$ has better decay properties at infinity than $(I - K_E^*(u))^{-1}$ itself. This is important for computing the inverse Laplace transforms.

The following observation will be extremely useful:


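Lemma 4.2. For any $u \in \mathbb{C}$ for which $G^\sigma_{T(E)\setminus\sigma,T(E)}(u)$ is finite,

$$
\sum_{\sigma' \in T(E)\setminus\sigma} G^\sigma_{\sigma',T(E)}(u) = G^\sigma_{T(E)\setminus\sigma,T(E)}(u). \tag{4.9}
$$

Proof. It is enough to prove (4.9) for u in the negative imaginary half plane. Now

$$
\mathbb{I}_{\{\tau^\sigma_{T(E)\setminus\sigma} \le \tau^\sigma_{T(E)}\}} = \sum_{\sigma' \in T(E)\setminus\sigma} \mathbb{I}_{\{\tau^\sigma_{\sigma'} \le \tau^\sigma_{T(E)}\}}. \tag{4.10}
$$

Thus

$$
G^\sigma_{T(E)\setminus\sigma,T(E)}(u) = \mathbb{E}\, e^{u\tau^\sigma_{T(E)\setminus\sigma}} \mathbb{I}_{\{\tau^\sigma_{T(E)\setminus\sigma} \le \tau^\sigma_{T(E)}\}}
= \mathbb{E} \sum_{\sigma' \in T(E)\setminus\sigma} e^{u\tau^\sigma_{T(E)\setminus\sigma}} \mathbb{I}_{\{\tau^\sigma_{\sigma'} \le \tau^\sigma_{T(E)}\}}
= \sum_{\sigma' \in T(E)\setminus\sigma} \mathbb{E}\, e^{u\tau^\sigma_{\sigma'}} \mathbb{I}_{\{\tau^\sigma_{\sigma'} \le \tau^\sigma_{T(E)}\}}
= \sum_{\sigma' \in T(E)\setminus\sigma} G^\sigma_{\sigma',T(E)}(u). \tag{4.11}
$$

An immediate, but important consequence of Lemma 4.2 is that

$$
\|K_E^*(0)\| = 1. \tag{4.12}
$$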
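The normalisation (4.12) can be checked by hand on a small Markov chain. The following sketch is only an illustration under stated assumptions: the six-state transition matrix `P`, the set `top` (playing the role of $T(E)$) and all function names are hypothetical, and hitting times are taken to be first hitting times at times $n \ge 1$. It computes $G^\sigma_{\sigma',T}(0)$ as the probability that the chain started at $\sigma$ first enters `top` at $\sigma'$, builds the matrix of (4.8) at $u = 0$, and evaluates the norm (4.7).

```python
def hitting_distribution(P, top, sweeps=200):
    """h[x][t]: probability that the chain started at x outside `top` first enters `top` at t."""
    outside = [x for x in range(len(P)) if x not in top]
    h = {x: {t: 0.0 for t in top} for x in outside}
    for _ in range(sweeps):
        # Fixed-point iteration of h = (direct jump into t) + (jump outside, then hit t).
        h = {
            x: {t: P[x][t] + sum(P[x][y] * h[y][t] for y in outside) for t in top}
            for x in outside
        }
    return h


def kernel_at_zero(P, top):
    """Return G[s][t] = G^s_{t,T}(0) and the matrix K of (4.8) evaluated at u = 0."""
    h = hitting_distribution(P, top)
    outside = [x for x in range(len(P)) if x not in top]
    G = {
        s: {t: P[s][t] + sum(P[s][x] * h[x][t] for x in outside) for t in top}
        for s in top
    }
    K = {
        s: {t: 0.0 if t == s else G[s][t] / (1.0 - G[s][s]) for t in top}
        for s in top
    }
    return G, K


# Hypothetical six-state chain with strictly positive transition probabilities.
n_states = 6
weights = [[1 + (3 * x + 5 * y) % 7 for y in range(n_states)] for x in range(n_states)]
P = [[w / sum(row) for w in row] for row in weights]
top = [0, 1, 2]

G, K = kernel_at_zero(P, top)
row_sums = [sum(abs(K[s][t]) for t in top) for s in top]
norm_K0 = max(row_sums)
```

Every row of `K` should sum to one here, because the chain enters `top` almost surely and the off-diagonal entries of a row are exactly the entrance probabilities conditioned on not returning to the starting point first.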

The first step towards control in the complex plane will be to show that KE∗ (u) decreases down from zero along the imaginary axis in the strip (u) ∈ [−π, π ]. Lemma 4.3. Let v ∈ [−π, π ] and set ¯ ≡ eβ

√ NuN (0)+αE

(4.13)

.

Recall M = |T (E)| and d = 2M . Then (for N large enough),

KE∗ (iv) ≤

1

¯ −1 ) + 1 − ¯ 2 1 − O( 2(1 − cos v)

4 M−1

(1 + O(d/N ))

.

(4.14)

Before proving the lemma, we will note the obvious consequence that Corollary 4.4. Under the assumptions and notations of Lemma 4.3, 3 ¯ , then KE∗ (iv) < 1. (i) If |v| > √M−1 (ii) For any 0 < ε < 1, if ε 2−ε 9 −2 ¯ 2(1 − cos v) ≥ + (1 + O(d/N )) 1 − ε 1 − ε (m − 1)(1 − ε)

(4.15)

then KE∗ (iv) ≤ 1 − ε. (iii) Under the same assumptions as in (i),

ME∗ (iv) ≤ !

1 2 ¯ ¯ 1 + 2 (1 − cos v)(1 − O(−1 )) − 1 −

4 M−1 (1 + O(d/N ))

.

(4.16)


Proof. To bound the norm of KE∗ , we use simply that σ σ ∈T (E)\σ |Gσ ,T (E) (iv)| ∗ |Kσ,σ (iv)| = |1 − Gσσ,T (E) (iv)| σ ∈T (E)\σ

≤

P[τTσ(E)\σ ≤ τTσ(E) ] |1 − Gσσ,T (E) (iv)|

(4.17)

.

Thus the key point is to bound the denominator from below. Now Gσσ,T (E) (iv)

=

∞

sin(vn)P[τσσ = τTσ(E) = n]

n=1

= sin(v)pN (σ, σ ) +

pN (σ, σ )

σ ∈T (E)

∞

sin(v(n + 1))P[τσσ = τTσ(E) = n]

n=1

≡ pN (σ, σ ) sin v + dσ (v),

(4.18)

where |dσ (v)| ≤ e

√ −β NEσ

√

1 e−β NEσ P[τσσ = τTσ(E) ] ≤ 2 (1 + O(|T (E)|/N )), N |T (E)|

σ ∼σ

(4.19) where we used the bound (2.3) from Proposition 2.1, 1 − Gσσ,T (E) (iv) = pN (σ, σ )(1 − cos v) + cσ (v),

(4.20)

where P[τTσ(E)\σ

=

τTσ(E) ]

≤ cσ (v) ≤

P[τTσ(E)\σ

=

e τTσ(E) ] + 2

√ −β NEσ

|T (E)|

(1 + O(|T (E)|/N )). (4.21)

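Thus we have that

$$
|1 - G^\sigma_{\sigma,T(E)}(iv)| \ge \sqrt{(p_N(\sigma,\sigma)\sin v)^2 + \left(p_N(\sigma,\sigma)(1 - \cos v) + \mathbb{P}[\tau^\sigma_{T(E)\setminus\sigma} = \tau^\sigma_{T(E)}]\right)^2} - |d_\sigma(v)| - \left|c_\sigma(v) - \mathbb{P}[\tau^\sigma_{T(E)\setminus\sigma} = \tau^\sigma_{T(E)}]\right|. \tag{4.22}
$$

To simplify the notation, set $p_N \equiv p_N(\sigma,\sigma)$, $P_\sigma \equiv \mathbb{P}[\tau^\sigma_{T(E)\setminus\sigma} = \tau^\sigma_{T(E)}]$. Let

$$
Y \equiv (p_N \sin v)^2 + (p_N(1 - \cos v) + P_\sigma)^2 = 2p_N(1 - \cos v)(p_N + P_\sigma) + P_\sigma^2. \tag{4.23}
$$

Thus we have in fact that

$$
|1 - G^\sigma_{\sigma,T(E)}(iv)| \ge \sqrt{2p_N(1 - \cos v)(p_N + P_\sigma) + P_\sigma^2} - \frac{4}{M}\, e^{-\beta\sqrt{N}E_\sigma}(1 + O(M/N)) \tag{4.24}
$$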


which together with (4.17) gives that ∗ |Kσ,σ (iv)| σ ∈T (E)\σ

≤! =

Pσ

2pN (1 − cos v)(pN + Pσ ) + Pσ2 − 1

√ 4 −β NEσ (1 + O(M/N )) Me

2pN Pσ−2 (1 − cos v)(pN + Pσ ) + 1 −

. √ 4 −1 −β NEσ P e (1 + O(M/N )) σ M

(4.25)

Now recall from Proposition 2.1, (iii), that 1 1−

1 M

(1 − O(d/N)) ≤ Pσ−1 e−β

√ NEσ

≤

1

(1 + O(d/N )).

(4.26)

e−β NEσ + Pσ ≥ 1 − (1 − O(d/N )) M

(4.27)

1−

1 M

It follows readily that pN + Pσ = 1 − e

√ −β NEσ

√

and hence 1 > pN (pN + Pσ ) ≥ 1 − e−β

√ NEσ

(1 + 1/M)(1 + O(d/N )). √ Since by definition of T (E), minσ ∈T (E) N Eσ ≥ uN (E), this implies ¯ −1 (1 + 1/M)(1 + O(d/N )) min pN (pN + Pσ ) ≥ 1 −

σ ∈T (E)

(4.28)

(4.29)

and 1

KE∗ (iv) ≤ ! 2 −1 ¯ ¯ 2(1 − cos v)(1 − (1 + 1/M) + 1 −

4 M−1 (1 + O(d/N ))

(4.30) which proves the lemma.
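The algebraic identity behind (4.23) is easy to confirm numerically. The sketch below assumes only that $Y$ is the squared modulus of $P_\sigma + p_N(1 - e^{iv})$, which is what the two squares in (4.23) express; the numerical values of `p_N` and `P_sigma` are hypothetical.

```python
import cmath
import math


def modulus_squared(p_N, P_sigma, v):
    """|P_sigma + p_N (1 - e^{iv})|^2, i.e. the sum of squares on the left of (4.23)."""
    return abs(P_sigma + p_N * (1.0 - cmath.exp(1j * v))) ** 2


def closed_form(p_N, P_sigma, v):
    """Right-hand side of (4.23)."""
    return 2.0 * p_N * (1.0 - math.cos(v)) * (p_N + P_sigma) + P_sigma ** 2


p_N, P_sigma = 0.9, 0.05  # hypothetical values, chosen with p_N + P_sigma < 1
grid = [-math.pi + 2.0 * math.pi * k / 200 for k in range(201)]
max_gap = max(abs(modulus_squared(p_N, P_sigma, v) - closed_form(p_N, P_sigma, v)) for v in grid)
min_Y = min(closed_form(p_N, P_sigma, v) for v in grid)
```

On this grid `max_gap` should be at the level of rounding errors, and `min_Y` is attained at $v = 0$ where $Y = P_\sigma^2$, which is why the bound on $\|K_E^*(iv)\|$ improves as $|v|$ grows.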

The proof of Corollary 4.4 is an exercise in simple algebra and is left to the reader. Next we use these results to extend similar bounds somewhat into the positive imaginary half plane. The important point permitting this is that we will need to Taylor-expand in the real part of u only Dirichlet Green’s functions with exclusion set T (E) and these . Let us first fix some notation. are analytic up to (u) ≈ 1/ Notation . As before the letter u ∈ C denotes a complex number. Its real and imaginary parts will always be called w and v: u = w + iv.

(4.31)

For given u ∈ C, we will denote by z ∈ C the number (E)u. z=

(4.32)


The real and imaginary parts of z will always be called r and s: z = r + is.

(4.33)

(E)w, r= (E)v. s=

(4.34)

Thus

To simplify the notation the dependence on u of z (or on w, resp. v, of r, resp. s) will never be made explicit. No confusion should arise from this as, up until Sect. 4.7,¹⁰ the letters u, w, v and z, r, s will be used exclusively according to the relations specified above. For ready reference we make the following definitions.

Definition 4.5. Let $0 < C_1, C_2 < \infty$, and $0 < \gamma < 1$ be numerical constants. With the above notation we define the sets:

$$
\begin{aligned}
D_1(C_1) &\equiv \left\{u \in \mathbb{C} : \sqrt{r^2 + s^2} \ge C_1/\sqrt{M}\right\}, \\
D_2(C_2, \gamma) &\equiv \left\{u \in \mathbb{C} : 0 \le r < \min\left(\frac{\gamma s^2}{C_2\sqrt{1 + s^2}},\, 1 - \gamma\right),\ v \in [-\pi, \pi]\right\}, \\
D_3 &\equiv \{u \in \mathbb{C} : -1 \le r < 0,\ |s| < 1\}, \\
D_4 &\equiv \{u \in \mathbb{C} : |r| < 1,\ |s| < 1\}.
\end{aligned} \tag{4.35}
$$

Lemma 4.6. There exist constants $0 < C, C' < \infty$ such that, for all $0 < \gamma < 1$ and all $u \in D_2(C', \gamma)$,

KE∗ (u) ≤ !

1 + Cγ −1 r ¯ −1 )) − 2 2(1 − cos v)(1 − O( 1+

4 −1 r M−1 (1 + O(d/N )) − C γ

.

(4.36) Proof. As in the proof of Lemma 4.3, we begin by writing the analogue of (4.17) and again we bound the numerator by the value obtained when putting its imaginary part equal to zero. This yields σ σ ∈T (E)\σ |Gσ ,T (E) (w)| ∗

KE (u) ≤ |1 − Gσσ,T (E) (w + iv)| GσT (E)\σ,T (E) (w) . (4.37) = |1 − Gσσ,T (E) (w + iv)| We now Taylor expand both the numerator and the denominator. Note that we will only . For the numerator we will use (3.46) together with the be interested in w ≤ (1 − γ )/ , bound (3.47) to write, for 0 ≤ w ≤ (1 − γ )/ e−β GσT (E)\σ,T (E) (w) ≤ P[τTσ(E)\σ ≤ τTσ(E) ] + Cwγ −1 10

√ NEσ

There, the letter s will retrieve the initial meaning it was given in Theorem 1.

.

(4.38)


On the other hand, from (3.31) and the Cauchy bound we get that, again for 0 ≤ w˜ ≤ , (1 − γ )/ √ d σ ≤ |ew˜ | + Cγ −1 e−β NEσ (E) ≤ C γ −1 . (iv + w) ˜ (4.39) G dw σ,T (E) This implies again |1 − Gσσ,T (E) (iv + w)| ≥ |1 − Gσσ,T (E) (iv)| − wγ −1 C .

(4.40)

As we already have bounded the first term on the right in the proof of Lemma 4.3, we readily arrive at ∗ |Kσ,σ (u)| σ ∈T (E)\σ

≤

γ −1 e−β 1 + Cw

√ NEσ P−1 σ

√ −β NEσ

1 + 2Pσ−2 pN (1−cos v)(pN + Pσ )− 4e M Pσ

. (1 + O(M/N ))−C γ −1 Pσ−1 w (4.41)

Proceeding from there on exactly as in the proof of Lemma 4.3 we then get, using relation (4.34), ∗ |Kσ,σ (u)| σ ∈T (E)\σ

≤

1 + Cγ −1 r ¯ −1 )) − 4 (1 + O(d/N )) − C γ −1 r 1 + Pσ−2 2(1 − cos v)(1 − O( M−1 Pσ

.

(4.42)

Since we need to take the maximum over all σ ∈ T (E), it is important to restrict r as a function of v in such a way that the maximum will be taken on by the σ that maximises Pσ . Some elementary algebra shows that this will be the case provided that 2 ¯ −1 )) 2 2 (1 − O( 2(1 − cos v) −1 ≥ C γ r (4.43) ¯ −1 )) 2 (1 − O( 1 + 2(1 − cos v) or r≤

¯ −1 )) 2 (1 − O( 2(1 − cos v) ! . ¯ −1 )) 2 (1 − O( γ −1 C 1 + 2(1 − cos v)

(4.44)

Since this is a serious condition only if v is very small we see, using relation (4.34), that this condition reduces to γ s2 ,1 − γ . (4.45) r < min √ C 1 + s2 On this domain we can thus estimate the norm of KE∗ by

KE∗ (u) ≤ !

1 + Cγ −1 r ¯ −1 )) − 2 2(1 − cos v)(1 − O( 1+

4 −1 r M−1 (1 + O(d/N )) − C γ

.

(4.46) This proves the lemma.


As in the case of Lemma 4.3, we get as an immediate corollary an upper bound on the norm of the resolvent. Corollary 4.7. For all 0 < γ < 1 there exists a constant 0 < L < ∞ (depending on C, C and γ ) such that, for all u ∈ D1 (4) ∩ D2 (L, γ ),

KE∗ (u) < 1

(4.47)

and

ME∗ (u)

≤!

1+Cγ −1 r ¯ −1 ))−1− 4 (1+O(d/N ))−(C +C )γ −1 r 2 2(1−cos v)(1−O( 1+ M−1

. (4.48)

Finally we will need an estimate on $\|M_E^*(u)\|$ in the case when $|u|$ is very small and $w \le 0$, which shows that there the negative real part helps to depress $\|K_E^*(u)\|$ below one.

Lemma 4.8. For M large enough,
(i) for all $u \in D_3$,

$$
\|K_E^*(u)\| \le \frac{1}{\sqrt{1 + r^2 + s^2} - \frac{5}{M}}, \tag{4.49}
$$

(ii) for all $u \in D_1(4) \cap D_3$, $\|K_E^*(u)\| < 1$ and

$$
\|M_E^*(u)\| \le \frac{1}{\sqrt{1 + r^2 + s^2} - 1 - \frac{5}{M}}. \tag{4.50}
$$

Proof. The proof of this estimate goes quite along the lines of the proof of the previous lemmas. However, to simplify things, we bound the Green function in the numerator of (4.37) by its value at zero and, instead of using (4.40) in the denominator, we go back to the estimates (4.18) and (4.20) which we modify slightly to yield, for w ≤ 0, Gσσ,T (E) (iv + w) =

∞

enw sin(vn)P[τσσ = τTσ(E) = n]

n=1

= ew sin(v)pN (σ, σ ) ∞ + pN (σ, σ ) ewn sin(v(n + 1))P[τσσ = τTσ(E) = n] σ ∈T (E)

n=1

≤ pN (σ, σ )ew sin v + dσ (v) with dσ (v) from (4.18). Similarly, 1 − Gσσ,T (E) (iv + w) = pN (σ, σ )(1 − ew cos v) + cσ (v) with cσ (v) from (4.20). On the other hand

(4.51)

(4.52)


$$
|P_\sigma + p_N(1 - e^u)|^2 = P_\sigma^2 + 2p_N(1 - \cos v)(p_N + P_\sigma) - 2\cos v\, p_N(e^w - 1)(p_N + P_\sigma) + p_N^2(e^{2w} - 1). \tag{4.53}
$$

For w small, we can expand ew to second order and, using that w ≤ 0, we get |Pσ + pN (1 − eu )|2 = Pσ2 + 2pN (1 − cos v)(pN + Pσ ) − 2wp N [p N − cos v(pN + Pσ )] +w2 pN [2pN − cos v(pN + Pσ )] + O w 3 = Pσ2 + 2pN (1 − cos v)(pN + Pσ )(1 − w) − 2wpN (1 − pN ) +w2 pN [2pN − cos v(pN + Pσ )] + O w 3 ≥ Pσ2 + v 2 + w 2 + O w 3 . (4.54) Thus

1

∗ |Kσ,σ (u)| ≤

2 (s 2 + r 2 ) − 1 + Pσ−2

σ ∈T (E)\σ

,

(4.55)

5 M

and since this is clearly monotone in $P_\sigma$, it follows that

$$
\|K_E^*(u)\| \le \frac{1}{\sqrt{1 + s^2 + r^2} - \frac{5}{M}} \tag{4.56}
$$

and hence, for $u \in D_1(4)$, $\|K_E^*(u)\| < 1$ and

$$
\|M_E^*(u)\| \le \frac{1}{\sqrt{1 + s^2 + r^2} - 1 - \frac{5}{M}}. \tag{4.57}
$$

4.3. Perturbative estimates for small u. Notation. In this sub-section we will systematically write T for T(E). The a priori bounds obtained in the last subsection will suffice to show that the contributions from u away from zero in the Laplace inversion formula are sub-dominant. In the neighborhood of zero we have to proceed more carefully and extract the dominant contribution to the resolvent, while estimating the remainders. This will be done by decomposing $K_E^*(u)$ in a suitable way, the idea being that the leading term should allow exact computations; in fact, we will want this term to be a matrix with constant columns. To this end note that for $\sigma \ne \sigma'$, by Taylor's formula,

$$
\begin{aligned}
K^*_{\sigma,\sigma'}(u) &= \frac{1}{1 - G^\sigma_{\sigma,T}(u)}\left[G^\sigma_{\sigma',T}(0) + u\frac{d}{du}G^\sigma_{\sigma',T}(0) + \frac{u^2}{2}\frac{d^2}{du^2}G^\sigma_{\sigma',T}(\tilde u)\right] \\
&= \frac{1}{1 - G^\sigma_{\sigma,T}(u)}\left[\mathbb{P}[\tau^\sigma_{\sigma'} \le \tau^\sigma_T] + u\,\mathbb{E}\tau^\sigma_{\sigma'}\mathbb{I}_{\{\tau^\sigma_{\sigma'} \le \tau^\sigma_T\}} + \frac{u^2}{2}\frac{d^2}{du^2}G^\sigma_{\sigma',T}(\tilde u)\right],
\end{aligned} \tag{4.58}
$$


where $\tilde u$ is on the ray between 0 and u. The idea is of course that since u is small, the quadratic term is a small perturbation¹¹ while the constant and linear terms are essentially independent of $\sigma'$, the deviations being treatable as perturbations as well. Let us first establish a bound on the second order contribution. The notation and definitions of the present subsection are the same as in the previous one (recall in particular Definition 4.5).

Lemma 4.9. Denote by $K_E^{*(2)}$ the matrix with entries

$$
K^{*(2)}_{\sigma,\sigma'}(u) = \begin{cases} \dfrac{\frac12 u^2 \frac{d^2}{du^2}G^\sigma_{\sigma',T}(\tilde u)}{1 - G^\sigma_{\sigma,T}(u)}, & \text{if } \sigma \ne \sigma', \\ 0, & \text{if } \sigma = \sigma'. \end{cases} \tag{4.59}
$$

For $0 < \gamma < 1$, let the constant L be chosen such that

$$
\left\{u \in \mathbb{C} \mid r \le s^2/4\right\} \subseteq D_2(L, \gamma) \cap D_4. \tag{4.60}
$$

Then, there exists a constant $C > 0$ such that for all $u \in D_2(L, \gamma) \cap D_4$ and N large enough,

$$
\|K_E^{*(2)}(u)\| \le \frac{\gamma^{-2}C(s^2 + r^2)}{\sqrt{1 + (s^2 + r^2)/2} - 5/M}. \tag{4.61}
$$

Remark . The assumption (4.60) is made for convenience only as it allows to simplify the expressions of our estimates. u, like γ −2 C(s 2 + Remark . Note also that the bound (4.61) simply behaves, for small r 2 ). Proof. To bound the denominator we proceed as in the proofs of Lemmas 4.6 and 4.8 with the difference that, for r > 0, the bound 4.54 becomes, using that r ≤ s 2 /4, |Pσ + pN (1 − eu )|2 ≥ Pσ2 + (v 2 + w 2 )/2 + O(w 3 ). For the numerator we use that d2 d2 d2 σ σ G ( u) ˜ Gσσ ,T (u) ˜ = G (u) ˜ du2 σ ,T ≤ 2 du du2 T \σ,T σ ∈T \σ

(4.62)

(4.63)

σ ∈T \σ

and, since u˜ ≤ (1 − γ )/ , bound the last quantity in the r.h.s. proceeding as in the proof of Proposition 3.6 (see the treatment of the term (II) therein).

What remains of $K_E^*$ after subtraction of $K_E^{*(2)}$ is almost of the desired form (i.e. has almost constant columns); however, a few cosmetic changes need to be made: first, the matrix elements

$$
K^{*(1)}_{\sigma,\sigma'}(u) \equiv \frac{1}{1 - G^\sigma_{\sigma,T}(u)}\,\mathbb{P}[\tau^\sigma_{\sigma'} \le \tau^\sigma_T]\left(1 + u\,\mathbb{E}[\tau^\sigma_{\sigma'} \mid \tau^\sigma_{\sigma'} \le \tau^\sigma_T]\right), \quad \sigma \ne \sigma', \tag{4.64}
$$

have to be replaced by their leading, $\sigma'$-independent part

$$
K^{*(0)}_{\sigma,\sigma'}(u) \equiv \frac{1}{1 - G^\sigma_{\sigma,T}(u)}\,\frac{1}{M}\,\mathbb{P}[\tau^\sigma_{T\setminus\sigma} < \tau^\sigma_T]\left(1 + u\,\mathbb{E}[\tau^\sigma_{T\setminus\sigma} \mid \tau^\sigma_{T\setminus\sigma} = \tau^\sigma_T]\right), \quad \sigma \ne \sigma'. \tag{4.65}
$$

11 It will become clear only later why we expand to second order and are not content with the first order as before.

As shown in the next lemma, this replacement can be done at the cost of error terms of order at most O(1/N).

Lemma 4.10. Denote by $K_E^{*(0)}$ and $K_E^{*(1)}$ the matrices with off-diagonal entries given respectively by (4.65) and (4.64) and zero diagonals. Then, under the assumptions and with the notation of Lemma 4.9 and Proposition 2.2 we have, for N large enough,

$$
\|K^{*(0)}(u) - K^{*(1)}(u)\| \le \frac{1 + 3\sqrt{s^2 + r^2}}{\sqrt{1 + (s^2 + r^2)/2} - 5/M}\, O(1/N). \tag{4.66}
$$

Second, since the matrix $K^{*(0)}(u)$ has zero diagonal, we still have to compare it to the matrix $\mathcal{K}^{*(0)}$ with entries

$$
\mathcal{K}^{*(0)}_{\sigma,\sigma'}(u) \equiv \frac{1}{M}\,\frac{1}{1 - G^\sigma_{\sigma,T}(u)}\,\mathbb{P}[\tau^\sigma_{T\setminus\sigma} < \tau^\sigma_T]\left(1 + u\,\mathbb{E}[\tau^\sigma_{T\setminus\sigma} \mid \tau^\sigma_{T\setminus\sigma} = \tau^\sigma_T]\right), \quad \forall \sigma, \sigma' \in T. \tag{4.67}
$$

This involves controlling the norm of the diagonal matrix $\mathcal{K}^{*(0)}(u) - K^{*(0)}(u)$:

Lemma 4.11. Let $\mathcal{K}^{*(0)}$ be the matrix defined in (4.67). Under the assumptions and with the notation of Lemma 4.10 we have, for N large enough,

$$
\|\mathcal{K}^{*(0)}(u) - K^{*(0)}(u)\| \le \frac{1 + \sqrt{s^2 + r^2}}{\sqrt{1 + (s^2 + r^2)/2} - 5/M}\, O(1/(M - 1)). \tag{4.68}
$$

Proof of Lemma 4.10. For $\sigma, \sigma' \in T$, $\sigma \ne \sigma'$, let $\kappa_{\sigma,\sigma'}(u)$ be defined through

$$
K^{*(0)}_{\sigma,\sigma'}(u) - K^{*(1)}_{\sigma,\sigma'}(u) = \frac{\kappa_{\sigma,\sigma'}(u)}{1 - G^\sigma_{\sigma,T}(u)}. \tag{4.69}
$$

Since the denominator in (4.69) has already been dealt with in Lemma 4.9, what we need is an upper bound on |κσ,σ (u)|. Appropriately sorting out the different terms contributing to κσ,σ (u) we may write, √ 1 |κσ,σ (u)| ≤ P[τσσ ≤ τTσ ] − e−β NEσ 1 + |u|E[τTσ\σ |τTσ\σ = τTσ ] M σ σ +|u|P[τσ ≤ τT ] E[τσσ | τσσ ≤ τTσ ] − E[τTσ\σ | τTσ\σ = τTσ ] . (4.70) Plugging in the estimates of Proposition 2.1, ii), √

e−β NEσ |κσ,σ (u)| ≤ 1 + |u|E[τTσ\σ |τTσ\σ = τTσ ] O(1/N ) M σ σ σ σ σ σ + |u| E[τσ | τσ ≤ τT ] − E[τT \σ | τT \σ = τT ] (1 + O(1/N )) , (4.71)


and we are left to bound the expected transition time E[τTσ\σ |τTσ\σ = τTσ ], together with the difference E[τσσ | τσσ ≤ τTσ ] − E[τTσ\σ | τTσ\σ = τTσ ]. To deal with the latter, first observe that differentiating the renewal equation Gσσ ,T \σ (u) =

Gσσ ,T (u) 1−Gσσ,T (u) ,

we have

d σ G (0) = (1 − P[τσσ ≤ τTσ ])Eτσσ I{τ σ ≤τTσ\σ } − P[τσσ ≤ τTσ\σ ]Eτσσ I{τσσ ≤τTσ } σ du σ ,T Eτσσ I{τσσ ≤τTσ } σ σ σ σ σ (4.72) = P[τσ ≤ τT ] E[τσ | τσ ≤ τT \σ ] − 1 − P[τσσ ≤ τTσ ] implying that

E τσσ | τσσ ≤ τTσ = E[τσσ | τσσ ≤ τTσ\σ ] −

Eτσσ I{τσσ ≤τTσ }

(4.73)

1 − P[τσσ ≤ τTσ ]

and, since the last term in the r.h.s. is σ -independent, we can express our conditional expectation in the following, remarkably useful form:

E τσσ | τσσ ≤ τTσ =

1 E τσσ | τσσ ≤ τTσ |T \ σ |

% + E[τσσ | τσσ ≤ τTσ\σ ] − Next observe that by (4.9),

σ ∈T \σ

σ ∈T \σ

σ ∈T \σ

& 1 E[τσσ | τσσ ≤ τTσ\σ ] . |T \ σ |

(4.74)

σ ∈T \σ

P[τσσ ≤ τTσ ] = P[τTσ\σ ≤ τTσ ], as well as

Eτσσ I{τ σ ≤τTσ } = EτTσ\σ I{τTσ\σ ≤τTσ }

(4.75)

σ

hold ((4.75) is obtained by differentiating (4.9) and setting u = 0). On the other hand, using (2.4) from Proposition 2.1, the first term in the r.h.s of (4.74) may thus be rewritten as

1 E τσσ | τσσ ≤ τTσ |T \ σ | σ ∈T \σ   Eτσσ I{τ σ ≤τTσ } P[τ σ ≤ τ σ ] 1 T  σ σ  = σ σ] σ ≤ τσ] P[τ ≤ τ |T \ σ | P[τ T \σ T T σ σ ∈T \σ

σ ∈T \σ

= E[τTσ\σ | τTσ\σ = τTσ ](1 + O(1/N )).

(4.76)

Since the term in braces in the last line of (4.74) was estimated in Corollary 2.3, inserting (2.13) and (4.76) in (4.74), we obtain that, under the assumptions and with the notation of Proposition 2.2,

E τσσ | τσσ ≤ τTσ − E[τTσ\σ | τTσ\σ = τTσ ] ≤ O(1/N ) E[τTσ\σ | τTσ\σ = τTσ ] + (1 −

1 −1 M ) Wβ,N,T

.

(4.77)


Therefore, collecting (4.77) and (4.71), √ e−β NEσ |κσ,σ (u)| ≤ 1 + |u| 2E[τTσ\σ |τTσ\σ = τTσ ] M 1 −1 + (1 − M ) Wβ,N,T O(1/N ),

(4.78)

and we are left to bound the term E[τTσ\σ |τTσ\σ = τTσ ] from above. To do so, we proceed as in (4.72), (4.73), but this time using (3.36) and the fact that Eτσσ I{τσσ ≤τTσ } ≥ P[τσσ = 1] = 1 − e−β

√ NEσ ,

we obtain that

E[τTσ\σ |τTσ\σ = τTσ ] ≤ E[τTσ\σ ] − ≤

1 1−

1 M

1 P(τTσ\σ

τσσ )

<

+

1

√ eβ NEσ P(τTσ\σ

< τσσ )

1 + Wβ,N,T (1 + O(1/N )),

(4.79)

where the second line follows from the bound (2.14) of Corollary 2.3 together with the estimate (2.6) of Proposition 2.1. Inserting this bound in (4.78) yields, √ e−β NEσ −1 1 (4.80) 1 + Wβ,N,T O(1/N ). 1 + 3|u|(1 − M ) |κσ,σ (u)| ≤ M Thus

K

∗(0)

≤

(u) − K

∗(1)

(u) ≤ max

|κσ,σ (u)| σ |1 − Gσ,T (u)| σ ∈T \σ

σ ∈T √ 1 −β NE σ (1 − M )e 1 −1 max ) 1 + W O(1/N ), 1 + 3|u|(1 − β,N,T M σ ∈T |1 − Gσσ,T (u)|

(4.81)

and observing that, by assertion (v) of Proposition 2.1, (1 −

√ −β NEσ 1 M )e

= GσT \σ,T (0)(1 + O(1/N ))

(4.82)

we finally arrive at

K ∗(0) (u) − K ∗(1) (u) ≤ max

GσT \σ,T (0)

|1 − Gσσ,T (u)| 1 −1 1 + 3|u|(1 − M ) 1 + Wβ,N,T O(1/N ).

σ ∈T

(4.83)

From there on, the proof proceeds exactly as the proofs of Lemma 4.6, 4.8 and 4.9, yielding √ −1 1 −1 1 + Wβ,N,T 1 + 3 s 2 + r 2 (1 − M ) ∗(0) ∗(1) ! O(1/N )

K (u) − K (u) ≤ 1 + (s 2 + r 2 )/2 − 5/M (4.84) 1 −1 −1 ≤ 1, gives (4.68), proving Lemma 4.10. 1 + Wβ,N,T ) which, since (1− M


Proof of Lemma 4.11. By definition of K∗(0) (u) and K ∗(0) (u), ∗(0)

K∗(0) (u) − K ∗(0) (u) = max |Kσ,σ (u)| σ ∈T

√

1 −1 −β NEσ (1 − M ) e 1 σ σ σ ≤ max |τ = τ ] . 1 + |u|E[τ T \σ T \σ T M − 1 σ ∈T |1 − Gσσ,T (u)|

(4.85)

Equation (4.79) then yields the bound √

1 (1 − M )e−β NEσ 1

K∗(0) (u) − K ∗(0) (u) ≤ max M − 1 σ ∈T |1 − Gσσ,T (u)| 1 −1 × 1 + |u|(1 − M ) 1 + Wβ,N,T O(1/N ) (4.86)

which, up to some constants, is identical to that of (4.81). From there on the proof follows that of Lemma 4.10.

Let us introduce the decomposition

$$
K^*(u) \equiv \mathcal{K}^{*(0)}(u) + \mathcal{K}^{*(1)}(u) \tag{4.87}
$$

and note that $\mathcal{K}^{*(1)}(u)$ can be written in the form

$$
\mathcal{K}^{*(1)}(u) \equiv \left(K^{*(0)}(u) - \mathcal{K}^{*(0)}(u)\right) + \left(K^{*(1)}(u) - K^{*(0)}(u)\right) + K^{*(2)}(u). \tag{4.88}
$$

The following corollary then is an immediate consequence of the previous three lemmata.

Corollary 4.12. Under the assumptions and with the notation of Lemma 4.9 and Lemma 4.10 we have, for N large enough,

$$
\|\mathcal{K}^{*(1)}(u)\| \le \frac{\gamma^{-2}C(s^2 + r^2) + (1 + 3\sqrt{s^2 + r^2})\max\left(O(1/(M - 1)),\, O(1/N)\right)}{\sqrt{1 + (s^2 + r^2)/2} - 5/M}. \tag{4.89}
$$

The leading contribution to $K^*(u)$ thus comes from the matrix $\mathcal{K}^{*(0)}(u)$ whose spectrum is easily analysed. In particular, $\mathcal{K}^{*(0)}(u)$ has a unique non zero eigenvalue of algebraic multiplicity one, denoted by $\lambda(u)$, and given by:

$$
\lambda(u) \equiv \sum_{\sigma \in T} \mathcal{K}^{*(0)}_{\sigma,\sigma}(u). \tag{4.90}
$$

The corresponding left eigenvector is proportional to (1, 1, . . . , 1). Similarly, defining

$$
M^{*(0)}(u) \equiv [I - \mathcal{K}^{*(0)}(u)]^{-1}\mathcal{K}^{*(0)}(u) \tag{4.91}
$$

we decompose the Laplace transform of the resolvent (defined in (4.6)) into

$$
M^*(u) \equiv M^{*(0)}(u) + M^{*(1)}(u). \tag{4.92}
$$

It obviously follows from the previous results that $M^{*(0)}(u)$ has two eigenvalues, 0 and $\lambda(u)[1 - \lambda(u)]^{-1}$, the latter having algebraic multiplicity one. We will have to show that the matrix $M^{*(1)}(u)$ has small norm, and this smallness should be inferred from that of $\|\mathcal{K}^{*(1)}(u)\|$. To make this explicit we want to use the following result:


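Lemma 4.13. Set

$$
R(u) \equiv [I - \mathcal{K}^{*(0)}(u)]^{-1}, \qquad \rho(u) \equiv \max\left(|1 - \lambda(u)|^{-1},\, 1\right). \tag{4.93}
$$

Then,

$$
M^{*(1)}(u) = R(u)\mathcal{K}^{*(1)}(u)R(u)\,\frac{1}{I - R(u)\mathcal{K}^{*(1)}(u)} \tag{4.94}
$$

and, if $\|R(u)\mathcal{K}^{*(1)}(u)\| < 1$,

$$
\|M^{*(1)}(u)\| \le \frac{\|\mathcal{K}^{*(1)}(u)\|\,\rho(u)^2}{1 - \|\mathcal{K}^{*(1)}(u)\|\,\rho(u)}. \tag{4.95}
$$

Proof. Observe that using the decomposition (4.87), $[I - K^*(u)]^{-1}$ can be written in the form

$$
\frac{1}{I - K^*(u)} = R(u) + R(u)\mathcal{K}^{*(1)}(u)\,\frac{1}{I - K^*(u)}. \tag{4.96}
$$

Thus

$$
M^*(u) = M^{*(0)}(u) + R(u)\mathcal{K}^{*(1)}(u) + R(u)\mathcal{K}^{*(1)}(u)\,\frac{1}{I - K^*(u)}\,K^*(u) = M^{*(0)}(u) + R(u)\mathcal{K}^{*(1)}(u)\,\frac{1}{I - K^*(u)}. \tag{4.97}
$$

Equation (4.94) then results from (4.97) together with the identity

$$
\frac{1}{I - K^*(u)} = R(u)\,\frac{1}{I - R(u)\mathcal{K}^{*(1)}(u)}. \tag{4.98}
$$

We now turn to the proof of (4.95). It follows from the spectral properties of $\mathcal{K}^{*(0)}(u)$ that

$$
\|[I - \mathcal{K}^{*(0)}(u)]^{-1}\| = \max\left(|1 - \lambda(u)|^{-1},\, 1\right) \equiv \rho(u). \tag{4.99}
$$

Equation (4.94) then yields the bound

$$
\|M^{*(1)}(u)\| \le \rho(u)^2\,\|\mathcal{K}^{*(1)}(u)\|\,\|[I - R(u)\mathcal{K}^{*(1)}(u)]^{-1}\| \tag{4.100}
$$

and (4.95) follows from the fact that, if $\|R(u)\mathcal{K}^{*(1)}(u)\| < 1$, then

$$
\|[I - R(u)\mathcal{K}^{*(1)}(u)]^{-1}\| \le [1 - \|R(u)\mathcal{K}^{*(1)}(u)\|]^{-1} \le [1 - \rho(u)\|\mathcal{K}^{*(1)}(u)\|]^{-1}. \tag{4.101}
$$

The lemma is proven.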
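The structure of this argument can be illustrated on a small real matrix. The sketch below is a toy check under stated assumptions, not the matrices of the paper: the leading part `K0` is a hypothetical rank-one matrix whose entries depend on the row only (as in (4.67)), `K1` is a small hypothetical perturbation, and all helper names are illustrative. Since matrix products need not commute, the check is carried out on the form $M^{*(1)} = R\,\mathcal{K}^{*(1)}[I - K^*]^{-1}$ obtained in (4.97), and the norm bound is evaluated with the computed value of $\|R\|$ rather than with $\rho(u)$.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inverse(A):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(A)
    aug = [[float(x) for x in A[i]] + identity(n)[i] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def row_sum_norm(A):
    """The matrix norm (4.7): maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)


M = 3
c = [0.5, 0.4, 0.3]                                   # hypothetical row values
K0 = [[c[i] / M for _ in range(M)] for i in range(M)]  # rank one, rows constant
K1 = [[0.00, 0.02, -0.01], [0.01, 0.00, 0.03], [-0.02, 0.01, 0.00]]
K = mat_add(K0, K1)
I = identity(M)

lam = sum(K0[i][i] for i in range(M))                  # eigenvalue (4.90): the trace
R = inverse(mat_sub(I, K0))
inv_I_minus_K = inverse(mat_sub(I, K))
M0 = mat_mul(R, K0)
M1 = mat_sub(mat_mul(inv_I_minus_K, K), M0)            # M^{*(1)} = M^* - M^{*(0)}
M1_formula = mat_mul(mat_mul(R, K1), inv_I_minus_K)    # right-hand side of (4.97)

identity_gap = max(abs(a - b) for ra, rb in zip(M1, M1_formula) for a, b in zip(ra, rb))
norm_R, norm_K1, norm_M1 = row_sum_norm(R), row_sum_norm(K1), row_sum_norm(M1)
bound = norm_R ** 2 * norm_K1 / (1.0 - norm_R * norm_K1)
M0_c = [sum(M0[i][j] * c[j] for j in range(M)) for i in range(M)]
```

In this example `identity_gap` should vanish up to rounding, `norm_M1` should not exceed `bound`, and `M0_c` should equal `lam / (1 - lam)` times `c`, in line with the statement that $M^{*(0)}(u)$ has the single non-zero eigenvalue $\lambda(u)[1 - \lambda(u)]^{-1}$.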


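At this stage we see that fully controlling the behavior of both $M^{*(0)}(u)$ and $M^{*(1)}(u)$ in a small neighborhood of the origin requires a precise control of $1 - \lambda(u)$. Observe that

$$
1 - \lambda(u) = \frac{1}{|T|}\sum_{\sigma \in T}\left[1 - \frac{G^\sigma_{T\setminus\sigma,T}(0)}{1 - G^\sigma_{\sigma,T}(u)}\left(1 + u\,\mathbb{E}[\tau^\sigma_{T\setminus\sigma} \mid \tau^\sigma_{T\setminus\sigma} = \tau^\sigma_T]\right)\right] \tag{4.102}
$$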

so that 1 − λ(u) takes the form of a sum over σ . The evaluation of such sums is a rather involved question whose treatment is the object of the next subsection. The analysis of M ∗(0) (u) and M ∗(1) (u) will then be brought to a close in Sect. 4.5. As for the present section, we conclude it with the analysis of the summands of (4.102). (E) and set Lemma 4.14. Recall that u = z/ −β √NE 1 σ (E). e (4.103) zσ ≡ 1 − M If u belongs to the set

# $ Dδ ≡ u ∈ C | r < s 2 /4, |z| ≤ δ , 0 < δ < 1,

then, for N large enough, GσT \σ,T (0) z σ σ σ 1 + uE[τT \σ |τT \σ = τT ] − 1 − ≤ C(δ)|z| 1 − Gσσ,T (u) z − zσ

(4.104)

(4.105)

for some constant 0 < C(δ) < ∞ that only depends on δ. Proof. Let us write GσT \σ,T (0) 1− 1 + uE[τTσ\σ |τTσ\σ = τTσ ] σ 1 − Gσ,T (u) . GσT \σ,T (0) σ σ σ = 1− |τ = τ ] − uE[τTσ\σ |τTσ\σ = τTσ ]. 1 + uE[τ T \σ T \σ T 1 − Gσσ,T (u) (4.106) Recall that we denote by uσ the smallest real number that solves the equation Gσσ,T (u) = 1. We will first look at the term in round brackets: 1−

GσT \σ,T (0) 1 − Gσσ,T (u)

=

Gσσ,T (0) − Gσσ,T (u)

=−

1 − Gσσ,T (u) Gσσ,T (0) − Gσσ,T (u) d (u − uσ ) du Gσσ,T (uσ ) -

1 1 + σ d 1 − Gσ,T (u) (u − uσ ) du Gσσ,T (uσ ) d Gσσ,T (uσ ) Gσσ,T (0) − Gσσ,T (u)+u du

.

+(Gσσ,T (0) − Gσσ,T (u)) =

u − u − uσ +

d (u − uσ ) du Gσσ,T (uσ) d (Gσσ,T (0) − Gσσ,T (u)) (u − uσ ) du Gσσ,T (uσ )+1 − Gσσ,T (u)

d (1 − Gσσ,T (u))(u − uσ ) du Gσσ,T (uσ ) u σ (u)) + R σ (u), = (1 + R u − uσ

(4.107)


σ being defined through σ and R R 2

σ (u) ≡ R

d σ ˆ (u˜ − uσ ) du 2 Gσ,T (u) d σ du Gσ,T (uσ )

, 2

1 d σ ˇ 2 2 Gσ,T (u) σ (u) ≡ u d Gσσ,T (u) R , ˜ d σ du d σ du du Gσ,T (u ) du Gσ,T (uσ )

(4.108)

where u˜ is on the ray between 0 and u, uˆ on the ray between u˜ and uσ , and both uˇ and u are on the ray between u and uσ , and u . σ (u) and The various first and second derivatives entering the expressions of R σ (u) can be bounded with the help of (3.53) and (3.55). We then get that on the R (E), 0 < δ < 1, disk |u| ≤ δ/ σ (u)| ≤ c(δ)zσ |z|, |R

(4.109)

where zσ is defined in (4.103) and 0 < c(δ) < ∞ only depends on δ. Similarly, using that u˜ is on the ray between 0 and u, σ (u)| ≤ c (δ)zσ (|z| + | (E)uσ |) |R

(4.110)

for some 0 < c (δ) < ∞. Recall from Sect. 3.2 (formula (3.49)) that uσ ≈ Eτ1σ ; T \σ however, inspecting the proof of Proposition 3.6 (see also (2.9)) an alternative representation is uσ = GσT \σ,T (0)(1 + O(e−β

√ NEσ

(E))),

(4.111)

and this will be even more convenient here as, using (4.82), we then have (E)uσ = zσ (1 + O(zσ )).

(4.112)

The bound (4.110) thus becomes σ (u)| ≤ c (δ)zσ (|z| + zσ ). |R

(4.113)

We now come to the main contribution to the r.h.s. of (4.107), namely to the term u/(u − uσ ). Using (4.112) we can write u z = + R σ (z), u − uσ z − zσ

(4.114)

where (E) − zσ ) zO(zσ2 ) z(uσ . = (E)) (z − zσ )(z − zσ (1 + O(zσ ))) (z − zσ )(z − uσ To bound this term we use that on the set z ∈ C | r < s 2 /4 : R σ (z) ≡

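(4.115)

$$
|z - z_\sigma| \ge z_\sigma \tag{4.116}
$$

and

$$
|z - z_\sigma(1 + O(z_\sigma))| \ge \begin{cases} z_\sigma(1 + O(z_\sigma)), & \text{if } z_\sigma(1 + O(z_\sigma)) \le 2, \\ 2\sqrt{z_\sigma(1 + O(z_\sigma)) - 1}, & \text{otherwise.} \end{cases} \tag{4.117}
$$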
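The two cases in (4.117) are the distance from a point $a > 0$ on the real axis to the region $\{z = r + is : r < s^2/4\}$, whose boundary is the parabola $r = s^2/4$. The sketch below is a numerical illustration only: the sample values of `a` and the grid parameters are hypothetical, and the minimum is searched on the boundary parabola, which is where it is attained for a point lying outside the region.

```python
import math


def lower_bound(a):
    """The right-hand side of (4.117) for a real point a > 0."""
    return a if a <= 2.0 else 2.0 * math.sqrt(a - 1.0)


def min_distance_on_boundary(a, s_max=10.0, steps=40001):
    """Grid minimum of |z - a| over the parabola r = s^2 / 4, with |s| <= s_max."""
    best = float("inf")
    for k in range(steps):
        s = -s_max + 2.0 * s_max * k / (steps - 1)
        r = s * s / 4.0
        best = min(best, math.hypot(r - a, s))
    return best


sample_points = [0.5, 1.0, 2.0, 3.0, 5.0]  # hypothetical values of z_sigma (1 + O(z_sigma))
numeric = [min_distance_on_boundary(a) for a in sample_points]
formula = [lower_bound(a) for a in sample_points]
```

Writing $t = s^2/4$, the squared distance on the boundary is $(t - a)^2 + 4t$, which is minimised at $t = 0$ when $a \le 2$ and at $t = a - 2$ otherwise; the lists `numeric` and `formula` should therefore agree up to the grid resolution.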


Therefore, for z ∈ Dδ , |R σ (z)| ≤

|z|O(zσ2 ) ≤ c|z| |z − zσ | |z − zσ (1 + O(zσ ))|

(4.118)

for some constant c > 0. Inserting (4.114) in (4.107), and plugging the resulting expression in (4.106), we may now write 1−

GσT \σ,T (0) 1 + uE[τTσ\σ |τTσ\σ = τTσ ] = Iσ0 (u) + Iσ1 (u), σ 1 − Gσ,T (u)

(4.119)

where

z σ (u)) + R σ (u), ≡ + R σ (z) (1 + R z − zσ E[τTσ\σ |τTσ\σ = τTσ ] z σ (u)) + R σ (u) − 1 . Iσ1 (u) ≡ z + R σ (z) (1 + R (E) z − zσ (4.120)

Iσ0 (u)

(E)−1 ≤ 1, it readily follows from the Assume that z ∈ Dδ . Since E[τTσ\σ |τTσ\σ = τTσ ] estimates (4.109), (4.113), (4.118), and the bound z z − z

= 1 + zσ ≤ 2 z − z σ σ

(4.121)

which, by (4.116), holds for all z ∈ Dδ , that |Iσ1 (u)| ≤ C (δ)|z|

(4.122)

for some constant C (δ) > 0. To treat the term Iσ0 (u) note that using in turn (4.113) and (4.116), z zσ |z| z − z Rσ (u) ≤ c (δ) |z − z | (|z| + zσ ) ≤ c (δ)|z|(|z| + zσ ). σ σ

(4.123)

Therefore, 0 I (u) − z ≤ C (δ)|z| σ z − zσ

(4.124)

for some constant C (δ) > 0. Combining (4.119) together with (4.123) and (4.124) yields (4.105). This concludes the proof of Lemma 4.14.


4.4. Poisson convergence. Finally we need to control the convergence of various integral functions of the variables zσ . We will do this in a general setting first and then apply this to the various occurrences later on. Note first that by (4.103) and (3.2), √

(E) zσ = (1 − 1/M)e−β NEσ −1 α−1 e−E 1 + VN,E eE/2 √ = e−α(uN (Eσ )−E) 1 + (1+O(1/N )) |T (E)|(α − 1) 2α − 1 1 (4.125) ≡ α(u−1 (E )−E) σ e N τE,N only depends on σ through u−1 N (Eσ ). As has been explained in Sect. 1, the point process ∗ NN,E ≡

δexp{α(−E+u−1 (Eσ ))} = N

σ ∈{−1,1}N

δ1/(zσ τN,E )

(4.126)

σ ∈{−1,1}N

converges weakly to the Poisson point process $N_E^*$ on $[1, \infty)$ with intensity measure $\alpha^{-1} e^{-E} x^{-1-1/\alpha}\,dx$. We will now show how to use the convergence of our point processes to Poisson point processes in the analysis of the asymptotic behavior of our functions as both N and E tend to infinity. As a first example we explain how to control the behavior of the random coefficients $\tau_{N,E}$.

Lemma 4.15. Set $\tau_\infty \equiv \frac{\alpha-1}{\alpha}$. Then,

$$\lim_{E\downarrow-\infty}\,\lim_{N\uparrow\infty} \tau_{N,E} = \tau_\infty, \quad \text{in probability.} \tag{4.127}$$

Proof. $\tau_{N,E}$ depends on two random variables, $V_{N,E}$ (defined in Eq. (3.2) of [BBG1]) and $|T(E)|$. Let us first look at $V_{N,E}$. We want to show that $V_{N,E}e^{E/2}$ tends to zero. By Chebyshev's inequality of order four, we have that

$$P\big[|V_{N,E}e^{E/2}| > \varepsilon\big] \le \frac{E V_{N,E}^4}{\varepsilon^4 e^{-2E}}. \tag{4.128}$$

But (see [BKL], Lemma 3.3, where however the normalisation of $V_N$ is different) the moments of the random variable $V_{N,E}$ converge, as $N \uparrow \infty$, and in particular

$$\lim_{N\uparrow\infty} E V_{N,E}^4 = \frac{(2\alpha-1)^2}{4\alpha-1}\,e^{-E} + 3. \tag{4.129}$$

Therefore, there exists $N_0$ such that for all $N > N_0$, and for $-E$ large enough,

$$P\big[|V_{N,E}e^{E/2}| > \varepsilon\big] \le \frac{4e^{E}}{\varepsilon^4}. \tag{4.130}$$

Next we note that $|T(E)| = \int_E^\infty N_N(dx)$ converges, as $N \uparrow \infty$, to a Poisson random variable with parameter $e^{-E}$. In particular,

$$\lim_{N\uparrow\infty} P\big[\,|e^{E}|T(E)| - 1| > \varepsilon\big] = \sum_{n=0}^{e^{-E}(1-\varepsilon)} \frac{e^{-nE}}{n!}\,e^{-e^{-E}} + \sum_{n=e^{-E}(1+\varepsilon)}^{\infty} \frac{e^{-nE}}{n!}\,e^{-e^{-E}} \le C e^{-E} e^{-\varepsilon^2 e^{-E}}. \tag{4.131}$$

Combining these two observations proves the lemma.
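The concentration of a Poisson variable around its mean, which is what (4.131) expresses, can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the limiting law of $|T(E)|$ is Poisson with mean $\lambda = e^{-E}$, sums the two tails exactly, and compares them with the standard textbook Chernoff bound $2\exp(-\varepsilon^2\lambda/3)$ (valid for $0 < \varepsilon \le 1$). The function names and the constants 2 and 1/3 are illustrative choices and are not the constant $C$ of (4.131).

```python
import math


def poisson_two_sided_tail(lam, eps):
    """P[|X/lam - 1| > eps] for X ~ Poisson(lam), with the pmf evaluated in log space."""
    # Terms beyond lam + 60*sqrt(lam) + 60 are negligible and are dropped.
    n_max = int(lam + 60.0 * math.sqrt(lam) + 60.0)
    total = 0.0
    for n in range(n_max + 1):
        if abs(n / lam - 1.0) > eps:
            total += math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
    return total


def chernoff_bound(lam, eps):
    """Standard two-sided Chernoff bound for a Poisson(lam) variable, 0 < eps <= 1."""
    return 2.0 * math.exp(-eps * eps * lam / 3.0)


if __name__ == "__main__":
    for E in (-2.0, -4.0, -6.0):
        lam = math.exp(-E)
        print(E, poisson_two_sided_tail(lam, 0.5), chernoff_bound(lam, 0.5))
```

As $E$ decreases the mean $e^{-E}$ grows and the tail probability collapses very quickly, which is the mechanism behind the convergence of $e^{E}|T(E)|$ to one.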

Remark. Note that we actually prove that $\tau_{N,E}$ converges, as $N \uparrow \infty$, to a random variable $\tau_E$ which in turn, as $E \downarrow -\infty$, converges to a constant. This latter convergence can easily be shown to take place almost surely. However, it is not correct that the joint convergence takes place almost surely. It may be possible to show that almost sure convergence holds along certain diagonal limits $N \uparrow \infty$ with $E = E_N$ depending on N in a suitable way. Due to the generally rather slow convergence of extremal distributions, proving such a statement rigorously would require a considerable extra effort and is not guaranteed to succeed.

The next lemma is an immediate application of the weak convergence of the point process $N^*_{N,E}$:

Lemma 4.16. Let g be a bounded continuous function on $\mathbb{R}^+$ such that $\int_0^\infty \frac{dx}{x^{1+1/\alpha}}\, g(x) < +\infty$, and let $X_N$ be a family of positive random variables that converge in distribution to the positive random variable X. Then for any $b > 0$,

(i) $\int_b^\infty N^*_{N,E}(dx)\, g(xX_N)$ converges, as $N \uparrow \infty$, to the random variable $\int_b^\infty N^*_E(dx)\, g(xX)$.

(ii) If $X_E$ is a family of random variables such that, as $E \downarrow -\infty$, $X_E \to a \in \mathbb{R}^+$ almost surely, then

$$\lim_{E\downarrow-\infty} e^{+E}\int_1^\infty N_E^*(dx)\, g(xX_E) = \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\, g(xa), \quad \text{a.s.} \tag{4.132}$$

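(iii) If g is a complex valued function on $\mathbb{C}$, and if for some domain $B \subset \mathbb{C}$, for all $x \in \mathbb{R}^+$, $z \in B$, $g(zx)$ is bounded, and for all $z \in B$,

$$\int_0^\infty \frac{dx}{x^{1+1/\alpha}}\, g(zx) < \infty \tag{4.133}$$

holds, then

$$\lim_{E\downarrow-\infty} P\left[\lim_{N\uparrow\infty}\sup_{z\in B}\left| e^{E}\int_1^\infty N_E^*(dx)\, g(zxX_E) - (az)^{1/\alpha}\alpha^{-1}\int_{az}^\infty \frac{g(x)}{x^{1+1/\alpha}}\,dx \right| > \varepsilon\right] = 0. \tag{4.134}$$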

Proof. (i) is a standard result that follows from the equivalence of convergence in distribution of a r.v. and almost sure convergence of a sequence of r.v. having the same distribution. To prove (ii), recall that by definition of the Poisson process $N_E^*$,

$$\int_1^\infty N_E^*(dx)\, g(x) = \sum_{i=1}^{n_E} g(x_i), \tag{4.135}$$


where $n_E$ is a Poisson random variable with mean $e^{-E}$, and $x_i$, $i \in \mathbb{N}$, are i.i.d. random variables such that

$$P[x_i \le a] \equiv \alpha^{-1}\int_1^a \frac{dx}{x^{1+1/\alpha}}. \tag{4.136}$$

Note first that by continuity $g(xX_E) - g(xa)$ converges to zero and, since g is integrable w.r.t. the law of $x_i$, $g(xX_E) - g(xa) \downarrow 0$ as a random variable. On the other hand, it follows from our assumptions that $g(x_i)$ are bounded random variables. In particular, their moment generating functions $E e^{\lambda g(x_i)}$ are finite for all $\lambda$. Therefore standard arguments imply that there exists a constant c such that

$$P\left[\Big| n_E^{-1}\sum_{i=1}^{n_E}\big(g(x_i) - E g(x_i)\big)\Big| > \varepsilon\right] \le 2 E_{n_E} \exp\left(-\frac{\varepsilon^2 n_E}{c\,\mathrm{var}^2(g)}\right), \tag{4.137}$$

where $E_{n_E}$ denotes expectation with respect to the Poisson variable $n_E$ and

$$\mathrm{var}^2(g) \equiv \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\left(g(x) - \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\, g(x)\right)^2 \tag{4.138}$$

is, by our assumptions on g, finite. Together with the exponential estimate on the concentration of the Poisson variable $n_E$ (4.131), this yields

$$P\left[\Big| n_E^{-1}\int_1^\infty N_E^*(dx)\, g(x) - \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\, g(x)\Big| \ge \varepsilon\right] \le 2\exp\left(-\frac{\varepsilon^2 e^{-E}}{2c\,\mathrm{var}^2(g)}\right) + C e^{-E} e^{-e^{-E/4}}. \tag{4.139}$$

From this (ii) follows immediately. To prove (iii), note that (ii) also holds if g takes complex values by simply considering real and imaginary parts separately. By a simple change of variables we have, for $s \le 1$,

$$E g(s\,\cdot) = s^{1/\alpha}\alpha^{-1}\int_s^\infty \frac{dx}{x^{1+1/\alpha}}\, g(x) \tag{4.140}$$

and

$$\mathrm{var}^2(g(s\,\cdot)) = s^{1/\alpha}\alpha^{-1}\int_s^\infty \frac{dx}{x^{1+1/\alpha}}\left(g(x) - s^{1/\alpha}\alpha^{-1}\int_s^\infty \frac{dx}{x^{1+1/\alpha}}\, g(x)\right)^2. \tag{4.141}$$

If (4.133) holds, this implies that $E g(s\,\cdot) \le C s^{1/\alpha}$ and $\mathrm{var}^2(g(s\,\cdot)) \le C s^{1/\alpha}$ for small s. Thus, for s small, we get from (4.139) that for some finite constant $C_g$ depending on g,

$$P\left[\Big| e^{E}\int_1^\infty N_E^*(dx)\, g(sx) - \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\, g(sx)\Big| \ge s^{1/(2\alpha)}\varepsilon\right] \le 2\exp\left(-\frac{\varepsilon^2 e^{-E}}{2C_g}\right) + C e^{-E} e^{-e^{-E/4}}. \tag{4.142}$$

Remark. This means that fluctuations are at most of order $s^{1/(2\alpha)} e^{E/2}$, which is less than the mean as long as $s > e^{E}$. This should be taken as a sign that on time scales larger than $e^{-E}$ self-averaging no longer takes place.
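The representation (4.135)-(4.136) lends itself to a small Monte Carlo illustration of part (ii) of Lemma 4.16. The sketch below is only an illustration under stated assumptions: it takes the number of points to be Poisson with mean $e^{-E}$, draws each point as $x_i = U^{-\alpha}$ with $U$ uniform on $(0,1]$ (which gives $P[x_i \le a] = 1 - a^{-1/\alpha}$, in agreement with (4.136)), and uses the test function $g(x) = x/(1+x^2)$, which is a hypothetical choice made here and does not come from the paper. The limit value is computed with the substitution $x = u^{-\alpha}$, which turns $\alpha^{-1}\int_1^\infty g(x)\,x^{-1-1/\alpha}dx$ into $\int_0^1 g(u^{-\alpha})\,du$.

```python
import math
import random


def g(x):
    """Illustrative bounded test function (an assumption, not taken from the paper)."""
    return x / (1.0 + x * x)


def sample_poisson(rng, lam):
    """Number of arrivals of a rate-one Poisson process in [0, lam]."""
    n = 0
    t = rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n


def empirical_average(E, alpha, func, seed=0):
    """e^{E} times the integral of func against one sample of the Poisson point process."""
    rng = random.Random(seed)
    n = sample_poisson(rng, math.exp(-E))
    points = [(1.0 - rng.random()) ** (-alpha) for _ in range(n)]
    return math.exp(E) * sum(func(x) for x in points)


def limit_value(alpha, func, steps=20000):
    """Midpoint rule for int_0^1 func(u^{-alpha}) du."""
    h = 1.0 / steps
    return h * sum(func(((k + 0.5) * h) ** (-alpha)) for k in range(steps))


if __name__ == "__main__":
    print(empirical_average(-10.0, 2.0, g), limit_value(2.0, g))
```

For $E = -10$ the sample contains roughly $2\cdot 10^{4}$ points, and the empirical average should agree with the limit to about two decimal places, in line with fluctuations of order $e^{E/2}$.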


The uniformity of the convergence in z claimed under (iii) follows from the exponential estimate (4.142) and the continuity of g by standard arguments. This concludes the proof of the lemma.

As the first and main application of this lemma we obtain the

Corollary 4.17. Uniformly in $\Re(z) < \max(|\Im(z)|, 1/2)$,

$$\lim_{E\downarrow-\infty}\,\lim_{N\uparrow\infty} \frac{1}{|T(E)|}\sum_{\sigma\in T(E)} \frac{z}{z - z_\sigma} = \alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\,\frac{xz\tau_\infty}{xz\tau_\infty - 1}, \quad \text{in probability.} \tag{4.143}$$

Moreover, on the same set,

$$\alpha^{-1}\int_1^\infty \frac{dx}{x^{1+1/\alpha}}\,\frac{xz\tau_\infty}{xz\tau_\infty - 1} = (-z\tau_\infty)^{1/\alpha}\,\pi\,\mathrm{cosec}(\pi/\alpha) + O(|z|) \tag{4.144}$$

for $|z|$ small.

Proof. To get (4.143), just check that the hypotheses of Lemma 4.16 are satisfied. To prove (4.144), note first that

$$\left|\int_0^1 \frac{dx}{x^{1+1/\alpha}}\,\frac{xz\tau_\infty}{xz\tau_\infty - 1}\right| \le \int_0^1 \frac{dx}{x^{1+1/\alpha}}\,\frac{|z|\tau_\infty x}{\sqrt{2}} = \frac{|z|\tau_\infty}{\sqrt{2}}\,\frac{1}{1 - 1/\alpha}, \tag{4.145}$$

where we used that $|(a + ib - 1)|^{-1} \le [(a-1)^2 + a^2]^{-1/2} \le 2^{-1/2}$, if $a \le |b|$. Thus it remains to compute the integral from zero to infinity. To do this we change variables from x to $xz\tau_\infty$. This turns the integral into an integration over a path $C_1$ in the complex z plane which is the straight line from zero passing through s to infinity. Since $z^{-1}$ is analytic in the complex plane with the positive real axis removed, the integration path $C_1$ can be rotated to the negative real axis $C_2$ without changing the integral, since the integral along the arc A at infinity vanishes (see the figure). In fact

$$\int_0^\infty \frac{dx}{x^{1+1/\alpha}}\,\frac{xz\tau_\infty}{xz\tau_\infty - 1} = z^{1/\alpha}\tau_\infty^{1/\alpha}\int_0^{\infty z} \frac{dx}{x^{1+1/\alpha}}\,\frac{x}{x-1} = z^{1/\alpha}\tau_\infty^{1/\alpha}\int_0^{-\infty} \frac{dx}{x^{1+1/\alpha}}\,\frac{x}{x-1} = (-z)^{1/\alpha}\tau_\infty^{1/\alpha}\int_0^\infty \frac{dx}{x^{1+1/\alpha}}\,\frac{x}{x+1}. \tag{4.146}$$

This proves the lemma.
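The last integral in (4.146) is the classical one, $\int_0^\infty \frac{dx}{x^{1/\alpha}(1+x)} = \pi\,\mathrm{cosec}(\pi/\alpha)$ for $\alpha > 1$, which is where the cosecant in (4.144) comes from. A quick numerical check can be sketched as follows; the substitution $x = e^{y}$ and the trapezoid rule are choices made for this illustration (they are not part of the proof), and the function names are hypothetical.

```python
import math


def integrand(y, a):
    """e^{a y} / (1 + e^{y}), written so that neither exponential can overflow."""
    if y > 0.0:
        return math.exp((a - 1.0) * y) / (1.0 + math.exp(-y))
    return math.exp(a * y) / (1.0 + math.exp(y))


def cosecant_integral(alpha, h=0.05):
    """Approximate int_0^infty dx / (x^{1/alpha} (1 + x)) after substituting x = e^{y}."""
    a = 1.0 - 1.0 / alpha
    # Both tails decay exponentially; cut where they are below about e^{-40}.
    cutoff = 40.0 * max(alpha, alpha / (alpha - 1.0))
    n = int(cutoff / h)
    return h * sum(integrand(k * h, a) for k in range(-n, n + 1))


if __name__ == "__main__":
    for alpha in (1.5, 2.0, 3.0):
        print(alpha, cosecant_integral(alpha), math.pi / math.sin(math.pi / alpha))
```

The trapezoid rule is used because the transformed integrand is analytic and decays exponentially in both directions, so a modest step size already gives many correct digits.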

We can now collect the results obtained in this and the previous subsection to control the asymptotics of the eigenvalue λ(u). (E)−1 , on the domain Dδ defined in (4.104), Corollary 4.18. With u = z lim lim (1 − λ(u)) = (−zτ∞ )1/α πcosec (π/α) + O(|z|),

E↓−∞ N↑∞

in Probability. (4.147)

Figure. Rotation of the integration contour to the real axis (complex plane with axes Re and Im, showing the path C1 through s, the path C2 and the arc A).

Proof. It suffices to combine formula (4.102), the estimate (4.105) of Lemma 4.14, and Corollary 4.17. Remark . The diligent reader (if any) who has reached this point will be relieved to finally see some formulas familiar from the trap-model emerge. Having this result, we can now also estimate the norm of the error term M ∗(1) (u). (E)−1 , on the domain Dδ defined in (4.104), Corollary 4.19. With u = z lim sup lim sup M ∗(1) (u) ≤ C(d)|z|2(1−1/α) , E↓−∞

in Probability.

(4.148)

N↑∞

Proof. This follows from Lemma 4.13 and Corollary 4.18.

Remark . We can only now appreciate why we expanded to second order in (4.58). It is crucial to have the norm of K∗(1) (u) bounded by something of order |z|2 to obtain an estimate that tends to zero in the corollary above.

4.5. Controlling the inhomogeneous term. Our next step is to establish control over the inhomogeneous term Fσ∗ (m, u) defined in (4.2). To do so we use the Markov property to represent P[σ (m) = σ , τTσ(E)\σ > m]P[τTσ(E)\σ > n] P[τTσ(E)\σ > m + n] = σ ∈T (E)\σ

=

P[σ (m) = σ , τTσ(E)\σ > m]P[τTσ(E)\σ > n]

σ ∈T (E)

+ P[σ (m) = σ, τTσ(E)\σ > m]P[τTσ(E)\σ > n].

(4.149)


Inserting this relation into (4.2) we obtain that

Fσ∗ (m, u) =

P[σ (m) = σ , τTσ(E)\σ > m]Lσσ (u)

σ ∈T (E)

+P[σ (m) = σ, τTσ(E)\σ > m]Lσσ (u),

(4.150)

where Lσσ (u) is given by (3.14) and

Lσσ (u)

=

GσT (E)\σ (u) − 1 eu − 1

.

(4.151)

Thus, using that

GσT (E)\σ (u) = GσT (E)\σ,T (E) (u) + Gσσ,T (E) (u)GσT (E)\σ (u)

(4.152)

we get 1 Fσ∗ (m, u) = u P[σ (m) = σ, τTσ (E)\σ > m] GσT (E)\σ (u) − 1 e −1 + P[σ (m) = σ , τTσ (E)\σ > m] GσT (E)\σ,T (E) (u) + Gσσ,T (E) (u)GσT (E)\σ (u) − 1 . σ ∈T (E)

(4.153)

As is by now usual, we will need a rather crude bound for u away from the origin complemented by a finer estimate for very small values of |u|. The former follows from the next lemma. −1 . Then Lemma 4.20. Assume that (u) ≤ 21 2 σ σ P[τ > m] (u) (4.154) G +2 . T (E)\σ T (E)\σ u |e − 1| Proof. By Lemma 3.4, under the condition on u, Gσσ,T (E) (u) = gσσ (u) ≤ 2 (in fact ≤ 2/(M − 1)). Similarly, GσT (E)\σ,T (E) (u) ≤ 2. Inserting this into (4.153) and noting that σ ∈T (E)\σ P[σ (m) = σ , τTσ(E)\σ > m] = P[τTσ(E)\σ > m] one arrives readily at the claimed bound. |Fσ∗ (m, u)| ≤

Bounds for |u| 1. As was the case for the resolvent, we have to identify more precisely the leading term of the inhomogeneous term for the contribution to the inversion integral for u very close to the origin. We begin with the m-dependent probabilities in (4.153). Lemma 4.21. There is a finite positive constant C such that, with bσ as in (3.9), √ −m/EτTσ (E)\σ bσ . (4.155) P τTσ(E)\σ > m, σ (m) = σ − pN (σ, σ )m ≤ Cme−β NEσ e


Proof. Note that pN (σ, σ )m is the probability of the event that σ (k) remains at σ during the entire period from time zero to time m which is a subset of the event {τTσ(E)\σ > m, σ (m) = σ }. In what remains, there must be a first time when σ (k) = σ . Thus P τTσ(E)\σ > m, σ (m) = σ − pN (σ, σ )m m−1

≤ pN (σ, σ )k−1 pN (σ, σ )P τTσ(E)\σ > m − k, σ (m − k) = σ k=1

≤ (1 − pN (σ, σ ))

σ ∼σ m−1

σ pN (σ, σ )k−1 max P τ > m − k . T (E)\σ σ ∼σ

k=1

(4.156)

The probability in the last line is similar to the probabilities estimated in Corollary 3.3, except that the starting point is now σ instead of σ . However using the decomposition (4.152), one verifies easily that following the same lines as in the proof of that corollary, one obtains the estimate

−(m−k)/EτTσ (E)\σ bσ P τTσ(E)\σ > m − k ≤ Ce

(4.157)

which is all we will need here. Inserting this estimate into (4.156) and using that, by Proposition 2.2 (together with the remark that follows it),

$$p_N(\sigma,\sigma)^k = \left(1 - e^{-\beta\sqrt{N}E_\sigma}\right)^k \le e^{-k e^{-\beta\sqrt{N}E_\sigma}} \le e^{-k/E\tau^{\sigma}_{T(E)\setminus\sigma}}, \tag{4.158}$$

the bound (4.155) follows directly.
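The middle step of (4.158) is the elementary inequality $(1-p)^k \le e^{-kp}$ for $0 \le p \le 1$, applied with $p$ standing for $e^{-\beta\sqrt{N}E_\sigma}$. A minimal numerical sketch, with hypothetical function names and arbitrary sample values of $p$ and $k$ chosen here for illustration:

```python
import math


def stay_probability(p, k):
    """Probability of k consecutive non-moves when each step leaves with probability p."""
    return (1.0 - p) ** k


def exponential_bound(p, k):
    """The bound exp(-k p) used in the middle step of (4.158)."""
    return math.exp(-k * p)


if __name__ == "__main__":
    for p in (1e-6, 1e-3, 0.1, 0.5):
        for k in (1, 10, 1000, 10 ** 6):
            print(p, k, stay_probability(p, k), exponential_bound(p, k))
```

For small $p$ and $k$ of order $1/p$ the two sides are close, which is why little exponential decay is lost in this step.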

Remark . Let us note that the bound (4.155) is really effectively smaller than the dominant term, if Eσ is “deep” within the top, even though we concede a little of the exponential √ decay when replacing eβ NEσ by ETσ (E)\σ . The point is that this error will tend to zero, while the prefactor of the exponential tends to zero as well. Since it will be the σ with exceptionally large Eσ that contribute to the long time behavior, this will do the job. Lemma 4.22. There exists a finite positive constant C such that , then (i) If EτTσ(E)\σ >

P τTσ(E)\σ > m, σ (m) ∈ T (E) ≤

√

e−β NEσ −m/EτTσ (E)\σ bσ e . σ /Eτ 1− T (E)\σ

(4.159)

, then (ii) If EτTσ(E)\σ ≤ m

P τTσ(E)\σ > m, σ (m) ∈ T (E) ≤ e−m/ .

(4.160)

Proof. Note that if the event {τTσ(E)\σ > m, σ (m) ∈ T (E)} occurs, then there exists a last time m − k < m when the process visits the σ . This gives us the bound


P τTσ(E)\σ > m, σ (m) ∈ T (E) ≤

m−1

P τTσ(E)\σ > m − k, pN (σ, σ )P τTσ(E) > k − 1 σ ∼σ

k=1

≤ (1 − pN (σ, σ ))

m−1

(E) −(m−k)/EτTσ (E)\σ −(k−1)/

e

.

(4.161)

k=1 −m/Eτ σ

T (E)\σ from the sum and oversum the remaining geoIn case (i) we can extract e metric series to get (4.159), while in the latter case we simply bound the exponential terms by their maximum and retain that there are only m terms in the sum. This proves the lemma.

Next we want to deal with the Laplace transforms appearing in (4.153). Concerning the first line, we are already in good position, since we have the estimates needed for GσT (E)\σ (u) − 1 (see Proposition 3.6). The second term has, as we have seen, a prefactor that is of lower order in the m behavior, but we have to show that the u-dependent coefficient is not more singular than that of the first term. To this end we rewrite

GσT (E)\σ,T (E) (u) + Gσσ,T (E) (u)GσT (E)\σ (u) − 1 = GσT (E)\σ,T (E) (u) + Gσσ,T (E) (u) GσT (E)\σ (u) − 1 + Gσσ,T (E) (u) − 1 = GσT (E) (u) − 1 + Gσσ,T (E) (u) GσT (E)\σ (u) − 1 . (4.162) −1 , It will suffice to use that, for u < 21

|GσT (E) (u) − 1| ≤ |u|

(4.163)

and that Gσσ,T (E) (u) is bounded and analytic.

4.6. Laplace inversion 1. The error terms. After this preparation we are now ready to attack the Laplace inversion of the function ∗ (u, m, E) given in principle by (4.5). Recall that we are interested in computing

(n, m, E) ≡

1

σ (n, m, E) ≡ (I, (n, m, E)). |T (E)|

(4.164)

σ ∈T (E)

Setting

0 (n, m, E) ≡ (n, m, E) − (I, F (n, m)),

(4.165)

(4.5) and the inversion formula for Laplace transforms, we can write 1

(n, m, E) = 2πi 0

iπ

−iπ

due−un I, ME∗ (u)F ∗ (m, u) .

(4.166)


The notation of Sect. 4.2 (see (4.31)–(4.34)) are again brought into force in the present u. The first step of the analysis consists in desection; recall in particular that z = forming the contour of integration to the contour C consisting of three parts $ # √ ] , (4.167) A ≡ u ∈ C : z = 1/2, |z| ∈ [1/ 2κ, π $ # (4.168) B ≡ u ∈ C : z ∈ [1/t˜, 1/2], z = κ|z|2 , and D ≡ D1 ∪ D 2 ,

(4.169)

where

# $ D1 ≡ u ∈ C : |z| = 1/t , z < c|z|2 , & % D2 ≡ u ∈ C : z ∈ [ 1/(4κ 2 ) + 1/t 2 − 1/(2κ), 1/t˜], z = κ|z|2 . (4.170)

Here t and κ are positive parameters that are assumed to be chosen such that C lies in the domain of validity of Corollary 4.7 and Lemma 4.8, (ii), namely in (D1 (4) ∩ D2 (L, γ )) ∪ D3 ,

for some fixed

1 2

≤ γ < 1.

(4.171)

(Note that this essentially only imposes a constraint on $\kappa$, which has to be taken small enough compared with $\gamma/L$.) In what follows, t must be thought of as very large compared with one. At this stage no constraint is imposed on the parameter $\tilde t$; it will be chosen as $\tilde t = t^{\eta}$, for suitable $0 < \eta < 1$, later. For future reference let us define the points:

$z_A = r_A + i s_A$, with $r_A = 1/2$, $s_A = 1/\sqrt{2\kappa}$;

$z_B = r_B + i s_B$, with $r_B = 1/\tilde t$, $s_B = 1/\sqrt{\kappa\tilde t}$;

$z_D = r_D + i s_D$, with $r_D = \sqrt{1/(4\kappa^2) + 1/t^2} - 1/(2\kappa)$, $s_D = \left[\left(\sqrt{1 + (2\kappa/t)^2} - 1\right)/(2\kappa^2)\right]^{1/2}$.
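The three reference points are the places where the pieces of the contour meet: all of them lie on the parabola $r = \kappa s^2$, and $z_D$ also lies on the circle $|z| = 1/t$. The following sketch simply evaluates the formulas above and checks these two properties; the parameter values are arbitrary choices for illustration and the function name is hypothetical.

```python
import math


def contour_points(t, t_tilde, kappa):
    """Return the points z_A, z_B, z_D as complex numbers r + i s."""
    r_a = 0.5
    s_a = 1.0 / math.sqrt(2.0 * kappa)
    r_b = 1.0 / t_tilde
    s_b = 1.0 / math.sqrt(kappa * t_tilde)
    r_d = math.sqrt(1.0 / (4.0 * kappa ** 2) + 1.0 / t ** 2) - 1.0 / (2.0 * kappa)
    s_d = math.sqrt((math.sqrt(1.0 + (2.0 * kappa / t) ** 2) - 1.0) / (2.0 * kappa ** 2))
    return {"A": complex(r_a, s_a), "B": complex(r_b, s_b), "D": complex(r_d, s_d)}


if __name__ == "__main__":
    t, eta, kappa = 100.0, 0.5, 0.1
    for name, z in contour_points(t, t ** eta, kappa).items():
        # On the parabola, the real part equals kappa times the squared imaginary part.
        print(name, z, z.real - kappa * z.imag ** 2)
```

For large $t$ one finds $r_D \approx \kappa/t^2$ and $s_D \approx 1/t$, consistent with the approximation $s_D \approx 1/t$ used later in the proof of Lemma 4.25.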

We expect the main contribution to the integral to come from the part D of the integration. Thus we show first how to bound the two other contributions. From now on the letter c will denote a positive constant whose value may change from line to line. Lemma 4.23. Let A be defined in (4.167). Then ) due−un I, M ∗ (u)F ∗ (m, u) ≤ ce−n/(2 . E A

(4.172)

Proof. Calling IA the left hand side of (4.172) we clearly have π ) + iv)F ∗ (m, 1/(2 ) + iv) IA ≤ 2e−n/(2) dv I, ME∗ (1/(2 sA /

≤ 2e

) −1 −n/(2

π √ 1/ 2κ

/ // / )/ /F ∗ (m, (1/2 + is)/ )/ . ds /ME∗ ((1/2 + is)/ ∞ (4.173)

Figure. The contour C in the variables r and s (showing the parts A, B, D1 and D2, the points zA, zB and zD, and the values 1/t, 1/t̃ and 0.5 on the r-axis).

/ / )/ can be bounded as in (4.48) of CorUnder our assumption on κ, /ME∗ ((1/2 + is)/ √ ) is monotone decreasing on [1/ 2κ, π ], we may add to ollary 4.7. Since cos(s/ our previous requirement on κ that it is chosen small enough so that ! ¯ −1 ) √ C +C 1 − O( 4 2 2(1−cos(1/( 2κ))) ≥ 1+ (1+O(d/N )) + . 1+ 2 M −1 2γ (4.174) The bound (4.48) then yields −1 / / ∗ 1 / ≤ ! 1 + Cγ /M ((1/2 + is)/ ! ) . E 2 −1 2(1 − cos(s/ )) ¯ ) 1+ 1 − O(

(4.175)

/ / )/ we use Lemma 4.20 together with the fact that on To bound /F ∗ (m, (1/2 + is)/ ∞ A, by the estimates of Proposition 3.2, |GσT (E)\σ (u)| ≤ c, to get that −1 / / ∗ /F (m, (1/2 + is)/ )/ ≤ c e(1/2+is)/ − 1 . ∞

(4.176)

)) and v = s/ . Then ρeiv − 12 = (1 − ρ)2 + 2ρ(1 − cos v), and Set ρ = exp(1/(2 ), since ρ > 1 + 1/(2 / / ∗ /F (m, (1/2 + is)/ )/

∞

≤!

c 2 2(1 − cos(s/ )) 1/4 +

.

(4.177)
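The bounds on the factor $|e^{u} - 1|^{-1}$ in this proof and in the next one rest on the identity $|\rho e^{iv} - 1|^2 = (1-\rho)^2 + 2\rho(1 - \cos v)$, which follows by expanding $\rho^2 - 2\rho\cos v + 1$, and on its consequence $|\rho e^{iv} - 1|^2 \ge 2(1-\cos v)$ for $\rho \ge 1$. A short numerical sanity check, with hypothetical function names and an arbitrary grid of values:

```python
import cmath
import math


def modulus_squared_direct(rho, v):
    """|rho e^{iv} - 1|^2 computed with complex arithmetic."""
    return abs(rho * cmath.exp(1j * v) - 1.0) ** 2


def modulus_squared_identity(rho, v):
    """(1 - rho)^2 + 2 rho (1 - cos v)."""
    return (1.0 - rho) ** 2 + 2.0 * rho * (1.0 - math.cos(v))


if __name__ == "__main__":
    for rho in (1.0, 1.01, 1.5, 3.0):
        for k in range(-10, 11):
            v = 0.3 * k
            print(rho, v, modulus_squared_direct(rho, v), modulus_squared_identity(rho, v))
```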


Inserting (4.175) and (4.177) in (4.173) we get π 1 ) −n/(2 ! IA ≤ 2ce √ ds ! 2 2(1 − cos(s/ )) 1/4 + 2 2(1 − cos(s/ )) 1/ 2κ 1+ π 1 . (4.178) ≤ 2ce−n/(2) √ ds 2 2(1 − cos(s/ )) 1/4 + 1/ 2κ √ /4] To evaluate the last integral above, we split the integration interval into [1/ 2κ, π )) is well approximated /4, π ]. On the first of these intervals, 2 2(1−cos(s/ and [π by s 2 so that π π /4 /4 ds 1 ≤ c ds ≤ c . (4.179) √ √ 2 1 + s2 1/4 + 2(1 − cos(s/)) 1/ 2κ 1/ 2κ )) > 2 so that 2 2(1 − cos(s/ We then use that on the remaining interval π c 1 ≤ . ds 2 2(1 − cos(s/ 1/4 + )) ı/4

Inserting (4.179) and (4.180) in (4.178) yields the claim of the lemma. (E) and t˜ = Lemma 4.24. Let B be defined in (4.168). If t = n/ 0 < η < 1, due−un I, M ∗ (u)F ∗ (m, u) ≤ ct η exp(−t 1−η ). E

(4.180)

tη

then, for all

(4.181)

B

Proof. It will be enough to use norm estimates, that is, calling IB the left hand side of (4.181), // / / |du|e−nu /ME∗ (u)/ /F ∗ (m, u)/∞ IB ≤ B / // / −1 )/ /F ∗ (m, z/ )/ = |dz|e−tz /ME∗ (z/ ∞ B / // / sA 2 / / −1 )/ )/ ≤ 2c ds e−κs t /ME∗ ((κs 2 + is)/ / /F ∗ (m, (κs 2 + is)/ / . ∞

sB

(4.182)

As in the proof of the previous lemma we use (4.48) to write the bound / / / ∗ )/ / /ME ((κs 2 + is)/ ≤ !

1 + Cγ −1 κs 2

. ¯ −1 )) − 1 − 4 (1 + O(d/N)) − (C + C )γ −1 κs 2 2 2(1 − cos(s/ ))(1 − O( 1+ M−1 (4.183)

2 2 2 Using this time that on the √ integration interval, 2(1−cos(s/)) ≥ s (1−1/(6κ ) ), and that for 0 < x < 1, 1 + x ≥ 1 + x/2, we get (for κ small enough, t small enough compared with M, and M, N large) that the denominator in the r.h.s. of (4.183) is greater than s 2 /4. Since the numerator is bounded above by a constant, we may write / / / ∗ )/ (4.184) /ME ((κs 2 + is)/ / ≤ cs −2 .


/ / )/ observe that, proceeding as we did to Turning to the term /F ∗ (m, (κs 2 + is)/ ∞ derive (4.176) we obtain, / −1 / 2 / ∗ )/ (4.185) / ≤ c e(κs +is)/ − 1 . /F (m, (κs 2 + is)/ ∞

) > 1 and v = s/ , ρeiv − 12 = (1 − ρ)2 + 2ρ(1 − Now, with ρ = exp((κs 2 )/ cos v) ≥ 2(1 − cos v). Combining this with the bound established on the line following (4.183), (4.185) becomes / ∗ / /F (m, (1/2 + is)/ . )/ ≤ cs −1 (4.186) ∞ Collecting (4.182), (4.184) and (4.186), we arrive at sA 1/√2 2 −κs 2 t −3 IB ≤ c ds e s ≤ c √ ds e−s t s −3 ≤ ce−t/t˜ 1/ t˜

sB

√ 1/ 2 √ 1/ t˜

ds s −3 ≤ ct˜e−t/t˜. (4.187)

Thus, choosing t˜ = t η , 0 < η < 1, concludes the proof of the lemma.

We now consider the error term resulting from the M ∗(1) (u) part of the resolvent on the part D of the integration contour. (E) then, for all 0 < δ < 1/2, Lemma 4.25. If t = n/ −nu ∗(1) ∗ I, M (u)F (m, u) lim sup lim sup due D E↓−∞ N↑∞ c t −2δ(1−1/α) + ct −(1−2δ) exp(−t 1−2δ ). (4.188) ≤ ct −2(1−1/α) + 2(1 − 1/α) Proof. Again, it will be enough to use norm estimates, that is due−nu I, M ∗(1) (u)F ∗ (m, u) ≤ |du|e−nu M ∗(1) (u)

F ∗ (m, u) ∞ . D D (4.189) To bound F ∗ (m, u) ∞ we proceed as in the previous two lemmata and use Lemma 4.20 together with the fact that on D, by the estimates of Proposition 3.2, |GσT (E)\σ (u)| ≤ c, to establish that

F ∗ (m, u) ∞ ≤ c|eu − 1|−1 ≤ c|u|−1 . Hence D

|du|e−nu M ∗(1) (u)

F ∗ (m, u) ∞ ≤ c

and12 by Corollary 4.19,

lim sup lim sup E↓−∞

N↑∞

D

D

(4.190)

) |z|−1 |dz|e−tz M ∗(1) (z/ (4.191)

) |z|−1 |dz|e−tz M ∗(1) (z/

12 The appearance of after the limit has been taken in the inequality below may look confusing. D does not depend on N and E so that this Observe however that, for all N, E, the rescaled contour notation is formally correct.


≤c

D

|dz|e−tz |z|1−2/α ≡ cI D

(4.192)

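We now decompose $I_D$ as $I_D = I_{D_1} + I_{D_2}$ according to (4.170). Clearly

$$I_{D_1} \le \int_0^{2\pi} d\theta\, t^{-(1-2/\alpha)-1} = 2\pi\, t^{-(2-2/\alpha)}. \tag{4.193}$$

To bound $I_{D_2}$ we first observe that

$$I_{D_2} \le \int_{s_D}^{s_B} ds\, e^{-s^2\kappa t}\sqrt{(2\kappa s)^2 + 1}\,\left(\sqrt{(\kappa s^2)^2 + s^2}\right)^{1-2/\alpha} \le c\int_{\sqrt{\kappa}s_D}^{\sqrt{\kappa}s_B} ds\, e^{-s^2 t}\, s^{1-2/\alpha}, \tag{4.194}$$

and since for t large, $s_D \approx 1/t$,

$$I_{D_2} \le c\int_{\sqrt{\kappa}/t}^{1} ds\, e^{-s^2 t}\, s^{1-2/\alpha}. \tag{4.195}$$

Introducing a number $0 < \delta < 1/2$, we then split the last integral above into

$$J_1 \equiv \int_{\sqrt{\kappa}/t}^{1/t^{\delta}} ds\, e^{-s^2 t}\, s^{1-2/\alpha} \quad\text{and}\quad J_2 \equiv \int_{1/t^{\delta}}^{1} ds\, e^{-s^2 t}\, s^{1-2/\alpha}. \tag{4.196}$$

As no exponential decay is to be gained in $J_1$, we simply write

$$J_1 \le \int_{\sqrt{\kappa}/t}^{1/t^{\delta}} ds\, s^{1-2/\alpha} = \frac{1}{2(1-1/\alpha)}\left(t^{-2\delta(1-1/\alpha)} - \kappa^{1-1/\alpha}\, t^{-2(1-1/\alpha)}\right). \tag{4.197}$$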
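The closed form in (4.197) is just the antiderivative $s^{2-2/\alpha}/(2-2/\alpha)$ evaluated between $\sqrt{\kappa}/t$ and $t^{-\delta}$. As a sanity check it can be compared with a direct quadrature; the sketch below uses a plain midpoint rule and arbitrary parameter values, both of which are choices made for this illustration only.

```python
import math


def j1_closed_form(t, kappa, alpha, delta):
    """Right-hand side of (4.197)."""
    c = 1.0 - 1.0 / alpha
    return (t ** (-2.0 * delta * c) - kappa ** c * t ** (-2.0 * c)) / (2.0 * c)


def j1_numeric(t, kappa, alpha, delta, steps=20000):
    """Midpoint rule for the integral of s^{1 - 2/alpha} over [sqrt(kappa)/t, t^{-delta}]."""
    lo = math.sqrt(kappa) / t
    hi = t ** (-delta)
    h = (hi - lo) / steps
    return h * sum((lo + (k + 0.5) * h) ** (1.0 - 2.0 / alpha) for k in range(steps))


if __name__ == "__main__":
    for alpha in (1.5, 3.0):
        print(alpha, j1_closed_form(100.0, 0.1, alpha, 0.25), j1_numeric(100.0, 0.1, alpha, 0.25))
```

The first term, of order $t^{-2\delta(1-1/\alpha)}$, dominates for large $t$, which is the term that survives in the statement (4.188).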

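To deal with $J_2$ we distinguish two cases: if $1 - 2/\alpha > 0$, then

$$J_2 \le \int_{1/t^{\delta}}^{1} ds\, e^{-s^2 t} = \frac{1}{2}\int_{1/t^{2\delta}}^{1} \frac{e^{-xt}\,dx}{\sqrt{x}} \le t^{\delta}\int_{1/t^{2\delta}}^{1} dx\, e^{-xt} \le t^{-(1-\delta)}\exp\left(-t^{1-2\delta}\right), \tag{4.198}$$

while if $1 - 2/\alpha \le 0$,

$$J_2 \le t^{\delta(2/\alpha - 1)}\int_{1/t^{\delta}}^{1} ds\, e^{-s^2 t} \le t^{\delta(2/\alpha-1) - (1-\delta)}\exp\left(-t^{1-2\delta}\right) \le t^{-(1-2\delta)}\exp\left(-t^{1-2\delta}\right). \tag{4.199}$$

We have thus obtained that

$$I_{D_2} \le \frac{c}{2(1-1/\alpha)}\, t^{-2\delta(1-1/\alpha)} + c\, t^{-(1-2\delta)}\exp\left(-t^{1-2\delta}\right), \tag{4.200}$$

which, together with (4.193), yields the claim of the lemma.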


4.7. Laplace inversion 2. The main contributions. Warning. In this last section we abandon the notation s = (z) introduced in (4.33). The letter s now takes back its initial meaning and designates the rescaled time variable ) of Theorem 1. s ≡ m/ We are now moving towards the principle contributions. Note that λ(u) I, M ∗(0) (u)F ∗ (m, u) = I, F ∗ (m, u) ≡ hN,E (m, u). 1 − λ(u)

(4.201)

We will prove the following result which together with the estimates on the error terms will imply our main theorem. Proposition 4.26. For u on C, we have that lim lim hN,E (m, u) = H0∗ (s, z)(1 + O(|z|1−1/α , |z|1/α )) + O(z−1/α e−s/τ∞ ),

E↓−∞ N↑∞

(4.202) where H0∗ (s, u) ≡ defined in (1.10).

∞ 0

dte

zt ∞

dx s/t x 1/α (1+x)

is the Laplace transform of the function H0
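To get a feeling for the limiting function, one can evaluate $H_0$ numerically in the form in which it reappears in (4.213) below, $H_0(w) = 1 - \frac{\sin(\pi/\alpha)}{\pi}\int_0^{w}\frac{dx}{x^{1/\alpha}(1+x)}$. The sketch below is illustrative only: the substitution $x = v^{p}$ with $p = \alpha/(\alpha-1)$, which removes the integrable singularity at the origin, and the use of Simpson's rule are choices made here, not taken from the paper. For $\alpha = 2$ the integral is elementary and gives $H_0(w) = 1 - \frac{2}{\pi}\arctan\sqrt{w}$, which serves as a check.

```python
import math


def h0(w, alpha, steps=20000):
    """H_0(w) = 1 - sin(pi/alpha)/pi * int_0^w dx / (x^{1/alpha} (1 + x))."""
    p = alpha / (alpha - 1.0)
    upper = w ** (1.0 / p)
    if upper == 0.0:
        return 1.0
    # After x = v^p the integral becomes int_0^upper p / (1 + v^p) dv.
    n = steps if steps % 2 == 0 else steps + 1
    h = upper / n

    def f(v):
        return p / (1.0 + v ** p)

    total = f(0.0) + f(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 == 1 else 2.0) * f(k * h)
    integral = total * h / 3.0
    return 1.0 - math.sin(math.pi / alpha) / math.pi * integral


if __name__ == "__main__":
    for w in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(w, h0(w, 2.0), 1.0 - 2.0 * math.atan(math.sqrt(w)) / math.pi)
```

The function decreases from one at $w = 0$ to zero as $w \to \infty$, with $w$ playing the role of the ratio $s/t$ of the two rescaled times.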

Proof. The analysis of I, F ∗ (m, u) is in spirit and even detail very similar to that of M ∗ (u), except that it is considerably simpler. Note that using (4.153), Lemma 4.21, Lemma 4.22, Eq. (4.162), and the estimate (4.163), the leading term in this expression is

I, F ∗ (m, u) ≈

GσT (E)\σ (u) − 1 1 pN (σ, σ )m . |T (E)| eu − 1

(4.203)

σ ∈T (E)

Note that from (4.107) we get furthermore that GσT (E)\σ (u) − 1 eu − 1

(E) =

1 (1 + R(u)) , zσ − z

(4.204)

where the remainder R(u) is of the same type as those appearing in the proof of Lemma 4.14. Thus we obtain Lemma 4.27. With the notation of Lemma 4.14, √ Gσ (u) − 1 m T (E)\σ −me−β NEσ (E) (E)|z|. −e ≤ C pN (σ, σ ) u e −1 zσ − z Proof. Essentially contained in the proof of Lemma 4.14.

(4.205)

Next we can now prove the analogue of Corollary 4.17. . Then, uniformly on z < max(z, 1/2), and (u) ≤ |u|, Lemma 4.28. Set s ≡ m/ √ 1 1 −β NEσ e−me E↓−∞ N↑∞ |T (E)| zσ − z ∞ σ ∈T (E) dx −1 = α τ∞ e−s/(xτ∞ ) in Probability. (1 − zxτ∞ )x 1/α 1

lim lim

(4.206)


Moreover, α −1 τ∞

∞

dx e−s/(xτ∞ ) 1/α (1 − zxτ ∞ )x 1 1/α −1 z πcosec (π/α) − = (−zτ∞ )

∞

dte 0

s/t

zt 0

√ me−β NEσ

dx 1/α x (1 + x)

+ O(e−s/τ∞ ). (4.207)

√ se−β NEσ

Proof. Observing that, by (4.125), = szσ (1 − 1/M)−1 , = (4.206) is proven like (4.143) of Corollary 4.17. To prove (4.207), it will be convenient to extend the integration in (4.206) all the way to zero, as in the proof of Corollary 4.17. One can easily estimate the difference, namely 1 1 dx dx −s/(xτ∞ ) ≤ √1 e e−s/(xτ∞ ) 1/α 1/α (1−zxτ∞ )x x 2 0 0 −s/τ ∞ e 1 . (4.208) ≤ √ min s −1 , 1−1/α τ∞ 2 In the extended integral we again change variables and rotate the integration contour to the negative real axis to get that z∞ ∞ dx dx −s/(xτ∞ ) 1/α−1 e = (zτ∞ ) e−sz/x 1/α (1 − zxτ )x (1 − x)x 1/α ∞ 0 0 ∞ dx = −(−zτ∞ )1/α−1 e+sz/x . (4.209) (1 + x)x 1/α 0 According to whether (z) is positive or negative, we can represent z∞ +∞ e+sz/x = e−t dt = z e−zt dt or −zs/x −s/x −z∞ +∞ +sz/x −t = e dt = −z e+zt dt respectively. e −zs/x

(4.210)

s/x

Inserting these representation into (4.209) and changing the order of integration in the resulting double integrals gives in both cases ∞ dx e−s/(xτ∞ ) 1/α (1 − zxτ 0 ∞ )x s/t ∞ dx −1 = τ∞ (−zτ∞ )1/α (z)−1 απcosec (π/α) − dtezt . x 1/α (1 + x) 0 0 (4.211) We can now combine the asymptotics for 1 − λ(u) obtained in Corollary 4.18 with the preceding result. This shows that √ (E) λ(u) 1 − NEσ e−me E↓−∞ N↑∞ 1 − λ(u) |T (E)| zσ − z

lim lim

σ ∈T (E)


∞ =z

−1

−

0

dtezt


s/t 0

dx x 1/α (1+x)

πcosec (π/α)

1 + O(|z|1−1/α , |z|1/α ) + O z−1/α e−s/τ∞ . (4.212)

The leading term is readily identified as the Laplace transform of

$$H_0(s/t) \equiv 1 - \frac{\int_0^{s/t} \frac{dx}{x^{1/\alpha}(1+x)}}{\pi\,\mathrm{cosec}(\pi/\alpha)}, \tag{4.213}$$

which we recognise as precisely the function that appeared as the leading asymptotic contribution in the trap model in Theorem 1.1. The bounds on the error terms then follow from simply estimating the corrections uniformly on C.

The last step before completing the proof of Theorem 1 is now to consider the contribution $F_{N,E}(n+m)$. We leave it to the reader to show that the leading asymptotics of this term is given by

$$\alpha^{-1}\int_1^\infty dx\, e^{-(t+s)/x}\, x^{-1-1/\alpha} \le \frac{1}{\alpha (t+s)^{1/\alpha}}\int_0^\infty dx\, e^{-1/x}\, x^{-1-1/\alpha}, \tag{4.214}$$

which is sub-dominant as s and t tend to infinity. Collecting all the estimates of this section concludes the proof of the main theorem.

Acknowledgement. We thank the Weierstrass Institute and the Mathematics Department of the EPFL for financial support and mutual hospitality.

References

[B] Bouchaud, J.P.: Weak ergodicity breaking and aging in disordered systems. J. Phys. I (France) 2, 1705 (1992)
[BBG1] Ben Arous, G., Bovier, A., Gayrard, V.: Glauber dynamics of the random energy model. I. Metastable motion on extreme states. Commun. Math. Phys., to appear; DOI 10.1007/s00220-003-0798-4
[BCKM] Bouchaud, J.P., Cugliandolo, L., Kurchan, J., Mézard, M.: Out-of-equilibrium dynamics in spin-glasses and other glassy systems. In: Spin-Glasses and Random Fields, A.P. Young (ed.), Singapore: World Scientific, 1998
[BD] Bouchaud, J.P., Dean, D.: Aging on Parisi's tree. J. Phys. I (France) 5, 265 (1995)
[BDG] Ben Arous, G., Dembo, A., Guionnet, A.: Aging of spherical spin glasses. Probab. Theor. Rel. Fields 120, 1–67 (2001)
[BEGK1] Bovier, A., Eckhoff, M., Gayrard, V., Klein, M.: Metastability in stochastic dynamics of disordered mean field models. Probab. Theory Relat. Fields 119, 99–161 (2001)
[BEGK2] Bovier, A., Eckhoff, M., Gayrard, V., Klein, M.: Metastability and low lying spectra in reversible Markov chains. Commun. Math. Phys. 228, 219–255 (2002)
[BKL] Bovier, A., Kurkova, I., Löwe, M.: Fluctuations of the free energy in the REM and the p-spin SK models. Ann. Probab. 30, 605–651 (2002)
[BM] Bouchaud, J.P., Monthus, C.: Models of traps and glass phenomenology. J. Phys. A-Math. Gen. 29(14), 3847–3869 (1996)
[BMR] Bouchaud, J.P., Maass, P., Rinn, B.: Multiple scaling regimes in simple aging models. Phys. Rev. Letts. 84(23), 5403–5406 (2000)
[CD] Cugliandolo, L., Dean, D.: Full dynamical solution for a spherical spin-glass model. J. Phys. A-Math. Gen. 28(15), 4213–4234 (1995)
[D1] Derrida, B.: Random energy model: Limit of a family of disordered models. Phys. Rev. Letts. 45, 79–82 (1980)
[D2] Derrida, B.: Random energy model: An exactly solvable model of disordered systems. Phys. Rev. B 24, 2613–2626 (1981)
[Doe] Doetsch, G.: Handbuch der Laplace-Transformation. Vol II, Lehrbücher und Monographien aus dem Gebiete der exakten Wissenschaften, Mathematische Reihe Band 15, Basel: Birkhäuser Verlag, 1955
[Fe] Feller, W.: An introduction to probability theory and its applications. Vol II, Wiley series in probability and mathematical statistics, New York: John Wiley, 1971
[Li] Liggett, T.M.: Interacting particle systems. Berlin: Springer, 1985
[LLR] Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and related properties of random sequences and processes. Berlin-Heidelberg-New York: Springer, 1983
[Ru] Ruelle, D.: A mathematical reformulation of Derrida's REM and GREM. Commun. Math. Phys. 108, 225–239 (1987)

Communicated by M. Aizenman

Commun. Math. Phys. 236, 55–63 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0803-y

Communications in

Mathematical Physics

Thermodynamical Limit for Correlated Gaussian Random Energy Models P. Contucci, M. Degli Esposti, C. Giardin`a, S. Graffi Dipartimento di Matematica, Universit`a di Bologna, 40127 Bologna, Italy. E-mail: {contucci,desposti,giardina,graffi}@dm.unibo.it Received: 17 June 2002 / Accepted: 31 October 2002 Published online: 21 February 2003 – © Springer-Verlag 2003

To Francesco Guerra on his sixtieth birthday Abstract: Let {Eσ (N )}σ ∈N be a family of |N | = 2N centered unit Gaussian random variables defined√ by the covariance matrix CN of elements cN (σ, τ ) :=Av(Eσ (N )Eτ (N )) and HN (σ ) = − N Eσ (N ) the corresponding random Hamiltonian. Then the quenched thermodynamical limit exists if, for every decomposition N = N1 + N2 , and all pairs (σ, τ ) ∈ N × N : cN (σ, τ ) ≤

N1 N2 cN1 (π1 (σ ), π1 (τ )) + cN2 (π2 (σ ), π2 (τ )), N N

where πk (σ ), k = 1, 2 are the projections of σ ∈ N into Nk . The condition is explicitly verified for the Sherrington-Kirkpatrick, the even p-spin, the Derrida REM and the Derrida-Gardner GREM models. 1. Introduction, Definitions and Results It has recently been proved by Guerra and Toninelli [GuTo] that for the SherringtonKirkpatrick (hereafter SK) model (as well as for the even-p-spin models) the thermodynamical limit exists for the quenched free energy and almost everywhere for its random realizations. In this paper we single out general sufficient conditions that imply the existence of the quenched thermodynamical limit for any correlated Gaussian random energy model. Our analysis thus includes as special cases not only the even p spin models (in particular the SK one, p = 2) but also the Derrida REM model[De1, De2] and the Derrida-Gardner GREM[DeGa]. The paper is organized as follows: in this section we introduce the definitions and state the results. In Sect. 3, after introducing and elucidating the operation of lifting for a family of Gaussian random variables, we describe the proof of our theorem. In Sect. 4 we show how our analysis applies to the specific examples listed above. To define the set up we consider a disordered model having 2N energy levels where N is the size of the system. We label the energy levels by the index σ = {σ1 , σ2 , . . . , σN },


where each $\sigma_i$ takes the values $\pm 1$ for $i = 1, \dots, N$. We denote by $\Sigma_N$ the set of all $\sigma$. Then $|\Sigma_N| = 2^N$. Clearly $\Sigma_N$ coincides with the space of all possible $2^N$ Ising configurations of length N.

Definition 1. Denote by $\{E_\sigma(N)\}_{\sigma\in\Sigma_N}$ a family of $2^N$ centered unit Gaussian random variables:
\[
\mathrm{Av}(E_\sigma(N)) = 0, \tag{1}
\]
with covariance matrix $C_N$ whose elements are defined by
\[
c_N(\sigma,\sigma) := \mathrm{Av}\big(E_\sigma^2(N)\big) = 1, \tag{2}
\]
\[
c_N(\sigma,\tau) := \mathrm{Av}\big(E_\sigma(N)E_\tau(N)\big). \tag{3}
\]
Here Av(−) denotes expectation with respect to the probability measure
\[
dP(E_1,\dots,E_{2^N}) = \frac{1}{\sqrt{(2\pi)^{2^N}\det(C)}}\; e^{-\frac{1}{2}\langle E,\,C^{-1}E\rangle}\, dE_1\cdots dE_{2^N}. \tag{4}
\]

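Definition 2.
1. For each N the Hamiltonian is given by
\[
H_N(\sigma) = -\sqrt{N}\,E_\sigma(N). \tag{5}
\]
2. The partition function of the system is:
\[
Z_N(\beta,E) = \sum_{\sigma} e^{-\beta H_N(\sigma)} = \sum_{\sigma} e^{\beta\sqrt{N}E_\sigma(N)}. \tag{6}
\]
3. The quenched free energy $f_N(\beta)$ of the system is defined as:
\[
-\beta f_N(\beta) := \alpha_N(\beta) := \frac{1}{N}\,\mathrm{Av}\big(\ln Z_N(\beta,E)\big). \tag{7}
\]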
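Definitions (5)–(7) can be made concrete with a small numerical sketch. The code below is only an illustration, not part of the paper: it takes the simplest covariance (independent energies, i.e. the REM), and the function names, sample count and seed are hypothetical choices. It estimates $\alpha_N(\beta)$ by averaging $\ln Z_N$ over sampled energy families.

```python
import math
import random


def log_partition(energies, beta, n):
    """Return ln Z_N = ln sum_sigma exp(beta * sqrt(N) * E_sigma), computed stably."""
    exponents = [beta * math.sqrt(n) * e for e in energies]
    m = max(exponents)
    return m + math.log(sum(math.exp(x - m) for x in exponents))


def alpha_rem(n, beta, samples=200, seed=0):
    """Monte Carlo estimate of alpha_N(beta) = Av(ln Z_N) / N.

    Assumption: the 2^N energies are i.i.d. standard Gaussians (REM covariance).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        energies = [rng.gauss(0.0, 1.0) for _ in range(2 ** n)]
        total += log_partition(energies, beta, n)
    return total / (samples * n)


if __name__ == "__main__":
    beta = 0.5
    print("estimated alpha_8(0.5):", alpha_rem(8, beta))
    print("annealed value ln 2 + beta^2/2:", math.log(2) + beta ** 2 / 2)
```

Up to sampling noise the estimate should lie between $\ln 2$ and the annealed value $\ln 2 + \beta^2/2$, both of which follow from Jensen's inequality.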

Remark 1. From now on we write $E_\sigma(N) = E_\sigma$, dropping the N-dependence. Note moreover that Definition 1 includes Gaussian families of the form
\[
E_\sigma(N) = J_0 + \sum_{i} J_i\sigma_i + \sum_{i,j} J_{i,j}\sigma_i\sigma_j + \sum_{i,j,k} J_{i,j,k}\sigma_i\sigma_j\sigma_k + \dots + \sum_{i_1,i_2,\dots,i_N} J_{i_1,i_2,\dots,i_N}\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_N} \tag{8}
\]
in which every J is an independent Gaussian variable.

Examples.

1. The SK model. Consider first the model defined by
\[
E_\sigma := \frac{1}{N}\sum_{i,j=1}^{N} J_{i,j}\sigma_i\sigma_j, \tag{9}
\]
where the $J_{i,j}$ are $N^2$ i.i.d. unit Gaussian random variables. A short computation yields
\[
\mathrm{Av}(E_\sigma E_\tau) = [q_N(\sigma,\tau)]^2,
\]


where, as usual,
\[
q_N(\sigma,\tau) := \frac{1}{N}\sum_{k=1}^{N}\sigma_k\tau_k \tag{10}
\]
is the overlap between the σ and τ spin configurations. The standard SK model is instead defined by
\[
E^{SK}_\sigma := \frac{1}{N}\sum_{i<j=1}^{N} J_{i,j}\sigma_i\sigma_j. \tag{11}
\]
However the quenched free energy densities (7) of the two models coincide up to a rescaling of the temperature, i.e.:
\[
\alpha^{SK}_N(\sqrt{2}\,\beta) = \alpha_N(\beta). \tag{12}
\]
In fact, the $J_{i,j}\sigma_i\sigma_j$ are centered, unit and i.i.d. Gaussian random variables for all $(i,j)$, and $J_{i,j}\sigma_i\sigma_j \overset{D}{=} J_{j,i}\sigma_j\sigma_i$. Hence $J_{i,j}\sigma_i\sigma_j + J_{j,i}\sigma_j\sigma_i \overset{D}{=} \sqrt{2}\,J_{i,j}\sigma_i\sigma_j$ (here $\overset{D}{=}$ denotes equality in distribution of two random variables). Therefore, taking into account also the N diagonal terms:
\[
\sqrt{N}\,E_\sigma \overset{D}{=} \sqrt{N}\,\sqrt{2}\,E^{SK}_\sigma + J, \tag{13}
\]
where J is a centered unit Gaussian variable. By (6), (7), formula (13) immediately yields the relation (12).

2. The p-spin models. Here we consider the model:
\[
E_\sigma := \frac{1}{\sqrt{N^p}}\sum_{i_1,\dots,i_p=1}^{N} J_{i_1,\dots,i_p}\sigma_{i_1}\cdots\sigma_{i_p}, \tag{14}
\]
where the $J_{i_1,\dots,i_p}$ are once more i.i.d. unit Gaussian random variables. As before, a short computation yields
\[
\mathrm{Av}(E_\sigma E_\tau) = [q_N(\sigma,\tau)]^p. \tag{15}
\]
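The "short computation" behind (15) can be reproduced by brute force for small systems. The following sketch is an illustration with hypothetical function names. It uses $\mathrm{Av}(J_a J_b) = \delta_{a,b}$, so only identical index tuples contribute, and it assumes the normalization $N^{-p/2}$ in (14); the result is compared with the p-th power of the overlap (10).

```python
import itertools


def overlap(sigma, tau):
    """Overlap q_N(sigma, tau) = (1/N) sum_k sigma_k tau_k."""
    return sum(s * t for s, t in zip(sigma, tau)) / len(sigma)


def pspin_covariance(sigma, tau, p):
    """Av(E_sigma E_tau) for E = N^(-p/2) sum_{i1..ip} J_{i1..ip} sigma_{i1}...sigma_{ip}.

    Independent unit couplings leave one term per index tuple (i1, ..., ip).
    """
    n = len(sigma)
    total = 0
    for idx in itertools.product(range(n), repeat=p):
        term = 1
        for i in idx:
            term *= sigma[i] * tau[i]
        total += term
    return total / n ** p


if __name__ == "__main__":
    sigma, tau = (1, 1, -1), (1, -1, -1)
    for p in (2, 3):
        print(p, pspin_covariance(sigma, tau, p), overlap(sigma, tau) ** p)
```

The two printed columns agree because the sum over index tuples factorizes into p copies of $\sum_k \sigma_k\tau_k$.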

3. The Derrida REM. Here the model is specified by Definition 1 with
\[
\mathrm{Av}(E_\sigma E_\tau) = \delta(\sigma,\tau). \tag{16}
\]

4. The Derrida-Gardner GREM. Its inclusion into the above framework is described in detail in Sect. 3.3.

Definition 3. For each $\sigma\in\Sigma_N$ let $\pi_1$ and $\pi_2$ be the two canonical projections over the two subsets $\Sigma_{N_1}$ and $\Sigma_{N_2}$, generated by a partition P of the coordinates $(\sigma_1,\dots,\sigma_N)$ into a subset of $N_1$ coordinates and a complementary set of $N_2$ coordinates: $N_1 + N_2 = N$, $\Sigma_N = \Sigma_{N_1}\times\Sigma_{N_2}$, $\pi_1\otimes\pi_2 = 1_{\Sigma_N}$. (Example: N = 4; $\sigma\in\Sigma_4$ with coordinates denoted $\{\sigma_1,\sigma_2,\sigma_3,\sigma_4\}$. Consider for $N_1 = N_2 = 2$ the partition $P\sigma = (\sigma_1,\sigma_2)\cup(\sigma_3,\sigma_4)$. Then $\Sigma_N = \Sigma_{N_1}\times\Sigma_{N_2}$ and the two projections $\pi_k:\Sigma_N\to\Sigma_{N_k}$, $k = 1, 2$, act in the following way: $\pi_1(\sigma_1,\sigma_2,\sigma_3,\sigma_4) = (\sigma_1,\sigma_2)$ and $\pi_2(\sigma_1,\sigma_2,\sigma_3,\sigma_4) = (\sigma_3,\sigma_4)$.)

Our main result is the following:


Theorem 1. Let the covariance matrices $C_N$ fulfill the condition:
\[
c_N(\sigma,\tau) - \frac{N_1}{N}\,c_{N_1}(\pi_1(\sigma),\pi_1(\tau)) - \frac{N_2}{N}\,c_{N_2}(\pi_2(\sigma),\pi_2(\tau)) \le 0, \tag{17}
\]
for every $N \ge \tilde N$, every $(\sigma,\tau)\in\Sigma_N\times\Sigma_N$ and every decomposition $N_1 + N_2 = N$. Then the thermodynamical limit exists, in the sense that
\[
\lim_{N\to\infty}\frac{1}{N}\,\mathrm{Av}(\log Z_N(\beta)) = \sup_{N}\frac{1}{N}\,\mathrm{Av}(\log Z_N(\beta)). \tag{18}
\]
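Condition (17) is a finite set of inequalities for each N, so it can be checked exhaustively on small systems. The sketch below is illustrative only: the helper names are hypothetical, and for brevity it tests only splits into the first $N_1$ and last $N_2$ coordinates rather than every partition. It runs the check for the overlap-power covariances $q^p$ and for the REM covariance (16).

```python
import itertools


def overlap(a, b):
    """Overlap of two equal-length spin tuples."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def pspin(p):
    """Covariance c(sigma, tau) = q(sigma, tau)^p."""
    return lambda a, b: overlap(a, b) ** p


def rem(a, b):
    """REM covariance: Kronecker delta of the two configurations."""
    return 1.0 if a == b else 0.0


def condition_17_holds(cov, n, tol=1e-12):
    """Check (17) for all pairs in Sigma_n and all splits first n1 / last n2 coordinates."""
    configs = list(itertools.product((-1, 1), repeat=n))
    for n1 in range(1, n):
        n2 = n - n1
        for s in configs:
            for t in configs:
                lhs = cov(s, t)
                rhs = (n1 / n) * cov(s[:n1], t[:n1]) + (n2 / n) * cov(s[n1:], t[n1:])
                if lhs - rhs > tol:
                    return False
    return True


if __name__ == "__main__":
    for name, cov in (("p=2", pspin(2)), ("p=3", pspin(3)), ("p=4", pspin(4)), ("REM", rem)):
        print(name, condition_17_holds(cov, 4))
```

For N = 4 the check passes for p = 2, p = 4 and the REM, and fails for p = 3, in line with the restriction to even p: for odd p the function $x\mapsto x^p$ is not convex on $[-1,1]$.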

Remark 2. The result (18) can be extended to the almost-everywhere convergence of the free energy density, internal energy and ground state energy with elementary probability methods (see [GuTo]).

Remark 3. The conditions (17) are not necessary. The proof itself will show that we only need the sign of the quantity on the left-hand side of (17) on average, not pointwise. Moreover the condition (1) can be replaced by a more general small deviation vanishing for large N, and (2) by a uniform (in N) bound on the diagonal terms. We plan to return to such a general case elsewhere.

Remark 4. It is still an interesting open question whether the class of models whose thermodynamical limit we control has, in that limit, the properties axiomatically introduced by Ruelle in [Ru1] to define the infinite particle systems directly. On this point see [BS, BoKu1, BoKu2, BoKu3].

2. Proof

Within this section it is useful to consider 2 identical copies of the same system: system 1 is assigned the Hamiltonian $H(\sigma)$ and system 2 the Hamiltonian $H(\tau)$.

Definition 4. The quenched measure over the two copies $\langle - \rangle$ is defined by
\[
\langle - \rangle = \mathrm{Av}\Big( [Z(\beta,E)]^{-2} \sum_{(\sigma,\tau)\in\Sigma_N\times\Sigma_N} (-)\; e^{-\beta(H(\sigma)+H(\tau))} \Big). \tag{19}
\]

The definition may of course be generalized to r copies. We now want to embed a Gaussian system $\{E_\sigma\}_{\Sigma_K}$ into a larger one $\{E_\tau\}_{\Sigma_L}$ for some K < L. In particular we want to embed two of them, of size $N_1$ and $N_2$, into one of size $N = N_1 + N_2$. Our embedding procedure is defined in terms of the two canonical projections $\pi_j$, $j = 1, 2$, from $\Sigma_N$ to $\Sigma_{N_j}$ given in Definition 3.

Definition 5. Given the family $\{E_\mu\}_{\Sigma_{N_1}}$ of size $N_1$ we lift it to one of size N, $\{E^{(1)}_\sigma\}_{\Sigma_N}$, defining
\[
E^{(1)}_\sigma \overset{D}{=} E_{\pi_1(\sigma)}. \tag{20}
\]
Moreover, starting from $\{E_\mu\}_{\Sigma_{N_2}}$ we define in the same way $\{E^{(2)}_\sigma\}_{\Sigma_N}$ by
\[
E^{(2)}_\sigma \overset{D}{=} E_{\pi_2(\sigma)}. \tag{21}
\]
Having defined each family $\{E_\sigma\}_{\Sigma_N}$, $\{E^{(1)}_\sigma\}_{\Sigma_{N_1}}$ and $\{E^{(2)}_\sigma\}_{\Sigma_{N_2}}$, we specify their joint distribution requiring mutual independence.

Remark 5. The embedded Gaussian systems $\{E^{(1)}_\sigma\}_{\Sigma_{N_1}}$ and $\{E^{(2)}_\sigma\}_{\Sigma_{N_2}}$ are degenerate: in fact for all σ and τ such that $\pi_1(\sigma) = \pi_1(\tau)$,
\[
E^{(1)}_\sigma = E^{(1)}_\tau. \tag{22}
\]
Summarizing, we define the joint measure of $\{E_\sigma\}_{\Sigma_N}$, $\{E^{(1)}_\sigma\}_{\Sigma_{N_1}}$ and $\{E^{(2)}_\sigma\}_{\Sigma_{N_2}}$ as $d\hat P = dP\,dP_1\,dP_2$, defined by the three covariances $C_N$, $C_{N_1}$ and $C_{N_2}$.

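Proof of Theorem 1. We proceed in three lemmas.

Lemma 0. Interpolation. Given a pair $(\pi_1,\pi_2)$ as before, following [GuTo], we pick three independent Gaussian systems $E^{(j)}_{\pi_j(\sigma)}$, $j = 0, 1, 2$, and introduce the quantity (with $\pi_0(\sigma) = \sigma$ and $N_0 = N$)
\[
H_{(N,N_1,N_2)}(\sigma,t) := -\sum_{j=0}^{2}\sqrt{t_j N_j}\;E^{(j)}_{\pi_j(\sigma)}, \tag{23}
\]
where $t_0 = t$ and $t_1 = t_2 = (1-t)$, and the corresponding partition sum
\[
Z_N(t,\beta) := \sum_{\sigma\in\Sigma_N} e^{-\beta H_{(N,N_1,N_2)}(\sigma,t)}. \tag{24}
\]
It is now easy to see that:
\[
Z_N(1,\beta) = Z_N(\beta), \tag{25}
\]
and
\[
\begin{aligned}
Z_N(0,\beta) &= \sum_{\sigma\in\Sigma_N} e^{\beta\left(\sqrt{N_1}E^{(1)}_{\pi_1(\sigma)} + \sqrt{N_2}E^{(2)}_{\pi_2(\sigma)}\right)} \\
&= \sum_{\tau\in\Sigma_{N_2}}\;\sum_{\sigma\in\Sigma_N;\ \pi_2(\sigma)=\tau} e^{\beta\left(\sqrt{N_1}E^{(1)}_{\pi_1(\sigma)} + \sqrt{N_2}E^{(2)}_{\tau}\right)} \\
&= \sum_{\tau\in\Sigma_{N_2}} e^{\beta\sqrt{N_2}E^{(2)}_\tau}\sum_{\gamma\in\Sigma_{N_1}} e^{\beta\sqrt{N_1}E^{(1)}_\gamma} \\
&= Z_{N_1}(\beta)\cdot Z_{N_2}(\beta).
\end{aligned} \tag{26}
\]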
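The factorization (26) holds for any fixed realization of the two lifted energy families, so it can be checked directly. The sketch below is illustrative: the helper names are hypothetical and the projections are taken onto the first $N_1$ and last $N_2$ coordinates. It compares the sum over $\Sigma_N$ with the product of the two smaller partition sums.

```python
import itertools
import math
import random


def random_energies(n, rng):
    """One standard Gaussian energy per configuration in Sigma_n, keyed by the spin tuple."""
    return {s: rng.gauss(0.0, 1.0) for s in itertools.product((-1, 1), repeat=n)}


def z_single(energies, beta, n):
    """Z_n(beta) = sum_sigma exp(beta * sqrt(n) * E_sigma)."""
    return sum(math.exp(beta * math.sqrt(n) * e) for e in energies.values())


def z_interpolated_at_zero(e1, e2, beta, n1, n2):
    """Z_N(0, beta): sum over Sigma_N with the energies lifted through pi_1 and pi_2."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n1 + n2):
        lifted = math.sqrt(n1) * e1[sigma[:n1]] + math.sqrt(n2) * e2[sigma[n1:]]
        total += math.exp(beta * lifted)
    return total


if __name__ == "__main__":
    rng = random.Random(3)
    n1, n2, beta = 2, 3, 0.7
    e1, e2 = random_energies(n1, rng), random_energies(n2, rng)
    print(z_interpolated_at_zero(e1, e2, beta, n1, n2))
    print(z_single(e1, beta, n1) * z_single(e2, beta, n2))
```

The two printed numbers coincide up to rounding, because each $\sigma\in\Sigma_N$ corresponds to exactly one pair $(\pi_1(\sigma),\pi_2(\sigma))$.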

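Lemma 1. Boundedness. The Jensen inequality
\[
\mathrm{Av}(\log Z) \le \log(\mathrm{Av}(Z)) \tag{27}
\]
implies
\[
\frac{1}{N}\,\mathrm{Av}(\log Z_N(\beta)) \le \log(2) + \frac{\beta^2}{2} \tag{28}
\]
because by (6) $\mathrm{Av}(Z) = 2^N e^{\frac{\beta^2}{2}N}$ after performing the Gaussian integration.

Lemma 2. Monotonicity. Taking the t derivative of the logarithm of (24) we get (here we abbreviate $H_{N,N_1,N_2} = H$):
\[
\frac{d}{dt}\log Z_N(t) = \frac{\beta}{Z_N(t)}\sum_{\sigma\in\Sigma_N}\sum_{k=0}^{2}\epsilon_k\sqrt{\frac{N_k}{t_k}}\;E^{(k)}_{\pi_k(\sigma)}\,e^{-\beta H(\sigma,t)}, \tag{29}
\]
where $\epsilon_0 = 1$ and $\epsilon_1 = \epsilon_2 = -1$.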
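A short bookkeeping step shows where the factor $\sqrt{N_k/t_k}$ and the signs $\epsilon_k$ come from. It is sketched here under the assumption that the interpolating Hamiltonian (23) carries the weights $\sqrt{t_k N_k}$ with $t_0 = t$, $t_1 = t_2 = 1-t$:

```latex
\[
\frac{d}{dt}\sqrt{t_k N_k}
  = \frac{dt_k}{dt}\cdot\frac{1}{2}\sqrt{\frac{N_k}{t_k}}
  = \frac{\epsilon_k}{2}\sqrt{\frac{N_k}{t_k}},
\qquad
\epsilon_k = \frac{dt_k}{dt} \in \{+1,-1,-1\}.
\]
```

With this normalization a factor 1/2 accompanies every term of the t-derivative. Since it is a positive constant common to all three terms, it rescales the derivative but cannot change its sign, which is all that the monotonicity argument uses.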


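We now use the integration by parts formula for correlated Gaussian variables $\{\xi_i\}$ with covariance $c_{i,j}$, which states
\[
\mathrm{Av}\big(\xi_j\cdot f\big) = \sum_{k=1}^{n} c_{j,k}\cdot\mathrm{Av}\Big(\frac{\partial f}{\partial\xi_k}\Big). \tag{30}
\]
This yields
\[
\begin{aligned}
\frac{1}{\beta}\frac{d}{dt}\mathrm{Av}\log Z_N(t)
&= \sum_{\sigma\in\Sigma_N}\sum_{k=0}^{2}\epsilon_k\sqrt{\frac{N_k}{t_k}}\;\mathrm{Av}\Big(\frac{E^{(k)}_{\pi_k(\sigma)}\,e^{-\beta H}}{Z_N(t)}\Big) \\
&= \sum_{\sigma\in\Sigma_N}\sum_{k=0}^{2}\epsilon_k\sqrt{\frac{N_k}{t_k}}\;\mathrm{Av}\Big[\sum_{\tau_k\in\Sigma_{N_k}} c_{N_k}(\pi_k(\sigma),\tau_k)\cdot\frac{\partial}{\partial E^{(k)}_{\tau_k}}\Big(\frac{e^{-\beta H}}{Z_N(t)}\Big)\Big].
\end{aligned} \tag{31}
\]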
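The integration by parts formula (30) can be sanity-checked by simulation. The sketch below is an illustration, not the authors' computation: the test function $f(\xi) = \sin(\xi_1 + \xi_2/2)$, the sample size and the function name are arbitrary choices. It samples a correlated Gaussian pair through a hand-written 2x2 Cholesky factor and compares both sides of (30) for j = 1.

```python
import math
import random


def stein_check(c11, c12, c22, samples=100_000, seed=1):
    """Monte Carlo estimates of Av(xi_1 f) and sum_k c_{1k} Av(df/dxi_k).

    Here f = sin(xi_1 + xi_2 / 2), so df/dxi_1 = cos(u) and df/dxi_2 = cos(u) / 2.
    """
    rng = random.Random(seed)
    # Cholesky factor of the covariance [[c11, c12], [c12, c22]]
    l11 = math.sqrt(c11)
    l21 = c12 / l11
    l22 = math.sqrt(c22 - l21 ** 2)
    lhs = rhs = 0.0
    for _ in range(samples):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = l11 * g1
        x2 = l21 * g1 + l22 * g2
        u = x1 + 0.5 * x2
        lhs += x1 * math.sin(u)
        rhs += (c11 * 1.0 + c12 * 0.5) * math.cos(u)
    return lhs / samples, rhs / samples


if __name__ == "__main__":
    print(stein_check(1.0, 0.5, 1.0))
```

For this choice both averages should be close to $(c_{11} + c_{12}/2)\,e^{-s^2/2}$ with $s^2 = c_{11} + c_{12} + c_{22}/4$, the variance of $\xi_1 + \xi_2/2$.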

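Given now $\tau_k\in\Sigma_{N_k}$ fixed, we calculate
\[
\begin{aligned}
\frac{\partial}{\partial E^{(k)}_{\tau_k}}\Big(\frac{e^{-\beta H(\sigma,t)}}{Z_N(t)}\Big)
&= \frac{\beta\sqrt{N_k t_k}\;\delta^{\pi_k(\sigma)}_{\tau_k}\,e^{-\beta H(\sigma,t)}\cdot Z_N(t) - e^{-\beta H(\sigma,t)}\cdot\dfrac{\partial Z_N}{\partial E^{(k)}_{\tau_k}}}{Z_N^2(t)} \\
&= \beta\;\frac{\sqrt{N_k t_k}\;\delta^{\pi_k(\sigma)}_{\tau_k}\,e^{-\beta H(\sigma,t)}\cdot Z_N(t) - \sqrt{N_k t_k}\;e^{-\beta H(\sigma,t)}\cdot\sum_{\xi\in\Sigma_N,\ \pi_k(\xi)=\tau_k} e^{-\beta H(\xi,t)}}{Z_N^2(t)}.
\end{aligned}
\]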

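The term with k = 0 in formula (31) is easy to calculate and we get:
\[
\begin{aligned}
&N\beta\,\mathrm{Av}\Big[\sum_{\sigma\in\Sigma_N}\sum_{\tau\in\Sigma_N} c_N(\sigma,\tau)\Big(\delta^{\sigma}_{\tau}\,\frac{e^{-\beta H(\sigma,t)}}{Z_N} - \sum_{\xi\in\Sigma_N}\delta^{\tau}_{\xi}\,\frac{e^{-\beta(H(\xi,t)+H(\sigma,t))}}{Z_N^2}\Big)\Big] \\
&\quad= N\beta\,\mathrm{Av}\Big[\sum_{\sigma\in\Sigma_N} c_N(\sigma,\sigma)\cdot\frac{e^{-\beta H(\sigma,t)}}{Z_N} - \sum_{(\sigma,\tau)\in\Sigma_N\times\Sigma_N} c_N(\sigma,\tau)\,\frac{e^{-\beta(H(\tau,t)+H(\sigma,t))}}{Z_N^2}\Big] \\
&\quad= N\beta\,\langle 1 - c_N(\sigma,\tau)\rangle_t,
\end{aligned} \tag{32}
\]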

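where $\langle - \rangle_t$ is the quenched measure with respect to the Hamiltonian (23). In the same way for the term k = 1 (and similarly for k = 2) we obtain:
\[
\begin{aligned}
&N_1\beta\,\mathrm{Av}\Big[\sum_{\sigma\in\Sigma_N}\sum_{\tau\in\Sigma_{N_1}} c_{N_1}(\pi_1(\sigma),\tau)\Big(\delta^{\tau}_{\pi_1(\sigma)}\,\frac{e^{-\beta H(\sigma,t)}}{Z_N} - \sum_{\xi\in\Sigma_N}\delta^{\tau}_{\pi_1(\xi)}\,\frac{e^{-\beta(H(\xi,t)+H(\sigma,t))}}{Z_N^2}\Big)\Big] \\
&\quad= N_1\beta\,\langle 1 - c_{N_1}(\pi_1(\sigma),\pi_1(\tau))\rangle_t.
\end{aligned} \tag{33}
\]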

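Summing up the three contributions we obtain:
\[
\frac{1}{N}\frac{d}{dt}\mathrm{Av}(\log Z_N(t)) = -\beta^2\Big\langle c_N(\sigma,\tau) - \frac{N_1}{N}\,c_{N_1}(\pi_1(\sigma),\pi_1(\tau)) - \frac{N_2}{N}\,c_{N_2}(\pi_2(\sigma),\pi_2(\tau))\Big\rangle_t, \tag{34}
\]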


and, by the hypothesis (17):
\[
\frac{d}{dt}\mathrm{Av}(\log Z_N(t)) \ge 0. \tag{35}
\]
Formula (35) together with the boundary conditions (25) and (26) gives for every $N_1 + N_2 = N$,
\[
\alpha_N \ge \frac{N_1}{N}\,\alpha_{N_1} + \frac{N_2}{N}\,\alpha_{N_2}. \tag{36}
\]
This entails Theorem 1 as explained for instance in [Ru2]: by (36) the sequence $N\alpha_N$ is superadditive, and by Lemma 1 $\alpha_N$ is bounded, so the limit exists and equals the supremum.

Remark 6. Lemma 2 is indeed a particular case of a theorem by J.-P. Kahane [K] (see also [LT], Theorem 3.11, p. 74). The Gaussian process X of [K] can in fact be identified with our Gaussian process $\sqrt{N}E$, and the process Y with our process $\sqrt{N_1}E^{(1)} + \sqrt{N_2}E^{(2)}$. The further identifications $A \equiv \Sigma_N\times\Sigma_N$, $B = \emptyset$, $f \equiv \ln Z$ immediately entail that Hypothesis (1) of [K] reduces to (17) and Assertion (3) to our formula (36), because Hypothesis (2) is just convexity of $\ln Z$.

3. Examples

3.1. The SK and even p-spin models. For the sake of completeness we recover here the Guerra-Toninelli result [GuTo]. First note that by the definition (10) we have
\[
q_N(\sigma,\tau) - \frac{N_1}{N}\,q_{N_1}(\pi_1(\sigma),\pi_1(\tau)) - \frac{N_2}{N}\,q_{N_2}(\pi_2(\sigma),\pi_2(\tau)) = 0, \tag{37}
\]
so that (17) holds as an equality for p = 1 (the random field model). By (36) this means that the random field model free energy density does not depend on the size: $\alpha_N = \alpha_1$. For p = 2u (SK corresponds to u = 1) formula (37) together with the convexity of the function $x\mapsto x^{2u}$ implies (17):
\[
q_N^{2u}(\sigma,\tau) - \frac{N_1}{N}\,q_{N_1}^{2u}(\pi_1(\sigma),\pi_1(\tau)) - \frac{N_2}{N}\,q_{N_2}^{2u}(\pi_2(\sigma),\pi_2(\tau)) \le 0. \tag{38}
\]
For the standard p-spin model defined as
\[
E_\sigma = \sqrt{\frac{p!}{2N^p}}\sum_{i_1<\dots<i_p} J_{i_1,\dots,i_p}\sigma_{i_1}\cdots\sigma_{i_p} \tag{39}
\]
we refer to [GuTo].

3.2. The REM. The model is defined by:
\[
\mathrm{Av}(E_\sigma E_{\sigma'}) = \delta_{\sigma,\sigma'}. \tag{40}
\]
Condition (17) is verified because it becomes
\[
\delta_{\sigma,\sigma'} \le \frac{N_1}{N}\,\delta_{\pi_1(\sigma),\pi_1(\sigma')} + \frac{N_2}{N}\,\delta_{\pi_2(\sigma),\pi_2(\sigma')}. \tag{41}
\]
In fact if $\sigma = \sigma'$ the previous formula is an identity. If $\sigma\neq\sigma'$ the left-hand side is 0 but the right-hand side is not always zero. Let us take for instance $\sigma = (+,+)$ and $\sigma' = (+,-)$, so that $\pi_1(+,+) = +$, $\pi_1(+,-) = +$, $\pi_2(+,+) = +$, $\pi_2(+,-) = -$. In that case the left-hand side is zero and the right-hand side is 1/2.


3.3. The GREM. To show the inclusion in our scheme of the Derrida-Gardner GREM [DeGa] let us first recall its construction. The GREM considers $2^N$ Gaussian random energies $H(\mu) = \sqrt{N}E_\mu$. Their covariance is specified after the assignment of a rooted tree with n layers and $2^N$ leaves, n < N. The root furcates into $\alpha_1^N$ branches, the vertices at the end of the first layer furcate into $\alpha_2^N$ branches etc., up to the vertices at the end of the (n − 1)-th layer, which $\alpha_n^N$-furcate into the $2^N$ leaves.

Remark 7. The topological constraint over the successive furcations which end up on $2^N$ leaves implies $\prod_{i=1}^{n}\alpha_i^N = 2^N$. Each $\alpha_i^N$ is an integer which by the previous formula divides $2^N$. By the fundamental theorem of arithmetic $\alpha_i^N = 2^{k_i}$. Here $k_i$, $i = 1,\dots,n$, is a non-negative integer, and $k_1 + k_2 + \dots + k_n = N$. In other words: given any tree with $2^N$ leaves the construction allows only for furcations in powers of 2 at each layer. The coefficients $\alpha_i$ must depend on N: in fact, $\alpha_i = 2^{k_i/N}$ and the only N-independent choice of the vector α is obtained for $k_i = N/l_i$, where the integers $l_i$ have to divide N for all N. Hence they must fulfill the constraint $\sum_{i=1}^{n} 1/l_i = 1$, which is impossible.

The previous remark allows us to associate to each leaf μ a spin configuration $\{\sigma_1,\sigma_2,\dots,\sigma_N\}$. This can be done observing that the $\alpha_1(N)^N = 2^{k_1(N)}$ branches emerging from the root identify canonically the configurations of $k_1$ spins, the successive branches the configurations of $k_2$ spins and so on. We have in this way associated to each leaf either a path (the only one joining the root to it) or a spin configuration. The model is finally specified by the formula $E(\mu) = \sum_{i=1}^{n}\varepsilon_i^{(\mu)}$, where the $\varepsilon_i$ are drawn according to n Gaussians with $\mathrm{Av}(\varepsilon_i) = 0$ and $\mathrm{Av}[(\varepsilon_i)^2] = a_i$: to each branch of the tree we associate an independent ε whose distribution depends (through its variance) only on the layer at which the branch starts. Defining $v^{(l)} = \sum_{i=1}^{l-1} a_i$ ($v^{(0)} = 0$ and $v^{(1)} = 1$), it is immediate to prove that if two paths μ and ν merge at the level l we have $\mathrm{Av}(E_\mu E_\nu) = v^{(l)}$. For fixed n and N this construction is exactly the Derrida-Gardner process over a tree $T_{n,N}$; we will denote it $\{E, T_{n,N}\}$.

Theorem 1 entails existence of the thermodynamical limit for the GREM, in the sense that if $\{E, T_{n,N}\}$ is assigned for a given n and all N > n, and the sequence of $\{T_{n,N}\}$ is increasing, i.e. $k_i(N)\ge k_i(M)$ for $N\ge M$, then its free energy density is (at fixed n) decreasing (and bounded) in N. To show this assertion, starting from a process $\{E, T_{n,N_1}\}$ we build the process $\{E^{(1)}_{\pi_1}, T_{n,N}\}$ with $N = N_1 + N_2$ in the following way: at each vertex of the tree $T_{n,N_1}$ sitting on the layer i we increase the multiplicity of the furcation by a factor $2^{k_i(N)-k_i(N_1)}$, assigning the same value $\varepsilon_i^{(1)}$ to all newly introduced branches. By construction the new process will enjoy the property
\[
\mathrm{Av}\big(E^{(1)}_{\pi_1(\sigma)}E^{(1)}_{\pi_1(\tau)}\big) \ge v^{(l)}. \tag{42}
\]
We apply the same construction to build $\{E^{(2)}_{\pi_2}, T_{n,N}\}$ and we have
\[
\mathrm{Av}\big(E^{(2)}_{\pi_2(\sigma)}E^{(2)}_{\pi_2(\tau)}\big) \ge v^{(l)}. \tag{43}
\]
It is now straightforward to verify that conditions (42) and (43) imply (17).
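The tree covariance described in this section can be written down explicitly for small N. The following sketch rests on stated assumptions rather than on the paper's exact conventions: leaves are labelled by spin tuples, layer i reads a block of $k_i$ consecutive spins, and the covariance of two leaves is the sum of the variances $a_i$ over the leading layers on which their paths coincide. All names are hypothetical.

```python
import itertools


def grem_covariance(sigma, tau, block_sizes, variances):
    """Sum of a_i over the leading layers on which the two leaves follow the same branch.

    Assumption: layer i corresponds to the next block of k_i spins; the paths
    separate at the first block on which sigma and tau differ.
    """
    cov, start = 0.0, 0
    for k, a in zip(block_sizes, variances):
        if sigma[start:start + k] != tau[start:start + k]:
            break
        cov += a
        start += k
    return cov


def all_leaves(n):
    """All 2^n leaves, labelled by spin configurations."""
    return list(itertools.product((-1, 1), repeat=n))


if __name__ == "__main__":
    blocks, a = (1, 2), (0.4, 0.6)   # n = 2 layers, N = 3 spins, variances summing to 1
    s, t = (1, 1, 1), (1, -1, 1)
    print(grem_covariance(s, s, blocks, a))   # identical leaves
    print(grem_covariance(s, t, blocks, a))   # leaves sharing only the first layer
```

With variances summing to one the diagonal entries equal 1, as required by (2), and the resulting covariance is hierarchical: for any three leaves, $c(\sigma,\tau)\ge\min(c(\sigma,\rho),c(\rho,\tau))$.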


Acknowledgements. One of us (P.C.) thanks Francesco Guerra for useful conversations and Michael Aizenman for introducing him to the Correlated Gaussian Random Energy Models. We also thank the referees and Anton Bovier for interesting observations and for pointing out the reference [K]. This work has been partially supported by the EC RTN-HPRN-CT-2000-00103 (Mathematical Aspects of Quantum Chaos) and by Università di Bologna, Funds for Selected Research Topics.

References

[BS] Bolthausen, E., Sznitman, A.S.: On Ruelle’s probability cascades and an abstract cavity method. Commun. Math. Phys. 197, 247–276 (1998)
[Bo] Bovier, A.: Statistical Mechanics of Disordered Systems. MaPhySto Lecture Notes 10, Aarhus, 2001
[BoKu1] Bovier, A., Kurkova, I.: Derrida’s generalized random energy models. 3. Models with continuous hierarchies. Preprint 729, U. Paris 6, 2002
[BoKu2] Bovier, A., Kurkova, I.: Derrida’s generalized random energy models. 2. Gibbs measures and probability cascades. Preprint 728, U. Paris 6, 2002
[BoKu3] Bovier, A., Kurkova, I.: Derrida’s generalized random energy models. 1. Poisson cascades and extremal processes. Preprint 727, U. Paris 6, 2002
[De1] Derrida, B.: Random energy model: limit of a family of disordered system. Phys. Rev. Lett. 45, 79 (1980)
[De2] Derrida, B.: Random energy model: an exactly solvable model of disordered system. Phys. Rev. B 24, 2613 (1981)
[DeGa] Derrida, B., Gardner, E.: Solution of the generalised random energy model. J. Phys. C 19, 2253–2274 (1986)
[GuTo] Guerra, F., Toninelli, F.L.: The thermodynamical limit in mean field spin glass model. Commun. Math. Phys. 230, 71–79 (2002)
[K] Kahane, J.-P.: Une inégalité du type de Slepian et Gordon sur les processus gaussiens. Israel J. Math. 55, 109–110 (1985)
[LT] Ledoux, M., Talagrand, M.: Probability on Banach spaces. Berlin-Heidelberg-New York: Springer Verlag, 1990
[Ru1] Ruelle, D.: A Mathematical reformulation of Derrida’s REM and GREM. Commun. Math. Phys. 108, 225–239 (1987)
[Ru2] Ruelle, D.: Statistical Mechanics. Rigorous results. New York: W.A. Benjamin Inc., 1969

Communicated by M. Aizenman

Commun. Math. Phys. 236, 65–92 (2003) Digital Object Identifier (DOI) 10.1007/s00220-002-0791-3


A Complete 2D Stability Analysis of Fast MHD Shocks in an Ideal Gas

Yuri Trakhinin1,2

1 Sobolev Institute of Mathematics, Koptyuga pr. 4, 630090 Novosibirsk, Russia
2 Department of Mathematics, University of Hull, Cottingham Road, Hull, HU6 7RX, UK. E-mail: [email protected]

Received: 12 March 2002 / Accepted: 8 November 2002 Published online: 10 February 2003 – © Springer-Verlag 2003

Abstract: An algorithm of numerical testing of the uniform Lopatinski condition for linearized stability problems for 1-shocks is suggested. The algorithm is used for finding the domains of uniform stability, neutral stability, and instability of planar fast MHD shocks. A complete stability analysis of fast MHD shock waves is first carried out in two space dimensions for the case of an ideal gas. Main results are given for the adiabatic constant γ = 5/3 (mono-atomic gas), that is most natural for the MHD model. The cases γ = 7/5 (two-atomic gas) and γ > 5/3 are briefly discussed. Not only the domains of instability and linear (in the usual sense) stability, but also the domains of uniform stability, for which a corresponding linearized stability problem satisfies the uniform Lopatinski condition, are numerically found for different given angles of inclination of the magnetic field behind the shock to the planar shock front. As is known, uniform linearized stability implies the nonlinear stability, that is local existence of discontinuous shock front solutions of a quasilinear system of hyperbolic conservation laws. 1. Introduction Shocks in ideal media, where dissipative mechanisms (e.g., viscosity or heat conduction) can be neglected, are usually viewed as surfaces of strong discontinuity. In a pure form, such an inviscid treatment is possible only for evolutionary (compressive) shock waves, for which a corresponding linearized stability problem (LSP) is correctly posed according to the number of boundary conditions (linearized Rankine-Hugoniot relations). The first question arising in the study of shocks is, of course, that of their real existence. It is clear that only the shocks being stable against small perturbations can exist. 
At the same time, such a linear (or linearized) stability does not still guarantee the existence of a shock wave as a physical structure (nonlinear stability), i.e., the existence (at least, short-time) of discontinuous shock front solutions of a quasilinear system of hyperbolic conservation laws governing the fluid motion. In this connection, the concept of stability should be determined more accurately.


Analyzing the linearized stability of gas dynamical shocks, D’yakov [24] was the first to find linearly stable strong discontinuities for which small perturbations of a planar discontinuity front and of a steady uniform fluid flow neither grow nor decay with time. The domain of parameters of the LSP where this property holds was called in [24] the domain of spontaneous sound radiation by the discontinuity. Later, spontaneously radiating strong discontinuities also came to be called neutrally stable (see, e.g., [25]). It turns out that one cannot judge the real existence of neutrally stable discontinuities on the linear level. To understand the connection between linearized and nonlinear stability we should use such rigorous mathematical notions as the Lopatinski condition (LC) and the uniform Lopatinski condition (ULC), which can be introduced for a LSP just as was done by Kreiss [40] for (standard) initial boundary value problems (IBVP) for linear hyperbolic systems (see [47, 49, 11, 22]). In accordance with the fulfillment or violation of the LC and the ULC, the whole domain of admissible parameters of a LSP consists of the following subdomains:

1. The domain of fulfillment of the ULC (uniform stability domain);
2. The domain of fulfillment of the general LC and violation of the ULC (neutral stability domain);
3. The domain where the LC is violated, i.e., the LSP is ill-posed (instability domain).

Besides, the union of domains 1 and 2 is referred to as the weak stability domain. Uniformly stable shocks are distinguished by the existence of a priori estimates without loss of smoothness for a corresponding LSP. Such estimates in Sobolev spaces $W_2^2$ were first deduced by Blokhin [7, 8], by the dissipative integrals techniques [11, 22], for gas dynamical shock waves.
Later, by utilizing Kreiss’ symmetrizer techniques [40] and (generalized) pseudodifferential calculus, Majda [47, 49] extended the results of [40] to the LSP for Lax k-shocks [45, 37, 53] and proved the equivalence of the ULC and the existence of a priori estimates without loss of smoothness in $L_{2,\eta}$ weighted Sobolev norms. The passage to nonlinear stability was first done by Blokhin for gas dynamical shocks. Namely, in [9, 10] (see also [11, 22]), by a straightforward adaptation of the dissipative integrals techniques to the quasilinear case, he proved the theorem on the short-time $W_2^s$-existence and uniqueness of discontinuous shock front solutions to the gas dynamics system, where $s\ge 3$ as in Kato’s short-time existence theorem [38] for the Cauchy problem, and the shock is supposed to be uniformly stable according to linear analysis. Using pseudodifferential calculus, Majda [48, 49] proved the theorem on the short-time $W_{2,\eta}^s$-existence ($s\ge 10$ for 3D) of discontinuous shock fronts for a system of conservation laws that satisfies some block structure conditions. Note that these conditions have recently been shown by Métivier [51] to be satisfied for a class of hyperbolic symmetrizable systems with constant multiplicities. This class contains, for example, the gas dynamics equations, the Maxwell equations, the equations of linear elasticity, etc. However, for the MHD system Majda’s block structure conditions seem to require separate verification (or their generalization to prove a short-time existence theorem like Majda’s [48, 49]). Thus, in the generic case we cannot deduce, with full mathematical rigor, nonlinear stability (in the sense above) from the uniform linearized stability. On the other hand, one can show that in domain 1 an ill-posedness example of Hadamard type cannot be constructed for the LSP, nor for any close problem obtained by perturbation of the system and the boundary conditions.
That is to say, it would not be a mistake to assert, with a certain degree of strictness, that domain 1 is the domain, or a part of the domain, of nonlinear stability.


The nonlinear stability domain can apparently be supplemented by a part of domain 2 (for shocks there are no corresponding results up to now; see however [54] for the study of a certain IBVP in nonlinear elastodynamics). On the other hand, one can unfortunately say nothing about the real existence of neutrally stable shocks according to linear analysis. Moreover, as was shown in [47, 11], a corresponding LSP has only a priori estimates with loss of smoothness that cannot be carried over the nonlinear level. Thus, the existence of neutrally stable shocks has to be analyzed in the initial nonlinear statement. This is a very complicated problem that seems to be far from its resolution. In view of the above reasoning, for shocks it is of great importance to find not only the domain of their linear (weak) stability but also the uniform stability domain. Unlike gas dynamics, where the question on the stability of shock waves has been fully investigated, [24, 35, 39, 26, 7–11, 47–49], even for an arbitrary state equation of a gas (at least, in the linear statement and in the nonlinear one for uniformly stable shocks), in MHD the analogous problem is not completely resolved up to now. Recall that in MHD there are two types of evolutionary shocks: fast and slow [1]. The stability of fast parallel and perpendicular shock waves (the magnetic field is supposed to be parallel or perpendicular to the normal to the shock front) was studied by Gardner and Kruskal [30]. They have derived a condition for the weak stability of such shocks and shown that it holds for an ideal gas with the adiabatic constant γ < 3. Lessen and Deshpande [46] have numerically found some 2D instability domains for slow MHD shocks in an ideal gas with γ = 5/3 (ill-posedness examples for the LSP were constructed numerically). Analogous, but more complete, results were obtained by Filippova in [27] where stability was studied in the general case of three space dimensions. 
In [27] instability domains were found also for fast shocks (in an ideal gas with γ = 5/3). Looking ahead, we note that, according to the results of the present paper, in [27] only a part of the whole instability domain for fast shocks was found. Filippova [27] has also shown that the examination of stability against 3D perturbations does not enlarge the instability domain found for fast shocks in the case of 2D perturbations, i.e., in the study of the weak stability of fast shocks one can restrict oneself for the 2D stability analysis. The stability of MHD shock waves in an ideal gas for the asymptotic cases of a weak magnetic field and a strong magnetic field was analyzed by Blokhin and Druzhinin [15, 16] (see also [12]). They have shown that fast shocks are weakly stable under a weak magnetic field whereas slow shocks are unstable under a strong magnetic field (without restrictions to γ ). Besides, in [15, 16] the uniform stability of fast shock waves, as tested there by the dissipative integrals techniques, has been proved for the cases of parallel and perpendicular shocks. This result was extended by Blokhin and Trakhinin [20] to the general case of an arbitrary inclination of the vector of magnetic field to the planar shock front, i.e., the uniform stability of fast MHD shocks in an ideal gas under a weak magnetic field was shown. In [21] Blokhin and Trakhinin have refined the results of [30] and established that fast parallel MHD shock waves in an ideal gas are always weakly stable irrespective of the adiabatic constant γ . But, it seems to be more important that in [21] a necessary and sufficient condition for the uniform stability of fast parallel shocks was found (the fact of the possibility of the existence of neutrally stable fast parallel MHD shocks in an ideal gas has been first pointed out by Egorushkin and Kulikovskii [25]). 
Notice also that the restriction γ < 3 of [30] was removed by Anile and Russo [3] for perpendicular shocks, i.e., they have shown the weak stability of fast perpendicular MHD shocks. Finally, we observe that in [56] a necessary and sufficient condition for the uniform stability of fast parallel shocks was derived for an arbitrary state equation in the more general case of relativistic MHD.


In [22], the extreme importance of the ability to test the LC and the ULC for LSPs for strong discontinuities in different models of continuum mechanics was emphasized. Note that in practice it is often impossible to verify analytically the LC (not to mention the ULC). In this connection, in [22] the attention of researchers was drawn to the problem of constructing effective numerical algorithms for testing the LC and the ULC. This would enable one to find numerically the domains of uniform/neutral stability and instability for strong discontinuities (in particular, for shock waves). First attempts in this direction were made in [46, 27] where the instability of MHD shocks was being proved numerically. At the same time, as was underlined above, it is much more important to be able to find numerically the domains of neutral and uniform stability. It is the question on the numerical finding of uniform stability domains that is regarded as of paramount importance in this paper. The main difficulty in the test of the LC and the ULC is often connected (e.g., for MHD shocks) with the fact that, even on the first step, one cannot analytically find the roots of a dispersion relation for a linearized hyperbolic system (of course, in the multidimensional case). Having no representation for these roots, we are not able to write out the Lopatinski determinant [40]. It turns out that such a difficulty can be overcome for Lax 1-shocks, for which the Lopatinski determinant can always be computed analytically. True, it will depend on an additional complex parameter that is a certain root of the dispersion relation. The method to obtain an equivalent form for the LC for 1-shocks was suggested by Gardner and Kruskal [30] (although, they did not use the mathematical terminology connected with the Kreiss-Lopatinski condition). In [21, 56, 57], by utilizing ideas of [30], an equivalent form was given also for the ULC for 1-shocks. 
It is the equivalent form that will be the basis for an algorithm of numerical testing the ULC suggested in the present work. The paper is organized as follows. In Sect. 2 we give the mathematical statement of the LSP for fast MHD shocks and describe the domain of its admissible parameters. In Sect. 3, the equivalent forms for the LC and the ULC for 1-shocks (actually, for a wider class of hyperbolic problems having the 1-shock property) are given. Then, we suggest an algorithm for numerical testing the LC and the ULC based on these equivalent forms. The algorithm is tested on two basic examples: the first of which is the IBVP in the half-plane for the wave equation, the second one is the LSP in the 2D case for fast parallel MHD shocks in an ideal gas for which uniform and neutral stability domains were analytically found in [21]. Observe that in this work we consider only the case of two space dimensions. For the 3D case the algorithm suggested has no fundamental difference, but it requires more calculation times. Therefore, a further refinement of the algorithm is necessary, which is the point of future research. Section 4 consists of the basic results of the paper. Namely, by using the suggested numerical algorithm, the domains of uniform stability, neutral stability, and instability of fast MHD shocks in an ideal gas are found. The main consideration is given for the case of one-atomic gas (γ = 5/3). From the physical point of view this case is most natural for the MHD model. However, we briefly discuss also the cases γ = 7/5 (two-atomic gas) and γ > 5/3. The results of computations are presented for different given angles ϕ of inclination of the magnetic field behind the shock to the planar shock front. The evolution of stability properties from parallel to perpendicular shocks is analyzed (the angle ϕ increases from 0◦ to 90◦ ). Section 5 is devoted to the discussion of open problems and contains concluding remarks.


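2. The LSP for Fast MHD Shocks

2.1. Equations of ideal MHD and shock waves. The MHD system governing the motion of an ideal fluid can be written in the conservative form (see [33, 42, 41, 30])
\[
\rho_t + \mathrm{div}\,(\rho\mathbf{v}) = 0, \tag{1}
\]
\[
(\rho\mathbf{v})_t + \mathrm{div}\,\Big(\rho\,\mathbf{v}\otimes\mathbf{v} - \frac{1}{4\pi}\,\mathbf{H}\otimes\mathbf{H}\Big) + \nabla\Big(p + \frac{|\mathbf{H}|^2}{8\pi}\Big) = 0, \tag{2}
\]
\[
\mathbf{H}_t - \mathrm{rot}\,(\mathbf{v}\times\mathbf{H}) = 0, \tag{3}
\]
\[
\Big(\rho E + \rho\,\frac{|\mathbf{v}|^2}{2} + \frac{|\mathbf{H}|^2}{8\pi}\Big)_t + \mathrm{div}\,\Big(\rho\mathbf{v}\Big(E + \frac{|\mathbf{v}|^2}{2} + pV\Big) + \frac{1}{4\pi}\,\mathbf{H}\times(\mathbf{v}\times\mathbf{H})\Big) = 0. \tag{4}
\]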

Here $\rho$, $\mathbf{v} = (v_1, v_2, v_3)^*$, $\mathbf{H} = (H_1, H_2, H_3)^*$, $p$, and $E$ are the density, the fluid velocity, the magnetic field, the pressure, and the internal energy respectively (the asterisk stands for transposition); $V = 1/\rho$ is the specific volume. The temperature $T$ and the entropy $S$ satisfy the Gibbs relation $T\,dS = dE + p\,dV$, which implies the thermodynamical equalities

\[ p = -\Big(\frac{\partial E}{\partial V}\Big)_S = \rho^2\Big(\frac{\partial E}{\partial \rho}\Big)_S , \qquad T = \Big(\frac{\partial E}{\partial S}\Big)_\rho \]

(for the sake of brevity, we will below write $E_\rho$ instead of $(\partial E/\partial\rho)_S$, $E_S$ instead of $(\partial E/\partial S)_\rho$, etc.). Thus, with a state equation of the medium, $E = E(\rho, S)$, we can regard (1)–(4) as a closed system for finding the vector $U = (p, S, \mathbf{v}^*, \mathbf{H}^*)$. Besides, Eqs. (1)–(4) should be supplemented by the divergence constraint $\operatorname{div}\mathbf{H} = 0$, which is, as a matter of fact, an additional requirement on the initial data for system (1)–(4). Finally, system (1)–(4) implies the additional conservation law (entropy conservation)

\[ (\rho S)_t + \operatorname{div}(\rho S\mathbf{v}) = 0 , \tag{5} \]

which holds on smooth solutions. It is the conservation law (5) that was used by Godunov [31] for the symmetrization of the MHD system (1)–(4). Following [12], the MHD equations can be rewritten as the symmetric system

\[ A_0(U)U_t + \sum_{k=1}^{3} A_k(U)U_{x_k} = 0 , \tag{6} \]

where $A_0 = \operatorname{diag}\big(1/(\rho c^2), 1, \rho, \rho, \rho, 1/(4\pi), 1/(4\pi), 1/(4\pi)\big)$ is a diagonal matrix, $A_k$ are symmetric matrices which can be easily written out (see [12]), and $c^2 = (\rho^2 E_\rho)_\rho$ is the square of the sound velocity. The quasilinear system (6) is symmetric t-hyperbolic


Yu. Trakhinin

(in the sense of Friedrichs [29]) if, as in gas dynamics, the following natural assumptions (hyperbolicity conditions) hold:

\[ \rho > 0 , \qquad c^2 > 0 \tag{7} \]

($A_0 > 0$). In addition, we impose on the MHD system the natural physical restrictions

\[ p > 0 , \qquad T > 0 . \tag{8} \]

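Consider piecewise smooth solutions to system (1)–(4) with smooth parts separated by a surface of strong discontinuity with the equation

\[ \tilde f(t, \mathbf{x}) = x_1 - f(t, \mathbf{x}') = 0 \tag{9} \]

($\mathbf{x} = (x_1, \mathbf{x}')$, $\mathbf{x}' = (x_2, x_3)$). As is known (see, e.g., [43, 37, 49]), on surface (9) certain jump conditions should hold for the limit values of solutions to the system of conservation laws ahead of ($\tilde f \to -0$) and behind ($\tilde f \to +0$) the discontinuity front (sometimes, by analogy with gas dynamics, they are called Rankine-Hugoniot conditions). The MHD Rankine-Hugoniot conditions have the form (see, e.g., [33, 42, 41])

\[ [j] = 0 , \qquad [H_N] = 0 , \qquad j[v_N] + [p] + \frac{1}{8\pi}\big[|\mathbf{H}|^2\big] = 0 , \tag{10} \]

\[ j[\mathbf{v}_\tau] = \frac{H_N}{4\pi}[\mathbf{H}_\tau] , \qquad H_N[\mathbf{v}_\tau] = j[V\mathbf{H}_\tau] , \tag{11} \]

\[ j\Big[E + \frac{|\mathbf{v}|^2}{2} + \frac{|\mathbf{H}|^2}{8\pi\rho}\Big] + \Big[\Big(p + \frac{|\mathbf{H}|^2}{8\pi}\Big)v_N - \frac{H_N}{4\pi}(\mathbf{H}, \mathbf{v})\Big] = 0 . \tag{12} \]

Here $j = \rho(v_N - D_N)$ is the mass transfer flux across the discontinuity surface;

\[ \mathbf{N} = \frac{1}{|\nabla\tilde f|}(1, -f_{x_2}, -f_{x_3})^* , \qquad D_N = -\frac{\tilde f_t}{|\nabla\tilde f|} = \frac{f_t}{|\nabla\tilde f|} \]

are the unit normal to the discontinuity front and the discontinuity speed in the normal direction;

\[ |\nabla\tilde f| = (1 + f_{x_2}^2 + f_{x_3}^2)^{1/2} , \quad H_N = (\mathbf{H}, \mathbf{N}) , \quad v_N = (\mathbf{v}, \mathbf{N}) , \quad \mathbf{v}_\tau = (v_{\tau_1}, v_{\tau_2})^* , \quad \mathbf{H}_\tau = (H_{\tau_1}, H_{\tau_2})^* , \]

\[ v_{\tau_i} = (\mathbf{v}, \boldsymbol{\tau}_i) , \quad H_{\tau_i} = (\mathbf{H}, \boldsymbol{\tau}_i) , \quad \boldsymbol{\tau}_1 = (f_{x_2}, 1, 0)^* , \quad \boldsymbol{\tau}_2 = (f_{x_3}, 0, 1)^* , \quad (\boldsymbol{\tau}_i, \mathbf{N}) = 0 , \quad i = 1, 2 ; \]

$[g] = g - g_\infty$ denotes the jump for every regularly discontinuous function $g$ with corresponding values behind ($g := g|_{\tilde f \to +0}$) and ahead ($g_\infty := g|_{\tilde f \to -0}$) of the discontinuity front (here and below the subindex $\infty$ stands for boundary values ahead of the shock front).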


MHD shocks. Recall that if $j \neq 0$ and $[\rho] \neq 0$, then a strong discontinuity is called a shock wave (for the detailed classification of MHD discontinuities see, e.g., [33, 42, 41, 12]). For shock waves, instead of (12) one can use the condition (the MHD analog of the Hugoniot adiabat; see, e.g., [41])

\[ [E] + \frac{p + p_\infty}{2}[V] + \frac{|[\mathbf{H}]|^2}{16\pi}[V] = 0 , \]

which can be rewritten as

\[ p = \mathcal{H}(V, g, p_\infty, V_\infty) , \tag{13} \]

with $g = |[\mathbf{H}]|^2/(16\pi)$. As was shown by Iordanskii [36] (see also [42]), the compulsory physical condition of entropy increase through the MHD shock discontinuity,

\[ S > S_\infty , \tag{14} \]

is equivalent to the compressibility conditions

\[ \rho > \rho_\infty , \qquad p > p_\infty \tag{15} \]

if the Bethe condition [6]

\[ V_{pp} = \Big(\frac{\partial^2 V}{\partial p^2}\Big)_S > 0 \tag{16} \]

holds together with the additional assumption on the positiveness of the thermal coefficient, $E_p > 0$. Although condition (16) is not thermodynamical (see [43]), it usually holds, for example, for an ideal gas, which obeys the state equation

\[ E = \frac{pV}{\gamma - 1} , \tag{17} \]

where $\gamma > 1$ is the adiabatic constant. For an ideal gas, conditions (16) and $E_p > 0$ reduce to the valid inequalities $(\gamma + 1)V/(\gamma^2 p^2) > 0$ and $V/(\gamma - 1) > 0$. For the physical admissibility of MHD shocks in an ideal gas one should thus require the fulfillment of the compressibility conditions (15).

Evolutionarity conditions. Unlike gas dynamics, in MHD, even for an ideal gas, the entropy increase condition (15) does not guarantee that a shock is evolutionary [42, 43], i.e., that the corresponding LSP is correctly posed with respect to the number of boundary conditions. That is, admissible MHD shocks must moreover satisfy the evolutionarity conditions [42, 43] (they are necessary but not sufficient for stability). Evolutionary MHD shocks (fast and slow [42, 41]) are known to be Lax k-shocks. Consider a planar stationary MHD shock with the equation $x_1 = 0$. Let, without loss of generality, $v_{1\infty} > 0$ and $H_1 \ge 0$ (it follows from (10)–(12) and (15) that $H_1 = H_{1\infty}$ and $v_{1\infty} > v_1$). The matrix $A_0^{-1}A_1$ (cf. (6)) has the following eigenvalues, $\lambda_1 \le \dots \le \lambda_8$:

\[ \lambda_{1,8} = v_1 \mp c_M^+ , \qquad \lambda_{2,7} = v_1 \mp c_A , \qquad \lambda_{3,6} = v_1 \mp c_M^- , \qquad \lambda_{4,5} = v_1 \tag{18} \]


(the eigenvalues of the matrix $A_{0\infty}^{-1}A_{1\infty}$ for system (6) ahead of the planar shock have an analogous form). Here $c_A = H_1/\sqrt{4\pi\rho}$ is the Alfvén velocity [2], and

\[ c_M^\pm = \frac{1}{\sqrt{2}}\left\{ c^2 + \frac{|\mathbf{H}|^2}{4\pi\rho} \pm \left[ \Big(c^2 + \frac{|\mathbf{H}|^2}{4\pi\rho}\Big)^2 - \frac{H_1^2}{\pi\rho}\,c^2 \right]^{1/2} \right\}^{1/2} \]

are the fast and slow magnetosonic velocities. It is easily verified that $c_M^- \le c_A \le c_M^+$. For a planar stationary discontinuity the Lax shock conditions [45, 37, 47, 53], which guarantee evolutionarity, have the form

\[ \lambda_k(A_0^{-1}A_1) < 0 < \lambda_k(A_{0\infty}^{-1}A_{1\infty}) , \qquad \lambda_{k-1}(A_{0\infty}^{-1}A_{1\infty}) < 0 < \lambda_{k+1}(A_0^{-1}A_1) . \tag{19} \]

In this work we shall focus on fast MHD shock waves, which are known to be 1-shocks. For them, with regard to (18), (19), the velocities $v_{1\infty}$ and $v_1$ (ahead of and behind the shock) should satisfy the inequalities (evolutionarity conditions)

\[ v_{1\infty} > c_{M\infty}^+ , \qquad c_A < v_1 < c_M^+ . \tag{20} \]
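As a quick numerical illustration of (18)–(20), the following Python sketch evaluates the Alfvén and magnetosonic speeds and tests the fast-shock inequalities (20). It is only a sketch under stated assumptions: the function names `mhd_speeds` and `is_fast_shock` are hypothetical and do not come from the paper, and Gaussian units are assumed, as in (1)–(4).

```python
import math

def mhd_speeds(rho, c, H1, H_abs):
    """Return (c_A, c_M_minus, c_M_plus) for density rho, sound speed c,
    normal field component H1 and field magnitude H_abs (Gaussian units)."""
    cA2 = H1**2 / (4.0 * math.pi * rho)      # Alfven speed squared
    b2 = H_abs**2 / (4.0 * math.pi * rho)    # |H|^2 / (4 pi rho)
    total = c**2 + b2
    # 4 c^2 c_A^2 equals c^2 H1^2 / (pi rho), as in the formula for c_M
    disc = math.sqrt(max(total**2 - 4.0 * c**2 * cA2, 0.0))
    cM_plus = math.sqrt(0.5 * (total + disc))
    cM_minus = math.sqrt(max(0.5 * (total - disc), 0.0))
    return math.sqrt(cA2), cM_minus, cM_plus

def is_fast_shock(v1_inf, cM_plus_inf, v1, cA, cM_plus):
    """Evolutionarity inequalities (20) for a fast (1-)shock."""
    return v1_inf > cM_plus_inf and cA < v1 < cM_plus
```

A useful sanity check on such a routine is the identity $c_M^- c_M^+ = c\,c_A$, which follows directly from the formula for $c_M^\pm$.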

As was already noted above, unlike gas dynamics, in MHD the Lax entropy conditions (20) do not provide, in general, the fulfillment of the physical entropy condition (14) (and vice versa).

2.2. Solvability of the MHD Rankine-Hugoniot conditions for physically admissible shocks. Let us now discuss the existence of solutions to the jump conditions (10)–(12) for a planar stationary MHD shock satisfying the entropy increase condition (14) and the evolutionarity inequalities (20) (fast shock). The domain of existence of such solutions satisfying also the natural requirements (7), (8) is the domain of admissible parameters (denoted below as the domain $D$) of the LSP for fast MHD shock waves (the LSP will be formulated below). Let us consider a piecewise constant solution to the MHD system for the case of two space dimensions (we shall consider only 2D perturbations; see Sect. 1). Such a solution,

\[ U(t, \mathbf{x}) = \begin{cases} \hat U_\infty = (\hat p_\infty, \hat S_\infty, \hat v_{1\infty}, \hat v_{2\infty}, \hat H_{1\infty}, \hat H_{2\infty})^* , & x_1 < 0 ; \\ \hat U = (\hat p, \hat S, \hat v_1, \hat v_2, \hat H_1, \hat H_2)^* , & x_1 > 0 , \end{cases} \tag{21} \]

should satisfy, at $x_1 = 0$, the jump conditions (10)–(12):

\[ \frac{\hat\rho}{\hat\rho_\infty} = \frac{\hat v_{1\infty}}{\hat v_1} , \qquad \hat H_1 = \hat H_{1\infty} , \qquad [\hat v_1\hat H_2] = \hat H_1[\hat v_2] , \]

\[ [\hat v_1] + \frac{[\hat p]}{\hat j} + \frac{[|\hat{\mathbf{H}}|^2]}{8\pi\hat j} = 0 , \qquad [\hat v_2] = \frac{\hat H_1}{4\pi\hat j}[\hat H_2] , \]

\[ \Big[\hat E + \frac{|\hat{\mathbf{v}}|^2}{2} + \frac{\hat p}{\hat\rho} + \frac{|\hat{\mathbf{H}}|^2}{4\pi\hat\rho}\Big] - \frac{\hat H_1}{4\pi\hat j}\big[(\hat{\mathbf{v}}, \hat{\mathbf{H}})\big] = 0 . \tag{22} \]

Here $\hat\rho$, $\hat S$, $\hat v_{1,2}$, $\hat\rho_\infty$, $\hat S_\infty$, $\hat v_{1\infty,2\infty}$, $\hat H_{1\infty,2\infty}$ are constants (here and below all the hat values stand for parameters of the uniform discontinuous flow); $\hat j = \hat\rho\hat v_1 \neq 0$ ($\hat v_1 > 0$, $\hat v_{1\infty} > 0$), $\hat E = E(\hat\rho, \hat S)$;

\[ \hat p = \hat\rho^2 E_\rho(\hat\rho, \hat S) , \qquad \hat c^2 = 2\hat\rho E_\rho(\hat\rho, \hat S) + \hat\rho^2 E_{\rho\rho}(\hat\rho, \hat S) , \qquad \hat p_\infty = \hat\rho_\infty^2 E_\rho(\hat\rho_\infty, \hat S_\infty) , \]

\[ \hat c_\infty^2 = 2\hat\rho_\infty E_\rho(\hat\rho_\infty, \hat S_\infty) + \hat\rho_\infty^2 E_{\rho\rho}(\hat\rho_\infty, \hat S_\infty) , \qquad \hat V = 1/\hat\rho , \qquad \hat V_\infty = 1/\hat\rho_\infty . \]

From here on, we consider only an ideal gas, cf. (17). As was pointed out above, for an ideal gas the compressibility conditions (15) provide the entropy increase (14). But the point is that, unlike gas dynamics, in MHD solutions of the jump conditions (22) for planar shocks do not always satisfy the compressibility conditions (15) together with the evolutionarity inequalities (20). Actually, resolving system (22) in the class of piecewise constant solutions satisfying properties (7), (8), (15), and (20) is a technically rather difficult problem. It was analyzed, in particular, by Kulikovsky and Lyubimov [41]. However, the domain $D$ of existence of such solutions can be found numerically (see below). Introduce the dimensionless parameters [15, 16, 12]

\[ \mathbf{h} = (h_1, h_2)^* = \frac{\hat{\mathbf{H}}}{\sqrt{4\pi\gamma\hat p}} , \qquad \mathbf{h}_\infty = (h_1, h_{2\infty})^* , \qquad h_{2\infty} = \frac{\hat H_{2\infty}}{\sqrt{4\pi\gamma\hat p}} , \]

\[ q = |\mathbf{h}| , \qquad q_\infty = |\mathbf{h}_\infty| , \qquad l = \frac{h_1}{q} , \qquad R = \frac{\hat\rho}{\hat\rho_\infty} , \qquad P = \frac{\hat p_\infty}{\gamma\hat p} ; \]

$M = \hat v_1/\hat c$ is the downstream Mach number, and $M_0 = \hat v_1/\hat c_M^+$ is the downstream fast Mach number. Then the Alfvén and magnetosonic velocities read

\[ \hat c_A = \hat c\,h_1 , \qquad \hat c_M^\pm = \frac{\hat c}{\sqrt{2}}\sqrt{1 + q^2 \pm \sqrt{(1 + q^2)^2 - 4h_1^2}} \]

(for an ideal gas $\hat c = \sqrt{\gamma\hat p/\hat\rho}$), and the evolutionarity inequalities (20) take the form

\[ 0 < M_0 < 1 , \tag{23} \]

\[ M > lq , \tag{24} \]

\[ \gamma P + q_\infty^2 + \sqrt{(\gamma P + q_\infty^2)^2 - 4l^2q^2\gamma P} < 2M^2R , \tag{25} \]

where, in view of (8) and (15), the parameters $R$ and $P$ should satisfy

\[ R > 1 , \qquad 0 < P < \frac{1}{\gamma} . \tag{26} \]

To begin with, consider the special case of parallel fast shocks: $\hat H_1 > 0$, $\hat H_2 = \hat H_{2\infty} = 0$ ($l = 1$, $m = h_2/q = 0$); i.e., the magnetic field is supposed to be parallel to the normal to the shock front. For this case the jump conditions (22), except the equality $\hat H_1 = \hat H_{1\infty}$, do not depend on the magnetic field and thus coincide with the corresponding ones in gas dynamics, which for an ideal gas implies

\[ R = \frac{(\gamma - 1)M^2 + 2}{(\gamma + 1)M^2} , \qquad P = \frac{1}{\gamma} + \frac{2(M^2 - 1)}{\gamma + 1} . \tag{27} \]
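Relations (27) are easy to check numerically. The short Python sketch below (the function names are hypothetical, introduced only for illustration) evaluates $R$ and $P$ for a given downstream Mach number and tests requirements (26); at $M^2 = (\gamma - 1)/(2\gamma)$ it reproduces $P = 0$ and the maximum compression $R = (\gamma + 1)/(\gamma - 1)$.

```python
import math

def parallel_shock_RP(M, gamma):
    """Compression R = rho/rho_inf and scaled upstream pressure
    P = p_inf/(gamma*p) from (27) for downstream Mach number M."""
    R = ((gamma - 1.0) * M**2 + 2.0) / ((gamma + 1.0) * M**2)
    P = 1.0 / gamma + 2.0 * (M**2 - 1.0) / (gamma + 1.0)
    return R, P

def satisfies_26(M, gamma):
    """Check requirements (26): R > 1 and 0 < P < 1/gamma."""
    R, P = parallel_shock_RP(M, gamma)
    return R > 1.0 and 0.0 < P < 1.0 / gamma
```

For example, with $\gamma = 5/3$ the value $M = 0.7$ passes the check, while $M = 0.4$ (negative $P$) and $M = 1.2$ ($R < 1$) do not.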


If $q = h_1 < 1$, then $\hat c_M^- = \hat c_A = \hat c\,h_1$, and the fast Mach number coincides with the usual one: $M_0 = M$. If $q > 1$, then $\hat c_M^- = \hat c$, $\hat c_A = \hat c_M^+ = \hat c\,h_1$, which contradicts the evolutionarity inequalities. So, $q < 1$ and solution (27) satisfies the compulsory conditions (26), provided the inequalities $(\gamma - 1)/(2\gamma) < M^2 < 1$ hold. Accounting also for (23)–(25), we find the domain $D$ of physically admissible parameters for fast parallel MHD shock waves:

\[ \sqrt{\frac{\gamma - 1}{2\gamma}} < M < 1 , \qquad 0 < q < M . \tag{28} \]

For the generic case, when $\hat H_{2\infty} \neq 0$, $\hat H_2 \neq 0$, introduce the dimensionless parameter $\chi = \hat H_{2\infty}/\hat H_2$ (by the way, $q_\infty^2 = (l^2 + \chi^2m^2)q^2$) measuring the competition between the tangential components of the magnetic field ahead of and behind the shock wave. The jump conditions (22) yield the relations

\[ (1 - \chi R)M^2 = l^2(1 - \chi)q^2 , \tag{29} \]

\[ P = (1 - R)M^2 + \frac{1}{\gamma} + \frac{m^2}{2}(1 - \chi^2)q^2 , \tag{30} \]

\[ \frac{2}{\gamma - 1}\Big(\frac{1}{\gamma} - PR\Big) + (1 - R)\Big(\frac{1}{\gamma} + P + \frac{m^2}{2}(1 - \chi)^2q^2\Big) = 0 . \tag{31} \]

By (26) and (29), $\chi \in (0, 1)$ (i.e., for fast shocks $\hat H_2 > \hat H_{2\infty}$; see also [42]). In [16, 12, 20] the case of a weak magnetic field, when $q \ll 1$, was analyzed. For this case system (29)–(31) has the solution [12]:

\[ \chi = \frac{(\gamma + 1)M_0^2}{2 + (\gamma - 1)M_0^2} + O(q^2) , \qquad R = \frac{1}{\chi} + O(q^2) , \qquad P = \frac{1}{\gamma} + \frac{2(M_0^2 - 1)}{\gamma + 1} + O(q^2) , \]

which satisfies (23)–(26) if

\[ \sqrt{\frac{\gamma - 1}{2\gamma}} < M_0 < 1 . \tag{32} \]

That is, for a weak magnetic field the domain $D$ is described by inequalities (32). Observe that the point

\[ M_0 = M_{0\min} = \sqrt{\frac{\gamma - 1}{2\gamma}} \]

is that of maximum compression, for which, as in gas dynamics, $R = (\gamma + 1)/(\gamma - 1)$. Consider now the case of arbitrary $q$. Let $\varphi$ be the angle of inclination of the magnetic field behind the shock to the normal to the shock front, i.e., $\cos\varphi = l$, $\sin\varphi = m$. For given $\gamma$ and $\varphi$ the domain $D$ is determined by two parameters: $M_0$ and $q$. Indeed, from system (29)–(31) one gets for $\chi$ the equation [12]:

\[ a_3\chi^3 + a_2\chi^2 + a_1\chi + a_0 = 0 , \tag{33} \]


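with

\[ a_3 = q^4l^2 , \qquad a_2 = q^4l^2(\gamma - 1) - q^2(\gamma - 2)M^2 , \]

\[ a_1 = (M^2 - l^2q^2)\left\{ \frac{2 + (\gamma - 1)M^2}{m^2} + q^2\gamma - q^2\,\frac{(\gamma + 1)l^2}{m^2} \right\} , \qquad a_0 = -\frac{\gamma + 1}{m^2}\,(M^2 - l^2q^2)^2 . \]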

Then, for given $M_0$ and $q$ we find the root $\chi$ of Eq. (33) with the property $0 < \chi < 1$ (such a root is unique). After that, from relations (29), (30) we find $P$ and $R$. One can show that if $0 < \chi < 1$, $M_0 < 1$, $M > lq$, and $P > 0$, then the compressibility conditions $R > 1$, $P < 1/\gamma$ as well as the evolutionarity inequality (25) ($\hat v_{1\infty} > \hat c_{M\infty}^+$) are automatically fulfilled. Thus, for given $\gamma$ and $\varphi$, to find the domain $D$ in the plane of parameters $M_0$ and $q$ one should draw the lines $M_0 = 1$, $q = 0$ and the curves $M = lq$, $P = 0$ (provided that $\chi \in (0, 1)$). These curves were plotted using MAPLE® software. The plots show that in the plane $(M_0, q)$ the curve $P = 0$ always lies below the curve $M = lq$ (the domains $P > 0$ and $M > lq$ are situated under these curves). Note however that for $\varphi < 50^\circ$ these curves lie close to each other in a certain region. For example, at the scale of Fig. 1a (where $\gamma = 5/3$, $\varphi = 40^\circ$) the curves $M = lq$ and $P = 0$ merge in a certain region, but a rescaling shows that the curve $P = 0$ is, actually, always below the curve $M = lq$. The curves $M = lq$ and $P = 0$ have vertical asymptotes at $M_0 = l$ and $M_0 = 1$ respectively. In Fig. 1a one can see that from a certain point on the curve $P = 0$ "leaves" the curve $M = lq$; and, for example, in Fig. 1b, with $\varphi = 60^\circ$, these curves are already rather far from each other. As the final result, we conclude that the boundaries of the domain $D$ in the plane of parameters $M_0$ and $q$ are the lines $M_0 = 1$, $q = 0$ and the "thermodynamical" curve $P = 0$ (recall that the inequality $P > 0$ expresses the natural thermodynamical requirement that the pressure should be positive). Besides, for $\varphi = 0$ (parallel shock) the domain $D$ is bounded (it is the trapezoid determined by inequalities (28)), whereas for $\varphi > 0$ it at once becomes unbounded, and the curve $P = 0$ has the vertical asymptote at $M_0 = 1$.
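The procedure just described (solve the cubic (33) for $\chi \in (0,1)$, then recover $R$ and $P$ from (29), (30) and test $M_0 < 1$, $M > lq$, $P > 0$) can be sketched in Python as follows. This is an illustrative sketch, not the MAPLE code used in the paper: all function names are hypothetical, a plain bisection on $(0, 1)$ is used as the root finder (relying on the uniqueness of the root stated in the text), and an oblique shock ($m \neq 0$) is assumed.

```python
import math

def downstream_mach_sq(M0, q, l):
    """M^2 expressed through the fast Mach number M0 (M = M0 * c_M^+ / c)."""
    return M0**2 * 0.5 * (1 + q**2 + math.sqrt((1 + q**2)**2 - 4 * l**2 * q**2))

def cubic_coeffs(M2, q, l, m, gamma):
    """Coefficients a3, a2, a1, a0 of Eq. (33); requires m != 0."""
    A = M2 - l**2 * q**2
    a3 = q**4 * l**2
    a2 = q**4 * l**2 * (gamma - 1) - q**2 * (gamma - 2) * M2
    a1 = A * ((2 + (gamma - 1) * M2) / m**2 + gamma * q**2
              - (gamma + 1) * l**2 * q**2 / m**2)
    a0 = -(gamma + 1) * A**2 / m**2
    return a3, a2, a1, a0

def chi_root(coeffs, tol=1e-14):
    """Bisection for the root of (33) in (0, 1); None if there is no sign change."""
    a3, a2, a1, a0 = coeffs
    poly = lambda x: ((a3 * x + a2) * x + a1) * x + a0
    lo, hi = 0.0, 1.0
    if poly(lo) * poly(hi) >= 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poly(lo) * poly(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def shock_state(M0, q, phi_deg, gamma):
    """Return (chi, R, P, M), or None if (M0, q) fails the admissibility tests."""
    l = math.cos(math.radians(phi_deg))
    m = math.sin(math.radians(phi_deg))
    M2 = downstream_mach_sq(M0, q, l)
    chi = chi_root(cubic_coeffs(M2, q, l, m, gamma))
    if chi is None:
        return None
    R = (M2 - l**2 * q**2 * (1 - chi)) / (M2 * chi)                    # from (29)
    P = (1 - R) * M2 + 1 / gamma + 0.5 * m**2 * (1 - chi**2) * q**2    # (30)
    M = math.sqrt(M2)
    if not (0 < M0 < 1 and M > l * q and P > 0):
        return None
    return chi, R, P, M
```

Scanning `shock_state` over a grid of $(M_0, q)$ values for fixed $\gamma$ and $\varphi$ gives a rough picture of the domain $D$; substituting the returned values into (31) is a convenient consistency check.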
Near the line $M_0 = 1$ the shock is weak ($R$ is close to 1 and $P$ is close to $1/\gamma$), i.e., for rather large $q$ (strong magnetic field) fast MHD shock waves are weak. At the same time, the point of intersection of the curve $P = 0$ and the line $q = 0$ is that of maximum compression, $M_0 = M_{0\min}$, for which, as in gas dynamics, $R = (\gamma + 1)/(\gamma - 1)$.

Fig. 1a, b. The domain $D$ for $\gamma = 5/3$ in the plane $(M_0, q)$: (a) $\varphi = 40^\circ$, (b) $\varphi = 60^\circ$. Each panel shows the curves $M = lq$ and $P = 0$.

To conclude the discussion of the domain of admissible parameters, we note that in MHD one often uses the dimensionless parameter $\beta = 4\pi\hat p_\infty/|\hat{\mathbf{H}}_\infty|^2$ (the so-called $\beta$ value of the plasma), which is the ratio of the gas dynamical and magnetic pressures ahead of the shock (the analogous downstream value is $1/(\gamma q^2)$). In terms of the dimensionless parameters utilized in this paper,

\[ \beta = \frac{P}{q_\infty^2} = \frac{P}{q^2(l^2 + m^2\chi^2)} . \]

In [27] instability domains were found for given $\beta$ in the plane of parameters $\varphi_\infty$ and $\xi$, where

\[ \varphi_\infty = \arccos\frac{1}{\sqrt{1 + \chi^2\tan^2\varphi}} , \qquad \xi = M_A^2 - 1 = \frac{M^2}{l^2q^2} - 1 , \]

$\varphi_\infty$ is the angle of inclination of the magnetic field ahead of the shock, and $M_A = \hat v_1/\hat c_A$ is the downstream Alfvén Mach number.

Remark 1. Unlike the study in [27], we do not examine here switch-on MHD shocks, which obey the conditions $\hat H_{2\infty} = 0$, $\hat H_2 \neq 0$. Such MHD shocks are known to be overcompressive and thus cannot be treated as strong discontinuities. They can be considered only as viscous profiles [58] (overcompressive MHD shocks were studied, e.g., in [28, 59]). Moreover, even if we consider a switch-on MHD shock as a strong discontinuity, this discontinuity is characteristic ($\det A_1 = 0$) and, hence, we may not apply to it the method of constructing Hadamard-type ill-posedness examples for a LSP suggested by Gardner and Kruskal [30].

2.3. Setting of the stability problem. To set the LSP for fast MHD shock waves we linearize the MHD system and the MHD Rankine-Hugoniot conditions about the piecewise constant solution (21) (recall that we consider the 2D case). Let solution (21) satisfy the 1-shock inequalities (20) and the compressibility conditions (15), i.e., the planar MHD shock is supposed to be a physically admissible fast shock wave. Without loss of generality we choose a reference frame in which $\hat v_2 = 0$. Taking this into account and linearizing the MHD system (1)–(4) (in 2D), we obtain, in the half-plane $x_1 > 0$, the magnetoacoustic system (in a dimensionless form [12])

\[
\begin{aligned}
& Lp + \operatorname{div}\mathbf{v} = 0 , \qquad LS = 0 , \\
& M^2Lv_1 + \frac{\partial p}{\partial x_1} + h_2\frac{\partial H_2}{\partial x_1} - h_2\frac{\partial H_1}{\partial x_2} = 0 , \\
& M^2Lv_2 + \frac{\partial p}{\partial x_2} + h_1\frac{\partial H_1}{\partial x_2} - h_1\frac{\partial H_2}{\partial x_1} = 0 , \\
& LH_1 + h_1\frac{\partial v_2}{\partial x_2} - h_2\frac{\partial v_1}{\partial x_2} = 0 , \qquad LH_2 + h_2\frac{\partial v_1}{\partial x_1} - h_1\frac{\partial v_2}{\partial x_1} = 0
\end{aligned}
\tag{34}
\]

for the vector of small perturbations $U = (p, S, v_1, v_2, H_1, H_2)^*$ (in order to simplify the notation we again indicate the perturbations by the same letters as in the nonlinear case).


Here $L = \partial/\partial t + \partial/\partial x_1$, and we use the following scaled values: $\mathbf{x}' = \mathbf{x}/\hat l$ ($\hat l$ is the characteristic length), $t' = t\hat v_1/\hat l$, $p' = p/(\hat\rho\hat c^2)$, $S' = S/\hat S$, $\mathbf{v}' = \mathbf{v}/\hat v_1$, $\mathbf{H}' = \mathbf{H}/(\hat c\sqrt{4\pi\hat\rho})$ (the primes in (34) were removed). Analogously one can write out the magnetoacoustic system ahead of the shock, i.e., for $x_1 < 0$. But, for fast shock waves, in view of the 1-shock conditions (20), this system has no outgoing characteristic modes. Hence, without loss of generality one can assume that there are no perturbations ahead of the shock: $U \equiv 0$ for $x_1 < 0$. Linearizing then also the jump conditions (10)–(12) (in 2D) and accounting for (13), we obtain the LSP (in a dimensionless form) for fast MHD shock waves [12].

Problem 1 (LSP for fast MHD shocks). In the domain $t > 0$, $\mathbf{x} = (x_1, x_2) \in \mathbb{R}^2_+$ we seek a solution to system (34) satisfying the boundary conditions

\[
\begin{aligned}
& v_1 + b_1p + b_2H_1 + b_3H_2 = 0 , \qquad S = b_4p + b_5H_2 , \qquad F_t = b_6p + b_7H_1 + b_8H_2 , \\
& v_2 = b_9F_{x_2} + b_{10}p + b_{11}H_1 + b_{12}H_2 , \qquad H_1 = [h_2]F_{x_2} , \qquad H_2 = [h_2]F_t + h_1v_2 - h_2v_1
\end{aligned}
\tag{35}
\]

at $x_1 = 0$ ($t > 0$, $x_2 \in \mathbb{R}$) and the initial data

\[ U(0, \mathbf{x}) = U_0(\mathbf{x}) , \quad \mathbf{x} \in \mathbb{R}^2_+ , \qquad F(0, x_2) = F_0(x_2) , \quad x_2 \in \mathbb{R} \tag{36} \]

for t = 0. Here b1 =

1+a , 2M 2

b4 =

M2 − a , M2

b7 = −b2

a=

−ρˆ 2 vˆ12

lq , 2M 2

b3 =

mq − M 2 b5 , 2M 2

ˆ pˆ ∞ , Vˆ∞ ) mq(1 − χ )aHg (Vˆ , g, , 2M 2

b6 =

R(1 − a) , 2M 2 (1 − R)

HV (Vˆ , g, ˆ pˆ ∞ , Vˆ∞ )

b5 = −

2−R , 1−R b11 = −

b8 =

b2 = −

R(mq + M 2 b5 ) , 2M 2 (1 − R)

b7 M 2 w∞ + mq , RM 2

w∞ =

,

lm(χ − 1)q 2 , M2

b9 = R − 1 ,

b12 =

b10 = −

b6 w ∞ , R

lqR − b8 2M 2 w∞ , RM 2

[h2 ] = h2 − h2∞ ;

$F = \delta f = F(t, x_2)$ is a small displacement of the planar shock front $x_1 = 0$. For an ideal gas, which obeys the state equation (17), the values $a$ and $\mathcal{H}_g(\hat V, \hat g, \hat p_\infty, \hat V_\infty)$ read:

\[ a = \frac{M^2\big(\gamma + 1 - R(\gamma - 1)\big)}{2 + M^2(1 - R)(\gamma - 1) + q^2m^2(1 - \chi)(\gamma - 1)} , \qquad \mathcal{H}_g(\hat V, \hat g, \hat p_\infty, \hat V_\infty) = \frac{2(R - 1)(\gamma - 1)}{\gamma + 1 - R(\gamma - 1)} . \]

Besides,

\[ R = \frac{M^2 - l^2q^2(1 - \chi)}{M^2\chi} , \qquad M^2 = M_0^2\,\frac{1 + q^2 + \sqrt{(1 + q^2)^2 - 4l^2q^2}}{2} , \]

i.e., for given $\gamma$ the coefficients of system (34) and the boundary conditions (35) are determined through the parameters $\varphi$, $M_0$, and $q$ (recall that $l = \cos\varphi$, $m = \sin\varphi$). For given functions $p$ and $H_2$ the entropy perturbation $S$ is found from a separate IBVP (the second equation in (34) with the second boundary condition in (35) at $x_1 = 0$). Denote the vector of the other unknown functions in Problem 1 again by $U$: $U = (p, v_1, v_2, H_1, H_2)^*$. Then it follows from (34) that $U$ satisfies the linear symmetric t-hyperbolic system

\[ A_0U_t + A_1U_{x_1} + A_2U_{x_2} = 0 , \tag{37} \]

where $A_0 = \operatorname{diag}(1, M^2, M^2, 1, 1)$ is a diagonal matrix ($A_0 > 0$);

\[
A_1 = \begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
1 & M^2 & 0 & 0 & mq \\
0 & 0 & M^2 & 0 & -lq \\
0 & 0 & 0 & 1 & 0 \\
0 & mq & -lq & 0 & 1
\end{pmatrix} , \qquad
A_2 = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & -mq & 0 \\
1 & 0 & 0 & lq & 0 \\
0 & -mq & lq & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix} .
\]

By means of cross differentiation, one can eliminate the function $F$ from the boundary conditions (35) and rewrite them in the form

\[ G_0U + G_1U_t + G_2U_{x_2} = 0 , \qquad x_1 = 0 , \tag{38} \]

where $G_\alpha$ ($\alpha = 0, 1, 2$) are rectangular matrices (of order $4 \times 5$) that can easily be written out. That is to say, Problem 1 is rewritten in the form of the IBVP (37), (38) with corresponding initial data. Notice that, because the boundary conditions (38) contain the derivatives $U_t$ and $U_{x_2}$, this problem differs from the (standard) linear IBVPs studied by Kreiss in [40]. For parallel shocks, in system (37) one should set $l = 1$, $m = 0$, and the boundary conditions take the form [21]:

\[ v_1 + d_1p = 0 , \qquad F_t = d_2p , \qquad v_2 = d_3F_{x_2} , \qquad H_2 = qv_2 , \qquad H_1 = 0 , \tag{39} \]

where for an ideal gas d1 =

3 − γ + (3γ − 1)M 2 , 2M 2 (2 + (γ − 1)M 2 )

d2 = −

γ +1 , 4M 2

d3 =

d1 =

(γ − 1)(1 − M 2 )2 , M 2 (2 + (γ − 1)M 2 )

2(1 − M 2 ) , (γ + 1)(M 2 − q 2 )

M = M0 .

The second and the third boundary conditions in (39) yield (v2 )t = d2 d3 px2 at x1 = 0, i.e., the boundary conditions (39) are rewritten in form (38).


3. Numerical Testing of the ULC for Hyperbolic Problems with the 1-Shock Property

3.1. The LC and the ULC for hyperbolic IBVPs with the 1-shock property. Consider an IBVP in the half-plane $x_1 > 0$ for an abstract linear symmetric t-hyperbolic system of $n$ equations in form (37) for the vector $U = (u_1, \dots, u_n)$ with boundary conditions like (38).

Definition 1. The IBVP for a symmetric t-hyperbolic system (37) with boundary conditions in form (38) is said to have the 1-shock property if:
1) The boundary $x_1 = 0$ is noncharacteristic, i.e., $\det A_1 \neq 0$;
2) Among the eigenvalues $\lambda_1 \le \dots \le \lambda_n$ of the matrix $A_1$ there is only one negative eigenvalue, and the others are positive:

\[ \lambda_1 < 0 , \qquad \lambda_2, \dots, \lambda_n > 0 . \]

All the LSPs for 1-shocks (e.g., Problem 1) have, of course, the 1-shock property. Let the IBVP (37), (38) have the 1-shock property. Applying the Fourier-Laplace transform (the Fourier transform with respect to $x_2$ and the Laplace transform with respect to $t$), we obtain the following boundary-value problem for the system of ODEs:

\[ \frac{dU}{dx_1} = \mathcal{M}(s, \omega)U , \qquad x_1 > 0 , \tag{40} \]

\[ \mathcal{M}_0(s, \omega)U = 0 , \qquad x_1 = 0 . \tag{41} \]

Here $U = U(x_1, s, \omega)$ is the Fourier-Laplace transform of the vector function $U(t, \mathbf{x})$;

\[ s = \eta + i\xi , \qquad \eta > 0 , \qquad (\xi, \omega) \in \mathbb{R}^2 , \qquad \mathcal{M} = \mathcal{M}(s, \omega) = -A_1^{-1}(sA_0 + i\omega A_2) . \]

Because of the presence of the derivatives $U_t$ and $U_{x_2}$ in the boundary conditions (38), the matrix $\mathcal{M}_0$ depends on the variables $s$ and $\omega$. One can show that the matrix $\mathcal{M}$ has the following property (see [32, 30]). For all $\omega \in \mathbb{R}$ and $\eta > 0$ only one eigenvalue of the matrix $\mathcal{M}$ lies in the right half-plane ($\operatorname{Re}\lambda > 0$), and the others lie in the left one ($\operatorname{Re}\lambda < 0$). Following [30], we seek a solution to problem (40), (41) in the form:

\[ U(x_1) = \frac{1}{2\pi i}\oint_C (sA_0 + \lambda A_1 + i\omega A_2)^{-1}A_1U_0\exp(\lambda x_1)\,d\lambda , \tag{42} \]

where $C$ is a contour large enough to enclose all the singularities of the integrand; $U_0$ is a constant vector satisfying the boundary conditions (41). The singularities of the integrand are the eigenvalues $\lambda$ of the matrix $\mathcal{M}$ and thus satisfy the dispersion relation

\[ \det(sA_0 + \lambda A_1 + i\omega A_2) = 0 . \tag{43} \]

It follows from (42) that $U(x_1)$ is a sum of residues at the poles of the integrand. As was noted above, there is only one (!) eigenvalue $\lambda$ with $\operatorname{Re}\lambda > 0$. For this eigenvalue $|\exp(\lambda x_1)| \to +\infty$ as $x_1 \to +\infty$. Hence the residue at this value of $\lambda$ must be zero. As was shown by Gardner and Kruskal [30] (in other terms), the latter is the same as the statement that for given $\omega \in \mathbb{R}$ there exist complex numbers $s$ and $\lambda$, with $\operatorname{Re}s = \eta > 0$, $\operatorname{Re}\lambda > 0$, such that the homogeneous algebraic system

\[ (sA_0 + \lambda A_1 + i\omega A_2)X = 0 , \tag{44} \]

\[ X^*A_1U_0 = 0 \tag{45} \]

has a nonzero solution $X$. We recall that these values of $s$, $\lambda$, and $\omega$ must satisfy (43). Since $\lambda$ with $\operatorname{Re}\lambda > 0$ is a simple eigenvalue, we can choose $n - 1$ linearly independent equations from system (44). Adding Eq. (45) to them, we obtain for the vector $X$ a linear algebraic system $GX = 0$. If its determinant (the Lopatinski determinant) is equal to zero,

\[ \det G(\eta, \xi, \omega, \lambda) = 0 , \tag{46} \]

then the sequence of vector functions

\[ U_k(t, \mathbf{x}) = \exp\big\{ -\sqrt{k} + k(\eta t + i\xi t + i\omega x_2) \big\}\,U(x_1) \qquad (k = 1, 2, 3, \dots) \]

is a Hadamard-type ill-posedness example for the IBVP (37), (38) with special initial data. Thus, we are now in a position to give equivalent definitions of the LC and the ULC for hyperbolic IBVPs with the 1-shock property.

Definition 2. The IBVP (37), (38) satisfies the LC if $\det G(\eta, \xi, \omega, \lambda) \neq 0$ for all $\eta > 0$, $(\xi, \omega) \in \mathbb{R}^2$, and $\lambda$ being a solution of (43) with $\operatorname{Re}\lambda > 0$.

Definition 3. The IBVP (37), (38) satisfies the ULC if $\det G(\eta, \xi, \omega, \lambda) \neq 0$ for all $\eta \ge 0$, $(\xi, \omega) \in \mathbb{R}^2$ ($\eta^2 + \xi^2 + \omega^2 \neq 0$), and $\lambda$ being a solution of (43) with $\operatorname{Re}\lambda \ge 0$ and $\lambda(0, \xi, \omega) = \lim_{\eta \to +0}\lambda(\eta, \xi, \omega)$.

Remark 2. Let $\lambda_0 = \lim_{\eta \to +0}\lambda(\eta, \xi, \omega)$, where $\operatorname{Re}\lambda > 0$ for $\eta > 0$. Then, generally speaking, $\operatorname{Re}\lambda_0 \ge 0$; but the case $\operatorname{Re}\lambda_0 > 0$ (for the corresponding outgoing modes $\operatorname{Re}\lambda|_{\eta=0} < 0$) corresponds to the transition between the classes of strong well-posedness (the ULC holds) and ill-posedness (for shocks this transition corresponds to the boundary between the domains of instability and uniform stability). As was pointed out by Benzoni-Gavage et al. [5], this transition belongs to the class of weak stability, when the LC holds, but the ULC is violated. For shocks, it is a rather specific case of neutral stability referred to in [5] as surface waves of finite energy (or Rayleigh waves). In this case a LSP (like Problem 1) has normal modes of the form $U = U_0\exp\{i(-\omega t + kx_1 + lx_2)\}$ with $\operatorname{Im}\omega = 0$, $\operatorname{Im}k > 0$, $\operatorname{Im}l = 0$. For example, for gas dynamical shocks this kind of neutral stability does not appear, because the boundary between the domains of instability and uniform stability corresponds to the prohibited case with the Mach number $M = 1$ (see [24, 35, 39]). Looking ahead, we observe that in this paper Rayleigh waves are discovered for classical shocks for the first time (for phase transitions viewed as discontinuities see, e.g., [4]). Namely, we find that for fast MHD shocks there exists a "sharp" transition from strong instability to strong (uniform) stability. Finally, notice that there is no need to examine separately the case of Rayleigh waves, i.e., the case when the Lopatinski determinant vanishes for $\eta = 0$ and $\operatorname{Re}\lambda > 0$, because the boundaries of the ill-posedness domain are found directly by testing the LC. So, to locate the boundary between the domains of strong and weak well-posedness (uniform and neutral stability for shocks) one should analyze only the case $\operatorname{Re}\lambda_0 = 0$.


In comparison with the usual definitions of the LC and the ULC (see [40, 47, 22]) the main advantage of Definitions 2, 3 is that the Lopatinski determinant can always be computed analytically. Indeed, to write out the matrix $G$ following from system (44), (45) we need not find the roots $\lambda$ of the dispersion relation (43), which often cannot be done in practice. With the help of Definitions 2 and 3 the boundaries of the domains of uniform stability, neutral stability, and instability for fast parallel shock waves in MHD and relativistic MHD were found analytically by Blokhin and Trakhinin [21, 56] (for relativistic MHD shocks the case of an arbitrary state equation, for which the instability domain is not empty, was examined in [56]). The LSP for fast parallel MHD shocks is the IBVP for system (37) under $l = 1$, $m = 0$ with the boundary conditions (39). As was shown in [21], this LSP satisfies the ULC, i.e., fast parallel MHD shocks are uniformly stable, if and only if

\[ F = F(M, q) = \mathcal{F}\Big(M + \sqrt{M^2 + 2/(\gamma - 1)}\Big) > 0 , \]

where $\mathcal{F}(z) = (zM - 1)z^4 + q^2\{(zM - 1)(z^2 - 2)z^2 - q^2(z^2 - 1)^2\}$, i.e., the hypersurface $F = 0$ is the boundary between the domains of uniform and neutral stability. Notice also that the points of this boundary belong to the domain of neutral stability. We use MAPLE® software to plot the curve $F = 0$ (see Fig. 2, where test calculations for the numerical algorithm considered below are represented). It is easy to see that fast parallel shock waves are neutrally stable in a narrow range of admissible parameters $M$ and $q$ (recall that for parallel shocks the domain $D$ is bounded by the lines $M = M_{0\min}$, $M = 1$, $q = 0$, and $q = M$). Along the axis $M$ the width of the neutral stability domain adjoining the line $M = M_{0\min}$ (where shocks are strongest) is not greater than $10^{-2}$ (see Fig. 2).
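The criterion $F(M, q) > 0$ is straightforward to evaluate. The following Python sketch (hypothetical function names; it only evaluates the formula quoted above from [21]) returns the sign-based classification for a point $(M, q)$ of the domain $D$ of parallel shocks. Note that at $M = M_{0\min}$ one has $zM = 1$, so $F = -q^4(z^2 - 1)^2 < 0$ for $q > 0$, which agrees with the neutral stability strip adjoining that line.

```python
import math

def F_parallel(M, q, gamma):
    """F(M, q) = Fcal(z) with z = M + sqrt(M^2 + 2/(gamma - 1))."""
    z = M + math.sqrt(M**2 + 2.0 / (gamma - 1.0))
    zm = z * M - 1.0
    return zm * z**4 + q**2 * (zm * (z**2 - 2.0) * z**2
                               - q**2 * (z**2 - 1.0)**2)

def parallel_shock_regime(M, q, gamma):
    """'uniform' if F > 0, otherwise 'neutral' (points with F = 0 are neutral)."""
    return "uniform" if F_parallel(M, q, gamma) > 0 else "neutral"
```

The function does not check that $(M, q)$ lies in $D$; that test, inequalities (28), is assumed to be done by the caller.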

Fig. 2. Fast parallel MHD shock in the plane $(M, q)$, with the curve $F = 0$: ◦ the points of uniform stability; · the points of neutral stability

3.2. Numerical testing of the LC and the ULC. First of all, we note that the left-hand sides of the dispersion relation (43) and Eq. (46) are known to be homogeneous polynomials in the variables $s$, $\lambda$, and $\omega$. Therefore, the complex parameters $s$ and $\lambda$ can be "normalized": $s' = s/|\omega|$, $\lambda' = \lambda/|\omega|$ (provided that $\omega \neq 0$, i.e., a 1D ill-posedness example cannot be constructed; see Remark 4 below). Thus, without loss of generality we can assume that $\omega = \pm 1$. Suppose $\omega = 1$ (the case $\omega = -1$ is analogous). Then, to test the LC and the ULC we should study the system of two polynomial equations

\[ f(s, \lambda) = 0 , \qquad g(s, \lambda) = 0 \tag{47} \]

for two complex unknowns $s$ and $\lambda$, where

\[ f(s, \lambda) = \det(sA_0 + \lambda A_1 + iA_2) , \qquad g(s, \lambda) = \det G|_{\omega=1} . \]

The algorithm for numerically testing the LC and the ULC is extremely simple. It is implemented as a MAPLE® procedure whose input arguments are the polynomials $f$ and $g$ (to find $f$ and $g$ one can, in principle, use symbolic computations). At the first stage, for a given point of the domain $D$ of admissible parameters of the IBVP we find the roots of system (47) (for this purpose some MAPLE® functions are utilized; see also Remark 5 below). Because of the homogeneity of the polynomials, it is easy to show that the number of roots is equal to the degree of the resultant of the polynomials $f$ and $g$ (with respect to $s$ or $\lambda$). Then, if there exists a root with $\operatorname{Re}s = \eta > 0$ and $\operatorname{Re}\lambda > 0$, we conclude that the LC is violated, i.e., the IBVP is ill-posed. If there exists a root with $\eta = 0$ and $\operatorname{Re}\lambda > 0$, then at the point of the domain $D$ under consideration Rayleigh waves take place (see Remark 2), and this point belongs to the class of weak stability (the LC holds, but the ULC is violated). Note however that for a "real" IBVP (e.g., for the LSP for fast MHD shocks) the probability of hitting exactly the boundary between domains 1 and 3 is extremely small (for short, as in Sect. 1, we will below denote the domains of the ULC, weak well-posedness, and ill-posedness by the numbers 1, 2, and 3 respectively). If there exists a root $(s_*, \lambda_*)$ of system (47) lying on the imaginary axis, i.e., $\operatorname{Re}s_* = 0$, $\operatorname{Re}\lambda_* = 0$ ($s_* = i\xi_*$), then to test the ULC we perturb the parameter $s_*$. Namely, one considers $s_\varepsilon = \varepsilon + s_* = \varepsilon + i\xi_*$, where $\varepsilon \ll 1$ is a rather small constant (see Remark 3 below). We then find the roots $\lambda_i$ ($i = 1, \dots, n$) of the equation $f(s_\varepsilon, \lambda) = 0$ and choose among them the root $\lambda$ closest to $\lambda_*$, i.e.,

\[ |\lambda - \lambda_*| = \min_{i = 1, \dots, n}|\lambda_i - \lambda_*| . \]

If $\operatorname{Re}\lambda > 0$ (because of the 1-shock property there is only one root $\lambda_i$ with $\operatorname{Re}\lambda_i > 0$), then we conclude that the ULC is violated, i.e., the point of the domain $D$ under consideration belongs to domain 2. Otherwise, the ULC holds, i.e., this point belongs to domain 1.

Remark 3. Concerning the "perturbation" $\varepsilon$, for the test problems considered below as well as for Problem 1 we take $\varepsilon = 10^{-12}$. Actually, the smaller we choose $\varepsilon$, the better we localize the boundary between domains 1 and 2, but the more floating-point precision is needed for the numerical calculations. On the other hand, there is perhaps little sense in finding the boundary of domain 2 for shocks with high accuracy, because the question of the real existence of neutrally stable shock waves should be resolved in the original nonlinear setting.


Remark 4. A 1D ill-posedness (instability) condition, which corresponds to the violation of the LC for $\omega = 0$, can always be computed analytically. As was noted by Serre [55] (see also [5]), for a certain class of problems (e.g., for the LSP for gas dynamical shocks; see [22]), the hypersurface of 1D ill-posedness is the boundary between domains 2 and 3. Recall that if a hyperbolic IBVP is correctly posed with respect to the number of boundary conditions (see, e.g., [37]), but is nevertheless ill-posed, this means that, for its 1D variant reduced to the canonical diagonal form of Riemann invariants, the outgoing Riemann invariants cannot be expressed through the incoming ones on the boundary $x_1 = 0$. For a LSP, in such a case one says that the so-called Majda's conditions [47, 49] are violated.

Test problem 1: The wave equation. As the first test problem for the suggested numerical algorithm we consider the IBVP in the half-plane $x_1 > 0$ for the 2D wave equation:

\[ u_{tt} = u_{x_1x_1} + u_{x_2x_2} , \qquad x_1 > 0 ; \tag{48} \]

\[ u_t + au_{x_1} + bu_{x_2} = 0 , \qquad x_1 = 0 ; \tag{49} \]

where $a$ and $b$ are real constants. As is known (see [34]), problem (48), (49) is ill-posed in the half-disk $a^2 + b^2 < 1$, $a > 0$ and on the line $a = 1$; it is well-posed (the ULC holds) in the half-strip $|b| < 1$, $a < 0$ and weakly well-posed at all other points $(a, b) \in \mathbb{R}^2$. Besides, for $a = 0$, $|b| < 1$ Rayleigh waves take place, and the line $a = 1$ is that of 1D ill-posedness. The wave equation (48) can be rewritten in the form of a symmetric t-hyperbolic system, cf. (37), for example, with the following matrices:

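$$A_0 = \begin{pmatrix} 1 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
A_1 = \begin{pmatrix} -\tfrac{1}{2} & -1 & 0 \\ -1 & -\tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{2} \end{pmatrix}, \qquad
A_2 = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & -\tfrac{1}{2} \\ -1 & -\tfrac{1}{2} & 0 \end{pmatrix},$$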

where $U = (u_1, u_2, u_3)^* = (u_t, u_{x_1}, u_{x_2})^*$. As boundary conditions for system (37) we take condition (49) (i.e., $u_1 + a u_2 + b u_3 = 0$) and the trivial relation $(u_1)_{x_2} = (u_3)_t$. Such an IBVP has the 1-shock property (see Definition 1), and we easily write out for it the dispersion relation and the Lopatinski determinant:

$$f(s, \lambda) = (s + \lambda)(s^2 - \lambda^2 + \omega^2) = 0, \qquad g(s, \lambda) = (2s + \lambda)(s - a\lambda + ib\omega) = 0,$$

with ω = ±1. Clearly, the factors s + λ and 2s + λ can be removed from the polynomials f and g, because they correspond to outgoing characteristic modes (Re λ < 0 for η > 0); we nevertheless retain them (for “the purity of experiment”). The numerical calculations in the square |a| ≤ 2, |b| ≤ 2 (a ≠ 1; see Remark 4) result in Fig. 3 (the calculations were performed on a grid of 40 × 40 points).

Test problem 2: Fast parallel MHD shock. The dispersion relation for the magnetoacoustic system (37) reads:

$$f(s, \lambda) = M^2\Omega^2\left(M^2\Omega^2 - \lambda^2 + \omega^2\right) + q^2(\omega^2 - \lambda^2)\left(M^2\Omega^2 - (l\lambda + im\omega)^2\right) = 0, \tag{50}$$


Yu. Trakhinin

Fig. 3. The IBVP for the wave equation in the (a, b)-plane (horizontal axis a, vertical axis b, each from −2 to 2): ◦ the points of the ULC; · the points of the LC, but not the ULC; the points of ill-posedness

where Ω = s + λ (while writing f, we have removed the factor corresponding to outgoing characteristic modes); for parallel shocks l = 1, m = 0, and without loss of generality ω = 1. For fast parallel shock waves the Lopatinski determinant has the form [21]: g(s, λ) = M 2 2 − 2M 2 λ −

2 λ2 = 0 . γ −1

The numerical calculations near the boundary F = 0 between the domains of uniform and neutral stability, which was found analytically in [21], result in Fig. 2.

Remark 5. In this work, we do not aim to develop a universal and highly efficient algorithm for numerically testing the LC and the ULC. For example, it is rather inefficient to seek all the roots of system (47) (they were found by using MAPLE® functions). It would be useful, for instance, to work out an iterative method for finding only those roots of system (47) that indicate the violation of the LC or the ULC. On the other hand, there is a method to classify the roots of a polynomial of one variable according to their location in the complex plane (the number of roots in the right/left half-plane and on the imaginary axis is determined; see, e.g., [23]). A generalization of this method to the case of two polynomials of two variables would be the best improvement of the algorithm, and it is most important for its use in 3D.

Observe that the algorithm for numerically testing the LC and the ULC in 3D does not differ in principle from that in 2D. Instead of the scalar real parameter ω we have in 3D the vector ω = (ω1, ω2). Besides, without loss of generality one can assume that |ω| = 1, i.e., ω1 = cos ψ, ω2 = sin ψ, with 0 ≤ ψ < 2π. Then we choose a partition of the interval (0, 2π) and, for each of its points, test the LC and the ULC as in the 2D case. Moreover, because of the analyticity of the functions f and g, the number of partition points needed to prove well-posedness or weak well-posedness numerically need not be very large.
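As an illustration of the root test for Test problem 1, the following Python sketch checks the violation of the LC for the wave equation on a grid in the (a, b)-plane. It is a minimal sketch under stated assumptions, not the MAPLE procedure used in the paper: the outgoing factors s + λ and 2s + λ are dropped, so the reduced system is $s^2 - \lambda^2 + \omega^2 = 0$, $s - a\lambda + ib\omega = 0$; a common root with Re s > 0 and Re λ > 0 is taken to signal ill-posedness; the ε-perturbation used for the ULC is not reproduced. The function names `common_roots` and `violates_lc` are hypothetical.

```python
import cmath

def common_roots(a, b, omega):
    """Common roots (s, lam) of the reduced system
         s^2 - lam^2 + omega^2 = 0,   s - a*lam + 1j*b*omega = 0   (a != 0).
    Eliminating lam = (s + 1j*b*omega)/a gives the quadratic
         (a^2 - 1) s^2 - 2j*b*omega*s + (a^2 + b^2) omega^2 = 0."""
    A = a * a - 1.0
    B = -2j * b * omega
    C = (a * a + b * b) * omega * omega
    if abs(A) < 1e-14:  # a = +-1: the quadratic degenerates to a linear equation
        s_values = [-C / B] if abs(B) > 1e-14 else []
    else:
        d = cmath.sqrt(B * B - 4.0 * A * C)
        s_values = [(-B + d) / (2.0 * A), (-B - d) / (2.0 * A)]
    return [(s, (s + 1j * b * omega) / a) for s in s_values]

def violates_lc(a, b, tol=1e-9):
    """True if some common root has Re s > 0 and Re lam > 0 for omega = +-1."""
    if a == 0.0:
        return False  # g reduces to s = -1j*b*omega, a purely imaginary root
    for omega in (1.0, -1.0):
        for s, lam in common_roots(a, b, omega):
            if s.real > tol and lam.real > tol:
                return True
    return False

# Scan a 40 x 40 grid of the square |a| <= 2, |b| <= 2 (the grid misses a = 1).
n = 40
grid = [-2.0 + 4.0 * i / (n - 1) for i in range(n)]
ill_posed = [(a, b) for a in grid for b in grid if violates_lc(a, b)]
print(len(ill_posed), "grid points violate the LC")
```

For this reduced system the discriminant of the quadratic equals $4a^2(1 - a^2 - b^2)\omega^2$, so roots with Re s ≠ 0 exist only for $a^2 + b^2 < 1$, and the admissible one has Re λ > 0 only for a > 0, in agreement with the half-disk quoted from [34]. The line a = 1 of 1D ill-posedness corresponds to ω = 0 and is not detected by a test with ω = ±1.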


4. Numerical Investigation of the Stability of Fast MHD Shocks

Consider now the general case of fast shocks, when l ≠ 1 (ϕ ≠ 0). The dispersion relation for Problem 1 has form (50). To write out Eq. (45) it is convenient to rewrite the boundary conditions (41), obtained by applying the Fourier-Laplace transform to conditions (35), as follows: a 1 U 0 = a p0 ,

x1 = 0 ,

(51)

where p0 is the Fourier-Laplace transform of the function p(t, 0, x2); the components aα = aα(s, ω) (α = 1, …, 5) of the vector a = (a1, …, a5)* are found through the coefficients of the boundary conditions (35). To find the coefficients aα, which are not given here (because of their awkwardness), we use the MAPLE® symbolic computation software. It follows from (34) that the vector X, cf. (44), is parallel to the vector with components kα = kα(s, λ, ω) (α = 1, …, 5), where

$$k_1 = M^2\Omega^2(m\lambda - il\omega), \qquad k_2 = il\omega\lambda - m(M^2\Omega^2 + \omega^2), \qquad k_3 = l(M^2\Omega^2 - \lambda^2) - im\omega\lambda,$$

$$k_4 = -iq\omega\left(M^2\Omega^2 - (l\lambda + im\omega)^2\right), \qquad k_5 = q\lambda\left(M^2\Omega^2 - (l\lambda + im\omega)^2\right).$$

Then, with regard to (51), Eq. (45) finally implies the Lopatinski determinant:

$$g(s, \lambda) = d_1 M^2\Omega^2(m\lambda - il\omega) + d_2\left(il\omega\lambda - m(M^2\Omega^2 + \omega^2)\right) + d_3\left(l(M^2\Omega^2 - \lambda^2) - im\omega\lambda\right) + q\left(M^2\Omega^2 - (l\lambda + im\omega)^2\right)(d_4\lambda - id_5\omega), \tag{52}$$

where ω = ±1, $d_1 = a_1 + a_2$, $d_2 = a_1 + M^2 a_2 + mqa_5$, $d_3 = M^2 a_3 - lqa_5$, $d_4 = q(ma_2 - la_3) + a_5$, $d_5 = a_1$. For the 1D case, when ω = 0, the coefficients aα (α = 1, …, 5) do not depend on s and ω and have the form

$$a_1 = q\left(lb_{12} + mb_3 + m(1 - \chi)b_8\right) - 1, \qquad a_2 = -b_3 a_5 - b_1 a_1, \qquad a_3 = b_{12} a_5 + b_{10} a_1,$$

$$a_4 = 0, \qquad a_5 = -q\left(lb_{10} + mb_1 + m(1 - \chi)b_6\right).$$

From the dispersion relation (50) for ω = 0 we explicitly find the root λ corresponding to the unique incoming characteristic: $\lambda = M_0 s/(1 - M_0)$. Then the 1D instability condition reads:

$$m\left[M^2(M_0 - M^2) + q^2 M_0(M^2 - l^2 M_0^2)\right]a_2 + \left[M^2(M^2 - M_0^2) - q^2 M_0(M^2 - l^2 M_0^2)\right]a_3 + q(1 - M_0)(M_0^2 - M^2)a_5 - mM^2(1 - M_0)a_1 = 0. \tag{53}$$
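The root $\lambda = M_0 s/(1 - M_0)$ quoted above can be checked directly. The short derivation below is a sketch under two assumptions that are not confirmed within this excerpt: the symbol in (50) that stands for s + λ is written as Ω, and $M_0$ is the fast Mach number defined earlier in the paper.

```latex
% Assumptions: \Omega := s + \lambda in (50); M_0 denotes the fast Mach number.
\begin{align*}
f(s,\lambda)\big|_{\omega=0}
  &= M^2\Omega^2\left(M^2\Omega^2-\lambda^2\right)
     - q^2\lambda^2\left(M^2\Omega^2 - l^2\lambda^2\right),\\
\lambda = \frac{M_0\,s}{1-M_0}
  &\;\Longleftrightarrow\; \lambda = M_0\,\Omega
  \qquad (\Omega = s+\lambda,\ M_0 \neq 1),\\
f\big|_{\omega=0,\;\lambda=M_0\Omega}
  &= \Omega^4\left[M^4 - (1+q^2)\,M^2M_0^2 + q^2l^2M_0^4\right].
\end{align*}
```

The bracket vanishes exactly when $y = (M/M_0)^2$ solves $y^2 - (1 + q^2)y + q^2l^2 = 0$, which has the form of the usual relation for the magnetosonic speeds; presumably this is how $M_0$ is fixed in Sect. 2, so that $\lambda = M_0\Omega$ is indeed a root of (50) for ω = 0.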


In a detailed form condition (53) is rather cumbersome, but for parallel and perpendicular shocks it is quite simple:

$$F = 1 + M_0 - \frac{M_0^2(R - 1)}{\hat{\rho}E_p(\hat{\rho}, \hat{p})} = 0$$

(recall that for parallel shocks the usual and fast Mach numbers coincide: M0 = M). As was shown by Gardner and Kruskal [30], for F > 0 fast parallel and perpendicular shock waves are weakly stable, whereas for F < 0 they are unstable. That is, the hypersurface of 1D instability, F = 0, is the boundary between the domains of instability and weak stability (exactly as for gas dynamical shocks; see Remark 4). For an ideal gas $\hat{\rho}E_p(\hat{\rho}, \hat{p}) = 1/(\gamma - 1)$, and it is easily verified that in the plane of the parameters M0 and q the line F = 0 lies outside the domain D of admissible parameters. Numerical calculations show that in the general case, when l ≠ 0, m ≠ 0, condition (53) for an ideal gas likewise cannot be fulfilled in the domain D, i.e., in an ideal gas fast MHD shocks are always 1D stable. It is interesting to note that for fast MHD shocks (as well as for slow ones) the 1D instability condition has not been written out anywhere earlier. The point is that finding this condition by constructing (in the usual way) a 1D Hadamard-type ill-posedness example or by rewriting the boundary conditions in Riemann invariants involves extremely great technical difficulties.

Applying to (50), (52) the algorithm of numerical testing of the LC and the ULC suggested above, we find the domains of uniform stability, neutral stability, and instability (domains 1, 2, and 3) for fast MHD shock waves. The numerical calculations for γ = 5/3 and different given angles of inclination ϕ result in Figs. 4, 5, where the transition curves (inner boundaries of domains 1, 2, 3) are drawn (for a set of M0-coordinates from the interval $(M_{0\min}, 1)$ we find, to within $10^{-3}$, the q-coordinates of the boundary points (M0, q) of domains 1, 2, and 3; the left curve in Figs. 4, 5 is the boundary P = 0 of the domain D). One can see that the neutral stability domain (being very narrow for ϕ = 0; see Fig. 2) grows as ϕ does (see Fig. 4a, 4b), and for ϕ > 30° (the corresponding angles ahead of the shock are ϕ∞ > 0°) the instability domain arises (see Fig. 4c, 4d, 4e). Moreover, a part of the boundary of domain 3 marks the transition to uniform stability (domain 1), i.e., for fast MHD shocks Rayleigh waves appear (see Remark 2). The instability domain grows up to a certain ϕ (see Fig. 4c, 4d, 4e). After that it gradually narrows (see Fig. 4f, 5a), and for ϕ = 80° there are already no unstable shocks. The neutral stability domain (domain 2) for perpendicular shocks is represented in Fig. 5b.

Naturally, it is technically impossible to investigate the whole unbounded domain D. Actually, this is not necessary, because for rather large q fast shocks are weak (see Sect. 2), and weak Lax shocks were shown by Métivier [50] to be uniformly stable. At the same time, most strong fast MHD shock waves are neutrally stable or unstable. Indeed, domains 2 and 3 adjoin the curve $P = \hat{p}_\infty/(\gamma\hat{p}) = 0$ for not large q ($q^2 = |\hat{H}|^2/(4\pi\gamma\hat{p}) < 2$), i.e., in this case the gas dynamical pressure ahead of the shock is much less than that behind the shock.

For an ideal gas with γ = 7/5 (a diatomic gas, e.g., air) the left boundary (the curve P = 0) of the domain D shifts to the left in comparison with the case γ = 5/3. Domains 1, 2, 3 for γ = 7/5 and ϕ = 60° are represented in Fig. 6a. Besides, the domains of instability and neutral stability for γ = 7/5 are relatively wider than for γ = 5/3 (see Fig. 6b). For γ > 5/3 (notice that 7/5 < 5/3) the domain D conversely becomes narrower (its left boundary shifts to the right in comparison with the case

Fig. 4a–f. Domains 1, 2, 3 for γ = 5/3 in the (M0, q)-plane (horizontal axis M0, vertical axis q): (a) ϕ = 10°, (b) ϕ = 25°, (c) ϕ = 35°, (d) ϕ = 45°, (e) ϕ = 60°, (f) ϕ = 70°

γ = 5/3), and the domains of instability and neutral stability are likewise relatively narrower than for γ = 5/3 (we do not present the results of the corresponding numerical calculations here).

Remark 6. In [27] instability domains were sought on the basis of the condition (in our notation) η > 0.001. Therefore, it is clear that in [27] only a part of the whole

Fig. 5a,b. Domains 1, 2, 3 for γ = 5/3 in the (M0, q)-plane: (a) ϕ = 75°, (b) ϕ = 90° (perpendicular shock)

Fig. 6a,b. Domains 1, 2, 3 for ϕ = 60° in the (M0, q)-plane: (a) γ = 7/5, (b) comparison of the cases γ = 7/5 and γ = 5/3

domain of instability was found. For example, in [27] unstable shocks were discovered for angles of inclination of the upstream magnetic field ϕ∞ < 22°. Actually, for instance, for the inclination ϕ = 75° behind the shock (see Fig. 5a) the corresponding inclination angles ahead of the shock can reach the value ϕ∞ = 52°. For the instability domain found in [27] the plasma β satisfies β < 0.25. But for the weakest unstable fast shock waves, which correspond to the points of domain 3 farthest from the curve P = 0, the parameter β can actually reach the value β = 0.45 (for angles close to the least possible ϕ for which instability appears, e.g., for ϕ = 35°; cf. Fig. 4c).

Remark 7. According to numerical calculations performed by Filippova [27], considering 3D perturbations does not enlarge the instability domain for fast MHD shocks found for the 2D case. Moreover, for the angle ψ = 0 (see Remark 5) corresponding to 2D perturbations the instability domain is wider than that for ψ > 0 (see [27]). It is not certain, but quite possible, that the same holds for neutrally stable shocks, i.e., that the neutral stability domain for fast MHD shock waves found in this paper for the 2D case coincides with that for 3D.


5. Concluding Remarks/Open Problems

In view of Majda's results in [47], in the domain of uniform stability of fast MHD shock waves (domain 1; see Figs. 4, 5) the following a priori estimate is valid for Problem 1:

$$\|F\|^2_{W^1_{2,\eta}(\mathbb{R}^2_+)} + \|U|_{x_1=0}\|^2_{L_{2,\eta}(\mathbb{R}^2_+)} + \eta\,\|U\|^2_{L_{2,\eta}(\mathbb{R}_+\times\mathbb{R}^2_+)} \le \frac{C}{\eta}\,\|F_0\|^2_{W^1_2(\mathbb{R})}, \tag{54}$$

where η is a sufficiently large real number (see [40, 47, 49]) and C is a constant independent of η. Besides, it is supposed that the initial data for the fluid perturbations are homogeneous: U0 ≡ 0 (an a priori estimate for U0 ≠ 0 can apparently be obtained by applying to the LSP Rauch's arguments [52], which extend Kreiss' results in [40] to the case of nonhomogeneous initial data). As was already pointed out in Sect. 1, with a certain degree of strictness one can conclude the nonlinear stability of uniformly stable shocks. To prove this rigorously for fast MHD shock waves we should either show that the MHD system satisfies Majda's block structure conditions [47, 48] or (if it does not satisfy them) generalize Majda's short-time existence theorem [48] to MHD. In [20] the uniform stability of fast MHD shocks in an ideal gas was proved for the case of a weak magnetic field (q ≪ 1). For this case, the following a priori estimates for Problem 1 were deduced in [20]:

$$\|U(t)\|_{W^2_2(\mathbb{R}^2_+)} \le K_1\,\|U_0\|_{W^2_2(\mathbb{R}^2_+)}, \tag{55}$$

$$\|F\|_{W^3_2((0,T)\times\mathbb{R})} \le K_2, \tag{56}$$

where the “layer-wise” estimate (55) is valid for all 0 < t ≤ T < ∞; K1 > 0 is a constant depending on T; K2 is a constant depending on T, $\|F_0\|_{W^3_2(\mathbb{R})}$, and $\|U_0\|_{W^2_2(\mathbb{R}^2_+)}$. The question of obtaining such estimates in the whole domain of uniform stability found in this paper remains open.

It should be noted that for Lax shocks there is as yet no demonstration of the equivalence of the ULC for an LSP and the existence of a priori estimates like (55), (56), which, from the practical point of view, are preferable to estimates like (54). That is, one has to deduce estimates in form (55), (56) separately for each concrete LSP (in the uniform stability domain). On the other hand, it is not the estimates (55), (56) themselves that are important, but their constructive character, in the sense that they are obtained by the technique of dissipative energy integrals [11], which can be used for constructing adaptive calculation models [11, 13]. For such calculation models one can write out difference analogs of dissipative energy integrals that imply energy estimates indicating the stability of the calculation model under consideration (see [11, 13]). Note, however, that such an approach has so far been developed only on the linear level. First ideas for its application to nonlinear hyperbolic problems can be found in [18], where a Cauchy problem for the gas dynamics equations is examined.

In this paper we have shown that numerical testing of the ULC for linear hyperbolic IBVPs is possible in principle (on the example of problems with the 1-shock property). Actually, one can apparently also test the LC and the ULC for linear hyperbolic problems that do not have the 1-shock property. For this purpose, roots of the equation g(s, ω) = 0 are to be found, where g(s, ω) = det G is the Lopatinski determinant computed in the usual way [40]. Even without knowing an explicit form of the function g(s, ω), one can always compute its value at any given point (s, ω) ($\omega = (\omega_1, \omega_2) \in \mathbb{R}^2$). To this end, for the matrix M(s, ω) at given values s and ω we should find the Schur decomposition [44]


(from the computational point of view the Schur form is preferable to the Jordan one). Then, by the usual arguments [40, 49, 22], we compute the Lopatinski determinant (to be exact, its value at a given point (s, ω)). If we are able to compute values of the function g(s, ω) at given points, then for given ω we can numerically find (e.g., by regula falsi) its zeros s. If among these zeros there is one with Re s = η > 0, then the IBVP is ill-posed at the point of the domain D under consideration. If the LC holds, but there is a zero with η = 0, then the ULC is violated. It is clear that from the computational point of view such an algorithm of numerical testing of the LC and the ULC is no longer as simple as the one suggested in this work for IBVPs with the 1-shock property.

With the help of such a general algorithm one can, for example, carry out a complete stability analysis (find domains 1, 2, and 3) for slow MHD shocks and Alfvén waves. Note that rotational (Alfvén) discontinuities are shown by Blokhin and Trakhinin [19] to be unstable under a strong magnetic field (q ≫ 1). Concerning other MHD discontinuities, as was established by Blokhin and Druzhinin [14, 17] (see also [12, 22]), contact discontinuities are uniformly stable, and tangential discontinuities are almost always (see [22]) unstable. In this connection, we emphasize that a full resolution of the problem of stability for all the types of MHD strong discontinuities is of great theoretical and practical importance, in particular, for numerical computations of inviscid MHD flows.

Acknowledgements. The author gratefully thanks Prof. A.M. Blokhin for many helpful discussions. This research was partially supported by INTAS grant 01–868.

References

1. Akhiezer, A.I., Liubarskii, G.Ia., Polovin, R.V.: The stability of shock waves in magnetohydrodynamics. Sov. Phys. JETP 35(8), 507–511 (1959)
2. Alfvén, H.: On the existence of electromagnetic-hydrodynamic waves. Ark. Mat. Astron. Fys. B 29(2), 1–7 (1943)
3. Anile, A.M., Russo, G.: Corrugation stability of magnetohydrodynamic shock waves. In: Nonlinear wave motion, Jeffrey, A. (ed.) Pitman Monogr. Surv. Pure Appl. Math. 43, Harlow: Longman Scientific & Technical; New York: John Wiley & Sons, 1989, pp. 11–21
4. Benzoni-Gavage, S.: Stability of subsonic planar phase boundaries in a van der Waals fluid. Arch. Rat. Mech. Anal. 150, 23–55 (1999)
5. Benzoni-Gavage, S., Rousset, F., Serre, D., Zumbrun, K.: Generic types and transitions in hyperbolic initial-boundary value problems. Proc. R. Soc. Edinb. Sect. A, to appear
6. Bethe, H.A.: On the theory of shock waves for an arbitrary equation of state. Office of Scientific Research and Development. Report No. 545, 1942. In: Classic papers in shock compression science, New York: Springer-Verlag, 1982, pp. 421–492
7. Blokhin, A.M.: A mixed problem for a system of equations of acoustics with boundary conditions on a shock wave. Izv. Sibirsk. Otdel. Akad. Nauk SSSR Ser. Tekhn. Nauk 13, 25–33 (1979), in Russian
8. Blokhin, A.M.: A mixed problem for a three-dimensional system of equations of acoustics with boundary conditions on the shock wave. Dinamika Sploshn. Sredy 46, 3–13 (1980), in Russian
9. Blokhin, A.M.: Estimation of the energy integral of a mixed problem for gas dynamics equations with boundary conditions on the shock wave. Siberian Math. J. 22, 501–523 (1981)
10. Blokhin, A.M.: Uniqueness of the classical solution of a mixed problem for equations of gas dynamics with boundary conditions on a shock wave. Siberian Math. J. 23, 604–615 (1982)
11. Blokhin, A.M.: Energy integrals and their applications in problems of gas dynamics. Novosibirsk: Nauka, Sibirsk. Otdel., 1986, in Russian
12. Blokhin, A.M.: Strong discontinuities in magnetohydrodynamics. New York: Nova Science Publ., 1994
13. Blokhin, A.M.: A new concept of construction of adaptive calculation models for hyperbolic problems. NATO ASI Ser., Ser. C, Math. Phys. Sci. 536, 23–64 (1999)

14. Blokhin, A.M., Druzhinin, I.Yu.: Formulation of problems on the stability of discontinuities in magnetohydrodynamics. Boundary value problems for partial differential equations, Collect. Sci. Works, Novosibirsk, 1988, pp. 16–38, in Russian
15. Blokhin, A.M., Druzhinin, I.Yu.: On the stability of a fast magnetohydrodynamic shock wave for a weak magnetic field. Partial differential equations, Collect. Sci. Works, Novosibirsk, 1989, pp. 15–32, in Russian
16. Blokhin, A.M., Druzhinin, I.Yu.: Stability of shock waves in magnetohydrodynamics. Siberian Math. J. 30, 511–524 (1989)
17. Blokhin, A.M., Druzhinin, I.Yu.: Well-posedness of some linear problems on the stability of strong discontinuities in magnetohydrodynamics. Siberian Math. J. 31, 187–191 (1990)
18. Blokhin, A.M., Sokovikov, I.G.: On a certain approach to constructing difference schemes for quasilinear equations of gas dynamics. Siberian Math. J. 40, 1044–1050 (1999)
19. Blokhin, A.M., Trakhinin, Yu.L.: A rotational discontinuity in magnetohydrodynamics. Siberian Math. J. 34, 395–411 (1993)
20. Blokhin, A.M., Trakhinin, Yu.L.: Investigation of the well-posedness of the mixed problem on the stability of fast shock waves in magnetohydrodynamics. Matematiche (Catania) 49, 123–141 (1994)
21. Blokhin, A.M., Trakhinin, Yu.L.: Stability of fast parallel MHD shock waves in polytropic gas. Eur. J. Mech. B/Fluids 18, 197–211 (1999)
22. Blokhin, A.M., Trakhinin, Yu.L.: Stability of strong discontinuities in fluids and MHD. In: Handbook of Mathematical Fluid Dynamics, Friedlander, S., Serre, D. (eds.), 1, Paris: Elsevier, 2002, pp. 545–652
23. Cohn, A.: Über die Anzahl der Wurzeln einer algebraischen Gleichung in einem Kreise. Math. Z. 14, 110–148 (1922)
24. D'yakov, S.P.: On stability of shock waves. Zh. Eksp. Teor. Fiz. 27, 288–296 (1954), in Russian [English transl.: Atomic Energy Research Establishment AERE Lib./trans. 648 (1956)]
25. Egorushkin, S.A., Kulikovsky, A.G.: On the stability of solutions of some boundary value problems for hyperbolic equations. J. Appl. Math. Mech. 56, 36–45 (1992)
26. Erpenbeck, J.J.: Stability of step shocks. Phys. Fluids 5, 1181–1187 (1962)
27. Filippova, O.L.: Stability of plane MHD shock waves in an ideal gas. Fluid Dyn. 26, 897–904 (1991)
28. Freistühler, H.: Contributions to the mathematical theory of magnetohydrodynamic shock waves. AMS/IP Stud. Adv. Math. 3, 175–187 (1997)
29. Friedrichs, K.O.: Symmetric hyperbolic linear differential equations. Commun. Pure Appl. Math. 27, 123–131 (1974)
30. Gardner, C.S., Kruskal, M.D.: Stability of plane magnetohydrodynamic shocks. Phys. Fluids 7, 700–706 (1964)
31. Godunov, S.K.: Symmetrization of magnetohydrodynamics equations. Chislennye Metody Mekhaniki Sploshnoi Sredy, Novosibirsk 3, 26–34 (1972), in Russian
32. Hersh, R.: Mixed problems in several variables. J. Math. Mech. 12, 317–334 (1963)
33. Hoffman, F., Teller, E.: Magnetohydrodynamic shocks. Phys. Rev. 80, 696–703 (1950)
34. Ikawa, M.: Mixed problem for the wave equation with an oblique derivative boundary condition. Osaka J. Math. 7, 495–525 (1970)
35. Iordanskii, S.V.: On the stability of a planar steady shock wave. Prikl. Mat. Mekh. 21, 465–472 (1957), in Russian
36. Iordanskii, S.V.: On compression waves in magnetohydrodynamics. Sov. Phys. Dokl. 3, 736–738 (1959)
37. Jeffrey, A.: Quasilinear Hyperbolic Systems and Waves. New York: Pitman, 1976
38. Kato, T.: The Cauchy problem for quasi-linear symmetric hyperbolic systems. Arch. Rat. Mech. Anal. 58, 181–205 (1975)
39. Kontorovich, V.M.: On the shock waves stability. Sov. Phys. JETP 33(6), 1179–1180 (1959)
40. Kreiss, H.-O.: Initial boundary value problems for hyperbolic systems. Commun. Pure Appl. Math. 23, 277–296 (1970)
41. Kulikovsky, A.G., Lyubimov, G.A.: Magnetohydrodynamics. Massachusetts: Addison-Wesley, 1965
42. Landau, L.D., Lifshiz, E.M.: Electrodynamics of continuous media. Course of Theoretical Physics, Vol. 8. Oxford, London, New York, Paris: Pergamon Press, 1960
43. Landau, L.D., Lifshiz, E.M.: Fluid Mechanics. Course of Theoretical Physics, 6. New York and Oxford: Pergamon Press, 1997
44. Lankaster, P.: Theory of matrices. New York and London: Academic Press, 1969
45. Lax, P.D.: Hyperbolic systems of conservation laws (II). Commun. Pure Appl. Math. 10, 537–566 (1957)
46. Lessen, M., Deshpande, M.V.: Stability of magnetohydrodynamic shocks waves. J. Plasma Phys. 1, 463–472 (1967)
47. Majda, A.: The stability of multi-dimensional shock fronts — a new problem for linear hyperbolic equations. Providence: Mem. Amer. Math. Soc. 41(275), (1983)

48. Majda, A.: The existence of multi-dimensional shock fronts. Providence: Mem. Amer. Math. Soc. 43(281), (1983)
49. Majda, A.: Compressible Fluid Flow and Systems of Conservation Laws in Several Space Variables. New York: Springer-Verlag, 1984
50. Métivier, G.: Stability of multidimensional weak shocks. Comm. Partial. Diff. Equ. 15, 983–1028 (1990)
51. Métivier, G.: The block structure condition for symmetric hyperbolic systems. Bull. Lond. Math. Soc. 32, 689–702 (2000)
52. Rauch, J.: L2 is a continuable initial condition for Kreiss mixed problems. Commun. Pure Appl. Math. 25, 265–285 (1971)
53. Rozhdestvenskii, B.L., Janenko, N.N.: Systems of quasilinear equations and their applications to gas dynamics. Translations of Mathematical Monographs, 55. Providence, RI: American Mathematical Society, 1983
54. Sablé-Tougeron, M.: Existence pour un probleme de l'elastodynamique Neumann non lineaire en dimension 2. Arch. Rat. Mech. Anal. 101, 261–292 (1988)
55. Serre, D.: La transition vers l'instabilité pour les ondes de choc multi-dimensionnelles. Trans. Am. Math. Soc. 353, 5071–5093 (2001)
56. Trakhinin, Yu.L.: On stability of shock waves in relativistic magnetohydrodynamics. Quar. Appl. Math. 59, 25–45 (2001)
57. Trakhinin, Yu.L.: On stability of fast shock waves in classical and relativistic MHD. In: Freistühler, H., Warnecke, G. (eds.) Hyperbolic problems: Theory, Numerics, Applications. Proceedings, 8th International Conference, Magdeburg 2000, Basel, Boston, Berlin: Birkhäuser, 2001, pp. 911–919
58. Whitham, G.B.: Linear and Nonlinear Waves. New York etc.: John Wiley & Sons, 1974
59. Wu, C.C.: Formation, structure, and stability of MHD intermediate shocks. J. Geophys. Res. 95, 8149–8175 (1990)

Communicated by P. Constantin

Commun. Math. Phys. 236, 93–133 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0801-0

Communications in

Mathematical Physics

Hitchin Systems – Symplectic Hecke Correspondence and Two-Dimensional Version

A. M. Levin1,∗, M. A. Olshanetsky1,∗∗, A. Zotov2

1 Max Planck Institute of Mathematics, Bonn, Germany. E-mail: [email protected]; [email protected]
2 Institute of Theoretical and Experimental Physics, Moscow, Russia. E-mail: [email protected]

∗ On leave from Institute of Oceanology, Moscow, Russia
∗∗ On leave from Institute of Theoretical and Experimental Physics, Moscow, Russia

Received: 19 April 2002 / Accepted: 14 November 2002 Published online: 18 February 2003 – © Springer-Verlag 2003

Abstract: The aim of this paper is two-fold. First, we define symplectic maps between Hitchin systems related to holomorphic bundles of different degrees. We call these maps the Symplectic Hecke Correspondence (SHC) of the corresponding Higgs bundles. They are constructed by means of the modification of the underlying holomorphic bundles. SHC allows to construct Bäcklund transformations in the Hitchin systems defined over Riemann curves with marked points. We apply the general scheme to the elliptic Calogero-Moser (CM) system and construct SHC to an integrable SL(N, C) Euler-Arnold top (the elliptic SL(N, C)-rotator). Next, we propose a generalization of the Hitchin approach to 2d integrable theories related to the Higgs bundles of infinite rank. The main example is an integrable two-dimensional version of the two-body elliptic CM system. The previous construction allows us to define SHC between the two-dimensional elliptic CM system and the Landau-Lifshitz equation.

Contents

1. Introduction
2. Hitchin Systems in the Čech Description
   2.1 The moduli space of holomorphic quasi-parabolic bundles in the Čech description
   2.2 Hitchin systems
   2.3 Standard description of the Hitchin system
   2.4 Modified Čech description of the moduli space
3. Symplectic Hecke Correspondence
   3.1 Hecke correspondence
   3.2 Symplectic Hecke correspondence
   3.3 SHC and skew-conormal bundles
   3.4 Bäcklund transformation
4. Elliptic CM System – Elliptic SL(N, C)-Rotator Correspondence
   4.1 Elliptic CM system
   4.2 The elliptic SL(N, C)-rotator
   4.3 A map $R^{CM} \to R^{rot}$
   4.4 Bäcklund transformations in the CM systems
5. Hitchin Systems of Infinite Rank
   5.1 Holomorphic $\hat{L}(GL(N, \mathbb{C}))$-bundles
   5.2 Gauge symmetries
   5.3 Phase space
   5.4 Symplectic reduction
   5.5 Coadjoint orbits
   5.6 Conservation laws I
   5.7 Equations of motion
   5.8 Conservation laws II
   5.9 Hamiltonians in SL(2, C) case
6. $\hat{L}(SL(N, \mathbb{C}))$-Bundles over elliptic Curves with Marked Points
   6.1 General case
   6.2 $\hat{L}(SL(2, \mathbb{C}))$-bundles over elliptic curves with marked points
   6.3 Hamiltonians for the 2d elliptic Gaudin model
7. Conclusion
8. Appendix
   8.1 Appendix A. Sin-Algebra
   8.2 Appendix B. Elliptic functions
   8.3 Appendix C: 2d sl(2, C) Calogero L-M pair

1. Introduction

Nowadays many examples of integrable one-dimensional and two-dimensional models are known. The problem of listing all of them, up to some equivalence, was solved for some particular forms of two-dimensional models [1]. The recently developed concept of duality for one-dimensional models [2] can shed light on the classification problem, in analogy with string theory. In spite of this progress we are still far from understanding the structure of this universe. Therefore the classification of integrable systems, as distinct from solving any individual equation, remains a topical task.

We will consider integrable systems that have the Lax or Zakharov-Shabat representations. In these cases the gauge transformations of the accompanying linear equations lead essentially to the same systems, though their equations of motion differ in a significant way. For example, the non-linear Schrödinger model is gauge equivalent to the isotropic Heisenberg magnet [3]. Thus integrable systems should be classified up to gauge equivalence, though this is not the only equivalence principle in their possible classifications. The crucial and delicate point of this approach is the exact definition of the allowed gauge transformations, and it will be discussed here.

We restrict ourselves to Hitchin systems [4] and their two-dimensional generalizations, which we will construct. The Hitchin construction establishes relations between finite-dimensional integrable systems and the moduli space of holomorphic vector bundles over Riemann curves. The phase space of the integrable system is the cotangent bundle to the moduli space, and the dual variables are called the Higgs fields. The pair (E, Φ),

Hitchin Systems – Symplectic Hecke Correspondence


where E is a holomorphic bundle, is called the Higgs bundle. The Lax representation arises immediately in this scheme as the equation of motion, and the Lax operator is just the Higgs field defined on shell. The $C^\infty$ gauge transformations of the Lax pair define equivalent holomorphic bundles. Different gauge fixing conditions give equivalent integrable systems. We consider the generalization of the Hitchin systems based on the quasi-parabolic Higgs bundles [5], where the Higgs fields are allowed to have first order poles at the marked points on the base curve. The gauge transformations preserve the flag structures that arise at the marked points. The corresponding integrable systems were considered in [6–9]. We loosen the smoothness condition on the gauge transformations and allow them to have a simple zero or a pole at one of the marked points. This type of gauge transformation (the upper and lower symplectic Hecke correspondence (SHC)) is suggested by the geometric Langlands program. SHC changes the degree of the underlying bundles by ±1. We assume that the Hecke correspondence (HC) is consistent with the flag structures on the source and target bundles. This allows one to choose a canonical form of the modifications. HC can be lifted as a symplectic correspondence (SHC) to the Higgs bundles. In this way SHC defines a map of Hitchin systems related to bundles of different degrees. One can consider an arbitrary chain of consecutive SHC attributed to different marked points. If the resulting transformation preserves the degree of the bundle, then it defines a Bäcklund transformation of the Hitchin system related to the initial bundle, or an integrable discrete time map [10]. Our construction is similar to the scheme proposed by Arinkin and Lysenko [11] in their investigation of flat SL(2, C)-bundles over rational curves and of the geometric structure of the Bäcklund transformations in the Painlevé VI system [12].
As an example, we consider a trivial holomorphic SL(N, C)-bundle $E^{CM}$ ($\deg(E^{CM}) = 0$) over an elliptic curve with a marked point. The corresponding quasi-parabolic Higgs bundle leads to the elliptic N-body Calogero-Moser system (CM system). The upper SHC defines a map from the Higgs bundle related to $E^{CM}$ to the Higgs bundle $(E^{rot}, \Phi^{rot})$ with $\deg(E^{rot}) = 1$. SHC is generated by an N-th order matrix whose elements are theta-functions depending on the coordinates of the particles. The system $(E^{rot}, \Phi^{rot})$ is the integrable SL(N, C)-Euler-Arnold top (SL(N, C)-elliptic rotator). The Lax pair for this top was proposed earlier [13]. The consecutive upper and lower SHC define the Bäcklund transformations of both systems. A construction of this type was suggested in [14] for studying the Bäcklund transformations of the Ruijsenaars model. Another way to find a Bäcklund transformation is to apply N consecutive upper modifications, since they lead to equivalent Higgs bundles.

In the second part of the paper we try to gain insight into the interrelation between integrable theories in dimensions one and two. It is known that some one-dimensional integrable systems can be extended to the two-dimensional case without sacrificing integrability. For example, the Toda field theory comes from the corresponding Toda lattice. To understand this connection we apply the Hitchin construction to two-dimensional systems. For this purpose we consider infinite rank bundles over the Riemann curves with marked points. The transition group of the bundles is the centrally extended loop group $\hat L(GL(N, \mathbb C))$. If the central charge vanishes, the theory in essence becomes one-dimensional. In the two-dimensional situation the Higgs field is a gl(N, C) connection on a circle $S^1$. In addition, we put coadjoint orbits of $\hat L(GL(N, \mathbb C))$ at the marked points and in this way introduce the quasi-parabolic structure on the Higgs bundle of infinite rank.
The monodromy of the Higgs field is a generating function for the infinite number of conservation laws. The equations of motion on the reduced phase space are


A.M. Levin, M.A. Olshanetsky, A. Zotov

the Zakharov-Shabat equations. A similar class of Hitchin type systems was introduced recently, from a different point of view, by Krichever [15]. We consider in detail the case of an $\hat L(SL(2, \mathbb C))$-bundle over an elliptic curve with n marked points. The Higgs bundle corresponds to the two-dimensional version of the elliptic Gaudin system. For the case of one marked point we come to the 2d two-body elliptic CM theory. The upper SHC works in the two-dimensional situation as well. It leads to a map from the 2-body elliptic CM field theory to the Landau-Lifshitz equation.¹ To summarize, we consider here the following diagram:

2-body elliptic CM system         −→   SL(2, C)-elliptic rotator
            ↓                                      ↓
2-body elliptic CM field theory   −→   Landau-Lifshitz equation

Fig. 1. Interrelation in integrable theories

In fact, the upper SHC can be applied to the SL(N, C) case. The quadratic Hamiltonian of the N-body elliptic CM field theory was constructed in [15], but the SL(N, C) generalization of the Landau-Lifshitz equation is unknown. It should be mentioned that the quantum version of the SL(N, C) SHC appeared in a different context long ago [16]. It was defined as a twist transformation of the quantum R-matrices, and Hasegawa [17] constructed twists of this type that transform the dynamical elliptic R-matrix of Felder [18] to the non-dynamical R-matrix of Belavin [19]. It was proved [20] that the dynamical R-matrix corresponds to the elliptic Ruijsenaars system [21]. The latter is the relativistic deformation of the elliptic CM system. In this way the Hasegawa twist is the quantization of the SHC we have constructed, since the elliptic CM system and the elliptic Ruijsenaars system are governed by the same R-matrix [22].

2. Hitchin Systems in the Čech Description

In this section we consider vector bundles with structure group G = GL(N, C), or any simple complex Lie group.

2.1. The moduli space of holomorphic quasi-parabolic bundles in the Čech description. Let E be a trivial rank r holomorphic vector bundle over a Riemann curve $\Sigma_n$ with n marked points. Consider a covering of $\Sigma_n$ by open disks $U_a$, a = 1, 2, .... Some of them may contain one marked point $w_\alpha$. The holomorphic structure on E can be described by the differential $d''$. On $U_a$ it can be represented as

$$d'' = \bar\partial_a + \bar A_a, \qquad \bar A_a = h_a^{-1}\bar\partial_a h_a, \qquad \bar\partial_a = \frac{\partial}{\partial \bar z_a},$$

¹ The equivalence of these models was pointed out by A. Shabat.


where $z_a$ is a local coordinate on $U_a$, and $h_a$ is a $C^\infty$ G-valued function on $U_a$. It is a section of the local sheaf $\Omega^0_{C^\infty}(\Sigma_n, \mathrm{Aut}\,E)$. The transition functions $g_{ab} = h_a h_b^{-1}$ are defined on the intersections $U_{ab} = U_a \cap U_b$. They are holomorphic since $\bar A_a = \bar A_b$ on $U_{ab}$: $g_{ab} \in \Omega^0_{hol}(U_{ab}, \mathrm{Aut}\,E)$. The transformation $h_a \to f_a h_a$ by a function holomorphic on $U_a$ ($f_a \in \Omega^0_{hol}(U_a, \mathrm{Aut}\,E)$) does not change $\bar A_a$. Similarly, the transformation $h_b \to f_b h_b$ by $f_b \in \Omega^0_{hol}(U_b, \mathrm{Aut}\,E)$ does not change $\bar A_b$. Then the holomorphic structures described by the transition functions $g_{ab}$ and $f_a g_{ab} f_b^{-1}$ are equivalent. Globally we have the collection of transition maps

$$L^C = \{ g_{ab}(z_a) = h_a(z_a)\, h_b^{-1}(z_b(z_a)),\ z_a \in U_{ab},\ a, b = 1, 2, \ldots \}. \tag{2.1}$$
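The two claims made before (2.1), that $g_{ab}$ is holomorphic and that $h_a \to f_a h_a$ leaves $\bar A_a$ unchanged, follow from a one-line computation. The sketch below uses only the definitions $g_{ab} = h_a h_b^{-1}$ and $\bar A_a = h_a^{-1}\bar\partial h_a$ from the text; the common symbol $\bar\partial$ on $U_{ab}$ is a notational shortcut assumed here.

```latex
\begin{aligned}
\bar\partial g_{ab}
  &= (\bar\partial h_a)\,h_b^{-1} - h_a h_b^{-1}(\bar\partial h_b)\,h_b^{-1}
   = h_a\bigl(\bar A_a - \bar A_b\bigr)h_b^{-1} = 0
   \quad\text{on } U_{ab},\\[2pt]
(f_a h_a)^{-1}\,\bar\partial (f_a h_a)
  &= h_a^{-1} f_a^{-1}\bigl((\bar\partial f_a)\,h_a + f_a\,\bar\partial h_a\bigr)
   = h_a^{-1}\bar\partial h_a = \bar A_a
   \quad\text{since } \bar\partial f_a = 0 .
\end{aligned}
```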

They define holomorphic structures on E or P = Aut E depending on the choice of the representations. The definition of the holomorphic structures by the transition functions works as well in the case deg(E) ≠ 0 (G = GL(N, C)). They should satisfy the cocycle condition

$$g_{ab}(z)\, g_{bc}(z)\, g_{ca}(z) = \mathrm{Id}, \qquad z \in U_a \cap U_b \cap U_c, \tag{2.2}$$

and

$$g_{ab} = g_{ba}^{-1}. \tag{2.3}$$

The degree of the bundle E is defined as the degree of the line bundle L = det g. We choose an open subset of stable holomorphic structures $L^{C,st}$ in $L^C$. The gauge group $\mathcal G^{hol}$ acts as the automorphisms of $L^{C,st}$,

$$g_{ab} \to f_a\, g_{ab}\, f_b^{-1}, \qquad f_a = f(z_a),\quad f_b = f_b(z_b(z_a)),\quad f \in \mathcal G^{hol}. \tag{2.4}$$

We prescribe the local behavior of the gauge transformations $\mathcal G^{hol}$ at the marked points. Let $P_1, \ldots, P_\alpha, \ldots, P_n$ be parabolic subgroups of G attributed to the marked points. Then we assume that

$$f_a = \begin{cases}
\tilde f_\alpha^{(0)} + z_\alpha f_\alpha^{(1)} + \ldots, & \tilde f_\alpha^{(0)} \in P_\alpha, \ \text{if } z_\alpha = z - w_\alpha,\ w_\alpha \text{ is a marked point},\\[2pt]
f_a^{(0)} + z_a f_a^{(1)} + \ldots, & f_a^{(0)} \in G, \ \text{if } a \neq \alpha \ (U_a \text{ does not contain a marked point}).
\end{cases} \tag{2.5}$$

It follows from (2.4) that the left action of the gauge group at the marked points preserves the flags $E_\alpha(s) \sim P_\alpha \backslash G$,

$$E_\alpha = Fl_1(\alpha) \supset \cdots \supset Fl_{s_\alpha}(\alpha) \supset Fl_{s_\alpha+1}(\alpha) = 0. \tag{2.6}$$

The moduli space of the stable holomorphic bundles $\mathcal M_n(\Sigma, G)$ with the quasi-parabolic structure at the marked points is defined in Ref. [23] as the factor space under this action,

$$\mathcal M_n = \mathcal G^{hol} \backslash L^{C,st}. \tag{2.7}$$

For G = GL(N, C) we have a disjoint union of components labeled by the corresponding degrees $d = c_1(\det E)$: $\mathcal M_n(\Sigma, G) = \bigsqcup_d \mathcal M_n^{(d)}$.


The tangent space to $\mathcal M_n(\Sigma, G)$ is isomorphic to $h^1(\Sigma, \mathrm{End}\,E)$. Its dimension can be extracted from the Riemann-Roch theorem, and for curves without marked points (n = 0)

$$\dim h^0(\Sigma, \mathrm{End}\,E) - \dim h^1(\Sigma, \mathrm{End}\,E) = (1 - g)\dim G.$$

For stable bundles $h^0(\Sigma, \mathrm{End}\,E) = 1$ and

$$\dim \mathcal M_0(\Sigma, G) = (g - 1)N^2 + 1$$

for GL(N, C), and

$$\dim \mathcal M_0(\Sigma, G) = (g - 1)\dim G$$

for simple groups. For elliptic curves one has $\dim h^1(\Sigma, \mathrm{End}\,E) = \dim h^0(\Sigma, \mathrm{End}\,E)$, and

$$\dim \mathcal M^d_0 = \mathrm{g.c.d.}(N, d). \tag{2.8}$$

In this case the structure of the moduli space for the trivial bundles (i.e. with deg(E) = 0) and, for example, for bundles with deg(E) = 1 is different. We use this fact below. For the quasi-parabolic bundles we have

$$\dim \mathcal M^d_n = \dim \mathcal M^d_0 + \sum_{\alpha=1}^{n} f_\alpha, \tag{2.9}$$

where $f_\alpha$ is the dimension of the flag variety $E_\alpha$. In particular, for G = GL(N, C), we get

$$f_\alpha = \frac12\Bigl(N^2 - \sum_{i=1}^{s_\alpha} m_i^2(\alpha)\Bigr), \qquad m_i(\alpha) = \dim Fl_i(\alpha) - \dim Fl_{i+1}(\alpha). \tag{2.10}$$
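Formula (2.10) is easy to evaluate for concrete flags. The following is a minimal sketch, assuming the flag is specified by the list of jumps $m_i$ (the function name `flag_dimension` is illustrative, not from the paper). It shows the two extreme cases: a complete flag, with $f = N(N-1)/2$, and a flag with a single one-dimensional subspace, with $f = N - 1$.

```python
def flag_dimension(N, jumps):
    """Dimension f of the flag variety from eq. (2.10).

    `jumps` is the list m_i = dim Fl_i - dim Fl_{i+1}; the jumps must add up to N.
    """
    if sum(jumps) != N:
        raise ValueError("the jumps m_i must sum to N")
    # N^2 - sum m_i^2 = 2 * sum_{i<j} m_i m_j, so the division by 2 is exact.
    return (N * N - sum(m * m for m in jumps)) // 2


N = 4
print(flag_dimension(N, [1] * N))       # complete flag: N(N-1)/2
print(flag_dimension(N, [N - 1, 1]))    # flag with a single line: N-1
```

For the flag with a single line, twice this value is $2N - 2$, which is the dimension of a minimal coadjoint orbit in gl(N, C).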

The space $L^C$ is a sort of lattice 2d gauge theory. Consider the skeleton of the covering $\{U_a, a = 1, \ldots\}$. It is an oriented graph whose vertices $V_a$ are some fixed inner points in $U_a$ and whose edges $L_{ab}$ connect those $V_a$ and $V_b$ for which $U_{ab} \neq \emptyset$. We choose an orientation of the graph, saying that a > b on the edge $L_{ab}$, and attach to the edge the holomorphic function $z_b(z_a)$ which defines the holomorphic map from $U_a$ to $U_b$. Then the space $L^C$ can be defined by the following data. To each edge $L_{ab}$, a > b, we attach a matrix valued function $g_{ab} \in G$ along with $z_b(z_a)$. The gauge fields $f_a$ live on the vertices $V_a$ and the gauge transformation is given by (2.4).

2.2. Hitchin systems. The Hitchin systems in the Čech description can be constructed in the following way [24]. We start from the cotangent bundle $T^*L^C_n$ to the holomorphic structures on P = Aut E (2.1). Now

$$T^*L^C_n = \{ \eta_{ab},\, g_{ab} \mid \eta_{ab} \in \Omega^{(1,0)}_{hol}(U_{ab}, (\mathrm{End}\,E)^*),\ g_{ab} \in \Omega^0_{hol}(U_{ab}, P) \}. \tag{2.11}$$

The one-forms $\eta_{ab}$ are called the Higgs fields. This bundle can be endowed with a symplectic structure by means of the Cartan-Maurer one-forms on $\Omega^0_{hol}(U_{ab}, P)$. Let $\Gamma^a_b(\beta\gamma)$ be an oriented edge in $U_{ab}$ with the end points in the triple intersections $\beta \in U_{abc} = U_a \cap U_b \cap U_c$, $\gamma \in U_{abd}$. The fields $\eta_{ab}$, $g_{ab}$ are attributed to the edge $\Gamma^a_b(\beta\gamma)$. If we change the orientation $\Gamma^a_b(\beta\gamma) \to \Gamma^b_a(\gamma\beta)$, the fields should be replaced by $g_{ba} = g_{ab}^{-1}$ (see (2.3)) and

$$\eta_{ab}(z_a) = g_{ab}(z_a)\, \eta_{ba}(z_b(z_a))\, g_{ab}^{-1}(z_a). \tag{2.12}$$

For this reason the integral

$$\int_{\Gamma^a_b(\beta\gamma)} \mathrm{tr}\bigl(\eta_{ab}(z_a)\, Dg_{ab}\, g_{ab}^{-1}(z_a)\bigr) \tag{2.13}$$

is independent of the orientation. We can put the data (2.11) on the graph $\{\Gamma^a_b\}$ corresponding to the covering $\{U_a\}$. Taking into account (2.13) we define the symplectic structure

$$\omega^C = \sum_{\text{edges}} \int_{\Gamma^a_b(\beta\gamma)} D\,\mathrm{tr}\bigl(\eta_{ab}(z_a)\, Dg_{ab}\, g_{ab}^{-1}(z_a)\bigr). \tag{2.14}$$

Since $\eta_{ab}$ and $g_{ab}$ are both holomorphic in $U_{ab}$, the integral is independent of the choice of the path $\Gamma^a_b$ within $U_{ab}$. It is worthwhile to note that the cocycle condition (2.2) does not yield additional constraints. The symplectic form is invariant under the gauge transformations (2.4) supplemented by

$$\eta_{ab} \to f_a\, \eta_{ab}\, f_a^{-1}. \tag{2.15}$$

The set of invariant commuting Hamiltonians on $T^*L^C$ is

$$I^C_{j,k} = \sum_{\text{edges}} \int_{\Gamma^a_b(\beta\gamma)} \nu^C_{(j,k)}(z_a)\, \mathrm{tr}\bigl(\eta_{ab}^{d_j}(z_a)\bigr), \qquad (k = 1, \ldots, n_j), \tag{2.16}$$

where $d_j$ are the orders of the basic invariant polynomials corresponding to G and $\nu^C_{j,k}$ are $(1 - d_j, 0)$-differentials. They are related locally to the $(1 - j, 1)$-differentials $\nu^D_{j,k}$ by $\nu^D_{j,k} = \bar\partial \nu^C_{j,k}$, and

$$n_j = h^1(\Sigma, T^{\otimes(d_j - 1)}) = (2d_j - 1)(g - 1) + (d_j - 1)n, \qquad (j = 1, \ldots, r)$$

for the simple groups, and

$$n_j = \begin{cases} (2j - 1)(g - 1) + (j - 1)n, & (j = 2, \ldots, N),\\ g, & j = 1 \end{cases}$$

for GL(N, C). The total number of independent Hamiltonians is equal to

$$\sum_{j=1}^{N} n_j = \dim \mathcal M^d_0 + \frac12 r(r + 1)\, n.$$

This number is in general greater than the dimension of the moduli space $\mathcal M^d_n$ (2.9). There are rn highest weight integrals (j = r) that become Casimir elements of coadjoint orbits after the symplectic reduction, which we consider below.
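The counting of Hamiltonians can be verified directly for GL(N, C). The sketch below assumes genus $g \ge 2$, so that $\dim \mathcal M_0 = (g-1)N^2 + 1$ applies, and takes $r = N - 1$; the function names are illustrative. It sums the $n_j$ given in the text and compares the result with $\dim \mathcal M_0 + \tfrac12 r(r+1)n$.

```python
def n_j(j, g, n):
    """Number of Hamiltonians of degree j for GL(N, C), as listed in the text."""
    return g if j == 1 else (2 * j - 1) * (g - 1) + (j - 1) * n


def total_hamiltonians(N, g, n):
    """Sum of n_j over j = 1, ..., N."""
    return sum(n_j(j, g, n) for j in range(1, N + 1))


def closed_form(N, g, n):
    """dim M_0 + r(r+1)n/2 with dim M_0 = (g-1)N^2 + 1 and r = N - 1 (assumes g >= 2)."""
    r = N - 1
    return (g - 1) * N * N + 1 + r * (r + 1) * n // 2


for N in range(1, 6):
    for g in range(2, 5):
        for n in range(0, 4):
            assert total_hamiltonians(N, g, n) == closed_form(N, g, n)
print(total_hamiltonians(3, 2, 1))
```

With complete flags at every marked point, $f_\alpha = N(N-1)/2$ and this count coincides with $\dim \mathcal M_n$ from (2.9); for smaller flags it exceeds it, in line with the remark about Casimir elements.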


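Perform the symplectic reduction with respect to the gauge action (2.4), (2.15) of $\mathcal G^{hol}_n$ (2.5). The moment map is

$$\mu_{\mathcal G^{hol}}(\eta_{ab}, g_{ab}) : T^*L^C \to \mathrm{Lie}^*(\mathcal G^{hol}_n).$$

Here the Lie coalgebra $\mathrm{Lie}^*(\mathcal G^{hol})$ is defined with respect to the pairing

$$\sum_{\text{edges}} \int_{\Gamma^a_b(\beta\gamma)} \mathrm{tr}(\xi_a \epsilon_a), \qquad \epsilon_a \in \mathrm{Lie}(\mathcal G^{hol}).$$

Then locally we have

$$\xi_a = \begin{cases}
\bigl(z_a^{-1}\tilde\xi_a + z_a^{-2}\xi_a^{(-2)} + \ldots\bigr)dz_a, & \tilde\xi_a \in \mathrm{Lie}^*(P_\alpha)\ (U_a \text{ contains a marked point } w_\alpha),\\[2pt]
\bigl(z_a^{-1}\xi_a^{(-1)} + z_a^{-2}\xi_a^{(-2)} + \ldots\bigr)dz_a, & \xi_a^{(-1)} \in \mathrm{Lie}^*(G)\ (U_a \text{ does not contain } w_\alpha).
\end{cases} \tag{2.17}$$

The canonical gauge transformations (2.4), (2.15) of the symplectic form (2.14) are generated by the Hamiltonian

$$F^{hol} = \sum_{\text{edges}} \int_{\Gamma^a_b(\beta\gamma)} \Bigl[\mathrm{tr}\bigl(\eta_{ab}(z_a)\epsilon_a^{hol}(z_a)\bigr) - \mathrm{tr}\bigl(\eta_{ab}(z_a)\, g_{ab}(z_a)\, \epsilon_b^{hol}(z_b(z_a))\, g_{ab}(z_a)^{-1}\bigr)\Bigr]
= \sum_a \oint_{\Gamma_a} \mathrm{tr}\bigl(\eta_{ab}(z_a)\epsilon_a^{hol}(z_a)\bigr),$$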

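where $\Gamma_a$ is an oriented contour around $U_a$. The non-zero moment is fixed in a special way in the neighborhoods of the marked points. Let $\tilde G_\alpha \subset P_\alpha$ be the maximal semi-simple subgroup of the parabolic group $P_\alpha$ defined at the marked point $w_\alpha$. We drop for a moment the index α for simplicity. We choose an ordering in the Cartan subalgebra $\mathfrak h \subset \mathrm{Lie}(G)$ which is consistent with the embedding $P \subset G$. Let $\tilde{\mathfrak h} = \mathfrak h \cap \mathrm{Lie}(\tilde G)$ be the Cartan subalgebra in $\tilde G$. Consider the orthogonal decomposition of $\mathfrak h^*$, $\mathfrak h^* = \tilde{\mathfrak h}^* + \mathfrak h'^*$. We fix a vector $p^{(0)} \in \mathfrak h'^*$ such that it is a generic element in $\mathfrak h'^*$ and

$$\langle p^{(0)}, \tilde{\mathfrak h}^* \rangle = 0, \tag{2.18}$$

where $\langle\ ,\ \rangle$ is the Killing scalar product in $\mathfrak h^*$. Since $\mathfrak h'^* \subset \mathrm{Lie}^*(P)$, we can take $\mu_{\mathcal G^{hol}}$ in the form

$$\mu_{\mathcal G^{hol}} = \mu_0 = \sum_{\alpha=1}^{n} p_\alpha^{(0)} z_\alpha^{-1} dz_\alpha, \qquad p^{(0)} \in \mathfrak h'^*, \tag{2.19}$$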

where $z_\alpha = z - w_\alpha$ is a local coordinate in $U_\alpha$. The moment equation $\mu_{\mathcal G^{hol}} = \mu_0$ can be read off from $F^{hol}$. It follows from the definition of $\mathrm{Lie}^*(\mathcal G^{hol})$ that $\eta_{ab}$ is the boundary value of some holomorphic or meromorphic one-form $H_a$ defined on $U_a$ via

$$\eta_{ab}(z_a) = H_a(z_a) \ \text{ for } z_a \in U_{ab}, \qquad H_a \in \Omega^{(1,0)}_{hol}(U_a, \mathrm{End}^*(E)), \tag{2.20}$$

where

$$H_a = \begin{cases}
z_a^{-1} p_\alpha^{(0)} + H_a^{(0)} + z_a H_a^{(1)} + \ldots, & \text{if } U_a \text{ contains a marked point } w_\alpha,\\[2pt]
H_a^{(0)} + z_a H_a^{(1)} + \ldots, & \text{if } U_a \text{ does not contain a marked point}.
\end{cases} \tag{2.21}$$

The gauge fixing means that the transition functions $g_{ab}$ are elements of the moduli space $\mathcal M^d_n(\Sigma, E)$. The symplectic quotient

$$\mathcal H^d_n = \mathcal G^{hol} \backslash\backslash T^*L^C = \mathcal G^{hol} \backslash \mu^{-1}(\mu_0) \tag{2.22}$$

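is called the Higgs bundle with the quasi-parabolic structures. We set off the zero modes $g^{(0)}_{\alpha b}$ of the transition functions in the symplectic form on the reduced space (see (2.14)):

$$\omega^C = \sum_{\text{edges}} \int_{\Gamma^a_b(\beta\gamma)} D\,\mathrm{tr}\bigl(\eta_{ab}(z_a)\, Dg_{ab}\, g_{ab}^{-1}(z_a)\bigr) + 2\pi i \sum_{\alpha=1}^{n} \sum_b D\,\mathrm{tr}\bigl(p_\alpha^{(0)}\, Dg^{(0)}_{\alpha b}\, (g^{(0)}_{\alpha b})^{-1}\bigr). \tag{2.23}$$

The last sum defines the Kirillov-Kostant symplectic forms on the set of coadjoint orbits $\mathcal O(n) = (\mathcal O_1, \ldots, \mathcal O_\alpha, \ldots, \mathcal O_n)$, where

$$\mathcal O_\alpha = \{ p_\alpha \in \mathrm{Lie}^*(G) \mid p_\alpha = (g_\alpha^{(0)})^{-1} p_\alpha^{(0)} g_\alpha^{(0)} \}. \tag{2.24}$$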

Note that $\dim(\mathcal O_\alpha) = 2f_\alpha$ (2.10).

Remark 2.1. It is possible to construct another type of orbit $\mathcal O'_\alpha$ of the same dimension. There exist elements $p'^{(0)}_\alpha$ that belong to the complements of $\mathrm{Lie}^*(\tilde G_\alpha)$ in $\mathrm{Lie}^*(P_\alpha)$ such that the orbit $\mathcal O'_\alpha = \{ p'_\alpha = (g_\alpha^{(0)})^{-1} p'^{(0)}_\alpha g_\alpha^{(0)} \}$ is symplectomorphic to the cotangent bundle to the corresponding flag variety $E_\alpha$ (2.6) without the zero section, $T^*E_\alpha \setminus O(E_\alpha)$, while $\mathcal O_\alpha$ (2.24) is a torsor over $\mathcal O'_\alpha$. Globally, $\mathcal H^d_n$ (2.22) is a torsor over $T^*\mathcal M^d_n$.

2.3. Standard description of the Hitchin system. The standard approach to the Hitchin systems [4] is based on the description of the holomorphic bundles in terms of the operator $d''$. The upstairs phase space has the form

$$T^*L^D_n = \{ \Phi,\, d'' \mid \Phi \in \Omega^{(1,0)}_{C^\infty}(\Sigma_n, \mathrm{End}^* E) \}, \tag{2.25}$$

where Φ is called the Higgs field. The symplectic form

$$\omega^D = \int_{\Sigma_n} \mathrm{tr}(D\Phi \wedge D\bar A) \tag{2.26}$$

is invariant under the action of the gauge group

$$\mathcal G^{C^\infty} = \{ f \in \Omega^0_{C^\infty}(\Sigma_n, \mathrm{Aut}\,V) \}, \qquad \Phi \to f^{-1}\Phi f, \quad \bar A \to f^{-1}\bar\partial f + f^{-1}\bar A f. \tag{2.27}$$


The gauge invariant integrals take the evident form (compare with (2.16))

$$I^D_{j,k} = \int_{\Sigma_n} \nu^D_{(j,k)}\, \mathrm{tr}(\Phi^{d_j}), \qquad (k = 1, \ldots, n_j), \tag{2.28}$$

where $\nu^D_{j,k}$ are $(1 - j, 1)$-differentials on $\Sigma_n$. The symplectic reduction with respect to this action leads to the moment map

$$\mu : T^*L^D_n \to \mathrm{Lie}^*(\mathcal G^{C^\infty}), \qquad \mu = \bar\partial\Phi + [\bar A, \Phi].$$

The Higgs field Φ is related to η in a simple way, $\eta_{ab} = h_a^{-1}\Phi h_a|_{U_{ab}}$, and $\bar A_a = h_a^{-1}\bar\partial_a h_a$. The holomorphicity of η is equivalent to the equation $\mu(\Phi, \bar A) = 0$, and Φ has the same simple poles as $H_a$ (2.20). For simplicity, we call η the Higgs field. The bundle E equipped with the one-form η is called the Higgs bundle.

2.4. Modified Čech description of the moduli space. We modify the Čech description of the moduli space of GL(N, C)-vector bundles in the following way. Consider a formal (or rather small) disk D embedded into Σ in such a way that its center maps to the point w. Consider first the case of G = PGL(N, C)-bundles. The moduli space $\mathcal M^d_n$ is the quotient of the space $G_{D^*}$ of G-valued functions g on the punctured disk $D^*$ by the right action of the group $G_{out}$ of G-valued holomorphic functions on the complement to w and by the left action of the group $G_{int}$ of G-valued holomorphic functions on the disk:

$$\mathcal M^d_n = G_{int} \backslash G_{D^*} / G_{out}, \qquad g \to h_{int}\, g\, h_{out}.$$

We assume that these transformations preserve the quasi-parabolic structure of the vector bundle E. Now consider GL(N, C)-bundles. The group GL(N, C) is not semi-simple. One has an action of the Jacobian Jac(Σ) on the moduli space of vector bundles by tensor multiplication, and the quotient is equal to the space of PGL(N, C)-bundles. This follows from the exact sequence

$$1 \to \mathcal O^* \to GL(N, \mathcal O) \to PGL(N, \mathcal O) \to 1.$$

Hence locally the moduli space of vector bundles is the product of the Jacobian of the curve and the moduli space of PGL(N, C)-bundles. We associate to the pair (g, L) the bundle which is equal to $\mathbb C^N \otimes L$ on the complement of a point, and whose transition function on the punctured disk is g. Assume for simplicity that there is only one marked point and that it coincides with the center of $D^*$. Let z be the local coordinate on $D^*$. Then the gauge group $G_{D^*}$ can be identified with the loop group L(GL(N, C)). A parabolic subgroup of L(GL(N, C)) has the form

$$G_{int} \sim P \cdot \exp L_+(gl(N, \mathbb C)), \qquad L_+(gl(N, \mathbb C)) = \sum_{j>0} g_j z^j, \quad g_j \in gl(N, \mathbb C),$$


where P is a parabolic subgroup in GL(N, C). The quotient $LFl(s) = G_{int} \backslash G_{D^*}$ is the infinite-dimensional flag variety corresponding to the finite-dimensional flag E(s) (see (2.6)),

$$LFl(s) = \cdots \supset LFl_{r,k} \supset LFl_{r+1,k} \supset \cdots \supset LFl_{s,k} \supset LFl_{0,k-1} \supset \cdots, \qquad
LFl_{r,k} = z^k Fl_r + \bigoplus_{j<k} E z^j \quad (LFl_{s+1,k} = LFl_{0,k-1}). \tag{2.29}$$

The GL(N, C) Higgs bundle $\mathcal H^d_n$ (2.22) can be identified with the Hamiltonian quotient $G_{int} \backslash\backslash\, T^*G_{D^*} \times T^*\mathrm{Jac}(\Sigma)\, // G_{out}$. The cotangent bundle $T^*G_{D^*}$ is identified with the space of pairs (g, η), where η is a $\mathrm{Lie}^*(G)$-valued one-form. The canonical one-form is equal to $\mathrm{res}_w(\mathrm{tr}(\eta\, Dg\, g^{-1}))$. The second component $T^*\mathrm{Jac}(\Sigma)$ is the space of pairs (t, L), where L is a point of Jac(Σ) and t is the corresponding co-vector. The canonical one-form is $\langle t, DL \rangle_{\mathrm{Jac}}$, where the brackets denote the pairing between vectors and co-vectors on the Jacobian. The group $G_{out}$ acts as $(g, \eta) \to (g h_{out}, \eta)$. The corresponding momentum constraint can be reformulated as the following condition: $g\eta g^{-1}$ is the restriction of some $\mathrm{Lie}^*(G)$-valued form on the complement to w. The group $G_{int}$ acts as $(g, \eta) \to (h_{int}\, g,\ h_{int}\, \eta\, h_{int}^{-1})$. The momentum constraint means that η is holomorphic in $U_w$ if w is a generic point, or that it has a first order pole if w is a marked point.

3. Symplectic Hecke Correspondence

In this section we consider only GL(N, C)-bundles.

3.1. Hecke correspondence. Let E and $\tilde E$ be two bundles over Σ of the same rank. Assume that there is a map $\Xi^+ : E \to \tilde E$ (more precisely, a map of the sheaves of sections $\Gamma(E) \to \Gamma(\tilde E)$) such that it is an isomorphism on the complement to w and it has a one-dimensional cokernel at $w \in \Sigma$:

$$0 \to E \xrightarrow{\ \Xi^+\ } \tilde E \to \mathbb C|_w \to 0. \tag{3.1}$$

It is the so-called upper modification of the bundle E at the point w. On the complement to the point w consider the map

$$E \xleftarrow{\ \Xi^-\ } \tilde E,$$

such that $\Xi^-\Xi^+ = \mathrm{Id}$. It defines the lower modification at the point w.

Definition 3.1. The upper Hecke correspondence (HC) at the point $w \in \Sigma$ is an auto-correspondence $H^+_w$ on the moduli space of Higgs bundles $\mathcal H$ related to the upper modification $\Xi^+$ (3.1).


The HC $H^+_w$ has components placed only at $\mathcal M^{(d)} \times \mathcal M^{(d+1)}$. The lower HC is defined in a similar way. In this form the HC was used in the Hitchin systems in Refs. [15, 25]. Now consider two quasi-parabolic bundles E and $\tilde E$ with the flag structure at the marked points. While the flag $E_\alpha(s)$ at $w_\alpha$ corresponding to E has the form (2.6), for $\tilde E$ we declare the following flag structure:

$$\tilde E_\alpha(s) = \tilde{Fl}_1(\alpha) \supset \cdots \supset \tilde{Fl}_{s_\alpha}(\alpha) \supset \tilde{Fl}_{s_\alpha+1}(\alpha) = 0,$$

where $\tilde{Fl}_k \sim Fl_{k-1}/Fl_{s_\alpha}$ for $s_\alpha + 1 \ge k \ge 2$. We define $\tilde E$ in terms of the sheaves of sections $\Gamma(\tilde E)$. Let $\Xi^+_\alpha$ be a map of the sheaves of sections $\Gamma(E) \to \Gamma(\tilde E)$ such that it is an isomorphism on the complement to a marked point $w_\alpha \in \Sigma$. Let $\sigma \in \Gamma(E)$ and $\Xi^+_\alpha : \sigma \to \tilde\sigma \in \Gamma(\tilde E)$. If $\sigma|_{w_\alpha} \in Fl_{k-1}$, then $\tilde\sigma|_{w_\alpha} \in \tilde{Fl}_k$. The section σ can be singular of order one if its principal part belongs to $Fl_{s_\alpha}$. Altogether this means that $\Xi^+_\alpha$ acts as the shift on the infinite flag (2.29) at the marked point,

$$\Xi^+_\alpha(LFl_{r,k}) = LFl_{r-1,k}. \tag{3.2}$$

We call $\Xi^+_\alpha$ the upper modification of the quasi-parabolic bundle E. The lower modification of the quasi-parabolic bundles acts in the opposite direction. It looks like the upper modification (3.1), but we temporarily do not assume that $\Xi^+_\alpha$ has a one-dimensional cokernel.

Definition 3.2. The upper Hecke correspondence of the quasi-parabolic bundles at the marked point $w_\alpha$ is an auto-correspondence $H^+_w$ on $\mathcal M$ related to the upper modification $\Xi^+$ (3.2).

Let the flag $E_\alpha$ (2.6) have a one-dimensional subspace ($\dim(Fl_{s_\alpha}) = 1$). In this case the upper modification $\Xi^+_\alpha$ can be fixed in the following way. Let $(e_1, \ldots, e_N)$ be a basis of local sections of E compatible with the flag structure: $Fl_1 \to (e_1, \ldots, e_N)$, ..., $Fl_{s_\alpha} \to (e_N)$. It follows from Definition 3.2 that $\Xi^+_\alpha$ can be gauge transformed to the canonical form

$$\Xi^+_\alpha = \begin{pmatrix} 0 & \mathrm{Id}_{N-1} \\ z_\alpha & 0 \end{pmatrix}. \tag{3.3}$$

It is just the Coxeter transformation in the loop algebra L(gl(N, C)) that has been defined on the punctured disk $D^*_\alpha \subset U_\alpha$ in Subsect. 2.4. The Coxeter transformation provides the upper modification $E^d \to \tilde E^{d+1}$. In fact, the sheaf of sections $\Gamma(\tilde E)$ coincides with the sheaf of sections $\Gamma(E)$ with a singularity of the first order at $w_\alpha$, and the singular section lies in the kernel of $\Xi^+$ (see Ref. [11]). For the local basis of $\Gamma(\tilde E)$ we have $(e_N z^{-1}, e_1, \ldots, e_{N-1})$. In this way the HC of the quasi-parabolic bundles is described by the diagram (3.1). In a similar way the lower modification can be transformed to the form

$$\Xi^-_\alpha = \begin{pmatrix} 0 & z_\alpha^{-1} \\ \mathrm{Id}_{N-1} & 0 \end{pmatrix}. \tag{3.4}$$
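The canonical forms (3.3) and (3.4) can be checked numerically at a sample value of the local coordinate. The sketch below assumes the block layout read off from the text (the upper modification has the identity block in the upper-right corner and z in the lower-left entry, and the lower one is its inverse); the helper names are illustrative. It verifies that the product of the lower and upper modifications is the identity and that the determinant of the upper one is $(-1)^{N-1} z$, so that it has a simple zero at the marked point.

```python
from fractions import Fraction
from itertools import permutations


def upper_modification(N, z):
    """Matrix [[0, Id_{N-1}], [z, 0]] of eq. (3.3) at a sample value z."""
    M = [[Fraction(0)] * N for _ in range(N)]
    for i in range(N - 1):
        M[i][i + 1] = Fraction(1)
    M[N - 1][0] = Fraction(z)
    return M


def lower_modification(N, z):
    """Matrix [[0, 1/z], [Id_{N-1}, 0]] of eq. (3.4) at a sample value z."""
    M = [[Fraction(0)] * N for _ in range(N)]
    M[0][N - 1] = 1 / Fraction(z)
    for i in range(1, N):
        M[i][i - 1] = Fraction(1)
    return M


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def det(M):
    """Leibniz formula; fine for the small N used here."""
    n = len(M)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction(-1 if inversions % 2 else 1)
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total


N, z = 3, Fraction(3, 7)
identity = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
assert matmul(lower_modification(N, z), upper_modification(N, z)) == identity
assert det(upper_modification(N, z)) == (-1) ** (N - 1) * z
```

The determinant being linear in z is the matrix counterpart of the statement that the upper modification raises the degree of the bundle by one.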


3.2. Symplectic Hecke correspondence. We define a map of the Higgs bundles $f : (E, \eta) \to (\tilde E, \tilde\eta)$ as the bundle map $f : E \to \tilde E$ such that

$$f\eta = \tilde\eta f. \tag{3.5}$$

Consider two Higgs bundles $(E, \eta)$ and $(\tilde E, \tilde\eta)$, where E is a quasi-parabolic bundle and $\tilde E$ is the upper modification $\Xi^+_\alpha$ of E at $w_\alpha \in \Sigma$. We call $(\tilde E, \tilde\eta)$ the upper modification of $(E, \eta)$ if $\Xi^+_\alpha \eta = \tilde\eta\, \Xi^+_\alpha$.

Definition 3.3. The upper symplectic Hecke correspondence (SHC) $S^+_\alpha$ at a point $w_\alpha$ is an auto-correspondence on $T^*\mathcal M$ related to the upper modification $\Xi^+_\alpha$ of the Higgs bundles.

The lower SHC $S^-_\alpha$ is defined in a similar way. Let $w_\alpha$ be a marked point. The Higgs field η has a first order pole at $w_\alpha$ (2.21), and the residue $p^{(0)}_\alpha$ of η defines an orbit $\mathcal O_\alpha$.

Lemma 3.1. The gauge transforms $\Xi^\pm_\alpha$ corresponding to $H^\pm_\alpha$:
• do not change the singularity of the Higgs field at $w_\alpha$;
• are symplectic;
• preserve the Hamiltonians (2.16).

Proof. The choice of $p^{(0)}_\alpha$ (2.18) is consistent with the canonical forms (3.3), (3.4) of $\Xi^\pm_\alpha$, and their action does not change the order of the pole. The action is symplectic with respect to (2.23) since $\Xi^\pm_\alpha$ depends only on $p^{(0)}_\alpha$. The invariance of the Hamiltonians follows from (3.5). In particular, Lemma 3.1 means that $\Xi^\pm_\alpha$ preserves the whole Hitchin hierarchy defined by the set of Hamiltonians (2.16) and the symplectic form (2.23).
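The last step of the proof, the invariance of the Hamiltonians, can be spelled out as follows. This is a short sketch rather than the authors' argument verbatim: $\Xi$ denotes the modification map entering (3.5) (the symbol is a notational assumption here), taken on the punctured neighborhood of the marked point where it is invertible.

```latex
\Xi\,\eta = \tilde\eta\,\Xi
\;\Longrightarrow\;
\tilde\eta = \Xi\,\eta\,\Xi^{-1}
\;\Longrightarrow\;
\operatorname{tr}\bigl(\tilde\eta^{\,d_j}\bigr)
  = \operatorname{tr}\bigl(\Xi\,\eta^{\,d_j}\,\Xi^{-1}\bigr)
  = \operatorname{tr}\bigl(\eta^{\,d_j}\bigr).
```

Both sides are meromorphic near the marked point and agree on the punctured disk, so they agree as differentials, and the integrals (2.16) built from them coincide.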

3.3. SHC and skew-conormal bundles. Here we consider the curves without marked points. The general case can be derived in a similar way and we omit it for the sake of simplicity. For any smooth correspondence Z between equi-dimensional varieties X and Y we define a skew-conormal bundle $SN^*Z$ of Z as follows. Let $\nu = (\nu_X, \nu_Y) \in T^*_z(X \times Y) = T^*_x X \oplus T^*_y Y$ be a co-vector attached to a point $z = (x, y) \in Z \subset X \times Y$. It belongs to the fiber $SN^*_z Z$ of the skew-conormal bundle $SN^*Z$ at the point z iff for any vector $v = (v_X, v_Y)$ tangent to Z, $\nu_X(v_X) = \nu_Y(v_Y)$. Note that for the conormal bundle one has the opposite sign: $\nu_X(v_X) = -\nu_Y(v_Y)$. The total space of the skew-conormal bundle is a Lagrangian subvariety of the total space of the cotangent bundle $T^*(X \times Y)$ with respect to the symplectic form $\omega_X - \omega_Y$, where ω denotes the canonical symplectic form on the cotangent bundle. So, the skew-conormal bundle of a correspondence is rather close to the graph of a symplectic map between cotangent bundles.
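The simplest illustration of the last remark is the case where the correspondence is the graph of a diffeomorphism. The example below is an added sketch under that assumption (the map $F$ is hypothetical and does not appear in the paper).

```latex
Z = \{(x, F(x))\} \subset X \times Y,
\qquad T_zZ = \{(v,\, dF_x v)\},
\\[4pt]
\nu_X(v) = \nu_Y(dF_x v)\ \ \forall v
\;\Longleftrightarrow\;
\nu_X = (dF_x)^{*}\nu_Y ,
\\[4pt]
SN^{*}Z = \bigl\{\bigl(x,\ (dF_x)^{*}\nu_Y;\ F(x),\ \nu_Y\bigr)\bigr\} .
```

This is the graph of the cotangent lift of $F$, which is a symplectic map $T^*X \to T^*Y$; its graph is therefore Lagrangian with respect to $\omega_X - \omega_Y$, in agreement with the general statement.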


Proposition 3.1. The graph of the SHC $S_w$ is isomorphic to the skew-conormal bundle $SN^*H_w$ of the usual Hecke correspondence $H_w$.

Proof. As explained in Subsect. 2.4, a GL(N, C)-bundle E is determined by the pair (g, L) in a neighborhood of a point $w \in \Sigma$. An upper HC of E corresponds to $(\tilde g, L)$, where $\tilde g = g\Xi$ and

$$\bar\partial\Xi = 0 \ \text{ in } U_w, \qquad \mathrm{ord}_w(\det(\Xi)) = 1. \tag{3.6}$$

Therefore, the skew-conormal bundle $SN^*H_w$ of the HC can be described by the data $(g, \tilde g; \eta, \tilde\eta; t, \tilde t, L)$, $\tilde g = g\Xi$, where Ξ satisfies (3.6), and

$$\langle t, DL \rangle_{\mathrm{Jac}} = \langle \tilde t, DL \rangle_{\mathrm{Jac}}, \tag{3.7}$$

$$\mathrm{res}_w\bigl(\mathrm{Tr}(\tilde\eta\, D\tilde g\, \tilde g^{-1})\bigr) = \mathrm{res}_w\bigl(\mathrm{Tr}(\eta\, Dg\, g^{-1})\bigr) \tag{3.8}$$

for any variations of g and $\tilde g$ that preserve the properties of $\Xi = g^{-1}\tilde g$. The first condition (3.7) means that $t = \tilde t$. The condition (3.8) can be rewritten as

$$\mathrm{res}_w\, \mathrm{tr}\bigl(\tilde\eta\, D\Xi\, \Xi^{-1} + (\Xi^{-1}\tilde\eta\,\Xi - \eta)\, g^{-1}Dg\bigr) = 0.$$

Since the variations of g and Ξ are independent, both terms in the last expression must vanish separately:

$$\mathrm{res}_w\bigl(\mathrm{tr}(\tilde\eta\, D\Xi\, \Xi^{-1})\bigr) = 0, \tag{3.9}$$

$$\mathrm{res}_w\, \mathrm{tr}\bigl((\Xi^{-1}\tilde\eta\,\Xi - \eta)\, g^{-1}Dg\bigr) = 0. \tag{3.10}$$

Consider first the case when w is not a marked point. We will demonstrate that (3.9) means that $\eta' = \Xi^{-1}\tilde\eta\,\Xi$ is holomorphic at w. Consider the value of Ξ at zero; this matrix has rank N − 1. Denote by K its kernel and by I its image. An essential variation of Ξ corresponds to the variations of its image, so it is a map $I \to \mathbb C^N/I$. This variation corresponds to the right action: $D\Xi = \Xi\epsilon$. The singular part $\Xi^{-1}_{sing}$ of $\Xi^{-1}$ at w is an operator of rank 1. Its kernel equals I and its image equals K, so the singular part $\eta'_{sing}$ of $\eta'$ is a map from $\mathbb C^N/\mathrm{Ker}(\Xi^{-1}_{sing}) = \mathbb C^N/I$ to $\mathrm{Im}(\Xi^{-1}_{sing}) = I$. The first condition can be rewritten as

$$0 = \mathrm{res}_w\bigl(\mathrm{tr}(\tilde\eta\, D\Xi\, \Xi^{-1})\bigr) = \mathrm{res}_w\bigl(\mathrm{tr}(\eta'\, \Xi^{-1}D\Xi)\bigr) = \mathrm{tr}(\eta'_{sing}\,\epsilon)$$

for any $\epsilon \in \mathrm{Hom}(I, \mathbb C^N/I)$. The space $\mathrm{Hom}(I, \mathbb C^N/I)$ is dual to $\mathrm{Hom}(\mathbb C^N/I, I)$, so $\eta'_{sing}$ vanishes and $\eta'$ is holomorphic. Note that $\eta'$ determines some Higgs field for g. Indeed, it is holomorphic in $U_w$ and $g^{-1}\eta' g = \tilde g^{-1}\tilde\eta\,\tilde g$ is the restriction of some one-form on the complement to w. As the canonical one-form $\mathrm{tr}(\eta\, Dg\, g^{-1})$ is non-degenerate on $T^*\mathcal M^d_n$, from the second condition (3.10) we conclude that $\eta' = \eta$. If w is a marked point, then Ξ is fixed and it maps the Higgs field η into the Higgs field $\tilde\eta$ (Lemma 3.1). There is no variation of Ξ and we immediately have again that $\eta' = \eta$.


3.4. Bäcklund transformation. Consider the Higgs bundles with the quasi-parabolic structures at the marked points. The gauge transformations $\Xi^\pm_\alpha$ related to the SHC $S^\pm_\alpha$ depend only on the marked point $w_\alpha$. They define the maps of the Hitchin systems

$$S^+_\alpha \sim \xi^\alpha : T^*\mathcal M^{(d)}(\Sigma_n, G) \to T^*\mathcal M^{(d+1)}(\Sigma_n, G), \tag{3.11}$$

$$S^-_\alpha \sim \xi_\alpha : T^*\mathcal M^{(d)}(\Sigma_n, G) \to T^*\mathcal M^{(d-1)}(\Sigma_n, G). \tag{3.12}$$

Consider consecutive upper and lower modifications

$$\xi^{\alpha_1}_{\alpha_2} = \xi^{\alpha_1} \cdot \xi_{\alpha_2}. \tag{3.13}$$

Since deg(E) does not change, it is a symplectic transform of $T^*\mathcal M^d_n(\Sigma, E)$. In this way $\xi^{\alpha_1}_{\alpha_2}$ maps solutions of the Hitchin hierarchy into solutions.

Corollary 3.1. The map (3.13) is the Bäcklund transformation, parameterized by a pair of marked points $(w_{\alpha_1}, w_{\alpha_2})$.

We can generalize (3.13) as

$$\xi^{\alpha_{j_1};\ldots;\alpha_{j_s}}_{\alpha_{i_1};\ldots;\alpha_{i_s}} = \xi^{\alpha_{j_1}} \cdot \xi_{\alpha_{i_1}} \cdots .$$

Because the Bäcklund transformation is a canonical one, we can consider a discrete Hamiltonian system defined on the phase space $T^*\mathcal M^d_n(\Sigma, E)$. These transformations pairwise commute and, in terms of the angle variables, generate a lattice in the Liouville torus [10, 26]. In our case the dimension of the Liouville torus is equal to $\dim \mathcal M^d_n$ (2.9), but the lattice we have constructed has in general a smaller dimension. Note that when $\Sigma_n$ is an elliptic curve, the Hitchin systems corresponding to d = kN and d = 0 (d = deg(V)) are equivalent. Hence, in this case one can construct some Bäcklund transformations by applying the upper SHC N times.

4. Elliptic CM System – Elliptic SL(N, C)-Rotator Correspondence

4.1. Elliptic CM system. The elliptic CM system was first introduced in the quantum version [27]. It is defined on the phase space

$$R^{CM} = \Bigl\{ v = (v_1, \ldots, v_N),\ u = (u_1, \ldots, u_N),\ \sum_j v_j = 0,\ \sum_j u_j = 0 \Bigr\}, \tag{4.1}$$

with the canonical symplectic form

$$\omega^{CM} = (Dv \wedge Du). \tag{4.2}$$

The Hamiltonian of second order with respect to the momenta v is

$$H^{CM}_2 = \frac12 \sum_{j=1}^{N} v_j^2 + \nu^2 \sum_{j>k} \wp(u_j - u_k; \tau).$$

It was established in [7, 28] that the elliptic CM system can be derived in the Hitchin approach. The Lax operator $L^{CM}$ is the reduced Higgs field η over the elliptic curve $E_\tau = \mathbb C/L$, $L = \mathbb Z + \tau\mathbb Z$, with a marked point z = 0. In this way the phase space $R^{CM}$ is the space of pairs (quasi-parabolic $SL_N$-bundle V over $E_\tau$, the Higgs field $L^{CM}$ on this bundle (4.8)). The bundle is determined by the transition functions (the multipliers)

$$\mathrm{Id}_N : z \to z + 1, \qquad e(u) = \mathrm{diag}(e(u_1), \ldots, e(u_N)) : z \to z + \tau, \tag{4.3}$$

where e is defined in (A.1). The Lax operator $L^{CM}(z)$ is the quasi-periodic one-form

$$L^{CM}(z + 1) = L^{CM}(z), \qquad L^{CM}(z + \tau) = e(-u)\, L^{CM}(z)\, e(u). \tag{4.4}$$

It is the N-th order matrix with a first order pole at z = 0 and the residue

$$p^{(0)} = \mathrm{Res}_{z=0}(L^{CM}(z)) = L^{CM}_{-1} = \nu \begin{pmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{pmatrix}. \tag{4.5}$$

This residue defines the minimal coadjoint orbit $\mathcal O$ (2.24) ($\dim(\mathcal O) = 2N - 2$). These degrees of freedom are gauged away by the action of the residual gauge symmetries generated by the constant diagonal matrices. For this reason the second term in (2.23) does not contribute to the symplectic form (4.2). The column-vector $e_1 = (1, 1, \cdots, 1)$ is an eigenvector,

$$L^{CM}_{-1} e_1 = (N - 1)\nu\, e_1. \tag{4.6}$$

There is also an (N − 1)-dimensional eigen-subspace $T_{N-1}$ corresponding to the degenerate eigenvalue −ν,

$$L^{CM}_{-1} e_a = -\nu\, e_a, \qquad e_a = (a_1, \ldots, a_N), \quad \sum_n a_n = 0. \tag{4.7}$$
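The eigenvalue statements (4.6) and (4.7) can be confirmed with exact arithmetic. The sketch below assumes the residue (4.5) is ν times the matrix with zeros on the diagonal and ones elsewhere; the function names are illustrative. It applies the matrix to the vector of ones and to a vector with vanishing sum of components.

```python
from fractions import Fraction


def cm_residue(N, nu):
    """Residue matrix of eq. (4.5): nu off the diagonal, 0 on the diagonal."""
    return [[Fraction(0) if i == j else Fraction(nu) for j in range(N)] for i in range(N)]


def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


N, nu = 4, Fraction(2, 3)
L = cm_residue(N, nu)

ones = [Fraction(1)] * N
assert apply(L, ones) == [(N - 1) * nu] * N          # eq. (4.6)

a = [Fraction(1), Fraction(-1), Fraction(0), Fraction(0)]   # components sum to zero
assert apply(L, a) == [-nu * x for x in a]           # eq. (4.7)
```

The eigenvalues $(N-1)\nu$ (once) and $-\nu$ ($N-1$ times) sum to zero, consistent with the residue being traceless.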

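The quasi-periodicity (4.4) and the residue (4.5) lead to the following form of $L^{CM}$:

$$L^{CM} = P + X, \quad \text{where } P = \mathrm{diag}(v_1, \ldots, v_N), \quad X_{jk} = \nu\,\phi(u_j - u_k, z), \tag{4.8}$$

and φ is defined as (B.5). The $M^{CM}$-operator corresponding to $H^{CM}_2$ has the form

$$M^{CM} = -D + Y, \quad \text{where } D = \mathrm{diag}(Z_1, \ldots, Z_N), \quad Y_{jk} = y(u_j - u_k, z),$$

$$Z_j = \sum_{k \neq j} \wp(u_j - u_k), \qquad y(u, z) = \frac{\partial\phi(u, z)}{\partial u}. \tag{4.9}$$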


4.2. The elliptic SL(N, C)-rotator. The elliptic SL(N, C)-rotator is an example of the Euler-Arnold top [20]. It is defined on a coadjoint orbit of SL(N, C):

$$ R^{rot} = \{ S \in sl(N, \mathbb{C}),\ S = g^{-1} S^{(0)} g \}, \tag{4.10} $$

where $g$ is defined up to left multiplication by the stationary subgroup $G_0$ of $S^{(0)}$. The phase space $R^{rot}$ is equipped with the Kirillov-Kostant symplectic form

$$ \omega^{rot} = \mathrm{tr}\big(S^{(0)}\, Dg\, g^{-1}\, Dg\, g^{-1}\big). \tag{4.11} $$

The Hamiltonian is defined as

$$ H^{rot} = -\frac{1}{2}\, \mathrm{tr}(S J(S)), \tag{4.12} $$

where $J$ is a linear operator on Lie(SL(N, C)). The inverse operator is called the inertia tensor. The equation of motion takes the form

$$ \partial_t S = [J(S), S]. \tag{4.13} $$
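For any linear operator $J$ that is symmetric with respect to the trace form, the flow (4.13) preserves both the Hamiltonian (4.12) and the Casimirs $\mathrm{tr}\,S^k$. The sketch below illustrates this with a toy operator (entrywise rescaling by a symmetric weight matrix `w`); this operator is an assumption made for the illustration and is not the elliptic $J$ of (4.14).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4

# Hypothetical weights, symmetrised so that tr(A J(B)) = tr(J(A) B).
w = rng.normal(size=(N, N))
w = w + w.T

def J(S):
    """Toy linear operator: entrywise rescaling of S by the weights w."""
    return w * S

def rhs(S):
    """Right-hand side of (4.13): dS/dt = [J(S), S]."""
    return J(S) @ S - S @ J(S)

S = rng.normal(size=(N, N))
S -= np.trace(S) / N * np.eye(N)      # project onto traceless matrices

dS = rhs(S)

# d/dt of H = -1/2 tr(S J(S)) along the flow (J is linear).
dH = -0.5 * (np.trace(dS @ J(S)) + np.trace(S @ J(dS)))

# d/dt of the Casimirs tr(S^2) and tr(S^3).
dC2 = 2 * np.trace(S @ dS)
dC3 = 3 * np.trace(S @ S @ dS)
```

All three derivatives vanish up to rounding error: the Casimirs because the right-hand side is a commutator with $S$, and the Hamiltonian because of the symmetry of $J$ and cyclicity of the trace.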

We consider here a special form J , that provides the integrability of the system. Let J (S) = J · S = Jmn Smn , mn

where J is a N th order matrix, m , (m, n = 1, . . . , N ), (m, n ∈ Z mod N, m + nτ ∈ L) , J = {Jmn } = ℘ n (4.14) m + nτ m ;τ . ℘ =℘ n N We write down (4.13) in the basis of the sin-algebra S = Smn Emn (see (A.4)), N π k Sk,l Sm−k,n−l ℘ sin (kn − ml). ∂t Smn = l π N

(4.15)

k,l

The elliptic rotator is a Hitchin system [7]. We give a proof of this statement.

Lemma 4.1. The elliptic SL(N, C)-rotator is a Hitchin system corresponding to the Higgs quasi-parabolic GL(N, C)-bundle $E$ ($\deg(E) = 1$) over the elliptic curve $E_\tau$ with the marked point $z = 0$.

Proof. It can be proved that (4.15) is equivalent to the Lax equation. The Lax matrices in the basis of the sin-algebra take the form

$$ L^{rot} = \sum_{m,n} S_{mn}\, \varphi\begin{bmatrix} m \\ n \end{bmatrix}(z)\, E_{mn}, \qquad \varphi\begin{bmatrix} m \\ n \end{bmatrix}(z) = e\Big(-\frac{nz}{N}\Big)\, \phi\Big(-\frac{m + n\tau}{N}; z\Big), \tag{4.16} $$

$$ M^{rot} = \sum_{m,n} S_{mn}\, f\begin{bmatrix} m \\ n \end{bmatrix}(z)\, E_{mn}, \qquad f\begin{bmatrix} m \\ n \end{bmatrix}(z) = e\Big(-\frac{nz}{N}\Big)\, \partial_u \phi(u; z)\Big|_{u = -\frac{m + n\tau}{N}}. \tag{4.17} $$

They lead to the Lax equation for the matrix elements

$$ \partial_t S_{mn}\, \varphi\begin{bmatrix} m \\ n \end{bmatrix}(z) = \sqrt{-1} \sum_{k,l} S_{m-k,n-l}\, S_{kl}\, \varphi\begin{bmatrix} m-k \\ n-l \end{bmatrix}(z)\, f\begin{bmatrix} k \\ l \end{bmatrix}(z)\, \sin\frac{\pi}{N}(nk - ml). $$

Using the Calogero functional equation (B.27) we rewrite it in the form (4.15). Since

$$ \frac{1}{N}\, \mathrm{tr}(L^{rot})^2 = -2H^{rot} + \mathrm{tr}\,S^2\, \wp(z), $$

$H^{rot}$ is the Hitchin quadratic integral. The Lax operator satisfies the Hitchin equation

$$ \bar\partial L^{rot} = 0, \qquad \mathrm{Res}\, L^{rot}|_{z=0} = 2\pi\sqrt{-1}\, S, $$

and is quasi-periodic:

$$ L^{rot}(z + 1) = Q(\tau)\, L^{rot}(z)\, Q^{-1}(\tau), \tag{4.18} $$

$$ L^{rot}(z + \tau) = \tilde\Lambda(z, \tau)\, L^{rot}(z)\, (\tilde\Lambda(z, \tau))^{-1}, \tag{4.19} $$

where $\tilde\Lambda(z, \tau) = -e\big(\frac{-z - \frac{1}{2}\tau}{N}\big)\Lambda$ and the matrices $Q$ and $\Lambda$ are defined in (A.2), (A.3). The transition functions

$$ Q(\tau) : z \to z + 1, \tag{4.20} $$

$$ \tilde\Lambda(z, \tau) : z \to z + \tau \tag{4.21} $$

define the GL(N, C)-bundle over $E_\tau$ with $\deg(V) = 1$. For these bundles we have $\dim(\mathcal{M}^1_0) = 1$ (2.8), and after the symplectic reduction we come to the coadjoint orbit $G_0 \backslash SL(N, \mathbb{C})$ (4.10). The Kirillov-Kostant form (4.11) arises as the last terms in (2.23) attributed to the point $z = 0$. Thus, the phase space of the $SL_N$-rotator is the space of the Higgs fields $L^{rot}$ on the bundle determined by the multipliers $Q$, $\tilde\Lambda$, with first order singularities at zero.

4.3. A map $R^{CM} \to R^{rot}$. We construct a map from the phase space of the elliptic CM system $R^{CM}$ into the phase space of the $SL_N$-rotator $R^{rot}$. We assume here that the $SL_N$-rotator lives on the most degenerate orbit corresponding to $L^{CM}_{-1}$ (4.5). The phase space of CM systems with spins is mapped into the general coadjoint orbits. This generalization is straightforward. In this way, for $N = 2$ we describe the upper horizontal arrow in Fig. 1. The map is defined as the conjugation of $L^{CM}$ by some matrix $\Xi(z)$:

$$ L^{rot} = \Xi \times L^{CM} \times \Xi^{-1}. \tag{4.22} $$

It follows from comparing (4.4) with (4.18) and (4.19) that $\Xi$ must intertwine the multipliers of the bundles:

$$ \Xi(z + 1, \tau) = Q \times \Xi(z, \tau), \tag{4.23} $$

$$ \Xi(z + \tau, \tau) = \tilde\Lambda(z, \tau) \times \Xi(z, \tau) \times \mathrm{diag}(e(u_j)). \tag{4.24} $$

The matrix $\Xi(z)$ degenerates at $z = 0$, and the column-vector $(1, \cdots, 1)$, in accordance with Lemma 3.1, should belong to the kernel of $\Xi(0)$. In this case, $\Xi \times L^{CM} \times \Xi^{-1}$ has a first order pole at $z = 0$.


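Consider the following $(N \times N)$-matrix $\tilde\Xi(z, u_1, \ldots, u_N; \tau)$:

$$ \tilde\Xi_{ij}(z, u_1, \ldots, u_N; \tau) = \theta\begin{bmatrix} \frac{i}{N} - \frac{1}{2} \\ \frac{N}{2} \end{bmatrix}(z - N u_j, N\tau), \tag{4.25} $$

where $\theta\begin{bmatrix} a \\ b \end{bmatrix}(z, \tau)$ is the theta function with a characteristic (B.31). Sometimes we omit nonessential arguments of $\tilde\Xi$ for brevity.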

˜ is transformed under the translations z → z + 1, z → z + τ Lemma 4.2. The matrix and uj → uj + 1, uj → uj + τ as : ˜ + 1, τ ) = −Q × (z, ˜ (z τ ),

(4.26)

˜ + τ, τ ) = (z, ˜ ˜ (z τ ) × (z, τ ) × diag(e(uj )), (4.27) τ z ˜ (z, τ ) = −e − − ; 2N N ˜ j ; τ ) × diag(1, · · · , (−1)N , · · · , 1), ˜ j + 1, ; τ ) = (u (4.28) (u Nτ ˜ j + τ ; τ ) = (u ˜ j ; τ ) × diag(1, · · · , (−1)N e − (u + z − N uj · · · , 1). 2 (4.29) Proof. The statement of the lemma follows from the properties of the theta functions with characteristics (B.33)–(B.35). Now we assume that uj = 0, so uN is no more an independent variable, but it is equal to − N−1 j =1 uj . The determinant formula of the Vandermonde type [17] ϑ(ul − uk ) ˜ ij (z, u1 , . . . , uN ; τ ) ϑ(z) det =√ (4.30) √ √ −1η(τ ) −1η(τ ) 1≤k
Proof. We must prove that for any i the following expression N l=1

l

(−1) θ

i N

− N 2

1 2

(z − N ul , Nτ )

ϑ(uk − uj , τ )

(4.31)

j
vanishes. First, the symmetric group SN acts on u by permutation of u1 , . . . , uN and (4.31) is antisymmetric with respect to the SN action. Hence it vanishes on the hyperplanes ui = uj . As a function on u1 , (4.31) has 2N zeroes: N − 2 zeroes u1 = uk ,


k = 1, N, N − 2 zeroes uN = uk , k = 1, N and four zeroes u1 = uN (the last equation is 2u1 = − N−1 j =2 uj ). Second, (4.31) is quasiperiodic with respect to the shifts u1 → u1 + 1, u1 → u1 + τ with multiplicators 1 and e (−(N − 1)τ − (N − 1)(u1 − uN )). Any quasiperiodic function with such multiplicators is either zero or has 2N − 2 zeroes. Since our expression vanishes in 2N points it vanishes identically. It follows from the previous lemmas that the matrix   ˜ (z) = (z) × diag (−1)l ϑ(uk − uj , τ )

(4.32)

j
is the singular gauge transform from Lemma 2.1 that maps LCM to Lrot . This transformation leads to the symplectic map RCM → Rrot , (v, u) → S.

(4.33)

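Consider in detail the case $N = 2$. Let $S = S_a \sigma_a$, where $\sigma_a$ denote the sigma matrices subject to the commutation relations $[\sigma_a, \sigma_b] = 2\sqrt{-1}\,\varepsilon_{abc}\,\sigma_c$. Then the transformation has the form

$$ \begin{aligned}
S_1 &= -v\, \frac{\theta_{10}(0)}{\vartheta'(0)}\, \frac{\theta_{10}(2u)}{\vartheta(2u)} - \nu\, \frac{\theta_{10}^2(0)}{\theta_{00}(0)\theta_{01}(0)}\, \frac{\theta_{00}(2u)\theta_{01}(2u)}{\vartheta^2(2u)}, \\
S_2 &= -v\, \frac{\theta_{00}(0)}{\sqrt{-1}\,\vartheta'(0)}\, \frac{\theta_{00}(2u)}{\vartheta(2u)} - \nu\, \frac{\theta_{00}^2(0)}{\sqrt{-1}\,\theta_{10}(0)\theta_{01}(0)}\, \frac{\theta_{10}(2u)\theta_{01}(2u)}{\vartheta^2(2u)}, \\
S_3 &= -v\, \frac{\theta_{01}(0)}{\vartheta'(0)}\, \frac{\theta_{01}(2u)}{\vartheta(2u)} - \nu\, \frac{\theta_{01}^2(0)}{\theta_{00}(0)\theta_{10}(0)}\, \frac{\theta_{00}(2u)\theta_{10}(2u)}{\vartheta^2(2u)}.
\end{aligned} \tag{4.34} $$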

Formulae of this kind were obtained in [16].

4.4. Bäcklund transformations in the CM systems. We now use the map (4.33) to construct the Bäcklund transformation in the CM systems

$$ \xi : (v, u) \to (\tilde v, \tilde u). $$

Let the Lax matrix depend on the new coordinates and momenta, $L = L(\tilde v, \tilde u)$. Consider the upper modification $\Xi(z)$ (4.32). To construct the Bäcklund transformation ξ, we map $(v, u)$ and $(\tilde v, \tilde u)$ to the same point $S \in R^{rot}$:

$$ L^{CM}(v, u) \xrightarrow{\ \Xi(z, u)\ } L^{rot}(S) \xleftarrow{\ \Xi(z, \tilde u)\ } L^{CM}(\tilde v, \tilde u), \qquad \xi : L^{CM}(v, u) \to L^{CM}(\tilde v, \tilde u). $$

In this way we reproduce implicitly the general formula (3.13) for the Bäcklund transformations. This transformation defines an integrable discrete time dynamics of a CM system. One example of this discretization was proposed in [29]. It can be supposed to correspond to ξ. Another way to construct new solutions from $(v, u)$ is to act by $N$ consecutive upper modifications

$$ \Xi^{(N)} = D_N\, \Xi_N \cdots \Xi_j \cdots \Xi_2 \cdot \Xi. \tag{4.35} $$

Here the matrices $\Xi_j$, $j = 2, \ldots, N$, satisfy the quasi-periodicity conditions

$$ \Xi_j(z + \tau) = \tilde\Lambda^{j}\, \Xi_j(z)\, \tilde\Lambda^{1-j}, $$

and $D_N$ is an arbitrary diagonal matrix. We come back to the $N$-dimensional moduli space $\mathcal{M}^{(N)}$ (see (2.5)) and to the map

$$ L^{CM}(v, u) \xrightarrow{\ \Xi^{(N)}\ } L^{CM}(\tilde v, \tilde u). $$

If we break the chain (4.35) at a step $k < N$, then we obtain the map $L^{CM} \to L^{rot,k}$, where $L^{rot,k}$ is the Lax operator for the elliptic rotator related to the holomorphic bundle of degree $k$. It satisfies the quasi-periodicity condition (4.18) and

$$ L^{rot,k}(z + \tau) = \tilde\Lambda^{j}\, L^{rot,k}(z)\, \tilde\Lambda^{-j} $$

instead of (4.19).

5. Hitchin Systems of Infinite Rank

Here we generalize the derivation of finite-dimensional integrable systems in the form (2.25)–(2.28) to two-dimensional integrable field theories.

5.1. Holomorphic $\hat L(GL(N, \mathbb{C}))$-bundles. Let $L(gl(N, \mathbb{C}))$ be the loop algebra of $C^\infty$-maps $L(gl(N, \mathbb{C})) : S^1 \to gl(N, \mathbb{C})$, and $\hat L(gl(N, \mathbb{C}))$ be its central extension with the multiplication

$$ (g, c) \times (g', c') = \big(gg',\ cc' \exp C(g, g')\big), \tag{5.1} $$

where $\exp C(g, g')$ is a 2-cocycle of $\hat L(GL(N, \mathbb{C}))$ providing the associativity of the multiplication. Consider a holomorphic vector bundle $V$ of infinite rank over a Riemann curve $\Sigma_n$ with $n$ marked points. The bundle is defined by the transition functions from $\hat L(GL(N, \mathbb{C}))$. Its fibers are isomorphic to the Lie algebra $\hat L(gl(N, \mathbb{C}))$. The holomorphic structure on $V$ is defined by the operator

$$ d : \Omega^{(0)}_{C^\infty}(\Sigma_n, \mathrm{End}\, V) \to \Omega^{(0,1)}_{C^\infty}(\Sigma_n, \mathrm{End}\, V). $$

It has two components, $d = d_{\bar A} + d_\lambda$. The first component is

$$ d_{\bar A} : \Omega^{(0)}_{C^\infty}(\Sigma_n, L(gl(N, \mathbb{C}))) \to \Omega^{(0,1)}_{C^\infty}(\Sigma_n, L(gl(N, \mathbb{C}))). $$

Locally

$$ d_{\bar A} = \bar\partial + \bar A, \qquad \bar\partial = \partial_{\bar z}, \qquad \bar A = \bar A(x, z, \bar z), \qquad x \in S^1. $$

The operator $d_{\bar A}$ acts on an $N$-dimensional column vector $e(x; z, \bar z)$. The second component is defined by the connection $d_\lambda$ on a trivial linear bundle $L$ on $\Sigma_n$, given by

$$ d_\lambda = \bar\partial + \lambda. $$

The field λ is a map from $\Sigma_n$ to the central element of the Lie algebra $\hat L(gl(N, \mathbb{C}))$. A local section σ of $V$ is holomorphic if $d\sigma = 0$. The sections allow one to define the transition functions. We assume that $\bar A$ and λ are smooth at the marked points. In addition we define $n$ copies of the centrally extended loop groups located at the marked points,

$$ \hat L G_\alpha = (g_\alpha(x), c_\alpha), \qquad G_\alpha = GL(N, \mathbb{C}), \quad (\alpha = 1, \ldots, n), \quad x \in S^1, $$

with the multiplication (5.1). Thus, we have the set $R$ of fields playing the role of the “coordinate space”:

$$ R = \big\{ \bar A,\ \lambda,\ (g_1, c_1), \ldots, (g_n, c_n) \big\}. \tag{5.2} $$

5.2. Gauge symmetries. Let $G$ be the group of automorphisms of $R$ (the gauge group),

$$ G = C^\infty\, \mathrm{Map}(\Sigma_n \to \hat L(GL(N, \mathbb{C}))) = \{ f(z, \bar z, x),\ s(z, \bar z) \}, $$

where $f(z, \bar z, x)$ takes values in $GL(N, \mathbb{C})$, and $s(z, \bar z)$ is the map to the central element of $\hat L(GL(N, \mathbb{C}))$. The multiplication is pointwise with respect to $\Sigma_n$,

$$ (f_1, s_1) \times (f_2, s_2) = (f_1 f_2,\ s_1 s_2 \exp C(f_1, f_2)), $$

where $\exp C(f_1, f_2)$ is a map from $\Sigma_n$ to the 2-cocycle of $\hat L(GL(N, \mathbb{C}))$. Let $(f_\alpha = f_\alpha(x), s_\alpha)$ be the value of the gauge fields at the marked point $w_\alpha$. The action of $G$ on $R$ takes the following form:

$$ \bar A \to f^{-1} \bar\partial f + f^{-1} \bar A f, \tag{5.3} $$

$$ \lambda \to \lambda + s^{-1} \bar\partial s + \oint \mathrm{tr}(\bar A f^{-1} \partial_x f)\, dx, \tag{5.4} $$

$$ c_\alpha \to c_\alpha s_\alpha, \qquad g_\alpha \to g_\alpha f_\alpha. \tag{5.5} $$

The quotient space $N = R/G$ is the moduli space of infinite rank holomorphic bundles over Riemann curves with marked points.


5.3. Phase space. The cotangent space to R has the following structure. Consider the (1,0) analog of the Higgs field ∈ C ∞ (n , (End V )∗ ). It is a one-form on n taking val(1,0) ues in the Lie coalgebra L∗ (gl(N, C)). Let k be a scalar one-form on n , k ∈ C ∞ (n ). It is dual to the field λ. At the marked points we have the Lie coalgebras Lie∗ (Gα ) ∼ L(gl(N, C)) along with the central elements rα , dual to cα . Thus the cotangent bundle T ∗ R contains the fields # $ ¯ ), (λ, k); (g1 , c1 ; p1 , r1 ), . . . , (gn , cn ; pn , rn ) . T ∗ R = (A, (5.6) There is a canonical symplectic structure on T ∗ R. For F ∈ C ∞ (n , (End V )∗ ) (0,1) ˆ C))) defines the pairing and G ∈ C ∞ (n , L(gl(N, % tr(F G)dx. < F |G >= (1,0)

n

Then ω =< D ∧ D A¯ > +

Dk ∧ Dλ + n

n

ωα ,

(5.7)

α=1

ˆ α ). It is constructed in the canonical way by where ωα is a canonical form on T ∗ L(G ˆ α ) = {gα , cα }. The result is means of the Maurer-Cartan form on L(G % % rα −1 −1 tr(D(pα gα )Dgα ) + D(rα cα )Dcα + tr gα−1 Dgα ∂x (gα−1 Dgα ) . ωα = 2 Sα1 Sα1 (5.8) 5.4. Symplectic reduction. Now consider the lift of G to the global canonical transformations of T ∗ R. In addition to (5.3),(5.4),(5.5) we have the following action of G: → f −1 k∂x f + f −1 f,

k → k,

pα → fα−1 pα fα + rα fα−1 ∂x fα ,

rα → rα .

(5.9) (5.10)

This transformation leads to the moment map from the phase space to the Lie coalgebra of the gauge group µ : T ∗ R → Lie∗ (G). It takes the form n n ¯ ] + ¯ + ¯ − k∂x A¯ + [A, pα δ(zα ), ∂k rα δ(zα ) . (5.11) µ = ∂ α=1

α=1

We assume that µ = (0, 0). Therefore, we have the two holomorphity conditions ¯ − k∂x A¯ + [A, ¯ ] + ∂

n

pα δ(zα ) = 0,

(5.12)

α=1

¯ + ∂k

n

rα δ(zα ) = 0.

(5.13)

α=1

The constraint equation (5.13) means that the k-component of the Higgs field is a holomorphic one-form on with first order poles at the marked points.


Let us fix a gauge

¯ + f −1 Af. ¯ L¯ = f −1 ∂f The same gauge action transform as

(5.14)

L = kf −1 ∂x f + f −1 f.

(5.15)

We preserve the same notations gα , pα for the gauge transformed variables. The moment constraint equation (5.12) has the same form in terms of L¯ and L, ¯ L] + ¯ − k∂x L¯ + [L, ∂L

n

pα δ(zα ) = 0.

(5.16)

α=1

Solutions of this equation along with (5.13) define the reduced phase space T ∗ R//G ∼ T ∗ N . The symplectic form (5.7) on T ∗ N becomes ¯ ω =< δL|δ L > + n

δkδλ +

n

(5.17)

ωα .

α=1

ˆ 5.5. Coadjoint orbits. Consider in detail the symplectic form ω (5.8) on T ∗ L(G) ∼ {(p, r); (g, c)}. We omit the subscript α below. The following canonical transformation ˆ of ω by (f, s) ∈ L(G), where s is a central element, g → f g, p → p, r → r, c → sc, f ∈ L(G),

(5.18)

has not been used so far. The symplectic reduction with respect to this transformation ˆ leads to the coadjoint orbits of L(GL(N, C)). In fact, the moment map ˆ ˆ → Lie∗ (L(GL(N, C))) µ : T ∗ L(G) takes the form

µ = (−gpg −1 + r∂x gg −1 , r).

ˆ Let us fix the moment µ = (p(0) , r (0) ). The result of the symplectic reduction of T ∗ L(G) is the coadjoint orbit ˆ C)) /G0 , O(p(0) , r (0) ) = (p = −g −1 p (0) g − r (0) g −1 ∂x g, r (0) ) = µ−1 T ∗ L(SL(N, ˆ C)) that preserves µ, where G0 is the subgroup of L(GL(N, G0 = {g ∈ L(GL(N, C), s is arbitrary) | p (0) = −g −1 p (0) g + r (0) g −1 ∂x g}. The symplectic form (5.8) being pushed forward on O takes the form % % r (0) −1 tr g −1 DgD(g −1 ∂x g) . ω = tr(D(pg )Dg) + 2 (0)

(5.19) (0

In what follows we will consider the collection of the orbits Oα (pα , rα ) at the ˆ α ). In this way we come to the marked points instead of the cotangent bundles T ∗ L(G notion of the Higgs bundle of infinite rank Hˆ nd (see (2.22) and (5.6)) & ' ¯ ), (λ, k), O1 (p (0) , r (0 ), . . . , On (pn(0) , rn(0 ) . Hˆ nd = (A, (5.20) 1 1


5.6. Conservation laws I. The Higgs field is transformed as a connection with respect to the circles S 1 (5.9). If the central charge k = 0, the standard Hitchin integrals (2.28) cease to be gauge invariant. Invariant integrals are generated by the traces of the monodromies of the Higgs field . The generating function of Hamiltonians is given by % 1 , (5.21) H (z) = tr P exp k S1 where z is a local coordinate of an arbitrary point. At a marked point, has a first order pole and +∞ Hj z j . (5.22) H (z) = j =−1

Since H (z) is gauge invariant one can replace by L in (5.21), % 1 H (z) = tr P exp L . k S1

(5.23)

5.7. Equations of motion. Consider the equations of motion on the “upstairs” space T ∗ R (5.20). They are derived by means of the symplectic form ω (5.7), where ωα is replaced by (5.19), and the Hamiltonians (5.21), (5.22). Let tj be a time variable corresponding to the Hamiltonian Hj . Taking into account that Hj is a functional depending on the Higgs field and the central charge k only, we arrive at the following free system; ∂j = 0,

(5.24)

δHj , (5.25) δ δHj (5.26) , ∂j pα = 0. ∂j k = 0, ∂j λ = δk After the symplectic reduction we are led to the fields L¯ (5.14) and L (5.15). For simplicity, we keep the same notation for the coadjoint orbits variables pα , so they are transformed as in (5.10). Substituting (5.15) in (5.24) we obtain the Zakharov-Shabat equation (5.27) ∂j L − k∂x Mj + [Mj , L] = 0, (∂j = ∂tj ), ∂j A¯ =

where Mj = ∂j ff −1 . The operator Mj can be restored partly from the second equation (5.25), δHj ¯ = ¯ j − ∂j L¯ + [Mj , L] . (5.28) ∂M δL The last two equations along with the moment constraint equation (5.16) are the consistency conditions for the linear system (k∂x + L) = 0,

(5.29)

¯ (∂¯ + L) = 0,

(5.30)

(∂j + Mj ) = 0.

(5.31)


5.8. Conservation laws II. The matrix equation (5.29) allows to write down the conservation laws. Its generic solutions can be represented in the form 1 x (x) = (I + R) exp − (5.32) Sdx , k 0 where I is the identity matrix, R is an off-diagonal periodic matrix and S = diag(S 1 , . . . , S N ) is a diagonal matrix. Equation (5.29) means that L can be gauge transformed to the diagonal form (I + R)S = k∂x (I + R) + L(I + R). (5.33) Consider this equation in neighborhood of a point on n with a local coordinate z. Assume, for simplicity, that it is a pole of the Lax operator and k is a constant. In particular, it follows from (5.13) that rα0 = 0 and the coadjoint orbits have the form Oα = {pα = −gpα0 g −1 }. Then substitute into (5.33) the series expansions L(z) = L−1 z−1 + L0 + L1 z + . . . , (L−1 = resL = p), S(z) = S−1 z−1 + S0 + S1 z + . . . , (I + R)(z) = h + R1 z + R2 z2 + . . . , (diag(Rm ) ≡ 0). It follows from (5.32) that the diagonal matrix elements Sjm are the densities of the conservation laws % log Hj,l = Sjl dx. We present a recurrence procedure to define the diagonal matrices Sj . On the first step we find that S−1 = h−1 L−1 h = h−1 ph, p = L−1 = Res Lz=0 .

(5.34)

In other words the diagonal matrix S−1 determines the orbit located at the point z = 0. In the general case we get the following equation: Sk + [h

−1

−1

Rk+1 , S−1 ] = h

k∂x Rk + h

−1

k

Ll−1 Rk−l+l − Rl Sk−l h−1 Lk h.

l=1

Separating the diagonal and the off-diagonal parts allows us to express Sk and Rk in terms of the lower coefficients k−1 k −1 −1 −1 −1 Ll Rk−l + h Ll−1 Rk−l+l − Rl Sk−l h Lk h , Sk = h k∂x Rk + h l=0

l=1

[h−1 Rk , S−1 ] = h−1 k∂x Rk−1 + h−1

k

diag

(5.35)

Ll−1 Rk−l+l − Rl Sk−l h−1 Lk h

l=1

. nondiag

(5.36) In particular,

S0 = (h−1 k∂x h + h−1 L0 h)diag , −1

S1 = h

k∂x R1 + h

−1

L1 h − h

−1

R1 S0 + h

−1

(5.37)

L 0 R1

diag

,

(5.38)

where R1 is defined by the equation [h−1 R1 , S−1 ] = (h−1 k∂x h + h−1 L0 h)nondiag .

(5.39)


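5.9. Hamiltonians in SL(2, C) case. Let us perform the gauge transformation

$$ f^{-1} L f + k f^{-1} \partial_x f = L', \tag{5.40} $$

with $f$ defined as follows:

$$ f = \begin{pmatrix} \sqrt{L_{12}} & 0 \\ -\dfrac{L_{11}}{\sqrt{L_{12}}} - k\, \dfrac{\partial_x \sqrt{L_{12}}}{L_{12}} & \dfrac{1}{\sqrt{L_{12}}} \end{pmatrix}. \tag{5.41} $$

Then the Lax matrix $L$ is transformed into

$$ L' = \begin{pmatrix} 0 & 1 \\ T & 0 \end{pmatrix}, \tag{5.42} $$

where

$$ T = L_{21} L_{12} + L_{11}^2 + k\, \frac{L_{11} \partial_x L_{12}}{L_{12}} - k\, \partial_x L_{11} - \frac{1}{2} k^2\, \frac{\partial_x^2 L_{12}}{L_{12}} + \frac{3}{4} k^2\, \frac{(\partial_x L_{12})^2}{L_{12}^2}. \tag{5.43} $$

The linear problem

$$ (k \partial_x + L')\psi = 0, \qquad (\partial_j + M_j')\psi = 0, \tag{5.44} $$

where ψ is the Bloch wave function $\psi = \exp\{-i \int \chi\}$, leads to the Riccati equation

$$ i k\, \partial_x \chi - \chi^2 + T = 0. \tag{5.45} $$

The decomposition of $\chi(z)$ provides densities of the conservation laws (see [30]):

$$ \chi = \sum_{k=-1}^{\infty} z^k \chi_k, \tag{5.46} $$

$$ H_k \sim \oint dx\, \chi_{k-1}. \tag{5.47} $$

The values of $\chi_k$ can be found from (5.45) using the expression (5.43) for $T(z) = \sum_{k=-2}^{\infty} z^k T_k$ in a neighborhood of zero. For $k = -2$, $-1$ and $0$ we have:

$$ \begin{cases}
\chi_{-1} = \sqrt{T_{-2}} = \sqrt{h}, \\
2\sqrt{h}\, \chi_0 = T_{-1} + i k\, \partial_x \chi_{-1} = T_{-1}, \\
2\sqrt{h}\, \chi_1 = T_0 + i k\, \partial_x \chi_0 - \chi_0^2.
\end{cases} \tag{5.48} $$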
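The relations (5.48) follow by inserting the expansions (5.46) and $T(z) = \sum_{k \ge -2} z^k T_k$ into the Riccati equation (5.45) and collecting powers of $z$. The short worked expansion below spells out the three lowest orders; it assumes, as in the text, that $h = T_{-2}$ is constant in $x$, so that $\partial_x \chi_{-1} = 0$.

```latex
\begin{align*}
\chi^2 &= z^{-2}\chi_{-1}^2 + 2 z^{-1}\chi_{-1}\chi_0
          + \left(2\chi_{-1}\chi_1 + \chi_0^2\right) + O(z), \\[4pt]
z^{-2}:&\quad -\chi_{-1}^2 + T_{-2} = 0, \\
z^{-1}:&\quad i k\,\partial_x \chi_{-1} - 2\chi_{-1}\chi_0 + T_{-1} = 0, \\
z^{0}:&\quad i k\,\partial_x \chi_0 - \left(2\chi_{-1}\chi_1 + \chi_0^2\right) + T_0 = 0 .
\end{align*}
```

Solving each line for the highest coefficient, with $\chi_{-1} = \sqrt{h}$, reproduces the three lines of (5.48) in order.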

In Subsects. 7.2, 7.3 below, explicit formulae for Tk are used for the computation of the Hamiltonians for the elliptic 2d Calogero-Moser and the elliptic Gaudin models.


ˆ 6. L(SL(N, C))-Bundles over elliptic Curves with Marked Points ˆ 6.1. General case. We apply the general construction to the L(SL(N, C))-bundle over that elliptic curve Eτ with marked points wα , α = 1, . . . , n. It is a two-dimensional generalization of the elliptic Gaudin model [8]. In particular, for one marked point z = 0 we come to the N -body elliptic CM field theory. Let us construct solutions of the moment equations (5.11), taking for simplicity at the marked points the orbits with vanishing central charges & ' α Oα = pij , rα = 0 . For elliptic curves one can fix the central charge as k = 1. For the stable bundles the ¯ gauge transformation (5.3) allows to diagonalize A: √ 2π −1 ¯ Aij = δij ui . τ − τ¯

(6.1)

Then the Lax operator LG should satisfy (5.16). It takes the form: LG ij

√ z − z¯ vi α pii 2π −1 + + E1 (z − wα ) 2 τ − τ¯ α 1 − δij α z − wα − (¯z − w¯ α ) − pij e uij φ(uij , z − wα ), (uij = ui − uj ). √ τ − τ¯ 2π −1 α

δij =− √ 2π −1

(6.2) By the quasiperiodic gauge transform f = diag{e(

z − z¯ ui )}, τ − τ¯

(6.3)

one comes to the holomorphic quasiperiodic Lax operator 1 − δij α vi α pii E1 (z − wα ) − pij φ(uij , z − wα ). + √ 2 2π −1 α α (6.4) Reducing the moment map equation to the diagonal gives the additional constraint lijG (z)

δij =− √ 2π −1

1 √

2π −1

piiα = ∂x ui .

(6.5)

α

ˆ 6.2. L(SL(2, C))-bundles over elliptic curves with marked points. In this subsection we study 2-body elliptic Calogero field theory in detail.


The operator L. According to (6.4) the holomorphic Lax operator is  α p11 v G =− √ √ E1 (z − wα ),  l11 −  4π −1 2π −1   α  α p12 G l12 = − 2π √−1 φ(2u, z − wα ),  α  α   l G = − p√21  φ(−2u, z − wα ), 21 2π −1

(6.6)

α

with the additional constraint (6.5) 1 √

2π −1

α p11 = ux .

(6.7)

α

We still have the freedom to fix the gauge with respect to the action of the diagonal subgroup. The corresponding moment map is (6.7). For the one marked point w1 = 0 the corresponding orbit is √ ux − ν p = 2π −1 , (6.8) −ν − ux where ν =const. is the result of the gauge fixing. In this case the Lax operator is a 2d generalization of the Lax operator for the two-body CM model: νφ(2u, z) − 4π √1 −1 v − ux E1 (z) CM L2D = . (6.9) 1 √ νφ(−2u, z) v + ux E1 (z) 4π −1 This operator is still periodic under the shift z → z + 1 and CM LCM 2D (z + τ ) = e(u)L2D (z)e(−u) + e(u)∂x e(−u),

where e(u) = diag(exp u, exp −u). Hamiltonians for the 2d elliptic sl(2, C) CM model. In this case the coefficients Tk are (see (5.43)–(5.48)):  CM T = u2x + ν 2 = h    −2   CM T−1 = 2 4π √v −1 ux − ννx ux + uxx , (6.10)   2 ν ν  v v 1 CM 2 2 2 x x  T0 = − 16π 2 + (2ux − ν )℘ (2u) − 4π √−1 ν + 4 ( ν ) where h is the Casimir function, fixing the coadjoint orbit at the marked point. It can be chosen as a constant. Thus, we have ν 2 = h − u2x . The next order Hamiltonian is quadratic % v νx CM H−1 = √ ux − ux . ν 2π −1 It can be written in the following way: % CM H−1 =

v √

2π −1

ux +

uxx h . ν2

(6.11)

(6.12)


( Since { dx uνxx2 , v(y)} = 0, the equations of motion are:  ut = vt =

1 √ u , 2π −1 x 1 √ v . 2π −1 x

(6.13)

Note that the L-M pair is simple in this case: M = 2π √1 −1 L. The first nontrivial Hamiltonian H0 is quadratic in the momenta field v. It is a twodimensional generalization of the quadratic CM Hamiltonian % % √ 1 2 H0CM = dx2 hχ1 = dx T0 − (6.14) T−1 . 4h A direct evaluation yields: T0CM

1 v2 CM 2 − ) =− (T−1 4h 16π 2

u2 1− x h

+ (3u2x − h)℘ (2u) −

u2xx . 4ν 2

(6.15)

The equations of motion produced by H0CM are: ut = −

v 8π 2

1−

u2x h

(6.16)

,

1 1 uxxx ν − νx uxx ). ∂x (v 2 ux ) − 2(3u2x − h)℘ (2u) + 6∂x (ux ℘ (2u)) + ∂x ( 8π 2 h 2 ν3 It is reduced to the two-body elliptic CM system for the x-independent fields. vt =

The L-M pair for the 2d elliptic sl(2, C) CM model. The equations of motion (6.16) produced by the quadratic Hamiltonian H0CM can be represented in a form of the Zakharov-Shabat equation with the L matrix defined by (6.9) and the M matrix given as follows:  1 1 2 u + 6u ℘ (2u) + uxxx ν−νx uxx √  M = −u E (z) − v 11 t 1 x x  2 3 2ν 4π −1 8π h     + 2πu√x−1 (E2 (2u) − E2 (z)), uxx vux ν ν ν 1 (2u, z) + √ √ √  M φ(2u, z), = − φ E (z) + − 12 1  8π 2 h 2π −1 2π 4π −1 ν  −1   ν  M = − √ν φ (−2u, z) + √ E (z) + vux ν + √1 uxx φ(−2u, z). 21

2π −1

2π −1

1

8π 2 h

4π −1 ν

(6.17) See Appendix C for details of the proof. This construction completes the description of right vertical arrow in Fig.1. 2d CM - LL correspondence. The upper modification that produces the map of the elliptic CM system into the elliptic rotator (4.10), (4.12) works in the two-dimensional case as well. The two-dimensional extension of the SL(2, C)-elliptic rotator is the Landau-Lifshitz (LL) equation 1 1 ∂t S = [S, J (S)] + [S, ∂xx S]. (6.18) 2 2


This equation can be fitted in the Zakharov-Shabat form [32]. The Lax operator LLL has the same form as for the SL(2, C) elliptic rotator Lrot (4.16). For sl(2, C) the basis of the sigma matrices coincides with the basis of the sin-algebra and LLL takes the form ua (z)Sa σa , L= a

0 1 1 (z), u2 = ϕ (z), u3 = ϕ (z). u1 = ϕ 1 1 0 The M LL operator has a very simple extension ua (z)tr(σa [S, ∂x S])σa . M LL = M rot − Lrot E1 (z) + a

It is easy to check that the Zakharov-Shabat equation leads to (6.18) if Sa2 = 1. a

Thereby we have defined the right vertical arrow in Fig.1. Consider the upper modification 2D that has the same quasi-periodicity as but corresponds to the residue p (6.8) of LCM 2D (6.9). Then the Lax operator for the LL system is the result of the upper modification CM −1 LLL = 2D ∂x −1 2D + 2D L2D 2D .

(6.19)

It means that we can pass from the CM fields v(x, t), u(x, t) and the constant ν to the LL fields S = (S1 , S2 , S3 ) with the orbit fixing condition

Sa2 = −

a

1 (u2 + ν 2 ) = 1. 2π 2 x

It completes the description of the diagram in Fig. 1.

Relations with the Sinh-Gordon equation and the nonlinear Schrödinger equation. It is known that the LL model is universal; it contains as a special limit the Sinh-Gordon and the Nonlinear Schrödinger models [3]. In this way they can be derived within the 2d CM system. The scaling limit in the CM model is a combination of the trigonometric limit $\mathrm{Im}\,\tau \to \infty$ with shifts of coordinates, $u = U + \frac{1}{2}\mathrm{Im}\,\tau$, and renormalization of the coupling constant, $\nu = \bar\nu\, e^{\frac{1}{2}\mathrm{Im}\,\tau}$ [31]. This procedure applied to the 2d elliptic CM Hamiltonian yields the sinh-Gordon system:

$$ H^{SG} = -\frac{v^2}{16\pi^2} - \bar\nu^2 \left(e^{2U} + e^{-2U}\right) + \frac{U_x^2}{4}. \tag{6.20} $$

The equations of motion are:

$$ \begin{cases}
U_t = -\dfrac{v}{8\pi^2}, \\
v_t = 2\bar\nu^2 \left(e^{2U} - e^{-2U}\right) + \dfrac{1}{2} U_{xx}.
\end{cases} \tag{6.21} $$
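The equations (6.21) are the Hamilton equations of the density (6.20), provided one assumes the canonical pairing in which $U_t = \delta H/\delta v$ and $v_t = -\delta H/\delta U$. The sketch below checks this on a periodic grid by comparing the right-hand sides of (6.21) with numerical derivatives of a discretised Hamiltonian. The grid, the finite-difference scheme, the sample fields and all function names are illustrative assumptions.

```python
import numpy as np

def hamiltonian(U, v, nu_bar, dx):
    """Discretised integral of the density (6.20); forward difference for U_x."""
    Ux = (np.roll(U, -1) - U) / dx
    density = (-v**2 / (16 * np.pi**2)
               - nu_bar**2 * (np.exp(2 * U) + np.exp(-2 * U))
               + Ux**2 / 4)
    return dx * density.sum()

def rhs(U, v, nu_bar, dx):
    """Right-hand sides of (6.21) with the matching discrete Laplacian."""
    Uxx = (np.roll(U, -1) - 2 * U + np.roll(U, 1)) / dx**2
    Ut = -v / (8 * np.pi**2)
    vt = 2 * nu_bar**2 * (np.exp(2 * U) - np.exp(-2 * U)) + 0.5 * Uxx
    return Ut, vt

def numerical_gradient(func, field, eps=1e-6):
    """Central-difference derivative of func with respect to each grid value."""
    grad = np.zeros_like(field)
    for i in range(field.size):
        step = np.zeros_like(field)
        step[i] = eps
        grad[i] = (func(field + step) - func(field - step)) / (2 * eps)
    return grad

n, nu_bar = 32, 0.4                      # illustrative values only
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
dx = x[1] - x[0]
U = 0.3 * np.sin(x) + 0.1 * np.cos(2 * x)
v = 0.5 * np.cos(x)

# Functional derivatives = grid gradients divided by the cell size dx.
dH_dv = numerical_gradient(lambda w: hamiltonian(U, w, nu_bar, dx), v) / dx
dH_dU = numerical_gradient(lambda w: hamiltonian(w, v, nu_bar, dx), U) / dx
Ut, vt = rhs(U, v, nu_bar, dx)
```

The forward difference in `hamiltonian` is paired with the standard three-point Laplacian in `rhs` because the latter is exactly the variation of the former on a periodic grid.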


The L-M pair is: 

− 4π √v −1 − 21 Ux

LSG = 

ν¯ ( Z1 − e−2U )



− U2t −

 M SG = 

v √ 4π −1

1 √ U 8π −1 x

ν¯ √ (e−2U 4π −1

+

ν¯ (1 − e2U Z)

1 Z)

+ 21 Ux

 ,

(6.22) 

ν¯ √ (1 + e2U Z) 4π −1  . Ut 1 √ + U 2 8π −1 x

(6.23)

Let us consider 2d CM theory for N = 2 in the rational limit when the both periods of the basic spectral curve go to infinity. The upper modification (6.19) transforms this system in the Heisenberg magnet. Then using the non-singular gauge transform from Ref. [3] we come to the nonlinear Schr¨odinger equation.

6.3. Hamiltonians for the 2d elliptic Gaudin model. Using (B.28) we obtain the Hamiltonian: G H−1,a =2

a b pa p11 p11 +2 √ √11 √ E1 (za − zb ) 4π −1 2π −1 2π −1 2π −1 b

−

v √

a=b

−

a pb pb pa p12 12 21 φ(2u, za − zb ) √ 21 φ(2u, zb − za ) + √ 2 (2π −1)2 (2π −1) a=b

a a ∂x p12 p11 . √ 2π −1 p12

(6.24)

The last term makes the above Hamiltonian different from the one-dimensional version. Let us consider the sl(2, C) case with two marked points on the elliptic curve. We will use the following notations:  1 √ √ 2 = 2π −1γ , p = 2π −1γ1 , p11  2   11 √ √ 1 1 p12 = −2π −1ν+ , p21 = −2π −1ν− , (6.25)  √ √  p2 = −2π −1µ+ , p2 = −2π −1µ− . 12 21 The L matrix is: G l = − 4π √v −1 − γ1 E1 (z − z1 ) − γ2 E1 (z − z2 ),    11 G = νφ(2u, z − z ) + µ φ(2u, z − z ), l12 1 + 2   l G = νφ(−2u, z − z1 ) + µ− φ(−2u, z − z2 ). 21

(6.26)

The solution exists if γ1 + γ2 = ux .

(6.27)

The gauge fixing condition is chosen to be ν+ = ν− = ν.

(6.28)


We fix the Casimir elements h1 = γ12 +ν 2 and h2 = γ22 +µ+ µ− to be constants: h1 , h2 ∈ C. On the reduced phase space there are two independent fields besides u and v. Let them be for example ν and µ+ , then  ) 2   γ1 = h1 − )ν , γ2 = ux − h1 − ν 2 , ) (6.29)   µ = 1 (h − (u − h − ν 2 )2 ). − x 1 µ+ 2 However we are going to use all kinds of variables in order to make the formulae more transparent. The non-trivial brackets on the reduced phase space are: {v(x), u(y)} = δ(x − y), {v(x), γ1 (y)} = −δ (x − y), {v(x), ν(y)} = {µ+ (x), γ1 (y)} = − 2π √1 −1 µ+ δ(x − y), {µ+ (x), γ2 (y)} = 2π √1 −1 µ+ δ(x − y), {ν(x), µ− (y)} =

γ1 ν δ (x

− y),

{µ+ (x), µ− (y)} = −2 2π √1 −1 γ2 δ(x − y), {µ+ (x), ν(y)} = 2π √1 −1 γν1 µ+ δ(x − y),

γ1 1 √ µ δ(x 2π −1 ν −

− y). (6.30)

The Hamiltonian is: % G H−1 = dx 2γ1

v √

νx + ∂x γ1 + νµ+ φ(2u, z1 − z2 ) ν 4π −1 − νµ− φ(2u, z2 − z1 ) + −2γ1 γ2 E1 (z1 − z2 ) . − γ1

(6.31)

The equations of motion are:  1   ∂t u(x) = √ γ1 (x),   2π −1     1 γ 1 µ+ γ 1 µ−   φ(2u, z1 − z2 )) + ∂x ( φ(2u, z2 − z1 )) ∂t v(x) = √ vx − ∂x (    ν ν 2π −1    γ1 ∂x γ1 (2u, z − z ) + 2νµ φ (2u, z − z ) − ∂  , − 2νµ φ  + 1 2 − 2 1 x  ν2   2    ∂t ν = − √1 ∂x γ1 + √γ1 (µ+ φ(2u, z1 − z2 ) − µ− φ(2u, z2 − z1 )), ν 2π −1 2π −1ν    1 v   ∂t µ+ = 2 √ µ+ 2µ+ (γ2 − γ1 )E1 (z1 − z2 ) √    2π −1 4π −1      2νγ 2   − √ φ(2u, z2 − z1 )    2π −1    γ 1 µ+    − √ (µ+ φ(2u, z1 − z2 ) − µ− φ(2u, z2 − z1 )). 2π −1ν The quadratic Hamiltonian is the direct generalization of (6.15) % % ) 1 2 G , T H0 = dx2 h1 χ1 = dx T0 − 4h1 −1

(6.32)

(6.33)


where ) γ12 v2 (∂x γ1 )2 2 2 h 1 χ1 = − (1 − ) + (2u γ − ν )℘ (2u) − + µ+ µ− (E2 (z1 − z2 ) x 1 16π 2 4ν 2 h1 − E2 (2u)) + 4η1 γ1 γ2 + νµ− φ(2u, z2 − z1 )(E1 (z1 − z2 ) − E1 (2u) + E1 (2u + z2 − z1 )) − νµ+ φ(2u, z1 − z2 )(E1 (z1 − z2 )γ22 E12 (z1 − z2 ) v + E1 (2u) − E1 (2u + z1 − z2 )) + 2γ2 √ E1 (z1 − z2 ) 4π −1 νx µ+ − γ2 E1 (z1 − z2 ) + γ1 2 φ(2u, z1 − z2 ) − γ1 φ(2u, z1 − z2 ) ν ν ∂x µ+ µ+ × + 2ux (E1 (z1 − z2 + 2u) − E1 (2u)) ν ν 1 − (νµ+ φ(2u, z1 − z2 ) − νµ+ φ(2u, z2 − z1 ) + 2γ1 γ2 E1 (z1 − z2 ))2 4h1 1 − (νµ+ φ(2u, z1 − z2 ) − νµ+ φ(2u, z2 − z1 ) 2h1 v νx + 2γ1 γ2 E1 (z1 − z2 )) (2γ1 √ (6.34) − γ1 + ∂x γ1 ). ν 4π −1 7. Conclusion Here we briefly summarize the results of our analysis and discuss some unsolved related problems. The following two subjects were investigated in the paper. (i) We have constructed symplectic maps between Hitchin systems related to holomorphic bundles of different degrees. It allowed us to construct the B¨acklund transformations in the Hitchin systems defined over Riemann curves with marked points. We applied the general scheme to the elliptic CM systems and constructed the symplectic map to an integrable SL(N, C) Euler-Arnold top (the elliptic SL(N, C)-rotator). The open problem is to write down the explicit expressions for the spin variables in terms of the CM phase space for an arbitrary N as was done for the case N = 2 (4.34). It should help to construct the B¨acklund transformations for the CM systems explicitly, and more generally, to construct the generating function for them. The later can be considered as the integrable discrete time mapping [10]. (ii) We have proposed a generalization of the Hitchin approach to 2d integrable theories related to holomorphic bundles of infinite rank. The main example is the integrable two-dimensional version of the two-body elliptic CM system. 
The upper modification allows to define the symplectic map to the Landau-Lifshitz equation and to find, in principle, the B¨acklund transformations in the field theories. It will be extremely interesting to find the 2d generalization of the SL(N, C)-rotator for N > 2 (the matrix LL equation). There is another point of view on the 2d generalizations of the Hitchin systems. One can try to define them starting from holomorphic bundles over complex surfaces, that are fibrations over Riemann curves. In this case the spectral parameter lives on the base of the fibration, while the space variable lives on the fibers. It will be interesting to analyze, for example, the known solutions of the LL equation from this point of view. Acknowledgement. We would like to thank K. Hasegawa and I. Krichever for illuminating discussions and A. Shabat, T. Takebe, V. Sokolov and A. Zabrodin for useful remarks and H. Gangl for a careful

Hitchin Systems – Symplectic Hecke Correspondence


reading of the manuscript. The remarks of the referee allowed us to improve the text in an essential way. We are grateful for the hospitality of the Max-Planck-Institut für Mathematik (Bonn), where the paper was partly prepared during the visits of A.L. and M.O. The work of all authors was supported by the grant 00-15-96557 of the scientific schools and RFBR-01-01-00539 (A.L.), RFBR-00-02-16530 (M.O., A.Z.), INTAS-00-00561 (A.Z.), INTAS-99-01782 (M.O.).

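8. Appendix

8.1. Appendix A. Sin-Algebra.

$$e(z) = \exp(2\pi\sqrt{-1}\,z), \tag{A.1}$$

$$Q = \mathrm{diag}\big(e(1/N), \dots, e(m/N), \dots, 1\big), \tag{A.2}$$

$$\Lambda = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}, \tag{A.3}$$

$$E_{mn} = e\Big(\frac{mn}{2N}\Big) Q^m \Lambda^n, \quad (m = 0, \dots, N-1,\ n = 0, \dots, N-1 \ (\mathrm{mod}\ N),\ m^2 + n^2 \neq 0) \tag{A.4}$$

is the basis in sl(N, C). The commutation relations in this basis take the form

$$[E_{sk}, E_{nj}] = 2\sqrt{-1}\,\sin\frac{\pi}{N}(kn - sj)\,E_{s+n,k+j}, \tag{A.5}$$

$$\mathrm{tr}(E_{sk}E_{nj}) = \delta_{s,-n}\,\delta_{k,-j}\,N. \tag{A.6}$$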
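The relations (A.5) and (A.6) can be checked numerically for small N. The following is a minimal sketch and not part of the original text. It assumes that the matrix in (A.3) is the cyclic shift (called Lambda below), that (A.4) reads E_mn = e(mn/2N) Q^m Lambda^n, and that the indices on the right-hand side of (A.5) are plain integer sums. The helper names `basis_element`, `check_sin_algebra` and `trace_pairing` are hypothetical.

```python
import cmath
import math


def e(z):
    """e(z) = exp(2*pi*sqrt(-1)*z), as in (A.1)."""
    return cmath.exp(2j * math.pi * z)


def basis_element(m, n, N):
    """Assumed form of (A.4): E_mn = e(mn/2N) Q^m Lambda^n as an N x N list of lists.

    Q = diag(e(1/N), ..., e(N/N)) and Lambda is the cyclic shift of (A.3),
    so Q^m Lambda^n has entry e((j+1)m/N) in row j, column (j+n) mod N.
    """
    E = [[0j] * N for _ in range(N)]
    prefactor = e(m * n / (2 * N))
    for j in range(N):
        E[j][(j + n) % N] = prefactor * e((j + 1) * m / N)
    return E


def matmul(A, B):
    """Plain matrix product of two N x N lists of lists."""
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def check_sin_algebra(N):
    """Largest entrywise deviation from (A.5) over all index pairs in 0..N-1."""
    worst = 0.0
    for s in range(N):
        for k in range(N):
            for n in range(N):
                for j in range(N):
                    A = basis_element(s, k, N)
                    B = basis_element(n, j, N)
                    AB, BA = matmul(A, B), matmul(B, A)
                    coeff = 2j * math.sin(math.pi * (k * n - s * j) / N)
                    C = basis_element(s + n, k + j, N)
                    for r in range(N):
                        for c in range(N):
                            dev = abs(AB[r][c] - BA[r][c] - coeff * C[r][c])
                            worst = max(worst, dev)
    return worst


def trace_pairing(s, k, n, j, N):
    """tr(E_sk E_nj), to be compared with (A.6)."""
    P = matmul(basis_element(s, k, N), basis_element(n, j, N))
    return sum(P[r][r] for r in range(N))


if __name__ == "__main__":
    for N in (2, 3, 4):
        print(N, check_sin_algebra(N))
    print(trace_pairing(1, 2, -1, -2, 3))
```

With unreduced indices, shifting an index by N changes E_mn by a sign, so the sketch deliberately does not reduce s + n and k + j modulo N.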

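8.2. Appendix B. Elliptic functions. We summarize the main formulae for elliptic functions, borrowed mainly from [33, 34]. We assume that q = exp 2πiτ, where τ is the modular parameter of the elliptic curve $E_\tau$. The basic element is the theta function:

$$\vartheta(z|\tau) = q^{\frac18}\sum_{n\in\mathbb{Z}} (-1)^n e^{\pi i(n(n+1)\tau + 2nz)} = q^{\frac18} e^{-\frac{i\pi}{4}}\,(e^{i\pi z} - e^{-i\pi z}) \prod_{n=1}^{\infty} (1 - q^n)(1 - q^n e^{2i\pi z})(1 - q^n e^{-2i\pi z}). \tag{B.1}$$

The Eisenstein functions.

$$E_1(z|\tau) = \partial_z \log\vartheta(z|\tau), \qquad E_1(z|\tau) \sim \frac{1}{z} - 2\eta_1 z, \tag{B.2}$$

where

$$\eta_1(\tau) = \zeta\Big(\frac12\Big) = \frac{3}{\pi^2} \sum_{m=-\infty}^{\infty}\ \sideset{}{'}\sum_{n=-\infty}^{\infty} \frac{1}{(m\tau + n)^2} = \frac{24}{2\pi i}\,\frac{\eta'(\tau)}{\eta(\tau)}, \tag{B.3}$$

where

$$\eta(\tau) = q^{\frac{1}{24}} \prod_{n>0} (1 - q^n)$$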


A.M. Levin, M.A. Olshanetsky, A. Zotov

is the Dedekind function,

$$E_2(z|\tau) = -\partial_z E_1(z|\tau) = -\partial_z^2 \log\vartheta(z|\tau), \qquad E_2(z|\tau) \sim \frac{1}{z^2} + 2\eta_1. \tag{B.4}$$

The next important function is

$$\phi(u, z) = \frac{\vartheta(u + z)\,\vartheta'(0)}{\vartheta(u)\,\vartheta(z)}. \tag{B.5}$$

It has a pole at z = 0 and

$$\phi(u, z) = \frac{1}{z} + E_1(u) + \frac{z}{2}\big(E_1^2(u) - \wp(u)\big) + \dots, \tag{B.6}$$

and

$$\phi(u, z)^{-1}\,\partial_u \phi(u, z) = E_1(u + z) - E_1(u). \tag{B.7}$$

The following formula plays an important role in checking the zero curvature equation:

$$\partial_u^2\phi(u, z) = \phi(u, z)\big(E_2(z) - E_1^2(z) + 2E_1(z)(E_1(u + z) - E_1(u)) + 2E_2(u) - 6\eta_1\big). \tag{B.8}$$

It follows from:

$$\big(E_1(z) + E_1(u) - E_1(z + u)\big)^2 = E_2(u) + E_2(z) + E_2(u + z) - 6\eta_1. \tag{B.9}$$
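The step from (B.9) to (B.8) is short enough to spell out. The worked derivation below is a sketch under two assumptions: the left-hand side of (B.8) is read as the second derivative of φ with respect to u, and E_2 = −∂E_1 as in (B.4). The abbreviation D is introduced here for convenience only.

```latex
% Abbreviation (not in the original): D := E_1(u+z) - E_1(u).
\begin{align*}
\partial_u \phi &= \phi\, D
  && \text{by (B.7)},\\
\partial_u D &= -E_2(u+z) + E_2(u)
  && \text{since } E_2 = -\partial E_1,\\
\partial_u^2 \phi &= \phi\,\bigl(D^2 + E_2(u) - E_2(u+z)\bigr),\\
\bigl(E_1(z) - D\bigr)^2 &= E_2(u) + E_2(z) + E_2(u+z) - 6\eta_1
  && \text{by (B.9)},\\
D^2 &= 2E_1(z)\,D - E_1^2(z) + E_2(u) + E_2(z) + E_2(u+z) - 6\eta_1,\\
\partial_u^2 \phi &= \phi\,\bigl(E_2(z) - E_1^2(z) + 2E_1(z)\,D + 2E_2(u) - 6\eta_1\bigr).
\end{align*}
```

The last line is (B.8) once D is expanded again.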

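Relations to the Weierstrass functions.

$$\zeta(z|\tau) = E_1(z|\tau) + 2\eta_1(\tau)\,z, \tag{B.10}$$

$$\wp(z|\tau) = E_2(z|\tau) - 2\eta_1(\tau), \tag{B.11}$$

$$\phi(u, z) = \exp(-2\eta_1 uz)\,\frac{\sigma(u + z)}{\sigma(u)\,\sigma(z)}, \tag{B.12}$$

$$\phi(u, z)\,\phi(-u, z) = \wp(z) - \wp(u) = E_2(z) - E_2(u). \tag{B.13}$$

Particular values.

$$E_1\Big(\frac12\Big) = 0, \qquad E_1\Big(\frac{\tau}{2}\Big) = E_1\Big(\frac{1 + \tau}{2}\Big) = -\pi\sqrt{-1}. \tag{B.14}$$

Series representations.

$$E_1(z|\tau) = -2\pi i\Big(\frac12 + \sum_{n\neq 0} \frac{e^{2\pi i n z}}{1 - q^n}\Big) = -2\pi i\Big(\sum_{n<0} \frac{1}{1 - q^n e^{2\pi i z}} + \sum_{n\geq 0} \frac{q^n e^{2\pi i z}}{1 - q^n e^{2\pi i z}} + \frac12\Big), \tag{B.15}$$

$$E_2(z|\tau) = -4\pi^2 \sum_{n\in\mathbb{Z}} \frac{q^n e^{2\pi i z}}{(1 - q^n e^{2\pi i z})^2}, \tag{B.16}$$

$$\phi(u, z) = 2\pi i \sum_{n\in\mathbb{Z}} \frac{e^{-2\pi i n z}}{1 - q^n e^{-2\pi i u}}. \tag{B.17}$$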


Parity. ϑ(−z) = −ϑ(z),

(B.18)

E1 (−z) = −E1 (z),

(B.19)

E2 (−z) = E2 (z),

(B.20)

φ(u, z) = φ(z, u) = −φ(−u, −z).

(B.21)

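Quasi-periodicity.

$$\vartheta(z + 1) = -\vartheta(z), \qquad \vartheta(z + \tau) = -q^{-\frac12} e^{-2\pi i z}\,\vartheta(z), \tag{B.22}$$

$$E_1(z + 1) = E_1(z), \qquad E_1(z + \tau) = E_1(z) - 2\pi i, \tag{B.23}$$

$$E_2(z + 1) = E_2(z), \qquad E_2(z + \tau) = E_2(z), \tag{B.24}$$

$$\phi(u + 1, z) = \phi(u, z), \qquad \phi(u + \tau, z) = e^{-2\pi i z}\,\phi(u, z). \tag{B.25}$$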

Addition formula.

$$\phi(u, z)\,\partial_v\phi(v, z) - \phi(v, z)\,\partial_u\phi(u, z) = (E_2(v) - E_2(u))\,\phi(u + v, z), \tag{B.26}$$

or

$$\phi(u, z)\,\partial_v\phi(v, z) - \phi(v, z)\,\partial_u\phi(u, z) = (\wp(v) - \wp(u))\,\phi(u + v, z). \tag{B.27}$$

The proof of (B.26) is based on (B.6), (B.21), and (B.25). In fact, φ(u, z) satisfies a more general relation, which follows from the Fay trisecant formula:

$$\phi(u_1, z_1)\,\phi(u_2, z_2) - \phi(u_1 + u_2, z_1)\,\phi(u_2, z_2 - z_1) - \phi(u_1 + u_2, z_2)\,\phi(u_1, z_1 - z_2) = 0. \tag{B.28}$$

A particular case of this formula is

$$\phi(u_1, z)\,\phi(u_2, z) - \phi(u_1 + u_2, z)\,(E_1(u_1) + E_1(u_2)) + \partial_z\phi(u_1 + u_2, z) = 0. \tag{B.29}$$

Integrals.

$$\int_{E_\tau} E_1(z|\tau)\,dz\,d\bar z = 0. \tag{B.30}$$


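Theta functions with characteristics. For a, b ∈ Q put:

$$\theta\begin{bmatrix} a \\ b \end{bmatrix}(z, \tau) = \sum_{j\in\mathbb{Z}} e\Big((j + a)^2\,\frac{\tau}{2} + (j + a)(z + b)\Big). \tag{B.31}$$

In particular, the function ϑ (B.1) is the theta function with a characteristic:

$$\vartheta(x, \tau) = \theta\begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix}(x, \tau). \tag{B.32}$$

One has

$$\theta\begin{bmatrix} a \\ b \end{bmatrix}(z + 1, \tau) = e(a)\,\theta\begin{bmatrix} a \\ b \end{bmatrix}(z, \tau), \tag{B.33}$$

$$\theta\begin{bmatrix} a \\ b \end{bmatrix}(z + a'\tau, \tau) = e\Big(-a'^2\,\frac{\tau}{2} - a'(z + b)\Big)\,\theta\begin{bmatrix} a + a' \\ b \end{bmatrix}(z, \tau), \tag{B.34}$$

$$\theta\begin{bmatrix} a + j \\ b \end{bmatrix}(z, \tau) = \theta\begin{bmatrix} a \\ b \end{bmatrix}(z, \tau), \quad j \in \mathbb{Z}. \tag{B.35}$$

For simplicity we denote $\theta\begin{bmatrix} a/2 \\ b/2 \end{bmatrix} = \theta_{ab}$. The following identities are useful for the upper modification procedure in the sl(2, C) case:

$$\begin{aligned}
\theta_{01}(x, \tau)\theta_{00}(y, \tau) + \theta_{01}(y, \tau)\theta_{00}(x, \tau) &= 2\theta_{01}(x + y, 2\tau)\,\theta_{01}(x - y, 2\tau),\\
\theta_{01}(x, \tau)\theta_{00}(y, \tau) - \theta_{01}(y, \tau)\theta_{00}(x, \tau) &= 2\vartheta(x + y, 2\tau)\,\vartheta(x - y, 2\tau),\\
\theta_{00}(x, \tau)\theta_{00}(y, \tau) + \theta_{01}(y, \tau)\theta_{01}(x, \tau) &= 2\theta_{00}(x + y, 2\tau)\,\theta_{00}(x - y, 2\tau),\\
\theta_{00}(x, \tau)\theta_{00}(y, \tau) - \theta_{01}(y, \tau)\theta_{01}(x, \tau) &= 2\theta_{10}(x + y, 2\tau)\,\theta_{10}(x - y, 2\tau);
\end{aligned} \tag{B.36}$$

$$\begin{aligned}
2\vartheta(x, 2\tau)\,\theta_{01}(y, 2\tau) &= \vartheta\big(\tfrac{x+y}{2}, \tau\big)\theta_{10}\big(\tfrac{x-y}{2}, \tau\big) + \theta_{10}\big(\tfrac{x+y}{2}, \tau\big)\vartheta\big(\tfrac{x-y}{2}, \tau\big),\\
2\theta_{00}(x, 2\tau)\,\theta_{10}(y, 2\tau) &= \vartheta\big(\tfrac{x+y}{2}, \tau\big)\vartheta\big(\tfrac{x-y}{2}, \tau\big) + \theta_{10}\big(\tfrac{x+y}{2}, \tau\big)\theta_{10}\big(\tfrac{x-y}{2}, \tau\big),\\
2\theta_{00}(x, 2\tau)\,\theta_{00}(y, 2\tau) &= \theta_{00}\big(\tfrac{x+y}{2}, \tau\big)\theta_{00}\big(\tfrac{x-y}{2}, \tau\big) + \theta_{01}\big(\tfrac{x+y}{2}, \tau\big)\theta_{01}\big(\tfrac{x-y}{2}, \tau\big),\\
2\theta_{10}(x, 2\tau)\,\theta_{10}(y, 2\tau) &= \theta_{00}\big(\tfrac{x+y}{2}, \tau\big)\theta_{00}\big(\tfrac{x-y}{2}, \tau\big) - \theta_{01}\big(\tfrac{x+y}{2}, \tau\big)\theta_{01}\big(\tfrac{x-y}{2}, \tau\big).
\end{aligned} \tag{B.37}$$

8.3. Appendix C: 2d sl(2, C) Calogero L-M pair. The Zakharov-Shabat equations in the sl(2, C) case are:

$$\begin{cases}
11: & \partial_t L_{11} - \partial_x M_{11} = M_{21}L_{12} - M_{12}L_{21},\\
12: & \partial_t L_{12} - \partial_x M_{12} = 2L_{11}M_{12} - 2L_{12}M_{11},\\
21: & \partial_t L_{21} - \partial_x M_{21} = 2M_{11}L_{21} - 2L_{11}M_{21}.
\end{cases} \tag{C.1}$$

Let the non-diagonal terms in the M matrix be of the form:

$$\begin{aligned}
M_{12} &= c(x)\,\Phi'(2u, z) + \big(f^1_{12}(x)E_1(z) + f^0_{12}(x)\big)\,\Phi(2u, z),\\
M_{21} &= c(x)\,\Phi'(-2u, z) + \big(f^1_{21}(x)E_1(z) + f^0_{21}(x)\big)\,\Phi(-2u, z).
\end{aligned} \tag{C.2}$$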

Then from the diagonal part of (C.1) we conclude: M11 = −ut E1 (z) + α(x) + M11 ,

(C.3)


where α(x) = −

1 √

4π −1


1 2 uxxx ν − νx uxx v ux + 6ux ℘ (2u) + 2 8π h 2ν 3

,

(C.4)

and M11 will be defined in the following. It is supposed to be dependent on E2 (2u) in order to cancel terms proportional to E2 (2u) and E2 (2u) in (C.1). Using formula (B.8) in the non-diagonal part of (C.1), we arrive at some conditions equivalent to cancellations of the terms proportional to functions ξ(2u, z) = E1 (2u + z) − E(2u), E1 (z), E12 (z), E1 (z)ξ(−2u, z):  1 = −c, E1 (z)ξ(2u, z) : f12    E 2 (z) : f 1 = −c,  1 12 v 0 = −2c √ (12) (C.5) ξ(2u, z) : 2νut − cx − 2ux f12 ,  4π −1    E (z) : −∂ f 1 = −2 √v f 1 − 2u f 0 + 2u ν, 1 x 12 x 12 t 4π −1 12

(21)

 1 = −c, E1 (z)ξ(−2u, z) : f12    2 1  E (z) : f = −c, 1 12 v 0 = 2c √ ξ(−2u, z) : −2νut − cx + 2ux f21 ,  4π −1    E1 (z) : −∂x f 1 = 2 √v f 1 + 2ux f 0 − 2ut ν. 21 21 4π −1 12

Thus

 1 1   f12 = f21 = −c, 0 2ux f12 = 2νut − cx + 2c 4π √v −1 ,   2ux f 0 = 2νut + cx + 2c √v , 21 4π −1 0 + f0 = f+ = f21 12 0 − f0 = f− = f21 12

2 ux (νut cx ux .

+ c 4π √v −1 ),

The remaining parts of the non-diagonal equations are:  0 − 2cu E (z) − 4cu E (2u) νt + 12cux η1 − ∂x f12  x 2 x 2   v 0  = −2 4π √−1 f12 − 2να − 2νM11 , 0 − 2cu E (z) − 4cu E (2u) −νt + 12cux η1 + ∂x f21  x 2 x 2   v 0  √ = −2 4π −1 f21 − 2να − 2νM11 .

(C.6)

(C.7)

(C.8)

(C.9)

Subtracting the above equations we have: 2

ux v ∂x ut + ∂x f+ = −2 √ f− . ν 4π −1

Substituting f+ and f− from (C.8) into (C.10) we arrive at the equation for c: 1 v ux v cx νut + c √ ∂x ut + ∂ x . =− √ ν ux u 4π −1 4π −1 x

(C.10)

(C.11)

CM this equation yields Now some concrete equations of motion should be used. For H−1 * c ∼ uvx . However the coefficient of the proportionality appears to be equal to zero.

For H0CM (6.15) we have c = − 2π √ν −1 .


References

1. Sokolov, V.V., Shabat, A.B.: Classification of integrable evolution equations. Soviet Sci. Rev. C4, 221–280 (1984)
   Mikhailov, A.V., Shabat, A.B., Yamilov, R.I.: The symmetry approach to classification of nonlinear equations. Complete list of integrable systems. Uspekhi Mat. Nauk 42, 3–53 (1987)
   Fokas, A.S.: Symmetries and integrability. Stud. Appl. Math. 77, 253–299 (1987)
2. Ruijsenaars, S.N.M.: Action-Angle maps and Scattering Theory for Some Finite-Dimensional Integrable systems. Commun. Math. Phys. 115, 127–165 (1988)
   Fock, V.: In: Geometry and Integrable Models. P. Pyatov, S. Solodukhin (eds.), Singapore: World Scientific, 1995, p. 20
   Gorsky, A.: Integrable many-body problems from the field theories. Theor. Math. Phys. 103, 681–700 (1995), hep-th/9410228
   Nekrasov, N.: On a Duality in Calogero-Moser-Sutherland Systems. hep-th/9707111
   Fock, V., Gorsky, A., Nekrasov, N., Rubtsov, V.: Duality in integrable systems and gauge theories. hep-th/9906235
   Mironov, A.: Seiberg-Witten theory and duality in integrable systems. hep-th/0011093
3. Takhtajan, L., Zakharov, V.: The equivalence of the nonlinear Schrödinger equation and the Heisenberg ferromagnet. Theor. Math. Phys. 38, 26–35 (1979)
4. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
5. Simpson, S.T.: Harmonic bundles on non-compact curves. J. of AMS 3, 713–770 (1990)
6. Beauville, A.: Systèmes hamiltoniens complètement intégrables associés aux surfaces K3. Problems in the theory of surfaces and their classification (Cortona, 1988), Sympos. Math., XXXII, London: Academic Press, 1991, pp. 25–31
7. Markman, E.: Spectral curves and integrable systems. Comp. Math. 93, 255–290 (1994)
8. Nekrasov, N.: Holomorphic bundles and many-body systems. Commun. Math. Phys. 180, 587–604 (1996), hep-th/9503157
9. Enriquez, B., Rubtsov, V.: Hitchin systems, higher Gaudin operators and R-matrices. Math. Res. Lett. 3, 343–357 (1996)
10. Veselov, A.P.: Integrable maps. Russ. Math. Surv. 46, 1–51 (1991)
11. Arinkin, D., Lysenko, S.: Isomorphisms between moduli spaces of SL(2)-bundles with connections on P1/{x1, . . . , x4}. Preprint
    On the moduli spaces of SL(2)-bundles with connections on P1/{x1, . . . , x4}. Preprint
12. Okamoto, K.: Studies in the Painlevé equations I. Sixth Painlevé equation PVI. Annali Mat. Pura Appl. 146, 337–387 (1987)
13. Reyman, A.G., Semenov-Tyan-Shanskii, M.A.: Lie algebras and Lax equations with spectral parameter on elliptic curve. Notes from Sci. Seminar LOMI Vol. 150, 104–118 (1986) (in Russian)
14. Vakulenko, V.: Note on the Ruijsenaars-Schneider model. math.QA/9909079
15. Krichever, I.: Vector bundles and Lax equations on algebraic curves. hep-th/0108110
16. Baxter, R.J.: Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. I. Ann. Phys. 76, 1–24 (1973)
    Jimbo, M., Miwa, T., Okado, M.: Local state probabilities of solvable lattice models: An $A^{(1)}_{n-1}$ family. Nucl. Phys. B300 [FS22], 74–108 (1988)
    Faddeev, L., Takhtajan, L.: The quantum method of the inverse problem and the Heisenberg XYZ model. Uspekhi Mat. Nauk 34:5, 11–68 (1979) (English transl.)
17. Hasegawa, K.: Ruijsenaars' commuting difference operators as commuting transfer matrices. q-alg/9512029
18. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proc. of the ICM 94, Boston-Basel: Birkhäuser, 1994, pp. 1247–1255, hep-th/9609153
19. Belavin, A.A.: Dynamical symmetry of integrable system. Nucl. Phys. 180 [FS2], 189–200 (1981)
20. Arutyunov, G., Chekhov, L., Frolov, S.: Quantum dynamical R-matrices. Amer. Math. Soc. Transl. 191(2), 1–33 (1999)
21. Ruijsenaars, S.N.M.: Complete integrability of relativistic Calogero-Moser systems and elliptic function identities. Commun. Math. Phys. 110, 191 (1987)
22. Suris, B.Yu.: Elliptic Ruijsenaars-Schneider and Calogero-Moser hierarchies are governed by the same r-matrix. solv-int/9603011
23. Mehta, V.B., Seshadri, C.S.: Moduli of vector bundles on curves with parabolic structures. Math. Annalen 248, 205–239 (1980)
24. Levin, A., Olshanetsky, M.: Double coset construction of moduli space of holomorphic bundles and Hitchin systems. Commun. Math. Phys. 188, 449–466 (1997)
25. Enriquez, B., Rubtsov, V.: Hecke-Tyurin parametrization of the Hitchin and KZB systems. math.AG/9911087


26. Kuznetsov, V., Sklyanin, E.: On Bäcklund transformations for many-body systems. J. Phys. A 31, 2241–2251 (1998)
27. Calogero, F.: J. Math. Phys. 12, 419 (1971)
    Moser, J.: Adv. Math. 16, 197–220 (1975)
28. Gorsky, A., Nekrasov, N.: Elliptic Calogero-Moser system from two dimensional current algebra. hep-th/9401021
29. Nijhoff, F.W., Pang, G.D.: Discrete-time Calogero-Moser model and lattice KP equations. In: Symmetries and Integrability of Difference Equations. (Estérel, PQ, 1994), CRM Proc. Lecture Notes, 9, Providence, RI: Amer. Math. Soc., 1996, pp. 253–264
30. Dubrovin, B., Matveev, V., Novikov, S.: Nonlinear equations of KdV type and finite gap linear operators and abelian manifolds. UMN (1976), XXXI, 1 (187) (in Russian)
31. Inozemtsev, V.I.: The finite Toda lattices. Commun. Math. Phys. 121, 629–638 (1989)
32. Sklyanin, E.: On complete integrability of the Landau-Lifshitz equation. Preprint LOMI E-3-79, 1979
    Borovik, A.E., Robuk, V.N.: Linear pseudopotentials and conservation laws for the Landau-Lifshitz equation. Theor. Math. Phys. 46, 371–381 (1981)
33. Weil, A.: Elliptic functions according to Eisenstein and Kronecker. Berlin-Heidelberg: Springer-Verlag, 1976
34. Mumford, D.: Tata Lectures on Theta I, II. Boston: Birkhäuser, 1983, 1984

Communicated by L. Takhtajan

Commun. Math. Phys. 236, 135–159 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0813-9

Communications in

Mathematical Physics

Categories of Holomorphic Vector Bundles on Noncommutative Two-Tori

A. Polishchuk¹, A. Schwarz²

¹ Department of Mathematics and Statistics, Boston University, 111 Cummington Street, Boston, MA 02215, USA. E-mail: [email protected]
² Department of Mathematics, University of California, Davis, CA 95616, USA. E-mail: [email protected]

Received: 24 November 2002 / Accepted: 25 November 2002 Published online: 28 February 2003 – © Springer-Verlag 2003

Abstract: In this paper we study the category of standard holomorphic vector bundles on a noncommutative two-torus. We construct a functor from the derived category of such bundles to the derived category of coherent sheaves on an elliptic curve and prove that it induces an equivalence with the subcategory of stable objects. By the homological mirror symmetry for elliptic curves this implies an equivalence between the derived category of holomorphic bundles on a noncommutative two-torus and the Fukaya category of the corresponding symplectic (commutative) torus.

Introduction

In this paper we study holomorphic vector bundles on a noncommutative 2-dimensional torus $T_\theta$, where θ ∈ R, following the approach of [19, 16, 3]. More precisely, we equip $T_\theta$ with a complex structure given by some τ ∈ C \ R and study analogues of $\bar\partial$-connections on vector bundles over the complex noncommutative torus $T_{\theta,\tau}$. Recall that vector bundles correspond to projective modules over the algebra $A_\theta$ of smooth functions over $T_\theta$. As in [3] we restrict our attention to standard holomorphic structures on basic $A_\theta$-modules, for which there exist compatible connections of constant curvature. We do not consider here the problem of how to characterize such holomorphic bundles, although it seems that, similarly to the commutative case, the answer should involve some notion of stability (see remarks at the end of 2.1).

Motivated by the calculations of [8] we prove that the derived category of standard holomorphic vector bundles on $T_{\theta,\tau}$ is equivalent to the full subcategory of stable objects in the derived category $D^b(X_\tau)$ of coherent sheaves on the complex elliptic curve $X_\tau = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. In particular, the former category does not depend on θ. Under this equivalence standard holomorphic bundles on $T_{\theta,\tau}$ land in an abelian subcategory of $D^b(X_\tau)$ obtained by a certain tilting of the category $\mathrm{Coh}(X_\tau)$ of coherent sheaves on $X_\tau$ depending on θ. Roughly speaking,

The work of both authors was partially supported by NSF grants.


to get this abelian subcategory of $D^b(X_\tau)$, one has to cut $\mathrm{Coh}(X_\tau)$ into two subcategories generated by stable bundles of slopes < θ and > θ respectively (we assume that θ is irrational) and then reassemble these subcategories in a different way into a new abelian category. The general framework for such constructions is provided by the torsion theory of [7]. These t-structures were considered by Bridgeland in connection with general stability conditions for derived categories. In physical terms the objects of our study are D-branes of B-type in SUSY Yang-Mills theory on noncommutative 2-tori. Although the corresponding derived categories are the same for all θ, the physical branes (stable objects) do depend on θ. This resembles the picture where stability conditions on the derived category of coherent sheaves over a Calabi-Yau manifold vary with the Kähler structure (see [5] for a more general discussion).

An interesting feature of the above equivalence of categories is that it is compatible with Morita equivalences and the action of $SL_2(\mathbb{Z})$ on $D^b(X_\tau)$ defined by Mukai (see [10]). Namely, for every $g \in SL_2(\mathbb{Z})$ one can consider the fractional-linear action of g on the parameter θ. The noncommutative tori $T_{g\theta}$ and $T_\theta$ are Morita equivalent, i.e., there exists an $A_{g\theta}$-$A_\theta$-bimodule inducing an equivalence of categories of modules over $A_{g\theta}$ and $A_\theta$. This equivalence extends to an equivalence between categories of holomorphic bundles on $T_{g\theta,\tau}$ and $T_{\theta,\tau}$. Now one can compare equivalences between the corresponding derived categories with $D^b(X_\tau)$. It turns out that they differ by the action of the autoequivalence of $D^b(X_\tau)$ corresponding to $g^t$. Thus, our picture can be considered as a "noncommutative explanation" of the existence of the Fourier-Mukai transform.

We consider two approaches to constructing the above equivalences. One uses the explicit calculation of structure constants of products of noncommutative theta functions in [3].
Another is based on a construction of the Fourier-Mukai type functor from the category of holomorphic vector bundles on $T_{\theta,\tau}$ to the category of complexes of sheaves of $\mathcal{O}$-modules with coherent cohomology on $X_\tau$. We restrict ourselves to considering only standard holomorphic structures on basic projective modules over $A_\theta$, which corresponds to considering only stable vector bundles on the commutative elliptic curve (or more generally, simple coherent sheaves). However, we believe that the functor we construct extends to all holomorphic bundles on $T_\theta$. It is also clear that some of the results of this paper can be generalized to higher-dimensional noncommutative tori. Note that in this generalization sometimes one has to replace the category of coherent sheaves on the corresponding commutative complex torus X by the twisted category associated with some cohomology class in $H^2(X, \mathcal{O}^*)$ (this twisting is always trivial in the 2-dimensional case).

In the paper [8] that motivated us the category of standard holomorphic bundles on $T_\theta$ is compared with the Fukaya category of a symplectic torus. In view of the equivalence we establish in this paper, such a comparison essentially reduces to the usual homological mirror symmetry for an elliptic curve proved in [13]. It is remarkable that the functor from the Fukaya category to the category of holomorphic bundles on $T_\theta$ considered in [8] seems to be simpler than the similar functor in the commutative case. It would be interesting to study the relation between the higher-dimensional generalization of this functor and Fukaya's work on noncommutative mirror symmetry [6].

We should mention that a different construction of categories associated with noncommutative elliptic curves was proposed by Soibelman and Vologodsky in [18]. The categories $B_q$ they construct depend only on one parameter q = exp(2πiθ), while our categories depend on two parameters θ and τ.
It seems plausible that the categories Bq can be obtained as some asymptotic limits of our categories when τ goes to infinity. On


the other hand, the categories $B_q$ are in some sense degenerations of the categories of coherent sheaves on commutative elliptic curves $\mathbb{C}^*/q^{\mathbb{Z}}$ (where |q| < 1) as |q| → 1. The general philosophy of this degeneration procedure was outlined in [17]. It would also be interesting to understand the role of holomorphic structures in Manin's program that is supposed to relate the real multiplication for noncommutative two-tori with the arithmetic of real quadratic fields (see [9]).

1. Preliminaries

1.1. Basic modules. Let θ be a real number, and let $T_\theta$ be the corresponding 2-dimensional noncommutative torus. By definition, the algebra $A_\theta$ of smooth functions on $T_\theta$ consists of formal linear combinations $\sum_{(n_1,n_2)\in\mathbb{Z}^2} a_{n_1,n_2} U_1^{n_1} U_2^{n_2}$ with the coefficient function $(n_1, n_2) \mapsto a_{n_1,n_2}$ rapidly decreasing at infinity. The product is defined using the rule

$$U_1 U_2 = \exp(2\pi i\theta)\,U_2 U_1.$$

We are interested in finitely generated projective right $A_\theta$-modules. It is known (see [1, 14]) that every such module is isomorphic to one of the following modules $E_{n,m}(\theta)$, where (m, n) is a pair of integers such that n + mθ ≠ 0. Assume first that m ≠ 0. Then $E_{n,m}(\theta)$ is defined as the Schwartz space $\mathcal{S}(\mathbb{R} \times \mathbb{Z}/m\mathbb{Z})$ equipped with the following right action of $A_\theta$:

$$fU_1(x, \alpha) = f\Big(x - \frac{n + m\theta}{m},\ \alpha - 1\Big), \qquad fU_2(x, \alpha) = \exp\Big(2\pi i\Big(x - \frac{\alpha n}{m}\Big)\Big) f(x, \alpha),$$

where x ∈ R, α ∈ Z/mZ. For m = 0 and n ≠ 0 we set $E_{n,0}(\theta) = A_\theta^{|n|}$ with the obvious right $A_\theta$-action. Let us denote by

$$\deg(E_{n,m}(\theta)) = m, \qquad \mathrm{rk}(E_{n,m}(\theta)) = n + m\theta, \qquad \mu(E_{n,m}(\theta)) = \frac{m}{n + m\theta}$$

the natural analogues of degree, rank and slope. The modules $E_{-n,-m}(\theta)$ and $E_{n,m}(\theta)$ are equivalent; however, sometimes we would like to distinguish them by introducing the Z/2Z-grading. Namely, we will say that the module E is even if rk E > 0 and that it is odd if rk E < 0. Thus, in the category of Z/2Z-graded projective $A_\theta$-modules we have $E_{-n,-m}(\theta) = \Pi E_{n,m}(\theta)$, where $E \mapsto \Pi E$ denotes the parity change. In the case when θ is rational, we could also try to use the above formulas to define modules associated with (m, n) such that n + mθ = 0; however, these modules are not projective.

Basic modules are the modules $E_{n,m}(\theta)$ with m and n relatively prime. It is known that for (m, n) relatively prime one has $E_{nd,md}(\theta) \simeq E_{n,m}(\theta)^{\oplus d}$. It is often convenient to extend the pair (m, n) to a matrix in $SL_2(\mathbb{Z})$. When we need to make this choice we will use the following notation: for a matrix $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$ we set $E_g(\theta) = E_{d,c}(\theta)$. In particular, for g = 1, we get $E_1(\theta) = E_{1,0}(\theta) = A_\theta$. In accordance with our formulas for the degree and rank of modules we set for a matrix g as above deg(g) = c, rk(g, θ) = cθ + d, so that $\deg(E_g(\theta)) = \deg(g)$ and $\mathrm{rk}(E_g(\theta)) = \mathrm{rk}(g, \theta)$. Note that the function $g \mapsto \mathrm{rk}(g, \theta)$ satisfies the following cocycle condition:

$$\mathrm{rk}(g_1 g_2, \theta) = \mathrm{rk}(g_1, g_2\theta)\,\mathrm{rk}(g_2, \theta), \tag{1.1}$$

where we denote by $\theta \mapsto g\theta = (a\theta + b)/(c\theta + d)$ the natural action of $SL_2(\mathbb{Z})$ on the set of θ's (it is always well-defined for irrational θ, and partially defined for rational θ). We will also need the following identity which can be easily checked:

$$\deg(g_2 g_1^{-1})\,\mathrm{rk}(g_3, \theta) - \deg(g_3 g_1^{-1})\,\mathrm{rk}(g_2, \theta) + \deg(g_3 g_2^{-1})\,\mathrm{rk}(g_1, \theta) = 0. \tag{1.2}$$
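Both (1.1) and (1.2) are polynomial identities in the matrix entries, so they are easy to test numerically. The sketch below is illustrative only. It represents elements of SL_2(Z) as nested tuples ((a, b), (c, d)), and the helper names (`mul`, `inv`, `act`, `cocycle_defect`, `three_term_defect`) are hypothetical, not notation from the paper.

```python
import math


def mul(g, h):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))


def inv(g):
    """Inverse of a determinant-one matrix."""
    (a, b), (c, d) = g
    return ((d, -b), (-c, a))


def deg(g):
    """deg(g) = c, the lower-left entry."""
    return g[1][0]


def rk(g, theta):
    """rk(g, theta) = c*theta + d."""
    return g[1][0] * theta + g[1][1]


def act(g, theta):
    """Fractional-linear action g.theta = (a*theta + b) / (c*theta + d)."""
    (a, b), (c, d) = g
    return (a * theta + b) / (c * theta + d)


def cocycle_defect(g1, g2, theta):
    """Left side minus right side of (1.1)."""
    return rk(mul(g1, g2), theta) - rk(g1, act(g2, theta)) * rk(g2, theta)


def three_term_defect(g1, g2, g3, theta):
    """Left side of (1.2)."""
    return (deg(mul(g2, inv(g1))) * rk(g3, theta)
            - deg(mul(g3, inv(g1))) * rk(g2, theta)
            + deg(mul(g3, inv(g2))) * rk(g1, theta))


if __name__ == "__main__":
    theta = math.sqrt(2)  # an irrational sample value
    g1 = ((2, 1), (3, 2))
    g2 = ((1, 1), (0, 1))
    g3 = ((0, -1), (1, 0))
    print(cocycle_defect(g1, g2, theta))
    print(three_term_defect(g1, g2, g3, theta))
```

Floating-point arithmetic is used because θ is meant to be irrational, so the defects vanish only up to rounding.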

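It is well-known that endomorphisms of the basic right $A_\theta$-module $E_g(\theta)$ can be identified with the algebra $A_{g\theta}$. The corresponding left action of $A_{g\theta}$ on $E_g(\theta)$ is given by the formulas

$$U_1 f(x, \alpha) = f\Big(x - \frac{1}{c},\ \alpha - a\Big), \qquad U_2 f(x, \alpha) = \exp\Big(2\pi i\Big(\frac{x}{c\theta + d} - \frac{\alpha}{c}\Big)\Big) f(x, \alpha),$$

where $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$.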
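The formulas of this subsection can be sanity-checked by applying the operators to a sample function. The sketch below is not from the paper. It assumes the right action is the one of $E_{d,c}(\theta)$ from Sect. 1.1, written as fU_1(x, α) = f(x − (d + cθ)/c, α − 1) and fU_2(x, α) = exp(2πi(x − αd/c)) f(x, α), that the left action is the one stated just above, and that c > 0. It then checks at a few points that the right action satisfies U_1U_2 = e(θ)U_2U_1, that the left action satisfies the same relation with gθ in place of θ, and that the two actions commute. The names `make_actions` and `max_defects` are hypothetical.

```python
import cmath
import math


def e(t):
    return cmath.exp(2j * math.pi * t)


def make_actions(a, b, c, d, theta):
    """Right A_theta-action and left A_{g theta}-action on E_g(theta).

    g = (a b; c d) with ad - bc = 1 and c > 0 (assumed). Functions on
    R x Z/cZ are Python callables f(x, al) with al in range(c).
    """
    assert a * d - b * c == 1 and c > 0
    rk = c * theta + d

    def right_U1(f):
        return lambda x, al: f(x - rk / c, (al - 1) % c)

    def right_U2(f):
        return lambda x, al: e(x - al * d / c) * f(x, al)

    def left_U1(f):
        return lambda x, al: f(x - 1.0 / c, (al - a) % c)

    def left_U2(f):
        return lambda x, al: e(x / rk - al / c) * f(x, al)

    return right_U1, right_U2, left_U1, left_U2


def max_defects(a, b, c, d, theta):
    """Largest violations of the three relations at a few sample points."""
    rU1, rU2, lU1, lU2 = make_actions(a, b, c, d, theta)
    g_theta = (a * theta + b) / (c * theta + d)

    def f(x, al):
        # An arbitrary rapidly decreasing sample function.
        return math.exp(-(x - 0.3 * al) ** 2) * (1 + al)

    points = [(x, al) for x in (-0.7, 0.0, 0.4, 1.3) for al in range(c)]
    # f U1 U2 = e(theta) f U2 U1; here f U1 U2 means (f U1) U2.
    right = max(abs(rU2(rU1(f))(x, al) - e(theta) * rU1(rU2(f))(x, al))
                for x, al in points)
    # U1 U2 f = e(g theta) U2 U1 f.
    left = max(abs(lU1(lU2(f))(x, al) - e(g_theta) * lU2(lU1(f))(x, al))
               for x, al in points)
    # Left and right actions commute.
    commute = max(abs(L(R(f))(x, al) - R(L(f))(x, al))
                  for L in (lU1, lU2) for R in (rU1, rU2)
                  for x, al in points)
    return right, left, commute


if __name__ == "__main__":
    print(max_defects(2, 1, 3, 2, math.sqrt(2) - 1))
```

The commutation of the left U_1 with the right U_2 is the only check that uses ad − bc = 1. The phase that appears is e((ad − 1)/c) = e(b) = 1.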

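1.2. Pairings. We have a canonical pairing

$$b : E_{g^{-1}}(g\theta) \otimes_{\mathbb{C}} E_g(\theta) \to \mathbb{C}, \tag{1.3}$$

defined by the formula

$$b(f_1 \otimes f_2) = \sum_{\alpha\in\mathbb{Z}/\deg(g)\mathbb{Z}}\ \int_{x\in\mathbb{R}} f_1\Big(\frac{x}{\mathrm{rk}(g, \theta)},\ \alpha\Big)\, f_2(x, -a\alpha)\,dx,$$

where $f_1 \in E_{g^{-1}}(g\theta)$, $f_2 \in E_g(\theta)$, and g is given by the matrix above.

Lemma 1.1. For every $U \in A_{g\theta}$, $V \in A_\theta$ one has

$$b(f_1 U \otimes f_2) = b(f_1 \otimes U f_2), \qquad b(V f_1 \otimes f_2) = b(f_1 \otimes f_2 V).$$

Proof. It suffices to check these equalities when U (resp. V) is one of the two generators of the algebra $A_{g\theta}$ (resp. $A_\theta$), in which case it is straightforward.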


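Proposition 1.2. (a) For every $g_1, g_2 \in SL_2(\mathbb{Z})$ one has a well-defined pairing of right $A_\theta$-modules

$$t_{g_1,g_2} : E_{g_1}(g_2\theta) \otimes_{\mathbb{C}} E_{g_2}(\theta) \to E_{g_1 g_2}(\theta)$$

given by the following formulas. If $g_1 \neq 1$, $g_2 \neq 1$ and $g_1 g_2 \neq 1$ then

$$t_{g_1,g_2}(f_1 \otimes f_2)(x, \alpha) = \sum_{n\in\mathbb{Z}} f_1\Big(\frac{x}{\mathrm{rk}(g_2, \theta)} + \frac{\mathrm{rk}(g_1, g_2\theta)}{c_1}\Big(\frac{c_2 d_{12}\alpha}{c_{12}} - n\Big),\ a_1 d_{12}\alpha - n\Big) \times f_2\Big(x - \frac{d_{12}\alpha}{c_{12}} + \frac{n}{c_2},\ a_2 n\Big),$$

where x ∈ R, $\alpha \in \mathbb{Z}/c_{12}\mathbb{Z}$, $g_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix}$, $g_1 g_2 = \begin{pmatrix} a_{12} & b_{12} \\ c_{12} & d_{12} \end{pmatrix}$. The pairing

$$t_{g,1} : E_g(\theta) \otimes_{\mathbb{C}} E_1(\theta) \to E_g(\theta)$$

is given by the right action of the algebra $A_\theta = E_1(\theta)$ on $E_g(\theta)$. The pairing

$$t_{1,g} : E_1(g\theta) \otimes_{\mathbb{C}} E_g(\theta) \to E_g(\theta)$$

is set to be the left action of the algebra $A_{g\theta} = E_1(g\theta)$ on $E_g(\theta)$ defined before. Finally, the pairing

$$t_{g^{-1},g} : E_{g^{-1}}(g\theta) \otimes_{\mathbb{C}} E_g(\theta) \to E_1(\theta) = A_\theta$$

is given by the formula

$$t_{g^{-1},g}(f_1 \otimes f_2) = \sum_{(n_1,n_2)\in\mathbb{Z}^2} U_1^{n_1} U_2^{n_2}\, b(U_2^{-n_2} U_1^{-n_1} f_1 \otimes f_2),$$

where b is the pairing (1.3).

(b) For every triple of elements $g_1, g_2, g_3 \in SL_2(\mathbb{Z})$ the following diagram is commutative:

$$\begin{array}{ccc}
E_{g_1}(g_2 g_3\theta) \otimes E_{g_2}(g_3\theta) \otimes E_{g_3}(\theta) & \xrightarrow{\ t_{g_1,g_2}\otimes\mathrm{id}\ } & E_{g_1 g_2}(g_3\theta) \otimes E_{g_3}(\theta) \\
\Big\downarrow{\scriptstyle \mathrm{id}\otimes t_{g_2,g_3}} & & \Big\downarrow{\scriptstyle t_{g_1 g_2,g_3}} \\
E_{g_1}(g_2 g_3\theta) \otimes E_{g_2 g_3}(\theta) & \xrightarrow{\ t_{g_1,g_2 g_3}\ } & E_{g_1 g_2 g_3}(\theta)
\end{array} \tag{1.4}$$

(c) The map $E_{g_1}(g_2\theta) \to \mathrm{Hom}_{A_\theta}(E_{g_2}(\theta), E_{g_1 g_2}(\theta))$ induced by $t_{g_1,g_2}$ is an isomorphism.

Before starting the proof it is convenient to rewrite the formula for $t_{g_1,g_2}$ a little bit.

Lemma 1.3. (a) For $g_1, g_2 \in SL_2(\mathbb{Z})$ such that $g_1 \neq 1$, $g_2 \neq 1$, $g_1 g_2 \neq 1$ one has

$$t_{g_1,g_2}(f_1 \otimes f_2)(x, \alpha) = \sum_{\alpha_1\in\mathbb{Z}/c_1\mathbb{Z},\ \alpha_2\in\mathbb{Z}/c_2\mathbb{Z}}\ \sum_{n\in I_{g_1,g_2}(\alpha_1,\alpha_2,\alpha)} f_1\Big(\frac{x}{\mathrm{rk}(g_2, \theta)} + \frac{\mathrm{rk}(g_1, g_2\theta)\,n}{c_1 c_{12}},\ \alpha_1\Big)\, f_2\Big(x - \frac{n}{c_2 c_{12}},\ \alpha_2\Big),$$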


where $g_1$, $g_2$ and $g_1 g_2$ are given by the same matrices as in the above proposition,

$$I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha) = \{n \in \mathbb{Z} \mid n \equiv -c_1\alpha + c_{12}\alpha_1\ (c_{12}c_1),\ n \equiv c_2 d_{12}\alpha - c_{12}d_2\alpha_2\ (c_{12}c_2)\}.$$

(b) For $n \in I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha)$ one has $n \equiv -c_1\alpha_2 + d_1 c_2\alpha_1\ (c_1 c_2)$.

(c) One has $I_{g_2,\,-g_2^{-1}g_1^{-1}}(\alpha_1, \alpha_2, \alpha_3) = I_{g_1,g_2}(-a_1\alpha_3, -\alpha_1, -a_{12}\alpha_2)$.

Proof of lemma. (a) In the formula of Proposition 1.2(a) one should change the summation variable by setting $n' = c_2 d_{12}\alpha - c_{12}n$. Then the congruences $a_1 d_{12}\alpha - n \equiv \alpha_1\ (c_1)$, $a_2 n \equiv \alpha_2\ (c_2)$ are equivalent to the congruences $n' \equiv c_2 d_{12}\alpha - c_{12}d_2\alpha_2\ (c_{12}c_2)$, $n' \equiv (c_2 - a_1 c_{12})d_{12}\alpha + c_{12}\alpha_1\ (c_{12}c_1)$ (we used the fact that $d_2 \equiv a_2^{-1}\ (c_2)$). It remains to use the relation $(c_2 - a_1 c_{12})d_{12} = -c_1 a_{12}d_{12} \equiv -c_1\ (c_{12}c_1)$ to see that these congruences are equivalent to $n' \in I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha)$.

(b) Let us pick a representative in Z for $\alpha \in \mathbb{Z}/c_{12}\mathbb{Z}$. Since $c_2 d_{12} - c_{12}d_2 = -c_1$ we have the congruence $n \equiv -c_1\alpha + c_{12}d_2(\alpha - \alpha_2)\ (c_{12}c_2)$ for $n \in I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha)$. Hence, $k := (n + c_1\alpha)/c_{12}$ is an integer and we have $k \equiv d_2(\alpha - \alpha_2)\ (c_2)$, $k \equiv \alpha_1\ (c_1)$. Using these two relations we get

$$n + c_1\alpha = c_{12}k = c_1 a_2 k + d_1 c_2 k \equiv c_1 a_2 d_2(\alpha - \alpha_2) + d_1 c_2\alpha_1\ (c_1 c_2) \equiv c_1(\alpha - \alpha_2) + d_1 c_2\alpha_1\ (c_1 c_2)$$

as required.

(c) This follows easily from (b).
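Part (b) of the lemma can be tested by brute force for small matrices. The following sketch is illustrative only. It assumes that $c_1$, $c_2$ and $c_{12}$ are all positive so that Python's `%` gives canonical residues, and it reads $I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha)$ as the set of integers satisfying the two congruences modulo $c_{12}c_1$ and $c_{12}c_2$. The helper names `index_set` and `check_lemma_b` are hypothetical.

```python
import math


def mul(g, h):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))


def index_set(g1, g2, al1, al2, al):
    """Representatives, modulo lcm(c12*c1, c12*c2), of I_{g1,g2}(al1, al2, al)."""
    (a1, b1), (c1, d1) = g1
    (a2, b2), (c2, d2) = g2
    (a12, b12), (c12, d12) = mul(g1, g2)
    m1, m2 = c12 * c1, c12 * c2
    period = m1 * m2 // math.gcd(m1, m2)
    return [n for n in range(period)
            if (n - (-c1 * al + c12 * al1)) % m1 == 0
            and (n - (c2 * d12 * al - c12 * d2 * al2)) % m2 == 0]


def check_lemma_b(g1, g2):
    """Return (all_ok, number_of_checked_n) for the congruence in part (b)."""
    (a1, b1), (c1, d1) = g1
    (a2, b2), (c2, d2) = g2
    c12 = mul(g1, g2)[1][0]
    assert c1 > 0 and c2 > 0 and c12 > 0  # assumption of this sketch
    checked = 0
    for al1 in range(c1):
        for al2 in range(c2):
            for al in range(c12):
                for n in index_set(g1, g2, al1, al2, al):
                    if (n - (-c1 * al2 + d1 * c2 * al1)) % (c1 * c2) != 0:
                        return False, checked
                    checked += 1
    return True, checked


if __name__ == "__main__":
    print(check_lemma_b(((1, 0), (2, 1)), ((2, 1), (3, 2))))
    print(check_lemma_b(((2, 1), (3, 2)), ((1, 0), (2, 1))))
```

Only one period of residues is scanned. This suffices for the examples below because the modulus $c_1 c_2$ in (b) divides that period.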

Proof of Proposition 1.2. The pairing $t_{g_1,g_2}$ for $g_1 \neq 1$, $g_2 \neq 1$, $g_1 g_2 \neq 1$ is essentially the same as in Sect. 2 of [3]. We just replaced left modules by right modules everywhere. Now the results of [3] imply that $t_{g_1,g_2}$ for $g_1 \neq 1$, $g_2 \neq 1$ gives rise to a map of $A_{g_1 g_2\theta}$-$A_\theta$-bimodules (in fact, an isomorphism)

$$E_{g_1}(g_2\theta) \otimes_{A_{g_2\theta}} E_{g_2}(\theta) \to E_{g_1 g_2}(\theta). \tag{1.5}$$

This proves the associativity in (b) for the case when one of the elements $g_i$ is equal to 1, and the two other elements $g_j$ and $g_k$ are different from 1 and are not inverse to each other. It is also easy to see directly that the pairing $t_{g^{-1},g}$ gives rise to a map of $A_\theta$-$A_\theta$-bimodules (also an isomorphism) $E_{g^{-1}}(g\theta) \otimes_{A_{g\theta}} E_g(\theta) \to A_\theta$, which implies the associativity in (b) for the triples $(g^{-1}, g, 1)$, $(g^{-1}, 1, g)$ and $(1, g^{-1}, g)$. In the case when all the elements $g_1$, $g_2$, $g_3$, $g_1 g_2$, $g_2 g_3$, $g_1 g_2 g_3$ are non-trivial the associativity can be checked by a straightforward calculation using Lemma 1.3(a),(b).


More precisely, if we denote $g_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix}$, $g_i g_j = \begin{pmatrix} a_{ij} & b_{ij} \\ c_{ij} & d_{ij} \end{pmatrix}$, $g_1 g_2 g_3 = \begin{pmatrix} a_{123} & b_{123} \\ c_{123} & d_{123} \end{pmatrix}$, then the associativity follows from the statement that for every collection $(\alpha_1, \alpha_2, \alpha_3, \alpha)$, where $\alpha_i \in \mathbb{Z}/c_i\mathbb{Z}$, $\alpha \in \mathbb{Z}/c_{123}\mathbb{Z}$, the map $(m, n) \mapsto ((c_3 m + c_{123}n)/c_{23},\ (c_2 m - c_1 n)/c_{23})$ gives a bijection

$$\bigcup_{\alpha_{23}\in\mathbb{Z}/c_{23}\mathbb{Z}} I_{g_1,g_2 g_3}(\alpha_1, \alpha_{23}, \alpha) \times I_{g_2,g_3}(\alpha_2, \alpha_3, \alpha_{23}) \to \bigcup_{\alpha_{12}\in\mathbb{Z}/c_{12}\mathbb{Z}} I_{g_1 g_2,g_3}(\alpha_{12}, \alpha_3, \alpha) \times I_{g_1,g_2}(\alpha_1, \alpha_2, \alpha_{12}).$$

The remaining interesting cases are: (i) $g_1 g_2 g_3 = 1$, all elements $g_1$, $g_2$, $g_3$ are non-trivial; (ii) $g_1 g_2 = 1$, all elements $g_2$, $g_3$, $g_2 g_3$ are non-trivial; (ii)' $g_2 g_3 = 1$, all elements $g_1$, $g_2$, $g_1 g_2$ are non-trivial. The case (i) boils down to the identity

$$b(t_{g_1,g_2}(f_1 \otimes f_2) \otimes f_3) = b(f_1, t_{g_2,g_3}(f_2 \otimes f_3)),$$

which is not difficult to check directly. Let us prove the identity in the case (ii) (the case (ii)' is similar). We can assume that θ is irrational (the case of rational θ will follow by continuity). Then we can easily show that the associativity holds up to a constant. Indeed, for fixed $f_1 \in E_{g_2^{-1}}(g_2 g_3\theta)$, $f_2 \in E_{g_2}(g_3\theta)$, the map

$$f_3 \mapsto t_{g_2^{-1},g_2 g_3}(f_1 \otimes t_{g_2,g_3}(f_2 \otimes f_3))$$

is a morphism of right $A_\theta$-modules $E_{g_3}(\theta) \to E_{g_3}(\theta)$, hence it is given by the left action of some element $t(f_1, f_2) \in A_{g_3\theta}$. Moreover, the map t gives a morphism of $A_{g_3\theta}$-$A_{g_3\theta}$-bimodules

$$E_{g_2^{-1}}(g_2 g_3\theta) \otimes_{A_{g_2 g_3\theta}} E_{g_2}(g_3\theta) \to A_{g_3\theta}.$$

Since $t_{g_2^{-1},g_2}$ induces an isomorphism of the bimodule on the left with $A_{g_3\theta}$ we see that this morphism differs from $t_{g_2^{-1},g_2}$ by the multiplication with a central element in $A_{g_3\theta}$. Since we assumed that θ is irrational, this central element should be a constant. Thus, it suffices to prove the following identity:

$$t_{g_2^{-1},g_2 g_3}(f_1 \otimes t_{g_2,g_3}(f_2 \otimes f_3))(0, 0) = (t_{g_2^{-1},g_2}(f_1 \otimes f_2)f_3)(0, 0). \tag{1.6}$$

Also, we can assume that $f_1(x, \alpha) = 0$ for $\alpha \not\equiv 0\ (c_1)$ and denote $f_1(x) = f_1(x, 0)$. Then the LHS of (1.6) can be rewritten as

$$\sum_{m,n\in\mathbb{Z}} f_1\Big(\frac{m}{\mathrm{rk}(g_2, g_3\theta)}\Big)\, f_2\Big(m + \frac{n\,\mathrm{rk}(g_2, g_3\theta)}{c_2},\ n\Big)\, f_3\Big(-\frac{n}{c_3},\ -a_3 n\Big) = \sum_{n\in\mathbb{Z}} (U_1^n f_3)(0, 0) \cdot \sum_{m\in\mathbb{Z}} f_1\Big(\frac{m}{\mathrm{rk}(g_2, g_3\theta)}\Big)\,(f_2 U_1^{-n})(m, 0).$$

Now we can rewrite the inner sum using the Poisson summation formula:

$$\begin{aligned}
\sum_{m\in\mathbb{Z}} f_1\Big(\frac{m}{\mathrm{rk}(g_2, g_3\theta)}\Big)\,(f_2 U_1^{-n})(m, 0)
&= \sum_{m\in\mathbb{Z}} \int \exp(2\pi i m y)\, f_1\Big(\frac{y}{\mathrm{rk}(g_2, g_3\theta)}\Big)\,(f_2 U_1^{-n})(y, 0)\,dy \\
&= \sum_{m\in\mathbb{Z}} \int f_1\Big(\frac{y}{\mathrm{rk}(g_2, g_3\theta)}\Big)\,(f_2 U_1^{-n} U_2^{m})(y, 0)\,dy = \sum_{m\in\mathbb{Z}} b(f_1, f_2 U_1^{-n} U_2^{-m}).
\end{aligned}$$

Therefore, the LHS of (1.6) is equal to

$$\sum_{m,n\in\mathbb{Z}} (U_1^n f_3)(0, 0)\, b(U_1^{-n} U_2^{-m} f_1, f_2) = \sum_{m,n\in\mathbb{Z}} (U_2^m U_1^n f_3)(0, 0)\, b(U_1^{-n} U_2^{-m} f_1, f_2),$$

which is the definition of the RHS of (1.6). Part (c) follows from the well-known fact that all basic modules considered as bimodules are invertible and from the isomorphism (1.5).

Remark. The cocycle condition (1.1) implies that the pairing $t_{g_1,g_2}$ is even (where we define the Z/2Z-grading of the tensor product in the usual way).

Corollary 1.4. (a) For every pair of basic modules $E_g(\theta)$, $E_h(\theta)$ there is a canonical isomorphism

$$\mathrm{Hom}_{A_\theta}(E_g(\theta), E_h(\theta)) \simeq E_{hg^{-1}}(g\theta) \tag{1.7}$$

of $A_{h\theta}$-$A_{g\theta}$-bimodules.

(b) For a triple of basic modules $(E_g(\theta), E_h(\theta), E_k(\theta))$ the canonical composition map

$$\mathrm{Hom}_{A_\theta}(E_h(\theta), E_k(\theta)) \otimes \mathrm{Hom}_{A_\theta}(E_g(\theta), E_h(\theta)) \to \mathrm{Hom}_{A_\theta}(E_g(\theta), E_k(\theta))$$

is identified via isomorphisms (1.7) with the pairing $t_{kh^{-1},hg^{-1}}$.

Remark. Recall that the Morita equivalence associated with a basic module $E_{g_0}(\theta)$ is the functor $M \mapsto M \otimes_{A_{g_0\theta}} E_{g_0}(\theta)$ from the category of right $A_{g_0\theta}$-modules to that of right $A_\theta$-modules. According to (1.5) it sends $E_g(g_0\theta)$ to $E_{gg_0}(\theta)$. The corresponding isomorphism

$$\mathrm{Hom}_{A_{g_0\theta}}(E_g(g_0\theta), E_h(g_0\theta)) \simeq \mathrm{Hom}_{A_\theta}(E_{gg_0}(\theta), E_{hg_0}(\theta))$$

is compatible with the identification of both sides with $E_{hg^{-1}}(gg_0\theta)$ given in the above corollary.

2. Holomorphic Structures

2.1. Standard holomorphic structures on basic modules. Let us fix a complex number τ such that Im(τ) ≠ 0. We will think about τ as a complex structure on the noncommutative torus $T_\theta$. Namely, τ defines a one-dimensional subalgebra in the Lie algebra of derivations of the algebra $A_\theta$ spanned by the derivation $\delta_\tau$ given by

$$\delta_\tau\Big(\sum_{(n_1,n_2)\in\mathbb{Z}^2} a_{n_1,n_2} U_1^{n_1} U_2^{n_2}\Big) = 2\pi i \sum_{(n_1,n_2)\in\mathbb{Z}^2} (n_1\tau + n_2)\,a_{n_1,n_2} U_1^{n_1} U_2^{n_2}. \tag{2.1}$$

We denote by $T_{\theta,\tau}$ the noncommutative torus $T_\theta$ equipped with this complex structure.


Definition. (i) A holomorphic structure on a right $A_\theta$-module $E$ (compatible with the complex structure on $T_{\theta,\tau}$) is an operator $\nabla : E \to E$ satisfying the following Leibnitz identity:
$$\nabla(ea) = \nabla(e)\cdot a + e\cdot\delta_\tau(a),$$
where $e \in E$, $a \in A_\theta$.
(ii) If $E$ and $E'$ are right $A_\theta$-modules equipped with holomorphic structures $\nabla$ and $\nabla'$, then we say that a morphism $f : E \to E'$ of right $A_\theta$-modules is holomorphic if $\nabla'(f(e)) = f(\nabla e)$.
(iii) For a right $A_\theta$-module $E$ equipped with a holomorphic structure $\nabla$ we define the cohomology as follows:
$$H^*(E) = H^*(E,\nabla) := H^0(E,\nabla)\oplus H^1(E,\nabla),$$
where $H^0(E,\nabla) := \ker(\nabla : E \to E)$, $H^1(E,\nabla) := \mathrm{coker}(\nabla : E \to E)$.

It is natural to call projective right $A_\theta$-modules equipped with holomorphic structures holomorphic bundles on the complex noncommutative torus $T_{\theta,\tau}$. We are going to introduce a family of holomorphic structures (depending on one complex parameter) on every basic module. Namely, for $E = E_{n,m}(\theta)$ with $m \neq 0$ we set
$$\nabla_z(f) = \frac{\partial f}{\partial x} + 2\pi i(\tau\mu(E)x + z)f,$$
where $\mu(E) = m/(n + m\theta)$, $f \in E$, $z \in \mathbb{C}$. It is easy to see that for every $z$ this is a holomorphic structure on $E$. For $E = A_\theta$ we define holomorphic structures on the trivial module $\nabla_z : A_\theta \to A_\theta$ by setting
$$\nabla_z(a) = 2\pi i z\cdot a + \delta_\tau(a).$$
We will denote by $E_g^z(\theta)$ the basic module $E_g$ equipped with the holomorphic structure $\nabla_z$.

Definition. A standard holomorphic structure on a basic module $E_g$ is one of the structures $\nabla_z$. A standard holomorphic bundle on $T_\theta$ is one of the bundles $E_g^z(\theta)$.

Note that there is a natural notion of equivalence of holomorphic structures on the same module $E$: two such structures $\nabla$ and $\nabla'$ are equivalent if there exists a holomorphic isomorphism $(E,\nabla) \to (E,\nabla')$. In other words, the group of automorphisms of $E$ as a right $A_\theta$-module acts on the space of holomorphic structures on $E$, so that equivalence classes are orbits under this action.

Proposition 2.1. (a) For every basic right $A_\theta$-module $E = E_g(\theta)$ and every $z \in \mathbb{C}$ the operator $\nabla_z : E \to E$ satisfies the following Leibnitz rule with respect to the left action of $A_{g\theta}$ on $E$:
$$\nabla_z(be) = b\cdot\nabla_z(e) + \frac{1}{\mathrm{rk}(E)}\,\delta_\tau(b)\cdot e,$$
where $b \in A_{g\theta}$, $e \in E$.
(b) The holomorphic structures $\nabla_z$ and $\nabla_{z'}$ are equivalent if and only if $z \equiv z' \mod \frac{1}{\mathrm{rk}(E)}(\mathbb{Z} + \tau\mathbb{Z})$.


(c) Let $E_0 = E_g(\theta)$ be a basic right $A_\theta$-module equipped with a standard holomorphic structure $\nabla_z$, and let $E$ be a right $A_{g\theta}$-module. Then for every holomorphic structure $\nabla : E \to E$ on $E$ the formula
$$\nabla(e\otimes e_0) = \frac{1}{\mathrm{rk}(E_0)}\,\nabla(e)\otimes e_0 + e\otimes\nabla_z(e_0)$$
defines a holomorphic structure on $E \otimes_{A_{g\theta}} E_0$.

Proof. (a) This follows from the explicit formulas for the action of $A_{g\theta}$ on $E$.
(b) The identities of (a) for $b = U_1$ and $b = U_2$ can be rewritten as follows:
$$\nabla_{z-\frac{\tau}{\mathrm{rk}(E)}}(U_1e) = U_1\nabla_z(e), \qquad \nabla_{z-\frac{1}{\mathrm{rk}(E)}}(U_2e) = U_2\nabla_z(e).$$
Therefore, $U_1$ (resp. $U_2$) induces an equivalence between $\nabla_z$ and $\nabla_{z-\frac{\tau}{\mathrm{rk}(E)}}$ (resp. $\nabla_{z-\frac{1}{\mathrm{rk}(E)}}$).

Conversely, assume that $\nabla_z$ is equivalent to $\nabla_{z'}$. Then $z' = z - \delta_\tau(u)u^{-1}/\mathrm{rk}(E)$ for some $u \in A_{g\theta}^*$. But this implies that $\delta_\tau(u)$ is proportional to $u$, which is possible only for $u$ of the form $c\cdot U_1^{n_1}U_2^{n_2}$. Hence, constant elements $z$ and $z'$ belong to the same orbit if and only if $z - z' \in \mathrm{rk}(E)^{-1}(\mathbb{Z} + \tau\mathbb{Z})$.
(c) This follows easily from (a).
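The rewriting of Proposition 2.1(a) used in part (b) of the proof can be checked directly. The sketch below uses only (2.1) applied to a single monomial and the definition of $\nabla_z$; it assumes the relation $\nabla_{z'} = \nabla_z + 2\pi i(z'-z)$, which is read off from the defining formulas above.

```latex
% From (2.1): \delta_\tau(U_1) = 2\pi i\,\tau\,U_1 and \delta_\tau(U_2) = 2\pi i\,U_2.
% Proposition 2.1(a) with b = U_1:
\begin{align*}
\nabla_z(U_1 e) &= U_1\nabla_z(e) + \frac{2\pi i\,\tau}{\mathrm{rk}(E)}\,U_1 e .
\end{align*}
% Shifting the parameter z by -\tau/rk(E) subtracts exactly this extra term:
\begin{align*}
\nabla_{z-\tau/\mathrm{rk}(E)}(U_1 e)
  &= \nabla_z(U_1 e) - \frac{2\pi i\,\tau}{\mathrm{rk}(E)}\,U_1 e
   = U_1\nabla_z(e).
\end{align*}
% The same computation with b = U_2 gives
% \nabla_{z-1/\mathrm{rk}(E)}(U_2 e) = U_2\nabla_z(e).
```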

It is easy to see that not every holomorphic structure on a basic module is equivalent to a standard one. Indeed, if $\theta$ is irrational then every basic module $E$ is isomorphic to a direct sum $E_1 \oplus E_2 \oplus \ldots \oplus E_n$ of two or more other basic modules (this follows from Rieffel's cancellation theorem, see [14]). Then the direct sum of holomorphic structures on the $E_i$ will not be equivalent to a standard holomorphic structure on $E$, since it will admit holomorphic endomorphisms that are not proportional to the identity, while for standard holomorphic structures this is not the case (as follows from Corollary 2.3 below).

Holomorphic bundles isomorphic to standard ones can be characterized as those bundles admitting a compatible unitary connection with constant curvature (and such that the underlying projective module is basic). In the commutative case the theorem of Narasimhan and Seshadri [11] asserts that these are exactly stable bundles. In the case of elliptic curves an equivalent characterization is that these are bundles with only scalar endomorphisms. One may hope that this remains true for noncommutative tori. One natural approach to this problem would be to mimic the method of Donaldson in [4]. Roughly speaking, the idea is to apply the gradient flow of the Yang-Mills functional and then use the theorem of Connes-Rieffel [2] stating that the minima of this functional are exactly connections of constant curvature (this approach is advocated in [19]). However, the induction in rank employed in [4] cannot be used for irrational $\theta$, since ranks of bundles on $T_\theta$ can be arbitrarily small. It is possible that one can modify the argument using Rieffel's description of all critical sets of the Yang-Mills functional (see [15]). Another possible approach would be to use an analogue of the Fourier-Mukai transform defined in Sect. 3.3.

2.2. Tensor products and Hom in holomorphic category. Now we are going to study the relation between the pairings tg1 ,g2 and holomorphic structures.


Proposition 2.2. (a) For every $g_1, g_2 \in SL_2(\mathbb{Z})$, $z_1, z_2 \in \mathbb{C}$ one has

$$\nabla_{z_1+z_2}\big(t_{g_1,g_2}(f_1\otimes f_2)\big) = \frac{1}{\mathrm{rk}(g_2,\theta)}\, t_{g_1,g_2}\big(\nabla_{\mathrm{rk}(g_2,\theta)z_1}(f_1)\otimes f_2\big) + t_{g_1,g_2}\big(f_1\otimes\nabla_{z_2}(f_2)\big), \qquad (2.2)$$

where $f_1 \in E_{g_1}(g_2\theta)$, $f_2 \in E_{g_2}(\theta)$.
(b) The pairing $t_{g_1,g_2}$ induces a well-defined pairing
$$H^i\big(E_{g_1}(g_2\theta), \nabla_{\mathrm{rk}(g_2,\theta)z_1}\big)\otimes H^j\big(E_{g_2}(\theta), \nabla_{z_2}\big) \to H^{i+j}\big(E_{g_1g_2}(\theta), \nabla_{z_1+z_2}\big),$$
where $(i,j)$ is either $(0,0)$, $(1,0)$ or $(0,1)$.

Proof. (a) Since $\nabla_z(f) = \nabla_0(f) + 2\pi i z f$, it suffices to consider the case $z_1 = z_2 = 0$. If $g_1 \neq 1$, $g_2 \neq 1$ and $g_1g_2 \neq 1$ then (2.2) follows immediately from the explicit formula for the pairing $t_{g_1,g_2}$. For $g_2 = 1$ it reduces to the Leibnitz identity, while for $g_1 = 1$ it follows from Proposition 2.1(a). Finally, in the case $g_1g_2 = 1$ the equality (2.2) follows from the identity
$$\frac{1}{\mathrm{rk}(g,\theta)}\, b\big(\nabla_0(f_1)\otimes f_2\big) + b\big(f_1\otimes\nabla_0(f_2)\big) = 0$$
for the pairing (1.3), which is easy to check using integration by parts.
(b) This follows immediately from (a).

Corollary 2.3. Let $E = E_g^z(\theta)$ and $E' = E_{g'}^{z'}(\theta)$ be a pair of basic modules equipped with holomorphic structures $\nabla_z$ and $\nabla_{z'}$ respectively.
(a) The formula
$$\nabla(\phi)(e) = \mathrm{rk}(E)\big[\nabla_{z'}(\phi(e)) - \phi(\nabla_z e)\big]$$
defines a holomorphic structure on the right $A_{g\theta}$-module $\mathrm{Hom}_{A_\theta}(E,E')$ (where $\phi \in \mathrm{Hom}_{A_\theta}(E,E')$, $e \in E$).
(b) Under the isomorphism (1.7) the above holomorphic structure on $\mathrm{Hom}_{A_\theta}(E,E')$ corresponds to the operator $\nabla_{\mathrm{rk}(E)(z'-z)}$ on $E_{g'g^{-1}}(g\theta)$, i.e. we have
$$\mathrm{Hom}_{A_\theta}\big(E_g^z(\theta), E_{g'}^{z'}(\theta)\big) \simeq E_{g'g^{-1}}^{\mathrm{rk}(g,\theta)(z'-z)}(g\theta)$$
in a way compatible with holomorphic structures.
(c) The subspace $H^0(\mathrm{Hom}_{A_\theta}(E,E')) \subset \mathrm{Hom}_{A_\theta}(E,E')$ coincides with the subspace of holomorphic morphisms $E \to E'$.

In the same way as in commutative geometry we can interpret the space $H^1(\mathrm{Hom}_{A_\theta}(E,E'))$ in terms of holomorphic extensions.

Proposition 2.4. Let $E = E_g^z(\theta)$ and $E' = E_{g'}^{z'}(\theta)$ be a pair of basic modules equipped with standard holomorphic structures. Then there is a functorial bijection between the space $H^1(\mathrm{Hom}_{A_\theta}(E',E))$ and isomorphism classes of extensions $0 \to E \to F \to E' \to 0$ in the category of holomorphic bundles on $T_\theta$.


Proof. Let $0 \to E \to F \to E' \to 0$ be an extension in the holomorphic category. Since $E'$ is projective we can choose a (non-holomorphic) splitting $s : E' \to F$ of $A_\theta$-modules, which induces an isomorphism $F \simeq E \oplus E'$. The holomorphic structure $\nabla_F$ on $F$ should have the form
$$\nabla_F = \begin{pmatrix} \nabla_E & \phi \\ 0 & \nabla_{E'} \end{pmatrix}$$
for some $\phi \in \mathrm{Hom}_{A_\theta}(E',E)$. All possible splittings $s : E' \to F$ form a principal homogeneous space over $\mathrm{Hom}_{A_\theta}(E',E)$, and changing $s$ by an element $f \in \mathrm{Hom}_{A_\theta}(E',E)$ leads to the change $\phi \mapsto \phi + \nabla(f)$, where $\nabla$ is the holomorphic structure on $\mathrm{Hom}_{A_\theta}(E',E)$ defined above. This easily implies the assertion.

2.3. Holomorphic vectors and cohomology. The cohomology spaces $H^*(E_g^z(\theta)) := H^*(E_g(\theta), \nabla_z)$ are easy to compute. Namely, the following result holds.

Proposition 2.5. Assume that $\mathrm{Im}(\tau) < 0$. Let $E = E_g^z(\theta)$ be a basic module equipped with a holomorphic structure $\nabla_z$.
(a) If $\mu(E) > 0$ then $H^1(E) = 0$ and $H^0(E)$ has dimension $\deg(E)$.
(b) If $\mu(E) < 0$ then $H^0(E) = 0$ and $H^1(E)$ has dimension $\deg(E)$.
(c) If $z \notin \mathbb{Z} + \tau\mathbb{Z}$ then $H^*(A_\theta, \nabla_z) = 0$.
(d) The spaces $H^0(A_\theta, \nabla_0)$ and $H^1(A_\theta, \nabla_0)$ are 1-dimensional.

Proof. In the case $\deg(E) \neq 0$ the problem reduces to the computation of the cohomology of the operator $f \mapsto f' + 2\pi i(\tau\mu(E)x + z)f$ on the Schwartz space $S(\mathbb{R})$. Now one has to use the fact that for every complex number $a$ with $\mathrm{Re}(a) \neq 0$ the operator $f \mapsto f' + (ax + z)f$ on $S(\mathbb{R})$ has 1-dimensional kernel and no cokernel (resp., 1-dimensional cokernel and no kernel) if $\mathrm{Re}(a) > 0$ (resp. $\mathrm{Re}(a) < 0$). To prove this fact one can use conjugation by the operator $f \mapsto f\exp(i\,\mathrm{Im}(a)x^2/2)$ and rescaling of the variable $x$ to reduce to the case $a = \pm 1$. Furthermore, making the change of variable $x \mapsto x \pm \mathrm{Re}(z)$ and conjugating by the operator $f \mapsto f\exp(i\,\mathrm{Im}(z)x)$ we can assume that $z = 0$. Now writing $f$ as a linear combination of $(H_n(x)\exp(-x^2/2),\ n \geq 0)$, where $H_n$ are Hermite polynomials, one easily proves that $\exp(-x^2/2)$ generates the kernel for $a = 1$ (resp., the cokernel for $a = -1$). In the case $E = A_\theta$ the assertion follows immediately from the formula defining $\nabla_z$.
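The kernel statement used in the proof of Proposition 2.5 can be made explicit by solving the first-order equation directly. The following is a sketch; the constant $C$ is our notation, and only the kernel (not the cokernel) is treated.

```latex
% Kernel of f -> f' + (a x + z) f on smooth functions:
\begin{align*}
f' + (ax+z)f = 0
  \;&\Longleftrightarrow\; f(x) = C\exp\!\big(-a x^2/2 - z x\big), \qquad C\in\mathbb{C},\\
|f(x)| &= |C|\exp\!\big(-\mathrm{Re}(a)\,x^2/2 - \mathrm{Re}(z)\,x\big).
\end{align*}
% Hence a nonzero solution lies in the Schwartz space S(R) exactly when Re(a) > 0.
% For the operator of Proposition 2.5 one has a = 2\pi i\,\tau\,\mu(E), so
\begin{align*}
\mathrm{Re}(a) = -2\pi\,\mathrm{Im}(\tau)\,\mu(E),
\end{align*}
% which is positive for \mu(E) > 0 under the standing assumption Im(\tau) < 0.
```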

Corollary 2.6. Assume that $\mathrm{Im}(\tau) < 0$. Then for every basic module $E$ equipped with some holomorphic structure one has
$$\chi(E) = \dim H^i(E) - \dim H^{i+1}(E) = \deg(E),$$
where $i = 0$ if $\mathrm{rk}(E) > 0$ and $i = 1$ if $\mathrm{rk}(E) < 0$.

The computation of Proposition 2.5 gives explicit bases in $H^*(E_g^z(\theta))$. In particular, the space of holomorphic vectors $H^0(E_g^z(\theta))$ in a basic module $E_g^z(\theta)$, such that $\deg(g) > 0$, has a natural basis
$$\phi_\alpha^z(x,\beta) := e(-\tau\mu(E)x^2/2 - zx)\,\delta_\alpha(\beta), \qquad \alpha \in \mathbb{Z}/\deg(g)\mathbb{Z},$$


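where $e(t) := \exp(2\pi i t)$ and $\delta_\alpha$ is the delta-function at $\alpha \in \mathbb{Z}/\deg(g)\mathbb{Z}$. The basis in $H^0(A_\theta, \nabla_0)$ is the element $\phi^0 = 1 \in A_\theta$. The following proposition is essentially equivalent to the main result of [3].

Proposition 2.7. For every $g_1, g_2 \in SL_2(\mathbb{Z})$ such that $\deg(g_1) > 0$, $\deg(g_2) > 0$, $\mathrm{rk}(g_1,\theta) > 0$, $\mathrm{rk}(g_2,\theta) > 0$ one has $\deg(g_1g_2) > 0$ and
$$t_{g_1,g_2}\big(\phi_{\alpha_1}^{\mathrm{rk}(g_2,\theta)z_1}\otimes\phi_{\alpha_2}^{z_2}\big) = \sum_{\alpha\in\mathbb{Z}/c_{12}\mathbb{Z}} c^{\alpha}_{\alpha_1,\alpha_2}\,\phi_\alpha^{z_1+z_2},$$
where
$$c^{\alpha}_{\alpha_1,\alpha_2} = \sum_{m\in I_{g_1,g_2}(\alpha_1,\alpha_2,\alpha)} e\Big(\frac{-\tau m^2/2 + (c_1z_2 - \mathrm{rk}(g_1g_2,\theta)\,c_2z_1)\,m}{c_1c_2c_{12}}\Big),$$
where we use the notation of Proposition 1.2.

Proof. The fact that $\deg(g_1g_2) > 0$ follows from the identity
$$\deg(g_1g_2) = \deg(g_1)\,\mathrm{rk}(g_2^{-1},\theta) + \deg(g_2)\,\mathrm{rk}(g_1,\theta),$$
which is the particular case of (1.2). Hence, $g_1g_2 \neq 1$, and we can use the formula of Proposition 1.2(a) to compute $t_{g_1,g_2}\big(\phi_{\alpha_1}^{\mathrm{rk}(g_2,\theta)z_1}\otimes\phi_{\alpha_2}^{z_2}\big)$. This gives the above formula for $c^{\alpha}_{\alpha_1,\alpha_2}$.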
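The degree identity quoted in the proof of Proposition 2.7 can be verified by a direct matrix computation. The sketch below assumes the conventions $\deg(g) = c$ and $\mathrm{rk}(g,\theta) = c\theta + d$ for $g$ with rows $(a,b)$ and $(c,d)$; these are inferred from the formulas $\mu(E_{n,m}) = m/(n+m\theta)$ and $v_g = (d,c)$ in this text rather than quoted from the definitions, which lie outside this section.

```latex
% Assumed conventions: g_i = \begin{pmatrix} a_i & b_i \\ c_i & d_i \end{pmatrix},
% \deg(g_i) = c_i, \mathrm{rk}(g_i,\theta) = c_i\theta + d_i.
\begin{align*}
g_1g_2 &= \begin{pmatrix} a_1a_2+b_1c_2 & a_1b_2+b_1d_2 \\ c_1a_2+d_1c_2 & c_1b_2+d_1d_2 \end{pmatrix},
&
g_2^{-1} &= \begin{pmatrix} d_2 & -b_2 \\ -c_2 & a_2 \end{pmatrix},
\end{align*}
% so \deg(g_1g_2) = c_1a_2 + d_1c_2 and \mathrm{rk}(g_2^{-1},\theta) = -c_2\theta + a_2. Then
\begin{align*}
\deg(g_1)\,\mathrm{rk}(g_2^{-1},\theta) + \deg(g_2)\,\mathrm{rk}(g_1,\theta)
  &= c_1(a_2 - c_2\theta) + c_2(c_1\theta + d_1) \\
  &= c_1a_2 + d_1c_2 \;=\; \deg(g_1g_2).
\end{align*}
% The \theta-dependent terms cancel, so the identity holds for every value of \theta.
```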

The main thing to observe in the formula of the above proposition is that $\theta$ enters only through the combination $\deg(g_1)z_2 - \mathrm{rk}(g_1g_2,\theta)\deg(g_2)z_1$.

2.4. Translations, tensorings, pull-back and push-forward with respect to isogenies. To conclude this section let us consider noncommutative analogues of some natural operations on holomorphic vector bundles. For every $(v_1,v_2) \in \mathbb{R}^2$ we denote by $\alpha_{v_1,v_2}$ the automorphism $U_1 \mapsto e(v_1)U_1$, $U_2 \mapsto e(v_2)U_2$ of $A_\theta$. Then for a right $A_\theta$-module $E$ we denote by $t^*_{(v_1,v_2)}E$ the space $E$ with the new right action of $A_\theta$ given by $f \mapsto f\cdot\alpha^{-1}_{v_1,v_2}(a)$, where $a \in A_\theta$. Since $\delta_\tau$ commutes with $\alpha_{v_1,v_2}$, a holomorphic structure $\nabla : E \to E$ induces a holomorphic structure on $t^*_{(v_1,v_2)}E$ (given by the same operator).

Another natural operation is an analogue of tensoring with a topologically trivial line bundle. Clearly, analogues of such line bundles are modules $E_1^\lambda(\theta)$, where $\lambda \in \mathbb{C}$ (recall that $E_1^\lambda(\theta)$ is $A_\theta$ equipped with the holomorphic structure $\nabla_\lambda = \delta_\tau + 2\pi i\lambda$). It is easy to see that for every right $A_\theta$-module $E$ with holomorphic structure $\nabla$ the tensor product $E \otimes_{A_\theta} E_1^\lambda(\theta)$ is isomorphic to $E$ as an $A_\theta$-module, but the holomorphic structure on it (defined in Proposition 2.1(c)) corresponds to $\nabla + 2\pi i\lambda$ under this isomorphism. Thus, we have $E_g^z(\theta) \otimes_{A_\theta} E_1^\lambda \simeq E_g^{z+\lambda}$. Therefore, using Proposition 2.1(b), we obtain an isomorphism
$$E \otimes_{A_\theta} E_1^{\frac{m+n\tau}{\mathrm{rk}(E)}} \simeq E$$
for $m, n \in \mathbb{Z}$.


For a standard holomorphic bundle $E = E_g^z(\theta)$ such that $\deg(E) \neq 0$, the map $f(x) \mapsto e(\mu(E)v_1x)f(x + v_2)$ induces an isomorphism

$$t^*_{(v_1,v_2)}E \to E \otimes E_1^{\mu(E)(\tau v_2 - v_1)} \qquad (2.3)$$

compatible with holomorphic structures. On the other hand, we have isomorphisms
$$t^*_{(v_1,v_2)}E_1^z \simeq E_1^z$$
for all $(v_1,v_2) \in \mathbb{R}^2$ and all $z \in \mathbb{C}$ given by the map $\alpha_{v_1,v_2} : E_1 \to E_1$. Note that if $c = \deg(E) \neq 0$ then taking in (2.3) $(v_1,v_2) = (m/c, n/c)$ with $m, n \in \mathbb{Z}$ we obtain isomorphisms
$$t^*_{(m/c,n/c)}E \to E$$
of holomorphic vector bundles on $T_{\theta,\tau}$. Therefore, combining induced isomorphisms of cohomology spaces with the natural identifications $H^*(t^*_{(m/c,n/c)}E) = H^*(E)$ we obtain a collection of invertible operators
$$U_{(m/c,n/c)} : H^i(E) \to H^i(E),$$
where $i$ is the unique degree such that $H^i(E) \neq 0$. Similar to the commutative case we have the following result.

Lemma 2.8. The operators $U_{(m/c,n/c)}$ define an action on $H^i(E)$ of the Heisenberg group $H$ which is a central extension of $(\mathbb{Z}/c\mathbb{Z})^2$ by $U(1)$, such that $U(1)$ acts in the standard way. This representation is irreducible.

Proof. Unravelling the definitions we get the following formula for $U_{(m/c,n/c)}$ (up to a slight rescaling):
$$U_{(m/c,n/c)}f(x,\alpha) = e\Big(-\frac{m}{c}\alpha\Big)f(x,\alpha - na),$$
where $a \in \mathbb{Z}$ is relatively prime to $c$. Hence, the action of these operators on $H^i(E)$ is equivalent to the standard representation of the Heisenberg group on the space of functions on $\mathbb{Z}/c\mathbb{Z}$.
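The Heisenberg relation behind Lemma 2.8 can be read off from the explicit formula in its proof. In the sketch below the names $A$ and $B$ for the two generators are ours, and the variable $x$ is suppressed since the operators act only on $\alpha \in \mathbb{Z}/c\mathbb{Z}$.

```latex
% Generators (names A, B are ours): A = U_{(1/c,0)}, B = U_{(0,1/c)}, i.e.
%   (A f)(\alpha) = e(-\alpha/c)\, f(\alpha),   (B f)(\alpha) = f(\alpha - a).
\begin{align*}
(ABf)(\alpha) &= e(-\alpha/c)\, f(\alpha - a),\\
(BAf)(\alpha) &= (Af)(\alpha - a) = e\big(-(\alpha-a)/c\big)\, f(\alpha - a)
               = e(a/c)\,(ABf)(\alpha).
\end{align*}
% Thus BA = e(a/c)\,AB and A^c = B^c = 1 on functions on Z/cZ.
% Since a is relatively prime to c, e(a/c) is a primitive c-th root of unity,
% which is the commutation relation of the standard c-dimensional
% representation of the finite Heisenberg group.
```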

There are also analogues of pull-back and push-forward with respect to isogenies in our situation. Namely, for every positive integer $N$ we have a noncommutative morphism $\pi : T_\theta \to T_{N\theta}$ corresponding to the embedding of algebras $A_{N\theta} \to A_\theta$: $U_1 \mapsto U_1^N$, $U_2 \mapsto U_2$. If $\tau \in \mathbb{C}\setminus\mathbb{R}$ then $\pi$ can be considered as a morphism of complex noncommutative tori $T_{\theta,\tau} \to T_{N\theta,N\tau}$. Indeed, the derivation $\delta_\tau$ preserves the subalgebra $A_{N\theta} \subset A_\theta$ and $\delta_\tau|_{A_{N\theta}} = \delta_{N\tau}$. For a projective right $A_\theta$-module $E$ we set $\pi_*(E) = E$, considered as an $A_{N\theta}$-module. Clearly, a holomorphic structure on $E$ induces a holomorphic structure on $\pi_*(E)$ (with respect to $N\tau$). The analogue of pull-back is the operation $E \mapsto \pi^*E = E \otimes_{A_{N\theta}} A_\theta$ on projective $A_{N\theta}$-modules. Here is another description of $\pi^*E$: it is the space of functions $f : \mathbb{Z} \to E$ satisfying the equation $f(n + N) = f(n)U_1^{-1}$ with the action of $A_\theta$ given by
$$fU_1(n) = f(n-1), \qquad fU_2(n) = e(n\theta)f(n)U_2.$$
A holomorphic structure $\nabla : E \to E$ satisfying the Leibnitz rule with respect to $\delta_{N\tau}$ induces a holomorphic structure on $\pi^*E$ by the formula
$$\nabla(f)(n) = \nabla(f(n)) + 2\pi i n\tau f(n),$$
where $n \in \mathbb{Z}$. It is easy to check that $(\pi^*, \pi_*)$ is an adjoint pair of functors between the categories of holomorphic vector bundles on $T_{N\theta,N\tau}$ and $T_{\theta,\tau}$.

Proposition 2.9. (a) For every $m$ relatively prime with $N$ one has an isomorphism
$$\pi_*E^z_{1,m}(\theta) \simeq E^z_{N,m}(N\theta)$$
of holomorphic vector bundles on $T_{N\theta,N\tau}$.
(b) For every $m$ one has an isomorphism
$$\pi^*E^z_{1,m}(N\theta) \simeq E^z_{1,mN}(\theta)$$
of holomorphic vector bundles on $T_{\theta,\tau}$.

Proof. (a) The required isomorphism is given by $f(x,\alpha) \mapsto f(x,N\alpha)$.
(b) We can consider elements of $\pi^*E^z_{1,m}(N\theta)$ as functions $f(x,\alpha,n)$ of $x \in \mathbb{R}$, $\alpha \in \mathbb{Z}/m\mathbb{Z}$ and $n \in \mathbb{Z}$ satisfying $f(x,\alpha,n+N) = f(x + (1 + mN\theta)/m, \alpha + 1, n)$. The isomorphism with $E^z_{1,mN}(\theta)$ sends such a function to
$$g(x,n) = f\Big(x - \frac{n(1 + mN\theta)}{mN}, 0, n\Big),$$
where $x \in \mathbb{R}$ and $n \in \mathbb{Z}/mN\mathbb{Z}$.

3. Derived Categories of Holomorphic Bundles

3.1. Definition and basic properties. Let us fix a complex number $\tau$ such that $\mathrm{Im}(\tau) < 0$. We are going to define a certain dg-category $C(\theta,\tau)$ for the complex noncommutative torus $T_{\theta,\tau}$. The corresponding cohomology category $H^0(C(\theta,\tau))$ can be considered as an analogue of the derived category of holomorphic vector bundles. Objects of the dg-category $C = C(\theta,\tau)$ are of the form $E[n]$, where $E$ is a finitely generated projective right $A_\theta$-module such that $\mathrm{rk}(E) > 0$, equipped with a complex structure (compatible with $\delta_\tau$), and $n$ is an integer. We will often write $E$ instead of $E[0]$. Morphisms are defined as follows:
$$\mathrm{Hom}^\bullet_C(E[n], E'[n']) = \big(\mathrm{Hom}_{A_\theta}(E,E') \to \mathrm{Hom}_{A_\theta}(E,E')\big)[n' - n],$$
where this two-term complex is placed in degrees $n - n'$ and $n - n' + 1$, and the differential sends $\phi \in \mathrm{Hom}_{A_\theta}(E,E')$ to the $A_\theta$-linear map $e \mapsto \nabla'(\phi(e)) - \phi(\nabla(e))$, with $\nabla$ and $\nabla'$ the holomorphic structures on $E$ and $E'$. The composition
$$\mathrm{Hom}^\bullet_C(E_2,E_3) \otimes \mathrm{Hom}^\bullet_C(E_1,E_2) \to \mathrm{Hom}^\bullet_C(E_1,E_3)$$
is defined in the obvious way. We denote by $C^{st} = C^{st}(\theta,\tau)$ the full subcategory of $C$ consisting of objects $E[n]$ such that $E$ is a standard holomorphic bundle. Note that in this case the differential on $\mathrm{Hom}_{A_\theta}(E,E')$ is proportional to the holomorphic structure $\nabla$ on it defined in Corollary 2.3.


Remark. One can also consider the $\mathbb{Z}/2\mathbb{Z}$-graded analogue of $C$. Recall that the $A_{g\theta}$-module $\mathrm{Hom}_{A_\theta}(E,E')$ can be either even or odd (depending on the sign of its rank). We can equip the space $\mathrm{Hom}_C(E,E')$ with the induced $\mathbb{Z}/2\mathbb{Z}$-grading, so that the differential is odd.

We are going to check that in the case $\theta = 0$ the category $H^0C^{st}$ is equivalent to the full subcategory in the derived category $D^b(X)$ of coherent sheaves on the elliptic curve $X = X_\tau = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. Namely, let us denote by $C^{st}(X)$ the dg-category with objects $E[n]$, where $E$ is a stable holomorphic vector bundle on $X$, $n \in \mathbb{Z}$. The spaces $\mathrm{Hom}_{C^{st}(X)}(E_1,E_2)$ are given by the Dolbeault complexes of $E_1^\vee \otimes E_2$. Then this definition is extended to objects of the form $E[n]$ as before. The associated usual category $H^0C^{st}(X)$ is the full subcategory in the derived category $D^b(X)$.

Proposition 3.1. The dg-categories $C^{st}(0,\tau)$ and $C^{st}(X_\tau)$ are equivalent. If $\mathrm{Im}(\tau) < 0$ then under this equivalence $E^z_{n,m}$ corresponds to a stable vector bundle of rank $n$ and degree $m$ on $X_\tau$.

Proof. Let us define real coordinates $(x,y)$ on $\mathbb{C}$ by setting $z = x - \tau y$. Then we can identify the algebra of smooth functions on the torus $X = X_\tau$ with $A_0$ by setting $U_1 = \exp(2\pi i x)$, $U_2 = \exp(2\pi i y)$. Then the derivation $\delta_\tau$ on $A_0$ is proportional to $\bar\partial$. Given a holomorphic vector bundle $V$ over $X$, the space $C^\infty(V)$ of smooth sections of $V$ is a projective $A_0$-module. Furthermore, trivializing the bundle of $(0,1)$-forms on $X$ we can consider the operator $\bar\partial : C^\infty(V) \to C^\infty(V)$ (appropriately rescaled) as a holomorphic structure on the $A_0$-module $C^\infty(V)$. Let us check that if $V$ is stable then $C^\infty(V)$ is a basic $A_0$-module with a standard holomorphic structure. Assume first that $V$ is a line bundle of degree $c \neq 0$. Every such line bundle on $X$ is isomorphic to a line bundle $L_c(u)$, where $u \in \mathbb{C}$, such that $C^\infty(L_c(u))$ is the space of smooth functions $f$ on $\mathbb{C}$ satisfying the equations
$$f(z + 1) = f(z), \qquad f(z - \tau) = e(u - cz)f(z),$$
and the holomorphic structure is given by the $\bar\partial$-operator with respect to the complex variable $z$. Using the real coordinates $(x,y)$ such that $z = x - \tau y$ we can write a section of $L_c(u)$ in the form
$$f(z) = \sum_{n\in\mathbb{Z}} f_n(y)\,e(nx),$$
where $f_n(y + 1) = f_{n+c}(y)\,e(u + c\tau y)$. Setting $g_n(y) = f_n(y)\,e(-c\tau y^2/2 + (c\tau/2 - u)y)$, we note that $g_n(y + 1) = g_{n+c}(y)$, so the function $\phi(y,\alpha) = g_\alpha(y - \alpha/c)$ depends only on $\alpha \bmod c\mathbb{Z}$. Smoothness of $f$ implies that the functions $\phi(y,\alpha)$ belong to the Schwartz space. It is easy to check that the map $f \mapsto \phi(y,\alpha)$ induces an isomorphism of $A_0$-modules $C^\infty(L_c(u)) \simeq E_{1,c}(0)$. Furthermore, the holomorphic structure on $L_c(u)$ corresponds to the holomorphic structure $\nabla_{u-c\tau/2}$ (with respect to the complex structure $\delta_\tau$ on $A_0$). Next, we observe that the construction $V \mapsto C^\infty(V)$ is compatible with the operation $\pi_*$, where $\pi : \mathbb{C}/(d\mathbb{Z} + \tau\mathbb{Z}) \to \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z}) = X$ is a natural isogeny of degree $d$. Since every stable holomorphic vector bundle of rank $d$ has the form $\pi_*L_c(u)$, applying Proposition 2.9 we derive that the space of smooth sections of such a bundle can be identified with a basic module over $A_0$ (and that the holomorphic structure on it is standard).
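The two claims about the coefficients $f_n$ and $g_n$ in the proof of Proposition 3.1 can be checked by expanding the quasi-periodicity condition. This is a sketch using only the relations stated in the proof; the shift $z \mapsto z - \tau$ corresponds to $y \mapsto y + 1$ because $z = x - \tau y$.

```latex
% Step 1: f(z-\tau) = e(u - cz) f(z) with z = x - \tau y, i.e. y -> y+1:
\begin{align*}
\sum_{n} f_n(y+1)\,e(nx)
  &= e(u - cx + c\tau y)\sum_{k} f_k(y)\,e(kx)
   = \sum_{k} f_k(y)\,e(u + c\tau y)\,e\big((k-c)x\big),
\end{align*}
% so comparing coefficients of e(nx): f_n(y+1) = f_{n+c}(y)\,e(u + c\tau y).
% Step 2: with g_n(y) = f_n(y)\,e(-c\tau y^2/2 + (c\tau/2 - u)y),
\begin{align*}
g_n(y+1)
  &= f_{n+c}(y)\,e\Big(u + c\tau y - \tfrac{c\tau}{2}(y^2 + 2y + 1)
       + \big(\tfrac{c\tau}{2} - u\big)(y+1)\Big) \\
  &= f_{n+c}(y)\,e\Big(-\tfrac{c\tau}{2}y^2 + \big(\tfrac{c\tau}{2} - u\big)y\Big)
   = g_{n+c}(y).
\end{align*}
% Consequently g_{\alpha+c}(y - \alpha/c - 1) = g_\alpha(y - \alpha/c), which is why
% \phi(y,\alpha) = g_\alpha(y - \alpha/c) depends only on \alpha mod c.
```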


Let us fix a basic right $A_\theta$-module $E_0 = E_g(\theta)$ and equip it with a standard holomorphic structure $\nabla_0$. Then as we have shown in Proposition 2.1, the map $E \mapsto E \otimes_{A_{g\theta}} E_0$ extends to a functor from the category of holomorphic bundles over $T_{g\theta}$ to the category of holomorphic bundles over $T_\theta$.

Proposition 3.2. The above functor extends to an equivalence of dg-categories $C(g\theta,\tau) \xrightarrow{\sim} C(\theta,\tau)$ (resp., $C^{st}(g\theta,\tau) \xrightarrow{\sim} C^{st}(\theta,\tau)$).

Proof. Indeed, for every pair of holomorphic vector bundles $E$, $E'$ over $T_{g\theta}$ we have a natural map
$$\mathrm{Hom}_{A_{g\theta}}(E,E') \to \mathrm{Hom}_{A_\theta}(E \otimes_{A_{g\theta}} E_0, E' \otimes_{A_{g\theta}} E_0)$$
commuting with the differentials. The inverse $A_\theta$-$A_{g\theta}$-bimodule to $E_0$ will define an inverse map.

The following result is an analogue of Serre duality for the category $H^*(C^{st}(\theta,\tau))$.

Proposition 3.3. (a) The pairing (1.3) induces a perfect pairing
$$H^i\big(E_{g^{-1}}^{-\mathrm{rk}(g,\theta)z}(g\theta)\big) \otimes H^{1-i}\big(E_g^z(\theta)\big) \to \mathbb{C}$$
for $i = 0, 1$.
(b) There is a canonical functorial isomorphism
$$H^i\,\mathrm{Hom}_C(E,E') \simeq H^{1-i}\,\mathrm{Hom}_C(E',E)^*,$$
where $E, E' \in C^{st}$.

Proof. (a) The identity
$$\frac{1}{\mathrm{rk}(g,\theta)}\, b\big(\nabla_{-\mathrm{rk}(g,\theta)z}(f_1)\otimes f_2\big) + b\big(f_1\otimes\nabla_z(f_2)\big) = 0,$$
where $f_1 \in E_{g^{-1}}(g\theta)$, $f_2 \in E_g(\theta)$, ensures that $b$ descends to the required pairing. We already know that the dimensions of both spaces are the same. To prove that this pairing is non-degenerate it suffices to check non-degeneracy of the pairing
$$\ker(d/dx + x : S(\mathbb{R}) \to S(\mathbb{R})) \otimes \mathrm{coker}(d/dx - x : S(\mathbb{R}) \to S(\mathbb{R})) \to \mathbb{C}$$
induced by $(f_1,f_2) = \int_{x\in\mathbb{R}} f_1(x)f_2(x)\,dx$. But both spaces are generated by $\exp(-x^2/2)$ and $(\exp(-x^2/2), \exp(-x^2/2)) \neq 0$.
(b) This follows from (a) and from the isomorphisms of Corollary 2.3(b).

3.2. Comparison of the categories for different $\theta$. Proposition 2.7 gives explicit formulas for the structure constants of the composition law in the categories $H^*C^{st}(\theta,\tau)$. Looking at these formulas one can make the following observation. Let us denote by $C^{st}(\theta,\tau)_{\theta'}$ the full dg-subcategory of $C^{st}(\theta,\tau)$ formed by objects $E_g^z(\theta)[n]$ with $\mathrm{rk}(g,\theta') \neq 0$.

Theorem 3.4. For every pair $(\theta,\theta')$ the graded categories $H^*C^{st}(\theta,\tau)_{\theta'}$ and $H^*C^{st}(\theta',\tau)_\theta$ are equivalent.


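Proof. The main ingredient of the proof is the following computation of the composition in the cohomology category of $C^{st}(\theta) = C^{st}(\theta,\tau)$. Assume that $\deg(g_2g_1^{-1}) > 0$, $\deg(g_3g_2^{-1}) > 0$, and $\mathrm{rk}(g_i,\theta) > 0$ for $i = 1, 2, 3$. Then the composition
$$H^0\,\mathrm{Hom}\big(E_{g_2}^{z_2}(\theta), E_{g_3}^{z_3}(\theta)\big) \otimes H^0\,\mathrm{Hom}\big(E_{g_1}^{z_1}(\theta), E_{g_2}^{z_2}(\theta)\big) \to H^0\,\mathrm{Hom}\big(E_{g_1}^{z_1}(\theta), E_{g_3}^{z_3}(\theta)\big)$$
can be identified using (1.7) and Corollary 2.3 with the map
$$H^0\Big(E_{g_3g_2^{-1}}^{\mathrm{rk}(g_2,\theta)(z_3-z_2)}(g_2\theta)\Big) \otimes H^0\Big(E_{g_2g_1^{-1}}^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}(g_1\theta)\Big) \to H^0\Big(E_{g_3g_1^{-1}}^{\mathrm{rk}(g_1,\theta)(z_3-z_1)}(g_1\theta)\Big)$$
induced by the pairing $t_{g_3g_2^{-1},\,g_2g_1^{-1}}$. Applying Proposition 2.7 we get
$$t_{g_3g_2^{-1},\,g_2g_1^{-1}}\Big(\phi_\alpha^{\mathrm{rk}(g_2,\theta)(z_3-z_2)} \otimes \phi_\beta^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}\Big) = \sum_\gamma c^\gamma_{\alpha,\beta}\,\phi_\gamma^{\mathrm{rk}(g_1,\theta)(z_3-z_1)},$$
where
$$c^\gamma_{\alpha,\beta} = \sum_{m\in I_{g_3g_2^{-1},\,g_2g_1^{-1}}(\alpha,\beta,\gamma)} e\Big(\frac{-\tau m^2/2 + \big[\deg(g_3g_2^{-1})\,\mathrm{rk}(g_1,\theta)(z_2-z_1) - \deg(g_2g_1^{-1})\,\mathrm{rk}(g_3,\theta)(z_3-z_2)\big]m}{\deg(g_3g_2^{-1})\deg(g_2g_1^{-1})\deg(g_3g_1^{-1})}\Big).$$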

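Therefore, $c^\gamma_{\alpha,\beta}$ depends on $\theta$ only through the expression
$$\deg(g_3g_2^{-1})\,\mathrm{rk}(g_1,\theta)(z_1 - z_2) + \deg(g_2g_1^{-1})\,\mathrm{rk}(g_3,\theta)(z_3 - z_2).$$
Applying identity (1.2) we can rewrite this as
$$\deg(g_3g_2^{-1})\,\mathrm{rk}(g_1,\theta)z_1 - \deg(g_3g_1^{-1})\,\mathrm{rk}(g_2,\theta)z_2 + \deg(g_2g_1^{-1})\,\mathrm{rk}(g_3,\theta)z_3,$$
so we see that this expression for data $(z_1,z_2,z_3,\theta)$ is equal to a similar expression for $(z_1',z_2',z_3',\theta')$, where $z_i' = z_i\,\mathrm{rk}(g_i,\theta)/\mathrm{rk}(g_i,\theta')$.

Now we can construct the equivalence functor $F = F_{\theta,\theta'} : H^*(C^{st}(\theta)_{\theta'}) \to H^*(C^{st}(\theta')_\theta)$. Without loss of generality we can assume that $\theta < \theta'$. For every $g \in SL_2(\mathbb{Z})$ let us denote $\lambda(g) = \mathrm{rk}(g,\theta)/\mathrm{rk}(g,\theta')$. Then $F$ is defined on objects by setting
$$F(E_g^z(\theta)[n]) = \begin{cases} E_g^{\lambda(g)z}(\theta')[n], & \mathrm{rk}(g,\theta') > 0, \\ E_{-g}^{\lambda(g)z}(\theta')[n-1], & \mathrm{rk}(g,\theta') < 0. \end{cases}$$
We have to define isomorphisms
$$F_{g_1,g_2} : H^i\,\mathrm{Hom}_{C(\theta)}\big(E_{g_1}^{z_1}(\theta), E_{g_2}^{z_2}(\theta)\big) \to H^{i+\epsilon_1-\epsilon_2}\,\mathrm{Hom}_{C(\theta')}\Big(E_{(-1)^{\epsilon_1}g_1}^{\lambda(g_1)z_1}(\theta'), E_{(-1)^{\epsilon_2}g_2}^{\lambda(g_2)z_2}(\theta')\Big)$$
compatible with the composition, where $\epsilon_i \in \{0,1\}$ are defined by $(-1)^{\epsilon_i} = \mathrm{sign}\,\mathrm{rk}(g_i,\theta')$. Assume first that $\deg(g_2g_1^{-1}) > 0$. Then we claim that only the following three cases can occur: (i) $\mathrm{rk}(g_1,\theta') > 0$, $\mathrm{rk}(g_2,\theta') > 0$; (ii) $\mathrm{rk}(g_1,\theta') < 0$,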


$\mathrm{rk}(g_2,\theta') > 0$; (iii) $\mathrm{rk}(g_1,\theta') < 0$, $\mathrm{rk}(g_2,\theta') < 0$. Indeed, this is clear from the following geometric interpretation of the conditions $\mathrm{rk}(g,\theta) > 0$, $\deg(g_2g_1^{-1}) > 0$. Consider plane vectors $v_\theta = (-\theta, 1)$ and $v_g = (d,c)$, where $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $\mathrm{rk}(g,\theta) > 0$ means that $v_g$ belongs to the half-plane $H_\theta$ consisting of vectors $v$ such that the pair $(v, v_\theta)$ is positively oriented. The assumption $\theta < \theta'$ is equivalent to the condition $-v_{\theta'} \in H_\theta$. On the other hand, the assumption $\deg(g_2g_1^{-1}) > 0$ means that the pair $(v_{g_1}, v_{g_2})$ is positively oriented. Since both vectors $v_{g_1}$ and $v_{g_2}$ belong to the half-plane $H_\theta$, our claim follows.

In the case (i) we define the map
$$F_{g_1,g_2} : H^0\Big(E_{g_2g_1^{-1}}^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}(g_1\theta)\Big) \to H^0\Big(E_{g_2g_1^{-1}}^{\mathrm{rk}(g_1,\theta')(\lambda(g_2)z_2-\lambda(g_1)z_1)}(g_1\theta')\Big)$$
by setting
$$F_{g_1,g_2}\Big(\phi_\alpha^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}\Big) = \phi_\alpha^{\mathrm{rk}(g_1,\theta')(\lambda(g_2)z_2-\lambda(g_1)z_1)}$$
for all $\alpha \in \mathbb{Z}/\deg(g_2g_1^{-1})\mathbb{Z}$. In the case (ii) we define the map
$$F_{g_1,g_2} : H^0\Big(E_{g_2g_1^{-1}}^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}(g_1\theta)\Big) \to H^1\Big(E_{-g_2g_1^{-1}}^{-\mathrm{rk}(g_1,\theta')(\lambda(g_2)z_2-\lambda(g_1)z_1)}(-g_1\theta')\Big)$$
by the formula
$$F_{g_1,g_2}\Big(\phi_\alpha^{\mathrm{rk}(g_1,\theta)(z_2-z_1)}\Big) = \psi_{-d_{12}\alpha}^{-\mathrm{rk}(g_1,\theta')(\lambda(g_2)z_2-\lambda(g_1)z_1)},$$
where $g_2g_1^{-1} = \begin{pmatrix} a_{12} & b_{12} \\ c_{12} & d_{12} \end{pmatrix}$, and $(\psi_\alpha^{-\mathrm{rk}(g,\theta)z},\ \alpha \in \mathbb{Z}/c\mathbb{Z})$ denotes the basis of $H^1\big(E_{g^{-1}}^{-\mathrm{rk}(g,\theta)z}(g\theta)\big)$ dual to the basis $(\phi_\alpha^z)$ of $H^0(E_g^z(\theta))$ with respect to the natural pairing (see Proposition 3.3). In the case (iii) the definition is similar to the case (i). We can easily extend these definitions to arbitrary morphisms using the compatibility with Serre duality.

It remains to check the compatibility of this functor with the composition
$$H^j\,\mathrm{Hom}_{C(\theta)}\big(E_{g_2}^{z_2}(\theta), E_{g_3}^{z_3}(\theta)\big) \otimes H^i\,\mathrm{Hom}_{C(\theta)}\big(E_{g_1}^{z_1}(\theta), E_{g_2}^{z_2}(\theta)\big) \to H^{i+j}\,\mathrm{Hom}_{C(\theta)}\big(E_{g_1}^{z_1}(\theta), E_{g_3}^{z_3}(\theta)\big)$$
and the composition of the corresponding objects in $C^{st}(\theta')$. By Serre duality it suffices to consider the case when $\deg(g_2g_1^{-1}) > 0$, $\deg(g_3g_2^{-1}) > 0$ and either $\mathrm{rk}(g_i,\theta') > 0$ for $i = 1, 2, 3$, or $\mathrm{rk}(g_1,\theta') < 0$, $\mathrm{rk}(g_2,\theta') > 0$, $\mathrm{rk}(g_3,\theta') > 0$. In the former case this compatibility follows from our observation in the beginning of the proof. In the latter case we have to use some cyclic symmetry of the structure constants $c^\gamma_{\alpha,\beta}$ appearing above. To reflect the dependence of these constants on various data let us change the notation to $c^\gamma_{\alpha,\beta}(g_1,g_2,g_3;z_1,z_2,z_3;\theta)$. The required compatibility follows from the equality
$$c^{\gamma}_{\alpha,\beta}(g_1,g_2,g_3;z_1,z_2,z_3;\theta) = c^{-d_{12}\beta}_{-d_{13}\gamma,\,\alpha}\big(g_2,g_3,-g_1;\lambda(g_2)z_2,\lambda(g_3)z_3,\lambda(g_1)z_1;\theta'\big),$$
where $g_jg_i^{-1} = \begin{pmatrix} a_{ij} & b_{ij} \\ c_{ij} & d_{ij} \end{pmatrix}$. This equality can be easily checked using Lemma 1.3(c).
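The observation at the start of the proof of Theorem 3.4, that the structure constants see $\theta$ only through one combination, can be unpacked as follows. The abbreviations $D_{ji}$ and $r_i$ are ours, and the three-term relation below is the form in which we assume identity (1.2) is being applied (it is a Plücker relation for the plane vectors $v_{g_i} = (d_i, c_i)$ and $v_\theta = (-\theta,1)$ used in the proof).

```latex
% Abbreviations (ours): D_{ji} = \deg(g_j g_i^{-1}) = \det(v_{g_i}, v_{g_j}),
%                       r_i = \mathrm{rk}(g_i,\theta) = \det(v_{g_i}, v_\theta).
% Three-term (Pluecker) relation for four plane vectors:
\begin{align*}
D_{32}\,r_1 + D_{21}\,r_3 &= D_{31}\,r_2 .
\end{align*}
% Expanding the expression through which \theta enters:
\begin{align*}
D_{32}\,r_1(z_1 - z_2) + D_{21}\,r_3(z_3 - z_2)
  &= D_{32}\,(r_1z_1) - (D_{32}\,r_1 + D_{21}\,r_3)\,z_2 + D_{21}\,(r_3z_3) \\
  &= D_{32}\,(r_1z_1) - D_{31}\,(r_2z_2) + D_{21}\,(r_3z_3).
\end{align*}
% The degrees D_{ji} do not depend on \theta, and with
% z_i' = z_i\,\mathrm{rk}(g_i,\theta)/\mathrm{rk}(g_i,\theta') one has
% \mathrm{rk}(g_i,\theta')\,z_i' = r_i z_i, so the expression is unchanged
% when (z_1,z_2,z_3,\theta) is replaced by (z_1',z_2',z_3',\theta').
```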


Corollary 3.5. If $\theta$ and $\theta'$ are irrational then the graded categories $H^*(C^{st}(\theta,\tau))$ and $H^*(C^{st}(\theta',\tau))$ are equivalent.

In the case when $\theta$ is irrational and $\theta' = 0$ we can extend the composition of the above functor $H^0C^{st}(\theta)_{\theta'} \to H^0C^{st}(0)$ with the embedding $H^0C^{st}(0) \subset D^b(X_\tau)$ (constructed in Proposition 3.1) to the entire category $H^0C^{st}(\theta)$. Namely, assuming that $\theta < 0$ we extend the functor $F$ above to $E_g^z(\theta)$, where $g = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, by setting $F(E_g^z(\theta)) = O_{-\theta z}$ (this is the structure sheaf of the point $-\theta z \bmod (\mathbb{Z} + \tau\mathbb{Z}) \in X_\tau$). The above construction of $F$ on morphisms then can be modified appropriately to prove the equivalence of the category $H^*(C^{st}(\theta,\tau))$ with some full subcategory of the derived category of coherent sheaves on the elliptic curve $\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$. Below we will give a more natural construction of the composition of this equivalence with the Fourier-Mukai transform.

3.3. Analogue of the Fourier-Mukai transform. As before we fix a complex structure $\tau$ on $T_\theta$ such that $\mathrm{Im}(\tau) < 0$. The Leibnitz rule implies that for every holomorphic structure $\nabla$ on a right $A_\theta$-module $E$ the cohomology spaces $H^*(E, \nabla + 2\pi i z\,\mathrm{id})$, where $z \in \mathbb{C}$, depend only on $z \bmod (\mathbb{Z} + \tau\mathbb{Z})$. So one can try to define an analogue of the Fourier-Mukai transform of $(E,\nabla)$ by descending this family of spaces to the elliptic curve $X = X_\tau = \mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$.

For every $A_\theta$-module $E$ let us consider a sheaf of $O$-modules $E_{\mathbb{C}}$ on $\mathbb{C}$ such that sections of $E_{\mathbb{C}}$ over an open set $U \subset \mathbb{C}$ are $E$-valued holomorphic functions on $U$. We can define two morphisms $\rho_v : t_v^*E_{\mathbb{C}} \to E_{\mathbb{C}}$, where $v = 1$ or $\tau$, and $t_v : \mathbb{C} \to \mathbb{C}$ denotes the translation $z \mapsto z + v$. Namely, for a local section $f$ of $E_{\mathbb{C}}$ we set
$$\rho_v(f)(z) = f(z + v)\,U^v,$$
where $U^v = U_1$ or $U_2$ for $v = \tau$ or $1$ respectively. Clearly, we have
$$\rho_1 \circ t_1^*\rho_\tau = \exp(2\pi i\theta)\,\rho_\tau \circ t_\tau^*\rho_1.$$
Changing $\rho_\tau$ to
$$\tilde\rho_\tau = \exp(-2\pi i\theta z)\,\rho_\tau$$
we will obtain
$$\rho_1 \circ t_1^*\tilde\rho_\tau = \tilde\rho_\tau \circ t_\tau^*\rho_1.$$
Let us denote by $E_X$ the sheaf of $O$-modules on the elliptic curve $X$ (with respect to the classical topology) obtained from $E_{\mathbb{C}}$ using the $\mathbb{Z}^2$-action given by $\rho_1$ and $\tilde\rho_\tau$. Now assume in addition that $E$ is equipped with a holomorphic structure $\nabla$. Then we define $S(E) = S(E,\nabla)$ to be the complex
$$E_X \xrightarrow{\ d_\nabla\ } E_X$$
of $O$-modules over $X$, where $d_\nabla$ is defined in terms of the corresponding operator on $E_{\mathbb{C}}$: $d_\nabla(f)(z) = \nabla(f(z)) + 2\pi i z f(z)$. We leave it to the reader to check that this operator is compatible with the $\mathbb{Z}^2$-action, so it descends to $E_X$. The following basic property is one of the reasons to consider $S$ as an analogue of the Fourier transform.
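The effect of the correcting factor $\exp(-2\pi i\theta z)$ can be checked on sections. In the sketch below the operators $R_1$, $R_\tau$, $\tilde R_\tau$ are our names for the maps on $E$-valued functions induced by $\rho_1$, $\rho_\tau$, $\tilde\rho_\tau$, and the commutation convention $U_1U_2 = e(\theta)U_2U_1$ is an assumption chosen to match the displayed relation between $\rho_1$ and $\rho_\tau$.

```latex
% Operators on E-valued functions (names ours):
%   (R_1 f)(z) = f(z+1)\,U_2,  (R_\tau f)(z) = f(z+\tau)\,U_1,
%   (\tilde R_\tau f)(z) = e(-\theta z)\, f(z+\tau)\,U_1.
% Assumed convention: U_1U_2 = e(\theta)\,U_2U_1.
\begin{align*}
(R_1R_\tau f)(z) &= f(z+1+\tau)\,U_1U_2 = e(\theta)\,f(z+\tau+1)\,U_2U_1
                  = e(\theta)\,(R_\tau R_1 f)(z),\\
(R_1\tilde R_\tau f)(z) &= e\big(-\theta(z+1)\big)\,f(z+1+\tau)\,U_1U_2
                  = e(-\theta z)\,f(z+\tau+1)\,U_2U_1
                  = (\tilde R_\tau R_1 f)(z).
\end{align*}
% The extra factor e(-\theta) produced by the shift z -> z+1 cancels the
% phase e(\theta) coming from the commutation relation.
```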


Lemma 3.6. For every λ ∈ C, (v1 , v2 ) ∈ R2 one has S(E ⊗ E1λ ) tλ∗ S(E), ∗ S(t(v E) P(v1 ,v2 ) ⊗ S(E), 1 ,v2 ) where P(v1 ,v2 ) is the holomorphic bundle on X corresponding to the following action of Z + τ Z on the trivial bundle OC : ρτ (f )(z) = e(−v1 )f (z + τ ), ρ1 (f )(z) = e(−v2 )f (z + 1). The proof is the direct application of the definitions. Now let us compute the Fourier transform of standard objects. 0 (θ ) be a basic module corresponding to a pair (n, m) such Proposition 3.7. (a) Let En,m that m > 0 (and as usual mθ + n > 0) equipped with the complex structure ∇ 0 . Then 0 (θ )) is quasi-isomorphic to the vector bundle V the complex S(En,m m,−n on X obtained m with respect to the following action of Z + τ Z: by the descent from the vector bundle OC n n τ ρτ (f )(z, α) = f (z + τ, α − 1) e z+ , ρ1 (f )(z, α) = f (z + 1, α) e − α , m 2µ m m as functions f (z, α) in z ∈ C and α ∈ Z/mZ, where we think about sections of OC which are holomorphic in z. The vector bundle Vm,−n has degree −n and is stable. 0 (θ )) is quasi-isomorphic to V [−1], where (b) If m < 0 (and mθ + n > 0) then S(En,m V is a stable vector bundle on X. z (c) The complex S(E1,0 (θ )) is quasi-isomorphic to O−z [−1], where O−z is the structure sheaf of the point −z mod (Z + τ Z) in X. 0 (θ ). A version of Proposition 2.5 shows that the morphism Proof. (a) Set E = En,m d∇ 0 : EC → EC is surjective and its kernel is a vector bundle on C. Moreover, we have an explicit isomorphism of the kernel with the rank-m trivial bundle on C: for a holomorphic function f (z, α) on U × Z/mZ, where U ⊂ C is an open subset, the function f (z, α) e(−τ µ(E)x 2 /2 − zx) defines a section of ker(d∇ 0 ) over U . It is easy to check that the Z2 -action on ker(d∇ 0 ) induced by the action of U1 and U2 coincides with the action defining Vm,−n . 
The fact that the vector bundle Vm,−n has degree −n and is stable is well-known (note that the sign with nz/m in the definition of ρτ is opposite to the standard one due to our assumption Im(τ ) < 0). (b) The proof of Proposition 2.5 shows that the morphism d∇ 0 : EC → EC has no kernel and its cokernel is a bundle V of rank −m on C. Let g ∈ SL2 (Z) be a matrix with (m, n) as the second row. Proposition 3.3 gives a family of isomorphisms − rk(g,θ)z Vz∗ H 0 Eg −1 (gθ ) .

The identifications $V_z \simeq V_{z+1}$ and $V_z \simeq V_{z+\tau}$ correspond to similar isomorphisms of the $H^0$-spaces induced by the left action of generators of $A_\theta$ on $E_{g^{-1}}(g\theta)$. In other words, the bundle dual to $V$ (or rather its descent to $X$) is isomorphic to the Fourier-Mukai transform of $E^0_{g^{-1}}(g\theta)$ considered as a left $A_\theta$-module. Therefore, our claim follows from the analogue of part (a) for the Fourier-Mukai transform of left $A_\theta$-modules. (c) The formula for the operator $\nabla_z : A_\theta \to A_\theta$ shows that the complex $S(E^{z_0}_{1,0}(\theta))$ is exact outside the point $-z_0 \bmod (\mathbb{Z} + \tau\mathbb{Z}) \in X$ and that locally near $-z_0$ it is quasi-isomorphic to the complex $\mathcal{O} \xrightarrow{\,z_0 + z\,} \mathcal{O}$.


3.4. Equivalences. Let us call an object $E \in D^b(X)$ stable if $\mathrm{Hom}(E, E) = \mathbb{C}$. Equivalently, $E \simeq V[n]$ for some $n \in \mathbb{Z}$, where $V$ is either a stable vector bundle or a structure sheaf of a point.

Theorem 3.8. Assume that $\theta$ is irrational. Then the functor $E \mapsto S(E)$ extends to an equivalence of $H^0\mathcal{C}^{st}(\theta,\tau)$ with the full subcategory of $D^b(X)$ consisting of stable objects.

Proof. First of all, we extend $S$ to all objects of $\mathcal{C}^{st}(\theta,\tau)$ by requiring that it commutes with the shift $E \mapsto E[1]$. It is clear that a holomorphic map $E \to E'$ induces a map of complexes $S(E) \to S(E')$. On the other hand, a morphism $E \to E'[1]$ in $H^0\mathcal{C}^{st}(\theta,\tau)$ can be interpreted as an extension class $0 \to E' \to F \to E \to 0$ (see Proposition 2.4). Such an extension induces an exact sequence of complexes $0 \to S(E') \to S(F) \to S(E) \to 0$, hence a morphism $S(E) \to S(E')[1]$ in the derived category. Clearly, these maps define a functor from $H^0\mathcal{C}^{st}(\theta,\tau)$ to $D^b(X)$. Using Proposition 3.7 we see that $S(E)$ is a stable object for every $E \in \mathcal{C}^{st}(\theta,\tau)$. Let us check that the natural maps

\[ \mathrm{Hom}^i(E, E') \to \mathrm{Hom}^i_{D^b(X)}(S(E), S(E')) \tag{3.1} \]

are isomorphisms. The idea of the proof is that both these Hom-spaces are irreducible representations of the same Heisenberg group and that the map between them is a map of Heisenberg modules. Recall that the Heisenberg group action on $\mathrm{Hom}^i(E, E') = H^i(\mathrm{Hom}_{A_\theta}(E, E'))$ was constructed in Lemma 2.8. On the other hand, the Heisenberg group action on $\mathrm{Hom}^i(A, B)$ for every pair of stable objects in $D^b(X)$ (such that $A \not\simeq B[n]$ and $\mathrm{Hom}^i(A, B) \neq 0$) is constructed as follows. The functors of translation by points of $X$ and of tensoring by topologically trivial line bundles on $X$ generate the action of the Heisenberg groupoid $H$ on $D^b(X)$ which is an extension of $X \times X$ by the groupoid of $\mathbb{G}_m$-torsors (see [12]). Now for every stable object $A$ in $D^b(X)$ let $K_A \subset X \times X$ be the subgroup of translations and tensorings preserving $A$ up to an isomorphism. Then the abelian group $K_A \cap K_B$ is finite and has a natural Heisenberg extension acting on $\mathrm{Hom}^i(A, B)$. Indeed, applying an automorphism of $D^b(X)$ we can assume that $A = \mathcal{O}_X$, in which case this construction is standard (note that $K_{\mathcal{O}_X} = X \times \{0\} \subset X \times X$ is the subgroup of all translations). The compatibility of $S$ with Heisenberg group actions on Hom's now follows easily from Lemma 3.6. It remains to check that all the maps (3.1) are non-zero. The commutative diagram

\[
\begin{array}{ccc}
\mathrm{Hom}^1(E', E) \otimes \mathrm{Hom}(E, E') & \longrightarrow & \mathrm{Hom}^1(E, E) \\
\downarrow & & \downarrow \\
\mathrm{Hom}^1(S(E'), S(E)) \otimes \mathrm{Hom}(S(E), S(E')) & \longrightarrow & \mathrm{Hom}^1(S(E), S(E))
\end{array}
\tag{3.2}
\]

whose rows are perfect pairings, shows that it is enough to prove nonvanishing of the maps $\mathrm{Hom}^1_{\mathcal{C}^{st}(\theta,\tau)}(E, E) \to \mathrm{Hom}^1_{D^b(X)}(S(E), S(E))$. This can be done by direct computation. Let us assume that $E = E^0_{n,m}(\theta)$, where $(n, m) \neq (1, 0)$ (we leave the case $E = A_\theta$ to the reader). Recall that according to Proposition 2.4 a generator of $\mathrm{Hom}^1(E, E)$ gives rise to a holomorphic extension

0 → E → E (2) → E → 0.


It is easy to see that E (2) = E ⊕ E with the holomorphic structure given by ∇0 1 . 0 ∇0 Let us assume that m = deg(E) > 0 (the case deg(E) < 0 is considered similarly). Then S(E) is the bundle Vm,−n described in Proposition 3.7(a). Local sections of S(E (2) ) over an open subset of C are described by E ⊕ E-valued functions of the form ((f1 (z, α) + f2 (z, α)x) e(−τ µ(E)x 2 /2 − zx), f2 (z, α) e(−τ µ(E)x 2 /2 − zx)), where f1 (z, α) and f2 (z, α) are holomorphic in z. The Z2 -action on pairs (f1 (z, α), f2 (z, α)) is given by the rule τ n z+ , ρτ (f1 , f2 ) = (f1 (z + τ, α − 1) − µf2 (z + τ, α − 1), f2 (z + τ, α − 1)) e m 2µ n ρ1 (f1 , f2 ) = (f1 (z + 1, α), f2 (z + 1, α)) e − α , m where µ = µ(E). This easily implies that S(E (2) ) is a non-split extension of Vm,−n by itself.

Remark. It is not difficult to see that the restriction of the functor $S$ to $H^0\mathcal{C}^{st}(\theta,\tau)_0$ is the composition of the functor $F_{\theta,0}$ from Theorem 3.4 with the embedding of $H^0\mathcal{C}^{st}(0,\tau)$ in $D^b(X)$ given in Proposition 3.1, followed by the standard Fourier-Mukai transform on $D^b(X)$ (where the elliptic curve $X$ is identified with its dual).

Now let us explain the relation between the constructed equivalences and the Morita equivalences. Assume that $\theta < 0$, $\theta' = g\theta < 0$ and $\mathrm{rk}(g,\theta) > 0$. Then the Morita functor $M_{g,\theta} : \mathcal{C}(g\theta,\tau) \to \mathcal{C}(\theta,\tau)$ is given by $E \mapsto E \otimes_{A_{g\theta}} E^0_g(\theta)$. Hence, it maps $E^z_h(\theta')$ to $E^u_{hg}(\theta)$, where $u = z/\mathrm{rk}(g,\theta)$. On the other hand, the functor $F_{\theta,0}$ sends an object $E^u_{hg}(\theta)$ to $E^{u\,\mathrm{rk}(hg,\theta)/\mathrm{rk}(hg,0)}_{hg}(0)$ or $E^{u\,\mathrm{rk}(hg,\theta)/\mathrm{rk}(hg,0)}_{-hg}(0)[1]$ depending on whether $\mathrm{rk}(hg,0)$ is positive or negative. Similarly, $F_{\theta',0}(E^z_h(\theta'))$ is either $E^{z\,\mathrm{rk}(h,\theta')/\mathrm{rk}(h,0)}_{h}(0)$ or $E^{z\,\mathrm{rk}(h,\theta')/\mathrm{rk}(h,0)}_{-h}(0)[1]$. Note that $u\,\mathrm{rk}(hg,\theta) = z\,\mathrm{rk}(h,\theta')$. It follows that the autoequivalence $G = F_{\theta,0} \circ M_{g,\theta} \circ F_{\theta',0}^{-1}$ of the subcategory of stable objects of $D^b(X)$ sends $E^t_h(0)$ to $E^{t\,\mathrm{rk}(h,0)/\mathrm{rk}(hg,0)}_{hg}(0)$ or to $E^{t\,\mathrm{rk}(h,0)/\mathrm{rk}(hg,0)}_{-hg}(0)[1]$ depending on the sign of $\mathrm{rk}(hg,0)$. The autoequivalence $G$ is induced by some autoequivalence of the derived category $D^b(X)$ that acts as the matrix $g^t$ on the column vector $(\deg, \mathrm{rk})$.
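The identity relating $u$ and $z$ in the remark above can be checked directly. The following is only a sketch: it assumes the convention (not restated in this section) that $\mathrm{rk}(g,\theta) = c\theta + d$ for a matrix $g$ with second row $(c, d)$, and that $g\theta$ denotes the fractional-linear action. The entry names $a, b, c, d, a', b', c', d'$ are introduced here for illustration.

```latex
% Assumed conventions: g = (a b; c d), h = (a' b'; c' d') in SL_2(Z),
% rk(g,\theta) = c\theta + d,  g\theta = (a\theta + b)/(c\theta + d).
\begin{align*}
\operatorname{rk}(h, g\theta)\,\operatorname{rk}(g,\theta)
  &= \Bigl(c'\,\frac{a\theta+b}{c\theta+d} + d'\Bigr)(c\theta+d) \\
  &= (c'a + d'c)\,\theta + (c'b + d'd)
   = \operatorname{rk}(hg,\theta), \\
u\,\operatorname{rk}(hg,\theta)
  &= \frac{z}{\operatorname{rk}(g,\theta)}\,
     \operatorname{rk}(h,g\theta)\,\operatorname{rk}(g,\theta)
   = z\,\operatorname{rk}(h,g\theta).
\end{align*}
```

Under these assumptions the first line is the usual cocycle property of the automorphy factor, and the second line is the stated identity, since the second argument $g\theta$ is the parameter of the source category of the Morita functor.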

3.5. Tilted t-structures. Let C be an abelian category. Recall that a torsion pair in C is a pair of full subcategories stable under extensions (C1 , C2 ), such that for every A1 ∈ C1 , A2 ∈ C2 one has HomC (A1 , A2 ) = 0 and every object A ∈ C fits into an exact sequence 0 → A1 → A → A2 → 0 with A1 ∈ C1 , A2 ∈ C2 . In this situation the above exact sequence is unique up to a unique isomorphism (hence, it depends functorially on A).


A torsion pair $p = (\mathcal{C}_1, \mathcal{C}_2)$ gives rise to the following t-structure in the derived category $D(\mathcal{C})$:

\[ D^{p,\le 0} := \{K \in D(\mathcal{C}) : H^{>0}(K) = 0,\ H^0(K) \in \mathcal{C}_1\}, \qquad D^{p,\ge 1} := \{K \in D(\mathcal{C}) : H^{<0}(K) = 0,\ H^0(K) \in \mathcal{C}_2\}. \]

The heart of this t-structure $\mathcal{C}^p := D^{p,\le 0} \cap D^{p,\ge 0}$ is equipped with the torsion pair $(\mathcal{C}_2[1], \mathcal{C}_1)$. In the case when $\mathcal{C}$ is the category $\mathrm{Coh}(X)$ of coherent sheaves on an algebraic curve $X$ over a field $k$, there are natural torsion pairs $(\mathrm{Coh}_{>\theta}(X), \mathrm{Coh}_{\le\theta}(X))$ and $(\mathrm{Coh}_{\ge\theta}(X), \mathrm{Coh}_{<\theta}(X))$ in $\mathrm{Coh}(X)$ associated with every real number $\theta$ (for irrational $\theta$ these two pairs are the same). Namely, for a subset $I \subset \mathbb{R} \cup \{+\infty\}$ we denote by $\mathrm{Coh}_I(X)$ the smallest full subcategory in $\mathrm{Coh}(X)$ closed under extensions, containing all simple sheaves with slope $\mu \in I$ (where the slope of a stable bundle is defined as the ratio $\deg/\mathrm{rk}$, and the slope of a torsion sheaf is defined to be $+\infty$). Now let us assume that $\theta$ is irrational and denote by $(D^{\theta,\le 0}, D^{\theta,\ge 1})$ the t-structure on $D^b(X)$ associated with the torsion pair $(\mathrm{Coh}_{>\theta}(X), \mathrm{Coh}_{<\theta}(X))$. Let $\mathrm{Coh}(X)^\theta$ be the heart of this t-structure, so that $\mathrm{Coh}(X)^\theta$ is equipped with the torsion pair $(\mathrm{Coh}_{<\theta}[1], \mathrm{Coh}_{>\theta})$.

Proposition 3.9. Assume that $\theta < 0$. Then the functor $F = F_{\theta,0} : H^0\mathcal{C}^{st}(\theta) \to D^b(X)$ defined for the pair $(\theta, 0)$ sends the subcategory of holomorphic vector bundles on $T_\theta$ to $\mathrm{Coh}(X)^{-\theta^{-1}}[-1]$.

Proof. Indeed, if $\mathrm{rk}(g, 0) > 0$ then $F(E^z_g(\theta))$ is a vector bundle of degree $m$ and rank $n$ such that $m\theta + n > 0$, hence $F(E^z_g(\theta)) \in \mathrm{Coh}_{<-\theta^{-1}}$. Similarly, if $\mathrm{rk}(g, 0) \le 0$ then $F(E^z_g(\theta)) \in \mathrm{Coh}_{>-\theta^{-1}}[-1]$.
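The slope bound used in the first case of the proof is elementary. Spelled out as a sketch, assuming as in the proof that the bundle has degree $m$ and rank $n > 0$ with $m\theta + n > 0$ and $\theta < 0$:

```latex
\begin{align*}
m\theta + n > 0
 &\iff m\theta > -n \\
 &\iff m < -\frac{n}{\theta}
   && (\text{dividing by } \theta < 0 \text{ reverses the inequality}) \\
 &\iff \frac{m}{n} < -\theta^{-1}
   && (\text{dividing by } n > 0).
\end{align*}
```

So the slope $\deg/\mathrm{rk} = m/n$ lies strictly below $-\theta^{-1}$, which is the bound claimed in the proof.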

Acknowledgement. We are grateful to Marc Rieffel, Yan Soibelman and Mauro Spera for useful discussions. Part of this research was carried out during the stay of both authors at the Institut des Hautes Etudes Scientifiques. We thank this institution for the hospitality and for stimulating working conditions.

References

1. Connes, A.: C∗ algèbres et géométrie différentielle. C. R. Acad. Sci. Paris, Sér. A-B 290, A599–A604 (1980)
2. Connes, A., Rieffel, M.A.: Yang Mills for Non-Commutative Two-Tori. In: Proceedings of the Conference on Operator Algebras and Mathematical Physics. Univ. of Iowa (1985), 1987, pp. 237–266
3. Dieng, M., Schwarz, A.: Differential and complex geometry of two-dimensional noncommutative tori. Preprint. math.QA/0203160
4. Donaldson, S.K.: A new proof of a theorem of Narasimhan and Seshadri. J. Differential Geom. 18, 269–277 (1983)
5. Douglas, M.R.: Dirichlet branes, homological mirror symmetry, and stability. Preprint. math.AG/0207021
6. Fukaya, K.: Floer homology of Lagrangian foliation and noncommutative mirror symmetry I. Preprint. 1998, http://www.kusm.kyoto-u.ac.jp/~fukaya/fukaya.html
7. Happel, D., Reiten, I., Smalø, S.O.: Tilting in abelian categories and quasitilted algebras. Memoirs AMS 575 (1996)
8. Kajiura, H.: Kronecker foliation, D1-branes and Morita equivalence of Noncommutative two-tori. Preprint. hep-th/0207097
9. Manin, Yu.I.: Real Multiplication and noncommutative geometry. Preprint. math.AG/0202109
10. Mukai, S.: Duality between D(X) and D(X̂) with its application to Picard sheaves. Nagoya Math. J. 81, 153–175 (1981)
11. Narasimhan, M.S., Seshadri, C.S.: Stable and unitary vector bundles on a compact Riemann surface. Annals of Math. 82, 540–567 (1965)
12. Polishchuk, A.: Symplectic biextensions and a generalization of the Fourier-Mukai transform. Math. Research Letters 3, 813–828 (1996)
13. Polishchuk, A., Zaslow, E.: Categorical mirror symmetry in the elliptic curve. In: Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds. Providence NJ: AMS and International Press, 2001, pp. 275–295
14. Rieffel, M.A.: The cancellation theorem for projective modules over irrational rotation C∗-algebras. Proc. London Math. Soc. (3) 47, 285–302 (1983)
15. Rieffel, M.A.: Critical points of Yang-Mills for noncommutative two-tori. J. Differ. Geom. 31, 535–546 (1990)
16. Schwarz, A.: Theta functions on noncommutative tori. Lett. Math. Phys. 58, 81–90 (2001)
17. Soibelman, Y.: Quantum Tori, Mirror symmetry, and Deformation quantization. Lett. Math. Phys. 56, 99–125 (2001)
18. Soibelman, Y., Vologodsky, V.: Non-commutative compactifications and elliptic curves. Preprint. math.AG/0205117
19. Spera, M.: Yang Mills theory in non commutative differential geometry. Rendiconti Seminari Facoltà Scienze Univ. Cagliari Supplemento al Vol. 58, 409–421 (1988)

Communicated by A. Connes

Commun. Math. Phys. 236, 161–186 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0807-7

Communications in

Mathematical Physics

Chern Character in Twisted K-Theory: Equivariant and Holomorphic Cases

Varghese Mathai, Danny Stevenson

Department of Pure Mathematics, University of Adelaide, Adelaide, SA 5005, Australia. E-mail: [email protected]; [email protected]

Received: 10 January 2002 / Accepted: 9 December 2002
Published online: 25 February 2003 – © Springer-Verlag 2003

Abstract: It was argued in [25, 5] that in the presence of a nontrivial B-field, D-brane charges in type IIB string theories are classified by twisted K-theory. In [4], it was proved that twisted K-theory is canonically isomorphic to bundle gerbe K-theory, whose elements are ordinary Hilbert bundles on a principal projective unitary bundle, with an action of the bundle gerbe determined by the principal projective unitary bundle. The principal projective unitary bundle is in turn determined by the twist. This paper studies in detail the Chern-Weil representative of the Chern character of bundle gerbe K-theory that was introduced in [4], extending the construction to the equivariant and the holomorphic cases. Included is a discussion of interesting examples.

1. Introduction

As argued by Minasian-Moore [20] and Witten [25], D-brane charges in type IIB string theories are classified by K-theory, which arises from the fact that D-branes have vector bundles on their world-volumes. In the presence of a nontrivial B-field whose Dixmier-Douady class is a torsion element of $H^3(M, \mathbb{Z})$, Witten also showed that D-branes no longer have honest vector bundles on their world-volumes, but they have a twisted or gauge bundle. These are vector bundle-like objects whose transition functions $g_{ab}$ on triple overlaps satisfy $g_{ab}\, g_{bc}\, g_{ca} = h_{abc} I$, where $h_{abc}$ is a Čech 2-cocycle representing an element of $H^2(M, U(1)) \cong H^3(M, \mathbb{Z})$, and $I$ is the $n \times n$ identity matrix, for $n$ coincident D-branes. This proposal was later related by Kapustin [13] to projective modules over Azumaya algebras.

The authors acknowledge the support of the Australian Research Council


In the presence of a nontrivial B-field whose Dixmier-Douady class is a general element of H 3 (M, Z), it was proposed in [5] that D-brane charges in type IIB string theories are measured by the twisted K-theory that was described by Rosenberg [23], and the twisted bundles on the D-brane world-volumes were elements in this twisted K-theory. In [4], it was shown that these twisted bundles are equivalent to ordinary Hilbert bundles on the total space of the principal projective unitary bundle P over M with Dixmier-Douady invariant [habc ] ∈ H 3 (M, Z). These Hilbert bundles over P are in addition required to have an action of the bundle gerbe associated to P , and were called bundle gerbe modules. That is, bundle gerbe modules on M are vector bundles over P that are invariant under a projective action of the projective unitary group. The theory of bundle gerbes was initiated by Murray [21] as a bundle theory analogue of the theory of gerbes due to Giraud [9] and Brylinski [6] that used sheaves of categories. It was also proved in [4] that the K-theory defined using bundle gerbe modules, called bundle gerbe K-theory, is naturally isomorphic to twisted K-theory as defined by Rosenberg [23], thus yielding a nice geometric interpretation of twisted bundles on the D-brane world-volumes in the presence of a nontrivial B-field. There are discussions in [7, 16] on the uses of twisted K-theory in string theory. This paper extends the Chern-Weil construction of the Chern character of bundle-gerbe K-theory that was defined in [4], to the equivariant and the holomorphic cases. It also details the tensor product construction in bundle gerbe K-theory, which turns out to be a delicate matter when the curvature of the B-field is non-trivial. The non-trivial multiplicativity property of the Chern character is also studied, as well as the expression of the Chern character in the odd degree case. 
It has been noted by Witten [25] that D-brane charges in Type IIA string theories are classified by $K^1(M)$ with appropriate compact support conditions. The relevance of the equivariant case to conformal field theory was highlighted by the remarkable discovery by Freed, Hopkins and Teleman [8] that the twisted G-equivariant K-theory of a compact connected Lie group G (with mild hypotheses) is graded isomorphic to the Verlinde algebra of G, with a shift given by the dual Coxeter number and the curvature of the B-field. Recall that the Verlinde algebra of a compact connected Lie group G is defined in terms of positive energy representations of the loop group of G, and arises naturally in physics in Chern-Simons theory, which is defined using quantum groups and conformal field theory. The relevance of aspects of holomorphic K-theory to physics has been discussed in Sharpe [24] and Kapustin-Orlov [14], but using coherent sheaves and categories instead of the holomorphic vector bundles that are used in this paper. Section 2 contains a brief review of the theory of bundle gerbes and its equivariant analogue, as well as the theory of connections and curvature on these. In Sect. 3 a more detailed account of twisted cohomology is given than what appears in [4]. Section 4 deals with bundle gerbe K-theory and the delicate problem of defining the tensor product, as well as the multiplicativity property of the Chern character in this context. Section 5 contains a derivation of the Chern character in odd degree bundle gerbe K-theory. Section 6 extends the earlier discussion of the Chern character to the case of equivariant bundle gerbe K-theory, and Section 7 to the case of holomorphic bundle gerbe K-theory. Section 8 contains a discussion of the natural class of examples of Spin and Spin$^{\mathbb{C}}$ bundle gerbes, and the associated spinor bundle gerbe modules. Also included are examples in the holomorphic and equivariant cases.
We would like to thank Eckhard Meinrenken for pointing out an error in an earlier version of this paper. We would also like to thank the referee for useful comments.


2. Review of Bundle Gerbes

Bundle gerbes were introduced by Murray [21] and provided an alternative to Brylinski's category theoretic notion of a gerbe [6]. Gerbes and bundle gerbes provide a geometrical realisation of elements of $H^3(M, \mathbb{Z})$ just as line bundles provide a geometrical realisation of $H^2(M, \mathbb{Z})$. We briefly review here the definition and main properties of bundle gerbes. A bundle gerbe on a manifold $M$ consists of a submersion $\pi : Y \to M$ together with a line bundle $L \to Y^{[2]}$ on the fibre product $Y^{[2]} = Y \times_\pi Y$. $L$ is required to have a product, that is, a line bundle isomorphism which on the fibres takes the form

\[ L_{(y_1,y_2)} \otimes L_{(y_2,y_3)} \to L_{(y_1,y_3)} \tag{1} \]

for points $y_1$, $y_2$ and $y_3$ all belonging to the same fibre of $Y$. This product is required to be associative in the obvious sense. A motivating example of a bundle gerbe arises whenever we have a central extension of groups $\mathbb{C}^\times \to \hat{G} \to G$ and a principal $G$ bundle $P \to M$ on $M$. Then on the fibre product $P^{[2]}$ we can form the map $g : P^{[2]} \to G$ by comparing two points which lie in the same fibre. We can use this map $g$ to pull back the line bundle associated to the principal $\mathbb{C}^\times$ bundle $\hat{G}$. The resulting line bundle $L \to P^{[2]}$ (the "primitive" line bundle) is a bundle gerbe with the product induced by the product in the group $\hat{G}$. In [21] this bundle gerbe is called the lifting bundle gerbe. We will be particularly interested in the case when $G = PU$, the projective unitary group of some separable Hilbert space $\mathcal{H}$. Then, as is well known, we have the central extension $U(1) \to U \to PU$. Therefore, associated to any principal $PU$ bundle on $M$ we have an associated bundle gerbe.

For use in Sect. 6 we will explain here what we mean by a G-equivariant bundle gerbe. Suppose that $G$ is a compact Lie group acting on the manifold $M$. We first recall the notion of a G-equivariant line bundle on $M$ (see for instance [6]). We denote by $p_1$ and $m$ the maps $M \times G \to M$ given by projection onto the first factor and the action of $G$ on $M$ respectively. A line bundle $L \to M$ is said to be G-equivariant if there is a line bundle isomorphism $\sigma : p_1^* L \to m^* L$ covering the identity on $M \times G$. Thus $\sigma$ is equivalent to the data of a family of maps which fibrewise are of the form

\[ \sigma_g : L_m \to L_{g(m)} \tag{2} \]

for $m \in M$ and $g \in G$. These maps are required to vary smoothly with $m$ and $g$ and satisfy the obvious associativity condition. A G-equivariant bundle gerbe consists of the data of a submersion $\pi : Y \to M$ such that $Y$ has a G-action covering the action on $M$, together with a G-equivariant line bundle $L \to Y^{[2]}$. $L \to Y^{[2]}$ has a product, i.e. a G-equivariant line bundle isomorphism taking the form (1) on the fibres. We remark that this definition is really too strong: we should require that $G$ acts on $L$ only up to a "coherent natural transformation". However, the definition we have given will be more than adequate for our purposes.

A bundle gerbe connection $\nabla$ on a bundle gerbe $L \to Y^{[2]}$ is a connection on the line bundle $L$ which is compatible with the product (1), i.e. $\nabla(st) = \nabla(s)t + s\nabla(t)$ for sections $s$ and $t$ of $L$. In [21] it is shown that bundle gerbe connections always exist. Let $F_\nabla$ denote the curvature of a bundle gerbe connection $\nabla$. It is shown in [21] that we can find a 2-form $f$ on $Y$ such that $F_\nabla = \delta(f) = \pi_2^* f - \pi_1^* f$. The form $f$ is unique up to 2-forms pulled back from $M$. A choice of $f$ is called a choice of a curving for $\nabla$. Since $F_\nabla$ is closed we must have $df = \pi^*\omega$ for some necessarily closed 3-form $\omega$ on $M$. $\omega$ is called


the 3-curvature of the bundle gerbe connection ∇ and curving f . It is shown in [21] that ω has integral periods. For later use it will also be of interest to know that we can find a G-equivariant connection on a G-equivariant bundle gerbe L → Y [2] . This is a connection on L which is compatible with the product structure on L and is also invariant under the action of G. Since G is assumed to be compact, we can always find such a connection by an averaging procedure.
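The step from closedness of $F_\nabla$ to $df = \pi^*\omega$ can be unpacked as follows. This is only a sketch: it assumes that $\pi$ is surjective, that $\delta$ commutes with $d$, and that the complex $\Omega^q(M) \to \Omega^q(Y) \to \Omega^q(Y^{[2]})$ (pullback along $\pi$, then $\delta$) is exact, a fact we take from the theory developed in [21].

```latex
\begin{align*}
\delta(df) &= d\,\delta(f) = dF_\nabla = 0
  && \text{(the curvature of a line bundle connection is closed)} \\
\Longrightarrow\quad df &= \pi^*\omega \ \text{ for some } \omega \in \Omega^3(M)
  && \text{(assumed exactness: } \delta\text{-closed forms on } Y
     \text{ are pullbacks from } M\text{)} \\
\pi^*(d\omega) &= d(df) = 0 \ \Longrightarrow\ d\omega = 0
  && \text{(}\pi^* \text{ is injective for a surjective submersion } \pi\text{)} .
\end{align*}
```

The same exactness statement explains why $f$ is determined only up to 2-forms pulled back from $M$: two curvings differ by a $\delta$-closed 2-form on $Y$.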

3. Twisted Cohomology

Twisted cohomology turns out to be the natural target space for the Chern character defined in bundle gerbe K-theory. In this section we give a short account of the main properties of twisted cohomology. Suppose $H$ is a closed differential 3-form. We can use $H$ to construct a differential $\delta_H$ on the algebra $\Omega^\bullet(M)$ of differential forms on $M$ by setting $\delta_H(\omega) = d\omega - H\omega$ for $\omega \in \Omega^\bullet(M)$. It is easy to check that indeed $\delta_H^2 = 0$. We set $H^\bullet(M, H)$ to be the quotient $\ker\delta_H / \operatorname{im}\delta_H$. If $\lambda$ is a differential 2-form on $M$ then we can form the differential $\delta_{H+d\lambda}$ and the group $H^\bullet(M, H + d\lambda)$. One can construct an isomorphism $H^\bullet(M, H) \to H^\bullet(M, H + d\lambda)$ by sending a class in $H^\bullet(M, H)$ represented by $\omega$ to the class in $H^\bullet(M, H + d\lambda)$ represented by $\exp(\lambda)\omega$. So any two closed differential 3-forms $H$ and $H'$ representing the same class in $H^3(M, \mathbb{R})$ determine isomorphic cohomology groups $H^\bullet(M, H)$ and $H^\bullet(M, H')$. The two groups $H^\bullet(M, H)$ and $H^\bullet(M, H')$ are not uniquely isomorphic, as there are many 2-forms $\lambda$ such that $H' = H + d\lambda$. Note also that there is a homomorphism of groups $H^\bullet(M, H) \otimes H^\bullet(M, H') \to H^\bullet(M, H + H')$ defined by sending a class $[\omega] \otimes [\rho]$ in $H^\bullet(M, H) \otimes H^\bullet(M, H')$ to the class $[\omega\rho]$ in $H^\bullet(M, H + H')$. Finally note that $H^\bullet(M, H)$ has a natural structure of an $H^\bullet(M)$ module: if $[\omega] \in H^\bullet(M, H)$ and $[\rho] \in H^\bullet(M)$ then $[\rho\omega] \in H^\bullet(M, H)$. If $P$ is a principal $PU$ bundle on $M$ with Dixmier-Douady class $\delta(P) \in H^3(M; \mathbb{Z})$ then we define a group $H^\bullet(M; P)$ by choosing a closed 3-form $H$ on $M$ representing the image of $\delta(P)$ in de Rham cohomology and setting $H^\bullet(M; P) = H^\bullet(M; H)$.

Proposition 3.1. Suppose that $P$ is a principal $PU(\mathcal{H})$ bundle on $M$. Then $H^\bullet(M, P)$ is an abelian group with the following properties:
(1) If the principal $PU(\mathcal{H})$ bundle $P$ is trivial then there is an isomorphism $H^\bullet(M, P) = H^\bullet(M)$.
(2) If the principal $PU(\mathcal{H})$ bundle $P$ is isomorphic to another principal $PU(\mathcal{H})$ bundle $Q$ then there is an isomorphism of groups $H^\bullet(M, P) = H^\bullet(M, Q)$.
(3) If $Q$ is another principal $PU(\mathcal{H})$ bundle on $M$, then there is a homomorphism $H^\bullet(M, P) \otimes H^\bullet(M, Q) \to H^\bullet(M, P \otimes Q)$.
(4) $H^\bullet(M, P)$ has a natural structure as an $H^\bullet(M)$ module.
(5) If $f : N \to M$ is a smooth map then there is an induced homomorphism $f^* : H^\bullet(M, P) \to H^\bullet(N, f^*P)$.

In (3) above the $PU$ bundle $P \otimes Q$ on $M$ obtained from $PU$ bundles $P$ and $Q$ is defined [6] by first forming the $PU \times PU$ bundle $P \times_M Q$ on $M$ and then using the homomorphism $PU(\mathcal{H}) \times PU(\mathcal{H}) \to PU(\mathcal{H} \otimes \mathcal{H})$ to define $P \otimes Q = (P \times_M Q) \times_{PU \times PU} PU$. We can then choose an isometry $\mathcal{H} \otimes \mathcal{H} = \mathcal{H}$ to make $P \otimes Q$ into a principal $PU(\mathcal{H})$ bundle.
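The two checks behind the definition of twisted cohomology in this section are short. A sketch, using only that $H$ is a closed form of odd degree (so that $H \wedge H = 0$ and $d(H\omega) = (dH)\omega - H\,d\omega$) and that the 2-form $\lambda$ has even degree (so that $e^\lambda$ commutes with all forms):

```latex
\begin{align*}
\delta_H^2\,\omega
  &= d(d\omega - H\omega) - H(d\omega - H\omega) \\
  &= -(dH)\,\omega + H\,d\omega - H\,d\omega + H H\,\omega = 0, \\[4pt]
\delta_{H+d\lambda}\bigl(e^{\lambda}\omega\bigr)
  &= e^{\lambda}\,d\lambda\,\omega + e^{\lambda}\,d\omega
     - (H + d\lambda)\,e^{\lambda}\omega \\
  &= e^{\lambda}\,(d\omega - H\omega) = e^{\lambda}\,\delta_H\omega .
\end{align*}
```

The second identity says that multiplication by $e^{\lambda}$ intertwines $\delta_H$ with $\delta_{H+d\lambda}$; since $e^{-\lambda}$ is an inverse, the induced map on cohomology is an isomorphism.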


4. The Twisted Chern Character: Even Case 4.1. Geometric models of K˜ 0 (X). We first discuss some features of reduced even K-theory K˜ 0 (X). Since K˜ 0 (X) is a ring, it follows that a model for the classifying space of K˜ 0 (X) must be a “ring space” in the sense that there exist two H -space structures on the model classifying space which satisfy the appropriate distributivity axioms. Let Fred0 denote the connected component of the identity in the space of Fredholms Fred, and BGLK the classifying space of the group of invertible operators GLK which differ from the identity by a compact operator. The H -space structures on Fred0 and on BGLK which induce addition in K˜ 0 (X) are easy to describe; on Fred0 the H -space structure is given by composition of Fredholm operators while on BGLK the H -space structure is given by the group multiplication on BGLK . The H -space structures inducing multiplication on K˜ 0 (X) are harder to describe. On Fred0 this H -space structure is given as follows (see [2]). If F1 and F2 are Fredholm operators on the separable Hilbert space H then we can form the tensor product operator F1 ⊗I +I ⊗F2 . One can show that this operator is Fredholm of index dimker(F1 )dimker(F2 ) − dimker(F1∗ )dimker(F2∗ ). In particular, if ind(F1 ) = ind(F2 ) = 0, then F1 ⊗ I + I ⊗ F2 is Fredholm of index zero on H ⊗ H. If we choose an isometry of H ⊗ H with H then this operation of tensor product of Fredholms induces an H -map on Fred0 which one can show corresponds to the multiplication on K˜ 0 (X). Note that this works only for Fred0 , for Fred one must use a Z2 -graded version of this tensor product to get the right H -map. Since BGLK and Fred0 are homotopy equivalent there is an induced H -map on BGLK , however it is difficult to see what this map is at the level of principal GLK bundles. 
To investigate this H-map on $BGL_K$, we shall replace principal $GL_K$ bundles $P \to X$ with Hilbert vector bundles $E \to X$ equipped with a fixed reduction of the structure group of $E$ to $GL_K$. We shall refer to such vector bundles as $GL_K$-vector bundles. Note that Koschorke, in [15], reserves the terminology $GL_K$-vector bundle for more general objects. In other words we can find local trivialisations $\phi_U : E|_U \to U \times \mathcal{H}$ such that the transition functions $g_{UV}$ relative to these local trivialisations take values in $GL_K$. Another way of looking at this is that the principal frame bundle $F(E)$ of $E$ has a reduction of its structure group from $GL$ to $GL_K$ (note that there will be many such reductions). This reduction is determined up to isomorphism by a classifying map $X \to \mathrm{Fred}_0$. Suppose we are given two $GL_K$-vector bundles $E_1$ and $E_2$ on $X$ with the reductions of the frame bundles $F(E_1)$ and $F(E_2)$ to $GL_K$ corresponding to maps $F_1, F_2 : X \to \mathrm{Fred}_0$. We want to know the relation of the pullback of the universal $GL_K$ bundle over $\mathrm{Fred}_0$ to the bundles $F(E_1)$ and $F(E_2)$. We can suppose that $X$ is covered by open sets $U$ small enough so that we can write $F(x) = G_U(x) + K_U(x)$, $F_1(x) = g_U(x) + k_U(x)$ and $F_2(x) = g'_U(x) + k'_U(x)$, where $G_U$, $g_U$ and $g'_U$ are invertible and $K_U$, $k_U$ and $k'_U$ are compact (see for example [1]). It is then not hard to show that we can find $h_U : U \to GL$ so that $G_{UV} = h_U^{-1}\,(g_{UV} \otimes g'_{UV})\,h_V$ on $U \cap V$, where $G_{UV} = G_U^{-1} G_V$, $g_{UV} = g_U^{-1} g_V$ and $g'_{UV} = {g'_U}^{-1} g'_V$. The $g_{UV}$ and $g'_{UV}$ are the $GL_K$ valued transition functions for the $GL_K$ bundles $F(E_1)$ and $F(E_2)$, while the $G_{UV}$ are the $GL_K$ valued transition functions for the pullback by $F$ of the universal $GL_K$ bundle on $\mathrm{Fred}_0$. It follows that this last bundle is a reduction of the structure group to $GL_K$ of the frame bundle $F(E_1 \otimes E_2)$ of the tensor product Hilbert bundle $E_1 \otimes E_2$.
We therefore have an alternative description of the ring K˜ 0 (X) as the Grothendieck group associated to the semi-group VGLK (X) of GLK -vector bundles on X, where the addition and multiplication in VGLK (X) are given by direct sum and tensor product of GLK -vector bundles, after identifying H ⊕H, H ⊗ H and H via isometries.


4.2. Twisted K-theory and bundle gerbe modules. After these preliminary remarks, we now turn to the definition of twisted K-theory. Given a principal P U bundle P on M with Dixmier-Douady class δ(P ) ∈ H 3 (M; Z), Rosenberg [23] defines twisted K-groups K i (M, P ) to be the K-groups Ki (A(P )) of the algebra A(P ) = C ∞ (E, M). Here E = P ×P U K is the bundle on M associated to P via the adjoint action of P U on the compact operators K. Rosenberg shows that one can identify K 0 (M, P ) with the set of homotopy classes of P U -equivariant maps [P , Fred]P U and K 1 (M, P ) with the set of homotopy classes of P U -equivariant maps [P , GLK ]P U . We do not assume that the class δ(P ) is necessarily torsion. The reduced theory K˜ 0 (M, P ) can be shown to be equal to [P , Fred0 ]P U . We can replace Fred0 by any other homotopy equivalent space on which P U acts. For example we could take BGLK instead of Fred0 . Various other characterisations of K˜ 0 (M, P ) are possible, we summarise them in the following proposition from [4]. Proposition 4.1 ([4]). Given a principal P U bundle P → M with Dixmier-Douady class δ(P ) ∈ H 3 (M, Z) we have the following isomorphisms of groups: (1) K˜ 0 (M, P ). (2) space of homotopy classes of sections of the associated bundle P ×P U BGLK . (3) space of homotopy classes of P U equivariant maps: [P , BGLK ]P U . (4) space of isomorphism classes of P U covariant GLK bundles on P . (5) space of homotopy classes of Fredholm bundle maps F : H0 → H1 between Hilbert vector bundles H0 and H1 on P which are bundle gerbe modules for the lifting bundle gerbe L → P [2] . Here a P U covariant GLK bundle Q on P is a principal GLK bundle Q → P together with a right action of P U on Q covering the action on P such that (qg)[u] = q[u]u−1 gu, where g ∈ GLK , [u] ∈ P U . 
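In (5) above (see below for the definition of a bundle gerbe module) we require that the map $F : H_0 \to H_1$ is a Fredholm map on the fibres and is also compatible with the module structures of $H_0$ and $H_1$ in the sense that the following diagram commutes:

\[
\begin{array}{ccc}
\pi_1^* H_0 \otimes L & \xrightarrow{\ \pi_1^* F \otimes 1\ } & \pi_1^* H_1 \otimes L \\
\downarrow & & \downarrow \\
\pi_2^* H_0 & \xrightarrow{\ \pi_2^* F\ } & \pi_2^* H_1
\end{array}
\]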

Given two such maps $F, G : H_0 \to H_1$, we require that they are homotopic via a homotopy which preserves the module structures of $H_0$ and $H_1$ in the above sense. It can be shown that twisted K-theory has the following properties.

Proposition 4.2. Twisted K-theory satisfies the following properties:
(1) If the principal $PU$ bundle $P \to M$ is trivial then $K^p(M, P) = K^p(M)$.
(2) $K^p(M, P)$ is a module over $K^0(M)$.
(3) If $P \to M$ and $Q \to M$ are principal $PU$ bundles on $M$ then there is a homomorphism $K^p(M, P) \otimes K^q(M, Q) \to K^{p+q}(M, P \otimes Q)$.
(4) If $f : N \to M$ is a map then there is a homomorphism $K^p(M, P) \to K^p(N, f^*P)$, where $f^*P \to N$ denotes the pullback principal $PU$ bundle.
The reduced theory $\tilde{K}^\bullet(M, P)$ satisfies the analogous properties.


Associated to the principal P U bundle P → M via the central extension of groups U (1) → U → P U is the lifting bundle gerbe L → P [2] . In [4] the notion of a bundle gerbe module for L was introduced. When the Dixmier-Douady class δ(P ) is not necessarily torsion a bundle gerbe module for L was defined to mean a GLK -vector bundle E on P together with an action of L on E. This was a vector bundle isomorphism π1∗ E ⊗ L → π2∗ E on P [2] which was compatible with the product on L. We also require a further condition relating to the action of U on the principal GLK bundle associated to E. More specifically, let GL(E) denote the principal GL(H) bundle on P associated to E whose fibre at a point p of P consists of all isomorphisms f : H → Ep . g ∈ GL(H) acts via f g = f ◦ g. Since E has structure group GLK there is a reduction of GL(E) to a principal GLK bundle R ⊂ GL(E). We require that if u ∈ U (H) and p2 = p1 [u], so that u ∈ L(p1 ,p2 ) , then the map GL(E)p1 → GL(E)p2 which sends f ∈ GL(E)p1 to uf u−1 preserves R. So if f ∈ Rp1 then uf u−1 ∈ Rp2 . A bundle gerbe module E is not quite a P U -equivariant vector bundle, since the action of P U will not preserve the linear structure on the fibres of E, note however that the projectivisation of E, P (E), will descend to a bundle of projective Hilbert spaces on M. Recall from Sect. 4 that the space of GLK bundles on P forms a semi-ring under the operations of direct sum and tensor product. It is easy to see that the operation of direct sum is compatible with the action of the lifting bundle gerbe L → P [2] and so the set of bundle gerbe modules for L, ModGLK (L, M), has a natural structure as a semi-group. We denote the group associated to ModGLK (L, M) by the Grothendieck construction by ModGLK (L, M) as well. We remark as above that we can replace BGLK by any homotopy equivalent space; in particular we can consider GLtr -vector bundles in place of GLK -vector bundles. 
To define characteristic classes we must make this replacement. The following result is proven in [4]. Proposition 4.3 ([4]). If L → P [2] is the lifting bundle gerbe for a principal P U bundle P → M with Dixmier-Douady class δ(P ) ∈ H 3 (M, Z) then K˜ 0 (M, P ) = ModGLK (L, M). The result remains valid when we replace GLK -vector bundles by GLtr -vector bundles. 4.3. The twisted Chern character in the even case. In [4] a homomorphism chP : K˜ 0 (M, P ) → H ev (M, P ) was constructed with the properties that 1) chP is natural with respect to pullbacks, 2) chP respects the K˜ 0 (M)-module structure of K˜ 0 (M, P ) and 3) chP reduces to the ordinary Chern character in the untwisted case when δ(P ) = 0. It was proposed that chP was the Chern character for (reduced) twisted K-theory. We review the construction of chP here and prove the result stated in [4] that chP respects the K˜ 0 (M)-module structure of K˜ 0 (M, P ). To motivate the construction of chP , let P → M be a principal P U bundle with Dixmier-Douady class δ(P ) ∈ H 3 (M, Z) and let E → P be a module for the lifting bundle gerbe L → P [2] . We suppose that L comes equipped with a bundle gerbe connection ∇L and a choice of curving f such that the associated 3-curvature is H , a closed, integral 3-form on M representing the image, in real cohomology, of the Dixmier-Douady class δ(P ) of P . Recall that L acts on E via an isomorphism ψ : π1∗ E ⊗ L → π2∗ E. Since the ordinary Chern character ch is multiplicative, we have π1∗ ch(E)ch(L) = π2∗ ch(E).

(3)


Assume for the moment that this equation holds on the level of forms. Then ch(L) is represented by the curvature 2-form FL of the bundle gerbe connection ∇L on L. A choice of a curving for ∇L is a 2-form f on P such that FL = δ(f ) = π2∗ f − π1∗ f . It follows that ch(L) is represented by exp(FL ) = exp(π2∗ f − π1∗ f ) = exp(π2∗ f ) exp(−π1∗ f ). Therefore we can rearrange Eq. (3) above to get π1∗ exp(−f )ch(E) = π2∗ exp(−f )ch(E). Since we are assuming that this equation holds at the level of forms, this implies that the form exp(−f )ch(E) descends to a form on M which is clearly closed with respect to the twisted differential d − H . To make this argument rigorous, we need to choose connections on the module E so that Eq. (3) holds on the level of forms. To do this we need the notion of a bundle gerbe module connection. A bundle gerbe module connection on E is a connection ∇E on the vector bundle E which is compatible with the bundle gerbe connection ∇L on L under the action of L on E. In other words, under the isomorphism ψ : π1∗ E ⊗ L → π2∗ E, we have the transformation law π1∗ ∇E ⊗ I + I ⊗ ∇L = ψ −1 π2∗ ∇E ψ.

(4)

Let FL denote the curvature of ∇L and let FE denote the curvature of ∇E . Then Eq. (4) implies that the curvatures satisfy the following equation: π1∗ FE + FL = ψ −1 π2∗ FE ψ. Recall that the curving f for the bundle gerbe connection ∇L satisfies FL = δ(f ) = π2∗ f − π1∗ f . Therefore we can rewrite the equation above as π1∗ (FE + f I ) = ψ −1 π2∗ (FE + f I )ψ.

(5)

We would like to be able to take traces of powers of $F_E + fI$. Then Eq. (5) would imply that the forms $\mathrm{tr}(F_E + fI)^p$ descend to $M$. Unfortunately it is not possible to find connections $\nabla_E$ so that the bundle valued 2-form $F_E + fI$ takes values in the sub-bundle of trace class endomorphisms of $E$ unless the 3-curvature $H$ of the bundle gerbe connection $\nabla_L$ and curving $f$ is zero. Since we are interested in K-theory and hence in $\mathbb{Z}_2$-graded vector bundles we can get around this difficulty by considering connections $\nabla$ on the $\mathbb{Z}_2$-graded module $E = E_0 \oplus E_1$ which are of the form $\nabla = \nabla_0 \oplus \nabla_1$, where $\nabla_0$ and $\nabla_1$ are module connections on $E_0$ and $E_1$ respectively such that the difference $\nabla_0 - \nabla_1$ takes trace class values. By this we mean that we can cover $P$ by open sets over which $E_0$ and $E_1$ are trivial and in these local trivialisations the connections $\nabla_0$ and $\nabla_1$ are given by $d + A_0$ and $d + A_1$ respectively such that the difference $A_0 - A_1$ is trace class. To see that we can always find such connections recall, as pointed out in [4], that a $PU$-covariant $GL_{tr}$-bundle $Q$ on $P$ can be viewed as the total space $Q$ of a principal $GL_{tr} \rtimes PU$ bundle over $M$, where the semi-direct product is defined by
\[ (g_1, u_1) \cdot (g_2, u_2) = (\hat u_2^{-1} g_1 \hat u_2 g_2,\; u_1 u_2). \]
It follows that the Lie bracket on $\mathrm{Lie}(GL_{tr} \rtimes PU)$ is given by
\[ [(\xi, U), (\eta, V)] = ([\xi, \eta] + [\xi, \hat V] + [\hat U, \eta],\; [U, V]), \]
where $\xi, \eta \in \mathrm{Lie}(GL_{tr})$, $U, V \in \mathrm{Lie}(PU)$ and $\hat U$, $\hat V$ are lifts of $U$ and $V$ to $\mathrm{Lie}(U)$. Let $\Theta$ be a connection 1-form on the $GL_{tr} \rtimes PU$ bundle $Q \to M$. Let $p_1$ and $p_2$ denote the projections of $\mathrm{Lie}(GL_{tr} \rtimes PU)$ onto $\mathrm{Lie}(GL_{tr})$ and $\mathrm{Lie}(PU)$ respectively. We cannot define a connection 1-form on the $GL_{tr}$ bundle $Q \to P$ by $p_1(\Theta)$ as this 1-form is not equivariant with respect to the action of $GL_{tr}$. Instead we have
\[ p_1\big((g^{-1}, 1)\,\Theta\,(g, 1)\big) = g^{-1}(p_1\Theta)g + g^{-1}(\widehat{p_2\Theta})g - (\widehat{p_2\Theta}). \tag{6} \]

Note that $p_2\Theta$, the $\mathrm{Lie}(PU)$-valued part of the connection 1-form $\Theta$ on $Q \to M$, pushes forward to define a connection 1-form $A$ on the $PU$-bundle $P$. The transformation law (6) turns out to be just what is needed to define a module connection on the associated bundle $E = Q \times_{GL_{tr}} \mathcal{H}$. Indeed, if $[s, f]$ is a section of $E$ then (6) shows that
\[ \nabla_X [s, f] = [s,\; df(X) + s^*(p_1\Theta)(X) f + \sigma(A(X)) f] \tag{7} \]

is a well defined connection on $E$ (here $X$ is a vector field on $P$ and $\sigma : \mathrm{Lie}(PU) \to \mathrm{Lie}(U)$ is a choice of splitting of the central extension of Lie algebras $i\mathbb{R} \to \mathrm{Lie}(U) \to \mathrm{Lie}(PU)$). To show that $\nabla$ defines a module connection on $E$ we want to show that under the isomorphism $\psi : \pi_1^*E \otimes L \to \pi_2^*E$ the connection $\pi_2^*\nabla$ on $\pi_2^*E$ is mapped into the tensor product connection $\pi_1^*\nabla + \nabla_L$ on $\pi_1^*E \otimes L$. The isomorphism $\psi$ sends a section $[s, f] \otimes [\hat u, \lambda]$ of $\pi_1^*E \otimes L$ to the section $[su^{-1}, \lambda \hat u f]$ of $\pi_2^*E$. Then we have
\[ u \cdot \pi_2^*\nabla[su^{-1}, \lambda \hat u f] = \pi_1^*\nabla[s, f] \otimes [\hat u, \lambda] + [s, f] \otimes \big[\hat u,\; d\lambda + \lambda(\hat u^{-1} d\hat u - \sigma(u^{-1}du)) + \lambda(\hat u^{-1}\sigma(\pi_2^*A)\hat u - \sigma(u^{-1}\pi_2^*A\,u))\big]. \]
One can check that if we define a connection $\nabla_L$ on the lifting bundle gerbe $L \to P^{[2]}$ by
\[ \nabla_L[\hat u, \lambda] = \big[\hat u,\; d\lambda + \lambda(\hat u^{-1} d\hat u - \sigma(u^{-1}du)) + \lambda(\hat u^{-1}\sigma(\pi_2^*A)\hat u - \sigma(u^{-1}\pi_2^*A\,u))\big] \]
then $\nabla_L$ is a bundle gerbe connection. Therefore $\nabla$ is a module connection on $E$.

Suppose now that $E_1$ and $E_2$ are $GL_{tr}$ modules for $L$. Then by definition the frame bundles $U(E_1)$ and $U(E_2)$ have $GL_{tr}$ reductions $Q_1$ and $Q_2$ respectively which are $PU$ covariant. Therefore we may construct module connections on $E_1$ and $E_2$ by the above recipe. We may choose connection 1-forms $\Theta_1$ and $\Theta_2$ on the $GL_{tr} \rtimes PU$ bundles $Q_1$ and $Q_2$ respectively such that $p_2\Theta_1$ and $p_2\Theta_2$ push forward to define the same connection 1-form $A$ on the $PU$ bundle $P$. It follows therefore from (7) that the difference of the associated module connections $\nabla_1$ and $\nabla_2$ in local trivialisations is trace class. We let $F_0$ and $F_1$ denote the curvatures of the module connections $\nabla_0$ and $\nabla_1$ respectively. It follows that the difference $(F_0 + fI) - (F_1 + fI)$ and hence $(F_0 + fI)^p - (F_1 + fI)^p$ takes trace class values. It is shown in [4] that the $2p$-forms $\mathrm{tr}((F_0 + fI)^p - (F_1 + fI)^p)$ are defined globally on $P$ and moreover they descend to $2p$-forms on $M$. We have
\[ \mathrm{tr}(\exp(F_0 + fI) - \exp(F_1 + fI)) = \exp(f)\,\mathrm{tr}(\exp(F_0) - \exp(F_1)), \tag{8} \]

where we note that there is a cancellation of the degree zero terms; this is due to the fact that we are considering reduced twisted K-theory. We summarise this discussion in the following proposition from [4].

Proposition 4.4. Suppose that $E = E_0 \oplus E_1$ is a $\mathbb{Z}_2$-graded module for the lifting bundle gerbe $L \to P^{[2]}$, representing a class in $\tilde K^0(M, P)$ under the isomorphism $\tilde K^0(M, P) = \mathrm{Mod}_{GL_{tr}}(L, M)$. Suppose that $\nabla = \nabla_0 \oplus \nabla_1$ is a connection on the $\mathbb{Z}_2$-graded module $E$ such that $\nabla_0$ and $\nabla_1$ are module connections for $E_0$ and $E_1$ respectively such that the difference $\nabla_0 - \nabla_1$ is trace class. Let $\mathrm{ch}_P(\nabla, E)$ denote the differential form on $M$ whose lift to $P$ is equal to $\exp(f)\,\mathrm{tr}(\exp(F_0) - \exp(F_1))$. Then $\mathrm{ch}_P(\nabla, E)$ is closed with respect to the twisted differential $d - H$ on $\Omega^\bullet(M)$ and hence represents a class in $H^{ev}(M, P)$. The class $\mathrm{ch}_P(E) = [\mathrm{ch}_P(\nabla, E)]$ is independent of the choice of module connections $\nabla_0$ and $\nabla_1$.
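The two elementary facts behind Eq. (8) and the closedness statement of Proposition 4.4 can be spelled out as follows. This is only a sketch: it assumes that the curving satisfies $df = \pi^*H$ and that the form $\omega = \mathrm{tr}(\exp(F_0) - \exp(F_1))$ is closed on $P$ (the latter is part of what is shown in [4]); the symbol $\omega$ is introduced here purely for convenience.

```latex
% fI is a scalar-valued form of even degree, so it commutes with F_i:
\[ \exp(F_i + fI) = \exp(f)\,\exp(F_i), \qquad i = 0, 1. \]
% The identity operators in degree zero cancel in the difference:
\[ \omega = \mathrm{tr}\big(\exp(F_0) - \exp(F_1)\big)
          = \sum_{p \ge 1} \frac{1}{p!}\,\mathrm{tr}\big(F_0^p - F_1^p\big). \]
% Closedness for the twisted differential, assuming df = \pi^* H and d\omega = 0:
\[ d\big(\exp(f)\,\omega\big) = df \wedge \exp(f)\,\omega + \exp(f)\,d\omega
                              = \pi^*H \wedge \exp(f)\,\omega , \]
\[ \text{so that}\quad (d - \pi^*H)\big(\exp(f)\,\omega\big) = 0 . \]
```

Since $\exp(f)\,\omega$ is the lift to $P$ of $\mathrm{ch}_P(\nabla, E)$, the last line is the lifted form of the statement that $\mathrm{ch}_P(\nabla, E)$ is closed for $d - H$ on $M$.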


We note that it is essential to consider $\mathbb{Z}_2$-graded modules $E$ to define the forms $\mathrm{ch}_P(\nabla, E)$ as we need to be able to consider differences of connections in order to take traces. It is straightforward to show that the assignment $E \mapsto [\mathrm{ch}_P(\nabla, E)]$ is additive with respect to direct sums; i.e. $[\mathrm{ch}_P(\nabla \oplus \nabla', E \oplus E')] = [\mathrm{ch}_P(\nabla, E)] + [\mathrm{ch}_P(\nabla', E')]$. We want to show here that the homomorphism $\mathrm{ch}_P$ respects the $\tilde K^0(M)$-module structure of $\tilde K^0(M, P)$. We first recall the action of $\tilde K^0(M)$ on $\tilde K^0(M, P)$: if $F = F^+ \oplus F^-$ is a $\mathbb{Z}_2$-graded $GL_{tr}$-vector bundle on $M$ representing a class in $\tilde K^0(M)$ and $E = E^+ \oplus E^-$ is a $\mathbb{Z}_2$-graded $GL_{tr}$-bundle gerbe module on $P$ for the lifting bundle gerbe $L \to P^{[2]}$ then $E \cdot F$ is the $\mathbb{Z}_2$-graded bundle gerbe module $E \hat\otimes \pi^*F$, where $\pi : P \to M$ is the projection. $L$ acts trivially on $\pi^*F$. We want to show that $\mathrm{ch}_P(E \hat\otimes \pi^*F) = \mathrm{ch}_P(E)\,\mathrm{ch}(F)$, where $\mathrm{ch}(F)$ is the ordinary Chern character form for $F$. Choose a connection $\nabla_E = \nabla_{E^+} \oplus \nabla_{E^-}$ on the $\mathbb{Z}_2$-graded module $E = E^+ \oplus E^-$ such that the connections $\nabla_{E^+}$ and $\nabla_{E^-}$ are module connections on $E^+$ and $E^-$ such that the difference $\nabla_{E^+} - \nabla_{E^-}$ is trace class. Choose also a connection $\nabla_F = \nabla_{F^+} \oplus \nabla_{F^-}$ on the $\mathbb{Z}_2$-graded $GL_{tr}$-vector bundle $F$. Then a differential form representing $\mathrm{ch}_P(E)\,\mathrm{ch}(F)$ is given by
\[ \exp(f)\,\mathrm{tr}\big(\exp(F_{E^+}) - \exp(F_{E^-})\big)\,\pi^*\mathrm{tr}\big(\exp(F_{F^+}) - \exp(F_{F^-})\big). \]
This form is equal to
\[ \exp(f)\,\mathrm{tr}\big(\exp(F_{E^+} \otimes I + I \otimes \pi^*F_{F^+}) + \exp(F_{E^-} \otimes I + I \otimes \pi^*F_{F^-}) - \exp(F_{E^+} \otimes I + I \otimes \pi^*F_{F^-}) - \exp(F_{E^-} \otimes I + I \otimes \pi^*F_{F^+})\big). \tag{9} \]

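The connections $(\nabla_{E^+} \otimes I + I \otimes \pi^*\nabla_{F^+}) \oplus (\nabla_{E^-} \otimes I + I \otimes \pi^*\nabla_{F^-})$ and $(\nabla_{E^+} \otimes I + I \otimes \pi^*\nabla_{F^-}) \oplus (\nabla_{E^-} \otimes I + I \otimes \pi^*\nabla_{F^+})$ are module connections, however their difference is not trace class. We next choose module connections $\nabla_{E^+ \otimes \pi^*F^+} \oplus \nabla_{E^- \otimes \pi^*F^-}$ and $\nabla_{E^+ \otimes \pi^*F^-} \oplus \nabla_{E^- \otimes \pi^*F^+}$ on the modules $E^+ \otimes \pi^*F^+ \oplus E^- \otimes \pi^*F^-$ and $E^+ \otimes \pi^*F^- \oplus E^- \otimes \pi^*F^+$ respectively. We choose these connections so that all of the differences $\nabla_{E^+ \otimes \pi^*F^+} - \nabla_{E^- \otimes \pi^*F^-}$, $\nabla_{E^+ \otimes \pi^*F^+} - \nabla_{E^+ \otimes \pi^*F^-}$ and $\nabla_{E^+ \otimes \pi^*F^+} - \nabla_{E^- \otimes \pi^*F^+}$ are trace class. An argument similar to the one above shows that one can always do this. We next define a family of module connections $\nabla_{E^\pm \otimes \pi^*F^\pm}(t)$ by
\[ \nabla_{E^\pm \otimes \pi^*F^\pm}(t) = t\,(\nabla_{E^\pm} \otimes I + I \otimes \pi^*\nabla_{F^\pm}) + (1 - t)\,\nabla_{E^\pm \otimes \pi^*F^\pm}. \tag{10} \]
Using the fact that the difference of any two of the connections $\nabla_{E^\pm \otimes \pi^*F^\pm}$, or the difference $\nabla_{E^+} - \nabla_{E^-}$ or $\nabla_{F^+} - \nabla_{F^-}$ is trace class one can show that the form
\[ F_{E^+ \otimes \pi^*F^+}(t)^{k-1} + F_{E^- \otimes \pi^*F^-}(t)^{k-1} - F_{E^+ \otimes \pi^*F^-}(t)^{k-1} - F_{E^- \otimes \pi^*F^+}(t)^{k-1} \]
takes trace class values. Similarly one can show that $\dot A_{E^+ \otimes \pi^*F^+}(t) + \dot A_{E^- \otimes \pi^*F^-}(t) - \dot A_{E^+ \otimes \pi^*F^-}(t) - \dot A_{E^- \otimes \pi^*F^+}(t)$ takes trace class values. From these two results it is easy to deduce that the form
\[ \dot A_{E^+ \otimes \pi^*F^+}(t)\,F_{E^+ \otimes \pi^*F^+}(t)^{k-1} + \dot A_{E^- \otimes \pi^*F^-}(t)\,F_{E^- \otimes \pi^*F^-}(t)^{k-1} - \dot A_{E^+ \otimes \pi^*F^-}(t)\,F_{E^+ \otimes \pi^*F^-}(t)^{k-1} - \dot A_{E^- \otimes \pi^*F^+}(t)\,F_{E^- \otimes \pi^*F^+}(t)^{k-1} \tag{11} \]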

takes trace class values. It is also straightforward to check that the trace of the form (11) above descends to M. An argument similar to the proof that the class [chP (E, ∇)] is independent of the choice of connection in [4] shows that the trace of the form (11) is a transgression form. We propose that chP (E) represents the twisted Chern character. We summarise the discussion of this section in the following proposition.
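For orientation, the finite-rank prototype of this transgression argument can be sketched as follows. The sketch assumes a smooth path of connections $\nabla(t) = d + A(t)$ on a finite-rank bundle, with curvature $F(t)$, so that every trace below is defined; in the setting above the individual traces are not defined and the identity has to be applied to the alternating combination (11) as a whole rather than term by term.

```latex
% Variation of the curvature F(t) = dA(t) + A(t)^2 along the path:
\[ \dot F(t) = d\dot A(t) + \dot A(t)A(t) + A(t)\dot A(t) = [\nabla(t), \dot A(t)] . \]
% Bianchi identity:
\[ [\nabla(t), F(t)] = 0 . \]
% Cyclicity of the trace (F has even degree), then Bianchi, then tr[\nabla, X] = d tr X:
\[ \frac{d}{dt}\,\mathrm{tr}\big(F(t)^k\big)
   = k\,\mathrm{tr}\big(\dot F(t)\,F(t)^{k-1}\big)
   = k\,\mathrm{tr}\big([\nabla(t), \dot A(t)\,F(t)^{k-1}]\big)
   = k\,d\,\mathrm{tr}\big(\dot A(t)\,F(t)^{k-1}\big) . \]
```

Integrating over $t \in [0, 1]$ shows that the difference of the two endpoint forms is exact, which is the sense in which the trace of (11) is a transgression form.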


Proposition 4.5. Let P be a principal P U (H) bundle on M with Dixmier-Douady class δ(P ). The homomorphism chP : K˜ 0 (M; P ) → H ev (M, P ) satisfies the following properties: (1) chP is natural with respect to pullbacks by maps f : N → M. (2) chP respects the K˜ 0 (M)-module structure of K˜ 0 (M; P ). (3) chP reduces to the ordinary Chern character in the case where P is trivial. 5. The Odd Chern Character: Twisted Case The relevance of K 1 to Type IIA string theory, at least in the case where there is no background field, has been pointed out by Witten [25], see also [12]. It turns out that odd K-theory, K 1 (M), with appropriate compact support conditions, classifies D-brane charges in type IIA string theory. Related work also appears in [10] and [18]. Suppose L → Y [2] is a bundle gerbe with bundle gerbe connection ∇L . Recall that a module connection ∇E on a bundle gerbe module E for L is a connection on the vector bundle E which is compatible with the bundle gerbe connection ∇L , i.e. under the isomorphism ψ : π1∗ E ⊗ L → π2∗ E the tensor product connection π1∗ ∇E ⊗ ∇L on π1∗ E ⊗ L is mapped into the connection π2∗ ∇E on π2∗ E. Suppose now that L → P [2] is the lifting bundle gerbe for the principal P U (H) bundle P → M and that ∇L is a bundle gerbe connection on L with curvature FL and curving f , where FL = π1∗ f − π2∗ f such that the associated 3-curvature (which represents the image, in real cohomology, of the Dixmier-Douady class of L) is equal to the closed, integral 3-form H . If E is a trivial Utr bundle gerbe module for L then we can consider module connections ∇E = d + AE on E; however the algebra valued 2-form FE + f I cannot take trace class values (here FE denotes the curvature of the connection ∇E ). Let φ : E → E be an automorphism of the trivial Utr bundle gerbe module E that respects the Utr bundle gerbe module structure, that is, φ ∈ Utr (E), then φ −1 ∇E φ is another module connection for E. 
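Then the difference of connections $\varphi^{-1}\nabla_E\varphi - \nabla_E = A(\varphi)$ is a one form on $P$ with values in the trace class endomorphisms of $E$. Recall that the transformation equations satisfied by the module connections $\nabla_E$ and $\varphi^{-1}\nabla_E\varphi$ are
\[ \pi_1^*\nabla_E + \nabla_L = \psi^{-1} \circ \pi_2^*\nabla_E \circ \psi \quad\text{and}\quad \pi_1^*\varphi^{-1}\nabla_E\varphi + \nabla_L = \psi^{-1} \circ \pi_2^*\varphi^{-1}\nabla_E\varphi \circ \psi. \]
Therefore one has the following equality of $\mathrm{End}(\pi_1^*E \otimes L) = \mathrm{End}(\pi_2^*E)$ valued 1-forms on $P^{[2]}$:
\[ \pi_1^*A(\varphi) = \psi^{-1} \circ \pi_2^*A(\varphi) \circ \psi. \tag{12} \]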
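The form $A(\varphi)$ can be made explicit in a local trivialisation. The following is a sketch of that computation, writing $\nabla_E = d + A_E$ locally (this local notation is an assumption of the sketch) and using only the hypothesis that $\varphi - I_E$ is trace class, together with the assumption that $\varphi$ is differentiable in the trace norm.

```latex
% Conjugating the connection by \varphi and subtracting \nabla_E = d + A_E:
\[ A(\varphi) = \varphi^{-1}\nabla_E\,\varphi - \nabla_E
             = \varphi^{-1}d\varphi + \varphi^{-1}A_E\,\varphi - A_E . \]
% Rewriting everything in terms of the trace class operator \varphi - I_E:
\[ A(\varphi) = \varphi^{-1}\,d(\varphi - I_E) + \varphi^{-1}\,[A_E,\ \varphi - I_E] . \]
```

Both terms are products of bounded operators with a trace class factor, which is why $A(\varphi)$ takes trace class values even though $A_E$ itself does not.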

Recall also that the curvature satisfies the following equalities of $\mathrm{End}(\pi_1^*E \otimes L) = \mathrm{End}(\pi_2^*E)$ valued 2-forms on $P^{[2]}$:
\[ \pi_1^*(F_E + fI) = \psi^{-1} \circ \pi_2^*(F_E + fI) \circ \psi \tag{13} \]
and
\[ \pi_1^*(\varphi^{-1}F_E\varphi + fI) = \psi^{-1} \circ \pi_2^*(\varphi^{-1}F_E\varphi + fI) \circ \psi. \tag{14} \]

It follows that the differences $(\varphi^{-1}F_E\varphi + fI) - (F_E + fI)$ and hence $(\varphi^{-1}F_E\varphi + fI)^k - (F_E + fI)^k$ are differential forms with values in the trace class endomorphisms of $E$. Therefore $\mathrm{tr}((\varphi^{-1}F_E\varphi + fI)^k - (F_E + fI)^k)$ is well defined and is equal to zero on $P$. In fact, by (13) and (14), these differential forms descend to $M$ and are zero there. In particular, one has
\[ \mathrm{tr}\big(\exp(\varphi^{-1}F_E\varphi + fI) - \exp(F_E + fI)\big) = \exp(f)\,\mathrm{tr}\big(\exp(\varphi^{-1}F_E\varphi) - \exp(F_E)\big) = 0. \tag{15} \]
Let $\nabla_E(s) = s\varphi^{-1}\nabla_E\varphi + (1 - s)\nabla_E$, where $s \in [0, 1]$, denote the linear path of connections joining $\nabla_E$ and $\varphi^{-1}\nabla_E\varphi$, and $F_E(s)$ denote its curvature. Since $\exp(F_E(s)) - \exp(F_E(0))$ is a differential form on $P$ with values in the trace class endomorphisms of $E$ for all $s \in [0, 1]$, then
\[ \partial_s\,\mathrm{tr}\big(\exp(F_E(s)) - \exp(F_E(0))\big) = \mathrm{tr}\big(\partial_s \exp(F_E(s))\big) = d\big(\mathrm{tr}(A(\varphi)\exp(F_E(s)))\big) \tag{16} \]
is well defined since $A(\varphi) = \partial_s\nabla_E(s)$ is a one form on $P$ with values in the trace class endomorphisms of $E$. Integrating (16) and using (15) yields the transgression formula
\[ \mathrm{tr}\big(\exp(\varphi^{-1}F_E\varphi + fI) - \exp(F_E + fI)\big) = \exp(f)\,d\left(\int_0^1 ds\,\mathrm{tr}\big(A(\varphi)\exp(F_E(s))\big)\right). \tag{17} \]
By (15), it follows that $\int_0^1 ds\,\mathrm{tr}(A(\varphi)\exp(F_E(s)))$ is a closed form. Therefore
\[ \exp(f)\int_0^1 ds\,\mathrm{tr}\big(A(\varphi)\exp(F_E(s))\big) = \int_0^1 ds\,\mathrm{tr}\big(A(\varphi)\exp(F_E(s) + fI)\big) \]
is closed with respect to the twisted differential $d - H$. In fact, by (12), (13) and (14), the differential form $\int_0^1 ds\,\mathrm{tr}(A(\varphi)\exp(F_E(s) + fI))$ descends to $M$.

By (15) and (17) we have the following proposition, except for the last claim.

Proposition 5.1. Suppose that $E$ is a $U_{tr}$ bundle gerbe module for the lifting bundle gerbe $L \to P^{[2]}$ equipped with a bundle gerbe connection $\nabla_L$ and curving $f$, such that the associated 3-curvature is $H$. Suppose that $\nabla_E$ is a module connection on $E$ and $\varphi$ is an automorphism of $E$ such that $\varphi - I_E$ is a fibre-wise trace class operator. Then the difference $\varphi^{-1}\nabla_E\varphi - \nabla_E = A(\varphi)$ is trace class. Let $\mathrm{ch}_P(\nabla_E, \varphi) \in \Omega^\bullet(M)$ denote the differential form on $M$ whose lift to $P$ is given by
\[ \int_0^1 ds\,\mathrm{tr}\big(A(\varphi)\exp(F_E(s) + fI)\big), \]
where $\nabla_E(s) = s\varphi^{-1}\nabla_E\varphi + (1 - s)\nabla_E$, for $s \in [0, 1]$, denotes the linear path of connections joining $\nabla_E$ and $\varphi^{-1}\nabla_E\varphi$, and $F_E(s)$ denotes its curvature. Then $\mathrm{ch}_P(\nabla_E, \varphi)$ is closed with respect to the twisted differential $d - H$ on $\Omega^\bullet(M)$ and hence represents a class $[\mathrm{ch}_P(\nabla_E, \varphi)]$ in $H^{odd}(M, P)$. Furthermore, the class $[\mathrm{ch}_P(\nabla_E, \varphi)]$ is independent of the choice of module connection $\nabla_E$ on $E$ and of the choice of automorphism $\varphi$ of $E$ such that $\varphi - I_E$ is a fibre-wise trace class operator, and is called the odd twisted Chern character of $[E, \varphi] \in K^1(M, P)$, denoted by $\mathrm{ch}_P([E, \varphi]) \in H^{odd}(M, P)$.


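Proof. We have already seen that $\mathrm{ch}_P(\nabla_E, \varphi)$ is a $d - H$ closed form on $M$. It remains to prove that the twisted cohomology class $[\mathrm{ch}_P(\nabla_E, \varphi)]$ is independent of the various choices. Let $\nabla_E(s, t)$ be a two parameter family of module connections on $E$. Suppose that $\nabla_E(s, t) = d + A(s, t)$ and that $F_E(s, t)$ is the curvature of $\nabla_E(s, t)$. In the situations that we consider, $\partial_s(A)$ and $\partial_s \exp(F_E)$ are differential forms on $P$ with values in the trace class endomorphisms of $E$ for all $s \in [0, 1]$, where we have suppressed the dependence of $A$ and $F_E$ on $s$ and $t$. Let $P(X)$ denote the invariant function $\mathrm{tr}(\exp(X))$ and let $P'$ and $P''$ denote the first and second differentials of $P$ respectively. An easy calculation shows that
\[ \mathrm{tr}\big(\partial_s(\partial_t(A)F_E^{k-1}) - \partial_t(\partial_s(A)F_E^{k-1})\big) = \mathrm{tr}\big(\partial_t(A)\partial_s(F_E^{k-1}) - \partial_s(A)\partial_t(F_E^{k-1})\big). \]
We rewrite the right-hand side as follows. We have
\begin{align*}
&\mathrm{tr}\big(\partial_t(A)\partial_s(F_E^{k-1}) - \partial_s(A)\partial_t(F_E^{k-1})\big) \\
&\quad= \mathrm{tr}\Big(\sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}\big(d\partial_s(A) + \partial_s(A)A + A\partial_s(A)\big)F_E^{k-i-1} - \sum_{i=1}^{k-1} \partial_s(A)F_E^{i-1}\big(d\partial_t(A) + \partial_t(A)A + A\partial_t(A)\big)F_E^{k-i-1}\Big) \\
&\quad= \mathrm{tr}\Big(\sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}d\partial_s(A)F_E^{k-i-1} - \sum_{i=1}^{k-1} d\partial_t(A)F_E^{k-i-1}\partial_s(A)F_E^{i-1} \\
&\qquad\quad + \sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}\partial_s(A)AF_E^{k-i-1} - \sum_{i=1}^{k-1} \partial_t(A)AF_E^{k-i-1}\partial_s(A)F_E^{i-1} \\
&\qquad\quad + \sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}A\partial_s(A)F_E^{k-i-1} - \sum_{i=1}^{k-1} \partial_t(A)F_E^{k-i-1}\partial_s(A)F_E^{i-1}A\Big),
\end{align*}
where we have used the invariance of $\mathrm{tr}$. Reindexing the sums we get
\begin{align*}
\mathrm{tr}\Big(&-\sum_{i=1}^{k-1} d\partial_t(A)F_E^{i-1}\partial_s(A)F_E^{k-i-1} + \sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}d\partial_s(A)F_E^{k-i-1} \\
&- \sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}\partial_s(A)[F_E^{k-i-1}, A] + \sum_{i=1}^{k-1} \partial_t(A)[F_E^{i-1}, A]\partial_s(A)F_E^{k-i-1}\Big),
\end{align*}
which equals $-d\,\mathrm{tr}\sum_{i=1}^{k-1} \partial_t(A)F_E^{i-1}\partial_s(A)F_E^{k-i-1}$, using the Bianchi identity. To recap, we have the following identity:
\[ \mathrm{tr}\big(\partial_t(\partial_s(A(s, t))\exp(F_E(s, t))) - \partial_s(\partial_t(A(s, t))\exp(F_E(s, t)))\big) = dP''\big(F_E(s, t), \partial_t(A(s, t)), \partial_s(A(s, t))\big). \]
Next observe that we can write $\mathrm{tr}(\partial_s(\partial_t(A(s, t))\exp(F_E(s, t))))$ as
\[ \partial_s\,\mathrm{tr}\big(\partial_t(A(s, t))\exp(F_E(s, t)) - \partial_t(A(0, t))\exp(F_E(0, t))\big), \tag{18} \]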


where the expression inside the trace takes trace class values due to our hypotheses on $A(s, t)$ and $F_E(s, t)$. Integrating with respect to $s$, $t$, we obtain the identity
\begin{align}
& d\int_0^1\!\!\int_0^1 ds \wedge dt\; P''\big(F_E(s, t), \partial_t(A(s, t)), \partial_s(A(s, t))\big) \tag{19} \\
&\quad= \int_0^1 dt\,\mathrm{tr}\big(\partial_t(A(1, t))\exp(F_E(1, t)) - \partial_t(A(0, t))\exp(F_E(0, t))\big) \tag{20} \\
&\qquad - \int_0^1 ds\,P'\big(F_E(s, 1), \partial_s(A(s, 1))\big) + \int_0^1 ds\,P'\big(F_E(s, 0), \partial_s(A(s, 0))\big). \tag{21}
\end{align}

The first 2-parameter family that we will consider is the following: $\nabla_E(s, t) = (1 - s)(d + A(t)) + s\varphi^{-1}(d + A(t))\varphi$, where $A(t) = tA_1 + (1 - t)A_0$ and $A_0$, $A_1$ are connection 1-forms on $E$. To prove the independence of the choice of the connection 1-form, from (19), (20), (21), it suffices to show that (20) vanishes. We compute $\partial_t\nabla_E(s, t) = (1 - s)(A_1 - A_0) + s\varphi^{-1}(A_1 - A_0)\varphi$. In particular, $\partial_t\nabla_E(1, t) = \varphi^{-1}(A_1 - A_0)\varphi$ and $\partial_t\nabla_E(0, t) = (A_1 - A_0)$. Also, $F_E(1, t) = \varphi^{-1}F_E(0, t)\varphi$. So (20) becomes
\[ \int_0^1 dt\,\mathrm{tr}\big(\varphi^{-1}(A_1 - A_0)\varphi\,\exp(\varphi^{-1}F_E(0, t)\varphi) - (A_1 - A_0)\exp(F_E(0, t))\big) = 0 \]
by the invariance of $P$. Therefore
\[ \mathrm{ch}_P(\nabla_E(1), \varphi) - \mathrm{ch}_P(\nabla_E(0), \varphi) = (d - H)\exp(f)\int_0^1\!\!\int_0^1 ds \wedge dt\; P''\big(F_E(s, t), \partial_t\nabla_E(s, t), \partial_s\nabla_E(s, t)\big), \]
as claimed, where $\nabla_E(j) = d + A_j$, $j = 0, 1$.

To prove the independence of the choice of $\varphi$, we choose the 2-parameter family given by $\nabla_E(s, t) = (1 - s)(d + A) + s\varphi_t^{-1}(d + A)\varphi_t$. It suffices to show that (20) is an exact form. We compute that $\partial_t\nabla_E(s, t) = -s\varphi_t^{-1}\dot\varphi_t\varphi_t^{-1}(d + A)\varphi_t + s\varphi_t^{-1}(d + A)\dot\varphi_t$ is a trace class operator. In particular, $\partial_t\nabla_E(1, t) = -\varphi_t^{-1}\dot\varphi_t\varphi_t^{-1}(d + A)\varphi_t + \varphi_t^{-1}(d + A)\dot\varphi_t$ and $\partial_t\nabla_E(0, t) = 0$. Set $\nabla_E = d + A$ and let $F_E$ denote its curvature. Using the fact that $P(X) = \mathrm{tr}(\exp(X))$, we get that (20) becomes
\begin{align*}
\mathrm{tr}\big(d(\dot\varphi_t\varphi_t^{-1})\exp(F_E) + \dot\varphi_t\varphi_t^{-1}[\exp(F_E), A]\big)
&= \mathrm{tr}\big(d(\dot\varphi_t\varphi_t^{-1}\exp(F_E)) - \dot\varphi_t\varphi_t^{-1}d(\exp(F_E)) + \dot\varphi_t\varphi_t^{-1}[\exp(F_E), A]\big) \\
&= d\,\mathrm{tr}\big(\dot\varphi_t\varphi_t^{-1}\exp(F_E)\big),
\end{align*}
since the other terms vanish by repeated application of the Bianchi identity $dF_E - [F_E, A] = 0$. Observe that $\dot\varphi_t\varphi_t^{-1}$ is a trace class operator, so that all the terms are of trace class. Therefore
\begin{align*}
\mathrm{ch}_P(\nabla_E, \varphi_1) - \mathrm{ch}_P(\nabla_E, \varphi_0)
&= (d - H)\exp(f)\int_0^1\!\!\int_0^1 ds \wedge dt\; P''\big(F_E(s, t), \partial_t\nabla_E(s, t), \partial_s\nabla_E(s, t)\big) \\
&\quad + (d - H)\exp(f)\int_0^1 dt\,\mathrm{tr}\big(\dot\varphi_t\varphi_t^{-1}\exp(F_E)\big),
\end{align*}
as claimed.


Let $\mathrm{Mod}^1_{U_{tr}}(M, L)$ denote the semi-group of all pairs $(E, \varphi)$ consisting of a trivial $U_{tr}$ bundle gerbe module $E$ for the lifting bundle gerbe $L \to P^{[2]}$ associated to a principal $PU(\mathcal{H})$ bundle $P \to M$ with Dixmier-Douady class equal to $\delta(P)$, together with an automorphism $\varphi : E \to E$ such that $\varphi - I_E$ is a fibre-wise trace class operator. Then we have defined a map $\mathrm{ch}_P : \mathrm{Mod}^1_{U_{tr}}(M, L) \to H^{odd}(M, P)$. Observe that $\mathrm{Mod}^1_{U_{tr}}(M, L) = [P, U_{tr}]_{PU}$, and by Palais' theorem, we have the natural isomorphism $\mathrm{Mod}^1_{U_{tr}}(M, L) = K^1(M, P)$ that is induced by the inclusion $U_{tr} \subset U_K$. Therefore we get a map $\mathrm{ch}_P : K^1(M, P) \to H^{odd}(M, P)$. We assert that this map defines the odd Chern character for twisted K-theory. It can be shown that the odd Chern character for twisted K-theory is uniquely characterised by requiring that it is a functorial homomorphism which is compatible with the $K^0(M)$-module structure on $K^1(M, P)$ and reduces to the ordinary odd Chern character when $P$ is trivial. It is easy to check that the map $\mathrm{ch}_P : K^1(M, P) \to H^{odd}(M, P)$ is functorial with respect to smooth maps $f : N \to M$. The proof that $\mathrm{ch}_P$ is a homomorphism is exactly as in the even degree case, cf. Sect. 9, [4]. The proof that $\mathrm{ch}_P$ is compatible with the $\tilde K^0(M)$-module structure of $K^1(M, P)$ is exactly as in the earlier section for twisted $K^0$.

6. Equivariant Chern Character: Twisted Case

Suppose that $G$ is a compact Lie group acting on the smooth manifold $M$. We want to define a Chern character for the equivariant twisted K-theory $\tilde K^0_G(M, [H])$. In this equivariant setting the twisting is done by a class in $H^3_G(M, \mathbb{Z})$, where $H_G$ denotes equivariant cohomology. Recall that $H_G$ is defined by first constructing the Borel space $M_G = M \times_G EG$ and then setting $H^\bullet_G(M, \mathbb{Z}) = H^\bullet(M_G, \mathbb{Z})$. When $G$ acts freely on $M$ one can show that there is an isomorphism $H^\bullet_G(M, \mathbb{Z}) = H^\bullet(M/G, \mathbb{Z})$.
Note that equivariant cohomology groups are a lot larger than ordinary cohomology groups; HG• (M, Z) is a module over HG• (pt, Z) = H • (BG, Z). A G-equivariant principal P U -bundle P determines an element of HG3 (M, Z). To see this, we note that P induces a principal P U -bundle P G over M G by first lifting it to the product M × EG and by equivariance, it descends to M G . So by standard Dixmier-Douady theory, it determines a class in H 3 (M G , Z) = HG3 (M, Z). Brylinski [6] identifies HG3 (M, Z) with equivalence classes of G-equivariant gerbes on M, and Meinrenken [19] identifies HG3 (M; Z) with equivalence classes of G-equivariant bundle gerbes on M. G-equivariant principal P U bundles on M form a subgroup of HG3 (M : Z). In the case when M = G and G acts by conjugation on itself, the principal P U -bundles that are associated bundles to the canonical loop group LG bundle over G via positive energy representations of the loop group LG, are all G-equivariant in our sense. Although these representations are strongly continuous, they are equivalent to norm continuous ones. We now explain how a G-equivariant P U bundle on M gives rise to a G-equivariant bundle gerbe. Suppose P is a G-equivariant P U bundle on M, by replacing M and P by M × EG and P × EG respectively, we may assume without loss of generality that G acts freely. Associated to the P U -bundle P /G on M/G is the lifting bundle gerbe J → (P /G)[2] = P [2] /G. The projection P → P /G covering M → M/G induces a map P [2] → (P /G)[2] . Let L denote the pullback of J to P [2] under this map. Then L is


a $G$-equivariant line bundle on $P^{[2]}$. It is easy to see that in fact $L \to P^{[2]}$ has a bundle gerbe product compatible with the isomorphisms $\sigma_g : L_{(p_1, p_2)} \to L_{g(p_1, p_2)}$ of (2). Given a $G$-equivariant $PU$-bundle $P$ with Dixmier-Douady class $\delta(P)$ in $H^3_G(M, \mathbb{Z})$ we define the equivariant twisted K-theory $\tilde K^0_G(M, P)$ as follows. We let $\mathcal{H}_G$ be a separable $G$-Hilbert space in which every irreducible representation of $G$ occurs with infinite multiplicity. We then define $\tilde K^0_G(M, P)$ to be the space of $G$-homotopy classes of $G$-equivariant, $PU$-equivariant maps $P \to BGL_K$, where $P \to M$ is a $G$-equivariant principal $PU$ bundle on $M$ corresponding to the class $\delta(P)$ in $H^3_G(M, \mathbb{Z})$. Here $BGL_K$ is the quotient $GL(\mathcal{H}_G)/GL_K(\mathcal{H}_G)$. The odd-dimensional group $\tilde K^1_G(M, P)$ is defined to be the space of $G$-homotopy classes of $G$-equivariant, $PU$-equivariant maps $P \to GL_K$. Again $GL_K$ is $GL_K(\mathcal{H}_G)$. Again there are various other equivalent definitions of $\tilde K^0_G(M, P)$; we have in analogy with Proposition 4.1:

Proposition 6.1. Given a $G$-equivariant $PU$ bundle $P$ on $M$ corresponding to a class in $H^3_G(M, \mathbb{Z})$, we have the following equivalent definitions of $\tilde K^0_G(M, P)$:
(1) space of $G$-homotopy classes of $G$-equivariant sections of the associated bundle $P \times_{PU} BGL_K$,
(2) $G$-isomorphism classes of $G$-equivariant, $PU$ covariant $GL_K$ bundles on $P$.

Associated to the $G$-equivariant $PU$ bundle $P \to M$ is the $G$-equivariant lifting bundle gerbe $L \to P^{[2]}$. We can formulate the notion of a $G$-equivariant bundle gerbe module for $L$: this is a $G$-equivariant $GL_K$-vector bundle $E \to P$ on which $L$ acts via a $G$-equivariant isomorphism $\pi_1^*E \otimes L \to \pi_2^*E$. Again we require an extra condition relating to the action of $U(\mathcal{H})$ on the principal $GL_K$ bundle associated to $E$. As before we form the principal $GL(\mathcal{H})$ bundle $GL(E)$ on $P$ with fibre at $p$ equal to the isomorphisms $\mathcal{H} \to E_p$. This is a $G$-equivariant $GL(\mathcal{H})$ bundle. The $GL_K$-structure on $E$ induces a reduction $R$ of $GL(E)$ to a $GL_K$-bundle. Again $R$ is a $G$-equivariant $GL_K$ bundle.
We require that the action of $U(\mathcal{H})$ on $GL(E)$ given by sending $f$ to $ufu^{-1}$ is a $G$-map preserving $R$. The set of $G$-equivariant bundle gerbe modules for $L$ forms a semi-group $\mathrm{Mod}_{GL_K}(L, M)_G$ under direct sum. In analogy with Proposition 4.3 we have the following result.

Proposition 6.2. If $L \to P^{[2]}$ is the $G$-equivariant lifting bundle gerbe associated to the $G$-equivariant $PU$ bundle $P \to M$ corresponding to the class $\delta(P) \in H^3_G(M, \mathbb{Z})$ then $\tilde K^0_G(M, P) = \mathrm{Mod}_{GL_K}(L, M)_G$.

Again the result remains valid if we replace $G$-equivariant $GL_K$-vector bundles by $G$-equivariant $GL_{tr}$-vector bundles. We must perform this replacement in order to define the Chern character. Before we define the Chern character we must first have a model for an equivariant de Rham theory. Good references for this are [3] and [17]. We shall use the Cartan model to compute equivariant de Rham theory following [3]. Recall that one introduces the algebra $S(\mathfrak{g}^*) \otimes \Omega^\bullet(M)$. This is given a $\mathbb{Z}$-grading by $\deg(P \otimes \omega) = 2\deg(P) + \deg(\omega)$. An operator on $S(\mathfrak{g}^*) \otimes \Omega^\bullet(M)$ is defined by the formula $d_{\mathfrak{g}} = d - i$, where $i$ denotes contraction with the vector field on $M$ induced by the infinitesimal action of an element of $\mathfrak{g}$ on $M$. $d_{\mathfrak{g}}$ preserves the $G$-invariant sub-algebra $\Omega^\bullet_G(M) = (S(\mathfrak{g}^*) \otimes \Omega^\bullet(M))^G$ and raises the total degree of an element of $\Omega^\bullet_G(M)$ by one. If we restrict $d_{\mathfrak{g}}$ to $\Omega^\bullet_G(M)$ then $d_{\mathfrak{g}}^2 = 0$. It can be shown that for $G$ compact the cohomology of the complex $(\Omega^\bullet_G(M), d_{\mathfrak{g}})$ is equal to the equivariant cohomology $H^\bullet_G(M)$.
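The statement that $d_{\mathfrak{g}}$ squares to zero on invariant elements follows from Cartan's formula. A short sketch, in which an element $\alpha$ of $S(\mathfrak{g}^*) \otimes \Omega^\bullet(M)$ is regarded as a polynomial map $\xi \mapsto \alpha(\xi)$ with values in forms, $\iota_\xi$ denotes contraction with the vector field induced by $\xi$, and $\mathcal{L}_\xi$ is the Lie derivative along it (this notation is assumed here, not taken from the text):

```latex
% The Cartan differential evaluated at \xi \in \mathfrak{g}:
\[ (d_{\mathfrak{g}}\alpha)(\xi) = d\big(\alpha(\xi)\big) - \iota_\xi\big(\alpha(\xi)\big) . \]
% Squaring, with d^2 = 0 and \iota_\xi^2 = 0, and applying Cartan's formula:
\[ (d_{\mathfrak{g}}^2\alpha)(\xi) = (d - \iota_\xi)^2\,\alpha(\xi)
   = -(d\,\iota_\xi + \iota_\xi\,d)\,\alpha(\xi)
   = -\mathcal{L}_\xi\,\alpha(\xi) . \]
```

If $\alpha$ is $G$-invariant, then differentiating the invariance condition along $\exp(t\xi)$, which fixes $\xi$ under the adjoint action, gives $\mathcal{L}_\xi\,\alpha(\xi) = 0$, so $d_{\mathfrak{g}}^2 = 0$ on $\Omega^\bullet_G(M)$.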


We briefly recall, following [3], the construction of Chern classes living in equivariant cohomology $H^{ev}_G(M)$ associated to a $G$-equivariant vector bundle $E \to M$. Suppose that $\nabla$ is a $G$-invariant connection on $E$. By this we mean the following (cf. [6]). The connection $\nabla$ induces by pullback connections $p_1^*\nabla$ and $m^*\nabla$ on the bundles $p_1^*E$ and $m^*E$ on $M \times G$. There exists a 1-form $A$ on $M \times G$ with values in the endomorphism bundle $\mathrm{End}(p_1^*E) = p_1^*\mathrm{End}(E)$ such that
\[ p_1^*\nabla = \sigma^{-1} m^*\nabla \sigma + A. \tag{22} \]
The condition that $\nabla$ is $G$-invariant means that $A$ vanishes in the $M$-direction. We associate a moment map $\mu \in \Gamma(M, \mathrm{End}(E)) \otimes \mathfrak{g}^*$ to $\nabla$ by defining
\[ \mu(\xi)(s) = \mathcal{L}_\xi(s) - \nabla_{\hat\xi}(s), \tag{23} \]
where $s$ is a section of $E$ and $\hat\xi$ is the vector field on $M$ induced by the infinitesimal action of $G$. Here the Lie derivative on sections is defined by
\[ \mathcal{L}_\xi(s) = \frac{d}{dt}\Big|_{t=0} \exp(t\xi) \cdot s, \]

where G acts, for g close to 1, on sections by the formula (gs)(m) = gs(g −1 m). This last point perhaps requires some more explanation. G itself does not act on E but rather there is the isomorphism σ : p1∗ E → m∗ E over M × G. As we have explained already, σ consists of a family of isomorphisms σg : E → g ∗ E which are smooth in g. For g close to the identity therefore we can assume that g acts on E. Note that µ actually lives in ((M, End(E)) ⊗ g∗ )G . There is however another way [6] to look at the moment map which will be more useful for our purposes. Recall the 1-form A defined above by comparing the pullback connections p1∗ ∇ and m∗ ∇ via σ . One can show, as a result of the associativity condition on σ that A is left invariant under the left action of G on M × G, where G acts on itself by left multiplication. Since ∇ is G-invariant A vanishes in the M-direction. Therefore we can define a section µE ∈ (M, End(E)) ⊗ g∗ by (µE )ξ (s)(m) = A((m, 1); (0, ξ ))(s(m)). Again note that µE belongs to ((M, End(E)) ⊗ g∗ )G . We have the equality µ = µE . To define an equivariant Chern character we put chG (E, ∇) = tr(exp(F∇ − µ)). One can show that chG (E, ∇) ∈ (S(g∗ ⊗ • (M))G = •G (M) is equivariantly closed and moreover that the class defined by chG (E, ∇) in HGev (•G (M)) is independent of the choice of connection ∇. We need to explain how the theory of bundle gerbe connections and curvings carries over to the equivariant case. So suppose that L → Y [2] is a G-equivariant bundle gerbe on M. So Y has a G-action and π : Y → M is a submersion which is also a G-map. L → Y [2] is a G-equivariant line bundle with a G-equivariant product. Suppose that L comes equipped with a connection ∇ that preserves the bundle gerbe product (1) and is G-invariant. Since L is G-equivariant there is a line bundle isomorphism σ : p1∗ L → m∗ L covering the identity on Y [2] × G (corresponding to the family of maps σg of (2)). 
As above ∇ induces connections p1∗ ∇ and m∗ ∇ on p1∗ L and m∗ L respectively and we can define a 1-form AL on Y [2] × G in the same manner as Eq. (22) above. Again it is easy to see that AL is left invariant under the action of G on Y [2] × G, where G acts on


itself by left multiplication and since $\nabla$ is $G$-invariant $A_L$ vanishes in the $Y^{[2]}$-direction. It is also straightforward to see that we have the equation
\[ (\pi_1 \times 1)^*A_L - (\pi_2 \times 1)^*A_L + (\pi_3 \times 1)^*A_L = 0 \tag{24} \]
in $\Omega^1(Y^{[3]} \times G)$ (here for example $(\pi_1 \times 1)(y_1, y_2, y_3, g) = (y_2, y_3, g)$). It follows that we can find a 1-form $B$ on $Y \times G$ so that $A_L = (\pi_2 \times 1)^*B - (\pi_1 \times 1)^*B$. One can show that it is possible to choose $B$ so that it is invariant under $G$ and vanishes in the $Y$-direction. It follows that we can define maps $\tilde\mu_L : Y^{[2]} \to \mathfrak{g}^*$, $\mu_L : Y \to \mathfrak{g}^*$ such that $\tilde\mu_L(y_1, y_2)(\xi) = A_L((y_1, y_2, 1); (0, 0, \xi))$ and $\mu_L(y)(\xi) = B((y, 1); (0, \xi))$ and moreover $\tilde\mu_L = \delta(\mu_L) = \pi_2^*\mu_L - \pi_1^*\mu_L$. The degree two element $F - \tilde\mu_L$ of $\Omega^\bullet_G(Y^{[2]})$ is equivariantly closed, i.e. closed with respect to $d_{\mathfrak{g}}$. We have $F - \tilde\mu_L = \delta(f - \mu_L)$ and, since $\delta$ commutes with $d_{\mathfrak{g}}$, we have $\delta d_{\mathfrak{g}}(f - \mu_L) = 0$. It follows from here that $d_{\mathfrak{g}}(f - \mu_L) = \pi^*H$ for some degree three element $H$ of $\Omega^\bullet_G(M)$ which is necessarily equivariantly closed.

We also need to explain how twisted cohomology arises in the equivariant case. Suppose $H$ is the equivariantly closed degree three element of $\Omega^\bullet_G(M)$ associated to an invariant bundle gerbe connection on the equivariant bundle gerbe associated to the $G$-equivariant $PU$ bundle $P$ on $M$. Thus the image of $H$ in $H^3(\Omega^\bullet_G(M))$ represents the image of the equivariant Dixmier-Douady class of $P$ in $H^3_G(M)$. As in Sect. 3, we introduce a twisted differential on the algebra $\Omega^\bullet_G(M)$ by $d_H = d_{\mathfrak{g}} - H$. Because $H$ is equivariantly closed, $d_H^2 = 0$. We denote the cohomology of the complex $(\Omega^\bullet_G(M), d_H)$ by $H^\bullet_G(M, H)$.

To define the twisted equivariant Chern character we need to explain what an equivariant module connection is. Suppose firstly that the equivariant bundle gerbe $L \to P^{[2]}$ has a $G$-invariant connection $\nabla$ and "equivariant curving" $f - \mu_L$. If $E \to P$ is a $G$-equivariant bundle gerbe module for the equivariant bundle gerbe $L \to P^{[2]}$ then a $G$-equivariant module connection for $E$ is a connection on the vector bundle $E$ which is invariant with respect to the $G$-action on $E$ and which is compatible with the bundle gerbe connection $\nabla$ in the sense of Eq. (4).
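The vanishing of the square of the twisted equivariant differential can be checked directly. The sketch below takes an arbitrary $\omega \in \Omega^\bullet_G(M)$ (a symbol introduced here) and uses three facts: $d_{\mathfrak{g}}$ is a derivation of odd degree, $d_{\mathfrak{g}}^2 = 0$ on invariant elements, and $H$ has odd total degree in a graded commutative algebra.

```latex
% Expand d_H = d_g - H applied twice:
\[ d_H^2\,\omega = d_{\mathfrak{g}}\big(d_{\mathfrak{g}}\omega - H\omega\big)
                  - H\big(d_{\mathfrak{g}}\omega - H\omega\big) . \]
% Leibniz rule with H of odd degree: d_g(H\omega) = (d_g H)\omega - H d_g\omega,
% so the cross terms H d_g\omega cancel:
\[ d_H^2\,\omega = d_{\mathfrak{g}}^2\,\omega - (d_{\mathfrak{g}}H)\,\omega + H^2\,\omega . \]
% Each remaining term vanishes:
\[ d_{\mathfrak{g}}^2\,\omega = 0, \qquad d_{\mathfrak{g}}H = 0, \qquad H^2 = -H^2 = 0 . \]
```

Only the middle term uses the hypothesis that $H$ is equivariantly closed; the other two vanish for any odd element of $\Omega^\bullet_G(M)$.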
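Since the isomorphism $\pi_1^* E \otimes L \to \pi_2^* E$ is a $G$-isomorphism we have the commuting diagram
\[
\begin{array}{ccc}
p_1^*(\pi_1^* E \otimes L) & \xrightarrow{\ \pi_1^*\sigma_E \otimes \sigma_L\ } & m^*(\pi_1^* E \otimes L) \\
\downarrow {\scriptstyle p_1^*\psi} & & \downarrow {\scriptstyle m^*\psi} \\
p_1^* \pi_2^* E & \xrightarrow{\ \pi_2^*\sigma_E\ } & m^* \pi_2^* E
\end{array}
\]
over $P^{[2]} \times G$. From this diagram we deduce that the $\mathrm{End}(E)$-valued 1-form $A_E$ on $P \times G$ and the 1-form $A_L$ on $P^{[2]} \times G$ satisfy the equation
\[ (\pi_1 \times 1)^* A_E \otimes I + I \otimes A_L = p_1^*\psi^{-1}\, (\pi_2 \times 1)^* A_E \; p_1^*\psi \tag{25} \]
in $\Omega^1(P^{[2]} \times G) \otimes p_1^* \mathrm{End}(\pi_1^* E \otimes L)$. From here it is easy to conclude that the moment maps $\mu_E$ and $\delta(\mu_L)$ satisfy
\[ \pi_1^*(\mu_E - \mu_L) = \psi^{-1} \pi_2^*(\mu_E - \mu_L)\psi. \tag{26} \]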

If we are given a $\mathbb{Z}_2$-graded equivariant bundle gerbe module $E = E_0 \oplus E_1$ we can define a twisted equivariant Chern character $\mathrm{ch}^G_P(E)$ taking values in the twisted equivariant cohomology group $H^\bullet_G(M, H)$ as follows. We first of all choose a $\mathbb{Z}_2$-graded equivariant bundle gerbe module connection $\nabla = \nabla_0 \oplus \nabla_1$ on $E$ such that the difference $\nabla_0 - \nabla_1$ is trace class when considered in local trivialisations. It is easily seen that the difference
\[ \exp(F_0 + \mu_{E_0} + f - \mu_L) - \exp(F_1 + \mu_{E_1} + f - \mu_L) \tag{27} \]
is trace class. It follows from Eqs. (5) and (26) that the trace of this expression descends to $M$. The resulting form is easily seen to be closed with respect to the twisted equivariant differential.

7. Chern Character: The Twisted Holomorphic Case

In this section, we introduce the theory of holomorphic bundle gerbes, holomorphic bundle gerbe modules, and holomorphic bundle gerbe K-theory, as well as the Chern character in this context. For the possible relevance of the material in this section to D-brane theory, cf. [24, 14].

7.1. Holomorphic bundle gerbes. Holomorphic gerbes have been studied by various authors such as [6] using categories, stacks and coherent sheaves. Our goal in this subsection is to sketch a holomorphic bundle theory analogue, as accomplished by Murray [21] in the smooth case. In this section $M$ denotes a complex manifold. A holomorphic bundle gerbe on $M$ consists of a holomorphic submersion $\pi : Y \to M$, together with a holomorphic line bundle $L \to Y^{[2]}$ on the fibre product $Y^{[2]} = Y \times_\pi Y$. $L$ is required to have a product, that is a holomorphic bundle isomorphism which on the fibres takes the form
\[ L_{(y_1, y_2)} \otimes L_{(y_2, y_3)} \to L_{(y_1, y_3)} \tag{28} \]

for points $y_1$, $y_2$ and $y_3$ all belonging to the same fibre of $Y$. This product is required to be associative in the obvious sense. If one chooses an open cover $\mathcal{U} = \{U_i\}$ of $M$ over which there exist local holomorphic sections $s_i : U_i \to Y$ of $\pi$, then we can form holomorphic maps $(s_i, s_j) : U_{ij} \to Y^{[2]}$ and pull back the holomorphic line bundle $L$ to get a family of holomorphic line bundles $L_{ij}$ on $U_{ij}$ together with holomorphic isomorphisms $L_{ij} \otimes L_{jk} \to L_{ik}$. One example of such a structure is the case when we have a holomorphic principal $PGL$ bundle $Y$ over $M$. That is, $Y$ has holomorphic local trivialisations. Then we can form the lifting bundle gerbe $L \to Y^{[2]}$ in the standard manner. $L$ is a holomorphic line bundle and the bundle gerbe product on $L$ induced from the product in the group $GL$ is holomorphic.

Every holomorphic bundle gerbe on $M$ gives rise to a class in the sheaf cohomology group $H^2(M, \mathcal{O}^*_M)$, where $\mathcal{O}^*_M$ denotes the sheaf of nonvanishing holomorphic functions on $M$. We give an informal sketch of a way to associate an $\mathcal{O}^*$-valued Čech 2-cocycle $\epsilon_{ijk}$ to the holomorphic bundle gerbe $L \to Y^{[2]}$ in the manner described in [21] and [6]. We first pick an open cover $\{U_i\}$ of $M$ such that there exist holomorphic local sections $s_i : U_i \to Y$ of $\pi : Y \to M$. Then we can form the holomorphic maps $(s_i, s_j) : U_{ij} \to Y^{[2]}$ and form the pullback line bundle $L_{ij} = (s_i, s_j)^* L$. $L_{ij}$ is a holomorphic line bundle on $U_{ij}$. We then pick holomorphic sections $\sigma_{ij} : U_{ij} \to L_{ij}$ and define a holomorphic function $\epsilon_{ijk} : U_{ijk} \to \mathcal{O}^*$. It is easy to see that $\epsilon_{ijk}$ satisfies the Čech 2-cocycle condition $\delta(\epsilon_{ijk}) = 1$. The class in $H^2(M, \mathcal{O}^*_M)$ determined by $\epsilon_{ijk}$ can be shown to be independent of all the choices. Brylinski [6] proves that there is an isomorphism between equivalence classes of holomorphic gerbes on $M$ and $H^2(M, \mathcal{O}^*_M)$. To obtain an analogue of this result for holomorphic bundle gerbes we would need to introduce the notion of stable isomorphism of holomorphic bundle gerbes, analogous to what was done in [22] in the smooth case.

Consider now holomorphic bundle gerbes $L \to Y^{[2]}$ with a given hermitian metric, and call these hermitian holomorphic bundle gerbes. A compatible bundle gerbe connection $\nabla$ on a hermitian holomorphic bundle gerbe $L \to Y^{[2]}$ is a connection on the line bundle $L$ which preserves a hermitian metric and is compatible with the product given in (28), i.e. $\nabla(st) = \nabla(s)t + s\nabla(t)$ for sections $s$ and $t$ of $L$. In [21] it is shown that bundle gerbe connections that preserve a hermitian metric always exist in the smooth case, and this implies the existence in the holomorphic case. The curvature $F_\nabla \in \Omega^{1,1}(Y^{[2]}) \subset \Omega^2(Y^{[2]})$ of a compatible bundle gerbe connection $\nabla$ is easily seen to satisfy $\delta(F_\nabla) = 0$ in $\Omega^2(Y^{[3]})$. It follows that we can find a $(1,1)$-form $f$ on $Y$ such that $F_\nabla = \delta(f) = \pi_2^* f - \pi_1^* f$. $f$ is unique up to $(1,1)$-forms pulled back from $M$. A choice of $f$ is called a choice of a curving for $\nabla$. Since $F_\nabla$ is closed we must have $df = \pi^* \omega$ for some necessarily closed 3-form $\omega \in \Omega^{2,1}(M) \oplus \Omega^{1,2}(M)$ on $M$. $\omega$ is called the 3-curvature of the compatible bundle gerbe connection $\nabla$ and curving $f$. It follows from [21] that $\omega$ has integral periods, and the Dixmier-Douady invariant of the hermitian holomorphic bundle gerbe $L \to Y^{[2]}$ is $[\omega] \in (H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})) \cap H^3(M, \mathbb{Z}) \subset H^3(M, \mathbb{Z})$. This says in particular that not all classes in $H^3(M, \mathbb{Z})$ arise as the Dixmier-Douady invariant of hermitian holomorphic bundle gerbes.
If we instead merely considered holomorphic bundle gerbes $L \to Y^{[2]}$, then a compatible bundle gerbe connection $\nabla$ would have curvature $F_\nabla \in \Omega^{2,0}(Y^{[2]}) \oplus \Omega^{1,1}(Y^{[2]}) \subset \Omega^2(Y^{[2]})$. Arguing as above, we can find a choice of curving for $\nabla$ which is a form $f$ on $Y$ that is in the subspace $\Omega^{2,0}(Y) \oplus \Omega^{1,1}(Y) \subset \Omega^2(Y)$, satisfying $F_\nabla = \delta(f) = \pi_2^* f - \pi_1^* f$. It follows as before that if $\omega$ is the 3-curvature of the compatible bundle gerbe connection $\nabla$ and curving $f$, then $\omega$ is a closed form in $\Omega^{3,0}(M) \oplus \Omega^{2,1}(M) \oplus \Omega^{1,2}(M) \subset \Omega^3(M)$. Let $\mathcal{O}_M$ denote the structure sheaf of $M$. The exact sequence of sheaves
\[ 0 \to \mathbb{Z} \to \mathcal{O}_M \xrightarrow{\ \exp\ } \mathcal{O}^*_M \to 0 \]
induces an exact sequence in cohomology
\[ \cdots \to H^2(M, \mathcal{O}_M) \to H^2(M, \mathcal{O}^*_M) \xrightarrow{\ \delta\ } H^3(M, \mathbb{Z}) \to H^3(M, \mathcal{O}_M) \to \cdots. \tag{29} \]

Since in general the cohomology groups $H^j(M, \mathcal{O}_M)$, $j = 2, 3$, do not vanish, we conclude that holomorphic bundle gerbes are certainly not classified by their Dixmier-Douady invariant. We have shown above that $\delta(H^2(M, \mathcal{O}^*_M)) \subset (H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})) \cap H^3(M, \mathbb{Z})$. We will now argue that in fact equality holds, that is $\delta(H^2(M, \mathcal{O}^*_M)) = (H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})) \cap H^3(M, \mathbb{Z})$. (This result was known to I. M. Singer.) To see this, it suffices by (29) to show that the image of $(H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})) \cap H^3(M, \mathbb{Z})$ in $H^3(M, \mathcal{O}_M)$ is trivial. It suffices to show that the image of $H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})$ in $H^3(M, \mathcal{O}_M)$ is trivial. Let $\pi_j : \Omega^j(M) \to \Omega^{0,j}(M)$ denote the projection onto the subspace of differential forms of type $(0, j)$. Then it is not hard to see that the mapping $H^3(M, \mathbb{C}) \to H^3(M, \mathcal{O}_M)$ is represented by a mapping of a $d$-closed differential 3-form $\psi$ onto the $\bar\partial$-closed differential form $\pi_3(\psi)$. Since any class in $H^{i,j}(M, \mathbb{C})$ is represented by a $d$-closed differential form $\psi$ of type $(i, j)$, it follows that $\pi_3(\psi) = 0$ for all $d$-closed differential forms $\psi$ of type $(3, 0)$, $(1, 2)$ or $(2, 1)$. In particular, the image of $H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})$ in $H^3(M, \mathcal{O}_M)$ is trivial as claimed. We conclude that a bundle gerbe is stably isomorphic to a holomorphic bundle gerbe if and only if its Dixmier-Douady invariant lies in the subspace $(H^{1,2}(M, \mathbb{C}) \oplus H^{2,1}(M, \mathbb{C})) \cap H^3(M, \mathbb{Z})$ of $H^3(M, \mathbb{Z})$.

7.2. Holomorphic bundle gerbe modules. In this subsection, we will assume that $Y$ is a holomorphic principal $PGL(\mathcal{H})$ bundle over $M$. In the notation above, we consider holomorphic principal $GL_{\mathcal{K}}$ bundles $P$ over $Y$, and their associated Hilbert bundles $E$ over $Y$ in the standard representation. Such bundles have local holomorphic trivializations and will be called holomorphic $GL_{\mathcal{K}}$-vector bundles over $Y$. A holomorphic bundle gerbe module for $L$ is defined to be a holomorphic $GL_{\mathcal{K}}$-vector bundle $E$ on $Y$ together with an action of $L$ on $E$. This is a holomorphic vector bundle isomorphism $\pi_1^* E \otimes L \to \pi_2^* E$ on $Y^{[2]}$ which is compatible with the product on $L$. As in the previous cases there is an extra condition regarding the action of $GL(\mathcal{H})$ on the principal $GL_{\mathcal{K}}$ bundle associated to $E$. As before we form the holomorphic principal $GL(\mathcal{H})$ bundle $GL(E)$ on $Y$ with fibre at a point $y$ equal to the isomorphisms $f : \mathcal{H} \to E_y$. Let $R$ denote the principal $GL_{\mathcal{K}}$ reduction of $GL(E)$ determined by the $GL_{\mathcal{K}}$ structure of $E$. Then for $u \in GL(\mathcal{H})$ such that $y_2 = y_1[u]$, the map $GL(E)_{y_1} \to GL(E)_{y_2}$ given by sending $f$ to $u f u^{-1}$ preserves $R$. Analogous to the result in Sect. 4, the space of holomorphic $GL_{\mathcal{K}}$-vector bundles on $Y$ forms a semi-ring under the operations of direct sum and tensor product.
It is easy to see that the operation of direct sum is compatible with the action of the lifting holomorphic bundle gerbe $L \to Y^{[2]}$ and so the set of holomorphic bundle gerbe modules for $L$, $\mathrm{Mod}^{\mathrm{hol}}_{GL_{\mathcal{K}}}(L, M)$, has a natural structure as a semi-group. We denote the group associated to $\mathrm{Mod}^{\mathrm{hol}}_{GL_{\mathcal{K}}}(L, M)$ by the Grothendieck construction by $\mathrm{Mod}^{\mathrm{hol}}_{GL_{\mathcal{K}}}(L, M)$ as well. We define the (reduced) holomorphic bundle gerbe K-theory as $\tilde K^0_{\mathrm{hol}}(M, Y) = \mathrm{Mod}^{\mathrm{hol}}_{GL_{\mathcal{K}}}(L, M)$, where $[H]$ is the Dixmier-Douady class of the holomorphic bundle gerbe $L \to Y^{[2]}$. Moreover, we can replace holomorphic $GL_{\mathcal{K}}$-vector bundles by holomorphic $GL_{tr}$-vector bundles and recover the same holomorphic K-theory. It can be shown that when $L$ is trivial, the holomorphic bundle gerbe K-theory is isomorphic to the K-theory of holomorphic vector bundles as in [11]. By forgetting the holomorphic structure, we see that there is a natural homomorphism $\tilde K^0_{\mathrm{hol}}(M, Y) \to \tilde K^0(M, Y)$ to bundle gerbe K-theory. By composing with the Chern character homomorphism $\mathrm{ch}_L : \tilde K^0(M, Y) \to H^{ev}(M, Y)$ in bundle gerbe K-theory defined in [4], we obtain a Chern character homomorphism in holomorphic bundle gerbe K-theory,
\[ \mathrm{ch}_Y : \tilde K^0_{\mathrm{hol}}(M, Y) \to H^{ev}(M, Y). \tag{30} \]

It has the following properties: 1) $\mathrm{ch}_Y$ is natural with respect to pullbacks under holomorphic maps, 2) $\mathrm{ch}_Y$ respects the $\tilde K^0_{\mathrm{hol}}(M)$-module structure of $\tilde K^0_{\mathrm{hol}}(M, Y)$, and 3) $\mathrm{ch}_Y$ reduces to the ordinary Chern character on $\tilde K^0_{\mathrm{hol}}(M)$ when $Y$ is trivial, cf. [11]. Rationally, the Chern character (30) is far from being onto, as can be seen by choosing hermitian connections compatible with the holomorphic structure in the Chern-Weil description of the Chern character. In the particular case when $Y$ is trivial, the image of the Chern character is contained in Dolbeault cohomology classes of type $(p, p)$, and the precise image is related to the Hodge conjecture. The Chern-Weil expression for the Chern character in this context is again given by the expression in Proposition 4.5.

8. Spinor Bundle Gerbe Modules

In this section we give concrete examples of bundle gerbe modules associated to a manifold $M$ without a $Spin^C$-structure and also to manifolds without a Spin-structure. This construction easily extends to the case when a general vector bundle on $M$ has neither a $Spin^C$-structure nor a Spin-structure. In the case when $M$ does not have a $Spin^C$-structure, this bundle gerbe module represents a class in twisted K-theory of $M$, where the twisting is done by a 2-torsion class in $H^3(M, \mathbb{Z})$. Suppose then that $M$ is an $n$-dimensional oriented manifold without a $Spin^C(n)$-structure. It is a well known result of Whitney that orientable manifolds of dimension less than or equal to four have $Spin^C$-structures, but there are many examples of higher dimensional orientable manifolds that do not have any $Spin^C$-structures. One collection of examples of manifolds that do not have any $Spin^C$-structures are the Dold manifolds $P(2m + 1, 2n)$, which are defined as the quotient $(S^{2m+1} \times \mathbb{CP}^{2n})/\mathbb{Z}_2$, where the action of $\mathbb{Z}_2$ is given by $(x, z) \to (-x, \bar z)$. Recall that $Spin^C(n) = Spin(n) \times_{\mathbb{Z}_2} S^1$ and hence there is a central extension $S^1 \to Spin^C(n) \to SO(n)$. Let $SO(M)$ denote the oriented frame bundle of $M$. Then associated to $SO(M)$ is the lifting bundle gerbe arising from the central extension of $Spin^C(n)$. More precisely, over the fibre product $SO(M)^{[2]} = SO(M) \times_\pi SO(M)$ (here $\pi : SO(M) \to M$ denotes the projection) we have the canonical map $SO(M)^{[2]} \to SO(n)$. We can pull back the principal $S^1$-bundle $Spin^C(n) \to SO(n)$ to $SO(M)^{[2]}$ via this map. The resulting bundle $L \to SO(M)^{[2]}$ is a bundle gerbe. It is natural to call this a $Spin^C$ bundle gerbe. The Dixmier-Douady class of this bundle gerbe in $H^3(M, \mathbb{Z})$ coincides with the third integral Stiefel-Whitney class $W_3(TM)$, which measures precisely the obstruction to $M$ being $Spin^C(n)$. Recall that $W_3(TM) = \beta w_2(TM)$, the Bockstein $\beta$ applied to the second Stiefel-Whitney class $w_2(TM)$. As a consequence $W_3(TM)$ is a 2-torsion class.

We can pull back the $SO(n)$-bundle $SO(M) \xrightarrow{\ \pi\ } M$ to $SO(M)$ via $\pi$ to get an $SO(n)$ bundle $\pi^* SO(M) \to SO(M)$. Since $W_3(\pi^* SO(M)) = 0$, we can construct a lift $\widehat{\pi^* SO(M)}$ of the structure group of $\pi^* SO(M)$ to $Spin^C(n)$; i.e. there is a bundle map $\widehat{\pi^* SO(M)} \to \pi^* SO(M)$ covering the homomorphism $p : Spin^C(n) \to SO(n)$. It is easy to see what this lift is. Note that $\pi^* SO(M)$ identifies canonically with $SO(M)^{[2]}$. Therefore we can regard the line bundle $L \to SO(M)^{[2]}$ as sitting over $\pi^* SO(M)$. It is easy to see that a lift $\widehat{\pi^* SO(M)}$ is given by the total space of the line bundle $L$ over $\pi^* SO(M)$. We can form the bundle of spinors $S = S(\pi^* SO(M)) \to SO(M)$ associated to $\widehat{\pi^* SO(M)}$. Recall that we do this by taking an irreducible representation $V$ of $Spin^C(n)$ and forming the associated bundle $S = \widehat{\pi^* SO(M)} \times_{Spin^C(n)} V$ on $SO(M)$. It is straightforward to show that $S$ is a module for the bundle gerbe $L \to SO(M)^{[2]}$.
Recall that in the case when the dimension of $M$ is even, there are two half spin representations and in the odd dimensional case, there is a unique spin representation. It is natural to call this a spinor bundle gerbe module. This discussion can be extended to cover the case when we have a real vector bundle $E \to M$. We can then associate a lifting bundle gerbe to the oriented frame bundle $SO(E)$ which measures the obstruction to $SO(E)$ having a lift of the structure group to $Spin^C$. On $SO(E)$ we can form the pullback bundle $\pi^* SO(E)$. This has a lift $\widehat{\pi^* SO(E)}$ to a $Spin^C(n)$ bundle on $SO(E)$. Associated to $\widehat{\pi^* SO(E)}$ we can construct the bundle of spinors $S(\pi^* SO(E)) \to SO(E)$: again this is a module for the lifting bundle gerbe $L \to SO(E)^{[2]}$.

The possible spinor bundle gerbe modules for the $Spin^C$ bundle gerbe $L \to P^{[2]}$ are parametrised by $H^2(M, \mathbb{Z})$. We see this as follows. Let $L \to M$ be a line bundle on $M$. Then given a spinor bundle gerbe module $S$ we can extend the action of $L$ on $S$ to an action on $S \otimes \pi^* L$ by acting trivially with $L$ on $\pi^* L$. $S \otimes \pi^* L$ is a spinor bundle gerbe module since tensoring with $\pi^* L$ preserves rank and therefore takes bundles of irreducible $Spin^C$ modules to bundles of irreducible $Spin^C$ modules. Thus we have an action of the category of line bundles on $M$ on the category of all spinor bundle gerbe modules. The following argument showing that this is a transitive action is due to M. Murray. Suppose we have two spinor bundle gerbe modules $S_1$ and $S_2$ on $SO(M)$. The frame bundles of $S_1$ and $S_2$ provide lifts $\widehat{\pi^* SO(M)}_1$ and $\widehat{\pi^* SO(M)}_2$ respectively of the structure group of $\pi^* SO(M)$ to $Spin^C(n)$. It is straightforward to see that there is a principal $U(1)$ bundle $P$ on $SO(M)$ such that $\widehat{\pi^* SO(M)}_2 = \widehat{\pi^* SO(M)}_1 \otimes P$. Let $\hat L$ denote the complex line bundle on $SO(M)$ associated to $P$. Then we have an isomorphism $S_2 = S_1 \otimes \hat L$. We claim that the line bundle $\hat L$ on $SO(M)$ descends to a line bundle $L$ on $M$. To see this, think of the fibre $\hat L_y$ of $\hat L$ at $y \in SO(M)$ as consisting of isomorphisms $(S_1)_y \to (S_2)_y$.
Similarly we can think of a point $l$ of the fibre $L_{(y, y')}$ of $L$ at $(y, y') \in SO(M)^{[2]}$ as an isomorphism $l : (S_1)_y \to (S_1)_{y'}$ or as an isomorphism $l : (S_2)_y \to (S_2)_{y'}$. We can therefore use $l$ to construct an isomorphism $\hat L_y \to \hat L_{y'}$. This isomorphism is easily seen to be independent of the choice of $l \in L_{(y, y')}$. It therefore follows that the line bundle $\hat L$ descends to a line bundle $L$ on $M$.

The construction above can be generalized in a straightforward manner to yield the following proposition, which constructs bundle gerbe modules in the case when the Dixmier-Douady invariant is a torsion class.

Proposition 8.1.
(1) The following data can be used to construct bundle gerbe modules $E$ on a manifold $M$.
• A principal $G$-bundle $P$ on $M$, where $G$ is a finite dimensional Lie group;
• A central extension $U(1) \to \hat G \to G$;
• A finite dimensional representation $\rho : \hat G \to GL(H)$ such that the restriction of $\rho$ to the central $U(1)$ subgroup is the identity.
The bundle gerbe modules $E$ determine elements in $K^0(M, P)$.
(2) The following data can be used to construct holomorphic bundle gerbe modules $E$ on a complex manifold $M$.
• A holomorphic principal $G$-bundle $P$ on $M$, where $G$ is a finite dimensional complex Lie group;
• A central extension $\mathbb{C}^* \to \hat G \to G$;
• A finite dimensional representation $\rho : \hat G \to GL(H)$ such that the restriction of $\rho$ to the central $\mathbb{C}^*$ subgroup is the identity.
The holomorphic bundle gerbe modules $E$ determine elements in $K^0_{\mathrm{hol}}(M, P)$.
(3) Suppose that $K$ is a compact, connected and simply connected simple Lie group. Then the following data can be used to construct $K$ equivariant bundle gerbe modules $E$ on a manifold $M$.
• A $K$ equivariant principal $G$-bundle $P$ on $M$, where $G$ is a finite dimensional Lie group;
• A central extension $U(1) \to \hat G \to G$;
• A finite dimensional representation $\rho : \hat G \to GL(H)$ such that the restriction of $\rho$ to the central $U(1)$ subgroup is the identity.
The $K$ equivariant bundle gerbe modules $E$ determine elements in $K^0_K(M, P)$.

In part (3) of the proposition, the only difficulty lies in showing that the class in $H^3_K(M; \mathbb{Z})$ associated to the $K$-equivariant principal $G$-bundle $P \to M$ vanishes when lifted to $H^3_K(P; \mathbb{Z})$. To see this it suffices to consider the universal case, when $P$ is $EG$. Recall that the $K$-equivariant cohomology of $EG$ is defined to be the cohomology of the space $EG \times_K EK$. Note that $K$ acts freely on $EG \times EK$ and hence $EG \times_K EK$ has the homotopy type of $BK$. The hypotheses on $K$, namely $1 = \pi_0(K) = \pi_1(K) = \pi_2(K)$, imply that $1 = \pi_0(BK) = \pi_1(BK) = \pi_2(BK) = \pi_3(BK)$, using the fibration $K \to EK \to BK$ and the long exact sequence in homotopy. By the Hurewicz theorem, it follows that $0 = H_1(BK; \mathbb{Z}) = H_2(BK; \mathbb{Z}) = H_3(BK; \mathbb{Z})$. By the universal coefficient theorem, it follows that $H^3(BK; \mathbb{Z}) = 0$, that is, the degree three $K$-equivariant cohomology of $EG$ vanishes. It follows that we can then choose a $K$-equivariant lift $\widehat{\pi^* P}$ of $\pi^* P$ to a $\hat G$ bundle. We can then form the associated vector bundle $E = \widehat{\pi^* P} \times_\rho H$. It is clear that $E$ is a $K$-equivariant bundle gerbe module.

Proposition 8.1 can be viewed as the analogue in the twisted case of the associated bundle construction. It can be formalised as follows. Let $G$, $\hat G$ be a compact Lie group and a $U(1)$ central extension respectively. Let $R(\hat G)$ denote the representation ring of $\hat G$, which we recall is defined as the free Abelian group generated by the irreducible complex representations of $\hat G$. Let $R_0(\hat G)$ denote the subgroup of $R(\hat G)$ defined as those representations $\rho$ of $\hat G$ such that the restriction of $\rho$ to the central $U(1)$ subgroup is the identity. The augmentation homomorphism $\varepsilon : R_0(\hat G) \to \mathbb{Z}$ assigns to each representation in $R_0(\hat G)$ its dimension, and the augmentation subgroup $I_0(\hat G)$ is the kernel of $\varepsilon$. Given a principal $G$ bundle $P$ over $M$, the construction in Proposition 8.1 part (1) yields a homomorphism
\[ \alpha_P : R_0(\hat G) \to K^0(M, P). \]
If $M$ is a point, then the homomorphism reduces to the augmentation homomorphism $\varepsilon$. If $f : N \to M$ is a smooth map, then it is not hard to see that the following diagram commutes,
\[
\begin{array}{ccc}
K^0(M, P) & \xrightarrow{\ f^!\ } & K^0(N, f^* P) \\
{\scriptstyle \alpha_P} \nwarrow & & \nearrow {\scriptstyle \alpha_{f^* P}} \\
& R_0(\hat G). &
\end{array}
\]
Similarly, the hypotheses of Proposition 8.1 part (3) yield the homomorphism
\[ \alpha_P : R_0(\hat G) \to K^0_K(M, P). \]

We now consider the case when the manifold $M$ does not have a Spin-structure. The discussion above also makes sense if we replace $Spin^C(n)$ by $Spin(n)$ and consider the central extension $\mathbb{Z}_2 \to Spin(n) \to SO(n)$, so we will avoid repetition. Given a principal $SO(n)$ bundle $P$ on $M$ (in particular the oriented bundle of frames on $M$) we can consider the lifting bundle gerbe associated to this central extension of $SO(n)$ by $\mathbb{Z}_2$. This time we will have a principal $\mathbb{Z}_2$ bundle $L \to P^{[2]}$ (or equivalently a real line bundle over $P^{[2]}$). It is natural to call this a Spin bundle gerbe. The "real version" of the Dixmier-Douady invariant of the Spin bundle gerbe coincides with the second Stiefel-Whitney class of $P$ in $H^2(M, \mathbb{Z}_2)$. We remark that the real version of Dixmier-Douady theory involves the obvious modifications to standard Dixmier-Douady theory, and is in the literature (cf. [23]). The application of the real version of Dixmier-Douady theory to the real version of bundle gerbe theory is what is used here, the details of which are obvious modifications of the standard theory of bundle gerbes. As above, the pullback $\pi^* P$ of $P$ to $P$ has a lifting to a $Spin(n)$ bundle $\widehat{\pi^* P} \to P$. We consider the associated bundle of spinors by taking an irreducible representation $V$ of $Spin(n)$ and forming the associated vector bundle $S = \widehat{\pi^* P} \times_{Spin(n)} V$ on $P$. $S$ is a bundle gerbe module for $L$, called a spinor bundle gerbe module as before. One can show that the possible spinor bundle gerbe modules for the Spin bundle gerbe $L \to P^{[2]}$ are parametrised by $H^1(M, \mathbb{Z}_2)$, i.e. the real line bundles on $M$, by following closely the proof given above in the $Spin^C$ case.

References

1. Atiyah, M.F.: K-theory. New York: W.A. Benjamin, 1967; ibid, K-theory past and present. math.KT/0012213
2. Atiyah, M.F., Singer, I.M.: The index of elliptic operators I. Ann. Math. 87, 484–530 (1968); ibid, Index theory for skew-adjoint Fredholm operators. Publ. Math. IHES 37, 305–326 (1969)
3. Berline, N., Getzler, E., Vergne, M.: Heat kernels and Dirac operators. Grundlehren Math. Wiss. 298, Berlin: Springer-Verlag, 1992
4. Bouwknegt, P., Carey, A., Mathai, V., Murray, M., Stevenson, D.: Twisted K-theory and K-theory of bundle gerbes. To appear in Commun. Math. Physics, hep-th/0106194
5. Bouwknegt, P., Mathai, V.: D-branes, B-fields and twisted K-theory. J. High Energy Phys. 03, 007 (2000), hep-th/0002023
6. Brylinski, J.-L.: Loop spaces, characteristic classes and geometric quantization. Prog. Math. Vol. 107, Boston MA: Birkhäuser Boston, 1993; ibid, Gerbes on complex reductive Lie groups. math.DG/0002158
7. Diaconescu, D.-E., Moore, G.W., Witten, E.: E8 gauge theory, and a derivation of K theory from M theory. hep-th/0005090
8. Freed, D.S.: The Verlinde algebra is twisted equivariant K-theory. math.RT/0101038
9. Giraud, J.: Cohomologie non-abélienne. Die Grundlehren Math. Wiss., Band 179, Berlin-New York: Springer-Verlag, 1971
10. Harvey, J.A., Moore, G.: Non-commutative Tachyons and K-theory. hep-th/0009030
11. Hirzebruch, F.: Topological methods in algebraic geometry. Die Grundlehren der Mathematischen Wissenschaften, Band 131, New York: Springer-Verlag New York, Inc., 1966
12. Hořava, P.: Type IIA D-branes, K-theory and Matrix theory. Adv. Theor. Math. Phys. 2(6), 1373–1404 (1999)
13. Kapustin, A.: D-branes in a topologically non-trivial B-field. Adv. Theor. Math. Phys. 4, 127 (2001), hep-th/9909089
14. Kapustin, A., Orlov, D.: Vertex algebras, mirror symmetry, and D-branes: the case of complex tori. hep-th/0010293
15. Koschorke, U.: Infinite dimensional K-theory and characteristic classes of Fredholm bundle maps. In: Proc. Symp. Pure Math., Vol. XV, 1968, pp. 95–133
16. Maldacena, J., Moore, G., Seiberg, N.: D-brane instantons and K-theory charges. hep-th/0108100
17. Mathai, V., Quillen, D.: Superconnections, Thom classes and equivariant differential forms. Topology 25(1), 85–110 (1986)
18. Mathai, V., Singer, I.M.: Twisted K-homology theory and Twisted Ext-theory. Dec. 2000, 13 pp, hep-th/0012046
19. Meinrenken, E.: The basic gerbe over a compact simple Lie group. math.DG/0209194
20. Minasian, R., Moore, G.: K-theory and Ramond–Ramond charge. JHEP 11, 002 (1997), hep-th/9710230
21. Murray, M.K.: Bundle gerbes. J. Lond. Math. Soc. (2) 54(2), 403–416 (1996)
22. Murray, M.K., Stevenson, D.: Bundle gerbes: stable isomorphism and local theory. J. London Math. Soc. (2) 62, 925–937 (2000)
23. Rosenberg, J.: Continuous trace C*-algebras from the bundle theoretic point of view. J. Aust. Math. Soc. A47, 368–381 (1989); ibid, Homological invariants of extensions of C*-algebras. Proc. Symp. in Pure Math. 38, 35–75 (1982)
24. Sharpe, E.R.: Stacks and D-brane bundles. Nucl. Phys. B610, 595–613 (2001)
25. Witten, E.: D-Branes and K-theory. J. High Energy Phys. 12, 019 (1998), hep-th/9810188; ibid, Overview of K-theory applied to strings. Int. J. Mod. Phys. A16, 693 (2001), hep-th/0007175

Communicated by R.H. Dijkgraaf

Commun. Math. Phys. 236, 187–198 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0792-x


On Unitary Representations of Nilpotent Gauge Groups

Esther Galina, Aroldo Kaplan

CIEM-FaMAF, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina

Received: 22 October 2000 / Accepted: 22 November 2002
Published online: 21 February 2003 – © Springer-Verlag 2003

Abstract: Groups of smooth maps from spheres to appropriate nilpotent Lie groups exhibit some peculiar properties of the unitary duals of infinite-dimensional groups.

1. Introduction

In this article we study representations of some groups of smooth maps $C^\infty(X, G)$. Their special properties answer various questions and conjectures about the unitary duals of infinite-dimensional groups. For the most part, $X$ will be the unit sphere in $\mathbb{R}^m$ and $G$ a certain associated 2-step nilpotent Lie group, a so-called group of Heisenberg type. The simplest example is the Loop Group of the complex Heisenberg group. The case when such a group is replaced by a reductive group of its automorphisms will be mentioned occasionally.

One finds families of irreducible unitary representations of $C^\infty(X, G)$ satisfying one or several of the following properties:
– They do not arise from any single representation of $G$, or from any pair of them, contrary to an existing conjecture for all gauge groups with nilpotent structure group [AHMT] (Theorem 3.1).
– They are Continuous Tensor Product Representations, mutually non-equivalent but equivalent factorwise (Proposition 3.1 and Theorems 3.2, 4.1 and 4.2).
– They are representations of type S, mutually non-equivalent but with the same multiplier function, in contrast to the finite-dimensional case [Del, Ar] (Proposition 3.1 and Theorem 3.2).
– Transforming these representations by isometries of $X$ or automorphisms of $G$ may or may not result in equivalent representations, according to the following rules and in partial contrast with other finite or infinite dimensional nilpotent Lie groups.

This research was supported in part by CONICET, FONCYT, CONICOR and UNC.


If $\pi$ is a unitary representation of $C^\infty(X, G)$, $\phi$ a diffeomorphism of $X$ and $a$ an automorphism of $G$,
\[ \pi_\phi(g) = \pi(g \circ \phi), \qquad \pi^a(g) = \pi(a \circ g) \]
will denote the corresponding transformed representations. Now, as we will explain below, for our $G$ there is an exact sequence
\[ 1 \to F \to Sp_G \times Spin(m) \to Out(G) \to D \to 1, \]
where $F$ is finite, $Sp_G$ is real reductive and $D$ is a one-dimensional group of dilations. Note that $Spin(m)$ also acts by isometries on $X$. The rule then will be:
– For $a \in Sp_G$, $\pi^a$ is (unitarily) equivalent to $\pi$ if and only if $a$ lies in the maximal compact subgroup $U_G$. Instead, for $\sigma \in Spin(m)$, $\pi_\sigma$ and $\pi^\sigma$ are mutually equivalent but not equivalent to $\pi$ as soon as $\sigma \neq 1$.

For emphasis: acting on $X$ by non-trivial isometries yields inequivalent representations of $C^\infty(X, G)$. These results seem to require that Lemma 6 in 2.5.1 and Comment (a) in 2.6.2 of [AHMT] be revised. In any case, it follows that a finite quotient of the symmetric space $Spin(m) \times Sp_G/U_G$ embeds in the unitary dual of $C^\infty(X, G)$. In fact, this holds for the full gauge manifold:
\[ C^\infty(X, Spin(m) \times Sp_G/U_G) \to \widehat{C^\infty(X, G)}. \]
Correspondingly, the intertwining operators that appear when the actions of Spin on $X$ and $G$ are coupled define a "metaplectic" representation of a reductive subgroup of automorphisms, namely $C^\infty(X, U_G) \times Spin(m)$.

The representations considered here will all be of order zero. It is possible to obtain some of positive order with analogous invariance properties, coming from a Dirac cocycle on $C^\infty(X, G)$. They are related to the Energy Representation of $C^\infty(X, U_G)$ in the sense of Graev, Gelfand and Vershik ([GGV1, GGV2]) and will be discussed elsewhere. Those here will all be variations of a basic one, $U$. Let us describe it, recalling in the process the basic properties of groups of Heisenberg type and adjusting the notation accordingly.

Let $C(m)$ be the Clifford algebra of $\mathbb{R}^m$, that is the quotient of the tensor algebra of $\mathbb{R}^m$ by the relations $a \otimes b + b \otimes a = -2\langle a, b \rangle$ for all $a, b \in \mathbb{R}^m$. The natural inclusion of $\mathbb{R}^m$ in the tensor algebra induces the inclusion $Z := \mathbb{R}^m \subset C(m)$.
In particular, if $V$ is a real, finite-dimensional unitary module over a Clifford algebra $C(m)$, each $z \in Z$ acts linearly on $V$; we denote this action by $J_z$. Then, the bilinear map
\[ \Phi : V \times V \to Z \]
defined by $\langle z, \Phi(u, v) \rangle_Z = \langle J_z u, v \rangle_V$ is skew-symmetric and strongly non-degenerate, in the sense that for all non-zero $z \in Z$, the ordinary skew form $\phi_z(u, v) = \langle z, \Phi(u, v) \rangle_Z$ is non-degenerate. The 2-step nilpotent Lie algebra
\[ \mathfrak{g} = V \oplus Z, \qquad [v + z, v' + z'] = \Phi(v, v'), \]
as well as the corresponding simply-connected Lie group $G$, are known as of Heisenberg type [Ka]. We will drop the subscripts $V$, $Z$ from the inner products and norms if no ambiguity is possible.
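As a concrete check of the definition above, the following is a minimal numerical sketch for one particular case that is an assumption of this example, not a construction taken from the text: $m = 3$ and $V = \mathbb{R}^4$ identified with the quaternions, with $J_z$ given by left multiplication by the imaginary quaternion $z$. All function names (`J`, `phi`, `matmul`, `apply_matrix`, `dot`) are hypothetical helpers. The sketch verifies the Clifford relation, the skew-symmetry of the $Z$-valued form, and the identity $\langle z, \Phi(u, J_z u) \rangle = |z|^2 |u|^2$, which implies non-degeneracy of each $\phi_z$ with $z \neq 0$.

```python
# Assumed model: left multiplication by i, j, k on the quaternions,
# written as 4x4 matrices in the basis (1, i, j, k).
L_I = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]
L_J = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
L_K = [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
GENERATORS = [L_I, L_J, L_K]


def dot(a, b):
    """Euclidean inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def apply_matrix(mat, vec):
    """Matrix-vector product."""
    return [dot(row, vec) for row in mat]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def J(z):
    """Clifford action J_z = z1*L_I + z2*L_J + z3*L_K of z in Z = R^3 on V = R^4."""
    return [[sum(z[k] * GENERATORS[k][r][c] for k in range(3)) for c in range(4)]
            for r in range(4)]


def phi(u, v):
    """Z-valued form defined by <z, phi(u, v)> = <J_z u, v>."""
    return [dot(apply_matrix(g, u), v) for g in GENERATORS]


if __name__ == "__main__":
    a, b = [1, 2, 3], [-1, 0, 2]
    ab, ba = matmul(J(a), J(b)), matmul(J(b), J(a))
    # Clifford relation: J_a J_b + J_b J_a = -2 <a, b> Id
    print([[ab[r][c] + ba[r][c] for c in range(4)] for r in range(4)])
    u, v = [1, -2, 0, 3], [2, 1, -1, 4]
    print(phi(u, v), phi(v, u))  # the two outputs differ by an overall sign
```

Integer inputs keep every check exact. In this model the bracket on $\mathfrak{g} = V \oplus Z$ is simply $[v + z, v' + z'] = $ `phi(v, v')`, and the Jacobi identity holds automatically because the bracket takes values in the centre $Z$.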


Let $X = S^{m-1}$ denote the unit sphere in $Z$. Each $x \in X$ determines a complex structure $J_x$ on $V$ compatible with the inner product. Let $W \to X$ be the complex fiber bundle whose fiber over $x$ is the $(-i)$-eigenspace of $J_x$ in $V \otimes \mathbb{C}$. $W$ is a hermitian bundle, so $\bar W$ is identified with its dual. Let $\Gamma(W)$ be the space (pre-Hilbert space) of smooth sections and $\mathcal{F}$ the Fock space over $\Gamma(\bar W)$: essentially, the space of holomorphic functions on $\Gamma(\bar W)$ which are $L^2$ relative to the associated Gaussian measure. For $v \in V$ and $x \in X$, let $w_v(x) = (1/2)(v + iJ_x v) \in W_x$. One can now let $\exp(v + z) \in C^\infty(X, G)$ ($v \in C^\infty(X, V)$, $z \in C^\infty(X, Z)$) act on a generic $f \in \mathcal{F}$ by
\[ (U(\exp(v + z))f)(w) = e^{i \int_X \langle x, z(x) \rangle\, dx}\; e^{-(1/2)\|w_v\|^2_{L^2} + \langle w, w_v \rangle_{L^2}}\; f(w - w_v), \qquad w \in \Gamma(\bar W). \]
$U$ will be the basic representation.

As to the automorphisms of $G$, let
\[ Sp_G = Sp(\Phi) = \{ s \in GL(V) \mid \Phi(su, sv) = \Phi(u, v) \}, \qquad U_G = U(\Phi) = Sp(\Phi) \cap O(V). \]
This acts on $\mathfrak{g}$ by $v + z \to gv + z$. For $x \in X$ let $\rho_x$ be the reflection in $Z$ through the hyperplane orthogonal to $x$. Then $Pin(m)$ is represented by automorphisms of $\mathfrak{g}$, with $J_x$, $|x| = 1$, acting by $v + z \to J_x(v) - \rho_x(z)$. The group of outer automorphisms of $\mathfrak{g}$, modulo the dilations $v + z \to tv + t^2 z$, is a finite quotient of $Pin(m) \times Sp(\Phi)$ (for this, as well as a discussion of the group $Pin(m)$, see [Sa, KR]).

We would like to thank H. Araki, P. Delorme, F. Levstein, L. Orstead, L. Saal, D. Vogan and N. Wallach for helpful comments.

2. Continuous Tensor Product Representations

For finite $X$, the irreducible representations of $C^\infty(X, G)$ are the tensor products of representations of $G$ indexed by $X$. One can give a meaning to $\bigotimes_{x \in X} \pi_x$ for a measure space $X$, provided the factors $\pi_x$ are representations on Fock (or "symmetric") vector spaces: in the formal identity $\bigotimes_{x \in X} S(W_x) = S(\bigoplus_{x \in X} W_x)$, one interprets the direct sum as a direct integral. The implementation of this requires a good notion of Fock space over an infinite-dimensional Hilbert space $W$. Although the usual definition when $W$ is finite-dimensional, as the space of all holomorphic functions on $W$ satisfying
\[ \int_W |f(w)|^2 e^{-|w|^2/2}\, dw < \infty, \]

still makes sense in general, the operator analysis on it needs the notion of rigged Hilbert space [GGVi]. A simpler approach, introduced by Araki [A, G1] and motivated by physical reasons, is based on the fact that classical Fock space is spanned by exponential functions (see also [GGV3]). Given a complex Hilbert space W , let S(W ) denote the closure of the free pre-Hilbert space generated by all expressions EXP(w), w ∈ W , with the inner product

$$(\mathrm{EXP}(w), \mathrm{EXP}(w')) = e^{(w,w')}.$$

EXP(w) can be viewed as a holomorphic function on W, namely

$$\mathrm{EXP}(w) = \sum_{n=0}^{\infty} \frac{1}{\sqrt{n!}}\, w^{\otimes n},$$

where $w^{\otimes n}(v) = (v,w)^n$. Scalar multiples of elements of the form EXP(w) are the coherent states (of a free boson field); EXP(0) is the vacuum state. It is worth emphasizing that EXP(w) is a scalar function (on W, or $\bar W$, and for each w), but "EXP" itself is not.

Analogs of the Fock representation, or representations of type S, are constructed as follows. Let π be a continuous unitary representation of a topological group on a complex Hilbert space W, and b a continuous 1-cocycle on the group with values in W with respect to π, that is,

$$b(rr') = b(r) + \pi(r)\,b(r') \qquad (2.1)$$

and c is a compatible multiplier, i.e., a continuous map from the group into the unit circle T such that

$$c(rr') = c(r)\,c(r')\,e^{i\,\mathrm{Im}\langle b(r),\,\pi(r)b(r')\rangle}. \qquad (2.2)$$

Then the admissible triple (π, b, c) determines a continuous unitary representation of the group on the Fock space S(W) by (cf. [G1])

$$U_{(\pi,b,c)}(r)\,\mathrm{EXP}(w) = c(r)\, e^{-\frac12\|b(r)\|^2 - \langle \pi(r)w,\, b(r)\rangle}\; \mathrm{EXP}(\pi(r)w + b(r)).$$
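The formula for U on coherent vectors can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes W = C^2, π trivial, and a single group element with hypothetical cocycle value `b` and multiplier value `c`; the inner product is taken linear in the first argument. It verifies that inner products of coherent vectors, computed from (EXP(w), EXP(w')) = e^{(w,w')}, are preserved, as unitarity requires.

```python
import cmath

def inner(u, w):
    """Hermitian inner product on C^d, linear in the first argument (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def coherent_inner(w1, w2):
    """(EXP(w1), EXP(w2)) = exp((w1, w2))."""
    return cmath.exp(inner(w1, w2))

def act(b, c, w):
    """Image of EXP(w) for pi = 1: returns (scalar, w + b), meaning scalar * EXP(w + b)."""
    scalar = c * cmath.exp(-0.5 * inner(b, b).real - inner(w, b))
    return scalar, [wi + bi for wi, bi in zip(w, b)]

def transformed_inner(b, c, w1, w2):
    """Inner product of the images of EXP(w1) and EXP(w2)."""
    s1, u1 = act(b, c, w1)
    s2, u2 = act(b, c, w2)
    return s1 * s2.conjugate() * coherent_inner(u1, u2)

# Hypothetical sample data: two coherent vectors and one translation.
w1 = [0.3 + 0.1j, -0.2j]
w2 = [-0.5 + 0.4j, 0.25 + 0.0j]
b = [0.5 - 0.4j, 0.1 + 0.2j]
c = cmath.exp(0.7j)
print(abs(transformed_inner(b, c, w1, w2) - coherent_inner(w1, w2)))
```

The printed difference should be at rounding level, because the Gaussian prefactors cancel the cross terms of (w + b, w' + b).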

Now let G be a Lie group and X a Riemannian manifold, both finite-dimensional. Let $\{(\pi_x, b_x, c_x)\}_{x\in X}$ be a field of admissible triples for G such that the multipliers are of the form $c_x(g) = e^{i\gamma_x(g)}$, the field $\{\pi_x\}_{x\in X}$ and the functions $x \mapsto \gamma_x(g)$ are integrable over X for all g ∈ G, and the map $g \mapsto \int_X^{\oplus} b_x(g)\,dx$ and the function $g \mapsto \int_X \gamma_x(g)\,dx$ are continuous. Define

$$W = \int_X^{\oplus} W_x\, dx,$$

where $W_x$ is the representation space of $\pi_x$ and dx the Riemannian measure on X, and set

$$\pi(g) = \int_X^{\oplus} \pi_x(g(x))\,dx, \qquad b(g) = \int_X^{\oplus} b_x(g(x))\,dx, \qquad c(g) = e^{i\int_X \gamma_x(g(x))\,dx},$$

for $g \in C^\infty(X,G)$. Then (π, b, c) is an admissible triple for the group $C^\infty(X,G)$. The corresponding continuous unitary representation $U_{(\pi,b,c)}$ on S(W) is, by definition, a continuous tensor product representation (CTPR) of $C^\infty(X,G)$. For more details, see [G1, AHMT].


3. Representations of $C^\infty(X,G)$

Let G be a group of Heisenberg type as defined before, and X ⊂ Z the unit sphere in the center of its Lie algebra g = V ⊕ Z. Classically, each x ∈ X gives rise to a representation of G, of type S, on the Fock space over $W_x$ := (−i)-eigenspace of $J_x$ in V ⊗ C. This is just $U_{(1^{(x)},\,b^{(x)},\,c^{(x)})}$, where

$$1^{(x)} = 1_{W_x}, \qquad b^{(x)}(\exp(v+z)) = w_v(x), \qquad c^{(x)} = e^{i\langle x,z\rangle}.$$

Indeed, except for the normalization |x| = 1 and the characters, these constitute the full unitary dual of G [KR].

Remarks. (a) The correspondence

$$v \in V \;\leftrightarrow\; w_v(x) = \tfrac12 (v + iJ_x v) \in W_x$$

is an isomorphism of hermitian vector spaces, if V is endowed with the complex structure $J_x$ and the hermitian form

$$h_x(\cdot,\cdot) = \tfrac12 \langle\cdot,\cdot\rangle + \tfrac{i}{2}\,\varphi_x(\cdot,\cdot),$$

while $W_x$ has the induced structures from V ⊗ C with the hermitian extension of the inner product of V. On the V side, the cocycle of the Fock representation defined above is simply $b^{(x)}(\exp(v+z)) = v$. In what follows the identification will be used implicitly.

(b) Although Spin(m) acts orthogonally on the real vector space V, the action is not unitary on the complex space $(V, J_x, h_x)$.

Fix $x_o \in X$ and set

$$W_x = (V, h_{x_o}), \qquad \pi_x = 1^{(x_o)}, \qquad b_x = b^{(x_o)}, \qquad c_x = c^{(x_o)},$$

for all x ∈ X. This is a "constant" field of admissible triples satisfying the required integrability and continuity conditions, hence it defines a CTPR $U^{(x_o)}$ of $C^\infty(X,G)$ on $S(\int_X^{\oplus} W_x\,dx)$. For a group of Heisenberg type, this realizes the standard representations of the group $C^\infty(X,G)$ (as defined for example in [AHMT] for arbitrary X and G nilpotent). Conjecturally, these should have exhausted the unitary dual of $C^\infty(X,G)$, modulo the given normalizations.

For a group of Heisenberg type, one may also consider the variable field (W, π, b, c) on X, where $W_x = (V, h_x)$ and

$$\pi_x = 1_{W_x}, \qquad b_x(\exp(v+z)) = v, \qquad c_x = e^{i\langle x,z\rangle}. \qquad (3.1)$$


Theorem 3.1. The triple (π, b, c) is admissible. The resulting unitary Continuous Tensor Product Representation $U = U_{(\pi,b,c)}$ of $C^\infty(X,G)$, given on coherent vectors by

$$U(\exp(v+z))\,\mathrm{EXP}(w) = e^{i\int_X \langle x, z(x)\rangle\,dx}\; e^{-\frac12\|v\|^2 - h(w,v)}\; \mathrm{EXP}(w+v),$$

for each w ∈ W, is irreducible and not unitarily equivalent to any standard representation.

Proof. All the conditions on the triple (3.1) to obtain a CTPR are evident, except perhaps for (2.2). As the hermitian form $h_x$ on V satisfies $\mathrm{Im}(h_x(v,v')) = \frac12\langle x, \Phi(v,v')\rangle$, one has

$$\begin{aligned}
c_x(\exp(v+z)\exp(v'+z')) &= c_x\big(\exp(v+v'+z+z'+\tfrac12\Phi(v,v'))\big) \\
&= e^{i\langle x,\, z+z'+\frac12\Phi(v,v')\rangle} \\
&= e^{i\langle x,z\rangle}\, e^{i\langle x,z'\rangle}\, e^{\frac{i}{2}\langle x,\Phi(v,v')\rangle} \\
&= c_x(\exp(v+z))\, c_x(\exp(v'+z'))\, e^{i\,\mathrm{Im}\,h_x(v,v')}.
\end{aligned}$$

By definition, the representation space is the Fock space over the direct integral of the $(V, J_x, h_x)$. This direct integral can be identified with the complex Hilbert space whose underlying real vector space consists of the $L^2$ maps v : X → V, endowed with the complex and hermitian structures

$$(Jv)(x) = J_x(v(x)), \qquad h(u,v) = \int_X h_x(u(x), v(x))\,dx.$$

For emphasis, U represents $C^\infty(X,G)$ on S(W), where

$$W = \int_X^{\oplus} (V, J_x, h_x)\,dx = L^2(X,V), \qquad (3.2)$$

with the given hermitian structures.

Like any connected nilpotent Lie group, G is of type I and its cohomology $H^1(G,\rho)$ is trivial for any non-trivial irreducible representation ρ. According to [D, I], in such a case a unitary CTPR is irreducible if and only if almost every factor is irreducible. As we observed before, the individual factors are just the natural "Fock representations" of G as described in [KR], which are indeed irreducible. Hence U is irreducible (this can also be deduced from the proof of the next theorem).

To show that U cannot be equivalent to any $U^{(x_o)}$, compute their restrictions to the subgroup of constant maps into the center, $\exp Z \subset C^\infty(X,G)$. One obtains:

$$U^{(x_o)}(\exp(z)) = e^{i\langle x_o, z\rangle}, \qquad U_{(\pi,b,c)}(\exp(z)) = e^{i\int_X \langle x,z\rangle\,dx} = 1.$$

Obviously, equivalent representations must agree on the center, so the theorem follows.

Remark. The difference between a standard representation and U is that the first one is the tensor product over X of the same Fock representation of G, while the second one is the tensor product of non-equivalent Fock representations of G, each one associated to a different parameter x ∈ X.
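The compatibility condition (2.2) checked in the proof of Theorem 3.1 can be illustrated in the simplest model. The sketch below assumes the classical three-dimensional Heisenberg group, with V = R^2, Z = R, x = 1, the bracket written `Phi` (a name chosen here for the vector-valued form) and J the rotation by a quarter turn; all of these are illustrative choices, not taken from the text. It checks numerically that c(gg') = c(g) c(g') exp(i Im h(v, v')).

```python
import cmath

def J(v):
    """Complex structure on V = R^2 for x = 1 (assumed model): quarter-turn rotation."""
    return (-v[1], v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def Phi(u, v):
    """Bracket with values in Z = R, normalized by <x, Phi(u, v)> = <J u, v> for x = 1."""
    return dot(J(u), v)

def h(u, v):
    """Hermitian form h = (1/2)<u, v> + (i/2) Phi(u, v)."""
    return 0.5 * dot(u, v) + 0.5j * Phi(u, v)

def multiply(g1, g2):
    """exp(v+z) exp(v'+z') = exp(v + v' + z + z' + Phi(v, v')/2)."""
    (v1, z1), (v2, z2) = g1, g2
    return ((v1[0] + v2[0], v1[1] + v2[1]), z1 + z2 + 0.5 * Phi(v1, v2))

def c(g):
    """Multiplier c(exp(v+z)) = exp(i <x, z>) with x = 1."""
    return cmath.exp(1j * g[1])

# Hypothetical sample elements.
g1 = ((0.3, -1.2), 0.7)
g2 = ((1.1, 0.4), -0.25)
lhs = c(multiply(g1, g2))
rhs = c(g1) * c(g2) * cmath.exp(1j * h(g1[0], g2[0]).imag)
print(abs(lhs - rhs))
```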


Recall the vector-valued symplectic form $\Phi : V \times V \to Z$ and the groups

$$\mathrm{Sp}(\Phi) = \{a \in GL(V) \mid \Phi(au,av) = \Phi(u,v)\}, \qquad U(\Phi) = \mathrm{Sp}(\Phi) \cap O(V).$$

The first is real reductive and the second is maximal compact in the first: indeed, $\mathrm{Sp}(\Phi)$ is invariant under the Cartan involution of GL(V) defined by the given inner product. In particular, $M = \mathrm{Sp}(\Phi)/U(\Phi)$ is a symmetric space. For $a \in C^\infty(X, \mathrm{Aut}(G))$ and $n \in C^\infty(X,G)$, $U^a(n) := U(a \circ n)$ defines again a unitary, irreducible representation of $C^\infty(X,G)$ on $S(L^2(X,V))$.

Proposition 3.1. The representations $U^a$ are Continuous Tensor Product Representations for all $a \in C^\infty(X, \mathrm{Sp}(\Phi))$, with the same multiplier $c(\exp(v+z)) = e^{i\int_X \langle x, z(x)\rangle\,dx}$. Moreover, for each x ∈ X the factors $U^a_x$ are equivalent for all $a \in C^\infty(X, \mathrm{Sp}(\Phi))$.

Proof. Let us see that $U^a$ is the CTPR associated to the admissible triple $(\pi^a, b^a, c^a)$ given by the data

$$\pi^a_x = 1_{W_x}, \qquad b^a_x(\exp(v+z)) = a(x)\cdot v, \qquad c^a_x = e^{i\langle x,z\rangle}.$$

The fact that $a(x) \in \mathrm{Sp}(\Phi)$ gives the compatibility between the cocycle and the multiplier:

$$\begin{aligned}
c^a_x(\exp(v+z)\exp(v'+z')) &= e^{i\langle x,\, z+z'+\frac12\Phi(v,v')\rangle} \\
&= e^{i\langle x,z\rangle}\, e^{i\langle x,z'\rangle}\, e^{\frac{i}{2}\langle x,\Phi(v,v')\rangle} \\
&= e^{i\langle x,z\rangle}\, e^{i\langle x,z'\rangle}\, e^{\frac{i}{2}\langle x,\Phi(a(x)\cdot v,\, a(x)\cdot v')\rangle} \\
&= c^a_x(\exp(v+z))\, c^a_x(\exp(v'+z'))\, e^{i\,\mathrm{Im}\,h_x(b^a_x(v),\, b^a_x(v'))}.
\end{aligned}$$

Now, for $\exp(v+z) \in C^\infty(X,G)$,

$$\begin{aligned}
U^a(\exp(v+z))\,\mathrm{EXP}(w) &= U(a\cdot\exp(v+z))\,\mathrm{EXP}(w) = U(\exp((a\cdot v)+z))\,\mathrm{EXP}(w) \\
&= e^{i\int_X \langle x,z(x)\rangle\,dx}\; e^{-\frac12\int_X \|a(x)\cdot v(x)\|^2dx - \int_X h_x(w(x),\,a(x)\cdot v(x))\,dx}\; \mathrm{EXP}(w + a\cdot v) \\
&= c^a(\exp(v+z))\; e^{-\frac12\|b^a(v)\|^2 - h(w,\,b^a(v))}\; \mathrm{EXP}(w + b^a(v))
\end{aligned}$$

for each w ∈ W . This means that the representation U a is the tensor product of the Fock representations Uxa of G. For a given x ∈ X all the Uxa are Fock representations of G that have the same values on the center Z, then they are all equivalent. Theorem 3.2. The map a → U a , a ∈ C ∞ (X, Sp()), defines an inclusion of the manifold C ∞ (X, M) into the unitary dual of the group C ∞ (X, G), whose image contains no standard class.


Proof. Identify $C^\infty(X,M)$ with the quotient $C^\infty(X,\mathrm{Sp}(\Phi))/C^\infty(X,U(\Phi))$ and fix $a \in C^\infty(X,\mathrm{Sp}(\Phi))$; the theorem asserts that $U^a$ is unitarily equivalent to U if and only if $a \in C^\infty(X,U(\Phi))$, and that $U^a$ is not unitarily equivalent to any of the standard representations $U^{(x_o)}$.

For the first statement, consider $L^2(X,V)$ as a real Hilbert space with inner product $\langle u,v\rangle_{L^2} = \int_X \langle u(x),v(x)\rangle\,dx$, complex structure $(Jv)(x) = J_x(v(x))$ and symplectic form $A(u,v) = \int_X \langle x, \Phi(u(x),v(x))\rangle_Z\,dx$. These are mutually compatible, in the sense that $A(u,v) = \langle Ju, v\rangle_{L^2}$. Consider the infinite-dimensional Heisenberg group $H = L^2(X,V) \times T$ with multiplication

$$(v_1,t_1)\cdot(v_2,t_2) = \Big(v_1+v_2,\; t_1t_2\, e^{\frac{i}{2}\int_X \langle x,\Phi(v_1(x),v_2(x))\rangle\,dx}\Big).$$

Its basic Fock representation T on the Fock space over $(L^2(X,V), J)$ satisfies

$$T(v, e^{it})\,\mathrm{EXP}(w) = e^{-\frac12\|v\|^2 - \langle w,v\rangle + it}\; \mathrm{EXP}(w+v).$$

The map $\tau : C^\infty(X,G) \to H$ defined by

$$\tau(\exp(v+z)) = \Big(v,\; e^{i\int_X \langle x,z(x)\rangle\,dx}\Big)$$

is a Lie homomorphism. Comparing T with the explicit formula for U, it is straightforward to check that U a = T a ◦ τ. By a theorem of Shale [Sh], for any invertible bounded operator B on L2 (X, V ) the representation T B is equivalent to T if and only if B is in the corresponding symplectic group of the Hilbert space L2 (X, V ) and the operator (B ∗ B)1/2 − I is Hilbert-Schmidt. The operator Ba acting on L2 (X, V ) by Ba (v)(x) = a(x)(v(x)) is certainly bounded and symplectic. It satisfies Ba∗ = Ba ∗ , so, for k ∈ C ∞ (X, U()), (Bk∗ Bk )1/2 − I = 0, which is Hilbert-Schmidt. Therefore U k ∼ = U. Conversely, assume that (Ba∗ Ba )1/2 − I is Hilbert-Schmidt. Let {vi (x)} be an L2 frame over X of the hermitian bundle with fiber (V , Jx , hx ). Since a(x) ∈ Sp(φx ), we can write a(x) = h(x)d(x)k(x) with h(x) (hx ) and d(x) diagonal relative and k(x) in U −1 ∗ to that frame. With f := 1 − d, I − Ba Ba = I − Bk Bd Bk = Bk−1 Bf Bk . For the Hilbert-Schmidt norm, I − Ba∗ Ba H S = Bk−1 Bf Bk H S = Bf H S . Indeed, if {φn } is an o.n. basis of L2 (X, V ), so is {Bk (φn )} and therefore Bk−1 (Bf Bk φn ) 2 = Bf (Bk φn ) 2 = Bf 2H S . Bk−1 Bf Bk 2H S = n

n

Now, the components of φn (x) relative to the frame {vi (x)} constitute orthonormal basis of L2 (X). We may therefore assume that V = C and that f is a smooth complex function on X, acting on L2 (X) by multiplication. But such an action is not Hilbert-Schmidt


unless the function is zero. Indeed, if f ≡ 0 we can find finitely many translates of |f | by isometries of X such that f1 (x) + · · · + fp (x) ≥ c > 0 for all x. Then p||f ||H S = ||f1 ||H S + · · · + ||fp ||H S ≥ ||f1 + · · · + fp ||H S ≥ c||1||H S = ∞. We conclude: if U a ≡ U then a ∈ C ∞ (X, U()). For the second statement of the theorem, recall from the proof of (3.1) that the restriction of the representation U to the subgroup of C ∞ (X, G) consisting of constant maps into the center, is trivial, while that of any standard representation is not. For the representations U a , this restriction is equal to that of U, because Sp() acts trivially on the center. Therefore none of the U a is equivalent to a standard representation. 4. The Spin Action Recall that Pin(m) is generated in GL(V ) by the operators Jx , x ∈ X. It acts on G by automorphisms. Indeed, let a generator act on the Lie algebra g = V + Z by jx : v + z → Jx (v) − ρx (z), where ρx is the reflection in Z with respect to the orthogonal complement of x. Since [jx (v + z), jx (v + z )] = (Jx v, Jx v ), contracting with any y ∈ X gives y, [jx (v + z), jx (v + z )] = y, (Jx v, Jx v ) = Jy Jx v, Jx v = −Jx Jy v, Jx v − 2x, y v, Jx v = −Jy v, v + 2x, y Jx v, v = −y, (v, v ) + 2y, (v, v ), x x = y, −ρx ((v, v )) = y, jx ((v, v )) = y, jx ([v + z, v + z ]) . In particular, (s.v, s.v ) = s.(v, v ) for all s ∈ Spin(m), so this group acts by automorphisms of G and C ∞ (X, G). On the other hand, Spin(m), as a two-cover of SO(m), acts on C ∞ (X, G) by diffeomorphism on X. Let U be the representation given in Theorem 3.1. Both actions of Spin(m) give new irreducible unitary representations of C ∞ (X, G). For each s ∈ Spin(m), they are defined by U s (g) = U(s · g),

Us (g) = U(g · s).

Indeed, U s makes sense for any s ∈ C ∞ (X, Spin(m)). Theorem 4.1. The irreducible unitary representations U s are Continuous Tensor Product Representation for all s ∈ C ∞ (X, Spin(m)) and they are pairwise non-equivalent representations. Moreover, for each constant s ∈ Spin(m) the representations U s and Us are equivalent and not equivalent to any standard one.


Proof. Let’s see that U s is CTPR associated to the admissible triple (π s , bs , cs ) given by the data πxs = 1Wx ,

bxs (exp(v + z)) = s(x) · v,

cxs = eix,s(x)z .

The fact that s(x) ∈ Spin(m) gives the compatibility between the cocycle and the multiplier, cxs (exp(v + z) exp(v + z )) 1 s = cx exp (v + v ) + (z + z + (v, v )) 2

1

= eix,s(x)(z+z + 2 (v,v ))

i

i

= eix,s(x)z eix,s(x)z e 2 x,s(x)(v,v )

= eix,s(x)z eix,s(x)z e 2 x,(s(x)·v,s(x)·v ) s s = cxs (exp(v + z))cxs (exp(v + z )) ei Imhx (bx (exp(v+z)),bx (exp(v +z ))) . Now, for exp(v + z) ∈ C ∞ (X, G), U s (exp(v + z)) EXP(w) = U(s · exp(v + z)) EXP(w) = U(exp((s · v) + (s · z))) EXP(w) = ei

X x,s(x)z(x) dx

= c (exp(v + z))e s

1

e− 2

X

s(x)·v(x) 2 dx−

X

hx (w(x),s(x)·v(x))dx

− 21 bs (exp(v+z)) 2 −h(w,bs (exp(v+z)))

EXP(w + s · v)

EXP(w + bs (exp(v + z)))

for each w ∈ W . Then, U s is a CTPR. U s and U are not equivalent unless s = 1. In fact, the restriction to C ∞ (X, Z) of the first one is U s (exp(z)) = c(s · exp(z)) = ei

X x,s(x)·z(x) dx

.

For each s ∈ C ∞ (X, Spin(m)) we can choose a function z ∈ C ∞ (X, Z) such that x, s(x) · z(x) dx = x, z(x) dx. X

X

So, U s and U differ on the center for s ≡ 1. Now, we will prove for each s ∈ Spin(m) that U s = σ Us σ −1 , where σ is the unitary operator on the space representation (3.3) of U given on the dense set {λ EXPw w ∈ W, λ ∈ C} by

. σ (λ EXPw) = λ EXP(s · w · s −1 ) = λ EXP(sw(s −1 x) x∈X

In fact, for exp(v + z) ∈

C ∞ (X, G),


σ Us (exp(v + z)) σ −1 EXP(w) = σ Us (exp(v + z)) EXP(s −1 · w · s) = σ U(exp(((v + z) · s)) EXP(s −1 · w · s)

v(sx) 2 dx− X hx (s −1 w(sx),v(sx))dx i X s −1 x,z(x) dx − 21 X v(x) 2 dx− X hs −1 x (s −1 w(x),v(x))dx

= σ ei =e

X x,z(sx) dx

1

e− 2

X

e

EXP((s −1 · w + v) · s) EXP(s · (s −1 · w + v))

for each w ∈ W . But, as Spin(m) acts orthogonally on the real vector space V , hs −1 x (v, v ) = v, v + iφs −1 x (v, v ) = v, v + iJs −1 x v, v = v, v + is −1 Jx s.v, v = v, v + iJx s.v, s.v = hx (s.v, s.v ). Then, σ Us (exp(v + z)) σ −1 EXP(w) = ei

X x,s.z(x) dx

1

e− 2

X

s.v(x) 2 dx−

X

hx (w(x),s.v(x))dx

− 21 bs (exp(v+z)) 2 −h(w,bs (exp(v+z)))

= c (exp(v + z))e = U s (exp(v + z)). s

EXP(w + s · v))

EXP(w + bs (exp(v + z)))

Finally, U s is not standard by the same argument used for U in Theorem 3.1.

As a consequence of Theorem 4.1 we can reformulate Theorem 3.2 as follows.

Theorem 4.2. The map $a \mapsto U^a$, $a \in C^\infty(X, \mathrm{Spin}(m) \times \mathrm{Sp}(\Phi))$, defines an inclusion of the manifold $C^\infty(X, \mathrm{Spin}(m) \times \mathrm{Sp}(\Phi)/U(\Phi))$ into the unitary dual of the group $C^\infty(X,G)$, whose image contains no standard class.

References

[A] Araki, H.: Factorizable representations of current algebras. Publ. Res. Inst. Math. Sc. Kyoto 5, 361–422 (1969)
[AHMT] Albeverio, S., Hoeg-Krohn, R., Marion, J., Testard, D., Torresani, B.: Noncommutative Distributions. M. Decker, 1993
[D] Delorme, P.: 1-cohomologie des représentations unitaires des groupes de Lie semi-simples et résolubles. Produits tensoriels continus de représentations. Bull. Soc. Math. de France 105, 281–336 (1977)
[G1] Guichardet, A.: Symmetric Hilbert spaces and related topics. Lecture Notes in Math. 261, Berlin-Heidelberg-New York: Springer-Verlag, 1972
[G2] Guichardet, A.: Sur la cohomologie de groupes topologiques II. Bull. Sc. Math (2) 96, 305–332 (1972)
[GGV1] Gelfand, I.M., Graev, I., Vershik, A.: Representations of the group SL(2,R), where R is a ring of functions. Russ. Math. Surv. 28, 78–132 (1973)
[GGV2] Gelfand, I.M., Graev, I., Vershik, A.: Representations of the group of smooth mappings of a manifold X into a compact Lie group. Compositio Math. 35, 299–334 (1977)
[GGV3] Gelfand, I.M., Graev, I., Vershik, A.: Model of representations of current groups. In: Representations of Lie Groups and Lie Algebras, A.A. Kirillov (ed.), Budapest: Akademiai Kiado 1971, pp. 121–179
[GGVi] Gelfand, I.M., Graev, I., Vilenkin, N.Y.: Generalized functions, Vol. 4, London-New York: Academic Press, Inc., 1966
[I] Ismagilov, R.: Func. Anal. Applic. 28, 92–99 (1994)
[Ka] Kaplan, A.: Fundamental solutions for a class of hypoelliptic PDE. Trans. Am. Math. Soc. 258, 147–153 (1980)
[KR] Kaplan, A., Ricci, F.: Harmonic analysis on groups of Heisenberg type. In: Harmonic Analysis. Lecture Notes in Mathematics, 992, Berlin-Heidelberg-New York: Springer-Verlag, 1983, pp. 416–435
[Ki] Kirillov, A.: Elements of the theory of representations. Trans. Am. Math. Soc. 93, 53–73 (1959)
[R] Ricci, F.: Commutative algebras of invariant functions on groups of Heisenberg type. J. Lond. Math. Soc. II Ser. 32, 265–271 (1985)
[Ri] Riehm, C.: The automorphism group of composition of quadratic forms. Trans. Amer. Math. Soc. 269, 403–415 (1982)
[Sa] Saal, L.: The automorphism group of a Lie algebra of Heisenberg type. Rend. Sem. Mat. Univ. Politec. Torino. 54, 101–113 (1996)
[Sh] Shale, D.: Linear symmetries of free Boson fields. Trans. Am. Math. Soc. 103, 149–167 (1962)

Communicated by H. Araki

Commun. Math. Phys. 236, 199–221 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0825-5


On the Distribution of Free Path Lengths for the Periodic Lorentz Gas III

Emanuele Caglioti¹, François Golse²

¹ Dipartimento di Matematica, Istituto Guido Castelnuovo, Università di Roma "La Sapienza", p.le Aldo Moro 2, I00185 Roma, Italy. E-mail: [email protected]
² Institut Universitaire de France, & Département de Mathématiques et Applications, Ecole Normale Supérieure Paris, 45 rue d'Ulm, 75230 Paris Cedex 05, France. E-mail: [email protected]

Received: 2 August 2002 / Accepted: 27 November 2002 Published online: 14 April 2003 – © Springer-Verlag 2003

Abstract: For r ∈ (0, 1), let $Z_r = \{x \in \mathbf R^2 \mid \mathrm{dist}(x, \mathbf Z^2) > r/2\}$ and define $\tau_r(x,v) = \inf\{t > 0 \mid x + tv \in \partial Z_r\}$. Let $\Phi_r(t)$ be the probability that $\tau_r(x,v) \ge t$ for x and v uniformly distributed in $Z_r$ and $\mathbf S^1$ respectively. We prove in this paper that

$$\limsup_{\epsilon\to0^+} \frac{1}{|\ln\epsilon|}\int_\epsilon^{1/4} \Phi_r\Big(\frac{t}{r}\Big)\frac{dr}{r} = \frac{2}{\pi^2 t} + O\Big(\frac{1}{t^2}\Big),$$

$$\liminf_{\epsilon\to0^+} \frac{1}{|\ln\epsilon|}\int_\epsilon^{1/4} \Phi_r\Big(\frac{t}{r}\Big)\frac{dr}{r} = \frac{2}{\pi^2 t} + O\Big(\frac{1}{t^2}\Big)$$

as t → +∞. This result improves upon the bounds on $\Phi_r$ in Bourgain-Golse-Wennberg [Commun. Math. Phys. 190, 491–508 (1998)]. We also discuss the applications of this result in the context of kinetic theory.

1. Statement of the Problem and Main Results

1.1. The periodic Lorentz gas. Let $r \in (0, \frac12)$ and define

$$Z_r = \{x \in \mathbf R^2 \mid \mathrm{dist}(x, \mathbf Z^2) > r/2\}. \qquad (1.1)$$

Consider a point particle moving at speed 1 inside $Z_r$ and being specularly reflected each time it meets the boundary of $Z_r$. Such a dynamical system is referred to as "a periodic, two-dimensional Lorentz gas". (Indeed, Lorentz used the methods of kinetic theory to describe the motion of electrons in a metal as that of a collisionless gas of point particles bouncing on the crystalline structure of atoms in the metal [13].) The "free path length" (or "(forward) exit time") starting from $x \in Z_r$ in the direction $v \in \mathbf S^1$ is defined as

$$\tau_r(x,v) = \inf\{t > 0 \mid x + tv \in \partial Z_r\}, \qquad (x,v) \in Z_r \times \mathbf S^1. \qquad (1.2)$$

Fig. 1. The Lorentz gas: Zr and the punctured torus Yr

For each $v = (v_1, v_2) \in \mathbf S^1$ such that $v_1v_2 \ne 0$ and the ratio $v_1/v_2 \in \mathbf R \setminus \mathbf Q$, one has $\tau_r(x,v) < +\infty$, since any orbit of a linear flow with irrational slope on the 2-torus is dense (see for instance [1], Sect. 51, Corollary 1, p. 287, for this well-known fact). Set $Y_r = Z_r/\mathbf Z^2$; since $\tau_r(x,v) = \tau_r(x+k,v)$ for each $(x,v) \in Z_r \times \mathbf S^1$ and $k \in \mathbf Z^2$, the function $\tau_r$ can be seen as defined on $Y_r \times \mathbf S^1$.

1.2. Invariant measure for the Lorentz gas. Let $V_r = dxdv\text{-meas}(Y_r \times \mathbf S^1)$ and

$$\mu_r = \frac{1}{V_r}\,dxdv.$$

Thus $\mu_r$ is a Borel probability measure on $Y_r \times \mathbf S^1$. On the other hand, the evolution of the Lorentz gas is governed by the broken Hamiltonian flow

$$\begin{aligned}
(\dot x(t), \dot v(t)) &= (v(t), 0), && \text{whenever } x(t) \in Y_r, \\
v(t^+) &= v(t^-) - 2\,\big(v(t^-)\cdot n_{x(t)}\big)\, n_{x(t)}, && \text{whenever } x(t) \in \partial Y_r,
\end{aligned} \qquad (1.3)$$

where $n_x$ denotes the inward unit normal to $\partial Y_r$ at point x. As can be easily checked, the measure $\mu_r$ is invariant under the flow (1.3). Let $m \in L^\infty(\mathbf S^1)$ be such that m ≥ 0 and

$$\int_{Y_r \times \mathbf S^1} m(\theta)\,d\mu_r(x,\theta) = 1. \qquad (1.4)$$


The object of interest in the present paper is the distribution of $\tau_r$ under $m\,d\mu_r$, i.e. the function $\Phi^m_r : \mathbf R_+ \to [0,1]$ defined by

$$\Phi^m_r(t) = m\,d\mu_r\text{-Prob}\big(\{(x,v) \in Y_r \times \mathbf S^1 \mid \tau_r(x,v) > t\}\big). \qquad (1.5)$$

More precisely, we are interested in the asymptotic behavior of $\Phi^m_r(t)$ as $r \to 0^+$ and for large values of t, that is, large compared to 1/r. As explained in [7], pp. 221–222, knowing the distribution of the free path lengths in the small r limit has several important implications. For instance, it leads to the correct asymptotic model for the Boltzmann-Grad limit of the periodic Lorentz gas, which is not governed by the linear Boltzmann equation (Eq. (10) of [13]): see below. Another application bears on the asymptotic behavior of the Kolmogorov-Sinai entropy (or equivalently, of the Lyapunov exponent) of the periodic Lorentz gas in the small r limit. We refer to [9] for a survey of the most recent results and open questions on this subject, and to [8] which addresses one of these open problems by methods similar to those developed here.

1.3. Main result. The reference [6] established for each $m \in L^\infty(\mathbf S^1)$ as in (1.4) the existence of two positive constants $C_m$ and $C'_m$ such that, for each $r \in (0, \frac12)$ and each t > 1/r,

$$\frac{C_m}{rt} \le \Phi^m_r(t) \le \frac{C'_m}{rt} \qquad (1.6)$$

(see Theorems B and C in [6]). In (1.6), the upper bound was proved by an argument based on Fourier series, while the lower bound was obtained by a geometric construction exhausting all possible channels, i.e. infinitely long open strips included in $Z_r$. The validity of the lower bound in (1.6) was extended to arbitrary space dimensions in [12]. Numerical simulations in [12] suggest the following questions: for each $m \in L^\infty(\mathbf S^1)$ as in (1.4) does one have, for each t large enough (say, for each t > 2),

$$\Phi^m_r\Big(\frac{t}{r}\Big) \to \Phi^m(t) \quad \text{as } r \to 0\,? \qquad (1.7)$$

And, if so, does one have, for some constant C > 0,

$$\Phi^m(t) \sim \frac{C}{t} \quad \text{as } t \to +\infty\,? \qquad (1.8)$$

We have not been able to fully answer (1.7), but, were (1.7) true, our main result in this paper (Theorem 1.1 below) would answer (1.8) by giving an explicit value for C. It confirms the numerical results obtained in [12] (see Figs. 4 and 5 there). Throughout the paper, the notation for the convergence in the sense of Cesaro is as follows: $C\text{-}\limsup_{\epsilon\to0^+} f(\epsilon) = l$ means that

$$\limsup_{\epsilon\to0^+} \frac{1}{|\ln\epsilon|}\int_\epsilon^{\epsilon^*} f(r)\,\frac{dr}{r} = l$$

for some $\epsilon^* > 0$. (A similar notation is used for the lim inf and the limit in the sense of Cesaro.)
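The logarithmic Cesaro mean is easy to approximate numerically after the change of variable u = ln r, which turns dr/r into du. The following sketch is illustrative only; the function name, the default upper limit 1/4 and the step count are choices made here, and the quadrature is a plain trapezoidal rule.

```python
import math

def cesaro_log_mean(f, eps, eps_star=0.25, steps=20000):
    """Approximate (1/|ln eps|) * integral_{eps}^{eps_star} f(r) dr/r.

    Uses the substitution u = ln r and the trapezoidal rule in u.
    """
    a, b = math.log(eps), math.log(eps_star)
    step = (b - a) / steps
    total = 0.5 * (f(math.exp(a)) + f(math.exp(b)))
    for i in range(1, steps):
        total += f(math.exp(a + i * step))
    return total * step / abs(math.log(eps))

# A bounded oscillation in ln r averages out: the mean approaches 1 as eps -> 0.
print(cesaro_log_mean(lambda r: 1.0 + math.sin(math.log(r)), 1e-200))
```

A function such as 1 + sin(ln r) has no limit as r → 0, yet its mean against dr/r converges, which is the kind of averaging used in Theorem 1.1.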


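Theorem 1.1. Let $m \in L^\infty(\mathbf S^1)$ satisfy (1.4). Let $t^* > 3$ and let

$$\Phi^m_+(t^*) = C\text{-}\limsup_{r\to0^+} \Phi^m_r\Big(\frac{t^*}{r}\Big), \quad\text{and}\quad \Phi^m_-(t^*) = C\text{-}\liminf_{r\to0^+} \Phi^m_r\Big(\frac{t^*}{r}\Big).$$

Then

$$\Phi^m_+(t^*) \sim \frac{2}{\pi^2 t^*}, \quad\text{and}\quad \Phi^m_-(t^*) \sim \frac{2}{\pi^2 t^*} \quad \text{as } t^* \to +\infty. \qquad (1.9)$$

More precisely,

$$\Big|\Phi^m_\pm(t^*) - \frac{2}{\pi^2 t^*}\Big| \le \frac{8\,\|m\|_{L^\infty}}{(t^*-3)^2}. \qquad (1.10)$$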

A serious shortcoming of the result above is the need for averaging in r before letting $r \to 0^+$. It seems however that it cannot be avoided, at least by using the techniques of the present paper. This leads to a natural question, that of the choice of the measure $\frac{dr}{r}$ to define the Cesaro mean in Theorem 1.1. The reasons for this choice are made clear by following the proof, but we take this opportunity of giving an idea of this proof by providing some qualitative argument in favor of this choice. The proof of Theorem 1.1 is based on comparing the size r of the obstacle with the sequence of errors $d_n$ in the approximation by continued fractions of $v_2/v_1$, i.e. of the slope of the linear flow (say, in the case where $0 < v_2 < v_1$). It is natural in this context to renormalize the problem by applying to $\alpha = v_2/v_1$ the Gauss map $T : x \mapsto \frac1x - [\frac1x]$; we refer to the appendix for more details on these notions. Lemma 7.1 below and especially the formula $d_n(\alpha) = \alpha\,d_{n-1}(T(\alpha))$ show that the exit time problem with slope α and obstacle of size r is mapped to the analogous problem with slope Tα and obstacle of size αr. Hence it is natural to define the Cesaro average in Theorem 1.1 with the measure $\frac{dr}{r}$, which is the scale invariant (Haar) measure of the multiplicative group $\mathbf R^*_+$.

Remark 1.1. In fact, one can prove that $\Phi^m_+(t^*) = \Phi^m_-(t^*)$ for each $t^*$ large enough (say, for $t^* > 10$). (In other words, $C\text{-}\lim_{r\to0^+} \Phi^m_r(\frac{t^*}{r})$ exists.) However, this result requires some significant improvements of the method used in the present paper; they will be described in [8].
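The renormalization identity d_n(α) = α d_{n−1}(T(α)) quoted in the discussion of Theorem 1.1 can be checked numerically. The sketch below assumes the normalization d_0 = 1, so that d_n(α) is the product of α, Tα, ..., T^{n−1}α; this normalization is inferred from the identity itself and from the partition of (0, 1) used later, since the appendix is not reproduced here. Floating-point iteration of the Gauss map loses accuracy quickly, so only a few steps are taken.

```python
import math

def gauss_map(x):
    """Gauss map T(x) = 1/x - [1/x] on (0, 1)."""
    y = 1.0 / x
    return y - math.floor(y)

def errors(alpha, n_max):
    """Return [d_0, ..., d_{n_max}] with d_0 = 1 and d_n = d_{n-1} * T^{n-1}(alpha)."""
    d, x = [1.0], alpha
    for _ in range(n_max):
        d.append(d[-1] * x)
        x = gauss_map(x)
    return d

alpha = math.sqrt(2) - 1          # continued fraction [2, 2, 2, ...], so T(alpha) = alpha
d = errors(alpha, 6)
shifted = errors(gauss_map(alpha), 5)
for n in range(1, 7):
    print(n, d[n], alpha * shifted[n - 1])
```

For this α the second error is |2α − 1| = 1 − 2α, which the product formula reproduces as α².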

2. A Partition of $T^2$

In 1989, R. Thom posed the following problem:

• What is the longest orbit of a linear flow with irrational slope on a flat torus with a disk removed?

The answer to this question was found by Blank and Krikorian in [2] and is summarized as follows. Without loss of generality, assume that the linear flow is $x \mapsto x + tv$, where v = (cos θ, sin θ) with $\theta \in (0, \frac{\pi}{4})$. The removed disk of diameter r is replaced by a vertical slit $S_r(v)$ of length r/cos θ with the same center (see Fig. 2).


Fig. 2. The punctured torus Yr and the slit Sr (v)

Proposition 2.1 (Blank-Krikorian [2], p. 722). Let $r \in (0, \frac12)$, $\theta \in (0, \frac{\pi}{4})$ and v = (cos θ, sin θ). Assume that tan θ is irrational. There exist three positive numbers $l_A(r,v)$, $l_B(r,v)$ and $l_C(r,v)$ satisfying $l_A(r,v) < l_B(r,v)$ and $l_C(r,v) = l_A(r,v) + l_B(r,v)$ and such that, for any orbit γ of the linear flow $x \mapsto x + tv$ in $T^2 \setminus S_r(v)$, length(γ) takes one of the three values $l_A(r,v)$, $l_B(r,v)$ or $l_C(r,v)$. Conversely, given any $l \in \{l_A(r,v), l_B(r,v), l_C(r,v)\}$, there exists an orbit γ of the flow $x \mapsto x + tv$ of length l.

Let v be fixed; orbits of length $l_A(r,v)$ (resp. of length $l_B(r,v)$, $l_C(r,v)$) are referred to as orbits of type A (resp. of type B, C)¹. Proposition 2.1 defines a partition $(Y_A(r,v), Y_B(r,v), Y_C(r,v))$ of $T^2 \setminus S_r(v)$, where $Y_A(r,v) = \{x \in Y_r \setminus S_r(v) \mid x$ belongs to an orbit of type A$\}$ and $Y_B(r,v)$ and $Y_C(r,v)$ are similarly defined. Define further $S_A(r,v) = \{y \in S_r(v) \mid$ the v-orbit starting from y is of type A$\}$, with analogous definitions for $S_B(r,v)$ and $S_C(r,v)$. Clearly $Y_A(r,v)$ (resp. $Y_B(r,v)$, $Y_C(r,v)$) is metrically equivalent to a strip (parallelogram) of length $l_A(r,v)$ and width $|S_A(r,v)|$ (resp. of length $l_B(r,v)$, $l_C(r,v)$ and width $|S_B(r,v)|$, $|S_C(r,v)|$): see Fig. 3 below.

¹ On p. 722 of [2], the sentence "If the moving slit first meets the fixed slit at the bottom, the A and B-orbits are reversed" might be the source of a slight ambiguity in the definition of the partition above. In the present paper, the orbits of type A are the shortest, consistently with the table on p. 726 of [2]. Hence the roles of orbits of type A and B cannot be reversed in the present discussion.


Fig. 3. The partition (YA (r, v), YB (r, v), YC (r, v)) of T2 \ Sr (v)

Define now $\lambda_r$ to be the exit time in the torus with the slit (instead of the disk) removed, i.e.

$$\lambda_r(z,v) = \inf\{t > 0 \mid z + tv \in S_r(v)\} \quad \text{for each } z \in T^2 \setminus S_r(v). \qquad (2.1)$$

For v = (cos θ, sin θ) with $\theta \in (0, \frac{\pi}{4})$ as above, define

$$\psi_r(t,v) = dx\text{-Prob}\big(\{z \in T^2 \setminus S_r(v) \mid \lambda_r(z,v) \ge t\}\big). \qquad (2.2)$$

With the partition of $T^2 \setminus S_r(v)$ in $(Y_A(r,v), Y_B(r,v), Y_C(r,v))$, which is metrically equivalent to the disjoint union of three strips as represented on Fig. 3, computing $\psi_r(t,v)$ in terms of the quantities $|S_A(r,v)|$, $l_A(r,v)$, $|S_B(r,v)|$, $l_B(r,v)$, $|S_C(r,v)|$ and $l_C(r,v)$ becomes an easy task. One finds that

• if $0 \le t \le l_A(r,v)$, then
$$\psi_r(t,v) = 1 - tr; \qquad (2.3)$$

• if $l_A(r,v) \le t \le l_B(r,v)$, then
$$\psi_r(t,v) = 1 - l_A(r,v)\,r - (t - l_A(r,v))\,(|S_B(r,v)| + |S_C(r,v)|)\cos\theta; \qquad (2.4)$$

• if $l_B(r,v) \le t \le l_C(r,v)$, then
$$\psi_r(t,v) = (l_C(r,v) - t)\,|S_C(r,v)|\cos\theta; \qquad (2.5)$$

• if $t \ge l_C(r,v)$, then
$$\psi_r(t,v) = 0. \qquad (2.6)$$

The graph of $t \mapsto \psi_r(t,v)$ is presented on Fig. 4.


Fig. 4. Graph of t → ψr (t, v)

So far, the distribution $\psi_r(t,v)$ has been computed in terms of the quantities $|S_A(r,v)|$, $l_A(r,v)$, $|S_B(r,v)|$, $l_B(r,v)$, $|S_C(r,v)|$ and $l_C(r,v)$ that characterize the partition $(Y_A(r,v), Y_B(r,v), Y_C(r,v))$ of $T^2 \setminus S_r(v)$. These quantities can be expressed (see [2]) in terms of the continued fraction expansion of tan θ, in the following manner. In the following discussion, we freely use the notations recalled in the Appendix below. Let α = tan θ ∈ (0, 1); the θ's for which α ∈ Q form a set of measure 0 and are discarded in the argument below. With the sequence of errors $d_n$ in the continued fraction expansion of α, we consider the following partition of the interval (0, 1):

$$(0,1) = \bigcup_{n\ge1} I_n, \quad \text{with } I_n = [d_n, d_{n-1}). \qquad (2.7)$$

Each interval $I_n$ is further partitioned into

$$I_n = \bigcup_{1\le k\le a_n} I_{n,k}, \quad \text{with } I_{n,k} = [\sup(d_n,\, d_{n-1} - kd_n),\; d_{n-1} - (k-1)d_n), \qquad (2.8)$$

so that eventually we arrive at the following nested partition of (0, 1):

$$(0,1) = \bigcup_{n\ge1}\;\bigcup_{1\le k\le a_n} I_{n,k}. \qquad (2.9)$$

Proposition 2.2 (Blank-Krikorian [2], p. 726). Let $\theta \in (0, \frac{\pi}{4})$ and r ∈ (0, 1/2). Let v = (cos θ, sin θ) and set α = tan θ (it is assumed that α ∉ Q). For $R = \frac{r}{\cos\theta}$, the integers n ≥ 1 and k such that $1 \le k \le a_n$ ($a_n$ being the nth term in the continued fraction expansion of α) are defined by the fact that $R \in I_{n,k}$ (since the intervals $I_{n,k}$ form a partition of (0, 1): see (2.9)). Then

• $l_A(r,v) = q_n$ and $|S_A(r,v)| = R - d_n$,
• $l_B(r,v) = q_{n-1} + kq_n$ and $|S_B(r,v)| = R - (d_{n-1} - kd_n)$,
• $l_C(r,v) = q_{n-1} + (k+1)q_n$ and $|S_C(r,v)| = d_{n-1} - (k-1)d_n - R$.

Using these values, we arrive at the following expression for $\psi_r(t,v)$, whenever $R = \frac{r}{\cos\theta} \in I_{n,k}$ (see Fig. 5 below):

• if $0 \le t\cos\theta \le q_n$, then
$$\psi_r(t,v) = 1 - tr, \qquad (2.10)$$

• if $q_n \le t\cos\theta \le q_{n-1} + kq_n$, then
$$\psi_r(t,v) = 1 - Rq_n - d_n(t\cos\theta - q_n), \qquad (2.11)$$

• if $q_{n-1} + kq_n \le t\cos\theta \le q_{n-1} + (k+1)q_n$, then
$$\psi_r(t,v) = 1 - Rq_n - d_n[q_{n-1} + (k-1)q_n] + [R - (d_{n-1} - (k-1)d_n)]\,(t\cos\theta - q_{n-1} - kq_n), \qquad (2.12)$$

• if $t\cos\theta \ge q_{n-1} + (k+1)q_n$, then
$$\psi_r(t,v) = 0. \qquad (2.13)$$

In (2.12) the coefficient $R - (d_{n-1} - (k-1)d_n) = -|S_C(r,v)|$ is negative, in agreement with (2.5) and with the slopes indicated on Fig. 5, so that $\psi_r$ is continuous and vanishes at $t\cos\theta = q_{n-1} + (k+1)q_n$.

Fig. 5. Graph of $t \mapsto \psi_r(t,v)$ for $R \in I_{n,k}$: a piecewise linear decreasing function starting at 1, with successive slopes $-r$, $-d_n\cos\theta$ and $r - [d_{n-1} - (k-1)d_n]\cos\theta$, and breakpoints at $t\cos\theta = q_n$, $q_{n-1} + kq_n$ and $q_{n-1} + (k+1)q_n$.
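The four cases (2.10)–(2.13) translate directly into a short routine. The sketch below is a hedged illustration: it works in the variable s = t cos θ with R = r/cos θ, and it assumes the conventions q_0 = 0, q_1 = 1, q_{n+1} = a_n q_n + q_{n−1}, d_0 = 1, d_{n+1} = d_n T^n(α), which are not spelled out in this section and are chosen here so that q_n d_{n−1} + q_{n−1} d_n = 1. The third branch is written with slope R − (d_{n−1} − (k−1)d_n) = −|S_C|, so that the function is continuous and vanishes at s = q_{n−1} + (k+1)q_n, consistently with (2.5).

```python
import math

def psi(s, alpha, R):
    """psi_r(t, v) as a function of s = t*cos(theta), for slope alpha and R = r/cos(theta).

    Assumed conventions: q_0 = 0, q_1 = 1, d_0 = 1, d_1 = alpha.
    """
    q_prev, q = 0, 1            # q_{n-1}, q_n
    d_prev, d = 1.0, alpha      # d_{n-1}, d_n
    x = alpha                   # x = T^{n-1}(alpha)
    while d > R:                # advance until d_n <= R < d_{n-1}
        a = math.floor(1.0 / x)             # a_n
        x = 1.0 / x - a                     # T^n(alpha)
        q_prev, q = q, a * q + q_prev       # q_{n+1} = a_n q_n + q_{n-1}
        d_prev, d = d, d * x                # d_{n+1} = d_n T^n(alpha)
    k = max(1, math.ceil((d_prev - R) / d)) # index k with R in I_{n,k}
    s_b = q_prev + k * q                    # breakpoint q_{n-1} + k q_n
    s_c = q_prev + (k + 1) * q              # breakpoint q_{n-1} + (k+1) q_n
    if s <= q:
        return 1.0 - s * R
    if s <= s_b:
        return 1.0 - R * q - d * (s - q)
    if s <= s_c:
        return (1.0 - R * q - d * (q_prev + (k - 1) * q)
                + (R - (d_prev - (k - 1) * d)) * (s - s_b))
    return 0.0

alpha = math.sqrt(2) - 1        # golden-like test slope with a_n = 2 for all n
for s in (0.0, 2.0, 3.0, 5.0, 6.0):
    print(s, psi(s, alpha, 0.3))
```

For α = √2 − 1 and R = 0.3 one finds n = 2 and k = 1 under these conventions, with breakpoints at s = 2, 3 and 5.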

Finally, the discussion above leads to the statistics of the exit time $\lambda_r(z, v)$ defined in (2.1) corresponding to the torus $\mathbf{T}^2$ with the slit $S_r(v)$ removed. In the case of the torus with the disk removed, the corresponding exit time $\tau_r(x, v)$ defined in (1.2) is related to $\lambda_r(z, v)$ by the obvious inequalities

$$\lambda_r(x, v) - \frac{r}{2} \le \tau_r(x, v) \le \lambda_r(x, v) + \frac{r}{2} \quad \text{for each } x \in Y_r \setminus S_r(v). \tag{2.14}$$

Define

$$\phi_r(t, v) = dx\text{-Prob}(\{x \in Y_r \setminus S_r(v) \mid \tau_r(x, v) \ge t\}); \tag{2.15}$$

because of (2.14), one has

$$\psi_r\Big(t + \frac{r}{2}, v\Big) \le \phi_r(t, v) \le \psi_r\Big(t - \frac{r}{2}, v\Big), \quad t \ge \frac{r}{2}. \tag{2.16}$$

The remaining part of the paper uses the evaluation of $\phi_r$ based on this inequality together with the formulas (2.10), (2.11), (2.12) and (2.13) for $\psi_r$.

3. An Ergodic Theorem

Given $\alpha \in (0, 1)$ and $\varepsilon \in (0, 1)$, we define

$$N(\alpha, \varepsilon) = \inf\{n \in \mathbf{N} \mid d_n(\alpha) < \varepsilon\}. \tag{3.1}$$

In terms of the partition (2.7) of $(0, 1)$, $N(\alpha, \varepsilon)$ can be equivalently defined by the condition

$$\varepsilon \in I_{N(\alpha, \varepsilon)}. \tag{3.2}$$

We start by recalling the following more or less classical lemma.

Lemma 3.1. For a.e. $\alpha \in (0, 1)$, one has

$$N(\alpha, \varepsilon) \sim -\frac{12 \ln 2}{\pi^2} \ln \varepsilon \quad \text{as } \varepsilon \to 0^+.$$

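Proof. The definition of $N(\alpha, \varepsilon)$ and the third formula in Lemma 7.1 imply that

$$\sum_{j=0}^{N(\alpha,\varepsilon)-2} -\ln T^j\alpha \le -\ln\varepsilon < \sum_{j=0}^{N(\alpha,\varepsilon)-1} -\ln T^j\alpha, \tag{3.3}$$

where $T$ is the Gauss map (7.1). First we prove that $N(\alpha, \varepsilon) \to +\infty$ as $\varepsilon \to 0^+$ a.e. in $\alpha \in (0, 1)$. Indeed, let $C > 0$ and let $E_C = \{\alpha \in (0, 1) \mid N(\alpha, \varepsilon) \le C \text{ for all } \varepsilon \in (0, \tfrac12)\}$. For all $\alpha \in E_C$ and all $\varepsilon \in (0, \tfrac12)$,

$$-\ln\varepsilon \le \sum_{j=0}^{C} -\ln T^j\alpha;$$

therefore, for all $\varepsilon \in (0, \tfrac12)$,

$$-\ln\varepsilon \cdot dg\text{-meas}(E_C) \le (C + 1)\int_0^1 (-\ln\alpha)\,dg(\alpha) < +\infty,$$

since the measure $dg$ in (7.2) is invariant under $T$. This implies that

$$dg\text{-meas}(E_C) = 0 \quad \text{for each } C > 0. \tag{3.4}$$

Because $N(\alpha, \varepsilon)$ is a nonincreasing function of $\varepsilon$,

$$\text{for each } \alpha \in \Big(\bigcup_{m \ge 1} E_m\Big)^c, \quad N(\alpha, \varepsilon) \to +\infty \tag{3.5}$$

as $\varepsilon \to 0^+$, and

$$dg\text{-meas}\Big(\bigcup_{m \ge 1} E_m\Big) = 0. \tag{3.6}$$

Secondly, by Birkhoff's ergodic theorem, there exists a $dg$-negligible set $E'$ such that

$$\frac{1}{N}\sum_{j=0}^{N-1} -\ln T^j\alpha \to \int_0^1 (-\ln\alpha)\,dg(\alpha) \quad \text{for each } \alpha \in (E')^c,$$

as $N \to +\infty$, since the Gauss transformation $T$ of $(0, 1)$ is ergodic with respect to the invariant measure $dg(\alpha)$ (see the Appendix below). Hence

$$\text{for each } \alpha \in \Big(E' \cup \bigcup_{m \ge 1} E_m\Big)^c, \quad \frac{1}{N(\alpha,\varepsilon)}\sum_{j=0}^{N(\alpha,\varepsilon)-1} -\ln T^j\alpha \to \int_0^1 (-\ln\alpha)\,dg(\alpha).$$

By (3.3), (3.5) and (3.6), one finally obtains that, as $\varepsilon \to 0^+$,

$$\text{for each } \alpha \in \Big(E' \cup \bigcup_{m \ge 1} E_m\Big)^c, \quad \frac{-\ln\varepsilon}{N(\alpha,\varepsilon)} \to \int_0^1 (-\ln\alpha)\,dg(\alpha) = \frac{\zeta(2)}{2\ln 2}$$

(replacing the ln under the integral sign by its Taylor series at $\alpha = 1$).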

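The main result of this section is the following application of the Birkhoff ergodic theorem. We shall need the notations below:

$$\Delta_0(\alpha, x) = -x - \ln d_{N(\alpha, e^{-x})}(\alpha), \quad \Delta_1(\alpha, x) = -x - \ln d_{N(\alpha, e^{-x})-1}(\alpha). \tag{3.7}$$

Proposition 3.1. Let $f$ be a bounded nonnegative measurable function on $\mathbf{R}^2$. For each $x^* \in \mathbf{R}$ and a.e. in $\alpha \in (0, 1)$, one has

$$\frac{1}{|\ln\varepsilon|}\int_{x^*}^{|\ln\varepsilon|} f(\Delta_0(\alpha, x), \Delta_1(\alpha, x))\,dx \to \frac{12}{\pi^2}\int_0^1 \frac{F(\theta)\,d\theta}{1 + \theta}$$

as $\varepsilon \to 0$, where

$$F(\theta) = \int_0^{|\ln\theta|} f(|\ln\theta| - y, -y)\,dy.$$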

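Proof. First we decompose the integral

$$\int_{x^*}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx = \int_{x^*}^{0} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx + \int_{|\ln d_{N(\alpha,\varepsilon)-1}(\alpha)|}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx + \sum_{l=0}^{N(\alpha,\varepsilon)-2} \int_{|\ln d_l(\alpha)|}^{|\ln d_{l+1}(\alpha)|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx. \tag{3.8}$$

Now observe that, whenever $x$ belongs to the domain of integration of

$$\int_{|\ln d_l(\alpha)|}^{|\ln d_{l+1}(\alpha)|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx,$$

in other words, whenever $|\ln d_l(\alpha)| < x \le |\ln d_{l+1}(\alpha)|$, then $N(\alpha, e^{-x}) = l + 1$. This implies that

$$\int_{|\ln d_l(\alpha)|}^{|\ln d_{l+1}(\alpha)|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx = \int_{-\ln d_l(\alpha)}^{-\ln d_{l+1}(\alpha)} f(-x - \ln d_{l+1}(\alpha), -x - \ln d_l(\alpha))\,dx = \int_0^{\ln(d_l(\alpha)/d_{l+1}(\alpha))} f(\ln(d_l(\alpha)/d_{l+1}(\alpha)) - y, -y)\,dy = F(T^l\alpha), \tag{3.9}$$

since $d_{l+1}(\alpha)/d_l(\alpha) = T^l\alpha$ by Lemma 7.1. Likewise, since $f$ is nonnegative,

$$0 \le \int_{|\ln d_{N(\alpha,\varepsilon)-1}(\alpha)|}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx \le \int_{|\ln d_{N(\alpha,\varepsilon)-1}(\alpha)|}^{|\ln d_{N(\alpha,\varepsilon)}(\alpha)|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx \le F(T^{N(\alpha,\varepsilon)-1}\alpha). \tag{3.10}$$

Using (3.8) and (3.9) leads to

$$\frac{1}{|\ln\varepsilon|}\int_{x^*}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx = \frac{N(\alpha,\varepsilon)}{|\ln\varepsilon|}\Bigg(\frac{1}{N(\alpha,\varepsilon)}\sum_{l=0}^{N(\alpha,\varepsilon)-2} F(T^l\alpha)\Bigg) + \frac{1}{|\ln\varepsilon|}\int_{x^*}^{0} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx + \frac{1}{|\ln\varepsilon|}\int_{|\ln d_{N(\alpha,\varepsilon)-1}(\alpha)|}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx. \tag{3.11}$$

Observe first that, by its definition,

$$0 \le F(\theta) \le \|f\|_{L^\infty} |\ln\theta|,$$

which implies that $F \in L^1((0, 1), dg)$. Birkhoff's ergodic theorem applied to the Gauss transformation $T$ together with Lemma 3.1 shows that

$$\frac{1}{N(\alpha,\varepsilon)}\sum_{l=0}^{N(\alpha,\varepsilon)-2} F(T^l\alpha) \to \int_0^1 F(\theta)\,dg(\theta) = \frac{1}{\ln 2}\int_0^1 \frac{F(\theta)\,d\theta}{1+\theta}$$

for a.e. $\alpha \in (0, 1)$ as $\varepsilon \to 0$, so that the first term in the right-hand side of (3.11) converges to

$$\frac{12}{\pi^2}\int_0^1 \frac{F(\theta)\,d\theta}{1+\theta}$$

as $\varepsilon \to 0^+$ a.e. in $\alpha$. The second term in the right-hand side of (3.11) obviously vanishes a.e. in $\alpha \in (0, 1)$ as $\varepsilon \to 0$. As for the third term, because of (3.10), one has

$$0 \le \frac{1}{|\ln\varepsilon|}\int_{|\ln d_{N(\alpha,\varepsilon)-1}(\alpha)|}^{|\ln\varepsilon|} f(\Delta_0(\alpha,x), \Delta_1(\alpha,x))\,dx \le \frac{N(\alpha,\varepsilon)}{|\ln\varepsilon|}\,\frac{1}{N(\alpha,\varepsilon)}\,F(T^{N(\alpha,\varepsilon)-1}\alpha)$$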

and the right-hand side of the above inequality converges to 0 a.e. in $\alpha \in (0, 1)$ as $\varepsilon \to 0^+$ by Birkhoff's ergodic theorem and Lemma 3.1.

4. Proof of Theorem 1.1

4.1. Step 1: Pointwise estimates. The discussion in Sect. 3 made it clear that the natural objects for applying Birkhoff's ergodic theorem are functions involving a finite number (two in this case) of the $d_n$'s with $n \to +\infty$. In this first step, we shall reduce the function $\psi_r$ to an expression of this form modulo terms that are small in some appropriate sense in the asymptotic regime that we consider, i.e. as $t^* \to +\infty$. Throughout this subsection, we set $r \in (0, \tfrac12)$ and $\theta \in (0, \tfrac{\pi}{4})$ such that $\alpha = \tan\theta \notin \mathbf{Q}$; let then $v = (\cos\theta, \sin\theta)$ and $R = r/\cos\theta$. We also consider $n \ge 1$ and $k \in \{1, \ldots, a_n\}$ (where $\alpha = [a_1, a_2, \ldots]$) such that $R \in I_{n,k}$; this defines $n$ and $k$ in a unique way, since the $I_{n,k}$'s form a partition of $(0, 1)$.

Lemma 4.1. Under these conditions on $\alpha$, $n$, $k$ and $R$,

• the integer $k$ is given by the formula
$$k = \inf\{l \in \mathbf{N}^* \mid d_{n-1} - ld_n \le R\} = -\left[-\frac{d_{n-1} - R}{d_n}\right]; \tag{4.1}$$
• the denominator $q_n$ of the nth convergent of $\alpha$ satisfies the estimate
$$\frac{1}{R + (k+1)d_n} < q_n < \frac{1}{R + (k-1)d_n}; \tag{4.2}$$
• finally, one has
$$q_n d_n < \frac{1}{k}, \quad 0 < 1 - q_n d_{n-1} < \frac{2}{k+1}. \tag{4.3}$$
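The estimates of Lemma 4.1 can be spot-checked numerically. The sketch below is an assumption-laden illustration, not part of the paper: the helper names `cf_data` and `lemma_4_1_holds`, the sample slope $\alpha = \sqrt{2} - 1$ and the sample values of $R$ are all hypothetical, and the continued fraction data are generated in floating point from the recursions (7.5)-(7.6).

```python
import math

def cf_data(alpha, R):
    """Return (n, k, q_{n-1}, q_n, d_{n-1}, d_n) with R in I_{n,k} (illustrative helper)."""
    q_prev, q_cur = 0, 1            # q_0, q_1
    d_prev, d_cur = 1.0, alpha      # d_0, d_1
    n = 1
    while d_cur > R:                # stop at the first n with d_n <= R < d_{n-1}
        a = math.floor(d_prev / d_cur)
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        d_prev, d_cur = d_cur, d_prev - a * d_cur
        n += 1
    k = 1
    while d_prev - k * d_cur > R:
        k += 1
    return n, k, q_prev, q_cur, d_prev, d_cur

def lemma_4_1_holds(alpha, R):
    """Check (4.1), (4.2) and (4.3) for one pair (alpha, R)."""
    n, k, q_prev, q_n, d_prev, d_n = cf_data(alpha, R)
    k_formula = -math.floor(-(d_prev - R) / d_n)                        # (4.1)
    bound_42 = 1 / (R + (k + 1) * d_n) < q_n < 1 / (R + (k - 1) * d_n)  # (4.2)
    bound_43 = q_n * d_n < 1 / k and 0 < 1 - q_n * d_prev < 2 / (k + 1) # (4.3)
    return k == k_formula and bound_42 and bound_43

alpha = math.sqrt(2) - 1
checks = [lemma_4_1_holds(alpha, R) for R in (0.3, 0.1, 0.05, 0.013, 0.002)]
```

Such a check only probes a handful of values; it does not replace the proof that follows.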

Proof. Whenever $d_n \le R < d_{n-1}$, i.e. whenever $R \in I_n$, the condition $R \in I_{n,k}$ amounts to defining $k$ by the formula

$$k = \inf\{l \in \mathbf{N}^* \mid d_{n-1} - ld_n \le R\} = -\left[-\frac{d_{n-1} - R}{d_n}\right].$$

Next we estimate $q_n$. Since $R \in I_{n,k}$, one has in particular the inequalities

$$d_{n-1} - kd_n \le R < d_{n-1} - d_n(k-1), \tag{4.4}$$

which imply

$$\frac{1}{R + d_n(k-1)} > \frac{1}{d_{n-1}} > q_n,$$

by the second inequality in (7.9). This is exactly the upper bound in (4.2). By (7.7),

$$1 - Rq_n - d_n(k-1)q_n = d_n q_{n-1} + d_{n-1}q_n - Rq_n - d_n(k-1)q_n < q_n(d_n + d_{n-1} - R - (k-1)d_n) \le q_n(d_n + d_{n-1} - d_{n-1} + kd_n - (k-1)d_n) = 2d_nq_n,$$

which gives the lower bound in (4.2). The upper bound in (4.2) and the fact that $R \ge d_n$ (since $R \in I_{n,k}$) imply the first inequality in (4.3). As for the second inequality there, observe that $q_nd_{n-1} \le 1$ by (7.9), which establishes the lower bound, while the upper bound follows from (4.2) in the following manner:

$$\frac{1}{d_{n-1}} - q_n \le \frac{R + d_n(k+1) - d_{n-1}}{d_{n-1}(R + d_n(k+1))} < \frac{2d_n}{d_{n-1}(R + d_n(k+1))} < \frac{2}{d_{n-1}(k+1)}$$

(where the penultimate inequality above follows from (4.4)).

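Lemma 4.2. Let $v$, $\alpha$, $n$, $k$ and $R$ be chosen as above, and let $t^* > 2$. Define

$$\chi_r\Big(\frac{t^*}{r}, v\Big) = \Big(1 - \frac{R}{d_{n-1}} - t^*\frac{d_n}{R}\Big)_+; \tag{4.5}$$

then for $\frac{t^*}{R} \ge q_n$ we have

$$\Big|\psi_r\Big(\frac{t^*}{r}, v\Big) - \chi_r\Big(\frac{t^*}{r}, v\Big)\Big| \le \frac{4}{k}\mathbf{1}_{k \ge t^*-2}. \tag{4.6}$$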

Proof. Let $t^* > 2$. In view of the second inequality in (7.9),

$$R < d_{n-1} < \frac{1}{q_n}, \quad \text{so that} \quad q_n < \frac{1}{R} < \frac{t^*}{R}. \tag{4.7}$$

Next we compare the part of the graph of $t \mapsto \psi_r(t, v)$ that corresponds to $t\cos\theta \ge q_n$ with the straight line $\Gamma$ as defined on Fig. 5. They only differ when $q_{n-1} + kq_n < t\cos\theta < q_{n-1} + (k+1)q_n$, and since $\psi_r$ is a non-increasing function, they differ by at most

$$\psi_r\Big(\frac{q_{n-1} + kq_n}{\cos\theta}, v\Big) = ((d_{n-1} - (k-1)d_n) - R)\,q_n.$$

By (4.4), $0 < [d_{n-1} - (k-1)d_n] - R \le d_n$. Hence

$$0 \le \psi_r\Big(\frac{t^*}{r}, v\Big) - \Big(1 - Rq_n - \Big(\frac{t^*}{R} - q_n\Big)d_n\Big)_+ \le \psi_r\Big(\frac{q_{n-1} + kq_n}{\cos\theta}, v\Big)\mathbf{1}_{q_{n-1}+kq_n < \frac{t^*}{R} < q_{n-1}+(k+1)q_n} \le \frac{1}{k}\mathbf{1}_{k \ge t^*-2}, \tag{4.8}$$

where the last inequality follows from (4.3) and (4.7). Next we estimate the difference

$$\Big(1 - Rq_n - \Big(\frac{t^*}{R} - q_n\Big)d_n\Big)_+ - \Big(1 - \frac{R}{d_{n-1}} - t^*\frac{d_n}{R}\Big)_+.$$

Because of the second inequality in (7.9), this difference is nonnegative. Because the map $x \mapsto x_+$ is a contraction, this difference is less than

$$\frac{R}{d_{n-1}} - Rq_n + q_nd_n;$$

on the other hand both terms in the difference above vanish whenever $\frac{t^*}{R} > q_{n-1} + (k+1)q_n$. Hence

$$\Big(1 - Rq_n - \Big(\frac{t^*}{R} - q_n\Big)d_n\Big)_+ - \Big(1 - \frac{R}{d_{n-1}} - t^*\frac{d_n}{R}\Big)_+ \le \Big(\frac{R}{d_{n-1}} - Rq_n + q_nd_n\Big)\mathbf{1}_{\frac{t^*}{R} \le q_{n-1}+(k+1)q_n} \le \Big(\frac{2}{k+1} + \frac{1}{k}\Big)\mathbf{1}_{\frac{t^*}{R} \le q_{n-1}+(k+1)q_n} \le \frac{3}{k}\mathbf{1}_{k \ge t^*-2},$$

where the penultimate inequality rests on (4.3) and the fact that $R < d_{n-1}$.

4.2. Step 2: Applying the ergodic theorem. In this subsection again, we set $r \in (0, \tfrac12)$ and pick $\theta \in (0, \tfrac{\pi}{4})$ such that $\alpha = \tan\theta \notin \mathbf{Q}$; again we set $v = (\cos\theta, \sin\theta)$ and $R = r/\cos\theta$. As in the previous subsection, we define $n \ge 1$ by the condition $R \in I_n$. In other words, we set $n = N(\alpha, R)$. With these assumptions, consider the expression $\chi_r(\frac{t^*}{r}, v)$ given by (4.5), i.e.

$$\chi_r\Big(\frac{t^*}{r}, v\Big) = \Big(1 - \frac{R}{d_{N(\alpha,R)-1}} - t^*\frac{d_{N(\alpha,R)}}{R}\Big)_+ = \Big(1 - e^{\Delta_1(\alpha, x)} - t^*e^{-\Delta_0(\alpha, x)}\Big)_+$$

with $x = -\ln R$, while $\Delta_0(\alpha, x)$ and $\Delta_1(\alpha, x)$ are defined as in (3.7).

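Proposition 4.1. Let $R^* \in (0, 1)$. Let $t^* > 1$; then, for a.e. $\theta \in (0, \tfrac{\pi}{4})$ such that $\alpha = \tan\theta \notin \mathbf{Q}$,

$$\frac{1}{|\ln\varepsilon|}\int_\varepsilon^{R^*} \chi_{R\cos\theta}\Big(\frac{t^*}{R\cos\theta}, v\Big)\frac{dR}{R} \to \frac{12}{\pi^2}\int_0^1 \Big(\ln\frac{1+\sqrt{1-z}}{1-\sqrt{1-z}} - \sqrt{1-z}\Big)\frac{dz}{4t^*+z} + \frac{6}{\pi^2}\int_0^1 \Big(\frac{z}{1+\sqrt{1-z}} - \frac{z}{1-\sqrt{1-z}}\Big)\frac{dz}{4t^*+z}$$

as $\varepsilon \to 0^+$, where $v = (\cos\theta, \sin\theta)$.

Proof. The proof is based upon applying Proposition 3.1 to the function $f$ defined by $f(z_1, z_2) = (1 - e^{z_2} - t^*e^{-z_1})_+$, since, for each $R^* \in (0, 1)$, the change of variables $x = -\ln R$ gives, with $x^* = -\ln R^*$,

$$\frac{1}{|\ln\varepsilon|}\int_\varepsilon^{R^*} \chi_{R\cos\theta}\Big(\frac{t^*}{R\cos\theta}, v\Big)\frac{dR}{R} = \frac{1}{|\ln\varepsilon|}\int_{x^*}^{|\ln\varepsilon|} f(\Delta_0(\alpha, x), \Delta_1(\alpha, x))\,dx.$$

Starting from $f$, an elementary computation leads to $F$ defined for each $\xi \in (0, 1)$ as in Proposition 3.1 by

$$F(\xi) = \int_0^{|\ln\xi|} f(|\ln\xi| - y, -y)\,dy = \int_0^{|\ln\xi|} (1 - e^{-y} - t^*\xi e^{y})_+\,dy = \int_1^{1/\xi} (\zeta - 1 - t^*\xi\zeta^2)_+\,\frac{d\zeta}{\zeta^2}.$$

Assuming that $t^* > 1$ and $\xi \in (0, 1)$, elementary computations show that $\zeta - 1 - t^*\xi\zeta^2 < 0$ for all $\zeta \in \mathbf{R}$ if $4t^*\xi > 1$; otherwise

$$\zeta - 1 - t^*\xi\zeta^2 \ge 0 \quad \text{iff} \quad \frac{1 - \sqrt{1-4t^*\xi}}{2t^*\xi} \le \zeta \le \frac{1 + \sqrt{1-4t^*\xi}}{2t^*\xi},$$

and hence, for $4t^*\xi \le 1$,

$$F(\xi) = \int_1^{1/\xi} (\zeta - 1 - t^*\xi\zeta^2)_+\,\frac{d\zeta}{\zeta^2} = \int_{\frac{1-\sqrt{1-4t^*\xi}}{2t^*\xi}}^{\frac{1+\sqrt{1-4t^*\xi}}{2t^*\xi}} \Big(\frac{1}{\zeta} - \frac{1}{\zeta^2} - t^*\xi\Big)d\zeta = \ln\frac{1+\sqrt{1-4t^*\xi}}{1-\sqrt{1-4t^*\xi}} - \sqrt{1-4t^*\xi} + \frac{2t^*\xi}{1+\sqrt{1-4t^*\xi}} - \frac{2t^*\xi}{1-\sqrt{1-4t^*\xi}}.$$

Therefore

$$\frac{12}{\pi^2}\int_0^1 \frac{F(\xi)\,d\xi}{1+\xi} = \frac{12}{\pi^2}\int_0^{1/4t^*} \Big(\ln\frac{1+\sqrt{1-4t^*\xi}}{1-\sqrt{1-4t^*\xi}} - \sqrt{1-4t^*\xi}\Big)\frac{d\xi}{1+\xi} + \frac{12}{\pi^2}\int_0^{1/4t^*} \Big(\frac{2t^*\xi}{1+\sqrt{1-4t^*\xi}} - \frac{2t^*\xi}{1-\sqrt{1-4t^*\xi}}\Big)\frac{d\xi}{1+\xi}$$

$$= \frac{12}{\pi^2}\int_0^1 \Big(\ln\frac{1+\sqrt{1-z}}{1-\sqrt{1-z}} - \sqrt{1-z}\Big)\frac{dz}{4t^*+z} + \frac{6}{\pi^2}\int_0^1 \Big(\frac{z}{1+\sqrt{1-z}} - \frac{z}{1-\sqrt{1-z}}\Big)\frac{dz}{4t^*+z}.$$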
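The closed form obtained for $F$ in the proof above, and the size of the limit in Proposition 4.1 for large $t^*$, can be checked with a crude quadrature. The sketch below is illustrative only: the function names, the midpoint rule and the sample values of $\xi$ and $t^*$ are assumptions made here, not the authors' method. It compares the closed form with a direct midpoint-rule evaluation of $\int_1^{1/\xi}(\zeta - 1 - t^*\xi\zeta^2)_+\,d\zeta/\zeta^2$, and then evaluates $\frac{12}{\pi^2}\int_0^1 F(\xi)\,d\xi/(1+\xi)$, whose product with $t^*$ should approach $2/\pi^2$ as $t^*$ grows.

```python
import math

def F_closed(xi, t_star):
    """Closed form of F(xi) from the proof of Proposition 4.1 (zero when 4 t* xi >= 1)."""
    disc = 1.0 - 4.0 * t_star * xi
    if disc <= 0.0:
        return 0.0
    s = math.sqrt(disc)
    a = 2.0 * t_star * xi
    return math.log((1.0 + s) / (1.0 - s)) - s + a / (1.0 + s) - a / (1.0 - s)

def F_quadrature(xi, t_star, steps=200_000):
    """Midpoint rule for the integral of (zeta - 1 - t* xi zeta^2)_+ / zeta^2 over [1, 1/xi]."""
    h = (1.0 / xi - 1.0) / steps
    total = 0.0
    for i in range(steps):
        zeta = 1.0 + (i + 0.5) * h
        total += max(zeta - 1.0 - t_star * xi * zeta * zeta, 0.0) / (zeta * zeta)
    return total * h

def limit_integral(t_star, steps=200_000):
    """Midpoint rule for (12/pi^2) * integral of F(xi)/(1+xi); F vanishes beyond 1/(4 t*)."""
    h = 1.0 / (4.0 * t_star) / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h
        total += F_closed(xi, t_star) / (1.0 + xi)
    return 12.0 / math.pi ** 2 * total * h

closed = F_closed(0.05, 2.0)
numeric = F_quadrature(0.05, 2.0)
scaled = 1.0e4 * limit_integral(1.0e4)   # compare with 2 / pi^2
```

The midpoint rule is used because the integrand of `limit_integral` has an integrable logarithmic singularity at $\xi = 0$, which a rule sampling the endpoints would hit.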

4.3. Step 3: L1 estimate of the remainder. The last ingredient in the proof of Theorem 1.1 consists in estimating the right-hand side of (4.6) in average (integrating over the angle θ). Let R ∈ (0, 1) and α ∈ (0, 1) \ Q; we define n and k by the condition R ∈ In,k – below, this value of k is denoted by k(α, R). Equivalently,

dN(α,R)−1 (α) − R k(α, R) = − − + 1. dN(α,R) (α)

n = N (α, R),

Lemma 4.3. Let $m \equiv m(\theta) \in L^\infty((0, \tfrac{\pi}{4}))$. Then, for each $R \in (0, 1)$ and each $\lambda > 1$, one has

$$\int_0^{\pi/4} \mathbf{1}_{k(\tan\theta, R) > \lambda}\,m(\theta)\,d\theta \le \frac{2}{\lambda - 1}\|m\|_{L^\infty}. \tag{4.9}$$

Proof. The definition of $k(\alpha, R)$ implies in particular that

$$k(\alpha, R) \le \frac{d_{N(\alpha,R)-1}(\alpha)}{d_{N(\alpha,R)}(\alpha)} + 1 = \frac{1}{T^{N(\alpha,R)-1}\alpha} + 1.$$

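Hence, for each $\lambda > 1$, one has

$$dg\text{-meas}(\{\alpha \in (0, 1) \mid k(\alpha, R) \ge \lambda\}) \le dg\text{-meas}\Big(\Big\{\alpha \in (0, 1) \,\Big|\, 0 < T^{N(\alpha,R)-1}\alpha \le \frac{1}{\lambda-1}\Big\}\Big) \le dg\text{-meas}\Big(\Big(0, \frac{1}{\lambda-1}\Big)\Big) = \frac{1}{\ln 2}\ln\frac{\lambda}{\lambda-1} \to 0$$

as $\lambda \to +\infty$. Changing variables from $\alpha$ to $\theta = \arctan\alpha$ and using the classical inequality $\ln(1 + z) \le z$ leads to (4.9).

4.4. Step 4: End of the proof. We conclude the proof of Theorem 1.1 by bringing together the various ingredients described above. Let $m \in L^\infty([0, \tfrac{\pi}{4}])$ such that $m \ge 0$ and $\int_0^{\pi/4} m(\theta)\,d\theta = 1$. By Lemma 4.2 (especially the inequality (4.6) there) and Lemma 4.3, one has

$$\Bigg|\frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \psi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} - \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \chi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r}\Bigg|$$

$$\le \frac{4}{t^*-2}\,\frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \mathbf{1}_{k(\tan\theta, r/\cos\theta) > t^*-2}\,m(\theta)\,d\theta\,\frac{dr}{r} \le \frac{8\|m\|_{L^\infty}}{(t^*-3)^2}.$$

By Proposition 4.1 and dominated convergence,

$$\frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \chi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} \to \Phi(t^*)$$

as $\varepsilon \to 0^+$, where

$$\Phi(t^*) = \frac{12}{\pi^2}\int_0^1 \Big(\ln\frac{1+\sqrt{1-z}}{1-\sqrt{1-z}} - \sqrt{1-z}\Big)\frac{dz}{4t^*+z} + \frac{6}{\pi^2}\int_0^1 \Big(\frac{z}{1+\sqrt{1-z}} - \frac{z}{1-\sqrt{1-z}}\Big)\frac{dz}{4t^*+z}.$$

Hence

$$\limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \psi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} \le \Phi(t^*) + \frac{8\|m\|_{L^\infty}}{(t^*-3)^2},$$

while

$$\liminf_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \psi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} \ge \Phi(t^*) - \frac{8\|m\|_{L^\infty}}{(t^*-3)^2}.$$

As $t^* \to +\infty$, one has

$$\Phi(t^*) \sim \frac{3}{\pi^2t^*}\int_0^1 \Big(\ln\frac{1+\sqrt{1-z}}{1-\sqrt{1-z}} - \sqrt{1-z}\Big)dz + \frac{3}{2\pi^2t^*}\int_0^1 \Big(\frac{z}{1+\sqrt{1-z}} - \frac{z}{1-\sqrt{1-z}}\Big)dz = \frac{2}{\pi^2t^*}.$$

Since the function $t \mapsto \psi_r(t, v)$ is nonincreasing for all $v \in \mathbf{S}^1$ and $r \in (0, \tfrac12)$, one has, by using (2.16),

$$\limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \phi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} = \limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{2s}\!\!\int_0^{\pi/4} \phi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r}$$

$$\le \limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{2s}\!\!\int_0^{\pi/4} \psi_r\Big(\frac{t^*}{r} - \frac{r}{2}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} \le \limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{2s}\!\!\int_0^{\pi/4} \psi_r\Big(\frac{t^*-s}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r}$$

$$\le \Phi(t^*-s) + \frac{8\|m\|_{L^\infty}}{(t^*-s-3)^2},$$

for each $s \in (0, \tfrac12)$. Letting $s \to 0^+$ in the last inequality, one arrives at

$$\limsup_{\varepsilon\to0^+} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{\varepsilon^*}\!\!\int_0^{\pi/4} \phi_r\Big(\frac{t^*}{r}, (\cos\theta, \sin\theta)\Big) m(\theta)\,d\theta\,\frac{dr}{r} \le \Phi(t^*) + \frac{8\|m\|_{L^\infty}}{(t^*-3)^2}. \tag{4.10}$$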

A similar inequality holds for the lim inf. By symmetry, the averaging in $\theta$ in (4.10) can be done equivalently on $(0, \tfrac{\pi}{4})$ or on $(0, 2\pi)$, which eventually proves (1.9).

5. Applications to Kinetic Theory

It has been proved in Theorem 2.1 of [12] (see also [6]) that the linear Boltzmann equation (i.e. Eq. (10) of [13]) does not govern the Boltzmann-Grad limit of the periodic Lorentz gas, unlike the case of a Lorentz gas with a random (Poisson) distribution of scatterers, where the linear Boltzmann equation was rigorously derived in [11] and [5]. While Theorem 2.1 of [12] is merely a negative result, we show below how to infer from Proposition 1.1 positive information on the asymptotic behavior of the periodic Lorentz gas in the Boltzmann-Grad limit. Define $\Omega_\varepsilon = \{\varepsilon z \mid z \in Z_\varepsilon\}$, and consider the transport equation

$$\begin{aligned} \partial_t f_\varepsilon + v\cdot\nabla_x f_\varepsilon &= 0, && x \in \Omega_\varepsilon,\ |v| = 1,\\ f_\varepsilon(t, x, v) &= 0, && x \in \partial\Omega_\varepsilon,\ v\cdot n_x > 0,\\ f_\varepsilon(0, x, v) &= f^{in}(x, v), && x \in \Omega_\varepsilon,\ |v| = 1. \end{aligned} \tag{5.1}$$

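Here, the unknown is $f_\varepsilon \equiv f_\varepsilon(t, x, v)$ while $n_x$ is the inward unit normal at point $x \in \partial\Omega_\varepsilon$ and $f^{in}$ is a given, nonnegative function of $C_c(\mathbf{R}^2 \times \mathbf{S}^1)$. Physically, this is a variant of the periodic Lorentz gas where scatterers are replaced by holes (or traps) where impinging particles fall and thus are removed from the domain $\Omega_\varepsilon$. Obviously, for each $t \ge 0$,

$$\|f_\varepsilon\|_{L^\infty_{t,x,v}} = \|f^{in}\|_{L^\infty_{x,v}}. \tag{5.2}$$

Reasoning as in [13] suggests that $f_\varepsilon \to f$ in $L^\infty_{t,x,v}$ weak-*, where $f$ solves the uniformly damped transport equation

$$\partial_t f + v\cdot\nabla_x f + f = 0 \ \text{ on } \mathbf{R}_+^* \times \mathbf{R}^2 \times \mathbf{S}^1, \quad f|_{t=0} = f^{in}, \tag{5.3}$$

but this is ruled out by Theorem 2.1 of [12]. Instead, Proposition 1.1 suggests that the resulting damping rate should vanish in the limit as $t \to +\infty$. This statement is made precise in the following theorem.

Theorem 5.1. Let $f^{in} \ge 0$ belong to $C_c(\mathbf{R}^2 \times \mathbf{S}^1)$ and let $f_\varepsilon$ be, for each $\varepsilon \in (0, \tfrac14)$, the solution of (5.1). Then, for each nonnegative test function $\chi \in C_c^1(\mathbf{R}^2 \times \mathbf{S}^1)$, one has

$$\limsup_{\varepsilon\to0} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{1/4} \iint f_r(t, x, v)\,\chi(x, v)\,dx\,dv\,\frac{dr}{r} = \iint f(t, x, v)\,\chi(x, v)\,dx\,dv + O\Big(\frac{1}{t^2}\Big),$$

$$\liminf_{\varepsilon\to0} \frac{1}{|\ln\varepsilon|}\int_\varepsilon^{1/4} \iint f_r(t, x, v)\,\chi(x, v)\,dx\,dv\,\frac{dr}{r} = \iint f(t, x, v)\,\chi(x, v)\,dx\,dv + O\Big(\frac{1}{t^2}\Big)$$

as $t \to +\infty$, where

$$f(t, x, v) = \frac{2f^{in}(x - tv, v)}{\pi^2 t}. \tag{5.4}$$

In particular, $f$ satisfies

$$\partial_t f + v\cdot\nabla_x f + \frac{1}{t}f = 0, \quad (t, x, v) \in (0, +\infty) \times \mathbf{R}^2 \times \mathbf{S}^1 \tag{5.5}$$

in the sense of distributions.

Proof. First the solution of (5.1) is given by the formula

$$f_\varepsilon(t, x, v) = f^{in}(x - tv, v)\,\mathbf{1}_{\tau_\varepsilon(x/\varepsilon, -v) \ge t/\varepsilon}. \tag{5.6}$$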

In this formula, the exit time τ is considered as a function defined on Z × S1 with Z2 -periodicity and extended by 0 in Zc × S1 . Let h ≡ h(t, x, v) ≥ 0 belong to C ∞ (R+ × R2 × S1 ), with support in R+ × [−L, L]2 × S1 ; t x t dxdv − h(t, x, v)1 , v h(t, x, v)dxdv φ r τr ( r ,−v)≥ r r S1


2 ≤ h(t, x, v)1τ ( x ,−v)≥ t dxdv − r h(t, rl, v) 1τr (y,−v)≥ t dydv r r r 1 2 S T l∈Z2

t + φr h(t, rl, v) − h(t, x, v)dx dv , v r 2 r S1 l∈Z2

= r2 |h(t, rl + ry, v) − h(t, rl, v)|1τr (y,−v)≥ t dydv l∈Z2

+r 2

r

T2 ×S1

l∈Z2

T2 ×S1

|h(t, rl + ry, v) − h(t, rl, v)|φr

≤ 2r ∇x h L∞ |S1 | · r 2

t , v dydv r

1[−L,L]2 (rl)

l∈Z2

≤ 2r ∇x h L∞ |S1 |(L2 + o(1)).

(5.7)

By Theorem 1.1 ∗ 1 t 2 dr φr ,v h(t, x, v)dxdv − 2 h(t, x, v)dxdv lim sup →0+ | ln | r r π t S1 ≤

8 h L∞ 1 t,v (Lx )

(5.8)

(t ∗ − 3)2

with a similar estimate for the lim inf. Putting together (5.7), (5.8) with the formula (5.6) establishes (5.4).
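That the profile (5.4) solves the damped equation (5.5) can be illustrated by a finite-difference check. The sketch below makes simplifying assumptions that are not in the paper: one space variable and one fixed velocity instead of $\mathbf{R}^2 \times \mathbf{S}^1$, a hypothetical Gaussian initial datum `f_in`, and centered differences with a hypothetical step `h`.

```python
import math

def f_in(x):
    """Hypothetical smooth initial profile (one space variable, velocity held fixed)."""
    return math.exp(-x * x)

def f_limit(t, x, v=1.0):
    """Formula (5.4) in this reduced setting: 2 f_in(x - t v) / (pi^2 t)."""
    return 2.0 * f_in(x - t * v) / (math.pi ** 2 * t)

def residual(t, x, v=1.0, h=1e-5):
    """Centered finite-difference value of d_t f + v d_x f + f / t, cf. (5.5)."""
    dt = (f_limit(t + h, x, v) - f_limit(t - h, x, v)) / (2.0 * h)
    dx = (f_limit(t, x + h, v) - f_limit(t, x - h, v)) / (2.0 * h)
    return dt + v * dx + f_limit(t, x, v) / t

res = [residual(t, x) for t in (0.5, 2.0, 10.0) for x in (-1.0, 1.5, 9.0)]
```

The transport terms cancel against each other, and the $1/t$ prefactor in (5.4) produces exactly the $-f/t$ that the damping term compensates, so the residuals are zero up to discretization error.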

Theorem 5.1 can also be viewed as a result in homogenization. This remark leads to a comparison with the analogous situation (homogenization of a diffusion process with Dirichlet boundary conditions in $Z_r$) studied in [10]; in truth, the result obtained in [10] is much more complete and satisfying than Theorem 5.1. The mathematical tools used in [10] are of a very different nature than the ones in the present work. This is by no means surprising and simply reflects the very different nature of the trajectories of a diffusion process and of those of a free transport equation. It is very likely that the nature of the result in Theorem 5.1, and its proof, would be deeply affected by adding some collision (i.e. jump in velocity) process to the free transport between successive impingements on the obstacles.

6. Final Remarks and Perspectives

A first question left open in the present work is the existence of the limit (1.7). As the reader will have probably noticed in the proof of Theorem 1.1, the diameter $r$ of the obstacle is treated as a time variable under renormalization, i.e. transformation of the problem by the Gauss map. It may be that new insight concerning (1.7) can be gained by using more specific properties of $T$ than ergodicity as in the present paper. A second question is the existence of the limit in the sense of Cesàro, as commented upon in Remark 1.1. As can be seen from the proof of Theorem 1.1, proving this essentially amounts to being able to apply an ergodic theorem as in Sect. 3 to functions of the form

$$f\big(q_{N(\alpha,\varepsilon)},\ d_{N(\alpha,\varepsilon)},\ d_{N(\alpha,\varepsilon)-1}\big).$$

(In the present paper, (4.3) essentially allows one to replace $q_{N(\alpha,\varepsilon)}$ by $1/d_{N(\alpha,\varepsilon)-1}$.) This extension of the ergodic theorem in the present paper is postponed to a subsequent paper [8] and will be applied to the problem of the Lyapunov exponent for the Lorentz gas (see [9] for a presentation of this subject). Finally we would like to comment on some very interesting, related work in [3] and [4]. The billiard problem considered in [3] is similar to studying the distribution of free path lengths in $Z_r$ with the obstacle at the origin removed (i.e. in $Z_r \cup B(0, r/2)$) for particles starting from the origin only. This is a quite different problem and thus the limit as $r \to 0^+$ of this distribution has nothing to do with the simple result in Theorem 1.1. A related issue is studied in [4]: for $R > 0$, consider the directions of lattice points (i.e. points of $\mathbf{Z}^2$) in the ball $B(0, R)$ of $\mathbf{R}^2$. These directions make a set of $N_R$ angles $0 \le \theta_0 < \theta_1 < \ldots < \theta_{N_R} = 2\pi$; the paper [4] computes the distribution of scaled differences of the form $N_R(\theta_{k+1} - \theta_k)$ as $R \to +\infty$. Remarkably, these differences are far from being exponentially distributed, unlike in the case of angles picked at random in $[0, 2\pi)$. Consistently with the results in [6] and in [12], this observation stresses again the difference between the case of a random distribution of scatterers as studied in [11] and that of a periodic distribution of scatterers.

7. Appendix: Background on Continued Fractions

We recall below some basic facts and notations about continued fractions. Given $\alpha \in (0, 1)$, the Gauss map is defined by

$$T\alpha = \frac{1}{\alpha} - \left[\frac{1}{\alpha}\right]; \tag{7.1}$$

it is known to be an ergodic transformation of $(0, 1)$ with invariant measure

$$dg(\alpha) = \frac{1}{\ln 2}\,\frac{d\alpha}{1 + \alpha}. \tag{7.2}$$

The continued fraction expansion of $\alpha$ is

$$\alpha = [a_1, a_2, a_3, \ldots] = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \ldots}}} \quad \text{with } a_k = \left[\frac{1}{T^{k-1}\alpha}\right],\ k \ge 1; \tag{7.3}$$

and the action of $T$ is seen to correspond to the shift

$$\alpha = [a_1, a_2, a_3, \ldots] \mapsto T\alpha = [a_2, a_3, a_4, \ldots]. \tag{7.4}$$

The convergents of $\alpha$ are defined by the recursion formulas

$$\begin{aligned} q_{n+1} &= a_nq_n + q_{n-1}, & q_0 &= 0,\ q_1 = 1,\\ p_{n+1} &= a_np_n + p_{n-1}, & p_0 &= 1,\ p_1 = 0, \end{aligned} \tag{7.5}$$

and the corresponding error $d_n = |q_n\alpha - p_n|$ satisfies

$$d_{n+1} = -a_nd_n + d_{n-1}, \quad d_0 = 1,\ d_1 = \alpha. \tag{7.6}$$

The first relation in (7.5) and (7.6) imply that

$$q_nd_{n+1} + q_{n+1}d_n = 1, \quad n \in \mathbf{N}^*. \tag{7.7}$$

For each $\alpha \in (0, 1) \setminus \mathbf{Q}$, the convergents of $\alpha$ are the sequence of best rational approximants of $\alpha$. In other words

$$|q_n\alpha - p_n| = \inf\{|q\alpha - p| \mid p, q \in \mathbf{Z},\ 0 < q \le q_n\}. \tag{7.8}$$

This implies in particular that the approximation by continued fractions is alternate in the sense that the algebraic value of the error is

$$q_n\alpha - p_n = (-1)^{n-1}d_n, \quad n \ge 0.$$

Also, for each $\alpha \in (0, 1) \setminus \mathbf{Q}$, the sequence of errors satisfies the inequalities

$$\frac{1}{q_n + q_{n+1}} < d_n < \frac{1}{q_{n+1}}. \tag{7.9}$$
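The recursions (7.5)-(7.6) and the identities (7.7) and (7.9) can be checked in exact arithmetic. The sketch below is illustrative and rests on assumptions made here: the names `gauss_map`, `convergent_data` and `orbit_product` are hypothetical, and a rational stand-in $\alpha = 14159/100000$ replaces an irrational number so that `fractions.Fraction` gives exact results. This is legitimate only for the first few terms, before the expansion of the rational number terminates.

```python
from fractions import Fraction
from math import floor

def gauss_map(alpha):
    """T(alpha) = 1/alpha - [1/alpha], as in (7.1)."""
    inv = 1 / alpha
    return inv - floor(inv)

def convergent_data(alpha, n_max):
    """Lists a[1..n_max] and p, q, d indexed 0..n_max+1, from (7.3), (7.5) and (7.6)."""
    a = [None]                      # a[0] is unused: the expansion starts at a_1
    p, q = [1, 0], [0, 1]           # p_0, p_1 and q_0, q_1
    d = [Fraction(1), alpha]        # d_0, d_1
    x = alpha                       # x runs through alpha, T(alpha), T^2(alpha), ...
    for n in range(1, n_max + 1):
        a_n = floor(1 / x)          # a_n = [1 / T^{n-1}(alpha)]
        a.append(a_n)
        p.append(a_n * p[n] + p[n - 1])
        q.append(a_n * q[n] + q[n - 1])
        d.append(d[n - 1] - a_n * d[n])
        x = gauss_map(x)
    return a, p, q, d

def orbit_product(alpha, n):
    """Product of T^k(alpha) for k = 0..n-1, the expression of d_n along the Gauss orbit."""
    prod, x = Fraction(1), alpha
    for _ in range(n):
        prod *= x
        x = gauss_map(x)
    return prod

alpha = Fraction(14159, 100000)     # rational stand-in, exact for the first few terms
a, p, q, d = convergent_data(alpha, 5)
```

With this sample value the partial quotients are 7, 15, 1, 25, 1, and the products returned by `orbit_product` coincide with the errors $d_n$ for $n \le 6$.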

The notations $a_n(\alpha)$, $p_n(\alpha)$, $q_n(\alpha)$ and $d_n(\alpha)$ are used to emphasize the dependence of these quantities on $\alpha$ whenever necessary. Our discussion uses "renormalization", i.e. transforming $\alpha$ by the iterates of the Gauss map $T$. We have gathered some useful facts in the next lemma.

Lemma 7.1. Let $\alpha \in (0, 1)$; then

• for each $n \in \mathbf{N}$, $q_n(T\alpha) = p_{n+1}(\alpha)$ and $d_{n+1}(\alpha) = \alpha\,d_n(T\alpha)$;
• for each $n \in \mathbf{N}$,
$$d_n(\alpha) = \prod_{k=0}^{n-1} T^k\alpha.$$

Proof. First observe that

$$a_n(T\alpha) = a_{n+1}(\alpha), \quad n \ge 1.$$

This relation and (7.5) imply that the sequences $q_n(T\alpha)$ and $p_{n+1}(\alpha)$ satisfy the same recursion formulae; thus in order to check the first formula, it suffices to check it for both $n = 1$ and $n = 2$ (in the latter case, one has $q_2(T\alpha) = [1/T\alpha] = a_2 = p_3(\alpha)$). The second formula is checked in the same way, and clearly implies the expression of $d_n(\alpha)$ in terms of the $T^k\alpha$'s.

Acknowledgement. We thank Prof. H. S. Dumas who told us that using ref. [2] might simplify our original method for deriving formulas (2.10)–(2.13). The research of E. C. has been partially supported by MIUR and by INDAM GNFM. Both authors acknowledge the support of the European Research Training Network HyKE (Contract no. HPRN-CT-2002-00282).

References

1. Arnold, V.I.: Mathematical methods of classical mechanics. 2nd edition, New York: Springer Verlag, 1989
2. Blank, S., Krikorian, N.: Thom's problem on irrational flows. Internat. J. of Math. 4, 721–726 (1993)
3. Boca, F., Gologan, R., Zaharescu, A.: The statistics of the trajectory in a certain billiard in a flat two-torus. Preprint, 2001
4. Boca, F., Cobeli, C., Zaharescu, A.: Distribution of lattice points visible from the origin. Commun. Math. Phys. 213, 433–570 (2000)
5. Boldrighini, C., Bunimovich, L., Sinai, Ya.G.: On the Boltzmann equation for the Lorentz gas. J. Statist. Phys. 32, 477–501 (1983)
6. Bourgain, J., Golse, F., Wennberg, B.: On the Distribution of Free Path Lengths for the Periodic Lorentz Gas. Commun. Math. Phys. 190, 491–508 (1998)
7. Bunimovich, L.: Billiards and other hyperbolic systems. In: Dynamical Systems, Ergodic Theory and Applications. Ya. G. Sinai, et al. (ed), Encyclopaedia Math. Sci. 100, 2nd ed., Berlin: Springer-Verlag, 2000, pp. 192–233
8. Caglioti, E., Golse, F.: Work in preparation
9. Chernov, N.: Entropy values and entropy bounds. In: Hard Ball Systems and the Lorentz Gas. D. Szász (ed), Encyclopaedia Math. Sci., 101, Berlin: Springer-Verlag, 2000, pp. 121–143
10. Cioranescu, D., Murat, F.: Un terme étrange venu d'ailleurs I & II. In: Nonlinear Partial Differential Equations and their Applications: Collège de France Seminar. I in Vol. II (Paris, 1979/1980), pp. 98–138, 389–390, II in Vol. III (Paris, 1980/1981), pp. 154–178, 425–426, Res. Notes in Math., 60 & 70, Boston, Mass.-London: Pitman, 1982
11. Gallavotti, G.: Rigorous theory of the Boltzmann equation in the Lorentz gas. Nota interna no. 358, Istituto di Fisica, Univ. di Roma (1972). Reprinted in "Statistical mechanics: A short treatise", Berlin-Heidelberg: Springer, 1999, pp. 48–55
12. Golse, F., Wennberg, B.: On the distribution of free path lengths for the periodic Lorentz gas II. M2AN Modél. Math. et Anal. Numér. 34(6), 1151–1163 (2000)
13. Lorentz, H.: Le mouvement des électrons dans les métaux. Arch. Néerl. 10, 336 (1905); reprinted in Collected papers. Vol. 3, The Hague: Martinus Nijhoff, 1936, pp. 180–214

Communicated by G. Gallavotti

Commun. Math. Phys. 236, 223–250 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0797-5

Communications in

Mathematical Physics

Reduction in Principal Bundles: Covariant Lagrange-Poincaré Equations

M. Castrillón López¹, T. S. Ratiu²

¹ Departamento de Geometría y Topología, Universidad Complutense de Madrid, 28040 Madrid, Spain. E-mail: [email protected]
² Institut de Mathématiques Bernoulli, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland. E-mail: [email protected]

Received: 18 September 2001 / Accepted: 2 December 2002 Published online: 21 February 2003 – © Springer-Verlag 2003

Abstract: A general reduction theory for field theoretical Lagrangians on principal fiber bundles is presented. The reduced variational problem as well as the associated equations are formulated. The link between the solutions of the original problem and that of the reduced problem is discussed and an obstruction to the reconstruction of the solutions is isolated. The important case of semidirect products is discussed in detail and several concrete examples are presented.

1. Introduction

A standard tool in classical mechanics is the reduction method. This process, present already in the work of the founders of classical mechanics who used it to great effect in several concrete examples, uses a group of symmetries acting on the phase space of the problem in order to eliminate variables. The modern formulation of this method in the Hamiltonian context goes back to Marsden and Weinstein and is intimately related to symplectic geometry. Considerably newer is its counterpart in the Lagrangian formulation of mechanics, although its roots go back to Poincaré [24] and Hamel [11]. The simplest example of this Lagrangian reduction is the so-called Euler-Poincaré reduction, where the phase space is a Lie group G which coincides with the group of symmetries of the problem (for an elementary presentation, see [15, Chapter 13]). This paradigmatic example leads to a general geometric description of Lagrangian reduction theory in [5]. The first attempt to formulate a theory of Lagrangian reduction for general variational problems on an arbitrary bundle $\pi_{ME} : E \to M$ over base manifolds M of dimension not necessarily equal to 1 (which is the case of classical mechanics), is given in [4]. This paper can be considered as the equivalent of Euler-Poincaré reduction for field theories. In this case, the bundle is a principal fiber bundle $\pi_{MP} : P \to M$ with structure group G, which is also the group of symmetries of the field theoretical Lagrangian.
The goal of the present paper is to present reduction theory for principal fiber bundles when the group of symmetries is a subgroup H of the structure group G. We do this


for several reasons. First, this case represents the next natural step from the work done in [4] towards a complete theory of Lagrangian reduction. Second, the reduction by subgroups H ⊂ G is closely related to homogeneous spaces, which is a very important case in many concrete physical and mathematical applications. This paper fulfills thus the program outlined in [4]. Finally, the study of Lagrangian reduction by stages and many examples of field theoretical Lagrangian reductions fit in this context. We present some of them at the end of the paper. The structure of the paper is as follows. Section 2 presents a brief account of some results of the theory of principal bundles, jet manifolds, and variational calculus, needed in the sequel. Section 3 studies the geometry of the bundle $(J^1P)/H$, which is the configuration bundle of the reduced problem. For this purpose, a fixed connection on the bundle P → P/H is used. Section 4 formulates and proves the reduced variational principle. It turns out that this reduced variational problem is no longer free. This is coherent with the well known fact that, in general, the reduced equations are not the Euler-Lagrange equations of the reduced Lagrangian (see for example [1]). We thus obtain a new kind of equations, which can be considered as the Lagrange-Poincaré equations (see [5]) for field theories on principal bundles. Section 5 presents the reconstruction process, that is, the way to obtain the solutions of the original problem from the solutions of the reduced problem. It is shown that not every solution of the latter gives a solution to the former. Some compatibility conditions must be imposed on the reduced solutions in order to guarantee the existence of the original solutions. These conditions are formulated geometrically in terms of the flatness of the given connection.
There is no equivalent to this obstruction in classical mechanics and it appears only for variational problems over manifolds whose dimension strictly exceeds 1. Section 6 links these results with those given in [4]. Section 7 presents Lagrangian semidirect product reduction for field theories. This formulation is useful when the symmetry group H can be enlarged to the total group G by adding new variables to the configuration bundle. This situation can be found in physics quite often; for classical mechanics it is given in [13]. Finally, Sect. 8 presents several concrete examples. The reduced equations for some of these problems can already be found in the literature, but were originally obtained ad hoc, case by case. The present theory unifies them and derives these equations from general principles.

2. Preliminaries

Throughout this paper, differentiability will mean $C^\infty$. The action of a Lie group or a Lie algebra on a manifold is denoted by concatenation of symbols or by a dot. Given the action of a Lie group L on a manifold M, $\{x\}_L$ will denote the orbit of this action through x ∈ M, thought of as a point in the orbit space M/L. For the expressions in local coordinates, the Einstein summation convention of repeated indices will be used. The space of smooth sections of a given smooth locally trivial fiber bundle E → M will be denoted by $\Gamma(E)$. If f : M → N is a smooth map between the manifolds M and N, its derivative, or tangent map, is denoted by Tf : TM → TN, where TM and TN denote the tangent bundles of M and N respectively. If $\pi_{ME} : E \to M$ and $\pi_{MF} : F \to M$ are two smooth locally trivial fiber bundles over the same base M, the fibered product $E \times_M F := \{(u, v) \in E \times F \mid \pi_{ME}(u) = \pi_{MF}(v)\}$ is also a smooth locally trivial fiber bundle over M whose fiber at x ∈ M equals the product $E_x \times F_x$ of the fibers of E and F.


2.1. The geometry of principal bundles

2.1.1. Gauge transformations. Given a (right) principal $G$-bundle $\pi_{MP} : P \to M$, a gauge transformation is a diffeomorphism $\Phi : P \to P$ satisfying $\pi_{MP} \circ \Phi = \pi_{MP}$ and $\Phi(pg) = \Phi(p)g$, for all $p \in P$ and $g \in G$. The set $\mathrm{Gau}P$ of all gauge transformations of the principal bundle $P$ is an infinite dimensional Lie group under composition. A vector field $X \in \mathfrak{X}(P)$ is said to be $G$-invariant if $(R_g)_* X = X$, for all $g \in G$, where $R_g : P \to P$ denotes the free right action of $G$ on $P$. The Lie algebra $\mathrm{gau}P$ of infinitesimal gauge transformations consists of the $\pi_{MP}$-vertical $G$-invariant vector fields on $P$; it is the Lie algebra of $\mathrm{Gau}P$. If $VP$ denotes the subbundle of $TP$ of $\pi_{MP}$-vertical vectors, the quotient $(VP)/G$ turns out to be a vector bundle over $M$ and $\Gamma((VP)/G)$ is naturally identified with $\mathrm{gau}P$.

2.1.2. Adjoint bundle. Let $\tilde{\mathfrak g} = (P \times \mathfrak g)/G \to M$ (also denoted by $\mathrm{ad}P$ in the literature) be the adjoint bundle of $P \to M$, that is, the associated bundle defined by the adjoint action of $G$ on its Lie algebra $\mathfrak g$. Recall that the $G$-action on $P \times \mathfrak g$ is given by $(p, B)g := (pg, \mathrm{Ad}_{g^{-1}} B)$. The elements of $\tilde{\mathfrak g}$ are denoted by $\{p, B\}_G$, for $p \in P$, $B \in \mathfrak g$. The Lie bracket $[\cdot, \cdot]$ on $\mathfrak g$ endows the fibers of $\tilde{\mathfrak g}$ with the structure of a Lie algebra defined by
$$[\{p, B\}_G, \{p, B'\}_G] := \{p, [B, B']\}_G, \qquad p \in P, \quad B, B' \in \mathfrak g,$$

which depends smoothly on the base point. Thus $\tilde{\mathfrak g}$ is a bundle of Lie algebras. Given an element $B$ of the Lie algebra $\mathfrak g$ of $G$, the infinitesimal generator of the $G$-action on $P$ is denoted by $B^* \in \mathfrak{X}(P)$, that is, $B^*_p := d(p \exp tB)/dt|_{t=0}$, for any $p \in P$. The diffeomorphism $P \times \mathfrak g \to VP$, $(p, B) \mapsto B^*_p$, induces an isomorphism between the Lie algebra $\Gamma(\tilde{\mathfrak g})$ of sections of the adjoint bundle $\tilde{\mathfrak g} \to M$ and the Lie algebra of infinitesimal gauge transformations $\mathrm{gau}P$, that is, we have $\Gamma(\tilde{\mathfrak g}) \cong \Gamma((VP)/G) = \mathrm{gau}P$. In what follows, we shall not distinguish between these two Lie algebras.

2.1.3. Connections on $P$. A connection on $P$ is, by definition, a smooth $G$-invariant vector subbundle $\mathcal H$ of $TP$ such that $\mathcal H \oplus VP = TP$. A connection is equivalently characterized by its connection 1-form $\omega^{\mathcal H}$, which is an equivariant $\mathfrak g$-valued 1-form defining the vertical part of a vector field by means of the identification $VP \cong P \times \mathfrak g$. That is, $\omega^{\mathcal H}$ satisfies $\omega^{\mathcal H}(pg) = \mathrm{Ad}_{g^{-1}} \circ \omega^{\mathcal H}(p)$ for all $p \in P$, $g \in G$, and $\omega^{\mathcal H}(p)(B^*_p) = B$, for all $B \in \mathfrak g$ and $p \in P$. The connection $\mathcal H$ defines the horizontal lift operator $(\cdot)^{\mathcal H} : T_x M \to T_p P$, $Y \mapsto Y^{\mathcal H}$, $x \in M$, $p \in \pi_{MP}^{-1}(x)$, by the requirement that $Y^{\mathcal H}$ is the only vector of $\mathcal H_p$ that projects to $Y$, that is, $T_p \pi_{MP}(Y^{\mathcal H}) = Y$. The space of connections of a principal bundle is an infinite dimensional affine space modeled over the vector space $\Omega^1(M, \tilde{\mathfrak g})$ of $\tilde{\mathfrak g}$-valued 1-forms on $M$.

2.2. The geometry of $P/H$. Let $\pi_{MP} : P \to M$ be a principal fiber bundle with structure group $G$ and let $H$ be a Lie subgroup of $G$. The quotient $\Sigma := P/H$ of $P$ under the right action of $H$ is a manifold and the projection $\pi_{\Sigma P} : P \to P/H$ is a principal $H$-bundle. The projection $\pi_{M\Sigma} : P/H \to M$, $\{p\}_H \mapsto \pi_{MP}(p)$, has the structure of a fiber bundle over $M$ with typical fiber $G/H$; the elements of $G/H$ are the equivalence classes $\{g\}_H = gH$. We hence have the following.

M. Castrillón López, T. S. Ratiu

Proposition 1 ([14, I. Prop. 5.5]). The bundle $P \times_G (G/H) = (P \times (G/H))/G$ associated to $P$ with respect to the natural left action of $G$ on $G/H$ can be identified with $P/H$. The identification is given by
$$(P \times (G/H))/G \to P/H, \qquad \{p, \{g\}_H\}_G \mapsto \{pg\}_H.$$

Throughout the paper, we will be concerned with the space of sections of the fiber bundle $P/H \to M$. These global sections enjoy an interesting property: they define structure group reductions of the bundle $P \to M$. More precisely, we say that the $G$-principal bundle $P \to M$ is $H$-reducible if there exists a subbundle $P^H$ of $P$ that is itself a principal bundle with structure group $H$ and base $M$.

Proposition 2 ([14, I. Prop. 5.6]). Let $H$ be a Lie subgroup of a Lie group $G$ and let $P \to M$ be a $G$-principal bundle. Then there is a bijective correspondence between $H$-reductions $P^H$ of $P \to M$ and global sections of the bundle $\pi_{M\Sigma} : P/H \to M$. The $H$-reduction associated to a section $\varsigma : M \to P/H$ is $P^{\varsigma} = \pi_{\Sigma P}^{-1}(\varsigma(M))$, where $\pi_{\Sigma P} : P \to P/H$ is the quotient projection; that is, this bundle coincides with the pull-back bundle $\varsigma^* P$ of the $H$-principal bundle $P \to P/H$ by $\varsigma$. The identification between $P^{\varsigma} = \pi_{\Sigma P}^{-1}(\varsigma(M))$ and $\varsigma^* P$ is given by $p \in \pi_{\Sigma P}^{-1}(\varsigma(M)) \leftrightarrow (p, \pi_{MP}(p)) \in \varsigma^* P$.
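A standard illustration of Proposition 2, not taken from this paper, is the reduction of a frame bundle to the orthogonal group. The sketch below assumes that $P = F(M)$ is the bundle of linear frames of an $n$-dimensional manifold $M$, with $G = GL(n,\mathbb{R})$ and $H = O(n)$; the symbols $F(M)$, $u$, $a$ and $g_u$ are introduced only for this example.

```latex
% Assumed setting (illustration only): P = F(M), G = GL(n,R), H = O(n).
% A frame u = (e_1,\dots,e_n) at x determines the inner product g_u on T_xM
% for which u is orthonormal:
\[
  g_u(e_i, e_j) = \delta_{ij}, \qquad 1 \le i,j \le n .
\]
% For a \in GL(n,R), the frame ua has vectors (ua)_j = \sum_i e_i\, a^i_{\ j}, so
\[
  g_u\big((ua)_j,(ua)_k\big) = \sum_i a^i_{\ j}\, a^i_{\ k} = (a^{\top}a)_{jk} .
\]
% Hence g_{ua} = g_u if and only if a^{\top}a = I, i.e. a \in O(n), and the map
% \{u\}_{O(n)} \mapsto g_u identifies F(M)/O(n) with the bundle of inner
% products on TM. Consequently
\[
  \Gamma\big(F(M)/O(n)\big) \;\longleftrightarrow\; \{\text{Riemannian metrics on } M\},
  \qquad
  P^{\varsigma} = \{\text{frames orthonormal for the metric } \varsigma\}.
\]
```

In this example a choice of section $\varsigma$ is a choice of Riemannian metric, and the associated $O(n)$-reduction is its orthonormal frame bundle.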

2.3. The space of jets. For the material in this subsection we refer the reader, for example, to [27]. We recall here only the notation and fix the conventions in force throughout the paper. Let $\pi_{ME} : E \to M$ be a smooth locally trivial fiber bundle. Two local sections $s, s'$ of $\pi_{ME}$ are said to have the same 1-jet at $x \in M$ if $s(x) = s'(x)$ and $T_x s = T_x s'$. The equivalence classes are denoted by $j^1_x s$ and the set of all equivalence classes at $x \in M$ is denoted by $J^1_x E$. The set $J^1 E := \bigcup_{x \in M} J^1_x E$ can be endowed with the structure of a smooth affine fiber bundle over $E$ with projection map $\pi_{10} : J^1 E \to E$ defined by $\pi_{10}(j^1_x s) = s(x)$. If $\dim M = n$ and $\dim E = n + m$, then $\dim J^1 E = n + m + nm$. Let $(x^i, y^\alpha)$, $1 \le i \le n$, $1 \le \alpha \le m$, be a fibered coordinate system on $U \subset E$. The induced natural chart $(x^i, y^\alpha, y^\alpha_i)$ on $\pi_{10}^{-1}(U) \subset J^1 E$ is defined by
$$y^\alpha_i(j^1_x s) := \left.\frac{\partial (y^\alpha \circ s)}{\partial x^i}\right|_x .$$
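To make the natural chart concrete, here is a minimal worked instance. It assumes the trivial bundle $E = \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ (so $n = m = 1$) and the sample sections $s(x) = (x, \sin x)$ and $s'(x) = (x, x)$; these choices are illustrative and do not appear in the text.

```latex
% Assumed setting: E = R x R -> R, coordinates (x, y), jet coordinates (x, y, y_1).
\[
  \dim J^1E = n + m + nm = 1 + 1 + 1 = 3, \qquad
  \big(x,\; y,\; y_1\big)(j^1_x s) = \big(x,\; \sin x,\; \cos x\big).
\]
% The section s'(x) = (x, x) has the same 1-jet as s at x_0 = 0, because
\[
  s'(0) = (0,0) = s(0), \qquad
  \left.\frac{d}{dx}\,x\,\right|_{x=0} = 1 = \cos 0 ,
\]
% while the 1-jets of s and s' differ at every x_0 \neq 0 with \sin x_0 \neq x_0.
```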

Remark 1. There is a different but equivalent description of the 1-jet space $J^1 E$. Given a class $j^1_x s \in J^1 E$, the tangent mapping $T_x s : T_x M \to T_{s(x)} E$ is well defined since $T_x s$ depends only on the first derivative of $s$ at $x$. Similarly, given a linear map $\lambda : T_x M \to T_p E$ such that $T_p \pi_{ME} \circ \lambda = \mathrm{Id}$, there is a unique $j^1_x s \in J^1 E$ such that $T_x s = \lambda$. Therefore, we can realize $J^1 E$ as the space of all linear maps $\lambda : T_x M \to T_p E$, $x \in M$, $p \in \pi_{ME}^{-1}(x)$, satisfying $T_p \pi_{ME} \circ \lambda = \mathrm{Id}$. We can go further and identify $j^1_x s$ with the range $T_x s(T_x M) \subset T_{s(x)} E$. In this way, $J^1 E$ can be thought of as the set of subspaces $H_p \subset T_p E$, $p \in E$, such that $H_p \oplus V_p E = T_p E$, where $V_p E$ is the subspace of vertical vectors at $p \in E$. This interpretation of 1-jets will be convenient in the sequel.

Given a smooth morphism of fiber bundles $\Phi : E \to E$ covering the diffeomorphism $\varphi : M \to M$, also called a fibered mapping over $\varphi$, the 1-jet extension $\Phi^{(1)} : J^1 E \to J^1 E$ of $\Phi$ is defined by $\Phi^{(1)}(j^1_x s) := j^1_{\varphi(x)}(\Phi \circ s \circ \varphi^{-1})$ ([27, Ch. 4]). This map obviously satisfies $\Phi \circ \pi_{10} = \pi_{10} \circ \Phi^{(1)}$. In particular, if $\Phi$ is vertical (that is, $\pi_{ME} \circ \Phi = \pi_{ME}$, or, equivalently, $\varphi$ is the identity on $M$), the definition of $\Phi^{(1)}$ is simply $\Phi^{(1)}(j^1_x s) = j^1_x(\Phi \circ s)$, $j^1_x s \in J^1 E$. Let $\mathfrak{X}^v(E) := \{X \in \mathfrak{X}(E) \mid T\pi_{ME} \circ X = 0\}$ denote the Lie algebra of vertical vector fields on $E$. Given a $\pi_{ME}$-vertical vector field $X \in \mathfrak{X}^v(E)$ with flow $\Phi_t$, define $X^{(1)} \in \mathfrak{X}(J^1 E)$ by the requirement that its flow be $\Phi_t^{(1)}$. The mapping $\mathfrak{X}^v(E) \to \mathfrak{X}(J^1 E)$, $X \mapsto X^{(1)}$, is a homomorphism of Lie algebras called the natural lift to $J^1 E$ or, for short, the 1-jet lift of vector fields. Given a section $s : M \to E$ of $\pi_{ME} : E \to M$, denote by $j^1 s : M \to J^1 E$ the mapping sending each $x \in M$ to $j^1_x s$.

2.4. Calculus of variations. Given a fiber bundle $\pi_{ME} : E \to M$, a first order Lagrangian density is a fibered morphism $\mathcal{L} : J^1 E \to \Lambda^n T^* M$ over the identity mapping on $M$. Assuming that $M$ is oriented by a volume form $v$ (i.e., $M$ is orientable), we can write $\mathcal{L}(j^1_x s) = L(j^1_x s) v_x$, $j^1_x s \in J^1 E$, where $L : J^1 E \to \mathbb{R}$ is called the Lagrangian associated to $\mathcal{L}$. Denote by $\Gamma_c(E)$ the set of compactly supported sections of the fiber bundle $\pi_{ME} : E \to M$. The action defined by a Lagrangian density is the mapping $S : \Gamma_c(E) \to \mathbb{R}$ sending a (local) section $s$ of $\pi_{ME} : E \to M$ with compact support in the open set $U \subset M$ to the value $\int_U \mathcal{L} \circ j^1 s = \int_U (L \circ j^1 s) v$. Given such a compactly supported section $s : U \to E$, a vertical variation of $s$ is a one parameter family $s_\varepsilon$ of sections of $\pi_{ME} : E \to M$, each $s_\varepsilon$ defined on $U$ and such that $s_0 = s$. The derivative
$$\delta s(x) := \left.\frac{d s_\varepsilon(x)}{d\varepsilon}\right|_{\varepsilon=0}$$
is called the infinitesimal variation of $s_\varepsilon$; $\delta s$ is a vector field covering the section $s$, vertical with respect to $\pi_{ME}$, that is, $\delta s : M \to TE$ satisfies $\delta s(x) \in T_{s(x)} E$ and $T\pi_{ME} \circ \delta s = 0$. Since the section $s$ has compact support in the open set $U$, we will always consider variations $s_\varepsilon$ such that $\delta s = 0$ on the boundary $\partial U$. If $M$ is compact, we can consider global sections and the requirement $\delta s = 0$ on $\partial U$ is no longer necessary. We say that $s : U \to E$ is a critical section of the variational problem defined by $\mathcal{L}$ if
$$\delta \int_U \mathcal{L} \circ j^1 s := \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_U \mathcal{L} \circ j^1 s_\varepsilon = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_U (L \circ j^1 s_\varepsilon) v = 0,$$
for every variation $s_\varepsilon$. If $X \in \mathfrak{X}^v(E)$ is a vertical vector field such that
$$\left.\frac{d s_\varepsilon(x)}{d\varepsilon}\right|_{\varepsilon=0} = X_{s(x)}, \quad \text{for all } x \in U,$$
we have (see for example [10])
$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_U (L \circ j^1 s_\varepsilon) v = \int_U \frac{\delta L}{\delta s}(X^{(1)})\, v,$$

where $\delta L/\delta s$ denotes the differential along $j^1 s$, that is,
$$\frac{\delta L}{\delta s}(Z) = (i_Z dL) \circ j^1 s,$$
for vector fields $Z \in \mathfrak{X}(J^1 E)$ vertical with respect to the projection $\pi_{ME} \circ \pi_{10}$. By $i_Z$ we denote the interior product (contraction on the first index) with $Z$. Hence, a section is critical if and only if
$$\int_U \frac{\delta L}{\delta s}(X^{(1)})\, v = 0,$$
for all $X \in \mathfrak{X}^v(E)$ with compact support.

Remark 2. As we have already noted, for first order variational calculus, the derivative $d\big(\int_U (L \circ j^1 s_\varepsilon) v\big)/d\varepsilon|_{\varepsilon=0}$ depends only on the infinitesimal variation $\delta s$ and not on the variation of $s$ itself; that is, two variations $s_\varepsilon$ and $s'_\varepsilon$ with $\delta s = \delta s'$ give the same value of the derivative. We will use this property very often, especially in the study of the variations of the reduced variational problem.

Given a section $s$, the vertical differential $\delta L/\delta s$ can be seen as a differential operator on vectors $X \in \mathfrak{X}^v(E)$. The adjoint differential operator $EL(L)$ is a section of $V^* E$ (the dual of the vertical subbundle) and is called the Euler-Lagrange operator defined by $L$ along the section $s$ (see for example [10]). The local expression of this operator is

$$EL(L) = \left( \frac{\partial L}{\partial y^\alpha} \circ j^1 s - \frac{\partial}{\partial x^i}\Big( \frac{\partial L}{\partial y^\alpha_i} \circ j^1 s \Big) \right) dy^\alpha .$$
The equation $EL(L) = 0$ is known as the Euler-Lagrange equation for $s$ and is equivalent to the fact that $s$ is an extremal.

Remark 3. In the setup of the calculus of variations presented above we have worked with compactly supported sections, but in many situations the classical solutions are not compactly supported. What really matters is the compactness of the support of the variations, which is the key property that allows the variational equations to be defined. For the sake of simplicity and in order to avoid the introduction of further notation, we have opted to work in this paper with sections with compact support and, even more, we will often assume that the manifold $M$ is itself compact, with or without boundary, and that the sections $s$ under consideration are global sections. These assumptions, while considerably simplifying the exposition, do not represent real restrictions on the theory and can be omitted, if desired, with minor notational modifications. Moreover, wherever these assumptions appear, the result in question is local, so the issue of compactness is not essential.

3. The Geometry of $(J^1 P)/H$

3.1. The identification $(J^1 P)/H = J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$. Given a principal $G$-bundle, the right action $R$ of $G$ on $P$ gives rise to a right action on $J^1 P$ in a natural way, namely
$$j^1_x s \cdot g := j^1_x(R_g \circ s), \qquad j^1_x s \in J^1 P, \quad g \in G.$$

We are concerned with the structure of the quotient set $(J^1 P)/H$. An element $\{j^1_x s\}_H \in (J^1 P)/H$ can be seen as an $H$-invariant distribution of vector subspaces complementary to the vertical bundle $VP$ (see Sect. 2.3 above). Indeed, given $j^1_x s \in J^1 P$, the class $\{j^1_x s\}_H$ represents a distribution along the $H$-orbit of $p = s(x)$, which, at the point $ph$, $h \in H$, is given by $(T_p R_h \circ T_x s)(T_x M) \subset T_{ph} P$. The space $(J^1 P)/H$ is a differentiable manifold and the canonical projection $J^1 P \to (J^1 P)/H$ is an $H$-principal bundle. The geometry of the space $(J^1 P)/H$ is better understood with the aid of a connection $\mathcal H$ on the $H$-principal bundle $\pi_{\Sigma P} : P \to P/H =: \Sigma$ and its associated connection one-form $\omega^{\mathcal H}$. Let $\tilde{\mathfrak h}$ be the adjoint bundle of $P \to P/H$, that is, $\tilde{\mathfrak h} = (P \times \mathfrak h)/H$. The one-form $\omega^{\mathcal H}$ defines a homomorphism $\tilde\omega^{\mathcal H} : TP \to \tilde{\mathfrak h}$ of vector bundles over $\Sigma = P/H$ given by
$$\tilde\omega^{\mathcal H}(X) := \{p, \omega^{\mathcal H}(X)\}_H, \quad \text{for all } X \in T_p P,\ p \in P. \tag{3.1}$$

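It is straightforward to check that $\tilde\omega^{\mathcal H} \circ TR_h = \tilde\omega^{\mathcal H}$, for all $h \in H$. We know that sections of $\tilde{\mathfrak h}$ correspond to $H$-invariant vertical vector fields (see Sect. 2.1.2). Similarly, an element $\{p, B\}_H \in \tilde{\mathfrak h}$ corresponds to an $H$-invariant vector field, vertical with respect to the projection $\pi_{\Sigma P} : P \to P/H$, along the fiber $\pi_{\Sigma P}^{-1}(\{p\}_H)$. Thus, $\tilde\omega^{\mathcal H}$, when applied to a vector $X \in T_p P$, gives the $H$-invariant $\pi_{\Sigma P}$-vertical vector field along the $H$-orbit $\mathrm{orb}_H(p)$ through $p$ whose value at $p$ equals the vertical part of $X$. Writing $\pi_{M\Sigma} : P/H \to M$ for the projection, we now define the fibered morphism over $P/H$
$$\phi_{\mathcal H} : (J^1 P)/H \to J^1(P/H) \times_{P/H} (\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h}),$$
given by $\phi_{\mathcal H}(\{j^1_x s\}_H) := (j^1_x \{s\}_H, \tilde\omega^{\mathcal H} \circ T_x s)$, where $x \in M$ and $\{s\}_H \in \Gamma(P/H)$ is the local section of $P/H \to M$ induced by the local section $s$ of $P \to M$.

Proposition 3. The fibered morphism over $\Sigma = P/H$,
$$\phi_{\mathcal H} : (J^1 P)/H \to J^1(P/H) \times_{P/H} (\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h}),$$
is a diffeomorphism and hence a fiber bundle isomorphism.

Proof. First we check that $\phi_{\mathcal H}$ is well defined. If $s' = R_h \circ s$, $h \in H$, is another representative of the class $\{j^1_x s\}_H$, then it follows that $\{s'\}_H = \{s\}_H$ and $\tilde\omega^{\mathcal H} \circ T s' = \tilde\omega^{\mathcal H} \circ TR_h \circ T s = \tilde\omega^{\mathcal H} \circ T s$. Clearly, $\phi_{\mathcal H}$ is differentiable. For the inverse, we define the morphism
$$\psi_{\mathcal H} : J^1(P/H) \times_{P/H} (\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h}) \to (J^1 P)/H,$$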

by $\psi_{\mathcal H}(j^1_x \{s\}_H, \xi) = \{j^1_x \hat s\}_H$, where $j^1_x \hat s \in J^1 P$ is determined by its point value $\hat s(x) = p \in \pi_{\Sigma P}^{-1}(\{s(x)\}_H)$ and by the value of its derivative $T\hat s : T_x M \to T_p P$,
$$T\hat s(X) = (T\{s\}_H(X))^{\mathcal H} + \xi(X)_p ;$$
here $(\cdot)^{\mathcal H} : T_{\{s(x)\}_H}(P/H) \to T_p P$ denotes the horizontal lift operator and $\xi(X) \in \tilde{\mathfrak h}$ is interpreted as an $H$-invariant vector field along $\pi_{\Sigma P}^{-1}(\{s(x)\}_H)$ evaluated at the point $p \in P$. The definition does not depend on the choice of the point $p \in \pi_{\Sigma P}^{-1}(\{s(x)\}_H)$. The differentiability of $\psi_{\mathcal H}$ as well as the identities $\phi_{\mathcal H} \circ \psi_{\mathcal H} = \mathrm{Id}$ and $\psi_{\mathcal H} \circ \phi_{\mathcal H} = \mathrm{Id}$ are easy to check.

Notation. In the sequel, we will make use of the identification defined by $\phi_{\mathcal H}$,
$$(J^1 P)/H \cong J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h}), \tag{3.2}$$
without an explicit reference to the connection $\mathcal H$. Moreover, for the sake of simplicity, we directly write $T^* M \otimes \tilde{\mathfrak h}$ instead of the bundle $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h}$, omitting the pull-back notation, since all the vector bundles and the tensor product are understood over $P/H$.

Remark 4. It is important to note that the identification given in Proposition 3 is not canonical and depends on the choice of the connection $\mathcal H$. Roughly speaking, the identification (3.2) splits every element $\{j^1_x s\}_H \in (J^1 P)/H$ into a horizontal and a vertical part ($j^1_x \{s\}_H$ and $\tilde\omega^{\mathcal H} \circ T s$, respectively) relative to the connection $\mathcal H$. Though the horizontal part is independent of $\mathcal H$, the vertical part does depend on $\mathcal H$, as it gives the vertical component of $T s$ with respect to $\mathcal H$.

3.2. The bundle $T^* M \otimes \tilde{\mathfrak h} \to M$.

We now study the composite bundle
$$T^* M \otimes \tilde{\mathfrak h} \xrightarrow{\ \pi_{\Sigma \tilde{\mathfrak h}}\ } P/H = \Sigma \xrightarrow{\ \pi_{M\Sigma}\ } M.$$
We will see later that the critical sections of the reduced variational problem are in fact sections of this bundle. A section $\sigma : M \to T^* M \otimes \tilde{\mathfrak h}$ of this composite bundle induces two geometrical objects over $M$:
• First, we obtain a section $\varsigma : M \to P/H$, $\varsigma(x) := \pi_{\Sigma \tilde{\mathfrak h}}(\sigma(x))$, and therefore, by virtue of Proposition 2, a submanifold $P^{\varsigma} \subset P$ which is an $H$-reduction of the bundle $P \to M$. This bundle is the pull-back bundle $\varsigma^* P$ of the bundle $P \to P/H$. Given the fixed connection $\mathcal H$ on $P \to P/H$ we consider its restriction to $P^{\varsigma}$, which is also a connection in the principal $H$-bundle $P^{\varsigma} \to M$.
• Second, the section $\sigma$ can be interpreted as a $\varsigma^* \tilde{\mathfrak h}$-valued 1-form on $M$. Since the adjoint bundle of $P^{\varsigma} \to M$ is $\varsigma^* \tilde{\mathfrak h}$, we can subtract this form from $\omega^{\mathcal H}$ and we thus obtain a new connection $\mathcal H_\sigma$ on $P^{\varsigma}$.

Remark 5. In fact, what we have described above is the configuration space used for the geometrical description of the process known as spontaneous symmetry breaking (see [9, Sect. 3.8; 26; 21] for a complete description of this construction). Roughly speaking, this quantum phenomenon is modeled as follows. We have an original $G$-principal bundle which models a particular physical system. We have a subgroup $H \subset G$. The symmetry is said to be "broken" when we choose an $H$-reduction $P^{\varsigma}$ of $P$ induced by a section $\varsigma : M \to P/H$. Frequently in gauge theories, the new configuration bundle $P^{\varsigma}$ is endowed with a potential, that is, a connection $\mathcal H_\sigma$ on it. Thus, it is precisely the sections of the composite fiber bundle $T^* M \otimes \tilde{\mathfrak h} \to M$ that describe all possible couplings of ($H$-reductions $P^{\varsigma}$) + (connections $\mathcal H_\sigma$ on $P^{\varsigma}$). This bundle turns out to be the ideal place to formulate variational models of this physical process as its configuration bundle.

4. The Reduction of the Variational Principle

4.1. Invariant Lagrangians. We now assume that the Lagrangian $L : J^1 P \to \mathbb{R}$ is invariant under the action of a subgroup $H \subset G$ on $J^1 P$. Then $L$ projects to the quotient and we obtain a function $l : (J^1 P)/H \to \mathbb{R}$ called the reduced Lagrangian. The variational problem defined by $L$ gives a new variational problem (with constraints, as we will see below) on the quotient $(J^1 P)/H$. Its description requires the identification $(J^1 P)/H = J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$, for a fixed connection $\mathcal H$ on $P \to P/H$; recall that we denote, for simplicity, $T^* M \otimes \tilde{\mathfrak h} := \pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h}$. We see now that the critical sections of the new problem are sections of the bundle $T^* M \otimes \tilde{\mathfrak h} \to M$. Given a section $s$ of $\pi_{MP} : P \to M$, we define the section $\sigma : M \to T^* M \otimes \tilde{\mathfrak h}$ by
$$\sigma(x) := \tilde\omega^{\mathcal H} \circ T_x s, \qquad x \in M,$$
which is called the reduced section. The section $s$ also defines a section $\varsigma = \{s\}_H$ of the bundle $P/H \to M$, which can also be obtained by projecting $\sigma$ to $P/H$ via $\pi_{\Sigma \tilde{\mathfrak h}} : T^* M \otimes \tilde{\mathfrak h} \to P/H$, that is, $\varsigma = \{s\}_H = \pi_{\Sigma \tilde{\mathfrak h}} \circ \sigma$. If we consider jet bundles, the section $s$ naturally induces the section $j^1 s$ of $J^1 P \to M$, which defines, by virtue of the identification (3.2), the section $(j^1 \varsigma, \sigma) : M \to J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$. The two components of a section of this kind are not independent, as $j^1 \{s\}_H = j^1 \varsigma$ can be obtained from $\sigma$. In fact, the section $\sigma$ of the bundle $T^* M \otimes \tilde{\mathfrak h} \to M$ completely determines the section $(j^1 \varsigma, \sigma) : M \to J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$. The space of these sections is where the action functional of the reduced Lagrangian $l$ is defined, namely
$$(j^1 \varsigma, \sigma) \mapsto \int_M l \circ (j^1 \varsigma, \sigma)\, v .$$

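For these reasons, it is clear that the study of the reduced problem takes place on the space of sections of the composite bundle $T^* M \otimes \tilde{\mathfrak h} \to M$. This bundle is the configuration bundle of the reduced problem. Let $s$ be a section of $\pi_{MP}$ and $s_\varepsilon$ be an arbitrary variation of $s$. The infinitesimal variation $\delta s$ of $s_\varepsilon$ determines a variation
$$\delta\sigma(x) = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \sigma_\varepsilon(x) = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (\tilde\omega^{\mathcal H} \circ T_x s_\varepsilon), \qquad x \in M,$$
of the projected section $\sigma$. The variation of the action defined by $L$ gives
$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M L(j^1 s_\varepsilon)\, v = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M l(j^1 \varsigma_\varepsilon, \sigma_\varepsilon)\, v . \tag{4.1}$$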

Therefore, a section $\sigma$ of $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h} \to M$ will be critical if and only if
$$\int_M i_{(\delta\varsigma^{(1)}, \delta\sigma)}\, dl \; v = 0,$$
for any variation $\delta\sigma$ of $\sigma$ of the type described in formula (4.1) and with $\delta\varsigma = T\pi_{\Sigma \tilde{\mathfrak h}}(\delta\sigma)$, where $\pi_{\Sigma \tilde{\mathfrak h}} : T^* M \otimes \tilde{\mathfrak h} \to P/H$ is the bundle projection. That is, the variational problem defined by $L$ becomes a variational problem on $T^* M \otimes \tilde{\mathfrak h}$ with constraints on the set of admissible variations (see also Remark 8 below).

Remark 6. The phase space of this new variational problem is a fiber product $J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$ of a jet space and a vector bundle. This is not the usual case in the calculus of variations, where the phase space is always the jet bundle of the configuration bundle. This new category of phase spaces has already been observed for the case of classical mechanics in [5]. We believe that this category represents a suitable framework for a future theory of reduction by stages in Field Theory, which would be of great interest.

For the reduction of the variational principle, we need to characterize the structure of the admissible variations $\delta\sigma$. For this purpose, we make use again of the fixed connection $\mathcal H$. Using this connection, we will be able to distinguish vertical and horizontal variations $\delta s$. We will give an explicit formula for $\delta\sigma$ in those cases. Finally, as an arbitrary variation $\delta s$ can always be written as the sum of a vertical and a horizontal variation, $\delta s = \delta s^v + \delta s^h$, the general expression of $\delta\sigma$ will be simply the sum $\delta\sigma^v + \delta\sigma^h$.

4.2. Vertical variations. Although an infinitesimal variation $\delta s$ of an arbitrary section $s : M \to P$ is by definition vertical with respect to $\pi_{MP}$, it need not be vertical with respect to the projection $\pi_{\Sigma P} : P \to P/H$. In fact, an infinitesimal variation $\delta s$ is vertical with respect to $\pi_{\Sigma P}$ if and only if there exists a mapping $B : M \to \mathfrak h$ such that $\delta s(x) = B(x)^*_{s(x)}$ (see Sect. 2.1.2 above). That is, $\delta s$ can be seen as the infinitesimal generator of the variation $s_\varepsilon(x) = s(x) \exp(\varepsilon B(x))$. In this section we study variations of this type. If $\varsigma : M \to P/H$ denotes the class $\{s\}_H$, we see that the variation $\varsigma_\varepsilon$ induced by the variation $s_\varepsilon$ described above does not depend on $\varepsilon$, that is, $\varsigma_\varepsilon = \varsigma$, for all $\varepsilon$. Using the identification (3.2) with respect to the fixed connection $\mathcal H$, i.e. $(J^1 P)/H \cong J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$, we conclude that $\{j^1 s_\varepsilon\}_H = (j^1 \varsigma_\varepsilon, \sigma_\varepsilon) = (j^1 \varsigma, \sigma_\varepsilon)$. Hence we can say that the infinitesimal variation along $(j^1 \varsigma, \sigma)$ is the zero vector field for the jet component $J^1(P/H)$ and a vector field on the vector bundle $T^* M \otimes \tilde{\mathfrak h}$ along $\sigma$ for the second component. As every section $\sigma_\varepsilon$ of the composite bundle $T^* M \otimes \tilde{\mathfrak h}$ projects to the same section $\varsigma$ of $P/H \to M$, the variation $\delta\sigma = (d/d\varepsilon)|_{\varepsilon=0}\, \sigma_\varepsilon$ is vertical with respect to the projection $T^* M \otimes \tilde{\mathfrak h} \to P/H$. As this last bundle is a vector bundle, we can thus interpret $\delta\sigma$ as a section of the composite bundle $T^* M \otimes \tilde{\mathfrak h} \to M$ which projects to the same section $\varsigma : M \to P/H$ as $\sigma$, that is,
$$\delta\sigma \in \Gamma(T^* M \otimes \tilde{\mathfrak h}). \tag{4.2}$$
With this idea, the structure of the variation $\delta\sigma$ can be described as follows.

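Proposition 4. Given a vertical variation $\delta s$ and the mapping $B : M \to \mathfrak h$ such that $\delta s(x) = B(x)^*_{s(x)}$, let $\eta$ be the section of the bundle $\tilde{\mathfrak h} \to M$ defined by
$$\eta(x) = \{s(x), B(x)\}_H, \qquad x \in M.$$
The section $\eta$ can be interpreted as a section of the vector bundle $\varsigma^* \tilde{\mathfrak h} \to M$. Then we have
$$\delta\sigma = \nabla^{\mathcal H} \eta - [\sigma, \eta], \tag{4.3}$$
where $\nabla^{\mathcal H}$ is the linear connection induced by $\mathcal H$ on $\tilde{\mathfrak h}$ (and on $\varsigma^* \tilde{\mathfrak h}$).

Proof. As the formula we want to prove is local and every principal bundle is locally trivial, it is enough to consider the special case $P = P/H \times H$. In this case the section $s$ can be written as $s(x) = (\varsigma(x), h(x))$, for a certain mapping $h : M \to H$. We have $T s = (T\varsigma, T h)$. We now take the variation $s_\varepsilon(x) = s(x) \exp(\varepsilon B(x)) = (\varsigma(x), h(x) \exp(\varepsilon B(x)))$. The chain rule yields
$$T s_\varepsilon = (T\varsigma, T(h \exp(\varepsilon B))) = (T\varsigma,\ TL_h \circ T\exp(\varepsilon B) + TR_{\exp(\varepsilon B)} \circ T h), \tag{4.4}$$
where $L$ and $R$ stand for the left and right translation on the group $H$ respectively. On the other hand, on a trivial bundle, the composition of $T s_\varepsilon$ with the connection form has the expression
$$\omega^{\mathcal H} \circ T s_\varepsilon = \mathrm{Ad}_{\exp(-\varepsilon B)}\, \omega^{\mathcal H} \circ T\varsigma + \hat\omega \circ T(h \exp(\varepsilon B)), \tag{4.5}$$
where $\hat\omega : T_{h \exp(\varepsilon B)} H \to \mathfrak h$, $\hat\omega = TL_{(h \exp(\varepsilon B))^{-1}}$, stands for the (left) Maurer-Cartan form. From (4.4) and (4.5) we obtain
$$\begin{aligned}
\omega^{\mathcal H} \circ T s_\varepsilon &= \mathrm{Ad}_{\exp(-\varepsilon B)}\, \omega^{\mathcal H} \circ T\varsigma + TL_{\exp(-\varepsilon B) h^{-1}} \circ \big( TL_h \circ T\exp(\varepsilon B) + TR_{\exp(\varepsilon B)} \circ T h \big) \\
&= \mathrm{Ad}_{\exp(-\varepsilon B)}\, \omega^{\mathcal H} \circ T\varsigma + TL_{\exp(-\varepsilon B)} \circ T\exp(\varepsilon B) + \mathrm{Ad}_{\exp(-\varepsilon B)} \circ TL_{h^{-1}} \circ T h .
\end{aligned}$$
Therefore, for $x \in M$ we get
$$\begin{aligned}
\sigma_\varepsilon(x) = \tilde\omega^{\mathcal H} \circ T_x s_\varepsilon &= \{s(x) \exp(\varepsilon B),\ \omega^{\mathcal H} \circ T_x s_\varepsilon\}_H \\
&= \{s(x),\ \mathrm{Ad}_{\exp(\varepsilon B)} \circ \omega^{\mathcal H} \circ T_x s_\varepsilon\}_H \\
&= \{s,\ \omega^{\mathcal H} \circ T\varsigma + \mathrm{Ad}_{\exp(\varepsilon B)}\, TL_{\exp(-\varepsilon B)} \circ T\exp(\varepsilon B) + TL_{h^{-1}} \circ T h\}_H .
\end{aligned}$$
The infinitesimal variation $\delta\sigma$ is the derivative of the previous formula with respect to $\varepsilon$. If we interpret this vector field as a new section of $T^* M \otimes \tilde{\mathfrak h}$ (see (4.2) above), we do not need to take the derivative of the first component of $\sigma_\varepsilon$ as an equivalence class $\{\ ,\ \}_H$. Thus we get
$$\begin{aligned}
\delta\sigma = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \sigma_\varepsilon &= \Big\{ s,\ \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \mathrm{Ad}_{\exp(\varepsilon B)} \circ TL_{\exp(-\varepsilon B)} \circ T\exp(\varepsilon B) \Big\}_H \\
&= \{s, dB\}_H = \{s,\ dB + [\omega^{\mathcal H} \circ T s, B] - [\omega^{\mathcal H} \circ T s, B]\}_H \\
&= \nabla^{\mathcal H} \eta - [\sigma, \eta],
\end{aligned}$$
since the covariant derivative of the linear connection on $\varsigma^* \tilde{\mathfrak h}$, when applied to the section $\eta = \{s, B\}_H$, equals $\nabla^{\mathcal H} \eta = \{s,\ dB + [\omega^{\mathcal H} \circ T s, B]\}_H$, as a direct verification (that uses the definition of the induced covariant derivative) shows.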
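The step $\frac{d}{d\varepsilon}\big|_{\varepsilon=0} \mathrm{Ad}_{\exp(\varepsilon B)} \circ TL_{\exp(-\varepsilon B)} \circ T\exp(\varepsilon B) = dB$ used at the end of the proof can be checked directly when $H$ is a matrix group. The expansion below is a sketch under that assumption, with $TL_a$ and $TR_a$ written as left and right matrix multiplication by $a$.

```latex
% Assumption: H is a matrix Lie group, so exp is the matrix exponential,
% TL_a Y = aY, and Ad_a Y = a Y a^{-1}. Here B = B(x) depends on x \in M and
% T\exp(\varepsilon B) denotes the differential in x of x \mapsto \exp(\varepsilon B(x)).
\begin{align*}
  \exp(\varepsilon B) &= I + \varepsilon B + O(\varepsilon^2), \\
  T\exp(\varepsilon B) &= \varepsilon\, dB + O(\varepsilon^2), \\
  \mathrm{Ad}_{\exp(\varepsilon B)} \circ TL_{\exp(-\varepsilon B)} \circ T\exp(\varepsilon B)
    &= \exp(\varepsilon B)\,\exp(-\varepsilon B)\,
       \big(\varepsilon\, dB + O(\varepsilon^2)\big)\,\exp(-\varepsilon B) \\
    &= \varepsilon\, dB + O(\varepsilon^2).
\end{align*}
% Differentiating at \varepsilon = 0 gives dB, as used in the proof.
```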

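Remark 7. Since $\eta \in \Gamma(\varsigma^* \tilde{\mathfrak h})$, its covariant differential as well as its Lie bracket with $\sigma \in \Gamma(T^* M \otimes \tilde{\mathfrak h})$ yield sections of $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h}$. Then formula (4.3) makes sense as an equality in the set of sections of this bundle, if we understand $\delta\sigma$ as in (4.2).

4.3. Horizontal variations. We now consider horizontal variations with respect to $\mathcal H$, that is, infinitesimal variations $\delta s$ along a given section $s$ such that $\omega^{\mathcal H}(\delta s) = 0$. In this case, if we take a variation $s_\varepsilon$ defining $\delta s$, the sections $\varsigma_\varepsilon = \{s_\varepsilon\}_H$ of $P/H \to M$ are not constant and define a vector field $\delta\varsigma$ along $\varsigma$. The variation $\delta\sigma$ along the section $\sigma = \tilde\omega^{\mathcal H} \circ T s$ of the composite bundle $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h} \to M$ projects to $\delta\varsigma$ on $P/H$. The connection $\mathcal H$ on $P \to P/H$ defines a connection on its associated bundle $\tilde{\mathfrak h} \to P/H$ and, with the aid of the trivial connection on $T^* M \to P/H$, it induces a connection on the tensor product $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h}$. Then $\delta\sigma$ can be decomposed into its vertical and horizontal parts, $\delta\sigma = (\delta\sigma)^v + (\delta\sigma)^h$, with respect to this connection. As $\delta\sigma$ projects to $\delta\varsigma$, it is clear that $(\delta\sigma)^h$ is the horizontal lift $\mathrm{Hor}(\delta\varsigma)$ of $\delta\varsigma$. The vector field $(\delta\sigma)^v = \delta\sigma - \mathrm{Hor}(\delta\varsigma)$ is vertical in the vector bundle $T^* M \otimes \tilde{\mathfrak h} \to P/H$ and can thus be understood as a section of $T^* M \otimes \tilde{\mathfrak h} \to M$, as we have done in Sect. 4.2 for formula (4.2).

Proposition 5. With the notation given above, we have
$$(\delta\sigma)^v = \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma),$$
where $\tilde\Omega^{\mathcal H}$ is the curvature of $\mathcal H$, seen as a $\tilde{\mathfrak h}$-valued 2-form on $P/H$.

Proof. The definition of the vertical part of $\delta\sigma$ gives
$$(\delta\sigma)^v = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \sigma_\varepsilon - \mathrm{Hor}(\delta\{s\}_H). \tag{4.6}$$
The horizontal lift of the vector $\delta\{s\}_H = \frac{d}{d\varepsilon}\big|_{\varepsilon=0} \{s_\varepsilon\}_H$ to $\tilde{\mathfrak h}$ is the image via the projection $P \times \mathfrak h \to (P \times \mathfrak h)/H = \tilde{\mathfrak h}$ of its horizontal lift to $P$ with respect to $\mathcal H$, which is $\delta s = d s_\varepsilon/d\varepsilon|_{\varepsilon=0}$, since the variation is horizontal. Then the horizontal lift at all the points of the form $\{s(x), \omega^{\mathcal H} \circ T_x s\}_H$, $x \in M$, is
$$\mathrm{Hor}(\delta\{s\}_H) = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \{s_\varepsilon,\ \omega^{\mathcal H} \circ T s\}_H . \tag{4.7}$$
As the formula we want to prove is local and every principal bundle is locally trivial, it is enough to consider the special case $P = P/H \times H$. For such a trivial bundle, every section $s$ can be written as $s(x) = (\varsigma(x), h(x))$ for a certain mapping $h : M \to H$. The adjoint bundle $\tilde{\mathfrak h} \to P/H$ is then identified with $P/H \times \mathfrak h$ by putting $\{(\varsigma, h), B\}_H \mapsto (\varsigma, \mathrm{Ad}_h B)$.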

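Using these identifications, from (4.6) and (4.7), we obtain
$$\begin{aligned}
(\delta\sigma)^v &= \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \{s_\varepsilon,\ \omega^{\mathcal H} \circ T s_\varepsilon\}_H - \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \{s_\varepsilon,\ \omega^{\mathcal H} \circ T s\}_H \\
&= \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (\varsigma_\varepsilon,\ \mathrm{Ad}_{h_\varepsilon} \circ \omega^{\mathcal H} \circ T s_\varepsilon) - \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (\varsigma_\varepsilon,\ \mathrm{Ad}_{h_\varepsilon} \circ \omega^{\mathcal H} \circ T s) \\
&= \Big( \varsigma,\ \mathrm{Ad}_h \circ \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \omega^{\mathcal H} \circ T s_\varepsilon \Big),
\end{aligned}$$
where for the last step we make use of the chain rule. Then
$$(\delta\sigma)^v = \Big( \varsigma,\ \mathrm{Ad}_h \circ \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \omega^{\mathcal H} \circ T s_\varepsilon \Big) = \Big\{ s,\ \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \omega^{\mathcal H} \circ T s_\varepsilon \Big\}_H = \{s,\ d\omega^{\mathcal H}(\delta s, T s)\}_H ,$$
as $\delta s = d s_\varepsilon/d\varepsilon|_{\varepsilon=0}$. From the Cartan formula $\Omega^{\mathcal H} = d\omega^{\mathcal H} + [\omega^{\mathcal H}, \omega^{\mathcal H}]$, as $\delta s$ is horizontal, we have $d\omega^{\mathcal H}(\delta s, T s) = \Omega^{\mathcal H}(\delta s, T s)$. The proof is completed by taking into account the definition of $\tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma)$ (cf. [14, p. 76]).

Corollary 1. Let $s_\varepsilon$ be an arbitrary variation of a section $s$ of $\pi_{MP}$, with infinitesimal variation $\delta s$. Then the infinitesimal projected variation on $(J^1 P)/H = J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$ is
$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (j^1 \varsigma_\varepsilon, \sigma_\varepsilon) = \Big( \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} j^1 \varsigma_\varepsilon,\ \nabla^{\mathcal H} \eta - [\sigma, \eta] + \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma) \Big). \tag{4.8}$$

Proof. The connection $\mathcal H$ decomposes the infinitesimal variation $\delta s$ into its vertical and horizontal components: $\delta s = \delta^v s + \delta^h s$. From Proposition 4, for the first component we obtain $\delta\varsigma = 0$ and $\delta\sigma = \nabla^{\mathcal H} \eta - [\sigma, \eta]$. On $J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$ the infinitesimal variation will have the same expression $(0, \nabla^{\mathcal H} \eta - [\sigma, \eta])$. Now, for the horizontal component, Proposition 5 gives $\delta\sigma = \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma) + \mathrm{Hor}(\delta\varsigma)$. Then, on the fiber product $J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$, the infinitesimal variation will be $\big( d/d\varepsilon|_{\varepsilon=0}\, j^1 \varsigma_\varepsilon,\ \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma) \big)$, as both vector fields $\mathrm{Hor}(\delta\varsigma)$ and $d/d\varepsilon|_{\varepsilon=0}\, j^1 \varsigma_\varepsilon$ project to $\delta\varsigma$. Adding up the horizontal and the vertical contributions, we obtain formula (4.8).

Remark 8. Corollary 1 clearly shows that the reduced variational principle defined by $L$ on the quotient $J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h})$ is not free. From the expression (4.8), we note that not every possible variation $\delta\sigma$ comes from a variation $\delta s$, i.e., the variations of the reduced variational problem have constraints. For example, for $H = G = \mathbb{R}$, $P = M \times G$, and $\mathcal H$ the trivial connection, we have $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak g} \cong T^* M$, i.e., $\sigma$ and $\delta\sigma$ can be seen as forms on $M$. Formula (4.8) simply reads $\delta\sigma = d\eta$, for an arbitrary $\eta \in C^\infty(M)$, that is, only those variations which are exact forms are admissible, which represents a nontrivial topological constraint.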
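The topological nature of the constraint in Remark 8 is visible in the simplest base that is not simply connected. The following sketch assumes $M = S^1$ with angular coordinate $\theta$, an illustrative choice that is not made in the text.

```latex
% Assumed setting: M = S^1, H = G = \mathbb{R}, P = M \times G, trivial connection.
% Admissible variations are \delta\sigma = d\eta with \eta \in C^\infty(S^1), hence
\[
  \oint_{S^1} \delta\sigma = \oint_{S^1} d\eta = 0 .
\]
% The closed 1-form d\theta, on the other hand, satisfies
\[
  \oint_{S^1} d\theta = 2\pi \neq 0 ,
\]
% so \delta\sigma = d\theta is a variation of \sigma that is not induced by
% any variation \delta s of a section s.
```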

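4.4. Notations

4.4.1. The operator $\delta l/\delta\sigma$. Let $l : J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h}) \to \mathbb{R}$ be a smooth function and let $\sigma : M \to \pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h}$ be a section of the composite fiber bundle. We define $\delta l/\delta\sigma$ as the vertical derivative of $l$ along $\sigma$ on the component $T^* M \otimes \tilde{\mathfrak h}$. More precisely, given a tangent vector $Y$ at $\sigma(x)$, vertical with respect to the projection $T^* M \otimes \tilde{\mathfrak h} \to P/H$, we define
$$\Big\langle \frac{\delta l}{\delta\sigma}, Y \Big\rangle_x := \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} l(j^1_x \varsigma,\ \sigma(x) + \varepsilon Y).$$
Since $T^* M \otimes \tilde{\mathfrak h} \to P/H$ is a vector bundle, $Y$ can be thought of as an element of the fiber over $\varsigma(x) \in P/H$. Then $\delta l/\delta\sigma$ can be interpreted as a section of the dual composite fiber bundle $(T^* M \otimes \tilde{\mathfrak h})^* = TM \otimes \tilde{\mathfrak h}^* \to P/H \to M$ which projects on the same section $\varsigma : M \to P/H$ as $\sigma$.

4.4.2. The coadjoint operator. The fiberwise bracket operation $\mathrm{ad} : \tilde{\mathfrak h} \oplus_{P/H} \tilde{\mathfrak h} \to \tilde{\mathfrak h}$ defined on the adjoint bundle $\tilde{\mathfrak h}$ induces in a natural way a morphism (the coadjoint morphism)
$$\mathrm{ad}^* : \tilde{\mathfrak h} \oplus_{P/H} \tilde{\mathfrak h}^* \to \tilde{\mathfrak h}^*, \qquad (\alpha, \mu) \mapsto \mathrm{ad}^*_\alpha \mu,$$
by $\langle \mathrm{ad}^*_\alpha \mu, \beta \rangle := \langle \mu, \mathrm{ad}_\alpha \beta \rangle$, for $\alpha, \beta \in \tilde{\mathfrak h}_y$, $\mu \in \tilde{\mathfrak h}^*_y$, $y \in P/H$, where $\langle\ ,\ \rangle$ denotes the natural pairing between $\tilde{\mathfrak h}$ and $\tilde{\mathfrak h}^*$. This operator can be generalized in a natural way to a morphism
$$\mathrm{ad}^* : (T^* M \otimes \tilde{\mathfrak h}) \oplus_{P/H} (TM \otimes \tilde{\mathfrak h}^*) \to \tilde{\mathfrak h}^*, \tag{4.9}$$
which, by abuse of notation, we shall also call $\mathrm{ad}^*$. Fiberwise, this morphism is defined as follows. Let $y \in P/H$ and let $\sigma \in (T^* M \otimes \tilde{\mathfrak h})_y$ be of the form $\sigma = \omega \otimes \eta$, with $\omega \in T^*_x M$, $\pi_{M\Sigma}(y) = x$, $\eta \in \tilde{\mathfrak h}_y$. For $\mathcal{X} \in (TM \otimes \tilde{\mathfrak h}^*)_y$ of the form $\mathcal{X} = X \otimes \theta$, $X \in T_x M$, $\theta \in \tilde{\mathfrak h}^*_y$, we define
$$\mathrm{ad}^*_\sigma \mathcal{X} := (i_X \omega)\, \mathrm{ad}^*_\eta \theta,$$
that is, the usual coadjoint operator combined with the natural pairing between $TM$ and $T^* M$. This definition can be generalized to arbitrary elements of $\pi_{M\Sigma}^* T^* M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^* M \otimes \tilde{\mathfrak h}$ and its dual $TM \otimes \tilde{\mathfrak h}^*$ by linearity.

4.4.3. The $\mathrm{div}^{\mathcal H}$ operator. Let $\nabla^{*\mathcal H} : \Omega^s(P/H, \tilde{\mathfrak h}^*) \to \Omega^{s+1}(P/H, \tilde{\mathfrak h}^*)$, $s \in \mathbb{N}$, be the covariant differential defined by the connection $\mathcal H$ on the space of $\tilde{\mathfrak h}^*$-valued forms on $P/H$. We consider the volume form $v$ on $M$ and its pull-back $\pi_{M\Sigma}^* v \in \Omega^n(P/H)$, which will also be denoted by $v$, for the sake of simplicity. Now let $\mathcal{X}$ be a section of the bundle $TM \otimes \tilde{\mathfrak h}^* \to M$ and $i_{\mathcal{X}} v \in \Omega^{n-1}(M, \tilde{\mathfrak h}^*) \subset \Omega^{n-1}(P/H, \tilde{\mathfrak h}^*)$ its contraction with $v$. It is clear that there exists a unique section $\mathrm{div}^{\mathcal H} \mathcal{X}$ of the bundle $\tilde{\mathfrak h}^* \to M$ such that $\nabla^{*\mathcal H} i_{\mathcal{X}} v = v \otimes \mathrm{div}^{\mathcal H} \mathcal{X}$. In this way, we can define an operator
$$\mathrm{div}^{\mathcal H} : \Gamma(TM \otimes \tilde{\mathfrak h}^*) \to \Gamma(\tilde{\mathfrak h}^*),$$
which generalizes the ordinary divergence operator $\mathrm{div} : \Gamma(TM) \to \mathbb{R}$ for vector fields to vector bundle valued vector fields. It is easy to check that for any section $\eta$ of $\tilde{\mathfrak h}$ the following formula holds:
$$\mathrm{div}\, \langle \mathcal{X}, \eta \rangle = \langle \mathrm{div}^{\mathcal H} \mathcal{X}, \eta \rangle + \langle \mathcal{X}, \nabla^{\mathcal H} \eta \rangle . \tag{4.10}$$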

We refer the reader to [4] for a proof of this property and for a local expression of $\mathrm{div}^{\mathcal H}$.

4.4.4. The partial Euler-Lagrange operator. Given a function $l : J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h}) \to \mathbb{R}$ and a section $\sigma : M \to T^* M \otimes \tilde{\mathfrak h}$ of the composite fiber bundle, we define the restricted function
$$l_\varsigma : J^1(P/H) \to \mathbb{R}, \qquad l_\varsigma(j^1_x \varsigma) := l(j^1_x \varsigma, \sigma(x)).$$
Then, by definition, the partial Euler-Lagrange operator $EL_\varsigma(l)$ at a section $\sigma$ is the standard Euler-Lagrange operator of the restricted function $l_\varsigma$ along the section $j^1 \varsigma$.
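As a sanity check on formula (4.10), it reduces to the familiar Leibniz rule for the divergence in the simplest case. The sketch below assumes $M = \mathbb{R}^n$ with $v = dx^1 \wedge \dots \wedge dx^n$, a trivial adjoint bundle with fiber $\mathbb{R}$ and the trivial connection, so that the bundle-valued vector field is an ordinary vector field $X$ and $\eta$ is a function $f$; these simplifications are assumptions made only for illustration.

```latex
% Assumed setting: M = \mathbb{R}^n, v = dx^1 \wedge \dots \wedge dx^n,
% adjoint bundle trivial with fiber \mathbb{R}, covariant derivative equal to d.
% Then the pairing of X with f is the vector field fX, and
\[
  \operatorname{div}(fX)
  = \frac{\partial (f X^i)}{\partial x^i}
  = f\,\frac{\partial X^i}{\partial x^i} + X^i\,\frac{\partial f}{\partial x^i}
  = \langle \operatorname{div} X, f\rangle + \langle X, df\rangle ,
\]
% which is (4.10) with the generalized divergence equal to the ordinary one
% and the covariant differential of \eta = f equal to df.
```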

4.5. Reducing the variational problem. The goal of this section is to show how the variational problem defined by $L$ projects to $(J^1 P)/H$ and what properties a reduced section must satisfy in order to be the projection of a critical section of the original problem. With the notation introduced so far, this result can be stated as follows.

Theorem 1 (Lagrange-Poincaré reduction). Let $\pi_{MP} : P \to M$ be a $G$-principal bundle over a compact manifold $M$ with volume form $v$. Let $L : J^1 P \to \mathbb{R}$ be a Lagrangian which is invariant under the action of a subgroup $H \subset G$. We fix a principal connection $\mathcal H$ (with connection one-form $\omega^{\mathcal H}$) on the bundle $P \to P/H$. Let $l : J^1(P/H) \times_{P/H} (T^* M \otimes \tilde{\mathfrak h}) \to \mathbb{R}$ be the mapping defined by $L$ on the quotient by means of the identification (3.2). If $s$ is a section of $\pi_{MP}$, we define the section $\sigma$ of the composite bundle $T^* M \otimes \tilde{\mathfrak h} \to M$ by $\sigma(x) = \{s(x), \omega^{\mathcal H} \circ T_x s\}_H$ and the section $\varsigma$ of the bundle $P/H \to M$ by $\varsigma(x) = \{s(x)\}_H$ for $x \in M$. Then the following are equivalent:
(1) the variational principle
$$\delta \int_M L(j^1 s)\, v = 0$$
holds for arbitrary variations $\delta s$;
(2) $s$ satisfies the Euler-Lagrange equations for $Lv$;
(3) the variational principle
$$\delta \int_M l(j^1 \varsigma, \sigma)\, v = 0$$
holds for variations of the form $\delta\sigma = \nabla^{\mathcal H} \eta - [\sigma, \eta] + \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma)$, where $\delta\varsigma$ is an arbitrary variation of $\varsigma$ and $\eta$ is an arbitrary section of $\varsigma^* \tilde{\mathfrak h}$;


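(4) $\sigma$ satisfies the Lagrange-Poincaré equations

$$\mathcal{EL}_\varsigma(l) = \left\langle \frac{\delta l}{\delta\sigma}, i_{T\varsigma}\tilde\Omega^{\mathcal H} \right\rangle, \qquad \mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma} + \mathrm{ad}^*_\sigma \frac{\delta l}{\delta\sigma} = 0, \tag{4.11}$$

where $\langle\,\cdot\,,\cdot\,\rangle$ represents the natural pairing between $\pi_M^* T^*M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^*M \otimes \tilde{\mathfrak h}$ and its dual $TM \otimes \tilde{\mathfrak h}^*$.

Note that $i_{T\varsigma}\tilde\Omega^{\mathcal H}$ is a section of $T^*(P/H) \otimes (T^*M \otimes \tilde{\mathfrak h})$. After the pairing with $\delta l/\delta\sigma$ we have a 1-form on $P/H$. As $\mathcal{EL}_\varsigma(l)$ is a vertical 1-form on $P/H$, that is, a 1-form for vertical vectors with respect to the projection $\pi_M : P/H \to M$, one has to consider the restriction of $\langle \delta l/\delta\sigma, i_{T\varsigma}\tilde\Omega^{\mathcal H}\rangle$ to vertical vectors in order to have an identity of vertical 1-forms in the first equation of (4.11).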

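Proof. The equivalence (1)⇔(2) is the classical result of the calculus of variations. For (1)⇔(3), we know that, given a variation $s_\varepsilon$,

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M L(j^1 s_\varepsilon)\, v = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M l(j^1\varsigma_\varepsilon, \sigma_\varepsilon)\, v.$$

By virtue of Corollary 1, we have

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (j^1\varsigma_\varepsilon, \sigma_\varepsilon) = \left( \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} j^1\varsigma_\varepsilon,\ \nabla^{\mathcal H}\eta - [\sigma, \eta] + \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma) \right).$$

Then $\delta\int_M L(j^1 s)\,v = 0$ if and only if $\delta\int_M l(j^1\varsigma, \sigma)\,v = 0$ with respect to this type of variations. Finally, for (3)⇔(4), we write

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M l(j^1\varsigma_\varepsilon, \sigma_\varepsilon)\, v = \int_M \left( \frac{\delta l}{\delta\varsigma}\left( \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} j^1\varsigma_\varepsilon \right) + \frac{\delta l}{\delta\sigma}(\delta\sigma) \right) v$$

$$= \int_M \left( \left\langle \frac{\delta l}{\delta\sigma}, \nabla^{\mathcal H}\eta - [\sigma, \eta] + \tilde\Omega^{\mathcal H}(\delta\varsigma, T\varsigma) \right\rangle + \langle \mathcal{EL}_\varsigma(l), \delta\varsigma \rangle \right) v,$$

where $\delta l/\delta\varsigma$ yields its adjoint operator, that is, the partial Euler-Lagrange operator $\mathcal{EL}_\varsigma(l)$, by integration by parts. Then, from formula (4.10), we have

$$\int_M \left\langle \frac{\delta l}{\delta\sigma}, \nabla^{\mathcal H}\eta \right\rangle v = \int_M \mathrm{div}\left\langle \frac{\delta l}{\delta\sigma}, \eta \right\rangle v - \int_M \left\langle \mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma}, \eta \right\rangle v = -\int_M \left\langle \mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma}, \eta \right\rangle v,$$

because $\int_M \mathrm{div}\langle \delta l/\delta\sigma, \eta\rangle\, v = 0$ by Stokes' Theorem. We thus obtain

$$\delta\int_M l(j^1\varsigma, \sigma)\, v = \int_M \left\langle -\mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma} - \mathrm{ad}^*_\sigma\frac{\delta l}{\delta\sigma}, \eta \right\rangle v + \int_M \left\langle \mathcal{EL}_\varsigma(l) - \left\langle \frac{\delta l}{\delta\sigma}, i_{T\varsigma}\tilde\Omega^{\mathcal H} \right\rangle, \delta\varsigma \right\rangle v,$$

and then, from the fundamental lemma of the calculus of variations, $\eta$ and $\delta\varsigma$ being arbitrary, we conclude that $\delta\int_M l(j^1\varsigma, \sigma)\,v = 0$ holds if and only if the Lagrange-Poincaré equations are satisfied.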
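The appeal to Stokes' Theorem in the proof can be spelled out as follows. This is a sketch under the assumption that $M$ is compact without boundary (the theorem only says compact); $Y$ is a name introduced here for the vector field $\langle \delta l/\delta\sigma, \eta\rangle$ on $M$:

```latex
% Y := \langle \delta l/\delta\sigma, \eta \rangle \in \Gamma(TM)   (notation introduced for this sketch)
\begin{align*}
(\operatorname{div} Y)\, v &= \mathcal{L}_Y v = d(i_Y v) + i_Y\, dv = d(i_Y v)
  && \text{($dv = 0$ because $v$ has top degree)}, \\
\int_M (\operatorname{div} Y)\, v &= \int_M d(i_Y v) = \int_{\partial M} i_Y v = 0
  && \text{(Stokes, with $\partial M = \emptyset$)}.
\end{align*}
```

If $M$ had a boundary, the same computation would leave a boundary term, which vanishes for variations with compact support in the interior.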


5. Compatibility Conditions and Reconstruction

Given a critical section $s : M \to P$ of the variational problem defined by $Lv$, we obtain a reduced section $\sigma : M \to T^*M \otimes \tilde{\mathfrak h}$ which is a solution of the Lagrange-Poincaré equations for $l$; recall that we use the notation $\pi_M^* T^*M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^*M \otimes \tilde{\mathfrak h}$. In Sect. 3.2 we showed that $\sigma$ defines two geometrical objects: an $H$-reduction $P^\varsigma = \pi_P^{-1}(\varsigma(M))$ of the bundle $P$ and a connection $\mathcal H^\sigma$ on it. In fact, the relationship between $P^\varsigma$ and $s$ is clear: the bundle $P^\varsigma$ is the $H$-orbit of the range $s(M)$, that is, $P^\varsigma = \{R_h(s(x)) \mid x \in M,\ h \in H\} \subset P$. The bundle $P^\varsigma$ is endowed with the natural flat connection defined by the foliation $\{R_h(s(M))\}_{h \in H}$, i.e. the flat connection whose integral leaves are the sets $\{R_h(s(x)) \mid x \in M\}$, $h \in H$, that is, copies of the range of $s$ for every group element $h \in H$. We now check that this connection is nothing but the connection $\mathcal H^\sigma$. By definition (see Sect. 3.2), the connection one-form of $\mathcal H^\sigma$ at a point $s(x) \in P^\varsigma$ is

$$\omega := \omega^{\mathcal H} - \omega^{\mathcal H} \circ Ts \circ T\pi_{MP} = \omega^{\mathcal H} \circ (\mathrm{Id} - T(s \circ \pi_{MP})),$$

which is exactly the one-form defined by the decomposition $T_{s(x)}P^\varsigma = T_x s(T_x M) \oplus \mathfrak h^*_{s(x)}$ induced by the above mentioned foliation on $P^\varsigma$; recall that $\mathfrak h^*_p := \{B^*_p \mid B \in \mathfrak h\}$. Therefore, $\mathcal H^\sigma$ is a flat connection. This imposes a necessary condition on the section $\sigma$ of $T^*M \otimes \tilde{\mathfrak h}$ to be the projection of a section $s$ of $P \to M$, namely, the curvature of $\mathcal H^\sigma$ must vanish. Thus, an arbitrary solution of the Lagrange-Poincaré equations is not always of the form $\sigma(x) = \{s(x), \omega^{\mathcal H} \circ T_x s\}_H$. The condition $\mathrm{Curv}(\mathcal H^\sigma) = 0$ is also sufficient, at least locally. Therefore it becomes a compatibility condition which must be imposed to reconstruct a solution $s$ from the solution $\sigma$. More precisely:

Theorem 1. A section $s : M \to P$ that is a solution of the Euler-Lagrange equations for $L$ defines a section $\sigma : M \to \pi_M^* T^*M \otimes_{P/H} \tilde{\mathfrak h} \equiv T^*M \otimes \tilde{\mathfrak h}$ that is a solution of the Lagrange-Poincaré equations such that $\mathcal H^\sigma$ is a flat connection.
Conversely, given a solution $\sigma$ of the Lagrange-Poincaré equations such that $\mathcal H^\sigma$ is flat and has trivial holonomy, the family of solutions $s_h(x) = s(x)h$, $s : M \to P$, $h \in H$, of the original problem defined by $L$ is obtained as follows: construct the $H$-reduced bundle $P^\varsigma$ defined by the section $\varsigma : M \to P/H$; the integral manifolds of the horizontal bundle of $\mathcal H^\sigma$ are then the images $s_h(M)$, $h \in H$, of the desired family. Roughly speaking, we have locally the equivalence

$$\left\{\text{Solutions } s \text{ of the Euler-Lagrange equations}\right\} \iff \left\{\text{Solutions } \sigma \text{ of the Lagrange-Poincaré equations with } \mathcal H^\sigma \text{ a flat connection}\right\}.$$

Proof. Given a critical section $s$ of the problem defined by $L$, we have already seen that the section $\sigma$ solves the Lagrange-Poincaré equations and $\mathcal H^\sigma$ is a flat connection. Conversely, given $\sigma$ such that $\mathrm{Curv}(\mathcal H^\sigma) = 0$ and with trivial holonomy, the integral leaves $s$ are sections of $P^\varsigma$. We have $\sigma = \{s(x), \omega^{\mathcal H} \circ T_x s\}_H$ and, by virtue of Theorem 1, $s$ is a critical section of $L$.

Remark 9. If $M$ is not simply connected, the integral leaves of a flat connection are not, in general, global sections of the bundle $\pi_{MP} : P \to M$. There are flat connections with non-trivial holonomy (cf. [14, Ch. II, Sect. 9]). Nevertheless, locally the holonomy of every flat connection is always trivial, that is, for every $x \in M$ there is a neighborhood $U$ such that the integral leaves of the flat connection on $\pi_{MP}^{-1}(U)$ are sections of $\pi_{MP}$ on $U$.


Remark 10. It is not difficult to check that the curvature of $\mathcal H^\sigma$ can be expressed in terms of the form $\omega^{\mathcal H}$ and $\sigma$ in the following way:

$$\mathrm{Curv}(\mathcal H^\sigma) = \Omega^{\mathcal H} - \nabla^{\mathcal H}\sigma + [\sigma, \sigma],$$

where $\Omega^{\mathcal H}$ is the curvature of $\mathcal H$ and $\nabla^{\mathcal H}$ stands for the covariant derivative defined by $\mathcal H$. Then the geometrical condition of flatness of $\mathcal H^\sigma$ can be replaced by the more analytical condition $\Omega^{\mathcal H} - \nabla^{\mathcal H}\sigma + [\sigma, \sigma] = 0$ along the bundle $P^\varsigma$.

Remark 11 (The case of Classical Mechanics). If $\dim M = 1$, every connection is flat and the second condition on the right-hand side of the equivalence in Theorem 1 is automatically satisfied. In this case we have the direct equivalence: Euler-Lagrange ⇐⇒ Lagrange-Poincaré.

6. Special Cases

The size of the subgroup $H$ of $G$ of symmetries of the Lagrangian $L$ determines the structure of the Lagrange-Poincaré equations. More precisely: the first group of equations in (4.11) represents an Euler-Lagrange operator on the bundle $P/H \to M$, whose fiber dimension is $\dim G - \dim H$; the second group of equations in (4.11) is an operator on the bundle $\varsigma^*\tilde{\mathfrak h} \to M$, whose fiber dimension is $\dim H$. Roughly speaking, the bigger the dimension of $H$, the bigger the second group of equations and the smaller the first group of equations in (4.11).

6.1. Euler-Poincaré reduction. In particular, if $H = G$, the first group of equations does not appear because $P/H = M$ and the projection $P/H \to M$ is the identity. This particular case of the Lagrange-Poincaré equations has been studied in [4]; the resulting equations are usually called Euler-Poincaré equations. The geometry of this reduction is very interesting. The quotient $C = (J^1P)/G$ is in fact the bundle of connections $\pi_{MC} : C \to M$ of the principal bundle, that is, the bundle over $M$ whose sections $\sigma : M \to C$ represent connections on $P \to M$ (see, for example, [3, 9, Sect. 2.7]). The bundle $C \to M$ is an affine bundle modeled over the vector bundle $T^*M \otimes \tilde{\mathfrak g}$ and the fibration $\pi_{CJ} : J^1P \to C$ is a $G$-principal bundle.
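The affine structure of the bundle of connections can be made concrete as follows. This sketch uses only the statement above that $C$ is modeled on $T^*M \otimes \tilde{\mathfrak g}$; $\sigma_1$, $\sigma_2$ are generic names introduced here, and $\sigma^{\mathcal H}$ matches the decomposition used in the Euler-Poincaré equations:

```latex
% \sigma_1, \sigma_2 : M \to C are two connections on P \to M (generic names),
% and \mathcal{H} is a fixed reference connection, also viewed as a section of C.
\begin{align*}
\sigma_1 - \sigma_2 &\in \Gamma(T^*M \otimes \tilde{\mathfrak g}), \\
\sigma &= \mathcal H + \sigma^{\mathcal H}, \qquad
\sigma^{\mathcal H} := \sigma - \mathcal H \in \Gamma(T^*M \otimes \tilde{\mathfrak g}).
\end{align*}
% A choice of \mathcal{H} therefore identifies sections of C with sections
% of the vector bundle T^*M \otimes \tilde{\mathfrak g}.
```

This is why the reduced equations can be written for the vector-bundle-valued field $\sigma^{\mathcal H}$ once a reference connection is fixed.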
Then the bundle of connections is the configuration bundle of the reduced problem and the Euler-Poincaré equations are equations on connections. More precisely, Theorem 1 reads in this case as follows:

Theorem 2 (Euler-Poincaré reduction). Let $\pi : P \to M$ be a principal $G$-fiber bundle over a manifold $M$ with a volume form $v$ and let $L : J^1P \to \mathbb R$ be a $G$-invariant Lagrangian. Let $l : C \to \mathbb R$ be the mapping defined by $L$ on the quotient. For a section $s : M \to P$ of $\pi$, let $\sigma : M \to C$ be defined by $\sigma(x) = \pi_{CJ}(j^1_x s)$. Then, for every connection $\mathcal H$ of the bundle $\pi$, the following are equivalent:

(1) $s$ satisfies the Euler-Lagrange equations for $L$,

(2) the variational principle
$$\delta \int_M L(j^1_x s)\, dx = 0$$
holds, for variations with compact support,

(3) the Euler-Poincaré equations hold:
$$\mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma} + \mathrm{ad}^*_{\sigma^{\mathcal H}}\frac{\delta l}{\delta\sigma} = 0,$$
where $\sigma^{\mathcal H} \in \Gamma(T^*M \otimes \tilde{\mathfrak g})$ is such that $\sigma = \mathcal H + \sigma^{\mathcal H}$,

(4) the variational principle
$$\delta \int_M l(\sigma(x))\, dx = 0$$
holds, using variations of the form $\delta\sigma = \nabla^{\mathcal H}\eta - [\sigma^{\mathcal H}, \eta]$, where $\eta : M \to \tilde{\mathfrak g}$ is an arbitrary section.

For reconstruction, the compatibility condition in this case is simply $\mathrm{Curv}(\sigma) = 0$.

6.2. The discrete case. The other special case is when $H = \{e\}$ or $H$ is a discrete subgroup of $G$. Then $\tilde{\mathfrak h} = P/H$, the projection $\pi_{\tilde{\mathfrak h}} : \tilde{\mathfrak h} \to P/H$ is the identity and every connection of the principal bundle $P \to P/H$ is flat. The second group of Lagrange-Poincaré equations (4.11) does not appear, while the first group reads $\mathcal{EL}(l) = 0$, that is, we obtain the Euler-Lagrange equations for $l$. This result is clear because the bundles $P \to M$ and $P/H \to M$ are in this case locally diffeomorphic and the Euler-Lagrange operator is a local object. The reduction by a discrete group does not change the structure of the equations but only the configuration bundle.

7. Semidirect Product Reduction

Sometimes, when the symmetry group $H$ of a Lagrangian on $J^1P$ fails to be the entire group $G$, we can introduce a new variable $a \in N$, $N$ being a manifold on which the group $G$ acts, and a new function $L : J^1P \times N \to \mathbb R$ such that $L$ is now $G$-invariant and the original Lagrangian is recovered as $j^1_x s \mapsto L(j^1_x s, a_0)$ for a fixed $a_0 \in N$. Roughly speaking, we increase the size of the group of symmetries at the cost of a new parameter $a_0$. In classical mechanics, when $N$ is a vector space $V$ and the action is linear, this approach is known as the semidirect product reduction theory; see [13] for the Lagrangian case and [25, 16, 17] for the Hamiltonian case. We now extend the Lagrangian method to the field theoretical setting. Let $G$ be a Lie group acting (on the left) on a vector space $V$ and let $L : J^1P \times V \to \mathbb R$ be a function invariant under the right action $(j^1_x s, a_0) \cdot g = ((j^1_x s)g, g^{-1}a_0)$ of $G$ on $J^1P \times V$. For every $a_0 \in V$, we define the Lagrangian $L_{a_0} : J^1P \to \mathbb R$ by $L_{a_0}(j^1_x s) := L(j^1_x s, a_0)$, that is, $L$ can be understood as a family of Lagrangians on $\pi_{MP} : P \to M$ parameterized by $a_0 \in V$.
Even though L is G-invariant, each Lagrangian La0 is only invariant under the isotropy subgroup Ga0 ⊂ G of a0 ∈ V . In this section we are going to study the reduction of the variational problems defined by the Lagrangians La0 , a0 ∈ V , by means of the reduction of the variational problem defined by L on the product J 1 P × V .
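The invariance of each $L_{a_0}$ under the isotropy subgroup follows in one line from the $G$-invariance of $L$ and the right action on $J^1P \times V$ given above. A sketch, where $h$ denotes a generic element of $G_{a_0}$ and $g$ a generic element of $G$:

```latex
% h \in G_{a_0} = \{ g \in G : g a_0 = a_0 \}, hence h^{-1} a_0 = a_0.
\begin{align*}
L_{a_0}\big((j^1_x s)\,h\big)
  &= L\big((j^1_x s)\,h,\ a_0\big)
   = L\big((j^1_x s)\,h,\ h^{-1} a_0\big) \\
  &= L\big((j^1_x s, a_0)\cdot h\big)
   = L(j^1_x s, a_0)
   = L_{a_0}(j^1_x s).
\end{align*}
% For a general g \in G the same steps, applied to a_0 = g^{-1}(g a_0), give
% L_{a_0}((j^1_x s)\,g) = L_{g a_0}(j^1_x s),
% which in general differs from L_{a_0}(j^1_x s).
```

The last comment shows how the full group $G$ moves between members of the family $\{L_{a_0}\}$ rather than preserving a single one.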


7.1. The geometry of $\mathbf V = (J^1P \times V)/G$. If $L : J^1P \times V \to \mathbb R$ is $G$-invariant, it defines a function $l : (J^1P \times V)/G \to \mathbb R$ on the quotient. It is clear that the configuration bundle of the reduced variational problem defined by $l$ is the bundle $(J^1P \times V)/G \to M$, so we proceed to describe the main geometrical properties of this space. As we have said in Sect. 6.1, the projection $\pi_{CJ} : J^1P \to C = (J^1P)/G$ is a principal $G$-bundle over $C$. Therefore, $\mathbf V = (J^1P \times V)/G$ is an associated vector bundle over $C$, which we are going to denote $\pi_{C\mathbf V} : \mathbf V \to C$. If $\pi_{MC} : C \to M$ denotes the bundle of connections, it is known (cf. [3, 8, 9, Sect. 2.7]) that $J^1P$ is isomorphic to the pull-back bundle $\pi_{MC}^* P = C \times_M P$ and, as a consequence, $\mathbf V$ is isomorphic to $\pi_{MC}^* \mathcal V = C \times_M \mathcal V$, where $\mathcal V$ is the vector bundle $\pi_{M\mathcal V} : \mathcal V = (P \times V)/G \to M$ associated to $\pi_{MP} : P \to M$ by the representation of $G$ on $V$. We can summarize these relations by means of the following commutative diagram:

$$\begin{array}{ccc}
\pi_{MC}^* P = J^1P & \xrightarrow{\ \pi_{10}\ } & P \\
\downarrow{\scriptstyle \pi_{CJ}} & & \downarrow{\scriptstyle \pi_{MP}} \\
C & \xrightarrow{\ \pi_{MC}\ } & M \\
\uparrow{\scriptstyle \pi_{C\mathbf V}} & & \uparrow{\scriptstyle \pi_{M\mathcal V}} \\
\pi_{MC}^* \mathcal V = \mathbf V & \xrightarrow{\ \pi_{\mathcal V\mathbf V}\ } & \mathcal V
\end{array}$$

The composite bundle $\mathbf V \to C \to M$ is the configuration bundle of the reduced variational problem defined by $L$. Each section $\lambda : M \to \mathbf V$ of this bundle gives two objects. Firstly, we obtain a section

$$a := \pi_{\mathcal V\mathbf V} \circ \lambda \tag{7.1}$$

of the vector bundle $\mathcal V \to M$ and, secondly, a section

$$\sigma := \pi_{C\mathbf V} \circ \lambda \tag{7.2}$$

of the bundle of connections $C \to M$, which represents a connection $\mathcal H^\sigma$ on the principal bundle $P \to M$. If we identify $\mathbf V$ with $\pi_{MC}^* \mathcal V = C \times_M \mathcal V$, we can write

$$\lambda(x) = (\sigma(x), a(x)), \quad \text{for all } x \in M. \tag{7.3}$$

7.2. The reduction of the problem defined by $L$. We now fix the value of the parameter $a_0$ for the study of the variational principle of the Lagrangian $L_{a_0}$. Let $s$ be a section of the bundle $\pi_{MP}$ and let $\lambda$ be the section of $(J^1P \times V)/G = \mathbf V \to M$ defined as $\lambda(x) = \{j^1_x s, a_0\}_G$, $x \in M$. From formulas (7.1) and (7.2), we obtain the sections $a$ and $\sigma$ of the bundles $\mathcal V \to M$ and $C \to M$ respectively. It is clear that these sections can also be defined by

$$a(x) = \{s(x), a_0\}_G, \qquad \sigma(x) = \pi_{CJ}(j^1_x s).$$

If we identify $\mathbf V$ with $\pi_{MC}^* \mathcal V = C \times_M \mathcal V$ (see Sect. 7.1 above), from formula (7.3), we have

$$\lambda(x) = (\pi_{CJ}(j^1_x s), a(x)) = (\sigma(x), a(x)), \quad x \in M. \tag{7.4}$$
(7.4)


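Given an infinitesimal variation $\delta s = d/d\varepsilon|_{\varepsilon=0}\, s_\varepsilon$ of the section $s$, taking the derivative with respect to $\varepsilon$ in (7.4), we obtain an infinitesimal variation

$$\delta\lambda = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \lambda_\varepsilon = \left( \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \sigma_\varepsilon,\ \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} a_\varepsilon \right) = (\delta\sigma, \delta a), \tag{7.5}$$

along the section $\lambda$. Since the vector field $\delta s$ is, by definition, vertical with respect to $\pi_{MP}$, it follows that $\delta\sigma$ and $\delta a$ are vertical with respect to $\pi_{MC}$ and $\pi_{M\mathcal V}$ respectively, where $\mathcal V = (P \times V)/G$. Because $\mathcal V \to M$ is a vector bundle, we can think of $\delta a$ as a new section of this bundle. Similarly, since $C \to M$ is an affine bundle modeled over the vector bundle $T^*M \otimes \tilde{\mathfrak g} \to M$, we can think of $\delta\sigma$ as a section of $T^*M \otimes \tilde{\mathfrak g} \to M$. We now give the structure of the infinitesimal variation $\delta\lambda$. We know that, given $\delta s$ along $s$, there exists a mapping $B : M \to \mathfrak g$ such that $\delta s = B^*$, that is, $\delta s$ is the infinitesimal generator of the variation

$$s_\varepsilon(x) = s(x)\exp(\varepsilon B(x)). \tag{7.6}$$

Let $\eta$ be the section of $\tilde{\mathfrak g} \to M$ given by $\eta(x) = \{s(x), B(x)\}_G$.

Proposition 6. Let $\mathcal H$ be an arbitrary connection on the bundle $P \to M$. With a variation $s_\varepsilon$ as in formula (7.6) and with the notation given above, we have that $\delta\sigma = \nabla^{\mathcal H}\eta - [\sigma^{\mathcal H}, \eta]$, where $\nabla^{\mathcal H}$ stands for the covariant derivative on the bundle $\tilde{\mathfrak g} \to M$ and $\sigma^{\mathcal H} \in \Gamma(T^*M \otimes \tilde{\mathfrak g})$ is such that $\sigma = \mathcal H + \sigma^{\mathcal H}$. For the variation of $a$ we have that $\delta a = \eta a$, where the fiberwise action of the bundle $\tilde{\mathfrak g} \to M$ on $\mathcal V \to M$ is induced by the infinitesimal action of $\mathfrak g$ on $V$, that is,

$$\eta a = \{s(x), B(x)\}_G\, \{s(x), a_0\}_G = \{s(x), B(x)a_0\}_G. \tag{7.7}$$

Proof. The expression for $\delta\sigma$ is part of Theorem 2 (Euler-Poincaré reduction). See [4] for the proof of this result. For the expression of $\delta a$, the proof is trivial by taking the derivative of $a_\varepsilon = \{s_\varepsilon, a_0\}_G = \{s(x), \exp(\varepsilon B)a_0\}_G$ with respect to $\varepsilon$ and comparing with (7.7).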
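The computation behind $\delta a = \eta a$ can be written out as follows. It uses only the identification $\{pg, a\}_G = \{p, ga\}_G$ of classes in $(P \times V)/G$, which is assumed here to come from the right action $(p, a)\cdot g = (pg, g^{-1}a)$, by analogy with the action on $J^1P \times V$ given in the text:

```latex
% Classes are taken in (P \times V)/G with (p g, a) \sim (p, g a).
\begin{align*}
a_\varepsilon(x)
  &= \{\, s(x)\exp(\varepsilon B(x)),\ a_0 \,\}_G
   = \{\, s(x),\ \exp(\varepsilon B(x))\,a_0 \,\}_G, \\
\delta a(x)
  &= \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} a_\varepsilon(x)
   = \{\, s(x),\ B(x)\,a_0 \,\}_G
   = \eta(x)\,a(x).
\end{align*}
```

The second line differentiates only the $V$-component, since the representative $s(x)$ in the first slot no longer depends on $\varepsilon$.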

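Notation. Given a section $\lambda = (\sigma, a)$ of $\mathbf V = (J^1P \times V)/G \to M$, we define the vertical derivatives of $l$ with respect to $\sigma$ and $a$ by

$$\left\langle \frac{\delta l}{\delta\sigma}, Y \right\rangle := \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} l(\sigma(x) + \varepsilon Y, a(x)), \qquad \left\langle \frac{\delta l}{\delta a}, Z \right\rangle := \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} l(\sigma(x), a(x) + \varepsilon Z),$$

for every $Y \in (T^*M \otimes \tilde{\mathfrak g})_x$, $Z \in \mathcal V_x$, where $\mathcal V = (P \times V)/G$. Thus $\delta l/\delta\sigma$ and $\delta l/\delta a$ are sections of the dual bundles $TM \otimes \tilde{\mathfrak g}^*$ and $\mathcal V^*$ respectively. Given an element $a_0 \in V$, let $\rho_{a_0} : \mathfrak g \to V$ denote the linear map $\xi \mapsto \xi a_0$ defined by the infinitesimal Lie algebra representation induced by the action of $G$ on $V$ and let $\rho^*_{a_0} : V^* \to \mathfrak g^*$ be the dual morphism. For notational convenience, we shall write

$$\rho^*_{a_0}(k_0) = a_0 \diamond k_0 \in \mathfrak g^*, \qquad k_0 \in V^*.$$

This notation can be extended to the bundles $\mathcal V^*$ and $\tilde{\mathfrak g}^*$. Given a section $a \in \Gamma(\mathcal V)$, we obtain a bundle morphism

$$\rho_a : \tilde{\mathfrak g} \to \mathcal V, \qquad \rho_a(\eta) = \eta a(x), \qquad \eta \in \tilde{\mathfrak g}_x, \quad x \in M,$$

defined in formula (7.7), and the dual morphism $\rho^*_a : \mathcal V^* \to \tilde{\mathfrak g}^*$. We shall write

$$\rho^*_a(k) = a \diamond k \in \tilde{\mathfrak g}^*_x, \quad \text{for all } k \in \mathcal V^*_x, \quad x \in M.$$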

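Theorem 3 (Reduction for semidirect products). The following are equivalent:

(1) the variational principle
$$\delta \int_M L_{a_0}(j^1_x s)\, v = 0$$
holds for arbitrary variations $\delta s$ of $s$,

(2) $s$ satisfies the Euler-Lagrange equations for $L_{a_0}$,

(3) the constrained variational principle
$$\delta \int_M l(\sigma, a)\, v = 0$$
holds on $\mathbf V = C \times_M \mathcal V$ (with $\mathcal V = (P \times V)/G$ the associated vector bundle over $M$), using variations of the form
$$\delta\sigma = \nabla^{\mathcal H}\eta - [\sigma^{\mathcal H}, \eta], \qquad \delta a = \eta a,$$
where $\eta$ is an arbitrary section of the bundle $\tilde{\mathfrak g} \to M$,

(4) the following equation holds on $\mathbf V = C \times_M \mathcal V$:
$$\mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma} + \mathrm{ad}^*_{\sigma^{\mathcal H}}\frac{\delta l}{\delta\sigma} = a \diamond \frac{\delta l}{\delta a}, \tag{7.8}$$
where $a \diamond k := \rho^*_a(k)$ denotes the dual morphism introduced above.

Proof. The equivalence (1)⇔(2) is standard. For (1)⇔(3), we know that, given a variation $s_\varepsilon$,

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M L_{a_0}(j^1 s_\varepsilon)\, v = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M L(j^1 s_\varepsilon, a_0)\, v = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M l(\sigma_\varepsilon, a_\varepsilon)\, v.$$

Proposition 6 implies that

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} (\sigma_\varepsilon, a_\varepsilon) = (\nabla^{\mathcal H}\eta - [\sigma^{\mathcal H}, \eta],\ \eta a).$$

Then $\delta\int_M L_{a_0}(j^1 s)\,v = 0$ if and only if $\delta\int_M l(\sigma, a)\,v = 0$ with respect to this type of variations. Finally, for (3)⇔(4), we write

$$\left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_M l(\sigma_\varepsilon, a_\varepsilon)\, v = \int_M \left( \left\langle \frac{\delta l}{\delta\sigma}, \delta\sigma \right\rangle + \left\langle \frac{\delta l}{\delta a}, \delta a \right\rangle \right) v = \int_M \left( \left\langle \frac{\delta l}{\delta\sigma}, \nabla^{\mathcal H}\eta - [\sigma^{\mathcal H}, \eta] \right\rangle + \left\langle \frac{\delta l}{\delta a}, \eta a \right\rangle \right) v.$$

Then, from formula (4.10), we have

$$\int_M \left\langle \frac{\delta l}{\delta\sigma}, \nabla^{\mathcal H}\eta \right\rangle v = \int_M \mathrm{div}\left\langle \frac{\delta l}{\delta\sigma}, \eta \right\rangle v - \int_M \left\langle \mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma}, \eta \right\rangle v = -\int_M \left\langle \mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma}, \eta \right\rangle v,$$

since $\int_M \mathrm{div}\langle \delta l/\delta\sigma, \eta\rangle\, v = 0$ by Stokes' Theorem. We thus obtain

$$\delta \int_M l(\sigma, a)\, v = \int_M \left\langle -\mathrm{div}^{\mathcal H}\frac{\delta l}{\delta\sigma} - \mathrm{ad}^*_{\sigma^{\mathcal H}}\frac{\delta l}{\delta\sigma} + a \diamond \frac{\delta l}{\delta a},\ \eta \right\rangle v,$$

and then, from the fundamental lemma of the calculus of variations, $\eta$ being arbitrary, we conclude that $\delta\int_M l(\sigma, a)\,v = 0$ if and only if (7.8) is true.

Remark 12. For $M = \mathbb R$, $P = M \times G$, we recover the classical version of the Lagrangian semidirect product reduction theory. In this case, the analogue of Theorem 3 can be found in [13, Theorem 3.3]. We remark that in [13], due to other considerations, the authors deal with $G$-invariant Lagrangians defined on $TG \times V^*$, that is, the space of the parameter $a_0$ is the dual of a vector space. For this reason, there are slight differences which must be taken into account if one wants to compare the formulation of the classical case in Theorem 3 with the statement of [13, Theorem 3.3]. For example, since the action of $\mathfrak g$ on $V^*$ is taken to be minus the dual map of the $\mathfrak g$-action on $V$ (the contragredient representation), the variation of $a$ turns out to be $\delta a = -a\eta$. Moreover, for functorial convenience, we write $a \diamond \delta l/\delta a$ in the Euler-Poincaré equations (7.8) instead of $\delta l/\delta a \diamond a$, as was the case in [13, Theorem 3.3]. Apart from such formal considerations, Theorem 3 completely fits the case of classical mechanics studied in the above mentioned paper.

7.3. Reconstruction. Given a section $s$ of the bundle $P \to M$ and a fixed element $a_0 \in V$, we obtain sections $\sigma = \pi_{CJ}(j^1 s)$ and $a = \{s, a_0\}_G$ of $C \to M$ and $\mathcal V \to M$ respectively. As we have seen in Sect. 5, the range of the section $s$ is an integral leaf of the horizontal bundle of the connection $\sigma$. From this fact we deduce that $\sigma$ is a flat connection, as we have seen for the Euler-Poincaré reduction in Sect. 6.1. We now study the properties of $a$. First, it is straightforward to check that $\nabla^\sigma a = 0$, where $\nabla^\sigma$ is the covariant derivative of the linear connection induced by $\sigma$ on the vector bundle $\mathcal V \to M$; i.e., $a$ is a flat section with respect to $\sigma$.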
On the other hand, for every $x \in M$, if we write $a(x)$ as $\{p, \hat a\}_G$, with $p \in \pi_{MP}^{-1}(x)$ arbitrary, it is clear that $\hat a \in \mathrm{Orb}(a_0)$; that is, if we write $a$ as an equivalence class in $(P \times V)/G$, the second term always belongs to the $G$-orbit of $a_0$. All these properties satisfied by $\sigma$ and $a$ are necessary conditions: an arbitrary solution of Eqs. (7.8) is not always of the form $\sigma = \pi_{CJ}(j^1 s)$, $a = \{s, a_0\}_G$. If the holonomy of the connection $\sigma$ is trivial (which is always true locally), these conditions are also sufficient. Therefore they become the compatibility conditions that must be imposed to reconstruct a solution $s$ of the variational problem defined by $L_{a_0}$ from the solutions of (7.8). More precisely:

Theorem 4. We fix a point $\hat x \in M$ and a value $a_0 \in V$. With the assumptions of Theorem 3, a solution $s$ of the problem defined by the Lagrangian $L_{a_0}$ defines the sections $\sigma$ and $a$, solutions of the equations (7.8), such that $\mathrm{Curv}(\sigma) = 0$, $\nabla^\sigma a = 0$ and $a(\hat x) = \{p, \hat a\}_G$ with $\hat a \in \mathrm{Orb}(a_0)$, for any $p \in \pi_{MP}^{-1}(\hat x)$. Conversely, given a solution $\lambda = (\sigma, a)$ of Eqs. (7.8) satisfying the previous properties, if the holonomy of $\sigma$ is trivial, the solutions of the variational problem defined by $L_{a_0}$ are obtained as follows. We take an integral section $\hat s : M \to P$ of the connection $\sigma$. We write $a(x) = \{\hat s(x), \hat a\}_G$. Let $g$ be an element of $G$ such that $\hat a = g a_0$. Any other such element is of the form $g' = gh$, with $h \in G_{a_0}$. Then the sections $s = \hat s g$ are the critical sections of $L_{a_0}$. Roughly speaking, we locally have the equivalence

$$\mathcal{EL}(L_{a_0}) = 0 \iff \begin{cases} \mathrm{div}^{\mathcal H}\dfrac{\delta l}{\delta\sigma} + \mathrm{ad}^*_{\sigma^{\mathcal H}}\dfrac{\delta l}{\delta\sigma} = a \diamond \dfrac{\delta l}{\delta a}, \\[1ex] \nabla^\sigma a = 0, \\ a(\hat x) = \{p, \hat a\}_G \text{ with } \hat a \in \mathrm{Orb}(a_0), \text{ for any } p \in \pi_{MP}^{-1}(\hat x), \\ \mathrm{Curv}(\sigma) = 0, \end{cases} \tag{7.9}$$

where $a \diamond k := \rho^*_a(k)$.

Proof. We have already seen that the conditions on the right-hand side of (7.9) are necessary. For the converse, if the holonomy of $\sigma$ is trivial, let $\hat s$ be an integral section. If we write $a(x) = \{\hat s(x), \hat a(x)\}_G$, we see that $\hat a$ is constant from the condition $\nabla^\sigma a = 0$. With the condition on $\hat x$, we have that $\hat a \in \mathrm{Orb}(a_0)$. Then, from Theorem 3, every section $s = \hat s g$, with $\hat a = g a_0$, is a critical section of $L_{a_0}$, since $\sigma = \pi_{CJ}(j^1 s)$, $a = \{s(x), a_0\}_G$ and $\lambda = (\sigma, a)$ satisfies (7.8).

8. Examples

8.1. Classical mechanics. This is the case when the dimension of the base manifold $M$ is 1. Let $\pi_{MP} : P = \mathbb R \times G \to \mathbb R$ be the trivial principal bundle with structure group $G$ and let $H$ be a subgroup of $G$. We have the following identifications: $J^1P = \mathbb R \times TG$, $P/H = \mathbb R \times (G/H)$, and $(J^1P)/H = \mathbb R \times (T(G/H) \oplus \tilde{\mathfrak h})$. The last one is a modification of (3.2) with the aid of a connection on the principal bundle $G \to G/H$ instead of on the bundle $\mathbb R \times G \to \mathbb R \times (G/H)$. The space $\tilde{\mathfrak h}$ is also understood as the adjoint bundle of this bundle. The configuration bundle of the reduced problem is the composite bundle $\mathbb R \times \tilde{\mathfrak h} \to \mathbb R \times (G/H) \to \mathbb R$. A section $\sigma$ of this bundle is denoted by $\sigma(t) = (t, \bar v(t))$ and the projected section $\varsigma : \mathbb R \to \mathbb R \times (G/H)$ by $\varsigma(t) = (t, r(t))$. Then the Lagrange-Poincaré equations read

$$\mathcal{EL}_{\bar v}(l) = \left\langle \frac{\delta l}{\delta\bar v}, i_{\dot r}\tilde\Omega \right\rangle, \qquad \frac{\nabla}{dt}\frac{\delta l}{\delta\bar v} + \mathrm{ad}^*_{\bar v}\frac{\delta l}{\delta\bar v} = 0,$$

where $\nabla/dt$ is the covariant derivative induced on $\tilde{\mathfrak h}^*$ by $\mathcal H$. These equations can be found in [5] and [15]. For instance, the description of the dynamics of the heavy top, that is, the motion of a rigid body with a fixed point in a gravitational field, fits in this context. In this case $G = SO(3)$ and $H = SO(2)$ is the subgroup of rotations with axis parallel to the gravitational direction. This example is also valid for the theory of semidirect product reduction. If we consider the length $a_0$ from the fixed point of the heavy top to its center of mass as a variable, the new Lagrangian $L : TSO(3) \times \mathbb R^3 \to \mathbb R$ is $SO(3)$-invariant and we can use the equivalence given in (7.9). See [13] for the discussion of this case. Another example is the dynamics of the rigid body with (three) internal rotors. For this system, $G = SE(3) \times S^1 \times S^1 \times S^1$ and the subgroup of symmetries of the Lagrangian is $H = SE(3)$. Finally, for the dynamics of the underwater vehicle, we have $G = SE(3)$ and $H = SE(2) \times \mathbb R$. For a description of these systems, see for instance [15].

8.2. Harmonic maps. Let $(M, g)$ be a compact oriented Riemannian manifold, and let $(G, h)$ be a Lie group equipped with a right invariant Riemannian metric. We identify the mappings $\phi : M \to G$ with the global sections of the trivial principal bundle $P = M \times G$. For each $\phi \in C^\infty(M, G)$, we may define the energy $E$ on $C^\infty(M, G)$ by $E(\phi) = \int_M L(j^1\phi)\,dx$, for the Lagrangian

$$L(j^1\phi) = \frac12 \langle T\phi, T\phi \rangle_{g,h},$$

where $\langle\cdot,\cdot\rangle_{g,h}$ is the metric induced on $T^*M \otimes TG$ by $g$ and $h$. The Euler-Lagrange equations for this Lagrangian are given by $\mathcal{EL}(\phi) = \mathrm{Tr}\,\nabla d\phi = 0$, where $\nabla$ is the induced Riemannian covariant derivative on $C^\infty(T^*M \otimes TG)$ and $\mathrm{Tr}$ is the trace defined by $g$ (cf. [7]). The solutions of these equations are called harmonic mappings. Clearly $L : J^1P \to \mathbb R$ is $G$-invariant. If we consider the trivial flat connection $\mathcal H$ on $P \to M$, the reduced Lagrangian $l : T^*M \otimes \mathfrak g \to \mathbb R$ is

$$l(\sigma) = \frac12 \langle \sigma, \sigma \rangle_{g,h},$$

where $h$ is an inner product on $\mathfrak g$. The Euler-Poincaré equation reads in this case

$$d^*\langle \sigma, \cdot \rangle_h + \mathrm{ad}^*_\sigma \langle \sigma, \cdot \rangle_h = 0, \tag{8.1}$$

where $d^* = *d*$ stands for the codifferential defined by the metric $g$. If the metric $h$ is also left invariant, we have $\mathrm{ad}^*_A \langle B, \cdot\rangle_h + \mathrm{ad}^*_B \langle A, \cdot\rangle_h = 0$ for all $A, B \in \mathfrak g$, and then the Euler-Poincaré equation is simply $d^*\sigma = 0$. This equation, as well as its corresponding compatibility condition $\mathrm{Curv}(\sigma) = 0$ (cf. Theorem 1), was first proved in [23] and is used, for example, in [12] for $G = SU(2)$ and the geometry of $S^3$, in [28] for $G = U(n)$, in [18] for $G = SU(n)$ or $SO(n)$, and in [20] for $G = SE(3)$ in applications to robotic mechanisms. The more general formula (8.1) was shown for the first time in [6].

8.3. Harmonic maps: Partial symmetries. We shall use the same notation as in the previous example. Assume now that the metric $h$ is only invariant under the action of a proper subgroup $H$ of $G$. In this case we work with the mechanical connection $\mathcal H$ on the principal bundle $\pi_{HG} : G \to G/H$, which is the connection whose horizontal distribution at every $g \in G$ is the orthogonal complement of the tangent space of the $H$-orbit, that is, $\mathcal H_g = (V_g)^\perp$, $g \in G$, where $V$ is the vertical distribution. This connection defines a metric $\hat h$ on $G/H$ by simply setting $\hat h(X, Y) := h(X^{\mathcal H}, Y^{\mathcal H})$, $X, Y \in T_\varsigma(P/H)$ (see


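Sect. 2.1.3). The harmonic Lagrangian $L : J^1(M, G) \to \mathbb R$, $L(j^1\phi) = \frac12 \langle T\phi, T\phi \rangle_{g,h}$, projects to the reduced Lagrangian $l : J^1(M, G/H) \times T^*M \otimes \tilde{\mathfrak h} \to \mathbb R$, defined by

$$l(j^1_x\varsigma, \sigma) = \frac12 \langle T\varsigma, T\varsigma \rangle_{g,\hat h} + \frac12 \langle \sigma, \sigma \rangle_{g,h},$$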

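with derivative $\delta l/\delta\sigma = \langle \sigma, \cdot \rangle_{g,h}$. The Lagrange-Poincaré equations and the compatibility condition read

$$\begin{aligned} \mathrm{Tr}\,\nabla d\varsigma &= \left\langle \langle \sigma, \cdot \rangle_{g,h},\ i_{T\varsigma}\tilde\Omega^{\mathcal H} \right\rangle, \\ \mathrm{div}^{\mathcal H}\langle \sigma, \cdot \rangle_{g,h} + \langle \sigma, \mathrm{ad}_\sigma\, \cdot \rangle_{g,h} &= 0, \\ \Omega^{\mathcal H} - \nabla^{\mathcal H}\sigma + [\sigma, \sigma] &= 0 \quad \text{along } \pi^{-1}(\varsigma(M)). \end{aligned} \tag{8.2}$$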

In [7, Theorem 4.13] a result for Riemannian submersions (which includes the projections of principal fiber bundles as a special case) studies the relation between the problem defined by the harmonic Lagrangian $L$ and the reduced problem defined by $l$ when $\phi$ is "horizontal", that is, when $\sigma = 0$. Equations (8.2) generalize this situation to the case where this condition on $\phi$ is not imposed.

8.4. Sigma models. A topic closely related to harmonic maps is the so-called theory of sigma models. These models are important in theoretical physics for different reasons: they have properties similar to those of Yang-Mills theories, they contain soliton solutions, they are used as approximate models in particle physics, etc. The only sigma models which are known to be integrable are the sigma models taking values in a Lie group or a homogeneous space (cf. [2, 18, 21]). We now apply the Lagrange-Poincaré reduction to this last case. Let $P \to M$ be a $G$-principal bundle over a Riemannian manifold $(M, g)$ and $A$ be a connection on $P \to M$ with connection form $\omega^A : TP \to \mathfrak g$. Let $H$ be a subgroup of $G$. We assume that there exists a subalgebra $\mathfrak k$ of $\mathfrak g$ endowed with a metric $k$, such that $\mathfrak g = \mathfrak h \oplus \mathfrak k$, $\mathrm{Ad}_h \mathfrak k \subset \mathfrak k$ and $k \circ \mathrm{Ad}_h = k$ for any $h \in H$ (we say that $H$ is metric reductive). We denote by $\mathrm{pr}_{\mathfrak h} : \mathfrak g \to \mathfrak h$ and $\mathrm{pr}_{\mathfrak k} : \mathfrak g \to \mathfrak k$ the projections induced by the given decomposition of $\mathfrak g$. The composition $\mathrm{pr}_{\mathfrak h} \circ \omega^A$ is a connection one-form $\omega^{\mathcal H}$ on the fiber bundle $P \to P/H$. The Lagrangian $L : J^1P \to \mathbb R$,

$$L(j^1_x s) = \frac12 \left\langle \mathrm{pr}_{\mathfrak k} \circ \omega^A \circ Ts,\ \mathrm{pr}_{\mathfrak k} \circ \omega^A \circ Ts \right\rangle_{g,k},$$

where $\langle\,\cdot\,,\cdot\,\rangle_{g,k}$ stands for the metric induced by $g$ and $k$ on the space of $\mathfrak k$-valued covectors, is clearly $H$-invariant. The metric $k$ induces a metric on $G/H$ which is $H$-invariant. Since the typical fiber of $\pi_M : P/H \to M$ is $G/H$, we have a metric for $\pi_M$-vertical vectors, which will also be denoted by $k$.
The reduced Lagrangian takes the form

$$l(j^1_x\varsigma, \sigma) = \frac12 \left\langle \nabla^A\varsigma, \nabla^A\varsigma \right\rangle_{g,k}, \tag{8.3}$$

where $\nabla^A$ is the connection induced by $A$ on $P/H \to M$, as this bundle is an associated bundle (see Proposition 1). This is the Lagrangian of the sigma model on the bundle $P/H \to M$, which is precisely the Lagrangian for harmonic maps on this bundle coupled


with the connection $A$. We have that $\delta l/\delta\sigma = 0$ and therefore the Lagrange-Poincaré equations read

$$\mathcal{EL}(l) = 0. \tag{8.4}$$

We directly write the standard Euler-Lagrange operator instead of the partial one, $\mathcal{EL}_\varsigma(l) = 0$, as the Lagrangian $l$ only depends on the variable $\varsigma$ and not on $\sigma$, as one can see from Eq. (8.3). The reduced variational problem imposes conditions on $\sigma$ only through the projected section $\varsigma$. The solutions of the reduced variational problem are just harmonic sections of $P/H \to M$ with respect to $A$. In this case, the reduction procedure can be used backwards. More precisely, for the study of the system (8.4), one can work with the variational problem defined by $L$. A solution of $\mathcal{EL}(L) = 0$ projects onto a solution of $\mathcal{EL}(l) = 0$ and, vice versa, a solution $\varsigma$ of $\mathcal{EL}(l) = 0$ together with an arbitrary section $\sigma$ of $T^*M \otimes \tilde{\mathfrak h}$ projecting to $\varsigma$ and such that $\mathrm{Curv}(\mathcal H - \sigma) = 0$ gives a solution of $\mathcal{EL}(L) = 0$. This is the so-called "injective" or "lifted" formulation of sigma models. This formulation can be found in [2] (where the condition $\mathrm{Curv}(\mathcal H - \sigma) = 0$ is seen as a conservation law), [21], and [22].

8.5. KdV as a reduced equation. We now give a more concrete example. Let $G = \mathbb R^2$ with coordinates $(\phi, \psi)$, $M = \mathbb R^2$ with coordinates $(x, t)$, and $P = M \times G$. The Lagrangian $L : J^1P \to \mathbb R$,

$$L(j^1_x s) = \frac12 \phi_t\phi_x + \phi_x^3 + \phi_x\psi_x + \frac12 \psi^2,$$

is invariant under the action of the subgroup $H = \mathbb R$ acting by translations on the first variable $\phi$. We consider the trivial connection on the bundle $G = \mathbb R^2 \to \mathbb R = G/H$. In this case we have $P/H = M \times \mathbb R$, $T^*M \otimes \tilde{\mathfrak h} = T^*M \times \mathbb R$, with coordinates $(x, t, u, v, \psi)$, and the reduced Lagrangian $l : J^1(P/H) \times T^*M \otimes \tilde{\mathfrak h} \to \mathbb R$,

$$l(j^1_x\varsigma, \sigma) = \frac12 vu + u^3 + u\psi_x + \frac12 \psi^2.$$

Then $\delta l/\delta\sigma = (\frac12 v + 3u^2 + \psi_x)\,\partial/\partial x + \frac12 u\,\partial/\partial t$ and the Lagrange-Poincaré equations take the form

$$\mathcal{EL}_\psi(l) = \psi - \frac{\partial u}{\partial x} = 0, \qquad \mathrm{div}\frac{\delta l}{\delta\sigma} = \frac12 \frac{\partial v}{\partial x} + 6u\frac{\partial u}{\partial x} + \frac{\partial^2\psi}{\partial x^2} + \frac12 \frac{\partial u}{\partial t} = 0,$$

with the compatibility condition $d(u\,dx + v\,dt) = 0$, which reads $\partial u/\partial t = \partial v/\partial x$. Substituting $\partial v/\partial x = \partial u/\partial t$ and $\psi = \partial u/\partial x$ into the divergence equation, we can rewrite these equations as

$$\frac{\partial u}{\partial t} = \frac{\partial v}{\partial x}, \qquad \psi = \frac{\partial u}{\partial x}, \qquad \frac{\partial u}{\partial t} + 6u\frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = 0,$$

where one recognizes the KdV equation in the third equation. In fact, this is the standard way (see for example [19]) to present the KdV equation by means of a first order Lagrangian, although the use of the symmetry under $\phi$-translations is needed in order to obtain it from $L$, as we have just done. The first group of equations is needed to recover the solutions of the original Lagrangian starting from a solution of the KdV equation.

Acknowledgements. MCL was partially supported by the DGESIC (Spain) under grant no. BMF20001314. TSR was partially supported by the European Commission and the Swiss Federal Government through funding for the Research Training Network Mechanics and Symmetry in Europe (MASIE) as well as the Swiss National Science Foundation.


References

1. Anderson, I.M., Fels, M.E.: Symmetry reduction of variational bicomplexes and the principle of symmetric criticality. Am. J. Math. 119, 609–670 (1997)
2. Bordemann, M., Forger, M., Laartz, J., Schäper, U.: The Lie-Poisson structure of integrable classical non-linear sigma models. Commun. Math. Phys. 152, 167–190 (1993)
3. Castrillón López, M., Muñoz Masqué, J.: The geometry of the bundle of connections. Math. Z. 236(4), 797–811 (2001)
4. Castrillón López, M., Ratiu, T.S., Shkoller, S.: Reduction in principal fiber bundles: Covariant Euler-Poincaré equations. Proc. Am. Math. Soc. 128(7), 2155–2164 (2000)
5. Cendra, H., Marsden, J., Ratiu, T.: Lagrangian reduction by stages. Mem. Am. Math. Soc. 152(722) (2001)
6. Dai, Y., Shoji, M., Urakawa, H.: Harmonic maps into Lie groups and homogeneous spaces. Diff. Geom. Appl. 7, 143–160 (1997)
7. Eells, J., Ratto, A.: Harmonic maps and minimal immersions with symmetries. Princeton, NJ: Princeton University Press, 1993
8. García Pérez, P.L.: Gauge algebras, curvature and symplectic structure. J. Diff. Geom. 12, 209–227 (1977)
9. Giachetta, G., Mangiarotti, L., Sardanashvily, G.: New Lagrangian and Hamiltonian methods in field theory. Singapore: World Scientific Co., 1997
10. Goldschmidt, H., Sternberg, S.: The Hamilton-Cartan formalism in the Calculus of Variations. Ann. Inst. Fourier 23, 203–267 (1973)
11. Hamel, G.: Die Lagrange-Eulerschen Gleichungen der Mechanik. Z. Mathematik und Physik 50, 1–57 (1904)
12. Hitchin, N.: Harmonic maps from a 2-torus to the 3-sphere. J. Diff. Geom. 31, 627–710 (1990)
13. Holm, D., Marsden, J., Ratiu, T.: The Euler-Poincaré equations and semidirect products with applications to continuum theories. Adv. in Math. 137, 1–81 (1998)
14. Kobayashi, S., Nomizu, K.: Foundations of differential geometry. New York: John Wiley & Sons, Inc. (Interscience Division), Volume I, 1963; Volume II, 1969
15. Marsden, J.E., Ratiu, T.S.: Introduction to mechanics and symmetry. New York: Springer-Verlag, Inc., 1994
16. Marsden, J.E., Ratiu, T.S., Weinstein, A.: Semi-direct products and reduction in mechanics. Trans. Am. Math. Soc. 281(1), 147–177 (1984)
17. Marsden, J.E., Ratiu, T.S., Weinstein, A.: Reduction and Hamiltonian structures on duals of semidirect product Lie algebras. In: Fluids and Plasmas: Geometry and Dynamics, Cont. Math. 28, Providence, RI: Am. Math. Soc., 1984, pp. 55–100
18. Matos, T., Nucamendi, U.: SU(N)- and SO(N)-invariant chiral fields: One- and two-dimensional subspaces. J. Math. Phys. 40(5), 2500–2513 (1999)
19. Nutku, Y.: Hamiltonian formulation of the KdV equation. J. Math. Phys. 25(6) (1984)
20. Park, F.C., Brockett, R.W.: Kinematic dexterity of robotic mechanisms. Int. J. Robotics Res. 13, 1–15 (1994)
21. Percacci, R.: Global definition of nonlinear sigma model and some consequences. J. Math. Phys. 22(9), 1892–1895 (1981)
22. Percacci, R.: Geometry of nonlinear field theories. Singapore: World Scientific Pub. Co., 1986
23. Pluzhnikov, A.I.: Some properties of harmonic mappings in the case of spheres and Lie groups. Sov. Math. Dokl. 27(1) (1983)
24. Poincaré, H.: Sur une forme nouvelle des équations de la mécanique. C.R. Acad. Sci. 132, 369–371 (1901)
25. Ratiu, T.S.: Euler-Poisson equations on Lie algebras and the N-dimensional heavy rigid body. Proc. Nat. Acad. Sci. USA 78, 1327–1328 (1981)
26. Sardanashvily, G., Zakharov, O.: On the geometry of spontaneous symmetry breaking. J. Math. Phys. 33 (1992)
27. Saunders, D.J.: The geometry of jet bundles. Cambridge, UK: Cambridge University Press, 1989
28. Uhlenbeck, K.: Harmonic maps into Lie groups (Classical solutions of the chiral model). J. Diff. Geom. 30, 1–50 (1989)

Communicated by L. Takhtajan

Commun. Math. Phys. 236, 251–280 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0823-7

Communications in

Mathematical Physics

Extended Divergence-Measure Fields and the Euler Equations for Gas Dynamics

Gui-Qiang Chen1, Hermano Frid2

1 Department of Mathematics, Northwestern University, 2033 Sheridan Road, Evanston, IL 60208-2730, USA. E-mail: [email protected]
2 Instituto de Matemática Pura e Aplicada – IMPA, Estrada Dona Castorina, 110, Rio de Janeiro, RJ 22460-320, Brazil. E-mail: [email protected]

Received: 7 May 2002 / Accepted: 2 December 2002 Published online: 2 April 2003 – © Springer-Verlag 2003

Abstract: A class of extended vector fields, called extended divergence-measure fields, is analyzed. These fields include vector fields in $L^p$ and vector-valued Radon measures, whose divergences are Radon measures. Such extended vector fields naturally arise in the study of the behavior of entropy solutions of the Euler equations for gas dynamics and other nonlinear systems of conservation laws. A new notion of normal traces over Lipschitz deformable surfaces is developed under which a generalized Gauss-Green theorem is established even for these extended fields. An explicit formula is obtained to calculate the normal traces over any Lipschitz deformable surface, suitable for applications, by using the neighborhood information of the fields near the surface and the level set function of the Lipschitz deformation surfaces. As an application, we prove the uniqueness and stability of Riemann solutions that may contain vacuum in the class of entropy solutions of the Euler equations for gas dynamics.

1. Introduction

We are concerned with a class of extended vector fields, called extended divergence-measure fields, or DM-fields for short. These fields include vector fields in $L^p$, $1 \le p \le \infty$, and vector-valued Radon measures, whose divergences are Radon measures. The DM-fields naturally arise in the study of the behavior of entropy solutions of nonlinear hyperbolic systems of conservation laws, which take the form

\[ u_t + \nabla_x \cdot f(u) = 0, \qquad u \in \mathbb{R}^m,\ x \in \mathbb{R}^n, \]

where $f : \mathbb{R}^m \to (\mathbb{R}^m)^n$ is a nonlinear map. One of its most important prototypes is the Euler equations for gas dynamics in Lagrangian coordinates:

\[ \tau_t - v_x = 0, \tag{1.1} \]
\[ v_t + p_x = 0, \tag{1.2} \]
\[ \Big(e + \frac{v^2}{2}\Big)_t + (pv)_x = 0, \tag{1.3} \]

where $\tau = 1/\rho$ is the specific volume with the density $\rho$, and $v$, $p$, $e$ are the velocity, the pressure, and the internal energy, respectively; the other two gas dynamical variables are the temperature $\theta$ and the entropy $S$. For ideal polytropic gases, system (1.1)–(1.3) is closed by the following constitutive relations:

\[ p\tau = R\theta, \qquad e = c_v\theta, \qquad p(\tau, S) = \kappa\,\tau^{-\gamma} e^{S/c_v}, \tag{1.4} \]

where $c_v$, $R$, and $\kappa$ are positive constants, and $\gamma = 1 + R/c_v > 1$. For isentropic gases, the Euler equations become

\[ \tau_t - v_x = 0, \tag{1.5} \]
\[ v_t + p(\tau)_x = 0, \tag{1.6} \]

where $p(\tau) = \kappa\,\tau^{-\gamma}$, $\gamma > 1$. The main feature of nonlinear hyperbolic conservation laws, especially (1.1)–(1.3), is that, no matter how smooth the initial data is, the solution may develop singularities and form shock waves in finite time. One may expect solutions in the space of functions of bounded variation. This is indeed the case by the Glimm theorem [15], which establishes that, when the initial data has sufficiently small total variation and stays away from vacuum for (1.1)–(1.3), there exists a global entropy solution in BV satisfying the Clausius inequality:

\[ S_t \ge 0 \tag{1.7} \]

in the sense of distributions. However, when the initial data is large, still away from vacuum, the solution may develop vacuum in finite time, even instantaneously as $t > 0$. In this case, the specific volume $\tau = 1/\rho$ may then become a Radon measure or an $L^1$ function, rather than a function of bounded variation. This indicates that solutions of nonlinear hyperbolic conservation laws are generally either in $\mathcal{M}(\mathbb{R}_+ \times \mathbb{R}^n)$, the space of signed Radon measures, or in $L^p(\mathbb{R}_+ \times \mathbb{R}^n)$, $1 \le p \le \infty$. On the other hand, the fact that (1.1)–(1.3) and (1.7) hold in the sense of distributions implies, in particular, that the divergences of the fields $(\tau, -v)$, $(v, p)$, $(e + v^2/2, pv)$, and $(S, 0)$ in the $(t, x)$ variables are Radon measures, in which the first three are the trivial null measure and the last one is a nonnegative measure as a consequence of the Schwartz Lemma [21]. This motivates our study of extended divergence-measure fields (see Definition 1.1 below). In this connection, we recall that Wagner [25] proved that the well known Lagrangian transformation carries entropy solutions of the Euler equations in Eulerian coordinates to entropy solutions of (1.1)–(1.3) in a one-to-one and onto manner. However, since solutions of the former which contain vacuum are carried into solutions of (1.1)–(1.3) which are vector-valued measures, the concept of entropy solutions for the latter has to be strengthened. We will return to this point in Sect. 4. Understanding more properties of DM-fields can advance our understanding of the behavior of solutions (cf. [5–7]). One of the fundamental questions is whether the normal traces can still be defined and the Gauss-Green formula, i.e., integration by parts, still works for these extended fields, which are very weak.
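As a sanity check on the constitutive relations (1.4), the following sketch verifies numerically that they are compatible with the first law of thermodynamics, $de = \theta\,dS - p\,d\tau$, when the adiabatic exponent is taken as $\gamma = 1 + R/c_v$. The first law and the numerical values of `R`, `C_V`, and `KAPPA` are assumptions made for this illustration only; they are not taken from the paper.

```python
import math

# Hypothetical constants chosen only for illustration (not from the paper).
R, C_V, KAPPA = 0.4, 1.0, 1.0
GAMMA = 1.0 + R / C_V  # adiabatic exponent assumed for an ideal polytropic gas


def pressure(tau, S):
    """p(tau, S) = kappa * tau**(-gamma) * exp(S / c_v), as in (1.4)."""
    return KAPPA * tau ** (-GAMMA) * math.exp(S / C_V)


def temperature(tau, S):
    """theta obtained from the ideal gas law p * tau = R * theta."""
    return pressure(tau, S) * tau / R


def energy(tau, S):
    """Internal energy e = c_v * theta."""
    return C_V * temperature(tau, S)


def first_law_residuals(tau, S, h=1e-6):
    """Central-difference residuals of e_tau = -p and e_S = theta."""
    e_tau = (energy(tau + h, S) - energy(tau - h, S)) / (2 * h)
    e_S = (energy(tau, S + h) - energy(tau, S - h)) / (2 * h)
    return e_tau + pressure(tau, S), e_S - temperature(tau, S)


res_tau, res_S = first_law_residuals(tau=1.3, S=0.2)
print(abs(res_tau) < 1e-6, abs(res_S) < 1e-6)
```

Both residuals vanish up to discretization error only for this value of $\gamma$; changing `GAMMA` to any other exponent makes the first residual visibly nonzero.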

We begin with the definition of DM-fields. For open sets $A, B \subset \mathbb{R}^N$, by the relation $A \Subset B$ we mean that the closure of $A$, $\bar{A}$, is a compact subset of $B$.

Definition 1.1. Let $D \subset \mathbb{R}^N$ be open. For $F \in L^p(D; \mathbb{R}^N)$, $1 \le p \le \infty$, or $F \in \mathcal{M}(D; \mathbb{R}^N)$, set

\[ |\operatorname{div} F|(D) := \sup\Big\{ \int_D \nabla\varphi \cdot F \;:\; \varphi \in C_0^1(D),\ |\varphi(x)| \le 1,\ x \in D \Big\}. \]

For $1 \le p \le \infty$, we say that $F$ is an $L^p$-divergence-measure field over $D$, i.e., $F \in \mathcal{DM}^p(D)$, if $F \in L^p(D; \mathbb{R}^N)$ and

\[ \|F\|_{\mathcal{DM}^p(D)} := \|F\|_{L^p(D;\mathbb{R}^N)} + |\operatorname{div} F|(D) < \infty. \tag{1.8} \]

We say that $F$ is an extended divergence-measure field over $D$, i.e., $F \in \mathcal{DM}^{ext}(D)$, if $F \in \mathcal{M}(D; \mathbb{R}^N)$ and

\[ \|F\|_{\mathcal{DM}^{ext}(D)} := |F|(D) + |\operatorname{div} F|(D) < \infty. \tag{1.9} \]

If $F \in \mathcal{DM}^p(D)$ for any open set $D \Subset \mathbb{R}^N$, then we say $F \in \mathcal{DM}^p_{loc}(\mathbb{R}^N)$; and, if $F \in \mathcal{DM}^{ext}(D)$ for any open set $D \Subset \mathbb{R}^N$, we say $F \in \mathcal{DM}^{ext}_{loc}(\mathbb{R}^N)$.

It is easy to check that these spaces under the norms (1.8) and (1.9), respectively, are Banach spaces. These spaces are larger than the BV-space. The establishment of the Gauss-Green theorem, traces, and other properties of BV functions in the middle of the last century (see Federer [13]) has significantly advanced our understanding of solutions of nonlinear partial differential equations and nonlinear problems in calculus of variations, differential geometry, and other areas. A natural question is whether the DM-fields have similar properties, especially the normal traces and the Gauss-Green formula. At first glance, the answer seems unclear.

Example 1.1. The field $F(x, y) = \Big( \dfrac{x}{x^2 + y^2}, \dfrac{y}{x^2 + y^2} \Big)$ belongs to $\mathcal{DM}^1_{loc}(\mathbb{R}^2)$. As remarked in Whitney [26], for $\Omega = (0, 1) \times (0, 1)$,

\[ \int_\Omega \operatorname{div} F = 0 \ne \int_{\partial\Omega} F \cdot \nu \, d\mathcal{H}^1 = \frac{\pi}{2}, \]

if one understands $F \cdot \nu$ in the classical sense, which implies that the classical Gauss-Green theorem fails. In this paper, we succeed in using the neighborhood information via the Lipschitz deformation to develop a natural notion of normal traces, under which our generalized Gauss-Green theorem holds, even for $F \in \mathcal{DM}^{ext}(D)$.

Example 1.2. For any $\mu \in \mathcal{M}(\mathbb{R})$ with finite total variation, $F(x, y) = (dx \times \mu(y), 0) \in \mathcal{DM}^{ext}(I \times \mathbb{R})$, for any bounded open interval $I \subset \mathbb{R}$. A non-trivial example of such fields is provided by the Riemann solutions of the Euler equations (1.1)–(1.3) for gas dynamics, which develop vacuum. See (4.12) below.
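The failure of the classical Gauss-Green formula in Example 1.1 can be checked numerically. The sketch below reads the field as $F = (x, y)/(x^2 + y^2)$, integrates the classical normal component over the four sides of the unit square with a midpoint rule, and confirms by finite differences that $\operatorname{div} F$ vanishes at interior sample points. The quadrature resolution and the sample grid are arbitrary choices made for this illustration.

```python
import math


def F(x, y):
    """Whitney's field F = (x, y) / (x^2 + y^2) from Example 1.1."""
    r2 = x * x + y * y
    return x / r2, y / r2


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def classical_flux():
    """Outward flux of F through the boundary of (0,1) x (0,1), side by side."""
    bottom = midpoint(lambda x: -F(x, 0.0)[1], 0.0, 1.0)  # nu = (0, -1)
    left = midpoint(lambda y: -F(0.0, y)[0], 0.0, 1.0)    # nu = (-1, 0)
    right = midpoint(lambda y: F(1.0, y)[0], 0.0, 1.0)    # nu = (1, 0)
    top = midpoint(lambda x: F(x, 1.0)[1], 0.0, 1.0)      # nu = (0, 1)
    return bottom + left + right + top


def divergence(x, y, h=1e-5):
    """Central-difference approximation of div F at an interior point."""
    dFx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dFy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dFx + dFy


flux = classical_flux()
max_div = max(abs(divergence(0.1 * i, 0.1 * j))
              for i in range(1, 10) for j in range(1, 10))
print(flux, math.pi / 2, max_div)
```

The two sides touching the origin contribute nothing in the classical sense, while the sides $x = 1$ and $y = 1$ contribute $\pi/4$ each; the missing mass sits at the corner $(0, 0)$, which the classical boundary integral cannot see.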


Some efforts have been made in generalizing the Gauss-Green Theorem. Some results for several situations can be found in Anzellotti [1] for an abstract formulation for $F \in L^\infty$, Rodrigues [20] for $F \in L^2$, and Ziemer [28] for a related problem for $\operatorname{div} F \in L^1$ (see also Baiocchi-Capelo [2], Brezzi-Fortin [4], and Ziemer [29]). In Chen-Frid [5], we observed an explicit way to calculate the normal traces for $F \in \mathcal{DM}^\infty$ by the neighborhood information, under which the Gauss-Green formula holds for such fields. In this paper, motivated by various nonlinear problems from conservation laws, we propose a natural notion of normal traces by the neighborhood information via Lipschitz deformation under which a generalized Gauss-Green theorem is established for $F \in \mathcal{DM}^{ext}(D)$ in Sect. 3, where our main results concerning extended divergence-measure fields are stated and proved, after establishing some auxiliary results in Sect. 2. In particular, we show an explicit way to calculate the normal traces over any deformable Lipschitz surface, suitable for applications, by using the neighborhood information of the fields near the surface and the level set function of the Lipschitz deformation surfaces. We also show a product rule for these extended fields. Their proofs require some refined properties of Radon measures and the Whitney extension theory, among others. In Sect. 4, we give an important application of the theory of DM-fields to the Euler equations (1.1)–(1.3) for gas dynamics and establish the uniqueness and stability of Riemann solutions of large oscillation that may contain two rarefaction waves and one contact discontinuity or vacuum states (i.e. measure solutions) in the class of entropy solutions, which may not belong to either $BV_{loc}$ or $L^\infty$, without specific reference to the method of construction of the solutions.
The proof, motivated by [11] and [7–9], is heavily based on our explicit approach to calculate normal traces over Lipschitz deformable surfaces, on the generalized Gauss-Green theorem, and on the product rule for DM-fields. The same arguments clearly also yield the uniqueness and stability of Riemann solutions in the class of entropy solutions for the Euler equations (1.5) and (1.6) for isentropic gas dynamics. Before closing this introduction, we recall some related results. In DiPerna [11], a uniqueness theorem of Riemann solutions was first established for 2 × 2 systems in the class of entropy solutions in $L^\infty \cap BV_{loc}$ with small oscillation. We also refer to Dafermos [10] for the stability of Lipschitz solutions for hyperbolic systems of conservation laws. In [8, 9], the uniqueness and stability of Riemann solutions of large oscillation without vacuum (possibly containing shocks) was proved for the 3 × 3 Euler equations, in the class of entropy solutions in $L^\infty \cap BV_{loc}$ which stay away from vacuum. Another related connection is the recent results on the $L^1$-stability of the solutions in $L^\infty \cap BV$ obtained either by the Glimm scheme [15], the wave front-tracking method, the vanishing viscosity method, or more generally satisfying an additional regularity, with small total variation in $x$ uniformly for all $t > 0$ (see the recent references cited in Bianchini-Bressan [3] and Dafermos [10]).

2. Radon Measures and the Whitney Extension

In this section, we establish some auxiliary properties about Radon measures and the Whitney extension of Lipschitz continuous functions, which are required for our analysis on extended divergence-measure vector fields in Sect. 3. We begin with some properties about Radon measures. Let $\Omega, D \subset \mathbb{R}^N$ be open. For $\mu_\varepsilon, \mu \in \mathcal{M}(\Omega)$, we denote by $\mu_\varepsilon \rightharpoonup \mu$ the weak convergence of $\mu_\varepsilon$ to $\mu$ in $\mathcal{M}(\Omega)$. The next three lemmas are standard, but we include their proofs for completeness.


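Lemma 2.1. Let $\mu_\varepsilon, \mu \in \mathcal{M}(\Omega)$ be signed Radon measures over $\Omega$ with $\mu_\varepsilon \rightharpoonup \mu$. Then

\[ |\mu|(\Omega) \le \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(\Omega). \]

This can be seen as follows: For any $\phi \in C_0(\Omega)$, $|\phi| \le 1$,

\[ \langle \mu, \phi \rangle = \lim_{\varepsilon \to 0} \langle \mu_\varepsilon, \phi \rangle \le \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(\Omega). \]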
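The inequality in Lemma 2.1 can be strict, which is why Lemma 2.2 below assumes convergence of the total masses. A standard illustration (not taken from the paper), with $\Omega = (-1, 1)$ and Dirac masses, is:

```latex
\[
  \mu_\varepsilon = \delta_{\varepsilon} - \delta_{0} \rightharpoonup \mu = 0,
  \qquad
  |\mu|(\Omega) = 0 < 2 = \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(\Omega),
\]
% since for every \phi \in C_0(\Omega), by continuity of \phi at 0,
\[
  \langle \mu_\varepsilon, \phi \rangle = \phi(\varepsilon) - \phi(0) \to 0
  \quad \text{as } \varepsilon \to 0 .
\]
```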

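Lemma 2.2. Let $\mu_\varepsilon, \mu \in \mathcal{M}(\Omega)$ be such that

\[ \mu_\varepsilon \rightharpoonup \mu, \qquad \lim_{\varepsilon \to 0} |\mu_\varepsilon|(\Omega) = |\mu|(\Omega). \]

Then, for every open set $A \subset \Omega$,

\[ |\mu|(\bar{A} \cap \Omega) \ge \limsup_{\varepsilon \to 0} |\mu_\varepsilon|(\bar{A} \cap \Omega). \]

In particular, if $|\mu|(\partial A \cap \Omega) = 0$, then

\[ |\mu|(A) = \lim_{\varepsilon \to 0} |\mu_\varepsilon|(A). \]

Proof. Set $B = \Omega - \bar{A}$, which is open. Then

\[ |\mu|(A) \le \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(A), \qquad |\mu|(B) \le \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(B). \]

On the other hand,

\[ |\mu|(\bar{A} \cap \Omega) + |\mu|(B) = |\mu|(\Omega) = \lim_{\varepsilon \to 0} |\mu_\varepsilon|(\Omega) \ge \limsup_{\varepsilon \to 0} |\mu_\varepsilon|(\bar{A} \cap \Omega) + \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(B) \ge \limsup_{\varepsilon \to 0} |\mu_\varepsilon|(\bar{A} \cap \Omega) + |\mu|(B), \]

which yields the desired result.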

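Lemma 2.3. Let $\omega_\varepsilon$ be a sequence of positive symmetric mollifiers in $\mathbb{R}^N$ and $\mu \in \mathcal{M}(\Omega)$. Set $\mu_\varepsilon = \mu * \omega_\varepsilon$, the mollified measures. Then, for any open set $A \Subset \Omega$ with $|\mu|(\partial A) = 0$,

\[ |\mu|(A) = \lim_{\varepsilon \to 0} |\mu_\varepsilon|(A). \]

Proof. Since $\mu_\varepsilon \rightharpoonup \mu$, Lemma 2.1 implies

\[ |\mu|(A) \le \liminf_{\varepsilon \to 0} |\mu_\varepsilon|(A). \]

Notice that, for any $g \in C_0(A)$, $|g| \le 1$, we have $\langle \mu_\varepsilon, g \rangle = \langle \mu, g^\varepsilon \rangle$, where $g^\varepsilon = g * \omega_\varepsilon$. Since $|g^\varepsilon| \le 1$ and $\operatorname{spt}(g^\varepsilon) \subset A_\varepsilon := \{x : \operatorname{dist}(x, A) \le \varepsilon\} \subset \Omega$ for $\varepsilon \ll 1$, then $|\langle \mu_\varepsilon, g \rangle| \le |\mu|(A_\varepsilon)$, which implies

\[ |\mu_\varepsilon|(A) \le |\mu|(A_\varepsilon). \]

Hence,

\[ \limsup_{\varepsilon \to 0} |\mu_\varepsilon|(A) \le \lim_{\varepsilon \to 0} |\mu|(A_\varepsilon) = |\mu|(\bar{A}), \]

which yields

\[ \limsup_{\varepsilon \to 0} |\mu_\varepsilon|(A) \le |\mu|(A), \]

since $|\mu|(\partial A) = 0$.

In particular, Lemma 2.3 indicates that, if $\mu \ge 0$ and $\mu(\partial A) = 0$, then $\mu(A) = \lim_{\varepsilon \to 0} \mu_\varepsilon(A)$.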
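Lemma 2.3 can be observed numerically in one dimension. The sketch below takes the signed measure $\mu = \delta_a - \delta_b$ on $\Omega = \mathbb{R}$ with $A = (0, 1)$, so that $|\mu|(A) = 2$ and $|\mu|(\partial A) = 0$, and mollifies it with a triangular kernel. The measure, the kernel, and the quadrature are all choices made for this illustration, not taken from the paper.

```python
def triangle_mollifier(x, eps):
    """Positive symmetric kernel supported in [-eps, eps] with unit mass."""
    return max(0.0, 1.0 - abs(x) / eps) / eps


def mollified_variation(eps, a=0.3, b=0.6, n=200000):
    """|mu_eps|(A) for mu = delta_a - delta_b and A = (0, 1), by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        # Density of mu * omega_eps with respect to Lebesgue measure.
        density = triangle_mollifier(x - a, eps) - triangle_mollifier(x - b, eps)
        total += abs(density) * h
    return total


for eps in (0.5, 0.2, 0.05, 0.01):
    print(eps, mollified_variation(eps))
```

For large $\varepsilon$ the positive and negative bumps overlap and partially cancel, so $|\mu_\varepsilon|(A)$ falls short of $|\mu|(A)$; once $\varepsilon$ is smaller than half the distance between the atoms and their distance to $\partial A$, the value stabilizes at 2.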

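Proposition 2.1. Let $D \subset \mathbb{R}^N$ be open, $\mu \in \mathcal{M}(D)$, and $\mu_\varepsilon = \mu * \omega_\varepsilon$. Let $\Omega \Subset D$ be open and $|\mu|(\partial\Omega) = 0$. Then, for any $\phi \in C(D)$,

\[ \lim_{\varepsilon \to 0} \langle \mu_\varepsilon, \phi\chi_\Omega \rangle = \langle \mu, \phi\chi_\Omega \rangle. \]

Proof. Write $\mu_\varepsilon = \mu_\varepsilon^+ - \mu_\varepsilon^-$, where $\mu_\varepsilon^\pm = \mu^\pm * \omega_\varepsilon$ are nonnegative measures. The condition $|\mu|(\partial\Omega) = 0$ implies $\mu^\pm(\partial\Omega) = 0$. From Lemma 2.3, we have

\[ \lim_{\varepsilon \to 0} \mu_\varepsilon^\pm(\Omega) = \mu^\pm(\Omega). \]

Hence

\[ \lim_{\varepsilon \to 0} \mu_\varepsilon(\Omega) = \lim_{\varepsilon \to 0} \mu_\varepsilon^+(\Omega) - \lim_{\varepsilon \to 0} \mu_\varepsilon^-(\Omega) = \mu^+(\Omega) - \mu^-(\Omega) = \mu(\Omega). \]

Let $A$ be open with $\Omega \Subset A \Subset D$. Then, for $\phi \in C(D)$ and $\delta > 0$, we may construct a partition of $\mathbb{R}^N$ by means of parallelepipeds:

\[ Q_\alpha = [x_1^{\alpha_1}, x_1^{\alpha_1 + 1}] \times \cdots \times [x_N^{\alpha_N}, x_N^{\alpha_N + 1}], \qquad \alpha = (\alpha_1, \cdots, \alpha_N) \in \mathbb{Z}^N, \]

such that

\[ \bigcup_{Q_\alpha \cap \Omega \ne \emptyset} Q_\alpha \subset A, \qquad |\mu|(\partial Q_\alpha \cap D) = 0, \quad \alpha \in \mathbb{Z}^N, \]

and

\[ |\phi(x) - \phi(y)| < \delta, \qquad x, y \in Q_\alpha. \]

Let $a_\alpha = \phi(x_\alpha^*)$ for some $x_\alpha^* \in Q_\alpha$ if $Q_\alpha \cap \Omega \ne \emptyset$, and $a_\alpha = 0$ otherwise. Then

\[ \langle \mu_\varepsilon, \phi\chi_\Omega \rangle = \Big\langle \mu_\varepsilon, \sum_\alpha a_\alpha \chi_{Q_\alpha \cap \Omega} \Big\rangle + \Big\langle \mu_\varepsilon, \Big( \phi - \sum_\alpha a_\alpha \chi_{Q_\alpha} \Big) \chi_\Omega \Big\rangle, \]

and

\[ \limsup_{\varepsilon \to 0} \langle \mu_\varepsilon, \phi\chi_\Omega \rangle \le \Big\langle \mu, \sum_\alpha a_\alpha \chi_{Q_\alpha \cap \Omega} \Big\rangle + \delta |\mu|(A) \le \langle \mu, \phi\chi_\Omega \rangle + 2\delta |\mu|(A). \]

Analogously,

\[ \liminf_{\varepsilon \to 0} \langle \mu_\varepsilon, \phi\chi_\Omega \rangle \ge \langle \mu, \phi\chi_\Omega \rangle - 2\delta |\mu|(A). \]

Since $\delta > 0$ is arbitrary, we complete the proof.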

We now discuss some concepts and facts about the extension of Lipschitz continuous functions defined on a closed set $C \subset \mathbb{R}^N$ to $\mathbb{R}^N$, following the theory set forth by Whitney [27] (see also [23]), which play an important role in Sect. 3. Let $k$ be a nonnegative integer and $\gamma \in (k, k + 1]$. We say that a function $f$, defined on $C$, belongs to $\operatorname{Lip}(\gamma, C)$ if there exist functions $f^{(j)}$, $0 \le |j| \le k$, defined on $C$, with $f^{(0)} = f$, such that, if

\[ f^{(j)}(x) = \sum_{|j + l| \le k} \frac{f^{(j+l)}(y)}{l!} (x - y)^l + R_j(x, y), \]

then

\[ |f^{(j)}(x)| \le M, \qquad |R_j(x, y)| \le M |x - y|^{\gamma - |j|}, \qquad \text{for any } x, y \in C,\ |j| \le k. \tag{2.1} \]

Here $j$ and $l$ denote multi-indices $j = (j_1, \cdots, j_N)$ and $l = (l_1, \cdots, l_N)$ with $j! = j_1! \cdots j_N!$, $|j| = j_1 + j_2 + \cdots + j_N$, and $x^l = x_1^{l_1} x_2^{l_2} \cdots x_N^{l_N}$. By an element of $\operatorname{Lip}(\gamma, C)$ we mean the collection $\{f^{(j)}(x)\}_{|j| \le k}$. The norm of an element in $\operatorname{Lip}(\gamma, C)$ is defined as the smallest $M$ for which inequality (2.1) holds. We notice that $\operatorname{Lip}(\gamma, C)$ with this norm is a Banach space. For the case $C = \mathbb{R}^N$, since the functions $f^{(j)}$ are determined by $f^{(0)}$, this collection is then identified with $f^{(0)}$.

The Whitney extension of order $k$ is defined as follows. Let $\{f^{(j)}\}_{|j| \le k}$ be an element of $\operatorname{Lip}(\gamma, C)$. The linear mapping $E_k : \operatorname{Lip}(\gamma, C) \to \operatorname{Lip}(\gamma, \mathbb{R}^N)$ assigns to any such collection a function $E_k(f^{(j)})$ defined on $\mathbb{R}^N$ which is an extension of $f^{(0)} = f$ to $\mathbb{R}^N$. The definition of $E_k$ is the following:

\[ E_0(f)(x) = f(x) \ \text{ for } x \in C, \qquad E_0(f)(x) = {\sum_i}' f(p_i) \varphi_i(x) \ \text{ for } x \in \mathbb{R}^N - C, \]

and, for $k > 0$,

\[ E_k(f^{(j)})(x) = f^{(0)}(x) \ \text{ for } x \in C, \qquad E_k(f^{(j)})(x) = {\sum_i}' P(x, p_i) \varphi_i(x) \ \text{ for } x \in \mathbb{R}^N - C. \]

Here $P(x, y)$ denotes the polynomial in $x$, which is the Taylor expansion of $f$ about the point $y \in C$:

\[ P(x, y) = \sum_{|l| \le k} \frac{f^{(l)}(y) (x - y)^l}{l!} \qquad \text{for } x \in \mathbb{R}^N,\ y \in C. \]

The functions $\varphi_i$ form a partition of unity of $\mathbb{R}^N - C$ with the following properties:

(i) $\operatorname{spt}(\varphi_i) \subset Q_i$, where $Q_i$ is a cube with edges parallel to the coordinate axes and $c_1 \operatorname{diam}(Q_i) \le \operatorname{dist}(Q_i, C) \le c_2 \operatorname{diam}(Q_i)$, for certain positive constants $c_1$ and $c_2$ independent of $C$;
(ii) each point of $\mathbb{R}^N - C$ is contained in at most $N_0$ cubes $Q_i$, for a certain number $N_0$ depending only on the dimension $N$;
(iii) the derivatives of $\varphi_i$ satisfy

\[ \big| \partial_{x_1}^{\alpha_1} \cdots \partial_{x_N}^{\alpha_N} \varphi_i(x) \big| \le A_\alpha (\operatorname{diam} Q_i)^{-|\alpha|} \qquad \text{for } x \in Q_i. \tag{2.2} \]

Here $p_i \in C$ is such that $\operatorname{dist}(Q_i, C) = \operatorname{dist}(p_i, Q_i)$, $|\alpha| = \alpha_1 + \cdots + \alpha_N$, and the symbol $\sum'$ indicates that the summation is taken only over those cubes whose distances to $C$ are not greater than one. The following theorem, whose proof can also be found in [23], is due to Whitney [27].

Theorem 2.1. Suppose that $k$ is a nonnegative integer, $\gamma \in (k, k + 1]$, and $C$ is a closed set. Then the mapping $E_k$ is a continuous linear mapping from $\operatorname{Lip}(\gamma, C)$ to $\operatorname{Lip}(\gamma, \mathbb{R}^N)$ which defines an extension of $f^{(0)}$ to $\mathbb{R}^N$, and the norm of this mapping has a bound independent of $C$.

We will need the following proposition, which seems to be of interest in itself and is useful in establishing the generalized Gauss-Green theorem in Sect. 3.

Proposition 2.2. Let $C$ be a closed set in $\mathbb{R}^N$ and

\[ C_\delta := \{ x \in \mathbb{R}^N : \operatorname{dist}(x, C) \le \delta \} \qquad \text{for } \delta > 0. \]

Let $E_k : \operatorname{Lip}(\gamma, C) \to \operatorname{Lip}(\gamma, \mathbb{R}^N)$ with $\gamma \in (k, k + 1]$ be the Whitney extension of order $k$. Then, for any $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$ and any $\gamma' \in (k, \gamma)$,

\[ \| E_k(\phi|C) - \phi \|_{\operatorname{Lip}(\gamma', C_\delta)} \to 0 \qquad \text{as } \delta \to 0. \tag{2.3} \]

Proof. We will prove the proposition in detail only for the cases $k = 0$ and $k = 1$ since the case $k > 1$ can then be obtained by induction. For $k = 0$, $E_0$ is given by $E_0(f)(x) = f(x)$ if $x \in C$, and

\[ E_0(f)(x) = \sum_{i=1}^{\infty} f(p_i) \varphi_i(x) \qquad \text{if } x \in \mathbb{R}^N - C, \tag{2.4} \]

where $p_i$ and $\varphi_i$ are as above. Now, for any $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$, we have

\[ E_0(\phi|C)(x) - \phi(x) = \sum_{i=1}^{\infty} (\phi(p_i) - \phi(x)) \varphi_i(x). \]

Clearly,

\[ \sup_{x \in C_\delta} |E_0(\phi|C)(x) - \phi(x)| \le c \sup_{\substack{y \in C,\ x \in C_\delta \\ |x - y| \le c_0 \delta}} |\phi(y) - \phi(x)| \to 0, \qquad \text{as } \delta \to 0. \tag{2.5} \]

Here and in what follows in this proof $c_0$ and $c$ are positive constants which are independent of $\delta > 0$ and $C$, whose values may change at each appearance.

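Set $g = E_0(\phi|C) - \phi$. We now show that

\[ |g(x) - g(y)| \le M(\delta) |x - y|^{\gamma'} \qquad \text{for } x, y \in C_\delta, \tag{2.6} \]

where $M(\delta) \to 0$ as $\delta \to 0$. Indeed,

\[ g(x) - g(y) = \sum_{x \in Q_i} (\phi(p_i) - \phi(x)) \varphi_i(x) - \sum_{y \in Q_i} (\phi(p_i) - \phi(y)) \varphi_i(y). \]

We split each of these sums into two, respectively:

\[ \sum_{x \in Q_i} = \sum_{\substack{x \in Q_i \\ y \in Q_i}} + \sum_{\substack{x \in Q_i \\ y \notin Q_i}}, \qquad \sum_{y \in Q_i} = \sum_{\substack{x \in Q_i \\ y \in Q_i}} + \sum_{\substack{y \in Q_i \\ x \notin Q_i}}, \]

and denote

\[ {\sum}^{0} = \sum_{\substack{x \in Q_i \\ y \in Q_i}}, \qquad {\sum}^{1} = \sum_{\substack{x \in Q_i \\ y \notin Q_i}}, \qquad {\sum}^{2} = \sum_{\substack{y \in Q_i \\ x \notin Q_i}}. \]

We have

\[ {\sum}^{0} (\phi(p_i) - \phi(x)) \varphi_i(x) - {\sum}^{0} (\phi(p_i) - \phi(y)) \varphi_i(y) = {\sum}^{0} (\phi(y) - \phi(x)) \varphi_i(x) + {\sum}^{0} (\phi(p_i) - \phi(y)) (\varphi_i(x) - \varphi_i(y)). \]

Since, if $x, y \in Q_i \cap C_\delta$, $|x - y| \le c_0 \delta$, for a given constant $c_0 > 0$, we obtain, for the first sum,

\[ \Big| {\sum}^{0} (\phi(x) - \phi(y)) \varphi_i(x) \Big| \le c\, \delta^{\gamma - \gamma'} |x - y|^{\gamma'}, \]

and, for the second sum,

\[ \Big| {\sum}^{0} (\phi(p_i) - \phi(y)) (\varphi_i(x) - \varphi_i(y)) \Big| \le c\, {\sum}^{0} |p_i - y|^{\gamma} (\operatorname{diam} Q_i)^{-1} |x - y| \le c\, |x - y|^{\gamma} \le c\, \delta^{\gamma - \gamma'} |x - y|^{\gamma'}. \]

Now, for ${\sum}^{1}$, we have

\[ {\sum}^{1} (\phi(p_i) - \phi(x)) \varphi_i(x) = {\sum}^{1} (\phi(p_i) - \phi(x)) (\varphi_i(x) - \varphi_i(q_i)), \]

with $q_i = \partial Q_i \cap [x, y]$, where $[x, y]$ is the straight line segment connecting $x$ to $y$. Therefore,

\[ {\sum}^{1} |\phi(p_i) - \phi(x)| \, |\varphi_i(x) - \varphi_i(q_i)| \le c\, {\sum}^{1} |p_i - x|^{\gamma} (\operatorname{diam} Q_i)^{-1} |x - q_i| \le c\, {\sum}^{1} |x - q_i|^{\gamma} \le c\, \delta^{\gamma - \gamma'} |x - y|^{\gamma'}. \]

The sum ${\sum}^{2}$ is treated similarly. We have then proved (2.6) which, together with (2.5), gives (2.3) for $k = 0$.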

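For $k = 1$, we have $E_1(f)(x) = f(x)$ for $x \in C$, and

\[ E_1(f)(x) = \sum_{i=1}^{\infty} P_f(x, p_i) \varphi_i(x) \qquad \text{for } x \in \mathbb{R}^N - C, \]

where

\[ P_f(x, y) = f(y) + \sum_{j=1}^{N} \partial_{x_j} f(y) (x_j - y_j), \]

and the functions $\varphi_i$, conveniently renumbered together with $Q_i$ containing $\operatorname{spt}(\varphi_i)$, satisfy $\operatorname{dist}(Q_i, C) \le 1$. Since $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$, $\gamma > 1$, we clearly have

\[ (\phi|C)^{(j)}(y) = \partial_{x_j} \phi(y) \qquad \text{for } y \in C. \]

Hence

\[ E_1(\phi|C)(x) - \phi(x) = \sum_{i=1}^{\infty} (P_\phi(x, p_i) - \phi(x)) \varphi_i(x). \]

Setting $g_1 = E_1(\phi|C) - \phi$, we have

\[ |g_1(x)| \le \sum_{i=1}^{\infty} |P_\phi(x, p_i) - \phi(x)| \, |\varphi_i(x)| \le c \sum_{x \in Q_i} |x - p_i|^{\gamma} |\varphi_i(x)| \le c\, \delta^{\gamma} \to 0 \qquad \text{as } \delta \to 0. \]

Also,

\[ \partial_{x_k} g_1(x) = \sum_{x \in Q_i} \partial_{x_k} (P_\phi(x, p_i) - \phi(x)) \varphi_i(x) + \sum_{x \in Q_i} (P_\phi(x, p_i) - \phi(x)) \partial_{x_k} \varphi_i(x) = h_1(x) + h_2(x), \]

where

\[ h_1(x) := \sum_{x \in Q_i} (\partial_{x_k} \phi(p_i) - \partial_{x_k} \phi(x)) \varphi_i(x), \qquad h_2(x) := \sum_{x \in Q_i} (P_\phi(x, p_i) - \phi(x)) \partial_{x_k} \varphi_i(x). \]

Hence,

\[ |\partial_{x_k} g_1(x)| \le c \sum_{x \in Q_i} |p_i - x|^{\gamma - 1} + c \sum_{x \in Q_i} |x - p_i|^{\gamma} \operatorname{diam}(Q_i)^{-1} \le c\, \delta^{\gamma - 1} \to 0 \qquad \text{as } \delta \to 0. \]

We now show

\[ |\partial_{x_k} g_1(x) - \partial_{x_k} g_1(y)| \le M(\delta) |x - y|^{\gamma' - 1}, \]

with $M(\delta) \to 0$ as $\delta \to 0$. First we obtain

\[ |h_1(x) - h_1(y)| \le M(\delta) |x - y|^{\gamma' - 1}, \qquad \text{with } M(\delta) \to 0 \text{ as } \delta \to 0, \]

exactly as in the case $k = 0$. Now,

\[ h_2(x) - h_2(y) = \sum_{x \in Q_i} (P_\phi(x, p_i) - \phi(x)) \partial_{x_k} \varphi_i(x) - \sum_{y \in Q_i} (P_\phi(y, p_i) - \phi(y)) \partial_{x_k} \varphi_i(y) = \sum_{x \in Q_i} R(x, p_i) \partial_{x_k} \varphi_i(x) - \sum_{y \in Q_i} R(y, p_i) \partial_{x_k} \varphi_i(y), \]

where we set $R(x, y) := P_\phi(x, y) - \phi(x)$. Again, we split each of the last two sums above into two, respectively:

\[ \sum_{x \in Q_i} = {\sum}^{0} + {\sum}^{1}, \qquad \sum_{y \in Q_i} = {\sum}^{0} + {\sum}^{2}, \]

as in the first part of the proof, and compute

\[ \Big| {\sum}^{0} R(x, p_i) \partial_{x_k} \varphi_i(x) - {\sum}^{0} R(y, p_i) \partial_{x_k} \varphi_i(y) \Big| \le \Big| {\sum}^{0} (R(x, p_i) - R(y, p_i)) \partial_{x_k} \varphi_i(x) \Big| + \Big| {\sum}^{0} R(y, p_i) (\partial_{x_k} \varphi_i(y) - \partial_{x_k} \varphi_i(x)) \Big| \]
\[ \le c\, {\sum}^{0} |x - y|^{\gamma} (\operatorname{diam} Q_i)^{-1} + c\, {\sum}^{0} |y - p_i|^{\gamma} (\operatorname{diam} Q_i)^{-2} |x - y| \le c\, \delta^{\gamma - \gamma'} |x - y|^{\gamma' - 1}. \]

For the remaining sums, we have

\[ \Big| {\sum}^{1} R(x, p_i) \partial_{x_k} \varphi_i(x) \Big| = \Big| {\sum}^{1} R(x, p_i) (\partial_{x_k} \varphi_i(x) - \partial_{x_k} \varphi_i(q_i)) \Big| \le c\, {\sum}^{1} |x - p_i|^{\gamma} (\operatorname{diam} Q_i)^{-2} |x - q_i| \le c\, \delta^{\gamma - \gamma'} |x - y|^{\gamma' - 1}, \]

where $q_i$ is as in the first part of the proof; and the sum ${\sum}^{2} R(y, p_i) \partial_{x_k} \varphi_i(y)$ is treated similarly. This concludes the proof in the case $k = 1$. As indicated above, the case $k > 1$ follows similarly by induction.

3. Normal Traces and Generalized Gauss-Green Theorem

In this section, we prove our main results concerning extended DM-fields, including a new notion of normal traces, a generalized Gauss-Green theorem, and a product rule for DM-fields. We begin with the definition of deformable Lipschitz boundaries.

Definition 3.1. Let $\Omega \subset \mathbb{R}^N$ be an open bounded subset. We say that $\partial\Omega$ is a deformable Lipschitz boundary, provided that
(i) $\forall\, x \in \partial\Omega$, $\exists\, r > 0$ and a Lipschitz map $\gamma : \mathbb{R}^{N-1} \to \mathbb{R}$ such that, after rotating and relabeling coordinates if necessary,
\[ \Omega \cap Q(x, r) = \{ y \in \mathbb{R}^N : \gamma(y_1, \cdots, y_{N-1}) < y_N \} \cap Q(x, r), \]
where $Q(x, r) = \{ y \in \mathbb{R}^N : |x_i - y_i| \le r,\ i = 1, \cdots, N \}$;
(ii) $\exists\, \Psi : \partial\Omega \times [0, 1] \to \bar{\Omega}$ such that $\Psi$ is a homeomorphism bi-Lipschitz over its image and $\Psi(\omega, 0) = \omega$ for all $\omega \in \partial\Omega$. The map $\Psi$ is called a Lipschitz deformation of the boundary $\partial\Omega$.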

The following lemma is a direct corollary of the boundedness of $F$ and $\operatorname{div} F$ over $\Omega$ as Radon measures. Since the theory of $\mathcal{DM}^\infty$-fields has been addressed in [5], henceforth we focus on $\mathcal{DM}^*$-fields, where $*$ stands for either $p \in [1, \infty)$ or ext.

Lemma 3.1. Let $F \in \mathcal{DM}^*(\Omega)$ with $\Omega$ an open set whose boundary $\partial\Omega$ has a Lipschitz deformation $\Psi$ with $\partial\Omega_s = \Psi_s(\partial\Omega)$, $\Psi_s := \Psi(\cdot, s)$, $s \in [0, 1]$. Then there exists a countable set $T \subset (0, 1)$ such that

\[ |F|(\partial\Omega_s) = |\operatorname{div} F|(\partial\Omega_s) = 0 \qquad \text{for any } s \in (0, 1) - T. \]
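The countability of the exceptional set in Lemma 3.1 follows from a standard counting argument, sketched here for completeness. The symbols $m$ and $T_n$ are introduced only for this sketch, and the sketch assumes that the surfaces $\partial\Omega_s$, $s \in (0, 1)$, lie in $\Omega$; they are pairwise disjoint because the deformation is injective.

```latex
% m is the finite nonnegative measure m := |F| + |div F| on \Omega.
\[
  T_n := \Big\{ s \in (0,1) : m(\partial\Omega_s) > \tfrac{1}{n} \Big\},
  \qquad
  T := \bigcup_{n \ge 1} T_n .
\]
% For distinct s_1, ..., s_k in T_n the surfaces are pairwise disjoint, hence
\[
  \frac{k}{n} < \sum_{i=1}^{k} m(\partial\Omega_{s_i})
  = m\Big( \bigcup_{i=1}^{k} \partial\Omega_{s_i} \Big)
  \le m(\Omega) < \infty ,
\]
% so each T_n has at most n m(\Omega) elements and T is countable;
% for s outside T, m(\partial\Omega_s) = 0, i.e. |F| and |div F| both vanish there.
```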

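We now establish the generalized Gauss-Green theorem for $\mathcal{DM}^*$-fields, by introducing a suitable definition of normal traces over the boundary $\partial\Omega$ of a bounded open set with Lipschitz deformable boundary.

Theorem 3.1 (Normal Traces and Generalized Gauss-Green Theorem). Let $\Omega \subset \mathbb{R}^N$ be a bounded open set with Lipschitz deformable boundary, and assume $F \in \mathcal{DM}^*(\Omega)$. Then there exists a continuous linear functional $F \cdot \nu|_{\partial\Omega}$ over $\operatorname{Lip}(\gamma, \partial\Omega)$, $\gamma > 1$, such that, for any $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$,

\[ \langle F \cdot \nu|_{\partial\Omega}, \phi \rangle = \int_\Omega \phi \, \operatorname{div} F + \int_\Omega \nabla\phi \cdot F. \tag{3.1} \]

Let $h : \mathbb{R}^N \to \mathbb{R}$ be the level set function of $\partial\Omega_s$, that is,

\[ h(x) := \begin{cases} 0, & \text{for } x \in \mathbb{R}^N - \bar{\Omega}, \\ 1, & \text{for } x \in \Omega - \Psi(\partial\Omega \times [0, 1]), \\ s, & \text{for } x \in \partial\Omega_s,\ 0 \le s \le 1. \end{cases} \]

If $F \in \mathcal{DM}^p(\Omega)$, $1 \le p < \infty$, then

\[ \langle F \cdot \nu|_{\partial\Omega}, \psi \rangle = -\lim_{s \to 0} \frac{1}{s} \int_{\Psi(\partial\Omega \times (0, s))} E(\psi) \, \nabla h \cdot F \, dx = \lim_{s \to 0} \frac{1}{s} \int_{\Psi(\partial\Omega \times (0, s))} E(\psi) \, |\nabla h| \, F \cdot \nu \, dx \tag{3.2} \]

for any $\psi \in \operatorname{Lip}(\partial\Omega)$, where $E(\psi)$ is any Lipschitz extension of $\psi$ to all $\mathbb{R}^N$ and $\nu : \Psi(\partial\Omega \times [0, 1]) \to \mathbb{R}^N$ is such that $\nu(x)$ is the unit outer normal vector to $\partial\Omega_s$ at $x \in \partial\Omega_s$, defined for a.e. $x \in \Psi(\partial\Omega \times [0, 1])$; the second equality uses $\nabla h = -|\nabla h| \, \nu$, since $h$ increases toward the interior of $\Omega$. Formula (3.2) also holds if $F \in \mathcal{DM}^{ext}(\Omega)$, for any $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, $\gamma > 1$, and $E(\psi) \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$ any extension of $\psi$ to $\mathbb{R}^N$, provided that the set of non-Lebesgue points of $\nabla h(x)$ on $\Psi(\partial\Omega \times (0, 1))$ has $|F|$-measure zero. Finally, for $F \in \mathcal{DM}^p(\Omega)$ with $1 < p < \infty$, $F \cdot \nu|_{\partial\Omega}$ can be extended to a continuous linear functional over $W^{1 - 1/p,\, p}(\partial\Omega) \cap C(\partial\Omega)$.

Proof. We divide the proof into four steps.

Step 1. We first treat the more general case $F \in \mathcal{DM}^{ext}(\Omega)$. For $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, let $E(\psi)$ be the Whitney extension of $\psi$ of order $k = 1$. Then, by the classical Gauss-Green formula, we have

\[ \int_{\partial\Omega} F^\varepsilon \cdot \nu \, \psi \, d\mathcal{H}^{N-1} = \int_\Omega E(\psi) \, \operatorname{div} F^\varepsilon + \int_\Omega \nabla E(\psi) \cdot F^\varepsilon, \tag{3.3} \]

where $F^\varepsilon = F * \omega_\varepsilon$ with the standard mollifying sequence $\omega_\varepsilon$.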

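To begin with, we first focus on $F \in \mathcal{DM}^{ext}(D)$, for $D \Subset \mathbb{R}^N$, satisfying

\[ |F|(\partial\Omega) = |\operatorname{div} F|(\partial\Omega) = 0. \tag{3.4} \]

The right-hand side of (3.3) defines a uniformly bounded family of continuous linear operators $l^\varepsilon$ over $\operatorname{Lip}(\gamma, \partial\Omega)$. Moreover, for each $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, the limit $\lim_{\varepsilon \to 0} l^\varepsilon(\psi)$ exists and equals

\[ \int_\Omega E(\psi) \, \operatorname{div} F + \int_\Omega \nabla E(\psi) \cdot F, \]

as a consequence of (3.4) and Proposition 2.1. Hence, we may define

\[ \langle F \cdot \nu|_{\partial\Omega}, \psi \rangle = \lim_{\varepsilon \to 0} l^\varepsilon(\psi) \qquad \text{for any } \psi \in \operatorname{Lip}(\gamma, \partial\Omega). \]

Now, for any $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$, we let $\varepsilon \to 0$ in the Gauss-Green formula:

\[ \int_{\partial\Omega} F^\varepsilon \cdot \nu \, \phi \, d\mathcal{H}^{N-1} = \int_\Omega \phi \, \operatorname{div} F^\varepsilon + \int_\Omega \nabla\phi \cdot F^\varepsilon, \]

and obtain the identity (3.1).

For the general case that $F \in \mathcal{DM}^{ext}(\Omega)$ without the assumption (3.4), we consider a Lipschitz deformation of $\partial\Omega$, $\Psi : \partial\Omega \times [0, 1] \to \bar{\Omega}$. Let $s \in (0, 1)$ be such that

\[ |F|(\partial\Omega_s) = |\operatorname{div} F|(\partial\Omega_s) = 0. \tag{3.5} \]

Since we have $\mathcal{H}^{N-1}(\partial\Omega_s) < +\infty$, Federer's extension of the Gauss-Green theorem (see [13]) holds for $\phi F^\varepsilon$ over $\Omega_s$. Thus, we know from the previous analysis that $F \cdot \nu|_{\partial\Omega_s}$ is defined as a continuous linear functional over $\operatorname{Lip}(\gamma, \partial\Omega_s)$, whose norm is bounded, independent of $s \in (0, 1)$. Now, for $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, we have

\[ \langle F \cdot \nu|_{\partial\Omega_s}, (E(\psi)|\partial\Omega_s) \rangle = \int_{\Omega_s} E(\psi) \, \operatorname{div} F + \int_{\Omega_s} \nabla E(\psi) \cdot F. \tag{3.6} \]

Again, the right-hand side of (3.6) defines a uniformly bounded family of continuous linear functionals $l^s$ over $\operatorname{Lip}(\gamma, \partial\Omega)$, for $s \in (0, 1) - T$, where $T$ is defined in Lemma 3.1. Furthermore, $\lim_{s \to 0} l^s(\psi)$ exists for any $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, as a consequence of the Dominated Convergence Theorem applied to both integrals on the right-hand side of (3.6). Hence, we may define

\[ \langle F \cdot \nu|_{\partial\Omega}, \psi \rangle = \lim_{s \to 0} l^s(\psi), \]

which is then a continuous linear functional over $\operatorname{Lip}(\gamma, \partial\Omega)$. Finally, for any $\phi \in \operatorname{Lip}(\gamma, \mathbb{R}^N)$, we obtain (3.1) by taking the limit as $s \to 0$ in the formula:

\[ \langle F \cdot \nu|_{\partial\Omega_s}, (\phi|\partial\Omega_s) \rangle = \int_{\Omega_s} \phi \, \operatorname{div} F + \int_{\Omega_s} \nabla\phi \cdot F, \]

observing that

\[ \big| \langle F \cdot \nu|_{\partial\Omega_s}, (\phi|\partial\Omega_s) - (E(\phi|\partial\Omega)|\partial\Omega_s) \rangle \big| \le c\, \| (\phi|\partial\Omega_s) - (E(\phi|\partial\Omega)|\partial\Omega_s) \|_{\operatorname{Lip}(\gamma', \partial\Omega_s)} \le c\, \| \phi - E(\phi|\partial\Omega) \|_{\operatorname{Lip}(\gamma', \Psi(\partial\Omega \times [0, s]))} \to 0, \qquad \text{as } s \to 0, \]

for $1 < \gamma' < \gamma$, as a consequence of Proposition 2.2.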

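Step 2. We now consider the more regular case that $F \in \mathcal{DM}^p(\Omega)$, $1 \le p < \infty$. Let $F^\varepsilon$ be as above. Again, for any $\phi \in \operatorname{Lip}(\mathbb{R}^N)$, we have

\[ \int_{\partial\Omega_s} \phi F^\varepsilon \cdot \nu \, d\mathcal{H}^{N-1} = \int_{\Omega_s} \phi \, \operatorname{div} F^\varepsilon \, dx + \int_{\Omega_s} \nabla\phi \cdot F^\varepsilon \, dx. \tag{3.7} \]

Now we integrate (3.7) in $s \in (0, \delta)$, $0 < \delta < 1$, and use the coarea formula (see, e.g. [12, 13]) in the left-hand side to obtain

\[ -\int_{\Psi(\partial\Omega \times (0, \delta))} \phi F^\varepsilon \cdot \nabla h \, dx = \int_0^\delta \Big( \int_{\Omega_s} \phi \, \operatorname{div} F^\varepsilon \, dx \Big) ds + \int_0^\delta \Big( \int_{\Omega_s} \nabla\phi \cdot F^\varepsilon \, dx \Big) ds. \tag{3.8} \]

Let $\varepsilon \to 0$. Observing that, by Proposition 2.1, the integrand of the first integral converges for a.e. $s \in (0, \delta)$ to the corresponding integral for $F$, we obtain

\[ -\int_{\Psi(\partial\Omega \times (0, \delta))} \phi F \cdot \nabla h \, dx = \int_0^\delta \Big( \int_{\Omega_s} \phi \, \operatorname{div} F \Big) ds + \int_0^\delta \Big( \int_{\Omega_s} \nabla\phi \cdot F \, dx \Big) ds. \tag{3.9} \]

We then divide (3.9) by $\delta$, let $\delta \to 0$, and observe that both terms in the right-hand side converge to the corresponding integrals inside the brackets over $\Omega$, by the dominated convergence theorem. Hence, the left-hand side also converges, which yields

\[ -\lim_{\delta \to 0} \frac{1}{\delta} \int_{\Psi(\partial\Omega \times (0, \delta))} \phi \, F \cdot \nabla h \, dx = \int_\Omega \phi \, \operatorname{div} F + \int_\Omega \nabla\phi \cdot F \, dx. \tag{3.10} \]

Now, for $\psi \in \operatorname{Lip}(\partial\Omega)$, let $E(\psi) \in \operatorname{Lip}(\mathbb{R}^N)$ be a Lipschitz extension of $\psi$ preserving the norm $\| \cdot \|_{\operatorname{Lip}} := \| \cdot \|_\infty + \operatorname{Lip}(\cdot)$ (see, e.g., [12, 13]). We then define

\[ \langle F \cdot \nu|_{\partial\Omega}, \psi \rangle = -\lim_{s \to 0} \frac{1}{s} \int_{\Psi(\partial\Omega \times (0, s))} E(\psi) \, \nabla h \cdot F \, dx. \tag{3.11} \]

Because the right-hand side of (3.10) is independent of the particular deformation for $\partial\Omega$, we see that the normal trace defined by (3.11) is also independent of the deformation. We still have to prove that the normal trace as defined by (3.11) is also independent of the specific Lipschitz extension $E(\psi)$ of $\psi$. This will be accomplished if we prove that the right-hand side of (3.10) vanishes for $\phi|\partial\Omega \equiv 0$. Denote it by $[F, \phi]_{\partial\Omega}$, that is,

\[ [F, \phi]_{\partial\Omega} := \langle \operatorname{div} F, \phi \rangle + \langle F, \nabla\phi \rangle. \]

We claim that $[F, \phi]_{\partial\Omega} = 0$ if $\phi|\partial\Omega \equiv 0$. In fact, we may approximate such $\phi$ by a sequence $\phi^j \in C_0^\infty(\Omega)$, with $\|\phi^j\|_\infty \le \|\phi\|_\infty$, such that $\phi^j \to \phi$ locally uniformly in $\Omega$ and $\nabla\phi^j \to \nabla\phi$ in $L^q(\Omega)^N$, for $p > 1$, $\nabla\phi^j \overset{*}{\rightharpoonup} \nabla\phi$ in $L^\infty(\Omega)^N$, for $p = 1$, with $\frac{1}{p} + \frac{1}{q} = 1$. Hence, $[F, \phi]_{\partial\Omega} = \lim_{j \to \infty} [F, \phi^j] = 0$, as asserted. In particular, for $1 \le p \le \infty$, the values of the normal trace, $\langle F \cdot \nu|_{\partial\Omega}, \phi|\partial\Omega \rangle$, depend only on the values of $\phi$ over $\partial\Omega$.

Step 3. The fact that formula (3.2) also holds if $F \in \mathcal{DM}^{ext}(\Omega)$, for any $\psi \in \operatorname{Lip}(\gamma, \partial\Omega)$, $\gamma > 1$, provided that the set of non-Lebesgue points of $\nabla h(x)$ on $\Psi(\partial\Omega \times (0, 1))$ has $|F|$-measure zero, is clear from the above proof for $\mathcal{DM}^p$-fields, $1 \le p \le \infty$.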
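The layer formula (3.10)–(3.11) can be tested on a smooth example where everything is computable by hand. The sketch below uses the unit disk with the deformation $\Psi(\omega, s) = (1 - s/2)\,\omega$, whose level set function is $h(x) = 2(1 - |x|)$ on $1/2 \le |x| \le 1$, together with the field $F = (x^2, y)$ and the test function $\phi = 1 + x$. The domain, the deformation, the field, and the test function are hypothetical choices made for this illustration; for them, $\int_\Omega \phi \operatorname{div} F + \int_\Omega \nabla\phi \cdot F = 7\pi/4$.

```python
import math


def F(x, y):
    """A smooth test field (hypothetical choice): F = (x^2, y)."""
    return x * x, y


def phi(x, y):
    """A smooth test function (hypothetical choice): phi = 1 + x."""
    return 1.0 + x


def grad_h(x, y):
    """Gradient of h(x) = 2 (1 - |x|), the level set function of
    Psi(omega, s) = (1 - s/2) omega on the unit disk."""
    r = math.hypot(x, y)
    return -2.0 * x / r, -2.0 * y / r


def layer_average(s, n_r=50, n_theta=2000):
    """-(1/s) * integral of phi * grad(h) . F over {1 - s/2 < |x| < 1},
    computed with the midpoint rule in polar coordinates."""
    r0 = 1.0 - s / 2.0
    dr = (1.0 - r0) / n_r
    dt = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = r0 + (i + 0.5) * dr
        for j in range(n_theta):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            fx, fy = F(x, y)
            hx, hy = grad_h(x, y)
            total += phi(x, y) * (hx * fx + hy * fy) * r * dr * dt
    return -total / s


exact = 7.0 * math.pi / 4.0  # integral of phi div F + grad(phi) . F over the disk
for s in (0.2, 0.05, 0.0125):
    print(s, layer_average(s), exact)
```

The printed averages approach the exact value linearly in $s$, which is the behavior expected of a one-sided difference quotient; the point of the paper's formula is that the same limit still makes sense when $F$ is only a measure.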

Divergence-Measure Fields and the Euler Equations

265

Step 4. As for the last assertion, we recall a well-known result of Gagliardo [14] which indicates, in particular, that, if ∂ is Lipschitz (that is, satisfies (i) of Definition 3.1) and ψ ∈ W 1−1/p, p (∂), then it can be extended into to a function E(ψ) ∈ W 1,p (), and E(ψ)W 1,p () ≤ cψW 1−1/p, p (∂) ,

(3.12)

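for some positive constant $c$ independent of $\psi$. From the definition of $E(\psi)$ given in [14], it is easy to verify that, when $\psi \in C(\partial\Omega)$, $E(\psi) \in C(\bar\Omega)$ and $\|E(\psi)\|_{L^\infty(\Omega)} \le \|\psi\|_{L^\infty(\partial\Omega)}$. Hence, using these facts and (3.10), we easily deduce the last assertion.

Remark 3.1. When $F \in DM^\infty(D)$, the normal trace $F\cdot\nu|_{\partial\Omega}$ is in fact in $L^\infty(\partial\Omega, \mathcal{H}^{N-1})$ as the weak-star limit of $(F\cdot\nu_s)\circ\Psi_s$ in $L^\infty(\partial\Omega, \mathcal{H}^{N-1})$ for any Lipschitz deformation $\Psi_s$:
$$F\cdot\nu|_{\partial\Omega} = w^*\text{-}\lim\,(F\cdot\nu_s)\circ\Psi_s \quad\text{in } L^\infty(\partial\Omega, \mathcal{H}^{N-1}),$$
which is independent of $\Psi_s$; and the weak-star topology for this limit is optimal to define $F\cdot\nu|_{\partial\Omega}$ in general (see Chen-Frid [5]). However, for $F \in DM^*(D)$, the normal traces $F\cdot\nu|_{\partial\Omega}$ may no longer be functions in general. This can be seen in Example 1.1 for $F \in DM^1_{loc}(\mathbb{R}^2)$ with $\Omega = (0, 1)\times(0, 1)$, for which
$$F\cdot\nu|_{\partial\Omega} = \frac{\pi}{2}\,\delta_{(0,0)} - d(\mathcal{H}^1\llcorner\partial\Omega),$$
where $\mathcal{H}^1\llcorner\partial\Omega$ is the one-dimensional Hausdorff measure on $\partial\Omega$.

Remark 3.2. As mentioned in the proof of Theorem 3.1, if $F \in DM^p(\Omega)$, for $1 \le p \le \infty$, the values of the normal trace, $\langle F\cdot\nu|_{\partial\Omega}, \phi|_{\partial\Omega}\rangle$, depend only on the values of $\phi$ over $\partial\Omega$. In contrast, for $F \in DM^{ext}(\Omega)$, the values of $\langle F\cdot\nu|_{\partial\Omega}, \phi|_{\partial\Omega}\rangle$ also depend, in principle, on the values of the first derivatives of $\phi$ over $\partial\Omega$, since $\phi|_{\partial\Omega}$ must be viewed as an element of $\mathrm{Lip}(\gamma, \partial\Omega)$, for some $\gamma > 1$.

Finally, we establish the following useful product rule.

Theorem 3.2 (Product Rule). Let $F = (F_1, \cdots, F_N) \in DM^*(D)$. Let $g \in C(D)$ be such that $\partial_{x_j} g(x)$ is $|F_j|$-integrable, for each $j = 1, \cdots, N$, and the set of non-Lebesgue points of $\partial_{x_j} g(x)$ has $|F_j|$-measure zero. Then $gF \in DM^*(D)$ and
$$\operatorname{div}(gF) = g\,\operatorname{div}F + \nabla g\cdot F. \tag{3.13}$$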
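As a quick sanity check of the product rule (3.13), the following one-dimensional sketch may help. The choices $N = 1$, $D = (-1, 1)$, $F = H$ (the Heaviside function) and $g(x) = \cos x$ are illustrative assumptions and do not come from the text.

```latex
% Illustrative assumptions: N = 1, D = (-1,1), F = H (Heaviside), g(x) = \cos x.
\begin{align*}
  F(x) &= H(x) \in L^\infty(D), & \operatorname{div} F &= F' = \delta_0 \in \mathcal{M}(D),\\
  g(x) &= \cos x, & \nabla g \cdot F &= -\sin x \, H(x).
\end{align*}
% Direct computation for \phi \in C_0^1(D), integrating by parts on (0,1) with \phi(1) = 0:
\begin{align*}
  \langle (gF)', \phi \rangle
    = -\int_0^1 \cos x \, \phi'(x)\, dx
    = \cos(0)\,\phi(0) - \int_0^1 \sin x \, \phi(x)\, dx .
\end{align*}
% Hence (gF)' = g(0)\,\delta_0 - \sin x\, H(x) = g \operatorname{div} F + \nabla g \cdot F,
% which is (3.13) in this special case.
```

Here $g'$ is bounded and continuous, so the integrability and Lebesgue-point hypotheses of the theorem hold trivially.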

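Proof. For any $\phi \in C_0^1(D)$, we have
$$\langle \operatorname{div}(gF), \phi\rangle = -\langle F, g\nabla\phi\rangle = -\langle F, \nabla(g\phi)\rangle + \langle F, \phi\nabla g\rangle. \tag{3.14}$$
Therefore, it suffices to show that
$$\langle F, \nabla(g\phi)\rangle = -\langle \operatorname{div}F, g\phi\rangle. \tag{3.15}$$
Set $g^\varepsilon = g * \omega_\varepsilon$. We have
$$-\langle \operatorname{div}F, g^\varepsilon\phi\rangle = \langle F, \nabla(g^\varepsilon\phi)\rangle = \sum_{j=1}^N \langle F_j, \partial_{x_j}(g^\varepsilon\phi)\rangle = \sum_{j=1}^N \{\langle F_j, \phi\,\partial_{x_j}g^\varepsilon\rangle + \langle F_j, g^\varepsilon\,\partial_{x_j}\phi\rangle\}. \tag{3.16}$$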


Let $\varepsilon \to 0$. Then the right-hand side converges to
$$\sum_{j=1}^N \{\langle F_j, \phi\,\partial_{x_j}g\rangle + \langle F_j, g\,\partial_{x_j}\phi\rangle\} = \langle F, \nabla(g\phi)\rangle,$$
by the assumption on the set of non-Lebesgue points of $\partial_{x_j}g$, while the left-hand side of (3.16) converges to $-\langle \operatorname{div}F, g\phi\rangle$ by the Dominated Convergence Theorem, which implies (3.15). Then (3.13) follows.

Remark 3.3. The continuity assumption of $g(x)$ in Theorem 3.2 can be relaxed. In particular, when $F \in DM^\infty(D)$, it requires only that $g \in BV(D; \mathbb{R})$ to have $gF \in DM^\infty(D)$.

Remark 3.4. The results in this section for the extended vector fields over $\mathbb{R}^N$ extend to a general context over Riemannian manifolds. In particular, if $a_{ij}(x)$ are smooth functions on the open set $D \subset \mathbb{R}^N$ and the $N\times N$ matrix $(a_{ij})(x)$ is symmetric, positive definite, the results can easily be generalized to the extended vector fields $F = (F_1, \cdots, F_N)$ in $L^p(D; \mathbb{R}^N)$, $1 \le p \le \infty$, or $\mathcal{M}(D; \mathbb{R}^N)$, satisfying the condition:
$$\sum_{i,j} a_{ij}\,\partial_i F_j \in \mathcal{M}(D).$$

This is clear from the fact that no specific property of the Euclidean metric has been used in our analysis.

4. Applications to the Euler Equations for Gas Dynamics

In this section, as a direct application of the theory developed in Sect. 3, we establish the uniqueness and stability of Riemann solutions that may contain vacuum for the Euler equations for gas dynamics in Lagrangian coordinates. Denote $\mathbb{R}^2_+ = (0, \infty)\times\mathbb{R}$ and $\overline{\mathbb{R}^2_+} = [0, \infty)\times\mathbb{R}$. We consider $\tau \in \mathcal{M}_+(\mathbb{R}^2_+)$ satisfying $\tau \ge c\,\mathcal{L}^2$ for some $c > 0$, where $\mathcal{L}^k$ is the $k$-dimensional Lebesgue measure. Let $v \in L^\infty(\mathbb{R}^2_+)$ and $\tau_0 \in \mathcal{M}_+(\mathbb{R})$ with $\tau_0 \ge c\,\mathcal{L}^1$. We assume that $\tau$, $v$, and $\tau_0$ satisfy
$$\iint_{\mathbb{R}^2_+} (\phi_t\,\tau - v\,\phi_x\,dt\,dx) + \int_{\mathbb{R}} \phi(0, x)\,\tau_0(x) = 0, \tag{4.1}$$
for any $\phi \in C_0^1(\mathbb{R}^2)$.

Definition 4.1. Let $\tau$ and $\tau_0$ be as above. We say that a function $\phi(t, x)$ defined on $\mathbb{R}^2_+$ is a $\tau$-test function if it satisfies the following:
1. $\operatorname{spt}(\phi)$ is a compact subset of $\overline{\mathbb{R}^2_+}$ and $\phi$ is continuous on $\mathbb{R}^2_+$;
2. $\phi_t$ and $\phi_x$ are $\tau$-measurable; and $\phi_t$ is $\tau$-integrable over $\mathbb{R}^2_+$, that is, the integrals $\iint_{\mathbb{R}^2_+} (\phi_t)^\pm\,\tau$ exist and at least one of them is finite;
3. $\lim_{t\to0,\,x\to a} \phi(t, x) = \phi(0, a)$ for $\tau_0$-a.e. $a \in \mathbb{R}$.


Theorem 4.1. Let $\tau$, $v$, and $\tau_0$ be as above. Then
1. the nonnegative measure $\tau$ admits a slicing of the form $\tau = dt\otimes\mu_t(x)$ with $\mu_t \in \mathcal{M}_+(\mathbb{R})$ for $\mathcal{L}^1$-a.e. $t > 0$. More precisely, for all $\phi \in C_0(\mathbb{R}^2_+)$, $\iint \phi(t, x)\,\tau = \int_0^\infty\!\int_{\mathbb{R}} \phi(t, x)\,\mu_t(x)\,dt$;
2. the points $(t, x) \in \mathbb{R}^2_+$ such that $\mu_t(x) > 0$, with the exception of a set of $\mathcal{H}^1$-measure zero, form a countable union of vertical line segments, called vacuum lines. In particular, $\tau(l) = 0$ for any non-vertical straight line segment $l$;
3. the identity (4.1) holds for any $\tau$-test function $\phi(t, x)$.

The proof of Theorem 4.1 is given in Sect. 5. As a corollary, we have

Corollary 4.1. Let $\tau$, $v$, and $\tau_0$ be as above. Let $\bar p(t, x)$ be a nonnegative function over $\overline{\mathbb{R}^2_+}$, continuous on $\mathbb{R}^2_+$, such that $\phi\bar p$ is a $\tau$-test function for any $\phi \in C_0^1(\mathbb{R}^2)$, $\bar p_t \le 0$, $\tau$-a.e., and $\bar p_x \in L^1_{loc}(\mathbb{R}^2_+)$. Then, for any nonnegative function $\zeta \in C_0^1(\mathbb{R})$,
$$\limsup_{t\to0+} \int \zeta(x)\,\bar p(t, x)\,\mu_t(x) \le \int \zeta(x)\,\bar p(0, x)\,\tau_0(x). \tag{4.2}$$

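Proof. First we have from Theorem 4.1 that (4.1) holds for $\phi = \bar p\,\psi$ with $\psi \in C_0^1(\mathbb{R}^2)$. Then we choose $\psi = \psi^\varepsilon(t, x) := \sigma^\varepsilon(t - t_0)\,\zeta(x)$ with $\zeta(x) \ge 0$, $t_0 > 0$, and $0 \le \sigma^\varepsilon \in C_0^1(\mathbb{R})$,
$$\sigma^\varepsilon(t - t_0) \to \chi_{(-\infty, t_0)}(t), \qquad \delta^\varepsilon(t - t_0) := -\frac{d}{dt}\sigma^\varepsilon(t - t_0) \rightharpoonup \delta_{t_0}, \qquad \text{as } \varepsilon \to 0,$$
where $\delta_{t_0}$ is the Dirac measure concentrated at $t_0$, and the convergences are in $L^1(\mathbb{R})$ and $\mathcal{M}(\mathbb{R})$, respectively. We obtain from (4.1) that
$$0 = \iint_{t>0} (\psi^\varepsilon\bar p)_t\,\mu_t\,dt - \iint (\psi^\varepsilon\bar p)_x\,v\,dt\,dx + \int (\psi^\varepsilon\bar p_0)\,\tau_0 = -\iint \delta^\varepsilon\,\zeta\,\bar p\,\mu_t\,dt + \iint \bar p_t\,\psi^\varepsilon\,\mu_t\,dt - \iint (\psi^\varepsilon\bar p)_x\,v\,dt\,dx + \int (\psi^\varepsilon\bar p_0)\,\tau_0.$$
Now, using that $\bar p_t \le 0$, $\tau$-a.e., we obtain
$$-\iint \delta^\varepsilon\,\zeta\,\bar p\,\mu_t\,dt + C_0 \iint_{0\le t\le t_0,\ x\in\operatorname{spt}(\zeta)} (|\bar p_x| + |\zeta'|)\,dt\,dx + \int (\psi^\varepsilon\bar p_0)\,\tau_0 \ge 0.$$
Assuming that $t_0$ is a Lebesgue point of $g(t) = \int \zeta\,\bar p\,\mu_t$ and letting $\varepsilon \to 0$ yield
$$\int \zeta\,\bar p\,\mu_{t_0} \le C \iint_{0\le t\le t_0,\ x\in\operatorname{spt}(\zeta)} (|\bar p_x| + |\zeta'|)\,dt\,dx + \int \zeta\,\bar p_0\,\tau_0. \tag{4.3}$$
Now, taking the lim sup as $t_0 \to 0$ in both sides of (4.3), we finally arrive at (4.2).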
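To make the objects in Theorem 4.1 concrete, here is a minimal worked example. The data $\tau_0 = \mathcal{L}^1$ and $v = \tfrac12\operatorname{sgn}(x)$ are assumptions chosen only for illustration; they are not taken from the text. The sketch checks the slicing, the weak identity (4.1), and the location of the vacuum line.

```latex
% Illustrative assumptions: \tau_0 = \mathcal{L}^1, v(t,x) = \tfrac12 \operatorname{sgn}(x),
% and \phi \in C_0^1(\mathbb{R}^2).
\begin{align*}
  \tau &= \mathcal{L}^2 + t\, dt \otimes \delta_0(x) = dt \otimes \mu_t,
        \qquad \mu_t = \mathcal{L}^1 + t\,\delta_0 ,\\
  \iint_{t>0} \phi_t \,\tau
       &= -\int_{\mathbb{R}} \phi(0,x)\, dx - \int_0^\infty \phi(t,0)\, dt ,\\
  -\iint_{t>0} v\, \phi_x \, dt\, dx
       &= \int_0^\infty \phi(t,0)\, dt .
\end{align*}
% Adding \int_{\mathbb{R}} \phi(0,x)\,\tau_0(x) = \int_{\mathbb{R}} \phi(0,x)\,dx gives zero, i.e. (4.1).
% The only atoms of \mu_t sit at x = 0, so the single vacuum line is \{x = 0\}.
```

Note that $\tau \ge \mathcal{L}^2$ and $v \in L^\infty$, so the standing hypotheses hold with $c = 1$.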


We now consider the solutions of the Euler equations (1.1)–(1.3) for gas dynamics in the sense of distributions such that $\tau$ is a nonnegative Radon measure, with $\tau \ge c\,\mathcal{L}^2$ for some $c > 0$, and $v(t, x)$ and $S(t, x)$ are bounded $\tau$-measurable functions, along with our understanding that the constitutive relations (1.4) for $(\tau, p, e, \theta, S)(t, x)$ hold $\mathcal{L}^2$-almost everywhere outside the vacuum lines, in the set where $\tau$ is absolutely continuous with respect to $\mathcal{L}^2$, and both $p(t, x)$ and $e(t, x)$ are defined as zero on the remaining set with measure zero in $\mathbb{R}^2_+$, including the vacuum lines. We consider the Cauchy problem for (1.1)–(1.3):
$$(\tau, v, S)|_{t=0} = (\tau_0, v_0, S_0)(x), \tag{4.4}$$
where $\tau_0(x)$ is a nonnegative Radon measure over $\mathbb{R}$, $\tau_0 \ge c\,\mathcal{L}^1$ for some $c > 0$, $v_0(x)$ and $S_0(x)$ are bounded $\tau_0$-measurable functions, and $e_0(x) = e(\tau_0(x), S_0(x))$ a.e. outside the countable set of points $\{x_k\}$ such that $\tau_0(x_k) > 0$, the initial vacuum set.

Set $\Pi_T = (0, T)\times\mathbb{R}$ and $\Pi^*_T = (-\infty, T)\times\mathbb{R}$ for $T > 0$. Let $D$ and $F$ be functions or measures over $\Pi_T$. Let $D_0$ be a function or a measure over $\mathbb{R}$. By weak formulation on $\Pi_T$ for the Cauchy problem:
$$D_t + F_x = 0, \tag{4.5}$$
$$D|_{t=0} = D_0, \tag{4.6}$$
we mean that, for a suitable set of test functions $\phi(t, x)$ defined on $\Pi^*_T$,
$$\iint_{\Pi_T} (\phi_t D + \phi_x F) + \int_{\mathbb{R}} \phi(0, x)\,D_0 = 0. \tag{4.7}$$
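The sign conventions in the weak formulation (4.7) can be recovered by a formal integration by parts. The sketch below assumes, purely for illustration, that $D$ and $F$ are $C^1$ functions and that $\phi$ is $C^1$ with compact support in $(-\infty, T)\times\mathbb{R}$.

```latex
% Assumption for this sketch only: D, F are C^1 and \phi \in C_0^1((-\infty,T)\times\mathbb{R}).
\begin{align*}
  0 &= \iint_{(0,T)\times\mathbb{R}} \phi\,(D_t + F_x)\, dt\, dx \\
    &= -\iint_{(0,T)\times\mathbb{R}} (\phi_t D + \phi_x F)\, dt\, dx
       - \int_{\mathbb{R}} \phi(0,x)\, D(0,x)\, dx .
\end{align*}
% The boundary term at t = T vanishes because \phi has compact support in t < T.
% With D(0,\cdot) = D_0 this is exactly (4.7).
% If instead D_t + F_x \ge 0 and \phi \ge 0, the first line is \ge 0,
% so the expression in (4.7) is \le 0: the inequality sign flips.
```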

Analogously, if the identity “=” in (4.5) is replaced by “≥” or “≤”, the weak formulation of the corresponding problem (4.5) and (4.6) is (4.7) with “=” replaced by “≤” or “≥”, respectively, for a suitable set of nonnegative test functions defined on $\Pi^*_T$. Denote $W = (\tau, v, S)$, $f(W) = (-v, p(\tau, S), 0)$,
$$\eta(W) = e(\tau, S) + \frac{v^2}{2}, \qquad q(W) = v\,p(\tau, S),$$
and
$$\alpha(W, \bar W) = \eta(W) - \eta(\bar W) - \nabla\eta(\bar W)\cdot(W - \bar W), \qquad \beta(W, \bar W) = q(W) - q(\bar W) - \nabla\eta(\bar W)\cdot(f(W) - f(\bar W)).$$
Observe that $\nabla\eta(\bar W) = (-\bar p, \bar v, \bar\theta)$.

Definition 4.2. We say that $W(t, x)$ is a distributional entropy solution of (1.1)–(1.3) and (4.4) in $\Pi_T$ if $\tau$ is a Radon measure on $\Pi_T$ with $\tau \ge c\,\mathcal{L}^2$ for some $c > 0$, $v$ and $S$ are bounded $\tau$-measurable functions such that the weak formulation of (1.1)–(1.3), (1.7), and (4.4) is satisfied for all test functions in $C_0^1(\Pi^*_T)$, and $S(t, \cdot) \rightharpoonup S_0(\cdot)$, as $t \to 0$, in the weak-star topology of $L^\infty(\mathbb{R})$.

Observe that the weak formulation implies that $\mu_t \rightharpoonup \tau_0$ in $\mathcal{M}(\mathbb{R})$, and $v(t, \cdot) \rightharpoonup v_0(\cdot)$ and $E(t, \cdot) \rightharpoonup E_0(\cdot)$ in the weak-star topology of $L^\infty(\mathbb{R})$, as $t \to 0$, where $E = e + v^2/2$. We also remark that these convergences can be strengthened to the convergences in $L^1_{loc}(\mathbb{R})$ in the case that $\tau$ is a bounded measurable function, as an easy consequence of the $DM^\infty$ theory (cf. [5]). As shown by Wagner [25], by means of the transformation from Eulerian to Lagrangian coordinates, bounded measurable entropy solutions of the Euler equations


in Eulerian coordinates transform into distributional entropy solutions of (1.1)–(1.3), (1.7) and (4.4), satisfying the additional restriction that the weak formulation of (1.2), (1.3), and (1.7) holds for test functions with compact support in $\Pi_T$ such that $\phi_t = g$, $\phi_x = h\,\tau$, where $g, h \in L^\infty(\Pi_T, \tau)$. It is also shown through an example in [25] that distributional entropy solutions without the additional restriction may have no physical meaning.

Now we consider the Riemann solution $\bar W(t, x)$ associated with the Riemann problem for (1.1)–(1.3) with initial condition:
$$\bar W_0(x) = \begin{cases} W_L, & x < 0,\\ W_R, & x > 0, \end{cases} \tag{4.8}$$
where $W_L$ and $W_R$ are two constant states in the physical domain $\{W = (\tau, v, S) : \tau > 0\}$. First, we address the case that $\bar W(t, x)$ is a bounded self-similar entropy solution of (1.1)–(1.3) which consists of at most two rarefaction waves, one corresponding to the first characteristic family and another corresponding to the third one, and possibly one contact discontinuity on the line $x = 0$. Then, $\bar W(t, x)$ has the following general form:
$$\bar W(x, t) = \begin{cases} W_L, & x/t < \xi_1,\\ R_1(x/t), & \xi_1 \le x/t < \xi_2,\\ W_M, & \xi_2 \le x/t < 0,\\ W_N, & 0 < x/t < \xi_3,\\ R_3(x/t), & \xi_3 \le x/t < \xi_4,\\ W_R, & x/t \ge \xi_4. \end{cases} \tag{4.9}$$
In what follows we use the notation $W^p = (\tau, S)$ and $\bar W^p = (\bar\tau, \bar S)$.

Theorem 4.2. Let $\bar W(t, x)$ be the Riemann solution (4.9), and let $W(t, x)$ be any distributional entropy solution of (1.1)–(1.3) and (4.4) with $W_0 \in L^\infty(\mathbb{R}; \mathbb{R}^3)$. Then there exist positive constants $C$ and $K_0$, and a function $\omega \in L^\infty(\Pi_T)$, positive a.e. in $\Pi_T$, such that, for any $X > 0$ and a.e. $t > 0$,
$$\int_{|x|\le X} |v(t, x) - \bar v(t, x)|^2 + |W^p_{a.c.}(t, x) - \bar W^p(t, x)|^2\,\omega(t, x)\,dx \le C \int_{|x|\le X+K_0 t} |v_0(x) - \bar v_0(x)|^2 + |W^p_0(x) - \bar W^p_0(x)|^2\,\omega(0, x)\,dx. \tag{4.10}$$

Proof. We divide the proof into four steps.

Step 1. Consider the measure $\Lambda := \alpha(W, \bar W)_t + \beta(W, \bar W)_x$. Given any $X > 0$ and $t > 0$, let $t_0 \in (0, t)$ and $\Pi_{t_0,t} = \{(\sigma, x) : |x| < X + K_0(t - \sigma),\ t_0 < \sigma < t\}$, for $K_0 > 0$ to be suitably chosen later.


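First, by the Gauss-Green formula (Theorem 3.1), we have
$$\Lambda(\Pi_{t_0,t}) = \langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, 1\rangle.$$
Now, let $\zeta_i$, $i = 1, \cdots, 4$, be nonnegative functions in $C_0^\infty(\mathbb{R}^2)$ such that
$$\sum_{i=1}^4 \zeta_i = 1 \ \text{ on } \partial\Pi_{t_0,t},$$
$$\zeta_1 = 1 \ \text{ on } \{(t_0, x) : |x| < X + K_0(t - t_0)\}, \qquad \zeta_2 = 1 \ \text{ on } \{(t, x) : |x| < X\},$$
$$(\operatorname{spt}(\zeta_3)\cup\operatorname{spt}(\zeta_4)) \cap ((\{t\}\times\mathbb{R}) \cup (\{t_0\}\times\mathbb{R})) = \emptyset.$$
We choose $\zeta_3$ and $\zeta_4$ so that $\operatorname{spt}(\zeta_3)$ intersects the left lateral side of $\Pi_{t_0,t}$ but not the right, and $\operatorname{spt}(\zeta_4)$ intersects its right lateral side but not the left. We have
$$\Lambda(\Pi_{t_0,t}) = \langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_1 + \zeta_2 + \zeta_3 + \zeta_4\rangle.$$
In what follows, we will use the notations:
$$\hat g = \int_0^1 (1 - \sigma)\,g(\sigma\tau + (1 - \sigma)\bar\tau,\ \sigma S + (1 - \sigma)\bar S)\,d\sigma$$
for any function $g = g(\tau, S)$, and
$$\tilde g = \int_0^1 g(\sigma\tau + (1 - \sigma)\bar\tau,\ \sigma S + (1 - \sigma)\bar S)\,d\sigma.$$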
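The averaged quantities $\hat g$ and $\tilde g$ are the coefficients that appear when Taylor's formula with integral remainder is applied along the segment joining $(\bar\tau, \bar S)$ to $(\tau, S)$. The sketch below spells this out for a smooth function $g$; the shorthand $a$, $b$ is introduced here only for readability, and smoothness of $g$ on the segment is assumed.

```latex
% Shorthand (introduced for this sketch): a = \tau - \bar\tau, \quad b = S - \bar S.
% \tilde g_\tau denotes the \tilde{\ } average of g_\tau, and similarly for the other derivatives.
\begin{align*}
  g(\tau,S) - g(\bar\tau,\bar S)
    &= \tilde g_\tau\, a + \tilde g_S\, b, \\
  g(\tau,S) - g(\bar\tau,\bar S) - g_\tau(\bar\tau,\bar S)\, a - g_S(\bar\tau,\bar S)\, b
    &= \hat g_{\tau\tau}\, a^2 + 2\,\hat g_{\tau S}\, ab + \hat g_{SS}\, b^2 .
\end{align*}
% Both follow from h(\sigma) = g(\bar\tau + \sigma a, \bar S + \sigma b):
%   h(1) - h(0) = \int_0^1 h'(\sigma)\, d\sigma, \qquad
%   h(1) - h(0) - h'(0) = \int_0^1 (1 - \sigma)\, h''(\sigma)\, d\sigma .
% With g = e this yields the quadratic part of \alpha; with g = p it yields p - \bar p.
% Moreover, from \nabla\eta(\bar W) = (-\bar p, \bar v, \bar\theta) and f(W) = (-v, p, 0):
%   \beta = vp - \bar v \bar p - \bar p (v - \bar v) - \bar v (p - \bar p) = (v - \bar v)(p - \bar p).
```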

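Step 2. We now prove the following five estimates.

1. $\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_3\rangle \ge 0$ and $\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_4\rangle \ge 0$. Indeed, let $z_* = (t_*, x_*)$ be the center of $\Pi_{t_0,t}$. We consider the following deformation of $\partial\Pi_{t_0,t}$: $\Psi(z, s) := z + \varepsilon s(z_* - z)$, $z \in \partial\Pi_{t_0,t}$, $s \in [0, 1]$, where $\varepsilon$ is chosen so small that
$$(\operatorname{spt}(\zeta_3)\cup\operatorname{spt}(\zeta_4)) \cap (\{(t - \varepsilon s(t - t_*), x) : x \in \mathbb{R}\} \cup \{(t_0 + \varepsilon s(t_* - t_0), x) : x \in \mathbb{R}\}) = \emptyset,$$
for $s \in [0, 1]$. Then
$$\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_3\rangle = \lim_{s\to0} \frac1s \int_{\Psi(\partial\Pi_{t_0,t}\times(0,s))} (K_0\alpha - \beta)\,|\nabla h|\,\zeta_3\,d\sigma\,dx \ge 0,$$
by choosing $K_0$ such that $K_0\alpha \ge \beta$, which is possible by what follows.

2. Given $\delta > 0$ and bounded sets $B_1 \subset V_\delta := \{(\tau, v, S) : \tau > \delta\}$ and $B_2 \subset \mathbb{R}^2$, there exists a constant $K_0 = K_0(\delta, B_1, B_2) > 0$ such that $K_0\,\alpha(W, \bar W) \ge \beta(W, \bar W)$, for any $\bar W \in B_1$ and $W \in V_\delta$ with $(v, S) \in B_2$. In fact, we first have
$$\alpha(W, \bar W) = \frac12 (v - \bar v)^2 + \hat e_{\tau\tau}(\tau - \bar\tau)^2 + 2\,\hat e_{\tau S}(\tau - \bar\tau)(S - \bar S) + \hat e_{SS}(S - \bar S)^2,$$
$$\beta(W, \bar W) = (v - \bar v)(p - \bar p).$$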


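If $W(t, x)$ and $\bar W(t, x)$ belong to a bounded set $B$ in $V_\delta$, we can find $K_0$ depending only on $B$ such that $K_0\alpha \ge \beta$. Now, since $(\bar\tau, \bar v, \bar S) \in B_1$ and $(v, S) \in B_2$, it suffices to show that, for $\tau$ sufficiently large, we have $\beta(W, \bar W) \le \alpha(W, \bar W)$. Notice that
$$\beta \le \frac12 (v - \bar v)^2 + \frac12 \tilde p_\tau^2 (\tau - \bar\tau)^2 + \tilde p_\tau \tilde p_S (\tau - \bar\tau)(S - \bar S) + \frac12 \tilde p_S^2 (S - \bar S)^2 \le \frac12 (v - \bar v)^2 + \tilde p_\tau^2 (\tau - \bar\tau)^2 + \tilde p_S^2 (S - \bar S)^2.$$
On the other hand,
$$\alpha \ge \frac12 (v - \bar v)^2 + \frac{\hat e_{\tau\tau}}{2(K + 1)} (\tau - \bar\tau)^2 + \frac12 \Big( \hat e_{SS} - \frac{K + 1}{K}\,\frac{\hat e_{\tau S}^2}{\hat e_{\tau\tau}} \Big) (S - \bar S)^2,$$
for any $K > 0$. Now, $\tilde p_\tau^2$ decays faster than $\hat e_{\tau\tau} = -\hat p_\tau$ as $\tau \to \infty$, $\tilde p_S^2$ decays faster than $\hat e_{SS}$ as $\tau \to \infty$, and, for $K$ sufficiently large,
$$\hat e_{SS} - \frac{K + 1}{K}\,\frac{\hat e_{\tau S}^2}{\hat e_{\tau\tau}} > c\,\hat e_{SS},$$
for some $c > 0$ sufficiently small, since $\gamma > 1$, as one can easily check from (1.4). Hence, $\beta(W, \bar W) \le \alpha(W, \bar W)$ for $\tau$ larger than a certain $\tau_*$. Then the assertion follows.

3. Similarly, we have $\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_4\rangle \ge 0$.

4. As for $\zeta_2$, we have
$$\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_2\rangle = \lim_{s\to0} \frac1s \int_{\Psi(\partial\Pi_{t_0,t}\times(0,s))} (\alpha, \beta)\cdot\nabla h\ \zeta_2\,d\sigma\,dx$$
$$\ge \lim_{s\to0} \frac{1}{\varepsilon(t - t_*)s} \int_{t-\varepsilon s(t-t_*)}^{t} d\sigma \int_{|x|\le X-\varepsilon K_1 T^2} \{\eta(W) - \eta(\bar W) - \bar v(v - \bar v) + \bar p(\tau_{a.c.} - \bar\tau) - \bar\theta(S - \bar S)\}\,dx$$
$$= \int_{|x|\le X-\varepsilon K_1 T^2,\ \sigma=t} \{\eta(W_{a.c.}) - \eta(\bar W) - \bar v(v - \bar v) + \bar p(\tau_{a.c.} - \bar\tau) - \bar\theta(S - \bar S)\}\,dx$$
$$= \int_{|x|\le X-\varepsilon K_1 T^2,\ \sigma=t} \Big( \frac12 |v - \bar v|^2 + Q(W^p_{a.c.}(t, x) - \bar W^p(t, x)) \Big)\,dx,$$
if $t$ is a Lebesgue point of
$$g(s) = \int_{|x|\le X-\varepsilon K_1 T^2} \alpha(W_{a.c.}, \bar W)(s, x)\,dx,$$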


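where $Q$ is the quadratic form associated with the symmetric matrix
$$A = \begin{pmatrix} \hat e_{\tau\tau} & \hat e_{\tau S}\\ \hat e_{\tau S} & \hat e_{SS} \end{pmatrix}.$$

5. For $\zeta_1$, we have
$$\langle (\alpha, \beta)\cdot\nu|_{\partial\Pi_{t_0,t}}, \zeta_1\rangle = \lim_{s\to0} \frac1s \int_{\Psi(\partial\Pi_{t_0,t}\times(0,s))} (\alpha, \beta)\cdot\nu\ \zeta_1\,|\nabla h|\,d\sigma\,dx$$
$$\ge -\lim_{s\to0} \frac1s \int_{t_0}^{t_0+\varepsilon s(t_*-t_0)} \frac{1}{\varepsilon(t_* - t_0)} \int_{|x|\le X+K_0(t-t_0)} \alpha(W, \bar W)\,d\sigma\,dx$$
$$= -\int_{|x|\le X+K_0(t-t_0),\ \sigma=t_0} \{\eta(W) - \eta(\bar W) - \bar v(v - \bar v) + \bar p(\mu_{t_0} - \bar\tau) - \bar\theta(S - \bar S)\},$$
where we have also used that $\mu_\sigma \rightharpoonup \mu_{t_0}$ as $\sigma \to t_0 + 0$, for a.e. $t_0 > 0$, with $\mu_t$ as in Theorem 4.1, and that $\bar p$ is continuous on $[t_0, t]\times\mathbb{R}$.

Step 3. On the other hand,
$$\Lambda(\Pi_{t_0,t}) = \sum_{i=1}^4 \Lambda(\tilde l_i \cap \Pi_{t_0,t}) + \Lambda(l \cap \Pi_{t_0,t}) + \Lambda(\Omega_1 \cap \Pi_{t_0,t}) + \Lambda(\Omega_3 \cap \Pi_{t_0,t}) + \Lambda\big(\Pi_{t_0,t} - (\cup_{i=1}^4 \tilde l_i \cup l \cup \Omega_1 \cup \Omega_3)\big),$$
where $\Omega_1$ and $\Omega_3$ are the left and right rarefaction regions, $\tilde l_i$, $1 \le i \le 4$, are the lines bounding the rarefaction regions $\Omega_1$ and $\Omega_3$, and $l$ is the line $\{x = 0\}$, where $\bar W(t, x)$ has a contact discontinuity. We first observe that, on $\Pi_{t_0,t} - (\cup_{i=1}^4 \tilde l_i \cup l \cup \Omega_1 \cup \Omega_3)$, the measure $\Lambda$ reduces to $-\bar\theta\,\partial_t S$ which is nonpositive. Now, we have $\Lambda = -\operatorname{div}(F_1 + F_2 + F_3)$, where
$$F_1 = \bar v\,(v - \bar v,\ p - \bar p), \qquad F_2 = -\bar p\,(\tau - \bar\tau,\ \bar v - v), \qquad F_3 = \bar\theta\,(S - \bar S,\ 0),$$
and $\operatorname{div} := \operatorname{div}_{t,x}$. Applying the product rule (Theorem 3.2), we get
$$\operatorname{div}F_1 = \bar v_t (v - \bar v) + \bar v_x (p - \bar p), \qquad \operatorname{div}F_2 = -\bar p_t (\tau - \bar\tau) + \bar p_x (v - \bar v).$$
Hence,
$$\operatorname{div}F_1(\tilde l_j \cap \Pi_{t_0,t}) = \operatorname{div}F_1(l \cap \Pi_{t_0,t}) = 0, \qquad j = 1, \cdots, 4,$$
since $\operatorname{div}F_1$ is absolutely continuous with respect to $\mathcal{L}^2$. Also,
$$\operatorname{div}F_3(\tilde l_j \cap \Pi_{t_0,t}) \ge 0, \qquad j = 1, \cdots, 4.$$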


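On the other hand, since $F_3 \in DM^\infty(\Pi_{t_0,t})$ and $\nu|_l = (0, 1)$, we have $\operatorname{div}F_3(l \cap \Pi_{t_0,t}) = [\langle F_3\cdot\nu|_l, 1\rangle] = 0$, where the bracket denotes the difference between the normal traces from the right and the left, which make sense for $F_3 \in DM^\infty$ because the normal traces of $DM^\infty$ fields are functions in $L^\infty$ over the boundaries. Concerning $F_2$, we have
$$\operatorname{div}F_2(\tilde l_j \cap \Pi_{t_0,t}) = 0, \qquad j = 1, \cdots, 4,$$
since $\bar p_t$ is $\tau$-integrable and $\tau(\tilde l_j) = 0$, $j = 1, \cdots, 4$. On the other hand, $\bar p_t$ vanishes on $l$ so that $\operatorname{div}F_2(l \cap \Pi_{t_0,t}) = 0$. Finally, using the product rule (Theorem 3.2), the fact that $W(t, x)$ and $\bar W(t, x)$ are distributional solutions of (1.1)–(1.3) and (4.4), and $\bar S(t, x)_t = 0$, we obtain, for $j = 1, 3$,
$$\Lambda(\Omega_j \cap \Pi_{t_0,t}) = -\int_{\Omega_j\cap\Pi_{t_0,t}} \{(\bar v(v - \bar v))_t + (\bar v(p - \bar p))_x - (\bar p(\tau - \bar\tau))_t + (\bar p(v - \bar v))_x + (\bar\theta(S - \bar S))_t\}$$
$$= -\int_{\Omega_j\cap\Pi_{t_0,t}} \{\bar v_t(v - \bar v) + \bar v_x(p - \bar p) - \bar p_t(\tau - \bar\tau) + \bar p_x(v - \bar v) + \bar\theta_t(S - \bar S) + \bar\theta S_t\}$$
$$\le -\int_{\Omega_j\cap\Pi_{t_0,t}} \{\bar v_x(p - \bar p) - \bar p_\tau \bar v_x(\tau - \bar\tau) + \bar\theta_\tau \bar v_x(S - \bar S)\}$$
$$\le -\int_{\Omega_j\cap\Pi_{t_0,t}} \bar v_x \big( p - \bar p - \bar p_\tau(\tau_{a.c.} - \bar\tau) - \bar p_S(S - \bar S) \big)\,dx\,ds + \int_{\Omega_j\cap\Pi_{t_0,t}} \bar v_x\,\bar p_\tau\,\tau_{sing} \le 0, \tag{4.11}$$
since $\bar p_\tau < 0$, $\bar v_x$ is bounded, $\bar v_x(t, x) \ge 0$ everywhere over $\Omega_1$ and $\Omega_3$, and $\nabla^2_{W^p}\,p > 0$.

Step 4. Putting all these estimates together, we have
$$\int_{|x|\le X} |v(t, x) - \bar v(t, x)|^2 + |W^p_{a.c.}(t, x) - \bar W^p(t, x)|^2\,\omega(t, x)\,dx \le 2 \int_{|x|\le X+K_0(t-t_0),\ \sigma=t_0} \{\eta(W) - \eta(\bar W) - \bar v(v - \bar v) + \bar p(\mu_{t_0} - \bar\tau) - \bar\theta(S - \bar S)\}.$$
Now, applying Corollary 4.1, we finally get (4.10).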

Corollary 4.2. Let $\bar W(t, x)$ and $W(t, x)$ satisfy the conditions of Theorem 4.2 and $W_0(x) = \bar W_0(x)$. Then $\tau(t, x)$ is absolutely continuous with respect to $\mathcal{L}^2$ in $\Pi_T$ and $W(t, x) = \bar W(t, x)$ a.e. in $\Pi_T$.

Proof. From Theorem 4.2, we have
$$W_{a.c.}(t, x) = \bar W(t, x), \qquad \text{a.e. in } \Pi_T.$$
Hence, $\tau_{sing}$ must satisfy the weak formulation of
$$\tau_t = 0, \qquad \tau|_{t=0} = 0,$$
for the test functions in $C_0^1(\Pi^*_T)$. In particular, there exists $y \in BV_{loc}(\Pi_T)$ such that
$$\partial_x y = \tau_{sing}, \qquad \partial_t y = 0.$$
Therefore,
$$\tau_{sing} = dt\otimes\nu_t(x), \qquad \text{a.e. } t \in (0, T),$$
where $\nu_t(\cdot) = \frac{dy}{dx}(t, \cdot)$. Since $y$ is independent of $t$, we have that $\nu_t$ is also independent of $t$, say, $\nu_t(x) = \nu_0(x)$ and $\tau_{sing} = dt\otimes\nu_0$. Furthermore, since $\nu_t \rightharpoonup 0$ as $t \to 0$, we conclude $\nu_0 \equiv 0$.

We now consider the case that the Riemann solution, with the initial condition (4.8), has a vacuum line at $x = 0$. In this case, the Riemann solution $\bar W(t, x)$ has the following form:
$$\bar W(x, t) = \begin{cases} W_L, & x/t < \xi_L,\\ R_1(x/t), & \xi_L \le x/t < 0,\\ \big((\bar v_1 + \bar v_2)/2,\ (\bar v_2 - \bar v_1)\,t\,dt\otimes\delta_0(x),\ (S_L + S_R)/2\big), & x = 0,\\ R_3(x/t), & 0 < x/t \le \xi_R,\\ W_R, & x/t > \xi_R. \end{cases} \tag{4.12}$$
Here $R_1(x/t)$ and $R_3(x/t)$ are as above the rarefaction waves of the first and third characteristic families, respectively, $\bar v_1 = \lim_{\xi\to0-} \bar v(\xi)$, $\bar v_2 = \lim_{\xi\to0+} \bar v(\xi)$, and $\delta_0(x)$ is the Dirac measure over $\mathbb{R}$ concentrated at $x = 0$. It is easy to check that $\bar W(t, x)$ is a distributional solution of (1.1)–(1.3) and (4.4). The values of $\bar v$ and $\bar S$ on the line $x = 0$ could be taken as any other constants instead of $\frac{\bar v_1 + \bar v_2}{2}$ and $\frac{S_L + S_R}{2}$, respectively, while the formula of $\bar\tau$ at $x = 0$, $(\bar v_2 - \bar v_1)\,t\,dt\otimes\delta_0(x)$, is dictated by the fact that (1.1) must hold in the sense of distributions.

Theorem 4.3. Let $\bar W(t, x)$ be a Riemann solution containing vacuum as described in (4.12). Let $W(t, x)$ be a distributional entropy solution of (1.1)–(1.3) and (4.4) in $\Pi_T$ with $W_0 \in L^\infty(\mathbb{R}; \mathbb{R}^3)$. Then there exist positive constants $C$ and $K_0$, and a function $\omega \in L^\infty(\Pi_T)$, positive a.e. in $\Pi_T$, such that, for all $X > 0$ and a.e. $t > 0$,
$$\int_{|x|\le X} |v(t, x) - \bar v(t, x)|^2 + |W^p_{a.c.}(t, x) - \bar W^p(t, x)|^2\,\omega(t, x)\,dx \le C \int_{|x|\le X+K_0 t} |v_0(x) - \bar v_0(x)|^2 + |W^p_0(x) - \bar W^p_0(x)|^2\,\omega(0, x)\,dx. \tag{4.13}$$


Proof. Let $F_1$, $F_2$, and $F_3$ be as in the proof of Theorem 4.2. We observe that, since $\bar p = 0$ for $x = 0$, we have $F_2 = -\bar p\,(\tau - \bar\tau_{a.c.},\ \bar v - v)$, so that the analysis for $F_2$ remains the same. Also, nothing needs to be changed concerning $F_3$. As for $F_1 = \bar v\,(v - \bar v,\ p - \bar p)$, we have a new aspect which is the fact that $\bar v(t, x)$ is discontinuous at $l$. Then we have
$$\operatorname{div}F_1(l_{t_0,t}) = [\langle F_1\cdot\nu|_{l_{t_0,t}}, 1\rangle],$$
where $l_{t_0,t} = l \cap \Pi_{t_0,t}$ and again $\nu|_l = (0, 1)$. Let $p_-(t, x)$ and $p_+(t, x)$ denote the functions in $L^\infty(l)$, given by the theory of $DM^\infty$ fields developed in [5], such that
$$\langle p_-, \zeta\rangle = \langle (v, p)\cdot\nu|_l, \zeta\rangle_l, \qquad \langle p_+, \zeta\rangle = \langle (v, p)\cdot(-\nu)|_l, \zeta\rangle_l, \qquad \text{for any } \zeta \in C_0(l).$$
Hence, we have
$$\operatorname{div}F_1(l_{t_0,t}) = \bar v_2 \int_{t_0}^{t} p_+(s)\,ds - \bar v_1 \int_{t_0}^{t} p_-(s)\,ds.$$
On the other hand,
$$p_+(s) = p_-(s), \quad \text{a.e. } s > 0, \qquad \text{and} \qquad \bar v_2 > \bar v_1,$$
where the first follows from (1.2), and the second is a consequence of the construction of the Riemann solution containing vacuum. Therefore, we conclude $\operatorname{div}F_1(l_{t_0,t}) \ge 0$. The remainder of the proof follows exactly as in Theorem 4.2.

Again, we have the following corollary.

Corollary 4.3. Let $\bar W(t, x)$ and $W(t, x)$ satisfy the hypotheses of Theorem 4.3 and $W_0(x) = \bar W_0(x)$. Then $(v, S)(t, x) = (\bar v, \bar S)(t, x)$, $\mathcal{L}^2$-a.e. in $\Pi_T$, and $\tau = \bar\tau$ in $\mathcal{M}(\Pi_T)$.

Proof. From Theorem 4.3, we deduce that $W_{a.c.}(t, x) = \bar W_{a.c.}(t, x)$, $\mathcal{L}^2$-a.e. in $\Pi_T$. Thus, as in the proof of Corollary 4.2, we deduce that $\tau_{sing}$ must be concentrated at $\{x = 0\}$. Then $\tau_{sing}$ must be equal to $(\bar v_2 - \bar v_1)\,t\,dt\otimes\delta_0(x)$ as a consequence of (1.1) in the sense of distributions.

5. Proof of Theorem 4.1

In this section, we give a detailed proof of Theorem 4.1. The arguments are strongly motivated by those in Wagner [25].


Proof. We divide the proof into ten steps.

Step 1. There exists $y \in BV_{loc}([0, \infty)\times\mathbb{R})$ such that
$$\partial_x y = \tau, \qquad \partial_t y = v, \tag{5.1}$$
in the sense of distributions for $t > 0$. Indeed, let $\omega_\varepsilon$ be a positive symmetric mollifier in $\mathbb{R}^2$, and set $\tau^\varepsilon = \tau * \omega_\varepsilon$ and $v^\varepsilon = v * \omega_\varepsilon$, where we have extended $\tau$ and $v$ as zero for $t < 0$. Define
$$y^\varepsilon(t, x) = \int_0^x \tau^\varepsilon(t, s)\,ds + \int_0^t v^\varepsilon(\sigma, 0)\,d\sigma. \tag{5.2}$$
Then $y^\varepsilon \in BV \cap C^1(\mathbb{R}^2_+)$. We easily check that $y^\varepsilon(t, x)$ satisfies
$$\partial_x y^\varepsilon = \tau^\varepsilon, \qquad \partial_t y^\varepsilon = v^\varepsilon, \qquad t > \varepsilon.$$
Now, $\|y^\varepsilon\|_{BV(\Omega)} \le M$, for any open set $\Omega \subset\subset \mathbb{R}^2_+$, where $M$ is a positive constant independent of $\varepsilon$. Hence, by the compact embedding of $BV(\Omega)$ into $L^1(\Omega)$, there exists $y \in BV_{loc}(\mathbb{R}^2_+)$ such that
$$y^\varepsilon(t, x) \to y(t, x) \qquad \text{in } L^1_{loc}(\mathbb{R}^2_+),$$
by passing to a subsequence if necessary. Clearly, $y(t, x)$ satisfies (5.1) in the sense of distributions in $\mathbb{R}^2_+$.

Step 2. The measure $\tau(t, x)$ admits a slicing of the form $\tau = dt\otimes\mu_t$, where, for $\mathcal{L}^1$-a.e. $t > 0$, $\mu_t \in \mathcal{M}_+(\mathbb{R})$. Let $y(t, x)$ be a solution of (5.1). Since $y \in BV_{loc}(\mathbb{R}^2_+)$, then, for a.e. $t > 0$, $y(t, \cdot) \in BV_{loc}(\mathbb{R})$. Hence,
$$\tau = dt\otimes\frac{dy}{dx}(t, \cdot).$$

Step 3. The points $(t, x) \in \mathbb{R}^2_+$ such that $\mu_t(x) > 0$, with the possible exception of a set of $\mathcal{H}^1$-measure zero, form a countable union of vertical line segments. Again, since $y \in BV_{loc}(\mathbb{R}^2_+)$, then, for a.e. $x \in \mathbb{R}$, $y(\cdot, x) \in BV_{loc}(\mathbb{R}_+)$. Hence, $\partial_t y$ admits a slicing as
$$\partial_t y = \frac{dy}{dt}(\cdot, x)\otimes dx = v(\cdot, x)\,dt\otimes dx.$$
That is, for a.e. $x \in \mathbb{R}$, $y(\cdot, x)$ is a Lipschitz function on $[0, \infty)$ whose derivative is $v(\cdot, x)$. On the other hand, the jump set of $y(t, x)$, with the possible exception of a set of $\mathcal{H}^1$-measure zero, is a countable union of $C^1$ curves $\{l_k\}_{k\in\mathbb{N}}$, by the structure theory of BV functions (see, e.g., [12, 13]). We then conclude that the lines $l_k$ must be vertical, because, otherwise, we would have a subset $A \subset \mathbb{R}$ of positive measure such that, for $x \in A$, $\frac{dy}{dt}(\cdot, x)$ would be a singular measure, rather than an $L^\infty$ function $v(\cdot, x)$, which proves the assertion. Observe that $\mu_t \rightharpoonup \tau_0$ as $t \to 0+$, which follows from standard arguments by choosing suitable test functions in (4.1).
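Steps 1 to 3 can be illustrated on an explicit example. The data below ($\tau = \mathcal{L}^2 + t\,dt\otimes\delta_0$ and $v = \tfrac12\operatorname{sgn}(x)$, with the convention $\operatorname{sgn}(0) = 0$) are assumptions made only for this sketch and are not part of the proof.

```latex
% Illustrative data (assumed): \tau = \mathcal{L}^2 + t\,dt\otimes\delta_0(x),
% v(t,x) = \tfrac12 \operatorname{sgn}(x), with \operatorname{sgn}(0) = 0.
\begin{align*}
  y(t,x) &= x + \tfrac{t}{2}\operatorname{sgn}(x), \\
  \partial_x y &= 1 + t\,\delta_0(x) = \mu_t, \qquad
  \partial_t y = \tfrac12 \operatorname{sgn}(x) = v .
\end{align*}
% For each fixed x \neq 0, t \mapsto y(t,x) is Lipschitz with derivative v(\cdot,x).
% For each fixed t > 0, y(t,\cdot) jumps by t across x = 0,
% so the jump set of y is the vertical line \{x = 0\}, in line with Step 3.
```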


Step 4. Let $x_0 \in \mathbb{R}$ be such that $\tau_0(x_0) = \mu_t(x_0) = 0$, a.e. $t > 0$, and $x_0$ is a Lebesgue point of $\int_0^t v(\sigma, \cdot)\,d\sigma$, for all rational $t > 0$ and hence all fixed $t > 0$. Then there exists a solution of (5.1) satisfying
$$\lim_{t\to0+} y(t, x) = \int_{x_0}^x \tau_0, \qquad \text{for a.e. } x \in \mathbb{R}. \tag{5.3}$$
Consider a mollifier $\omega_\varepsilon$, with $\varepsilon = (\varepsilon_1, \varepsilon_2)$, of the form $\omega_\varepsilon(t, x) = \delta^{\varepsilon_1}(t)\,\delta^{\varepsilon_2}(x)$, where $\delta^{\varepsilon_j}$, $j = 1, 2$, are standard positive symmetric mollifiers in $\mathbb{R}$. If $y^\varepsilon(t, x)$ is defined by (5.2), with 0 replaced by $x_0$, then sending $\varepsilon_1$ to 0 first and next $\varepsilon_2 \to 0$ yield
$$\lim_{\varepsilon_2\to0}\lim_{\varepsilon_1\to0} y^\varepsilon(t, x) = \int_{x_0}^x \mu_t + \int_0^t v(\sigma, x_0)\,d\sigma, \qquad \text{a.e. } (t, x) \in \mathbb{R}^2_+.$$
Hence,
$$y(t, x) = \int_{x_0}^x \mu_t + \int_0^t v(\sigma, x_0)\,d\sigma$$
provides the desired solution because of Step 4.

Now, since, for each $t \ge 0$, $y(t, x)$ is a strictly increasing function of $x$, there exists a well-defined monotone increasing continuous function $x(t, y)$ such that
$$x(t, y(t, x)) = x, \qquad (t, x) \in \mathbb{R}^2_+, \tag{5.4}$$
where we have defined $y(t, x)$ on the vacuum lines by, say, $(y(t, x + 0) + y(t, x - 0))/2$. Let $Q$ and $T$ be the transformations of $\mathbb{R}^2_+$, $Q(t, x) = (t, y(t, x))$ and $T(t, y) = (t, x(t, y))$ so that $T(Q(t, x)) = (t, x)$. Observe that, given an open rectangle $R = (t_1, t_2)\times(a, b) \subset \mathbb{R}^2_+$, if $a$ and $b$ are such that $y(t, a)$ and $y(t, b)$ are continuous functions of $t$, then $Q(R)$ is an open set in $\mathbb{R}^2_+$, which implies that $T$ is continuous in $\mathbb{R}^2_+$. Let
$$Q_0(x) = \lim_{t\to0+} y(t, x), \qquad T_0(y) = \lim_{t\to0+} x(t, y).$$

Step 5. $T_\#\mathcal{L}^2 = \tau$ and $T_{0\#}\mathcal{L}^1 = \tau_0$, with notation from [13]. In fact, for a.e. $t > 0$, $y(t, x)$ is a strictly increasing $BV_{loc}$ function of the variable $x$, and $\tau = dt\otimes\mu_t(x)$ with $\mu_t(x) = \frac{dy}{dx}(t, \cdot)$. Now, $T(t, y) = (t, x(t, y))$ and, for any open interval $(a, b)$ and for each fixed $t > 0$, $x(t, \cdot)^{-1}(a, b) = (y(t, a + 0), y(t, b - 0)) = \tau(a, b)$. Hence, the measure $T_\#\mathcal{L}^2$ equals $dt\otimes\tilde\mu_t$, where $\tilde\mu_t$ is the Stieltjes measure associated with $y(t, x)$, that is, $\tilde\mu_t = \mu_t$, which implies $T_\#\mathcal{L}^2 = \tau$. Similarly, we obtain $T_{0\#}\mathcal{L}^1 = \tau_0$.

Step 6. The map $T$ is proper and onto. The fact that $T$ is onto follows from the fact that it is the inverse of $Q$ which is defined everywhere in $\mathbb{R}^2_+$. Now, since $Q(t, x)$ is a continuous function of $t$, for a.e. $x \in \mathbb{R}$, given any compact $K \subset \mathbb{R}^2_+$, we may find a rectangle $[t_1, t_2]\times[a, b]$ containing $K$ such that $Q(t, a)$ and $Q(t, b)$ are continuous functions of $t$. Hence, the set $\{(t, y) : y(t, a) \le y \le y(t, b),\ t_1 \le t \le t_2\}$ is compact and contains $T^{-1}(K)$, which is then also compact by the continuity of $T$.

Let $\rho = Q_\#\mathcal{L}^2$. Let $\tilde\rho$ be the density of $\mathcal{L}^2$ with respect to $\tau$ in $\mathbb{R}^2_+$ and let $\tilde\tau$ be the density of $\tau_{a.c.}$ with respect to $\mathcal{L}^2$ so that $\tilde\rho\,\tilde\tau = 1$, $\mathcal{L}^2$-a.e. in $\mathbb{R}^2_+$. The following result was proved by Wagner ([25], p.132).


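Step 7. $\rho = \partial_y x = \tilde\rho\circ T$. Indeed, $\rho = Q_\#\mathcal{L}^2$ admits a slicing of the form $\rho = dt\otimes\tilde\mu_t$, where $\tilde\mu_t$ is the Stieltjes measure associated with the inverse of $y(t, \cdot)$, which is $x(t, y)$, and hence $\tilde\mu_t = \partial_y x$. Furthermore, the Lipschitz continuity of $x(t, y)$ yields $\rho = \partial_y x$, where as usual we have identified absolutely continuous measures with their densities, with respect to the corresponding Lebesgue measure. Also, we have
$$T_\#\rho = T_\# Q_\#\mathcal{L}^2 = \mathcal{L}^2 = \tilde\rho\,\tilde\tau = \tilde\rho\,\tau = \tilde\rho\,T_\#\mathcal{L}^2.$$
Now, for all $\phi \in C_0((0, \infty)\times\mathbb{R})$, $\langle \rho, \phi\rangle = \langle T_\#\rho, \phi\circ Q\rangle$, while $\langle \tilde\rho\circ T, \phi\rangle = \langle \tilde\rho\,T_\#\mathcal{L}^2, \phi\circ Q\rangle$, which implies $\rho = \tilde\rho\circ T$, where we have used that $\rho = 0$ a.e. over $T^{-1}(V)$ and $V$ is the union of the vacuum lines, because of the first part of the assertion.

Now, let $\rho_0 = Q_{0\#}\mathcal{L}^1$ and $\bar v(t, y) = v(T(t, y))$.

Step 8. For all $\phi \in C_0^1(\mathbb{R}^2)$, we have
$$\iint_{t>0} (\rho\,\phi_t + \rho\,\bar v\,\phi_y)\,dt\,dy + \int_{t=0} \phi(0, y)\,\rho_0(y)\,dy = 0. \tag{5.5}$$
Observe that $\rho$ and $\rho_0$ are uniformly bounded since $\rho = \tilde\rho\circ T$, $\tilde\rho\,\tilde\tau = 1$, and $\tilde\tau \ge c$ for some $c > 0$. The same holds for $\rho_0$. Now, for any $\phi \in C_0^1(\mathbb{R}^2)$, we have
$$\iint_{t>0} (\phi_t + \phi_y\,\bar v)\,\rho\,dt\,dy + \int_{t=0} \phi(0, y)\,\rho_0(y)\,dy$$
$$= \iint_{t>0} (\phi_t\circ Q + \phi_y\circ Q\ \bar v\circ Q)\,dt\,dx + \int_{t=0} \phi(0, Q_0(x))\,dx$$
$$= \iint_{t>0} (\phi_t\circ Q + v\,\phi_y\circ Q)\,dt\,dx + \int_{t=0} \phi(0, Q_0(x))\,dx$$
$$= \iint_{t>0} \partial_t(\phi\circ Q)\,dt\,dx + \int_{t=0} \phi(0, Q_0(x))\,dx = 0,$$
where we have used that $\bar v\circ Q = v$. In particular, we have
$$\rho_t + (\rho\,\bar v)_y = 0, \tag{5.6}$$
in the sense of distributions. Now, let $x_0 \in \mathbb{R}$ be such that $y(t, x_0)$ is a Lipschitz continuous function on $[0, \infty)$ and, for a.e. $t$, $y(t, x_0)$ is a Lebesgue point for both $\rho(t, y)$ and $\bar v(t, y)$.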


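Step 9. There exists a solution $\bar x(t, y)$ of
$$\partial_y x = \rho, \qquad \partial_t x = -\rho\,\bar v, \tag{5.7}$$
such that $\bar x(t, y(t, x_0)) = x_0$. In particular, $\bar x(t, y) = x(t, y)$. Indeed, let $\rho^\varepsilon = \rho * \omega_\varepsilon$ and $(\rho\bar v)^\varepsilon = (\rho\bar v) * \omega_\varepsilon$, where $\omega_\varepsilon$ is a standard mollifier. Set
$$x^\varepsilon(t, y) = \int_{y(t,x_0)}^{y} \rho^\varepsilon(t, s)\,ds + x_0.$$
Then $x^\varepsilon(t, y)$ is a Lipschitz function satisfying
$$\partial_y x^\varepsilon = \rho^\varepsilon(t, y), \qquad \partial_t x^\varepsilon = -(\rho\bar v)^\varepsilon(t, y) - \rho^\varepsilon(t, y(t, x_0))\,\bar v(t, y(t, x_0)) + (\rho\bar v)^\varepsilon(t, y(t, x_0)).$$
Also, $|x^\varepsilon(t, y) - x_0| \le M|y - y(t, x_0)|$. Hence, $x^\varepsilon(t, y)$ converges in $L^1_{loc}(\mathbb{R}^2_+)$ to a Lipschitz function $\bar x(t, y)$ which satisfies $|\bar x(t, y) - x_0| \le M|y - y(t, x_0)|$. Thus, we have $\bar x(t, y(t, x_0)) = x_0$. Since
$$|\rho^\varepsilon(t, y(t, x_0))\,\bar v(t, y(t, x_0)) - (\rho\bar v)^\varepsilon(t, y(t, x_0))| \to 0, \qquad \text{as } \varepsilon \to 0,$$
for a.e. $t > 0$, we conclude that $\bar x(t, y)$ satisfies (5.7). Now, since $\partial_y x(t, y) = \rho(t, y)$ and $x(t, y(t, x_0)) = x_0$, we must have $x(t, y) = \bar x(t, y)$.

Step 10. Equation (4.1) holds for any $\tau$-test function $\phi(t, x)$. Indeed, for any $\tau$-test function $\phi(t, x)$, we have
$$\iint_{t>0} (\phi_t\,\tau - v\,\phi_x\,dt\,dx) + \int_{t=0} \phi(0, x)\,\tau_0$$
$$= \iint_{t>0} (\phi_t - v\,\tilde\rho\,\phi_x)\,\tau + \int_{t=0} \phi(0, x)\,\tau_0$$
$$= \iint_{t>0} (\phi_t\circ T - \phi_x\circ T\ \rho\,\bar v)\,dt\,dy + \int_{t=0} \phi\circ T_0\,dy$$
$$= \iint_{t>0} \frac{\partial}{\partial t}(\phi\circ T)\,dt\,dy + \int_{t=0} \phi\circ T_0\,dy$$
$$= -\int_{\mathbb{R}} \lim_{\delta\to0} \phi\circ T(\delta, y)\,dy + \int_{t=0} \phi\circ T_0\,dy = 0.$$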
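The change of variables used in Steps 4 to 10 can be checked by hand on a simple example. The data below are assumptions chosen for illustration only: $y(t, x) = x + \tfrac t2 \operatorname{sgn}(x)$ and $v = \tfrac12\operatorname{sgn}(x)$, which correspond to a single vacuum line at $x = 0$.

```latex
% Illustrative data (assumed): y(t,x) = x + \tfrac{t}{2}\operatorname{sgn}(x),
% v(t,x) = \tfrac12 \operatorname{sgn}(x), so that \partial_x y = 1 + t\,\delta_0 and \partial_t y = v.
\[
  x(t,y) =
  \begin{cases}
    y + \tfrac{t}{2}, & y < -\tfrac{t}{2},\\[2pt]
    0,                & |y| \le \tfrac{t}{2},\\[2pt]
    y - \tfrac{t}{2}, & y > \tfrac{t}{2},
  \end{cases}
  \qquad
  \rho = \partial_y x = \mathbf{1}_{\{|y| > t/2\}} .
\]
% The vacuum line \{x = 0\} is opened up into the wedge \{|y| \le t/2\}, where \rho = 0.
% Differentiating in t:
%   \partial_t x = \tfrac12   for y < -t/2  (there \rho = 1 and v \circ T = -\tfrac12),
%   \partial_t x = 0          for |y| < t/2 (there \rho = 0),
%   \partial_t x = -\tfrac12  for y > t/2   (there \rho = 1 and v \circ T = \tfrac12),
% so \partial_t x = -\rho\,(v \circ T) in each region, in agreement with (5.7).
```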

Acknowledgements. Gui-Qiang Chen’s research was supported in part by the National Science Foundation under Grants DMS-0204225, DMS-0204455, and INT-9987378. Hermano Frid’s research was supported in part by CNPq through the grants 352871/96-2, 465714/00-5, 479416/01-0 and FAPERJ through the grant E-26/151.890/2000.


References

1. Anzellotti, G.: Pairings between measures and functions and compensated compactness. Ann. Mat. Pura Appl. 135, 293–318 (1983)
2. Baiocchi, C., Capelo, A.: Variational and Quasi-Variational Inequalities, Applications to Free-Boundary Problems. Vols. 1, 2, Chichester-New York: Wiley, 1984
3. Bianchini, S., Bressan, A.: Vanishing viscosity solutions of nonlinear hyperbolic systems. Preprint, SISSA, Italy, 2001
4. Brezzi, F., Fortin, M.: Mixed and Hybrid Finite Element Methods. New York: Springer-Verlag, 1991
5. Chen, G.-Q., Frid, H.: Divergence-measure fields and conservation laws. Arch. Rat. Mech. Anal. 147, 89–118 (1999)
6. Chen, G.-Q., Frid, H.: Decay of entropy solutions of nonlinear conservation laws. Arch. Rat. Mech. Anal. 146, 95–127 (1999)
7. Chen, G.-Q., Frid, H.: Large-time behavior of entropy solutions of conservation laws. J. Diff. Eqs. 152, 308–357 (1999)
8. Chen, G.-Q., Frid, H.: Uniqueness and asymptotic stability of Riemann solutions for the compressible Euler equations. Trans. Am. Math. Soc. 353, 1103–1117 (2001)
9. Chen, G.-Q., Frid, H., Li, Y.: Uniqueness and stability of Riemann solutions with large oscillation in gas dynamics. Commun. Math. Phys. 228, 201–217 (2002)
10. Dafermos, C.M.: Hyperbolic Conservation Laws in Continuum Physics. Berlin: Springer-Verlag, 1999
11. DiPerna, R.: Uniqueness of solutions to hyperbolic conservation laws. Indiana Univ. Math. J. 28, 137–188 (1979)
12. Evans, L.C., Gariepy, R.F.: Lecture Notes on Measure Theory and Fine Properties of Functions. Boca Raton, Florida: CRC Press, 1992
13. Federer, H.: Geometric Measure Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1969
14. Gagliardo, E.: Caratterizzazioni delle tracce sulla frontiera relative ad alcune classi di funzioni in n variabili. Rend. Sem. Mat. Univ. Padova 27, 284–305 (1957)
15. Glimm, J.: Solutions in the large for nonlinear hyperbolic systems of equations. Commun. Pure Appl. Math. 18, 95–105 (1965)
16. Lax, P.D.: Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves. CBMS. 11, Philadelphia: SIAM, 1973
17. Lax, P.D.: Shock waves and entropy. In: Contributions to Functional Analysis. E.A. Zarantonello (ed). New York: Academic Press, 1971, pp. 603–634
18. Liu, T.-P., Smoller, J.: On the vacuum state for the isentropic gas dynamics equations. Adv. Appl. Math. 1, 345–359 (1980)
19. Perthame, B.: Kinetic Formulation of Conservation Laws. Lecture Notes, Ecole Normale Supérieure, Paris, 2001
20. Rodrigues, J.-F.: Obstacle Problems in Mathematical Physics. North-Holland Mathematics Studies 134, Amsterdam: Elsevier Science Publishers B.V., 1987
21. Schwartz, L.: Théorie des Distributions (2 volumes). Actualités Scientifiques et Industrielles 1091, 1122, Paris: Hermann, 1950–51
22. Serre, D.: Systems of Conservation Laws. Vols. 1, 2, Cambridge: Cambridge University Press, 1999, 2000
23. Stein, E.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton Univ. Press, 1970
24. Volpert, A.I.: The space BV and quasilinear equations. Mat. Sb. (N.S.) 73, 255–302 (1967); Math. USSR Sbornik 2, 225–267 (1967) (in English)
25. Wagner, D.: Equivalence of the Euler and Lagrange equations of gas dynamics for weak solutions. J. Diff. Eqs. 68, 118–136 (1987)
26. Whitney, H.: Geometric Integration Theory. Princeton, NJ: Princeton Univ. Press, 1957
27. Whitney, H.: Analytic extensions of differentiable functions defined in closed sets. Trans. Am. Math. Soc. 36, 63–89 (1934)
28. Ziemer, W.P.: Cauchy flux and sets of finite perimeter. Arch. Rational Mech. Anal. 84, 189–201 (1983)
29. Ziemer, W.P.: Weakly Differentiable Functions: Sobolev Spaces and Functions of Bounded Variation. New York: Springer-Verlag, 1989

Communicated by P. Constantin

Commun. Math. Phys. 236, 281–310 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0812-x

Communications in

Mathematical Physics

Well-Posedness for the Linearized Motion of a Compressible Liquid with Free Surface Boundary

Hans Lindblad

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0112, USA. E-mail: [email protected]

Received: 23 September 2002 / Accepted: 2 December 2002
Published online: 14 April 2003 – © Springer-Verlag 2003

Abstract: We study the motion of a compressible perfect liquid body in vacuum. This can be thought of as a model for the motion of the ocean or a star. The free surface moves with the velocity of the liquid and the pressure vanishes on the free surface. This leads to a free boundary problem for Euler’s equations, where the regularity of the boundary enters to highest order. We prove linearized stability in Sobolev space assuming a “physical condition”, related to the fact that the pressure of a fluid has to be positive.

1. Introduction

We consider Euler’s equations
$$\rho\,(\partial_t + V^k\partial_k)\,v_j = -\partial_j p, \quad j = 1, \ldots, n \quad \text{in } \mathcal{D}, \qquad \partial_i = \partial/\partial x^i, \tag{1.1}$$

describing the motion of a perfect compressible fluid body in a vacuum:

$$(\partial_t + V^k\partial_k)\rho + \rho\,\operatorname{div} V = 0, \qquad \operatorname{div} V = \partial_k V^k \quad\text{in } \mathcal{D}, \tag{1.2}$$

where $V^k = \delta^{ki}v_i = v_k$ and we use the summation convention over repeated upper and lower indices. Here the velocity $V = (V^1,\dots,V^n)$, the density $\rho$ and the domain $\mathcal{D} = \cup_{0\le t\le T}\{t\}\times\mathcal{D}_t$, $\mathcal{D}_t\subset\mathbb{R}^n$, are to be determined. The pressure $p = p(\rho)$ is assumed to be a given strictly increasing smooth function of the density. The boundary $\partial\mathcal{D}_t$ moves with the velocity of the fluid particles at the boundary. The fluid body moves in vacuum so the pressure vanishes in the exterior and hence on the boundary. We therefore also require the boundary conditions on $\partial\mathcal{D} = \cup_{0\le t\le T}\{t\}\times\partial\mathcal{D}_t$:

$$(\partial_t + V^k\partial_k)\big|_{\partial\mathcal{D}} \in T(\partial\mathcal{D}), \tag{1.3}$$

$$p = 0 \quad\text{on } \partial\mathcal{D}. \tag{1.4}$$

The author was supported in part by the National Science Foundation.


Constant pressure on the boundary leads to energy conservation and it is needed for the linearized equations to be well posed. Since the pressure is assumed to be a strictly increasing function of the density we can alternatively think of the density as a function of the pressure and for physical reasons this function has to be non-negative. Therefore the density has to be a non-negative constant $\bar\rho_0$ on the boundary and we will in fact assume that $\bar\rho_0 > 0$, which is the case of a liquid. We hence assume that

$$p(\bar\rho_0) = 0 \quad\text{and}\quad p'(\rho) > 0 \quad\text{for } \rho \ge \bar\rho_0, \quad\text{where } \bar\rho_0 > 0. \tag{1.5}$$

From a physical point of view one can alternatively think of the pressure as a small positive constant on the boundary. By thinking of the density as a function of the pressure the incompressible case can be thought of as the special case of a constant density function. The motion of the surface of the ocean is described by the above model. Free boundary problems for compressible fluids are also of fundamental importance in astrophysics since they describe stars. The model also describes the case of one fluid surrounded by and moving inside another fluid. For large massive bodies like stars gravity helps to hold it together and for smaller bodies like water drops surface tension helps to hold it together. Here we neglect the influence of gravity which will just contribute with a lower order term and we neglect surface tension which has a regularizing effect.

Given a bounded domain $\mathcal{D}_0\subset\mathbb{R}^n$, that is homeomorphic to the unit ball, and initial data $V_0$ and $\rho_0$, we want to find a set $\mathcal{D}\subset[0,T]\times\mathbb{R}^n$, a vector field $V$ and a function $\rho$, solving (1.1)–(1.4) and satisfying the initial conditions

$$\{x;\ (0,x)\in\mathcal{D}\} = \mathcal{D}_0, \tag{1.6}$$

$$V = V_0, \quad \rho = \rho_0 \quad\text{on } \{0\}\times\mathcal{D}_0. \tag{1.7}$$

In order for the initial-boundary value problem (1.1)–(1.7) to be solvable the initial data (1.7) have to satisfy certain compatibility conditions at the boundary. By (1.2), condition (1.4) also implies that $\operatorname{div} V|_{\partial\mathcal{D}} = 0$. We must therefore have $\rho_0|_{\partial\mathcal{D}_0} = \bar\rho_0$ and $\operatorname{div} V_0|_{\partial\mathcal{D}_0} = 0$. Furthermore, taking the divergence of (1.1) gives an equation for $(\partial_t + V^k\partial_k)\operatorname{div} V$ in terms of space derivatives of $V$ and $\rho$ only, which leads to further compatibility conditions. In general we say that initial data satisfy the compatibility condition of order $m$ if there is a formal power series solution $(\tilde\rho,\tilde V)$ in $t$ of (1.1)–(1.7), satisfying

$$(\partial_t + \tilde V^k\partial_k)^j(\tilde\rho - \bar\rho_0)\big|_{\{0\}\times\partial\mathcal{D}_0} = 0, \qquad j = 0,\dots,m-1. \tag{1.8}$$

Let $N$ be the exterior unit normal to the free surface $\partial\mathcal{D}_t$. Christodoulou [C2] conjectured that the initial value problem (1.1)–(1.8) is well posed in Sobolev spaces under the assumption

$$\nabla_N p \le -c_0 < 0 \quad\text{on } \partial\mathcal{D}, \quad\text{where } \nabla_N = N^i\partial_{x^i}. \tag{1.9}$$

Condition (1.9) is a natural physical condition. It says that the pressure and hence the density is larger in the interior than at the boundary. Since we have assumed that the pressure vanishes or is close to zero at the boundary, this is related to the fact that the pressure of a fluid has to be positive. In general it is possible to prove local existence for analytic data for the free interface between two fluids. However, this type of problem might be subject to instability in Sobolev norms, in particular Rayleigh-Taylor instability, which occurs when a heavier fluid is on top of a lighter fluid. Condition (1.9) prevents Rayleigh-Taylor instability


from occurring. Indeed, if this condition is violated Rayleigh-Taylor instability occurs in a linearized analysis. In the irrotational incompressible case the physical condition (1.9) always holds, see [W1, W2, CL], and [W1, W2] proved local existence in Sobolev spaces in that case. Wu [W1, W2] studied the classical water wave problem describing the motion of the surface of the ocean and showed that the water wave is not unstable when it turns over. Ebin [E1] showed that the general incompressible problem is ill posed in Sobolev spaces when the pressure is negative in the interior and the physical condition is not satisfied. Ebin [E2] also announced a local existence result for the incompressible problem with surface tension on the boundary, which has a regularizing effect so (1.9) is not needed then. In [CL], together with Christodoulou, we proved a priori bounds in Sobolev spaces in the general incompressible case of non-vanishing curl, assuming the physical condition (1.9) for the pressure. We also showed that the Sobolev norms remain bounded as long as the physical condition holds and the second fundamental form of the free surface and the first order derivatives of the velocity are bounded. Usually, existence follows from similar bounds for some iteration scheme, but the bounds in [CL] used all the symmetries of the equation and so only hold for modifications that preserve all the symmetries. In [L1] we showed existence for the linearized equations and in [L2] we proved local existence for the nonlinear incompressible problem with non-vanishing curl, assuming that (1.9) holds initially. For the corresponding compressible free boundary problem with non-vanishing density on the boundary, there are however in general no previous existence or well-posedness results. Relativistic versions of these problems have been studied in [C1, DN, F, FN, R] but solved only in special cases.
The methods used for the irrotational incompressible case use that the components of the velocity are harmonic to reduce the equations to equations on the boundary. This does not work in the compressible case since the divergence is non-vanishing and the pressure satisfies a wave equation in the interior. To be able to deal with the compressible case one therefore needs to use interior estimates as in [CL, L1]. Let us also point out that in nature one expects fluids to be compressible, e.g. water satisfies (1.5), see [CF]. For the general relativistic equations there is no special case corresponding to the incompressible case. Here we show existence for the linearized equations and estimates for these in Sobolev spaces in the general compressible case (1.1)–(1.7), assuming that (1.8) and (1.9) hold. This can be considered as a linearized stability result, showing that small perturbations of initial conditions in Sobolev spaces lead to small perturbations for finite times. Furthermore, in a forthcoming paper [L3] we use existence and estimates for the inverse of the linearized operator to prove existence for the nonlinear problem using the Nash-Moser technique also in the compressible case. The existence proof here uses the orthogonal decomposition of a vector field into a divergence free part and a gradient of a function that vanishes on the boundary. For the divergence free part we get an equation of the type studied in [L1] and for the divergence we get a wave equation on a bounded domain with Dirichlet boundary conditions. The interaction terms between these equations are lower order so if we set up an iteration, the equations decouple for the new iterate and the previous iterates only enter in the lower order terms. Existence of solutions for the wave equation on a bounded domain is well known. However, dealing with the divergence free part of the equation requires the techniques developed in [L1].
Here we use a generalization of the existence theorem in [L1] to the case when the divergence of the solution we linearize around is non-vanishing. In [L1] we showed that the linearized incompressible Euler equations become an evolution equation for what we called the normal operator. The normal operator is unbounded and not elliptic in the case of non-vanishing curl. It is however positive assuming the physical condition (1.9) and this leads to existence. Up to lower order terms, the projection of the linearized compressible Euler equations onto divergence free vector fields becomes the linearized incompressible Euler equations. As pointed out above, the positivity of the pressure (1.9) leads to the positivity of the normal operator, introduced in [L1]. It appears that this condition is needed for the well-posedness also in the compressible case since the divergence free part essentially decouples from the divergence. In fact, the compressible case was the main motivation for formulating (1.9) since in that case it is clear that the pressure has to be positive and in nature one expects fluids to be slightly compressible.

In order to formulate the linearized equations one has to parametrize the boundary. Let us therefore express Euler’s equations in the Lagrangian coordinates given by following the flow lines of the velocity vector field of the fluid particles. In these coordinates the boundary becomes fixed. Given a domain $\mathcal{D}_0$ in $\mathbb{R}^n$ that is diffeomorphic to the unit ball $\Omega$, we can by a theorem in [DM] find a diffeomorphism $f_0:\Omega\to\mathcal{D}_0$ with prescribed volume form $\det(\partial f_0/\partial y)$ up to a constant factor. Let $\mathcal{D}$ and $v\in C(\mathcal{D})$ satisfy (1.3). The Lagrangian coordinates $y$ are given by solving for the Eulerian coordinates $x = x(t,y) = f_t(y)$ in

$$\frac{dx(t,y)}{dt} = V(t,x(t,y)), \qquad x(0,y) = f_0(y), \qquad y\in\Omega. \tag{1.10}$$

Then $f_t:\Omega\to\mathcal{D}_t$ is a diffeomorphism and the boundary becomes fixed in the new $y$ coordinates. Let

$$D_t = \frac{\partial}{\partial t}\Big|_{x=\mathrm{const}} + V^k\frac{\partial}{\partial x^k} = \frac{\partial}{\partial t}\Big|_{y=\mathrm{const}} \quad\text{and}\quad \partial_i = \frac{\partial}{\partial x^i} = \frac{\partial y^a}{\partial x^i}\frac{\partial}{\partial y^a} \tag{1.11}$$

be the material derivative and partial differential operators expressed in the Lagrangian coordinates. In these coordinates Euler’s equation (1.1), the continuity equation (1.2) and the boundary condition (1.4) become:

$$D_t^2 x^i + \partial_i h = 0, \qquad D_t\rho + \rho\,\operatorname{div} V = 0, \qquad \rho\big|_{\partial\Omega} = \bar\rho_0, \tag{1.12}$$

where the enthalpy $h = h(\rho) = \int_{\bar\rho_0}^{\rho} p'(\rho)\rho^{-1}\,d\rho$ is a strictly increasing function of $\rho$, and $x$, $V = D_t x$ and $\rho$ are functions of $(t,y)\in[0,T]\times\Omega$. Furthermore, $\rho$ can be determined from $x$:

$$\rho = k\kappa^{-1}, \quad\text{where } \kappa = \det(\partial x/\partial y) \text{ and } k = \rho\kappa\big|_{t=0}, \tag{1.13}$$

since $D_t\kappa = \kappa\,\operatorname{div} V$. In (1.10) there is a choice of mapping $f_0$ and domain $\Omega$. By [DM] one can find a diffeomorphism with prescribed volume form up to a constant between any two diffeomorphic sets. We therefore choose $\Omega$ to be the unit ball and choose $\det(\partial f_0/\partial y)$ so that $\det(\partial f_0/\partial y)\rho_0 = k$, where $k(y)$ is any given fixed function, which we take to be constant. Making this choice, initial data for $\rho$ is part of the initial data for $x$. The free boundary problem for Euler’s equations (1.1)–(1.7) hence becomes an equation for $x(t,y)$:

$$D_t^2 x^i + \partial_i h = 0 \quad\text{in } [0,T]\times\Omega, \quad\text{where } \partial_i = \frac{\partial y^a}{\partial x^i}\frac{\partial}{\partial y^a}, \qquad \rho = k\kappa^{-1}, \qquad \kappa\big|_{\partial\Omega} = 1. \tag{1.14}$$

Here $h = h(\rho)$ is a strictly increasing function of $\rho$ and $\rho = \rho(\kappa)$ is a function of $\kappa = \det(\partial x/\partial y)$. Initial data are

$$x\big|_{t=0} = f_0, \qquad D_t x\big|_{t=0} = V_0. \tag{1.15}$$

In order for (1.14) to be solvable, initial data has to satisfy the constraints $\det(\partial f_0/\partial y)\big|_{\partial\Omega} = 1$, $\operatorname{div} V_0\big|_{\partial\Omega} = 0$, and taking the divergence of (1.14) gives an equation for $D_t\operatorname{div} V$ in terms of space derivatives of $x$ and $V = D_t x$, which leads to further conditions. Since (1.14) gives $D_t^2 x$ in terms of space derivatives of $x$ we can obtain a formal power series solution in time $t$, $\tilde x$, to the first two equations in (1.14) satisfying the initial conditions (1.15). The compatibility condition of order $m$ is the requirement that the formal power series solution up to terms of order $m$ satisfy the boundary condition in (1.14):

$$D_t^j\big(\det(\partial\tilde x/\partial y) - 1\big)\big|_{\{0\}\times\partial\Omega} = 0, \qquad j = 0,\dots,m-1. \tag{1.16}$$

Let us now derive the linearized equations. Equation (1.14) can be thought of as an equation $\Phi(x) = 0$, where $\Phi$ is a functional of $x(t,y)$ given by $\Phi(x)_i = D_t^2 x^i + \partial_i h$, for $1\le i\le n$, where $h$ is a given function of $\kappa = \det(\partial x/\partial y)$ and $\partial_i$ are the differential operators in (1.14) with coefficients depending on derivatives of $x$ as well, and $\Phi(x)_{n+1} = (\kappa - 1)\big|_{\partial\Omega}$. We assume that $x(t,y)$ is a given smooth solution of (1.14), i.e. $\Phi(x) = 0$. Let $\bar x(t,y,r)$ be a smooth function also of a parameter $r$, such that $\bar x\big|_{r=0} = x$, and set $\delta x = \partial\bar x/\partial r\big|_{r=0}$. Then the linearized equations are the requirement on $\delta x$ that $\bar x$ satisfies Eqs. (1.14) up to terms bounded by $r^2$ as $r\to 0$, i.e. $\Phi'(x)(\delta x) = \partial\Phi(\bar x)/\partial r\big|_{r=0} = 0$. If we replace $x$ in (1.14) by $\bar x$ and apply $\delta = \partial/\partial r\big|_{r=0}$ we hence obtain the linearized equations:

$$D_t^2\delta x^i + (\partial_i\partial_k h)\delta x^k - \partial_i\big((\partial_k h)\delta x^k - \delta h\big) = 0, \qquad \delta h = -h'(\rho)\rho\,\operatorname{div}\delta x, \qquad \operatorname{div}\delta x\big|_{\partial\Omega} = 0. \tag{1.17}$$

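Here we used that $[\delta,\partial_i] = -(\partial_i\delta x^k)\partial_k$ and $\delta\rho = -\rho\kappa^{-1}\delta\kappa = -\rho\,\operatorname{div}\delta x$, see Sect. 2 and [L1]. The initial data for the linearized equations are

$$\delta x\big|_{t=0} = \delta f_0, \qquad D_t\delta x\big|_{t=0} = \delta V_0. \tag{1.18}$$

The initial data are as before subject to constraints. Let $\delta\tilde x$ be the formal power series solution in time $t$ to (1.17)–(1.18). The compatibility condition of order $m$ is

$$D_t^j\operatorname{div}\delta\tilde x\big|_{\{0\}\times\partial\Omega} = 0, \qquad j = 0,\dots,m-1. \tag{1.19}$$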

The main difference between (1.17) and (1.14) is the higher order term $\partial_i\big((\partial_k h)\delta x^k\big)$, since the term $\partial_i\delta h$, depending on $\operatorname{div}\delta x$, in (1.17) corresponds to the term $\partial_i h$, depending on $\det(\partial x/\partial y)$, in (1.14). If we take $\bar x$ above to be a family of solutions of (1.14) depending on the parameter $r$, then our estimates below show that a small change of initial conditions only gives rise to a small change of the solution in Sobolev spaces. Our main result is the following linearized stability result:


Theorem 1.1. Let $\Omega$ be the unit ball in $\mathbb{R}^n$ and suppose that $x$ is a smooth solution of (1.14) satisfying (1.9) for $0\le t\le T$. Suppose that $(\delta f_0,\delta V_0)$ are smooth satisfying the compatibility conditions of all orders $m$, i.e. (1.19) holds for all $m$. Then the linearized equations (1.17) have a smooth solution $\delta x$ for $0\le t\le T$ satisfying the initial conditions (1.18). Let $N$ be the exterior unit normal to $\partial\mathcal{D}_t$ parametrized by $x(t,y)$ and let $\delta x_N = N\cdot\delta x$ be the normal component of $\delta x$. Set

$$E_r(t) = \|D_t\delta x(t,\cdot)\|_{H^r(\Omega)} + \|\delta x(t,\cdot)\|_{H^r(\Omega)} + \|\operatorname{div}\delta x(t,\cdot)\|_{H^r(\Omega)} + \|\delta x_N(t,\cdot)\|_{H^r(\partial\Omega)}, \tag{1.20}$$

where $H^r(\Omega)$ and $H^r(\partial\Omega)$ are the Sobolev spaces in $\Omega$ respectively on $\partial\Omega$. Then there are constants $C_r$ depending only on $x$, $r$ and $T$ such that

$$E_r(t) \le C_r E_r(0) \quad\text{for } 0\le t\le T, \quad r\ge 0. \tag{1.21}$$

Furthermore, let $\tilde N^r(\Omega)$ be the completion of $C^\infty(\Omega)$ in the norm $\|\delta x\|_{H^r(\Omega)} + \|\operatorname{div}\delta x\|_{H^r(\Omega)} + \|\delta x_N\|_{H^r(\partial\Omega)}$. Then if the constraints in (1.19) hold for all orders $m$ and

$$(\delta f_0,\delta V_0) \in \tilde N^r(\Omega)\times H^r(\Omega) \tag{1.22}$$

it follows that (1.17)–(1.18) has a solution

$$(\delta x, D_t\delta x) \in C\big([0,T],\ \tilde N^r(\Omega)\times H^r(\Omega)\big). \tag{1.23}$$

As we have argued, any smooth solution of (1.1)–(1.7) with $\mathcal{D}_0$ diffeomorphic to the unit ball can be reduced to a smooth solution of (1.14) where $\Omega$ is the unit ball. That there are initial data (1.18) such that (1.19) hold for all $m$ follows by taking $\delta f_0$ and $\delta V_0$ compactly supported in the interior of $\Omega$. The term $\|\delta x_N\|_{H^r(\partial\Omega)}$ is equivalent to the variation of the second fundamental form $\theta = \partial N$ of the free boundary $\partial\mathcal{D}_t$ measured in $H^{r-2}(\partial\Omega)$. For a general component we can only say that $\delta x\in H^{r-1/2}(\partial\Omega)$. The energy estimate (1.21) also holds in the incompressible case when $\operatorname{div}\delta x = 0$, see [L1], and in [CL] we obtained similar bounds for $\|v\|_{H^r(\Omega)} + \|\theta\|_{H^{r-2}(\partial\Omega)}$ in the nonlinear incompressible case.

Let us now outline the main ideas in the proof. We will rewrite the linearized equations (1.17) in a geometrically invariant way and use this to obtain energy bounds and existence. We have defined our vector fields as functions of the Lagrangian coordinates $(t,y)\in[0,T]\times\Omega$ but we can also think of them as functions of the Eulerian coordinates $(t,x)\in\mathcal{D}$, and we will make this identification without saying that we compose with the inverse of the change of coordinate $y\to x(t,y)$. The time derivative has a simple expression in the Lagrangian coordinates but the space derivatives have a simpler expression in the Eulerian coordinates, see (1.11). For the most part we will think of our functions and vector fields in the Lagrangian frame but we use the inner product coming from the Eulerian frame, i.e. in the Lagrangian frame we use the pull-back metric of the Euclidean inner product:

$$X\cdot Z = \delta_{ij}X^iZ^j = g_{ab}X^aZ^b, \quad\text{where } X^a = X^i\frac{\partial y^a}{\partial x^i}, \qquad g_{ab} = \delta_{ij}\frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}. \tag{1.24}$$

Here $X^i$ refers to the components of the vector $X$ in the Eulerian frame, $X^a$ refers to the components in the Lagrangian frame, $g_{ab}$ is the metric in the Lagrangian frame and $\delta_{ij}$ is the Euclidean metric. The letters a, b, c, d, e, f, g will refer to indices in the Lagrangian frame whereas i, j, k, l, m, n will refer to the Eulerian frame. The norms and most of the operators we consider have an invariant interpretation so it does not matter in which frame they are expressed. In the introduction we express the vector fields in the Eulerian frame but later we express them in the Lagrangian frame. The $L^2$ inner product is

$$\langle X,Z\rangle = \int_{\mathcal{D}_t} X\cdot Z\,dx = \int_\Omega X\cdot Z\,\kappa\,dy = \int_\Omega X\cdot Z\,\rho^{-1}k(y)\,dy, \tag{1.25}$$

where $\kappa = \det(\partial x/\partial y)$ and $k = \kappa\rho\big|_{t=0}$.

Let us first point out that the boundary condition $\rho\big|_{\partial\Omega} = \bar\rho_0$ leads to the fact that the energy is conserved for a solution of Euler’s equations (1.12). Let $Q(\rho) = \int_{\bar\rho_0}^{\rho} 2q(\rho)\rho^{-2}\,d\rho$, where $q(\rho) = p(\rho) - p(\bar\rho_0)$. Since $D_t(\rho\kappa) = 0$ and $\rho D_t v_i = -\partial_i p(\rho) = -\partial_i q(\rho)$ it follows from (1.25) and the divergence theorem that

$$\begin{aligned}
\frac{d}{dt}\int_{\mathcal{D}_t}\big(|V|^2 + Q(\rho)\big)\rho\,dx &= \int_{\mathcal{D}_t} D_t\big(|V|^2 + Q(\rho)\big)\rho\,dx \\
&= \int_{\mathcal{D}_t}\big(-2V^i\partial_i q(\rho) + 2q(\rho)\rho^{-1}D_t\rho\big)\,dx \\
&= 2\int_{\mathcal{D}_t}\big(\operatorname{div}V\,q(\rho) + q(\rho)\rho^{-1}D_t\rho\big)\,dx - 2\int_{\partial\mathcal{D}_t} V_N\,q(\rho)\,dS = 0,
\end{aligned} \tag{1.26}$$

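where $V_N = N_iV^i$ is the normal component and we also used (1.12). We will obtain similar energy estimates for the linearized equations (1.17) for energies containing an additional boundary term.

We will first rewrite the linearized equations in a geometrically invariant way. The last term in the first equation in (1.17) is a positive symmetric operator in the energy inner product on vector fields satisfying the boundary condition $\operatorname{div}X\big|_{\partial\mathcal{D}_t} = 0$:

$$CX = -\nabla\big(h'(\rho)(\rho\,\operatorname{div}X + (\partial_k\rho)X^k)\big) = -\nabla\big(h'(\rho)\operatorname{div}(\rho X)\big), \quad\text{where } \nabla^i = \delta^{ij}\partial_j, \tag{1.27}$$

i.e. $\langle X,\rho CX\rangle \ge 0$ and $\langle Z,\rho CX\rangle = \langle\rho CZ, X\rangle$ if $\operatorname{div}X\big|_{\partial\mathcal{D}_t} = \operatorname{div}Z\big|_{\partial\mathcal{D}_t} = 0$. In fact, if $\operatorname{div}X\big|_{\partial\mathcal{D}_t} = 0$, then $h'(\rho)\operatorname{div}(\rho X)\big|_{\partial\mathcal{D}_t} = X^k\partial_k h\big|_{\partial\mathcal{D}_t} = X_N\rho^{-1}\nabla_N p\big|_{\partial\mathcal{D}_t}$ and integrating by parts we get

$$\langle Z,\rho CX\rangle = \int_{\mathcal{D}_t}\operatorname{div}(\rho Z)\operatorname{div}(\rho X)\,h'(\rho)\,dx + \int_{\partial\mathcal{D}_t} Z_NX_N(-\nabla_N p)\,dS, \tag{1.28}$$

which proves the symmetry and the positivity follows since $h'(\rho)\ge c_1 > 0$ and $\nabla_N p\le -c_0 < 0$. We will also replace the time derivative by a time derivative that preserves the boundary condition. Let

$$\mathcal{L}_{D_t}X^i = D_tX^i - (\partial_kV^i)X^k = \frac{\partial x^i}{\partial y^a}D_t\Big(\frac{\partial y^a}{\partial x^k}X^k\Big) \tag{1.29}$$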


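be the space time Lie derivative with respect to $D_t = (1,V)$ restricted to the space components or equivalently, the time derivative of the vector field expressed in the Lagrangian frame. Let

$$\hat{\mathcal{L}}_{D_t}X^i = \mathcal{L}_{D_t}X^i + \operatorname{div}V\,X^i = \kappa^{-1}\frac{\partial x^i}{\partial y^a}D_t\Big(\kappa\frac{\partial y^a}{\partial x^k}X^k\Big) \tag{1.30}$$

be the modified Lie derivative that preserves the boundary condition $\operatorname{div}X\big|_{\partial\Omega} = 0$. In fact

$$\operatorname{div}\hat{\mathcal{L}}_{D_t}X = \hat D_t\operatorname{div}X, \quad\text{where } \hat D_t = D_t + \operatorname{div}V. \tag{1.31}$$

The linearized equations (1.17) can now be written as an evolution equation for the operator $C$:

$$\ddot X + CX = B(X,\dot X), \qquad \operatorname{div}X\big|_{\partial\Omega} = 0, \tag{1.32}$$

where $X = \delta x$, $\dot X = \hat{\mathcal{L}}_{D_t}X$, $\ddot X = \hat{\mathcal{L}}_{D_t}^2X$ and $B$ is a linear form with coefficients depending on $x$ and $\rho$. Associated with (1.32) is the energy

$$E(t) = \langle\dot X,\rho\dot X\rangle + \langle X,\rho(C+I)X\rangle, \tag{1.33}$$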

and we prove that $dE/dt \le CE$, which gives the bound (1.21) for $r = 0$. The boundary term in (1.20) comes from (1.28). To obtain estimates for higher order derivatives one can apply modified Lie derivatives with respect to tangential vector fields as in [L1]. This does however not prove existence for (1.32) which is non-standard since $C$ is non-elliptic, time dependent and the boundary condition is non-trivial. We use the orthogonal projection onto divergence free vector fields in the inner product (1.25) to obtain an equation for the divergence and an equation for the divergence free part. The equation decouples to highest order, and existence and estimates for the system follows from existence and estimates for each equation with an inhomogeneous term. The orthogonal projection is

$$PX = X - \nabla q, \quad\text{where } \Delta q = \operatorname{div}X, \quad q\big|_{\partial\Omega} = 0. \tag{1.34}$$

We will obtain a system of equations for $X_0 = PX$ and $X_1 = (I-P)X$ by projecting the linearized equation (1.32) onto divergence free vector fields, respectively the orthogonal complement. Taking the divergence of (1.32) gives a wave equation for $\operatorname{div}X$ with Dirichlet boundary condition:

$$\hat D_t^2\operatorname{div}X_1 - \Delta\big(h'(\rho)\rho\,\operatorname{div}X_1\big) = \Delta\big(X^k\partial_k h\big) + \operatorname{div}B(X,\dot X), \qquad \operatorname{div}X_1\big|_{\partial\Omega} = 0, \tag{1.35}$$

for which existence is known if $h'(\rho)\rho$ and the metric $g_{ab}$, hidden in $\Delta = \sum_{i=1}^n\partial_i^2 = \kappa^{-1}\partial_a\big(\kappa g^{ab}\partial_b\big)$, are bounded from above and below and the right-hand side is thought of as a known function, see Sect. 6. $X_1$ is then determined from $\operatorname{div}X_1$ by solving the Dirichlet problem:

$$X_1 = \nabla q_1, \qquad \Delta q_1 = \operatorname{div}X_1, \qquad q_1\big|_{\partial\Omega} = 0. \tag{1.36}$$
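As a consistency check, the wave equation (1.35) can be recovered by taking the divergence of (1.32) term by term. The following is only a sketch: it uses (1.27), (1.31), the identity $h'(\rho)\partial_k\rho = \partial_k h$, and the fact that $\operatorname{div}X = \operatorname{div}X_1$ because $X_0 = PX$ is divergence free. No notation beyond that of (1.27)–(1.35) is introduced.

```latex
\begin{align*}
\operatorname{div}\ddot X
  &= \hat D_t^2 \operatorname{div} X
  && \text{by (1.31), applied twice},\\
\operatorname{div} CX
  &= -\Delta\big(h'(\rho)\operatorname{div}(\rho X)\big)
   = -\Delta\big(h'(\rho)\rho\operatorname{div} X\big)
     - \Delta\big(X^k\partial_k h\big)
  && \text{by (1.27)},\\
\hat D_t^2 \operatorname{div} X_1
  - \Delta\big(h'(\rho)\rho\operatorname{div} X_1\big)
  &= \Delta\big(X^k\partial_k h\big) + \operatorname{div} B(X,\dot X)
  && \text{since } \operatorname{div} X = \operatorname{div} X_1 .
\end{align*}
```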


To obtain an equation for the divergence free part $X_0$ we project (1.32) onto divergence free vector fields. It follows from (1.27) that

$$AX = PCX = P\big(-\nabla(X^k\partial_k h)\big), \tag{1.37}$$

since $\operatorname{div}X\big|_{\partial\Omega} = 0$ and the projection of a gradient of a function that vanishes on the boundary vanishes. The operator $A$ is a positive symmetric operator on divergence free vector fields, if condition (1.9) holds:

$$\langle X,AZ\rangle = -\int_{\mathcal{D}_t} X^i\partial_i\big(Z^k\partial_k h\big)\,dx = \int_{\partial\mathcal{D}_t} X_NZ_N(-\nabla_N h)\,dS, \quad\text{if } \operatorname{div}X = \operatorname{div}Z = 0. \tag{1.38}$$

Furthermore, we note that the commutator of time derivatives with the projection is lower order:

$$[\hat{\mathcal{L}}_{D_t},P]X^i = -P\big((\hat{\mathcal{L}}_{D_t}\delta^{ij})\,\delta_{jk}(I-P)X^k\big), \tag{1.39}$$

which follows since $\hat{\mathcal{L}}_{D_t}$ preserves the divergence free condition and the projection of $\delta^{ij}\partial_jD_tq$ vanishes if $q\big|_{\partial\Omega} = 0$ and hence $D_tq\big|_{\partial\Omega} = 0$. Hence $P\ddot X_1 = PB_2(X_1,\dot X_1)$ can be determined in terms of $X_1$ and $\dot X_1$. Projection of (1.32) therefore gives an evolution equation for the operator $A$ for the divergence free part:

$$\ddot X_0 + AX_0 = -PB_2(X_1,\dot X_1) - AX_1 + PB(X,\dot X). \tag{1.40}$$

Existence for (1.40) with the right-hand side thought of as a known function is a generalization of the existence proof in [L1]. For the divergence part we have an equation which is equivalent to (1.35)–(1.36):

$$\ddot X_1 - \nabla\big(h'(\rho)\rho\,\operatorname{div}X_1\big) - PB_2(X_1,\dot X_1) = (I-P)\nabla\big((\partial_k h)X^k\big) + (I-P)B(X,\dot X), \qquad \operatorname{div}X_1\big|_{\partial\Omega} = 0. \tag{1.41}$$

We will show existence for the system (1.40)–(1.41) for $(X_0,X_1)$ and from that we obtain a solution $X = X_0 + X_1$ to (1.32), since these equations are exactly the projection of (1.32) onto the divergence free vector fields respectively the orthogonal component. The system (1.40)–(1.41) can be solved by iteration. If $X$ is an iterate, then from (1.35)–(1.36) and (1.40) we get $X_1$ and $X_0$ and a new iterate is $X_0 + X_1$. There is no loss of regularity in this procedure since $\operatorname{div}X$ has the same space regularity as $X$. However, in order for it to be possible to solve (1.35) the initial conditions and the equation must be compatible with a formal power series solution satisfying the boundary conditions, and we must make sure that this is true at each step of the iteration. Equation (1.32) gives $\ddot X$ in terms of space derivatives of $X$ and $\dot X$ only and we hence obtain a formal power series solution in time, the first two terms coming from the initial conditions. Our assumption is that this formal power series solution satisfies the boundary condition. From this formal power series one can construct an approximate solution $\tilde X$ satisfying the initial conditions, the equation to all orders as $t\to 0$, and the boundary condition. We can then take the approximate solution as our first iterate or equivalently subtract off the approximate solution from $X$, which produces an inhomogeneous term vanishing to all orders as $t\to 0$ and vanishing initial conditions.


Let us now conclude the introduction by giving the main estimates we use. Since the time derivative preserves the boundary condition it is natural to use norms which also contain time derivatives up to full order in the proof, and the estimate in Theorem 1.1 afterwards follows from these. Let

$$\|X(t)\|_{H^r} = \|X(t,\cdot)\|_{H^r(\Omega)}, \qquad \|X(t)\|_r = \sum_{s+k\le r}\|D_t^kX(t)\|_{H^s}, \qquad \langle X_0(t)\rangle_r = \|X_{0N}(t,\cdot)\|_{H^r(\partial\Omega)}, \tag{1.42}$$

where $X_{0N} = X_0\cdot N$ is the normal component. For the divergence free equation

$$\ddot X_0 + AX_0 = F_0, \qquad PF_0 = F_0, \tag{1.43}$$

we have the estimate

$$\|\dot X_0(t)\|_r + \|X_0(t)\|_r + \langle X_0(t)\rangle_r \le C\Big(\|\dot X_0(0)\|_r + \|X_0(0)\|_r + \langle X_0(0)\rangle_r + \int_0^t\|F_0\|_r\,d\tau\Big). \tag{1.44}$$

This is a generalization of the estimate for the incompressible case in [L1]. For $r = 0$, one uses the symmetry and positivity (1.38) of $A$ to prove that $E = \langle\dot X_0,\dot X_0\rangle + \langle X_0,(A+I)X_0\rangle$ satisfies $dE/dt\le CE$. Note that the boundary term comes from using (1.38). For $r > 0$, it follows from commuting modified Lie derivatives with respect to tangential vector fields through the equation to obtain similar equations and estimates for these, together with better estimates for the curl since the curl of $A$ vanishes. For

$$\ddot X_1 - \nabla\big(h'(\rho)\rho\,\operatorname{div}X_1\big) - PB_2(X_1,\dot X_1) = F_1, \qquad (I-P)F_1 = F_1, \qquad \operatorname{div}X_1\big|_{\partial\Omega} = 0, \tag{1.45}$$

we have the estimate

$$\|X_1(t)\|_{r+1} \le C\Big(\|X_1(0)\|_{r+1} + \int_0^t\|\dot F_1\|_{r-1}\,d\tau\Big). \tag{1.46}$$

The last estimate follows from estimating the wave equation (1.35) with the right-hand side replaced by $\operatorname{div}F_1$ and inverting the Laplacian (1.36). In (1.46) we do not need space derivatives up to highest order of $F_1$, since one obtains space derivatives from time derivatives through inverting the Laplacian in the wave equation. Using that the right-hand side of (1.41) is a gradient, (1.46) also holds for $r = 0$. With $F_0$ equal to the right-hand side of (1.40) and $F_1$ equal to the right-hand side of (1.41) the norms in the integrals in (1.44) and (1.46) can be estimated by the sum of the norms in the left of (1.44) and (1.46) and this gives a priori bounds as well as uniform estimates for iterates, if we let $F_0$ and $F_1$ be obtained from the previous iterate and solve (1.43) and (1.45) for the new iterate. Once we have obtained the solution to the system (1.40)–(1.41), $X = X_0 + X_1$ is the solution to (1.32). The norm in (1.20) is bounded by the sum of the norms in the left of (1.44) and (1.46). Using (1.32) one can bound time derivatives in terms of space derivatives and using that the projection is continuous in the norms (1.42) it follows that the norms in the left of (1.44) and (1.46) can be bounded by (1.20).
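The step from a differential inequality for the energy to a bound on a finite time interval, used above for $r = 0$, is the standard Gronwall argument. A minimal sketch, assuming only that $E(t)\ge 0$ is differentiable on $[0,T]$ and that the constant $C$ does not depend on $t$:

```latex
\begin{align*}
\frac{dE}{dt} \le C E
\;\Longrightarrow\;
\frac{d}{dt}\Big(e^{-Ct}E(t)\Big)
  = e^{-Ct}\Big(\frac{dE}{dt} - C E\Big) \le 0
\;\Longrightarrow\;
E(t) \le e^{Ct}E(0) \le e^{CT}E(0),
\qquad 0 \le t \le T .
\end{align*}
```

The resulting constant $e^{CT}$ depends on $T$ and, through $C$, on the solution one linearizes around, which matches the dependence of the constants stated in Theorem 1.1.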


2. Lagrangian Coordinates and the Linearized Equation

Let us introduce Lagrangian coordinates in which the boundary becomes fixed. Let $\Omega$ be a domain in $\mathbb{R}^n$ and let $f_0:\Omega\to\mathcal{D}_0$ be a diffeomorphism. We assumed that $\mathcal{D}_0$ is diffeomorphic to the unit ball and that $v(t,x)$, $p(t,x)$, $(t,x)\in\mathcal{D}$, are given satisfying the boundary conditions (1.3)–(1.4). The Lagrangian coordinates $y$ are given by solving for the Eulerian coordinates $x = x(t,y) = f_t(y)$ in

$$dx/dt = V(t,x(t,y)), \qquad x(0,y) = f_0(y), \qquad y\in\Omega. \tag{2.1}$$

Then $f_t:\Omega\to\mathcal{D}_t$ is a diffeomorphism, and the boundary becomes fixed in the new $y$ coordinates. Let us introduce the notation

$$D_t = \frac{\partial}{\partial t}\Big|_{y=\mathrm{constant}} = \frac{\partial}{\partial t}\Big|_{x=\mathrm{constant}} + V^k\frac{\partial}{\partial x^k} \tag{2.2}$$

for the material derivative and

$$\partial_i = \frac{\partial}{\partial x^i} = \frac{\partial y^a}{\partial x^i}\frac{\partial}{\partial y^a} \tag{2.3}$$

for the partial derivatives. In these coordinates Euler’s equation (1.1) becomes

$$\rho D_t^2x_i + \partial_i p = 0, \qquad (t,y)\in[0,T]\times\Omega, \tag{2.4}$$

and the continuity equation (1.2) becomes

$$D_t\rho + \rho\,\operatorname{div}V = 0, \qquad (t,y)\in[0,T]\times\Omega. \tag{2.5}$$

Here the pressure $p = p(\rho)$ is assumed to be a given smooth strictly increasing function of the density $\rho$. Let $\bar\rho_0$ be defined by $p(\bar\rho_0) = 0$. Let $h$, the enthalpy, be defined by

$$h(\rho) = \int_{\bar\rho_0}^{\rho} p'(\rho)\rho^{-1}\,d\rho. \tag{2.6}$$

Then (2.4) becomes

$$D_t^2x_i + \partial_i h = 0, \qquad (t,y)\in[0,T]\times\Omega. \tag{2.7}$$
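The passage from (2.4) to (2.7) is a one-line chain rule computation. The sketch below spells it out under the standing assumption $\rho\ge\bar\rho_0 > 0$ from (1.5), which allows division by $\rho$:

```latex
\begin{align*}
h'(\rho) &= p'(\rho)\,\rho^{-1}
  && \text{by (2.6)},\\
\partial_i h &= h'(\rho)\,\partial_i\rho
  = \rho^{-1}p'(\rho)\,\partial_i\rho
  = \rho^{-1}\partial_i p ,\\
0 &= \rho^{-1}\big(\rho D_t^2 x_i + \partial_i p\big)
  = D_t^2 x_i + \partial_i h .
\end{align*}
```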

The density $\rho$ satisfies (2.5) but since

$$\kappa = \det(\partial x/\partial y) \tag{2.8}$$

satisfies

$$D_t\kappa - \kappa\,\operatorname{div}V = 0, \tag{2.9}$$

it follows that $\rho = \rho_0\kappa_0/\kappa$, where $\rho_0$ and $\kappa_0$ are the initial values. By a theorem of [DM] one can arbitrarily prescribe the volume form $\kappa_0$ up to a constant so we take $\kappa_0 = k/\rho_0$, where $k$ is a constant, and $\Omega$ to be the unit ball, by composing with a diffeomorphism, since we assumed that $\mathcal{D}_0$ is diffeomorphic to a unit ball. Hence $\rho$ is determined from $x$:

$$\rho = k/\kappa. \tag{2.10}$$
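For the reader's convenience, here is a sketch of why (2.9) holds and why it yields (2.10). It relies on Jacobi's formula for the derivative of a determinant and on the fact that $D_t$ commutes with $\partial/\partial y^a$ in the Lagrangian coordinates; both are standard and are not specific to this paper.

```latex
\begin{align*}
D_t\kappa
  &= \kappa\,\frac{\partial y^a}{\partial x^i}\,
     D_t\frac{\partial x^i}{\partial y^a}
   = \kappa\,\frac{\partial y^a}{\partial x^i}\,
     \frac{\partial V^i}{\partial y^a}
   = \kappa\,\partial_i V^i
   = \kappa\operatorname{div} V ,\\
D_t(\rho\kappa)
  &= \kappa\big(D_t\rho + \rho\operatorname{div} V\big) = 0
  && \text{by (2.5)},\\
\rho\kappa &= \rho_0\kappa_0 = k .
\end{align*}
```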


By choosing the constant $k$ appropriately the boundary condition (1.4) can hence be expressed

$$\kappa\big|_{\partial\Omega} = 1. \tag{2.11}$$

Since $h$ is a function of $\rho$ which in turn by our choice (2.10) is a function of $\kappa = \det(\partial x/\partial y)$ we can think of $h$ as a function of $\kappa$. Equation (2.7) is then an equation involving the coordinate $x$ only and initial data for $\rho$ is included in the choice of initial mapping $f_0$. Initial data for (2.7) are

$$x\big|_{t=0} = f_0, \qquad D_tx\big|_{t=0} = V_0. \tag{2.12}$$

In order for (2.7) to have a smooth solution satisfying (2.11), initial data has to satisfy the constraints $\det(\partial f_0/\partial y)\big|_{\partial\Omega} = 1$ and $\operatorname{div}V_0\big|_{\partial\Omega} = 0$, by (2.9). Taking the divergence of (2.7) gives

$$D_t\operatorname{div}V + (\partial_iV^k)\partial_kV^i + \Delta h = 0, \tag{2.13}$$

which leads to further conditions. Since (2.7) gives $D_t^2x$ in terms of space derivatives of $x$ we can obtain a formal power series solution in time $t$, $\tilde x$, to (2.7) satisfying the initial conditions (2.12). The compatibility condition of order $m$ is the requirement that the formal power series solution up to terms of order $m$ satisfy the boundary condition in (2.11):

$$D_t^j\big(\det(\partial\tilde x/\partial y) - 1\big)\big|_{\{0\}\times\partial\Omega} = 0, \qquad j = 0,\dots,m-1. \tag{2.14}$$

At this point we also remark that we get a wave equation for $h$. Since $h$ is a strictly increasing function of $\rho$ we can think of $\rho = \rho(h)$ as a function of $h$. Hence with $e(h) = \ln\rho(h)$, (2.5) instead becomes

$$D_te(h) + \operatorname{div}V = 0, \tag{2.15}$$

and this together with (2.13) gives a wave equation for $h$ with Dirichlet boundary conditions:

$$D_t^2e(h) - \Delta h - (\partial_iV^k)\partial_kV^i = 0, \qquad h\big|_{\partial\Omega} = 0. \tag{2.16}$$
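Both (2.13) and (2.16) follow from a single commutator identity. The sketch below derives it from (2.3), using that $D_t$ commutes with $\partial/\partial y^a$; the auxiliary function $f$ is a generic smooth function introduced only for this illustration.

```latex
\begin{align*}
D_t\frac{\partial y^a}{\partial x^i}
  &= -\frac{\partial y^a}{\partial x^k}\,\partial_i V^k
  \quad\Longrightarrow\quad
  [\partial_i, D_t]f = (\partial_i V^k)\,\partial_k f ,\\
\partial_i D_t V^i
  &= D_t\operatorname{div} V + (\partial_i V^k)\partial_k V^i ,
  \qquad
  \partial_i D_t V^i = -\Delta h
  \quad\text{by (2.7)}
  && \Longrightarrow\ \text{(2.13)},\\
D_t^2 e(h)
  &= -D_t\operatorname{div} V
   = \Delta h + (\partial_i V^k)\partial_k V^i
  && \text{by (2.15), (2.13)}\ \Longrightarrow\ \text{(2.16)}.
\end{align*}
```

The Dirichlet condition in (2.16) is simply (1.4) restated: $p = 0$ on the boundary means $\rho = \bar\rho_0$ there, and $h(\bar\rho_0) = 0$ by (2.6).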

Here

$$\Delta h = \sum_i\partial_i^2h = \kappa^{-1}\partial_a\big(\kappa g^{ab}\partial_bh\big), \quad\text{where } g_{ab} = \delta_{ij}\frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}, \tag{2.17}$$

is the metric in the Lagrangian coordinates and $g^{ab}$ is its inverse. Here $\partial_a = \partial/\partial y^a$ and we use the convention that differentiation with respect to the Eulerian coordinates is denoted by letters i, j, k, l, m, n and with respect to the Lagrangian coordinates is denoted by a, b, c, d, e, f. In order for (2.16) to be solvable we must have that

$$0 < e' + 1/e' \le c_1, \qquad \sum_{a,b=1}^n|g^{ab}| + |g_{ab}| \le n\,c_1^2, \qquad |\partial x/\partial y|^2 + |\partial y/\partial x|^2 \le c_1^2 \tag{2.18}$$

for some constant $0 < c_1 < \infty$. The first condition is related to that the pressure is assumed to be a strictly increasing smooth function of the density. The second and third conditions are equivalent and say that the coordinate mapping is a diffeomorphism. Furthermore, it is well-known that one needs compatibility conditions to solve (2.16).

Let us now derive the linearized equations. The calculations that follow below are similar to those in [L1] since Eq. (2.7) mathematically is the same as the equation for the incompressible case with the enthalpy $h$ replaced by the pressure $p$. We therefore refer the reader to [L1] for more details. We now assume that we have a smooth solution $x = x(t,y)$ of (2.7) satisfying (1.9) for $0\le t\le T$ and we will derive the linearized equations of this solution. Assume that $\bar x = \bar x(t,y,r)$ is a smooth function also of the extra parameter $r$ such that $\bar x\big|_{r=0} = x$ and set $\delta x = \partial\bar x/\partial r\big|_{r=0}$. Then the linearized equations are the requirements on $\delta x$ that $\bar x$ satisfies (2.7) and (2.10)–(2.11) up to terms bounded by $r^2$ as $r\to 0$. Let $\delta$ be a variation in the Lagrangian coordinates, i.e. a derivative

$$\delta f = \partial f/\partial r\big|_{r=0} \tag{2.19}$$

with respect to the parameter $r$ when $t$ and $y$ are fixed. Then

$$[\delta,D_t] = 0, \qquad [\delta,\partial_i] = -(\partial_i\delta x^k)\partial_k, \tag{2.20}$$

so $[\delta - \delta x^k\partial_k,\ \partial_i] = 0$. Applying $\delta - \delta x^k\partial_k$ to (2.7) gives:

$$D_t^2\delta x_i - (\partial_kD_t^2x_i)\delta x^k - \partial_i\big(\delta x^k\partial_kh - \delta h\big) = 0. \tag{2.21}$$

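Since $h = h(\rho)$, where $\rho = k/\kappa$ and
$$\delta\kappa = \kappa \operatorname{div}\delta x, \tag{2.22}$$
it follows that
$$\delta h = -h'(\rho)\rho \operatorname{div}\delta x. \tag{2.23}$$
The variation of the boundary condition (2.11) becomes
$$\operatorname{div}\delta x\big|_{\partial\Omega} = 0. \tag{2.24}$$
The initial data for (2.21) with $\delta h$ given by (2.23) are
$$\delta x = \delta f_0, \qquad D_t\delta x = \delta V_0. \tag{2.25}$$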

In order for it to be possible to have a smooth solution of (2.21) and (2.23)–(2.24), the initial data (2.25) must satisfy certain compatibility conditions. The initial data are subject to the constraints $\operatorname{div}\delta f_0|_{\partial\Omega} = 0$ and $\operatorname{div}\delta V_0 = (\partial_i V^k)\partial_k \delta x^i$. Taking the divergence of (2.21) using (2.23) and (2.24) gives a wave equation for $\operatorname{div}\delta x$ with Dirichlet boundary conditions:
$$D_t^2 \operatorname{div}\delta x - \delta x^i\partial_i D_t \operatorname{div} V - \Delta\big(\delta x^k\partial_k h - \delta h\big) + 2(\partial_i V^k)\partial_k\big(\delta V^i - \delta x^l\partial_l V^i\big) = 0, \tag{2.26}$$
which gives further conditions. Since (2.21) gives $D_t^2\delta x$ in terms of space derivatives of $\delta x$ only, this gives a formal power series solution in time $t$, which we call $\delta\tilde{x}$. The compatibility condition of order $m$ is the requirement that the formal power series solution satisfies the boundary condition (2.24):
$$D_t^j \operatorname{div}\delta\tilde{x}\big|_{\partial\Omega} = 0, \qquad j = 0, \dots, m-1. \tag{2.27}$$

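The basic assumption in solving the system (2.21)–(2.25) is that $\operatorname{div}\delta x$ should have the same space regularity as $\delta x$. Let us now also express the vector field in the Lagrangian frame. Let
$$W^a = \frac{\partial y^a}{\partial x^i}\delta x^i. \tag{2.28}$$
Then
$$D_t\delta x^i = D_t\big(W^b\,\partial x^i/\partial y^b\big) = (D_t W^b)\,\partial x^i/\partial y^b + W^b\,\partial V^i/\partial y^b = (D_t W^b)\,\partial x^i/\partial y^b + \delta x^k\partial_k V^i, \tag{2.29}$$
and multiplying with the inverse $\partial y^a/\partial x^i$ gives
$$D_t W^a = \frac{\partial y^a}{\partial x^i}\mathcal{L}_{D_t}\delta x^i \qquad\text{and}\qquad \hat{D}_t W^a = \frac{\partial y^a}{\partial x^i}\hat{\mathcal{L}}_{D_t}\delta x^i, \tag{2.30}$$
where the Lie derivative and modified Lie derivative are given by (1.29)–(1.30) and
$$\hat{D}_t W^a = D_t W^a + (\operatorname{div} V)W^a = \kappa^{-1}D_t(\kappa W^a). \tag{2.31}$$
Since the divergence is invariant,
$$\operatorname{div}\delta x = \operatorname{div} W = \kappa^{-1}\partial_a(\kappa W^a), \tag{2.32}$$
it therefore follows that
$$\operatorname{div}\hat{D}_t W = \hat{D}_t \operatorname{div} W. \tag{2.33}$$
Differentiating (2.30) once more gives
$$D_t^2\delta x^i - (\partial_k D_t V^i)\delta x^k = (D_t^2 W^b)\,\partial x^i/\partial y^b + 2(D_t W^b)\,\partial V^i/\partial y^b. \tag{2.34}$$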

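It follows that
$$\delta_{ij}\frac{\partial x^i}{\partial y^a}\big(D_t^2\delta x^j - (\partial_k D_t V^j)\delta x^k\big) = \delta_{ij}\frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}D_t^2 W^b + 2(D_t W^b)\frac{\partial x^i}{\partial y^b}\frac{\partial x^j}{\partial y^a}\partial_i v_j = g_{ab}D_t^2 W^b + (D_t g_{ab} - \omega_{ab})D_t W^b, \tag{2.35}$$
where $g_{ab}$ is given by (2.17) and
$$D_t g_{ab} = \frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}\big(\partial_i v_j + \partial_j v_i\big), \qquad \omega_{ab} = \frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}\big(\partial_i v_j - \partial_j v_i\big). \tag{2.36}$$
With $\partial_a = \partial/\partial y^a$ the linearized equations (2.21) and (2.23) become
$$g_{ab}D_t^2 W^b - \partial_a\big((\partial_c h)W^c - \delta h\big) = -\big(D_t g_{ac} - \omega_{ac}\big)D_t W^c, \qquad \delta h = -p'\operatorname{div} W. \tag{2.37}$$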


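Let $\hat{D}_t$ be as in (2.31), i.e. $\hat{D}_t = D_t + \dot\sigma$, where $\sigma = \ln\kappa$ and $\dot\sigma = D_t\sigma = \operatorname{div} V$. Then
$$D_t = \hat{D}_t - \dot\sigma, \qquad D_t^2 = \hat{D}_t^2 - 2\dot\sigma\hat{D}_t + \dot\sigma^2 - \ddot\sigma, \qquad \ddot\sigma = D_t^2\sigma. \tag{2.38}$$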
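The second identity in (2.38) can be checked directly. The computation below is only a sketch, applied to an arbitrary smooth $W$ and using nothing beyond $\hat{D}_t = D_t + \dot\sigma$ and the Leibniz rule for $D_t$:

```latex
\begin{align*}
\hat D_t(\dot\sigma W) &= D_t(\dot\sigma W) + \dot\sigma^2 W
  = \ddot\sigma W + \dot\sigma D_t W + \dot\sigma^2 W
  = \ddot\sigma W + \dot\sigma \hat D_t W, \\
D_t^2 W &= (\hat D_t - \dot\sigma)(\hat D_t W - \dot\sigma W)
  = \hat D_t^2 W - \hat D_t(\dot\sigma W) - \dot\sigma \hat D_t W + \dot\sigma^2 W \\
  &= \hat D_t^2 W - 2\dot\sigma \hat D_t W + (\dot\sigma^2 - \ddot\sigma) W .
\end{align*}
```

The term $-\ddot\sigma W$ comes entirely from $\hat{D}_t$ falling on the coefficient $\dot\sigma$; this is the origin of the zeroth order term $(\ddot\sigma - \dot\sigma^2)W^a$ in (2.40).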

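Hence, with $\dot W = \hat{D}_t W$ and $\ddot W = \hat{D}_t^2 W$, we can write (2.37) as $LW = 0$, where
$$LW^a = \ddot W^a - g^{ab}\partial_b\big((\partial_c h)W^c - \delta h\big) - B^a(W, \dot W), \qquad \delta h = -p'\operatorname{div} W, \tag{2.39}$$
where
$$B^a(W, \dot W) = -g^{ab}\big(\dot g_{bc} - \omega_{bc}\big)\big(\dot W^c - \dot\sigma W^c\big) + 2\dot\sigma\dot W^a + (\ddot\sigma - \dot\sigma^2)W^a. \tag{2.40}$$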

3. The Compatibility Conditions, Statement of the Theorem and the Lowest Order Energy Estimate

We now consider the linearized operator
$$LW = \ddot W + CW - B(W, \dot W), \tag{3.1}$$
where $\dot W = \hat{D}_t W$, $\ddot W = \hat{D}_t^2 W$, $\hat{D}_t = D_t + (\operatorname{div} V)$, $B$ is the bounded operator given by (2.40) and
$$CW^a = -g^{ab}\partial_b\big((\partial_c h)W^c + p'\operatorname{div} W\big). \tag{3.2}$$
We want to show existence and estimates for the linearized equations with an inhomogeneous term $F$,
$$LW = F, \tag{3.3}$$
with initial data
$$W\big|_{t=0} = \tilde W_0, \qquad \dot W\big|_{t=0} = \tilde W_1, \tag{3.4}$$
and boundary data
$$\operatorname{div} W\big|_{\partial\Omega} = 0. \tag{3.5}$$
The reason for the inhomogeneous term $F$ is that one can reduce to the case of vanishing initial data and an inhomogeneous term $F$ that vanishes to all orders as $t \to 0$, and it is easier to first prove existence for this case. Differentiating (3.3) with respect to time we get
$$\hat{D}_t^{k+2}W = B_k\big(W, \dots, \hat{D}_t^{k+1}W, \partial W, \dots, \partial\hat{D}_t^k W, \partial^2 W, \dots, \partial^2\hat{D}_t^k W\big) + \hat{D}_t^k F, \tag{3.6}$$
for some function $B_k$. Let us therefore define functions of space only by
$$\tilde W_{k+2} = B_k\big(\tilde W_0, \dots, \tilde W_{k+1}, \partial\tilde W_0, \dots, \partial\tilde W_k, \partial^2\tilde W_0, \dots, \partial^2\tilde W_k\big)\big|_{t=0} + \hat{D}_t^k F\big|_{t=0}, \qquad k \ge 0. \tag{3.7}$$
In view of (3.5) it follows that $0 = \hat{D}_t^k\operatorname{div} W|_{\partial\Omega} = \operatorname{div}\hat{D}_t^k W|_{\partial\Omega}$, so we must have
$$\operatorname{div}\tilde W_k\big|_{\partial\Omega} = 0, \qquad k = 0, \dots, m. \tag{3.8}$$

Equation (3.8) is called the $m$th order compatibility condition, and in order for it to be possible for (3.3)–(3.5) to have a smooth solution these conditions have to hold for all orders $m$. We now define the approximate power series solution by
$$\tilde W(t, y) = \frac{\kappa(0, y)}{\kappa(t, y)}\sum_{k=0}^{\infty}\chi(t/\varepsilon_k)\,\tilde W_k(y)\,t^k/k!. \tag{3.9}$$
Here $\chi$ is smooth, $\chi(s) = 1$ for $|s| \le 1/2$, and $\chi(s) = 0$ for $|s| \ge 1$. The sequence $\varepsilon_k > 0$ can be chosen so that the series converges in $C^m([0, T], H^m)$ for any $m$ if we take $(\|\tilde W_k\|_{H^k} + 1)\varepsilon_k \le 1/2$. It follows that
$$\operatorname{div}\tilde W\big|_{\partial\Omega} = 0. \tag{3.10}$$
Equation (3.9) multiplied with $\kappa(t, y)$ is a power series expansion of $\kappa W$, and it hence follows that (3.9) satisfies (3.3)–(3.5) to all orders as $t \to 0$:
$$D_t^k\big(L\tilde W - F\big)\big|_{t=0} = 0, \qquad k = 0, 1, \dots. \tag{3.11}$$
It follows that we can reduce (3.3)–(3.5) to the case with vanishing initial data and an inhomogeneous term that vanishes to all orders as $t \to 0$, by replacing $W$ by $W - \tilde W$ and $F$ by $F - L\tilde W$ in (3.3). Let us now introduce some notation:

Definition 3.1. Let
$$\|W(t)\|_{H^r} = \|W(t, \cdot)\|_{H^r(\Omega)}, \tag{3.12}$$
and
$$\|W(t)\|_r = \sum_{s+k\le r}\|\hat{D}_t^k W(t)\|_{H^s}. \tag{3.13}$$
Let $N$ be the exterior unit normal to $\partial\Omega$ in the metric $g_{ab}$, or equivalently, $N^a = N^i\,\partial y^a/\partial x^i$. Set
$$\langle W(t)\rangle_r = \|W_N(t, \cdot)\|_{H^r(\partial\Omega)}, \tag{3.14}$$
where $W_N = W\cdot N$ is the normal component.

Theorem 3.1. Suppose that $p = p(\rho)$ is a strictly increasing smooth function of $\rho$. Suppose also that $x$ is a smooth solution of (2.7) such that (1.9) holds for $0 \le t \le T$. Suppose that the inhomogeneous term $F$ in (3.3) is smooth for $0 \le t \le T$. Suppose also that the initial conditions (3.4) are smooth and satisfy the $m$th order compatibility conditions (3.8), for all $m = 0, 1, \dots$. Then the linearized equations (3.3)–(3.5) have a smooth solution for $0 \le t \le T$. Let
$$\tilde E_r(t) = \|\dot W(t)\|_r + \|W(t)\|_r + \langle W(t)\rangle_r + \|\operatorname{div} W(t)\|_r, \tag{3.14}$$
where $\dot W = \hat{D}_t W = D_t W + (\operatorname{div} V)W$. Then there is a constant $C$ depending only on $x$, $r$ and $T$ such that for $0 \le t \le T$ we have
$$\tilde E_r(t) \le C\Big(\tilde E_r(0) + \int_0^t\|F\|_r\,d\tau\Big). \tag{3.15}$$

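Theorem 1.1 follows from Theorem 3.1 since the norm (3.14) is equivalent to
$$E_r(t) = \|\dot W(t)\|_{H^r} + \|W(t)\|_{H^r} + \langle W(t)\rangle_r + \|\operatorname{div} W(t)\|_{H^r}, \tag{3.16}$$
if $F$ vanishes. In fact, by (3.6) one can express time derivatives in terms of space derivatives of the same order or less, and using induction it follows that
$$E_r \le \tilde E_r \le C_r\big(E_r + \|F\|_{r-1}\big). \tag{3.17}$$
In this section we show the lowest order energy estimates for an equation of the form
$$\ddot W + CW = B(W, \dot W) + F, \tag{3.18}$$
where $\dot W = \hat{D}_t W = \kappa^{-1}D_t(\kappa W)$, $\ddot W = \hat{D}_t^2 W$,
$$CW^a = -g^{ab}\partial_b\Big(p'\big(\operatorname{div} W + (\partial_c e)W^c\big)\Big) = -g^{ab}\partial_b\big(h'\operatorname{div}(\rho W)\big), \qquad e(\rho) = \ln\rho, \quad p'(\rho) = h'(\rho)\rho, \tag{3.19}$$
and $B$ is any bounded linear operator. The energy is:
$$E = \langle\dot W, \rho\dot W\rangle + \langle W, \rho(C + I)W\rangle = \int_\Omega\Big(g_{ab}\dot W^a\dot W^b + g_{ab}W^aW^b + p'\big(\operatorname{div}(\rho W)/\rho\big)^2\Big)\rho\kappa\,dy + \int_{\partial\Omega}W_N^2(-\nabla_N p)\,dS. \tag{3.20}$$
Now, for any symmetric operator $B$ we have
$$\frac{d}{dt}\langle W, \rho BW\rangle = \frac{d}{dt}\int_\Omega\kappa W^a\rho BW_a\,dy = 2\langle\dot W, \rho BW\rangle + \langle W, \rho B'W\rangle, \tag{3.21}$$
where $\dot W = \kappa^{-1}D_t(\kappa W)$ and $\rho B'$ is the time derivative of the operator $\rho B$ considered as an operator from the vector fields to the one forms:
$$\rho B'W^a = g^{ab}\big(D_t(\rho BW_b) - \rho B\dot W_b\big), \qquad BW_b = g_{bc}BW^c. \tag{3.22}$$
Since $\langle W, \rho W\rangle = \langle W, \rho GW\rangle$, where $G = I$, it follows that
$$\dot E = 2\langle\dot W, \rho\ddot W + \rho(C + I)W\rangle + \langle\dot W, \rho G'\dot W\rangle + \langle W, \rho(C' + G')W\rangle, \tag{3.23}$$
where $\rho G'W_a = \kappa D_t(\rho g_{ab}\kappa^{-1})W^b$ and $C'W_a = D_t CW_a - C\dot W_a + \dot e\,CW_a$, where $\dot e = \dot\rho/\rho$. Since $D_t(\rho\kappa) = 0$ and $D_t(\kappa\operatorname{div}(\rho W)) = \kappa\operatorname{div}(\hat{D}_t(\rho W))$ we get
$$D_t CW_a = -\partial_a D_t\big((\rho\kappa)^{-1}p'(\rho)\kappa\operatorname{div}(\rho W)\big) = -\partial_a\big(p''(\rho)\dot e\operatorname{div}(\rho W) + p'(\rho)\rho^{-1}\operatorname{div}(\dot\rho W + \rho\dot W)\big), \tag{3.24}$$
so
$$C'W_a = -\partial_a\big(p''(\rho)\dot e\operatorname{div}(\rho W)\big) - \partial_a\big(p'(\rho)\rho^{-1}\operatorname{div}(\dot\rho W)\big) - \dot e\,\partial_a\big(p'(\rho)\rho^{-1}\operatorname{div}(\rho W)\big). \tag{3.25}$$
Since $\dot\rho|_{\partial\Omega} = 0$ and $\dot e = \dot\rho/\rho$ it follows that
$$\langle U, \rho C'W\rangle = \int_\Omega\Big(\dot\rho\,p''\operatorname{div}(\rho U)\operatorname{div}(\rho W) + p'\big(\operatorname{div}(\rho U)\operatorname{div}(\dot\rho W) + \operatorname{div}(\dot\rho U)\operatorname{div}(\rho W)\big)\Big)\rho^{-1}\kappa\,dy + \int_{\partial\Omega}(-\nabla_N\dot p)\,U_N W_N\,dS. \tag{3.26}$$
It therefore follows that $\dot E \le C\sqrt{E}\,(\sqrt{E} + \|F\|)$, and hence with $E_0 = \sqrt{E}$ we have
$$E_0(t) \le C\Big(E_0(0) + \int_0^t\|F\|\,d\tau\Big). \tag{3.27}$$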
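The passage from the differential inequality for $E$ to (3.27) is the usual integrating factor argument. The following is a sketch only, assuming $E > 0$ so that $\sqrt{E}$ is differentiable and writing $C$ for the constant in $\dot E \le C\sqrt{E}(\sqrt{E} + \|F\|)$:

```latex
\begin{align*}
\frac{d}{dt}\sqrt{E} &= \frac{\dot E}{2\sqrt{E}}
   \le \frac{C}{2}\big(\sqrt{E} + \|F\|\big), \\
\frac{d}{dt}\Big(e^{-Ct/2}E_0(t)\Big) &\le \frac{C}{2}\,e^{-Ct/2}\|F(t)\|
   \le \frac{C}{2}\|F(t)\|, \\
E_0(t) &\le e^{CT/2}\Big(E_0(0) + \frac{C}{2}\int_0^t \|F\|\,d\tau\Big),
   \qquad 0 \le t \le T .
\end{align*}
```

The constant in (3.27) can therefore be taken to be $\max(1, C/2)\,e^{CT/2}$, which depends on $T$ as well as on the bounds entering $C$.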


4. Decomposition of the Linearized Equations into an Operator on the Divergence Free Vector Fields and an Operator on the Orthogonal Complement

We will now make an orthogonal decomposition $H = L^2 = H_0 \oplus H_1$ into divergence free vector fields, $H_0$, and gradients of functions in $H_0^1(\Omega)$, $H_1$. Let us therefore define the orthogonal projection $P$ onto divergence free vector fields by
$$PU^a = U^a - g^{ab}\partial_b p_U, \qquad \Delta p_U = \operatorname{div} U, \qquad p_U\big|_{\partial\Omega} = 0. \tag{4.1}$$
(Here $\Delta q = \kappa^{-1}\partial_a(\kappa g^{ab}\partial_b q)$.) $P$ is the orthogonal projection in the inner product, see [L1],
$$\langle U, W\rangle = \int_\Omega g_{ab}U^aW^b\,\kappa\,dy. \tag{4.2}$$
Note also that, with
$$\|W(t)\|_{r,s} = \sum_{k=0}^{s}\|\hat{D}_t^k W(t, \cdot)\|_{H^r} \tag{4.3}$$
denoting the Sobolev norms for fixed time with space and time differentiation of order $r$ and $s$, we have
$$\|PW\|_{r,s} \le C\|W\|_{r,s}, \qquad \|(I - P)W\|_{r,s} \le C\|\operatorname{div} W\|_{r-1,s}, \tag{4.4}$$
since it is just a matter of solving the Dirichlet problem and commuting through time derivatives, [L1]. For a function $f$ that vanishes on the boundary define $A_f W = -P\nabla(W^c\partial_c f)$, i.e.
$$A_f W^a = -g^{ab}\partial_b\big((\partial_c f)W^c - q\big), \quad\text{where}\quad \Delta\big((\partial_c f)W^c - q\big) = 0, \quad q\big|_{\partial\Omega} = 0. \tag{4.5}$$
If $U$ and $W$ are divergence free then
$$\langle U, A_f W\rangle = \int_{\partial\Omega}n_a U^a(-\partial_c f)W^c\,dS, \tag{4.6}$$
where $n$ is the unit conormal. If $f|_{\partial\Omega} = 0$ then $-\partial_c f|_{\partial\Omega} = (-\nabla_N f)n_c$. It follows that $A_f$ is a symmetric operator on divergence free vector fields, and in particular the normal operator
$$A = A_h, \tag{4.7}$$
where $h$ is the enthalpy, is positive, i.e. $\langle W, AW\rangle \ge 0$, since we assumed the physical condition that $-\nabla_N h \ge c_0 > 0$ on the boundary. The normal operator is of order one; by (4.4),
$$\|AW\|_{r,s} \le C\|W\|_{r+1,s}. \tag{4.8}$$

The normal operator has certain delicate commutator properties with vector fields and positivity properties which were essential for the existence proof in [L1]. The main difficulty is that it is not elliptic acting on vector fields with non-vanishing curl. In order to prove existence one had to replace it by a sequence of bounded operators which uniformly had the same commutator and positivity properties. We now make the decomposition
$$W = W_0 + W_1, \qquad W_0 = PW \in H_0, \qquad W_1 = (I - P)W \in H_1. \tag{4.9}$$
We want to decompose the linearized operator
$$LW = \ddot W + CW - B(W, \dot W), \qquad \operatorname{div} W\big|_{\partial\Omega} = 0, \tag{4.10}$$
where $B$ is a bounded operator and
$$CW^a = -g^{ab}\partial_b\big((\partial_c h)W^c + p'\operatorname{div} W\big), \tag{4.11}$$
into an operator on the divergence free part and an operator on the complement. The projection commutes with time differentiation to highest order:
$$P\ddot W_0 = \ddot W_0, \qquad P\ddot W_1 = PB_2(W_1, \dot W_1), \qquad B_2(W, \dot W)^a = -g^{ab}\big(\ddot g_{bc}W^c + 2\dot g_{bc}\dot W^c\big), \tag{4.12}$$
where $\dot g_{ab} = \check{D}_t g_{ab}$, $\ddot g_{ab} = \check{D}_t^2 g_{ab}$, and $\check{D}_t = D_t - (\operatorname{div} V)$. In fact, applying $D_t^2$ to $g_{ab}W_1^b = \partial_a q_1$ gives $g_{ab}\ddot W_1^b + 2\dot g_{ab}\dot W_1^b + \ddot g_{ab}W_1^b = \partial_a\ddot q_1$. Here $\ddot q_1 = D_t^2 q_1$ vanishes on the boundary since $q_1$ does. The projection of $g^{ab}\partial_b\ddot q_1$ therefore vanishes, and (4.12) follows since $\hat{D}_t$ preserves the divergence free condition. Furthermore, with $A$ given by (4.5)–(4.7) and $C$ by (4.11) we have
$$PCW = AW, \qquad\text{if } \operatorname{div} W\big|_{\partial\Omega} = 0, \tag{4.13}$$
since the projection of the highest order term, $\nabla(p'\operatorname{div} W)$, vanishes when $\operatorname{div} W|_{\partial\Omega} = 0$. We now want to project (4.10) onto the divergence free vector fields using (4.12)–(4.13). We get
$$PLW = \ddot W_0 + AW_0 + PB_2(W_1, \dot W_1) + PAW_1 - PB(W, \dot W), \tag{4.14}$$
where $A$ is the normal operator (4.7). Similarly, applying $(I - P)$ to (4.10) gives
$$(I - P)LW = \ddot W_1 - PB_2(W_1, \dot W_1) - \nabla(p'\operatorname{div} W_1) - (I - P)\nabla(W^c\partial_c h) - (I - P)B(W, \dot W), \tag{4.15}$$
subject to the boundary condition $\operatorname{div} W_1|_{\partial\Omega} = 0$.

Lemma 4.1. Let $\tilde L$ be defined by
$$\tilde L W_0 = \ddot W_0 + AW_0, \tag{4.16}$$
$$\tilde L W_1 = \ddot W_1 - PB_2(W_1, \dot W_1) - \nabla(p'\operatorname{div} W_1), \tag{4.17}$$
and let $\tilde M$ be defined by
$$P\tilde M W = PB_2(W_1, \dot W_1) + PAW_1 - PB(W, \dot W), \tag{4.18}$$
$$(I - P)\tilde M W = -(I - P)\nabla(W^c\partial_c h) - (I - P)B(W, \dot W). \tag{4.19}$$
We have
$$LW = \tilde L W + \tilde M W. \tag{4.20}$$
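The identity (4.20) is a regrouping of (4.14) and (4.15). As a sketch, under the assumption (not spelled out in the lemma) that $\tilde L W$ stands for $\tilde L W_0 + \tilde L W_1$:

```latex
\begin{align*}
P L W &= \underbrace{\ddot W_0 + A W_0}_{\tilde L W_0}
   + \underbrace{P B_2(W_1,\dot W_1) + P A W_1 - P B(W,\dot W)}_{P\tilde M W}, \\
(I-P) L W &= \underbrace{\ddot W_1 - P B_2(W_1,\dot W_1) - \nabla(p'\operatorname{div} W_1)}_{\tilde L W_1}
   + \underbrace{\big(-(I-P)\nabla(W^c\partial_c h) - (I-P)B(W,\dot W)\big)}_{(I-P)\tilde M W}, \\
L W &= P L W + (I-P) L W = \tilde L W_0 + \tilde L W_1 + \tilde M W .
\end{align*}
```

Note that $PB_2(W_1, \dot W_1)$ enters $\tilde L W_1$ and $P\tilde M W$ with opposite signs, so it cancels in the sum.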

If $P_0 = P$, $P_1 = (I - P)$ and $L_{ij} = P_i L P_j$, then $\tilde L$ and $\tilde M$ are essentially the diagonal and the off-diagonal part of $L$, respectively. It turns out that we can invert (4.16) on $H_0$, see Sect. 5, and (4.17) on $H_1$, see Sect. 6. The interaction term $\tilde M$ is lower order, but in a subtle way, since it contains space derivatives $\partial W$. The estimates for (4.17) give us control of an additional space derivative of $W_1$, and that is all that is needed to estimate (4.18). Equation (4.19) also contains a space derivative of $W_0$. However, in our estimates for (4.17) we can replace this space derivative by a time derivative, and the estimates for (4.16) give us control of an additional time derivative. The estimates for (4.16)–(4.17) will be summarized in Sect. 7.

5. Existence and Estimates in the Divergence Free Class

In [L1] we proved existence of solutions for
$$\ddot W_0 + AW_0 = F_0, \qquad W_0\big|_{t=0} = \tilde W_{00}, \qquad \dot W_0\big|_{t=0} = \tilde W_{10}. \tag{5.1}$$

We have

Proposition 5.1. Suppose that $x$, $h$ are smooth, $h|_{\partial\Omega} = 0$ and $\nabla_N h|_{\partial\Omega} \le -c_0 < 0$ for $0 \le t \le T$. Then if the initial data and the inhomogeneous term in (5.1) are smooth and divergence free, it follows that (5.1) has a smooth solution for $0 \le t \le T$. Furthermore, with a constant $C$ depending only on the norm of $x$ and $h$, $T$ and the constant $c_0$, we have
$$E_{0,r}(t) \le C\Big(E_{0,r}(0) + \int_0^t\|F_0(\tau)\|_{H^r}\,d\tau\Big), \qquad E_{0,r}(t) = \|\dot W_0(t)\|_{H^r} + \|W_0(t)\|_{H^r} + \langle W_0(t)\rangle_r, \tag{5.2}$$
where
$$\|W(t)\|_{H^r} = \|W(t, \cdot)\|_{H^r(\Omega)}, \qquad \langle W(t)\rangle_r = \|W_N(t, \cdot)\|_{H^r(\partial\Omega)} \tag{5.3}$$
and $W_N = N_a W^a$ is the normal component.

Proof. In case $\operatorname{div} V = 0$ this was proven in [L1], and the proof there can be easily modified by multiplying or dividing by $\kappa = \det(\partial x/\partial y)$. Let us now indicate what needs to be changed in [L1] in order to deal with the case $\operatorname{div} V \ne 0$. We can use the same set of tangential vector fields as in [L1], but they are no longer divergence free, so the Lie derivative with respect to these no longer preserves the divergence free condition. But one can easily modify the Lie derivative so that it does. The modified Lie derivative with respect to a vector field $T$ applied to a vector field $W$ is
$$\hat{\mathcal{L}}_T W = \mathcal{L}_T W + (\operatorname{div} T)W, \tag{5.4}$$
where $\mathcal{L}_T W$ is the Lie derivative. It satisfies $\operatorname{div}\hat{\mathcal{L}}_T W = \hat T\operatorname{div} W$, where for a function $f$, $\hat T f = Tf + (\operatorname{div} T)f$. One then has to make sure always to apply this modified Lie derivative to vector fields. However, we use the usual Lie derivative when applied to one forms, since it commutes with covariant differentiation. In deriving the estimates for all components of a vector field in terms of the divergence, the curl and the tangential components, we use $\mathcal{L}_T(g_{ab}W^b) = g_{ab}\hat{\mathcal{L}}_T W^b + (\check{\mathcal{L}}_T g_{ab})W^b$, where $\check{\mathcal{L}}_T g_{ab} = \mathcal{L}_T g_{ab} - (\operatorname{div} T)g_{ab}$. Let us now examine how the critical commutator with the normal operator, (4.5), is changed from what it was in [L1]. With $A_f$ given by (4.5) and $A_f W_a = g_{ab}A_f W^b$ we have
$$\mathcal{L}_T A_f W_a = -\mathcal{L}_T\partial_a\big((\partial_c f)W^c - q\big) = -\partial_a\big((\partial_c f)\hat{\mathcal{L}}_T W^c + \partial_c(\check T f)W^c - Tq + f(\partial_c\operatorname{div} T)W^c\big), \tag{5.5}$$
where $\check T f = Tf - (\operatorname{div} T)f$. When we project again the last two terms vanish since they vanish on the boundary, so the commutator relation in [L1] will be replaced by
$$P\mathcal{L}_T A_f W = A_f\hat{\mathcal{L}}_T W + A_{\check T f}W. \tag{5.6}$$

The issue of how to deal with the initial conditions in case $\operatorname{div} V \ne 0$ was discussed in Sect. 3. Now, the norms used in Proposition 5.1 are natural for the initial value problem. However, when solving the wave equation with Dirichlet boundary conditions it is more natural to first look at norms with many time derivatives. Because of the coupling between the two equations we must therefore also estimate more time derivatives of the divergence free part. From differentiating (5.1) we get:

Lemma 5.2. Suppose that $W$ is a smooth solution of (5.1) for $0 \le t \le T$. Let $E_{0,r}$ be as in (5.2) and
$$\|W(t)\|_r = \sum_{s+k\le r}\|W(t)\|_{k,s}, \qquad\text{where}\qquad \|W(t)\|_{r,s} = \sum_{k=0}^{s}\|\hat{D}_t^k W(t)\|_{H^r}. \tag{5.7}$$
Then for $0 \le t \le T$ we have
$$\|\dot W_0\|_r + \|W_0\|_r \le C\big(E_{0,r} + \|F_0\|_{r-1}\big), \qquad r \ge 1. \tag{5.8}$$
Proof. The proof is just differentiation of (5.1) using that $A$ is of order one, (4.8):
$$\|\ddot W_0\|_{r-1} \le C\big(\|W_0\|_r + \|F_0\|_{r-1}\big), \qquad r \ge 1, \tag{5.9}$$
which proves (5.8) for $r = 1$, so we may assume that $r \ge 2$ in (5.8). We have
$$\|W_0\|_r \le \|W_0\|_{r,0} + \|\dot W_0\|_{r-1,0} + \|\ddot W_0\|_{r-2}, \qquad r \ge 2, \tag{5.10}$$
so together with (5.9) we get
$$\|W_0\|_r \le C\big(E_{0,r} + \|W_0\|_{r-1} + \|F_0\|_{r-2}\big), \qquad r \ge 2. \tag{5.11}$$
Since also $\|W_0\|_1 \le E_{0,1}$, we can use induction in $r$ to prove that
$$\|W_0\|_r \le C\big(E_{0,r} + \|F_0\|_{r-2}\big), \qquad r \ge 2. \tag{5.12}$$
Similarly
$$\|\dot W_0\|_r \le \|\dot W_0\|_{r,0} + \|\ddot W_0\|_{r-1}, \qquad r \ge 1, \tag{5.13}$$
so by (5.9) again
$$\|\dot W_0\|_r \le C\big(E_{0,r} + \|W_0\|_r + \|F_0\|_{r-1}\big), \qquad r \ge 1. \tag{5.14}$$
Equation (5.8) for $r \ge 2$ now follows from (5.12) and (5.14).

Theorem 5.3. With notation and assumptions as in Proposition 5.1 and Lemma 5.2 we have
$$\tilde E_{0,r}(t) \le C\Big(\tilde E_{0,r}(0) + \int_0^t\|F_0(\tau)\|_r\,d\tau\Big), \qquad \tilde E_{0,r}(t) = \|\dot W_0(t)\|_r + \|W_0(t)\|_r + \langle W_0(t)\rangle_r. \tag{5.15}$$
Proof. Equation (5.15) follows from Proposition 5.1 since
$$\|F_0(t)\|_{r-1} \le \|F_0(0)\|_{r-1} + \int_0^t\|\dot F_0(\tau)\|_{r-1}\,d\tau, \qquad \|F_0(0)\|_{r-1} \le C\big(\|\dot W(0)\|_r + \|W(0)\|_r\big). \tag{5.16}$$

6. Existence and Estimates for the Wave Equation

We consider the Cauchy problem for the wave equation on a bounded domain with Dirichlet boundary conditions:
$$\hat{D}_t^2(e'\psi) - \Delta\psi = f \quad\text{in } [0, T]\times\Omega, \qquad \psi\big|_{\partial\Omega} = 0, \tag{6.1}$$
$$\psi\big|_{t=0} = \tilde\psi_0, \qquad D_t\psi\big|_{t=0} = \tilde\psi_1. \tag{6.2}$$
Here
$$\Delta\psi = \frac{1}{\sqrt{\det g}}\,\partial_a\big(\sqrt{\det g}\,g^{ab}\partial_b\psi\big), \tag{6.3}$$
where $g^{ab}$ is the inverse of the metric $g_{ab}$ and $\det g = \det\{g_{ab}\} = \kappa^2$, in our earlier notation. We assume that $g^{ab}$ is symmetric (since the metric is), and that $g^{ab}$ and $e'$ are smooth satisfying:
$$0 < e' + 1/e' < c_1', \qquad \sum_{a,b=1}^n|g^{ab}| + |g_{ab}| \le n c_1^2 \tag{6.4}$$
for some constants $0 < c_1 < c_1' < \infty$. Existence of solutions for a wave equation with Dirichlet boundary conditions and initial conditions satisfying some compatibility conditions is well known, see e.g. [H, Ev]. In order for (6.1)–(6.2) to be solvable the initial data must be compatible with the boundary condition. If we move the Laplacian in (6.1) over to the right-hand side and differentiate (6.1) with respect to time we get
$$D_t^{k+2}\psi = b_k\big(\psi, \dots, D_t^{k+1}\psi, \partial\psi, \dots, \partial D_t^k\psi, \partial^2\psi, \dots, \partial^2 D_t^k\psi\big) + D_t^k f \tag{6.5}$$
for some functions $b_k$. We therefore define functions of the space variables only, $\tilde\psi_k$, by
$$\tilde\psi_{k+2} = b_k\big(\tilde\psi_0, \dots, \tilde\psi_{k+1}, \partial\tilde\psi_0, \dots, \partial\tilde\psi_k, \partial^2\tilde\psi_0, \dots, \partial^2\tilde\psi_k\big)\big|_{t=0} + D_t^k f\big|_{t=0}, \tag{6.6}$$
where $\tilde\psi_0$ and $\tilde\psi_1$ are as in (6.2). For this to be compatible with the boundary conditions we must have
$$\tilde\psi_k\big|_{\partial\Omega} = 0, \qquad\text{for } k \le m - 1. \tag{6.7}$$
Equation (6.7) is called the $m$th order compatibility condition. Since the $\tilde\psi_k$ are determined from the initial conditions $\tilde\psi_0$ and $\tilde\psi_1$, this gives some compatibility conditions on the initial conditions. We have:

Proposition 6.1. Suppose that $g$, $e'$ are smooth satisfying (6.4). Then if the initial data $(\tilde\psi_0, \tilde\psi_1)$ and $f$ are smooth and satisfy the $m$th order compatibility condition for all $m$, it follows that (6.1)–(6.2) has a smooth solution $\psi$.

Proof. The result in [H] is stated with vanishing initial conditions but, if the compatibility conditions are satisfied, one can reduce to that case by subtracting off an approximate solution satisfying the equation to all orders as $t \to 0$. Let
$$\tilde\psi = \sum_{k=0}^{\infty}\chi(t/\varepsilon_k)\,t^k\tilde\psi_k/k!, \tag{6.8}$$
where $\chi$ is smooth, $\chi(s) = 1$ for $|s| \le 1/2$, and $\chi(s) = 0$ for $|s| \ge 1$, and the sequence $\varepsilon_k > 0$ is chosen small enough so that the series converges in $C^m([0, T], H^m)$ for any $m$. This is obtained if we take $(\|\tilde\psi_k\|_{H^k} + 1)\varepsilon_k \le 1/2$. Then $\overline{\psi} = \psi - \tilde\psi$ satisfies (6.1) with vanishing initial conditions and a right-hand side $\overline{f}$ that vanishes to all orders as $t \to 0$:
$$\hat{D}_t^2(e'\overline{\psi}) - \Delta\overline{\psi} = -\hat{D}_t^2(e'\tilde\psi) + \Delta\tilde\psi + f = \overline{f}. \tag{6.9}$$
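The role of the cutoffs $\chi(t/\varepsilon_k)$ in (6.8) and (3.9) can be seen in a scalar toy model. The sketch below is purely illustrative: the coefficients `a_k = (k!)^2`, the particular formula for `chi`, and the names `chi` and `cutoff_series` are assumptions made for this example and do not come from the paper. It checks numerically that choosing $(|a_k| + 1)\varepsilon_k = 1/2$ keeps the sum bounded even though the plain power series $\sum_k a_k t^k/k!$ diverges for every $t \ne 0$.

```python
import math


def chi(s: float) -> float:
    """Smooth cutoff: 1 for |s| <= 1/2, 0 for |s| >= 1 (one possible choice)."""
    s = abs(s)
    if s <= 0.5:
        return 1.0
    if s >= 1.0:
        return 0.0

    def bump(x: float) -> float:
        # exp(-1/x) for x > 0 is smooth and flat to all orders at x = 0.
        return math.exp(-1.0 / x) if x > 0.0 else 0.0

    up, down = bump(1.0 - s), bump(s - 0.5)
    return up / (up + down)


def cutoff_series(t: float, coeffs: list[float]) -> float:
    """Sum of chi(t/eps_k) * a_k * t^k / k! with (|a_k| + 1) * eps_k = 1/2."""
    total = 0.0
    for k, a_k in enumerate(coeffs):
        eps_k = 0.5 / (abs(a_k) + 1.0)
        total += chi(t / eps_k) * a_k * t**k / math.factorial(k)
    return total


# Hypothetical scalar coefficients a_k = (k!)^2: the plain series
# sum a_k t^k / k! = sum k! t^k has zero radius of convergence.
coeffs = [float(math.factorial(k)) ** 2 for k in range(25)]

# On the support of chi(t/eps_k) we have |t| <= eps_k, so for k >= 1 each
# term is at most |a_k| * eps_k^k / k! <= 2^{-k} / k!. Hence the sum is
# bounded by |a_0| + (e^{1/2} - 1) uniformly in t.
bound = abs(coeffs[0]) + math.exp(0.5) - 1.0
samples = [cutoff_series(i / 1000.0, coeffs) for i in range(0, 1001)]
print(max(abs(v) for v in samples) <= bound)
```

In the paper the same mechanism is applied with $\|\tilde\psi_k\|_{H^k}$ in place of $|a_k|$, and to time and space derivatives of the terms as well, which is what gives convergence in $C^m([0, T], H^m)$.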

For this case existence of a smooth solution $\overline{\psi}$ to (6.9) follows from Theorem 24.1.1 in [H]. Since the theorem in [H] is more general, let us just point out the main steps needed for our case. Existence follows from duality, using the Hahn-Banach extension theorem and the Riesz representation theorem. For this one has to show estimates for the adjoint operator in negative Sobolev spaces. Suppose that $\psi$ satisfies (6.1) and let $\psi_N = (I - \Delta)^{-N}\psi$, where $N \ge 0$ and $\Delta$ is the Dirichlet Laplacian, i.e. inductively, we define $\psi_k$ to be the solutions of $(I - \Delta)\psi_{k+1} = \psi_k$, with boundary conditions $\psi_{k+1}|_{\partial\Omega} = 0$. Then $\psi_N$ satisfies (6.1) with $f$ replaced by $f_N + (I - \Delta)^{-N}[\hat{D}_t^2, (I - \Delta)^N]\psi_N$, where $f_N = (I - \Delta)^{-N}f$. The norm of this is bounded by $\|f_N\| + \|\psi_N\| + \|D_t\psi_N\|$. Using the energy estimate in Lemma 6.2 then gives us an estimate $\|D_t\psi_N(t, \cdot)\| + \|\nabla\psi_N(t, \cdot)\| \le C\int_0^t\|f_N\|\,d\tau$.

Lemma 6.2. Suppose that $g^{ab}$, $e'$, $f$ and $\psi$ are smooth and satisfy (6.1)–(6.4) for $0 \le t \le T$. Let $\nabla^a = g^{ab}\partial_b$ and, for $r \ge 1$,
$$E(t) = \Big(\sum_{s=0}^{r-1}\frac12\int_\Omega\big(e'(D_t^{s+1}\psi)^2 + |\hat{D}_t^s\nabla\psi|^2 + \psi^2\big)\kappa\,dy\Big)^{1/2}. \tag{6.10}$$
Then
$$\frac{dE}{dt} \le C\big(E + \|f\|_{0,r-1}\big), \qquad\text{where}\qquad \|\phi\|_{r,s} = \sum_{k\le s,\ |\alpha|\le r}\|D_t^k\partial_y^\alpha\phi\|. \tag{6.11}$$

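Proof. We will prove that $dE^2/dt$ is bounded by $E$ times the right-hand side of (6.11). Inequality (6.11) follows from this since $dE/dt = (dE^2/dt)/(2E)$. Since $D_t\kappa = \kappa\operatorname{div} V$, we have with $\hat{D}_t = D_t + \operatorname{div} V$ and $\check{D}_t = D_t - \operatorname{div} V$:
$$\begin{aligned}\frac{dE^2}{dt} = {}&\sum_{s\le r-1}\int_\Omega\Big(e'(D_t^{s+1}\psi)(D_t^{s+2}\psi) + g_{ab}(\hat{D}_t^s\nabla^a\psi)(\hat{D}_t^{s+1}\nabla^b\psi)\Big)\kappa\,dy\\ &+ \sum_{s\le r-1}\frac12\int_\Omega\Big((\hat{D}_t e')(D_t^{s+1}\psi)^2 + (\check{D}_t g_{ab})(\hat{D}_t^s\nabla^a\psi)(\hat{D}_t^s\nabla^b\psi) + \kappa^{-1}(D_t\kappa)\psi^2 + 2\psi D_t\psi\Big)\kappa\,dy.\end{aligned} \tag{6.12}$$
Here the terms on the second row are bounded by a constant times $E^2$. Applying $D_t^{s+1}$ to $\partial_a\psi = g_{ab}\nabla^b\psi$ gives $\partial_a D_t^{s+1}\psi = \sum_{i=0}^{s+1}\binom{s+1}{i}(\check{D}_t^{s+1-i}g_{ab})\hat{D}_t^i\nabla^b\psi$, so
$$\hat{D}_t^{s+1}\nabla^a\psi = g^{ab}\partial_b D_t^{s+1}\psi - \sum_{i=0}^{s}\binom{s+1}{i}g^{ab}(\check{D}_t^{s+1-i}g_{bc})\hat{D}_t^i\nabla^c\psi. \tag{6.13}$$
Up to terms bounded by a constant times $E^2$, (6.12) is therefore equal to
$$\sum_{s\le r-1}\int_\Omega\Big(e'(D_t^{s+1}\psi)(D_t^{s+2}\psi) + (\hat{D}_t^s\nabla^a\psi)\,\partial_a D_t^{s+1}\psi\Big)\kappa\,dy = \sum_{s\le r-1}\int_\Omega(D_t^{s+1}\psi)\Big(e'D_t^{s+2}\psi - \kappa^{-1}\partial_a\big(\kappa\hat{D}_t^s\nabla^a\psi\big)\Big)\kappa\,dy, \tag{6.14}$$
where we have integrated by parts. If we apply $\hat{D}_t^s$ to $\hat{D}_t^2(e'\psi) - \kappa^{-1}\partial_a(\kappa\nabla^a\psi) = f$ we obtain
$$\hat{D}_t^s\Big(\hat{D}_t^2(e'\psi) - \kappa^{-1}\partial_a(\kappa\nabla^a\psi)\Big) = e'D_t^{s+2}\psi - \kappa^{-1}\partial_a\big(\kappa\hat{D}_t^s\nabla^a\psi\big) + \sum_{i=0}^{s+1}\binom{s+2}{i}(\hat{D}_t^{s+2-i}e')(D_t^i\psi). \tag{6.15}$$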

Since the $L^2$ norm of the last term is bounded by $CE$ plus the $L^2$ norm of $\psi$, the lemma follows.

One can get additional space regularity from taking time derivatives of Eq. (6.1) and solving the Dirichlet problem for the Laplacian.

Lemma 6.3. Suppose that $g^{ab}$ and $e'$ are smooth and satisfy (6.4) and that $f$ is smooth for $0 \le t \le T$. Suppose also that $\psi$ is a smooth solution of (6.1) for $0 \le t \le T$. Let $\|\psi\|_{s,r}$ be as in Lemma 6.2 and let $\|\psi\|_r = \sum_{s+k\le r}\|D_t^s\psi\|_{H^k}$. Then
$$\|\psi\|_r \le C\big(\|\psi\|_{0,r} + \|\psi\|_{1,r-1} + \|f\|_{r-2}\big). \tag{6.16}$$
Proof. Since $\Delta\psi = \hat{D}_t^2(e'\psi) - f$ and $\psi|_{\partial\Omega} = 0$ it follows that
$$\Delta D_t^s\psi = \sum_{i=0}^{s+2}\binom{s+2}{i}(\hat{D}_t^{s+2-i}e')D_t^i\psi - \hat{D}_t^s f - [\hat{D}_t^s\Delta - \Delta D_t^s]\psi, \qquad D_t^s\psi\big|_{\partial\Omega} = 0, \tag{6.17}$$
so by the standard elliptic estimates
$$\|D_t^s\psi\|_{H^{k+2}} \le C\Big(\sum_{i=0}^{s+2}\|D_t^i\psi\|_{H^k} + \|\hat{D}_t^s f\|_{H^k} + \sum_{i=0}^{s-1}\|D_t^i\psi\|_{H^{k+2}}\Big). \tag{6.18}$$
Here the last term is lower order and is absent if $s = 0$, so using induction in $s$ we get
$$\sum_{i=0}^{s}\|D_t^i\psi\|_{H^{k+2}} \le C\sum_{i=0}^{s}\big(\|D_t^{i+2}\psi\|_{H^k} + \|D_t^i f\|_{H^k}\big), \tag{6.19}$$
or with $\|\psi\|_{r,s} = \sum_{k=0}^{s}\|D_t^k\psi\|_{H^r}$,
$$\|\psi\|_{s+2,r-s-2} \le C\big(\|\psi\|_{s,r-s} + \|f\|_{s,r-s-2}\big), \qquad 0 \le s \le r - 2. \tag{6.20}$$
Since $\|\psi\|_r = \sum_{s=0}^{r}\|\psi\|_{r-s,s}$ it therefore inductively follows that
$$\|\psi\|_r \le C\big(\|\psi\|_{0,r} + \|\psi\|_{1,r-1} + \|f\|_{r-2}\big). \tag{6.21}$$

Summing up:

Proposition 6.4. There are constants $C_r$ such that the solution of (6.1) satisfies
$$\|\psi(t, \cdot)\|_r \le C_r\Big(\|\psi(0, \cdot)\|_r + \int_0^t\|\dot f\|_{r-2}\,d\tau\Big), \qquad r \ge 2. \tag{6.22}$$
Proof. It follows from Lemma 6.2 that $E(t) \le C_r\big(E(0) + \int_0^t\|f\|_{0,r-1}\,d\tau\big)$. Using (6.13) we see that the energy $E$ in (6.10) is equivalent to $\|\psi\|_{0,r} + \|\psi\|_{1,r-1}$. Furthermore $\|f(t, \cdot)\|_{r-2} \le \|f(0, \cdot)\|_{r-2} + \int_0^t\|\dot f(\tau, \cdot)\|_{r-2}\,d\tau$, and since also $\|f\|_{r-2} \le C\|\psi\|_r$, the proposition follows from Lemma 6.3.

As pointed out in Sect. 4 we actually want to solve the equation:
$$\ddot W_1 - PB_2(W_1, \dot W_1) - \nabla(p'\operatorname{div} W_1) = F_1, \qquad \operatorname{div} W_1\big|_{\partial\Omega} = 0, \tag{6.23}$$
$$W_1 = \nabla q_1, \qquad q_1\big|_{\partial\Omega} = 0, \tag{6.24}$$
where $(I - P)F_1 = F_1$, which is equivalent to
$$\hat{D}_t^2\phi - \Delta(p'\phi) = \operatorname{div} F_1, \qquad \phi\big|_{\partial\Omega} = 0, \tag{6.25}$$
$$W_1 = \nabla q_1, \qquad \Delta q_1 = \phi, \qquad q_1\big|_{\partial\Omega} = 0. \tag{6.26}$$
In fact, for $W_1$ of the form (6.24) the left-hand side of (6.23) is $(I - P)\ddot W_1 - (I - P)\nabla(p'\operatorname{div} W_1)$, and $(I - P)H = 0$ is equivalent to $\operatorname{div} H = 0$. Assuming that the compatibility conditions are satisfied we can solve (6.25)–(6.26), and this then also gives us a solution of (6.23)–(6.24). The initial conditions for (6.25) are
$$\phi\big|_{t=0} = \tilde\phi_0, \qquad \hat{D}_t\phi\big|_{t=0} = \tilde\phi_1. \tag{6.27}$$

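Lemma 6.5. Suppose that $W_1$ satisfies (6.23) and set
$$E(t) = \Big(\sum_{s=0}^{r}\frac12\int_\Omega\big(|\hat{D}_t^{s+1}W_1|^2 + p'|\operatorname{div}(\hat{D}_t^s W_1)|^2 + |W_1|^2\big)\kappa\,dy\Big)^{1/2}. \tag{6.28}$$
Then
$$\frac{dE}{dt} \le C\big(E + \|F_1\|_{0,r}\big). \tag{6.29}$$
Proof. Let $W_{1k} = \hat{D}_t^k W_1$. Then
$$\begin{aligned}\frac{dE^2}{dt} = {}&\sum_{s=0}^{r}\int_\Omega\big(\dot W_{1s}\cdot\ddot W_{1s} + p'\operatorname{div} W_{1s}\operatorname{div}\dot W_{1s}\big)\kappa\,dy\\ &+ \sum_{s=0}^{r}\frac12\int_\Omega\Big((\check{D}_t g_{ab})\dot W_{1s}^a\dot W_{1s}^b + (\check{D}_t p')(\operatorname{div} W_{1s})^2 + \kappa^{-1}(D_t\kappa)|W_1|^2 + 2W_1\cdot\dot W_1\Big)\kappa\,dy.\end{aligned} \tag{6.30}$$
Integrating by parts, using that $\operatorname{div} W_{1s}|_{\partial\Omega} = 0$, it therefore follows that
$$\frac{dE^2}{dt} \le \sum_{s=0}^{r}\int_\Omega\dot W_{1s}\cdot\big(\ddot W_{1s} - \nabla(p'\operatorname{div} W_{1s})\big)\kappa\,dy + CE^2. \tag{6.31}$$
Using (6.23) this proves (6.29) for $r = 0$. To prove it for $r \ge 1$ we have to commute time derivatives through (6.23), which can be written
$$g_{ab}\ddot W_1^b - \partial_a(p'\operatorname{div} W_1) = g_{ab}\tilde B_0^b, \qquad \tilde B_0(W_1, \dot W_1, F_1) = F_1 + PB_2(W_1, \dot W_1). \tag{6.32}$$
With $q = p'/\kappa$ we have
$$\begin{aligned}D_t^s\big(q^{-1}\partial_a(p'\operatorname{div} W_1)\big) &= D_t^s\partial_a(\kappa\operatorname{div} W_1) + D_t^s\big((\partial_a\ln q)\kappa\operatorname{div} W_1\big)\\ &= \partial_a(\kappa\operatorname{div} W_{1s}) + \sum_{k=0}^{s}\binom{s}{k}(\partial_a D_t^{s-k}\ln q)\kappa\operatorname{div} W_{1k}\\ &= q^{-1}\partial_a(p'\operatorname{div} W_{1s}) + \sum_{k=0}^{s-1}\binom{s}{k}(\partial_a D_t^{s-k}\ln q)\kappa\operatorname{div} W_{1k}.\end{aligned} \tag{6.33}$$
Hence
$$q D_t^s\big(q^{-1}\partial_a(p'\operatorname{div} W_1)\big) = \partial_a(p'\operatorname{div} W_{1s}) + \sum_{k=0}^{s-1}\binom{s}{k}(\partial_a D_t^{s-k}\ln q)\,p'\operatorname{div} W_{1k}. \tag{6.34}$$
Multiplying (6.32) by $q^{-1}$, applying $D_t^s$ and dividing by $q^{-1}$ therefore gives
$$g_{ab}\ddot W_{1s}^b - \partial_a(p'\operatorname{div} W_{1s}) = g_{ab}\tilde B_s^b\big(W_1, \dots, \dot W_{1s}, \operatorname{div} W_1, \dots, \operatorname{div} W_{1,s-1}, F_1, \dots, \hat{D}_t^s F_1\big), \tag{6.35}$$
where $\tilde B_s$ is a bounded operator of its arguments:
$$\|\tilde B_s\| \le C\sum_{k=0}^{s}\big(\|W_{1k}\| + \|\dot W_{1k}\| + \|\hat{D}_t^k F_1\| + \|\operatorname{div} W_{1k}\|\big). \tag{6.36}$$
This together with (6.31) proves (6.29) also for $r \ge 1$.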

Theorem 6.6. Suppose that the initial conditions (6.27) and the inhomogeneous term in (6.25) are smooth and satisfy the compatibility conditions for all orders. Then (6.25)–(6.27) has a smooth solution $\phi$. Furthermore, with $W_1$ given by (6.26) we have
$$E_{1,r}(t) \le C_r E_{1,r}(0) + C_r\int_0^t\|\dot F_1(\tau)\|_{r-1}\,d\tau, \qquad E_{1,r}(t) = \|W_1(t)\|_{r+1}, \tag{6.37}$$
for $r \ge 1$, and for $r = 0$ the same inequality holds with $\|\dot F_1(\tau)\|_{r-1}$ replaced by $\|F_1(\tau)\|_r$.

Proof. First we assume that $r \ge 2$. By the second part of (4.4) we see that with $\phi$ and $W_1$ as in (6.37),
$$\|\phi\|_r \le \|\partial W_1\|_r + \|W_1\|_r \le C\|\phi\|_r, \tag{6.38}$$
where $\partial$ stands for space derivatives only. Furthermore by (6.23)
$$\|W_1\|_{r+1} \le \|\partial W_1\|_r + \|W_1\|_r + \|\ddot W_1\|_{r-1} \le C\big(\|\partial W_1\|_r + \|W_1\|_r + \|F_1\|_{r-1}\big). \tag{6.39}$$
Since also
$$\|F_1(t)\|_{r-1} \le \|F_1(0)\|_{r-1} + \int_0^t\|\dot F_1\|_{r-1}\,d\tau \le C\|W_1(0)\|_{r+1} + \int_0^t\|\dot F_1\|_{r-1}\,d\tau, \tag{6.40}$$
(6.37) for $r \ge 2$ follows from Proposition 6.4. For $r = 1$, (6.37) follows from Lemma 6.5 and the fact that by the second part of (4.4) $\|\partial D_t W_1\| \le C\big(\|D_t\operatorname{div} W_1\| + \|\operatorname{div} W_1\|\big)$ and by (6.23) $\|\partial^2 W_1\| \le C\|\partial\operatorname{div} W_1\| \le C\big(\|\ddot W_1\| + \|D_t\partial W_1\| + \|W_1\| + \|F_1\|\big)$.

7. The Proof of the Theorem

We are now in a position to prove Theorem 3.1. We want to show that
$$LW = F, \qquad \operatorname{div} W\big|_{\partial\Omega} = 0, \tag{7.1}$$
with initial conditions
$$W\big|_{t=0} = \tilde W_0, \qquad \dot W\big|_{t=0} = \tilde W_1, \tag{7.2}$$
has a smooth solution $W$ if $\tilde W_0$, $\tilde W_1$ and $F$ are smooth and satisfy the compatibility conditions in Sect. 3 to all orders. If these compatibility conditions hold then we can find an approximate solution $\tilde W$ satisfying the initial conditions and the equation to all orders as $t \to 0$. Subtracting off the approximate solution reduces the problem to finding a smooth solution to (7.1) when $\tilde W_0 = \tilde W_1 = 0$ and $F$ vanishes to all orders as $t \to 0$.

With $\tilde L$ and $\tilde M$ as in Lemma 4.1, we have reduced the proof to finding a smooth solution of
$$\tilde L W = \tilde F, \qquad \operatorname{div} W\big|_{\partial\Omega} = 0, \tag{7.3}$$
where
$$\tilde F = F - \tilde M W. \tag{7.4}$$
That the operator $\tilde L$ is invertible follows from using the decomposition $W = W_0 + W_1$ in Lemma 4.1 and applying Theorem 5.3 to the divergence free part $W_0$ and Theorem 6.6 to $W_1$. This in fact gives estimates for the solution of (7.3) that can be used to show existence for (7.3) with $\tilde F$ given by (7.4) by iteration, and we hence obtain a solution to (7.1). Let us introduce the norms:
$$|W|_{r,1} = \|\dot W_0\|_r + \|W_0\|_r + \langle W_0\rangle_r + \|W_1\|_{r+1}, \qquad W_0 = PW, \quad W_1 = (I - P)W, \tag{7.5}$$
and
$$|F|_{r,2} = \|F_0\|_r + \|\dot F_1\|_{r-1}, \qquad F_0 = PF, \quad F_1 = (I - P)F. \tag{7.6}$$
It now follows from Lemma 4.1, Theorem 5.3 and Theorem 6.6:

Theorem 7.1. Suppose that $(x, h)$ is a smooth solution to (2.7) for $0 \le t \le T$, such that $h|_{\partial\Omega} = 0$ and $\nabla_N h|_{\partial\Omega} \le -c_0 < 0$. Suppose also that $\tilde F$, $\tilde W_0$ and $\tilde W_1$ are smooth and such that there is a smooth function $\tilde W$ satisfying the initial conditions (7.2), the boundary condition $\operatorname{div}\tilde W|_{\partial\Omega} = 0$, and $\tilde L\tilde W = \tilde F$ to all orders as $t \to 0$, i.e. $D_t^k(\tilde L\tilde W - \tilde F)|_{t=0} = 0$, for $k \ge 0$. Then (7.2)–(7.3) has a smooth solution $W$. Furthermore, there are constants $C_r$ such that for any smooth solution of (7.3) we have
$$|W(t)|_{r,1} \le C_r\Big(|W(0)|_{r,1} + \int_0^t|\tilde F|_{r,2}\,d\tau\Big), \qquad r \ge 1. \tag{7.7}$$
Moreover,
$$|\tilde M W|_{r,2} \le C_r|W|_{r,1}. \tag{7.8}$$

We remark that the compatibility conditions in the theorem are in particular true if $\tilde W_0 = \tilde W_1 = 0$ and $D_t^k\tilde F|_{t=0} = 0$, for $k \ge 0$, since then we can take $\tilde W = 0$. Therefore if $D_t^k F|_{t=0} = 0$, for $k \ge 0$, and we set $W^0 = 0$, it follows that we can inductively solve, for $k \ge 0$:
$$\tilde L W^{k+1} = F - \tilde M W^k, \qquad \operatorname{div} W^{k+1}\big|_{\partial\Omega} = 0, \qquad W^{k+1}\big|_{t=0} = \dot W^{k+1}\big|_{t=0} = 0. \tag{7.9}$$
We claim that $W^k$ converges to a solution of (7.3)–(7.4) and hence to (7.1), in case $D_t^k F|_{t=0} = 0$, for $k \ge 0$, and $\tilde W_0 = \tilde W_1 = 0$. Since we have already reduced solving (7.1)–(7.2) to this case, this would prove the existence part of Theorem 3.1. That $W^k$ converges to a smooth solution of (7.3)–(7.4) follows from using the estimate in Theorem 7.1. In fact, $\tilde L(W^1 - W^0) = F$, and for $k \ge 1$, $\tilde L(W^{k+1} - W^k) = -\tilde M(W^k - W^{k-1})$. It therefore follows from Theorem 7.1 that
$$M_N \le C_r\int_0^t\big(|F|_{r,2} + M_N\big)\,d\tau, \qquad\text{where}\qquad M_N = \sum_{k=0}^{N}|W^{k+1} - W^k|_{r,1}. \tag{7.10}$$
It now follows from a standard Grönwall type of argument that
$$M_N(t) \le C_r e^{C_r t}\int_0^t|F|_{r,2}\,d\tau \tag{7.11}$$
for any $N$. It follows from this that $W^k$ converges to a smooth solution $W$ of (7.3)–(7.4) for $0 \le t \le T$, and therefore we have proven existence of smooth solutions for (7.1)–(7.2). Having proven existence of a smooth solution to (7.1)–(7.2), we now also need to prove the estimate in Theorem 3.1. Applying the estimate in Theorem 7.1 gives
$$|W(t)|_{r,1} \le C_r\Big(|W(0)|_{r,1} + \int_0^t\big(|F|_{r,2} + |W|_{r,1}\big)\,d\tau\Big). \tag{7.12}$$
By the same Grönwall type of argument as above we get
$$|W(t)|_{r,1} \le C_r e^{C_r t}\Big(|W(0)|_{r,1} + \int_0^t|F|_{r,2}\,d\tau\Big), \qquad r \ge 1. \tag{7.13}$$
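For completeness, here is a sketch of the Grönwall step behind (7.11) and (7.13). The symbols $M$, $a$, $G$ and $C$ are placeholders introduced only for this illustration: $M \ge 0$ is continuous, $a \ge 0$ is nondecreasing, and $M(t) \le C\big(a(t) + \int_0^t M\,d\tau\big)$.

```latex
\begin{align*}
G(t) &:= \int_0^t M\,d\tau, \qquad
G'(t) = M(t) \le C\,a(t) + C\,G(t), \\
\frac{d}{dt}\big(e^{-Ct}G(t)\big) &\le C\,e^{-Ct}a(t)
\;\Longrightarrow\;
G(t) \le \int_0^t C\,e^{C(t-s)}a(s)\,ds \le a(t)\big(e^{Ct}-1\big), \\
M(t) &\le C\,a(t) + C\,G(t) \le C\,e^{Ct}a(t).
\end{align*}
```

Taking $a(t) = \int_0^t|F|_{r,2}\,d\tau$ and $M = M_N$ gives (7.11) from (7.10), and taking $a(t) = |W(0)|_{r,1} + \int_0^t|F|_{r,2}\,d\tau$ and $M = |W|_{r,1}$ gives (7.13) from (7.12).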

It therefore only remains to observe that the norms in Theorem 3.1 are equivalent to those here. It follows from the continuity of the projection (4.4) that
$$\|\dot W\|_r + \|W\|_r \le \|\dot W_0\|_r + \|\dot W_1\|_r + \|W_0\|_r + \|W_1\|_r \le C_r\big(\|\dot W\|_r + \|W\|_r\big). \tag{7.14}$$
Furthermore, by the second part of (4.4) and Sobolev's lemma,
$$\|\operatorname{div} W\|_r \le \|W_1\|_{r+1} \le C_r\|\operatorname{div} W\|_r, \qquad\text{and}\qquad \langle W_1\rangle_r \le C_r\|W_1\|_{r+1}. \tag{7.15}$$
This concludes the proof of Theorem 3.1.

Acknowledgements. I would like to thank Demetrios Christodoulou, David Ebin and Kate Okikiolu for helpful discussions.

References

[BG] Baouendi, M.S., Goulaouic, C.: Remarks on the abstract form of nonlinear Cauchy-Kovalevsky theorems. Commun. Part. Diff. Eq. 2, 1151–1162 (1977)
[BHL] Beale, T., Hou, T., Lowengrub, J.: Growth Rates for the Linearized Motion of Fluid Interfaces away from Equilibrium. CPAM XLVI(9), 1269–1301 (1993)
[C1] Christodoulou, D.: Self-Gravitating Relativistic Fluids: A Two-Phase Model. Arch. Rat. Mech. Anal. 130, 343–400 (1995)
[C2] Christodoulou, D.: Oral Communication (August 95)
[CK] Christodoulou, D., Klainerman, S.: The Nonlinear Stability of the Minkowski space-time. Princeton, NJ: Princeton Univ. Press, 1993
[CL] Christodoulou, D., Lindblad, H.: On the motion of the free surface of a liquid. Commun. Pure Appl. Math. 53, 1536–1602 (2000)
[CF] Courant, R., Friedrichs, K.O.: Supersonic flow and shock waves. Berlin-Heidelberg-New York: Springer-Verlag, 1977
[Cr] Craig, W.: An existence theory for water waves and the Boussinesq and Korteweg-deVries scaling limits. Commun. in P. D. E. 10, 787–1003 (1985)
[DM] Dacorogna, B., Moser, J.: On a partial differential equation involving the Jacobian determinant. Ann. Inst. H. Poincare Anal. Non. Lineaire 7, 1–26 (1990)
[DN] Dain, S., Nagy, G.: Initial data for fluid bodies in general relativity. Phys. Rev. D 65, 084020, 15pp (2002)
[E1] Ebin, D.: The equations of motion of a perfect fluid with free boundary are not well posed. Commun. Part. Diff. Eq. 10, 1175–1201 (1987)
[E2] Ebin, D.: Oral communication (November 1997)
[Ev] Evans, C.: Partial Differential Equations. Providence RI: AMS, 1998
[F] Friedrich, H.: Evolution equations for gravitating ideal fluid bodies in general relativity. Phys. Rev. D 57, 2317–2322 (1998)
[FN] Friedrich, H., Nagy, G.: The initial boundary value problem for Einstein's vacuum field equation. Commun. Math. Phys. 201, 619–655 (1999)
[H] Hörmander, L.: The analysis of Linear Partial Differential Operators III. Berlin-Heidelberg-New York: Springer Verlag, 1994
[L1] Lindblad, H.: Well posedness for the linearized motion of an incompressible liquid with free surface boundary. Comm. Pure Appl. Math. 56, 153–197 (2003)
[L2] Lindblad, H.: Well posedness for the motion of the free surface of a liquid. Preprint (January 2002)
[L3] Lindblad, H.: Well posedness for the motion of the free surface of a compressible liquid. Preprint, April 2002
[Na] Nalimov, V.I.: The Cauchy-Poisson Problem (in Russian). Dynamika Splosh. Sredy 18, 104–210 (1974)
[Ni] Nishida, T.: A note on a theorem of Nirenberg. J. Diff. Geometry 12, 629–633 (1977)
[R] Rendall, A.D.: The initial value problem for a class of general relativistic fluid bodies. J. Math. Phys. 1047–1053 (1992)
[W1] Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 2-D. Invent. Math. 130, 39–72 (1997)
[W2] Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 3-D. J. Am. Math. Soc. 12, 445–495 (1999)
[Y] Yosihara, H.: Gravity Waves on the Free Surface of an Incompressible Perfect Fluid. Publ. RIMS Kyoto Univ. 18, 49–96 (1982)

Communicated by P. Constantin

Commun. Math. Phys. 236, 311–334 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0796-6

Invasion Percolation and the Incipient Infinite Cluster in 2D

Antal A. Járai

The University of British Columbia, Department of Mathematics, # 121-1984 Mathematics Road, Vancouver, B.C. Canada V6T 1Z2. E-mail: [email protected]

Received: 3 December 2000 / Accepted: 3 December 2002
Published online: 18 February 2003 – © Springer-Verlag 2003

Present address: CWI, PNA 3, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands. E-mail: [email protected]

Abstract: We establish two links between two-dimensional invasion percolation and Kesten’s incipient infinite cluster (IIC). We first prove that the $k$th moment of the number of invaded sites within the box $[-n, n] \times [-n, n]$ is of order $(n^2 \pi_n)^k$, for $k \ge 1$, where $\pi_n$ is the probability that the origin in critical percolation is connected to the boundary of a box of radius $n$. This improves a result of Y. Zhang. We show that the size of the invaded region, when scaled by $n^2 \pi_n$, is tight. Secondly, we prove that the invasion cluster looks asymptotically like the IIC, when viewed from an invaded site $v$, in the limit $|v| \to \infty$. We also establish this when an invaded site $v$ is chosen at random from a box of radius $n$, and $n \to \infty$.

1. Introduction

Invasion percolation [11, 20, 5, 28] is a stochastic growth model that is closely related to critical Bernoulli percolation. Critical percolation clusters have a fractal geometry that has been widely studied. The invasion dynamics reproduces the critical percolation picture, without a parameter being tuned to criticality. Both heuristics and existing work on invasion [28, 9, 29] indicate a close relationship between the invasion cluster and the “incipient cluster” of critical percolation. In this paper we formulate and prove results that relate these two objects. To explain our motivation in more detail, we review a few results about invasion percolation in Sect. 1.2 below. We only consider the simplest setting, invasion without trapping, which is defined in Sect. 1.1.

1.1. The model. Consider the hypercubic lattice $\mathbb{Z}^d$ with its set of nearest neighbor bonds $\mathbb{E}^d$. For a subgraph $G$ of $(\mathbb{Z}^d, \mathbb{E}^d)$ we write $E(G)$ for the set of bonds of $G$, and $v \in G$ means that $v$ is a vertex of $G$. We write $e = \langle v, w \rangle$ for a bond $e$ with endpoints $v$ and $w$.

We define invasion without trapping as follows. Let $\{\omega(e)\}_{e \in \mathbb{E}^d}$ be i.i.d. uniform random variables on $[0, 1]$, indexed by the bonds. Given the random configuration $\omega$, we construct an increasing sequence $G_0, G_1, G_2, \dots$ of connected subgraphs of the lattice. The graph $G_0$ only contains the origin. If $G_i$ has been defined, we consider its outer boundary $\Delta G_i$, where for any graph $G$,

$$\Delta G = \{e = \langle v, w \rangle \in \mathbb{E}^d : e \notin E(G), \text{ but } v \in G \text{ or } w \in G\}.$$

We select the bond $e_{i+1}$ which minimizes $\omega$ on $\Delta G_i$, and we define $G_{i+1}$ by setting $E(G_{i+1}) = E(G_i) \cup \{e_{i+1}\}$. The graph $G_i$ is called the invasion cluster at time $i$, and $S = \cup_{i=0}^{\infty} G_i$ is called the invasion cluster at time infinity or the invaded region. (If in the definition of the outer boundary we only include those edges that are not separated by $G_i$ from infinity, we obtain invasion with trapping.)

The dynamically defined set $S$ is closely related to the static Bernoulli percolation model. Let $0 \le p \le 1$. For each bond $e$, if $\omega(e) < p$, we let $\omega_p(e) = 1$, and $e$ is called $p$-open. Otherwise we let $\omega_p(e) = 0$, and $e$ is called $p$-closed. The set of $p$-open bonds is then Bernoulli percolation at bond density $p$.
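The construction of Sect. 1.1 translates directly into a priority-queue procedure. The following is a minimal Python sketch, not part of the original paper: the function name `invade`, the lazy sampling of the weights and the representation of a bond as an ordered pair of lattice sites are illustrative assumptions. The heap holds exactly the bonds of the current outer boundary, so popping its minimum selects the next invaded bond.

```python
import heapq
import random


def invade(steps, seed=0):
    """Invasion percolation without trapping on Z^2, run for `steps` steps.

    Returns (order, sites): `order` lists the invaded bonds with their
    weights in the order e_1, e_2, ..., and `sites` is the vertex set of G_steps.
    Weights omega(e) are sampled lazily, i.i.d. uniform on [0, 1).
    """
    rng = random.Random(seed)
    omega = {}            # bond -> weight, assigned when the bond is first seen
    sites = {(0, 0)}      # G_0 contains only the origin
    order = []
    boundary = []         # min-heap of (weight, bond): the outer boundary of G_i

    def push_bonds(v):
        # Add to the boundary every bond at v that has not been seen yet.
        x, y = v
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            e = (min(v, w), max(v, w))   # canonical name of the bond <v, w>
            if e not in omega:
                omega[e] = rng.random()
                heapq.heappush(boundary, (omega[e], e))

    push_bonds((0, 0))
    for _ in range(steps):
        weight, e = heapq.heappop(boundary)   # bond minimizing omega on the boundary
        order.append((e, weight))
        for v in e:
            if v not in sites:
                sites.add(v)
                push_bonds(v)
    return order, sites
```

A call such as `invade(10_000)` returns the first 10,000 invaded bonds; the empirical distribution of the accepted weights can then be compared with the uniform law on $[0, p_c]$ discussed in Sect. 1.2.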

1.2. Previous results. Numerical work by Wilkinson and Willemsen [28] in dimensions $d = 2, 3$ has indicated that the empirical distribution of the values $\{\omega(e_i)\}_{i=1}^{t}$ accepted into the invasion cluster up to time $t$ is asymptotically uniform on $[0, p_c]$, as $t \to \infty$, where $p_c$ is the percolation threshold. Their work has also shown the fractal nature of the invaded region. Namely, with $S_n$ denoting the intersection of $S$ with the box of radius $n$ centered at the origin, their results indicated that $|S_n|$ obeys a power law as $n \to \infty$. Here $|A|$ denotes the number of elements of the set $A$.

To the best of our knowledge, the mathematical study of invasion percolation started with two papers of Chayes, Chayes and Newman [9, 10] who rigorously established, among other things, the uniformity of the empirical distribution on $[0, p_c]$. (In spatial dimensions $d \ge 3$ they proved this modulo major conjectures that later have been established [21, 22, 1, 13].) They also obtained results regarding the fractal nature of the invaded region. They showed that it has zero volume fraction, provided there is no percolation at $p_c$, and that its surface to volume ratio is $(1 - p_c)/p_c$, the same as the asymptotic ratio for large critical clusters.

An object that turns out to be related to invasion arose in Kesten’s analysis of the “incipient infinite cluster” [17]. Condition the cluster of the origin, in critical Bernoulli percolation, to intersect the boundary of the box of radius $n$. Letting $n \to \infty$, an infinite cluster is obtained, which we will call the IIC. Kesten showed that the limit exists, at least when $d = 2$. (The precise statement of his result is recalled in Sect. 1.5.) With $\pi_n$ denoting the probability that the origin is connected to the boundary of $[-n, n] \times [-n, n]$ in critical percolation, he also proved that the $k$th moment of the intersection of the IIC with $[-n, n] \times [-n, n]$ is of order $(n^2 \pi_n)^k$, for $k \ge 1$. Around the same time, in [6, p. 1102] the invaded region was proposed as another possible definition of the “incipient cluster”. It is of interest to explore the relationship between invasion and the IIC.

Zhang [29] proved results for the fractal dimension of $S$ in $d = 2$. He showed that for any $\varepsilon > 0$, with probability tending to 1,

$$n^{2-\varepsilon} \pi_n \le |S_n| \le n^{2+\varepsilon} \pi_n,$$

confirming observations of [28]. Recent breakthroughs¹ by Smirnov [26] and Lawler, Schramm and Werner [19] show that $\pi_n = n^{-5/48+o(1)}$ as $n \to \infty$, on the triangular site lattice. This shows that, at least on this lattice, the dimensions of the invaded region and the IIC are both 91/48. By the conjectured universality of this exponent, this presumably holds on all common 2D lattices.

In this paper, we establish further close links between invasion and the IIC in two dimensions. First we show that the $k$th moment of $|S_n|$ is of order $(n^2 \pi_n)^k$, improving the moment bound of Zhang by a factor of $n^{\varepsilon}$. We find the improvement interesting for two reasons. We can show that the distribution of $|S_n|/(n^2 \pi_n)$ is tight, which establishes the correct scaling, and the scaling is the same as for the IIC. On the other hand, we hope that our refinement of the method of Zhang may be helpful in the rigorous study of invasion percolation.

Since the fractal dimension is only a crude measure of the geometry, it is of interest to compare the structures of the invaded region and the IIC in more detail. Regarding this, we prove the following. Let $0 < k < \infty$ be fixed, and let $v$ be an invaded site far away from the origin. We look at the invaded region in a window of size $k$ centered at $v$. We show that, as $|v| \to \infty$, the distribution of invaded sites inside the window approaches the distribution of sites connected to 0 in the IIC. We can also show a somewhat harder result. The same asymptotic distribution is obtained, if $v$ is chosen uniformly at random from $S_n$, as $n \to \infty$. The latter is analogous to results in [15].

We carry out our analysis for bond percolation on the square lattice. This is only a matter of convenience. The proofs do not use the lattice structure in an essential way, and work whenever the Russo-Seymour-Welsh technology and the method and results of [18, Theorem 2] are applicable. In the following section we introduce some more notation.
Two simple observations that are well known, but crucial for our analysis, are given in Sect. 1.4. The precise formulations of our main theorems are given in Sect. 1.5.

¹ The first version of our paper was submitted before these results were announced.

1.3. Notation. Restricting from now on to the case $d = 2$, we denote the underlying probability measure (resp. expectation) of our model by $P$ (resp. $E$). The space of configurations is denoted by $([0, 1]^{\mathbb{E}^2}, \mathcal{G})$, where $\mathcal{G}$ is the natural $\sigma$-field on $[0, 1]^{\mathbb{E}^2}$. We fix some notation regarding Bernoulli percolation; for more background see [12]. The event that some site in the set $A$ is connected by $p$-open bonds to some site in the set $B$ is denoted by $A \stackrel{p}{\longleftrightarrow} B$. The event that there is an infinite $p$-open path starting at the vertex $v$ is denoted by $v \stackrel{p}{\longleftrightarrow} \infty$. The percolation probability is defined by

$$\theta(p) = P\big(0 \stackrel{p}{\longleftrightarrow} \infty\big),$$

and the critical probability is $p_c = \inf\{p : \theta(p) > 0\}$. The special feature that $p_c = 1/2$ for the square lattice will not be used. For $v = (v_1, v_2) \in \mathbb{Z}^2$ we let $|v| = \max\{|v_1|, |v_2|\}$. We define the box

$$B(n, v) = \{w \in \mathbb{Z}^2 : |w - v| \le n\},$$

and write $B(n)$ for $B(n, 0)$. The boundary of the box is defined by $\partial B(n) = \{w \in \mathbb{Z}^2 : |w| = n\}$. An important quantity for us is the point-to-box connectivity

$$\pi(p, n) = P\big(0 \stackrel{p}{\longleftrightarrow} \partial B(n)\big).$$

We abbreviate $\pi(p_c, n)$ to $\pi_n$, and write $s(n) = n^2 \pi_n$.

1.4. Two observations. The following two facts form the basis of the understanding of why invasion percolation is critical [9, 29].

(A) Let $p > p_c$. Then there exists, with probability 1, an infinite $p$-open cluster. Suppose that for some $i$ the graph $G_i$ in the definition of the invasion process contains a vertex of this cluster. Then all edges invaded after time $i$ have $\omega$-value less than $p$. In other words, once the invasion process reaches the infinite $p$-open cluster, it cannot leave it.

(B) Fix some time $i$, and consider the set of bonds

$$H_i = \big\{e = \langle v, w \rangle \in \mathbb{E}^2 : \omega(e) < p_c, \text{ and } v \stackrel{p_c}{\longleftrightarrow} G_i\big\}.$$

In words, $H_i$ is the set of edges that have a $p_c$-open connection to $G_i$ (and are themselves $p_c$-open). For common two-dimensional lattices, including the square lattice, it has been established that $\theta(p_c) = 0$ [12, Sect. 11.3], which implies that $|H_i| < \infty$ almost surely. This means that (almost surely) all edges in $H_i$ will be invaded before an $\omega$-value $\ge p_c$ is selected. In other words, the entire $p_c$-open cluster of any invaded site is also invaded.

1.5. Main results. All constants that appear below are strictly positive and finite. Constants denoted by $C_i$ in different theorems may be different. Recall that $S_n = S \cap B(n)$ denotes the set of invaded sites in the box $B(n)$. In [29, Theorem 1] the following bounds on the moments of $|S_n|$ are shown. For any $t \ge 1$ there is $C_1(t)$, such that

$$E|S_n|^t \ge C_1(t) \big[n^2 \pi_n\big]^t, \tag{1.1}$$

and for any $t \ge 1$ and $\varepsilon > 0$ there is $C_2(t, \varepsilon)$, such that

$$E|S_n|^t \le C_2(t, \varepsilon) \big[n^{2+\varepsilon} \pi_n\big]^t. \tag{1.2}$$

Our first theorem improves the upper bound.

Theorem 1. For $t \ge 1$ there is a constant $C(t)$ such that

$$E|S_n|^t \le C(t) \big[n^2 \pi_n\big]^t. \tag{1.3}$$

Once Theorem 1 is proved, it is not hard to obtain the following tightness result.

Theorem 2. We have

$$\lim_{\varepsilon \downarrow 0} \inf_n P\Big(\varepsilon \le \frac{|S_n|}{n^2 \pi_n} \le \frac{1}{\varepsilon}\Big) = 1. \tag{1.4}$$

The analogues of (1.1), (1.3) and (1.4) are known to hold for the intersection of the IIC with $B(n)$ by the results of [17, Theorem 8].

In order to compare the local geometry of the invaded region with the IIC, we recall Kesten’s construction. For this, let $\mathcal{F} = \sigma(\omega_{p_c}(e);\ e \in \mathbb{E}^2)$. The $\sigma$-field $\mathcal{F}$ is generated by the collection $\mathcal{F}_0$ of events that only depend on finitely many of the values $\omega_{p_c}(\cdot)$. It is shown in [17] that for any $E \in \mathcal{F}_0$ the limit

$$\nu(E) = \lim_{n \to \infty} P\big(E \,\big|\, 0 \stackrel{p_c}{\longleftrightarrow} \partial B(n)\big) \tag{1.5}$$

exists. It follows that $\nu$ has a unique extension to a probability measure on $\mathcal{F}$, and under the measure $\nu$, the cluster

$$C(0) = \big\{v \in \mathbb{Z}^2 : v \stackrel{p_c}{\longleftrightarrow} 0\big\}$$

is almost surely infinite. The distribution of the cluster $C(0)$ under $\nu$ is called the IIC.

By (A) and (B) in Sect. 1.4, it is plausible that if a site $v$ with $|v|$ large is in the invaded region, then the invasion neighborhood of $v$ typically coincides with a large critical percolation cluster, and therefore with the IIC. To formulate this statement, we need some more notation. For $v \in \mathbb{Z}^2$ let $\tau_v$ denote the translation of the lattice by $v$. For a configuration $\omega$ and an edge $\langle x, y \rangle$ we let $\tau_v \omega(\langle x, y \rangle) = \omega(\langle x - v, y - v \rangle)$, and for an event $A$ we let $\tau_v A = \{\tau_v \omega : \omega \in A\}$. Let $K$ be a finite set of edges. Define the event $E_K = \{K \subset S\}$, and for $v \in \mathbb{Z}^2$ let $T_v E_K = \{\tau_v K \subset S\}$. The latter is the event that the edges in the translated set $\tau_v K$ are invaded. (We use the notation $T_v$, since this event is not the same as $\tau_v E_K$.) To make the connection with the IIC define $E'_K = \{K \subset C(0)\} \in \mathcal{F}$. As $K$ varies over all finite sets of edges the events $E'_K$ provide all information about the set $C(0)$.

Theorem 3. For any $E \in \mathcal{F}_0$ we have

$$\lim_{|v| \to \infty} P(\tau_v E \mid v \in S) = \nu(E).$$

Also, for any finite $K \subset \mathbb{E}^2$ we have

$$\lim_{|v| \to \infty} P(T_v E_K \mid v \in S) = \nu(E'_K).$$

The content of the first statement is that asymptotically, the only information the condition $v \in S$ gives about the neighborhood of $v$ is that $v$ lies in a large $p_c$-open cluster. The second statement says that the distribution of invaded bonds near $v$ is given by the IIC measure. Instead of a deterministic site $v$, we can prove such a result for a site chosen uniformly at random from $S_n$.

Theorem 4. Let $I_n$ denote a vertex of $S_n = S \cap B(n)$ chosen uniformly at random, given $S_n$. For any $E \in \mathcal{F}_0$ we have

$$\lim_{n \to \infty} P(\tau_{I_n} E) = \nu(E),$$

and for any finite $K \subset \mathbb{E}^2$ we have

$$\lim_{n \to \infty} P(T_{I_n} E_K) = \nu(E'_K).$$

The proof of Theorem 4 is similar to that of Theorem 3. We do not think that Theorem 4 could be deduced directly from Theorem 3. The choice of $I_n$ contains information about $|S_n|$, and a priori one does not know the influence of this on the configuration near $I_n$. We shall return to this issue when we discuss the proof of Theorem 4.

The next section contains the proofs of Theorems 1 and 2. Some preliminary results are summarized in Sect. 2.2. Theorems 3 and 4 are proved in Sect. 3.

2. Upper Bound on the Number of Invaded Sites

2.1. Heuristic argument. We describe the main idea of the proof of Theorem 1 in the case $t = 1$, given some scaling assumptions. The actual proof is the translation of the argument below into rigorous statements valid on a number of 2D lattices. Our two main assumptions are that $\theta(p)$ scales like $\pi(p_c, \xi(p))$ for $p > p_c$, where $\xi$ is the correlation length, and that $\pi(p_c, m)$ obeys a power law. For simplicity, we even assume that the latter scales like $m^{-5/48}$. This is in fact known for the triangular site lattice by the work of Smirnov [26] and Lawler, Schramm and Werner [19] which has led to enormous progress [27]. However, in the rest of the paper we will use an argument independent of the lattice.

Assume that $n = 2^k$, and let $X_k$ be the number of invaded sites in the annulus $A_k = B(2^k) \setminus B(2^{k-1})$. We show that $E X_k \le C n^2 \pi_n = C s(n)$, which is essentially what we need. In using (A) of Sect. 1.4 for an upper bound, we have to find $p_k > p_c$ so that with high probability the invasion is already in the infinite $p_k$-open cluster by the time it reaches $A_k$. An event which ensures that this cluster is reached is

$$H_k = \big\{\text{there is a } p_k\text{-open circuit } D \text{ in } A_{k-1}, \text{ and } D \stackrel{p_k}{\longleftrightarrow} \infty\big\}. \tag{2.1}$$

We want to choose $p_k$ as close to $p_c$ as possible to get a good upper bound on $X_k$ in terms of the infinite cluster, but we also need $P(H_k^c)$ to be small. The proof of (1.2) was essentially based on the optimal choice of $p_k$. Choose $p_k$ to satisfy

$$\xi(p_k) = \frac{n}{C_1 \log n} = \frac{2^k}{C_1 k},$$

where $C_1$ is a large constant. As will be clear from the computations in the next paragraph, this leads to a bound of the form $E X_k \le C s(n) (\log n)^c$ with $c > 0$. Such a bound is implicit in [29].

One can improve this using several $p_k$’s. For example, take $p_k(0) > p_k(1) > p_c$ satisfying

$$\xi(p_k(0)) = \frac{n}{C_1 \log n}, \qquad \xi(p_k(1)) = \frac{n}{C_1 \log\log n}.$$

Define the events $H_k(0)$ and $H_k(1)$ by replacing $p_k$ in (2.1) by $p_k(0)$ and $p_k(1)$. To bound the probabilities of $H_k(j)^c$ first note that for $p > p_c$ the crossing probability of a square of linear scale $m > \xi(p)$ is $1 - O(\exp(-a m/\xi(p)))$, for some constant $a > 0$. Since the shortest scale on which connections are required for the event $H_k(0)$ (resp. $H_k(1)$) is of order $(\log n)\,\xi(p_k(0))$ (resp. $(\log\log n)\,\xi(p_k(1))$), this leads to the bounds

$$P(H_k(0)^c) \le C \exp(-c \log n), \qquad P(H_k(1)^c) \le C \exp(-c \log\log n). \tag{2.2}$$

Here $c$ can be made large by choosing $C_1$ large in the definition of $p_k(0)$ and $p_k(1)$. We write

$$E X_k = E(X_k; H_k(0)^c) + E(X_k; H_k(0) \cap H_k(1)^c) + E(X_k; H_k(1)). \tag{2.3}$$

Using (2.2), the first term is bounded above by $|A_k|\, P(H_k(0)^c) = O(n^2 n^{-c})$. Recalling that $s(n) = n^2 \pi_n \approx n^2 n^{-5/48}$, we see that the right-hand side is $o(s(n))$, if $c$ is large enough.

For the second term of (2.3), on the event $H_k(0)$ we can bound $X_k$ from above by the intersection of $A_k$ with the $p_k(0)$-open infinite cluster. Let $Z_k(0)$ denote the size of this intersection. Then the second term is bounded above by

$$E(Z_k(0); H_k(1)^c) \le E(Z_k(0))\, P(H_k(1)^c) \le |A_k|\, \theta(p_k(0))\, C (\log n)^{-c}, \tag{2.4}$$

where we used the FKG inequality in the first step, and (2.2) in the second. By our scaling assumptions and the definition of $p_k(0)$, we have

$$\theta(p_k(0)) \approx \pi(p_c, \xi(p_k(0))) \approx \Big(\frac{n}{\log n}\Big)^{-5/48} \approx n^{-5/48} (\log n)^{5/48}.$$

It follows that the right-hand side of (2.4) and therefore the second term of (2.3) are bounded above by $C n^2 n^{-5/48} (\log n)^{5/48} (\log n)^{-c}$. This quantity is again $o(s(n))$, if $c$ is large enough.

Finally, the third term of (2.3) is bounded above by

$$E(Z_k(1)) = |A_k|\, \theta(p_k(1)) \approx |A_k|\, \pi(p_c, \xi(p_k(1))) = O\big(n^2 n^{-5/48} (\log\log n)^{5/48}\big).$$

We have shown that $E X_k \le C s(n) (\log\log n)^{5/48}$. By similar arguments we can prove an upper bound with any number of logarithms. Furthermore, a careful look at the argument will show that the bound $C s(n)$ in fact holds.
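To see how little is lost at each stage of the heuristic, one can tabulate the correction factor $(\log^{(j)} n)^{5/48}$ that multiplies $s(n)$ when the finest threshold sits at scale $n/(C_1 \log^{(j)} n)$. The Python sketch below is illustrative only: the function name `correction_factor` is an assumption, all constants are dropped, and the exponent 5/48 is the scaling assumption of this section. With `depth=2` it corresponds to the factor $(\log\log n)^{5/48}$ obtained above.

```python
import math

# One-arm exponent assumed in the heuristic of Sect. 2.1 (triangular site lattice).
EXPONENT = 5.0 / 48.0


def correction_factor(n, depth):
    """Return (log^(depth) n) ** (5/48), with log iterated `depth` >= 1 times.

    Up to constants, this is the heuristic ratio E X_k / s(n) when the
    smallest threshold p_k(depth - 1) is used for the final bound.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    x = float(n)
    for _ in range(depth):
        if x <= 1.0:
            raise ValueError("iterated logarithm must stay above 1")
        x = math.log(x)
    return x ** EXPONENT


if __name__ == "__main__":
    n = 2 ** 40
    for depth in (1, 2, 3):
        print(depth, correction_factor(n, depth))
```

The factors decrease towards 1 as more logarithms are taken, which is consistent with the claim that the bound $C s(n)$ holds without any logarithmic correction.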

2.2. Preliminaries. Our main tool for the rigorous argument will be the finite-size scaling correlation length introduced in [8, Sect. 3] and further studied in [18]. (See also [4] for a recent account.) Let

$$\sigma(n, m, p) = P(\text{there is a } p\text{-open horizontal crossing of } [0, n] \times [0, m]),$$

where it is assumed that the open crossing does not use bonds lying on the top and bottom sides of the rectangle. Given $\varepsilon > 0$, we define

$$L(p, \varepsilon) = \min\{n : \sigma(n, n, p) \ge 1 - \varepsilon\}, \quad \text{for } p > p_c. \tag{2.5}$$

It is known [18, (1.24)] that there exists an $\varepsilon_0 > 0$ such that for $\varepsilon \le \varepsilon_0$ the scaling of $L(p, \varepsilon)$ is independent of $\varepsilon$, in the sense that

$$L(p, \varepsilon_1) \asymp L(p, \varepsilon_2), \quad \text{for fixed } 0 < \varepsilon_1, \varepsilon_2 \le \varepsilon_0. \tag{2.6}$$

The symbol $\asymp$ means that the ratio of the two sides is bounded away from 0 and $\infty$ as $p \downarrow p_c$. It is also known that $L(p, \varepsilon_0)$ scales like the usual correlation length [18, Corollary 2], but we will not use this fact explicitly. We are going to take $\varepsilon = \varepsilon_0$, and let $L(p) = L(p, \varepsilon_0)$ for the entire proof. We summarize the properties of $L(p)$ that we need.

1. From the definition it is clear that $L(p)$ is decreasing, right continuous and

$$L(p) \to \infty \quad \text{as } p \downarrow p_c. \tag{2.7}$$

2. If $\varepsilon_0$ is small enough, there are constants $C_1$ and $C_2$ such that

$$\sigma(2mL(p), mL(p), p) \ge 1 - C_1 \exp(-C_2 m), \quad \text{for } m > 1. \tag{2.8}$$

This can be shown using ideas of [2, 8, 6] by the rescaling argument of [6, Lemma 2.7]. Indeed, in [6] the rescaling bound

$$\sigma(2n, n, p) \ge 1 - \frac{1}{16} \lambda \quad \text{implies} \quad \sigma(4n, 2n, p) \ge 1 - \frac{1}{16} \lambda^2 \tag{2.9}$$

is shown. One can iterate this starting with $n = L(p)$, and use the Russo-Seymour-Welsh Lemma (RSW Lemma) [12, Sect. 11.7] to get an initial bound when $n = L(p)$.

3. It will be important for us that the jumps of $L(p)$ are bounded on a logarithmic scale; there is a constant $D$ such that

$$\lim_{\delta \downarrow 0} \frac{L(p - \delta)}{L(p)} \le D \quad \text{for } p > p_c. \tag{2.10}$$

For the subcritical version of the finite-size scaling length this was observed in [4]. In two dimensions their proof is easily adapted to the supercritical case. Indeed, the rescaling bound (2.9) implies that (2.10) holds for the quantity

$$\tilde{L}(p, \varepsilon) = \min\{n : \sigma(2n, n, p) \ge 1 - \varepsilon\}, \quad p > p_c,$$

with $D = 2$, when $\varepsilon < 1/16$. It is simple to deduce from this that (2.10) also holds for $L(p)$. The simple inequality $\sigma(n, n, p) \ge \sigma(2n, n, p)$ shows that $L(p, \varepsilon) \le \tilde{L}(p, \varepsilon)$. On the other hand, using the RSW Lemma one can show that for some function $f(\varepsilon)$ we have $\tilde{L}(p, f(\varepsilon)) \le L(p, \varepsilon)$, where $f(\varepsilon) \to 0$ as $\varepsilon \to 0$. Together with (2.6) this establishes (2.10).

4. Finally, the following theorem makes it precise that $\theta(p)$, $\pi(p, L(p))$ and $\pi(p_c, L(p))$ obey the same scaling when $p > p_c$.

Theorem [18, Theorem 2]. There are constants $C_1$ and $C_2$ such that for $p > p_c$,

$$\pi(p_c, L(p)) \le \pi(p, L(p)) \le C_1 \theta(p) \le C_1 \pi(p, L(p)) \le C_2 \pi(p_c, L(p)). \tag{2.11}$$

As for the behavior of $\pi(p_c, n)$, it will be enough for us to have a power law lower bound. Using the idea of [3, Cor. 3.15] one can show that there exists a constant $D_1$, such that

$$\frac{\pi(p_c, m)}{\pi(p_c, n)} \ge D_1 \Big(\frac{n}{m}\Big)^{1/2}, \quad m \ge n \ge 1. \tag{2.12}$$

2.3. Proof of Theorem 1. We first prove the case $t = 1$; the extension to higher moments will not pose extra difficulties. We still write $s(n) = n^2 \pi_n$ for short. By monotonicity of $\pi_n$ we may assume that $n$ is a power of 2. Indeed, if $2^K \le n < 2^{K+1}$, then $s(2^{K+1}) \le 4 s(n)$. Assuming $n = 2^K$, we divide $B(n)$ into disjoint annuli; $B(n) = \cup_{k=1}^{K} A_k$, where

$$A_k = B(2^k) \setminus B(2^{k-1}) = \{v \in \mathbb{Z}^2 : 2^{k-1} < |v| \le 2^k\}, \quad \text{for } k \ge 2,$$

and $A_1 = B(2)$. Letting $X_k = |S \cap A_k|$ we can write

$$|S_n| = X_1 + \cdots + X_K. \tag{2.13}$$

We are going to bound $E X_k$. Following the idea in Sect. 2.1 we start by defining a suitable sequence $p_k(0) > p_k(1) > \dots > p_c$. We introduce the following notation. Let $\log^{(0)} k = k$, and let

$$\log^{(j)} k = \log(\log^{(j-1)} k), \quad \text{for } j \ge 1, \text{ if the right-hand side is well-defined.}$$

Here $\log$ denotes natural logarithm. For $k > 10$ we define

$$\log^* k = \min\{j > 0 : \log^{(j)} k \text{ is well-defined and } \log^{(j)} k \le 10\}.$$

Our choice of the constant 10 is quite arbitrary. It is immediate that $\log^{(j)} k > 2$, for $j = 0, 1, \dots, \log^* k$ and $k > 10$. Let

$$p_k(j) = \inf\Big\{p > p_c : L(p) \le \frac{2^k}{C_3 \log^{(j)} k}\Big\}, \quad j = 0, 1, \dots, \log^* k, \tag{2.14}$$

where the constant $C_3$ will be chosen later to be large. Since $L(p) \to \infty$, as $p \downarrow p_c$, $p_k(j)$ is well-defined, at least for $k \ge$ some $k_0 = k_0(C_3)$. We assume $k_0 > 10$. From the right continuity of $L(p)$ it follows that $2^k / L(p_k(j)) \ge C_3 \log^{(j)} k$. Together with (2.10) this implies that

$$C_3 \log^{(j)} k \le \frac{2^k}{L(p_k(j))} \le D C_3 \log^{(j)} k. \tag{2.15}$$

We define the events

$$H_k(j) = \big\{\text{there is a } p_k(j)\text{-open circuit } D \text{ in } A_{k-1}, \text{ and } D \stackrel{p_k(j)}{\longleftrightarrow} \infty\big\}, \tag{2.16}$$

where $k \ge k_0$, $0 \le j \le \log^* k$. Here, and later, we always understand that the circuit surrounds $B(2^{k-1})$. On the event $H_k(j)$ the invasion is already in the $p_k(j)$-open infinite cluster by the time it reaches $A_k$. We can find an upper bound for $P(H_k(j)^c)$ using standard 2D constructions [6, Fig. 6]; [7]. We have (see Fig. 1)

$$H_k(j) \supset J_k(j) \cap \bigcap_{m=0}^{\infty} J_k^m(j), \tag{2.17}$$

where (dropping the index $j$, for convenience)

$$J_k = \{\text{there is a } p_k(j)\text{-open circuit in } A_{k-1}\},$$

$$J_k^m = J_k^{m,h} \cap J_k^{m,v},$$

$$J_k^{m,h} = \big\{\text{there is a } p_k(j)\text{-open horizontal crossing of } [2^{k-2+m}, 2^{k+m}] \times [-2^{k-2+m}, 2^{k-2+m}]\big\}, \quad m \ge 0,$$

$$J_k^{m,v} = \big\{\text{there is a } p_k(j)\text{-open vertical crossing of } [2^{k-1+m}, 2^{k+m}] \times [-2^{k-1+m}, 2^{k-1+m}]\big\}, \quad m \ge 0.$$

Fig. 1. A sketch of the event $J_k \cap J_k^{0,h} \cap J_k^{0,v} \cap J_k^{1,h} \cap J_k^{1,v}$ (the annulus $A_{k-1}$ is marked)
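The iterated logarithm $\log^{(j)} k$ and the index $\log^* k$ introduced before (2.14) are easy to compute, which may help when checking the range of $j$ in the events $H_k(j)$. A minimal Python sketch follows; the function names `iterated_log` and `log_star` are illustrative assumptions, and the cutoff 10 is the arbitrary constant chosen in the text.

```python
import math


def iterated_log(k, j):
    """Return log^(j) k, the natural logarithm applied j times (log^(0) k = k)."""
    x = float(k)
    for _ in range(j):
        if x <= 0.0:
            raise ValueError("iterated logarithm is not well-defined")
        x = math.log(x)
    return x


def log_star(k, threshold=10.0):
    """Return min{j > 0 : log^(j) k <= threshold}, defined for k > threshold."""
    if k <= threshold:
        raise ValueError("log* k is only defined for k > threshold")
    j, x = 0, float(k)
    while True:
        j += 1
        x = math.log(x)       # x = log^(j) k; the argument exceeded 10, so it is positive
        if x <= threshold:
            return j
```

For example, `log_star(11)` is 1 because $\log 11 \le 10$, while any $k$ slightly above $e^{10}$ already needs two logarithms.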

We bound the probabilities of $J_k(j)^c$, $J_k^m(j)^c$ and $H_k(j)^c$ using (2.8). By RSW arguments

$$P(J_k(j)^c) \le 4\big(1 - \sigma(2^k, 2^{k-2}, p_k(j))\big) \le 16\big(1 - \sigma(2^{k-1}, 2^{k-2}, p_k(j))\big). \tag{2.18}$$

Therefore, by (2.8) and (2.15) we have

$$P(J_k(j)^c) \le 16\, C_1 \exp\big(-C_2 2^{k-2} / L(p_k(j))\big) \le 16\, C_1 \exp\Big(-\frac{1}{4} C_2 C_3 \log^{(j)} k\Big). \tag{2.19}$$

Similarly, we find that

$$P(J_k^m(j)^c) \le 2\big(1 - \sigma(2^{k+m}, 2^{k+m-1}, p_k(j))\big) \le 2 C_1 \exp\big\{-C_2 2^{k+m-1} / L(p_k(j))\big\} \le 2 C_1 \exp\Big\{-\frac{1}{2} C_2 C_3 2^m \log^{(j)} k\Big\}. \tag{2.20}$$

Summing over $m$ and using (2.19) we get

$$P(H_k(j)^c) \le P(J_k(j)^c) + \sum_{m=0}^{\infty} P(J_k^m(j)^c) \le (16\, C_1 + C_4) \exp\Big\{-\frac{1}{4} C_2 C_3 \log^{(j)} k\Big\}.$$

Since $\log^{(j)} k > 2$, the constant $C_4$ does not depend on $C_3$ as long as $C_3$ is larger than some fixed positive number. Writing $c = C_2 C_3 / 4$ for short, we have

$$P(H_k(j)^c) \le C_5 \exp(-c \log^{(j)} k), \tag{2.21}$$

where the constant $c$ depends on $C_3$, and can be made large by choosing $C_3$ large. On the event $H_k(j)$ we have

$$X_k\, I[H_k(j)] \le Z_k(j) \stackrel{\text{def}}{=} \big|\{v \in A_k : v \stackrel{p_k(j)}{\longleftrightarrow} \infty\}\big|.$$

Since $\log^{(0)} k > \cdots > \log^{(\log^* k)} k$, we have $p_k(0) \ge \cdots \ge p_k(\log^* k)$, and hence $H_k(0) \supset \cdots \supset H_k(\log^* k)$. Using the notation $H_k(\log^* k + 1) = \emptyset$, for $k > k_0$ we have

$$E X_k = E(X_k; H_k(0)^c) + \sum_{j=1}^{\log^* k + 1} E(X_k; H_k(j-1) \cap H_k(j)^c) \le |A_k|\, P(H_k(0)^c) + \sum_{j=1}^{\log^* k} E(Z_k(j-1); H_k(j)^c) + E Z_k(\log^* k). \tag{2.22}$$

By (2.21), the first term on the right-hand side is less than $|A_k| C_5 e^{-ck}$. For the second term of (2.22), observe that $Z_k(j-1)$ is a decreasing variable as a function of the edge-values $\{\omega(e)\}_{e \in \mathbb{E}^2}$, and $H_k(j)^c$ is increasing. By the FKG inequality [12] and (2.21) we get

$$E(Z_k(j-1); H_k(j)^c) \le E Z_k(j-1) \cdot P(H_k(j)^c) \le |A_k|\, \theta(p_k(j-1))\, C_5 \exp\{-c \log^{(j)} k\}. \tag{2.23}$$

As for the last term in (2.22) we have $E Z_k(\log^* k) \le |A_k|\, \theta(p_k(\log^* k))$. We compare $\theta(p_k(j))$ to $\pi(p_c, 2^k)$. An application of (2.11), (2.12) and (2.15) yields

$$\theta(p_k(j)) \le C_6\, \pi(p_c, L(p_k(j))) = C_6\, \pi(p_c, 2^k)\, \frac{\pi(p_c, L(p_k(j)))}{\pi(p_c, 2^k)} \le \frac{C_6}{D_1}\, \pi(p_c, 2^k) \Big(\frac{2^k}{L(p_k(j))}\Big)^{1/2} \le \frac{C_6}{D_1}\, \pi(p_c, 2^k) \big(D C_3 \log^{(j)} k\big)^{1/2}. \tag{2.24}$$

Here $C_6$ is the constant $C_2 / C_1$ from (2.11). For $j = \log^* k$ we have $\log^{(j)} k \le 10$, which shows that $\theta(p_k(\log^* k)) = O(\pi(p_c, 2^k))$.

The bounds (2.24) and (2.23) imply that the right-hand side of (2.22) is less than

$$C_7 |A_k|\, \pi(p_c, 2^k) \Big[\frac{\exp\{-ck\}}{\pi(p_c, 2^k)} + \sum_{j=1}^{\log^* k} \big(\log^{(j-1)} k\big)^{1/2 - c} + 1\Big], \tag{2.25}$$

where the constant $C_7$ depends on $C_3$. We show that the expression in the square brackets is less than a constant, if $c$ is large enough (and therefore if $C_3$ is large enough). First, by (2.12) we have $\pi(p_c, 2^k) \ge C_8 2^{-k/2}$. If we choose $c \ge (\log 2)/2$, then $e^{-ck} / \pi(p_c, 2^k) \le 1/C_8$. In order to bound the sum over $j$, we require that $c \ge 3/2$. Then it is enough to show that

$$\sup_{k > 10} \sum_{j=1}^{\log^* k} \big(\log^{(j-1)} k\big)^{-1} \le C_9 < \infty. \tag{2.26}$$

Recalling that $\log^{(j)} k > 2$, and applying this inequality with $j = \log^* k$, we see that the last term of the sum in (2.26) is at most $(e^2)^{-1}$. Similarly, the penultimate term is at most $(\exp\{e^2\})^{-1}$. By induction, this leads to the upper bound

$$\frac{1}{e^2} + \frac{1}{e^{e^2}} + \frac{1}{e^{e^{e^2}}} + \cdots = C_9$$

on the left-hand side of (2.26). It follows that the expression in (2.25) is bounded above by $C_7 |A_k|\, \pi(p_c, 2^k) [C_8^{-1} + C_9 + 1]$, if $c \ge 3/2 = \max\{(\log 2)/2, 3/2\}$. Fix $C_3$ so that this holds. Recalling that $|A_k|\, \pi(p_c, 2^k) = O(s(2^k))$, we see that $E X_k \le C_{10}\, s(2^k)$ for $k \ge k_0$. Increasing $C_{10}$, if necessary, we may assume this holds for all $k$. Summing over $k$ and recalling that $n = 2^K$, (2.13) and (2.12) yield

$$E|S_n| \le C_{10} \sum_{k=1}^{K} s(2^k) \le C_{10}\, s(n) \sum_{k=1}^{K} \frac{2^{2k}\, \pi(p_c, 2^k)}{2^{2K}\, \pi(p_c, 2^K)} \le \frac{C_{10}}{D_1}\, s(n) \sum_{k=1}^{K} 2^{2(k-K)}\, 2^{-\frac{1}{2}(k-K)} \le C_{11}\, s(n). \tag{2.27}$$

This finishes the proof of the case $t = 1$ of Theorem 1.

For the extension to higher moments note that by Jensen’s inequality we may restrict to integer $t$. We first prove a bound on $E X_k^t$. We have $X_k^t\, I[H_k(j)] \le Z_k(j)^t$. By the method of [23] or [17] we obtain the bound $E Z_k(j)^t \le C_{12}(t) [|A_k|\, \pi(p_k(j), 2^k)]^t$. Using that $2^k \ge L(p_k(j))$ and recalling (2.11), (2.12) and (2.15), we have

$$\pi(p_k(j), 2^k) \le \pi(p_k(j), L(p_k(j))) \le C_{13}\, \pi(p_c, L(p_k(j))) \le C_{14}\, \pi(p_c, 2^k) \big(D C_3 \log^{(j)} k\big)^{1/2}.$$

This gives the following bound analogous to (2.25):

$$E X_k^t \le C_{15}(t, C_3) [s(2^k)]^t \Big[\frac{\exp\{-ck\}}{\pi(p_c, 2^k)^t} + \sum_{j=1}^{\log^* k} \big(\log^{(j-1)} k\big)^{t/2 - c} + 1\Big].$$

If $c \ge \max\{t \log 2 / 2, (t + 2)/2\}$ the expression inside the square brackets is bounded by a constant, hence by choosing $C_3$ large enough we get

$$E X_k^t \le C_{16}(t) [s(2^k)]^t. \tag{2.28}$$

To turn this into a bound on $E|S_n|^t$ we write

$$|S_n|^t = \Big(\sum_{k=1}^{K} X_k\Big)^t = \sum_{1 \le k_1, \dots, k_t \le K}\ \prod_{i=1}^{t} X_{k_i}.$$

By Hölder’s inequality and (2.28) we have

$$E \prod_{i=1}^{t} X_{k_i} \le \prod_{i=1}^{t} \big(E X_{k_i}^t\big)^{1/t} \le C_{16}(t) \prod_{i=1}^{t} s(2^{k_i}).$$

Summing over $k_1, \dots, k_t$, by the calculation in (2.27) we obtain

$$E|S_n|^t \le C_{16}(t) \Big(\sum_{k=1}^{K} s(2^k)\Big)^t \le C_{17}(t) [s(n)]^t,$$

which completes the proof of Theorem 1.

2.4. Tightness. Proof of Theorem 2. From Markov’s inequality and Theorem 1 it follows that

$$\sup_n P\Big(\frac{|S_n|}{n^2 \pi_n} > \frac{1}{\varepsilon}\Big) \le C_1 \varepsilon \to 0, \quad \text{as } \varepsilon \to 0.$$

We can show the required lower bound on $|S_n|$ based on the idea of the lower bound of [29, Theorem 1] and the method of [17, Theorem 8]. Again, we may assume $n = 2^K$. For $k \le K$ let

$$Y_k = \big|\{v \in B(3 \cdot 2^{k-2}) \setminus B(2^{k-1}) : v \stackrel{p_c}{\longleftrightarrow} \partial B(2^k) \text{ inside } A_k\}\big|.$$

Define the event

$$G_k = \Big\{\text{there is a } p_c\text{-open circuit } D \text{ in } B(2^k) \setminus B(3 \cdot 2^{k-2}) \text{ and } Y_k \ge \frac{1}{2} E Y_k\Big\}.$$

By (B) of Sect. 1.4, on the event $G_k$ we have $|S_n| \ge Y_k \ge \frac{1}{2} E Y_k$. As in [17, Theorem 8] one can show that for $k \le K$ we have $E Y_k \ge C_2\, s(2^k) \ge C_2\, 4^{k-K} s(n)$, and $P(G_k) \ge C_3 > 0$. Then for a fixed integer $\ell > 0$ and $\varepsilon < \frac{1}{2} C_2 4^{-\ell}$ we have

$$P\Big(\frac{|S_n|}{s(n)} \le \varepsilon\Big) \le P\Big(\bigcap_{k=K-\ell}^{K} G_k^c\Big) \le (1 - C_3)^{\ell + 1},$$

since the $G_k$ are independent. This proves the second part of the claim.

3. The Invasion Cluster Looks Like the IIC

In Sect. 3.1 we describe the idea of the proof of Theorem 3. The proof of Theorem 4, the random site case, requires additional arguments and is given in Sect. 3.2. We do not give the proof of Theorem 3 in detail, since it is essentially a simplification of the argument for the random site case. The necessary changes are indicated in Sect. 3.3.

3.1. Idea for the fixed site case. Let $E \in \mathcal{F}_0$, and consider the first statement of Theorem 3. To analyze $P(\tau_v E, v \in S)$, let $B(N, v)$ be a box centered at $v$ such that $1 \ll N \ll |v|$. Suppose we know that by the time the invasion reaches $B(N, v)$, it is in a $p$-open infinite cluster with $p - p_c$ very small. Then with large probability all bonds in $B(N, v)$ satisfy $\omega(e) \notin [p_c, p]$. In this case the event $v \in S$ implies $v \stackrel{p_c}{\longleftrightarrow} \partial B(N, v)$. The latter is the conditioning in Kesten’s theorem, so we hope to apply (1.5) for the configuration inside $B(N, v)$, with $v$ replacing the origin, to get

$$P\big(\tau_v E,\ v \stackrel{p_c}{\longleftrightarrow} \partial B(N, v)\big) \approx \nu(E)\, P\big(v \stackrel{p_c}{\longleftrightarrow} \partial B(N, v)\big).$$

To make this work we need to decouple the box from the configuration outside. For this we put an annulus $B(M, v) \setminus B(N, v)$ with $N \ll M \ll |v|$ around the box. With large probability, the annulus will contain a $p_c$-open circuit, that will be used to prevent information from outside from influencing the configuration in $B(N, v)$. It is a well-known consequence of the RSW technology that there are constants $\mu$ and $C$ such that for all $M > N \ge 1$ we have

$$P(\text{there is no } p_c\text{-open circuit in } B(M) \setminus B(N)) \le C \Big(\frac{N}{M}\Big)^{\mu}. \tag{3.1}$$

Therefore our main goal will be to show that $p - p_c$ can be made small enough so that the invasion process inside the circuit mimics critical percolation.

3.2. Random site case. We clarify the extra argument necessary when $v$ is random. We can write

$$P(\tau_{I_n} E) = \sum_{v \in B(n)} E\Big(\frac{I[\tau_v E,\ v \in S]}{|S_n|}\Big). \tag{3.2}$$

If $|S_n|$ was concentrated around its mean, we could easily apply the result of Theorem 3, and obtain Theorem 4 by averaging over $v$. However, one expects the fluctuations of $|S_n|$ to be of the same order as $E|S_n|$. In fact, we can expect that $|S_n| / E|S_n|$ has a non-trivial limit distribution as $n \to \infty$. Nevertheless, considering the box $B(N, v)$ as before, it is natural to believe that for each fixed $v$ the denominator inside the expectation in (3.2) decouples from the event $\tau_v E,\ v \stackrel{p_c}{\longleftrightarrow} \partial B(N, v)$. Thus we hope to apply the same argument as in the fixed site case. However, to make it work, we need to ensure that there is a $p_c$-open circuit in the annulus $B(M, v) \setminus B(N, v)$ even when $v$ is random. For this we will need to use the tightness of $|S_n|$.

Proof (Theorem 4). We start by proving the first statement of the theorem, that is, when $E \in \mathcal{F}_0$. The second statement will only require a little bit of extra argument. We use the notation of the proof of Theorem 1.

We start with the argument that the annulus centered at $I_n$ contains a $p_c$-open circuit with large probability. Recall that $s(n) = n^2 \pi_n$. Let $2^K \le n < 2^{K+1}$. Let $\varepsilon > 0$ be given, which will be used to control errors. By Theorem 2 there is an $x = x(\varepsilon) > 0$, such that

$$P(|S_n| < x s(n)) \le \varepsilon, \quad \text{if } n \text{ is large enough.} \tag{3.3}$$

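Let $\mathrm{An}(a, b) = B(a) \setminus B(b)$, and define
\[
  F = F_{M,N} = \{\text{there is a } p_c\text{-open circuit in } \mathrm{An}(M, N)\}, \qquad
  F(D) = \{D \text{ is the outermost } p_c\text{-open circuit in } \mathrm{An}(M, N)\}.
\]
We are going to choose $M, N$ in the course of the proof so that $1 \ll N \ll M \ll n$. In any case, we assume that $B(N)$ contains all edges on which $E$ depends. From (3.3) we obtain
\begin{align*}
  P(I_n \in B(\sqrt{n}))
  &\le P(|S_n| < x\, s(n)) + E\left[\sum_{v \in B(\sqrt{n})} \frac{I[v \in S]}{|S_n|};\ |S_n| \ge x\, s(n)\right] \\
  &\le \varepsilon + \frac{1}{x\, s(n)}\, E|S \cap B(\sqrt{n})| \\
  &\le \varepsilon + \frac{C_2\, n}{x\, s(n)} \le 2\varepsilon, \tag{3.4}
\end{align*}
provided $n$ is large enough, since $s(n) \ge C n^2 n^{-1/2}$ by (2.12). Recall the definition of the event $H_k(0)$ in (2.16). By (2.21) we have for some $c > 0$,
\[
  \sum_{k = K/2 + 1}^{\infty} P\bigl(H_k(0)^c\bigr) \le C_3 \exp\{-c(K/2 + 1)\} \le \varepsilon, \tag{3.5}
\]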

if $K$, and therefore if $n$ is large enough. The choice of $c$ will not play a role this time. We will drop the argument 0, and write $H_k$ instead of $H_k(0)$, and $p_k$ instead of $p_k(0)$ in the rest of the proof. We also introduce the notation $k_n = K/2$. We want to bound the probability of $\tau_{I_n} F^c$, the event that the required $p_c$-open circuit surrounding $I_n$ does not exist. We have
\begin{align*}
  P(\tau_{I_n} F^c)
  &= E\left[\sum_{v \in B(n)} \frac{I[v \in S,\ \tau_v F^c]}{|S_n|}\right] \\
  &\le P(I_n \in B(\sqrt{n})) + P(|S_n| < x\, s(n)) + P\Bigl(\bigcup_{k = k_n + 1}^{\infty} H_k^c\Bigr) \\
  &\quad + \frac{1}{x\, s(n)} \sum_{v \in \mathrm{An}(n, \sqrt{n})} E\Bigl(I[v \in S,\ \tau_v F^c];\ \bigcap_{k = k_n + 1}^{\infty} H_k\Bigr). \tag{3.6}
\end{align*}
By (3.3), (3.4) and (3.5) the sum of the first three terms on the right-hand side is less than $4\varepsilon$. By the observation $2^{k_n} \le \sqrt{n}$, the sum over $v$ on the right-hand side of (3.6) is less than
\[
  \sum_{k = k_n + 1}^{K+1} \sum_{v \in A_k} E\bigl(I[v \in S,\ \tau_v F^c];\ H_k\bigr)
  \le \sum_{k = k_n + 1}^{K+1} \sum_{v \in A_k} E\Bigl(I\bigl[v \overset{p_k}{\longleftrightarrow} \infty\bigr]\, I[\tau_v F^c]\Bigr).
\]
To explain the last inequality, consider the first time $t_k$ that an edge of the $p_k$-open infinite cluster is invaded. On the event $H_k$ this cannot happen later than the first time the invasion reaches the circuit $D$ in the definition of $H_k$. In particular, at time $t_k$ the invasion will not have reached $A_k$. Hence the vertex $v$ is invaded after time $t_k$, and therefore it is in the $p_k$-open infinite cluster. Altogether this implies that
\[
  P(\tau_{I_n} F^c) \le 4\varepsilon + \frac{1}{x\, s(n)} \sum_{k = k_n + 1}^{K+1} \sum_{v \in A_k} P\bigl(v \overset{p_k}{\longleftrightarrow} \infty,\ \tau_v F^c\bigr). \tag{3.7}
\]

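Applying the FKG inequality and (2.24) we get
\[
  P\bigl(v \overset{p_k}{\longleftrightarrow} \infty,\ \tau_v F^c\bigr) \le \theta(p_k)\, P(F^c) \le C_4\, P(F^c)\, \pi(p_c, 2^k)\, \sqrt{k}. \tag{3.8}
\]
Summing (3.8) over $v \in A_k$ it follows that the right-hand side of (3.7) is less than
\[
  4\varepsilon + \frac{C_4\, P(F^c)}{x\, s(n)} \sum_{k=1}^{K+1} s(2^k)\, \sqrt{k} \le 4\varepsilon + \frac{C_5\, P(F^c)}{x}\, \sqrt{\log n}. \tag{3.9}
\]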

From (3.1) it follows that for some $C_6 = C_6(\varepsilon, x)$, if $M = C_6 N (\log n)^{1/(2\mu)}$, then the second term on the right-hand side of (3.9) is less than $\varepsilon$. With this choice of $M$, we have
\[
  P(\tau_{I_n} F^c) \le 5\varepsilon, \tag{3.10}
\]
for any fixed $N$, provided $n$ is large enough. The bound (3.10) shows that up to a small additive error we can write (3.2) in the form
\[
  P(\tau_{I_n} E) \approx P(\tau_{I_n} E,\ \tau_{I_n} F) = E\left[\sum_{v \in B(n)} \frac{I[\tau_v E,\ v \in S,\ \tau_v F]}{|S_n|}\right]. \tag{3.11}
\]
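The choice of $M$ leading to (3.10) can be checked directly. The following is only a bookkeeping sketch that combines (3.1) with the second term of (3.9), taken here to be $C_5 P(F^c) \sqrt{\log n}/x$. The constant $C$ is the one from (3.1); the explicit lower bound on $C_6$ is our own illustration and not a formula from the text.

```latex
\begin{align*}
P(F^c) &\le C\left(\frac{N}{M}\right)^{\mu}
        = C\,C_6^{-\mu}\,(\log n)^{-1/2}
        && \text{by (3.1) with } M = C_6 N (\log n)^{1/(2\mu)},\\
\frac{C_5\,P(F^c)}{x}\,\sqrt{\log n}
       &\le \frac{C\,C_5}{x}\,C_6^{-\mu}
        \le \varepsilon
        && \text{as soon as } C_6 \ge \left(\frac{C\,C_5}{x\,\varepsilon}\right)^{1/\mu}.
\end{align*}
```

The factor $(\log n)^{1/(2\mu)}$ in $M$ is thus exactly what is needed to cancel the $\sqrt{\log n}$ produced by the sum over dyadic scales, and the resulting $C_6$ depends on $\varepsilon$ and $x$ but not on $n$.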

The next step is to use the disjoint decomposition $F = \cup_D F(D)$ to write the expectation in (3.11) as a sum over $D$. There is an additional technicality. Later we need that $v$ is typically sufficiently far away from the origin, so we use (3.4) again, to restrict the sum in (3.11) to $v \in \mathrm{An}(n, \sqrt{n})$. Equations (3.10), (3.4) and (2.21) yield
\begin{align*}
  P(\tau_{I_n} E)
  &\le 8\varepsilon + P\bigl(\tau_{I_n} E,\ \tau_{I_n} F,\ \{I_n \notin B(\sqrt{n})\},\ H_{k_n - 1}\bigr) \\
  &= 8\varepsilon + \sum_{v \in \mathrm{An}(n, \sqrt{n})} \sum_{D} E\left(\frac{I[\tau_v E,\ v \in S,\ \tau_v F(D)]}{|S_n|};\ H_{k_n - 1}\right) \\
  &\le 8\varepsilon + P(\tau_{I_n} E). \tag{3.12}
\end{align*}

Here the second sum is over all circuits $D$ in $\mathrm{An}(M, N)$. For decoupling we want to replace $|S_n|$ by the quantity $W_n(\tau_v D) = |\mathrm{ext}(\tau_v D) \cap S_n|$, where $\mathrm{ext}(\tau_v D)$ denotes the graph exterior to $\tau_v D$ (the edges and vertices of $\tau_v D$ belong to $\mathrm{ext}(\tau_v D)$). We denote by $\mathrm{int}(\tau_v D)$ the interior of $\tau_v D$. We have $|\mathrm{int}(\tau_v D)| \le (2M + 1)^2$. From the choice of $N$ at the end of the proof it will be clear that $M^2 = o(n)$, which implies that for large $n$ we have the (deterministic) inequalities:
\[
  W_n(\tau_v D) \le |S_n| \le (1 + \varepsilon)\, W_n(\tau_v D).
\]
Hence, denoting the value of the expectation in (3.12) by $E(v, D, n)$, we have
\[
  E(v, D, n) \le E\left(\frac{I[\tau_v E,\ v \in S,\ \tau_v F(D)]}{W_n(\tau_v D)};\ H_{k_n - 1}\right) \le (1 + \varepsilon)\, E(v, D, n). \tag{3.13}
\]
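The deterministic inequalities used to obtain (3.13) can be verified as follows. This is a sketch under two assumptions that are not spelled out above: that $|S_n|$ counts vertices, and that $|S_n| \ge n$, which would follow because the invasion cluster is an infinite connected set containing the origin and so must contain a path from 0 to the boundary of $B(n)$.

```latex
\begin{align*}
W_n(\tau_v D) \le |S_n|
  &\le W_n(\tau_v D) + |\mathrm{int}(\tau_v D)|
   \le W_n(\tau_v D) + (2M+1)^2,\\
W_n(\tau_v D)
  &\ge |S_n| - (2M+1)^2 \ge n - (2M+1)^2,\\
|S_n|
  &\le W_n(\tau_v D)\left(1 + \frac{(2M+1)^2}{n-(2M+1)^2}\right)
   \le (1+\varepsilon)\,W_n(\tau_v D)
   \quad\text{once } (2M+1)^2 \le \frac{\varepsilon\,n}{1+\varepsilon}.
\end{align*}
```

Since $M^2 = o(n)$, the last condition holds for all sufficiently large $n$, uniformly in $v$ and $D$.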

We continue by showing that on the event $\tau_v F(D)$, invasion inside $\tau_v D$ can be decoupled from invasion outside, and it can be approximated by critical percolation. Let us write the configuration $\omega(\cdot) \in [0, 1]^{\mathbb{E}^2}$ as $\omega = \eta \oplus \xi$, where $\xi$ is the configuration in $\mathrm{int}(\tau_v D)$ and $\eta$ is the configuration in $\mathrm{ext}(\tau_v D)$. (In particular, the states of the edges of $\tau_v D$ are represented by $\eta$.) We want to rewrite the expectation in the middle of (3.13) by first conditioning on $\eta$. We claim that

(i) the event $\tau_v F(D) \cap H_{k_n - 1}$ only depends on $\eta$, given that $n$ is large enough and $v \in \mathrm{An}(n, \sqrt{n})$;

(ii) the random variable $I[\tau_v F(D)]\,(W_n(\tau_v D))^{-1}$ only depends on $\eta$, given that $n$ is large enough and $v \in \mathrm{An}(n, \sqrt{n})$.

To prove (i) first note that the event $\tau_v F(D)$ only depends on $\eta$. Moreover, we show that if $\tau_v F(D)$ occurs, then (at least for $n$ large) the occurrence of $H_{k_n - 1}$ is equivalent to the occurrence of
\[
  \widetilde{H} = \left\{
  \begin{array}{l}
    \text{there is a } p_{k_n - 1}\text{-open circuit } E \text{ in } A_{k_n - 2}, \\
    \text{and } E \overset{p_{k_n - 1}}{\longleftrightarrow} \infty \text{ outside } \mathrm{int}(\tau_v D)
  \end{array}
  \right\}. \tag{3.14}
\]
Assume that $\tau_v F(D)$ occurs. Then the occurrence of $\widetilde{H}$ implies the occurrence of $H_{k_n - 1}$. For the converse we need to show that if $H_{k_n - 1}$ occurs, then the infinite path in its definition can be chosen to lie outside $\tau_v D$. For this fix a $p_{k_n - 1}$-open circuit $E$ in $A_{k_n - 2}$ whose existence is implied by $H_{k_n - 1}$. For $n$ large, the interior of $\tau_v D$ is disjoint from $B(2^{k_n - 2})$, and hence $E$ lies in $\mathrm{ext}(\tau_v D)$. Now let $\rho$ be a $p_{k_n - 1}$-open path connecting $E$ to infinity; this path starts in $\mathrm{ext}(\tau_v D)$. Since $p_{k_n - 1} > p_c$, the edges of $\tau_v D$ are $p_{k_n - 1}$-open. Therefore, if some pieces of $\rho$ happen to be inside $\mathrm{int}(\tau_v D)$, we can replace them by arcs of $\tau_v D$, and still have a $p_{k_n - 1}$-open path. This shows that $\widetilde{H}$ occurs, and thus (i) is established.

To establish (ii) we first note that since $v \in B(\sqrt{n})^c$, for large $n$ we have $0 \notin \mathrm{int}(\tau_v D)$. Statement (ii) now follows from Lemma 1 below. Since the validity of the lemma is intuitively clear, we defer its proof to the end of this section.

Lemma 1. Let $e_1, e_2, \ldots$ be the history of the invasion process, i.e., the sequence of edges invaded. Let $E$ be a circuit for which $0 \notin \mathrm{int}\, E$. Let $\omega = \eta \oplus \xi$, where $\eta$ is the configuration in $\mathrm{ext}\, E$, and $\xi$ is the configuration in $\mathrm{int}\, E$. Given that $E$ is $p_c$-open, the set
\[
  \widehat{H} = \{e_i : i \ge 1\} \cap \mathrm{ext}\, E \tag{3.15}
\]
only depends on $\eta$.

Conditioning on $\eta$ and using statements (i) and (ii) we can rewrite the middle expression in (3.13) as
\[
  E\left(E\left(\left.\frac{I[\tau_v E,\ v \in S]\, I[\tau_v F(D)]\, I[H_{k_n - 1}]}{W_n(\tau_v D)}\ \right|\ \eta\right)\right)
  = E\left(\frac{I[\tau_v F(D),\ H_{k_n - 1}]}{W_n(\tau_v D)}\, P(\tau_v E,\ v \in S \mid \eta)\right). \tag{3.16}
\]

We claim that
\[
  I[v \in S] = I[v \in S]\, I[\tau_v D \subset S] \quad \text{a.s. on } \tau_v F(D). \tag{3.17}
\]
To see this note that $0 \notin \mathrm{int}(\tau_v D)$ implies that if $v \in S$ then the invasion has to cross $\tau_v D$. By (B) of Sect. 1.4 we have
\[
  I[\tau_v D \text{ is reached}] = I[\tau_v D \subset S] \quad \text{almost surely on } \tau_v F(D). \tag{3.18}
\]

This proves (3.17). From (3.18) it is apparent that the event $\{\tau_v D \subset S\} \cap \tau_v F(D)$ only depends on $\eta$. This fact and (3.17) allow us to write the right-hand side of (3.16) in the form
\[
  E\left(\frac{I[\tau_v F(D),\ H_{k_n - 1},\ \tau_v D \subset S]}{W_n(\tau_v D)}\, P(\tau_v E,\ v \in S \mid \eta)\right). \tag{3.19}
\]
The rest of the proof is concerned with analyzing $P(\tau_v E,\ v \in S \mid \eta)$. The idea is that given the event $\tau_v D \subset S$, the condition $v \in S$ can be replaced by $v \overset{p_c}{\longleftrightarrow} \tau_v D$. Recall that $\xi$ denotes the configuration in $\mathrm{int}(\tau_v D)$, and let
\[
  Q = Q(\tau_v D, n) = \{\text{there is no edge } e \in \mathrm{int}(\tau_v D) \text{ for which } \xi(e) \in [p_c, p_{k_n - 1}]\}.
\]
In order to bound the probability of $Q^c$, we estimate the number of edges $e$ in the interior of $\tau_v D$ for which $\xi(e) \in [p_c, p_{k_n - 1}]$. It is known that $\theta(p)$ grows at least linearly for $p > p_c$ near $p_c$ [12, Theorem 5.8]. Therefore (2.11) and (2.15) imply
\begin{align*}
  p_{k_n - 1} - p_c
  &\le C_7 \bigl(\theta(p_{k_n - 1}) - \theta(p_c)\bigr) = C_7\, \theta(p_{k_n - 1}) \le C_8\, \pi\bigl(p_c, L(p_{k_n - 1})\bigr) \\
  &\le C_8\, \pi\left(p_c, \frac{2^{k_n - 1}}{D\, C_9 (k_n - 1)}\right), \tag{3.20}
\end{align*}
where $C_9$ denotes the constant $C_3$ from (2.15). Here $2^{k_n - 1}/(k_n - 1) \asymp \sqrt{n}/(\log n)$. It is known [16, Lem. 8.5], that there are constants $C_{10}$ and $\zeta > 0$, such that
\[
  \pi(p_c, m) \le C_{10}\, m^{-\zeta}, \quad \text{for } m \ge 1. \tag{3.21}
\]
From (3.21) and (3.20) it follows that
\[
  p_{k_n - 1} - p_c \le C_{11}\, \frac{(\log n)^{\zeta}}{n^{\zeta/2}}. \tag{3.22}
\]
Recall that the circuit $\tau_v D$ lies in the annulus $B(M, v) \setminus B(N, v)$. The relation between $M$ and $N$ is $M = C_6 N (\log n)^{1/(2\mu)}$, for some constant $\mu$, and $C_6$ only depending on $\varepsilon$, not on $n$. The number of edges in $\mathrm{int}(\tau_v D)$ is at most $O(M^2) = O(N^2 (\log n)^{1/\mu})$. Hence
\begin{align*}
  P(Q^c \mid \eta)
  &\le E\bigl(|\{e \in \mathrm{int}(\tau_v D) : \xi(e) \in [p_c, p_{k_n - 1}]\}|\bigr) \le C_{12}\, M^2 (p_{k_n - 1} - p_c) \\
  &\le C_{12}\, C_{11}\, C_6^2\, N^2\, \frac{(\log n)^{\zeta + 1/\mu}}{n^{\zeta/2}} \\
  &\le C_{13}(N) \cdot n^{-\zeta/4}. \tag{3.23}
\end{align*}
For arbitrary fixed $N$ this bound is uniform in $D$.
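The exponent bookkeeping behind (3.22) and the last step of (3.23) can be sketched as follows. The constant $c_0 > 0$ is introduced here only for illustration: it stands for the implicit constant in the comparison of $2^{k_n-1}/(D C_9 (k_n - 1))$ with $\sqrt{n}/\log n$. The sketch also uses that $\pi(p_c, m)$ is non-increasing in $m$.

```latex
\begin{align*}
\pi\!\left(p_c,\ c_0\,\frac{\sqrt{n}}{\log n}\right)
  &\le C_{10}\left(c_0\,\frac{\sqrt{n}}{\log n}\right)^{-\zeta}
   = C_{10}\,c_0^{-\zeta}\,\frac{(\log n)^{\zeta}}{n^{\zeta/2}}
  && \text{by (3.21)},\\
\frac{(\log n)^{\zeta+1/\mu}}{n^{\zeta/2}}
  &= n^{-\zeta/4}\cdot\frac{(\log n)^{\zeta+1/\mu}}{n^{\zeta/4}}
   \le n^{-\zeta/4}
  && \text{for all large } n.
\end{align*}
```

The second line holds because any fixed power of $\log n$ is eventually dominated by $n^{\zeta/4}$; all remaining constants, including the factor $N^2$, are absorbed into $C_{13}(N)$.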


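Recall that our aim is to analyze $P(\tau_v E,\ v \in S \mid \eta)$ inside the expression in (3.19). Fix an $\eta$ such that
\[
  \tau_v F(D),\ H_{k_n - 1} \text{ and } \{\tau_v D \subset S\} \text{ occur.} \tag{3.24}
\]
Then on the event $Q$ exactly those edges of $\mathrm{int}(\tau_v D)$ will be invaded that are $p_c$-open, and have a $p_c$-open connection to $\tau_v D$. This means that
\[
  I[Q]\, I[\tau_v E,\ v \in S] = I[Q]\, I\bigl[\tau_v E,\ v \overset{p_c}{\longleftrightarrow} \tau_v D\bigr], \tag{3.25}
\]
for any $\eta$ satisfying (3.24). From (3.25) we have
\begin{align*}
  P(\tau_v E,\ v \in S \mid \eta) - P(Q^c \mid \eta)
  &\le P\bigl(\tau_v E,\ v \overset{p_c}{\longleftrightarrow} \tau_v D \bigm| \eta\bigr) \\
  &= P\bigl(E,\ 0 \overset{p_c}{\longleftrightarrow} D \bigm| \tau_{-v}\eta\bigr) \\
  &= P\bigl(E,\ 0 \overset{p_c}{\longleftrightarrow} D\bigr) \\
  &\le P(\tau_v E,\ v \in S \mid \eta) + P(Q^c \mid \eta), \tag{3.26}
\end{align*}
where at the second equality we have used that the event $\{E,\ 0 \overset{p_c}{\longleftrightarrow} D\}$ is independent of $\tau_{-v}\eta$. Now we are in a position to apply (1.5) with a small modification. We claim that
\[
  \lim_{\substack{N \to \infty \\ D \text{ surrounds } B(N)}} P\bigl(E \bigm| 0 \overset{p_c}{\longleftrightarrow} D\bigr) = \nu(E). \tag{3.27}
\]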

In order to prove this, note that by the remark after Theorem 3 in [17], the conclusion of (1.5) holds even in the case when $B(n)$ is replaced by an arbitrary increasing sequence of sets whose union is $\mathbb{Z}^2$. Given any sequence of circuits for which the limit in (3.27) is different from $\nu(E)$, we get a contradiction. Hence for $N$ large enough we have
\[
  (1 - \varepsilon)\, P\bigl(E,\ 0 \overset{p_c}{\longleftrightarrow} D\bigr) \le \nu(E)\, P\bigl(0 \overset{p_c}{\longleftrightarrow} D\bigr) \le (1 + \varepsilon)\, P\bigl(E,\ 0 \overset{p_c}{\longleftrightarrow} D\bigr). \tag{3.28}
\]
We note that (3.28) also holds when $\nu(E) = 0$. In fact, by RSW considerations $P(E,\ 0 \overset{p_c}{\longleftrightarrow} \partial B(N)) = 0$ for all large enough $N$ when $\nu(E) = 0$. We fix $N$ such that (3.28) is satisfied. The last step is to show that $P(Q^c \mid \eta)$ is an error term in (3.26). Recalling that $D$ lies inside $B(M)$ and that $M = C_6 N (\log n)^{1/(2\mu)}$, by (2.12) we have
\[
  P\bigl(0 \overset{p_c}{\longleftrightarrow} D\bigr) \ge \pi_M \ge \frac{D_1}{\sqrt{C_6 N}\, (\log n)^{1/(4\mu)}}.
\]
Hence recalling (3.23) we conclude that for $n$ large enough we have
\[
  P(Q^c \mid \eta) \le \varepsilon\, \nu(E)\, P\bigl(0 \overset{p_c}{\longleftrightarrow} D\bigr). \tag{3.29}
\]
This bound fails when $\nu(E) = 0$. However, it is simple to adapt what follows to this situation. By (3.26), (3.28) and (3.29) we have
\[
  \frac{1}{1 + 2\varepsilon}\, P(\tau_v E,\ v \in S \mid \eta) \le \nu(E)\, P\bigl(0 \overset{p_c}{\longleftrightarrow} D\bigr) \le \frac{1}{1 - 2\varepsilon}\, P(\tau_v E,\ v \in S \mid \eta). \tag{3.30}
\]

We can replace $E$ by the sure event in the arguments above. Then (3.30) and its version for the sure event imply that for $n$ large enough we have
\[
  \frac{1 - 2\varepsilon}{1 + 2\varepsilon}\, \nu(E)\, P(v \in S \mid \eta) \le P(\tau_v E,\ v \in S \mid \eta) \le \frac{1 + 2\varepsilon}{1 - 2\varepsilon}\, \nu(E)\, P(v \in S \mid \eta). \tag{3.31}
\]
Similarly, the inequalities (3.12) and (3.13) hold with $E$ replaced by the sure event. We plug the estimate of (3.31) back into the expression in (3.19). Recall that (3.19) also equals the middle expression in (3.13). Then it is a simple matter to deduce from (3.12), (3.13), their versions for the sure event and (3.31) that for all large $n$ we have
\begin{align*}
  P(\tau_{I_n} E) &\le 8\varepsilon + \frac{(1 + 2\varepsilon)(1 + \varepsilon)}{1 - 2\varepsilon}\, \nu(E), \\
  P(\tau_{I_n} E) &\ge -8\varepsilon + \frac{1 - 2\varepsilon}{(1 + 2\varepsilon)(1 + \varepsilon)}\, \nu(E). \tag{3.32}
\end{align*}

Since $\varepsilon$ was arbitrary, this implies the first statement of Theorem 4.

To conclude we describe the modifications necessary for the second statement. Recall that $T_v E_K$ is the event that the edges in the set $\tau_v K$ are invaded, and $\tau_v E_K$ is the event that these edges belong to the $p_c$-open cluster of $v$. It is not hard to check that the manipulations leading to (3.12), (3.13), (3.16) and (3.19) are still valid when $\tau_{I_n} E$ is replaced by $T_{I_n} E_K$, and $\tau_v E$ is replaced by $T_v E_K$. The first difference arises when we approximate the invasion by critical percolation on the event $Q$. This time (3.25) is replaced by
\[
  I[Q]\, I[T_v E_K,\ v \in S] = I[Q]\, I\bigl[\tau_v E_K,\ v \overset{p_c}{\longleftrightarrow} \tau_v D\bigr], \tag{3.33}
\]
for any $\eta$ satisfying (3.24). This holds for the same reason as (3.25), namely that on $Q$ exactly those edges of $\mathrm{int}(\tau_v D)$ will be invaded that have a $p_c$-open connection to $D$.

In the case when $K$ is connected and contains 0, the event $E_K$ is a cylinder event. In this case, Kesten's theorem applies, and (3.28) holds with $E$ replaced by $E_K$. It is not hard to see that in this case the rest of the argument applies without change to show that $\lim P(T_{I_n} E_K) = \nu(E_K)$. We have to work a little bit more for a general $K$ by approximating $E_K$ by cylinder events.

Denote the event $\{0 \overset{p_c}{\longleftrightarrow} D\}$ by $A_D$. Then it is enough to show that we still have
\[
  \lim_{\substack{N \to \infty \\ D \text{ surrounds } B(N)}} P(E_K \mid A_D) = \nu(E_K). \tag{3.34}
\]
Fix $\ell$ with the property $K \subset B(\ell)$. For $m > \ell$ let $C_m(0)$ be the $p_c$-open cluster of 0 inside $B(m)$. We approximate $E_K$ by the event $E_{K,m} = \{K \subset C_m(0)\} \subset E_K$. On the event $(E_K \setminus E_{K,m}) \cap A_D$ the event $A_D$ and the event $O_m = \{B(\ell) \overset{p_c}{\longleftrightarrow} \partial B(m)\}$ occur disjointly. By the BK inequality [3] we have
\[
  P(E_{K,m},\ A_D) \le P(E_K,\ A_D) \le P(E_{K,m},\ A_D) + P(O_m)\, P(A_D). \tag{3.35}
\]
We claim that the right-hand side is $(1 + o(1))\, P(E_{K,m},\ A_D)$ as $m \to \infty$. This follows from three facts: $P(O_m) \to 0$ as $m \to \infty$, $P(E_{K,m},\ A_D) \ge P(E_{K,m})\, P(A_D)$ (by the FKG inequality), and $P(E_{K,m}) \ge P(E_{K,\ell}) > 0$. Since $E_{K,m}$ is a cylinder event, the conclusion of (3.34) holds for $E_{K,m}$. Putting this together with (3.35) and the fact that $\lim_{m \to \infty} \nu(E_{K,m}) = \nu(E_K)$ we get (3.34).
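As a concrete companion to the invasion history $e_1, e_2, \ldots$ that appears in Lemma 1, the following is a minimal simulation sketch of bond invasion percolation on $\mathbb{Z}^2$ started at the origin. It assumes i.i.d. uniform edge weights $\omega(e)$ that are generated lazily, and at each step it invades the edge of smallest weight among the edges not yet invaded that touch the current cluster. The function name `invade` and its parameters are illustrative choices and do not come from the paper.

```python
import heapq
import random


def invade(n_steps, seed=0):
    """Run n_steps of bond invasion percolation on Z^2 from the origin.

    Returns (history, cluster, weights):
      history -- list of (edge, weight) in order of invasion,
      cluster -- set of vertices reached,
      weights -- dict of all edge weights sampled so far.
    """
    rng = random.Random(seed)
    weights = {}

    def weight(edge):
        # Sample omega(edge) ~ Uniform[0, 1] the first time the edge is seen.
        if edge not in weights:
            weights[edge] = rng.random()
        return weights[edge]

    def edges_at(v):
        # The four lattice edges at v, stored as ordered vertex pairs.
        x, y = v
        for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            yield (min(u, v), max(u, v))

    origin = (0, 0)
    cluster = {origin}
    invaded = set()
    # The heap holds the edge boundary of the current invaded graph
    # (possibly with stale duplicates, which are skipped when popped).
    boundary = [(weight(e), e) for e in edges_at(origin)]
    heapq.heapify(boundary)
    history = []

    while len(history) < n_steps:
        w, e = heapq.heappop(boundary)
        if e in invaded:
            continue
        invaded.add(e)
        history.append((e, w))
        for v in e:
            if v not in cluster:
                cluster.add(v)
                for f in edges_at(v):
                    if f not in invaded:
                        heapq.heappush(boundary, (weight(f), f))

    return history, cluster, weights


if __name__ == "__main__":
    history, cluster, _ = invade(2000, seed=1)
    late = [w for _, w in history[1000:]]
    print("vertices reached:", len(cluster))
    print("largest weight among late invaded edges:", max(late))
```

Because weights are sampled only when an edge first enters the boundary, the sketch never touches edges far from the cluster; this mirrors the observation in the proof of Lemma 1 that the first phase of the invasion does not look at any $\omega$-values inside the circuit.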

Proof (Lemma 1). Recall that $G_i$ ($i \ge 0$) denotes the invasion cluster at time $i$. We separate three phases in the invasion process. The first phase starts at time 0. Let $R_1$ be the hitting time of the circuit $E$, i.e., the first time that an edge with an end-vertex on $E$ is invaded. It may happen that the invasion never reaches $E$; in this case we let $R_1 = \infty$, and there are no other phases. Otherwise the second phase starts at time $R_1 + 1$. Let $R_2$ be the time when all edges with a $p_c$-open connection to $G_{R_1}$, that are themselves $p_c$-open, have been invaded. We have $R_2 < \infty$ almost surely on the event $\{R_1 < \infty\}$. Since $G_{R_1}$ contains a vertex of $E$, the following holds:
\[
  \text{almost surely on } \{E \text{ is } p_c\text{-open}\} \cap \{R_1 < \infty\} \text{ all edges of the circuit } E \text{ are invaded during the second phase.} \tag{3.36}
\]
The third phase is the rest of the process.

The set of edges $H_1$ that are invaded during the first phase only depends on $\eta$, since we do not look at any $\omega$-values inside $E$. In the case $R_1 = \infty$ we are done, since $\widehat{H} = H_1$. During the second phase all invaded edges have $\omega$-value less than $p_c$, and the set of edges invaded is
\[
  H_2 \overset{\mathrm{def}}{=} \bigl\{e \in E(G_{R_1})^c : \omega(e) < p_c,\ e \overset{p_c}{\longleftrightarrow} G_{R_1}\bigr\}.
\]
We show that the set $\widehat{H}_2 \overset{\mathrm{def}}{=} H_2 \cap \mathrm{ext}\, E$ only depends on $\eta$. Since $E$ is $p_c$-open, any $e \in \widehat{H}_2$ in fact has a $p_c$-open connection to $G_{R_1}$ outside $\mathrm{int}\, E$. (See the discussion following (3.14).) Therefore,
\[
  \widehat{H}_2 = \bigl\{e \in E(G_{R_1})^c \cap \mathrm{ext}\, E : \omega(e) < p_c,\ e \overset{p_c}{\longleftrightarrow} G_{R_1} \text{ outside } \mathrm{int}\, E\bigr\}.
\]

Since $G_{R_1}$ only depends on $\eta$, this shows that $\widehat{H}_2$ only depends on $\eta$, as well.

To discuss the third phase, we need to introduce some more notation. Let $\mathcal{T}$ be the set of times when an edge in $\mathrm{int}\, E$ is invaded during the third phase. For any graph $G$ let $\widehat{\Delta} G = \Delta G \cap E(\mathrm{ext}\, E)$. Also, define the graph $\widehat{G}_i$ by $E(\widehat{G}_i) = E(G_i) \cap E(\mathrm{ext}\, E)$. As a result of the previous paragraph, we have:
\[
  \text{the graph } \widehat{G}_{R_2} \text{ only depends on } \eta. \tag{3.37}
\]
Consider the step at some time $i > R_2$, and first assume that $i \notin \mathcal{T}$. Then, by the definition of the invasion process, $e_i$ minimizes $\omega$ on $\Delta G_{i-1}$; furthermore, since $e_i \in \mathrm{ext}\, E$, $e_i$ minimizes $\omega$ on $\widehat{\Delta} G_{i-1}$. We noted before, that by the end of the second phase all edges of $E$ are invaded, hence the edges of $E$ do not belong to $\Delta G_{i-1}$. This implies that $\widehat{\Delta} G_{i-1} = \widehat{\Delta} \widehat{G}_{i-1}$. Thus
\[
  \text{for } i > R_2 \text{ and } i \notin \mathcal{T} \text{ we have } E(\widehat{G}_i) = E(\widehat{G}_{i-1}) \cup \{f\}, \text{ where } f \text{ minimizes } \omega \text{ on the set } \widehat{\Delta} \widehat{G}_{i-1}. \tag{3.38}
\]
On the other hand
\[
  \text{for } i > R_2 \text{ and } i \in \mathcal{T} \text{ we have } \widehat{G}_i = \widehat{G}_{i-1}. \tag{3.39}
\]
From (3.38) and (3.39) we see that whenever the set $\widehat{G}_i$ changes, it changes in a fashion determined only by $\eta$. Using (3.37), (3.38) and (3.39) we get by induction that the sequence
\[
  \{\widehat{G}_i : i \ge R_2,\ i \notin \mathcal{T}\}
\]
only depends on $\eta$. In particular, the set
\[
  \widehat{H}_3 \overset{\mathrm{def}}{=} \{e_i : i > R_2,\ i \notin \mathcal{T}\} = \{\text{edges outside } \mathrm{int}\, E \text{ that are invaded in the third phase}\}
\]
only depends on $\eta$. Since $\widehat{H} = H_1 \cup \widehat{H}_2 \cup \widehat{H}_3$, this shows that $\widehat{H}$ only depends on $\eta$, and the proof of the lemma is complete.

3.3. Single site case. In this section we indicate the necessary changes for the proof of Theorem 3.

Proof (Theorem 3). Let $n = |v|$ and $2^K \le n < 2^{K+1}$. Other notation will have the same meaning as in the proof of Theorem 4. Let $\varepsilon > 0$ be given. By the FKG inequality and (2.24) we have
\begin{align*}
  P(\tau_v F^c,\ v \in S)
  &\le P(H_K^c) + P\bigl(\tau_v F^c,\ v \overset{p_K}{\longleftrightarrow} \infty\bigr) \\
  &\le P(H_K^c) + P(F^c)\, \theta(p_K) \\
  &\le P(H_K^c) + C_1\, \pi_n \sqrt{K}\, P(F^c). \tag{3.40}
\end{align*}
By choosing the constant $C_3$ in the definition of $p_K$ large, we can achieve
\[
  \frac{P(H_K^c)}{\pi_n} \to 0, \quad \text{as } n \to \infty. \tag{3.41}
\]

As in the argument preceding (3.10), we can find constants $\mu > 0$ and $C_3 = C_3(\varepsilon)$, such that if $M = C_3 N (\log n)^{1/(2\mu)}$, then for large $n$ we have
\[
  \text{RHS of (3.40)} \le \varepsilon\, \pi_n. \tag{3.42}
\]
On the other hand, by (B) of Sect. 1.4, the FKG inequality, the RSW Lemma, and (2.12), we have
\begin{align*}
  P(v \in S)
  &\ge P\bigl(\text{there is a } p_c\text{-open circuit in } A_{K+2} \text{ and } v \overset{p_c}{\longleftrightarrow} \partial B(2^{K+2})\bigr) \\
  &\ge C_4\, P\bigl(v \overset{p_c}{\longleftrightarrow} \partial B(2^{K+2})\bigr) \ge C_5\, \pi_n. \tag{3.43}
\end{align*}
This implies that
\[
  P(\tau_v F^c \mid v \in S) \le \frac{\varepsilon}{C_5}. \tag{3.44}
\]
By (3.41) and (3.43) we also obtain
\[
  P(H_K^c \mid v \in S) \to 0, \quad \text{as } n \to \infty. \tag{3.45}
\]
From (3.44) and (3.45) it follows that
\[
  P(\tau_v E \mid v \in S) - \frac{2\varepsilon}{C_5} \le \sum_{D} P(\tau_v E,\ H_K,\ \tau_v F(D) \mid v \in S) \le P(\tau_v E \mid v \in S). \tag{3.46}
\]
By conditioning on the configuration $\eta$ in $\mathrm{ext}(\tau_v D)$, and using Lemma 1, we can rewrite the summand in (3.46) as
\[
  \frac{1}{P(v \in S)}\, E\bigl(I[\tau_v F(D),\ H_K,\ \tau_v D \subset S]\, P(\tau_v E,\ v \in S \mid \eta)\bigr).
\]
The quantity $P(\tau_v E,\ v \in S \mid \eta)$ can be analyzed by the method of Theorem 4. The rest of the proof is analogous to the random site case.

Acknowledgements. I thank Harry Kesten for many valuable suggestions and comments. I thank the referee for lots of useful suggestions on how to improve the presentation. A conversation with György Elekes about the log∗ function stimulated me to close a final gap in the proof of Theorem 1. The research was partially supported by an NSF grant to Cornell University, NSERC of Canada, and the Pacific Institute for the Mathematical Sciences. The revised version of the manuscript was prepared during a visit to the Isaac Newton Institute in Cambridge.

References
1. Aizenman, M., Barsky, D.J.: Sharpness of the phase transition in percolation models. Commun. Math. Phys. 108, 489–526 (1987)
2. Aizenman, M., Chayes, J.T., Chayes, L., Fröhlich, J., Russo, L.: On a sharp transition from area law to perimeter law in a system of random surfaces. Commun. Math. Phys. 100, 19–69 (1983)
3. van den Berg, J., Kesten, H.: Inequalities with applications to percolation and reliability. J. Appl. Prob. 22, 556–569 (1985)
4. Borgs, C., Chayes, J.T., Kesten, H., Spencer, J.: The birth of the infinite cluster: Finite-size scaling in percolation. Commun. Math. Phys. 224, 153–204 (2001)
5. Chandler, R., Koplick, J., Lerman, K., Willemsen, J.F.: Capillary displacement and percolation in porous media. J. Fluid Mech. 119, 249–267 (1982)
6. Chayes, J., Chayes, L.: Percolation and random media. In: Critical Phenomena, Random System and Gauge Theories, Les Houches Session. K. Osterwalder and R. Stora (eds.), XLIII 1984, Amsterdam, North-Holland, 1986, pp. 1000–1142
7. Chayes, J.T., Chayes, L., Durrett, R.: Inhomogeneous percolation problems and incipient infinite clusters. J. Phys. A. 20, 1521–1530 (1987)
8. Chayes, J.T., Chayes, L., Fröhlich, J.: The low-temperature behavior of disordered magnets. Commun. Math. Phys. 100, 399–437 (1985)
9. Chayes, J.T., Chayes, L., Newman, C.M.: Stochastic geometry of invasion percolation. Commun. Math. Phys. 101, 383–407 (1985)
10. Chayes, J.T., Chayes, L., Newman, C.M.: Bernoulli percolation above threshold: An invasion percolation analysis. Ann. Prob. 15, 1272–1287 (1987)
11. De Gennes, P.G., Guyon, E.: Lois generales pour l'injection d'un fluide dans un milieu poreux aleatoire. J. de Mécanique 17, 403–432 (1978)
12. Grimmett, G.R.: Percolation. 2nd edition, Grundlehren der mathematischen Wissenschaften 321. Berlin: Springer, 1999
13. Grimmett, G.R., Marstrand, J.M.: The supercritical phase of percolation is well behaved. Proc. Roy. Soc. London Ser. A 430, 439–457 (1990)
14. Hara, T., Slade, G.: The incipient infinite cluster in high-dimensional percolation. Electron. Res. Announc. Amer. Math. Soc. 4, 48–55 (1998) http://www.ams.org/era
15. Járai, A.A.: Incipient infinite percolation clusters in 2D. To appear in Annals of Probability
16. Kesten, H.: Percolation theory for mathematicians. Boston: Birkhäuser, 1982
17. Kesten, H.: The incipient infinite cluster in two-dimensional percolation. Prob. Th. Rel. Fields 73, 369–394 (1986)
18. Kesten, H.: Scaling relations for 2D-percolation. Commun. Math. Phys. 109, 109–156 (1987)
19. Lawler, G.F., Schramm, O., Werner, W.: One-arm exponent for critical 2D percolation. Electron. J. Probab. 7, 13 (2002) (electronic)
20. Lenormand, R., Bories, S.: Description d'un mécanisme de connexion de liaison destiné a l'étude du drainage avec piégeage en milieu poreux. Comptes Rendus des Séances de l'Acad. des Scis. 291 Série B, 279–282 (1980)
21. Menshikov, M.V.: Coincidence of critical points in percolation problems. Sov. Math. Dokl. 33, 856–859 (1986)
22. Menshikov, M.V., Molchanov, S.A., Sidorenko, A.F.: Percolation theory and some applications. Itogi Nauki i Techniki (Series of Probability Theory, Mathematical Statistics, Theoretical Cybernetics) 24, 53–110 (1986)
23. Nguyen, B.: Typical cluster size for two-dimensional percolation processes. J. Stat. Phys. 50, 715–726 (1988)
24. Russo, L.: A note on percolation. Z. Wahr. Verw. Geb. 43, 39–48 (1978)
25. Seymour, P.D., Welsh, D.J.A.: Percolation Probabilities on the Square Lattice. In: Advances in Graph Theory. B. Bollobás (ed.) Annals of Discrete Mathematics 3, Amsterdam: North-Holland, 1978, pp. 227–245
26. Smirnov, S.: Critical percolation in the plane: Conformal invariance, Cardy's formula, scaling limits. C. R. Acad. Sci. Paris Sér. I Math. 333, 239–244 (2001)
27. Smirnov, S., Werner, W.: Critical exponents for two-dimensional percolation. Math. Res. Lett. 8, 729–744 (2001)
28. Wilkinson, D., Willemsen, J.F.: Invasion percolation: A new form of percolation theory. J. Phys. A. 16, 3365–3376 (1983)
29. Zhang, Y.: The fractal volume of the two-dimensional invasion percolation cluster. Commun. Math. Phys. 167, 237–254 (1995)

Communicated by M. Aizenman

Commun. Math. Phys. 236, 335–372 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0820-x

Communications in

Mathematical Physics

Metastable States in Parametrically Excited Multimode Hamiltonian Systems

E. Kirr1, M.I. Weinstein2

1 Department of Mathematics, University of Chicago, Chicago, IL 60637, USA
2 Mathematical Sciences Research, Bell Laboratories – Lucent Technologies, Murray Hill, NJ 07974, USA

Received: 13 March 2002 / Accepted: 2 January 2003 Published online: 14 April 2003 – © Springer-Verlag 2003

Abstract: Consider a linear autonomous Hamiltonian system with m time periodic bound state solutions. In this paper we study their dynamics under time almost periodic perturbations which are small, localized and Hamiltonian. The analysis proceeds through a reduction of the original infinite dimensional dynamical system to the dynamics of two coupled subsystems: a dominant m-dimensional system of ordinary differential equations (normal form), governing the projections onto the bound states and an infinite dimensional dispersive wave equation. The present work generalizes previous work of the authors, where the case of a single bound state is considered. Here, the interaction picture is considerably more complicated and requires deeper analysis, due to a multiplicity of bound states and the very general nature of the perturbation’s time dependence. Parametric forcing induces coupling of bound states to continuum radiation modes, of bound states directly to bound states, as well as coupling among bound states, which is mediated by continuum modes. Our analysis elucidates these interactions and we prove the metastability (long life time) and eventual decay of bound states for a large class of systems. The key hypotheses for the analysis are: appropriate local energy decay estimates for the unperturbed evolution operator, restricted to the continuous spectral part of the Hamiltonian, and a matrix Fermi Golden rule condition, which ensures coupling of bound states to continuum modes. Problems of the type considered arise in many areas of application including ionization physics, quantum molecular theory and the propagation of light in optical fibers in the presence of defects.

1. Introduction

1.1. Overview. Consider the autonomous Hamiltonian system:
\[
  i\partial_t \phi = H_0 \phi, \tag{1}
\]
where $H_0$ is a self-adjoint operator on a Hilbert space $\mathcal{H}$. Assume that $H_0$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ with a complete set of eigenvectors $\psi_1, \psi_2, \ldots, \psi_m \in \mathcal{H}$. Hence
\[
  e^{-i\lambda_j t} \psi_j, \quad j = 1, \ldots, m \tag{2}
\]
are time-periodic bound state solutions of the dynamical system (1). The general solution of the initial value problem for (1) splits into a noninteracting superposition of states of type (2) and radiation modes. The purpose of this paper is to study the energy exchange among bound states and continuum modes when the system is perturbed by a small time dependent Hamiltonian:
\[
  i\partial_t \phi = (H_0 + \varepsilon W(t))\, \phi, \quad |\varepsilon| \text{ small.} \tag{3}
\]

Our results concern perturbations that are almost periodic in time (they may have infinitely many frequencies, which need not be commensurate; see [2, 13] or [12, Sect. 9]) of Hamiltonian systems with any finite number of bound states. We prove that the bound states of the unperturbed problem are long lived (metastable) but eventually decay to zero as $t \to \infty$ due to coupling to radiation modes. We give a detailed picture of this process on large intermediate and infinite time scales.

There are many areas in physics, chemistry and engineering in which models like Eq. (3) are used. We mention here ionization phenomena caused by general time-varying fields [3, 7, 14], the quantum theory of molecules (see [8] and references therein) and the propagation of light in optical waveguides in the presence of defects [15, 16]. In the latter application, Eq. (3) arises in the paraxial approximation. In this approximation, Maxwell's equation for the electromagnetic field reduces to a system of the form (3) where $t$ plays the role of the coordinate along the waveguide, in the direction of propagation.

Due to its wide range of applicability, Eq. (3) has been extensively studied. Previous rigorous results have focused on the cases where (i) the perturbation is time-periodic [8, 16, 24, 27–29] or (ii) the unperturbed equation has a single bound state [12, 20]. In [4, 5] an analysis of certain one-dimensional exactly solvable models has been carried out without the requirement that $\varepsilon$ be small.

In many real physical models the time dependence of the perturbation may consist of discrete (CW) and continuous (time-localized [23]) spectral components. Here, we consider a very general class of time-dependent perturbations with discrete spectral components. Examples include superpositions of electromagnetic fields in the ionization problem, collisions among molecules [8, 26] and the distribution of defects along the length of a waveguide [15].

Moreover, in many applications the unperturbed dynamics supports multiple bound states, e.g. heavy atoms, double well potentials in molecules, multimode and/or multicore optical waveguides. In some particularly interesting cases, the eigenvalues of the unperturbed problem may be nearly degenerate, as in the case of double wells (with large barrier or large separation), or degenerate, as in the case of higher order modes in a waveguide with symmetry. In this case, the dynamics of energy exchange among the modes, and therefore the time-evolution of solutions of the Schrödinger equation, depends on parameter regimes defined in terms of perturbation size and the eigenvalue spacings. A particular example of this type is related to modeling the localization of symmetric molecules induced by collisions, a phenomenon observed in the ammonia molecule NH3. See [8, 26] and other references cited therein.

Our work applies to the interesting model considered in [8] by Grecchi and Sacchetti, where the character of the system is studied on an intermediate time scale, $\tau_{\mathrm{beat}}$; see the next subsection. Our analysis extends theirs considerably. For a discussion of our results applied to the model in [8] see Sect. 4. Briefly, our results apply to cases where the eigenvalue splitting is arbitrarily small. We can therefore treat all scenarios raised in [8]; in particular, the case of localization. Furthermore, our results are more general in that they apply to a large class of almost periodic perturbations on both infinite and intermediate time scales. The next subsection presents the results for a simplified example and outlines the mathematics behind them.

1.2. Outline of the results. To describe our general results and methods consider the Schrödinger operator with a double well potential:
\[
  H_0 = -\partial_x^2 + V(x) = -\partial_x^2 + V_0(x - L/2) + V_0(x + L/2), \quad x \in \mathbb{R},
\]
constructed from the single well potential $V_0(x) = -\kappa \chi(\{|x| \le 1\})$. Here, $\chi(S)$ denotes the characteristic function of the set $S$; see the figure below.

[Figure: the double well potential $V(x)$ as a function of $x$, with two wells of depth $\kappa$ whose centers are a distance $L$ apart.]

The theory of double well potentials (see for example [9]) implies the following. For a fixed $\kappa$ sufficiently small and for all $L$ sufficiently large:

(1) $H_0$ is a self-adjoint operator on $L^2(\mathbb{R})$ and has exactly two simple eigenvalues, $\lambda_1 < \lambda_2 < 0$, with corresponding orthonormal eigenvectors $\psi_1, \psi_2$. The rest of the spectrum consists of the nonnegative real axis and it is absolutely continuous. Let $P_c$ denote the projection operator associated with the continuous part of the spectrum.

Consequently the unperturbed time dependent Schrödinger equation (1) has two bound state solutions, which are time-periodic and localized in $x$:
\[
  e^{-i\lambda_1 t} \psi_1 \quad \text{and} \quad e^{-i\lambda_2 t} \psi_2.
\]
Let us point out that the frequency difference between the two bound states, the "eigenvalue splitting" for $H_0$, decays exponentially with the distance between the wells $L$, see for example [9]:
\[
  \delta_* \equiv \delta_*(L) = |\lambda_2 - \lambda_1| \sim e^{-cL} \tag{4}
\]
for some positive constant $c$. Consider now the perturbed problem (3) where, for simplicity, we choose a perturbation with only one frequency:

(2) Let $\varepsilon W(t, x) = \varepsilon \cos(\mu t)\, \beta(x)$, where $\beta(x)$ is a real-valued and rapidly decaying function of $x$ as $|x| \to \infty$ and $\mu$ is sufficiently large so that $\lambda_1 + \mu > 0$ (and therefore $\lambda_2 + \mu > 0$ as well).


We study the effect of the perturbation by projecting the solution onto the bound states and continuum modes of the unperturbed problem, i.e. we write the solution φ(t) of (3) in the form: φ(t) = a1 (t)ψ1 + a2 (t)ψ2 + φd (t, x), φd ≡ Pc φ = A1 (t) e−iλ1 t ψ1 (x) + A2 (t) e−iλ2 t ψ2 (x) + φd (t, x), ψj , φd (t) = 0, j = 1, 2.

(5)

Here $A_1(t)$ and $A_2(t)$ are the slowly varying amplitudes of the bound states, which in the unperturbed case, $\varepsilon = 0$, are constant. Our theory explains two types of behavior of the time-dependent Schrödinger equation (3), which can be classified in terms of whether the eigenvalue splitting of $H_0$, $\delta_*$, is large (Case I) or small (Case II) relative to the perturbation size, $\varepsilon$. Alternatively, we can express these results in terms of naturally entering time scales. To see this, consider the unperturbed equation, $\varepsilon = 0$. Initial data consisting of a general nontrivial superposition (mixed state) of ground and excited states will evolve in this two-dimensional subspace and will exhibit a periodic beating on the time scale
\[ \tau_{\rm beat} \equiv \frac{4\pi}{|\lambda_1 - \lambda_2|} = \frac{4\pi}{\delta_*}. \tag{6} \]

Now consider the perturbed dynamics, $\varepsilon \ne 0$. Focusing on a single bound state $\psi_j$ with energy $\lambda_j$, we expect from previous work [20, 12] that if $\mu + \lambda_j > 0$ then on a time scale
\[ \tau_{\rm rad\,damp} \sim O\!\left(\frac{1}{\varepsilon^2}\right) \tag{7} \]
components of the solution in the direction $\psi_j$ will decay due to radiation damping. In terms of these time scales, the two types of behavior are characterized according to whether $\tau_{\rm beat}$ is much smaller than $\tau_{\rm rad\,damp}$ (Case I) or $\tau_{\rm beat}$ is comparable to or larger than $\tau_{\rm rad\,damp}$ (Case II). For Cases I and II we prove that the bound state amplitude vector in (5),
\[ a(t) = \begin{pmatrix} a_1(t) \\ a_2(t) \end{pmatrix}, \]
satisfies
\[ a(t) = e^{t\left(-i\,{\rm diag}[\lambda_1,\lambda_2] + \varepsilon^2[-\Gamma^\# + i\Lambda^\#]\right)} a(0) + R(t), \]
where $\Gamma^\# \ge 0$ and $\Lambda^\#$ are self-adjoint constant matrices, and $R(t)$ satisfies an appropriate error bound for times of order $\varepsilon^{-2}$ and a decay estimate for $|t| \to \infty$. The particular $\Gamma^\#$ and $\Lambda^\#$, as well as the estimates on $R(t)$, differ in Cases I and II. The non-negative matrix $\Gamma^\#$ is the analogue of the Fermi golden rule [3]. It is generically positive definite and governs the radiation damping on the time scale $\varepsilon^{-2}$.
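The comparison of time scales above can be sketched numerically. The following is a minimal illustration, not part of the analysis: it takes $\tau_{\rm beat} = 4\pi/\delta_*$ from (6) and uses $1/\varepsilon^2$ as a stand-in for the order of magnitude in (7). The function name `classify_regime`, the cutoff `ratio_threshold` for "much smaller", and the choice $c = 1$ in $\delta_* = e^{-cL}$ are all assumptions made for the example.

```python
import math


def classify_regime(delta_star: float, eps: float, ratio_threshold: float = 10.0) -> str:
    """Return "I" if tau_beat is much smaller than tau_rad_damp, else "II".

    tau_beat = 4*pi/delta_star follows (6); tau_rad = 1/eps**2 is only the
    order of magnitude from (7). `ratio_threshold` is a hypothetical cutoff
    that makes "much smaller" concrete.
    """
    tau_beat = 4.0 * math.pi / delta_star
    tau_rad = 1.0 / eps**2
    return "I" if tau_beat * ratio_threshold <= tau_rad else "II"


# Splitting delta_star = exp(-c*L) as in (4), with c = 1 assumed for illustration.
eps = 1e-2
for L in (2.0, 6.0, 12.0):
    print(L, classify_regime(math.exp(-L), eps))
```

As the well separation grows at fixed $\varepsilon$, the splitting shrinks and the classification moves from Case I to Case II.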

Case I. Large splitting ($\tau_{\rm rad\,damp} \gg \tau_{\rm beat}$). The first result, a consequence of Theorem 3.1, assumes that $\delta_* \gg \varepsilon^2$ and says that on time scales of order $1/\varepsilon^2$ the coupling between the amplitudes $A_1(t)$ and $A_2(t)$ is negligible. Each amplitude decays exponentially due to its resonant interaction with the continuum modes; see Fig. 1. On time scales much larger than $\varepsilon^{-2}$ the amplitudes decay to zero algebraically. The special case of Theorem 3.1 which applies here is the following:

Metastable States in Parametrically Excited Multimode Hamiltonian Systems

[Figure: spectral diagram with axis labels $\lambda_1$, $\lambda_2$, $0$, $\lambda_1 + \mu$, $\lambda_2 + \mu$, $\infty$]

Fig. 1. Interaction picture for large eigenvalue splitting

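Theorem 1.1 ($\tau_{\rm rad\,damp} \gg \tau_{\rm beat}$). Consider the system (3) such that (1) and (2) are satisfied. There exists $\varepsilon_0 > 0$ such that if
\[ |\varepsilon| + \frac{\varepsilon^2}{\delta_*} \le \varepsilon_0, \tag{8} \]
then the bound state amplitude vector in (5) satisfies
\[ a(t) = e^{t\left(-i\,{\rm diag}[\lambda_1,\lambda_2] + \varepsilon^2[-\Gamma^\# + i\Lambda^\#]\right)} a(0) + R(t), \tag{9} \]
where $\Gamma^\# \ge 0$ and $\Lambda^\#$ are real constant diagonal matrices given by¹
\[ \Gamma^\# \equiv \frac{\pi}{4}\begin{pmatrix} \langle P_c\beta\psi_1, \delta(H_0 - \lambda_1 - \mu)P_c\beta\psi_1\rangle & 0 \\ 0 & \langle P_c\beta\psi_2, \delta(H_0 - \lambda_2 - \mu)P_c\beta\psi_2\rangle \end{pmatrix}, \tag{10} \]
\[ \Lambda^\# = \frac14\begin{pmatrix} \langle P_c\beta\psi_1, {\rm P.V.}(H_0 - \lambda_1 - \mu)^{-1}P_c\beta\psi_1\rangle & 0 \\ 0 & \langle P_c\beta\psi_2, {\rm P.V.}(H_0 - \lambda_2 - \mu)^{-1}P_c\beta\psi_2\rangle \end{pmatrix} \]
\[ \qquad + \frac14\begin{pmatrix} \langle P_c\beta\psi_1, (H_0 - \lambda_1 + \mu)^{-1}P_c\beta\psi_1\rangle & 0 \\ 0 & \langle P_c\beta\psi_2, (H_0 - \lambda_2 + \mu)^{-1}P_c\beta\psi_2\rangle \end{pmatrix} + \frac12\,\frac{\delta_*}{\delta_*^2 - \mu^2}\,|\langle\psi_2, \beta\psi_1\rangle|^2 \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{11} \]
Furthermore, for any $T > 0$, $R(t)$ satisfies
\[ \sup_{0 \le t \le T/\varepsilon^2} |R(t)| = C_T\left(|\varepsilon| + \frac{\varepsilon^2}{\delta_*}\right), \tag{12} \]
and the wave part of the solution, $\phi_d(t,x)$, in (5) can be written as the unperturbed wave plus a small correction:
\[ \phi_d(t,x) = e^{-iH_0 t}P_c\phi(0,x) + \tilde\phi_d(t,x), \qquad \|\tilde\phi_d(t,\cdot)\|_{L^2_{\rm loc}} \le C|\varepsilon| \]
for some constant $C$ and any $t > 0$. With the exception of the non-generic case in which one or both of the diagonal terms in $\Gamma^\#$ are zero, we also have:
\[ |R(t)| = O(\langle t\rangle^{-3/2}) \ \text{for } t \to \infty, \qquad \|\phi(t,\cdot)\|_{L^2_{\rm loc}} = O(\langle t\rangle^{-3/2}) \ \text{for } t \to \infty. \tag{13} \]
Here $\|\cdot\|_{L^2_{\rm loc}}$ denotes a weighted $L^2(\mathbb{R})$ norm with the weight function decaying as $|x| \to \infty$; see hypothesis (H3) in Sect. 2.

¹ See Sect. 1.5 for comments on the operator $\delta(H_0 - \xi)$. The quantity $\langle P_c\beta\psi, \delta(H_0 - \xi)P_c\beta\psi\rangle$ can be viewed as $|\mathcal{F}[\beta\psi](\xi)|^2$, where $\mathcal{F}$ denotes the “Fourier Transform” with respect to the continuous spectral part of $H_0$.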
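To make the content of (9) concrete, here is a small numerical sketch of its leading term when the matrices are diagonal, so that each amplitude evolves independently. The numerical values of the eigenvalues, of the diagonal entries of the damping and frequency-shift matrices, and of $\varepsilon$ are hypothetical placeholders, not quantities computed from any potential; the remainder $R(t)$ is ignored.

```python
import numpy as np


def leading_order_amplitudes(t, lam, gamma, lam_shift, eps, a0):
    """Leading term of (9) for diagonal matrices.

    a_j(t) = exp(t * (-i*lam_j + eps^2 * (-gamma_j + i*lam_shift_j))) * a_j(0),
    where gamma_j and lam_shift_j stand for the diagonal entries of the
    damping matrix (10) and the frequency-shift matrix (11).
    """
    exponent = -1j * lam + eps**2 * (-gamma + 1j * lam_shift)
    return np.exp(exponent * t) * a0


# Hypothetical illustrative values.
lam = np.array([-1.0, -0.5])
gamma = np.array([0.3, 0.7])
lam_shift = np.array([0.1, -0.2])
eps = 0.05
a0 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)

t_rad = 1.0 / eps**2  # radiation-damping time scale
a_t = leading_order_amplitudes(t_rad, lam, gamma, lam_shift, eps, a0)
print(np.abs(a_t) / np.abs(a0))  # each modulus has decayed by exp(-gamma_j)
```

In this sketch the moduli decay at the rates $\varepsilon^2\gamma_j$ with no exchange between the two modes, which is the uncoupled picture of Case I.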


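Note that (9) and (12) imply that the mode amplitudes are very weakly coupled on time scales of order $O(\varepsilon^{-2})$, and that on this time scale the mode amplitude $|a_j|$ decays exponentially with approximately the rate $\varepsilon^2\Gamma^\#_{jj}$. The frequency of the mode amplitude $a_j(t)$ is basically the unperturbed one, $-\lambda_j$, plus a small correction given by $\varepsilon^2\Lambda^\#_{jj}$.

Case II. Small eigenvalue splitting ($\tau_{\rm rad\,damp} \sim \tau_{\rm beat}$ or $\tau_{\rm rad\,damp} \ll \tau_{\rm beat}$). The bounds on the correction $R$ in the previous result break down for $\varepsilon^2\delta_*^{-1}$ large. Our second result, a consequence of Theorem 3.2, is valid for $\delta_* \le \varepsilon$ arbitrarily small. In contrast to Case I, the amplitudes $A_1(t)$ and $A_2(t)$ are strongly coupled on the time scale of interest, $O(\varepsilon^{-2})$. This is due to the fact that the continuum modes with which each bound state resonates have, in this case, approximately equal frequencies; see Fig. 2. As in Case I, the amplitudes decay exponentially on the time scale $O(\varepsilon^{-2})$. However, in this case we have non-diagonal corrections to $\Gamma^\#$ (a non-diagonal normal form), which influence the rate of decay. On longer time scales the amplitudes decay to zero algebraically. Let
\[ \bar\lambda = (\lambda_1 + \lambda_2)/2. \tag{14} \]
The special case of Theorem 3.2 which applies here is the following:

Theorem 1.2 ($\tau_{\rm rad\,damp} < \tau_{\rm beat}$). Consider the system (3) and assume (1) and (2) are satisfied and $\delta_* \le C\varepsilon$ for some constant $C$. Then there exists $\varepsilon_0 > 0$ such that for $\varepsilon \le \varepsilon_0$ the bound state amplitude vector, $a(t)$, is given by
\[ a(t) = e^{t\left(-i\,{\rm diag}[\lambda_1,\lambda_2] + \varepsilon^2[-\Gamma^\# + i\Lambda^\#]\right)} a(0) + R(t), \tag{15} \]
where $\Gamma^\# \ge 0$ and $\Lambda^\#$ are now non-diagonal self-adjoint constant matrices given by
\[ \Gamma^\# \equiv \frac{\pi}{4}\begin{pmatrix} \langle P_c\beta\psi_1, \delta(H_0 - \bar\lambda - \mu)P_c\beta\psi_1\rangle & \langle P_c\beta\psi_1, \delta(H_0 - \bar\lambda - \mu)P_c\beta\psi_2\rangle \\ \langle P_c\beta\psi_2, \delta(H_0 - \bar\lambda - \mu)P_c\beta\psi_1\rangle & \langle P_c\beta\psi_2, \delta(H_0 - \bar\lambda - \mu)P_c\beta\psi_2\rangle \end{pmatrix}, \tag{16} \]
\[ \Lambda^\# \equiv \frac14\begin{pmatrix} \langle P_c\beta\psi_1, {\rm P.V.}(H_0 - \bar\lambda - \mu)^{-1}P_c\beta\psi_1\rangle & \langle P_c\beta\psi_1, {\rm P.V.}(H_0 - \bar\lambda - \mu)^{-1}P_c\beta\psi_2\rangle \\ \langle P_c\beta\psi_2, {\rm P.V.}(H_0 - \bar\lambda - \mu)^{-1}P_c\beta\psi_1\rangle & \langle P_c\beta\psi_2, {\rm P.V.}(H_0 - \bar\lambda - \mu)^{-1}P_c\beta\psi_2\rangle \end{pmatrix} \]
\[ \qquad + \frac14\begin{pmatrix} \langle P_c\beta\psi_1, (H_0 - \bar\lambda + \mu)^{-1}P_c\beta\psi_1\rangle & \langle P_c\beta\psi_1, (H_0 - \bar\lambda + \mu)^{-1}P_c\beta\psi_2\rangle \\ \langle P_c\beta\psi_2, (H_0 - \bar\lambda + \mu)^{-1}P_c\beta\psi_1\rangle & \langle P_c\beta\psi_2, (H_0 - \bar\lambda + \mu)^{-1}P_c\beta\psi_2\rangle \end{pmatrix}, \qquad \bar\lambda \equiv \frac{\lambda_1 + \lambda_2}{2}. \tag{17} \]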

[Figure: spectral diagram with axis labels $\lambda_1$, $\lambda_2$, $\varepsilon$, $0$, $\lambda_1 + \mu \approx \lambda_2 + \mu$, $\infty$]

Fig. 2. Interaction picture for small eigenvalue splitting

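Furthermore, for any $T > 0$, $R(t)$ satisfies the estimate (independent of $\delta_*$):
\[ \sup_{0 \le t \le T/\varepsilon^2} |R(t)| = C_T|\varepsilon|, \tag{18} \]
and the wave part of the solution, $\phi_d(t,x)$, can be written as the unperturbed wave plus a small correction:
\[ \phi_d(t,x) = e^{-iH_0 t}P_c\phi(0,x) + \tilde\phi_d(t,x), \qquad \|\tilde\phi_d(t,\cdot)\|_{L^2_{\rm loc}} \le C|\varepsilon| \]
for some constant $C$ and any $t > 0$. If in addition $\Gamma^\# \ge \theta_0 > 0$, where $\theta_0$ is a constant which is independent of $\varepsilon$ and $\delta_*$, then
\[ |R(t)| = O(\langle t\rangle^{-3/2}) \ \text{for } t \to \infty, \qquad \|\phi(t,\cdot)\|_{L^2_{\rm loc}} = O(\langle t\rangle^{-3/2}) \ \text{for } t \to \infty. \tag{19} \]
Here $\|\cdot\|_{L^2_{\rm loc}}$ is as in Theorem 1.1.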
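The effect of a non-diagonal damping matrix in (15) can be explored with a toy computation. The sketch below makes strong simplifying assumptions that are not taken from the theorem: the frequency-shift matrix is dropped, and all four entries of the damping matrix (16) are set equal to a single hypothetical number `g`, which corresponds to $P_c\beta\psi_1 = P_c\beta\psi_2$. The splitting `delta` is chosen much smaller than `eps**2`.

```python
import numpy as np


def case2_generator(lam1, lam2, g, eps):
    """Exponent matrix of (15) with the frequency shift dropped and a
    rank-one damping matrix g * [[1, 1], [1, 1]] (both are assumptions)."""
    damping = g * np.ones((2, 2))
    return -1j * np.diag([lam1, lam2]) - eps**2 * damping


# Hypothetical values: mean energy, splitting, damping strength, perturbation size.
lam_bar, delta, g, eps = -1.0, 1e-5, 0.5, 1e-2
M = case2_generator(lam_bar - delta / 2, lam_bar + delta / 2, g, eps)

eigvals, eigvecs = np.linalg.eig(M)
rates = -eigvals.real                 # decay rates of the two normal modes
order = np.argsort(rates)             # slow mode first
slow_rate, fast_rate = rates[order]
slow_vec = eigvecs[:, order[0]]

# Overlap of the slowly decaying mode with the antisymmetric combination (1, -1)/sqrt(2).
antisym = np.array([1.0, -1.0]) / np.sqrt(2.0)
overlap = abs(np.vdot(antisym, slow_vec))
print(slow_rate, fast_rate, overlap)
```

Under these assumptions the characteristic equation of the 2 × 2 matrix gives a fast rate close to $2g\varepsilon^2$ and a slow rate close to $\delta^2/(8g\varepsilon^2)$, so one combination of the bound states is damped on the $\varepsilon^{-2}$ time scale while the other survives much longer.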

Remark 1.1. In contrast to Theorem 1.1, the additional coupling in this regime is manifested in the off-diagonal terms of $\Gamma^\#$ and $\Lambda^\#$. The term corresponding to the last term in (11) is omitted, since it is of higher order. It is now part of $R(t)$. The off-diagonal terms show that the bound states $\psi_1, \psi_2$ of the unperturbed problem no longer form the right basis for describing the evolution. Instead one should use a pair of linear combinations of $\psi_1, \psi_2$ to diagonalize, or at least obtain an upper triangular matrix as, the exponent in (15). In Sect. 4 we show that for a particular example the right basis is formed by $\psi_1 + \psi_2$ and $\psi_1 - \psi_2$, which are localized in the left and in the right well respectively. Moreover, for the example considered, the perturbation is localized in the left well and we find that the amplitude of $\psi_1 + \psi_2$ decays on a much shorter time scale than the amplitude of $\psi_1 - \psi_2$. Hence the system “localizes” in the right well.

1.3. Outline of the analysis. In this subsection we outline the mathematics behind these theorems. Consider the solution, $\phi(t)$, of (3) in the form (5). A careful expansion and analysis to second order in the perturbation $\varepsilon W(t)$, which is presented in Appendix 6 and constitutes an extension of the one in [12, 20], reveals the following system for $A(t) = (A_1(t), A_2(t))^T$ and $\phi_d(t)$:
\[ \partial_t A(t) = \left(-\varepsilon^2\Gamma + i\varepsilon\eta(t) + i\varepsilon^2\Lambda + \varepsilon^2\rho(t)\right)A(t) + E(t; A(t), \phi_d(t)), \tag{20} \]
\[ i\partial_t\phi_d(t,x) = H_0\phi_d(t,x) + P_c F(t,x; A(t), \phi_d(t)). \tag{21} \]
Here
\[ P_c f \equiv f - \langle\psi_1, f\rangle\psi_1 - \langle\psi_2, f\rangle\psi_2 \]
defines the projection onto the continuous spectral part of $H_0$. The term $E(t)$ can be neglected on times up to order $1/\varepsilon^2$, while on larger time scales both $E(t; A, \phi_d)$ and $F(t,x; A, \phi_d)$ tend to zero provided $A$ and the “local energy” of $\phi_d$ do so. The matrices $\Gamma$ and $\Lambda$ are diagonal with real constant coefficients given by (10) and by (11) without the last term, respectively. $\eta(t)$ and $\rho(t)$ have quasiperiodic coefficients (almost periodic in general) with frequencies $\mu$, $\lambda_2 - \lambda_1$, $\lambda_2 - \lambda_1 \pm \mu$ and $\lambda_2 - \lambda_1 \pm 2\mu$.
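The projection $P_c f = f - \langle\psi_1, f\rangle\psi_1 - \langle\psi_2, f\rangle\psi_2$ has a direct finite-dimensional analogue, sketched below. Vectors in $\mathbb{C}^6$ stand in for functions, two randomly generated orthonormal vectors stand in for the bound states, and the helper name `project_continuous` is hypothetical. The inner product is taken conjugate-linear in its first argument.

```python
import numpy as np


def project_continuous(f, bound_states):
    """Remove from f its components along the orthonormal rows of bound_states.

    Finite-dimensional stand-in for P_c f = f - sum_j <psi_j, f> psi_j.
    """
    coeffs = bound_states.conj() @ f       # coeffs[j] = <psi_j, f>
    return f - bound_states.T @ coeffs     # subtract sum_j coeffs[j] * psi_j


rng = np.random.default_rng(0)
# Two orthonormal "bound states" in C^6, obtained from a QR factorization.
Q, _ = np.linalg.qr(rng.normal(size=(6, 2)) + 1j * rng.normal(size=(6, 2)))
psi = Q.T                                  # rows are psi_1, psi_2

f = rng.normal(size=6) + 1j * rng.normal(size=6)
Pf = project_continuous(f, psi)

print(np.abs(psi.conj() @ Pf))             # components along psi_1, psi_2 vanish
```

Applying the projection twice gives the same result as applying it once, and the projected vector is orthogonal to both stand-in bound states, mirroring the orthogonality condition in (5).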


An important step in our analysis is assessing the effect of the oscillatory terms $\eta(t)$ and $\rho(t)$ in (20). We construct a near-identity change of variables:
\[ a(t) \to [I + M_\varepsilon(t)]\,b(t), \tag{22} \]
where $M_\varepsilon(t)$ is almost periodic (thus uniformly bounded in $t$) with $M_\varepsilon(t) = O(\varepsilon)$, which maps the bound state amplitude system to one of the form
\[ \frac{db}{dt} = \left(-i\begin{pmatrix}\lambda_1 & 0 \\ 0 & \lambda_2\end{pmatrix} - \varepsilon^2\Gamma^\# + i\varepsilon^2\Lambda^\#\right)b + \tilde R_\varepsilon(t). \tag{23} \]
The details of $\Gamma^\# = \Gamma + \dots$, $\Lambda^\# = \Lambda + \dots$, $\tilde R_\varepsilon$ and $M_\varepsilon(t)$ depend on whether one considers Case I or Case II. In Case I, $\Gamma^\#$ and $\Lambda^\#$ are diagonal and self-adjoint. In Case II, $\Gamma^\#$ and $\Lambda^\#$ are both non-diagonal and self-adjoint, with the non-diagonal part potentially having a large effect on the lifetime of localized states.

1.4. Outline of the paper. The paper is structured as follows. In Sect. 2 we give a general formulation of the problem. The hypotheses on the unperturbed Hamiltonian $H_0$ and the perturbation $W(t)$ are introduced and discussed. A general result for the case of large eigenvalue splitting $\delta_* \sim 1$, Theorem 2.1, is stated. In Sect. 3 we study the case of small eigenvalue splitting. In order to focus on how the bound states are affected by the interaction with the wave part, we rule out any direct resonance between bound states; see hypothesis (H6)’. We show that the bound states slowly decay but, in contrast to the case of large eigenvalue splitting, the rates of decay are now influenced by the relative sizes of the perturbation and the eigenvalue gap. This also determines whether the nearly degenerate bound states evolve uncoupled or coupled. The results are stated in Theorems 3.1 and 3.2, which are generalizations of Cases I and II presented in Sect. 1.2. In Sect. 4 we discuss two examples related to double well potentials. The first one has been previously considered in [8]. Based on our results, in particular Theorem 3.2, we solve their “localization” conjecture. The second example is a rather general double well problem: $H_0 = -\Delta + V(x_1 - L/2, x_2, \dots, x_n) + V(x_1 + L/2, x_2, \dots, x_n)$. For large $L$, $H_0$ has eigenvalue splitting of roughly $\exp(-cL)$, $c > 0$. We discuss our results in this context. Section 5 contains the essentials of the proofs. Much of the work lies in constructing near-identity transformations which map the system (20) to a normal form appropriate for the regimes of Cases I and II. We require, in particular, an extension of the normal form approach developed in [12, 20, 21]. For completeness we present it in Appendix 6. It is there that the interaction among bound states and the continuum modes is made explicit. The three subsections of Sect. 5 show how to further simplify the normal form by using a (non)stationary phase type computation. Their outcome is a system of ODEs having the general form (68). The long time behavior of the solution of the latter is then analyzed in Proposition 5.2. This last and rather technical result is proven in Appendix 7.

1.5. Some notation and terminology. For $z \in \mathbb{C}$, $\Re z$ and $\Im z$ denote, respectively, its real and imaginary parts. For $a = (a_1, a_2, \dots, a_m) \in \mathbb{C}^m$, $|a| = \left(\sum_{i=1}^m |a_i|^2\right)^{1/2}$ denotes its Euclidean norm. Generic constants will be denoted by $C$, $D$, etc.


$\langle x\rangle = (1 + |x|^2)^{1/2}$. $\delta_{jk} = 1$ if $j = k$ and $0$ if $j \ne k$. $\mathcal{L}(A, B)$ = the space of bounded linear operators from $A$ to $B$; $\mathcal{L}(A, A) \equiv \mathcal{L}(A)$. $M^* \in \mathcal{L}(B^*, A^*)$ denotes the adjoint operator of $M \in \mathcal{L}(A, B)$. $[M_1, M_2] = M_1M_2 - M_2M_1$ denotes the commutator of the operators $M_1, M_2 \in \mathcal{L}(A)$. For $f, g$ in a Hilbert space $\mathcal{H}$, their inner product is denoted by $\langle f, g\rangle$, and the norm of $f$ is denoted by $\|f\|$. Functions of self-adjoint operators are defined via the spectral theorem; see for example [18]. The operators containing boundary values of resolvents or singular distributions applied to self-adjoint operators are defined in [12, Sect. 8]. In particular we will frequently use the operators $\delta(H - \lambda)$ and ${\rm P.V.}(H - \lambda)^{-1}$.

2. General Formulation and Results for Large Eigenvalue Splitting

Consider the Schrödinger equation for the function of time, $\phi(t)$, with values in a complex Hilbert space $\mathcal{H}$:
\[ i\partial_t\phi(t) = (H_0 + W(t))\,\phi(t), \qquad \phi|_{t=0} = \phi(0). \tag{24} \]
Note. The perturbation of $H_0$ is written as $W(t)$ instead of $\varepsilon W(t)$, used in the introduction. We have done this to make the notation less cumbersome. The smallness condition $\varepsilon \le \varepsilon_0$ will be expressed in terms of an appropriate norm of $W$, $|||W|||$; see (30).

We now introduce some general hypotheses on the dynamical system (1).

(H1) $H_0$ is self-adjoint on the Hilbert space $\mathcal{H}$.

(H2) The spectrum of $H_0$ is assumed to consist of an absolutely continuous part, $\sigma_{\rm cont}(H_0)$, with associated spectral projection $P_c$, and isolated eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ (counting multiplicity) with an orthonormalized set of eigenvectors $\psi_1, \psi_2, \dots, \psi_m$, i.e. for $k, j = 1, \dots, m$,
\[ H_0\psi_k = \lambda_k\psi_k, \qquad \langle\psi_k, \psi_j\rangle = \delta_{kj}, \tag{25} \]
where $\delta_{kj}$ is the Kronecker delta symbol.

(H3) Local decay estimates on $e^{-iH_0 t}$: There exist self-adjoint “weights” $w_-$, $w_+$, a number $r_1 > 1$ and a constant $C$ such that
(i) $w_+$ is defined on a dense subspace of $\mathcal{H}$, on which $w_+ \ge cI$, $c > 0$;
(ii) $w_-$ is bounded, i.e. $w_- \in \mathcal{L}(\mathcal{H})$, and ${\rm Range}(w_-) \subseteq {\rm Domain}(w_+)$;
(iii) $w_+w_-P_c = P_c$ on $\mathcal{H}$ and $P_c = P_cw_-w_+$ on the domain of $w_+$;
and for all $f \in \mathcal{H}$ satisfying $w_+f \in \mathcal{H}$ we have
\[ \text{(a)}\quad \|w_-e^{-iH_0 t}P_cf\| \le C\langle t\rangle^{-r_1}\|w_+f\|, \quad \text{for } t \in \mathbb{R}; \tag{26} \]
\[ \text{(b)}\quad \|w_-e^{-iH_0 t}(H_0 - \lambda_k - \mu_j - i0)^{-1}P_cf\| \le C\langle t\rangle^{-r_1}\|w_+f\|, \quad \text{for } t \ge 0, \tag{27} \]
where $k = 1, 2, \dots, m$ while $\mu_j$, $j \in \mathbb{Z}$, are the Fourier exponents of the perturbation (see below). For $t < 0$ estimate (27) is assumed to hold with $-i0$ replaced by $+i0$.

Remark 2.1. In the case $H_0 = -\Delta + V(x)$, $x \in \mathbb{R}^n$, condition (a) is satisfied for generic potentials $V(x)$ with sufficiently rapid decay at infinity; see [10, 17, 25]. As for condition (b), we showed in [12, Sect. 3] how to obtain a constant $C$ uniform in the frequencies $\mu_j$ for generic, localized $V(x)$ and $\lambda_k + \mu_j$ bounded away from zero. In particular, both (a) and (b) are satisfied for our example in the introduction provided the zero energy resonance is excluded, i.e. $\kappa \notin \{(k\pi/2)^2, k \in \mathbb{N}\}$; see [25].


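(H4) Hypotheses on the perturbation $W(t)$: We consider time-dependent symmetric perturbations of the form²
\[ W(t) = \frac12\beta_0 + \sum_{j\in\mathbb{N}}\cos(\mu_jt)\,\beta_j, \quad \text{with } \beta_j^* = \beta_j \text{ and } \sum_{j\in\mathbb{N}_0}\|\beta_j\|_{\mathcal{L}(\mathcal{H})} < \infty. \tag{28} \]
Equivalently,
\[ W(t) = \frac12\sum_{j\in\mathbb{Z}}\exp(-i\mu_jt)\,\beta_j, \tag{29} \]
where $\mu_0 = 0$ and, for $j < 0$, $\mu_j = -\mu_{-j}$, $\beta_j = \beta_{-j}$. Thus, $W(t)$ is an almost periodic function with values in the Banach space $\mathcal{L}(\mathcal{H})$ with the Fourier exponents $\{\mu_j\}_{j\in\mathbb{Z}}$ and corresponding Fourier coefficients $\{\beta_j\}_{j\in\mathbb{Z}}$; see, for example, [13]. To measure the size of the perturbation $W$, we introduce the norm
\[ |||W||| \equiv \frac12\sum_{j\in\mathbb{Z}}\|w_+\beta_j\|_{\mathcal{L}(\mathcal{H})} + \frac12\sum_{j\in\mathbb{Z}}\|w_+\beta_jw_+\|_{\mathcal{L}(\mathcal{H})}, \tag{30} \]
which is assumed to be finite. We shall require in addition that $|||W|||$ is small.

(H5) Resonance condition, Fermi golden rule: Consider the self-adjoint complex matrix $\Gamma$ with elements
\[ \Gamma_{ij} \equiv \frac{\pi}{4}\sum_{\substack{k,n\in\mathbb{Z} \\ \lambda_i+\mu_k = \lambda_j+\mu_n \in \sigma_{\rm cont}(H_0)}}\langle P_c\beta_k\psi_i, \delta(H_0 - \lambda_j - \mu_n)P_c\beta_n\psi_j\rangle, \tag{31} \]
where $i, j \in \{1, \dots, m\}$. In Appendix 6, $\Gamma$ is defined and shown to be nonnegative definite. Let $\gamma$ denote the smallest eigenvalue of $\Gamma$. We assume that there exists a constant $\theta_0 > 0$ such that
\[ \gamma \ge \theta_0|||W|||^2. \tag{32} \]
Remark 2.2. The assumption (32) is used only to obtain infinite time scale asymptotics. Our results for times of up to order $|||W|||^{-2}$ (i.e. $\varepsilon^{-2}$ in the notation of the introduction) are valid even if $\Gamma$ has zero eigenvalues.

(H6) Control of small denominators (large spectral gap): Define the first and second order resonance sets:
\[ I^1 = \{1, \dots, m\}\times\{1, \dots, m\}\times\mathbb{Z}, \qquad I^1_{\rm res} = \{(i,j,k) \in I^1 : \lambda_i + \mu_k = \lambda_j\}, \]
\[ I^2 = \{1, \dots, m\}\times\{1, \dots, m\}\times\mathbb{Z}\times\mathbb{Z}, \qquad I^2_{\rm res} = \{(i,j,k,n) \in I^2 : \lambda_i + \mu_k = \lambda_j + \mu_n\}. \]
We assume
\[ \delta_* \equiv \min\left\{\inf_{I^1\setminus I^1_{\rm res}}|\lambda_i + \mu_k - \lambda_j|,\ \inf_{I^2\setminus I^2_{\rm res}}|\lambda_i + \mu_k - \lambda_j - \mu_n|\right\} > 0. \tag{33} \]

² $W(t) = \frac12\beta_0 + \sum_{j\in\mathbb{N}}\cos(\mu_jt + \delta_j)\,\beta_j$ can be handled as well.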
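For a finite set of eigenvalues and a perturbation with finitely many frequencies, the gap $\delta_*$ in (33) can be evaluated by brute force. The sketch below is an illustration only: the function name `spectral_gap` and the tolerance `tol` used to decide when a combination counts as resonant are assumptions, and the frequency list must contain the full symmetric set of Fourier exponents (zero and the negatives included), as in (29).

```python
from itertools import product


def spectral_gap(eigenvalues, frequencies, tol=1e-12):
    """Brute-force delta_* from (33) for finite lists.

    Combinations with absolute value <= tol are treated as resonant and
    excluded; `tol` is a hypothetical numerical cutoff.
    """
    first = [abs(li + mk - lj)
             for li, lj, mk in product(eigenvalues, eigenvalues, frequencies)]
    second = [abs(li + mk - lj - mn)
              for li, lj, mk, mn in product(eigenvalues, eigenvalues,
                                            frequencies, frequencies)]
    nonresonant = [v for v in first + second if v > tol]
    return min(nonresonant)


# Two-level example with a single driving frequency mu (values are illustrative).
eigs = (-1.0, -0.9)
gap_large_mu = spectral_gap(eigs, (-1.0, 0.0, 1.0))    # mu = 1.0
gap_small_mu = spectral_gap(eigs, (-0.15, 0.0, 0.15))  # mu = 0.15
print(gap_large_mu, gap_small_mu)
```

In the first case the gap equals the eigenvalue splitting, while in the second the combination $|\lambda_2 - \lambda_1 - \mu|$ is smaller than the splitting and determines $\delta_*$.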


Remark 2.3. Note that for the example in the introduction $\delta_* = |\lambda_2 - \lambda_1|$ provided $\mu$ is fixed and $L$ is sufficiently large such that $\mu > 2(\lambda_2 - \lambda_1) \approx 2\exp(-cL)$. Also note that (H6) is satisfied if $H_0$ is fixed and $W(t)$ is periodic or a trigonometric polynomial in $t$.

We now state the main result of this section:

Theorem 2.1 (Large spectral gap). Assume the hypotheses (H1)–(H6) hold. Then there exist $\varepsilon_0$ and constants $C_T$, $C$ such that if
\[ |||W||| + \frac{|||W|||}{\delta_*} + \frac{|||W|||}{\delta_*^2} < \varepsilon_0, \tag{34} \]
then any solution of (24) with $w_+\phi(0) \in \mathcal{H}$ satisfies
\[ \phi(t) = \sum_{j=1}^m e^{-i\lambda_jt}A_j(t)\psi_j + \phi_d(t), \qquad A = (A_1, A_2, \dots, A_m)^T, \]
\[ A(t) = e^{(-\Gamma^\# + i\Lambda^\#)t}A(0) + R(t), \qquad \phi_d(t) \equiv P_c\phi(t) = e^{-iH_0t}P_c\phi(0) + \tilde\phi_d(t). \tag{35} \]
Here $\Gamma^\# = \Gamma \ge 0$ is the self-adjoint matrix given in (H5), and $\Lambda^\# = \Lambda + \eta_1 + \eta_3$ is constant and self-adjoint and displayed in (120), (74), (82). Finally, for any fixed $T > 0$,
\[ |R(t)| \le C_T\left(|||W||| + \frac{|||W|||}{\delta_*} + \frac{|||W|||}{\delta_*^2}\right), \qquad 0 \le t \le T|||W|||^{-2}, \tag{36} \]
\[ |R(t)| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty, \qquad \|w_-\tilde\phi_d(t)\| \le C|||W||| \ \text{for } t > 0, \qquad \|w_-\phi(t)\| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty. \tag{37} \]

The theorem explicitly computes the dominant evolution for the amplitudes of the eigenvectors on times of order $|||W|||^{-2}$, (35). From an examination of the formulae for $\Gamma$ in (H5) and (120), (74), (82), we infer that coupling between the modes $\psi_i$ and $\psi_j$ ($i \ne j$) occurs on time scales of order $|||W|||^{-2}$ only if the perturbation $W$ has frequencies $\mu_k$ and $\mu_n$ such that:
\[ \lambda_i + \mu_k = \lambda_j + \mu_n. \tag{38} \]
If none of these resonance relations hold, the matrix $\Gamma$ is diagonal and $|A_j(t)| \sim e^{-\Gamma_{jj}t}$. Theorem 2.1 is an extension to the multibound state case of the result in [12] for the case where the unperturbed Hamiltonian, $H_0$, has one bound state. Theorem 2.1 does not apply if $\delta_*^2 \sim |||W|||$ or $\delta_*^2 \ll |||W|||$. An important example is the double-well problem, in which two single wells are separated by a distance $L$; see the introduction. In this case, the eigenvalues of $H_0$ occur in pairs which are exponentially close for $L$ large, and hence $\delta_*$ is not bounded below uniformly in $L$. In particular,
\[ \delta_* \le \inf\{\,|\lambda_i - \lambda_j| : \lambda_i \ne \lambda_j,\ \lambda_i, \lambda_j \in \sigma_{\rm discrete}(H_0)\,\} = O(e^{-cL}) \to 0 \quad \text{as } L \to \infty. \]
The results of the next section provide a description of the dynamics in problems with nearly degenerate eigenvalues.


3. Small Eigenvalue Splitting

In this section we study the case when the unperturbed Hamiltonian has very small eigenvalue spacings. Motivated by the case of double well potentials, whose eigenvalues come in nearly degenerate pairs in the large separation or large barrier limit (see Sect. 4), we assume in addition to (H1):

(H2)’ The spectrum of $H_0$ consists of an absolutely continuous part, $\sigma_{\rm cont}(H_0)$, with associated spectral projection $P_c$, and isolated eigenvalues which group in nearly degenerate pairs: $\lambda_1, \lambda_2$; $\lambda_3, \lambda_4$; $\dots$; $\lambda_{2N-1}, \lambda_{2N}$. Corresponding to these eigenvalues is an orthonormal set of eigenvectors $\psi_1, \psi_2, \dots, \psi_{2N}$. We assume that the distances $\delta_*^j = |\lambda_{2j-1} - \lambda_{2j}|$, where $j = 1, \dots, N$, are small compared with the size of the perturbation frequencies and the distances among the pairs of eigenvalues, in a manner which is made precise in hypothesis (H6)’ below. Let
\[ \bar\lambda_j = \frac{\lambda_{2j-1} + \lambda_{2j}}{2}, \qquad j = 1, \dots, N. \]
We found it to be much simpler to state the theorems for the case of pairs of close eigenvalues. However, these results have straightforward generalizations to clusters of more than two eigenvalues, and we will refer to them after each theorem. In order to prove our results we assume that (H3), (H4), (H5) are satisfied and replace (H6) with the following condition, which ensures that there are no significant couplings among modes corresponding to different pairs of nearly degenerate eigenvalues:

(H6)’ Assume that:
\[ \bar\lambda_i \pm \mu_k = \bar\lambda_j \ \text{ if and only if } i = j \text{ and } \mu_k = 0, \qquad \bar\lambda_i \pm \mu_k = \bar\lambda_j \pm \mu_n \ \text{ if and only if } i = j \text{ and } \mu_k = \mu_n. \]
In all the other cases we assume that there exists a constant $D > 0$ such that:
\[ |\bar\lambda_i \pm \mu_k - \bar\lambda_j| \ge D > 0, \qquad |\bar\lambda_i \pm \mu_k - \bar\lambda_j \pm \mu_n| \ge D > 0. \]
Furthermore assume that
\[ \delta_*^j \le \frac{D}{4}, \qquad j = 1, \dots, N. \]
Denote:
\[ \delta_* = \min_{j=1,\dots,N}\delta_*^j, \tag{39} \]
\[ \delta^* = \max_{j=1,\dots,N}\delta_*^j. \tag{40} \]
Recall that the formulas in Theorem 2.1 contain correction terms of size $|||W|||/\delta_*$, which may be large in the settings of this section (one can easily see that $\delta_*$ in this section coincides with the one in the previous section). However, the next result shows that under the modified hypotheses (H2)’ and (H6)’ and for mean zero perturbations (i.e. $\beta_0 \equiv 0$), the correction terms are much smaller, allowing us to infer an uncoupled evolution of all the bound states provided $\delta_* \gg |||W|||^2$:


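Theorem 3.1. Suppose that the Hamiltonian $H_0$ satisfies (H1), (H2)’ and (H3). Assume that the perturbation $W$ is chosen such that (H4), (H5) and (H6)’ hold and, in addition, that it has mean value zero, i.e. $\beta_0 = 0$. Then there exists $\varepsilon_0 > 0$ such that if
\[ |||W||| + \frac{|||W|||^2}{\delta_*} \le \varepsilon_0, \]
then any solution of (53) with $w_+\phi(0) \in \mathcal{H}$ satisfies
\[ \phi(t) = \sum_{j=1}^{2N}e^{-i\lambda_jt}A_j(t)\psi_j + \phi_d(t), \qquad A = (A_1, A_2, \dots, A_{2N})^T, \]
\[ A(t) = e^{(-\Gamma^\# + i\Lambda^\#)t}A(0) + R(t), \qquad \phi_d(t) \equiv P_c\phi(t) = e^{-iH_0t}P_c\phi(0) + \tilde\phi_d(t). \]
Here $\Gamma^\# \ge 0$ and $\Lambda^\#$ are constant, real-valued diagonal matrices given by:
\[ \Gamma^\#_{jj} = \frac{\pi}{4}\sum_{\substack{n\in\mathbb{Z} \\ \lambda_j+\mu_n\in\sigma_{\rm cont}(H_0)}}\langle P_c\beta_n\psi_j, \delta(H_0 - \lambda_j - \mu_n)P_c\beta_n\psi_j\rangle, \]
\[ \Lambda^\#_{jj} = \frac14\sum_{\substack{n\in\mathbb{Z} \\ \lambda_j+\mu_n\in\sigma_{\rm cont}(H_0)}}\langle P_c\beta_n\psi_j, {\rm P.V.}(H_0 - \lambda_j - \mu_n)^{-1}P_c\beta_n\psi_j\rangle + \frac14\sum_{\substack{n\in\mathbb{Z} \\ \lambda_j+\mu_n\notin\sigma_{\rm cont}(H_0)}}\langle P_c\beta_n\psi_j, (H_0 - \lambda_j - \mu_n)^{-1}P_c\beta_n\psi_j\rangle \]
\[ \qquad + \frac12\sum_{\substack{n\in\mathbb{N} \\ 1\le p\le 2N}}\frac{\lambda_p - \lambda_j}{(\lambda_j - \lambda_p)^2 - \mu_n^2}\,|\langle\psi_p, \beta_n\psi_j\rangle|^2, \]
where $j = 1, 2, \dots, 2N$. Finally, for any fixed $T > 0$,
\[ |R(t)| \le C_T\left(|||W||| + \frac{|||W|||^2}{\delta_*}\right), \qquad 0 \le t \le T|||W|||^{-2}, \]
and for any $t > 0$
\[ \|w_-\tilde\phi_d(t)\| \le C|||W|||. \]
If, in addition, for all $j = 1, 2, \dots, 2N$ we have $\Gamma^\#_{jj} \ge \theta_0|||W|||^2$ for some constant $\theta_0 > 0$ which is independent of $|||W|||$ and $\delta_*$, then
\[ |R(t)| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty, \qquad \|w_-\phi(t)\| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty. \]

Note that if one deals with clusters of more than two nearly degenerate eigenvalues, the above result carries over without modification. The next result is valid as $\delta_* \to 0$. The tradeoff is that it contains correction terms of size $\delta^*$; consequently it is aimed at Hamiltonians with nearly degenerate eigenvalue pairs and complements Theorem 3.1: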


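Theorem 3.2 (Arbitrarily small spectral splitting). Consider the initial value problem (53). Suppose (H1), (H2)’ and (H3) hold and the perturbation satisfies (H4) and (H6)’. Then there exists $\varepsilon_0$ such that if $\delta^* + |||W||| \le \varepsilon_0$, then any solution of (53) with $w_+\phi(0) \in \mathcal{H}$ satisfies
\[ \phi(t) = \sum_{j=1}^{2N}a_j(t)\psi_j + \phi_d(t), \qquad a = (a_1, a_2, \dots, a_{2N})^T, \]
\[ a(t) = e^{(-i\,{\rm diag}[\lambda_1,\lambda_2,\dots,\lambda_{2N}] - \Gamma^\# + i\Lambda^\#)t}a(0) + R(t), \qquad \phi_d(t) \equiv P_c\phi(t) = e^{-iH_0t}P_c\phi(0) + \tilde\phi_d(t). \]
Here
\[ \Gamma^\# = {\rm diag}\left[\Gamma_1^\#, \Gamma_2^\#, \dots, \Gamma_N^\#\right], \qquad \Lambda^\# = {\rm diag}\left[\Lambda_1^\#, \Lambda_2^\#, \dots, \Lambda_N^\#\right] \]
are constant, self-adjoint, block-diagonal matrices with each block of size $2\times 2$. Below, $\bar\lambda_j = (\lambda_{2j-1} + \lambda_{2j})/2$ denotes the mean of the $j$-th eigenvalue pair. $\Gamma^\# \ge 0$ and its blocks are given by
\[ \Gamma_j^\# = \frac{\pi}{4}\sum_{\substack{n\in\mathbb{Z} \\ \bar\lambda_j+\mu_n\in\sigma_{\rm cont}(H_0)}}\Gamma_j^n, \tag{41} \]
where
\[ \Gamma_j^n = \begin{pmatrix} \langle P_c\beta_n\psi_{2j-1}, \delta(H_0 - \bar\lambda_j - \mu_n)P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j-1}, \delta(H_0 - \bar\lambda_j - \mu_n)P_c\beta_n\psi_{2j}\rangle \\ \langle P_c\beta_n\psi_{2j}, \delta(H_0 - \bar\lambda_j - \mu_n)P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j}, \delta(H_0 - \bar\lambda_j - \mu_n)P_c\beta_n\psi_{2j}\rangle \end{pmatrix}. \tag{42} \]
The blocks which form $\Lambda^\#$ are given by
\[ \Lambda_j^\# = -\frac12\begin{pmatrix} \langle\psi_{2j-1}, \beta_0\psi_{2j-1}\rangle & \langle\psi_{2j-1}, \beta_0\psi_{2j}\rangle \\ \langle\psi_{2j}, \beta_0\psi_{2j-1}\rangle & \langle\psi_{2j}, \beta_0\psi_{2j}\rangle \end{pmatrix} + \frac14\sum_{\substack{n\in\mathbb{Z} \\ \bar\lambda_j+\mu_n\in\sigma_{\rm cont}(H_0)}}\Lambda_j^{n1} + \frac14\sum_{\substack{n\in\mathbb{Z} \\ \bar\lambda_j+\mu_n\notin\sigma_{\rm cont}(H_0)}}\Lambda_j^{n2} + \frac12\sum_{\substack{n>0 \\ 1\le p\le N}}\Lambda_j^{np} + \frac14\sum_{\substack{p\ne j \\ 1\le p\le N}}\Lambda_j^{0p}, \tag{43} \]
where
\[ \Lambda_j^{n1} = \begin{pmatrix} \langle P_c\beta_n\psi_{2j-1}, {\rm P.V.}(H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j-1}, {\rm P.V.}(H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j}\rangle \\ \langle P_c\beta_n\psi_{2j}, {\rm P.V.}(H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j}, {\rm P.V.}(H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j}\rangle \end{pmatrix}, \]
\[ \Lambda_j^{n2} = \begin{pmatrix} \langle P_c\beta_n\psi_{2j-1}, (H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j-1}, (H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j}\rangle \\ \langle P_c\beta_n\psi_{2j}, (H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j-1}\rangle & \langle P_c\beta_n\psi_{2j}, (H_0 - \bar\lambda_j - \mu_n)^{-1}P_c\beta_n\psi_{2j}\rangle \end{pmatrix}, \]
\[ \Lambda_j^{np} = \frac{\bar\lambda_p - \bar\lambda_j}{(\bar\lambda_j - \bar\lambda_p)^2 - \mu_n^2}\begin{pmatrix} |\langle\psi_{j^*}, \beta_n\psi_{p^*}\rangle|^2 + |\langle\psi_{j^*}, \beta_n\psi_{2p}\rangle|^2 & \langle\psi_{j^*}, \beta_n\psi_{p^*}\rangle\langle\psi_{p^*}, \beta_n\psi_{2j}\rangle + \langle\psi_{j^*}, \beta_n\psi_{2p}\rangle\langle\psi_{2p}, \beta_n\psi_{2j}\rangle \\ \langle\psi_{2j}, \beta_n\psi_{p^*}\rangle\langle\psi_{p^*}, \beta_n\psi_{j^*}\rangle + \langle\psi_{2j}, \beta_n\psi_{2p}\rangle\langle\psi_{2p}, \beta_n\psi_{j^*}\rangle & |\langle\psi_{2j}, \beta_n\psi_{p^*}\rangle|^2 + |\langle\psi_{2j}, \beta_n\psi_{2p}\rangle|^2 \end{pmatrix}, \]
\[ \Lambda_j^{0p} = \frac{1}{\bar\lambda_p - \bar\lambda_j}\begin{pmatrix} |\langle\psi_{j^*}, \beta_0\psi_{p^*}\rangle|^2 + |\langle\psi_{j^*}, \beta_0\psi_{2p}\rangle|^2 & \langle\psi_{j^*}, \beta_0\psi_{p^*}\rangle\langle\psi_{p^*}, \beta_0\psi_{2j}\rangle + \langle\psi_{j^*}, \beta_0\psi_{2p}\rangle\langle\psi_{2p}, \beta_0\psi_{2j}\rangle \\ \langle\psi_{2j}, \beta_0\psi_{p^*}\rangle\langle\psi_{p^*}, \beta_0\psi_{j^*}\rangle + \langle\psi_{2j}, \beta_0\psi_{2p}\rangle\langle\psi_{2p}, \beta_0\psi_{j^*}\rangle & |\langle\psi_{2j}, \beta_0\psi_{p^*}\rangle|^2 + |\langle\psi_{2j}, \beta_0\psi_{2p}\rangle|^2 \end{pmatrix}; \]
here $j^*$ and $p^*$ denote $2j - 1$ and $2p - 1$, respectively. Finally, for any fixed $T > 0$,
\[ |R(t)| \le C_T\left(|||W||| + \delta^*\right), \qquad 0 \le t \le T|||W|||^{-2}, \]
and for any $t > 0$
\[ \|w_-\tilde\phi_d(t)\| \le C|||W|||. \]
If, in addition, the smallest eigenvalue of $\Gamma^\#$, $\gamma$, satisfies $\gamma \ge \theta_0|||W|||^2$ for some constant $\theta_0 > 0$ which is independent of $|||W|||$ and $\delta^*$, then
\[ |R(t)| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty, \qquad \|w_-\phi(t)\| = O(\langle t\rangle^{-r_1}) \ \text{for } t \to \infty. \]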

In the case of a cluster of $n > 2$ eigenvalues, the corresponding $2\times 2$ block will be replaced by an $n\times n$ block on the diagonal, with elements which are straightforward generalizations of the $n = 2$ case.

4. Examples

In this section we illustrate our results with two examples.

4.1. Double well with large barrier. An interesting example, studied in [8] by Grecchi and Sacchetti, is a one-dimensional model of a double well potential with a barrier. The mathematical formulation is:
\[ i\partial_t\phi(t,x) = \left(-\partial_x^2 + V(x)\right)\phi(t,x) + \varepsilon W(t,x)\phi(t,x), \qquad \phi(t=0,x) = \phi_0(x). \tag{44} \]
The potential, $V(x)$, is given by
\[ V(x) = -b\,\delta(x + a) - b\,\delta(x - a) + \rho\,\delta(x), \tag{45} \]
where $\delta(x - \xi)$ denotes the Dirac distribution centered at $x = \xi$, and $a$, $b$ and $\rho$ are real positive parameters.


Consequently, the unperturbed Hamiltonian $H_0 \equiv -\partial_x^2 + V(x)$ is defined on the Sobolev space $H^2(\mathbb{R})$ and is self-adjoint on $L^2(\mathbb{R})$. If $ab > 1$ and $\rho \ge \rho_0 > 0$ it has exactly two eigenvalues:
\[ \lambda_1 < \lambda_2 < 0, \qquad \delta_* \equiv \lambda_2 - \lambda_1 = O(\rho^{-1}). \tag{46} \]
The ground state eigenfunction, $\psi_1(x)$, is symmetric and concentrated in a neighborhood of the interval $x \in [-a, a]$. The excited state, $\psi_2(x)$, is anti-symmetric and concentrated in a neighborhood of the interval $x \in [-a, a]$. It is approximately equal to $\psi_1(x)$ in a neighborhood of $x \in [-a, 0]$ and to $-\psi_1(x)$ in a neighborhood of $x \in [0, a]$. Moreover, the rest of the spectrum is absolutely continuous and equal to the positive real line. The spectral theory is worked out in [1]. Concerning hypothesis (H3), Grecchi and Sacchetti verify the local decay estimate (26) with $r_1 = 3/2$, $w_\pm \equiv (1 + x^2)^{\pm\sigma/2}$, $\sigma > 7/2$, and a constant which is uniform in $\rho$. By the analysis in [12, Sect. 3] we can obtain the singular decay estimate (27) from (26) provided
\[ |\lambda_{1,2} \pm \mu_j| \ge D > 0, \tag{47} \]
where $\mu_j$ are the frequencies of the perturbation $W(t,x)$. In this case the estimate will be uniform in the $\mu$'s and $\rho$. The perturbation $W(t,x)$ in [8] is periodic in time and has a finite number of frequencies:
\[ W(t,x) = \beta_0(x) + \sum_{n=1}^N\cos(\mu_nt)\,\beta_n(x), \qquad \mu_n = n\mu_1, \tag{48} \]
and $\beta_n(x)$ decay sufficiently fast as $|x| \to \infty$ such that
\[ \int_{-\infty}^{+\infty}(1 + x^2)^{2\sigma}|\beta_n(x)|^2\,dx < \infty. \tag{49} \]
We can treat more general perturbations, for example a general trigonometric polynomial in $t$ with non-commensurate frequencies, i.e. $\mu_n \ne n\mu_1$, or a time periodic perturbation having an infinite number of harmonics, i.e. $N = \infty$, and sufficiently smooth in time so that (30) holds. In both cases all our hypotheses are satisfied. If the “barrier height parameter”, $\rho$, is not very large we can apply Theorem 2.1. The correction terms are of size $O(\rho\varepsilon)$. If in addition $\beta_0 \equiv 0$, the correction terms are of size $O(\rho\varepsilon^2)$ due to Theorem 3.1. If $\rho$ is so large that the above corrections are significant, then one can apply Theorem 3.2 and obtain correction terms of size $O(\varepsilon + \rho^{-1})$, which decrease as $\rho$ increases. Consequently, this latter result is uniform in $\rho \ge \rho_0$ for some large $\rho_0$. The analysis in [8] holds for $\rho \le C\varepsilon^{-2}$, $C > 0$. We now show how our results apply to prove the “localization” phenomenon conjectured in [8]. There it is claimed that for
\[ \delta_* \sim \rho^{-1} \ll \varepsilon^2 \tag{50} \]
and for a perturbation localized in one well, the system will tend to stay in the other well. Indeed, consider
\[ W(t,x) = \cos(\mu t)\,\beta(x) \tag{51} \]
with $\beta(x)$ being the characteristic function of the interval $(-a, 0)$, i.e. localized in the left well. $\mu$ is chosen such that $\lambda_1 + \mu > 0$. The particular case of Theorem 3.2 which


applies has already been explicitly stated in Theorem 1.2. By the properties of $\beta$, $\psi_1$ and $\psi_2$ we have
\[ P_c\beta\psi_1 = P_c\beta\psi_2 + O(\delta_*). \tag{52} \]
For the formulae for $\Gamma^\#$ and $\Lambda^\#$ in Theorem 1.2, the order $\delta_*$ term above can be omitted by adding it to the correction term $R(t)$. The resulting increase in the size of the error, $O(\delta_*)$, is much smaller than the actual size of $R(t) \sim \varepsilon$; see (12) and (50). Now one can proceed to find the eigenvalues and eigenvectors of the constant matrix in the exponent of (15). It is a matter of tedious but elementary calculations to show that the eigenvectors are $(1, 1)^T$ and $(1, -1)^T$. They correspond to the following states:
\[ \frac{\psi_1 + \psi_2}{\sqrt2}, \qquad \frac{\psi_1 - \psi_2}{\sqrt2}. \]
The first state is localized in the left well while the second one is in the right well. The real part of the eigenvalue corresponding to the first state is
\[ -\varepsilon^2\langle P_c\beta\psi_1, \delta(H_0 - \bar\lambda - \mu - i0)P_c\beta\psi_1\rangle + \varepsilon^2\,O\!\left(\left(\frac{\delta_*}{\varepsilon^2}\right)^2\right) \sim \varepsilon^2, \]
with $\bar\lambda = (\lambda_1 + \lambda_2)/2$, which ensures a significant decay of this state on $\varepsilon^{-2}$ time scales. The real part of the eigenvalue corresponding to the second state is of order
\[ \varepsilon^2\,O\!\left(\left(\frac{\delta_*}{\varepsilon^2}\right)^2\right) \ll \varepsilon^2, \]
which leaves the size of the second state practically unchanged over $\varepsilon^{-2}$ time scales. As the second state is localized in the right well, we conclude that the system localizes in this well.

4.2. Double wells with large separation. In this example the equation is
\[ i\partial_t\phi(t) = (H_0 + \varepsilon W(t))\,\phi(t), \qquad \phi|_{t=0} = \phi(0), \tag{53} \]
where $H_0 \equiv -\Delta + V(x)$, $V(x) \equiv V_0(x_1 - L/2, x_2, \dots, x_n) + V_0(x_1 + L/2, x_2, \dots, x_n)$. Here $\Delta$ is the Laplacian with respect to the variables $(x_1, \dots, x_n)$, $L > 0$ is a parameter measuring the distance between the wells, and $V_0(x_1, \dots, x_n)$ is a real-valued potential defined on $\mathbb{R}^n$ with sufficiently rapid decay as $|x| \to \infty$, i.e.
\[ |V_0(x)| \le C(1 + |x|)^{-\sigma} \tag{54} \]

for a sufficiently large and positive σ , see [10, 17, 25]. We assume that the single well Hamiltonian − + V0 (x) has simple eigenvalues, λ1 , λ2 , . . . , λN < 0 and that the rest of its spectrum, [0, ∞), is absolutely continuous. It follows from [10, 17, 25] that for sufficiently large L, the unperturbed, double well Hamiltonian H0 has a spectrum consisting of the absolutely continuous part, [0, ∞), with associated continuous spectral projection Pc = Pc (L). Moreover it has pairs of simple eigenvalues λ2j −1 (L), λ2j (L), j = 1, . . . , N such that for any positive and small , as L tends to infinity, λ2j −1 (L), λ2j (L) → λj , j δ∗

≡ |λ2j −1 (L) − λ2j (L)| = O(L

see [9, relation (5.13)].

√

1−n −2L

e

(55) −λj −

);

(56)


Following the technique in [12, Sect. 3] and relying on the decay properties of $V_0$, one can verify that both the local and singular local decay estimates in (H3) hold for a fixed $L$. We choose $W(t)$ such that (H4) and (H5) hold. For small $L$ ($\delta_* \sim 1$) we assume (H6) and for large $L$ ($\delta_*$ small) we assume (H6)’. Hence Theorems 2.1, 3.1 and 3.2 hold. We note that, due to the hypotheses on the decay of $V(x)$, see [10, 17, 25], the constant in the local decay estimates (26) and (27) may grow with $L$ at some polynomial rate, $L^{\tilde\sigma}$, $\tilde\sigma > 0$. Thus, our proof gives an $\varepsilon_0$ in Theorem 3.2 which decreases with increasing $L$, e.g. $\varepsilon_0 = \varepsilon_1L^{-\tilde\sigma}$. Therefore, for large $L$ the condition for the validity of Theorem 3.2 is $\varepsilon + \delta^* \le \varepsilon_1L^{-\tilde\sigma}$. Since $\delta^* = \max_j\delta_*^j$ is exponentially small in $L$, for large enough $L$ Theorem 3.2 applies if $\varepsilon \le \varepsilon_1L^{-\tilde\sigma}$.

A result which is uniform in $L$ for all $L \ge L_0$ would hold if local decay estimates of type (26) and (27) in (H3) were known for $V(x)$ in, say, appropriate non-weighted $L^p(\mathbb{R}^n)$ spaces. This appears to be an open problem.

5. Decomposition and Normal Form

The goal of this section is to rewrite the perturbed Schrödinger equation (24) in an equivalent form in which the dominant flow of energy among bound states and radiation modes is made explicit. Initially we follow the path used in [12]. We then refine and extend the approach to obtain asymptotics which are valid uniformly in the eigenvalue splitting, $\delta_*$. We start with a review and extension of the technique in [12]. Details are provided in the Appendix of Sect. 6. Under the hypotheses (H1) and (H2) the solution of (24) can be written as:
\[ \phi(t) = \sum_{l=1}^ma_l(t)\psi_l + P_c\phi(t), \tag{57} \]
where
\[ a_l(t) = \langle\psi_l, \phi(t)\rangle, \qquad l = 1, \dots, m. \tag{58} \]
$P_c\phi(t)$ denotes the projection onto the continuous spectrum associated with $H_0$. Denote
\[ \phi_d(t) = P_c\phi(t), \tag{59} \]
\[ A_l(t) = e^{i\lambda_lt}a_l(t), \qquad l = 1, \dots, m, \tag{60} \]
and let $A(t)$ be the column vector with components $A_l(t)$. In Sect. 6 we prove the following

Proposition 5.1. The initial value problem for Eq. (24) is equivalent to the system
\[ \partial_tA(t) = [-\Gamma + i(\Lambda + \eta(t)) + \rho(t)]\,A(t) + E(t), \tag{61} \]
\[ \partial_t\phi_d(t) = -iH_0\phi_d(t) - iP_cW(t)\phi_d(t) - i\sum_{l=1}^me^{-i\lambda_lt}A_l(t)P_cW(t)\psi_l, \tag{62} \]

Metastable States in Parametrically Excited Multimode Hamiltonian Systems


where is given in (H5) and , η(t), ρ(t) and E(t) are explicitly displayed in (120– 123). Consequently ˜ φd (t) = e−iH0 t φd (0) + φ(t), where, for any t > 0 and 0 ≤ r ≤ r1 , ˜ ≤ C|||W ||| ,

w− φ(t) t w− φd (t) ≤ C + D|||W ||| sup s r |A(s)| . r

(63) (64)

0≤s≤t

Moreover, the coefficient matrices have the following important properties: (a) and are self-adjoint constant coefficient complex matrices of order O(|||W |||2 ). (b) is nonnegative definite. (c) η(t) is of order O(|||W |||) and self-adjoint. (d) ρ(t) is an almost periodic matrix with mean 0 and order O(|||W |||2 ). (e) E(t) = E(t, A(t), φd (t)) is such that for any 0 ≤ r ≤ r1 and sufficiently small perturbation W , i.e. |||W ||| is sufficiently small, there exists the constants C and D such that for all t > 0: |E(t)| ≤

D C |||W ||| + r |||W |||3 sup s r |A(s)|. t r1 t 0≤s≤t

(65)

Note that relation (63) says that the wave part is within |||W ||| from the unperturbed wave. This is part of the conclusion in all our Theorems 2.1, 3.1 and 3.2. To show that the full solution of the unperturbed problem decays polynomially in time it is now sufficient to prove that t r1 |A(t)| is bounded for t > 0, since t r1 w− φ(t) ≤ t r1 |A(t)| + t r1 w− φd (t) ≤ C + (1 + D |||W |||) sup s r1 |A(s)|,

(66)

0≤s≤t

where we used (64). In order to show that t r1 |A(t)| is bounded for t > 0 and to obtain a more precise dynamics for A(t) on |||W |||−2 time scales we need to further refine (61). We will use a near identity change of variables, A → B, to reduce (61) to a system of the form: B(t) = (I − M(t))A(t), ∂t B = (− # + i # )B + αB + F.

(67) (68)

Here, (p1) M(0) = 0 and M(t) is a time dependent matrix of order O(|||W ||| + ) uniformly in t, i.e. there exists a constant C such that for all t ∈ R we have |M(t)| ≤ C (|||W ||| + ) . (p2) # is a constant, self adjoint, nonnegative definite matrix of order O(|||W |||2 ). (p3) # is a constant, self adjoint matrix of order O(|||W |||). (p4) α(t) is a time dependent matrix of higher order, i.e. there exists a constant C such that for all t ∈ R we have |α(t)| ≤ Cα |||W |||2 (|||W ||| + ).


(p5) F (t) = F (B, φd ; t) satisfies an estimate in terms of the norm of W and B of the form (65), i.e. there exists the constants C and D such that for all t > 0 and 0 ≤ r ≤ r1 : |F (t)| ≤ C

1+ 1+ |||W ||| + D |||W |||3 sup s r |B(s)|. t r1 t r 0≤s≤t

(69)

The different settings of Theorems 2.1, 3.1 and 3.2 lead to different M in (67) and, consequently, different # , # , α, F and in (68) and (p1)–(p5). They arise due to the matrix character of (61). In [12] it was relatively simple to infer the behavior of the scalar A(t) from (61); the fundamental solution associated with its homogeneous part can be factored as t

e− (t−s)+

s

i( +η(τ ))+ρ(τ )dτ

= e− (t−s) ei

t s

+η(τ )dτ

e

t s

ρ(τ )dτ

,

(70)

and we could analyze the norm of each operator in the right-hand side. However, in the multiple bound state case, the above splitting t is valid only if the t m × m complex matrices , , η(t), s η(τ )dτ, ρ(t) and s ρ(τ )dτ commute for all t, s ∈ R, t ≥ s. Since this is typically false for m ≥ 2, a detailed analysis of the fundamental solution of the homogeneous part of (61) is required. In carrying this out, we exploit the fact that ∂t A is of order O(|||W |||) and we carefully integrate by parts appropriate terms on the right-hand side of (61) to obtain a system of the form (67–68). The proofs are then finished by applying the following proposition: Proposition 5.2. Suppose B(t) is a solution of (68) and (p2)–(p5) are satisfied. Then, there exists ε0 > 0 such that whenever |||W ||| + ≤ ε0 , we have B(t) = e(−

# +i # )t

B(0) + O(|||W ||| + )

for |t| = O(|||W |||−2 ).

If, in addition, the smallest eigenvalue of # , γ , satisfies γ ≥ θ0 |||W |||2 for some constant θ0 > 0, which is independent of |||W ||| and then B(t) = O(t −r1 ) as |t| → ∞. The proof of this general proposition is rather technical and is presented in the Appendix of Sect. 7. We now observe that to prove Theorems 2.1, 3.1 and 3.2 it suffices to verify that in each case (61–62) can be reduced to a system of (68) with (p1)–(p5) satisfied. This is the purpose of the next three subsections. To verify this assertion we note that if the solution, A(t), of (61) is related to those of (68) through the change of variable (67), then A(t) = (I −M(t))−1 B(t) = e(−

# +i # )t

A(0)+O(|||W |||+) for |t| = O(|||W |||−2 ), (71)

provided (p1)–(p5) hold. If in addition γ ≥ θ0 |||W |||2 for some constant θ0 then clearly A(t) = (I − M(t))−1 B(t) = O(t −r1 )

as |t| → ∞.

Using now the estimate (66) the theorems are completely proven.

(72)
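The remark above, that the factorization (70) fails when the coefficient matrices do not commute, can be checked numerically. The following Python sketch is only an illustration under assumed data: the 2 × 2 matrices `noncommuting`, `commuting` and `frequency` are hypothetical stand-ins for a nonnegative matrix and a self-adjoint matrix of the kind appearing in (61), and `mat_exp` is a plain truncated Taylor series, adequate only for matrices of small norm.

```python
# Toy check: exp(X + Y) = exp(X) exp(Y) requires [X, Y] = 0.
# All matrices below are hypothetical examples, not taken from the paper.

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    n = len(x)
    return [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]

def mat_scale(c, x):
    return [[c * v for v in row] for row in x]

def mat_exp(x, terms=40):
    """Truncated Taylor series for exp(x); adequate for small-norm matrices."""
    n = len(x)
    result = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, x))  # x^k / k!
        result = mat_add(result, term)
    return result

def splitting_gap(damping, frequency):
    """Largest entry of exp(-damping + i*frequency) - exp(-damping) exp(i*frequency)."""
    minus_d = mat_scale(-1.0, damping)
    i_f = mat_scale(1j, frequency)
    full = mat_exp(mat_add(minus_d, i_f))
    split = mat_mul(mat_exp(minus_d), mat_exp(i_f))
    n = len(full)
    return max(abs(full[i][j] - split[i][j]) for i in range(n) for j in range(n))

frequency = [[0.0, 1.0], [1.0, 0.0]]      # self-adjoint
noncommuting = [[0.1, 0.0], [0.0, 0.3]]   # nonnegative, does not commute with frequency
commuting = [[0.2, 0.0], [0.0, 0.2]]      # multiple of the identity, commutes

print(splitting_gap(noncommuting, frequency))  # visibly nonzero
print(splitting_gap(commuting, frequency))     # zero up to rounding
```

In the commuting case the two sides agree to rounding error, while in the non-commuting case the gap is of the size of the commutator. This is why the scalar argument of [12] does not carry over directly to m ≥ 2.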


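5.1. Expansion and normal form for Theorem 2.1. We begin by splitting the coefficient, η(t), in (61) and displayed in (121) into its time independent average and time dependent oscillating part:

$$\eta(t) \equiv \eta_1 - i\eta_2(t), \tag{73}$$

$$(\eta_1)_{lj} = -\frac{1}{2} \sum_{\substack{n\in\mathbb{Z} \\ \lambda_l = \lambda_j + \mu_n}} \langle \psi_l, \beta_n \psi_j \rangle, \tag{74}$$

$$(\eta_2(t))_{lj} = -\frac{i}{2} \sum_{\substack{n\in\mathbb{Z} \\ \lambda_l \neq \lambda_j + \mu_n}} e^{i(\lambda_l - \lambda_j - \mu_n)t} \langle \psi_l, \beta_n \psi_j \rangle. \tag{75}$$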

Then, (61) becomes: ∂t A(t) = [− + i + iη1 ] A(t) + [η2 (t) + ρ(t)] A(t) + E(t).

(76)

We next integrate (76) from 0 to t:

t

A(t) = A(0)+

t

[− + i + iη1 ] A(s)ds+

0

0

Let

t

M1 (t) =

t

E(s)ds+

[η2 (s) + ρ(s)] A(s) ds.

0

η2 (s) + ρ(s)ds.

(77)

0

Integration by parts of the last integral yields:

t

[I − M1 (t)] A(t) = [I − M1 (0)] A(0) + [− + i + iη1 ] A(s)ds 0 t t + E(s)ds − M1 (s)∂s A(s)ds, 0

(78)

0

where (M1 (t))lj =−

+

1 2 1 4

n∈Z λl =λj +µn

ei(λl −λj −µn )t ψl , βn ψj λl − λ j − µ n

n,k∈Z λl +µk =λj +µn

ei(λl +µk −λj −µn )t Pc βk ψl , (H0 −λj −µn −i0)−1 Pc βn ψj . (79) λl +µk −λj −µn

Note that due to (H4) and (H6),

$$M_1(t) = O\!\left(\frac{|||W|||}{\delta_*}\right). \tag{80}$$


The next step is to replace ∂s A(s) in (78) using (76): t [I − M1 (t)] A(t) = [I − M1 (0)] A(0) + [I − M1 (s)] [− + i + iη1 ] A(s)ds 0 t t − M1 (s)η2 (s)A(s)ds − M1 (s)ρ(s)A(s)ds 0 0 t + [I − M1 (s)] E(s)ds. 0

Commuting [I − M1 (s)] and [− + i + iη1 ] in the first integral, we obtain t [I − M1 (t)] A(t) = [I − M1 (0)] A(0) + [− + i + iη1 ] [I − M1 (s)] A(s)ds 0 t t −i M1 (s)η2 (s)A(s)ds [M1 (s), η1 ] A(s)ds − 0 0 t t − M1 (s)ρ(s)A(s)ds [M1 (s), (− + i )] A(s)ds − 0 0 t + (81) [I − M1 (s)] E(s) ds. 0

A direct calculation gives: (i [M1 (s), η1 ] + M1 (s)η2 (s))lj ei(λl +µk −λj −µn )s i = βk ψl , ψp ψp , βn ψj 4 λl + µ k − λ p k,n∈Z 1≤p≤m λp =λl +µk

−

i 4

k,n∈Z λl +µk =λj +µn

+O

|||W |||3 δ∗

ei(λl +µk −λj −µn )s λl + µ k − λ j − µ n

|||W |||3 ≡ −i (η3 )lj − (η4 (s))lj + O δ∗

βk ψl , ψp ψp , βn ψj

1≤p≤m λp =λl +µk

.

Here, −iη3 is the average and η4 (t) the oscillating part. Specifically, 1 1 βk ψl , ψp ψp , βn ψj , (η3 )lj = − 4 λl + µ k − λ p

(82)

1≤p≤m k,n∈Z λl +µk =λj +µn λp =λl +µk

(η4 (t))lj = −

+

i 4 i 4

1≤p≤m k,n∈Z λl +µk =λj +µn λp =λl +µk

k,n∈Z λl +µk =λj +µn

ei(λl +µk −λj −µn )t βk ψl , ψp ψp , βn ψj λl + µ k − λ p

ei(λl+µk−λj−µn )t λl + µ k − λ j − µ n

βk ψl , ψp ψp , βn ψj . (83)

1≤p≤m λp =λl +µk


Note that (H4) and (H6) imply

$$\eta_3 = O\!\left(\frac{|||W|||^2}{\delta_*}\right). \tag{84}$$

Hence (81) can be written as: [I − M1 (t)] A(t) = [I − M1 (0)] A(0) t + [− + i( + η1 + η3 )] [I − M1 (s)] A(s)ds 0 t t |||W |||3 |||W |||3 + A(s)ds η4 (s)A(s)ds + O + δ∗ δ∗2 0 0 t + (85) [I − M1 (s)] E(s) ds. 0

The last step is to integrate by parts of η4 (t) given by (M2 (t))lj = −

1 4

t 0

1≤p≤m k,n∈Z λl +µk =λj +µn λp =λl +µk

η4 (s)A(s)ds. Let M2 (t) be the antiderivative ei(λl +µk −λj −µn )t (λl + µk − λp )(λl + µk − λj − µn )

×βk ψl , ψp ψp , βn ψj +

×

1 4

k,n∈Z λl +µk =λj +µn

ei(λl +µk −λj −µn )t (λl + µk − λj − µn )2

βk ψl , ψp ψp , βn ψj .

(86)

1≤p≤m λp =λl +µk

By (H4) and (H6),

$$M_2(t) = O\!\left(\frac{|||W|||^2}{\delta_*^2}\right). \tag{87}$$

Hence, (85) becomes: [I − M1 (t) − M2 (t)] A(t) = [I − M1 (0) − M2 (0)] A(0) t + [− + i( + η1 + η3 )] [I − M1 (s) − M2 (s)] A(s)ds 0 t |||W |||3 |||W |||3 + A(s)ds O + δ∗ δ∗2 0 t + [I − M1 (s) − M2 (s)] E(s)ds.

(88)

0

Now introduce into (88) the near identity change of variable: B(t) = [I − M1 (t) − M2 (t)] A(t).

(89)
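The effect of a near identity change of variables such as (89) can be seen in a scalar toy problem. The sketch below is an illustration under stated assumptions, not the system treated in the paper: it takes the hypothetical equation a′(t) = ε e^{iωt} a(t), whose coefficient is purely oscillatory, sets M(t) = ε e^{iωt}/(iω) (an antiderivative of that coefficient), and compares a(t) with b(t) = (1 − M(t)) a(t). A direct computation gives b′(t) = −M(t) ε e^{iωt} a(t), so b varies at order ε² while a varies at order ε.

```python
import cmath

def amplitudes(eps, omega, t):
    """Exact a(t) for a' = eps*exp(i*omega*t)*a with a(0) = 1, and b = (1 - M) a."""
    phase = cmath.exp(1j * omega * t)
    a = cmath.exp(eps * (phase - 1) / (1j * omega))
    m = eps * phase / (1j * omega)  # antiderivative of the oscillating coefficient
    return a, (1 - m) * a

def max_deviation(eps, omega, t_end=50.0, n=5000):
    """Largest excursions of a and of b from their initial values on [0, t_end]."""
    a0, b0 = amplitudes(eps, omega, 0.0)
    dev_a = dev_b = 0.0
    for k in range(n + 1):
        a, b = amplitudes(eps, omega, t_end * k / n)
        dev_a = max(dev_a, abs(a - a0))
        dev_b = max(dev_b, abs(b - b0))
    return dev_a, dev_b

# Hypothetical sample values: eps plays the role of the size of W.
print(max_deviation(eps=0.05, omega=1.0))
```

In this toy model the transformed variable is nearly constant, which is the mechanism exploited above: oscillatory terms are moved into M(t) and only higher order terms remain in the equation for B(t). The factor 1/ω in M(t) also shows why small frequency differences, i.e. small δ∗, make the transformation large.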


Differentiation with respect to t yields: ∂t B(t) = [− + i( + η1 + η3 )] B(t) |||W |||3 |||W |||3 +O B(t) + [I − M1 (t) − M2 (t)] E(t). + δ∗ δ∗2

(90)

Comparing (89–90) with (67–68) gives the identities: |||W |||2 |||W ||| M(t) = M1 (t) + M2 (t) = O +O , δ∗ δ∗2 J = + η1 + η3 , |||W ||| |||W ||| , α(t) ≤ Cα |||W |||2 + δ∗ δ∗2 F (t) = [I − M1 (t) − M2 (t)] E(t). Now, hypotheses (p1)–(p5) can be readily verified. Thus Theorem 2.1 is completely proven. 5.2. Expansion and normal form for Theorem 3.1. We start as in the previous section and after one integration by parts we have: t [− + i ] [I − M1 (s)] A(s)ds [I − M1 (t)] A(t) = [I − M1 (0)] A(0) + 0 t − M1 (s)η2 (s)A(s)ds 0 t t − M1 (s)ρ(s)A(s)ds [M1 (s), (− + i )] A(s)ds − 0 0 t + (91) [I − M1 (s)] E(s) ds, 0

where M1 (t) and η2 (t) are given as before by (79) respectively (75). Note though that due to (h6) and β0 ≡ 0 we have η1 ≡ 0 and |||W ||| |||W |||2 . (92) + M1 (t) = O D δ∗ Terms which need to further be integrated by parts lie in the kernel (M1 (s)η2 (s))lj =

i 4

k,n∈Z 1≤p≤2N λp =λl +µk

ei(λl +µk −λj −µn )s βk ψl , ψp ψp , βn ψj λl + µ k − λ p

i ei(λl −λj −µn )s Pc βk ψl , (H0 −λl∗ −i0)−1Pc ψl∗ ψl∗ , βn ψj 4 λl −λl∗ k,n∈Z |||W |||3 +O D |||W |||3 ≡ −iη3 − (η4 (s))lj + O . D −


Here l∗ = tl + 1 respectively l∗ = l − 1 if l is odd respectively if l is even. By integrating by parts 0 η4 (s)A(s)ds in (91) we obtain [I − M1 (t) − M2 (t)] A(t) = [I − M1 (0) − M2 (0)] A(0) t + [− + i + iη3 ] [I − M1 (s) − M2 (s)] A(s)ds 0 t + M2 (s)η2 (s)A(s)ds 0 t |||W |||3 |||W |||4 + A(s)ds O + D δ∗ 0 t + (93) [I − M1 (s) − M2 (s)] E(s)ds. 0

Here

t

M2 (t) = 0

|||W |||2 η4 (s)ds = O δ∗ D

.

We still have to do one integration by parts as the following expansion suggests: (M2 (s)η2 (s))lj =

1 4

ei(λl −λj −µn )s (λl + µk − λp )(λl − λl∗ )

k,n∈Z 1≤p≤2N λp =λl +µk

|||W |||3 |||W |||4 + × βk ψl , ψp ψp , βk ψl∗ ψl∗ , βn ψj + O D2 δ∗ D |||W |||3 |||W |||4 ≡ −(η5 (t))lj + O + . (94) D2 δ∗ D Now let

t

M3 (t) = 0

Finally after integrating by parts

t 0

|||W |||3 η5 (s)ds = O δ∗ D 2

.

η5 (s)A(s)ds in (93) we get

[I − M1 (t) − M2 (t) − M3 (t)] A(t) = [I − M1 (0) − M2 (0) − M3 (0)] A(0) t + [− + i + iη3 ] [I − M1 (s) − M2 (s) − M3 (s)] A(s)ds 0 t |||W |||4 |||W |||3 |||W |||4 + A(s)ds + O + D δ∗ δ∗ D 2 0 t + [I − M1 (s) − M2 (s) − M3 (s)] E(s)ds.

(95)

0

The bottom line is that we are going to use a change of variables: B(t) = [I − M1 (t) − M2 (t) − M3 (t)] A(t)

(96)


and diferentiating (95) we obtain ∂t B(t) = [− + i + iη3 ] B(t) |||W |||3 |||W |||4 |||W |||4 +O B(t) + + D δ∗ δ∗ D 2 + [I − M1 (t) − M2 (t) − M3 (t)] E(t).

(97)

Equations (96–97) can be identified with (67–68) where: M(t) = M1 (t) + M2 (t) + M3 (t) |||W ||| |||W |||2 |||W |||2 |||W |||3 =O +O , + +O D δ∗ δ∗ D δ∗ D 2 J = + η3 , |||W |||2 |||W |||2 2 |||W ||| α(t) ≤ Cα |||W ||| , + + D δ∗ δ∗ D 2 F (t) = [I − M1 (t) − M2 (t) − M3 (t)] E(t). Hypothesis (p1)–(p5) hold under the assumptions that D > 0 is a fixed constant. This concludes the proof of Theorem 3.1. 5.3. Expansion and normal form for Theorem 3.2. For the case of Theorem 3.2 the procedure is even more natural. We proceed in a manner similar to that in Subsect. 5.1. However, note that integrals with integrands containing the factors exp(±i(λ2i−1 −λ2i )t), i = 1, . . . , N are not integrated by parts. Thus “small” denominators are avoided. In conclusion we obtain a change of variables: B(t) = [I − M1 (t) − M2 (t)] A(t)

(98)

∂t B(t) = U (t)B(t) + O(|||W |||3 )B(t) + [I − M1 (t) − M2 (t)] E(t).

(99)

such that:

Here M1 , respectively M2 are as in (79), respectively (86), but without the terms having denominators of the form ±(λ2l−1 − λ2l ). Thus M1 , respectively M2 , are of order O(|||W |||), respectively O(|||W |||2 ) uniformly in δ∗ 0 and t ∈ R. U (t) is a tridiagonal matrix formed by 2 × 2 blocks on the diagonal: U (t) = diag U 1 (t), U 2 (t), . . . , U N (t) . Each 2 × 2 block, U l (t), l = 1, 2, . . . N corresponds to the pair of close eigenvalues (2l − 1, 2l), l = 1, . . . , N and has the form: ul11 ei(λ2l−1 −λ2l )t ul12 l U = , (100) ei(λ2l −λ2l−1 )t ul21 ul22 where ulj k , j, k = 1, 2 are constant complex numbers. Before we explicitly write them let us briefly explain why U (t) has this form. Given that we follow the procedure in Sect. 5.1, U (t) contains the dominant constant coefficient matrix we found in that section, namely − + i + iη1 + iη3 . Due to (H6)’ this matrix is diagonal, see (31), (120),


(74) and (82). In addition U (t) contains all the terms we avoid integrating by parts, i.e. the ones having as factors e±i(λ2l−1 −λ2l )t , l = 1, 2, . . . , N. It is a matter of actually doing the procedure to see that such terms only occur in the positions (2l − 1, 2l) and (2l, 2l − 1), l = 1, 2, . . . , N of the matrix U (t). If one removes the restriction (H6)’ the matrix U (t) may have slowly varying almost periodic in time terms outside the 2 × 2 blocks. As the theory for such systems is not yet well developed, they will prevent us from analyzing the dynamics. Below are the formulas for the constants ulj k , j, k = 1, 2 in (100): i ul11 = iλ2l−1 − ψ2l−1 , β0 ψ2l−1 2 i + Pc βn ψ2l−1 , (H0 − λ2l−1 − µn − i0)−1 Pc βn ψ2l−1 4 n∈Z i 1 ψ2l−1 , β0 ψp 2 − 4 λ2l−1 − λp 1≤p≤2N p=2l−1, p=2l

−

i 4

ψ2l−1 , βn ψp 2

1≤p≤2N n>0

1 1 + λ2l−1 − λp − µn λ2l−1 − λp − µ−n

i = iλ2l−1 − ψ2l−1 , β0 ψ2l−1 2 i + Pc βn ψ2l−1 , (H0 − λl − µn − i0)−1 Pc βn ψ2l−1 4 n∈Z i 1 ψ2l−1 , β0 ψ2p−1 2 + ψ2l−1 , β0 ψ2p 2 + 4 λp − λ l 1≤p≤N p=l

+

λp − λ l i ψ2l−1 , βn ψ2p−1 2 + ψ2l−1 , βn ψ2p 2 2 2 2 (λl − λp ) − µn 1≤p≤N n>0

ul12

+O(δ ∗ |||W |||2 ), 1 1 = − ψ2l−1 , β0 ψ2l + Pc βn ψ2l−1 , (H0 − λ2l − µn − i0)−1 Pc βn ψ2l 2 4 n∈Z 1 1 − ψ2l−1 , β0 ψp ψp , β0 ψ2l 4 λ2l−1 − λp 1≤p≤2N p=2l−1, p=2l

1 ψ2l−1 , βn ψp ψp , βn ψ2l 4 1≤p≤2N n>0 1 1 × + λ2l−1 − λp − µn λ2l−1 − λp − µ−n 1 1 = − ψ2l−1 , β0 ψ2l + Pc βn ψ2l−1 , (H0 − λl − µn − i0)−1 Pc βn ψ2l 2 4 n∈Z 1 1 + ψ2l−1 , β0 ψ2p−1 ψ2p−1 , β0 ψ2l 4 λp − λ l −

1≤p≤N p=l


+ψ2l−1 , β0 ψ2p ψ2p , β0 ψ2l λp − λl 1 + ψ2l−1 , βn ψ2p−1 ψ2p−1 , βn ψ2l 2 (λl − λp )2 − µ2n 1≤p≤N n>0 +ψ2l−1 , βn ψ2p ψ2p , βn ψ2l + O(δ ∗ |||W |||2 ), where we used µ−n = −µn and the resolvent identity

(H0 − λ2l−1 − µn − i0)−1 − (H0 − λl − µn − i0)−1 Pc = (λ2l−1 − λl )(H0 − λ2l−1 − µn − i0)−1 (H0 − λ2l−1 − µn − i0)−1 Pc = O(δ ∗ ) (101)

in weighted norms, see [12, p. 36]. ul22 , respectively ul21 , can be obtained from ul11 , respectively ul12 , by interchanging 2l − 1 with 2l. The terms O(δ ∗ |||W |||2 ) can be added to α(t) in (68). Thus = δ ∗ in (p4). To get the dominant dynamics described in Theorem 3.2, we switch to the fast oscillating amplitudes: b2l−1 = e−iλ2l−1 t B2l−1 , b2l = e−iλ2l t B2l

(102)

for l = 1, 2, . . . , N. We then have: ∂t b(t) = −i diag [λ1 , λ2 , . . . , λ2N ] + U˜ b(t) + O(|||W |||3 + |||W |||2 δ ∗ )b(t) + I − M˜ 1 (t) − M˜ 2 (t) E(t), (103) where U˜ has the same components as U (t) except that the factors exp(±i(λ2i−1 − λ2i )t), i = 1, . . . , N have disappeared. Thus U˜ is a constant matrix. We further expand it using the formulas for ulj k above and (H0 − λ − µ − i0)−1 = P.V.(H0 − λ − µ)−1 + iπ δ(H0 − λ − µ) for λ + µ ∈ σcont (H0 ). Upon grouping the results in a self adjoint matrix and an anti-self adjoint one we get U˜ = − # + i # , where # and # are the ones given in Theorem 3.2. We only have to show that # ≥ 0 in order to apply Proposition 5.2. Since # is formed by the 2 × 2 blocks, l# , l = 1, 2, . . . , N the problem reduces to showing that each of this block is nonnegative definite. But this is straightforward from the fact that δ(H0 − λ − µ), λ + µ ∈ σcont (H0 ) induces a scalar product in weighted Hilbert spaces, see [12, Appendix]. The theorem is now completely proven.
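The identity (H0 − λ − µ − i0)^{−1} = P.V.(H0 − λ − µ)^{−1} + iπδ(H0 − λ − µ) used above has a scalar counterpart that is easy to test numerically. The sketch below rests on two assumptions made purely for illustration: H0 is replaced by multiplication by x on the real line, and the test function is a Gaussian. Under these assumptions the imaginary part of ∫ f(x)/(x − λ − iη) dx should approach πf(λ) as η decreases to zero.

```python
import math

def regularized_integral(f, lam, eta, lo=-8.0, hi=8.0, n=160_000):
    """Midpoint rule for the integral of f(x) / (x - lam - i*eta) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0j
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) / complex(x - lam, -eta)
    return total * h

def gaussian(x):
    return math.exp(-x * x)

lam = 0.5                              # hypothetical spectral point
target = math.pi * gaussian(lam)       # pi * f(lam), the delta-function contribution
for eta in (0.1, 0.01):
    value = regularized_integral(gaussian, lam, eta)
    print(eta, value.imag, target)
```

The imaginary part is nonnegative for a nonnegative test function, which is the scalar shadow of the nonnegativity argument used above. The real part converges to the principal value integral.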


6. Appendix – Proof of Proposition 5.1

In this appendix we prove Proposition 5.1, i.e. we show that (24) is indeed equivalent to (61–62) under hypotheses (H1–H4). The latter system is the starting point of Sect. 5 which leads to the normal form for the amplitude equations. The computation is an extension to multiple bound states of the one implemented in [12, 19, 21, 20] for the one bound state case. We recall that (24) is:

$$i\partial_t \phi(t) = (H_0 + W(t))\,\phi(t), \tag{104}$$

and under hypotheses (H1–H2) its general solution can be written in the form, see also (57–59),

$$\phi(t) = \sum_{l=1}^{m} a_l(t)\,\psi_l + \phi_d(t), \tag{105}$$

where

$$a_l(t) = \langle \psi_l, \phi(t) \rangle, \quad l = 1, \dots, m, \tag{106}$$

and

$$\phi_d(t) = P_c\phi(t) \tag{107}$$

is the projection of the solution onto the continuous spectrum associated with H0. We proceed by first inserting (105) into (104), which yields the equation:

$$i\sum_{j=1}^{m} \partial_t a_j(t)\,\psi_j + i\partial_t\phi_d(t) = \sum_{j=1}^{m} \lambda_j a_j(t)\,\psi_j + H_0\phi_d(t) + \sum_{j=1}^{m} a_j(t)\,W(t)\psi_j + W(t)\phi_d(t). \tag{108}$$

Taking the inner product of (108) with each of the eigenvectors, ψl, l = 1, . . . , m, we get the following system of equations for the amplitudes al(t), l = 1, . . . , m:

$$i\partial_t a_l(t) = \lambda_l a_l(t) + \sum_{j=1}^{m} \langle \psi_l, W(t)\psi_j \rangle\, a_j(t) + \langle \psi_l, W(t)\phi_d \rangle. \tag{109}$$

In deriving (109) we have used that ψl, l = 1, . . . , m are orthonormal and satisfy the following orthogonality relations:

$$\langle \psi_l, \phi_d(t) \rangle = 0, \quad l = 1, \dots, m. \tag{110}$$

Applying Pc to (108), we obtain an equation for φd:

$$i\partial_t\phi_d(t) = H_0\phi_d(t) + P_c W(t)\phi_d(t) + \sum_{j=1}^{m} a_j(t)\,P_c W(t)\psi_j. \tag{111}$$

Since we are after a slow resonant decay phenomenon, it will prove advantageous to extract the fast oscillatory behavior of al(t). We therefore define:

$$A_l(t) \equiv e^{i\lambda_l t} a_l(t), \quad l = 1, \dots, m. \tag{112}$$


Then, for l = 1, . . . , m, (109) reads

$$\partial_t A_l = -i \sum_{j=1}^{m} e^{-i(\lambda_j - \lambda_l)t} \langle \psi_l, W(t)\psi_j \rangle A_j - i e^{i\lambda_l t} \langle \psi_l, W(t)\phi_d(t) \rangle. \tag{113}$$
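The passage from (109) to (113) can be illustrated in a finite-dimensional toy model. The sketch below is hypothetical throughout: it drops the continuum part φd, keeps two bound states with made-up energies `LAM`, and uses a real symmetric coupling matrix `V` with forcing ε cos(µt). It integrates both the a-equation and the A-equation with a hand-written Runge-Kutta step and checks that A_l(t) = e^{iλ_l t} a_l(t), and that A stays close to its initial value while a rotates by an O(1) amount.

```python
import cmath
import math

# Hypothetical toy data (not from the paper).
LAM = (-1.0, -2.5)               # two bound-state energies
V = ((0.3, 1.0), (1.0, -0.2))    # real symmetric coupling
EPS, MU = 0.05, 0.7              # forcing size and frequency

def forcing(t):
    return EPS * math.cos(MU * t)

def rhs_a(t, a):
    """i a_l' = lam_l a_l + sum_j W_lj(t) a_j, the analogue of (109) without phi_d."""
    w = forcing(t)
    return [-1j * (LAM[l] * a[l] + w * sum(V[l][j] * a[j] for j in range(2)))
            for l in range(2)]

def rhs_A(t, A):
    """A_l' = -i sum_j exp(-i(lam_j - lam_l)t) W_lj(t) A_j, the analogue of (113)."""
    w = forcing(t)
    return [-1j * w * sum(cmath.exp(-1j * (LAM[j] - LAM[l]) * t) * V[l][j] * A[j]
                          for j in range(2))
            for l in range(2)]

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, [y[i] + h / 2 * k1[i] for i in range(len(y))])
    k3 = f(t + h / 2, [y[i] + h / 2 * k2[i] for i in range(len(y))])
    k4 = f(t + h, [y[i] + h * k3[i] for i in range(len(y))])
    return [y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(len(y))]

def compare(t_end=10.0, steps=10000):
    """Return (max |e^{i lam t} a - A|, max |a - a(0)|, max |A - A(0)|)."""
    h = t_end / steps
    a0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
    a, A = a0[:], a0[:]
    mismatch = drift_a = drift_A = 0.0
    for n in range(steps):
        t = n * h
        a = rk4_step(rhs_a, t, a, h)
        A = rk4_step(rhs_A, t, A, h)
        for l in range(2):
            rotated = cmath.exp(1j * LAM[l] * (t + h)) * a[l]
            mismatch = max(mismatch, abs(rotated - A[l]))
            drift_a = max(drift_a, abs(a[l] - a0[l]))
            drift_A = max(drift_A, abs(A[l] - a0[l]))
    return mismatch, drift_a, drift_A

print(compare())
```

The chosen frequencies avoid the resonances λ_l − λ_j ± µ = 0, so in this toy run the slow variable A only wobbles at order ε. Resonant choices would produce a net drift, which is the situation the normal form analysis is designed to capture.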

Note that since W is small, A(t) is slowly varying. We write (111) in an integral form using Duhamel’s principle:

$$\phi_d(t) = e^{-iH_0 t}\phi_d(0) - i\sum_{j=1}^{m} \int_0^t e^{-iH_0(t-s)} P_c W(s)\, a_j(s)\,\psi_j\, ds - i\int_0^t e^{-iH_0(t-s)} P_c W(s)\,\phi_d(s)\, ds \equiv \phi_0(t) + \phi_1(t) + \phi_2(t). \tag{114}$$

By standard methods, the system (113)–(114) for A(t) = (A1 (t), A2 (t), . . . , Am (t)) and φd (t) has a global solution in t with A ∈ C 1 (R, Rm ), φd (t) ∈ C 0 (R), w− φd (t) ∈ C 0 (R). Our analysis of the |t| → ∞ behaviour is based on a study of this system. Let us prove first the wave estimates (63–64). By comparing (114) with (63) we have ˜ φ(t) = φ1 (t) + φ2 (t). The decay properties of the unperturbed wave operator e−iH0 t combined with the conservation of energy |a(t)|2 + φd (t) 2 = φ(0) 2 which in particular gives |a(t)|, φd (t) ≤ φ(0) , readily implies (63). As for (64) let us multiply (114) by w− and apply the norm: t

w− e−iH0 (t−s) Pc W (s) |A(s)| ds.

w− φd (t) ≤ w− e−iH0 t φd (0) + 0 t +

w− e−iH0 (t−s) Pc W (s)w+ w− φd (s) ds. 0

Using now the local decay estimates in (H3) we have t w− φd (t) ≤ C w+ φd (0) +C sup s |A(s)|t r

r

0≤s≤t

+C sup s r w− φd (s) t r 0≤s≤t

t 0

t

r 0

t −s −r1 w+ W (s) s −r ds

t − s −r w+ W (s)w+ ds.

Finally by w+ W (s) ≤ |||W ||| and w+ W (s)w+ ≤ |||W |||, see (H4) and the standard inequality t t r t − s −r1 s −r ds ≤ D 0


valid for some constant D independent of t we get (1 − C|||W |||) sup s r w− φd (s) ≤ C w+ φd (0) + C sup s r |A(s)|. 0≤s≤t

0≤s≤t

Hence, for sufficiently small |||W ||| there exists the constants C and D such that for any t > 0, (115) sup s r w− φd (s) ≤ C + D sup s r |A(s)|. 0≤s≤t

0≤s≤t
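The standard inequality invoked above, namely that ⟨t⟩^r ∫_0^t ⟨t − s⟩^{−r1} ⟨s⟩^{−r} ds is bounded uniformly in t, can be sampled numerically. The sketch below makes two assumptions that are not stated in this section: the bracket is read as ⟨t⟩ = (1 + t²)^{1/2}, and the decay rate is taken to be the sample value r1 = 1.5 > 1. It is a sanity check on a few values of t, not a proof.

```python
import math

def bracket(t):
    """Assumed convention <t> = (1 + t^2)^(1/2)."""
    return math.sqrt(1.0 + t * t)

def weighted_convolution(t, r1, r, h=0.01):
    """<t>^r * integral_0^t <t-s>^(-r1) <s>^(-r) ds, by the midpoint rule."""
    n = max(1, round(t / h))
    step = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * step
        total += bracket(t - s) ** (-r1) * bracket(s) ** (-r)
    return bracket(t) ** r * total * step

R1 = 1.5                          # sample decay rate, assumed greater than 1
TIMES = (1.0, 10.0, 100.0, 1000.0)
for r in (0.0, 1.0, 1.5):         # sample exponents with 0 <= r <= R1
    print(r, [round(weighted_convolution(t, R1, r), 3) for t in TIMES])
```

The sampled values stay of order one over three decades of t, consistent with the existence of a constant D independent of t.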

In particular this implies (64). Now, in order to obtain (61) we need to insert (114) into (113). Before doing so let us expand the φ1(t) term. We first replace $a_j(t) = e^{-i\lambda_j t}A_j(t)$ and then make explicit the frequency content of W(t) by using the expression (29) for W(t):

$$W(t) = \frac{1}{2} \sum_{k\in\mathbb{Z}} \exp(-i\mu_k t)\,\beta_k,$$

which by (H4), is a convergent series uniformly in t ∈ R. We get:

$$\phi_1(t) = -\frac{i}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} \int_0^t e^{-iH_0(t-s)} e^{-i(\lambda_j+\mu_k)s} A_j(s)\, P_c\beta_k\psi_j\, ds = -\frac{i}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} e^{-iH_0 t} \int_0^t e^{i[H_0-(\lambda_j+\mu_k)]s} A_j(s)\, P_c\beta_k\psi_j\, ds. \tag{116}$$

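We wish to obtain the dominant contributions from φ1. These will come from resonances, terms where λj + µk ∈ σcont(H0). These contributions are calculated by careful integration by parts. To carry this out we first regularize φ1 by defining:

$$\phi_1^{\eta}(t) = -\frac{i}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} \int_0^t e^{-iH_0(t-s)} e^{-i(\lambda_j+\mu_k+i\eta)s} A_j(s)\, P_c\beta_k\psi_j\, ds \tag{117}$$

for η positive and arbitrary and t > 0. Note that $\phi_1(t) = \lim_{\eta\searrow 0} \phi_1^{\eta}(t)$ uniformly with respect to t on compact intervals. Now, integration by parts for each integral in expression (117) and letting η tend to zero from above gives the following expansion of ⟨ψl, W(t)φ1(t)⟩, l = 1, . . . , m:

$$\begin{aligned}
\langle \psi_l, W(t)\phi_1(t) \rangle
&= \Big\langle W(t)\psi_l,\ -\frac{1}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} e^{-i(\lambda_j+\mu_k)t} A_j(t)\,(H_0-\lambda_j-\mu_k-i0)^{-1} P_c\beta_k\psi_j \Big\rangle \\
&\quad + \Big\langle W(t)\psi_l,\ \frac{1}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} A_j(0)\, e^{-iH_0 t} (H_0-\lambda_j-\mu_k-i0)^{-1} P_c\beta_k\psi_j \Big\rangle \\
&\quad + \Big\langle W(t)\psi_l,\ \frac{1}{2} \sum_{j=1}^{m} \sum_{k\in\mathbb{Z}} \int_0^t e^{-iH_0(t-s)} (H_0-\lambda_j-\mu_k-i0)^{-1} P_c\, e^{-i(\lambda_j+\mu_k)s}\, \partial_s A_j(s)\,\beta_k\psi_j\, ds \Big\rangle.
\end{aligned} \tag{118}$$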


For a detailed discussion of the singular operators in the above computation and a justification of the calculation using hypothesis (H3), see [12, Sect. 8]. The choice of regularization, +iη, in (117) ensures that the latter two terms in the expansion of φ1 , (118), decay dispersively as t → +∞; see hypothesis (H3). For t < 0, we replace +iη with −iη in (117). To further expand the first series in (118) we use the distributional identities for the singular terms: Pc f, (H0 − λ − µ ∓ i0)−1 Pc g = Pc f, P.V.(H0 − λ − µ)−1 Pc g ± iπ Pc f, δ(H0 − λ − µ)Pc g , which according to [12, Sect. 8] holds whenever f, g satisfy w+ f, w+ g ∈ H. Finally, using the Fourier expansion for W (t) in (118) and substitution into (113) we find ∂t A(t) = [− + i( + η(t)) + ρ(t)] A(t) + E(t), (119) where is given in (31), , η(t) and ρ(t) are m × m matrices with components: ( )lj =

1 4 +

Pc βk ψl , P.V.(H0 − λj − µn )−1 Pc βn ψj

n,k∈Z λl +µk =λj +µn ∈σcont (H0 )

1 4

Pc βk ψl , (H0 − λj − µn )−1 Pc βn ψj ,

(120)

n,k∈Z λl +µk =λj +µn ∈σ / cont (H0 )

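$$\eta_{lj}(t) = -e^{i(\lambda_l-\lambda_j)t} \langle \psi_l, W(t)\psi_j \rangle, \tag{121}$$

$$\rho_{lj}(t) = \frac{i}{4} \sum_{\substack{n,k\in\mathbb{Z} \\ \lambda_l+\mu_k \neq \lambda_j+\mu_n}} e^{i(\lambda_l+\mu_k-\lambda_j-\mu_n)t} \langle P_c\beta_k\psi_l, (H_0-\lambda_j-\mu_n-i0)^{-1} P_c\beta_n\psi_j \rangle. \tag{122}$$

E(t) is a column vector with components:

$$\begin{aligned}
E_l(t) &= -\frac{i}{4} \sum_{j=1}^{m} \sum_{n,k\in\mathbb{Z}} e^{i(\lambda_l+\mu_k)t} \langle P_c\beta_k\psi_l,\ e^{-iH_0 t}(H_0-\lambda_j-\mu_n-i0)^{-1} P_c\beta_n\psi_j \rangle A_j(0) \\
&\quad - \frac{i}{4} \sum_{j=1}^{m} \sum_{n,k\in\mathbb{Z}} \int_0^t e^{i(\lambda_l+\mu_k)t - i(\lambda_j+\mu_n)s} \langle P_c\beta_k\psi_l,\ e^{-iH_0(t-s)}(H_0-\lambda_j-\mu_n-i0)^{-1} P_c\beta_n\psi_j \rangle\, \partial_s A_j(s)\, ds \\
&\quad - i e^{i\lambda_l t} \langle W(t)\psi_l,\ e^{-iH_0 t}\phi_d(0) \rangle - e^{i\lambda_l t} \Big\langle W(t)\psi_l,\ \int_0^t e^{-iH_0(t-s)} P_c W(s)\phi_d(s)\, ds \Big\rangle.
\end{aligned}$$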

(123) Note that (119) respectively (111) are exactly (61) respectively (62). They form a closed system equivalent to (104) and hence to (24). To complete Proposition 5.1 it remains to prove that the properties (a)–(e) hold. These concern the symmetries and norms of the matrices , , η and ρ and are a direct consequence of the symmetries and norms of the operators δ(H0 − λ), P.V.(H0 − λ)−1 , λ ∈


σcont (H0 ) and W . In the proofs we focus primarily on ; the proofs for , η and ρ are similar. Lemma 6.1. , and η are self-adjoint matrices. Proof of Lemma 6.1. We prove self-adjointness of . Recall from H5 (31) that π Pc βk ψl , δ(H0 − λj − µn )Pc βn ψj . ( )lj = 4

(124)

n,k∈Z λl +µk =λj +µn

Then



∗

π  ( )∗j l =  4 =

π 4

n,k∈Z λj +µk =λl +µn

  Pc βk ψj , δ(H0 − λl − µn )Pc βn ψl  

δ(H0 − λl − µn )Pc βn ψl , Pc βk ψj .

n,k∈Z λj +µk =λl +µn

Replacing now λl + µn with λj + µk = λl + µn in the argument of δ and then switching the summing indices k, n we get π δ(H0 − λj − µn )Pc βk ψl , Pc βn ψj ( )∗j l = 4 n,k∈Z λl +µk =λj +µn

= ( )lj , where the last equality comes from (124) and the fact that δ(H0 − λj − µn ), j ∈ {1, 2, . . . , N}, n ∈ Z are self adjoint operators, see [12, Sect. 8]. This proves that is self-adjoint. Lemma 6.2. , have matrix norms which are O(|||W |||2 ). η(t), respectively ρ(t), are almost periodic matrices of norms O(|||W |||), respectively O(|||W |||2 ), independent of t. Proof of Lemma 6.2. Let a = (a1 , a2 , . . . , am )T be an arbitrary Cm -column vector. If b = a,

b = (b1 , b2 , . . . , bm ),

then by left multiplying it with b∗ and expanding the left-hand side using (124) we have successively m π |b| = 4 2

l,j =1

=

π 4

Pc βk bl ψl , δ(H0 − λj − µn )Pc βn aj ψj

n,k∈Z λl +µk =λj +µn

n,k∈Z l,j ∈{1,2,...,m} λl +µk =λj +µn

Pc βk bl ψl , δ(H0 − λj − µn )Pc βn aj ψj


≤

C0

w+ βk w+ βn 4 n,k∈Z

≤

|bl | |aj |

l,j ∈{1,2,...,m} λl +µk =λj +µn

C0 m |b| |a| |||W |||2 , 4

where we used the estimate [12, relation (8.8)], the Cauchy-Buniakowski-Schwartz inequality: % & m m √ & |cj | ≤ m' |cj |2 j =1

j =1

applied to both sums over l and j and the definition of |||W ||| given in (30). In conclusion, for any Cm -vector a we have | a| ≤ m

C0 |||W |||2 |a|. 4

Hence

C0 |||W |||2 = O(|||W |||2 ). 4 This completes the proof of Lemma 6.2. | | ≤ m

(125)

Note however that the size of seems to depend on m, the number of bound states. Actually there is a method to prove the bound (125) with a constant which is independent of m. Here is how. First remark that: (ρ) ˜ lj ≡ (− + i( + ρ(t))) 1 i(λl +µk −λj −µn )t =− e Pc βk ψl , (H0 − λj − µn − i0)−1 Pc βn ψj . 4 n,k∈Z

We are going to show that the right-hand side is an almost periodic 2-form on Cm × Cm bounded by C|||W |||2 for some constant C independent of t or dimension m. As a consequence its mean − + i , hence both and , and its mean zero part ρ(t) will be dominated roughly by the same bound, see [12, Theorem 9.5] for estimating the mean value and use f − M(f ) ≤ f + M(f ) to estimate the zero mean part. Let us fix t and apply ρ˜ to the arbitrary vectors a = (a1 , a2 , . . .), b = (b1 , b2 , . . .) ∈ Cm . We have: m 1 ∗ i(λl +µk −λj −µn )t b l aj e Pc βk ψl , (H0 −λj −µn − i0)−1 Pc βn ψj 4 l,j =1 n,k∈Z ( ) ∞ −i(H0 −iξ )s = w+ W (t)ψb (t), lim w− e Pc w− w+ W (t − s)ψa (t − s)ds ,

b∗ ρa ˜ =−

ξ 0

0

(126)

where ψc (t) = j cj e−iλj t ψj for any vector c = (cj ). All in all the 2-form ρ is a composition of almost periodic functions c, t → ψc (t) and t → W (t) with the continuous function ( ) ∞ w+ ·, lim w− ξ 0

0

e−i(H0 −iξ )s Pc w− w+ · ds .


Therefore, the 2-form is almost periodic by [12, Theorem 9.3]. Using now the relation (26) in Hypothesis (H3) and the fact that

ψc (t) ≡ |c|, for any t ≥ 0, we get

|b∗ ρa| ˜ ≤ C|||W |||2 |b| |a|,

which implies sup |ρ(t)| ˜ ≤ C|||W |||2 , t∈R

where now ρ(t) ˜ is viewed as an operator on Cm . In conclusion , and ρ(t) have the required sizes, independent of the dimension, and the latter is almost periodic. Note that in the above argument it was essential to sum over the components of the vectors in Cm before we applied norms. As for the symmetry and norm of η(t) they are an immediate consequence of the symmetry and norm of W (again one should sum over the vector components as before and then apply norms to avoid the dependence of the estimate on the dimensionality). Lemma 6.3. is a nonnegative semidefinite matrix. Proof of Lemma 6.3. For any vector b = (b1 , b2 , . . .) ∈ Cm we have π b∗ b = ψαb , δ(H0 − α)ψαb ≥ 0. 4 α∈σcont (H0 )

Here α ∈ σcont (H0 ) is such that there exist l ∈ {1, 2, . . .} and k ∈ Z with the property λl + µk = α, while ψαb is given by ψαb =

m

l=1

k∈Z λl + µk = α

This completes the proof of Lemma 6.3.

b l Pc β k ψ l .

It remains to prove part (e) of Proposition 5.1. Such estimates have already been obtained in [12, Proposition 6.2] for the case of one bound state, i.e. E(t) a scalar. There are no new ideas in generalizing it for “m” bound states except the tricks used above to make the estimates independent on “m”. 7. Appendix – Proof of Proposition 5.2 By Duhamel’s principle, the solutions of the system (68) can be written as: t # # # # B(t) = e(− +i )t B(0) + e(− +i )(t−s) α(s)B(s)ds 0 t (− # +i # )(t−s) + e F (s)ds. 0

(127)


Hence, for any 0 ≤ r ≤ r1 , t |B(t)| ≤ e−γ t B(0) + Cα |||W |||2 (|||W ||| + ) e−γ (t−s) |B(s)|ds 0 t +C(1 + )|||W ||| e−γ (t−s) s −r1 ds 0 t +D(1 + )|||W |||3 sup s r |B(s)| e−γ (t−s) s −r ds, 0≤s≤t

(128)

0

where we used the estimates for α(s) and F (s) in (p4) respectively (p5) and |e(− +i )t | ≤ e−γ t due to the symmetry of both # and # . Let us focus first on the result on time scales of order 1/|||W |||2 . Fix T ≤ c|||W |||−2 , where c is an arbitrary constant. First we show that there exists a constant C such that #

sup |B(t)| ≤ C.

#

(129)

0≤t≤T

To this end we use r = 0 in (p5) and e−γ t ≤ 1, for t ≥ 0, since by (H5) γ ≥ 0. We plug them in (128) and obtain for all 0 ≤ t ≤ T : |B(t)| ≤ B(0) + t |||W |||2 (|||W ||| + ) sup |B(s)|(Cα + D) + C(1 + )|||W |||. 0≤s≤T

Then since 0 ≤ t ≤ T ≤ c|||W |||−2 we get ˜ (1 − C(|||W ||| + )) sup |B(t)| ≤ B(0) + C(1 + )|||W ||| 0≤t≤T

which for sufficiently small |||W ||| + implies (129). Now, by comparing (127) with the conclusion of Proposition 5.2 we find that we must show R(t) = O(|||W ||| + ), where t t # # # # R(t) ≡ e(− +i )(t−s) α(s)B(s)ds + e(− +i )(t−s) F (s)ds. (130) 0

0

Hence for any 0 ≤ t ≤ T , |R(t)| ≤ (Cα + D) t |||W |||2 (|||W ||| + ) sup |B(s)| + C(1 + )|||W |||, 0≤s≤T

and since 0 ≤ t ≤ c|||W |||−2 and B(s) is bounded we have ˜ R(t) ≤ C(|||W ||| + ). This completes the |||W |||−2 time scale estimates. For infinite time behavior we again proceed in two steps. Firstwe show that B(t) ∞ is bounded on the positive half line. We start from (128) and use 0 e−γ s ds ≤ γ −1 together with r = 0 in (p5), to obtain for all t ≥ 0: |B(t)| ≤ B(0) +

|||W |||2 (|||W ||| + )(C + D) sup |B(s)| + C(1 + )|||W |||. γ s≥0


Then since γ ≥ θ0 |||W |||2 we get (1 − C˜ 1 (|||W ||| + )) sup |B(s)| ≤ B(0) + C(1 + )|||W ||| s≥0

(C˜ 1 = (C +D)/θ0 ) which for sufficiently small |||W |||+ implies that B(t) is uniformly bounded for t > 0 . t/2 t Next we multiply (128) by t r1 and split the integrals into 0 + t/2 . We use (p5) with r = 0 for integrals up to t/2 and r = r1 for integrals from t/2. We obtain for all t ≥ 0, t r1 |B(t)| ≤ t r1 e−γ t B(0) + t r1 e−γ t/2 |||W |||2 (|||W ||| + ) t/2 e−γ (t/2−s) ds × (C + D) sup |B(s)| 0

s≥0

t

+(C + D)|||W |||2 (|||W ||| + )t r1 sups r1 |B(s)| s≥0

+C(1 + )|||W |||t r1 e−γ t/2

t/2

e

−γ (t/2−s)s −r1

e−γ (t−s) s −r1 ds

t/2

ds +

0

t

e−γ (t−s) s −r1 ds .

t/2

For the terms in the first row we take into account that B(t) and the product of a power function with positive exponent and a decaying exponential is always bounded and that t/2 −γ (t/2−s) ds ≤ γ −1 . For the terms in the second row we note that for s ≥ t/2 we 0 e −r 1 have s ≤ t/2 −r1 which helps annihilate the t r1 factor in front of the integrals. All in all we get t r1 |B(t)| ≤ C1 B(0) + C2 +C3

|||W |||2 (|||W ||| + ) γ

|||W |||2 (|||W ||| + ) sups r1 |B(s)| + C4 |||W |||, γ s≥0

which after maximizing over all t ≥ 0 in the left-hand side and using γ > θ0 |||W |||2 leads to 1−

C3 C2 (|||W ||| + ) supt r1 |B(t)| ≤ C1 B(0) + (|||W ||| + ) + C4 |||W |||, θ0 θ0 t≥0

which for sufficiently small |||W ||| + implies supt≥0 t r1 |B(t)| bounded.

Acknowledgement. This work was supported in part by a grant from the US National Science Foundation. Part of this work was done while E. Kirr was a Ph.D. student at the University of Michigan – Ann Arbor, and a participant in the Bell Labs/Lucent Student Intern Program. E.K. was also supported in part by the ASCI Flash Center at the University of Chicago. The authors wish to thank S. Golowich, P.D. Miller and A. Soffer for discussions on this work.

References

1. Albeverio, S., Gesztesy, F., Hoegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. New York: Springer-Verlag, 1988
2. Bohr, H.: Almost Periodic Functions. London: Chelsea, 1951
3. Cohen-Tannoudji, C., Dupont-Roc, J., Grynberg, G.: Atom-Photon Interactions. New York: Wiley, 1992
4. Costin, O., Costin, R.D., Lebowitz, J.L., Rokhlenko, A.: Evolution of a model quantum system under time periodic forcing: Conditions for complete ionization. Commun. Math. Phys. 221(1), 1–26 (2001)
5. Costin, O., Costin, R.D., Lebowitz, J.L., Rokhlenko, A.: Nonperturbative analysis of a model quantum system under time periodic forcing. C. R. Acad. Sci. Paris Sér. I Math. 332(5), 405–410 (2001)
6. Costin, O., Soffer, A.: Resonance theory for Schrödinger operators. Commun. Math. Phys. 224(1), 133–152 (2001)
7. Galindo, A., Pascual, P.: Quantum Mechanics II. Berlin-Heidelberg-New York: Springer, 1991
8. Grecchi, V., Sacchetti, A.: Critical metastability and destruction of the splitting in non-autonomous systems. J. Stat. Phys. 103, 339–368 (2001)
9. Harrell, E.M.: Double wells. Commun. Math. Phys. 75, 239–261 (1980)
10. Jensen, A., Kato, T.: Spectral properties of Schrödinger operators and time-decay of wave functions. Duke Math. J. 46, 583–611 (1979)
11. Kirr, E., Weinstein, M.I.: Diffusion of power in multimode systems with defects. In preparation
12. Kirr, E., Weinstein, M.I.: Parametrically excited Hamiltonian partial differential equations. SIAM J. Math. Anal. 33, 16–52 (2001)
13. Levitan, B.M., Zhikov, V.V.: Almost Periodic Functions and Differential Equations. Cambridge: Cambridge Univ. Press, 1982
14. Landau, L.D., Lifshitz, E.M.: Quantum Mechanics: Non-relativistic Theory, Volume 3 of Course of Theoretical Physics. Oxford: Pergamon Press, 1965
15. Marcuse, D.: Theory of Dielectric Optical Waveguides. London-New York: Academic Press, 1974
16. Miller, P.D., Soffer, A., Weinstein, M.I.: Metastability of breather modes of time dependent potentials. Nonlinearity 13, 507–568 (2000)
17. Murata, M.: Rate of decay of local energy and spectral properties of elliptic operators. Jpn. J. Math. 6, 77–127 (1980)
18. Reed, M., Simon, B.: Methods in Modern Mathematical Physics, I. Functional Analysis. New York: Academic Press, 1972
19. Soffer, A., Weinstein, M.I.: Time dependent resonance theory perturbations of embedded eigenvalues. Partial differential equations and their applications (Toronto, ON, 1995), 277–282, CRM Proc. Lecture Notes, 12. Providence, RI: Amer. Math. Soc., 1997
20. Soffer, A., Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998)
21. Soffer, A., Weinstein, M.I.: Time dependent resonance theory. Geom. Func. Anal. 8, 1086–1128 (1998)
22. Soffer, A., Weinstein, M.I.: Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations. Invent. math. 136, 9–74 (1999)
23. Soffer, A., Weinstein, M.I.: Ionization and scattering for short lived potentials. Lett. Math. Phys. 48, 339–352 (1999)
24. Vainberg, B.: Scattering of waves in a medium depending periodically on time. Asterisque 210, 327–340 (1992)
25. Weder, R.: Center manifold for nonintegrable nonlinear Schrödinger equations on the line. Commun. Math. Phys. 215, 343–356 (2000)
26. Wightman, A.S.: Superselection rules; old and new. Nuovo Cimento B 110, 751–769 (1995)
27. Yajima, K.: Scattering theory for Schrödinger operators with potentials periodic in time. J. Math. Soc. Japan 29, 729–743 (1977)
28. Yajima, K.: Resonances for the AC-Stark effect. Commun. Math. Phys. 78, 331–352 (1982)
29. Yajima, K.: A multichannel scattering theory for some time dependent hamiltonians, charge transfer problem. Commun. Math. Phys. 75, 153–178 (1980)

Communicated by J.L. Lebowitz

Commun. Math. Phys. 236, 373–393 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0822-8

Communications in

Mathematical Physics

Nonlinear Boundary Layers of the Boltzmann Equation: I. Existence

Seiji Ukai1, Tong Yang2, Shih-Hsien Yu2

1 Department of Applied Mathematics, Yokohama National University, Yokohama, Japan. E-mail: [email protected]
2 Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, P.R. China. E-mail: [email protected]; [email protected]

Received: 20 April 2002 / Accepted: 4 December 2002 Published online: 21 March 2003 – © Springer-Verlag 2003

Abstract: We study the half-space problem of the nonlinear Boltzmann equation, assigning the Dirichlet data for outgoing particles at the boundary and a Maxwellian as the far field. We will show that the solvability of the problem changes with the Mach number $M_\infty$ of the far Maxwellian. If $M_\infty < -1$, there exists a unique smooth solution connecting the Dirichlet data and the far Maxwellian for any Dirichlet data sufficiently close to the far Maxwellian. Otherwise, such a solution exists only for the Dirichlet data satisfying certain admissible conditions. The set of admissible Dirichlet data forms a smooth manifold of codimension 1 for the case $-1 < M_\infty < 0$, 4 for $0 < M_\infty < 1$ and 5 for $M_\infty > 1$, respectively. We also show that the same is true for the linearized problem at the far Maxwellian, and the manifold is, then, a hyperplane. The proof is essentially based on the macro-micro or hydrodynamics-kinetic decomposition of solutions combined with an artificial damping term and a spatially exponential decay weight.

1. Introduction and Main Result

The Dirichlet problem of the nonlinear Boltzmann equation in the half-space arises in the analysis of the kinetic boundary layer, the condensation-evaporation problem and other problems related to the kinetic behavior of gas near the wall, [5, 13]. The main concern is to find a solution which tends to an assigned Maxwellian at infinity. An interesting feature of this problem is that not all Dirichlet data are admissible and the number of admissible conditions changes with the far Maxwellian. This has been shown for the linear case by many authors [3, 6, 7, 9], mainly in the context of the classical Milne and Kramer problems. Recently, a nonlinear admissible condition was derived for the discrete velocity model in [15] and the stability of steady solutions was proven in [12], see also [10, 11]. The existence of solutions to the full nonlinear problem was established in [8] for the case of the specular reflection boundary condition, whose proof, however, does not work for the Dirichlet boundary condition, and in [2]

for this case, but with the ambiguity that the far Maxwellian cannot be fixed a priori. In this paper, we will establish the admissible conditions for the fixed far Maxwellian. Our proof also provides a new aspect of the linear problem. It should be mentioned that K. Aoki, Y. Sone and their group ([1, 13, 14] and references therein) made extensive numerical computations on the same nonlinear problem. Our result gives a partial explanation of their numerical results.

We are concerned with the steady state of a gas in the 3-dimensional half-space $D = \{(x, y, z) \in \mathbb{R}^3 \mid x > 0\}$, in which the mass density $F$ of gas particles is assumed constant on each plane parallel to the boundary $\partial D$ although the particle motion is 3-dimensional, that is, $F$ is assumed to be a function of position $x$ (but not of $y, z$) and particle velocity $\xi = (\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$. Let $\xi_1$ stand for the velocity component along the $x$-axis. Then, $F$ is governed by

$$
\begin{cases}
\xi_1 F_x = Q(F, F), & x > 0,\ \xi \in \mathbb{R}^3,\\
F|_{x=0} = F_0(\xi), & \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2,\\
F \to M_\infty(\xi)\ (x \to \infty), & \xi \in \mathbb{R}^3.
\end{cases}
\tag{1.1}
$$

Here, $Q$, the collision operator, is a bilinear integral operator

$$
Q(F, G) = \int_{\mathbb{R}^3 \times S^2} \big( F(\xi')G(\xi_*') - F(\xi)G(\xi_*) \big)\, q(\xi - \xi_*, \omega)\, d\xi_*\, d\omega,
\tag{1.2}
$$

with

$$
\xi' = \xi - [(\xi - \xi_*) \cdot \omega]\, \omega, \qquad \xi_*' = \xi_* + [(\xi - \xi_*) \cdot \omega]\, \omega,
\tag{1.3}
$$

where "$\cdot$" is the inner product of $\mathbb{R}^3$. We restrict ourselves to the hard sphere gas for which the collision kernel $q$ is given by $q(\zeta, \omega) = \sigma_0 |\zeta \cdot \omega|$, where $\sigma_0$ is the surface area of the hard sphere. Here we shall recall two classical properties of $Q$ which will be needed later. See [4, 5] for details.

(i) $Q(F) = 0$ if and only if

$$
F = M[\rho, u, T](\xi) \equiv \frac{\rho}{(2\pi T)^{3/2}} \exp\Big( -\frac{|\xi - u|^2}{2T} \Big),
\tag{1.4}
$$

for any constants $\rho, T > 0$ and $u = (u_1, u_2, u_3) \in \mathbb{R}^3$. This is called a Maxwellian and is the distribution function of a gas in the equilibrium state with the mass density $\rho$, flow velocity $u$ and temperature $T$.

(ii) A function $\phi(\xi)$ is called a collision invariant of $Q$ if

$$
\langle \phi, Q(F) \rangle = 0 \quad \text{for all } F,
$$

$\langle \cdot, \cdot \rangle$ being the inner product of $L^2(\mathbb{R}^3)$. $Q$ has five collision invariants

$$
\phi_0 = 1, \qquad \phi_i = \xi_i\ (i = 1, 2, 3), \qquad \phi_4 = |\xi|^2,
\tag{1.5}
$$

which indicate the conservation of mass, momentum and energy in the course of the binary collision of particles.

The second equation in (1.1) is the Dirichlet boundary condition. The Dirichlet data $F_0(\xi)$ can be assigned only for incoming particles ($\xi_1 > 0$), but not for outgoing ones

($\xi_1 < 0$) because then the problem becomes ill-posed, as will be seen from the a priori estimate stated in the next section. This corresponds to the physical situation that one can control the incoming distribution but not the outgoing one, on the wall.

It is clear that the far field $M_\infty(\xi)$ in the third equation of (1.1) cannot be assigned arbitrarily but must be a zero of $Q$, and hence a Maxwellian. Thus, we must take

$$
M_\infty = M[\rho_\infty, u_\infty, T_\infty],
\tag{1.6}
$$

with some constants $\rho_\infty > 0$, $u_\infty = (u_{\infty,1}, u_{\infty,2}, u_{\infty,3}) \in \mathbb{R}^3$, and $T_\infty > 0$, which are the only quantities we can control. By a shift of the variable $\xi$ in the direction orthogonal to the $x$-axis, we can assume without loss of generality that $u_{\infty,2} = u_{\infty,3} = 0$, and then the sound speed and Mach number of this equilibrium state are given by

$$
c_\infty = \sqrt{\tfrac{5}{3} T_\infty}, \qquad M_\infty = \frac{u_{\infty,1}}{c_\infty},
\tag{1.7}
$$

respectively, see [5]. Note that the flow at infinity is incoming (resp. outgoing) if $M_\infty < 0$ (resp. $> 0$) and supersonic (resp. subsonic) if $|M_\infty| > 1$ (resp. $< 1$).

We will see that the Mach number $M_\infty$ provides significant changes on the solvability of our problem (1.1). Indeed, since the third equation of (1.1) specifies the "boundary data" $M_\infty(\xi)$ at $x = \infty$ for all $\xi$, it is over-determined (ill-posed), and as a consequence, (1.1) may not be solvable unconditionally. Actually, we will show that the number of solvability conditions changes with the Mach number $M_\infty$. To state this precisely, set

$$
n^+ =
\begin{cases}
0, & M_\infty < -1,\\
1, & -1 < M_\infty < 0,\\
4, & 0 < M_\infty < 1,\\
5, & 1 < M_\infty,
\end{cases}
\tag{1.8}
$$

and introduce the weight function

$$
W_\beta(\xi) = (1 + |\xi|)^{-\beta} \big\{ M[1, u_\infty, T_\infty](\xi) \big\}^{1/2},
\tag{1.9}
$$

with $\beta \in \mathbb{R}$. Our main result is

Theorem 1.1. Suppose $M_\infty \ne 0, \pm 1$, and let $\beta > 3/2$. Then, there exist positive numbers $\epsilon_0, \sigma, C_0$, and a $C^1$ map

$$
\Psi : L^2(\mathbb{R}^3_+) \longrightarrow \mathbb{R}^{n^+}, \qquad \Psi(0) = 0,
\tag{1.10}
$$

such that the following holds.

(i) For any $F_0$ satisfying

$$
|F_0(\xi) - M_\infty(\xi)| \le \epsilon_0 W_\beta(\xi), \qquad \xi \in \mathbb{R}^3_+,
\tag{1.11}
$$

and

$$
\Psi(F_0 - M_\infty) = 0,
\tag{1.12}
$$

the problem (1.1) has a unique solution $F$ in the class

$$
|F(x, \xi) - M_\infty(\xi)| + |\xi_1 F_x(x, \xi)| \le C_0 e^{-\sigma x} W_\beta(\xi), \qquad x > 0,\ \xi \in \mathbb{R}^3.
\tag{1.13}
$$

(ii) The set of $F_0$ satisfying (1.11) and (1.12) forms a (local) $C^1$ manifold of codimension $n^+$.
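As a small illustration of (1.7) and (1.8), the following Python sketch computes the Mach number of a far Maxwellian and looks up the codimension $n^+$ appearing in Theorem 1.1. The function names are hypothetical and chosen only for this illustration. The excluded values $M_\infty = 0, \pm 1$ raise an error, since the theorem does not cover them.

```python
import math


def sound_speed(t_inf: float) -> float:
    """Sound speed c_inf = sqrt(5/3 * T_inf), as in (1.7)."""
    return math.sqrt(5.0 * t_inf / 3.0)


def mach_number(u1_inf: float, t_inf: float) -> float:
    """Mach number M_inf = u_{inf,1} / c_inf, as in (1.7)."""
    return u1_inf / sound_speed(t_inf)


def n_plus(mach: float) -> int:
    """Number of solvability conditions n^+ from the case table (1.8)."""
    if mach in (-1.0, 0.0, 1.0):
        raise ValueError("M_inf = 0, +1, -1 are not covered by Theorem 1.1")
    if mach < -1.0:
        return 0
    if mach < 0.0:
        return 1
    if mach < 1.0:
        return 4
    return 5
```

For instance, a supersonic incoming far field (`mach < -1`) gives `n_plus(mach) == 0`, which corresponds to the unconditional solvability case of the theorem.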

Remark 1.2. The theorem does not cover the cases $M_\infty = 0, \pm 1$.

Remark 1.3. For each given $M_\infty$, (1.11) is a smallness condition on the deviation of $F_0$ from $M_\infty$, whereas (1.12) gives restrictions on $F_0$, however small it may be, if $n^+ \ne 0$. Thus, our theorem says that the problem (1.1) is solvable unconditionally for any $F_0$ sufficiently close to $M_\infty$ if $M_\infty < -1$ but otherwise not. A physical explanation of this is that if $M_\infty < -1$, any phenomena near the boundary cannot affect the far field, while if $M_\infty > -1$, a part of them can propagate to infinity and affect the far field.

Remark 1.4. In the numerical works made in [1, 13, 14] and references therein, the Dirichlet data $F_0$ is fixed to be the standard Maxwellian $M[1, 0, 1](\xi)$ (of course for $\xi_1 > 0$), and values of the 5 parameters $(\rho_\infty, M_\infty, u_{2,\infty}, u_{3,\infty}, T_\infty)$ of the far Maxwellian (1.6) are sought numerically which admit smooth solutions connecting $F_0$ and $M_\infty$. The conclusion is that the set of such admissible values is, in the parameter space $\mathbb{R}^5$, a union of a 5-dimensional subdomain in the domain $M_\infty < -1$, a 4-dimensional surface in $-1 < M_\infty < 0$ and a 1-dimensional curve in $0 < M_\infty < 1$, whereas no solutions are found if $M_\infty > 1$. Our theorem agrees with this for the case $M_\infty < 1$ in the sense that the above mentioned regions of admissible values have the codimension just equal to $n^+$ of (1.8) in $\mathbb{R}^5$. For the case $M_\infty > 1$, $F_0 = M[1, 0, 1]$ may not be on the manifold defined by (1.12), and hence there are no solutions.

Remark 1.5. The stability of the stationary solutions obtained in Theorem 1.1 is an important issue. In our forthcoming paper, we will show their exponentially asymptotic stability for the case $M_\infty < -1$.

Our proof of Theorem 1.1 relies on the analysis of the corresponding linearized problem at $M_\infty$, and also provides a different aspect of the linear problems discussed in [3, 6, 7, 9]. First, we shall look for the solution of (1.1) in the form

$$
F(x, \xi) = M_\infty(\xi) + W_0(\xi) f(x, \xi),
\tag{1.14}
$$

where $W_0$ is the weight of (1.9) with $\beta = 0$. Then, the problem (1.1) reduces to

$$
\begin{cases}
\xi_1 f_x - Lf = \Gamma(f), & x > 0,\ \xi \in \mathbb{R}^3,\\
f|_{x=0} = a_0(\xi), & \xi \in \mathbb{R}^3_+,\\
f \to 0\ (x \to \infty), & \xi \in \mathbb{R}^3,
\end{cases}
\tag{1.15}
$$

where

$$
Lf = W_0^{-1} \big[ Q(M_\infty, W_0 f) + Q(W_0 f, M_\infty) \big], \qquad
\Gamma(f) = W_0^{-1} Q(W_0 f, W_0 f), \qquad
a_0 = W_0^{-1} (F_0 - M_\infty).
$$

The operator $L$ is linear while the remainder $\Gamma$ is quadratic. Now, the linearized problem of (1.1) at $M_\infty$ is just (1.15) with the term $\Gamma(f)$ dropped,

$$
\begin{cases}
\xi_1 f_x - Lf = 0, & x > 0,\ \xi \in \mathbb{R}^3,\\
f|_{x=0} = a_0(\xi), & \xi \in \mathbb{R}^3_+,\\
f \to 0\ (x \to \infty), & \xi \in \mathbb{R}^3.
\end{cases}
\tag{1.16}
$$

We can get the following linear version of Theorem 1.1.

Theorem 1.6. Suppose $M_\infty \ne 0, \pm 1$. Then, there exist $n^+$ functions $r_i$, $1 \le i \le n^+$, of $L^2(\mathbb{R}^3_+, \xi_1 d\xi)$ and positive numbers $\sigma, C_0$ such that for any $a_0 \in R^\perp$ with $R = \mathrm{span}\{r_1, r_2, \cdots, r_{n^+}\}$, the linearized problem (1.16) has a unique solution of the form $f = e^{-\sigma x} g$ with $g$ satisfying

$$
\big|\, |\xi_1|^{1/2} g^0 \big|_- + \big\| (1 + |\xi|)^{1/2} g \big\| + \big\| |\xi_1| (1 + |\xi|)^{-1/2} g_x \big\| \le C_0 \big|\, |\xi_1|^{1/2} a_0 \big|_+,
\tag{1.17}
$$

where $g^0 = g|_{\partial D} = g(0, \xi)$ while $|\cdot|_\pm$ and $\|\cdot\|$ denote the norms of $L^2(\mathbb{R}^3_\pm)$ and $L^2((0, \infty) \times \mathbb{R}^3)$, respectively.

Remark 1.7. This theorem says that for the linear problem (1.16), the map $\Psi$ of Theorem 1.1 is linear and the manifold of admissible $a_0$ is the hyperplane $R^\perp$. Actually, $\Psi$ has the form

$$
\Psi_{\mathrm{lin}}(a) = \big( \langle \xi_1 r_i, a \rangle_+ \big)_{i = 1, 2, \cdots, n^+},
\tag{1.18}
$$

where $\langle \cdot, \cdot \rangle_+$ denotes the inner product of $L^2(\mathbb{R}^3_+)$.

In order to compare our result with the ones known so far, recall the linear operator $L$ and put $N = \ker L$. It is classical that

$$
N = \mathrm{span}\{ W_0(\xi) \phi_i(\xi) \}_{i = 0, 1, \cdots, 4},
\tag{1.19}
$$

where $\phi_i$ is as in (1.5), and thus $N$ is a 5-dimensional subspace of $L^2(\mathbb{R}^3)$. Let $N^\perp$ be the orthogonal complement of $N$ and let

$$
P_0 : L^2(\mathbb{R}^3) \to N, \qquad P_1 : L^2(\mathbb{R}^3) \to N^\perp,
$$

be the orthogonal projections. Define the operator $A = P_0 \xi_1 P_0$, which is a bounded self-adjoint linear operator on the 5-dimensional space $N$. It is easy to see that $A$ has the eigenvalues

$$
\lambda_1 = u_{\infty,1} - c_\infty, \qquad \lambda_i = u_{\infty,1}\ (i = 2, 3, 4), \qquad \lambda_5 = u_{\infty,1} + c_\infty,
\tag{1.20}
$$

on $N$. Define

$$
I^+ = \{ j \mid \lambda_j > 0 \}, \qquad I^- = \{ j \mid \lambda_j < 0 \}.
$$

Note that $n^+$ of (1.8) is just $\# I^+$. Let $\chi_j$ be the eigenfunction corresponding to the eigenvalue $\lambda_j$. In [7], the following is proved.

Theorem 1.8 ([7]). For any $a_0 \in L^2(\mathbb{R}^3_+, \xi_1 d\xi)$ and for any constants $c_j$, $j \in I^-$, there exists a unique $L^2$ solution $f$ satisfying the first two equations of (1.16) and, instead of the last one, the auxiliary condition

$$
\langle \chi_j, f(x, \cdot) \rangle = c_j, \qquad x > 0,\ j \in I^-.
$$

Moreover, there exists an element $f_\infty \in N$ such that $f \to f_\infty$ $(x \to \infty)$ in $L^2(\mathbb{R}^3)$.
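The index sets $I^\pm$ built from the eigenvalues (1.20) can be tabulated directly. The sketch below is only an illustration with hypothetical function names: it lists the five eigenvalues for given $u_{\infty,1}$ and $T_\infty$ and returns the indices of the positive and negative ones, so that `len(I_plus)` reproduces $n^+$.

```python
import math


def eigenvalues_of_A(u1_inf: float, t_inf: float) -> list:
    """Eigenvalues (1.20) of A = P_0 xi_1 P_0, ordered as lambda_1..lambda_5."""
    c_inf = math.sqrt(5.0 * t_inf / 3.0)  # sound speed from (1.7)
    return [u1_inf - c_inf, u1_inf, u1_inf, u1_inf, u1_inf + c_inf]


def index_sets(u1_inf: float, t_inf: float):
    """Return (I_plus, I_minus): 1-based indices j with lambda_j > 0, resp. < 0."""
    lam = eigenvalues_of_A(u1_inf, t_inf)
    i_plus = [j for j, value in enumerate(lam, start=1) if value > 0]
    i_minus = [j for j, value in enumerate(lam, start=1) if value < 0]
    return i_plus, i_minus
```

With $T_\infty = 3$ (so $c_\infty = \sqrt{5}$) and $u_{\infty,1} = 1$, the flow is subsonic and outgoing, and `index_sets(1.0, 3.0)` gives four positive indices and one negative index.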

Since (1.16) is linear and $f_\infty \in N = \ker L$, we see that $\tilde f = f - f_\infty$ solves all of the three equations in (1.16) with $a_0$ replaced by $a_0 - f_\infty$. Now Theorem 1.6 concludes $a_0 - f_\infty \in R^\perp$.

Remark 1.9. The far field data varies with respect to the boundary data prescribed at $x = 0$ in Theorem 1.8. In contrast to this, the far field data is frozen in Theorem 1.6, so that the boundary data is subjected to a solvability condition instead of being arbitrarily given. From the viewpoint of the linear problem, these two theorems are almost equivalent. However, there are some essential differences in their approaches toward the theorems and in the applications to nonlinear problems. When the far field data is frozen, a nonlinear problem can be considered as a perturbation, and the approach in Theorem 1.6 can be directly applied.

In the rest of this section, we describe the main idea in the proof of Theorem 1.1 together with the plan of this paper. There are two ingredients in our proof. One is to add an artificial "damping" term and the other is to introduce the spatial weight function $e^{\sigma x}$, $\sigma > 0$. To construct the damping term, decompose the operator $A$ on $N$ into the positive and negative parts $A^+, A^-$, and denote the corresponding eigen-projections by $P_0^+, P_0^-$. Note that if $M_\infty \ne 0, \pm 1$, then $A$ has no zero eigenvalues (see (1.20)), so that

$$
A = A^+ + A^-, \qquad P_0 = P_0^+ + P_0^-.
$$

We modify (1.15) by adding the damping term defined by

$$
-\gamma P_0^+ \xi_1 f, \qquad \gamma > 0,
$$

and then rewrite it by putting $f = e^{-\sigma x} g$, to deduce

$$
\begin{cases}
\xi_1 g_x - \sigma \xi_1 g - Lg = h - \gamma P_0^+ \xi_1 g, & x > 0,\ \xi \in \mathbb{R}^3,\\
g|_{x=0} = a_0(\xi), & \xi \in \mathbb{R}^3_+,\\
g \to 0\ (x \to \infty), & \xi \in \mathbb{R}^3,
\end{cases}
\tag{1.21}
$$

with

$$
h = e^{-\sigma x} \Gamma(g).
\tag{1.22}
$$

Note that for the case $M_\infty < -1$, we have $n^+ = 0$ and $P_0^+ = 0$, and hence no damping term. The existence of the solution to (1.21) will be proved and stated in Theorem 4.3. Take the inner product of (1.21) and $g$ in $L^2(\mathbb{R}_+ \times \mathbb{R}^3)$ and integrate by parts, to deduce

$$
\langle |\xi_1| g^0, g^0 \rangle_- + (Bg, g) - (Lg, g) = \langle \xi_1 a_0, a_0 \rangle_+ + (h, g),
\tag{1.23}
$$

where $g^0 = g|_{x=0}$ while

$$
B = -\sigma \xi_1 + \gamma P_0^+ \xi_1, \qquad \sigma, \gamma > 0.
$$

Seemingly, this has no good sign, but it does on $N$ if $\gamma > \sigma > 0$, as seen from

$$
P_0 B P_0 = -\sigma A + \gamma A^+ = -\sigma A^- + (\gamma - \sigma) A^+.
$$

On the other hand, it is classical that $L$ is negative definite on $N^\perp$. Thus, (1.23) leads to the estimate (1.17) with a sufficiently small $\sigma > 0$ and the choice, say, $\gamma = 2\sigma$.
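The sign argument for $P_0 B P_0$ can be checked eigenvalue by eigenvalue. The following sketch is an illustration only (the function name is hypothetical): for an eigenvalue $\lambda$ of $A$ it splits $\lambda$ into its positive and negative parts and evaluates $-\sigma\lambda^- + (\gamma - \sigma)\lambda^+$, which equals $-\sigma\lambda + \gamma\lambda^+$ and is positive for every nonzero $\lambda$ as soon as $\gamma > \sigma > 0$.

```python
def damped_eigenvalue(lam: float, sigma: float, gamma: float) -> float:
    """Eigenvalue of P_0 B P_0 on the eigenspace of A with eigenvalue lam.

    Uses the splitting lam = lam_plus + lam_minus with
    lam_plus = max(lam, 0) and lam_minus = min(lam, 0).
    """
    lam_plus = max(lam, 0.0)
    lam_minus = min(lam, 0.0)
    return -sigma * lam_minus + (gamma - sigma) * lam_plus


# Example with the choice gamma = 2 * sigma mentioned in the text.
sigma = 0.1
gamma = 2.0 * sigma
values = [damped_eigenvalue(lam, sigma, gamma) for lam in (-2.0, -0.5, 0.5, 2.0)]
```

If instead $\gamma < \sigma$, the positive eigenvalues of $A$ produce negative values, which is why the condition $\gamma > \sigma$ is needed.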

The estimate thus obtained is enough to construct our solutions. In Sect. 2, applying the same estimate for the adjoint problem to (1.21), together with the Riesz representation theorem, we will show the existence of a weak $L^2$ solution $g$ to (1.21). Furthermore, taking suitable test functions, we can establish the "weak=strong" theorem. Thus $g$ is a unique strong solution and satisfies (1.17). In Sect. 3, starting from this estimate and using the bootstrap argument, the estimate of the $L^\infty$ norm of $g$ will be derived in terms of those of $h$ and $a_0$. Then, the contraction argument is applied to the modified nonlinear problem (1.21) with (1.22), whence follows the existence of $L^\infty$ solutions $g$ for sufficiently small $a_0$. In the case $M_\infty < -1$, this $g$ gives the solutions to (1.15) and hence to the original problem (1.1).

The case $M_\infty > -1$ will be solved in Sect. 4 as follows. Clearly, if

$$
P_0^+ \xi_1 g = 0, \qquad x > 0,\ \xi \in \mathbb{R}^3,
\tag{1.24}
$$

then $g$ is also a solution of the original problem without the extra damping term. We will show that the condition (1.24) reduces to

$$
P_0^+ \xi_1 g^0 = 0, \qquad \xi \in \mathbb{R}^3,
\tag{1.25}
$$

where $g^0 = g|_{x=0}$. Clearly, $g$ and hence $g^0$ are determined uniquely by the boundary data $a_0$. Write

$$
\Psi(a_0) = P_0^+ \xi_1 g^0.
\tag{1.26}
$$

Identifying the space $P_0^+ N$ with $\mathbb{R}^{n^+}$, we will show that (1.26) defines a $C^1$ map

$$
\Psi : L^2(\mathbb{R}^3_+, \xi_1 d\xi) \to \mathbb{R}^{n^+},
\tag{1.27}
$$

and $\Psi(0) = 0$. Moreover, we will have

Proposition 1.10. The Fréchet derivative of $\Psi$ at $a_0 = 0$ is given by (1.18).

Finally, using this and the implicit function theorem, we will show that the set of $a_0$'s satisfying $\Psi(a_0) = 0$ forms a $C^1$ manifold of codimension $n^+$, whence Theorem 1.1 follows.

2. Linear Existence with Damping

In this section, we will prove the existence of solutions for the linearized Boltzmann equation with damping by the energy method. The problem we consider here is obtained by replacing the nonlinear term $e^{-\sigma x} \Gamma(g)$ in (1.21) by a given function $h(x, \xi)$ as follows:

$$
\begin{cases}
\xi_1 g_x - \sigma \xi_1 g - Lg = h - \gamma P_0^+ \xi_1 g, & \xi \in \mathbb{R}^3,\\
g|_{x=0} = a_0(\xi) & \text{for } \xi \in \mathbb{R}^3_+,\\
g \to 0 & \text{as } x \to \infty,
\end{cases}
\tag{2.1}
$$

where $h$ satisfies

$$
P_0 h = 0, \qquad \|h\| < \infty.
\tag{2.2}
$$

The idea of the proof is to introduce a linear functional on a subspace of $L^2_{x,\xi}$. Then we will show that the linear functional is bounded by energy estimates. Finally the existence of the solution is obtained by the Riesz representation theorem. The linear functional $\ell$ is defined on a linear space $W$ as follows:

$$
\ell(\chi) = (h, \phi) + \langle \xi_1 a_0, \phi^0 \rangle_+,
\tag{2.3}
$$

where

$$
\begin{aligned}
V &\equiv \{ \phi \in C_0^\infty([0, \infty) \times \mathbb{R}^3) \mid \phi^0 = \phi|_{x=0} = 0 \ \text{for } \xi_1 < 0 \},\\
W &\equiv \{ \chi \mid \chi = -\xi_1 \phi_x - \sigma \xi_1 \phi + \gamma P_0^+ \xi_1 \phi - L\phi,\ \phi \in V \}.
\end{aligned}
\tag{2.4}
$$

For the hard sphere collision, we have the following estimate for the linearized operator on the micro-component of the solutions to the Boltzmann equation, cf. [13, 3]. This estimate is used crucially in this paper; how to deal with other collision kernels by the energy method is beyond the scope of this paper.

Lemma 2.1. There exists $c_0 > 0$ such that for any $f \in N^\perp$, we have

$$
c_0 \langle f, (1 + |\xi|) f \rangle \le -\langle f, Lf \rangle.
$$

Proof. It is known that the linear operator $L$ has the following property on the microscopic subspace $N^\perp$, i.e., for any $f \in N^\perp$,

$$
-\langle f, Lf \rangle \ge \nu_0 \langle f, f \rangle,
$$

where $\nu_0$ is a positive constant. Notice also that $L = -\nu(|\xi|) + K$, with $\nu(|\xi|) \sim |\xi|$ as $|\xi| \to \infty$, and $K$ is a compact operator from $L^2_\xi$ to itself. We can choose a small constant $\epsilon$ so that

$$
\begin{aligned}
-\langle f, Lf \rangle &\ge (1 - \epsilon) \nu_0 \langle f, f \rangle + \epsilon \langle f, \nu(|\xi|) f \rangle - \epsilon \langle f, Kf \rangle\\
&\ge (1 - \epsilon) \nu_0 \langle f, f \rangle + \epsilon \langle f, \nu(|\xi|) f \rangle - c \epsilon \langle f, f \rangle\\
&\ge c_0 \langle f, (1 + |\xi|) f \rangle,
\end{aligned}
$$

and this completes the proof of the lemma.

We now claim that the linear functional $\ell$ is bounded on $W$ if $\gamma > \sigma > 0$ are sufficiently small. For this, we consider the inner product of $\chi \in W$ and $\phi \in V$. This yields the following equation after integration by parts:

$$
(\chi, \phi) = \langle \xi_1 \phi^0, \phi^0 \rangle_+ - \sigma (\xi_1 \phi, \phi) + \gamma (P_0^+ \xi_1 \phi, \phi) - (L\phi, \phi).
\tag{2.5}
$$

By assuming that the Mach number $M_\infty \ne \pm 1$ or $0$, we know that the matrix for $A = P_0 \xi_1 P_0$ is invertible on $\mathrm{Range}(P_0)$ and is in the form of

$$
\begin{pmatrix}
u^1_\infty & \sqrt{T_\infty} & 0 & 0 & 0\\
\sqrt{T_\infty} & u^1_\infty & 0 & 0 & c_\infty\\
0 & 0 & u^1_\infty & 0 & 0\\
0 & 0 & 0 & u^1_\infty & 0\\
0 & c_\infty & 0 & 0 & u^1_\infty
\end{pmatrix},
$$

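where the entries are calculated by $(\psi_i, \xi_1 \psi_j)$, $i, j = 0, 1, \cdots, 4$. It is straightforward to calculate the eigenvalues of the above matrix, which are

$$
\lambda_1 = u^1_\infty - c_\infty, \qquad \lambda_i = u^1_\infty,\ i = 2, 3, 4, \qquad \lambda_5 = u^1_\infty + c_\infty.
$$

Notice that none of the eigenvalues is zero when $M_\infty \ne \pm 1$ or $0$. Thus, we have

$$
\begin{aligned}
-\sigma (\xi_1 \phi, \phi) + \gamma (P_0^+ \xi_1 \phi, \phi)
&= \sigma \big[ (-P_0 \xi_1 P_0 \phi_0, \phi_0) - 2 (P_0 \xi_1 \phi_1, \phi_0) - (\xi_1 \phi_1, \phi_1) \big]\\
&\quad + \gamma \big[ (P_0^+ \xi_1 \phi_0, \phi_0) + (P_0^+ \xi_1 \phi_1, \phi_0) \big]\\
&= \sigma (-P_0^- \xi_1 \phi_0, \phi_0) + (\gamma - \sigma)(P_0^+ \xi_1 \phi_0, \phi_0)\\
&\quad - 2\sigma (P_0 \xi_1 \phi_1, \phi_0) + \gamma (P_0^+ \xi_1 \phi_1, \phi_0) - \sigma (\xi_1 \phi_1, \phi_1)\\
&\ge 2c\sigma \|\phi_0\|^2 - c^{-1} \sigma \|\phi_1\| \|\phi_0\| - \sigma (\xi_1 \phi_1, \phi_1)\\
&\ge c\sigma \|\phi_0\|^2 - c^{-3} \sigma \|\phi_1\|^2 - \sigma (\xi_1 \phi_1, \phi_1),
\end{aligned}
\tag{2.6}
$$

where we have used the assumption that $\gamma = O(1)\sigma$ for some positive constant $O(1) > 1$. Here, $\phi_i = P_i \phi$, $i = 0, 1$, and in the following we will use $c$ to denote a generic constant. Since $(L\phi, \phi) = (L\phi_1, \phi_1)$, Lemma 2.1 implies that for sufficiently small $\sigma > 0$,

$$
(\chi, \phi) \ge c \big( \|\phi\|^2 + \langle \xi_1 \phi^0, \phi^0 \rangle_+ \big).
$$

Therefore, we have

$$
\big( \|\phi\|^2 + \langle \xi_1 \phi^0, \phi^0 \rangle_+ \big)^{1/2} \le c \|\chi\|.
$$

And this implies that

$$
\begin{aligned}
|\ell(\chi)| &\le |(h, \phi)| + |\langle \xi_1 a_0, \phi^0 \rangle_+|\\
&\le c \big( \|h\| + \langle \xi_1 a_0, a_0 \rangle_+^{1/2} \big) \big( \|\phi\|^2 + \langle \xi_1 \phi^0, \phi^0 \rangle_+ \big)^{1/2}\\
&\le c \big( \|h\| + \langle \xi_1 a_0, a_0 \rangle_+^{1/2} \big) \|\chi\|.
\end{aligned}
\tag{2.7}
$$

Hence $\ell$ is bounded and the bound depends on the boundary data and the given function $h$. By the same energy estimate given above, it is easy to see that for any given $\chi \in W$, there exists a unique $\phi \in V$ such that $\chi = -\sigma \xi_1 \phi + \gamma P_0^+ \xi_1 \phi - \xi_1 \phi_x - L\phi$. Therefore, the linear functional $\ell$ can be defined on the closure of the space $W$ with the same bound. Moreover, by the Hahn-Banach theorem we know that there is an extension $\bar\ell$ of $\ell$ to the space $L^2_{x,\xi}$ such that

$$
|\bar\ell(\chi)| \le c \|\chi\| \quad \text{for any } \chi \in L^2_{x,\xi},
$$

with the bound unchanged. For this functional $\bar\ell$ on $L^2$, we can apply the Riesz representation theorem and know that there exists a unique $g \in L^2_{x,\xi}$ such that

$$
\bar\ell(\chi) = (g, \chi).
\tag{2.8}
$$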

And this implies that $g$ is a weak solution to the linear equation (2.1). Now we want to prove that both $g_x$ for almost all $(x, \xi)$ and the trace of $g$ at $x = 0$ are well-defined. To do so, we choose a family of particular test functions $\phi$ as follows. Set

$$
\phi = \int_x^\infty \eta(x', \xi)\, dx',
$$

where $\eta$ satisfies $\int_0^\infty \eta(x', \xi)\, dx' = 0$ for any $\xi$. Applying this test function to (2.1) yields

$$
\Big( \int_0^x (-\sigma \xi_1 g + \gamma P_0^+ \xi_1 g - Lg - h)\, dx' + \xi_1 g,\ \eta \Big) = 0.
$$

By the choice of our test function $\eta$, we have

$$
\int_0^x (-\sigma \xi_1 g + \gamma P_0^+ \xi_1 g - Lg - h)\, dx' + \xi_1 g = b(\xi),
$$

for almost all $(x, \xi)$, where $b(\xi)$ is a function of $\xi$ only. Therefore, the trace of $g$ at $x = 0$ is well-defined and we have (2.1) for almost all $(x, \xi)$.

In order to prove the uniqueness of $g$, we need to show that $|\xi_1|^{1/2} g \in L^2$, cf. [3, 7] for a similar discussion. So far we know that $g \in L^2_{x,\xi}$. We choose a smooth cut-off function $\theta(\xi)$ satisfying $\theta(\xi) = 0$ for $|\xi| > M$, $\theta(\xi) = 1$ for $|\xi| < M - 1$, and which is monotone when $M - 1 \le |\xi| \le M$. Applying $\theta g$ to Eq. (2.1) and then integrating over $(x, \xi)$ gives

$$
\begin{aligned}
-\sigma (\theta \xi_1 g, g) + \gamma (\theta P_0^+ \xi_1 g, g) + (\theta \xi_1 g_x, g) - (\theta Lg, g) &= (\theta h, g),\\
-\sigma (\theta \xi_1 g, g) + \gamma (\theta P_0^+ \xi_1 g, g) - \langle \theta \xi_1 g, g \rangle|_{x=0} - (\theta Lg, g) &= (\theta h, g).
\end{aligned}
\tag{2.9}
$$

Since

$$
-(\theta Lg, g) = (\theta \nu g, g) - (\theta Kg, g),
$$

we have for sufficiently small $\sigma$,

$$
(\theta |\xi| g, g) + \langle \theta |\xi_1| g, g \rangle_{x=0,\ \xi_1 < 0} \le c \big( \|h\|^2 + \|g\|^2 + \langle \xi_1 a_0, a_0 \rangle_+ \big).
\tag{2.10}
$$

By letting $M \to \infty$, the above inequality implies that $(|\xi_1| g, g)$ is uniformly bounded. Therefore, the energy estimate similar to the one given above gives uniqueness of the solution $g$ to (2.1).

Finally, we will show that the solution obtained indeed satisfies the given boundary condition. In fact, if we take the inner product of Eq. (2.1) with $\phi \in V$, we have after integration by parts

$$
(g, -\xi_1 \phi_x - \sigma \xi_1 \phi + \gamma P_0^+ \xi_1 \phi - L\phi) = (h, \phi) + \langle \xi_1 g^0, \phi^0 \rangle_+.
\tag{2.11}
$$

By the definition of $\chi$ and the solution $g$, (2.8), we have

$$
(h, \phi) + \langle \xi_1 g^0, \phi^0 \rangle_+ = (h, \phi) + \langle \xi_1 a_0, \phi^0 \rangle_+,
$$

and this gives us $g^0 = a_0$ for $\xi_1 > 0$, a.e. In summary, we have the following existence theorem for the linearized equation with damping (2.1):

Theorem 2.2. Consider the linearized problem (2.1). If the boundary condition $a_0$ and the source term $h$ satisfy

$$
\langle \xi_1 a_0, a_0 \rangle_+^{1/2} + (h, h)^{1/2} < \infty,
$$

then there exists a unique solution $g \in L^2_{x,\xi}$ with

$$
\big|\, |\xi_1|^{1/2} g^0 \big|_- + \big\| (1 + |\xi|)^{1/2} g \big\| + \big\| |\xi_1| (1 + |\xi|)^{-1/2} g_x \big\| \le c_0 \big( \big|\, |\xi_1|^{1/2} a_0 \big|_+ + \|h\| \big).
$$

3. Nonlinear Existence with Damping

Now we turn to study the nonlinear Boltzmann equation with damping, i.e. problem (1.21). As in the discussion of the Cauchy problem, we use the weighted norm:

$$
\|f\|_\beta = \|(1 + |\xi|)^\beta f\|_{L^\infty_{x,\xi}} = \sup_{x \in \mathbb{R},\ \xi \in \mathbb{R}^3} (1 + |\xi|)^\beta |f(x, \xi)|,
$$

for any function $f(x, \xi)$ such that the above norm is finite. Notice that the norm used here is consistent with the weight function $W_\beta$ introduced in Sect. 1. In order to obtain the above $\|\cdot\|_\beta$ norm of the solution $g$, we will use the following weight $w_{-\alpha}$ in passing from the $L^2$ estimates of the solution to the linearized problem to the $\|\cdot\|_\beta$ estimates of the solution to the nonlinear problem. Set

$$
w_{-\alpha} =
\begin{cases}
|\xi_1|^{-\alpha}, & |\xi_1| < 1,\\
1, & |\xi_1| \ge 1.
\end{cases}
\tag{3.1}
$$

Precisely, we will prove the following lemma.

Lemma 3.1. For $0 < \alpha < \frac12$, $\beta > \frac32$, the solution to problem (1.21) satisfies

$$
\|g\|_\beta \le c \big( \|h\|_{\beta - 1} + \|h\| + \|w_{-\alpha} h\| + |a_0|_{+,\beta} \big),
\tag{3.2}
$$

where

$$
|a_0|_{+,\beta} = \sup_{\xi \in \mathbb{R}^3,\ \xi_1 > 0} (1 + |\xi|)^\beta |a_0(\xi)|.
$$

Proof. Equation (1.21) can be written as

$$
g_x = \sigma g + \frac{1}{\xi_1} Lg + \frac{1}{\xi_1} h - \frac{\gamma}{\xi_1} P_0^+ \xi_1 g
= \Big( \sigma - \frac{\nu(\xi)}{\xi_1} \Big) g + \frac{1}{\xi_1} (\bar K g + h),
$$

where

$$
\bar K = K - \gamma P_0^+ \xi_1.
\tag{3.3}
$$

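Notice that $K$ is a compact operator and is bounded from the space $L^\infty_\beta$ to $L^\infty_{\beta+1}$, and from $L^2_\xi$ to $L^\infty_\xi$, see [13]. Since the operator $P_0^+ \xi_1$ is a mapping to a subspace of the macroscopic part, which is of finite dimension, it is straightforward to prove that it has the same properties as $K$ stated above. Thus, the operator $\bar K$ shares the same properties as $K$. Let

$$
\kappa(x, \xi) = \Big( -\sigma + \frac{\nu(\xi)}{\xi_1} \Big) x.
$$

We have $\kappa(x, \xi) > 0$ both for $x > 0$, $\xi_1 > 0$ and $x < 0$, $\xi_1 < 0$ when $\sigma > 0$ is sufficiently small. Therefore, the solution $g$ can be formally written as

$$
g = \tilde a + U(\bar K g + h),
\tag{3.4}
$$

where

$$
\tilde a =
\begin{cases}
\exp(-\kappa(x, \xi))\, a_0(\xi), & \xi_1 > 0,\\
0, & \xi_1 < 0,
\end{cases}
\tag{3.5}
$$

and

$$
U(h) =
\begin{cases}
\displaystyle \int_0^x \exp(-\kappa(x - x', \xi))\, \frac{1}{\xi_1} h(x', \xi)\, dx', & \xi_1 > 0,\\[2mm]
\displaystyle -\int_x^\infty \exp(-\kappa(x - x', \xi))\, \frac{1}{\xi_1} h(x', \xi)\, dx', & \xi_1 < 0.
\end{cases}
\tag{3.6}
$$

We first claim that the operator $U$ has the following two properties:

$$
\begin{aligned}
\|U(h(\cdot, \xi))\|_{L^p_x} &\le c\, \nu(\xi)^{-1} \|h(\cdot, \xi)\|_{L^p_x},\\
\|U(h(\cdot, \xi))\|_{L^r_\xi(L^p_x)} &\le c\, \|\nu(\xi)^{-1} h(\cdot, \xi)\|_{L^r_\xi(L^p_x)},
\end{aligned}
\tag{3.7}
$$

where $1 \le p, r \le \infty$. To prove (3.7), we rewrite $U(h)$ as follows:

$$
U(h) = \int_{I_{\xi_1}} S(x - x', \xi)\, h(x', \xi)\, dx',
$$

where

$$
I_{\xi_1} =
\begin{cases}
[0, x] & \text{for } \xi_1 > 0,\\
[x, \infty) & \text{for } \xi_1 < 0,
\end{cases}
$$

and

$$
S(x', \xi) =
\begin{cases}
\displaystyle \frac{1}{\xi_1} \exp\Big( -\Big( -\sigma + \frac{\nu(\xi)}{\xi_1} \Big) x' \Big) & \text{for } x' > 0,\ \xi_1 > 0,\\[2mm]
\displaystyle -\frac{1}{\xi_1} \exp\Big( -\Big( -\sigma + \frac{\nu(\xi)}{\xi_1} \Big) x' \Big) & \text{for } x' < 0,\ \xi_1 < 0.
\end{cases}
$$

For the hard sphere collision kernel, we have

$$
\int_{I_{\xi_1}} |S(x - x', \xi)|\, dx' =
\begin{cases}
\displaystyle \int_0^x \frac{1}{\xi_1} \exp\Big( -\Big( -\sigma + \frac{\nu(\xi)}{\xi_1} \Big) x' \Big) dx' \le \frac{1}{\nu(\xi) - \sigma \xi_1}, & \xi_1 > 0,\\[2mm]
\displaystyle \int_0^\infty \frac{1}{|\xi_1|} \exp\Big( -\Big( \sigma + \frac{\nu(\xi)}{|\xi_1|} \Big) x' \Big) dx' = \frac{1}{\nu(\xi) + \sigma |\xi_1|}, & \xi_1 < 0.
\end{cases}
\tag{3.8}
$$

Hence for sufficiently small $\sigma$, we have

$$
\int_{I_{\xi_1}} |S(x - x', \xi)|\, dx' \le \frac{c}{\nu(\xi)}.
$$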
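The first case of (3.8) can be checked numerically. The sketch below is an illustration only: it treats $\nu$ as a given positive number (a hypothetical sample value, not the actual hard sphere collision frequency) and approximates $\int_0^x \xi_1^{-1} \exp(-(\nu/\xi_1 - \sigma)x')\,dx'$ for $\xi_1 > 0$ by the midpoint rule, then compares it with the bound $1/(\nu - \sigma\xi_1)$.

```python
import math


def kernel_mass(x: float, xi1: float, nu: float, sigma: float, steps: int = 20000) -> float:
    """Midpoint-rule value of int_0^x (1/xi1) exp(-(nu/xi1 - sigma) x') dx', xi1 > 0."""
    rate = nu / xi1 - sigma  # positive when sigma * xi1 < nu
    dx = x / steps
    total = 0.0
    for k in range(steps):
        midpoint = (k + 0.5) * dx
        total += math.exp(-rate * midpoint) / xi1
    return total * dx


def kernel_bound(xi1: float, nu: float, sigma: float) -> float:
    """Right-hand side 1 / (nu - sigma * xi1) of the first case in (3.8)."""
    return 1.0 / (nu - sigma * xi1)


# Sample values (assumptions for illustration): xi1 = 2, nu = 3, sigma = 0.1, x = 5.
approx = kernel_mass(5.0, 2.0, 3.0, 0.1)
bound = kernel_bound(2.0, 3.0, 0.1)
```

The exact value of the integral is $(1 - e^{-(\nu/\xi_1 - \sigma)x})/(\nu - \sigma\xi_1)$, so the bound is approached as $x \to \infty$ but never exceeded.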

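Now $U(h)$ can be estimated as follows. For $p, q \ge 1$ with $\frac1p + \frac1q = 1$, we have

$$
|U(h)(x, \xi)| \le \Big( \int_{I_{\xi_1}} |S(x - x', \xi)|\, dx' \Big)^{1/q} \Big( \int_{I_{\xi_1}} |S(x - x', \xi)|\, |h(x', \xi)|^p\, dx' \Big)^{1/p}.
$$

Thus,

$$
\begin{aligned}
\int_0^\infty |U(h)(x, \xi)|^p\, dx
&\le c\, \nu(\xi)^{-p/q} \int_0^\infty \int_{I_{\xi_1}} |S(x - x', \xi)|\, |h(x', \xi)|^p\, dx'\, dx\\
&\le c\, \nu(\xi)^{-p/q} \int_0^\infty \int_{I'_{\xi_1}} |S(x - x', \xi)|\, |h(x', \xi)|^p\, dx\, dx'\\
&\le c\, \nu(\xi)^{-p/q - 1} \int_0^\infty |h(x', \xi)|^p\, dx',
\end{aligned}
$$

where

$$
I'_{\xi_1} =
\begin{cases}
[x', \infty), & \xi_1 > 0,\\
[0, x'], & \xi_1 < 0.
\end{cases}
\tag{3.9}
$$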

And this gives (3.7). By using the properties of the operators $U$ and $\bar K$, we have

$$
\begin{aligned}
\|g\|_\beta &\le |a_0|_{+,\beta} + c \big( \|\nu^{-1} \bar K g\|_\beta + \|\nu^{-1} h\|_\beta \big)\\
&\le |a_0|_{+,\beta} + c \big( \|g\|_{\beta - 1} + \|\nu^{-1} h\|_\beta \big).
\end{aligned}
\tag{3.10}
$$

Iterating (3.10) gives

$$
\|g\|_\beta \le c \big( |a_0|_{+,\beta} + \|g\|_0 + |||h||| \big),
\tag{3.11}
$$

where

$$
|||h||| = \|\nu^{-1} h\|_\beta + \cdots + \|\nu^{-1} h\|_2 + \|\nu^{-1} h\|_1 \le c \|\nu^{-1} h\|_\beta,
$$

where $\beta \ge 1$. In order to obtain (3.2), we now need to estimate $\|g\|_0 = \|g\|_{L^\infty_{x,\xi}}$. By the energy estimate applied to the linear Boltzmann equation with damping (1.21), we have

$$
\|g\|^2 \le c \big( \|h\|^2 + \langle \xi_1 a_0, a_0 \rangle_+ \big) \equiv E_2^2.
\tag{3.12}
$$

In the following, we will give the estimate on $\|g\|_0$ by using the properties of the collision kernel for the hard sphere and the properties of the operators $U$ and $\bar K$. Let $\theta$ be the cut-off function in $\xi$ used above and $w = w(\xi) \ge 0$ be a weight function. Applying $\theta^2 w^2 g$ to (1.21) and integrating over $x$ and $\xi$, we have

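$$
\langle |\xi_1| \theta^2 w^2 g, g \rangle_- - \sigma (\xi_1 \theta^2 w^2 g, g) + (\nu \theta^2 w^2 g, g)
= (\bar K g, \theta^2 w^2 g) + (h, \theta^2 w^2 g) + \langle \xi_1 \theta^2 w^2 a_0, a_0 \rangle_+.
\tag{3.13}
$$

And this gives

$$
\big|\, |\xi_1|^{1/2} \theta w g^0 \big|_-^2 + \|\nu^{1/2} \theta w g\|^2
\le c \big( \|\theta w \bar K g\|\, \|\theta w g\| + \|\theta w h\|\, \|\theta w g\| + \langle \xi_1 \theta^2 w^2 a_0, a_0 \rangle_+ \big).
\tag{3.14}
$$

If $w \bar K$ is bounded from $L^2_{x,\xi}$ to itself, $wh \in L^2_{x,\xi}$ and $|\xi_1|^{1/2} w a_0 \in L^2_{\xi,+}$, then by letting $\theta(\xi) \to 1$ and using the Cauchy inequality we have

$$
\|\nu^{1/2} w g\| \le c \big( E_2 + \|wh\| + \big|\, |\xi_1|^{1/2} w a_0 \big|_+ \big).
\tag{3.15}
$$

In the following, we will choose some appropriate weights to obtain our desired estimates. First, if we let $w(\xi) \equiv 1$, then we have

$$
\|\nu^{1/2} g\|_{L^2_{x,\xi}} \le c E_2.
\tag{3.16}
$$

Notice that for $\alpha < 1$,

$$
\int \bar K(\xi, \xi')\, w_{-\alpha}(\xi')\, d\xi' \le c.
$$

We have that both operators $w_{-\alpha} \bar K$ and $\bar K w_{-\alpha}$ are bounded from $L^2_\xi$ to $L^2_\xi$ when $\alpha < \frac12$. Hence they are bounded from $L^2_\xi(L^\infty_x)$ to $L^2_\xi(L^\infty_x)$. Thus, if we choose $w = w_{-\alpha}(\xi)$ in (3.15) for $0 < \alpha < \frac12$,

$$
\|w_{-\alpha} g\| \le c \big( E_2 + \|w_{-\alpha} h\| + \big|\, |\xi_1|^{1/2} w_{-\alpha} a_0 \big|_+ \big) \equiv c \tilde E_2.
\tag{3.17}
$$

By using Eq. (1.21) and (3.16), we have

$$
\|w_1 g_x\| \le c E_2.
\tag{3.18}
$$

Let $\frac14 < \beta < \frac12$, i.e., $0 < 1 - 2\beta < \frac12$. Then we have

$$
\begin{aligned}
\|w_\beta g\|^2_{L^\infty_x(L^2_\xi)} &\le \|w_\beta g\|^2_{L^2_\xi(L^\infty_x)}\\
&\le 2 \int \big\{ \|w_\beta^2 w_1^{-1} g(\cdot, \xi)\|_{L^2_x} \big\} \big\{ \|w_1 g_x(\cdot, \xi)\|_{L^2_x} \big\}\, d\xi
\le c E_2 \tilde E_2.
\end{aligned}
\tag{3.19}
$$

Now by the expression of $g$, (3.4), and using (3.19) we have

$$
\begin{aligned}
\|g\|_{L^2_\xi(L^\infty_x)} &\le |a_0|_+ + c \big( \|\nu^{-1} \bar K w_{-\beta} w_\beta g\|_{L^2_\xi(L^\infty_x)} + \|\nu^{-1} h\|_{L^2_\xi(L^\infty_x)} \big)\\
&\le |a_0|_+ + c \big( \|w_\beta g\|_{L^2_\xi(L^\infty_x)} + \|\nu^{-1} h\|_{L^2_\xi(L^\infty_x)} \big)\\
&\le |a_0|_+ + c \big( E_2 + \tilde E_2 + \|\nu^{-1} h\|_{L^2_\xi(L^\infty_x)} \big).
\end{aligned}
\tag{3.20}
$$

By using the expression of $g$ again and noticing that $\bar K$ is a bounded operator from $L^2_\xi(L^\infty_x)$ to $L^\infty_{x,\xi}$, we have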


¯ + h)L∞ gL∞ ≤ a0 +,L∞ + U (Kg x,ξ ξ x,ξ ¯ L∞ + ν −1 hL∞ ) ≤ a0 +,L∞ + c( Kg ξ x,ξ x,ξ

+ c(gL2 (L∞ ) + ν −1 hL∞ ) ≤ a0 +,L∞ ξ x,ξ x

ξ

≤ |a0 |+,L∞ + c(E2 + E2 + ν −1 hL2 (L∞ ) + ν −1 hL∞ ). ξ x,ξ ξ

(3.21)

x

By using (3.21), when β − 1 > 23 , we have gβ ≤ c(|a0 |+,β + E2 + E2 + ν −1 hβ + ν −1 hL2 (L∞ )∩L∞ ) ξ

x

x,ξ

≤ c(|a0 |+,β + hL2 + w−α h + ν −1 hβ + ν −1 hL2 (L∞ )∩L∞ ) x,ξ

≤ c(|a0 |+,β + ν −1 hβ + h + w−α h). This completes the proof of the lemma.

ξ

x

x,ξ

(3.22)

Now we are ready to prove existence of the solution of the nonlinear Boltzmann equation with damping (1.21). Before that, we include a lemma on the properties of $\Gamma(g,h)$ which was proved in [4].

Lemma 3.2. The projection of $\Gamma(g,h)$ on the null space of $L$ vanishes and there exists a positive constant $c$ such that

\[
\|\nu^{-1}\Gamma(g,h)\|_\beta \le c\,\|h\|_\beta\,\|g\|_\beta, \tag{3.23}
\]

for any $\beta > 0$.

Therefore, for $\beta > 5/2$, we have $\|\exp(-\sigma x)\nu^{-1}\Gamma(g)\|_\beta \le c\|g\|_\beta^2$, $\|\exp(-\sigma x)\nu^{-1}\Gamma(g)\| \le c\|g\|_\beta^2$, and for $0 < \alpha < 1/2$, we have $\|w_{-\alpha}\exp(-\sigma x)\Gamma(g)\| \le c\|g\|_\beta^2$. In summary, (3.2) gives

\[
\|g\|_\beta \le c\big(\|g\|_\beta^2 + |a_0|_{+,\beta}\big). \tag{3.24}
\]

When $|a_0|_{+,\beta}$ is sufficiently small, the contraction mapping theorem and (3.24) give the following existence theorem for problem (1.21).

Theorem 3.3. When the boundary data $a_0$ is sufficiently small in the norm $|\cdot|_{+,\beta}$ given in (3.3), the nonlinear Boltzmann equation with damping (1.21) has a unique solution $g$ which is bounded in the norm $\|\cdot\|_\beta$ when $\beta > 5/2$.

Note that this theorem in fact gives the part of Theorem 1.1 for $M_\infty < -1$.
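To make the role of the contraction mapping theorem concrete, the following sketch records the standard smallness computation behind an estimate of the form (3.24). It is an illustration under stated assumptions, not the argument of the paper: $\Phi$ denotes a hypothetical solution map $g \mapsto \Phi(g)$ for (1.21) with the nonlinear term evaluated at $g$, the shorthand $a = |a_0|_{+,\beta}$ is introduced here, and the Lipschitz bound is assumed, as one would expect from the bilinear estimate (3.23).

```latex
% Assumptions (hypothetical notation, not from the text): a = |a_0|_{+,\beta}, and
%   \|\Phi(g)\|_\beta \le c(\|g\|_\beta^2 + a),
%   \|\Phi(g_1)-\Phi(g_2)\|_\beta \le c(\|g_1\|_\beta+\|g_2\|_\beta)\,\|g_1-g_2\|_\beta .
\[
\begin{aligned}
&\text{Set } r = 2ca \text{ and assume } 4c^2 a \le \tfrac12 .\\
&\text{If } \|g\|_\beta \le r:\quad
\|\Phi(g)\|_\beta \le c\,(4c^2a^2 + a) = ca\,(4c^2 a + 1) \le \tfrac32\,ca \le r .\\
&\text{If } \|g_1\|_\beta,\ \|g_2\|_\beta \le r:\quad
\|\Phi(g_1)-\Phi(g_2)\|_\beta \le 2cr\,\|g_1-g_2\|_\beta
= 4c^2a\,\|g_1-g_2\|_\beta \le \tfrac12\,\|g_1-g_2\|_\beta .
\end{aligned}
\]
```

Under these assumptions $\Phi$ maps the ball of radius $r$ into itself and is a contraction there, so it has a unique fixed point in that ball.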


4. The Nonlinear Boundary Layer In this section, we will show that the boundary layer solutions to the Boltzmann equation exist under the solvability condition. We denote by Iγ the linear solution operator as follows: Iγ (a0 ) ≡ f (0, ·),

(4.1)

where f (x, ξ ) is given by

ξ1 fx = Lf − γ P + 0 ξ1 f, f (0, ξ ) = a0 (ξ ) for ξ1 ≥ 0,

and f (0, ·) designates ξ → f (0, ξ ) for all ξ . Similarly, let I γ be a nonlinear solution operator I γ (a0 ) ≡ f (0, ·),

(4.2)

where f (x, ξ ) = e−σ x g is given by (1.21)

ξ1 fx = Lf − γ P + 0 ξ1 f + (f, f ), f (0, ξ ) = a0 (ξ ) for ξ1 ≥ 0.

(4.3)

For these two operators Iγ and I γ , we first have the following lemma. Lemma 4.1. The solution operators Iγ and I γ have the following property:

Iγ : a0 ∈ L2ξ1 ,+ −→ Iγ (a0 ) ∈ L2|ξ | is bounded. I γ : a0 ∈ L2ξ1 ,+ −→ I γ (a0 ) ∈ L2|ξ | is bounded, where a0 is sufficiently small .

This lemma is a direct consequence of the energy estimates for the linear and nonlinear equation based on the estimates in Sects. 2 and 3, cf. Theorem 2.2. Therefore, we omit its proof for brevity. The following theorem gives an implicit condition on the boundary data which guarantees the solution obtained for the Boltzmann equation with damping is exactly the one without damping. That is, it gives the solvability condition of the boundary layer problem for the Boltzmann equation. Remark 4.2. The function in Theorem 1.1 is defined by γ ≡ P+ 0 ξ1 I .

Theorem 4.3. For a given γ > 0 suppose that γ P+ 0 ξ1 I (a0 ) ≡ 0,

then the solution of (1.21) is a solution of (1.15).

(4.4)


Proof. For a given $\gamma > 0$, when $\langle a_0, \xi_1 a_0\rangle_+$ is sufficiently small, (4.3) has a unique solution $f(x,\xi)$. We project the problem (4.3) to its macroscopic component, then we have that

\[
\partial_x P_0^+ P_0\,\xi_1 f = -\gamma\, P_0^+ P_0 P_0^+\,\xi_1 f.
\]

Since the projection matrices $P_0$ and $P_0^+$ satisfy

\[
P_0^+ P_0 P_0^+ = P_0^+ P_0 = P_0^+,
\]

we have a linear differential equation for $P_0^+\xi_1 f$,

\[
\partial_x P_0^+\xi_1 f = -\gamma\, P_0^+\xi_1 f.
\]
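The linear equation for $P_0^+\xi_1 f$ can be integrated explicitly. As a short worked step, with the shorthand $y(x) = P_0^+\xi_1 f(x,\cdot)$ (a symbol introduced here only for illustration):

```latex
% y(x) := P_0^+ \xi_1 f(x,\cdot) is shorthand introduced for this illustration only.
\[
\partial_x y = -\gamma y
\quad\Longrightarrow\quad
\partial_x\big(e^{\gamma x}\, y(x)\big) = 0
\quad\Longrightarrow\quad
y(x) = e^{-\gamma x}\, y(0), \qquad x \ge 0 .
\]
```

The value at the boundary therefore determines $y$ on the whole half-line.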

If the boundary condition satisfies $P_0^+\xi_1 f|_{x=0} = 0$, then we immediately have that $P_0^+\xi_1 f \equiv 0$ for $x \ge 0$. That is, under the condition (4.4), the damping term added in (1.21) vanishes identically so that (1.21) is the same as (1.15).

Similar to Theorem 4.3, we have the following theorem on the homogeneous linearized equation, cf. [7].

Theorem 4.4. For a given $\gamma > 0$ suppose that

\[
P_0^+\xi_1 I^\gamma(a_0) \equiv 0, \tag{4.5}
\]

then the problem

\[
\begin{cases}
\xi_1\partial_x f = Lf,\\
f(0,\xi) = a_0(\xi) \ \text{for } \xi_1 \ge 0,\\
\lim_{x\to\infty} f(x,\xi) = 0,
\end{cases}
\]

has a unique solution.

In the following two subsections, we will classify the boundary data by carefully analyzing the solvability condition (4.5) for the linearized equation, and (4.4) for the nonlinear equation respectively.

4.1. Classification of $P_0^+\xi_1 I^\gamma(a_0) = 0$. Consider

\[
P_0^+\xi_1 I^\gamma(a_0) = 0.
\]

Since $I^\gamma$ defines a bounded linear operator, the function $P_0^+\xi_1 I^\gamma(a_0)$ defines a bounded linear map from $L^2_{\xi_1,+}$ to a finite dimensional space. According to the Mach number of the far field Maxwellian, we have the following theorem on the co-dimensions of the boundary data satisfying (4.5).

Theorem 4.5. The classification of the boundary data satisfying the solvability condition (4.5) can be summarized in Table 1.

Table 1

  $M_\infty < -1$        $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 0$
  $-1 < M_\infty < 0$    $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 1$
  $0 < M_\infty < 1$     $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 4$
  $M_\infty > 1$         $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 5$
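The codimensions in Table 1 coincide with the number of positive eigenvalues of $A = P_0\xi_1 P_0$, which is the dimension $n^+$ used in the proof. The sketch below counts them as a function of the Mach number. It assumes, as is standard for the characteristic speeds of the Euler equations but not stated in this section, that the five eigenvalues are proportional to $M_\infty - 1$, $M_\infty$ (three times) and $M_\infty + 1$; the function name and the `sound_speed` parameter are hypothetical.

```python
def positive_eigenvalue_count(mach: float, sound_speed: float = 1.0) -> int:
    """Count the positive eigenvalues among lambda_1 < lambda_2 = lambda_3 = lambda_4 < lambda_5.

    Assumption (not taken from the text): the eigenvalues are
    c*(M - 1), c*M (three times) and c*(M + 1), with c the sound speed.
    """
    eigenvalues = [
        sound_speed * (mach - 1.0),
        sound_speed * mach,
        sound_speed * mach,
        sound_speed * mach,
        sound_speed * (mach + 1.0),
    ]
    return sum(1 for value in eigenvalues if value > 0.0)


if __name__ == "__main__":
    # One sample Mach number from each of the four regimes of Table 1.
    for mach in (-2.0, -0.5, 0.5, 2.0):
        print(mach, positive_eigenvalue_count(mach))
```

The borderline values $M_\infty \in \{-1, 0, 1\}$, where an eigenvalue vanishes, are not covered by Table 1 and are excluded from this illustration.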

Proof. Denote p i the eigenvector of the operator A = p 0 ξ1 p 0 on N by

Api = λi pi , λ1 < λ2 = λ3 = λ4 < λ5 . We want to show that the dimension of the non-trivial solution a0 to γ P+ 0 ξ1 I (a0 ) = 0,

(4.6) +

2 γ n is exactly n+ . Since P + 0 ξ1 I is a bounded linear operator from Lξ1 ,+ to R , the di+ + mension of the non-trivial solutions is at most n . It suffices to find n independent non-trivial solutions to (4.6). For this, we introduce auxiliary functions

j j (x, ξ ) ≡ e−γ x pj (ξ ), j = 1, 2, 3, 4, 5. γ

+ When M∞ < −1, dim(P + 0 ) = 0. The matrix P 0 is a zero matrix. The condition 2 (4.5) holds for any a0 ∈ Lξ1 ,+ . Thus, no condition on a0 should be imposed. That is, γ codim {a0 ∈ L2ξ1 ,+ |P + 0 ξ1 I (a0 ) = 0} = 0.

When M∞ ∈ (−1, 0), the range P + 0 is spanned by p 5 . Therefore, + dim(P + 0 ) = n = 1. γ

It is straightforward to check that the function j 5 satisfies −γ x ξ1 ∂x j 5 − Lj 5 + γ P + P 1 ξ1 p5 (ξ ). 0 ξ1 j 5 = −γ e γ

γ

γ

γ

γ

γ

Let J 5 ≡ j 5 + k 5 (x, ξ ) be a solution of

γ γ γ ξ1 ∂x J 5 − LJ 5 + γ P + 0 ξ1 J 5 = 0, γ γ J 5 (0, ξ ) = j 5 (0, ξ ) for ξ1 ≥ 0. γ

The equation for k 5 is

γ γ γ γ ξ1 ∂x k 5 − Lk 5 + γ P 0 ξ1 k 5 = γ e−γ x P 1 ξ1 p5 (ξ ), γ k 5 (0, ξ ) = 0 for ξ1 ≥ 0.

(4.7)


γ

Notice that this equation for k 5 is just the equation for g in (2.1) by letting h = 1 ξ1 p 5 (ξ ). By choosing σ and γ satisfying γ |σ | 1, | | = O(1), and σ − γ < 0, σ

γ e−γ x P

γ

the estimates from Theorem 2.2 with k 5 (0, ξ )|ξ1 ≥0 = 0 yields that 1 γ γ γ γ + + 2 P + 0 ξ1 J 5 x=0 = P 0 ξ1 (j 5 + k 5 ) x=0 = P 0 ξ1 j 5 x=0 + O(1) (γ ) = 0. (4.8) Here we use the fact that we can choose γ arbitrarily small. Therefore, we choose a0 (ξ ) = p5 (ξ ) for ξ1 ≥ 0, then (4.8) implies that for sufficient small γ , γ P+ 0 ξ1 I (a0 ) = 0.

2 γ Then, p5 , P + 0 ξ1 I (a0 ) defines a non-trivial bounded functional from Lξ1 ,+ to R. Then, 2 by the Riesz representation theorem there exists r 5 ∈ Lξ1 ,+ such that γ p5 , P + (4.9) 0 ξ1 I (a0 ) = r 5 , ξ1 a0 + . This shows that

codim{a0 ∈ L2ξ1 ,+ : P + 0 ξ1 I(a0 ) = 0} = 1.

+ When 0 < M∞ < 1, dim(P + 0 ) = 4. The range of P 0 is spanned by p 2 , p 3 , p 4 , + and p5 . The bounded linear functional P 0 ξ1 Iγ (a0 ) can be written as 5 γ pi , P + 0 ξ1 I (a0 ) + γ P 0 ξ1 I (a0 ) = pi . pi , pi i=2

Similar to the construction of r 5 in the case M∞ ∈ (−1, 0), we can use j i for i = 2, · · · , 5 to obtain four linearly independent r i , i = 2, · · · , 5, so that γ pi , P + 0 ξ1 I (a0 ) = r i , ξ1 a0 + for i = 2, · · · , 5. γ

This gives the theorem for the case 0 < M∞ < 1. Similar to the above argument, one can show the theorem holds also for M∞ > 1. γ 4.2. Classification of P 0 ξ1 I γ (a0 ) = 0. From the classification of P + 0 ξ1 I (a0 ) = 0 to + + γ the one of P 0 ξ1 I (a0 ) = 0, we can consider the Fr´echet derivative of P 0 ξ1 I γ (a0 ) and show that it is non-trivial in a space of dimension n+ . For this, we normalize the vectors γ r i obtained for the linear operator P + 0 ξ1 I (a0 ) in the direction p i such that γ

r i , ξ1 r i + = 1 for i = 1, · · · , 5, if they exist. From the analysis in 4.1, the boundary data a0 can be decomposed as in Table 2. From the decomposition in the Table 2, we can parameterize I γ (a0 ) as in Table 3. Under this parameterization, we have the following lemma on the Fr´echet derivative γ of P + 0 ξ1 I (a0 ).

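Table 2

  $-1 < M_\infty < 0$    $a_0 = \langle r_5, \xi_1 a_0\rangle_+\, r_5 + c$                      $\langle c, \xi_1 r_5\rangle_+ = 0$
  $0 < M_\infty < 1$     $a_0 = \sum_{i=2}^{5} \langle r_i, \xi_1 a_0\rangle_+\, r_i + c$      $\langle c, \xi_1 r_i\rangle_+ = 0$ for $i = 2, \dots, 5$
  $M_\infty > 1$         $a_0 = \sum_{i=1}^{5} \langle r_i, \xi_1 a_0\rangle_+\, r_i + c$      $\langle c, \xi_1 r_i\rangle_+ = 0$ for $i = 1, \dots, 5$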

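Table 3

  $-1 < M_\infty < 0$    $I^\gamma(a_0) \equiv \bar I^\gamma(b_5, c)$                 $b_5 \in \mathbb{R}$, $c \in r_5^\perp$
  $0 < M_\infty < 1$     $I^\gamma(a_0) \equiv \bar I^\gamma(b_2, \dots, b_5, c)$     $b_2, \dots, b_5 \in \mathbb{R}$, $c \in \cap_{i=2}^{5} r_i^\perp$
  $M_\infty > 1$         $I^\gamma(a_0) \equiv \bar I^\gamma(b_1, \dots, b_5, c)$     $b_1, \dots, b_5 \in \mathbb{R}$, $c \in \cap_{i=1}^{5} r_i^\perp$

(4.10)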

γ Lemma 4.6. Referring to table (4.10), suppose that P + 0 ξ1 I (a0 ) = 0, then a0 is a function of c when a0 , ξ1 a0 + is sufficiently small.

Proof. It is obvious that I γ (0) = 0, and

d γ = Iγ (a0 ). I ( a0 ) d

=0

From this, we have

d + γ = P+ P 0 ξ1 I γ ( a0 ) 0 ξ1 I (a0 ). d

=0

(4.11)

Now, we consider the case of −1 < M∞ < 0 for illustration. For this case, from (4.11), we have d + γ = r 5 , ξ a0 + . p , P ξ1 I ( a0 ) d 5 0

=0 Take a0 = r 5 , then we have d γ ¯γ p5 , P + ξ I ( a ) = ∂b5 p5 , P + 0 0 1 0 ξ1 I (b5 , c) |(0,0) = r 5 , ξ r 5 = 1. d

=0 ¯γ Then, by the implicit function theorem p5 , P + 0 ξ1 I (b5 , c) = 0 defines b5 = b5 (c). This shows the theorem holds for −1 < M∞ < 0. A similar argument applies to the cases 0 < M∞ < 1 and M∞ > 1. Theorem 1 and Lemma 4.6 give the following theorem. Theorem 4.7. The classification of the boundary data near the far field Maxwellian satisfying the solvability condition (4.4) with respect to the Mach number at far field can be summarized as in Table 4.


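Table 4

  $M_\infty < -1$        $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 0$
  $-1 < M_\infty < 0$    $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 1$
  $0 < M_\infty < 1$     $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 4$
  $M_\infty > 1$         $\mathrm{Codim}(\{a_0 \in L^2_{\xi_1,+} \mid P_0^+\xi_1 I^\gamma(a_0) = 0\}) = 5$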

Acknowledgement. The research of the first author was supported by Grant-in Aid for Scientific Research (C) 136470207, Japan Society for the Promotion of Science (JSPS). The research of the second author was supported by the Competitive Earmarked Research Grant of Hong Kong CityU 1092/02P# 9040737. The research of the third author was supported by the Competitive Earmarked Research Grant of Hong Kong # 9040645.

References

1. Aoki, K., Nishino, K., Sone, Y., Sugimoto, H.: Numerical analysis of steady flows of a gas condensing on or evaporating from its plane condensed phase on the basis of kinetic theory: Effect of gas motion along the condensed phase. Phys. Fluids A 3, 2260–2275 (1991)
2. Arkeryd, L., Nouri, A.: On the Milne problem and the hydrodynamic limit for a steady Boltzmann equation model. J. Stat. Phys. 99, 993–1019 (2000)
3. Bardos, C., Caflisch, R.E., Nicolaenko, B.: The Milne and Kramers problems for the Boltzmann equation of a hard sphere gas. Commun. Pure Appl. Math. 39, 323–352 (1986)
4. Carleman, T.: Sur la théorie de l'équation intégrodifférentielle de Boltzmann. Acta Mathematica 60, 91–142 (1932)
5. Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Berlin: Springer-Verlag, 1994
6. Cercignani, C.: Half-space problem in the kinetic theory of gases. In: Trends in Applications of Pure Mathematics to Mechanics, Kröner, E., Kirchgässner, K. (eds.) Berlin: Springer-Verlag, 1986, pp. 35–50
7. Coron, F., Golse, F., Sulem, C.: A classification of well-posed kinetic layer problems. Commun. Pure Appl. Math. 41, 409–435 (1988)
8. Golse, F., Perthame, B., Sulem, C.: On a boundary layer problem for the nonlinear Boltzmann equation. Arch. Rational Mech. Anal. 103(1), 81–96 (1988)
9. Golse, F., Poupaud, F.: Stationary solutions of the linearized Boltzmann equation in a half-space. Math. Methods Appl. Sci. 11, 483–502 (1989)
10. Kawashima, S., Nishibata, S.: Existence of a stationary wave for the discrete Boltzmann equation in the half space. Commun. Math. Phys. 207, 385–409 (1999)
11. Kawashima, S., Nishibata, S.: Stationary waves for the discrete Boltzmann equation in the half space with reflective boundaries. Commun. Math. Phys. 211, 183–206 (2000)
12. Nikkuni, Y., Kawashima, S.: Stability of stationary solutions to the half-space problem for the discrete Boltzmann equation with multiple collisions. Kyushu J. Math. 54, 233–255 (2000)
13. Sone, Y.: Kinetic Theory and Fluid Dynamics. Berlin: Birkhäuser, 2002
14. Sone, Y., Aoki, K., Yamashita, I.: A study of unsteady strong condensation on a plane condensed phase with special interest in formation of steady profile. In: Rarefied Gas Dynamics, Boffi, V., Cercignani, C. (eds.) Stuttgart: Teubner, Vol. II, 1986, pp. 323–333
15. Ukai, S.: On the half-space problem for the discrete velocity model of the Boltzmann equation. In: Advances in Nonlinear Partial Differential Equations and Stochastics, Kawashima, S., Yanagisawa, T. (eds.) Series on Advances in Mathematics for Applied Sciences, Vol. 48, Singapore-New York: World Scientific, 1998, pp. 160–174

Communicated by H.-T. Yau

Commun. Math. Phys. 236, 395–448 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0808-6


Long Range Scattering and Modified Wave Operators for the Maxwell-Schrödinger System I. The Case of Vanishing Asymptotic Magnetic Field

J. Ginibre¹, G. Velo²

¹ Laboratoire de Physique Théorique, Université de Paris XI, Bâtiment 210, 91405 Orsay Cedex, France
² Dipartimento di Fisica, Università di Bologna, and INFN, Sezione di Bologna, Italy

Received: 1 August 2002 / Accepted: 9 December 2002 Published online: 14 March 2003 – © Springer-Verlag 2003

Abstract: We study the theory of scattering for the Maxwell-Schrödinger system in space dimension 3, in the Coulomb gauge. In the special case of vanishing asymptotic magnetic field, we prove the existence of modified wave operators for that system with no size restriction on the Schrödinger data and we determine the asymptotic behaviour in time of solutions in the range of the wave operators. The method consists in partially solving the Maxwell equations for the potentials, substituting the result into the Schrödinger equation, which then becomes both nonlinear and nonlocal in time, and treating the latter by the method previously used for the Hartree equation and for the Wave-Schrödinger system.

1. Introduction

This paper is devoted to the theory of scattering and more precisely to the existence of modified wave operators for the Maxwell-Schrödinger (MS) system in 3+1 dimensional space time. This system describes the evolution of a charged nonrelativistic quantum mechanical particle interacting with the (classical) electromagnetic field it generates. We write that system in Lorentz covariant notation: greek indices run from 0 to 3, latin indices run from 1 to 3, indices are raised and lowered with the metric tensor $g_{\mu\nu} = (1, -1, -1, -1)$ so that, for any vector field $v = (v^\mu)$, $v_0 = v^0$ and $v_j = -v^j$, and we use the standard summation convention on repeated indices. The MS Lagrangian density can be written as

\[
\mathcal{L} = -(1/2)\, F_{\mu\nu} F^{\mu\nu} - \mathrm{Im}\, \bar u D_0 u + (1/2)\, \overline{D_j u}\, D^j u, \tag{1.1}
\]

where

\[
D_\mu = \partial_\mu + i A_\mu, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu,
\]

and $u$ and $A_\mu$ are respectively a complex scalar valued and a real vector valued function defined in $\mathbb{R}^{3+1}$. Here $\{F_{0j}\}$ are the components of the electric field and $\{F_{ij}\}$ are the

Unité Mixte de Recherche (CNRS) UMR 8627


components of the magnetic field. The associated variational system is

\[
\begin{cases}
i\partial_0 u = (1/2)\, D_j D^j u + A_0 u,\\
\partial^\nu F_{\nu\mu} = J_\mu,
\end{cases} \tag{1.2}
\]

where the current density $J_\mu$ is given by

\[
J_0 = |u|^2, \qquad J_j = -\mathrm{Im}\, \bar u D_j u \tag{1.3}
\]

and satisfies the local conservation condition $\partial_\mu J^\mu = 0$. Formally the $L^2$-norm of $u$ is conserved as well as the energy

\[
E(u, A) = \int dx \left\{ (1/2)\Big[ \sum_{\mu<\nu} F_{\mu\nu}^2 + \sum_j |D_j u|^2 \Big] + A_0 |u|^2 \right\}. \tag{1.4}
\]

The system (1.2) is gauge invariant and we shall study it in the Coulomb gauge which experience shows to be the most convenient one for the purpose of analysis. The MS system (1.2) in $\mathbb{R}^{3+1}$ is known to be locally well posed in sufficiently regular spaces in the Lorentz gauge $\partial_\mu A^\mu = 0$ [14] and to have global weak solutions in the energy space in various gauges including the Lorentz and Coulomb gauges [7]. However that system is so far not known to be globally well posed in any space whatever the gauge is.

A large amount of work has been devoted to the theory of scattering for nonlinear equations and systems centering on the Schrödinger equation, in particular for nonlinear Schrödinger (NLS) equations, Hartree equations, Klein-Gordon-Schrödinger (KGS) systems, Wave-Schrödinger (WS) systems and Maxwell-Schrödinger (MS) systems. As in the case of the linear Schrödinger equation, one must distinguish the short range case from the long range case. In the former case, ordinary wave operators are expected and in a number of cases proved to exist, describing solutions where the Schrödinger function behaves asymptotically like a solution of the free Schrödinger equation. In the latter case, ordinary wave operators do not exist and have to be replaced by modified wave operators including a suitable phase in their definition. In that respect, the MS system (1.2) in $\mathbb{R}^{3+1}$ belongs to the borderline (Coulomb) long range case, because of the $t^{-1}$ decay in $L^\infty$ norm of solutions of the wave equation. Such is the case also for the Hartree equation with $|x|^{-1}$ potential, for the WS system in $\mathbb{R}^{3+1}$ and for the KGS system in $\mathbb{R}^{2+1}$.

Whereas a well developed theory of long range scattering exists for the linear Schrödinger equation (see [1] for a recent treatment and for an extensive bibliography), there exist only few results on nonlinear long range scattering. The existence of modified wave operators in the borderline Coulomb case has been proved first for the NLS equation in space dimension n = 1 [17], then for the NLS equation in dimensions n = 2, 3 and for the Hartree equation in dimension n ≥ 2 [2], for the derivative NLS equation in dimension n = 1 [10], for the KGS system in dimension 2 [18] and for the MS system in dimension 3 [19]. All those results are restricted to the case of small data. In the case of arbitrarily large data, the existence of modified wave operators has been proved for a family of Hartree type equations with general (not only Coulomb) long range interactions [3–5] by a method inspired by a previous series of papers by Hayashi et al. [8, 9] on the Hartree equation. Part of the results have been improved as regards regularity [15, 16]. Finally the existence of modified wave operators without any size


restriction on the data has been proved for the WS system in dimension n = 3 [6], by an extension of the method of [3, 4].

The present paper is devoted to the extension of the results of [6] to the MS system in the Coulomb gauge, and in particular to the proof of the existence of modified wave operators for that system without any size restriction on the Schrödinger data, in the special case of vanishing asymptotic magnetic field, namely in the special case where the asymptotic state for the vector potential is zero. The method consists in replacing the Maxwell equation for the vector potential by the associated integral equation and substituting the latter into the Schrödinger equation, thereby obtaining a new Schrödinger equation which is both nonlinear and nonlocal in time. The latter is then treated as in [6], namely $u$ is expressed in terms of an amplitude $w$ and a phase $\varphi$ satisfying an auxiliary system similar to that introduced in [6]. Wave operators are constructed first for that auxiliary system, and then used to construct modified wave operators for the original system (1.2). The detailed construction is too complicated to allow for a more precise description at this stage, and will be described in heuristic terms in Sect. 2 below. The results of this paper improve over those of [19] by the fact that we do not require smallness conditions on the Schrödinger asymptotic data. On the other hand, in contrast with [19], we restrict our attention to the case of vanishing asymptotic data for the vector potential in order to avoid the difficulties coming from the difference of propagation properties of the Wave and Schrödinger equations that occur both in [6] and in [19]. In a subsequent paper we hope to extend the results of the present one to the general case.

We now give a brief outline of the contents of this paper. A more detailed description of the technical parts will be given at the end of Sect. 2. After collecting some notation and preliminary estimates in Sect. 3, we study the asymptotic dynamics for the auxiliary system in Sect. 4 and we construct the wave operators for that system by solving the local Cauchy problem at infinity associated with it in Sect. 5, which contains the main technical results of this paper. We finally come back from the auxiliary system to the original one and construct the modified wave operators for the latter in Sect. 6, where the final result is stated in Proposition 6.1.

We conclude this section with some general notation which will be used freely throughout this paper. We denote by $\|\cdot\|_r$ the norm in $L^r \equiv L^r(\mathbb{R}^3)$ and we define $\delta(r) = 3/2 - 3/r$. For any interval $I$ and any Banach space $X$, we denote by $C(I, X)$ the space of strongly continuous functions from $I$ to $X$ and by $L^\infty(I, X)$ (resp. $L^\infty_{\mathrm{loc}}(I, X)$) the space of measurable essentially bounded (resp. locally essentially bounded) functions from $I$ to $X$. For real numbers $a$ and $b$, we use the notation $a \vee b = \mathrm{Max}(a, b)$, and $a \wedge b = \mathrm{Min}(a, b)$. In the estimates of solutions of the relevant equations, we shall use the letter $C$ to denote constants, possibly different from an estimate to the next, depending on various parameters, but not on the solutions themselves or on their initial data. We shall use the notation $C(a_1, a_2, \cdots)$ for estimating functions, also possibly different from an estimate to the next, depending in addition on suitable norms $a_1, a_2, \cdots$ of the solutions or of their initial data. Additional notation will be given in Sect. 3.

2. Heuristics and Formal Computations

In this section, we discuss in heuristic terms the construction of the modified wave operators for the MS system as it will be performed in this paper and we derive the equations needed for that purpose.


We first recast the MS system in the Coulomb gauge $\partial_j A^j = 0$ in the form that will be used later on. We introduce the noncovariant notation

\[
\partial_t = \partial_0, \qquad \nabla = \{\partial_j\}, \qquad A = \{A^j\}, \qquad J = \{J^j\},
\]

so that

\[
\{D_j\} = \nabla - iA, \qquad J \equiv J(u, A) = \mathrm{Im}\, \bar u (\nabla - iA) u
\]

and the Coulomb gauge condition becomes $\nabla \cdot A = 0$. With that notation, the MS system in the Coulomb gauge becomes

\[
i\partial_t u = -(1/2)(\nabla - iA)^2 u + A_0 u, \tag{2.1}
\]
\[
\Delta A_0 = -J_0, \tag{2.2}
\]
\[
\Box A + \nabla(\partial_t A_0) = J, \tag{2.3}
\]

where $\Box = \partial_t^2 - \Delta$. We replace that system by a formally equivalent one in the following standard way. We solve (2.2) for $A_0$ as

\[
A_0 = -\Delta^{-1} J_0 = (4\pi|x|)^{-1} * |u|^2 \equiv g(u), \tag{2.4}
\]

where $*$ denotes the convolution in $\mathbb{R}^3$, so that by the current conservation $\partial_t J_0 + \nabla \cdot J = 0$,

\[
\partial_t A_0 = \Delta^{-1} \nabla \cdot J. \tag{2.5}
\]

Substituting (2.4) into (2.1) and (2.5) into (2.3), we obtain the new system

\[
i\partial_t u = -(1/2)(\nabla - iA)^2 u + g(u) u, \tag{2.6}
\]
\[
\Box A = P J \equiv P\, \mathrm{Im}\, \bar u (\nabla - iA) u, \tag{2.7}
\]

where $P = \mathbb{1} - \nabla \Delta^{-1} \nabla$ is the projector on divergence free vector fields. The system (2.6), (2.7) is the starting point of our investigation.

We want to address the problem of classifying the asymptotic behaviours in time of the solutions of the system (2.6), (2.7) by relating them to a set of model functions $V = \{v = v(v_+)\}$ parametrized by some data $v_+$ and with suitably chosen and preferably simple asymptotic behaviour in time. For each $v \in V$, one tries to construct a solution $(u, A)$ of the system (2.6), (2.7) defined at least in a neighborhood of infinity in time and such that $(u, A)(t)$ behaves as $v(t)$ when $t \to \infty$ in a suitable sense. We then define the wave operator as the map $\Omega : v_+ \to (u, A)$ thereby obtained. A similar question can be asked for $t \to -\infty$. We restrict our attention to positive time. The more standard definition of the wave operator is to define it as the map $v_+ \to (u, A)(0)$, but what really matters is the solution $(u, A)$ in the neighborhood of infinity in time, namely in some interval $[T, \infty)$. Continuing such a solution down to $t = 0$ is a somewhat different question which we shall not touch here, especially since the MS system is not known to be globally well posed in any reasonable space.

In the case of the MS system, which is long range, it is known that one cannot take for $V$ the set of solutions of the linear problem underlying (2.6), (2.7), namely

\[
\begin{cases}
i\partial_t u = -(1/2)\Delta u,\\
\Box A = 0,
\end{cases} \tag{2.8}
\]

and one of the tasks that will be performed in this paper will be to construct a better set $V$ of model asymptotic functions. The same situation prevails for long range Hartree equations and for the WS system and we refer to Sects. 2 of [3, 4, 6] for more details on that point.

Constructing the wave operators essentially amounts to solving the Cauchy problem with infinite initial time. The system (2.6), (2.7) in this form is not well suited for that purpose, and we now perform a number of transformations leading to an auxiliary system for which that problem can be handled. We first replace (2.7) by the associated integral equation, namely

\[
A = A_0^\infty + A_1(u, A), \tag{2.9}
\]

where

\[
A_0^\infty = \dot K(t) A_+ + K(t) \dot A_+, \tag{2.10}
\]
\[
A_1(u, A) = -\int_t^\infty dt'\, K(t - t')\, P J(u, A)(t') \tag{2.11}
\]

with

\[
\omega = (-\Delta)^{1/2}, \qquad K(t) = \omega^{-1} \sin \omega t, \qquad \dot K(t) = \cos \omega t.
\]

In particular, $A_0^\infty$ is a solution of the free (vector valued) wave equation with initial data $(A_+, \dot A_+)$ at $t = 0$, and $(A_+, \dot A_+)$ is naturally interpreted as the asymptotic state for $A$. In order to ensure the condition $\nabla \cdot A = 0$, we assume that $\nabla \cdot A_+ = \nabla \cdot \dot A_+ = 0$.

We next perform the same change of variables as for the Hartree equation and for the WS system. That change of variables is well adapted to the study of the asymptotic behaviour in time. The unitary group

\[
U(t) = \exp(i(t/2)\Delta) \tag{2.12}
\]

which solves the free Schrödinger equation can be written as

\[
U(t) = M(t)\, D(t)\, F\, M(t), \tag{2.13}
\]

where $M(t)$ is the operator of multiplication by the function

\[
M(t) = \exp\big(i x^2 / 2t\big), \tag{2.14}
\]

$F$ is the Fourier transform and $D(t)$ is the dilation operator

\[
(D(t) f)(x) = (it)^{-n/2} f(x/t) \tag{2.15}
\]

normalized to be unitary in $L^2$. We shall also need the operator $D_0(t)$ defined by

\[
(D_0(t) f)(x) = f(x/t). \tag{2.16}
\]
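The factorization (2.13) can be checked directly from the explicit kernel of the free Schrödinger group. The computation below is a sketch under two assumptions that are not fixed in the text above: the Fourier transform is normalized as $(Ff)(\xi) = (2\pi)^{-n/2} \int e^{-i\xi\cdot y} f(y)\, dy$, and $t > 0$.

```latex
% Assumptions: (F f)(\xi) = (2\pi)^{-n/2} \int e^{-i \xi\cdot y} f(y)\,dy, and t > 0.
% The first step uses the expansion |x-y|^2 = x^2 - 2 x\cdot y + y^2.
\[
\begin{aligned}
(U(t)f)(x)
&= (2\pi i t)^{-n/2} \int e^{i|x-y|^2/2t}\, f(y)\, dy \\
&= e^{ix^2/2t}\,(it)^{-n/2}\,(2\pi)^{-n/2} \int e^{-i(x/t)\cdot y}\, e^{iy^2/2t} f(y)\, dy \\
&= e^{ix^2/2t}\,(it)^{-n/2}\,\big(F M(t) f\big)(x/t)
 = \big(M(t)\,D(t)\,F\,M(t) f\big)(x).
\end{aligned}
\]
```

The prefactor $(it)^{-n/2}$ is exactly the normalization in (2.15) that makes $D(t)$ unitary in $L^2$.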

We parametrize the Schrödinger function $u$ in terms of an amplitude $w$ and of a real phase $\varphi$ as

\[
u(t) = M(t)\, D(t) \exp[-i\varphi(t)]\, w(t). \tag{2.17}
\]


Correspondingly we change the variable for the vector potential from $A$ to $B$ according to

\[
A(t) = t^{-1} D_0(t)\, B(t) \tag{2.18}
\]

and similarly for $A_0^\infty$ and $A_1$. We now perform the change of variables (2.17), (2.18) on the system (2.6), (2.7). Substituting (2.17), (2.18) into (2.6) and commuting the Schrödinger operator with $MD$, we obtain

\[
\Big\{ i\partial_t + (2t^2)^{-1} (\nabla - iB)^2 + t^{-1} x \cdot B - t^{-1} g(w) \Big\} \big(\exp(-i\varphi) w\big) = 0, \tag{2.19}
\]

where $g$ is defined by (2.4). Expanding the derivatives, using the condition $\nabla \cdot B = 0$ and introducing the notation $s = \nabla\varphi$, we rewrite (2.19) as

\[
\Big\{ i\partial_t + \partial_t\varphi + (2t^2)^{-1}\Delta - i t^{-2}\big((s + B)\cdot\nabla + (1/2)(\nabla\cdot s)\big) - (2t^2)^{-1}(s + B)^2 + t^{-1} x\cdot B - t^{-1} g(w) \Big\} w = 0. \tag{2.20}
\]

We next turn to the Maxwell equation (2.7). Substituting (2.17), (2.18) into the definition of $J$, we obtain

\[
J(t) = t^{-3} D_0(t) \Big\{ x|w|^2 + t^{-1}\big( \mathrm{Im}\, \bar w \nabla w - (s + B)|w|^2 \big) \Big\}
= t^{-3} D_0(t) \big( M_a + t^{-1} M_b \big), \tag{2.21}
\]

where

\[
M_a = x|w|^2, \tag{2.22}
\]
\[
M_b = \mathrm{Im}\, \bar w \nabla w - (s + B)|w|^2. \tag{2.23}
\]

Using the identity

\[
-\int_t^\infty dt'\, K(t - t')\, t'^{-3-j} D_0(t')\, P M(t') = t^{-1-j} D_0(t)\, F_j(M), \tag{2.24}
\]

where $F_j$ is defined by

\[
F_j(M) = \int_1^\infty d\nu\, \omega^{-1} \sin(\omega(\nu - 1))\, \nu^{-3-j} D_0(\nu)\, P M(\nu t), \tag{2.25}
\]

we rewrite (2.7) as

\[
B = B_0^\infty + B_1 \tag{2.26}
\]

with

\[
B_1 = B_a + B_b, \tag{2.27}
\]
\[
B_a \equiv B_a(w) = F_0(M_a), \tag{2.28}
\]
\[
B_b = t^{-1} F_1(M_b). \tag{2.29}
\]

This completes the change of variables for the system (2.6), (2.7). Now however we have parametrized u in terms of an amplitude w and a phase ϕ and we have only one


equation for the two functions (w, ϕ). We arbitrarily impose a second equation, namely a Hamilton-Jacobi (or eikonal) equation for the phase ϕ, thereby splitting (2.20) into a system of two equations, the other one of which being a transport equation for the amplitude w. There is a large amount of freedom in the choice of the equation for the phase. The role of the phase is to cancel the long range terms in (2.20) coming from the interaction. All the interaction terms in (2.20) having an explicit t −2 factor are expected to be short range and are therefore left in the equation for w. Such is also the case for the contribution of Bb to x · B because of the t −1 factor in (2.29). The term t −1 g(w) is clearly long range (of Hartree type), and is therefore included in the ϕ equation. The term t −1 (x · Ba ) is also long range, but since it is less regular than the previous one, it is convenient to split x · Ba into a short range and a long range part, namely x · Ba = (x · Ba )S + (x · Ba )L

(2.30)

in the following way. We take $0 < \beta < 1$ and we define

\[
\begin{cases}
(x \cdot B_a)_S = F^{-1} \chi(|\xi| > t^\beta)\, F (x \cdot B_a),\\
(x \cdot B_a)_L = F^{-1} \chi(|\xi| \le t^\beta)\, F (x \cdot B_a),
\end{cases} \tag{2.31}
\]

where $\chi(P)$ is the characteristic function of the set where $P$ holds. (In practice we shall need $0 < \beta < 1/2$ in most of the applications.) Finally the contribution of $B_0^\infty$ to $B$ should be considered short range. The Hamilton-Jacobi equation for the phase is then taken to be

\[
\partial_t \varphi = (2t^2)^{-1} s^2 + t^{-1} g(w) - t^{-1} (x \cdot B_a)_L. \tag{2.32}
\]
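The frequency splitting (2.30), (2.31) can be illustrated numerically. The sketch below is a one-dimensional periodic analogue, which is an assumption made for the sake of a small self-contained example (the paper works on $\mathbb{R}^3$ with the continuous Fourier transform). It splits a sampled function into the part with $|\xi| \le t^\beta$ and the part with $|\xi| > t^\beta$ using a naive discrete Fourier transform. All function names and sample values are hypothetical.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(n^2)); adequate for a small demo."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse of dft above."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


def split_short_long(values, t, beta, period=2.0 * math.pi):
    """Return (short, long): the parts with |xi| > t**beta and |xi| <= t**beta."""
    n = len(values)
    cutoff = t ** beta
    short_coeffs, long_coeffs = [], []
    for k, coeff in enumerate(dft(values)):
        signed_k = k if k <= n // 2 else k - n   # map index to a signed mode number
        xi = 2.0 * math.pi * signed_k / period   # frequency carried by that mode
        keep_long = abs(xi) <= cutoff
        long_coeffs.append(coeff if keep_long else 0.0)
        short_coeffs.append(0.0 if keep_long else coeff)
    return idft(short_coeffs), idft(long_coeffs)


# Sample signal: one slow mode (|xi| = 1) and one fast mode (|xi| = 10).
N = 64
GRID = [2.0 * math.pi * j / N for j in range(N)]
SIGNAL = [math.cos(x) + 0.5 * math.cos(10.0 * x) for x in GRID]
SHORT, LONG = split_short_long(SIGNAL, t=16.0, beta=0.5)  # cutoff t**beta = 4
```

With these sample values the long range part reproduces the slow mode and the short range part the fast one, and the two parts add up to the original signal, mirroring the decomposition (2.30).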

We are now in a position to introduce the auxiliary system which will be used to study the MS system. From now on, we restrict our attention to the case B0∞ = 0, namely to the case where the asymptotic state (A+ , A˙ + ) for the vector potential is zero. This is the technical meaning of the expression “vanishing asymptotic magnetic field” used in the title and in the introduction of this paper. The equation for w is obtained by substituting (2.32) into (2.20). The resulting equation contains ϕ only through s = ∇ϕ. The same property holds for the RHS of (2.32) and for (2.26)–(2.29). It is then convenient to replace (2.32) by its gradient so as to obtain a final system containing only s. The Maxwell equation now has B = B1 . Furthermore we shall regard (2.28) as the definition of Ba as a function of w, we shall take Bb as the dynamical variable and we shall regard (2.27) (with B = B1 ) as a change of variable from B to Bb . The actual equation is then (2.29) regarded as an equation for Bb . We finally introduce the notation Q(s, w) = s · ∇w + (1/2)(∇ · s)w

(2.33)

for the transport term. We then obtain the auxiliary system in the following form:
$$ \begin{cases} \partial_t w = i(2t^2)^{-1}\Delta w + t^{-2}Q(s+B, w) - i(2t^2)^{-1}(2B\cdot s + B^2)\,w + i t^{-1}\left((x\cdot B_a)_S + x\cdot B_b\right)w \equiv L(w,s,B_b)\,w \\ \partial_t s = t^{-2}\,s\cdot\nabla s + t^{-1}\nabla g(w) - t^{-1}\nabla(x\cdot B_a)_L , \end{cases} \tag{2.34} $$


J. Ginibre, G. Velo

$$ B_b = t^{-1}F_1\left(\mathrm{Im}\,\bar w\nabla w - (s+B)|w|^2\right) \tag{2.35} $$

with B = Ba + Bb and Ba defined by (2.28), (2.22), (2.25). The phase ϕ is regarded as a derived quantity to be recovered from (2.32). The linear operator L(w, s, Bb ) is defined in an obvious way by (2.34). Its dependence on s, Bb is explicit, while its dependence on w occurs through Ba . Since Ba and a fortiori x · Ba contains xw, it will be useful to have an explicit evolution equation for xw. From (2.34) we obtain immediately ∂t xw = L(w, s, Bb )xw − it −2 ∇w − t −2 (s + B)w.

(2.36)

The Cauchy problem for the system (2.34), (2.35) with initial data $(w, s)(t_0) = (w_0, s_0)$ for some time $t_0$ is no longer a usual PDE Cauchy problem since $B_a$ depends on w nonlocally in time and since (2.35) is an integral equation in time. A convenient way to handle that difficulty is to replace that problem by a partly linearized version thereof, namely
$$ \begin{cases} \partial_t w' = L(w,s,B_b)\,w' \\ \partial_t s' = t^{-2}\,s\cdot\nabla s' + t^{-1}\nabla g(w) - t^{-1}\nabla(x\cdot B_a)_L , \end{cases} \tag{2.37} $$
$$ B_b' = t^{-1}F_1\left(\mathrm{Im}\,\bar w\nabla w - (s+B)|w|^2\right), \tag{2.38} $$
still with $B = B_a + B_b$ and $B_a = B_a(w)$. Correspondingly, the evolution equation for $xw'$ becomes
$$ \partial_t\, xw' = L(w,s,B_b)\,xw' - i t^{-2}\nabla w' - t^{-2}(s+B)\,w' . \tag{2.39} $$

Solving (2.37) with suitable initial data for given $(w, s, B_b)$, together with (2.38), defines a map $(w, s, B_b) \to (w', s', B_b')$, and solving (2.34), (2.35) reduces to finding a fixed point of that map, which in favourable cases can be done for instance by contraction.

The first problem that we shall consider is whether the auxiliary system (2.34), (2.35) defines a dynamics for large time. This will be the subject of Sect. 4 below. In particular we shall prove that the Cauchy problem for that system is locally well posed in a neighborhood of infinity in time, namely that (2.34), (2.35) with initial data $(w, s)(t_0) = (w_0, s_0)$ has a unique solution defined in $[T, \infty)$ for some $T \le t_0$, with $t_0$ and $T$ suitably large depending on $(w_0, s_0)$, and with continuous dependence on the data. Those results will be obtained through the use of the linearized system (2.37), (2.38) by following the method sketched above. In addition, we shall derive some asymptotic properties of the solutions thereby obtained. In particular, for those solutions, $w(t)$ tends to a limit $w_+$ when $t \to \infty$.

The previous results are insufficient to construct the wave operators, namely to solve the Cauchy problem for (2.34), (2.35) with infinite initial time because (i) $T \to \infty$ when $t_0 \to \infty$ and (ii) the solutions are not estimated uniformly in $t_0$, so that it is not possible to perform the limit $t_0 \to \infty$ in those results. In order to construct the wave operators, we follow instead the procedure explained at the beginning of this section. We choose an asymptotic function $v$ and we look for solutions that behave asymptotically as $v$ when $t \to \infty$. The asymptotic $v$ will be taken as a pair $(W, S)$ with $S = \nabla\phi$ and with $W(t)$ tending to a limit $w_+$ as $t \to \infty$. This will provide the asymptotic form of $(w, s)$. No asymptotic form is needed for $B_b$, because $B_b$ tends to zero at infinity. In order for $(W, S)$ to be an adequate asymptotic function, it has to be an approximate solution of the system (2.34).
This is achieved by solving that system approximately by iteration

Long Range Scattering and Modified Wave Operators for MS System


and taking for $(W, S)$ an iterate of suitable order. In the present case, the first iteration turns out to be sufficient, and we shall take accordingly
$$ \begin{cases} W(t) = U^*(1/t)\,w_+ \\ S(t) = \displaystyle\int_1^t dt'\, t'^{-1}\left(\nabla g(W) - \nabla(x\cdot B_a)_L(W)\right) \end{cases} \tag{2.40} $$
for some given $w_+$ and for $t \ge 1$. Actually, at the cost of some loss of regularity, we could also make the simpler choice
$$ \begin{cases} W(t) = w_+ \\ S(t) = \displaystyle\int_1^t dt'\, t'^{-1}\left(\nabla g(w_+) - \nabla(x\cdot B_a)_L(w_+)\right) . \end{cases} \tag{2.41} $$
The latter $S(t)$ can be explicitly computed:
$$ S(t) = \ln t\;\nabla g(w_+) - \ln\left(t\,(\omega\vee 1)^{-1/\beta}\vee 1\right)\nabla(x\cdot B_a)(w_+), \tag{2.42} $$
where the second logarithm is defined in an obvious way in Fourier transformed variables. In order to solve the system (2.34), (2.35) with $(w, s)$ behaving asymptotically as a given $(W, S)$ (possibly but not necessarily (2.40)), we make a change of variables from $(w, s)$ to $(q, \sigma)$ defined by
$$ (q, \sigma) = (w, s) - (W, S). \tag{2.43} $$

Substituting (2.43) into (2.34), (2.35) will yield a new auxiliary system for the variables (q, σ, Bb ). For that purpose we introduce the following additional notation. We define g(w1 , w2 ) = (4π |x|−1 ) ∗ Re w¯ 1 w2 , Ba (w1 , w2 ) = F0 (x Re w¯ 1 w2 ),

(2.44) (2.45)

so that g(w) = g(w, w) and Ba (w) = Ba (w, w). We next define B∗ = Ba (W ) and G ≡ G(q, W ) = Ba (w) − B∗ = Ba (q, q + 2W ).

(2.46)

We finally define the remainders
$$ \begin{cases} R_1(w,s,B_b) = -\partial_t w + L(w,s,B_b)\,w \\ R_2(w,s) = -\partial_t s + t^{-2}\,s\cdot\nabla s + t^{-1}\nabla g(w) - t^{-1}\nabla(x\cdot B_a)_L(w), \end{cases} \tag{2.47} $$
$$ R_3(w,s,B_b) = -B_b + t^{-1}F_1\left(\mathrm{Im}\,\bar w\nabla w - (s+B)|w|^2\right), \tag{2.48} $$

so that the system (2.34), (2.35) can be rewritten as $R_i = 0$, $i = 1, 2, 3$, and for general $(w, s, B_b)$, the remainders measure its failure to satisfy that system. Using the previous notation, we can rewrite (2.34), (2.35) in the new variables as follows:
$$ \begin{cases} \partial_t q = L(w,s,B_b)\,q + t^{-2}Q(\sigma + G + B_b, W) - i t^{-2}\left(B\cdot\sigma + (G+B_b)(S + (B+B_*)/2)\right)W \\ \qquad\quad + i t^{-1}\left((x\cdot G)_S + x\cdot B_b\right)W + R_1(W,S,0) \equiv L(w,s,B_b)\,q + \widetilde R_1 , \\ \partial_t\sigma = t^{-2}\left(s\cdot\nabla\sigma + \sigma\cdot\nabla S\right) + t^{-1}\nabla g(q, q+2W) - t^{-1}\nabla(x\cdot G)_L + R_2(W,S) \equiv t^{-2}\,s\cdot\nabla\sigma + \widetilde R_2 , \end{cases} \tag{2.49} $$
$$ B_b = t^{-1}F_1\left(\mathrm{Im}(\bar w\nabla q + \bar q\nabla W) - (s+B)\left(|q|^2 + 2\,\mathrm{Re}\,\bar q W\right) - (\sigma + G + B_b)|W|^2\right) + R_3(W,S,0) \equiv \widetilde R_3 . \tag{2.50} $$

For the same reason as above, we also need the evolution equation for xq. We define the linear operator L1 by rewriting the evolution equation for q in the form ∂t q = L(w, s, Bb )q + L1 W + R1 (W, S, 0).

(2.51)

The evolution equation for xq then becomes ∂t xq = L(w, s, Bb )xq + L1 xW − it −2 ∇q − t −2 (s + B)q −t −2 (σ + G + Bb )W + xR1 (W, S, 0).

(2.52)

We shall also need to rewrite the partly linearized system (2.37), (2.38) in terms of the new variables $(q, \sigma)$ defined by (2.43) and of $(q', \sigma')$ defined similarly by
$$ (q', \sigma') = (w', s') - (W, S). \tag{2.53} $$
The system (2.37), (2.38) then becomes
$$ \begin{cases} \partial_t q' = L(w,s,B_b)\,q' + \widetilde R_1 \\ \partial_t\sigma' = t^{-2}\,s\cdot\nabla\sigma' + \widetilde R_2 , \end{cases} \tag{2.54} $$
$$ B_b' = \widetilde R_3 , \tag{2.55} $$
where $\widetilde R_i$, $i = 1, 2, 3$, are defined by (2.49), (2.50), and (2.39) becomes
$$ \partial_t\, xq' = L(w,s,B_b)\,xq' + L_1\,xW - i t^{-2}\nabla q' - t^{-2}(s+B)\,q' - t^{-2}(\sigma + G + B_b)\,W + x R_1(W,S,0). \tag{2.56} $$

The main technical result of this paper is the construction of solutions $(q, \sigma, B_b)$ of the auxiliary system (2.49), (2.50) defined for large time and tending to zero at infinity in time. That construction is performed by solving the Cauchy problem for the linearized system (2.54), (2.55), first for finite initial time $t_0$, and then for infinite initial time by taking the limit $t_0 \to \infty$. One then proves the existence of a fixed point for the map $(q, \sigma, B_b) \to (q', \sigma', B_b')$ thereby defined by a contraction method, as mentioned above. With that result available, it is an easy matter to construct the modified wave operator for the MS system in the form (2.6), (2.7). We start from the asymptotic state

u+ for u and we define w+ = F u+ . The asymptotic state (A+ , A˙ + ) for A is taken to be zero. We define (W, S) by (2.40). We solve the system (2.49), (2.50) for (q, σ, Bb ) as indicated above. Through (2.43), this yields a solution (w, s, Bb ) of the auxiliary system (2.34), (2.35) defined for large time. From s we reconstruct the phase ϕ by using (2.32). We finally substitute (w, ϕ, Bb ) into (2.17), (2.18) with B = Ba + Bb and Ba defined by (2.28). This yields a solution (u, A) of the system (2.6), (2.7) defined for large time. The modified wave operator is the map u+ → (u, A) thereby obtained. The main result of this paper is the construction of (u, A) from u+ , as described above, together with the asymptotic properties of (u, A) that follow from that construction. It will be stated below in full mathematical detail in Proposition 6.1. We give here only a heuristic preview of that result, stripped from most technicalities. Proposition 2.1. Let k > 3/2, 0 < β < 1/2 and let α > 1 be such that β(α + 1) ≥ 1. Let u+ be such that w+ = F u+ ∈ H k+α+1 and xw+ ∈ H k+α . Define (W, S) by (2.40). Then (1) There exists T = T (w+ ), 1 ≤ T < ∞, such that the auxiliary system (2.34), (2.35) has a unique solution (w, s, Bb ) in a suitable space, defined for t ≥ T , and such that (w − W, s − S, Bb ) tends to zero in suitable norms when t → ∞. (2) There exist ϕ and φ such that s = ∇ϕ, S = ∇φ, φ(1) = 0 and such that ϕ − φ tends to zero in suitable norms when t → ∞. Define (u, A) by (2.17), (2.18) with B = Ba + Bb and Ba defined by (2.28). Then (u, A) solves the system (2.6), (2.7) for t ≥ T and (u, A) behaves asymptotically as (MD exp(−iφ)W, t −1 D0 Ba (W )) in the sense that the difference tends to zero in suitable norms (for which each term separately is O(1)) when t → ∞. We now describe the contents of the technical parts of this paper, namely Sects. 3–6. In Sect. 
3, we introduce some notation, define the relevant function spaces and collect a number of preliminary estimates. In Sect. 4, we study the Cauchy problem for large time for the auxiliary system (2.34), (2.35). We solve the Cauchy problem with finite initial time for the linearized system (2.37) (Proposition 4.1), we prove a number of uniqueness results for the system (2.34), (2.35) (Proposition 4.2), we solve the Cauchy problem for the system (2.34), (2.35) for large but finite $t_0$ in the special case $A_0^\infty = 0$ (Proposition 4.3) and we finally derive some asymptotic properties of the solutions thereby obtained (Proposition 4.4). In Sect. 5, we study the Cauchy problem at infinity for the auxiliary system (2.34), (2.35) in the difference form (2.49), (2.50). We prove the existence of solutions first for the linearized system (2.54), (2.55) for $t_0$ finite (Proposition 5.1) and infinite (Proposition 5.2), and then for the nonlinear system (2.49), (2.50) for $t_0$ infinite (Proposition 5.3). Finally in Sect. 6, we construct the modified wave operators for the system (2.6), (2.7) from the results previously obtained for the system (2.49), (2.50) and we derive the asymptotic estimates for the solutions (u, A) in their range that follow from the previous estimates (Proposition 6.1).

3. Notation and Preliminary Estimates

In this section, we define the function spaces where we shall study the auxiliary system (2.34), (2.35) and we collect a number of estimates which will be used throughout this paper. In addition to the standard Sobolev spaces $H^k$, we shall use the associated homogeneous spaces $\dot H^k$ with norm $\|u; \dot H^k\| = \|\omega^k u\|_2$, where $\omega = (-\Delta)^{1/2}$, and the spaces $K^k = \dot H^1 \cap \dot H^k$, where it is understood that $\dot H^1 \subset L^6$. We shall use the notation
$$ |w|_k = \|w; H^k\| , \qquad |s|^{\cdot}_k = \|s; K^k\| . \tag{3.1} $$

We shall look for solutions of the auxiliary system in spaces of the type $C(I, X^k)$, where $I$ is an interval and
$$ X^k = \left\{(w, s, B) : w \in H^{k+1},\ xw \in H^k,\ s \in K^{k+2},\ B \in K^{k+1},\ x\cdot B \in K^{k+1}\right\}. \tag{3.2} $$
For the needs of this paper, we could have replaced $\dot H^1$ in the definition of $K^k$ by $\dot H^{k_0}$ for some $k_0$ with $1/2 < k_0 < 3/2$. We have chosen $k_0 = 1$ for simplicity. We shall use extensively the following Sobolev inequalities, stated here in $\mathbb{R}^n$, but to be used only for $n = 3$.

Lemma 3.1. Let $1 < q, r < \infty$, $1 < p \le \infty$ and $0 \le j < k$. If $p = \infty$, assume that $k - j > n/r$. Let $\sigma$ satisfy $j/k \le \sigma \le 1$ and $n/p - j = (1-\sigma)n/q + \sigma(n/r - k)$. Then the following inequality holds:
$$ \|\omega^j u\|_p \le C\,\|u\|_q^{1-\sigma}\,\|\omega^k u\|_r^{\sigma} . \tag{3.3} $$
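The exponent relation in Lemma 3.1 determines σ once the other parameters are fixed. The following sketch, which is not part of the paper, solves that relation in exact rational arithmetic and checks the side conditions as stated in the lemma; the function names `sobolev_sigma` and `is_admissible` are illustrative, and exponents are passed as reciprocals so that $p = \infty$ corresponds to `p_inv = 0`.

```python
from fractions import Fraction


def sobolev_sigma(n, p_inv, q_inv, r_inv, j, k):
    """Solve n/p - j = (1 - sigma) n/q + sigma (n/r - k) for sigma.

    Rearranged: sigma = (n/q - n/p + j) / (n/q - n/r + k).
    """
    n, j, k = Fraction(n), Fraction(j), Fraction(k)
    p_inv, q_inv, r_inv = Fraction(p_inv), Fraction(q_inv), Fraction(r_inv)
    denom = n * q_inv - n * r_inv + k
    if denom == 0:
        raise ValueError("degenerate case: the relation does not determine sigma")
    return (n * q_inv - n * p_inv + j) / denom


def is_admissible(n, p_inv, q_inv, r_inv, j, k):
    """Check the hypotheses of Lemma 3.1 as stated in the text."""
    n, j, k = Fraction(n), Fraction(j), Fraction(k)
    p_inv, q_inv, r_inv = Fraction(p_inv), Fraction(q_inv), Fraction(r_inv)
    if not (0 < q_inv < 1 and 0 < r_inv < 1 and 0 <= p_inv < 1):
        return False  # need 1 < q, r < infinity and 1 < p <= infinity
    if not (0 <= j < k):
        return False
    if p_inv == 0 and not (k - j > n * r_inv):
        return False  # extra condition when p = infinity
    sigma = sobolev_sigma(n, p_inv, q_inv, r_inv, j, k)
    return j / k <= sigma <= 1


half = Fraction(1, 2)
# n = 3, p = 6, q = r = 2, j = 0, k = 1: sigma = 1, i.e. the embedding of the
# homogeneous space of order 1 into L^6 mentioned after the definition of K^k.
print(sobolev_sigma(3, Fraction(1, 6), half, half, 0, 1))
# n = 3, p = infinity, q = r = 2, j = 0, k = 2: an L^infinity bound with sigma = 3/4.
print(sobolev_sigma(3, 0, half, half, 0, 2), is_admissible(3, 0, half, half, 0, 2))
```

With $k = 1$ instead of $k = 2$ in the second example the condition $k - j > n/r$ fails, and `is_admissible` returns `False`.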

The proof follows from the Hardy-Littlewood-Sobolev (HLS) inequality ([11], p. 117) (from the Young inequality if $p = \infty$), from Paley-Littlewood theory and interpolation. We shall also use extensively the following Leibnitz and commutator estimates.

Lemma 3.2. Let $1 < r, r_1, r_3 < \infty$ and $1/r = 1/r_1 + 1/r_2 = 1/r_3 + 1/r_4$. Then the following estimates hold:
$$ \|\omega^m(uv)\|_r \le C\left(\|\omega^m u\|_{r_1}\|v\|_{r_2} + \|\omega^m v\|_{r_3}\|u\|_{r_4}\right) \tag{3.4} $$
for $m \ge 0$,
$$ \|[\omega^m, u]v\|_r \le C\left(\|\omega^m u\|_{r_1}\|v\|_{r_2} + \|\omega^{m-1}v\|_{r_3}\|\nabla u\|_{r_4}\right) \tag{3.5} $$
for $m \ge 1$, where $[\,,\,]$ denotes the commutator, and
$$ \|[\omega^m, u]v\|_2 \le C\,\|F(\omega^m u)\|_1\,\|v\|_2 \tag{3.6} $$
for $0 \le m \le 1$.


Proof. The proof of (3.4), (3.5) is given in [12, 13] with $\omega$ replaced by $\langle\omega\rangle$ and follows therefrom by a scaling argument. The proof of (3.6) follows from
$$ |F([\omega^m, u]v)(\xi)| = \left|\int d\eta\,\left(|\xi|^m - |\eta|^m\right)\hat u(\xi - \eta)\,\hat v(\eta)\right| \le \int d\eta\,|\xi - \eta|^m\,|\hat u(\xi - \eta)|\,|\hat v(\eta)| $$
and from the Young inequality and the Plancherel theorem.

We shall also need the following consequence of Lemma 3.2.

Lemma 3.3. Let $m \ge 0$ and $1 < r < \infty$. Then the following estimate holds:
$$ \|\omega^m(e^{\varphi} - 1)\|_r \le \|\omega^m\varphi\|_r\,\exp\left(C\|\varphi\|_\infty\right). \tag{3.7} $$
Proof. For any integer $n \ge 2$, we estimate
$$ a_n \equiv \|\omega^m\varphi^n\|_r \le C\left(\|\omega^m\varphi\|_r\,\|\varphi\|_\infty^{n-1} + \|\omega^m\varphi^{n-1}\|_r\,\|\varphi\|_\infty\right) = C\left(a_1 b^{n-1} + a_{n-1}b\right) $$
by (3.4), where $b = \|\varphi\|_\infty$ and we can assume $C \ge 1$ without loss of generality. It follows easily from that inequality that $a_n \le n(Cb)^{n-1}a_1$ for all $n \ge 1$, from which (3.7) follows by expanding the exponential.
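The induction step in the proof of Lemma 3.3 can be checked on the extremal case where the recursion $a_n \le C(a_1 b^{n-1} + a_{n-1}b)$ holds with equality. The sketch below is only an illustration with arbitrarily chosen sample values (the names `recursion_sequence` and `closed_form_bound` are not from the paper); it uses exact rational arithmetic to confirm $a_n \le n(Cb)^{n-1}a_1$ for $C \ge 1$, and then checks that the partial sums of $\sum_n a_n/n!$ stay below $a_1\exp(Cb)$, which is the bound (3.7).

```python
import math
from fractions import Fraction


def recursion_sequence(a1, b, C, n_max):
    """Iterate a_n = C (a_1 b^(n-1) + a_(n-1) b), the worst case of the recursion.

    Returns the list [a_1, ..., a_{n_max}], so entry n - 1 holds a_n.
    """
    a = [a1]
    for n in range(2, n_max + 1):
        a.append(C * (a1 * b ** (n - 1) + a[-1] * b))
    return a


def closed_form_bound(a1, b, C, n):
    """The bound n (C b)^(n-1) a_1 claimed in the proof (requires C >= 1)."""
    return n * (C * b) ** (n - 1) * a1


# Sample values (assumptions of this sketch): a_1 = 1, b = 3/2, C = 2.
a1, b, C = Fraction(1), Fraction(3, 2), Fraction(2)
seq = recursion_sequence(a1, b, C, 30)

# Termwise bound, exact since all quantities are rationals.
assert all(seq[n - 1] <= closed_form_bound(a1, b, C, n) for n in range(1, 31))

# Expanding the exponential: sum_n a_n / n! <= a_1 sum_n (C b)^(n-1) / (n-1)! = a_1 exp(C b).
partial_sum = sum(seq[n - 1] / math.factorial(n) for n in range(1, 31))
assert float(partial_sum) <= float(a1) * math.exp(float(C * b))
```

For $n = 2$ the bound is attained exactly, since $a_2 = 2Cb\,a_1$ in the extremal case.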

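We shall apply Lemma 3.2 in the form of Lemma 3.4 below which we state for clarity in general dimension $n$ and with general $k_0 < n/2$ and $K^m = \dot H^m \cap \dot H^{k_0}$.

Lemma 3.4. Let $\bar m > n/2$. The following inequalities hold:

(1) Let $0 \le m \le \bar m$. Then
$$ \|fg; \dot H^m\| \le \begin{cases} C\,|f|^{\cdot}_{\bar m}\,\|g; \dot H^m\| & \text{for } m < n/2, \\ C\,|f|^{\cdot}_{\bar m}\,|g|^{\cdot}_m & \text{for } n/2 \le m \le \bar m, \end{cases} \tag{3.8} $$
$$ |fg|^{\cdot}_m \le C\,|f|^{\cdot}_{\bar m}\,|g|^{\cdot}_m \quad \text{for } k_0 \le m \le \bar m, \tag{3.9} $$
$$ |fg|_m \le C\,|f|^{\cdot}_{\bar m}\,|g|_m \quad \text{for } 0 \le m \le \bar m. \tag{3.10} $$

(2) Let $0 \le m \le \bar m + 1$. Then
$$ \|[\omega^m, f]\nabla g\|_2 \le C\,|\nabla f|^{\cdot}_{\bar m}\,\|\omega^m g\|_2 \quad \text{for } 0 \le m < n/2 + 1, \tag{3.11} $$
$$ \|[\omega^m, f]\nabla g\|_2 \le C\,|\nabla f|^{\cdot}_{\bar m}\,|\nabla g|^{\cdot}_{m-1} \quad \text{for } n/2 + 1 \le m \le \bar m + 1, \tag{3.12} $$
$$ \|[\omega^m, f]\nabla g\|_2 \le C\,|\nabla f|^{\cdot}_{\bar m}\,|g|^{\cdot}_m \quad \text{for } k_0 \le m \le \bar m + 1, \tag{3.13} $$
$$ \|[\omega^m, f]\nabla g\|_2 \le C\,|\nabla f|^{\cdot}_{\bar m}\,|g|_m \quad \text{for } 0 \le m \le \bar m + 1. \tag{3.14} $$

(3) Let $n \ge 3$ and $m \ge k_0$, $1 < m \le \bar m + 1$. Then
$$ \|[\omega^m, f]g\|_2 \le C\,|f|^{\cdot}_m\,|g|^{\cdot}_{\bar m} . \tag{3.15} $$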


Proof. Part (1). By Lemma 3.2, we estimate
$$ \|\omega^m fg\|_2 \le C\left(\|f\|_\infty\|\omega^m g\|_2 + \|\omega^m f\|_{n/\delta}\|g\|_r\right) \tag{3.16} $$
with $0 < \delta = \delta(r) \le n/2$. For $m < n/2$, we choose $\delta = m$ and continue (3.16) by
$$ \cdots \le C\left(\|f\|_\infty + \|\omega^{n/2}f\|_2\right)\|\omega^m g\|_2 \le C\,|f|^{\cdot}_{\bar m}\,\|\omega^m g\|_2 $$
by Sobolev inequalities, which yields (3.8) in this case. For $m > n/2$, we choose $\delta = n/2$ and continue (3.16) by
$$ \cdots \le C\left(\|f\|_\infty\|\omega^m g\|_2 + \|\omega^m f\|_2\|g\|_\infty\right) \le C\,|f|^{\cdot}_m\,|g|^{\cdot}_m $$
which yields (3.8) since $m \le \bar m$. For $m = n/2$, we take $k_0 \vee (n - \bar m) \le \delta < n/2$ and estimate the last term in (3.16) by $C\|\omega^{n-\delta}f\|_2\|\omega^{\delta}g\|_2$ by Sobolev inequalities, which yields again (3.8) in that case. Finally (3.9) and (3.10) are immediate consequences of (3.8).

Part (2). For $m > 1$, we estimate by Lemma 3.2
$$ \|[\omega^m, f]\nabla g\|_2 \le C\left(\|\nabla f\|_\infty\|\omega^m g\|_2 + \|\omega^m f\|_{n/\delta}\|\nabla g\|_r\right) \tag{3.17} $$
with $0 < \delta = \delta(r) \le n/2$. For $m < n/2 + 1$, we choose $\delta = m - 1$ and continue (3.17) by
$$ \cdots \le C\left(\|\nabla f\|_\infty + \|\omega^{n/2}\nabla f\|_2\right)\|\omega^m g\|_2 $$
by Sobolev inequalities, which yields (3.11) in this case. For $m > n/2 + 1$, we choose $\delta = n/2$ and continue (3.17) by
$$ \cdots \le C\left(\|\nabla f\|_\infty\|\omega^m g\|_2 + \|\omega^m f\|_2\|\nabla g\|_\infty\right) \le C\,|\nabla f|^{\cdot}_{m-1}\,|\nabla g|^{\cdot}_{m-1} $$
which yields (3.12) since $m \le \bar m + 1$. For $m = n/2 + 1$, we take $k_0 \vee (n - \bar m) \le \delta < n/2$ and estimate the last term in (3.17) by $C\|\omega^{n-\delta}\nabla f\|_2\|\omega^{\delta}\nabla g\|_2$ by Sobolev inequalities, which yields again (3.12) in this case. For $0 \le m \le 1$, (3.11) follows immediately from Lemma 3.2 and from the inclusion $K^{\bar m} \subset F(L^1)$. Finally (3.13) and (3.14) are immediate consequences of (3.11), (3.12).


Part (3). By Lemma 3.2, we estimate
$$ \|[\omega^m, f]g\|_2 \le C\left(\|\omega^m f\|_2\|g\|_\infty + \|\nabla f\|_{n/\delta}\|\omega^{m-1}g\|_r\right) \tag{3.18} $$
with $0 \le \delta = \delta(r) < n/2$. We estimate the last term in (3.18) by Sobolev inequalities as
$$ C\,\|\omega^{1+n/2-\delta}f\|_2\,\|\omega^{m-1+\delta}g\|_2 \le C\,|f|^{\cdot}_m\,|g|^{\cdot}_{\bar m} $$
provided
$$ 1 + n/2 - m \le \delta \le 1 + n/2 - k_0 , \qquad k_0 - m + 1 \le \delta \le \bar m + 1 - m , $$
and the various conditions on $\delta$ are compatible for $m \ge k_0$ and $1 < m \le \bar m + 1$. This proves (3.15).

We next give some estimates of the various components of $B_1$, defined by (2.27)–(2.29) and (2.31). It follows immediately from (2.31) that
$$ \|\omega^m(x\cdot B_a)_S\|_2 \le t^{\beta(m-p)}\|\omega^p(x\cdot B_a)_S\|_2 \le t^{\beta(m-p)}\|\omega^p(x\cdot B_a)\|_2 \tag{3.19} $$
for $m \le p$, and similarly
$$ \|\omega^m(x\cdot B_a)_L\|_2 \le t^{\beta(m-p)}\|\omega^p(x\cdot B_a)_L\|_2 \le t^{\beta(m-p)}\|\omega^p(x\cdot B_a)\|_2 \tag{3.20} $$
for $m \ge p$. We next estimate $F_j(M)$ defined by (2.25). From (2.25) and from the dilation identity
$$ \|\omega^m D_0(\nu)f\|_2 = \nu^{-m+3/2}\|\omega^m f\|_2 , \tag{3.21} $$
it follows immediately that
$$ \|\omega^{m+1}F_j(M)\|_2 \le C\,I_{m+j}\left(\|\omega^m M\|_2\right) \tag{3.22} $$
for $j = 0, 1$, where
$$ (I_m(f))(t) = \int_1^\infty d\nu\,\nu^{-m-3/2}f(\nu t) \tag{3.23} $$
or equivalently
$$ (I_m(f))(t) = t^{m+1/2}\int_t^\infty dt'\,t'^{-m-3/2}f(t') \tag{3.24} $$
for $t > 0$. In particular for $m \ge 1$ and $j = 0, 1$,
$$ |F_j(M)|^{\cdot}_{m+1} \le C\,I_j\left(|M|_m\right). \tag{3.25} $$
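The two expressions (3.23) and (3.24) for $I_m$ are related by the change of variable $t' = \nu t$. A minimal numerical sketch of that equivalence follows; it is not from the paper. Each form is evaluated with its own substitution ($\nu = 1/u$ for the first, $t' = te^{x}$ with a truncation at `x_max` for the second) and a midpoint rule, under the assumption that the test function $f$ is bounded on $[t, \infty)$ so that the truncated tail is negligible. The function names are illustrative.

```python
import math


def I_m_scaling_form(f, m, t, n_steps=20000):
    """(I_m f)(t) = int_1^inf nu^(-m-3/2) f(nu t) d nu, as in (3.23).

    Substituting nu = 1/u maps the integral to int_0^1 u^(m-1/2) f(t/u) du.
    """
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h
        total += u ** (m - 0.5) * f(t / u)
    return total * h


def I_m_time_form(f, m, t, x_max=40.0, n_steps=20000):
    """t^(m+1/2) int_t^inf t'^(-m-3/2) f(t') dt', as in (3.24).

    Substituting t' = t exp(x) gives dt' = t' dx; the integral is truncated at x_max.
    """
    h = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        tp = t * math.exp(x)
        total += tp ** (-m - 1.5) * f(tp) * tp
    return t ** (m + 0.5) * total * h


# Test function chosen for illustration only.
f = lambda s: 1.0 / (1.0 + s * s)
print(I_m_scaling_form(f, 1, 2.0), I_m_time_form(f, 1, 2.0))

# For a pure power f(s) = s^(-a) both forms reduce to t^(-a) / (m + 1/2 + a).
g = lambda s: 1.0 / s
print(I_m_scaling_form(g, 1, 3.0), (1.0 / 3.0) / 2.5)
```

The power-law case also shows why $I_m$ preserves decay rates in $t$, which is how it enters the estimates of $B_a$ and $B_b$.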

We next estimate $x\cdot F_j(M)$. Using the commutation relation $[x\cdot, P] = (n-1)\Delta^{-1}\nabla$, easily proved in Fourier transformed variables, and estimating
$$ |\sin((\nu - 1)\omega)| \le \nu\omega \vee 1 , \tag{3.26} $$
we obtain
$$ \|\omega^{m+1}\,x\cdot F_j(M)\|_2 \le C\,I_{m-1+j}\left(\|\omega^m(x\otimes M)\|_2 + \|\omega^m M\|_2\right), \tag{3.27} $$
and in particular for $m \ge 1$,
$$ |x\cdot F_1(M)|^{\cdot}_{m+1} \vee |\nabla x\cdot F_0(M)|^{\cdot}_m \le I_0\left(|x\otimes M|_m + |M|_m\right). \tag{3.28} $$

The estimates (3.25), (3.28) will be the main tools used to estimate $B_a$ and $B_b$ given by (2.28), (2.29).

4. Cauchy Problem and Asymptotics for the Auxiliary System

In this section, we study the Cauchy problem for the auxiliary system (2.34), (2.35) for large but finite initial time, and we derive asymptotic properties in time of its solutions. The basic tool of this section consists of a priori estimates for suitably regular solutions of the linearized system (2.37), (2.38). Those estimates can be proved by a regularisation and limiting procedure and hold in the integrated form at the available level of regularity. For brevity, we shall state them in differential form and we shall restrict the proof to the formal computation. We first estimate a single solution of the linearized system (2.37), (2.38) at the level of regularity where we shall eventually solve the auxiliary system (2.34), (2.35).

Lemma 4.1. Let $k > 3/2$, let $T \ge 1$, $I = [T, \infty)$ and let $(w, s, B_b) \in C(I, X^k)$ with $|w|_k \vee |xw|_k \in L^\infty(I)$. Let
$$ a = \left\|\,|w|_k \vee |xw|_k ; L^\infty(I)\right\| . \tag{4.1} $$
Let $I' \subset I$ be an interval, let $(w', s')$ be a solution of the system (2.37) with $(w', s', 0) \in C(I', X^k)$ and let $B_b'$ be defined by (2.38). Let $0 \le \theta \le 1$ and $k \le \ell \le k + 2$. Then the following estimates hold:
$$ |B_a|^{\cdot}_{k+1} \le C\,I_0\left(|w|_k\,|xw|_k\right) \le C\,a^2 , \tag{4.2} $$
$$ \|x\cdot B_a; \dot H^{k+1}\| \le |\nabla(x\cdot B_a)|^{\cdot}_k \le C\,I_0\left(|xw|_k\left(|xw|_k + |w|_k\right)\right) \le C\,a^2 , \tag{4.3} $$
for all $t \in I$,
$$ \begin{aligned} \partial_t|w'|_{k+\theta} \le{} & C\,t^{-2}\left\{|\nabla(s+B)|^{\cdot}_k\,|w'|_{k+\theta} + |\nabla\cdot s|^{\cdot}_{k+\theta}\,|w'|_k + \left(|s|^{\cdot}_{k+\theta} + |B|^{\cdot}_{k+\theta}\right)|B|^{\cdot}_{k+\theta}\,|w'|_k\right\} \\ & + C\,t^{-1-\beta(1-\theta)}\,\|x\cdot B_a; \dot H^{k+1}\|\,|w'|_k + C\,t^{-1}\,|x\cdot B_b|^{\cdot}_{k+\theta}\,|w'|_k \equiv M_4(\theta, w'), \end{aligned} \tag{4.4} $$
$$ \partial_t|xw'|_k \le M_4(0, xw') + C\,t^{-2}\left(|w'|_{k+1} + |s+B|^{\cdot}_k\,|w'|_k\right), \tag{4.5} $$
$$ \partial_t|s'|^{\cdot}_\ell \le C\,t^{-2}\left(|s|^{\cdot}_{k+1}\,|s'|^{\cdot}_\ell + \chi(\ell \ge k+1)\,|s|^{\cdot}_\ell\,|s'|^{\cdot}_{k+1}\right) + C\,t^{-1}\,|w|_k\,|w|_{\ell-1} + C\,t^{-1+\beta(\ell-k)}\,|\nabla(x\cdot B_a)|^{\cdot}_k , \tag{4.6} $$
for all $t \in I'$,
$$ |B_b'|^{\cdot}_{k+\theta} \le C\,t^{-1}\,I_1\left(|w|_k\left(|w|_{k+\theta} + |s+B|^{\cdot}_k\,|w|_k\right)\right), \tag{4.7} $$
$$ |x\cdot B_b'|^{\cdot}_{k+\theta} \le C\,t^{-1}\,I_0\left(\left(|w|_k + |xw|_k\right)\left(|w|_{k+\theta} + |s+B|^{\cdot}_k\,|w|_k\right)\right) \tag{4.8} $$

for all $t \in I$.

Remark 4.1. The statements on $B_b'$ are non empty only in so far as the integrals over $\nu$ in the RHS of (4.7), (4.8) are absolutely convergent. This requires suitable assumptions on the behaviour of $(w, s, B)$ at infinity in time. Such assumptions will be made in due course. The only assumption of this type made so far is (4.1) which ensures (4.2), (4.3), thereby making (4.4), (4.5), (4.6) into non empty statements.

Proof. The estimates (4.2) and (4.3) follow immediately from (3.25), (3.28) and from (3.10) with $m = \bar m = k$. We next estimate $w'$ in $H^{k+\theta}$. It is clear from (2.37) that $\|w'\|_2 = \text{const}$. Let $m = k + \theta$. By a standard energy method we estimate
$$ \begin{aligned} \partial_t\|\omega^m w'\|_2 \le{} & t^{-2}\left\{\|[\omega^m, s+B]\cdot\nabla w'\|_2 + \|[\omega^m, \nabla\cdot s]w'\|_2 + \|[\omega^m, 2B\cdot s + B^2]w'\|_2\right\} \\ & + t^{-1}\|[\omega^m, (x\cdot B_a)_S]w'\|_2 + t^{-1}\|[\omega^m, x\cdot B_b]w'\|_2 . \end{aligned} \tag{4.9} $$
We estimate the first norm in the RHS by (3.14) with $\bar m = k$ and the other norms by (3.15) with $\bar m = k$. Furthermore, by (3.19),
$$ |(x\cdot B_a)_S|^{\cdot}_{k+\theta} \le t^{-\beta(1-\theta)}\,\|x\cdot B_a; \dot H^{k+1}\| , \tag{4.10} $$
so that the RHS of (4.9) is estimated by that of (4.4), which together with $L^2$ norm conservation, yields (4.4). We next estimate $xw'$ in $H^k$, starting from (2.39). We obtain
$$ \partial_t\|\omega^k xw'\|_2 \le \text{terms containing } xw' + t^{-2}\left(\|\omega^{k+1}w'\|_2 + \|\omega^k((s+B)w')\|_2\right), \tag{4.11} $$
where the terms containing $xw'$ are obtained from the RHS of (4.9) by replacing $w'$ by $xw'$ and taking $m = k$. Those terms are estimated in the same way as before. Estimating the last norm in (4.11) by (3.15) and using the obvious $L^2$ estimate,
$$ \partial_t\|xw'\|_2 \le t^{-2}\left(\|\nabla w'\|_2 + \|s+B\|_\infty\|w'\|_2\right) \tag{4.12} $$
yields (4.5). We next estimate $s'$. From (2.37) we obtain
$$ \partial_t\|\omega^\ell s'\|_2 \le t^{-2}\left(\|[\omega^\ell, s]\cdot\nabla s'\|_2 + \|(\nabla\cdot s)\,\omega^\ell s'\|_2\right) + C\,t^{-1}\|\omega^{\ell-1}|w|^2\|_2 + t^{-1}\|\omega^{\ell+1}(x\cdot B_a)_L\|_2 . \tag{4.13} $$
We estimate the first norm in the RHS by (3.11), (3.12) with $m = \ell$ and $\bar m = k$ if $\ell \le k + 1$, while for $\ell \ge k + 1$,
$$ \|[\omega^\ell, s]\nabla s'\|_2 \le C\left(\|\nabla s\|_\infty\|\omega^\ell s'\|_2 + \|\omega^\ell s\|_2\|\nabla s'\|_\infty\right) \le C\left(|s|^{\cdot}_{k+1}\,|s'|^{\cdot}_\ell + |s|^{\cdot}_\ell\,|s'|^{\cdot}_{k+1}\right) \tag{4.14} $$
by a direct application of Lemma 3.2. The last two norms in the RHS of (4.13) are estimated as
$$ \|\omega^{\ell-1}|w|^2\|_2 \le C\,\|w\|_\infty\,\|\omega^{\ell-1}w\|_2 \le C\,|w|_k\,|w|_{\ell-1} \tag{4.15} $$
by Lemma 3.2, and $\|\omega^{\ell+1}(x\cdot B_a)_L\|_2 \le t^{\beta(\ell-k)}\|\omega^{k+1}(x\cdot B_a)\|_2$ by (3.20). Together with the simpler estimate
$$ \partial_t\|\nabla s'\|_2 \le t^{-2}\|\nabla s\|_\infty\|\nabla s'\|_2 + C\,t^{-1}\|w\|_4^2 + t^{-1}\|\nabla^2(x\cdot B_a)\|_2 , \tag{4.16} $$
those estimates yield (4.6). We finally estimate $B_b'$. From (3.25) with $j = 1$ and (3.28), we obtain
$$ |B_b'|^{\cdot}_{k+\theta} \le C\,t^{-1}\,I_1\left(|M_b|_{k+\theta-1}\right), \tag{4.17} $$
$$ |x\cdot B_b'|^{\cdot}_{k+\theta} \le C\,t^{-1}\,I_0\left(|xM_b|_{k+\theta-1} + |M_b|_{k+\theta-1}\right) \tag{4.18} $$
from which (4.7), (4.8) follow by (3.9), (3.10).

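We next estimate the difference of two solutions of the linearized system (2.37), (2.38) corresponding to two different choices of $(w, s, B_b)$. We estimate that difference at a lower level of regularity than the solutions themselves.

Lemma 4.2. Let $k > 3/2$, let $T \ge 1$, $I = [T, \infty)$ and let $(w_i, s_i, B_{bi}) \in C(I, X^k)$, $i = 1, 2$ with $|w_i|_k \vee |xw_i|_k \in L^\infty(I)$. Let
$$ a = \mathop{\mathrm{Max}}_i \left\|\,|w_i|_k \vee |xw_i|_k ; L^\infty(I)\right\| . \tag{4.19} $$
Let $I' \subset I$ be an interval, let $(w_i', s_i')$ be solutions of the system (2.37) associated with $(w_i, s_i, B_{bi})$ with $(w_i', s_i', 0) \in C(I', X^k)$ and let $B_{bi}'$ be defined by (2.38) in terms of $(w_i, s_i, B_{bi})$. Define $(w_\pm, s_\pm, B_{b\pm}) = (1/2)\left((w_1, s_1, B_{b1}) \pm (w_2, s_2, B_{b2})\right)$ and similarly for the primed quantities and for $B_a$, $B$, $B\cdot s$ and $B^2$. Let
$$ 1/2 < k' \le k - 1 , \qquad 0 \le \theta \le 1 , \qquad 1 \vee k' \le \ell \le k' + 2 . \tag{4.20} $$
Then the following estimates hold:
$$ |B_{a-}|^{\cdot}_{k'+1} \le C\,I_0\left(|xw_+|_k\,|w_-|_{k'}\right), \tag{4.21} $$
$$ \|x\cdot B_{a-}; \dot H^{k'+1}\| \le C\,I_{k'-1}\left(\left(|xw_+|_k + |w_+|_k\right)|xw_-|_{k'}\right), \tag{4.22} $$
$$ \begin{aligned} \partial_t|w_-'|_{k'+\theta} \le{} & C\left\{t^{-2}\left(|\nabla s_+|^{\cdot}_k + |\nabla B_+|^{\cdot}_k + |(Bs)_+|^{\cdot}_k + |(B^2)_+|^{\cdot}_k\right) + t^{-1-\beta(k-k')}\,\|x\cdot B_{a+}; \dot H^{k+1}\| + t^{-1}\,|x\cdot B_{b+}|^{\cdot}_k\right\}|w_-'|_{k'+\theta} \\ & + C\,t^{-2}\left\{\left(|s_-|^{\cdot}_{k'+1} + |B_-|^{\cdot}_{k'+1}\right)|w_+'|_{k+\theta} + \left(|s_-|^{\cdot}_{k'+1+\theta} + |B_+|^{\cdot}_k\left(|s_-|^{\cdot}_{k'+1} + |B_-|^{\cdot}_{k'+1}\right) + |s_+|^{\cdot}_k\,|B_-|^{\cdot}_{k'+1}\right)|w_+'|_k\right\} \\ & + C\left(t^{-1-\beta(1-\theta)}\,\|x\cdot B_{a-}; \dot H^{k'+1}\| + t^{-1}\,|x\cdot B_{b-}|^{\cdot}_{k'+1}\right)|w_+'|_k , \end{aligned} \tag{4.23} $$
$$ \partial_t|xw_-'|_{k'} \le \mathrm{Idem}(\theta = 0,\ w' \to xw') + C\,t^{-2}\left(|w_-'|_{k'+1} + |(s+B)_+|^{\cdot}_k\,|w_-'|_{k'} + |(s+B)_-|^{\cdot}_{k'+1}\,|w_+'|_k\right), \tag{4.24} $$
$$ \begin{aligned} \partial_t|s_-'|^{\cdot}_\ell \le{} & C\,t^{-2}\left(|\nabla s_+|^{\cdot}_k\,|s_-'|^{\cdot}_\ell + |s_-|^{\cdot}_\ell\,|\nabla s_+'|^{\cdot}_k + \chi(\ell \ge k)\,|s_-|^{\cdot}_k\,|\nabla s_+'|^{\cdot}_\ell\right) \\ & + C\,t^{-1}\,|w_+|_k\,|w_-|_{\ell-1} + C\,t^{-1+(\ell-k')\beta}\,\|\nabla(x\cdot B_{a-}); \dot H^{k'} \cap \dot H^{k'\wedge 1}\| , \end{aligned} \tag{4.25} $$
$$ |B_{b-}'|^{\cdot}_{k'+1} \le t^{-1}\,I_1\left(|M_{b-}|_{k'}\right), \tag{4.26} $$
$$ |x\cdot B_{b-}'|^{\cdot}_{k'+1} \le t^{-1}\,I_0\left(|xM_{b-}|_{k'} + |M_{b-}|_{k'}\right), \tag{4.27} $$
where $M_{b-} = (M_{b1} - M_{b2})/2$, $M_b$ is defined by (2.23), and
$$ \begin{aligned} |M_{b-}|_{k'} \vee |x\cdot M_{b-}|_{k'} \le{} & C\,|\langle x\rangle w_+|_k\left(|w_-|_{k'+1} + |s_+ + B_+|^{\cdot}_k\,|w_-|_{k'} + |s_- + B_-|^{\cdot}_{k'+1}\,|w_+|_k\right) \\ & + C\,|\langle x\rangle w_-|_k\,|s_- + B_-|^{\cdot}_{k'+1}\,|w_-|_k . \end{aligned} \tag{4.28} $$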

Proof. The estimates (4.21), (4.22) follow immediately from (2.28), from (3.25), (3.28) and from (3.10) with $m = k'$ and $\bar m = k$.

We next estimate $(w_-', s_-')$. Taking the difference of the systems (2.37) for $(w_i', s_i')$, we obtain the following system for $(w_-', s_-')$:
$$ \begin{aligned} \partial_t w_-' ={} & i(2t^2)^{-1}\Delta w_-' + t^{-2}\left(Q(s_+ + B_+, w_-') + Q(s_- + B_-, w_+')\right) \\ & - i(2t^2)^{-1}\left(\left(2(B\cdot s)_+ + (B^2)_+\right)w_-' + \left(2(B\cdot s)_- + (B^2)_-\right)w_+'\right) \\ & + i t^{-1}\left(\left((x\cdot B_{a+})_S + x\cdot B_{b+}\right)w_-' + \left((x\cdot B_{a-})_S + x\cdot B_{b-}\right)w_+'\right), \end{aligned} \tag{4.29} $$
$$ \partial_t s_-' = t^{-2}\left(s_+\cdot\nabla s_-' + s_-\cdot\nabla s_+'\right) + t^{-1}\nabla g(w_+, w_-) - t^{-1}\nabla(x\cdot B_{a-})_L . \tag{4.30} $$
Let $m = k' + \theta$, so that $1/2 < m \le k$. We estimate $\|\omega^m w_-'\|_2$ by the same method as in Lemma 4.1, using (3.14) for the term $(s+B)_+\cdot\nabla w_-'$ with $\bar m = k$, and using (3.8) for all the other terms, with $\bar m = k$ or $\bar m = k' + 1$, depending on whether the estimated quantity is of + or − type. We obtain
$$ \begin{aligned} \partial_t\|\omega^m w_-'\|_2 \le{} & C\left\{t^{-2}\left(|\nabla(s+B)_+|^{\cdot}_k + |\nabla\cdot s_+|^{\cdot}_k + |(B\cdot s)_+ + (B^2)_+|^{\cdot}_k\right) + t^{-1}\left(|(x\cdot B_{a+})_S|^{\cdot}_{k'+1} + |x\cdot B_{b+}|^{\cdot}_k\right)\right\}|w_-'|_m \\ & + C\,t^{-2}\left\{|(s+B)_-|^{\cdot}_{k'+1}\,|w_+'|_{m+1} + |\nabla\cdot s_-|_m\,|w_+'|_k + |(B\cdot s)_- + (B^2)_-|^{\cdot}_{k'+1}\,|w_+'|_m\right\} \\ & + t^{-1}\left\{|(x\cdot B_{a-})_S|_m\,|w_+'|_k + |x\cdot B_{b-}|^{\cdot}_{k'+1}\,|w_+'|_m\right\}. \end{aligned} \tag{4.31} $$
We next estimate
$$ |(x\cdot B_{a+})_S|^{\cdot}_{k'+1} \le t^{-\beta(k-k')}\,\|x\cdot B_{a+}; \dot H^{k+1}\| , \qquad |(x\cdot B_{a-})_S|_m \le t^{-\beta(1-\theta)}\,\|x\cdot B_{a-}; \dot H^{k'+1}\| $$
by (3.19). We estimate $\partial_t\|w_-'\|_2$ in a simpler way as above by taking $m = 0$ and omitting the terms with $w_-'$ in (4.31). Collecting the previous estimates yields (4.23).

We now turn to the proof of (4.24), namely to the estimate of $xw_-'$. From (2.39) we obtain
$$ \partial_t\,xw_-' = \mathrm{Idem}(w' \to xw') - i t^{-2}\nabla w_-' - t^{-2}\left((s+B)_+\,w_-' + (s+B)_-\,w_+'\right), \tag{4.32} $$
where Idem denotes the RHS of (4.29). The estimate (4.24) then follows from (4.23) with $\theta = 0$ and $w'$ replaced by $xw'$ and from an easy estimate of the additional terms using (3.10) with $m = k'$ and $\bar m = k$ or $k' + 1$.

We now turn to the proof of (4.25) namely to the estimate of $s_-'$. We estimate
$$ \begin{aligned} \partial_t\|\omega^\ell s_-'\|_2 \le{} & t^{-2}\left(\|[\omega^\ell, s_+]\cdot\nabla s_-'\|_2 + \|\nabla\cdot s_+\|_\infty\|\omega^\ell s_-'\|_2 + \|\omega^\ell(s_-\cdot\nabla s_+')\|_2\right) \\ & + C\,t^{-1}\|\omega^{\ell-1}(\bar w_+ w_-)\|_2 + t^{-1}\|\omega^{\ell+1}(x\cdot B_{a-})_L\|_2 . \end{aligned} \tag{4.33} $$
We estimate the first norm in the RHS by (3.11), (3.12) with $m = \ell$ and $\bar m = k$. We estimate the third norm by (3.8) with $m = \ell$ and $\bar m = k$ for $\ell \le k$, while for $\ell \ge k$,
$$ \|\omega^\ell(s_-\nabla s_+')\|_2 \le C\left(\|\omega^\ell s_-\|_2\|\nabla s_+'\|_\infty + \|s_-\|_\infty\|\omega^\ell\nabla s_+'\|_2\right) \le C\left(|s_-|^{\cdot}_\ell\,|\nabla s_+'|^{\cdot}_k + |s_-|^{\cdot}_k\,|\nabla s_+'|^{\cdot}_\ell\right) $$
by a direct application of Lemma 3.2. We next estimate
$$ \|\omega^{\ell-1}(\bar w_+ w_-)\|_2 \le C\,|w_+|_k\,|w_-|_{\ell-1} $$
by (3.10) with $m = \ell - 1$ and $\bar m = k$, and we estimate the last norm in (4.33) by (3.20). Together with the special case $\ell = 1$, the previous estimates yield (4.25). Finally (4.26), (4.27) are special cases of (3.25), (3.28), while (4.28) follows from repeated use of (3.10) with $m = k'$ and $\bar m = k$ or $k' + 1$.


We now begin to study the auxiliary system (2.34), (2.35) and its linearized version (2.37), (2.38). The first step is to solve the linear system (2.37) globally in time.

Proposition 4.1. Let $k > 3/2$, let $T \ge 1$, $I = [T, \infty)$ and let $(w, s, B_b) \in C(I, X^k)$ with $|w|_k \vee |xw|_k \in L^\infty(I)$. Let $t_0 \in I$ and $(w_0, s_0, 0) \in X^k$. Then the system (2.37) has a unique solution $(w', s')$ in $I$ such that $(w', s', 0) \in C(I, X^k)$ and $(w', s')(t_0) = (w_0, s_0)$. That solution satisfies the estimates (4.4), (4.5), (4.6) for all $t \in I$. Two such solutions $(w_i', s_i')$ associated with $(w_i, s_i)$, $i = 1, 2$, satisfy the estimates (4.23), (4.24), (4.25) for all $t \in I$.

Proof. We first prove the existence of a unique solution $(w', s') \in C(I, Y^k)$, where $Y^k = H^{k+1} \oplus K^{k+2}$. The proof proceeds in the same way as that of Proposition 4.1 of [6], through a parabolic regularization and a limiting procedure. We define $U_1(t) = U(1/t)$ and $\widetilde w'(t) = U_1(t)w'(t)$. We first consider the case $t \ge t_0$. The system (2.37) with a parabolic regularization added is rewritten in terms of the variables $(\widetilde w', s')$ as
$$ \begin{cases} \partial_t\widetilde w' = \eta\Delta\widetilde w' + U_1\left(L - i(2t^2)^{-1}\Delta\right)U_1^*\,\widetilde w' \equiv \eta\Delta\widetilde w' + G_1(\widetilde w') \\ \partial_t s' = \eta\Delta s' + t^{-2}\,s\cdot\nabla s' + t^{-1}\nabla g(w) - t^{-1}\nabla(x\cdot B_a)_L \equiv \eta\Delta s' + G_2(s'), \end{cases} $$
where $L$ is defined in (2.34) and where the parametric dependence of $L$, $G_1$, $G_2$ on $(w, s, B_b)$ has been omitted. The Cauchy problem for that system can be recast into the integral form
$$ \begin{pmatrix}\widetilde w' \\ s'\end{pmatrix}(t) = V_\eta(t - t_0)\begin{pmatrix}\widetilde w_0 \\ s_0\end{pmatrix} + \int_{t_0}^t dt'\,V_\eta(t - t')\begin{pmatrix}G_1(\widetilde w') \\ G_2(s')\end{pmatrix}(t'), \tag{4.34} $$
where $V_\eta(t) = \exp(\eta t\Delta)$. The operator $V_\eta(t)$ is a contraction in $Y^k$ and satisfies the bound $\|\nabla V_\eta(t); \mathcal{L}(Y^k)\| \le C(\eta t)^{-1/2}$. From those facts and from estimates on $G_1$, $G_2$ similar to and mostly contained in those of Lemma 4.1, it follows by a contraction argument that the system (4.34) has a unique solution $(\widetilde w_\eta', s_\eta') \in C([t_0, t_0 + T_1], Y^k)$ for some $T_1 > 0$ depending only on $|w_0|_{k+1}$, $|s_0|^{\cdot}_{k+2}$ and $\eta$. That solution satisfies the estimates (4.4) and (4.6) and can therefore be extended to $[t_0, \infty)$ by a standard globalisation argument using Gronwall's inequality.

We next take the limit $\eta \to 0$. Let $\eta_1, \eta_2 > 0$ and let $(w_i', s_i') = (w_{\eta_i}', s_{\eta_i}')$, $i = 1, 2$ be the corresponding solutions. Let $(w_-', s_-') = (1/2)(w_1' - w_2', s_1' - s_2')$. By estimates similar to, but simpler than those of Lemma 4.2, since in particular $(w_-, s_-, B_{b-}) = 0$, we obtain
$$ \begin{cases} \partial_t\|w_-'\|_2^2 \le |\eta_1 - \eta_2|\left(\|\nabla w_1'\|_2^2 + \|\nabla w_2'\|_2^2\right) \\ \partial_t\|\nabla s_-'\|_2^2 \le |\eta_1 - \eta_2|\left(\|\nabla^2 s_1'\|_2^2 + \|\nabla^2 s_2'\|_2^2\right) + C\,t^{-2}\|\nabla s\|_\infty\|\nabla s_-'\|_2^2 . \end{cases} $$
Those estimates imply that $(w_\eta', s_\eta')$ converges in $L^2 \oplus \dot H^1$ uniformly in time in the compact subintervals of $[t_0, \infty)$, to a solution of the original system. It follows then by a standard compactness argument using the estimates (4.4), (4.6) that the limit belongs to $C([t_0, \infty), Y^k)$. This completes the proof for $t \ge t_0$. The case $t \le t_0$ is treated similarly.

We next show that $xw' \in C(I, H^k)$. For that purpose we choose a function $\psi \in C_0^\infty(\mathbb{R}^3, \mathbb{R}^+)$ with $0 \le \psi \le 1$, $\psi(x) = 1$ for $|x| \le 1$, $\psi(x) = 0$ for $|x| \ge 2$, and we define $\psi_R$ by $\psi_R(x) = \psi(x/R)$. Clearly $\psi_R xw' \in C(I, H^{k+1})$ and $\psi_R xw'$ satisfies the equation
$$ \partial_t\,\psi_R xw' = L\,\psi_R xw' - i t^{-2}\left(\nabla(\psi_R x)\right)\cdot\nabla w' - i(2t^2)^{-1}\left(\Delta(\psi_R x)\right)w' - t^{-2}(s+B)\cdot\left(\nabla(\psi_R x)\right)w' . \tag{4.35} $$
Using Lemma 4.1, more precisely (4.5) and the fact that the operator of multiplication by a function $\varphi_R(x) = \varphi(x/R)$ for $\varphi \in C_0^\infty$ is a bounded operator in $H^m$ for all $m \ge 0$ uniformly in $R$ for $R \ge 1$, we estimate
$$ \partial_t|\psi_R xw'|_k \le M_4(0, \psi_R xw') + C\,t^{-2}\left(|w'|_{k+1} + R^{-1}|w'|_k\right) + C\,t^{-2}\,|s+B|^{\cdot}_k\,|w'|_k . \tag{4.36} $$
Integrating (4.36) between $t_0$ and $t$ and using Gronwall's inequality, we obtain
$$ |\psi_R xw'|_k \le C(t)\left(1 + |\psi_R xw_0|_k\right) \le C(t)\left(1 + |xw_0|_k\right) $$
so that $\psi_R xw'$ is bounded in $H^k$ uniformly in $R$. This implies that $xw' \in L^\infty_{loc}(I, L^2)$ and that $\psi_R xw'$ tends to $xw'$ strongly in $L^2$ pointwise in $t$ when $R \to \infty$. Moreover, it follows from (4.35) that $xw'$ is weakly continuous in $L^2$ as a function of $t$. Together with (4.36), this implies that $xw' \in C(I, H^k)$ by standard compactness arguments.

We now turn to the study of the auxiliary system (2.34), (2.35). The main results will be the existence and uniqueness of solutions of that system, defined in a neighborhood of infinity in time and with suitable bounds at infinity, and some asymptotic behaviour of those solutions. The bounds on the solutions at infinity will be essentially dictated by the existence result (Proposition 4.3 below), and for simplicity we shall mostly restrict our attention to solutions satisfying those bounds, although more general solutions could be considered in the uniqueness and asymptotic behaviour results. We shall thus consider solutions (w, s, Bb ) ∈ C(I, Xk ) for some interval I = [T , ∞) such that ≡ (t) ≡ |w|k ∨ |xw|k ∨ (n t)−1 |w|k+1 ∨ (n t)−1 |s|˙k ∨ t −β |s|˙k+1 ∨t −2β |s|˙k+2 ∨ |Bb |˙k ∈ L∞ (I ).

(4.37)

We first state the uniqueness result. Proposition 4.2. Let k > 3/2, 0 < β < 1/2 and I = [T , ∞). (1) Let t0 ∈ I and (w0 , s0 , 0) ∈ Xk . Then for t0 sufficiently large, the auxiliary system (2.34), (2.35) has at most one solution (w, s, Bb ) ∈ C(I, X k ) satisfying (4.37) and (w, s)(t0 ) = (w0 , s0 ). (2) Let (wi , si , Bbi ) i = 1, 2, be two solutions of the auxiliary system (2.34), (2.35) in C(I, Xk ) satisfying (4.37) and such that for some ε > 0, |s− |˙k+1 ∨ t 2β+ε (|w− |k ∨ |xw− |k−1 ) → 0 when t → ∞. Then (w1 , s1 , Bb1 ) = (w2 , s2 , Bb2 ).

(4.38)


Remark 4.2. The condition t0 sufficiently large in Part (1) takes the form t0 ≥ 1 + ; L∞ ([t0 , ∞)) N

(4.39)

for some N > 0. For a given solution (w, s, Bb ), the RHS of (4.39) is decreasing in t0 while the LHS is increasing, and Part (1) supplemented by (4.39) gives a lower bound of the initial time for which the given solution is uniquely determined as a solution of the Cauchy problem with that initial time. Proof. The proof relies on Lemma 4.2, and we first recast the estimates of that lemma in a simplified form for solutions satisfying (4.37). We consider two solutions (wi , si , Bbi )

)), i = 1, 2 of the system (2.34), (2.35) satisfying (4.37) and we define (≡ (wi , si , Bbi A(t1 ) = Max i (t); L∞ ([t1 , ∞)) i=1,2

(4.40)

for $t_1 \ge T$, where the quantities under the norm are those defined by (4.37) for the two solutions. We define $(w_-, s_-, B_{b-})$ as in Lemma 4.2, and

$$ y_1 = |w_-|_k , \qquad y = |w_-|_{k-1} \vee |x w_-|_{k-1} , \qquad z_j = |s_-|\dot{}_{k+j-1} , \quad j = 1, 2. \tag{4.41} $$

We rewrite the estimates (4.21)–(4.28) for general $t \in [t_1, \infty)$ for some $t_1 \ge T$. The + quantities are estimated by (4.37), (4.40) supplemented by (4.2), (4.3), (4.7), (4.8) as regards $B_a$ and $B_b$. This produces overall constants depending polynomially on $A(t_1)$, which we omit for brevity, but the occurrence of which should be kept in mind for subsequent arguments. In terms having the same dependence on the dynamical variables, we keep only the terms with the leading behaviour in $t$, namely we use $t^{1-\beta} \ge \ln t \ge 1$. We take $k' = k - 1$ and rewrite (4.21)–(4.28) as follows:

$$ |B_{a-}|\dot{}_k \le I_0(y) , \tag{4.42} $$

$$ \| x \cdot B_{a-} ; \dot H^k \| \le I_{k-2}(y) , \tag{4.43} $$

$$ |\partial_t y| \le t^{-1-\beta} y + t^{-2}\big( z_1 + |B_-|\dot{}_k \ln t \big) + t^{-1-\beta} I_{k-2}(y) + t^{-1} |x \cdot B_{b-}|\dot{}_k + t^{-2} y_1 , \tag{4.44} $$

$$ |\partial_t y_1| \le t^{-1-\beta} y_1 + t^{-2}\big( z_1 \ln t + z_2 + |B_-|\dot{}_k \ln t \big) + t^{-1} I_{k-2}(y) + t^{-1} |x \cdot B_{b-}|\dot{}_k , \tag{4.45} $$

$$ |\partial_t z_1| \le t^{-2+\beta} z_1 + t^{-1} y + t^{-1+\beta} I_m(y) , \tag{4.46} $$

$$ |\partial_t z_2| \le t^{-2+\beta}\big( z_2 + t^{\beta} z_1 \big) + t^{-1} y_1 + t^{-1+2\beta} I_m(y) , \tag{4.47} $$


J. Ginibre, G. Velo

with $m = (k - 2) \wedge 0$,

$$ |B_{b-}|\dot{}_k \le t^{-1} I_1\big( y_1 + y \ln t + z_1 + |B_-|\dot{}_k \big) , \tag{4.48} $$

$$ |x \cdot B_{b-}|\dot{}_k \le t^{-1} I_0\big( y_1 + y \ln t + z_1 + |B_-|\dot{}_k \big) . \tag{4.49} $$

The system (4.42)–(4.49) will be the starting point for the proof of Proposition 4.2.

Part 1. We first prove uniqueness for $t \ge t_0$. Note that this region is autonomous in the sense that the equations in this region involve the dynamical variables in this region only, since the integrals (2.25) occurring in (2.28)–(2.29) are taken for $\nu \ge 1$. We define

$$ Y = \| y ; L^\infty([t_0, \infty)) \| , \qquad Y_1 = \| y_1 (\ln t)^{-1} ; L^\infty([t_0, \infty)) \| , \tag{4.50} $$

which are finite by (4.37) and we prove that those quantities are zero by integrating (4.44)–(4.47) with initial condition $(y, y_1, z_1, z_2)(t_0) = 0$. In keeping with the previous simplification, we perform that computation up to constants (depending on $A(t_0)$) and under conditions that $t_0$ is sufficiently large in the sense of (4.39) when needed. We furthermore eliminate the diagonal terms in (4.46)–(4.47) by exponentiation, namely by using the fact that

$$ \partial_t y \le f y + g \ \Rightarrow\ y(t) \le \exp\Big( \int_{t_0}^{\infty} f \Big) \Big( y(t_0) + \int_{t_0}^{t} g \Big) \tag{4.51} $$

for integrable $f$. Using (4.50) and integrating (4.46), (4.47) (and using $z_1 \le z_2$ and $2\beta < 1$), we obtain

$$ z_1 \le t^{\beta} Y , \qquad z_2 \le (\ln t)^2 Y_1 + t^{2\beta} Y . \tag{4.52} $$
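The exponentiation step (4.51) is a standard integrating-factor argument. A minimal sketch of the computation follows, under the assumptions (not stated explicitly in the text) that $f, g, y \ge 0$ on $[t_0, \infty)$ and that $f$ is integrable there; the symbol $F$ is introduced only for this illustration.

```latex
% Sketch of (4.51), assuming f, g, y >= 0 and f integrable on [t_0, \infty).
% F is an auxiliary primitive of f, introduced here for illustration only.
\begin{align*}
F(t) &:= \int_{t_0}^{t} f(t')\,dt' , \qquad 0 \le F(t) \le \int_{t_0}^{\infty} f , \\
\partial_t\big( e^{-F(t)} y(t) \big)
  &= e^{-F(t)}\big( \partial_t y - f y \big) \le e^{-F(t)} g(t) \le g(t) , \\
e^{-F(t)} y(t) &\le y(t_0) + \int_{t_0}^{t} g(t')\,dt' , \\
y(t) &\le \exp\Big( \int_{t_0}^{\infty} f \Big)
        \Big( y(t_0) + \int_{t_0}^{t} g \Big) .
\end{align*}
```

The point of the bound is that the diagonal coefficient $f$ only contributes a bounded multiplicative constant, so it can be dropped from the subsequent inequalities at the cost of a constant.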

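Substituting (4.50), (4.52) into (4.42), (4.48), (4.49) yields

$$ |B_{a-}|\dot{}_k \vee \| x \cdot B_{a-} ; \dot H^k \| \le Y , \tag{4.53} $$

$$ |B_{b-}|\dot{}_k \vee |x \cdot B_{b-}|\dot{}_k \le t^{-1} \ln t\, (Y_1 + Y) + t^{-1+\beta} Y + t^{-1} \| |B_-|\dot{}_k ; L^\infty([t_0, \infty)) \| \le t^{-1} \ln t\, Y_1 + t^{-1+\beta} Y \tag{4.54} $$

for $t_0$ sufficiently large, so that one can assume

$$ |B_-|\dot{}_k \le Y . \tag{4.55} $$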

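Substituting (4.50), (4.52)–(4.55) into (4.44), (4.45) yields

$$ \partial_t y \le \big( t^{-1-\beta} + t^{-2+\beta} + t^{-2} \ln t \big) Y + t^{-2} \ln t\, Y_1 \le t^{-1-\beta} Y + t^{-2} \ln t\, Y_1 , \tag{4.56} $$

$$ \partial_t y_1 \le \big( t^{-2+\beta} \ln t + t^{-2} \ln t + t^{-2+2\beta} + t^{-1} \big) Y + t^{-2} (\ln t)^2 Y_1 \le t^{-1} Y + t^{-2} (\ln t)^2 Y_1 . \tag{4.57} $$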


Integrating (4.56), (4.57) in $[t_0, t]$ and comparing with (4.50) yields

$$ Y \le t_0^{-\beta} Y + t_0^{-1} \ln t_0\, Y_1 , \qquad Y_1 \le Y + t_0^{-1} \ln t_0\, Y_1 , \tag{4.58} $$

which implies $Y = Y_1 = 0$ for $t_0$ sufficiently large.

We next prove uniqueness for $t \le t_0$. Since we have already proved uniqueness for $t \ge t_0$, $(y, y_1, z_1, z_2)$ vanish for $t \ge t_0$, and the region $t \le t_0$ also becomes autonomous in the previous sense, which it would not have been if treated first. Now however we are in a standard situation where the variables $\{y_i\} = \{y, y_1, z_1, z_2, I_0(|B_-|\dot{}_k)\}$ satisfy a system of inequalities of the type

$$ |\partial_t y_i| \le \sum_j f_{ij}\, y_j + g_i \int_t^{t_0} dt' \sum_j h_{ij}(t')\, y_j(t') \tag{4.59} $$

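for $t \le t_0$. This can be reduced to the case of a single function $\bar y = \sum_i y_i$, namely

$$ |\partial_t \bar y| \le f\, \bar y + g \int_t^{t_0} dt'\, h(t')\, \bar y(t') \tag{4.60} $$

with

$$ f = \max_j \sum_i f_{ij} , \qquad g = \max_i g_i , \qquad h = \max_j \sum_i h_{ij} . $$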
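The reduction from the system (4.59) to the scalar inequality (4.60) amounts to summing over $i$ and exchanging the order of summation. A short sketch, assuming (as the context suggests but does not state) that all $y_i$, $f_{ij}$, $g_i$ and $h_{ij}$ are non-negative:

```latex
% Sketch: summing (4.59) over i, with \bar y = \sum_i y_i and all
% y_i, f_{ij}, g_i, h_{ij} assumed non-negative.
\begin{align*}
|\partial_t \bar y|
 &\le \sum_i |\partial_t y_i|
  \le \sum_j \Big( \sum_i f_{ij} \Big) y_j
     + \sum_i g_i \int_t^{t_0} dt' \sum_j h_{ij}(t')\, y_j(t') \\
 &\le \Big( \max_j \sum_i f_{ij} \Big) \bar y
     + \Big( \max_i g_i \Big) \int_t^{t_0} dt'
       \sum_j \Big( \sum_i h_{ij}(t') \Big) y_j(t') \\
 &\le f\, \bar y + g \int_t^{t_0} dt'\, h(t')\, \bar y(t') .
\end{align*}
```

Taking the maximum over the column sums is what allows a single triple $(f, g, h)$ to control every component at once.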

One can then eliminate $f$ by exponentiation, in very much the same way as in (4.51). Integrating (4.60) with $f = 0$ and $\bar y(t_0) = 0$ yields

$$ \bar y(t) \le G(t) \int_t^{t_0} dt'\, h(t')\, \bar y(t') \equiv G(t)\, \bar Y(t) $$

with $G(t) = \int_t^{t_0} dt'\, g(t')$ and therefore

$$ |\partial_t \bar Y| \le h\, G\, \bar Y $$

with $\bar Y(t_0) = 0$, which implies $\bar Y = 0$ and therefore $\bar y = 0$ for all $t \le t_0$.

Part 2. We proceed in the same way as in Part (1) for $t \ge t_0$. Let $\lambda = 2\beta + \varepsilon$, so that $2\beta < \lambda < 1$ and define

$$ Y(t) = \| \cdot^{\lambda}\, y(\cdot) ; L^\infty([t, \infty)) \| , \qquad Y_1(t) = \| \cdot^{\lambda}\, y_1(\cdot) ; L^\infty([t, \infty)) \| . \tag{4.61} $$

We take $t_1$ and $t_0$ such that $T \le t_1 \le t_0 < \infty$, with $t_1$ sufficiently large, and we estimate $(y, y_1, z_1, z_2)$ for $t \in [t_1, t_0]$ by integrating (4.44)–(4.47) between $t$ and $t_0$ with final data $(y_0, y_{10}, z_{10}, z_{20})$ at $t_0$. Let $Y = Y(t_1)$, $Y_1 = Y_1(t_1)$. From (4.42), (4.43), (4.61) we obtain

$$ |B_{a-}|\dot{}_k \vee \| x \cdot B_{a-} ; \dot H^k \| \le t^{-\lambda} Y \tag{4.62} $$


for $t \ge t_1$. Integrating (4.46), (4.47) as before, we now obtain

$$ z_1 \le z_{10} + t_-^{\beta-\lambda} Y , \qquad z_2 \le z_{20} + t_-^{-\lambda} Y_1 + \varepsilon^{-1} t_-^{-\varepsilon} Y \tag{4.63} $$

for $t \ge t_1$, with $t_- = t \wedge t_0$. (Note that we need an estimate of $z_1$ for $t \ge t_0$ for substitution in (4.48), (4.49)). Substituting (4.61), (4.62), (4.63) into (4.48), (4.49) yields

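$$ |B_{b-}|\dot{}_k \vee |x \cdot B_{b-}|\dot{}_k \le t^{-1} \big( t^{-\lambda} (Y_1 + Y \ln t) + z_{10} + t_-^{\beta-\lambda} Y + t^{-\lambda} Y + I_0( |B_{b-}|\dot{}_k ) \big) \le t^{-1} \big( t^{-\lambda} Y_1 + z_{10} + t^{\beta-\lambda} Y \big) \tag{4.64} $$

for $t_1 \le t \le t_0$ and $t_1$ sufficiently large. Substituting (4.61)–(4.64) into (4.44), (4.45) yields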

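$$ |\partial_t y| \le \big( t^{-1-\beta-\lambda} + t^{-2+\beta-\lambda} \big) Y + t^{-2} z_{10} + t^{-2-\lambda} Y_1 \le t^{-1-\beta-\lambda} Y + t^{-2} z_{10} + t^{-2-\lambda} Y_1 , \tag{4.65} $$

since $2\beta < 1$,

$$ |\partial_t y_1| \le \big( t^{-1-\beta-\lambda} + t^{-2-\lambda} \big) Y_1 + t^{-2} ( z_{10} \ln t + z_{20} ) + \big( t^{-2+\beta-\lambda} + \varepsilon^{-1} t^{-2-\varepsilon} + t^{-1-\lambda} \big) Y \le t^{-1-\beta-\lambda} Y_1 + t^{-2} ( z_{10} \ln t + z_{20} ) + t^{-1-\lambda} Y . \tag{4.66} $$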

Integrating (4.65), (4.66) between $t$ and $t_0$ yields

$$ y \le y_0 + t^{-\beta-\lambda} Y + t^{-1} z_{10} + t^{-1-\lambda} Y_1 , \qquad y_1 \le y_{10} + t^{-\beta-\lambda} Y_1 + t^{-1} ( z_{10} \ln t + z_{20} ) + t^{-\lambda} Y . \tag{4.67} $$

Substituting the result into the definition (4.61) and using the fact that $t_0^{\lambda} y_0 \le Y(t_0)$, $t_0^{\lambda} y_{10} \le Y_1(t_0)$ by definition, we obtain

$$ Y \le Y(t_0) + t_1^{-1+\lambda} z_{10} + t_1^{-\beta} Y + t_1^{-1} Y_1 , \qquad Y_1 \le Y_1(t_0) + t_1^{-1+\lambda} ( z_{10} \ln t_1 + z_{20} ) + Y + t_1^{-\beta} Y_1 . \tag{4.68} $$

Taking $t_1$ sufficiently large to eliminate the diagonal terms, we obtain

$$ Y \le m_0 + t_1^{-1} Y_1 , \qquad Y_1 \le m_1 + Y $$

with

$$ m_0 = Y(t_0) + t_1^{-1+\lambda} z_{10} , \qquad m_1 = Y_1(t_0) + t_1^{-1+\lambda} ( z_{10} \ln t_1 + z_{20} ) , \tag{4.69} $$

which implies

$$ Y \le m_0 + t_1^{-1} m_1 , \qquad Y_1 \le m_0 + m_1 $$

for $t_1$ sufficiently large. We now let $t_0 \to \infty$ for fixed $t_1$. Then $m_0$ and $m_1$ tend to zero, and therefore $Y = Y_1 = 0$, which together with (4.63), (4.64) and another limit $t_0 \to \infty$ for fixed $t_1$, implies that $(w_1, s_1, B_{b1}) = (w_2, s_2, B_{b2})$.

We now turn to the main result of this section, namely the existence of solutions of the auxiliary system (2.34), (2.35) with sufficiently large initial time, defined for sufficiently large time and satisfying (4.37).

Proposition 4.3. Let $k > 3/2$ and $0 < \beta < 1/2$. Let $(w_0, s_0, 0) \in X^k$. Then

(1) There exists $T_0 < \infty$, depending on $(w_0, s_0)$, such that for all $t_0 > T_0$, there exists $T < t_0$ such that the auxiliary system (2.34), (2.35) has a unique solution $(w, s, B_b) \in C(I, X^k)$, where $I = [T, \infty)$, satisfying (4.37) and $(w, s)(t_0) = (w_0, s_0)$. The dependence of $T_0$ (resp. $T$) on $(w_0, s_0)$ (resp. and on $t_0$) can be formulated more precisely as follows. For any set $A = \{a, a_1, b_0, b_1, b_2\}$ of five positive numbers, there exists $T_0 = T_0(A)$ and for any $t_0 > T_0$, there exists $T = T(t_0, A) < t_0$, increasing in $t_0$ and such that $T(T_0(A), A) = T_0(A)$, such that the previous statement holds for such $T_0$, $t_0$, $T$ for all $(w_0, s_0, 0) \in X^k$ such that

$$ |w_0|_k \vee |x w_0|_k \le a , \qquad |w_0|_{k+1} \le a_1 \ln t_0 , \qquad |s_0|\dot{}_{k+j} \le b_j \big( \ln t_0 + t_0^{j\beta} \big) , \quad j = 0, 1, 2. \tag{4.70} $$

The solution $(w, s, B_b)$ satisfies the estimates

$$ |w|_k \vee |x w|_k \le C a , \qquad |w|_{k+1} \le C (a_1 + a^3) \ln t_+ , \tag{4.71} $$

$$ |s|\dot{}_{k+j} \le C ( b_j + a^2 ) \big( \ln t_+ + t_+^{j\beta} \big) , \quad j = 0, 1, 2, \tag{4.72} $$

$$ |B_b|\dot{}_{k+1} \vee |x \cdot B_b|\dot{}_{k+1} \le C ( a a_1 + a^2 b_0 + a^4 )\, t^{-1} \ln t_+ \tag{4.73} $$

for all t ≥ T , with t+ = t ∨ t0 . In addition w ∈ L∞ (I, H k+θ ) for 0 ≤ θ < 1. (2) The map (w0 , s0 ) → (w, s, Bb ) is continuous for fixed t0 , on the bounded sets of Xk , from the norm of (w0 , s0 , 0) ∈ Xk−1 to the norm of (w, s, Bb ) in L∞ (J, Xk−1 ) for any interval J ⊂⊂ I , and in the weak-∗ sense to L∞ (J, Xk ). Proof. Part 1. The proof consists in exploiting the estimates of Lemmas 4.1 and 4.2 in order to show that the map : (w, s, Bb ) → (w , s , Bb ) defined by Proposition 4.1 with (w , s )(t0 ) = (w, s)(t0 ) and by (2.38) is a contraction of a suitable set R of C(I, Xk ) for a suitably time rescaled norm of L∞ (I, Xk−1 ). More precisely, let I = [T , ∞) and t0 ∈ I . For (w, s, 0) ∈ C(I, Xk ), we define   y = |w|k ∨ |xw|k , y1 = |w|k+1 (4.74)  z = |s|˙ , j = 0, 1, 2, j k+j


and we define $R$ by

$$ R = \Big\{ (w, s, B_b) \in C(I, X^k) : (w, s)(t_0) = (w_0, s_0),\ y \le Y,\ y_1 \le Y_1 \ln t_+ ,\ z_j \le Z_j \big( \ln t_+ + t_+^{j\beta} \big),\ j = 0, 1, 2,\ |B_b|\dot{}_{k+1} \vee |x \cdot B_b|\dot{}_{k+1} \le N t^{-1} \ln t_+ \Big\} \tag{4.75} $$

for some positive constants $(Y, Y_1, Z_j, N)$ to be chosen later. Actually, those constants will turn out to take the form that appears in (4.71)–(4.73). The proof will require various lower bounds on $T$ and $t_0$, depending on $(Y, Y_1, Z_j, N)$. Since those constants will take the form that appears in (4.71)–(4.73), the lower bounds on $T$ and $t_0$ will eventually be expressed in terms of $(a, a_1, b_j)$, thereby taking the form stated in the proposition.

We first show that the set $R$ is mapped into itself. Let $(w, s, B_b) \in R$. From (4.2)–(4.3) it follows that

$$ |B_a|\dot{}_{k+1} \vee |\nabla (x \cdot B_a)|\dot{}_k \le C\, I_0(y^2) \le C\, Y^2 . \tag{4.76} $$

We now get rid of the variable $B_b$. From (4.7), (4.8), (4.76) it follows that

$$ |B_b'|\dot{}_{k+1} \vee |x \cdot B_b'|\dot{}_{k+1} \le C t^{-1} I_0\big( y y_1 + y^2 ( z_0 + I_0(y^2) + |B_b|\dot{}_k ) \big) \le C t^{-1} \ln t_+ \big( Y Y_1 + Y^2 ( Z_0 + Y^2 + N t^{-1} ) \big) \le N t^{-1} \ln t_+ \tag{4.77} $$

with

$$ N = C \big( Y Y_1 + Y^2 ( Z_0 + Y^2 ) \big) \tag{4.78} $$

provided $T$ is sufficiently large, namely $T \ge C Y^2$. Therefore under that condition, if we choose $N$ in (4.75) as given by (4.78), the condition on $B_b$ in (4.75) is automatically reproduced by the map, and it remains only to show that the map reproduces the conditions on $(w, s)$. In order to proceed, we furthermore impose the condition

$$ |B_b|\dot{}_{k+1} \le Y^2 \tag{4.79} $$

which together with (4.76) ensures that

$$ |B|\dot{}_{k+1} \le C\, Y^2 . \tag{4.80} $$

That condition is ensured by taking $T$ and $t_0$ sufficiently large so that

$$ t\, (\ln t_+)^{-1} \ge N / Y^2 = C \big( Y_1 / Y + Z_0 + Y^2 \big) \tag{4.81} $$

for all $t \ge T$, a condition which can be rewritten as

$$ t_0 \ge T \ge N\, Y^{-2} \ln t_0 = C \big( Y_1 / Y + Z_0 + Y^2 \big) \ln t_0 , \tag{4.82} $$

where we have included the condition $t_0 \ge T$ for completeness.

Let now $(w', s')$ be the solution of the system (2.37) with initial data $(w', s')(t_0) = (w_0, s_0)$ obtained by using Proposition 4.1. We define

$$ y' = |w'|_k \vee |x w'|_k , \qquad y_1' = |w'|_{k+1} , \qquad z_j' = |s'|\dot{}_{k+j} , \quad j = 0, 1, 2. \tag{4.83} $$


From Lemma 4.1, more precisely from (4.4)–(4.6), we obtain

$$ |\partial_t y'| \le E\, y' + t^{-2} y_1' , \tag{4.84} $$

$$ |\partial_t y_1'| \le E\, y_1' + C t^{-2} z_2\, y' + C t^{-1} |\nabla (x \cdot B_a)|\dot{}_k\, y' , \tag{4.85} $$

$$ |\partial_t z_j'| \le C t^{-2} \big( z_1 z_j' + z_j z_1' \big) + C t^{-1} y ( y + \delta_{j2}\, y_1 ) + C t^{-1+j\beta} |\nabla (x \cdot B_a)|\dot{}_k , \tag{4.86} $$

where $\delta_{j2}$ is the Kronecker symbol and where

$$ E = C t^{-2} \big( z_1 + |B|\dot{}_{k+1} ( 1 + |B|\dot{}_{k+1} ) \big) + C t^{-1-\beta} |\nabla (x \cdot B_a)|\dot{}_k + C t^{-1} |x \cdot B_b|\dot{}_{k+1} . \tag{4.87} $$

We want to integrate (4.84)–(4.86) between $t_0$ and $t$ with appropriate initial data at $t_0$. For that purpose, we first eliminate the diagonal terms in (4.84), (4.85) by exponentiation according to (4.51). By (4.75), (4.76), (4.80) we estimate

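$$ E \le \bar E = C t^{-2} \big( Z_1 t_+^{\beta} + Y^2 (1 + Y^2) \big) + C t^{-1-\beta} Y^2 + C N t^{-2} \ln t_+ , \tag{4.88} $$

and therefore by integration

$$ \Big| \int_{t_0}^{t} dt'\, E(t') \Big| \le C (t \wedge t_0)^{-1} \big( t_0^{\beta} Z_1 + Y^2 (1 + Y^2) \big) + C (t \wedge t_0)^{-\beta} Y^2 + C (t \wedge t_0)^{-1} \ln t_0\, N \le C , \tag{4.89} $$

where we have used the estimates

$$ \Big| \int_{t_0}^{t} dt'\, t'^{-2}\, t_+'^{\,\beta} \Big| \le (1 - \beta)^{-1} (t \wedge t_0)^{-1} t_0^{\beta} , \tag{4.90} $$

$$ \Big| \int_{t_0}^{t} dt'\, t'^{-2} \ln t_+' \Big| \le (t \wedge t_0)^{-1} ( 1 + \ln t_0 ) , \tag{4.91} $$

and where the last inequality is achieved for $T$ and $t_0$ sufficiently large, namely

$$ t_0 \ge T \ge t_0^{\beta} Z_1 (1 + Y^2) + Y^2 + Y^{2/\beta} + N \ln t_0 . \tag{4.92} $$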
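The two elementary integral bounds (4.90), (4.91) can be checked by splitting into the cases $t \ge t_0$ and $t \le t_0$. The sketch below assumes $t, t_0 \ge 1$, $0 < \beta < 1$, and reads $t_+' = t' \vee t_0$ inside the integrals, in line with the definition $t_+ = t \vee t_0$ used in the text; the integrals are taken in absolute value.

```latex
% Sketch of (4.90)-(4.91), assuming t, t_0 >= 1, 0 < \beta < 1,
% and t'_+ = t' \vee t_0 (so t'_+ = t' for t' >= t_0, t'_+ = t_0 otherwise).
\begin{align*}
\text{Case } t \ge t_0:\quad
 \int_{t_0}^{t} t'^{-2+\beta}\,dt'
   &= \frac{t_0^{\beta-1} - t^{\beta-1}}{1-\beta}
    \le (1-\beta)^{-1}\, t_0^{-1}\, t_0^{\beta} , \\
 \int_{t_0}^{t} t'^{-2} \ln t'\,dt'
   &= \Big[ -\frac{1+\ln t'}{t'} \Big]_{t_0}^{t}
    \le t_0^{-1} (1 + \ln t_0) . \\[4pt]
\text{Case } t \le t_0:\quad
 \int_{t}^{t_0} t'^{-2}\, t_0^{\beta}\,dt'
   &= t_0^{\beta}\big( t^{-1} - t_0^{-1} \big)
    \le t^{-1}\, t_0^{\beta} , \\
 \int_{t}^{t_0} t'^{-2} \ln t_0\,dt'
   &= \ln t_0\, \big( t^{-1} - t_0^{-1} \big)
    \le t^{-1} \ln t_0 .
\end{align*}
```

In both cases the prefactor is $(t \wedge t_0)^{-1}$, and since $(1-\beta)^{-1} \ge 1$ the stated right-hand sides of (4.90) and (4.91) cover the two cases uniformly.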

We next estimate $y'$ and $y_1'$ by integrating (4.84), (4.85) between $t_0$ and $t$ with initial conditions $y'(t_0) \le a$ and $y_1'(t_0) \le a_1 \ln t_0$ according to (4.70) and with $E$ exponentiated to a constant under the condition (4.92). We obtain

$$ y' \le C a + C \langle t^{-2} y_1' \rangle , \qquad y_1' \le C a_1 \ln t_0 + C Z_2 \langle t^{-2} t_+^{2\beta} y' \rangle + C Y^2 \langle t^{-1} y' \rangle \tag{4.93} $$

by the use of (4.75), (4.76) and with the shorthand notation

$$ \langle f \rangle (t) = \Big| \int_{t_0}^{t} dt'\, f(t') \Big| . $$

Let now

$$ Y' = \| y' ; L^\infty([T, \infty)) \| , \qquad Y_1' = \| (\ln t_+)^{-1} y_1' ; L^\infty([T, \infty)) \| . \tag{4.94} $$

(Strictly speaking, we should use a bounded interval $[T, T_1]$ with $T_1$ large instead of $[T, \infty)$, check that the subsequent estimates are uniform in $T_1$, and take the limit $T_1 \to \infty$ at the end. We omit that step for simplicity). From (4.93) we obtain

$$ y' \le C a + C Y_1' (t \wedge t_0)^{-1} \ln t_0 , \qquad y_1' \le C a_1 \ln t_0 + C Z_2 Y' (t \wedge t_0)^{-1} t_0^{2\beta} + C Y^2 Y' \ln t_+ \tag{4.95} $$

by integration and by the use of (4.90), (4.91), and therefore

$$ Y' \le C a + C Y_1' T^{-1} \ln t_0 , \qquad Y_1' \le C a_1 + C Z_2 Y' T^{-1} t_0^{2\beta} + C Y^2 Y' , \tag{4.96} $$

which implies $Y' \le Y$ and $Y_1' \le Y_1$ provided we take

$$ Y = C a , \qquad Y_1 = C ( a_1 + a^3 ) , \tag{4.97} $$

and provided we take $T$ and $t_0$ sufficiently large so that

$$ t_0 \ge T \ge C ( a_1 / a + a^2 ) \ln t_0 \ \vee\ C a Z_2\, t_0^{2\beta} / ( a_1 + a^3 ) . \tag{4.98} $$

This shows that the conditions on $(y, y_1)$ in (4.75) are preserved by the map. We finally estimate $z_j'$, $j = 0, 1, 2$, by integrating (4.86) between $t_0$ and $t$ with initial condition $z_j'(t_0) \le b_j ( \ln t_0 + t_0^{j\beta} )$ according to (4.70). Using (4.75) and exponentiating the diagonal term with $z_1 z_j'$ to a constant under the condition

$$ t_0 \ge T \ge C Z_1\, t_0^{\beta} , \tag{4.99} $$

we obtain

$$ z_j' \le C b_j \big( \ln t_0 + t_0^{j\beta} \big) + C Z_j \Big| \int_{t_0}^{t} dt'\, t'^{-2} \big( \ln t_+' + t_+'^{\,j\beta} \big) z_1'(t') \Big| + C Y^2 \big( \ln t_+ + t_+^{j\beta} \big) + C Y Y_1 \delta_{j2} (\ln t_+)^2 . \tag{4.100} $$

We let now

$$ Z_j' = \| \big( \ln t_+ + t_+^{j\beta} \big)^{-1} z_j' ; L^\infty([T, \infty)) \| \tag{4.101} $$

and obtain from (4.100)

$$ Z_j' \le C b_j + C Z_j Z_1' T^{-1} t_0^{\beta} + C Y^2 + C Y Y_1 (\ln t_0)^2\, t_0^{-2\beta} , $$

which implies $Z_j' \le Z_j$, $j = 0, 1, 2$ provided we take

$$ Z_j = C ( b_j + a^2 ) , \tag{4.102} $$

and provided $T$ and $t_0$ are sufficiently large so that (4.99) holds and in addition

$$ t_0^{2\beta} \ge C ( a_1 / a + a^2 ) (\ln t_0)^2 . \tag{4.103} $$

We have proved that $R$ is mapped into itself provided $(Y, Y_1, Z_j, N)$ are chosen according to (4.97), (4.102), (4.78) and provided $t_0$ and $T$ are taken sufficiently large according to (4.82), (4.92), (4.98), (4.99), (4.103). The latter conditions clearly take the form stated in the proposition.

We next show that the map is a contraction in $R$ for a suitably time rescaled norm

) = (w , s , B ), i = 1, 2. of L∞ (I, Xk−1 ). Let (wi , si , Bbi ) ∈ R and let (wi , si , Bbi i i bi As in Lemma 4.2, we define (w− , w− , Bb− ) = (1/2)((w1 , s1 , Bb1 ) − (w2 , s2 , Bb2 )) and similarly for the primed quantities. Furthermore, we define   y− = |w− |k−1 ∨ |xw− |k−1 , y1− = |w− |k (4.104)  z = |s |˙ n− = |Bb− |˙k ∨ |x · Bb− |˙k j− − k+j −1 , j = 1, 2, and similarly for the primed quantities and for Ba and B. From Lemma 4.2 with k = k −1 and from the fact that the + quantities are estimated by the definition (4.75) of R and by (4.76), we obtain |Ba− |˙k ≤ C Y I0 (y− ),

(4.105)

x · Ba− ; H˙ k ≤ C Y Ik−2 (y− ),

(4.106)

¯ − | ≤ Ey + C t −2 Y (1 + Y 2 )z1− + 1 + Z0 n t+ + Y 2 (Y I0 (y− ) + n− ) |∂t y− +C t −1−β Y 2 Ik−2 (y− ) + C t −1 Y n− + t −2 y1 − , (4.107)

¯ 1 + C t −2 Y1 n t+ + Y 3 z1− + Y z2− + Y1 n t+ + Y Z0 n t+ + Y 3 |∂t y1 − | ≤ Ey − (Y I0 (y− ) + n− ) + C t −1 Y 2 Ik−2 (y− ) + C t −1 Y n− , where E¯ is defined by (4.88),

β 2β |∂t zj − | ≤ C t −2 Z1 t+ (zj− + zj − ) + δj2 Z2 t+ z1− +C t −1 Y y− + δj2 y1− + C t −1+jβ Y Im (y− ) for j = 1, 2,

(4.108)

(4.109)

where m = (k − 2) ∧ 0,

n − ≤ C t −1 Y I0 y1− + Z0 n t+ + Y 2 y− + Y z1− + Y I0 (y− ) + n− . (4.110)


For brevity, we continue the argument with a simplified version of the system (4.107)– (4.110) where we eliminate the diagonal terms and in particular E¯ by exponentiation according to (4.51), and where we eliminate the constants and the factors (Y, Y1 , Zj , N ). As a consequence we shall not be able to follow the dependence of the additional lower bounds of t0 and T on those factors. That dependence is of the same type as that encountered in the proof of stability of R under . Thus we rewrite (4.107)–(4.110) as

$$ |\partial_t y_-'| \le t^{-2} \big( z_{1-} + \ln t_+ ( I_0(y_-) + n_- ) \big) + t^{-1-\beta} I_{k-2}(y_-) + t^{-1} n_- + t^{-2} y_{1-}' , \tag{4.111} $$

$$ |\partial_t y_{1-}'| \le t^{-2} \big( z_{2-} + \ln t_+ ( z_{1-} + I_0(y_-) + n_- ) \big) + t^{-1} I_{k-2}(y_-) + t^{-1} n_- , \tag{4.112} $$

$$ |\partial_t z_{j-}'| \le t^{-2} \big( t_+^{\beta} z_{j-} + \delta_{j2}\, t_+^{2\beta} z_{1-} \big) + t^{-1} \big( y_- + \delta_{j2}\, y_{1-} \big) + t^{-1+j\beta} I_m(y_-) , \tag{4.113} $$

$$ n_-' \le t^{-1} I_0 \big( y_{1-} + \ln t_+\, y_- + z_{1-} + I_0(y_-) + n_- \big) . \tag{4.114} $$

We now define

$$ Y_- = \| y_- ; L^\infty([T, \infty)) \| , \quad Y_{1-} = \| (\ln t_+)^{-1} y_{1-} ; L^\infty([T, \infty)) \| , \quad Z_{j-} = \| t_+^{-j\beta} z_{j-} ; L^\infty([T, \infty)) \| , \ j = 1, 2, \quad N_- = \| t\, t_+^{-\beta} n_- ; L^\infty([T, \infty)) \| , \tag{4.115} $$

and similarly for the primed quantities. Using those definitions and omitting the $-$ indices in the remaining part of the contraction proof, we obtain from (4.111)–(4.114),

$$ |\partial_t y'| \le t^{-2} \big( Z_1 t_+^{\beta} + Y \ln t_+ + N t^{-1} t_+^{\beta} \ln t_+ \big) + t^{-1-\beta} Y + t^{-2} t_+^{\beta} N + t^{-2} \ln t_+\, Y_1' , \tag{4.116} $$

$$ |\partial_t y_1'| \le t^{-2} \big( Z_2 t_+^{2\beta} + Z_1 t_+^{\beta} \ln t_+ + Y \ln t_+ + N t^{-1} t_+^{\beta} \ln t_+ \big) + t^{-1} Y + t^{-2} t_+^{\beta} N , \tag{4.117} $$

$$ |\partial_t z_j'| \le t^{-2} t_+^{(j+1)\beta} ( Z_j + Z_1 ) + t^{-1} \big( Y + \delta_{j2} \ln t_+\, Y_1 \big) + t^{-1+j\beta} Y , \tag{4.118} $$

$$ n' \le t^{-1} \big( (Y + Y_1) \ln t_+ + Z_1 t_+^{\beta} + N t^{-1} t_+^{\beta} \big) . \tag{4.119} $$

Integrating (4.116)–(4.118) between $t_0$ and $t$ with initial condition $(y', y_1', z_j')(t_0) = 0$, using (4.90), (4.91) and similar estimates, and omitting again some absolute constants, we obtain

$$ y' \le (t \wedge t_0)^{-1} \big( Z_1 t_0^{\beta} + Y \ln t_0 + N (t \wedge t_0)^{-1} t_0^{\beta} \ln t_0 \big) + (t \wedge t_0)^{-\beta} Y + (t \wedge t_0)^{-1} t_0^{\beta} N + (t \wedge t_0)^{-1} \ln t_0\, Y_1' , \tag{4.120} $$

$$ y_1' \le (t \wedge t_0)^{-1} \big( Z_2 t_0^{2\beta} + Z_1 t_0^{\beta} \ln t_0 + Y \ln t_0 + N (t \wedge t_0)^{-1} t_0^{\beta} \ln t_0 \big) + \ln t_+\, Y + (t \wedge t_0)^{-1} t_0^{\beta} N , \tag{4.121} $$

$$ z_j' \le (t \wedge t_0)^{-1} t_0^{\beta}\, t_+^{j\beta} ( Z_j + Z_1 ) + Y \ln t_+ + \delta_{j2}\, Y_1 (\ln t_+)^2 + t_+^{j\beta} Y . \tag{4.122} $$

Substituting (4.120)–(4.122) and (4.119) into the primed analog of the definition (4.115) (and with the $-$ indices omitted), we obtain

$$ \begin{cases} Y' \le T^{-1} \big( Z_1 t_0^{\beta} + Y \ln t_0 + N t_0^{\beta} + Y_1' \ln t_0 \big) + T^{-\beta} Y \\ Y_1' \le T^{-1} \big( Z_2 t_0^{2\beta} + Z_1 t_0^{\beta} + Y + N t_0^{\beta} \big) + Y \\ Z_j' \le T^{-1} t_0^{\beta} ( Z_j + Z_1 ) + Y + \delta_{j2}\, Y_1\, t_0^{-2\beta} (\ln t_0)^2 \\ N' \le (Y + Y_1)\, t_0^{-\beta} \ln t_0 + Z_1 + T^{-1} N . \end{cases} \tag{4.123} $$

Substituting $Y_1'$ from the second inequality into the first one, we recast (4.123) into the form

$$ \begin{cases} Y' \le \varepsilon ( Y + Z_1 + Z_2 + N ) \\ Y_1' \le Y + \varepsilon ( Z_1 + Z_2 + N ) \\ Z_j' \le Y + \varepsilon ( Y_1 + Z_1 + Z_j ) \\ N' \le Z_1 + \varepsilon ( Y + Y_1 + N ) , \end{cases} \tag{4.124} $$

where $\varepsilon$ can be made arbitrarily small by taking $t_0$ and $T$ sufficiently large. We then define

$$ X = Y + Z_1 / 4 + ( Y_1 + Z_2 + N ) / 8 \tag{4.125} $$

and similarly for the primed quantities. It then follows from (4.124) that X ≤ (1/2 + O(ε))X and therefore is a contraction of R in the norms defined by (4.104), (4.115) for T and t0 sufficiently large. By a standard compactness argument, R is easily shown to be closed for the latter norms. Therefore has a unique fixed point in R. The uniqueness of the solution in C(I, X k ) under the assumption (4.37) follows from Proposition 4.2, part (1). The estimates (4.71)–(4.73) follow from the definition (4.75) of R and from the choices (4.97), (4.102), (4.78) of (Y, Y1 , Zj , N ). The dependence of t0 and T on (w0 , s0 ) stated in the proposition follows from that choice and from the fact that the lower bounds on t0 and T are expressed in terms of those quantities, as explained above. Finally, the fact that w ∈ L∞ (I, H k+θ ) for 0 ≤ θ < 1 follows immediately from (4.4) by substituting the estimates contained in the definition of R into the RHS, and integrating in time after exponentiation of the diagonal term. The crucial point is the fact that the contribution of the term in x · Ba is integrable in time for θ < 1.
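The contraction factor $1/2 + O(\varepsilon)$ can be checked by hand from (4.124) and the weights in (4.125). The sketch below assumes the reading of (4.124) in which the left-hand sides are the primed quantities $(Y', Y_1', Z_j', N')$ and the right-hand sides the unprimed ones, and writes $X'$ for the primed analog of $X$.

```latex
% Sketch: X' <= (1/2 + O(eps)) X from (4.124)-(4.125).
% Assumption: left-hand sides of (4.124) are primed, right-hand sides unprimed.
\begin{align*}
X' &= Y' + \tfrac14 Z_1' + \tfrac18 \big( Y_1' + Z_2' + N' \big) \\
   &\le \tfrac14 Y + \tfrac18 \big( Y + Y + Z_1 \big)
        + O(\varepsilon) \big( Y + Y_1 + Z_1 + Z_2 + N \big) \\
   &= \tfrac12 \big( Y + \tfrac14 Z_1 \big)
        + O(\varepsilon) \big( Y + Y_1 + Z_1 + Z_2 + N \big) \\
   &\le \big( \tfrac12 + O(\varepsilon) \big) X ,
\end{align*}
% using Y + Z_1/4 <= X and Y + Y_1 + Z_1 + Z_2 + N <= 8 X.
```

The weights $1$, $1/4$, $1/8$ are chosen so that the terms without a small factor in (4.124), namely $Y$ in the bounds for $Y_1'$, $Z_j'$ and $Z_1$ in the bound for $N'$, add up to exactly half of the leading part of $X$.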


Part 2. Let (wi , si , Bbi ), i = 1, 2, be two solutions of the system (2.34), (2.35) with initial conditions (wi , si )(t0 ) = (wi0 , si0 ) as obtained in Part (1). In particular those solutions satisfy the estimates (4.71)–(4.73). We define (y− , y1− , zj− , n− ) by (4.104). By the same estimates as in the contraction proof, we obtain (4.111)–(4.114) with the primes omitted. We next define (Y− , Y1− , Zj− , N− ) by (4.115). Omitting again the − indices as in the contraction proof, we obtain (4.116)–(4.119) with the primes omitted. Integrating the first three equations thereof between t0 and t with initial condition (y, y1 , zj )(t0 ) = (y0 , y10 , zj 0 ), we obtain  y ≤ y0 + RHS of (4.120)     y1 ≤ y10 + RHS of (4.121)     zj ≤ zj 0 + RHS of (4.122) and therefore

  Y ≤ y0             −1   Y1 ≤ y10 (n t0 )    −jβ   Zj ≤ zj 0 t0      N≤

       

+ RHS of (4.123)

which by the same argument as in the contraction proof, implies that X defined by (4.125) satisfies

−β −2β X ≤ C y0 + y10 (n t0 )−1 + z10 t0 + z20 t0 . (4.126) This proves the continuity of the map (w0 , s0 ) → (w, s, Bb ) from the norm of (w0 , s0 , 0) in Xk−1 on the bounded sets of X k to the norm of (w, s, Bb ) in C(I, Xk−1 ) in the norms defined by (4.115), and a fortiori to the norm of (w, s, Bb ) in L∞ (J, Xk−1 ) for J ⊂⊂ I . The last continuity follows by a standard compactness argument. We conclude this section by deriving asymptotic properties in time of the solutions of the auxiliary system (2.34), (2.35) obtained in the previous proposition. We prove in particular the existence of asymptotic states (w+ , σ+ ) for those solutions. Proposition 4.4. Let k > 3/2, 0 < β < 1/2, T ≥ 1, I = [T , ∞) and let (w, s, Bb ) ∈ C(I, Xk ) be a solution of the auxiliary system (2.34), (2.35) satisfying (4.37). Then (1) There exists w+ ∈ H k such that xw+ ∈ H k , w(t) tends to w+ strongly in H k and

$xw(t)$ tends to $xw_+$ strongly in $H^{k'}$ for $0 \le k' < k$ and weakly in $H^k$ when $t \to \infty$. Furthermore the following estimates hold:

$$ |w_+|_k \vee |x w_+|_k \le a_\infty = \limsup_{t \to \infty} \big( |w(t)|_k \vee |x w(t)|_k \big) , \tag{4.127} $$

$$ |w(t) - U^*(1/t) w_+|_k \vee |w(t) - w_+|_k \le C t^{-\beta} , \tag{4.128} $$

$$ |x w(t) - U^*(1/t)\, x w_+|_{k-1} \vee |x w(t) - x\, U^*(1/t) w_+|_{k-1} \le C t^{-2\beta} , \tag{4.129} $$

$$ |x w(t) - x w_+|_{k-1} \le C \big( t^{-2\beta} + t^{-1/2} \big) . \tag{4.130} $$


The constants C in (4.128)–(4.130) depend on ; L∞ (I ) , where is defined by (4.37). Assume in addition that w ∈ L∞ (I, H k+θ ) for all θ, 0 ≤ θ < 1, and define (W, S) by (2.40). Then (2) For all θ, 0 ≤ θ < 1, w+ ∈ H k+θ and w(t) tends to w+ strongly in H k+θ when t → ∞. Furthermore xW (t) ∈ L∞ (I, H k+θ−1 ) and xW (t) tends to xw+ strongly in H k+θ−1 when t → ∞. (3) For all j , 0 ≤ j < 2, S ∈ C([1, ∞), K k+j ) and S satisfies the estimate |S|˙k+j ≤ Cε t β(j +ε)

(4.131)

for any ε > 0. Furthermore there exists σ+ such that for all θ , 0 ≤ θ < 1, σ+ ∈ K k+θ , s − S tends to σ+ strongly in K k+θ , and the following estimate holds : |s − S − σ+ |˙k+θ ≤ C t −β(1−θ) .

(4.132)

(t0 ) for some t0 ∈ I . From (2.34) Proof. Part 1. Let w (t) = U (1/t)w(t) and w 0 = w we obtain w−w 0 ) = U (1/t) t −2 Q(s + B, w) − i(2t 2 )−1 (2B · s + B 2 )w ∂t ( +i t −1 ((x · Ba )S + x · Bb ) w . (4.133) By estimates similar to, but simpler than, those of Lemma 4.1, we estimate w−w 0 )|k ≤ C t −2 (|s|˙k + |B|˙k )(|w|k+1 + |B|˙k |w|k ) + |s|˙k+1 |w|k |∂t (

+ t −1−β x · Ba ; H˙ k+1 + t −1 |x · Bb |˙k |w|k ≤ C t −1−β

(4.134)

by (4.2), (4.3), (4.7), (4.8) and (4.37), so that by integration | w (t) − w (t0 )|k ≤ C (t ∧ t0 )−β .

(4.135)

This implies the existence of w+ ∈ H k such that w(t) → w+ strongly in H k , and the first estimate of (4.128). The second estimate follows from the first one and from the fact that |(U (1/t) − 1l)w|k ≤ t −1/2 |w|k+1 ≤ C t −1/2 n t

(4.136)

by (4.37), and that β < 1/2. 0 ). From (2.36) we obtain Let similarly xw(t) = U (1/t)xw(t) and (xw) 0 = xw(t − (xw) 0 ) = U (1/t) t −2 Q(s + B, xw) − t −2 (s + B)w − it −2 ∇w ∂t (xw −i(2t 2 )−1 (2B · s + B 2 )xw + it −1 ((x · Ba )S + x · Bb ) xw , (4.137) and by similar estimates as previously


|∂t (xw − (xw) 0 )|k−1 −2 ≤C t |w|k + |s|˙k + |B|˙k |xw|k + |w|k−1 + |B|˙k |xw|k−1 +t −1−2β x · Ba ; H˙ k+1 |xw|k + t −1 |x · Bb |˙k |xw|k−1

(4.138) ≤ C t −2 n t + t −1−2β so that by integration |xw(t) − xw(t 0 )|k−1 ≤ C (t ∧ t0 )−2β

(4.139)

which together with (4.135) implies that xw+ ∈ and that the first estimate of (4.129) holds. The fact that xw+ ∈ H k and that xw+ satisfies (4.127) and the additional convergences of xw to xw+ follow therefrom by standard compactness and interpolation arguments. The second estimate of (4.129) follows from the first one and from the identity

(4.140) U ∗ (1/t)x = x + i t −1 ∇ U ∗ (1/t) H k−1

which implies |U ∗ (1/t)xw+ − xU ∗ (1/t)w+ |k−1 ≤ t −1 |w+ |k .

(4.141)

Finally (4.130) follows from (4.129) and from |(U ∗ (1/t) − 1l)xw+ |k−1 ≤ t −1/2 |xw+ |k ≤ a∞ t −1/2 .

(4.142)

Part 2. The first statement follows from Part (1) by standard compactness and interpolation arguments. The second statement follows from the first one, and from (4.140) which implies |U ∗ (1/t)xw+ − xW (t)|k+θ−1 ≤ t −1 |w+ |k+θ .

(4.143)

Part 3. We estimate in the same way as in Lemma 4.1,

|∂t |S|˙k+j | ≤ C t −1 |W |k+j −1 |W |k + C t −1+β(j +ε) I0 |xW |2k−ε

(4.144)

for 0 < ε < k − 3/2. The first statement and the estimate (4.131) then follow from Part (2) and from (4.144) by integration. Let now q = w − W and σ = s − S. From (2.34), we obtain ∂t σ = t −2 s · ∇s + t −1 ∇g(q, w + W ) − t −1 ∇(x · Ba )L (q, w + W ) and by estimates similar to those in Lemma 4.1, |∂t |σ |˙k+θ | ≤ C t −2 |s|˙k |s|˙k+2 + |s|˙2k+1 + C t −1 a|q|k +C t −1+β(1+θ) a Im (|xq|k−1 ) , where a is defined by (4.1) and m = (k − 2) ∧ 0,

· · · ≤ C t −2+2β n t + t −1−β + t −1−β(1−θ) by (4.128), (4.129), and therefore by integration |σ (t) − σ (t0 )|˙k+θ ≤ C(t ∧ t0 )−β(1−θ) , from which the result follows.

(4.145)


Remark 4.3. In Proposition 4.4, we have stated the asymptotic properties of (w, s) that follow readily from the bounds on the solutions obtained in Proposition 4.3, expressed in terms of (W, S) defined by (2.40). However part of the results hold under more general assumptions. For instance if we drop the assumptions on the higher norms |w|k+1 and |s|˙k+2 , we still get the existence of a limit w+ of w(t) with w+ ∈ H k and xw+ ∈ H k , with the estimate (4.129) for xw and a similar estimate for w. On the other hand, we could have expressed the asymptotic properties of (w, s) in terms of the simpler (W, S) defined by (2.41). However the convergence properties of w would be weaker (compare (4.130) with (4.129)), thereby yielding weaker convergence properties of s − S in Part (3). 5. Cauchy Problem at Infinity for the Auxiliary System In this section, we construct the wave operators for the auxiliary system (2.34), (2.35) in the difference form (2.49), (2.50) for infinite initial time, for the choice of (W, S) given by (2.40). In the same spirit as in Sect. 4, we first solve the linearized version (2.54) of the system (2.49), which together with (2.55) defines a map : (q, σ, Bb ) → (q , σ , Bb ). We then show that this map is a contraction on a suitable set in suitable norms. The basic tool of this section again consists of a priori estimates for suitably regular solutions of the linearized system (2.54), (2.55). We first estimate a solution of that system at the level of regularity where we shall eventually solve the auxiliary system (2.49), (2.50). Lemma 5.1. Let k > 3/2, 0 < β < 1/2, T ≥ 1, I = [T , ∞). Let (W, S, 0) ∈ C(I, Xk+1 ) with (U (1/t)W, S, 0) ∈ C 1 (I, Xk ) and let (q, σ, Bb ) ∈ C(I, X k ), with W, xW , q, xq ∈ L∞ (I, H k ). Let I ⊂ I be an interval, let (q , σ ) be a solution of the system (2.54) with (q , σ , 0) ∈ C(I , Xk ) and define Bb by (2.55). Let 0 ≤ θ ≤ 1 and k ≤ ≤ k + 2. 
Then the following estimates hold : |G|˙k+1 ≤ CI0 (|q|k |x(2W + q)|k ) , x · G; H˙ k+1 ≤ |∇(x · G)|˙k ≤ C I0 ((|xq|k + |q|k )|x(2W + q)|k ) ,

(5.1) (5.2)

˙ ˙ ∂t |q |k+θ ≤ M4 (θ, q ) + C t −2 |σ |˙ k+θ +1 + |σ |k+θ |B|k+θ

+|G + Bb |˙k+θ 1 + |2S + B + B∗ |˙k+θ + t −1−β(1−θ) x · G; H˙ k+1 +t −1 |x · Bb |˙k+θ |W |k+θ+1 + |R1 (W, S, 0)|k+θ ≡ M4 (θ, q ) + M5 (θ )|W |k+θ+1 + |R1 (W, S, 0)|k+θ , where M4 (θ, ·) is defined by the RHS of (4.4), ∂t |xq |k ≤ M4 (0, xq ) + M5 (0)|xW |k+1

+C t −2 |q |k+1 + |s + B|˙k |q |k + |σ + G + Bb |˙k |W |k +|xR1 (W, S, 0)|k ,

(5.3)

(5.4)


|∂t |σ |˙ | ≤ C t −2 |s|˙k+1 |σ |˙ + χ ( ≥ k + 1)|s|˙ |σ |˙k+1 + |σ |˙ |S|˙k+1 + |σ |˙k |S|˙+1 +C t −1 (|W |k+1 + |q|k ) |q|−1 +C t −1+β(−k) |∇(x · G)|˙k + |R2 (W, S)|˙ , (5.5)

|Bb |˙k+1 ≤ C t −1 I1 |q|k+1 (|W |k+1 + |q|k ) + |s + B|˙k |q|k (|W |k + |q|k ) +|σ + Bb + G|˙k |W |2k + |R3 (W, S, 0)|˙k+1 , (5.6)

|xBb |˙k+1 ≤ C t −1 I0 (|q|k+1 + |xq|k )(|W |k+1 + |xw|k ) +|s + B|˙k (|xq|k + |q|k )(|W |k + |q|k ) +|σ + Bb + G|˙k (|xW |k + |W |k )|W |k + |x · R3 (W, S, 0)|˙k+1 .

(5.7)

Remark 5.1. The boundedness assumptions in time of W and q ensure that the integrals I0 occurring in (5.1), (5.2) are convergent. Furthermore, by estimates similar to but simpler than those of Lemma 4.1, one sees easily that the norms of the remainders R1 and R2 that occur in (5.3), (5.4), (5.5) are finite under the assumptions made on (W, S). On the other hand (see Remark 4.1), the statements on Bb are non-empty only in so far as the integrals over ν in the RHS of (5.6), (5.7) are convergent. This requires additional assumptions on the behaviour of (W, S) and of (q, σ, Bb ) at infinity in time, which will be made in due course. Proof. The proof is very similar to that of Lemma 4.1. The estimates (5.1), (5.2) follow immediately from (3.25), (3.28) and from (3.10) with m = m ¯ = k. We next estimate q in H k+θ , starting from (2.54). Let m = k + θ . We estimate ∂t ωm q 2 by an energy method in the same way as in (4.9). The terms containing q are estimated in the same way as in (4.9), thereby yielding M4 (θ, q ), while the remaining terms are estimated with the help of (3.10) with m ¯ = m, supplemented by (3.19) for the term containing x · G. Together with an elementary estimate of ∂t q 2 (to which the terms containing q make no contribution), this proves (5.3). We next estimate xq in H k , starting from (2.56). The terms containing xq or xW explicitly yield the first two terms in the RHS of (5.4) by the special case θ = 0 of the proof of (5.3), while the remaining terms are estimated by (3.10) with m = m ¯ = k. This proves (5.4). We next estimate σ , starting with ∂t ω σ 2 . The term s · ∇σ is estimated exactly as in the proof of (4.6). The term σ · ∇S is estimated directly by Lemma 3.2 as

ω (σ · ∇S) 2 ≤ C ω σ 2 ∇S ∞ + σ ∞ ω+1 S 2

≤ C |σ |˙ |S|˙k+1 + |σ |˙k |S|˙+1 . (5.8) The term containing g is estimated by (3.10) with (m, m) ¯ = (k ∧ ( − 1), k ∨ ( − 1)) or ( − 1, k + 1) as ω−1 (q(q + 2W )) 2 ≤ C|q|−1 (|q|k + |W |k+1 ) .

(5.9)

The term containing G is estimated by (3.20). Together with a simpler estimate of ∂t ∇σ 2 , the previous estimates yield (5.5). Finally (5.6), (5.7) follow from (2.55), (3.25), (3.28) and from repeated use of (3.10) with m = m ¯ = k.


We shall also need estimates for the difference of two solutions of the linearized system (2.54), (2.55) corresponding to two different choices of (q, σ, Bb) but to the same choice of (W, S). Those estimates will be provided by Lemma 4.2, since for such solutions (q−, σ−) = (w−, s−) and (q′−, σ′−) = (w′−, s′−) in the notation of that lemma extended in an obvious way.

We now begin the study of the Cauchy problem for the auxiliary system (2.49), (2.50) and for that purpose we first study that problem for the linearized system (2.54). For finite initial time t0, that problem is solved by Proposition 4.1. The following is a special case of that proposition and of Lemma 5.1.

Proposition 5.1. Let k > 3/2, 0 < β < 1/2, T ≥ 1 and I = [T, ∞). Let $(W, S, 0) \in C(I, X^{k+1})$ with $(U(1/t)W, S, 0) \in C^1(I, X^k)$ and let $(q, \sigma, B_b) \in C(I, X^k)$, with $W, xW, q, xq \in L^\infty(I, H^k)$. Let t0 ∈ I and $(q_0, \sigma_0, 0) \in X^k$. Then the system (2.54) has a unique solution (q′, σ′) in I such that $(q', \sigma', 0) \in C(I, X^k)$ and (q′, σ′)(t0) = (q0, σ0). That solution satisfies the estimates (5.3), (5.4), (5.5) for all t ∈ I, with G estimated by (5.1), (5.2).

In order to study the Cauchy problem with infinite initial time, both for the linearized system (2.54) and for the nonlinear system (2.49), (2.50), we shall need stronger assumptions on the asymptotic behaviour in time of (W, S). For simplicity, from now on we make the final choice of (W, S) that will turn out to satisfy those assumptions. Thus we choose (W, S) as explained in Sect. 2, namely

$$ W(t) = U^*(1/t)\, w_+ , \qquad S(t) = \int_1^t dt'\, t'^{-1} \left( \nabla g(W) - \nabla (x \cdot B_a)_L(W) \right) $$ (2.40) ≡ (5.10)

for some fixed $w_+ \in H^{k+\alpha+1}$ with $x w_+ \in H^{k+\alpha}$ for some α ≥ 1 (we shall eventually need α > 1), and we define

$$ a_+ = |w_+|_{k+\alpha+1} \vee |x w_+|_{k+\alpha} . $$ (5.11)

∂t + i(2t 2 )−1 = U ∗ (1/t)∂t U (1/t)

(5.12)

Using the fact that

we recast the remainders that occur in the system (2.49), (2.50) into the form R1 (W, S, 0) = t −2 Q(S + B∗ , W ) − i(2t 2 )−1 (2B∗ · S + B∗2 )W + it −1 (x · B∗ )S W, (5.13) R2 (W, S) = t −2 S · ∇S,

(5.14)

R3 (W, S, 0) = t −1 F1 ImW¯ ∇W − (S + B∗ )|W |2 ,

(5.15)

where B∗ = Ba (W ) (see Sect. 2).


J. Ginibre, G. Velo

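We shall need the following estimates.

Lemma 5.2. Let k > 3/2 and 0 < β < 1/2. Let $w_+ \in H^{k+\alpha+1}$ with $x w_+ \in H^{k+\alpha}$ for some α ≥ 1. Define (W, S) and a+ by (5.10), (5.11). Then $(W, S, 0) \in C([1, \infty), X^{k+\alpha})$ and the following estimates hold:

$$ |xW(t)|_{k+\alpha} \le |x w_+|_{k+\alpha} + t^{-1} |\nabla w_+|_{k+\alpha} \le 2 a_+ , $$ (5.16)

$$ |B_*|\dot{}_{k+\alpha+1} \vee |\nabla (x \cdot B_*)|\dot{}_{k+\alpha} \le I_0\left( |xW|_{k+\alpha} ( |xW|_{k+\alpha} + |W|_{k+\alpha} ) \right) \le C a_+^2 , $$ (5.17)

$$ |S|\dot{}_{k+j} \le C a_+^2 \left( \ln t + t^{\beta(j-\alpha)} \right) \quad \text{for } 0 \le j \le 2 + \alpha , $$ (5.18)

$$ |R_1(W, S, 0)|_{k+\theta} \le C(a_+) \left( t^{-2} \ln t + t^{-1-\beta(\alpha+1-\theta)} \right) \quad \text{for } 0 \le \theta \le 1 , $$ (5.19)

$$ |x R_1(W, S, 0)|_k \le C(a_+) \left( t^{-2} \ln t + t^{-1-\beta(\alpha+1)} \right) , $$ (5.20)

$$ |R_2(W, S)|\dot{}_{k+j} \le C a_+^4\, t^{-2} \ln t \left( \ln t + t^{\beta(j+1-\alpha)} \right) \quad \text{for } 0 \le j \le 1 + \alpha , $$ (5.21)

$$ |R_3(W, S, 0)|\dot{}_{k+1} \vee |x \cdot R_3(W, S, 0)|\dot{}_{k+1} \le C a_+^2\, t^{-1} \left( 1 + a_+^2 \ln t \right) . $$ (5.22)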

Proof. We first estimate x · W . From the commutation relation (4.140), it follows that U (1/t) x W (t) = xw+ + it −1 ∇w+ ,

(5.23)

which implies (5.16). The estimate (5.17) follows immediately from (3.25), (3.28), (3.10) and (5.16). We next estimate S. Let = k + j ≥ k. From (5.10), (3.20) we obtain t dt t −1 ω−1 |W |2 2 +t −1+β(−k−α)+ |∇(x · B∗ )|˙k+α ω S 2 ≤ C 1

(5.24) from which (5.18) follows by (3.10), (5.16), (5.17) and integration on time, provided − 1 ≤ k + 1 + α or equivalently j ≤ 2 + α. We next estimate R1 . By repeated use of (3.10) and by (3.19), we obtain from (5.13)

|R1 (W, S, 0)|k+θ ≤ C t −2 |S|˙k+θ +1 + |B∗ |˙k+θ 1 + |S|˙k+θ + |B∗ |˙k+θ |W |k+θ+1 +C t −1−β(α+1−θ) x · B∗ ; H˙ k+1+α |w+ |k+θ ,

(5.25)

and therefore by (5.17), (5.18) and with α ≥ 1 ≥ θ ,

3 2 3 (1+n t) +C t −1−β(α+1−θ) a+ , n t + t β(1+θ−α) + a+ |R1 (W, S, 0)|k+θ ≤ C t −2 a+ (5.26) which proves (5.19) since β < 1/2.
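The role of the condition β < 1/2 in this last step can be made explicit. The following check is only a sketch: it assumes that the term in (5.26) competing with the decay claimed in (5.19) is the one of the form $t^{-2} t^{\beta(1+\theta-\alpha)}$, which comes from (5.18) applied with j = θ + 1.

```latex
% Comparison of exponents for t \ge 1 (sketch, under the assumption stated above):
\[
t^{-2}\, t^{\beta(1+\theta-\alpha)} \;\le\; t^{-1-\beta(\alpha+1-\theta)}
\iff -2 + \beta(1+\theta-\alpha) \;\le\; -1 - \beta(\alpha+1-\theta)
\iff 2\beta \;\le\; 1 .
\]
% The two beta-terms add up to 2\beta because (1+\theta-\alpha) + (\alpha+1-\theta) = 2,
% so the comparison is independent of \theta and \alpha.
```

Under that reading, the power term generated by S is absorbed into the second term of (5.19) for every 0 ≤ θ ≤ 1 as soon as β ≤ 1/2, while the remaining contributions carry the factor $t^{-2} \ln t$.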


The proof of (5.20) follows from the fact that xR1 (W, S, 0) = L0 xW − t −2 (S + B∗ )W,

(5.27)

where the linear operator L0 is defined by rewriting the RHS of (5.13) as L0 W . The term L0 xW is estimated in the same way as in the proof of (5.19) with θ = 0, and by using in addition (5.16), while the last term in (5.27) is estimated by using (5.17), (5.18) and (3.10). We next estimate R2 . By a direct application of Lemma 3.2, we obtain from (5.14),

$$ \| \omega^{\ell} R_2(W, S) \|_2 \le C\, t^{-2} \left( \| S \|_\infty \| \omega^{\ell+1} S \|_2 + \| \nabla S \|_3 \| \omega^{\ell} S \|_6 \right) \le C\, t^{-2}\, |S|\dot{}_{k}\, |S|\dot{}_{\ell+1} $$ (5.28)

for ℓ ≥ 1, which yields (5.21) by the use of (5.18). Finally the estimate (5.22) follows readily from (3.25), (3.28), (3.10) and from (5.16), (5.17), (5.18).

Remark 5.2. If one makes the simpler choice (2.41) for (W, S), Lemma 5.2 and its proof remain essentially unchanged, the only difference being that the proof of (5.19), (5.20) now requires α ≥ 2 in order to estimate the contribution of w+ to R1.

We can now solve the Cauchy problem with infinite initial time for the linearized system (2.54), (2.55) for the previous choice of (W, S).

Proposition 5.2. Let k > 3/2, 0 < β < 1/2, T ≥ 1 and I = [T, ∞). Let $w_+ \in H^{k+\alpha+1}$ with $x w_+ \in H^{k+\alpha}$ for some α > 1 with β(α + 1) ≥ 1. Define (W, S) and a+ by (5.10), (5.11). Let $(q, \sigma, B_b) \in C(I, X^k)$ satisfy

$$ |q|_k \vee |xq|_k \le Y\, t^{-1} \ln t , $$ (5.29)

$$ |q|_{k+1} \le Y_1 \left( t^{-1} \ln t + t^{-\alpha\beta} \right) , $$ (5.30)

$$ |\sigma|\dot{}_{k+j} \le Z_j\, t^{-1} \ln t \left( \ln t + t^{j\beta} \right) \quad \text{for } j = 0, 1, 2 , $$ (5.31)

$$ |B_b|\dot{}_{k+1} \vee |x \cdot B_b|\dot{}_{k+1} \le N\, t^{-1} \ln t , $$ (5.32)

for some constants (Y, Y1, Zj, N) and for all t ∈ I. Then the linearized system (2.54), (2.55) has a unique solution $(q', \sigma', B_b') \in C(I, X^k)$ satisfying

$$ |q'|_k \vee |xq'|_k \le Y'\, t^{-1} \ln t , $$ (5.33)

$$ |q'|_{k+1} \le Y_1' \left( t^{-1} \ln t + t^{-\alpha\beta} \right) , $$ (5.34)

$$ |\sigma'|\dot{}_{k+j} \le Z_j'\, t^{-1} \ln t \left( \ln t + t^{j\beta} \right) \quad \text{for } j = 0, 1, 2 , $$ (5.35)

$$ |B_b'|\dot{}_{k+1} \vee |x \cdot B_b'|\dot{}_{k+1} \le N'\, t^{-1} \ln t , $$ (5.36)

for some constants (Y′, Y1′, Zj′, N′) depending on (Y, Y1, Zj, N, a+, T) and for all t ∈ I. The solution is actually unique in $C(I, X^k)$ under the condition that (q′, σ′) tends to zero in $L^2 \oplus \dot{H}^1$ when t → ∞.
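The hypotheses 0 < β < 1/2, α > 1 and β(α + 1) ≥ 1 of Proposition 5.2 are mutually compatible, and together they control the size of the exponent αβ appearing in (5.30), (5.34). The following elementary computation spells this out; it uses nothing beyond the three inequalities themselves.

```latex
% Compatibility of the parameter conditions of Proposition 5.2.
\[
\beta(\alpha+1) \ge 1 \ \text{and}\ \beta < \tfrac12
\;\Longrightarrow\; \alpha + 1 \ge \beta^{-1} > 2
\;\Longrightarrow\; \alpha > 1 ,
\]
% Conversely, for a given \alpha > 1 the admissible set of \beta is the non-empty interval
\[
\frac{1}{\alpha+1} \;\le\; \beta \;<\; \frac12 ,
\qquad \text{e.g. } \alpha = 2 \ \text{allows}\ \tfrac13 \le \beta < \tfrac12 .
\]
% Consequence for the exponent \alpha\beta:
\[
\alpha\beta \;\ge\; 1 - \beta \;>\; \tfrac12 \;>\; \beta ,
\qquad\text{so that}\qquad 1 \wedge \alpha\beta \;>\; \beta .
\]
```

In particular the term $t^{-\alpha\beta}$ in (5.30) and (5.34) always decays faster than $t^{-1/2}$ under these hypotheses.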


Remark 5.3. Whereas the conditions 0 < β < 1/2 and α > 1 are used in an essential way in the proof of Proposition 5.2, the condition β(α + 1) ≥ 1 has been imposed for convenience only, in order to obtain rather simple and optimal decay properties for (q′, σ′), and could be relaxed at the expense of a weakening of those properties. For given α > 1, it can be achieved by taking β sufficiently close to 1/2. Its meaning is that for a given regularity of w+, one should put a sufficiently large part of the interaction B1 into the long range part (x · Ba)L so as to obtain a sufficiently good decay of the short range part (x · Ba)S.

Proof. With Proposition 5.1 available, it is sufficient to prove Proposition 5.2 for T sufficiently large, depending possibly on (Y, Y1, Zj, N, a+). Furthermore it is sufficient to solve the system (2.54) for (q′, σ′), since B′b is given by an explicit formula, namely (2.55). The proof consists in showing that the solution $(q'_{t_0}, \sigma'_{t_0})$ of the linearized system (2.54) with initial data $(q'_{t_0}, \sigma'_{t_0})(t_0) = 0$ for some finite t0 ≥ T, obtained from Proposition 5.1, satisfies the estimates (5.33)–(5.35) uniformly in t0 for T ≤ t ≤ t0, namely with (Y′, Y1′, Zj′) independent of t0, and that when t0 → ∞, that solution converges on the compact subintervals of I uniformly in suitable norms.

Let therefore T be sufficiently large, in a sense to be made clear below. We define

$$ y = |q|_k \vee |xq|_k , \qquad y_1 = |q|_{k+1} , \qquad z_j = |\sigma|\dot{}_{k+j} , \quad j = 0, 1, 2 , $$ (5.37)

and we first take T large enough so that (5.29)–(5.32) imply

$$ y \vee y_1 \le a_+ , $$ (5.38)

$$ z_j \le a_+^2 \ln t \quad \text{for } j = 0, 1, 2 , \qquad |B_b|\dot{}_{k+1} \le a_+^2 $$ (5.39)

for t ≥ T. It follows from (5.1), (5.2), (5.29), (5.38) that

$$ |G|\dot{}_{k+1} \vee |\nabla (x \cdot G)|\dot{}_{k} \le C a_+ I_0(y) \le C a_+ \left( Y\, t^{-1} \ln t \vee a_+ \right) . $$ (5.40)

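Let t0 > T and let $(q'_{t_0}, \sigma'_{t_0})$ be defined as above. We want to estimate $(q'_{t_0}, \sigma'_{t_0})$ for T ≤ t ≤ t0. We define

$$ y' = |q'_{t_0}|_k \vee |x q'_{t_0}|_k , \qquad y_1' = |q'_{t_0}|_{k+1} , \qquad z_j' = |\sigma'_{t_0}|\dot{}_{k+j} , \quad j = 0, 1, 2 , $$ (5.41)

$$ Y' = \left\| t (\ln t)^{-1} y' ; L^\infty([T, t_0]) \right\| , \qquad Y_1' = \left\| \left( t^{-1} \ln t + t^{-\alpha\beta} \right)^{-1} y_1' ; L^\infty([T, t_0]) \right\| , $$ (5.42)

$$ Z_j' = \left\| \left( t^{-1} \ln t\, ( \ln t + t^{j\beta} ) \right)^{-1} z_j' ; L^\infty([T, t_0]) \right\| . $$ (5.43)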


We first estimate qt 0 and more precisely y and y1 , starting from (5.3), (5.4). We estimate (W, S, B∗ ) by (5.16)–(5.18), (σ, Bb ) by (5.39), x · Bb by (5.32), G by (5.40) and R1 by (5.19), (5.20), with β(α + 1) ≥ 1. We obtain 2 2 2 −1−β |∂t y | ≤ C a+ y (1 + a+ ) + N t −2 n t + a+ t

3 2 2 +C a+ (1 + a+ ) + N a+ t −2 n t + C a+ Y t −2−β n t +t −2 y1 + C(a+ )t −2 n t,

(5.44)

2 2 2 a+ (1 + a+ ) + N t −2 n t y1 + C t −2+β(2−α) + t −1 a+ y

3 2 2 +C a+ (1 + a+ ) + N a+ t −2 n t + C a+ Y t −2 n t

(5.45) +C(a+ ) t −2 n t + t −1−αβ .

|∂t y1 | ≤ C

We integrate (5.44), (5.45) between t and t0 , with y (t0 ) = y1 (t0 ) = 0, exponentiating the diagonal terms to a constant for T sufficiently large depending on (a+ , N ) and substituting (5.42) into the remaining terms. We obtain  2 Y t −1−β n t +CY t −2 n t + t −1−αβ +(CN a +C(a )) t −1 n t y ≤ C a+ + + 1 (5.46)  2 (Y + Y )t −1 n t + CNa + C(a ) t −1 n t + t −αβ , y1 ≤ C a+ + + and therefore by substituting (5.46) into (5.42),  2 Y T −β + CY T −θ + CN a + C(a ) ≡ CY T −θ + A  Y ≤ C a+ + + 1 1 

2 (Y + Y ) + CN a + C(a ) ≡ C a 2 Y + A Y1 ≤ C a+ + + 1 +

with θ = 1 ∧ αβ > β. It follows from (5.47) that  2 T −θ Y + A + C T −θ A ≤ Ca 2 Y T −β + CN a + C(a )  Y ≤ C a+ 1 + + + 

2 T −θ Y + C a 2 A + A ≤ C a 2 Y + CN a (1 + a 2 ) + C(a ) Y1 ≤ C a+ 1 + + + + + 1

2 . Finally for T θ ≥ T β ≥ Ca+  2 T −β Y + (1 + N )C(a )  Y ≤ C a+ +



2 Y + (1 + N )C(a ) Y1 ≤ C a+ +

(5.47)

(5.48)

(5.49)

for T sufficiently large depending on (Y, Y1 , Zj , N, a+ ). This proves that qt 0 satisfies the estimates (5.33), (5.34) for T ≤ t ≤ t0 . We next estimate σt 0 , and more precisely zj , starting from (5.5), which we rewrite with the help of the definitions (5.37), (5.41) and of the estimates (5.18), (5.21) and (5.38)–(5.40) as 2 −2 |∂t zj | ≤ C a+ t n t (zj + zj ) + δj2 t β(2−α) (z1 + z1 ) +C t −1 a+ y + δj2 y1 + C t −1+jβ a+ I0 (y) 4 −2 +C a+ t n t (n t + t β(j +1−α) )

(5.50)


for j = 0, 1, 2 with δj2 the Kronecker symbol in order to include additional terms for j = 2. Using (5.29)–(5.31) and (5.43), we obtain from (5.50), 2 −2 2 −3 |∂t zj | ≤ C a+ t n t zj + C a+ t n t n t (n t + t jβ )Zj +δj2 t β(3−α) (Z1 + Z1 ) + C a+ t −2 Y n t + δj2 Y1 (n t + t 1−αβ ) 4 −2 +C a+ t −2+jβ n t Y + C a+ t n t (n t + t β(j +1−α) ).

(5.51)

and t0 with zj (t0 )

We integrate (5.51) between t = 0, exponentiating the diagonal terms to a constant for T sufficiently large in the sense that 2 T (n T )−1 ≥ C a+ ,

(5.52)

and we substitute the result into the definition (5.43), thereby obtaining 2 T −1 n T Z + δ T −1+β(1−α) (Z + Z ) Zj ≤ C a+ j j2 1 1 4 +C a+ Y + δj2 (T −2β + T 1−(α+2)β )Y1 + C a+ so that for α ≥ 1 and β(α + 1) ≥ 1 and under the condition (5.52)  2 T −1 n T Z + C a Y + C a 4 , for j = 0, 1,  Z j ≤ C a+ j + + 

Z2

≤C

2 a+

T −1 n

T (Z2 + Z1 ) + C a+ (Y

+ t −β Y

1) + C

(5.53)

4 a+

for T sufficiently large. This proves that $\sigma'_{t_0}$ satisfies the estimate (5.35) for T ≤ t ≤ t0.

We now prove that $(q'_{t_0}, \sigma'_{t_0})$ tends to a limit when t0 → ∞. For that purpose we consider two solutions $(q_i', \sigma_i') = (q'_{t_i}, \sigma'_{t_i})$, i = 1, 2, of the system (2.54) corresponding to the same choice of (q, σ, Bb) and to t0 = ti, i = 1, 2, for T ≤ t1 ≤ t2. Let $(q_-', \sigma_-') = \tfrac12 (q_1' - q_2', \sigma_1' - \sigma_2')$. For fixed (q, σ, Bb), the inhomogeneous term in the equation for q′ is the same and therefore q′− satisfies a homogeneous linear equation which preserves the L2 norm, so that

$$ \| q_-'(t) \|_2 = \| q_-'(t_1) \|_2 = \tfrac12 \| q_2'(t_1) \|_2 \le Y'\, t_1^{-1} \ln t_1 $$ (5.54)

for T ≤ t ≤ t1, by (5.33) applied to q′2 at t = t1 ∈ [T, t2]. Similarly, σ′− satisfies the equation

$$ \partial_t \sigma_-' = t^{-2} (S + \sigma) \cdot \nabla \sigma_-' $$ (5.55)

so that by an elementary subestimate of Lemma 4.2 and by (5.18), (5.39),

$$ \left| \partial_t \| \nabla \sigma_-' \|_2 \right| \le C a_+^2\, t^{-2} \ln t\, \| \nabla \sigma_-' \|_2 , $$ (5.56)

and therefore under the condition (5.52),

$$ \| \nabla \sigma_-'(t) \|_2 \le C \| \nabla \sigma_-'(t_1) \|_2 \le C Z_0'\, t_1^{-1} \ln t_1 $$ (5.57)

for T ≤ t ≤ t1 by (5.35) applied to σ′2 at t = t1.

From (5.54), (5.57), it follows that $(q'_{t_0}, \sigma'_{t_0})$ converges to a limit $(q', \sigma') \in C(I, L^2 \oplus \dot{H}^1)$ uniformly on the compact subintervals of I. From the uniform estimates (5.33)–(5.35) and from Lemma 5.1, it then follows by a standard compactness argument that


$(q', \sigma', 0) \in C(I, X^k)$ and that (q′, σ′) also satisfies the estimates (5.33)–(5.35). Clearly (q′, σ′) satisfies the system (2.54). This completes the existence part of the proof. The uniqueness statement follows immediately from the L2 norm conservation for the difference of the q′ components and from (5.55), (5.57) for the difference of the σ′ components of two solutions. As mentioned at the beginning of the proof, the existence and uniqueness of B′b follow from the fact that it is given by an explicit formula (2.55) and the estimate (5.36) follows immediately from (5.6), (5.7), (5.22), (5.38), (5.39), (5.40) with

$$ N' = C a_+^2 (1 + a_+^2) $$ (5.58)

for T large enough to ensure (5.38), (5.39).
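The step described in the proof as "exponentiating the diagonal terms to a constant" under the condition (5.52) can be illustrated on the simplest instance, the differential inequality (5.56). The computation below is a sketch of that Gronwall-type argument; the restriction T ≥ e and the normalisation of the constant in (5.52) to 1 are simplifying assumptions made here for illustration only.

```latex
% Backward Gronwall estimate for f(t) = \|\nabla\sigma'_-(t)\|_2 on [T, t_1]:
% |\partial_t f| \le C a_+^2 t^{-2} \ln t \, f implies
\[
f(t) \;\le\; \exp\!\Big( C a_+^2 \int_t^{t_1} ds\, s^{-2} \ln s \Big)\, f(t_1)
\qquad \text{for } T \le t \le t_1 .
\]
% The integral is explicit (integration by parts):
\[
\int_T^{\infty} ds\, s^{-2} \ln s \;=\; \frac{1 + \ln T}{T}
\;\le\; \frac{2 \ln T}{T} \qquad \text{for } T \ge e .
\]
% Hence, if T (\ln T)^{-1} \ge a_+^2 (condition (5.52) with its constant normalised to 1),
\[
C a_+^2 \int_t^{t_1} ds\, s^{-2} \ln s \;\le\; 2C ,
\qquad\text{so}\qquad f(t) \le e^{2C} f(t_1) .
\]
```

The resulting factor is a constant independent of t, t1 and a+, which is the form in which the estimate is used in (5.57).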

We now turn to the main result of this section, namely the fact that for T sufficiently large, depending on a+ , the auxiliary system (2.49), (2.50) with (W, S) defined by (5.10) has a unique solution (q, σ, Bb ) defined for all t ≥ T and decaying at infinity in the sense of (5.29)–(5.32). In the same spirit as for Proposition 4.3, this will be done by showing that the map : (q, σ, Bb ) → (q , σ , Bb ) defined by Proposition 5.2 is a contraction in suitable norms. Proposition 5.3. Let k > 3/2 and 0 < β < 1/2. Let w+ ∈ H k+α+1 with xw+ ∈ H k+α for some α > 1 with β(α + 1) ≥ 1. Define (W, S) and a+ by (5.10), (5.11). Then (1) There exists T = T (k, β, α, a+ ), 1 ≤ T < ∞ such that the system (2.49), (2.50) has a unique solution (q, σ, Bb ) ∈ C(I, Xk ), with I = [T , ∞), satisfying the estimates (5.29)–(5.32) for some constants (Y, Y1 , Zj , N ) depending on (k, β, α, a+ ). Furthermore |G|˙k+1 ∨ |∇(x · G)|˙k ≤ C a+ Y t −1 n t,

(5.40) ≡ (5.59)

where G is defined by (2.46). The solution is actually unique in C(I, X k ) under the conditions that |xq|k ∨ |q|k+1 ∨ |σ |˙k+2 ∨ |Bb |˙k ∈ L∞ (I ),

(5.60)

|σ |˙k+1 ∨ t 2β+ε (|q|k ∨ |xq|k−1 ) → 0 when t → ∞

(5.61)

for some ε > 0. (2) The map w+ → (W + q, S + σ, Bb ) ≡ (w, s, Bb ) is continuous on the bounded sets of the norm (5.11) from the norm |w+ |k ∨|xw+ |k−1 for w+ to the norm of (w, s, Bb ) in L∞ (J, Xk−1 ) and in the weak-∗ sense to L∞ (J, X k ) for any interval J ⊂⊂ I . Proof. Part 1. The proof consists in showing that the map : (q, σ, Bb ) → (q , σ , Bb ) defined by Proposition 5.2 is a contraction of a suitable set R of C(I, X k ), with I = [T , ∞), for T sufficiently large and for a suitably time rescaled norm of L∞ (I, Xk−1 ). We define (y, y1 , zj ) by (5.37) and we define R by

R = (q, σ, Bb ) ∈ C(I, Xk ) : y ≤ Y t −1 n t, y1 ≤ Y1 t −1 n t + t −αβ ,

zj ≤ Zj t −1 n t n t + t jβ , j = 0, 1, 2, |Bb |˙k+1 ∨ |x · Bb |˙k+1 ≤ N t −1 n t (5.62)


for some constants (Y, Y1 , Zj , N) depending on a+ , to be chosen later, and we take T large enough so that (5.38), (5.39) and therefore (5.40) hold for (q, σ, Bb ) ∈ R. We first show that the set R is mapped into itself by for suitable (Y, Y1 , Zj , N ) and sufficiently large T . Let (q , σ , Bb ) = (q, σ, Bb ) as defined by Proposition 5.2. As mentioned in the proof of that proposition, it follows from (5.6), (5.7), (5.22), (5.38)– (5.40) that Bb satisfies (5.36) with N defined by (5.58) so that if we define N by 2 2 N = C a+ (1 + a+ )

(5.63)

then the condition on Bb contained in (5.62) is reproduced by . It remains to estimate (q , σ ). Now from the proof of Proposition 5.2, it follows that (q , σ ) satisfies the estimates (5.33), (5.34), (5.35) with (Y , Y1 , Zj ) satisfying (5.49), (5.53), for T sufficiently large depending on (Y, Y1 , Zj , a+ ). With N given by (5.63), it follows immediately from (5.49) that one can choose Y and Y1 depending on a+ such that (5.49) implies Y ≤ Y 2 . Therefore the conditions on and Y1 ≤ Y1 for T sufficiently large, namely T β ≥ Ca+ q in (5.62) are also reproduced by for that choice. Finally, it follows from (5.53) that one can choose Zj , actually in the form 4 , Z j = C a+ Y + C a +

j = 0, 1, 2,

(5.64)

so that (5.53) implies Zj ≤ Zj for T sufficiently large, in fact for 2 , T (n T )−1 ≥ C a+

T β ≥ Y1 /Y.

(5.65)

This completes the proof of the fact that maps R into itself for (Y, Y1 , Zj , N ) chosen as above, depending on a+ , and for T sufficiently large, depending on a+ . We next show that the map is a contraction in R for a suitably weighted norm of

) = (q , σ , B ), i = 1, 2. We L∞ (I, Xk−1 ). Let (qi , σi , Bbi ) ∈ R and let (qi , σi , Bbi i i bi define (q± , σ± , Bb± ) = 1/2((q1 , σ1 , Bb1 ) ± (q2 , σ2 , Bb2 )) and similarly for the primed quantities. Furthermore we define   y− = |q− |k−1 ∨ |xq− |k−1

,

y1− = |q− |k ,

˙ ˙  zj = |σ− |˙ − k+j −1 , j = 1, 2, n− = |Bb− |k ∨ |x · Bb− |k ,

(5.66)

and similarly for the primed quantities and for Ba and B.

) We shall estimate (q− , σ− , Bb− ) by Lemma 4.2 applied to the solutions (wi , si , Bbi

= (W + qi , S + σi , Bbi ) of the system (2.37), (2.38) associated with (wi , si , Bbi ) = (W + qi , S + σi , Bbi ). As a consequence we shall apply Lemma 4.2 with (w− , s− ) =

, s ) = (q , σ ) and (w , s ) = (W + q , S + σ ), (w , s ) = (W + (q− , σ− ), (w− + + + + − − − + +

, S + σ ), and for that purpose we shall use the estimates (5.16), (5.18) for (W, S) q+ + and the fact that (qi , σi ) and therefore (q+ , σ+ ) are not larger than (W, S) in the sense of (5.38), (5.39). From the first part of the proof, it follows that (qi , σi ) and therefore

, σ ) and (w , s ) satisfy the same properties. Together with (5.17), (5.18), (5.30), (q+ + + + (5.63), (5.40), this implies that the available estimates on (w+ , s+ , Bb+ ) and on Ba+ , B+ are


 |xw+ |k ∨ |w+ |k+1 ≤ C a+       ˙ 2 (j −α)β    |s+ |k+j ≤ C a+ n t + t  2 (1 + a 2 )t −1 n t  |Bb+ |˙k+j ∨ |x · Bb+ |˙k+j ≤ C a+  +       2 |B+ |˙k+j ∨ |∇(x · Ba+ )|˙k ≤ C a+

(5.67)

and that the same estimates hold for the primed quantities. From Lemma 4.2 with k = k − 1 and from the estimates (5.67) of the + quantities, we obtain (compare with (4.105)–(4.110)) |Ba− |˙k ≤ C a+ I0 (y− ),

(5.68)

x · Ba− ; H˙ k ≤ C a+ Ik−2 (y− ),

(5.69)

2 2 |∂t y− | ≤ E1 y − + C t −2 a+ (1 + a+ )z1− + (1 + a+ n t)(a+ I0 (y− ) + n− ) 2 +C t −1−β a+ Ik−2 (y− ) + C t −1 a+ n− + t −2 y1 − ,

(5.70)

2 2 z1− + z2− + (1 + a+ n t)(a+ I0 (y− ) + n− ) |∂t y1 − | ≤ E1 y1 − + C t −2 a+ a+ 2 +C t −1 a+ Ik−2 (y− ) + C t −1 a+ n− ,

where

(5.71)

2 2 −2 E1 = C a+ )t n t + t −1−β , (1 + a+

2 |∂t zj − | ≤ C t −2 a+ n t (zj− + zj − ) + δj2 t (2−α)β z1−

+C t −1 a+ (y− + δj2 y1− ) + C t −1+jβ a+ Im (y− )

for j = 1, 2,

where m = (k − 2) ∨ 0, 2 n − ≤ C t −1 a+ I0 y1− + a+ n t y− + a+ (z1− + a+ I0 (y− ) + n− ) .

(5.72)

(5.73)

In the same way as in the proof of Proposition 4.3, we continue the argument with a simplified version of the system (5.70)–(5.73) where we exponentiate the diagonal terms and in particular E1 to a constant according to (4.51) and where we eliminate the constants and the factors containing a+ , with the consequence that we lose detailed control of the dependence of the lower bounds for T on a+ . Thus we rewrite (5.70)–(5.73) as (compare with (4.111)–(4.114))

| ≤ t −2 z1− + n t (I0 (y− ) + n− ) + t −1−β Ik−2 (y− ) + t −1 n− + t −2 y1 − , |∂t y− (5.74) |∂t y1 − | ≤ t −2 z2− + n t (I0 (y− ) + n− ) + t −1 Ik−2 (y− ) + t −1 n− ,

(5.75)


|∂t zj − | ≤ t −2 n t zj− + δj2 t (2−α)β z1− + t −1 (y− + δj2 y1− ) + t −1+jβ Im (y− ), (5.76) n − ≤ t −1 I0 y1− + n t y− + z1− + I0 (y− ) + n− .

(5.77)

We now define  Y− = t (n t)−1 y− ; L∞ ([T , ∞)) , Y1− = t (n t)−1 y1− ; L∞ ([T , ∞)) ,     Zj− = t 1−jβ (n t)−1 zj− ; L∞ ([T , ∞)) , j = 1, 2,     N− = t 2−β (n t)−1 n− ; L∞ ([T , ∞)) (5.78) and similarly for the primed quantities. It follows from (5.29)–(5.36) that all those quantities are finite. Using those definitions and omitting the − indices in the remaining part of the contraction proof, we obtain from (5.74)–(5.77), |∂t y | ≤ t −2 n t Z1 t −1+β + Y t −β + N t −1+β + Y1 t −1 , (5.79) |∂t y1 | ≤ t −2 n t Z2 t −1+2β + Y + N t −1+β , |∂t zj | ≤ t −2 n t Zj t −1+jβ n t + δj2 Z1 t −1+(3−α)β + δj2 Y1 + Y t jβ , n ≤ t −2 n t Y1 + Y n t + Z1 t β + N t −1+β .

(5.80)

(5.81)

(5.82)

Integrating (5.79)–(5.81) between t and ∞ with initial condition (y , y1 , zj )(∞) = 0, and substituting the result and (5.82) into the primed analog of the definition (5.78) (with the − indices omitted), we obtain  Y ≤ T −1+β Z1 + T −β Y + T −1+β N + T −1 Y1 ,          Y1 ≤ T −1+2β Z2 + Y + T −1+β N, (5.83) 

≤ T −1 n T Z + δ T −1−(α−1)β Z + δ T −2β Y + Y,  Z  j j 1 j 1 2 2 j       N ≤ T −β Y1 + T −β n T Y + Z1 + T −1 N. Substituting Y1 from the second inequality into the first one, we recast the system (5.83) into the form (4.124), from which the contraction property follows exactly as in the proof of Proposition 4.3. This proves that has a unique fixed point in R. The uniqueness of the solution in C(I, X k ) under the conditions (5.60), (5.61) follows from Proposition 4.2, Part (2). Note that the time decay (5.60), (5.61) required for uniqueness is weaker than that contained in (5.29)–(5.32).


Part 2. Let w+i , i = 1, 2, satisfy the assumptions of the proposition with |xw+i |k+α ∨ |w+i |k+α+1 ≤ a+

(5.84)

and define (Wi , Si ) by (5.10). Let (wi , si , Bbi ) = (Wi + qi , Si + σi , Bbi ) be the two solutions of the system (2.34), (2.35) obtained in Part (1). Let (W− , S− ) = (1/2)(W1 − W2 , S1 − S2 ), define (w− , s− , Bb− ) as in Lemma 4.2 and define   y− = |w− |k−1 ∨ |xw− |k−1 ,

y1− = |w− |k

z

n− = |Bb− |˙k ∨ |x · Bb− |˙k .

j−

= |s− |˙k+j −1 , j = 1, 2,

(5.85)

We assume that w+1 − w+2 is small in the sense that |x(w+1 − w+2 )|k−1 ∨ |w+1 − w+2 |k ≤ η

(5.86)

and we want to show that (w− , s− , Bb− ) is small by estimating (y− , y1− , zj− , n− ) in terms of η. We first estimate (W− , S− ). From (4.140) it follows that |xW− |k−1 ∨ |W− |k ≤ η + t −1 a+ ≤ 2η

(5.87)

for t ≥ a+ /η. Furthermore, in the same way as in Lemma 4.2, |S− |˙k+j −1 ≤ C a+ η t jβ

for j = 1, 2.

(5.88)

We now take t0 large in a sense to be specified below, and we estimate (y− , y1− , zj− , n− ) separately for t ≥ t0 and for t ≤ t0 . For t ≥ t0 , using (5.87), (5.88) and (5.29)–(5.31), we obtain   y− ∨ y1− ≤ 2η + Y t −1 n t ≤ C η, (5.89)  zj− ≤ C a+ η t jβ + Zj t −1 n t (n t + t (j −1)β ) ≤ C a+ η t jβ for t0 large in the sense that

−1 −β η t0 (n t0 )−1 = C(a+ ) ≥ Y ∨ (Z1 ∨ Z2 )a+ t0 n t0 .

(5.90)

(Remember that in this proposition, Y , Z1 , Z2 are functions of a+ .) Using the fact that for t ≥ t0 , Im (f ) depends only on f restricted to t ≥ t0 , we obtain in addition n− ≤ C t −1 η(1 + n t + a+ t β ) + n− , and therefore n− ≤ C(1 + a+ )η t −1+β for t ≥ t0 .

(5.91)


We next estimate (y− , y1− , zj− , n− ) for t ≤ t0 and for that purpose, we use the system (5.74)–(5.77) with the primes omitted, since (wi , si , Bbi ) are solutions of the system (2.34), (2.35). We choose λ such that 2β < λ < 1 (in the same way as in the proof of Proposition 4.2, Part (2)) and we define  Y− = t λ y− ; L∞ ([T , t0 ]) , Y1− = t λ y1− ; L∞ ([T , t0 ]) ,     (5.92) Zj− = t λ−jβ zj− ; L∞ ([T , t0 ]) , j = 1, 2,     N− = t λ+1−β n− ; L∞ ([T , t0 ]) . Substituting those definitions into (5.74)–(5.77), using the fact that Im (y− ) ≤ Im t −λ Y− + Cη ≤ C t −λ Y− + η

(5.93)

and similar relations for y1− , zj− , n− , and omitting the − indices, we obtain  |∂t y| ≤ η t −1−β + t −1−λ {·}         |∂t y1 | ≤ η t −1 + t −1−λ {·}   |∂t zj | ≤ η t −1+jβ + t −1−λ {·}       n− ≤ η t −1+β + t −1−λ {·},

(5.94)

where the brackets in the RHS are the same as in (5.79)–(5.82). Integrating the first three inequalities of (5.94) between t and t0 for t ≤ t0 with initial condition at t0 estimated by (5.89), and omitting again absolute constants, we obtain  y ≤ η + t −λ {·}          y1 ≤ η n t0 + t −λ {·} (5.95)  jβ −λ {·}   z ≤ η t + t j  0      n− ≤ η t −1+β + t −1−λ {·}, where the brackets in the RHS are the same as in (5.94). We substitute (5.95) into the definitions (5.92) and obtain a system similar to (5.83), with however the primes omitted, and with an additional term bounded by ηt0λ nt0 in each of the RHS. Proceeding therefrom as in the contraction proofs of Proposition 4.3 and of Part (1) of this proposition, we obtain X ≤ η t0λ n t0 ,

(5.96)

where X is defined by (4.125), so that by (5.90), X tends to zero when η tends to zero (actually as a power of η). This proves the norm continuity of the map w+ → (w, s, Bb ) from the norm |w+ |k ∨ |xw+ |k−1 (see (5.86)) to the norm in L∞ (J, X k−1 ) for compact J . The last continuity follows from a standard compactness argument. Remark 5.4. In Part (2) of Proposition 5.3, we prove actually a stronger continuity than stated, namely a suitably weighted L∞ continuity in the whole interval [T , ∞), as follows from (5.89), (5.91) for t ≥ t0 and (5.92), (5.96) for t ≤ t0 defined in terms of η by (5.90).


6. Wave Operators and Asymptotics for (u, A)

In this section we complete the construction of the wave operators for the system (2.6), (2.7) in the special case of vanishing asymptotic magnetic field, and we derive asymptotic properties of solutions in their range. The construction relies in an essential way on Proposition 5.3. So far we have worked with the system (2.34) for (w, s) and the first task is to reconstruct the phase ϕ. Corresponding to S defined by (2.40), we define

$$ \phi = \int_1^t dt'\, t'^{-1} \left( g(W) - (x \cdot B_a)_L(W) \right) $$ (6.1)

so that S = ∇φ. Let now (q, σ, Bb) be the solution of the system (2.49), (2.50) obtained in Proposition 5.3 and let (w, s) = (W + q, S + σ). We define ψ by ψ(∞) = 0 and

$$ \partial_t \psi = (2t^2)^{-1} |s|^2 + t^{-1} \left\{ g(w) - g(W) - (x \cdot B_a)_L(w) + (x \cdot B_a)_L(W) \right\} , $$ (6.2)

or equivalently

$$ \psi = - \int_t^\infty dt' \left\{ (2t'^2)^{-1} |s(t')|^2 + t'^{-1} g(q, q + 2W) - t'^{-1} (x \cdot B_a)_L(q, q + 2W) \right\} , $$ (6.3)

which is tailored to ensure that ∇ψ = σ, given the fact that S and σ are gradients. The integral converges in $\dot{H}^1$, as follows from (5.16), (5.18), from (5.27), (5.29) and from the estimate

$$ \| \partial_t \sigma \|_2 \le t^{-2} \| s \cdot \nabla s \|_2 + t^{-1} \| \nabla g(q, q + 2W) \|_2 + t^{-1} \| \nabla (x \cdot B_a)(q, q + 2W) \|_2 \le t^{-2} \| s \|_\infty \| \nabla s \|_2 + C\, t^{-1} a_+ \left( \| q \|_2 + I_{-1}( \| xq \|_2 ) \right) \le C(a_+)\, t^{-2} (\ln t)^2 $$ (6.4)

so that

$$ \| \nabla \psi \|_2 = \| \sigma \|_2 \le C(a_+)\, t^{-1} (\ln t)^2 . $$ (6.5)

Finally we define ϕ = φ + ψ so that ∇ϕ = s. We can now define the modified wave operators for the MS system in the form (2.6), (2.7) in the special case of vanishing asymptotic magnetic field. We start from the asymptotic state u+ for u and we define w+ = F u+ . The asymptotic state (A+ , A˙ + ) for A is taken to be zero. We define (W, S) by (2.40). We solve the system (2.49), (2.50) for (q, σ, Bb ) by Proposition 5.3. Through (2.43), this yields a solution (w, s, Bb ) of the auxiliary system (2.34), (2.35). We reconstruct the phase ϕ = φ + ψ with φ and ψ defined by (6.1), (6.3). We finally substitute (w, ϕ, Bb ) into (2.17) and (2.18) with B = Ba + Bb and Ba defined by (2.28). This yields a solution (u, A) of the system (2.6), (2.7) defined for large time. The modified wave operator is the map : u+ → (u, A) thereby obtained. In order to state the regularity properties of u that follow in a natural way from the previous construction, we introduce appropriate function spaces. In addition to the operators M = M(t) and D = D(t) defined by (2.14), (2.15), we introduce the operator J = J (t) = x + it ∇,

(6.6)


the generator of Galilei transformations. The operators M, D, J satisfy the commutation relation i M D ∇ = J M D.

(6.7)

For any interval I ⊂ [1, ∞) and any k ≥ 0, we define the space

$$ X^k(I) = \left\{ u : D^* M^* u \in C(I, H^{k+1}),\ D^* M^* x u \in C(I, H^k) \right\} = \left\{ u : \langle J(t) \rangle^{k+1} u \ \text{and}\ \langle J(t) \rangle^{k} x u \in C(I, L^2) \right\} , $$ (6.8)

where $\langle \lambda \rangle = (1 + \lambda^2)^{1/2}$ for any real number or self-adjoint operator λ and where the second equality follows from (6.7).

We now collect the information obtained for the solutions of the system (2.6), (2.7) in the range of the modified wave operators and state the main result of this paper as follows.

Proposition 6.1. Let k > 3/2, 0 < β < 1/2 and let α > 1 be such that β(α + 1) ≥ 1. Let u+ be such that $w_+ = F u_+ \in H^{k+\alpha+1}$ and $x w_+ \in H^{k+\alpha}$. Define (W, S) by (2.40) and a+ by (5.11). Then

(1) There exists T = T(a+), 1 ≤ T < ∞, such that the auxiliary system (2.34), (2.35) has a unique solution $(w, s, B_b) \in C(I, X^k)$, where I = [T, ∞), satisfying

$$ |w - W|_k \vee |x(w - W)|_k \le C\, t^{-1} \ln t , $$ (6.9)

$$ |w - W|_{k+1} \le C \left( t^{-1} \ln t + t^{-\alpha\beta} \right) , $$ (6.10)

$$ |s - S|_{k+j} \le C\, t^{-1} \ln t \left( \ln t + t^{j\beta} \right) \quad \text{for } j = 0, 1, 2 , $$ (6.11)

$$ |B_b|\dot{}_{k+1} \vee |x \cdot B_b|\dot{}_{k+1} \le C\, t^{-1} \ln t . $$ (6.12)

(2) Let φ and ψ be defined by (6.1) and (6.3) with q = w − W , and let ϕ = φ + ψ. Let

u = MD exp(−iϕ)w,

(2.17) ≡ (6.13)

−1

(2.18) ≡ (6.14)

A=t

D0 B

k+1 ⊕ with B = Ba + Bb and Ba defined by (2.28). Then u ∈ t A) ∈ C(I, K k H ), (u, A) solves the system (2.6), (2.7) and u behaves asymptotically in time as MD exp(−iφ)w+ in the sense that u satisfies the following estimates:

X k (I ), (A, ∂

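$$ \left\| \langle J(t) \rangle^{k} \langle |x|/t \rangle \left( \exp(i\phi(t, x/t))\, u(t) - M(t) D(t) W(t) \right) \right\|_2 \le C\, t^{-1} (\ln t)^2 , $$ (6.15)

$$ \left\| \langle J(t) \rangle^{k+1} \left( \exp(i\phi(t, x/t))\, u(t) - M(t) D(t) W(t) \right) \right\|_2 \le C \left( t^{-1} (\ln t)^2 + t^{-\alpha\beta} \right) , $$ (6.16)

$$ \left\| \langle |x|/t \rangle \left( u(t) - M(t) D(t) \exp(-i\phi(t))\, W(t) \right) \right\|_r \le C\, t^{-1-\delta(r)} (\ln t)^2 $$ (6.17)

for 2 ≤ r ≤ ∞, with δ(r) = 3/2 − 3/r.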


Furthermore A behaves asymptotically in time as t −1 D0 Ba (W ) in the sense that the following estimates hold: |B − Ba (W )|˙k+1 ∨ |∇x · (B − Ba (W ))|˙k ≤ C t −1 n t,

(6.18)

where A and B are related by (6.14).

Proof. The proof follows from Proposition 5.3 supplemented with the reconstruction of ϕ described above in this section, except for the estimates (6.15)–(6.17) on u. In particular the estimates (6.9)–(6.12) are the estimates (5.29)–(5.32) supplemented with (6.5), while (6.18) follows from (5.32) and (5.40). We next prove the estimates (6.15)–(6.17) on u. From (6.13) with ϕ = φ + ψ and from (6.7), it follows that

$$ \left\| |J|^m \left( \exp(i D_0 \phi)\, u - M D W \right) \right\|_2 = \left\| \omega^m \left( \exp(-i\psi)\, w - W \right) \right\|_2 , $$ (6.19)

$$ \left\| |J|^m (|x|/t) \left( \exp(i D_0 \phi)\, u - M D W \right) \right\|_2 = \left\| \omega^m |x| \left( \exp(-i\psi)\, w - W \right) \right\|_2 . $$ (6.20)

We next estimate for 0 ≤ m ≤ k + 1,

$$ \left\| \omega^m ( \exp(-i\psi)\, w - W ) \right\|_2 \le C \| \omega^m ( \exp(-i\psi) - 1 ) \|_{r_1} \| w \|_{r_2} + C \| \exp(-i\psi) - 1 \|_\infty \| \omega^m w \|_2 + C \| \omega^m (w - W) \|_2 \le C \exp( C \| \psi \|_\infty ) \| \omega^m \psi \|_{r_1} \| w \|_{r_2} + C \left( \| \psi \|_\infty |w|_m + |w - W|_m \right) $$ (6.21)

with 1/r1 + 1/r2 = 1/2, r1 < ∞, by Lemmas 3.2 and 3.3, and similarly for 0 ≤ m ≤ k,

$$ \left\| \omega^m |x| ( \exp(-i\psi)\, w - W ) \right\|_2 \le C \exp( C \| \psi \|_\infty ) \| \omega^m \psi \|_{r_1} \| x w \|_{r_2} + C \left( \| \psi \|_\infty |xw|_m + |x(w - W)|_m \right) . $$ (6.22)

Taking r1 = 6, r2 = 3 for m = 0 and r1 = 2, r2 = ∞ for m ≥ 1, using the Sobolev inequality $\| \psi \|_\infty \le C ( \| \sigma \|_2 \| \nabla \sigma \|_2 )^{1/2}$ and using (6.9)–(6.11) yields (6.15)–(6.16). The estimate (6.17) follows immediately from (6.15) and from the inequality

$$ \| f \|_r = t^{-\delta(r)} \| D^* M^* f \|_r \le C\, t^{-\delta(r)} \| \omega^{\delta(r)} D^* M^* f \|_2 = C\, t^{-\delta(r)} \left\| |J(t)|^{\delta(r)} f \right\|_2 $$

for 2 ≤ r < ∞ and from a similar inequality for r = ∞.
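The origin of the factor $t^{-\delta(r)}$ and of the exponent δ(r) = 3/2 − 3/r in the last inequality can be traced as follows. This is only a sketch: the operators M(t) and D(t) are defined in (2.14), (2.15), which are not reproduced in this section, and the explicit forms used below (a phase of modulus one for M, an $L^2$-normalised dilation for D, in space dimension 3) are assumptions made for the purpose of illustration.

```latex
% Assumed forms (not quoted from (2.14), (2.15)):
%   (M(t) f)(x) = \exp(i x^2 / 2t)\, f(x),   (D(t) f)(x) = (it)^{-3/2} f(x/t).
% Scaling of the L^r norm under the dilation, in dimension 3:
\[
\| M(t) D(t) g \|_r = t^{-3/2} \Big( \int dx\, |g(x/t)|^r \Big)^{1/r}
= t^{-3/2 + 3/r}\, \| g \|_r = t^{-\delta(r)}\, \| g \|_r .
\]
% Applied to g = D^* M^* f this gives \|f\|_r = t^{-\delta(r)} \|D^* M^* f\|_r.
% Sobolev embedding \dot H^{\delta} \subset L^r in dimension 3, for 2 \le r < \infty:
\[
\| g \|_r \le C\, \| \omega^{\delta(r)} g \|_2 ,
\qquad \frac1r = \frac12 - \frac{\delta(r)}{3}
\iff \delta(r) = \frac32 - \frac3r .
\]
% Finally (6.7) gives J = M D (i\nabla) D^* M^*, hence |J|^{\delta} = M D\, \omega^{\delta} D^* M^*
% and \| \omega^{\delta} D^* M^* f \|_2 = \| |J(t)|^{\delta} f \|_2 by unitarity of M D in L^2.
```

Since 0 ≤ δ(r) < 3/2 < k for 2 ≤ r < ∞, the right-hand side is controlled by the norm appearing in (6.15), which is how (6.17) inherits the additional decay $t^{-\delta(r)}$.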

Remark 6.1. The leading term in the asymptotic behaviour of A is $t^{-1} D_0 B_a(W)$. Replacing W by w+ as a first approximation, one obtains $A \sim t^{-1} D_0 B_a(w_+)$, and since Ba(w+) is constant in time, that term spreads by dilation by t and decays as $t^{-1}$ in the $L^\infty$ norm. In the norms considered in (6.18), that term is O(1), so that the remainder is smaller than the leading term by $t^{-1} \ln t$. We have stated the remainder estimates in terms of B rather than A because they are simpler for B, since for A the dilation D0 induces a dependence of the time decay on the order of derivation. In fact (6.18) is equivalent to

$$ \left\| \omega^m ( A - t^{-1} D_0 B_a(W) ) \right\|_2 \vee \left\| \omega^m \nabla x \cdot ( A - t^{-1} D_0 B_a(W) ) \right\|_2 \le C\, t^{-m-1/2} \ln t $$ (6.23)

for the relevant values of m, namely 1 ≤ m ≤ k + 1 for the first norm and 1 ≤ m ≤ k for the second one.

Acknowledgements. One of us (G.V.) is grateful to Professor D. Schiff for the hospitality extended to him at the Laboratoire de Physique Théorique in Orsay, where part of this work was done.

References

1. Derezinski, J., Gérard, C.: Scattering theory of classical and quantum N-particle systems. Berlin: Springer, 1997
2. Ginibre, J., Ozawa, T.: Long range scattering for nonlinear Schrödinger and Hartree equations in space dimension n ≥ 2. Commun. Math. Phys. 151, 619–645 (1993)
3. Ginibre, J., Velo, G.: Long range scattering and modified wave operators for some Hartree type equations I. Rev. Math. Phys. 12, 361–429 (2000)
4. Ginibre, J., Velo, G.: Long range scattering and modified wave operators for some Hartree type equations II. Ann. H. P. 1, 753–800 (2000)
5. Ginibre, J., Velo, G.: Long range scattering and modified wave operators for some Hartree type equations III, Gevrey spaces and low dimensions. J. Diff. Eq. 175, 415–501 (2001)
6. Ginibre, J., Velo, G.: Long range scattering and modified wave operators for the Wave-Schrödinger system. Ann. H. P. 3, 537–612 (2002)
7. Guo, Y., Nakamitsu, K., Strauss, W.: Global finite energy solutions of the Maxwell-Schrödinger system. Commun. Math. Phys. 170, 181–196 (1995)
8. Hayashi, N., Naumkin, P.I.: Scattering theory and large time asymptotics of solutions to Hartree type equations with a long range potential. Preprint, 1997
9. Hayashi, N., Naumkin, P.I.: Remarks on scattering theory and large time asymptotics of solutions to Hartree type equations with a long range potential. SUT J. of Math. 34, 13–24 (1998)
10. Hayashi, N., Ozawa, T.: Modified wave operators for the derivative nonlinear Schrödinger equation. Math. Ann. 298, 557–576 (1994)
11. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol I, Berlin: Springer, 1983
12. Kato, T., Ponce, G.: Commutator estimates and the Euler and Navier-Stokes equations. Commun. Pure Appl. Math. 41, 891–907 (1988)
13. Kenig, C., Ponce, G., Vega, L.: The initial value problem for a class of nonlinear dispersive equations. In: Functional-Analytic Methods for Partial Differential Equations, Lect. Notes Math. 1450, 1990, pp. 141–156
14. Nakamitsu, K., Tsutsumi, M.: The Cauchy problem for the coupled Maxwell-Schrödinger equations. J. Math. Phys. 27, 211–216 (1986)
15. Nakanishi, K.: Modified wave operators for the Hartree equation with data, image and convergence in the same space. Commun. Pure Appl. Anal. In press
16. Nakanishi, K.: Modified wave operators for the Hartree equation with data, image and convergence in the same space II. Ann. H. P. 3, 503–535 (2002)
17. Ozawa, T.: Long range scattering for nonlinear Schrödinger equations in one space dimension. Commun. Math. Phys. 139, 479–493 (1991)
18. Ozawa, T., Tsutsumi, Y.: Asymptotic behaviour of solutions for the coupled Klein-Gordon-Schrödinger equations. In: Spectral and Scattering Theory and Applications, Adv. Stud. in Pure Math. Jap. Math. Soc. 23, 1994, pp. 295–305
19. Tsutsumi, Y.: Global existence and asymptotic behaviour of solutions for the Maxwell-Schrödinger system in three space dimensions. Commun. Math. Phys. 151, 543–576 (1993)

Communicated by P. Constantin

Commun. Math. Phys. 236, 449–475 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0827-3


Log-Infinitely Divisible Multifractal Processes

E. Bacry$^1$, J.F. Muzy$^2$

$^1$ Centre de Mathématiques Appliquées, Ecole Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]
$^2$ CNRS, UMR 6134, Université de Corse, Grossetti, 20250 Corte, France. E-mail: [email protected]

Received: 8 July 2002 / Accepted: 17 December 2002
Published online: 14 April 2003 – © Springer-Verlag 2003

Abstract: We define a large class of multifractal random measures and processes with arbitrary log-infinitely divisible exact or asymptotic scaling law. These processes generalize within a unified framework both the recently defined log-normal Multifractal Random Walk processes (MRW) [33, 3] and the log-Poisson “product of cylindrical pulses” [7]. Their construction involves some “continuous stochastic multiplication” [36] from coarse to fine scales. They are obtained as limit processes when the finest scale goes to zero. We prove the existence of these limits and we study their main statistical properties including non-degeneracy, convergence of the moments and multifractal scaling.

1. Introduction

Fractal objects and the related concept of scale-invariance are now generally used in many fields of natural, information or social sciences. They have been involved in a large number of empirical as well as theoretical studies concerning a wide variety of problems. The scale-invariance property of a stochastic process is usually quantified by the scaling exponents $\zeta_q$ associated with the power-law behavior of the order $q$ moments of the “fluctuations” at different scales. More precisely, for a 1D random process$^1$ $X(t)$, let us consider the order $q$ absolute moment of the “fluctuation” $\delta_l X(t)$ at scale $l$:

$$m(q, l) = E\left(|\delta_l X(t)|^q\right), \tag{1}$$

where the “fluctuation” process $\delta_l X(t)$ is assumed to be stationary and $E(.)$ stands for the mathematical expectation. Usually, the fluctuation $\delta_l X(t)$ is chosen to be the increment of $X(t)$ at time $t$ and scale $l$, $\delta_l X(t) = X(t + l) - X(t)$, but it can also be defined as a wavelet coefficient [31, 2, 32]. The $\zeta_q$ exponents are defined from the power-law scaling

$$m(q, l) = K_q\, l^{\zeta_q}, \quad \forall\, l \le T. \tag{2}$$

$^1$ We will exclusively consider, in this paper, real valued random functions of a 1D continuous “time” variable $t$. Though the extension to higher dimensions is rather natural, this problem will be addressed in a forthcoming study.
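As an illustration of how (1) and (2) are typically used on sampled data, the following minimal Python sketch estimates $\zeta_q$ as the slope of $\log m(q, l)$ against $\log l$. The function names and the choice of an ordinary least-squares fit are assumptions made for this example and are not taken from the paper; lags are counted in samples, which does not change the slope.

```python
import math

def structure_function(x, lag, q):
    """Empirical order-q absolute moment of the increments of x at an integer lag."""
    increments = [abs(x[i + lag] - x[i]) for i in range(len(x) - lag)]
    return sum(v ** q for v in increments) / len(increments)

def estimate_zeta(x, lags, q):
    """Least-squares slope of log m(q, l) versus log l over the given lags."""
    log_l = [math.log(lag) for lag in lags]
    log_m = [math.log(structure_function(x, lag, q)) for lag in lags]
    mean_l = sum(log_l) / len(log_l)
    mean_m = sum(log_m) / len(log_m)
    num = sum((a - mean_l) * (b - mean_m) for a, b in zip(log_l, log_m))
    den = sum((a - mean_l) ** 2 for a in log_l)
    return num / den

# A linear ramp has increments proportional to the lag, so zeta_q = q.
ramp = [0.5 * i for i in range(200)]
zeta_2 = estimate_zeta(ramp, [1, 2, 4, 8], 2.0)
```

On a multifractal signal the same estimator, applied for several values of $q$, would be expected to trace out a concave curve rather than a straight line.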

When the $\zeta_q$ function is linear, i.e., $\zeta_q = qH$, the process is referred to as a monofractal process with Hurst exponent $H$. In that case the scaling can extend over an unbounded range of scales (one can have $T = +\infty$). Examples of monofractal processes are the so-called self-similar processes like (fractional) Brownian motion or α-stable motion [39]. When the function $\zeta_q$ is non-linear, it is necessarily a concave function and $T < +\infty$. In that case the process is called a multifractal process. Let us remark that this definition of multifractality relies upon the scaling properties of increment absolute moments. An alternative definition refers to the point-wise fluctuations of the regularity properties of sample paths (see e.g. [7, 20]). Sometimes, one can establish an exact equivalence between these two definitions within the so-called multifractal formalism. Let us note that the scaling equation (2) refers to an exact continuous scale invariance. Weaker forms of scale invariance are often used, notably asymptotic scale invariance, which assumes that the scaling holds only in the limit $l \to 0^+$:

$$m(q, l) \sim K_q\, l^{\zeta_q}, \quad \text{when } l \to 0^+. \tag{3}$$

The discrete scale invariance only assumes that the scaling holds for a discrete subset of scales $l_n$ (with $l_n \to 0$ when $n \to +\infty$):

$$m(q, l_n) = K_q\, l_n^{\zeta_q}. \tag{4}$$

The paradigm of multifractal processes that satisfy discrete scale invariance are the Mandelbrot multiplicative cascades [25] or the recently introduced wavelet cascades [1]. In the Mandelbrot construction (the principle is the same for wavelet cascades), $l_n = 2^{-n}$ and a sequence of probability measures $M_{l_n}(dt)$ is built recursively. $M_{l_n}(dt)$ is uniform on the dyadic intervals $I_{n,k} = ]k2^{-n}, (k + 1)2^{-n}]$ and is obtained from $M_{l_{n-1}}(dt)$ using the cascading rule

$$M_{l_n}(dt) = W_{n,k}\, M_{l_{n-1}}(dt), \quad \text{for } t \in I_{n,k}, \tag{5}$$

where the weights $W_{n,k}$ are i.i.d. positive random variables such that $E(W_{n,k}) = 1$. The convergence and regularity properties of such constructions have been studied extensively [21, 11, 8, 16, 17, 30, 5, 6] (from a general point of view, convergence of multiplicative constructions to singular measures has been studied in the Gaussian case in [22] and in the Lévy stable case in [13]). Despite the fact that multiplicative discrete cascades have been widely used as reference models in many applications, they possess many drawbacks related to their discrete scale invariance: mainly, they involve a particular scale ratio (e.g. λ = 2) and they do not possess stationary fluctuations (this comes from the fact that they are constructed on a dyadic tree structure).

The purpose of this paper is to define a new class of continuous time stochastic processes with stationary fluctuations that are multifractal in the sense that they verify exact or asymptotic continuous scaling (Eqs. (2) or (3)) with a non-linear $\zeta_q$ spectrum. Though, as pointed out in [27, 26], continuous time multifractal processes with continuous scale invariance are obviously appealing from both fundamental and modeling aspects, until very recently such processes were lacking. To our knowledge, only the recent works by Bacry et al. [33, 3] and Barral and Mandelbrot [7] refer to a precise construction of multifractal continuous scale invariant processes. In both works, these processes are obtained as limit processes; however, only [7] gives a full proof of the convergence. Bacry et al. have introduced the so-called Multifractal Random Walk (MRW) processes as continuous time limit processes based on discrete time random walks with stochastic log-normal variance. Independently, Barral and Mandelbrot [7] have proposed a new class of stationary multifractal measures. Their construction is based on distributing, in a half-plane, Poisson points associated with i.i.d. random weights and then taking a product of these weights over conical domains.

The construction that we propose in this paper generalizes these two approaches to a unified framework involving general infinitely divisible laws. Within this new framework, the constructions of [3] and [7] (except for the case where the distribution of the multiplicative weights has a Dirac mass in 0) are particular cases respectively associated with normal and compound Poisson distributions. The main principles of this construction are similar to those of the Barral and Mandelbrot construction [7] and have been previously described in [10]. They rely upon an idea advanced in a recent work by Schmitt and Marsan [36]. These authors have shown that a “continuous limit” of the discrete cascade equation (5) can be interpreted as the exponential of a stochastic integral of an infinitely divisible 2D noise over a suitably chosen conical domain. We exploit this idea in order to construct random measures and random processes that satisfy either the exact scaling (2) or the asymptotic scaling (3) with a $\zeta_q$ function that can be associated with an almost arbitrary infinitely divisible law.
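For comparison with the continuous construction, the discrete cascading rule (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights are taken log-normal with $E(W) = 1$, and the function name `dyadic_cascade` and the parameter `sigma` are hypothetical choices for this example, not notation from the paper.

```python
import math
import random

def dyadic_cascade(depth, sigma, rng):
    """Masses of the 2**depth dyadic intervals of ]0, 1] after `depth` cascade steps.

    Each interval is split in two and the density on each half is multiplied
    by an independent weight W = exp(N(-sigma^2/2, sigma^2)), so that E(W) = 1.
    """
    masses = [1.0]
    for _ in range(depth):
        children = []
        for mass in masses:
            for _ in range(2):
                w = math.exp(rng.gauss(-0.5 * sigma * sigma, sigma))
                children.append(0.5 * mass * w)  # half the length, density times w
        masses = children
    return masses

rng = random.Random(0)
uniform = dyadic_cascade(3, 0.0, rng)   # sigma = 0: the uniform measure
rough = dyadic_cascade(10, 0.3, rng)    # an intermittent measure on 1024 cells
```

The dyadic tree is visible in the code: two cells that are neighbours but sit in different branches share few weights, which is the source of the non-stationarity mentioned above.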
More specifically, we show that this new class of processes satisfies a continuous cascade equation: the fluctuation process $\{\delta_{\lambda l} X(t)\}_t$ at a scale $\lambda l$ (where $l$ is an arbitrary scale smaller than the large scale $T$ and $\lambda < 1$) is obtained from the fluctuation process $\{\delta_l X(t)\}_t$ at the larger scale $l$ through the simple “cascading” rule

$$\{\delta_{\lambda l} X(t)\}_t \overset{law}{=} W_\lambda\, \{\delta_l X(t)\}_t, \tag{6}$$

where $\ln(W_\lambda)$ is an arbitrary infinitely divisible random variable independent of $\{\delta_l X(t)\}_t$ and whose law depends only on λ. Let us mention that, in the companion paper [34], the random processes introduced in this paper have also been studied with less care for rigor but with many intuitive arguments, numerous examples, methods for simulations and a discussion of possible applications.

The paper is organized as follows. In Sect. 2 we introduce our main notations and some well known results about independently scattered random measures [35]. In Sect. 3 we define a new class of stationary random measures: the Multifractal Random Measures (MRM). Section 4 states the main results on these measures (mainly, non-degeneracy, convergence of the moments and exact or asymptotic multifractal scaling). These results are all proved in Sect. 5. In Sect. 6, MRM and Brownian motion are used to build a family of continuous time multifractal random processes: the log-infinitely divisible Multifractal Random Walks (MRW). The main results concerning MRW are given (and proved) in Sect. 7. Particular cases of MRM and MRW along with connected approaches for building multifractal stochastic measures or processes are discussed in Sect. 8. Conclusions and prospects for future research are reported in Sect. 9. Some technical proofs are detailed in the Appendices.

2. Basic Notations – Independently Scattered Random Measures

2.1. Basic notations. Hereafter the symbol $E(X)$ will stand for the mathematical expectation of the random variable $X$. We will always omit the reference to the randomness


parameter. If $X$ is a function of time $t$, $X(t)$ will denote the random variable at fixed time $t$, whereas $\{X(t)\}_t$ will denote the whole process. The equality $\overset{law}{=}$ will denote the equality of finite dimensional distributions. Moreover, the abbreviation a.s. (resp. m.s.) stands for almost surely, i.e., with probability one (resp. mean square).

Let us define the measure space $(S^+, \mu)$ as follows. $S^+$ is the space-scale half-plane $S^+ = \{(t, l),\ t \in \mathbb{R},\ l \in \mathbb{R}^{+*}\}$, with which one can associate the measure $\mu(dt, dl) = l^{-2}\,dt\,dl$. This measure is the (left-) Haar measure of the translation-dilation group acting on $S^+$.

2.2. Infinitely divisible random variables and independently scattered random measures. Let us recall [14] that a random variable $X$ is infinitely divisible iff, for all $n \in \mathbb{N}^*$,

$$X \overset{law}{=} \sum_{i=1}^{n} X_{n,i},$$

where $X_{n,i}$ are i.i.d. random variables. Infinitely divisible random variables are intimately related to Lévy processes [14, 9], which are stochastic processes with independent increments. The characteristic function of an infinitely divisible random variable $X$ can be written as $E(e^{iqX}) = e^{\varphi(q)}$, where $\varphi(q)$ is characterized by the celebrated Lévy-Khintchine formula [14, 9]:

$$\varphi(q) = imq + \int \frac{e^{iqx} - 1 - iq \sin x}{x^2}\, \nu(dx), \tag{7}$$

where $\nu(dx)$ is the so-called Lévy measure and satisfies $\int_{-\infty}^{-y} \nu(dx)/x^2 < \infty$ and $\int_{y}^{\infty} \nu(dx)/x^2 < \infty$ for all $y > 0$.

Following [35], one can introduce an independently scattered infinitely divisible random measure $P$ distributed on the half-plane $S^+$. “$P$ is independently scattered” means that, for every sequence of disjoint sets $A_n$ of $S^+$, $\{P(A_n)\}_n$ are independent random variables and

$$P\left(\cup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n), \quad \text{a.s.},$$

provided $\cup_{n=1}^{\infty} A_n \subset S^+$. $P$ is said to be infinitely divisible on $(S^+, \mu(dt, dl))$ associated with the Lévy measure $\nu(dx)$ if for any µ-measurable set $A$, $P(A)$ is an infinitely divisible random variable whose characteristic function is

$$E\left(e^{iqP(A)}\right) = e^{\varphi(q)\mu(A)}. \tag{8}$$

Let us notice that one can build a convex function $\psi(q)$ ($q \in \mathbb{R}^+$) such that, for all (non-empty) subsets $A$ of $S^+$,

• $\psi(q) = +\infty$, if $E(e^{qP(A)}) = +\infty$,
• $E(e^{qP(A)}) = e^{\psi(q)\mu(A)}$, otherwise.

Moreover, if we define

$$q_c = \max_q \{q \ge 0,\ \psi(q) < +\infty\}, \tag{9}$$

it is then clear that $\forall q \in [0, q_c[$, $\psi(q) \ne +\infty$. Let us note that one can extend the definition of $\varphi$ so that it is a continuous function of a complex variable such that $\psi(q) = \varphi(-iq)$, $\forall q \in \{z \in \mathbb{C},\ 0 \le \mathrm{Re}(z) < q_c\}$. Moreover, in the case where there exists $\epsilon > 0$ such that $E(e^{-\epsilon P(A)}) < +\infty$ for a non-empty subset $A$ of $S^+$, $\varphi$ can be chosen to be analytical in the strip $\{z \in \mathbb{C},\ -\epsilon < \mathrm{Re}(z) < q_c\}$.

3. Definitions of Multifractal Random Measures (MRM)

3.1. Defining the generic MRM measure $M(dt)$. Let $P$ be an infinitely divisible independently scattered random measure on $(S^+, dt\,dl/l^2)$ as defined in the previous section, associated with the Lévy measure $\nu(dx)$ and such that $q_c > 1$, i.e.,

$$\exists\, \epsilon > 0,\quad \psi(1 + \epsilon) < +\infty, \tag{10}$$

and

$$\psi(1) = 0.$$

Definition 1 (Filtration $\mathcal{F}_l$). Let $\Omega$ be the probability space on which $P$ is defined. $\mathcal{F}_l$ is the filtration of $\Omega$ defined by $\mathcal{F}_l = \sigma\{P(dt, dl'),\ l' \ge l\}$.

The construction of $M$ involves cone-like subsets of $S^+$. These cone-like subsets $A_l(t)$ are defined using a boundary function $f(l)$. More precisely,

Definition 2 (The $A_l(t)$ set and the $f(l)$ function). The subset $A_l(t)$ of $S^+$ is defined by

$$A_l(t) = \{(t', l'),\ l' \ge l,\ -f(l')/2 < t' - t \le f(l')/2\}, \tag{11}$$

where $f(l)$ is a positive function of $l$ such that

$$\int_l^{+\infty} f(s)/s^2\, ds < +\infty, \qquad \exists\, L > 0,\ f(l) = l \text{ for } l < L.$$

Let us note that $\mu(A_l) = \int_l^{+\infty} f(s)/s^2\, ds < +\infty$. It is convenient to represent the function $f(l)$ as $f(l) = f^{(e)}(l) + g(l)$, where $f^{(e)}$ is defined by (15) and $g$ satisfies

$$\exists\, L > 0,\ \forall l < L,\ g(l) = 0. \tag{12}$$


Definition 3 ($\omega_l(t)$ process). The process $\omega_l(t)$ is defined as

$$\omega_l(t) = P(A_l(t)). \tag{13}$$

Definition 4 ($M_l(dt)$ measure). For $l > 0$, we define

$$M_l(dt) = e^{\omega_l(t)}\, dt, \tag{14}$$

in the sense that for any Lebesgue measurable set $I$, one has $M_l(I) = \int_I e^{\omega_l(t)}\, dt$.

Let us note that (14) makes sense because $\omega_l(t)$ corresponds, with probability one, to a right continuous and left-hand limited (cadlag) function of $t$. Indeed, it can be expressed as $\omega_l(t) = X_l(t) - Y_l(t) + Z_l$, with

$$X_l(t) = P(\{(t', l'),\ l' \ge l,\ f(l')/2 \le t' \le t + f(l')/2\}),$$
$$Y_l(t) = P(\{(t', l'),\ l' \ge l,\ -f(l')/2 \le t' \le t - f(l')/2\}),$$
$$Z_l = P(\{(t', l'),\ l' \ge l,\ -f(l')/2 \le t' \le f(l')/2\}).$$

One can easily check that $\{X_l(t)\}_t$ and $\{Y_l(t)\}_t$ are Lévy processes (with $X_l(0) = Y_l(0) = 0$). Since Lévy processes are well known to have, with probability one, cadlag versions and since $Z_l$ does not depend on $t$, $\omega_l(t)$ is also cadlag.

Definition 5 ($M$ measure). The MRM measure $M(dt)$ is defined as the limit measure (when it exists)

$$M(dt) = \lim_{l \to 0^+} M_l(dt),$$

where $M_l(dt)$ is defined in Definition 4.

3.2. Defining the exact scaling MRM measure $M^{(e)}(dt)$. In the previous section, for any choice of $f(l)$, one can define an MRM $M(dt)$. In the following, we will prove that $M([0, t])$ satisfies the asymptotic scaling (3). From a fundamental point of view, but also in most applications, it is interesting to have a model where the scale invariance property (2) is “measured” on a whole range of scales and not only as an asymptotic property. One thus needs to build multifractal processes/measures that satisfy the exact scaling relation (2) on a whole range of scales $l \in ]0, T]$, where $T$ is an arbitrary large scale.


The exact scaling property can be obtained by picking a particular shape for the set $A_l$, i.e., by choosing the appropriate function $f(l)$. As we will see, only the particular choice $f(l) = f^{(e)}(l)$ (i.e. $g(l) = 0$ in (12)) with

$$f^{(e)}(l) = \begin{cases} l, & \forall l \le T \\ T, & \forall l > T \end{cases} \tag{15}$$

will lead to an MRM measure with exact scaling (2). The MRM measure associated with this particular choice will be referred to as $M^{(e)}(dt)$, i.e.,

Definition 6 ($M^{(e)}$ measure). The MRM measure $M^{(e)}(dt)$ is defined as the limit measure (when it exists)

$$M^{(e)}(dt) = \lim_{l \to 0^+} M_l^{(e)}(dt),$$

where $M_l^{(e)}(dt)$ is equal to $M_l(dt)$ (Definition 4) for the particular choice $f = f^{(e)}$ (Eq. (15)).

3.3. Numerical simulation – Discrete construction of an MRM. In the case where the Lévy measure $\nu(dx)$ verifies $\int \nu(dx)/x^2 < +\infty$, a realization of the measure $P(dt, dl)$ is made of isolated weighted Dirac distributions in the $S^+$ half-plane (the construction then basically reduces to the one of Barral and Mandelbrot [7]). Thus the process $M_l([0, t[) = \int_0^t e^{\omega_l(u)}\, du$ is a jump process that can be simulated with no approximation. This gives a way of simulating a process which is arbitrarily close (by choosing $l$ close to 0) to the limit process. However, if $\int \nu(dx)/x^2 = +\infty$ (e.g., $\nu(dx)$ has a Gaussian component), this is no longer the case. Thus, one has to build another sequence of stochastic measures $\tilde{M}_l(dt)$ that converges towards $M(dt)$ and that can be easily simulated. The purpose of the following construction is to build such a sequence.

We choose $\tilde{M}_l(dt)$ to be uniform on each interval of the form $[kl, (k + 1)l[$, $\forall k \in \mathbb{N}$, and with density $e^{\omega_l(kl)}$. Thus, for any $t > 0$ such that $t = nl$ with $n \in \mathbb{N}^*$, one gets

$$\tilde{M}_l([0, t[) = \sum_{k=0}^{n-1} e^{\omega_l(kl)}\, l. \tag{16}$$
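The two ingredients of this section, a point-mass realization of $P$ and the sum (16), can be combined in a short Python sketch. It rests on several assumptions that are not in the paper: the jumps all have the same size `x0` (a log-Poisson type noise of rate `C`), the compensating $\sin$ term of (7) is absorbed into the drift, which is fixed by $\psi(1) = 0$ with $\psi(q) = mq + C(e^{qx_0} - 1)$, and the boundary function is $f^{(e)}$ with $l \le T$. All function names are hypothetical.

```python
import math
import random

def f_exact(scale, T):
    """Boundary function f^(e) of Eq. (15)."""
    return scale if scale <= T else T

def sample_points(l_min, T, t_max, C, rng):
    """Poisson points (t', l') with l' >= l_min and intensity C dt' dl' / l'^2.

    Only times in [-T/2, t_max + T/2] can fall in a cone A_l(t), 0 <= t <= t_max.
    """
    points = []
    rate = C / l_min                      # points per unit time with scale >= l_min
    t = -T / 2 + rng.expovariate(rate)
    while t <= t_max + T / 2:
        scale = l_min / (1.0 - rng.random())   # P(scale > s) = l_min / s
        points.append((t, scale))
        t += rng.expovariate(rate)
    return points

def omega(t, l_min, T, points, C, x0):
    """omega_l(t) = P(A_l(t)) for jumps of fixed size x0 (assumed noise model)."""
    drift = -C * (math.exp(x0) - 1.0)     # enforces psi(1) = 0
    mu_cone = math.log(T / l_min) + 1.0   # mu(A_l) for f = f^(e) and l_min <= T
    jumps = sum(
        x0 for (tp, lp) in points
        if -f_exact(lp, T) / 2 < tp - t <= f_exact(lp, T) / 2
    )
    return drift * mu_cone + jumps

def discretized_mass(l_min, T, n, C, x0, rng):
    """Sum over k < n of exp(omega_l(k l)) * l, as in Eq. (16)."""
    points = sample_points(l_min, T, n * l_min, C, rng)
    return sum(
        math.exp(omega(k * l_min, l_min, T, points, C, x0)) * l_min
        for k in range(n)
    )

mass = discretized_mass(1.0 / 64, 1.0, 64, 2.0, -0.5, random.Random(1))
```

With `x0 = 0` every weight is trivial and the sum reduces to the Lebesgue measure of the interval, which gives a quick sanity check of the bookkeeping.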

We will restrict ourselves to $l = l_n = 2^{-n}$. We then define the discretized MRM $\tilde{M}$:

Definition 7 ($\tilde{M}$ measure). The discretized MRM measure $\tilde{M}(dt)$ is defined as the limit measure (when it exists)

$$\tilde{M}(dt) = \lim_{n \to +\infty} \tilde{M}_{l_n}(dt), \tag{17}$$

where $\tilde{M}_{l_n}(dt)$ is defined by (16) with $l_n = 2^{-n}$.

4. Main Results on MRM

In this section, we state the main theorems concerning MRM measures. All these theorems are proved throughout the next section. Most of the results will first be proved for $M = M^{(e)}$. The generalization to any measure $M$ will then be made using the exact scaling properties of $M^{(e)}$ (Theorem 4). We introduce the following definition.

Definition 8 (Scaling exponents $\zeta_q$).

$$\forall q > 0,\quad \zeta_q = q - \psi(q). \tag{18}$$
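To make Definition 8 concrete, the sketch below evaluates $\zeta_q = q - \psi(q)$ for two of the laws discussed in Sect. 8. The log-normal $\psi$ is the one given there with $m = -\lambda^2/2$; for the log-Poisson case the linear term is simply fixed by $\psi(1) = 0$, which is an assumption of this example (the $\sin$ compensator of (7) is absorbed into $m$). Function and parameter names are illustrative only.

```python
import math

def psi_lognormal(q, lam2):
    """psi(q) = m q + lam2 q^2 / 2 with m = -lam2 / 2, so that psi(1) = 0."""
    return -0.5 * lam2 * q + 0.5 * lam2 * q * q

def psi_logpoisson(q, gamma, delta):
    """psi(q) = m q - gamma (1 - delta^q), with m chosen so that psi(1) = 0."""
    m = gamma * (1.0 - delta)
    return m * q - gamma * (1.0 - delta ** q)

def zeta(q, psi):
    """Scaling exponent of Eq. (18)."""
    return q - psi(q)

lam2 = 0.02
zeta_ln = [zeta(q, lambda p: psi_lognormal(p, lam2)) for q in (0.5, 1.0, 1.5, 2.0)]
zeta_lp = [zeta(q, lambda p: psi_logpoisson(p, 2.0, 0.8)) for q in (0.5, 1.0, 1.5, 2.0)]
```

In both cases $\zeta_1 = 1$ follows directly from $\psi(1) = 0$, and the midpoint values lie above the chords, as expected from the convexity of $\psi$.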

Note that because $\psi$ is convex, $\zeta_q$ is a concave function. As we will see in the following theorems, the so-defined exponents $\zeta_q$ do correspond to the multifractal scaling exponents in the sense of (2) or (3).

Theorem 1 (Existence of the limit MRM measure $M(dt)$). There exists a stochastic measure $M(dt)$ such that
(i) with probability one, $M_l(dt)$ converges weakly towards $M(dt)$, for $l \to 0^+$,
(ii) $\forall t \in \mathbb{R}$, $M(\{t\}) = 0$,
(iii) for any bounded set $K$ of $\mathbb{R}$, $M(K) < +\infty$ and $E(M(K)) \le |K|$.

Theorem 2 (Non-degeneracy of $M(dt)$). Let (H) denote the hypothesis

(H) $\exists\, \epsilon > 0$, $\zeta_{1+\epsilon} > 1$.

If (H) holds then $E(M([0, t])) = t$.

Theorem 3 (Moments of positive orders of $M(dt)$). Let $q > 0$, then
(i) $\zeta_q > 1 \Rightarrow E(M([0, t])^q) < +\infty$,
(ii) if (H) then $E(M([0, t])^q) < +\infty \Rightarrow \zeta_q \ge 1$.

Theorem 4 (Exact scaling of $M^{(e)}(dt)$).

$$\{M^{(e)}([0, \lambda t])\}_t \overset{law}{=} W_\lambda\, \{M^{(e)}([0, t])\}_t, \quad \forall \lambda \in ]0, 1[ \text{ and } t \le T,$$

with $W_\lambda = \lambda e^{\Omega_\lambda}$, where $\Omega_\lambda$ is an infinitely divisible random variable (independent of $\{M_l([0, t])\}_t$) whose characteristic function is $E(e^{iq\Omega_\lambda}) = \lambda^{-\varphi(q)}$. If $\zeta_q = -\infty$, then $E(M^{(e)}([0, t])^q) = +\infty$ and otherwise

$$E\left(M^{(e)}([0, t])^q\right) = \left(\frac{t}{T}\right)^{\zeta_q} E\left(M^{(e)}([0, T])^q\right), \quad \forall t \le T. \tag{19}$$

Let us note that this theorem shows that $M^{(e)}$ is a “continuous cascade” as defined in (6), i.e., it satisfies a continuous version of the discrete multiplicative cascade recurrence (5).

Theorem 5 (Asymptotic scaling of $M(dt)$). If (H) holds, then for $q > 0$ such that $E(M([0, t])^q) < +\infty$ and $\exists\, \epsilon > 0$, $\zeta_{q+\epsilon} \ne -\infty$, one has

$$E\left(M([0, t])^q\right) \underset{t \to 0^+}{\sim} \left(\frac{t}{T}\right)^{\zeta_q} E\left(M([0, T])^q\right).$$

Theorem 6 (Link between $\tilde{M}(dt)$ and $M(dt)$). If there exists $\epsilon > 0$ such that $\zeta_{2+\epsilon} > 1$, then one has

$$\tilde{M}_{l_n}(dt) \overset{m.s.}{\to} M(dt),$$

where the limit is taken in the mean square sense.
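The moment relation (19) can be read off from the scaling law of Theorem 4. The following short derivation is a sketch, assuming that the moment of order $q$ is finite and that the characteristic function of $\Omega_\lambda$ extends to the argument $-iq$ through $\psi(q) = \varphi(-iq)$ as in Sect. 2.2; it takes $\lambda = t/T$ and uses the independence of $W_\lambda$.

```latex
\begin{align*}
E\left(M^{(e)}([0,t])^q\right)
  &= E\left(M^{(e)}([0,\lambda T])^q\right), \qquad \lambda = t/T, \\
  &= E\left(W_\lambda^q\right)\, E\left(M^{(e)}([0,T])^q\right), \\
E\left(W_\lambda^q\right)
  &= \lambda^q\, E\left(e^{q\Omega_\lambda}\right)
   = \lambda^q\, \lambda^{-\psi(q)}
   = \lambda^{\,q-\psi(q)}
   = \left(\frac{t}{T}\right)^{\zeta_q}.
\end{align*}
```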


5. Proofs of Theorems 1 Through 6

5.1. Existence of the limit MRM measure $M(dt)$ – Proof of Theorem 1. Since $\psi(1) = 0$, one has $E(e^{\omega_l(t)}) = 1$. It is then easy to prove that for every Lebesgue measurable set $I$, the sequence $\{M_l(I)\}_l$ is a left continuous positive martingale with respect to $\mathcal{F}_l$. From the general theory of [23], one gets Theorem 1.

5.2. Computation of the characteristic function of $\omega_l(t)$. Let $q \in \mathbb{N}^*$, $\vec{t}_q = (t_1, t_2, \ldots, t_q)$ with $t_1 \le t_2 \le \ldots \le t_q$ and $\vec{p}_q = (p_1, p_2, \ldots, p_q)$. The characteristic function of the vector $\{\omega_l(t_m)\}_{1 \le m \le q}$ is defined by

$$Q_l(\vec{t}_q, i\vec{p}_q) = E\left(e^{\sum_{m=1}^{q} i p_m P(A_l(t_m))}\right).$$

Relation (8) allows us to get an analytical expression for quantities of the form $E\left(e^{\sum_{m=1}^{q} a_m P(B_m)}\right)$, where $\{B_m\}_m$ would be disjoint sets in $S^+$ and $a_m$ arbitrary complex numbers. However, the $\{A_l(t_m)\}_m$ have no reason to be disjoint sets. We need to find a decomposition of $\{A_l(t_m)\}_m$ into disjoint domains. This is naturally done by considering the different intersections between these sets. Let us define $A_l(t, t') = A_l(t) \cap A_l(t')$. In Appendix A, we prove the following lemma.

Lemma 1 (Characteristic function of $\omega_l(t)$). Let $q \in \mathbb{N}^*$, $\vec{t}_q = (t_1, t_2, \ldots, t_q)$ with $t_1 \le t_2 \le \ldots \le t_q$ and $\vec{p}_q = (p_1, p_2, \ldots, p_q)$. The characteristic function of the vector $\{\omega_l(t_m)\}_{1 \le m \le q}$ is

$$E\left(e^{\sum_{m=1}^{q} i p_m P(A_l(t_m))}\right) = e^{\sum_{j=1}^{q} \sum_{k=1}^{j} \alpha(j,k)\, \rho_l(t_k - t_j)}, \tag{20}$$

where $\rho_l(t) = \mu(A_l(0, t))$, and

$$\alpha(j, k) = \varphi(r_{k,j}) + \varphi(r_{k+1,j-1}) - \varphi(r_{k,j-1}) - \varphi(r_{k+1,j}),$$

and

$$r_{k,j} = \begin{cases} \sum_{m=k}^{j} p_m, & \text{for } k \le j \\ 0, & \text{for } k > j \end{cases}.$$

Moreover

$$\sum_{j=1}^{q} \sum_{k=1}^{j} \alpha(j, k) = \varphi\left(\sum_{k=1}^{q} p_k\right). \tag{21}$$

5.3. Exact scaling of $M^{(e)}(dt)$ – Proof of Theorem 4. Let us assume that the family of processes $M_l([0, t])$ satisfies (6), i.e., for $l$ small enough, and for all $\lambda \in ]0, 1[$ and $t \le T$,

$$\{M_{\lambda l}([0, \lambda t])\}_t \overset{law}{=} W_\lambda\, \{M_l([0, t])\}_t,$$


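where $W_\lambda$ is independent of $\{M_l([0, t])\}_t$. By definition of $M_l(dt)$, it gives

$$\left\{\int_0^{\lambda t} e^{\omega_{\lambda l}(u)}\, du\right\}_t \overset{law}{=} W_\lambda \left\{\int_0^{t} e^{\omega_l(u)}\, du\right\}_t.$$

This last relation will hold if the processes $\omega_l(t)$ satisfy the scaling property

$$\{\omega_{\lambda l}(\lambda t)\}_t \overset{law}{=} \Omega_\lambda + \{\omega_l(t)\}_t, \tag{22}$$

where $\Omega_\lambda$ and $W_\lambda$ are linked by the relation $W_\lambda = \lambda e^{\Omega_\lambda}$. Indeed, one would have

$$\{M_{\lambda l}([0, \lambda t])\}_t = \left\{\int_0^{\lambda t} e^{\omega_{\lambda l}(u)}\, du\right\}_t = \lambda \left\{\int_0^{t} e^{\omega_{\lambda l}(\lambda u)}\, du\right\}_t \overset{law}{=} \lambda e^{\Omega_\lambda} \left\{\int_0^{t} e^{\omega_l(u)}\, du\right\}_t = \lambda e^{\Omega_\lambda} \{M_l([0, t])\}_t.$$

Equation (22) translates easily on the characteristic function found in Lemma 1. This gives the following lemma.

Lemma 2 (Exact scaling of $M_l(dt)$). If $\rho_l(t) = \mu(A_l(0, t)) = \mu(A_l(0) \cap A_l(t))$ satisfies the scaling relation

$$\rho_{\lambda l}(\lambda t) = -\log \lambda + \rho_l(t), \quad \forall l \in ]0, T],\ \forall \lambda \in ]0, 1[,\ \forall t \in [0, T], \tag{23}$$

then one has

$$\{M_{\lambda l}([0, \lambda t])\}_t \overset{law}{=} \lambda e^{\Omega_\lambda} \{M_l([0, t])\}_t, \quad \forall l \in ]0, T],\ \forall \lambda \in ]0, 1[ \text{ and } t \le T, \tag{24}$$

where $\Omega_\lambda$ is an infinitely divisible random variable (independent of $\{M^{(e)}([0, t])\}_t$) whose characteristic function is $E(e^{iq\Omega_\lambda}) = \lambda^{-\varphi(q)}$. Moreover, (23) is satisfied in the particular case where $A_l(t)$ is defined by (11) with $f(l) = f^{(e)}(l)$, where $f^{(e)}$ is defined by (15).

Equation (24) is a direct consequence of the previous discussion and Lemma 1. For the last assertion, one has to compute $\rho_l(t) = \rho_l^{(e)}(t)$ in the particular case $f(l) = f^{(e)}(l)$. A direct computation shows that

$$\rho_l^{(e)}(t) = \begin{cases} \ln\left(\frac{T}{l}\right) + 1 - \frac{t}{l} & \text{if } t \le l \\ \ln\left(\frac{T}{t}\right) & \text{if } T \ge t \ge l \\ 0 & \text{if } t > T \end{cases}, \tag{25}$$

which satisfies (23).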
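The closed form (25) and the scaling relation (23) are easy to check numerically. The Python sketch below assumes that, at scale $s$, the two cones $A_l(0)$ and $A_l(t)$ overlap on a time interval of length $(f(s) - |t|)_+$, so that $\rho_l(t) = \int_l^{+\infty} (f(s) - |t|)_+\, s^{-2}\, ds$; this reading of Definition 2, the quadrature rule and the function names are choices made for this example. It also assumes $l \le T$.

```python
import math

def rho_exact(l, t, T):
    """Closed form of Eq. (25) for the boundary function f^(e)."""
    t = abs(t)
    if t <= l:
        return math.log(T / l) + 1.0 - t / l
    if t <= T:
        return math.log(T / t)
    return 0.0

def rho_numeric(l, t, T, steps=20000):
    """Midpoint quadrature of the overlap integral on [l, T] plus the exact tail.

    For s > T the cone width is T, so the tail contributes (T - t)_+ / T.
    """
    t = abs(t)
    h = (math.log(T) - math.log(l)) / steps
    total = 0.0
    for i in range(steps):
        s = l * math.exp((i + 0.5) * h)      # midpoint in the variable u = ln s
        total += max(s - t, 0.0) / s * h     # (s - t)_+ / s^2 ds = (s - t)_+ / s du
    return total + max(T - t, 0.0) / T

l, T, lam = 0.1, 1.0, 0.5
scaling_gap = rho_exact(lam * l, lam * 0.3, T) - rho_exact(l, 0.3, T)  # should equal -ln(lam)
```

The same check can be repeated for $t \le l$, where both sides of (23) again differ by exactly $-\ln \lambda$.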

By taking the limit $l \to 0^+$ one gets Theorem 4.

5.4. Moments of positive orders of $M^{(e)}(dt)$ – Proof of Theorem 3 in the case $M = M^{(e)}$. We are now ready to give conditions for existence of the moments of $M^{(e)}(dt)$.


Lemma 3 (Moments of positive orders of $M^{(e)}(dt)$). If $q > 0$ then
(i) $\zeta_q > 1 \Rightarrow E(M^{(e)}([0, t])^q) < +\infty$ and $\sup_l E(M_l^{(e)}([0, t])^q) < +\infty$,
(ii) if $M^{(e)} \ne 0$ then $E(M^{(e)}([0, t])^q) < +\infty \Rightarrow \zeta_q \ge 1$.

This lemma is proved in Appendix D. It gives Theorem 3 in the particular case $M = M^{(e)}$.

5.5. Non-degeneracy of $M^{(e)}(dt)$ – Proof of Theorem 2 for the case $M = M^{(e)}$. Theorem 2 for the case $M = M^{(e)}$ is a direct consequence of Lemma 3 (i) using a dominated convergence argument.

5.6. Extension of the results on $M^{(e)}(dt)$ to $M(dt)$.

Lemma 4 (Degeneracy, asymptotic scaling and moments of positive orders of $M(dt)$). Let $M(dt)$ be the MRM measure as defined in Definitions 4 and 5 with $f$ satisfying (12). Then, one has
(i) $M^{(e)}(dt) \overset{a.s.}{=} 0 \iff M(dt) \overset{a.s.}{=} 0$.
Let $q \in \mathbb{R}^{+*}$. If $\exists\, \epsilon > 0$, $\zeta_{1+\epsilon} > 1$, so that (according to Sect. 5.5) $M^{(e)}(dt)$ and (according to (i)) $M(dt)$ are not degenerate, one has
(ii) With probability one, $\forall t \ge 0$, $M([0, \lambda t]) \sim X\, M^{(e)}([0, \lambda t])$ when $\lambda \to 0^+$, where $X$ is a random variable independent of $t$ and λ.
(iii) $E(M^{(e)}([0, t])^q) < +\infty \iff E(M([0, t])^q) < +\infty$. Moreover if this assertion holds, one has $\sup_l E(M_l([0, t])^q) < +\infty$.
(iv) If $\exists\, \epsilon > 0$, $\zeta_{q+\epsilon} \ne -\infty$ and $E(M([0, t])^q) < +\infty$, then

$$E\left(M([0, t])^q\right) \underset{t \to 0^+}{\sim} \left(\frac{t}{T}\right)^{\zeta_q} E\left(M([0, T])^q\right).$$

The proof of this lemma can be found in Appendix E.

5.7. Proofs of Theorems 2, 3, and 5. Using Lemma 3 (i) along with Lemma 4 (iii), one gets Theorem 2. Again, Lemma 3 along with Lemma 4 (iii) gives Theorem 3. Lemma 4 (iii) and (iv) give Theorem 5.

5.8. Theorem on $\tilde{M}$ – Proof of Theorem 6. Theorem 6 is proved in Appendix F.

6. Defining a MRW Process Using a Gaussian White Noise

Let us note that attempts to define a MRW process using a fractional Gaussian noise are reported in [34]. In this paper we will address only the case of Gaussian white noise. We define a stochastic process which is not a strictly increasing process.
The simplest approach to build such a process simply consists in subordinating a Brownian motion B(t) using the MRM M(t). The subordination of a Brownian process with a nondecreasing process has been introduced by Mandelbrot and Taylor [24] and is the subject of an extensive literature in mathematical finance. Multifractal subordinators have been considered by Mandelbrot and co-workers [28] and widely used to build multifractal processes. Let us define the process X (s) (t) as:


Definition 9 (Subordinated MRW). Let $B(t)$ be a Brownian motion (with $E(B(1)^2) = 1$) and $M(dt)$ a non-degenerate MRM measure which is independent of $B(t)$. The subordinated MRW process is the process defined by

$$X^{(s)}(t) = B(M([0, t])).$$

In the case where the MRM measure used is $M^{(e)}$, we write

Definition 10 ($X^{(e)}(t)$ process).

$$X^{(e)}(t) = B(M^{(e)}([0, t])).$$

An alternative construction would consist in a stochastic integration using a Wiener noise $dW(u)$.

Definition 11 (Alternative subordinated MRW). Let $dW(u)$ be a Wiener noise of variance 1 and $\omega_l(u)$ the previously introduced process (Eq. (13)), independent of $dW(u)$. The MRW process is defined as the limit (when it exists)

$$X(t) = \lim_{l \to 0^+} X_l(t),$$

where

$$X_l(t) = \int_0^t e^{\omega_l(u)/2}\, dW(u). \tag{26}$$

Let us note that the process $X_l(t)$ is well defined since $\int_0^t E\left((e^{\omega_l(u)/2})^2\right) du = \int_0^t E\left(e^{\omega_l(u)}\right) du = t$. As we will see in the next section, these two constructions lead to the same process. As explained in Sect. 3.3, for simulation purposes, one needs to define a discrete time process that converges towards the MRW. This is done naturally using the discretized MRM measure $\tilde{M}$ introduced in Sect. 3.3.

Definition 12. Let $\{w[k]\}_{k \in \mathbb{Z}}$ be a Gaussian white noise of variance 1. Let $l_n = 2^{-n}$. We define the discretized MRW $\tilde{X}(t)$ as the limit process (when it exists)

$$\tilde{X}(t) = \lim_{n \to +\infty} \tilde{X}_{l_n}(t),$$

where

$$\tilde{X}_{l_n}(t) = \int_0^t \sqrt{\tilde{M}_{l_n}(du)}\; w[\lfloor u/l_n \rfloor].$$

Let us note that, in the case $t = K l_n$, $\tilde{X}_{l_n}(t)$ can be rewritten as

$$\tilde{X}_{l_n}(t) = \sum_{k=0}^{K} \sqrt{e^{\omega_{l_n}(k l_n)}\, l_n}\; w[k].$$
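The discrete sum above translates directly into code once samples of $\omega_{l_n}(k l_n)$ are available. The following Python sketch takes those samples as an input list, so it makes no assumption on how they were generated; the function name `discretized_mrw` and the convention of returning all partial sums are choices made for this example, and the summation range is simply that of the supplied lists.

```python
import math
import random

def discretized_mrw(omega, l_n, noise):
    """Partial sums of sqrt(exp(omega[k]) * l_n) * noise[k].

    omega : samples omega_{l_n}(k l_n), k = 0, 1, ...
    noise : unit-variance Gaussian white noise samples w[k], same length
    """
    path, x = [], 0.0
    for w_k, eps in zip(omega, noise):
        x += math.sqrt(math.exp(w_k) * l_n) * eps   # local variance exp(omega) * l_n
        path.append(x)
    return path

rng = random.Random(0)
n = 8
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
# With omega = 0 the measure is Lebesgue and the path is an ordinary random walk.
path = discretized_mrw([0.0] * n, 1.0 / n, noise)
```

Feeding in correlated $\omega$ samples, for instance from a simulation of the cone construction of Sect. 3, would replace the constant step variance by an intermittent one.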

This expression corresponds exactly to the original (log-normal) MRW expression introduced in Refs. [33, 3].

7. Main Results on MRW Processes

Theorem 7 (The two subordinated MRW are the same processes). If there exists $\epsilon > 0$ such that $\zeta_{1+\epsilon} > 1$, then

$$X(t) = \lim_{l \to 0^+} X_l(t) \overset{law}{=} \lim_{l \to 0^+} B(M_l([0, t])) = B(M([0, t])) = X^{(s)}(t),$$


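where $B(t)$ (resp. $dW(u)$) is a Brownian motion (resp. Wiener noise) of variance 1 independent of $P(dt, dl)$. Moreover $E(X(t)^2) = t$.

The proof of this theorem can be found in Appendix G.

Theorem 8 (Main results on $X(t)$ and $X^{(e)}(t)$). Let $\epsilon > 0$ be such that $\zeta_{1+\epsilon} > 1$. Let $q > 0$. The following properties hold:
(i) $\zeta_q > 1 \Rightarrow E(|X(t)|^{2q}) < +\infty$.
(ii) $E(|X(t)|^{2q}) < +\infty \Rightarrow \zeta_q \ge 1$.
(iii)
$$\{X^{(e)}(\lambda t)\}_t \overset{law}{=} W_\lambda\, \{X^{(e)}(t)\}_t, \quad \forall \lambda \in ]0, 1[ \text{ and } t \le T,$$
with $W_\lambda = \lambda^{1/2} e^{\Omega_\lambda/2}$, where $\Omega_\lambda$ is an infinitely divisible random variable (independent of $\{X^{(e)}(t)\}_t$) whose characteristic function is $E(e^{iq\Omega_\lambda}) = \lambda^{-\varphi(q)}$.
(iv) If $\zeta_q = -\infty$, then $E(|X^{(e)}(t)|^{2q}) = +\infty$ and otherwise
$$E\left(|X^{(e)}(t)|^{2q}\right) = \left(\frac{t}{T}\right)^{\zeta_q} E\left(|X^{(e)}(T)|^{2q}\right), \quad \forall t \le T.$$
(v) If $E(|X(t)|^{2q}) < +\infty$ and $\exists\, \epsilon > 0$, $\zeta_{q+\epsilon} \ne -\infty$, then
$$E\left(|X(t)|^{2q}\right) \underset{t \to 0^+}{\sim} \left(\frac{t}{T}\right)^{\zeta_q} E\left(|X(T)|^{2q}\right).$$

This theorem is a direct consequence of the theorems of Sect. 4. Let us note that (iii) shows that $X^{(e)}$ is a “continuous cascade” as defined in (6).

Theorem 9 (Link between $\tilde{X}(t)$ and $X(t)$). If there exists $\epsilon > 0$ such that $\zeta_{2+\epsilon} > 1$ then

$$\lim_{n \to +\infty} \{\tilde{X}_{l_n}(t)\}_t \overset{law}{=} \{X(t)\}_t.$$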

Proof. Since we have proved (Theorem 6) that, as long as there exists $\epsilon > 0$ such that $\zeta_{2+\epsilon} > 1$, $\tilde{M}_{l_n}(dt)$ converges towards $M(dt)$, one easily gets that the finite dimensional distributions of $\{\tilde{X}_{l_n}(t)\}_t$ converge towards those of $\{X(t)\}_t$. Moreover, in the same way as at the end of Appendix G, using the theorems of Sect. 4, one easily shows that the sequence $\{\tilde{M}_{l_n}(dt)\}$ is tight and consequently $\{\tilde{X}_{l_n}(t)\}$ is also tight. This proves the theorem.

8. Examples of MRM or MRW – Connected Approaches

Let us study, in this section, some examples of log-infinitely divisible MRM (or MRW). Let us first remark that throughout the paper we have supposed that $q_c > 1$ (see Eqs. (9) and (10)). That means that $\int_y^{+\infty} e^x x^{-2}\, \nu(dx) < +\infty$ and therefore it is sufficient that

$$\nu(dx) = O(x^{1-\epsilon} e^{-x}) \quad \text{when } x \to +\infty. \tag{27}$$

Let us remark that this condition is notably satisfied for all Lévy measures whose support is bounded to the right.

• Lebesgue measure: The simplest case is when the Lévy measure $\nu(dx)$ is identically zero. In that case the associated MRM is trivially identical to the Lebesgue measure, i.e., $M(dt) = dt$, and $\zeta_q = q$.

• Log-normal MRM: When the canonical measure ν attributes a finite mass at the origin, $\nu(dx) = \lambda^2 \delta(x)dx$ with $\lambda^2 > 0$, it is easy to see from (7) that $\psi(p)$ is the cumulant generating function of a normal distribution: $\psi(p) = pm + \lambda^2 p^2/2$. We thus have $q_c = +\infty$ and the condition $\psi(1) = 0$ implies the relationship $m = -\lambda^2/2$. The log-normal $\zeta_q$ spectrum is thus a parabola:

$$\zeta_q^{ln} = q\left(1 + \frac{\lambda^2}{2}\right) - \frac{\lambda^2}{2} q^2. \tag{28}$$

In that case the associated MRW process is exactly the same as the process defined in Refs. [33, 3]. The Gaussian process $\omega_l(t)$ (Definition 3) can be directly constructed by filtering a 1D white noise without any reference to 2D conical domains. This model is interesting because its multifractal properties are described by only two parameters, the integral scale $T$ and the so-called intermittency parameter $\lambda^2$. Moreover, many exact analytical expressions can be obtained, notably the value of the prefactor $E(M^{(e)}([0, T])^q)$ in (19) (see Refs. [4, 3]).

• Log-Poisson MRM: When there is a finite mass at some finite value $x_0 = \ln(\delta)$, of intensity $\lambda^2 = \gamma(\ln \delta)^2$: $\nu(dx) = \lambda^2 \delta(x - x_0)$. The corresponding distribution is Poisson of scale parameter γ and intensity $\ln(\delta)$: $\psi(p) = p\,(m - \sin(\ln(\delta))) - \gamma(1 - \delta^p)$. We have again $q_c = +\infty$ and the log-Poisson $\zeta_q$ spectrum is therefore:

$$\zeta_q^{lp} = qm + \gamma(1 - \delta^q), \tag{29}$$

where $m$ is such that $\psi(1) = 0$. This situation corresponds to the model proposed by She and Lévêque in the field of turbulence [38].

• Log-Poisson compound MRM: When the canonical measure $\nu(dx)$ is such that $\int \nu(dx)x^{-2} = C < +\infty$ (e.g., if ν is concentrated away from the origin) it is easy to see that $F(dx) = \nu(dx)x^{-2}/C$ is a probability measure. In that case,

$$\varphi(p) = imp + C \int (e^{ipx} - 1)F(dx)$$

is exactly the cumulant generating function of a Poisson process with scale $C$, compounded with the distribution $F$ [14]. Let us now consider a random variable $W$ such that $\ln W$ is distributed according to $F(dx)$. In this example, if $q_c = \max_q \{E(W^q) < +\infty\} > 1$, the log-Poisson compound MRM has the following multifractal spectrum:

$$\zeta_q^{lpc} = qm - C\left(E(W^q) - 1\right). \tag{30}$$
(30)


The so-obtained MRM corresponds exactly to Barral and Mandelbrot's "product of cylindrical pulses" in [7], except for the case when the distribution of $W$ has a Dirac mass in 0. This particular case within the MRM framework will be addressed separately in a forthcoming paper. Let us note that, in their work, Barral and Mandelbrot did not study the scaling properties of their construction. They rather focused on the pathwise regularity properties. They proved the validity of the so-called "multifractal formalism" (see e.g. [31, 2, 32, 15, 18, 19]) that relates the function $\zeta_q$ to the singularity spectrum $D(h)$ associated with (almost) all realisations of the process by a Legendre transformation.

• Log-α stable MRM: Let us consider the case when $\nu(dx)$ corresponds to a left-sided α-stable density:

$$\nu(dx) = \begin{cases} C|x|^{1-\alpha}\,dx & \text{if } x \le 0, \\ 0 & \text{if } x > 0, \end{cases}$$

where $C > 0$ and $0 < \alpha < 2$. In such a case, a direct computation shows that $q_c = +\infty$ and

$$\zeta_q^{ls} = qm - \sigma^{\alpha}|q|^{\alpha}. \tag{31}$$

Such distributions have been used in the context of turbulence and geophysics [37].

• Log-Gamma MRM: The family of Gamma distributions corresponds to Lévy measures of the form $\nu(dx) = C\gamma^2 x e^{-\gamma x}dx$ for $x \ge 0$. It is immediate to see that $q_c = \gamma$ for this class of measures and thus one must have $\gamma > 1$ in order to construct an MRM. A direct computation leads to $\psi(q) = C\gamma^2 \ln\left(\frac{\gamma}{\gamma - q}\right)$ and therefore, for $q < \gamma$,

$$\zeta_q^{lg} = qm - C\gamma^2 \ln\left(\frac{\gamma}{\gamma - q}\right). \tag{32}$$

Many other families of $\zeta_q$ spectra can be obtained for other choices of the Lévy measure.

9. Conclusion

In this paper we have proposed a new construction of stationary stochastic measures and random processes with stationary increments. We proved they have exact multifractal scaling in the sense of (2) (in which case they satisfy the cascading rule (6)) or asymptotic multifractal scaling properties in the sense of (3). Apart from their multifractal properties, we have also studied non-degeneracy and conditions for finite moments. These processes can be used in many applications. Actually, in a previous work, we have already shown [33] that the log-normal MRW is a very good candidate for modeling financial data. However, multifractal scaling is observed in many other fields ranging from turbulence to network traffic or biomedical engineering. Some of these possible applications of MRM and MRW are discussed in [34]. Many open mathematical problems remain related to these processes. Some of them are discussed in [34], notably the questions related to the construction of stochastic integrals using fractional Brownian motions instead of regular Brownian motions as in Sect. 6. Another interesting problem concerns the study of limit probability distributions associated with MRM for which very few features are known. Finally, it should be interesting to generalize the results of [7] in order to link scaling properties and pathwise regularity within a multifractal formalism.
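The closed-form spectra (28), (29) and (32) are easy to evaluate numerically. The following Python sketch is only an illustration: the function names are hypothetical, and the constant $m$ in (29) and (32) is fixed by assuming the normalisation $\zeta_1 = 1$, which is how the condition $\psi(1) = 0$ is read here.

```python
import math


def zeta_lognormal(q, lam2):
    """Log-normal spectrum (28): q (1 + lam2/2) - (lam2/2) q^2."""
    return q * (1.0 + lam2 / 2.0) - (lam2 / 2.0) * q * q


def zeta_logpoisson(q, gamma, delta):
    """Log-Poisson spectrum (29); m is chosen so that zeta_1 = 1 (assumption)."""
    m = 1.0 - gamma * (1.0 - delta)
    return q * m + gamma * (1.0 - delta ** q)


def zeta_loggamma(q, c, gamma):
    """Log-Gamma spectrum (32) for q < gamma; m is chosen so that zeta_1 = 1 (assumption)."""
    if not (gamma > 1.0 and q < gamma):
        raise ValueError("need gamma > 1 and q < gamma")
    m = 1.0 + c * gamma ** 2 * math.log(gamma / (gamma - 1.0))
    return q * m - c * gamma ** 2 * math.log(gamma / (gamma - q))
```

With these conventions each spectrum vanishes at $q = 0$ and equals 1 at $q = 1$, and the log-normal spectrum lies below the line $\zeta_q = q$ for $q > 1$, as expected for a concave function.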


Appendices

A. Proof of Lemma 1 (Characteristic Function of $\omega_l(t)$)

We are going to compute $Q_l(t_q, p_q)$ using a recurrence on $q$. If we group in the sum $\sum_{k=1}^{q} p_k P(A_l(t_k))$ the points which are not in the set $A_l(t_q)$, we get the random variable

$$Y_q = \sum_{k=1}^{q} p_k P\big(A_l(t_k)\backslash A_l(t_q)\big).$$

Moreover, the points that are in the set $A_l(t_q)$ can be grouped using the disjoint sets $B_k = A_l(t_k, t_q)\backslash A_l(t_{k-1}, t_q)$ (i.e., points belonging to $A_l(t_k)$ and not to $A_l(t_{k-1})$). Thus, if we define $X_{k,q} = P\big(A_l(t_k, t_q)\backslash A_l(t_{k-1}, t_q)\big)$ (where we used the notation $A(t_0, t_m) = A(t_m, t_0) = \emptyset$), then one has

$$\sum_{k=1}^{q} p_k P(A_l(t_k)) = Y_q + \sum_{k=1}^{q} r_{k,q} X_{k,q},$$

where the numbers $r_{k,j}$ are defined by

$$r_{k,j} = \sum_{m=k}^{j} p_m.$$

Moreover, since all the $\{X_{k,q}\}_k$ and $Y_q$ are independent random variables, one gets

$$Q_l(t_q, p_q) = E\big(e^{iY_q}\big)\prod_{k=1}^{q} E\big(e^{i r_{k,q} X_{k,q}}\big). \tag{33}$$

The same type of arguments can be used to prove that

$$\sum_{k=1}^{q-1} p_k P(A_l(t_k)) = Y_q + \sum_{k=1}^{q} r_{k,q-1} X_{k,q},$$

and

$$Q_l(t_{q-1}, p_{q-1}) = E\big(e^{iY_q}\big)\prod_{k=1}^{q} E\big(e^{i r_{k,q-1} X_{k,q}}\big), \tag{34}$$

where we used the convention $r_{k,j} = 0$ if $j < k$. Merging (33) and (34) leads to

$$Q_l(t_q, p_q) = Q_l(t_{q-1}, p_{q-1}) \prod_{k=1}^{q} \frac{E\big(e^{i r_{k,q} X_{k,q}}\big)}{E\big(e^{i r_{k,q-1} X_{k,q}}\big)}. \tag{35}$$

Using (8), one gets

$$E\big(e^{ipX_{k,q}}\big) = E\big(e^{ipP(A_l(t_k,t_q)\backslash A_l(t_{k-1},t_q))}\big) = e^{\varphi(p)\,\mu(A_l(t_k,t_q)\backslash A_l(t_{k-1},t_q))}.$$


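However, since $t_{k-1} \le t_k \le t_q$, one has $A_l(t_{k-1}, t_q) \subset A_l(t_k, t_q)$, therefore

$$E\big(e^{ipX_{k,q}}\big) = e^{\varphi(p)\left(\mu(A_l(t_k,t_q)) - \mu(A_l(t_{k-1},t_q))\right)}.$$

By inserting this last expression in (35), it follows

$$Q_l(t_q, p_q) = Q_l(t_{q-1}, p_{q-1}) \prod_{k=1}^{q} \frac{e^{\varphi(r_{k,q})\left(\mu(A_l(t_k,t_q)) - \mu(A_l(t_{k-1},t_q))\right)}}{e^{\varphi(r_{k,q-1})\left(\mu(A_l(t_k,t_q)) - \mu(A_l(t_{k-1},t_q))\right)}} = Q_l(t_{q-1}, p_{q-1}) \prod_{k=1}^{q} e^{\left(\varphi(r_{k,q}) - \varphi(r_{k,q-1})\right)\left(\mu(A_l(t_k,t_q)) - \mu(A_l(t_{k-1},t_q))\right)}.$$

By iterating this last expression, one gets

$$\begin{aligned}
\ln Q_l(t_q, p_q) &= \sum_{j=1}^{q}\sum_{k=1}^{j} \big(\varphi(r_{k,j}) - \varphi(r_{k,j-1})\big)\big(\mu(A_l(t_k,t_j)) - \mu(A_l(t_{k-1},t_j))\big) \\
&= \sum_{j=1}^{q}\sum_{k=1}^{j} \big(\varphi(r_{k,j}) - \varphi(r_{k,j-1})\big)\mu(A_l(t_k,t_j)) - \sum_{j=1}^{q}\sum_{k=1}^{j} \big(\varphi(r_{k,j}) - \varphi(r_{k,j-1})\big)\mu(A_l(t_{k-1},t_j)) \\
&= \sum_{j=1}^{q}\sum_{k=1}^{j} \big(\varphi(r_{k,j}) - \varphi(r_{k,j-1})\big)\mu(A_l(t_k,t_j)) - \sum_{j=1}^{q}\sum_{k=0}^{j-1} \big(\varphi(r_{k+1,j}) - \varphi(r_{k+1,j-1})\big)\mu(A_l(t_k,t_j)) \\
&= \sum_{j=2}^{q}\sum_{k=1}^{j-1} \big(\varphi(r_{k,j}) + \varphi(r_{k+1,j-1}) - \varphi(r_{k,j-1}) - \varphi(r_{k+1,j})\big)\mu(A_l(t_k,t_j)) \\
&\quad + \sum_{j=1}^{q} \varphi(p_j)\,\mu(A_l(t_j)) - \sum_{j=1}^{q} \big(\varphi(r_{1,j}) - \varphi(r_{1,j-1})\big)\mu(A_l(t_0,t_j)).
\end{aligned}$$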
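The index shift that turns the first double sum of this chain into the last expression can be checked numerically. The Python sketch below is only a sanity check under stated assumptions: `phi` is an arbitrary hypothetical test function with `phi(0) = 0`, and `mu[k][j]` stands in for $\mu(A_l(t_k, t_j))$ with arbitrary values (row `k = 0` plays the role of $\mu(A_l(t_0, t_j))$).

```python
import math
import random


def r(p, k, j):
    """r_{k,j} = p_k + ... + p_j (1-based indices), with r_{k,j} = 0 when j < k."""
    return sum(p[k - 1:j]) if j >= k else 0.0


def phi(x):
    """Hypothetical test function; the only property used is phi(0) = 0."""
    return x * x + math.sin(x)


def first_form(p, mu):
    """Double sum over (phi(r_kj) - phi(r_k,j-1)) (mu_kj - mu_k-1,j)."""
    q = len(p)
    total = 0.0
    for j in range(1, q + 1):
        for k in range(1, j + 1):
            coeff = phi(r(p, k, j)) - phi(r(p, k, j - 1))
            total += coeff * (mu[k][j] - mu[k - 1][j])
    return total


def last_form(p, mu):
    """Rearranged expression: off-diagonal terms, diagonal terms, boundary terms."""
    q = len(p)
    total = 0.0
    for j in range(2, q + 1):
        for k in range(1, j):
            coeff = (phi(r(p, k, j)) + phi(r(p, k + 1, j - 1))
                     - phi(r(p, k, j - 1)) - phi(r(p, k + 1, j)))
            total += coeff * mu[k][j]
    for j in range(1, q + 1):
        total += phi(p[j - 1]) * mu[j][j]
        total -= (phi(r(p, 1, j)) - phi(r(p, 1, j - 1))) * mu[0][j]
    return total


random.seed(0)
q = 5
p = [random.uniform(-1.0, 1.0) for _ in range(q)]
mu = [[random.uniform(0.0, 1.0) for _ in range(q + 1)] for _ in range(q + 1)]
difference = abs(first_form(p, mu) - last_form(p, mu))
```

The two forms agree up to rounding for any choice of `p` and `mu`, which confirms that the rearrangement is purely algebraic; the convention $\mu(A_l(t_0, t_j)) = 0$ is only needed afterwards to drop the boundary terms.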

Since (i) $\varphi(0) = 0$ and (ii) by convention $\mu(A_l(t_0, t_j)) = 0$, one finally gets Lemma 1.

B. Controlling the Moments of $\sup_{u\in[0,t]} e^{\omega_l(u)}$

In this appendix we show the following lemma.

Lemma 5. If $q$ is such that $\psi(q) \neq +\infty$ (i.e., $\zeta_q \neq -\infty$) then

$$E\Big(\sup_{u\in[0,t]} e^{q\omega_l(u)}\Big) < +\infty.$$


Proof. By definition, one has $\omega_l(u) = P(A_l(u))$. We first consider $t$ small enough such that $\cap_{u\in[0,t]} A_l(u) = A_l^{(i)} \neq \emptyset$. Thus, for any $u \in [0,t]$, one can decompose $A_l(u)$ into the three disjoint sets

$$A_l(u) = A_l^{(i)} \cup A_l^{(l)}(u) \cup A_l^{(r)}(u),$$

where $A_l^{(l)}(u)$ (resp. $A_l^{(r)}(u)$) corresponds to the part of $A_l(u)$ which is on the left (resp. right) of $A_l^{(i)}$. Thus

$$E\Big(\sup_{u\in[0,t]} e^{q\omega_l(u)}\Big) = E\Big(e^{qP(A_l^{(i)})}\Big)\, E\Big(\sup_{u\in[0,t]} e^{qP(A_l^{(l)}(u))}\Big)\, E\Big(\sup_{u\in[0,t]} e^{qP(A_l^{(r)}(u))}\Big).$$

Since $P(A_l^{(r)}(u))$ is a martingale, using Doob's $L^p$ inequality for submartingales suprema [12], we get

$$E\Big(\sup_{u\in[0,t]} e^{qP(A_l^{(r)}(u))}\Big) \le D_q\, E\Big(e^{qP(A_l^{(r)}(t))}\Big),$$

where $D_q$ is a constant which depends only on $q$. In the same way $P(A_l^{(l)}(t-u))$ is a martingale, thus

$$E\Big(\sup_{u\in[0,t]} e^{qP(A_l^{(l)}(u))}\Big) = E\Big(\sup_{u\in[0,t]} e^{qP(A_l^{(l)}(t-u))}\Big) \le D_q\, E\Big(e^{qP(A_l^{(l)}(0))}\Big).$$

This proves the lemma. In the case where $t$ is large enough such that $\cap_{u\in[0,t]} A_l(u) = A_l^{(i)} = \emptyset$, we split the interval into $n$ equal intervals and we get

$$E\Big(\sup_{u\in[0,t]} e^{q\omega_l(u)}\Big) \le n\, E\Big(\sup_{u\in[0,t/n]} e^{q\omega_l(u)}\Big),$$

and by choosing $n$ large enough we can use the same arguments as before.

C. Martingale Properties of MRM

Let us show the following lemma:

Lemma 6. Let $q > 1$ be such that $\psi(q) \neq +\infty$ (i.e., $\zeta_q \neq -\infty$). For all fixed values of $t$, the sequence $M_l([0,t])^q$ is a positive submartingale, i.e.,

$$\forall\, l' < l: \quad E\big(M_{l'}([0,t])^q \mid \mathcal{F}_l\big) \ge M_l([0,t])^q.$$

Consequently, at fixed $t$, the sequence $E\big(M_l([0,t])^q\big)$ increases when $l$ decreases.

Proof. Let $l' < l$, then since $\psi(1) = 0$,

$$E\big(M_{l'}([0,t]) \mid \mathcal{F}_l\big) = E\Big(\int_0^t e^{\omega_{l'}(u)}\,du \,\Big|\, \mathcal{F}_l\Big) = M_l([0,t]).$$

$M_l([0,t])$ is therefore a positive martingale. If $q > 1$, the submartingale property directly results from Jensen's inequality.
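The Jensen step can be written out in one line. The display below is a sketch that assumes the $q$-th moments are finite (which is what the hypothesis $\psi(q) \neq +\infty$ provides) and writes $l' < l$ for the finer scale; it uses only the martingale identity established in the proof.

```latex
% Conditional Jensen inequality for the convex map x -> x^q (q > 1), with l' < l:
\[
E\big( M_{l'}([0,t])^q \,\big|\, \mathcal{F}_l \big)
\;\ge\; \Big( E\big( M_{l'}([0,t]) \,\big|\, \mathcal{F}_l \big) \Big)^{q}
\;=\; M_l([0,t])^q .
\]
% Taking expectations on both sides gives E(M_{l'}([0,t])^q) >= E(M_l([0,t])^q).
```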


D. Proof of Lemma 3 (Moments of Positive Orders of $M^{(e)}(dt)$)

For the proof of this lemma, we proceed along the same line as in Refs. [7, 21].

Proof for (ii). First, let us note that, since $M^{(e)} \neq 0$ and $E\big(M^{(e)}([0,t])^q\big) < +\infty$ then (using Theorem 4 which is a direct consequence of Lemma 2) one easily shows that $\psi(q) \neq +\infty$, i.e., $\zeta_q \neq -\infty$. Using the superadditivity of $x^q$ for $q \ge 1$, one gets

$$E\big(M^{(e)}([0,t])^q\big) = E\Big(\big(M^{(e)}([0,t/2]) + M^{(e)}([t/2,t])\big)^q\Big) \ge E\big(M^{(e)}([0,t/2])^q\big) + E\big(M^{(e)}([t/2,t])^q\big) \ge 2E\big(M^{(e)}([0,t/2])^q\big).$$

Since $M^{(e)} \neq 0$ and $\zeta_q \neq -\infty$, using (19), we have

$$E\big(M^{(e)}([0,t])^q\big) \ge 2^{1-\zeta_q} E\big(M^{(e)}([0,t])^q\big),$$

and consequently $\zeta_q \ge 1$.

Proof for (i). Since $\zeta_q > 1$, $\psi(q) \neq +\infty$ and thus (according to Lemma 5), one gets

$$E\big(M_l^{(e)}([0,t])^q\big) = E\Big(\Big(\int_0^t e^{\omega_l(u)}\,du\Big)^q\Big) \le E\Big(\sup_{u\in[0,t]} e^{q\omega_l(u)}\Big)\, t^q < +\infty. \tag{36}$$

Let $m \in \mathbb{N}$. Let us decompose $M_l^{(e)}$ as:

$$M_l^{(e)}([0,T]) = M_l^{(0)}(T) + M_l^{(1)}(T),$$

where

$$M_l^{(0)}(T) = \sum_{k=0}^{2^{m-1}-1} d_{2k} \quad\text{and}\quad M_l^{(1)}(T) = \sum_{k=0}^{2^{m-1}-1} d_{2k+1}, \tag{37}$$

with $d_k = M_l^{(e)}\big([kT2^{-m}, (k+1)T2^{-m}]\big)$.

Let us note that $M_l^{(0)}(T)$ and $M_l^{(1)}(T)$ are identically distributed random variables. Since $q \ge 1$, from the Minkowski inequality, one gets

$$E\big(M_l^{(e)}([0,T])^q\big) \le \Big(E\big(M_l^{(0)}(T)^q\big)^{1/q} + E\big(M_l^{(1)}(T)^q\big)^{1/q}\Big)^q = 2^q E\big(M_l^{(0)}(T)^q\big). \tag{38}$$

Let $n \in \mathbb{N}^*$ be such that $n - 1 < q \le n$. Thus using (37) and (38), we obtain:

$$E\big(M_l^{(e)}([0,T])^q\big) \le 2^q E\Bigg(\Big(\sum_{k=0}^{2^{m-1}-1} d_{2k}\Big)^q\Bigg).$$

Thanks to the sub-additivity of the function $x^h$ (for $h = q/n \le 1$), one gets

$$E\big(M_l^{(e)}([0,T])^q\big) \le 2^q E\Bigg(\Big(\sum_{k=0}^{2^{m-1}-1} d_{2k}^{q/n}\Big)^n\Bigg). \tag{39}$$

If we expand the last expression, the diagonal term gives simply $2^q 2^{m-1} E\big(d_0^q\big)$. The non-diagonal terms are of the form $C_{q,m}\, E\big(d_{2k_1}^{qs_1/n} \cdots d_{2k_r}^{qs_r/n}\big)$, where $\sum_{i=1}^{r} s_i = n$ and $1 \le s_i < n$. Moreover, each $d_{2k}$ can be written as:

$$d_{2k} = \int_{2kT2^{-m}}^{(2k+1)T2^{-m}} e^{\omega_l(u)}\,du = \int_{2kT2^{-m}}^{(2k+1)T2^{-m}} e^{\omega_{T2^{-m}}(u)}\, e^{\omega_{l,T2^{-m}}(u)}\,du,$$

where $\omega_{l,L}(u) = \omega_l(u) - \omega_L(u)$. Thus

$$d_{2k} \ge \inf_{v\in[0,T]}\big(e^{\omega_{T2^{-m}}(v)}\big) \int_{2kT2^{-m}}^{(2k+1)T2^{-m}} e^{\omega_{l,T2^{-m}}(u)}\,du, \tag{40}$$

and

$$d_{2k} \le \sup_{v\in[0,T]}\big(e^{\omega_{T2^{-m}}(v)}\big) \int_{2kT2^{-m}}^{(2k+1)T2^{-m}} e^{\omega_{l,T2^{-m}}(u)}\,du.$$

This last expression can be seen as the product of 2 terms: the sup term and the integral term. They correspond to independent random variables. Moreover, since the sup term does not depend on $k$ and for two different values of $k$ the integral terms are independent random variables, one finally gets for each non-diagonal term

$$\begin{aligned}
E\Big(\prod_{i=1}^{r} d_{2k_i}^{qs_i/n}\Big) &\le E\Big(\sup_{v\in[0,T]} e^{q\omega_{T2^{-m}}(v)}\Big) \prod_{i=1}^{r} E\Bigg(\Big(\int_{2k_iT2^{-m}}^{(2k_i+1)T2^{-m}} du\, e^{\omega_{l,T2^{-m}}(u)}\Big)^{qs_i/n}\Bigg) \\
&\le E\Big(\sup_{v\in[0,T]} e^{q\omega_{T2^{-m}}(v)}\Big) \prod_{i=1}^{r} E\Bigg(\Big(\int_{0}^{T2^{-m}} du\, e^{\omega_{l,T2^{-m}}(u)}\Big)^{qs_i/n}\Bigg).
\end{aligned}$$


From (36), one knows that the sup term is bounded by a finite positive constant $D$ which depends only on $q$, $m$ and $T$ (and not on $l$). Moreover, since $s_i < n$, one has $qs_i/n \le n-1$ and, using the Hölder inequality, the non-diagonal term can be bounded as:

$$E\Big(\prod_{i=1}^{r} d_{2k_i}^{qs_i/n}\Big) \le D\, E\Bigg(\Big(\int_0^{T2^{-m}} du\, e^{\omega_{l,T2^{-m}}(u)}\Big)^{n-1}\Bigg)^{\frac{q\sum_{i=1}^{r} s_i}{n(n-1)}} = D\, E\Bigg(\Big(\int_0^{T2^{-m}} du\, e^{\omega_{l,T2^{-m}}(u)}\Big)^{n-1}\Bigg)^{\frac{q}{n-1}}.$$

On the other hand, from (40), one gets

$$E\big(d_0^{n-1}\big) \ge E\Big(\inf_{v\in[0,T]} e^{(n-1)\omega_{T2^{-m}}(v)}\Big)\, E\Bigg(\Big(\int_0^{T2^{-m}} e^{\omega_{l,T2^{-m}}(u)}\,du\Big)^{n-1}\Bigg).$$

Let us note that $E\big(\inf_{v\in[0,T]} e^{(n-1)\omega_{T2^{-m}}(v)}\big) \neq 0$. Indeed, otherwise it would mean that a.s. there exists a sequence $\{v_p\}_p$ in $[0,T]$ such that $\lim_{p\to\infty} \omega_{T2^{-m}}(v_p) = -\infty$. This is impossible because $\omega_{T2^{-m}}(v)$ is cadlag. Therefore, there exists a finite constant $E$ which does not depend on $l$ such that

$$E\Big(\prod_{i=1}^{r} d_{2k_i}^{qs_i/n}\Big) \le E\; E\big(d_0^{n-1}\big)^{\frac{q}{n-1}}.$$

Going back to (39), we finally proved that there exists a constant $C_{m,q}(T)$ (which does not depend on $l$) such that

$$E\big(M_l^{(e)}([0,T])^q\big) \le 2^q 2^{m-1} E\big(d_0^q\big) + C_{m,q}(T)\, E\big(d_0^{n-1}\big)^{\frac{q}{n-1}}.$$

Using the self-similarity of $M_l^{(e)}$ (Lemma 2), one gets for any $p$, $0 < p \le q$ (and thus $\zeta_p \neq -\infty$)

$$E\big(d_0^p\big) = E\big(M_l^{(e)}([0,T2^{-m}])^p\big) = 2^{-m\zeta_p} E\big(M_{2^m l}^{(e)}([0,T])^p\big).$$

Thus

$$E\big(M_l^{(e)}([0,T])^q\big) \le 2^{q-1+m(1-\zeta_q)} E\big(M_{2^m l}^{(e)}([0,T])^q\big) + C_{m,q}(T)\, 2^{-mq\frac{\zeta_{n-1}}{n-1}} E\big(M_{2^m l}^{(e)}([0,T])^{n-1}\big)^{\frac{q}{n-1}}.$$

Using Lemma 6, we know that, for $k \ge 1$, $E\big(M_{2^m l}^{(e)}([0,T])^k\big) \le E\big(M_l^{(e)}([0,T])^k\big)$ and therefore $2^m l$ can be replaced by $l$ in the r.h.s. of the previous inequality. Since $\zeta_q > 1$, one can choose $m$ such that $q - 1 + m(1 - \zeta_q) < 0$; thus there exists a finite positive constant $D_{m,q}(T)$ such that

$$E\big(M_l^{(e)}([0,T])^q\big) \le D_{m,q}(T)\, 2^{-mq\frac{\zeta_{n-1}}{n-1}} E\big(M_l^{(e)}([0,T])^{n-1}\big)^{\frac{q}{n-1}}. \tag{41}$$
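The choice of $m$ in the step leading to (41) is elementary and can be made explicit. The Python sketch below (function names are hypothetical) returns the smallest admissible $m$ and the resulting prefactor $a = 2^{q-1+m(1-\zeta_q)} < 1$. Since $E(M_l^{(e)}([0,T])^q)$ is finite by (36), moving the term $a\,E(M_l^{(e)}([0,T])^q)$ to the left-hand side shows that one possible choice, under this reading, is $D_{m,q}(T) = C_{m,q}(T)/(1-a)$.

```python
def smallest_m(q, zeta_q):
    """Smallest integer m >= 0 with q - 1 + m * (1 - zeta_q) < 0; requires zeta_q > 1."""
    if zeta_q <= 1.0:
        raise ValueError("zeta_q must exceed 1")
    m = 0
    while q - 1.0 + m * (1.0 - zeta_q) >= 0.0:
        m += 1
    return m


def contraction_factor(q, zeta_q, m):
    """Prefactor 2^(q - 1 + m (1 - zeta_q)) multiplying E(M_l^q) on the right-hand side."""
    return 2.0 ** (q - 1.0 + m * (1.0 - zeta_q))


# Illustrative values only: log-normal spectrum (28) with lambda^2 = 0.02 at q = 1.5.
q_example = 1.5
zeta_example = q_example * (1.0 + 0.01) - 0.01 * q_example ** 2
m_example = smallest_m(q_example, zeta_example)
a_example = contraction_factor(q_example, zeta_example, m_example)
```

For these illustrative values the exponent is still slightly positive at $m = 1$ and becomes negative at $m = 2$, so the prefactor drops below 1 only from $m = 2$ on.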


Thus if $\sup_l E\big(M_l^{(e)}([0,T])^{n-1}\big) < +\infty$, then $\sup_l E\big(M_l^{(e)}([0,T])^q\big) < +\infty$ and, since $M_l^{(e)}([0,T])^q$ is a positive submartingale (see Lemma 6), it converges and $E\big(M^{(e)}([0,T])^q\big) < +\infty$.

We are now ready to prove (i) by induction on $n$. Let $q$ be such that $1 < q \le 2$. In that case $n = 2$ and $E\big(M_l^{(e)}([0,T])^{n-1}\big) = T$. From the last assertion one deduces that $\sup_l E\big(M_l^{(e)}([0,T])^q\big) < +\infty$ and $E\big(M^{(e)}([0,T])^q\big) < +\infty$. Thus it proves (i) for $q$ such that $1 < q \le 2$. On the other hand, it also proves that, if $\zeta_2 > 1$ then $\sup_l E\big(M_l^{(e)}([0,T])^2\big) < +\infty$.

Let us now suppose $2 < q \le 3$, i.e., $n = 3$. Since $\zeta_p$ is a concave function and $\zeta_1 = 1$ and $\zeta_q > 1$, one gets that $\zeta_p > 1$ for all $1 < p \le q$, and in particular $\zeta_2 > 1$ and consequently $\sup_l E\big(M_l^{(e)}([0,T])^2\big) < +\infty$. Then again (41) gives that $\sup_l E\big(M_l^{(e)}([0,T])^q\big) < +\infty$ and (submartingale argument) $E\big(M^{(e)}([0,T])^q\big) < +\infty$, which proves (i) for $q$ such that $2 < q \le 3$. By induction, applying the same arguments each time, one proves (i).

E. Proof of Lemma 4 (Extension of the Results on $M^{(e)}(dt)$ to $M(dt)$)

We first consider the particular case where $f(l) = 0$ for $l > L$ (i.e., $g(l) = -f^{(e)}(l)$ in (12)). The so-obtained MRM will be referred to as $M^{(L)}$.

Proofs for $M = M^{(L)}$. Let $A_l^{(e)}(t)$ be the domain in $S^+$ associated with $M^{(e)}(dt)$, i.e.,

$$A_l^{(e)}(t) = \{(t', l'),\; l' \ge l,\; -f^{(e)}(l')/2 < t' - t \le f^{(e)}(l')/2\}.$$

Let $A_l^{(L)}(t)$ be the domain in $S^+$ associated with $M^{(L)}(dt)$, i.e.,

$$A_l^{(L)}(t) = \{(t', l'),\; L \ge l' \ge l,\; -f^{(e)}(l')/2 < t' - t \le f^{(e)}(l')/2\}.$$

It is clear that, if

$$\Delta_L(t) = \{(t', l'),\; l' \ge L,\; -f^{(e)}(l')/2 < t' - t \le f^{(e)}(l')/2\},$$

then $A_l^{(e)}(t) = A_l^{(L)}(t) \cup \Delta_L(t)$ with $A_l^{(L)}(t) \cap \Delta_L(t) = \emptyset$. Thus if $\omega_l^{(e)}(t) = P(A_l^{(e)}(t))$, $\omega_l^{(L)}(t) = P(A_l^{(L)}(t))$ and $\delta_L(t) = P(\Delta_L(t))$, one has

$$\omega_l^{(e)}(t) = \omega_l^{(L)}(t) + \delta_L(t),$$

where $\omega_l^{(L)}(t)$ and $\delta_L(t)$ are independent processes. Thus

$$M_l^{(e)}([0,t]) = \int_0^t e^{\omega_l^{(e)}(u)}\,du = \int_0^t e^{\omega_l^{(L)}(u)}\, e^{\delta_L(u)}\,du.$$

Therefore,

$$M_l^{(L)}([0,t]) \inf_{v\in[0,t]} e^{\delta_L(v)} \le M_l^{(e)}([0,t]) \le M_l^{(L)}([0,t]) \sup_{v\in[0,t]} e^{\delta_L(v)}, \tag{42}$$


and, taking the limit $l \to 0^+$,

$$M^{(L)}([0,t]) \inf_{v\in[0,t]} e^{\delta_L(v)} \le M^{(e)}([0,t]) \le M^{(L)}([0,t]) \sup_{v\in[0,t]} e^{\delta_L(v)}. \tag{43}$$

Since $\delta_L(t)$ is a.s. right continuous and left-hand limited, one has

$$\lim_{t\to 0^+} \inf_{v\in[0,t]} e^{\delta_L(v)} = \lim_{t\to 0^+} \sup_{v\in[0,t]} e^{\delta_L(v)} = e^{\delta_L(0)}. \tag{44}$$

Moreover, since $M_l^{(L)}([0,t])$ and $\delta_L(v)$ are independent processes, one gets from (43),

$$E\big(M^{(L)}([0,t])^q\big)\, E\Big(\inf_{v\in[0,t]} e^{q\delta_L(v)}\Big) \le E\big(M^{(e)}([0,t])^q\big) \le E\big(M^{(L)}([0,t])^q\big)\, E\Big(\sup_{v\in[0,t]} e^{q\delta_L(v)}\Big).$$

Let us note that, since $\delta_L(v)$ is cadlag, one has $E\big(\inf_{v\in[0,t]} e^{q\delta_L(v)}\big) \neq 0$. Then

(i) is an immediate consequence of (43),

(ii) with $X = e^{\delta_L(0)}$ is an immediate consequence of (43),

(iii) if $\zeta_q \neq -\infty$ (i.e., $\psi(q) \neq +\infty$) then $E\big(e^{q\delta_L(t)}\big) < +\infty$ and, using the same argument as in Lemma 5, one gets that $E\big(\sup_{v\in[0,t]} e^{q\delta_L(v)}\big) < +\infty$ and consequently (iii). The case $\zeta_q = -\infty$ is a little trickier. Let us first note that, for $t$ small enough such that $\cap_{u\in[0,t]} A_l^{(L)}(u) \neq \emptyset$, one has

$$E\big(M_l^{(L)}([0,t])^q\big) = E\Big(\Big(\int_0^t e^{\omega_l^{(L)}(u)}\,du\Big)^q\Big) \ge C_t\, E\Big(e^{qP(\cap_{u\in[0,t]} A_l^{(L)}(u))}\Big) = +\infty. \tag{45}$$

On the other hand, we know (Lemma 3 (ii)) that $E\big(M^{(e)}([0,t])^q\big) = +\infty$, so, in order to prove (iii) we would need to prove that $E\big(M^{(L)}([0,t])^q\big) = +\infty$. Since $\zeta_{1+\epsilon} > 1$ we know (Lemma 3 (i)) that $\sup_l E\big(M_l^{(e)}([0,t])^{1+\epsilon}\big) < +\infty$. Thus, using (42), one gets $\sup_l E\big(M_l^{(L)}([0,t])^{1+\epsilon}\big) < +\infty$ and consequently $M_l^{(L)}$ is uniformly (in $l$) integrable and $E\big(M^{(L)}([0,t]) \mid \mathcal{F}_l\big) = M_l^{(L)}([0,t])$. Thus, if we suppose $E\big(M^{(L)}([0,t])^q\big) < +\infty$, one has

$$E\big(M^{(L)}([0,t])^q \mid \mathcal{F}_l\big) \ge E\big(M^{(L)}([0,t]) \mid \mathcal{F}_l\big)^q = M_l^{(L)}([0,t])^q.$$

Consequently $E\big(M_l^{(L)}([0,t])^q\big) \le E\big(M^{(L)}([0,t])^q\big) < +\infty$, which contradicts (45). That completes the proof of (iii) in the case $\zeta_q = -\infty$.

(iv) This is a direct consequence of (iii) along with (45) and (44).

Proofs for any $M$. So we just proved the theorem for $M = M^{(L)}$. Let now $g(l)$ be any function (satisfying the hypothesis (12)). Using exactly the same arguments as above (in which $M^{(e)}(dt)$ is


replaced by $M(dt)$), and using that (i) and (ii) hold for $M^{(L)}$, one gets (i) and (ii) for $M$. Moreover one gets

$$E\big(M^{(L)}([0,t])^q\big)\, E\Big(\inf_{v\in[0,t]} e^{q\delta_L(v)}\Big) \le E\big(M([0,t])^q\big) \le E\big(M^{(L)}([0,t])^q\big)\, E\Big(\sup_{v\in[0,t]} e^{q\delta_L(v)}\Big), \tag{46}$$

where $\delta_L$ is independent of $M^{(L)}$. Using that (iii) holds for $M^{(L)}$, (46) implies that (iii) holds also for $M$. And (iv) is a direct consequence of (46) and the fact that $\lim_{t\to 0^+} \inf_{v\in[0,t]} e^{\delta_L(v)} = \lim_{t\to 0^+} \sup_{v\in[0,t]} e^{\delta_L(v)} = e^{\delta_L(0)}$.

F. Proof of Theorem 6 (Link Between $\tilde{M}$ and $M$)

One has

$$E\big(|M_{l_n}([0,t[) - \tilde{M}_{l_n}([0,t[)|^2\big) = E\big(M_{l_n}([0,t[)^2\big) + E\big(\tilde{M}_{l_n}([0,t[)^2\big) - 2E\big(M_{l_n}([0,t[)\,\tilde{M}_{l_n}([0,t[)\big).$$

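We are going to compute the limit of each term separately. Since

$$E\big(M_{l_n}([0,t[)^2\big) = E\Big(\int_0^t\!\!\int_0^t e^{\omega_{l_n}(u)+\omega_{l_n}(v)}\,du\,dv\Big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho_{l_n}(u-v)}\,du\,dv,$$

one gets

$$\lim_{n\to+\infty} E\big(M_{l_n}([0,t[)^2\big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho(u-v)}\,du\,dv = E\big(M([0,t[)^2\big) < +\infty.$$

On the other hand

$$\tilde{M}_{l_n}([0,t[) = \sum_{k=0}^{t/l_n - 1} e^{\omega_{l_n}(kl_n)}\, l_n = \int_0^t e^{\omega_{l_n}(\lfloor u/l_n\rfloor l_n)}\,du.$$

Thus

$$E\big(\tilde{M}_{l_n}([0,t[)^2\big) = E\Big(\int_0^t\!\!\int_0^t e^{\omega_{l_n}(\lfloor u/l_n\rfloor l_n)+\omega_{l_n}(\lfloor v/l_n\rfloor l_n)}\,du\,dv\Big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho_{l_n}(\lfloor u/l_n\rfloor l_n - \lfloor v/l_n\rfloor l_n)}\,du\,dv.$$

Let $\rho_l^{(M)}(t) = \rho_l(|t| - l)$ for $|t| \ge l$ and $\rho_l^{(M)}(t) = \rho_l(0)$ for $|t| \le l$. Then, since $\rho_l(u)$ is a positive symmetric decreasing function, one has $\forall u, v \ge 0$, $\rho_l(\lfloor u/l\rfloor l - \lfloor v/l\rfloor l) \le \rho_l^{(M)}(u - v)$. Moreover, since for any fixed $u$, one has $\lim_{l\to 0^+} \rho_l^{(M)}(u) = \rho(u)$, using the dominated convergence theorem, one gets

$$\lim_{n\to+\infty} E\big(\tilde{M}_{l_n}([0,t[)^2\big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho(u-v)}\,du\,dv = E\big(M([0,t[)^2\big).$$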
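The domination by $\rho_l^{(M)}$ rests on an elementary fact: rounding both arguments down to the grid of step $l$ reduces their distance by less than $l$, so a function that decreases in $|t|$ is bounded by its value at $|u - v| - l$. The Python sketch below checks this grid inequality on random inputs; the function names and the tolerance are illustrative assumptions, not part of the original proof.

```python
import math
import random


def grid(u, l):
    """Left grid point floor(u / l) * l used in the discretised measure."""
    return math.floor(u / l) * l


def gap_shrinks_by_at_most_l(u, v, l, tol=1e-9):
    """Check |grid(u) - grid(v)| >= |u - v| - l, up to floating-point rounding."""
    return abs(grid(u, l) - grid(v, l)) >= abs(u - v) - l - tol


random.seed(1)
samples = [(random.uniform(0.0, 10.0), random.uniform(0.0, 10.0), random.uniform(0.01, 1.0))
           for _ in range(1000)]
all_ok = all(gap_shrinks_by_at_most_l(u, v, l) for u, v, l in samples)
```

The inequality holds because each rounded point lies in the half-open interval of length $l$ ending at the original point, so the difference of the rounded points differs from $u - v$ by less than $l$.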


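Now we just need to compute the limit of the cross term $E\big(\tilde{M}_{l_n}([0,t[)\,M_{l_n}([0,t[)\big)$. In the exact same way as the last computation, we get

$$E\big(\tilde{M}_{l_n}([0,t[)\,M_{l_n}([0,t[)\big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho_{l_n}(\lfloor u/l_n\rfloor l_n - v)}\,du\,dv.$$

Using the fact that $\forall u, v \ge 0$, $\rho_l(\lfloor u/l\rfloor l - v) \le \rho_l^{(M)}(u - v)$, we get

$$\lim_{n\to+\infty} E\big(\tilde{M}_{l_n}([0,t[)\,M_{l_n}([0,t[)\big) = \int_0^t\!\!\int_0^t e^{\varphi(2)\rho(u-v)}\,du\,dv = E\big(M([0,t[)^2\big).$$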

G. Proof of Theorem 7

Convergence of $X_l(t)$. Let $0 \le t_1 \le t_2 \le \ldots \le t_n$. It is easy to show that the $n$-dimensional random variable $\{X_l(t_2) - X_l(t_1), \ldots, X_l(t_n) - X_l(t_{n-1})\}$ has the same distribution as $\{\sqrt{M_l([t_1,t_2])}\,w[1], \ldots, \sqrt{M_l([t_{n-1},t_n])}\,w[n-1]\}$, where $w[i]$ is a discrete Gaussian white noise of variance 1. Since $\zeta_{1+\epsilon} > 1$, according to Theorem 2 (i), we know that $M_l(dt)$ converges and is not degenerated. Thus, the finite dimensional laws of the process $X_l(t)$ converge when $l \to 0^+$. Moreover, let $q = 2(1+\epsilon')$, with $0 < \epsilon' < \epsilon$. One thus has $\zeta_{1+\epsilon'} > 1$, and

$$E\big(|X_l(t_2) - X_l(t_1)|^q\big) = E\Big(\Big|\int_{t_1}^{t_2} e^{\omega_l(u)/2}\,dW(u)\Big|^q\Big) = E\big(M_l([t_1,t_2])^{q/2}\big)\, E\big(|w|^q\big). \tag{47}$$

Let us notice that $E(|w|^q)$ is finite and does not depend on $l$ and, using Theorem 3 (ii) (along with the dominated convergence theorem), one has

$$\lim_{l\to 0^+} E\big(M_l([t_1,t_2])^{1+\epsilon'}\big) = E\big(M([t_1,t_2])^{1+\epsilon'}\big) < +\infty,$$

and there exists a constant $C$ such that

$$E\big(M([t_1,t_2])^{1+\epsilon'}\big) \le C|t_2 - t_1|^{\zeta_{1+\epsilon'}}, \quad |t_2 - t_1| \le 1.$$

Thus, there exists a constant $D > 0$ such that

$$E\big(|X_l(t_2) - X_l(t_1)|^q\big) \le D|t_2 - t_1|^{\zeta_{1+\epsilon'}}, \quad |t_2 - t_1| \le 1.$$

Since $\zeta_{1+\epsilon'} > 1$ this proves that $X_l$ is tight. Along with the convergence of the finite dimensional laws, it proves that $X_l$ converges when $l \to 0^+$.

Convergence towards $B(M([0,t]))$. Since $\{\sqrt{M_l([t_1,t_2])}\,w[1], \ldots, \sqrt{M_l([t_{n-1},t_n])}\,w[n-1]\}$ has the same distribution as $\{B(M_l([t_1,t_2])), \ldots, B(M_l([t_{n-1},t_n]))\}$, we just need to prove that $B(M_l([0,t])) \to B(M([0,t]))$. Using the same kind of arguments as above, one can show that $B(M_l([0,t]))$ is tight and that its finite dimensional distributions converge to those of $B(M([0,t]))$.

Acknowledgement. The authors would like to thank Carl Graham for interesting discussions.


References

1. Arneodo, A., Bacry, E., Muzy, J.F.: Random cascade on wavelet dyadic trees. J. Math. Phys. 39, 4142–4164 (1998)
2. Bacry, E., Muzy, J.F., Arneodo, A.: Singularity spectrum of fractal signals from wavelet analysis: Exact results. J. Stat. Phys. 70, 635–674 (1993)
3. Bacry, E., Delour, J., Muzy, J.F.: Multifractal random walks. Phys. Rev. E 64, 026103–026106 (2001)
4. Bacry, E., Delour, J., Muzy, J.F.: Modelling financial time series using Multifractal Random Walks. Physica A 299, 84–92 (2001)
5. Barral, J.: Moments, continuité et analyse multifractale des martingales de Mandelbrot. Probab. Theory Relat. Fields 113, 535–569 (1999)
6. Barral, J.: Continuity of the multifractal spectrum of a statistically self-similar measure. J. Theoretic. Probab. 13, 1027–1060 (2000)
7. Barral, J., Mandelbrot, B.B.: Multifractal products of cylindrical pulses. Probab. Theory Relat. Fields 124, 409–430 (2002)
8. Ben Nasr, F.: Mesures aleatoires de Mandelbrot associees a des substitutions. CRAS Paris, Serie I 304, 255–258 (1987)
9. Bertoin, J.: Lévy Processes. Melbourne, NY: Cambridge University Press, 1996
10. Delour, J.: Thèse de l'université de Bordeaux I, 2001
11. Durrett, R., Liggett, T.M.: Fixed points of the smoothing transformation. Z. Wahr. Verw. Gebiete 64, 275–301 (1983)
12. Doob, J.L.: Classical potential theory and its probabilistic counterpart. Reprint of the 1984 ed., Berlin: Springer, 2001
13. Fan, A.H.: Sur le chaos de Lévy stables d'indices 0 < α < 1. Ann. Sci. Math. Québec 21(1), 53–66 (1997)
14. Feller, W.: An introduction to probability theory and its applications. Vol 2. New-York: John Wiley & Sons, 1971
15. Frisch, U., Parisi, G.: Fully developed turbulence and intermittency. Proc. Int. Summer school Phys. Enrico Fermi, Amsterdam: North Holland, 1985
16. Guivarc'h, Y.: Remarques sur les solutions d'une équation fonctionnelle non linéaire de Benoit Mandelbrot. C.R. Academ. Sci. Paris 305, série I, 139–141 (1987)
17. Holley, R., Waymire, E.C.: Multifractal dimension and scaling exponents for strongly bounded cascades. Ann. Appl. Prob. 2, 819–945 (1992)
18. Jaffard, S.: Multifractal formalism for functions, Part 1: Results valid for all functions. SIAM J. Math. Anal. 28, 944–970 (1997)
19. Jaffard, S.: Multifractal formalism for functions, Part 2: Selfsimilar functions. SIAM J. Math. Anal. 28, 971–998 (1997)
20. Jaffard, S.: The multifractal nature of Lévy processes. Probab. Theory Relat. Fields 114, 207–227 (1999)
21. Kahane, J.P., Peyrière, J.: Sur certaines martingales de Benoit Mandelbrot. Adv. Math. 22, 131–145 (1976)
22. Kahane, J.P.: Sur le Chaos Multiplicatif. Ann. Sci. Math. Que. 9, 435–444 (1985)
23. Kahane, J.P.: Positive martingales and random measures. Chi. Ann. Math. 8B, 1–12 (1987)
24. Mandelbrot, B.B., Taylor, H.M.: On the distribution of stock price differences. Op. Research 15, 1057–1062 (1967)
25. Mandelbrot, B.B.: Intermittent turbulence in self-similar cascades: Divergence of high moments and dimension of the carrier. J. Fluid. Mech. 62, 331–358 (1974)
26. Mandelbrot, B.B.: Multiplications aléatoires itérées et distributions invariantes par moyennes pondérées. C. R. Acad. Sci. Paris 278, 289–292, 355–358 (1974)
27. Mandelbrot, B.B.: A possible refinement of the lognormal hypothesis concerning the distribution of energy in intermittent turbulence. In: Statistical Models and Turbulence. (La Jolla, California). Edited by M. Rosenblatt, C. Van Atta, Lecture Notes in Physics 12, New-York: Springer, 1972, pp. 333–35
28. Mandelbrot, B.B.: A Multifractal Walk Down Wall Street. Scientific American 280, 70–73 (1999)
29. Mantegna, R., Stanley, H.E.: An introduction to econophysics. Cambridge: Cambridge University Press, 2000
30. Molchan, G.M.: Scaling exponents and multifractal dimensions for independent random cascades. Commun. Math. Phys. 179, 681–702 (1996)
31. Muzy, J.F., Bacry, E., Arneodo, A.: Wavelets and multifractal formalism for singular signals: Application to turbulence data. Phys. Rev. Lett. 67, 3515–3518 (1991)
32. Muzy, J.F., Bacry, E., Arneodo, A.: The multifractal formalism revisited with wavelets. Int. J. of Bif. Chaos 4, 245–302 (1994)
33. Muzy, J.F., Delour, J., Bacry, E.: Modeling fluctuations of financial time series: From cascade process to stochastic volatility model. Eur. J. Phys. B 17, 537–548 (2000)
34. Muzy, J.F., Bacry, E.: Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling law. Phys. Rev. E 66, 056121 (2002)
35. Rajput, B., Rosinski, J.: Spectral representations of infinitely divisible processes. Probab. Theory Relat. Fields 82, 451–487 (1989)
36. Schmitt, F., Marsan, D.: Stochastic equations generating continuous multiplicative cascades. Eur. J. Phys. B 20, 3–6 (2001)
37. Schmitt, F., Lavallée, D., Schertzer, D., Lovejoy, S.: Empirical determination of universal multifractal exponents in turbulent velocity fields. Phys. Rev. Lett. 68, 305–308 (1992)
38. She, Z.S., Lévêque, E.: Universal scaling laws in fully developed turbulence. Phys. Rev. Lett. 72, 336–339 (1994)
39. Taqqu, M.S., Samorodnisky, G.: Stable non-Gaussian random processes. New York: Chapman & Hall, 1994

Communicated by A. Connes

Commun. Math. Phys. 236, 477–511 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0826-4


Instability of Interfaces in the Antiferromagnetic XXZ Chain at Zero Temperature

Nilanjana Datta1, Tom Kennedy2

1 Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge CB3 0WB, U.K. E-mail: [email protected]
2 Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]

Received: 15 August 2002 / Accepted: 8 January 2003 Published online: 14 April 2003 – © N. Datta, T. Kennedy 2003

Abstract: For the antiferromagnetic, highly anisotropic XZ and XXZ quantum spin chains, we impose periodic boundary conditions on chains with an odd number of sites to force an interface (or kink) into the chain. We prove that the energy of the interface depends on the momentum of the state. This shows that at zero temperature the interface in such chains is not stable. This is in contrast to the ferromagnetic XXZ chain for which the existence of localized interface ground states has been proven for any amount of anisotropy in the Ising-like regime.

© Copyright rests with the authors. Reproduction of the entire article for non-commercial purposes is permitted.

1. Introduction

Interfaces or domain walls in classical spin systems have been the subject of mathematical study for several decades. Dobrushin proved [12] that in the three-dimensional Ising model at low temperatures, under suitable (Dobrushin) boundary conditions, there is a stable interface orthogonal to the 001-direction. These boundary conditions hence yield a non-translation invariant Gibbs state at low temperatures. However, Gallavotti proved [14] that the two-dimensional model shows a very different behavior; thermal fluctuations destabilize the interface and the corresponding Gibbs state is translation invariant.

Interfaces in quantum-mechanical systems can exhibit a much richer and more complex behavior than their classical counterparts. A review of some of this behavior may be found in [24]. For example, quantum fluctuations may lift a classical degeneracy and, in doing so, stabilize an interface (against thermal fluctuations) that is unstable in the corresponding classical system. Such a stabilization is an example of the phenomenon of ground state selection [16]. It is expected to occur for the 111-(or diagonal) interface in the three-dimensional ferromagnetic, anisotropic XXZ model [see e.g. [5, 6]], and


has been proved to occur for the 111-interface in the three-dimensional Falicov-Kimball model [11]. These models can be viewed as quantum perturbations of the classical Ising model. In contrast to these quantum-mechanical models, the diagonal interface in the three–dimensional classical Ising model is expected to be unstable at non-zero temperatures. This is due to the massive degeneracy of the zero-temperature configurations compatible with the boundary conditions which favor such an interface [see [17]]. Another interesting feature of interfaces in quantum-mechanical systems is the diverse nature of the low-lying excitations above the interface ground states for different models and for different orientations of the interface. For example, there are gapless excitations above the conjectured diagonal interface states in the spin-1/2 ferromagnetic, anisotropic, XXZ model. These excitations were described in the two-dimensional case by Koma and Nachtergaele [6, 20, 21], and proved to exist in all dimensions greater than one by Matsui [23]. In contrast, it is expected that there is a gap in the spectrum above a ground state that describes an interface perpendicular to a coordinate direction. For quantum-mechanical systems, the stability of an interface is a nontrivial question even in the ground state, since quantum fluctuations can destabilize the interface at zero temperature. In this case quantum fluctuations play a role analogous to that of thermal fluctuations in classical systems. In one dimension we expect interface states to be unstable for generic Hamiltonians. However, there are notable exceptions, e.g. the anisotropic ferromagnetic XXZ chain. In addition to its two ferromagnetically ordered, translation invariant ground states, this model has ground states corresponding to an interface between two domains of opposite magnetization. The stability of this interface was proved independently by Alcaraz, Salinas and Wreszinski [1] and Gottstein and Werner [15]. 
This stability is a direct consequence of the conservation of the total z-component of the spin. There are no terms in the Hamiltonian that can simply move the interface across one lattice spacing. To conserve the spin, one must at the same time create a new excitation in the chain, thus raising the energy of the state. More precisely, it was proved in [1, 15] that, under suitable boundary conditions, there exists a family of interface ground states which describe a localized domain wall. The localization length depends on the anisotropy of the model and diverges in the limit of the isotropic model. Alternative proofs of the stability of this interface were given in [4], by using the path integral representation of interface states, and in [3], by employing the principle of exponential localization [13]. The above results show that in the spin-1/2 ferromagnetic, anisotropic XXZ model, an arbitrarily small amount of anisotropy is sufficient to stabilize the interface against quantum fluctuations. Quantum perturbations do not always have the drastic effect of either stabilizing an unstable classical interface or destabilizing an interface at zero temperature. There exist quantum lattice models which are quantum perturbations of suitable classical systems such that an interface in the classical system remains essentially unchanged under the quantum perturbations. For example, if we add a quantum perturbation to the three-dimensional Ising model, then the so-called Dobrushin condition induces a stable interface in the system, in the sense that there is a low temperature non-translation invariant Gibbs state describing an asymptotically horizontal interface. This was proved in a more general setting by Borgs, Chayes, and Fröhlich [8] for systems in dimensions d ≥ 3, by using a quantum version of the Pirogov-Sinai theory [7, 9].
One expects that adding a quantum perturbation to the two-dimensional Ising model at low temperatures will not stabilize the 10-interface in this model but we are not aware of any proof of this. In this paper we consider the stability of the interface states in the anisotropic, antiferromagnetic (AF) XXZ and XZ models at zero temperature. We prove that in these models the interface is not stable in one dimension. We study the question of stability


by analyzing the dispersion relation for the energy of the interface, i.e., its energy as a function of its momentum. For the AF models we can force an interface into the system by imposing periodic boundary conditions on a chain with an odd number of sites. We can study the energy of the interface by comparing the energies for chains with an even and odd number of sites. The AF Hamiltonians that we consider are invariant both under lattice translations and global spin flips. The combined symmetry of translating by one lattice spacing and then performing a global spin flip, which we denote by T, is a useful symmetry for studying the interfaces since it leaves the Néel states invariant. We refer to the eigenvalue of this symmetry operator as a "generalized momentum." We study the difference between the lowest energy of an eigenstate with generalized momentum k for a chain with an odd number of sites and that with an even number of sites. We take this difference to be the definition of the dispersion relation for the interface. If the interface is stable, then there should be an eigenstate $|\Psi\rangle$ of the Hamiltonian (for a chain with an odd number of sites) which has some localized structure. So the states $T^l|\Psi\rangle$ should be linearly independent. By taking linear combinations of these states,

$$|k\rangle = \sum_l e^{ikl}\, T^l |\Psi\rangle, \tag{1}$$

we can form eigenstates of the Hamiltonian with different generalized momenta. Since T commutes with the Hamiltonian, these states all have the same energy. Thus the dispersion relation is independent of the generalized momentum if there is a stable interface. We prove that in the infinite volume limit the dispersion relation for the AF chain depends on the generalized momentum, and so the chain does not admit ground states that correspond to a stable interface. In contrast, for the anisotropic, ferromagnetic XXZ chain, we prove that the dispersion relation is "flat" (i.e., k-independent) in the infinite length limit. This provides another approach to studying the stability of the interface in this model at zero temperature to complement the approaches of [1, 15, 4, 3]. The XZ chain is exactly solvable, and Araki and Matsui used this to prove the absence of non-translationally invariant infinite volume ground states [2]. This shows the interface is unstable in this model since infinite volume ground states containing an interface would be non-translationally invariant. The XXZ model is also exactly solvable, so one might be able to use this solvability to study the dispersion relation we study. We emphasize, however, that in our approach we do not use the exact solvability of either of these models. The techniques that we use to study the interface are based on a novel approach to the analysis of ground states of quantum spin systems, introduced by Kirkwood and Thomas [18]. They considered spin-1/2 models, but their approach was applied to some higher spin models by Matsui [22]. Their method originally required a Perron-Frobenius condition on the Hamiltonian. We removed this condition and simplified the proof of convergence of the expansion in [10]. Although we restrict our attention to the XZ and XXZ models in this paper, we expect the methods and results to be applicable to a much broader class of models.
The paper is organized as follows: To keep the paper self-contained, we first give a summary of our version of the Kirkwood-Thomas approach (as developed in [10]) by using it to study the ground state of the AF anisotropic XZ Hamiltonian. This is done in Sect. 2 for a d-dimensional lattice under periodic boundary conditions. The results of this section, for the case d = 1, are used later in our analysis of interface states in the AF anisotropic XZ chain. If the number of sites N in such a chain is even then the ground state does not have an interface. However, if N is odd then the periodic boundary conditions force an interface in the chain. The latter situation is studied in Sect. 3. We


N. Datta, T. Kennedy

prove that the dispersion relation for the energy of the interface depends non-trivially on the generalized momentum k even in the limit N → ∞. This allows us to conclude that the ground state of the AF anisotropic XZ chain does not have a stable interface. In Sect. 4 we prove a similar result for the AF anisotropic XXZ chain. In contrast, in Sect. 5, we prove that for the corresponding ferromagnetic model the energy of the interface does not depend on k in the limit N → ∞.

2. XZ Ground State: The Kirkwood-Thomas Approach

We consider the following antiferromagnetic Hamiltonian defined on a finite lattice $\Lambda \subset \mathbb{Z}^d$:
$$\tilde H = \sum_{\langle ij\rangle \subset \Lambda} \sigma^z_i\sigma^z_j + \epsilon \sum_{\langle ij\rangle \subset \Lambda} \sigma^x_i\sigma^x_j, \tag{2}$$
where the sums are over all nearest neighbor pairs (denoted by $\langle ij\rangle$) in $\Lambda$. We impose periodic boundary conditions and assume that $\Lambda$ has an even number of sites in each coordinate direction. The Hamiltonian $\tilde H$ acts on the Hilbert space $\mathcal{H} = (\mathbb{C}^2)^{\otimes|\Lambda|}$, where $|\Lambda|$ denotes the number of sites in the lattice $\Lambda$. The Hamiltonian and most of the quantities that follow depend on the volume $\Lambda$. However, for notational simplicity, we often suppress this explicit dependence. The above Hamiltonian commutes with the global spin flip operator given by
$$\tilde P = \prod_{i\in\Lambda} \sigma^x_i. \tag{3}$$

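The above form of the Hamiltonian seems natural for perturbation theory in $\epsilon$ since the $\epsilon = 0$ Hamiltonian is diagonal. However, following Kirkwood and Thomas, we study a unitarily equivalent Hamiltonian obtained by a rotation about the Y-axis in spin space caused by the operator
$$R = \exp\Big( i\,\frac{\pi}{4} \sum_{j\in\Lambda} \sigma^y_j \Big). \tag{4}$$
Hence,
$$R\sigma^x_i R^{-1} = \sigma^z_i; \qquad R\sigma^z_i R^{-1} = -\sigma^x_i,$$
and therefore
$$R\tilde H R^{-1} = \sum_{\langle ij\rangle \subset \Lambda} \sigma^x_i\sigma^x_j + \epsilon \sum_{\langle ij\rangle \subset \Lambda} \sigma^z_i\sigma^z_j. \tag{5}$$
The global spin flip operator transforms into
$$R\tilde P R^{-1} = \prod_{i\in\Lambda} \sigma^z_i. \tag{6}$$
Finally, we perform a unitary transformation to change the $\epsilon = 0$ Hamiltonian from antiferromagnetic to ferromagnetic. Define
$$U = \prod_{j\in\Lambda,\ j\ \mathrm{odd}} \sigma^z_j, \tag{7}$$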


where $j$ odd means that the sum of the components of $j$ is odd. Since $\Lambda$ has an even number of sites in each coordinate direction, the transformed Hamiltonian, $H$, is given by
$$H = UR\tilde H R^{-1}U^{-1} = -\sum_{\langle ij\rangle \subset \Lambda} \sigma^x_i\sigma^x_j + \epsilon \sum_{\langle ij\rangle \subset \Lambda} \sigma^z_i\sigma^z_j. \tag{8}$$
Since $[H, P] = 0$, the state space of the Hamiltonian $H$ can be decomposed into two subspaces corresponding to the eigenvalues +1 and −1 of $P$. We refer to these two subspaces as the even and odd sectors respectively. The transformed global spin flip operator, $R\tilde P R^{-1}$, remains unchanged under the action of the unitary operator $U$:
$$P = UR\tilde P R^{-1}U^{-1} = \prod_{i\in\Lambda} \sigma^z_i. \tag{9}$$
We emphasize that Eq. (8) is not true if $\Lambda$ has an odd number of sites in any lattice direction. This fact plays a key role in our study of interfaces in the one dimensional case [see e.g. Sect. 3].

Let us introduce some definitions and notations. A classical spin configuration on the lattice $\Lambda$ is defined to be an assignment of a +1 or a −1 to each site in the lattice. Hence, for each $i \in \Lambda$, $\sigma_i = \pm 1$. We will abbreviate the classical spin configuration $\{\sigma_i\}_{i\in\Lambda}$ by $\sigma$. For each such $\sigma$ we let $|\sigma\rangle$ be the state in the Hilbert space, $\mathcal{H}$, which is the tensor product of a spin-up state at each site with $\sigma_i = +1$ and a spin-down state at each site with $\sigma_i = -1$. Thus $|\sigma\rangle$ is an eigenstate of all the $\sigma^z_i$ with $\sigma^z_i|\sigma\rangle = \sigma_i|\sigma\rangle$. The states $|\sigma\rangle$ form a complete orthonormal basis of $\mathcal{H}$. Any state $|\Psi\rangle$ can be written in terms of this basis:
$$|\Psi\rangle = \sum_{\sigma} \psi(\sigma)|\sigma\rangle, \tag{10}$$
where $\psi(\sigma)$ is a complex-valued function on the spin configurations $\sigma$. For a single site, the vectors $(|{+1}\rangle + |{-1}\rangle)$ and $(|{+1}\rangle - |{-1}\rangle)$ are the eigenstates of $\sigma^x$ with eigenvalues +1 and −1, respectively. Thus the (unnormalized) ground states of the Hamiltonian, $H$, [(8)] for $\epsilon = 0$ are given by (10) with $\psi(\sigma) = 1$ and $\psi(\sigma) = \prod_{i\in\Lambda}\sigma_i$. We define
$$\sigma(X) = \prod_{i\in X} \sigma_i \tag{11}$$
and use the convention that $\sigma(\emptyset) = 1$. Note that $\sigma(\Lambda)$ is equal to +1 (−1) in the even (odd) sector. In the Kirkwood-Thomas method one expands the ground state with respect to the basis $\{|\sigma\rangle\}$, as in Eq. (10), and writes $\psi(\sigma)$ in the form
$$\psi(\sigma) = \exp\Big[ -\frac{1}{2} \sum_{X} g(X)\sigma(X) \Big] \tag{12}$$
for some real $g(X)$. As in [10], we justify the above exponential form of $\psi(\sigma)$ by a two-step procedure: First, we consider (12) to be an ansatz and prove that it satisfies the Schrödinger equation. This ensures that there is an eigenstate of the form (12). Next we give an argument to show that this eigenstate must in fact be the ground state. Consider the Schrödinger equation
$$H\Psi = E_0\Psi. \tag{13}$$


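The operator $\sigma^z_i\sigma^z_j$ is diagonal in the chosen basis, so
$$\sigma^z_i\sigma^z_j \sum_{\sigma} \psi(\sigma)|\sigma\rangle = \sum_{\sigma} \sigma_i\sigma_j\,\psi(\sigma)|\sigma\rangle. \tag{14}$$
The operator $\sigma^x_i\sigma^x_j$ just flips the spins at sites $i$ and $j$, i.e., $\sigma^x_i\sigma^x_j|\sigma\rangle = |\sigma^{(ij)}\rangle$, where $\sigma^{(ij)}$ is the spin configuration $\sigma$ but with $\sigma_i$ replaced by $-\sigma_i$ and $\sigma_j$ replaced by $-\sigma_j$. Hence
$$\sigma^x_i\sigma^x_j \sum_{\sigma} \psi(\sigma)|\sigma\rangle = \sum_{\sigma} \psi(\sigma)|\sigma^{(ij)}\rangle = \sum_{\sigma} \psi(\sigma^{(ij)})|\sigma\rangle. \tag{15}$$
The last equality follows by a change of variables in the sum. We now see that if we use (10) in the Schrödinger equation (13) and pick out the coefficient of $|\sigma\rangle$, then for each spin configuration $\sigma$ we have
$$-\sum_{\langle ij\rangle} \psi(\sigma^{(ij)}) + \epsilon \sum_{\langle ij\rangle} \sigma_i\sigma_j\,\psi(\sigma) = E_0\,\psi(\sigma). \tag{16}$$
Henceforth, the condition $\langle ij\rangle \subset \Lambda$ will be implicit in all our sums on $\langle ij\rangle$. Dividing both sides of (16) by $\psi(\sigma)$ we have
$$-\sum_{\langle ij\rangle} \frac{\psi(\sigma^{(ij)})}{\psi(\sigma)} + \epsilon \sum_{\langle ij\rangle} \sigma_i\sigma_j = E_0. \tag{17}$$
Now $\sigma^{(ij)}(X)$ is $\sigma(X)$ when both $i$ and $j$ are in $X$, and when both of them are not in $X$. If exactly one of $i$ and $j$ is in $X$, then $\sigma^{(ij)}(X)$ is $-\sigma(X)$. We will let $\partial X$ denote the set of nearest neighbor bonds which connect a site in $X$ with a site not in $X$. (Henceforth, we will always use the word bond to denote a nearest neighbor bond.) Then the condition that exactly one of $i$ and $j$ belongs to $X$ may be written as $\langle ij\rangle \in \partial X$. We will often abbreviate this condition as $X : \langle ij\rangle$. Thus
$$\psi(\sigma^{(ij)}) = \exp\Big[ -\frac{1}{2} \sum_{X} g(X)\sigma(X) + \sum_{X:\langle ij\rangle} g(X)\sigma(X) \Big] \tag{18}$$
and so the Schrödinger equation is now
$$-\sum_{\langle ij\rangle} \exp\Big[ \sum_{X:\langle ij\rangle} g(X)\sigma(X) \Big] + \epsilon \sum_{\langle ij\rangle} \sigma_i\sigma_j = E_0. \tag{19}$$
As in [10], we refer to this equation as the Kirkwood-Thomas equation. We expand the exponential in a power series. The contribution from the linear term may be rewritten as
$$\sum_{\langle ij\rangle} \sum_{X:\langle ij\rangle} g(X)\sigma(X) = \sum_{X} |\partial X|\, g(X)\sigma(X), \tag{20}$$
where $|\partial X|$ is the number of bonds in $\partial X$, i.e., the number of bonds that connect a site in $X$ with a site not in $X$. Hence the Kirkwood-Thomas equation becomes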


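$$\sum_{X} |\partial X|\, g(X)\sigma(X) + E_0 + d|\Lambda| = -\sum_{\langle ij\rangle} \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{X_1,X_2,\dots,X_n:\langle ij\rangle}\ \prod_{k=1}^{n} g(X_k)\sigma(X_k) + \epsilon \sum_{\langle ij\rangle} \sigma_i\sigma_j. \tag{21}$$
Here $d|\Lambda|$ is the number of bonds in the lattice. Since $\sigma_i^2 = 1$, $\sigma(X)\sigma(Y) = \sigma(X \triangle Y)$, where the symmetric difference $X \triangle Y$ of $X$ and $Y$ is defined by $X \triangle Y = (X \cup Y) \setminus (X \cap Y)$. Thus $\prod_{k=1}^{n} \sigma(X_k) = \sigma(X_1 \triangle \cdots \triangle X_n)$. If we equate the coefficient of $\sigma(X)$ on both sides of Eq. (21), we obtain, for $X \neq \emptyset$,
$$g(X) = \frac{1}{|\partial X|} \Big[ -\sum_{\langle ij\rangle} \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{\substack{X_1,X_2,\dots,X_n:\langle ij\rangle, \\ X_1 \triangle \cdots \triangle X_n = X}} g(X_1)g(X_2)\cdots g(X_n) + \epsilon\, 1_{nn}(X) \Big], \tag{22}$$
where $1_{nn}(X)$ is 1 if $X$ consists of two nearest neighbor sites and is 0 otherwise. If $X = \Lambda$, then $\partial X = \emptyset$. So the coefficient of $g(\Lambda)$ on the LHS of Eq. (21) is zero. This looks like a fatal problem since the RHS of the equation will contain a multiple of $\sigma(\Lambda)$. We solve this problem by exploiting the decomposition of the state space into even and odd sectors (as in [18]). We look for eigenstates of the form
$$|\Psi_e\rangle = \sum_{\sigma:\,\mathrm{even}} \psi(\sigma)|\sigma\rangle \tag{23}$$
and
$$|\Psi_o\rangle = \sum_{\sigma:\,\mathrm{odd}} \psi(\sigma)|\sigma\rangle, \tag{24}$$
where the sums are only over configurations $\sigma$ for which the number of sites $i$ with $\sigma_i = -1$ is even or odd, respectively. (Equivalently, $\sigma(\Lambda) := \prod_{i\in\Lambda}\sigma_i = +1$, or $-1$.) The Schrödinger equation is still equivalent to (19), but now to find an eigenstate in the even (respectively, odd) sector, this equation need only hold for $\sigma$ with $\prod_{i\in\Lambda}\sigma_i = +1$ (respectively, $-1$). Thus the terms on the RHS of (21) which contain $\sigma(\Lambda)$ may be included in the equation for $X = \emptyset$. So for $X = \emptyset$, we obtain the equation
$$E_\pm + d|\Lambda| = -\sum_{\langle ij\rangle} \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{\substack{X_1,X_2,\dots,X_n:\langle ij\rangle, \\ X_1 \triangle \cdots \triangle X_n = \emptyset}} g(X_1)g(X_2)\cdots g(X_n)\ \mp\ \sum_{\langle ij\rangle} \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{\substack{X_1,X_2,\dots,X_n:\langle ij\rangle, \\ X_1 \triangle \cdots \triangle X_n = \Lambda}} g(X_1)g(X_2)\cdots g(X_n). \tag{25}$$
Here and henceforth, the upper (lower) sign corresponds to the even (odd) sector. We have replaced $E_0$ by $E_\pm$ since the eigenvectors in the even and odd sectors have different eigenvalues. We will see later that the difference between the two eigenvalues is exponentially small in the number of sites in the lattice $\Lambda$. Note that Eq. (19) for the two sectors can be combined into the single equation (written here in the one-dimensional notation used from Sect. 3 on)
$$-\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big] + \epsilon \sum_{j=1}^{N} \sigma_j\sigma_{j+1} = \frac{E_+ + E_-}{2} + \frac{E_+ - E_-}{2}\,\sigma(\Lambda). \tag{26}$$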


We let $g$ denote the collection of coefficients $\{g(X) : X \subset \Lambda,\ X \neq \emptyset,\ X \neq \Lambda\}$, and think of Eq. (22) as a fixed point equation, $g = F(g)$. We define a norm by
$$\|g\| = \sum_{X:b} |g(X)|\, |\partial X|\, (|\epsilon| M)^{-w(X)}, \tag{27}$$
where $b$ is a nearest neighbor bond, the sum is over sets $X$ with $b \in \partial X$, and $w(X)$ is defined as follows: We consider two bonds to be “connected” if they share an endpoint or if the distance between them is 1. We consider a set of bonds to be “connected” if we can get from one bond in the set to any other bond in the set by going through a sequence of connected bonds in the set. Then $w(X)$ is the cardinality of the smallest set of bonds which contains $X$ and is “connected.” Note that the symmetries of the lattice imply that the norm $\|g\|$ does not depend on the choice of $b$.

Theorem 1. There exists a constant $M > 0$ which depends only on the number of dimensions of the lattice, such that if $|\epsilon| M \le 1$, then the fixed point equation (22) has a solution $g$, and $\|g\| \le \delta$ for some constant $\delta$ which depends only on the lattice.

Proof. We will prove that $F$ is a contraction on a small ball about the origin, and that it maps this ball back into itself. The contraction mapping theorem will then imply that $F$ has a fixed point in this ball. For the sake of concreteness, we prove it is a contraction with constant 1/2, but there is nothing special about the choice of 1/2. Define
$$\delta = \frac{4(2d-1)}{M}. \tag{28}$$
We will show that
$$\|F(g) - F(g')\| \le \frac{1}{2}\|g - g'\| \quad \text{for } \|g\|, \|g'\| \le \delta, \tag{29}$$
and
$$\|F(g)\| \le \delta \quad \text{for } \|g\| \le \delta. \tag{30}$$
The proof of (29) proceeds as follows: Fix a bond $b$ to use in the definition of $\|F(g) - F(g')\|$. Then
$$\|F(g) - F(g')\| \le \sum_{\langle ij\rangle} \sum_{n=2}^{\infty} \frac{1}{n!} \sum_{\substack{X_1,\dots,X_n:\langle ij\rangle, \\ b \in \partial X}} \big| g(X_1)\cdots g(X_n) - g'(X_1)\cdots g'(X_n) \big|\, (|\epsilon| M)^{-w(X)}, \tag{31}$$
where $X = X_1 \triangle \cdots \triangle X_n$. If $b \in \partial X$, then $b$ is in at least one $\partial X_k$. Using the symmetry under permutations of the $X_k$, we can take $b \in \partial X_1$ at the cost of a factor of $n$. We claim that if $\langle ij\rangle \in \partial X_k$ for $k = 1, 2, \dots, n$, then
$$w(X_1 \triangle \cdots \triangle X_n) \le \sum_{k=1}^{n} w(X_k). \tag{32}$$
To prove the claim, for $k = 1, 2, \dots, n$, let $C_k$ be sets of bonds such that $X_k \subset C_k$, $|C_k| = w(X_k)$ and $C_k$ is connected in the sense used to define $w(X_k)$ [see discussion after (27)]. Define $C = \cup_{k=1}^{n} C_k$. Since $X_k$ contains exactly one of the sites $i$ and $j$,


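$C_k$ contains at least one of the sites $i$ and $j$. Since $C_1, \dots, C_n$ are connected this implies that $C$ is connected. Clearly, $X_1 \triangle \cdots \triangle X_n \subset C$. So
$$w(X_1 \triangle \cdots \triangle X_n) \le |C| \le \sum_{k=1}^{n} |C_k| = \sum_{k=1}^{n} w(X_k), \tag{33}$$
which proves the claim (32). Using
$$\Big| \prod_{k=1}^{n} g(X_k) - \prod_{k=1}^{n} g'(X_k) \Big| \le \sum_{k=1}^{n}\ \prod_{i=1}^{k-1} |g(X_i)|\ |g(X_k) - g'(X_k)| \prod_{i=k+1}^{n} |g'(X_i)|, \tag{34}$$
we have
$$\begin{aligned}
\|F(g) - F(g')\| &\le \sum_{n=2}^{\infty} \frac{1}{(n-1)!} \sum_{X_1:b}\ \sum_{\langle ij\rangle \in \partial X_1}\ \sum_{X_2,\dots,X_n:\langle ij\rangle}\ \sum_{k=1}^{n}\ \prod_{i=1}^{k-1} |g(X_i)|\,(|\epsilon| M)^{-w(X_i)} \\
&\qquad \times |g(X_k) - g'(X_k)|\,(|\epsilon| M)^{-w(X_k)} \prod_{i=k+1}^{n} |g'(X_i)|\,(|\epsilon| M)^{-w(X_i)} \\
&\le \sum_{n=2}^{\infty} \frac{1}{(n-1)!} \sum_{k=1}^{n} \|g - g'\|\ \|g\|^{k-1}\ \|g'\|^{n-k} \\
&\le K \|g - g'\|,
\end{aligned} \tag{35}$$
where
$$K = \sum_{n=2}^{\infty} \frac{n\,\delta^{n-1}}{(n-1)!} = e^{\delta} - 1 + \delta e^{\delta}, \tag{36}$$
and we have used the fact that both $\|g\|$ and $\|g'\|$ are bounded by $\delta$. By choosing $\delta$ to be sufficiently small (i.e., $M$ sufficiently large) we obtain $K \le 1/2$. To prove (30), we use (29) with $g' = 0$. From (22) it follows that
$$\|F(0)\| \le \sum_{X:b} 1_{nn}(X)\, |\epsilon|\, (|\epsilon| M)^{-w(X)} \le 2(2d-1)\,|\epsilon|\,(|\epsilon| M)^{-1} \le 2(2d-1) M^{-1} = \frac{\delta}{2}. \tag{37}$$
Hence,
$$\|F(g)\| \le \|F(g) - F(0)\| + \|F(0)\| \le \frac{1}{2}\|g\| + \frac{1}{2}\delta \le \delta. \tag{38}$$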
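As a quick sanity check of the closed form in (36), the following sketch (not part of the original argument; the function names are illustrative assumptions) sums the series numerically, compares it with $e^{\delta} - 1 + \delta e^{\delta}$, and finds by bisection the largest $\delta$ for which $K \le 1/2$. Via (28), that threshold translates into a lower bound on $M$ of the form $M \ge 4(2d-1)/\delta$.

```python
import math


def k_series(delta, terms=60):
    """Partial sum of K = sum_{n>=2} n * delta^(n-1) / (n-1)!  (Eq. 36)."""
    return sum(n * delta ** (n - 1) / math.factorial(n - 1)
               for n in range(2, terms + 2))


def k_closed(delta):
    """Closed form e^delta - 1 + delta * e^delta from Eq. (36)."""
    return math.exp(delta) - 1.0 + delta * math.exp(delta)


def largest_delta(target=0.5, tol=1e-12):
    """Bisection for the largest delta in [0, 1] with k_closed(delta) <= target.

    k_closed is increasing, k_closed(0) = 0 and k_closed(1) > 1/2.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if k_closed(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    d_star = largest_delta()
    print("largest delta with K <= 1/2:", d_star)
    # Corresponding lower bound on M from delta = 4(2d - 1)/M, here for d = 1.
    print("M must be at least about:", 4 * (2 * 1 - 1) / d_star)
```

The threshold comes out near $\delta \approx 0.21$, so in this sketch the contraction estimate only constrains $M$ to be a moderately large, dimension-dependent constant.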

Equation (25) may be used to study the difference between the ground state energies in the odd and even sectors. It is straightforward to show that
$$|E_- - E_+| \le c\,(|\epsilon| M)^{w(\Lambda)}\, d|\Lambda|, \tag{39}$$
where the constant $c$ depends on $\|g\|$. Since $w(\Lambda) = |\Lambda|/2$, the difference between these two eigenvalues is exponentially small in the number of sites in the lattice.


We conclude this section by showing that the eigenstates we have constructed in the even and odd sectors are indeed the lowest eigenstates in these sectors. The argument is similar to that in [10], but some small modifications are needed to take account of the decomposition into even and odd sectors. We know our eigenstates are the lowest in their sectors when $\epsilon = 0$. Since we have a finite lattice, our eigenvalue problem is finite dimensional. So in each sector, our eigenstate will remain the lowest eigenstate provided its eigenvalue does not cross another eigenvalue associated with that sector, i.e., provided the eigensubspace in the sector associated with our eigenvalue continues to be one-dimensional. Hence, if we show that there exists an $\epsilon_0 > 0$ such that our eigenfunction is non-degenerate for all $\epsilon$ with $|\epsilon| < \epsilon_0$, then it would follow that our eigenfunction is the ground state for all such $\epsilon$. Suppose that there is a value of $\epsilon$ for which there is another eigenvector $|\Psi'_e\rangle$ with the same eigenvalue as $|\Psi_e\rangle$. (The argument in the case of the odd sector is identical.) Define $\psi'(\sigma)$ for even $\sigma$ by
$$|\Psi'_e\rangle = \sum_{\sigma:\,\mathrm{even}} \psi'(\sigma)|\sigma\rangle, \tag{40}$$
and let $\psi'(\sigma) = 0$ for odd $\sigma$. Now consider $\psi(\sigma) + \alpha\psi'(\sigma)$, where $\alpha$ is a small real number and $\psi(\sigma)$ is defined through (23). As $\alpha \to 0$, this converges to $\psi(\sigma)$ for each $\sigma$. There are only finitely many values of $\sigma$, so for small enough $\alpha$, this function is always positive (since $\psi(\sigma) > 0$ for all $\sigma$). So it can be written as $\exp[-\frac{1}{2}\sum_X g_\alpha(X)\sigma(X)]$. Moreover, as $\alpha \to 0$, $g_\alpha(X) \to g(X)$ for each $X$, and by construction $g_\alpha$ satisfies the fixed point equation. So for sufficiently small $\alpha$, $g_\alpha$ is a solution of the fixed point equation which is inside the ball in which we know the fixed point equation has a unique solution. This contradiction completes the argument.

3. Interfaces in the Antiferromagnetic XZ Chain

In this section we consider the model of the previous section in one dimension. So $\Lambda = \{1, \dots, N\}$, and
$$\tilde H = \sum_{j=1}^{N} \sigma^z_j\sigma^z_{j+1} + \epsilon \sum_{j=1}^{N} \sigma^x_j\sigma^x_{j+1}. \tag{41}$$
The indices should be taken to be periodic, e.g., $\sigma^x_{N+1}$ means $\sigma^x_1$. When $N$ is even, we have as before
$$H = UR\tilde H R^{-1}U^{-1} = -\sum_{j=1}^{N} \sigma^x_j\sigma^x_{j+1} + \epsilon \sum_{j=1}^{N} \sigma^z_j\sigma^z_{j+1}, \tag{42}$$
and the ground state may be constructed as in the previous section. If $N$ is odd, then the periodic boundary conditions force an interface into the antiferromagnetic chain. In this case we have
$$H = UR\tilde H R^{-1}U^{-1} = -\sum_{j=1}^{N} J_j\,\sigma^x_j\sigma^x_{j+1} + \epsilon \sum_{j=1}^{N} \sigma^z_j\sigma^z_{j+1}, \tag{43}$$
where the coupling $J_j$ is +1 except when $j = N$, in which case it is −1. So the $\epsilon = 0$ Hamiltonian has ferromagnetic couplings for all the bonds except the bond between the sites 1 and $N$.
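The transformations (42) and (43) can be checked by brute force on a short chain. The sketch below is an illustration only, not part of the paper: it assumes the standard Pauli matrices, numbers the sites 1 to N with site 1 as the leftmost tensor factor, and all function names are hypothetical. It builds $\tilde H$, $R$ and $U$ as explicit matrices and compares $UR\tilde H R^{-1}U^{-1}$ with the right-hand sides of (42) (even N) and (43) (odd N).

```python
import functools
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def site_op(op, j, n):
    """Return `op` acting on site j (1-based, periodic) of an n-site chain."""
    factors = [I2] * n
    factors[(j - 1) % n] = op
    return functools.reduce(np.kron, factors)


def bond_sum(op, n, couplings=None):
    """Sum over the periodic bonds <j, j+1> of couplings[j-1] * op_j op_{j+1}."""
    if couplings is None:
        couplings = [1.0] * n
    total = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for j in range(1, n + 1):
        total += couplings[j - 1] * site_op(op, j, n) @ site_op(op, j + 1, n)
    return total


def transformed_hamiltonian(n, eps):
    """Compute U R H~ R^{-1} U^{-1} for the AF XZ chain of Eq. (41)."""
    h_tilde = bond_sum(SZ, n) + eps * bond_sum(SX, n)
    # exp(i*pi/4*sigma^y) = (1 + i*sigma^y)/sqrt(2) on every site, Eq. (4).
    rot = functools.reduce(np.kron, [(I2 + 1j * SY) / np.sqrt(2)] * n)
    # U is the product of sigma^z over the odd sites, Eq. (7).
    u = np.eye(2 ** n, dtype=complex)
    for j in range(1, n + 1, 2):
        u = u @ site_op(SZ, j, n)
    ur = u @ rot
    return ur @ h_tilde @ ur.conj().T  # ur is unitary, so its inverse is ur^dagger


def expected_hamiltonian(n, eps):
    """Right-hand side of Eq. (42) for even n, or of Eq. (43) for odd n."""
    couplings = [1.0] * n
    if n % 2 == 1:
        couplings[-1] = -1.0  # J_N = -1 on the bond between sites N and 1
    return -bond_sum(SX, n, couplings) + eps * bond_sum(SZ, n)


if __name__ == "__main__":
    for n in (4, 5):
        ok = np.allclose(transformed_hamiltonian(n, 0.3), expected_hamiltonian(n, 0.3))
        print(n, "sites:", "match" if ok else "mismatch")
```

For odd N the sites N and 1 are both odd, so $U$ does not flip the sign of $\sigma^x_N\sigma^x_1$; this is where the single coupling $J_N = -1$ comes from.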


As before the Hamiltonian $\tilde H$ commutes with the global spin flip operator $\tilde P$ [(3)]. It also commutes with the translation operator $T$ defined by
$$T\sigma^{\alpha}_i T^{-1} = \sigma^{\alpha}_{i+1}, \qquad \alpha = x, y, z. \tag{44}$$
When $\epsilon = 0$ and $N$ is even, the ground states of the Hamiltonian $\tilde H$ [(41)] are the two Néel states. These states are not invariant under translation. However, if we translate and then perform the global spin flip, the Néel states remain unchanged. So if we define
$$\tilde{\mathcal T} = \tilde P\, T, \tag{45}$$
then $\tilde{\mathcal T}$ commutes with $\tilde H$ and leaves the Néel states invariant. This combined symmetry of the Hamiltonian will be the most useful one in our study of interface states, since its action on an interface is to simply translate the interface by one site. Let
$$\mathcal T = UR\,\tilde{\mathcal T}\,R^{-1}U^{-1} \tag{46}$$
be this combined symmetry after our unitary transformations. Simple calculations show that when $N$ is even, $\mathcal T$ is equal to the pure translation operator $T$. However, for odd values of $N$ we find that
$$\mathcal T = \sigma^z_1\, T. \tag{47}$$
In words, $\mathcal T$ translates by one lattice spacing and rotates the spin at the site $i = 1$. We can refer to it as a generalized translation operator. Throughout this section we will assume $N$ to be odd. Since $H$ and $\mathcal T$ commute, we choose the eigenfunctions of $H$ to be eigenfunctions of $\mathcal T$ as well. So they can be labeled by an index $k$, where $k$ can be regarded as the generalized “momentum”, i.e.,
$$\mathcal T\psi_k(\sigma) = e^{-ik}\psi_k(\sigma). \tag{48}$$
It is important to note that $\mathcal T^N$ is not the identity operator. In fact,
$$\mathcal T^N = P = \prod_{i=1}^{N} \sigma^z_i, \tag{49}$$
the transformed global spin flip operator [(9)] of Sect. 2. The state space may again be decomposed into two subspaces corresponding to the eigenvalues +1 and −1 of $P = \mathcal T^N$, which we refer to as the even and odd sectors respectively. We see that $\mathcal T^{2N} = 1$, and so the possible values of $k$ are $k = \pi j/N$ with $j = 0, 1, 2, \dots, 2N-1$. An eigenstate of $\mathcal T$ with eigenvalue $e^{-ik}$ will be in the even sector if $e^{-ikN} = 1$ and in the odd sector if $e^{-ikN} = -1$. Almost every quantity depends on $N$, the number of sites. We usually suppress this dependence, but in the statement of the following theorem we make it explicit. As we saw in the last section, for even $N$, the lowest eigenvalues in the even and odd sectors, which we now denote by $E^N_+$ and $E^N_-$, respectively, are slightly different. The expansion of the previous section shows that with our periodic boundary conditions, they are both equal, up to a correction that is exponentially small in $N$, to $N$ times a constant $e_0$, the infinite volume ground state energy per site. We define $E^N_0(k)$ to be $E^N_+$ if $k$ is in the even sector and $E^N_-$ if $k$ is in the odd sector. So
$$E^N_0(k) = \frac{E^N_+ + E^N_-}{2} + \frac{E^N_+ - E^N_-}{2}\, e^{-ikN}. \tag{50}$$
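The operator identities (47) and (49), together with $\mathcal T^{2N} = 1$, can also be confirmed numerically on a small chain. The sketch below is illustrative only: it assumes the same matrix conventions as above (site 1 is the leftmost tensor factor, spin up is basis index 0), realizes $T$ as the permutation that moves the spin at site $i$ to site $i+1$, and uses hypothetical function names.

```python
import functools
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def site_op(op, j, n):
    """Return `op` acting on site j (1-based, periodic) of an n-site chain."""
    factors = [I2] * n
    factors[(j - 1) % n] = op
    return functools.reduce(np.kron, factors)


def product_over_sites(op, sites, n):
    """Ordered product of `op` over the given sites."""
    out = np.eye(2 ** n, dtype=complex)
    for j in sites:
        out = out @ site_op(op, j, n)
    return out


def translation(n):
    """Permutation matrix T with T sigma_i T^{-1} = sigma_{i+1}, Eq. (44)."""
    states = list(itertools.product((0, 1), repeat=n))
    index = {bits: a for a, bits in enumerate(states)}
    t = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for bits, a in index.items():
        shifted = (bits[-1],) + bits[:-1]  # the spin at site i moves to site i+1
        t[index[shifted], a] = 1.0
    return t


def generalized_translation(n):
    """Compute U R (P~ T) R^{-1} U^{-1}, i.e. Eq. (46) with Eq. (45)."""
    p_tilde = product_over_sites(SX, range(1, n + 1), n)
    rot = functools.reduce(np.kron, [(I2 + 1j * SY) / np.sqrt(2)] * n)
    u = product_over_sites(SZ, range(1, n + 1, 2), n)
    ur = u @ rot
    return ur @ (p_tilde @ translation(n)) @ ur.conj().T


if __name__ == "__main__":
    n = 5
    cal_t = generalized_translation(n)
    parity = product_over_sites(SZ, range(1, n + 1), n)
    print("Eq. (47):", np.allclose(cal_t, site_op(SZ, 1, n) @ translation(n)))
    print("Eq. (49):", np.allclose(np.linalg.matrix_power(cal_t, n), parity))
    print("T^(2N) = 1:", np.allclose(np.linalg.matrix_power(cal_t, 2 * n), np.eye(2 ** n)))
```

Because $\mathcal T^{2N} = 1$ while $\mathcal T^N = P \neq 1$, the eigenvalues $e^{-ik}$ are $2N$-th roots of unity, which is why the momenta are $k = \pi j/N$ rather than $2\pi j/N$.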


For odd $N$ we let $E^N_1(k)$ denote the lowest eigenvalue in the subspace of generalized momentum $k$ for the Hamiltonian of this section. The difference $E^{N+1}_1(k) - E^N_0(k)$ with $N$ even is equal to $e_0$ plus the energy of an interface with momentum $k$. Our goal is to study this quantity in the infinite $N$ limit. If there is a localized interface, then this difference would be independent of $k$, as explained in the Introduction. The quantities $E^{N+1}_1(k)$ and $E^N_0(k)$ are only defined for a finite set of values of $k$, and the two functions are defined on different sets of values. To make sense of this difference, we extend the definitions of these two functions to all $k$. The Fourier coefficients $e^N_{0,s}$ are defined by
$$E^N_0(k) = \sum_{s=1}^{2N} e^N_{0,s}\, e^{iks}. \tag{51}$$
The RHS of this equation is defined for all $k$, so we can take it to be the definition of the LHS for all $k$. We extend the definition of $E^{N+1}_1(k)$ to all $k$ in the same way. It is useful to define $e^N_{0,s}$ and $e^{N+1}_{1,s}$ for all $s$ by making them periodic functions of $s$ with periods $2N$ and $2(N+1)$, respectively. Then we can rewrite our Fourier series so that they are centered around $s = 0$, e.g.,
$$E^N_0(k) = \sum_{s=-N+1}^{N} e^N_{0,s}\, e^{iks}. \tag{52}$$
This form is better suited for taking the $N \to \infty$ limit.

Theorem 2. There exists an $\epsilon_0 > 0$ such that for all $|\epsilon| < \epsilon_0$ the following is true: For $s \in \mathbb{Z}$ there are coefficients $\varepsilon_s$ such that for all $k$,
$$\lim_{\substack{N\to\infty \\ N\ \mathrm{even}}} \big[ E^{N+1}_1(k) - E^N_0(k) \big] = \sum_{s} \varepsilon_s\, e^{iks}. \tag{53}$$
Moreover, there is a constant $c$ such that
$$|\varepsilon_s| \le (c|\epsilon|)^{\lceil |s|/2 \rceil}, \tag{54}$$
where the notation $\lceil l \rceil$ denotes the smallest integer which is not smaller than $l$. We have
$$\varepsilon_2 = \varepsilon_{-2} = \epsilon + O(\epsilon^2). \tag{55}$$
So the dispersion relation (53) is not a constant function of $k$.

The remainder of this section is devoted to the proof of this theorem. In the last section we assumed that $N$ was even. It is only for even $N$ that the periodic boundary conditions for the original Hamiltonian (2) lead to the Hamiltonian (8), and hence to the Kirkwood-Thomas equation (19). However, Eq. (19) is defined for all $N$ and the proof of the existence of a solution works for odd $N$ as well. This allows us to define $E^N_0(k)$ for odd $N$. Moreover, the difference between $E^N_0(k)$ and $E^{N+1}_0(k)$ converges to a constant $e_0$, the ground state energy per site, as $N \to \infty$. Hence, to prove the theorem we can consider the difference $E^N_1(k) - E^N_0(k)$ with $N$ odd. Throughout the proof we will work with this quantity and suppress the superscript $N$. In the rest of the paper, the $N$-dependence of functions will not be explicitly indicated unless needed.


We start by studying what the eigenfunctions of $\mathcal T$ look like. For $k = \pi j/N$ with $j = 0, 1, 2, \dots, 2N-1$ we define
$$\phi_{X,k}(\sigma) = \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\sigma_2\cdots\sigma_l\ \sigma(X + l). \tag{56}$$
Indices should be taken to be periodic, i.e., $\sigma_{N+i} = \sigma_i$ for $i = 1, 2, \dots, N$. However, for $l > N$ one should not interpret $\sigma_1\sigma_2\cdots\sigma_l$ as $\sigma_1\sigma_2\cdots\sigma_{l-N}$. Since $\sigma_i^2 = 1$, it is $\sigma_{l-N+1}\cdots\sigma_N$. Note that $\sigma_1\sigma_2\cdots\sigma_l\,\sigma(X + l) = \mathcal T^l\sigma(X)$, so we can write the above as
$$\phi_{X,k}(\sigma) = \sum_{l=1}^{2N} e^{ikl}\, \mathcal T^l\sigma(X), \tag{57}$$
from which it is clear that $\phi_{X,k}(\sigma)$ is an eigenfunction of $\mathcal T$ with eigenvalue $e^{-ik}$. These functions span the subspace of generalized momentum $k$, but they are not linearly independent. For some choices of $X$ and $k$, $\phi_{X,k}(\sigma)$ will be zero. We define the action of $\mathcal T$ on a set of sites by $\sigma(\mathcal T^l X) = \mathcal T^l\sigma(X)$. More explicitly, we have $\mathcal T X = \{1\} \triangle (X + 1)$. Then
$$\phi_{\mathcal T^t X,k}(\sigma) = \sum_{l=1}^{2N} e^{ikl}\, \mathcal T^l\sigma(\mathcal T^t X) = \sum_{l=1}^{2N} e^{ikl}\, \mathcal T^{t+l}\sigma(X) = \sum_{l=1}^{2N} e^{ik(l-t)}\, \mathcal T^l\sigma(X) = e^{-ikt}\phi_{X,k}(\sigma). \tag{58}$$
Hence, if two subsets of the lattice are related by a generalized translation then the corresponding functions are the same up to a multiplicative constant. If we define two sets $X$ and $Y$ to be equivalent if $X = \mathcal T^n Y$ for some $n$, then we can partition the subsets of $\Lambda$ into equivalence classes. Pick one set from each equivalence class and let $\mathcal X$ be the resulting collection of subsets of $\Lambda$. The $\phi_{X,k}$ will still span the subspace of generalized momentum $k$ if we only consider $X \in \mathcal X$. As we remarked before, the proof of the previous section that the Kirkwood-Thomas Eq. (19) has a solution works for odd $N$ just as for even $N$. We let $\Omega(\sigma)$ be the solution,
$$\Omega(\sigma) = \exp\Big[ -\frac{1}{2} \sum_{Y} g(Y)\sigma(Y) \Big]. \tag{59}$$
This is the ground state of the Hamiltonian in (42) for odd $N$, or equivalently of the Hamiltonian in (43) with all the $J_j = +1$. $\Omega(\sigma)$ is translationally invariant, so if $\psi_k(\sigma)$ has generalized momentum $k$, then $\psi_k(\sigma)/\Omega(\sigma)$ does too. Now suppose that for each $k$ we have an eigenstate $\psi_k(\sigma)$ with momentum $k$. Then $\psi_k(\sigma)$ can be written in the form
$$\psi_k(\sigma) = \Omega(\sigma) \sum_{X\in\mathcal X} c(X,k)\,\phi_{X,k}(\sigma) = \Omega(\sigma) \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\sigma_2\cdots\sigma_l \sum_{X\in\mathcal X} c(X,k)\ \sigma(X + l) \tag{60}$$


for some coefficients $c(X,k)$, which depend on $k$. Let us rewrite the expression for $\psi_k(\sigma)$ in a manner that makes the $k$-dependence more explicit: For each $X$ we can write $c(X,k)$ as a Fourier series
$$c(X,k) = \sum_{n=1}^{2N} e^{-ikn}\, e(X,n). \tag{61}$$
The coefficients $c(X,k)$ are functions of $k = \pi j/N$ with $j = 0, 1, 2, \dots, 2N-1$, and hence the sum on the RHS of (61) is over $2N$ values (rather than just $N$). Using (58) we have
$$\psi_k(\sigma) = \Omega(\sigma) \sum_{X\in\mathcal X} \sum_{n} e(X,n)\ \phi_{\mathcal T^n X,k} = \Omega(\sigma) \sum_{X} e(X)\ \phi_{X,k}, \tag{62}$$
where the coefficients $e(X)$ are defined by the equations
$$e(X) = \sum_{Y\in\mathcal X,\ n\,:\,\mathcal T^n Y = X} e(Y,n). \tag{63}$$
The wavefunction $\psi_k(\sigma)$ can now be written in the form
$$\psi_k(\sigma) = \Omega(\sigma) \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\sigma_2\cdots\sigma_l \sum_{X} e(X)\ \sigma(X + l). \tag{64}$$
Note that the $k$-dependence is now entirely contained in the factor $e^{ikl}$. We will abbreviate $\langle j, j+1\rangle \in \partial X$ by $j : X$ or $X : j$. Recall that $\sigma^x_j\sigma^x_{j+1}\sigma(X) = -\sigma(X)$ if $j : X$ and it equals $\sigma(X)$ otherwise. It easily follows that
$$J_j\,\sigma^x_j\sigma^x_{j+1}\ \sigma_1\sigma_2\cdots\sigma_l = s(j,l)\ \sigma_1\sigma_2\cdots\sigma_l, \tag{65}$$
where
$$s(j,l) = \begin{cases} +1 & \text{if } j \neq l \bmod N, \\ -1 & \text{if } j = l \bmod N. \end{cases} \tag{66}$$
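The sign rule (65)-(66) is a purely combinatorial statement about classical configurations, so it can be checked exhaustively for small odd N. In the sketch below (an illustration with hypothetical function names), $\sigma^x_j\sigma^x_{j+1}$ acts on a function of $\sigma$ by flipping the spins at sites $j$ and $j+1$, the product $\sigma_1\cdots\sigma_l$ is taken literally with periodic indices for $l$ up to $2N$, and $J_j = -1$ only for $j = N$.

```python
import itertools


def prefix_product(sigma, l):
    """sigma_1 * ... * sigma_l with periodic indices (l may exceed N)."""
    n = len(sigma)
    out = 1
    for i in range(l):
        out *= sigma[i % n]
    return out


def flip_bond(sigma, j):
    """Configuration with the spins at sites j and j+1 (1-based, periodic) flipped."""
    n = len(sigma)
    flipped = list(sigma)
    flipped[(j - 1) % n] *= -1
    flipped[j % n] *= -1
    return tuple(flipped)


def s(j, l, n):
    """Sign s(j, l) of Eq. (66)."""
    return -1 if (j - l) % n == 0 else 1


def check_sign_rule(n):
    """Check Eq. (65) for every configuration, bond j and length l = 1..2N."""
    for sigma in itertools.product((1, -1), repeat=n):
        for j in range(1, n + 1):
            coupling = -1 if j == n else 1  # J_j of Eq. (43)
            for l in range(1, 2 * n + 1):
                lhs = coupling * prefix_product(flip_bond(sigma, j), l)
                rhs = s(j, l, n) * prefix_product(sigma, l)
                if lhs != rhs:
                    return False
    return True


if __name__ == "__main__":
    print(check_sign_rule(3), check_sign_rule(5))
```

The twisted coupling $J_N = -1$ is exactly what compensates the sign picked up when the flipped bond straddles sites $N$ and 1, so that only the bond at the end of the string $\sigma_1\cdots\sigma_l$ contributes a minus sign.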

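Thus
$$(H\psi_k)(\sigma) = \Omega(\sigma) \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\sigma_2\cdots\sigma_l \Big\{ -\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big]\, s(j,l) \Big[ \sum_{X} e(X)\sigma(X+l) - 2 \sum_{X:j-l} e(X)\sigma(X+l) \Big] + \epsilon \sum_{j=1}^{N} \sigma_j\sigma_{j+1} \sum_{X} e(X)\sigma(X+l) \Big\}. \tag{67}$$
The above must equal $E_1(k)\psi_k(\sigma)$. Canceling the common factor of $\Omega(\sigma)$, the Schrödinger equation for the Hamiltonian $H$ [(43)] becomes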


$$\sum_{l=1}^{2N} e^{ikl}\, \sigma_1\sigma_2\cdots\sigma_l \Big\{ -\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big]\, s(j,l) \Big[ \sum_{X} e(X)\sigma(X+l) - 2 \sum_{X:j-l} e(X)\sigma(X+l) \Big] + \epsilon \sum_{j=1}^{N} \sigma_j\sigma_{j+1} \sum_{X} e(X)\sigma(X+l) - E_1(k) \sum_{X} e(X)\sigma(X+l) \Big\} = 0. \tag{68}$$
If Eq. (68) was of the form
$$\sum_{l=1}^{2N} e^{ikl} f(l,\sigma) = 0, \tag{69}$$
then we would have been able to conclude that $f(l,\sigma) = 0$ for all $l$. However, even though Eq. (68) resembles (69), the two equations are not quite identical in form. This is because $E_1(k)$ depends on $k$. To cast (68) in the form (69), we write $E_1(k)$ as a Fourier series in $k$. When $\epsilon = 0$, $E_1(k) - E_0(k) = 2$. So we write it as
$$E_1(k) = E_0(k) + 2 + \sum_{s=1}^{2N} e_s\, e^{-iks}. \tag{70}$$
Using the definition of $E_0(k)$ [Eq. (50)],
$$\begin{aligned}
\sum_{l=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_l\, E_1(k) \sum_{X} e(X)\sigma(X+l)
&= \Big( 2 + \frac{E_+ + E_-}{2} \Big) \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_l \sum_{X} e(X)\sigma(X+l) \\
&\quad + \sum_{l,s=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_{l+s}\ e_s \sum_{X} e(X)\sigma(X+s+l) \\
&\quad + \frac{E_+ - E_-}{2}\, e^{-ikN} \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_l \sum_{X} e(X)\sigma(X+l),
\end{aligned} \tag{71}$$
where we have made a change of variables $l \to l + s$. In the expression $\sigma_1\cdots\sigma_{l+s}$ the index $l + s$ can be as large as $4N$. For $i = 1, 2, \dots, N$, we interpret $\sigma_{i+N}$, $\sigma_{i+2N}$ and $\sigma_{i+3N}$ to all be $\sigma_i$. By making a change of variables $l \to l + N$, and using $\sigma_{l+1}\cdots\sigma_{l+N} = \sigma(\Lambda)$ and $\sigma(X + l + N) = \sigma(X + l)$, we rewrite the last term on the RHS of (71) as follows:
$$\frac{E_+ - E_-}{2}\, e^{-ikN} \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_l \sum_{X} e(X)\sigma(X+l) = \frac{E_+ - E_-}{2}\, \sigma(\Lambda) \sum_{l=1}^{2N} e^{ikl}\, \sigma_1\cdots\sigma_l \sum_{X} e(X)\sigma(X+l). \tag{72}$$


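If we use (71) in (68) the resulting equation is of the form (69). Hence, after canceling a common factor of $\sigma_1\sigma_2\cdots\sigma_l$, we conclude that
$$\begin{aligned}
&-\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big]\, s(j,l) \Big[ \sum_{X} e(X)\sigma(X+l) - 2 \sum_{X:j-l} e(X)\sigma(X+l) \Big] \\
&+ \epsilon \sum_{j=1}^{N} \sigma_j\sigma_{j+1} \sum_{X} e(X)\sigma(X+l) - \Big( 2 + \frac{E_+ + E_-}{2} \Big) \sum_{X} e(X)\sigma(X+l) \\
&- \sum_{s=1}^{2N} \sigma_{l+1}\cdots\sigma_{l+s}\ e_s \sum_{X} e(X)\sigma(X+s+l) - \frac{E_+ - E_-}{2}\, \sigma(\Lambda) \sum_{X} e(X)\sigma(X+l) = 0.
\end{aligned} \tag{73}$$
Recall that the coefficients $g(Y)$ satisfy Eq. (26):
$$-\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big] + \epsilon \sum_{j=1}^{N} \sigma_j\sigma_{j+1} = \frac{E_+ + E_-}{2} + \frac{E_+ - E_-}{2}\, \sigma(\Lambda). \tag{74}$$
Multiplying this equation by $\sum_X e(X)\sigma(X+l)$ and subtracting the result from (73),
$$\begin{aligned}
&\sum_{j=1}^{N} \exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big] \Big\{ (1 - s(j,l)) \sum_{X} e(X)\sigma(X+l) + 2 s(j,l) \sum_{X:j-l} e(X)\sigma(X+l) \Big\} \\
&- 2 \sum_{X} e(X)\sigma(X+l) - \sum_{s=1}^{2N} \sigma_{l+1}\cdots\sigma_{l+s}\ e_s \sum_{X} e(X)\sigma(X+s+l) = 0.
\end{aligned} \tag{75}$$
Defining $h(Y)$ by
$$\exp\Big[ \sum_{Y:N} g(Y)\sigma(Y) \Big] = 1 + \sum_{Y} h(Y)\sigma(Y) \tag{76}$$
we have
$$h(Y) = \sum_{n=1}^{\infty} \frac{1}{n!} \sum_{\substack{Y_1,\dots,Y_n:N, \\ Y_1 \triangle \cdots \triangle Y_n = Y}} g(Y_1)\cdots g(Y_n). \tag{77}$$
Using the translation invariance of the $g(Y)$,
$$\exp\Big[ \sum_{Y:j} g(Y)\sigma(Y) \Big] = \exp\Big[ \sum_{Y:N} g(Y+j)\sigma(Y+j) \Big] = \exp\Big[ \sum_{Y:N} g(Y)\sigma(Y+j) \Big] = 1 + \sum_{Y} h(Y)\sigma(Y+j). \tag{78}$$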


Inserting (78) in (75) we have
$$\begin{aligned}
&\sum_{j=1}^{N} \Big\{ (1 - s(j,l)) \sum_{X} e(X)\sigma(X+l) + 2 s(j,l) \sum_{X:j-l} e(X)\sigma(X+l) \Big\} \\
&+ \sum_{j=1}^{N} \sum_{Y} h(Y)\sigma(Y+j) \Big\{ (1 - s(j,l)) \sum_{X} e(X)\sigma(X+l) + 2 s(j,l) \sum_{X:j-l} e(X)\sigma(X+l) \Big\} \\
&- 2 \sum_{X} e(X)\sigma(X+l) - \sum_{s=1}^{2N} \sigma_{l+1}\cdots\sigma_{l+s}\ e_s \sum_{X} e(X)\sigma(X+s+l) = 0.
\end{aligned} \tag{79}$$
Equation (79) must hold for all $l$ and $\sigma$. The equations for different values of $l$ are in fact identical. To see this we make a change of variables $j \to j + l$ in the sums over $j$. Note that $s(j+l, l) = s(j, N)$. The resulting equation must hold for all configurations $\sigma$. Hence, we can also replace $\sigma$ by the configuration obtained by translating $\sigma$ by $l$ sites so that $\sigma(X + l)$ becomes $\sigma(X)$. The result of these two changes of variables is that, for each value of $l$, Eq. (79) reduces to the following equation, which is the $l = N$ case of Eq. (79):
$$\begin{aligned}
&\sum_{j=1}^{N} \Big\{ (1 - s(j,N)) \sum_{X} e(X)\sigma(X) + 2 s(j,N) \sum_{X:j} e(X)\sigma(X) \Big\} \\
&+ \sum_{j=1}^{N} \sum_{Y} h(Y)\sigma(Y+j) \Big\{ (1 - s(j,N)) \sum_{X} e(X)\sigma(X) + 2 s(j,N) \sum_{X:j} e(X)\sigma(X) \Big\} \\
&- 2 \sum_{X} e(X)\sigma(X) - \sum_{s=1}^{2N} \sigma_1\cdots\sigma_s\ e_s \sum_{X} e(X)\sigma(X+s) = 0.
\end{aligned} \tag{80}$$
Note that
$$\sum_{j} s(j,N) \sum_{X:j} e(X)\sigma(X) = \sum_{X} e(X)\sigma(X) \sum_{j:X} s(j,N) = \sum_{X} n(X)\, e(X)\sigma(X), \tag{81}$$
where we have defined
$$n(X) := \sum_{j:X} s(j,N), \tag{82}$$
and the sum is over $j$ such that $\langle j, j+1\rangle \in \partial X$. Note that $n(X)$ is either zero or an even integer. Moreover,
$$1 - s(j,N) = \begin{cases} 2 & \text{if } j = N, \\ 0 & \text{if } j \neq N. \end{cases} \tag{83}$$


Hence, Eq. (80) can be written as
$$\begin{aligned}
&2 \sum_{X} n(X)\, e(X)\sigma(X) + 2 \sum_{Y} h(Y)\sigma(Y) \sum_{X} e(X)\sigma(X) \\
&+ 2 \sum_{Y} \sum_{X} \sum_{j:X} h(Y)\sigma(Y+j)\ s(j,N)\ e(X)\sigma(X) - \sum_{s=1}^{2N} \sigma_1\cdots\sigma_s\ e_s \sum_{X} e(X)\sigma(X+s) = 0.
\end{aligned} \tag{84}$$
Recall that $j : X$ means that exactly one of the sites $j$ and $j+1$ is in $X$. Define $j :: X$ as follows: If $j \neq N$, $j :: X$ means the same as $j : X$. However, $N :: X$ means either both of the sites $N$ and 1 are in $X$ or both are not. This is a natural definition since the sites $j$ for which $j :: X$ are precisely the sites for which there is an interface between the sites $j$ and $j+1$. With this definition,
$$2 \sum_{Y} h(Y)\sigma(Y) \sum_{X} e(X)\sigma(X) + 2 \sum_{Y} \sum_{X} \sum_{j:X} h(Y)\sigma(Y+j)\ s(j,N)\ e(X)\sigma(X) = 2 \sum_{Y} \sum_{X} \sum_{j::X} h(Y)\sigma(Y+j)\ e(X)\sigma(X). \tag{85}$$
Since
$$\sigma_1\cdots\sigma_s\ \sigma(X+s) = \sigma(\mathcal T^s X), \tag{86}$$
the last term in (84) can be written as
$$\sum_{s=1}^{2N} e_s \sum_{X} e(X)\sigma(\mathcal T^s X) = \sum_{s=1}^{2N} e_s \sum_{X} e(\mathcal T^{-s} X)\sigma(X), \tag{87}$$
where the equality follows by a change of variables in the sum. (Since $\mathcal T^{2N} = 1$, $\mathcal T^{-s} = \mathcal T^{2N-s}$.) Thus (84) holds for all configurations $\sigma$ if and only if for all $X$,
$$2 n(X)\, e(X) + 2 \sum_{Y,Z,\,j::Z} h(Y)\, e(Z)\ 1\big( (Y+j) \triangle Z = X \big) = \sum_{s=1}^{2N} e_s\, e(\mathcal T^{-s} X). \tag{88}$$
The integer $n(X)$ is zero for sets of the form $X = \{1, 2, \dots, s\}$ and $X = \{s+1, s+2, \dots, N\}$. These are the sets $\mathcal T^m \emptyset$, where $m = 0, 1, \dots, 2N-1$. Let us assume that $e(X) = 0$ for all $X$ for which $n(X) = 0$, except for $X = \emptyset$ for which it is equal to unity. This is essentially a normalization condition. (A priori there is no reason that a solution with these properties must exist, but we will show that it does.) With this assumption, if $X = \mathcal T^m \emptyset$ then
$$\sum_{s=1}^{2N} e_s\, e(\mathcal T^{-s} X) = e_m. \tag{89}$$
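The statement that the sets with $n(X) = 0$ are exactly the $2N$ sets $\mathcal T^m\emptyset$ can be verified by enumeration for small odd N. The sketch below is illustrative only (hypothetical function names; sets of sites are Python frozensets over 1..N): it uses $\mathcal T X = \{1\} \triangle (X+1)$ and the definition (82) of $n(X)$.

```python
import itertools


def gen_translate(x, n):
    """Generalized translation T X = {1} symmetric-difference (X + 1), sites 1..n periodic."""
    shifted = frozenset(i % n + 1 for i in x)
    return shifted ^ frozenset({1})


def n_of(x, n):
    """n(X) of Eq. (82): sum of s(j, N) over bonds <j, j+1> in the boundary of X."""
    total = 0
    for j in range(1, n + 1):
        nxt = j % n + 1
        if (j in x) != (nxt in x):  # exactly one endpoint in X, i.e. j : X
            total += -1 if j == n else 1  # s(j, N) = -1 only for j = N
    return total


def orbit_of_empty(n):
    """Return the list [T^m(empty set) for m = 0..2N-1] and the set T^(2N)(empty set)."""
    x = frozenset()
    orbit = []
    for _ in range(2 * n):
        orbit.append(x)
        x = gen_translate(x, n)
    return orbit, x


def zero_sets(n):
    """All subsets X of {1..n} with n(X) = 0."""
    sites = range(1, n + 1)
    subsets = (frozenset(c) for r in range(n + 1)
               for c in itertools.combinations(sites, r))
    return {x for x in subsets if n_of(x, n) == 0}


if __name__ == "__main__":
    n = 5
    orbit, back = orbit_of_empty(n)
    print("orbit returns to the empty set:", back == frozenset())
    print("orbit equals the zero sets of n(X):", set(orbit) == zero_sets(n))
```

In this picture $\mathcal T^m\emptyset$ is the configuration with no interface other than the one forced by the twisted bond, translated $m$ steps, which is why these sets play the role of the "vacuum" for the interface.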

Thus Eq. (88) gives em = 2

Y,Z,j ::Z

h(Y )e(Z)1((Y + j ) Z = Tm ∅).

(90)
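The combinatorial statements about n(X) used here (n(X) is zero or an even integer, and n(X) = 0 exactly for the 2N interval sets), together with the counting relation |{j : j :: Z}| = n(Z) + 1 stated as (104) below, can be checked by brute force on a small ring. The following is a minimal sketch, assuming that s(j, N) is −1 for j = N and +1 otherwise (as read off from (83); the definition (66) is not reproduced in this section) and that sites are periodic, N + 1 ≡ 1. The function names are illustrative only.

```python
from itertools import combinations

def colon(j, X, N):
    """j : X, i.e. exactly one of the sites j and j+1 lies in X."""
    k = j % N + 1  # site j+1 with periodic wrap-around (N+1 -> 1)
    return (j in X) != (k in X)

def double_colon(j, X, N):
    """j :: X, same as j : X for j != N; for j = N, both or neither of N, 1 lie in X."""
    return colon(j, X, N) if j != N else not colon(N, X, N)

def s(j, N):
    """Sign s(j, N) inferred from (83): -1 on the bond between N and 1, +1 elsewhere."""
    return -1 if j == N else 1

def n(X, N):
    """n(X) of (82): the sum of s(j, N) over all j with j : X."""
    return sum(s(j, N) for j in range(1, N + 1) if colon(j, X, N))

def check(N):
    """Verify the three claims for every subset of {1,...,N}; return the number of sets with n = 0."""
    zero_sets = set()
    for r in range(N + 1):
        for c in combinations(range(1, N + 1), r):
            X = frozenset(c)
            nx = n(X, N)
            assert nx >= 0 and nx % 2 == 0           # zero or an even integer
            count = sum(1 for j in range(1, N + 1) if double_colon(j, X, N))
            assert count == nx + 1                   # relation (104)
            if nx == 0:
                zero_sets.add(X)
    # the interval sets {1,...,m} and {m+1,...,N}, including the empty and the full set
    expected = {frozenset(range(1, m + 1)) for m in range(N + 1)}
    expected |= {frozenset(range(m + 1, N + 1)) for m in range(N + 1)}
    assert zero_sets == expected
    return len(zero_sets)

print(check(5), check(6))
```

For each N the number of sets with n(X) = 0 equals 2N, matching the labelling of these sets by m = 0, 1, ..., 2N − 1.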

Interface Instability in the Antiferromagnetic XXZ Chain

495

For X for which $n(X) \neq 0$, we obtain the relation
$$e(X) = \frac{1}{2n(X)}\Big[-2\sum_{Y,Z,\,j::Z} h(Y)e(Z)\,1\big((Y+j)\,\triangle\,Z = X\big) + \sum_{s=1}^{2N} e_s\,e(T^{-s}X)\Big]. \tag{91}$$
We will show that these equations [(90) and (91)] have a solution by writing them as a fixed point equation. Consider the set of variables
$$e := \{e(X) : n(X) \neq 0\} \cup \{e_s : s = 1, 2, \ldots, 2N\}. \tag{92}$$
Equations (90) and (91) form a fixed point equation for e,
$$F(e) = e. \tag{93}$$
Let us introduce the norm
$$\|e\| := \sum_{l=1}^{2N} |e_l|\,(|\epsilon|M)^{-w_N(T^l\emptyset)} + 2\sum_{X:\,n(X)\neq 0} |e(X)|\,n(X)\,(|\epsilon|M)^{-w_N(X)}, \tag{94}$$

where $w_N(X)$ is the number of bonds in the smallest set of bonds which contains X and intersects the bond ⟨N, 1⟩ and which is connected in the sense used in the previous section to define w(X) [see the discussion after (27)]. The factor of 2 in the norm is included merely for later convenience. We prove that the fixed point equation for e has a solution by using the contraction mapping theorem as we did in the previous section. We must show that there is a δ′ > 0 such that
$$\|F(e) - F(\tilde e)\| \le \tfrac12\,\|e - \tilde e\| \quad \text{for } \|e\|, \|\tilde e\| \le \delta'; \tag{95}$$
$$\|F(e)\| \le \delta' \quad \text{for } \|e\| \le \delta'. \tag{96}$$
To verify (95), we use (90) and (91) to see that
$$\|F(e) - F(\tilde e)\| \le 2\sum_Y |h(Y)| \sum_Z \sum_{j::Z} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N((Y+j)\triangle Z)} + \sum_s \sum_{X:\,n(X)\neq 0} |e_s e(T^{-s}X) - \tilde e_s \tilde e(T^{-s}X)|\,(|\epsilon|M)^{-w_N(X)}. \tag{97}$$
To continue we need the following two inequalities:
$$w_N((Y+j)\,\triangle\,Z) \le w_N(Y) + w_N(Z) \quad \text{for } j :: Z, \tag{98}$$
$$w_N(X) \le w_N(T^{-s}X) + w_N(T^s\emptyset). \tag{99}$$
The inequality (99) can equivalently be written as
$$w_N(T^sX) \le w_N(X) + w_N(T^s\emptyset). \tag{100}$$

In the following proofs of these inequalities, “a connected set of bonds” will always mean connected in the sense used to define $w_N(X)$. To prove (98), let A and B be connected sets of bonds which contain Y and Z respectively, both of which intersect the bond ⟨N, 1⟩, and such that $w_N(Y) = |A|$, $w_N(Z) = |B|$. We consider the cases of j = N and $j \neq N$ separately. First let j = N. Then A ∪ B is a connected set of bonds which contains $(Y+j)\,\triangle\,Z = Y\,\triangle\,Z$ and intersects the bond ⟨N, 1⟩. So
$$w_N((Y+j)\,\triangle\,Z) \le |A \cup B| \le |A| + |B| = w_N(Y) + w_N(Z). \tag{101}$$
Now suppose $j \neq N$. Then j :: Z means that either j or j + 1 is in Z and so is in B. Since ⟨N, 1⟩ intersects A, the set A + j contains at least one of the sites j and j + 1. Thus (A + j) ∪ B is a connected set of bonds. It contains $(Y+j)\,\triangle\,Z$ and intersects the bond ⟨N, 1⟩. So
$$w_N((Y+j)\,\triangle\,Z) \le |(A+j) \cup B| \le |A+j| + |B| = w_N(Y) + w_N(Z). \tag{102}$$
This proves (98). The inequality (100) is a special case of (98). To see this, note that
$$T^sX = (X+s)\,\triangle\,T^s\emptyset, \tag{103}$$
so if we take Y = X, $Z = T^s\emptyset$ and j = s, then (98) becomes (100). (It is easy to check that $s :: T^s\emptyset$ for all s.) We will also need the relation
$$|\{j : j :: Z\}| = n(Z) + 1. \tag{104}$$
Recalling the definition of n(Z) [(82)], and of s(j, N) [(66)],
$$1 + n(Z) = 1 + \sum_{j:Z} s(j,N) = 1 - 1(N : Z) + \sum_{j:Z,\ j\neq N} 1, \tag{105}$$

where 1(·) denotes an indicator function. Now j : Z and j :: Z are equivalent if $j \neq N$. Moreover, N :: Z holds if and only if N : Z does not hold. So (1 − 1(N : Z)) = 1(N :: Z). This proves (104). Using (98) and (104), the first term in (97) is
$$\le 2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)} \sum_Z \sum_{j::Z} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)} = 2\sum_Y \sum_Z [n(Z)+1]\,|h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\,|e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)}. \tag{106}$$
If n(Z) = 0 then either both e(Z) and $\tilde e(Z)$ are 0, or both are 1. So $|e(Z) - \tilde e(Z)| = 0$ when n(Z) = 0. Thus we can bound (n(Z) + 1) by 2n(Z) on the RHS of (106). Hence,
$$\text{RHS of (106)} \le 2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\,\|e - \tilde e\|. \tag{107}$$
Using (99) and the triangle inequality in the form
$$|e_s e(T^{-s}X) - \tilde e_s \tilde e(T^{-s}X)| \le |e_s|\,|e(T^{-s}X) - \tilde e(T^{-s}X)| + |e_s - \tilde e_s|\,|\tilde e(T^{-s}X)|, \tag{108}$$
the second term in (97) is bounded by
$$\sum_s \sum_{X:\,n(X)\neq 0} |e_s|\,(|\epsilon|M)^{-w_N(T^s\emptyset)}\,|e(T^{-s}X) - \tilde e(T^{-s}X)|\,(|\epsilon|M)^{-w_N(T^{-s}X)} + \sum_s \sum_{X:\,n(X)\neq 0} |e_s - \tilde e_s|\,(|\epsilon|M)^{-w_N(T^s\emptyset)}\,|\tilde e(T^{-s}X)|\,(|\epsilon|M)^{-w_N(T^{-s}X)}$$
$$\le \tfrac12\big(\|e\|\,\|e - \tilde e\| + \|e - \tilde e\|\,\|\tilde e\|\big) \le \delta'\,\|e - \tilde e\|, \tag{109}$$
since $\|e\|$ and $\|\tilde e\|$ are no greater than δ′.


Using the above inequalities (107) and (109), we have
$$\|F(e) - F(\tilde e)\| \le \Big[2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)} + \delta'\Big]\,\|e - \tilde e\|. \tag{110}$$
It is easily shown that
$$w_N(X_1\,\triangle\cdots\triangle\,X_n) \le \sum_{k=1}^n w_N(X_k). \tag{111}$$
So using (77)
$$\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)} \le \sum_{n=1}^\infty \frac{1}{n!} \sum_{Y_1:N,\ldots,Y_n:N}\ \prod_{k=1}^n (|\epsilon|M)^{-w_N(Y_k)}\,|g(Y_k)|. \tag{112}$$
The constraint $Y_k : N$ implies that $Y_k$ intersects the bond ⟨N, 1⟩, and so $w_N(Y_k) = w(Y_k)$. Hence,
$$\text{RHS of (112)} = \sum_{n=1}^\infty \frac{1}{n!} \sum_{Y_1:N,\ldots,Y_n:N}\ \prod_{k=1}^n (|\epsilon|M)^{-w(Y_k)}\,|g(Y_k)| = e^{\|g\|} - 1 \le e^{\delta} - 1. \tag{113}$$
The last inequality follows from Theorem 1. So
$$\|F(e) - F(\tilde e)\| \le K\,\|e - \tilde e\|, \tag{114}$$
where
$$K = 2(e^{\delta} - 1) + \delta'. \tag{115}$$
If δ and δ′ are small enough, then K ≤ 1/2. To prove (96), we use (90) and (91) to compute F(0). Note that e = 0 means that $e_s = 0$ for all 1 ≤ s ≤ 2N, and e(X) = 0 for all X except X = ∅. We always have e(∅) = 1. Writing $F(0)_m$ and $F(0)(X)$ for the components of F(0), we have
$$F(0)_m = 2h(T^m\emptyset), \tag{116}$$
and for X with $n(X) \neq 0$,
$$F(0)(X) = \frac{1}{2n(X)}\,[-2h(X)]. \tag{117}$$
Thus
$$\|F(0)\| \le 2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)} \le 2(e^{\delta} - 1). \tag{118}$$
If we decrease δ, then K decreases. Hence, we can assume δ to be small enough so that $2(e^{\delta} - 1) < \delta'/2$. So
$$\|F(e)\| \le \|F(e) - F(0)\| + \|F(0)\| \le \tfrac12\|e\| + 2(e^{\delta} - 1) \le \delta', \tag{119}$$

since $\|e\| \le \delta'$. This finishes the proof that the fixed point equation has a solution and thus completes the construction of eigenstates of H with generalized momentum k. When $\epsilon = 0$ these states are the lowest eigenstates in the subspaces of generalized momentum k for $k \neq 0$, and the next to lowest for k = 0. The same sort of argument that was used in Sect. 2 proves that this is true for all $\epsilon$ such that $|\epsilon| < \epsilon_0$, for some $\epsilon_0 > 0$. We refer the reader to Sect. 3 of [10] for a completely analogous argument.

We now consider the convergence of the N → ∞ limit. We start by asking how the volume Λ enters the ground state fixed point equation (22). The sets $X_i$ in this equation are subsets of Λ and the definition of nearest neighbor for the term $1_{nn}(X)$ depends on Λ. The solution g of Eq. (22) will depend on Λ, and so we denote it by $g_\Lambda$. However, we can consider this equation for the infinite lattice $\mathbb{Z}^d$. This means that the sets can be any finite subset of $\mathbb{Z}^d$, and nearest neighbor is defined in the usual way for $\mathbb{Z}^d$. The proof of the ground state section shows that this infinite volume fixed point equation has a solution, which we denote by $g_\infty$. One can prove that $g_\Lambda$ converges to $g_\infty$ in an appropriate sense by showing that $g_\Lambda$ is an approximate solution of the fixed point equation that defines $g_\infty$. We refer the reader to [10] for details. The fixed point equations, (90) and (91), of this section can also be defined for the infinite lattice $\mathbb{Z}^d$, and the fixed point argument of this section proves that they have a solution. This solution includes the Fourier coefficients $\varepsilon_s$, so in this way the coefficients $\varepsilon_s$ of Theorem 2 are defined. The convergence of $E_1^{N+1}(k) - E_0^N(k)$ to $\sum_s \varepsilon_s e^{iks}$ can be proved by the methods of [10] as well.

The last step in the proof is to show that $e_2$ and $e_{-2} = e_{2N-2}$ are not zero in the infinite length limit. We start with (22) to compute g to first order in $\epsilon$. At first order in $\epsilon$ the only nonzero coefficients g(X) are for sets X which consist of a pair of adjacent sites. In this case $g(X) = \epsilon/2 + O(\epsilon^2)$. By (77), the only Y for which h(Y) is nonzero at first order in $\epsilon$ is a set of nearest neighbor sites satisfying N : Y. There are two such sets, {1, 2} and {N − 1, N}. They have $h(Y) = \epsilon/2 + O(\epsilon^2)$. Now consider Eq. (90). h(Y) is always at least first order in $\epsilon$, but there is one Z for which e(Z) is zeroth order in $\epsilon$, namely, e(∅) = 1. For this Z the only j satisfying j :: Z is j = N. Thus the first order contribution to $e_m$ is of the form
$$2\sum_Y h(Y)\,1(Y = T^m\emptyset). \tag{120}$$
The sets {1, 2} and {N − 1, N} are of the form $T^m\emptyset$ for m = 2 and m = 2N − 2, respectively. Thus
$$e_2 = e_{2N-2} = \epsilon + O(\epsilon^2). \tag{121}$$
This proves (55) of Theorem 2.

4. Antiferromagnetic XXZ Chain

In this section we study the antiferromagnetic XXZ model whose Hamiltonian on the 1-dimensional lattice Λ = {1, 2, ..., N} is
$$\widetilde H = \sum_{j=1}^N \sigma^z_j\sigma^z_{j+1} + \epsilon\sum_{j=1}^N \big(\sigma^x_j\sigma^x_{j+1} + \sigma^y_j\sigma^y_{j+1}\big). \tag{122}$$
Using $\sigma^y = i\sigma^x\sigma^z$ we have
$$\widetilde H = \sum_{j=1}^N \sigma^z_j\sigma^z_{j+1} + \epsilon\sum_{j=1}^N \sigma^x_j\sigma^x_{j+1}\big(1 - \sigma^z_j\sigma^z_{j+1}\big). \tag{123}$$


As before, consider a unitary operator that causes a rotation about the Y-axis in spin space:
$$R := \exp\Big(i\,\frac{\pi}{4}\sum_{j\in\Lambda}\sigma^y_j\Big),$$
so that
$$R\widetilde H R^{-1} = \sum_{j=1}^N \sigma^x_j\sigma^x_{j+1} + \epsilon\sum_{j=1}^N \sigma^z_j\sigma^z_{j+1}\big(1 - \sigma^x_j\sigma^x_{j+1}\big). \tag{124}$$
For the antiferromagnet we proceed as in the previous section and use the unitary transformation U [Eq. (7)]:
$$H = UR\widetilde H R^{-1}U^{-1} = -\sum_{j=1}^N J_j\,\sigma^x_j\sigma^x_{j+1} + \epsilon\sum_{j=1}^N \sigma^z_j\sigma^z_{j+1}\big(1 + J_j\,\sigma^x_j\sigma^x_{j+1}\big), \tag{125}$$
where $J_j = 1$ for $j \neq N$ and $J_N$ is 1 when N is even and −1 when N is odd. $\widetilde H$ is translation invariant and commutes with the global spin flip operator P [(3)]. So H commutes with T [(46)], as it did in the previous section. When N is even (and so $J_j = +1\ \forall j$), the ground state wave function

$$\Psi(\sigma) = \exp\Big(-\frac12\sum_Y g(Y)\sigma(Y)\Big), \tag{126}$$
must satisfy
$$-\sum_j \exp\Big(\sum_{X:j} g(X)\sigma(X)\Big) + \epsilon\sum_j \sigma_j\sigma_{j+1}\Big[1 + \exp\Big(\sum_{X:j} g(X)\sigma(X)\Big)\Big] = \frac{E_+ + E_-}{2} + \frac{E_+ - E_-}{2}\,\sigma(\Lambda), \tag{127}$$
where X : j means that exactly one of the sites j and j + 1 is in X. Theorem 1 of Sect. 2 holds for this model. We omit the proof since it is analogous to the proof in Sect. 2. Since the dimension d = 1, we choose $\delta = 4M^{-1}$ as given by (28). To study interfaces in this model we take N to be odd. So $J_N = -1$. We recall that in this case, T = σ1z T [(47)]. As before we look for a solution of the form
$$\psi_k(\sigma) = \Psi(\sigma)\sum_{l=1}^{2N} e^{ikl}\,\sigma_1\sigma_2\cdots\sigma_l \sum_X e(X)\,\sigma(X+l). \tag{128}$$
The Schrödinger equation $(H\psi_k)(\sigma) = E_1(k)\psi_k(\sigma)$ becomes (after canceling a common factor of Ψ(σ))

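$$\sum_{l=1}^{2N} e^{ikl}\,\sigma_1\sigma_2\cdots\sigma_l\,\Big[-\sum_{j=1}^N \exp\Big(\sum_{Y:j} g(Y)\sigma(Y)\Big)\,s(j,l)\,\Big(\sum_X e(X)\sigma(X+l) - 2\sum_{X:j-l} e(X)\sigma(X+l)\Big) + \epsilon\sum_{j=1}^N \sigma_j\sigma_{j+1}\sum_X e(X)\sigma(X+l)$$
$$\qquad + \epsilon\sum_{j=1}^N \sigma_j\sigma_{j+1}\exp\Big(\sum_{Y:j} g(Y)\sigma(Y)\Big)\,s(j,l)\,\Big(\sum_X e(X)\sigma(X+l) - 2\sum_{X:j-l} e(X)\sigma(X+l)\Big) - E_1(k)\sum_X e(X)\sigma(X+l)\Big] = 0. \tag{129}$$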

This is the analog of (68). We now proceed by analogy with the derivation of (90) and (91) from (68). This leads to the equation: 2

n(X) e(X)σ (X) − 2σN σ1

X N

σj σj +1 s(j, N)

j =1

N

h(Y ) σ (Y )

e(X)σ (X) −

σj σj +1

N

2N

σ1 · · · σs e s

s=1

e(X) σ (X)+2

N

j =1

e(X)σ (X + s)

X

h(Y ) σ (Y + j ) s(j, N )

e(X)σ (X)

X:j

h(Y )σ (Y + j ) (s(j, N ) − 1)

Y

σj σj +1

j =1 Y

X

j =1

− 2

X:j

Y

+

e(X)σ (X)

X

−2 +2

e(X)σ (X)

X

h(Y ) σ (Y + j ) s(j, N)

Y

e(X)σ (X) = 0.

(130)

X:j

Equation (130) yields the following equations which are the analogs of (90) and (91). For X for which $n(X) \neq 0$ we have
$$e(X) = \frac{1}{2n(X)}\Big[2\epsilon\sum_Z\sum_{j::Z} e(Z)\,1\big(Z\,\triangle\,\{j,j+1\} = X\big) + \sum_{s=1}^{2N} e_s\,e(T^{-s}X) - 2\sum_{Y,Z}\sum_{j::Z} h(Y)e(Z)\,1\big((Y+j)\,\triangle\,Z = X\big) + 2\epsilon\sum_{Y,Z}\sum_{j::Z} h(Y)e(Z)\,1\big(Z\,\triangle\,(Y+j)\,\triangle\,\{j,j+1\} = X\big)\Big]. \tag{131}$$
Recall that j :: Z means that exactly one of j and j + 1 is in Z if $j \neq N$, and N :: Z means that either both N and 1 are in Z or neither of them is. For $X = T^m\emptyset$ we have
$$e_m = 2\epsilon\sum_Z\sum_{j::Z} e(Z)\,1\big(Z\,\triangle\,\{j,j+1\} = T^m\emptyset\big) + 2\epsilon\sum_{Y,Z}\sum_{j::Z} h(Y)e(Z)\,1\big(Z\,\triangle\,(Y+j)\,\triangle\,\{j,j+1\} = T^m\emptyset\big) - 2\sum_{Y,Z}\sum_{j::Z} h(Y)e(Z)\,1\big((Y+j)\,\triangle\,Z = T^m\emptyset\big). \tag{132}$$

Recall that n(X) = 0 if and only if X is of the form $T^m(\emptyset)$ for some integer m. As in the previous section, we assume e(∅) = 1 and $e(T^m(\emptyset)) = 0$ for $m \neq 0$. We let e denote the same collection of variables as in the previous section and continue to use the norm (94). Equations (131) and (132) form a fixed point equation which can be written as F(e) = e. (Of course, the function F is different from the F of the previous section.) We prove there is a solution to the fixed point equation by proving (95) and (96). To prove (95) we use (131) and (132) to see that
$$\|F(e) - F(\tilde e)\| \le 2|\epsilon|\sum_Z\sum_{j::Z} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z\triangle\{j,j+1\})} + \sum_s\sum_{X:\,n(X)\neq 0} |e_s e(T^{-s}X) - \tilde e_s\tilde e(T^{-s}X)|\,(|\epsilon|M)^{-w_N(X)}$$
$$\qquad + 2|\epsilon|\sum_Y |h(Y)|\sum_Z\sum_{j::Z} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z\triangle(Y+j)\triangle\{j,j+1\})} + 2\sum_Y |h(Y)|\sum_Z\sum_{j::Z} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z\triangle(Y+j))}$$
$$=: (a1) + (a2) + (a3) + (a4). \tag{133}$$
We proved the following inequalities [(98) and (99)] in the previous section:
$$w_N((Y+j)\,\triangle\,Z) \le w_N(Y) + w_N(Z) \quad \text{for } j :: Z, \tag{134}$$
$$w_N(X) \le w_N(T^{-s}X) + w_N(T^s\emptyset). \tag{135}$$
In addition, we need the following two inequalities:
$$w_N((Y+j)\,\triangle\,Z\,\triangle\,\{j,j+1\}) \le w_N(Y) + w_N(Z) + 1 \quad \text{for } j :: Z, \tag{136}$$
$$w_N(Z\,\triangle\,\{j,j+1\}) \le w_N(Z) + 1. \tag{137}$$
Inequality (136) can be proved with two applications of (134) as follows:
$$w_N((Y+j)\,\triangle\,Z\,\triangle\,\{j,j+1\}) = w_N\big([(Y\,\triangle\,\{N,1\}) + j]\,\triangle\,Z\big) \le w_N(Y\,\triangle\,\{N,1\}) + w_N(Z) \le w_N(Y) + 1 + w_N(Z). \tag{138}$$
Similarly, inequality (137) follows from (134).


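Using inequality (137) and Eq. (104) we obtain
$$(a1) \le 2|\epsilon|\sum_Z |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)}\,(|\epsilon|M)^{-1}\sum_{j:\,j::Z} 1 = 2M^{-1}\sum_{Z:\,n(Z)\neq 0} (n(Z)+1)\,|e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)}. \tag{139}$$
We have added the constraint $n(Z) \neq 0$ on the sum because $|e(Z) - \tilde e(Z)| = 0$ for n(Z) = 0. Hence we can bound (n(Z) + 1) in the above sum by 2n(Z). This yields
$$(a1) \le 2M^{-1}\,\|e - \tilde e\|. \tag{140}$$
Using the triangle inequality,
$$|e_s e(T^{-s}X) - \tilde e_s\tilde e(T^{-s}X)| \le |e_s|\,|e(T^{-s}X) - \tilde e(T^{-s}X)| + |e_s - \tilde e_s|\,|\tilde e(T^{-s}X)|,$$
and (135) we get
$$(a2) \le \tfrac12\|e\|\,\|e - \tilde e\| + \tfrac12\|e - \tilde e\|\,\|\tilde e\|. \tag{141}$$
Using (136) and (104) we get
$$(a3) \le 2|\epsilon|\,(|\epsilon|M)^{-1}\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\sum_Z (n(Z)+1)\,|e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)} \le 2M^{-1}\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\,\|e - \tilde e\| \le 2M^{-1}\big(e^{\delta} - 1\big)\,\|e - \tilde e\|, \tag{142}$$
where we have used (112)–(113). Similarly we get
$$(a4) \le 2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\sum_Z (n(Z)+1)\,|e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-w_N(Z)} \le 2\sum_Y |h(Y)|\,(|\epsilon|M)^{-w_N(Y)}\,\|e - \tilde e\| \le 2\big(e^{\delta} - 1\big)\,\|e - \tilde e\|. \tag{143}$$
From (140), (141), (142) and (143) we obtain
$$\|F(e) - F(\tilde e)\| \le K\,\|e - \tilde e\|, \tag{144}$$
where
$$K = \delta' + (1 + 2M^{-1})(e^{\delta} - 1) + 2M^{-1} = \delta' + \Big(1 + \frac{\delta}{2}\Big)(e^{\delta} - 1) + \frac{\delta}{2}, \tag{145}$$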


since we have chosen $\delta = 4M^{-1}$. Hence, if δ and δ′ are small enough then K ≤ 1/2. To prove (96) we use (132) and (131) to compute F(0). Note that e = 0 means that $e_s = 0$ for all s, and e(X) = 0 for all X except X = ∅. We always have e(∅) = 1. Writing $F(0)_m$ and $F(0)(X)$ for the components of F(0), we have
$$F(0)_m = 2\Big[\epsilon\sum_Y h(Y)\,1\big(Y\,\triangle\,\{N,1\} = T^m\emptyset\big) - \sum_Y h(Y)\,1\big(Y = T^m\emptyset\big)\Big]. \tag{146}$$
For X with $n(X) \neq 0$,
$$F(0)(X) = \frac{1}{2n(X)}\Big[2\epsilon\,1\big(X = \{N,1\}\big) - 2h(X) + 2\epsilon\,h\big(X\,\triangle\,\{N,1\}\big)\Big]. \tag{147}$$
Thus
$$\|F(0)\| \le 2M^{-1} + 2(e^{\delta} - 1) + 2M^{-1}(e^{\delta} - 1). \tag{148}$$
If we decrease δ, then K decreases. So we can assume that δ is small enough that $2(e^{\delta} - 1) + \delta e^{\delta}/2 < \delta'/2$. So
$$\|F(e)\| \le \|F(e) - F(0)\| + \|F(0)\| \le \frac12\delta' + \frac{\delta}{2} + 2(e^{\delta} - 1) + \frac{\delta}{2}(e^{\delta} - 1) \le \delta' \tag{149}$$
since $\|e\| \le \delta'$. This finishes the proof that the fixed point equation (93) has a solution and thus completes the construction of eigenstates of H with generalized momentum k. When $\epsilon = 0$ these states are the lowest eigenstates in the subspaces of generalized momentum k for $k \neq 0$, and the next to lowest for k = 0. The same argument that we used in Sect. 2 proves that this is true for small $\epsilon$. As in Sect. 3, we can explicitly compute the lowest order term in the dispersion relation for the interface and see that it is not zero. So the dispersion relation depends on k, indicating that the ground state does not correspond to a stable interface.

5. Ferromagnetic XXZ Chain

In this section we will prove that the ground state of the ferromagnetic chain has a stable interface at zero temperature by showing that, for $s \neq 0$, the Fourier coefficients $e_s^N$ for the dispersion relation vanish in the limit N → ∞. Thus, in the infinite length limit the dispersion relation is flat, i.e., independent of the generalized momentum k. As discussed in the Introduction, the zero-temperature stability of the interface for the ferromagnet has been proven before. The point of this section is to show that this result can also be obtained by our methods. We will construct the wave function for ground states with an interface in them just as we did for the antiferromagnet. However, we will use very different weights in the norm. The weight for the terms $e_s^N$ will be exponentially large in N for $s \neq 0$. So the existence of a fixed point in this norm will prove that $e_s^N$ goes to zero exponentially fast as N goes to infinity. The weights we use for the norm are based on considerations of how many applications of terms in the Hamiltonian it takes to get between various states. So we begin by studying the action of the Hamiltonian. A ferromagnetic XXZ chain of N sites is governed by the Hamiltonian
$$\widetilde H = -\sum_{j=1}^N \sigma^z_j\sigma^z_{j+1} - \epsilon\sum_{j=1}^N \sigma^x_j\sigma^x_{j+1}\big(1 - \sigma^z_j\sigma^z_{j+1}\big) \tag{150}$$


(which is the ferromagnetic analog of (123)). However, unlike the antiferromagnetic case, we cannot force an interface into such a chain by considering N to be odd and imposing periodic boundary conditions. So instead, to induce an interface we change the coupling between the sites N and 1 as follows: We write the Hamiltonian in the form
$$\widetilde H = -\sum_{j=1}^N J_j\,\sigma^z_j\sigma^z_{j+1} - \epsilon\sum_{j=1}^N \sigma^x_j\sigma^x_{j+1}\big(1 - J_j\,\sigma^z_j\sigma^z_{j+1}\big). \tag{151}$$
If $J_j = 1$ for all j then (151) reduces to (150). Such a Hamiltonian has two translation-invariant ground states, with all spins up and all spins down, respectively. However, the choice $J_N = -1$ and $J_j = 1$ for all $j \neq N$ induces an interface into the chain by causing at least one bond in the chain to be frustrated. Moreover, this particular choice of coupling yields a unitarily equivalent Hamiltonian H [(152) below] which commutes with the generalized translation operator T [(47)]. Hence, it allows us to exploit this symmetry to study the interface states, as in the case of the antiferromagnetic chain. As before, we take R to be the rotation operator defined by (4). Hence,
$$H := R\widetilde H R^{-1} = -\sum_{j=1}^N J_j\,\sigma^x_j\sigma^x_{j+1} - \epsilon\sum_{j=1}^N \sigma^z_j\sigma^z_{j+1}\big(1 - J_j\,\sigma^x_j\sigma^x_{j+1}\big). \tag{152}$$

The original Hamiltonian $\widetilde H$ [(151)] is not translation invariant when $J_N = -1$. Nonetheless, our choice of boundary conditions for the original Hamiltonian is such that the transformed Hamiltonian H [(152)] commutes with T, as is easily checked. (Note that for the ferromagnetic chain we do not use the unitary transformation U.) Recall that $\mathcal{H} = (\mathbb{C}^2)^{\otimes|\Lambda|}$ is the Hilbert space of the lattice. In (152) the indices should be taken to be periodic, e.g., $\sigma^x_{N+1}$ means $\sigma^x_1$. We can write the Hamiltonian as
$$H = H_0 + H_1, \tag{153}$$
where
$$H_0 := -\sum_{j=1}^N J_j\,\sigma^x_j\sigma^x_{j+1}, \tag{154}$$
and
$$H_1 := -\epsilon\sum_{j=1}^N \sigma^z_j\sigma^z_{j+1}\big(1 - J_j\,\sigma^x_j\sigma^x_{j+1}\big). \tag{155}$$
For any X ⊂ Λ let $|X\rangle \in \mathcal{H}$ be given by
$$|X\rangle = \sum_\sigma \sigma(X)\,|\sigma\rangle, \tag{156}$$
where $\sigma(X) = \prod_{j\in X}\sigma_j$. Hence,
$$\sigma^x_i|X\rangle = \sum_\sigma \sigma(X)\,|\sigma^{(i)}\rangle, \tag{157}$$


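where $\sigma^{(i)}$ is the spin configuration σ but with $\sigma_i$ replaced by $-\sigma_i$. By making a change of variables in the sum we obtain
$$\sigma^x_i|X\rangle = \sum_\sigma \sigma^{(i)}(X)\,|\sigma\rangle, \tag{158}$$
where
$$\sigma^{(i)}(X) = \begin{cases} -\sigma(X) & \text{if } i \in X, \\ \sigma(X) & \text{if } i \notin X. \end{cases} \tag{159}$$
Hence,
$$\sigma^x_i|X\rangle = \begin{cases} -|X\rangle & \text{for } i \in X, \\ |X\rangle & \text{for } i \notin X. \end{cases} \tag{160}$$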
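The eigenvalue relation (160) can be confirmed numerically on a small lattice. The sketch below is only an illustration: it stores the state |X⟩ of (156) as a dictionary from spin configurations to amplitudes and lets σ^x_i act by flipping spin i, as in (157). The helper names (`state`, `apply_sigma_x`, `eigenvalue`) are hypothetical.

```python
from itertools import product

def sigma_of(config, X):
    """sigma(X): the product of the spins sigma_j over j in X (sites are 1-based)."""
    p = 1
    for j in X:
        p *= config[j - 1]
    return p

def state(X, N):
    """|X> = sum over sigma of sigma(X)|sigma>, stored as {configuration: amplitude}."""
    return {c: sigma_of(c, X) for c in product((1, -1), repeat=N)}

def apply_sigma_x(i, vec):
    """sigma^x_i maps the basis vector |sigma> to |sigma^(i)>, i.e. it flips spin i."""
    out = {}
    for c, amp in vec.items():
        flipped = c[:i - 1] + (-c[i - 1],) + c[i:]
        out[flipped] = out.get(flipped, 0) + amp
    return out

def eigenvalue(i, X, N):
    """Return the number lam with sigma^x_i |X> = lam |X>; fail if |X> is not an eigenvector."""
    v = state(X, N)
    w = apply_sigma_x(i, v)
    ratios = {w[c] * v[c] for c in v}   # amplitudes are +1 or -1, so w/v equals w*v
    assert len(ratios) == 1
    return ratios.pop()

N, X = 4, {1, 3}
print([eigenvalue(i, X, N) for i in range(1, N + 1)])
```

The eigenvalue is −1 for the sites in X and +1 for the others, in agreement with (160).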

Note that the states $|X\rangle$ are eigenstates of $H_0$ [(154)]. The ground states of the original Hamiltonian $\widetilde H$ [(151)] for the choice $J_N = -1$ and $\epsilon = 0$ correspond to a configuration consisting of a string of up-spins (+) next to a string of down-spins (−). We refer to such ground states of $\widetilde H$ as its $\epsilon = 0$ interface states. However, the configuration corresponding to the $\epsilon = 0$ interface states of the unitarily equivalent Hamiltonian H [(152)] (i.e., ground states of $H_0$ [(154)]) cannot be visualized as clearly. The unitary transformation R obscures the picture. So, to describe $|X\rangle$, it is useful to think about $R^{-1}|X\rangle$. For example the state $R^{-1}|\emptyset\rangle$ corresponds to the configuration + + + + + + ··· + ++, where the labels of the sites increase from 1 to N from left to right. It has a single interface between the nearest neighbor sites N and 1. We say that there is an interface between two nearest neighbor sites j and j + 1 if the nearest neighbor bond ⟨j, j + 1⟩ is frustrated, i.e., if the spins are antiparallel for $j \neq N$ and parallel for j = N. Note that for X = Λ, $R^{-1}|X\rangle$ is the configuration with all −'s, this being another configuration with an interface between the sites N and 1. If X = {1, 2, ..., j}, then $R^{-1}|X\rangle$ looks like + + + ··· + + − − ··· − −−, where the last + occurs at the site j; for X = {j, j + 1, ..., N}, $R^{-1}|X\rangle$ looks like − − − ··· − + ··· + ++, where the first + occurs at the site j. Let I(X) denote the set of sites for which the configuration corresponding to the state $R^{-1}|X\rangle$ has interfaces between each site i in this set and its nearest neighbor i + 1. For $j \neq N$, j ∈ I(X) if and only if exactly one of j and j + 1 is in X, and for j = N, j ∈ I(X) if and only if both N and 1 are either in X or outside it. Then
$$H_1|X\rangle = -2\epsilon\sum_{j\in I(X)} |X\,\triangle\,\{j,j+1\}\rangle. \tag{161}$$
Let X, Y ⊂ Λ and consider the states $|X\rangle$ and $|Y\rangle$. If |X| and |Y| are both even (or both odd), then after repeated applications of the Hamiltonian on the state $|X\rangle$ we can obtain a state which has a nonzero overlap with $|Y\rangle$. This is, however, not possible if one of |X| and |Y| is even and the other is odd. For |X| and |Y| both even (or odd) we define α(X → Y) to be the minimum number of applications of the Hamiltonian necessary to get from $|X\rangle$ to a state which has a nonzero overlap with $|Y\rangle$. We denote such a transition by the symbol X → Y. Hence, α(X → Y) is equal to the smallest integer n for which
$$\langle Y|H_1^n|X\rangle \neq 0. \tag{162}$$
Equivalently, we consider all sequences $X_0, X_1, X_2, \ldots, X_n$ such that $X_0 = X$, $X_n = Y$, and for each k there is a $j \in I(X_{k-1})$, as in (161), so that $X_k = X_{k-1}\,\triangle\,\{j,j+1\}$. Then α(X → Y) is the smallest n for all such sequences. In addition, we define α(X) := α(X → ∅). It is clear that α(X) is infinite for |X| odd. Since $T^s$ is a unitary operator which commutes with $H_1$, we have
$$\langle T^sY|H_1^n|T^sX\rangle = \langle Y|T^{-s}H_1^nT^s|X\rangle = \langle Y|H_1^n|X\rangle. \tag{163}$$
The above equation implies that
$$\alpha(T^sX \to T^sY) = \alpha(X \to Y). \tag{164}$$
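Since α(X → Y) is a shortest-path length on the set of subsets of the lattice, it can be computed for small N by a breadth-first search. The following is a minimal sketch under two assumptions that are not spelled out at this point of the text: the allowed moves from a set Z are Z → Z △ {j, j + 1} with j ∈ I(Z), as suggested by (161), and sites are periodic (N + 1 ≡ 1). The function names are illustrative.

```python
from collections import deque

def interfaces(X, N):
    """I(X): j != N with exactly one of j, j+1 in X; j = N with both or neither of N, 1 in X."""
    result = []
    for j in range(1, N + 1):
        k = j % N + 1                      # site j+1 with periodic wrap-around
        one_in = (j in X) != (k in X)
        if (one_in and j != N) or (not one_in and j == N):
            result.append(j)
    return result

def alpha(X, Y, N):
    """Breadth-first search for alpha(X -> Y); returns None when Y cannot be reached."""
    start, goal = frozenset(X), frozenset(Y)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        Z = queue.popleft()
        if Z == goal:
            return dist[Z]
        for j in interfaces(Z, N):
            W = Z ^ frozenset({j, j % N + 1})   # symmetric difference with {j, j+1}
            if W not in dist:
                dist[W] = dist[Z] + 1
                queue.append(W)
    return None

N = 6
print(alpha({1, 2}, set(), N), alpha({N - 1, N}, set(), N), alpha({N, 1}, set(), N))
```

With these conventions the search gives α({N, 1}) = 1, and N − 1 steps for each of the two-site interval sets {1, 2} and {N − 1, N}: one of the two sites has to be moved across the chain before the pair can be removed at the bond between N and 1. Sets with an odd number of sites are never connected to ∅, so the function returns None for them.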

We start with the analog of Eq. (130) for the case of Hamiltonian H [(152)] (which is unitarily equivalent to the ferromagnetic Hamiltonian $\widetilde H$ (151)). The change of the Hamiltonian $\widetilde H$ from the antiferromagnetic (123) to the ferromagnetic (151) case (and hence the corresponding change of H from (125) to (152)) changes some of the signs in Eq. (130). Moreover, since h(X) = 0 for the Hamiltonian given by (152), many of the terms in this equation reduce to zero. Taking into account these changes, we obtain the following equation:
$$2\sum_X n(X)\,e(X)\sigma(X) - 2\epsilon\,\sigma_N\sigma_1\sum_X e(X)\sigma(X) - 2\epsilon\sum_{j=1}^N \sigma_j\sigma_{j+1}\,s(j,N)\sum_{X:j} e(X)\sigma(X) - \sum_{s=1}^{2N}\sigma_1\cdots\sigma_s\,e_s\sum_X e(X)\sigma(X+s) = 0. \tag{165}$$
Using Eqs. (66) and (86), and picking out the coefficient of σ(X) we have
$$2n(X)\,e(X) - 2\epsilon\,e(X\,\triangle\,\{N,1\}) + 2\epsilon\,e(X\,\triangle\,\{N,1\})\,1(N : X) - 2\epsilon\sum_{j=1}^{N-1} e(X\,\triangle\,\{j,j+1\})\,1(j : X) - \sum_{s=1}^{2N} e_s\,e(T^{-s}X) = 0. \tag{166}$$
Since 1 − 1(N : X) = 1(N :: X) ≡ 1(N ∈ I(X)), and for $j \neq N$, 1(j : X) = 1(j ∈ I(X)), the above equation can be written as
$$2n(X)\,e(X) - 2\epsilon\sum_{j\in I(X)} e(X\,\triangle\,\{j,j+1\}) - \sum_{s=1}^{2N} e_s\,e(T^{-s}X) = 0, \tag{167}$$
which we can write as
$$2n(X)\,e(X) - 2\epsilon\sum_{j=1}^N\ \sum_{Z:\,I(Z)\ni j} e(Z)\,1\big(Z\,\triangle\,\{j,j+1\} = X\big) - \sum_{s=1}^{2N} e_s\,e(T^{-s}X) = 0, \tag{168}$$
since j ∈ I(X) implies that j ∈ I(Z) for $X = Z\,\triangle\,\{j,j+1\}$.


For X such that $n(X) \neq 0$ we rewrite this as
$$e(X) = \frac{1}{2n(X)}\Big[2\epsilon\sum_{j=1}^N\ \sum_{Z:\,I(Z)\ni j} e(Z)\,1\big(Z\,\triangle\,\{j,j+1\} = X\big) + \sum_{s=1}^{2N} e_s\,e(T^{-s}X)\Big]. \tag{169}$$
Recall that n(X) = 0 if and only if X is of the form $T^m(\emptyset)$ for some m. We assume that e(X) = 0 for all X ⊂ Λ for which n(X) = 0, except for X = ∅ for which we assume that
$$e(\emptyset) = 1. \tag{170}$$
Hence, for $X = T^m(\emptyset)$, (166) becomes
$$e_m = -2\epsilon\sum_{j\in I(X)} e(X\,\triangle\,\{j,j+1\}). \tag{171}$$
When $X = T^m(\emptyset)$ the set I(X) contains only one site and we find that
$$e_m = -2\epsilon\,e(X_m) \quad \text{for } m \le N, \tag{172}$$
where $X_m = \{1, 2, \ldots, m\}\,\triangle\,\{m, m+1\}$, and
$$e_{m+N} = -2\epsilon\,e(\Lambda\setminus X_m) = -2\epsilon\,e(T^N X_m) \quad \text{for } m \le N. \tag{173}$$
Note that $n(X_m) \neq 0$. Consider the set of variables
$$e := \{e(X) : n(X) \neq 0\} \cup \{e_s : s = 1, 2, \ldots, 2N\}. \tag{174}$$
Equations (169), (172) and (173) form a fixed point equation for e:
$$F(e) = e. \tag{175}$$
Let us introduce the norm
$$\|e\| := \sum_{m=1}^{2N} |e_m|\,(|\epsilon|M)^{-\beta_m} + 2\sum_{X:\,n(X)\neq 0} |e(X)|\,n(X)\,(|\epsilon|M)^{-\alpha(X)}, \tag{176}$$
where M is a positive constant and $\beta_m = \alpha(T^m\emptyset \to \emptyset)$. Recall that α(X → Y) is the least number of applications of the Hamiltonian it takes to get from $|X\rangle$ to a state which has a nonzero overlap with $|Y\rangle$. For m odd, repeated applications of the Hamiltonian to $|T^m\emptyset\rangle$ can never produce a state with a nonzero overlap with $|\emptyset\rangle$. So $\beta_m$ is taken to be infinite for odd values of m. The factor of 2 in the second term on the RHS of (176) is included merely for convenience. For m ≤ N,
$$\beta_m = \alpha(T^m\emptyset), \qquad \beta_{m+N} = \alpha(T^{m+N}\emptyset) \equiv \alpha(\Lambda\setminus T^m\emptyset). \tag{177}$$


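Theorem 3. There exists a constant M > 0 such that if $|\epsilon|M \le 1$, then the fixed point equation (175) has a solution e, and $\|e\| \le c$ for some constant c which depends only on M. Furthermore,
$$\sum_{s=-N+1,\ s\neq 0}^{N} |e_s| \le c\,(|\epsilon|M)^{N-1}. \tag{178}$$
So in the infinite length limit, the dispersion relation for an interface is independent of the generalized momentum k.

Proof. It is not hard to show that $\beta_2 = \beta_{N-2} = N - 1$, and $\beta_s$ for other nonzero s is even larger. So (178) will follow from the existence of a fixed point in the norm (176). As before we prove the existence of a fixed point by proving
$$\|F(e) - F(\tilde e)\| \le \tfrac12\,\|e - \tilde e\| \quad \text{for } \|e\|, \|\tilde e\| \le \delta; \tag{179}$$
$$\|F(e)\| \le \delta \quad \text{for } \|e\| \le \delta, \tag{180}$$
with
$$\delta = \frac{4}{M}. \tag{181}$$
From (172) and (173) we get (using the definition of $\beta_m$)
$$\sum_{m=1}^{2N} |e_m - \tilde e_m|\,(|\epsilon|M)^{-\beta_m} \le \sum_{m=1}^{N} |e_m - \tilde e_m|\,(|\epsilon|M)^{-\beta_m} + \sum_{m=1}^{N} |e_{m+N} - \tilde e_{m+N}|\,(|\epsilon|M)^{-\beta_{m+N}}$$
$$= 2|\epsilon|\,(|\epsilon|M)^{-1}\sum_{m=1}^{N} |e(X_m) - \tilde e(X_m)|\,(|\epsilon|M)^{-\alpha(X_m)} + 2|\epsilon|\,(|\epsilon|M)^{-1}\sum_{m=1}^{N} |e(\Lambda\setminus X_m) - \tilde e(\Lambda\setminus X_m)|\,(|\epsilon|M)^{-\alpha(\Lambda\setminus X_m)}. \tag{182}$$
This is because, for m ≤ N, $\beta_m = \alpha(T^m\emptyset) = \alpha(X_m) + 1$, and $\beta_{m+N} = \alpha(T^{N+m}\emptyset) = \alpha(\Lambda\setminus X_m) + 1$. Hence,
$$\text{RHS of (182)} \le 2|\epsilon|\,(|\epsilon|M)^{-1}\sum_{Y:\,n(Y)\neq 0} |e(Y) - \tilde e(Y)|\,(|\epsilon|M)^{-\alpha(Y)} \le M^{-1}\,\|e - \tilde e\|. \tag{183}$$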


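Further, from (169) we get
$$2\sum_{X:\,n(X)\neq 0} n(X)\,|e(X) - \tilde e(X)|\,(|\epsilon|M)^{-\alpha(X)} \le 2|\epsilon|\sum_{X:\,n(X)\neq 0}\ \sum_{j=1}^N\ \sum_{Z:\,I(Z)\ni j} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-\alpha(Z\triangle\{j,j+1\})}\,1\big(Z\,\triangle\,\{j,j+1\} = X\big)$$
$$\qquad + \sum_s\ \sum_{X:\,n(X)\neq 0} |e_s e(T^{-s}X) - \tilde e_s\tilde e(T^{-s}X)|\,(|\epsilon|M)^{-\alpha(X)} =: (a) + (b). \tag{184}$$
We claim that
$$\alpha(Z\,\triangle\,\{j,j+1\}) \le \alpha(Z) + 1 \quad \text{for } j \in I(Z). \tag{185}$$
To see this note that if j ∈ I(Z), then $j \in I(Z\,\triangle\,\{j,j+1\})$. So a single application of the Hamiltonian can cause the transition $Z\,\triangle\,\{j,j+1\} \to Z$. Using (185) we get
$$(a) \le 2|\epsilon|\,(|\epsilon|M)^{-1}\sum_{Z:\,n(Z)\neq 0} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-\alpha(Z)}\sum_{j:\,j\in I(Z)} 1 \le 2M^{-1}\sum_{Z:\,n(Z)\neq 0} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-\alpha(Z)}\,|\delta Z|$$
$$\le 2M^{-1}\sum_{Z:\,n(Z)\neq 0} |e(Z) - \tilde e(Z)|\,(|\epsilon|M)^{-\alpha(Z)}\,(n(Z)+2) \le 3M^{-1}\,\|e - \tilde e\|, \tag{186}$$
where we have used the inequality $|\delta Z| \le n(Z) + 2$.

Moreover, using the triangle inequality we get
$$(b) \le \sum_{s=1}^{2N}\ \sum_{X:\,n(X)\neq 0}\Big[\,|e_s|\,|e(T^{-s}X) - \tilde e(T^{-s}X)|\,(|\epsilon|M)^{-\alpha(X)} + |e_s - \tilde e_s|\,|\tilde e(T^{-s}X)|\,(|\epsilon|M)^{-\alpha(X)}\Big]. \tag{187}$$
Let $Y = T^{-s}X$. Hence, $X = T^sY$. Since $n(X) \neq 0$ in the above sum, we must have $Y \neq \emptyset$ and $n(Y) \neq 0$. We claim that
$$\alpha(T^sY) \le \alpha(Y) + \beta_s. \tag{188}$$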


If we have a sequence of applications of the Hamiltonian that causes the transition $T^sY \to T^s\emptyset$ and another sequence that causes the transition $T^s\emptyset \to \emptyset$, then together they give a sequence which results in $T^sY \to \emptyset$. Thus
$$\alpha(T^sY) = \alpha(T^sY \to \emptyset) \le \alpha(T^sY \to T^s\emptyset) + \alpha(T^s\emptyset \to \emptyset) = \alpha(Y) + \alpha(T^s\emptyset) = \alpha(Y) + \beta_s, \tag{189}$$
where we have used (164). From (187) and (188) it follows that
$$(b) \le \sum_{s=1}^{2N} |e_s|\,(|\epsilon|M)^{-\beta_s}\sum_{Y:\,n(Y)\neq 0} |e(Y) - \tilde e(Y)|\,(|\epsilon|M)^{-\alpha(Y)} + \sum_{s=1}^{2N} |e_s - \tilde e_s|\,(|\epsilon|M)^{-\beta_s}\sum_{Y:\,n(Y)\neq 0} |\tilde e(Y)|\,(|\epsilon|M)^{-\alpha(Y)} \tag{190}$$
$$\le \tfrac12\|e\|\,\|e - \tilde e\| + \tfrac12\|\tilde e\|\,\|e - \tilde e\|. \tag{191}$$
From (186) and (191) it follows that
$$\text{RHS of (184)} \le \|e - \tilde e\|\Big(\tfrac12\|e\| + \tfrac12\|\tilde e\| + 4M^{-1}\Big) \le K\,\|e - \tilde e\|, \tag{192}$$
where we have used $\|e\| \le \delta$, $\|\tilde e\| \le \delta$, and defined
$$K = \delta + 4M^{-1} = 2\delta. \tag{193}$$
If δ ≤ 1/4, then K ≤ 1/2. To prove (180), we use (169), (172) and (173) to compute F(0). Note that e = 0 means that $e_s = 0$ for all s, and e(X) = 0 for all X except X = ∅. We always have e(∅) = 1. Writing $F(0)_m$ and $F(0)(X)$ for the components of F(0), we have
$$F(0)_m = 0 \quad \text{for all } m. \tag{194}$$
For X = {N, 1} we have
$$F(0)(X) = \frac{1}{2n(X)}\,2\epsilon, \tag{195}$$
and $F(0)(X) = 0$ for all other X for which $n(X) \neq 0$. Thus
$$\|F(0)\| \le 2|\epsilon|\,(|\epsilon|M)^{-\alpha(\{N,1\})} = 2|\epsilon|\,(|\epsilon|M)^{-1} = 2M^{-1}, \tag{196}$$
since α({N, 1}) = 1. Hence, for $\|e\| \le \delta$, where $\delta = 4M^{-1}$,
$$\|F(e)\| \le \|F(e) - F(0)\| + \|F(0)\| \le \tfrac12\|e\| + 2M^{-1} \le \delta. \tag{197}$$
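The arithmetic behind the choice δ = 4/M can be made explicit. The sketch below only re-evaluates the constants of (181), (193), (196) and (197) with exact fractions; it shows that the contraction condition K ≤ 1/2 amounts to M ≥ 16, while the ball condition (197) holds with equality at its upper end for any M. The function names are illustrative and not part of the paper.

```python
from fractions import Fraction

def constants(M):
    """Return (delta, K, bound on ||F(0)||) for a given M, following (181), (193), (196)."""
    M = Fraction(M)
    delta = 4 / M            # (181)
    K = delta + 4 / M        # (193), which equals 2*delta
    f0 = 2 / M               # (196)
    return delta, K, f0

def closes(M):
    """True if both conditions (179) and (180) follow from the estimates for this M."""
    delta, K, f0 = constants(M)
    contraction = K <= Fraction(1, 2)                     # needed for (179)
    maps_ball = Fraction(1, 2) * delta + f0 <= delta      # the chain of inequalities (197)
    return contraction and maps_ball

print(constants(16), closes(16), closes(15))
```

For M = 16 one gets δ = 1/4, K = 1/2 and the bound 1/8 on the norm of F(0), so both conditions hold; for M = 15 the constant K = 8/15 exceeds 1/2.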

This finishes the proof that the fixed point equation has a solution and so completes the proof of Theorem 3. Acknowledgements. We would like to thank B. Nachtergaele for helpful suggestions. ND would also like to thank Y.M. Suhov for interesting discussions. TK acknowledges the support of the National Science Foundation (DMS-9970608 and DMS-0201566).


References
1. Alcaraz, F.C., Salinas, S.R., Wreszinski, W.F.: Anisotropic ferromagnetic quantum domains. Phys. Rev. Lett. 75, 930 (1995)
2. Araki, H., Matsui, T.: Ground states of the XY-model. Commun. Math. Phys. 101, 231 (1985)
3. Bach, K.: Stabilité et instabilité d’interfaces dans des chaînes de spin quantiques. Diploma Thesis, EPFL
4. Bolina, O., Contucci, P., Nachtergaele, B.: Path integral representation for interface states of the anisotropic Heisenberg model. Rev. Math. Phys. 12, 1325 (2000)
5. Bolina, O., Contucci, P., Nachtergaele, B., Starr, S.: Finite-volume excitations of the 111 interface in the quantum XXZ model. Commun. Math. Phys. 212, 63 (2000)
6. Bolina, O., Contucci, P., Nachtergaele, B., Starr, S.: A continuum approximation for the excitations of the (1, 1, . . . , 1) interface in the quantum Heisenberg model. Electronic J. Diff. Eqs. Conf. 04, 1 (2000)
7. Borgs, C., Kotecký, R., Ueltschi, D.: Low temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409 (1996)
8. Borgs, C., Chayes, J., Fröhlich, J.: Dobrushin states in quantum lattice systems. Commun. Math. Phys. 189, 591 (1997)
9. Datta, N., Fernández, R., Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. J. Stat. Phys. 84, 455 (1996)
10. Datta, N., Kennedy, T.: Expansions for one quasiparticle states in spin 1/2 systems. J. Stat. Phys. 108, 373 (2002)
11. Datta, N., Messager, A., Nachtergaele, B.: Rigidity of interfaces in the Falicov-Kimball model. J. Stat. Phys. 99, 461 (2000)
12. Dobrushin, R.L.: Gibbs state describing the coexistence of phases for a three-dimensional Ising model. Theor. Prob. Appl. 17, 582 (1972)
13. Fröhlich, J., Lieb, E.H.: Phase transitions in anisotropic lattice systems. Commun. Math. Phys. 60, 103 (1978)
14. Gallavotti, G.: Phase separation line in the two-dimensional Ising model. Commun. Math. Phys. 27, 103 (1972)
15. Gottstein, C.-T., Werner, R.F.: Ground states of the infinite q-deformed Heisenberg ferromagnet. arXiv:cond-mat/9501123
16. Henley, C.L.: Ordering due to disorder in a frustrated vector antiferromagnet. Phys. Rev. Lett. 62, 2056 (1989)
17. Kenyon, R.: Local statistics of lattice dimers. Ann. Inst. H. Poincaré, Probab. Statist. 33, 591–618 (1997)
18. Kirkwood, J.R., Thomas, L.E.: Expansions and phase transitions for the ground state of quantum Ising lattice systems. Commun. Math. Phys. 88, 569 (1983)
19. Koma, T., Nachtergaele, B.: The spectral gap of the ferromagnetic XXZ chain. Lett. Math. Phys. 40, 1 (1997)
20. Koma, T., Nachtergaele, B., Starr, S.: The spectral gap for the ferromagnetic spin-J XXZ chain. Adv. Theor. Math. Phys. 5, 1047 (2001)
21. Koma, T., Nachtergaele, B.: Interface states of quantum lattice models. In: Matsui, T. (ed.), Recent Trends in Infinite Dimensional Non-Commutative Analysis. RIMS Kokyuroku 1035, Kyoto, 1998, p. 133
22. Matsui, T.: A link between quantum and classical Potts models. J. Stat. Phys. 59, 781 (1990)
23. Matsui, T.: On the spectra of the kink for ferromagnetic XXZ models. Lett. Math. Phys. 42, 229 (1997)
24. Nachtergaele, B.: Interfaces and droplets in quantum lattice models. In: XIII International Congress of Mathematical Physics, A. Grigoryan, A. Fokas, T. Kibble, B. Zegarlinski (eds.), Boston: International Press, 2001, p. 243

Communicated by M. Aizenman

Commun. Math. Phys. 236, 513–534 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0824-6


Power-Law Bounds on Transfer Matrices and Quantum Dynamics in One Dimension

David Damanik1, Serguei Tcheremchantsev2

1 Department of Mathematics 253–37, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]
2 UMR 6628 – MAPMO, Université d’Orleans, B.P. 6759, 45067 Orleans Cédex, France. E-mail: [email protected]

Received: 5 June 2002 / Accepted: 20 January 2003 Published online: 28 March 2003 – © Springer-Verlag 2003

Abstract: We present an approach to quantum dynamical lower bounds for discrete one-dimensional Schrödinger operators which is based on power-law bounds on transfer matrices. It suffices to have such bounds for a nonempty set of energies. We apply this result to various models, including the Fibonacci Hamiltonian.

1. Introduction

Consider a self-adjoint operator $H$ on a separable Hilbert space $\mathcal{H} = \ell^2(\mathbb{Z}^d)$ or $\ell^2(\mathbb{N})$. The dynamical evolution of an initial state $\psi \in \mathcal{H}$, $\|\psi\| = 1$, is given by $\psi(t) = \exp(-itH)\psi$. We shall denote $\psi(t, n) = \langle \psi(t), \delta_n \rangle$, where $B = \{\delta_n\}$ is the canonical basis of $\mathcal{H}$ labelled by $n \in \mathbb{Z}^d$ or $n \in \mathbb{N}$. One usually takes $\psi$ so that $\psi(0, n) = \psi(n)$ is well localized (fast decaying at infinity). For example, one can take $\psi = \delta_1$ in the one-dimensional case. While being localized at $t = 0$, the wave packet in general spreads with time over the basis $B$. It is convenient to consider the time-averaged quantities
$$a(n, T) = \frac{1}{T} \int_0^T |\psi(t, n)|^2 \, dt, \qquad \sum_n a(n, T) = \|\psi\|^2 = 1 \ \text{ for all } T > 0. \tag{1}$$
There exist basically two possibilities to characterize the spreading of the wave packet. First, one can consider the upper and lower rates associated to the fastest (or the slowest) part of the wave packet. Let
$$S(\gamma, T) = \sum_{n : |n| \ge T^{\gamma} - 1} a(n, T), \qquad \gamma \ge 0.$$

D.D. was supported in part by NSF Grant No. DMS–0227289

For the fastest part, one defines
$$\gamma^- = \sup \Big\{ \gamma \ge 0 \ \Big|\ \liminf_{T \to +\infty} \frac{\log S(\gamma, T)}{\log T} = 0 \Big\}, \tag{2}$$
$$\gamma^+ = \sup \Big\{ \gamma \ge 0 \ \Big|\ \limsup_{T \to +\infty} \frac{\log S(\gamma, T)}{\log T} = 0 \Big\} \tag{3}$$
(with the convention $\log 0 = -\infty$). Assume that $\gamma^\pm > 0$. Then it follows from this definition that for any $0 < \nu < \gamma^-$, $\eta > 0$,
$$\sum_{n : |n| \ge T^{\gamma^- - \nu}} a(n, T) \ge C T^{-\eta} \tag{4}$$
for all $T \ge 1$ with a uniform (in $T$) constant $C$. For $\gamma^+$, a similar bound holds on some sequence of times $T_k \to +\infty$. One could take a slightly different definition of $\gamma^\pm$ so that one has for any $0 < \nu < \gamma^-$,
$$\sum_{n : |n| \ge T^{\gamma^- - \nu}} a(n, T) \ge C(\nu) > 0 \tag{5}$$

(and similarly for $\gamma^+$), but the meaning of the numbers $\gamma^\pm$ will be essentially the same. The spreading rates for the slowest part of the wave packet can be defined with summation over $\{n : |n| \le T^\gamma\}$ and taking $\inf\{\gamma\}$ in (2), (3) (see also [28] for a slightly different definition). In the present paper we shall be interested, however, only in the fastest part of the wave packet.

To determine the numbers $\gamma^\pm$ for concrete quantum systems, one should be able to relate them to the spectral properties of $H$. What is in fact known are some general lower bounds for $\gamma^\pm$. Although not stated in this form, it follows from the proofs of [2, 31] that
$$\gamma^- \ge \frac{1}{d} \dim_H(\mu_\psi), \tag{6}$$
and from the proof of [21] that
$$\gamma^+ \ge \frac{1}{d} \dim_P(\mu_\psi) \tag{7}$$
(with $d \ge 1$ in the case of $\ell^2(\mathbb{Z}^d)$ and $d = 1$ in the case of $\ell^2(\mathbb{N})$). Here $\mu_\psi$ is the spectral measure associated to the state $\psi$ and the operator $H$, and $\dim_H(\mu)$, $\dim_P(\mu)$ denote the (upper) Hausdorff and packing dimensions of the measure $\mu$, respectively. For different definitions and relations between them, see, for example, [5, 18, 39]. In particular, the Hausdorff dimension is determined by the most continuous part of the measure $\mu$: $\dim_H(\mu) = \mu\text{-ess-sup}\ \lambda^-(E)$, where $\lambda^-(E)$ is the lower local exponent of $\mu$,
$$\lambda^-(E) = \liminf_{\varepsilon \to 0} \frac{\log \mu([E - \varepsilon, E + \varepsilon])}{\log \varepsilon}, \qquad E \in \operatorname{supp} \mu.$$
For the packing dimension, we have $\dim_P(\mu) = \mu\text{-ess-sup}\ \lambda^+(E)$, where $\lambda^+(E)$ is the upper local exponent of the measure,
$$\lambda^+(E) = \limsup_{\varepsilon \to 0} \frac{\log \mu([E - \varepsilon, E + \varepsilon])}{\log \varepsilon}, \qquad E \in \operatorname{supp} \mu.$$

One can observe that for Hausdorff dimension, the result is slightly stronger than (6), namely, (5) holds with $\gamma^- = \dim_H(\mu_\psi)$. If one knows the continuity properties of the spectral measure (encoded in $\lambda^-(E)$) and one has also some information about the decay of the generalized eigenfunctions $u_\psi(n, E)$, associated to the state $\psi$, then one can improve the lower bound (6) for $\gamma^-$ [28, 29].

More involved quantities which describe the fastest portion of the wave packet are the time-averaged moments of the position operator:
$$\langle |X|_\psi^p \rangle (T) = \frac{1}{T} \int_0^T \sum_n |n|^p |\psi(t, n)|^2 \, dt = \sum_n |n|^p a(n, T), \qquad p > 0 \tag{8}$$
(where $|n|$ is the norm in $\mathbb{Z}^d$ in the case of $\ell^2(\mathbb{Z}^d)$). One can define the associated upper and lower growth exponents,
$$\beta^-(p) = \liminf_{T \to +\infty} \frac{\log \langle |X|_\psi^p \rangle (T)}{\log T},$$
and in a similar manner for $\beta^+(p)$. The numbers $\beta^\pm(p)$ depend, in general, on the state $\psi$, but we will leave the dependence implicit. It is clear that (4) implies
$$\langle |X|_\psi^p \rangle (T) \ge C T^{p(\gamma^- - \nu) - \eta}$$
for any $\nu > 0$, $\eta > 0$ uniformly in $T \ge 1$ and thus $\beta^-(p) \ge p\gamma^-$. Together with (6) this yields
$$\beta^-(p) \ge \frac{p}{d} \dim_H(\mu_\psi), \tag{9}$$

the bound most often used to bound from below the moments of the position operator. Similarly, one always has $\beta^+(p) \ge p\gamma^+ \ge \frac{p}{d} \dim_P(\mu_\psi)$.

It is important to observe that strict inequalities $\beta^-(p) > p\gamma^-$, $\beta^+(p) > p\gamma^+$ may occur. This is possible if the wave packet has polynomially decaying tails at infinity. Assume, for example, that for some $\gamma > \gamma^-$ and some $\tau > 0$,
$$\sum_{n : |n| \ge T^{\gamma}} a(n, T) \ge C T^{-\tau} \tag{10}$$
uniformly in $T \ge 1$. Then $\beta^-(p) \ge p\gamma - \tau$. This bound is better than $\beta^-(p) \ge p\gamma^-$ for $p$ large enough.

General lower bounds for the time-averaged moments which take into account (in a somewhat hidden form) polynomial tails, are obtained in [3, 4, 22, 39]. The proofs are based on the spectral theorem and develop the ideas of Guarneri [20]. The obtained lower bounds are expressed in terms of the spectral measure $\mu_\psi$ [4, 39] or in terms of both the spectral measure and the generalized eigenfunctions $u_\psi(n, E)$ [39]. To apply them to concrete quantum systems, one should have a good knowledge of $\mu_\psi([E - \varepsilon, E + \varepsilon])$, $E \in \operatorname{supp} \mu_\psi$ (and also of the functions $S_N(E) = \sum_{n : |n| \le N} |u_\psi(n, E)|^2$ in the case of the mixed lower bounds from [39]). Such information is difficult to obtain in the multidimensional case. However, in one dimension for operators $H$ of the form
$$H\psi(n) = \psi(n - 1) + \psi(n + 1) + V(n)\psi(n), \tag{11}$$

there exist rather effective methods based on the study of solutions to the formal eigenfunction equation $Hu = Eu$, $E \in \mathbb{R}$. In particular, the growth properties of the transfer matrix $T(n, E)$ associated with this equation are closely related with the spectral properties of the operator $H$. For simplicity, from now on we shall always take $\psi = \delta_1$ and we write simply $\mu$ instead of $\mu_{\delta_1}$.

Our purpose is not to give a detailed list of results obtained in this field. We will mainly be interested in models with power-law growth of the norm of the transfer matrix. Roughly speaking, the slower the growth of $\|T(n, E)\|$, for $E \in A$ and as $|n| \to \infty$, the more continuous is the restriction of $\mu$ on $A$. What will be of particular interest for us in the present paper, are the two following results. First, if $\|T(n, E)\| \le C(E)|n|^\eta$ for $\mu$-a.e. $E \in A$, $\mu(A) > 0$ with $\eta \in [0, 1/2)$ and finite $C(E)$, then [25, 26] $\mu([E - \varepsilon, E + \varepsilon]) \le D(E)\varepsilon^{1 - 2\eta}$, $E \in A$, so that the measure is $(1 - 2\eta)$-continuous on $A$. In particular,
$$\dim_H(\mu) \ge 1 - 2\eta, \qquad \beta^-(p) \ge p(1 - 2\eta). \tag{12}$$

More generally, assume that every solution of the equation $Hu = Eu$, $E \in A$ obeys
$$C_1(E) L^{2q_1} \le \sum_{n : |n| \le L} |u(n, E)|^2 \le C_2(E) L^{2q_2} \tag{13}$$
with positive finite $C_1(E)$, $C_2(E)$ and $0 < q_1 \le q_2 < +\infty$. Then the measure is $2q_1/(q_1 + q_2)$-continuous on $A$ [15, 26]. In particular, $\dim_H(\mu) \ge 2q_1/(q_1 + q_2)$. One can observe that the condition $\|T(n, E)\| \le C(E)|n|^\alpha$ with some $\alpha \ge 0$ implies $q_2 \le \alpha + 1/2$.

The polynomial upper bound on the norm of the transfer matrix also allows one to bound $\mu([E - \varepsilon, E + \varepsilon])$ from below. Such a bound can be used, following general results of [4, 39], to obtain non-trivial dynamical bounds for the moments ($\beta^-(p) > 0$) for some systems with pure point spectrum [19]. Since in this case $\dim_H(\mu) = 0$, these bounds cannot be obtained by usual methods based on the continuity properties of the measure. (Another example of this kind, studied by Jitomirskaya et al. in [27], will be discussed later on).

As was said above, the general results of [4, 39] were obtained using the spectral theorem for the operator $H$ and thus by representing all the quantities of interest as some integrals over the spectral measure. In the present paper we propose another method based on integrals over Lebesgue measure rather than over $\mu_\psi$. One can bound from below rather directly the sums $\sum_{n : |n| \ge L} a(n, T)$ (and thus the time-averaged moments $\langle |X|^p \rangle (T)$) using the Parseval equality [34]. Namely, for any $\psi \in \mathcal{H}$,
$$\frac{1}{T} \int_0^{+\infty} e^{-2t/T} |\psi(t, n)|^2 \, dt = \frac{1}{2\pi T} \int_{\mathbb{R}} |(R(E + i/T)\psi)(n)|^2 \, dE. \tag{14}$$

Here, $n \in \mathbb{Z}^d$ or $n \in \mathbb{N}$, and $R(z) = (H - zI)^{-1}$ is the resolvent of $H$. The integral on the l.h.s. in (14) is very close to the Cesàro time-averaged quantity $a(n, T)$. Therefore we modify the definition of $a(n, T)$ in (1) and of $\langle |X|^p \rangle (T)$ in (8) accordingly (this is done in this way by many authors),
$$a(n, T) = \frac{1}{T} \int_0^{+\infty} e^{-2t/T} |\psi(t, n)|^2 \, dt, \qquad \langle |X|_\psi^p \rangle (T) = \sum_n |n|^p a(n, T), \tag{15}$$
and work with these definitions in what follows. One can easily see that this does not change the numbers $\beta^\pm(p)$, provided that the moments $\langle |X|_\psi^p \rangle (t)$ do not grow faster than polynomially (which is true in most applications and, in particular, in all applications considered in the present paper).

To obtain a non-trivial dynamical lower bound, it is sufficient to bound from below $\sum_{n : |n| \ge T^\gamma} |R(E + i/T)\psi(n)|^2$ for some $\gamma > 0$ for $E$ from some set of positive Lebesgue measure. In our one-dimensional case with $\psi = \delta_1$, the resolvent $R(E + i/T)\delta_1(n)$ is closely related to the corresponding transfer matrix $T(n, 1; E + i/T)$, which is in turn close to $T(n, 1; E)$ if $T$ is large and $n$ is not too large. Roughly speaking, if $\|T(n, 1; E)\|$ is bounded from above polynomially in $n$ for some $E$, then $R(E + i/T)\delta_1(n)$ decays no faster than polynomially for $n$ not too large (depending on $T$). The following result will be proved in Sect. 2:

Theorem 1. Let the operator $H$ in $\ell^2(\mathbb{Z})$ or $\ell^2(\mathbb{N})$ be given by (11) (with Dirichlet boundary condition at 0 in the half-line case). Suppose that for some $K > 0$, $C > 0$, $\alpha > 0$, the following condition holds: For any $N > 0$ large enough, there exists a nonempty Borel set $A(N) \subset \mathbb{R}$ such that $A(N) \subset [-K, K]$ and
$$\|T(n, m; E)\| \le C N^\alpha \quad \forall E \in A(N), \ \forall\, n, m : |n| \le N, \ |m| \le N \tag{16}$$
(resp., with $1 \le n \le N$, $1 \le m \le N$ in the case of $\ell^2(\mathbb{N})$). Let $N(T) = T^{1/(1 + \alpha)}$ and let $B(T)$ be the $1/T$-neighborhood of the set $A(N(T))$:
$$B(T) = \{E \in \mathbb{R} : \exists E' \in A(N(T)), \ |E - E'| \le 1/T\}.$$
Then for all $T > 1$ large enough, the following bound holds:
$$\sum_{|n| \ge N(T)/2} a(n, T) \ge \frac{\hat{C}}{T} |B(T)| N^{1 - 2\alpha}(T), \tag{17}$$
where $\hat{C}$ is some uniform positive constant and $|B|$ denotes the Lebesgue measure of the set $B$. In particular, for any $p > 0$, one has the following bound for the time-averaged moments of the position operator:
$$\langle |X|_\psi^p \rangle (T) \ge \frac{\hat{C}}{T} |B(T)| N^{p + 1 - 2\alpha}(T), \tag{18}$$
where $\langle |X|_\psi^p \rangle (T)$ is defined as in (15).

In the statement of Theorem 1 it is important that the constant $C$ in (16) be independent of $N$ and of $E \in A(N)$. If $A(N)$ consists of a single point $E_0$ (independently of $N$), the statement of the theorem is still non-trivial. Namely,

Corollary 1.1 (One-Energy Theorem). If
$$\|T(n, m; E_0)\| \le C(E_0)(|n| + |m|)^\alpha \tag{19}$$
for some $E_0$, uniformly in $n, m$, then
$$\beta^-(p) \ge \frac{p - 1 - 4\alpha}{1 + \alpha}.$$
If one only knows that $\|T(n, 1; E_0)\| \le C(E_0)|n|^\eta$, then one can take $\alpha = 2\eta$, so that
$$\beta^-(p) \ge \frac{p - 1 - 8\eta}{1 + 2\eta}. \tag{20}$$
It is interesting to compare this bound with (12), stating for $0 \le \eta < 1/2$, that $\beta^-(p) \ge p(1 - 2\eta)$. One observes that (20) yields a better result for $p$ large enough (and one needs the polynomial upper bound for the transfer matrix only for a single point and not on some set of positive measure). Moreover, in the case $\eta \ge 1/2$, the bound (20) is always non-trivial for $p$ large enough, while (12) gives nothing. In this case the spectrum of $H$ may be pure point with polynomially decaying eigenfunctions.

Another interesting consequence of Theorem 1 is the following:

Corollary 1.2. If there exist $C > 0$, $E_0 \in \mathbb{R}$, $\theta \ge 1$ such that $\|T(n, 1; E)\| \le C$ for all $n, E$ such that $|n| \cdot |E - E_0|^\theta \le 1$, then $\beta^-(p) \ge p - 1/\theta$.

Taking $\theta = 2$ in Corollary 1.2, we obtain the main dynamical result of [27]. Finally, we note that all the results mentioned above are stable with respect to finitely supported perturbations. This is of interest since all the previous dynamical lower bounds were based on dimensionality properties of spectral measures which are very sensitive to such perturbations.

Corollary 1.3. Suppose the potential $V_0$ is such that the operator $H_0$ with potential $V_0$ satisfies the hypothesis of Theorem 1. Let $W : \mathbb{Z} \to \mathbb{R}$ (resp., $W : \mathbb{N} \to \mathbb{R}$) be a finitely supported perturbation and $H = H_0 + W$. Then the operator $H$ satisfies (17) and (18). The same kind of stability holds for Corollaries 1.1 and 1.2.

We shall apply the results above to various models from one-dimensional quasicrystal theory (cf. [13]): the Fibonacci Hamiltonian, the period doubling Hamiltonian, and the Thue-Morse Hamiltonian. More applications will be discussed in a forthcoming publication [16]. The Fibonacci potential $V : \mathbb{Z} \to \mathbb{R}$ is given by
$$V(n) = \lambda \chi_{[1 - \omega, 1)}(n\omega \bmod 1). \tag{21}$$
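As an illustration of definition (21) only, the potential can be evaluated directly with $\omega = (\sqrt{5} - 1)/2$, the inverse of the golden mean. The following is a minimal sketch: the function name `fibonacci_potential` and the default coupling are choices made here, and floating-point evaluation of $n\omega \bmod 1$ is assumed to be adequate for moderate $|n|$.

```python
import math

# omega = inverse of the golden mean, as in (21)
OMEGA = (math.sqrt(5.0) - 1.0) / 2.0


def fibonacci_potential(n, lam=1.0):
    """Return V(n) = lam * chi_[1-omega, 1)(n*omega mod 1) for an integer n.

    Illustrative helper (not from the paper); relies on floating point,
    so it is only meant for moderate |n|.
    """
    # For floats, Python's % returns a value in [0, 1) (up to rounding),
    # also when n is negative.
    frac = (n * OMEGA) % 1.0
    return lam if frac >= 1.0 - OMEGA else 0.0


def potential_window(start, stop, lam=1.0):
    """List of V(n) for start <= n < stop."""
    return [fibonacci_potential(n, lam) for n in range(start, stop)]
```

Calling `potential_window(1, 9)` lists the potential on the sites $1, \dots, 8$; the same helper can be used to check numerically the reflection symmetry $V(-n) = V(n - 1)$, $n \ge 2$, that is used in Sect. 3.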

Here, $\lambda > 0$ and $\omega$ is the inverse of the golden mean, that is, $\omega = (\sqrt{5} - 1)/2$. For a given $\lambda$, let
$$C_\lambda = 2 + \sqrt{8 + \lambda^2}. \tag{22}$$
We shall prove

Theorem 2. (a) For every $\lambda, p$, we have
$$\beta^-(p) \ge \frac{p - 1 - 4\alpha}{1 + \alpha},$$
with
$$\alpha = \alpha(\lambda) = \frac{2 \log(C_\lambda (2C_\lambda + 1)^2)}{\log \omega^{-1}}. \tag{23}$$
(b) For every $p$ and every $\lambda > 4$, we have
$$\beta^-(p) \ge \frac{p - \gamma - 3\alpha}{1 + \alpha},$$
where $\alpha$ is as in (23) and
$$\gamma = \gamma(\lambda) = \frac{\log(2\lambda + 22)}{\log \omega^{-1}} - 1 < 1 + \alpha. \tag{24}$$
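To get a feeling for the size of the constants in Theorem 2, one can evaluate (22), (23) and (24) numerically. The sketch below is only an illustration; the function names are ours, and the lower bounds are simply the right-hand sides of parts (a) and (b) of the theorem.

```python
import math

# log(1/omega), where 1/omega = (1 + sqrt(5))/2 is the golden mean
LOG_INV_OMEGA = math.log((1.0 + math.sqrt(5.0)) / 2.0)


def c_lambda(lam):
    """C_lambda from (22)."""
    return 2.0 + math.sqrt(8.0 + lam * lam)


def alpha(lam):
    """alpha(lambda) from (23)."""
    c = c_lambda(lam)
    return 2.0 * math.log(c * (2.0 * c + 1.0) ** 2) / LOG_INV_OMEGA


def gamma(lam):
    """gamma(lambda) from (24); Theorem 2(b) uses it for lambda > 4."""
    return math.log(2.0 * lam + 22.0) / LOG_INV_OMEGA - 1.0


def beta_lower_a(p, lam):
    """Right-hand side of Theorem 2(a)."""
    a = alpha(lam)
    return (p - 1.0 - 4.0 * a) / (1.0 + a)


def beta_lower_b(p, lam):
    """Right-hand side of Theorem 2(b); only stated for lambda > 4."""
    if lam <= 4.0:
        raise ValueError("Theorem 2(b) assumes lambda > 4")
    a = alpha(lam)
    return (p - gamma(lam) - 3.0 * a) / (1.0 + a)
```

Since $\gamma < 1 + \alpha$, the bound in (b) exceeds the one in (a) by $(1 + \alpha - \gamma)/(1 + \alpha)$ for every $p$; both are informative only when $p$ is large compared with $\alpha$.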

Let us compare the result of Theorem 2 with previously known lower bounds for the moments. In fact, all these bounds are based on (9). The lower bounds for the Hausdorff dimension for the Fibonacci Hamiltonian [11, 26, 28] in turn have all been proved using (13). For example, in [28] it was proved that
$$\dim_H(\mu) \ge \frac{2\kappa}{\kappa + \alpha + 1/2}, \tag{25}$$
where
$$\kappa = \frac{\log(\sqrt{17}/4)}{5 \log(\omega^{-1})} \approx 0.0126$$
and $\alpha$ is given by (23). For large $p$, the dominant expression in the lower bound for $\beta^-(p)$ in Theorem 2 is $p/(1 + \alpha)$ which is clearly better than $2\kappa p/(\kappa + \alpha + 1/2)$. We therefore see that our result improves the previously known lower bound for large $p$.

We also note that Corollary 1.3 is particularly interesting in this context. The Fibonacci Hamiltonian has spectrum $\sigma$ of zero Lebesgue measure [8, 38] and hence, by a suitable application of the Simon-Wolff argument [36], a generic rank two perturbation (at two consecutive sites) will produce a spectral measure which is entirely supported away from $\sigma$. More precisely, the spectral measure of the perturbed operator will be supported on a countable set of eigenvalues with $\sigma$ being the set of accumulation points of these eigenvalues. In particular, a generic rank two perturbation may turn the spectral measure of the Fibonacci Hamiltonian, which is known to be $\alpha$-continuous for some $\alpha > 0$, into a pure point measure. In this situation, previous bounds do not give anything for the perturbed operator, while our bound for $\beta^-(p)$ is stable with respect to such a perturbation.

Our next application is to the period doubling Hamiltonian which was considered, for example, in [7, 9, 12, 14, 23]. Let $\Omega_{pd}$ be the (two-sided) subshift generated by the substitution $0 \to 01$, $1 \to 00$ and define, for some $\omega \in \Omega_{pd}$ and a given coupling constant $\lambda$, the potential $V_{\lambda,\omega}$ by $V_{\lambda,\omega}(n) = \lambda\omega_n$. It is known that for every $\lambda \ne 0$ and every $\omega \in \Omega_{pd}$, the operator with potential $V_{\lambda,\omega}$ has purely singular continuous spectrum (see [14] for this result and [7, 9, 12, 23] for earlier partial results). We shall show the following:

Theorem 3. For every $\lambda$ and every $\omega \in \Omega_{pd}$, we have
$$\beta^-(p) \ge \frac{p - 5}{2}.$$

This result is of interest for various reasons. First of all, this is the first dynamical result for this model that goes beyond the RAGE theorem (i.e., the dynamical result that follows from purely continuous spectrum). Indeed, apart from singular continuity, nothing on either dimensionality or dynamics was known rigorously. The difference from the Fibonacci case, where, as discussed above, earlier results in these directions were known, stems from the absence of an invariant for the trace map. More concretely, it could not be shown that the trace map orbits remain bounded for (sufficiently many) energies from the spectrum. With a result like Corollary 1.1 at our disposal, very weak results on the period doubling trace map suffice already to allow us to establish the result above. Secondly, observe that the dynamical bound we obtain in Theorem 3 is independent of the coupling constant. This is in sharp contrast to the Fibonacci case where our bound (and previous ones) becomes worse as the coupling constant is increased.

Finally, we consider the Thue-Morse Hamiltonian which was studied, for example, in [1, 6, 9, 17, 23]. Let $\Omega_{tm}$ be the (two-sided) subshift generated by the substitution $0 \to 01$, $1 \to 10$ and define as above, for some $\omega \in \Omega_{tm}$ and a given coupling constant $\lambda$, the potential $V_{\lambda,\omega}$ by $V_{\lambda,\omega}(n) = \lambda\omega_n$. We shall show the following:

Theorem 4. For every $\lambda$ and every $\omega \in \Omega_{tm}$, we have $\beta^-(p) \ge p - 1$.

As in the previous theorem, the bound is $\lambda$-independent and the first dynamical result for this model. Furthermore, in this case the spectral type has not even been identified in all cases. It is known that for every $\lambda \ne 0$ and every $\omega \in \Omega_{tm}$, the absolutely continuous spectrum is empty (this follows from results of Kotani [30] and Last and Simon [32]), but absence of eigenvalues is only known generically, that is, for a dense $G_\delta$ set of $\omega \in \Omega_{tm}$ (this was shown by Delyon and Peyrière [17] and Hof et al. [23], using different methods). This illustrates the theme of this paper in a nice way: While we cannot say anything about the dimensionality of spectral measures, we can nevertheless prove very strong dynamical bounds.

The organization of the article is as follows: Theorem 1 and Corollaries 1.1–1.3 will be proved in Sect. 2. In Sects. 3–5 we shall then apply these criteria to the Fibonacci model, the period doubling model, and the Thue-Morse model, respectively.

2. A New Criterion for Quantum Dynamical Lower Bounds

Let $H = \Delta + V$ be a self-adjoint operator on $\ell^2(\mathbb{Z})$ or on $\ell^2(\mathbb{N})$ (with Dirichlet boundary condition in the latter case). Given some real function $V(n)$, it is formally defined by
$$H\psi(n) = \psi(n - 1) + \psi(n + 1) + V(n)\psi(n), \quad n \in \mathbb{Z}$$
and by
$$H\psi(n) = \psi(n - 1) + \psi(n + 1) + V(n)\psi(n), \quad n > 1, \qquad H\psi(1) = \psi(2) + V(1)\psi(1)$$
respectively. Let $z$ be some complex or real number. We define the transfer matrix associated to the operator $H$ as follows:
$$T(n, m; z) = \begin{cases} A(n, z) A(n - 1, z) \cdots A(m + 1, z) & \text{if } n > m, \\ I & \text{if } n = m, \\ T^{-1}(m, n; z) & \text{if } n < m, \end{cases} \tag{26}$$
where
$$A(n, z) = \begin{pmatrix} z - V(n) & -1 \\ 1 & 0 \end{pmatrix}.$$
For z ∈ C with Im z = 0, define φ = R(z)δ1 = (H − zI )−1 δ1 and consider the vectors (n) = (φ(n + 1), φ(n))T . One can easily see from the definition of resolvent that (n) = A(n, z)(n − 1) for n = 1. Therefore, (n) = T (n, 1; z)(1), n > 1.

(27)

Moreover, in the case of 2 (Z), one has the identity (n) = T (n, 0; z)(0), n < 0.

(28)

Since det A(n, z) = 1, the same is true for T (n, m; z), so that T −1 (n, m; z) = T (n, m; z) for all n, m, z. Therefore, (27) and (28) imply (n) ≥ T (n, 1; z)−1 (1), n > 1

(29)

(n) ≥ T (n, 0; z)−1 (0), n < 0,

(30)

in both cases and

in the case of 2 (Z). It is clear that to get a lower bound for the resolvent, we need an upper bound for the norm of the transfer matrix for complex values of z. Usually, one has such bounds only for real values of z, so we should establish relations between these two cases. Lemma 2.1. Let E ∈ R, N > 0. Define K(N) =

sup

|n|≤N,|m|≤N

T (n, m; E)

in the case of 2 (Z) and L(N ) =

sup

T (n, m; E)

1≤n,m≤N

in the case of 2 (N). Let δ ∈ C. The following bounds hold: T (n, 1; E + δ) ≤ K(N) exp(K(N )|n||δ|), 1 ≤ n ≤ N,

(31)

T (n, 0; E + δ) ≤ K(N) exp(K(N )|n||δ|), −N ≤ n ≤ 0,

(32)

in the case of 2 (Z) and T (n, 1; E + δ) ≤ L(N ) exp(L(N )|n||δ|), 1 ≤ n ≤ N, in the case of 2 (N).

(33)
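Before turning to the proof, the half-line bound (33) can be sanity-checked numerically. The sketch below is only an illustration under assumptions of ours: the sample potential is arbitrary, all function names are hypothetical, and the operator norm of a $2 \times 2$ matrix is computed from the largest eigenvalue of $M^*M$.

```python
import math


def potential(n):
    """Arbitrary bounded sample potential (an assumption of this sketch)."""
    return 1.5 if n % 3 == 0 else 0.0


def matmul(a, b):
    """Product of two 2x2 (possibly complex) matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def transfer(n, m, z):
    """T(n, m; z) = A(n, z) ... A(m+1, z) for n >= m, as in (26)."""
    t = ((1.0, 0.0), (0.0, 1.0))
    for j in range(m + 1, n + 1):
        t = matmul(((z - potential(j), -1.0), (1.0, 0.0)), t)
    return t


def op_norm(mat):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of M* M."""
    (a, b), (c, d) = mat
    fro2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2  # trace of M* M
    det2 = abs(a * d - b * c) ** 2                                # det of M* M
    disc = max(fro2 * fro2 - 4.0 * det2, 0.0)
    return math.sqrt((fro2 + math.sqrt(disc)) / 2.0)


def sup_norm(big_n, energy):
    """L(N): sup of ||T(n, m; E)|| over 1 <= m <= n <= N.

    Pairs with n < m give the same norms, since det T = 1.
    """
    return max(
        op_norm(transfer(n, m, energy))
        for m in range(1, big_n + 1)
        for n in range(m, big_n + 1)
    )


def check_bound(big_n, energy, delta):
    """True if (33) holds for all 1 <= n <= N (with a tiny rounding allowance)."""
    ell = sup_norm(big_n, energy)
    return all(
        op_norm(transfer(n, 1, energy + delta))
        <= ell * math.exp(ell * n * abs(delta)) * (1.0 + 1e-12)
        for n in range(1, big_n + 1)
    )
```

For example, `check_bound(30, 0.5, 0.01 + 0.02j)` compares both sides of (33) for a complex shift of the energy.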

Proof. The proof is virtually identical to the proof of Theorem 2J in [35]. For example, for any $n$ with $2 \le n \le N$, one can write the identity
$$T(n, 1; E + \delta) = T(n, 1; E) + \delta \sum_{j=1}^{n-1} T(n, j + 1; E) B\, T(j, 1; E + \delta),$$
where
$$B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
By iteration, using the fact that $\|T(n, m; E)\| \le K(N)$ for all $1 \le n, m \le N$, one can show (31). The same proof yields (33). The bound (32) can be proved in a similar manner.

Proof of Theorem 1. We give the proof in the case of $\ell^2(\mathbb{Z})$; the proof in the case of $\ell^2(\mathbb{N})$ is analogous, with some aspects being even simpler than in the whole-line case. The basic idea is the same as in [40] in the case of a model with a sparse potential. First, using Parseval's identity [34], one can write for any $n$,
$$\frac{1}{T} \int_0^\infty |\exp(-itH)\delta_1(n)|^2 \exp(-2t/T) \, dt = \frac{\varepsilon}{2\pi} \int_{\mathbb{R}} |R(E + i\varepsilon)\delta_1(n)|^2 \, dE, \tag{34}$$
where $\varepsilon = 1/T$. For given $E, \varepsilon$, we shall write $\phi(n) = R(E + i\varepsilon)\delta_1(n)$. Let $T > 1$. Define $N \equiv N(T) = T^{1/(1 + \alpha)}$. The assumption of the theorem gives
$$K(N) = \sup_{|n| \le N, |m| \le N} \|T(n, m; E')\| \le C N^\alpha,$$
provided that $E' \in A(N)$. Let $E \in B(T)$. Then there exists $E' \in A(N(T)) \equiv A(N)$ such that $|\beta| \equiv |E - E'| \le 1/T$. Using Lemma 2.1, we obtain for all $n$ with $1 \le n \le N$,
$$\|T(n, 1; E + i\varepsilon)\| \le C N^\alpha \exp(C(|\beta| + \varepsilon)N^{\alpha + 1}) \le D N^\alpha, \tag{35}$$
where $D = C \exp(2C)$, since $\varepsilon = 1/T$, $|\beta| \le 1/T$ and $n \le N = T^{1/(1 + \alpha)}$. Similarly, for all $n$ with $-N \le n \le 0$,
$$\|T(n, 0; E + i\varepsilon)\| \le D N^\alpha. \tag{36}$$
It follows from (29)–(30) and (35)–(36) that
$$|\phi(n + 1)|^2 + |\phi(n)|^2 \ge K N^{-2\alpha} (|\phi(2)|^2 + |\phi(1)|^2), \tag{37}$$
for all $E \in B(T)$, $1 \le n \le N$, and
$$|\phi(n + 1)|^2 + |\phi(n)|^2 \ge K N^{-2\alpha} (|\phi(1)|^2 + |\phi(0)|^2), \tag{38}$$
for all $E \in B(T)$, $-N \le n \le 0$. Here, $K > 0$ is some uniform constant. Now with (37)–(38), we can estimate, for any $E \in B(T)$ and $T$ large enough:
$$\sum_{|n| \ge N/2 + 1} (|\phi(n + 1)|^2 + |\phi(n)|^2) \ge D N^{1 - 2\alpha} (|\phi(0)|^2 + |\phi(1)|^2 + |\phi(2)|^2) \tag{39}$$
with uniform constant $D > 0$. We now use the equation for the resolvent for $n = 1$:
$$\phi(2) + \phi(0) + (V(1) - E - i\varepsilon)\phi(1) = 1.$$
Since $E \in B(T)$, $A(N) \subset [-K, K]$ and $\varepsilon = 1/T \le 1$, we get
$$|\phi(2)| + |\phi(1)| + |\phi(0)| \ge \frac{1}{|V(1)| + K + 2}.$$
Together with (39) this gives
$$2 \sum_{|n| \ge N/2} |\phi(n)|^2 \ge \sum_{|n| \ge N/2 + 1} (|\phi(n + 1)|^2 + |\phi(n)|^2) \ge \gamma N^{1 - 2\alpha}, \tag{40}$$
where $\gamma > 0$ is some uniform constant depending on $|V(1)|, K, D$. Therefore (recall that $\phi = R(E + i\varepsilon)\delta_1$),
$$\sum_{|n| \ge N/2} \int_{\mathbb{R}} |R(E + i\varepsilon)\delta_1(n)|^2 \, dE \ge \int_{B(T)} \frac{\gamma}{2} N^{1 - 2\alpha} \, dE = \frac{\gamma}{2} |B(T)| N^{1 - 2\alpha}. \tag{41}$$
Summation over $\{n : |n| \ge N/2\}$ in (34) together with (41) proves (17). Since
$$\langle |X|^p \rangle (t) \ge (N/2)^p \sum_{|n| \ge N/2} |\exp(-itH)\delta_1(n)|^2,$$
the bound (18) follows immediately.

Proof of Corollary 1.1. One takes $A(N) = \{E_0\}$ for all $N$. Since $|B(T)| = 2/T$, the result follows directly from (18).

Proof of Corollary 1.2. One takes $A(N) = [E_0 - N^{-1/\theta}, E_0 + N^{-1/\theta}]$. One sees easily that (16) holds with $\alpha = 0$. Therefore, $N(T) = T$ and $|B(T)| \ge |A(N(T))| = 2T^{-1/\theta}$. The bound (18) yields the result.

Proof of Corollary 1.3. It is easy to see that the operator $H$ satisfies the hypothesis of Theorem 1 with the same $K$, $\alpha$, $A(N)$, and some appropriately adjusted constant $C$. This yields (17) and (18).

3. The Fibonacci Model

In this section we apply Theorem 1 to the Fibonacci model. We refer the reader to [13] for background information on this model and related ones. While Theorem 1 already gives a non-trivial dynamical bound when the set $A(N)$ consists of a single point, and such an input is known for the Fibonacci model [24], we shall nevertheless identify the natural set $A(N)$ of energies for which one can prove the local power-law bounds required by Theorem 1. On the one hand, this improves the dynamical bound, since we obtain a larger lower bound for the measure of $B(T)$, and on the other hand it illustrates nicely how the sets on which one can prove local power-law bounds (they will turn out to be spectra of periodic approximants) shrink down to a zero-measure set (the spectrum of the Fibonacci Hamiltonian) of energies for which the power-law bounds hold globally.

Let us first present some basic notions and results for the Fibonacci potential given by (21). The transfer matrices $T(n, m; z)$ are defined as in (26), and we shall sometimes write $T_\lambda(n, m; z)$ to make their dependence on the parameter $\lambda$ explicit. Define the sequence $(F_k)_{k \ge 0}$ of Fibonacci numbers by
$$F_0 = 1, \quad F_1 = 1, \quad F_{k+1} = F_k + F_{k-1} \ \text{ for } k \ge 1.$$
Let
$$x_k(E, \lambda) = \operatorname{tr} M_k(E, \lambda),$$
where
$$M_k(E, \lambda) = T_\lambda(F_k, 1; E).$$
We will often leave the dependence of $x_k$ or $M_k$ on $E, \lambda$ implicit. The matrices $M_k$ obey the recursion
$$M_k = M_{k-2} M_{k-1} \tag{42}$$
which yields
$$x_{k+1} = x_k x_{k-1} - x_{k-2} \tag{43}$$
for their traces. This in turn gives the invariant
$$x_{k+1}^2 + x_k^2 + x_{k-1}^2 - x_{k+1} x_k x_{k-1} = 4 + \lambda^2 \quad \text{for every } k \in \mathbb{N}. \tag{44}$$
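The conservation law (44) under the trace map (43) can be checked in exact rational arithmetic. The sketch below is only an illustration: the initial data $x_{-1} = 2$, $x_0 = E$, $x_1 = E - \lambda$ are an assumption of ours (a commonly used normalization whose indexing need not match the convention $F_0 = F_1 = 1$ above), and the function names are hypothetical. The recursion (43) preserves the left-hand side of (44) for any initial triple; these particular data give it the value $4 + \lambda^2$.

```python
from fractions import Fraction


def traces(energy, lam, k_max):
    """Return [x_{-1}, x_0, x_1, ..., x_{k_max}] generated by the trace map (43).

    Assumed initial data (not taken from the text above):
    x_{-1} = 2, x_0 = E, x_1 = E - lam.
    """
    e, l = Fraction(energy), Fraction(lam)
    xs = [Fraction(2), e, e - l]
    while len(xs) < k_max + 2:
        # x_{k+1} = x_k * x_{k-1} - x_{k-2}
        xs.append(xs[-1] * xs[-2] - xs[-3])
    return xs


def invariant(x_prev, x_cur, x_next):
    """Left-hand side of (44) for three consecutive traces."""
    return x_next ** 2 + x_cur ** 2 + x_prev ** 2 - x_next * x_cur * x_prev


def invariant_is_conserved(energy, lam, k_max=12):
    """Check (44) exactly along the orbit of the trace map."""
    xs = traces(energy, lam, k_max)
    target = 4 + Fraction(lam) ** 2
    return all(
        invariant(xs[i - 1], xs[i], xs[i + 1]) == target
        for i in range(1, len(xs) - 1)
    )
```

Exact fractions are used because, for energies outside the spectrum, the traces grow very quickly and a floating-point check of (44) would suffer from cancellation.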

The identities (42)–(44) were proved by Sütő in [37]. For fixed $\lambda$, define
$$\sigma_k = \{E \in \mathbb{R} : |x_k(E, \lambda)| \le 2\}.$$
The set $\sigma_k$ is actually equal to the spectrum of the Schrödinger operator $H$ whose potential $V_k$ results from $V$ in (21) by replacing $\omega$ by $F_{k-1}/F_k$ (see [37]). Hence, $V_k$ is periodic, $\sigma_k \subset \mathbb{R}$, and it consists of $F_k$ bands (closed intervals).

Lemma 3.1. For every $E \in \sigma_k$, we have
$$|x_i(E, \lambda)| \le C_\lambda \quad \text{for } 0 \le i \le k. \tag{45}$$
Proof. It was shown in [37] that for every $m \in \mathbb{N}$, $\sigma_m \cup \sigma_{m+1} \subseteq \sigma_{m-1} \cup \sigma_m$. Thus, if $E \in \sigma_k$, then $|x_k(E, \lambda)| \le 2$ and for every $i < k$, we have that either $|x_i(E, \lambda)| \le 2$ or $\max\{|x_{i-1}(E, \lambda)|, |x_{i+1}(E, \lambda)|\} \le 2$. In the former case, the claimed bound clearly holds. In the latter case, use the invariant (44) to establish the claimed upper bound for $|x_i(E, \lambda)|$.

Lemma 3.1 is the key tool in identifying a natural set of energies for which the desired power-law bounds on transfer matrices up to a certain distance from the origin hold.

Proposition 3.2. For every $\lambda$, there is a constant $C$ such that for every $k \in \mathbb{N}$, every $E \in \sigma_k$, and every $m$ with $-F_k + 1 \le m \le F_k$ ($m \ne 0$), we have
$$\|T_\lambda(m, 1; E)\| \le C|m|^\alpha, \tag{46}$$
where
$$\alpha = \frac{\log(C_\lambda (2C_\lambda + 1)^2)}{\log \omega^{-1}}. \tag{47}$$

Proof. We first note that due to the symmetry $V(-n) = V(n - 1)$, $n \ge 2$ of the potential [37], we can restrict our attention to $1 \le m \le F_k$. Our principal strategy is to modify the approach of [24] as to treat a larger set of energies, while proving bounds only in finite regions. Essentially, their proof of a power-law upper bound for energies in the spectrum can be turned into the claimed result by noting that for each $m$, boundedness of traces is only needed for $j$'s with $F_j \le m$, and the previous lemma established this fact for energies in $\sigma_k$, where $k = \max\{j : F_j \le m\}$.

For the reader's convenience, let us be more concrete. Fix $\lambda$, $k$, and $E \in \sigma_k$. By symmetry ($V(-n) = V(n - 1)$, $n \ge 2$; see [37]), we can restrict our attention to the case of positive $m$. The proof of (46) will be split into several steps.

Step 1. For $0 \le i \le k$, we have
$$\|M_i\| \le (C_\lambda)^i. \tag{48}$$
This follows from
$$M_i = M_{i-2} M_{i-1} = M_{i-2}(x_{i-1} I - M_{i-1}^{-1}) = x_{i-1} M_{i-2} - M_{i-3}^{-1}$$
along with $\|M_{i-3}^{-1}\| = \|M_{i-3}\|$.

Step 2. For $i, j \in \mathbb{N}$ with $i + j \le k$ and $i \ge 2$, we have
$$M_i M_{i+j} = P_j^{(1)} M_{i+j} + P_j^{(2)} M_{i+j-1} + P_j^{(3)} M_{i+j-2} + P_j^{(4)} I, \tag{49}$$
where, for $l = 1, 2, 3, 4$, $P_j^{(l)} = P_j^{(l)}(x_{i-1}, x_i, \ldots, x_{i+j})$ is a polynomial of degree at most $j$, and we have
$$\sum_{l=1}^{4} |P_j^{(l)}|(|x_{i-1}|, \ldots, |x_{i+j}|) \le (2C_\lambda + 1)^j, \tag{50}$$
where the polynomial $|P_j^{(l)}|$ results from $P_j^{(l)}$ by replacing all the coefficients by their respective absolute values. To see this, given (45), one can literally redo the proof of Lemma 5 in [24] since it only uses the trace bound for indices bounded by $k$.

Step 3. For $1 \le m \le F_k$, we have
$$\|T_\lambda(m, 1; E)\| \le d^{\,m_N}, \tag{51}$$
where
$$d = C_\lambda (2C_\lambda + 1)^2$$
and $m_N$ appears in the unique coding of $m$ in terms of the Fibonacci numbers,
$$m = \sum_{l=0}^{N} F_{m_l}, \qquad F_{m_0} < F_{m_1} < \cdots < F_{m_N}, \quad m_l - m_{l-1} \ge 2.$$
Clearly, $m_N = \max\{i : F_i \le m\}$ and hence $m_N \le k$. The estimate (51) can be proved in the exact same way as in [24], given Steps 1 and 2 above.
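The coding of $m$ used in Step 3 can be computed greedily: repeatedly subtract the largest Fibonacci number not exceeding the remainder. The sketch below is only an illustration with hypothetical function names; it uses the convention $F_0 = F_1 = 1$ from above and, as an assumption made here to fix the ambiguity $F_0 = F_1$, only indices $\ge 1$.

```python
def fib_numbers(limit):
    """[F_0, F_1, ...] with F_0 = F_1 = 1, listing every F_k <= limit (limit >= 1)."""
    fibs = [1, 1]
    while fibs[-1] + fibs[-2] <= limit:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs


def fibonacci_coding(m):
    """Indices m_0 < m_1 < ... < m_N with m = sum of F_{m_l} and m_l - m_{l-1} >= 2.

    Greedy construction; all indices are >= 1 (a convention of this sketch).
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    fibs = fib_numbers(m)
    indices = []
    i = len(fibs) - 1
    remainder = m
    while remainder > 0:
        # move down to the largest Fibonacci number not exceeding the remainder
        while fibs[i] > remainder:
            i -= 1
        indices.append(i)
        remainder -= fibs[i]
    return indices[::-1]
```

For instance, $20 = 13 + 5 + 2 = F_6 + F_4 + F_2$, so `fibonacci_coding(20)` returns `[2, 4, 6]`, and the last index is $m_N = \max\{i : F_i \le m\}$ as stated in Step 3.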

Step 4. Conclusion of the proof: The Fibonacci numbers $F_i$ behave asymptotically like $\omega^{-i} = [(1 + \sqrt{5})/2]^i$ and in particular, there is a constant $D_1$ such that $F_i \ge D_1 \omega^{-i}$ for every $i \ge 1$. This means that the number $m_N$ defined in Step 3 obeys
$$m_N \le \frac{\log m}{\log \omega^{-1}} + D_2$$
for some suitable constant $D_2$, which in turn implies
$$\|T_\lambda(m, 1; E)\| \le d^{\,\frac{\log m}{\log \omega^{-1}} + D_2},$$
and hence (46) for a constant $C$ depending only on $\lambda$ and $\alpha$ as in (47).

We therefore have found a natural candidate that serves the purpose of the set $A(N)$ in Theorem 1. Namely, given some $N$, choose $k$ such that $F_{k-1} < N \le F_k$ and set $A(N) = \sigma_k$.

Next, we prove a lower bound for the Lebesgue measure of $\sigma_k$. More precisely, we obtain lower bounds for each one of the $F_k$ intervals it is made of. The results presented here are essentially contained in Raymond [33]. We give a somewhat more streamlined presentation in the spirit of [28]. From now on, we shall always assume
$$\lambda > 4, \tag{52}$$
since we will make critical use of the fact that in this case, it follows from the invariant (44) that three consecutive traces cannot be bounded in absolute value by 2:
$$\forall \lambda > 4, \ \forall E, k : \ \max\{|x_k(E, \lambda)|, |x_{k+1}(E, \lambda)|, |x_{k+2}(E, \lambda)|\} > 2. \tag{53}$$
Following [28], we call a band $I_k \subset \sigma_k$ a type A band if $I_k \subset \sigma_{k-1}$ (and hence $I_k \cap (\sigma_{k+1} \cup \sigma_{k-2}) = \emptyset$). We call a band $I_k \subset \sigma_k$ a type B band if $I_k \subset \sigma_{k-2}$ (and therefore $I_k \cap \sigma_{k-1} = \emptyset$). From (53), one gets the following (Lemma 5.3 of [28], essentially Lemma 6.1 of [33]):

Lemma 3.3. For every $\lambda > 4$ and every $k \in \mathbb{N}$,
(a) Every type A band $I_k \subset \sigma_k$ contains exactly one type B band $I_{k+2} \subset \sigma_{k+2}$, and no other bands from $\sigma_{k+1}$, $\sigma_{k+2}$.
(b) Every type B band $I_k \subset \sigma_k$ contains exactly one type A band $I_{k+1} \subset \sigma_{k+1}$ and two type B bands from $\sigma_{k+2}$, positioned around $I_{k+1}$.

We will also need the following lemma (Lemma 5.4 of [28], essentially Proposition A.2 of [33]).

Lemma 3.4. Let the functions $f_\pm(x, y, \lambda)$ be defined by
$$f_\pm(x, y, \lambda) = \frac{1}{2} \Big( xy \pm \sqrt{4\lambda^2 + (4 - x^2)(4 - y^2)} \Big).$$
For $\lambda > 4$ and $-2 \le x, y \le 2$, we have
$$\max \Big\{ \Big| \frac{\partial f_\pm}{\partial x}(x, y, \lambda) \Big|, \Big| \frac{\partial f_\pm}{\partial y}(x, y, \lambda) \Big| \Big\} \le 1.$$
Quantum Dynamics in One Dimension

527

Equipped with the previous two lemmas, we are now in position to prove the following: Lemma 3.5. For every λ > 4, the following holds: (a) Given any (type A) band Ik+1 ⊂ σk+1 lying in the band Ik ⊂ σk , we have for every E ∈ Ik+1 , xk+1 (E) x (E) ≤ λ + 11. k (b) Given any (type B) band Ik+2 ⊂ σk+2 lying in the band Ik ⊂ σk , we have for every E ∈ Ik+2 , xk+2 (E) x (E) ≤ 2(λ + 11). k Proof. (a) Differentiating (43) and dividing by xk , we get xk+1

xk

= xk−1 +

xk xk−1

xk

−

xk−2

xk

.

(54)

Using the invariant (44) and Lemma 3.4, we obtain for every E ∈ Ik+1 , |xk−1 | = |f± (xk , xk−2 , λ)|

1 2 2 2 xk xk−2 ± 4λ + (4 − xk )(4 − xk−2 ) = 2 1 ≤ 4 + 4λ2 + 64 2 ≤ λ + 6. Lemma 3.4 implies ∂f± ∂f± ≤ |x | + |x |. |xk−1 | = (xk , xk−2 , λ)xk + (xk , xk−2 , λ)xk−2 k k−2 ∂x ∂y From Lemma 5.5 of [28], we infer, again for E ∈ Ik+1 , xk−2 x < 1. k

(55)

(56)

Using (54), (55), and (56), we get xk+1 |xk−1 | |xk−2 | x ≤ λ + 6 + |xk | |x | + |x | k k k

|xk−2 |xk−2 | | ≤ λ+6+2 1+ + |xk | |xk | ≤ λ + 11. (b) We consider two cases. Assume first Ik+2 ∩ σk−1 = ∅ (and so Ik+2 ⊂ σk−2 ). Given xk+2 = xk+1 xk + xk+1 xk − xk−1

528

D. Damanik, S. Tcheremchantsev

and

xk+1 = xk xk−1 + xk xk−1 − xk−2 ,

we find

xk+2

xk

= 2xk+1 − xk−2 + (xk2 − 1)

xk−1

xk

− xk

xk−2

xk

.

Similarly to the previous argument, |xk+1 | ≤ λ + 6 and which yields

| ≤ |xk | + |xk−2 |, |xk−1

xk+2 |x | |x | ≤ 2λ + 12 + 2 + 3 1 + k−2 + 2 k−2 x |xk | |xk | k ≤ 2λ + 22.

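Let us now assume $I_{k+2} \subset \sigma_{k-1}$ (and so $I_{k+2} \cap \sigma_{k-2} = \emptyset$). We proceed analogously and obtain
$$\frac{x_{k+2}'}{x_k'} = x_{k+1} + x_k \frac{x_{k+1}'}{x_k'} - \frac{x_{k-1}'}{x_k'},$$
$$|x_{k+1}| \le \lambda + 6, \quad \text{and} \quad |x_{k+1}'| \le |x_k'| + |x_{k-1}'|,$$
which then yields
$$\left| \frac{x_{k+2}'}{x_k'} \right| \le \lambda + 6 + 2 \frac{|x_k'| + |x_{k-1}'|}{|x_k'|} + \frac{|x_{k-1}'|}{|x_k'|} \le \lambda + 11,$$
concluding the proof.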

This yields, as a very rough estimate, that for $\lambda > 4$ and $E \in \sigma_k$, we have
$$|x_k'(E, \lambda)| \le C(2\lambda + 22)^k, \tag{57}$$
which can easily be turned into a lower bound on bandwidths, as the following proposition shows.

Proposition 3.6. For every $\lambda > 4$, the set $\sigma_k$ consists of $F_k$ disjoint closed intervals, each of which has Lebesgue measure bounded from below by
$$|I_k| \ge \frac{4}{C(2\lambda + 22)^k}. \tag{58}$$
In particular, we obtain
$$|\sigma_k| \ge \frac{4}{C} F_k^{-\gamma}, \tag{59}$$
where $\gamma$ is as in (24).


Proof. Since $\sigma_k$ is the spectrum of a periodic Schrödinger operator with period $F_k$, the first statement is immediate. That the gaps are open was shown in [33]. The estimate (58) for the measure of one of these $F_k$ bands can be seen as follows (cf. [33]): On such a band $I_k$, $x_k(E, \lambda)$ runs monotonically from $\pm 2$ to $\mp 2$. Hence, by (57),
$$4 = \int_{I_k} |x_k'(t, \lambda)| \, dt \le |I_k| \cdot C(2\lambda + 22)^k.$$
The estimate (59) then follows from this and the exponential behavior of the sequence $(F_k)_{k \in \mathbb{N}}$.

We have established the input to Theorem 1. Note that in any event, we have $|B(T)| \ge 2/T$. In the case where $\lambda > 4$, Proposition 3.6 improves this factor in (17) and (18). We therefore proceed with the

Proof of Theorem 2. (a) This follows from Proposition 3.2 and Corollary 1.1.
(b) Given $T$, we let as above $N(T) = T^{1/(1+\alpha)}$ and we choose $k$ such that $F_{k-1} < N(T) \le F_k$. We let $A(N(T)) = \sigma_k$. Thus, from Proposition 3.6 we get
$$|B(T)| \ge \frac{4}{C} F_k^{-\gamma} \ge \frac{4}{C} (2F_{k-1})^{-\gamma} \ge \frac{4}{C} 2^{-\gamma} N(T)^{-\gamma} = \frac{4}{C} 2^{-\gamma} T^{-\frac{\gamma}{1+\alpha}}.$$
Now (18) yields
$$\langle |X|_\psi^p \rangle (T) \ge \tilde{C}\, 2^{-\gamma}\, T^{\frac{p - \gamma - 3\alpha}{1+\alpha}}$$
for $T$ large enough and hence $\beta^-(p) \ge (p - \gamma - 3\alpha)/(1 + \alpha)$.

4. The Period Doubling Model

In this section we investigate the period doubling model and, in particular, prove Theorem 3. This will be done by first proving linear bounds on transfer matrix norms for a finite/countable set of energies and then alluding to Corollary 1.1.

On the alphabet $A = \{0, 1\}$, consider the period doubling substitution $S(0) = 01$, $S(1) = 00$. Iterating on 0, we obtain a one-sided sequence $u = 01000101\ldots$ which is invariant under the substitution process. Define the associated subshift $\Omega_{pd}$ to be the set of all two-sided sequences over $A$ which have all their finite subwords occurring in $u$. For $\lambda \in \mathbb{R}$ and $\omega \in \Omega_{pd}$, we define the potential $V_{\lambda,\omega}$ by $V_{\lambda,\omega}(n) = \lambda \omega_n$. If $w = w_1 \ldots w_l$ with $w_i \in \{0, 1\}$, we define
$$T_\lambda(w; E) = A_\lambda(w_l; E) \times \cdots \times A_\lambda(w_1; E),$$
where for $a \in \{0, 1\}$,
$$A_\lambda(a; E) = \begin{pmatrix} E - \lambda a & -1 \\ 1 & 0 \end{pmatrix}.$$
We let
$$T_k^{(0)} = T_k^{(0)}(E, \lambda) = T_\lambda(S^k(0); E), \qquad T_k^{(1)} = T_k^{(1)}(E, \lambda) = T_\lambda(S^k(1); E).$$
In the following, we will leave the dependence of $T_k^{(0)}$, $T_k^{(1)}$ on $E$ and $\lambda$ implicit. We also define
$$x_k = \operatorname{tr} T_k^{(0)}, \qquad y_k = \operatorname{tr} T_k^{(1)}.$$


It follows from the substitution rule (and is easy to check) that
$$T_{k+1}^{(0)} = T_k^{(1)} T_k^{(0)}, \qquad T_{k+1}^{(1)} = T_k^{(0)} T_k^{(0)}$$
and
$$x_{k+1} = x_k y_k - 2, \qquad y_{k+1} = x_k^2 - 2. \tag{60}$$
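The recursion (60) can be checked numerically by building the transfer matrices directly from the substituted words. The sketch below is only an illustration under the definitions given above; the helper names (`matmul`, `transfer`, `period_doubling`, `traces`) and the sample values of $E$ and $\lambda$ are hypothetical choices.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def transfer(word, E, lam):
    """T_λ(w; E) = A_λ(w_l; E) ... A_λ(w_1; E): the first letter acts first."""
    T = [[1.0, 0.0], [0.0, 1.0]]
    for a in word:
        T = matmul([[E - lam * a, -1.0], [1.0, 0.0]], T)
    return T

def period_doubling(word):
    """Apply S(0) = 01, S(1) = 00 letter by letter."""
    return [b for a in word for b in ((0, 1) if a == 0 else (0, 0))]

def traces(k_max, E, lam):
    """Return [(x_0, y_0), ..., (x_kmax, y_kmax)] computed from S^k(0), S^k(1)."""
    w0, w1, out = [0], [1], []
    for _ in range(k_max + 1):
        t0, t1 = transfer(w0, E, lam), transfer(w1, E, lam)
        out.append((t0[0][0] + t0[1][1], t1[0][0] + t1[1][1]))
        w0, w1 = period_doubling(w0), period_doubling(w1)
    return out
```

Comparing consecutive entries of `traces(4, 0.3, 1.0)` against $x_k y_k - 2$ and $x_k^2 - 2$ gives agreement up to rounding, without using the trace map in the computation itself.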

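The relation (60) is called the period doubling trace map. By virtue of Corollary 1.1, Theorem 3 follows once we find an energy $E_0$ such that (19) holds with $\alpha = 1$. We establish this first and then discuss later what the natural set of energies is for which one can establish a bound like (19).

Proof of Theorem 3. Clearly, we have
$$T_0^{(0)} = \begin{pmatrix} E & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T_0^{(1)} = \begin{pmatrix} E - \lambda & -1 \\ 1 & 0 \end{pmatrix}.$$
Notice that for $E_0 = 0$, we have
$$T_0^{(0)} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad T_0^{(1)} = \begin{pmatrix} -\lambda & -1 \\ 1 & 0 \end{pmatrix}.$$
Thus, for this choice of the energy, we get
$$T_1^{(0)} = T_0^{(1)} T_0^{(0)} = \begin{pmatrix} -\lambda & -1 \\ 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & \lambda \\ 0 & -1 \end{pmatrix}$$
and
$$T_1^{(1)} = T_0^{(0)} T_0^{(0)} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}.$$
In particular, $T_1^{(0)}$ and $T_1^{(1)}$ commute. Notice that
$$\left( T_1^{(0)} \right)^n = (-1)^n \begin{pmatrix} 1 & -n\lambda \\ 0 & 1 \end{pmatrix}.$$
Every subword $w$ of $u$ can be partitioned into a product of blocks of the form $S(0)$ or $S(1)$, up to a possible prefix/suffix of length one. We therefore get, for every $\omega \in \Omega_{pd}$,
$$\| T(n, m; E_0) \| \le C_\lambda^2 \left( \sqrt{2} + \frac{\lambda}{2} |n - m| \right),$$
where
$$C_\lambda = \left\| \begin{pmatrix} -\lambda & -1 \\ 1 & 0 \end{pmatrix} \right\| \le \sqrt{2} + \lambda.$$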

We can now apply Corollary 1.1.
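The linear growth at $E_0 = 0$ can be observed directly on a prefix of the period doubling sequence. The following sketch is an illustration only: it uses the Frobenius norm as a convenient upper bound for the operator norm, which is an assumption of this example rather than the norm fixed in the text, and all helper names and the value of $\lambda$ are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def transfer(word, E, lam):
    """Transfer matrix over a word; the first letter acts first."""
    T = [[1.0, 0.0], [0.0, 1.0]]
    for a in word:
        T = matmul([[E - lam * a, -1.0], [1.0, 0.0]], T)
    return T

def period_doubling(word):
    """Apply S(0) = 01, S(1) = 00 letter by letter."""
    return [b for a in word for b in ((0, 1) if a == 0 else (0, 0))]

def frob(m):
    """Frobenius norm, an upper bound for the operator norm."""
    return math.sqrt(sum(v * v for row in m for v in row))

lam = 1.5                       # illustrative coupling constant
u = [0]
for _ in range(10):             # prefix of u of length 2**10
    u = period_doubling(u)

C = frob([[-lam, -1.0], [1.0, 0.0]])

def bound(n):
    """Right-hand side C² (sqrt(2) + (λ/2) n) of the linear bound."""
    return C**2 * (math.sqrt(2) + 0.5 * abs(lam) * n)

norms = [frob(transfer(u[:n], 0.0, lam)) for n in range(1, 257)]
```

Over an even number of letters the product is, up to sign, upper triangular with unit diagonal, so only the off-diagonal entry grows, and it grows at most linearly in the length of the word.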

We now study the set of exceptional energies (i.e., where we get commuting transfer matrices at some level) a little further. We prove the following:


Lemma 4.1. Fix some coupling constant $\lambda$ and some $k \in \mathbb{N}_0$. If $E$ is such that $x_k = 0$, then there is a constant $a_{\lambda,k}$ such that
$$T_{k+1}^{(0)} \text{ is conjugate to } \begin{pmatrix} -1 & a_{\lambda,k} \\ 0 & -1 \end{pmatrix} \tag{61}$$
and
$$T_{k+1}^{(1)} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{62}$$

Proof. If $x_k = 0$, then (60) gives $x_{k+1} = -2$. Hence, (61) is immediate. Moreover, (62) follows from the Cayley-Hamilton theorem since we have
$$T_{k+1}^{(1)} = T_k^{(0)} T_k^{(0)} = x_k \cdot T_k^{(0)} - \mathrm{Id} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}.$$

This allows us to prove:

Proposition 4.2. Fix some coupling constant $\lambda$ and some $k \in \mathbb{N}_0$. For every root $E_0$ of $x_k$, we have
$$\| T(n, m; E_0) \| \le C_{\lambda,k}^2 \left( \sqrt{2} + \frac{a_{\lambda,k}}{2^k} |n - m| \right),$$
where $C_{\lambda,k}$ is some suitable constant.

Proof. The argument is virtually the same as the one used in the proof of Theorem 3.

Notice that $x_k$ is, as a function of $E$, a polynomial of degree $2^k$ which has exactly $2^k$ roots. In particular, there is a countable set of energies where linear bounds on the growth of transfer matrix norms can be proved.

5. The Thue-Morse Model

In this section we investigate the Thue-Morse model and, in particular, prove Theorem 4. This will be done by exhibiting a set of energies for which the transfer matrix norms are bounded and then alluding to Corollary 1.1.

On the alphabet $A = \{0, 1\}$, consider the Thue-Morse substitution $S(0) = 01$, $S(1) = 10$. Iterating on 0, we obtain a one-sided sequence $u = 01101001\ldots$ which is invariant under the substitution process. Define the associated subshift $\Omega_{tm}$ to be the set of all two-sided sequences over $A$ which have all their finite subwords occurring in $u$. For $\lambda \in \mathbb{R}$ and $\omega \in \Omega_{tm}$, we define as in the period doubling case the potential $V_{\lambda,\omega}$ by $V_{\lambda,\omega}(n) = \lambda \omega_n$. We can now define transfer matrices in the same way as above, that is,
$$T_k^{(0)} = T_k^{(0)}(E, \lambda) = T_\lambda(S^k(0); E), \qquad T_k^{(1)} = T_k^{(1)}(E, \lambda) = T_\lambda(S^k(1); E).$$
Again, we will leave the dependence of $T_k^{(0)}$, $T_k^{(1)}$ on $E$ and $\lambda$ implicit. We define
$$x_k = \operatorname{tr} T_k^{(0)}, \qquad y_k = \operatorname{tr} T_k^{(1)}.$$


It is clear that $x_k = y_k$ for $k \ge 1$ and it follows from the substitution rule that
$$T_{k+1}^{(0)} = T_k^{(1)} T_k^{(0)}, \qquad T_{k+1}^{(1)} = T_k^{(0)} T_k^{(1)}$$
and
$$x_{k+1} = x_{k-1}^2 (x_k - 2) + 2 \quad \text{for } k \ge 2. \tag{63}$$
The relation (63) is called the Thue-Morse trace map. By virtue of Corollary 1.1, Theorem 4 follows once we find an energy $E_0$ such that (19) holds with $\alpha = 0$. Let $\mathcal{E}_k = \{E : x_k = 2\}$.

Proposition 5.1. If $k \ge 3$ and $E \in \mathcal{E}_k \setminus \mathcal{E}_2$, then
$$T_k^{(0)} = T_k^{(1)} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{64}$$

Proof. This can be extracted from [1]. For the reader's convenience, we give a short proof of this fact. Notice that it follows from (63) that for $k \ge 3$, we have $x_k = 2$ if and only if $x_{k-1} = 2$ or $x_{k-2} = 0$. Iterating this, we get that $x_k = 2$ holds if and only if $x_2 = 2$ or $x_j = 0$ for some $1 \le j \le k - 2$. Thus, if $k \ge 3$ and $E \in \mathcal{E}_k \setminus \mathcal{E}_2$, then $x_j = 0$ for some $1 \le j \le k - 2$. Using this and the Cayley-Hamilton theorem, we obtain
$$T_{j+2}^{(0)} = T_j^{(0)} T_j^{(1)} T_j^{(1)} T_j^{(0)} = T_j^{(0)} \left( x_j T_j^{(1)} - I \right) T_j^{(0)} = -T_j^{(0)} T_j^{(0)} = -\left( x_j T_j^{(0)} - I \right) = I$$
and, similarly, $T_{j+2}^{(1)} = I$. This yields (64).
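Proposition 5.1 can be illustrated in the smallest case $k = 3$, $j = 1$. Since $x_1 = E(E - \lambda) - 2$, choosing $E$ as a root of $E(E - \lambda) = 2$ gives $x_1 = 0$, and the transfer matrices over $S^3(0)$ and $S^3(1)$ should then both equal the identity. The sketch below is an assumption-laden illustration (hypothetical helper names, an arbitrary value of $\lambda$), not part of the original argument.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def transfer(word, E, lam):
    """Transfer matrix over a word; the first letter acts first."""
    T = [[1.0, 0.0], [0.0, 1.0]]
    for a in word:
        T = matmul([[E - lam * a, -1.0], [1.0, 0.0]], T)
    return T

def thue_morse(word):
    """Apply S(0) = 01, S(1) = 10 letter by letter."""
    return [b for a in word for b in ((0, 1) if a == 0 else (1, 0))]

lam = 1.5                                   # illustrative coupling constant
E = (lam + math.sqrt(lam**2 + 8)) / 2       # a root of E(E - lam) = 2

w0, w1 = [0], [1]
for _ in range(3):                          # build S^3(0) and S^3(1)
    w0, w1 = thue_morse(w0), thue_morse(w1)

T3_0 = transfer(w0, E, lam)
T3_1 = transfer(w1, E, lam)
```

Both `T3_0` and `T3_1` agree with the identity matrix up to rounding, so at this energy the transfer matrices over longer words stay bounded.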

Proof of Theorem 4. Given Proposition 5.1, the claim follows as in the proof of Theorem 3 from Corollary 1.1. For example, the reader may verify that if $E$ is chosen such that $E(E - \lambda) = 2$, then (64) holds for $k = 3$ by a straightforward calculation. This observation already suffices for an application of Corollary 1.1.

We conclude with a remark about the special energies exhibited by Proposition 5.1. It follows from (63) that for $k \ge 2$, $\mathcal{E}_k \subset \mathcal{E}_{k+1}$. Axel and Peyrière show that the union of the sets $\mathcal{E}_k$ is dense in the spectrum of the Thue-Morse Hamiltonian [1].

References
1. Axel, F., Peyrière, J.: Spectrum and extended states in a harmonic chain with controlled disorder: Effects of the Thue-Morse symmetry. J. Statist. Phys. 57, 1013–1047 (1989)
2. Barbaroux, J.M., Combes, J.M., Montcho, R.: Remarks on the relation between quantum dynamics and fractal spectra. J. Math. Anal. Appl. 213, 698–722 (1997)
3. Barbaroux, J.M., Germinet, F., Tcheremchantsev, S.: Nonlinear variation of diffusion exponents in quantum dynamics. C.R. Acad. Sci. Paris. Série I 330, 409–414 (2000)


4. Barbaroux, J.M., Germinet, F., Tcheremchantsev, S.: Fractal dimensions and the phenomenon of intermittency in quantum dynamics. Duke Math. J. 110, 161–193 (2001)
5. Barbaroux, J.M., Germinet, F., Tcheremchantsev, S.: Generalized fractal dimensions: Equivalences and basic properties. J. Math. Pures Appl. 80, 977–1012 (2001)
6. Bellissard, J.: Spectral properties of Schrödinger's operator with a Thue-Morse potential. In: Number Theory and Physics (Les Houches, 1989), Springer Proc. Phys. 47, Berlin: Springer, 1990, pp. 140–150
7. Bellissard, J., Bovier, A., Ghez, J.-M.: Spectral properties of a tight binding Hamiltonian with period doubling potential. Commun. Math. Phys. 135, 379–399 (1991)
8. Bellissard, J., Iochum, B., Scoppola, E., Testard, D.: Spectral properties of one-dimensional quasi-crystals. Commun. Math. Phys. 125, 527–543 (1989)
9. Bovier, A., Ghez, J.-M.: Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Commun. Math. Phys. 158, 45–66 (1993); Erratum Commun. Math. Phys. 166, 431–432 (1994)
10. Combes, J.M.: Connections between quantum dynamics and spectral properties of time-evolution operators. In: Differential Equations with Applications to Mathematical Physics, W.F. Ames, E.M. Harrel II, J.V. Herod (eds.), Academic Press, Boston, 1993, pp. 59–68
11. Damanik, D.: α-continuity properties of one-dimensional quasicrystals. Commun. Math. Phys. 192, 169–182 (1998)
12. Damanik, D.: Singular continuous spectrum for the period doubling Hamiltonian on a set of full measure. Commun. Math. Phys. 196, 477–483 (1998)
13. Damanik, D.: Gordon-type arguments in the spectral theory of one-dimensional quasicrystals. In: Directions in Mathematical Quasicrystals, M. Baake, R.V. Moody (eds.), CRM Monograph Series 13, AMS, Providence, RI 2000, pp. 277–305
14. Damanik, D.: Uniform singular continuous spectrum for the period doubling Hamiltonian. Ann. Henri Poincaré 2, 101–108 (2001)
15. Damanik, D., Killip, R., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity. Commun. Math. Phys. 212, 191–204 (2000)
16. Damanik, D., Sütő, A., Tcheremchantsev, S.: Power-law bounds on transfer matrices and quantum dynamics in one dimension II. In preparation
17. Delyon, F., Peyrière, J.: Recurrence of the eigenstates of a Schrödinger operator with automatic potential. J. Statist. Phys. 64, 363–368 (1991)
18. Falconer, K.: Fractal Geometry, John Wiley & Sons, Ltd., Chichester, 1990
19. Germinet, F., Kiselev, A., Tcheremchantsev, S.: In preparation
20. Guarneri, I.: Spectral properties of quantum diffusion on discrete lattices. Europhys. Lett. 10, 95–100 (1989)
21. Guarneri, I., Schulz-Baldes, H.: Lower bounds on wave packet propagation by packing dimensions of spectral measures. Math. Phys. Electron. J. 5(1), 16 (1999)
22. Guarneri, I., Schulz-Baldes, H.: Intermittent lower bound on quantum diffusion. Lett. Math. Phys. 49, 317–324 (1999)
23. Hof, A., Knill, O., Simon, B.: Singular continuous spectrum for palindromic Schrödinger operators. Commun. Math. Phys. 174, 149–159 (1995)
24. Iochum, B., Testard, D.: Power law growth for the resistance in the Fibonacci model. J. Stat. Phys. 65, 715–723 (1991)
25. Jitomirskaya, S., Last, Y.: Power-law subordinacy and singular spectra. I. Half-line operators. Acta Math. 183, 171–189 (1999)
26. Jitomirskaya, S., Last, Y.: Power-law subordinacy and singular spectra. II. Line operators. Commun. Math. Phys. 211, 643–658 (2000)
27. Jitomirskaya, S., Schulz-Baldes, H., Stolz, G.: Delocalization in random polymer models. Preprint (mp-arc/02-267)
28. Killip, R., Kiselev, A., Last, Y.: Dynamical upper bounds on wavepacket spreading. Preprint (mp-arc/01-460)
29. Kiselev, A., Last, Y.: Solutions, spectrum, and dynamics for Schrödinger operators on infinite domains. Duke Math. J. 102, 125–150 (2000)
30. Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1989)
31. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996)
32. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Invent. Math. 135, 329–367 (1999)
33. Raymond, L.: A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain. Preprint, 1997


34. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators. New York: Academic Press, 1978
35. Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators. Proc. Am. Math. Soc. 124, 3361–3369 (1996)
36. Simon, B., Wolff, T.: Singular continuous spectrum under rank one perturbations and localization for random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986)
37. Sütő, A.: The spectrum of a quasiperiodic Schrödinger operator. Commun. Math. Phys. 111, 409–415 (1987)
38. Sütő, A.: Singular continuous spectrum on a Cantor set of zero Lebesgue measure for the Fibonacci Hamiltonian. J. Stat. Phys. 56, 525–531 (1989)
39. Tcheremchantsev, S.: Mixed lower bounds for quantum transport. To appear in J. Funct. Anal.
40. Tcheremchantsev, S.: In preparation

Communicated by M. Aizenman

Commun. Math. Phys. 236, 535–555 (2003) Digital Object Identifier (DOI) 10.1007/s00220-003-0842-4

Communications in

Mathematical Physics

Asymptotic Interactions of Critically Coupled Vortices

N.S. Manton (1), J.M. Speight (2)

(1) Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, England. E-mail: [email protected]
(2) Department of Pure Mathematics, University of Leeds, Leeds LS2 9JT, England. E-mail: [email protected]

Received: 25 June 2002 / Accepted: 7 February 2003 Published online: 17 April 2003 – © Springer-Verlag 2003

Abstract: At critical coupling, the interactions of Ginzburg-Landau vortices are determined by the metric on the moduli space of static solutions. Here, a formula for the asymptotic metric for two well separated vortices is obtained, which depends on a modified Bessel function. A straightforward extension gives the metric for N vortices. The asymptotic metric is also shown to follow from a physical model, where each vortex is treated as a point-like particle carrying a scalar charge and a magnetic dipole moment of the same magnitude. The geodesic motion of two well separated vortices is investigated, and the asymptotic dependence of the scattering angle on the impact parameter is determined. Formulae for the asymptotic Ricci and scalar curvatures of the N-vortex moduli space are also obtained.

1. Introduction

Critically coupled Ginzburg-Landau vortices and BPS monopoles are two of the most studied examples of topological solitons in field theory [8]. Vortices are particle-like solutions of the abelian Higgs theory in two dimensions and BPS monopoles are solutions of a Yang-Mills-Higgs theory in three dimensions. At critical coupling (separating the Type I and Type II regimes of superconductivity), vortices exert no static forces on each other, and there are static multi-vortex solutions. These satisfy the planar Bogomolny equations [3]
$$D_1 \phi + i D_2 \phi = 0, \tag{1.1}$$
$$B + \frac{1}{2}(|\phi|^2 - 1) = 0, \tag{1.2}$$

and the boundary condition |φ| → 1 as |x| → ∞. Here, Di φ = ∂i φ + iAi φ is the covariant derivative of the complex scalar field φ, and B = ∂1 A2 − ∂2 A1 is the magnetic field in the plane. Taubes showed that an N -vortex solution is uniquely determined by


specifying N points where φ is zero [19, 8]. The moduli space of N-vortex solutions, $M_N$, is therefore the configuration space of N unordered points in the plane, which is a smooth 2N-dimensional manifold. Using the first Bogomolny equation, the gauge potential can be eliminated, and the second equation can then be written in terms of the gauge invariant field $h = \log |\phi|^2$ as
$$\nabla^2 h - e^h + 1 = 4\pi \sum_{r=1}^{N} \delta(x - y_r). \tag{1.3}$$
The points $\{y_r : 1 \le r \le N\}$ are the locations of the vortices, where φ vanishes and h has a logarithmic singularity. The boundary condition is $h \to 0$ as $|x| \to \infty$. Taubes showed that h approaches 0 exponentially fast [8].

Static BPS monopoles are solutions in $\mathbb{R}^3$ of the Bogomolny equations
$$B_i = D_i \Phi, \tag{1.4}$$

where $D_i \Phi$ is the covariant derivative of the adjoint Higgs field $\Phi$ and $B_i$ is the Yang-Mills magnetic field. For fields of finite energy there is a well-defined monopole number N, and the moduli space of N-monopole solutions for gauge group SU(2) is a 4N-dimensional smooth manifold. It is not so simple as in the vortex case to say precisely what the moduli signify without introducing some additional structures (e.g. Donaldson's rational map), but for well separated monopoles there are four moduli associated with each of them. Three specify the location of the monopole, and the fourth is an internal phase angle.

We are interested not just in static solutions but also in time-dependent ones. We suppose the complete Lagrangian, both for vortices and for monopoles, is the Lorentz invariant extension of the static energy function, with a kinetic term quadratic in the time derivatives of the fields. In the vortex case, the Lagrangian density is that of the abelian Higgs theory at critical coupling
$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \frac{1}{2} \overline{D_\mu \phi}\, D^\mu \phi - \frac{1}{8} (|\phi|^2 - 1)^2, \tag{1.5}$$

where Fµν = ∂µ Aν −∂ν Aµ and Dµ φ = ∂µ φ +iAµ φ. The vortices can move at arbitrary speeds less than the speed of light. For monopoles the situation is similar, but the theory is non-abelian. High speed collisions of either vortices or monopoles are complicated, involving substantial energy transfer to radiation modes of the fields, and amenable only to numerical simulation. However, collisions at slow speeds can be treated adiabatically, using the geodesic approximation [9]. The idea here is that the moduli space of static solutions acquires a natural metric by restricting the kinetic terms of the field theory Lagrangian to motion tangent to the moduli space, and it can be shown that the geodesic trajectories on moduli space accurately model the field theory dynamics of the solitons. An argument for this was given in [9]. It was justified rigorously for vortex motion [17] and for monopole motion [18] by Stuart. It is fairly clear now that the geodesic approximation is the formal non-relativistic limit of the field theory dynamics of the solitons, where radiation is neglected. Having recognized the importance of the metric on moduli space, it becomes desirable to calculate it. This is not so easy. For monopoles there is an explicit understanding of the metric only for one monopole, where the metric is flat, and for two monopoles, where it was calculated by Atiyah and Hitchin [1]. For vortices there is a general formula


due to Samols [14], which we shall use below. Again, for one vortex the metric is flat, but even for two vortices the metric is not known explicitly. However, it is possible to calculate the explicit asymptotic metric on the moduli space for N well separated monopoles. There are now two approaches to this calculation. The first is physically motivated, and not quite rigorous [10, 7]. The monopoles are treated as point-like objects, carrying a magnetic charge and a scalar charge of equal magnitude. These charges are regarded as sources for auxiliary linear fields through which the monopoles interact. (Monopoles, in reality, are smooth but nonlinear, with a core radius of order 1, but provided their separation is much greater than 1, a linearization of the fields appears justified.) For monopoles at rest, the magnetic forces exactly cancel the scalar forces, so there is no net force. For monopoles in relative motion, the magnetic and scalar forces are not identical, because of their different Lorentz transformation properties, and there are net forces which cause the monopoles to scatter non-trivially. The effects can be encapsulated in an N -particle Lagrangian with a purely kinetic term, quadratic in velocities. The coefficient matrix of this quadratic form defines the metric on the moduli space for well separated monopoles. An alternative approach is due to Bielawski, who calculated rigorously the asymptotic form of the Nahm data associated with well separated monopoles, and from this calculated the asymptotic metric on moduli space [2]. These approaches give the same result, and they are consistent with the Atiyah-Hitchin metric for two monopoles, whose asymptotic form was derived in [6]. The asymptotic N-monopole metric, like the true metric on the N -monopole moduli space, is hyperk¨ahler, but unlike the true metric it has singularities when the monopoles come close together. These results on monopoles motivated the present work. 
Here we give an explicit expression for the metric on the N-vortex moduli space, for N well separated vortices. Furthermore we calculate it in two ways. Our first approach is to take Samols' general formula and evaluate the quantities occurring there by a method of matched asymptotic expansions. Essentially, we solve Eq. (1.3) for two well separated vortices, calculating the effect of one vortex on the other at linear order, and from the solution determine the asymptotic 2-vortex metric. It is straightforward to generalize the 2-vortex metric to the N-vortex metric. We need to assume that for well separated vortices the field h far from the vortex cores obeys the linearization of (1.3), namely the Helmholtz equation
$$\nabla^2 h - h = 0, \tag{1.6}$$

and that the relevant solution is a linear superposition of the solutions due to the N vortices separately. Corrections due to the nonlinear terms neglected in Eq. (1.6) are of higher exponential order in the separations. However, a careful treatment of this point is lacking, and would require a considerable refinement of Taubes’ estimate of the exponential decay of solutions. Our second approach is the more physical. It is a variant of the calculation involving point-like monopoles, and regards well separated vortices as point-like sources interacting via auxiliary linear fields. A study of the static forces between vortices that are close to critical coupling shows that well separated vortices can be regarded as particles each carrying a scalar charge and a magnetic dipole moment (thought of as perpendicular to the plane which the vortex inhabits) [16]. For critically coupled vortices, the magnitudes of the scalar charge and the dipole are the same, and the static forces due to them cancel. For vortices in motion, the scalar and magnetic forces do not exactly cancel, but result in velocity-dependent forces. Again there is an effective Lagrangian for two well separated


vortices which is purely kinetic, and from this the metric can be read off. The extension to N vortices is as before.

The asymptotic metric for vortices, which involves the Bessel function $K_0$, has some similarities to the true metric. It has the same isometries, and like the true metric, it is Kähler. As for monopoles, the asymptotic metric becomes singular as the vortices approach one another closely, since it is not positive definite if the minimum vortex separation is below a certain critical value (2.21 to two decimal places in the case N = 2). Of course, the asymptotic metric is not valid in this region. It remains open to rigorously prove that our formula gives the asymptotic metric on the N-vortex moduli space, but the known results for monopoles make this conjecture plausible.

Using our formula we can calculate the scattering of two vortices that do not approach close to each other. The leading exponentially small expression for the scattering angle can be obtained exactly.

This paper is organized as follows. In Sect. 2 we obtain the asymptotic 2-vortex metric, and its generalization to N vortices, along with the asymptotic Ricci and scalar curvatures. In Sect. 3 we rederive the metric using the model of vortices as point-like particles. In Sect. 4 we discuss the scattering of two vortices using the asymptotic metric.

2. Well Separated Vortices – Field and Metric

The key to the metric on $M_N$, the N-vortex moduli space, is Eq. (1.3), whose solutions determine the static N-vortex fields. It is sometimes convenient to use a complex coordinate $z = x + iy$ for a general point in the plane, and to denote the vortex locations correspondingly by $\{Z_r : 1 \le r \le N\}$. Equation (1.3) becomes
$$\nabla^2 h - e^h + 1 = 4\pi \sum_{r=1}^{N} \delta(z - Z_r), \tag{2.1}$$
where $\nabla^2 = 4 \frac{\partial^2}{\partial z\, \partial \bar{z}}$. Around $Z_r$, the function $h(z, \bar{z})$ has the local expansion
$$h = \log |z - Z_r|^2 + a_r + \frac{1}{2} \bar{b}_r (z - Z_r) + \frac{1}{2} b_r (\bar{z} - \bar{Z}_r) + \bar{c}_r (z - Z_r)^2 - \frac{1}{4} (z - Z_r)(\bar{z} - \bar{Z}_r) + c_r (\bar{z} - \bar{Z}_r)^2 + \ldots, \tag{2.2}$$

where $a_r$ is real, and $b_r$, $c_r$ complex. Taubes proved that this series, with the logarithmic term removed, is a convergent Taylor expansion. The logarithmic term and the coefficient $-\frac{1}{4}$ are determined by the equation locally, but the remaining coefficients are not. They depend on the positions of the other vortices, but not in an explicitly known way. Most important for us is the coefficient $b_r$. Samols' formula for the metric on $M_N$ is
$$g = \pi \sum_{r,s=1}^{N} \left( \delta_{rs} + 2 \frac{\partial b_s}{\partial Z_r} \right) dZ_r \, d\bar{Z}_s. \tag{2.3}$$
The functions $b_r$ obey the symmetry relation
$$\frac{\partial b_s}{\partial Z_r} = \frac{\partial \bar{b}_r}{\partial \bar{Z}_s} \tag{2.4}$$


and from this it follows that the metric is not only real, but also Kähler. Invariance of the metric under a translation of all the vortices implies that $\sum_r b_r = 0$ [14], and rotational invariance implies that $\sum_r \bar{Z}_r b_r$ is real [13].

For well separated vortices, we assume that h is exponentially small except in a core region with radius of order 1 around each vortex, and there h has an approximate, local circular symmetry. It follows that if the minimum separation of any pair is $L \gg 1$, then the $\delta_{rs}$ term dominates the metric, and the correction is of order $e^{-L}$. The metric is therefore approximately flat.

Let us now concentrate on two vortices, and denote their positions by
$$Z_1 = Z + \sigma e^{i\theta}, \qquad Z_2 = Z - \sigma e^{i\theta}. \tag{2.5}$$
It follows from the symmetry of the 2-vortex field around the centre of mass Z, or from the properties of the functions $b_r$ mentioned above, that in this case $b_1 = b(\sigma) e^{i\theta}$ and $b_2 = -b_1$, where $b(\sigma)$ is a real function. Samols' formula implies that the moduli space metric is
$$g = 2\pi \, dZ \, d\bar{Z} + \eta(\sigma)(d\sigma^2 + \sigma^2 d\theta^2), \tag{2.6}$$
where
$$\eta(\sigma) = 2\pi \left( 1 + \frac{1}{\sigma} \frac{d}{d\sigma} \big( \sigma b(\sigma) \big) \right). \tag{2.7}$$

The relative motion of two vortices takes place on the reduced moduli space where Z is fixed. This is a surface of revolution. The range of the coordinates is $0 \le \sigma < \infty$ and $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$, with $\theta = -\frac{\pi}{2}$ and $\theta = \frac{\pi}{2}$ identified. The range of θ is π and not 2π because the vortices are identical. Therefore, the surface is asymptotically conical, rather than planar.

So far, our exposition has been a summary of known results, but now we show how to calculate the leading asymptotic correction to the conical metric. We return to Eq. (1.3), and consider first the circularly symmetric solution $h_0$ for a single vortex at the origin. In terms of polar coordinates $(\rho, \varphi)$, the equation satisfied by $h_0$, for $\rho > 0$, is
$$\frac{d^2 h_0}{d\rho^2} + \frac{1}{\rho} \frac{dh_0}{d\rho} - e^{h_0} + 1 = 0. \tag{2.8}$$
The boundary conditions are $h_0 \sim 2 \log \rho$ for small ρ, and $h_0 \to 0$ as $\rho \to \infty$. The Taylor expansion of $h_0 - 2 \log \rho$ about $\rho = 0$ involves only even powers of ρ. For large ρ, Eq. (2.8) has the linearized form
$$\frac{d^2 h_0}{d\rho^2} + \frac{1}{\rho} \frac{dh_0}{d\rho} - h_0 = 0, \tag{2.9}$$
the modified Bessel equation of zeroth order, so
$$h_0(\rho) \sim \frac{q}{\pi} K_0(\rho), \tag{2.10}$$
where q is a constant. The corrections to this asymptotic approximation are expected to be suppressed by order $e^{-\rho}$. By numerical integration of the nonlinear equation (2.8), it has been determined that $q = -10.6$ [16]. Recently, Tong has given an argument involving dualities in string theory which strongly suggests that $q = -2\pi\, 8^{1/4}$, in agreement with the numerical result [20]. There is as yet no direct proof of this using (2.8).


Next, let us consider the perturbation of the solution $h_0$ due to other, distant vortices, still assuming that one vortex, which we label as vortex 1, is precisely at the origin. Let us write $h = h_0 + h_1$, where $h_1$ is small in the neighbourhood of vortex 1. The linearization of Eq. (1.3) implies that
$$\left( \nabla^2 - e^{h_0} \right) h_1 = 0. \tag{2.11}$$
The operator acting on $h_1$ has no singularity at the origin, so $h_1$ is smooth there, and the logarithmic singularity of h is carried entirely by $h_0$. Since $h_0$ is circularly symmetric, we can separate variables and write
$$h(\rho, \varphi) = h_0(\rho) + h_1(\rho, \varphi) := h_0(\rho) + \frac{1}{2} f_0(\rho) + \sum_{n=1}^{\infty} \big( f_n(\rho) \cos n\varphi + g_n(\rho) \sin n\varphi \big), \tag{2.12}$$
where $f_n$ obeys the equation
$$\frac{d^2 f_n}{d\rho^2} + \frac{1}{\rho} \frac{df_n}{d\rho} - \left( e^{h_0} + \frac{n^2}{\rho^2} \right) f_n = 0, \tag{2.13}$$
and $g_n$ obeys the same equation. $f_n$ is nonsingular at $\rho = 0$ and has a series expansion $f_n = \alpha_n \rho^n + \ldots$. Similarly, $g_n = \beta_n \rho^n + \ldots$. The expansion (2.12) is consistent with the general expansion of h around vortex 1, that is, (2.2) with $r = 1$ and $Z_1 = 0$. By identifying the terms linear in ρ, we find that $b_1$, the coefficient we are interested in, is given by $b_1 = \alpha_1 + i\beta_1$. From now on, therefore, we just consider Eq. (2.13) for $f_1$, that is,
$$\frac{d^2 f_1}{d\rho^2} + \frac{1}{\rho} \frac{df_1}{d\rho} - \left( e^{h_0} + \frac{1}{\rho^2} \right) f_1 = 0. \tag{2.14}$$
For large ρ, this simplifies to
$$\frac{d^2 f_1}{d\rho^2} + \frac{1}{\rho} \frac{df_1}{d\rho} - \left( 1 + \frac{1}{\rho^2} \right) f_1 = 0, \tag{2.15}$$
which is valid for $1 \ll \rho \ll L$, where L is the distance from the origin to the next-nearest vortex. Note that the difference between the coefficients in Eqs. (2.14) and (2.15) is $e^{h_0} - 1$, which is smooth, finite, and exponentially localized. We therefore suppose that the asymptotic forms of the solutions of (2.14) are exact solutions of (2.15). This is supported by the results used in various examples of scattering theory, and which follow from Levinson's theorem [4]; however a result of the precise type we require, involving perturbations of Bessel's equation, appears to us not to have been established. Equation (2.15) is the modified Bessel equation of first order, whose general solution is a linear combination of the functions $K_1(\rho)$ and $I_1(\rho)$.

Let us now assume that there is just one other vortex, vortex 2, whose location (in Cartesian coordinates) is $(-2\sigma, 0)$, with $\sigma \gg 1$. In this case, h is reflection symmetric under $\varphi \to -\varphi$, so in the Fourier series for $h_1$ all the functions $g_n$ vanish. In particular, $b_1 = \alpha_1$, and is real. In the region far from both vortex centres, Eq. (1.3) linearizes to
$$\nabla^2 h - h = 0 \tag{2.16}$$


and is solved by the linear superposition of the fields due to each vortex separately,

$$h(\rho, \varphi) = \frac{q}{\pi} K_0(\rho) + \frac{q}{\pi} K_0\!\left(\sqrt{4\sigma^2 + 4\sigma\rho\cos\varphi + \rho^2}\right). \tag{2.17}$$

The argument of the second $K_0$ function is the distance to vortex 2 from the point with polar coordinates $(\rho, \varphi)$. By separation of variables, the general solution of the Helmholtz equation (2.16), regular at $\rho = 0$ and with the reflection symmetry $\varphi \to -\varphi$, is a linear combination of the functions $I_n(\rho)\cos n\varphi$. The function $K_0\!\left(\sqrt{4\sigma^2 + 4\sigma\rho\cos\varphi + \rho^2}\right)$ is such a solution (whereas $K_0(\rho)$ is not, being singular at $\rho = 0$), so

$$K_0\!\left(\sqrt{4\sigma^2 + 4\sigma\rho\cos\varphi + \rho^2}\right) = k_0 I_0(\rho) + 2\sum_{n=1}^{\infty} k_n I_n(\rho)\cos(n\varphi) \tag{2.18}$$

for some real constants $k_n$. Note that an important special solution of (2.16) is $e^{x} = e^{\rho\cos\varphi}$, and its expansion

$$e^{\rho\cos\varphi} = I_0(\rho) + 2\sum_{n=1}^{\infty} I_n(\rho)\cos(n\varphi) \tag{2.19}$$

defines the functions $I_n(\rho)$. Combining the series for the exponential function with trigonometric identities, one can compute the leading terms in the series expansions for $I_n$. These are also given in standard references, e.g. [5]. It is sufficient for us to record that

$$I_0(\rho) = 1 + \ldots, \qquad I_1(\rho) = \frac{1}{2}\rho + \ldots. \tag{2.20}$$

We can now return to (2.18) and determine $k_1$, the coefficient we need. The Taylor expansion (in $\rho$) of the two sides gives

$$K_0(2\sigma) - K_1(2\sigma)\,\rho\cos\varphi + \ldots = k_0 + k_1\,\rho\cos\varphi + \ldots, \tag{2.21}$$

where we have used the identity $K_0' = -K_1$, and the results above for $I_0$ and $I_1$. So

$$k_0 = K_0(2\sigma) \quad\text{and}\quad k_1 = -K_1(2\sigma). \tag{2.22}$$

With this result, we can now match the Fourier expansion of (2.17), the linearized field $h$ due to the two vortices, valid outside their cores, with the Fourier expansion of $h = h_0 + h_1$ near vortex 1. In the range $1 \ll \rho \ll 2\sigma$, we find

$$\frac{q}{\pi}K_0(\rho) + \frac{q}{\pi}K_0(2\sigma)I_0(\rho) - \frac{2q}{\pi}K_1(2\sigma)I_1(\rho)\cos\varphi + \ldots = \frac{q}{\pi}K_0(\rho) + \frac{1}{2}f_0(\rho) + f_1(\rho)\cos\varphi + \ldots. \tag{2.23}$$

Therefore, $f_1(\rho)$ has the asymptotic form

$$f_1(\rho) = -\frac{2q}{\pi}K_1(2\sigma)I_1(\rho), \tag{2.24}$$

and there is no $K_1(\rho)$ piece. The further terms on both sides of (2.23) involve $\cos n\varphi$ with $n > 1$, and could be determined from higher order terms in the Taylor expansion (2.21), but we do not need these.
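As a numerical sanity check of (2.18) and (2.22), the sketch below projects $K_0\!\left(\sqrt{4\sigma^2 + 4\sigma\rho\cos\varphi + \rho^2}\right)$ onto $1$ and $\cos\varphi$ and compares the results with $K_0(2\sigma)$ and $-K_1(2\sigma)$. It is not part of the original derivation: the helper names `bessel_k`, `cos_projection` and `bessel_i`, the quadrature settings, and the sample values $\sigma = 1.5$, $\rho = 0.7$ are illustrative assumptions. The Bessel functions are evaluated from their standard integral representations, using only the Python standard library.

```python
import math

def bessel_k(nu, x, steps=1000, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # half weight at t = 0; the far endpoint is negligible
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def cos_projection(f, n, steps=200):
    """(1/pi) * int_0^pi f(phi) cos(n phi) dphi by the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (f(0.0) + f(math.pi) * math.cos(n * math.pi))
    for i in range(1, steps):
        phi = i * h
        total += f(phi) * math.cos(n * phi)
    return total * h / math.pi

def bessel_i(n, x):
    """I_n(x) = (1/pi) int_0^pi exp(x cos phi) cos(n phi) dphi."""
    return cos_projection(lambda phi: math.exp(x * math.cos(phi)), n)

sigma, rho = 1.5, 0.7  # assumed sample values with rho < 2*sigma

def shifted_k0(phi):
    """Left-hand side of (2.18) as a function of the polar angle."""
    dist = math.sqrt(4 * sigma**2 + 4 * sigma * rho * math.cos(phi) + rho**2)
    return bessel_k(0, dist)

# By (2.18), the n-th cosine projection equals k_n * I_n(rho).
k0_num = cos_projection(shifted_k0, 0) / bessel_i(0, rho)
k1_num = cos_projection(shifted_k0, 1) / bessel_i(1, rho)

print(k0_num, bessel_k(0, 2 * sigma))    # the two numbers should agree, cf. (2.22)
print(k1_num, -bessel_k(1, 2 * sigma))   # the two numbers should agree, cf. (2.22)
```

The projection route is chosen here because it tests the full expansion rather than only its leading Taylor terms; any other pair with $\rho < 2\sigma$ should behave the same way.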


The last step is to extrapolate the function $f_1$ into the core region of vortex 1. It is rather remarkable that this can be done, because the equation satisfied by $f_1$, namely (2.14), is not a standard equation, and the coefficient $e^{h_0}$ is not known explicitly. However, one solution of (2.14) is known. It is

$$\tilde f_1 = \frac{dh_0}{d\rho}. \tag{2.25}$$

This can be verified by differentiating (2.8), the nonlinear equation for $h_0$. The interpretation of this solution is that it corresponds to the translational zero mode of vortex 1. If the centre of that vortex is infinitesimally translated by $\epsilon$ in the $x$-direction, then the field of vortex 1 becomes $h = h_0(\rho - \epsilon\cos\varphi)$ to first order in $\epsilon$. So $h = h_0 + h_1$, where $h_1 = -\epsilon \tilde f_1 \cos\varphi$ and $\tilde f_1 = dh_0/d\rho$. Since $h_0 = \frac{q}{\pi}K_0(\rho)$ asymptotically, it follows that $\tilde f_1 = -\frac{q}{\pi}K_1(\rho)$ asymptotically. Similarly, since $h_0 \sim 2\log\rho$ for small $\rho$, it follows that $\tilde f_1 \sim 2/\rho$ for small $\rho$. By contrast, the solution $f_1$ of (2.14) that really interests us has the asymptotic behaviour $-\frac{2q}{\pi}K_1(2\sigma)I_1(\rho)$ for large $\rho$, and the finite linear behaviour $b_1\rho$ for small $\rho$, where $b_1$ is to be found. Now, Eq. (2.14) has a Wronskian identity

$$\rho\left(\tilde f_1 \frac{df_1}{d\rho} - f_1 \frac{d\tilde f_1}{d\rho}\right) = \text{constant}, \tag{2.26}$$

relating the two solutions $f_1$ and $\tilde f_1$. Using the asymptotic forms of $f_1$ and $\tilde f_1$, and the Wronskian identity for the modified Bessel functions

$$\rho\left(K_1 \frac{dI_1}{d\rho} - I_1 \frac{dK_1}{d\rho}\right) = 1, \tag{2.27}$$

we deduce that the constant in (2.26) is $\frac{2q^2}{\pi^2}K_1(2\sigma)$. Evaluating (2.26) near $\rho = 0$, where it equals $4b_1$, we deduce, finally, that

$$b_1 = \frac{q^2}{2\pi^2}K_1(2\sigma). \tag{2.28}$$

We can now use this result to calculate the 2-vortex metric. In the above calculation, vortex 1 was at the origin and vortex 2 at $(-2\sigma, 0)$. From (2.5) we see that $Z = -\sigma$ and $\theta = 0$, so

$$b(\sigma) = \frac{q^2}{2\pi^2}K_1(2\sigma). \tag{2.29}$$

Therefore the prefactor $\eta$ in the metric (2.6) is

$$\eta(\sigma) = 2\pi\left(1 - \frac{q^2}{\pi^2}K_0(2\sigma)\right), \tag{2.30}$$

where we have used (2.7) and the identity $K_1'(s) + K_1(s)/s = -K_0(s)$. The complete asymptotic 2-vortex metric is

$$g = 2\pi\, dZ\, d\bar Z + 2\pi\left(1 - \frac{q^2}{\pi^2}K_0(2\sigma)\right)(d\sigma^2 + \sigma^2 d\theta^2). \tag{2.31}$$

We shall investigate the geodesics of this metric in Sect. 4, and hence determine how vortices scatter.
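The two ingredients of the matching argument, the Bessel Wronskian (2.27) and the small-$\rho$ evaluation of (2.26), can be spot-checked numerically. The sketch below is illustrative only: the function names, the finite-difference step, and the sample radii are assumptions, and the Bessel functions are computed from standard integral representations rather than taken from a library.

```python
import math

def bessel_k(nu, x, steps=1000, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def bessel_i(n, x, steps=200):
    """I_n(x) = (1/pi) int_0^pi exp(x cos phi) cos(n phi) dphi (trapezoid rule)."""
    h = math.pi / steps
    total = 0.5 * (math.exp(x) + math.exp(-x) * math.cos(n * math.pi))
    for i in range(1, steps):
        phi = i * h
        total += math.exp(x * math.cos(phi)) * math.cos(n * phi)
    return total * h / math.pi

def derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def bessel_wronskian(rho):
    """rho * (K_1 I_1' - I_1 K_1'), the left-hand side of (2.27)."""
    k1 = bessel_k(1, rho)
    i1 = bessel_i(1, rho)
    dk1 = derivative(lambda r: bessel_k(1, r), rho)
    di1 = derivative(lambda r: bessel_i(1, r), rho)
    return rho * (k1 * di1 - i1 * dk1)

def small_rho_wronskian(b1, rho):
    """Left-hand side of (2.26) with f1 = b1*rho and f1_tilde = 2/rho."""
    f1, df1 = b1 * rho, b1
    ft, dft = 2.0 / rho, -2.0 / rho**2
    return rho * (ft * df1 - f1 * dft)

for rho in (0.5, 1.0, 3.0):
    print(rho, bessel_wronskian(rho))    # each value should be close to 1

print(small_rho_wronskian(0.25, 0.01))   # should be close to 4 * b1
```

Since the small-$\rho$ expression equals $4b_1$ for any $\rho$, equating it with the constant $\frac{2q^2}{\pi^2}K_1(2\sigma)$ reproduces (2.28).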


To extend (2.31) to the asymptotic $N$-vortex metric is not hard. Let us use the complex coordinates of the vortices $Z_r$ and introduce the notation $Z_{rs} := Z_r - Z_s$. The flat part of the metric (2.3) can be reexpressed as

$$\pi\sum_{r=1}^{N} dZ_r\, d\bar Z_r = N\pi\, dZ\, d\bar Z + \frac{\pi}{2N}\sum_{r\neq s} dZ_{rs}\, d\bar Z_{rs}, \tag{2.32}$$

where $Z$ is the centre of mass coordinate

$$Z = \frac{1}{N}(Z_1 + Z_2 + \ldots + Z_N). \tag{2.33}$$
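The decomposition (2.32) is a purely algebraic identity, so it can be checked on arbitrary displacements. A minimal sketch, in which the function name `flat_metric_split` and the random test data are assumptions made for illustration:

```python
import math
import random

def flat_metric_split(dz):
    """Return both sides of (2.32) for a list of complex displacements dZ_r."""
    n = len(dz)
    lhs = math.pi * sum(abs(d) ** 2 for d in dz)
    centre = sum(dz) / n  # displacement of the centre of mass, cf. (2.33)
    pair_sum = sum(abs(dz[r] - dz[s]) ** 2
                   for r in range(n) for s in range(n) if r != s)
    rhs = n * math.pi * abs(centre) ** 2 + math.pi / (2 * n) * pair_sum
    return lhs, rhs

random.seed(0)
sample = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
print(flat_metric_split(sample))  # the two entries should agree up to rounding
```

The sum over $r \neq s$ runs over ordered pairs, which is why each unordered pair is counted twice and the prefactor is $\pi/2N$ rather than $\pi/N$.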

Note that the differentials $dZ_{rs}$ are not all linearly independent. To find the remaining part of the metric, we need to find $b_s$ and its derivatives. The solution of the Helmholtz equation (2.16) becomes a linear superposition of the fields due to the $N$ vortices. The asymptotic matching of $h$ in the neighbourhood of the $s$th vortex can be carried out as before. This leads to the following expression for $b_s$ that is a linear superposition of the effects of the other $N - 1$ vortices,

$$b_s = \frac{q^2}{2\pi^2}\sum_{r\neq s} K_1(|Z_{sr}|)\,\frac{Z_{sr}}{|Z_{sr}|}. \tag{2.34}$$

Each term is the obvious generalization of (2.28), combined with the orientational phase factor $Z_{sr}/|Z_{sr}|$, which reduces to $e^{i\theta}$ for two vortices. Because of translational invariance,

$$\sum_{r=1}^{N}\frac{\partial b_s}{\partial Z_r} = 0, \tag{2.35}$$

so

$$\frac{\partial b_s}{\partial Z_s} = -\sum_{r\neq s}\frac{\partial b_s}{\partial Z_r} \tag{2.36}$$

(no summation over $s$). For $r \neq s$, we find, differentiating (2.34) with respect to $Z_r$ and keeping $\bar Z_r$ fixed, that

$$\frac{\partial b_s}{\partial Z_r} = \frac{q^2}{4\pi^2}K_0(|Z_{sr}|). \tag{2.37}$$

Equation (2.37) combined with (2.36) gives

$$\sum_{r,s=1}^{N}\frac{\partial b_s}{\partial Z_r}\, dZ_r\, d\bar Z_s = \frac{q^2}{4\pi^2}\sum_{r\neq s} K_0(|Z_{sr}|)\,(dZ_r - dZ_s)\, d\bar Z_s. \tag{2.38}$$

Since $K_0(|Z_{sr}|) = K_0(|Z_{rs}|)$, we symmetrize over the contributions of these two terms, obtaining

$$\sum_{r,s=1}^{N}\frac{\partial b_s}{\partial Z_r}\, dZ_r\, d\bar Z_s = -\frac{q^2}{8\pi^2}\sum_{r\neq s} K_0(|Z_{rs}|)\, dZ_{rs}\, d\bar Z_{rs}. \tag{2.39}$$
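The derivative (2.37) rests on the identity $\partial_Z\!\left[K_1(|Z|)\,Z/|Z|\right] = -\tfrac{1}{2}K_0(|Z|)$ for $Z = Z_{sr}$, together with $\partial/\partial Z_r = -\partial/\partial Z_{sr}$ at fixed $Z_s$. The following sketch checks this with a finite-difference Wirtinger derivative. The names `phase_k1` and `wirtinger_d`, the step size, and the sample separation are assumptions chosen for illustration, and the overall factor $q^2/2\pi^2$ is left out because it plays no role in the check.

```python
import math

def bessel_k(nu, x, steps=1000, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def phase_k1(z):
    """K_1(|Z|) Z/|Z|, the summand of (2.34) as a function of Z = Z_sr."""
    return bessel_k(1, abs(z)) * z / abs(z)

def wirtinger_d(f, z, h=1e-4):
    """d f / d Z = (d/dx - i d/dy) f / 2, using central differences."""
    dfdx = (f(z + h) - f(z - h)) / (2 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (dfdx - 1j * dfdy)

z_sr = complex(1.3, -0.8)              # assumed sample separation Z_s - Z_r
lhs = -wirtinger_d(phase_k1, z_sr)     # d/dZ_r = -d/dZ_sr at fixed Z_s
rhs = 0.5 * bessel_k(0, abs(z_sr))     # (1/2) K_0(|Z_sr|)

print(lhs, rhs)  # lhs should be (numerically) real and equal to rhs
```

Multiplying both sides by $q^2/2\pi^2$ gives (2.37); the vanishing imaginary part reflects the reality of $\partial b_s/\partial Z_r$.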


Putting these ingredients together, we obtain our final expression for the asymptotic $N$-vortex metric

$$g = N\pi\, dZ\, d\bar Z + \pi\sum_{r\neq s}\left(\frac{1}{2N} - \frac{q^2}{4\pi^2}K_0(|Z_{rs}|)\right) dZ_{rs}\, d\bar Z_{rs}. \tag{2.40}$$

For two vortices, located at the points (2.5), this reduces to (2.31). Since the coefficients in the asymptotic metric depend only on the magnitudes of the vortex separations, it is clear that it is translationally and rotationally symmetric. The structure of the metric as a small perturbation of the flat Euclidean metric becomes clearer if we eliminate the centre of mass coordinate $Z$ using (2.33):

$$g = \pi\sum_{r} dZ_r\, d\bar Z_r - \frac{q^2}{4\pi}\sum_{r\neq s} K_0(|Z_{rs}|)\, dZ_{rs}\, d\bar Z_{rs}. \tag{2.41}$$

One way to see that this metric is Kähler is to note that Eq. (2.4) is satisfied, since (2.37) and (2.36) imply that $\partial b_s/\partial Z_r$ is real and symmetric. More explicitly, the asymptotic Kähler form is

$$\omega = \frac{iN\pi}{2}\, dZ\wedge d\bar Z + \frac{i\pi}{2}\sum_{r\neq s}\left(\frac{1}{2N} - \frac{q^2}{4\pi^2}K_0(|Z_{rs}|)\right) dZ_{rs}\wedge d\bar Z_{rs}. \tag{2.42}$$

Since the 1-forms $dZ$, $dZ_{rs}$ are closed, one finds that

$$d\omega = -\frac{iq^2}{16\pi}\sum_{r\neq s}\frac{K_0'(|Z_{rs}|)}{|Z_{rs}|}\left(\bar Z_{rs}\, dZ_{rs}\wedge dZ_{rs}\wedge d\bar Z_{rs} + Z_{rs}\, d\bar Z_{rs}\wedge dZ_{rs}\wedge d\bar Z_{rs}\right) = 0, \tag{2.43}$$

so $\omega$ is closed. The Kähler potential is

$$\pi\sum_{r=1}^{N} Z_r\bar Z_r - \frac{q^2}{\pi}\sum_{r\neq s} K_0(|Z_{rs}|). \tag{2.44}$$

The Kähler form is of direct interest in certain non-relativistic models of vortex dynamics [13]. Such models have first order dynamics in time, and it is conjectured that slow vortex dynamics is well approximated by a Hamiltonian flow on the $N$-vortex moduli space, where the symplectic structure is precisely this Kähler form. Clearly the closure of $\omega$ is crucial for this to make sense.

The curvature properties of soliton moduli spaces are of some interest. For example, the scalar curvature of $M_N$ is relevant to quantum $N$-soliton dynamics [12], while in the case of monopoles, Ricci flatness of $M_N$ was the key property exploited in Atiyah and Hitchin's construction of the metric for $N = 2$. In order to compute the asymptotic Ricci tensor for the $N$-vortex metric (2.41), it is convenient to write $g$ as

$$g = \sum_{r,s} g_{rs}\, dZ_r\, d\bar Z_s = \sum_{r,s}\pi(\delta_{rs} + h_{rs})\, dZ_r\, d\bar Z_s, \tag{2.45}$$


and work up to linear order in the perturbation $h$. It is a standard result in Kähler geometry [21] that the Ricci tensor associated with $g$ is

$$R = -\sum_{r,s}\frac{\partial^2\log G}{\partial Z_r\,\partial\bar Z_s}\, dZ_r\, d\bar Z_s, \tag{2.46}$$

where $G$ is the determinant of the hermitian coefficient matrix $g_{rs}$. In this case,

$$G = \det\pi(I + h) = \pi^N(1 + \operatorname{tr} h + \cdots) \;\Rightarrow\; \log G = N\log\pi + \sum_r h_{rr} + \cdots = N\log\pi - \frac{q^2}{2\pi^2}\sum_{r\neq s} K_0(|Z_r - Z_s|) \tag{2.47}$$

from (2.41). Equations (2.46) and (2.47) together with Bessel's equation imply that

$$R = \frac{q^2}{8\pi^2}\sum_{r\neq s} K_0(|Z_r - Z_s|)\, dZ_{rs}\, d\bar Z_{rs}. \tag{2.48}$$

One sees that the $N$-vortex Ricci tensor is asymptotically positive semi-definite, its two-dimensional null space being tangent to the translation orbits in $M_N$ (that is, a vector is null if and only if it generates a rigid translation of the $N$-vortex system). Tracing $R$, one obtains the scalar curvature,

$$\mathrm{Scal} = \sum_{r,s} g^{rs}R_{rs} = \frac{q^2}{4\pi^3}\sum_{r\neq s} K_0(|Z_r - Z_s|) + \cdots, \tag{2.49}$$

whence one sees that $M_N$ is asymptotically scalar positive. It is an interesting open question whether the true metric on $M_N$ has similar curvature positivity properties. The numerical results of Samols for $N = 2$ suggest that it may [14].

3. The Point Source Formalism

In this section we rederive the asymptotic 2-vortex metric from a more physical viewpoint. The idea is that, viewed from afar, a static vortex looks like a solution of a linear field theory with a point source at the vortex centre. We will see that the appropriate point source is a composite scalar monopole and magnetic dipole in a Klein-Gordon/Proca theory. If physics is to be model independent, the forces between vortices should approach those between the corresponding point particles in the linear theory as their separation grows. This idea, which originated in the context of monopole dynamics [10], has already been successfully used to obtain an asymptotic formula for static intervortex forces away from critical coupling [16]. The present application is somewhat more subtle since we are required to analyze the interaction between point sources moving along arbitrary trajectories. We handle the problem perturbatively: using a mixture of Lorentz invariance and conservation properties we obtain expressions for a moving point source and the Klein-Gordon/Proca field it induces, correct up to acceleration terms. From these we construct the interaction Lagrangian for one moving point source interacting with the field induced by another. This Lagrangian is purely kinetic, i.e. quadratic in velocities, and hence may naturally be reinterpreted as the energy associated with geodesic flow on the asymptotic 2-vortex moduli space. The extension to $N$-vortex dynamics is entirely trivial.

In this section $x^\mu = (x^0, x^1, x^2)$ denotes a space-time point, $x^0 = t$ is the time and $\mathbf{x} = (x^1, x^2)$ denotes a spatial point. To linearize the abelian Higgs theory (1.5), we choose the gauge so that the scalar field $\phi$ is real. Since vortices have nontrivial winding at infinity, this requires a gauge transformation which is singular at the vortex centre. This need not concern us since we seek only to replicate the local, far field behaviour of the vortex in the linear theory. In this gauge, the vacuum is $\phi = 1$, so we define $\phi = 1 + \psi$ and linearize in $\psi$. The resulting Lagrangian density is

$$\mathcal{L} = \frac{1}{2}\partial_\mu\psi\,\partial^\mu\psi - \frac{1}{2}\psi^2 - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}A_\mu A^\mu + \kappa\psi - j^\mu A_\mu, \tag{3.1}$$

where $\kappa$ is the scalar charge density and $j$ the electromagnetic current density. These will be chosen to replicate the vortex asymptotics. The corresponding field equations are

$$(\Box + 1)\psi = \kappa, \tag{3.2}$$

$$(\Box + 1)A^\mu = j^\mu, \tag{3.3}$$

where $\Box = \partial_\mu\partial^\mu = \partial_t^2 - \nabla^2$, and we assume that $j$ is a conserved current, $\partial_\mu j^\mu = 0$. In this gauge, the scalar field of a single vortex located at the origin is $\phi = \exp(\tfrac{1}{2}h_0)$, where $h_0$ satisfies (2.8), so

$$\phi = 1 + \frac{1}{2}h_0 + \ldots \sim 1 + \frac{q}{2\pi}K_0(|\mathbf{x}|) \tag{3.4}$$

for large $|\mathbf{x}|$ by Eq. (2.10). Hence we seek a point source $\kappa$ so that the solution of (3.2) is $\psi = \frac{q}{2\pi}K_0(|\mathbf{x}|)$. Since the static Klein-Gordon equation (Helmholtz equation) in two dimensions has Green's function $K_0$,

$$(-\nabla^2 + 1)K_0(|\mathbf{x}|) = 2\pi\delta(\mathbf{x}), \tag{3.5}$$

one sees that

$$\kappa = q\,\delta(\mathbf{x}). \tag{3.6}$$

For a static vortex, the time component of the gauge potential $A^0$ vanishes. The asymptotic behaviour of its spatial components $A_i$ is determined by the first Bogomolny equation (1.1) which, on linearization, implies

$$\partial_1\psi - A_2 + i(\partial_2\psi + A_1) = 0. \tag{3.7}$$

Hence

$$\mathbf{A} = (A^1, A^2) = (\partial_2, -\partial_1)\psi = -\frac{q}{2\pi}\,\mathbf{k}\times\nabla K_0(|\mathbf{x}|), \tag{3.8}$$

where we have introduced $\mathbf{k}$, the unit vector in a fictitious $x^3$-direction orthogonal to the physical plane. It follows that the point source which reproduces the asymptotic vortex gauge field in Eq. (3.3) is

$$(j^0, \mathbf{j}) = (0, -q\,\mathbf{k}\times\nabla\delta(\mathbf{x})). \tag{3.9}$$

The physical interpretation of (3.6) and (3.9) is that the point particle corresponding to a single vortex at rest is a composite consisting of a scalar monopole of charge q and a magnetic dipole of moment q. We shall refer to this composite as a (static) point vortex.


The interaction between two arbitrary (possibly time-dependent) composite sources $(\kappa_{(1)}, j_{(1)})$ and $(\kappa_{(2)}, j_{(2)})$ in this linear theory is described by the Lagrangian

$$L_{\mathrm{int}} = \int d^2x\,\left(\kappa_{(1)}\psi_{(2)} - j^\mu_{(1)}A_{(2)\mu}\right), \tag{3.10}$$

where $(\psi_{(i)}, A_{(i)})$ are the fields induced by $(\kappa_{(i)}, j_{(i)})$ according to the wave equations (3.2), (3.3). This is obtained by extracting the cross terms in $\int d^2x\,\mathcal{L}$, where $(\kappa, j) = (\kappa_{(1)}, j_{(1)}) + (\kappa_{(2)}, j_{(2)})$ and $(\psi, A) = (\psi_{(1)}, A_{(1)}) + (\psi_{(2)}, A_{(2)})$ by linearity. Although (3.10) looks asymmetric under interchange of sources, it is not, as may be shown using (3.2), (3.3) and integration by parts. If the sources are chosen to be static point vortices, that is, translated versions of (3.6) and (3.9), one finds that $L_{\mathrm{int}} = 0$, so static point vortices exert no net force on one another at critical coupling, in agreement with the nonlinear theory.

We seek to compute $L_{\mathrm{int}}$ in the case where the two sources represent point vortices moving along arbitrary trajectories in $\mathbb{R}^2$. To do so, we must construct a time-dependent point source representing a vortex moving along some curve $\mathbf{y}(t)$, say. The construction is guided by two principles: first, in the case of motion at constant velocity, the source should reduce to (3.6), (3.9) in the vortex's rest frame; second, for any trajectory the vector source $j$, which represents the vortex's electromagnetic current density, must remain a conserved current. The result will be correct up to quadratic order in velocity and linear order in acceleration.

It is straightforward to calculate Lorentz boosted versions of the sources (3.6), (3.9). Let $\xi^\mu$ denote the rest frame coordinates and assume that at time $t = 0$, the point vortex lies at the origin, $\mathbf{x} = 0$, and is moving with velocity $\mathbf{u}$. Then, by decomposing $\mathbf{x}$ and $\boldsymbol{\xi}$ into their components parallel and perpendicular to $\mathbf{u}$, one finds that at this time, $\xi^\parallel = \gamma(u)\,x^\parallel$ (where $\gamma(u) = (1 - u^2)^{-1/2}$ is the usual contraction factor) while $\xi^\perp = x^\perp$. Hence

$$\boldsymbol{\xi} = \gamma(u)\,\frac{\mathbf{x}\cdot\mathbf{u}}{u^2}\,\mathbf{u} + \left(\mathbf{x} - \frac{\mathbf{x}\cdot\mathbf{u}}{u^2}\,\mathbf{u}\right) = \mathbf{x} + \frac{1}{2}(\mathbf{x}\cdot\mathbf{u})\,\mathbf{u} + \ldots, \tag{3.11}$$

where the ellipsis denotes discarded terms of order $u^4$ or greater. We shall not persist in so denoting these terms. Rather, we shall include an ellipsis only where further negligible terms have been dropped to obtain the given expression.
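Since $\kappa$ is a Lorentz scalar, the boosted scalar monopole is $\kappa(\mathbf{x}) = \kappa_{\mathrm{static}}(\boldsymbol{\xi}) = q\,\delta(\boldsymbol{\xi})$. To interpret this delta function as a distribution on the $\mathbf{x}$-plane, note that for any test function $f$,

$$\int d^2x\, f(\mathbf{x})\,\delta(\boldsymbol{\xi}) = \int d^2\xi\,\left|\frac{\partial\mathbf{x}}{\partial\boldsymbol{\xi}}\right| f(\mathbf{x}(\boldsymbol{\xi}))\,\delta(\boldsymbol{\xi}) = \left(1 - \frac{1}{2}u^2\right) f(0) + \ldots. \tag{3.12}$$

Therefore $\delta(\boldsymbol{\xi}) = \left(1 - \tfrac{1}{2}u^2\right)\delta(\mathbf{x})$ and so, at $t = 0$,

$$\kappa(\mathbf{x}) = q\left(1 - \frac{1}{2}u^2\right)\delta(\mathbf{x}). \tag{3.13}$$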
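The factor $1 - \tfrac{1}{2}u^2$ in (3.12) is the small-velocity form of the exact Jacobian $1/\gamma(u)$ of the linear map (3.11). The sketch below illustrates this for one sample velocity; the function names and the sample vectors are assumptions introduced for the example.

```python
import math

def boost_to_rest_frame(x, u):
    """xi = gamma * x_parallel + x_perp for a velocity u with 0 < |u| < 1, cf. (3.11)."""
    u2 = u[0] ** 2 + u[1] ** 2
    gamma = 1.0 / math.sqrt(1.0 - u2)
    xu = x[0] * u[0] + x[1] * u[1]
    scale = (gamma - 1.0) * xu / u2
    return (x[0] + scale * u[0], x[1] + scale * u[1])

def jacobian_dx_dxi(u):
    """|det(dx/dxi)|, obtained by inverting the determinant of the linear map x -> xi."""
    e1 = boost_to_rest_frame((1.0, 0.0), u)  # first column of dxi/dx
    e2 = boost_to_rest_frame((0.0, 1.0), u)  # second column of dxi/dx
    det_dxi_dx = e1[0] * e2[1] - e1[1] * e2[0]
    return 1.0 / abs(det_dxi_dx)

u = (0.1, 0.05)  # assumed sample velocity, small compared with 1
u2 = u[0] ** 2 + u[1] ** 2

print(jacobian_dx_dxi(u), 1.0 - 0.5 * u2)  # agree up to terms of order u^4

# Truncated form of (3.11): xi = x + (x.u) u / 2 + O(u^4)
x = (0.3, -0.7)
xu = x[0] * u[0] + x[1] * u[1]
xi_exact = boost_to_rest_frame(x, u)
xi_trunc = (x[0] + 0.5 * xu * u[0], x[1] + 0.5 * xu * u[1])
print(xi_exact, xi_trunc)
```

Both comparisons differ only at order $u^4$, which is the order discarded throughout this section.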

The boosted dipole is more subtle, since $j$ itself transforms as a Lorentz vector. The rest frame source is

$$j_{\mathrm{static}} = (j^0_{(0)}, \mathbf{j}_{(0)}) = (0, -q\,\mathbf{k}\times\nabla_\xi\,\delta(\boldsymbol{\xi})). \tag{3.14}$$


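To obtain the laboratory frame source, one must perform a Lorentz boost on this with velocity $-\mathbf{u}$. Explicitly,

$$j^0(\mathbf{x}) = u\,\gamma(u)\, j^\parallel_{(0)}(\boldsymbol{\xi}(\mathbf{x})) = \mathbf{u}\cdot\mathbf{j}_{(0)}(\boldsymbol{\xi}(\mathbf{x})) + \ldots, \tag{3.15}$$

$$j^\parallel(\mathbf{x}) = \gamma(u)\, j^\parallel_{(0)}(\boldsymbol{\xi}(\mathbf{x})) = \left(1 + \frac{1}{2}u^2\right) j^\parallel_{(0)}(\boldsymbol{\xi}(\mathbf{x})) + \ldots, \tag{3.16}$$

$$j^\perp(\mathbf{x}) = j^\perp_{(0)}(\boldsymbol{\xi}(\mathbf{x})). \tag{3.17}$$

We may combine (3.16) and (3.17) into a single equation for $\mathbf{j}$ by using the same polarization trick as in (3.11), yielding

$$\mathbf{j}(\mathbf{x}) = \mathbf{j}_{(0)}(\boldsymbol{\xi}(\mathbf{x})) + \frac{1}{2}\left(\mathbf{j}_{(0)}(\boldsymbol{\xi}(\mathbf{x}))\cdot\mathbf{u}\right)\mathbf{u}. \tag{3.18}$$

Now

$$\nabla_\xi\,\delta(\boldsymbol{\xi}) = \frac{\partial\mathbf{x}}{\partial\boldsymbol{\xi}}\,\nabla\left[\left(1 - \frac{1}{2}u^2\right)\delta(\mathbf{x})\right] = \left(1 - \frac{1}{2}u^2\right)\left[\nabla - \frac{1}{2}\mathbf{u}\,(\mathbf{u}\cdot\nabla)\right]\delta(\mathbf{x}) + \ldots. \tag{3.19}$$

Hence

$$j^0(\mathbf{x}) = q\,(\mathbf{k}\times\mathbf{u})\cdot\nabla\delta(\mathbf{x}) + \ldots \tag{3.20}$$

$$\mathbf{j}(\mathbf{x}) = -q\left(1 - \frac{1}{2}u^2\right)\mathbf{k}\times\nabla\delta(\mathbf{x}) + \frac{1}{2}q\left[\mathbf{u}\,(\mathbf{k}\times\mathbf{u})\cdot\nabla + (\mathbf{k}\times\mathbf{u})\,\mathbf{u}\cdot\nabla\right]\delta(\mathbf{x}) = -q\,\mathbf{k}\times\nabla\delta(\mathbf{x}) + q\,(\mathbf{k}\times\mathbf{u})\,\mathbf{u}\cdot\nabla\delta(\mathbf{x}). \tag{3.21}$$

By replacing $\mathbf{x}$ by $\mathbf{x} - \mathbf{y}(t)$ and $\mathbf{u}$ by $\dot{\mathbf{y}}(t)$ in (3.13), (3.20) and (3.21) we obtain expressions for the instantaneously Lorentz boosted point vortex travelling along an arbitrary trajectory. In particular,

$$j_{\mathrm{boost}} = q\left((\mathbf{k}\times\dot{\mathbf{y}})\cdot\nabla,\; -\mathbf{k}\times\nabla + (\mathbf{k}\times\dot{\mathbf{y}})\,\dot{\mathbf{y}}\cdot\nabla\right)\delta(\mathbf{x} - \mathbf{y}). \tag{3.22}$$

Note that

$$\partial_\mu j^\mu_{\mathrm{boost}} = q\,(\mathbf{k}\times\ddot{\mathbf{y}})\cdot\nabla\delta(\mathbf{x} - \mathbf{y}) + \ldots \neq 0 \tag{3.23}$$

so $j_{\mathrm{boost}}$ is not the moving point source we seek. Rather, we must add a correction $j_{\mathrm{acc}}$ to $j_{\mathrm{boost}}$ of order $\ddot{\mathbf{y}}$ to enforce current conservation. Such a term will vanish in the case of motion at constant velocity and hence does not conflict with the required Lorentz properties of $j$. We make the simplest choice, namely

$$j_{\mathrm{acc}} = (0, -q\,\mathbf{k}\times\ddot{\mathbf{y}}\,\delta(\mathbf{x} - \mathbf{y})). \tag{3.24}$$

Though we have chosen $j^0_{\mathrm{acc}} = 0$, any function of order $|\ddot{\mathbf{y}}|$ would do since $\partial_0 j^0_{\mathrm{acc}}$ is automatically of negligible order. It turns out that this ambiguity has no bearing on our calculation because $j^0_{\mathrm{acc}}$ makes no contribution to $L_{\mathrm{int}}$ at order (velocity)$^2$.

To summarize, the point vortex moving along a trajectory $\mathbf{y}(t)$ is represented by a composite point source

$$\kappa(t, \mathbf{x}) = q\left(1 - \frac{1}{2}|\dot{\mathbf{y}}|^2\right)\delta(\mathbf{x} - \mathbf{y}), \tag{3.25}$$

$$j(t, \mathbf{x}) = q\left((\mathbf{k}\times\dot{\mathbf{y}})\cdot\nabla,\; -\mathbf{k}\times\nabla + (\mathbf{k}\times\dot{\mathbf{y}})\,\dot{\mathbf{y}}\cdot\nabla - \mathbf{k}\times\ddot{\mathbf{y}}\right)\delta(\mathbf{x} - \mathbf{y}). \tag{3.26}$$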


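The second task in the calculation of $L_{\mathrm{int}}$ is to construct the fields $(\psi_{(2)}, A_{(2)})$ induced by the second moving vortex. Were the field equations (3.2) and (3.3) massless, we could simply use retarded potentials, making a suitable expansion in time derivatives. Since they are not massless, we need a substitute for this procedure. We handle the problem by introducing formal temporal Fourier transforms of the fields and sources, as follows. Let $\psi(t, \mathbf{x})$ be the field induced by the time varying source $\kappa(t, \mathbf{x})$ according to (3.2), and define Fourier transforms $\tilde\psi$ and $\tilde\kappa$ with variable $\omega$ dual to $t$,

$$\psi(t, \mathbf{x}) := \int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\tilde\psi(\omega, \mathbf{x}), \qquad \kappa(t, \mathbf{x}) := \int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\tilde\kappa(\omega, \mathbf{x}). \tag{3.27}$$

Then (3.2) implies

$$\left[-\nabla^2 + (1 - \omega^2)\right]\tilde\psi = \tilde\kappa, \tag{3.28}$$

so for each value of $\omega$, $\tilde\psi(\omega, \cdot)$ satisfies the static inhomogeneous Klein-Gordon equation with mass $\sqrt{1 - \omega^2}$ and source $\tilde\kappa(\omega, \cdot)$. Comparing with (3.5) one sees that (3.28) is solved, at least formally, by convolution of $\tilde\kappa(\omega, \cdot)$ with the (suitably scaled) Green's function $K_0$,

$$\tilde\psi(\omega, \mathbf{x}) = \frac{1}{2\pi}\int d^2x'\, K_0\!\left(\sqrt{1 - \omega^2}\,|\mathbf{x} - \mathbf{x}'|\right)\tilde\kappa(\omega, \mathbf{x}'). \tag{3.29}$$

Expanding the Green's function in $\omega$ and truncating at order $\omega^2$ yields

$$\tilde\psi(\omega, \mathbf{x}) = \frac{1}{2\pi}\int d^2x'\left[K_0(|\mathbf{x} - \mathbf{x}'|) - \frac{\omega^2}{2}\,|\mathbf{x} - \mathbf{x}'|\,K_0'(|\mathbf{x} - \mathbf{x}'|)\right]\tilde\kappa(\omega, \mathbf{x}'). \tag{3.30}$$

From this we obtain $\psi$ by (3.27),

$$\psi(t, \mathbf{x}) = \frac{1}{2\pi}\int d^2x'\left[K_0(|\mathbf{x} - \mathbf{x}'|)\,\kappa(t, \mathbf{x}') + \Upsilon(|\mathbf{x} - \mathbf{x}'|)\,\partial_t^2\kappa(t, \mathbf{x}')\right], \tag{3.31}$$

where we have defined

$$\Upsilon(s) := \frac{1}{2}\,s\,K_0'(s). \tag{3.32}$$

Note that truncating the expansion in $\omega$ is, in effect, the same as neglecting higher time derivatives of $\kappa$, and hence eventually of $\mathbf{y}(t)$ in our application. No claim of rigour is attached to the above Fourier transform manoeuvre. One should regard it as a convenient algebraic shorthand for generating a perturbative solution of (3.2). Direct substitution of (3.31) into (3.2) confirms that this really is a solution up to higher time derivative terms ($\partial_t^3\kappa$, etc.). Since Eq. (3.3) is formally identical to (3.2), the vector field induced by a time dependent source $j$ is

$$A^\mu(t, \mathbf{x}) = \frac{1}{2\pi}\int d^2x'\left[K_0(|\mathbf{x} - \mathbf{x}'|)\,j^\mu(t, \mathbf{x}') + \Upsilon(|\mathbf{x} - \mathbf{x}'|)\,\partial_t^2 j^\mu(t, \mathbf{x}')\right] + \ldots. \tag{3.33}$$

Having obtained $(\psi_{(2)}, A_{(2)})$ induced by the time varying source $(\kappa_{(2)}, j_{(2)})$ we may compute the Lagrangian governing its interaction with another source $(\kappa_{(1)}, j_{(1)})$ by substitution of (3.31) and (3.33) into (3.10). The result is

$$L_{\mathrm{int}} = \frac{1}{2\pi}\int d^2x\, d^2x'\,\Big\{K_0(|\mathbf{x} - \mathbf{x}'|)\left[\kappa_{(1)}(t, \mathbf{x})\,\kappa_{(2)}(t, \mathbf{x}') - j^\mu_{(1)}(t, \mathbf{x})\,j_{(2)\mu}(t, \mathbf{x}')\right] - \Upsilon(|\mathbf{x} - \mathbf{x}'|)\left[\partial_t\kappa_{(1)}(t, \mathbf{x})\,\partial_t\kappa_{(2)}(t, \mathbf{x}') - \partial_t j^\mu_{(1)}(t, \mathbf{x})\,\partial_t j_{(2)\mu}(t, \mathbf{x}')\right]\Big\}, \tag{3.34}$$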


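where a total time derivative has been discarded. It remains to substitute the point vortex sources (3.25), (3.26) for vortices moving along trajectories $\mathbf{y}(t)$ and $\mathbf{z}(t)$ into (3.34) and evaluate the integrals. Explicitly,

$$L_{\mathrm{int}} = \frac{1}{2\pi}\int d^2x\, d^2x'\left[\kappa_{(1)}\kappa_{(2)} + \mathbf{j}_{(1)}\cdot\mathbf{j}_{(2)} - j^0_{(1)}j^0_{(2)}\right]K_0(|\mathbf{x} - \mathbf{x}'|) - \frac{1}{2\pi}\int d^2x\, d^2x'\left[\partial_t\kappa_{(1)}\,\partial_t\kappa_{(2)} + \partial_t\mathbf{j}_{(1)}\cdot\partial_t\mathbf{j}_{(2)} - \partial_t j^0_{(1)}\,\partial_t j^0_{(2)}\right]\Upsilon(|\mathbf{x} - \mathbf{x}'|), \tag{3.35}$$

where

$$\begin{aligned}
\kappa_{(1)}\kappa_{(2)} &= q^2\left[1 - \tfrac{1}{2}\left(|\dot{\mathbf{y}}|^2 + |\dot{\mathbf{z}}|^2\right)\right]\delta(\mathbf{x} - \mathbf{y})\,\delta(\mathbf{x}' - \mathbf{z}),\\
\mathbf{j}_{(1)}\cdot\mathbf{j}_{(2)} &= q^2\left[\nabla_y\cdot\nabla_z - (\dot{\mathbf{z}}\cdot\nabla_y)(\dot{\mathbf{z}}\cdot\nabla_z) - (\dot{\mathbf{y}}\cdot\nabla_y)(\dot{\mathbf{y}}\cdot\nabla_z) - \ddot{\mathbf{z}}\cdot\nabla_y - \ddot{\mathbf{y}}\cdot\nabla_z\right]\delta(\mathbf{x} - \mathbf{y})\,\delta(\mathbf{x}' - \mathbf{z}),\\
j^0_{(1)}j^0_{(2)} &= q^2\left((\mathbf{k}\times\dot{\mathbf{y}})\cdot\nabla_y\right)\left((\mathbf{k}\times\dot{\mathbf{z}})\cdot\nabla_z\right)\delta(\mathbf{x} - \mathbf{y})\,\delta(\mathbf{x}' - \mathbf{z}),\\
\partial_t\kappa_{(1)}\,\partial_t\kappa_{(2)} &= q^2(\dot{\mathbf{y}}\cdot\nabla_y)(\dot{\mathbf{z}}\cdot\nabla_z)\,\delta(\mathbf{x} - \mathbf{y})\,\delta(\mathbf{x}' - \mathbf{z}),\\
\partial_t\mathbf{j}_{(1)}\cdot\partial_t\mathbf{j}_{(2)} &= q^2(\dot{\mathbf{y}}\cdot\nabla_y)(\dot{\mathbf{z}}\cdot\nabla_z)\,\nabla_y\cdot\nabla_z\,\delta(\mathbf{x} - \mathbf{y})\,\delta(\mathbf{x}' - \mathbf{z}),\\
\partial_t j^0_{(1)}\,\partial_t j^0_{(2)} &= 0.
\end{aligned} \tag{3.36}$$

We have discarded terms of order $|\dot{\mathbf{y}}|^3$, $|\ddot{\mathbf{y}}||\dot{\mathbf{z}}|$, etc., and systematically used

$$\nabla\delta(\mathbf{x} - \mathbf{y}) = -\nabla_y\,\delta(\mathbf{x} - \mathbf{y}), \qquad \nabla'\delta(\mathbf{x}' - \mathbf{z}) = -\nabla_z\,\delta(\mathbf{x}' - \mathbf{z}). \tag{3.37}$$

All the integrals are thus rendered trivial, so

$$\begin{aligned}
L_{\mathrm{int}} = \frac{q^2}{2\pi}\Big\{&\Big[1 - \tfrac{1}{2}\left(|\dot{\mathbf{y}}|^2 + |\dot{\mathbf{z}}|^2\right) + \nabla_y\cdot\nabla_z - (\dot{\mathbf{z}}\cdot\nabla_y)(\dot{\mathbf{z}}\cdot\nabla_z) - (\dot{\mathbf{y}}\cdot\nabla_y)(\dot{\mathbf{y}}\cdot\nabla_z)\\
&\;- \ddot{\mathbf{z}}\cdot\nabla_y - \ddot{\mathbf{y}}\cdot\nabla_z - \left((\mathbf{k}\times\dot{\mathbf{y}})\cdot\nabla_y\right)\left((\mathbf{k}\times\dot{\mathbf{z}})\cdot\nabla_z\right)\Big]K_0(|\mathbf{y} - \mathbf{z}|)\\
&- (\dot{\mathbf{y}}\cdot\nabla_y)(\dot{\mathbf{z}}\cdot\nabla_z)\left[1 + \nabla_y\cdot\nabla_z\right]\Upsilon(|\mathbf{y} - \mathbf{z}|)\Big\}.
\end{aligned} \tag{3.38}$$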

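We now use the following identities, which are easily checked using Bessel's equation (2.9):

$$(1 + \nabla_y\cdot\nabla_z)\,K_0(|\mathbf{y} - \mathbf{z}|) = 0, \tag{3.39}$$

$$(1 + \nabla_y\cdot\nabla_z)\,\Upsilon(|\mathbf{y} - \mathbf{z}|) = -K_0(|\mathbf{y} - \mathbf{z}|), \tag{3.40}$$

$$(\boldsymbol{\alpha}\cdot\nabla_y)(\boldsymbol{\beta}\cdot\nabla_z)\,K_0(|\mathbf{y} - \mathbf{z}|) = -\boldsymbol{\alpha}\cdot\boldsymbol{\beta}\,\frac{K_0'(|\mathbf{y} - \mathbf{z}|)}{|\mathbf{y} - \mathbf{z}|} - \alpha^\parallel\beta^\parallel\,\Lambda(|\mathbf{y} - \mathbf{z}|). \tag{3.41}$$

In (3.41), $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ are any vectors with $\boldsymbol{\beta}$ independent of $\mathbf{y}$, and $\Lambda$ denotes the function

$$\Lambda(s) := K_0(s) - 2\,\frac{K_0'(s)}{s}. \tag{3.42}$$

Also, we have used the decomposition of $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ relative to the orthonormal frame

$$\mathbf{n}^\parallel = \frac{\mathbf{y} - \mathbf{z}}{|\mathbf{y} - \mathbf{z}|}, \qquad \mathbf{n}^\perp = \mathbf{k}\times\mathbf{n}^\parallel. \tag{3.43}$$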
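The identities (3.39) to (3.41) can be spot-checked by finite differences. On functions of $|\mathbf{y} - \mathbf{z}|$ the operator $\nabla_y\cdot\nabla_z$ acts as minus the planar Laplacian, which for a radial function is $-(f'' + f'/s)$. The sketch below assumes $\Upsilon(s) = \tfrac{1}{2}sK_0'(s) = -\tfrac{1}{2}sK_1(s)$ as in (3.32) and the function $K_0(s) - 2K_0'(s)/s$ of (3.42). All function names, step sizes, and sample points are illustrative assumptions.

```python
import math

def bessel_k(nu, x, steps=1000, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def k0(s):
    return bessel_k(0, s)

def upsilon(s):
    """Upsilon(s) = s K_0'(s) / 2 = -s K_1(s) / 2, cf. (3.32)."""
    return -0.5 * s * bessel_k(1, s)

def radial_laplacian(f, s, h=1e-3):
    """f'' + f'/s for a radially symmetric function in the plane."""
    d1 = (f(s + h) - f(s - h)) / (2 * h)
    d2 = (f(s + h) - 2 * f(s) + f(s - h)) / h**2
    return d2 + d1 / s

def mixed_derivative(alpha, beta, y, z, h=1e-3):
    """(alpha . grad_y)(beta . grad_z) K_0(|y - z|) by central differences."""
    def f(sy, sz):
        dx = (y[0] + sy * alpha[0]) - (z[0] + sz * beta[0])
        dy = (y[1] + sy * alpha[1]) - (z[1] + sz * beta[1])
        return k0(math.hypot(dx, dy))
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h**2)

def rhs_341(alpha, beta, y, z):
    """Right-hand side of (3.41)."""
    rx, ry = y[0] - z[0], y[1] - z[1]
    s = math.hypot(rx, ry)
    a_par = (alpha[0] * rx + alpha[1] * ry) / s   # component along n_parallel
    b_par = (beta[0] * rx + beta[1] * ry) / s
    dk0 = -bessel_k(1, s)                         # K_0' = -K_1
    lam = k0(s) - 2.0 * dk0 / s                   # (3.42)
    dot = alpha[0] * beta[0] + alpha[1] * beta[1]
    return -dot * dk0 / s - a_par * b_par * lam

s = 1.7  # assumed sample separation
res_339 = k0(s) - radial_laplacian(k0, s)                    # should vanish
res_340 = upsilon(s) - radial_laplacian(upsilon, s) + k0(s)  # should vanish

y, z = (1.0, 0.4), (-0.3, -0.5)          # assumed sample positions
alpha, beta = (0.7, -0.2), (0.1, 0.9)    # assumed sample vectors
print(res_339, res_340)
print(mixed_derivative(alpha, beta, y, z), rhs_341(alpha, beta, y, z))
```

The residuals are limited by the finite-difference step rather than by the identities themselves, so they should shrink as the step is refined until rounding error takes over.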


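Equations (3.39) and (3.40) give useful cancellations in lines 1 and 3 of (3.38). All but two of the remaining differential operators (namely, the acceleration terms) are of the form (3.41) for suitable choice of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. Hence, we may repeatedly apply (3.41), yielding

$$L_{\mathrm{int}} = \frac{q^2}{2\pi}\left[-\frac{1}{2}\left(|\dot{\mathbf{y}}|^2 + |\dot{\mathbf{z}}|^2\right) + (\dot y^\parallel)^2 + (\dot z^\parallel)^2 + \dot y^\perp\dot z^\perp - \dot y^\parallel\dot z^\parallel\right]\Lambda(|\mathbf{y} - \mathbf{z}|) + L_{\mathrm{acc}}, \tag{3.44}$$

where $L_{\mathrm{acc}}$ represents the remaining acceleration terms. In fact

$$\begin{aligned}
L_{\mathrm{acc}} &= -\frac{q^2}{2\pi}\left(\ddot{\mathbf{z}}\cdot\nabla_y + \ddot{\mathbf{y}}\cdot\nabla_z\right)K_0(|\mathbf{y} - \mathbf{z}|) = \frac{q^2}{2\pi}\,(\ddot{\mathbf{y}} - \ddot{\mathbf{z}})\cdot(\mathbf{y} - \mathbf{z})\,\frac{K_0'(|\mathbf{y} - \mathbf{z}|)}{|\mathbf{y} - \mathbf{z}|}\\
&= -\frac{q^2}{2\pi}\left[|\dot{\mathbf{y}} - \dot{\mathbf{z}}|^2\,\frac{K_0'(|\mathbf{y} - \mathbf{z}|)}{|\mathbf{y} - \mathbf{z}|} + (\dot y^\parallel - \dot z^\parallel)^2\,\Lambda(|\mathbf{y} - \mathbf{z}|)\right] + \text{total time derivative}.
\end{aligned} \tag{3.45}$$

Substituting (3.45) and (3.42) into (3.44) finally yields

$$L_{\mathrm{int}} = -\frac{q^2}{4\pi}\,|\dot{\mathbf{y}} - \dot{\mathbf{z}}|^2\,K_0(|\mathbf{y} - \mathbf{z}|). \tag{3.46}$$

Our point particle model of 2-vortex dynamics is completed by adding to $L_{\mathrm{int}}$ the usual nonrelativistic free Lagrangian for two particles of mass $\pi$ (the vortex rest energy), so

$$L = \frac{\pi}{2}\left(|\dot{\mathbf{y}}|^2 + |\dot{\mathbf{z}}|^2\right) - \frac{q^2}{4\pi}\,|\dot{\mathbf{y}} - \dot{\mathbf{z}}|^2\,K_0(|\mathbf{y} - \mathbf{z}|). \tag{3.47}$$

We may define centre of mass and relative coordinates

$$\mathbf{R} = \frac{1}{2}(\mathbf{y} + \mathbf{z}), \qquad \mathbf{r} = \frac{1}{2}(\mathbf{y} - \mathbf{z}) \tag{3.48}$$

so that

$$L = \pi|\dot{\mathbf{R}}|^2 + \pi\left(1 - \frac{q^2}{\pi^2}K_0(2|\mathbf{r}|)\right)|\dot{\mathbf{r}}|^2. \tag{3.49}$$

Motion under this Lagrangian coincides with geodesic flow on the 2-vortex moduli space with respect to the asymptotic metric (2.31) of Sect. 2: we simply identify $Z = R^1 + iR^2$ and $\sigma e^{i\theta} = r^1 + ir^2$.

Extension of this treatment to the case of $N$ well separated vortices is trivial due to the underlying linearity of our point-particle model. To the standard Lagrangian for $N$ free particles of mass $\pi$, moving along trajectories $\mathbf{y}_r(t)$, $r = 1, 2, \ldots, N$ say, one simply adds one copy of $L_{\mathrm{int}}$, as in Eq. (3.46), for each unordered pair of distinct particles, or equivalently, one copy of $\tfrac{1}{2}L_{\mathrm{int}}$ for each ordered distinct pair. The result is

$$L = \frac{\pi}{2}\sum_r|\dot{\mathbf{y}}_r|^2 - \frac{q^2}{8\pi}\sum_{r\neq s}K_0(|\mathbf{y}_r - \mathbf{y}_s|)\,|\dot{\mathbf{y}}_r - \dot{\mathbf{y}}_s|^2. \tag{3.50}$$

The asymptotic formula for the $N$-vortex metric readily follows. Defining complex coordinates $Z_r = y_r^1 + iy_r^2$, their differences $Z_{rs} = Z_r - Z_s$, and holomorphic 1-forms $dZ_{rs} = dZ_r - dZ_s$, one sees that

$$g = \pi\sum_r dZ_r\, d\bar Z_r - \frac{q^2}{4\pi}\sum_{r\neq s}K_0(|Z_{rs}|)\, dZ_{rs}\, d\bar Z_{rs}, \tag{3.51}$$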

which coincides with (2.41).
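The passage from (3.47) to (3.49), and the agreement of the $N$-particle Lagrangian (3.50) with (3.47) at $N = 2$, are plain algebra and can be spot-checked numerically. In the sketch below the value `q = 1.0` is a placeholder, not the vortex value of $q$, and the function names and sample positions and velocities are assumptions made for illustration.

```python
import math

def bessel_k(nu, x, steps=1000, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def norm2(v):
    return v[0] ** 2 + v[1] ** 2

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def lagrangian_347(y, z, vy, vz, q):
    """Two-particle Lagrangian in the form (3.47)."""
    sep = math.sqrt(norm2(sub(y, z)))
    return (0.5 * math.pi * (norm2(vy) + norm2(vz))
            - q**2 / (4 * math.pi) * norm2(sub(vy, vz)) * bessel_k(0, sep))

def lagrangian_349(y, z, vy, vz, q):
    """The same Lagrangian in centre of mass and relative coordinates, (3.48)-(3.49)."""
    big_r_dot = (0.5 * (vy[0] + vz[0]), 0.5 * (vy[1] + vz[1]))
    r = (0.5 * (y[0] - z[0]), 0.5 * (y[1] - z[1]))
    r_dot = (0.5 * (vy[0] - vz[0]), 0.5 * (vy[1] - vz[1]))
    eta_over_2 = math.pi * (1 - q**2 / math.pi**2 * bessel_k(0, 2 * math.sqrt(norm2(r))))
    return math.pi * norm2(big_r_dot) + eta_over_2 * norm2(r_dot)

def lagrangian_350(pos, vel, q):
    """N-particle Lagrangian (3.50); the double sum runs over ordered pairs r != s."""
    n = len(pos)
    free = 0.5 * math.pi * sum(norm2(v) for v in vel)
    inter = 0.0
    for r in range(n):
        for s in range(n):
            if r != s:
                sep = math.sqrt(norm2(sub(pos[r], pos[s])))
                inter += bessel_k(0, sep) * norm2(sub(vel[r], vel[s]))
    return free - q**2 / (8 * math.pi) * inter

q = 1.0                                  # placeholder value, for illustration only
y, z = (2.0, 0.5), (-1.5, -0.3)          # assumed sample positions
vy, vz = (0.02, -0.01), (-0.03, 0.015)   # assumed sample velocities

print(lagrangian_347(y, z, vy, vz, q))
print(lagrangian_349(y, z, vy, vz, q))
print(lagrangian_350([y, z], [vy, vz], q))
```

All three numbers should coincide up to rounding, independently of the placeholder chosen for `q`.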


4. Two-Vortex Scattering

The relative motion of two vortices, in the geodesic approximation, is determined by the purely kinetic Lagrangian

$$L = \frac{1}{2}\,\eta(\sigma)\left(\dot\sigma^2 + \sigma^2\dot\theta^2\right), \tag{4.1}$$

where the ranges of $\sigma$ and $\theta$ are as in Sect. 2. Samols calculated the function $b(\sigma)$ and hence $\eta(\sigma)$ numerically, and using this found the geodesic motion of two vortices [14]. The geodesic motion has two constants of integration: the energy $E$, which is $L$ itself, and the angular momentum $\ell$, which equals $\eta(\sigma)\sigma^2\dot\theta$. Using these, one can find $d\theta/d\sigma$, and the scattering angle $\Theta$ can be determined by integration. It depends on the impact parameter $a$, given by the motion of the vortices as they approach from infinity. There the energy is $\pi v^2$ and the angular momentum is $2\pi av$. The result for the scattering angle as a function of impact parameter agrees well with numerical simulations of 2-vortex scattering using the complete field equations [11, 15]. The motion is repulsive, and the scattering angle increases monotonically as the impact parameter decreases, from zero when the impact parameter is infinite, up to $\pi/2$ in a head-on collision. The field dynamics begins to significantly differ from the geodesic motion only if the vortex speeds exceed about half the speed of light.

Now that we have obtained the asymptotic form of the 2-vortex metric, with $\eta(\sigma)$ given by Eq. (2.30), it is possible to estimate the asymptotic scattering. Unfortunately, the exact relation between scattering angle and impact parameter for this metric is given by a rather intractable integral. Rather than evaluate this numerically, we adopt a different strategy. Since our asymptotic metric is only valid at large vortex separations, it can only be used with confidence to evaluate the scattering angle at large impact parameter. Here the vortex trajectories are almost straight, and the scattering angle small. We can find the approximate scattering angle by a perturbative calculation.

It is convenient to use Cartesian coordinates $x = \sigma\cos\theta$ and $y = \sigma\sin\theta$. The Lagrangian becomes

$$L = \frac{1}{2}\,\eta(\sigma)\left(\dot x^2 + \dot y^2\right), \tag{4.2}$$

where $\sigma = \sqrt{x^2 + y^2}$. The equations of motion are

$$\eta(\sigma)\,\ddot x + \frac{\eta'(\sigma)}{2\sigma}\left(x\dot x^2 + 2y\dot y\dot x - x\dot y^2\right) = 0, \tag{4.3}$$

$$\eta(\sigma)\,\ddot y + \frac{\eta'(\sigma)}{2\sigma}\left(y\dot y^2 + 2x\dot x\dot y - y\dot x^2\right) = 0. \tag{4.4}$$

We may assume that the motion is approximately along the line $x = a$, with $y$ increasing from $-\infty$ to $\infty$ at approximately constant speed $v$. (This is in fact the trajectory of one of the vortices; the other moves in the opposite direction along the line $x = -a$.) The initial value of $\dot x$ is taken to be zero, and by calculating the small change in $\dot x$ we can calculate the scattering angle. We work to leading order in the small quantity $\exp(-2a)$. $\eta'$ is of first order in $\exp(-2a)$ along the trajectory, so at this order $\ddot x$ is given by the term proportional to $\dot y^2$ in Eq. (4.3), and the coefficient $\eta$ multiplying $\ddot x$ can be approximated by $2\pi$. $\eta'\dot x$ is negligible. $\ddot y$ is of order $\exp(-2a)$ too, but the consequent change of speed in the $y$-direction along the trajectory can be neglected. Thus, it is a sufficiently good approximation to take the solution of Eq. (4.4) to be $y = vt$, and to simplify Eq. (4.3) to

$$\ddot x = \frac{av^2}{4\pi}\,\frac{\eta'(\sigma)}{\sigma}. \tag{4.5}$$

The total change in $\dot x$ is therefore

$$\Delta\dot x = \frac{av^2}{4\pi}\int_{-\infty}^{\infty}\frac{\eta'(\sigma)}{\sigma}\, dt, \tag{4.6}$$

and the scattering angle, assuming it is small, is

$$\Theta = \frac{1}{v}\,\Delta\dot x. \tag{4.7}$$

Expressing $\Delta\dot x$ as an integral over $y$, using $y = vt$ and $\sigma = \sqrt{a^2 + y^2}$, and also $\eta'(\sigma) = \frac{4q^2}{\pi}K_1(2\sigma)$, we find

$$\Theta = \frac{q^2a}{\pi^2}\int_{-\infty}^{\infty}\frac{K_1\!\left(2\sqrt{a^2 + y^2}\right)}{\sqrt{a^2 + y^2}}\, dy \tag{4.8}$$

$$\phantom{\Theta} = -\frac{q^2}{2\pi^2}\,\frac{d}{da}\int_{-\infty}^{\infty}K_0\!\left(2\sqrt{a^2 + y^2}\right) dy. \tag{4.9}$$

Remarkably, this gives the simple result

$$\Theta = \frac{q^2}{2\pi}\exp(-2a). \tag{4.10}$$

The integral can be understood as follows. The planar Helmholtz equation with source at the origin

$$(-\nabla^2 + 4)\chi = 2\pi\delta(\mathbf{x}) \tag{4.11}$$

has the exponentially decaying solution $\chi = K_0\!\left(2\sqrt{x^2 + y^2}\right)$. Integrating the equation with respect to $y$ we find that $\tilde\chi = \int_{-\infty}^{\infty}K_0\!\left(2\sqrt{x^2 + y^2}\right) dy$ satisfies

$$\left(-\frac{d^2}{dx^2} + 4\right)\tilde\chi = 2\pi\delta(x), \tag{4.12}$$

and hence $\int_{-\infty}^{\infty}K_0\!\left(2\sqrt{x^2 + y^2}\right) dy = \frac{\pi}{2}\exp(-2|x|)$.

If we substitute the numerical value of $q$, we can compare the dependence of scattering angle on impact parameter with Samols' result. The agreement is good for $a \geq 2$. This is shown in Fig. 1. We expect corrections to this calculation. These are partly due to the neglected terms in (4.3) and (4.4), and partly due to corrections to our asymptotic metric.
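The integral identity behind (4.10), and the agreement between the integral form (4.8) and the closed form, can be confirmed by direct quadrature. The sketch below is an illustration under stated assumptions: `q = 1.0` is a placeholder rather than the vortex value of $q$, the impact parameter `a = 1.5` is arbitrary, and the helper names and truncation of the $y$-integral at `y_max = 12` are choices made for this example.

```python
import math

def bessel_k(nu, x, steps=500, t_max=10.0):
    """K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid rule, x > 0)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def integrate_even(f, y_max=12.0, steps=300):
    """int_{-y_max}^{y_max} f(y) dy for an even function f (trapezoid rule)."""
    h = y_max / steps
    total = 0.5 * (f(0.0) + f(y_max))
    for i in range(1, steps):
        total += f(i * h)
    return 2.0 * total * h

def chi_tilde(a):
    """int K_0(2 sqrt(a^2 + y^2)) dy, to be compared with (pi/2) exp(-2a)."""
    return integrate_even(lambda y: bessel_k(0, 2 * math.hypot(a, y)))

def scattering_angle_integral(a, q):
    """Right-hand side of (4.8)."""
    def integrand(y):
        s = math.hypot(a, y)
        return bessel_k(1, 2 * s) / s
    return q**2 * a / math.pi**2 * integrate_even(integrand)

def scattering_angle_closed(a, q):
    """Closed form (4.10)."""
    return q**2 / (2 * math.pi) * math.exp(-2 * a)

a, q = 1.5, 1.0  # assumed impact parameter; q is a placeholder value

print(chi_tilde(a), 0.5 * math.pi * math.exp(-2 * a))
print(scattering_angle_integral(a, q), scattering_angle_closed(a, q))
```

Because $q$ enters only through the overall factor $q^2$, the comparison is insensitive to the placeholder; a quantitative comparison with Samols' curve would require the numerical value of $q$ used in the text.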


Fig. 1. Scattering angle $\Theta$ (vertical axis, ticks from 0 to 90) against impact parameter $a$ (horizontal axis, ticks from 0 to 3.5) for 2-vortex scattering in the geodesic approximation. Dashed curve: Samols' numerical implementation; solid curve: our perturbative approximation

Acknowledgement. We are grateful to David Tong for discussions and correspondence.

References

1. Atiyah, M.F., Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988
2. Bielawski, R.: Monopoles and the Gibbons-Manton metric. Commun. Math. Phys. 194, 297–321 (1998)
3. Bogomolny, E.B.: The stability of classical solutions. Sov. J. Nucl. Phys. 24, 449–454 (1976)
4. Eastham, M.S.P.: The Asymptotic Solution of Linear Differential Systems. Oxford: Oxford University Press, 1989
5. Erdélyi, A., et al.: Higher Transcendental Functions, Vol. 2, Bateman Manuscript Project. New York: McGraw-Hill, 1953
6. Gibbons, G.W., Manton, N.S.: Classical and quantum dynamics of BPS monopoles. Nucl. Phys. B 274, 183–224 (1986)
7. Gibbons, G.W., Manton, N.S.: The moduli space metric for well-separated BPS monopoles. Phys. Lett. B 356, 32–38 (1995)
8. Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
9. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. B 110, 54–56 (1982)
10. Manton, N.S.: Monopole interactions at long range. Phys. Lett. B 154, 397–400 (1985); (E) B 157, 475 (1985)
11. Moriarty, K.J.M., Myers, E., Rebbi, C.: Dynamical interactions of cosmic strings and flux vortices in superconductors. Phys. Lett. B 207, 411–418 (1988)
12. Moss, I.G., Shiiki, N.: Quantum mechanics on moduli spaces. Nucl. Phys. B 565, 345–362 (2000)
13. Romão, N.M.: Quantum Chern-Simons vortices on a sphere. J. Math. Phys. 42, 3445–3469 (2001)
14. Samols, T.M.: Vortex scattering. Commun. Math. Phys. 145, 149–179 (1992)
15. Shellard, E.P.S., Ruback, P.J.: Vortex scattering in two dimensions. Phys. Lett. B 209, 262–270 (1988)
16. Speight, J.M.: Static intervortex forces. Phys. Rev. D 55, 3830–3835 (1997)
17. Stuart, D.: Dynamics of abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994)
18. Stuart, D.: The geodesic approximation for the Yang-Mills-Higgs equations. Commun. Math. Phys. 166, 149–190 (1994)
19. Taubes, C.H.: Arbitrary N-vortex solutions to the first order Ginzburg-Landau equations. Commun. Math. Phys. 72, 277–292 (1980)
20. Tong, D.: NS5-branes, T-duality and worldsheet instantons. JHEP 07, 013 (2002)
21. Willmore, T.J.: Riemannian Geometry. Oxford: Oxford University Press, 1993

Communicated by A. Kupiainen